How can I parse a string into different strings in C#?

I just started work on a C# application. I need to break down a string into different parts. Is there an easy way to do this using C# patterns? I think I can do it with substrings but it might get messy and I want to do something that's easy to understand. Here's an example of the input:
AB-CDE-GHI-123-45-67-7777
variable1 = "AB-CDE-GHI"
variable2 = "123"
variable3 = "45"
variable4 = "67"
variable5 = "67-7777"
AB-CDE-GHIJKLM-123-45-67-7777
variable1 = "AB-CDE-GHIJKLM"
variable2 = "123"
variable3 = "45"
variable4 = "67"
variable5 = "67-7777"
AB-123-45-67-7777
variable1 = "AB"
variable2 = "123"
variable3 = "45"
variable4 = "67"
variable5 = "67-7777"
The first part of the string up until "123-45-67-7777" can be any length. Lucky for me the last part 123-45-67-7777 is always the same length and contains numbers that are zero padded.
I hope someone can come up with some suggestions for an easy method that uses regular expressions or something.
Input lines look like this:
aa-123-45-67-7777
HJHJH-123-45-67-7777
H-H-H--123-45-67-7777
222-123-45-67-7777

You do not need RegEx for parsing this kind of input.
You can use string.Split, in particular if the input is highly structured.
If you first split by - you will get a string[] with each part in a different index of the array.
The length property of the array will tell you how many parts you got and you can use that to reconstruct the parts you need.
You can then rejoin any of the pieces you need.
string[] parts = "AB-CDE-GHI-123-45-67-7777".Split('-');
// joining together the first 3 items:
string letters = string.Format("{0}-{1}-{2}", parts[0], parts[1], parts[2]);
// letters = "AB-CDE-GHI"
If the number of sections is variable (apart from the last 4), you can use the length in a loop to rebuild the wanted parts:
StringBuilder sb = new StringBuilder();
for (int i = 0; i < parts.Length - 4; i++)
{
    sb.AppendFormat("{0}-", parts[i]);
}
if (sb.Length > 0)
{
    sb.Length = sb.Length - 1; // remove trailing -
}
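The same rebuild can be written with string.Join instead of a loop. This is a minimal sketch, assuming the last four sections are always the numeric tail; the names KeyParser and SplitKey are hypothetical and only used for illustration.

```csharp
using System;
using System.Linq;

public static class KeyParser
{
    // Hypothetical helper: returns the variable-length prefix followed by
    // the four trailing sections. Assumes at least five '-' separated sections.
    public static string[] SplitKey(string input)
    {
        string[] parts = input.Split('-');
        if (parts.Length < 5)
            throw new FormatException("Expected at least five '-' separated sections.");

        int tail = parts.Length - 4; // index of the first trailing section
        string prefix = string.Join("-", parts.Take(tail));
        return new[] { prefix, parts[tail], parts[tail + 1], parts[tail + 2], parts[tail + 3] };
    }
}

public static class Program
{
    public static void Main()
    {
        string[] r = KeyParser.SplitKey("AB-CDE-GHI-123-45-67-7777");
        Console.WriteLine(r[0]);              // AB-CDE-GHI
        Console.WriteLine(r[3] + "-" + r[4]); // 67-7777
    }
}
```

Because empty sections are kept by Split, an input such as H-H-H--123-45-67-7777 keeps its trailing dash in the prefix (H-H-H-).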

If the last part is always a known length (14 characters) you could just do something like this:
var firstPart = inputLine.Substring(inputLine.Length - 14);
var secondPart = inputLine.Substring(0, inputLine.Length - 15); // 15 to exclude the last -
Then you can just do your string splitting and job done :)
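To make the "then split" step concrete, here is a rough sketch, assuming the tail is always exactly 14 characters and is joined to the prefix by a single '-'. The names FixedTailParser and Parse are hypothetical.

```csharp
using System;

public static class FixedTailParser
{
    private const int TailLength = 14; // length of "123-45-67-7777"

    // Hypothetical helper: cuts off the fixed-length tail, then splits the tail on '-'.
    public static void Parse(string inputLine, out string head, out string[] tail)
    {
        if (inputLine.Length < TailLength + 2)
            throw new FormatException("Input is too short to contain a prefix and the tail.");

        // Everything before the '-' that joins the prefix to the tail.
        head = inputLine.Substring(0, inputLine.Length - TailLength - 1);
        // The last 14 characters, split into their four numeric sections.
        tail = inputLine.Substring(inputLine.Length - TailLength).Split('-');
    }
}

public static class Program
{
    public static void Main()
    {
        string head;
        string[] tail;
        FixedTailParser.Parse("AB-CDE-GHI-123-45-67-7777", out head, out tail);
        Console.WriteLine(head);                      // AB-CDE-GHI
        Console.WriteLine(string.Join(",", tail));    // 123,45,67,7777
    }
}
```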

Although it is possible to use String.Split here, a better solution, in my opinion, would be to tokenize the input and then parse it.
You can use tools such as ANTLR for this purpose.

string[] str = "AB-CDE-GHI-123-45-67-7777".Split('-');
int a = str.Length;
string variable1 = "";
for (int i = 0; i <= a - 5; i++)
{
    // append each leading part in order, followed by a '-'
    variable1 = variable1 + str[i] + "-";
}
// remove the trailing '-'
variable1 = variable1.Remove(variable1.Length - 1, 1);
string variable2 = str[a - 4];
string variable3 = str[a - 3];
string variable4 = str[a - 2];
string variable5 = str[a - 2] + "-" + str[a - 1];

As Oded says, you can use string.Split.
I have edited my answer to produce the variables you asked for:
string[] tab = textBox1.Text.Split('-');
int length = tab.Length;
string var1 = string.Empty;
for(int i=0; i <= length-5 ; i++)
{
var1 = var1 + tab[i] + '-';
}
var1 = var1.Remove(var1.Length-1,1);
string var2 = tab[length-4];
string var3 = tab[length-3];
string var4 = tab[length-2];
string var5 = tab[length-2] + '-' + tab[length-1];
It is essentially the same as @Govind KamalaPrakash Malviya's answer; just make sure each part is appended in order (var1 + tab[i]).

Related

C# -Take a string between a text

I've got various string in a list:
Ord.cl. N. 2724 del 08/11/2019
and it can be also
Ord.cl. N. 2725/web del 08/11/2019
I have to take all the content that comes after 'N.' and before 'del'. As result I want
2724
2725/web
Can someone do code for that in C#? I know there is substring, but maybe there are better ways?

You can build an extension method like this (extension methods must be static and live in a static class):
public static class StringExtensions
{
    public static string SubstringFromTo(this string input, int from, int to)
    {
        return input.Substring(from, to - from);
    }

    public static string SubstringFromTo(this string input, string from, string to)
    {
        // start just after "from" (or at 0 if it is missing)
        var i = input.IndexOf(from);
        var index1 = i != -1 ? i + from.Length : 0;
        // stop at the next "to" (or at the end of the string if it is missing)
        var j = input.IndexOf(to, index1);
        var index2 = j != -1 ? j : input.Length;
        return input.SubstringFromTo(index1, index2);
    }
}

var asd = " ciao ** come stai ? asdasd".SubstringFromTo("**", "?");
// asd = " come stai "
// call .Trim() if you want "come stai"

Using regular expressions, you might do:
var m = Regex.Match("Ord.cl. N. 2724 del 08/11/2019", @"(?<=N\.).*?(?=del)");
if (m.Success)
{
var result = m.Value;
}
Explanation of the regular expression:
(?<=N\.) looks for a preceding "N.".
.*? matches any sequence of characters, but as few as possible.
(?=del) looks for a trailing "del".

If it's always that predictable (space before and after N. and space before and after del), then it's fairly simple. Use Substring and use IndexOf to find the occurrences of N. and del:
var theString = "Ord.cl. N. 2725/web del 08/11/2019";
var start = theString.IndexOf("N. ") + 3;
var length = theString.IndexOf(" del", start) - start;
var partIWant = theString.Substring(start, length).Trim();
Console.WriteLine(partIWant);
That also assumes that there will only ever be one occurrence of N. and del in your string.
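If a marker might be missing, the IndexOf calls above return -1 and Substring would throw or return the wrong text. A small sketch that guards against that, assuming the markers are passed in by the caller; TextBetween and Extract are hypothetical names.

```csharp
using System;

public static class TextBetween
{
    // Hypothetical helper: returns the trimmed text between the first "start" marker
    // and the next "end" marker after it, or null when either marker is missing.
    public static string Extract(string source, string start, string end)
    {
        int from = source.IndexOf(start, StringComparison.Ordinal);
        if (from < 0) return null;
        from += start.Length; // skip past the start marker itself

        int to = source.IndexOf(end, from, StringComparison.Ordinal);
        if (to < 0) return null;

        return source.Substring(from, to - from).Trim();
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(TextBetween.Extract("Ord.cl. N. 2725/web del 08/11/2019", "N. ", " del")); // 2725/web
        Console.WriteLine(TextBetween.Extract("Ord.cl. N. 2724 del 08/11/2019", "N. ", " del"));     // 2724
    }
}
```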

for (int i = 0; i < list.Count; i++) // visit every item, including the last one
{
NDocList.Add(list[i].DocumentiOrigine.Split(new string[] { " N. " }, StringSplitOptions.None)[1]
.Split()[0]
.Trim());
}
I solved it with this.

Replace only 'n' occurrences of a substring in a string in C#

I have an input string like this:
abbdabab
How can I replace one occurrence of the substring "ab" at a time (only the 1st, then only the 2nd, then only the 3rd, and so on) with another string such as "x", keeping the rest of the original string intact? For this example:
1st output - xbdabab
2nd output - abbdxab
3rd output - abbdabx
and so on.
I have tried using Regex like -
int occCount = Regex.Matches("abbdabab", "ab").Count;
if (occCount > 1)
{
for (int i = 1; i <= occCount; i++)
{
Regex regReplace = new Regex("ab");
string modifiedValue = regReplace.Replace("abbdabab", "x", i);
//decodedMessages.Add(modifiedValue);
}
}
Here I am able to get the 1st output when the counter i value is 1 but not able to get the subsequent results. Is there any overloaded Replace method which could achieve this ? Or Can anyone help me in pointing where I might have gone wrong?

You can try IndexOf instead of regular expressions:
string source = "abbdabab";
string toFind = "ab";
string toSet = "X";
for (int index = source.IndexOf(toFind);
index >= 0;
index = source.IndexOf(toFind, index + 1)) {
string result = source.Substring(0, index) +
toSet +
source.Substring(index + toFind.Length);
Console.WriteLine(result);
}
Outcome:
Xbdabab
abbdXab
abbdabX
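If only one particular occurrence is needed, the same IndexOf idea can be wrapped in a helper. This is a sketch under the same assumptions as the loop above (occurrences are searched from index + 1, so overlapping matches are counted); OccurrenceReplacer and ReplaceNth are hypothetical names.

```csharp
using System;

public static class OccurrenceReplacer
{
    // Hypothetical helper: replaces only the n-th (1-based) occurrence of "find".
    // Returns the source unchanged when there are fewer than n occurrences.
    public static string ReplaceNth(string source, string find, string replacement, int n)
    {
        if (string.IsNullOrEmpty(find) || n < 1) return source;

        int index = -1;
        for (int seen = 0; seen < n; seen++)
        {
            // Continue searching one character after the previous hit.
            index = source.IndexOf(find, index + 1, StringComparison.Ordinal);
            if (index < 0) return source;
        }
        return source.Substring(0, index) + replacement + source.Substring(index + find.Length);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(OccurrenceReplacer.ReplaceNth("abbdabab", "ab", "x", 1)); // xbdabab
        Console.WriteLine(OccurrenceReplacer.ReplaceNth("abbdabab", "ab", "x", 2)); // abbdxab
        Console.WriteLine(OccurrenceReplacer.ReplaceNth("abbdabab", "ab", "x", 3)); // abbdabx
    }
}
```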

You can use a StringBuilder:
string s = "abbdabab";
var matches = Regex.Matches(s, "ab");
StringBuilder sb = new StringBuilder(s);
var m = matches[0]; // 0 for first output, 1 for second output, and so on
sb.Remove(m.Index, m.Length);
sb.Insert(m.Index, "x");
var result = sb.ToString();
Console.WriteLine(result);

You may use a dynamically built regex to be used with regex.Replace directly:
var s = "abbdabab";
var idx = 1; // First = 1, Second = 2
var search = "ab";
var repl = "x";
var pat = new Regex($@"(?s)((?:{search}.*?){{{idx-1}}}.*?){search}"); // ((?:ab.*?){0}.*?)ab
Console.WriteLine(pat.Replace(s, $"${{1}}{repl}", 1));
The pattern will look like ((?:ab.*?){0}.*?)ab and will match
(?s) - RegexOptions.Singleline to make . also match newlines
((?:ab.*?){0}.*?) - Group 1 (later, this value will be put back into the result with ${1} backreference)
(?:ab.*?){0} - 0 occurrences of ab followed with any 0+ chars as few as possible
.*? - any 0+ chars as few as possible
ab - the search string/pattern.
The last argument to pat.Replace is 1, so that only the first occurrence could be replaced.
If search is a literal text, you need to use var search = Regex.Escape("a+b");.
If the repl can have $, add repl = repl.Replace("$", "$$");.

Is there built-in method to add character multiple times to a string?

Is there a built-in function or more efficient way to add character to a string X number of times?
for example the following code will add '0' character 5 times to the string:
int count = 5;
char someChar = '0';
string myString = "SomeString";
for(int i=0;i<count;i++)
{
myString = someChar + myString;
}

Use PadLeft() or PadRight()
An example for PadRight():
int count = 5;
char someChar = '0';
string myString = "SomeString";
myString = myString.PadRight(count + myString.Length, someChar);
// output -> "SomeString00000"
Remember the first parameter of either method is the total string length required hence why I am adding count to the original string length.
Likewise if you want to append the character at the start of the string use PadLeft()
myString = myString.PadLeft(count + myString.Length, someChar);
// output -> "00000SomeString"

string.Concat(Enumerable.Repeat("0", 5));
will return
"00000"
Referenced from: Is there a built-in function to repeat string or char in .net?

You can also do it as:
string line = "abc";
line = "abc" + new String('X', 5);
//line == abcXXXXX

You can use PadRight() / PadLeft():
int count = 5;
char someChar = '0';
string myString = "SomeString";
var stringLength = myString.Length;
var newPaddedStringRight = myString.PadRight(stringLength + count, '0');
//will give SomeString00000
var newPaddedStringLeft = myString.PadLeft(stringLength + count, '0');
//will give 00000SomeString
Remember, a string is Immutable, so you'll need to assign the result to a new string.

You could also use StringBuilder. As the string grows, each concatenation in the loop allocates a new string and copies the existing characters, so the cost increases with the string size.
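A minimal sketch of the StringBuilder approach, using the Append(char, int) overload to add the repeated character in one call. The names Padding and Prepend are hypothetical.

```csharp
using System;
using System.Text;

public static class Padding
{
    // Hypothetical helper: puts "count" copies of "c" in front of "value".
    public static string Prepend(string value, char c, int count)
    {
        // Pre-size the builder so it does not need to grow while appending.
        var sb = new StringBuilder(value.Length + count);
        sb.Append(c, count); // repeat the character "count" times
        sb.Append(value);
        return sb.ToString();
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(Padding.Prepend("SomeString", '0', 5)); // 00000SomeString
    }
}
```

For a single prepend like this, PadLeft or new String(c, count) is simpler; StringBuilder mainly pays off when many pieces are assembled in a loop.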

Cutting text from string after matching pattern

I would like to extract the text after each <br>: everything up to the next <br>, and everything after the last <br>. For example:
string example1 = "some example<br>text1<br>text2";
//do the magic
int match_count = 2;
string match1 = "text1";
string match2 = "text2";
it's hard to explain this without showing an actual example ;)
is there an easy way to accomplish this with regex?
P.S. few more examples of usage:
string example1 = "some example<br>text1";
int match_count = 1;
string match1 = "text1";
and
string example2 = "some example";
int match_count = 0;

One possibility that does not require regular expressions would be to use one of the String.Split overloads:
var input = @"some example<br>text1<br>text2";
// split on every <br>
var chunks = input.Split(new[] { "<br>" }, StringSplitOptions.RemoveEmptyEntries);
// remove the first entry, everything else is wanted result
foreach (var chunk in chunks.Skip(1))
{
Console.WriteLine(chunk);
}
The output is:
text1
text2
You could then easily check if you have any matches using the Count or Length on the array.

For match_count, you can just use the String.Split method, like this:
string example1 = "some example<br>text1<br>text2";
int match_count = example1.Split(new[] { "<br>" },
StringSplitOptions.RemoveEmptyEntries)
.Count() - 1;
For getting the text between tags, take a look at this question:
Get innertext between two tags - VB.NET - HtmlAgilityPack
It is in VB.NET but you can easily convert it to C#.

strip out digits or letters at the most right of a string

I have a file name: kjrjh20111103-BATCH2242_20111113-091337.txt
I only need 091337, not the txt or the -. How can I achieve that? It does not have to be 6 digits; it could be more or fewer, but it will always come after the last "-" and immediately before ".doc" or ".txt".

You can either do this with a regex, or with simple string operations. For the latter:
int lastDash = text.LastIndexOf('-');
string afterDash = text.Substring(lastDash + 1);
int dot = afterDash.IndexOf('.');
string data = dot == -1 ? afterDash : afterDash.Substring(0, dot);
Personally I find this easier to understand and verify than a regular expression, but your mileage may vary.
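A variation on the same string operations lets Path.GetFileNameWithoutExtension strip the extension first. This is a sketch, assuming the wanted text is everything after the last '-' in the name without its extension; FileNameParser and TextAfterLastDash are hypothetical names.

```csharp
using System;
using System.IO;

public static class FileNameParser
{
    // Hypothetical helper: drops the extension, then returns what follows the last '-'.
    // When there is no '-', the whole extension-less name is returned.
    public static string TextAfterLastDash(string fileName)
    {
        string name = Path.GetFileNameWithoutExtension(fileName);
        int lastDash = name.LastIndexOf('-');
        return lastDash == -1 ? name : name.Substring(lastDash + 1);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(FileNameParser.TextAfterLastDash("kjrjh20111103-BATCH2242_20111113-091337.txt")); // 091337
    }
}
```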

String fileName = "kjrjh20111103-BATCH2242_20111113-091337.txt";
String[] splitString = fileName.Split ( new char[] { '-', '.' } );
String Number = splitString[2];

Regex: .*-(?<num>[0-9]*)\. should do the job. The num capture group contains your string.

The Regex would be:
string fileName = "kjrjh20111103-BATCH2242_20111113-091337.txt";
string fileMatch = Regex.Match(fileName, @"(?<=-)\d+", RegexOptions.IgnoreCase).Value;

String fileName = "kjrjh20111103-BATCH2242_20111113-091337.txt";
var startIndex = fileName.LastIndexOf('-') + 1;
var length = fileName.LastIndexOf('.') - startIndex;
var output = fileName.Substring(startIndex, length);
