How to split the string in C#

I want to split the string:
"This is regarding the problem of {pro} in {statement}"
The output I want is:
This is regarding the problem of
{pro}
in
{statement}

You could try this regex:
([^{]+|{[^}]*})
It matches each group of characters which are defined by either:
A sequence of characters (at least one), none of which are {; or
A { character, followed by any number of characters which are not }, all followed by }
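As a minimal sketch of how that pattern could be applied in C#, the snippet below collects every match with Regex.Matches. The Trim() call is an added assumption (the pattern itself keeps the spaces around each token), and the code assumes a compiler that supports top-level statements.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

string input = "This is regarding the problem of {pro} in {statement}";

// Each match is either a run of non-'{' characters or one {token}.
string[] parts = Regex.Matches(input, @"[^{]+|{[^}]*}")
    .Cast<Match>()
    .Select(m => m.Value.Trim())   // assumption: surrounding spaces are unwanted
    .ToArray();

foreach (string part in parts)
    Console.WriteLine(part);
```

Without the Trim() call the first and third entries would keep their trailing and surrounding spaces.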

Here's a simple regex that inserts a newline before and after each token match by using a match evaluator:
string output = Regex.Replace(input, @"{\S+}", m => string.Format(@"{1}{0}{1}", m.Value, '\n'));
The output variable now has a newline before and after each {token}. You can then split on the newline if you need the output as an array of strings. The spaces around each token are kept, and the trailing newline produces an empty last element.
string[] lines = output.Split('\n');

Related

Replace the nth index of a character

How can I replace the character at the nth index using only Regex?
string input = "%fdfdfdfdfdfdfdfdfdfdfdffd";
string result = Regex.Replace(input, "^%", "");
The above code replaces the first character with an empty string, but I want to specify an index n so that the character at that index is replaced with an empty string.
Can someone help me out here?
It's possible to create a regex pattern that captures all characters before and after the replaced character and then replace the whole string with the two captures separated by the new character. For example:
Regex.Replace("abcdefgh", @"^(.{4}).(.*)$", @"$1E$2") // returns "abcdEfgh"
You could then create a method that replaces the character at a specific index:
string ReplaceCharacter(string text, int index, char value)
    => Regex.Replace(text, $@"^(.{{{index}}}).(.*)$", $@"${{1}}{value}${{2}}");
// Usage:
ReplaceCharacter("Foo-bar", 3, 'l') // returns "Foolbar"
As Johan Wentholt said in the comments, you can use Regex.Replace to match a number of characters from the start of the line and replace them with a capture group that is one character shorter than the full match:
String result = Regex.Replace(input, "^(.{" + index + "}).", "$1");
This matches "index times any character, followed by another character, at the start of the string", but replaces it by only the "index times any character" without that last character, since that last dot is outside of the capture group.
If you want to replace it with something other than an empty string, just concatenate that to the end of the "$1" replacement string. To be safe in that case, use "${1}" instead, to avoid problems if the piece you add after it starts with a digit, since that would change the capture group number.
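The "${1}" form can be sketched as a small helper. ReplaceAt is a hypothetical name used only for illustration, and the sketch assumes the replacement text contains no $ characters, since those would be read as substitution syntax.

```csharp
using System;
using System.Text.RegularExpressions;

// ReplaceAt is a hypothetical helper, not part of any library.
// It keeps the first `index` characters (group 1), drops the next
// character, and appends `replacement` in its place.
string ReplaceAt(string text, int index, string replacement) =>
    Regex.Replace(text, "^(.{" + index + "}).", "${1}" + replacement);

// "${1}" + "9" is unambiguous, whereas "$1" + "9" would read as "$19".
Console.WriteLine(ReplaceAt("abcdef", 2, "9"));

// An empty replacement removes the character at the index.
Console.WriteLine(ReplaceAt("%fdfd", 0, ""));
```

If the index is past the end of the string the pattern does not match and the input is returned unchanged.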
What you want to do may not be possible with Regex alone. This is sort of a cheat:
var input = "%fdfd678dfdfdfdfdfdfdfdffd";
var result = Regex.Replace(input, "^.{7}", input.Substring(0,6));
Console.WriteLine($"result = {result}");

C# Regex split() without removing the split condition character

I am splitting a string with regex using its Split() method.
var splitRegex = new Regex(@"[\s|{]");
string input = "/Tests/ShowMessage { 'Text': 'foo' }";
//second version of the input:
//string input = "/Tests/ShowMessage{ 'Text': 'foo' }";
string[] splittedText = splitRegex.Split(input, 2);
The string is just a sample of the input. There are two different input structures: one with a space before the { and one without. I want to split the input at the { bracket to get the following result:
/Tests/ShowMessage
{ 'Text': 'foo' }
If there is a space, the string is split there (the space is removed) and I get my desired result. But if there isn't a space, the string is split on the {, so the { is removed, which I don't want. How can I use Regex.Split() without removing the split condition character?
The square brackets create a character set, which matches exactly one of the characters inside it (here a whitespace character, a literal |, or {). For what you want, start by removing them.
To match any number of whitespace characters, add *, which gives \s*.
\s is a whitespace character
* means zero or more
To avoid removing the split condition character, use a lookahead assertion (?=...).
(?=...) is a positive lookahead assertion and (?!...) is a negative one; neither consumes any characters
The combined regex looks like this: \s*(?={)
The regular expression documentation describes all of these constructs in detail, and you can test your regex for free with an online regex tester.
To keep the curly brace out of the match, you can put it into a lookahead:
\s*(?={)
That will match any amount of whitespace up to the position just before an opening curly brace.
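A minimal sketch of the lookahead pattern applied to both input variants from the question follows. SplitCommand is a hypothetical helper name, and the brace is escaped here only for readability. The count overload of Split is an instance method, so a Regex object is created first.

```csharp
using System;
using System.Text.RegularExpressions;

// Zero or more whitespace characters, but only where a '{' comes next.
// The lookahead does not consume the brace, so it stays in the output.
var splitRegex = new Regex(@"\s*(?=\{)");

// Hypothetical helper: split into at most two pieces at the first '{'.
string[] SplitCommand(string input) => splitRegex.Split(input, 2);

string[] withSpace = SplitCommand("/Tests/ShowMessage { 'Text': 'foo' }");
string[] withoutSpace = SplitCommand("/Tests/ShowMessage{ 'Text': 'foo' }");

Console.WriteLine(string.Join(" | ", withSpace));
Console.WriteLine(string.Join(" | ", withoutSpace));
```

Both calls should yield the same two elements, because the whitespace is optional and the brace is never part of the match.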
You can use regular string split, on "{" and trim the spaces off:
var bits = "/Tests/ShowMessage { 'Text': 'foo' }".Split("{", StringSplitOptions.RemoveEmptyEntries);
bits[0] = bits[0].TrimEnd();
bits[1] = "{" + bits[1];
If you want to use the RegEx route, you can add the { back if you change the regex a bit:
var splitRegex = new Regex(@"\s*{");
string input = "/Tests/ShowMessage { 'Text': 'foo' }";
//second version of the input:
//string input = "/Tests/ShowMessage{ 'Text': 'foo' }";
string[] splittedText = splitRegex.Split(input, 2);
splittedText[1] = "{" + splittedText[1];
It means "split at an occurrence of (zero or more whitespace characters followed by {)". The split removes both the spaces (which you want) and the { (which you don't), but because the split point always ends in {, you can safely prepend it to the second element.
var splitedList = srt.Text.Replace(".", ".#").Replace("?", "?#").Replace("!", "!#").Split(new[] { "#"}, StringSplitOptions.RemoveEmptyEntries).ToList();
This splits the text after ., ! and ? without removing those characters. For a more robust result, replace # with a character that cannot occur in the text, for example '®'. No Regex.Split is needed. Note that each piece after the first keeps its leading space, so call Trim() on the entries if you need the exact result below.
Input: "Hello. I'am dev!"
Result (the split condition characters are kept):
"Hello."
"I'am dev!"

Splitting of a string using Regex

I have string of the following format:
string test = "test.BO.ID";
My aim is to get the part of the string that comes after the first dot.
So ideally I am expecting output as "BO.ID".
Here is what I have tried:
// Check for the first occurrence and take whatever comes after the dot
var output = Regex.Match(test, @"^(?=.).*?");
The output I am getting is empty.
What is the modification I need to make it for Regex?
You get an empty output because the pattern you have can match an empty string at the start of a string, and that is enough since .*? is a lazy subpattern and . matches any char.
Use (the value will be in Match.Groups[1].Value)
\.(.*)
or (with a lookbehind, to get the string as Match.Value)
(?<=\.).*
See the regex demo and a C# online demo.
A non-regex approach is to use String.Split with the count argument:
var s = "test.BO.ID";
var res = s.Split(new[] {"."}, 2, StringSplitOptions.None);
if (res.Length > 1)
    Console.WriteLine(res[1]);
If you only want the part after the first dot you don't need a regex at all:
// + 1 skips the dot itself; if there is no dot, the whole string is returned
x.Substring(x.IndexOf('.') + 1)

Quick way of splitting a mixed alphanum string into text and numeric parts?

Say I have a string such as
abc123def456
What's the best way to split the string into an array such as
["abc", "123", "def", "456"]
string input = "abc123def456";
Regex re = new Regex(@"\D+|\d+");
string[] result = re.Matches(input).OfType<Match>()
    .Select(m => m.Value).ToArray();
string[] result = Regex.Split("abc123def456", "([0-9]+)");
The above will use any sequence of numbers as the delimiter, though wrapping it in () says that we still would like to keep our delimiter in our returned array.
Note: In the example snippet we will get an empty element as the last entry of our array.
The boundary you look for can be described as "A position where a digit follows a non-digit, or where a non-digit follows a digit."
So:
string[] result = Regex.Split("abc123def456", @"(?<=\D)(?=\d)|(?<=\d)(?=\D)");
Use [0-9] and [^0-9], respectively, if \d and \D are not specific enough.
Add spaces around each run of digits, then split on the space:
// .NET replacement strings use $1, not \1; the result ends with an empty element
Regex.Replace("abc123def456", @"(\d+)", " $1 ").Split(' ');
I hope it works.
You could convert the string to a char array and then loop through the characters. As long as the characters are of the same type (letter or number) keep adding them to a string. When the next character no longer is of the same type (or you've reached the end of the string), add the temporary string to the array and reset the temporary string to null.
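The loop described above can be sketched as follows. SplitByCharType is a hypothetical name, and the sketch assumes only two character classes (digit versus everything else), which is slightly broader than "letter or number".

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: groups consecutive characters of the same type.
List<string> SplitByCharType(string input)
{
    var parts = new List<string>();
    var current = new StringBuilder();

    foreach (char c in input)
    {
        // Close the current run when the digit/non-digit type changes.
        if (current.Length > 0 && char.IsDigit(c) != char.IsDigit(current[0]))
        {
            parts.Add(current.ToString());
            current.Clear();
        }
        current.Append(c);
    }

    // Flush the final run once the end of the string is reached.
    if (current.Length > 0)
        parts.Add(current.ToString());

    return parts;
}

List<string> result = SplitByCharType("abc123def456");
Console.WriteLine(string.Join(", ", result));
```

A StringBuilder stands in for the temporary string so that appending characters does not allocate a new string each time.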

Regex not working in .NET

So I'm trying to match up a regex and I'm fairly new at this. I used a validator and it works when I paste the code but not when it's placed in the codebehind of a .NET2.0 C# page.
The offending code is supposed to be able to split on a single semi-colon but not on a double semi-colon. However, when I used the string
"entry;entry2;entry3;entry4;"
I get a nonsense array that contains empty values, the last letter of the previous entry, and the semi-colons themselves. The online javascript validator splits it correctly. Please help!
My regex:
((;;|[^;])+)
Split on the following regular expression:
(?<!;);(?!;)
It matches semicolons that are neither preceded nor followed by another semicolon.
For example, this code
var input = "entry;entry2;entry3;entry4;";
foreach (var s in Regex.Split(input, @"(?<!;);(?!;)"))
    Console.WriteLine("[{0}]", s);
produces the following output:
[entry]
[entry2]
[entry3]
[entry4]
[]
The final empty field is a result of the semicolon on the end of the input.
If the semicolon is a terminator at the end of each field rather than a separator between consecutive fields, then use Regex.Matches instead:
foreach (Match m in Regex.Matches(input, @"(.+?)(?<!;);(?!;)"))
    Console.WriteLine("[{0}]", m.Groups[1].Value);
to get
[entry]
[entry2]
[entry3]
[entry4]
Why not use String.Split on the semicolon?
string sInput = "Entry1;entry2;entry3;entry4";
string[] sEntries = sInput.Split(';');
// Do what you have to do with the entries in the array...
Hope this helps,
Best regards,
Tom.
As tommieb75 wrote, you can use String.Split; passing a StringSplitOptions value lets you control whether empty entries appear in the resulting array:
string input = "entry1;;entry2;;;entry3;entry4;;";
char[] charSeparators = new char[] {';'};
// Split a string delimited by characters and return all non-empty elements.
string[] result = input.Split(charSeparators, StringSplitOptions.RemoveEmptyEntries);
The result contains only 4 elements:
<entry1><entry2><entry3><entry4>
