This question already has answers here:
Regular Expression to find a string included between two characters while EXCLUDING the delimiters
(13 answers)
Closed 7 years ago.
Hi, I would like to split a string by two characters.
For example, I have a string like this one:
"xx-aa-[aa]-22-[bb]".
I want to retrieve a string array containing [aa] and [bb], that is, all the characters between [ and ].
First I can split by '-', which gives me a string array:
var tmp = myString.Split('-');
But how can I then retrieve only the strings between [ and ]?
You can use the following regex:
\[(.+?)\]
Use the global flag to match all occurrences (in C#, Regex.Matches returns every match without a flag).
Explanation
(): Capturing group
\[: Matches a literal [. It must be escaped with \
.+?: Non-greedy match of one or more characters (any character except a newline)
\]: Matches a literal ]. Escaped with \
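As a minimal sketch of how the pattern above might be applied in C#: the class and method names here (BracketExtractor, Extract) are hypothetical and only serve the illustration. Group 1 holds the text without the brackets; m.Value would return it with the brackets included.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BracketExtractor
{
    // Returns the text captured between each pair of square brackets.
    // Hypothetical helper, not part of the original answer.
    public static string[] Extract(string input)
    {
        return Regex.Matches(input, @"\[(.+?)\]")
                    .Cast<Match>()
                    .Select(m => m.Groups[1].Value) // use m.Value to keep the brackets
                    .ToArray();
    }

    public static void Main()
    {
        string[] parts = Extract("xx-aa-[aa]-22-[bb]");
        Console.WriteLine(string.Join(", ", parts)); // aa, bb
    }
}
```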
This question already has an answer here:
Split Method with single quotes
(1 answer)
Closed 1 year ago.
I need to split a string on the ' character using string.Split().
However, the ' character is also what delimits the character literal passed to string.Split().
For example:
string.Split(''')
but that gives a syntax error.
I have tried using the @ symbol to make it a verbatim string literal, but that does not work either.
string.Split(@''')
Escape the single quote with a backslash, .Split('\''), like this:
var str = "Hell'o'Wo'rl'd";
var output = str.Split('\''); //["Hell", "o", "Wo", "rl", "d"]
This question already has answers here:
How do I remove all non alphanumeric characters from a string except dash?
(13 answers)
Closed 3 years ago.
I have some text, and I need to select only the alphanumeric characters in a single match.
I have tried this regex:
[^\W]+
Pattern : [^\W]+
Input: This is "My Page"
https://rubular.com/r/PMQwahJIqqiOOI
Output I Need: This is My Page
Replace everything that is not a word character or a space, as matched by this regex,
[^\w ]+
with an empty string.
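A minimal sketch of that replacement in C#, assuming Regex.Replace is an acceptable way to apply it; the names AlnumFilter and KeepWordCharsAndSpaces are hypothetical. Note that \w in .NET also matches the underscore and Unicode letters and digits, so this is slightly broader than strictly alphanumeric.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlnumFilter
{
    // Removes every run of characters that are neither word characters nor spaces.
    // Hypothetical helper, not part of the original answer.
    public static string KeepWordCharsAndSpaces(string input)
    {
        return Regex.Replace(input, @"[^\w ]+", string.Empty);
    }

    public static void Main()
    {
        Console.WriteLine(KeepWordCharsAndSpaces("This is \"My Page\"")); // This is My Page
    }
}
```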
This question already has answers here:
Can I escape a double quote in a verbatim string literal?
(6 answers)
How to split csv whose columns may contain comma
(9 answers)
Closed 4 years ago.
I have a text file as follows:
"0","Column","column2","Column3"
I have managed to split the data into the following:
"0"
"Column"
"Column2"
"Column3"
with ,(?=(?:[^']*'[^']*')*[^']*$). Now I want to remove the quotes. I have tested the expression [^\s"']+|"([^"]*)"|\'([^\']*) in an online regex tester, which gives the output I'm looking for. However, I am getting a syntax error when using the expression:
String[] columns = Regex.Split(dataLine, "[^\s"']+|"([^"]*)"|\'([^\']*)");
Syntax error ',' expected
I've tried escaping characters, but to no avail. Am I missing something?
Any help would be greatly appreciated!
Thanks.
C# might be treating the backslash as an escape character. Try a verbatim string, with every double quote doubled:
String[] columns = Regex.Split(dataLine, @"[^\s""']+|""([^""]*)""|\'([^\']*)");
The problem is the double quotes inside the regex: the compiler chokes on them because it thinks they mark the end of the string.
You must escape them (and, in a regular string literal, also write \s as \\s), like this:
"[^\\s\"']+|\"([^\"]*)\"|\'([^\']*)"
Edit:
You can actually do everything you want with one regex, without splitting first:
@"(?<=[""])[^,]*?(?=[""])"
Here I use an @-quoted (verbatim) string, where double quotes are doubled instead of escaped.
The regex uses a lookbehind to find a double quote, then matches any character except a comma ',' zero or more times, then looks ahead for a double quote.
How to use:
string test = @"""0"",""Column"",""column2"",""Column3""";
Regex regex = new Regex(@"(?<=[""])[^,]*?(?=[""])");
// Print each value found between a pair of double quotes.
foreach (Match match in regex.Matches(test))
{
    Console.WriteLine(match.Value);
}
You need to escape the double quotes inside of your regular expression, as they're closing the string literal. Also, to handle 'unrecognized escape sequences', you'll need to escape the \ in \s.
Two ways to do this:
Escape all the characters of concern using backslashes: "[^\\s\"']+|\"([^\"]*)\"|\'([^\']*)"
Use the @ syntax to denote a "verbatim" string literal. Double quotes still need to be escaped, but by writing "" for every " instead: @"[^\s""']+|""([^""]*)""|'([^']*)"
Regardless, when I test out your new regular expression it appears to be capturing some empty groups as well, see here: https://dotnetfiddle.net/1WQE4R
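To check that the two escaping styles really describe the same pattern, a small sketch like the following compares the resulting strings. The class and constant names (QuoteEscaping, Regular, Verbatim) are hypothetical; the patterns are the asker's, with \' written as a plain ' since both produce the same character.

```csharp
using System;

public static class QuoteEscaping
{
    // Regular string literal: backslashes and double quotes are escaped with \.
    public const string Regular = "[^\\s\"']+|\"([^\"]*)\"|'([^']*)";

    // Verbatim string literal: backslashes are literal, double quotes are doubled.
    public const string Verbatim = @"[^\s""']+|""([^""]*)""|'([^']*)";

    public static void Main()
    {
        Console.WriteLine(Regular == Verbatim); // True
        Console.WriteLine(Verbatim);            // the pattern as the regex engine sees it
    }
}
```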
This question already has answers here:
What is the difference between a regular string and a verbatim string?
(6 answers)
Closed 7 years ago.
There is the @ operator, which you place in front of a string to allow special characters in it, and there is \. I am aware that with @ you can also use reserved words as variable names, but I am curious only about the difference between using these two with strings.
A web search suggested that the two are the same, but I still believe there has to be some difference between @ and \.
Code to test:
string _string0 = @"Just a ""qoute""";
string _string1 = "Just a \"qoute\"";
Console.WriteLine("{0} | {1}",_string0, _string1);
Question: what is the difference between @"Just a ""qoute"""; and "Just a \"qoute\""; only regarding strings?
Edit: Question is already answered here.
Using @ (which denotes a verbatim string literal) you can put any character into the string, even line breaks. The only character you need to escape is the double quote, which is written as "". Backslash escape sequences such as \n and Unicode escape sequences are not processed in such string literals.
Without @ (in a regular string literal), you need to escape every special character, such as line breaks (\n), backslashes (\\) and double quotes (\").
You can read more about it at the C# Programming Guide:
https://msdn.microsoft.com/en-us/library/ms228362.aspx#Anchor_3
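A short sketch of the difference, using made-up sample strings (the path and the names in VerbatimDemo are purely illustrative): the two literal forms below produce identical strings, while a backslash sequence such as \t is only interpreted in the regular form.

```csharp
using System;

public static class VerbatimDemo
{
    // Same text written as a regular and as a verbatim literal (illustrative values).
    public static readonly string Regular = "C:\\temp\\new \"file\"";
    public static readonly string Verbatim = @"C:\temp\new ""file""";

    public static void Main()
    {
        Console.WriteLine(Regular == Verbatim); // True

        // In a verbatim literal \t stays as two characters: a backslash and a 't'.
        Console.WriteLine(@"a\tb".Length); // 4

        // In a regular literal \t becomes a single tab character.
        Console.WriteLine("a\tb".Length);  // 3
    }
}
```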
@ marks a verbatim string literal: instead of escaping each special character individually, the whole string is taken literally (only a double quote must be doubled). A \, by contrast, escapes only the single character that follows it.
More info about strings: https://msdn.microsoft.com/en-us/library/aa691090%28v=vs.71%29.aspx
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Regular expression to match string not containing a word?
To not match a set of characters I would use e.g. [^\"\\r\\n]*
Now I also want to not match a fixed character sequence, e.g. "|="
In other words, I want to match: ( not ", not \r, not \n, and not |= ).
EDIT: I am trying to modify a regex for parsing data separated by delimiters. The single-delimiter solution I got from a CSV parser, but now I want to expand it to handle multi-character delimiters. I do not think lookaheads will work, because I want to consume the matching characters, not just assert and discard them.
I figured it out, it should be: ((?![\"\\r\\n]|[|][=]).)*
The full regex, modified from the CSV parser link in the original post, will be: ((?<field>((?![\"\\r\\n]|[|][=]).)*)|\"(?<field>([^\"]|\"\")*)\")([|][=]|(?<rowbreak>\\r\\n|\\n|$))
This will match any amount of characters of ( not ", not \r, not \n, and not |= ), or a quoted string, followed by ( "|=" or end of line )
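To illustrate just the first part of that pattern, here is a minimal sketch that anchors it at the start of the input and returns everything up to the first quote, line break, or "|=" delimiter. The names DelimiterDemo and FirstField, and the sample inputs, are hypothetical. The negative lookahead is checked before each character is consumed, which is why a lone | or = is still allowed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DelimiterDemo
{
    // Consumes characters one at a time, stopping before ", \r, \n, or the sequence |=.
    private static readonly Regex FieldPrefix =
        new Regex(@"^((?![""\r\n]|[|][=]).)*");

    // Hypothetical helper: returns the unquoted field at the start of the input.
    public static string FirstField(string input)
    {
        return FieldPrefix.Match(input).Value;
    }

    public static void Main()
    {
        Console.WriteLine(FirstField("abc|=def"));  // abc
        Console.WriteLine(FirstField("a|b=c|=d"));  // a|b=c
    }
}
```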