This question already has answers here:
What is the difference between a regular string and a verbatim string?
(6 answers)
Closed 7 years ago.
There is the @ prefix that you place in front of a string to allow special characters in the string, and there is \. I am aware that with @ you can also use reserved names for variables, but I am curious only about the difference between the two when used with strings.
A web search suggested that these two are the same, but I still believe there has to be some difference between @ and \.
Code to test:
string _string0 = @"Just a ""qoute""";
string _string1 = "Just a \"qoute\"";
Console.WriteLine("{0} | {1}", _string0, _string1);
Question: what is the difference between @"Just a ""qoute"""; and "Just a \"qoute\""; only regarding strings?
Edit: Question is already answered here.
Using @ (which denotes a verbatim string literal) you can put any character into the string, even line breaks. The only character you need to escape is the double quote. The usual \* escape sequences and Unicode escape sequences are not processed in such string literals.
Without @ (in a regular string literal), you need to escape every special character, such as line breaks.
You can read more about it at the C# Programming Guide:
https://msdn.microsoft.com/en-us/library/ms228362.aspx#Anchor_3
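As a minimal sketch of the difference described above (the class and variable names are illustrative, not from the original post), the following compares the two literal forms and shows that only the regular literal processes backslash escapes:

```csharp
using System;

class VerbatimVsRegular
{
    static void Main()
    {
        // Regular literal: the double quote is escaped with a backslash.
        string regular = "Just a \"quote\"";
        // Verbatim literal: the double quote is escaped by doubling it.
        string verbatim = @"Just a ""quote""";
        Console.WriteLine(regular == verbatim); // True

        // Backslashes must be doubled only in the regular literal.
        string regularPath = "C:\\temp\\new";
        string verbatimPath = @"C:\temp\new";
        Console.WriteLine(regularPath == verbatimPath); // True

        // "\t" is a single tab character; @"\t" is a backslash followed by 't'.
        Console.WriteLine("\t".Length);  // 1
        Console.WriteLine(@"\t".Length); // 2
    }
}
```

Both forms produce ordinary strings at run time; the difference exists only in how the compiler reads the source text.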
@ denotes a verbatim string literal: instead of escaping special characters one at a time, it applies to the whole string, so backslashes are taken literally. A \ only escapes the single character that follows it.
More info about strings: https://msdn.microsoft.com/en-us/library/aa691090%28v=vs.71%29.aspx
This question already has answers here:
How do I write a backslash (\) in a string?
(6 answers)
Why \b does not match word using .net regex
(2 answers)
Closed 3 years ago.
I have regex like this:
(?i)^(?!.*\bWITH\b).*\(\s*.*\s*\b(INDEX|FASTFIRSTROW|HOLDLOCK|SERIALIZABLE|REPEATABLEREAD|READCOMMITTED|READUNCOMMITTED|ROWLOCK|PAGLOCK|TABLOCK|TABLOCKX|NOLOCK|UPDLOCK|XLOCK|READPAST)\b\s*.*\s*\)
It returns true at http://regexstorm.net.
But when I run it in C#, it always returns false.
Input string to test:
INNER JOIN t_hat_meisaimidasi AS MM (READCOMMITTED, NOLOCK) WHERE ( AND hat_kanri_no = ?
Can someone explain why?
Returns true for me; probably you didn't use @"...", so the escape tokens (\b etc) aren't what you think they are:
Console.WriteLine(Regex.IsMatch(
    @"INNER JOIN t_hat_meisaimidasi AS MM (READCOMMITTED, NOLOCK) WHERE ( AND hat_kanri_no = ?",
    @"(?i)^(?!.*\bWITH\b).*\(\s*.*\s*\b(INDEX|FASTFIRSTROW|HOLDLOCK|SERIALIZABLE|REPEATABLEREAD|READCOMMITTED|READUNCOMMITTED|ROWLOCK|PAGLOCK|TABLOCK|TABLOCKX|NOLOCK|UPDLOCK|XLOCK|READPAST)\b\s*.*\s*\)"));
Note: "\b" is a string of length 1 that contains a backspace character; @"\b" is a string of length 2 that contains a backslash and a b. When dealing with regex, you almost always want to use a verbatim string literal (@"...").
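A small sketch of that note, using a shortened hypothetical pattern and input rather than the full regex from the question: the regular literal hands the regex engine backspace characters, while the verbatim literal hands it word-boundary tokens.

```csharp
using System;
using System.Text.RegularExpressions;

class BackspaceVsWordBoundary
{
    static void Main()
    {
        // Hypothetical shortened input; the original question uses a longer SQL fragment.
        string input = "MM (NOLOCK)";

        // Regular literal: each "\b" becomes the backspace character (U+0008).
        string regular = "\bNOLOCK\b";
        // Verbatim literal: the regex engine receives \b, a word boundary.
        string verbatim = @"\bNOLOCK\b";

        Console.WriteLine(regular.Length);   // 8
        Console.WriteLine(verbatim.Length);  // 10

        // The input contains no backspace characters, so the first pattern cannot match.
        Console.WriteLine(Regex.IsMatch(input, regular));  // False
        Console.WriteLine(Regex.IsMatch(input, verbatim)); // True
    }
}
```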
To make it even better: Visual Studio uses colorization to tell you when you're getting it right.
This question already has answers here:
Can I escape a double quote in a verbatim string literal?
(6 answers)
How to split csv whose columns may contain comma
(9 answers)
Closed 4 years ago.
I have a text file as follows:
"0","Column","column2","Column3"
I have managed to get the data down to split to the following:
"0"
"Column"
"Column2"
"Column3"
with ,(?=(?:[^']*'[^']*')*[^']*$), and now I want to remove the quotes. I have tested the expression [^\s"']+|"([^"]*)"|\'([^\']*) in an online regex tester, which gives the output I'm looking for. However, I am getting a syntax error when using the expression:
String[] columns = Regex.Split(dataLine, "[^\s"']+|"([^"]*)"|\'([^\']*)");
Syntax error ',' expected
I've tried escaping characters but to no avail, am I missing something?
Any help would be greatly appreciated!
Thanks.
C# might be escaping the backslash. Try:
String[] columns = Regex.Split(dataLine, @"[^\s""']+|""([^""]*)""|\'([^\']*)");
The problem is the double quotes inside the regex: the compiler chokes on them, thinking they mark the end of the string.
You must escape them (and, in a regular string literal, double the backslash in \s), like this:
"[^\\s\"']+|\"([^\"]*)\"|\'([^\']*)"
Edit:
You can actually do everything you want with one regex, without splitting first:
@"(?<=[""])[^,]*?(?=[""])"
Here I use an @-quoted string, where double quotes are doubled instead of escaped.
The regex uses a lookbehind to find a double quote, then matches any character except a comma ',' zero or more times, then looks ahead for a double quote.
How to use:
string test = @"""0"",""Column"",""column2"",""Column3""";
Regex regex = new Regex(@"(?<=[""])[^,]*?(?=[""])");
foreach (Match match in regex.Matches(test))
{
    Console.WriteLine(match.Value);
}
You need to escape the double quotes inside of your regular expression, as they're closing the string literal. Also, to handle 'unrecognized escape sequences', you'll need to escape the \ in \s.
Two ways to do this:
Escape all the characters of concern using backslashes: "[^\\s\"']+|\"([^\"]*)\"|\'([^\']*)"
Use the @ syntax to denote a "verbatim" string literal. Double quotes still need to be escaped, but by writing "" for each ": @"[^\s""']+|""([^""]*)""|'([^']*)"
Regardless, when I test out your new regular expression it appears to be capturing some empty groups as well, see here: https://dotnetfiddle.net/1WQE4R
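As a quick check of the two spellings listed above, this sketch (class name and sample input are illustrative) confirms that both literals produce the same pattern string and that a quoted value lands in the first capture group:

```csharp
using System;
using System.Text.RegularExpressions;

class TwoWaysToEscape
{
    static void Main()
    {
        // Option 1: regular literal, backslashes and double quotes escaped.
        string escaped = "[^\\s\"']+|\"([^\"]*)\"|\'([^\']*)";
        // Option 2: verbatim literal, double quotes doubled.
        string verbatim = @"[^\s""']+|""([^""]*)""|'([^']*)";

        // Both literals yield the same characters at run time.
        Console.WriteLine(escaped == verbatim); // True

        // A double-quoted value is captured, without its quotes, in group 1.
        Match m = Regex.Match("\"Column\"", verbatim);
        Console.WriteLine(m.Groups[1].Value); // Column
    }
}
```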
This question already has an answer here:
Reference - What does this regex mean?
(1 answer)
Closed 6 years ago.
I am new to regex. What does the regex pattern "\[.*\]" mean?
If I have a text like "Hello [Here]", then the match succeeds, and the match contains [Here].
I read that:
. indicates Any except \n (newline),
* indicates 0 or more times
I don't understand the "\". I believe it is just an escape sequence for "\".
So, is the expression "\[.*\]" trying to match a pattern like \[Any text\]?
Yes, you are right. It will match any characters enclosed in []: the \[ and \] match literal square brackets, and the .* means zero or more characters between them.
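A minimal sketch of this behaviour, assuming the .NET Regex class and the sample text from the question (the second input is made up to show greediness):

```csharp
using System;
using System.Text.RegularExpressions;

class BracketMatch
{
    static void Main()
    {
        // \[ and \] match literal brackets; .* matches zero or more characters.
        Match m = Regex.Match("Hello [Here]", @"\[.*\]");
        Console.WriteLine(m.Success); // True
        Console.WriteLine(m.Value);   // [Here]

        // In a regular string literal each backslash has to be doubled.
        Console.WriteLine("\\[.*\\]" == @"\[.*\]"); // True

        // .* is greedy, so the match extends to the last closing bracket.
        Console.WriteLine(Regex.Match("[a] and [b]", @"\[.*\]").Value); // [a] and [b]
    }
}
```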
Also you should try this link which is a very helpful regex tool. You can input the regex pattern and check for matches easily.
I have tried this on regexr, here is a screen shot:
Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
If I store a verbatim string,
string s=@""Hello""; it should hypothetically parse the string and print "Hello" (taking the " character as a part of the string)
but in practice it fails, and instead string s=@"""Hello"""; displays the desired output ("Hello").
Why does the syntax demand an extra " character? What would the disruptive consequences be if C# worked in the hypothetical manner described above?
So technically, string s=@""Hello""; should print "Hello"
No, that would just be invalid. The compiler should - and does - obey the rules of the language specification.
Within a verbatim string literal, double-quotes must be doubled (not tripled) to distinguish them from the double-quote just meaning "the end of the string".
It's easiest to see this if you use a double-quote within the middle of a string, not at the start or end:
string x = @"start "" end";
F# works perfectly in the above case
Not as far as I can see. In a sample F# project in VS2015 Preview:
let x = @""Hello""
... gives an error, whereas @"""Hello""" results in a string of "Hello".
Additionally, the F# verbatim string literal documentation suggests it works exactly as per C#:
If preceded by the @ symbol, the literal is a verbatim string. This means that any escape sequences are ignored, except that two quotation mark characters are interpreted as one quotation mark character.
Basically, it sounds like verbatim string literals work in both C# and F# perfectly well, with both of them requiring double-quotes to be doubled. F# also has tripled-quoted strings, within which they don't need to be doubled... but that's just a separate feature that C# doesn't have.
EDIT: To answer your comment about ambiguity, you gave an example of @"1"2"". Let me change that very slightly:
string x = @"1"+"";
In C# at the moment, that means the concatenation of a verbatim string literal with content 1 and an empty regular string literal. If you're proposing that it should actually be a single verbatim string literal with content 1"+", with the parser relying on the quote directly before the next semi-colon being the end of the verbatim string literal, that strikes me as a really bad idea. For example, suppose I want to use a verbatim string literal as the first argument in a two-argument method call, like this:
Console.WriteLine(@"Foo {0}", "Bar");
By your rules, that would be a call with a single argument - it would be impossible to represent the call we're trying to make, as a single statement. (You'd need to use a variable for the verbatim string literal to avoid it messing things up.)
In short, I'm happier with the way C# already works.
That's because you need an opening and a closing " character too. So with a line like yours:
s=@"""Hello""";
The first and the last quote characters indicate the start and the end of the string to the compiler. Within the string, the doubled quotation sequence ("") indicates that you want a quote character in your string.
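The rules discussed in this thread can be checked with a short sketch (class and variable names are illustrative): the outer quotes delimit the literal, each inner "" contributes one quote character, and a lone " ends the literal.

```csharp
using System;

class DoubledQuotes
{
    static void Main()
    {
        // Outer quotes delimit the literal; each inner "" is one quote character.
        string s = @"""Hello""";
        Console.WriteLine(s);                // "Hello"
        Console.WriteLine(s.Length);         // 7
        Console.WriteLine(s == "\"Hello\""); // True

        // A doubled quote in the middle of a verbatim literal.
        string x = @"start "" end";
        Console.WriteLine(x);                // start " end

        // A single " ends the verbatim literal, so this is "1" concatenated with "".
        string y = @"1"+"";
        Console.WriteLine(y == "1");         // True
    }
}
```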
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Regular expression to match string not containing a word?
To not match a set of characters I would use e.g. [^\"\\r\\n]*
Now I want to not match a fixed character sequence, e.g. "|="
In other words, I want to match: ( not ", not \r, not \n, and not |= ).
EDIT: I am trying to modify the regex for parsing data separated by delimiters. I got the single-delimiter solution from a CSV parser, but now I want to expand it to include multi-character delimiters. I do not think lookaheads will work, because I want to consume, not just assert and discard, the matching characters.
I figured it out, it should be: ((?![\"\\r\\n]|[|][=]).)*
The full regex, modified from the CSV parser link in the original post, will be: ((?<field>((?![\"\\r\\n]|[|][=]).)*)|\"(?<field>([^\"]|\"\")*)\")([|][=]|(?<rowbreak>\\r\\n|\\n|$))
This will match any amount of characters of ( not ", not \r, not \n, and not |= ), or a quoted string, followed by ( "|=" or end of line )
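To illustrate the unquoted-field fragment on its own, here is a small sketch, assuming the .NET Regex class; the class name and sample inputs are made up for the example. The negative lookahead is checked before every consumed character, so the match stops right before a quote, a line break, or the two-character delimiter.

```csharp
using System;
using System.Text.RegularExpressions;

class MultiCharDelimiter
{
    static void Main()
    {
        // The fragment from the answer, written as a regular C# string literal.
        string pattern = "((?![\"\\r\\n]|[|][=]).)*";

        // Consumption stops right before the "|=" delimiter.
        Console.WriteLine(Regex.Match("abc|=def", pattern).Value); // abc

        // A lone '|' is not the delimiter, so it is consumed as data.
        Console.WriteLine(Regex.Match("a|b|=c", pattern).Value);   // a|b

        // A double quote also stops the match.
        Console.WriteLine(Regex.Match("ab\"cd", pattern).Value);   // ab
    }
}
```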