Trying to understand this line of Java, as C# code

See this java code :-
s = s.replaceAll( "\\\\", "\\\\\\\\" ).replaceAll( "\\$", "\\\\\\$" );
I sorta don't understand it. It's a regex replace all.
I've tried the following C# code...
text = text.RegexReplace("\\\\", "\\\\\\\\");
text = text.RegexReplace("\\$", "\\\\\\$");
But if i have the following unit test :-
} ul[id$=foo] label:hover {
The java code returns: } ul[id\$=foo] label:hover {
My c# code returns: } ul[id\\\$=foo] label:hover {
So I'm not sure I understand why my C# code is putting more \'s in, mainly with regard to how these escape characters are being represented.
Update:
So, when i use XXX's idea of just using text.Replace(..), this works.
eg.
text = text.Replace("\\\\", "\\\\\\\\");
text = text.Replace("\\$", "\\\\\\$");
But I was hoping to stick with RegEx... to try and keep it as close to the java code as possible.
The extension method being used is...
public static string RegexReplace(this string input,
string pattern,
string replacement)
{
return Regex.Replace(input, pattern, replacement);
}
hmm...

Java needs every literal $ sign escaped in its replacement string: "\\\\\\$" means \\ and \$, i.e. a literal backslash followed by a literal dollar sign. Without the escape it throws an error: http://www.regular-expressions.info/refreplace.html (look for "$ (unescaped dollar as literal text)").
Remember that $1, $0, etc. are replaced with the text of the captured groups, so they are part of the syntax of the second argument to replaceAll. C# has a slightly different syntax and doesn't require the extra backslash, which it takes literally.
You could write:
text = text.RegexReplace(@"\\", @"\\");
text = text.RegexReplace(@"\$", @"\$");
Or,
text = text.RegexReplace(@"[$\\]", @"\$&");
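As a rough check of the single-pass variant, here is a minimal sketch that runs it against the unit-test input from the question. The class and method names are made up for illustration, and it calls Regex.Replace directly rather than the RegexReplace extension method.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DollarEscapeDemo
{
    // Hypothetical helper: puts a backslash in front of every '\' and '$'.
    public static string EscapeBackslashAndDollar(string text)
    {
        // Pattern: a character class matching '$' or a backslash.
        // Replacement: a literal backslash followed by $& (the whole match).
        return Regex.Replace(text, @"[$\\]", @"\$&");
    }

    public static void Main()
    {
        Console.WriteLine(EscapeBackslashAndDollar("} ul[id$=foo] label:hover {"));
        // prints: } ul[id\$=foo] label:hover {
    }
}
```

Because both characters are handled in one pass, the backslash added in front of a dollar sign is never doubled by a second replacement.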

I think it's the equivalent of this C# code:
text = text.Replace(@"\", @"\\");
text = text.Replace("$", @"\$");
The @ indicates a verbatim string in C#, meaning that the backslashes in strings don't have to be escaped with more backslashes. In other words, the code replaces a single backslash with a double backslash and then replaces a dollar sign with a backslash followed by a dollar sign.
If you were to use the regex function, it would be something like this:
text = text.RegexReplace(@"\\", @"\\");
text = text.RegexReplace(@"\$", @"\$$");
Note that in the regex pattern (the first parameter), backslashes are special, while in the replacement (the second parameter) it is the dollarsigns that are special.
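To make that distinction concrete, here is a small sketch of how .NET treats the two arguments. The class name, method names and sample inputs are invented for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class SpecialCharsDemo
{
    // In the pattern, an unescaped dot matches any character (except newline).
    public static string ReplaceAnyChar(string input) =>
        Regex.Replace(input, ".", "-");

    // Escaping the dot with a backslash makes it match only a literal dot.
    public static string ReplaceLiteralDot(string input) =>
        Regex.Replace(input, @"\.", "-");

    // In the replacement, $$ is a literal dollar sign and $1 is capture group 1.
    public static string PrefixNumbersWithDollar(string input) =>
        Regex.Replace(input, @"(\d+)", "$$$1");

    public static void Main()
    {
        Console.WriteLine(ReplaceAnyChar("a.b"));                // ---
        Console.WriteLine(ReplaceLiteralDot("a.b"));             // a-b
        Console.WriteLine(PrefixNumbersWithDollar("costs 5"));   // costs $5
    }
}
```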

The code quotes the backslashes and '$' characters in the original string.

Java regex parsing: http://download.oracle.com/javase/1.4.2/docs/api/java/util/regex/Pattern.html
C#: http://msdn.microsoft.com/en-us/library/xwewhkd1.aspx
I think that in Java you have to escape the \ character in the replacement string with another \, but in C# you don't. Try taking out half of the \ characters in the replacement strings of your C# version.

Related

Regex for ConstantText dot any text

I need a C# regex for these 2 cases.
1)MyConstantText
2)MyConstantText.[a-zA-Z]
ex.
My const text is Hello, then regex must match
Hello
Hello.ashdkajshd
Do not forget to escape when creating regular expressions:
String text = "Hello";
// Escape text as well as dot (\.)
// Technically, you don't have to escape "Hello", but since
// text can be an arbitrary string, you'd better do it
String pattern = Regex.Escape(text) + @"(\.[a-zA-Z]+)?";
// Simple test
Console.Write(Regex.Match("Hello.ashdkajshd", pattern).Value);
Remark: Please note that the pattern provided in the question (MyConstantText.[a-zA-Z]) doesn't match the whole sample in the question ("Hello.ashdkajshd") but only "Hello.a". So I've changed the corresponding subpattern to [a-zA-Z]+ (note the +).
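The pattern above finds the constant text anywhere inside a longer string. If the whole string has to be exactly one of the two cases, one option is to anchor the pattern. The sketch below assumes that stricter requirement, which the question does not state outright; the class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ConstantTextMatcher
{
    // Returns true only when candidate is exactly constantText,
    // or constantText followed by a dot and one or more ASCII letters.
    public static bool IsWholeMatch(string constantText, string candidate)
    {
        // ^ anchors at the start; \z anchors at the very end of the input.
        string pattern = "^" + Regex.Escape(constantText) + @"(\.[a-zA-Z]+)?\z";
        return Regex.IsMatch(candidate, pattern);
    }

    public static void Main()
    {
        Console.WriteLine(IsWholeMatch("Hello", "Hello"));            // True
        Console.WriteLine(IsWholeMatch("Hello", "Hello.ashdkajshd")); // True
        Console.WriteLine(IsWholeMatch("Hello", "Hello."));           // False
        Console.WriteLine(IsWholeMatch("Hello", "HelloX"));           // False
    }
}
```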
Here is a tutorial for regex in C#. If you get an error, you can post it.

How to give single back slash to a variable in c# program? [duplicate]

I want to write something like this C:\Users\UserName\Documents\Tasks in a textbox:
txtPath.Text = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments)+"\Tasks";
I get the error:
Unrecognized escape sequence.
How do I write a backslash in a string?
The backslash ("\") character is a special escape character used to indicate other special characters such as new lines (\n), tabs (\t), or quotation marks (\").
If you want to include a backslash character itself, you need two backslashes or use the @ verbatim string:
var s = "\\Tasks";
// or
var s = @"\Tasks";
Read the MSDN documentation/C# Specification which discusses the characters that are escaped using the backslash character and the use of the verbatim string literal.
Generally speaking, most C# .NET developers tend to favour using the @ verbatim strings when building file/folder paths since it saves them from having to write double backslashes all the time and they can directly copy/paste the path, so I would suggest that you get in the habit of doing the same.
That all said, in this case, I would actually recommend you use the Path.Combine utility method as in @lordkain's answer as then you don't need to worry about whether backslashes are already included in the paths and accidentally doubling-up the slashes or omitting them altogether when combining parts of paths.
To escape the backslash, simply use 2 of them, like this:
\\
If you need to escape other things, this may be helpful..
There is a special function made for this Path.Combine()
var folder = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
var fullpath = Path.Combine(folder, "Tasks");
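A self-contained sketch of the Path.Combine approach follows. The wrapper class and method names are made up for illustration, and the printed path depends on the machine and operating system it runs on.

```csharp
using System;
using System.IO;

public static class TasksPathDemo
{
    // Hypothetical helper: appends "Tasks" using the platform's directory separator.
    public static string BuildTasksPath(string documentsFolder)
    {
        return Path.Combine(documentsFolder, "Tasks");
    }

    public static void Main()
    {
        string documents = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
        Console.WriteLine(BuildTasksPath(documents));

        // The escaped and verbatim literals produce the same string.
        Console.WriteLine("\\Tasks" == @"\Tasks"); // True
    }
}
```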
Just escape the "\" by using + "\\Tasks" or use a verbatim string like @"\Tasks"
txtPath.Text = Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments)+"\\Tasks";
Put a double backslash instead of a single backslash...
even though this post is quite old I tried something that worked for my case .
I wanted to create a string variable with the value below:
21541_12_1_13\":null
so my approach was like that:
build the string using verbatim
string substring = @"21541_12_1_13\"":null";
and then remove the unwanted backslashes using Remove function
string newsubstring = substring.Remove(13, 1);
Hope that helps.
Cheers

Escape double quotes in a string

Double quotes can be escaped like this:
string test = @"He said to me, ""Hello World"". How are you?";
But this involves adding character " to the string. Is there a C# function or other method to escape double quotes so that no changing in string is required?
No.
Either use verbatim string literals as you have, or escape the " using backslash.
string test = "He said to me, \"Hello World\" . How are you?";
The string has not changed in either case - there is a single escaped " in it. This is just a way to tell C# that the character is part of the string and not a string terminator.
You can use backslash either way:
string str = "He said to me, \"Hello World\". How are you?";
It prints:
He said to me, "Hello World". How are you?
which is exactly the same that is printed with:
string str = @"He said to me, ""Hello World"". How are you?";
" is still part of your string.
You can check Jon Skeet's Strings in C# and .NET article for more information.
In C# you can use the backslash to put special characters to your string.
For example, to put ", you need to write \".
There are a lot of characters that you write using the backslash:
Backslash with other characters
\0 nul character
\a Bell (alert)
\b Backspace
\f Formfeed
\n New line
\r Carriage return
\t Horizontal tab
\v Vertical tab
\' Single quotation mark
\" Double quotation mark
\\ Backslash
Any character substitution by numbers:
\xh to \xhhhh, or \uhhhh - Unicode character in hexadecimal notation (\x has variable digits, \u has 4 digits)
\Uhhhhhhhh - Unicode surrogate pair (8 hex digits, 2 characters)
Another thing worth mentioning: from C# 6, interpolated strings can be used along with @.
Example:
string helloWorld = @"""Hello World""";
string test = $"He said to me, {helloWorld}. How are you?";
Or
string helloWorld = "Hello World";
string test = $@"He said to me, ""{helloWorld}"". How are you?";
You're misunderstanding escaping.
The extra " characters are part of the string literal; they are interpreted by the compiler as a single ".
The actual value of your string is still He said to me, "Hello World". How are you?, as you'll see if you print it at runtime.
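A quick way to convince yourself is to compare the two literal forms at runtime. This is only a sketch; the class and method names are invented for illustration.

```csharp
using System;

public static class QuoteLiteralDemo
{
    // Regular literal: each embedded quote is written as \"
    public static string Escaped() => "He said to me, \"Hello World\". How are you?";

    // Verbatim literal: each embedded quote is written as ""
    public static string Verbatim() => @"He said to me, ""Hello World"". How are you?";

    public static void Main()
    {
        // Both literals compile to the same string value.
        Console.WriteLine(Escaped() == Verbatim()); // True
        Console.WriteLine(Verbatim());
        // prints: He said to me, "Hello World". How are you?
    }
}
```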
2022 UPDATE: Previously the answer would have been "no". However, C# 11 introduces a new feature called "raw string literals". To quote the Microsoft documentation:
"Beginning with C# 11, you can use raw string literals to more easily create strings that are multi-line, or use any characters requiring escape sequences. Raw string literals remove the need to ever use escape sequences. You can write the string, including whitespace formatting, how you want it to appear in output."
SOURCE: https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/strings/#raw-string-literals
EXAMPLE: So using the original example, you could do this (note that raw string literals always begin with three or more quotation marks):
string testSingleLine = """He said to me, "Hello World". How are you?""";
string testMultiLine = """
He said to me, "Hello World". How are you?
""";
Please explain your problem. You say:
But this involves adding character " to the string.
What problem is that? You can't type string foo = "Foo"bar"";, because that'll invoke a compile error. As for the adding part, in string size terms that is not true:
@"""".Length == 1
"\"".Length == 1
In C# 11.0 preview you can use raw string literals.
Raw string literals are a new format for string literals. Raw string literals can contain arbitrary text, including whitespace, new lines, embedded quotes, and other special characters without requiring escape sequences. A raw string literal starts with at least three double-quote (""") characters. It ends with the same number of double-quote characters. Typically, a raw string literal uses three double quotes on a single line to start the string, and three double quotes on a separate line to end the string.
string test = """He said to me, "Hello World" . How are you?""";
In C#, there are at least four ways to embed a quote within a string:
Escape quote with a backslash
Precede string with @ and use double quotes
Use the corresponding ASCII character
Use the hexadecimal Unicode character
Please refer to this document for a detailed explanation.

How do I escape a RegEx?

I have a Regex that I now need to moved into C#. I'm getting errors like this
Unrecognized escape sequence
I am using Regex.Escape -- but obviously incorrectly.
string pattern = Regex.Escape("^.*(?=.{7,})(?=.*[a-zA-Z])(?=.*(\d|[!@#$%\?\(\)\*\&\^\-\+\=_])).*$");
hiddenRegex.Attributes.Add("value", pattern);
How is this correctly done?
Is the error you're getting coming at compile time? That means the C# compiler is not able to make sense of your string. Prepend an @ sign to the string and you should be fine. You don't need Regex.Escape.
See What's the @ in front of a string in C#?
var pattern = new Regex(@"^.*(?=.{7,})(?=.*[a-zA-Z])(?=.*(\d|[!@#$%\?\(\)\*\&\^\-\+\=_])).*$");
pattern.IsMatch("Your input string to test the pattern against");
The error you are getting is due to the fact that your string contains invalid escape sequences (e.g. \d). To fix this, either escape the backslashes manually or write a verbatim string literal instead:
string pattern = @"^.*(?=.{7,})(?=.*[a-zA-Z])(?=.*(\d|[!@#$%\?\(\)\*\&\^\-\+\=_])).*$";
Regex.Escape would be used when you want to embed dynamic content to a regular expression, not when you want to construct a fixed regex. For example, you would use it here:
string name = "this comes from user input";
string pattern = string.Format("^{0}$", Regex.Escape(name));
You do this because name could very well include characters that have special meaning in a regex, such as dots or parentheses. When name is hardcoded (as in your example) you can escape those characters manually.
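To illustrate why the escaping matters for dynamic content, here is a minimal sketch comparing an escaped and an unescaped pattern. The class name, method names and the sample name "a.b (c)" are made up for the example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExactNameMatcher
{
    // Treats name as literal text: metacharacters in it are escaped first.
    public static bool MatchesExactly(string name, string candidate)
    {
        string pattern = string.Format("^{0}$", Regex.Escape(name));
        return Regex.IsMatch(candidate, pattern);
    }

    // For comparison only: name is dropped into the pattern unescaped,
    // so its dots and parentheses are interpreted as regex syntax.
    public static bool MatchesUnescaped(string name, string candidate)
    {
        return Regex.IsMatch(candidate, string.Format("^{0}$", name));
    }

    public static void Main()
    {
        string name = "a.b (c)";
        Console.WriteLine(MatchesExactly(name, "a.b (c)"));   // True
        Console.WriteLine(MatchesExactly(name, "axb c"));     // False
        Console.WriteLine(MatchesUnescaped(name, "axb c"));   // True
        Console.WriteLine(MatchesUnescaped(name, "a.b (c)")); // False
    }
}
```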

Replace single backslash with double backslash

It seems simple enough, right? Well, I don't know.
Here's the code I'm trying:
input = Regex.Replace(input, "\\", "\\\\\\");
However, I'm receiving an error,
ArgumentException was unhandled - parsing "\" - Illegal \ at end of pattern.
How do I do this?
The first one should be "\\\\", not "\\". It works like this:
You have written "\\".
This translates to the sequence \ in a string.
The regex engine then reads this, which translates as backslash which isn't escaping anything, so it throws an error.
With regex, it's much easier to use a "verbatim string". In this case the verbatim string would be @"\\". When using verbatim strings you only have to consider escaping for the regex engine, as backslashes are treated literally. The second string will also be @"\\", as it will not be interpreted by the regex engine.
If you want to replace one backslash with two, it might be clearer to eliminate one level of escaping in the regular expression by using @"..." as the format for your string literals, also known as a verbatim string. It is then easier to see that
string output = Regex.Replace(input, @"\\", @"\\");
is a replacement from \ to \\.
I know it's too late to help you, maybe someone else will benefit from this. Anyway this worked for me:
text = text.Replace(@"\",@"\\");
and I find it even simpler.
Cheers!
var result = Regex.Replace(@"afd\tas\asfd\", @"\\", @"\\");
The first parameter is string \\ which is \ in regex.
The second parameter is not processed by regex, so it will put it as is, when replacing.
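As a rough check, the regex version and the plain string version can be compared side by side. The class and method names below are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BackslashDoubler
{
    // Pattern \\ matches one backslash; the replacement is two literal backslashes.
    public static string WithRegex(string input) =>
        Regex.Replace(input, @"\\", @"\\");

    // Plain string replacement: no regex escaping involved at all.
    public static string WithStringReplace(string input) =>
        input.Replace(@"\", @"\\");

    public static void Main()
    {
        string input = @"afd\tas\asfd\";
        Console.WriteLine(WithRegex(input));         // afd\\tas\\asfd\\
        Console.WriteLine(WithStringReplace(input)); // afd\\tas\\asfd\\
    }
}
```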
If you intend to use the input in a regex pattern later, it can be a good idea to use Regex.Escape.
input = Regex.Escape(input);
