Regex.Replace() for replacing the whole occurrence - c#

I am using Regex.Replace() to replace whole occurrences of a string, so I wrote Regex.Replace(str, @stringToReplace, "**"); where stringToReplace = @"session" + "\b";
If I write it like that, nothing is replaced, but if I write Regex.Replace(str, @"session\b", "**"); it works. How can I avoid this? I want to pass a value that is set dynamically.
Thanks
nimmi

try
stringToReplace = @"session" + @"\b";

The @ here means a verbatim string literal.
When you write "\b" without the @ it means the backspace character, i.e. the character with ASCII code 8. You want the string consisting of a backslash followed by a b, which means a word boundary when in a regular expression.
To get this you need to either escape the backslash to make it a literal backslash: "\\b" or make the second string also into a verbatim string literal: @"\b". Note also that the @ in @"session" (without the \b) doesn't actually have an effect, although there is no harm in leaving it there.
stringToReplace = "session" + @"\b";

@"session" + "\b"
and
@"session\b"
are not the same string.
In the first case, "\b" is a regular string literal, so the backslash is not treated as a literal backslash but as the start of an escape sequence. In the second case, it is a literal backslash.
So @"session" + @"\b" should give the same result as @"session\b".
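For the dynamic case asked about in the question, a minimal sketch could look like the following. The helper name ReplaceWholeWord is hypothetical, and two details go beyond the original code and are assumptions: a leading \b is added so the match is bounded on both sides, and Regex.Escape is applied in case the dynamic value contains regex metacharacters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WholeWordReplace
{
    // Hypothetical helper: builds the pattern at run time from a plain word.
    public static string ReplaceWholeWord(string input, string word, string replacement)
    {
        // @"\b" is a backslash followed by 'b' (regex word boundary),
        // whereas "\b" would be the single backspace character.
        // Regex.Escape protects against metacharacters in the dynamic value.
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        return Regex.Replace(input, pattern, replacement);
    }

    public static void Main()
    {
        Console.WriteLine(ReplaceWholeWord("session sessions session", "session", "**"));
        // prints: ** sessions **
    }
}
```

Only the standalone occurrences are replaced; "sessions" is left alone because there is no word boundary after its "session" prefix.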

Related

Writing "\\" inside a string

I want to do something like:
string s = "\\blabla";
When you write "\\" it means there will only be a single '\'. How do I write the string so that it actually contains two '\' characters?
This works without a problem:
string s = "//blabla";
If you mean the backslash instead, you can use a verbatim string literal (using the @ symbol to avoid processing escape symbols):
string s = @"\\blabla";
Alternatively you can escape the escape character itself:
string s = "\\\\blabla";
'/' is not the escape char, so you can simply write "//"
The escape char is '\' and, to use it properly, you can refer to the MSDN instructions.
Try this:
string s = @"\\blabla";
The '@' symbol treats whatever follows it as a verbatim string literal (i.e. you won't need to worry about escape characters within the string).
I think you mean \ not /.
You could escape each \ with another backslash ("\\\\") or you could use a verbatim string literal: @"\\"
You may use @ before strings to avoid having to escape special characters.
The advantage of @-quoting is that escape sequences are not processed,
which makes it easy to write
http://msdn.microsoft.com/en-us/library/362314fe(v=vs.71).aspx

How can I add \ symbol to the end of string in C#

Please forgive me a beginner's question :)
string S="abc";
S+="\";
won't compile.
string S="abc";
S+="\\";
will make S="abc\\"
How can I make S="abc\" ?
Your second piece of code is what you want (or a verbatim string literal @"\" as others have suggested), and it only adds a single backslash - print it to the console and you'll see that.
These two pieces of code:
S += "\\";
and
S += @"\";
are exactly equivalent. In both cases, a single backslash is appended [1].
I suspect you're getting confused by the debugger view, which escapes backslashes (and some other characters). You can validate that even with the debugger by looking at S.Length, which you'll see is 4 rather than 5.
[1] Note that it doesn't change the data in the existing string, but it sets the value of S to refer to a new string which consists of the original with a backslash on the end. String objects in .NET are immutable - but that's a whole other topic...
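The point about the debugger view can be checked with a small sketch like this one (the class and method names are made up for illustration). It prints the string and its length instead of relying on how a debugger displays it.

```csharp
using System;

public static class BackslashAppend
{
    // Hypothetical helper: appends exactly one backslash character.
    public static string AppendBackslash(string s)
    {
        return s + "\\"; // "\\" is the escaped form of a single '\'
    }

    public static void Main()
    {
        string s = AppendBackslash("abc");
        Console.WriteLine(s);            // abc\
        Console.WriteLine(s.Length);     // 4
        Console.WriteLine("\\" == @"\"); // True
    }
}
```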
Try this:
String S = "abc";
S += @"\";
@ = verbatim string literal
http://msdn.microsoft.com/en-us/library/aa691090%28v=vs.71%29.aspx
http://msdn.microsoft.com/en-us/library/vstudio/362314fe.aspx
string S = "abc" + "\\";
Should and does result in abc\.
What you are probably seeing is the way the debugger/intellisense visualizes the string for you.
Try printing your string to the console or display it in a textbox.
You already have the solution. The reason it appears as abc\\ whilst debugging is because VS will escape backslashes, print the value of S to a console window and you'll see abc\.
You could add an @ to the start of the string literal, e.g.
string S="abc";
S+= @"\";
Which will achieve the same thing.
You can avoid escaping the backslash by using a verbatim string literal (the @ prefix):
string S="abc";
S += @"\";
But this accomplishes exactly what you've written in your second example. The confusion on this is stemming from the fact that the Visual Studio debugger continues to escape these characters, even though your source string will contain only a single backslash.
Your second example is perfectly fine
string S="abc";
S+="\\";
Visual Studio displays strings in escaped form; that's why you see two backslashes in the resulting string. If you don't want to use escaping, declare the string like this:
@"\"
This is not compiling because compiler is expecting a character after escape symbol
string S="abc";
S+="\";
string S="abc";
S+="\\";
Console.WriteLine(S); // This is what you're missing ;)
You'll see your string is not wrong at all.
The backslash (\) is an escape character; it lets you insert special characters that you couldn't otherwise type in a string literal, such as "\r\n" (carriage return plus line feed, the Windows newline sequence) or "\"", which gives you a " character.
In order to get the \ character, you need to input "\\" which is exactly what you're doing and also what you want.
In a verbatim string literal (prefixed with @), the backslash is not an escape character, so @"\" == "\\". This is usually used for paths and regexes, where many literal backslashes are needed. After all, @"C:\MyDirectory\MyFile" is more comfortable to write than "C:\\MyDirectory\\MyFile".
Try this
string s="abc";
s = s+"\\";

Escape double quotes in a string

Double quotes can be escaped like this:
string test = @"He said to me, ""Hello World"". How are you?";
But this involves adding character " to the string. Is there a C# function or other method to escape double quotes so that no changing in string is required?
No.
Either use verbatim string literals as you have, or escape the " using backslash.
string test = "He said to me, \"Hello World\" . How are you?";
The string has not changed in either case - there is a single escaped " in it. This is just a way to tell C# that the character is part of the string and not a string terminator.
You can use backslash either way:
string str = "He said to me, \"Hello World\". How are you?";
It prints:
He said to me, "Hello World". How are you?
which is exactly the same that is printed with:
string str = @"He said to me, ""Hello World"". How are you?";
" is still part of your string.
You can check Jon Skeet's Strings in C# and .NET article for more information.
In C# you can use the backslash to put special characters to your string.
For example, to put ", you need to write \".
There are a lot of characters that you write using the backslash:
Backslash with other characters
\0 nul character
\a Bell (alert)
\b Backspace
\f Formfeed
\n New line
\r Carriage return
\t Horizontal tab
\v Vertical tab
\' Single quotation mark
\" Double quotation mark
\\ Backslash
Any character substitution by numbers:
\xh to \xhhhh, or \uhhhh - Unicode character in hexadecimal notation (\x has variable digits, \u has 4 digits)
\Uhhhhhhhh - Unicode code point in hexadecimal notation (8 hex digits); code points above U+FFFF are stored as a surrogate pair (2 chars)
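A few of the escapes listed above can be verified with a short sketch. The class name is hypothetical; the values checked are just examples chosen for illustration.

```csharp
using System;

public static class EscapeDemo
{
    public static void Main()
    {
        Console.WriteLine("\"".Length);         // 1 (one double-quote character)
        Console.WriteLine("\\".Length);         // 1 (one backslash character)
        Console.WriteLine("\x41" == "A");       // True (hex escape, variable length)
        Console.WriteLine("\u0041" == "A");     // True (exactly 4 hex digits)
        Console.WriteLine("\U0001F600".Length); // 2 (stored as a surrogate pair)
    }
}
```

Because \x accepts a variable number of hex digits, it can swallow following characters that happen to be hex digits; \u with its fixed four digits avoids that ambiguity.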
Another thing worth mentioning: from C# 6 onward, interpolated strings can be combined with @.
Example:
string helloWorld = @"""Hello World""";
string test = $"He said to me, {helloWorld}. How are you?";
Or
string helloWorld = "Hello World";
string test = $@"He said to me, ""{helloWorld}"". How are you?";
You're misunderstanding escaping.
The extra " characters are part of the string literal; they are interpreted by the compiler as a single ".
The actual value of your string is still He said to me, "Hello World". How are you?, as you'll see if you print it at runtime.
2022 UPDATE: Previously the answer would have been "no". However, C#11 introduces a new feature called "raw string literals." To quote the Microsoft documentation:
"Beginning with C# 11, you can use raw string literals to more easily create strings that are multi-line, or use any characters requiring escape sequences. Raw string literals remove the need to ever use escape sequences. You can write the string, including whitespace formatting, how you want it to appear in output."
SOURCE: https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/strings/#raw-string-literals
EXAMPLE: So using the original example, you could do this (note that raw string literals always begin with three or more quotation marks):
string testSingleLine = """He said to me, "Hello World". How are you?""";
string testMultiLine = """
He said to me, "Hello World". How are you?
""";
Please explain your problem. You say:
But this involves adding character " to the string.
What problem is that? You can't type string foo = "Foo"bar"";, because that'll invoke a compile error. As for the adding part, in string size terms that is not true:
@"""".Length == 1
"\"".Length == 1
In C# 11.0 preview you can use raw string literals.
Raw string literals are a new format for string literals. Raw string literals can contain arbitrary text, including whitespace, new lines, embedded quotes, and other special characters without requiring escape sequences. A raw string literal starts with at least three double-quote (""") characters. It ends with the same number of double-quote characters. Typically, a raw string literal uses three double quotes on a single line to start the string, and three double quotes on a separate line to end the string.
string test = """He said to me, "Hello World" . How are you?""";
In C#, there are at least four ways to embed a quote within a string:
Escape quote with a backslash
Precede string with @ and use double quotes
Use the corresponding ASCII character
Use the hexadecimal Unicode character

replace unicode character

String jData="Memur adayar\u0131n\u0131n en b\u00fcy\u00fck sorunar"
+ "\u0131ndan KPSS \u0 131 ";
jData = Regex.Replace(jData, @"\\u0 ", @"\\u0", RegexOptions.Compiled).Trim();
I have to replace "\u0 " in jData with "\u0" (i.e. remove the trailing whitespace character if there is one) but the method I used isn't working. What should I do?
So you've got some malformed Unicode escapes in the string and you want to fix them by removing any whitespace after the 0. That's simple enough:
jData = Regex.Replace(jData, @"(\\u0)\s+(\w+)", "$1$2");
The hardest part of all this is figuring out what all the backslashes are supposed to mean. C# helps with that by supporting an alternative string literal syntax, the verbatim string (prefixed with @). In a verbatim string the C# compiler does not process backslash escapes, so the only backslash you have to escape is a literal backslash for the regex itself (\\). (You have to escape quotation marks too, but you do that with another quote, i.e. "").
With that out of the way, the real reason I answered this question was to advise you not to use RegexOptions.Compiled. I'm sure you've heard many people say it makes the regex work faster. That's true, but it's an oversimplification. Read this article for a good discussion of this issue. Do yourself a favor and forget RegexOptions.Compiled even exists until you run into a problem you can't solve without it.
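As a rough check of the regex fix suggested in this answer, here is a small sketch. It assumes the input really contains literal backslashes (hence the verbatim literal), uses a shortened sample string, and the helper name RemoveSpaceAfterU0 is made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FixEscapes
{
    // Hypothetical helper: removes whitespace between "\u0" and the
    // remaining characters of a malformed escape such as "\u0 131".
    public static string RemoveSpaceAfterU0(string input)
    {
        // In the pattern, \\ matches one literal backslash;
        // $1 and $2 put the two captured parts back together.
        return Regex.Replace(input, @"(\\u0)\s+(\w+)", "$1$2");
    }

    public static void Main()
    {
        string jData = @"KPSS \u0 131 "; // literal backslash, not a Unicode escape
        Console.WriteLine(RemoveSpaceAfterU0(jData).Trim()); // KPSS \u0131
    }
}
```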
find: @"\\u0 "
replace: @"\\u0"
they are the same. Try it with a capital O or a normal o
I think I got it working
string jData= @"Memur adayar\u0131n\u0131n en b\u00fcy\u00fck sorunar\u0131ndan KPSS \u0 131 ";
jData = Regex.Replace(jData, @"\\u0 ", @"\u0", RegexOptions.Compiled).Trim();
Notice I added an extra '@' in front of the input string. And in the regex part I changed the third argument to @"\u0"
There's a problem with your example string. I'm supposing that you actually wanted the backslashes in the string, in which case the simplest approach is to put @ before the string literals. And then I believe you have the opposite problem in the second line, where you should have either used just one backslash in each string, or omitted the @.
There's no reason to use Regex.Replace() here. jData.Replace() would suffice just fine:
String jData=@"Memur adayar\u0131n\u0131n en b\u00fcy\u00fck sorunar"
+ @"\u0131ndan KPSS \u0 131 ";
jData = jData.Replace(@"\u0 ", @"\u0").Trim();

How to replace two slash into one slash?

We have the following code:
string str="\\u5b89\u5fbd\\";
We need output in the format:
"\u5b89\u5fbd\"
We have tried this code:
str.Replace("\\",@"\")
It's not working.
Try this
string str = "\\u5b89\u5fbd\\";
str = str.Replace(@"\\", @"\");
\ is a reserved sign. \\ escapes it and results in \
Adding @ at the start of a string tells the compiler to use the string as is and not to process escape sequences.
So use either "\\\\" or @"\\"
EDIT
\\u5b89\u5fbd\\ actually does not have two \ together. \ is just escaped.
The string results in \u5b89徽\. And in that string you can't replace \\ because there is only one \ together.
Have you tried this?
str.Replace("\\\\","\\");
Your example accomplishes nothing. "\\" is an escaped version of \, and @"\" is another way of writing \. So your example replaces \ with \
EDIT
Now I understand your problem. What you want can't actually be done, since that would cause the string to end with a single \, and that will not be allowed. \ denotes the start of an escape sequence, and needs something after it.
I think there is no good option here, since in your case \u5b89 is not a string, but an escape sequence for one specific character.
str.Replace("\\u5b89","\u5b89");
This works for your current example, but will only work with this one specific character, so I guess it won't help you much. The \ at the end you cannot replace with \, but I can't see why you need the string to end with this char either.
Your best bet is to make sure that the \ does not occur at the start of the string in the first place, instead of trying to get rid of it afterwards.
Okay so the first string is actually saved as:
"\u5b89[someChineseCharacter]\"
because you are already using escape sequences. If you would like the original string to be what you typed, you have to do it like so:
string str = @"\\u5b89\u5fbd\\";
Then, str = str.Replace(@"\\",@"\") would work.
Some clarification:
When you type string str="\\u5b89\u5fbd\\"; in visual studio, it saves the string \u5b89徽\ in memory, because you are using several escape sequences in the original statement:
\\ actually means \
\u5fbd actually means unicode character 5fbd, which is 徽.
For that reason, these get replaced, and in memory your string looks as mentioned.
So if you try to replace occurrences of two backslashes @"\\", it will appear to do nothing, because there were no such occurrences in the original string to begin with.
Hope this makes it clear.
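The difference between the two literals can be made visible with a small sketch such as this (the class and variable names are hypothetical). It compares lengths and checks for a doubled backslash rather than printing the non-ASCII character.

```csharp
using System;

public static class DoubleBackslashCheck
{
    public static void Main()
    {
        string escaped = "\\u5b89\u5fbd\\";   // escapes processed by the compiler
        string verbatim = @"\\u5b89\u5fbd\\"; // stored exactly as typed

        Console.WriteLine(escaped.Length);          // 8
        Console.WriteLine(escaped.Contains(@"\\")); // False: no doubled backslash
        Console.WriteLine(verbatim.Length);         // 15
        Console.WriteLine(verbatim.Replace(@"\\", @"\")); // \u5b89\u5fbd\
    }
}
```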
Try this it will solve your problem.
str.Replace("\\\\","\\");
Or maybe Something like this?
foreach (char c in str)
{
    if ((int)c < 256)
        Console.Write(c);
    else
        Console.Write(String.Format("\\u{0:x4}", (int)c));
}
;)
Maybe it is just me but I think the input string should have a "\" in the middle, or the second u5fbd will be interpreted as a unicode char (so you won't get it outputted as you wish). With a starting string like this:
string str="\\u5b89\\u5fbd\\";
You don't need any replace to output what you want, if for "output" you mean something like Console or an HTML page...
