Unable to deserialize json where text contains quote [duplicate] - c#

I'm trying to show double quotes but it shows one of the backslashes:
"maingame": {
"day1": {
"text1": "Tag 1",
"text2": "Heute startet unsere Rundreise \\\"Example text\\\". Jeden Tag wird ein neues Reiseziel angesteuert bis wir.</strong> "
}
}
When rendering in the html it shows as \"Example text\". What is the correct way?

Try this:
"maingame": {
"day1": {
"text1": "Tag 1",
"text2": "Heute startet unsere Rundreise \" Example text\". Jeden Tag wird ein neues Reiseziel angesteuert bis wir.</strong> "
}
}
(just one backslash (\) in front of quotes).

When and where to use \\\" instead: if you are like me, you will feel just as silly as I did once I realized what I was doing after finding this thread.
If you're writing a .json text file or stream and importing the data from there, then the mainstream answer of a single backslash before the double quote, \", is the one you're looking for.
However, if you're trying to get the w3schools.com "Tryit Editor" to show double quotes in the output of JSON.parse(text), then you want the triple-backslash form \\\". That is because you're building the text as a JavaScript string literal inside an HTML <script> block: the first two backslashes put a single backslash into the string variable, and the following backslash plus double quote puts the double quote into it. The resulting string therefore contains the \" from the standard answer, which the JSON parser reads as just a double quote.
<script>
var text="{";
text += '"quip":"\\\"If nobody is listening, then you\'re likely talking to the wrong audience.\\\""';
text += "}";
var obj=JSON.parse(text);
</script>
+1: since it's a JavaScript string literal, a double backslash followed by a double quote, \\", would work too, because the double quote does not need to be escaped within a single-quoted string; e.g. '\"' and '"' result in the same JS string.
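A minimal sketch of that equivalence, assuming a single-quoted JavaScript literal as in the snippet above (the variable names and the shortened "hello" text are made up for illustration):

```javascript
// Both literals are single-quoted, so a bare " needs no JavaScript escape.
const tripleBackslash = '{"quip":"\\\"hello\\\""}';
const doubleBackslash = '{"quip":"\\"hello\\""}';

// After JavaScript processes its own escapes, both hold the same JSON text.
console.log(tripleBackslash === doubleBackslash); // true
console.log(tripleBackslash);                     // {"quip":"\"hello\""}

// The JSON parser then turns \" into a plain double quote.
console.log(JSON.parse(tripleBackslash).quip);    // "hello"
```

Printing the string before parsing it is the quickest way to see which layer consumed which backslash.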

If you want to escape a double quote in JSON that is written inside a JavaScript string literal, put \\ in front of it.
For example, suppose you want to create the JSON for the following JavaScript object:
{time: '7 "o" clock'}
Then you must write it this way:
'{"time":"7 \\"o\\" clock"}'
If we parse it using JSON.parse():
JSON.parse('{"time":"7 \\"o\\" clock"}')
the result will be:
{time: '7 "o" clock'}
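One way to double-check the escaping is to let JSON.stringify produce the JSON text and compare it with the hand-written literal. This is only a sketch; the variable names are hypothetical:

```javascript
const clock = { time: '7 "o" clock' };

// JSON.stringify adds the single backslash that JSON itself requires.
const clockJson = JSON.stringify(clock);
console.log(clockJson); // {"time":"7 \"o\" clock"}

// Written as a JavaScript literal, that same text needs \\ before each quote.
console.log(clockJson === '{"time":"7 \\"o\\" clock"}'); // true

// Parsing restores the original value.
console.log(JSON.parse(clockJson).time); // 7 "o" clock
```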

It's showing the backslash because you're also escaping the backslash.
Aside from double quotes, you must also escape backslashes if you want to include one in your JSON quoted string. However if you intend to use a backslash in an escape sequence, obviously you shouldn't escape it.

Note that this most often occurs when the content has been "double encoded", meaning the encoding algorithm has accidentally been called twice.
The first call would encode the "text2" value:
FROM: Heute startet unsere Rundreise "Example text". Jeden Tag wird ein neues Reiseziel angesteuert bis wir.
TO: Heute startet unsere Rundreise \"Example text\". Jeden Tag wird ein neues Reiseziel angesteuert bis wir.
A second encoding then converts it again, escaping the already escaped characters:
FROM: Heute startet unsere Rundreise \"Example text\". Jeden Tag wird ein neues Reiseziel angesteuert bis wir.
TO: Heute startet unsere Rundreise \\\"Example text\\\". Jeden Tag wird ein neues Reiseziel angesteuert bis wir.
So, if you are responsible for the implementation of the server here, check to make sure there aren't two steps trying to encode the same content.
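The effect is easy to reproduce. The sketch below uses JSON.stringify as a stand-in for whatever encoder the server actually calls, and a shortened version of the text; both are assumptions for illustration:

```javascript
const original = 'Rundreise "Example text".';

// First encoding: quotes become \"
const encodedOnce = JSON.stringify(original);
console.log(encodedOnce);  // "Rundreise \"Example text\"."

// Second encoding: the backslashes and quotes are escaped again, giving \\\"
const encodedTwice = JSON.stringify(encodedOnce);
console.log(encodedTwice); // "\"Rundreise \\\"Example text\\\".\""

// A single parse only undoes the outer layer, so backslashes remain visible.
const afterOneParse = JSON.parse(encodedTwice);
console.log(afterOneParse === encodedOnce);          // true
console.log(JSON.parse(afterOneParse) === original); // true
```

If a second JSON.parse is needed to get clean text back, the data was encoded twice somewhere upstream.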

To escape backslashes that cause problems in JSON data, I use this function.
// Escape each backslash by doubling it, so the JSON parser reads a literal backslash
var escapeJSON = function(str) {
  return str.replace(/\\/g, '\\\\');
};
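As an alternative to escaping by hand, JSON.stringify already escapes backslashes and quotes when it builds the JSON text. A small sketch, with a made-up Windows path as the sample value:

```javascript
// In the JavaScript literal each \\ is one backslash: C:\Windows\System32
const windowsPath = 'C:\\Windows\\System32';

// JSON.stringify doubles each backslash in the generated JSON text.
const pathJson = JSON.stringify({ path: windowsPath });
console.log(pathJson); // {"path":"C:\\Windows\\System32"}

// Parsing gives back the original single backslashes.
console.log(JSON.parse(pathJson).path === windowsPath); // true
```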

tl;dr
If you're inside JavaScript, Python, etc., use raw strings (Python's r'' or JavaScript's String.raw or similar). They make it much easier to write JSON strings because they avoid a second round of escape sequence processing.
console.log(JSON.parse(String.raw`"This is a double quote >> \" <<"`))
// => This is a double quote >> " <<
More in depth
Some of the confusion when writing JSON strings in code comes from escape sequences being processed more than once: first by the programming language, then again by the JSON parser (e.g. JSON.parse() in JavaScript, or similar).
Use the language's print function to see what escapes are happening in the programming language
It can be confusing to see how strings are displayed in a programming language repl.
E.g. when you type a string directly into a javascript repl, it displays it this way
'Two newlines:\n\nTab here >>\t<<\n\nBackslash here >>\\<<'
// => 'Two newlines:\n\nTab here >>\t<<\n\nBackslash here >>\\<<'
But when you console.log() the string, it displays it this way
console.log('Two newlines:\n\nTab here >>\t<<\n\nBackslash here >>\\<<')
/* =>
Two newlines:

Tab here >>     <<

Backslash here >>\<<
*/
When javascript encounters a string, it 'evaluates' the escape sequences before passing it e.g. to a function, in the sense that it replaces each \n with a newline character, each \t with a tab character, etc.
So it helps a lot to console.log() the string to get a better idea of what's going on.
How to encode a double quote in JSON in JavaScript
To put a " inside a JSON string written in JavaScript, you could use
console.log(JSON.parse('"This is a double quote >> \\" <<"'));
// => This is a double quote >> " <<
It'd be similar in python and other languages.
Step by step:
1. JavaScript evaluates the string literal using the escape sequence rules from the JavaScript spec, replacing \n with a newline char, \t with a tab char, etc.
   - In our case, it replaces \\ with \.
   - The resulting string is "This is a double quote >> \" <<"
   - The outer double quotes are there to make it a valid JSON string.
2. JavaScript takes the result and passes it to the JSON.parse() function.
3. JSON.parse evaluates the string using the escape sequence rules from the JSON standard, replacing \n with a newline char, \t with a tab char, etc. In our case:
   - The first character it sees is ", so it knows this is a JSON string.
   - Inside the JSON string, it sees \". Normally " would end the JSON string, but because " is escaped with \, it knows this isn't the end of the string and replaces \" with a literal double quote character.
   - The last character it sees is ", so it knows this is the end of the JSON string.
The result parsed string is This is a double quote >> " <<. Note the outer double quotes are gone also.
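The two stages described above can be observed separately. This is a sketch using the same literal; the variable names are illustrative:

```javascript
// What is typed in the source file: \\ followed by "
const jsonSource = '"This is a double quote >> \\" <<"';

// Stage 1: JavaScript has already reduced \\ to a single backslash.
console.log(jsonSource);                 // "This is a double quote >> \" <<"
console.log(jsonSource.includes('\\"')); // true: one backslash, then a quote

// Stage 2: JSON.parse consumes the remaining backslash and the outer quotes.
const parsedText = JSON.parse(jsonSource);
console.log(parsedText);                 // This is a double quote >> " <<
console.log(parsedText.includes('\\'));  // false
```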
Raw strings make things easier
JavaScript's String.raw template function and Python's r'' strings don't evaluate escape sequences at all, so only the JSON parser's escape rules apply, which makes the string much easier to reason about.
console.log(JSON.parse(String.raw`"This is a double quote >> \" <<"`))
// => This is a double quote >> " <<

For those who would like to use the Developer PowerShell, here are the lines to add to your settings.json:
"terminal.integrated.automationShell.windows": "C:\\Windows\\SysWOW64\\WindowsPowerShell\\v1.0\\powershell.exe",
"terminal.integrated.shellArgs.windows": [
"-noe",
"-c",
" &{Import-Module 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\BuildTools\\Common7\\Tools\\Microsoft.VisualStudio.DevShell.dll'; Enter-VsDevShell b7c50c8d} ",
],

Related

Store string with placeholder and html tags [duplicate]

In a verbatim string literal (@"foo") in C#, backslashes aren't treated as escapes, so writing \" to get a double quote doesn't work. Is there any way to get a double quote in a verbatim string literal?
This understandably doesn't work:
string foo = @"this \"word\" is escaped";
Use a duplicated double quote.
@"this ""word"" is escaped";
outputs:
this "word" is escaped
Use double quotation marks.
string foo = @"this ""word"" is escaped";
To add some more information: your example will work without the @ symbol (which prevents escaping with \), like this:
string foo = "this \"word\" is escaped!";
Both ways work, but I prefer the doubled-quote style because it is easier to work with, for example, file names (which have lots of \ in the string).
This should help clear up any questions you may have: C# literals
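Here is a table from the linked content (<line break> marks an actual line break, both in the verbatim literal and in the resulting string):
Regular literal          Verbatim literal                  Resulting string
"Hello"                  @"Hello"                          Hello
"Backslash: \\"          @"Backslash: \"                   Backslash: \
"Quote: \""              @"Quote: """                      Quote: "
"CRLF:\r\nPost CRLF"     @"CRLF:<line break>Post CRLF"     CRLF:<line break>Post CRLF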
Update: With C# 11 Preview feature - Raw String Literals
string foo1 = """
this "word" is escaped
""";
string foo2 = """this "word" is escaped""";
History:
There is a proposal open on GitHub for the C# language about better support for raw string literals. One valid answer is to encourage the C# team to add a new feature to the language (such as triple quotes, like Python).
see https://github.com/dotnet/csharplang/discussions/89#discussioncomment-257343
As the documentation says:
Simple escape sequences ... are interpreted literally. Only a quote escape sequence ("") is not interpreted literally; it produces one double quotation mark. Additionally, in case of a verbatim interpolated string brace escape sequences ({{ and }}) are not interpreted literally; they produce single brace characters.

Escape double quotes in a string

Double quotes can be escaped like this:
string test = @"He said to me, ""Hello World"". How are you?";
But this involves adding an extra " character to the string. Is there a C# function or other method to escape double quotes so that the string does not need to be changed?
No.
Either use verbatim string literals as you have, or escape the " using backslash.
string test = "He said to me, \"Hello World\" . How are you?";
The string has not changed in either case - there is a single escaped " in it. This is just a way to tell C# that the character is part of the string and not a string terminator.
You can use backslash either way:
string str = "He said to me, \"Hello World\". How are you?";
It prints:
He said to me, "Hello World". How are you?
which is exactly the same that is printed with:
string str = @"He said to me, ""Hello World"". How are you?";
Here is a DEMO.
" is still part of your string.
You can check Jon Skeet's Strings in C# and .NET article for more information.
In C# you can use the backslash to put special characters into your string.
For example, to insert ", you need to write \".
There are a lot of characters that you write using the backslash:
Backslash with other characters
\0 nul character
\a Bell (alert)
\b Backspace
\f Formfeed
\n New line
\r Carriage return
\t Horizontal tab
\v Vertical tab
\' Single quotation mark
\" Double quotation mark
\\ Backslash
Any character substitution by numbers:
\xh to \xhhhh, or \uhhhh - Unicode character in hexadecimal notation (\x has variable digits, \u has 4 digits)
\Uhhhhhhhh - Unicode surrogate pair (8 hex digits, 2 characters)
Another thing worth mentioning is that, from C# 6, interpolated strings can be used along with @.
Example:
string helloWorld = @"""Hello World""";
string test = $"He said to me, {helloWorld}. How are you?";
Or
string helloWorld = "Hello World";
string test = $@"He said to me, ""{helloWorld}"". How are you?";
Check running code here!
View the reference to interpolation here!
You're misunderstanding escaping.
The extra " characters are part of the string literal; they are interpreted by the compiler as a single ".
The actual value of your string is still He said to me, "Hello World". How are you?, as you'll see if you print it at runtime.
2022 UPDATE: Previously the answer would have been "no". However, C#11 introduces a new feature called "raw string literals." To quote the Microsoft documentation:
Beginning with C# 11, you can use raw string literals to more easily create strings that are multi-line, or use any characters requiring escape sequences. Raw string literals remove the need to ever use escape sequences. You can write the string, including whitespace formatting, how you want it to appear in output.
SOURCE: https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/strings/#raw-string-literals
EXAMPLE: So using the original example, you could do this (note that raw string literals always begin with three or more quotation marks):
string testSingleLine = """He said to me, "Hello World". How are you?""";
string testMultiLine = """
He said to me, "Hello World". How are you?
""";
Please explain your problem. You say:
But this involves adding character " to the string.
What problem is that? You can't type string foo = "Foo"bar"";, because that will cause a compile error. As for the adding part, in terms of string size that is not true:
@"""".Length == 1
"\"".Length == 1
In C# 11.0 preview you can use raw string literals.
Raw string literals are a new format for string literals. Raw string literals can contain arbitrary text, including whitespace, new lines, embedded quotes, and other special characters without requiring escape sequences. A raw string literal starts with at least three double-quote (""") characters. It ends with the same number of double-quote characters. Typically, a raw string literal uses three double quotes on a single line to start the string, and three double quotes on a separate line to end the string.
string test = """He said to me, "Hello World" . How are you?""";
In C#, there are at least four ways to embed a quote within a string:
Escape quote with a backslash
Precede string with @ and use doubled double quotes
Use the corresponding ASCII character
Use the hexadecimal Unicode character
Please refer to this document for a detailed explanation.

Cleaning strings to be valid JSON values

I want to clean strings that are retrieved from a database.
I ran into this issue where a property value (a name from a database) had an embedded TAB character, and Chrome gave me an invalid TOKEN error while trying to load the JSON object.
So now, I went to http://www.json.org/ and on the side it has a specification. But I'm having trouble understanding how to write a cleanser using this spec:
string
    ""
    " chars "
chars
    char
    char chars
char
    any-Unicode-character-except-"-or-\-or-control-character
    \"
    \\
    \/
    \b
    \f
    \n
    \r
    \t
    \u four-hex-digits
Given a string, how can I "clean" it such that I conform to this spec?
Specifically, I am confused: does the spec allow TAB (0x09) characters? If so, why did Chrome give an invalid TOKEN error?
Tab characters (actual 0x09, not escapes) cannot appear inside of quotes in JSON (though they are valid whitespace outside of quotes). You'll need to escape them with \t or \u0009 (the former being preferable).
json.org says an unescaped character of a string must be:
Any UNICODE character except " or \ or
control character
Tab counts as a control character.
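A short sketch of this rule in JavaScript (chosen only for illustration; the sample name and variable names are made up). It shows a raw tab being rejected by the parser, and JSON.stringify producing the escaped form instead of a hand-written cleanser:

```javascript
// A value containing a real tab character (U+0009), as it might come from a database.
const nameWithTab = 'Smith\tJohn';

// Concatenating it into JSON text by hand leaves the raw tab inside the quotes.
const invalidJson = '{"name":"' + nameWithTab + '"}';
let parseFailed = false;
try {
  JSON.parse(invalidJson);
} catch (err) {
  parseFailed = err instanceof SyntaxError;
}
console.log(parseFailed); // true

// JSON.stringify escapes the control character as the two characters \ and t.
const validJson = JSON.stringify({ name: nameWithTab });
console.log(validJson); // {"name":"Smith\tJohn"}
console.log(JSON.parse(validJson).name === nameWithTab); // true
```

Letting a serializer build the JSON text avoids having to enumerate every control character yourself.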
This may be what you are looking for; it shows how to use the JavaScriptSerializer class in C#:
How to create JSON String in C#
