C# Deserialize JSON html string - c#

I'm trying to deserialize a JSON object in C#. My problem is that one of the fields can contain HTML text (I plan on sanitizing it afterwards).
I’m using a JavaScriptSerializer object to deserialize, but I’m getting an “Invalid object passed in” error (from the JavaScriptSerializer). If I pass plain text for that same field it works fine, and the other fields in the object (including a date and an array) also deserialize correctly, so it seems like the HTML is what’s tripping it up.
I’m using JSON.stringify to serialize the JavaScript object and I’m passing it to my page via jQuery.
Is there something I’m supposed to do in order to pass a string that contains HTML? I’ve tried enclosing it in quotes, but it didn’t help.
As an example of a string that's accepted vs what throws an error: "Test" is fine while
"<div style="text-align: center;">Test</div>" is not.
Strangely <span> tags also seem to be fine.

Can you encode the HTML with the JavaScript escape() function before serializing?

You may have to use encodeURIComponent in JavaScript, then HttpServerUtility.UrlDecode in .NET.
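A minimal sketch of the encode-then-decode round trip suggested above, written in JavaScript. The payload shape and its field names are hypothetical, and decodeURIComponent only stands in for the server-side HttpServerUtility.UrlDecode call; it is not the .NET implementation.

```javascript
// Hypothetical payload; the field names are illustrative only.
const payload = {
  title: "Test",
  body: '<div style="text-align: center;">Test</div>',
};

// JSON.stringify escapes the inner double quotes, so the JSON text stays valid.
const json = JSON.stringify(payload);

// Percent-encode the whole JSON text so no raw <, > or " goes over the wire.
const encoded = encodeURIComponent(json);

// Stand-in for the server side: decode first, then deserialize.
const decoded = decodeURIComponent(encoded);
const roundTripped = JSON.parse(decoded);

console.log(encoded.includes("<")); // false
console.log(roundTripped.body === payload.body); // true
```

The order matters on the receiving side: the value has to be URL-decoded before it is handed to the JSON deserializer.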

You can't pass in HTML characters that aren't encoded, for security reasons. You can override this in ASP.NET MVC at the application or function level if you trust your source.

Just do some replaces like this:
jsonString.Replace(@"=""\""", @"=\""\""").Replace(@"\""""", @"\""\""").Replace(@"=""""", @"=\""\""")

Related

format dynamic json string from model for display in .net view

I have a JSON string within my model data that is sent to the view for display in a table.
I want to be sure it is displayed in a formatted fashion instead of as a one-line string.
My research has led me to find this to be the cleanest method:
string json = JsonConvert.SerializeObject(account, Formatting.Indented);
However, within the view, once my value is extracted to @item.requestExample (the JSON string to be formatted), can I call this C# to return the formatted string to the HTML?
By the way, I've tried a few other methods using just JS, but every time @item.requestExample is used within the script, the browser console complains about invalid tokens in the string, since the string is an HTML-encoded representation that uses &quot; instead of double quotes.
tia
Maybe you could parse the json (if it is json, I'm not sure I understood well) string into a dynamic object. Then you can iterate over the properties and visualize them at your will.
You can see how to do that here.
Deserialize JSON into C# dynamic object
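For the client-side route the question mentions, a rough JavaScript sketch might look like the following. It assumes the value reaches the script with its double quotes HTML-encoded as &quot;, as described in the question; decodeQuotes and prettyJson are hypothetical helper names, and the sample value is made up.

```javascript
// Hypothetical helper: undo the &quot; entities that HTML encoding introduced.
// Only this one entity is handled; other entities would need their own rules.
function decodeQuotes(text) {
  return text.replace(/&quot;/g, '"');
}

// Hypothetical helper: parse a one-line JSON string and re-indent it.
function prettyJson(text) {
  return JSON.stringify(JSON.parse(text), null, 2);
}

// Made-up sample of what the view might emit.
const fromView = "{&quot;id&quot;:1,&quot;tags&quot;:[&quot;a&quot;]}";
const formatted = prettyJson(decodeQuotes(fromView));
console.log(formatted);
```

In a page, the formatted string would typically be assigned to the textContent of a pre element so the line breaks and indentation are preserved.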

How to use AntiXss with a Web API

This is a question that has been asked before, but I've not found the information I'm looking for or maybe I'm just missing the point so please bear with me. I can always adjust my question if I'm asking it the wrong way.
For example, say I have a POST endpoint that uses a simple DTO object with 2 properties (i.e. CompanyRequestDto), and the request contains a script tag in one of its properties. When I call my endpoint from Postman I use the following:
{
"company": "My Company<script>alert(1);</script>",
"description": "This is a description"
}
When it is received by the action in my endpoint,
public void Post(CompanyRequestDto companyRequestDto)
my DTO object will automatically be created and its properties will be set to:
companyRequestDto.Company = "My Company<script>alert(1);</script>";
companyRequestDto.Description = "This is a description";
I clearly don't want this information to be stored in our database as is, nor do I want it stored as an escaped string.
1) Request: So my first question is, how do I throw an error if the posted DTO contains invalid content such as the <script> tag?
I've looked at Microsoft AntiXss, but I don't understand how to apply it, as the data provided in the properties of a DTO object is not an HTML string but just a string. What am I missing here? I don't understand how this helps with sanitizing or validating the passed data.
When I call
var test = AntiXss.AntiXssEncoder.HtmlEncode(companyRequestDto.Company, true);
It returns an encoded string, but then what?
Is there a way to remove disallowed keywords, or simply throw an error?
2) Response: Assuming 1) was not implemented or didn't work properly and the value ended up being stored in our database, am I supposed to return encoded data in the JSON response? So instead of returning:
"My Company<script>alert(1)</script>"
am I supposed to return:
"My Company&lt;script&gt;alert(1)&lt;/script&gt;"
Is the browser (or whatever app) just supposed to display as below then?:
"My Company<script>alert(1)</script>"
3) Code: Assuming there is a way to sanitize or throw an error, should I apply it at the property level, using an attribute on every property of my various DTO objects, or is there a way to apply it at the class level, using an attribute that validates and/or sanitizes all string properties of a DTO object?
I found interesting articles but none really answering my problems or I'm having other problems with some of the answers:
asp.net mvc What is the difference between AntiXss.HtmlEncode and HttpUtility.HtmlEncode?
Stopping XSS when using WebAPI (currently looking into this one but don't see how example is solving problem as property is always failing whether I use the script tag or not)
how to sanitize input data in web api using anti xss attack (also looking at this one but having a problem calling ReadFromStreamAsync from my project at work. Might be down to some of the settings in my web.config but haven't figured out why but it always seems to return an empty string)
Thanks.
UPDATE 1:
I've just finished going through the answer from Stopping XSS when using WebAPI
This is probably the closest one to what I am looking for. Except I don't want to encode the data, as I don't want to store it in my database, so I'll see if I can figure out how to throw an error but I'm not sure what the condition will be. Maybe I should just look for characters such as <, >, ; , etc... as these will not likely be used in any of our fields.
You need to consider where your data will be used when you think about encoding. Data with <script> in it is only a problem if it's rendered as HTML, so if you are going to display user-provided data anywhere, the point of display is probably where you want to HTML-encode it (you want to avoid repeatedly HTML-encoding the same string each time it is saved, for example).
Again, it depends what the response is going to be used for. You probably want to HTML-encode at the point the value is going to be displayed. Remember that if you encode something in the response, it may no longer match what's in the data, so calling code that does something like search your API for a company with that name could run into problems. If the browser does display the HTML-encoded version it might look ugly, but that's better than users being compromised by XSS attacks.
It's quite difficult to sanitize text for things like tags if you allow most characters for normal use. It's easier if you can whitelist the allowed characters and only accept, say, alphanumerics, but that often isn't possible. Where it is possible, it can be done using a regex validation attribute on the DTO object. The best approach, I think, is to encode values for display if you can't block certain characters. It's really difficult to allow all characters but avoid things like <script>, as people can start using ASCII character codes etc.
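The two ideas in this answer, rejecting input against a whitelist and encoding at display time, can be sketched in JavaScript as follows. This is only an illustration of the logic, not the regex validation attribute itself; the whitelist pattern and the names ALLOWED, validateCompany and htmlEncode are assumptions made for the example.

```javascript
// Assumed whitelist: letters, digits, spaces and a few punctuation marks.
const ALLOWED = /^[A-Za-z0-9 .,'&-]*$/;

// Reject input outright instead of trying to strip "bad" fragments.
function validateCompany(name) {
  if (!ALLOWED.test(name)) {
    throw new Error("Company name contains characters that are not allowed");
  }
  return name;
}

// Encode at display time. The & is replaced first so that the entities
// produced by the later replacements are not encoded a second time.
function htmlEncode(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

console.log(validateCompany("My Company"));
console.log(htmlEncode("My Company<script>alert(1);</script>"));
```

Validation fits the request side (fail fast with an error), while encoding fits the rendering side, which keeps the stored value identical to what was submitted.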

Is it possible to change the way Attribute.Add formats the addition of the attribute?

My Question: Is it possible to change the way Attribute.Add formats the addition of the attribute?
I have an ASP.net website that loads a widget in a div, and I'm trying to find a way to add a data-options attribute to the div with my codebehind. I need the attribute to be created with a single quote around the data-options value instead of double quotes, because the value I'm assigning is a JSON pair.
What I need the attribute to look like:
data-options='{"post_message_origin":"https://www.mysite.com/MyWidget.aspx"}'
What it looks like when using Attribute.Add("data-options"):
My code:
string dataoptions = "{\"post_message_origin\":\""+ HttpContext.Current.Request.Url.AbsoluteUri + "\"}";
MYWIDGET.Attributes.Add("data-options", dataoptions);
The attribute result:
data-options="{"post_message_origin":"https://www.mysite.com/MyWidget.aspx"}"
The set of double quotes encompassing the data-options value is preventing the JSON pair from being read correctly, hence my question.
I'm doing my best to avoid using hard coding so that I can easily load the page from development servers to production servers without changing the code, which is why I'm using HttpContext.Current.Request.Url.AbsoluteUri in the code behind instead of writing the data-options value straight to the div in the ASP markup.
I would suggest using single quotes with the JSON, in this case. Either is acceptable, as long as they are in open-close pairs. This sidesteps the issue.
EDIT: Unfortunately, Attribute.Add encodes the quotes...
This has been brought up before. It looks like the long term solution is implementing your own encoder...
I would recommend including neither kind of quote in your data:
string dataoptions = "{\"post_message_origin\":\""+
HttpContext.Current.Request.Url.AbsoluteUri + "\"}";
dataoptions = dataoptions.Replace("\"", "&quot;").Replace("'", "&apos;");
MYWIDGET.Attributes.Add("data-options", dataoptions);
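To see why entity-encoded quotes are harmless here, the sketch below models the round trip in JavaScript. encodeAttribute and decodeAttribute are hypothetical helpers; decodeAttribute is only a rough stand-in for the decoding an HTML parser performs when it reads an attribute value, and the URL is the one from the question.

```javascript
// Encode the characters that would end a quoted attribute value early.
function encodeAttribute(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Rough stand-in for what the HTML parser does when it reads the attribute.
// &amp; is decoded last so that "&amp;quot;" does not turn into a quote.
function decodeAttribute(value) {
  return value
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&");
}

const options = JSON.stringify({
  post_message_origin: "https://www.mysite.com/MyWidget.aspx",
});
const markup = `<div data-options="${encodeAttribute(options)}"></div>`;

// What a script would read back from the attribute, then parse as JSON.
const attributeValue = markup.slice(
  markup.indexOf('"') + 1,
  markup.lastIndexOf('"')
);
const parsed = JSON.parse(decodeAttribute(attributeValue));
console.log(parsed.post_message_origin);
```

Because the decoding happens before any script sees the value, the widget receives ordinary JSON regardless of which quote character delimits the attribute in the markup.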

Javascript / ASP.NET MVC 4 - Using C# Strings in Javascript

I need to be able to access strings held in my C# code in JavaScript. To test, I have tried displaying a message box with the C# string in JavaScript (I am using this string literal and the message box as an example scenario):
alert(<%: "TEST" %>);
When this code runs, no message box is displayed. On the other hand, a message box is displayed with this code:
alert(<%: 6 %>);
Why is it that I can use integers but not strings? Is there any way around this?
Thanks.
You need to add quotes around the string; otherwise, the browser sees alert(TEST);, which is incorrect. To prevent cross-site scripting attacks, you also need to properly escape special characters. Calling HttpUtility.JavaScriptStringEncode lets you do both:
alert(<%= HttpUtility.JavaScriptStringEncode("TEST", true) %>);
Note: If this JavaScript snippet appears inside an HTML attribute like onclick, you may need to change <%= to <%: so that the double quotes are also HTML encoded.
Why is it that I can use integers but not strings?
Because you need to put strings in quotes:
alert("<%: "TEST" %>");
The key here, as always, is to look at what the browser actually receives. With your original code, what the browser sees is:
alert(TEST);
...which is trying to use the variable TEST, not a literal string.
Now in the above, I've assumed the string won't have any " in it or other things that aren't valid within a JavaScript string literal. That's not usually a good assumption to make.
If you're using a recent version of .Net or using JSON.Net (see this question for details), you can output the string using a JSON serializer, which will ensure that anything within it that may be problematic is properly encoded/escaped. For instance, with JSON.Net, you might use:
// Inside a script block, <%= is used because <%: would HTML-encode the quotes.
// With JSON.Net
alert(<%= JsonConvert.ToString("TEST") %>);
// With a recent version of .NET
alert(<%= HttpUtility.JavaScriptStringEncode("TEST", true) %>);
The problem is in how this translates into JavaScript:
alert(<%: "TEST" %>);
becomes
alert(TEST);
This is a problem because it assumes there is a variable named TEST that you'd like to display the value of, but most likely, TEST is undefined. What you probably want to do is this:
alert('<%: "TEST" %>');
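But since this is MVC 4, you can use the Json.Encode method to be a little cleaner, like this (with <%= so the generated quotes are not HTML-encoded inside the script block):
alert(<%= Json.Encode("TEST") %>);
Both of these produce a quoted string literal, equivalent to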
alert('TEST');
This should display a message box with the string 'TEST'.
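The serializer-based approach in the answers above can be illustrated in plain JavaScript. In this sketch JSON.stringify plays the role that Json.Encode or JsonConvert.ToString plays on the server; toJsLiteral is a hypothetical helper name, and the extra escaping of < is an added precaution for inline script blocks rather than something the answers specify.

```javascript
// Build a JavaScript string literal from arbitrary text.
// JSON.stringify adds the quotes and escapes ", \ and control characters;
// escaping < as \u003c keeps "</script>" from closing an inline script block.
function toJsLiteral(text) {
  return JSON.stringify(text).replace(/</g, "\\u003c");
}

const tricky = 'He said "hi"</script>\n';
const literal = toJsLiteral(tricky);

// Evaluating the generated source yields the original string again.
const source = "return " + literal + ";";
const evaluated = new Function(source)();

console.log(literal);
console.log(evaluated === tricky); // true
```

Wrapping the value in quotes by hand, as in alert("<%: "TEST" %>"), only works while the text contains nothing that needs escaping, which is why the generated-literal approach is the safer default.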

HTML Decode and Encode

I have tried to decode the HTML text that I have in the database in my MVC 3 Razor application.
The HTML text in the database is not encoded.
I tried HttpUtility.HtmlDecode and Server.HtmlDecode, but neither of them worked.
Finally I managed to make it work with Html.Raw(string).
Sample of non-working code:
@Server.HtmlDecode(item.ShortDescription)
@HttpUtility.HtmlDecode(item.ShortDescription)
Do you know why we cannot use HtmlDecode in my case?
I thought this would save someone else from looking for a few hours.
It works just fine to decode the text, but then the text will automatically be encoded again when it's put in the page using the @ syntax.
The Html.Raw method wraps the string in an HtmlString, which tells the Razor engine not to encode it when it's put in the page.
If you want to display the value as-is without any HTML encoding you could use the Html.Raw helper:
@Html.Raw(item.ShortDescription)
Be warned, though, that by doing this you are opening your site to XSS attacks, so you should be very careful about what HTML this ShortDescription property contains. If it is the user who enters it, you should absolutely ensure that it is safe. You could use the AntiXss library for this.
Do you know why we can not use html.decode in my case !
Because HtmlDecode returns a string, and when you feed a string to the @() Razor expression it automatically HTML-encodes it again and ruins your previous efforts. That's why the Html.Raw helper exists.
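The behaviour described in these answers can be modelled with a small JavaScript sketch. This is a toy model, not Razor itself: render stands in for the @ output expression, RawHtml stands in for the HtmlString wrapper returned by Html.Raw, and all names here are hypothetical.

```javascript
// Minimal HTML encoder; & is handled first so entities are not re-encoded.
function htmlEncode(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Minimal HTML decoder; &amp; is handled last for the same reason.
function htmlDecode(text) {
  return text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, "&");
}

// Toy stand-in for Html.Raw: wraps a string so render() leaves it alone.
class RawHtml {
  constructor(value) {
    this.value = value;
  }
}

// Toy stand-in for the @ output: plain strings are encoded, RawHtml is not.
function render(value) {
  return value instanceof RawHtml ? value.value : htmlEncode(value);
}

const stored = "<b>Short description</b>";

// Decoding returns a plain string, so the output step encodes it again.
console.log(render(htmlDecode(stored)));

// The wrapper is what switches the automatic encoding off.
console.log(render(new RawHtml(stored)));
```

The first call prints the markup with its angle brackets encoded, while the second prints it unchanged, which mirrors why only Html.Raw had a visible effect in the question.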
