I want to display user content in a JavaScript variable.
As with all user generated content, I want to sanitize it before outputting.
ASP.NET MVC does a great job of this by default:
@{
var name = "Jón";
}
<script> var name ='@name';</script>
The output for the above is:
J&#243;n
This is great as it protects me from users putting <tags> and <script>evilStuff</script> in their names and playing silly games.
In the above example, I want protection from evildoers, but I don't want to HTML-encode valid UTF-8 characters that aren't evil.
I want the output to read:
Jón
but I also want the XSS protection that encoding gives me.
Outside of using a whitelisting framework (e.g. Microsoft.AntiXSS), is there any built-in MVC function that helps here?
UPDATE:
It looks like this does the job:
@{
var name = "Jón";
}
<script> var name ='@Html.Raw(HttpUtility.JavaScriptStringEncode(name))';</script>
Will this protect against most XSS attacks?
You'd have to write your own encoder or find another 3rd party one. The default encoders in ASP.NET tend to err on the side of being more secure by encoding more than what might necessarily be needed.
Having said that, please don't write your own encoder! Writing correct HTML encoding routines is a very difficult job that is appropriate only for those who have specific advanced security expertise.
My recommendation is to use what's built-in because it is correct, and quite secure. While it might appear to produce less-than-ideal HTML output, you're better safe than sorry.
Now, please note that this code:
@Html.Raw(HttpUtility.JavaScriptStringEncode(name))
is not correct and is not secure, because it is invalid to use a JavaScript encoding routine to render HTML markup.
This is a question that has been asked before, but I've not found the information I'm looking for or maybe I'm just missing the point so please bear with me. I can always adjust my question if I'm asking it the wrong way.
For example, say I have a POST endpoint that uses a simple DTO object with 2 properties (i.e. companyRequestDto), and one of its properties contains a script tag. When I call my endpoint from Postman I use the following:
{
"company": "My Company<script>alert(1);</script>",
"description": "This is a description"
}
When it is received by the action in my endpoint,
public void Post(CompanyRequestDto companyRequestDto)
my DTO object is populated automatically and its properties are set to:
companyRequestDto.Company = "My Company<script>alert(1);</script>";
companyRequestDto.Description = "This is a description";
I clearly don't want this information to be stored in our database as is, nor do I want it stored as an escaped string as displayed above.
1) Request: So my first question is how do I throw an error if the DTO posted contains some invalid content such as the <script> tag?
I've looked at Microsoft AntiXss, but I don't understand how to handle this, as the data provided in the properties of a DTO object is not an HTML string but just a string. What am I missing here? I don't understand how this helps with sanitizing or validating the passed data.
When I call
var test = AntiXss.AntiXssEncoder.HtmlEncode(companyRequestDto.Company, true);
It returns an encoded string, but then what?
Is there a way to remove disallowed keywords or just simply throw an error?
2) Response: Assuming 1) was not implemented or didn't work properly and the value ended up being stored in our database, am I supposed to return encoded data as a JSON string? So instead of returning:
"My company"
Am I supposed to return:
"My Company&lt;script&gt;alert(1)&lt;/script&gt;"
Is the browser (or whatever app) just supposed to display as below then?:
"My Company<script>alert(1)</script>"
3) Code: Assuming there is a way to sanitize or throw an error, should I apply it at the property level, using an attribute on every property of my various DTO objects, or is there a way to apply it at the class level, using an attribute that validates and/or sanitizes all string properties of a DTO object?
I found interesting articles but none really answering my problems or I'm having other problems with some of the answers:
asp.net mvc What is the difference between AntiXss.HtmlEncode and HttpUtility.HtmlEncode?
Stopping XSS when using WebAPI (currently looking into this one but don't see how example is solving problem as property is always failing whether I use the script tag or not)
how to sanitize input data in web api using anti xss attack (also looking at this one but having a problem calling ReadFromStreamAsync from my project at work. Might be down to some of the settings in my web.config but haven't figured out why but it always seems to return an empty string)
Thanks.
UPDATE 1:
I've just finished going through the answer from Stopping XSS when using WebAPI
This is probably the closest one to what I am looking for. Except I don't want to encode the data, as I don't want to store it in my database, so I'll see if I can figure out how to throw an error but I'm not sure what the condition will be. Maybe I should just look for characters such as <, >, ; , etc... as these will not likely be used in any of our fields.
You need to consider where your data will be used when you think about encoding. Data with <script> in it is only a problem if it's rendered as HTML, so if you are going to display user-provided data anywhere, the point of display is probably where you want to HTML-encode it (you want to avoid repeatedly HTML-encoding the same string each time it is saved, for example).
Again, it depends on what the response is going to be used for. You probably want to HTML-encode it at the point where it's going to be displayed. Remember that if you encode something in the response, it may no longer match what's in the data, so calling code that does something like calling your API to search for a company with that name could run into problems. If the browser does display the HTML-encoded version it might look ugly, but that's better than users being compromised by XSS attacks.
It's quite difficult to sanitize text for things like tags if you allow most characters for normal use. It's easier if you can whitelist the allowed characters and only permit, say, alphanumerics, but that often isn't possible. Where it is, it can be done with a regex validation attribute on the DTO object. The best approach, I think, is to encode values for display if you can't block certain characters. It's really difficult to allow all characters but still reject things like <script>, as people can start using character codes and so on.
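The regex validation attribute mentioned above could look roughly like this. It is only a sketch: the DTO shape is taken from the question, while the allowed character set, the length limits and the DtoValidator helper are assumptions made up for illustration and would need to be adapted to the real fields.

```csharp
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical DTO mirroring the one in the question.
public class CompanyRequestDto
{
    // Whitelist (an assumption): letters, digits, spaces, '.', ',' and '-' only.
    // Anything else, including '<' and '>', fails validation.
    [Required]
    [RegularExpression(@"^[\p{L}\p{N} .,-]{1,100}$",
        ErrorMessage = "Company contains characters that are not allowed.")]
    public string Company { get; set; }

    // Free text: only the length is limited here; encode it when displaying.
    [StringLength(500)]
    public string Description { get; set; }
}

// Hypothetical helper that runs the data-annotation rules by hand.
public static class DtoValidator
{
    // Returns true when every rule on the DTO passes; otherwise fills 'errors'.
    public static bool TryValidate(object dto, out List<ValidationResult> errors)
    {
        errors = new List<ValidationResult>();
        var context = new ValidationContext(dto, null, null);
        return Validator.TryValidateObject(dto, context, errors, true);
    }
}
```

In Web API the same attributes are normally evaluated during model binding, so an action can check ModelState.IsValid and return a 400 response instead of calling a helper like this one.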
I have a href which gets filled in by reading a property from a database like this
lblName.HRef = user.PublicSiteUrl;
I want to safely encode this URL to protect against any persisted XSS attack.
Which encoding should I use for this without causing any issues with the URL structure?
For example, if I have this URL coming from the database: https://google.com?q=<SCRIPT>alert(“Cookie”+document.cookie)</SCRIPT>, how do I make it safe so the script is not executed as part of the URL?
Why not HttpUtility.UrlEncode?
I think what you are looking for is Uri.EscapeUriString().
https://server/test.aspx?<SCRIPT>alert(“Cookie”+document.cookie)</SCRIPT>
Will become:
https://server/test.aspx?%3CSCRIPT%3Ealert(%E2%80%9CCookie%E2%80%9D+document.cookie)%3C/SCRIPT%3E
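Percent-encoding deals with characters such as < and >, but it does not by itself reject a stored value like javascript:alert(1). A minimal sketch of one way to combine both checks is below; the SafeUrl class, the ToSafeHref name and the "#" fallback are hypothetical and not part of any framework. It relies on System.Uri for parsing and escaping rather than Uri.EscapeUriString, which is marked obsolete in recent .NET versions.

```csharp
using System;

// Hypothetical helper; the names and the fallback value are illustrative only.
public static class SafeUrl
{
    // Returns an escaped absolute http/https URL, or the fallback for anything else
    // (relative URLs, javascript:, data:, malformed input, null).
    public static string ToSafeHref(string candidate, string fallback = "#")
    {
        Uri uri;
        bool isWebUrl =
            Uri.TryCreate(candidate, UriKind.Absolute, out uri)
            && (uri.Scheme == Uri.UriSchemeHttp || uri.Scheme == Uri.UriSchemeHttps);

        // AbsoluteUri is the canonical, percent-encoded form of the URL.
        return isWebUrl ? uri.AbsoluteUri : fallback;
    }
}
```

With a helper like this the assignment from the question would become lblName.HRef = SafeUrl.ToSafeHref(user.PublicSiteUrl);.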
I am using the ASP.NET HtmlEditorExtender and unfortunately there is no working XSS sanitizer for it right now. So as a quick fix I am removing every occurrence of the words script and java from user input, as below:
var regex = new Regex("script", RegexOptions.IgnoreCase);
srSendText = regex.Replace(srSendText, "");
regex = new Regex("java", RegexOptions.IgnoreCase);
srSendText = regex.Replace(srSendText, "");
Can I assume that I am safe from XSS attacks?
I am actually using the HtmlAgilityPack anti-XSS sanitizer as well, but it does not even remove script tags, so it is totally useless.
No, you can't.
An attacker can easily circumvent such a check, for example by encoding script as &#115;cript inside an attribute value, which the browser decodes back to script.
Making the code safe from XSS attacks is done by making sure that any content that can come from a user is never put in the page without proper encoding, to make sure that any code in the text is not executed at all.
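To make the point concrete, here is a small sketch that wraps the two replacements from the question in a hypothetical NaiveFilter.Strip method and compares it with plain HTML encoding. WebUtility.HtmlEncode is used here only because it is available without System.Web; the class and method names are assumptions for illustration.

```csharp
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the filter from the question.
public static class NaiveFilter
{
    // Removes the words "script" and "java" once, case-insensitively.
    public static string Strip(string input)
    {
        var regex = new Regex("script", RegexOptions.IgnoreCase);
        input = regex.Replace(input, "");
        regex = new Regex("java", RegexOptions.IgnoreCase);
        return regex.Replace(input, "");
    }

    // Encoding neutralizes markup instead of trying to guess what is dangerous.
    public static string Encode(string input)
    {
        return WebUtility.HtmlEncode(input);
    }
}

public static class NaiveFilterDemo
{
    public static void Main()
    {
        // Nesting the word inside itself rebuilds it after one removal pass.
        System.Console.WriteLine(NaiveFilter.Strip("<scrscriptipt>alert(1)</scrscriptipt>"));
        // prints <script>alert(1)</script>

        // An event-handler payload contains neither word and passes untouched.
        System.Console.WriteLine(NaiveFilter.Strip("<img src=x onerror=alert(1)>"));
        // prints <img src=x onerror=alert(1)>

        // Encoded output is displayed as text and never parsed as a tag.
        System.Console.WriteLine(NaiveFilter.Encode("<script>alert(1)</script>"));
        // prints &lt;script&gt;alert(1)&lt;/script&gt;
    }
}
```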
Custom XSS prevention is usually a no-no in my opinion; you should always use a library. In any case, stripping script and java is not enough. Anything that's passed up to the server should go through HttpUtility.HtmlEncode, which will encode any input from the user.
Also ensure that validateRequest="true" is set in the config file.
Other dangerous tags may include:
applet
body
embed
frame
script
frameset
html
iframe
img
style
layer
link
ilayer
meta
object
http://msdn.microsoft.com/en-us/library/ff649310.aspx
For example I have:
<img src="http://gateway.com/Providername/NameOfTheSupplier/RequestedImg.jpg" />
Now some customers are complaining that their customers can see the company name in the URL. Because it's too much work to change the structure of the gateway I use, I am looking for another way to do it.
Is there a way I can hide the src for the client? For example with base64, or another encryption that can decrypt client-side?
Simple answer: no.
Long answer: no.
Technical answer: Everything that is displayed by a browser needs to be translated to human readable* text in one way. You can obfuscate server side though.
*) human readable also includes very short names like http://gateway.com/P/N/R.jpg.
You can use data URIs, but this requires you to download the image, convert it to base64 and embed it in the page (or CSS).
This increases the payload by about a third, since base64 encodes every 3 bytes as 4 characters.
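As a rough sketch of what that conversion involves on the server, the helper below builds a data URI from raw image bytes. The DataUri class and its method name are hypothetical; only Convert.ToBase64String is a framework call.

```csharp
using System;

// Hypothetical helper; the caller is assumed to have fetched the image bytes already.
public static class DataUri
{
    // Builds a value suitable for <img src="..."> from raw bytes and a MIME type.
    public static string FromBytes(byte[] imageBytes, string mimeType)
    {
        return "data:" + mimeType + ";base64," + Convert.ToBase64String(imageBytes);
    }
}
```

The size cost is easy to check: Convert.ToBase64String on a 300-byte array returns a 400-character string. The original URL no longer appears in the markup, but the image itself is now part of every page response and cannot be cached separately by the browser.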
An alternative is to use CSS to style the page, embedding the image URLs in the CSS. This tends not to be dynamic (though it can be), and anyone who knows a bit about web technologies can still view the CSS and see the URLs. Using images in this way is of course also not semantic and can break your page in unexpected ways, meaning you will need to expend more effort on making things that should "just work", work.
You can convert your entire image to a base64 string.
There are plenty of services available for this.
Just an example:
Convert any image into a base64 string
<a href="data:text/html;charset=utf-8;base64,PCFET0NUWVBFIEhUTUw%2BDQo8aHRtbCBs
YW5nPSJlbiI%2BDQogPGhlYWQ%2BDQogIDx0aXRsZT5QcmV0dHkgR2xvd2luZyBMaW5lczwvd
Gl0bGU%2BDQogPC9oZWFkPg0KIDxib2R5Pg0KPGNhbnZhcyB3aWR0aD0iODAwIiBoZWlnaHQ
9IjQ1MCI%2BPC9jY..."</a>
You can try this piece of code
$('img').filter(function(index){return $(this).attr('src')==='';}).hide();
You can use an HTTP handler for this. That way you can hide the URL, and if you want you can also send the image as base64, or just write the byte stream to the response.
I think this should work, and you can also control the image path by changing just one piece of code.
Check this link:
http://www.codeproject.com/Articles/34084/Generic-Image-Handler-Using-IHttpHandler
Thanks.
Is there a way to detect if an HTML page contains any Razor/C# code? Essentially I want users to be able to provide custom layouts, with tags that I will replace with RenderSection. Before making this replacement, I want to validate that none of the HTML contains anything like, for example, <a href="@(some C# code)".
All discussions about alternative ways to do this, should/could/would aside, just simply:
Is there a way to programmatically detect if a file contains C#/Razor code?
I don't know a lot about the Razor markup, but I am thinking that when you grab the layout string they are passing in, you will want to parse the text and grab everything that starts with an @ and toss those words into an array. Then, when you republish it to your website, use Razor code to access the data in the array...
Alternatively, and more easily, you could go through all the passed-in code and replace all the @ signs with a different symbol, say &, so that it won't get interpreted by the Razor processor:
layoutString = layoutString.Replace('@', '&');
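Since Razor code is introduced by the @ character, a conservative check can simply look for it. The sketch below is an assumption-laden illustration, not a Razor parser: the LayoutCheck class and its method names are hypothetical, and the check also flags harmless uses of @ such as e-mail addresses. The second method uses Razor's own @@ escape, which is rendered as a literal @, so the user's text survives instead of being turned into &.

```csharp
// Hypothetical helpers; neither is part of Razor or ASP.NET.
public static class LayoutCheck
{
    // Conservative: any '@' at all is treated as possible Razor code.
    public static bool MayContainRazor(string layout)
    {
        return layout.IndexOf('@') >= 0;
    }

    // Doubles every '@' so that Razor emits it as literal text.
    public static string EscapeForRazor(string layout)
    {
        return layout.Replace("@", "@@");
    }
}
```

A layout could then either be rejected when MayContainRazor returns true, or passed through EscapeForRazor before the section tags are replaced with RenderSection calls.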
In the browser? No, because unless the programmer made a mistake, there is no Razor/C# code in the rendered HTML, only the HTML that resulted from it.
What you ask is like asking what type of oven was used to bake a pizza from the pizza. Bad news - you never will know.
If you provide sensible tags for those, you could parse them in JavaScript, but you have to output that metadata yourself as part of the generated HTML.
After reading your comment to TomTom; the answer is:
No. Razor does not come with any public syntax parser.