I want to protect my page when a user inputs the following:
<script type="text/javascript">
alert("hi");
</script>
I'm using ShowDown:
jQuery.fn.markDown = function()
{
return this.each(function() {
var caller = this;
var converter = new Showdown.converter();
var text = $(caller).text();
var html = converter.makeHtml(text);
$(caller).html(html);
});
}
If you want to sanitize HTML in .NET server-side code, I'd advise using the Microsoft Web Protection Library after transforming the Markdown to HTML and before rendering it to the page.
e.g. the following snippet:
var x = @"<div>safe</div>
<script type='text/javascript'>
alert('hi');
</script>";
return Microsoft.Security.Application.Sanitizer.GetSafeHtmlFragment(x);
returns <div>safe</div>
http://wpl.codeplex.com/
One solution that could be effective is to strip all the tags in the source, or HTML-encode them, before the text is transformed with Showdown.
To strip all the HTML tags, there are a couple of ways to do it, which you can find in this question:
Strip HTML from Text JavaScript
To HTML-encode the tags, you can use this:
myString.replace(/</g, '&lt;').replace(/>/g, '&gt;');
Note: This removes the ability to use HTML in Showdown.
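As a minimal sketch of the encode-before-conversion idea above: the helper name escapeHtml is hypothetical, and encoding `&` and the quote characters goes slightly beyond the one-liner in the answer. The result would be passed to Showdown in place of the raw text.

```javascript
// Hypothetical helper (not part of Showdown): HTML-encode the Markdown
// source so that any tags the user typed reach the converter as plain text.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')   // must run first so the entities below are not double-encoded
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

const userInput = '<script type="text/javascript">alert("hi");</script>';
console.log(escapeHtml(userInput));
```

Note that encoding `>` most likely also disables Markdown blockquotes, and this step does nothing about links written in Markdown syntax, so treat it as one layer rather than a complete defence.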
The ShowDown page strips any javascript, so I don't know what you mean exactly. But you can't do this on the client. If this is never going to be submitted to the server, then it doesn't matter. However, 99% of the time, you want to store it on the server.
I think the best approach is to create a server side DOM object out of the html that is submitted (which could be spoofed and bypass ShowDown) and look for any script or other dangerous tags. This is not so simple!
The best compromise for me is to use a server side markdown language (like https://github.com/charliesome/bbsharp) that you could then use to generate the html. You would then html encode any html before passing it to the tool that converts the markdown to HTML.
I use HTML Purifier which works very well for filtering user input and is highly customizable.
I assume you can use it with MarkDown, although I never tried.
Related
I have a webform that allows users to upload text as Markdown.
The Markdown is converted to HTML on the server (using Markdig) and also stored.
When displaying the converted HTML that the user uploaded, should I @Html.Encode the content? The project is in C#, MVC 5/Razor with request validation on.
Generally it depends on the markdown converter.
By default Markdig doesn't escape html. You can however use the DisableHtml function in the pipeline that escapes all remaining HTML encodable strings that were not processed by previous extensions. This should also give better performance than letting an anti-xss function run over the string again.
See example:
var pipeline = new MarkdownPipelineBuilder().DisableHtml().Build();
var result = Markdig.Markdown.ToHtml("<a href='javascript:evil()'>hello</a>", pipeline);
No, it isn't.
I just trivially tested the following:
hello
and markdig lets it through:
See online example.
Although I haven't looked into it too deeply, the Microsoft AntiXSS library might be useful here:
var safeHtml = Microsoft.Security.Application.Sanitizer
.GetSafeHtmlFragment("<a href='javascript:evil()'>hello</a>");
gives:
hello
but
var safeHtml = Microsoft.Security.Application.Sanitizer
.GetSafeHtmlFragment("<a href='http://stackoverflow.com'>hello</a>");
gives:
hello
I want to display user content in a JavaScript variable.
As with all user generated content, I want to sanitize it before outputting.
ASP.Net MVC does a great job of this by default:
@{
var name = "Jón";
}
<script> var name ='@name';</script>
The output for the above is:
J&#243;n
This is great as it protects me from users putting <tags> and <script>evilStuff</script> in their names and playing silly games.
In the above example, I want protection from evildoers, but I don't want to HTML-encode valid UTF-8 characters that aren't evil.
I want the output to read:
Jón
but I also want the XSS protection that encoding gives me.
Outside of using a whitelisting framework (e.g. Microsoft.AntiXSS), is there any built-in MVC function that helps here?
UPDATE:
This appears to do the job:
@{
var name = "Jón";
}
<script> var name ='@Html.Raw(HttpUtility.JavaScriptStringEncode(name))';</script>
Will this protect against most XSS attacks?
You'd have to write your own encoder or find another 3rd party one. The default encoders in ASP.NET tend to err on the side of being more secure by encoding more than what might necessarily be needed.
Having said that, please don't write your own encoder! Writing correct HTML encoding routines is a very difficult job that is appropriate only for those who have specific advanced security expertise.
My recommendation is to use what's built-in because it is correct, and quite secure. While it might appear to produce less-than-ideal HTML output, you're better safe than sorry.
Now, please note that this code:
@Html.Raw(HttpUtility.JavaScriptStringEncode(name))
is not correct and is not secure, because it is invalid to use a JavaScript encoding routine to render HTML markup.
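To illustrate why the output context matters, here is a rough sketch in plain JavaScript (for example, server-side Node) of encoding a value for use inside an inline script block. The function name toInlineScriptLiteral is hypothetical and this is not an ASP.NET API. The assumption is that JSON string escaping alone is not enough there, because the HTML parser still sees sequences such as `</script>`, so the HTML-significant characters are additionally written as `\uXXXX` escapes.

```javascript
// Hypothetical helper: produce a quoted JavaScript string literal that should be
// safe to place inside an inline <script> element (NOT inside an HTML attribute).
function toInlineScriptLiteral(value) {
  return JSON.stringify(String(value))   // handles quotes, backslashes, newlines
    .replace(/</g, '\\u003c')            // prevents "</script>" from closing the block
    .replace(/>/g, '\\u003e')
    .replace(/&/g, '\\u0026')
    .replace(/\u2028/g, '\\u2028')       // line separators that older JS engines
    .replace(/\u2029/g, '\\u2029');      // treat as line terminators
}

console.log(toInlineScriptLiteral('Jón'));
console.log(toInlineScriptLiteral('</script><script>alert(1)</script>'));
```

Non-ASCII letters such as ó pass through untouched, while the characters that matter to the HTML parser are escaped. A vetted encoder from your framework is still preferable to a hand-rolled one.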
I have code like this in my Razor view
<script>
var JsModel = @Html.Raw(Json.Encode(CsModel));
</script>
This works just fine. It converts the C# model received from the controller to a JS object, which is later used to populate Google Map with markers and various other stuff.
The problem is, Visual Studio is showing a big red syntax error in this line, and it's driving me insane. The code is perfectly fine, model is always non-null, and the encoding always works. But Razor parser, or perhaps even R#, I'm not sure which, trips up on it, as it seems to ignore C# code. So what it sees is just var JsModel = ; and complains.
Any way for me to tell it that it's ok? What can I do here?
We can get around this by putting the raw JSON in quotes, thus tricking the parser, and then using jQuery's $.parseJSON (or any other JSON parser you prefer) to convert the string into an object.
<script>
var JsModel = $.parseJSON('@Html.Raw(Json.Encode(CsModel))');
</script>
<script>
var JsModel = '@Html.Raw(Json.Encode(CsModel))'; //apply single quotes as shown
</script>
Use a function as shown below:
function SetMyValue(value){
return value;
}
var JsModel = SetMyValue(@Html.Raw(Json.Encode(CsModel)));
As documented here: http://msdn.microsoft.com/en-us/library/gg480740(v=vs.118).aspx
This method wraps HTML markup using the IHtmlString class, which
renders unencoded HTML.
The Razor escaping system will not escape anything of type IHtmlString. The Html.Raw() method simply returns an HtmlString instance containing your text.
So wrapping it into function and returning from there would work for you.
I am trying to protect my website from Cross-Site Scripting (XSS) and I'm thinking of using regular expressions to validate user inputs.
Here is my question: I have a list of dangerous HTML tags...
<applet>
<body>
<embed>
<frame>
<script>
<frameset>
<html>
<iframe>
<img>
<style>
<layer>
<link>
<ilayer>
<meta>
<object>
...and I want to include them in regular expressions - is this possible? If not, what should I use? Do you have any ideas how to implement something like that?
public static bool ValidateAntiXSS(string inputParameter)
{
if (string.IsNullOrEmpty(inputParameter))
return true;
// The following regex covers all the JS events and HTML tags mentioned in the following links.
//https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet
//https://msdn.microsoft.com/en-us/library/ff649310.aspx
var pattren = new StringBuilder();
//Checks any js events i.e. onKeyUp(), onBlur(), alerts and custom js functions etc.
pattren.Append(@"((alert|on\w+|function\s+\w+)\s*\(\s*(['+\d\w](,?\s*['+\d\w]*)*)*\s*\))");
//Checks any html tags i.e. <script, <embed, <object etc.
pattren.Append(@"|(<(script|iframe|embed|frame|frameset|object|img|applet|body|html|style|layer|link|ilayer|meta|bgsound))");
return !Regex.IsMatch(System.Web.HttpUtility.UrlDecode(inputParameter), pattren.ToString(), RegexOptions.IgnoreCase | RegexOptions.Compiled);
}
Please read over the OWASP XSS (Cross Site Scripting) Prevention Cheat Sheet for a broad array of information. Black listing tags is not a very efficient way to do it and will leave gaps. You should filter input, sanitize before outputting to browser, encode HTML entities, and various other techniques discussed in my link.
You should encode the string as HTML. Use the .NET method
HttpUtility.HtmlEncode(string text)
There are more details at http://msdn.microsoft.com/en-us/library/73z22y6h.aspx
Blacklisting as sanitization is not effective, as has already been discussed. Think about what happens to your blacklist when someone submits crafted input:
<SCRIPT>
<ScRiPt>
< S C R I P T >
<scr�ipt>
<scr<script>ipt> (did you apply the blacklist recursively ;-) )
This is not an enumeration of possible attacks, but just some examples to keep in mind about how the blacklist can be defeated. These will all render in the browser correctly.
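The nested-tag case can be reproduced with a few lines of JavaScript. This is only an illustrative sketch: naiveStripScriptTags is a hypothetical single-pass blacklist filter, not code from any answer here.

```javascript
// Hypothetical single-pass blacklist: delete anything that looks like a
// <script> or </script> tag, case-insensitively, exactly once.
function naiveStripScriptTags(html) {
  return html.replace(/<\/?script[^>]*>/gi, '');
}

// Crafted input: removing the inner tags splices the outer fragments together.
const crafted = '<scr<script>ipt>alert(1)</scr</script>ipt>';
const filtered = naiveStripScriptTags(crafted);

console.log(filtered); // <script>alert(1)</script>
```

The filter's own output is a working script element, which is why a single pass over a blacklist cannot be relied on.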
I have tried to decode the HTML text that I have in the database in my MVC 3 Razor application.
The HTML text in the database is not encoded.
I tried HttpUtility.HtmlDecode and Server.HtmlDecode, but neither of them works.
Finally I managed to make it work with Html.Raw(string).
Sample of non-working code:
@Server.HtmlDecode(item.ShortDescription)
@HttpUtility.HtmlDecode(item.ShortDescription)
Do you know why we cannot use Html.Decode in my case?
I thought this would save someone else from looking for a few hours.
It works just fine to decode the text, but then it will automatically be encoded again when it's put in the page using the @ syntax.
The Html.Raw method wraps the string in an HtmlString, which tells the razor engine not to encode it when it's put in the page.
If you want to display the value as-is without any HTML encoding you could use the Html.Raw helper:
@Html.Raw(item.ShortDescription)
Be warned, though, that by doing this you are opening your site to XSS attacks, so you should be very careful about what HTML this ShortDescription property contains. If it is the user who enters it, you should absolutely ensure that it is safe. You could use the AntiXss library for this.
Do you know why we cannot use Html.Decode in my case?
Because Html.Decode returns a string, and when you feed a string to the @() Razor function it automatically HTML-encodes it again and ruins your previous efforts. That's why the Html.Raw helper exists.
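The decode-then-re-encode round trip can be mimicked outside Razor. The sketch below uses two hypothetical helpers, htmlEncode and htmlDecode, as simplified stand-ins for what the view engine and HttpUtility do. They cover only a handful of entities and are not the real implementations.

```javascript
// Simplified stand-in for the automatic encoding Razor applies to plain strings.
function htmlEncode(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;')
          .replace(/"/g, '&quot;').replace(/'/g, '&#39;');
}

// Simplified stand-in for HtmlDecode; &amp; is decoded last to avoid double-decoding.
function htmlDecode(s) {
  return s.replace(/&lt;/g, '<').replace(/&gt;/g, '>').replace(/&quot;/g, '"')
          .replace(/&#39;/g, "'").replace(/&amp;/g, '&');
}

const stored = '<b>Short description</b>';        // not encoded in the database
const rendered = htmlEncode(htmlDecode(stored));  // decode is a no-op, then auto-encode

console.log(rendered); // &lt;b&gt;Short description&lt;/b&gt;
```

Decoding changes nothing because the stored text was never encoded, and the automatic encoding step then turns the markup into visible text, which matches the behaviour described in the question.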