Practices on filtering user inputs

I would like to ask for suggestions from the more experienced people out there.
I have to filter user input, because users might try to enter values like
<script type="text/javascript">alert(12);</script>
into a textbox. Do you have any recommendations for good practices regarding this issue?
We recently ran into this problem on one of our SharePoint projects. We entered a script into the textbox and, boom, the page crashed. Trapping this particular case is probably easy, because we know it is one of the possible inputs, but what about the inputs we don't know about? There may be other situations we haven't considered beyond trapping a script. Can somebody suggest a good practice for this?
Thanks in advance! :)

Microsoft actually produces an anti-cross-site scripting library (the AntiXSS library), though when I looked at it, it was little more than a wrapper around various encoding functions in the .NET Framework.
Two of the main threats you should consider are:
Script injection
HTML tag injection
Both of these can be mitigated (to a degree) by HTML-encoding user input before re-rendering it on the page.
There is also a library called AntiSamy available from the OWASP project, designed to neuter malicious input in web applications.
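As a minimal sketch of the HTML-encoding step mentioned above: the helper below uses WebUtility.HtmlEncode from the .NET Framework. The class and method names (OutputEncoding, ForHtml) are made up for illustration and are not part of AntiXSS or AntiSamy.

```csharp
using System.Net;

public static class OutputEncoding
{
    // Encodes untrusted text so the browser shows it as literal characters
    // instead of interpreting it as markup or script.
    // Hypothetical helper name; the real work is done by WebUtility.HtmlEncode.
    public static string ForHtml(string untrusted)
    {
        return WebUtility.HtmlEncode(untrusted ?? string.Empty);
    }
}
```

Calling OutputEncoding.ForHtml on the script from the question turns the angle brackets into &lt; and &gt;, so the page displays the text rather than running it. Note that this covers text placed in HTML content; values written into JavaScript or URLs need encoding suited to those contexts.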

Jimmy's answer is a good technique for managing "Input Validation & Representation" problems.
But you can also filter your textbox inputs yourself before passing them to a third-party API such as AntiSamy.
I generally use these controls:
1) Limit the length of the textbox value, not only on the client side but on the server side too (you may not believe me, but there are buffer overflow attacks in scripting as well)
2) Apply a whitelist to the characters users can type into the textbox (client side and server side)
3) Use a whitelist wherever possible. Blacklists are less secure than whitelists
It is very important that you perform these checks on the server side.
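The controls above could be sketched on the server side roughly like this. The maximum length and the allowed character set are assumptions chosen for illustration, and InputValidator is a hypothetical name; pick values that fit each field.

```csharp
using System.Text.RegularExpressions;

public static class InputValidator
{
    // Hypothetical limit; choose one per field.
    private const int MaxLength = 100;

    // Whitelist (an assumed example set): letters, digits, spaces,
    // and a few punctuation marks. Anything else is rejected.
    private static readonly Regex Allowed =
        new Regex(@"\A[\p{L}\p{N} .,'\-]*\z", RegexOptions.CultureInvariant);

    // Returns true only if the input is short enough and every
    // character is on the whitelist.
    public static bool IsValid(string input)
    {
        if (input == null) return false;
        if (input.Length > MaxLength) return false;
        return Allowed.IsMatch(input);
    }
}
```

With this set, InputValidator.IsValid("John O'Brien") passes while any input containing < or > is rejected. The same rule can be mirrored on the client for usability, but only the server-side check counts for security.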
Of course, it's very easy to forget some checks, so AntiSamy and similar products are very useful. But I advise you to implement your own "Input Validation" API as well.
Securing software is not about buying a third-party product; it is about programming in a different way.

I have tried this on SharePoint with both a single line of text and multiple lines of text, and in both cases SharePoint encodes the value (I get no alert).
Which version of SharePoint are you using?

Related

How much can I trust ASP.NET Request Validation with Web Pages/WebMatrix vs. XSS?

I have read (and am coming to terms with) the fact that no solution can be 100% effective against XSS attacks. It seems that the best we can hope for is to stop "most" XSS attack avenues, and probably have good recovery and/or legal plans afterwards. Lately, I've been struggling to find a good frame of reference for what should and shouldn't be an acceptable risk.
After having read this article, by Mike Brind (A very good article, btw):
http://www.mikesdotnetting.com/Article/159/WebMatrix-Protecting-Your-Web-Pages-Site
I can see that using an HTML sanitizer can also be very effective in reducing the avenues for XSS attacks if you need to accept user input unvalidated.
However, in my case, it's kind of the opposite. I have a (very limited) CMS with a web interface. The user input (after being URL-encoded) is saved to a JSON file, which is then picked up (and decoded) on the viewable page. My main defense against XSS here is that you have to be one of a few registered members to change content at all. By logging registered users, IP addresses, and timestamps, I feel that this threat is mostly mitigated. However, in addition to those measures, I would like to use a try/catch statement to catch the YSOD produced by ASP.NET's default request validator.
My question is: How much can I trust this validator? I know it will detect tags (this partial CMS is NOT set up to accept any tags, logistically speaking, so I am fine with an error being thrown if ANY tag is detected). But what else (if anything) does this built-in validator detect?
I know that XSS can be carried out without ever touching an angle bracket (or a full tag, for that matter), as HTML sources can be saved, edited, and subsequently run from the client computer after simply adding an extra "onload='BS XSS ATTACK'" to some random tag.
I'm just curious how much this validator can be trusted if a person does want to use it as part of their anti-XSS plans (obviously with a try/catch, so the users don't see the YSOD). Is this validator pretty decent but not perfect, or is it just a "best guess" that anyone who knows enough about XSS could get around, so that the validation wouldn't really matter?
-----------------------EDIT-------------------------------
At this site...: http://msdn.microsoft.com/en-us/library/hh882339(v=vs.100).aspx
...I found this example for web-pages.
var userComment = Request.Form["userInput"]; // Validated, throws error if input includes markup
Request.Unvalidated("userInput"); // Validation bypassed
Request.Unvalidated().Form["userInput"]; // Validation bypassed
Request.QueryString["userPreference"]; // Validated
Request.Unvalidated().QueryString["userPreference"]; // Validation bypassed
Per the comment: "//Validated, throws error if input includes markup" I take it that the validator throws an error if the string contains anything that is considered markup. Now the question (for me) really becomes: What is considered markup? Through testing I have found that a single angle bracket won't throw an error, but if anything (that I have tested so far) comes after that angle bracket, such as
"<l"
it seems to error. I am sure it does more checking than that, however, and I would love to see what does and does not qualify as markup in the eyes of the request validator.

I believe ASP.NET request validation is fairly trustworthy, but you should not rely on it alone. For some projects I leave it enabled to provide an added layer of security. In general it is preferable to use a widely tested and widely used solution than to craft one yourself. If the "YSOD" (or custom error page) becomes an issue with my clients, I usually just disable the .NET request validation feature for the page.
When I do so, I carefully ensure that my input is sanitized and, more importantly, that my output is encoded. So anywhere I push user-entered content (or content from a web service, or anything else that comes from a third party) to the user, it gets wrapped in Server.HtmlEncode(). This approach has worked pretty well for a number of years now.
The link you provided to Microsoft's documentation is quite good. To answer your question about what is considered markup (or what should be considered markup), put on your hacker hat and check out the OWASP XSS Evasion Cheat Sheet.
https://www.owasp.org/index.php/XSS_Filter_Evasion_Cheat_Sheet#HTML_entities
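To give a rough feel for what the validator treats as markup, here is an approximation of its string check, written from my reading of the published .NET Framework 4.x reference source. Treat the exact rules as an assumption; they may differ between versions, and MarkupCheck and LooksLikeMarkup are names invented for this sketch, not framework APIs.

```csharp
public static class MarkupCheck
{
    // Approximates the request validator's rule (assumption, see above):
    //   '<' followed by an ASCII letter, '!', '/' or '?'  -> rejected
    //   '&' followed by '#'                               -> rejected
    // Everything else passes.
    public static bool LooksLikeMarkup(string s)
    {
        if (string.IsNullOrEmpty(s)) return false;

        // Stop one short of the end: a trailing '<' or '&' has no
        // following character and is therefore allowed.
        for (int i = 0; i < s.Length - 1; i++)
        {
            char current = s[i];
            char next = s[i + 1];

            if (current == '<' &&
                (IsAsciiLetter(next) || next == '!' || next == '/' || next == '?'))
            {
                return true;
            }

            if (current == '&' && next == '#')
            {
                return true;
            }
        }
        return false;
    }

    private static bool IsAsciiLetter(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }
}
```

This is consistent with the testing described in the question: a lone "<" passes, while "<l" is rejected. It also shows the limits: a value such as onload='...' contains neither pattern and goes straight through, which is why output encoding is still needed.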

Extracting data from an ASPX page

I've been entrusted with an idiotic task by my boss.
The task is: given a web application that returns a table with pagination, write software that "reads and parses it", since there is nothing like a web service that provides the raw data. It's like a "spider" or "crawler" application to steal data that is not meant to be accessed programmatically.
Now the catch: the application is built with the standard ASPX Web Forms engine, so there is nothing like regular URLs or posts, just the dreadful postback engine crowded with JavaScript and inaccessible HTML. The pagination links call the infamous javascript:__doPostBack(param, param), so I don't think it would work even if I tried to simulate clicks on those links.
There are also inputs to filter the results, and they are part of the postback mechanism too, so I can't simulate a regular post to get the results.
I was forced to do something like this in the past, but that was on a more conventional website with query string parameters like pagesize and pagenumber, so I was able to sort it out.
Does anyone have any idea whether this is doable, or should I tell my boss to stop asking me to do this kind of thing?
EDIT: maybe I was a bit unclear about what I have to achieve. I have to parse, extract and convert that data in another format - let's say excel - and not just read it. And this stuff must be automated without user input. I don't think Selenium would cut it.
EDIT: I just blogged about this situation. Anyone interested can read my post at http://matteomosca.com/archive/2010/09/14/unethical-programming.aspx and comment on it.

Stop disregarding the tools suggested.
No, the parser you could write isn't WatiN or Selenium; both of those will work in that scenario.
P.S. Had you mentioned anything about needing to extract the data from Flash/Flex/Silverlight or similar, this would be a different answer.
By the way, the reason to proceed or not is definitely not technical, but ethical and maybe even legal. See my comment on the question for my opinion on this.

WatiN will help you navigate the site from the perspective of the UI and grab the HTML for you, and you can find information on .NET DOM parsers here.

I already commented, but I think this is actually an answer.
You need a tool that can click client-side links and wait while the page reloads.
Tools like Selenium can do that.
Also (from the comments): WatiN and Watir.
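For what it's worth, this is a different technique from the browser-automation tools suggested here: the postback behind javascript:__doPostBack(target, argument) is an ordinary form POST in which the script sets the hidden fields __EVENTTARGET and __EVENTARGUMENT and submits the form together with the page's other hidden fields (such as __VIEWSTATE). A minimal sketch of collecting those fields follows. PostBackForm and Build are hypothetical names, and the regular expression assumes the attribute order that Web Forms normally emits (type, name, then value); a real DOM parser would be more robust.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class PostBackForm
{
    // Matches <input type="hidden" name="..." ... value="..."> as Web Forms
    // usually renders it. Assumption: type comes before name, name before value.
    private static readonly Regex HiddenInput = new Regex(
        "<input[^>]*type=\"hidden\"[^>]*name=\"(?<name>[^\"]+)\"[^>]*value=\"(?<value>[^\"]*)\"",
        RegexOptions.IgnoreCase);

    // Builds the form fields that a __doPostBack(eventTarget, eventArgument)
    // call would submit for the given page HTML.
    public static Dictionary<string, string> Build(
        string pageHtml, string eventTarget, string eventArgument)
    {
        var fields = new Dictionary<string, string>();

        // Echo back every hidden field (__VIEWSTATE, __EVENTVALIDATION, ...).
        foreach (Match m in HiddenInput.Matches(pageHtml))
        {
            fields[m.Groups["name"].Value] =
                WebUtility.HtmlDecode(m.Groups["value"].Value);
        }

        // These two tell the server which control raised the postback.
        fields["__EVENTTARGET"] = eventTarget;
        fields["__EVENTARGUMENT"] = eventArgument;
        return fields;
    }
}
```

The resulting dictionary could then be wrapped in a FormUrlEncodedContent and sent with HttpClient.PostAsync to the page's own URL, keeping the session cookies and re-reading the hidden fields from each response before requesting the next page. The ethical and legal caveats raised in the other answer apply regardless of the tool.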

@Insane, the CDC's website has this exact problem, and the data is public (and we taxpayers have paid for it). I'm trying to get the survey and question data from http://wwwn.cdc.gov/qbank/Survey.aspx and it's absurdly difficult. This is not illegal or unethical; it's just a terrible implementation that appears to intentionally make it difficult to get the data (it is also inaccessible to search engines).
I think Selenium is going to work for us, thanks for the suggestion.

ASP.NET User-based Templates

Is there any way to let users write their own aspx templates using dynamic variables that I define? Note that I don't want to use Web Forms (so there are no tags like <asp:button> etc).
In addition, I'd need a security solution so that users can't change the system or do other dangerous things.
Thanks.

Personally I would avoid using the ASPX engine for this. I would probably use either a really simple custom formatting solution (such as just a text file with %%VAR_NAME%% allowed for dynamic values), or I would look at a templating language such as Markdown (used by StackOverflow and others). BBCode is another option in a similar vein.
Allowing people to create ASPX templated pages on the fly seems like too much of a security issue to me. It would be hard to make sure you have closed all the possible attack vectors once they have direct access to the ASP.NET engine.
Since you didn't specify, I'm just guessing at your needs, so depending on the exact problem this may or may not be your best bet. If you include more details about the problem you are solving it would be easier to make suggestions.
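A minimal sketch of the %%VAR_NAME%% approach, assuming placeholder names made of upper-case letters, digits, and underscores. SimpleTemplate and Render are hypothetical names, and the HTML-encoding of values is an added assumption, not something the question asked for.

```csharp
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class SimpleTemplate
{
    // Assumed placeholder syntax: %%NAME%% with A-Z, 0-9 and underscore.
    private static readonly Regex Placeholder =
        new Regex(@"%%(?<name>[A-Z0-9_]+)%%");

    // Replaces each known placeholder with its HTML-encoded value.
    // Unknown placeholders are left untouched so mistakes are visible.
    public static string Render(string template, IDictionary<string, string> values)
    {
        return Placeholder.Replace(template, m =>
        {
            string value;
            return values.TryGetValue(m.Groups["name"].Value, out value)
                ? WebUtility.HtmlEncode(value)
                : m.Value;
        });
    }
}
```

Because the template is treated as plain text with substitutions, users never get access to the ASP.NET engine. This only encodes the substituted values; if the template text itself may contain HTML written by users, it still needs sanitizing before it is served.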

Why isn't ValidateRequest="true" enough for XSS prevention?

In the notes for Step 1 in the "How To: Prevent Cross-Site Scripting in ASP.NET" it is stated that you should "not rely on ASP.NET request validation. Treat it as an extra precautionary measure in addition to your own input validation."
Why isn't it enough?

For one thing, hackers are always coming up with new attacks and new ways of inserting XSS. ASP.NET's request validation only gets updated when a new version of ASP.NET is released, so if someone comes up with a new attack the day after an ASP.NET release, request validation won't catch it.
That (I believe) is one of the reasons why the AntiXSS project appeared, so it can have a faster release cycle.

Just two hints:
Your application might output more than just the data that was entered through your ASP.NET forms. Think of web services, RSS feeds, other databases, information extracted from user uploads, etc.
Sometimes it's necessary to disable the default (effective but overly simple) request validation because you need to accept angle brackets in your forms. Think of a WYSIWYG editor.

Encrypt my framework and code

I am creating my own CMS framework, because many of my clients have the same requirements, such as a news module, a newsletter module, etc.
That part is going fine. The only thing bothering me is that if a client wants to move off my server, he will ask me to give him his files, and if I do so, whoever takes over will see all my code and can use it and benefit from it. This is bad for me: I spent all this time creating my system, and anyone could easily see the code and all the logic of my system, and could easily work out how my other clients' sites work, which is a threat to me. Finally, I am using third-party controls whose licenses I have paid for, and I don't want to hand them over on a golden plate.
What is the best way to solve this? I thought of encryption, but how can I do that, and how effective is it?
- Should I merge all my CS files and the DLLs in the bin folder into one DLL and encrypt it, and how can I do that?
I would really appreciate any help on this matter, as it is crucial for me.

You should read these:
Best .NET obfuscation tools/strategy
How effective is obfuscation?

In my experience, this is rarely worth the effort. Lots of companies who provide libraries like this don't bother obfuscating their code (Telerik, etc).
Especially considering what you are writing (CMSes are everywhere), you'd likely get more out of your time by implementing features that give your product a competitive advantage and show companies that the value lies in the software you are capable of writing, rather than in the code itself.
In the end, you want to ensure you are a key factor in making software work for a company, not the DLLs you give them.

You'll need to precompile your site and obfuscate the DLLs.
Visual Studio ships with Dotfuscator Community Edition. You could give it a try.
Of course, HTML output, CSS declarations, database structure, and stored procedure code cannot be encrypted.
You can, however, try to compress the CSS, which will also reduce its readability for humans.
Check here: The best approach to scramble CSS definitions to a human-unreadable state throughout an ASP.NET application
One other idea would be to use a frame in your HTML and put most of the site's pages inside it. That way, their markup will not show up in "View source" for the outer page (although the frame's own source can still be viewed separately).
Or just state clearly that you offer what you do as a service and do not provide the source code of your work. I somehow doubt Salesforce would be willing to give their sources to anyone who asks.
