I have a requirement that users can enter HTML tags in an ASP.NET TextBox. The value of the textbox is saved in the database, and we then need to show
on another page exactly what the user entered. To allow this, I set ValidateRequest="false" in the Page directive.
Now the problem is that when a user inputs something like:
<script> window.location = 'http://www.xyz.com'; </script>
The value is saved in the database, but when I display it on another page it redirects me to "http://www.xyz.com", which is expected
because the browser executes the script. I need a solution that shows exactly what the user entered.
I am thinking of Server.HtmlEncode. Can you point me in the right direction?
Always always always encode the input from the user and then and only then persist in your database. You can achieve this easily by doing
Server.HtmlEncode(userinput)
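Now, when it comes time to display the content, write the stored, encoded value to the page as-is; the browser shows the original characters without executing them. Decode only when you need the raw text back (for example, to repopulate an edit box), never when writing straight into the page markup, or the script will run again:
Server.HtmlDecode(userinput)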
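The encoding step described above might look like this in a small C# helper. This is only a sketch: CommentRenderer and ToSafeHtml are hypothetical names, and it uses HttpUtility.HtmlEncode from System.Web, which does the same job as Server.HtmlEncode but can be called outside a page.

```csharp
using System;
using System.Web; // HttpUtility lives in the System.Web namespace

public static class CommentRenderer
{
    // Hypothetical helper: converts user input into text that is safe
    // to write into page markup.
    public static string ToSafeHtml(string userInput)
    {
        return HttpUtility.HtmlEncode(userInput ?? string.Empty);
    }

    public static void Main()
    {
        string input = "<script> window.location = 'http://www.xyz.com'; </script>";
        // The angle brackets are emitted as &lt; and &gt;, so the browser
        // displays the text instead of running it.
        Console.WriteLine(ToSafeHtml(input));
    }
}
```

Whether the call happens before saving or at render time, the important part is that the value written into the page is the encoded one.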
You need to encode all of the input before you output it back to the user and you could consider implementing a whitelist based approach to what kind of HTML you allow a user to submit.
I suggest a whitelist approach because it's much easier to write rules to allow p,br,em,strong,a (for example) rather than to try and identify every kind of malicious input and blacklist them.
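One way to sketch the whitelist idea in C# is to encode everything first and then re-enable only the exact tags you allow. The class name and the tag list here are assumptions for illustration; "a" is left out on purpose because links need their href attribute validated, which this sketch does not attempt.

```csharp
using System;
using System.Web;

public static class WhitelistSanitizer
{
    // Assumed whitelist: attribute-free formatting tags only.
    private static readonly string[] AllowedTags = { "p", "br", "em", "strong" };

    public static string Sanitize(string rawInput)
    {
        // Encode everything first, so nothing the user typed is live markup.
        string html = HttpUtility.HtmlEncode(rawInput ?? string.Empty);

        // Then turn the exact encoded forms of the allowed tags back into tags.
        foreach (string tag in AllowedTags)
        {
            html = html.Replace("&lt;" + tag + "&gt;", "<" + tag + ">");
            html = html.Replace("&lt;/" + tag + "&gt;", "</" + tag + ">");
        }
        return html;
    }

    public static void Main()
    {
        Console.WriteLine(Sanitize("<em>hi</em><script>alert(1)</script>"));
    }
}
```

Because only the literal, lowercase, attribute-free forms are restored, anything else (including <P>, <br/> or <em onclick=...>) stays encoded and is shown as text.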
Possibly consider using something like Markdown (as used on Stack Overflow) instead of allowing plain HTML?
You need to escape some characters when generating the HTML: '<' -> &lt;, '>' -> &gt;, '&' -> &amp;. This way exactly what the user entered is displayed; otherwise the HTML parser may recognize HTML tags and execute them.
Have you tried using HTMLEncode on all of your inputs? I personally use the Telerik RadEditor that escapes the characters before submitting them... that way the system doesn't barf on exceptions.
Here's an SO question along the same lines.
You should have a look at the HTML tags you do not want to support because of vulnerabilities as the one you described, such as
script
img
iframe
applet
object
embed
form, button, input
and replace the leading "<" with "&lt;".
Also replace </body> and </html>.
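The replacement described in this list could be sketched with a regular expression, as below. TagBlocker and Neutralize are hypothetical names, and the pattern covers only the tags named above. A blacklist like this does nothing about event-handler attributes such as onclick on tags it lets through, so treat it as an illustration rather than a complete defence.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TagBlocker
{
    // Matches the "<" that opens (or closes) one of the listed tags.
    private static readonly Regex Blocked = new Regex(
        @"<(?=\s*/?\s*(script|img|iframe|applet|object|embed|form|button|input|body|html)\b)",
        RegexOptions.IgnoreCase);

    public static string Neutralize(string html)
    {
        // Only the leading "<" is replaced, as suggested in the answer.
        return Blocked.Replace(html ?? string.Empty, "&lt;");
    }

    public static void Main()
    {
        Console.WriteLine(Neutralize("<p>ok</p><script>alert(1)</script>"));
    }
}
```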
HTML editors such as CKEditor allow you to require well-formed XHTML, and define tags to be excluded from input.
Related
I have a contenteditable div where the user enters data. When they enter a line break, each browser stores the data differently. When I export this data to Word using HtmlToOpenXml, it adds a blank line to the content, and I want to avoid that so the HTML page and the Word doc look the same.
One option is to replace the tags <br>, <div>, <p> with an empty string and then replace </div> and </p> with <br/> in the C# code using RegEx. But I do not know all the formatting that different browsers use for a contenteditable div, so this implementation may not be enough.
I would like to know the best way to address this, or whether there is an open source tool/DLL that helps with this issue.
e.g. ContentEditable div actual data in browsers looks like below
Chrome -
line1<div>line2</div><div>line3</div>
IE Edge-
<div>line1</div><div>line22</div><div>line3<br></div>
FireFox - I read it uses <p> </p> instead of <div> </div>
Safari - ????
A Solution I found:
You could use RegEx, which I highly recommend in C# for parsing this kind of information.
Based on the formatting, you could narrow down which browser produced the markup and then parse its output into one universal representation. This will not be easy, but nothing cross-platform ever truly is. I would give an example of how this could be done, but RegEx in all honesty takes a good amount of work, and it would take quite a bit of code to show how to parse the markup and work out which browser it came from.
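As a rough sketch of the RegEx idea, the following normalizes the Chrome and Edge samples from the question into the same <br/>-separated form without detecting the browser at all. The class and method names are made up for illustration, it is a variant of the replacement the question proposes, and it has only been reasoned through against those two samples; Firefox and Safari output is an open question here, just as it is in the question.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ContentEditableNormalizer
{
    private const RegexOptions Opts = RegexOptions.IgnoreCase;

    public static string Normalize(string html)
    {
        string s = html ?? string.Empty;
        // A <br> right before a closing block tag is only a placeholder.
        s = Regex.Replace(s, @"<br\s*/?>\s*(?=</(div|p)>)", "", Opts);
        // Each opening <div> or <p> starts a new line; closing tags add nothing.
        s = Regex.Replace(s, @"<(div|p)\b[^>]*>", "\n", Opts);
        s = Regex.Replace(s, @"</(div|p)>", "", Opts);
        // Any remaining <br> is an explicit line break.
        s = Regex.Replace(s, @"<br\s*/?>", "\n", Opts);
        // Drop leading/trailing breaks, then emit one <br/> per line break.
        return s.Trim('\n').Replace("\n", "<br/>");
    }

    public static void Main()
    {
        Console.WriteLine(Normalize("line1<div>line2</div><div>line3</div>"));
        Console.WriteLine(Normalize("<div>line1</div><div>line2</div><div>line3<br></div>"));
    }
}
```

Trailing empty lines are dropped on purpose, since the extra blank line in the Word export is what the question is trying to avoid.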
I have some HTML code to show up on an HTML page, so it must not be interpreted as HTML.
Also, I'd like to maintain space/empty line and so on.
I'm on C#/.NET 3.5 : what can I use?
Just use HtmlEncode.
Encodes a string to be displayed in a browser.
And documented in the overloads:
HTML encoding makes sure that text is displayed correctly in the browser and not interpreted by the browser as HTML. For example, if a text string contains a less than sign (<) or greater than sign (>), the browser would interpret these characters as the opening or closing bracket of an HTML tag. When the characters are HTML encoded, they are converted to the strings &lt; and &gt;, which causes the browser to display the less than sign and greater than sign correctly.
It is not clear for what purpose you want to display this, but you may want to pretty print before HTML encoding (the HTML Agility Pack may do this, not sure) - and to show it as fixed width you can enclose in a <pre> element.
Since you're not actually saying which technology within .Net you are using to render your Html page (Asp.Net WebForms or MVC or whatever) the answer falls back to how you would do it in HTML, regardless of your server technology. After that, how you actually achieve this output is entirely up to you.
Render it in a <pre> block, with the markup HTML-encoded:
<pre>
&lt;p&gt;hello world!&lt;/p&gt;
</pre>
Here the text will appear as <p>hello world!</p> and, by default, be shown in a fixed-width font with all whitespace retained.
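On the C# side, producing that markup can be as small as the sketch below. SourceViewer and ToPreBlock are hypothetical names; HttpUtility.HtmlEncode is in System.Web and leaves spaces and line breaks untouched, so the <pre> element keeps the original layout.

```csharp
using System;
using System.Web;

public static class SourceViewer
{
    // Wraps HTML source in a <pre> element so it is shown, not rendered.
    public static string ToPreBlock(string htmlSource)
    {
        return "<pre>" + HttpUtility.HtmlEncode(htmlSource ?? string.Empty) + "</pre>";
    }

    public static void Main()
    {
        Console.WriteLine(ToPreBlock("<p>hello world!</p>"));
    }
}
```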
My issue is that I have a designer who will create a custom aspx page but without any .NET controls. I need a way of adding the controls dynamically. So far the only types of controls will be textboxes and a button, but there are 30 variations of what the textboxes can be (name, phone #, email, etc.). Also, the textboxes may or may not be required. Once the textboxes are added, the form will be submitted to a db.
My first thought was to have the designer place something like [name] and then replace that with a user control that has a name textbox and a required field validator. To determine whether the validator should be enabled, I was thinking that the placeholder could look like this: [name;val] or [name;noval]. I could either replace the placeholders in code dynamically or set up a tool where the user pastes their HTML into a textbox and clicks a button, which then spits out the necessary code to create the aspx page.
I'm sure there must be a better way to do this, but it's a fairly unique problem so I haven't been able to find any alternatives. Does anyone have any ideas?
Thanks,
Kirk
If your designer gives you HTML pages, just create a new website and copy all the HTML pages, with the image folders and everything else, into your project. Then for every HTML page create an aspx page with the same name, copy the tags between <head> and </head> in the HTML page into the aspx page's <head>, and copy the tags between <body> and </body> into the body of the aspx page.
Now you have your aspx page, exactly the same as html page.
Sounds like an attempt to over-engineer a solution to what should be a non-issue.
As @Alessandro mentioned in a comment above, why can't the designer provide you with pages that have the control markup? As it stands right now, the designer isn't providing you with "a custom aspx" so much as "a custom html page." If the designer is promising ASPX but delivering only HTML, that's a misinterpretation somewhere in the business requirements.
However, even if the designer is rightfully providing only HTML, there shouldn't be a problem with that. At worst, you can set each element you need on the server to runat="server" to access them on the server-side. Or, probably better, would be to simply replace them with the ASPX control markup for the relevant controls.
Write a simple parser that recognizes the [...] tags and replaces them with the corresponding controls. It's pretty easy to do and I've often done this. The tag I use is usually $$(..); but that doesn't matter as long as your parser knows your tags.
Such a parser consists of a simple state machine with two states: text-mode and tag-mode. Loop through the whole page text, char by char. As long as you're in text-mode, keep appending each char to a temporary buffer. As soon as you switch to tag-mode, create a LiteralControl with the content of the temporary buffer, add it to the bottom of your control tree, and empty the buffer.
In tag-mode you still keep adding each char to the buffer, but when you hit text-mode again, you analyze the content of the buffer and create the correct control; this could be a simple switch statement. Add the control to the bottom of your control tree and keep looping through the rest of the chars until you reach the end, switching back and forth between text-mode and tag-mode and adding LiteralControls and concrete controls as you go. Any text left in the buffer at the end becomes a final LiteralControl.
Simple example of such a parser... written in notepad in 4 minutes, but you should get the idea.
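// Assumes this runs inside a Page or UserControl (so Controls is available),
// that NameControl is one of your own user controls, and that the file has
// using System.Text; and using System.Web.UI;
var buffer = new StringBuilder();
bool inTag = false;
foreach (char c in text)
{
    if (c == '[' && !inTag)
    {
        // Entering tag-mode: flush the literal text collected so far.
        Controls.Add(new LiteralControl(buffer.ToString()));
        buffer.Length = 0;
        inTag = true;
    }
    else if (c == ']' && inTag)
    {
        // Back to text-mode: the buffer holds the tag name without brackets.
        switch (buffer.ToString())
        {
            case "name": Controls.Add(new NameControl()); break;
            // ... the rest of possible tags
        }
        buffer.Length = 0;
        inTag = false;
    }
    else
    {
        buffer.Append(c);
    }
}
// Flush whatever text follows the last tag.
Controls.Add(new LiteralControl(buffer.ToString()));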
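For trying the state machine outside of ASP.NET, here is a self-contained sketch that produces a list of tokens instead of controls. It also reads the [name;val] / [name;noval] convention from the question. Token and PlaceholderParser are hypothetical names, and mapping each tag token to a real control is left to the page code.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public sealed class Token
{
    public bool IsTag;     // false for literal HTML
    public string Value;   // literal text, or the tag name such as "name"
    public bool Required;  // true when the placeholder ends in ";val"
}

public static class PlaceholderParser
{
    public static List<Token> Parse(string template)
    {
        var tokens = new List<Token>();
        var buffer = new StringBuilder();
        bool inTag = false;

        foreach (char c in template ?? string.Empty)
        {
            if (c == '[' && !inTag) { Flush(tokens, buffer, false); inTag = true; }
            else if (c == ']' && inTag) { Flush(tokens, buffer, true); inTag = false; }
            else buffer.Append(c);
        }

        // An unclosed "[" is treated as ordinary text.
        if (inTag) buffer.Insert(0, '[');
        Flush(tokens, buffer, false);
        return tokens;
    }

    private static void Flush(List<Token> tokens, StringBuilder buffer, bool isTag)
    {
        string text = buffer.ToString();
        buffer.Length = 0;
        if (!isTag)
        {
            if (text.Length > 0) tokens.Add(new Token { IsTag = false, Value = text });
            return;
        }
        string[] parts = text.Split(';');
        tokens.Add(new Token
        {
            IsTag = true,
            Value = parts[0].Trim(),
            Required = parts.Length > 1 && parts[1].Trim() == "val"
        });
    }

    public static void Main()
    {
        foreach (Token t in Parse("<p>Name: [name;val]</p><p>Email: [email;noval]</p>"))
            Console.WriteLine((t.IsTag ? "TAG  " : "TEXT ") + t.Value + (t.Required ? " (required)" : ""));
    }
}
```

In a page, each text token would become a LiteralControl and each tag token the matching user control, with the validator enabled when Required is true.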
I have to write a web page annotator.
And my requirements are the following:
1) given a set of pages, I want to annotate them efficiently (in a browser, in an external application that knows how to render HTML, etc.)
2) I select (highlight, make active) manually a string of text, and dropdown menu appears that allows to select from a set of options
3) after that the iterator appears (like in a browser when pressed ctrl+F to search) and I want to be able to navigate through matches of the string, selected in the previous step, on the same page
4) comparison function on strings is given that has interface: given two strings it outputs either 1 or 0, depending on strings match
5) when I press iterator button, i move to the next match for selected string, and then a message box should appear (or any other thing where i can confirm that it is a true match)
6) having confirmed that it is a true match, the text of a page should be modified such that current match became surrounded by special tag
(for instance <<< optionX >>> matched text <<< /optionX >>> ), where optionX is defined based on a value selected in the first step (dropdown menu)
7) when all matches (as defined by the comparison function) are found on a page, I would like to mark another string of text on the same page and then repeat the process: finding all the matches, confirming some of them, and modifying the page source accordingly
8) then the modified page should be stored on a local drive
QUESTIONS:
Can you please suggest what is the right tool to do that?
1) Is it OK to use JavaScript and work in a browser? If yes, what methods are required for that, and are there any useful libraries that do just that, or at least cover some of the functionality described above?
2) Or is it better to build a custom desktop app that renders a page in a special frame and has appropriate buttons to navigate, confirm, etc. (Python or C# are considered)? And again, what classes and libraries can help?
[UPDATE]:
I know how to work with the content of a page, but I am curious how to make it comfortable for annotators to use, how to build the right dialog with the user: ways to have all candies such as dropdown menus and iterator that is visible for users, dialog for confirmation etc.
The goal is to annotate a lot of pages with that, therefore interface should be efficient. I am a researcher (and this is not a homework as you might think, i just described what is needed in a formal way ) and I have only poor experience writing user oriented apps.
Thank you in advance!
This is definitely a web browser's job. You can use jQuery to search and modify an HTML page. This example finds all occurrences of "homework" in elements that have no child elements and changes the text to << homework >>
$("*").each(function () {
if ($(this).children().length == 0) {
$(this).text($(this).text().replace(/homework/g, '<< homework >>'));
}
});
Hi, I get this error when I hit the submit button on a user login form, because there is a repeater on the same page which is repeating HTML that gets posted back with the form content. Apart from applying ValidateRequest="false" to the login user control, is there anything I can add around the repeater to stop this?
When you set ValidateRequest to false, all kinds of dangerous characters are accepted as parameters, so you must make sure to properly HTML encode them if you intend to redisplay this user input.
If for some reason you can't HTML encode the text:
1) In the repeater, render the dangerous text inside HTML elements that don't get posted, like <p> or <span>.
2) If you absolutely must render the HTML inside <input> elements, disable those elements so that your page doesn't submit them.
I answered how to allow this here:
"<" in a text box in ASP.NET --> how to allow it?
basically by escaping the HTML just before the post