Format HtmlEncoded text to ASP - c#

I am taking a string from the database, which will then be HtmlEncoded. How do I format newlines and tabs?
I don't think I will be able to use CSS because it is only one string (unless CSS can be used to replace the substring).
One way I've tried is putting <br> and &nbsp; inside the text in the database and then using HttpUtility.HtmlDecode to format it, but I am not sure it is the right way.
Any suggestion and feedback is welcomed.

If you are getting an HTML-encoded string from the database, you just have to decode it with HtmlDecode; the <br> and &nbsp; markup stored in it will then render as line breaks and spacing.
Before that, check whether the string really is HTML-encoded or whether some other encoding has been used.
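If the database holds plain text rather than markup, another option is to encode first and add the formatting afterwards. The following is a minimal sketch under that assumption; the class name PlainTextFormatter and the choice of four non-breaking spaces per tab are illustrative, not something from the question.

```csharp
using System.Net;

public static class PlainTextFormatter
{
    // Encode first so the stored text cannot inject markup,
    // then add the formatting markup ourselves.
    public static string ToHtml(string text)
    {
        string encoded = WebUtility.HtmlEncode(text);
        return encoded
            .Replace("\r\n", "<br />")   // Windows line endings
            .Replace("\n", "<br />")     // bare line feeds
            .Replace("\t", "&nbsp;&nbsp;&nbsp;&nbsp;"); // tab width of four is an assumption
    }
}
```

Because the replacements run after encoding, the <br /> and &nbsp; that reach the page are the ones added here, and any angle brackets in the original text stay escaped.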

C# download HTML page to string asp.net

I'm trying to download an aspx page's HTML from another page, using the following code:
WebClient webClient = new WebClient();
String CompleteReport = webClient.DownloadString(new System.Uri(reportURL));
However, the HTML that is returned contains escape sequences, similar to the following:
"\r\n\r\n<!DOCTYPE html>\r\n\r\n<html xmlns=\"http://www.w3.org/1999/xhtml\">\r\n<head><meta charset=\"utf-8\"
What should I do to download the string without these escape sequences?
Thank You!
The string doesn't actually contain those sequences. It contains the characters that they represent (actual carriage return and line feed characters, and plain double quotes).
You are probably viewing the string in a debugger, and the debugger is displaying it in escaped form for you. If you dump it to a file and read it in Notepad, the sequences won't be there.
See also this answer. If you add ,nq to the variable name in the watch window, the escape sequences will go away.
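A small sketch that shows the difference between an escape sequence in source code and the characters actually held in the string. The class and method names are made up for the illustration.

```csharp
using System;

public static class EscapeDemo
{
    // "\r\n" in a regular literal is two characters: carriage return and line feed.
    public static int RegularLiteralLength() => "\r\n".Length;

    // In a verbatim literal the backslashes are kept, so this is four characters.
    public static int VerbatimLiteralLength() => @"\r\n".Length;

    // True only if the text contains a literal backslash, which is what a
    // downloaded page would need to contain for the escapes to be "real".
    public static bool ContainsBackslash(string text) => text.IndexOf('\\') >= 0;
}
```

Running ContainsBackslash on the downloaded string is one way to confirm that the backslashes only exist in the debugger's display.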

HTML Decode and Encode

I have tried to decode the HTML text that I have in the database in my MVC 3 Razor application.
The HTML text in the database is not encoded.
I tried HttpUtility.HtmlDecode and Server.HtmlDecode, but neither of them worked.
Finally I managed to make it work with Html.Raw(string).
Sample of non-working code:
@Server.HtmlDecode(item.ShortDescription)
@HttpUtility.HtmlDecode(item.ShortDescription)
Do you know why we cannot use HtmlDecode in my case?
I thought this would save someone else from searching for a few hours.
Decoding the text works just fine, but the result is automatically encoded again when it's put in the page using the @ syntax.
The Html.Raw method wraps the string in an HtmlString, which tells the Razor engine not to encode it when it's put in the page.
If you want to display the value as-is without any HTML encoding you could use the Html.Raw helper:
@Html.Raw(item.ShortDescription)
Be warned, though, that by doing this you are opening your site to XSS attacks, so you should be very careful about what HTML this ShortDescription property contains. If it is the user who enters it, you should absolutely ensure that it is safe. You could use the AntiXss library for this.
"Do you know why we can not use html.decode in my case !"
Because HtmlDecode returns a string, and when you feed a string to the @() Razor function it automatically HTML-encodes it again and ruins your previous efforts. That's why the Html.Raw helper exists.
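The effect can be reproduced outside a view. This is only a sketch: RenderAsPlainString is a hypothetical stand-in for the encoding Razor applies to a plain string written with @, not Razor's actual API.

```csharp
using System.Net;

public static class RazorEncodingDemo
{
    // Hypothetical stand-in for Razor's implicit encoding of a plain string.
    public static string RenderAsPlainString(string value) =>
        WebUtility.HtmlEncode(value);

    // Mirrors @HttpUtility.HtmlDecode(item.ShortDescription):
    // decode, then let the view encode the resulting string again.
    public static string DecodeThenRender(string stored)
    {
        string decoded = WebUtility.HtmlDecode(stored);
        return RenderAsPlainString(decoded);
    }
}
```

Whether the stored value is markup or already-encoded text, the output of DecodeThenRender is the encoded form, which is why the tags showed up as literal text until Html.Raw was used.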

Url encoding quotes and spaces

I have some query text that is being encoded with JavaScript, but I've encountered a use case where I might have to encode the same text on the server side, and the encoding that's happening is not the same. I need it to be the same. Here's an example.
I enter "I like food" into the search box and hit the search button. JavaScript encodes this as %22I%20like%20food%22
Let's say I get the same value as a string on a request object on the server side. It will look like this: "\"I like food\""
When I use HttpUtility.UrlEncode(value), the result is "%22I+like+food%22". If I use HttpUtility.UrlPathEncode(value), the result is "\"I%20like%20food\""
So UrlEncode is encoding my quotes but is using the + character for spaces. UrlPathEncode is encoding my spaces but is not encoding my escaped quotes.
I really need it to do both, otherwise the Search code completely borks on me (and I have no control over the search code).
Tips?
UrlPathEncode doesn't escape " because they don't need to be escaped in path components.
Uri.EscapeDataString should do what you want.
There are a few options available to you. The fastest might be to use UrlEncode and then do a string.Replace to swap the + characters for %20.
Something like
HttpUtility.UrlEncode(input).Replace("+", "%20");
WebUtility.UrlEncode(str)
This encodes the characters that need encoding using the %XX format. Note that it still writes a space as +, so it needs the same Replace("+", "%20") step to produce %20.
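A short sketch of the two suggestions above applied to the sample input from the question. The class and method names are illustrative; only Uri.EscapeDataString and WebUtility.UrlEncode are framework calls.

```csharp
using System;
using System.Net;

public static class QueryEncoding
{
    // Percent-encodes both the quotes and the spaces.
    public static string Escape(string value) =>
        Uri.EscapeDataString(value);

    // UrlEncode writes spaces as '+'; a literal '+' in the input becomes %2B,
    // so replacing the remaining '+' characters afterwards is safe.
    public static string EncodeThenReplace(string value) =>
        WebUtility.UrlEncode(value).Replace("+", "%20");
}
```

For the input "\"I like food\"" both methods return %22I%20like%20food%22. They can differ for other punctuation, so it is worth checking the characters the search code actually receives.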

HTMLencode HTMLdecode

I have a text area, and I want to store the text entered by the user in the database with HTML formatting such as paragraph breaks and numbered lists. I am using HtmlEncode and HtmlDecode for this.
Sample of my code is like this:
string str1 = Server.HtmlEncode(TextBox1.Text);
Response.Write(Server.HtmlDecode(str1));
If the user enters text with 2 paragraphs, str1 shows the characters \r\n\r\n between the paragraphs, but when it is written to the screen the 2nd paragraph is just appended to the 1st. Since I'm decoding it, why doesn't it print 2 paragraphs?
The simple solution would be to do:
string str1 = Server.HtmlEncode(TextBox1.Text).Replace("\r\n", "<br />");
This is assuming that you only care about getting the right <br /> tags in place. If you want a real formatter you will need a library like Aaronaught suggested.
That's not what HtmlEncode and HtmlDecode do. Not even close.
Those methods are for "escaping" HTML. < becomes &lt;, > becomes &gt;, and so on. You use these to escape user-entered input in order to avoid Cross-Site Scripting attacks and related issues.
If you want to be able to take plain-text input and transform it into HTML, consider a formatting tool like Markdown (I believe that Stack Overflow uses MarkdownSharp).
If all you want are line breaks, you can use text.Replace("\r\n", "<br/>"), but handling more complex structures like ordered lists is difficult, and there are already existing tools to handle it.
HTML doesn't recognize \r\n as a line break. Convert them to "p" or "br" tags.
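A minimal sketch of the "p" and "br" conversion, assuming a blank line separates paragraphs and a single newline is a line break within a paragraph. ParagraphFormatter is a made-up helper name, and the text is encoded before the tags are added so user input cannot inject markup.

```csharp
using System;
using System.Linq;
using System.Net;

public static class ParagraphFormatter
{
    public static string ToParagraphs(string text)
    {
        // Normalize Windows line endings so only "\n" has to be handled.
        string normalized = text.Replace("\r\n", "\n");

        var paragraphs = normalized
            .Split(new[] { "\n\n" }, StringSplitOptions.RemoveEmptyEntries)
            .Select(p => "<p>"
                + WebUtility.HtmlEncode(p.Trim()).Replace("\n", "<br />")
                + "</p>");

        return string.Concat(paragraphs);
    }
}
```

Numbered lists and other structures are not handled here; for those, a Markdown library as suggested above is the more practical route.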

string replacement in page created from template

I've got some aspx pages being created by the user from a template. Included is some string replacement (anything with ${fieldname}), so a portion of the template looks like this:
<%
string title = @"${title}";
%>
<title><%=HttpUtility.HtmlEncode(title) %></title>
When an aspx file is created from this template, the ${title} gets replaced by the value the user entered.
But obviously they can inject arbitrary HTML by just closing the double quote in their input string. How do I get around this? I feel like it should be obvious, but I can't figure a way around this.
I have no control over the template instantiating process -- I need to accept that as a given.
Can you store their values in another file(xml maybe) or in a database? That way their input is not compiled into your page. Then you just read the data into variables. Then all you have to worry about is html, which your html encode would take care of.
If they include a double quote in their string, that will not inject arbitrary HTML, but arbitrary code, which is even worse.
You can use a regex to filter the input string. I would use an inclusive regex rather than trying to exclude dangerous characters. Only allow A-Za-z0-9 and whitespace.
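A sketch of the inclusive filter, assuming the user's input can be checked before it is handed to the template process. TitleValidator is a hypothetical name and the pattern is just the character set described above.

```csharp
using System.Text.RegularExpressions;

public static class TitleValidator
{
    // Allow-list: letters, digits and whitespace only, for the whole string.
    private static readonly Regex Allowed = new Regex(@"\A[A-Za-z0-9\s]+\z");

    public static bool IsSafeTitle(string title) =>
        title != null && Allowed.IsMatch(title);
}
```

A double quote, angle bracket or semicolon makes IsSafeTitle return false, so input that would close the string literal in the template is rejected rather than repaired.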
Not sure I understand fully, but...
Try using a regex to strip HTML from the title instead of HTML-encoding it:
public string StripHTML(string text)
{
    return Regex.Replace(text, @"<(.|\n)*?>", string.Empty);
}
Is this possible?
<%
string title = Regex.Replace(@"${title}", @"<(.|\n)*?>", string.Empty);
%>
or
<title><%=HttpUtility.HtmlEncode(System.Text.RegularExpressions.Regex.Replace(title, @"<(.|\n)*?>", string.Empty)) %></title>
