I'm imagining that this will come down to personal preference but which way would you go?
StringBuilder markup = new StringBuilder();
foreach (SearchResult image in Search.GetImages(componentId))
{
markup.Append(String.Format("<div class=\"captionedImage\"><img src=\"{0}\" width=\"150\" alt=\"{1}\"/><p>{1}</p></div>", image.Resolutions[0].Uri, image.Caption));
}
LiteralMarkup.Text = markup.ToString();
Vs.
StringBuilder markup = new StringBuilder();
foreach (SearchResult image in Search.GetImages(componentId))
{
markup.Append(String.Format(#"<div class=""captionedImage""><img src=""{0}"" width=""150"" alt=""{1}""/><p>{1}</p></div>", image.Resolutions[0].Uri, image.Caption));
}
LiteralMarkup.Text = markup.ToString();
Or should I not be doing this at all and using the HtmlTextWriter class instead?
EDIT: Some really good suggestions here. We are on 2.0 framework so LINQ is not available
Another vote for "AppendFormat". Also, for the sake of the server code I might put up with single quotes here to avoid the need to escape anything:
StringBuilder markup = new StringBuilder();
foreach (SearchResult image in Search.GetImages(componentId))
{
markup.AppendFormat(
"<div class='captionedImage'><img src='{0}' width='150' alt='{1}'/><p>{1}</p></div>",
image.Resolutions[0].Uri, image.Caption
);
}
LiteralMarkup.Text = markup.ToString();
Finally, you may also want an additional check in there somewhere to prevent html/xss injection.
Another option is to encapsulate your image in a class:
public class CaptionedHtmlImage
{
public Uri src {get; set;};
public string Caption {get; set;}
CaptionedHtmlImage(Uri src, string Caption)
{
this.src = src;
this.Caption = Caption;
}
public override string ToString()
{
return String.Format(
"<div class='captionedImage'><img src='{0}' width='150' alt='{1}'/><p>{1}</p></div>"
src.ToString(), Caption
);
}
}
This has the advantage of making it easy to re-use and add features to the concept over time. To get real fancy you can turn that class into a usercontrol.
Me personally I would opt for the first option. Just to point out also, a quick code saving tip here. you can use AppendFormat for the StringBuilder.
EDIT: The HtmlTextWriter approach would give you a much more structured result, and if the amount of HTML to generate grew, then this would be an obvious choice. OR the use of an HTML file and using it as a template and maybe string replacements
ASP.NET User Control are another good choice for templating.
I'd prefer:
StringBuilder markup = new StringBuilder();
string template = "<div class=\"captionedImage\"><img src=\"{0}\" width=\"150\" alt=\"{1}\"/><p>{1}</p></div>";
foreach (SearchResult image in Search.GetImages(componentId))
{
markup.AppendFormat(template,image.Resolutions[0].Uri, image.Caption);
}
LiteralMarkup.Text = markup.ToString();
As mentioned in another answer, using AppendFormat().
Take a look at SharpTemplate for html templating, it makes it all a lot easier to read even for small amounts of HTML.
I would prefer doing this in aspx using the ListView Control which exactly is made for this purpose. So your code will remain readable and is seperated (no HTML in the C# Code). With your approach you will get no compilte time XHTML validation warnings.
I have made some helper classes to create html controls, so my code would look like this:
foreach (SearchResult image in Search.GetImages(componentId)) {
ContainerMarkup.Controls.Add(
Tag.Div.CssClass("captionedImage")
.AddChild(Tag.Image(image.Resolutions[0].Uri).Width(150).Alt(Image.Caption))
.AddChild(Tag.Paragraph.Text(Image.Caption)));
}
At first it might not look simpler, but it's easy to work with, creates controls that does proper html encoding of the values, and there is no string escaping issues.
Related
I need to do a user input validation, and I want it validated both in the client side and in the server side.
I have ang textbox that the user can write his comment on the product, now what I wanted to do is to validate if his comment doesn't have any injections like html or javascripts. So what I wanted to do, after the user clicks on submit
1.) Client Side: How will I execute a validation like if the user inputs this kinds of string
abcd // I will accept only abcd and remove the anchor tag but the abcd should appear as a link
<script type="text/javascript">alert(123);</script> // I will accept only alert(123);as the valid string
<b>abcd</b> // I will display abcd but it must appear bold
2.) Server side: Same situation with the client side, I will remove the tags of the injected script and html tags.
I am using sharepoint 2007, I'm not sure if there is a built-in function to do this kind of validation in sharepoint api or c# for the server side validation.
Note: I don't want to use RegEx for this or any third party software. I know many experts here can help me with this. Thank you so much!
While RegEx is probably your best bet, you can use this and modify to your liking:
public static string StripHtml(this string source)
{
string[] removeElements = new string[] { "a", "script" };
string _newString = source;
foreach (string removeElement in removeElements)
{
while (_newString.ToLower().Contains("<" + removeElement.ToLower()))
{
_newString = _newString.Substring(0, _newString.ToLower().IndexOf("<" + removeElement.ToLower())) + _newString.Substring(_newString.ToLower().IndexOf("</" + removeElement.ToLower() + ">") + removeElement.Length + 3);
}
}
return _newString;
}
You'll use string clean = txtInput.Text.StripHtml();
I am not sure about creating an validation for this. But you can programtically remove the tags using this function.
Use this function to remove the Html tage from the textbox value that user has input
public static string StripHtml(string html, bool allowHarmlessTags)
{
if (html == null || html == string.Empty)
return string.Empty;
if (allowHarmlessTags)
return System.Text.RegularExpressions.Regex.Replace(html, "", string.Empty);
return System.Text.RegularExpressions.Regex.Replace(html, "<[^>]*>", string.Empty);
}
If you want prevent javascript injection attacks just encode user input Server.HtmlEncode(message).
But if you need to clean some tags then Omar Al Zabir wrote good article Convert HTML to XHTML and Clean Unnecessary Tags and Attributes
// Encode the string input
StringBuilder sb = new StringBuilder(
HttpUtility.HtmlEncode(htmlInputTxt.Text));
// Selectively allow <b> and <i>
sb.Replace("<b>", "<b>");
sb.Replace("</b>", "");
sb.Replace("<i>", "<i>");
sb.Replace("</i>", "");
Response.Write(sb.ToString());
I also would like to recomand you check AntiSamy.NET project but I didn't try it by myself.
Good day
I have question about displaying html documents in a windows forms applications. App that I'm working on should display information from the
database in the html format. I will try to describe actions that I have taken (and which failed):
1) I tried to load "virtual" html page that exists only in memory and dynamically change it's parameters (webbMain is a WebBrowser control):
public static string CreateBookHtml()
{
StringBuilder sb = new StringBuilder();
//Declaration
sb.AppendLine(#"<?xml version=""1.0"" encoding=""utf-8""?>");
sb.AppendLine(#"<?xml-stylesheet type=""text/css"" href=""style.css""?>");
sb.AppendLine(#"<!DOCTYPE html PUBLIC ""-//W3C//DTD XHTML 1.1//EN""
""http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"">");
sb.AppendLine(#"<html xmlns=""http://www.w3.org/1999/xhtml"" xml:lang=""en"">");
//Head
sb.AppendLine(#"<head>");
sb.AppendLine(#"<title>Exemplary document</title>");
sb.AppendLine(#"<meta http-equiv=""Content-Type"" content=""application/xhtml+xml;
charset=utf-8""/ >");
sb.AppendLine(#"</head>");
//Body
sb.AppendLine(#"<body>");
sb.AppendLine(#"<p id=""paragraph"">Example.</p>");
sb.AppendLine(#"</body>");
sb.AppendLine(#"</html>");
return sb.ToString();
}
void LoadBrowser()
{
this.webbMain.Navigate("about:blank");
this.webbMain.DocumentText = CreateBookHtml();
HtmlDocument doc = this.webbMain.Document;
}
This failed, because doc.Body is null, and doc.getElementById("paragraph") returns null too. So I cannot change paragraph InnerText property.
Furthermore, this.webbMain.DocumentText is "\0"...
2) I tried to create html file in specified folder, load it to the WebBrowser and then change its parameters. Html is the same as created by
CreateBookHtml() method:
private void LoadBrowser()
{
this.webbMain.Navigate("HTML\\BookPage.html"));
HtmlDocument doc = this.webbMain.Document;
}
This time this.webbMain.DocumentText contains Html data read from the file, but doc.Body returns null again, and I still cannot take element using
getByElementId() method. Of course, when I have text, I would try regex to get specified fields, or maybe do other tricks to achieve a goal, but I wonder - is there simply way to mainipulate html? For me, ideal way would be to create HTML text in memory, load it into the WebBrowser control, and then dynamically change its parameters using IDs. Is it possible? Thanks for the answers in advance, best regards,
Paweł
I've worked some time ago with the WebControl and like you wanted to load a html from memory but have the same problem, body being null. After some investigation, I noticed that the Navigate and NavigateToString methods work asynchronously, so it needs a little time for the control to load the document, the document is not available right after the call to Navigate. So i did something like (wbChat is the WebBrowser control):
wbChat.NavigateToString("<html><body><div>first line</div></body><html>");
DoEvents();
where DoEvents() is implemeted as:
[SecurityPermissionAttribute(SecurityAction.Demand, Flags = SecurityPermissionFlag.UnmanagedCode)]
public void DoEvents()
{
DispatcherFrame frame = new DispatcherFrame();
Dispatcher.CurrentDispatcher.BeginInvoke(DispatcherPriority.Background,
new DispatcherOperationCallback(ExitFrame), frame);
Dispatcher.PushFrame(frame);
}
and it worked for me, after the DoEvents call, I could obtain a non-null body:
mshtml.IHTMLDocument2 doc2 = (mshtml.IHTMLDocument2)wbChat.Document;
mshtml.HTMLDivElement div = (mshtml.HTMLDivElement)doc2.createElement("div");
div.innerHTML = "some text";
mshtml.HTMLBodyClass body = (mshtml.HTMLBodyClass)doc2.body;
if (body != null)
{
body.appendChild((mshtml.IHTMLDOMNode)div);
body.scrollTop = body.scrollHeight;
}
else
Console.WriteLine("body is still null");
I don't know if this is the right way of doing this, but it fixed the problem for me, maybe it helps you too.
Later Edit:
public object ExitFrame(object f)
{
((DispatcherFrame)f).Continue = false;
return null;
}
The DoEvents method is necessary on WPF. For System.Windows.Forms one can use Application.DoEvents().
Another way to do the same thing is:
webBrowser1.DocumentText = "<html><body>blabla<hr/>yadayada</body></html>";
this works without any extra initialization
I have a need to verify a specific hyperlink exists on a given web page. I know how to download the source HTML. What I need help with is figuring out if a "target" url exists as a hyperlink in the "source" web page.
Here is a little console program to demonstrate the problem:
public static void Main()
{
var sourceUrl = "http://developer.yahoo.com/search/web/V1/webSearch.html";
var targetUrl = "http://developer.yahoo.com/ypatterns/";
Console.WriteLine("Source contains link to target? Answer = {0}",
SourceContainsLinkToTarget(
sourceUrl,
targetUrl));
Console.ReadKey();
}
private static bool SourceContainsLinkToTarget(string sourceUrl, string targetUrl)
{
string content;
using (var wc = new WebClient())
content = wc.DownloadString(sourceUrl);
return content.Contains(targetUrl); // Need to ensure this is in a <href> tag!
}
Notice the comment on the last line. I can see if the target URL exists in the HTML of the source URL, but I need to verify that URL is inside of a <href/> tag. This way I can validate it's actually a hyperlink, instead of just text.
I'm hoping someone will have a kick-ass regular expression or something I can use.
Thanks!
Here is the solution using the HtmlAgilityPack:
private static bool SourceContainsLinkToTarget(string sourceUrl, string targetUrl)
{
var doc = (new HtmlWeb()).Load(sourceUrl);
foreach (var link in doc.DocumentNode.SelectNodes("//a[#href]"))
if (link.GetAttributeValue("href",
string.Empty).Equals(targetUrl))
return true;
return false;
}
The best way is to use a web scraping library with a built in DOM parser, which will build an object tree out of the HTML and let you explore it programmatically for the link entity you are looking for. There are many available - for example Beautiful Soup (python) or scrapi (ruby) or Mechanize (perl). For .net, try the HTML agility pack. http://htmlagilitypack.codeplex.com/
i wanted to create a web server while on the process.....
i am not being able to create a dynamic html which could take link from my c# console application...
for example i have a code which shows files from the system.. for example "c:\tike\a.jpeg" now i wanted to make that particular link a a href link in my html page...
any help would be appreciated.....
thank you..
(to sum up ... i want to create dynamic html page which takes value from c# console application.)
Ignoring virtual paths etc. for now, here is a simple example to get you started:
StringBuilder sb = new StringBuilder();
sb.AppendLine("<html>");
sb.AppendLine("<head>");
sb.AppendLine("<title>Index of c:\\dir</title>");
sb.AppendLine("</head>");
sb.AppendLine("<body>");
sb.AppendLine("<ul>");
string[] filePaths = Directory.GetFiles(#"c:\dir");
for (int i = 0; i < filePaths.Length; ++i) {
string name = Path.GetFileName(filePaths[i]);
sb.AppendLine(string.Format("<li>{1}</li>",
HttpUtility.HtmlEncode(HttpUtility.UrlEncode(name)),
HttpUtility.HtmlEncode(name)));
}
sb.AppendLine("</ul>");
sb.AppendLine("</body>");
sb.AppendLine("</html>");
string result = sb.ToString();
result contains a string that you can send as body of an HTTP response to the web-browser.
(Note: I typed the code right into the answer box, no idea if it compiles as-is.)
I have an aspx page that contains regular html, some uicomponents, and multiple tokens of the form {tokenname} .
When the page loads, I want to parse the page content and replace these tokens with the correct content. The idea is that there will be multiple template pages using the same codebehind.
I've no trouble parsing the string data itself, (see named string formatting, replace tokens in template) my trouble lies in when to read, and how to write the data back to the page...
What's the best way for me to rewrite the page content? I've been using a streamreader, and the replacing the page with Response.Write, but this is no good - a page containing other .net components does not render correctly.
Any suggestions would be greatly appreciated!
Take a look at System.Web.UI.Adapters.PageAdapter method TransformText - generally it is used for multi device support, but you can postprocess your page with this.
I'm not sure if I'm answering your question, but...
If you can change your notation from
{tokenname}
to something like
<%$ ZeusExpression:tokenname %>
you could consider creating your System.Web.Compilation.ExpressionBuilder.
After reading your comment...
There are other ways of getting access to the current page using ExpressionBuilder: just... create an expression. ;-)
Changing just a bit the sample from MSDN and supposing the code of your pages contain a method like this
public object GetData(string token);
you could implement something like this
public override CodeExpression GetCodeExpression(BoundPropertyEntry entry, object parsedData, ExpressionBuilderContext context)
{
Type type1 = entry.DeclaringType;
PropertyDescriptor descriptor1 = TypeDescriptor.GetProperties(type1)[entry.PropertyInfo.Name];
CodeExpression[] expressionArray1 = new CodeExpression[1];
expressionArray1[0] = new CodePrimitiveExpression(entry.Expression.Trim());
return new CodeCastExpression(
descriptor1.PropertyType,
new CodeMethodInvokeExpression(
new CodeThisReferenceExpression(),
"GetData",
expressionArray1));
}
This replaces your placeholder with a call like this
(string)this.GetData("tokenname");
Of course you can elaborate much more on this, perhaps using a "utility method" to simplify and "protect" access to data (access to properties, no special method involved, error handling, etc.).
Something that replaces instead with (e.g.)
(string)Utilities.GetData(this, "tokenname");
Hope this helps.
Many thanks to those that contributed to this question, however I ended up using a different solution -
Overriding the render function as per this page, except I parsed the page content for multiple different tags using regular expressions.
protected override void Render(HtmlTextWriter writer)
{
if (!Page.IsPostBack)
{
using (System.IO.MemoryStream stream = new System.IO.MemoryStream())
{
using (System.IO.StreamWriter streamWriter = new System.IO.StreamWriter(stream))
{
HtmlTextWriter htmlWriter = new HtmlTextWriter(streamWriter);
base.Render(htmlWriter);
htmlWriter.Flush();
stream.Position = 0;
using (System.IO.StreamReader oReader = new System.IO.StreamReader(stream))
{
string pageContent = oReader.ReadToEnd();
pageContent = ParseTagsFromPage(pageContent);
writer.Write(pageContent);
oReader.Close();
}
}
}
}
else
{
base.Render(writer);
}
}
Here's the regex tag parser
private string ParseTagsFromPage(string pageContent)
{
string regexPattern = "{zeus:(.*?)}"; //matches {zeus:anytagname}
string tagName = "";
string fieldName = "";
string replacement = "";
MatchCollection tagMatches = Regex.Matches(pageContent, regexPattern);
foreach (Match match in tagMatches)
{
tagName = match.ToString();
fieldName = tagName.Replace("{zeus:", "").Replace("}", "");
//get data based on my found field name, using some other function call
replacement = GetFieldValue(fieldName);
pageContent = pageContent.Replace(tagName, replacement);
}
return pageContent;
}
Seems to work quite well, as within the GetFieldValue function you can use your field name in any way you wish.