I have been using the .NET WebBrowser control in edit mode as part of an interface for end users to create sections of HTML content for insertion into various websites. They have had a very cut-down list of tags available, such as <p>, <br>, <a href>, <strong>, <ul> and <li>. They could not apply any formatting on top of the tags, as that was determined by the particular web page's CSS. This system has been working well up until now.
Unfortunately, I now need XHTML to go into a larger XML document for aggregation by various other websites. The WebBrowser control's main problem seems to be lists, where it produces:
<UL><LI>Item1
<LI>item2
<LI>item3</LI></UL>
Is there a good converter library to fix this or could I force the WebBrowser control to create XHTML? I have tried the HTMLAgilityPack but it converted to XHTML by doing something like:
<UL><LI>Item1
<LI>item2
<LI>item3</LI></LI></LI></UL>
I don't think this is closed appropriately, as surely the closing tags should be at the end of each item, although it would pass XHTML validation. If it is OK, will I end up with rendering issues in certain browsers when the XML is eventually put into whatever website?
Try this.
http://tidy.sourceforge.net/
You must be using Internet Explorer, which is the only browser I can think of that doesn't close list-item tags in a content-editable section. Also, the tags ought to be lower case, which is the other give-away.
It is worth checking that you are sending the correct document-type to the browser as this may solve your problem (i.e. make sure the editable bit is definitely an XHTML page). Other than this, you could manage it by having a plain-text editable area with some custom(ish) mark-up and a preview area below. Erm... a bit like Stack Overflow. That way, you can create the exact mark-up you want, rather than relying on what a browser generates.
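If pulling in Tidy is not an option, the flat lists shown in the question may be simple enough to repair with a small post-processing pass. The following is only a sketch under that assumption: the class and method names (ListItemCloser, CloseListItems) are made up for illustration, it lowercases tag names, closes an open <li> before the next <li> or before the end of the list, and rewrites <br> as <br />. It does not touch attributes, and it does not handle other unclosed or void elements.

```csharp
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class ListItemCloser
{
    // Matches an opening or closing tag: group 1 = "/", group 2 = name, group 3 = the rest.
    private static readonly Regex Tag =
        new Regex(@"<(/?)([A-Za-z][A-Za-z0-9]*)([^>]*)>");

    public static string CloseListItems(string html)
    {
        var output = new StringBuilder();
        var open = new Stack<string>();   // names of the elements currently open
        int pos = 0;

        foreach (Match m in Tag.Matches(html))
        {
            output.Append(html, pos, m.Index - pos);   // copy the text before this tag
            pos = m.Index + m.Length;

            bool closing = m.Groups[1].Value == "/";
            string name = m.Groups[2].Value.ToLowerInvariant();
            string rest = m.Groups[3].Value;

            if (closing)
            {
                // The list ends while an item is still open: close the item first.
                if ((name == "ul" || name == "ol") && open.Count > 0 && open.Peek() == "li")
                {
                    output.Append("</li>");
                    open.Pop();
                }
                output.Append("</").Append(name).Append(">");
                if (open.Count > 0 && open.Peek() == name) open.Pop();
            }
            else if (name == "br")
            {
                output.Append("<br />");   // void element, never pushed on the stack
            }
            else
            {
                // A new item starts while the previous one is still open.
                if (name == "li" && open.Count > 0 && open.Peek() == "li")
                {
                    output.Append("</li>");
                    open.Pop();
                }
                output.Append("<").Append(name).Append(rest).Append(">");
                open.Push(name);
            }
        }

        output.Append(html, pos, html.Length - pos);   // trailing text after the last tag
        return output.ToString();
    }
}
```

Calling ListItemCloser.CloseListItems on the WebBrowser output and then loading the result with System.Xml.Linq.XDocument.Parse is a quick way to confirm that it is at least well-formed before it goes into the larger XML document.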
Related
I have a contenteditable div in which the user enters data. When they enter a line break, each browser stores the data differently. When I export this data to Word using HtmlToOpenXml, it adds a blank line to the content, and I want to avoid that so the HTML page and the Word document look the same.
One option is to replace the tags <br>, <div>, <p> with blank and then replace the </div> and </p> with <br/> in the C# code using Regex. But I do not know all the markup that different browsers produce in a contenteditable div, so this implementation may not cover every case.
I would like to know what is the best way to address this or is there any open source tool/dll that helps me with this issue?
e.g. ContentEditable div actual data in browsers looks like below
Chrome -
line1<div>line2</div><div>line3</div>
IE Edge-
<div>line1</div><div>line22</div><div>line3<br></div>
Firefox - I read it uses <p> </p> instead of <div> </div>
Safari - ????
A Solution I found:
You could use Regex, which I highly recommend in C# for parsing information.
Then, based on the formatting, you could narrow down which browser it is and move on to parsing its output and working out what its markup means universally. This will not be easy, but nothing cross-platform ever truly is. I would give an example of how this could be done, but Regex in all honesty takes a good amount of work, and it would take quite a bit of code to make an example that shows how to parse the output and find out what the browser is.
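A different route from detecting the browser is to normalise every variant to one canonical form. The sketch below covers only the Chrome and Edge shapes quoted in the question, plus <p> in place of <div>. The names LineBreakNormalizer and ToBrSeparated are hypothetical, and it assumes the input contains no literal newline characters and no nested block elements.

```csharp
using System.Text.RegularExpressions;

public static class LineBreakNormalizer
{
    private const RegexOptions Opts = RegexOptions.IgnoreCase;

    // Turns <div>/<p>-wrapped lines into text separated by <br/>.
    public static string ToBrSeparated(string html)
    {
        string s = html.Trim();

        // 1. Drop a <br> that sits directly before a closing block tag (Edge's last line).
        s = Regex.Replace(s, @"<br\s*/?>\s*(?=</(?:div|p)>)", "", Opts);

        // 2. A close immediately followed by an open is one line boundary.
        s = Regex.Replace(s, @"</(?:div|p)>\s*<(?:div|p)\b[^>]*>", "\n", Opts);

        // 3. An opening block tag at the very start adds no break.
        s = Regex.Replace(s, @"^<(?:div|p)\b[^>]*>", "", Opts);

        // 4. Any other opening block tag follows bare text (Chrome's first line).
        s = Regex.Replace(s, @"<(?:div|p)\b[^>]*>", "\n", Opts);

        // 5. Remaining closing block tags carry no extra break.
        s = Regex.Replace(s, @"</(?:div|p)>", "", Opts);

        // 6. Explicit <br> tags that are left are real line breaks.
        s = Regex.Replace(s, @"<br\s*/?>", "\n", Opts);

        return s.Replace("\n", "<br/>");
    }
}
```

Both samples from the question should come out with the same structure, three lines joined by <br/>. That single form is what would then be handed to HtmlToOpenXml. Safari and Firefox output would need to be captured and checked against the same rules before relying on this.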
I have been converting HTML to PDF using the Syncfusion.HtmlConverter.HtmlToPdfConverter class, which produces great results as long as you use the WebKit rendering engine, but I have not been able to get it to honor page breaks properly. Syncfusion documents an older class, HtmlConverter, and suggests using <p style="page-break-before: always;">, which requires you to set the AutoDetectPageBreak property to true.
The problem is that the newer class does not contain this property. Does anyone know the proper way to enforce page breaks using HtmlToPdfConverter?
As it turns out, the documentation is almost correct. If you pass <p style="page-break-before: always;"> as they suggest it will page break, but add a huge space after the break. However if you use:
<p style="page-break-before: always;"></p>
It honors the break properly (and automatically) without the need for setting any properties on the conversion object. This is as of version 14.2450.
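If the HTML is assembled in code before conversion, the empty paragraph can be inserted between sections in one place. This is a minimal sketch; PageBreaks and JoinSections are hypothetical names, and the call to the converter itself is left out.

```csharp
using System.Collections.Generic;

public static class PageBreaks
{
    // The empty paragraph from the answer above; note the explicit closing tag.
    public const string Marker = "<p style=\"page-break-before: always;\"></p>";

    // Joins HTML fragments so that each fragment after the first starts on a new page.
    public static string JoinSections(IEnumerable<string> sections)
    {
        return string.Join(Marker, sections);
    }
}
```

The resulting string is what would be passed to the converter in place of hand-written markup.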
I have a page that does not have runat="server" set in the <head/> section. I do not have access to modify any of the code in the page.
This page contains a user control which I do have access to. Can I add a <meta/> tag to the head section of the page from the user control? It needs to be server-side, so a JavaScript solution won't work.
One option is to create a Response Filter, and then modify the output before it's sent to the user.
https://web.archive.org/web/20211029043851/https://www.4guysfromrolla.com/articles/120308-1.aspx
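As a rough sketch of what such a filter could look like: the class below buffers everything written to it and, when it is closed, inserts a piece of markup directly after the opening <head> tag before passing the page on. HeadInjectionFilter is a made-up name, not something from the linked article. The sketch assumes the filter is closed once at the end of the response, that the response is uncompressed HTML, and that holding the whole page in memory is acceptable.

```csharp
using System;
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public class HeadInjectionFilter : Stream
{
    private readonly Stream inner;          // the stream the response would normally go to
    private readonly Encoding encoding;     // encoding of the response text
    private readonly string markup;         // markup to insert after <head>
    private readonly MemoryStream buffer = new MemoryStream();
    private bool closed;

    public HeadInjectionFilter(Stream inner, Encoding encoding, string markup)
    {
        this.inner = inner;
        this.encoding = encoding;
        this.markup = markup;
    }

    // Collect the output instead of forwarding it immediately.
    public override void Write(byte[] data, int offset, int count)
    {
        buffer.Write(data, offset, count);
    }

    // Nothing is forwarded until Close, so Flush is intentionally a no-op.
    public override void Flush() { }

    public override void Close()
    {
        if (!closed)
        {
            closed = true;
            string html = encoding.GetString(buffer.ToArray());

            // Find the opening <head ...> tag (but not <header>) and insert after it.
            Match head = Regex.Match(html, @"<head(\s[^>]*)?>", RegexOptions.IgnoreCase);
            if (head.Success)
                html = html.Insert(head.Index + head.Length, markup);

            byte[] bytes = encoding.GetBytes(html);
            inner.Write(bytes, 0, bytes.Length);
            inner.Flush();
        }
        base.Close();
    }

    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override int Read(byte[] b, int o, int c) => throw new NotSupportedException();
    public override long Seek(long o, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long v) => throw new NotSupportedException();
}
```

From the user control, it would be attached early in the page lifecycle with something like Response.Filter = new HeadInjectionFilter(Response.Filter, Response.ContentEncoding, "<meta name=\"robots\" content=\"noindex\" />"); the meta tag there is only a placeholder.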
You can parse the text in
(this.Page.Controls[0] as LiteralControl).Text
to see where the string <head> starts, and insert whatever text you need in there thus injecting your own code into the page header without it being marked with runat="server".
Please be aware, though, that this is a pretty hacky way of getting your code where it most likely shouldn't be (otherwise the <head> element would have been marked as runat="server" so you could access it normally). It will also break if the head element is later changed to be an ASP.NET control. It might not work with master pages either; there you would have to walk up the control tree looking for the topmost literal element.
We have used the RedCloth and BlueCloth wiki renderers with Ruby; basically you can do something like this...
html = RedCloth.new(wiki_content).to_html
and poof, you get back HTML.
Is there something out there for C#/.NET ?
Try http://wikiplex.codeplex.com/
There are some wiki rendering engines but the names escape me right now. Perhaps check out some of these open-source options? I've previously reviewed MindTouch from that list for an application and it was quite rich, but it did much more than I needed it to do.
If you just need something to turn text into HTML content, I use Halide, which lets people type in a textarea and then HTML-ifies links, removes dangerous content, adds <p></p> and <br />, etc. Very simple, but no built-in formatting options.
SO uses a custom version of Markdown for their text editor and HTML content rendering. Search Google for Markdown.NET for a number of ports.
I'm trying to inject some CSS that accompanies some other HTML into a C# managed WebBrowser control. I am trying to do this via the underlying MSHTML (DomDocument property) control, as this code is serving as a prototype of sorts for a full IE8 BHO.
The problem is, while I can inject HTML (via mydomdocument.body.insertAdjacentHTML) and Javascript (via mydomdocument.parentWindow.execScript), it is flat-out rejecting my CSS code.
If I compare the string containing the HTML I want to insert with the destination page source after injection, the MSHTML's source will literally contain everything except for the <style> element and its underlying source.
The CSS passes W3C validation for CSS 2.1. It doesn't do anything too tricky, with the exception that some background-image properties have the image directly embedded into the CSS (e.g. background-image: url("data:image/png;base64 ...), and commenting out those lines doesn't change the result.
More strangely (and I am not sure if this is relevant), I was having no problems with this last week. I came back to it this week and, after switching around some of the code that handles the to-be-injected HTML before actual injection, it no longer worked. Naturally I thought that one of my changes might be the problem, but after commenting all that logic out and feeding it a straight string, the HTML is still appearing unformatted.
At the moment I'm injecting into the <body> tag, though I've attempted to inject into <head> and that's met with similar results.
Thanks in advance for your help!
tom
Ended up solving this myself:
mshtml.HTMLDocument test = (mshtml.HTMLDocument)webBrowser1.Document.DomDocument;
// Inject CSS
if (test.styleSheets.length < 31) { // createStyleSheet throws "Invalid Argument" if >31 stylesheets on page
    // createStyleSheet adds the new sheet to the document and returns a live reference to it
    mshtml.IHTMLStyleSheet css = (mshtml.IHTMLStyleSheet)test.createStyleSheet("", 0);
    css.cssText = myDataClass.returnInjectionCSS(); // String containing CSS to inject into the page
    // CSS should now affect the page
} else {
    System.Console.WriteLine("Could not inject CSS due to styleSheets.length >= 31");
    return;
}
What I didn't realize is that createStyleSheet returns a reference that is still 'live' in the document's DOM, so you don't need to append the created stylesheet back to its parent. I ended up figuring this out by studying dynamic CSS code for JavaScript, as the implementations are pretty much identical.