Create PDF from HTML using TheArtOfDev.HtmlRenderer.PdfSharp - c#

I need to convert a well-formatted HTML string to a PDF document.
I found this DLL, which should do what I need, but it doesn't preserve the formatting.
That's the HTML code I'm trying to convert; viewing it in a browser works fine (I've used Bootstrap CSS, correctly referenced via CDN).
But once converted to PDF, this is the result:
And that's the code I'm using to convert it:
string html = "";
if (File.Exists(pathIN))
{
    html = File.ReadAllText(pathIN);
}
// GeneratePdf returns the document, so no separate PdfDocument instance is needed.
PdfDocument pdf = PdfGenerator.GeneratePdf(html, PageSize.A4, 60);
pdf.Save(pathOUT);
Does anyone have any suggestion?

I also had issues with this when using HtmlRenderer/PdfSharp with Bootstrap controlling the layout.
Although it goes against the grain, I resorted to using tables for the layout. Given that the destination (PDF) was obviously fixed width, being responsive was not a requirement.
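A minimal sketch of that table-based approach, assuming the same PdfGenerator.GeneratePdf call shown in the question. The class name, column widths, and cell text are placeholders for illustration, not taken from the original post.

```csharp
using PdfSharp;
using PdfSharp.Pdf;
using TheArtOfDev.HtmlRenderer.PdfSharp;

public static class TableLayoutPdf
{
    public static void Save(string pathOut)
    {
        // Two-column layout built with a fixed-width table instead of Bootstrap's grid.
        // The widths and cell text are placeholder values.
        string html =
            "<table style=\"width: 400px; border-collapse: collapse;\">" +
            "<tr>" +
            "<td style=\"width: 200px; vertical-align: top;\">Left column</td>" +
            "<td style=\"width: 200px; vertical-align: top;\">Right column</td>" +
            "</tr>" +
            "</table>";

        // Same call as in the question: A4 page with a margin of 60.
        PdfDocument pdf = PdfGenerator.GeneratePdf(html, PageSize.A4, 60);
        pdf.Save(pathOut);
    }
}
```

Keeping the table narrower than the printable page width should help avoid content being cut off at the right edge.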

Try using https://wkhtmltopdf.org; it works well for Bootstrap pages.

I know it's a little late, but this may help someone.
The problem with Bootstrap is that it aligns columns with float: left, and PdfSharp cannot read this property. Instead, use display: inline-block and define the width in pixels.

To save other people wasted effort: this doesn't work for Bootstrap layouts such as tables, even when using display: inline-block and defining the width. The right side of the table is always trimmed; in my case the content did not fit the Letter page size.

Related

Devexpress RichEditDocumentServer HtmlText property not properly formatting CSS (width, padding, margin) .NET Framework 4.7.2

I'm using DevExpress RichEditDocumentServer and its .HtmlText property to convert HTML to a PDF document. The base HTML and some styles are applied, but styles such as width, padding, and margins are not. I have tried putting the styles inline in the HTML tags and also using a separate style tag with classes.
Are there adjustments I can make to the DevExpress page itself for margins so the HTML will start at the edge of the PDF?
To set up page margins, you can use the Section.Margins property of a document:
document.Sections[0].Margins.Top = 0;
document.Sections[0].Margins.Bottom = 0;
document.Sections[0].Margins.Left = 0.2f;
document.Sections[0].Margins.Right = 0.2f;
Full example is shown on DevExpress documentation page: https://docs.devexpress.com/OfficeFileAPI/402993/word-processing-document-api/html-import-and-export?p=netstandard#example-1-convert-html-to-pdf
PS: HTML support in RichEditDocumentServer appears to be very limited. All supported/unsupported tags with attributes are listed here: https://docs.devexpress.com/OfficeFileAPI/15684/word-processing-document-api/html-import-and-export/html-support-limitations

SautinSoft PDF Focus .Net Converter text issue

So I've been trying to use SautinSoft software to convert a pdf document to a docx document. However, whenever I run this code my word document ends up having squished text. I've attached the images below, any idea what is going on?
SautinSoft.PdfFocus f = new SautinSoft.PdfFocus();
f.OpenPdf(source);
if (f.PageCount > 0)
{
    string path = Path.ChangeExtension(source, ".docx");
    f.WordOptions.Format = SautinSoft.PdfFocus.CWordOptions.eWordDocument.Docx;
    f.ToWord(path);
}
This is the docx file after conversion. The image rendered fine, but the text is all squished for some reason. I'm also running on macOS (if that makes a difference). Thanks to anyone who can help!
I found the answer. You have to make sure to set
f.WordOptions.KeepCharScaleAndSpacing = false;
Because by default, the converter will scale the font widths based on the PDF fonts rather than your default word doc fonts.
Just install the fonts used in the PDF into "C:/Windows/Fonts" and reboot your PC.
Then convert your PDF to Word again.
If that doesn't work for you, try another way: specify the path to your font folder for PDF Focus, like this:
f.WordOptions.Fonts = "D:/Fonts";

HtmlRenderer C# not showing table border in Rendered Image

In my C# code I am rendering a JPEG image of an HTML page using a string variable (which holds my HTML code). There's a table in that HTML whose borders are not being rendered in the JPEG image.
I am using the following code to generate the image:
string sHtml = m_Html; //m_Html contains the html code
Image img = HtmlRender.RenderToImage(sHtml);
After thorough searching, I understood that HtmlRenderer renders the string we pass (as a parameter) into an HTML page and then takes a snapshot. The rendering engine HtmlRenderer uses is not very sophisticated, and it does not support the latest or more complex CSS features. So if you are facing this issue, use simple HTML/CSS.
If you still cannot solve the issue, use a different library, 'NReco'. NReco is open source if you just use it, licensed if you want to modify it. NReco is better than HTML Renderer.

C# - How can I cut a string at its end to fit in a div?

I'm making a list of recent news. So, it will show something like this:
- Take a look at the new Volks...
- John Doe is looking for a jo...
- Microsoft is launching the n...
So, the list above only shows the title of each news item, and each title is limited to 25 characters. But this is not working well... for example, if you type 25 M's, it will explode my div.
I've been told that there is a way to calculate the length of the string and make it fit in a div automatically.
Does anyone know how to do it?
thanks!!
"text-overflow: ellipsis" is what you want but not everybody supports it. More info here...
I think what you are talking about is using the System.Drawing.Graphics class's MeasureString() method.
However, this requires making a Graphics object which matches the font characteristics of your web page. But, your server process shouldn't know anything about the style elements of the web page, which should be handled by the CSS sheet.
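For reference, a hedged sketch of what the MeasureString approach could look like. The TitleFitter class and its Fit method are hypothetical names, and the font passed in is an assumption about what the page uses, which is exactly the coupling described above.

```csharp
using System.Drawing;

public static class TitleFitter
{
    // Trims text until it, plus an ellipsis, fits within maxWidth when measured in the given font.
    // maxWidth is in the Graphics object's page units (assumed to be pixels for a bitmap-backed Graphics).
    public static string Fit(string text, Font font, float maxWidth)
    {
        using (var bitmap = new Bitmap(1, 1))
        using (Graphics graphics = Graphics.FromImage(bitmap))
        {
            if (graphics.MeasureString(text, font).Width <= maxWidth)
                return text;

            const string ellipsis = "...";
            // Drop one character at a time until the shortened title fits.
            for (int length = text.Length - 1; length > 0; length--)
            {
                string candidate = text.Substring(0, length) + ellipsis;
                if (graphics.MeasureString(candidate, font).Width <= maxWidth)
                    return candidate;
            }
            return ellipsis;
        }
    }
}
```

A call might look like TitleFitter.Fit(title, new Font("Arial", 10f), 166f). The result is only as accurate as the match between the server's font and the one the browser actually renders.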
I think you want to use CSS for this.
word-wrap:break-word;
should do it
One very simple way to prevent "exploding the div" is to use a CSS style to set the overflow of the div to scroll or hide the extra text instead of stretching to accommodate it.
I don't think there is an easy way to do this that works with all browsers and fonts.
The best way is just making sure your layout doesn't break if someone enters 25*m.
A useful thing to do is to split words that are longer than X letters.
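A minimal sketch of that word-splitting idea, done on the server before the title is written to the page. WordSplitter and BreakLongWords are hypothetical names, and inserting a plain space as the break point is an assumption; other separators would work the same way.

```csharp
using System;
using System.Text;

public static class WordSplitter
{
    // Inserts a space after every maxLength consecutive non-whitespace characters,
    // so the browser always has a place to wrap the line.
    public static string BreakLongWords(string text, int maxLength)
    {
        if (maxLength < 1)
            throw new ArgumentOutOfRangeException(nameof(maxLength));

        var result = new StringBuilder(text.Length + text.Length / maxLength);
        int runLength = 0; // length of the current unbroken run of non-whitespace characters
        foreach (char c in text)
        {
            if (char.IsWhiteSpace(c))
            {
                runLength = 0;
            }
            else if (runLength == maxLength)
            {
                result.Append(' ');
                runLength = 0;
            }

            result.Append(c);
            if (!char.IsWhiteSpace(c))
                runLength++;
        }
        return result.ToString();
    }
}
```

For example, WordSplitter.BreakLongWords("MMMMMMMMMM", 4) returns "MMMM MMMM MM".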
The word-wrap CSS property doesn't work that well in all browsers.
This is not really a server-side problem, as the server shouldn't know what fonts people are using. You can do it using Ajax: post the font to the server, calculate the width (as James Curran mentioned), and return the right strings. However, the server may not have the same fonts installed, and you would have to calculate padding and margins on the server side.
I can think of several options on the client side:
Wrap every line with a span. A span expands automatically to the width of the line. Using jQuery or your favorite JavaScript you can remove characters until the width is OK. (You can do a sort of binary search, where at every stage you add the ellipsis and check the width.)
Easy - Wrap every line with a fixed-width div and set it overflow:hidden, and add the ellipsis after the div. This will cut through letters though, and when you get a short text it'll still show the ellipsis.
Too easy - Use a fixed width font (they're mostly ugly).
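The binary search mentioned in the first option is independent of where the width is measured. A sketch in C#, assuming a hypothetical measure delegate that stands in for whatever reports the rendered width (the span's width in the browser, for instance); EllipsisSearch and Truncate are illustrative names.

```csharp
using System;

public static class EllipsisSearch
{
    // Returns the longest prefix of text that, with "..." appended, satisfies measure(candidate) <= maxWidth.
    // Assumes measure never decreases as a string gets longer, and that the ellipsis alone fits.
    public static string Truncate(string text, double maxWidth, Func<string, double> measure)
    {
        if (measure(text) <= maxWidth)
            return text;

        const string ellipsis = "...";
        int low = 0;                // longest prefix length known (assumed) to fit
        int high = text.Length - 1; // longest prefix length that might still fit
        while (low < high)
        {
            int mid = (low + high + 1) / 2;
            if (measure(text.Substring(0, mid) + ellipsis) <= maxWidth)
                low = mid;      // fits: try a longer prefix
            else
                high = mid - 1; // too wide: try a shorter prefix
        }
        return text.Substring(0, low) + ellipsis;
    }
}
```

With a character-count measure, EllipsisSearch.Truncate("abcdefghij", 6, s => s.Length) returns "abc...". Each lookup costs one measurement, so a title of n characters needs about log2(n) measurements instead of n.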
As others have mentioned, you can measure strings in thick-client applications using System.Drawing.Graphics.MeasureString, but since you mention you want to fit the text in an HTML div tag, it would be preferable to let the browser handle the user interface using CSS.
<html>
<head>
<title>C# - How can I cut a string at its end to fit in a div? </title>
<style type="text/css">
.ellipsis li
{
display: block;
white-space: nowrap;
overflow: hidden;
text-overflow: ellipsis;
width: 166px;
}
</style>
</head>
<body>
<ul class="ellipsis">
<li>Take a look at the new Volksxxxxx</li>
<li>John Doe is looking for a joxxxxx</li>
<li>Microsoft is launching the nxxxxx</li>
</ul>
</body>
</html>
I used the unordered list tag (UL) instead of a div since your sample list begins with a bullet character. Similar CSS would apply to DIV tags. And although all browsers can be made to clip the content, not all browsers support the non-standard text-overflow: ellipsis style.

Render HTML as an Image

I'm generating a coupon based on dynamic input and a cropped image, and I'm currently displaying it using HTML and CSS. Printing has become a problem because backgrounds disappear when printing, among other issues. I think the best solution would be to generate an image from the HTML, or to set up some kind of template that takes in strings and an image, uses the image as a background, and puts the coupon information on top.
Is there anything that does this already?
This is for an ASP.NET 3.5 C# website!
Thanks in advance.
edit: It'd be great if the output could be based on the HTML input, as the coupon is designed by manipulating the DOM using jQuery and dragging stuff around, it all works fine, it's just when it comes to the printing (to paper) it has z-indexing issues.
What you can do is create an aspx page that changes the response type to the format you want and then writes the image into the response stream. I created a barcode generator that does a similar thing. Excluding all the formalities of generating the image, your Page_Load will look something like this:
string strInputParameter = Request.Params["MagicParm"];
// Magic code goes here to generate your bitmap image.
// Bitmap has no parameterless constructor; the 300x100 size is only a placeholder.
Bitmap FinalBitmap = new Bitmap(300, 100);
using (MemoryStream msStream = new MemoryStream())
{
    // Save to a memory stream first, then copy it to the response.
    FinalBitmap.Save(msStream, ImageFormat.Png);
    Response.Clear();
    Response.ContentType = "image/png";
    msStream.WriteTo(Response.OutputStream);
}
FinalBitmap.Dispose();
and that's it! Then all you have to do in your image is set the URL to be something like RenderImage.aspx?MagicParm=WooHoo or whatever you need. That way you can have it render whatever you want to specify.
You can render html to a bitmap using the WebBrowser control in either a winforms or console application.
An example of this can be found here: http://www.wincustomize.com/articles.aspx?aid=136426&c=1
The above example can be modified to run in ASP.Net by creating a new STAThread and performing an Application.Run on it to start a new message loop.
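A hedged sketch of that STA-thread approach, using the WebBrowser control and the DrawToBitmap() method it inherits from Control. HtmlSnapshot and Render are hypothetical names, and the fixed width and height are assumptions. Whether the page actually renders without the control being shown on a form is not guaranteed, so treat this as a starting point rather than a tested solution.

```csharp
using System;
using System.Drawing;
using System.Threading;
using System.Windows.Forms;

public static class HtmlSnapshot
{
    // Renders html on a dedicated STA thread and returns the snapshot, or null if nothing was captured.
    public static Bitmap Render(string html, int width, int height)
    {
        Bitmap result = null;
        var thread = new Thread(() =>
        {
            using (var browser = new WebBrowser())
            {
                browser.ScrollBarsEnabled = false;
                browser.ScriptErrorsSuppressed = true;
                browser.Size = new Size(width, height);
                browser.DocumentCompleted += (sender, e) =>
                {
                    var bitmap = new Bitmap(width, height);
                    browser.DrawToBitmap(bitmap, new Rectangle(0, 0, width, height));
                    result = bitmap;
                    Application.ExitThread(); // ends the message loop started below
                };
                browser.DocumentText = html;
                Application.Run(); // message loop needed for DocumentCompleted to fire
            }
        });
        thread.SetApartmentState(ApartmentState.STA); // WebBrowser requires a single-threaded apartment
        thread.Start();
        thread.Join();
        return result;
    }
}
```

The caller owns the returned Bitmap and should dispose it after writing it to the response.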
PHP/Ruby Alternative
If you have accessed this question and are actually looking for something that will work without Windows, you can try the KHTML library: http://wiki.goatpr0n.de/projects/khtmld
The website has a ridiculous name I admit, but I can assure you it is genuine. Other related pages are: the sourceforge page http://khtml2png.sourceforge.net/
Try PDFSharp... it's not exactly a "take this HTML and make a PDF" tool, but with a small amount of fiddling you can easily make a PDF out of the info you are using to make the HTML.
MARKUP ONLY ALTERNATE SOLUTION
Use SVG and XSLT to transform the html data into an image that can be rendered/saved/etc.
I'll admit that at first it was tedious getting this to work because of all of the coordinates, but well worth the effort once it is running.
There is a very powerful image creation library called GD which I often use with PHP.
I am led to believe there is a wrapper for this library that ASP programmers can use. Try this
Unless the "other problems" are pretty severe, couldn't you just instruct your users to turn on Background Images when printing?
In any case, I'd default to serving a PDF rather than an image, doubly so since it is intended for print.
Just set up your CSS properly, so that you have a CSS file targeted at the print medium. It is pretty easy to guarantee that the coupon will always be legible, without worrying about whether users have background images on or not. Needlessly moving to an image doesn't make any sense, unless there is some reason you don't want it to be machine readable.
I haven't tried it myself, but you should be able to render HTML into an image by using the WebBrowser control and the DrawToBitmap() method inherited from the base Control class.
UPDATE: I tried this myself and there are some caveats. The WebBrowser control doesn't seem to render the web page until the control is shown, so the WebBrowser needs to be in a Form and the Form must be shown for the HTML to be rendered and the DocumentCompleted event to be raised.
