Using WKHtmlToPDF to generate PDFs for my company's web-based mapping service.
Essentially, I take a template HTML file, inject an image into a div, save the HTML to disk and use WKHtmlToPDF to render to PDF.
Now, on most templates it works a treat. On one particular one though, where the image should be (int the pdf) is a grey area. HOWEVER, if I right click on the grey area, and select "Save Image As...", the saved image is correct.
Linked are the created PDF and the HTML on which it is based. Help required most urgently, and hints appreciated.
Zip File Containing HTML and PDF
I was having an issue where a particular image was not being printed to the PDF. Other images on the same page were. The src of the missing image is from a CDN, but had no extension, i.e. src="\\path/to/image?param". Using the aforementioned -n switch (disable Javascript), the image shows up in the resulting PDF. Thanks Jordaan.
Don't know WHY this worked, but adding the "--disable-smart-shrinking" option, and/or removing the "-n"(Disable Javascript) option, fixed it.
Related
In my system, there are multiple PDFs listed in the website. I need to show the preview image of 1st page of all the PDFs.
There are two previews which I want to display -
One small preview
One big preview on mouse hover
What I am doing now?
We are taking help few third party preview generators. Which is used to create JPEG image and using those images in the website for previews.
What I tried differently?
I used EvoPDFtoHTML tool to use HTML instead of images directly but for many files the generated HTML is not appropriate.
Also, These both process is taking a lot of time and making website
slow in response.
I would like to know that is there any better way to achieve this?
Image attached below for better understandings -
An approach that is worth exploring
Parse the PDF and extract the 1st page.You may use command line tools like : PDFtk, Ghostscript, or Implement your own class to parse out the first page in C#
Then use Google doc viewer and embed an iframe to point to PDF
Example of PDFtk:
pdftk input.pdf cat 1 output page-1-of-input.pdf
Example of GhostScript:
gs -o page-1-of-input.pdf -sDEVICE=pdfwrite -dPDFLastPage=1 input.pdf
References:
Display first page of PDF as Image
You can also look at Fahims answer for the C# snippet that he tried
I have been using the (free version) HiQPdf libraries to converto html pages to pdf documents.
I am also using in my pages several Chart objecs from the .Net Framework (System.Web.UI.DataVisualization.Charting.Chart) to produce bar graphics dysplaying values changing along time. It works wonders in my local environment when I debug with VS, but when I publish on my IIS or on other servers the charts do not appear at all on the pdf - note: they do appear on the webpages just not on the pdf.
In the Pdfs all the html is displayed correctly, inlcuding css, showing it exactly as seen on the page, except the chart images. I kind of understand that they would not appear as in the html, the image source from the charts results in something like:
<img id="MainContent_MyPageControl_ctl00" src="/MyTestWebSite/ChartImg.axd?i=charts_0/chart_0_2.png&g=396d61e14ceb41c08be06fd956cd4dca"
Because the real generated png image is not even directly referenced as usual. But the fact is they do appear on the generated pdf when running local from VS, which produces a similar html anyway as the above.
Only difference I see is that when I run local the Image Chart Handler key is defined in the webconfig as:
add key="ChartImageHandler" value="storage=file;timeout=20;dir=c:\TempImageFiles\;"
but when publishing on a server I have changed it to:
add key="ChartImageHandler" value="storage=file;timeout=20;Url=~/MyfolderTempcharts/;deleteAfterServicing=false;"
So why dont the charts appear too on my generated pdf, or why they do appear if Im running from VS?
Anyone ever had this problem?
So I have figured out a way to deal with my problem, and I am sharing here if anyone is also in need. The deal is that the Chart objects use http handlers to render the chart generated images (you can read more about it in detail from these guys: https://web.archive.org/web/20201205231110/https://www.4guysfromrolla.com/articles/081909-1.aspx)
At first I thought I could manipulate the html (that I get before the page render) and replace the image sources with accessible links to send to the HiQPdf - for that I would need to temporarely copy the generated images to a public folder. That would be fine unless in my case I deal with sensitive data that should never be accessed even if only momentarily.
So what I do is I check that the image file does exists (in my configured folder) and extract the byte data from it, convert it to base 64 and replace it in the image source html where the handler is, and voila.
example of Base64 html embeded image:
Notice that this manipulated html is never accessible from the outside and is only worked on on the server side, in order to provide a "normal" html that the HiQPdf can pprocess.
I want to convert my webpage by button on click in codebehide i will receive that url and finally it will convert webpage to pdf and save it in to my specific path.Can u suggest the easy tool(dll files) to convert webpage by url website. please Thank you so much
PDFsharp
Key Features
Creates PDF documents on the fly from any .Net language
Easy to understand object model to compose documents
One source code for drawing on a PDF page as well as in a window or on the printer
Modify, merge, and split existing PDF files
Images with transparency (color mask, monochrome mask, alpha mask)
Newly designed from scratch and written entirely in C#
The graphical classes go well with .Net
I am generating PDF documents using DevExpress XtraReports.
I am using the same image over and over (in rows of status lights).
The PDF generated seems to duplicate the image definition for each image included. I would prefer if it included the image once and referenced it wherever it needed another copy - this would drastically reduce the size of my PDF docs.
Is there any way to achieve this using DevExpress or even post processed via a third party application. Any help is appreciated.
Two options:
OPT1: I suppose your image is a background or a company logo and the image is the same on all the pages of the pdf. If yes, then create the pdf without the image. Post-process the pdf and add the image on all the pages (you can do that using itext/itextsharp or pdflib).
OPT2: take your actual pdf and convert it using Ghoscript. Using Ghsoscript you can do a "pdf to pdf" conversion. During the conversion Ghostscript try to identify repeated images and removes them. The resulting file is smaller. (Ghostscript is not always able to do that... try with you pdf file).
It is possible to re-use the same image content in multiple locations throughout your document. But it's a fair bit easier to do this while adding the image(s) to the PDF.
I'm not sure if DevExpress supports this.
I have a 2-page PDF that I want to overlay information on (think of a form that someone manually fills out) using a C#/.NET Windows application. After this form is generated, it will need to be previewed and printed (exported into a graphic or PDF is nice, but not a requirement).
At first glance, I'm thinking of two ways to do this:
Use a PDF manipulator like iTextSharp, take a copy of the blank PDF form, and add text to the PDF. Then, launch Adobe Reader to do a print or preview.
Convert the PDF into a graphic, and put the graphic into a C# Report. Then, overlay text fields onto the report, and use a .NET ReportViewer control to preview and print the report.
The text does not need to be searchable or copyable or have any of the cool things that PDF gives me, so I'm leaning towards the second option. Am I missing anything, or is there something I'm not thinking of? Thanks in advance.
We use #1 extensively and it works perfectly. You shouldn't have any problems, it's rather easy (just need to make the fields writeable and use a FDF file and merge it with the PDF file). At least that's how we did it.