iTextSharp HTMLWorker ParseHTML Tablestyle and PDFStamper - c#

Hi, I have successfully used an HTMLWorker to convert a GridView to PDF using ASP.NET / C#.
(1) I have applied some limited styling to the resulting table, but I cannot see how to apply table styles such as grid lines, or other formatting such as a larger width for one particular column.
(2) I would actually like to put this content onto a pre-existing template which contains a logo etc. I've used PdfStamper before for this, but I cannot see how to use both PdfStamper and HTMLWorker at once. HTMLWorker needs a Document, which implements IDocListener, but that doesn't seem compatible with using a PdfStamper. I guess what I am looking for is a way to create a PdfStamper, write the title etc., then add the parsed HTML from the grid.
The other problem is that the parsed content doesn't interact with the other content on the page. For instance, below I add a title chunk to the page. Rather than starting below it, the parsed HTML is written over the top of it. How do I position the parsed HTML content relative to the rest of what is on the PDF document?
Thanks in advance
Rob
Here's the code I have already:
Document pdfDoc = new Document(PageSize.A4, 10f, 10f, 30f, 0f);
HTMLWorker htmlWorker = new HTMLWorker(pdfDoc);
StyleSheet styles = new StyleSheet();
styles.LoadTagStyle("th", "size", "12px");
styles.LoadTagStyle("th", "face", "helvetica");
styles.LoadTagStyle("span", "size", "10px");
styles.LoadTagStyle("span", "face", "helvetica");
styles.LoadTagStyle("td", "size", "10px");
styles.LoadTagStyle("td", "face", "helvetica");
htmlWorker.SetStyleSheet(styles);
PdfWriter.GetInstance(pdfDoc, HttpContext.Current.Response.OutputStream);
pdfDoc.Open();
//Title - but this gets obscured by the data, it doesn't move the data down
Font font = new Font(Font.FontFamily.HELVETICA, 14, Font.BOLD);
Chunk chunk = new Chunk(title, font);
pdfDoc.Add(chunk);
//Body
htmlWorker.Parse(sr);

Let me first give you a couple of links to look over when you get a chance:
ItextSharp support for HTML and CSS
How to apply font properties on while passing html to pdf using itextsharp
These answers go deeper into what's going on and I recommend reading them when you get a chance. Specifically the second one will show you why you need to use pt instead of px.
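As a rough sketch of that last point, the stylesheet from the question could be rebuilt with sizes given in points. This assumes iTextSharp 5.x; the class and method names are made up for illustration, and keeping the same numbers (12 and 10) is only an assumption about the look you want.

```csharp
using iTextSharp.text.html.simpleparser;

public static class StyleSketch
{
    // Hypothetical helper: rebuilds the question's stylesheet with "pt" sizes.
    public static StyleSheet BuildStyles()
    {
        StyleSheet styles = new StyleSheet();
        // Header cells: 12pt Helvetica
        styles.LoadTagStyle("th", "size", "12pt");
        styles.LoadTagStyle("th", "face", "helvetica");
        // Spans and data cells: 10pt Helvetica
        styles.LoadTagStyle("span", "size", "10pt");
        styles.LoadTagStyle("span", "face", "helvetica");
        styles.LoadTagStyle("td", "size", "10pt");
        styles.LoadTagStyle("td", "face", "helvetica");
        return styles;
    }
}
```

The returned StyleSheet can be passed to htmlWorker.SetStyleSheet(...) or to HTMLWorker.ParseToList(...), exactly as the px-based one was.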
To answer your first question let me show you a different way to use the HTMLWorker class. This class has a static method on it called ParseToList that will convert HTML to a List<IElement>. The objects in that list are all iTextSharp specific versions of your HTML. Normally you would do a foreach on those and just add them to a document but you can modify them before adding which is what you want to do. Below is code that takes a static string and does that:
string file1 = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "File1.pdf");
using (FileStream fs = new FileStream(file1, FileMode.Create, FileAccess.Write, FileShare.None))
{
    using (Document doc = new Document(PageSize.LETTER))
    {
        using (PdfWriter writer = PdfWriter.GetInstance(doc, fs))
        {
            doc.Open();
            //Our HTML
            string html = "<table><tr><th>First Name</th><th>Last Name</th></tr><tr><td>Chris</td><td>Haas</td></tr></table>";
            //ParseToList requires a TextReader instead of just a string, so wrap it in a StringReader
            using (StringReader sr = new StringReader(html))
            {
                //Create a style sheet
                StyleSheet styles = new StyleSheet();
                //...styles omitted for brevity
                //Convert our HTML to iTextSharp elements
                List<IElement> elements = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(sr, styles);
                //Loop through each element (in this case there's actually just one PdfPTable)
                foreach (IElement el in elements)
                {
                    //If the element is a PdfPTable
                    if (el is PdfPTable)
                    {
                        //Cast it
                        PdfPTable tt = (PdfPTable)el;
                        //Change the widths, these are relative widths by the way
                        tt.SetWidths(new float[] { 75, 25 });
                    }
                    //Add the element to the document
                    doc.Add(el);
                }
            }
            doc.Close();
        }
    }
}
Hopefully you can see that once you get access to the raw PdfPTable you can tweak it as necessary.
To answer your second question, if you want to use the normal Paragraph and Chunk objects with a PdfStamper then you need to use a PdfContentByte object. You can get this from your stamper in one of two ways: either by asking for one that sits "above" existing content, stamper.GetOverContent(int), or one that sits "below" existing content, stamper.GetUnderContent(int). Both versions take a single parameter saying what page to work with. Once you have a PdfContentByte you can create a ColumnText object bound to it and use this object's AddElement() method to add your normal elements. Before doing this (and this answers your third question), you'll want to create at least one "column". When I do this I generally create one that essentially covers the entire page. (This part might sound weird but we're essentially making a single-row, single-column table cell to add our objects to.)
Below is a full working C# 2010 WinForms app targeting iTextSharp 5.1.1.0 that shows off everything above. First it creates a generic PDF on the desktop. Then it creates a second document based off of the first, adds a paragraph and then some HTML. See the comments in the code for any questions.
using System;
using System.Collections.Generic;
using System.Text;
using System.Windows.Forms;
using iTextSharp.text;
using iTextSharp.text.html.simpleparser;
using iTextSharp.text.pdf;
using System.IO;
namespace WindowsFormsApplication1
{
    public partial class Form1 : Form
    {
        public Form1()
        {
            InitializeComponent();
        }
        private void Form1_Load(object sender, EventArgs e)
        {
            //The two files that we are creating
            string file1 = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "File1.pdf");
            string file2 = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "File2.pdf");
            //Create a base file to write on top of
            using (FileStream fs = new FileStream(file1, FileMode.Create, FileAccess.Write, FileShare.None))
            {
                using (Document doc = new Document(PageSize.LETTER))
                {
                    using (PdfWriter writer = PdfWriter.GetInstance(doc, fs))
                    {
                        doc.Open();
                        doc.Add(new Paragraph("Hello world"));
                        doc.Close();
                    }
                }
            }
            //Bind a reader to our first document
            PdfReader reader = new PdfReader(file1);
            //Create our second document
            using (FileStream fs = new FileStream(file2, FileMode.Create, FileAccess.Write, FileShare.None))
            {
                using (PdfStamper stamper = new PdfStamper(reader, fs))
                {
                    StyleSheet styles = new StyleSheet();
                    //...styles omitted for brevity
                    //Our HTML
                    string html = "<table><tr><th>First Name</th><th>Last Name</th></tr><tr><td>Chris</td><td>Haas</td></tr></table>";
                    //ParseToList requires a TextReader instead of just a string, so wrap it in a StringReader
                    using (StringReader sr = new StringReader(html))
                    {
                        //Get our raw PdfContentByte object letting us draw "above" existing content
                        PdfContentByte cb = stamper.GetOverContent(1);
                        //Create a new ColumnText object bound to the above PdfContentByte object
                        ColumnText ct = new ColumnText(cb);
                        //Get the dimensions of the first page of our source document
                        iTextSharp.text.Rectangle page1size = reader.GetPageSize(1);
                        //Create a single column object spanning the entire page
                        ct.SetSimpleColumn(0, 0, page1size.Width, page1size.Height);
                        ct.AddElement(new Paragraph("Hello world!"));
                        //Convert our HTML to iTextSharp elements
                        List<IElement> elements = iTextSharp.text.html.simpleparser.HTMLWorker.ParseToList(sr, styles);
                        //Loop through each element (in this case there's actually just one PdfPTable)
                        foreach (IElement el in elements)
                        {
                            //If the element is a PdfPTable
                            if (el is PdfPTable)
                            {
                                //Cast it
                                PdfPTable tt = (PdfPTable)el;
                                //Change the widths, these are relative widths by the way
                                tt.SetWidths(new float[] { 75, 25 });
                            }
                            //Add the element to the ColumnText
                            ct.AddElement(el);
                        }
                        //IMPORTANT, this actually commits our object to the PDF
                        ct.Go();
                    }
                }
            }
            //Release the source file now that the stamper is closed
            reader.Close();
            this.Close();
        }
    }
}

protected void LinkPdf_Click(object sender, EventArgs e)
{
    Response.ContentType = "application/pdf";
    Response.AddHeader("content-disposition", "attachment;filename=TestPage.pdf");
    Response.Cache.SetCacheability(HttpCacheability.NoCache);
    //Render the current page to an HTML string
    StringWriter sw = new StringWriter();
    HtmlTextWriter hw = new HtmlTextWriter(sw);
    this.Page.RenderControl(hw);
    StringReader sr = new StringReader(sw.ToString());
    Document pdfDoc = new Document(PageSize.A4, 10f, 10f, 100f, 0f);
    HTMLWorker htmlparser = new HTMLWorker(pdfDoc);
    //The writer streams the PDF straight into the response
    PdfWriter.GetInstance(pdfDoc, Response.OutputStream);
    pdfDoc.Open();
    htmlparser.Parse(sr);
    pdfDoc.Close();
    //No Response.Write(pdfDoc) here: the PDF bytes are already in Response.OutputStream
    Response.End();
}

Related

Ordered lists in HTML to PDF conversion using iTextSharp (C#, MVC)

I'm trying to convert some HTML code into a PDF file using iTextSharp, and it's working apart from one problem. I have some ordered lists and they are displayed, but without bullets/numbers (I tested both OL and UL), and I can't figure out how to fix this without manually parsing the HTML code and inserting numbers at the beginning of each li element.
_topic.Body = "<ol><li>First</li><li>Second</li></ol>";
using (FileStream msOutput = new FileStream("file.pdf", FileMode.Create))
{
    TextReader reader = new StringReader(_topic.Body);
    Document document = new Document(PageSize.A4, 30, 30, 30, 30);
    PdfWriter writer = PdfWriter.GetInstance(document, msOutput);
    HTMLWorker worker = new HTMLWorker(document);
    document.Open();
    worker.StartDocument();
    worker.Parse(reader);
    worker.EndDocument();
    worker.Close();
    document.Close();
}

create multiple page pdf from asp.net mvc?

I was trying to create a pdf dynamically and send it by attaching in the mail.
This is my code and it works perfectly for me.
public ActionResult sendmail()
{
    MemoryStream ms = new MemoryStream();
    Document doc = new Document(PageSize.A4, 10f, 10f, 100f, 0.0f);
    PdfWriter writer = PdfWriter.GetInstance(doc, ms);
    doc.Open(); //open doc for editing
    doc.Add(new Paragraph("First Paragraph"));
    doc.Add(new Paragraph("Second Paragraph"));
    writer.CloseStream = false; //important
    doc.Close(); //build the doc.
    ms.Position = 0;
    SmtpClient smtpClient = new SmtpClient();
    smtpClient.Host = "provider.com";
    smtpClient.Credentials = new NetworkCredential("credentialmailid", "password");
    MailMessage mailMessage = new MailMessage()
    {
        From = new MailAddress("from@gmail.com")
    };
    mailMessage.To.Add(new MailAddress("to@gmail.com"));
    mailMessage.Subject = "Pdf attached";
    mailMessage.Attachments.Add(new Attachment(ms, "pdfname.pdf"));
    smtpClient.Send(mailMessage);
    return RedirectToAction("index");
}
Now my issue is this: the document that I have to send is a purchase confirmation. It will have 3 pages, with many headings and styles in it.
I also have to pass in some values dynamically, such as who purchased it and the date; a lot of data has to be passed in dynamically.
How can I do this? I thought of creating an HTML version of the PDF file to be sent and using something like this to add the parameters dynamically...
string mailpath = Server.MapPath("~/Mail/HtmlOF_pdfToSend.html");
string mailbody = System.IO.File.ReadAllText(mailpath);
mailbody = mailbody.Replace("##CompanyName", "Bhavin Merchant");
mailbody = mailbody.Replace("##BusinessType", "Bhavin business");
First you have to add the iTextSharp DLL, then you have to add some namespaces:
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.text.html;
using iTextSharp.text.xml;
using iTextSharp.text.html.simpleparser;
Then, as per your question, you want to pass values dynamically, so I am posting some syntax as an example:
// Create a Document object
var document = new Document(PageSize.A4, 50, 50, 25, 25);
var output = new MemoryStream();
// Create a new PdfWriter object, specifying the output stream
var writer = PdfWriter.GetInstance(document, output);
// Open the Document for writing
document.Open();
Suppose you have a header in your PDF document; the syntax will be:
var logo = iTextSharp.text.Image.GetInstance(Server.MapPath("~/images/it.jpg"));
logo.SetAbsolutePosition(300, 750);
document.Add(logo);
If you want to add phrase:
Phrase titl = new Phrase("\nE-Ticket\n");
titl.Font.SetStyle(Font.BOLD);
document.Add(titl);
Add lines :
Phrase titl1 = new Phrase("--------------------------------------------------------------------------------------\n\n");
titl1.Font.SetStyle(Font.BOLD);
document.Add(titl1);
Change the style of text :
Here you can change the font style & color.
Phrase title = new Phrase("Booking Date-" + txtDate1.Text + "\n");
title.Font.SetStyle(Font.BOLD);
document.Add(title);
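If you want to add a PDF table (dt is a DataTable):
PdfPTable UserInfoTable = new PdfPTable(dt.Columns.Count);
// Add one cell per column for every row of the DataTable
foreach (DataRow dr in dt.Rows)
{
    foreach (DataColumn col in dt.Columns)
        UserInfoTable.AddCell(new Phrase(Convert.ToString(dr[col])));
}
document.Add(UserInfoTable);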
Close the Document - this saves the document contents to the output stream
document.Close();
Response.ContentType = "application/pdf";
Response.AddHeader("Content-Disposition", string.Format("attachment;filename=Receipt-{0}.pdf", "hello"));
Response.BinaryWrite(output.ToArray());
That is some example code for your question.
You can add more pages to the document like this:
doc.Open(); //open doc for editing
doc.Add(new Paragraph("First Paragraph"));
doc.NewPage();
doc.Add(new Paragraph("This is a new page =)"));
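The placeholder idea from the question (replacing ##CompanyName and similar markers in an HTML template) could be wrapped in a small helper along these lines. This is only a sketch: TemplateFiller and Fill are hypothetical names, and HTML-encoding the values is an added assumption so that characters like & or < in the data don't break the markup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net;

public static class TemplateFiller
{
    // Replaces every "##Key" placeholder in the template with its HTML-encoded value.
    public static string Fill(string template, IDictionary<string, string> values)
    {
        string result = template;
        // Longest keys first, so "##CompanyName" is handled before "##Company"
        foreach (string key in values.Keys.OrderByDescending(k => k.Length))
        {
            result = result.Replace("##" + key, WebUtility.HtmlEncode(values[key]));
        }
        return result;
    }
}
```

For example, filling "<h1>##CompanyName</h1><p>##BusinessType</p>" with CompanyName = "Bhavin Merchant" and BusinessType = "Bhavin business" gives "<h1>Bhavin Merchant</h1><p>Bhavin business</p>". The filled HTML can then be parsed into the PDF and attached to the mail from the MemoryStream as in the question's code.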

MVC - Convert PdfPTable to Html

I am working on downloading a PDF document. I have used PdfPTable to create a table. Below is my shortened code.
var document = new Document(PageSize.A4, 50, 50, 25, 25);
var output = new MemoryStream();
var writer = PdfWriter.GetInstance(document, output);
document.Open();
PdfPTable table = new PdfPTable(2);
table.WidthPercentage = 80;
PdfPCell cell = new PdfPCell(new Phrase("Description", fntTableFontBold));
cell.HorizontalAlignment = Element.ALIGN_CENTER;
table.AddCell(cell);
......
document.Add(table);
document.Close();
Response.ContentType = "application/pdf";
Response.AddHeader("Content-Disposition", string.Format("attachment;filename={0}.pdf", "Journal"));
Response.BinaryWrite(output.ToArray());
All is working fine. The code above downloads the PDF straight away in the browser.
Now I want to add preview functionality: if the user clicks Preview, they should see the HTML output.
How can I convert the code above to show HTML instead? There is a lot of code and many queries involved in creating the PdfPTable.
Creating a PDF using iTextSharp involves a lot of coding. You cannot render a PDF as HTML (that I know of), but you can create the HTML first, convert it into a PDF, and then stream the PDF to the browser. If you create the HTML first, you can use that same HTML for the preview.
Here is a block of code that I once wrote to convert html to pdf.
protected void ConvertHTMLToPDF(String HTMLCode, String fileName)
{
    //Create PDF document
    iTextSharp.text.Document doc = new iTextSharp.text.Document(iTextSharp.text.PageSize.A4);
    PdfWriter.GetInstance(doc, new FileStream(HttpContext.Current.Server.MapPath("~") + "/Resources/Temp/" + fileName, FileMode.Create));
    doc.Open();
    foreach (IElement element in HTMLWorker.ParseToList(new StringReader(HTMLCode), null))
    {
        doc.Add(element);
    }
    doc.Close();
    //Response.End();
}

itextsharp html to pdf

I want to convert some HTML into a PDF. All my HTML is in a string called HTML, but I don't know how to pass it in correctly within iTextSharp.
public void PDF()
{
    // Create a Document object
    var doc = new Document(PageSize.A4, 50, 50, 25, 25);
    // Create a new PdfWriter object, writing the output to the file t.pdf
    var output = new FileStream(Server.MapPath("t.pdf"), FileMode.Create);
    var writer = PdfWriter.GetInstance(doc, output);
    // Open the doc for writing
    doc.Open();
    //Add Wallpaper image to the pdf
    var Wallpaper = iTextSharp.text.Image.GetInstance(Server.MapPath("hfc.png"));
    Wallpaper.SetAbsolutePosition(0, 0);
    Wallpaper.ScaleAbsolute(600, 840);
    doc.Add(Wallpaper);
    iTextSharp.text.html.simpleparser.HTMLWorker hw = new iTextSharp.text.html.simpleparser.HTMLWorker(doc);
    StyleSheet css = new StyleSheet();
    css.LoadTagStyle("body", "face", "Garamond");
    css.LoadTagStyle("body", "encoding", "Identity-H");
    css.LoadTagStyle("body", "size", "12pt");
    // The style sheet has no effect unless it is handed to the worker
    hw.SetStyleSheet(css);
    hw.Parse(new StringReader(HTML));
    doc.Close();
    Response.Redirect("t.pdf");
}
If anyone knows how to make this work, that would be great.
Thanks
Dom
Please download The Best iText Questions on StackOverflow. It's a free ebook, you'll benefit from it.
Once you have downloaded it, go to the section entitled "Parsing XML and XHTML".
Allow me to quote from the answer to this question: RowSpan does not work in iTextSharp?
You are using HTMLWorker instead of XML Worker, and you are right:
HTMLWorker has no support for CSS. Saying CSS doesn't work in
iTextSharp is wrong. It doesn't work when you use HTMLWorker, but
that's documented: the CSS you need works in XML Worker.
Please throw away your code, and start anew using XML Worker.
There are many examples (simple ones as well as complex ones) in the book. Let me give you only one:
// outputFile, result (the HTML to parse) and fonts are assumed to be defined by the calling code
using (var fsOut = new FileStream(outputFile, FileMode.Create, FileAccess.Write))
using (var stringReader = new StringReader(result))
{
    var document = new Document();
    var pdfWriter = PdfWriter.GetInstance(document, fsOut);
    pdfWriter.InitialLeading = 12.5f;
    document.Open();
    var xmlWorkerHelper = XMLWorkerHelper.GetInstance();
    var cssResolver = new StyleAttrCSSResolver();
    var xmlWorkerFontProvider = new XMLWorkerFontProvider();
    foreach (string font in fonts)
    {
        xmlWorkerFontProvider.Register(font);
    }
    var cssAppliers = new CssAppliersImpl(xmlWorkerFontProvider);
    var htmlContext = new HtmlPipelineContext(cssAppliers);
    htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
    PdfWriterPipeline pdfWriterPipeline = new PdfWriterPipeline(document, pdfWriter);
    HtmlPipeline htmlPipeline = new HtmlPipeline(htmlContext, pdfWriterPipeline);
    CssResolverPipeline cssResolverPipeline = new CssResolverPipeline(cssResolver, htmlPipeline);
    XMLWorker xmlWorker = new XMLWorker(cssResolverPipeline, true);
    XMLParser xmlParser = new XMLParser(xmlWorker);
    xmlParser.Parse(stringReader);
    document.Close();
}
(Source: iTextSharp XmlWorker: right-to-left)
If you want an easier example, take a look at the answers of these questions:
How to parse multiple HTML files into a single PDF?
How to add a rich Textbox (HTML) to a table cell?
...
The code that parses an HTML string and a CSS string to a list of iText(Sharp) elements is as simple as this:
ElementList list = XMLWorkerHelper.ParseToElementList(html, css);
You can find more examples on the official iText web site.
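To show where that one-liner fits, here is a minimal sketch that turns an HTML string and a CSS string into PDF bytes. It assumes iTextSharp 5.5.x with the separate XML Worker assembly referenced; HtmlToPdfSketch and Render are hypothetical names, and the input is assumed to be well-formed XHTML, since XML Worker expects that.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

public static class HtmlToPdfSketch
{
    // Hypothetical helper: parses XHTML + CSS and returns the resulting PDF as bytes.
    public static byte[] Render(string html, string css)
    {
        using (var ms = new MemoryStream())
        {
            using (var doc = new Document(PageSize.A4, 50, 50, 25, 25))
            {
                PdfWriter.GetInstance(doc, ms);
                doc.Open();
                // Convert the markup to iTextSharp elements
                ElementList elements = XMLWorkerHelper.ParseToElementList(html, css);
                // Each element can be inspected or tweaked here before it is added
                foreach (IElement el in elements)
                {
                    doc.Add(el);
                }
            } // disposing the Document closes it and finishes the PDF
            return ms.ToArray();
        }
    }
}
```

A call such as HtmlToPdfSketch.Render("<p>Hello</p>", "p { color: red; }") returns bytes that can be written to a file or to Response.BinaryWrite. Because the elements come back as a list, they can also be added to a PdfPCell or a ColumnText instead of the Document.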

Pdf Merge/Overlap with iText

I have used iText for various utilities, such as merging and editing PDF files, with success. Now I need to overlap 2 PDF pages:
For Instance:
INPUT:
PDF#1 (1 Page)
PDF#2 (1 Page)
OUTPUT:
PDF#3 (1 Page: This is the result of the 2 Input Pages Overlapped)
I don't know if it's possible to do this with the latest version of iText. I am also considering using one of the 2 input PDF files as a background for the output PDF file.
Thank you in advance.
It's actually pretty easy to do. The PdfWriter object has an instance method called GetImportedPage() which returns a PdfImportedPage object. This object can be passed to a PdfContentByte's AddTemplate() method.
GetImportedPage() takes a PdfReader object and the page number that you want to get. You can get a PdfContentByte from an instance of a PdfWriter's DirectContent property.
The code below is a full working C# 2010 WinForms app targeting iTextSharp 5.1.2.0 that shows this all off. It first creates two files on the desktop, the first with just a solid red background color and the second with just a paragraph. It then combines those two files overlapping into a third document. See the code for additional comments.
using System;
using System.IO;
using System.Windows.Forms;
using iTextSharp.text;
using iTextSharp.text.pdf;
namespace WindowsFormsApplication1 {
    public partial class Form1 : Form {
        public Form1() {
            InitializeComponent();
        }
        private void Form1_Load(object sender, EventArgs e) {
            //Folder that we'll work from
            string workingFolder = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
            string pdf1 = Path.Combine(workingFolder, "pdf1.pdf");//PDF with solid red background color
            string pdf2 = Path.Combine(workingFolder, "pdf2.pdf");//PDF with text
            string pdf3 = Path.Combine(workingFolder, "pdf3.pdf");//Merged PDF
            //Create a basic PDF filled with red, nothing special
            using (FileStream fs = new FileStream(pdf1, FileMode.Create, FileAccess.Write, FileShare.None)) {
                using (Document doc = new Document(PageSize.LETTER)) {
                    using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
                        doc.Open();
                        PdfContentByte cb = writer.DirectContent;
                        cb.SetColorFill(BaseColor.RED);
                        cb.Rectangle(0, 0, doc.PageSize.Width, doc.PageSize.Height);
                        cb.Fill();
                        doc.Close();
                    }
                }
            }
            //Create a basic PDF with a single line of text, nothing special
            using (FileStream fs = new FileStream(pdf2, FileMode.Create, FileAccess.Write, FileShare.None)) {
                using (Document doc = new Document(PageSize.LETTER)) {
                    using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
                        doc.Open();
                        doc.Add(new Paragraph("This is a test"));
                        doc.Close();
                    }
                }
            }
            //Create the merged PDF
            using (FileStream fs = new FileStream(pdf3, FileMode.Create, FileAccess.Write, FileShare.None)) {
                using (Document doc = new Document(PageSize.LETTER)) {
                    using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
                        doc.Open();
                        //Get page 1 of the first file
                        PdfImportedPage imp1 = writer.GetImportedPage(new PdfReader(pdf1), 1);
                        //Get page 1 of the second file
                        PdfImportedPage imp2 = writer.GetImportedPage(new PdfReader(pdf2), 1);
                        //Add the first file to coordinates 0,0
                        writer.DirectContent.AddTemplate(imp1, 0, 0);
                        //Since we don't call NewPage the next call will operate on the same page
                        writer.DirectContent.AddTemplate(imp2, 0, 0);
                        doc.Close();
                    }
                }
            }
            this.Close();
        }
    }
}
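For the background case mentioned in the question, a PdfStamper-based variant might look like the sketch below. It assumes iTextSharp 5.x; the class, method, and parameter names are invented for illustration, and it places page 1 of the background file underneath every page of the content file.

```csharp
using System.IO;
using iTextSharp.text.pdf;

public static class BackgroundSketch
{
    // Hypothetical helper: copies contentPdf to outputPdf with page 1 of backgroundPdf drawn beneath each page.
    public static void AddBackground(string contentPdf, string backgroundPdf, string outputPdf)
    {
        PdfReader contentReader = new PdfReader(contentPdf);
        PdfReader backgroundReader = new PdfReader(backgroundPdf);
        using (FileStream fs = new FileStream(outputPdf, FileMode.Create, FileAccess.Write, FileShare.None))
        using (PdfStamper stamper = new PdfStamper(contentReader, fs))
        {
            //Import the background page once and reuse it for every page
            PdfImportedPage background = stamper.GetImportedPage(backgroundReader, 1);
            for (int i = 1; i <= contentReader.NumberOfPages; i++)
            {
                //GetUnderContent draws below whatever the page already contains
                stamper.GetUnderContent(i).AddTemplate(background, 0, 0);
            }
        }
        //Both readers must stay open until the stamper has been closed
        contentReader.Close();
        backgroundReader.Close();
    }
}
```

With the files from the example above, AddBackground(pdf2, pdf1, pdf3) would put the red page behind the text. Unlike the PdfWriter approach, the stamper keeps the content file's own pages, so things like annotations and form fields on those pages survive. If the content pages paint an opaque background of their own, the under-content will be hidden.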
