Html to pdf Unicode chars not being displayed in pdfptable

I am generating a PDF file from HTML markup. With the code below, the PDF is generated successfully and the Unicode characters are rendered as well.
Here is my code snippet:
void GeneratePdfFromHtml()
{
const string outputFilename = @"c:\report.pdf";
const string inputFilename = @"C:\report.html";
using (var input = new FileStream(inputFilename, FileMode.Open))
using (var output = new FileStream(outputFilename, FileMode.Create))
{
CreatePdf(input, output);
}
}
This method creates the PDF with Unicode support:
void CreatePdf(Stream htmlInput, Stream pdfOutput)
{
using (var document = new Document(PageSize.A4, 30, 30, 30, 30))
{
var writer = PdfWriter.GetInstance(document, pdfOutput);
var worker = XMLWorkerHelper.GetInstance();
document.Open();
worker.ParseXHtml(writer, document, htmlInput, null, Encoding.Unicode, new UnicodeFontFactory());
document.Close();
}
}
This is a helper class to provide required fonts:
public class UnicodeFontFactory : FontFactoryImp
{
private static readonly string FontPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop),
"ARIALUNI.ttf");
private readonly BaseFont _baseFont;
public UnicodeFontFactory()
{
_baseFont = BaseFont.CreateFont(FontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
}
public override iTextSharp.text.Font GetFont(string fontname, string encoding, bool embedded, float size, int style, BaseColor color,
bool cached)
{
return new iTextSharp.text.Font(_baseFont, size, style, color);
}
}
The problem: this draws the resulting content directly on the PDF document page, but I want it to be drawn inside a PdfPTable cell.
Here is what I have tried to achieve this:
void CreatePdf(string htmltext)
{
using (var document = new Document(PageSize.A4, 30, 30, 30, 30))
{
document.Open();
PdfPTable table = new PdfPTable(1);
PdfPCell cell = new PdfPCell();
List<IElement> elementsIzv = XMLWorkerHelper.ParseToElementList(htmltext, null);
foreach (IElement e in elementsIzv)
{
cell.AddElement(e);
}
table.AddCell(cell);
document.Add(table);
document.Close();
}
}
This snippet drops the special characters. Please point me in the right direction; a replacement for iTextSharp is also acceptable.
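One possible direction, offered as a sketch rather than a confirmed fix: the first method passes `UnicodeFontFactory` to `ParseXHtml`, whereas `ParseToElementList(htmltext, null)` never sees that factory, and the second method also never creates a `PdfWriter`. Assuming iTextSharp 5.5.x with the itextsharp.xmlworker package, the XMLWorker pipeline can be assembled by hand so that parsed elements are collected in an `ElementList` and the font factory is supplied through `CssAppliersImpl`. The class name `TableHtmlSketch` and the method `CreatePdfWithTable` are hypothetical; `UnicodeFontFactory` is the class shown above.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;
using iTextSharp.tool.xml.html;
using iTextSharp.tool.xml.parser;
using iTextSharp.tool.xml.pipeline.css;
using iTextSharp.tool.xml.pipeline.end;
using iTextSharp.tool.xml.pipeline.html;

public static class TableHtmlSketch
{
    // Hypothetical helper: parses XHTML with a Unicode-capable font provider
    // and places the resulting elements in a single PdfPTable cell.
    public static void CreatePdfWithTable(string htmlText, Stream pdfOutput)
    {
        // Collect parsed elements instead of writing them straight to the page.
        var elements = new ElementList();
        var end = new ElementHandlerPipeline(elements, null);

        // UnicodeFontFactory is the FontFactoryImp subclass from the question.
        var cssAppliers = new CssAppliersImpl(new UnicodeFontFactory());
        var htmlContext = new HtmlPipelineContext(cssAppliers);
        htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());

        var cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(true);
        var pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext, end));

        var parser = new XMLParser(new XMLWorker(pipeline, true));
        parser.Parse(new StringReader(htmlText));

        using (var document = new Document(PageSize.A4, 30, 30, 30, 30))
        {
            // The writer must exist before the document is opened.
            PdfWriter.GetInstance(document, pdfOutput);
            document.Open();

            var table = new PdfPTable(1);
            var cell = new PdfPCell();
            foreach (IElement element in elements)
            {
                cell.AddElement(element);
            }
            table.AddCell(cell);
            document.Add(table);
        }
    }
}
```

Because the HTML arrives as a .NET string and is read through a `StringReader`, no byte encoding is involved at parse time; whether the glyphs appear then depends on the font file that `UnicodeFontFactory` loads.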

Related

ItextSharp PDF Header Footer Repetition Issue

I use iTextSharp from the NuGet package to parse HTML to PDF, pass the bytes to the website front end, and show the PDF in an iframe.
At first I had two separate HTML fragments for the PDF: one for the header and one for the body. When I parsed the HTML to PDF, the header was repeated incorrectly: it works fine on the first page, but on the second page the header and the body overlap. I tried overriding OnStartPage and OnEndPage, but nothing worked.
Next I tried building the header in C# code and the body from HTML, but that has the same issue.
I think the main issue is with the page break (correct me if I am wrong). If anyone in the community can help, I would really appreciate it. I am sharing the code below.
Please let me know if any code is missing. I need a header on every page that is consistent but has dynamic content.
using System;
using System.Collections.Generic;
using iTextSharp.text.html.simpleparser;
namespace WebApi.Controllers
{
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;
using iTextSharp.tool.xml.css;
using iTextSharp.tool.xml.html;
using iTextSharp.tool.xml.parser;
using iTextSharp.tool.xml.pipeline.css;
using iTextSharp.tool.xml.pipeline.end;
using iTextSharp.tool.xml.pipeline.html;
using System.IO;
using System.Text;
using System.Web.Mvc;
public class PDFGenerateController : Controller
{
[NonAction]
public byte[] Index(IModel iModel)
{
// get HTML for body
var html = GetHtml(iModel, false);
byte[] bytes;
Document pdfDocument = new Document(PageSize.A4);
using (MemoryStream memoryStream = new MemoryStream())
{
PdfWriter writer = PdfWriter.GetInstance(pdfDocument, memoryStream);
AddImageToHeader pageEventHandler = new AddImageToHeader(GetHtml(processedData, true)); // get html for header right side
IHeaderFooter iHeaderFooter = new IHeaderFooter(GetHtml(processedData, true)); // get html for header right side
writer.PageEvent = iHeaderFooter;
writer.PageEvent = pageEventHandler;
writer.PageEvent = new HeaderFooterAdd(iModel);
writer.CloseStream = false;
pdfDocument.Open();
HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
ICSSResolver cssResolver = XMLWorkerHelper.GetInstance().GetDefaultCssResolver(false);
cssResolver.AddCssFile("C:/pdf.css", true);
IPipeline pipeline = new CssResolverPipeline(cssResolver, new HtmlPipeline(htmlContext,
new PdfWriterPipeline(pdfDocument, writer)));
XMLWorker worker = new XMLWorker(pipeline, true);
XMLParser xmlParser = new XMLParser(worker);
xmlParser.Parse(new MemoryStream(Encoding.UTF8.GetBytes(html)));
pdfDocument.Close();
bytes = memoryStream.GetBuffer();
memoryStream.Close();
}
return bytes;
}
public class iHeaderFooter : PdfPageEventHelper
{
private readonly string _html;
public iHeaderFooter(string html)
{
_html = html;
}
public override void OnStartPage(PdfWriter writer, Document document)
{
var cssResolver = new StyleAttrCSSResolver();
XMLWorkerFontProvider fontProvider =
new XMLWorkerFontProvider(XMLWorkerFontProvider.DONTLOOKFORFONTS);
CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
htmlContext.SetTagFactory(Tags.GetHtmlTagProcessorFactory());
PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
HtmlPipeline html1 = new HtmlPipeline(htmlContext, pdf);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html1);
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
p.Parse(new StringReader(_html));
// Attempt at forcing a page break; it did not work
/*using (TextReader htmlViewReader = new StringReader(_html))
{
using (var htmlWorker = new HeaderFooterAdd.HTMLWorkerExtended(document))
{
htmlWorker.Open();
htmlWorker.Parse(htmlViewReader);
}
}*/
base.OnStartPage(writer, document);
}
}
}
// Add logo image to header left side
public class AddImageToHeader : PdfPageEventHelper
{
private readonly string _html;
public AddImageToHeader(string html)
{
_html = html;
}
public override void OnStartPage(PdfWriter writer, Document document)
{
iTextSharp.text.Image imghead = iTextSharp.text.Image.GetInstance("C:/logo.png");
imghead.ScaleAbsolute(189f, 79f);
imghead.SetAbsolutePosition(30, 0);
PdfContentByte cbhead = writer.DirectContent;
PdfTemplate tp = cbhead.CreateTemplate(320, 100);
tp.AddImage(imghead);
cbhead.AddTemplate(tp, 0, 830 - 95);
base.OnStartPage(writer, document);
}
}
public class HeaderFooterAdd : PdfPageEventHelper
{
//C# Header
public override void OnEndPage(PdfWriter writer, Document document)
{
PdfPTable tbHeader = new PdfPTable(2);
tbHeader.TotalWidth = document.PageSize.Width - document.LeftMargin - document.RightMargin;
tbHeader.DefaultCell.Border = Rectangle.NO_BORDER;
tbHeader.DefaultCell.BorderWidth = 0;
tbHeader.DefaultCell.Top = 100;
tbHeader.DefaultCell.Bottom = 100;
tbHeader.AddCell(new Paragraph());
Phrase datePhrase = new Phrase(new Chunk($"{"Label"}: {"Text"}\n", FontFactory.GetFont(FontFactory.TIMES, 10, Font.NORMAL, BaseColor.BLACK)));
PdfPCell _cell = new PdfPCell(datePhrase);
_cell.HorizontalAlignment = Element.ALIGN_RIGHT;
_cell.BorderWidthBottom = 0f;
_cell.BorderWidthLeft = 0f;
_cell.BorderWidthTop = 0f;
_cell.BorderWidthRight = 0f;
_cell.PaddingTop = 45f;
_cell.ExtraParagraphSpace = 2f;
tbHeader.AddCell(_cell);
tbHeader.AddCell(new Paragraph());
tbHeader.WriteSelectedRows(0, -1, document.Left,
writer.PageSize.GetTop(document.TopMargin) + 40,
writer.DirectContent);
}
// Attempt at forcing a page break; it did not work
/*public class HTMLWorkerExtended : HTMLWorker
{
public HTMLWorkerExtended(IDocListener document) :
base(document)
{
}
public override void StartElement(string tag,
IDictionary<string, string> str)
{
if (tag.Equals("newpage"))
document.Add(Chunk.NEXTPAGE);
else
base.StartElement(tag, str);
}
}*/
}
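A commonly suggested pattern for repeating headers in iText 5, sketched here under assumptions rather than as a verified fix for the code above: parse the header HTML once into an `ElementList`, then draw it in `OnEndPage` with a `ColumnText` on the writer's direct content, instead of parsing into the document inside `OnStartPage`. The body then only stays clear of the header if the document's top margin is at least as tall as the header. The class name `HtmlHeaderEvent` and the `headerHeight` parameter are hypothetical.

```csharp
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

// Hypothetical page event: draws an HTML header in the top margin of every page.
public class HtmlHeaderEvent : PdfPageEventHelper
{
    private readonly ElementList _header;
    private readonly float _headerHeight;

    public HtmlHeaderEvent(string headerHtml, float headerHeight)
    {
        // Parse once; the elements are re-added to a fresh ColumnText per page.
        _header = XMLWorkerHelper.ParseToElementList(headerHtml, null);
        _headerHeight = headerHeight;
    }

    public override void OnEndPage(PdfWriter writer, Document document)
    {
        // Draw at an absolute position so the body flow is not disturbed.
        var column = new ColumnText(writer.DirectContent);

        // The rectangle sits directly above the body area (inside the top margin).
        column.SetSimpleColumn(
            document.Left,
            document.Top,
            document.Right,
            document.Top + _headerHeight);

        foreach (IElement element in _header)
        {
            column.AddElement(element);
        }
        column.Go();
    }
}
```

With a header height of 80 points, for example, the document could be created as `new Document(PageSize.A4, 36, 36, 36 + 80, 36)` and the event attached with `writer.PageEvent = new HtmlHeaderEvent(headerHtml, 80f)` before `Open()` is called. Dynamic content is handled by building `headerHtml` per document.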

why courier font not working in iText PDF document?

I am using the following code to create a PDF document in C# with iText 5. The text does not render in the Courier font. Why not?
private void SimpleFontDoc(string pdfDocPath)
{
Document doc = new Document(PageSize.LETTER, 10, 10, 42, 30);
var fs = new FileStream(pdfDocPath, FileMode.Create);
PdfWriter writer = PdfWriter.GetInstance(doc, fs);
doc.Open();
string[] lines = new string[]
{
"First text line",
"Second text line"
};
var font = FontFactory.GetFont("courier", 12.0f, BaseColor.BLACK);
foreach (var line in lines)
{
var para = new iTextSharp.text.Paragraph(line);
para.Font = font;
doc.Add(para);
}
doc.Close();
}

In iText5 you have to specify the font before adding text to the Paragraph element (or alternatively pass it to the constructor).
Change
var para = new iTextSharp.text.Paragraph(line);
para.Font = font;
into
var para = new iTextSharp.text.Paragraph(line, font);

Creating a PDF file?

I looked into iTextSharp and SharpPDF and Report.Net as well as PDFSharp.
These open source projects either lack good documentation or do not work with VS 2012.
Does anyone have a recommended solution or can show me the documentation?
My employer blocks many sites and although Google is not blocked, some of the results are.
I plan on using C# with WinForms and obtaining my data from an Access DB

@Cocoa Dev, here is a complete example with different functions.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.IO;
using iTextSharp.text.pdf;
using System.Data;
using System.Text;
using System.util.collections;
using iTextSharp.text;
using System.Net.Mail;
public partial class PDFScenarios : System.Web.UI.Page
{
public string P_InputStream = "~/MyPDFTemplates/ex1.pdf";
public string P_InputStream2 = "~/MyPDFTemplates/ContactInfo.pdf";
public string P_InputStream3 = "~/MyPDFTemplates/MulPages.pdf";
public string P_InputStream4 = "~/MyPDFTemplates/CompanyLetterHead.pdf";
public string P_OutputStream = "~/MyPDFOutputs/ex1_1.pdf";
//Read all 'Form values/keys' from an existing multi-page PDF document
public void ReadPDFformDataPageWise()
{
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream3));
AcroFields form = reader.AcroFields;
try
{
for (int page = 1; page <= reader.NumberOfPages; page++)
{
foreach (KeyValuePair<string, AcroFields.Item> kvp in form.Fields)
{
switch (form.GetFieldType(kvp.Key))
{
case AcroFields.FIELD_TYPE_CHECKBOX:
case AcroFields.FIELD_TYPE_COMBO:
case AcroFields.FIELD_TYPE_LIST:
case AcroFields.FIELD_TYPE_RADIOBUTTON:
case AcroFields.FIELD_TYPE_NONE:
case AcroFields.FIELD_TYPE_PUSHBUTTON:
case AcroFields.FIELD_TYPE_SIGNATURE:
case AcroFields.FIELD_TYPE_TEXT:
int fileType = form.GetFieldType(kvp.Key);
string fieldValue = form.GetField(kvp.Key);
string translatedFileName = form.GetTranslatedFieldName(kvp.Key);
break;
}
}
}
}
catch
{
}
finally
{
reader.Close();
}
}
//Read and alter form values for only the first and
//second page of an existing multi page PDF doc.
//Save the changes in a brand new pdf file.
public void ReadAlterPDFformDataInSelectedPages()
{
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream3));
reader.SelectPages("1-2"); //Work with only page# 1 & 2
using (PdfStamper stamper = new PdfStamper(reader, new FileStream(Server.MapPath(P_OutputStream), FileMode.Create)))
{
AcroFields form = stamper.AcroFields;
var fieldKeys = form.Fields.Keys;
foreach (string fieldKey in fieldKeys)
{
//Replace Address Form field with my custom data
if (fieldKey.Contains("Address"))
{
form.SetField(fieldKey, "MyCustomAddress");
}
}
//The below will make sure the fields are not editable in
//the output PDF.
stamper.FormFlattening = true;
}
}
//Extract text from an existing PDF's second page.
private string ExtractText()
{
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream3));
string txt = PdfTextExtractor.GetTextFromPage(reader, 2, new LocationTextExtractionStrategy());
return txt;
}
//Create a brand new PDF from scratch and without a template
private void CreatePDFNoTemplate()
{
Document pdfDoc = new Document();
PdfWriter writer = PdfWriter.GetInstance(pdfDoc, new FileStream(Server.MapPath(P_OutputStream), FileMode.OpenOrCreate));
pdfDoc.Open();
pdfDoc.Add(new Paragraph("Some data"));
PdfContentByte cb = writer.DirectContent;
cb.MoveTo(pdfDoc.PageSize.Width / 2, pdfDoc.PageSize.Height / 2);
cb.LineTo(pdfDoc.PageSize.Width / 2, pdfDoc.PageSize.Height);
cb.Stroke();
pdfDoc.Close();
}
private void fillPDFForm()
{
string formFile = Server.MapPath(P_InputStream);
string newFile = Server.MapPath(P_OutputStream);
PdfReader reader = new PdfReader(formFile);
using (PdfStamper stamper = new PdfStamper(reader, new FileStream(newFile, FileMode.Create)))
{
AcroFields fields = stamper.AcroFields;
// set form fields
fields.SetField("name", "John Doe");
fields.SetField("address", "xxxxx, yyyy");
fields.SetField("postal_code", "12345");
fields.SetField("email", "johndoe@xxx.com");
// flatten form fields and close document
stamper.FormFlattening = true;
stamper.Close();
}
}
//Helper functions
private void SendEmail(MemoryStream ms)
{
MailAddress _From = new MailAddress("XXX@domain.com");
MailAddress _To = new MailAddress("YYY@a.com");
MailMessage email = new MailMessage(_From, _To);
Attachment attach = new Attachment(ms, new System.Net.Mime.ContentType("application/pdf"));
email.Attachments.Add(attach);
SmtpClient mailSender = new SmtpClient("Gmail-Server");
mailSender.Send(email);
}
private void DownloadAsPDF(MemoryStream ms)
{
Response.Clear();
Response.ClearContent();
Response.ClearHeaders();
Response.ContentType = "application/pdf";
Response.AppendHeader("Content-Disposition", "attachment;filename=abc.pdf");
// Write only the bytes actually used; GetBuffer() may include unused capacity
Response.OutputStream.Write(ms.GetBuffer(), 0, (int)ms.Length);
Response.OutputStream.Flush();
Response.OutputStream.Close();
Response.End();
ms.Close();
}
//Working with Memory Stream and PDF
public void CreatePDFFromMemoryStream()
{
//(1)using PDFWriter
Document doc = new Document();
MemoryStream memoryStream = new MemoryStream();
PdfWriter writer = PdfWriter.GetInstance(doc, memoryStream);
doc.Open();
doc.Add(new Paragraph("Some Text"));
writer.CloseStream = false;
doc.Close();
//Get the pointer to the beginning of the stream.
memoryStream.Position = 0;
//You may use this PDF in memorystream to send as an attachment in an email
//OR download as a PDF
SendEmail(memoryStream);
DownloadAsPDF(memoryStream);
//(2)Another way using PdfStamper
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream2));
using (MemoryStream ms = new MemoryStream())
{
PdfStamper stamper = new PdfStamper(reader, ms);
AcroFields fields = stamper.AcroFields;
fields.SetField("SomeField", "MyValueFromDB");
stamper.FormFlattening = true;
stamper.Close();
SendEmail(ms);
}
}
//Burst-- Make each page of an existing multi-page PDF document
//as another brand new PDF document
private void PDFBurst()
{
string pdfTemplatePath = Server.MapPath(P_InputStream3);
PdfReader reader = new PdfReader(pdfTemplatePath);
//PdfCopy copy;
PdfSmartCopy copy;
for (int i = 1; i <= reader.NumberOfPages; i++) // pages are 1-based; include the last page
{
Document d1 = new Document();
copy = new PdfSmartCopy(d1, new FileStream(Server.MapPath(P_OutputStream).Replace(".pdf", i.ToString() + ".pdf"), FileMode.Create));
d1.Open();
copy.AddPage(copy.GetImportedPage(reader, i));
d1.Close();
}
}
//Copy a set of form fields from an existing PDF template/doc
//and keep appending to a brand new PDF file.
//The copied set of fields will have different values.
private void AppendSetOfFormFields()
{
PdfCopyFields _copy = new PdfCopyFields(new FileStream(Server.MapPath(P_OutputStream), FileMode.Create));
_copy.AddDocument(new PdfReader(a1("1")));
_copy.AddDocument(new PdfReader(a1("2")));
_copy.AddDocument(new PdfReader(new FileStream(Server.MapPath("~/MyPDFTemplates/Myaspx.pdf"), FileMode.Open)));
_copy.Close();
}
//ConcatenateForms
private byte[] a1(string _ToAppend)
{
using (var existingFileStream = new FileStream(Server.MapPath(P_InputStream), FileMode.Open))
using (MemoryStream stream = new MemoryStream())
{
// Open existing PDF
var pdfReader = new PdfReader(existingFileStream);
var stamper = new PdfStamper(pdfReader, stream);
var form = stamper.AcroFields;
var fieldKeys = form.Fields.Keys;
foreach (string fieldKey in fieldKeys)
{
form.RenameField(fieldKey, fieldKey + _ToAppend);
}
// "Flatten" the form so it wont be editable/usable anymore
stamper.FormFlattening = true;
stamper.Close();
pdfReader.Close();
return stream.ToArray();
}
}
//Working with Image
private void AddAnImage()
{
using (var inputPdfStream = new FileStream(@"C:\MyInput.pdf", FileMode.Open))
using (var inputImageStream = new FileStream(@"C:\img1.jpg", FileMode.Open))
using (var outputPdfStream = new FileStream(@"C:\MyOutput.pdf", FileMode.Create))
{
PdfReader reader = new PdfReader(inputPdfStream);
PdfStamper stamper = new PdfStamper(reader, outputPdfStream);
PdfContentByte pdfContentByte = stamper.GetOverContent(1);
var image = iTextSharp.text.Image.GetInstance(inputImageStream);
image.SetAbsolutePosition(1, 1);
pdfContentByte.AddImage(image);
stamper.Close();
}
}
//Add Company Letter-Head/Stationery to an existing pdf
private void AddCompanyStationary()
{
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream2));
PdfReader s_reader = new PdfReader(Server.MapPath(P_InputStream4));
using (PdfStamper stamper = new PdfStamper(reader, new FileStream(Server.MapPath(P_OutputStream), FileMode.Create)))
{
PdfImportedPage page = stamper.GetImportedPage(s_reader, 1);
int n = reader.NumberOfPages;
PdfContentByte background;
for (int i = 1; i <= n; i++)
{
background = stamper.GetUnderContent(i);
background.AddTemplate(page, 0, 0);
}
stamper.Close();
}
}
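One detail worth illustrating for the memory-stream helpers in this example: `MemoryStream.GetBuffer()` returns the whole internal buffer, which is usually larger than the data written, while `ToArray()` copies only the written bytes. The sketch below uses only the .NET base class library; the class and method names are made up for illustration.

```csharp
using System;
using System.IO;

public static class MemoryStreamBytes
{
    // Returns a copy of only the bytes that were written to the stream.
    public static byte[] WrittenBytes(MemoryStream stream)
    {
        return stream.ToArray();
    }

    // Returns true when the internal buffer is larger than the written data,
    // which is why GetBuffer().Length should not be used as the PDF length.
    public static bool BufferHasSlack(MemoryStream stream)
    {
        return stream.GetBuffer().Length > stream.Length;
    }
}
```

When a generated PDF is returned as a `byte[]` or written to a response, using `ToArray()` (or `GetBuffer()` together with `Length`) avoids sending trailing unused bytes.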

Try this example:
using iTextSharp.text;
// Set up the fonts to be used on the pages
private Font _largeFont = new Font(Font.FontFamily.HELVETICA, 18, Font.BOLD, BaseColor.BLACK);
private Font _standardFont = new Font(Font.FontFamily.HELVETICA, 14, Font.NORMAL, BaseColor.BLACK);
private Font _smallFont = new Font(Font.FontFamily.HELVETICA, 10, Font.NORMAL, BaseColor.BLACK);
public void Build()
{
iTextSharp.text.Document doc = null;
try
{
// Initialize the PDF document
doc = new Document();
iTextSharp.text.pdf.PdfWriter writer = iTextSharp.text.pdf.PdfWriter.GetInstance(doc,
new System.IO.FileStream(System.IO.Directory.GetCurrentDirectory() + "\\ScienceReport.pdf",
System.IO.FileMode.Create));
// Set margins and page size for the document
doc.SetMargins(50, 50, 50, 50);
// There are a huge number of possible page sizes, including such sizes as
// EXECUTIVE, LEGAL, LETTER_LANDSCAPE, and NOTE
doc.SetPageSize(new iTextSharp.text.Rectangle(iTextSharp.text.PageSize.LETTER.Width,
iTextSharp.text.PageSize.LETTER.Height));
// Add metadata to the document. This information is visible when viewing the
// document properties within Adobe Reader.
doc.AddTitle("My Science Report");
doc.AddCreator("M. Lichtenberg");
doc.AddKeywords("paper airplanes");
// Add Xmp metadata to the document.
this.CreateXmpMetadata(writer);
// Open the document for writing content
doc.Open();
// Add pages to the document
this.AddPageWithBasicFormatting(doc);
this.AddPageWithInternalLinks(doc);
this.AddPageWithBulletList(doc);
this.AddPageWithExternalLinks(doc);
this.AddPageWithImage(doc, System.IO.Directory.GetCurrentDirectory() + "\\FinalGraph.jpg");
// Add page labels to the document
iTextSharp.text.pdf.PdfPageLabels pdfPageLabels = new iTextSharp.text.pdf.PdfPageLabels();
pdfPageLabels.AddPageLabel(1, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "Basic Formatting");
pdfPageLabels.AddPageLabel(2, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "Internal Links");
pdfPageLabels.AddPageLabel(3, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "Bullet List");
pdfPageLabels.AddPageLabel(4, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "External Links");
pdfPageLabels.AddPageLabel(5, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "Image");
writer.PageLabels = pdfPageLabels;
}
catch (iTextSharp.text.DocumentException dex)
{
// Handle iTextSharp errors
}
finally
{
// Clean up (doc stays null if construction failed)
if (doc != null)
{
doc.Close();
}
doc = null;
}
}

You can always just create an html page and then convert that to pdf using wkhtmltopdf. This has the benefit of you not having to construct the pdf with a library such as iText. You just make a text file (html) and then pass it to the wkhtmltopdf executable.
See Calling wkhtmltopdf to generate PDF from HTML for more info.
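As a rough sketch of that approach, the executable can be started from C# with `System.Diagnostics.Process`, passing the input HTML path and the output PDF path as its two arguments. The class name, method names, and the simple quoting are assumptions made for illustration; the quoting assumes the paths contain no double quotes and do not end in a backslash.

```csharp
using System;
using System.Diagnostics;

public static class WkHtmlToPdfSketch
{
    // Builds the start info for: wkhtmltopdf "<inputHtml>" "<outputPdf>"
    public static ProcessStartInfo BuildStartInfo(string exePath, string inputHtml, string outputPdf)
    {
        return new ProcessStartInfo
        {
            FileName = exePath,
            Arguments = "\"" + inputHtml + "\" \"" + outputPdf + "\"",
            UseShellExecute = false,
            CreateNoWindow = true
        };
    }

    // Runs the converter and returns its exit code (0 normally means success).
    public static int Convert(string exePath, string inputHtml, string outputPdf)
    {
        using (var process = Process.Start(BuildStartInfo(exePath, inputHtml, outputPdf)))
        {
            process.WaitForExit();
            return process.ExitCode;
        }
    }
}
```

A call might look like `WkHtmlToPdfSketch.Convert(pathToExe, "report.html", "report.pdf")`, where `pathToExe` is wherever wkhtmltopdf is installed on the machine.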

Creating multiple page pdf using iTextSharp

I am trying to create pdf with multiple pages using iTextSharp
Document document = new Document(PageSize.A4, 2, 2, 10, 10);
private PdfContentByte _pcb;
try
{
PdfWriter writer = PdfWriter.GetInstance(document, output);
document.Open();
document.NewPage();
_pcb = writer.DirectContent;
_pcb.BeginText();
_pcb.ShowTextAligned(PdfContentByte.ALIGN_LEFT, text, x, y, 0);
_pcb.EndText();
writer.Flush();
}
catch (Exception)
{
}
finally
{
document.Close();
}
This works fine for me. When I try to add a new page to the same document, the new text replaces the existing text and no new page is added. Below is the code that does not work.
_pcb.EndText();
writer.Flush();
document.NewPage();
_pcb = writer.DirectContent;
_pcb.BeginText();
_pcb.ShowTextAligned(PdfContentByte.ALIGN_LEFT, text, x, y, 0);
_pcb.EndText();
writer.Flush();

Below is my attempt to clean up and unify your code. Generally, avoid try-catch until you actually need it; otherwise you will often miss some very important errors. (For instance, you're not setting the font and size, which is required, but maybe you just omitted that code.) Also, unless you are writing a very large PDF, there's really no reason to flush the buffers; leave that to the OS to do for you when necessary.
When I run the code below I get two pages with text on both pages. Does it work for you? (Targeting iTextSharp 5.2.0.0)
var output = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Output.pdf");
var bf = BaseFont.CreateFont(BaseFont.HELVETICA, BaseFont.CP1250, BaseFont.NOT_EMBEDDED);
using (FileStream fs = new FileStream(output, FileMode.Create, FileAccess.Write, FileShare.None)) {
using (Document doc = new Document(PageSize.A4, 2, 2, 10, 10)) {
PdfContentByte _pcb;
using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
//Open document for writing
doc.Open();
//Insert page
doc.NewPage();
//Alias to DirectContent
_pcb = writer.DirectContent;
//Set the font and size
_pcb.SetFontAndSize(bf, 12);
//Show some text
_pcb.BeginText();
_pcb.ShowTextAligned(PdfContentByte.ALIGN_LEFT, "Page 1", 40, 600, 0);
_pcb.EndText();
//Insert a new page
doc.NewPage();
//Re-set font and size
_pcb.SetFontAndSize(bf, 12);
//Show more text on page 2
_pcb.BeginText();
_pcb.ShowTextAligned(PdfContentByte.ALIGN_LEFT, "Page 2", 100, 400, 0);
_pcb.EndText();
doc.Close();
}
}
}

Why do you use the DirectContent? If you just want to create a PDF from scratch, just add content to the Document.
// Declare doc outside the try block so that finally can see it
iTextSharp.text.Document doc = new iTextSharp.text.Document();
try
{
PdfWriter.GetInstance(doc, new FileStream("HelloWorld.pdf", FileMode.Create));
doc.Open();
doc.Add(new Paragraph("Hello World!"));
doc.NewPage();
doc.Add(new Paragraph("Hello World on a new page!"));
}
finally
{
doc.Close();
}

The code below works:
string str = "Page1Page1Page1Page1Page1Page1Page1Page1Page1Page1";
string str2 = "Page2Page2Page2Page2Page2Page2Page2Page2Page2Page2";
Document pdfDoc = new Document(PageSize.A4, 10f, 10f, 10f, 0f);
PdfWriter.GetInstance(pdfDoc, Response.OutputStream);
pdfDoc.Open();
using (var htmlWorker = new HTMLWokExtend(pdfDoc))
{
using (var sr = new StringReader(str))
{
htmlWorker.Parse(sr);
}
}
pdfDoc.NewPage();
using (var htmlWorker = new HTMLWokExtend(pdfDoc))
{
using (var sr = new StringReader(str2))
{
htmlWorker.Parse(sr);
}
}
pdfDoc.Close();
Response.ContentType = "application/pdf";
Response.AddHeader("content-disposition", "attachment;" +
"filename=Proforma_Invoice.pdf");
Response.Cache.SetCacheability(HttpCacheability.NoCache);
Response.Write(pdfDoc);
Response.End();
You can populate the string from an HTML file using the function below:
private string populatebody()
{
string body="";
using (StreamReader reader = new StreamReader(Server.MapPath("~/Dir/Page.html")))
{
body = reader.ReadToEnd();
}
body = body.Replace("{Content in htmlpage}", "Your Content");
return body;
}
Then pass the returned body as the string in the code above.
You can adapt this code to your requirements.
Below is the HTMLWokExtend class:
public class HTMLWokExtend : HTMLWorker
{
LineSeparator line = new LineSeparator(1f, 90f, BaseColor.GRAY, Element.ALIGN_CENTER, -12);
public HTMLWokExtend(IDocListener document) : base(document)
{
}
public override void StartElement(string tag, IDictionary<string, string> str)
{
if (tag.Equals("hrline"))
document.Add(new Chunk(line));
else
base.StartElement(tag, str);
}
}

Display Unicode characters in converting Html to Pdf

I am using itextsharp dll to convert HTML to PDF.
The HTML has some Unicode characters like α, β... when I try to convert HTML to PDF, Unicode characters are not shown in PDF.
My function:
Document doc = new Document(PageSize.LETTER);
using (FileStream fs = new FileStream(Path.Combine("Test.pdf"), FileMode.Create, FileAccess.Write, FileShare.Read))
{
PdfWriter.GetInstance(doc, fs);
doc.Open();
doc.NewPage();
string arialuniTff = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts),
"ARIALUNI.TTF");
BaseFont bf = BaseFont.CreateFont(arialuniTff, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
Font fontNormal = new Font(bf, 12, Font.NORMAL);
List<IElement> list = HTMLWorker.ParseToList(new StringReader(stringBuilder.ToString()),
new StyleSheet());
Paragraph p = new Paragraph {Font = fontNormal};
foreach (var element in list)
{
p.Add(element);
doc.Add(p);
}
doc.Close();
}

You can also use the newer XMLWorkerHelper (from the itextsharp.xmlworker library); however, you need to override the default FontFactory implementation.
void GeneratePdfFromHtml()
{
const string outputFilename = @".\Files\report.pdf";
const string inputFilename = @".\Files\report.html";
using (var input = new FileStream(inputFilename, FileMode.Open))
using (var output = new FileStream(outputFilename, FileMode.Create))
{
CreatePdf(input, output);
}
}
void CreatePdf(Stream htmlInput, Stream pdfOutput)
{
using (var document = new Document(PageSize.A4, 30, 30, 30, 30))
{
var writer = PdfWriter.GetInstance(document, pdfOutput);
var worker = XMLWorkerHelper.GetInstance();
document.Open();
worker.ParseXHtml(writer, document, htmlInput, null, Encoding.UTF8, new UnicodeFontFactory());
document.Close();
}
}
public class UnicodeFontFactory : FontFactoryImp
{
private static readonly string FontPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts),
"arialuni.ttf");
private readonly BaseFont _baseFont;
public UnicodeFontFactory()
{
_baseFont = BaseFont.CreateFont(FontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
}
public override Font GetFont(string fontname, string encoding, bool embedded, float size, int style, BaseColor color,
bool cached)
{
return new Font(_baseFont, size, style, color);
}
}

When dealing with Unicode characters and iTextSharp, there are a couple of things you need to take care of. The first one you did already: getting a font that supports your characters. The second is that you need to actually register the font with iTextSharp so that it's aware of it.
//Path to our font
string arialuniTff = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "ARIALUNI.TTF");
//Register the font with iTextSharp
iTextSharp.text.FontFactory.Register(arialuniTff);
Now that we have a font we need to create a StyleSheet object that tells iTextSharp when and how to use it.
//Create a new stylesheet
iTextSharp.text.html.simpleparser.StyleSheet ST = new iTextSharp.text.html.simpleparser.StyleSheet();
//Set the default body font to our registered font's internal name
ST.LoadTagStyle(HtmlTags.BODY, HtmlTags.FACE, "Arial Unicode MS");
The one non-HTML part that you also need to do is set a special encoding parameter. This encoding is specific to iTextSharp and in your case you want it to be Identity-H. If you don't set this, it defaults to Cp1252 (WINANSI).
//Set the default encoding to support Unicode characters
ST.LoadTagStyle(HtmlTags.BODY, HtmlTags.ENCODING, BaseFont.IDENTITY_H);
Lastly, we need to pass our stylesheet to the ParseToList method:
//Parse our HTML using the stylesheet created above
List<IElement> list = HTMLWorker.ParseToList(new StringReader(stringBuilder.ToString()), ST);
Putting that all together, from open to close you'd have:
doc.Open();
//Sample HTML
StringBuilder stringBuilder = new StringBuilder();
stringBuilder.Append(@"<p>This is a test: <strong>α,β</strong></p>");
//Path to our font
string arialuniTff = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "ARIALUNI.TTF");
//Register the font with iTextSharp
iTextSharp.text.FontFactory.Register(arialuniTff);
//Create a new stylesheet
iTextSharp.text.html.simpleparser.StyleSheet ST = new iTextSharp.text.html.simpleparser.StyleSheet();
//Set the default body font to our registered font's internal name
ST.LoadTagStyle(HtmlTags.BODY, HtmlTags.FACE, "Arial Unicode MS");
//Set the default encoding to support Unicode characters
ST.LoadTagStyle(HtmlTags.BODY, HtmlTags.ENCODING, BaseFont.IDENTITY_H);
//Parse our HTML using the stylesheet created above
List<IElement> list = HTMLWorker.ParseToList(new StringReader(stringBuilder.ToString()), ST);
//Loop through each element, don't bother wrapping in P tags
foreach (var element in list) {
doc.Add(element);
}
doc.Close();
EDIT
In your comment you show HTML that specifies an override font. iTextSharp does not spider the system for fonts and its HTML parser doesn't use font fallback techniques. Any fonts specified in HTML/CSS must be manually registered.
string lucidaTff = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "l_10646.ttf");
iTextSharp.text.FontFactory.Register(lucidaTff);

private class UnicodeFontFactory : FontFactoryImp
{
private BaseFont _baseFont;
public UnicodeFontFactory()
{
string FontPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts), "arialuni.ttf");
_baseFont = BaseFont.CreateFont(FontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
}
public override Font GetFont(string fontname, string encoding, bool embedded, float size, int style, BaseColor color, bool cached)
{
return new Font(_baseFont, size, style, color);
}
}
//and Code
FontFactory.FontImp = new UnicodeFontFactory();
string convertedHtml = string.Empty;
foreach (char c in htmlText)
{
if (c < 127)
convertedHtml += c;
else
convertedHtml += "&#" + (int)c + ";";
}
List<IElement> htmlElements = XMLWorkerHelper.ParseToElementList(convertedHtml, null);
// add the IElements to the document
foreach (IElement htmlElement in htmlElements)
{
document.Add(htmlElement);
}
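The character loop above replaces every non-ASCII character with a numeric HTML entity before parsing. A variant of the same idea, sketched with a `StringBuilder` to avoid repeated string concatenation and with surrogate pairs combined into a single code point, might look like this. The class and method names are hypothetical, and the cutoff here is 128 rather than the 127 used above.

```csharp
using System;
using System.Text;

public static class HtmlEntityEncoder
{
    // Replaces each non-ASCII character with a decimal numeric entity (&#NNN;).
    public static string EncodeNonAscii(string html)
    {
        var sb = new StringBuilder(html.Length);
        for (int i = 0; i < html.Length; i++)
        {
            char c = html[i];
            if (c < 128)
            {
                // Plain ASCII is copied through unchanged.
                sb.Append(c);
            }
            else if (char.IsHighSurrogate(c) && i + 1 < html.Length && char.IsLowSurrogate(html[i + 1]))
            {
                // Combine a surrogate pair into one code point and skip the low half.
                sb.Append("&#").Append(char.ConvertToUtf32(c, html[i + 1])).Append(';');
                i++;
            }
            else
            {
                sb.Append("&#").Append((int)c).Append(';');
            }
        }
        return sb.ToString();
    }
}
```

For example, `HtmlEntityEncoder.EncodeNonAscii("<p>α</p>")` yields `<p>&#945;</p>`, which could then be passed to `XMLWorkerHelper.ParseToElementList` as in the code above.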

This has to be one of the most difficult problems that I've had to figure out to date. The answers on the web, including Stack Overflow, contain either poor or outdated information. The answer from Gregor is very close. I wanted to give back to this community because I spent many hours getting to this answer.
Here's a very simple program I wrote in C# as an example for my own notes.
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;
namespace ExampleOfExportingPDF
{
class Program
{
static void Main(string[] args)
{
//Build HTML document
StringBuilder sb = new StringBuilder();
sb.Append("<body>");
sb.Append("<h1 style=\"text-align:center;\">これは日本語のテキストの例です。</h1>");
sb.Append("</body>");
//Create our document object
Document Doc = new Document(PageSize.A4);
//Create our file stream
using (FileStream fs = new FileStream(Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Test.pdf"), FileMode.Create, FileAccess.Write, FileShare.Read))
{
//Bind PDF writer to document and stream
PdfWriter writer = PdfWriter.GetInstance(Doc, fs);
//Open document for writing
Doc.Open();
//Add a page
Doc.NewPage();
MemoryStream msHtml = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(sb.ToString()));
XMLWorkerHelper.GetInstance().ParseXHtml(writer, Doc, msHtml, null, Encoding.UTF8, new UnicodeFontFactory());
//Close the PDF
Doc.Close();
}
}
public class UnicodeFontFactory : FontFactoryImp
{
private static readonly string FontPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Fonts),
"arialuni.ttf");
private readonly BaseFont _baseFont;
public UnicodeFontFactory()
{
_baseFont = BaseFont.CreateFont(FontPath, BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
}
public override Font GetFont(string fontname, string encoding, bool embedded, float size, int style, BaseColor color,
bool cached)
{
return new Font(_baseFont, size, style, color);
}
}
}
}
Hopefully this will save someone some time in the future.

Here are the steps to display Unicode characters when converting HTML to PDF:
Create an HTMLWorker
Register a Unicode font and assign it
Create a style sheet and set the encoding to Identity-H
Assign the style sheet to the HTML parser
Hindi, Turkish, and special characters are also displayed when converting from HTML to PDF using this method.
