iTextSharp How to Add Custom Properties to PDF-A?

iTextSharp How to Add Custom Properties to PDF-A? - c#

Basically, i have original XFA file with many Custom > Properties and i can convert it to PDF-A but all of properties have lost after convert.
Here is my ConvertToPDFa Code:
public byte[] ConvertToPDFa3(byte[] OriginPDF,byte[] FileStore,string XML)
{
//***************************************
// Convert PDF to PDFa3 and Attach XML .
//***************************************
//Create Reader for Reading PDF in byte[]
using (var reader = new PdfReader(OriginPDF))
{
// Open Stream for Converting
using (var Convertstream = new MemoryStream())
{
// New Doc for PDF/A-3
Document pdfAdocument = new Document();
//Init instance for pdfAdocument
PdfAWriter writer = PdfAWriter.GetInstance(pdfAdocument, Convertstream, PdfAConformanceLevel.PDF_A_3U);
writer.CreateXmpMetadata();
if (!pdfAdocument.IsOpen())
pdfAdocument.Open();
PdfContentByte cb = writer.DirectContent; // Holds the PDF data
//Count original PDF pages and uses for pdfa-3
PdfImportedPage page;
int pageCount = reader.NumberOfPages;
for (int i = 0; i < pageCount; i++)
{
pdfAdocument.NewPage();
page = writer.GetImportedPage(reader, i + 1);
cb.AddTemplate(page, 0, 0);
}
//Create Output Intents
ICC_Profile icc = ICC_Profile.GetInstance(AppDomain.CurrentDomain.BaseDirectory + "res\\sRGB Color Space Profile.icm");
writer.SetOutputIntents("sRGB IEC61966-2.1", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);
//Embedded File to PDFa3
writer.AddFileAttachment("XML for : " + certRefNumber, FileStore, XML, lblRefNo.Text + ".xml");
//Close all file to finish
pdfAdocument.Close();
reader.Close();
//Test writes output in Phys path
//File.WriteAllBytes(#"OutPDFa.pdf", Convertstream.ToArray());
byte[] filledPDfa = Convertstream.ToArray();
return filledPDfa;
}
}
// End Convert and Attached
}
i tried get origin properties with these code :
static Dictionary<string, string> GetPdfProperties(byte[] originfile)
{
Dictionary<string, string> propertyInfo = null;
using (PdfReader reader = new PdfReader(originfile))
{
propertyInfo = reader.Info;
reader.Close();
}
return propertyInfo;
}
And Copy origin properties to PDF-A like this :
static byte[] CopyProps(byte[] fileinput,byte[] fileOrigin,PdfAWriter wr)
{
using (var rd = new PdfReader(fileinput))
{
using (var outp = new MemoryStream())
{
using (var stamper = new PdfAStamper(rd, outp,PdfAConformanceLevel.PDF_A_3U))
{
var info = rd.Info;
Dictionary<string, string> propertyInfo = GetPdfProperties(fileOrigin);
foreach (KeyValuePair<string, string> property in propertyInfo)
{
info[property.Key] = property.Value;
}
stamper.MoreInfo = info;
using (var ms = new MemoryStream())
{
wr.CreateXmpWriter(ms,info); //****error was here
stamper.XmpMetadata = ms.ToArray();
}
}
return outp.ToArray();
}
}
}
but found the error was inaccesable due to its protected level
Should i place the error line to another ?
Or
Have another solution for these problem ?
Please Help :(
i use iTextSharp 5.5.13

Related

Creating a PDF file?

I looked into iTextSharp and SharpPDF and Report.Net as well as PDFSharp.
None of these open source projects have good documentation OR do not work with VS 2012.
Does anyone have a recommended solution or can show me the documentation?
My employer blocks many sites and although Google is not blocked, some of the results are.
I plan on using C# with WinForms and obtaining my data from an Access DB

Hey #Cocoa Dev get this a complete example with diferent functions.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.IO;
using iTextSharp.text.pdf;
using System.Data;
using System.Text;
using System.util.collections;
using iTextSharp.text;
using System.Net.Mail;
public partial class PDFScenarios : System.Web.UI.Page
{
public string P_InputStream = "~/MyPDFTemplates/ex1.pdf";
public string P_InputStream2 = "~/MyPDFTemplates/ContactInfo.pdf";
public string P_InputStream3 = "~/MyPDFTemplates/MulPages.pdf";
public string P_InputStream4 = "~/MyPDFTemplates/CompanyLetterHead.pdf";
public string P_OutputStream = "~/MyPDFOutputs/ex1_1.pdf";
//Read all 'Form values/keys' from an existing multi-page PDF document
public void ReadPDFformDataPageWise()
{
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream3));
AcroFields form = reader.AcroFields;
try
{
for (int page = 1; page <= reader.NumberOfPages; page++)
{
foreach (KeyValuePair<string, AcroFields.Item> kvp in form.Fields)
{
switch (form.GetFieldType(kvp.Key))
{
case AcroFields.FIELD_TYPE_CHECKBOX:
case AcroFields.FIELD_TYPE_COMBO:
case AcroFields.FIELD_TYPE_LIST:
case AcroFields.FIELD_TYPE_RADIOBUTTON:
case AcroFields.FIELD_TYPE_NONE:
case AcroFields.FIELD_TYPE_PUSHBUTTON:
case AcroFields.FIELD_TYPE_SIGNATURE:
case AcroFields.FIELD_TYPE_TEXT:
int fileType = form.GetFieldType(kvp.Key);
string fieldValue = form.GetField(kvp.Key);
string translatedFileName = form.GetTranslatedFieldName(kvp.Key);
break;
}
}
}
}
catch
{
}
finally
{
reader.Close();
}
}
//Read and alter form values for only second and
//third page of an existing multi page PDF doc.
//Save the changes in a brand new pdf file.
public void ReadAlterPDFformDataInSelectedPages()
{
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream3));
reader.SelectPages("1-2"); //Work with only page# 1 & 2
using (PdfStamper stamper = new PdfStamper(reader, new FileStream(Server.MapPath(P_OutputStream), FileMode.Create)))
{
AcroFields form = stamper.AcroFields;
var fieldKeys = form.Fields.Keys;
foreach (string fieldKey in fieldKeys)
{
//Replace Address Form field with my custom data
if (fieldKey.Contains("Address"))
{
form.SetField(fieldKey, "MyCustomAddress");
}
}
//The below will make sure the fields are not editable in
//the output PDF.
stamper.FormFlattening = true;
}
}
//Extract text from an existing PDF's second page.
private string ExtractText()
{
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream3));
string txt = PdfTextExtractor.GetTextFromPage(reader, 2, new LocationTextExtractionStrategy());
return txt;
}
//Create a brand new PDF from scratch and without a template
private void CreatePDFNoTemplate()
{
Document pdfDoc = new Document();
PdfWriter writer = PdfWriter.GetInstance(pdfDoc, new FileStream(Server.MapPath(P_OutputStream), FileMode.OpenOrCreate));
pdfDoc.Open();
pdfDoc.Add(new Paragraph("Some data"));
PdfContentByte cb = writer.DirectContent;
cb.MoveTo(pdfDoc.PageSize.Width / 2, pdfDoc.PageSize.Height / 2);
cb.LineTo(pdfDoc.PageSize.Width / 2, pdfDoc.PageSize.Height);
cb.Stroke();
pdfDoc.Close();
}
private void fillPDFForm()
{
string formFile = Server.MapPath(P_InputStream);
string newFile = Server.MapPath(P_OutputStream);
PdfReader reader = new PdfReader(formFile);
using (PdfStamper stamper = new PdfStamper(reader, new FileStream(newFile, FileMode.Create)))
{
AcroFields fields = stamper.AcroFields;
// set form fields
fields.SetField("name", "John Doe");
fields.SetField("address", "xxxxx, yyyy");
fields.SetField("postal_code", "12345");
fields.SetField("email", "johndoe#xxx.com");
// flatten form fields and close document
stamper.FormFlattening = true;
stamper.Close();
}
}
//Helper functions
private void SendEmail(MemoryStream ms)
{
MailAddress _From = new MailAddress("XXX#domain.com");
MailAddress _To = new MailAddress("YYY#a.com");
MailMessage email = new MailMessage(_From, _To);
Attachment attach = new Attachment(ms, new System.Net.Mime.ContentType("application/pdf"));
email.Attachments.Add(attach);
SmtpClient mailSender = new SmtpClient("Gmail-Server");
mailSender.Send(email);
}
private void DownloadAsPDF(MemoryStream ms)
{
Response.Clear();
Response.ClearContent();
Response.ClearHeaders();
Response.ContentType = "application/pdf";
Response.AppendHeader("Content-Disposition", "attachment;filename=abc.pdf");
Response.OutputStream.Write(ms.GetBuffer(), 0, ms.GetBuffer().Length);
Response.OutputStream.Flush();
Response.OutputStream.Close();
Response.End();
ms.Close();
}
//Working with Memory Stream and PDF
public void CreatePDFFromMemoryStream()
{
//(1)using PDFWriter
Document doc = new Document();
MemoryStream memoryStream = new MemoryStream();
PdfWriter writer = PdfWriter.GetInstance(doc, memoryStream);
doc.Open();
doc.Add(new Paragraph("Some Text"));
writer.CloseStream = false;
doc.Close();
//Get the pointer to the beginning of the stream.
memoryStream.Position = 0;
//You may use this PDF in memorystream to send as an attachment in an email
//OR download as a PDF
SendEmail(memoryStream);
DownloadAsPDF(memoryStream);
//(2)Another way using PdfStamper
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream2));
using (MemoryStream ms = new MemoryStream())
{
PdfStamper stamper = new PdfStamper(reader, ms);
AcroFields fields = stamper.AcroFields;
fields.SetField("SomeField", "MyValueFromDB");
stamper.FormFlattening = true;
stamper.Close();
SendEmail(ms);
}
}
//Burst-- Make each page of an existing multi-page PDF document
//as another brand new PDF document
private void PDFBurst()
{
string pdfTemplatePath = Server.MapPath(P_InputStream3);
PdfReader reader = new PdfReader(pdfTemplatePath);
//PdfCopy copy;
PdfSmartCopy copy;
for (int i = 1; i < reader.NumberOfPages; i++)
{
Document d1 = new Document();
copy = new PdfSmartCopy(d1, new FileStream(Server.MapPath(P_OutputStream).Replace(".pdf", i.ToString() + ".pdf"), FileMode.Create));
d1.Open();
copy.AddPage(copy.GetImportedPage(reader, i));
d1.Close();
}
}
//Copy a set of form fields from an existing PDF template/doc
//and keep appending to a brand new PDF file.
//The copied set of fields will have different values.
private void AppendSetOfFormFields()
{
PdfCopyFields _copy = new PdfCopyFields(new FileStream(Server.MapPath(P_OutputStream), FileMode.Create));
_copy.AddDocument(new PdfReader(a1("1")));
_copy.AddDocument(new PdfReader(a1("2")));
_copy.AddDocument(new PdfReader(new FileStream(Server.MapPath("~/MyPDFTemplates/Myaspx.pdf"), FileMode.Open)));
_copy.Close();
}
//ConcatenateForms
private byte[] a1(string _ToAppend)
{
using (var existingFileStream = new FileStream(Server.MapPath(P_InputStream), FileMode.Open))
using (MemoryStream stream = new MemoryStream())
{
// Open existing PDF
var pdfReader = new PdfReader(existingFileStream);
var stamper = new PdfStamper(pdfReader, stream);
var form = stamper.AcroFields;
var fieldKeys = form.Fields.Keys;
foreach (string fieldKey in fieldKeys)
{
form.RenameField(fieldKey, fieldKey + _ToAppend);
}
// "Flatten" the form so it wont be editable/usable anymore
stamper.FormFlattening = true;
stamper.Close();
pdfReader.Close();
return stream.ToArray();
}
}
//Working with Image
private void AddAnImage()
{
using (var inputPdfStream = new FileStream(#"C:\MyInput.pdf", FileMode.Open))
using (var inputImageStream = new FileStream(#"C:\img1.jpg", FileMode.Open))
using (var outputPdfStream = new FileStream(#"C:\MyOutput.pdf", FileMode.Create))
{
PdfReader reader = new PdfReader(inputPdfStream);
PdfStamper stamper = new PdfStamper(reader, outputPdfStream);
PdfContentByte pdfContentByte = stamper.GetOverContent(1);
var image = iTextSharp.text.Image.GetInstance(inputImageStream);
image.SetAbsolutePosition(1, 1);
pdfContentByte.AddImage(image);
stamper.Close();
}
}
//Add Company Letter-Head/Stationary to an existing pdf
private void AddCompanyStationary()
{
PdfReader reader = new PdfReader(Server.MapPath(P_InputStream2));
PdfReader s_reader = new PdfReader(Server.MapPath(P_InputStream4));
using (PdfStamper stamper = new PdfStamper(reader, new FileStream(Server.MapPath(P_OutputStream), FileMode.Create)))
{
PdfImportedPage page = stamper.GetImportedPage(s_reader, 1);
int n = reader.NumberOfPages;
PdfContentByte background;
for (int i = 1; i <= n; i++)
{
background = stamper.GetUnderContent(i);
background.AddTemplate(page, 0, 0);
}
stamper.Close();
}
}

Try this example:
using iTextSharp.text;
// Set up the fonts to be used on the pages
private Font _largeFont = new Font(Font.FontFamily.HELVETICA, 18, Font.BOLD, BaseColor.BLACK);
private Font _standardFont = new Font(Font.FontFamily.HELVETICA, 14, Font.NORMAL, BaseColor.BLACK);
private Font _smallFont = new Font(Font.FontFamily.HELVETICA, 10, Font.NORMAL, BaseColor.BLACK);
public void Build()
{
iTextSharp.text.Document doc = null;
try
{
// Initialize the PDF document
doc = new Document();
iTextSharp.text.pdf.PdfWriter writer = iTextSharp.text.pdf.PdfWriter.GetInstance(doc,
new System.IO.FileStream(System.IO.Directory.GetCurrentDirectory() + "\\ScienceReport.pdf",
System.IO.FileMode.Create));
// Set margins and page size for the document
doc.SetMargins(50, 50, 50, 50);
// There are a huge number of possible page sizes, including such sizes as
// EXECUTIVE, LEGAL, LETTER_LANDSCAPE, and NOTE
doc.SetPageSize(new iTextSharp.text.Rectangle(iTextSharp.text.PageSize.LETTER.Width,
iTextSharp.text.PageSize.LETTER.Height));
// Add metadata to the document. This information is visible when viewing the
// document properities within Adobe Reader.
doc.AddTitle("My Science Report");
doc.AddCreator("M. Lichtenberg");
doc.AddKeywords("paper airplanes");
// Add Xmp metadata to the document.
this.CreateXmpMetadata(writer);
// Open the document for writing content
doc.Open();
// Add pages to the document
this.AddPageWithBasicFormatting(doc);
this.AddPageWithInternalLinks(doc);
this.AddPageWithBulletList(doc);
this.AddPageWithExternalLinks(doc);
this.AddPageWithImage(doc, System.IO.Directory.GetCurrentDirectory() + "\\FinalGraph.jpg");
// Add page labels to the document
iTextSharp.text.pdf.PdfPageLabels pdfPageLabels = new iTextSharp.text.pdf.PdfPageLabels();
pdfPageLabels.AddPageLabel(1, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "Basic Formatting");
pdfPageLabels.AddPageLabel(2, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "Internal Links");
pdfPageLabels.AddPageLabel(3, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "Bullet List");
pdfPageLabels.AddPageLabel(4, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "External Links");
pdfPageLabels.AddPageLabel(5, iTextSharp.text.pdf.PdfPageLabels.EMPTY, "Image");
writer.PageLabels = pdfPageLabels;
}
catch (iTextSharp.text.DocumentException dex)
{
// Handle iTextSharp errors
}
finally
{
// Clean up
doc.Close();
doc = null;
}
}

You can always just create an html page and then convert that to pdf using wkhtmltopdf. This has the benefit of you not having to construct the pdf with a library such as iText. You just make a text file (html) and then pass it to the wkhtmltopdf executable.
See Calling wkhtmltopdf to generate PDF from HTML for more info.

PDF Merging by ItextSharp

How would I merge several pdf pages into one with iTextSharp which also supports merging pages having form elements like textboxes, checkboxes, etc.
I have tried so many by googling, but nothing has worked well.

See my answer here Merging Memory Streams. I give an example of how to merge PDFs with itextsharp.
For updating form field names add this code that uses the stamper to change the form field names.
/// <summary>
/// Merges pdf files from a byte list
/// </summary>
/// <param name="files">list of files to merge</param>
/// <returns>memory stream containing combined pdf</returns>
public MemoryStream MergePdfForms(List<byte[]> files)
{
if (files.Count > 1)
{
string[] names;
PdfStamper stamper;
MemoryStream msTemp = null;
PdfReader pdfTemplate = null;
PdfReader pdfFile;
Document doc;
PdfWriter pCopy;
MemoryStream msOutput = new MemoryStream();
pdfFile = new PdfReader(files[0]);
doc = new Document();
pCopy = new PdfSmartCopy(doc, msOutput);
pCopy.PdfVersion = PdfWriter.VERSION_1_7;
doc.Open();
for (int k = 0; k < files.Count; k++)
{
for (int i = 1; i < pdfFile.NumberOfPages + 1; i++)
{
msTemp = new MemoryStream();
pdfTemplate = new PdfReader(files[k]);
stamper = new PdfStamper(pdfTemplate, msTemp);
names = new string[stamper.AcroFields.Fields.Keys.Count];
stamper.AcroFields.Fields.Keys.CopyTo(names, 0);
foreach (string name in names)
{
stamper.AcroFields.RenameField(name, name + "_file" + k.ToString());
}
stamper.Close();
pdfFile = new PdfReader(msTemp.ToArray());
((PdfSmartCopy)pCopy).AddPage(pCopy.GetImportedPage(pdfFile, i));
pCopy.FreeReader(pdfFile);
}
}
pdfFile.Close();
pCopy.Close();
doc.Close();
return msOutput;
}
else if (files.Count == 1)
{
return new MemoryStream(files[0]);
}
return null;
}

Here is my simplified version of Jonathan's Merge code with namespaces added, and stamping removed.
public IO.MemoryStream MergePdfForms(System.Collections.Generic.List<byte[]> files)
{
if (files.Count > 1) {
using (System.IO.MemoryStream msOutput = new System.IO.MemoryStream()) {
using (iTextSharp.text.Document doc = new iTextSharp.text.Document()) {
using (iTextSharp.text.pdf.PdfSmartCopy pCopy = new iTextSharp.text.pdf.PdfSmartCopy(doc, msOutput) { PdfVersion = iTextSharp.text.pdf.PdfWriter.VERSION_1_7 }) {
doc.Open();
foreach (byte[] oFile in files) {
using (iTextSharp.text.pdf.PdfReader pdfFile = new iTextSharp.text.pdf.PdfReader(oFile)) {
for (i = 1; i <= pdfFile.NumberOfPages; i++) {
pCopy.AddPage(pCopy.GetImportedPage(pdfFile, i));
pCopy.FreeReader(pdfFile);
}
}
}
}
}
return msOutput;
}
} else if (files.Count == 1) {
return new System.IO.MemoryStream(files[0]);
}
return null;
}

to merge PDF see "Merging two pdf pages into one using itextsharp"

Below is my code for pdf merging.Thanks Jonathan for giving suggestion abt renaming fields,which resolved the issues while merging pdf pages with form fields.
private static void CombineAndSavePdf(string savePath, List<string> lstPdfFiles)
{
using (Stream outputPdfStream = new FileStream(savePath, FileMode.Create, FileAccess.Write, FileShare.None))
{
Document document = new Document();
PdfSmartCopy copy = new PdfSmartCopy(document, outputPdfStream);
document.Open();
PdfReader reader;
int totalPageCnt;
PdfStamper stamper;
string[] fieldNames;
foreach (string file in lstPdfFiles)
{
reader = new PdfReader(file);
totalPageCnt = reader.NumberOfPages;
for (int pageCnt = 0; pageCnt < totalPageCnt; )
{
//have to create a new reader for each page or PdfStamper will throw error
reader = new PdfReader(file);
stamper = new PdfStamper(reader, outputPdfStream);
fieldNames = new string[stamper.AcroFields.Fields.Keys.Count];
stamper.AcroFields.Fields.Keys.CopyTo(fieldNames, 0);
foreach (string name in fieldNames)
{
stamper.AcroFields.RenameField(name, name + "_file" + pageCnt.ToString());
}
copy.AddPage(copy.GetImportedPage(reader, ++pageCnt));
}
copy.FreeReader(reader);
}
document.Close();
}
}

iTextSharp RAM (memory) overflow

I'm generating a PDF document based on template. The document has multiple pages. The document can have about 5000 pages. When creating the 500-th page I get an overflow RAM (memory). Any idea?
public static void CreateBankBlank2012Year(string pdfTemplatePath, string directoryOutPdf, string nameOutPdf, AnnualReportsFilterParameters filterParametrs, string serverPath)
{
// Get details salary
IEnumerable<SalayDetailsForPdf> dataSalaryDetails = (IEnumerable<SalayDetailsForPdf>) GetSalaryData(filterParametrs);
String fontPath = Path.Combine(serverPath + "\\Fonts", "STSONG.ttf");
Font font = FontFactory.GetFont(fontPath, BaseFont.IDENTITY_H, 8);
using (Document document = new Document())
{
using (PdfSmartCopy copy = new PdfSmartCopy(
document, new FileStream(directoryOutPdf + nameOutPdf, FileMode.Create))
)
{
document.Open();
foreach (var data in dataSalaryDetails)
{
PdfReader reader = new PdfReader(pdfTemplatePath + #"\EmptyTemplateBankBlank_2012.pdf");
using (var ms = new MemoryStream())
{
using (PdfStamper stamper = new PdfStamper(reader, ms))
{
stamper.AcroFields.AddSubstitutionFont(font.BaseFont);
AcroFields form = stamper.AcroFields;
form.SetField("t1_address1", data.Address1);
form.SetField("t1_name", data.NameHieroglyphic);
// Other field ...
stamper.FormFlattening = true;
}
reader = new PdfReader(ms.ToArray());
copy.AddPage(copy.GetImportedPage(reader, 1));
}
}
}
}
}
p.s
I'm trying resolve my problem as follow:
generating empty pages based on template
private static void GeneratePdfFromTemplate(string directoryOutPdf, string nameOutPdf, string pdfTemplatePath, int countPages)
{
using (Document document = new Document())
{
using (PdfSmartCopy copy = new PdfSmartCopy(
document, new FileStream(directoryOutPdf + nameOutPdf, FileMode.Create))
)
{
document.Open();
PdfReader reader = new PdfReader(pdfTemplatePath + #"\EmptyTemplateBankBlank_2012.pdf");
for (int i = 0; i < countPages; i++)
{
copy.AddPage(copy.GetImportedPage(reader, 1));
}
reader.Close();
copy.Close();
}
document.Close();
}
GC.Collect();
}
But after a generating I can't set values to the fields.

I'm found solution for my problem, if anyone else would be interested :
private static void SettingFieltValue(Font font, IEnumerable<SalayDetailsForPdf> dataSalaryDetails, int selectedYear, string directoryOutPdf, string nameOutPdf, string pdfTemplatePath)
{
string pdfTemplate = pdfTemplatePath + #"\EmptyTemplateBankBlank_2012.pdf";
string newFile = directoryOutPdf + nameOutPdf;
var fs = new FileStream(newFile, FileMode.Create);
var conc = new PdfConcatenate(fs, true);
foreach (var data in dataSalaryDetails)
{
var reader = new PdfReader(pdfTemplate);
using (var ms = new MemoryStream())
{
using (PdfStamper stamper = new PdfStamper(reader, ms))
{
stamper.AcroFields.AddSubstitutionFont(font.BaseFont);
AcroFields form = stamper.AcroFields;
form.SetField("t1_name", data.NameHieroglyphic);
//Other field
stamper.FormFlattening = true;
stamper.Close();
}
reader = new PdfReader(ms.ToArray());
ms.Close();
}
conc.AddPages(reader);
reader.Close();
}
conc.Close();
}

Insert page into existing PDF using itextsharp

We are using itextsharp to create a single PDF from multiple PDF files. How do I insert a new page into a PDF file that has multiple pages already in the file? When I use add page it is overwriting the existing pages and only saves the 1 page that was selected.
Here is the code that I am using to add the page to the existing PDF:
PdfReader reader = new PdfReader(sourcePdfPath);
Document document = new Document(reader.GetPageSizeWithRotation(1));
PdfCopy pdfCopy = new PdfCopy(document, new System.IO.FileStream(outputPdfPath, System.IO.FileMode.Create));
MemoryStream memoryStream = new MemoryStream();
PdfWriter writer = PdfWriter.GetInstance(document, memoryStream);
document.AddDocListener(writer);
document.Open();
for (int p = 1; p <= reader.NumberOfPages; p++)
{
if (pagesToExtract.FindIndex(s => s == p) == -1) continue;
document.SetPageSize(reader.GetPageSize(p));
document.NewPage();
PdfContentByte cb = writer.DirectContent;
PdfImportedPage pageImport = writer.GetImportedPage(reader, p);
int rot = reader.GetPageRotation(p);
if (rot == 90 || rot == 270)
{
cb.AddTemplate(pageImport, 0, -1.0F, 1.0F, 0, 0, reader.GetPageSizeWithRotation(p).Height);
}
else
{
cb.AddTemplate(pageImport, 1.0F, 0, 0, 1.0F, 0, 0);
}
pdfCopy.AddPage(pageImport);
}
pdfCopy.Close();

This code works. You need to have a different file to output the results.
private static void AppendToDocument(string sourcePdfPath1, string sourcePdfPath2, string outputPdfPath)
{
using (var sourceDocumentStream1 = new FileStream(sourcePdfPath1, FileMode.Open))
{
using (var sourceDocumentStream2 = new FileStream(sourcePdfPath2, FileMode.Open))
{
using (var destinationDocumentStream = new FileStream(outputPdfPath, FileMode.Create))
{
var pdfConcat = new PdfConcatenate(destinationDocumentStream);
var pdfReader = new PdfReader(sourceDocumentStream1);
var pages = new List<int>();
for (int i = 0; i < pdfReader.NumberOfPages; i++)
{
pages.Add(i);
}
pdfReader.SelectPages(pages);
pdfConcat.AddPages(pdfReader);
pdfReader = new PdfReader(sourceDocumentStream2);
pages = new List<int>();
for (int i = 0; i < pdfReader.NumberOfPages; i++)
{
pages.Add(i);
}
pdfReader.SelectPages(pages);
pdfConcat.AddPages(pdfReader);
pdfReader.Close();
pdfConcat.Close();
}
}
}
}

I've tried this code, and it works for me, but don't forget to do some validations of the number of pages and existence of the paths you use
here is the code:
private static void AppendToDocument(string sourcePdfPath, string outputPdfPath, List<int> neededPages)
{
var sourceDocumentStream = new FileStream(sourcePdfPath, FileMode.Open);
var destinationDocumentStream = new FileStream(outputPdfPath, FileMode.Create);
var pdfConcat = new PdfConcatenate(destinationDocumentStream);
var pdfReader = new PdfReader(sourceDocumentStream);
pdfReader.SelectPages(neededPages);
pdfConcat.AddPages(pdfReader);
pdfReader.Close();
pdfConcat.Close();
}

You could use something like this, where src is the IEnumerable<string> of input pdf filenames. Just make sure that your existing pdf file is one of those sources.
The PdfConcatenate class is in the latest iTextSharp release.
var result = "combined.pdf";
var fs = new FileStream(result, FileMode.Create);
var conc = new PdfConcatenate(fs, true);
foreach(var s in src) {
var r = new PdfReader(s);
conc.AddPages(r);
}
conc.Close();

PdfCopy is intended for use with an empty Document. You should add everything you want, one page at a time.
The alternative is to use PdfStamper.InsertPage(pageNum, rectangle) and then draw a PdfImportedPage onto that new page.
Note that PdfImportedPage only includes the page contents, not the annotations or doc-level information ("document structure", doc-level javascripts, etc) that page may have originally used... unless you use one with PdfCopy.
A Stamper would probably be more efficient and use less code, but PdfCopy will import all the page-level info, not just the page's contents.
This might be important, it might not. It depends on what page you're trying to import.

Had to even out the page count with a multiple of 4:
private static void AppendToDocument(string sourcePdfPath)
{
var tempFileLocation = Path.GetTempFileName();
var bytes = File.ReadAllBytes(sourcePdfPath);
using (var reader = new PdfReader(bytes))
{
var numberofPages = reader.NumberOfPages;
var modPages = (numberofPages % 4);
var pages = modPages == 0 ? 0 : 4 - modPages;
if (pages == 0)
return;
using (var fileStream = new FileStream(tempFileLocation, FileMode.Create, FileAccess.Write))
{
using (var stamper = new PdfStamper(reader, fileStream))
{
var rectangle = reader.GetPageSize(1);
for (var i = 1; i <= pages; i++)
stamper.InsertPage(numberofPages + i, rectangle);
}
}
}
File.Delete(sourcePdfPath);
File.Move(tempFileLocation, sourcePdfPath);
}

I know I'm really late to the part here, but I mixed a bit of the two best answers and created a method if anyone needs it that adds a list of source PDF documents to a single document using itextsharp.
private static void appendToDocument(List<string> sourcePDFList, string outputPdfPath)
{
//Output document name and creation
FileStream destinationDocumentStream = new FileStream(outputPdfPath, FileMode.Create);
//Object to concat source pdf's to output pdf
PdfConcatenate pdfConcat = new PdfConcatenate(destinationDocumentStream);
//For each source pdf in list...
foreach (string sourcePdfPath in sourcePDFList)
{
//If file exists...
if (File.Exists(sourcePdfPath))
{
//Open the document
FileStream sourceDocumentStream = new FileStream(sourcePdfPath, FileMode.Open);
//Read the document
PdfReader pdfReader = new PdfReader(sourceDocumentStream);
//Create an int list
List<int> pages = new List<int>();
//for each page in pdfReader
for (int i = 1; i < pdfReader.NumberOfPages + 1; i++)
{
//Add that page to the list
pages.Add(i);
}
//Add that page to the pages to add to ouput document
pdfReader.SelectPages(pages);
//Add pages to output page
pdfConcat.AddPages(pdfReader);
//Close reader
pdfReader.Close();
}
}
//Close pdfconcat
pdfConcat.Close();
}

iTextSharp is producing a corrupt PDF

The code snippet below returns a corrupt PDF document however if I return mergedDocument instead it always returns a valid PDF. mergedDocument is based on a PDF file i created using Word, whereas completed document is entirely programmatically generated. The code "works" in that it throws no exceptions. Why is iTextSharp creating a corrupt PDF?
byte[] completedDocument = null;
using (MemoryStream streamCompleted = new MemoryStream())
{
using (Document document = new Document())
{
PdfCopy copy = new PdfCopy(document, streamCompleted);
document.Open();
copy.Open();
foreach (var item in eventItems)
{
byte[] mergedDocument = null;
PdfReader reader = new PdfReader(pdfTemplates[item.DataTokens[NotifyTokenType.OrganisationID]]);
using (MemoryStream streamTemplate = new MemoryStream())
{
using (PdfStamper stamper = new PdfStamper(reader, streamTemplate))
{
foreach (var token in item.DataTokens)
{
if (stamper.AcroFields.Fields.Any(fld => fld.Key == token.Key.ToString()))
{
stamper.AcroFields.SetField(token.Key.ToString(), token.Value);
}
}
stamper.FormFlattening = true;
stamper.Writer.CloseStream = false;
}
mergedDocument = new byte[streamTemplate.Length];
streamTemplate.Position = 0;
streamTemplate.Read(mergedDocument, 0, (int)streamTemplate.Length);
}
reader = new PdfReader(mergedDocument);
for (int i = 1; i <= reader.NumberOfPages; i++)
{
document.SetPageSize(PageSize.A4);
copy.AddPage(copy.GetImportedPage(reader, i));
}
}
completedDocument = new byte[streamCompleted.Length];
streamCompleted.Position = 0;
streamCompleted.Read(completedDocument, 0, (int)streamCompleted.Length);
}
}
return completedDocument;

You need to close the document and copy objects to flush the PDF writing buffer. This, however, causes some problems when trying to read the stream into an array. The fix for that is to use the ToArray() method of the MemoryStream which still works on closed streams. The changes I made have comments on them.
byte[] completedDocument = null;
using (MemoryStream streamCompleted = new MemoryStream())
{
using (Document document = new Document())
{
PdfCopy copy = new PdfCopy(document, streamCompleted);
document.Open();
copy.Open();
foreach (var item in eventItems)
{
byte[] mergedDocument = null;
PdfReader reader = new PdfReader(pdfTemplates[item.DataTokens[NotifyTokenType.OrganisationID]]);
using (MemoryStream streamTemplate = new MemoryStream())
{
using (PdfStamper stamper = new PdfStamper(reader, streamTemplate))
{
foreach (var token in item.DataTokens)
{
if (stamper.AcroFields.Fields.Any(fld => fld.Key == token.Key.ToString()))
{
stamper.AcroFields.SetField(token.Key.ToString(), token.Value);
}
}
stamper.FormFlattening = true;
stamper.Writer.CloseStream = false;
}
//Copy the stream's bytes
mergedDocument = streamTemplate.ToArray();
}
reader = new PdfReader(mergedDocument);
for (int i = 1; i <= reader.NumberOfPages; i++)
{
document.SetPageSize(PageSize.A4);
copy.AddPage(copy.GetImportedPage(reader, i));
}
//Close the document and the copy
document.Close();
copy.Close();
}
//ToArray() can operate on closed streams
completedDocument = streamCompleted.ToArray();
}
}
return completedDocument;

Also make sure your html doesn't contains hr tag while converting html to pdf
hdnEditorText.Value.Replace("\"", "'").Replace("<hr />", "").Replace("<hr/>", "")

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

iTextSharp How to Add Custom Properties to PDF-A? - c#

Related

Creating a PDF file?

PDF Merging by ItextSharp

iTextSharp RAM (memory) overflow

Insert page into existing PDF using itextsharp

iTextSharp is producing a corrupt PDF

Categories

Resources