Adding Page Navigation links to PDF document using itextsharp [duplicate] - c#

I have written some code that merges together multiple PDF's into a single PDF that I then display from the MemoryStream. This works great. What I need to do is add a table of contents to the end of the file with links to the start of each of the individual PDF's. I planned on doing this using the GotoLocalPage action which has an option for page numbers but it doesn't seem to work. If I change the action to the code below to one of the presset ones like PDFAction.FIRSTPAGE it works fine. Does this not work because I am using the PDFCopy object for the writer parameter of GotoLocalPage?
Document mergedDoc = new Document();
MemoryStream ms = new MemoryStream();
PdfCopy copy = new PdfCopy(mergedDoc, ms);
mergedDoc.Open();
MemoryStream tocMS = new MemoryStream();
Document tocDoc = null;
PdfWriter tocWriter = null;
for (int i = 0; i < filesToMerge.Length; i++)
{
string filename = filesToMerge[i];
PdfReader reader = new PdfReader(filename);
copy.AddDocument(reader);
// Initialise TOC document based off first file
if (i == 0)
{
tocDoc = new Document(reader.GetPageSizeWithRotation(1));
tocWriter = PdfWriter.GetInstance(tocDoc, tocMS);
tocDoc.Open();
}
// Create link for TOC, added random number of 3 for now
Chunk link = new Chunk(filename);
PdfAction action = PdfAction.GotoLocalPage(3, new PdfDestination(PdfDestination.FIT), copy);
link.SetAction(action);
tocDoc.Add(new Paragraph(link));
}
// Add TOC to end of merged PDF
tocDoc.Close();
PdfReader tocReader = new PdfReader(tocMS.ToArray());
copy.AddDocument(tocReader);
copy.Close();
displayPDF(ms.ToArray());
I guess an alternative would be to link to a named element (instead of page number) but I can't see how to add an 'invisible' element to the start of each file before adding to the merged document?

I would just go with two passes. In your first pass, do the merge as you are but also record the filename and page number it should link to. In your second pass, use a PdfStamper which will give you access to a ColumnText that you can use general abstractions like Paragraph in. Below is a sample that shows this off:
Since I don't have your documents, the below code creates 10 documents with a random number of pages each just for testing purposes. (You obviously don't need to do this part.) It also creates a simple dictionary with a fake file name as the key and the raw bytes from the PDF as a value. You have a true file collection to work with but you should be able to adapt that part.
//Create a bunch of files, nothing special here
//files will be a dictionary of names and the raw PDF bytes
Dictionary<string, byte[]> Files = new Dictionary<string, byte[]>();
var r = new Random();
for (var i = 1; i <= 10; i++) {
using (var ms = new MemoryStream()) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, ms)) {
doc.Open();
//Create a random number of pages
for (var j = 1; j <= r.Next(1, 5); j++) {
doc.NewPage();
doc.Add(new Paragraph(String.Format("Hello from document {0} page {1}", i, j)));
}
doc.Close();
}
}
Files.Add("File " + i.ToString(), ms.ToArray());
}
}
This next block merges the PDFs. This is mostly the same as your code except that instead of writing a TOC here I'm just keeping track of what I want to write in the future. Where I'm using file.value you'd use your full file path and where I'm using file.key you'd use your file's name instead.
//Dictionary of file names (for display purposes) and their page numbers
var pages = new Dictionary<string, int>();
//PDFs start at page 1
var lastPageNumber = 1;
//Will hold the final merged PDF bytes
byte[] mergedBytes;
//Most everything else below is standard
using (var ms = new MemoryStream()) {
using (var document = new Document()) {
using (var writer = new PdfCopy(document, ms)) {
document.Open();
foreach (var file in Files) {
//Add the current page at the previous page number
pages.Add(file.Key, lastPageNumber);
using (var reader = new PdfReader(file.Value)) {
writer.AddDocument(reader);
//Increment our current page index
lastPageNumber += reader.NumberOfPages;
}
}
}
}
mergedBytes = ms.ToArray();
}
This last block actually writes the TOC. If we use a PdfStamper we can create a ColumnText which allows us to use Paragraphs
//Will hold the final PDF
byte[] finalBytes;
using (var ms = new MemoryStream()) {
using (var reader = new PdfReader(mergedBytes)) {
using (var stamper = new PdfStamper(reader, ms)) {
//The page number to insert our TOC into
var tocPageNum = reader.NumberOfPages + 1;
//Arbitrarily pick one page to use as the size of the PDF
//Additional logic could be added or this could just be set to something like PageSize.LETTER
var tocPageSize = reader.GetPageSize(1);
//Arbitrary margin for the page
var tocMargin = 20;
//Create our new page
stamper.InsertPage(tocPageNum, tocPageSize);
//Create a ColumnText object so that we can use abstractions like Paragraph
var ct = new ColumnText(stamper.GetOverContent(tocPageNum));
//Set the working area
ct.SetSimpleColumn(tocPageSize.GetLeft(tocMargin), tocPageSize.GetBottom(tocMargin), tocPageSize.GetRight(tocMargin), tocPageSize.GetTop(tocMargin));
//Loop through each page
foreach (var page in pages) {
var link = new Chunk(page.Key);
var action = PdfAction.GotoLocalPage(page.Value, new PdfDestination(PdfDestination.FIT), stamper.Writer);
link.SetAction(action);
ct.AddElement(new Paragraph(link));
}
ct.Go();
}
}
finalBytes = ms.ToArray();
}

Related

Generate one pdf document with multiple pages converting from html using IText 7

I'm working with IText 7, I've been able to get one html page and generate a pdf for that page, but I need to generate one pdf document from multiple html pages and separated by pages. For example: I have Page1.html, Page2.html and Page3.html. I will need a pdf document with 3 pages, the first page with the content of Page1.html, second page with the content of Page2.html and like that...
This is the code I have and it's working for one html page:
ConverterProperties properties = new ConverterProperties();
PdfWriter writer = new PdfWriter(pdfRoot, new WriterProperties().SetFullCompressionMode(true));
PdfDocument pdfDocument = new PdfDocument(writer);
pdfDocument.AddEventHandler(PdfDocumentEvent.END_PAGE, new HeaderPdfEventHandler());
HtmlConverter.ConvertToPdf(htmlContent, pdfDocument, properties);
Is it possible to loop against the multiple html pages, add a new page to the PdfDocument for every html page and then have only one pdf generated with one page per html page?
UPDATE
I've been following this example and trying to translate it from Java to C#, I'm trying to use PdfMerger and loop around the html pages... but I'm receiving the Exception Cannot access a closed stream, on this line:
temp = new PdfDocument(
new PdfReader(new RandomAccessSourceFactory().CreateSource(baos), rp));
It looks like is related to the ByteArrayOutputStream baos instance. Any suggestions? This is my current code:
foreach (var html in htmlList)
{
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PdfDocument temp = new PdfDocument(new PdfWriter(baos));
HtmlConverter.ConvertToPdf(html, temp, properties);
ReaderProperties rp = new ReaderProperties();
temp = new PdfDocument(
new PdfReader(new RandomAccessSourceFactory().CreateSource(baos), rp));
merger.Merge(temp, 1, temp.GetNumberOfPages());
temp.Close();
}
pdfDocument.Close();
You are using RandomAccessSourceFactory and passing there a closed stream which you wrote a PDF document into. RandomAccessSourceFactory expects an input stream instead that is ready to be read.
First of all you should use MemoryStream which is native to .NET world. ByteArrayOutputStream is the class that was ported from Java for internal purposes (although it extends MemoryStream as well). Secondly, you don't have to use RandomAccessSourceFactory - there is a simpler way.
You can create a new MemoryStream instance from the bytes of the MemoryStream that you used to create a temporary PDF with the following line:
baos = new MemoryStream(baos.ToArray());
As an additional remark, it's better to close PdfMerger instance directly instead of closing the document - closing PdfMerger closes the underlying document as well.
All in all, we get the following code that works:
foreach (var html in htmlList)
{
MemoryStream baos = new MemoryStream();
PdfDocument temp = new PdfDocument(new PdfWriter(baos));
HtmlConverter.ConvertToPdf(html, temp, properties);
ReaderProperties rp = new ReaderProperties();
baos = new MemoryStream(baos.ToArray());
temp = new PdfDocument(new PdfReader(baos, rp));
pdfMerger.Merge(temp, 1, temp.GetNumberOfPages());
temp.Close();
}
pdfMerger.Close();
Maybe not so succinctly. I use "using". Similar answer
private byte[] CreatePDF(string html)
{
byte[] binData;
using (var workStream = new MemoryStream())
{
using (var pdfWriter = new PdfWriter(workStream))
{
//Create one pdf document
using (var pdfDoc = new PdfDocument(pdfWriter))
{
pdfDoc.SetDefaultPageSize(iText.Kernel.Geom.PageSize.A4.Rotate());
//Create one pdf merger
var pdfMerger = new PdfMerger(pdfDoc);
//Create two identical pdfs
for (int i = 0; i < 2; i++)
{
using (var newStream = new MemoryStream(CreateDocument(html)))
{
ReaderProperties rp = new ReaderProperties();
using (var newPdf = new PdfDocument(new PdfReader(newStream, rp)))
{
pdfMerger.Merge(newPdf, 1, newPdf.GetNumberOfPages());
}
}
}
}
binData = workStream.ToArray();
}
}
return binData;
}
Create pdf
private byte[] CreateDocument(string html)
{
byte[] binData;
using (var workStream = new MemoryStream())
{
using (var pdfWriter = new PdfWriter(workStream))
{
using (var pdfDoc = new PdfDocument(pdfWriter))
{
pdfDoc.SetDefaultPageSize(iText.Kernel.Geom.PageSize.A4.Rotate());
ConverterProperties props = new ConverterProperties();
using (var document = HtmlConverter.ConvertToDocument(html, pdfDoc, props))
{
}
}
binData = workStream.ToArray();
}
}
return binData;
}

How do I read font information when reading text from pdf using itext?

I have multiple separate pdfs and I want to read the text from them 1 by 1, and write it to a new pdf so they are all in the same document. I dont want to use PdfMerger because that will include all the white space. This is why I want to read and copy the actual text.
The main problem though is the font. The below code is fine for reading and writing all the text, but I lost the font size/type and whether or not its in bold. I need this to format to my destination page. Does anyone know how to do this?
Thanks.
byte[] result;
using (var ms = new MemoryStream())
{
var writer = new PdfWriter(ms);
PdfDocument outPdf = new PdfDocument(writer);
//PdfMerger merger = new PdfMerger(outPdf);
Document outDocument = new Document(outPdf);
foreach (var clause in clauses)
{
//Add pages from the first document
var sourceReader = new PdfReader(new MemoryStream(clause.ClauseBytes));
var sourcePdf = new PdfDocument(sourceReader);
for (int i = 0; i < sourcePdf.GetNumberOfPages(); i++)
{
var sourcePage = sourcePdf.GetPage(i+1);
var strategy = new SimpleTextExtractionStrategy();
var text = PdfTextExtractor.GetTextFromPage(sourcePage, strategy);
var currentText = Encoding.
UTF8.GetString(Encoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(text)));
outDocument.Add(new Paragraph(currentText));
//merger.Merge(sourcePdf, 1, sourcePdf.GetNumberOfPages());
//sourcePdf.Close();
}
}
outDocument.Close();
//merger.Close();
result = ms.GetBuffer();
}
return result;

C# iTextSharp Merge multiple pdf via byte array

I am new to using iTextSharp and working with Pdf files in general, but I think I'm on the right track.
I iterate through a list of pdf files, convert them to bytes, and push all of the resulting bytes into a byte array. From there I pass the byte array to concatAndAddContent() to merge all of the pdf's into a single large pdf. Currently I'm just getting the last pdf in the list (they seem to be overwriting)
public static byte[] concatAndAddContent(List<byte[]> pdfByteContent)
{
byte[] allBytes;
using (MemoryStream ms = new MemoryStream())
{
Document doc = new Document();
PdfWriter writer = PdfWriter.GetInstance(doc, ms);
doc.SetPageSize(PageSize.LETTER);
doc.Open();
PdfContentByte cb = writer.DirectContent;
PdfImportedPage page;
PdfReader reader;
foreach (byte[] p in pdfByteContent)
{
reader = new PdfReader(p);
int pages = reader.NumberOfPages;
// loop over document pages
for (int i = 1; i <= pages; i++)
{
doc.SetPageSize(PageSize.LETTER);
doc.NewPage();
page = writer.GetImportedPage(reader, i);
cb.AddTemplate(page, 0, 0);
}
}
doc.Close();
allBytes = ms.GetBuffer();
ms.Flush();
ms.Dispose();
}
return allBytes;
}
Above is the working code that results in a single pdf being created, and the rest of the files are being ignored. Any suggestions
This is pretty much just a C# version of Bruno's code here.
This is pretty much the simplest, safest and recommended way to merge PDF files. The PdfSmartCopy object is able to detect redundancies in the multiple files which can reduce file size some times. One of the overloads on it accepts a full PdfReader object which can be instantiated however you want.
public static byte[] concatAndAddContent(List<byte[]> pdfByteContent) {
using (var ms = new MemoryStream()) {
using (var doc = new Document()) {
using (var copy = new PdfSmartCopy(doc, ms)) {
doc.Open();
//Loop through each byte array
foreach (var p in pdfByteContent) {
//Create a PdfReader bound to that byte array
using (var reader = new PdfReader(p)) {
//Add the entire document instead of page-by-page
copy.AddDocument(reader);
}
}
doc.Close();
}
}
//Return just before disposing
return ms.ToArray();
}
}
List<byte[]> finallist= new List<byte[]>();
finallist.Add(concatAndAddContent(bytes));
System.IO.File.WriteAllBytes("path",finallist);

How to merge .pdf and .jpg file in one pdf

There are two files on disk .jpg and .pdf, i need to read both files and add them to new pdf and send to browser so that it can be downloaded.
New pdf file only contains pdf contents not jpeg file image.
memoryStream myMemoryStream = new MemoryStream();
//----pdf file--------------
iTextSharp.text.pdf.PdfCopy writer2 = new iTextSharp.text.pdf.PdfCopy(doc, myMemoryStream);
doc.Open();
iTextSharp.text.pdf.PdfReader reader = new iTextSharp.text.pdf.PdfReader(imagepath + "/30244.pdf");
reader.ConsolidateNamedDestinations();
for (int i = 1; i <= reader.NumberOfPages; i++) {
iTextSharp.text.pdf.PdfImportedPage page = writer2.GetImportedPage(reader, i);
writer2.AddPage(page);
}
iTextSharp.text.pdf.PRAcroForm form = reader.AcroForm;
if (form != null) {
writer2.CopyAcroForm(reader);
}
//-----------------jpeg file-------------------------------------
MemoryStream myMemoryStream2 = new MemoryStream();
System.Drawing.Image image = System.Drawing.Image.FromFile(imagepath + "/Vouchers.jpg");
iTextSharp.text.Document doc2 = new iTextSharp.text.Document(iTextSharp.text.PageSize.A4);
iTextSharp.text.pdf.PdfWriter.GetInstance(doc2, myMemoryStream2);
doc2.Open();
iTextSharp.text.Image pdfImage = iTextSharp.text.Image.GetInstance(image, System.Drawing.Imaging.ImageFormat.Jpeg);
doc2.Add(pdfImage);
doc2.close();
doc.close();
byte[] content = myMemoryStream.ToArray;
Response.ContentType = "application/pdf";
Response.AppendHeader("Content-Disposition", "attachment; filename=LeftCorner568.pdf");
Response.BinaryWrite(content);
Since you've been having trouble with this for a while now I'm going to give you a long-ish answer that will hopefully help you.
First, I don't have access to an ASP.Net server so I'm running everything from a folder on the desktop. So instead of reading and writing from and to relative paths you'll see me working from Environment.GetFolderPath(Environment.SpecialFolder.Desktop). I'm assuming that you'll be able to swap your paths in later.
Second, (and not that it really matter) I don't have SSRS so instead I created a helper method that makes a fake PDF for me to work from that returns a PDF as a byte array:
/// <summary>
/// Create a fake SSRS report
/// </summary>
/// <returns>A valid PDF stored as a byte array</returns>
private Byte[] getSSRSPdfAsByteArray() {
using (var ms = new System.IO.MemoryStream()) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, ms)) {
doc.Open();
doc.Add(new Paragraph("This is my SSRS report"));
doc.Close();
}
}
return ms.ToArray();
}
}
Third, just so that we're on the same page and to have something to work with I created two additional helper methods that generate some sample images and PDFs:
/// <summary>
/// Create sample images in the folder provided
/// </summary>
/// <param name="count">The number of images to create</param>
/// <param name="workingFolder">The folder to create images in</param>
private void createSampleImages(int count, string workingFolder) {
var random = new Random();
for (var i = 0; i < count; i++) {
using (var bmp = new System.Drawing.Bitmap(200, 200)) {
using (var g = System.Drawing.Graphics.FromImage(bmp)) {
g.Clear(Color.FromArgb(random.Next(0, 255), random.Next(0, 255), random.Next(0, 255)));
}
bmp.Save(System.IO.Path.Combine(workingFolder, string.Format("Image_{0}.jpg", i)));
}
}
}
/// <summary>
/// Create sample PDFs in the folder provided
/// </summary>
/// <param name="count">The number of PDFs to create</param>
/// <param name="workingFolder">The folder to create PDFs in</param>
private void createSamplePDFs(int count, string workingFolder) {
var random = new Random();
for (var i = 0; i < count; i++) {
using (var ms = new System.IO.MemoryStream()) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, ms)) {
doc.Open();
var pageCount = random.Next(1, 10);
for (var j = 0; j < pageCount; j++) {
doc.NewPage();
doc.Add(new Paragraph(String.Format("This is page {0} of document {1}", j, i)));
}
doc.Close();
}
}
System.IO.File.WriteAllBytes(System.IO.Path.Combine(workingFolder, string.Format("File_{0}.pdf", i)), ms.ToArray());
}
}
}
Just to reiterate, you obviously wouldn't have a need for these three helper methods, they're just so that you and I have a common set of files to work from. These helper methods are also intentionally not commented.
Fourth, at the end of the code below I'm storing the final PDF into a byte array called finalFileBytes and I'm then writing that to disk. Once again, I'm working on the desktop so this is where you'd do Response.BinaryWrite(finalFileBytes) instead.
Fifth, there's different ways to merge and combine files. PdfCopy, PdfSmartCopy and PdfStamper are all commonly used. I encourage you to read the official iText/iTextSharp book or at least the free Chapter 6, Working with existing PDFs that goes into great detail about this. In the code below I'm using PdfSmartCopy and I'm converting each image to a PDF before importing them. There might be a better way but I'm not sure if you can do it all in one pass or not. Bruno would know better than me. But the below works.
See the individual code comments for more details on what's going on.
//The folder that all of our work will be done in
var workingFolder = System.IO.Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.Desktop), "Pdf Test");
//This is the final PDF that we'll create for testing purposes
var finalPDF = System.IO.Path.Combine(workingFolder, "test.pdf");
//Create our working directory if it doesn't exist already
System.IO.Directory.CreateDirectory(workingFolder);
//Create sample PDFs and images
createSampleImages(10, workingFolder);
createSamplePDFs(10, workingFolder);
//Create our sample SSRS PDF byte array
var SSRS_Bytes = getSSRSPdfAsByteArray();
//This variable will eventually hold our combined PDF as a byte array
Byte[] finalFileBytes;
//Write everything to a MemoryStream
using (var finalFile = new System.IO.MemoryStream()) {
//Create a generic Document
using (var doc = new Document()) {
//Use PdfSmartCopy to intelligently merge files
using (var copy = new PdfSmartCopy(doc, finalFile)) {
//Open our document for writing
doc.Open();
//#1 - Import the SSRS report
//Bind a reader to our SSRS report
using (var reader = new PdfReader(SSRS_Bytes)) {
//Loop through each page
for (var i = 1; i <= reader.NumberOfPages; i++) {
//Add the imported page to our final document
copy.AddPage(copy.GetImportedPage(reader, i));
}
}
//#2 - Image the images
//Loop through each image in our working directory
foreach (var f in System.IO.Directory.EnumerateFiles(workingFolder, "*.jpg", SearchOption.TopDirectoryOnly)) {
//There's different ways to do this and it depends on what exactly "add an image to a PDF" really means
//Below we add each individual image to a PDF and then merge that PDF into the main PDF
//This could be optimized greatly
//From https://alandjackson.wordpress.com/2013/09/27/convert-an-image-to-a-pdf-in-c-using-itextsharp/
//Get the size of the current image
iTextSharp.text.Rectangle pageSize = null;
using (var srcImage = new Bitmap(f)) {
pageSize = new iTextSharp.text.Rectangle(0, 0, srcImage.Width, srcImage.Height);
}
//Will eventually hold the PDF with the image as a byte array
Byte[] imageBytes;
//Simple image to PDF
using (var m = new MemoryStream()) {
using (var d = new Document(pageSize, 0, 0, 0, 0)) {
using (var w = PdfWriter.GetInstance(d, m)) {
d.Open();
d.Add(iTextSharp.text.Image.GetInstance(f));
d.Close();
}
}
//Grab the bytes before closing out the stream
imageBytes = m.ToArray();
}
//Now merge using the same merge code as #1
using (var reader = new PdfReader(imageBytes)) {
for (var i = 1; i <= reader.NumberOfPages; i++) {
copy.AddPage(copy.GetImportedPage(reader, i));
}
}
}
//#3 - Merge additional PDF
//Look for each PDF in our working directory
foreach (var f in System.IO.Directory.EnumerateFiles(workingFolder, "*.pdf", SearchOption.TopDirectoryOnly)) {
//Because I'm writing samples files to disk but not cleaning up afterwards
//I want to avoid adding my output file as an input file
if (f == finalPDF) {
continue;
}
//Now merge using the same merge code as #1
using (var reader = new PdfReader(f)) {
for (var i = 1; i <= reader.NumberOfPages; i++) {
copy.AddPage(copy.GetImportedPage(reader, i));
}
}
}
doc.Close();
}
}
//Grab the bytes before closing the stream
finalFileBytes = finalFile.ToArray();
}
//At this point finalFileBytes holds a byte array of a PDF
//that contains the SSRS PDF, the sample images and the
//sample PDFs. For demonstration purposes I'm just writing to
//disk but this could be written to the HTTP stream
//using Response.BinaryWrite()
System.IO.File.WriteAllBytes(finalPDF, finalFileBytes);

Insert page into existing PDF using itextsharp

We are using itextsharp to create a single PDF from multiple PDF files. How do I insert a new page into a PDF file that has multiple pages already in the file? When I use add page it is overwriting the existing pages and only saves the 1 page that was selected.
Here is the code that I am using to add the page to the existing PDF:
PdfReader reader = new PdfReader(sourcePdfPath);
Document document = new Document(reader.GetPageSizeWithRotation(1));
PdfCopy pdfCopy = new PdfCopy(document, new System.IO.FileStream(outputPdfPath, System.IO.FileMode.Create));
MemoryStream memoryStream = new MemoryStream();
PdfWriter writer = PdfWriter.GetInstance(document, memoryStream);
document.AddDocListener(writer);
document.Open();
for (int p = 1; p <= reader.NumberOfPages; p++)
{
if (pagesToExtract.FindIndex(s => s == p) == -1) continue;
document.SetPageSize(reader.GetPageSize(p));
document.NewPage();
PdfContentByte cb = writer.DirectContent;
PdfImportedPage pageImport = writer.GetImportedPage(reader, p);
int rot = reader.GetPageRotation(p);
if (rot == 90 || rot == 270)
{
cb.AddTemplate(pageImport, 0, -1.0F, 1.0F, 0, 0, reader.GetPageSizeWithRotation(p).Height);
}
else
{
cb.AddTemplate(pageImport, 1.0F, 0, 0, 1.0F, 0, 0);
}
pdfCopy.AddPage(pageImport);
}
pdfCopy.Close();
This code works. You need to have a different file to output the results.
private static void AppendToDocument(string sourcePdfPath1, string sourcePdfPath2, string outputPdfPath)
{
using (var sourceDocumentStream1 = new FileStream(sourcePdfPath1, FileMode.Open))
{
using (var sourceDocumentStream2 = new FileStream(sourcePdfPath2, FileMode.Open))
{
using (var destinationDocumentStream = new FileStream(outputPdfPath, FileMode.Create))
{
var pdfConcat = new PdfConcatenate(destinationDocumentStream);
var pdfReader = new PdfReader(sourceDocumentStream1);
var pages = new List<int>();
for (int i = 0; i < pdfReader.NumberOfPages; i++)
{
pages.Add(i);
}
pdfReader.SelectPages(pages);
pdfConcat.AddPages(pdfReader);
pdfReader = new PdfReader(sourceDocumentStream2);
pages = new List<int>();
for (int i = 0; i < pdfReader.NumberOfPages; i++)
{
pages.Add(i);
}
pdfReader.SelectPages(pages);
pdfConcat.AddPages(pdfReader);
pdfReader.Close();
pdfConcat.Close();
}
}
}
}
I've tried this code, and it works for me, but don't forget to do some validations of the number of pages and existence of the paths you use
here is the code:
private static void AppendToDocument(string sourcePdfPath, string outputPdfPath, List<int> neededPages)
{
var sourceDocumentStream = new FileStream(sourcePdfPath, FileMode.Open);
var destinationDocumentStream = new FileStream(outputPdfPath, FileMode.Create);
var pdfConcat = new PdfConcatenate(destinationDocumentStream);
var pdfReader = new PdfReader(sourceDocumentStream);
pdfReader.SelectPages(neededPages);
pdfConcat.AddPages(pdfReader);
pdfReader.Close();
pdfConcat.Close();
}
You could use something like this, where src is the IEnumerable<string> of input pdf filenames. Just make sure that your existing pdf file is one of those sources.
The PdfConcatenate class is in the latest iTextSharp release.
var result = "combined.pdf";
var fs = new FileStream(result, FileMode.Create);
var conc = new PdfConcatenate(fs, true);
foreach(var s in src) {
var r = new PdfReader(s);
conc.AddPages(r);
}
conc.Close();
PdfCopy is intended for use with an empty Document. You should add everything you want, one page at a time.
The alternative is to use PdfStamper.InsertPage(pageNum, rectangle) and then draw a PdfImportedPage onto that new page.
Note that PdfImportedPage only includes the page contents, not the annotations or doc-level information ("document structure", doc-level javascripts, etc) that page may have originally used... unless you use one with PdfCopy.
A Stamper would probably be more efficient and use less code, but PdfCopy will import all the page-level info, not just the page's contents.
This might be important, it might not. It depends on what page you're trying to import.
Had to even out the page count with a multiple of 4:
private static void AppendToDocument(string sourcePdfPath)
{
var tempFileLocation = Path.GetTempFileName();
var bytes = File.ReadAllBytes(sourcePdfPath);
using (var reader = new PdfReader(bytes))
{
var numberofPages = reader.NumberOfPages;
var modPages = (numberofPages % 4);
var pages = modPages == 0 ? 0 : 4 - modPages;
if (pages == 0)
return;
using (var fileStream = new FileStream(tempFileLocation, FileMode.Create, FileAccess.Write))
{
using (var stamper = new PdfStamper(reader, fileStream))
{
var rectangle = reader.GetPageSize(1);
for (var i = 1; i <= pages; i++)
stamper.InsertPage(numberofPages + i, rectangle);
}
}
}
File.Delete(sourcePdfPath);
File.Move(tempFileLocation, sourcePdfPath);
}
I know I'm really late to the part here, but I mixed a bit of the two best answers and created a method if anyone needs it that adds a list of source PDF documents to a single document using itextsharp.
private static void appendToDocument(List<string> sourcePDFList, string outputPdfPath)
{
//Output document name and creation
FileStream destinationDocumentStream = new FileStream(outputPdfPath, FileMode.Create);
//Object to concat source pdf's to output pdf
PdfConcatenate pdfConcat = new PdfConcatenate(destinationDocumentStream);
//For each source pdf in list...
foreach (string sourcePdfPath in sourcePDFList)
{
//If file exists...
if (File.Exists(sourcePdfPath))
{
//Open the document
FileStream sourceDocumentStream = new FileStream(sourcePdfPath, FileMode.Open);
//Read the document
PdfReader pdfReader = new PdfReader(sourceDocumentStream);
//Create an int list
List<int> pages = new List<int>();
//for each page in pdfReader
for (int i = 1; i < pdfReader.NumberOfPages + 1; i++)
{
//Add that page to the list
pages.Add(i);
}
//Add that page to the pages to add to ouput document
pdfReader.SelectPages(pages);
//Add pages to output page
pdfConcat.AddPages(pdfReader);
//Close reader
pdfReader.Close();
}
}
//Close pdfconcat
pdfConcat.Close();
}

Categories

Resources