How to create multiple copies of a document with iText - c#

I'm using iText to generate a PDF document that consists of several copies of almost the same information.
E.g.: An invoice. One copy is given to the customer, another is filed and a third one is given to an accountant for book-keeping.
All the copies must be exactly the same except for a small piece of text that indicates who the copy is for (Customer, Accounting, File, ...).
There are two possible scenarios (I don't know if the solution is the same for both of them):
a) Each copy goes on a different page.
b) All the copies go on the same page (the paper will have perforations to separate the copies).
There will be a wrapper or helper class which uses iText to generate the PDF, in order to be able to do something like var pdf = HelperClass.CreateDocument(DocumentInfo info);. The multiple-copies problem will be solved inside this wrapper/helper.
What does iText provide to accomplish this? Do I need to write each element in the document several times in different positions/pages? Or does iText provide some way to write one copy to the document and then copy it to another position/page?
Note: It's a .NET project, but I tagged the question with both java and c# because this question is about how to use iText properly, so the answer will help developers in both languages.

If each copy goes on a different page, you can create a new document and copy in the page multiple times. Using iText in Java you can do it like this:
// Create output PDF
Document document = new Document(PageSize.A4);
PdfWriter writer = PdfWriter.getInstance(document, outputStream);
document.open();
PdfContentByte cb = writer.getDirectContent();
// Load existing PDF
PdfReader reader = new PdfReader(templateInputStream);
PdfImportedPage page = writer.getImportedPage(reader, 1);
// Copy first page of existing PDF into output PDF
document.newPage();
cb.addTemplate(page, 0, 0);
// Add your first piece of text here
document.add(new Paragraph("Customer"));
// Copy the same page of the existing PDF onto the second page of the output PDF
document.newPage();
cb.addTemplate(page, 0, 0);
// Add your second piece of text here
document.add(new Paragraph("Accounting"));
// etc...
document.close();
If you want to put all the copies on the same page, the code is similar but instead of using zeroes in addTemplate(page, 0, 0) you'll need to set values for the correct position; the numbers to use depend on the size and shape of your invoice.
See also iText - add content to existing PDF file; the above code is based on the code I wrote in that answer.
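For the same-page scenario, the offsets passed to addTemplate can be worked out with plain arithmetic. The following is a minimal sketch, not iText API: CopyLayout and copyOffsets are hypothetical names, and it assumes the copies are stacked vertically in equal slots and that the invoice template is no taller than one slot. PDF coordinates start at the lower-left corner, so the top copy gets the largest y value.

```java
import java.util.Arrays;

// Hypothetical helper (not part of iText): computes where each copy goes on a shared page.
final class CopyLayout {

    /**
     * Returns the y offset in points, measured from the bottom of the page,
     * for each copy. Index 0 is the topmost copy.
     */
    static float[] copyOffsets(float pageHeight, int copies) {
        if (copies <= 0) {
            throw new IllegalArgumentException("copies must be positive");
        }
        float slotHeight = pageHeight / copies;
        float[] offsets = new float[copies];
        for (int i = 0; i < copies; i++) {
            // slot i starts (i + 1) slot heights below the top edge
            offsets[i] = pageHeight - (i + 1) * slotHeight;
        }
        return offsets;
    }

    public static void main(String[] args) {
        // An A4 page is 842 points tall; split it into three slots.
        System.out.println(Arrays.toString(copyOffsets(842f, 3)));
    }
}
```

Each returned value would then be used as the y argument in cb.addTemplate(page, 0, y), followed by the label text for that copy.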

Here's how I see this working.
PdfReader reader = new PdfReader( templatePDFPath );
Document doc = new Document();
PdfWriter writer = PdfWriter.getInstance( doc, new FileOutputStream( "blah.pdf" ) );
// the document must be open before direct content can be written
doc.open();
PdfImportedPage inputPage = writer.getImportedPage( reader, 1 );
PdfContentByte curPageContent = writer.getDirectContent();
String extraStuff[] = getExtraStuff();
for (String stuff : extraStuff) {
curPageContent.saveState();
curPageContent.addTemplate( inputPage, 0, 0 ); // use x, y instead of 0, 0 to position the copy
curPageContent.restoreState();
curPageContent.beginText();
curPageContent.setTextMatrix(x, y);
curPageContent.setFontAndSize( someFont, someSize );
// the actual work:
curPageContent.showText( stuff );
curPageContent.endText();
// save the contents of curPageContent out to the file and reset it for the next page.
doc.newPage();
}
doc.close();
That's the bare minimum of work on the computer's part. It's quite efficient, and it results in a smaller PDF: rather than having N copies of that page, each with tweaks, you have one copy of the page that's reused on N pages, with the small tweaks drawn on top.
You could do the same thing and use the x, y parameters of addTemplate to draw all the copies on the same page. Up to you.
PS: you'll need to figure out the coordinates for setTextMatrix in advance.

You could also use PdfCopy or PdfSmartCopy to do this.
PdfReader reader = new PdfReader(@"Path\To\File");
Document doc = new Document();
// ms1 is assumed to be a MemoryStream; it is declared here for completeness
MemoryStream ms1 = new MemoryStream();
PdfCopy copier = new PdfCopy(doc, ms1);
//PdfSmartCopy copier = new PdfSmartCopy(doc, ms1);
// keep ms1 open after doc.Close() so it can be read back
copier.CloseStream = false;
doc.Open();
PdfImportedPage inputPage = copier.GetImportedPage(reader, 1);
for (int i = 0; i < count; i++)
{
copier.AddPage(inputPage);
}
doc.Close();
ms1.Flush();
ms1.Position = 0;
The difference between PdfCopy and PdfSmartCopy is that PdfCopy copies the entire PDF for each page, while PdfSmartCopy outputs a PDF that internally contains only one copy, which all pages reference. This results in a smaller file and less bandwidth on a network; however, it uses more memory on the server and takes longer to process.

Related

c# iTextSharp Object not set to instance

So I am trying to learn how to use iTextSharp with c# and winform to create a pdf based off input by a user of a program I created. I found this example code on the internet and it throws a couple different errors.
1.) Document has no pages, when I run the actual application.
2.) Object reference not set to an instance of an object. It points to the line PdfWriter writer = PdfWriter.GetInstance(document, output);
Basically, I'm trying to print on top of a pdf template, or image so that it looks like a sales form with description of the part.
public void createPDF()
{
Document document = new Document();
PdfReader reader = null;
MemoryStream output = new MemoryStream();
try
{
PdfWriter writer = PdfWriter.GetInstance(document, output);
document.Open();
// Load the background image and add it to the document structure
reader = new PdfReader(Resources.GetSalesForm());
PdfTemplate background = writer.GetImportedPage(reader, 1);
// Create a page in the document and add it to the bottom layer
document.NewPage();
_pcb = writer.DirectContentUnder;
_pcb.AddTemplate(background, 0, 0);
// Get the top layer and write some text
_pcb = writer.DirectContent;
_pcb.BeginText();
if (_showRulers)
{
PrintXAxis(800);
PrintXAxis(100);
PrintYAxis(40);
PrintYAxis(500);
}
SetFont36();
PrintTextCentered("words", 280, 680);
PrintTextCentered("words", 280, 190);
SetFont18();
PrintTextCentered("words", 280, 640);
PrintTextCentered("words", 280, 160);
_pcb.EndText();
writer.Flush();
}
finally
{
if (reader != null)
{
reader.Close();
}
document.Close();
}
}
Document has no pages: you close the document, but you didn't add any content. As you didn't add any content, no pages were created. It doesn't make sense to have a document without pages, hence the exception.
The second error can't occur where you say it occurs, but I see a lot of things that hurt the eyes in the rest of your code, so please throw away everything you have so far and start anew.
When you start anew, why not use iText 7 for C#? You are currently using an old version of iText. There is a jump-start tutorial on how to use the new version on the official iText web site: iText 7: jump-start tutorial. Check out chapter 5!
If you insist on using an old iText version, then be aware that you're doing it wrong. Adding content to an existing PDF is done with PdfStamper, not with PdfWriter. Adding text with BeginText()/EndText() is something you should only do when you know ISO-32000-1 by heart. Do you know that PDF reference by heart? No? Then don't use BeginText()/EndText(); use a convenience method such as ColumnText.ShowTextAligned() instead, or use ColumnText, set the column dimensions, add elements to the column, and invoke Go() to render the content.

Why is my copied PDF file sized incorrect?

I need to remove the first few pages of a PDF file. Apparently, the easiest way to do that is to create a copy of it and not duplicate the unwanted pages. This works, but they look a lot smaller than they should. Any ideas?
How it should look
How it actually looks
private static void ClipSpecificPDF(string input, string output, int pagesToCut)
{
PdfReader myReader = new PdfReader(input);
using (FileStream fs = new FileStream(output, FileMode.Create, FileAccess.Write, FileShare.None))
{
using (Document doc = new Document())
{
using (PdfWriter myWriter = PdfWriter.GetInstance(doc, fs))
{
//Open the destination for writing
doc.Open();
//Loop through each page that we want to keep
for (int i = pagesToCut; i < myReader.NumberOfPages; i++)
{
//Add a new blank page to destination document
var PS = myReader.GetPageSizeWithRotation(i);
myWriter.SetPageSize(PS);
doc.NewPage();
//Extract the given page from our reader and add it directly to the destination PDF
myWriter.DirectContent.AddTemplate(myWriter.GetImportedPage(myReader, i + 1), 0, 0);
}
//Close our document
doc.Close();
}
}
}
}
The problem you describe is explained in the FAQ. For instance in the answer to the questions:
How to merge documents correctly?
Why does the function to concatenate / merge PDFs cause issues in some cases?
Using PdfWriter to manipulate PDF documents is a very bad idea. Read chapter 6 of my book to discover why this is a bad idea, and take a look at Table 6.1 to find out which class is a better fit.
In the same chapter, you'll find the SelectPages example. Suppose that you want to create a new PDF containing only pages 4 to 8. In that case, you simply use the SelectPages() method and PdfStamper:
PdfReader reader = new PdfReader(src);
reader.SelectPages("4-8");
PdfStamper stamper = new PdfStamper(reader, new FileStream(dest, FileMode.Create, FileAccess.Write));
stamper.Close();
reader.Close();
By using PdfReader, the page size is preserved, as well as any of the interactive features that may be present.
Your approach is bad because you do not respect the original page size: you copy a document with letter (?) format to a document with A4 pages. If the origin of the page doesn't correspond with the lower-left corner, parts of your document will be invisible. If there are interactive features in your PDF, they will be lost. Of all the possible examples you could have followed, you picked the worst one...
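Applied to the original question (dropping the first few pages), the range string handed to SelectPages can be built from the number of pages to cut and the total page count. This is a minimal sketch in plain Java; PageRanges and keepAfter are hypothetical names, not iText API, and the total page count is assumed to come from the reader.

```java
// Hypothetical helper (not part of iText): builds a "first-last" page range string.
final class PageRanges {

    /**
     * Returns the range that keeps every page after the first pagesToCut pages,
     * using 1-based page numbers, e.g. cutting 3 of 10 pages keeps "4-10".
     */
    static String keepAfter(int pagesToCut, int totalPages) {
        if (pagesToCut < 0 || pagesToCut >= totalPages) {
            throw new IllegalArgumentException(
                "pagesToCut must be between 0 and totalPages - 1");
        }
        return (pagesToCut + 1) + "-" + totalPages;
    }

    public static void main(String[] args) {
        // Drop the first 3 pages of a 10-page document.
        System.out.println(keepAfter(3, 10)); // prints 4-10
    }
}
```

The resulting string would take the place of the literal "4-8" in the SelectPages call shown above.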

Create Multi-page Index File(TOC) for merged pdf using itext library in java

How can I write a multi-page ToC to the end of a PDF consisting of merged documents, using iTextSharp?
The answer to Create Index File(TOC) for merged pdf using itext library in java explains how to create a ToC page when merging PDFs (catalogued in the iTextSharp book http://developers.itextpdf.com/examples/merging-pdf-documents/merging-documents-and-create-table-contents#795-mergewithtoc.java). Code in this answer is based on those examples.
However it only works if the ToC is 1 page long. If the content becomes longer, then it repeats itself on the same page rather than spanning into the next page.
Trying to add the link directly to the text via:
ct.Add(new Chunk("link").SetLocalGoto("p1"))
causes an exception ("Cannot add Annotations, not enough pages in document").
Can anyone explain a method that will allow me to append multiple pages of content to a PDF when merging them (the more general the approach, the better)? Is there a way to write into the document using Document.Add() instead of having to copy in template pages and write on top of them?
(Note, code is in c#)
This answer is based on the example from the iTextSharp documentation, but converted to C#.
To make the added text span multiple pages, I found I could use ColumnText.HasMoreText(ct.Go()) to tell me if there was more text than could fit on the current page. You can then save the current page, create a new page from the template, and move the ColumnText to the new page. Below, this is done in a function called CheckForNewPage:
private bool CheckForNewPage(PdfCopy copy, ref PdfImportedPage page, ref PdfCopy.PageStamp stamp, ref PdfReader templateReader, ColumnText ct)
{
if (ColumnText.HasMoreText(ct.Go()))
{
//Write current page
stamp.AlterContents();
copy.AddPage(page);
//Start a new page
ct.SetSimpleColumn(36, 36, 559, 778);
templateReader = new PdfReader("template.pdf");
page = copy.GetImportedPage(templateReader, 1);
stamp = copy.CreatePageStamp(page);
ct.Canvas = stamp.GetOverContent();
ct.Go();
return true;
}
return false;
}
This should be called each time text is added to the ct variable.
If CheckForNewPage returns true you can then increment the page count, and reset the y variable to the top of the new page so that link annotation is in the correct place on the new page.
e.g.
var tocPageCount = 0;
var para = new iTextSharp.text.Paragraph(documentName);
ct.AddElement(para);
ct.Go();
if (CheckForNewPage(copy, ref page, ref stamp, ref tocReader, ct))
{
tocPageCount++;
y = 778;
}
//Add link annotation
action = PdfAction.GotoLocalPage(d.DocumentID.ToString(), false);
link = new PdfAnnotation(copy, TOC_Page.Left, ct.YLine, TOC_Page.Right, y, action);
stamp.AddAnnotation(link);
y = ct.YLine;
This creates the pages correctly. The below code adapts the end of ToC2 example for re-ordering the pages, in order to handle more than 1 page.
var rdr = new PdfReader(baos.ToArray());
var totalPageCount = rdr.NumberOfPages;
rdr.SelectPages(String.Format("{0}-{1}, 1-{2}", totalPageCount - tocPageCount +1, totalPageCount, totalPageCount - tocPageCount));
PdfStamper stamper = new PdfStamper(rdr, new FileStream(outputFilePath, FileMode.Create));
stamper.Close();
By re-using the CheckForNewPage function, you should be able to add any content to new pages you create, and have it span multiple pages. If you don't need the annotations, you can call CheckForNewPage in a loop at the end of adding all your content (just don't call ct.Go() beforehand).
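The range string used in the reordering step is easy to get wrong by one, so it can help to check it in isolation. Below is a minimal sketch in plain Java that mirrors the String.Format call above; TocOrder and tocFirstRange are hypothetical names, not iText API. It assumes the ToC pages were appended at the end of the merged document and should be moved to the front.

```java
// Hypothetical helper (not part of iText): builds the page order "ToC pages first, then the rest".
final class TocOrder {

    /**
     * For a document of totalPageCount pages whose last tocPageCount pages are the ToC,
     * returns a range string that lists the ToC pages first, followed by the other pages.
     */
    static String tocFirstRange(int totalPageCount, int tocPageCount) {
        if (tocPageCount < 1 || tocPageCount >= totalPageCount) {
            throw new IllegalArgumentException(
                "tocPageCount must be between 1 and totalPageCount - 1");
        }
        int firstTocPage = totalPageCount - tocPageCount + 1;
        int lastContentPage = totalPageCount - tocPageCount;
        return String.format("%d-%d, 1-%d", firstTocPage, totalPageCount, lastContentPage);
    }

    public static void main(String[] args) {
        // 10 pages in total, of which the last 3 are the ToC.
        System.out.println(tocFirstRange(10, 3)); // prints 8-10, 1-7
    }
}
```

Note that tocPageCount here is the number of ToC pages, so a counter that starts at 0 and is only incremented on page breaks would need 1 added before being passed in.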

Edit DirectContent of iTextSharp PdfSmartCopy class

At work I sometimes have to merge anywhere from a few to a few hundred PDF files. Until now I've been using the PdfWriter and PdfImportedPage classes. But when I merge all the files into one, the file size becomes enormous (the sum of all the merged file sizes), because fonts are embedded in every page rather than once for the whole document and reused.
Not long ago I found out about the PdfSmartCopy class, which reuses embedded fonts and images. And here the problem kicks in: very often, before merging files together, I have to add additional content to them (images, text). For this purpose I usually use the PdfContentByte from the writer object.
Document doc = new Document();
PdfWriter writer = PdfWriter.GetInstance(doc, new FileStream(@"C:\test.pdf", FileMode.Create));
PdfContentByte cb = writer.DirectContent;
cb.Rectangle(100, 100, 100, 100);
cb.SetColorStroke(BaseColor.RED);
cb.SetColorFill(BaseColor.RED);
cb.FillStroke();
When I do a similar thing with a PdfSmartCopy object, the pages are merged, but no additional content is added. The full code of my test with PdfSmartCopy:
using (Document doc = new Document())
{
using (PdfSmartCopy copy = new PdfSmartCopy(doc, new FileStream(Path.GetDirectoryName(pdfPath[0]) + "\\testas.pdf", FileMode.Create)))
{
doc.Open();
PdfContentByte cb = copy.DirectContent;
for (int i = 0; i < pdfPath.Length; i++)
{
PdfReader reader = new PdfReader(pdfPath[i]);
for (int ii = 0; ii < reader.NumberOfPages; ii++)
{
PdfImportedPage import = copy.GetImportedPage(reader, ii + 1);
copy.AddPage(import);
cb.Rectangle(100, 100, 100, 100);
cb.SetColorStroke(BaseColor.RED);
cb.SetColorFill(BaseColor.RED);
cb.FillStroke();
doc.NewPage(); // not a necessary line
//ColumnText col = new ColumnText(cb);
//col.SetSimpleColumn(100,100,500,500);
//col.AddText(new Chunk("wdasdasd", PdfFontManager.GetFont(@"C:\Windows\Fonts\arial.ttf", 20)));
//col.Go();
}
}
}
}
}
Now I have a few questions:
1. Is it possible to edit a PdfSmartCopy object's DirectContent?
2. If not, is there another way to merge multiple PDF files into one without increasing its size dramatically, while still being able to add additional content to pages while merging?
First this: using PdfWriter/PdfImportedPage is not a good idea. You throw away all interactive features! Being the author of iText, it's very frustrating to see so many people making the same mistake in spite of the fact that I wrote two books about this, and in spite of the fact that I convinced my publisher to offer one of the most important chapters for free: http://www.manning.com/lowagie2/samplechapter6.pdf
Is my writing really that bad? Or is there another reason why people keep on merging documents using PdfWriter/PdfImportedPage?
As for your specific questions, here are the answers:
1. Yes. Download the sample chapter and search the PDF file for PageStamp.
2. Only if you create the PDF in two passes. For instance: create the huge PDF first, then reduce the size by passing it through PdfCopy; or create the merged PDF first with PdfCopy, then add the extra content in a second pass using PdfStamper.
Code after applying Bruno Lowagie's answer:
for (int i = 0; i < pdfPath.Length; i++)
{
PdfReader reader = new PdfReader(pdfPath[i]);
PdfImportedPage page;
PdfSmartCopy.PageStamp stamp;
for (int ii = 0; ii < reader.NumberOfPages; ii++)
{
page = copy.GetImportedPage(reader, ii + 1);
stamp = copy.CreatePageStamp(page);
PdfContentByte cb = stamp.GetOverContent();
cb.Rectangle(100, 100, 100, 100);
cb.SetColorStroke(BaseColor.RED);
cb.SetColorFill(BaseColor.RED);
cb.FillStroke();
stamp.AlterContents(); // don't forget to add this line
copy.AddPage(page);
}
}
2. Only if you create the PDF in two passes. For instance: create the huge PDF first, then reduce the size by passing it through PdfCopy; or create the merged PDF first with PdfCopy, then add the extra content in a second pass using PdfStamper.
It is much more difficult to use PdfStamper in a second pass. When you're working with lots of data, it's far easier to create one PDF stamp and then append.
PdfCopyFields had worked well for this, but it no longer works as of the 5.4.4.0 release, which is why I'm here.

Quickly adding a cover page to a pre-linearized PDF for streaming to browser?

Question 298829 describes how linearizing your PDFs lets them stream page-by-page into the user's browser, so the user doesn't have to wait for the whole document to download before starting to view it. We have been using such PDFs successfully, but now have a new wrinkle: We want to keep the page-by-page streaming, but we also want to insert a fresh cover page at the front of the PDF documents each time we serve them up. (The cover-page will have time-sensitive information, such as the date, so it's not practical to include the cover page in the PDFs on disk.)
To help with this, are there any PDF libraries that can quickly append a cover page to a pre-linearized PDF and yield a streamable, linearized PDF as output? What's of the greatest concern is not the total time to merge the PDFs, but how soon we can start streaming part of the merged document to the user.
We were trying to do this with itextsharp, but it turns out that library can't output linearized PDFs. (See http://itext.ugent.be/library/question.php?id=21) Nonetheless, the following ASP.NET/itextsharp scratch code demonstrates the sort of API we're thinking of. In particular, if itextsharp always output linearized PDFs, something like this might already be the solution:
public class StreamPdf : IHttpHandler
{
public void ProcessRequest(HttpContext context)
{
context.Response.ContentType = "application/pdf";
RandomAccessFileOrArray ramFile = new RandomAccessFileOrArray(@"C:\bigpdf.pdf");
PdfReader reader1 = new PdfReader(ramFile, null);
Document doc = new Document();
// We'll stream the PDF to the ASP.NET output
// stream, i.e. to the browser:
PdfWriter writer = PdfWriter.GetInstance(doc, context.Response.OutputStream);
writer.Open();
doc.Open();
PdfContentByte cb = writer.DirectContent;
// output cover page:
BaseFont bf = BaseFont.CreateFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED);
Font font = new Font(bf, 11, Font.NORMAL);
ColumnText ct = new ColumnText(cb);
ct.SetSimpleColumn(60, 300, 600, 300 + 28 * 15, 15, Element.ALIGN_CENTER);
ct.AddText(new Phrase(15, "This is a cover page information\n", font));
ct.AddText(new Phrase(15, "Date: " + DateTime.Now.ToShortDateString() + "\n", font));
ct.Go();
// output src document:
int i = 0;
while (i < reader1.NumberOfPages)
{
i++;
// add next page from source PDF:
doc.NewPage();
PdfImportedPage page = writer.GetImportedPage(reader1, i);
cb.AddTemplate(page, 0, 0);
// use something like this to flush the current page to the
// browser:
writer.Flush();
s.Flush();
context.Response.Flush();
}
doc.Close();
writer.Close();
s.Close();
}
}
}
Ideally we're looking for a .NET library, but it would be worth hearing about any other options as well.
You could try Ghostscript. I think it's possible to stitch PDFs together with it, but I don't know whether it can linearize them. I have a C# Ghostscript wrapper that can be used with the Ghostscript DLL directly; I am sure it can be modified to merge PDFs. Contact details at: redmanscave.blogspot.com
