iText CopyAsFormXObject PdfException - c#

As I am processing my main pdf file and when I get to page 2 I start to read this Adobe Illustrator. I get this PdfException at the line origPage.CopyAsFormXObject(aiDocument); about the indirect object. I am not sure why I am getting this error because its a different file.
iText.Kernel.Exceptions.PdfException: 'Cannot copy indirect object from the document that is being written.'
const float imageX = 111.318f, imageY = 130.791f;
//const float imageWidth = 755.454f, imageHeight = 432.094f;
string aifilePath = #"PathToAIFile.ai";
var buffer = File.ReadAllBytes(aifilePath);
using (var aiStream = new MemoryStream(buffer))
using (var aimodifiedaiStream = new MemoryStream())
{
var aiReader = new iText.Kernel.Pdf.PdfReader(aiStream);
var aiDocument = new PdfDocument(aiReader, new PdfWriter(aimodifiedaiStream));
PdfPage origPage = aiDocument.GetFirstPage();
PdfFormXObject aifForm = origPage.CopyAsFormXObject(aiDocument);
canvasPage.AddXObjectAt(aifForm, imageX, imageY);
}
Updated with two PdfDocument
const float imageX = 111.318f, imageY = 130.791f;
//const float imageWidth = 755.454f, imageHeight = 432.094f;
string aifilePath = #"PathToAIFile.ai";
var buffer = File.ReadAllBytes(aifilePath);
using (var aiStream = new MemoryStream(buffer))
using (var aimodifiedaiStream = new MemoryStream())
{
var aisourceReader = new iText.Kernel.Pdf.PdfReader(aiStream);
var pdfsourceDocument = new PdfDocument(aisourceReader, new PdfWriter(aimodifiedaiStream));
var pdftargetDocument = new PdfDocument(aisourceReader);
PdfPage origPage = aiDocument.GetFirstPage();
PdfFormXObject aifForm = origPage.CopyAsFormXObject(aiDocument);
canvasPage.AddXObjectAt(aifForm, imageX, imageY);
}
Updated with pdftargetDocument has a writer and pdfsourceDocument with just PdfReader still getting the error Cannot copy to document opened in reading mode.'
using (var aiStream = new MemoryStream(buffer))
using (var aimodifiedaiStream = new MemoryStream())
{
var aisourceReader = new iText.Kernel.Pdf.PdfReader(new MemoryStream(buffer));
var aitargetReader = new iText.Kernel.Pdf.PdfReader(new MemoryStream(buffer));
var pdfsourceDocument = new PdfDocument(aisourceReader);
var pdftargetDocument = new PdfDocument(aitargetReader, new PdfWriter(aimodifiedaiStream));
PdfPage origPage = pdfsourceDocument.GetFirstPage();
PdfFormXObject aifForm = origPage.CopyAsFormXObject(pdfsourceDocument);
canvasPage.AddXObjectAt(aifForm, imageX, imageY);
pdfsourceDocument.Close();
pdftargetDocument.Close();
}

Summing up my comments as an answer...
You copy a page from the PdfDocument aiDocument as form XObject to the same aiDocument.
Clearly aiDocument is being written, it has a PdfWriter in its constructor. But the iText API only allows copying pages from a PdfDocument constructed without a PdfWriter.
Furthermore, you can only copy to a PdfDocument that is written to, i.e. that is constructed with a PdfWriter. Thus, you cannot copy from a document into itself.
Thus, you need different PdfDocument instances as source and target of the copy process.
The source PdfDocument instance is generated with only a PdfReader, no PdfWriter. The target PdfDocument instance is generated with a PdfWriter. If the target also is generated with a PdfReader depends on your precise use case: Do you want to add the XObject to an existing document, or do you want to add it to an empty document without any prior content.

Related

itext7 CopyPagesTo not opening PDF

I am trying to add a Cover Page PDF file to another PDF file. I am using CopyPagesTo method. CoverPageFilePath will go before any pages in the pdfDocumentFile. I then need to rewrite that new file to the same location. When I run the code and open the new pdf file I get an error about it being damaged.
public static void iText7MergePDF()
{
byte[] modifiedPdfInBytes = null;
string pdfCoverPageFilePath = #"PathtoCoverPage\Cover Page.pdf";
PdfDocument pdfDocumentCover = new PdfDocument(new iText.Kernel.Pdf.PdfReader(pdfCoverPageFilePath));
string pdfDocumentFile =#"PathtoFullDocument.pdf";
var buffer = File.ReadAllBytes(pdfDocumentFile);
using (var originalPdfStream = new MemoryStream(buffer))
using (var modifiedPdfStream = new MemoryStream())
{
var pdfReader = new iText.Kernel.Pdf.PdfReader(originalPdfStream);
var pdfDocument = new PdfDocument(pdfReader, new PdfWriter(modifiedPdfStream));
int numberOfPages = pdfDocumentCover.GetNumberOfPages();
pdfDocumentCover.CopyPagesTo(1, numberOfPages, pdfDocument);
modifiedPdfInBytes = modifiedPdfStream.ToArray();
pdfDocument.Close();
}
System.IO.File.WriteAllBytes(pdfGL, modifiedPdfInBytes);
}
Whenever you have some other type, like a StreamWriter, or here a PdfWriter writing to a Stream, it may not write all the data to the Stream immediately.
Here you Close the pdfDocument for all the data to be written to the MemoryStream.
ie this
modifiedPdfInBytes = modifiedPdfStream.ToArray();
pdfDocument.Close();
Should be
pdfDocument.Close();
modifiedPdfInBytes = modifiedPdfStream.ToArray();

Generate one pdf document with multiple pages converting from html using IText 7

I'm working with IText 7, I've been able to get one html page and generate a pdf for that page, but I need to generate one pdf document from multiple html pages and separated by pages. For example: I have Page1.html, Page2.html and Page3.html. I will need a pdf document with 3 pages, the first page with the content of Page1.html, second page with the content of Page2.html and like that...
This is the code I have and it's working for one html page:
ConverterProperties properties = new ConverterProperties();
PdfWriter writer = new PdfWriter(pdfRoot, new WriterProperties().SetFullCompressionMode(true));
PdfDocument pdfDocument = new PdfDocument(writer);
pdfDocument.AddEventHandler(PdfDocumentEvent.END_PAGE, new HeaderPdfEventHandler());
HtmlConverter.ConvertToPdf(htmlContent, pdfDocument, properties);
Is it possible to loop against the multiple html pages, add a new page to the PdfDocument for every html page and then have only one pdf generated with one page per html page?
UPDATE
I've been following this example and trying to translate it from Java to C#, I'm trying to use PdfMerger and loop around the html pages... but I'm receiving the Exception Cannot access a closed stream, on this line:
temp = new PdfDocument(
new PdfReader(new RandomAccessSourceFactory().CreateSource(baos), rp));
It looks like is related to the ByteArrayOutputStream baos instance. Any suggestions? This is my current code:
foreach (var html in htmlList)
{
ByteArrayOutputStream baos = new ByteArrayOutputStream();
PdfDocument temp = new PdfDocument(new PdfWriter(baos));
HtmlConverter.ConvertToPdf(html, temp, properties);
ReaderProperties rp = new ReaderProperties();
temp = new PdfDocument(
new PdfReader(new RandomAccessSourceFactory().CreateSource(baos), rp));
merger.Merge(temp, 1, temp.GetNumberOfPages());
temp.Close();
}
pdfDocument.Close();
You are using RandomAccessSourceFactory and passing there a closed stream which you wrote a PDF document into. RandomAccessSourceFactory expects an input stream instead that is ready to be read.
First of all you should use MemoryStream which is native to .NET world. ByteArrayOutputStream is the class that was ported from Java for internal purposes (although it extends MemoryStream as well). Secondly, you don't have to use RandomAccessSourceFactory - there is a simpler way.
You can create a new MemoryStream instance from the bytes of the MemoryStream that you used to create a temporary PDF with the following line:
baos = new MemoryStream(baos.ToArray());
As an additional remark, it's better to close PdfMerger instance directly instead of closing the document - closing PdfMerger closes the underlying document as well.
All in all, we get the following code that works:
foreach (var html in htmlList)
{
MemoryStream baos = new MemoryStream();
PdfDocument temp = new PdfDocument(new PdfWriter(baos));
HtmlConverter.ConvertToPdf(html, temp, properties);
ReaderProperties rp = new ReaderProperties();
baos = new MemoryStream(baos.ToArray());
temp = new PdfDocument(new PdfReader(baos, rp));
pdfMerger.Merge(temp, 1, temp.GetNumberOfPages());
temp.Close();
}
pdfMerger.Close();
Maybe not so succinctly. I use "using". Similar answer
private byte[] CreatePDF(string html)
{
byte[] binData;
using (var workStream = new MemoryStream())
{
using (var pdfWriter = new PdfWriter(workStream))
{
//Create one pdf document
using (var pdfDoc = new PdfDocument(pdfWriter))
{
pdfDoc.SetDefaultPageSize(iText.Kernel.Geom.PageSize.A4.Rotate());
//Create one pdf merger
var pdfMerger = new PdfMerger(pdfDoc);
//Create two identical pdfs
for (int i = 0; i < 2; i++)
{
using (var newStream = new MemoryStream(CreateDocument(html)))
{
ReaderProperties rp = new ReaderProperties();
using (var newPdf = new PdfDocument(new PdfReader(newStream, rp)))
{
pdfMerger.Merge(newPdf, 1, newPdf.GetNumberOfPages());
}
}
}
}
binData = workStream.ToArray();
}
}
return binData;
}
Create pdf
private byte[] CreateDocument(string html)
{
byte[] binData;
using (var workStream = new MemoryStream())
{
using (var pdfWriter = new PdfWriter(workStream))
{
using (var pdfDoc = new PdfDocument(pdfWriter))
{
pdfDoc.SetDefaultPageSize(iText.Kernel.Geom.PageSize.A4.Rotate());
ConverterProperties props = new ConverterProperties();
using (var document = HtmlConverter.ConvertToDocument(html, pdfDoc, props))
{
}
}
binData = workStream.ToArray();
}
}
return binData;
}

Watermark a pdf document with itext 7 instead of iTextSharp

This was my code for itextsharp which worked ok. It displayed "Quote Only" in the middle of each page in a pdf file.
iTextSharp.text.Image img = iTextSharp.text.Image.GetInstance(Server.MapPath(#"~\Content\WaterMarkQuoteOnly.png"));
PdfReader readerOriginalDoc = new PdfReader(File(all, "application/pdf").FileContents);
int n = readerOriginalDoc.NumberOfPages;
img.SetAbsolutePosition(0, 300);
PdfGState _state = new PdfGState()
{
FillOpacity = 0.1F,
StrokeOpacity = 0.1F
};
using (MemoryStream ms = new MemoryStream())
{
using (PdfStamper stamper = new PdfStamper(readerOriginalDoc, ms, '\0', true))
{
for (int i = 1; i <= n; i++)
{
PdfContentByte content = stamper.GetOverContent(i);
content.SaveState();
content.SetGState(_state);
content.AddImage(img);
content.RestoreState();
}
}
//return ms.ToArray();
all = ms.GetBuffer();
}
This is my new itext 7 code, this also displays the watermark but the position is wrong. I was dismayed to see that you cant add an image to the canvas but you have to add ImageData when the position is being set on the image. The image is also way smaller and back to front.
var imagePath = Server.MapPath(#"~\Content\WaterMarkQuoteOnly.png");
var tranState = new iText.Kernel.Pdf.Extgstate.PdfExtGState();
tranState.SetFillOpacity(0.1f);
tranState.SetStrokeOpacity(0.1f);
ImageData myImageData = ImageDataFactory.Create(imagePath, false);
Image img = new Image(myImageData);
img.SetFixedPosition(0, 300);
var reader = new PdfReader(new MemoryStream(all));
var doc = new PdfDocument(reader);
int pages = doc.GetNumberOfPages();
using (var ms = new MemoryStream())
{
var writer = new PdfWriter(ms);
var newdoc = new PdfDocument(writer);
for (int i = 1; i <= pages; i++)
{
//get existing page
PdfPage page = doc.GetPage(i);
//copy page to new document
newdoc.AddPage(page.CopyTo(newdoc)); ;
//get our new page
PdfPage newpage = newdoc.GetPage(i);
Rectangle pageSize = newpage.GetPageSize();
//get canvas based on new page
var canvas = new PdfCanvas(newpage);
//write image data to new page
canvas.SaveState().SetExtGState(tranState);
canvas.AddImage(myImageData, pageSize, true);
canvas.RestoreState();
}
newdoc.Close();
all = ms.GetBuffer();
ms.Flush();
}
You are doing something strange with the PdfDocument objects, and you are also using the wrong AddImage() method.
I am not a C# developer, so I rewrote your example in Java. I took this PDF file:
And I took this image:
Then I added the image to the PDF file using transparency with the following result:
The code to do this, was really simple:
public void createPdf(String src, String dest) throws IOException {
PdfExtGState tranState = new PdfExtGState();
tranState.setFillOpacity(0.1f);
ImageData img = ImageDataFactory.create(IMG);
PdfReader reader = new PdfReader(src);
PdfWriter writer = new PdfWriter(dest);
PdfDocument pdf = new PdfDocument(reader, writer);
for (int i = 1; i <= pdf.getNumberOfPages(); i++) {
PdfPage page = pdf.getPage(i);
PdfCanvas canvas = new PdfCanvas(page);
canvas.saveState().setExtGState(tranState);
canvas.addImage(img, 36, 600, false);
canvas.restoreState();
}
pdf.close();
}
For some reason, you created two PdfDocument instances. This isn't necessary. You also used the AddImage() method passing a Rectangle which resizes the image. Also make sure that you don't add the image as an inline image, because that bloats the file size.
I don't know which programming language you are using. For instance: I am not used to variables that are created using var such as var tranState. It should be very easy for you to adapt my Java code though. It's just a matter of changing lowercases into uppercases.

Create a new PdfDocument from other PdfDocument pieces that are created in memory with itext7

I am trying to create a new PdfDocument from other PdfDocuments that are created in memory. I can do this by saving them to the disk and then just reading them, but I was wondering if there is a way to do this with just a memory stream? That way I could create the other pieces in memory and just place them into the new pdfdocument. Any help would be appreciated. Below is my attempt to do this with a memory stream but I seem to be missing something.
using (var stream = new MemoryStream())
{
using (var pdfDocument = new PdfDocument(new PdfWriter(stream )))
{
var doc = new Document(pdfDocument, new PageSize(298f, 178f));
doc.SetMargins(0,0,0,0);
var tableInfo = PageElementsFactory.BuildDefaultTable(null, 1);
tableInfo.SetMargin(0);
tableInfo.SetPadding(0);
var cell = PageElementsFactory.BuildDefaultCell();
var dataValue = DataValues[PdfConstValues.RETENTION_INFORMATION_CUSTOMER_NAME];
cell.Add(new Paragraph(dataValue).SetFontSize(12f));
tableInfo.AddCell(cell);
var cell2 = PageElementsFactory.BuildDefaultCell();
cell2.Add(new Paragraph(DataValues[PdfConstValues.RETENTION_INFORMATION_ORDER_INFO]).SetBold());
tableInfo.AddCell(cell2);
var cell3 = PageElementsFactory.BuildDefaultCell();
cell3.Add(new Paragraph(DataValues[PdfConstValues.RETENTION_INFORMATION_PART_NUMBER])
.SetFontSize(23f).SetBold());
tableInfo.AddCell(cell3);
doc.Add(tableInfo);
doc.Close();
var page = pdfDocument.GetFirstPage();
var xObject = page.CopyAsFormXObject(newPdfDocument);
return new Image(xObject);
}
}
mkl gave me a few ideas from his comment, and after some more trial and error all I had to do was close the pdfdocument and then copy the stream to an array and create a new memory stream from it and BAM! it worked.
doc.Add(tableInfo);
doc.Close();
var streamInfo = stream.ToArray();
var memstreamClone = new MemoryStream(streamInfo);
memstreamClone.Position = 0;
var pdfStreamDoc = new PdfDocument(new PdfReader(memstreamClone));
var firstPage = pdfStreamDoc.GetFirstPage();
return new Image(firstPage.CopyAsFormXObject(pdfDoc));

Adding Page Navigation links to PDF document using itextsharp [duplicate]

I have written some code that merges together multiple PDF's into a single PDF that I then display from the MemoryStream. This works great. What I need to do is add a table of contents to the end of the file with links to the start of each of the individual PDF's. I planned on doing this using the GotoLocalPage action which has an option for page numbers but it doesn't seem to work. If I change the action to the code below to one of the presset ones like PDFAction.FIRSTPAGE it works fine. Does this not work because I am using the PDFCopy object for the writer parameter of GotoLocalPage?
Document mergedDoc = new Document();
MemoryStream ms = new MemoryStream();
PdfCopy copy = new PdfCopy(mergedDoc, ms);
mergedDoc.Open();
MemoryStream tocMS = new MemoryStream();
Document tocDoc = null;
PdfWriter tocWriter = null;
for (int i = 0; i < filesToMerge.Length; i++)
{
string filename = filesToMerge[i];
PdfReader reader = new PdfReader(filename);
copy.AddDocument(reader);
// Initialise TOC document based off first file
if (i == 0)
{
tocDoc = new Document(reader.GetPageSizeWithRotation(1));
tocWriter = PdfWriter.GetInstance(tocDoc, tocMS);
tocDoc.Open();
}
// Create link for TOC, added random number of 3 for now
Chunk link = new Chunk(filename);
PdfAction action = PdfAction.GotoLocalPage(3, new PdfDestination(PdfDestination.FIT), copy);
link.SetAction(action);
tocDoc.Add(new Paragraph(link));
}
// Add TOC to end of merged PDF
tocDoc.Close();
PdfReader tocReader = new PdfReader(tocMS.ToArray());
copy.AddDocument(tocReader);
copy.Close();
displayPDF(ms.ToArray());
I guess an alternative would be to link to a named element (instead of page number) but I can't see how to add an 'invisible' element to the start of each file before adding to the merged document?
I would just go with two passes. In your first pass, do the merge as you are but also record the filename and page number it should link to. In your second pass, use a PdfStamper which will give you access to a ColumnText that you can use general abstractions like Paragraph in. Below is a sample that shows this off:
Since I don't have your documents, the below code creates 10 documents with a random number of pages each just for testing purposes. (You obviously don't need to do this part.) It also creates a simple dictionary with a fake file name as the key and the raw bytes from the PDF as a value. You have a true file collection to work with but you should be able to adapt that part.
//Create a bunch of files, nothing special here
//files will be a dictionary of names and the raw PDF bytes
Dictionary<string, byte[]> Files = new Dictionary<string, byte[]>();
var r = new Random();
for (var i = 1; i <= 10; i++) {
using (var ms = new MemoryStream()) {
using (var doc = new Document()) {
using (var writer = PdfWriter.GetInstance(doc, ms)) {
doc.Open();
//Create a random number of pages
for (var j = 1; j <= r.Next(1, 5); j++) {
doc.NewPage();
doc.Add(new Paragraph(String.Format("Hello from document {0} page {1}", i, j)));
}
doc.Close();
}
}
Files.Add("File " + i.ToString(), ms.ToArray());
}
}
This next block merges the PDFs. This is mostly the same as your code except that instead of writing a TOC here I'm just keeping track of what I want to write in the future. Where I'm using file.value you'd use your full file path and where I'm using file.key you'd use your file's name instead.
//Dictionary of file names (for display purposes) and their page numbers
var pages = new Dictionary<string, int>();
//PDFs start at page 1
var lastPageNumber = 1;
//Will hold the final merged PDF bytes
byte[] mergedBytes;
//Most everything else below is standard
using (var ms = new MemoryStream()) {
using (var document = new Document()) {
using (var writer = new PdfCopy(document, ms)) {
document.Open();
foreach (var file in Files) {
//Add the current page at the previous page number
pages.Add(file.Key, lastPageNumber);
using (var reader = new PdfReader(file.Value)) {
writer.AddDocument(reader);
//Increment our current page index
lastPageNumber += reader.NumberOfPages;
}
}
}
}
mergedBytes = ms.ToArray();
}
This last block actually writes the TOC. If we use a PdfStamper we can create a ColumnText which allows us to use Paragraphs
//Will hold the final PDF
byte[] finalBytes;
using (var ms = new MemoryStream()) {
using (var reader = new PdfReader(mergedBytes)) {
using (var stamper = new PdfStamper(reader, ms)) {
//The page number to insert our TOC into
var tocPageNum = reader.NumberOfPages + 1;
//Arbitrarily pick one page to use as the size of the PDF
//Additional logic could be added or this could just be set to something like PageSize.LETTER
var tocPageSize = reader.GetPageSize(1);
//Arbitrary margin for the page
var tocMargin = 20;
//Create our new page
stamper.InsertPage(tocPageNum, tocPageSize);
//Create a ColumnText object so that we can use abstractions like Paragraph
var ct = new ColumnText(stamper.GetOverContent(tocPageNum));
//Set the working area
ct.SetSimpleColumn(tocPageSize.GetLeft(tocMargin), tocPageSize.GetBottom(tocMargin), tocPageSize.GetRight(tocMargin), tocPageSize.GetTop(tocMargin));
//Loop through each page
foreach (var page in pages) {
var link = new Chunk(page.Key);
var action = PdfAction.GotoLocalPage(page.Value, new PdfDestination(PdfDestination.FIT), stamper.Writer);
link.SetAction(action);
ct.AddElement(new Paragraph(link));
}
ct.Go();
}
}
finalBytes = ms.ToArray();
}

Categories

Resources