iTextSharp creates a corrupted PDF after stamping some text - C#

I am just stamping some text into a PDF, and iTextSharp creates a corrupted file. When I try to read the resulting PDF, it throws the following error:
An exception of type 'iTextSharp.text.exceptions.InvalidPdfException'
Additional information: The document has no page root (meaning: it's an invalid PDF).
The following code is used to edit the PDF and stamp the text content:
using (PdfReader pdfReader = new PdfReader(System.IO.File.ReadAllBytes(pdfPath)))
using (Stream pdfStream = new FileStream(pdfPath, FileMode.Open, FileAccess.ReadWrite))
{
    PdfReaderContentParser parserReason = new PdfReaderContentParser(pdfReader);
    PdfStamper pdfStamper = new PdfStamper(pdfReader, pdfStream);
    PdfContentByte pdfContentByte = pdfStamper.GetOverContent(pdfReader.NumberOfPages);
    BaseFont baseFont = BaseFont.CreateFont(BaseFont.COURIER_BOLD, BaseFont.CP1250, BaseFont.NOT_EMBEDDED);
    pdfContentByte.SetColorFill(BaseColor.BLACK);
    pdfContentByte.SetFontAndSize(baseFont, 12);
    pdfContentByte.BeginText();
    TextMarginFinder finderReason = parserReason.ProcessContent(pdfReader.NumberOfPages, new iTextSharp.text.pdf.parser.TextMarginFinder());
    pdfContentByte.ShowTextAligned(PdfContentByte.ALIGN_LEFT, "Some text : " + annotation, finderReason.GetLlx(), finderReason.GetLly() - 20f, 0);
    pdfContentByte.EndText();
    pdfStamper.Close();
}
The PDF files are created with Apache FOP 1.1, and iTextSharp is used to edit them. The issue does not happen with all PDFs, only with some files.
You can find the PDF which creates the issue here

The issue is that you are opening the file stream like this:
using (Stream pdfStream = new FileStream(pdfPath, FileMode.Open, FileAccess.ReadWrite))
FileMode.Open leaves the old content in place; writing to the stream merely overwrites it from the start. In particular, if the new document is shorter than the original one, a tail piece of the original document remains. As the cross references of a PDF are located at its end, the old cross references end up being applied to the new document, and they do not match it.
If you use FileMode.Create instead, this issue does not happen.
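The difference between the two modes can be reproduced without iTextSharp. The following is a minimal sketch using only System.IO; the class name FileModeDemo, the helper Overwrite, and the sample strings are made up for illustration and are not part of the original code.

```csharp
using System;
using System.IO;
using System.Text;

public static class FileModeDemo
{
    // Writes `content` to an existing file using the given FileMode
    // and returns what the file contains afterwards.
    public static string Overwrite(string path, string content, FileMode mode)
    {
        using (var stream = new FileStream(path, mode, FileAccess.ReadWrite))
        {
            byte[] bytes = Encoding.ASCII.GetBytes(content);
            stream.Write(bytes, 0, bytes.Length);
        }
        return File.ReadAllText(path, Encoding.ASCII);
    }

    public static void Main()
    {
        string path = Path.GetTempFileName(); // scratch file for the demo
        try
        {
            File.WriteAllText(path, "ORIGINAL-LONG-CONTENT", Encoding.ASCII);
            // FileMode.Open: only the first bytes are overwritten, the old tail survives.
            Console.WriteLine(Overwrite(path, "NEW", FileMode.Open));   // NEWGINAL-LONG-CONTENT

            File.WriteAllText(path, "ORIGINAL-LONG-CONTENT", Encoding.ASCII);
            // FileMode.Create: the existing file is truncated before writing.
            Console.WriteLine(Overwrite(path, "NEW", FileMode.Create)); // NEW
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The surviving tail in the first case plays the role of the stale cross-reference data described above.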
By the way, your code fails completely for the sample file you provided because that sample file has no text on the final page. Thus, finderReason determines no margins rectangle, your call to finderReason.GetLlx() tries to access a null rectangle member, and consequently it fails. You should add some appropriate checks.

Related

iText7 usage rights and Adobe Reader

I'm revisiting some old code in an attempt to get this working again. I'm using the code from the iText KB here -> https://kb.itextpdf.com/home/it7kb/faq/how-to-fill-xfa-form-using-itext-without-breaking-usage-rights.
Even with that code, it still seems to break the PDF, which is then no longer editable in Adobe Reader. Last time I couldn't post the form, but I have now managed to strip out anything important so that at least someone can test it, hopefully.
I did create a super basic form in LiveCycle and reader-enabled it in Acrobat DC; with the code below it worked fine, yet for some reason this form always breaks.
Here is the code I'm using.
String source = @"D:\Temp\SVTESTx.pdf";
String dest = @"D:\Temp\SVAx.PDF";
PdfReader preader = new PdfReader(source);
PdfDocument pdfDoc=new PdfDocument(preader, new PdfWriter(dest), new StampingProperties().UseAppendMode());
PdfAcroForm form = PdfAcroForm.GetAcroForm(pdfDoc, true);
XfaForm xfa = form.GetXfaForm();
xfa.FillXfaForm(new FileStream(@"D:\Temp\SVTEST.xml", FileMode.Open, FileAccess.Read));
xfa.Write(pdfDoc);
pdfDoc.Close();
The files in question are here (for the non-working PDF) -> https://drive.google.com/drive/folders/1MET19PUubd-J9D4fbzkX1KJAUbn0aMCZ?usp=sharing
Cheers

Creating a new Document with a FileStream with Aspose.Pdf

I'm trying to create a new Aspose.Pdf.Document using a FileStream to a new File, but it always throws an "Incorrect File Header" Exception. I need to work with the FileStream so that I can incrementally save the Document when merging other Pdf documents without keeping all of the Streams in scope.
According to the documentation, the following is the code to create a Document with a FileStream (I changed FileMode.Open to FileMode.OpenOrCreate since I don't have an existing Pdf file and want to start with a blank Document).
await using var fileStream = new FileStream(fileName, FileMode.OpenOrCreate, FileAccess.ReadWrite);
var document = new Document(fileStream);
This code throws an "Incorrect File Header" Exception unless the FileStream points to an existing valid Pdf file.
The following code works, but it's kind of silly to create and dispose a Document just so that we can work with the Document through the FileStream.
var fileName = Path.GetTempFileName();
var doc = new Document();
doc.Save(fileName);
doc.Dispose();
await using var fileStream = new FileStream(fileName, FileMode.Open, FileAccess.ReadWrite);
var document = new Document(fileStream);
I have to be missing something painfully obvious, because this is an incredibly simple use case and I don't see anything about it when searching online.
You cannot initialize the Document object with an empty Stream or an invalid PDF file. The file or Stream must be a valid PDF document. To use the incremental saving approach, you can initialize the FileStream with a new file and keep saving the Document into it. For example, please check the sample code snippet below:
using (var fileStream = new FileStream(dataDir + "output.pdf", FileMode.OpenOrCreate, FileAccess.ReadWrite))
{
    var document = new Document();
    document.Pages.Add();
    document.Save(fileStream);
    document.Pages.Add();
    document.Save(fileStream);
}
Please note that the FileStream needs to remain open during the whole process of PDF generation. In addition, you can use the Document.Save() method (without any arguments) to implement incremental saving.
We believe that you have also posted a similar inquiry in the official Aspose.PDF support forum, and we have responded to you there as well. Please follow up there and continue the discussion if you need more information.
This is Asad Ali and I work as Developer Evangelist at Aspose.

Converting a word doc to pdf using Syncfusion DocIO and saving to disk

I am using the Syncfusion DocIO library to try and convert a word doc to a pdf.
I am following this simple example:
At the bottom of the example they are doing:
PdfDocument pdfDocument = render.ConvertToPDF(wordDocument);
//Releases all resources used by the Word document and DocIO Renderer objects
render.Dispose();
wordDocument.Dispose();
//Saves the PDF file
MemoryStream outputStream = new MemoryStream();
pdfDocument.Save(outputStream);
//Closes the instance of PDF document object
pdfDocument.Close();
I need to save the pdf file to the disk instead. How can I take the outputStream and save it to disk? I believe the example is just saving it to memory.
You can use a FileStream to write the file to disk:
using (var fs = new FileStream(fileName, FileMode.Create, FileAccess.Write))
{
    pdfDocument.Save(fs);
}
You don't need to use the MemoryStream if you don't want to. You can write directly to the FileStream.
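If the MemoryStream has already been filled, as in the example from the question, its contents can also be copied to a file afterwards. The following is a minimal sketch using only System.IO; the class StreamToDisk, the helper SaveToFile, the file name, and the stand-in bytes are hypothetical, and the MemoryStream is assumed to still be open.

```csharp
using System;
using System.IO;

public static class StreamToDisk
{
    // Copies the full contents of an open in-memory stream to a file,
    // replacing any existing file with the same name.
    public static void SaveToFile(MemoryStream source, string fileName)
    {
        source.Position = 0; // rewind, otherwise CopyTo starts at the current position
        using (var fs = new FileStream(fileName, FileMode.Create, FileAccess.Write))
        {
            source.CopyTo(fs);
        }
    }

    public static void Main()
    {
        string fileName = Path.Combine(Path.GetTempPath(), "output-sketch.bin"); // hypothetical path
        using (var outputStream = new MemoryStream())
        {
            byte[] payload = { 0x25, 0x50, 0x44, 0x46 }; // stand-in for the saved PDF data
            outputStream.Write(payload, 0, payload.Length);
            SaveToFile(outputStream, fileName);
        }
        Console.WriteLine(new FileInfo(fileName).Length); // 4
        File.Delete(fileName);
    }
}
```

Writing directly to the FileStream avoids holding the whole document in memory, so the copy step is mainly useful when the bytes are needed in memory as well.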
Yes, it is possible to save the PDF document to disk, as illustrated in the following documentation.
https://help.syncfusion.com/file-formats/pdf/loading-and-saving-document?cs-save-lang=1&cs-lang=asp.net%20core#saving-a-pdf-document-to-file-system
A Word document can also be opened from disk as a FileStream and converted to a PDF document. Kindly refer to the following link for a code example.
https://help.syncfusion.com/file-formats/docio/word-to-pdf?cs-save-lang=1&cs-lang=asp.net%20core
Note: I work for Syncfusion.
Regards,
Dilli babu.

C# iTextSharp: The process cannot access the file because it is being used by another process

I'm generating a pdf file from a template with iTextSharp, filling each field in this code portion:
PdfReader pdfReader = new PdfReader(templatePath);
try
{
    using (FileStream newFileStream = new FileStream(newFilePath, FileMode.Create))
    {
        using (PdfStamper stamper = new PdfStamper(pdfReader, newFileStream))
        {
            // fill each field
            AcroFields pdfFormFields = stamper.AcroFields;
            foreach (KeyValuePair<string, string> entry in content)
            {
                if (!String.IsNullOrEmpty(entry.Value))
                    pdfFormFields.SetField(entry.Key, entry.Value);
            }
            //The below will make sure the fields are not editable in
            //the output PDF.
            stamper.FormFlattening = true;
            stamper.Close();
        }
    }
}
finally
{
    pdfReader.Close();
}
Everything goes fine and the file looks OK, but when I try to reopen the file to merge it with some other generated files into a single document, I get this error:
2015-11-23 09:46:54,651||ERROR|UrbeWeb|System.IO.IOException: The process cannot access the file 'D:\Sviluppo\communitygov\MaxiAnagrafeImmobiliare\MaxiAnagrafeImmobiliare\cache\IMU\E124\admin\Stampe\Provvedimento_00223850306_2015_11_23_094654.pdf' because it is being used by another process.
Error occurs at this point
foreach (Documento item in docs)
{
    string fileName = item.FilePath;
    pdfReader = new PdfReader(fileName); // IOException
    // some other operations ...
}
Edit: Using Process Monitor as suggested, I can see that there is no CloseFile operation where I would expect one. Can this be the source of the issue?
I've been stuck on this for hours; any help is really appreciated.
I had the same issue. This helped a lot:
"Your problem is that you are writing to a file while you are also reading from it. Unlike some file types (JPG, PNG, etc) that "load" all of the data into memory, iTextSharp reads the data as a stream. You either need to use two files and swap them at the end or you can force iTextSharp to "load" the first file by binding your PdfReader to a byte array of the file."
PdfReader reader = new PdfReader(System.IO.File.ReadAllBytes(filePath));
Ref: Chris Haas's answer to "Cannot access the file because it is being used by another process"
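The "load the file into memory first" idea from the quote can be sketched without iTextSharp. In the following minimal example, upper-casing the text is only a stand-in for the actual stamping step, and the names InPlaceRewrite and UpperCaseInPlace are hypothetical.

```csharp
using System;
using System.IO;

public static class InPlaceRewrite
{
    // Loads the whole file into memory first, so no handle on `path` is open
    // when the output stream is created for the same path.
    public static void UpperCaseInPlace(string path)
    {
        byte[] original = File.ReadAllBytes(path); // the file handle is released when this returns
        using (var input = new StreamReader(new MemoryStream(original)))
        using (var output = new StreamWriter(new FileStream(path, FileMode.Create, FileAccess.Write)))
        {
            // Stand-in for the real processing (for example, stamping a PDF).
            output.Write(input.ReadToEnd().ToUpperInvariant());
        }
    }

    public static void Main()
    {
        string path = Path.GetTempFileName(); // scratch file for the demo
        File.WriteAllText(path, "stamp me");
        UpperCaseInPlace(path);
        Console.WriteLine(File.ReadAllText(path)); // STAMP ME
        File.Delete(path);
    }
}
```

Because the input lives entirely in the byte array, the same path can safely be opened for writing, which is the effect the quoted answer achieves by passing File.ReadAllBytes(filePath) to PdfReader.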
I had a similar problem when opening PDF files (read-only) with the iTextSharp PdfReader. The first file gave no problem; the second one threw that exception (cannot access the file, etc.).
After hours of googling, searching for complicated solutions, and twisting my brain, only the following simple code resolved it fully:
iTextSharp_pdf.PdfReader pdfReader = null;
pdfReader = new iTextSharp_pdf.PdfReader(fileName);

Encrypt PDF document using iTextSharp

I want to make my PDF document protected by not allowing fill in and copy from it. I am using iTextSharp for this. I have following code:
PdfReader reader = new PdfReader(document, System.Text.Encoding.UTF8.GetBytes(PASSWORD));
using (MemoryStream ms = new MemoryStream())
{
    using (PdfStamper stamper = new PdfStamper(reader, ms))
    {
        stamper.SetEncryption(
            null,
            Encoding.ASCII.GetBytes(PASSWORD),
            PdfWriter.ALLOW_PRINTING,
            PdfWriter.ENCRYPTION_AES_128);
    }
}
reader.Close();
When the document is generated, I use that code to encrypt it. But later, when I open the document in Adobe Reader (tested on 9 and 11) and check 'File > Properties > Security', there are no restrictions applied on filling in and copying from the document; their status is Allowed.
Is there any issue in that code?
According to the iTextSharp documentation for PdfStamper, the second parameter of its constructor is an output stream representing the destination for the encrypted PDF document data. The code you show in the question simply disposes of the MemoryStream after you set up the encryption, so any changes this code could apply to your PDF document are never saved to disk or otherwise made available outside your application.
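One way to keep the result is to take the bytes out of the MemoryStream before it goes out of scope and write them somewhere. The following is a minimal sketch using only the base class library: StreamWriter stands in for PdfStamper (it also closes the underlying stream when disposed, which is assumed here to match the stamper's behavior), and the names KeepStreamResult and Produce as well as the output path are hypothetical.

```csharp
using System;
using System.IO;
using System.Text;

public static class KeepStreamResult
{
    // Returns the bytes written to the MemoryStream instead of discarding them.
    public static byte[] Produce(string text)
    {
        using (var ms = new MemoryStream())
        {
            // Stand-in for PdfStamper: disposing the writer flushes and closes `ms`.
            using (var writer = new StreamWriter(ms, Encoding.ASCII))
            {
                writer.Write(text);
            }
            // MemoryStream.ToArray still works after the stream has been closed.
            return ms.ToArray();
        }
    }

    public static void Main()
    {
        byte[] result = Produce("encrypted bytes");
        string path = Path.Combine(Path.GetTempPath(), "encrypted-sketch.bin"); // hypothetical path
        File.WriteAllBytes(path, result); // persist the result outside the stream
        Console.WriteLine(result.Length); // 15
        File.Delete(path);
    }
}
```

In the code from the question, the equivalent step would be to read the MemoryStream contents after the PdfStamper has been disposed and save or return them.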
