I'm trying to create a new Aspose.Pdf.Document using a FileStream to a new File, but it always throws an "Incorrect File Header" Exception. I need to work with the FileStream so that I can incrementally save the Document when merging other Pdf documents without keeping all of the Streams in scope.
According to the documentation, the following is the code to create a Document with a FileStream (I changed FileMode.Open to FileMode.OpenOrCreate since I don't have an existing Pdf file and want to start with a blank Document).
await using var fileStream = new FileStream(fileName, FileMode.OpenOrCreate, FileAccess.ReadWrite);
var document = new Document(fileStream);
This code throws an "Incorrect File Header" Exception unless the FileStream points to an existing valid Pdf file.
The following code works, but it's kind of silly to create and dispose a Document just so that we can work with the Document through the FileStream.
var fileName = Path.GetTempFileName();
var doc = new Document();
doc.Save(fileName);
doc.Dispose();
await using var fileStream = new FileStream(fileName, FileMode.Open, FileAccess.ReadWrite);
var document = new Document(fileStream);
I have to be missing something painfully obvious, because this is an incredibly simple use case and I don't see anything about it when searching online.
You cannot initialize the Document object with an empty Stream or invalid PDF file. File or Stream should be a valid PDF document. In order to use the incremental saving approach, you can initialize the FileStream with a new file and keep saving the Document into it. For example, please check the below sample code snippet:
using var fileStream = new FileStream(dataDir + "output.pdf", FileMode.OpenOrCreate, FileAccess.ReadWrite);
{
var document = new Document();
document.Pages.Add();
document.Save(fileStream);
document.Pages.Add();
document.Save(fileStream);
}
Please note that the FileStream needs to remain open during the whole process of PDF generation. You can also use the Document.Save() method (without any arguments) to implement incremental saving.
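As a side note on why the constructor rejects the stream: FileMode.OpenOrCreate on a path that does not exist yet yields a zero-length stream, so there is no PDF header to read. The sketch below uses only System.IO and does not touch Aspose; the class name, method name, and temp path are hypothetical, chosen for illustration.

```csharp
using System;
using System.IO;

public static class EmptyStreamDemo
{
    // Returns the length of the stream that FileMode.OpenOrCreate yields
    // for a path that does not exist yet.
    public static long LengthOfNewFile()
    {
        // Hypothetical temp path; any not-yet-existing file behaves the same way.
        string path = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString("N") + ".pdf");
        try
        {
            using (var fs = new FileStream(path, FileMode.OpenOrCreate, FileAccess.ReadWrite))
            {
                // The file was just created, so it contains no bytes at all.
                return fs.Length;
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

A stream like this holds no bytes, which is consistent with the "Incorrect File Header" exception described in the question.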
We believe that you have also posted a similar inquiry in the official Aspose.PDF support forum, and we have responded to you there as well. Please follow up there if you need more information.
This is Asad Ali and I work as Developer Evangelist at Aspose.
Related
I am using the Syncfusion DocIO library to try and convert a word doc to a pdf.
I am following this simple example:
At the bottom of the example they are doing:
PdfDocument pdfDocument = render.ConvertToPDF(wordDocument);
//Releases all resources used by the Word document and DocIO Renderer objects
render.Dispose();
wordDocument.Dispose();
//Saves the PDF file
MemoryStream outputStream = new MemoryStream();
pdfDocument.Save(outputStream);
//Closes the instance of PDF document object
pdfDocument.Close();
I need to save the pdf file to the disk instead. How can I take the outputStream and save it to disk? I believe the example is just saving it to memory.
You can use a FileStream to write the file to disk:
using (var fs = new FileStream(fileName, FileMode.Create, FileAccess.Write))
{
pdfDocument.Save(fs);
}
You don't need to use the MemoryStream if you don't want to. You can write directly to the FileStream.
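If the PDF has already been saved into a MemoryStream, as in the question, a minimal sketch of copying it to disk could look like the following. It uses only System.IO; the class and method names are hypothetical. The important detail is rewinding the stream first, because after a save its position is at the end.

```csharp
using System.IO;

public static class StreamToDisk
{
    // Copies the full contents of an in-memory stream to a file on disk.
    public static void SaveToFile(MemoryStream source, string path)
    {
        // Rewind: after writing, Position sits at the end and CopyTo would copy nothing.
        source.Position = 0;

        // FileMode.Create truncates an existing file, so no stale bytes remain.
        using (var fs = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            source.CopyTo(fs);
        }
    }
}
```

With the question's variables this would be called as StreamToDisk.SaveToFile(outputStream, fileName), assuming fileName holds the target path.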
Yes, it is possible to save the PDF document to a disk which is illustrated in the following code.
https://help.syncfusion.com/file-formats/pdf/loading-and-saving-document?cs-save-lang=1&cs-lang=asp.net%20core#saving-a-pdf-document-to-file-system
Meanwhile, Word document can also be opened from a disk as a FileStream and converted to PDF document. Kindly refer the following link for code example.
https://help.syncfusion.com/file-formats/docio/word-to-pdf?cs-save-lang=1&cs-lang=asp.net%20core
Note: I work for Syncfusion.
Regards,
Dilli babu.
I am trying to create a PDF file using a FileStream. The file is created in the specified folder, but when I try to open it, Adobe Reader throws the error mentioned in the title. I am not sure what I am doing wrong, and I would greatly appreciate it if someone could look into it and help me out. Thank you! Here is my code for the FileStream:
System.IO.FileStream wFile;
byte[] byteData = null;
byteData = Encoding.ASCII.GetBytes("FileStream Test");
string filePath = $"{_ApplicationPath}PrintedResults\\StreamTest5.pdf";
wFile = new FileStream(filePath, FileMode.Create);
wFile.Write(byteData, 0, byteData.Length);
wFile.Close();
using (var fileStream = new FileStream(filePath, FileMode.Open))
{
Console.WriteLine("This is a test stream file");
}
System.Diagnostics.Process.Start(filePath);
You are writing a text file with the contents "FileStream Test" but naming it with a file extension of ".PDF". This is not a PDF file. It is a text file with a misleading name. Adobe Reader is correctly reporting that it is not a valid PDF.
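To illustrate the point: a PDF file normally begins with the ASCII marker "%PDF-", and the bytes written in the question do not. The sketch below is a quick sanity check only, not a validator; the class and method names are hypothetical, and passing the check does not mean the file is a valid PDF.

```csharp
using System.Text;

public static class PdfHeaderCheck
{
    // Returns true when the data starts with the "%PDF-" marker.
    // This only inspects the first five bytes; it does not validate the document.
    public static bool LooksLikePdf(byte[] data)
    {
        byte[] magic = Encoding.ASCII.GetBytes("%PDF-");
        if (data == null || data.Length < magic.Length)
        {
            return false;
        }
        for (int i = 0; i < magic.Length; i++)
        {
            if (data[i] != magic[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

For the byteData from the question (the ASCII bytes of "FileStream Test"), LooksLikePdf returns false. Producing a real PDF requires a PDF library rather than writing raw text bytes.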
This is not an answer but I don't have enough reputation to add it as a comment:
You might want to look at the third-party library offered by http://www.dynamicpdf.com/
Their free evaluation version is quite capable and it is not time limited as they say on their website. With a little more effort you can create quite complicated PDFs with only the evaluation version!
Just stamping some text into a PDF with iTextSharp creates a corrupted file. When I try to read the PDF, it throws the following error:
An exception of type 'iTextSharp.text.exceptions.InvalidPdfException'
Additional information: The document has no page root (meaning: it's an invalid PDF).
Following code is used to edit the pdf and stamp text content
using (PdfReader pdfReader = new PdfReader(System.IO.File.ReadAllBytes(pdfPath)))
using (Stream pdfStream = new FileStream(pdfPath, FileMode.Open, FileAccess.ReadWrite))
{
PdfReaderContentParser parserReason = new PdfReaderContentParser(pdfReader);
PdfStamper pdfStamper = new PdfStamper(pdfReader, pdfStream);
PdfContentByte pdfContentByte = pdfStamper.GetOverContent(pdfReader.NumberOfPages);
BaseFont baseFont = BaseFont.CreateFont(BaseFont.COURIER_BOLD, BaseFont.CP1250, BaseFont.NOT_EMBEDDED);
pdfContentByte.SetColorFill(BaseColor.BLACK);
pdfContentByte.SetFontAndSize(baseFont, 12);
pdfContentByte.BeginText();
TextMarginFinder finderReason = parserReason.ProcessContent(pdfReader.NumberOfPages, new iTextSharp.text.pdf.parser.TextMarginFinder());
pdfContentByte.ShowTextAligned(PdfContentByte.ALIGN_LEFT, "Some text : " + annotation, finderReason.GetLlx(), finderReason.GetLly() - 20f, 0);
pdfContentByte.EndText();
pdfStamper.Close();
}
The PDF files are created with Apache FOP 1.1, and iTextSharp is used to edit them. The issue does not happen with all PDFs, only with some files.
You can find the PDF which creates the issue here
The issue is that you are opening the file stream like this:
using (Stream pdfStream = new FileStream(pdfPath, FileMode.Open, FileAccess.ReadWrite))
FileMode.Open leaves the old content in place; writing to the stream merely overwrites it from the start. In particular, if the new document is shorter than the original one, the tail of the original document remains. As the PDF cross references are at the end of the file, the old cross references end up being applied to the new document, and they do not match it.
If you use FileMode.Create instead, this issue does not happen.
By the way, your code fails completely for the sample file you provided because that sample file has no text on the final page. Thus, finderReason determines no margins rectangle, your call to finderReason.GetLlx() tries to access a null rectangle member, and consequently it fails. You should add some appropriate checks.
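The difference between the two file modes can be reproduced without iTextSharp. The sketch below, using only System.IO and hypothetical names, writes a 10-byte file, then rewrites it with only 4 bytes and reports the resulting file length.

```csharp
using System.IO;

public static class OverwriteDemo
{
    // Creates a 10-byte file, reopens it with the given mode, writes 4 bytes,
    // and returns the final length of the file.
    public static long LengthAfterRewrite(FileMode mode)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, new byte[10]);

            using (var fs = new FileStream(path, mode, FileAccess.ReadWrite))
            {
                // The new, shorter content.
                fs.Write(new byte[4], 0, 4);
            }

            return new FileInfo(path).Length;
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

LengthAfterRewrite(FileMode.Open) returns 10, because the last 6 bytes of the old content survive, while LengthAfterRewrite(FileMode.Create) returns 4, because the file is truncated when it is opened.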
I am creating an XML file on the fly.
One of its nodes contains a ZIP file encoded as a BASE64 string.
I then create another ZIP file.
I add this XML file and a few other JPEG files.
I output the file to the browser.
I am unable to open the FINAL ZIP file.
I get: "Windows cannot open the folder. The Compressed(zipped) Folder'c:\path\file.zip' is invalid."
I am able to save my original XML file to the file system.
I can open that XML file, decode the ZIP node and save to the file system.
I am then able to open that Zip file with no problems.
I can create the final ZIP file, OMIT my XML file, and the ZIP file opens no problem.
I seem to only have an issue when I attempt to ZIP an XML file that has a node with ZIP content encoded as a BASE64 string.
Any ideas? Code snippets are below. Heavily edited.
XDocument xDoc = new XDocument();
XDocument xDocReport = new XDocument();
XElement xNodeReport;
using (FileStream fsData = new FileStream(strFullFilePath, FileMode.Open, FileAccess.Read)) {
xDoc = XDocument.Load(fsData);
xNodeReport = xDoc.Element("Data").Element("Reports").Element("Report");
//SNIP
//create XDocument xDocReport
//SNIP
using (MemoryStream zipInMemoryReport = new MemoryStream()) {
using (ZipArchive zipFile = new ZipArchive(zipInMemoryReport, ZipArchiveMode.Update)) {
//Add REPORT to ZIP file
ZipArchiveEntry entryReport = zipFile.CreateEntry("data.xml");
using (StreamWriter writer = new StreamWriter(entryReport.Open())) {
writer.Write(xDocReport.ToString());
} //END USING report entry
}
xNodeReport.Value = System.Convert.ToBase64String(zipInMemoryReport.GetBuffer());
//I am able to write this file to disk and manipulate it no problem.
//File.WriteAllText("c:\\users\\snip\\desktop\\Report.xml",xDoc.ToString());
}
//create ZIP for response
using (MemoryStream zipInMemory = new MemoryStream()) {
using (ZipArchive zipFile = new ZipArchive(zipInMemory, ZipArchiveMode.Update)) {
//Add REPORT to ZIP file
ZipArchiveEntry entryReportWrapper = zipFile.CreateEntry("Report.xml");
//THIS IS THE STEP THAT makes the Zip "invalid". Although i can open and manipulate this source file no problem.
//********
using (StreamWriter writer = new StreamWriter(entryReportWrapper.Open())) {
xDoc.Save(writer);
}
//Add JPEG(s) to report
//Create Charts
if (chkDLSalesPrice.Checked) {chartDownloadSP.SaveImage(entryChartSP.Open(), ChartImageFormat.Jpeg);}
if (chkDLSalesDOM.Checked) {chartDownloadDOM.SaveImage(entryChartDOM.Open(), ChartImageFormat.Jpeg);}
if (chkDLSPLP.Checked) {chartDownloadSPLP.SaveImage(entryChartSPLP.Open(), ChartImageFormat.Jpeg);}
if (chkDLSPLP.Checked) {chartDownloadLP.SaveImage(entryChartLP.Open(), ChartImageFormat.Jpeg);}
} // END USING ziparchive
Response.Clear();
Response.AppendHeader("content-disposition", "attachment; filename=file.zip");
Response.ContentType = "application/zip";
Response.BinaryWrite(zipInMemory.GetBuffer());
Response.End();
Without a good, minimal, complete code example, it's impossible to know for sure what bugs are in the code. But there are at least two apparent errors in the code snippet you posted, one of which could easily be responsible for the "invalid .zip" error:
In the statement writer.Write(xDocReport.ToString());, the variable xDocReport has not been initialized to anything useful, at least not in the code you posted. So you'll get an empty XML document in the archive.
Since the code example is incomplete, it's possible you just omitted the initialization of that variable from the code in your question. In any case, even if you didn't, that would just lead to an empty XML document in the archive, not an invalid archive.
More problematic though…
You are calling GetBuffer() on your MemoryStream objects, instead of ToArray(). You want the latter. The former returns the entire backing buffer for the MemoryStream object, including the unused bytes past the end of the valid stream. A .zip reader locates the archive's central directory through the end-of-central-directory record at the end of the file, so extra data after that record can cause anything trying to read the file as a .zip archive to look for the record in the wrong place and reject the archive.
Replace your calls to GetBuffer() with calls to ToArray() instead.
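The size difference between the two calls can be observed with a small in-memory archive. This is a minimal sketch with hypothetical names; it passes leaveOpen: true so that the MemoryStream stays usable after the ZipArchive is disposed, which is a simplification compared to the code in the question.

```csharp
using System.IO;
using System.IO.Compression;

public static class ZipBufferDemo
{
    // Builds a one-entry archive in memory and returns two sizes:
    // valid  = number of bytes actually written (ToArray),
    // buffer = size of the whole backing buffer (GetBuffer).
    public static (int valid, int buffer) Sizes()
    {
        using (var ms = new MemoryStream())
        {
            using (var zip = new ZipArchive(ms, ZipArchiveMode.Create, leaveOpen: true))
            {
                ZipArchiveEntry entry = zip.CreateEntry("data.xml");
                using (var writer = new StreamWriter(entry.Open()))
                {
                    writer.Write("<Data />");
                }
            } // disposing the archive writes the central directory into ms

            // ToArray copies only the written bytes; GetBuffer exposes the
            // full internal array, which may include unused trailing bytes.
            return (ms.ToArray().Length, ms.GetBuffer().Length);
        }
    }
}
```

The buffer size is never smaller than the valid size, and it is usually larger because MemoryStream grows its capacity in steps. Only the bytes returned by ToArray() form the archive.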
If the above does not lead to a solution for your problem, you should edit your post, to provide a better code example.
One last comment: there is no point in initializing a variable like xDoc to an empty XDocument object when you're going to just replace that object with a different one (e.g. by calling XDocument.Load()).
My web method creates a pdf file in my %temp% folder and that works. I then want to add some custom fields (meta) to that file using the code below.
The class PdfStamper generates an IOException, whether I use its .Close() method or the using block just ends. The process that is still holding on to the file handle is the webdev web server itself (I'm debugging in VS2010 SP1).
private string AddCustomMetaData(string guid, int companyID, string filePath)
{
try
{
PdfReader reader = new PdfReader(filePath);
using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.ReadWrite, FileShare.ReadWrite))
{
PdfStamper st = new PdfStamper(reader, fs);
Dictionary<string, string> info = reader.Info;
info.Add("Guid", guid);
info.Add("CompanyID", companyID.ToString());
st.MoreInfo = info;
st.Close();
}
reader.Close();
return guid;
}
catch (Exception e)
{
return e.Message;
}
}
No matter what I try, it keeps throwing the exception at st.Close();, to be more precise:
The process cannot access the file 'C:\Users[my
username]\AppData\Local\Temp\53b96eaf-74a6-49d7-a715-6c2e866a63c3.pdf'
because it is being used by another process.
Either I'm overlooking something obvious or there's a problem with the PdfStamper class I'm as of yet unaware of. Versions of itextsharp used are 5.3.3.0 and 5.4.0.0, the issue is the same.
Any insight would be greatly appreciated.
EDIT: I'm currently "coding around" the issue, but I haven't found any solution.
Your problem is that you are writing to a file while you are also reading from it. Unlike some file types (JPG, PNG, etc) that "load" all of the data into memory, iTextSharp reads the data as a stream. You either need to use two files and swap them at the end or you can force iTextSharp to "load" the first file by binding your PdfReader to a byte array of the file.
PdfReader reader = new PdfReader(System.IO.File.ReadAllBytes(filePath));
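The other option mentioned above, using two files and swapping them at the end, could be sketched as follows. It uses only System.IO; the class name, the ".tmp" suffix, and the transform delegate (a stand-in for the stamping step) are hypothetical and not part of iTextSharp.

```csharp
using System;
using System.IO;

public static class SafeRewrite
{
    // Reads from the original file, writes the result to a temporary file,
    // then replaces the original once both handles are closed.
    // 'transform' is a hypothetical stand-in for the real processing step.
    public static void RewriteInPlace(string path, Action<Stream, Stream> transform)
    {
        string tempPath = path + ".tmp";

        using (var input = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (var output = new FileStream(tempPath, FileMode.Create, FileAccess.Write))
        {
            transform(input, output);
        }

        // No handle to either file is open at this point, so the swap cannot
        // collide with our own reader or writer.
        File.Copy(tempPath, path, overwrite: true);
        File.Delete(tempPath);
    }
}
```

The reader and the writer never share a file, so the write cannot be blocked by the read handle. If another process still holds the original file, the final copy would still fail.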
I suggest you use the FileShare enumeration when you open the file. Try opening the file with FileShare.None:
File.Open(fileName, FileMode.Open, FileAccess.Read, FileShare.None);
Try to call .Dispose() on your PDF reader (or whatever you use for creating it) when you save the file for the first time.
Try this solution if you think it's feasible for you: once the web method creates the file in the Temp folder, copy it to another location (or to the same location under a different name) and pass the path of the copy to your PDF reader.