PdfStamper being disposed

The PdfStamper I'm passing in to this method is being disposed of at the end of the method - why, and how do I stop it? I'm trying to create a page object from the template, which I can then add to the PdfStamper X number of times.
//real code
public void DoSpecialAction(PdfStamper pdfStamper)
{
using (var pdfTemplate = new PdfReader(_extraPageTemplatePath))
using (var pdfReader = new PdfReader(pdfTemplate))
{
PdfImportedPage page = pdfStamper.GetImportedPage(pdfReader, 1);
pdfStamper.InsertPage(3, pdfReader.GetPageSize(1));
PdfContentByte pb = pdfStamper.GetUnderContent(3);
pb.AddTemplate(page, 0, 0);
}
}
The program structure is as follows:
//pseudocode
class PrintFieldsToPdf {
foreach (normalfield) {
PrintNormalFields();
}
foreach (specialaction) {
DoSpecialAction(pdfStamper);
}
pdfStamper.Close(); //at this point the object has been deallocated
}
Throwing the following exception:
An exception of type 'System.ObjectDisposedException' occurred in mscorlib.dll but was not handled in user code
Additional information: Cannot access a closed file.

The OP eventually commented:
I have a hunch it may be that the page object never actually gets copied until the PdfStamper calls Close and writes the file, and therefore the PdfReader I'm using to read the extra page template is causing the issue, as it is disposed of at the end of my method, before PdfStamper is closed.
His hunch was correct: The copying of at least certain parts of the original page is delayed until the PdfStamper is being closed. This allows for certain optimizations in case multiple pages from the same PdfReader instance are imported in separate calls.
The use case of imports from many different PdfReaders had also been on the mind of the iText(Sharp) developers. So they provided a way to tell the PdfStamper to copy everything required from a given PdfReader at the time the user is sure he won't copy anything else from it:
public void DoSpecialAction(PdfStamper pdfStamper)
{
using (var pdfTemplate = new PdfReader(_extraPageTemplatePath))
using (var pdfReader = new PdfReader(pdfTemplate))
{
PdfImportedPage page = pdfStamper.GetImportedPage(pdfReader, 1);
pdfStamper.InsertPage(3, pdfReader.GetPageSize(1));
PdfContentByte pb = pdfStamper.GetUnderContent(3);
pb.AddTemplate(page, 0, 0);
// Copy everything required from the PdfReader
pdfStamper.Writer.FreeReader(pdfReader);
}
}

Related

C# - Cannot access a closed stream

My code for PDF creation with iTextSharp version 5.5.13.2 is returning the error "Cannot access a closed stream."
I am unsure how this error could arise, as I have my code encapsulated within using statements. Debugging results in the app going into break state.
PdfWriter writer = PdfWriter.GetInstance(doc, ms);

Looking at the source code for the (deprecated) iTextSharp 5.5.13.2, I can find the source for DocWriter (the base class of PdfWriter) and its Close method:
public virtual void Close() {
open = false;
os.Flush();
if (closeStream)
os.Close();
}
os in this case is whatever was passed as the second argument to PdfWriter.GetInstance (ms in your case). Searching the source for closeStream shows that it is exposed as the property CloseStream:
public virtual bool CloseStream {
get {
return closeStream;
}
set {
closeStream = value;
}
}
Putting it all together, Close is automatically called by the Dispose method of DocWriter:
public virtual void Dispose() {
Close();
}
So, if you don't want the PdfWriter to close your ms, you'll need to set writer.CloseStream = false; before your PdfWriter gets closed.
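As a minimal sketch of that advice, assuming iTextSharp 5.x: the class name PdfInMemory, the method name CreatePdf and the "Hello" paragraph are made up for illustration and are not from the question.

```csharp
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

public static class PdfInMemory
{
    // Hypothetical helper: builds a one-paragraph PDF and returns its bytes.
    public static byte[] CreatePdf()
    {
        using (var ms = new MemoryStream())
        {
            var doc = new Document();
            PdfWriter writer = PdfWriter.GetInstance(doc, ms);

            // Tell the writer to leave ms open when it is closed.
            writer.CloseStream = false;

            doc.Open();
            doc.Add(new Paragraph("Hello"));

            // Closing the document finishes the PDF; ms stays open.
            doc.Close();

            // ms can still be read or repositioned here.
            ms.Position = 0;
            return ms.ToArray();
        }
    }
}
```

If all you need afterwards is the byte array, MemoryStream.ToArray() also works on a closed MemoryStream, so the flag matters mainly when you want to keep reading from or seeking in the stream itself.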

Open XML WordprocessingDocument with MemoryStream is 0KB

I am trying to learn how to use Microsoft's Open XML SDK. I followed their steps on how to create a Word document using a FileStream and it worked perfectly. Now I want to create a Word document but only in memory, and wait for the user to specify whether they would like to save the file or not.
This document by Microsoft says how to deal with in-memory documents using MemoryStream; however, the document is first loaded from an existing file and "dumped" into a MemoryStream. What I want is to create a document entirely in memory (not based on a file in a drive). What I thought would achieve that was this code:
// This is almost the same as Microsoft's code except I don't
// dump any files into the MemoryStream
using (var mem = new MemoryStream())
{
using (var doc = WordprocessingDocument.Create(mem, WordprocessingDocumentType.Document, true))
{
doc.AddMainDocumentPart().Document = new Document();
var body = doc.MainDocumentPart.Document.AppendChild(new Body());
var paragraph = body.AppendChild(new Paragraph());
var run = paragraph.AppendChild(new Run());
run.AppendChild(new Text("Hello docx"));
using (var file = new FileStream(destination, FileMode.CreateNew))
{
mem.WriteTo(file);
}
}
}
But the result is a file that is 0KB and that can't be read by Word. At first I thought it was because of the size of the MemoryStream, so I provided it with an initial size of 1024, but the results were the same. On the other hand, if I change the MemoryStream for a FileStream it works perfectly.
My question is whether what I want to do is possible, and if so, how? I guess it must be possible, just not how I'm doing it. If it isn't possible what alternative do I have?

There are a couple of things going on here:
First, unlike Microsoft's sample, I was nesting the using block that writes the file to disk inside the block that creates and modifies the document. The WordprocessingDocument does not get saved to the stream until it is disposed or the Save() method is called. The WordprocessingDocument gets disposed automatically when it reaches the end of its using block. If I had not nested the third using statement, thus reaching the end of the second using statement before trying to save the file, I would have allowed the document to be written to the MemoryStream; instead I was writing a still-empty stream to disk (hence the 0KB file).
I suppose calling Save() might have helped, but it is not supported by .NET Core (which is what I'm using). You can check whether Save() is supported on your system by checking CanSave.
/// <summary>
/// Gets a value indicating whether saving the package is supported by calling <see cref="Save"/>. Some platforms (such as .NET Core), have limited support for saving.
/// If <c>false</c>, in order to save, the document and/or package needs to be fully closed and disposed and then reopened.
/// </summary>
public static bool CanSave { get; }
So the code ended up being almost identical to Microsoft's code except I don't read any files beforehand, rather I just begin with an empty MemoryStream:
using (var mem = new MemoryStream())
{
using (var doc = WordprocessingDocument.Create(mem, WordprocessingDocumentType.Document, true))
{
doc.AddMainDocumentPart().Document = new Document();
var body = doc.MainDocumentPart.Document.AppendChild(new Body());
var paragraph = body.AppendChild(new Paragraph());
var run = paragraph.AppendChild(new Run());
run.AppendChild(new Text("Hello docx"));
}
using (var file = new FileStream(destination, FileMode.CreateNew))
{
mem.WriteTo(file);
}
}
Also, you don't need to reopen the document before saving it, but if you do, remember to use Open() instead of Create(), because Create() will empty the MemoryStream and you'll again end up with a 0KB file.
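The "dispose first, then read the stream" rule is not specific to the Open XML SDK. As a rough analogy only (this sketch uses System.IO.Compression.ZipArchive rather than the SDK, and the class, method and entry names are made up), a zip archive written to a MemoryStream is likewise incomplete until the archive object is disposed:

```csharp
using System.IO;
using System.IO.Compression;

public static class DisposeBeforeReading
{
    // Writes one entry into an in-memory zip and reports how long the
    // stream was just before the archive was disposed.
    public static byte[] BuildArchive(out long lengthBeforeDispose)
    {
        using (var mem = new MemoryStream())
        {
            // leaveOpen: true keeps mem usable after the archive is disposed.
            using (var zip = new ZipArchive(mem, ZipArchiveMode.Create, leaveOpen: true))
            {
                ZipArchiveEntry entry = zip.CreateEntry("hello.txt");
                using (var writer = new StreamWriter(entry.Open()))
                {
                    writer.Write("Hello docx");
                }

                // The archive has not written its closing records yet.
                lengthBeforeDispose = mem.Length;
            }

            // Only after the inner using block is the archive complete.
            return mem.ToArray();
        }
    }

    // Reopens the finished archive and reads the entry back.
    public static string ReadEntry(byte[] archiveBytes)
    {
        using (var mem = new MemoryStream(archiveBytes))
        using (var zip = new ZipArchive(mem, ZipArchiveMode.Read))
        using (var reader = new StreamReader(zip.GetEntry("hello.txt").Open()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Copying the stream inside the inner using block would give a truncated archive, which is the same kind of mistake as calling mem.WriteTo(file) before the WordprocessingDocument has been disposed.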

You're passing mem to WordprocessingDocument.Create(), which is creating the document from the (now-empty) MemoryStream, however, I don't think that is associating the MemoryStream as the backing store of the document. That is, mem is only the input of the document, not the output as well. Therefore, when you call mem.WriteTo(file);, mem is still empty (the debugger would confirm this).
Then again, the linked document does say "you must supply a resizable memory stream to [Open()]", which implies that the stream will be written to, so maybe mem does become the backing store but nothing has been written to it yet because the AutoSave property (for which you specified true in Create()) hasn't had a chance to take effect yet (emphasis mine)...
Gets a flag that indicates whether the parts should be saved when disposed.
I see that WordprocessingDocument has a SaveAs() method, and substituting that for the FileStream in the original code...
using (var mem = new MemoryStream())
using (var doc = WordprocessingDocument.Create(mem, WordprocessingDocumentType.Document, true))
{
doc.AddMainDocumentPart().Document = new Document();
var body = doc.MainDocumentPart.Document.AppendChild(new Body());
var paragraph = body.AppendChild(new Paragraph());
var run = paragraph.AppendChild(new Run());
run.AppendChild(new Text("Hello docx"));
// Explicitly close the OpenXmlPackage returned by SaveAs() so destination doesn't stay locked
doc.SaveAs(destination).Close();
}
...produces the expected file for me. Interestingly, after the call to doc.SaveAs(), and even if I insert a call to doc.Save(), mem.Length and mem.Position are both still 0, which does suggest that mem is only used for initialization.
One other thing I would note is that the sample code is calling Open(), whereas you are calling Create(). The documentation is pretty sparse as far as how those two methods differ, but I would have suggested you try creating your document with Open() instead...
using (MemoryStream mem = new MemoryStream())
using (WordprocessingDocument doc = WordprocessingDocument.Open(mem, true))
{
// ...
}
...however when I do that Open() throws an exception, presumably because mem has no data. So, it seems the names are somewhat self-explanatory in that Create() initializes new document data whereas Open() expects existing data. I did find that if I feed Create() a MemoryStream filled with random garbage...
using (var mem = new MemoryStream())
{
// Fill mem with garbage
byte[] buffer = new byte[1024];
new Random().NextBytes(buffer);
mem.Write(buffer, 0, buffer.Length);
mem.Position = 0;
using (var doc = WordprocessingDocument.Create(mem, WordprocessingDocumentType.Document, true))
{
// ...
}
}
...it still produces the exact same document XML as the first code snippet above, which makes me wonder why Create() even needs an input Stream at all.

I was facing the same problem today. In the end, the solution is to close the document so that the MemoryStream gets filled. Lance U. Matthews's example helped me a lot. After checking how other document types are exported, I realized that each of them calls Close() once the document has been filled, but Microsoft's example doesn't show it. Here is my example:
private MemoryStream GenerateWord(DataTable dt)
{
MemoryStream mStream = new MemoryStream();
// Create Document
OpenXMLPackaging.WordprocessingDocument wordDocument =
OpenXMLPackaging.WordprocessingDocument.Create(mStream, OpenXML.WordprocessingDocumentType.Document, true);
// Add a main document part.
OpenXMLPackaging.MainDocumentPart mainPart = wordDocument.AddMainDocumentPart();
mainPart.Document = new OpenXMLWordprocessing.Document();
OpenXMLWordprocessing.Body body = mainPart.Document.AppendChild(new OpenXMLWordprocessing.Body());
OpenXMLWordprocessing.Table table = new OpenXMLWordprocessing.Table();
body.AppendChild(table);
OpenXMLWordprocessing.TableRow tr = new OpenXMLWordprocessing.TableRow();
foreach (DataColumn c in dt.Columns)
{
tr.Append(new OpenXMLWordprocessing.TableCell(new OpenXMLWordprocessing.Paragraph(new OpenXMLWordprocessing.Run(new OpenXMLWordprocessing.Text(c.ColumnName.ToString())))));
}
table.Append(tr);
foreach (DataRow r in dt.Rows)
{
if (dt.Rows.Count > 0)
{
OpenXMLWordprocessing.TableRow dataRow = new OpenXMLWordprocessing.TableRow();
for (int h = 0; h < dt.Columns.Count; h++)
{
dataRow.Append(new OpenXMLWordprocessing.TableCell(new OpenXMLWordprocessing.Paragraph(new OpenXMLWordprocessing.Run(new OpenXMLWordprocessing.Text(r[h].ToString())))));
}
table.Append(dataRow);
}
}
mainPart.Document.Save();
wordDocument.Close();
mStream.Position = 0;
return mStream;
}

Printing a Local Report without Preview - Stream size exceeded or A generic error occurred in GDI+ C#

I am using this article to print my rdlc directly to the printer, but when I try to create a Metafile object by passing it a stream, I get an error (A generic error occurred in GDI+).
Code:
using System;
using System.IO;
using System.Data;
using System.Text;
using System.Drawing.Imaging;
using System.Drawing.Printing;
using System.Collections.Generic;
using System.Windows.Forms;
using Microsoft.Reporting.WinForms;
public class Demo : IDisposable
{
private int m_currentPageIndex;
private IList<Stream> m_streams;
// Routine to provide to the report renderer, in order to
// save an image for each page of the report.
private Stream CreateStream(string name, string fileNameExtension, Encoding encoding, string mimeType, bool willSeek)
{
DataSet ds = new DataSet();
ds.Tables.Add(dsData.Tables[0].Copy());
using (MemoryStream stream = new MemoryStream())
{
IFormatter bf = new BinaryFormatter();
ds.RemotingFormat = SerializationFormat.Binary;
bf.Serialize(stream, ds);
data = stream.ToArray();
}
Stream stream1 = new MemoryStream(data);
m_streams.Add(stream1);
return stream1;
}
// Export the given report as an EMF (Enhanced Metafile) file.
private void Export(LocalReport report)
{
string deviceInfo =
@"<DeviceInfo>
<OutputFormat>EMF</OutputFormat>
<PageWidth>8.5in</PageWidth>
<PageHeight>11in</PageHeight>
<MarginTop>0.25in</MarginTop>
<MarginLeft>0.25in</MarginLeft>
<MarginRight>0.25in</MarginRight>
<MarginBottom>0.25in</MarginBottom>
</DeviceInfo>";
Warning[] warnings;
m_streams = new List<Stream>();
report.Render("Image", deviceInfo, CreateStream,
out warnings);
foreach (Stream stream in m_streams)
stream.Position = 0;
}
// Handler for PrintPageEvents
private void PrintPage(object sender, PrintPageEventArgs ev)
{
Metafile pageImage = new
Metafile(m_streams[m_currentPageIndex]);
// Adjust rectangular area with printer margins.
Rectangle adjustedRect = new Rectangle(
ev.PageBounds.Left - (int)ev.PageSettings.HardMarginX,
ev.PageBounds.Top - (int)ev.PageSettings.HardMarginY,
ev.PageBounds.Width,
ev.PageBounds.Height);
// Draw a white background for the report
ev.Graphics.FillRectangle(Brushes.White, adjustedRect);
// Draw the report content
ev.Graphics.DrawImage(pageImage, adjustedRect);
// Prepare for the next page. Make sure we haven't hit the end.
m_currentPageIndex++;
ev.HasMorePages = (m_currentPageIndex < m_streams.Count);
}
private void Print()
{
if (m_streams == null || m_streams.Count == 0)
throw new Exception("Error: no stream to print.");
PrintDocument printDoc = new PrintDocument();
if (!printDoc.PrinterSettings.IsValid)
{
throw new Exception("Error: cannot find the default printer.");
}
else
{
printDoc.PrintPage += new PrintPageEventHandler(PrintPage);
m_currentPageIndex = 0;
printDoc.Print();
}
}
// Create a local report for Report.rdlc, load the data,
// export the report to an .emf file, and print it.
private void Run()
{
LocalReport report = new LocalReport();
report.ReportPath = @"Reports\InvoiceReportTest.rdlc";
report.DataSources.Add(
new ReportDataSource("DataSet1", dsPrintDetails));
Export(report);
Print();
}
public void Dispose()
{
if (m_streams != null)
{
foreach (Stream stream in m_streams)
stream.Close();
m_streams = null;
}
}
public static void Main(string[] args)
{
using (Demo demo = new Demo())
{
demo.Run();
}
}
}
It gives me the error when the stream size grows too large or when the rdlc has more static content.
My dataset that I use to create the stream is:
I don't know whether static content should affect the stream size, but I get no error if I remove some content from the rdlc, and when I add it back it throws the error again (A generic error occurred in GDI+).

A generic error exception is a pretty lousy exception to diagnose. It conveys little info beyond "it did not work". The exception is raised whenever the Graphics class runs into trouble using drawing objects or rendering the drawing commands to the underlying device context. There is a clear and obvious reason for that in this code and from the things you did to troubleshoot it: the program ran out of memory.
The Graphics class treats its underlying device context as unmanaged resource, the basic reason why you don't get the more obvious OutOfMemoryException. It usually is, like when you use it to render to the screen or a printer, just not in this case because it renders to a MemoryStream. Some odds that you can see the first-chance notification for it in the VS Output window. Adding the Commit Size column in Task Manager can provide an additional diagnostic, trouble starts when it heads north of a gigabyte.
What is especially notable about this code is that the program will always fail with this exception. Give it a report with too many pages or a data table with too many records and it is doomed. It will inevitably always require too much memory to store the metafile records in the memory streams. The only thing you can do about it is to make the program more memory-efficient so it can deal with production demands. Lots of opportunities here.
First observation is that you inherited some sloppiness from the MSDN code sample. Which is common and something in general to beware of, such samples focus on demonstrating coding techniques. Making the code bullet-proof gets in the way of the mission, untested and left as an exercise to the reader. Notable is that it ignores the need to Dispose() too much. The provided Dispose() method does not actually accomplish anything, disposing a memory stream merely marks it as unreadable. What it does not do is properly dispose the Metafile, LocalReport and PrintDocument objects. Use the using statement to correct these omissions.
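A rough sketch of what that could look like for the printing part, assuming the same m_streams list as in the question. The class name PagePrinter is made up, and the hard-margin adjustment from the original PrintPage is left out to keep it short:

```csharp
using System.Collections.Generic;
using System.Drawing;
using System.Drawing.Imaging;
using System.Drawing.Printing;
using System.IO;

public class PagePrinter
{
    private readonly IList<Stream> m_streams;
    private int m_currentPageIndex;

    public PagePrinter(IList<Stream> streams)
    {
        m_streams = streams;
    }

    private void PrintPage(object sender, PrintPageEventArgs ev)
    {
        // Dispose each Metafile as soon as its page has been drawn.
        using (var pageImage = new Metafile(m_streams[m_currentPageIndex]))
        {
            ev.Graphics.FillRectangle(Brushes.White, ev.PageBounds);
            ev.Graphics.DrawImage(pageImage, ev.PageBounds);
        }

        m_currentPageIndex++;
        ev.HasMorePages = m_currentPageIndex < m_streams.Count;
    }

    public void Print()
    {
        // PrintDocument is disposable too.
        using (var printDoc = new PrintDocument())
        {
            printDoc.PrintPage += PrintPage;
            m_currentPageIndex = 0;
            printDoc.Print();
        }
    }
}
```

The LocalReport in Run() can be wrapped in a using statement in the same way.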
Second observation is that the addition to the CreateStream() method is hugely wasteful. Also the bad kind of waste, it is very rough on the Large Object Heap. There is no need to Copy() the DataTable, the report doesn't write to it. There is no need to convert the MemoryStream to an array and create a MemoryStream from the array again, the first MemoryStream is already good as-is. Don't use using, set its Position to 0. This is pretty likely good enough to solve the problem.
If you still have trouble then you should consider using a FileStream instead of a MemoryStream. It will be just as efficient, the OS ensures it is, having to pick a name for the file is the only additional burden. Not a real issue here, use Path.GetTempFileName(). Note how the Dispose() method now becomes useful and necessary, you'll also want to delete the file again. Or better, use the FileOptions.DeleteOnClose option when you open the file so it is automagic.
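If you go the FileStream route, the callback could look roughly like this. It is a sketch, not tested against the report renderer; the class name TempFilePageStreams and the 4096 buffer size are arbitrary choices:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public class TempFilePageStreams : IDisposable
{
    private readonly IList<Stream> m_streams = new List<Stream>();

    // Same signature as the CreateStream callback in the question.
    public Stream CreateStream(string name, string fileNameExtension,
        Encoding encoding, string mimeType, bool willSeek)
    {
        // GetTempFileName() creates an empty file and returns its path.
        string path = Path.GetTempFileName();

        // DeleteOnClose removes the file when the stream is closed.
        Stream stream = new FileStream(
            path, FileMode.Open, FileAccess.ReadWrite, FileShare.None,
            4096, FileOptions.DeleteOnClose);

        m_streams.Add(stream);
        return stream;
    }

    // Closing the streams now matters: it is what deletes the temp files.
    public void Dispose()
    {
        foreach (Stream stream in m_streams)
            stream.Dispose();
        m_streams.Clear();
    }
}
```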
And last but not least, you'll want to take advantage of the OS capabilities, modern machines can provide terabytes of address space and LOH fragmentation is never a problem. Project > Properties > Build tab > untick the "Prefer 32-bit" checkbox. Repeat for the Release configuration. You never prefer it when you battle out-of-memory problems.

I was using the same functions as you and got the same problem. I don't know why, but with the function below it runs on my end, so using it may solve your problem:
private Stream CreateStream(string name, string fileNameExtension, Encoding encoding, string mimeType, bool willSeek)
{
Stream stream = new MemoryStream();
m_streams.Add(stream);
return stream;
}

Merging N pdf files, created from html using ITextSharp, to another blank pdf file

I need to merge N PDF files into one. I create a blank file first
byte[] pdfBytes = null;
var ms = new MemoryStream();
var doc = new iTextSharp.text.Document();
var cWriter = new PdfCopy(doc, ms);
Later I cycle through html strings array
foreach (NBElement htmlString in someElement.Children())
{
byte[] msTempDoc = getPdfDocFrom(htmlString.GetString(), cssString.GetString());
addPagesToPdf(cWriter, msTempDoc);
}
In getPdfDocFrom I create pdf file using XMLWorkerHelper and return it as byte array
private byte[] getPdfDocFrom(string htmlString, string cssString)
{
var tempMs = new MemoryStream();
byte[] tempMsBytes;
var tempDoc = new iTextSharp.text.Document();
var tempWriter = PdfWriter.GetInstance(tempDoc, tempMs);
tempDoc.Open();
using (var msCss = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(cssString)))
{
using (var msHtml = new MemoryStream(System.Text.Encoding.UTF8.GetBytes(htmlString)))
{
//Parse the HTML
iTextSharp.tool.xml.XMLWorkerHelper.GetInstance().ParseXHtml(tempWriter, tempDoc, msHtml, msCss);
tempMsBytes = tempMs.ToArray();
}
}
tempDoc.Close();
return tempMsBytes;
}
Later on I try to add pages from this PDF file to the blank one.
private static void addPagesToPdf(PdfCopy mainDocWriter, byte[] sourceDocBytes)
{
using (var msOut = new MemoryStream())
{
PdfReader reader = new PdfReader(new MemoryStream(sourceDocBytes));
int n = reader.NumberOfPages;
PdfImportedPage page;
for (int i = 1; i <= n; i++)
{
page = mainDocWriter.GetImportedPage(reader, i);
mainDocWriter.AddPage(page);
}
}
}
It breaks when it tries to create a PdfReader from the byte array I pass to the function. "Rebuild failed: trailer not found.; Original message: PDF startxref not found."
I used another library to work with PDF before. I passed 2 PdfDocuments as objects and just added pages from one to the other in a loop. It didn't support CSS though, so I had to switch to iTextSharp.
I don't quite get the difference between PdfWriter and PdfCopy.

There is a logical error in your code. When you create a document from scratch as is done in the getPdfDocFrom() method, the document isn't complete until you've triggered the Close() method. In this Close() method, a trailer is created as well as a cross-reference (xref) table. The error tells you that those are missing.
Indeed, you do call the Close() method:
tempDoc.Close();
But by the time you Close() the document, it's too late: you have already created the tempMsBytes array. You need to create that array after you close the document.
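For illustration, a reordered version of the method could look like the sketch below. It uses the same iTextSharp and XMLWorker calls as the question; only the wrapper class name HtmlToPdf is made up.

```csharp
using System.IO;
using System.Text;
using iTextSharp.text;
using iTextSharp.text.pdf;
using iTextSharp.tool.xml;

public static class HtmlToPdf
{
    public static byte[] GetPdfDocFrom(string htmlString, string cssString)
    {
        using (var tempMs = new MemoryStream())
        {
            var tempDoc = new Document();
            PdfWriter tempWriter = PdfWriter.GetInstance(tempDoc, tempMs);
            tempDoc.Open();

            using (var msCss = new MemoryStream(Encoding.UTF8.GetBytes(cssString)))
            using (var msHtml = new MemoryStream(Encoding.UTF8.GetBytes(htmlString)))
            {
                // Parse the HTML into the open document.
                XMLWorkerHelper.GetInstance().ParseXHtml(tempWriter, tempDoc, msHtml, msCss);
            }

            // Close first: this is when the xref table and trailer are written.
            tempDoc.Close();

            // Only now take the bytes.
            return tempMs.ToArray();
        }
    }
}
```

In .NET, MemoryStream.ToArray() is documented to work even after the stream has been closed, so taking the bytes after Close() is safe here.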
Edit: I don't know anything about C#, but if MemoryStream clears its buffer after closing it, you could use mainDocWriter.CloseStream = false; so that the MemoryStream isn't closed when you close the document.
In Java, it would be a bad idea to set the "close stream" parameter to false. When I read the answers to the question Create PDF in memory instead of physical file I see that C# probably doesn't always require this extra line.
Remark: merging files by adding PdfImportedPage instances to a PdfWriter is an example of bad taste. If you are using iTextSharp 5 or earlier, you should use PdfCopy or PdfSmartCopy to do that. If you use PdfWriter, you throw away a lot of information (e.g. link annotations).

Modify existing pdf (add/remove pages) while preserving metadata

My goal is to open an existing pdf, add or remove some pages while preserving the metadata (Author, Subject, ...) in a Windows.Forms C# application.
I use iTextSharp and found examples how to add or remove pages by using the PdfConcatenate class. To keep the metadata I use a PdfStamper afterwards. To speed things up I want to do the modifications in memory before storing the result to disk.
The problem is NOT adding or removing the pages but to keep the metadata in the same step.
So can anybody tell me or give an example of how to achieve this (better), or am I on the completely wrong track?
Here my current code (see comments for problem related lines):
public void RemovePagesInFile(string documentLocation, int pageIndexFrom, int pageCount)
{
// TB: open the pdf
using (PdfReader sourcePdfReader = new PdfReader(documentLocation))
using (MemoryStream concatenatedTargetStream = new MemoryStream((int)sourcePdfReader.FileLength))
{
// TB: use a concatenator to create a new pdf containing only the desired pages
PdfConcatenate concatenator = new PdfConcatenate(concatenatedTargetStream);
// TB: create a list with the page numbers to keep
List<int> pagesToKeep = new List<int>();
for (int i = 1; i <= pageIndexFrom; i++)
{
pagesToKeep.Add(i);
}
for (int i = pageIndexFrom + pageCount + 1; i <= sourcePdfReader.NumberOfPages; i++)
{
pagesToKeep.Add(i);
}
// TB: execute the page copy
sourcePdfReader.SelectPages(pagesToKeep);
concatenator.AddPages(sourcePdfReader);
// TB: problem(s) here:
// 1. when calling concatenator.Close() the memory stream gets disposed as expected.
// concatenator.Close();
// 2. even when calling concatenator.Writer.Flush() the memory stream seems to be missing content (error when creating targetReader, see below).
// concatenator.Writer.Flush();
// 3. when keeping the concatenator open, the same error as above occurs (I assume not all bytes have been written to the memory stream)
// TB: preserve the meta data from the source document
// => ERROR here: "Rebuild trailer not found. Original Error: PDF startxref not found"
using (PdfReader targetReader = new PdfReader(concatenatedTargetStream))
using (MemoryStream targetStream = new MemoryStream((int)concatenatedTargetStream.Length))
{
using (PdfStamper stamper = new PdfStamper(targetReader, targetStream))
{
stamper.MoreInfo = sourcePdfReader.Info;
// TB: same problem as above with stamper ?
stamper.Close();
}
// TB: close the reader to be able to access the source pdf
sourcePdfReader.Close();
// TB: write the modified pdf to the disk
File.WriteAllBytes(documentLocation, targetStream.ToArray());
}
}
}

Two changes need to be made. Call
concatenator.Writer.CloseStream = false
before calling
concatenator.Close()
Do the same thing for the PdfStamper and you're set.
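Put together, a condensed version of the question's method might look like this sketch. The helper name KeepPages and its signature are made up for illustration; it returns the bytes instead of writing the file, and it builds the second reader from ToArray() so that it does not depend on the stream position.

```csharp
using System.Collections.Generic;
using System.IO;
using iTextSharp.text.pdf;

public static class PdfPageRemover
{
    // Hypothetical helper: keeps only pagesToKeep and carries over the Info metadata.
    public static byte[] KeepPages(string documentLocation, List<int> pagesToKeep)
    {
        using (var sourceReader = new PdfReader(documentLocation))
        using (var concatenated = new MemoryStream())
        using (var target = new MemoryStream())
        {
            var concatenator = new PdfConcatenate(concatenated);
            // Leave the MemoryStream open when the concatenator is closed.
            concatenator.Writer.CloseStream = false;

            sourceReader.SelectPages(pagesToKeep);
            concatenator.AddPages(sourceReader);
            // Close() is what completes the PDF (xref table, trailer).
            concatenator.Close();

            using (var targetReader = new PdfReader(concatenated.ToArray()))
            {
                var stamper = new PdfStamper(targetReader, target);
                // Same flag for the stamper.
                stamper.Writer.CloseStream = false;
                stamper.MoreInfo = sourceReader.Info;
                stamper.Close();
            }

            return target.ToArray();
        }
    }
}
```

The caller can then close the source reader and pass the returned bytes to File.WriteAllBytes(documentLocation, ...), as in the question.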
