Is PdfStamper disposing output stream? (iTextSharp) - c#

I am using iTextSharp to add page numbers to a PDF with C#. While running code analysis the MemoryStream for the output is suspected to be disposed more than once. See this warning generated by Visual Studio. Is this an API problem? Should the second parameter of PdfStamper be marked as out? Is there a way for me to fix this warning?
MemoryStream mem = null;
PdfReader reader = null;
PdfStamper stamper = null;
try
{
mem = new MemoryStream();
reader = new PdfReader(m_pdf);
stamper = new PdfStamper(reader, mem);
// do stuff
stamper.Close();
var result = mem.ToArray();
}
finally
{
if(stamper != null)
{
stamper.Dispose();
}
if (reader != null)
{
reader.Dispose();
}
if (mem != null)
{
mem.Dispose();
}
}

This isn't really an answer but to expand upon what #mkl said, switch over to using directives since those perform the try/finally stuff for you automatically.
Below is the way I (and probably everyone else that uses iTextSharp) would generally recommend to interact with iTextSharp. The outer using is BCL stuff, in this case the MemoryStream and the inner using statements are iTextSharp stuff.
//Will hold our raw PDF bytes
Byte[] result;
//BCL stuff first
using (var mem = new MemoryStream()) {
//iText stuff in the middle
using (var reader = new PdfReader(m_pdf)) {
using (var stamper = new PdfStamper(reader, mem)) {
// do stuff
}
}
//iText is completely done and disposed of at this point
//so we can now grab the raw bytes that represent a PDF
result = mem.ToArray();
}
As an aside, not necessarily for the OP but just in case someone else sees this, there is almost never (and by "almost never" I really mean "never") a good reason to not close the underlying stream. You can read from the stream by grabbing the raw bytes and writing to it again never makes sense.

The following allowed me to keep the MemoryStream open after the stamper is closed:
pdfStamper.Writer.CloseStream = false;

Related

ITextSharp - Cannot Open .pdf because it is being used by another process?

I am having a issue where I write to a pdf file and then close it and later on open it up again and try to read it.
I get "Cannot Open .pdf because it is being used by another process?"
var path = // get path
Directory.CrateDirctory(path);
using(var writer = new PdfWriter(path, //writer properties)){
var reader = new PdfReader(//template);
var pdfDoc = new PdfDcoument(reader, writer);
writer.SetCloseStream(false)
// some code to fill in the pdf
reader.Close();
pdfDoc.Close();
}
//later in code
using(var stream = new MemoryStream(File.ReadAllBytes(//the file that I crated above)))
{
// do some stuff here
}
I get the error right on the using statement above. I thought all the streams for creating the pdf are closed at this point but it seems like it is not.
The issue is with the line writer.SetCloseStream(false);. That is telling it to not close the stream when the writer is closed. Since the stream is left open you will get an IOException when you create another stream for reading. Remove this line or set to true to resolve the issue.
If you need to keep the stream open for whatever reason, like issues with flushing the write buffer too soon when PdfWriter is closed. Then you can get access to the write stream and close it later when you are ready to read it. Something like this:
Stream outputStream;
using (var writer = new PdfWriter(path)){
writer.SetCloseStream(false);
// do whatever you need here
outputStream = writer.GetOutputStream();
}
// do whatever else you need here
// close the stream before creating a read stream
if(null != outputStream) {
outputStream.Close();
}
using (var stream = new MemoryStream(File.ReadAllBytes(path)))
{
// do some stuff here
}

Open XML WordprocessingDocument with MemoryStream is 0KB

I am trying to learn how to use with Microsoft's Open XML SDK. I followed their steps on how to create a Word document using a FileStream and it worked perfectly. Now I want to create a Word document but only in memory, and wait for the user to specify whether they would like to save the file or not.
This document by Microsoft says how to deal with in-memory documents using MemoryStream, however, the document is first loaded from an existing file and "dumped" into a MemorySteam. What I want is to create a document entirely in memory (not based on a file in a drive). What I thought would achieve that was this code:
// This is almost the same as Microsoft's code except I don't
// dump any files into the MemoryStream
using (var mem = new MemoryStream())
{
using (var doc = WordprocessingDocument.Create(mem, WordprocessingDocumentType.Document, true))
{
doc.AddMainDocumentPart().Document = new Document();
var body = doc.MainDocumentPart.Document.AppendChild(new Body());
var paragraph = body.AppendChild(new Paragraph());
var run = paragraph.AppendChild(new Run());
run.AppendChild(new Text("Hello docx"));
using (var file = new FileStream(destination, FileMode.CreateNew))
{
mem.WriteTo(file);
}
}
}
But the result is a file that is 0KB and that can't be read by Word. At first I thought it was because of the size of the MemoryStream so I provided it with an initial size of 1024 but the results were the same. On the other hand if I change the MemoryStream for a FileStreamit works perfectly.
My question is whether what I want to do is possible, and if so, how? I guess it must be possible, just not how I'm doing it. If it isn't possible what alternative do I have?
There's a couple of things going on here:
First, unlike Microsoft's sample, I was nesting the using block code that writes the file to disk inside the block that creates and modifies the file. The WordprocessingDocument gets saved to the stream until it is disposed or when the Save() method is called. The WordprocessingDocument gets disposed automatically when reaching the end of it's using block. If I had not nested the third using statement, thus reaching the end of the second using statement before trying to save the file, I would have allowed the document to be written to the MemoryStream- instead I was writing a still empty stream to disk (hence the 0KB file).
I suppose calling Save()might have helped, but it is not supported by .Net core (which is what I'm using). You can check whether Save()is supported on you system by checking CanSave.
/// <summary>
/// Gets a value indicating whether saving the package is supported by calling <see cref="Save"/>. Some platforms (such as .NET Core), have limited support for saving.
/// If <c>false</c>, in order to save, the document and/or package needs to be fully closed and disposed and then reopened.
/// </summary>
public static bool CanSave { get; }
So the code ended up being almost identical to Microsoft's code except I don't read any files beforehand, rather I just begin with an empty MemoryStream:
using (var mem = new MemoryStream())
{
using (var doc = WordprocessingDocument.Create(mem, WordprocessingDocumentType.Document, true))
{
doc.AddMainDocumentPart().Document = new Document();
var body = doc.MainDocumentPart.Document.AppendChild(new Body());
var paragraph = body.AppendChild(new Paragraph());
var run = paragraph.AppendChild(new Run());
run.AppendChild(new Text("Hello docx"));
}
using (var file = new FileStream(destination, FileMode.CreateNew))
{
mem.WriteTo(file);
}
}
Also you don't need to reopen the document before saving it, but if you do remember to use Open() instead of Create() because Create() will empty the MemoryStream and you'll also end with a 0KB file.
You're passing mem to WordprocessingDocument.Create(), which is creating the document from the (now-empty) MemoryStream, however, I don't think that is associating the MemoryStream as the backing store of the document. That is, mem is only the input of the document, not the output as well. Therefore, when you call mem.WriteTo(file);, mem is still empty (the debugger would confirm this).
Then again, the linked document does say "you must supply a resizable memory stream to [Open()]", which implies that the stream will be written to, so maybe mem does become the backing store but nothing has been written to it yet because the AutoSave property (for which you specified true in Create()) hasn't had a chance to take effect yet (emphasis mine)...
Gets a flag that indicates whether the parts should be saved when disposed.
I see that WordprocessingDocument has a SaveAs() method, and substituting that for the FileStream in the original code...
using (var mem = new MemoryStream())
using (var doc = WordprocessingDocument.Create(mem, WordprocessingDocumentType.Document, true))
{
doc.AddMainDocumentPart().Document = new Document();
var body = doc.MainDocumentPart.Document.AppendChild(new Body());
var paragraph = body.AppendChild(new Paragraph());
var run = paragraph.AppendChild(new Run());
run.AppendChild(new Text("Hello docx"));
// Explicitly close the OpenXmlPackage returned by SaveAs() so destination doesn't stay locked
doc.SaveAs(destination).Close();
}
...produces the expected file for me. Interestingly, after the call to doc.SaveAs(), and even if I insert a call to doc.Save(), mem.Length and mem.Position are both still 0, which does suggest that mem is only used for initialization.
One other thing I would note is that the sample code is calling Open(), whereas you are calling Create(). The documentation is pretty sparse as far as how those two methods differ, but I would have suggested you try creating your document with Open() instead...
using (MemoryStream mem = new MemoryStream())
using (WordprocessingDocument doc = WordprocessingDocument.Open(mem, true))
{
// ...
}
...however when I do that Open() throws an exception, presumably because mem has no data. So, it seems the names are somewhat self-explanatory in that Create() initializes new document data whereas Open() expects existing data. I did find that if I feed Create() a MemoryStream filled with random garbage...
using (var mem = new MemoryStream())
{
// Fill mem with garbage
byte[] buffer = new byte[1024];
new Random().NextBytes(buffer);
mem.Write(buffer, 0, buffer.Length);
mem.Position = 0;
using (var doc = WordprocessingDocument.Create(mem, WordprocessingDocumentType.Document, true))
{
// ...
}
}
...it still produces the exact same document XML as the first code snippet above, which makes me wonder why Create() even needs an input Stream at all.
I was facing the same problem today, after all, the solution is closing the document to fill the memorystream, here is the example, Lance U. Matthews's example help me alot, and finally I realized, after cheking others document types exports, after fill thems, each one calls method Close, but, Microsoft example doesn't show it
private MemoryStream GenerateWord(DataTable dt)
{
MemoryStream mStream = new MemoryStream();
// Create Document
OpenXMLPackaging.WordprocessingDocument wordDocument =
OpenXMLPackaging.WordprocessingDocument.Create(mStream, OpenXML.WordprocessingDocumentType.Document, true);
// Add a main document part.
OpenXMLPackaging.MainDocumentPart mainPart = wordDocument.AddMainDocumentPart();
mainPart.Document = new OpenXMLWordprocessing.Document();
OpenXMLWordprocessing.Body body = mainPart.Document.AppendChild(new OpenXMLWordprocessing.Body());
OpenXMLWordprocessing.Table table = new OpenXMLWordprocessing.Table();
body.AppendChild(table);
OpenXMLWordprocessing.TableRow tr = new OpenXMLWordprocessing.TableRow();
foreach (DataColumn c in dt.Columns)
{
tr.Append(new OpenXMLWordprocessing.TableCell(new OpenXMLWordprocessing.Paragraph(new OpenXMLWordprocessing.Run(new OpenXMLWordprocessing.Text(c.ColumnName.ToString())))));
}
table.Append(tr);
foreach (DataRow r in dt.Rows)
{
if (dt.Rows.Count > 0)
{
OpenXMLWordprocessing.TableRow dataRow = new OpenXMLWordprocessing.TableRow();
for (int h = 0; h < dt.Columns.Count; h++)
{
dataRow.Append(new OpenXMLWordprocessing.TableCell(new OpenXMLWordprocessing.Paragraph(new OpenXMLWordprocessing.Run(new OpenXMLWordprocessing.Text(r[h].ToString())))));
}
table.Append(dataRow);
}
}
mainPart.Document.Save();
wordDocument.Close();
mStream.Position = 0;
return mStream;
}

iText 7 - HTML to PDF write to MemoryStream instead of file

I'm using iText 7, specifically the HtmlConverter.ConvertToDocument method, to convert HTML to PDF. The problem is, I would really rather not create a PDF file on my server, I'd rather do everything in memory and just send it to the users browser so they can download it.
Could anyone show me an example of how to use this library but instead of writing to file write to a MemoryStream so I can send it directly to the browser?
I've been looking for examples and all I can seem to find are those which refer to file output.
I've tried the following, but keep getting an error about cannot access a closed memory stream.
public FileStreamResult pdf() {
using (var workStream = new MemoryStream())
using (var pdfWriter = new PdfWriter(workStream)) {
pdfWriter.SetCloseStream(false);
using (var document = HtmlConverter.ConvertToDocument(html, pdfWriter)) {
//Returns the written-to MemoryStream containing the PDF.
byte[] byteInfo = workStream.ToArray();
workStream.Write(byteInfo, 0, byteInfo.Length);
workStream.Position = 0;
return new FileStreamResult(workStream, "application/pdf");
}
//return new FileStreamResult(workStream, "application/pdf");
}
}
You meddle with the workStream before the document and pdfWriter have finished creating the result in it. Furthermore, the intent of your meddling is unclear, first you retrieve the bytes from the memory stream, then you write them back into it...?
public FileStreamResult pdf()
{
var workStream = new MemoryStream())
using (var pdfWriter = new PdfWriter(workStream))
{
pdfWriter.SetCloseStream(false);
using (var document = HtmlConverter.ConvertToDocument(html, pdfWriter))
{
}
}
workStream.Position = 0;
return new FileStreamResult(workStream, "application/pdf");
}
By the way, as you are essentially doing nothing special with the document returned by HtmlConverter.ConvertToDocument, you probably could use a different HtmlConverter method with less overhead in your code.
Generally this approach works
using (var ms = new MemoryStream())
{
//yourStream.Seek(0, SeekOrigin.Begin)
yourStream.CopyTo(ms);
}

I can't get code analysis rule CA2202 fixed

I have a function (see code snippet below).
I have Code Analysis enabled, and I get a CA2202 rule violation.
(edit: I added the close on the pdfStamper otherwise the PDF will be corrupted)
CA2202: Do not dispose objects multiple times
A method implementation contains code paths that could cause multiple calls to IDisposable.Dispose or a Dispose equivalent, such as a Close() method on some types, on the same object.
In the CA2202 MSDN page(here), the proposed fix doesn't work.
How can the code be rewritten without having to suppress this violation ?
private byte[] DoGenerateFinishedGamePdf(int gameSessionLogId)
{
var finishedGameCertificatePdfFile = httpServerUtilityWrapper.MapPath(ConfigurationManager.AppSettings["FinishedGameCertificateFile"]);
var pdfReader = new PdfReader(finishedGameCertificatePdfFile); // note that PdfReader is not IDisposeable
using (MemoryStream memoryStream = new MemoryStream())
using (PdfStamper pdfStamper = new PdfStamper(pdfReader, memoryStream))
{
var fields = pdfStamper.AcroFields;
fields.SetField("CityName", "It works!");
pdfReader.Close();
pdfStamper.FormFlattening = true;
pdfStamper.FreeTextFlattening = true;
pdfStamper.Close();
return memoryStream.ToArray();
}
}
Ah, everyone's favourite warning! In this instance MemoryStream.Dispose is idempotent (the current implementation does nothing) so it's not really a problem, however the 'fix' is as follows:
MemoryStream memoryStream = null;
try
{
memoryStream = new MemoryStream();
using (PdfStamper pdfStamper = new PdfStamper(pdfReader, memoryStream))
{
memoryStream = null;
var fields = pdfStamper.AcroFields;
fields.SetField("CityName", "It works!");
pdfReader.Close();
pdfStamper.FormFlattening = true;
pdfStamper.FreeTextFlattening = true;
pdfStamper.Close();
return memoryStream.ToArray();
}
}
finally
{
if (memoryStream != null) memoryStream.Dispose();
}
Since the PdfStamper 'owns' the MemoryStream it will dispose of it when PdfStamper.Dispose is called, so we only need to call Dispose on the MemoryStream if we do not Dispose of the PdfStamper, which can only happen if the construction of the PdfStamper fails.
This is happening because the PdfStamper is disposing the stream even though it shouldn't. It didn't create it and doesn't own it, so it has no business disposing it.
Your code did create the stream and it does own it, so it's quite natural that it should dispose it. If the PdfStamper weren't disposing the stream inappropriately, all would be fine with your nested usings.
Your first move should probably be to file a bug report/feature request that the PdfStamper disposition of the stream be removed or at least made avoidable. Once you've done that, you can pretty safely suppress the CA2202 violation since double disposition of a MemoryStream will have no deleterious consequences.
BTW, PdfStamper.Dispose() call PdfStamper.Close() (at least in version 5.4.0), so you should be able to remove the PdfStamper.Close() call.

"Object can be disposed of more than once" error

When I run code analysis on the following chunk of code I get this message:
Object 'stream' can be disposed more than once in method 'upload.Page_Load(object, EventArgs)'. To avoid generating a System.ObjectDisposedException you should not call Dispose more than one time on an object.
using(var stream = File.Open(newFilename, FileMode.CreateNew))
using(var reader = new BinaryReader(file.InputStream))
using(var writer = new BinaryWriter(stream))
{
var chunk = new byte[ChunkSize];
Int32 count;
while((count = reader.Read(chunk, 0, ChunkSize)) > 0)
{
writer.Write(chunk, 0, count);
}
}
I don't understand why it might be called twice, and how to fix it to eliminate the error. Any help?
I struggled with this problem and found the example here to be very helpful. I'll post the code for a quick view:
using (Stream stream = new FileStream("file.txt", FileMode.OpenOrCreate))
{
using (StreamWriter writer = new StreamWriter(stream))
{
// Use the writer object...
}
}
Replace the outer using statement with a try/finally making sure to BOTH null the stream after using it in StreamWriter AND check to make sure it is not null in the finally before disposing.
Stream stream = null;
try
{
stream = new FileStream("file.txt", FileMode.OpenOrCreate);
using (StreamWriter writer = new StreamWriter(stream))
{
stream = null;
// Use the writer object...
}
}
finally
{
if(stream != null)
stream.Dispose();
}
Doing this cleared up my errors.
To illustrate, let's edit your code
using(var stream = File.Open(newFilename, FileMode.CreateNew))
{
using(var reader = new BinaryReader(file.InputStream))
{
using(var writer = new BinaryWriter(stream))
{
var chunk = new byte[ChunkSize];
Int32 count;
while((count = reader.Read(chunk, 0, ChunkSize)) > 0)
{
writer.Write(chunk, 0, count);
}
} // here we dispose of writer, which disposes of stream
} // here we dispose of reader
} // here we dispose a stream, which was already disposed of by writer
To avoid this, just create the writer directly
using(var reader = new BinaryReader(file.InputStream))
{
using(var writer = new BinaryWriter( File.Open(newFilename, FileMode.CreateNew)))
{
var chunk = new byte[ChunkSize];
Int32 count;
while((count = reader.Read(chunk, 0, ChunkSize)) > 0)
{
writer.Write(chunk, 0, count);
}
} // here we dispose of writer, which disposes of its inner stream
} // here we dispose of reader
edit: to take into account what Eric Lippert is saying, there could indeed be a moment when the stream is only released by the finalizer if BinaryWriter throws an exception. According to the BinaryWriter code, that could occur in three cases
If (output Is Nothing) Then
Throw New ArgumentNullException("output")
End If
If (encoding Is Nothing) Then
Throw New ArgumentNullException("encoding")
End If
If Not output.CanWrite Then
Throw New ArgumentException(Environment.GetResourceString("Argument_StreamNotWritable"))
End If
if you didn't specify an output, ie if stream is null. That shouldn't be a problem since a null stream means no resources to dispose of :)
if you didn't specify an encoding. since we don't use the constructor form where the encoding is specified, there should be no problem here either (i didn't look into the encoding contructor too much, but an invalid codepage can throw)
if you don't pass a writable stream. That should be caught quite quickly during development...
Anyway, good point, hence the edit :)
The BinaryReader/BinaryWriter will dispose the underlying stream for you when it disposes. You don't need to do it explicitly.
To fix it you can remove the using around the Stream itself.
A proper implementation of Dispose is explicitly required not to care if it's been called more than once on the same object. While multiple calls to Dispose are sometimes indicative of logic problems or code which could be better written, the only way I would improve the original posted code would be to convince Microsoft to add an option to BinaryReader and BinaryWriter instructing them not to dispose their passed-in stream (and then use that option). Otherwise, the code required to ensure the file gets closed even if the reader or writer throws in its constructor would be sufficiently ugly that simply letting the file get disposed more than once would seem cleaner.
Your writer will dispose your stream, always.
Suppress CA2202 whenever you are sure that the object in question handles multiple Dispose calls correctly and that your control flow is impeccably readable. BCL objects generally implement Dispose correctly. Streams are famous for that.
But don't necessarily trust third party or your own streams if you don't have unit tests probing that scenario yet. An API which returns a Stream may be returning a fragile subclass.

Categories

Resources