Does my code properly clean up its List<MemoryStream>? - c#

I've got a third-party component that does PDF file manipulation. Whenever I need to perform operations I retrieve the PDF documents from a document store (database, SharePoint, filesystem, etc.). To make things a little consistent I pass the PDF documents around as a byte[].
This 3rd party component expects a MemoryStream[] (MemoryStream array) as a parameter to one of the main methods I need to use.
I am trying to wrap this functionality in my own component so that I can use this functionality for a number of areas within my application. I have come up with essentially the following:
public class PdfDocumentManipulator : IDisposable
{
List<MemoryStream> pdfDocumentStreams = new List<MemoryStream>();
public void AddFileToManipulate(byte[] pdfDocument)
{
using (MemoryStream stream = new MemoryStream(pdfDocument))
{
pdfDocumentStreams.Add(stream);
}
}
public byte[] ManipulatePdfDocuments()
{
byte[] outputBytes = null;
using (MemoryStream outputStream = new MemoryStream())
{
ThirdPartyComponent component = new ThirdPartyComponent();
component.Manipuate(this.pdfDocumentStreams.ToArray(), outputStream);
// move to the beginning
outputStream.Seek(0, SeekOrigin.Begin);
//convert the memory stream to a byte array
outputBytes = outputStream.ToArray();
}
return outputBytes;
}
#region IDisposable Members
public void Dispose()
{
for (int i = this.pdfDocumentStreams.Count - 1; i >= 0; i--)
{
MemoryStream stream = this.pdfDocumentStreams[i];
this.pdfDocumentStreams.RemoveAt(i);
stream.Dispose();
}
}
#endregion
}
The calling code to my "wrapper" looks like this:
byte[] manipulatedResult = null;
using (PdfDocumentManipulator manipulator = new PdfDocumentManipulator())
{
manipulator.AddFileToManipulate(file1bytes);
manipulator.AddFileToManipulate(file2bytes);
manipulatedResult = manipulator.ManipulatePdfDocuments();
}
A few questions about the above:
Is the using clause in the AddFileToManipulate() method redundant and unnecessary?
Am I cleaning up things OK in my object's Dispose() method?
Is this an "acceptable" usage of MemoryStream? I am not anticipating very many files in memory at once...Likely 1-10 total PDF pages, each page about 200KB. App designed to run on server supporting an ASP.NET site.
Any comments/suggestions?
Thanks for the code review :)

AddFileToManipulate scares me.
public void AddFileToManipulate(byte[] pdfDocument)
{
using (MemoryStream stream = new MemoryStream(pdfDocument))
{
pdfDocumentStreams.Add(stream);
}
}
This code is adding a disposed stream to your pdfDocumentStreams list. Instead you should simply add the stream using:
pdfDocumentStreams.Add(new MemoryStream(pdfDocument));
And dispose of it in the Dispose method.
Also you should look at implementing a finalizer to ensure stuff gets disposed in case someone forgets to dispose the top level object.
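A minimal sketch of the difference, assuming the third-party component needs to read from the streams it is handed. The class and method names here (DisposedStreamDemo, AddInsideUsing, AddWithoutUsing, IsReadable) are made up for illustration and are not part of the question's code:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DisposedStreamDemo
{
    // Mirrors the question's AddFileToManipulate: the stream is disposed when
    // the using block ends, but the list still holds a reference to it.
    public static void AddInsideUsing(List<MemoryStream> streams, byte[] data)
    {
        using (MemoryStream stream = new MemoryStream(data))
        {
            streams.Add(stream);
        }
    }

    // The suggested fix: the list keeps the stream open until the owner disposes it.
    public static void AddWithoutUsing(List<MemoryStream> streams, byte[] data)
    {
        streams.Add(new MemoryStream(data));
    }

    // Returns true if a byte can still be read from the stream.
    public static bool IsReadable(MemoryStream stream)
    {
        try
        {
            stream.ReadByte();
            return true;
        }
        catch (ObjectDisposedException)
        {
            return false;
        }
    }
}
```

A stream added through AddInsideUsing is already closed by the time anything tries to read it, while one added through AddWithoutUsing stays readable until Dispose is called on it.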

Is the using clause in the AddFileToManipulate() method redundant and unnecessary?
Worse, it's destructive. You're basically closing your memory stream before it's added in. See the other answers for details, but basically, dispose at the end, but not any other time. Every using with an object causes a Dispose to happen at the end of the block, even if the object is "passed off" to other objects via methods.
Am I cleaning up things OK in my object's Dispose() method?
Yes, but you're making life more difficult than it needs to be. Try this:
foreach (var stream in this.pdfDocumentStreams)
{
stream.Dispose();
}
this.pdfDocumentStreams.Clear();
This works just as well, and is much simpler. Disposing an object does not delete it - it just tells it to free its internal, unmanaged resources. Calling Dispose on an object in this way is fine - the object stays uncollected, in the collection. You can do this and then clear the list in one shot.
Is this an "acceptable" usage of MemoryStream? I am not anticipating very many files in memory at once...Likely 1-10 total PDF pages, each page about 200KB. App designed to run on server supporting an ASP.NET site.
This depends on your situation. Only you can determine whether the overhead of having these files in memory is going to cause you problems. This is going to be a fairly heavy-weight object, though, so I'd use it carefully.
Any comments/suggestions?
Implement a finalizer. It's a good idea whenever you implement IDisposable. Also, you should rework your Dispose implementation to the standard one, or mark your class as sealed. For details on how this should be done, see this article. In particular, you should have a method declared as protected virtual void Dispose(bool disposing) that your Dispose method and your finalizer both call.
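The pattern described above could look roughly like the sketch below. PdfStreamOwner, Count and IsDisposed are hypothetical names chosen for the example, not the question's class. Note that the finalizer path (disposing == false) must not touch the managed streams, so in a class that only holds MemoryStreams the finalizer has nothing left to do; it is included only because this answer recommends one.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public class PdfStreamOwner : IDisposable
{
    private readonly List<MemoryStream> streams = new List<MemoryStream>();
    private bool disposed;

    public int Count { get { return streams.Count; } }
    public bool IsDisposed { get { return disposed; } }

    public void Add(byte[] document)
    {
        if (disposed) throw new ObjectDisposedException(nameof(PdfStreamOwner));
        streams.Add(new MemoryStream(document));
    }

    // Called by user code (directly or via a using block).
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // the finalizer no longer needs to run
    }

    // Called by the garbage collector if Dispose() was never called.
    ~PdfStreamOwner()
    {
        Dispose(false);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;
        if (disposing)
        {
            // Only touch other managed objects when called from Dispose().
            foreach (MemoryStream stream in streams)
            {
                stream.Dispose();
            }
            streams.Clear();
        }
        disposed = true;
    }
}
```

The disposed flag makes a second Dispose() call a harmless no-op.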

It looks to me like you misunderstand what using does.
It's just syntactic sugar for a try/finally that disposes the object when the block exits:
MemoryStream ms = new MemoryStream();
try
{
    // ... work with ms ...
}
finally
{
    if (ms != null)
    {
        ms.Dispose();
    }
}
So the using in AddFileToManipulate disposes each stream as soon as it has been added to the list. I'd set up the list of MemoryStreams in the constructor of PdfDocumentManipulator, then have PdfDocumentManipulator's Dispose method call Dispose on all the MemoryStreams.

Side note. This really seems like it calls for an extension method.
public static void DisposeAll<T>(this IEnumerable<T> enumerable)
where T : IDisposable {
foreach ( var cur in enumerable ) {
cur.Dispose();
}
}
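Now your Dispose method becomes
public void Dispose() {
    // List<T>.Reverse() reverses in place and returns void, so it cannot be chained.
    pdfDocumentStreams.Reverse();
    pdfDocumentStreams.DisposeAll();
    pdfDocumentStreams.Clear();
}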
EDIT
You don't need the 3.5 framework in order to have extension methods. They will happily work with the C# 3.0 compiler targeting the 2.0 framework.
http://blogs.msdn.com/jaredpar/archive/2007/11/16/extension-methods-without-3-5-framework.aspx

Related

StreamReader with using statement difference?

I am using StreamReader as shown below in my code:
string json = await new StreamReader(context.Request.Body).ReadToEndAsync();
// ... use json variable here in some other code
And I stumbled upon using statement. Is there any difference between my first statement vs using the using statement with StreamReader?
Should I be using using statement with StreamReader here in prod code?
string json;
using (var reader = new StreamReader(context.Request.Body))
{
json = await reader.ReadToEndAsync();
}
Is there any difference between my first statement vs using the using
statement with StreamReader
Yes. The difference is that when you wrap StreamReader in a using statement it will clear up some resources directly instead of waiting for the garbage collector. More specifically it will call Dispose() on StreamReader. You should almost always use using when the class implements IDisposable.
If your app simply uses an object that implements the IDisposable
interface, you should call the object's IDisposable.Dispose
implementation when you are finished using it.
Thanks to .NET Core being open source we can take a look at the source for StreamReader:
protected override void Dispose(bool disposing)
{
if (m_stream != null)
{
if (disposing)
{
m_stream.Close();
}
m_stream = null;
m_buffer = null;
m_curBufPos = 0;
m_curBufLen = 0;
}
m_disposed = true;
}
As you can see it calls Close() on the stream, which (according to the docs) in turn will call Dispose() on the stream itself.
Correctly disposing objects can be crucial when working with larger objects or streams. However, I will try to target your other question.
Should I be using using statement with StreamReader here in prod code?
Yes, no and maybe. In your particular case you have context.Request.Body as a Stream (which I assume comes from HttpContext). There is no need for the StreamReader to close that particular stream. It will be disposed correctly (later) anyway. Also, some other component might need access to that particular stream later in the pipeline.
Generally, if the class implements IDisposable then you should wrap it in a using. But here I think that you have two better choices:
1.
If you actually have JSON (as your variable name suggests), you can deserialize it directly using System.Text.Json.JsonSerializer:
YourModel model = await System.Text.Json.JsonSerializer.DeserializeAsync<YourModel>(context.Request.Body);
UPDATE: Or if you are using .NET 5 you have access to HttpRequestJsonExtensions and can use ReadFromJsonAsync. Then you can simply try the following:
YourModel model = await context.Request.ReadFromJsonAsync<YourModel>();
Thanks to caius-jard.
2.
Use the overload of StreamReader that doesn't close the stream.
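string json;
// leaveOpen: true keeps context.Request.Body open after the reader is disposed.
// 1024 is an explicit buffer size; the original -1 is rejected by .NET Framework.
using (var reader = new StreamReader(context.Request.Body, Encoding.UTF8, detectEncodingFromByteOrderMarks: true, bufferSize: 1024, leaveOpen: true))
{
    json = await reader.ReadToEndAsync();
}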
So, to sum up. Yes, there is a difference when using using. However, in your particular case you have better options.
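To see the effect of the leaveOpen overload outside of ASP.NET, here is a small sketch that takes any Stream in place of context.Request.Body. LeaveOpenDemo and ReadAllTextAsync are hypothetical names used only for this example:

```csharp
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class LeaveOpenDemo
{
    // Reads the whole stream as UTF-8 text but leaves the stream itself open,
    // so whoever owns it can keep using it (or dispose it) afterwards.
    public static async Task<string> ReadAllTextAsync(Stream body)
    {
        using (var reader = new StreamReader(body, Encoding.UTF8,
            detectEncodingFromByteOrderMarks: true, bufferSize: 1024, leaveOpen: true))
        {
            return await reader.ReadToEndAsync();
        }
    }
}
```

After the call returns, the reader has been disposed but the stream passed in is still open; its position is at the end, so a second read would need the position reset first (if the stream supports seeking).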
Check out this link
https://www.c-sharpcorner.com/UploadFile/manas1/usage-and-importance-of-using-in-C-Sharp472/
In short: the "using" statement ensures that a managed/unmanaged resource object is disposed correctly, and you don't have to call the "Dispose" method explicitly, even if an exception occurs within the using block.
You can read further on the official Microsoft site too:
https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/using-statement#:~:text=The%20using%20statement%20calls%20the,t%20be%20modified%20or%20reassigned.

Pre-created Stream and "using" block

I'm really annoyed that a "using" block tampers with my pre-created object. I have this piece of code:
class Asset {
public Stream FileStream { get; set; }
public Asset(string fileName) {
FileStream = ...open a file stream...;
}
}
// Somewhere else
Asset asset = new Asset("file.txt");
using (var reader = new StreamReader(asset.FileStream)) {
//blah blah blah
}
// Somewhere else else
using (var reader2 = new StreamReader(asset.FileStream))
=> throws this exception:
System.ArgumentException: Stream was not readable.
Debugging step-by-step in Visual Studio helped me know that asset.FileStream has been disposed after the first "using" block.
Please help me to save his life :((
How can I create a clone stream from a stream?
I agree that the fact that readers close the underlying stream is dumb. The approach outlined in this article is to create a decorator class that wraps the Stream and has a no-op for the Close and Dispose methods. It's probably not worth the overhead, though, so you should consider just not using using for these readers.
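A decorator along those lines might look like the sketch below. NonClosingStream is a hypothetical name, and this is only one possible shape for such a wrapper: it forwards every call to the inner stream and deliberately ignores disposal.

```csharp
using System.IO;

// Hypothetical wrapper: forwards everything to the inner stream except disposal.
public sealed class NonClosingStream : Stream
{
    private readonly Stream inner;

    public NonClosingStream(Stream inner)
    {
        this.inner = inner;
    }

    public override bool CanRead { get { return inner.CanRead; } }
    public override bool CanSeek { get { return inner.CanSeek; } }
    public override bool CanWrite { get { return inner.CanWrite; } }
    public override long Length { get { return inner.Length; } }

    public override long Position
    {
        get { return inner.Position; }
        set { inner.Position = value; }
    }

    public override void Flush()
    {
        inner.Flush();
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        return inner.Read(buffer, offset, count);
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        return inner.Seek(offset, origin);
    }

    public override void SetLength(long value)
    {
        inner.SetLength(value);
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        inner.Write(buffer, offset, count);
    }

    protected override void Dispose(bool disposing)
    {
        // Intentionally a no-op: the owner of the inner stream disposes it.
    }
}
```

With this in place, new StreamReader(new NonClosingStream(asset.FileStream)) can sit in a using block without closing asset.FileStream. The first reader still consumes the stream, so the position has to be set back to 0 before a second reader starts.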

Close a filestream without Flush()

Can I close a file stream without calling Flush (in C#)? I understood that Close and Dispose call the Flush method first.
MSDN is not 100% clear, but Jon Skeet is saying "Flush", so do it before close/dispose. It won't hurt, right?
From FileStream.Close Method:
Any data previously written to the buffer is copied to the file before
the file stream is closed, so it is not necessary to call Flush before
invoking Close. Following a call to Close, any operations on the file
stream might raise exceptions. After Close has been called once, it
does nothing if called again.
Dispose is not as clear:
This method disposes the stream, by writing any changes to the backing
store and closing the stream to release resources.
Remark: the commentators might be right, it's not 100% clear from the Flush:
Override Flush on streams that implement a buffer. Use this method to
move any information from an underlying buffer to its destination,
clear the buffer, or both. Depending upon the state of the object, you
might have to modify the current position within the stream (for
example, if the underlying stream supports seeking). For additional
information see CanSeek.
When using the StreamWriter or BinaryWriter class, do not flush the
base Stream object. Instead, use the class's Flush or Close method,
which makes sure that the data is flushed to the underlying stream
first and then written to the file.
TESTS:
var textBytes = Encoding.ASCII.GetBytes("Test123");
using (var fileTest = System.IO.File.Open(@"c:\temp\fileNoCloseNoFlush.txt", FileMode.CreateNew))
{
    fileTest.Write(textBytes, 0, textBytes.Length);
}
using (var fileTest = System.IO.File.Open(@"c:\temp\fileCloseNoFlush.txt", FileMode.CreateNew))
{
    fileTest.Write(textBytes, 0, textBytes.Length);
    fileTest.Close();
}
using (var fileTest = System.IO.File.Open(@"c:\temp\fileFlushNoClose.txt", FileMode.CreateNew))
{
    fileTest.Write(textBytes, 0, textBytes.Length);
    fileTest.Flush();
}
using (var fileTest = System.IO.File.Open(@"c:\temp\fileCloseAndFlush.txt", FileMode.CreateNew))
{
    fileTest.Write(textBytes, 0, textBytes.Length);
    fileTest.Flush();
    fileTest.Close();
}
What can I say ... all files got the text - maybe this is just too little data?
Test2
var rnd = new Random();
var size = 1024*1024*10;
var randomBytes = new byte[size];
rnd.NextBytes(randomBytes);
using (var fileTest = System.IO.File.Open(@"c:\temp\fileNoCloseNoFlush.bin", FileMode.CreateNew))
{
    fileTest.Write(randomBytes, 0, randomBytes.Length);
}
using (var fileTest = System.IO.File.Open(@"c:\temp\fileCloseNoFlush.bin", FileMode.CreateNew))
{
    fileTest.Write(randomBytes, 0, randomBytes.Length);
    fileTest.Close();
}
using (var fileTest = System.IO.File.Open(@"c:\temp\fileFlushNoClose.bin", FileMode.CreateNew))
{
    fileTest.Write(randomBytes, 0, randomBytes.Length);
    fileTest.Flush();
}
using (var fileTest = System.IO.File.Open(@"c:\temp\fileCloseAndFlush.bin", FileMode.CreateNew))
{
    fileTest.Write(randomBytes, 0, randomBytes.Length);
    fileTest.Flush();
    fileTest.Close();
}
And again - every file got its bytes ... to me it looks like it's doing what I read from MSDN: it doesn't matter if you call Flush or Close before dispose ... any thoughts on that?
You don't have to call Flush() on Close()/Dispose(), FileStream will do it for you as you can see from its source code:
http://referencesource.microsoft.com/#mscorlib/system/io/filestream.cs,e23a38af5d11ddd3
[System.Security.SecuritySafeCritical] // auto-generated
protected override void Dispose(bool disposing)
{
// Nothing will be done differently based on whether we are
// disposing vs. finalizing. This is taking advantage of the
// weak ordering between normal finalizable objects & critical
// finalizable objects, which I included in the SafeHandle
// design for FileStream, which would often "just work" when
// finalized.
try {
if (_handle != null && !_handle.IsClosed) {
// Flush data to disk iff we were writing. After
// thinking about this, we also don't need to flush
// our read position, regardless of whether the handle
// was exposed to the user. They probably would NOT
// want us to do this.
if (_writePos > 0) {
FlushWrite(!disposing); // <- Note this
}
}
}
finally {
if (_handle != null && !_handle.IsClosed)
_handle.Dispose();
_canRead = false;
_canWrite = false;
_canSeek = false;
// Don't set the buffer to null, to avoid a NullReferenceException
// when users have a race condition in their code (ie, they call
// Close when calling another method on Stream like Read).
//_buffer = null;
base.Dispose(disposing);
}
}
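The same behaviour can be checked without fixed paths. The sketch below (DisposeFlushDemo and WriteWithoutFlush are hypothetical names) writes to a temporary file with no explicit Flush() or Close(), relies on the using block alone, and then reads the bytes back from disk:

```csharp
using System.IO;

public static class DisposeFlushDemo
{
    // Writes data without calling Flush() or Close(), relying on Dispose,
    // then returns whatever actually ended up in the file.
    public static byte[] WriteWithoutFlush(byte[] data)
    {
        string path = Path.GetTempFileName();
        try
        {
            using (var file = new FileStream(path, FileMode.Create, FileAccess.Write))
            {
                file.Write(data, 0, data.Length);
            }
            return File.ReadAllBytes(path);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

If Dispose flushes the write buffer as the source above indicates, the returned array matches the input byte for byte.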
I've been tracking a newly introduced bug that seems to indicate .NET 4 does not reliably flush changes to disk when the stream is disposed (unlike .NET 2.0 and 3.5, which always did so reliably).
The FileStream class has been heavily modified in .NET 4, and while the Flush*() methods have been rewritten, .Dispose() does not seem to have received similar attention.
This is resulting in incomplete files.
Since you've stated that you understood that close & dispose called the flush method if it was not called explicitly by user code, I believe that (by close without flush) you actually want to have a possibility to discard changes made to a FileStream, if necessary.
If that is correct, using a FileStream alone won't help. You will need to load this file into a MemoryStream (or an array, depending on how you modify its contents), and then decide whether you want to save changes or not after you're done.
A problem with this is file size, obviously. FileStream uses limited size write buffers to speed up operations, but once they are depleted, changes need to be flushed. Due to .NET memory limits, you can only expect to load smaller files in memory, if you need to hold them entirely.
An easier alternative would be to make a disk copy of your file, and work on it using a plain FileStream. When finished, if you need to discard changes, simply delete the temporary file, otherwise replace the original with a modified copy.
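A rough sketch of that working-copy approach follows. DiscardableEdit, its Edit method, the keepChanges flag and the ".working" suffix are all assumptions made for the example, and the sketch ignores concurrency and partial-failure concerns:

```csharp
using System;
using System.IO;

public static class DiscardableEdit
{
    // Applies 'edit' to a copy of the file. The original is replaced only
    // when keepChanges is true; otherwise the copy is simply thrown away.
    public static void Edit(string path, Action<FileStream> edit, bool keepChanges)
    {
        string workingCopy = path + ".working"; // hypothetical naming scheme
        File.Copy(path, workingCopy, true);
        try
        {
            using (var stream = new FileStream(workingCopy, FileMode.Open, FileAccess.ReadWrite))
            {
                edit(stream);
            }
            if (keepChanges)
            {
                File.Copy(workingCopy, path, true);
            }
        }
        finally
        {
            File.Delete(workingCopy);
        }
    }
}
```

Because the FileStream only ever touches the copy, it does not matter that Close and Dispose flush: discarding the changes is just a matter of not copying the file back.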
Wrap the FileStream in a BufferedStream and close the filestream before the buffered stream.
var fs = new FileStream(...);
var bs = new BufferedStream(fs, buffersize);
bs.Write(datatosend, 0, length);
fs.Close();
try {
bs.Close();
}
catch (IOException) {
}
Calling Flush() is worthwhile inside big loops, for example when you read a big file and write it back out inside one loop. In other cases the buffer is big enough, and it doesn't matter if you Close() without calling Flush() first.
Example: you have to read a big file (in one format) and write it out as .txt
StreamWriter sw = .... // using StreamWriter
// you read the file ...
// and now you want to write each line of this big file using WriteLine()
for ( .....) // this is a big loop because the file is big and has many lines
{
sw.WriteLine ( *whatever i read* ); // write the line to the .txt file
sw.Flush(); // each call to sw.Flush() pushes the buffered line out to the file
}
sw.Close();
Here it is very important to use Flush(), because otherwise each WriteLine is only stored in the buffer and not written out until the buffer is full or the program reaches sw.Close().
I hope this helps a little in understanding what Flush does.
I think it is safe to use a simple using statement, which closes the streams before GetBytes() returns:
public static byte[] GetBytes(string fileName)
{
    using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
    using (MemoryStream ms = new MemoryStream())
    {
        // Stream.CopyTo (.NET 4 and later) stands in for the answer's custom BlockCopy extension method.
        fs.CopyTo(ms, 4096);
        return ms.ToArray();
    }
}

MemoryStream instance timing help

Is it OK to instantiate a MemoryStream at the top of my method, do a bunch of stuff to it, and then use it?
For instance:
public static byte[] TestCode()
{
MemoryStream m = new MemoryStream();
...
...
whole bunch of stuff in between
...
...
//finally
using(m)
{
return m.ToArray();
}
}
Updated code
public static byte[] GetSamplePDF()
{
using (MemoryStream m = new MemoryStream())
{
Document document = new Document();
PdfWriter.GetInstance(document, m);
document.Open();
PopulateTheDocument(document);
document.Close();
return m.ToArray();
}
}
private static void PopulateTheDocument(Document document)
{
Table aTable = new Table(2, 2);
aTable.AddCell("0.0");
aTable.AddCell("0.1");
aTable.AddCell("1.0");
aTable.AddCell("1.1");
document.Add(aTable);
for (int i = 0; i < 20; i++)
{
document.Add(new Phrase("Hello World, Hello Sun, Hello Moon, Hello Stars, Hello Sea, Hello Land, Hello People. "));
}
}
My point was to try to reuse the code that builds the byte array. In other words, build up any kind of document and then send it to the TestCode() method.
Technically, this is possible, but it's pointless. If you really want to avoid using the "using" statement around that code, just call Dispose() directly.
You should put the entire work that's using the MemoryStream into the using statement. This guarantees that the MemoryStream's Dispose method will be called, even if you receive an exception during your "whole bunch of stuff in between" code. The way you have it now, exceptions will prevent your MemoryStream from having Dispose() called on it.
The proper way to handle this would be:
public static byte[] TestCode()
{
MemoryStream m = new MemoryStream();
using(m)
{
// ...
// ...
// whole bunch of stuff in between
// ...
// ...
return m.ToArray();
}
}
Or, in the more common form:
public static byte[] TestCode()
{
using(MemoryStream m = new MemoryStream())
{
// ...
// ...
// whole bunch of stuff in between
// ...
// ...
return m.ToArray();
}
}
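Returning m.ToArray() from inside the using block is safe because the array is a copy made before the stream is disposed. MemoryStream.ToArray is also documented to work on a closed stream, which the second method in this sketch relies on. ToArrayDemo and its method names are hypothetical:

```csharp
using System.IO;

public static class ToArrayDemo
{
    // The usual form: the copy is taken before the using block disposes the stream.
    public static byte[] BuildInsideUsing()
    {
        using (var m = new MemoryStream())
        {
            m.WriteByte(1);
            m.WriteByte(2);
            return m.ToArray();
        }
    }

    // ToArray still works after disposal; reading or writing would not.
    public static byte[] BuildThenCopyAfterDispose()
    {
        var m = new MemoryStream();
        using (m)
        {
            m.WriteByte(1);
            m.WriteByte(2);
        }
        return m.ToArray();
    }
}
```

Both methods return the same two bytes; the first form is still preferable because everything that uses the stream sits inside the using block.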
When seeing questions like these, I often wonder if we wouldn't have been better off using the Java model. There's an extraordinary amount of agony that .NET programmers suffer over that doggone IDisposable. After thousands of questions on SO (et al), it still remains poorly understood.
It is a memory stream. There's nothing that needs to be disposed when you use memory, the garbage collector already takes care of it. It is not some kind of special memory just because the class has a Dispose() method, there's only one kind. The other kind is wrapped by UnmanagedMemoryStream. The fact that MemoryStream inherits a do-nothing Dispose() method from Stream is a sad OOP liability.
It is up to you to decide to slovenly call a do-nothing method because it is there. Or you could take charge of your code and refuse to call methods that you know don't do anything useful now, nor will ever do anything useful for the rest of your career. Clearly I'm in the second camp, our life-expectancy must be better. I hope anyway. Then again, this post might have knocked a day off.

Storing MemoryStream in Cache

I've come across this code in one of my projects. It has a static function that returns a MemoryStream built from a file, and that stream is then stored in Cache. The same class also has a constructor that stores a MemoryStream in a private field for later use. So it looks like this:
private MemoryStream memoryStream;
public CountryLookup(MemoryStream ms)
{
memoryStream = ms;
}
public static MemoryStream FileToMemory(string filePath)
{
MemoryStream memoryStream = new MemoryStream();
ReadFileToMemoryStream(filePath, memoryStream);
return memoryStream;
}
Usage:
Context.Cache.Insert("test",
CountryLookup.FileToMemory(
ConfigurationSettings.AppSettings["test"]),
new CacheDependency(someFileName)
);
And then:
CountryLookup cl = new CountryLookup(
((MemoryStream)Context.Cache.Get("test"))
);
So I was wondering who should dispose the memoryStream and when? Ideally CountryLookup should implement IDisposable.
Should I even care about it?
It's slightly ugly - in particular, the MemoryStream is stateful, because it has the concept of the "current position".
Why not just store a byte array instead? You can easily build multiple MemoryStreams which wrap the same byte array when you need to, and you don't need to worry about the statefulness.
MemoryStreams don't usually require disposal, but I personally tend to dispose them out of habit. If you perform asynchronous operations on them or use them in remoting, I believe disposal does make a difference at that point. Byte arrays are just simpler :)
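A small sketch of the byte-array idea, assuming the cached value is a byte[] rather than a MemoryStream. SharedBufferDemo and OpenReadOnly are hypothetical names:

```csharp
using System.IO;

public static class SharedBufferDemo
{
    // Wraps the cached bytes in a fresh read-only stream. No data is copied,
    // and each stream gets its own independent Position.
    public static MemoryStream OpenReadOnly(byte[] cached)
    {
        return new MemoryStream(cached, writable: false);
    }
}
```

Each consumer (for example each CountryLookup instance) would call OpenReadOnly on the cached array and dispose its own stream when finished, so nobody has to coordinate the "current position" of a shared stream.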
