I have a file stream opened with write sharing and in append mode from multiple processes.
Does anybody know if a single unbuffered write operation can be considered atomic?
Or do I have to develop a way to synchronize the different writes to ensure my data is safe?
I found a way. You can open a FileStream using this constructor:
new FileStream(FileName,
    FileMode.Append,
    System.Security.AccessControl.FileSystemRights.AppendData,
    FileShare.ReadWrite, 4096, FileOptions.None);
By passing the System.Security.AccessControl.FileSystemRights.AppendData parameter together with FileMode.Append, the OS will try to write the buffer in an atomic way.
If your write is bigger than the buffer size, the operation will not be atomic, so you have to check your buffer size.
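For illustration, here is a minimal sketch of an append helper built around that constructor (the record layout, helper name, and 4096-byte buffer are assumptions for the example, not part of the original answer):

using System;
using System.IO;
using System.Text;

class AtomicAppend
{
    const int BufferSize = 4096;

    static void AppendRecord(string path, string line)
    {
        byte[] payload = Encoding.UTF8.GetBytes(line + Environment.NewLine);

        // A record larger than the stream's buffer cannot be appended atomically.
        if (payload.Length > BufferSize)
            throw new InvalidOperationException("Record exceeds buffer size; append would not be atomic.");

        using (var fs = new FileStream(path,
            FileMode.Append,
            System.Security.AccessControl.FileSystemRights.AppendData,
            FileShare.ReadWrite, BufferSize, FileOptions.None))
        {
            fs.Write(payload, 0, payload.Length);
        }
    }
}

Each process can call AppendRecord concurrently; because every record fits in a single buffered write, the appends should not interleave.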
I am working on a project involving data acquisition. One very important requirement is described like this:
At the beginning of the recording, a file must be created and its headers must be written;
As soon as the data acquisition starts, the application should keep saving collected data to the file periodically (typically once per second);
Writing consists of appending data blocks to the file, atomically if possible;
Should any error occur (program error, power failure), the file must remain valid up to the last write before the error.
So I plan to use a thread to watch for received data and write it to the file, but I don't know which practice is best (the code below is not real, just to get the feeling):
First option: single open:
using (var fs = new FileStream(filename, FileMode.CreateNew, FileAccess.Write))
    fs.Write(headers, 0, headers.Length);

using (var fs = new FileStream(filename, FileMode.Append, FileAccess.Write))
{
    while (acquisitionRunning)
    {
        Thread.Sleep(100);
        if (getNewData(out _someData))
        {
            fs.Write(_someData, 0, _someData.Length);
        }
    }
}
Second option: multiple open:
using (var fs = new FileStream(filename, FileMode.CreateNew, FileAccess.Write))
    fs.Write(headers, 0, headers.Length);

while (acquisitionRunning)
{
    Thread.Sleep(100);
    if (getNewData(out _someData))
    {
        using (var fs = new FileStream(filename, FileMode.Append, FileAccess.Write))
        {
            fs.Write(_someData, 0, _someData.Length);
        }
    }
}
The application is supposed to run on a client machine, and file access by other processes should not be a concern. What I am most concerned about is:
Does multiple open/close impact performance (mind that the typical write interval is once per second)?
Which one is best for keeping file integrity safe in the event of a failure (explicitly including power failure)?
Is either of these forms considered a particularly good or bad practice, or can either be used depending on the specifics of the problem at hand?
A good way to preserve file content in the event of a power outage etc. is to flush the FileStream after each write. This makes sure the contents you just wrote to the stream get written to disk immediately.
As you've mentioned, other processes won't be accessing the file, so keeping it open won't complicate things, and it will also be faster. But keep in mind that if the app crashes, the lock will remain on the file, and you may need to handle this according to your scenario.
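As a sketch, this is the single-open loop from the question with a flush added after each write; variable names are taken from the question's pseudocode, and Flush(true) is the overload (available since .NET 4) that also asks the OS to flush its cache to the device:

using (var fs = new FileStream(filename, FileMode.Append, FileAccess.Write))
{
    while (acquisitionRunning)
    {
        Thread.Sleep(100);
        if (getNewData(out _someData))
        {
            fs.Write(_someData, 0, _someData.Length);
            fs.Flush(true); // push FileStream's buffer and the OS cache to disk
        }
    }
}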
I was searching on Stack Overflow about try-finally and using blocks and the best practices for using them.
I read in a comment here that if your application is terminated abruptly by killing the process, a finally block will not get executed.
I was wondering, does the same apply to using blocks? Will, for example, a stream get closed if an Environment.Exit() call occurs inside the using block?
//....
using (FileStream fsSource1 = new FileStream(pathSource,
    FileMode.Open, FileAccess.Read))
{
    // Use the stream here
    Environment.Exit(0);
}
As a second thought, and knowing that the CLR garbage collector will probably take care of the stream objects if they are not closed properly, is it considered necessary to close a stream in code if the program is certain to terminate after the stream usage is complete?
For example, is there any practical difference between:
//....
using (FileStream fsSource1 = new FileStream(pathSource,
    FileMode.Open, FileAccess.Read))
{
    // Use the stream here
}
Environment.Exit(0);
And:
//....
FileStream fsSource1 = new FileStream(pathSource, FileMode.Open, FileAccess.Read);
// Use the stream here
Environment.Exit(0);
Or even the example mentioned before?
It shouldn't make a difference in the specific case of FileStream, modulo a tricky corner case when you use its BeginWrite() method: its finalizer attempts to complete writing any unwritten data still present in its internal buffer. This is not generally true, however; it will make a difference if you use StreamWriter, for example.
What you are leaving up to the .NET Framework to decide is whether you truly meant to jerk the floor mat and cease writing the file, or whether it should make a last-gasp attempt to flush any unwritten data. The outcome in the case of StreamWriter tends to be an unpleasant one: there are non-zero odds that something will fall over when it tries to read a half-written file.
Always be explicit. If you want to make sure this does not happen, then it is up to you to ensure that you properly call the Close() or Dispose() method. Or delete the file.
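To make the difference concrete, a small sketch (the file names are arbitrary): the first writer is disposed, so its buffered text reaches the file; the second is abandoned before Environment.Exit(), so whatever is still sitting in its internal buffer may never be written.

using System;
using System.IO;

class Program
{
    static void Main()
    {
        // Explicit: Dispose() flushes StreamWriter's buffer before the process ends.
        using (var safe = new StreamWriter("safe.txt"))
        {
            safe.WriteLine("this line reliably reaches the file");
        }

        // No Dispose(): the text may still be in StreamWriter's buffer
        // when the process is torn down, and be lost.
        var risky = new StreamWriter("risky.txt");
        risky.WriteLine("this line may never reach the file");

        Environment.Exit(0);
    }
}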
Right now I have a StreamWriter and a StreamReader, one file that holds the (text) data, and at least two threads. One thread is a listener and reads the data; the other thread writes stuff into the stream.
Can I avoid using a file as the memory buffer?
I thought it might be possible to connect the two streams at both ends, but I don't know how. I create the writer that writes to the file, then I start a thread that creates a reader that reads from this file and does its work. It works, but I want to avoid the file thingy.
// writer
StreamWriter writer = new StreamWriter(new FileStream("text.txt", FileMode.Create, FileAccess.Write, FileShare.Read));
writer.AutoFlush = true;
// reader
StreamReader reader = new StreamReader(new FileStream("text.txt", FileMode.Open, FileAccess.Read, FileShare.ReadWrite));
I published something I called ProducerConsumerStream that will do this. It's an in-memory stream that allows one reader and one writer. It's a fixed-size circular buffer that allows a consumer to read as fast as the producer can write. See Building a new type of stream.
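The article has the full implementation; below is a minimal sketch of the same idea (not the published class) built on BlockingCollection instead of a circular buffer, supporting one writer and one reader:

using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

// One-writer/one-reader in-memory stream; the bounded collection makes
// the producer block when the consumer falls behind.
public class SimpleProducerConsumerStream : Stream
{
    private readonly BlockingCollection<byte[]> _chunks =
        new BlockingCollection<byte[]>(boundedCapacity: 64);
    private byte[] _current;
    private int _offset;

    public override void Write(byte[] buffer, int offset, int count)
    {
        var chunk = new byte[count];
        Array.Copy(buffer, offset, chunk, 0, count);
        _chunks.Add(chunk); // blocks while the internal buffer is full
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        if (_current == null || _offset == _current.Length)
        {
            // Returns false once CompleteWriting() was called and all data is consumed.
            if (!_chunks.TryTake(out _current, Timeout.Infinite))
                return 0;
            _offset = 0;
        }
        int n = Math.Min(count, _current.Length - _offset);
        Array.Copy(_current, _offset, buffer, offset, n);
        _offset += n;
        return n;
    }

    public void CompleteWriting() => _chunks.CompleteAdding();

    public override bool CanRead => true;
    public override bool CanWrite => true;
    public override bool CanSeek => false;
    public override void Flush() { }
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}

The writer thread wraps it in a StreamWriter, the reader thread in a StreamReader, exactly as with the file version, and no file is touched.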
This may be something trivial, but it appears that data is flushed to disk (out of the FileStream's buffer) when the data I'm buffering hits the size of the FileStream's buffer.
// Use the FileStream buffer to actually buffer the data to be written,
// so segments are written as desired.
FileStream writeStream = new FileStream(filename, FileMode.Append, FileAccess.Write, FileShare.None, CommandOperationBufferSize);
BinaryWriter binWriter = new BinaryWriter(writeStream);

byte[] FullSize = new byte[CommandOperationTotalSize];

// The BinaryWriter will flush when the FileStream buffer is hit.
binWriter.Write(FullSize); // DATA FLUSHES TO DISK HERE!

// For the "wait" variants, pause five seconds around an explicit flush.
if (CommandOperation == "writewait" || CommandOperation == "appendwait")
{
    Thread.Sleep(5000);
    writeStream.Flush();
    Thread.Sleep(5000);
}

// Close the writer first; this also flushes and closes the underlying stream.
binWriter.Close();
writeStream.Dispose();
Can anyone confirm that this is the case? That the FileStream's buffer is actually flushed when it fills?
I ask because it appears that if I set CommandOperationTotalSize to 1 MB and CommandOperationBufferSize to 64 KB, data is flushed to disk when the buffer is filled.
It sounds like I answered my own question, but it seems odd that the FileStream buffer wouldn't just overflow. Maybe the API developers are trying to be nice?
You can safely assume that overflowing the buffer is not possible; the class would be rather hard to use if that were the case, given that FileStream has no properties at all to tell you how much is currently buffered.
The buffer is only there to reduce the number of calls to the native Windows WriteFile() function, which matters when you write small amounts of data, say one byte at a time. If you don't explicitly specify the buffer size, it uses a 4096-byte buffer, which is fine; it is very rare to need anything else. Any writes are further buffered by the file-system cache. You should only consider a non-standard size when you use FileOptions.WriteThrough.
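For example, a sketch of both cases (the file names and the 64 KB figure are just illustrations):

// Default: a 4096-byte FileStream buffer, further buffered by the OS cache.
var normal = new FileStream("normal.bin", FileMode.Create, FileAccess.Write, FileShare.None);

// WriteThrough bypasses the OS cache, so every WriteFile() call goes to the device;
// a larger FileStream buffer then reduces the number of those expensive calls.
var writeThrough = new FileStream("writethrough.bin", FileMode.Create, FileAccess.Write,
    FileShare.None, 64 * 1024, FileOptions.WriteThrough);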
I have this code that saves a PDF file.
FileStream fs = new FileStream(SaveLocation, FileMode.Create);
fs.Write(result.DocumentBytes, 0, result.DocumentBytes.Length);
fs.Flush();
fs.Close();
It works fine. However, sometimes it does not release the lock right away, and that causes file-locking exceptions in functions that run after this one.
Is there an ideal way to release the file lock right after fs.Close()?
Here's the ideal:
using (var fs = new FileStream(SaveLocation, FileMode.Create))
{
    fs.Write(result.DocumentBytes, 0, result.DocumentBytes.Length);
}
which is roughly equivalent to:
FileStream fs = null;
try
{
    fs = new FileStream(SaveLocation, FileMode.Create);
    fs.Write(result.DocumentBytes, 0, result.DocumentBytes.Length);
}
finally
{
    if (fs != null)
    {
        ((IDisposable)fs).Dispose();
    }
}
the using being more readable.
UPDATE:
@aron, now that I think about it,
File.WriteAllBytes(SaveLocation, result.DocumentBytes);
looks even prettier to the eye than the ideal :-)
We have seen this same issue in production with a using() statement wrapping it.
One of the top culprits here is anti-virus software, which can sneak in after the file is closed and grab it to check that it doesn't contain a virus before releasing it.
But even with all anti-virus software out of the mix, in very high-load systems with files stored on network shares, we still saw the problem occasionally. A, cough, short Thread.Sleep(), cough, after the close seemed to cure it. If someone has a better solution I'd love to hear it!
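If a fixed sleep feels too fragile, one alternative (a sketch under the same assumptions, not a guaranteed fix) is to retry the subsequent open with a short backoff:

using System.IO;
using System.Threading;

static class FileRetry
{
    public static FileStream OpenWithRetry(string path, int attempts = 5)
    {
        for (int i = 0; ; i++)
        {
            try
            {
                return new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None);
            }
            catch (IOException) when (i < attempts - 1)
            {
                // Another process (e.g. an anti-virus scanner) may still hold
                // the file; back off briefly and try again.
                Thread.Sleep(100 * (i + 1));
            }
        }
    }
}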
I can't imagine why the lock would be maintained after the file is closed, but you should consider wrapping this in a using statement to ensure that the file is closed even if an exception is raised:
using (FileStream fs = new FileStream(SaveLocation, FileMode.Create))
{
    fs.Write(result.DocumentBytes, 0, result.DocumentBytes.Length);
}
If the functions that run after this one are part of the same application, a better approach might be to open the file for read/write at the beginning of the entire process, pass the file to each function without closing it, and only close it at the end of the process. Then it is unnecessary for the application to block waiting for the IO operation to complete.
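A sketch of that approach (WriteDocument and ValidateDocument are invented names for illustration):

using (var fs = new FileStream(SaveLocation, FileMode.Create, FileAccess.ReadWrite))
{
    WriteDocument(fs, result.DocumentBytes);  // hypothetical first step
    fs.Position = 0;
    ValidateDocument(fs);                     // hypothetical follow-up that reads the file back
} // the single close happens here, at the end of the whole process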
This worked for me when using .Flush(): I had to add a Close() inside the using statement.
using (var imageFile = new FileStream(filePath, FileMode.Create, FileAccess.ReadWrite, FileShare.ReadWrite))
{
    imageFile.Write(bytes, 0, bytes.Length);
    imageFile.Flush();
    imageFile.Close();
}
I just had the same issue when I closed a FileStream and opened the file immediately in another class. The using statement was not a solution since the FileStream had been created in another place and stored in a list; clearing the list was not enough.
It looks like the stream needs to be freed by the garbage collector before the file can be reused. If the time between closing and opening is too short, you can call
GC.Collect();
right after you close the stream. This worked for me.
I guess Ian Mercer's solution of putting the thread to sleep might have the same effect, giving the GC time to free the resources.
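For reference, the pattern described above looks roughly like this (stream and streamList are placeholder names); pairing GC.Collect() with GC.WaitForPendingFinalizers() is a common companion step beyond the answer above, since it waits for the finalizer thread to actually release the handles:

stream.Close();
streamList.Clear();            // drop the list's references so the streams become collectable
GC.Collect();
GC.WaitForPendingFinalizers(); // wait for finalizers to release the file handles
// the file can now be reopened safely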