I am reading a file using StreamReader fileReader = File.OpenText(filePath). I would like to modify one line in the file in memory and push the modified stream to another method.
What I would like to avoid is reading the whole file into a string and modifying the string (doesn't scale). I would also like to avoid modifying the actual file.
Is there a straightforward way of doing this?
There is no built-in way to do that in the .NET Framework.
Stream and the StreamReader/StreamWriter classes are designed to be chained when necessary (for example, GZipStream wraps another stream to compress it). So you can create a wrapper around StreamReader that delegates every operation to the wrapped reader and adjusts the data as needed on the way through.
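For illustration, here is a rough sketch of that idea. The LineTransformReader class and its transform delegate are made up for this example, and it only covers consumers that read line by line (a consumer that calls Read() or ReadToEnd() would need those overridden too):
using System;
using System.IO;

// Sketch only: wraps another TextReader and rewrites each line as it is read,
// so the file on disk is never modified and the whole file is never held in memory.
public class LineTransformReader : TextReader
{
    private readonly TextReader _inner;
    private readonly Func<string, string> _transform; // hypothetical per-line transform

    public LineTransformReader(TextReader inner, Func<string, string> transform)
    {
        _inner = inner;
        _transform = transform;
    }

    public override string ReadLine()
    {
        string line = _inner.ReadLine();
        return line == null ? null : _transform(line);
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }
}
The consuming method then calls ReadLine() on the wrapper exactly as it would on the original StreamReader; only the lines you choose to rewrite are changed, and nothing beyond the current line is buffered. For example (SomeConsumer stands in for the method that receives the reader):
using (var reader = new LineTransformReader(File.OpenText(filePath),
           line => line.StartsWith("2.") ? "2.replacement" : line))
{
    SomeConsumer(reader); // hypothetical consumer that reads line by line
}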
You can open two streams on the same file, one for reading and one for writing, at the same time. I tested the simple code below and it works, though I'm not sure it's exactly what you want:
// "2.bar\r\n" will be replaced by "!!!!!\r\n"
File.WriteAllText("test.txt",
#"1.foo
2.bar
3.fake");
// open inputStream for StreamReader, and open outputStream for StreamWriter
using (var inputStream = File.Open("test.txt", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (var reader = new StreamReader(inputStream))
using (var outputStream = File.Open("test.txt", FileMode.Open, FileAccess.Write, FileShare.Read))
using (var writer = new StreamWriter(outputStream))
{
var position = 0L; // track the reading position
var newLineLength = Environment.NewLine.Length;
while (!reader.EndOfStream)
{
var line = reader.ReadLine();
// your particular conditions here.
if (line.StartsWith("2."))
{
// seek line start position
outputStream.Seek(position, SeekOrigin.Begin);
// replace with something else,
// but the length must equal the original here,
// otherwise the following lines get corrupted.
writer.WriteLine(new String('!', line.Length));
// flush immediately so the buffered text is written at the
// position we just seeked to, not at a later position
writer.Flush();
}
// NOTE: this position arithmetic assumes a single-byte encoding (e.g. ASCII)
// and that every line ends with Environment.NewLine
position += line.Length + newLineLength;
}
}
/* as a result, test.txt will be:
1.foo
!!!!!
3.fake
*/
As you can see, both streams can be open on the same file at the same time, one through the StreamReader and one through the StreamWriter, and you can manipulate the read and write positions independently.
Can I get a GZipStream for a file on disk without writing the entire compressed content to temporary storage? I'm currently using a temporary file on disk in order to avoid possible memory exhaustion using MemoryStream on very large files (this is working fine).
public void UploadFile(string filename)
{
using (var temporaryFileStream = File.Open("tempfile.tmp", FileMode.CreateNew, FileAccess.ReadWrite))
{
using (var fileStream = File.OpenRead(filename))
using (var compressedStream = new GZipStream(temporaryFileStream, CompressionMode.Compress, true))
{
fileStream.CopyTo(compressedStream);
}
temporaryFileStream.Position = 0;
Uploader.Upload(temporaryFileStream);
}
}
What I'd like to do is eliminate the temporary storage by creating a GZipStream and having it read from the original file only as the Uploader class requests bytes from it. Is such a thing possible? How might such an implementation be structured?
Note that Upload is a static method with signature static void Upload(Stream stream).
Edit: The full code is here if it's useful. I hope I've included all the relevant context in my sample above however.
Yes, this is possible, but not easily with any of the standard .NET stream classes. When I needed to do something like this, I created a new type of stream.
It's basically a circular buffer that allows one producer (writer) and one consumer (reader). It's pretty easy to use. Let me whip up an example. In the meantime, you can adapt the example in the article.
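For reference, here is a very rough sketch of what such a stream might look like. This is not the implementation from the article: it queues written chunks in a BlockingCollection<byte[]> rather than using a true circular byte buffer, so the constructor's capacity bounds the number of queued chunks, not bytes.
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

// Sketch only: one producer writes, one consumer reads. Written chunks are
// queued in a BlockingCollection; Read blocks until data arrives or the
// producer calls CompleteAdding().
public class ProducerConsumerStream : Stream
{
    private readonly BlockingCollection<byte[]> _chunks;
    private byte[] _current;       // chunk currently being consumed
    private int _currentOffset;    // read position within _current

    public ProducerConsumerStream(int boundedCapacity)
    {
        // NOTE: capacity here bounds the number of queued chunks, not bytes
        _chunks = new BlockingCollection<byte[]>(boundedCapacity);
    }

    // called by the producer when no more data will be written
    public void CompleteAdding()
    {
        _chunks.CompleteAdding();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        var chunk = new byte[count];
        Buffer.BlockCopy(buffer, offset, chunk, 0, count);
        _chunks.Add(chunk); // blocks while the queue is full
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        while (_current == null || _currentOffset == _current.Length)
        {
            // blocks until a chunk is available; returns false once the
            // producer has completed adding and the queue is drained
            if (!_chunks.TryTake(out _current, Timeout.Infinite))
                return 0; // end of stream
            _currentOffset = 0;
        }
        int n = Math.Min(count, _current.Length - _currentOffset);
        Buffer.BlockCopy(_current, _currentOffset, buffer, offset, n);
        _currentOffset += n;
        return n;
    }

    public override bool CanRead { get { return true; } }
    public override bool CanWrite { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}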
Later: Here's an example that should come close to what you're asking for.
using (var pcStream = new ProducerConsumerStream(BufferSize))
{
// start the upload on a separate thread, passing the stream
// as the ParameterizedThreadStart argument
var uploadThread = new Thread(UploadThreadProc);
uploadThread.Start(pcStream);
// Open the input file and attach the gzip stream to the pcStream
using (var inputFile = File.OpenRead("inputFilename"))
{
// create gzip stream
using (var gz = new GZipStream(pcStream, CompressionMode.Compress, true))
{
var bytesRead = 0;
var buff = new byte[65536]; // 64K buffer
while ((bytesRead = inputFile.Read(buff, 0, buff.Length)) != 0)
{
gz.Write(buff, 0, bytesRead);
}
}
}
// The entire file has been compressed and copied to the buffer.
// Mark the stream as "input complete".
pcStream.CompleteAdding();
// wait for the upload thread to complete.
uploadThread.Join();
// It's very important that you don't close the pcStream before
// the uploader is done!
}
The upload thread should be pretty simple:
void UploadThreadProc(object state)
{
var pcStream = (ProducerConsumerStream)state;
Uploader.Upload(pcStream);
}
You could, of course, put the producer on a background thread and have the upload be done on the main thread. Or have them both on background threads. I'm not familiar with the semantics of your uploader, so I'll leave that decision to you.
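For example, the roles can be swapped like this (a sketch only, reusing the names from the example above):
using (var pcStream = new ProducerConsumerStream(BufferSize))
{
    // producer on a background thread: compress the file into pcStream
    var producer = new Thread(() =>
    {
        using (var inputFile = File.OpenRead("inputFilename"))
        using (var gz = new GZipStream(pcStream, CompressionMode.Compress, true))
        {
            inputFile.CopyTo(gz);
        }
        pcStream.CompleteAdding(); // tell the reader there is no more data
    });
    producer.Start();

    // consumer on the calling thread
    Uploader.Upload(pcStream);

    producer.Join();
}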
I have a file of size 10124 bytes, and I am adding a byte array of length 4 at the beginning of the file.
After that the file size should become 10128, but when I write it out, the size drops to 22 bytes. I don't know where the problem is.
public void AppendAllBytes(string path, byte[] bytes)
{
var encryptedFile = new FileStream(path, FileMode.Open, FileAccess.Read);
////argument-checking here.
Stream header = new MemoryStream(bytes);
var result = new MemoryStream();
header.CopyTo(result);
encryptedFile.CopyTo(result);
using (var writer = new StreamWriter(@"C:\\Users\\life.monkey\\Desktop\\B\\New folder (2)\\aaaaaaaaaaaaaaaaaaaaaaaaaaa.docx.aef"))
{
writer.Write(result);
}
}
How can I write bytes to the file?
The issue seems to be caused by:
using a StreamWriter to write binary data. The name does not intuitively suggest it, but the StreamWriter class is meant for writing textual data.
passing the entire stream object instead of the actual binary data, so only the stream's ToString() text ("System.IO.MemoryStream", 22 characters) ends up in the file. To obtain the bytes stored in a MemoryStream, use its convenient ToArray() method.
I suggest the following code:
public void AppendAllBytes(string path, byte[] bytes)
{
var fileName = @"C:\\Users\\life.monkey\\Desktop\\B\\New folder (2)\\aaaaaaaaaaaaaaaaaaaaaaaaaaa.docx.aef";
using (var encryptedFile = new FileStream(path, FileMode.Open, FileAccess.Read))
using (var writer = new BinaryWriter(File.Open(fileName, FileMode.Append)))
using (var result = new MemoryStream())
{
// buffer the original file contents in memory
encryptedFile.CopyTo(result);
// write the header directly; no need to put it in a memory stream
writer.Write(bytes);
writer.Flush(); // ensure the header is written to the file before the payload.
// then append the buffered file contents after the header
writer.Write(result.ToArray());
writer.Flush(); // ensure the encrypted file contents are written out.
}
}
The code above uses the BinaryWriter class, which is better suited for binary data. It has a Write(byte[] bytes) overload, used above to write an entire array to the file. The code makes regular calls to Flush() that some may consider unnecessary, but they guarantee that everything written before the call has been pushed out of the writer's buffer to the underlying stream.
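If the goal is just to produce a new file that starts with the header bytes followed by the original file's contents, the intermediate MemoryStream can be dropped entirely. A minimal sketch, where the PrependBytes name and the output path parameter are placeholders:
public void PrependBytes(string path, byte[] header, string outputPath)
{
    using (var input = File.OpenRead(path))
    using (var output = File.Create(outputPath)) // placeholder output location
    {
        output.Write(header, 0, header.Length); // header first
        input.CopyTo(output);                   // then the original file contents
    }
}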
Can I close a file stream without calling Flush (in C#)? I understood that Close and Dispose calls the Flush method first.
MSDN is not 100% clear, but Jon Skeet is saying "Flush", so do it before close/dispose. It won't hurt, right?
From FileStream.Close Method:
Any data previously written to the buffer is copied to the file before
the file stream is closed, so it is not necessary to call Flush before
invoking Close. Following a call to Close, any operations on the file
stream might raise exceptions. After Close has been called once, it
does nothing if called again.
Dispose is not as clear:
This method disposes the stream, by writing any changes to the backing
store and closing the stream to release resources.
Remark: the commenters might be right; it's not 100% clear from the Flush documentation:
Override Flush on streams that implement a buffer. Use this method to
move any information from an underlying buffer to its destination,
clear the buffer, or both. Depending upon the state of the object, you
might have to modify the current position within the stream (for
example, if the underlying stream supports seeking). For additional
information see CanSeek.
When using the StreamWriter or BinaryWriter class, do not flush the
base Stream object. Instead, use the class's Flush or Close method,
which makes sure that the data is flushed to the underlying stream
first and then written to the file.
TESTS:
var textBytes = Encoding.ASCII.GetBytes("Test123");
using (var fileTest = System.IO.File.Open(@"c:\temp\fileNoCloseNoFlush.txt", FileMode.CreateNew))
{
fileTest.Write(textBytes,0,textBytes.Length);
}
using (var fileTest = System.IO.File.Open(@"c:\temp\fileCloseNoFlush.txt", FileMode.CreateNew))
{
fileTest.Write(textBytes, 0, textBytes.Length);
fileTest.Close();
}
using (var fileTest = System.IO.File.Open(@"c:\temp\fileFlushNoClose.txt", FileMode.CreateNew))
{
fileTest.Write(textBytes, 0, textBytes.Length);
fileTest.Flush();
}
using (var fileTest = System.IO.File.Open(@"c:\temp\fileCloseAndFlush.txt", FileMode.CreateNew))
{
fileTest.Write(textBytes, 0, textBytes.Length);
fileTest.Flush();
fileTest.Close();
}
What can I say ... all files got the text - maybe this is just too little data?
Test2
var rnd = new Random();
var size = 1024*1024*10;
var randomBytes = new byte[size];
rnd.NextBytes(randomBytes);
using (var fileTest = System.IO.File.Open(@"c:\temp\fileNoCloseNoFlush.bin", FileMode.CreateNew))
{
fileTest.Write(randomBytes, 0, randomBytes.Length);
}
using (var fileTest = System.IO.File.Open(@"c:\temp\fileCloseNoFlush.bin", FileMode.CreateNew))
{
fileTest.Write(randomBytes, 0, randomBytes.Length);
fileTest.Close();
}
using (var fileTest = System.IO.File.Open(@"c:\temp\fileFlushNoClose.bin", FileMode.CreateNew))
{
fileTest.Write(randomBytes, 0, randomBytes.Length);
fileTest.Flush();
}
using (var fileTest = System.IO.File.Open(@"c:\temp\fileCloseAndFlush.bin", FileMode.CreateNew))
{
fileTest.Write(randomBytes, 0, randomBytes.Length);
fileTest.Flush();
fileTest.Close();
}
And again - every file got its bytes ... to me it looks like it's doing what I read from MSDN: it doesn't matter if you call Flush or Close before dispose ... any thoughts on that?
You don't have to call Flush() before Close()/Dispose(); FileStream will do it for you, as you can see from its source code:
http://referencesource.microsoft.com/#mscorlib/system/io/filestream.cs,e23a38af5d11ddd3
[System.Security.SecuritySafeCritical] // auto-generated
protected override void Dispose(bool disposing)
{
// Nothing will be done differently based on whether we are
// disposing vs. finalizing. This is taking advantage of the
// weak ordering between normal finalizable objects & critical
// finalizable objects, which I included in the SafeHandle
// design for FileStream, which would often "just work" when
// finalized.
try {
if (_handle != null && !_handle.IsClosed) {
// Flush data to disk iff we were writing. After
// thinking about this, we also don't need to flush
// our read position, regardless of whether the handle
// was exposed to the user. They probably would NOT
// want us to do this.
if (_writePos > 0) {
FlushWrite(!disposing); // <- Note this
}
}
}
finally {
if (_handle != null && !_handle.IsClosed)
_handle.Dispose();
_canRead = false;
_canWrite = false;
_canSeek = false;
// Don't set the buffer to null, to avoid a NullReferenceException
// when users have a race condition in their code (ie, they call
// Close when calling another method on Stream like Read).
//_buffer = null;
base.Dispose(disposing);
}
}
I've been tracking a newly introduced bug that seems to indicate .NET 4 does not reliably flush changes to disk when the stream is disposed (unlike .NET 2.0 and 3.5, which always did so reliably).
The FileStream class was heavily modified in .NET 4, and while the Flush*() methods were rewritten, similar attention seems to have been forgotten for .Dispose().
This is resulting in incomplete files.
Since you've stated that you understand that Close and Dispose call Flush when it has not been called explicitly by user code, I believe that by "close without flush" you actually want a way to discard changes made to a FileStream, if necessary.
If that is correct, using a FileStream alone won't help. You will need to load this file into a MemoryStream (or an array, depending on how you modify its contents), and then decide whether you want to save changes or not after you're done.
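A rough sketch of that in-memory approach, where ModifyStream is a placeholder for whatever editing you do and returns whether to keep the result:
byte[] original = File.ReadAllBytes(path);
using (var ms = new MemoryStream())
{
    ms.Write(original, 0, original.Length);
    ms.Position = 0;

    bool keepChanges = ModifyStream(ms); // hypothetical: your editing logic

    if (keepChanges)
        File.WriteAllBytes(path, ms.ToArray()); // persist the edited copy
    // otherwise just let the MemoryStream go; the file on disk is untouched
}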
A problem with this is file size, obviously. FileStream uses limited-size write buffers to speed up operations, but once they are full, changes have to be flushed. Due to .NET memory limits, you can only expect to hold smaller files entirely in memory.
An easier alternative would be to make a disk copy of your file, and work on it using a plain FileStream. When finished, if you need to discard changes, simply delete the temporary file, otherwise replace the original with a modified copy.
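And a sketch of that disk-copy alternative (again, originalPath and ModifyStream are placeholders):
string tempPath = Path.GetTempFileName(); // scratch copy location
File.Copy(originalPath, tempPath, true);

bool keepChanges;
using (var fs = new FileStream(tempPath, FileMode.Open, FileAccess.ReadWrite))
{
    keepChanges = ModifyStream(fs); // hypothetical: your editing logic
}

if (keepChanges)
    File.Copy(tempPath, originalPath, true); // commit: overwrite the original
File.Delete(tempPath);                       // always remove the temporary copy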
Wrap the FileStream in a BufferedStream and close the FileStream before the buffered stream; any data still sitting in the BufferedStream's buffer then never reaches the file.
var fs = new FileStream(...);
var bs = new BufferedStream(fs, buffersize);
bs.Write(datatosend, 0, length);
fs.Close();
try {
bs.Close();
}
catch (IOException) {
}
Using Flush() is worthwhile inside big loops,
for example when you have to read and write a big file inside one loop. Otherwise the buffer and the machine are big enough that it doesn't matter if you call Close() without calling Flush() first.
Example: you have to read a big file (in some format) and write it out as a .txt file:
StreamWriter sw = ...; // your StreamWriter for the output .txt file
// you read the file ...
// and now you write each of its lines using WriteLine():
for ( ... ) // this is a big loop, because the file has many lines
{
    sw.WriteLine( /* whatever was read */ ); // the line goes into the writer's buffer
    sw.Flush(); // push the buffered line out to the file on every iteration
}
sw.Close();
Here it is very important to use Flush(), because otherwise each WriteLine() only lands in the buffer and nothing is written to the file until the buffer is full or the program reaches sw.Close().
I hope this helps a little to understand what Flush() does.
I think it is safe to use a simple using statement, which closes the streams once GetBytes() is done with them:
public static byte[] GetBytes(string fileName)
{
    using (FileStream fs = new FileStream(fileName, FileMode.Open, FileAccess.Read))
    using (MemoryStream ms = new MemoryStream())
    {
        // Stream.CopyTo (built in since .NET 4) copies the file into the MemoryStream;
        // it stands in for the custom BlockCopy extension method used originally
        fs.CopyTo(ms);
        return ms.ToArray(); // both streams are closed by the using statements
    }
}
What is the best method to convert a Stream to a FileStream using C#?
The function I am working on has a Stream passed to it containing uploaded data, and I need to be able to call stream.Read() and stream.Seek(), which are methods of the FileStream type.
A simple cast does not work, so I'm asking here for help.
Read and Seek are methods on the Stream type, not just FileStream. It's just that not every stream supports them. (Personally I prefer using the Position property over calling Seek, but they boil down to the same thing.)
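For example, these two lines do the same thing on any seekable stream:
stream.Seek(0, SeekOrigin.Begin); // rewind via Seek
stream.Position = 0;              // rewind via the Position property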
If you would prefer having the data in memory over dumping it to a file, why not just read it all into a MemoryStream? That supports seeking. For example:
public static MemoryStream CopyToMemory(Stream input)
{
// It won't matter if we throw an exception during this method;
// we don't *really* need to dispose of the MemoryStream, and the
// caller should dispose of the input stream
MemoryStream ret = new MemoryStream();
byte[] buffer = new byte[8192];
int bytesRead;
while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
{
ret.Write(buffer, 0, bytesRead);
}
// Rewind ready for reading (typical scenario)
ret.Position = 0;
return ret;
}
Use:
using (Stream input = ...)
{
using (Stream memory = CopyToMemory(input))
{
// Seek around in memory to your heart's content
}
}
This is similar to using the Stream.CopyTo method introduced in .NET 4.
If you actually want to write to the file system, you could do something similar that first writes to the file then rewinds the stream... but then you'll need to take care of deleting it afterwards, to avoid littering your disk with files.
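A minimal sketch of that file-backed variant, using FileOptions.DeleteOnClose so the temporary file is removed automatically when the stream is disposed (the CopyToTempFile name and buffer size are arbitrary):
public static Stream CopyToTempFile(Stream input)
{
    string path = Path.GetTempFileName();
    // DeleteOnClose removes the temporary file when the returned stream is disposed
    var file = new FileStream(path, FileMode.Create, FileAccess.ReadWrite,
                              FileShare.None, 8192, FileOptions.DeleteOnClose);
    input.CopyTo(file); // .NET 4; on older versions copy with a buffer loop as above
    file.Position = 0;  // rewind ready for reading
    return file;
}
The caller then disposes the returned stream as usual, which also deletes the temporary file.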