Modify File Stream in memory - c#

I am reading a file using StreamReader fileReader = File.OpenText(filePath). I would like to modify one line in the file in memory and push the modified stream to another method.
What I would like to avoid is reading the whole file into a string and modifying the string (doesn't scale). I would also like to avoid modifying the actual file.
Is there a straightforward way of doing this?

There is no built-in way to do that in the .NET Framework.
The Stream and StreamReader/StreamWriter classes are designed to be chained if necessary (the way GZipStream wraps another stream to compress it). So you can create a wrapper around StreamReader and modify the data as needed on each read operation, after calling the wrapped reader.
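As a minimal sketch of that idea (my own illustration, not code from the answer): a TextReader that wraps another reader and rewrites each line on the fly. The class name LineTransformingReader and the delegate-based design are assumptions, and only ReadLine() is intercepted, so it assumes the consuming method reads line by line.

using System;
using System.IO;

class LineTransformingReader : TextReader
{
    private readonly TextReader _inner;
    private readonly Func<string, string> _transform;

    public LineTransformingReader(TextReader inner, Func<string, string> transform)
    {
        _inner = inner;
        _transform = transform;
    }

    // Intercept line reads and apply the transform; Read()/ReadToEnd() would bypass it.
    public override string ReadLine()
    {
        var line = _inner.ReadLine();
        return line == null ? null : _transform(line);
    }

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }
}

// Usage: hand the wrapper to the consuming method instead of the raw reader, e.g.
// var reader = new LineTransformingReader(File.OpenText(filePath),
//     line => line.StartsWith("2.") ? "replacement" : line);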

You can open two streams - one for reading, one for writing - at the same time. I tested simple code that works, but I'm not sure it's what you want:
// "2.bar\r\n" will be replaced by "!!!!!\r\n"
File.WriteAllText("test.txt",
#"1.foo
2.bar
3.fake");
// open inputStream for StreamReader, and open outputStream for StreamWriter
using (var inputStream = File.Open("test.txt", FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
using (var reader = new StreamReader(inputStream))
using (var outputStream = File.Open("test.txt", FileMode.Open, FileAccess.Write, FileShare.Read))
using (var writer = new StreamWriter(outputStream))
{
    var position = 0L; // track the reading position
    var newLineLength = Environment.NewLine.Length;
    while (!reader.EndOfStream)
    {
        var line = reader.ReadLine();
        // your particular conditions here.
        if (line.StartsWith("2."))
        {
            // seek to the start of the current line
            outputStream.Seek(position, SeekOrigin.Begin);
            // replace by something,
            // but the length should be equal to the original in this case.
            writer.WriteLine(new String('!', line.Length));
            // flush before any later Seek so the buffered text
            // lands at this position rather than a later one.
            writer.Flush();
        }
        position += line.Length + newLineLength;
    }
}
/* as a result, test.txt will be:
1.foo
!!!!!
3.fake
*/
As you can see, the two streams can be used by a StreamReader and a StreamWriter at the same time, and you can manipulate the read and write positions independently.

Related

Strange behaviour with 2 streams open on the same FileStream

I have both a BinaryReader and a BinaryWriter open on the same underlying FileStream. When I seek to a location using the BinaryWriter and write a value, it is getting written at the start of the stream instead of the location that was seeked to in the previous line.
My code (simplified):
using (var stream = new FileStream(outputFile, FileMode.Create))
using (var sw = new BinaryWriter(stream))
using (var sr = new BinaryReader(stream))
{
    int address;
    int value;
    // Write some data using sw
    // Seek and read a few different places using sr,
    // and set address and value
    sw.BaseStream.Seek(address, SeekOrigin.Begin);
    sw.Write(value);
}
The Position property is showing the correct value in both the FileStream and the BinaryWriter.BaseStream when I inspect them immediately after the Seek, yet it is still writing to the start of the stream.
I also noticed that after performing a seek and read with the BinaryReader, the data that was initially written to the stream gets shifted forward 4 bytes, filling the first 4 bytes with 0.
What could be causing this strange behaviour? Is using two streams at a time valid?
UPDATE
Everything works just fine if I use a MemoryStream instead and then write the data from that to my file. So why does this happen with a FileStream and not a MemoryStream?
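For illustration, a rough sketch of that workaround (an assumed shape, not the asker's actual code): do the writes, seeks, and reads against a MemoryStream, then persist the buffer to the file in one step.

using (var buffer = new MemoryStream())
using (var sw = new BinaryWriter(buffer))
using (var sr = new BinaryReader(buffer))
{
    // write some data (placeholder values for the sketch)
    sw.Write(42);
    sw.Flush();

    // seek and read using sr
    sr.BaseStream.Seek(0, SeekOrigin.Begin);
    int value = sr.ReadInt32();

    // seek and overwrite using sw
    sw.BaseStream.Seek(0, SeekOrigin.Begin);
    sw.Write(value);

    // finally persist the whole buffer to disk (outputFile as in the question's code)
    File.WriteAllBytes(outputFile, buffer.ToArray());
}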

StreamReader is too greedy

I'm trying to process part of a text file, and write the remainder of the text file to a cloud blob using UploadFromStream. The problem is that the StreamReader appears to be grabbing too much content from the underlying stream, and so the subsequent write does nothing.
Text file:
3
Col1,String
Col2,Integer
Col3,Boolean
abc,123,True
def,3456,False
ghijkl,532,True
mnop,1211,False
Code:
using (var stream = File.OpenRead("c:\\test\\testinput.txt"))
using (var reader = new StreamReader(stream))
{
    var numColumns = int.Parse(reader.ReadLine());
    while (numColumns-- > 0)
    {
        var colDescription = reader.ReadLine();
        // do stuff
    }

    // Write remaining contents to another file, for testing
    using (var destination = File.OpenWrite("c:\\test\\testoutput.txt"))
    {
        stream.CopyTo(destination);
        destination.Flush();
    }

    // Actual intended usage:
    // CloudBlockBlob blob = ...;
    // blob.UploadFromStream(stream);
}
When debugging, I observe that stream.Position jumps to the end of the file on the first call to reader.ReadLine(), which I don't expect. I expected the stream to be advanced only as many positions as the reader needed to read some content.
I imagine that the stream reader is doing some buffering for performance reasons, but there doesn't seem to be a way to ask the reader where in the underlying stream it "really" is. (If there were, I could manually Seek the stream to that position before calling CopyTo.)
I know that I could keep taking lines using the same reader and sequentially append them to the text file I'm writing, but I'm wondering if there's a cleaner way?
EDIT:
I found a StreamReader constructor which leaves the underlying stream open when it is disposed, so I tried this, hoping that the reader would set the stream's position as it's being disposed:
using (var stream = File.OpenRead("c:\\test\\testinput.txt"))
{
    using (var reader = new StreamReader(stream, Encoding.UTF8,
        detectEncodingFromByteOrderMarks: true,
        bufferSize: 1 << 12,
        leaveOpen: true))
    {
        var numColumns = int.Parse(reader.ReadLine());
        while (numColumns-- > 0)
        {
            var colDescription = reader.ReadLine();
            // do stuff
        }
    }

    // Write remaining contents to another file
    using (var destination = File.OpenWrite("c:\\test\\testoutput.txt"))
    {
        stream.CopyTo(destination);
        destination.Flush();
    }
}
But it doesn't. Why would this constructor be exposed if it doesn't leave the stream in an intuitive state/position?
Sure, there's a cleaner way. Use ReadToEnd to read the remaining data, and then write it to a new file. For example:
using (var reader = new StreamReader("c:\\test\\testinput.txt"))
{
var numColumns = int.Parse(reader.ReadLine());
while (numColumns-- > 0)
{
var colDescription = reader.ReadLine();
// do stuff
}
// write everything else to another file.
File.WriteAllText("c:\\test\\testoutput.txt", reader.ReadToEnd());
}
Edit after comment
If you want to read the text and upload it to a stream, you could replace the File.WriteAllText with code that reads the remaining text, writes it to a StreamWriter backed by a MemoryStream, and then sends the contents of that MemoryStream. Something like:
using (var memStream = new MemoryStream())
{
    using (var writer = new StreamWriter(memStream))
    {
        writer.Write(reader.ReadToEnd());
        writer.Flush();
        memStream.Position = 0;
        blob.UploadFromStream(memStream);
    }
}
You should never access the underlying stream of a StreamReader. Trying to use both is going to result in undefined behavior.
What's going on here is that the reader is buffering the data from the underlying stream. It doesn't read each byte exactly when you request it, because that's often going to be very inefficient. Instead it will grab chunks, put them in a buffer, and then provide you with data from that buffer, grabbing a new chunk when it needs to.
You should continue to use the StreamReader throughout the remainder of that block, instead of using stream. To minimize the memory footprint of the program, the most effective way of doing this would be to read the next line from the reader in a loop until it hits the end of the file, writing each line to the output stream as you go.
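A minimal sketch of that loop (reusing the file paths from the question; a StreamWriter over the upload/output stream would work the same way):

using (var writer = new StreamWriter(File.OpenWrite("c:\\test\\testoutput.txt")))
{
    string line;
    while ((line = reader.ReadLine()) != null)
    {
        // each remaining line goes straight to the destination
        writer.WriteLine(line);
    }
}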
Also note that you don't need to be disposing of both the stream reader and the underlying stream. The stream reader will dispose of the underlying stream itself, so you can simply adjust your header to:
using (var reader = new StreamReader(
    File.OpenRead("c:\\test\\testinput.txt")))

Inconsistent file size change while writing bytes from stream to a file

I have a file of size 10124, and I am adding a byte array of length 4 at the beginning of the file.
After that the file size should become 10128, but as I write it to the file, the size decreases to 22 bytes. I don't know where the problem is.
public void AppendAllBytes(string path, byte[] bytes)
{
    var encryptedFile = new FileStream(path, FileMode.Open, FileAccess.Read);
    //// argument-checking here.
    Stream header = new MemoryStream(bytes);
    var result = new MemoryStream();
    header.CopyTo(result);
    encryptedFile.CopyTo(result);
    using (var writer = new StreamWriter(@"C:\Users\life.monkey\Desktop\B\New folder (2)\aaaaaaaaaaaaaaaaaaaaaaaaaaa.docx.aef"))
    {
        writer.Write(result);
    }
}
How can I write bytes to the file?
The issue seems to be caused by:

- using a StreamWriter to write binary data. The name does not intuitively suggest this, but the StreamWriter class is meant for writing textual data.
- passing an entire stream object instead of the actual binary data. To obtain the bytes stored in a MemoryStream, use its convenient ToArray() method.

I suggest the following code:
public void AppendAllBytes(string path, byte[] bytes)
{
    var fileName = @"C:\Users\life.monkey\Desktop\B\New folder (2)\aaaaaaaaaaaaaaaaaaaaaaaaaaa.docx.aef";
    using (var encryptedFile = new FileStream(path, FileMode.Open, FileAccess.Read))
    using (var writer = new BinaryWriter(File.Open(fileName, FileMode.Append)))
    using (var result = new MemoryStream())
    {
        encryptedFile.CopyTo(result);
        result.Flush(); // ensure the file contents are fully copied into the memory stream.
        // write the header directly, no need to put it in a memory stream
        writer.Write(bytes);
        writer.Flush(); // ensure the header is written to the output file.
        writer.Write(result.ToArray());
        writer.Flush(); // ensure the encrypted file's contents are written to the output file.
    }
}
The code above uses the BinaryWriter class, which is better suited for binary data. It has a Write(byte[] bytes) overload that is used above to write an entire array to the file. The code makes regular calls to Flush(), which some may consider unnecessary, but they guarantee that all data written before the call to Flush() has been pushed into the stream.

Copy MemoryStream to FileStream and save the file?

I don't understand what I'm doing wrong here. I generate a couple of memory streams, and in debug mode I see that they are populated. But when I try to copy the MemoryStream to a FileStream in order to save the file, fileStream is not populated and the file is 0 bytes long (empty).
Here is my code
if (file.ContentLength > 0)
{
    var bytes = ImageUploader.FilestreamToBytes(file); // bytes is populated
    using (var inStream = new MemoryStream(bytes)) // inStream is populated
    {
        using (var outStream = new MemoryStream())
        {
            using (var imageFactory = new ImageFactory())
            {
                imageFactory.Load(inStream)
                    .Resize(new Size(320, 0))
                    .Format(ImageFormat.Jpeg)
                    .Quality(70)
                    .Save(outStream);
            }
            // outStream is populated here

            var fileName = "test.jpg";
            using (var fileStream = new FileStream(Server.MapPath("~/content/u/") + fileName, FileMode.CreateNew, FileAccess.ReadWrite))
            {
                outStream.CopyTo(fileStream); // fileStream is not populated
            }
        }
    }
}
You need to reset the position of the stream before copying.
outStream.Position = 0;
outStream.CopyTo(fileStream);
You used outStream when saving the file with imageFactory, and that call populated it. While the stream is being populated, its position is advanced to the end of the written data, so that further writes don't overwrite existing bytes. But to read it back (for copying), you need to set the position back to the start first.
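To illustrate the point (a standalone example, not part of the original code):

using (var ms = new MemoryStream())
{
    ms.Write(new byte[] { 1, 2, 3 }, 0, 3);
    Console.WriteLine(ms.Position); // prints 3: the position sits at the end of the written data
    ms.Position = 0;                // rewind, otherwise CopyTo has nothing left to read
    // ms.CopyTo(...) would now copy all three bytes
}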
If your objective is simply to dump the memory stream to a physical file (e.g. to look at the contents) - it can be done in one move:
System.IO.File.WriteAllBytes(@"C:\filename", memoryStream.ToArray());
There is no need to set the stream position first, since the .ToArray() operation explicitly ignores it, as @BaconBits pointed out: https://learn.microsoft.com/en-us/dotnet/api/system.io.memorystream.toarray?view=netframework-4.7.2.
Another alternative to CopyTo is WriteTo.
Advantage:
No need to reset Position.
Usage:
outStream.WriteTo(fileStream);
Function Description:
Writes the entire contents of this memory stream to another stream.

GZipStream only decompresses first line

My GZipStream will only decompress the first line of the file. Extracting the contents via 7-zip works as expected and gives me the entire file contents. It also extracts as expected using gunzip on cygwin and linux, so I expect this is O/S specific (Windows 7).
I'm not certain how to go about troubleshooting this, so any tips on that would help me a great deal. It sounds very similar to this, but using SharpZLib results in the same thing.
Here's what I'm doing:
var inputFile = String.Format(@"{0}\{1}", inputDir, fileName);
var outputFile = String.Format(@"{0}\{1}.gz", inputDir, fileName);
var dcmpFile = String.Format(@"{0}\{1}", outputDir, fileName);

using (var input = File.OpenRead(inputFile))
using (var fileOutput = File.Open(outputFile, FileMode.Append))
using (GZipStream gzOutput = new GZipStream(fileOutput, CompressionMode.Compress, true))
{
    input.CopyTo(gzOutput);
}

// Now, decompress
using (FileStream of = new FileStream(outputFile, FileMode.Open, FileAccess.Read))
using (GZipStream ogz = new GZipStream(of, CompressionMode.Decompress, false))
using (FileStream wf = new FileStream(dcmpFile, FileMode.Append, FileAccess.Write))
{
    ogz.CopyTo(wf);
}
Your output file only contains a single line (gzipped) - but it contains all of the text data other than the line breaks.
You're repeatedly calling ReadLine(), which returns a line of text without the line break, and converting that text to bytes. So if you had an input file which had:
abc
def
ghi
You'd end up with an output file which was the compressed version of
abcdefghi
If you don't want that behaviour, why even go through a StreamReader in the first place? Just copy from the input FileStream straight to the GZipStream a block at a time, or use Stream.CopyTo if you're using .NET 4:
// Note how much simpler the code is using File.*
using (var input = File.OpenRead(inputFile))
using (var fileOutput = File.Open(outputFile, FileMode.Append))
using (GZipStream gzOutput = new GZipStream(fileOutput, CompressionMode.Compress, true))
{
    input.CopyTo(gzOutput);
}
Also note that appending to a compressed file is rarely a good idea, unless you've got some sort of special handling for multiple "chunks" within a single file.
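If the goal is a single self-contained .gz file, it is usually safer to create the output fresh rather than append, for example (my suggestion, not from the answer):

using (var input = File.OpenRead(inputFile))
using (var fileOutput = File.Open(outputFile, FileMode.Create)) // overwrite instead of append
using (var gzOutput = new GZipStream(fileOutput, CompressionMode.Compress))
{
    input.CopyTo(gzOutput);
}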
