I'm attempting to take a large file, uploaded from a web app, and load it into a MemoryStream for later processing. I was receiving OutOfMemory exceptions when trying to copy the HttpPostedFileBase's InputStream into a new MemoryStream. During troubleshooting, I tried just creating a new MemoryStream and allocating roughly the same amount of space as the length of the InputStream (935,638,275 bytes), like so:
MemoryStream memStream = new MemoryStream(935700000);
Even doing this results in a System.OutOfMemoryException on this line.
I only slightly understand MemoryStreams, and this seems to be something to do with how MemoryStreams buffer data. Is there a way for me to get all of the data into one MemoryStream without too much fuss?
I am not sure what the processing involves, but the HttpPostedFileBase already contains a stream with the data. You can use that stream to process what you need to do.
If you really need to move back and forth or multiple times over the stream, and the input stream does not support seeking/positioning, you may want to stream the data to a temporary local file first and then use a file stream to do your processing against that file.
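The temporary-file approach described above could look roughly like the following. This is a minimal sketch, not a definitive implementation: the class and method names (UploadBuffering, SpoolToTempFile) and the 81920-byte buffer size are assumptions chosen for illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper; the names are illustrative, not from the original post.
public static class UploadBuffering
{
    // Copies a (possibly non-seekable) input stream to a temporary file and
    // returns a seekable FileStream positioned at the start.
    // FileOptions.DeleteOnClose removes the file when the stream is disposed.
    public static FileStream SpoolToTempFile(Stream input)
    {
        string tempPath = Path.GetTempFileName();
        var fileStream = new FileStream(
            tempPath,
            FileMode.Create,
            FileAccess.ReadWrite,
            FileShare.None,
            bufferSize: 81920,
            options: FileOptions.DeleteOnClose);
        try
        {
            input.CopyTo(fileStream);   // copies in chunks; the upload is never held in memory as a whole
            fileStream.Position = 0;    // rewind so the caller can read from the start
            return fileStream;
        }
        catch
        {
            fileStream.Dispose();       // disposing also deletes the temporary file
            throw;
        }
    }
}
```

A caller would wrap the result in a using block, for example `using (var stream = UploadBuffering.SpoolToTempFile(file.InputStream)) { ... }`, and can then seek back and forth as often as needed.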
If many people are uploading via your web app, MemoryStream buffers of the size you specified would quickly eat up all available memory.
Related
I have a requirement to encrypt files of 1-2 GB in an Azure Function. I am using the PGP Core library to encrypt the file in memory. The code below throws an out-of-memory exception if the file size is above 700 MB. Note: I am using an Azure Function, and scaling up the App Service plan didn't help.
Is there any alternative to MemoryStream that I can use? After encryption, I upload the file to blob storage.
var privateKeyEncoded = Encoding.UTF8.GetString(Convert.FromBase64String(_options.PGPKeys.PublicKey));
using Stream privateKeyStream = StringToStreamUtility.GenerateStreamFromString(privateKeyEncoded);
privateKeyStream.Position = 0;
var encryptionKeys = new EncryptionKeys(privateKeyStream);
var pgp = new PGP(encryptionKeys);
//encrypt stream
var encryptStream = new MemoryStream();
await pgp.EncryptStreamAsync(streamToEncrypt, encryptStream );
MemoryStream is a Stream wrapper over a byte[] buffer. Every time that buffer is full, a new one with double the size is allocated and the data is copied. This eventually uses double the final buffer size (4GB for a 2GB file) but, worse, it results in such memory fragmentation that eventually the memory allocator can't find a new contiguous memory block to allocate. That's when you get an OOM.
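The growth behaviour described above can be observed directly through the Capacity property. The sketch below is only an illustration; the class and method names (MemoryStreamGrowth, ObserveCapacities) are made up for this example, and the exact capacity values depend on the runtime.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper; the names are illustrative.
public static class MemoryStreamGrowth
{
    // Writes totalBytes one byte at a time and records every new Capacity value
    // the stream reports, i.e. every time the internal byte[] was reallocated.
    public static List<int> ObserveCapacities(int totalBytes)
    {
        var capacities = new List<int>();
        using (var stream = new MemoryStream())
        {
            int last = stream.Capacity;
            for (int i = 0; i < totalBytes; i++)
            {
                stream.WriteByte(0);
                if (stream.Capacity != last)
                {
                    last = stream.Capacity;
                    capacities.Add(last);
                }
            }
        }
        return capacities;
    }
}
```

Printing the result of `MemoryStreamGrowth.ObserveCapacities(5000)` should show a short list of increasing capacities rather than one reallocation per write; each step means a fresh array was allocated and the old contents copied over.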
While you could avoid OOM errors by specifying a capacity in the constructor, storing 2GB in memory before even starting to write it is very wasteful. With a real FileStream the encrypted bytes would be written out as soon as they were available.
Azure Functions allow temporary storage. This means you can create a temporary file, open a stream on it and use it for encryption.
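var tempPath = Path.GetTempFileName();
try
{
    // File.Open needs a FileMode; Create truncates the empty file made by GetTempFileName
    using (var outputStream = File.Open(tempPath, FileMode.Create))
    {
        await pgp.EncryptStreamAsync(streamToEncrypt, outputStream);
        ...
    }
}
finally
{
    File.Delete(tempPath);
}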
MemoryStream uses a byte[] internally, and any byte[] is going to get a bit brittle as it gets around/above 1GiB (although in theory a byte[] can be nearly 2 GiB, in reality this isn't a good idea, and is rarely seen).
Frankly, MemoryStream simply isn't a good choice here; I'd probably suggest using a temporary file instead, and use a FileStream. This doesn't attempt to keep everything in memory at once, and is more reliable at large sizes. Alternatively: avoid ever needing all the data at once completely, by performing the encryption in a pass-thru streaming way.
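To illustrate what "pass-thru" streaming means, here is a minimal sketch that encrypts from one stream into another in chunks. It deliberately uses AES via the built-in CryptoStream as a stand-in, not PGP, and the names (StreamingEncryption, EncryptTo) are assumptions for this example. The point is only that the output stream can be a FileStream or any other writable stream, so nothing needs to be collected in a MemoryStream first.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper; AES is a stand-in here and is NOT PGP.
public static class StreamingEncryption
{
    // Encrypts input into output chunk by chunk; only small buffers are held in memory.
    // key must be a valid AES key (e.g. 32 bytes) and iv must be 16 bytes.
    public static void EncryptTo(Stream input, Stream output, byte[] key, byte[] iv)
    {
        using (Aes aes = Aes.Create())
        using (ICryptoTransform encryptor = aes.CreateEncryptor(key, iv))
        using (var crypto = new CryptoStream(output, encryptor, CryptoStreamMode.Write, leaveOpen: true))
        {
            input.CopyTo(crypto);
            // Disposing the CryptoStream writes the final padded block to output.
        }
    }
}
```

Because leaveOpen is true, the caller keeps ownership of the output stream and can rewind or upload it afterwards.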
So I have some code that takes a capture of the screen and saves it to a jpeg file. This works fine, however I want to instead save the jpeg encoded capture to a new ZipArchive without writing the Bitmap to the file system first.
Here is what I have so far:
FileInfo zipArchive = new FileInfo(fileToZip.FullName + ".zip");
using (ZipArchive zipFile = ZipFile.Open(zipArchive.FullName, ZipArchiveMode.Create))
{
    ZipArchiveEntry zae = zipFile.CreateEntry(fileToZip.FullName, CompressionLevel.Optimal);
    using (Stream zipStream = zae.Open())
        bmp.Save(zipStream, ImageFormat.Jpeg);
}
The problem is that on the bmp.Save() line a System.NotSupportedException is thrown:
This stream from ZipArchiveEntry does not support seeking.
I've seen a lot of examples that write directly to the Stream returned from zae.Open(), so I am not sure why this doesn't work; I figured that all bmp.Save() would need to do is write, not seek. I don't know if this would work, but I don't want to have to save the Bitmap to a MemoryStream and then copy that stream to the Stream returned from zae.Open(), because it feels like unnecessary extra work. Am I missing something obvious?
Many file formats have pointers to other parts of the file, or length values, which may not be known beforehand. The simplest way for an encoder to handle this is to write zeros first, then the data, and then seek back to fill in the value. If the encoder works this way, there is no way around it: you will need to first write the data into a MemoryStream and then write the resulting data into the zip entry's stream, as you mentioned.
This doesn't really add that much code and is a simple fix for the problem.
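A minimal sketch of that workaround follows. To keep it independent of System.Drawing, the writer is passed in as a delegate; the names (ZipEntryWriter, AddEntry, writeSeekable) are hypothetical and exist only for this illustration.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper; the names are illustrative.
public static class ZipEntryWriter
{
    // Runs a writer that needs a seekable stream against a MemoryStream,
    // then copies the finished bytes into a new (forward-only) zip entry.
    public static void AddEntry(ZipArchive archive, string entryName, Action<Stream> writeSeekable)
    {
        using (var buffer = new MemoryStream())
        {
            writeSeekable(buffer);      // e.g. s => bmp.Save(s, ImageFormat.Jpeg)
            buffer.Position = 0;        // rewind before copying
            ZipArchiveEntry entry = archive.CreateEntry(entryName, CompressionLevel.Optimal);
            using (Stream entryStream = entry.Open())
            {
                buffer.CopyTo(entryStream);
            }
        }
    }
}
```

For a single screenshot the intermediate buffer is small, so the extra copy costs little compared with the JPEG encoding itself.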
I am working on developing an HTTP server/client, and I can currently send small files over it, such as .txt files and other easy-to-read files that do not require much memory. However, when I want to send a larger file, say a .exe or a large .pdf, I get memory errors. These occur because, before I try to send or receive a file, I have to specify the size of my byte[] buffer. Is there a way to get the size of the buffer while reading it from the stream?
I want to do something like this:
// Create the stream.
Stream dataStream = response.GetResponseStream();
// Read bytes from the stream into the buffer.
byte[] byteArray = new byte[Convert.ToInt32(dataStream.Length)];
dataStream.Read(byteArray, 0, byteArray.Length);
However when calling "dataStream.Length" it throws the error:
ExceptionError: This stream does not support seek operations.
Can someone offer some advice as to how I can get the length of my byte[] from the stream?
Thanks,
You can use the stream's CopyTo method.
MemoryStream m = new MemoryStream();
dataStream.CopyTo(m);
byte[] byteArray = m.ToArray();
You can also write directly to a file:
using (var fs = File.Create("...."))
{
    dataStream.CopyTo(fs);
}
The network layer has no way of knowing how long the response stream is.
However, the server is supposed to tell you how long it is; look in the Content-Length response header.
If that header is missing or incorrect, you're out of luck; you'll need to keep reading until you run out of data.
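"Keep reading until you run out of data" can be sketched as a simple loop over a fixed-size buffer. This is an illustrative helper, not part of any answer above; the names (StreamReading, ReadToEnd) and the 8192-byte chunk size are assumptions.

```csharp
using System;
using System.IO;

// Hypothetical helper; the names are illustrative.
public static class StreamReading
{
    // Reads a stream to its end without ever consulting Stream.Length,
    // so it also works for non-seekable network streams.
    public static byte[] ReadToEnd(Stream source)
    {
        var chunk = new byte[8192];
        using (var collected = new MemoryStream())
        {
            int read;
            // Read returns 0 only at end of stream; it may return fewer bytes than requested.
            while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
            {
                collected.Write(chunk, 0, read);
            }
            return collected.ToArray();
        }
    }
}
```

Note that a single call to Read is not guaranteed to fill the buffer even when the length is known, so a loop like this is needed either way. For very large responses, writing each chunk to a FileStream instead of a MemoryStream avoids holding everything in memory.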
I have a function that returns a MemoryStream, and I want to convert this memory stream to a FileStream object.
Is that possible? If yes, can you please provide a way to do it?
Thanks
A.S
You cannot "convert" the stream, because a MemoryStream and a FileStream are very different things. However, you can write the entire contents of the MemoryStream to a file. There is a CopyTo method that you can use for that:
// memStream is the MemoryStream
using (var output = File.Create(filename)) {
memStream.CopyTo(output);
}
A file stream object represents an open file (from disk) as a stream. A memory stream represents an area of memory (byte array) as a stream. So you can't really convert a memory stream into a file stream directly - at least not trivially.
There are two approaches you could take:
OFFLINE: fully consume the contents of the memory stream and write it all out to a file on disk; then open that file as a file stream
ONLINE: extend the FileStream class, creating an adapter that will wrap a MemoryStream object and expose it as a FileStream (essentially acting as a converter)
The first one is marked [OFFLINE] because you need to have the full contents of the memory stream before you output it to the file (and once you do, modifications to the file stream will not affect the memory stream; nor will changes to the memory stream, such as new data, be available to the file stream).
The second one is marked [ONLINE] because once you create the adapter and initialize the FileStream object from the MemoryStream, you can process any new data in the MemoryStream using the FileStream adapter object. You would essentially be able to read, write, and seek within the memory stream using the file stream as a layer on top of the memory stream. Presumably, that's what you'd want to do.
Of course, it depends on what you need to do, but I'm leaning towards the second [ONLINE] version as the better in the general sense.
When a file is opened in C# using a StreamReader, does the file remain in memory until it is closed?
For example, suppose a 6 MB file is opened by a program using a StreamReader in order to append a single line at the end of the file. Will the program hold the entire 6 MB in its memory until the file is closed? Or is a file pointer returned internally by the .NET code and the line appended at the end, so that the 6 MB of memory is not taken up by the program?
The whole point of a stream is so that you don't have to hold an entire object in memory. You read from it piece by piece as needed.
If you want to append to a file, you should use File.AppendText which will create a StreamWriter that appends to the end of a file.
Here is an example:
string path = @"c:\temp\MyTest.txt";
// This text is always added, making the file longer over time
// if it is not deleted.
using (StreamWriter sw = File.AppendText(path))
{
sw.WriteLine("This");
sw.WriteLine("is Extra");
sw.WriteLine("Text");
}
Again, the whole file will not be stored in memory.
Documentation: http://msdn.microsoft.com/en-us/library/system.io.file.appendtext.aspx
The .NET FileStream will buffer a small amount of data (you can set this amount with some of the constructors).
The Windows OS will do more significant caching of the file; if you have plenty of RAM, this might be the whole file.
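As a small illustration of setting the FileStream buffer size through a constructor, the sketch below appends one line to a file. The names (AppendExample, AppendLine) and the 4096-byte default are assumptions for this example.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper; the names are illustrative.
public static class AppendExample
{
    // Appends one line using an explicit FileStream buffer size.
    // FileMode.Append positions the stream at the end; existing content is not read.
    public static void AppendLine(string path, string line, int bufferSize = 4096)
    {
        using (var fs = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.Read, bufferSize))
        using (var writer = new StreamWriter(fs, new UTF8Encoding(false)))
        {
            writer.WriteLine(line);
        }
    }
}
```

Only the buffer and the new line are held by the program; the size of the existing file does not matter.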
A StreamReader uses FileStream to open the file. FileStream stores a Windows handle, returned by the CreateFile() API function. It is 4 bytes on a 32-bit operating system. FileStream also has a byte[] buffer; it is 4096 bytes by default. This buffer avoids having to call the ReadFile() API function for every single read call. StreamReader itself has a small buffer to make decoding the text in the file more efficient; it is 1024 bytes by default, with a minimum of 128 bytes. And it has some private variables to keep track of the buffer index and whether or not a BOM has been detected.
This all adds up to a few kilobytes. The data you read with StreamReader will of course take space in your program's heap. That could add up to 12 megabytes if you store every string in, say, a List. You usually want to avoid that.
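To make the difference concrete, here is a minimal sketch that processes a file line by line without keeping the lines around. The names (LineCounting, CountLines) are hypothetical.

```csharp
using System;
using System.IO;

// Hypothetical helper; the names are illustrative.
public static class LineCounting
{
    // Counts lines while holding only the current line in memory;
    // each string becomes garbage as soon as the next one is read.
    public static long CountLines(string path)
    {
        long count = 0;
        using (var reader = new StreamReader(path))
        {
            while (reader.ReadLine() != null)
            {
                count++;
            }
        }
        return count;
    }
}
```

Memory use stays roughly constant regardless of file size, whereas adding every line to a List would grow with the file.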
StreamReader will not read the 6 MB file into memory. Also, you can't append a line to the end of the file using StreamReader. You might want to use StreamWriter.
update: not counting buffering and OS caching as someone else mentioned