DotNetZip streaming - c#

I'm trying to zip a bunch of files and make the data consumable via a stream.
I would like to keep the memory footprint as small as possible.
My idea was to implement a Stream that holds a bunch of FileStream objects as data members. When the Read method on my Stream is called, I would read some data from one of my file streams and use the ZipOutputStream instance to write zipped data to a temporary storage stream, to which I would then forward the read request.
This temporary storage stream would just be a queue of bytes. As these bytes are moved into a buffer (via a call to Read), they'd be deleted from the queue. This way, I'd only be storing the bytes that haven't been read yet.
Unfortunately, it seems that when I dispose a ZipOutputStream, it needs to write to random locations in the output in order to create a valid zip file. This will prevent me from using my "fleeting data" solution.
Hopefully this is all clear :)
Is there another way to minimize memory footprint when creating zip files? Please Help!
Thanks!

ZipOutputStream doesn't need to write to random locations in the output stream (in other words, it doesn't need to call Seek()). But if the stream you're writing into reports that it CanSeek, it will use that ability to update some headers.
So make sure that the stream you're writing to returns false for its CanSeek property, and everything should work fine.
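As a rough illustration of the "fleeting data" idea combined with the CanSeek advice, here is a minimal sketch of a forward-only buffer stream. The class name FleetingBufferStream and its PendingBytes property are hypothetical and not part of DotNetZip. Whether a given zip writer also touches Position or Length on its output is something to verify against the library you use.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper (not part of DotNetZip): keeps only the bytes that
// have been written but not yet read, and refuses to seek.
public sealed class FleetingBufferStream : Stream
{
    private readonly Queue<byte[]> _chunks = new Queue<byte[]>();
    private int _headOffset;   // read position inside the chunk at the head of the queue
    private long _pending;     // bytes written but not yet read

    public override bool CanRead  { get { return true; } }
    public override bool CanWrite { get { return true; } }
    // Reporting false here tells the writer it cannot go back and patch headers.
    public override bool CanSeek  { get { return false; } }

    public long PendingBytes { get { return _pending; } }

    public override void Write(byte[] buffer, int offset, int count)
    {
        if (count <= 0) return;
        byte[] copy = new byte[count];
        Buffer.BlockCopy(buffer, offset, copy, 0, count);
        _chunks.Enqueue(copy);
        _pending += count;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (total < count && _chunks.Count > 0)
        {
            byte[] head = _chunks.Peek();
            int n = Math.Min(count - total, head.Length - _headOffset);
            Buffer.BlockCopy(head, _headOffset, buffer, offset + total, n);
            _headOffset += n;
            total += n;
            if (_headOffset == head.Length)
            {
                _chunks.Dequeue();   // chunk fully consumed, release it
                _headOffset = 0;
            }
        }
        _pending -= total;
        return total;   // 0 means nothing is buffered right now
    }

    public override void Flush() { }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}
```

The outer Stream's Read would feed a little more file data into the zip writer whenever this buffer runs dry, then drain the buffer into the caller's array.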

Related

Zipping a large amount of data into an output stream without loading all the data into memory first in C#

I have a C# program that generates a bunch of short (10 seconds or so) video files. These are stored in an Azure file storage blob. I want the user to be able to download these files at a later date as a zip. However, it would take a substantial amount of memory to load the entire collection of video files into memory to create the zip. I was wondering if it is possible to pull data from a stream into memory, zip encode it, output it to another stream, and dispose of it before moving on to the next segment of data.
Let's say the user has generated 100 videos of 10 MB each. If possible, this would allow me to send the zip to the user without first loading the entire 1 GB of footage into memory (or storing the entire zip in memory after the fact).
The individual videos are pretty small, so if I need to load an entire file into memory at a time, that is fine as long as I can remove it from memory after it has been encoded and transmitted, before moving on to the next file.
Yes, it is certainly possible to stream the files in, without holding even one of them entirely in memory at any time, and to compress, stream out, and transmit a zip file containing them, without holding the entire zip file in either memory or mass storage. The zip format is designed to be streamable. However, I am not aware of a library that will do that for you.
ZipFile would require saving the entire zip file before transmitting it. If you're ok with saving the zip file in mass storage (not memory) before transmitting, then use ZipFile.
To write your own zip streamer, you would need to generate the zip file format manually. The zip format is documented here. You can use DeflateStream to do the actual compression and Crc32 to compute the CRC-32s. You would transmit the local header before each file's compressed data, followed by a data descriptor after each. You would save the local header information in memory as you go along, and then transmit the central directory and end record after all of the local entries.
Zip is a relatively straightforward format, so while it would take a little bit of work, it is definitely doable.
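The per-entry part of that recipe can be sketched as follows. This is only the compression step, not a full zip writer: it deflates one entry in chunks while tracking the three values the data descriptor needs (CRC-32, compressed size, uncompressed size). The names ZipEntryPieces, UpdateCrc, and CompressEntry are made up for illustration, and the CRC-32 is computed with a hand-rolled table so the sketch has no dependency beyond System.IO.Compression.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ZipEntryPieces
{
    private static readonly uint[] Table = BuildTable();

    private static uint[] BuildTable()
    {
        var table = new uint[256];
        for (uint i = 0; i < 256; i++)
        {
            uint c = i;
            for (int k = 0; k < 8; k++)
                c = (c & 1u) != 0 ? 0xEDB88320u ^ (c >> 1) : c >> 1;
            table[i] = c;
        }
        return table;
    }

    // Running CRC-32 (the polynomial used by zip); start with crc = 0.
    public static uint UpdateCrc(uint crc, byte[] data, int offset, int count)
    {
        crc = ~crc;
        for (int i = 0; i < count; i++)
            crc = Table[(crc ^ data[offset + i]) & 0xFFu] ^ (crc >> 8);
        return ~crc;
    }

    // Write-only pass-through that counts the bytes sent to the real output.
    private sealed class CountingStream : Stream
    {
        private readonly Stream _inner;
        public long Count;
        public CountingStream(Stream inner) { _inner = inner; }
        public override void Write(byte[] b, int o, int c) { _inner.Write(b, o, c); Count += c; }
        public override void Flush() { _inner.Flush(); }
        public override bool CanRead { get { return false; } }
        public override bool CanSeek { get { return false; } }
        public override bool CanWrite { get { return true; } }
        public override long Length { get { throw new NotSupportedException(); } }
        public override long Position
        {
            get { throw new NotSupportedException(); }
            set { throw new NotSupportedException(); }
        }
        public override int Read(byte[] b, int o, int c) { throw new NotSupportedException(); }
        public override long Seek(long o, SeekOrigin so) { throw new NotSupportedException(); }
        public override void SetLength(long v) { throw new NotSupportedException(); }
    }

    // Deflates input to output chunk by chunk; neither stream has to be seekable.
    public static void CompressEntry(Stream input, Stream output,
        out uint crc, out long compressedSize, out long uncompressedSize)
    {
        crc = 0;
        uncompressedSize = 0;
        var counter = new CountingStream(output);
        using (var deflate = new DeflateStream(counter, CompressionMode.Compress, true))
        {
            var buffer = new byte[81920];
            int n;
            while ((n = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                crc = UpdateCrc(crc, buffer, 0, n);
                uncompressedSize += n;
                deflate.Write(buffer, 0, n);
            }
        }   // disposing the DeflateStream flushes the final deflate block
        compressedSize = counter.Count;
    }
}
```

A full streamer would write a local header before calling something like CompressEntry, write the data descriptor from the three returned values afterwards, and remember them for the central directory at the end.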

How can I get the length (file size) of a task stream

My primary requirement is to get the length (file size) of a task stream, System.Threading.Tasks.Task<System.IO.Stream>. With a MemoryStream I used to get the file size using "stream.Result.Length", but when I tried the same on a task stream it threw a System.NotSupportedException. It seems the stream doesn't support that property. I think there is a difference between memory streams and other streams.
Exception occurred handling notification:System.NotSupportedException:
This stream does not support seek operations.
Could you please tell me how I can achieve this?
I found this link, which gives me instructions. I am using .NET 3.5, so I can't use the ConvertTo() functions that are available in .NET 4.
The point of a stream is that you don't need to have all the data available before you can start processing the first part of it. MemoryStream is an exception, in that it does have the entire contents of the stream in memory at the same time, so it can "support seek operations" to tell you things like how big the stream is.
If you need the full size of a stream that can't seek, you're going to have to consume the entire stream to find that out. If the stream is always going to be relatively small, you could copy its contents into a MemoryStream. If it's going to be larger, you could copy its contents into a file on disk.
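For the small-stream case, a minimal sketch of the copy might look like this. It uses a manual read loop rather than a framework copy helper so that it also compiles against older framework versions. The names StreamBuffering and BufferToMemory are hypothetical.

```csharp
using System.IO;

public static class StreamBuffering
{
    // Drains a forward-only stream into memory so that Length becomes available.
    // Only suitable when the stream is known to be reasonably small.
    public static MemoryStream BufferToMemory(Stream source)
    {
        var copy = new MemoryStream();
        var buffer = new byte[8192];
        int n;
        while ((n = source.Read(buffer, 0, buffer.Length)) > 0)
            copy.Write(buffer, 0, n);
        copy.Position = 0;   // rewind so the caller can read from the start
        return copy;
    }
}
```

After the call, the returned stream's Length is the size of the original data. For larger inputs, the same loop can write to a FileStream instead.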
You may want to examine why you need to know the length. If it's so that you can cancel uploads that are too large, for example, then perhaps you should just start processing the upload in chunks, but after each piece of data comes in check how much data you've received so far, and cancel the process if it gets too big.
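The chunked approach with a size check could be sketched like this. The names LimitedCopy and CopyWithLimit are illustrative, as is the choice of InvalidDataException to signal that the limit was exceeded.

```csharp
using System.IO;

public static class LimitedCopy
{
    // Copies source to destination in chunks and aborts as soon as the
    // running total exceeds maxBytes. Returns the number of bytes copied.
    public static long CopyWithLimit(Stream source, Stream destination, long maxBytes)
    {
        var buffer = new byte[8192];
        long total = 0;
        int n;
        while ((n = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            total += n;
            if (total > maxBytes)
                throw new InvalidDataException("Stream exceeds the limit of " + maxBytes + " bytes.");
            destination.Write(buffer, 0, n);
        }
        return total;
    }
}
```

This never needs the stream's Length, so it works on streams that cannot seek.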

Generate and stream Zip archive without storing it in the filesystem or reading into memory first

How can I asynchronously take multiple existing streams (from the database), add them to a zip archive stream, and return it in ASP.NET Web API 2?
The key difference from the other "duplicate" question is how to do this in a streaming fashion, without writing it to a temp file or buffering it completely in memory first.
It looks like you can't do this directly:
Writing to ZipArchive using the HttpContext OutputStream
The HTTP response stream would need to support seeking for a zip to be written directly to it, which it doesn't. By the looks of it, you will need to write to a temporary file.
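A minimal sketch of the temporary-file fallback, assuming System.IO.Compression.ZipArchive is available, could look like the following. The names TempFileZip and WriteZip are hypothetical, and the sketch is synchronous for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class TempFileZip
{
    // Builds the archive in a temp file (which is seekable), then copies
    // the finished file to the output stream (which does not have to be).
    public static void WriteZip(IDictionary<string, Stream> entries, Stream output)
    {
        string path = Path.GetTempFileName();
        using (var temp = new FileStream(path, FileMode.Create, FileAccess.ReadWrite,
                                         FileShare.None, 81920, FileOptions.DeleteOnClose))
        {
            // leaveOpen: true keeps the temp file open after the archive is finished.
            using (var zip = new ZipArchive(temp, ZipArchiveMode.Create, true))
            {
                foreach (KeyValuePair<string, Stream> pair in entries)
                {
                    ZipArchiveEntry entry = zip.CreateEntry(pair.Key);
                    using (Stream target = entry.Open())
                        pair.Value.CopyTo(target);   // one entry at a time
                }
            }
            temp.Position = 0;
            temp.CopyTo(output);
        }   // DeleteOnClose removes the temp file when the stream is disposed
    }
}
```

Only one copy buffer's worth of data is held in memory at a time; the cost is the disk space for the temporary archive. In an async action, the two CopyTo calls could be replaced with CopyToAsync.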

Buffering stream of byte array

I am using DropNet library to download files from Dropbox.
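public Stream GetFileStream(string path)
{
    // GetFile returns the whole file as a byte[], so the entire
    // download is held in memory before the stream is created.
    return new MemoryStream(dropboxClient.GetFile(path));
}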
I am facing a problem when downloading large files, because the DropNet library returns a byte array, which I then convert to a stream for other purposes using MemoryStream. This is not good, because I have to download the whole file into server memory before I can continue with my logic.
I am trying to find a way to buffer these files as a stream.
I looked at the BufferedStream class, but creating a new BufferedStream requires an existing stream first. I can't figure out the best solution for my problem.
The DropNet API does not expose a Stream functionality for retrieving files. You must wait for the entire file to be downloaded before you can use it. If you want to be able to read the stream as it comes in you will need to use a different library, modify an existing one, or write your own.

MemoryStream "out of memory" C#

I have an implementation of a custom DataObject (Virtual File) see here. I have drag and drop functionality in a control view (drag and drop a file OUT of a control view without having a temp local file).
This works fine with smaller files, but as soon as the file is larger than, say, 12-15 MB, it says there is not enough memory available. It seems like the MemoryStream is running out of memory.
What can I do about this? Can I somehow split a larger byte[] into several MemoryStreams and reassemble those into a single file?
Any help would be highly appreciated.
"Can I somehow split a larger byte[] into several MemoryStreams and reassemble those into a single file?"
Yes.
When I had to deal with a similar situation, I built my own stream that internally used byte arrays of 4 MB each. This "paging" means it never has to allocate ONE LARGE BYTE ARRAY, which is what MemoryStream does. So, dump MemoryStream and build your own stream based on another internal storage mechanism.
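A minimal sketch of such a paged stream, under the assumption that a list of fixed-size pages is an acceptable storage mechanism, might look like this. The class name PagedMemoryStream is made up, and this is not the answerer's actual implementation. The page size defaults to 4 MB and can be overridden.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Illustrative only: stores data in fixed-size pages instead of one big array.
public sealed class PagedMemoryStream : Stream
{
    private readonly int _pageSize;
    private readonly List<byte[]> _pages = new List<byte[]>();
    private long _length;
    private long _position;

    public PagedMemoryStream() : this(4 * 1024 * 1024) { }
    public PagedMemoryStream(int pageSize)
    {
        if (pageSize <= 0) throw new ArgumentOutOfRangeException("pageSize");
        _pageSize = pageSize;
    }

    public override bool CanRead  { get { return true; } }
    public override bool CanSeek  { get { return true; } }
    public override bool CanWrite { get { return true; } }
    public override long Length   { get { return _length; } }
    public override long Position
    {
        get { return _position; }
        set
        {
            if (value < 0) throw new ArgumentOutOfRangeException("value");
            _position = value;
        }
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        while (count > 0)
        {
            int page = (int)(_position / _pageSize);
            int inPage = (int)(_position % _pageSize);
            while (_pages.Count <= page) _pages.Add(new byte[_pageSize]);   // grow one page at a time
            int n = Math.Min(count, _pageSize - inPage);
            Buffer.BlockCopy(buffer, offset, _pages[page], inPage, n);
            offset += n; count -= n; _position += n;
        }
        if (_position > _length) _length = _position;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int total = 0;
        while (count > 0 && _position < _length)
        {
            int page = (int)(_position / _pageSize);
            int inPage = (int)(_position % _pageSize);
            int n = (int)Math.Min(Math.Min(count, _pageSize - inPage), _length - _position);
            Buffer.BlockCopy(_pages[page], inPage, buffer, offset, n);
            offset += n; count -= n; _position += n; total += n;
        }
        return total;
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        long target = origin == SeekOrigin.Begin ? offset
                    : origin == SeekOrigin.Current ? _position + offset
                    : _length + offset;
        Position = target;
        return _position;
    }

    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Flush() { }
}
```

Because each page is allocated separately, growing the stream never requires one contiguous block the size of the whole file, and no existing data is copied when it grows.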
