How to split a FileStream into two substreams - C#

Is there a way to split a FileStream obtained via
File.Open("100GB.bin", FileMode.Open, FileAccess.Read, FileShare.Read)
into two equally sized substreams?
I would like to upload the file in parts to a website, because the web server limits the maximum file size allowed per post.
Two streams are needed so the parts can be uploaded to the website simultaneously.
Thanks in advance.

If the website doesn't provide a multi-part upload mechanism, just read N bytes into different streams:
using (var fs = File.Open("100GB.bin", FileMode.Open, FileAccess.Read, FileShare.Read))
{
    var chunkSizeInBytes = ...; // whatever you like; the loop below also handles a final partial chunk
    var buf = new byte[chunkSizeInBytes];
    int bytesRead;
    // Read always targets offset 0 of the reusable buffer; a growing offset
    // would run past the end of buf on the second iteration.
    while ((bytesRead = fs.Read(buf, 0, buf.Length)) > 0)
    {
        // if, for whatever reason, you actually need a new stream,
        // just create a MemoryStream and use fs.CopyTo(stream, size)
        PostMyData(buf, bytesRead); // pass the count: the last chunk may be shorter
    }
}
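If you literally need two Stream instances over the two halves of the file, here is a minimal sketch (my own illustration, not a built-in API) of a read-only wrapper that exposes a byte range of the file as its own stream. Each wrapper opens its own FileStream, so the two halves can be uploaded concurrently without fighting over a shared Position:

using System;
using System.IO;

class SubStream : Stream
{
    private readonly Stream _inner; // owned and disposed by this wrapper
    private long _remaining;        // bytes left in this slice

    public SubStream(string path, long offset, long length)
    {
        _inner = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.Read);
        _inner.Seek(offset, SeekOrigin.Begin);
        _remaining = length;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        if (_remaining <= 0) return 0; // end of this slice
        int n = _inner.Read(buffer, offset, (int)Math.Min(count, _remaining));
        _remaining -= n;
        return n;
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();

    protected override void Dispose(bool disposing)
    {
        if (disposing) _inner.Dispose();
        base.Dispose(disposing);
    }
}

Used like this (UploadAsync stands in for whatever uploader you have):

long total = new FileInfo("100GB.bin").Length;
long half = total / 2;
using (var firstHalf = new SubStream("100GB.bin", 0, half))
using (var secondHalf = new SubStream("100GB.bin", half, total - half))
{
    await Task.WhenAll(UploadAsync(firstHalf), UploadAsync(secondHalf));
}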

Related

How To Monitor Download Progress of Stream

From SharePoint, I get a Stream for a file. I want to copy the entire file from the internet to a local file on the PC, with status reported while the download is occurring. How? FileStream and StreamReader run into byte-vs-char differences when you don't do a full CopyTo, and CopyTo gives no progress updates. There has to be a cleaner way...
using (var sr = new StreamReader(fileData.Value))
{
    using (FileStream fs = new FileStream(localFile + "_tmp", FileMode.Create))
    {
        byte[] block = new byte[1024];
        // only guessing something like this is necessary...
        int count = (int)Math.Min(sr.BaseStream.Length - sr.BaseStream.Position, block.Length);
        while (sr.BaseStream.Position < sr.BaseStream.Length)
        {
            // read requires char[]
            sr.Read(block, 0, count);
            // write requires byte[]
            fs.Write(block, 0, count);
            Log("Percent complete: " + (sr.BaseStream.Position / sr.BaseStream.Length));
            count = (int)Math.Min(sr.BaseStream.Length - sr.BaseStream.Position, block.Length);
        }
    }
}
Just had to use BinaryReader instead of StreamReader. Easy Peasy.
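For reference, a minimal sketch of that BinaryReader version (assuming fileData.Value is a seekable stream and Log is the poster's own helper); it also multiplies by 100.0 so the progress isn't silently truncated to zero by integer division:

using (var br = new BinaryReader(fileData.Value))
using (var fs = new FileStream(localFile + "_tmp", FileMode.Create))
{
    byte[] block = new byte[1024];
    int count;
    // BinaryReader.Read works on byte[], so there is no char-vs-byte mismatch
    while ((count = br.Read(block, 0, block.Length)) > 0)
    {
        fs.Write(block, 0, count);
        // 100.0 forces floating-point math; integer pos/len would always be 0
        Log("Percent complete: " + (100.0 * br.BaseStream.Position / br.BaseStream.Length));
    }
}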

Saving a binary file from MemoryStream to a Network path intermittently saves 0 bytes file

I have the following code which works most of the time when saving a file from a memory stream pointing to a byte array to a network location.
using (var writer = new BinaryWriter(new FileStream(filePath, FileMode.Create)))
using (var reader = new BinaryReader(stream))
{
    var chunkSize = 1024;
    // note: the cast binds to Length, not to the division result
    var chunkCount = (int)reader.BaseStream.Length / chunkSize;
    var chunks = Enumerable.Range(0, chunkCount)
        .Select(_ => reader.ReadBytes(chunkSize));
    chunks.ForEach(c => writer.Write(c)); // ForEach here is presumably a custom IEnumerable extension
    writer.Write(reader.ReadBytes((int)reader.BaseStream.Length % chunkSize));
    writer.Close();
    reader.Close();
}
Is there any possible way the above code may end up saving a zero-byte file? Is it a bad idea to save to the network directly rather than using a "copy from temp" method?
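For illustration, a hedged sketch of the "copy from temp" pattern the question mentions, assuming stream is the MemoryStream and filePath is the network target. Writing locally first means a dropped connection can't leave a truncated or zero-byte file at the destination, and CopyTo replaces the manual chunking entirely:

var tempPath = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
using (var tempFile = new FileStream(tempPath, FileMode.Create))
{
    stream.Position = 0;     // rewind the MemoryStream before copying
    stream.CopyTo(tempFile); // Dispose flushes and closes the temp file
}
File.Copy(tempPath, filePath, overwrite: true); // only now touch the network
File.Delete(tempPath);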

Zipping a number of potentially large files in chunks to avoid large memory consumption

I am working on an application that can take a list of file keys to files on AWS S3 as input and then create a zip file back on AWS S3 with all of those files inside. The compression part does not matter - the important part is to have a single zip file containing all of the other files.
To be able to run the application on a server without much memory or file storage space, I was thinking of using the API that fetches a byte range from a file on S3 (https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectGET.html) to download the files in chunks, then adding them to the zip file and uploading each chunk with the multipart upload API (https://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.html).
I have tried to make a small sample app, that will simulate how it could work (without actually calling the S3 APIs yet), but it gets stuck on this line: "await zipStream.WriteAsync(inBuffer, 0, currentChunk);"
public static async Task Main(string[] args)
{
    const int ChunkSize = 5 * 1024 * 1024;
    using (var fileOutputStream = new FileStream("/Users/SPE/Downloads/BG_K01.zip", FileMode.Create))
    {
        using (var fileInputStream = File.Open("/Users/SPE/Downloads/BG_K01.rvt", FileMode.Open))
        {
            long fileSize = new FileInfo("/Users/SPE/Downloads/BG_K01.rvt").Length;
            int readBytes = 0;
            using (AnonymousPipeServerStream pipeServer = new AnonymousPipeServerStream())
            {
                using (AnonymousPipeClientStream pipeClient = new AnonymousPipeClientStream(pipeServer.GetClientHandleAsString()))
                {
                    using (var zipArchive = new ZipArchive(pipeServer, ZipArchiveMode.Create, true))
                    {
                        var zipEntry = zipArchive.CreateEntry("BG_K01.rvt", CompressionLevel.NoCompression);
                        using (var zipStream = zipEntry.Open())
                        {
                            // Simulate receiving and sending a chunk of bytes
                            while (readBytes < fileSize)
                            {
                                var currentChunk = (int)Math.Min(ChunkSize, fileSize - readBytes);
                                var inBuffer = new byte[currentChunk];
                                var outBuffer = new byte[currentChunk];
                                await fileInputStream.ReadAsync(inBuffer, 0, currentChunk);
                                await zipStream.WriteAsync(inBuffer, 0, currentChunk);
                                await pipeClient.ReadAsync(outBuffer, 0, currentChunk);
                                await fileOutputStream.WriteAsync(outBuffer, 0, currentChunk);
                                readBytes += currentChunk;
                            }
                        }
                    }
                }
            }
        }
    }
}
I am also not sure whether using the pipe streams is the best way to do this, but my hope is that they release any memory consumed once the stream has been read, keeping memory consumption very low.
Does anybody know why writing to the zipStream hangs?
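A plausible explanation, hedged: an anonymous pipe has a fixed-size internal buffer, and nothing reads from pipeClient while zipStream is being written, so once that buffer fills, the WriteAsync blocks forever. A sketch of one way around it, reusing the question's variable names: drain the client end on a separate task, and rearrange the using nesting so the write end (zipStream, zipArchive, then pipeServer) is disposed before pipeClient, with the drain task awaited last.

var drainTask = Task.Run(async () =>
{
    var outBuffer = new byte[ChunkSize];
    int n;
    // runs concurrently with the zip writer, emptying the pipe as it fills
    while ((n = await pipeClient.ReadAsync(outBuffer, 0, outBuffer.Length)) > 0)
    {
        await fileOutputStream.WriteAsync(outBuffer, 0, n);
    }
});

var inBuffer = new byte[ChunkSize];
int read;
while ((read = await fileInputStream.ReadAsync(inBuffer, 0, inBuffer.Length)) > 0)
{
    // no longer deadlocks: the drain task keeps making room in the pipe buffer
    await zipStream.WriteAsync(inBuffer, 0, read);
}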

Alternative way of using System.IO.File.OpenWrite() in using (Stream fileStream = System.IO.File.OpenWrite(@"D:\sample.svg"))

using (Stream fileStream = System.IO.File.OpenWrite(@"D:\sample.svg")) // verbatim string, so the backslash isn't an escape
{
    byte[] buffer = new byte[8 * 1024];
    int len;
    while ((len = SvgStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        fileStream.Write(buffer, 0, len);
    }
}
I have this code for converting a PPT to SVG. SvgStream contains the PPT Slides. I don't want to store the converted SVG file on a physical path like D:\sample.svg. Is it possible to store it on an object that is not a physical path?
"Is it possible to store it on an object that is not a physical path?"
Sure. Create a MemoryStream instead of a FileStream if you want to stream into memory rather than to the file system.
And while I'm here: you might want to use the CopyTo method rather than this tedious business of buffering it over one page at a time.
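A short sketch of that suggestion: let CopyTo do the buffering instead of the hand-rolled 8 KB loop, and keep the result in memory rather than on disk.

using (var memoryStream = new MemoryStream())
{
    SvgStream.CopyTo(memoryStream);
    byte[] svgBytes = memoryStream.ToArray(); // the SVG bytes, no physical path involved
}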
I saved it to a string:
SvgStream = new MemoryStream();
sld.WriteAsSvg(SvgStream);
SvgStream.Position = 0;
string[] svgArray = new string[MAX]; // MAX and i come from the surrounding loop (not shown)
//instead of using(var fileStream = File.IO.......
//I came up with this:
var sr = new StreamReader(SvgStream);
svgArray[i] = sr.ReadToEnd();

GZipStream not reading the whole file

I have some code that downloads gzipped files and decompresses them. The problem is that I can't get it to decompress the whole file; it only reads the first 4096 bytes and then about 500 more.
Byte[] buffer = new Byte[4096];
int count = 0;
FileStream fileInput = new FileStream("input.gzip", FileMode.Open, FileAccess.Read, FileShare.Read);
FileStream fileOutput = new FileStream("output.dat", FileMode.Create, FileAccess.Write, FileShare.None);
GZipStream gzipStream = new GZipStream(fileInput, CompressionMode.Decompress, true);
// Read from gzip stream
while ((count = gzipStream.Read(buffer, 0, buffer.Length)) > 0)
{
    // Write to output file
    fileOutput.Write(buffer, 0, count);
}
// Close the streams
...
I've checked the downloaded file; it's 13MB when compressed, and contains one XML file. I've manually decompressed the XML file, and the content is all there. But when I do it with this code, it only outputs the very beginning of the XML file.
Anyone have any ideas why this might be happening?
EDIT
Try not leaving the GZipStream open:
GZipStream gzipStream = new GZipStream(fileInput, CompressionMode.Decompress, false);
or
GZipStream gzipStream = new GZipStream(fileInput, CompressionMode.Decompress);
I ended up using a gzip executable to do the decompression instead of a GZipStream. It can't handle the file for some reason, but the executable can.
The same thing happened to me. In my case it read only about six lines and then hit end of file. I realized that although the extension was .gz, the file had been compressed with another algorithm that GZipStream doesn't support, so I used the SevenZipSharp library instead and it worked. This is my code:
using (var input = File.OpenRead(lstFiles[0]))
{
    using (var ds = new SevenZipExtractor(input))
    {
        //ds.ExtractionFinished += DsOnExtractionFinished;
        var mem = new MemoryStream();
        ds.ExtractFile(0, mem);
        using (var sr = new StreamReader(mem))
        {
            var iCount = 0;
            String line;
            mem.Position = 0;
            while ((line = sr.ReadLine()) != null && iCount < 100)
            {
                iCount++;
                LstOutput.Items.Add(line);
            }
        }
    }
}
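Along the same lines, a quick hedged check of what a ".gz" file really contains: gzip data starts with the magic bytes 0x1F 0x8B, and if they're absent, GZipStream is the wrong tool no matter what the extension says.

using (var fs = File.OpenRead(lstFiles[0]))
{
    int b0 = fs.ReadByte();
    int b1 = fs.ReadByte();
    Console.WriteLine(b0 == 0x1F && b1 == 0x8B
        ? "Looks like real gzip data"
        : "Not gzip; another extractor (e.g. SevenZipSharp) is needed");
}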
Are you calling Close or Flush on fileOutput? (Or just wrap it in a using, which is recommended practice.) If you don't, the file might not be flushed to disk when your program ends.
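A minimal sketch of the same loop with using blocks, so every stream is flushed and closed even if an exception is thrown part-way through:

using (var fileInput = new FileStream("input.gzip", FileMode.Open, FileAccess.Read, FileShare.Read))
using (var fileOutput = new FileStream("output.dat", FileMode.Create, FileAccess.Write, FileShare.None))
using (var gzipStream = new GZipStream(fileInput, CompressionMode.Decompress))
{
    var buffer = new byte[4096];
    int count;
    while ((count = gzipStream.Read(buffer, 0, buffer.Length)) > 0)
    {
        fileOutput.Write(buffer, 0, count);
    }
} // Dispose flushes fileOutput, so no bytes are left behind in its buffer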
