Appending bytes using the Amazon S3 .NET SDK - C#

I have the following piece of code which works great for a simple file upload. But let's say I wanted to append to an existing file or simply upload random chunks of bytes, like the first and last 10 bytes? Is this even possible with the official SDK?
PutObjectRequest request = new PutObjectRequest();
using (FileStream fs = new FileStream(@"C:\myFolder\MyFile.bin", FileMode.Open))
{
    request.WithInputStream(fs);
    request.WithBucketName(bucketName);
    request.WithKey(keyName);
    client.PutObject(request);
}

There is no way to append data to existing objects in S3. You have to overwrite the entire file.
That said, it is possible to a degree with Amazon's multipart upload support, where uploads are broken into chunks and reassembled on S3. But you have to do it as part of a single transfer, and it's only for large files.

The previous answer no longer appears to be accurate. You can now achieve an append-like process by using an existing object as the initial part of a multipart upload, then deleting the previous object once the transfer is done.
See:
http://docs.aws.amazon.com/AmazonS3/latest/dev/CopyingObjctsUsingLLNetMPUapi.html
http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPartCopy.html
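For illustration, here is a minimal sketch of that approach against the v3-era .NET SDK's low-level API (synchronous calls; client is an AmazonS3Client, and bucketName, existingKey, newKey, and bytesToAppend are placeholder names; the copied part must be at least 5 MB unless it is the final part):

using System.Collections.Generic;
using System.IO;
using Amazon.S3;
using Amazon.S3.Model;

// 1. Start a multipart upload for the new object.
var init = client.InitiateMultipartUpload(new InitiateMultipartUploadRequest
{
    BucketName = bucketName,
    Key = newKey
});

// 2. Server-side copy the existing object in as part 1
//    (must be >= 5 MB unless it is the only/final part).
CopyPartResponse copied = client.CopyPart(new CopyPartRequest
{
    SourceBucket = bucketName,
    SourceKey = existingKey,
    DestinationBucket = bucketName,
    DestinationKey = newKey,
    UploadId = init.UploadId,
    PartNumber = 1
});

// 3. Upload the bytes to "append" as part 2.
UploadPartResponse appended;
using (var ms = new MemoryStream(bytesToAppend))
{
    appended = client.UploadPart(new UploadPartRequest
    {
        BucketName = bucketName,
        Key = newKey,
        UploadId = init.UploadId,
        PartNumber = 2,
        InputStream = ms
    });
}

// 4. Stitch the parts together into the new object.
client.CompleteMultipartUpload(new CompleteMultipartUploadRequest
{
    BucketName = bucketName,
    Key = newKey,
    UploadId = init.UploadId,
    PartETags = new List<PartETag>
    {
        new PartETag(1, copied.ETag),
        new PartETag(2, appended.ETag)
    }
});

Completing the upload produces a single object under newKey containing the old object followed by the new bytes; the old object can then be deleted.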

Related

Write Zip file to AWS as a stream

Using C#, I would like to create a zip file in AWS S3, add file entries to it, then close the stream. System.IO.Compression.ZipArchive can be created from a System.IO.Stream. Is it possible to get a writeable stream into an S3 bucket? I am using the .NET SDK for S3.
An object uploaded to S3 must have a known size when the request is made. Since the size of the zip file won't be known until the stream is closed, you can't do what you are asking. You would have to create the zip file locally and then upload it to S3.
The closest you could get to what you are asking for is S3's multipart upload. I would use a MemoryStream as the underlying stream for the ZipArchive, and each time you add a file to the zip archive, check whether the MemoryStream is larger than 5 megabytes (S3's minimum part size). If it is, take the byte buffer from the MemoryStream, upload a new part to S3, then clear the MemoryStream and continue adding files to the zip archive. A sketch of this idea follows.
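A rough sketch of that idea (illustrative, not a hardened implementation; s3, bucket, key, and uploadId are placeholder names): a write-only Stream that reports CanSeek = false, so ZipArchive uses its forward-only write path, and that flushes every 5 MB of buffered bytes as one multipart part.

using System;
using System.Collections.Generic;
using System.IO;
using Amazon.S3;
using Amazon.S3.Model;

class S3MultipartZipStream : Stream
{
    private readonly IAmazonS3 _s3;
    private readonly string _bucket, _key, _uploadId;
    private readonly List<PartETag> _etags = new List<PartETag>();
    private readonly MemoryStream _buffer = new MemoryStream();
    private int _partNumber = 1;
    private long _written;

    public S3MultipartZipStream(IAmazonS3 s3, string bucket, string key, string uploadId)
    {
        _s3 = s3; _bucket = bucket; _key = key; _uploadId = uploadId;
    }

    public override bool CanRead => false;
    public override bool CanSeek => false;  // keeps ZipArchive on its non-seeking write path
    public override bool CanWrite => true;
    public override long Length => _written;
    public override long Position
    {
        get { return _written; }
        set { throw new NotSupportedException(); }
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        _buffer.Write(buffer, offset, count);
        _written += count;
        if (_buffer.Length >= 5 * 1024 * 1024)  // S3's minimum part size
            UploadBufferedPart();
    }

    private void UploadBufferedPart()
    {
        _buffer.Position = 0;
        var response = _s3.UploadPart(new UploadPartRequest
        {
            BucketName = _bucket, Key = _key, UploadId = _uploadId,
            PartNumber = _partNumber, InputStream = _buffer, PartSize = _buffer.Length
        });
        _etags.Add(new PartETag(_partNumber++, response.ETag));
        _buffer.SetLength(0);
    }

    // Call after disposing the ZipArchive: flushes the final (possibly
    // smaller than 5 MB) part and completes the multipart upload.
    public void Complete()
    {
        if (_buffer.Length > 0) UploadBufferedPart();
        _s3.CompleteMultipartUpload(new CompleteMultipartUploadRequest
        {
            BucketName = _bucket, Key = _key, UploadId = _uploadId, PartETags = _etags
        });
    }

    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}

Usage would be roughly: call InitiateMultipartUpload to get an upload id, create the ZipArchive over this stream with ZipArchiveMode.Create and leaveOpen: true, add entries, dispose the archive (which writes the central directory into the buffer), then call Complete().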
You'll probably want to take a look at this answer here for an existing discussion around this.
This doc page seems to suggest that there is an Upload method that can take a stream (with S3 taking care of reassembling the multipart upload), although this is for version 1 of the SDK, so it might not be available in version 3.

Is it possible to download and unzip in parallel?

I have some large zip files that I'm downloading and then unzipping in my program. Performance is important, and one direction I started thinking about was whether it was possible to start the download and then begin unzipping the data as it arrives, instead of waiting for the download to complete before starting to unzip. Is this possible? From what I understand of DEFLATE, it should be theoretically possible, right?
I'm currently using DotNetZip as my zip library, but it refuses to act on a non-seekable stream.
Code would be something like this:
// HTTP GET the application from the server
var request = (System.Net.HttpWebRequest)System.Net.WebRequest.Create(url);
request.Method = "GET";
Directory.CreateDirectory(localPath);
using (var response = (HttpWebResponse)request.GetResponse())
using (Stream input = response.GetResponseStream())
{
    // Unzip being some function which starts unzipping and
    // returns when unzipping is done
    return Unzip(input, localPath);
}
"I started thinking about whether it was possible to start the download and then begin unzipping the data as it arrives, instead of waiting for the download to complete and then start unzipping. Is this possible?"
If you want to start unzipping whilst the response body is still downloading, you can't really do this.
In a ZIP file, the Central Directory Record, which contains the list of files in the ZIP file, is located at the very end of the ZIP file. It will be the last thing you download. Without it, you can't reliably determine where the individual file records are located in your ZIP file.
This would also explain why DotNetZip needs a seekable stream. It needs to be able to read the Central Directory Record at the end of the file first, then jump back to earlier sections to read information about individual ZIP entries to extract them.
If you have very specific ZIP files you could make certain assumptions about the layout of those individual file records and extract them by hand, without seeking backwards, but it would not be broadly compatible with ZIP files in general.
You could use an async Task to unzip:
await Task.Run(() => ZipFile.ExtractToDirectory(Path.Combine(localPath, fileName), destinationPath));
The vast majority of zip files contain only local file records followed by compressed data, repeated until you hit the central directory, so it is very much possible to do streaming decompression as asked in this question. The fflate JavaScript library does it, for example.
It is possible to create (a) a self-extracting zip file, or (b) some other oddly formatted zip file that doesn't follow this layout, but you'd be hard pressed to find one in the wild.
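To make that concrete, here is a minimal sketch of hand-rolled streaming extraction. It assumes the simple layout described above: local file headers with sizes filled in (no data descriptors, no ZIP64, stored or deflate entries only), so treat it as illustrative rather than general-purpose.

using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static void UnzipStreaming(Stream input, string destDir)
{
    var reader = new BinaryReader(input);
    while (true)
    {
        uint signature = reader.ReadUInt32();
        if (signature != 0x04034b50)   // not a local file header: central directory reached
            break;
        reader.ReadUInt16();                         // version needed to extract
        ushort flags = reader.ReadUInt16();
        ushort method = reader.ReadUInt16();         // 0 = stored, 8 = deflate
        reader.ReadUInt32();                         // last-modified time/date
        reader.ReadUInt32();                         // CRC-32
        uint compressedSize = reader.ReadUInt32();
        reader.ReadUInt32();                         // uncompressed size
        ushort nameLength = reader.ReadUInt16();
        ushort extraLength = reader.ReadUInt16();
        string name = Encoding.UTF8.GetString(reader.ReadBytes(nameLength));
        reader.ReadBytes(extraLength);

        if ((flags & 0x0008) != 0)  // data descriptor: sizes not known up front
            throw new NotSupportedException("Entry sizes must be in the local header.");

        byte[] compressed = reader.ReadBytes((int)compressedSize);  // fine for modest entries
        if (name.EndsWith("/"))     // directory entry
        {
            Directory.CreateDirectory(Path.Combine(destDir, name));
            continue;
        }

        string outPath = Path.Combine(destDir, name);
        Directory.CreateDirectory(Path.GetDirectoryName(outPath));
        using (var outFile = File.Create(outPath))
        using (var source = new MemoryStream(compressed))
        {
            if (method == 8)
                using (var inflater = new DeflateStream(source, CompressionMode.Decompress))
                    inflater.CopyTo(outFile);
            else
                source.CopyTo(outFile);
        }
    }
}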

MongoDB: How to read binary files in chunks with C#

I am able to save a binary file to the MongoDB server through the following code:
using (var fs = new FileStream("C:\\Data_w.bin", FileMode.Open))
{
    var gridFsInfo = database.GridFS.Upload(fs, fileName);
}
I can see the file saved on the server. The file is about 42 MB in size. I want to read the file in chunks, i.e. read one chunk at a time, deserialize the binary data, and flush it to the browser.
How can I read the data in chunks from MongoDB through the C# driver?
As per my understanding, the following shell command reads only the 0th chunk of the big file:
db.fs.chunks.find({"files_id" : ObjectId("53f74e2f3f69bd30142f2193"),"n":0})
but I don't know how to write the same in C#. Please help.
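For reference, a literal translation of that shell query using the legacy C# driver's Query builders might look roughly like this sketch:

using MongoDB.Bson;
using MongoDB.Driver;
using MongoDB.Driver.Builders;

// Query fs.chunks directly for chunk n of the given file.
var chunks = database.GetCollection("fs.chunks");
var query = Query.And(
    Query.EQ("files_id", new ObjectId("53f74e2f3f69bd30142f2193")),
    Query.EQ("n", 0));
BsonDocument chunk = chunks.FindOne(query);
byte[] data = chunk["data"].AsBsonBinaryData.Bytes;  // the raw chunk payload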
You can open the file with GridFS.Open(string remoteFileName, FileMode mode), which returns a MongoGridFSStream that you can use like any other System.IO.Stream. It doesn't download chunks you don't use, and it buffers chunks for you, so there is no need to concern yourself with the implementation. You can just read a part of the stream, flush it to the browser, read another part, flush, and so on.
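A minimal sketch of that, assuming the legacy 1.x driver's GridFS API (the output target is a placeholder):

using System.IO;
using MongoDB.Driver.GridFS;

using (MongoGridFSStream stream = database.GridFS.Open(fileName, FileMode.Open))
{
    var buffer = new byte[255 * 1024];  // roughly one GridFS chunk at a time
    int read;
    while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
    {
        // deserialize / flush buffer[0..read] to the browser here,
        // e.g. response.OutputStream.Write(buffer, 0, read);  (placeholder)
    }
}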
Documentation for the C# driver is lacking; the source code is the best documentation.

Generate and stream Zip archive without storing it in the filesystem or reading into memory first

How can I asynchronously take multiple existing streams (from the db), add them to a zip archive stream and return it in asp.net web api 2?
The key difference from the other "duplicate" question is how to do this in a streaming fashion, without writing to a temp file or buffering it completely in memory first.
It looks like you can't do this directly; see:
Writing to ZipArchive using the HttpContext OutputStream
The HTTP response stream needs to support seeking for a zip to be written directly to it, which it doesn't. You will need to write to a temporary file by the looks of it.

Buffering stream of byte array

I am using DropNet library to download files from Dropbox.
public Stream GetFileStream(string path)
{
    // GetFile returns the entire file as a byte[], so the whole
    // download is buffered in memory before this stream is returned.
    return new MemoryStream(dropboxClient.GetFile(path));
}
I am facing a problem downloading large files, because the DropNet library returns a byte array, which I then convert to a MemoryStream for other logic. This is not good, because I have to download the whole file into server memory before my logic can run.
I am trying to find a way to buffer the file as a stream instead.
I looked at the BufferedStream class, but creating a new BufferedStream requires a stream to begin with. I can't figure out the best solution for my problem.
The DropNet API does not expose Stream functionality for retrieving files; you must wait for the entire file to be downloaded before you can use it. If you want to be able to read the stream as it comes in, you will need to use a different library, modify an existing one, or write your own.
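As a general illustration of that kind of streaming (this is plain HttpClient, not DropNet's API; the URL and any auth handling are placeholders), a response body can be consumed as it downloads rather than buffered:

using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public async Task<Stream> GetFileStreamAsync(HttpClient http, string url)
{
    // ResponseHeadersRead returns as soon as headers arrive, so the
    // body can be read as it downloads instead of being buffered.
    HttpResponseMessage response =
        await http.GetAsync(url, HttpCompletionOption.ResponseHeadersRead);
    response.EnsureSuccessStatusCode();
    return await response.Content.ReadAsStreamAsync();
}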
