S3 file uploaded is getting corrupted - c#

I have some code that uploads a file stream to an S3 bucket. One of my customers is having an issue that I'm having trouble reproducing on my end: after the upload, their file size remains at 0 bytes. This does not occur every time; it seems very sporadic.
using (var webclient = new WebClient())
{
    using (MemoryStream stream = new MemoryStream(webclient.DownloadData(uri)))
    {
        using (var client = new AmazonS3Client())
        {
            PutObjectRequest request = new PutObjectRequest
            {
                BucketName = bucketName,
                Key = GetObjectKey(fileName, companyAccountId),
                InputStream = stream
            };
            if (height > 0)
                request.Metadata.Add("x-amz-meta-height", height.ToString());
            if (width > 0)
                request.Metadata.Add("x-amz-meta-width", width.ToString());
            var response = await client.PutObjectAsync(request);
        }
    }
}
Any help or suggestions to determine why an uploaded file remains at 0 bytes would be greatly appreciated.

Is there anything else that could be consuming the input stream and leaving its position at the end? I have had a similar problem before and solved it by ensuring the stream is at the start with either
stream.Seek(0, SeekOrigin.Begin);
or
stream.Position = 0;
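In the code above the MemoryStream is created directly from DownloadData, so its position should already be 0, but if anything touches the stream first (for example to read the image dimensions used for the metadata), it has to be rewound before the upload. A minimal sketch against the same variables:
// Rewind defensively right before handing the stream to the SDK,
// in case something read from it earlier (e.g. to measure the image).
stream.Seek(0, SeekOrigin.Begin);

var request = new PutObjectRequest
{
    BucketName = bucketName,
    Key = GetObjectKey(fileName, companyAccountId),
    InputStream = stream
};
var response = await client.PutObjectAsync(request);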

Related

C# HttpClient Abruptly stops download

I have this C# script that is supposed to download a zip archive from GitHub, unpack it, and put it in a specific folder:
using (var client = new HttpClient())
{
    var filePath = Path.GetFullPath("runtime");
    var url = @"https://github.com/BlackBirdTV/tank/releases/latest/download/runtime.zip?raw=true";
    ConsoleUtilities.UpdateProgress("Downloading Runtime...", 0);
    var request = await client.GetStreamAsync(url);
    var buffer = new byte[(int)bufferSize];
    var totalBytesRead = 0;
    int bytes = 0;
    while ((bytes = await request.ReadAsync(buffer, 0, buffer.Length)) != 0)
    {
        totalBytesRead += bytes;
        ConsoleUtilities.UpdateProgress($"Downloading Runtime... ({totalBytesRead} of {bufferSize} bytes read) ", (int)(totalBytesRead / bufferSize * 100));
    }
}
Decompress(buffer, filePath);
When I run this, the download starts and seems to finish, yet it stops at some sporadic point. My console shows the bytes being downloaded, but they are zeroed out. It seems like either my computer receives zeros (which I doubt) or the bytes don't get written to the buffer.
Weirdly enough, downloading the file over the browser works just fine.
Any help is greatly appreciated
As I state in the comments, your problem is that each iteration of your while loop is overwriting your buffer, and you are not accumulating the data anywhere. So your last iteration doesn't completely fill the buffer and all you're left with is whatever data you got in the last iteration.
You could fix that bug by accumulating the data somewhere as you go, but a far better solution is to not fuss with buffers at all and just use the built-in CopyToAsync method of Stream:
using var client = new HttpClient();
using var stream = await client.GetStreamAsync("https://github.com/BlackBirdTV/tank/releases/latest/download/runtime.zip?raw=true");
using var file = new FileStream(@"c:\temp\runtime.zip", FileMode.Create);
await stream.CopyToAsync(file);
Here I'm saving it to a local file at c:\temp\runtime.zip, but obviously change that to suit your needs. I suppose you're avoiding this method so you can track progress, which is fair enough. So if that's really important to you, read on for a fix to your original solution.
For completeness, here's your original code fixed up to work by writing the buffer to a FileStream:
var bufferSize = 1024 * 10;
var url = @"https://github.com/BlackBirdTV/tank/releases/latest/download/runtime.zip?raw=true";
var filePath = Path.GetFullPath("runtime");

using var client = new HttpClient();
using var stream = await client.GetStreamAsync(url);
using var file = new FileStream(@"c:\temp\runtime.zip", FileMode.Create);

var buffer = new byte[bufferSize];
var totalBytesRead = 0;
int bytes = 0;
while ((bytes = await stream.ReadAsync(buffer, 0, buffer.Length)) != 0)
{
    await file.WriteAsync(buffer, 0, bytes);
    totalBytesRead += bytes;
    // Note: bufferSize is not the download's total size, so this percentage is only approximate.
    ConsoleUtilities.UpdateProgress($"Downloading Runtime... ({totalBytesRead} of {bufferSize} bytes read) ", (int)(totalBytesRead / bufferSize * 100));
}
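If the percentage shown by ConsoleUtilities should be meaningful, the total has to come from the response's Content-Length header rather than bufferSize. A possible variant, offered as a sketch rather than part of the original answer (ConsoleUtilities is the question's own helper):
using var client = new HttpClient();
using var response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead);
response.EnsureSuccessStatusCode();
var totalBytes = response.Content.Headers.ContentLength ?? -1L; // -1 if the server doesn't report it

using var stream = await response.Content.ReadAsStreamAsync();
using var file = new FileStream(@"c:\temp\runtime.zip", FileMode.Create);

var buffer = new byte[bufferSize];
long totalBytesRead = 0;
int bytes;
while ((bytes = await stream.ReadAsync(buffer, 0, buffer.Length)) != 0)
{
    await file.WriteAsync(buffer, 0, bytes);
    totalBytesRead += bytes;
    if (totalBytes > 0)
        ConsoleUtilities.UpdateProgress($"Downloading Runtime... ({totalBytesRead} of {totalBytes} bytes read)", (int)(totalBytesRead * 100 / totalBytes));
}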

Zipping a number of potentially large files in chunks to avoid large memory consumption

I am working on an application that can take a list of file keys to files on AWS S3 as input and then create a zip file back on AWS S3 with all of those files inside. The compression part does not matter - the important part is to have a single zip file containing all of the other files.
To be able to run the application on a server without a lot of memory or file storage space, I was thinking of using the API that allows fetching a byte range from a file on S3 (https://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectGET.html) to download the files in chunks, then adding them to the zip file and uploading each chunk using the multipart upload API (https://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.html).
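For reference, a byte-range download with the .NET SDK looks roughly like this; a sketch only, where bucketName, key and offset are hypothetical placeholders rather than anything from the question:
// Sketch: fetch one 5 MB chunk of an S3 object using a byte-range GET.
var s3 = new AmazonS3Client();
var rangedRequest = new GetObjectRequest
{
    BucketName = bucketName,
    Key = key,
    ByteRange = new ByteRange(offset, offset + 5 * 1024 * 1024 - 1) // inclusive range
};
using (var rangedResponse = await s3.GetObjectAsync(rangedRequest))
using (var chunkStream = rangedResponse.ResponseStream)
{
    // consume the chunk, e.g. copy it into the zip entry being built
}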
I have tried to make a small sample app, that will simulate how it could work (without actually calling the S3 APIs yet), but it gets stuck on this line: "await zipStream.WriteAsync(inBuffer, 0, currentChunk);"
public static async Task Main(string[] args)
{
    const int ChunkSize = 5 * 1024 * 1024;
    using (var fileOutputStream = new FileStream("/Users/SPE/Downloads/BG_K01.zip", FileMode.Create))
    {
        using (var fileInputStream = File.Open("/Users/SPE/Downloads/BG_K01.rvt", FileMode.Open))
        {
            long fileSize = new FileInfo("/Users/SPE/Downloads/BG_K01.rvt").Length;
            int readBytes = 0;
            using (AnonymousPipeServerStream pipeServer = new AnonymousPipeServerStream())
            {
                using (AnonymousPipeClientStream pipeClient = new AnonymousPipeClientStream(pipeServer.GetClientHandleAsString()))
                {
                    using (var zipArchive = new ZipArchive(pipeServer, ZipArchiveMode.Create, true))
                    {
                        var zipEntry = zipArchive.CreateEntry("BG_K01.rvt", CompressionLevel.NoCompression);
                        using (var zipStream = zipEntry.Open())
                        {
                            // Simulate receiving and sending a chunk of bytes
                            while (readBytes < fileSize)
                            {
                                var currentChunk = (int)Math.Min(ChunkSize, fileSize - readBytes);
                                var inBuffer = new byte[currentChunk];
                                var outBuffer = new byte[currentChunk];
                                await fileInputStream.ReadAsync(inBuffer, 0, currentChunk);
                                await zipStream.WriteAsync(inBuffer, 0, currentChunk);
                                await pipeClient.ReadAsync(outBuffer, 0, currentChunk);
                                await fileOutputStream.WriteAsync(outBuffer, 0, currentChunk);
                                readBytes += currentChunk;
                            }
                        }
                    }
                }
            }
        }
    }
}
I am also not sure whether using pipe streams is the best way to do this, but my hope is that they release any memory consumed once the stream has been read, keeping memory consumption very low.
Does anybody know why writing to the zipStream hangs?
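One plausible explanation, offered as a hypothesis rather than a confirmed answer: an anonymous pipe has a small fixed-size buffer, so once the ZipArchive has pushed enough bytes into pipeServer to fill it, the write blocks until something reads from pipeClient, and in this code the read only happens later on the same thread, so it deadlocks. A sketch of one way around it, draining the pipe on a separate task while the main loop writes (it reuses the question's own variables):
// Hypothetical sketch: drain the pipe concurrently so writes into the ZipArchive
// (which end up in pipeServer) can never block the main loop.
var drainTask = Task.Run(async () =>
{
    var outBuffer = new byte[81920];
    int read;
    // Read returns 0 once the server side of the pipe has been disposed.
    while ((read = await pipeClient.ReadAsync(outBuffer, 0, outBuffer.Length)) > 0)
    {
        await fileOutputStream.WriteAsync(outBuffer, 0, read);
    }
});

// ... write all chunks to zipStream here, then dispose zipStream, zipArchive
// and pipeServer so the drain task sees end-of-stream ...

await drainTask;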

Amazon S3 File Full of Zeros

I'm downloading a PDF file from an AWS S3 bucket using the official client in C#. It appears to download the whole file, but everything is 0s after 8192 (0x2000) bytes.
See below (original file on left, S3 download on right):
Any ideas as to why this is happening would be greatly appreciated.
Here's the code:
var client = new AmazonS3Client(
    new AmazonS3Config
    {
        RegionEndpoint = RegionEndpoint.EUWest1
    });

var transferUtility = new TransferUtility(client);

var request = new TransferUtilityOpenStreamRequest
{
    BucketName = bucketName,
    Key = key
};

using (var stream = transferUtility.OpenStream(request))
{
    var bytes = new byte[stream.Length];
    stream.Read(bytes, 0, (int)stream.Length);
    stream.Close();
    return bytes;
}
Thanks in advance,
Steve.
For anyone else hitting this issue, it was a case of having to repeatedly call Read on the stream until all bytes have been received:
using (var stream = transferUtility.OpenStream(request))
{
    var position = 0;
    var length = stream.Length;
    var bytes = new byte[length];
    do
    {
        position += stream.Read(bytes, position, (int)(stream.Length - position));
    } while (position < length);
    stream.Close();
    return bytes;
}
Thanks to John for pointing that out.
Edit:
Or check out this extension method kindly pointed out by JohnLBevan: https://stackoverflow.com/a/24412022/361842
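Alternatively, if the byte array is only an intermediate step, Stream.CopyTo performs the read loop internally and avoids the partial-read pitfall entirely; a minimal sketch using the same transferUtility and request as above:
using (var stream = transferUtility.OpenStream(request))
using (var ms = new MemoryStream())
{
    // CopyTo keeps calling Read until the source stream is exhausted
    stream.CopyTo(ms);
    return ms.ToArray();
}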

DotNetZip download zip file from a webservice

I'm trying, in C#, to download a zip file from a web service and extract an entry from it in memory, but when I try to read the stream the way the DotNetZip documentation shows, I get the exception "This stream does not support seek operations" at the ZipFile.Read(stream) call.
Could somebody tell me what I'm doing wrong? Thanks in advance
var urlAuthentication = "https://someurl/?login=foo&token=faa";

var request = (HttpWebRequest)WebRequest.Create(urlAuthentication);
request.Proxy = WebRequest.DefaultWebProxy;
request.Credentials = System.Net.CredentialCache.DefaultCredentials;
request.Proxy.Credentials = System.Net.CredentialCache.DefaultCredentials;

using (var ms = new MemoryStream())
{
    using (var response = (HttpWebResponse)request.GetResponse())
    {
        using (var stream = response.GetResponseStream())
        {
            using (ZipFile zipout = ZipFile.Read(stream))
            {
                ZipEntry entry = zipout["file1.xml"];
                entry.Extract(ms);
            }
        }
    }
}
Apparently DotNetZip requires a stream that supports seek operations, and the response stream of an HttpWebResponse does not support seeking.
You can solve this issue by first downloading the entire file in memory, and then accessing it:
using (var ms = new MemoryStream())
{
    using (MemoryStream seekable = new MemoryStream())
    {
        using (var stream = response.GetResponseStream())
        {
            int bytes;
            byte[] buffer = new byte[1024];
            while ((bytes = stream.Read(buffer, 0, buffer.Length)) > 0)
            {
                seekable.Write(buffer, 0, bytes);
            }
        }
        seekable.Position = 0;

        using (ZipFile zipout = ZipFile.Read(seekable))
        {
            ZipEntry entry = zipout["file1.xml"];
            entry.Extract(ms);
        }
    }
    // access ms
}

Amazon S3 Save Response Stream

I am trying to load a .gz file out of a bucket.
Connection and authentication work fine, and I do get a file, but the problem is that the file is a lot bigger than it should be: the original is 155 MB in the bucket, but by the time it reaches my hard disk it grows to about 288 MB.
Here is the function code:
public bool SaveBucketToFile(string Filename)
{
    // Response check into file
    using (StreamReader StRead = new StreamReader(_ObjResponse.ResponseStream))
    {
        string TempFile = Path.GetTempFileName();
        StreamWriter StWrite = new StreamWriter(TempFile, false);
        StWrite.Write(StRead.ReadToEnd());
        StWrite.Close();
        StRead.Close();

        // Move to real destination
        if (File.Exists(Filename))
        {
            File.Delete(Filename);
        }
        File.Move(TempFile, Filename);
    }
    return true;
}
The download and the filling of _ObjResponse are done with the AmazonS3Client from their SDK. I am using a proxy, but the same code on a different machine without a proxy gives the same result.
Any hints on what to do here? The object request is simple:
_ObjRequest = new GetObjectRequest
{
    BucketName = BucketName,
    Key = Key
};
Glad for any help...
For everyone who stumbles upon this:
I needed to first save the stream via a BufferedStream into a MemoryStream.
The code looks like this:
MemoryStream MemStream = new MemoryStream();
BufferedStream Stream2 = new BufferedStream(_ObjResponse.ResponseStream);
byte[] Buffer = new byte[0x2000];
int Count;
while ((Count = Stream2.Read(Buffer, 0, Buffer.Length)) > 0)
{
    MemStream.Write(Buffer, 0, Count);
}

// Get a temp file path
string TempFile = Path.GetTempFileName();
// Open a stream to the temp file
FileStream Newfile = new FileStream(TempFile, FileMode.Create);
// Rewind the memory stream to position 0
MemStream.Position = 0;
// Save into the temp file
MemStream.CopyTo(Newfile);
Newfile.Close();

// Check the final destination and move the temp file there
if (File.Exists(Filename))
{
    File.Delete(Filename);
}
File.Move(TempFile, Filename);
I found this somewhere here: http://www.codeproject.com/Articles/186132/Beginning-with-Amazon-S under the caption "Get a file from Amazon S3".
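A likely contributor to the size difference, though it is not confirmed above, is that StreamReader/StreamWriter treat the binary .gz content as text and re-encode it, which both inflates and corrupts it. If the goal is simply to get the object onto disk, the SDK can also write the response stream straight to a file; a sketch, assuming _ObjResponse is the GetObjectResponse from the request shown above:
// Write the binary response directly to a temp file, then move it into place.
string TempFile = Path.GetTempFileName();
_ObjResponse.WriteResponseStreamToFile(TempFile);

if (File.Exists(Filename))
{
    File.Delete(Filename);
}
File.Move(TempFile, Filename);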
