High Memory Usage When Downloading Azure Blob through WebAPI

I am using the DownloadToStreamEncrypted extension from https://github.com/stefangordon/azure-encryption-extensions and it works fine for small files. However, I've noticed that it loads the entire blob into memory.
I am using ASP.Net WebAPI. Here is my code:
var stream = new MemoryStream();
await blob.DownloadToStreamEncryptedAsync(provider, stream);
stream.Seek(0, SeekOrigin.Begin);
response.StatusCode = HttpStatusCode.OK;
response.Content = new StreamContent(stream);
response.Content.Headers.ContentLength = docLength;
response.Content.Headers.ContentType = new MediaTypeHeaderValue(doc.MimeType);
response.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment")
{
FileName = "somefilename",
Size = docLength
};
return response;
This works, but the memory used goes up by however big the file is. It doesn't actually stream the file back. Is there something I am doing wrong with the stream that is causing this high memory use? Ideally, it just streams back from the server without first getting loaded into memory in its entirety. I suspect the problem is the CopyToAsync method within the DownloadToStreamEncrypted but am not sure.
The code for DownloadToStreamEncrypted is
using (Stream blobStream = await blob.OpenReadAsync(accessCondition, options, operationContext))
using (Stream decryptedStream = provider.DecryptedStream(blobStream))
{
await decryptedStream.CopyToAsync(stream);
}
Is there something I can do to lower the memory usage?
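One way to avoid the intermediate MemoryStream is to let the response content pull from the decrypted stream only when the framework serializes the body. The sketch below is not part of the encryption library or of Web API: LazyStreamContent and the openSource delegate are hypothetical names introduced for illustration, built only on the standard HttpContent base class. It assumes the decrypted stream can be read forward-only and that disposing it also releases the underlying blob stream.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper: opens its source lazily and copies it to the
// response stream in fixed-size chunks instead of buffering it first.
public sealed class LazyStreamContent : HttpContent
{
    private readonly Func<Task<Stream>> _openSource;
    private readonly long? _length;

    public LazyStreamContent(Func<Task<Stream>> openSource, long? length = null)
    {
        _openSource = openSource ?? throw new ArgumentNullException(nameof(openSource));
        _length = length;
    }

    // Called by the framework when it is ready to write the body.
    protected override async Task SerializeToStreamAsync(Stream target, TransportContext context)
    {
        using (Stream source = await _openSource().ConfigureAwait(false))
        {
            // Only one 80 KB buffer is in flight at any time.
            await source.CopyToAsync(target, 81920).ConfigureAwait(false);
        }
    }

    // Reports the length when the caller knows it (e.g. from metadata).
    protected override bool TryComputeLength(out long length)
    {
        length = _length ?? 0;
        return _length.HasValue;
    }
}
```

In the controller this might be wired up as response.Content = new LazyStreamContent(async () => provider.DecryptedStream(await blob.OpenReadAsync(accessCondition, options, operationContext)), docLength); which I have not tested against the real blob and provider types. Whether the host additionally buffers the response is a separate setting; supplying the known length, as the question already does, is what I would try first.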

Related

Unable to stream file upload through controller method

I'm trying to effectively proxy a file upload via an ASP.NET Core 5 MVC controller to another API:
[DisableFormValueModelBinding]
[HttpPost]
public async Task<IActionResult> Upload()
{
var reader = new MultipartReader(Request.GetMultipartBoundary(), Request.Body);
MultipartSection section;
while ((section = await reader.ReadNextSectionAsync().ConfigureAwait(false)) != null)
{
if (section.ContentType == "application/json")
{
await SendFile(section.Body);
}
}
return View("Upload");
}
private async Task SendFile(Stream stream)
{
var request = new HttpRequestMessage(HttpMethod.Post, "http://blah/upload");
request.Content = new StreamContent(stream);
var response = await httpClient.SendAsync(request);
}
However, the receiving API always gets an empty stream.
I can confirm the SendFile method works as the following test works from within the controller method:
using (var fs = new FileStream("test.json", FileMode.Open))
{
await SendFile(fs);
}
And I can see the uploaded file if I try to read it in the controller:
var buf = new char[256];
using (var sr = new StreamReader(section.Body))
{
var x = await sr.ReadAsync(buf, 0, buf.Length);
while (x > 0)
{
log.Debug(new string(buf));
x = await sr.ReadAsync(buf, 0, buf.Length);
}
}
So both ends seem to work, just not together.
I have EnableBuffering set:
app.Use(next => context =>
{
context.Request.EnableBuffering();
return next(context);
});
And I'm disabling binding of the uploaded files to the model using the DisableFormValueModelBindingAttribute example from Upload files in ASP.NET Core.
I've also tried rewinding the stream manually using Seek, but it doesn't make a difference.
It works if I copy it through a MemoryStream:
using (var ms = new MemoryStream())
{
await section.Body.CopyToAsync(ms);
await ms.FlushAsync();
ms.Seek(0, SeekOrigin.Begin);
await SendFile(ms);
}
However, this buffers the file in memory which is not suitable for large files.
It also works if I read the uploaded file first, rewind and then try:
var buf = new char[256];
using (var sr = new StreamReader(section.Body))
{
var x = await sr.ReadAsync(buf, 0, buf.Length);
while (x > 0)
{
log.Debug(new string(buf));
x = await sr.ReadAsync(buf, 0, buf.Length);
}
}
section.Body.Seek(0, SeekOrigin.Begin);
// this works now:
await SendFile(section.Body);
Again, this is not suitable for large files.
It seems the stream is not in the correct state to be consumed by my SendFile method but I cannot see why.
UPDATE
Based on comments from Jeremy Lakeman I took a closer look at what was happening with the stream length.
I discovered that removing EnableBuffering makes it work as expected, so the issue is sort of resolved by that.
However, I came across this aspnetcore Github comment where a contributor states that:
We don't support flowing the Request Body through as a stream to HttpClient.
That and the other comments in that issue support Jeremy's comments about CanSeek and the stream length, and it's unclear (to me) whether this should actually work and whether it's just a coincidence that it now does (i.e. will I get hit with another gotcha later).
In this specific scenario with MIME multipart, where we don't know the stream length without buffering/counting the whole file, is there an alternative to StreamContent or a different way to handle the file upload?
The Microsoft docs page Upload files in ASP.NET Core only advises using an alternative approach. It talks about streaming uploads, but it stops short of actually consuming the stream and just buffers the file into a MemoryStream, which completely defeats the purpose of streaming.
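The CanSeek point from the comments can be reproduced without ASP.NET at all. In the sketch below, ForwardOnlyStream is a hypothetical wrapper written for this example; it stands in for a request body with buffering disabled. StreamContent records the position of a seekable stream when it is constructed and reports Length minus that position as the content length, so a seekable stream that is already at its end yields an empty body, while a non-seekable stream has no known length and is simply read until it ends. This is one plausible explanation for the empty upload with EnableBuffering, not a confirmed diagnosis.

```csharp
using System;
using System.IO;
using System.Net.Http;

// Hypothetical wrapper for illustration: hides seeking so the stream
// behaves like an unbuffered, forward-only request body.
public sealed class ForwardOnlyStream : Stream
{
    private readonly Stream _inner;
    public ForwardOnlyStream(Stream inner) { _inner = inner; }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override int Read(byte[] buffer, int offset, int count) => _inner.Read(buffer, offset, count);
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

public static class ContentLengthDemo
{
    // Returns the Content-Length that StreamContent would advertise
    // for the stream in its current state, or null if it is unknown.
    public static long? AdvertisedLength(Stream stream)
    {
        return new StreamContent(stream).Headers.ContentLength;
    }
}
```

For a 10-byte MemoryStream, AdvertisedLength returns 10 when the stream is at position 0, 0 after Seek(0, SeekOrigin.End), and null when the same stream is wrapped in ForwardOnlyStream. With an unknown length HttpClient falls back to reading until the stream ends, which would fit the observation that removing EnableBuffering made the upload work.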

Download ZipArchive from c# web api method returns "net::ERR_CONNECTION_RESET" in chrome

I want to call a web api method and have it allow the user to download a zip file that I create in memory. I also want to create the entries in memory as well.
I'm having trouble getting the server to correctly output the download.
Here is my web api method:
[HttpGet]
[Route("api/downloadstaffdata")]
public HttpResponseMessage DownloadStaffData()
{
var response = new HttpResponseMessage(HttpStatusCode.OK);
using (var stream = new MemoryStream())
{
using (var archive = new ZipArchive(stream, ZipArchiveMode.Create, true))
{
//future for loop to create entries in memory from staff list
var entry = archive.CreateEntry("bob.txt");
using (var writer = new StreamWriter(entry.Open()))
{
writer.WriteLine("Info for: Bob");
}
//future add staff images as well
}
stream.Seek(0, SeekOrigin.Begin);
response.Content = new StreamContent(stream);
}
response.Content.Headers.ContentDisposition = new System.Net.Http.Headers.ContentDispositionHeaderValue("attachment")
{
FileName = "staff_1234_1.zip"
};
response.Content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("application/zip");
return response;
}
Here is my calling js code:
window.open('api/downloadstaffdata');
Here is the response from Chrome:
net::ERR_CONNECTION_RESET
I don't know what I'm doing wrong. I've already searched SO and read the articles about creating the zip file, but I can't get past the connection reset error when trying to return the zip archive to the client.
Any ideas?

Your memory stream is inside a using block, so it is disposed before the framework has a chance to write it to the response (hence the ERR_CONNECTION_RESET).
A MemoryStream does not need to be disposed explicitly (some of its derived types may need to be, but not MemoryStream itself). The garbage collector can clean it up automatically.
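Applied to the question's method, the fix might look like the sketch below. The archive is still closed before the stream is rewound, so the zip's central directory gets written, but the MemoryStream itself is left open and handed to StreamContent, which disposes it together with the response. StaffZipExample and its method names are made up for this example, and the routing attributes are omitted so that it compiles without Web API.

```csharp
using System.IO;
using System.IO.Compression;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;

public static class StaffZipExample
{
    // Builds the archive in memory and returns the stream rewound to the
    // start. The stream is deliberately NOT wrapped in a using block.
    public static MemoryStream BuildZip()
    {
        var stream = new MemoryStream();
        using (var archive = new ZipArchive(stream, ZipArchiveMode.Create, leaveOpen: true))
        {
            ZipArchiveEntry entry = archive.CreateEntry("bob.txt");
            using (var writer = new StreamWriter(entry.Open()))
            {
                writer.WriteLine("Info for: Bob");
            }
        } // disposing the archive flushes the zip directory into the stream
        stream.Seek(0, SeekOrigin.Begin);
        return stream;
    }

    // Wraps the still-open stream in a response with download headers.
    public static HttpResponseMessage CreateResponse()
    {
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StreamContent(BuildZip())
        };
        response.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment")
        {
            FileName = "staff_1234_1.zip"
        };
        response.Content.Headers.ContentType = new MediaTypeHeaderValue("application/zip");
        return response;
    }
}
```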

Mvc - How to stream large file in 4k chunks for download

I was following this example, but when the download starts it hangs, and then after a minute it shows a server error. I guess the response ends before all the data is sent to the client.
Do you know another way that i can do this or why it's not working?
Writing to Output Stream from Action
private void StreamExport(Stream stream, System.Collections.Generic.IList<byte[]> data)
{
using (BufferedStream bs = new BufferedStream(stream, 256 * 1024))
using (StreamWriter sw = new StreamWriter(bs))
{
foreach (var stuff in data)
{
sw.Write(stuff);
sw.Flush();
}
}
}

Can you show the calling method? What is the Stream being passed in? Is it the Response Stream?
There are many helpful classes that chunk by default, so you don't have to chunk the data yourself. StreamContent has a constructor overload where you can specify the buffer size. I believe the default is 4 kB on .NET Framework, but check this for your version.
This is from memory, so it may not be complete:
[Route("download")]
[HttpGet]
public async Task<HttpResponseMessage> GetFile()
{
var response = this.Request.CreateResponse(HttpStatusCode.OK);
//don't use a using statement around the stream because the framework will dispose StreamContent automatically
var stream = await SomeMethodToGetFileStreamAsync();
//buffer size of 4kB
var content = new StreamContent(stream, 4096);
response.Content = content;
return response;
}
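A separate thing worth checking in the question's StreamExport: StreamWriter has no overload that takes a byte[], so sw.Write(stuff) binds to Write(object) and emits the text "System.Byte[]" for every item instead of the bytes. If the data really is binary, a sketch like the following writes each chunk straight to the stream. ChunkWriter is a hypothetical name, and the output stream is assumed to be the response stream or anything else writable.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ChunkWriter
{
    // Writes each byte[] chunk to the output as raw bytes and flushes after
    // every chunk so the client starts receiving data early.
    // Returns the total number of bytes written.
    public static long StreamExport(Stream output, IEnumerable<byte[]> chunks)
    {
        if (output == null) throw new ArgumentNullException(nameof(output));
        if (chunks == null) throw new ArgumentNullException(nameof(chunks));

        long total = 0;
        foreach (byte[] chunk in chunks)
        {
            output.Write(chunk, 0, chunk.Length);
            output.Flush();
            total += chunk.Length;
        }
        return total;
    }
}
```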

Can I get file.OpenStreamForReadAsync() from file.OpenStreamForWriteAsync()?

I am still learning the tricks of read/write file streams and hope someone could help if what I am looking for is feasible.
The code below makes WebApi calls (note GetAsync() on line 2) to get an image file Id and save downloaded file to database with computed Md5Hash. The code works fine, but in the interest of efficiency I was wondering if it's possible to get file.OpenStreamForReadAsync() from file.OpenStreamForWriteAsync() (not sure even if this is possible, but I can see some extension methods that operate on a stream but no luck with attempts I've made so far). If this is possible, I can avoid saving a file and opening it again, by instead making the GetMD5Hash() method call within the using (var fileStream = await file.OpenStreamForWriteAsync()){ ... } block.
Can I have the equivalent of Utils.GetMD5Hash(stream);, shown below outside the using block, inside the block, the intention being to avoid opening the file outside of the using block?
var client = new HttpClient();
var response = await client.GetAsync(new Uri($"{url}{imageId}")); // call to WebApi; url and imageId defined earlier
if (response.IsSuccessStatusCode)
{
using (var contentStream = await response.Content.ReadAsInputStreamAsync())
{
var stream = contentStream.AsStreamForRead();
var file = await imagesFolder.CreateFileAsync(imageFileName, CreationCollisionOption.ReplaceExisting);
using (var fileStream = await file.OpenStreamForWriteAsync())
{
await stream.CopyToAsync(fileStream, 4096);
// >>>> At this point, from the Write fileStream, can I get the equivalent of file.OpenStreamForReadAsync() ??
}
var readStream = await file.OpenStreamForReadAsync(); // separate name: 'stream' is already declared in this scope
string md5Hash = await Utils.GetMD5Hash(readStream);
await AddImageToDataBase(file, md5Hash);
}
}

A MemoryStream is read/write, so if all you wanted to do was to compute the hash, something like this should do the trick:
var stream = contentStream.AsStreamForRead();
using (var ms = new MemoryStream())
{
stream.CopyTo(ms);
ms.Seek(0, SeekOrigin.Begin);
string md5Hash = await Utils.GetMD5Hash(ms);
}
But since you want to save the file anyway (it's passed to AddImageToDataBase, after all), consider
Save to the MemoryStream
Reset the memory stream
Copy the memory stream to the file stream
Reset the memory stream
Compute the hash
I'd suggest you do performance measurements, though. The OS does cache file I/O, so it's unlikely that you'd actually have to do a physical disk read to compute the hash. The performance gains might not be what you imagine.

Here is a complete answer using MemoryStream, as Petter Hesselberg suggested in his answer. Note a couple of tricky points I ran into: (1) make sure fileStream is disposed by wrapping it in a using block, and (2) make sure the MemoryStream is rewound with ms.Seek(0, SeekOrigin.Begin); before each use, both before copying it to the file and before handing it over to compute the MD5 hash.
var client = new HttpClient();
var response = await client.GetAsync(new Uri($"{url}{imageId}")); // call to WebApi; url and imageId defined earlier
if (response.IsSuccessStatusCode)
{
using (var contentStream = await response.Content.ReadAsInputStreamAsync())
{
var stream = contentStream.AsStreamForRead();
var file = await imagesFolder.CreateFileAsync(imageFileName, CreationCollisionOption.ReplaceExisting);
using (MemoryStream ms = new MemoryStream())
{
await stream.CopyToAsync(ms);
using (var fileStream = await file.OpenStreamForWriteAsync())
{
ms.Seek(0, SeekOrigin.Begin);
await ms.CopyToAsync(fileStream);
ms.Seek(0, SeekOrigin.Begin); // rewind for next use below
}
string md5Hash = await Utils.GetMD5Hash(ms);
await AddImageToDataBase(file, md5Hash);
}
}
}
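If the buffering itself is the concern, the hash can also be computed while the download is being written, so the data is neither held in a MemoryStream nor read back from disk. This is a sketch under assumptions: it uses IncrementalHash from System.Security.Cryptography instead of the Utils.GetMD5Hash helper (whose implementation isn't shown), it needs a platform where IncrementalHash is available, the lowercase hex output format is a guess at what the helper returns, and HashingCopy.CopyAndHashAsync is a name invented here.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;
using System.Threading.Tasks;

public static class HashingCopy
{
    // Copies source to destination in 4 KB chunks and feeds every chunk
    // into an MD5 hash on the way. Returns the hash as lowercase hex.
    public static async Task<string> CopyAndHashAsync(Stream source, Stream destination)
    {
        using (IncrementalHash md5 = IncrementalHash.CreateHash(HashAlgorithmName.MD5))
        {
            var buffer = new byte[4096];
            int read;
            while ((read = await source.ReadAsync(buffer, 0, buffer.Length).ConfigureAwait(false)) > 0)
            {
                await destination.WriteAsync(buffer, 0, read).ConfigureAwait(false);
                md5.AppendData(buffer, 0, read);
            }
            byte[] hash = md5.GetHashAndReset();
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```

Inside the using (var fileStream = await file.OpenStreamForWriteAsync()) block this would be called as string md5Hash = await HashingCopy.CopyAndHashAsync(stream, fileStream); in place of the CopyToAsync call.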

How to upload a large file (1 GB +) to Google Drive using GoogleDrive REST API

I am trying to upload large files (1 GB+) to Google Drive using the Google Drive API. My code works fine with smaller files, but with larger files an error occurs.
The error occurs in the part of the code where the file is converted into a byte[].
byte[] data = System.IO.File.ReadAllBytes(filepath);
An out-of-memory exception is thrown here.

You probably followed the suggestions on developers.google and are doing this:
byte[] byteArray = System.IO.File.ReadAllBytes(filename);
MemoryStream stream = new MemoryStream(byteArray);
try {
FilesResource.InsertMediaUpload request = service.Files.Insert(body, stream, mimeType);
request.Upload();
I have no idea why they suggest putting the whole file in a byte array and then creating a MemoryStream on it.
I think that a better way is this:
using(var stream = new System.IO.FileStream(filename,
System.IO.FileMode.Open,
System.IO.FileAccess.Read))
{
try
{
FilesResource.InsertMediaUpload request = service.Files.Insert(body, stream, mimeType);
request.Upload();
.
.
.
}
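To see why the FileStream version doesn't run out of memory: a consumer that reads from a stream only ever needs one buffer's worth of the file at a time. The sketch below has nothing Google-specific in it. ChunkedFileCopy is a hypothetical helper that mimics what an uploader reading from the stream does, and the 256 KB chunk size is an arbitrary choice for the example.

```csharp
using System;
using System.IO;

public static class ChunkedFileCopy
{
    // Reads the file through a FileStream one chunk at a time and forwards
    // each chunk to the destination. Memory use is bounded by chunkSize,
    // regardless of how large the file is. Returns the bytes copied.
    public static long Copy(string path, Stream destination, int chunkSize = 256 * 1024)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        long total = 0;
        var buffer = new byte[chunkSize];
        using (var source = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            int read;
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                destination.Write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

By contrast, File.ReadAllBytes has to allocate a single array as large as the whole file before anything can be sent.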
