Unable to stream file upload through controller method - C#

I'm trying to effectively proxy a file upload via an ASP.NET Core 5 MVC controller to another API:
[DisableFormValueModelBinding]
[HttpPost]
public async Task<IActionResult> Upload()
{
var reader = new MultipartReader(Request.GetMultipartBoundary(), Request.Body);
MultipartSection section;
while ((section = await reader.ReadNextSectionAsync().ConfigureAwait(false)) != null)
{
if (section.ContentType == "application/json")
{
await SendFile(section.Body);
}
}
return View("Upload");
}
private async Task SendFile(Stream stream)
{
var request = new HttpRequestMessage(HttpMethod.Post, "http://blah/upload");
request.Content = new StreamContent(stream);
var response = await httpClient.SendAsync(request);
}
However, the receiving API always gets an empty stream.
I can confirm that the SendFile method itself works, because the following test succeeds from within the controller method:
using (var fs = new FileStream("test.json", FileMode.Open))
{
await SendFile(fs);
}
And I can see the uploaded file if I try to read it in the controller:
var buf = new char[256];
using (var sr = new StreamReader(section.Body))
{
var x = await sr.ReadAsync(buf, 0, buf.Length);
while (x > 0)
{
log.Debug(new string(buf, 0, x));
x = await sr.ReadAsync(buf, 0, buf.Length);
}
}
So both ends seem to work, just not together.
I have EnableBuffering set:
app.Use(next => context =>
{
context.Request.EnableBuffering();
return next(context);
});
And I'm disabling binding of the uploaded files to the model using the DisableFormValueModelBindingAttribute example from the Upload files in ASP.NET Core docs.
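For reference, that attribute (reproduced roughly from memory of the docs sample, so treat it as a sketch) removes the form value provider factories so model binding doesn't consume the request body before the controller can read it:
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Method)]
public class DisableFormValueModelBindingAttribute : Attribute, IResourceFilter
{
    public void OnResourceExecuting(ResourceExecutingContext context)
    {
        // Stop MVC from reading (and buffering) the form body during model binding.
        var factories = context.ValueProviderFactories;
        factories.RemoveType<FormValueProviderFactory>();
        factories.RemoveType<FormFileValueProviderFactory>();
        factories.RemoveType<JQueryFormValueProviderFactory>();
    }

    public void OnResourceExecuted(ResourceExecutedContext context)
    {
    }
}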
I've also tried rewinding the stream manually using Seek, but it doesn't make a difference.
It works if I copy it through a MemoryStream:
using (var ms = new MemoryStream())
{
await section.Body.CopyToAsync(ms);
await ms.FlushAsync();
ms.Seek(0, SeekOrigin.Begin);
await SendFile(ms);
}
However, this buffers the file in memory which is not suitable for large files.
It also works if I read the uploaded file first, rewind and then try:
var buf = new char[256];
using (var sr = new StreamReader(section.Body))
{
var x = await sr.ReadAsync(buf, 0, buf.Length);
while (x > 0)
{
log.Debug(new string(buf, 0, x));
x = await sr.ReadAsync(buf, 0, buf.Length);
}
}
section.Body.Seek(0, SeekOrigin.Begin);
// this works now:
await SendFile(section.Body);
Again, this is not suitable for large files.
It seems the stream is not in the correct state to be consumed by my SendFile method but I cannot see why.
UPDATE
Based on comments from Jeremy Lakeman I took a closer look at what was happening with the stream length.
I discovered that removing EnableBuffering makes it work as expected, so the issue is sort of resolved by that.
However, I came across this aspnetcore Github comment where a contributor states that:
We don't support flowing the Request Body through as a stream to HttpClient.
That and the other comments in that issue support Jeremy's comments about CanSeek and the stream length. It's unclear (to me) whether this should actually work at all, or whether it's just a coincidence that it now does (i.e. will I get hit with another gotcha later?).
In this specific scenario with MIME multipart, where we don't know the stream length without buffering/counting the whole file, is there an alternative to StreamContent or a different way to handle the file upload?
The Microsoft docs page Upload files in ASP.NET Core only advises using an alternative approach. It talks about streaming uploads; however, it stops short of properly consuming the stream and just buffers the file into a MemoryStream (completely defeating the purpose of streaming).
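One possible workaround (a sketch only, untested here, and it reuses the SendFile name from the code above) is to force chunked transfer encoding on the outgoing request, so StreamContent never tries to compute a Content-Length from the multipart section stream:
private async Task SendFile(Stream stream)
{
    var request = new HttpRequestMessage(HttpMethod.Post, "http://blah/upload")
    {
        Content = new StreamContent(stream)
    };
    // With chunked transfer encoding the body length never needs to be known up
    // front, which sidesteps the unreliable Length/CanSeek behaviour of the
    // buffered request stream.
    request.Headers.TransferEncodingChunked = true;
    var response = await httpClient.SendAsync(request);
    response.EnsureSuccessStatusCode();
}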

Related

Extract the file header signature as it is being streamed directly to disk in ASP.NET Core

I have an API method that streams uploaded files directly to disk to be scanned with a virus checker. Some of these files can be quite large, so IFormFile is a no go:
Any single buffered file exceeding 64 KB is moved from memory to a
temp file on disk.
Source: https://learn.microsoft.com/en-us/aspnet/core/mvc/models/file-uploads?view=aspnetcore-3.1
I have a working example that uses multipart/form-data and a really nice NuGet package that takes the headache out of working with multipart/form-data, and it works well. However, I want to add a file header signature check, to make sure that the file type declared by the client is actually what they say it is. I can't rely on the file extension to do this securely, but I can use the file header signature to make it at least a bit more secure. Since I am streaming directly to disk, how can I extract the first few bytes as the data is going through to the file stream?
[DisableFormValueModelBinding] // required for form binding
[ValidateMimeMultipartContent] // simple check to make sure this is a multipart form
[FileUploadOperation(typeof(SwaggerFileItem))] // used to define the Swagger schema
[RequestSizeLimit(31457280)] // 30MB
[RequestFormLimits(MultipartBodyLengthLimit = 31457280)]
public async Task<IActionResult> PostAsync([FromRoute] int customerId)
{
// place holders
var uploadLocation = string.Empty;
var trustedFileNameForDisplay = string.Empty;
// this is using a NuGet package that does the hard work of reading the multipart form-data... using UploadStream;
var model = await this.StreamFiles<FileItem>(async x =>
{
// never trust the client
trustedFileNameForDisplay = WebUtility.HtmlEncode(Path.GetFileName(x.FileName));
// determine the quarantine location
uploadLocation = GetUploadLocation(trustedFileNameForDisplay);
// stream the input stream to the file stream
// importantly this should never load the file into memory
// it should be a straight pass through to disk
await using var fs = System.IO.File.Create(uploadLocation, BufSize);
// --> How do I extract the file signature? I.e. a copy of the header bytes as it is being streamed??? <--
await x.OpenReadStream().CopyToAsync(fs);
});
// The model state can now be checked
if (!ModelState.IsValid)
{
// delete the file
DeleteFileIfExists(uploadLocation);
// return a bad request
ThrowProblemDetails(ModelState, StatusCodes.Status400BadRequest);
}
// map as much as we can
var request = _mapper.Map<CreateAttachmentRequest>(model);
// map the remaining properties
request.CustomerId = customerId;
request.UploadServer = Environment.MachineName;
request.UploadLocation = uploadLocation;
request.FileName = trustedFileNameForDisplay;
// call mediator with this request to send it over WCF to Pulse Core.
var result = await _mediator.Send(request);
// build response
var response = new FileResponse { Id = result.FileId, CustomerId = customerId, ExternalId = request.ExternalId };
// return the 201 with the appropriate response
return CreatedAtAction(nameof(GetFile), new { fileId = response.Id, customerId = response.CustomerId }, response);
}
The section I'm stuck on is around the line await x.OpenReadStream().CopyToAsync(fs);. I would like to pull out the file header here as the stream is being copied to the FileStream. Is there a way to add some kind of inspector? I don't want to read the entire stream again, just the header.
Update
Based on the answer given by @Ackdari, I have successfully switched the code to extract the header from the uploaded file stream. I don't know if this could be made any more efficient, but it does work:
//...... removed for clarity
var model = await this.StreamFiles<FileItem>(async x =>
{
trustedFileNameForDisplay = WebUtility.HtmlEncode(Path.GetFileName(x.FileName));
quarantineLocation = QuarantineLocation(trustedFileNameForDisplay);
await using (var fs = System.IO.File.Create(quarantineLocation, BufSize))
{
await x.OpenReadStream().CopyToAsync(fs);
fileFormat = await FileHelpers.GetFileFormatFromFileHeader(fs);
}
});
//...... removed for clarity
and
// using https://github.com/AJMitev/FileTypeChecker
public static async Task<IFileType> GetFileFormatFromFileHeader(FileStream fs)
{
IFileType fileFormat = null;
fs.Position = 0;
var headerData = new byte[40];
var bytesRead = await fs.ReadAsync(headerData, 0, 40);
if (bytesRead > 0)
{
await using (var ms = new MemoryStream(headerData))
{
if (!FileTypeValidator.IsTypeRecognizable(ms))
{
return null;
}
fileFormat = FileTypeValidator.GetFileType(ms);
}
}
return fileFormat;
}
You may want to consider reading the header yourself, depending on which file type is expected:
int n = 4; // length of the expected header
var headerData = new byte[n];
var input = x.OpenReadStream(); // the upload stream, as in the question's code
var bytesRead = 0;
while (bytesRead < n)
    bytesRead += await input.ReadAsync(headerData.AsMemory(bytesRead)); // assumes the upload is at least n bytes long
CheckHeader(headerData);
await fs.WriteAsync(headerData.AsMemory());
await input.CopyToAsync(fs);
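CheckHeader isn't defined in the answer; purely as an illustration, a hypothetical check for a PNG upload (where n would be 8, the length of the PNG signature) might look like this:
private static void CheckHeader(byte[] header)
{
    // The PNG file signature is 89 50 4E 47 0D 0A 1A 0A.
    var png = new byte[] { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };
    if (!header.AsSpan(0, png.Length).SequenceEqual(png))
        throw new InvalidDataException("File signature does not match the declared content type.");
}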

Asp.Net Core 2 + Google Cloud Storage download Memory Stream

I'm working on an ASP.NET Core 2 Web API and I have to make an endpoint to download a file. This file is not public, so I cannot use the MediaLink property of the Google Storage object. I'm using their C# library.
In the piece of code you will see below, _storageClient was created like this: _storageClient = StorageClient.Create(cred);. The client is working; I'm just showing which class it is.
[HttpGet("DownloadFile/{clientId}/{fileId}")]
public async Task<IActionResult> DownloadFile([FromRoute] long fileId, long clientId)
{
// here there is a bunch of logic and permission checks. Not relevant to the question
var stream = new MemoryStream();
try
{
stream.Position = 0;
var obj = _storageClient.GetObject("bucket name here", "file.png");
_storageClient.DownloadObject(obj, stream);
var response = File(stream, obj.ContentType, "file.png"); // FileStreamResult
return response;
}
catch (Exception ex)
{
throw;
}
}
The variable obj comes back OK, with all properties filled as expected. The stream seems to be filled properly; it has a length and everything, but it returns a 500 error that I cannot even catch.
I cannot see what I'm doing wrong (maybe it's how I'm using the MemoryStream), but I can't even catch the error.
Thanks for any help
You're rewinding the stream before you've written anything to it, but you're not rewinding it afterwards. I'd expect that to result in an empty response rather than a 500 error, but I'd at least move the stream.Position call to after the download:
var obj = _storageClient.GetObject("bucket name here", "file.png");
_storageClient.DownloadObject(obj, stream);
stream.Position = 0;
Note that you don't need to fetch the object metadata before downloading it. You can just use:
_storageClient.DownloadObject("bucket name here", "file.png", stream);
stream.Position = 0;
A solution can look like the one below.
[HttpGet("get-file")]
public ActionResult GetFile()
{
var storageClient = ...;
byte[] buffer;
using (var memoryStream = new MemoryStream())
{
storageClient.DownloadObject("bucket name here"+"/my-file.jpg", memoryStream);
buffer = memoryStream.ToArray();
}
return File(buffer, "image/jpeg", "my-file.jpg");
}
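Note that this buffers the whole object in memory before returning it. If the files can be large, a sketch of an alternative (assuming the same StorageClient, injected as _storageClient, and a JPEG object as above) is to stream the object straight into the response body:
[HttpGet("stream-file")]
public async Task<IActionResult> StreamFile()
{
    Response.ContentType = "image/jpeg";
    Response.Headers["Content-Disposition"] = "attachment; filename=\"my-file.jpg\"";
    // DownloadObjectAsync writes directly to the destination stream, so nothing
    // is held in a MemoryStream or byte[] first.
    await _storageClient.DownloadObjectAsync("bucket name here", "my-file.jpg", Response.Body);
    return new EmptyResult();
}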

Download ZipArchive from c# web api method returns "net::ERR_CONNECTION_RESET" in chrome

I want to call a web api method and have it allow the user to download a zip file that I create in memory. I also want to create the entries in memory as well.
I'm having trouble getting the server to correctly output the download.
Here is my web api method:
[HttpGet]
[Route("api/downloadstaffdata")]
public HttpResponseMessage DownloadStaffData()
{
var response = new HttpResponseMessage(HttpStatusCode.OK);
using (var stream = new MemoryStream())
{
using (var archive = new ZipArchive(stream, ZipArchiveMode.Create, true))
{
//future for loop to create entries in memory from staff list
var entry = archive.CreateEntry("bob.txt");
using (var writer = new StreamWriter(entry.Open()))
{
writer.WriteLine("Info for: Bob");
}
//future add staff images as well
}
stream.Seek(0, SeekOrigin.Begin);
response.Content = new StreamContent(stream);
}
response.Content.Headers.ContentDisposition = new System.Net.Http.Headers.ContentDispositionHeaderValue("attachment")
{
FileName = "staff_1234_1.zip"
};
response.Content.Headers.ContentType = new System.Net.Http.Headers.MediaTypeHeaderValue("application/zip");
return response;
}
Here is my calling js code:
window.open('api/downloadstaffdata');
Here is the response from Chrome:
net::ERR_CONNECTION_RESET
I don't know what I'm doing wrong. I've already searched SO and read the articles about creating the zip file, but I can't get past the connection reset error when trying to return the zip archive to the client.
Any ideas?
You have your memory stream inside a using block. As such, your memory stream is being disposed before your controller has a chance to write it out (hence the ERR_CONNECTION_RESET).
A MemoryStream does not need to be disposed explicitly (its various derived types may need to be, but not the MemoryStream itself). The garbage collector can clean it up automatically.
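A minimal sketch of what that change might look like (the MemoryStream is left undisposed so the framework can still read it when the response is written; the ZipArchive is still disposed so the archive is finalized before the stream is rewound):
[HttpGet]
[Route("api/downloadstaffdata")]
public HttpResponseMessage DownloadStaffData()
{
    var stream = new MemoryStream();
    using (var archive = new ZipArchive(stream, ZipArchiveMode.Create, leaveOpen: true))
    {
        var entry = archive.CreateEntry("bob.txt");
        using (var writer = new StreamWriter(entry.Open()))
        {
            writer.WriteLine("Info for: Bob");
        }
    }
    // Rewind after the archive has been written and finalized.
    stream.Seek(0, SeekOrigin.Begin);

    var response = new HttpResponseMessage(HttpStatusCode.OK)
    {
        Content = new StreamContent(stream)
    };
    response.Content.Headers.ContentDisposition =
        new System.Net.Http.Headers.ContentDispositionHeaderValue("attachment") { FileName = "staff_1234_1.zip" };
    response.Content.Headers.ContentType =
        new System.Net.Http.Headers.MediaTypeHeaderValue("application/zip");
    return response;
}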

Mvc - How to stream large file in 4k chunks for download

I was following this example, but when the download starts it hangs, and then after a minute it shows a server error. I guess the response ends before all the data is sent to the client.
Do you know another way that I can do this, or why it's not working?
Writing to Output Stream from Action
private void StreamExport(Stream stream, System.Collections.Generic.IList<byte[]> data)
{
using (BufferedStream bs = new BufferedStream(stream, 256 * 1024))
using (StreamWriter sw = new StreamWriter(bs))
{
foreach (var stuff in data)
{
sw.Write(stuff);
sw.Flush();
}
}
}
Can you show the calling method? What is the Stream being passed in? Is it the Response Stream?
There are many helpful classes you can use so that you don't have to chunk the data yourself, because they chunk by default. If you use StreamContent there is a constructor overload where you can specify the buffer size. I believe the default is 10 kB.
From memory here, so it may not be complete:
[Route("download")]
[HttpGet]
public async Task<HttpResponseMessage> GetFile()
{
var response = this.Request.CreateResponse(HttpStatusCode.OK);
//don't use a using statement around the stream because the framework will dispose StreamContent automatically
var stream = await SomeMethodToGetFileStreamAsync();
//buffer size of 4kB
var content = new StreamContent(stream, 4096);
response.Content = content;
return response;
}

Can I get file.OpenStreamForReadAsync() from file.OpenStreamForWriteAsync()?

I am still learning the tricks of read/write file streams and hope someone can help me work out whether what I am looking for is feasible.
The code below makes WebApi calls (note GetAsync() on line 2) to get an image file by Id and save the downloaded file to a database with a computed Md5Hash. The code works fine, but in the interest of efficiency I was wondering if it's possible to get file.OpenStreamForReadAsync() from file.OpenStreamForWriteAsync() (I'm not sure this is even possible, but I can see some extension methods that operate on a stream; I've had no luck with the attempts I've made so far). If this is possible, I can avoid saving a file and opening it again by instead making the GetMD5Hash() method call within the using (var fileStream = await file.OpenStreamForWriteAsync()) { ... } block.
Can I have the equivalent of Utils.GetMD5Hash(stream);, shown below outside the using block, inside the block instead, the intention being to avoid opening the file again outside of the using block?
var client = new HttpClient();
var response = await client.GetAsync(new Uri($"{url}{imageId}")); // call to WebApi; url and imageId defined earlier
if (response.IsSuccessStatusCode)
{
using (var contentStream = await response.Content.ReadAsInputStreamAsync())
{
var stream = contentStream.AsStreamForRead();
var file = await imagesFolder.CreateFileAsync(imageFileName, CreationCollisionOption.ReplaceExisting);
using (var fileStream = await file.OpenStreamForWriteAsync())
{
await stream.CopyToAsync(fileStream, 4096);
// >>>> At this point, from the Write fileStream, can I get the equivalent of file.OpenStreamForReadAsync() ??
}
var readStream = await file.OpenStreamForReadAsync();
string md5Hash = await Utils.GetMD5Hash(readStream);
await AddImageToDataBase(file, md5Hash);
}
}
A MemoryStream is read/write, so if all you wanted to do was to compute the hash, something like this should do the trick:
var stream = contentStream.AsStreamForRead();
using (var ms = new MemoryStream())
{
stream.CopyTo(ms);
ms.Seek(0, SeekOrigin.Begin);
string md5Hash = await Utils.GetMD5Hash(ms);
}
But since you want to save the file anyway (it's passed to AddImageToDataBase, after all), consider:
1. Save to the MemoryStream
2. Reset the memory stream
3. Copy the memory stream to the file stream
4. Reset the memory stream
5. Compute the hash
I'd suggest you do performance measurements, though. The OS does cache file I/O, so it's unlikely that you'd actually have to do a physical disk read to compute the hash. The performance gains might not be what you imagine.
Here is a complete answer using MemoryStream, as Petter Hesselberg suggested in his answer. Take note of a couple of tricky situations I had to handle to do what I wanted: (1) make sure fileStream is disposed via a using block, and (2) make sure the MemoryStream is rewound with ms.Seek(0, SeekOrigin.Begin); before using it, both for the file being saved and for the MemoryStream object handed over for computing the MD5 hash.
var client = new HttpClient();
var response = await client.GetAsync(new Uri($"{url}{imageId}")); // call to WebApi; url and imageId defined earlier
if (response.IsSuccessStatusCode)
{
using (var contentStream = await response.Content.ReadAsInputStreamAsync())
{
var stream = contentStream.AsStreamForRead();
var file = await imagesFolder.CreateFileAsync(imageFileName, CreationCollisionOption.ReplaceExisting);
using (MemoryStream ms = new MemoryStream())
{
await stream.CopyToAsync(ms);
using (var fileStream = await file.OpenStreamForWriteAsync())
{
ms.Seek(0, SeekOrigin.Begin);
await ms.CopyToAsync(fileStream);
ms.Seek(0, SeekOrigin.Begin); // rewind for next use below
}
string md5Hash = await Utils.GetMD5Hash(ms);
await AddImageToDataBase(file, md5Hash);
}
}
}
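If buffering the whole download in a MemoryStream is a concern, another option (not from the answers above, just a sketch reusing the same stream, file and AddImageToDataBase names) is to compute the hash while the bytes are being written to the file, e.g. with a CryptoStream wrapped around the file stream:
using (var md5 = System.Security.Cryptography.MD5.Create())
using (var fileStream = await file.OpenStreamForWriteAsync())
using (var cryptoStream = new System.Security.Cryptography.CryptoStream(
           fileStream, md5, System.Security.Cryptography.CryptoStreamMode.Write))
{
    // Every byte copied is hashed and passed through to the file in one pass.
    await stream.CopyToAsync(cryptoStream, 4096);
    cryptoStream.FlushFinalBlock();
    // Hypothetical formatting; adjust to whatever Utils.GetMD5Hash produces.
    string md5Hash = BitConverter.ToString(md5.Hash).Replace("-", "");
    await AddImageToDataBase(file, md5Hash);
}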
