I'm using the sample code from "OneDriveApiBrowser" as the base for adding save-to-OneDrive support to my app. It makes use of Microsoft.Graph. I can upload small files, but larger files (10 MB) will not upload and give the error "maximum request length exceeded". I get the same error in both my app and the sample code with the following line of code:
DriveItem uploadedItem = await graphClient.Drive.Root.ItemWithPath(drivePath).Content.Request().PutAsync<DriveItem>(newStream);
Is there a way to increase the maximum size of file that can be uploaded? If so how?
Graph will only accept small files using PUT-to-content, so you'll want to look into creating an upload session. Since you're using the Graph SDK I'd use this test case as a guide.
Here's some code for completeness - it won't directly compile but it should let you see the steps involved:
var uploadSession = await graphClient.Drive.Root.ItemWithPath("filename.txt").CreateUploadSession().Request().PostAsync();
var maxChunkSize = 320 * 1024; // 320 KB - Change this to your chunk size. 5MB is the default.
var provider = new ChunkedUploadProvider(uploadSession, graphClient, inputStream, maxChunkSize);
// Setup the chunk request necessities
var chunkRequests = provider.GetUploadChunkRequests();
var readBuffer = new byte[maxChunkSize];
var trackedExceptions = new List<Exception>();
DriveItem itemResult = null;
//upload the chunks
foreach (var request in chunkRequests)
{
var result = await provider.GetChunkRequestResponseAsync(request, readBuffer, trackedExceptions);
if (result.UploadSucceeded)
{
itemResult = result.ItemResponse;
}
}
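For completeness: depending on your SDK version, ChunkedUploadProvider may also expose a convenience UploadAsync method that drives the chunk loop (including retries) for you, so the manual loop above is only needed if you want per-chunk control. A minimal sketch, assuming that method is available in your package version:
// Sketch: let the provider drive the chunk loop itself.
// Assumes the ChunkedUploadProvider in your Microsoft.Graph version exposes UploadAsync.
var session = await graphClient.Drive.Root.ItemWithPath("filename.txt").CreateUploadSession().Request().PostAsync();
var uploadProvider = new ChunkedUploadProvider(session, graphClient, inputStream, 320 * 1024);
// UploadAsync retries failed chunks internally and returns the resulting DriveItem.
DriveItem uploadedItem = await uploadProvider.UploadAsync();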
I have an API method that streams uploaded files directly to disk to be scanned with a virus checker. Some of these files can be quite large, so IFormFile is a no go:
Any single buffered file exceeding 64 KB is moved from memory to a
temp file on disk.
Source: https://learn.microsoft.com/en-us/aspnet/core/mvc/models/file-uploads?view=aspnetcore-3.1
I have a working example that uses multipart/form-data and a really nice NuGet package that takes the headache out of working with multipart/form-data, and it works well. However, I want to add a file header signature check to make sure that the file type declared by the client is actually what they say it is. I can't rely on the file extension to do this securely, but I can use the file header signature to make it at least a bit more secure. Since I am streaming directly to disk, how can I extract the first bytes as the data passes through to the file stream?
[DisableFormValueModelBinding] // required for form binding
[ValidateMimeMultipartContent] // simple check to make sure this is a multipart form
[FileUploadOperation(typeof(SwaggerFileItem))] // used to define the Swagger schema
[RequestSizeLimit(31457280)] // 30MB
[RequestFormLimits(MultipartBodyLengthLimit = 31457280)]
public async Task<IActionResult> PostAsync([FromRoute] int customerId)
{
// placeholders
var uploadLocation = string.Empty;
var trustedFileNameForDisplay = string.Empty;
// this uses a NuGet package that does the hard work of reading the multipart form-data: using UploadStream;
var model = await this.StreamFiles<FileItem>(async x =>
{
// never trust the client
trustedFileNameForDisplay = WebUtility.HtmlEncode(Path.GetFileName(x.FileName));
// determine the quarantine location
uploadLocation = GetUploadLocation(trustedFileNameForDisplay);
// stream the input stream to the file stream
// importantly this should never load the file into memory
// it should be a straight pass through to disk
await using var fs = System.IO.File.Create(uploadLocation, BufSize);
// --> How do I extract the file signature? I.e. a copy of the header bytes as it is being streamed??? <--
await x.OpenReadStream().CopyToAsync(fs);
});
// The model state can now be checked
if (!ModelState.IsValid)
{
// delete the file
DeleteFileIfExists(uploadLocation);
// return a bad request
ThrowProblemDetails(ModelState, StatusCodes.Status400BadRequest);
}
// map as much as we can
var request = _mapper.Map<CreateAttachmentRequest>(model);
// map the remaining properties
request.CustomerId = customerId;
request.UploadServer = Environment.MachineName;
request.UploadLocation = uploadLocation;
request.FileName = trustedFileNameForDisplay;
// call mediator with this request to send it over WCF to Pulse Core.
var result = await _mediator.Send(request);
// build response
var response = new FileResponse { Id = result.FileId, CustomerId = customerId, ExternalId = request.ExternalId };
// return the 201 with the appropriate response
return CreatedAtAction(nameof(GetFile), new { fileId = response.Id, customerId = response.CustomerId }, response);
}
The section I'm stuck on is around the line await x.OpenReadStream().CopyToAsync(fs);. I would like to pull out the file header here as the stream is being copied to the FileStream. Is there a way to add some kind of inspector? I don't want to read the entire stream again, just the header.
Update
Based on the answer given by @Ackdari, I have successfully switched the code to extract the header from the uploaded file stream. I don't know if this could be made any more efficient, but it does work:
//...... removed for clarity
var model = await this.StreamFiles<FileItem>(async x =>
{
trustedFileNameForDisplay = WebUtility.HtmlEncode(Path.GetFileName(x.FileName));
quarantineLocation = QuarantineLocation(trustedFileNameForDisplay);
await using (var fs = System.IO.File.Create(quarantineLocation, BufSize))
{
await x.OpenReadStream().CopyToAsync(fs);
fileFormat = await FileHelpers.GetFileFormatFromFileHeader(fs);
}
});
//...... removed for clarity
and
// using https://github.com/AJMitev/FileTypeChecker
public static async Task<IFileType> GetFileFormatFromFileHeader(FileStream fs)
{
IFileType fileFormat = null;
fs.Position = 0;
var headerData = new byte[40];
var bytesRead = await fs.ReadAsync(headerData, 0, 40);
if (bytesRead > 0)
{
await using (var ms = new MemoryStream(headerData))
{
if (!FileTypeValidator.IsTypeRecognizable(ms))
{
return null;
}
fileFormat = FileTypeValidator.GetFileType(ms);
}
}
return fileFormat;
}
You may want to consider reading the header yourself, depending on which file type is expected:
int n = 4; // length of header
var headerData = new byte[n];
var bytesRead = 0;
int read;
// stop if the stream ends before n bytes are available (avoids looping forever)
while (bytesRead < n && (read = await x.ReadAsync(headerData.AsMemory(bytesRead))) > 0)
bytesRead += read;
CheckHeader(headerData);
await fs.WriteAsync(headerData.AsMemory());
await x.CopyToAsync(fs);
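The CheckHeader call above is left to you; a minimal sketch of what it could look like (the method name, the signature table, and the thrown exception are illustrative assumptions), comparing the first bytes against a few well-known magic numbers:
// Hypothetical helper: compare the header bytes against known file signatures
// ("magic numbers"). Extend the table to the types you actually allow.
// Assumes the usual System, System.Collections.Generic, System.IO and System.Linq usings.
private static readonly Dictionary<string, byte[]> KnownSignatures = new Dictionary<string, byte[]>
{
    ["png"] = new byte[] { 0x89, 0x50, 0x4E, 0x47 },
    ["pdf"] = new byte[] { 0x25, 0x50, 0x44, 0x46 }, // "%PDF"
    ["zip"] = new byte[] { 0x50, 0x4B, 0x03, 0x04 }  // also covers docx/xlsx containers
};

private static void CheckHeader(byte[] headerData)
{
    bool recognized = KnownSignatures.Values.Any(signature =>
        headerData.Length >= signature.Length &&
        signature.SequenceEqual(headerData.Take(signature.Length)));

    if (!recognized)
    {
        throw new InvalidDataException("File signature does not match any allowed file type.");
    }
}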
I've searched and searched and have not found any examples.
I'm using the Azure.Storage.Blobs NuGet package in C# .NET Core.
Here is an example of my current code that doesn't work.
I get a Status: 413 (The request body is too large and exceeds the maximum permissible limit.)
Searching seems to indicate there is either a 4 MB limit or a 100 MB limit; it's not entirely clear, but I think it's 4 MB for append blobs and 100 MB for block blobs.
var appendBlobClient = containerClient.GetAppendBlobClient(string.Format("{0}/{1}", tenantName, Path.GetFileName(filePath)));
using FileStream uploadFileStream = File.OpenRead(filePath);
appendBlobClient.CreateIfNotExists();
appendBlobClient.AppendBlock(uploadFileStream);
uploadFileStream.Close();
This doesn't work because of the 4 MB limit, so I need to append 4 MB chunks of my file, but I've not found examples of the best way to do this.
So what I'm trying to figure out is the best way to upload large files. It seems it has to be done in chunks, maybe 4 MB for append blobs and 100 MB for block blobs, but the documentation isn't clear and doesn't have examples.
I want to thank @silent for responding, since he provided enough info to work out what I needed. Sometimes just having someone to talk things through with helps me figure them out.
What I found is that the BlockBlobClient.Upload method chunks your file stream for you. From my research I believe these to be 100 MB blocks; it appears to have a limit of 100 MB per block and 50,000 blocks.
AppendBlobClient.AppendBlock does not chunk your stream for you. It has a limit of 4 MB per block and 50,000 blocks.
Here is part of my code that allowed me to upload a 6 GB file as a block blob and a 200 MB file as an append blob.
BlobServiceClient blobServiceClient = new BlobServiceClient(azureStorageAccountConnectionString);
BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient(azureStorageAccountContainerName);
containerClient.CreateIfNotExists();
if (appendData)
{
var appendBlobClient = containerClient.GetAppendBlobClient(string.Format("{0}/{1}", tenantName, Path.GetFileName(filePath)));
appendBlobClient.CreateIfNotExists();
var appendBlobMaxAppendBlockBytes = appendBlobClient.AppendBlobMaxAppendBlockBytes;
using (var file = File.OpenRead(filePath))
{
int bytesRead;
var buffer = new byte[appendBlobMaxAppendBlockBytes];
while ((bytesRead = file.Read(buffer, 0, buffer.Length)) > 0)
{
// wrap only the bytes that were actually read; no need to copy into a new array
Stream stream = new MemoryStream(buffer, 0, bytesRead);
appendBlobClient.AppendBlock(stream);
}
}
}
else
{
var blockBlobClient = containerClient.GetBlockBlobClient(string.Format("{0}/{1}", tenantName, Path.GetFileName(filePath)));
using FileStream uploadFileStream = File.OpenRead(filePath);
blockBlobClient.DeleteIfExists();
blockBlobClient.Upload(uploadFileStream);
uploadFileStream.Close();
}
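As a side note, if you stay with the single BlockBlobClient.Upload/UploadAsync call, recent Azure.Storage.Blobs versions also let you influence how the stream is split via StorageTransferOptions. A minimal sketch, where the 8 MB block size and concurrency of 4 are illustrative values, not documented defaults:
// Sketch: tune how the SDK splits the stream into blocks for a block blob upload.
// Requires using Azure.Storage; and using Azure.Storage.Blobs.Models;
var uploadOptions = new BlobUploadOptions
{
    TransferOptions = new StorageTransferOptions
    {
        InitialTransferSize = 8 * 1024 * 1024, // size of the first request
        MaximumTransferSize = 8 * 1024 * 1024, // size of each subsequent block
        MaximumConcurrency = 4                 // number of blocks uploaded in parallel
    }
};

using (FileStream uploadFileStream = File.OpenRead(filePath))
{
    await blockBlobClient.UploadAsync(uploadFileStream, uploadOptions);
}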
Using the .NET SDK for Microsoft Graph, you can upload (small) files. Example here.
How can I upload a large file (> 4MB) using the .NET SDK?
In other words, can the SDK be utilized to implement "Upload large files with an upload session" ?
Here is the code I wrote recently using the Microsoft Graph .NET SDK.
GraphServiceClient(graphClient) authentication is required.
if (fileSize.MegaBytes > 4)
{
var session = await graphClient.Drive.Root.ItemWithPath(uploadPath).CreateUploadSession().Request().PostAsync();
var maxSizeChunk = 320 * 4 * 1024;
var provider = new ChunkedUploadProvider(session, graphClient, stream, maxSizeChunk);
var chunkRequests = provider.GetUploadChunkRequests();
var exceptions = new List<Exception>();
var readBuffer = new byte[maxSizeChunk];
DriveItem itemResult = null;
//upload the chunks
foreach (var request in chunkRequests)
{
// Do your updates here: update progress bar, etc.
// ...
// Send chunk request
var result = await provider.GetChunkRequestResponseAsync(request, readBuffer, exceptions);
if (result.UploadSucceeded)
{
itemResult = result.ItemResponse;
}
}
// Check that upload succeeded
if (itemResult == null)
{
await UploadFilesToOneDrive(fileName, filePath, graphClient);
}
}
else
{
await graphClient.Drive.Root.ItemWithPath(uploadPath).Content.Request().PutAsync<DriveItem>(stream);
}
This will be available in the next release of the .NET Microsoft Graph client library. It will work the same as the functionality in the .NET OneDrive client library. You can review this in my working branch. You can provide feedback in the repo.
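For reference, newer releases of the Microsoft Graph .NET SDK expose this as LargeFileUploadTask. A rough sketch of the typical usage, where the path, slice size and stream variable are placeholders:
// Sketch for newer Microsoft.Graph versions: create an upload session and let
// LargeFileUploadTask slice the stream. The path and slice size are assumptions.
var uploadSession = await graphClient.Me.Drive.Root
    .ItemWithPath("large-file.bin")
    .CreateUploadSession()
    .Request()
    .PostAsync();

int maxSliceSize = 320 * 1024; // slice sizes must be multiples of 320 KiB

var largeFileUploadTask = new LargeFileUploadTask<DriveItem>(uploadSession, fileStream, maxSliceSize);

// optional progress reporting
IProgress<long> progress = new Progress<long>(bytesSent => Console.WriteLine($"Uploaded {bytesSent} bytes"));

var uploadResult = await largeFileUploadTask.UploadAsync(progress);

if (uploadResult.UploadSucceeded)
{
    DriveItem uploadedItem = uploadResult.ItemResponse;
}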
I'm currently developing for an environment that has poor network connectivity. My application helps to automatically download required Google Drive files for users. It works reasonably well for small files (ranging from 40 KB to 2 MB), but fails far too often for larger files (9 MB). I know these file sizes might seem small, but given my client's network environment, the Google Drive API constantly fails with the 9 MB file.
I've concluded that I need to download files in smaller byte chunks, but I don't see how I can do that with Google Drive API. I've read this over and over again, and I've tried the following code:
// with the Drive File ID, and the appropriate export MIME type, I create the export request
var request = DriveService.Files.Export(fileId, exportMimeType);
// take the message so I can modify it by hand
var message = request.CreateRequest();
var client = request.Service.HttpClient;
// I change the Range headers of both the client, and message
client.DefaultRequestHeaders.Range =
message.Headers.Range =
new System.Net.Http.Headers.RangeHeaderValue(100, 200);
var response = await request.Service.HttpClient.SendAsync(message);
// if status code = 200, copy to local file
if (response.IsSuccessStatusCode)
{
using (var fileStream = new FileStream(downloadFileName, FileMode.CreateNew, FileAccess.ReadWrite))
{
await response.Content.CopyToAsync(fileStream);
}
}
The resultant local file (from fileStream), however, is still full length (i.e. a 40 KB file for the 40 KB Drive file, and a 500 Internal Server Error for the 9 MB file). On a side note, I've also experimented with ExportRequest.MediaDownloader.ChunkSize, but from what I observe it only changes the frequency at which the ExportRequest.MediaDownloader.ProgressChanged callback is called (i.e. the callback will trigger every 256 KB if ChunkSize is set to 256 * 1024).
How can I proceed?
You seemed to be heading in the right direction. From your last comment, the request will update progress based on the chunk size, so your observation was accurate.
Looking into the source code for MediaDownloader in the SDK, the following was found (emphasis mine):
The core download logic. We download the media and write it to an
output stream ChunkSize bytes at a time, raising the ProgressChanged
event after each chunk. The chunking behavior is largely a historical
artifact: a previous implementation issued multiple web requests, each
for ChunkSize bytes. Now we do everything in one request, but the API
and client-visible behavior are retained for compatibility.
Your example code will only download one chunk, from 100 to 200. Using that approach you would have to keep track of an index and download each chunk manually, copying each partial download to the file stream:
const int KB = 0x400;
int ChunkSize = 256 * KB; // 256KB;
public async Task ExportFileAsync(string downloadFileName, string fileId, string exportMimeType) {
var exportRequest = driveService.Files.Export(fileId, exportMimeType);
var client = exportRequest.Service.HttpClient;
//you would need to know the file size
var size = await GetFileSize(fileId);
using (var file = new FileStream(downloadFileName, FileMode.CreateNew, FileAccess.ReadWrite)) {
file.SetLength(size);
var chunks = (size / ChunkSize) + 1;
for (long index = 0; index < chunks; index++) {
var request = exportRequest.CreateRequest();
var from = index * ChunkSize;
var to = from + ChunkSize - 1;
request.Headers.Range = new RangeHeaderValue(from, to);
var response = await client.SendAsync(request);
if (response.StatusCode == HttpStatusCode.PartialContent || response.IsSuccessStatusCode) {
using (var stream = await response.Content.ReadAsStreamAsync()) {
file.Seek(from, SeekOrigin.Begin);
await stream.CopyToAsync(file);
}
}
}
}
}
private async Task<long> GetFileSize(string fileId) {
var request = driveService.Files.Get(fileId);
request.Fields = "size"; // size is not returned by default in the v3 API
var file = await request.ExecuteAsync();
return file.Size ?? 0; // Size is a nullable long on the v3 file resource
}
This code makes some assumptions about the drive api/server.
That the server will allow the multiple requests needed to download the file in chunks. Don't know if requests are throttled.
That the server still accepts the Range header as stated in the developer documentation
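If you want to check the second assumption up front, a quick probe request is one option. A hedged sketch (the helper name and the single-byte range are arbitrary choices):
// Hypothetical probe: request a single byte and check whether the server answers
// with 206 Partial Content, i.e. whether Range requests are honored for this export.
private async Task<bool> SupportsRangeRequestsAsync(string fileId, string exportMimeType) {
    var exportRequest = driveService.Files.Export(fileId, exportMimeType);
    var request = exportRequest.CreateRequest();
    request.Headers.Range = new RangeHeaderValue(0, 0);
    var response = await exportRequest.Service.HttpClient.SendAsync(request);
    return response.StatusCode == HttpStatusCode.PartialContent;
}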
I have ASP.NET Web API project where a user can download some stuff from a database.
My Download controller fetches data from the database instance. Every single result has a blob field which is some kind of data (1).
I want to add each result to a ZIP file (2). Finally, I send the HTTP response with my stream content.
List<Result> results = m_Repository.GetResultsForResultId(given_id_by_request);
// 1
foreach (Result result in results)
{
string fileName = String.Format("{0}-{1}.bin", id >> 16, result.Id);
zipFile.AddEntry(fileName, result.Value);
}
// 2
PushStreamContent pushStreamContent = new PushStreamContent((stream, content, context) =>
{
zipFile.Save(stream);
stream.Close();
});
response = new HttpResponseMessage(HttpStatusCode.OK) { Content = pushStreamContent };
It works nicely! But on big download requests this exhausts my memory. I need to find a way to write the data into a ZIP archive without buffering it all. Can someone please help me?
As far as I can see from the code you posted, you are not disposing the streams you create after use. This can lead to a large amount of memory being held by your app, which might be causing your problems.
I am using the ZipArchive to put multiple files into a zip file in my web application. The code Looks somewhat like that:
using (var compressedFileStream = new MemoryStream())
{
using (var zipArchive = new ZipArchive(compressedFileStream, ZipArchiveMode.Update, false))
{
foreach (Result result in results)
{
string fileName = String.Format("{0}-{1}.bin", id >> 16, result.Id);
var zipEntry = zipArchive.CreateEntry(fileName);
using (var originalFileStream = new MemoryStream(result.Value))
{
using (var zipEntryStream = zipEntry.Open())
{
originalFileStream.CopyTo(zipEntryStream);
}
}
}
}
return File(compressedFileStream.ToArray(), "application/zip", string.Format("Download_{0:ddMMyyy_hhmm}.zip", DateTime.Now));
}
I am using that code snippet inside an MVC Controller method so you have to adapt the return part for your situation.
The above code works fine in my application for up to 300 entries or 50 MB volume (those are the limits set by the requirements for my app).
Hope that helps you.
EDIT: I forgot the closing bracket of the first using block. The return statement has to be inside this using block, otherwise the stream will be disposed.
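If memory usage is still a concern, another option, going back to the PushStreamContent in the question, is to build the archive directly on the response stream with ZipArchiveMode.Create, so no complete copy of the ZIP is ever held in memory. A rough sketch, assuming results and result.Value as in the question:
// Sketch: stream each entry straight into the response instead of buffering the
// whole archive in a MemoryStream. Requires System.IO.Compression and System.Net.Http.
var pushStreamContent = new PushStreamContent((outputStream, httpContent, transportContext) =>
{
    // leaveOpen: false, so disposing the archive also closes the response stream
    using (var zipArchive = new ZipArchive(outputStream, ZipArchiveMode.Create, leaveOpen: false))
    {
        foreach (Result result in results)
        {
            string fileName = String.Format("{0}-{1}.bin", id >> 16, result.Id);
            var entry = zipArchive.CreateEntry(fileName);
            using (var entryStream = entry.Open())
            {
                entryStream.Write(result.Value, 0, result.Value.Length);
            }
        }
    }
}, "application/zip");

var response = new HttpResponseMessage(HttpStatusCode.OK) { Content = pushStreamContent };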