Can one use Stream.Position to track upload progress without causing issues? - c#

I'm using a MinIO S3 file server to store user data for my Blazor Server application. Since users work with datasets for ML applications, quite a lot of files need to be uploaded.
Initial Issue
I'm stating my initial issue just for completeness and to avoid creating an XY-problem post (however, I'm curious about my attempted solution in any case):
I'm using a hidden <InputFile/> and some JavaScript to create a drop area for the files the user wants to upload. The problem is that if more than ~200-250 files are dropped (i.e. the OnChange event is triggered with those files), the Blazor circuit breaks.
Attempted Solution
My solution was rather simple: force the user to upload a zipped file containing the complete dataset if the file count is >150.
New Problem
For long-running upload operations I need some user feedback. For single files the upload is fast enough to update the GUI after each file, but the MinIO .NET client doesn't seem to offer any way to track the progress of a single file upload.
Here is the current code for a single file upload:
public async Task StoreImage(ImageDatabaseModel image, Stream dataStream, string contentType, long fileSize)
{
    await CreateBucketIfNotExists(image.UserID);
    PutObjectArgs args = new PutObjectArgs()
        .WithBucket(image.UserID)
        .WithObject(GetMinioImageName(image))
        .WithStreamData(dataStream)
        .WithContentType(contentType)
        .WithObjectSize(fileSize);
    await _minio.PutObjectAsync(args);
}
My idea now was to use the stream itself to get the progress information. Here is the code of an initial test:
public async Task StoreZipFile(ImageDatabaseModel image, Stream dataStream, string contentType, long fileSize)
{
    // never mind that I'm still uploading an image, this is only for testing anyway
    await CreateBucketIfNotExists(image.UserID);
    PutObjectArgs args = new PutObjectArgs()
        .WithBucket(image.UserID)
        .WithObject(GetMinioImageName(image))
        .WithStreamData(dataStream)
        .WithContentType(contentType)
        .WithObjectSize(fileSize);
    // Start the upload and, in parallel, poll the stream's position.
    var t1 = _minio.PutObjectAsync(args);
    var t2 = Task.Run(async () =>
    {
        while (dataStream.Position < fileSize)
        {
            Console.WriteLine(dataStream.Position);
            await Task.Delay(500);
        }
    });
    await Task.WhenAll(t1, t2);
}
This gives the expected output: the dataStream.Position value rises during the upload.
The question is: is this approach suitable for my use case? Are there any downsides I'm unaware of?
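For reference, an alternative I've sketched (untested with MinIO, all names are my own) is a pass-through stream that reports progress from each Read call instead of polling Position from a second task:
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Sketch only: wraps the upload stream and invokes a callback with the
// running total of bytes the consumer (here: the MinIO client) has read.
public class ProgressReadStream : Stream
{
    private readonly Stream _inner;
    private readonly Action<long> _onProgress;
    private long _totalRead;

    public ProgressReadStream(Stream inner, Action<long> onProgress)
    {
        _inner = inner;
        _onProgress = onProgress;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = _inner.Read(buffer, offset, count);
        _totalRead += read;
        _onProgress(_totalRead);
        return read;
    }

    public override async Task<int> ReadAsync(byte[] buffer, int offset, int count, CancellationToken cancellationToken)
    {
        int read = await _inner.ReadAsync(buffer, offset, count, cancellationToken);
        _totalRead += read;
        _onProgress(_totalRead);
        return read;
    }

    // The remaining members simply delegate to the wrapped stream.
    public override bool CanRead => _inner.CanRead;
    public override bool CanSeek => _inner.CanSeek;
    public override bool CanWrite => false;
    public override long Length => _inner.Length;
    public override long Position { get => _inner.Position; set => _inner.Position = value; }
    public override void Flush() => _inner.Flush();
    public override long Seek(long offset, SeekOrigin origin) => _inner.Seek(offset, origin);
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
This would be passed via WithStreamData(new ProgressReadStream(dataStream, p => ...)), which avoids the timing coupling between the polling loop and the upload task, but I'd still like to know whether reading Position directly is safe.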

Related

ASP.NET: Async Action not Behaving Asynchronously

I have an action which retrieves a file from a SQL database, creates a temporary physical file, and, depending on the type of the file, either attempts to extract a thumbnail using the ShellFile API or, if it's a PDF, converts the first page to a JPG using ImageMagick.
This action is called via an Ajax GET request, which receives a base64 representation of the image that is then displayed to the user.
Currently, if, say, 5 Ajax requests are made, the server processes each request one at a time. I'd prefer it to process multiple requests concurrently so the user receives all of the information faster.
So far I've tried making the function use async/await, but it still processes the requests one at a time. Since the APIs I'm using are older and don't support async IO operations, I was following this guide from Microsoft which explains how to wrap synchronous methods in async ones. The code I have so far is:
public ApiResult<string> RunTask(string fileName, string guid)
{
    using (MagickImage image = new MagickImage())
    {
        using (MemoryStream stream = new MemoryStream())
        {
            DbFile file = SharedData.Files.GetFileData(Guid.Parse(guid));
            file.ShouldNotBeNull();
            string path = Server.MapPath($"~/Temp/{fileName}");
            System.IO.File.WriteAllBytes(path, file.FileData);
            switch (Path.GetExtension(fileName))
            {
                case ".pdf":
                    image.Read(path);
                    break;
                default:
                    ShellFile shellFile = ShellFile.FromFilePath(path);
                    Bitmap shellThumb = shellFile.Thumbnail.ExtraLargeBitmap;
                    shellThumb.MakeTransparent(shellThumb.GetPixel(0, 0));
                    ImageConverter converter = new ImageConverter();
                    image.Read((byte[])converter.ConvertTo(shellThumb, typeof(byte[])));
                    break;
            }
            image.Format = MagickFormat.Png;
            image.Write(stream, MagickFormat.Png);
            System.IO.File.Delete(path);
            return Ok<string>($"data:image/png;base64,{Convert.ToBase64String(stream.ToArray())}");
        }
    }
}

[HttpGet]
public async Task<ApiResult<string>> GenerateThumbnail(string fileName, string guid)
{
    return await Task.Run(() => RunTask(fileName, guid));
}
But when debugging it, with breakpoints printing when the start and end of the function are reached, I get:
START: "file1.pdf"
END: "file1.pdf"
START: "file2.jpg"
END: "file2.jpg"
...
When I'd expect something more akin to:
START: "file1.pdf"
START: "file2.jpg"
...
END: "file1.pdf"
END: "file2.jpg"
How would I fix my code to get the desired behavior? If the function is wrapped by an async one, shouldn't it be running multiple calls together? This mostly comes down to me not being very familiar with C# and ASP.NET backends. Considering the Microsoft article I linked says you generally shouldn't wrap synchronous functions in async ones, I have a feeling there is a very different approach I should take.
(I've also verified the Ajax calls are going off together, and it isn't the source of the bottleneck)
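One thing worth checking (an assumption on my part, not confirmed anywhere in the question): ASP.NET serializes concurrent requests from the same session while the session is writable, which produces exactly this start/end/start/end pattern even when each request runs on its own thread. If the action never writes to session state, marking session access read-only releases the per-session lock. A minimal sketch (ThumbnailController is a hypothetical name):
using System.Web.Mvc;
using System.Web.SessionState;

// Hypothetical controller name; the point is the attribute. With read-only
// session access, requests from the same browser session can run in parallel.
[SessionState(SessionStateBehavior.ReadOnly)]
public class ThumbnailController : Controller
{
    // GenerateThumbnail and RunTask as shown above
}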

Can I monitor the progress of an S3 download using the AWS SDK?

I'm using the AWS SDK package from Nuget to download files from S3. This involves creating a GetObject request. Amazon has an example of how to do this in their documentation, although I'm actually using the async version of the method.
My code to download a file looks something like this:
using (var client = new AmazonS3Client(accessKey, secretAccessKey, RegionEndpoint.USEast1))
{
    var request = new GetObjectRequest
    {
        BucketName = "my-bucket",
        Key = "file.exe"
    };
    using (var response = await client.GetObjectAsync(request))
    {
        response.WriteResponseStreamToFile(@"C:\Downloads\file.exe");
    }
}
This works; it downloads the file successfully. However, it seems like a little bit of a black box, in that I never really know how long it's going to take to download the file. What I'm hoping to do is get some sort of Progress event so that I can display a nice WPF ProgressBar and watch the download progress. This means I would need to know the size of the file and the number of bytes downloaded, and I'm not sure if there's a way to do that with the AWS SDK.
You can do:
using (var response = client.GetObject(request))
{
    response.WriteObjectProgressEvent += Response_WriteObjectProgressEvent;
    response.WriteResponseStreamToFile(@"C:\Downloads\file.exe");
}
private static void Response_WriteObjectProgressEvent(object sender, WriteObjectProgressArgs e)
{
    Debug.WriteLine($"Transferred: {e.TransferredBytes}/{e.TotalBytes} - Progress: {e.PercentDone}%");
}
If you subscribe to the WriteObjectProgressEvent event, your handler will be called multiple times during the download. It receives the number of bytes downloaded/remaining, so you can build a progress indicator from it.
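If you're on the async code path from the question, the same event should be usable there too; a sketch, assuming a recent AWSSDK.S3 package where WriteResponseStreamToFileAsync is available:
using (var response = await client.GetObjectAsync(request))
{
    // Subscribe before writing the stream so every chunk reports progress.
    response.WriteObjectProgressEvent += Response_WriteObjectProgressEvent;
    await response.WriteResponseStreamToFileAsync(@"C:\Downloads\file.exe", false, CancellationToken.None);
}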

Getting download progress / currently downloaded bytes from an async GET with C# Httpclient

I'm trying to get some download progress represented in Xamarin using HttpClient from System.Net.Http. I understand that in the typical case you get the file size in advance with a HEAD request, checking the content-length of the file in question. What I don't understand is how to get the number of bytes transferred so far, so I can compare that against the total.
Is this possible? I've looked around for other examples, but they all seem to be making use of WebClient or similar instead.
Edit:
submitButton.Clicked += async (object sender, EventArgs e) =>
{
    var client = new HttpClient();
    var url = "..."; // (just making use of a sizable wallpaper here as an example)
    var result = await client.GetStreamAsync(url);
};
If I await the GetStreamAsync as above, then I can't perform any operations on the stream until it's done. If I remove the await and try to work with the Task result, the same thing takes place - even trying to get a Length won't work until the entire thing is completed anyway.
Obviously, if I can get at the stream as it's being written to, I can (hopefully) check its length and compare that to the header information. But I don't get how to access it mid-transfer.
Unfortunately, the best solution I've been able to find for this so far is "use WebClient instead of HttpClient". I haven't been able to give WebClient a shot yet in Xamarin, and hopefully it will work there, but I managed to get progress working with it on the PC in a straightforward manner.
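For reference, the pattern usually suggested for HttpClient (a sketch, untested on Xamarin; url and localPath are placeholders of my own) is to request only the headers first, then copy the body in chunks yourself, so each read can update a progress value against Content-Length:
using System;
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

// Sketch: stream the body manually so each chunk read can update progress.
async Task DownloadWithProgressAsync(HttpClient client, string url, string localPath)
{
    // ResponseHeadersRead returns as soon as headers arrive, before the body.
    using (var response = await client.GetAsync(url, HttpCompletionOption.ResponseHeadersRead))
    {
        response.EnsureSuccessStatusCode();
        long? total = response.Content.Headers.ContentLength;
        using (var source = await response.Content.ReadAsStreamAsync())
        using (var target = File.Create(localPath))
        {
            var buffer = new byte[81920];
            long downloaded = 0;
            int read;
            while ((read = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                await target.WriteAsync(buffer, 0, read);
                downloaded += read;
                // Compare downloaded against total (if the server sent a length).
                Console.WriteLine($"{downloaded}/{total?.ToString() ?? "?"} bytes");
            }
        }
    }
}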

Saving files and uploading at the same time

I am writing an application which creates files named by timestamp every second, then moves them to another folder, then sends them via POST to a web service which saves them to a folder.
When running the generating function alone, it generates the files successfully.
When running the upload function alone, it uploads them successfully.
But when running both of them as BackgroundWorker components, the first works perfectly, while the upload mechanism tells me that the file is opened by another process.
How can I solve that?
Thx
A good practice when dealing with classes that implement the IDisposable interface, such as FileStream, is to wrap their usage in a using statement. From MSDN:
// Create the file.
using (FileStream fs = File.Create(path))
{
    AddText(fs, "This is some text");
    AddText(fs, "This is some more text,");
    AddText(fs, "\r\nand this is on a new line");
    AddText(fs, "\r\n\r\nThe following is a subset of characters:\r\n");
    for (int i = 1; i < 120; i++)
    {
        AddText(fs, Convert.ToChar(i).ToString());
    }
}

// Helper from the same MSDN sample:
private static void AddText(FileStream fs, string value)
{
    byte[] info = new UTF8Encoding(true).GetBytes(value);
    fs.Write(info, 0, info.Length);
}
Another thing that you should be aware of is multi-threading synchronization. Maybe your "upload" background worker is trying to access the file before your "generate file" background worker had time to finish creating it.
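If the race is the cause, one simple guard (a sketch; OpenWhenReady is my own name) is to retry opening the file exclusively until the generating worker has released it:
using System;
using System.IO;
using System.Threading;

// Sketch: FileShare.None makes the open fail while anyone else still has
// the file open, so we back off and retry a bounded number of times.
static FileStream OpenWhenReady(string path, int maxAttempts = 10)
{
    for (int attempt = 0; ; attempt++)
    {
        try
        {
            return new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.None);
        }
        catch (IOException) when (attempt < maxAttempts - 1)
        {
            Thread.Sleep(200); // writer not finished yet; wait and try again
        }
    }
}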

Silverlight Loading Reference Data On Demand from a 'dumb' server

I have a text file with a list of 300,000 words and the frequency with which they occur. Each line is in the format Word:FrequencyOfOccurrence.
I want this information to be accessible from within the C# code. I can't hard-code the list since it is too long, and I'm not sure how to go about accessing it from a file on the server. Ideally I'd like the information to be downloaded only if it's used (to save on bandwidth), but this is not a high priority as the file is not too big and internet speeds are always increasing.
It doesn't need to be useable for binding.
The information does not need to be editable once the project has been built.
Here is another alternative. Zip the file up and stick it in the ClientBin folder next to the application XAP. Then, at the point in the app where the content is needed, do something like this:-
public void GetWordFrequencyResource(Action<string> callback)
{
    WebClient client = new WebClient();
    client.OpenReadCompleted += (s, args) =>
    {
        try
        {
            var zipRes = new StreamResourceInfo(args.Result, null);
            var txtRes = Application.GetResourceStream(zipRes, new Uri("WordFrequency.txt", UriKind.Relative));
            string result = new StreamReader(txtRes.Stream).ReadToEnd();
            callback(result);
        }
        catch
        {
            callback(null); // Fetch failed.
        }
    };
    client.OpenReadAsync(new Uri("WordFrequency.zip", UriKind.Relative));
}
Usage:-
var wordFrequency = new Dictionary<string, int>();
GetWordFrequencyResource(s =>
{
    // Code here to burst string into dictionary.
});
// Note: the code here runs asynchronously with the building of the dictionary;
// don't attempt to use the dictionary here.
The above code allows you to store the file in an efficient zip format, but not in the XAP itself, hence you can download it on demand. It makes use of the fact that a XAP is a zip file, so Application.GetResourceStream, which is designed to pull resources from XAP files, can be used on a plain zip file as well.
BTW, I'm not actually suggesting you use a dictionary; I'm just using one as a simple example. In reality I would imagine the file is in sorted order. If that is the case, you could use a KeyValuePair<string, int> for each entry, but create a custom collection type that holds them in an array or List and then use binary search methods to index into it, as sketched below.
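To illustrate that last idea, a minimal sketch (my own names, assuming the entries are loaded in sorted order):
using System.Collections.Generic;

// Sketch: sorted entries held in an array, indexed with a binary search.
public class WordFrequencies
{
    private readonly KeyValuePair<string, int>[] _entries; // sorted by Key (ordinal)

    public WordFrequencies(KeyValuePair<string, int>[] sortedEntries)
    {
        _entries = sortedEntries;
    }

    public int? Lookup(string word)
    {
        int lo = 0, hi = _entries.Length - 1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            int cmp = string.CompareOrdinal(_entries[mid].Key, word);
            if (cmp == 0) return _entries[mid].Value;
            if (cmp < 0) lo = mid + 1;
            else hi = mid - 1;
        }
        return null; // word not in the list
    }
}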
Based on your comments, you could download the word list file if you are required to have a very thin server layer. The XAP file containing your Silverlight application is nothing more than a ZIP file with all the referenced files for your Silverlight client layer. Try adding the word list as content that gets compiled into the XAP and see how big the file gets. Text usually compresses really well. In general, though, you'll want to be friendly with your users in how much memory your application consumes. Loading a huge text file into memory, in addition to everything else you need in your app, may ultimately make your app a resource hog.
A better practice, in general, would be to call a web service. The service would perform whatever lookup logic you need. Here's a blog post from a quick search that should get you started (it was written for SL2, but should apply the same for SL3):
Calling web services with Silverlight 2
Even better would be to store your list in SQL Server. It will be much easier and quicker to query.
You could create a WCF service on the server side that will send the data to the Silverlight application. Once you retrieve the information you could cache it in-memory inside the client. Here's an example of calling a WCF service method from Silverlight.
Another possibility is to embed the text file into the Silverlight assembly that is deployed to the client:
using (var stream = Assembly.GetExecutingAssembly()
    .GetManifestResourceStream("namespace.data.txt"))
using (var reader = new StreamReader(stream))
{
    string data = reader.ReadToEnd();
    // Do something with the data
}
