I am trying to get a COMPLETE example of downloading a file from Azure Blob Storage using the .DownloadToStreamAsync() method, wired up to a progress bar.
I've found references to old implementations of the Azure Storage SDK, but they don't compile against the newer SDK (which has implemented these async methods) or don't work with current NuGet packages.
https://blogs.msdn.microsoft.com/avkashchauhan/2010/11/03/uploading-a-blob-to-azure-storage-with-progress-bar-and-variable-upload-block-size/
https://blogs.msdn.microsoft.com/kwill/2013/03/05/asynchronous-parallel-blob-transfers-with-progress-change-notification-2-0/
I'm a newbie to async/await threading in .NET, and was wondering if someone could help me take the below (in a Windows Forms app) and show how I can 'hook' into the progress of the file download. I see some examples don't use the .DownloadToStream method and instead download chunks of the blob file, but I wondered, since these new ...Async() methods exist in the newer Storage SDKs, whether there was something smarter that could be done?
So assuming the below is working (non-async), what additionally would I have to do to use the blockBlob.DownloadToStreamAsync(fileStream) method? Is this even the right use of it, and how can I get the progress?
Ideally I am after any way I can hook into the progress of the blob download so I can update a Windows Forms UI on big downloads, so if the below is not the right way, please enlighten me :)
// Retrieve storage account from connection string.
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
CloudConfigurationManager.GetSetting("StorageConnectionString"));
// Create the blob client.
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
// Retrieve reference to a previously created container.
CloudBlobContainer container = blobClient.GetContainerReference("mycontainer");
// Retrieve reference to a blob named "photo1.jpg".
CloudBlockBlob blockBlob = container.GetBlockBlobReference("photo1.jpg");
// Save blob contents to a file.
using (var fileStream = System.IO.File.OpenWrite(@"path\myfile"))
{
    blockBlob.DownloadToStream(fileStream);
}
Using the awesome method (downloading 1 MB chunks) kindly suggested by Gaurav, I have implemented it using a BackgroundWorker to do the download so I can update the UI as I go.
The main part inside the do loop, which downloads a range to a stream and then writes the stream to the file system, I haven't touched from the original example, but I have added code to update the worker progress and to listen for worker cancellation (to abort the download). Not sure if this could be the issue?
For completeness, below is everything inside the worker_DoWork method:
public void worker_DoWork(object sender, DoWorkEventArgs e)
{
    object[] parameters = e.Argument as object[];
    string localFile = (string)parameters[0];
    string blobName = (string)parameters[1];
    string blobContainerName = (string)parameters[2];
    CloudBlobClient client = (CloudBlobClient)parameters[3];
    try
    {
        int segmentSize = 1 * 1024 * 1024; // 1 MB chunk
        var blobContainer = client.GetContainerReference(blobContainerName);
        var blob = blobContainer.GetBlockBlobReference(blobName);
        blob.FetchAttributes();
        blobLengthRemaining = blob.Properties.Length;
        blobLength = blob.Properties.Length;
        long startPosition = 0;
        do
        {
            long blockSize = Math.Min(segmentSize, blobLengthRemaining);
            byte[] blobContents = new byte[blockSize];
            using (MemoryStream ms = new MemoryStream())
            {
                blob.DownloadRangeToStream(ms, startPosition, blockSize);
                ms.Position = 0;
                ms.Read(blobContents, 0, blobContents.Length);
                using (FileStream fs = new FileStream(localFile, FileMode.OpenOrCreate))
                {
                    fs.Position = startPosition;
                    fs.Write(blobContents, 0, blobContents.Length);
                }
            }
            startPosition += blockSize;
            blobLengthRemaining -= blockSize;
            if (blobLength > 0)
            {
                decimal totalSize = Convert.ToDecimal(blobLength);
                decimal downloaded = totalSize - Convert.ToDecimal(blobLengthRemaining);
                decimal blobPercent = (downloaded / totalSize) * 100;
                worker.ReportProgress(Convert.ToInt32(blobPercent));
            }
            if (worker.CancellationPending)
            {
                e.Cancel = true;
                blobDownloadCancelled = true;
                return;
            }
        }
        while (blobLengthRemaining > 0);
    }
    catch (Exception ex)
    {
        MessageBox.Show(ex.Message);
    }
}
This is working, but on bigger files (30 MB, for example) I sometimes get a 'can't write to file as it's open in another process' error and the download fails.
Using your code:
using (var fileStream = System.IO.File.OpenWrite(@"path\myfile"))
{
    blockBlob.DownloadToStream(fileStream);
}
It is not possible to show progress with that code, because execution only leaves the function when the download is complete. The DownloadToStream function internally splits a large blob into chunks and downloads the chunks for you.
What you need to do is download those chunks yourself, using the DownloadRangeToStream method. I answered a similar question some time back that you may find useful: Azure download blob part.
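To tie that back to the async question above, here is a minimal sketch (assuming the legacy Microsoft.WindowsAzure.Storage SDK) that downloads ranges with DownloadRangeToStreamAsync and reports progress through IProgress<int>. Note that it opens the target file once, outside the loop, which also avoids the 'file open in another process' error described above:
public async Task DownloadBlobAsync(CloudBlockBlob blob, string localFile,
    IProgress<int> progress, CancellationToken cancellationToken)
{
    const int segmentSize = 1 * 1024 * 1024; // 1 MB chunks
    await blob.FetchAttributesAsync();
    long blobLength = blob.Properties.Length;
    long startPosition = 0;
    // Open the destination once and keep it open for the whole download.
    using (var fs = new FileStream(localFile, FileMode.Create, FileAccess.Write))
    {
        while (startPosition < blobLength)
        {
            cancellationToken.ThrowIfCancellationRequested();
            long blockSize = Math.Min(segmentSize, blobLength - startPosition);
            // Download this range directly into the already-open file stream.
            await blob.DownloadRangeToStreamAsync(fs, startPosition, blockSize);
            startPosition += blockSize;
            progress.Report((int)(startPosition * 100 / blobLength));
        }
    }
}
From a WinForms button handler you could call it as await DownloadBlobAsync(blob, path, new Progress<int>(p => progressBar1.Value = p), cts.Token) (where cts is a CancellationTokenSource); a Progress<int> created on the UI thread posts its reports back to that thread, so no BackgroundWorker is needed.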
Related
I'm using a C# (.NET Core) console program to upload a large .CSV file (limit 30 MB) from an FTP site to an Azure file share location. In some cases I'm getting incomplete writes at the Azure file share destination, although it works most of the time when I upload in chunks. Because of the incomplete write, the file can arrive damaged at the destination (missing a few rows and contents from the source). The code I use is shown below. Please help me improve the procedure or provide other solutions.
public async Task WriteFileToFileShare(ShareClient share, string azureFolder, string validDataFilePath, Stream stream)
{
    ShareDirectoryClient directory = share.GetDirectoryClient(azureFolder);
    // Azure allows for 4MB max uploads (4 x 1024 x 1024 = 4194304)
    const int uploadLimit = 4194304;
    stream.Seek(0, SeekOrigin.Begin); // ensure stream is at the beginning
    var fileClient = await directory.CreateFileAsync(Path.GetFileName(validDataFilePath), stream.Length);
    // If stream is below the limit upload directly
    if (stream.Length <= uploadLimit)
    {
        await fileClient.Value.UploadRangeAsync(new HttpRange(0, stream.Length), stream);
        return;
    }
    int bytesRead;
    long index = 0;
    byte[] buffer = new byte[uploadLimit];
    // Stream is larger than the limit so we need to upload in chunks
    while ((bytesRead = stream.Read(buffer, 0, buffer.Length)) > 0)
    {
        // Create a memory stream for the buffer to upload
        using MemoryStream ms = new MemoryStream(buffer, 0, bytesRead);
        await fileClient.Value.UploadRangeAsync(ShareFileRangeWriteType.Update, new HttpRange(index, ms.Length), ms);
        index += ms.Length; // increment the index to account for bytes already written
    }
}
As suggested by @Nayan, to upload more than 30 MB to an Azure file share location from an FTP site:
Thank you for your valuable suggestion; posting it as an answer so other community members who have a similar issue can find it and fix their problem.
The Azure File Sync Agent can keep content synchronised between on-premises and an Azure file share.
Here is example code based on the MS docs:
private static async Task UploadFilesAsync()
{
    // Create five randomly named containers to store the uploaded files.
    BlobContainerClient[] containers = await GetRandomContainersAsync();

    // Path to the directory to upload
    string uploadPath = Directory.GetCurrentDirectory() + "\\upload";

    // Start a timer to measure how long it takes to upload all the files.
    Stopwatch timer = Stopwatch.StartNew();

    try
    {
        Console.WriteLine($"Iterating in directory: {uploadPath}");
        int count = 0;
        Console.WriteLine($"Found {Directory.GetFiles(uploadPath).Length} file(s)");

        // Specify the StorageTransferOptions
        BlobUploadOptions options = new BlobUploadOptions
        {
            TransferOptions = new StorageTransferOptions
            {
                // Set the maximum number of workers that
                // may be used in a parallel transfer.
                MaximumConcurrency = 8,

                // Set the maximum length of a transfer to 50MB.
                MaximumTransferSize = 50 * 1024 * 1024
            }
        };

        // Create a queue of tasks that will each upload one file.
        var tasks = new Queue<Task<Response<BlobContentInfo>>>();

        // Iterate through the files
        foreach (string filePath in Directory.GetFiles(uploadPath))
        {
            BlobContainerClient container = containers[count % 5];
            string fileName = Path.GetFileName(filePath);
            Console.WriteLine($"Uploading {fileName} to container {container.Name}");
            BlobClient blob = container.GetBlobClient(fileName);

            // Add the upload task to the queue
            tasks.Enqueue(blob.UploadAsync(filePath, options));
            count++;
        }

        // Run all the tasks asynchronously.
        await Task.WhenAll(tasks);

        timer.Stop();
        Console.WriteLine($"Uploaded {count} files in {timer.Elapsed.TotalSeconds} seconds");
    }
    catch (RequestFailedException ex)
    {
        Console.WriteLine($"Azure request failed: {ex.Message}");
    }
    catch (DirectoryNotFoundException ex)
    {
        Console.WriteLine($"Error parsing files in the directory: {ex.Message}");
    }
    catch (Exception ex)
    {
        Console.WriteLine($"Exception: {ex.Message}");
    }
}
For more information, please refer to the links below:
SO THREAD: Uploading to Azure File Storage fails with large files.
How much large file we can upload to Azure file share in a single shot in .net core 3.1
As per the title, I have a program in which I'm trying to add to an existing BlockBlob using the PutBlock method:
private static void UploadNewText(string text)
{
    string fileName = "test4.txt";
    string containerString = "mycontainer";
    CloudStorageAccount storage = CloudStorageAccount.Parse(connection);
    CloudBlobClient client = storage.CreateCloudBlobClient();
    CloudBlobContainer container = client.GetContainerReference(containerString);
    CloudBlockBlob blob = container.GetBlockBlobReference(fileName);

    using (MemoryStream stream = new MemoryStream())
    using (StreamWriter sw = new StreamWriter(stream))
    {
        sw.Write(text);
        sw.Flush();
        stream.Position = 0;

        string blockId = Convert.ToBase64String(
            ASCIIEncoding.ASCII.GetBytes("0000005"));
        Console.WriteLine(blockId);

        blob.PutBlock(blockId, stream, null);
        blob.PutBlockList(new string[] { blockId });
    }
}
As I understand it, provided the block ID increases (or at least differs) and is a consistent size, this should work; however, when I run it a second time for the same file (regardless of whether or not I increase the block ID), it just overwrites the existing file with the new text.
I realise there are other options for appending to a blob (such as AppendBlob), but I'm curious if PutBlock, specifically can do this. Is what I am trying to do possible and, if so, what am I doing wrong?
Can PutBlock be used to append to an existing BlockBlob in Azure
Yes, it can. However, in order to do that, you will need to go about it in a slightly different way.
What you will need to do is:
First, get the previously committed block list. The method you want to use is DownloadBlockList.
Upload the new block. Note down its block id.
Append this block id to the list of block ids you downloaded in step #1.
Call PutBlockList with this new list, as sketched below.
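A minimal sketch of those four steps against the legacy SDK (the AppendBlock method name is just for illustration; Select/ToList need using System.Linq):
private static void AppendBlock(CloudBlockBlob blob, string text, string newBlockId)
{
    // 1. Get the currently committed block ids (empty list if the blob is new).
    // Note: a blob uploaded without blocks reports an empty committed list.
    List<string> blockIds = blob.Exists()
        ? blob.DownloadBlockList(BlockListingFilter.Committed).Select(b => b.Name).ToList()
        : new List<string>();
    // 2. Upload the new block.
    using (var ms = new MemoryStream(Encoding.UTF8.GetBytes(text)))
    {
        blob.PutBlock(newBlockId, ms, null);
    }
    // 3. Append the new block id to the committed list.
    blockIds.Add(newBlockId);
    // 4. Commit old ids + new id together.
    blob.PutBlockList(blockIds);
}
The key point is step 4: PutBlockList commits exactly the list you pass, so committing only the new id (as in the question) discards every previously committed block, which is why the file appeared to be overwritten.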
I'm currently developing for an environment that has poor network connectivity. My application helps to automatically download required Google Drive files for users. It works reasonably well for small files (ranging from 40 KB to 2 MB), but fails far too often for larger files (9 MB). I know these file sizes might seem small, but in terms of my client's network environment, the Google Drive API constantly fails with the 9 MB file.
I've concluded that I need to download files in smaller byte chunks, but I don't see how I can do that with the Google Drive API. I've read this over and over again, and I've tried the following code:
// with the Drive File ID, and the appropriate export MIME type, I create the export request
var request = DriveService.Files.Export(fileId, exportMimeType);

// take the message so I can modify it by hand
var message = request.CreateRequest();
var client = request.Service.HttpClient;

// I change the Range headers of both the client, and message
client.DefaultRequestHeaders.Range =
    message.Headers.Range =
        new System.Net.Http.Headers.RangeHeaderValue(100, 200);

var response = await request.Service.HttpClient.SendAsync(message);

// if status code = 200, copy to local file
if (response.IsSuccessStatusCode)
{
    using (var fileStream = new FileStream(downloadFileName, FileMode.CreateNew, FileAccess.ReadWrite))
    {
        await response.Content.CopyToAsync(fileStream);
    }
}
The resulting local file (from fileStream), however, is still full-length (i.e. a 40 KB file for the 40 KB Drive file, and a 500 Internal Server Error for the 9 MB file). On a side note, I've also experimented with ExportRequest.MediaDownloader.ChunkSize, but from what I observe it only changes the frequency at which the ExportRequest.MediaDownloader.ProgressChanged callback is called (i.e. the callback triggers every 256 KB if ChunkSize is set to 256 * 1024).
How can I proceed?
You seem to be heading in the right direction. From your last comment, the request updates progress based on the chunk size, so your observation was accurate.
Looking into the source code for MediaDownloader in the SDK, the following was found (emphasis mine):
The core download logic. We download the media and write it to an
output stream ChunkSize bytes at a time, raising the ProgressChanged
event after each chunk. The chunking behavior is largely a historical
artifact: a previous implementation issued multiple web requests, each
for ChunkSize bytes. Now we do everything in one request, but the API
and client-visible behavior are retained for compatibility.
Your example code will only download one chunk, from byte 100 to byte 200. Using that approach you would have to keep track of an index and download each chunk manually, copying each partial download to the file stream:
const int KB = 0x400;
int ChunkSize = 256 * KB; // 256KB;
public async Task ExportFileAsync(string downloadFileName, string fileId, string exportMimeType) {
    var exportRequest = driveService.Files.Export(fileId, exportMimeType);
    var client = exportRequest.Service.HttpClient;
    // you would need to know the file size
    var size = await GetFileSize(fileId);
    using (var file = new FileStream(downloadFileName, FileMode.CreateNew, FileAccess.ReadWrite)) {
        file.SetLength(size);
        var chunks = (size / ChunkSize) + 1;
        for (long index = 0; index < chunks; index++) {
            var request = exportRequest.CreateRequest();
            var from = index * ChunkSize;
            var to = from + ChunkSize - 1;
            request.Headers.Range = new RangeHeaderValue(from, to);
            var response = await client.SendAsync(request);
            if (response.StatusCode == HttpStatusCode.PartialContent || response.IsSuccessStatusCode) {
                using (var stream = await response.Content.ReadAsStreamAsync()) {
                    file.Seek(from, SeekOrigin.Begin);
                    await stream.CopyToAsync(file);
                }
            }
        }
    }
}

private async Task<long> GetFileSize(string fileId) {
    // Request only the size field; File.Size is a nullable long in the v3 client.
    var request = driveService.Files.Get(fileId);
    request.Fields = "size";
    var file = await request.ExecuteAsync();
    return file.Size ?? 0;
}
This code makes some assumptions about the Drive API/server:
That the server will allow the multiple requests needed to download the file in chunks. I don't know if requests are throttled.
That the server still accepts the Range header, as stated in the developer documentation.
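For completeness, a hypothetical call site for the method above (the local path, file id, and MIME type are placeholders):
// Export a Drive file as PDF, downloading it in 256KB chunks.
await ExportFileAsync(@"C:\temp\report.pdf", "your-drive-file-id", "application/pdf");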
I know that this has already been asked, but the marked solution is not correct. Usually this article is marked as the solution: https://learn.microsoft.com/en-us/archive/blogs/kwill/asynchronous-parallel-blob-transfers-with-progress-change-notification-2-0
It works and gives an actual progress value, but not the real-time progress (and in some cases it gives a totally wrong value). Let me explain:
It gives the progress on the local read buffer, so when I upload something my first "uploaded value" is the read buffer's total size. In my case this buffer is 4 MB, so every file smaller than 4 MB shows as completed in 0 seconds on the progress bar, but it takes the real upload time to complete for real.
Also, if you kill your connection just before the upload starts, it reports the first buffer size as actual progress, so for my 1 MB file I get 100% progress while disconnected.
I found another article with another solution; it reads the HTTP response from Azure every time a single block upload completes, but I need my blocks to be 4 MB (since the max block count for a single file is 50,000), and it's not a perfect solution even with a low block size.
The first article overrides the stream class and creates a ProgressStream class with a ProgressChanged event that is triggered every time a read is done. Is there some way to know the actual uploaded bytes when that ProgressChanged is triggered?
You can do this by using code similar to https://learn.microsoft.com/en-us/archive/blogs/kwill/asynchronous-parallel-block-blob-transfers-with-progress-change-notification (version 1.0 of the blog post you referenced), but instead of calling m_Blob.PutBlock you would upload the block with an HttpWebRequest object and track progress as you write to the request stream. This introduces a lot more code complexity and you would have to add some additional error handling.
The alternative would be to download the Storage Client Library source code from GitHub and modify the block upload methods to track and report progress. The challenge you will face there is that you will have to make the same changes to every new version of the SCL if you plan on staying up to date with the latest fixes.
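For illustration, a rough sketch of the first approach: uploading one block via the REST Put Block operation with an HttpWebRequest (the blobSasUrl parameter and onBytesSent callback are assumptions for this example). Disabling write buffering is what makes the byte count reflect actual network progress rather than buffer fills:
// Sketch: upload a single block so bytes can be counted as they hit the wire.
// Assumes blobSasUrl is a full blob URL with a SAS token (ends in "?sv=...").
static void PutBlockWithProgress(string blobSasUrl, string blockId, byte[] data, Action<long> onBytesSent)
{
    string url = blobSasUrl + "&comp=block&blockid=" + Uri.EscapeDataString(blockId);
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.Method = "PUT";
    request.ContentLength = data.Length;
    // Without this, bytes are buffered locally and "progress" completes instantly,
    // which is exactly the symptom described above.
    request.AllowWriteStreamBuffering = false;
    using (var stream = request.GetRequestStream())
    {
        const int chunk = 64 * 1024;
        for (int offset = 0; offset < data.Length; offset += chunk)
        {
            int count = Math.Min(chunk, data.Length - offset);
            stream.Write(data, offset, count);
            onBytesSent(offset + count); // bytes actually handed to the network stack
        }
    }
    using (var response = (HttpWebResponse)request.GetResponse())
    {
        // 201 Created is expected; anything else should be handled as an error.
    }
}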
I must admit I didn't check if everything is as you desired, but here are my 2 cents on uploading with progress indication.
public async Task UploadVideoFilesToBlobStorage(List<VideoUploadModel> videos, CancellationToken cancellationToken)
{
    var blobTransferClient = new BlobTransferClient();
    //register events
    blobTransferClient.TransferProgressChanged += BlobTransferClient_TransferProgressChanged;
    //files
    _videoCount = _videoCountLeft = videos.Count;
    foreach (var video in videos)
    {
        var blobUri = new Uri(video.SasLocator);
        //create the sasCredentials
        var sasCredentials = new StorageCredentials(blobUri.Query);
        //get the URL without sasCredentials, so only path and filename.
        var blobUriBaseFile = new Uri(blobUri.GetComponents(UriComponents.SchemeAndServer | UriComponents.Path,
            UriFormat.UriEscaped));
        //get the URL without the filename (needed for BlobTransferClient; seems like an issue to me)
        var blobUriBase = new Uri(blobUriBaseFile.AbsoluteUri.Replace("/" + video.Filename, ""));
        var blobClient = new CloudBlobClient(blobUriBaseFile, sasCredentials);
        //upload using a stream; the other overload of UploadBlob forces the online filename to match the local filename
        using (FileStream fs = new FileStream(video.FilePath, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            await blobTransferClient.UploadBlob(blobUriBase, video.Filename, fs, null, cancellationToken, blobClient,
                new NoRetry(), "video/x-msvideo");
        }
        _videoCountLeft -= 1;
    }
    blobTransferClient.TransferProgressChanged -= BlobTransferClient_TransferProgressChanged;
}

private void BlobTransferClient_TransferProgressChanged(object sender, BlobTransferProgressChangedEventArgs e)
{
    Console.WriteLine("progress, seconds remaining:" + e.TimeRemaining.Seconds);
    double bytesTransfered = e.BytesTransferred;
    double bytesTotal = e.TotalBytesToTransfer;
    double thisProcent = bytesTransfered / bytesTotal;
    double procent = thisProcent;
    //divide by video count
    int videosUploaded = _videoCount - _videoCountLeft;
    if (_videoCountLeft > 0)
    {
        procent = (thisProcent + videosUploaded) / _videoCount;
    }
    procent = procent * 100; //to real %
    UploadProgressChangedEvent?.Invoke((int)procent, videosUploaded, _videoCount);
}
Actually, Microsoft.WindowsAzure.MediaServices.Client.BlobTransferClient should be able to do concurrent uploads, but there is no method for uploading multiple files, although it has properties for NumberOfConcurrentTransfers and ParallelTransferThreadCount; I'm not sure how to use those.
There is a bug in this BlobTransferClient: when uploading using the localFile parameter, it will use the filename of that file, while I gave permissions on a specific file name in the SasLocator.
This example shows how to upload from a client (not on a server), so we don't need any CloudMediaContext, which is usually the case. Everything about SasLocators can be found here.
In my application I can download some media files from the web. Normally I used the WebClient.OpenReadCompleted method to download, decrypt, and save the file to IsolatedStorage. It worked well and looked like this:
private void downloadedSong_OpenReadCompleted(object sender, OpenReadCompletedEventArgs e, SomeOtherValues someOtherValues) // delegate, uses additional values
{
    // Some preparations
    try
    {
        if (e.Result != null)
        {
            using (isolatedStorageFile = IsolatedStorageFile.GetUserStoreForApplication())
            {
                // working with the gained stream, decryption
                // saving the decrypted file to isolatedStorage
                isolatedStorageFileStream = new IsolatedStorageFileStream("SomeFileNameHere", FileMode.OpenOrCreate, isolatedStorageFile);
                // and use it for MediaElement
                mediaElement.SetSource(isolatedStorageFileStream);
                mediaElement.Position = new TimeSpan(0);
                mediaElement.MediaOpened += new RoutedEventHandler(mediaFile_MediaOpened);
                // and some other work
            }
        }
    }
    catch (Exception ex)
    {
        // try/catch stuff
    }
}
But after some investigation I found out that with large files (for me, more than 100 MB) I'm getting an OutOfMemory exception during the download. I suppose that's because WebClient.OpenReadCompleted loads the whole stream into RAM and chokes, and I would need even more memory to decrypt the stream.
After another investigation, I found how to divide a large file into chunks after the OpenReadCompleted event, when saving the file to IsolatedStorage (or decrypting and then saving, in my case), but this only helps with part of the problem. The primary problem is how to keep the phone from choking during the download itself. Is there a way to download a large file in chunks? Then I could use the solution I found for the decryption step. (I'd still need to find a way to load such a big file into a MediaElement, but that would be another question.)
Answer:
private WebHeaderCollection headers;
private int iterator = 0;
private int delta = 1048576;
private string savedFile = "testFile.mp3";

// some preparations
// Start downloading the first piece
using (IsolatedStorageFile isolatedStorageFile = IsolatedStorageFile.GetUserStoreForApplication())
{
    if (isolatedStorageFile.FileExists(savedFile))
        isolatedStorageFile.DeleteFile(savedFile);
}
headers = new WebHeaderCollection();
headers[HttpRequestHeader.Range] = "bytes=" + iterator.ToString() + '-' + (iterator + delta).ToString();
webClientReadCompleted = new WebClient();
webClientReadCompleted.Headers = headers;
webClientReadCompleted.OpenReadCompleted += downloadedSong_OpenReadCompleted;
webClientReadCompleted.OpenReadAsync(new Uri(song.Link));
// song.Link was given earlier

private void downloadedSong_OpenReadCompleted(object sender, OpenReadCompletedEventArgs e)
{
    try
    {
        if (e.Cancelled == false)
        {
            if (e.Result != null)
            {
                ((WebClient)sender).OpenReadCompleted -= downloadedSong_OpenReadCompleted;
                using (IsolatedStorageFile myIsolatedStorage = IsolatedStorageFile.GetUserStoreForApplication())
                {
                    using (IsolatedStorageFileStream fileStream = new IsolatedStorageFileStream(savedFile, FileMode.Append, FileAccess.Write, myIsolatedStorage))
                    {
                        int mediaFileLength = (int)e.Result.Length;
                        byte[] byteFile = new byte[mediaFileLength];
                        e.Result.Read(byteFile, 0, byteFile.Length);
                        fileStream.Write(byteFile, 0, byteFile.Length);
                        // If there's something left, download it recursively
                        if (byteFile.Length > delta)
                        {
                            iterator = iterator + delta + 1;
                            headers = new WebHeaderCollection();
                            headers[HttpRequestHeader.Range] = "bytes=" + iterator.ToString() + '-' + (iterator + delta).ToString();
                            webClientReadCompleted.Headers = headers;
                            webClientReadCompleted.OpenReadCompleted += downloadedSong_OpenReadCompleted;
                            webClientReadCompleted.OpenReadAsync(new Uri(song.Link));
                        }
                    }
                }
            }
        }
    }
    catch (Exception)
    {
        // handle errors as appropriate
    }
}
To download a file in chunks you'll need to make multiple requests, one for each chunk.
Unfortunately it's not possible to say "get me this file and return it in chunks of size X".
Assuming that the server supports it, you can use the HTTP Range header to specify which bytes of a file the server should return in response to a request.
You then make multiple requests to get the file in pieces and then put it all back together on the device. You'll probably find it simplest to make sequential calls and start the next one once you've got and verified the previous chunk.
This approach makes it simple to resume a download when the user returns to the app. You just look at how much was downloaded previously and then get the next chunk.
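A rough sketch of that resume logic, reusing the fields (savedFile, iterator, delta, webClientReadCompleted) and handler from the answer code above, and assuming the server honours Range requests:
private void ResumeDownload(Uri songLink)
{
    // Start from however many bytes were saved on a previous run.
    using (var store = IsolatedStorageFile.GetUserStoreForApplication())
    {
        if (store.FileExists(savedFile))
        {
            using (var fs = store.OpenFile(savedFile, FileMode.Open, FileAccess.Read))
            {
                iterator = (int)fs.Length;
            }
        }
    }
    headers = new WebHeaderCollection();
    headers[HttpRequestHeader.Range] = "bytes=" + iterator.ToString() + '-' + (iterator + delta).ToString();
    webClientReadCompleted = new WebClient();
    webClientReadCompleted.Headers = headers;
    webClientReadCompleted.OpenReadCompleted += downloadedSong_OpenReadCompleted;
    webClientReadCompleted.OpenReadAsync(songLink);
}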
I've written an app which downloads movies (up to 2.6GB) in 64K chunks and then played them back from IsolatedStorage with the MediaPlayerLauncher. Playing via the MediaElement should work too but I haven't verified. You can test this by loading a large file directly into IsolatedStorage (via Isolated Storage Explorer, or similar) and check the memory implications of playing that way.
Confirmed: You can use BackgroundTransferRequest to download multi-GB files but you must set TransferPreferences to None to force the download to happen while connected to an external power supply and while connected to wi-fi, else the BackgroundTransferRequest will fail.
I wonder if it's possible to use a BackgroundTransferRequest to download large files easily and let the phone worry about the implementation details. The documentation seems to suggest that file downloads over 100 MB are possible, and the "Range" header is reserved for the system's own use, so it probably uses this automatically behind the scenes when it can.
From the documentation regarding files over 100 MB:
For files larger than 100 MB, you must set the TransferPreferences
property of the transfer to None or the transfer will fail. If you do
not know the size of a transfer and it is possible that it could
exceed this limit, you should set the value to None, meaning that the
transfer will only proceed when the phone is connected to external
power and has a Wi-Fi connection.
From the documentation regarding use of the "Range" header:
The Headers property of the BackgroundTransferRequest object is used
to set the HTTP headers for a transfer request. The following headers
are reserved for use by the system and cannot be used by calling
applications. Adding one of the following headers to the Headers
collection will cause a NotSupportedException to be thrown when the
Add(BackgroundTransferRequest) method is used to queue the transfer
request:
If-Modified-Since
If-None-Match
If-Range
Range
Unless-Modified-Since
Here's the documentation:
http://msdn.microsoft.com/en-us/library/windowsphone/develop/hh202955(v=vs.105).aspx
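For reference, a minimal sketch of queuing such a transfer with the settings described above (the source URL and file name are placeholders):
// Queue a large download with the Windows Phone background transfer service.
// The download location must be under the special /shared/transfers folder.
var transferRequest = new BackgroundTransferRequest(
    new Uri("http://example.com/bigmovie.mp4"),                  // placeholder source URL
    new Uri("shared/transfers/bigmovie.mp4", UriKind.Relative)); // placeholder target
// Required for files over 100 MB: transfer only on external power and wi-fi.
transferRequest.TransferPreferences = TransferPreferences.None;
BackgroundTransferService.Add(transferRequest);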