How to unzip Self Extracting Zip files in Azure Blob Storage? - c#

I have a self-extracting zip file (.exe) that can be extracted with 7-Zip. To automate the extraction I used the C# code below. It works for normal 7z files, but for this particular self-extracting (.exe) archive it fails with 'Cannot access the closed Stream'. FYI, I confirmed manually that the 7-Zip command-line version can unzip the file.
using (SevenZipExtractor extract = new SevenZipExtractor(zipFileMemoryStream))
{
    foreach (ArchiveFileInfo archiveFileInfo in extract.ArchiveFileData)
    {
        if (!archiveFileInfo.IsDirectory)
        {
            using (var memory = new MemoryStream())
            {
                string shortFileName = Path.GetFileName(archiveFileInfo.FileName);
                extract.ExtractFile(archiveFileInfo.Index, memory);
                byte[] content = memory.ToArray();
                file = new MemoryStream(content);
            }
        }
    }
}
The zip file is in Azure Blob Storage. I don't know how to write the extracted files back to blob storage.

Here is one of the workarounds that has worked for me. Instead of 7Zip I have used ZipArchive.
ZipArchive archive = new ZipArchive(myBlob);
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(destinationStorage);
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = blobClient.GetContainerReference(destinationContainer);
foreach (ZipArchiveEntry entry in archive.Entries)
{
    log.LogInformation($"Now processing {entry.FullName}");
    string valideName = Regex.Replace(entry.Name, @"[^a-zA-Z0-9\-]", "-").ToLower();
    CloudBlockBlob blockBlob = container.GetBlockBlobReference(valideName);
    using (var fileStream = entry.Open())
    {
        await blockBlob.UploadFromStreamAsync(fileStream);
    }
}
REFERENCE:
How to Unzip Automatically your Files with Azure Function v2
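The entry loop above can be tried without a storage account. The following is a minimal sketch, not the answer's original code: it builds an ordinary zip in memory, walks its entries the same way, and copies each one into a dictionary that stands in for the blob container. The names ZipEntryDemo, Sanitize, ExtractAll, and BuildSampleZip are hypothetical. It uses a plain zip only; whether ZipArchive can open a given self-extracting .exe depends on how that file was built, so that case needs its own test.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;
using System.Text.RegularExpressions;

public static class ZipEntryDemo
{
    // Same rule as the answer: every character outside [a-zA-Z0-9-]
    // becomes '-', including the dot before the extension.
    public static string Sanitize(string name) =>
        Regex.Replace(name, @"[^a-zA-Z0-9\-]", "-").ToLower();

    // Reads every file entry of a zip stream into memory.
    // The dictionary is a stand-in for uploads to a blob container.
    public static Dictionary<string, byte[]> ExtractAll(Stream zipStream)
    {
        var result = new Dictionary<string, byte[]>();
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read, leaveOpen: true))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                // Directory entries have an empty Name; skip them.
                if (string.IsNullOrEmpty(entry.Name)) continue;

                using (Stream entryStream = entry.Open())
                using (var buffer = new MemoryStream())
                {
                    entryStream.CopyTo(buffer);
                    result[Sanitize(entry.Name)] = buffer.ToArray();
                }
            }
        }
        return result;
    }

    // Builds a one-entry zip in memory so the sketch has some input.
    public static MemoryStream BuildSampleZip()
    {
        var ms = new MemoryStream();
        using (var archive = new ZipArchive(ms, ZipArchiveMode.Create, leaveOpen: true))
        {
            ZipArchiveEntry entry = archive.CreateEntry("docs/Report 1.TXT");
            byte[] data = Encoding.ASCII.GetBytes("hello");
            using (Stream entryStream = entry.Open())
            {
                entryStream.Write(data, 0, data.Length);
            }
        }
        ms.Position = 0; // rewind so the zip can be read back
        return ms;
    }

    public static void Main()
    {
        using (MemoryStream zip = BuildSampleZip())
        {
            foreach (KeyValuePair<string, byte[]> pair in ExtractAll(zip))
            {
                Console.WriteLine($"{pair.Key}: {pair.Value.Length} bytes");
            }
        }
    }
}
```

Note that the sanitizing rule also replaces the dot, so "Report 1.TXT" is stored as "report-1-txt". If the extension matters downstream, the pattern would need to allow '.' as well.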

Related

Azure Storage Blob - Blob original properties

I am trying to download a DLL from Azure Blob Storage and read its version.
When I download it by hand all the properties are there but when I download it with my code there is nothing:
This is the code I am using:
Stream file = File.OpenWrite(Path.Combine(downloadPath,blobName));
BlobClient blobClient = new BlobClient(connectionString, containerName, blobName);
blobClient.DownloadTo(file);
I tried in my environment and successfully downloaded dll files from azure blob storage.
Code:
using System.IO;
using Azure.Storage.Blobs;

namespace blobdll
{
    class Program
    {
        public static void Main()
        {
            var connectionString = "<connection string>";
            // Full local path, including the file name
            var downloadPath = "<path of folder up to filename>";
            // The using declaration closes the file when Main returns,
            // so the DLL is completely written before anything reads it
            using Stream file = File.OpenWrite(downloadPath);
            BlobClient blobClient = new BlobClient(connectionString, "test", "A2dLib-3.17.dll");
            blobClient.DownloadTo(file);
        }
    }
}
(Screenshots of the console output, the portal view, and the downloaded file appeared here.)
After downloading the file, I checked its size; it is the same size as the original.

How to read Zipped txt file (blob) which locates in Azure container without downloading?

I can read a txt file with this code, but of course it doesn't work when I try to read a txt.gz file.
How can I read a gzipped blob without downloading it? The framework will run in the cloud.
Maybe it is possible to unzip the file into another container? I couldn't find a solution.
public static string GetBlob(string containerName, string fileName)
{
    string connectionString = $"yourConnectionString";
    // Setup the connection to the storage account
    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(connectionString);
    // Connect to the blob storage
    CloudBlobClient serviceClient = storageAccount.CreateCloudBlobClient();
    // Connect to the blob container
    CloudBlobContainer container = serviceClient.GetContainerReference($"{containerName}");
    // Connect to the blob file
    CloudBlockBlob blob = container.GetBlockBlobReference($"{fileName}");
    // Get the blob file as text
    string contents = blob.DownloadTextAsync().Result;
    return contents;
}
You can use GZipStream to decompress your gz file on the fly. The blob is read as a stream, so you don't have to download it to a physical location and decompress it there.
// The method awaits OpenReadAsync, so it must be async and return Task<string>
public static async Task<string> GetBlob(string containerName, string fileName)
{
    string connectionString = $"connectionstring";
    // Setup the connection to the storage account
    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(connectionString);
    // Connect to the blob storage
    CloudBlobClient serviceClient = storageAccount.CreateCloudBlobClient();
    // Connect to the blob container
    CloudBlobContainer container = serviceClient.GetContainerReference($"{containerName}");
    // Connect to the blob file
    CloudBlockBlob blob = container.GetBlockBlobReference($"{fileName}");
    // Open the blob as a stream and decompress it while reading
    using (var gzStream = await blob.OpenReadAsync())
    {
        using (GZipStream decompressionStream = new GZipStream(gzStream, CompressionMode.Decompress))
        {
            using (StreamReader reader = new StreamReader(decompressionStream, Encoding.UTF8))
            {
                return await reader.ReadToEndAsync();
            }
        }
    }
}
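The decompression part of that answer can be checked on its own, without a storage account. The sketch below is an illustration under stated assumptions: a MemoryStream stands in for the stream returned by the blob, and the names GzipDemo, Compress, and ReadGzipText are hypothetical. It gzips a string in memory and then reads it back through the same GZipStream plus StreamReader chain.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipDemo
{
    // Produces gzip bytes for a string, so the sketch has compressed input.
    public static byte[] Compress(string text)
    {
        using (var output = new MemoryStream())
        {
            // The GZipStream must be disposed before the bytes are taken,
            // otherwise the gzip trailer is missing.
            using (var gzip = new GZipStream(output, CompressionMode.Compress, leaveOpen: true))
            {
                byte[] raw = Encoding.UTF8.GetBytes(text);
                gzip.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }

    // Decompresses while reading; nothing is written to disk.
    // 'compressed' can be any readable stream, for example a blob read stream.
    public static string ReadGzipText(Stream compressed)
    {
        using (var gzip = new GZipStream(compressed, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        byte[] gz = Compress("line one\nline two");
        using (var stream = new MemoryStream(gz))
        {
            Console.WriteLine(ReadGzipText(stream));
        }
    }
}
```

The bytes still travel from storage to the process that reads them; what this avoids is a temporary file and a separate decompression step.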
"without downloading, because the framework will work on cloud"
This is not possible. You cannot work with a file on blob storage without downloading it. No matter where your code is running. Of course, if your code is also running on Azure, the download time might be pretty fast, but nevertheless you have to download from blob storage first.
And for your zip file you want to use either DownloadToFileAsync() or DownloadToStreamAsync().

How to download txt azure blobs as plain text files?

I have to download some txt files that are in an Azure container allowing anonymous access. I am working with Visual Studio 2017 and the program is a Windows Forms application.
This is my code (where myUri is the string containing the Uri and myContainer the one for the Container):
BlobServiceClient blobServiceClient = new BlobServiceClient(new Uri(myUri));
BlobContainerClient container = blobServiceClient.GetBlobContainerClient(myContainer);
Azure.Pageable<BlobItem> blobs = container.GetBlobs(BlobTraits.All,BlobStates.All);
foreach (BlobItem blob in blobs)
{
BlobClient bc = container.GetBlobClient(blob.Name);
bc.DownloadTo(new FileStream(path + blob.Name, FileMode.Create));
}
I can see the files in my local path with the correct names, the problem is that if I try to open the .txt(s) with a common editor such as Notepad++ I see encoded chars instead of normal ASCII.
Where is the problem? Can anyone help me?
(too long for comment)
While I am not able to see anything in your code that would cause an encoding issue, you may try the code below to download your blobs. Here I am using the BlobDownloadInfo class to get an idea of the content type of what is being downloaded, and its Content.CopyTo method to write to the stream.
Azure.Pageable<BlobItem> blobs = container.GetBlobs(BlobTraits.All, BlobStates.All);
foreach (BlobItem blob in blobs)
{
    BlobClient blobClient = container.GetBlobClient(blob.Name);
    BlobDownloadInfo download = blobClient.Download();
    Console.WriteLine("Content Type " + download.ContentType);
    using (FileStream downloadFileStream = File.OpenWrite(Path.Combine(@"YourPath", blob.Name)))
    {
        download.Content.CopyTo(downloadFileStream);
        downloadFileStream.Close();
    }
}

Upload a zip file in small chunks to azure cloud blob storage

I want to upload a zip file to a blob container in Microsoft Azure Storage in small chunks (less than 5 MB). I already configured a 4 MB chunk limit in BlobRequestOptions, but when I run my code and check the memory usage in Azure, it is not uploading in chunks. I am using C# on .NET Core. The files I want to zip are already located in Azure, so I first download each file to a stream, add the stream to a zip archive, and then upload the zip back to the cloud. The following is my code:
if (CloudStorageAccount.TryParse(_Appsettings.GetSection("StorConf").GetSection("StorageConnection").Value, out CloudStorageAccount storageAccount))
{
    CloudBlobClient BlobClient = storageAccount.CreateCloudBlobClient();
    TimeSpan backOffPeriod = TimeSpan.FromSeconds(2);
    int retryCount = 1;
    BlobRequestOptions bro = new BlobRequestOptions()
    {
        SingleBlobUploadThresholdInBytes = 4096 * 1024, // 4MB
        ParallelOperationThreadCount = 1,
        RetryPolicy = new ExponentialRetry(backOffPeriod, retryCount),
        // new
        ServerTimeout = TimeSpan.MaxValue,
        MaximumExecutionTime = TimeSpan.FromHours(3),
        //EncryptionPolicy = policy
    };
    // set blob request option for created blob client
    BlobClient.DefaultRequestOptions = bro;
    // using specified container which comes via transaction id
    CloudBlobContainer container = BlobClient.GetContainerReference(transactionId);
    using (var zipArchiveMemoryStream = new MemoryStream())
    {
        using (var zipArchive = new ZipArchive(zipArchiveMemoryStream, ZipArchiveMode.Create, true)) // new
        {
            foreach (FilesListModel FileName in filesList)
            {
                if (await container.ExistsAsync())
                {
                    CloudBlob file = container.GetBlobReference(FileName.FileName);
                    if (await file.ExistsAsync())
                    {
                        // zip: get stream and add zip entry
                        var entry = zipArchive.CreateEntry(FileName.FileName, CompressionLevel.Fastest);
                        // approach 1
                        using (var entryStream = entry.Open())
                        {
                            await file.DownloadToStreamAsync(entryStream, null, bro, null);
                            await entryStream.FlushAsync();
                            entryStream.Close();
                        }
                    }
                    else
                    {
                        downlReady = "false";
                    }
                }
                else
                {
                    // case: Container does not exist
                    //return BadRequest("Container does not exist");
                }
            }
        }
        if (downlReady == "true")
        {
            string zipFileName = "sample.zip";
            CloudBlockBlob zipBlockBlob = container.GetBlockBlobReference(zipFileName);
            zipArchiveMemoryStream.Position = 0;
            //zipArchiveMemoryStream.Seek(0, SeekOrigin.Begin);
            // new
            zipBlockBlob.Properties.ContentType = "application/x-zip-compressed";
            await zipArchiveMemoryStream.FlushAsync();
            await zipBlockBlob.UploadFromStreamAsync(zipArchiveMemoryStream, zipArchiveMemoryStream.Length, null, bro, null);
        }
        zipArchiveMemoryStream.Close();
    }
}
The following is a snapshot of the memory usage (see private_Memory) in the Azure Kudu process explorer:
(screenshot: memory usage)
Any suggestions would be really helpful. Thank you.
UPDATE 1:
To make it clearer: I have files that are already located in Azure Blob Storage. I want to read them from the container and create a zip that contains all of them. The major challenge is that my code obviously loads all files into memory to create the zip. Is it possible, and if so how, to read files from a container and write the zip file back into the same container in parallel or in pieces, so that my Azure web app does NOT need to load the whole files into memory? Ideally I would read the files in pieces and start writing the zip at the same time, so that my Azure web app consumes less memory.
I found the solution by referring to this Stack Overflow question:
How can I dynamically add files to a zip archive stored in Azure blob storage?
The way to do it is to write to the zip stream while reading/downloading the input files, instead of building the whole archive in memory first.
Below is my code snippet:
// Despite its name, this is the blob's write stream, not a MemoryStream
using (var zipArchiveMemoryStream = await zipBlockBlob.OpenWriteAsync(null, bro, null))
using (var zipArchive = new ZipArchive(zipArchiveMemoryStream, ZipArchiveMode.Create))
{
    foreach (FilesListModel FileName in filesList)
    {
        if (await container.ExistsAsync())
        {
            CloudBlob file = container.GetBlobReference(FileName.FileName);
            if (await file.ExistsAsync())
            {
                // zip: get stream and add zip entry
                var entry = zipArchive.CreateEntry(FileName.FileName, CompressionLevel.Fastest);
                // approach 1
                using (var entryStream = entry.Open())
                {
                    await file.DownloadToStreamAsync(entryStream, null, bro, null);
                }
            }
        }
    }
    // No explicit Close() calls: the using blocks dispose the archive first
    // (which writes the zip central directory) and then the blob stream
}
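The key property the snippet above relies on is that ZipArchive in Create mode only needs to write forward, so entries can be pushed to the output as they are read. The sketch below illustrates that without any Azure dependency. It is an illustration under stated assumptions: ForwardOnlyStream is a hypothetical write-only, non-seekable wrapper standing in for an upload stream, in-memory streams stand in for the downloaded blobs, and StreamZipDemo and WriteZip are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Write-only, non-seekable wrapper: a stand-in for an output stream
// that can only be appended to, such as an upload in progress.
public sealed class ForwardOnlyStream : Stream
{
    private readonly Stream _inner;
    public ForwardOnlyStream(Stream inner) { _inner = inner; }
    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() => _inner.Flush();
    public override int Read(byte[] buffer, int offset, int count) =>
        throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) =>
        throw new NotSupportedException();
    public override void SetLength(long value) =>
        throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) =>
        _inner.Write(buffer, offset, count);
}

public static class StreamZipDemo
{
    // Copies each input into its own zip entry, one after another.
    // The archive is disposed before returning, which writes the
    // central directory; the caller keeps ownership of 'output'.
    public static void WriteZip(Stream output, IEnumerable<KeyValuePair<string, Stream>> inputs)
    {
        using (var archive = new ZipArchive(output, ZipArchiveMode.Create, leaveOpen: true))
        {
            foreach (KeyValuePair<string, Stream> input in inputs)
            {
                ZipArchiveEntry entry = archive.CreateEntry(input.Key, CompressionLevel.Fastest);
                using (Stream entryStream = entry.Open())
                {
                    input.Value.CopyTo(entryStream); // copied in buffered chunks
                }
            }
        }
    }

    public static void Main()
    {
        var inputs = new List<KeyValuePair<string, Stream>>
        {
            new KeyValuePair<string, Stream>("a.txt", new MemoryStream(new byte[] { 1, 2, 3 })),
            new KeyValuePair<string, Stream>("b.txt", new MemoryStream(new byte[] { 4, 5 })),
        };

        var sink = new MemoryStream();
        WriteZip(new ForwardOnlyStream(sink), inputs);

        // Read the result back to confirm it is a valid zip.
        sink.Position = 0;
        using (var check = new ZipArchive(sink, ZipArchiveMode.Read))
        {
            foreach (ZipArchiveEntry entry in check.Entries)
            {
                Console.WriteLine(entry.FullName);
            }
        }
    }
}
```

How much memory this saves in practice depends on how the real output stream buffers its writes, which this sketch does not model.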

How to download Azure Blobs by referencing the file?

I want to download files from Azure using C#, stream them into a MemoryStream, and then return them to the user in the front end with a link (the Azure URI that points to the blob), so the user can view the PDF files in the browser or download them. There are multiple blobs/files in Azure, so I want to loop through each file and download it to a stream, for example with a foreach.
I'm not sure how to reference those blobs in CloudBlockBlob blockBlob = container.GetBlockBlobReference(fileName); there I could give the name of one specific file, but I have multiple files, so I'm not sure what should go in "fileName".
Code:
var files = container.ListBlobs();
foreach (var file in files)
{
using (var memoryStream = new MemoryStream())
{
CloudBlockBlob blockBlob = container.GetBlockBlobReference(fileName);
blockBlob.DownloadToStream(memoryStream);
}
}
I'm not sure whether I'm looping correctly in this code and downloading every blob.
Also, I tried replacing fileName with file.Uri.Segments.Last(),
which I guess gets the name of each blob.
The problem I'm having is that this foreach gives me just one PDF file whenever I try to use the links in the front end. So, I need to know how can I properly loop through each file and download them?
"So, I need to know how can I properly loop through each file and download them?"
We can't download multiple files from memory directly. If a zip file is acceptable, you could use a compressed file such as a zip to transfer multiple files instead. The following is my demo code; it works correctly on my side.
using (var ms = new MemoryStream())
{
    // The third argument (leaveOpen: true) keeps ms usable after the archive is disposed
    using (var zipArchive = new ZipArchive(ms, ZipArchiveMode.Create, true))
    {
        foreach (var file in files)
        {
            // Skip directories and anything that is not a block blob
            if (!(file is CloudBlockBlob blob)) continue;
            var entry = zipArchive.CreateEntry(blob.Name, CompressionLevel.Fastest);
            using (var entryStream = entry.Open())
            {
                // Download the blob straight into its zip entry
                blob.DownloadToStream(entryStream);
            }
        }
    }
    // ms now holds the complete zip; set ms.Position = 0 before reading it back
}
