Azure storage not finding csv file - c#

I am trying to read a CSV file from my Azure storage account, convert each line into an object, and build a list of those objects.
It keeps failing because it can't find the file (blob not found). The file is there, and it is a CSV file.
Error:
StorageException: The specified blob does not exist.
BatlGroup.Site.Services.AzureStorageService.AzureFileMethods.ReadCsvFileFromBlobAsync(CloudBlobContainer container, string fileName) in AzureFileMethods.cs
await blob.DownloadToStreamAsync(memoryStream);
public async Task<Stream> ReadCsvFileFromBlobAsync(CloudBlobContainer container, string fileName)
{
    // Retrieve reference to a blob (fileName)
    var blob = container.GetBlockBlobReference(fileName);
    using (var memoryStream = new MemoryStream())
    {
        // Downloads the blob's content to a stream
        await blob.DownloadToStreamAsync(memoryStream);
        return memoryStream;
    }
}
I've made sure the file is public. I can download any text file stored there, but none of the CSV files.
I am also not sure what format to read it in, as I need to iterate through the lines.
I see examples that download the whole file to a temp drive and work with it there, but that seems unproductive: I could then just store the file in the wwwroot folder instead of Azure.
What is the most appropriate way to read a CSV file from Azure storage?

Regarding how to iterate through the lines: after you fill the memory stream, you can use a StreamReader to read it line by line.
Sample code:
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;
using System;
using System.IO;

namespace ConsoleApp17
{
    class Program
    {
        static void Main(string[] args)
        {
            string connstr = "your connection string";
            CloudStorageAccount storageAccount = CloudStorageAccount.Parse(connstr);
            CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
            CloudBlobContainer container = blobClient.GetContainerReference("t11");
            CloudBlockBlob blockBlob = container.GetBlockBlobReference("students.csv");
            string text = "";
            string temp = "";
            using (var memoryStream = new MemoryStream())
            {
                blockBlob.DownloadToStream(memoryStream);
                // Remember to set the position back to 0 before reading
                memoryStream.Position = 0;
                using (var reader = new StreamReader(memoryStream))
                {
                    // Read the CSV file line by line (stops at the first empty line)
                    while (!reader.EndOfStream && !string.IsNullOrEmpty(temp = reader.ReadLine()))
                    {
                        text = text + "***" + temp;
                    }
                }
            }
            Console.WriteLine(text);
            Console.WriteLine("-------");
            Console.ReadLine();
        }
    }
}
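The loop above joins the lines into one string. Since the question asks for a list of objects, the same StreamReader loop can be sketched as a parser. This is only an illustration: the Student type and its name,age column layout are assumptions (the real columns are not shown), and a plain Split(',') does not handle quoted fields.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical record type; the actual CSV columns are not shown in the question.
public sealed class Student
{
    public string Name { get; set; } = "";
    public int Age { get; set; }
}

public static class CsvSketch
{
    // Reads "name,age" lines from any readable stream (for example the
    // MemoryStream filled by DownloadToStream, rewound to position 0).
    public static List<Student> ParseStudents(Stream csv, bool hasHeader = true)
    {
        var result = new List<Student>();
        // leaveOpen: true so the caller stays in charge of disposing the stream.
        using (var reader = new StreamReader(csv, Encoding.UTF8, true, 1024, leaveOpen: true))
        {
            string line;
            bool first = true;
            while ((line = reader.ReadLine()) != null)
            {
                if (first)
                {
                    first = false;
                    if (hasHeader) continue; // skip the header row
                }
                if (line.Trim().Length == 0) continue; // skip blank lines
                string[] parts = line.Split(',');
                if (parts.Length < 2) continue; // not enough columns
                if (!int.TryParse(parts[1].Trim(), out int age)) continue; // bad number
                result.Add(new Student { Name = parts[0].Trim(), Age = age });
            }
        }
        return result;
    }
}
```

The parser expects a stream that is still open. A MemoryStream returned from inside a using block, as in the question's ReadCsvFileFromBlobAsync, has already been disposed by the time the caller receives it, so the parsing would have to happen inside that block, or the method would have to return the parsed list instead of the stream.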
My csv file:
The test result:

Related

Azure Storage Blob - Blob original properties

I am trying to download a DLL from Azure Blob Storage and read its version.
When I download it by hand all the properties are there but when I download it with my code there is nothing:
This is the code I am using:
Stream file = File.OpenWrite(Path.Combine(downloadPath,blobName));
BlobClient blobClient = new BlobClient(connectionString, containerName, blobName);
blobClient.DownloadTo(file);
I tried in my environment and successfully downloaded dll files from azure blob storage.
Code:
using System.IO;
using Azure.Storage.Blobs;

namespace blobdll
{
    class program
    {
        public static void Main()
        {
            var connectionString = "<connection string>";
            var downloadPath = "<path of folder up to filename>";
            // The using declaration disposes (and flushes) the file stream when Main exits
            using Stream file = File.OpenWrite(downloadPath);
            BlobClient blobClient = new BlobClient(connectionString, "test", "A2dLib-3.17.dll");
            blobClient.DownloadTo(file);
        }
    }
}
Console:
Portal:
Downloaded file:
After downloading the file, I checked its size: it is the same size as the original.

How to unzip Self Extracting Zip files in Azure Blob Storage?

I have a self-extracting zip file (.exe) that can be extracted using 7-Zip. To automate the extraction, I used the C# code below. It works for normal 7z files, but when I try to extract this particular self-extracting (.exe) file I get the error 'Cannot access the closed Stream'. FYI, I manually confirmed that the 7-Zip command-line version can unzip the file.
using (SevenZipExtractor extract = new SevenZipExtractor(zipFileMemoryStream))
{
    foreach (ArchiveFileInfo archiveFileInfo in extract.ArchiveFileData)
    {
        if (!archiveFileInfo.IsDirectory)
        {
            using (var memory = new MemoryStream())
            {
                string shortFileName = Path.GetFileName(archiveFileInfo.FileName);
                extract.ExtractFile(archiveFileInfo.Index, memory);
                byte[] content = memory.ToArray();
                file = new MemoryStream(content);
            }
        }
    }
}
The zip file is in Azure Blob Storage. I don't know how to get the extracted files into blob storage.
Here is one of the workarounds that has worked for me. Instead of 7Zip I have used ZipArchive.
ZipArchive archive = new ZipArchive(myBlob);
CloudStorageAccount storageAccount = CloudStorageAccount.Parse(destinationStorage);
CloudBlobClient blobClient = storageAccount.CreateCloudBlobClient();
CloudBlobContainer container = blobClient.GetContainerReference(destinationContainer);
foreach (ZipArchiveEntry entry in archive.Entries)
{
    log.LogInformation($"Now processing {entry.FullName}");
    string valideName = Regex.Replace(entry.Name, @"[^a-zA-Z0-9\-]", "-").ToLower();
    CloudBlockBlob blockBlob = container.GetBlockBlobReference(valideName);
    using (var fileStream = entry.Open())
    {
        await blockBlob.UploadFromStreamAsync(fileStream);
    }
}
REFERENCE:
How to Unzip Automatically your Files with Azure Function v2
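One detail of the name-sanitizing regex in the loop above: the character class also excludes the dot, so file extensions are rewritten too. A small sketch of that behavior follows. Sanitize uses the same pattern as the answer, while SanitizeKeepExtension is a hypothetical variant (not part of the answer) for cases where the extension should survive.

```csharp
using System.IO;
using System.Text.RegularExpressions;

public static class BlobNameSketch
{
    // Same pattern as the answer: every character that is not a letter,
    // digit, or hyphen becomes "-", then the result is lower-cased.
    public static string Sanitize(string entryName) =>
        Regex.Replace(entryName, @"[^a-zA-Z0-9\-]", "-").ToLower();

    // Hypothetical variant: sanitize only the base name and keep the extension.
    public static string SanitizeKeepExtension(string entryName)
    {
        string extension = Path.GetExtension(entryName).ToLower();
        string stem = Path.GetFileNameWithoutExtension(entryName);
        return Sanitize(stem) + extension;
    }
}
```

For an entry named "Report 2021.PDF", Sanitize returns "report-2021-pdf" and SanitizeKeepExtension returns "report-2021.pdf". Either way, two different entries can map to the same blob name, in which case the later upload would replace the earlier one.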

Extract embedded files from azure blob in c#

I have embedded PDF files stored inside a blob file. I want to extract those files from my blob.
Below is what I have done so far:
I have created an HTTP trigger function app,
established a connection with the storage container,
and am able to fetch the blob.
To get the embedded files I am using the following code:
namespace PDFDownloader
{
    public static class Function1
    {
        [FunctionName("Function1")]
        public static async Task<IActionResult> Run(
            [HttpTrigger(AuthorizationLevel.Function, "get", "post", Route = null)] HttpRequest req,
            ILogger log)
        {
            log.LogInformation($"GetVolumeData function executed at: {DateTime.Now}");
            try
            {
                CloudStorageAccount storageAccount = CloudStorageAccount.Parse(Parameter.ConnectionString);
                CloudBlobClient cloudBlobClient = storageAccount.CreateCloudBlobClient();
                CloudBlobContainer cloudcontainer = cloudBlobClient.GetContainerReference(Parameter.SuccessContainer);
                BlobResultSegment resultSegment = await cloudcontainer.ListBlobsSegmentedAsync(currentToken: null);
                IEnumerable<IListBlobItem> blobItems = resultSegment.Results;
                string response = "";
                int count = 0;
                //string blobName = "";
                foreach (IListBlobItem item in blobItems)
                {
                    var type = item.GetType();
                    if (type == typeof(CloudBlockBlob))
                    {
                        CloudBlockBlob blob = (CloudBlockBlob)item;
                        count++;
                        var blobname = blob.Name;
                        // response = blobname;
                        response = blob.DownloadTextAsync().Result;
                        //response = blob.DownloadToStream().Result;
                    }
                }
                if (count == 0)
                {
                    return new OkObjectResult("Error : File Not Found !!");
                }
                else
                {
                    return new OkObjectResult(Convert.ToString(response));
                }
            }
            catch (Exception ex)
            {
                log.LogError($" Function Exception Message: {ex.Message}");
                return new OkObjectResult(ex.Message.ToString());
            }
            finally
            {
                log.LogInformation($"Function- ENDED ON : {DateTime.Now}");
            }
        }
    }
}
How can I read the embedded files from my blob file response and send them over HTTP?
Apart from the fact that your code needs quite some cleanup and that you should read up on the proper use of async, I believe your actual issue is here:
FileStream inputStream = new FileStream(response, FileMode.Open);
The response object contains the text content of the blob that you downloaded earlier. The FileStream constructor, however, expects a path to a file. Since you do not have a file here, FileStream is not the right thing to use. Either download the blob as a temp file or even directly as a string.
Plus, do yourself a favor and switch to the latest version of the Storage Blob SDK (https://github.com/Azure/azure-sdk-for-net/tree/master/sdk/storage/Azure.Storage.Blobs#downloading-a-blob).
Think of one blob as the equivalent of one file (check it here).
Check the line:
response = Convert.ToString(blob.DownloadTextAsync().Result);
Is the content of your blob a valid file path?
Maybe you are not using the FileStream constructor public FileStream(string path, System.IO.FileMode mode) correctly. This constructor can throw many different exceptions; try to find out which one you are getting.
Also, as recommended in a previous answer, it is worth using the Azure.Storage.Blobs package (SDK version 12); you are currently using SDK version 11 (Microsoft.Azure.Storage.Blob).
using Bytescout.PDFExtractor;

// Read your blob like this
var stream1 = await blob.OpenReadAsync();
AttachmentExtractor extractor = new AttachmentExtractor();
extractor.RegistrationName = "demo";
extractor.RegistrationKey = "demo";
// Load sample PDF document
extractor.LoadDocumentFromFile(stream1);
for (int i = 0; i < extractor.Count; i++)
{
    Console.WriteLine("Saving attachment: " + extractor.GetFileName(i));
    // Save attachment to file
    extractor.Save(i, extractor.GetFileName(i));
    Console.WriteLine("File size: " + extractor.GetSize(i));
}
extractor.Dispose();

Add JSON string directly to Azure Blob Storage Container using C#

I am trying to upload a JSON string (serialized with Newtonsoft.Json) without creating a temporary file.
I serialize the object at runtime using JsonConvert.SerializeObject(obj, settings), which returns a string.
Following the Microsoft documentation, I could do it as illustrated below:
// Create a local file in the ./data/ directory for uploading and downloading
string localPath = "./data/";
string fileName = "quickstart" + Guid.NewGuid().ToString() + ".txt";
string localFilePath = Path.Combine(localPath, fileName);
// Write text to the file
await File.WriteAllTextAsync(localFilePath, "Hello, World!");
// Get a reference to a blob
BlobClient blobClient = containerClient.GetBlobClient(fileName);
Console.WriteLine("Uploading to Blob storage as blob:\n\t {0}\n", blobClient.Uri);
// Open the file and upload its data
using FileStream uploadFileStream = File.OpenRead(localFilePath);
await blobClient.UploadAsync(uploadFileStream, true);
uploadFileStream.Close();
Although it works, I would have to create a temporary file for each uploaded JSON file.
I tried this:
BlobServiceClient blobServiceClient = new BlobServiceClient("SECRET");
BlobContainerClient container = blobServiceClient.GetBlobContainerClient("CONTAINER_NAME");
container.CreateIfNotExistsAsync().Wait();
container.SetAccessPolicy(Azure.Storage.Blobs.Models.PublicAccessType.Blob);
CloudBlockBlob cloudBlockBlob = new CloudBlockBlob(container.Uri);
var jsonToUpload = JsonConvert.SerializeObject(persons, settings);
cloudBlockBlob.UploadTextAsync(jsonToUpload).Wait();
But, well... it cannot work, as I am not specifying any actual file in the given container (I don't know where to do that).
Is there any way to upload a blob directly to a given container?
Thank you in advance.
The BlobClient class wants a Stream, so you can create a MemoryStream from your JSON string.
Try something like this:
BlobClient blob = container.GetBlobClient("YourBlobName");
using (MemoryStream ms = new MemoryStream(Encoding.UTF8.GetBytes(jsonToUpload)))
{
    await blob.UploadAsync(ms);
}
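The string-to-stream step can be isolated from the SDK, which makes it easy to check on its own. The sketch below uses only the base class library. The upload delegate is a hypothetical stand-in for the real call, for example stream => blob.UploadAsync(stream, overwrite: true), and the helper names are illustrative.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class JsonUploadSketch
{
    // Wraps a JSON string in a read-only UTF-8 MemoryStream positioned at 0.
    public static MemoryStream ToUtf8Stream(string json) =>
        new MemoryStream(Encoding.UTF8.GetBytes(json), writable: false);

    // 'upload' is a hypothetical stand-in for the SDK call that consumes the stream.
    public static async Task UploadJsonAsync(string json, Func<Stream, Task> upload)
    {
        using (MemoryStream ms = ToUtf8Stream(json))
        {
            await upload(ms); // no temporary file is created at any point
        }
    }
}
```

With the v12 client, UploadAsync(stream) without the overwrite argument fails if the blob already exists, so passing overwrite: true is worth considering when the same blob name is reused.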

Upload a zip file in small chunks to azure cloud blob storage

I want to upload a zip file in small chunks (less than 5 MB) to blob containers in Microsoft Azure Storage. I already configured a 4 MB chunk limit in BlobRequestOptions, but when I run my code and check the memory usage in Azure, it's not uploading in chunks. I am using C# .NET Core. Because I want to zip files that are already located in Azure, I first download the individual files to a stream, add the stream to a zip archive, and then upload the zip back to the cloud. The following is my code:
if (CloudStorageAccount.TryParse(_Appsettings.GetSection("StorConf").GetSection("StorageConnection").Value, out CloudStorageAccount storageAccount))
{
    CloudBlobClient BlobClient = storageAccount.CreateCloudBlobClient();
    TimeSpan backOffPeriod = TimeSpan.FromSeconds(2);
    int retryCount = 1;
    BlobRequestOptions bro = new BlobRequestOptions()
    {
        SingleBlobUploadThresholdInBytes = 4096 * 1024, // 4MB
        ParallelOperationThreadCount = 1,
        RetryPolicy = new ExponentialRetry(backOffPeriod, retryCount),
        // new
        ServerTimeout = TimeSpan.MaxValue,
        MaximumExecutionTime = TimeSpan.FromHours(3),
        //EncryptionPolicy = policy
    };
    // set blob request option for created blob client
    BlobClient.DefaultRequestOptions = bro;
    // using specified container which comes via transaction id
    CloudBlobContainer container = BlobClient.GetContainerReference(transaction id);
    using (var zipArchiveMemoryStream = new MemoryStream())
    {
        using (var zipArchive = new ZipArchive(zipArchiveMemoryStream, ZipArchiveMode.Create, true)) // new
        {
            foreach (FilesListModel FileName in filesList)
            {
                if (await container.ExistsAsync())
                {
                    CloudBlob file = container.GetBlobReference(FileName.FileName);
                    if (await file.ExistsAsync())
                    {
                        // zip: get stream and add zip entry
                        var entry = zipArchive.CreateEntry(FileName.FileName, CompressionLevel.Fastest);
                        // approach 1
                        using (var entryStream = entry.Open())
                        {
                            await file.DownloadToStreamAsync(entryStream, null, bro, null);
                            await entryStream.FlushAsync();
                            entryStream.Close();
                        }
                    }
                    else
                    {
                        downlReady = "false";
                    }
                }
                else
                {
                    // case: Container does not exist
                    //return BadRequest("Container does not exist");
                }
            }
        }
        if (downlReady == "true")
        {
            string zipFileName = "sample.zip";
            CloudBlockBlob zipBlockBlob = container.GetBlockBlobReference(zipFileName);
            zipArchiveMemoryStream.Position = 0;
            //zipArchiveMemoryStream.Seek(0, SeekOrigin.Begin);
            // new
            zipBlockBlob.Properties.ContentType = "application/x-zip-compressed";
            await zipArchiveMemoryStream.FlushAsync();
            await zipBlockBlob.UploadFromStreamAsync(zipArchiveMemoryStream, zipArchiveMemoryStream.Length, null, bro, null);
        }
        zipArchiveMemoryStream.Close();
    }
}
The following is a snapshot of the memory usage (see private_Memory) in azure cloud kudu process explorer:
Any suggestions would be really helpful. Thank you.
UPDATE 1:
To make it clearer: I have files that are already located in Azure Blob Storage. Now I want to read the files from the container and create a ZIP that contains all of them. The major challenge is that my code is obviously loading all files into memory to create the zip. Is it possible, and if so how, to read files from a container and write the ZIP file back into the same container in parallel/in pieces, so that my Azure web app does NOT need to load the whole files into memory? Ideally I would read the files in pieces and start writing the zip right away, so that my Azure web app consumes less memory.
I found the solution by referring to this Stack Overflow question:
How can I dynamically add files to a zip archive stored in Azure blob storage?
The way to do it is to write to the zip stream while reading/downloading the input files.
Below is my code snippet:
using (var zipArchiveMemoryStream = await zipBlockBlob.OpenWriteAsync(null, bro, null))
using (var zipArchive = new ZipArchive(zipArchiveMemoryStream, ZipArchiveMode.Create))
{
    foreach (FilesListModel FileName in filesList)
    {
        if (await container.ExistsAsync())
        {
            CloudBlob file = container.GetBlobReference(FileName.FileName);
            if (await file.ExistsAsync())
            {
                // zip: get stream and add zip entry
                var entry = zipArchive.CreateEntry(FileName.FileName, CompressionLevel.Fastest);
                // approach 1
                using (var entryStream = entry.Open())
                {
                    await file.DownloadToStreamAsync(entryStream, null, bro, null);
                    entryStream.Close();
                }
            }
        }
    }
    // No explicit Close() on the blob stream here: the using statements dispose
    // the archive first (which writes the zip's central directory) and only then
    // close the underlying blob stream.
}
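The streaming pattern itself does not depend on the storage SDK, so it can be sketched and checked with plain streams. In the sketch below, output stands in for the stream returned by zipBlockBlob.OpenWriteAsync and each input stream stands in for a blob download; the class and method names are illustrative assumptions.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

public static class ZipStreamingSketch
{
    // Writes one zip entry per input directly to 'output', so the finished
    // archive never has to sit in an intermediate MemoryStream.
    public static async Task WriteZipAsync(
        Stream output, IEnumerable<KeyValuePair<string, Stream>> inputs)
    {
        // ZipArchiveMode.Create writes forward-only, so 'output' does not need to
        // be seekable. leaveOpen: true leaves closing 'output' to the caller.
        using (var archive = new ZipArchive(output, ZipArchiveMode.Create, leaveOpen: true))
        {
            foreach (KeyValuePair<string, Stream> input in inputs)
            {
                ZipArchiveEntry entry = archive.CreateEntry(input.Key, CompressionLevel.Fastest);
                using (Stream entryStream = entry.Open())
                {
                    // CopyToAsync moves the data through a fixed-size buffer.
                    await input.Value.CopyToAsync(entryStream);
                }
            }
        } // disposing the archive writes the central directory to 'output'
    }
}
```

Only one entry can be open for writing at a time in Create mode, so the inputs are processed sequentially. Memory use is then governed by the copy buffer and by whatever buffering the output stream does, rather than by the total size of the files.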
