I'm uploading files to Azure Blob Storage with the .NET package, specifying the encoding iso-8859-1. The stream looks fine in memory, but once uploaded to Blob Storage the file ends up with corrupted characters, as if it could not be converted to that encoding. It seems the file gets stored in a corrupted state, and when I download it again and inspect it, the characters are all messed up. Here is the code I'm using:
public static async Task<bool> UploadFileFromStream(this CloudStorageAccount account, string containerName, string destBlobPath, string fileName, Stream stream, Encoding encoding)
{
    if (account is null) throw new ArgumentNullException(nameof(account));
    if (string.IsNullOrEmpty(containerName)) throw new ArgumentException("Container name cannot be null or empty.", nameof(containerName));
    if (string.IsNullOrEmpty(destBlobPath)) throw new ArgumentException("Destination blob path cannot be null or empty.", nameof(destBlobPath));
    if (stream is null) throw new ArgumentNullException(nameof(stream));

    stream.Position = 0;
    CloudBlockBlob blob = GetBlob(account, containerName, $"{destBlobPath}/{fileName}");
    blob.Properties.ContentType = FileUtils.GetFileContentType(fileName);

    // Read the incoming stream with the caller-supplied encoding...
    using var reader = new StreamReader(stream, encoding);
    var ct = await reader.ReadToEndAsync();

    // ...and upload the text, falling back to UTF-8 if no encoding was given.
    await blob.UploadTextAsync(ct, encoding ?? Encoding.UTF8, AccessCondition.GenerateEmptyCondition(), new BlobRequestOptions(), new OperationContext());
    return true;
}
This is the file just before uploading it:
<provinciaDatosInmueble>Sevilla</provinciaDatosInmueble>
<inePoblacionDatosInmueble>969</inePoblacionDatosInmueble>
<poblacionDatosInmueble>Valencina de la Concepción</poblacionDatosInmueble>
And this is the file after the upload:
<provinciaDatosInmueble>Sevilla</provinciaDatosInmueble>
<inePoblacionDatosInmueble>969</inePoblacionDatosInmueble>
<poblacionDatosInmueble>Valencina de la Concepci�n</poblacionDatosInmueble>
The encoding I pass in the parameter is ISO-8859-1. The � is the Unicode replacement character, which suggests the stored bytes can no longer be decoded with the expected encoding. Does anybody know why Blob Storage seems to ignore the encoding I'm specifying? Thanks in advance!
We were able to achieve this using Azure.Storage.Blobs instead of WindowsAzure.Storage, which is a legacy Storage SDK. Below is the code that worked for us.
class Program
{
    static async Task Main(string[] args)
    {
        string sourceContainerName = "<Source_Container_Name>";
        string destBlobPath = "<Destination_Path>";
        string fileName = "<Source_File_name>";

        BlobServiceClient blobServiceClient = new BlobServiceClient("<Your_Connection_String>");
        BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient(sourceContainerName);
        BlobClient blobClientSource = containerClient.GetBlobClient(fileName);
        BlobClient blobClientDestination = containerClient.GetBlobClient(destBlobPath);

        // Reading from the source blob
        var line = "";
        if (await blobClientSource.ExistsAsync())
        {
            var response = await blobClientSource.DownloadAsync();
            using (StreamReader streamReader = new StreamReader(response.Value.Content))
            {
                line = await streamReader.ReadToEndAsync();
            }
        }

        // Writing to the destination blob
        var content = Encoding.UTF8.GetBytes(line);
        using (var ms = new MemoryStream(content))
            await blobClientDestination.UploadAsync(ms);
    }
}
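Note that the StreamReader above decodes with its UTF-8 default. If the source blob was written as ISO-8859-1 (as in the question), a variation is to decode and re-encode with that encoding explicitly; this is a minimal sketch of the lines that would change, assuming the rest of the program above:

// Assumption: the source blob is ISO-8859-1; decode and re-encode with the same encoding.
var latin1 = Encoding.GetEncoding("ISO-8859-1");
using (StreamReader streamReader = new StreamReader(response.Value.Content, latin1))
{
    line = await streamReader.ReadToEndAsync();
}
var content = latin1.GetBytes(line);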
Context: Encrypt and decrypt an audio file (.wav) in Azure Storage.
Issue: "inputStream must be seek-able" (when encrypting) on the call await pgp.EncryptStreamAsync(sourceStream, outputStream);
I'm not a C# developer :)
Thank you for your help.
Here is the code I'm using:
static async Task Main()
{
    string connectionString = Environment.GetEnvironmentVariable("AZURE_STORAGE_CONNECTION_STRING");

    // Create a unique name for the container
    string containerName = "audioinput";
    string filename = "abc.wav";

    // Create a BlobServiceClient object which will be used to create a container client
    BlobServiceClient blobServiceClient = new BlobServiceClient(connectionString);
    BlobContainerClient sourcecontainer = blobServiceClient.GetBlobContainerClient(containerName);
    BlobClient blobClient = sourcecontainer.GetBlobClient(filename);

    if (sourcecontainer.Exists())
    {
        var sourceStream = new MemoryStream();

        // Download blob to MemoryStream
        await blobClient.DownloadToAsync(sourceStream);
        sourceStream.Position = 0;

        // Output stream
        await using var outputStream = new MemoryStream();

        // Get encryption keys
        EncryptionKeys encryptionKeys;
        using (Stream publicKeyStream = new FileStream(@"...\public.asc", FileMode.Open))
            encryptionKeys = new EncryptionKeys(publicKeyStream);

        PGP pgp = new PGP(encryptionKeys);
        await pgp.EncryptStreamAsync(sourceStream, outputStream);
    }
    else
    {
        Console.WriteLine("container doesn't exist");
    }
}
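DownloadToAsync into a MemoryStream (as above) yields a seekable stream, and rewinding it with sourceStream.Position = 0 is what typically resolves the "inputStream must be seek-able" error from PgpCore. To complete the round trip, the encrypted outputStream can be rewound and uploaded back; a minimal sketch, where the "audiooutput" container and the ".pgp" blob name are hypothetical:

// Hypothetical follow-up inside the if-block above: persist the encrypted stream.
outputStream.Position = 0; // rewind before uploading
BlobContainerClient destContainer = blobServiceClient.GetBlobContainerClient("audiooutput"); // hypothetical container
BlobClient encryptedBlob = destContainer.GetBlobClient(filename + ".pgp"); // hypothetical name
await encryptedBlob.UploadAsync(outputStream, overwrite: true);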
I am trying to take a file, split it into pieces, and push each smaller piece to Azure. I have tried writing a MemoryStream to Azure, but that causes the file to upload immediately, and the uploaded file is basically empty. I have tried a BufferedStream, which allows data to be sent as I write to it, but I am not sure how to end the stream. Closing the various streams does not work either; it results in a stream-closed exception. Any idea how to mark the stream as complete so the Azure library knows to finish the file upload?
It does work to wait until the full file is built and then upload the memory stream, but I would like to be able to write to it while it is uploading, if possible.
CloudBlobClient blobClient = StorageAccount.CreateCloudBlobClient();
CloudBlobContainer blobContainer = blobClient.GetContainerReference("containerName");

using (FileStream fileStream = File.Open(path, FileMode.Open))
{
    int key = 0;
    CsvWriter csvWriter = null;
    MemoryStream memoryStream = null;
    BufferedStream bufferedStream = null;
    StreamWriter streamWriter = null;
    Task uploadTask = null;

    using (var reader = new StreamReader(fileStream))
    using (var csv = new CsvReader(reader, CultureInfo.InvariantCulture))
    {
        csv.Read();
        csv.ReadHeader();
        await foreach (MyModel row in csv.GetRecordsAsync<MyModel>())
        {
            if (row.KeyColumn != key)
            {
                if (memoryStream != null)
                {
                    // Wait for the current upload to finish
                    await csvWriter.FlushAsync();
                    csvWriter.Dispose();
                    await uploadTask;
                }

                // Start a new upload
                key = row.KeyColumn;
                memoryStream = new MemoryStream();
                bufferedStream = new BufferedStream(memoryStream);
                streamWriter = new StreamWriter(bufferedStream);
                csvWriter = new CsvWriter(streamWriter, CultureInfo.InvariantCulture);
                csvWriter.WriteHeader<MyModel>();
                await csvWriter.FlushAsync();

                CloudBlockBlob blockBlob = blobContainer.GetBlockBlobReference($"file_{key}.csv");
                uploadTask = blockBlob.UploadFromStreamAsync(bufferedStream);
            }
            csvWriter.WriteRecord(row);
            await csvWriter.FlushAsync();
        }
        if (memoryStream != null)
        {
            await csvWriter.FlushAsync();
            csvWriter.Dispose();
            await uploadTask;
        }
    }
}
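One way to avoid managing the MemoryStream/BufferedStream pair entirely is to open a writable stream on the blob itself. A minimal sketch, not the code above: CloudBlockBlob.OpenWriteAsync returns a writable CloudBlobStream, data is uploaded as you write, and disposing the writers flushes and commits the blob, which marks the upload as complete (blobContainer and MyModel are the names from the question):

CloudBlockBlob blockBlob = blobContainer.GetBlockBlobReference($"file_{key}.csv");
using (CloudBlobStream blobStream = await blockBlob.OpenWriteAsync())
using (var writer = new StreamWriter(blobStream))
using (var csvOut = new CsvWriter(writer, CultureInfo.InvariantCulture))
{
    csvOut.WriteHeader<MyModel>();
    csvOut.NextRecord();
    // ... csvOut.WriteRecord(row); csvOut.NextRecord(); for each row of this piece ...
} // disposing flushes the writers and commits the blob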
I am uploading an evidence file to Stripe using a FileStream, but the application is hosted in AWS Lambda, where writing files to the local filesystem this way is not supported.
Here is my code
public async Task<IActionResult> PostFile(D.StripeFilePurpose stripeFilePurpose)
{
    IFormFile file = Request.Form.Files[0];
    var fileName = ContentDispositionHeaderValue.Parse(
        file.ContentDisposition).FileName.Trim('"');
    var path = string.Empty;
    var webRootPath = _hostingEnvironment.WebRootPath;
    if (string.IsNullOrEmpty(webRootPath))
    {
        path = Directory.GetCurrentDirectory();
    }
    string fileId;
    var filePath = Path.Combine(path, fileName);
    using (var fileStream = new FileStream(filePath, FileMode.OpenOrCreate, FileAccess.ReadWrite))
    {
        file.CopyTo(fileStream);
    }
    using (var stream = new FileStream(filePath, FileMode.Open))
    {
        var stripeFileUpload = await _stripeDisputeService
            .UploadFileAsync(
                fileName,
                stream,
                stripeFilePurpose.GetDescription());
        fileId = stripeFileUpload.Id;
    }
    return StatusCode(200, fileId);
}
Whenever I specify a file path, Lambda prepends /var/task/ to it. Even with a hardcoded file path it still prepends /var/task. I searched and found that file access in Lambda is only possible if the file is stored in the /tmp folder.
How do I achieve this?
You can try using a MemoryStream instead.
public async Task<IActionResult> PostFile(D.StripeFilePurpose stripeFilePurpose)
{
    IFormFile file = Request.Form.Files[0];
    var fileName = file.FileName.Trim('"');

    // Copy the uploaded file into memory instead of onto disk
    using MemoryStream memStream = new MemoryStream();
    await file.CopyToAsync(memStream);
    memStream.Position = 0;

    var stripeFileUpload = await _stripeDisputeService
        .UploadFileAsync(
            fileName,
            memStream,
            stripeFilePurpose.GetDescription());
    var fileId = stripeFileUpload.Id;
    return StatusCode(200, fileId);
}
It will consume more memory in the service, but avoid disk usage.
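If buffering the whole file in memory is a concern for larger uploads, IFormFile also exposes OpenReadStream(). A minimal sketch, assuming _stripeDisputeService.UploadFileAsync (from the question) accepts any readable stream:

// Hypothetical variation: pass the request stream through without an intermediate MemoryStream.
using (Stream requestStream = file.OpenReadStream())
{
    var stripeFileUpload = await _stripeDisputeService
        .UploadFileAsync(fileName, requestStream, stripeFilePurpose.GetDescription());
    fileId = stripeFileUpload.Id;
}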
I have an XML file in my blob storage. It contains words like this: Družstevní.
When I download the XML using Azure portal, this word is still correct.
But when I try using DownloadToStreamAsync the result is Dru�stevn�.
How do I fix this?
I found that DownloadTextAsync works because I can set the encoding: Encoding.GetEncoding(1252).
But then I end up with a string, and the rest of my code expects a stream. Should I read the string back into a stream, or is there a more elegant option?
Here's my code:
public Task<string> DownloadAsTextAsync(string code, Encoding encoding)
{
    var blockBlob = _container.GetBlockBlobReference(code);
    var blobRequestOptions = new BlobRequestOptions
    {
        MaximumExecutionTime = TimeSpan.FromMinutes(15),
        ServerTimeout = TimeSpan.FromHours(1)
    };
    return blockBlob.DownloadTextAsync(Encoding.GetEncoding(1252), null, blobRequestOptions, null);
}

public async Task<Stream> DownloadAsStreamAsync(string code)
{
    var blockBlob = _container.GetBlockBlobReference(code);
    var blobRequestOptions = new BlobRequestOptions
    {
        MaximumExecutionTime = TimeSpan.FromMinutes(15),
        ServerTimeout = TimeSpan.FromHours(1)
    };
    var output = new MemoryStream();
    await blockBlob.DownloadToStreamAsync(output, null, blobRequestOptions, null);
    return output;
}
Edit, after the comment from Zhaoxing Lu:
I changed my unit test to pass the encoding to StreamReader, and now the unit test passes:
using (var streamReader = new StreamReader(stream, Encoding.GetEncoding(1252)))
{
    string line;
    while ((line = streamReader.ReadLine()) != null)
    {
        if (!line.StartsWith(" <Str>Dru")) continue;
        Debug.WriteLine(line);
        var street = line.Trim().Replace("<Str>", "").Replace("</Str>", "");
        Assert.AreEqual("Družstevní", street);
    }
}
But in my 'real' code I'm loading the stream as XML:
fileStream.Position = 0;
var xmlDocument = XDocument.Load(fileStream);
The resulting xmlDocument has the wrong encoding, and I can't find where to set it. The problem seems to be in reading the stream as an XDocument.
You could set the encoding to Encoding.GetEncoding("Windows-1252") by wrapping the stream in a StreamReader, as in the following code, and then reading that as the XDocument. XDocument.Load(stream) detects the encoding from the BOM or the XML declaration, so if the actual bytes are Windows-1252 it decodes them incorrectly; an explicit StreamReader overrides that detection.
XDocument xmlDoc = null;
using (StreamReader oReader = new StreamReader(stream, Encoding.GetEncoding("Windows-1252")))
{
    xmlDoc = XDocument.Load(oReader);
}
I'm rewriting my C# code to use Azure Blob Storage instead of the filesystem. So far I've had no problems rewriting the code for normal file operations. But I have some code that asynchronously writes from a stream:
using (var stream = await Request.Content.ReadAsStreamAsync())
{
    FileStream fileStream = new FileStream(@"c:\test.txt", FileMode.Create, FileAccess.Write, FileShare.None);
    await stream.CopyToAsync(fileStream).ContinueWith(
        (copyTask) =>
        {
            fileStream.Close();
        });
}
I need to change the above to use Azure CloudBlockBlob or CloudBlobStream, but I can't find a way to declare a stream object that CopyToAsync can write to.
You would want to use the UploadFromStreamAsync method on CloudBlockBlob. Here's sample code to do so (I have not tried running this code though):
var cred = new StorageCredentials(accountName, accountKey);
var account = new CloudStorageAccount(cred, true);
var blobClient = account.CreateCloudBlobClient();
var container = blobClient.GetContainerReference("container-name");
var blob = container.GetBlockBlobReference("blob-name");
using (var stream = await Request.Content.ReadAsStreamAsync())
{
    stream.Position = 0;
    await blob.UploadFromStreamAsync(stream);
}
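Since the question specifically asks for a stream object that CopyToAsync can write to: CloudBlockBlob.OpenWriteAsync returns a writable CloudBlobStream, so the original pattern can be kept almost unchanged. A minimal sketch, untested like the code above:

using (var stream = await Request.Content.ReadAsStreamAsync())
using (var blobStream = await blob.OpenWriteAsync())
{
    await stream.CopyToAsync(blobStream);
    // Disposing blobStream commits the blob; CommitAsync() can also be called explicitly.
}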