How to set the encoding when streaming a blob from Azure Storage - C#

I have an XML file in my blob storage. It contains words like this: Družstevní.
When I download the XML using Azure portal, this word is still correct.
But when I try using DownloadToStreamAsync the result is Dru�stevn�.
How do I fix this?
I found that DownloadTextAsync works because I can set the encoding: Encoding.GetEncoding(1252).
But then I end up with a string, and the rest of my code expects a stream. Should I read the string back into a stream, or is there a more elegant option?
Here's my code:
public Task<string> DownloadAsTextAsync(string code, Encoding encoding)
{
var blockBlob = _container.GetBlockBlobReference(code);
var blobRequestOptions = new BlobRequestOptions
{
MaximumExecutionTime = TimeSpan.FromMinutes(15),
ServerTimeout = TimeSpan.FromHours(1)
};
return blockBlob.DownloadTextAsync(Encoding.GetEncoding(1252), null, blobRequestOptions, null);
}
public async Task<Stream> DownloadAsStreamAsync(string code)
{
var blockBlob = _container.GetBlockBlobReference(code);
var blobRequestOptions = new BlobRequestOptions
{
MaximumExecutionTime = TimeSpan.FromMinutes(15),
ServerTimeout = TimeSpan.FromHours(1)
};
var output = new MemoryStream();
await blockBlob.DownloadToStreamAsync(output, null, blobRequestOptions, null);
return output;
}
Edit, after Zhaoxing Lu's comment:
I changed my unit test to pass the encoding to StreamReader, and now the unit test passes:
using (var streamReader = new StreamReader(stream, Encoding.GetEncoding(1252)))
{
string line;
while ((line = streamReader.ReadLine()) != null)
{
if (!line.StartsWith(" <Str>Dru")) continue;
Debug.WriteLine(line);
var street = line.Trim().Replace("<Str>", "").Replace("</Str>", "");
Assert.AreEqual("Družstevní", street);
}
}
But in my 'real' code I'm sending the stream to load as XML:
fileStream.Position = 0;
var xmlDocument = XDocument.Load(fileStream);
The resulting xmlDocument is in the wrong encoding. I can't find how to set the encoding.

The problem seems to occur when the stream is read as an XDocument.
You can pass Encoding.GetEncoding("Windows-1252") to a StreamReader and load the XDocument from that reader:
XDocument xmlDoc = null;
using (StreamReader oReader = new StreamReader(stream, Encoding.GetEncoding("Windows-1252")))
{
xmlDoc = XDocument.Load(oReader);
}
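As a minimal, self-contained sketch of the same idea: the helper below decodes a stream with an explicit encoding before handing it to XDocument.Load. The names XmlEncodingDemo and LoadWithEncoding are made up for illustration. The RegisterProvider call is an assumption about the runtime: on .NET Core / .NET 5+ code page 1252 is typically only available after registering CodePagesEncodingProvider, while on .NET Framework it is built in.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class XmlEncodingDemo
{
    // Hypothetical helper: decodes the stream with the given encoding and
    // parses the decoded text as XML. Disposing the reader also disposes the stream.
    public static XDocument LoadWithEncoding(Stream stream, Encoding encoding)
    {
        if (stream.CanSeek)
        {
            stream.Position = 0; // rewind, e.g. after DownloadToStreamAsync filled a MemoryStream
        }
        using (var reader = new StreamReader(stream, encoding))
        {
            return XDocument.Load(reader);
        }
    }

    public static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+, where code pages must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding cp1252 = Encoding.GetEncoding(1252);

        // Simulate a blob whose bytes are Windows-1252 encoded.
        byte[] bytes = cp1252.GetBytes("<Address><Str>Družstevní</Str></Address>");
        XDocument doc = LoadWithEncoding(new MemoryStream(bytes), cp1252);
        Console.WriteLine(doc.Root.Element("Str").Value == "Družstevní");
    }
}
```

Because the XML is parsed from already-decoded text, the encoding passed to the StreamReader is what determines how the bytes are interpreted.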

Related

Creating Zip file with multiple entries in C# .net

I created the functionality to get documents from blob storage and then add them to a zip file for download.
[HttpPost]
public FileContentResult DownloadDocumentsByDocIDZIP(List<int> documentIDs)
{
List<Document> docs = new List<Document>();
foreach (int doc in documentIDs)
{
if (doc != 0)
{
Document document = documentService.GetDocumentByID(doc, false);
docs.Add(document);
}
}
MemoryStream outms = new MemoryStream();
using (ZipArchive zar = new ZipArchive(outms, ZipArchiveMode.Create, false))
{
foreach (Document docu in docs)
{
if (docu != null)
{
byte[] documentdata = documentService.DownloadDocumentData(docu.DocumentID);
string name = docu.DocumentNiceName ?? docu.DocumentFileName;
byte[] unzipped = documentdata;
ZipArchiveEntry entry = zar.CreateEntry(name);
Stream str = entry.Open();
MemoryStream ms = new MemoryStream(unzipped);
ms.CopyTo(str);
}
}
outms.Seek(0, SeekOrigin.Begin);
}
var outdata = outms.ToArray();
var result = File(outdata, "application/zip", "documents.zip");
return result;
}
When I hit the function via AJAX, it fails at
ZipArchiveEntry entry = zar.CreateEntry(name);
with the exception:
System.IO.IOException: 'Entries cannot be created while previously created entries are still open.'
So I added str.Close():
using (ZipArchive zar = new ZipArchive(outms, ZipArchiveMode.Create, false))
{
foreach (Document docu in docs)
{
if (docu != null)
{
byte[] documentdata = documentService.DownloadDocumentData(docu.DocumentID);
string name = docu.DocumentNiceName ?? docu.DocumentFileName;
byte[] unzipped = documentdata;
ZipArchiveEntry entry = zar.CreateEntry(name);
Stream str = entry.Open();
MemoryStream ms = new MemoryStream(unzipped);
ms.CopyTo(str);
str.Close();
}
}
outms.Seek(0, SeekOrigin.Begin);
}
var outdata = outms.ToArray();
var result = File(outdata, "application/zip", "documents.zip");
return result;
Now it creates the file, but when I try to unzip it after download, WinZip reports: Error: unable to seek to beginning of Central Directory.
Can someone please help? I have no idea what I'm doing wrong.
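You have to dispose each entry stream before adding the next entry to the zip, but the real problem is that you call Seek on the output stream before the ZipArchive is disposed. The archive writes its central directory when it is disposed, so rewinding the stream first corrupts the file. Try the following code: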
using (ZipArchive zar = new ZipArchive(outms, ZipArchiveMode.Create, false))
{
foreach (Document docu in docs)
{
if (docu != null)
{
byte[] documentdata = documentService.DownloadDocumentData(docu.DocumentID);
string name = docu.DocumentNiceName ?? docu.DocumentFileName;
byte[] unzipped = documentdata;
ZipArchiveEntry entry = zar.CreateEntry(name);
using (Stream str = entry.Open())
{
str.Write(unzipped, 0, unzipped.Length);
}
}
}
//outms.Seek(0, SeekOrigin.Begin); //This causes "Error: unable to seek to beginning of Central Directory."
}
var outdata = outms.ToArray();
var result = File(outdata, "application/zip", "documents.zip");
return result;
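For reference, here is a minimal, self-contained sketch of the same pattern outside of a controller. The names ZipDemo and BuildZip are made up for illustration; the point is only the ordering: each entry stream is disposed before the next entry is created, and the bytes are read only after the ZipArchive itself has been disposed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class ZipDemo
{
    // Hypothetical helper: packs (name, content) pairs into an in-memory zip.
    public static byte[] BuildZip(IEnumerable<KeyValuePair<string, byte[]>> files)
    {
        using (var output = new MemoryStream())
        {
            using (var archive = new ZipArchive(output, ZipArchiveMode.Create, leaveOpen: true))
            {
                foreach (var file in files)
                {
                    ZipArchiveEntry entry = archive.CreateEntry(file.Key);
                    // Dispose the entry stream before creating the next entry.
                    using (Stream entryStream = entry.Open())
                    {
                        entryStream.Write(file.Value, 0, file.Value.Length);
                    }
                }
            } // Disposing the archive writes the central directory.

            // No Seek is needed: ToArray copies the whole buffer regardless of Position.
            return output.ToArray();
        }
    }

    public static void Main()
    {
        var files = new List<KeyValuePair<string, byte[]>>
        {
            new KeyValuePair<string, byte[]>("a.txt", Encoding.UTF8.GetBytes("first")),
            new KeyValuePair<string, byte[]>("b.txt", Encoding.UTF8.GetBytes("second"))
        };
        byte[] zipBytes = BuildZip(files);

        // Reopen the result to confirm it is a readable archive.
        using (var archive = new ZipArchive(new MemoryStream(zipBytes), ZipArchiveMode.Read))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                Console.WriteLine(entry.FullName);
            }
        }
    }
}
```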

Problem with azure blob storage encoding when uploading a file

I'm uploading files to Azure Blob Storage with the .NET package, specifying the encoding iso-8859-1. The stream looks fine in memory, but after the upload the blob contains corrupted characters, as if they could not be converted to that encoding. It seems the file is stored in a corrupted state: when I download it again and inspect it, the characters are all messed up. Here is the code I'm using.
public static async Task<bool> UploadFileFromStream(this CloudStorageAccount account, string containerName, string destBlobPath, string fileName, Stream stream, Encoding encoding)
{
if (account is null) throw new ArgumentNullException(nameof(account));
if (string.IsNullOrEmpty(containerName)) throw new ArgumentException("message", nameof(containerName));
if (string.IsNullOrEmpty(destBlobPath)) throw new ArgumentException("message", nameof(destBlobPath));
if (stream is null) throw new ArgumentNullException(nameof(stream));
stream.Position = 0;
CloudBlockBlob blob = GetBlob(account, containerName, $"{destBlobPath}/{fileName}");
blob.Properties.ContentType = FileUtils.GetFileContentType(fileName);
using var reader = new StreamReader(stream, encoding);
var ct = await reader.ReadToEndAsync();
await blob.UploadTextAsync(ct, encoding ?? Encoding.UTF8, AccessCondition.GenerateEmptyCondition(), new BlobRequestOptions(), new OperationContext());
return true;
}
This is the file just before uploading it
<provinciaDatosInmueble>Sevilla</provinciaDatosInmueble>
<inePoblacionDatosInmueble>969</inePoblacionDatosInmueble>
<poblacionDatosInmueble>Valencina de la Concepción</poblacionDatosInmueble>
and this is the file after the upload
<provinciaDatosInmueble>Sevilla</provinciaDatosInmueble>
<inePoblacionDatosInmueble>969</inePoblacionDatosInmueble>
<poblacionDatosInmueble>Valencina de la Concepci�n</poblacionDatosInmueble>
The encoding I pass in the encoding parameter is ISO-8859-1. Does anybody know why Blob Storage seems to ignore the encoding I'm specifying? Thanks in advance!
We were able to achieve this using Azure.Storage.Blobs instead of WindowsAzure.Storage, which is a legacy Storage SDK. Below is the code that worked for us.
class Program
{
static async Task Main(string[] args)
{
string sourceContainerName = "<Source_Container_Name>";
string destBlobPath = "<Destination_Path>";
string fileName = "<Source_File_name>";
MemoryStream stream = new MemoryStream();
BlobServiceClient blobServiceClient = new BlobServiceClient("<Your_Connection_String>");
BlobContainerClient containerClient = blobServiceClient.GetBlobContainerClient(sourceContainerName);
BlobClient blobClientSource = containerClient.GetBlobClient(fileName);
BlobClient blobClientDestination = containerClient.GetBlobClient(destBlobPath);
// Reading From Blob
var line =" ";
if (await blobClientSource.ExistsAsync())
{
var response = await blobClientSource.DownloadAsync();
using (StreamReader streamReader = new StreamReader(response.Value.Content))
{
line = await streamReader.ReadToEndAsync();
}
}
// Writing To Blob
var content = Encoding.UTF8.GetBytes(line);
using (var ms = new MemoryStream(content))
blobClientDestination.Upload(ms);
}
}

How to download and decompress file from Ebay API

I am attempting to get a file from eBay's API
https://developer.ebay.com/api-docs/sell/feed/resources/task/methods/getResultFile#h2-samples
I am getting data back, but it is not decompressing or decoding properly. It should be a gzipped XML file. The documentation is not very clear on this. I am using RestSharp for my HTTP calls (106.15.0).
Exception:
Found invalid data while decoding
My Code:
const string url = "sell/feed/v1/task/task-16-SOMENUMBER/download_result_file";
var restClient = new RestClient(_restApiUrl)
{
Authenticator = new OAuth2AuthorizationRequestHeaderAuthenticator(authentication.AuthorizationToken, "Bearer")
};
var httpRequest = new RestRequest(url, Method.GET);
httpRequest.AddHeader("Accept-Encoding", "application/gzip");
httpRequest.AddHeader("Accept", "*/*");
byte[] myfile = restClient.DownloadData(httpRequest);
var decodedString = Encoding.UTF8.GetString(myfile);
using (var stream = new MemoryStream(myfile))
{
string res;
using (GZipStream zipStream = new GZipStream(stream, CompressionMode.Decompress))
{
using (var sr = new StreamReader(zipStream))
{
res = sr.ReadToEnd(); //ERROR HERE: Found invalid data while decoding
}
}
var result = res;
}
First 30 characters of the returned string (Encoding.Default.GetString(myfile)):
PK\u0003\u0004\u0014\0\b\b\b\0\f£†T\0\0\0\0\0\0\0\0\0\0\0\0>\0\0\0ActiveInventoryReport
Hex
50-4B-03-04-14-00-08-08-08-00-0C-A3-86-54-00-00-00-00-00-00-00-00-00-00-00-00-3E-00-00-00-41-63-74-69-76-65-49-6E-76-65-6E-74-6F-72-79-52-65-70-6F-72-74-2D-41-70-72-2D-30-36-2D-32-30-32-32-2D-32-30-3A-32-34-3A-32-30-2D-30-37-30-30-2D-31-33-33-34-39-39-38-35-32-34-2E-78-6D-6C-BD-9D-5B-53-9B-47-B6-86-AF-77-7E-45-CA-F7-32-7D-3E-4C-79-3C-25-09-03-1E-C0-D6-48-38-60-DF-B1-8D-F6-98-0A-01-17-86-4C-3C-BF-7E-F7-E1-
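One observation on the dump above: it starts with 50-4B-03-04 ("PK" followed by 03 04), which is the signature of a ZIP local file header, whereas a gzip stream starts with 1F-8B. That would be consistent with GZipStream rejecting the data. The following is a hedged sketch, not a confirmed fix for the eBay API: it picks a decompressor from the leading bytes. The names PayloadDecoder and DecodePayload are made up, and reading only the first ZIP entry as UTF-8 text is an assumption.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class PayloadDecoder
{
    // Hypothetical helper: decodes a downloaded payload as text, choosing the
    // decompressor from the leading magic bytes.
    public static string DecodePayload(byte[] payload)
    {
        if (StartsWith(payload, 0x50, 0x4B, 0x03, 0x04)) // ZIP local file header
        {
            using (var zip = new ZipArchive(new MemoryStream(payload), ZipArchiveMode.Read))
            {
                // Assumption: the archive holds one XML file; an empty archive would throw here.
                ZipArchiveEntry entry = zip.Entries[0];
                using (var reader = new StreamReader(entry.Open(), Encoding.UTF8))
                {
                    return reader.ReadToEnd();
                }
            }
        }
        if (StartsWith(payload, 0x1F, 0x8B)) // gzip member header
        {
            using (var gzip = new GZipStream(new MemoryStream(payload), CompressionMode.Decompress))
            using (var reader = new StreamReader(gzip, Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
        // Neither signature matched: treat the payload as plain UTF-8 text.
        return Encoding.UTF8.GetString(payload);
    }

    private static bool StartsWith(byte[] data, params byte[] magic)
    {
        if (data.Length < magic.Length) return false;
        for (int i = 0; i < magic.Length; i++)
        {
            if (data[i] != magic[i]) return false;
        }
        return true;
    }

    public static void Main()
    {
        // Build a small zip in memory to stand in for the downloaded bytes.
        var buffer = new MemoryStream();
        using (var zip = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
        using (var writer = new StreamWriter(zip.CreateEntry("report.xml").Open()))
        {
            writer.Write("<report/>");
        }
        Console.WriteLine(DecodePayload(buffer.ToArray()) == "<report/>");
    }
}
```

With RestSharp, the byte array returned by DownloadData could be passed to such a helper in place of the GZipStream block.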

Async method to read and write to XML file

I am using DependencyService on Android, iOS and Windows Phone to write and read an XML file in my Xamarin.Forms project. I am referring to working with files.
I was able to implement the function given in the example, but what I actually want is to read and write an XML file.
I followed the usual C# procedure to read and write an XML file, but I am getting errors because the method is async.
I have never used async/await methods, so I am not sure how to go about it.
Here is what I tried:
public async Task SaveTextAsync(string filename, string text)
{
ApplicationData data = new ApplicationData();
ApplicationVersion version = new ApplicationVersion();
version.SoftwareVersion = "test";
data.ApplicationVersion = version;
XmlSerializer writer =
new XmlSerializer(typeof(ApplicationData));
System.IO.FileStream file = System.IO.File.Create(path);
writer.Serialize(file, data);
file.Close();
}
public async Task<string> LoadTextAsync(string filename)
{
var path = CreatePathToFile(filename);
ApplicationData cars = null;
XmlSerializer serializer = new XmlSerializer(typeof(ApplicationData));
StreamReader reader = new StreamReader(path);
cars = (ApplicationData)serializer.Deserialize(reader);
reader.Close();
}
string CreatePathToFile(string filename)
{
var docsPath = System.Environment.GetFolderPath(System.Environment.SpecialFolder.Personal);
return Path.Combine(docsPath, filename);
}
Edit
Working code for reading and writing a txt file is here:
public async Task SaveTextAsync (string filename, string text)
{
var path = CreatePathToFile (filename);
using (StreamWriter sw = File.CreateText (path))
await sw.WriteAsync(text);
}
public async Task<string> LoadTextAsync (string filename)
{
var path = CreatePathToFile (filename);
using (StreamReader sr = File.OpenText(path))
return await sr.ReadToEndAsync();
}
I managed to get it to work. Here is my code:
public async Task SaveTextAsync(string filename)
{
var path = CreatePathToFile(filename);
ApplicationData data = new ApplicationData();
ApplicationVersion version = new ApplicationVersion();
version.SoftwareVersion = "test version";
data.ApplicationVersion = version;
XmlSerializer writer =
new XmlSerializer(typeof(ApplicationData));
System.IO.FileStream file = System.IO.File.Create(path);
writer.Serialize(file, data);
file.Close();
}
public async Task<ApplicationData> LoadTextAsync(string filename)
{
var path = CreatePathToFile(filename);
ApplicationData records = null;
await Task.Run(() =>
{
// Create an instance of the XmlSerializer specifying type and namespace.
XmlSerializer serializer = new XmlSerializer(typeof(ApplicationData));
// A FileStream is needed to read the XML document.
FileStream fs = new FileStream(path, FileMode.Open);
XmlReader reader = XmlReader.Create(fs);
// Use the Deserialize method to restore the object's state.
records = (ApplicationData)serializer.Deserialize(reader);
fs.Close();
});
return records;
}
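A slightly tighter variant of the same approach is sketched below, under a few assumptions: XmlSerializer has no asynchronous API, so both operations are pushed to the thread pool with Task.Run; the streams are wrapped in using blocks so they are closed even if serialization throws; and the ApplicationData and ApplicationVersion classes shown here are stand-ins with only the members used above. XmlStore, SaveAsync and LoadAsync are made-up names.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using System.Xml.Serialization;

// Stand-in types: only the members referenced in the code above are modeled.
public class ApplicationVersion
{
    public string SoftwareVersion { get; set; }
}

public class ApplicationData
{
    public ApplicationVersion ApplicationVersion { get; set; }
}

public static class XmlStore
{
    // Serializes the value to the file on a thread-pool thread.
    public static Task SaveAsync<T>(string path, T value)
    {
        return Task.Run(() =>
        {
            var serializer = new XmlSerializer(typeof(T));
            using (FileStream fs = File.Create(path))
            {
                serializer.Serialize(fs, value);
            }
        });
    }

    // Deserializes the file on a thread-pool thread and returns the result.
    public static Task<T> LoadAsync<T>(string path)
    {
        return Task.Run(() =>
        {
            var serializer = new XmlSerializer(typeof(T));
            using (FileStream fs = File.OpenRead(path))
            {
                return (T)serializer.Deserialize(fs);
            }
        });
    }

    public static async Task Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "appdata-demo.xml");
        var data = new ApplicationData
        {
            ApplicationVersion = new ApplicationVersion { SoftwareVersion = "test version" }
        };

        await SaveAsync(path, data);
        ApplicationData loaded = await LoadAsync<ApplicationData>(path);
        Console.WriteLine(loaded.ApplicationVersion.SoftwareVersion);
        File.Delete(path);
    }
}
```

Returning the Task from Task.Run directly also avoids declaring a method async without any await in it, which the compiler warns about.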

CloudBlob.DownloadText method inserts additional character?

The following unit test fails:
[TestMethod]
public void Add_file_to_blob_and_retrieve_it()
{
var blobName = Guid.NewGuid().ToString();
var testFileContents = File.ReadAllText(TestFileSpec);
Trace.WriteLine(string.Format("Opening blob container {0}", UnitTestBlobAgentName));
CloudStorageAccount.SetConfigurationSettingPublisher(
(configName, configSetter) => configSetter(ConfigurationManager.AppSettings[configName]));
var cloudStorage = CloudStorageAccount.FromConfigurationSetting("StorageConnectionString");
var blobClient = cloudStorage.CreateCloudBlobClient();
var container = blobClient.GetContainerReference(UnitTestBlobAgentName.ToLower());
try
{
Trace.WriteLine(string.Format("Uploading file {0}", TestFileSpec));
var blob = container.GetBlobReference(blobName);
blob.UploadFile(TestFileSpec);
blob.Properties.ContentType = "ByteArray";
blob.SetProperties();
var blob1 = container.GetBlobReference(blobName);
var found = blob1.DownloadText();
Assert.AreEqual(testFileContents.Trim(), found.Trim());
}
finally
{
if (null != container)
{
Trace.WriteLine(string.Format("Deleting blob {0}", blobName));
var blob2 = container.GetBlobReference(blobName);
blob2.DeleteIfExists(new BlobRequestOptions { DeleteSnapshotsOption = DeleteSnapshotsOption.IncludeSnapshots });
}
}
}
It turns out the returned string begins with the character U+FEFF (the Unicode BOM). I've traced through the Microsoft debug symbols, and the BOM exists in the return stream. AFAICT, it comes from the HttpResponse.GetResponseStream() method call way down in the Microsoft.WindowsAzure.StorageClient.CloudBlob class.
What's the best way to ensure that the input and output are identical? Ensure the input is converted to Unicode before going in? Strip the BOM from the output? Any other ideas?
This is an old one, but if your blob is encoded as Unicode in Azure, and you want to download it to a text string, this code will do the trick. Just keep in mind that the weakness here is that you have to allocate the memory twice. If there's a more efficient way of getting to a Unicode string (synchronously, anyway), I couldn't find it.
string fileText;
using (var memoryStream = new MemoryStream())
{
cloudBlob.DownloadToStream(memoryStream);
memoryStream.Position = 0;
using (var reader = new StreamReader(memoryStream, Encoding.Unicode))
{
fileText = reader.ReadToEnd();
}
}
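To illustrate why going through a StreamReader helps, here is a small sketch that uses no storage SDK at all. Decoding raw bytes with Encoding.GetString keeps a leading BOM as the character U+FEFF, while StreamReader recognizes and skips it. BomDemo and DecodeSkippingBom are made-up names, and UTF-8 is used only as an example encoding.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class BomDemo
{
    // Hypothetical helper: decodes bytes via StreamReader, which consumes a
    // leading byte order mark instead of returning it as part of the text.
    public static string DecodeSkippingBom(byte[] bytes, Encoding fallback)
    {
        using (var reader = new StreamReader(new MemoryStream(bytes), fallback,
                                             detectEncodingFromByteOrderMarks: true))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // "hello" preceded by the UTF-8 BOM (EF BB BF), as a file saved "with BOM" would be.
        byte[] withBom = Encoding.UTF8.GetPreamble()
            .Concat(Encoding.UTF8.GetBytes("hello"))
            .ToArray();

        string raw = Encoding.UTF8.GetString(withBom);
        string viaReader = DecodeSkippingBom(withBom, Encoding.UTF8);

        Console.WriteLine(raw.Length);          // 6: U+FEFF plus "hello"
        Console.WriteLine(raw[0] == '\uFEFF');  // True
        Console.WriteLine(viaReader.Length);    // 5: BOM skipped
    }
}
```

If only the stray character needs removing from an already decoded string, trimming a leading '\uFEFF' is another option.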
