Yesterday I had a strange problem: when I tried to pass a zip file as a byte[] and read it, I got an Ionic.Zip.ZipException:
Cannot read that as a ZipFile
public string Import(byte[] file)
{
try
{
var stream = new MemoryStream(file);
if (ZipFile.IsZipFile(stream))
{
ImportArchive(stream);
} else {
...
}
...
}
private void ImportArchive(MemoryStream stream)
{
var zip = ZipFile.Read(stream); //--> ZipException thrown
...
}
Now, if I pass the byte[] as the parameter instead of the MemoryStream, everything works fine:
public string Import(byte[] file)
{
try
{
if (ZipFile.IsZipFile(new MemoryStream(file), true))
{
ImportArchive(file);
} else {
...
}
...
}
private void ImportArchive(byte[] file)
{
var fileStream = new MemoryStream(file);
var zip = ZipFile.Read(fileStream); //--> no exception!
...
}
What is the difference between those two versions? Why can't the MemoryStream passed in the first version be read?
ZipFile.IsZipFile changes the stream position - it needs to read more than one byte of data. You need to "rewind" the stream before calling ImportArchive:
stream.Position = 0;
This is not something that can be done automatically - when you pass some method a stream, it's usually assumed that you're pointing to the beginning of the relevant data. This allows you to have different data "packets" in one stream, and it means that you can use streams that aren't seekable.
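For example, a minimal sketch of the fix applied to the first version from the question (reusing its names; only the relevant lines are shown):
var stream = new MemoryStream(file);
if (ZipFile.IsZipFile(stream))
{
    // IsZipFile has read from the stream and advanced its position,
    // so rewind before ZipFile.Read sees it inside ImportArchive.
    stream.Position = 0;
    ImportArchive(stream);
}
The second version works because ImportArchive wraps the original byte[] in a brand-new MemoryStream, and a new MemoryStream always starts at position 0.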
Related
I tried the code below to upload a file to an Azure blob container, but the uploaded file ends up corrupted.
public async void UploadFile(Stream memoryStream, string fileName, string containerName)
{
try
{
memoryStream.Position = 0;
CloudBlockBlob file = GetBlockBlobContainer(containerName).GetBlockBlobReference(fileName);
file.Metadata["FileType"] = Path.GetExtension(fileName);
file.Metadata["Name"] = fileName;
await file.UploadFromStreamAsync(memoryStream).ConfigureAwait(false);
}
catch (Exception ex)
{
throw ex;
}
}
How can I resolve this? I am unable to open the Excel file that was uploaded to the blob using the code above. This is the calling code that builds the stream and uploads it:
Stream streamData = ConvertDataSetToByteArray(sourceTable); // sourceTable is the DataTable
streamData.Position = 0;
UploadFile(streamData, "ABCD.xlsx", "sampleBlobContainer"); // calling logic to upload the stream to the blob
private Stream ConvertDataSetToByteArray(DataTable dataTable)
{
StringBuilder sb = new StringBuilder();
IEnumerable<string> columnNames = dataTable.Columns.Cast<DataColumn>().
Select(column => column.ColumnName);
sb.AppendLine(string.Join(",", columnNames));
foreach (DataRow row in dataTable.Rows)
{
IEnumerable<string> fields = row.ItemArray.Select(field => (field.ToString()));
sb.AppendLine(string.Join(",", fields));
}
var myByteArray = System.Text.Encoding.UTF8.GetBytes(sb.ToString());
var streamData = new MemoryStream(myByteArray);
return streamData;
}
Your code above creates a .csv file, not an .xlsx file. You can easily test this out by building a similar comma-separated file by hand and renaming it to .xlsx, to replicate what your code does: Excel will treat the renamed file as a corrupt workbook and refuse to open it.
You have two solutions:
You either need to build an actual .xlsx file, which you can do with the https://github.com/JanKallman/EPPlus package, for example (see the sketch below),
or
You need to save your file as a .csv, because that's what it really is.
The fact that you upload it to Azure blob storage is completely irrelevant here - there's no issue with the upload.
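If you go the .xlsx route, a rough EPPlus sketch could look like the following (the method name is made up for illustration; it assumes the same DataTable you already pass to ConvertDataSetToByteArray, the EPPlus NuGet package, and a using OfficeOpenXml; directive):
private Stream ConvertDataTableToXlsxStream(DataTable dataTable)
{
    using (var package = new ExcelPackage())
    {
        var worksheet = package.Workbook.Worksheets.Add("Sheet1");
        // Writes the column names as a header row followed by the data rows.
        worksheet.Cells["A1"].LoadFromDataTable(dataTable, true);
        // GetAsByteArray produces a complete .xlsx file in memory.
        return new MemoryStream(package.GetAsByteArray());
    }
}
The rest of the upload code can stay as it is; only the stream you hand to UploadFile changes.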
Since the stream is instantiated outside this method, I assume the file is handled there and added to the stream; however, here you are resetting the position of the stream to 0, thus invalidating the file.
First of all, are you sure the file got corrupted? Save both the MemoryStream contents and the blob to local files and compare them. You could also save the MemoryStream contents to a file and use UploadFromFileAsync.
To check for actual corruption you should calculate the content's MD5 hash in advance and compare it with the blob's hash after upload.
To calculate the stream's MD5 hash use ComputeHash.
var hasher = MD5.Create();
memoryStream.Position = 0;
var originalHash = Convert.ToBase64String(hasher.ComputeHash(memoryStream));
To get the client to calculate a blob hash, you need to set the BlobRequestOptions.StoreBlobContentMD5 option while uploading:
memoryStream.Position = 0;
var options = new BlobRequestOptions()
{
    StoreBlobContentMD5 = true
};
await file.UploadFromStreamAsync(memoryStream,null,options,null).ConfigureAwait(false);
To retrieve and check the uploaded hash, use FetchAttributes or FetchAttributesAsync and compare the BlobProperties.ContentMD5 value with the original:
file.FetchAttributes();
var blobHash = file.Properties.ContentMD5;
if (blobHash != originalHash)
{
//Ouch! Retry perhaps?
}
It doesn't look like your upload method has any fatal problems. I suspect the stream conversion part is where things go wrong.
This is my code:
using System;
using System.IO;
using Microsoft.WindowsAzure.Storage;
namespace ConsoleApp7
{
class Program
{
public static class Util
{
public async static void UploadFile(Stream memoryStream, string fileName, string containerName)
{
memoryStream.Position = 0;
var storageAccount = CloudStorageAccount.Parse("xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx");
var blockBlob = storageAccount.CreateCloudBlobClient()
.GetContainerReference(containerName)
.GetBlockBlobReference(fileName);
await blockBlob.UploadFromStreamAsync(memoryStream); // await so the upload completes before the method returns
}
}
static void Main(string[] args)
{
//Open the file
FileStream fileStream = new FileStream("C:\\Users\\bowmanzh\\Desktop\\Book1.xlsx", FileMode.Open);
//Read the byte[] of File
byte[] bytes = new byte[fileStream.Length];
fileStream.Read(bytes,0,bytes.Length);
fileStream.Close();
//turn from byte[] to Stream
Stream stream = new MemoryStream(bytes);
Util.UploadFile(stream,"Book2.xlsx","test");
Console.WriteLine("Hello World!");
Console.ReadLine();
}
}
}
Problem
I started C# just last week, so it's entirely possible that I'm missing something obvious. Please help; I'm open to better ways.
I'm making a Web API file upload endpoint that postpones saving until it confirms the file's signature.
I receive this error in Postman:
"Unexpected end of MIME multipart stream. MIME multipart message is not complete.",
Per this stackoverflow answer, I receive the error because I start a second memory stream while the first is still open.
My code is below.
Efforts to Resolve the Problem:
I tried to implement this, this, and many others over the past two days.
A similar question is posed, unanswered, here.
Code So Far
(C# code snippets are not runnable on Stack Overflow.)
I'm using MultipartMemoryStreamProvider to create a stream and get the file signature. I created a CustomMultipartMemoryStreamProvider to implement the IDisposable interface:
View CustomMultipartMemoryStreamProvider Class
public class CustomMultipartMemoryStreamProvider : MultipartMemoryStreamProvider, IDisposable
{
public void Dispose()
{
}
}
I then use a using statement to start the CustomMultipartMemoryStreamProvider, get the signature, and confirm whether the file is a JPEG. I then dispose of the stream:
View MultipartMemoryStreamProvider Implementation
bool isJpeg;
using (var MemoryStream = await Request.Content.ReadAsMultipartAsync(new CustomMultipartMemoryStreamProvider()))
{
try
{
//Get Signatures works.
//It uses Stream.Read to read the first 6 bytes of the stream
//returns byte[]
byte[] FileSignature = GetSignature(await MemoryStream.Contents[0].ReadAsStreamAsync());
//HasJpegHeaders works.
//It returns true or false
isJpeg = HasJpegHeader(FileSignature);
}
finally
{
//This should dispose of the stream.
//It runs.
MemoryStream.Dispose();
}
}
View GetSignature (WORKS)
static byte[] GetSignature(Stream stream)
{
int offset = 0;
byte[] fileSignature = new byte[6];
while (offset < 6)
{
int read = stream.Read(fileSignature, offset, 6 - offset);
if (read == 0)
throw new System.IO.EndOfStreamException();
offset += read;
}
stream.Position = 0;
return fileSignature;
}
View HasJpegHeader (WORKS)
static bool HasJpegHeader(byte[] FileSignature)
{
byte[] SoiArray = FileSignature.Take(2).ToArray();
byte[] MarkerArray = FileSignature.Skip(2).Take(2).ToArray();
bool result = ((SoiArray.SequenceEqual(new byte[]{ 255, 216 })) && (MarkerArray.SequenceEqual(new byte[] { 255, 224 })));
return result;
}
If the file is a JPEG, I start a CustomMultipartFormDataStreamProvider, which is the same as the MultipartFormDataStreamProvider except that it keeps the filenames on save. The await Request.Content.ReadAsMultipartAsync(provider) call throws the error (noted with an arrow):
View CustomMultipartFormDataStreamProvider
if (isJpeg)
{
string fileSaveLocation = HttpContext.Current.Server.MapPath("~/App_Data");
CustomMultipartFormDataStreamProvider provider = new CustomMultipartFormDataStreamProvider(fileSaveLocation);
List<string> files = new List<string>();
try
{
await Request.Content.ReadAsMultipartAsync(provider); // <-- the error is thrown here
foreach (MultipartFileData file in provider.FileData)
{
files.Add(Path.GetFileName(file.LocalFileName));
}
// Send OK Response along with saved file names to the client.
return Request.CreateResponse(HttpStatusCode.OK, files);
}
catch (System.Exception e)
{
return Request.CreateErrorResponse(HttpStatusCode.InternalServerError, e);
}
}
else
{
return Request.CreateResponse(HttpStatusCode.NotAcceptable, "This request is not properly formatted");
}
I have an application that processes file streams based on a list of strings, where each string can be either a file on disk or a file inside a zip file. To clean up the code, I'd like to factor out the process of opening the file.
I've created a method that returns a Stream of the file contents, but because the stream depends on the disposable ZipFile, by the time I read the stream the ZipFile has been disposed and an exception is thrown.
void Main()
{
using (var stream = OpenFileForImport("zipfile.zip;insidefile.txt"))
new StreamReader(stream).ReadToEnd(); // Exception
using (var stream = OpenFileForImport("outside.txt"))
new StreamReader(stream).ReadToEnd(); // Works
}
public static Stream OpenFileForImport(string filePath)
{
var path = Path.Combine(basefolder, filePath);
if (path.Contains(";"))
{
var parts = path.Split(';');
var zipPath = parts[0];
//Error checking logic to ensure zip file exists and is valid...
using (var zip = ZipFile.OpenRead(zipPath))
using (var entry = zip.GetEntry(parts[1]))
{
//Error checking logic to ensure inside file exists within zip file.
return entry.Open();
}
}
var file = new FileInfo(path);
if (file != null)
return file.OpenRead();
return null;
}
I could remove the using clauses from the zip and entry declarations, but then I doubt they'd ever get disposed. Is there an appropriate pattern for returning a disposable when it depends on other disposables?
Don't return the stream directly; instead, return a disposable object that exposes the stream you want and that cleans up the stream and the other dependent resources when it is disposed:
public class NameToBeDetermined : IDisposable
{
private ZipFile zip;
public Stream Stream { get; }
public NameToBeDetermined(ZipFile zip, Stream stream)
{
this.zip = zip;
Stream = stream;
}
public void Dispose()
{
zip.Dispose();
Stream.Dispose();
}
}
Then return that, rather than the stream itself. If it's worth spending the time, you could turn that wrapper into a Stream itself - one that forwards all Stream methods to the composed stream but does the extra cleanup when disposing. Whether it's worth the effort to create that more involved wrapper, rather than having the caller access a Stream property, is up to you.
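For illustration, a sketch of such a forwarding wrapper might look like this (assuming the System.IO.Compression ZipArchive that ZipFile.OpenRead returns; the class name is arbitrary):
using System.IO;
using System.IO.Compression;

public sealed class ZipEntryStream : Stream
{
    private readonly ZipArchive archive; // kept alive for as long as the entry stream is in use
    private readonly Stream inner;

    public ZipEntryStream(ZipArchive archive, Stream inner)
    {
        this.archive = archive;
        this.inner = inner;
    }

    // Forward everything to the wrapped entry stream.
    public override bool CanRead => inner.CanRead;
    public override bool CanSeek => inner.CanSeek;
    public override bool CanWrite => inner.CanWrite;
    public override long Length => inner.Length;
    public override long Position
    {
        get { return inner.Position; }
        set { inner.Position = value; }
    }
    public override void Flush() => inner.Flush();
    public override int Read(byte[] buffer, int offset, int count) => inner.Read(buffer, offset, count);
    public override long Seek(long offset, SeekOrigin origin) => inner.Seek(offset, origin);
    public override void SetLength(long value) => inner.SetLength(value);
    public override void Write(byte[] buffer, int offset, int count) => inner.Write(buffer, offset, count);

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            // Dispose the entry stream first, then the archive that owns it.
            inner.Dispose();
            archive.Dispose();
        }
        base.Dispose(disposing);
    }
}
OpenFileForImport would then drop its using blocks in the zip branch and return new ZipEntryStream(zip, entry.Open()); the single using statement at the call site disposes both the entry stream and the archive.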
You likely should copy the file from the ZipEntry into a MemoryStream so that you have a copy to work with.
//Error checking logic to ensure zip file exists and is valid...
using (var zip = ZipFile.OpenRead(zipPath))
using (var entry = zip.GetEntry(parts[1]))
{
//Error checking logic to ensure inside file exists within zip file.
MemoryStream stream = new MemoryStream();
entry.Open().CopyTo(stream);
stream.Seek(0, SeekOrigin.Begin);
return stream;
}
I am trying to verify that a file is a .rar file by checking its bytes, for security purposes. The following code is what I have; the only problem is that the sub-header does not match the one read from the file. I noticed that it is different for different files. Could you please explain to me why?
static bool IsRARFile(string filePath)
{
bool isDocFile = false;
//
// File sigs from: http://www.garykessler.net/library/file_sigs.html
//
string msOfficeHeader = "52-61-72-21-1A-07-00-CF";
string docSubHeader = "64-2E-63-73";
using (Stream stream = File.OpenRead(filePath))
{
//get file header
byte[] headerBuffer = new byte[8];
stream.Read(headerBuffer, 0, headerBuffer.Length);
string headerString = BitConverter.ToString(headerBuffer);
if (headerString.Equals(msOfficeHeader, StringComparison.InvariantCultureIgnoreCase))
{
//get subheader
byte[] subHeaderBuffer = new byte[4];
stream.Seek(512, SeekOrigin.Begin);
stream.Read(subHeaderBuffer, 0, subHeaderBuffer.Length);
string subHeaderString = BitConverter.ToString(subHeaderBuffer);
if (subHeaderString.Equals(docSubHeader, StringComparison.InvariantCultureIgnoreCase))
{
isDocFile = true;
}
}
}
return isDocFile;
}
This is because you have just copied a function from somewhere that was written for a different file type, and not every file type has any notion of a "subheader". In the case of RAR you only need to check the main header.
I also suggest renaming the variables; it is quite a mishmash when the function says it's checking for the RAR type but internally all the variables refer to DOC files.
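A trimmed-down version that only checks the marker block could look like this (a sketch; the 7-byte signature 52 61 72 21 1A 07 00 is the "Rar!" marker used by RAR 1.5-4.x archives, while RAR 5.x files start with 52 61 72 21 1A 07 01 00):
static bool IsRarFile(string filePath)
{
    // "Rar!" followed by 1A 07 00 - the RAR 4.x marker block.
    byte[] rarSignature = { 0x52, 0x61, 0x72, 0x21, 0x1A, 0x07, 0x00 };
    using (Stream stream = File.OpenRead(filePath))
    {
        byte[] header = new byte[rarSignature.Length];
        int read = stream.Read(header, 0, header.Length);
        // SequenceEqual comes from System.Linq.
        return read == header.Length && header.SequenceEqual(rarSignature);
    }
}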
In a utility method which accepts a Stream parameter, I rely on a StreamReader to analyse the data.
I don't want to close the incoming stream in my method; I want to let the calling method decide when to dispose the stream.
Is it safe not to dispose the opened StreamReader? I mean, will it eventually be disposed automatically? Will it lead to memory leaks?
Here is my utility method. Its goal is to read a Stream and return its content as a string, regardless of how the data is encoded:
public static string GetStringAutoDetectEncoding(Stream data, out Encoding actualEncoding)
{
// 1. Is there a Byte Order Mark?
var candidateEncoding = DetectEncodingWithByteOrderMask(data);
// 2a. No BOM, the data is either UTF8 no BOM or ANSI
if (candidateEncoding == Encoding.Default)
{
var utf8NoBomEncoding = Encoding.GetEncoding("utf-8",new EncoderExceptionFallback(), new DecoderExceptionFallback());
var positionBackup = data.Position;
var sr = new StreamReader(data, utf8NoBomEncoding);
try
{
// 3. Try as UTF8 With no BOM
var result = sr.ReadToEnd(); // will throw error if not UTF8
actualEncoding = utf8NoBomEncoding; // Probably an UTF8 no bom string
return result;
}
catch (DecoderFallbackException)
{
// 4. Rewind the stream and fall back to ANSI
data.Position = positionBackup;
var srFallback = new StreamReader(data, candidateEncoding);
actualEncoding = candidateEncoding;
return srFallback.ReadToEnd();
}
}
// 2b. There is a BOM. Use the detected encoding
else
{
var sr = new StreamReader(data, candidateEncoding);
actualEncoding = candidateEncoding;
return sr.ReadToEnd();
}
}
Then, I can have methods like this:
void Foo()
{
    using (var stream = File.OpenRead(@"c:\somefile"))
    {
        Encoding detected;
        var fileContent = MyUtilityClass.GetStringAutoDetectEncoding(stream, out detected);
        Console.WriteLine("Detected encoding: {0}", detected);
        Console.WriteLine("File content: {0}", fileContent);
    }
}
You could invert control using a closure. That is, create a method like so:
// This method will open the stream, execute the streamClosure, and then close the stream.
public static String StreamWork(Func<Stream, String> streamClosure) {
// Set up the stream here.
using (Stream stream = new MemoryStream()) { // Pretend the MemoryStream is your actual stream.
// Execute the closure. Return its results.
return streamClosure(stream);
}
}
which is responsible for opening / closing the stream within the method.
Then you simply wrap up all the code that needs the stream into a Func<Stream, String> closure, and pass it in. The StreamWork method will open the stream, execute your code, then close the stream.
public static void Main()
{
// Wrap all of the work that needs to be done in a closure.
// This represents all the work that needs to be done while the stream is open.
Func<Stream, String> streamClosure = delegate(Stream stream) {
using (StreamReader streamReader = new StreamReader(stream)) {
return streamReader.ReadToEnd();
}
};
// Call StreamWork. This method handles creating/closing the stream.
String result = StreamWork(streamClosure);
Console.WriteLine(result);
Console.ReadLine();
}
UPDATE
Of course, this method of inversion is a matter of preference, as mentioned in the comments below. The key point is to ensure that the stream is closed, rather than left floating around until the GC cleans it up (since the whole point of having things implement IDisposable is to avoid that sort of situation in the first place). Since this is a library function that accepts a Stream as input, the assumption is that the method's consumer creates the stream and therefore, as you point out, has the responsibility of ultimately closing it as well. But for sensitive resources, where you want to be absolutely sure clean-up happens, inversion is sometimes a useful technique.
A StreamReader closes/disposes its underlying stream only when you call Dispose on it. It does not dispose of the stream if the reader is just garbage collected.
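If you would rather dispose the reader without closing the caller's stream, .NET 4.5 and later also provide a StreamReader constructor with a leaveOpen flag; a minimal sketch reusing the names from the question:
using (var sr = new StreamReader(data, utf8NoBomEncoding,
    detectEncodingFromByteOrderMarks: true, bufferSize: 1024, leaveOpen: true))
{
    // Disposing the reader here does NOT dispose the underlying 'data' stream.
    return sr.ReadToEnd();
}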