Compressing a file from memory with SevenZipSharp, strange mistakes

I downloaded the SevenZipSharp library in order to compress some files.
I used this to compress a file:
var libPath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.ProgramFiles), "7-zip", "7z.dll");
SevenZip.SevenZipCompressor.SetLibraryPath(libPath);
SevenZip.SevenZipCompressor compressor = new SevenZipCompressor();
compressor.CompressFiles(@"C:\myTestFile.mdf", new string[] { @"C:\myTestFileCompressed.7z" });
With this, my file is compressed without any problem, and I can decompress it.
Now I would like to compress the same file, but instead of compressing the file directly, I would like to:
Read the file into a string. Yes, a string, not a byte[].
Convert the string to a byte[].
Compress that byte[] into another byte[].
Here is my attempt:
string strToCompress = File.ReadAllText(@"C:\myTestFile.mdf");
SevenZipCompressor compressor = new SevenZipCompressor();
byte[] byteArrayToCompress = Encoding.ASCII.GetBytes(strToCompress);
MemoryStream stream = new MemoryStream(byteArrayToCompress);
MemoryStream streamOut = new MemoryStream();
compressor.CompressStream(stream, streamOut);
string strcompressed = Encoding.ASCII.GetString(streamOut.ToArray());
File.WriteAllText(@"C:\myfileCompressed.7z", strcompressed);
My problem is very simple:
If I compare the sizes produced by these two methods, it's 3 603 443 bytes vs 3 604 081 bytes.
In addition, I cannot decompress the file produced by the second method.
Maybe it's because I used ASCII encoding, but the file I'm compressing is not text, it's a binary file.
Could anyone explain how to solve this, please? I need to read my file into a string and compress it (I don't want to read the file directly into a byte[]).
Thanks a lot,
Best regards,
Nixeus

You cannot put arbitrary binary data into a string: not every byte value has a Unicode code point. Using ASCII encoding will likewise always cause irretrievable data loss. It only has characters for byte values 0 through 127; higher values will produce a ?
You certainly can convert a byte[] to a string, but it needs to be encoded. The standard encoding for that is available in .NET through the Convert.ToBase64String() method, and you recover the byte[] again with Convert.FromBase64String(). Inevitably it won't be as compact: the string has 4 characters for every 3 bytes of the original data.
You can never produce a valid .7z archive that way either. The format of course uses the most compact storage possible, and that is bytes. You must pass a FileStream to the CompressStream() method.
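To make the data loss concrete, here is a minimal sketch (not tied to SevenZipSharp) that round-trips all 256 byte values through ASCII and through Base64. The 256-byte array is just a stand-in for the contents of a binary file such as the .mdf above.

```csharp
using System;
using System.Linq;
using System.Text;

// Every possible byte value, standing in for arbitrary binary file content.
byte[] original = Enumerable.Range(0, 256).Select(i => (byte)i).ToArray();

// Round trip through ASCII: bytes 128..255 decode to '?' and cannot be recovered.
string asAscii = Encoding.ASCII.GetString(original);
byte[] viaAscii = Encoding.ASCII.GetBytes(asAscii);

// Round trip through Base64: every byte survives, at the cost of a longer string.
string asBase64 = Convert.ToBase64String(original);
byte[] viaBase64 = Convert.FromBase64String(asBase64);

Console.WriteLine(original.SequenceEqual(viaAscii));   // False
Console.WriteLine(original.SequenceEqual(viaBase64));  // True
Console.WriteLine(asBase64.Length);                    // 344 characters for 256 bytes
```

The same corruption happens to the compressed output when it is pushed through Encoding.ASCII.GetString() and File.WriteAllText(), which is consistent with the second archive having a different size and failing to open.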

Related

What is the simplest way to decompress a ZIP buffer in C#?

When I use zlib in C/C++, there is a simple function, uncompress, which only requires two buffers and nothing else. Its definition is like this:
int uncompress (Bytef *dest, uLongf *destLen, const Bytef *source,
uLong sourceLen);
/*
Decompresses the source buffer into the destination buffer. sourceLen is the byte length of the source buffer. Upon entry,
destLen is the total size of the destination buffer, which must be
large enough to hold the entire uncompressed data. (The size of
the uncompressed data must have been saved previously by the
compressor and transmitted to the decompressor by some mechanism
outside the scope of this compression library.) Upon exit, destLen
is the actual size of the uncompressed data.
uncompress returns Z_OK if success, Z_MEM_ERROR if there was not enough memory, Z_BUF_ERROR if there was not enough room in the output
buffer, or Z_DATA_ERROR if the input data was corrupted or incomplete.
In the case where there is not enough room, uncompress() will fill
the output buffer with the uncompressed data up to that point.
*/
I want to know if C# has a similar way. I checked SharpZipLib FAQ as follows but did not quite understand:
How do I compress/decompress files in memory?
Use a memory stream when creating the Zip stream!
MemoryStream outputMemStream = new MemoryStream();
using (ZipOutputStream zipOutput = new ZipOutputStream(outputMemStream)) {
// Use zipOutput stream as normal
...
You can get the resulting data with memory stream methods ToArray or GetBuffer.
ToArray is the cleaner and easiest to use correctly with the penalty
of duplicating allocated memory. GetBuffer returns a raw buffer raw
and so you need to account for the true length yourself.
See the framework class library help for more information.
I can't figure out whether this block of code is for compression or decompression, or whether outputMemStream means a compressed stream or an uncompressed stream. I really hope there is an easy-to-understand way like in zlib. Thank you very much if you can help me.

Check out the ZipArchive class, which I think has the features you need to accomplish in-memory decompression of zip files.
Assuming you have an array of bytes (byte[]) which represents the ZIP file in memory, you have to instantiate a ZipArchive object which will read that array of bytes and interpret it as the ZIP file you wish to load. If you check the ZipArchive class's available constructors in the documentation, you will see that they require a stream object from which the data will be read. So the first step is to convert your byte[] array to a stream that can be read by the constructors, and you can do this by using a MemoryStream object.
Here's an example of how to list all entries inside of a ZIP archive represented in memory as a bytes array:
byte [] zipArchiveBytes = ...; // Read the ZIP file in memory as an array of bytes
using (var inputStream = new MemoryStream(zipArchiveBytes))
using (var zipArchive = new ZipArchive(inputStream, ZipArchiveMode.Read))
{
Console.WriteLine("Listing archive entries...");
foreach (var archiveEntry in zipArchive.Entries)
Console.WriteLine($" {archiveEntry.FullName}");
}
Each file in the ZIP archive will be represented as a ZipArchiveEntry instance. This class offers properties which allow you to retrieve information such as the original length of a file from the ZIP archive, its compressed length, its name, etc.
In order to read a specific file which is contained inside the ZIP file, you can use ZipArchiveEntry.Open(). The following exemplifies how to open a specific file from an archive, if you have its FullName inside the ZIP archive:
ZipArchiveEntry archEntry = zipArchive.GetEntry("my-folder-inside-zip/dog-picture.jpg");
byte[] readResult;
using (Stream entryReadStream = archEntry.Open())
{
using (var tempMemStream = new MemoryStream())
{
entryReadStream.CopyTo(tempMemStream);
readResult = tempMemStream.ToArray();
}
}
This example reads the given file contents, and returns them as an array of bytes (stored in the byte[] readResult variable) which you can then use according to your needs.
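If what you want is something closer in spirit to zlib's buffer-in, buffer-out uncompress, the pieces above can be wrapped in a small helper. The following is only a sketch: ZipSingleEntry and UnzipEntry are hypothetical helper names, and the entry name "hello.txt" is made up so that the example can build its own archive in memory and run on its own.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper: build a one-entry ZIP archive entirely in memory.
byte[] ZipSingleEntry(string entryName, byte[] content)
{
    using var output = new MemoryStream();
    // leaveOpen: true keeps the MemoryStream usable after the archive is disposed;
    // disposing the archive is what writes the central directory.
    using (var archive = new ZipArchive(output, ZipArchiveMode.Create, leaveOpen: true))
    {
        ZipArchiveEntry entry = archive.CreateEntry(entryName);
        using Stream entryStream = entry.Open();
        entryStream.Write(content, 0, content.Length);
    }
    return output.ToArray();
}

// Hypothetical helper: compressed buffer in, uncompressed buffer out.
byte[] UnzipEntry(byte[] zipBytes, string entryName)
{
    using var input = new MemoryStream(zipBytes);
    using var archive = new ZipArchive(input, ZipArchiveMode.Read);
    ZipArchiveEntry entry = archive.GetEntry(entryName)
        ?? throw new FileNotFoundException($"No entry named '{entryName}' in the archive.");
    using Stream entryStream = entry.Open();
    using var result = new MemoryStream();
    entryStream.CopyTo(result);
    return result.ToArray();
}

byte[] payload = Encoding.UTF8.GetBytes("hello, zip");
byte[] zipBytes = ZipSingleEntry("hello.txt", payload);
byte[] restored = UnzipEntry(zipBytes, "hello.txt");
Console.WriteLine(Encoding.UTF8.GetString(restored)); // hello, zip
```

Unlike zlib's uncompress, the caller does not have to know the uncompressed size in advance, because the MemoryStream grows as needed.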

String to zip file

I use a webservice that returns a zip file, as a string, and not bytes as I expected. I tried to write it to the disk, but when I open it, it tells me that it is corrupt. What am I doing wrong?
string cCsv = oResponse.fileCSV;//this is the result from webservice
MemoryStream ms = new MemoryStream(System.Text.Encoding.ASCII.GetBytes(cCsv));
using (FileStream file = new FileStream("test.zip", FileMode.Create, FileAccess.Write))
{
ms.WriteTo(file);
}
ms.Close();

I'm not sure what kind of encoding the string is in, but assuming UTF-8, the following should work. UTF-16 would be another guess.
string cCsv = oResponse.fileCSV;
using (BinaryWriter bw = new BinaryWriter(File.Create("test.zip")))
{
bw.Write(System.Text.Encoding.UTF8.GetBytes(cCsv));
}
It'd be informative to look at the characters and the raw string itself being returned.
Edit
Per Frank's answer, the correct encoding is base64, which of course makes sense because it's binary data stored as a string.
Also, per Frank's answer, if the only action is to directly write a single byte array, then File.WriteAllBytes is more compact.

OK, I solved the problem:
File.WriteAllBytes("testbase64.zip", Convert.FromBase64String(cCsv));
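A quick way to confirm that a string like this really is Base64-encoded ZIP data is to decode it and look at the first bytes. This is a small sketch under assumptions: LooksLikeZip is a hypothetical helper, and the six-byte payload is a made-up stand-in for oResponse.fileCSV. A non-empty ZIP file normally starts with the bytes 50 4B 03 04 ("PK" followed by 03 04), which is why such strings tend to begin with "UEsDB".

```csharp
using System;
using System.Text;

// Hypothetical helper: check for the ZIP local file header signature "PK\x03\x04".
bool LooksLikeZip(byte[] data) =>
    data.Length >= 4 && data[0] == 0x50 && data[1] == 0x4B && data[2] == 0x03 && data[3] == 0x04;

// Stand-in for the web service result: the first bytes of a ZIP file, Base64-encoded.
string payload = Convert.ToBase64String(new byte[] { 0x50, 0x4B, 0x03, 0x04, 0x14, 0x00 });
Console.WriteLine(payload); // UEsDBBQA

// Decoding the Base64 text recovers the original bytes.
byte[] decoded = Convert.FromBase64String(payload);
Console.WriteLine(LooksLikeZip(decoded)); // True

// Treating the same text as ASCII characters yields the bytes of the letters instead.
byte[] viaAscii = Encoding.ASCII.GetBytes(payload);
Console.WriteLine(LooksLikeZip(viaAscii)); // False
```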

Download from byte array from CRM

In Microsoft CRM we have an attachment that should be fetched and downloaded. So I have a byte array that represents the fetched file:
byte[] fileContent = Convert.FromBase64String(query.DocumentBody);
If I use this code, of course it can be downloaded but the file path should be hardcoded (like C:/<folder name>/) and I don't want it like that.
using (FileStream fileStream = new FileStream(path + query.FileName, FileMode.OpenOrCreate))
{
byte[] fileContent = Convert.FromBase64String(query.DocumentBody);
fileStream.Write(fileContent, 0, fileContent.Length);
//Response.OutputStream.WriteByte(fileContent);
}
How can I download the file from a byte array? I've tried searching for ways but it all needs a file path, and I can't provide that file path since the object is a byte array.

I'm not sure exactly what your problem is, but the following should write the byte array to the output stream. You may need a "content-disposition" header for the file name and a "content-type" header to make the browser offer a download instead of trying to open the file directly:
Response.OutputStream.Write(fileContent, 0, fileContent.Length);
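As a rough illustration of the header part, the value of a "content-disposition" header can be built with the System.Net.Mime.ContentDisposition class. This is only a sketch: "report.pdf" is a made-up file name standing in for query.FileName, and the snippet just prints the header value rather than sending a response.

```csharp
using System;
using System.Net.Mime;

// Hypothetical file name standing in for query.FileName.
string fileName = "report.pdf";

// "attachment" asks the browser to offer a download rather than display the content inline.
var disposition = new ContentDisposition
{
    DispositionType = DispositionTypeNames.Attachment,
    FileName = fileName
};

string headerValue = disposition.ToString();
Console.WriteLine(headerValue);
```

Assuming the page runs on classic ASP.NET (System.Web), the value would then typically be sent with Response.AddHeader("Content-Disposition", headerValue), together with a Response.ContentType such as "application/octet-stream", before the bytes are written to the response.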

convert binary file to text

I have a program that gets a response from a url in binary format and I do not know how to convert this to a text file.
byte[] postBytes = System.Text.Encoding.UTF8.GetBytes(postString);
request.ContentLength = postBytes.Length;
Stream stream = request.GetRequestStream();
stream.Write(postBytes, 0, postBytes.Length);
stream.Close();
response = (HttpWebResponse)request.GetResponse();
Stream ReceiveStream = response.GetResponseStream();
string filename = "C:\\responseGot.txt";
byte[] buffer = new byte[1024];
FileStream outFile = new FileStream(filename, FileMode.Create);
int bytesRead;
while ((bytesRead = ReceiveStream.Read(buffer, 0, buffer.Length)) != 0)
outFile.Write(buffer, 0, bytesRead);
When I open responseGot.txt, it is a binary file. How do I get a text file?

In what format is the response you get? There is no such thing as a text file. There are only binary files. HTTP is also 100% binary.
Text is the interpretation of bytes, and it only exists as part of running application. You can never, ever write text to a file. You can only convert the text to bytes (using various ways) and write the bytes.
Therefore, ask yourself why the bytes you received cannot be interpreted by notepad.exe as text. Maybe the response is not directly text but a ZIP file or something.
You can guess the format with a hex editor
You can ask the website owner

You don't show in your code sample saving the file anywhere.
But to convert the response to string you can use:
using (HttpWebResponse response = req.GetResponse() as HttpWebResponse)
{
StreamReader reader = new StreamReader(response.GetResponseStream());
string ResponseTXT = reader.ReadToEnd();
}
Then you can save it with usual techniques
http://msdn.microsoft.com/en-us/library/6ka1wd3w%28v=vs.110%29.aspx
Did you mean that?

All data in digital computing these days is ultimately represented in base 2, i.e. binary (electrical/magnetic signals: on/off or north/south).
Every file written to disk is also a binary file, i.e. a sequence of (8-bit) bytes.
ASCII/ANSI defines a character map for byte values, and only about 95 of the 256 possible byte values are printable (text) characters.
Your downloaded file seems to contain more than just the printable characters (a file containing only those is usually referred to as a plain text file).
To view the file as it is (in your current encoding settings):
type <file.ext>
To view in a different code page:
chcp <codepage>
type <file.ext>
To view a (plain) text representation of your file, you'd encode it first (i.e. translate it to text), e.g. as a hex-coded string via some hex editor.
The first few bytes of the hex sequence should give a magic number, indicating the type of file being read. You'd then open the file with the associated program (one that is capable of opening that type of file).
If you were expecting a text file and instead got a file that has more than just printable (plain text) characters, then it's likely that some sort of compression or encryption has been applied to it. Once again, the magic number should hint at how the file should be treated, e.g. decompressed before attempting to read the data. (Encrypted files should come with a decryption hint/key, unless one was exchanged or agreed on earlier.)
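The magic number check can also be done in code instead of a hex editor. Below is a minimal sketch that covers only a handful of well-known signatures. StartsWith and GuessFormat are hypothetical helper names, and the four-byte header is a made-up sample rather than data from the question.

```csharp
using System;
using System.Linq;

// Hypothetical helper: does the data begin with the given signature bytes?
bool StartsWith(byte[] data, params byte[] signature) =>
    data.Length >= signature.Length && data.Take(signature.Length).SequenceEqual(signature);

// Hypothetical helper: map a few well-known magic numbers to a description.
string GuessFormat(byte[] data)
{
    if (StartsWith(data, 0x50, 0x4B, 0x03, 0x04)) return "ZIP archive";
    if (StartsWith(data, 0x1F, 0x8B)) return "gzip stream";
    if (StartsWith(data, 0x89, 0x50, 0x4E, 0x47)) return "PNG image";
    if (StartsWith(data, 0x25, 0x50, 0x44, 0x46)) return "PDF document";
    return "unknown";
}

// Sample bytes standing in for the start of the downloaded response.
byte[] header = { 0x1F, 0x8B, 0x08, 0x00 };
Console.WriteLine(BitConverter.ToString(header)); // 1F-8B-08-00
Console.WriteLine(GuessFormat(header));           // gzip stream
```

For the question's code, the first bytes read from ReceiveStream could be passed to a check like this before deciding how to treat the rest of the response.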

Use the ReadFully method from the topic "Creating a byte array from a stream" to read the response into a byte array.
Then convert the byte array to an actual string:
string text = System.Text.Encoding.Default.GetString(byteArray);
And finally create the text file and write the content:
using(StreamWriter sw = new StreamWriter("C:\\responseGot.txt"))
{
sw.WriteLine(text);
}

Why is Base64 string different in C# and Android

I converted an image into a Base64 string, and the output matches what an online website produces.
But when I convert the same image on Android, the result is different.
Can you please explain why the C# and Android Base64 strings are different for the same image?
C#.NET Code
string cImagePath = @"G:\bg-listing.png";
byte[] imagebyte = StreamFile(cImagePath);
String result = System.Convert.ToBase64String(imagebyte);
System.IO.StreamWriter outFile;
try
{
outFile = new System.IO.StreamWriter(Application.StartupPath + "//image2base641.txt",
false,
System.Text.Encoding.Default);
outFile.Write(result.ToString());
outFile.Close();
}
catch (System.Exception exp)
{
// Error creating stream or writing to it.
System.Console.WriteLine("{0}", exp.Message);
}
Android Code
Bitmap bitmapOrg = BitmapFactory.decodeResource(getResources(), R.drawable.image);
ByteArrayOutputStream bao = new ByteArrayOutputStream();
bitmapOrg.compress(Bitmap.CompressFormat.JPEG, 100, bao);
byte [] ba = bao.toByteArray();
String ba1=Base64.encodeToString(ba,Base64.DEFAULT);
The two Base64 strings are different.
Please help me.

There are many variants of base 64, involving line length, padding, check sums, etc. The Wikipedia article on Base64 has a nice table of variants.
My guess is that C# and Android are simply using different variants.
EDIT Based on your updated post, there are a couple of other possibilities:
Android may be modifying the .jpg file when it packages it up as a resource (however, while the resource packager is extremely aggressive regarding compression, this is probably not the case);
Android may be re-encoding the image differently than the original (two .jpg files can represent the same pixel values and not be byte-for-byte identical)
A better test would be to skip (in the Android code) the conversion from a resource to a Bitmap and back to a .jpg encoding. Just open the resource as a stream, read it directly into a byte array, and encode that in base 64.
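To illustrate the point about variants: even within .NET, the same bytes can produce different Base64 text depending on line-wrapping options, while still decoding to identical data. The sketch below uses a made-up 100-byte array in place of the image. For comparison, Android's Base64.DEFAULT flag also wraps its output into lines, whereas Base64.NO_WRAP does not.

```csharp
using System;
using System.Linq;

// Made-up data standing in for the image bytes.
byte[] image = Enumerable.Range(0, 100).Select(i => (byte)i).ToArray();

// Same bytes, two textual variants: one continuous line vs. wrapped lines.
string plain = Convert.ToBase64String(image);
string wrapped = Convert.ToBase64String(image, Base64FormattingOptions.InsertLineBreaks);

Console.WriteLine(plain == wrapped);                      // False
Console.WriteLine(plain == wrapped.Replace("\r\n", ""));  // True

// Both variants decode to exactly the same bytes.
Console.WriteLine(Convert.FromBase64String(wrapped).SequenceEqual(image)); // True
```

So when comparing two Base64 strings, it is safer to strip whitespace first, or better, to compare the decoded bytes. If the decoded bytes differ as well, the inputs themselves were different, as with the JPEG re-encoding described above.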
