I created a simple program.
It creates a string, compresses it with the methods below, and stores the result in a binary column in SQL Server 2008 (a binary(1000) field).
When I read that binary data back, the resulting string looks identical to the original, with the same length and content, but when I try to decompress it I get an error.
I use this method to get bytes:
System.Text.ASCIIEncoding.ASCII.GetBytes(mystring)
And this method to get string:
System.Text.ASCIIEncoding.ASCII.GetString(binarydata)
When the string is hard-coded in the VS2012 editor it works fine, but when I read it from SQL I get this error on the first line of the decompression method:
The input is not a valid Base-64 string as it contains a
non-base 64 character, more than two padding characters,
or a non-white space character among the padding characters.
What's wrong with my code? The two strings are the same, but this call works fine:
string test1=Decompress("mystring");
whereas this one throws the error above and cannot decompress the retrieved string:
string temp=System.Text.ASCIIEncoding.ASCII.GetString(/* get data from sql */) ;
string test2=Decompress(temp);
Comparing these strings does not show any difference:
int result = string.Compare(test1, test2); // result=0
My compression method:
public static string Compress(string text)
{
byte[] buffer = Encoding.UTF8.GetBytes(text);
var memoryStream = new MemoryStream();
using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Compress, true))
{
gZipStream.Write(buffer, 0, buffer.Length);
}
memoryStream.Position = 0;
var compressedData = new byte[memoryStream.Length];
memoryStream.Read(compressedData, 0, compressedData.Length);
var gZipBuffer = new byte[compressedData.Length + 4];
Buffer.BlockCopy(compressedData, 0, gZipBuffer, 4, compressedData.Length);
Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gZipBuffer, 0, 4);
return Convert.ToBase64String(gZipBuffer);
}
My decompression method:
public static string Decompress(string compressedText)
{
byte[] gZipBuffer = Convert.FromBase64String(compressedText);
using (var memoryStream = new MemoryStream())
{
int dataLength = BitConverter.ToInt32(gZipBuffer, 0);
memoryStream.Write(gZipBuffer, 4, gZipBuffer.Length - 4);
var buffer = new byte[dataLength];
memoryStream.Position = 0;
using (var gZipStream = new GZipStream(memoryStream, CompressionMode.Decompress))
{
gZipStream.Read(buffer, 0, buffer.Length);
}
return Encoding.UTF8.GetString(buffer);
}
}
The most likely issue is the way you are getting the string from the SQL binary field.
Currently (I am guessing, since you have not shown how you store or retrieve your data from SQL) the process is:
Compress : Text -> UTF8.GetBytes -> compress -> base64 string -> Send to Sql (transformed to binary)
Decompress: Binary -> String representation of binary -> base64 decode -> decompress -> UTF8.GetString
Your issue is that the "String representation of binary" step is not the inverse of the "Send to Sql (transformed to binary)" step. If you are storing this as a varbinary, Compress should return the byte array and Decompress should take in a byte array.
public static byte[] Compress(string text)
{
//Snip
}
public static string Decompress(byte[] compressedText)
{
//Snip
}
This changes your process to:
Compress : Text -> UTF8.GetBytes -> compress -> Send to Sql
Decompress: Binary -> decompress -> UTF8.GetString
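A minimal sketch of what the byte-array versions could look like. It assumes the column is a varbinary rather than binary(1000): a fixed-length binary column pads shorter values with trailing 0x00 bytes, which may also be what breaks the Base64 decoding in the question. The class name GzipBytes is made up for illustration, and the 4-byte length prefix from the original code is dropped because CopyTo reads until the end of the stream.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

public static class GzipBytes // hypothetical helper name
{
    // Text -> UTF-8 bytes -> gzip. The result is stored as-is in a varbinary column.
    public static byte[] Compress(string text)
    {
        byte[] raw = Encoding.UTF8.GetBytes(text);
        using (var output = new MemoryStream())
        {
            // leaveOpen: true so the MemoryStream survives disposing the GZipStream,
            // which is what flushes the final gzip block.
            using (var gzip = new GZipStream(output, CompressionMode.Compress, true))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            return output.ToArray();
        }
    }

    // Binary from the database -> gunzip -> UTF-8 text.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output); // loops internally until the stream is exhausted
            return Encoding.UTF8.GetString(output.ToArray());
        }
    }
}
```

With this shape no string ever passes through ASCIIEncoding, so there is no text/binary conversion step left to get wrong.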
I get the following error: The archive entry was compressed using an unsupported compression method.
I have to decode the following gzip-compressed base64 string:
H4sIAAAAAAAAAD1SwW6bQBAdO0mDrUq9VGoPPWzVSj1ZItiO7aNjk4TI4IRgY7gtMLZxFnBhiYM/oLee+wn8QL+AT+mHVF1y6F5WM+/N07yZaQO0oBG2AaDRhGYYNH424GyS5DFvtOGE000LTjH2t1C/E2jdhgFeM7rJRPi3De3Hp5yx+SHGVIKmFsBX2R8gri96nXXf9zpduT/seArKHcWnA2WoKD30B6LuPk32mPIQsxZIHF94nmL22oYEZ0vKcoTfWNzJ7morB6s75hfapYitR5nNtd1+oMXLwptol1ok8Nur4zwcPgc3y15wuyzclZ57Nstd2ygc25VnUZ8Fk9F/rdnRL3RLlR1rc9CPi64xdY5utCjc3aLvWnrfUK63zo5FztFR3OlDT49UxbAfLvSdGc2nm65+81Dott4V8U63tsxRnK5h+eF6dTESDtpwHoTZntFCzG6WpCi9Jt9X5dDE73kojBKGz8iIIoMksldJwut5fqjKgU3ZUxh/I0lMUhrGXnLIPgvoi4DG+z0rCN+GGUnzGAlPiFdXkiQl6zxD+CRI/JAIYIN8iymhXNCRmHkc+vBWoPcYYMYpqyU/VuVlVbKZeqMa07HpkMn8UVctbSLBqUEjhE5VBn9+/SBV6ZuCS6sSw6qkcVV6XlWOEgEfxF/LI9GEw3fqC0/pmPM09HJer/OsbjQ7gXNzrBlXc7veL0j1ncGpuTBUgCa8mdKIblAcF/wDb9VytI4CAAA\u003d
When I call Convert.FromBase64String and then convert the resulting bytes into a string, I get this:
a"\u001f�\b\0\0\0\0\0\0\0=R�n�#\u0010\u001d;I��J�Tj\u000f=l�J=Y"؎��c�����c�-0�q\u0016pa��?����\t�#��O�T]r�^V3��Ӽ�i\u0003��\u0011�\u0001�фf\u00184~6�l��1o���M\vN1��P�\u0013h݆\u0001^3��D��\r�ǧ���!�T��\u0016�W�\u001f �/z�u��:]�?�x\n�\u001dŧ\u0003e�(=�\a��>M���\u0010�\u0016H\u001c_x�b�چ\u0004gK�r��X���j+\a�;�\u0017ڥ��G�͵�~���\u009bh�Z$�۫�<\u001c>\a7�^p�,ܕ�{6�]�(\u001cەgQ�\u0005��\u007f���/tK�\u001dksЏ��1u�n�(�ݢ�Zz�P��ΎE��Q��CO�TŰ\u001f.��\u0019ͧ��~�P��\u0015�N���Q��a��zu1\u0012\u000e�p\u001e�ٞ�B�n��(�&�W����y(�\u0012��Ȉ\"�$�WI��y~�ʁM�S\u0018\u007f#ILR\u001a�^r�>\v苀��=+\b߆\u0019I�\u0018\tO�WW�$%�<C�$H��\b�|�)�\Б�y\u001c��V��\u0018`�)�%?V�eU��z�\u001aӱ���QW-m"��A#�NU\u0006\u007f~� U雂K�\u0012ê�qUz^U�\u0012\u0001\u001f�_�#ф�w�\vO��4�r^��n4;�ss�\u0019Ws��/H�����0T�&��҈nP\u001c\u0017�\u0003o�r��\u0002\0\0"
Could this have something to do with the problem?
Here is my code:
public static string Decompress(string input)
{
byte[] compressed = Convert.FromBase64String(input);
byte[] decompressed = Decompress(compressed);
return Encoding.UTF8.GetString(decompressed);
}
private static byte[] Decompress(byte[] input)
{
using (var source = new MemoryStream(input))
{
byte[] lengthBytes = new byte[4];
source.Read(lengthBytes, 0, 4);
var length = BitConverter.ToInt32(lengthBytes, 0);
using (var decompressionStream = new GZipStream(source,
CompressionMode.Decompress))
{
var result = new byte[length];
decompressionStream.Read(result, 0, length); // Error: The archive entry was compressed using an unsupported compression method.
return result;
}
}
}
There is one little oddity in the base64 string, though it should not result in the error message you are getting: the \u003d should be replaced with an equal sign (=) for the base64 decoding to work properly. (I can't tell whether the string actually has those six characters at the end, or whether that is just a representation of a string with an equal sign at the end. In the latter case, I don't know why it wouldn't just show an equal sign instead of a Unicode-escaped representation of one.)
Otherwise, that base64 string decodes to a valid gzip stream that should decompress with no problem.
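One quick way to check where the gzip data actually starts is to look for the gzip magic bytes 0x1F 0x8B. The sketch below is only an illustration (the class and method names are made up). The string in the question begins with "H4sI", which decodes to 1F 8B 08, so the gzip header sits at offset 0 and there is no 4-byte length prefix in front of it; reading four bytes off the stream first, as the question's code does, may be what produces the "unsupported compression method" error.

```csharp
using System;

public static class GzipCheck // hypothetical helper name
{
    // Every gzip stream starts with the two ID bytes 0x1F 0x8B (RFC 1952).
    public static bool StartsWithGzipHeader(byte[] data, int offset)
    {
        return data.Length >= offset + 2
            && data[offset] == 0x1F
            && data[offset + 1] == 0x8B;
    }
}
```

For example, with byte[] data = Convert.FromBase64String("H4sIAAAAAAAAAA==") the check succeeds at offset 0 and fails at offset 4.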
I solved the issue by using GZipStream.CopyTo to copy into a MemoryStream instead of calling Read. Here is the code in case anyone needs it:
public static string Decompress(string value)
{
byte[] buffer = Convert.FromBase64String(value);
byte[] decompressed;
using (var inputStream = new MemoryStream(buffer))
{
using var outputStream = new MemoryStream();
using (var gzip = new GZipStream(inputStream, CompressionMode.Decompress, leaveOpen: true))
{
gzip.CopyTo(outputStream);
}
decompressed = outputStream.ToArray();
}
return Encoding.UTF8.GetString(decompressed);
}
I have a large source file (assume 10 GB) that I need to read in chunks, compress, and hash.
(In the end there are two outputs: the hash of the file as a string (md5HashHex) and the compressed file as bytes (destData).)
Before compressing, I need to add a header to the destination (destData) and hash it. After that, I need to open the source file, read it chunk by chunk, and compress and hash each chunk. I found that the hash is different when I read the file chunk by chunk compared to hashing everything in one go. Here is my code; I would appreciate any help with it. I would also like to know whether I am doing the compression correctly. Thank you.
public static void CompresingHashing(string inputFile)
{
MD5 md5 = MD5.Create();
int byteCount = 0;
var length = 8192;
var chunk = new byte[length];
byte[] destData;
byte[] compressedData;
byte[] header;
header = Encoding.ASCII.GetBytes("HEADER");
md5.TransformBlock(header, 0, header.Length, null, 0);
destData = AppendingArrays(destData, header); //destination
using (FileStream sourceFile = File.OpenRead(inputFile))
{
while ((byteCount = sourceFile.Read(chunk, 0, length)) > 0)
{
using (var ms = new MemoryStream())
{
using (ZlibStream result = new ZlibStream(ms, CompressionMode.Compress, CompressionLevel.Default))
result.Write(chunk, 0, chunk.Length);
}
compressedData = ms.ToArray();
md5.TransformBlock(compressedData, 0, compressedData.Length, null, 0);
destData = AppendingArrays(destData, compressedData);
}
md5.TransformFinalBlock(chunk, 0, 0);
byte[] md5Hash = md5.Hash;
string md5HashHex = string.Join(string.Empty, md5Hash.Select(b => b.ToString("x2")));
}
Console.WriteLine("Hash : " + hash);
}
public static byte[] AppendingArrays(byte[] existingArray, byte[] ArrayToAdd)
{
byte[] newArray = new byte[existingArray.Length + ArrayToAdd.Length];
existingArray.CopyTo(newArray, 0);
ArrayToAdd.CopyTo(newArray, existingArray.Length);
return newArray;
}
But if I hash destData (which is the source file plus the header) in one go, I get a different result (to save space I have not repeated the rest of the code):
.
.
.
destData = AppendingArrays(destData, compressedData);
byte[] md5Hash = md5.ComputeHash(data);
.
.
.
It looks like you are processing the last chunk twice in the MD5. Simply call TransformFinalBlock with an empty array (new byte[0]) and an offset and length of 0.
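A small sketch of that pattern, using an in-memory array instead of a file to keep it short. The class and method names are made up for illustration. Each block passes only the bytes that belong to that chunk (the equivalent of byteCount from Read, not the full buffer length), and the final call adds no data.

```csharp
using System;
using System.Security.Cryptography;

public static class ChunkedHash // hypothetical helper name
{
    // Feeds the data to MD5 one chunk at a time.
    public static byte[] HashInChunks(byte[] data, int chunkSize)
    {
        using (MD5 md5 = MD5.Create())
        {
            for (int offset = 0; offset < data.Length; offset += chunkSize)
            {
                // The last chunk is usually shorter than chunkSize.
                int count = Math.Min(chunkSize, data.Length - offset);
                md5.TransformBlock(data, offset, count, null, 0);
            }
            // Finish with an empty block so no chunk is hashed twice.
            md5.TransformFinalBlock(new byte[0], 0, 0);
            return md5.Hash;
        }
    }

    // Hashes the same data in a single call, for comparison.
    public static byte[] HashAtOnce(byte[] data)
    {
        using (MD5 md5 = MD5.Create())
        {
            return md5.ComputeHash(data);
        }
    }
}
```

Both methods should return the same 16 bytes for the same input, whatever chunk size is used.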
I have run into a type problem. I have a SQL Server table with this structure:
create table dbo.DataWithCompressedXML
(
ID int not null,
Data varbinary(max) not null,
)
The data looks like:
0x0BC9C82C5600A2448592D4E21285E292A2CCBC74454500
I want to read this value in C# and decompress it with the function below. After decompression I should see text like "This is a test string!!", but I can't work out which type I should use for the value coming from the database so that I can pass it to my function.
public static SqlBytes BinaryDecompress(SqlBytes input)
{
if (input.IsNull)
return SqlBytes.Null;
int batchSize = 32768;
byte[] buf = new byte[batchSize];
using (MemoryStream result = new MemoryStream())
{
using (DeflateStream deflateStream =
new DeflateStream(input.Stream, CompressionMode.Decompress, true))
{
int bytesRead;
while ((bytesRead = deflateStream.Read(buf, 0, batchSize)) > 0)
result.Write(buf, 0, bytesRead);
}
return new SqlBytes(result.ToArray());
}
}
The question hinges on the word "compressed". If you store XML in a varbinary(max) column, you will see a hex string like this:
SELECT CAST('<root>test</root>' AS VARBINARY(MAX))
--Result: 0x3C726F6F743E746573743C2F726F6F743E
SELECT CAST(0x3C726F6F743E746573743C2F726F6F743E AS VARCHAR(MAX));
SELECT CAST(0x3C726F6F743E746573743C2F726F6F743E AS XML);
Both return the same content, the first as plain text, the second as "real" XML.
If you compressed this somehow, you first have to decompress it, as you are trying to do with your DeflateStream. Are you sure that this is a valid decompression for your given input?
What is the return value of your function (the content of SqlBytes)?
XML in SQL Server is encoded as UTF-16. If the decompression is done correctly, converting your SqlBytes to a UTF-16 string should work.
Regrettably your database column is not XML but VARBINARY(MAX), so it might be a different encoding...
UPDATE
According to your comment:
string hex = "0BC9C82C5600A2448592D4E21285E292A2CCBC74454500";
//credits: http://stackoverflow.com/a/321404/5089204
var byteArray = Enumerable.Range(0, hex.Length / 2)
.Select(x => Convert.ToByte(hex.Substring(x * 2, 2), 16))
.ToArray();
int batchSize = 32768;
byte[] buf = new byte[batchSize];
string s;
using (MemoryStream result = new MemoryStream()) {
    using (DeflateStream deflateStream = new DeflateStream(new MemoryStream(byteArray), CompressionMode.Decompress)) {
        int bytesRead;
        while ((bytesRead = deflateStream.Read(buf, 0, batchSize)) > 0)
            result.Write(buf, 0, bytesRead);
    }
    // Decode only the bytes that were actually decompressed, not the whole 32 KB buffer.
    s = System.Text.Encoding.Default.GetString(result.ToArray());
}
After this the variable "s" holds "This is a test string!!"
UPDATE2: As to your comment about converting to SqlBytes
This should be as easy as new SqlBytes(bytes), where bytes is the decompressed byte array (result.ToArray() in your function) ...
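As a sketch of the client side: a varbinary(max) value normally arrives in ADO.NET as a plain byte[], so a helper that takes byte[] may be all that is needed. The names below are made up for illustration, and the text encoding is an assumption that has to match whatever was used when the data was compressed.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

public static class DeflateText // hypothetical helper name
{
    // Inflates a raw deflate stream (no gzip or zlib header) into bytes.
    public static byte[] Inflate(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            deflate.CopyTo(output);
            return output.ToArray();
        }
    }

    // The encoding is a guess on the caller's side: UTF-16 for SQL Server
    // nvarchar/XML text, a single-byte code page or UTF-8 otherwise.
    public static string InflateToString(byte[] compressed, Encoding encoding)
    {
        return encoding.GetString(Inflate(compressed));
    }
}
```

Inside a SQLCLR function the same bytes are available from the SqlBytes argument, as in the question's BinaryDecompress.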
I am trying to write a program that transfers a file through sound (kind of like a fax). I broke up my program into several steps:
convert file to binary
convert 1 to a certain tone and 0 to another
play the tones to another computer
other computer listens to tones
other computer converts tones into binary
other computer converts binary into file.
However, I can't seem to find a way to convert a file to binary. I found a way to convert a string to binary using
public static string StringToBinary(string data)
{
StringBuilder sb = new StringBuilder();
foreach (char c in data.ToCharArray())
{
sb.Append(Convert.ToString(c, 2).PadLeft(8,'0'));
}
return sb.ToString();
}
From http://www.fluxbytes.com/csharp/convert-string-to-binary-and-binary-to-string-in-c/ .
But I can't find out how to convert a file to binary (the file could be of any extension).
So, how can I convert a file to binary? Is there a better way for me to write my program?
Why don't you just open the file in binary mode?
This function opens the file in binary mode and returns the byte array:
private byte[] GetBinaryFile(string filename)
{
byte[] bytes;
using (FileStream file = new FileStream(filename, FileMode.Open, FileAccess.Read))
{
bytes = new byte[file.Length];
file.Read(bytes, 0, (int)file.Length);
}
return bytes;
}
Then, to convert it to bits:
byte[] bytes = GetBinaryFile("filename.bin");
BitArray bits = new BitArray(bytes);
Now the bits variable holds the 0s and 1s you wanted.
Or you can just do this:
private BitArray GetFileBits(string filename)
{
byte[] bytes;
using (FileStream file = new FileStream(filename, FileMode.Open, FileAccess.Read))
{
bytes = new byte[file.Length];
file.Read(bytes, 0, (int)file.Length);
}
return new BitArray(bytes);
}
Or, even shorter:
private BitArray GetFileBits(string filename)
{
byte[] bytes = File.ReadAllBytes(filename);
return new BitArray(bytes);
}
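One thing to keep in mind when both computers have to agree on bit order: BitArray stores the least significant bit of each byte first. If a plain "0"/"1" string with the most significant bit first is easier to work with (as in the StringToBinary function from the question), a sketch like the following could be used. The class and method names are made up for illustration.

```csharp
using System;
using System.Text;

public static class BitText // hypothetical helper name
{
    // Each byte becomes 8 characters, most significant bit first.
    public static string ToBitString(byte[] bytes)
    {
        var sb = new StringBuilder(bytes.Length * 8);
        foreach (byte b in bytes)
        {
            sb.Append(Convert.ToString(b, 2).PadLeft(8, '0'));
        }
        return sb.ToString();
    }

    // Inverse operation for the receiving side; expects a multiple of 8 characters.
    public static byte[] FromBitString(string bits)
    {
        var bytes = new byte[bits.Length / 8];
        for (int i = 0; i < bytes.Length; i++)
        {
            bytes[i] = Convert.ToByte(bits.Substring(i * 8, 8), 2);
        }
        return bytes;
    }
}
```

For example, ToBitString(new byte[] { 0x41, 0x01 }) gives "0100000100000001", and FromBitString turns that back into the same two bytes. The input could come from File.ReadAllBytes as in the answer above.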
I am getting an exception when trying to convert a base64 string to a byte array. I am converting an Image to a byte array then to a base64 string, then encrypting it and storing it in a file. Then I am attempting to convert the base64 string back to a byte array in a MemoryStream, and recreating the image. I am getting a FormatException here:
byte[] imgBytes = Convert.FromBase64String(str);
Here is the full code for the two main functions:
public string ImageToString(Image img)
{
using (MemoryStream ms = new MemoryStream())
{
img.Save(ms, ImageFormat.Jpeg);
return Convert.ToBase64String(ms.ToArray());
}
}
public Image StringToImage(String str)
{
int lent = str.Length;
byte[] imgBytes = Convert.FromBase64String(str);
MemoryStream ms = new MemoryStream(imgBytes, 0, imgBytes.Length);
ms.Write(imgBytes, 0, imgBytes.Length);
return Image.FromStream(ms, true);
}
Here is the beginning and end of the base64 string I am trying to convert....
G>/9j/4AAQSkZJRgABAQEAYABgAAD .... Uh+8fxpT/B9KAP/2Q==
Any ideas are greatly appreciated!
The problem is that your string got corrupted somewhere along the line. That's not a base64 string, as you can see from the second character, >, which does not occur in base64.
Side note: Your function creates a memory stream containing the data, then writes the data to the memory stream again. Then you try to read from the memory stream without resetting the position to the beginning of the stream.
Just create the memory stream and read from it:
public Image StringToImage(String str) {
byte[] imgBytes = Convert.FromBase64String(str);
return Image.FromStream(new MemoryStream(imgBytes), true);
}
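To find out where the string got corrupted (for example after the decryption step), it can help to locate the first character that cannot appear in base64 before calling Convert.FromBase64String. This is only a rough sketch with made-up names: it checks characters against the standard base64 alphabet and does not validate padding position or overall length.

```csharp
public static class Base64Check // hypothetical helper name
{
    // Returns the index of the first character outside the standard base64
    // alphabet (A-Z, a-z, 0-9, '+', '/', '=' and white space), or -1 if none.
    public static int IndexOfInvalidChar(string s)
    {
        for (int i = 0; i < s.Length; i++)
        {
            char c = s[i];
            bool ok = (c >= 'A' && c <= 'Z')
                   || (c >= 'a' && c <= 'z')
                   || (c >= '0' && c <= '9')
                   || c == '+' || c == '/' || c == '='
                   || char.IsWhiteSpace(c);
            if (!ok)
            {
                return i;
            }
        }
        return -1;
    }
}
```

For the beginning of the string in the question, "G>/9j/4AAQ", this returns 1, pointing at the > character.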