How to use SharpCompress' BZip2Stream to compress a string? - c#

I am trying to compress a string (str) using SharpCompress' BZip2Stream but unable to achieve it. Following is the code I have so far,
public static string Compress(string str)
{
var data = Encoding.UTF8.GetBytes(str);
using (MemoryStream stream = new MemoryStream())
{
using (BZip2Stream zip = new BZip2Stream(stream, SharpCompress.Compressor.CompressionMode.Compress))
{
zip.Write(data, 0, data.Length);
var compressed = Encoding.UTF8.GetString(stream.ToArray());
return compressed;
}
}
}
No matter what string i pass to str it always returns BZh.
Any help is greatly appreciated!

I believe you need to finalize/close/flush the bzip2 stream in order to make sure all compressed data is written to the memory stream prior to reading data from the memory stream. Try:
using (MemoryMemoryStream stream = new MemoryStream())
{
using (BZip2Stream zip = new BZip2Stream(stream, SharpCompress.Compressor.CompressionMode.Compress))
{
zip.Write(data, 0, data.Length);
zip.Close();
}
var compressed = Encoding.UTF8.GetString(stream.ToArray());
return compressed;
}

Related

Unsupported decompress method error using GzipStream.Read in c#

I get the following error: The archive entry was compressed using an unsupported compression method.
I got to decode the following gzip compressed base64 string, here is the string:
H4sIAAAAAAAAAD1SwW6bQBAdO0mDrUq9VGoPPWzVSj1ZItiO7aNjk4TI4IRgY7gtMLZxFnBhiYM/oLee+wn8QL+AT+mHVF1y6F5WM+/N07yZaQO0oBG2AaDRhGYYNH424GyS5DFvtOGE000LTjH2t1C/E2jdhgFeM7rJRPi3De3Hp5yx+SHGVIKmFsBX2R8gri96nXXf9zpduT/seArKHcWnA2WoKD30B6LuPk32mPIQsxZIHF94nmL22oYEZ0vKcoTfWNzJ7morB6s75hfapYitR5nNtd1+oMXLwptol1ok8Nur4zwcPgc3y15wuyzclZ57Nstd2ygc25VnUZ8Fk9F/rdnRL3RLlR1rc9CPi64xdY5utCjc3aLvWnrfUK63zo5FztFR3OlDT49UxbAfLvSdGc2nm65+81Dott4V8U63tsxRnK5h+eF6dTESDtpwHoTZntFCzG6WpCi9Jt9X5dDE73kojBKGz8iIIoMksldJwut5fqjKgU3ZUxh/I0lMUhrGXnLIPgvoi4DG+z0rCN+GGUnzGAlPiFdXkiQl6zxD+CRI/JAIYIN8iymhXNCRmHkc+vBWoPcYYMYpqyU/VuVlVbKZeqMa07HpkMn8UVctbSLBqUEjhE5VBn9+/SBV6ZuCS6sSw6qkcVV6XlWOEgEfxF/LI9GEw3fqC0/pmPM09HJer/OsbjQ7gXNzrBlXc7veL0j1ncGpuTBUgCa8mdKIblAcF/wDb9VytI4CAAA\u003d
When I first use the Convert.FromBase64String method I receive this string when I convert it into a string:
a"\u001f�\b\0\0\0\0\0\0\0=R�n�#\u0010\u001d;I��J�Tj\u000f=l�J=Y"؎��c�����c�-0�q\u0016pa��?����\t�#��O�T]r�^V3��Ӽ�i\u0003��\u0011�\u0001�фf\u00184~6�l��1o���M\vN1��P�\u0013h݆\u0001^3��D��\r�ǧ���!�T��\u0016�W�\u001f �/z�u��:]�?�x\n�\u001dŧ\u0003e�(=�\a��>M���\u0010�\u0016H\u001c_x�b�چ\u0004gK�r��X���j+\a�;�\u0017ڥ��G�͵�~���\u009bh�Z$�۫�<\u001c>\a7�^p�,ܕ�{6�]�(\u001cەgQ�\u0005��\u007f���/tK�\u001dksЏ��1u�n�(�ݢ�Zz�P��ΎE��Q��CO�TŰ\u001f.��\u0019ͧ��~�P��\u0015�N���Q��a��zu1\u0012\u000e�p\u001e�ٞ�B�n��(�&�W����y(�\u0012��Ȉ\"�$�WI��y~�ʁM�S\u0018\u007f#ILR\u001a�^r�>\v苀��=+\b߆\u0019I�\u0018\tO�WW�$%�<C�$H��\b�|�)�\Б�y\u001c��V��\u0018`�)�%?V�eU��z�\u001aӱ���QW-m"��A#�NU\u0006\u007f~� U雂K�\u0012ê�qUz^U�\u0012\u0001\u001f�_�#ф�w�\vO��4�r^��n4;�ss�\u0019Ws��/H�����0T�&��҈nP\u001c\u0017�\u0003o�r��\u0002\0\0"
could this have something to do with the problem?
Here is my code:
public static string Decompress(string input)
{
byte[] compressed = Convert.FromBase64String(input);
byte[] decompressed = Decompress(compressed);
return Encoding.UTF8.GetString(decompressed);
}
private static byte[] Decompress(byte[] input)
{
using (var source = new MemoryStream(input))
{
byte[] lengthBytes = new byte[4];
source.Read(lengthBytes, 0, 4);
var length = BitConverter.ToInt32(lengthBytes, 0);
using (var decompressionStream = new GZipStream(source,
CompressionMode.Decompress))
{
var result = new byte[length];
decompressionStream.Read(result, 0, length); Error: The archive entry was compressed using an unsupported compression method.
return result;
}
}
}
There is one little oddity in the base64 string, though it should not result in the error message you are getting. the \u003d should be replaced an equal sign (=), in order for the base64 decoding to work properly. (I can't tell if the string actually has those five characters at the end, or if it is just a representation of a string with an equal sign at the end. In the latter case, I don't know it wouldn't just show an equal sign as opposed to a unicode escaped representation of an equal sign.)
Otherwise, that base64 string decodes to a valid gzip stream that should decompress with no problem.
I solved the issue, I use the GZipStream.CopyTo to a MemoryStream in place of the read function. Here is the code if anyone would need it!
public static string Decompress(string value)
{
byte[] buffer = Convert.FromBase64String(value);
byte[] decompressed;
using (var inputStream = new MemoryStream(buffer))
{
using var outputStream = new MemoryStream();
using (var gzip = new GZipStream(inputStream, CompressionMode.Decompress, leaveOpen: true))
{
gzip.CopyTo(outputStream);
}
decompressed = outputStream.ToArray();
}
return Encoding.UTF8.GetString(decompressed);
}

How use Magick.NET LosslessCompress with Stream and IFormFile

I am trying to compress image(usually around 5-30) quality / size with Magick.NET library, and I cant really understand how can I use ImageOptimizer class and call LosslessCompress() method using stream.
Do I need to use FileStream or MemoryStream?
Do I need to save / create a temp file on server for each image and then proceed with the compression flow? (Performance?)
Anything else?
Simple Code example:
private byte[] ConvertImageToByteArray(IFormFile image)
{
byte[] result = null;
// filestream
using (var fileStream = image.OpenReadStream())
// memory stream
using (var memoryStream = new MemoryStream())
{
var before = fileStream.Length;
ImageOptimizer optimizer = new ImageOptimizer();
optimizer.LosslessCompress(fileStream); // what & how can I pass here stream?
var after = fileStream.Length;
// convert to byte[]
fileStream.CopyTo(memoryStream);
result = memoryStream.ToArray();
}
return result;
}
You cannot use the fileStream because the stream needs to be both readable and writable. If you first copy the data to a memorystream you can then compresses the image in that stream. Your code should be changed to this:
private byte[] ConvertImageToByteArray(IFormFile image)
{
byte[] result = null;
// filestream
using (var fileStream = image.OpenReadStream())
// memory stream
using (var memoryStream = new MemoryStream())
{
fileStream.CopyTo(memoryStream);
memoryStream.Position = 0; // The position needs to be reset.
var before = memoryStream.Length;
ImageOptimizer optimizer = new ImageOptimizer();
optimizer.LosslessCompress(memoryStream);
var after = memoryStream.Length;
// convert to byte[]
result = memoryStream.ToArray();
}
return result;
}

Decompress string in java from compressed string in C#

I was searching for the correct solution to decompress the string in java coming from c# code.I tried myself with lot of techniques in java like(gzip,inflatter etc.).but didn't get the solution.i got some error while trying to decompress the string in java from compressed string from c# code.
My C# code to compress the string is,
public static string CompressString(string text)
{
byte[] byteArray = Encoding.GetEncoding(1252).GetBytes(text);// Encoding.ASCII.GetBytes(text);
using (var ms = new MemoryStream())
{
// Compress the text
using (var ds = new DeflateStream(ms, CompressionMode.Compress))
{
ds.Write(byteArray, 0, byteArray.Length);
}
return Convert.ToBase64String(ms.ToArray());
}
}
And decompress the string in java using,
private static void compressAndDecompress(){
try {
// Encode a String into bytes
String string = "xxxxxxSAMPLECOMPRESSEDSTRINGxxxxxxxxxx";
// // Compress the bytes
byte[] decoded = Base64.decodeBase64(string.getBytes());
byte[] output = new byte[4096];
// Decompress the bytes
Inflater decompresser = new Inflater();
decompresser.setInput(decoded);
int resultLength = decompresser.inflate(output);
decompresser.end();
// Decode the bytes into a String
String outputString = new String(output, 0, resultLength, "UTF-8");
System.out.println(outputString);
} catch(java.io.UnsupportedEncodingException ex) {
ex.printStackTrace();
} catch (java.util.zip.DataFormatException ex) {
ex.printStackTrace();
}
}
I get this exception when running the above code:
java.util.zip.DataFormatException: incorrect header check
Kindly give me the sample code in java to decompress the string java.Thanks
My C# code to compress is
private string Compress(string text)
{
byte[] buffer = Encoding.UTF8.GetBytes(text);
MemoryStream ms = new MemoryStream();
using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress, true))
{
zip.Write(buffer, 0, buffer.Length);
}
ms.Position = 0;
MemoryStream outStream = new MemoryStream();
byte[] compressed = new byte[ms.Length];
ms.Read(compressed, 0, compressed.Length);
byte[] gzBuffer = new byte[compressed.Length + 4];
System.Buffer.BlockCopy(compressed, 0, gzBuffer, 4, compressed.Length);
System.Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gzBuffer, 0, 4);
return Convert.ToBase64String(gzBuffer);
}
Java code to decompress the text is
private String Decompress(String compressedText)
{
byte[] compressed = compressedText.getBytes("UTF8");
compressed = org.apache.commons.codec.binary.Base64.decodeBase64(compressed);
byte[] buffer=new byte[compressed.length-4];
buffer = copyForDecompression(compressed,buffer, 4, 0);
final int BUFFER_SIZE = 32;
ByteArrayInputStream is = new ByteArrayInputStream(buffer);
GZIPInputStream gis = new GZIPInputStream(is, BUFFER_SIZE);
StringBuilder string = new StringBuilder();
byte[] data = new byte[BUFFER_SIZE];
int bytesRead;
while ((bytesRead = gis.read(data)) != -1)
{
string.append(new String(data, 0, bytesRead));
}
gis.close();
is.close();
return string.toString();
}
private byte[] copyForDecompression(byte[] b1,byte[] b2,int srcoffset,int dstoffset)
{
for(int i=0;i<b2.length && i<b1.length;i++)
{
b2[i]=b1[i+4];
}
return b2;
}
This code works perfectly fine for me.
Had exactly the same issue. Could solve it via
byte[] compressed = Base64Utils.decodeFromString("mybase64encodedandwithc#zippedcrap");
Inflater decompresser = new Inflater(true);
decompresser.setInput(compressed);
byte[] result = new byte[4096];
decompresser.inflate(result);
decompresser.end();
System.out.printf(new String(result));
The magic happens with the boolen parameter on instantiating the Inflator
BW Hubert
For beloved googlers,
As #dbw mentioned,
according to post How to decompress stream deflated with java.util.zip.Deflater in .NET?,
java.util.zip.deflater equivalent in c# the default deflater used in C#
is not having any java equivalent that's why users prefer Gzip, Ziplib
or some other zip techniques.
a relatively simple method would be using GZip.
And for the accepted answer, one problem is that in this method you should append the data size to the compressed string yourself, and more importantly as per my own experience in our production app, It is buggy when the string reaches ~2000 chars!
the bug is in the System.io.Compression.GZipStream
any way using SharpZipLib in c# the problem goes away and everything would be as simple as following snippets:
JAVA:
import android.util.Base64;
import com.google.android.gms.common.util.IOUtils;
import org.jetbrains.annotations.Nullable;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
public class CompressionHelper {
#Nullable
public static String compress(#Nullable String data) {
if(data == null || data.length() == 0)
return null;
try {
// Create an output stream, and a gzip stream to wrap over.
ByteArrayOutputStream bos = new ByteArrayOutputStream(data.length());
GZIPOutputStream gzip = new GZIPOutputStream(bos);
// Compress the input string
gzip.write(data.getBytes());
gzip.close();
byte[] compressed;
// Convert to base64
compressed = Base64.encode(bos.toByteArray(),Base64.NO_WRAP);
bos.close();
// return the newly created string
return new String(compressed);
} catch(IOException e) {
return null;
}
}
#Nullable
public static String decompress(#Nullable String compressedText) {
if(compressedText == null || compressedText.length() == 0)
return null;
try {
// get the bytes for the compressed string
byte[] compressed = compressedText.getBytes("UTF-8");
// convert the bytes from base64 to normal string
compressed = Base64.decode(compressed, Base64.NO_WRAP);
ByteArrayInputStream bis = new ByteArrayInputStream(compressed);
GZIPInputStream gis = new GZIPInputStream(bis);
byte[] bytes = IOUtils.toByteArray(gis);
return new String(bytes, "UTF-8");
}catch (IOException e){
e.printStackTrace();
}
return null;
}
}
and c#:
using ICSharpCode.SharpZipLib.GZip; //PM> Install-Package SharpZipLib
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace GeneralTools
{
public static class CompressionTools
{
public static string CompressString(string text)
{
if (string.IsNullOrEmpty(text))
return null;
byte[] buffer = Encoding.UTF8.GetBytes(text);
using (var compressedStream = new MemoryStream())
{
GZip.Compress(new MemoryStream(buffer), compressedStream, false);
byte[] compressedData = compressedStream.ToArray();
return Convert.ToBase64String(compressedData);
}
}
public static string DecompressString(string compressedText)
{
if (string.IsNullOrEmpty(compressedText))
return null;
byte[] gZipBuffer = Convert.FromBase64String(compressedText);
using (var memoryStream = new MemoryStream())
{
using (var compressedStream = new MemoryStream(gZipBuffer))
{
var decompressedStream = new MemoryStream();
GZip.Decompress(compressedStream, decompressedStream, false);
return Encoding.UTF8.GetString(decompressedStream.ToArray()).Trim();
}
}
}
}
}
you may also find the codes here
If anyone still interested, here's my full solution with outputstream to handle unknown string size. Using C# DeflateStream and Java Inflater (based on Hubert Ströbitzer answer).
C# Compression:
string CompressString(string raw)
{
byte[] uncompressedData = Encoding.UTF8.GetBytes(raw);
MemoryStream output = new MemoryStream();
using (DeflateStream dStream = new DeflateStream(output, CompressionLevel.Optimal))
{
dStream.Write(uncompressedData, 0, uncompressedData.Length);
}
string compressedString = Convert.ToBase64String(output.ToArray());
return compressedString;
}
Java decompress:
String decompressString(String compressedString) {
byte[] compressed = Base64Utils.decodeFromString(compressedString);
Inflater inflater = new Inflater(true);
inflater.setInput(compressed);
//Using output stream to handle unknown size of decompressed string
ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
byte[] buffer = new byte[1024];
try {
while(!inflater.finished()){
int count = inflater.inflate(buffer);
outputStream.write(buffer, 0, count);
}
inflater.end();
outputStream.close();
} catch (DataFormatException e) {
//Handle DataFormatException
} catch (IOException e) {
//Handle IOException
}
return outputStream.toString();
}

Why does one method of string compression return an empty string but another one doesn't?

I'm using GZipStream to compress a string, and I've modified two different examples to see what works. The first code snippet, which is a heavily modified version of the example in the documentation, simply returns an empty string.
public static String CompressStringGzip(String uncompressed)
{
String compressedString;
// Convert the uncompressed source string to a stream stored in memory
// and create the MemoryStream that will hold the compressed string
using (MemoryStream inStream = new MemoryStream(Encoding.Unicode.GetBytes(uncompressed)),
outStream = new MemoryStream())
{
using (GZipStream compress = new GZipStream(outStream, CompressionMode.Compress))
{
inStream.CopyTo(compress);
StreamReader reader = new StreamReader(outStream);
compressedString = reader.ReadToEnd();
}
}
return compressedString;
and when I debug it, all I can tell is nothing is read from reader, which is compressedString is empty. However, the second method I wrote, modified from a CodeProject snippet is successful.
public static String CompressStringGzip3(String uncompressed)
{
//Transform string to byte array
String compressedString;
byte[] uncompressedByteArray = Encoding.Unicode.GetBytes(uncompressed);
using (MemoryStream outStream = new MemoryStream())
{
using (GZipStream compress = new GZipStream(outStream, CompressionMode.Compress))
{
compress.Write(uncompressedByteArray, 0, uncompressedByteArray.Length);
compress.Close();
}
byte[] compressedByteArray = outStream.ToArray();
StringBuilder compressedStringBuilder = new StringBuilder(compressedByteArray.Length);
foreach (byte b in compressedByteArray)
compressedStringBuilder.Append((char)b);
compressedString = compressedStringBuilder.ToString();
}
return compressedString;
}
Why is the first code snippet not successful while the other one is? Even though they're slightly different, I don't know why the minor changes in the second snippet allow it to work. The sample string I'm using is SELECT * FROM foods f WHERE f.name = 'chicken';
I ended up using the following code for compression and decompression:
public static String Compress(String decompressed)
{
byte[] data = Encoding.UTF8.GetBytes(decompressed);
using (var input = new MemoryStream(data))
using (var output = new MemoryStream())
{
using (var gzip = new GZipStream(output, CompressionMode.Compress, true))
{
input.CopyTo(gzip);
}
return Convert.ToBase64String(output.ToArray());
}
}
public static String Decompress(String compressed)
{
byte[] data = Convert.FromBase64String(compressed);
using (MemoryStream input = new MemoryStream(data))
using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
using (MemoryStream output = new MemoryStream())
{
gzip.CopyTo(output);
StringBuilder sb = new StringBuilder();
return Encoding.UTF8.GetString(output.ToArray());
}
}
The explanation for a part of the problem comes from this question. Although I fixed the problem by changing the code to what I included in this answer, these lines (in my original code):
foreach (byte b in compressedByteArray)
compressedStringBuilder.Append((char)b);
are problematic, because as dlev aptly phrased it:
You are interpreting each byte as its own character, when in fact that is not the case. Instead, you need the line:
string decoded = Encoding.Unicode.GetString(compressedByteArray);
The basic problem is that you are converting to a byte array based on an encoding, but then ignoring that encoding when you retrieve the bytes.
Therefore, the problem is solved, and the new code I'm using is much more succinct than my original code.
You need to move the code below outside the second using statement:
using (GZipStream compress = new GZipStream(outStream, CompressionMode.Compress))
{
inStream.CopyTo(compress);
outStream.Position = 0;
StreamReader reader = new StreamReader(outStream);
compressedString = reader.ReadToEnd();
}
CopyTo() is not flushing the results to the underlying MemoryStream.
Update
Seems that GZipStream closes and disposes it's underlying stream when it is disposed (not the way I would have designed the class). I've updated the sample above and tested it.

StreamReader ReadToEnd() returns empty string on first attempt

I know this question has been asked before on Stackoverflow, but could not find an explanation.
When I try to read a string from a compressed byte array I get an empty string on the first attempt, on the second I succed and get the string.
Code example:
public static string Decompress(byte[] gzBuffer)
{
if (gzBuffer == null)
return null;
using (var ms = new MemoryStream(gzBuffer))
{
using (var decompress = new GZipStream(ms, CompressionMode.Decompress))
{
using (var sr = new StreamReader(decompress, Encoding.UTF8))
{
string ret = sr.ReadToEnd();
// this is the extra check that is needed !?
if (ret == "")
ret = sr.ReadToEnd();
return ret;
}
}
}
}
All suggestions are appreciated.
- Victor Cassel
I found the bug. It was as Michael suggested in the compression routine. I missed to call Close() on the GZipStream.
public static byte[] Compress(string text)
{
if (string.IsNullOrEmpty(text))
return null;
byte[] raw = Encoding.UTF8.GetBytes(text);
using (var ms = new MemoryStream())
{
using (var compress = new GZipStream (ms, CompressionMode.Compress))
{
compress.Write(raw, 0, raw.Length);
compress.Close();
return ms.ToArray();
}
}
}
What happened was that the data seemed to get saved in a bad state that required two calls to ReadToEnd() in the decompression routine later on to extract the same data. Very odd!
try adding ms.Position = 0 before string ret = sr.ReadToEnd();
Where is gzBuffer coming from? Did you also write the code that is producing the compressed data?
Perhaps the buffer data you have is invalid or somehow incomplete, or perhaps it consists of multiple deflate streams concatenated together.
I hope this helps.
For ByteArray:
static byte[] CompressToByte(string data)
{
MemoryStream outstream = new MemoryStream();
GZipStream compressionStream =
new GZipStream(outstream, CompressionMode.Compress, true);
StreamWriter writer = new StreamWriter(compressionStream);
writer.Write(data);
writer.Close();
return StreamToByte(outstream);
}
static string Decompress(byte[] data)
{
MemoryStream instream = new MemoryStream(data);
GZipStream compressionStream =
new GZipStream(instream, CompressionMode.Decompress);
StreamReader reader = new StreamReader(compressionStream);
string outtext = reader.ReadToEnd();
reader.Close();
return outtext;
}
public static byte[] StreamToByte(Stream stream)
{
stream.Position = 0;
byte[] buffer = new byte[128];
using (MemoryStream ms = new MemoryStream())
{
while (true)
{
int read = stream.Read(buffer, 0, buffer.Length);
if (!(read > 0))
return ms.ToArray();
ms.Write(buffer, 0, read);
}
}
}
You can replace if(!(read > 0)) with if(read <= 0).
For some reason if(read <= 0) isn't displayed corret above.
For Stream:
static Stream CompressToStream(string data)
{
MemoryStream outstream = new MemoryStream();
GZipStream compressionStream =
new GZipStream(outstream, CompressionMode.Compress, true);
StreamWriter writer = new StreamWriter(compressionStream);
writer.Write(data);
writer.Close();
return outstream;
}
static string Decompress(Stream data)
{
data.Position = 0;
GZipStream compressionStream =
new GZipStream(data, CompressionMode.Decompress);
StreamReader reader = new StreamReader(compressionStream);
string outtext = reader.ReadToEnd();
reader.Close();
return outtext;
}
The MSDN Page on the function mentions the following:
If the current method throws an OutOfMemoryException, the reader's position in the underlying Stream object is advanced by the number of characters the method was able to read, but the characters already read into the internal ReadLine buffer are discarded. If you manipulate the position of the underlying stream after reading data into the buffer, the position of the underlying stream might not match the position of the internal buffer. To reset the internal buffer, call the DiscardBufferedData method; however, this method slows performance and should be called only when absolutely necessary.
Perhaps try calling DiscardBufferedData() before your ReadToEnd() and see what it does (I know you aren't getting the exception, but it's all I can think of...)?

Categories

Resources