DeflateStream 4 GB Limit in .NET (C#)

From MSDN: DeflateStream Class
DeflateStream cannot be used to compress files larger than 4 GB.
Are there any other implementations for .NET without the 4 GB limit?
NOTE: I really need to decompress a file in GZ format with content larger than 4 GB. Can any code do that?

FYI, we have removed the 4 GB limit from DeflateStream in .NET 4.
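On .NET 4 or later, a minimal streaming decompression sketch looks like this (paths are hypothetical); because the data is processed in chunks, the input size no longer matters:

using System.IO;
using System.IO.Compression;

using (var input = File.OpenRead("huge.gz"))       // hypothetical input path
using (var gzip = new GZipStream(input, CompressionMode.Decompress))
using (var output = File.Create("huge.dat"))       // hypothetical output path
{
    gzip.CopyTo(output);   // streams in buffered chunks; never loads the whole file into memory
}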

There is sample code at CodeProject using the 7-Zip library.
The license is open, so you should be able to use this in your project.
7-Zip also supports GZ files.

Take a look at SharpZipLib. Not sure if it's subject to the same limitation, but worth a look.

Look for libraries that support DEFLATE64 (not Zip64, that's an extension to the ZIP file format). Xceed Zip for .NET does support Deflate64, and I'm sure others do too.

Having a look around, it seems a lot of people have encountered this problem.
System.IO.Compression.DeflateStream clarifications please seems to be the most comprehensive discussion.
The only implementation I was able to find that seems to overcome this problem, by using Zip64, is
Xceed Zip for .NET.
However, it is very expensive, and I am not sure whether it would suit your needs.
Edit:
There does seem to be quite a number of implementations of Zip64 for .NET, but I can't find any that are free.

DotNetZip does ZIP64 for .NET, and it is free. But Zip64 is not the same as Deflate64.

Although that documentation says the 4 GB limitation applies to both DeflateStream and GZipStream, only GZipStream is actually limited, because of its CRC32 checksum. If you do not need the CRC32, use DeflateStream instead.
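On pre-4.0 frameworks, one workaround, if you can live without the integrity check, is to skip the gzip header and hand the raw DEFLATE payload to DeflateStream. A hedged sketch follows: it assumes a minimal 10-byte header with no optional fields (FLG == 0), which is common but not guaranteed, so real code must parse the FLG byte and skip FEXTRA/FNAME/FCOMMENT fields when present. Paths are hypothetical.

using System.IO;
using System.IO.Compression;

using (var input = File.OpenRead("big.gz"))            // hypothetical path
{
    input.Seek(10, SeekOrigin.Begin);                  // skip the fixed gzip header (RFC 1952); assumes FLG == 0
    using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
    using (var output = File.Create("big.dat"))        // hypothetical path
    {
        var buffer = new byte[81920];
        int read;
        while ((read = deflate.Read(buffer, 0, buffer.Length)) > 0)
            output.Write(buffer, 0, read);             // the trailing CRC32/ISIZE bytes are never read or verified
    }
}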

Related

GZipStream to gzip string

I'm using GZipStream to gzip a string.
Can someone tell me if it is possible to control the level of compression? I ask because I have seen gzip streams that are more compressed than the ones .NET seems to create.
This will be possible in .NET 4.5, as a new constructor has been added that lets you specify a compression level. Another possibility is to use a third-party library that allows you to do this.
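For reference, a minimal sketch using that .NET 4.5 constructor (the method name and the UTF-8 choice are illustrative):

using System.IO;
using System.IO.Compression;
using System.Text;

static byte[] CompressString(string text)
{
    byte[] raw = Encoding.UTF8.GetBytes(text);
    using (var output = new MemoryStream())
    {
        // CompressionLevel.Optimal favors size; CompressionLevel.Fastest favors speed.
        using (var gzip = new GZipStream(output, CompressionLevel.Optimal))
        {
            gzip.Write(raw, 0, raw.Length);
        }
        return output.ToArray();
    }
}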
You will get better compression using #ZipLib

Delta Encoding for large files - good implementation available?

For my backup tool, I'm still looking for a good delta encoding algorithm that can handle binary files. The problem is, I've got pretty huge binary files, e.g. 600 MB and up, so it's pretty hard for a 32-bit application to allocate up to 10 GB of RAM. Honestly, it's impossible.
So I looked at the C# bsdiff implementation found here. It's pretty cool, but it loads the whole file into a byte array. So, does anyone know of an implementation that can handle large files? I mean, REALLY large files?
Assuming you're running in a Windows environment, look at Remote Differential Compression. It was developed as an improvement on rsync, with the premise that you have a server and a client holding similar versions of a file, one of them being the "master", and you want to sync them together.
C# wrapper on the COM libraries can be found here.
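As an illustration of why rsync/RDC-style approaches scale to huge files (this is a toy sketch of the rolling-checksum idea, not RDC's actual API; all names are made up): a weak checksum over a sliding window can be updated in O(1) per byte, so a file of any size can be scanned in a single streaming pass instead of being loaded into a byte array.

using System.Collections.Generic;
using System.IO;

// Illustrative only: an rsync-style weak rolling checksum.
static IEnumerable<uint> RollingChecksums(Stream input, int window)
{
    var buffer = new byte[window];
    if (input.Read(buffer, 0, window) < window)
        yield break;                              // file smaller than one window

    uint a = 0, b = 0;                            // Adler-style components
    for (int i = 0; i < window; i++)
    {
        a += buffer[i];
        b += (uint)(window - i) * buffer[i];
    }
    yield return (a & 0xffff) | (b << 16);

    int pos = 0, next;
    while ((next = input.ReadByte()) != -1)
    {
        byte oldest = buffer[pos];
        buffer[pos] = (byte)next;                 // slide the window forward by one byte
        pos = (pos + 1) % window;

        a = a - oldest + (uint)next;              // O(1) update of both components
        b = b - (uint)window * oldest + a;
        yield return (a & 0xffff) | (b << 16);
    }
}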

C#/.NET: In need of a few (de)compression libraries

My Google-Fu seems to be weak today. I've so far been able to find libraries for ZLib and BZip2 (SharpZipLib), and LZMA (7-Zip), but not Huffman, IMA ADPCM, and SPARSE.
Does anyone know of any pure .NET libraries that can handle these compression methods?
Thanks!
I've found this for Huffman...

zlib.h versus zlib.net

I have written a compressor in C/C++ (NOT under the CLR) using zlib.h, and it works great. The functions I use are deflate() and inflate().
Now I want to decompress the file produced by the C application with a zlib.net application in C#, but I can't get it working. When trying to decompress it, I get a magic-number error (the magic number being an application-specific value in the header). Does anyone know how to get past this problem, or can someone give me an example of the inflate()/deflate() functionality in .NET?
For more info on how I did my compression: it is similar to the example at http://www.zlib.net/zlib_how.html
Also, can anyone advise me on a good library to perform compression in both C++ and .NET?
Many thanks in advance...
There's some discussion on this here: Zlib-compatible compression streams?
I think Boost may work with zlib to add the header information: http://www.boost.org/doc/libs/1_36_0/libs/iostreams/doc/classes/gzip.html
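The likely cause of the magic-number error: zlib's deflate() by default writes a zlib wrapper (RFC 1950), while .NET's GZipStream expects a gzip wrapper (RFC 1952) and DeflateStream expects raw deflate data (RFC 1951). A hedged sketch of one workaround, assuming the C side wrote a standard zlib stream with no preset dictionary:

using System.IO;
using System.IO.Compression;

static byte[] InflateZlib(byte[] zlibData)
{
    using (var input = new MemoryStream(zlibData))
    {
        input.Seek(2, SeekOrigin.Begin);   // skip the 2-byte zlib header (CMF/FLG); assumes no FDICT
        using (var deflate = new DeflateStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            var buffer = new byte[81920];
            int read;
            while ((read = deflate.Read(buffer, 0, buffer.Length)) > 0)
                output.Write(buffer, 0, read);
            return output.ToArray();       // the 4-byte Adler-32 trailer is simply never consumed or verified
        }
    }
}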

The best way to Compress XML

I need to compress a very large XML file to the smallest possible size.
I work in C#, and I would prefer some open source library or application that I can access through my code, but I can handle a plain algorithm as well.
Thank you!
It may not be the "smallest size possible", but you could use System.IO.Compression to compress it. Zipping tends to provide very good compression for text.
byte[] xmlBytes = File.ReadAllBytes("input.xml");            // placeholder path for the XML document
using (var fileStream = File.OpenWrite("output.xml.gz"))     // placeholder output path
using (var zipStream = new GZipStream(fileStream, CompressionMode.Compress))
{
    zipStream.Write(xmlBytes, 0, xmlBytes.Length);
}
As stated above, Efficient XML Interchange (EXI) achieves the best available XML compression pretty consistently. Even without schemas, it is not uncommon for EXI to be 2-5 times smaller than zip. With schemas, you'll do even better.
If you're not opposed to a commercial implementation, you can use the .NET version of Efficient XML and call it directly from your C# code using standard .NET APIs. You can download a free trial copy from http://www.agiledelta.com/efx_download.html.
Have a look at XML Compression Tools. You can also compress it using SharpZipLib.
If you have a schema available for the XML file, you could try EXIficient. It is an implementation of the Efficient XML Interchange (EXI) format that is pretty much the best available general-purpose XML compression method. If you don't have a schema, EXI is still better than regular zip (the deflate algorithm, that is), but not very much, especially for large files.
EXIficient is Java-only, but you can probably make it into an application that you can call. I'm not aware of any open-source implementations of EXI in C#.
File size is not the only advantage of EXI (or any binary scheme). The processing time and memory overhead are also greatly reduced when reading/writing it. Imagine a program that copies floating point numbers to disk by simply copying the bytes. Now imagine another program converts the floating point numbers to formatted text, and pastes them into a text stream, and then feeds that stream through an expensive compression algorithm. Because of this ridiculous overhead, XML is basically unusable for very large files that could have been effortlessly processed with a binary representation.
Binary XML promises to address this longstanding weakness of XML. It would be very easy to make a utility that converts between binary/text representations (without knowing the XML schema), which means you can still edit the files easily when you want to.
XML is highly compressible. You can use DotNetZip to produce compressed zip files from your XML.
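For example, a minimal DotNetZip sketch (assuming the Ionic.Zip assembly is referenced; paths are illustrative):

using Ionic.Zip;

using (var zip = new ZipFile())
{
    zip.AddFile("input.xml");   // hypothetical path to the XML document
    zip.Save("output.zip");     // writes the deflate-compressed archive to disk
}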
If you require the maximum compression level, I would recommend LZMA. There is an SDK (including C#) that is part of the open-source 7-Zip project, available here.
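A hedged sketch of the usual encoding pattern with the SDK's C# classes, mirroring its LzmaAlone sample (the method name and stream parameters are illustrative):

using System;
using System.IO;

static void LzmaCompress(Stream input, Stream output)
{
    var encoder = new SevenZip.Compression.LZMA.Encoder();
    encoder.WriteCoderProperties(output);                       // 5-byte property header the decoder needs
    output.Write(BitConverter.GetBytes(input.Length), 0, 8);    // 8-byte uncompressed size, little-endian
    encoder.Code(input, output, input.Length, -1, null);        // compress the whole stream
}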
If you are looking for the smallest possible size then try Fast Infoset as binary XML encoding and then compress using BZIP2 or LZMA. You will probably get better results than compressing text XML or using EXI. FastInfoset.NET includes implementations of the Fast Infoset standard and several compression formats to choose from but it's commercial.
