Compressing / Decompressing Folders & Files - c#

Does anyone know of a good way to compress or decompress files and folders in C# quickly? Handling large files might be necessary.

The .NET 2.0 Framework namespace System.IO.Compression supports the GZip and Deflate algorithms. Here are two methods that compress and decompress a byte array, which you can read from your file. You can substitute DeflateStream for GZipStream in the methods below to use that algorithm instead. This still leaves the problem of handling files compressed with other algorithms, though.
public static byte[] Compress(byte[] data)
{
    using (MemoryStream output = new MemoryStream())
    {
        // leaveOpen = true keeps the MemoryStream usable after the GZipStream is closed.
        using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress, true))
        {
            gzip.Write(data, 0, data.Length);
        } // Closing the GZipStream flushes the last compressed block into output.
        return output.ToArray();
    }
}
public static byte[] Decompress(byte[] data)
{
    using (MemoryStream input = new MemoryStream(data))
    using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
    using (MemoryStream output = new MemoryStream())
    {
        byte[] buff = new byte[4096];
        int read;
        // Read returns 0 once the end of the compressed data is reached.
        while ((read = gzip.Read(buff, 0, buff.Length)) > 0)
        {
            output.Write(buff, 0, read);
        }
        return output.ToArray();
    }
}
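Since the question mentions large files, it may help to avoid holding the whole file in a byte array. The following is a minimal sketch, assuming .NET 4.0 or later (for Stream.CopyTo), that streams from file to file instead. The class and method names (GZipFileSketch, CompressFile, DecompressFile) are made up for illustration.

```csharp
using System.IO;
using System.IO.Compression;

public static class GZipFileSketch
{
    // Streams sourcePath into a gzip file at targetPath without loading it all into memory.
    public static void CompressFile(string sourcePath, string targetPath)
    {
        using (FileStream input = File.OpenRead(sourcePath))
        using (FileStream output = File.Create(targetPath))
        using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
        {
            input.CopyTo(gzip);
        }
    }

    // Streams a gzip file at sourcePath back into its uncompressed form at targetPath.
    public static void DecompressFile(string sourcePath, string targetPath)
    {
        using (FileStream input = File.OpenRead(sourcePath))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        using (FileStream output = File.Create(targetPath))
        {
            gzip.CopyTo(output);
        }
    }
}
```

Memory use stays bounded by the copy buffer, so the file size only affects running time.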

I've always used the SharpZip Library.
Here's a link

You can use a 3rd-party library such as SharpZip as Tom pointed out.
Another way (without going 3rd-party) is to use the Windows Shell API. You'll need to set a reference to the Microsoft Shell Controls and Automation COM library in your C# project. Gerald Gibson has an example at:
Internet Archive's copy of the dead page

As of .Net 1.1 the only available method is reaching into the java libraries.
Using the Zip Classes in the J# Class Libraries to Compress Files and Data with C#
Not sure if this has changed in recent versions.

My answer would be close your eyes and opt for DotNetZip. It's been tested by a large community.

GZipStream is a really good utility to use.

This is very easy to do in java, and as stated above you can reach into the java.util.zip libraries from C#. For references see:
java.util.zip javadocs
sample code
I used this a while ago to do a deep (recursive) zip of a folder structure, but I don't think I ever used the unzipping. If I'm so motivated I may pull that code out and edit it into here later.

Another good alternative is also DotNetZip.

You can create a zip file with this method:
public async Task<string> CreateZipFile(string sourceDirectoryPath, string name)
{
var path = HostingEnvironment.MapPath(TempPath) + name;
await Task.Run(() =>
{
if (File.Exists(path)) File.Delete(path);
ZipFile.CreateFromDirectory(sourceDirectoryPath, path);
});
return path;
}
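You can then unzip a zip file with either of these methods:
1. This method works with a zip file path:
public async Task ExtractZipFile(string filePath, string destinationDirectoryName)
{
    await Task.Run(() =>
    {
        using (var archive = ZipFile.Open(filePath, ZipArchiveMode.Read))
        {
            foreach (var entry in archive.Entries)
            {
                var target = Path.Combine(destinationDirectoryName, entry.FullName);
                // Entries with an empty Name are directories; there is no file to extract.
                if (string.IsNullOrEmpty(entry.Name))
                {
                    Directory.CreateDirectory(target);
                    continue;
                }
                // ExtractToFile does not create missing parent directories.
                Directory.CreateDirectory(Path.GetDirectoryName(target));
                entry.ExtractToFile(target, true);
            }
        }
    });
}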
2. This method works with a zip file stream:
public async Task ExtractZipFile(Stream zipFile, string destinationDirectoryName)
{
string filePath = HostingEnvironment.MapPath(TempPath) + Utility.GetRandomNumber(1, int.MaxValue);
using (FileStream output = new FileStream(filePath, FileMode.Create))
{
await zipFile.CopyToAsync(output);
}
await Task.Run(() => ZipFile.ExtractToDirectory(filePath, destinationDirectoryName));
await Task.Run(() => File.Delete(filePath));
}

Related

Zipping Two File with Same Content and Encoding them to base64 giving different response

I need to encode a zip file in Base64 format.
I followed this approach:
string text = File.ReadAllText("../../../SampleDat.dat");
byte[] compress0 = Compress(stringbyte);
string short_com0 = base64_encode(compress0);
public static byte[] Compress(byte[] data)
{
using (var compressedStream = new MemoryStream())
using (var zipStream = new GZipStream(compressedStream, CompressionMode.Compress))
{
zipStream.Write(data, 0, data.Length);
zipStream.Close();
return compressedStream.ToArray();
}
}
public string base64_encode(byte[] data)
{
if (data == null)
throw new ArgumentNullException("data");
return Convert.ToBase64String(data);
}
After using this I got this encoded string.
H4sIAAAAAAAEAJVQTU/CQBS8m/gfejHRgxQpoJJ4qGXBKlBsq6KXph8P2NjdrbuLleT9eBe/QvSgHt7hTWYmMzMmsdt3Yxe9lBe0SDVcisytqpLmqaaCkxctU5/PBQ5GZNabkjAxFwWThPhxQgYDNJd4bkyGQXifeEGfYKoUKMWA60nKYP+n5mwCTKyksjxJNUiaHmxpolzIf4tuZPk3iWcaLoRce6IAJPP5iHLwC5wC3ZSU7K30JwmjVcaoUgYynOGN38fI+OUQrZUGZrDtN6g5SAzhaUUV3dhMViwzyNey7//uzpiEQ/L74N/D46agaYZuwSinyvA0fQbLNQGVTrm2Di3CtVxbI3iGEjttXGpdqZ5t13XdyD9szLxVIxfMXlIJCkrItS2hElIrm/ICXuzH6V7rfL4oTx+CIMtY/+7aiaNZq7ZFnLfDinavZsFtBvfNpZ9HZIH4MyriUctpd7rHJ6dNvPDGDX88HaFz3MGO02w6r7wgTAN2AgAA
When I created the zip manually and read that file in the code instead:
//file zipped manually
string filePath1 = "../../../git_only/oraclehcm1/dbscripts/SampleDat.zip";
byte[] physicalfile1 = File.ReadAllBytes(filePath1);
string long_com1 = base64_encode(physicalfile1);
The response I get is
UEsDBBQAAAAIAECDYlK8IEwDbAEAAHYCAAANAAAAU2FtcGxlRGF0LmRhdJVQTU/CQBS8m/gfejHRgxQpqJB4qGXBKlBsq6KXph8P2NjdrbuLleT9eBc/4tdBPbzDvMxMZmZMYrfvxi56KS9okWo4F5lbVSXNU00FJ09apj6fCxyMyKw3JWFiLgomCfHjhAwGaC7x3JgMg/A28YI+wVQpUIoB15OUwe5PzckEmFhJZXmSapA03fukiXIh/y26kuXfJJ5puBBy7YkCkMznI8rBL3AKdFNSspfS7ySMVhmjSpmX4Qyv/D5Gxi+HaK00ML/4AoOag8QQHlZU0Y3NZMUykB/LvuLtrTEJh+T3wb+Hx01B0wzdglFOleFp+giWawIqnXJt7VuEa7m2RvAIJXbauNS6Uj3bruu6kb/ZmHmrRi6YvaQSFJSQa1tCJaRWNuUFPNn3053W6XxRdu+CIMtY/+bSiaNZq7ZFnLfDih5ezILrDG6bSz+PyALxZ1TEg5bT7hweHXebeOaNG/54OkLnqIMdp9l0ngFQSwECHwAUAAAACABAg2JSvCBMA2wBAAB2AgAADQAkAAAAAAAAACAAAAAAAAAAU2FtcGxlRGF0LmRhdAoAIAAAAAAAAQAYAEMpLaJSD9cBq6mosXsP1wFNJS5xSw7XAVBLBQYAAAAAAQABAF8AAACXAQAAAAA=
This is the actual response. I also noticed that the two zips are of different sizes, and that the files in the zip I created programmatically have no extensions.
Please help me create the second encoding programmatically. The .NET version I am using is 4.5,
and I cannot use the Zip.createDirectory() method due to project dependencies.
Any help is appreciated.
Thanks in advance!
The first one is a gzip file, the second one is a zip file. If you want to make a zip file, try the ZipFile class as opposed to the GZipStream class.
I wouldn't expect two different zip implementations to yield the same output. For one, in the programmatic way the file metadata (name, modification date, attributes) is not set, while the command-line version includes all that information for unzipping purposes.
Plus, libraries are updated at a different cadence than standalone tools, so the two may not have the same fixes and cannot be relied on to produce matching output.
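To get an actual zip (rather than gzip) without touching the file system, one option is the ZipArchive class over a MemoryStream. This is a minimal sketch, assuming .NET 4.5 or later with a reference to the System.IO.Compression assembly; the class and method names (ZipBase64Sketch, ZipToBase64) are hypothetical. The output will start with the same "PK" signature as the manually created zip, but it should not be expected to match it byte for byte, because timestamps and other metadata differ.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ZipBase64Sketch
{
    // Wraps data in a single-entry zip archive and returns the archive Base64-encoded.
    public static string ZipToBase64(string entryName, byte[] data)
    {
        using (var buffer = new MemoryStream())
        {
            // leaveOpen = true so the buffer can still be read after the archive is disposed.
            using (var archive = new ZipArchive(buffer, ZipArchiveMode.Create, true))
            {
                ZipArchiveEntry entry = archive.CreateEntry(entryName, CompressionLevel.Optimal);
                using (Stream entryStream = entry.Open())
                {
                    entryStream.Write(data, 0, data.Length);
                }
            } // Disposing the archive writes the central directory.
            return Convert.ToBase64String(buffer.ToArray());
        }
    }
}
```

Passing the entry name with its extension (for example "SampleDat.dat") is what gives the file inside the archive its extension.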

Zip within a zip opens to undocumented System.IO.Compression.SubReadStream

I have a function I use for aggregating streams from a zip archive.
private void ExtractMiscellaneousFiles()
{
foreach (var miscellaneousFileName in _fileData.MiscellaneousFileNames)
{
var fileEntry = _archive.GetEntry(miscellaneousFileName);
if (fileEntry == null)
{
throw new ZipArchiveMissingFileException("Couldn't find " + miscellaneousFileName);
}
var stream = fileEntry.Open();
OtherFileStreams.Add(miscellaneousFileName, (DeflateStream) stream);
}
}
This works well in most cases. However, if I have a zip within a zip, I get an exception on casting the stream to a DeflateStream:
System.InvalidCastException: Unable to cast object of type 'System.IO.Compression.SubReadStream' to type 'System.IO.Compression.DeflateStream'.
I am unable to find Microsoft documentation for a SubReadStream. I would like my zip within a zip as a DeflateStream. Is this possible? If so how?
UPDATE
Still no success. I attempted @Sunshine's suggestion of copying the stream using the following code:
private void ExtractMiscellaneousFiles()
{
_logger.Log("Extracting misc files...");
foreach (var miscellaneousFileName in _fileData.MiscellaneousFileNames)
{
_logger.Log($"Opening misc file stream for {miscellaneousFileName}");
var fileEntry = _archive.GetEntry(miscellaneousFileName);
if (fileEntry == null)
{
throw new ZipArchiveMissingFileException("Couldn't find " + miscellaneousFileName);
}
var openStream = fileEntry.Open();
var deflateStream = openStream;
if (!(deflateStream is DeflateStream))
{
var memoryStream = new MemoryStream();
deflateStream.CopyTo(memoryStream);
memoryStream.Position = 0;
deflateStream = new DeflateStream(memoryStream, CompressionLevel.NoCompression, true);
}
OtherFileStreams.Add(miscellaneousFileName, (DeflateStream)deflateStream);
}
}
But I get a
System.NotSupportedException: Stream does not support reading.
I inspected deflateStream.CanRead and it is true.
I've discovered this happens not just on zips, but on files that are in the zip but are not compressed (because too small, for example). Surely there's a way to deal with this; surely someone has encountered this before. I'm opening a bounty on this question.
Here's the .NET source for SubReadStream, thanks to @Quantic.
The return type of ZipArchiveEntry.Open() is Stream. An abstract type, in practice it can be a DeflateStream (you'd be happy), a SubReadStream (boo) or a WrappedStream (boo). Woe be you if they decide to improve the class some day and use a ZopfliStream (boo). The workaround is not good, you are trying to deflate data that is not compressed (boo).
Too many boos.
Only good solution is to change the type of your OtherFileStreams member. We can't see it, smells like a List<DeflateStream>. It needs to be a List<Stream>.
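A minimal sketch of that change, assuming the collection is a dictionary keyed by entry name (the question does not show its declaration). The names EntryStreamSketch and OpenEntries are hypothetical, and FileNotFoundException stands in for the asker's own exception type.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class EntryStreamSketch
{
    // Opens the named entries and keeps them typed as Stream, so stored
    // (uncompressed) and deflated entries are handled the same way.
    public static Dictionary<string, Stream> OpenEntries(ZipArchive archive, IEnumerable<string> names)
    {
        var streams = new Dictionary<string, Stream>();
        foreach (string name in names)
        {
            ZipArchiveEntry entry = archive.GetEntry(name);
            if (entry == null)
            {
                throw new FileNotFoundException("Couldn't find " + name);
            }
            streams.Add(name, entry.Open()); // no cast to DeflateStream
        }
        return streams;
    }
}
```

Callers then read each value through the Stream API only, which works whatever concrete type Open() happens to return.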
So it looks like a zip file stored inside another zip is not deflated again; its bytes are written as they are alongside the rest of the entries. That makes sense, because applying compression to something that is already compressed is a waste of time.
The inner zip file is marked as CompressionMethodValues.Stored in the archive, which causes .NET to return the original stream it read instead of wrapping it in a DeflateStream.
Source here: https://github.com/dotnet/corefx/blob/master/src/System.IO.Compression/src/System/IO/Compression/ZipArchiveEntry.cs#L670
You could pass the stream into a ZipArchive, if it's not a DeflateStream (if you are interested in the file inside)
var stream = entry.Open();
if (!(stream is DeflateStream))
{
var subArchive = new ZipArchive(stream);
}
Or you can copy the stream to a FileStream (if you want to save it to disk)
var stream = entry.Open();
if (!(stream is DeflateStream))
{
var fs = File.Create(Path.GetTempFileName());
stream.CopyTo(fs);
fs.Close();
}
Or copy to any stream you are interested in using.
Note: This is also how .NET 4.6 behaves

create zip file in .net with password

I'm working on a project where I need to create a password-protected zip from file content in C#.
Previously I used System.IO.Compression.GZipStream to create gzip content.
Does .NET have any functionality for creating a password-protected zip or rar file?
Take a look at DotNetZip (@AFract supplied a new link to GitHub in the comments).
It has pretty great documentation, and it also allows you to load the DLL at runtime as an embedded file.
Unfortunately there is no such functionality in the framework. There is a way to make ZIP files, but without a password. If you want to create password-protected ZIP files in C#, I'd recommend SevenZipSharp. It's basically a managed wrapper for 7-Zip.
SevenZipBase.SetLibraryPath(Path.Combine(
Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location) ?? Environment.CurrentDirectory,
"7za.dll"));
SevenZipCompressor compressor = new SevenZipCompressor();
compressor.Compressing += Compressor_Compressing;
compressor.FileCompressionStarted += Compressor_FileCompressionStarted;
compressor.CompressionFinished += Compressor_CompressionFinished;
string password = @"whatever";
string destinationFile = @"C:\Temp\whatever.zip";
string[] sourceFiles = Directory.GetFiles(@"C:\Temp\YourFiles\");
if (String.IsNullOrWhiteSpace(password))
{
compressor.CompressFiles(destinationFile, sourceFiles);
}
else
{
//optional
compressor.EncryptHeaders = true;
compressor.CompressFilesEncrypted(destinationFile, password, sourceFiles);
}
DotNetZip worked great in a clean way.
DotNetZip is a FAST, FREE class library and toolset for manipulating zip files.
Code
static void Main(string[] args)
{
using (ZipFile zip = new ZipFile())
{
zip.Password = "mypassword";
zip.AddDirectory(@"C:\Test\Report_CCLF5\");
zip.Save(@"C:\Test\Report_CCLF5_PartB.zip");
}
}
I want to add some more alternatives.
For .NET one can use SharpZipLib, for Xamarin use SharpZipLib.Portable.
Example for .NET:
using System;
using System.IO;
using ICSharpCode.SharpZipLib.Core; // StreamUtils
using ICSharpCode.SharpZipLib.Zip;
// Compresses the supplied memory stream, naming it as zipEntryName, into a zip,
// which is returned as a memory stream or a byte array.
//
public MemoryStream CreateToMemoryStream(MemoryStream memStreamIn, string zipEntryName) {
    MemoryStream outputMemStream = new MemoryStream();
    ZipOutputStream zipStream = new ZipOutputStream(outputMemStream);
    zipStream.SetLevel(3); // 0-9, 9 being the highest level of compression
    zipStream.Password = "Your password";
    ZipEntry newEntry = new ZipEntry(zipEntryName);
    newEntry.DateTime = DateTime.Now;
    zipStream.PutNextEntry(newEntry);
    StreamUtils.Copy(memStreamIn, zipStream, new byte[4096]);
    zipStream.CloseEntry();
    zipStream.IsStreamOwner = false; // False stops Close from also closing the underlying stream.
    zipStream.Close(); // Must finish the ZipOutputStream before using outputMemStream.
    outputMemStream.Position = 0;
    return outputMemStream;
    // Alternative outputs (use one of these instead of returning the stream):
    // ToArray is the cleanest and easiest to use correctly, at the cost of duplicating allocated memory.
    // byte[] byteArrayOut = outputMemStream.ToArray();
    // GetBuffer returns the raw buffer, so you need to account for the true length yourself.
    // byte[] byteArrayOut = outputMemStream.GetBuffer();
    // long len = outputMemStream.Length;
}
More samples can be found here.
If you can live without password functionality, one can mention ZipStorer or the built in .NET function in System.IO.Compression.

Decompress tar files using C#

I'm searching for a way to add embedded resources to my solution. These resources will be folders with a lot of files in them. They need to be decompressed on user demand.
I'm searching for a way to store such folders in the executable without involving third-party libraries (it looks rather stupid, but this is the task).
I have found that I can GZip and UnGZip them using standard libraries. But GZip handles a single file only. In such cases TAR should come onto the scene. But I haven't found a TAR implementation among the standard classes.
Maybe it is possible to decompress TAR with bare C#?
While looking for a quick answer to the same question, I came across this thread, and was not entirely satisfied with the current answers, as they all point to using third-party dependencies to much larger libraries, all just to achieve simple extraction of a tar.gz file to disk.
While the gz format could be considered rather complicated, tar on the other hand is quite simple. At its core, it just takes a bunch of files, prepends a 500 byte header (but takes 512 bytes) to each describing the file, and writes them all to single archive on a 512 byte alignment. There is no compression, that is typically handled by compressing the created file to a gz archive, which .NET conveniently has built-in, which takes care of all the hard part.
Having looked at the spec for the tar format, there are only really 2 values (especially on Windows) we need to pick out from the header in order to extract the file from a stream. The first is the name, and the second is size. Using those two values, we need only seek to the appropriate position in the stream and copy the bytes to a file.
I made a very rudimentary, down-and-dirty method to extract a tar archive to a directory, and added some helper functions for opening from a stream or filename, and decompressing the gz file first using built-in functions.
The primary method is this:
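public static void ExtractTar(Stream stream, string outputDir)
{
    var buffer = new byte[100];
    while (true)
    {
        // Header layout: name at offset 0 (100 bytes), size at offset 124 (12 bytes, octal text).
        if (stream.Read(buffer, 0, 100) == 0)
            break; // End of stream without a terminating block.
        var name = Encoding.ASCII.GetString(buffer).Trim('\0');
        if (String.IsNullOrWhiteSpace(name))
            break; // An all-zero header block marks the end of the archive.
        stream.Seek(24, SeekOrigin.Current);
        stream.Read(buffer, 0, 12);
        // The size field is padded with NULs and/or spaces, so trim both before parsing.
        var size = Convert.ToInt64(Encoding.ASCII.GetString(buffer, 0, 12).Trim('\0', ' '), 8);
        stream.Seek(376L, SeekOrigin.Current); // Skip the rest of the 512-byte header.
        var output = Path.Combine(outputDir, name);
        if (name.EndsWith("/"))
        {
            // Directory entries carry no data; create the directory and read the next header.
            Directory.CreateDirectory(output);
            continue;
        }
        if (!Directory.Exists(Path.GetDirectoryName(output)))
            Directory.CreateDirectory(Path.GetDirectoryName(output));
        // FileMode.Create truncates an existing file instead of leaving stale trailing bytes.
        using (var str = File.Open(output, FileMode.Create, FileAccess.Write))
        {
            var buf = new byte[size];
            stream.Read(buf, 0, buf.Length);
            str.Write(buf, 0, buf.Length);
        }
        // File data is padded to the next 512-byte boundary.
        var pos = stream.Position;
        var offset = 512 - (pos % 512);
        if (offset == 512)
            offset = 0;
        stream.Seek(offset, SeekOrigin.Current);
    }
}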
And here are a few helper functions for opening from a file, and for decompressing a tar.gz file/stream first before extracting.
public static void ExtractTarGz(string filename, string outputDir)
{
using (var stream = File.OpenRead(filename))
ExtractTarGz(stream, outputDir);
}
public static void ExtractTarGz(Stream stream, string outputDir)
{
// A GZipStream is not seekable, so copy it first to a MemoryStream
using (var gzip = new GZipStream(stream, CompressionMode.Decompress))
{
const int chunk = 4096;
using (var memStr = new MemoryStream())
{
int read;
var buffer = new byte[chunk];
do
{
// Read may return fewer than chunk bytes before the end; only 0 means end of stream.
read = gzip.Read(buffer, 0, chunk);
memStr.Write(buffer, 0, read);
} while (read > 0);
memStr.Seek(0, SeekOrigin.Begin);
ExtractTar(memStr, outputDir);
}
}
}
public static void ExtractTar(string filename, string outputDir)
{
using (var stream = File.OpenRead(filename))
ExtractTar(stream, outputDir);
}
Here is a gist of the full file with some comments.
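The ExtractTar method above calls stream.Read and assumes each call fills the requested range. That tends to hold for a MemoryStream, but Stream.Read is allowed to return fewer bytes than asked for. A small helper along these lines could replace those calls; it is a sketch with hypothetical names (StreamReadSketch, ReadExactly), and newer .NET versions (7 and later) ship a built-in Stream.ReadExactly that serves the same purpose.

```csharp
using System.IO;

public static class StreamReadSketch
{
    // Reads exactly count bytes into buffer, looping because a single
    // Stream.Read call may return fewer bytes than requested.
    public static void ReadExactly(Stream stream, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int read = stream.Read(buffer, total, count - total);
            if (read == 0)
            {
                throw new EndOfStreamException(
                    "Stream ended after " + total + " of " + count + " bytes.");
            }
            total += read;
        }
    }
}
```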
Tar-cs will do the job, but it is quite slow. I would recommend using SharpCompress which is significantly quicker. It also supports other compression types and it has been updated recently.
using System;
using System.IO;
using SharpCompress.Common;
using SharpCompress.Reader;
private static String directoryPath = @"C:\Temp";
public static void unTAR(String tarFilePath)
{
using (Stream stream = File.OpenRead(tarFilePath))
{
var reader = ReaderFactory.Open(stream);
while (reader.MoveToNextEntry())
{
if (!reader.Entry.IsDirectory)
{
ExtractionOptions opt = new ExtractionOptions {
ExtractFullPath = true,
Overwrite = true
};
reader.WriteEntryToDirectory(directoryPath, opt);
}
}
}
}
See tar-cs
using (FileStream unarchFile = File.OpenRead(tarfile))
{
TarReader reader = new TarReader(unarchFile);
reader.ReadToEnd("out_dir");
}
Since you are not allowed to use outside libraries, you are not restricted to a specific tar format either. In fact, the data does not even need to be all in one file.
You can write your own tar-like utility in C# that walks a directory tree and produces two files: a "header" file that consists of a serialized dictionary mapping relative file paths to offset/length pairs, and a big file containing the contents of the individual files concatenated into one giant blob. This is not a trivial task, but it's not overly complicated either.
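A rough sketch of that idea follows. It assumes the blob is written outside the source directory, keeps the index in memory (serializing it to the "header" file is left out), and loads each file fully into memory on unpack. All names (BlobPackSketch, Pack, Unpack) are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class BlobPackSketch
{
    // Concatenates every file under sourceDir into blobPath and returns
    // relative path -> (offset, length) so each file can be located later.
    public static Dictionary<string, Tuple<long, long>> Pack(string sourceDir, string blobPath)
    {
        var index = new Dictionary<string, Tuple<long, long>>();
        string root = Path.GetFullPath(sourceDir).TrimEnd(Path.DirectorySeparatorChar)
                      + Path.DirectorySeparatorChar;
        using (FileStream blob = File.Create(blobPath))
        {
            foreach (string file in Directory.GetFiles(root, "*", SearchOption.AllDirectories))
            {
                long offset = blob.Position;
                using (FileStream input = File.OpenRead(file))
                {
                    input.CopyTo(blob);
                }
                index[file.Substring(root.Length)] = Tuple.Create(offset, blob.Position - offset);
            }
        }
        return index;
    }

    // Recreates the directory tree under outputDir from the blob and its index.
    public static void Unpack(string blobPath, Dictionary<string, Tuple<long, long>> index, string outputDir)
    {
        using (FileStream blob = File.OpenRead(blobPath))
        {
            foreach (var item in index)
            {
                string target = Path.Combine(outputDir, item.Key);
                Directory.CreateDirectory(Path.GetDirectoryName(target));
                blob.Seek(item.Value.Item1, SeekOrigin.Begin);
                byte[] data = new byte[item.Value.Item2];
                int total = 0;
                while (total < data.Length) // Read may return fewer bytes than requested.
                {
                    int read = blob.Read(data, total, data.Length - total);
                    if (read == 0) throw new EndOfStreamException();
                    total += read;
                }
                File.WriteAllBytes(target, data);
            }
        }
    }
}
```

The blob itself can then be passed through GZipStream to get compression on top of the packing.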
There are two ways to compress and decompress with the .NET 2.0 libraries. The first is the GZipStream and DeflateStream classes. GZipStream writes the .gz format, so a file compressed with it can be opened by popular compression applications such as WinZip, WinRAR or 7-Zip; a file written with DeflateStream is a raw Deflate stream, which those applications cannot open.
The second is the Package class. It uses the same compression; the only difference is that it can hold multiple files in one archive, which WinZip, WinRAR or 7-Zip can then open. It is not a generic .zip file, though:
it is the container Microsoft uses for its Office files with *x extensions. If you open any .docx file with the Package class you can see everything stored in it. With these classes alone you cannot create or read a generic zip file, so consider a third-party library such as
http://www.icsharpcode.net/OpenSource/SharpZipLib/
or implement everything from the ground up.

sharpziplib compressed files to be uncompressed externally

I have a scenario in which I want to zip an email attachment using SharpZipLib. The end user will then open the attachment and unzip the attached file.
Will a file zipped with SharpZipLib be easily unzipped by other programs on my end user's machine?
It depends on how you use SharpZipLib. There is more than one way to compress data with this library.
Here is an example of a method that creates a zip file you will be able to open in pretty much any zip-aware application:
private static byte[] CreateZip(byte[] fileBytes, string fileName)
{
using (var memoryStream = new MemoryStream())
using (var zipStream = new ZipOutputStream(memoryStream))
{
var crc = new Crc32();
crc.Reset();
crc.Update(fileBytes);
var zipEntry =
new ZipEntry(fileName)
{
Crc = crc.Value,
DateTime = DateTime.Now,
Size = fileBytes.Length
};
zipStream.PutNextEntry(zipEntry);
zipStream.Write(fileBytes, 0, fileBytes.Length);
zipStream.Finish();
zipStream.Close();
return memoryStream.ToArray();
}
}
Usage:
var fileBytes = File.ReadAllBytes(@"C:/1.xml");
var zipBytes = CreateZip(fileBytes, "MyFile.xml");
File.WriteAllBytes(@"C:/2.zip", zipBytes);
This CreateZip method is optimized for the cases when you already have bytes in memory and you just want to compress them and send without even saving to disk.
