How do I extract any archives with SharpZipLib? - c#

I'm using SharpZipLib to extract archives. I managed to extract .zip archives:
FastZip fastZip = new FastZip();
fastZip.ExtractZip(file, directory, null);
and to extract .tar.gz:
// Use a 4K buffer. Any larger is a waste.
byte[] dataBuffer = new byte[4096];
using (Stream fileStream = new FileStream(file, FileMode.Open, FileAccess.Read))
{
using (GZipInputStream gzipStream = new GZipInputStream(fileStream))
{
// Change this to your needs
string fnOut = Path.Combine(directory, Path.GetFileNameWithoutExtension(file));
using (FileStream fsOut = File.Create(fnOut))
{
StreamUtils.Copy(gzipStream, fsOut, dataBuffer);
}
}
}
Is there also a way to extract any kind of archive where I don't need to know the type of archive upfront? (e.g. SharpZipLib.ExtractAnyArchive(file, directory))

SharpZipLib unfortunately is currently not able to auto-detect the format of an archive file/stream.
You either have to implement the functionality by yourself in some form, or seek an alternative library that is able to auto-detect the format of an archive. An example of such a library would be SharpCompress, however, as you already noted in the comments, different libraries can come with different kind of limitations and bugs that might affect the functionality of your software.
If you decide to roll your own auto-detection functionality for SharpZipLib, you can choose different approaches, like
Try opening an (unknown) archive using the archive (reader/stream) classes for every archive format supported by SharpZipLib, until you find one which can open and process the archive file successfully.
Implement some format detection routine that scans an archive file/stream for 'magic' signature bytes identifying a particular archive format. If the format of an archive file/stream has been thus identified, select and use the appropriate SharpZipLib classes for handling the detected archive format.

Related

Compress large log file before reading

We have a large amount of logs (117 logs with total of about 17gb of data). It's straight text so I know it will compress well. I'm not looking for great compression, or speed (but that would be a good bonus). What I currently do is get a list of log files to read (they have a date stamp in the file name, so I filter on that first). After I get the list I then read each file using File.ReadAllLines() but we also filter on that...
private void GetBulkUpdateItems(List<string> allLines, Regex updatedRowsRegEx)
{
foreach (var file in this)
allLines.AddRange(File.ReadAllLines(file).Where(x => updatedRowsRegEx.IsMatch(x)));
allLines.Sort();
}
reading 5 files from the network takes about 22 seconds. What I'd like to do is compress the list of files into a single zip file. copy the zip file locally, then unzip them and do the rest. Problem is I can't figure out how to start. Since I'm using .net 4.5 I first tried System.IO.Compression.ZipFile but it wants a Directory and I don't want all 117 files. I saw someone use a network stream and 7zip which sounded promising, and I'm fairly certain that 7zip is installed on the server I need the logs from (Probably not important because we use the UNC path). So I'm stuck. Any suggestions?
ZipArchive is the underlying class for ZipFile and allows more granular manipulation.
Sample from the article adding hardcoded text:
using (FileStream zipToOpen = new FileStream(
#"c:\users\exampleuser\release.zip", FileMode.Open))
{
using (ZipArchive archive = new ZipArchive(zipToOpen, ZipArchiveMode.Update))
{
ZipArchiveEntry readmeEntry = archive.CreateEntry("Readme.txt");
using (StreamWriter writer = new StreamWriter(readmeEntry.Open()))
{
writer.WriteLine("Information about this package.");
writer.WriteLine("========================");
}
}
}
As Praveen Paulose suggested you can use ZipFileExtensions.CreateEntryFromFile to create entry from file to add to archive.

DeflateStream CopyTo writes nothing and throws no exceptions

I've basically copied this code sample directly from msdn with some minimal changes. The CopyTo method is silently failing and I have no idea why. What would cause this behavior? It is being passed a 78 KB zipped folder with a single text file inside of it. The returned FileInfo object points to a 0 KB file. No exceptions are thrown.
public static FileInfo DecompressFile(FileInfo fi)
{
// Get the stream of the source file.
using (FileStream inFile = fi.OpenRead())
{
// Get original file extension,
// for example "doc" from report.doc.cmp.
string curFile = fi.FullName;
string origName = curFile.Remove(curFile.Length
- fi.Extension.Length);
//Create the decompressed file.
using (FileStream outFile = File.Create(origName))
{
// work around for incompatible compression formats found
// here http://george.chiramattel.com/blog/2007/09/deflatestream-block-length-does-not-match.html
inFile.ReadByte();
inFile.ReadByte();
using (DeflateStream Decompress = new DeflateStream(inFile,
CompressionMode.Decompress))
{
// Copy the decompression stream
// into the output file.
Decompress.CopyTo(outFile);
return new FileInfo(origName);
}
}
}
}
In a comment you say that you are trying to decompress a zip file. The DeflateStream class can not be used like this on a zip file. The MSDN example you mentioned uses DeflateStream to create individual compressed files and then uncompresses them.
Although zip files might use the same algorithm (not sure about that) they are not just compressed versions of a single file. A zip file is a container that can hold many files and/or folders.
If you can use .NET Framework 4.5 I would suggest to use the new ZipFile or ZipArchive class. If you must use an earlier framework version there are free libraries you can use (like DotNetZip or SharpZipLib).

7zip compress network stream

I will like to compress a file before sending it through the network. I think the best approach is 7zip because it is free and open source.
How I use 7zip with .net?
I know that 7zip is free and that they have the source code in c# but for some reason it is very slow on c# so I rather call the dll 7z.dll that comes when installing 7zip for performance reasons. So the way I am able to eassily marshal and call the methods in 7z.dll is with the help of the library called sevenzipsharp . For example adding that dll to my project will enable me to do:
// if you installed 7zip 64bit version then make sure you change plataform target
// like on the picture I showed above!
SevenZip.SevenZipCompressor.SetLibraryPath(#"C:\Program Files\7-Zip\7z.dll");
var stream = System.IO.File.OpenRead(#"SomeFileToCompress.txt");
var outputStream = System.IO.File.Create("Output.7z");
SevenZip.SevenZipCompressor compressor = new SevenZip.SevenZipCompressor();
compressor.CompressionMethod = SevenZip.CompressionMethod.Lzma2;
compressor.CompressionLevel = SevenZip.CompressionLevel.Ultra;
compressor.CompressStream(stream, outputStream);
that's how I use 7zip within c#.
Now my question is:
I will like to send a compressed file over the network. I know I could compress it first then send it. The file is 4GB so I will have to wait a long time for it to compress. I will be wasting a lot of space on hard drive. then I will finally be able to send it. I think that is to complicated. I was wondering how it will be possible to send the file meanwhile it is being compressed.
It seems to be a problem with SevenZipSharp:
Have you considered an alternate library - one that doesn't even require 7-Zip to be installed / available?
From the description posted at http://dotnetzip.codeplex.com/ :
creating zip files from stream content, saving to a stream, extracting
to a stream, reading from a stream
Unlike 7-Zip, DotNetZip is designed to work with C# / .Net.
Plenty of examples - including streaming, are available at http://dotnetzip.codeplex.com/wikipage?title=CS-Examples&referringTitle=Examples .
Another option is to use the 7-Zip Command Line Version (7z.exe), and write to/read from standard in/out. This would allow you to use the 7-Zip file format, while also keeping all of the core work in native code (though there likely won't be much of a significant difference).
Looking back at SevenZipSharp:
Since the 0.29 release, streaming is supported.
Looking at http://sevenzipsharp.codeplex.com/SourceControl/changeset/view/59007#364711 :
it seems you'd want this method:
public void CompressStream(Stream inStream, Stream outStream)
Thank you for considering performance here! I think way too many people would do exactly what you're trying to avoid: compress to a temp file, then do something with the temp file.
CompressStream threw an exception. My code is as follows:
public void TestCompress()
{
string fileToCompress = #"C:\Users\gary\Downloads\BD01.DAT";
byte[] inputBytes = File.ReadAllBytes(fileToCompress);
var inputStream = new MemoryStream(inputBytes);
byte[] zipBytes = new byte[38000000]; // this memory size is large enough.
MemoryStream outStream = new MemoryStream(zipBytes);
string compressorEnginePath = #"C:\Engine\7z.dll";
SevenZipCompressor.SetLibraryPath(compressorEnginePath);
compressor = new SevenZip.SevenZipCompressor();
compressor.CompressionLevel = CompressionLevel.Fast;
compressor.CompressionMethod = CompressionMethod.Lzma2;
compressor.CompressStream(inputStream, outputStream);
inputStream.Close();
outputStream.Close();
The exception messages:
Message: Test method Test7zip.UnitTest1.TestCompress threw exception:
SevenZip.SevenZipException: The execution has failed due to the bug in the SevenZipSharp.
Please report about it to http://sevenzipsharp.codeplex.com/WorkItem/List.aspx, post the release number and attach the archive

How to open a file from Memory Stream

Is it possible to open a file directly from a MemoryStream opposed to writing to disk and doing Process.Start() ? Specifically a pdf file? If not, I guess I need to write the MemoryStream to disk (which is kind of annoying). Could someone then point me to a resource about how to write a MemoryStream to Disk?
It depends on the client :) if the client will accept input from stdin you could push the dta to the client. Another possibility might be to write a named-pipes server or a socket-server - not trivial, but it may work.
However, the simplest option is to just grab a temp file and write to that (and delete afterwards).
var file = Path.GetTempFileName();
using(var fileStream = File.OpenWrite(file))
{
var buffer = memStream.GetBuffer();
fileStream.Write(buffer, 0, (int)memStream.Length);
}
Remember to clean up the file when you are done.
Path.GetTempFileName() returns file name with '.tmp' extension, therefore you cant't use Process.Start() that needs windows file association via extension.
If by opening a file, you mean something like starting Adobe Reader for PDF files, then yes, you have to write it to a file. That is, unless the application provides you with some API do that.
One way to write a stream to file would be:
using (var memoryStream = /* create the memory stream */)
using (var fileStream = File.OpenWrite(fileName))
{
memoryStream.WriteTo(fileStream);
}

ICSharpCode.SharpZipLib.Zip example with crc variable details

I am using icsharpziplib dll for zipping sharepoint files using c# in asp.net
When i open the output.zip file, it is showing "zip file is either corrupted or damaged".
And the crc value for files in the output.zip is showing as 000000.
How do we calculate or configure crc value using icsharpziplib dll?
Can any one have the good example how to do zipping using memorystreams?
it seems you're not creating each ZipEntry.
Here's is a code that I adapted to my needs:
http://wiki.sharpdevelop.net/SharpZipLib-Zip-Samples.ashx#Create_a_Zip_fromto_a_memory_stream_or_byte_array_1
Anyway with SharpZipLib there are many ways you can work with zip file: the ZipFile class, the ZipOutputStream and the FastZip.
I'm using the ZipOutputStream to create an in-memory ZIP file, adding in-memory streams to it and finally flushing to disk, and it's working quite good. Why ZipOutputStream? Because it's the only choice available if you want to specify a compression level and use Streams.
Good luck :)
1:
You could do it manually but the ICSharpCode library will take care of it for you. Also something I've discovered: 'zip file is either corrupted or damaged' can also be a result of not adding your zip entry name correctly (such as an entry that sits in a chain of subfolders).
2:
I solved this problem by creating a compressionHelper utility. I had to dynamically compose and return zip files. Temp files were not an option as the process was to be run by a webservice.
The trick with this was a BeginZip(), AddEntry() and EndZip() methods (because I made it into a utility to be invoked. You could just use the code directly if need be).
Something I've excluded from the example are checks for initialization (like calling EndZip() first by mistake) and proper disposal code (best to implement IDisposable and close your zipfileStream and your memoryStream if applicable).
using System.IO;
using ICSharpCode.SharpZipLib.Zip;
public void BeginZipUpdate()
{
_memoryStream = new MemoryStream(200);
_zipOutputStream = new ZipOutputStream(_memoryStream);
}
public void EndZipUpdate()
{
_zipOutputStream.Finish();
_zipOutputStream.Close();
_zipOutputStream = null;
}
//Entry name could be 'somefile.txt' or 'Assemblies\MyAssembly.dll' to indicate a folder.
//Unsure where you'd be getting your file, I'm reading the data from the database.
public void AddEntry(string entryName, byte[] bytes)
{
ZipEntry entry = new ZipEntry(entryName);
entry.DateTime = DateTime.Now;
entry.Size = bytes.Length;
_zipOutputStream.PutNextEntry(entry);
_zipOutputStream.Write(bytes, 0, bytes.Length);
_zipOutputStreamEntries.Add(entryName);
}
So you're actually having the zipOutputStream write to a memoryStream. Then once _zipOutputStream is closed, you can return the contents of the memoryStream.
public byte[] GetResultingZipFile()
{
_zipOutputStream.Finish();
_zipOutputStream.Close();
_zipOutputStream = null;
return _memoryStream.ToArray();
}
Just be aware of how much you want to add to a zipfile (delay in process/IO/timeouts etc).

Categories

Resources