7zip compress network stream

7zip compress network stream - c#

I will like to compress a file before sending it through the network. I think the best approach is 7zip because it is free and open source.
How I use 7zip with .net?
I know that 7zip is free and that they have the source code in c# but for some reason it is very slow on c# so I rather call the dll 7z.dll that comes when installing 7zip for performance reasons. So the way I am able to eassily marshal and call the methods in 7z.dll is with the help of the library called sevenzipsharp . For example adding that dll to my project will enable me to do:
// if you installed 7zip 64bit version then make sure you change plataform target
// like on the picture I showed above!
SevenZip.SevenZipCompressor.SetLibraryPath(#"C:\Program Files\7-Zip\7z.dll");
var stream = System.IO.File.OpenRead(#"SomeFileToCompress.txt");
var outputStream = System.IO.File.Create("Output.7z");
SevenZip.SevenZipCompressor compressor = new SevenZip.SevenZipCompressor();
compressor.CompressionMethod = SevenZip.CompressionMethod.Lzma2;
compressor.CompressionLevel = SevenZip.CompressionLevel.Ultra;
compressor.CompressStream(stream, outputStream);
that's how I use 7zip within c#.
Now my question is:
I will like to send a compressed file over the network. I know I could compress it first then send it. The file is 4GB so I will have to wait a long time for it to compress. I will be wasting a lot of space on hard drive. then I will finally be able to send it. I think that is to complicated. I was wondering how it will be possible to send the file meanwhile it is being compressed.
It seems to be a problem with SevenZipSharp:

Have you considered an alternate library - one that doesn't even require 7-Zip to be installed / available?
From the description posted at http://dotnetzip.codeplex.com/ :
creating zip files from stream content, saving to a stream, extracting
to a stream, reading from a stream
Unlike 7-Zip, DotNetZip is designed to work with C# / .Net.
Plenty of examples - including streaming, are available at http://dotnetzip.codeplex.com/wikipage?title=CS-Examples&referringTitle=Examples .
Another option is to use the 7-Zip Command Line Version (7z.exe), and write to/read from standard in/out. This would allow you to use the 7-Zip file format, while also keeping all of the core work in native code (though there likely won't be much of a significant difference).
Looking back at SevenZipSharp:
Since the 0.29 release, streaming is supported.
Looking at http://sevenzipsharp.codeplex.com/SourceControl/changeset/view/59007#364711 :
it seems you'd want this method:
public void CompressStream(Stream inStream, Stream outStream)
Thank you for considering performance here! I think way too many people would do exactly what you're trying to avoid: compress to a temp file, then do something with the temp file.

CompressStream threw an exception. My code is as follows:
public void TestCompress()
{
string fileToCompress = #"C:\Users\gary\Downloads\BD01.DAT";
byte[] inputBytes = File.ReadAllBytes(fileToCompress);
var inputStream = new MemoryStream(inputBytes);
byte[] zipBytes = new byte[38000000]; // this memory size is large enough.
MemoryStream outStream = new MemoryStream(zipBytes);
string compressorEnginePath = #"C:\Engine\7z.dll";
SevenZipCompressor.SetLibraryPath(compressorEnginePath);
compressor = new SevenZip.SevenZipCompressor();
compressor.CompressionLevel = CompressionLevel.Fast;
compressor.CompressionMethod = CompressionMethod.Lzma2;
compressor.CompressStream(inputStream, outputStream);
inputStream.Close();
outputStream.Close();
The exception messages:
Message: Test method Test7zip.UnitTest1.TestCompress threw exception:
SevenZip.SevenZipException: The execution has failed due to the bug in the SevenZipSharp.
Please report about it to http://sevenzipsharp.codeplex.com/WorkItem/List.aspx, post the release number and attach the archive

Related

C#- Renci.Ssh.Net- Which one gives optimized performance- WriteAllText Vs. UploadFile

I need to generate multiple XML files at SFTP location from C# code. for SFTP connectivity, I am using Renci.Ssh.net. I found there are different methods to generate files including WriteAllText() and UploadFile(). I am producing XML string runtime, currently I've used WriteAllText() method (just to avoid creating the XML file on local and thus to avoid IO operation).
using (SftpClient client = new SftpClient(host,port, sftpUser, sftpPassword))
{
client.Connect();
if (client.IsConnected)
{
client.BufferSize = 1024;
var filePath = sftpDir + fileName;
client.WriteAllText(filePath, contents);
client.Disconnect();
}
client.Dispose();
}
Will using UploadFile(), either from FileStream or MemoryStream give me better performance in long run?
The result document size will be in KB, around 60KB.
Thanks!

SftpClient.UploadFile is optimized for uploads of large amount of data.
But for 60KB, I'm pretty sure that it makes no difference whatsoever. So you can continue using the more convenient SftpClient.WriteAllText.
Though, I believe that most XML generators (like .NET XmlWriter are able to write XML to Stream (it's usually the preferred output API, rather than a string). So the use of SftpClient.UploadFile can be more convenient in the end.
See also What is the difference between SftpClient.UploadFile and SftpClient.WriteAllBytes?

Reading an image file from local storage on mono for android

In mono for android I have an app that saves images to local storage for caching purposes. When the app launches it tries to load images from the cache before trying to load them from the web.
I'm currently having a hard time finding a good way to read and load them from local storage.
I'm currently using something equivilant to this:
List<byte> byteList = new List<byte>();
using (System.IO.BinaryReader binaryReader = new System.IO.BinaryReader(context.OpenFileInput("filename.jpg")))
{
while (binaryReader.BaseStream.IsDataAvailable())
{
byteList.Add(binaryReader.ReadByte());
}
}
return byteList.toArray();
OpenFileInput() returns a stream that does not give me a length so I have to read one byte at a time. It also can't seek. This seems to be causing images to load much slower than they aughto. Loading images from Resrouce.Drawable is almost instantanious by comparison, but with my method there a very noticable pause, maybe 300ms, for loading a 8kb file. This seems like a really obvious task to be able to do, but I've tried many solutions and searched a lot for advise but to no avail.
I've also noticed this code seems to crash with an EndOfStream exception when not run on the UI thread.
Any help would be hugely appreciated

What do you intend on doing with the List<byte>? You want to "load images from the cache," but you don't specify what you want to load them into.
If you want to load them into a Android.Graphics.Bitmap, you could use BitmapFactory.DecodeStream(Stream):
Bitmap bitmap = BitmapFactory.DecodeStream(context.OpenFileInput("filename.jpg"));
This would remove the List<byte> intermediary.
If you really need all the bytes (for whatever reason), you can rely on the fact that System.Environment.GetFolderPath(System.Environment.SpecialFolder.Personal) is the same as Context.FilesDir, which is what context.OpenFileInput() will use, permitting:
byte[] bytes = System.IO.File.ReadAllBytes(
Path.Combine (
System.Environment.GetFolderPath(System.Environment.SpecialFolder.Personal),
"filename.jpg"));
However, if this is truly a cache, you should be using Context.CacheDir instead of Context.FilesDir, which is Path.GetTempPath returns:
byte[] cachedBytes = System.IO.File.ReadAllBytes(
Path.Combine(System.IO.Path.GetTempPath(), "filename.jpg"));

How to open a file from Memory Stream

Is it possible to open a file directly from a MemoryStream opposed to writing to disk and doing Process.Start() ? Specifically a pdf file? If not, I guess I need to write the MemoryStream to disk (which is kind of annoying). Could someone then point me to a resource about how to write a MemoryStream to Disk?

It depends on the client :) if the client will accept input from stdin you could push the dta to the client. Another possibility might be to write a named-pipes server or a socket-server - not trivial, but it may work.
However, the simplest option is to just grab a temp file and write to that (and delete afterwards).
var file = Path.GetTempFileName();
using(var fileStream = File.OpenWrite(file))
{
var buffer = memStream.GetBuffer();
fileStream.Write(buffer, 0, (int)memStream.Length);
}
Remember to clean up the file when you are done.

Path.GetTempFileName() returns file name with '.tmp' extension, therefore you cant't use Process.Start() that needs windows file association via extension.

If by opening a file, you mean something like starting Adobe Reader for PDF files, then yes, you have to write it to a file. That is, unless the application provides you with some API do that.
One way to write a stream to file would be:
using (var memoryStream = /* create the memory stream */)
using (var fileStream = File.OpenWrite(fileName))
{
memoryStream.WriteTo(fileStream);
}

ICSharpCode.SharpZipLib.Zip example with crc variable details

I am using icsharpziplib dll for zipping sharepoint files using c# in asp.net
When i open the output.zip file, it is showing "zip file is either corrupted or damaged".
And the crc value for files in the output.zip is showing as 000000.
How do we calculate or configure crc value using icsharpziplib dll?
Can any one have the good example how to do zipping using memorystreams?

it seems you're not creating each ZipEntry.
Here's is a code that I adapted to my needs:
http://wiki.sharpdevelop.net/SharpZipLib-Zip-Samples.ashx#Create_a_Zip_fromto_a_memory_stream_or_byte_array_1
Anyway with SharpZipLib there are many ways you can work with zip file: the ZipFile class, the ZipOutputStream and the FastZip.
I'm using the ZipOutputStream to create an in-memory ZIP file, adding in-memory streams to it and finally flushing to disk, and it's working quite good. Why ZipOutputStream? Because it's the only choice available if you want to specify a compression level and use Streams.
Good luck :)

1:
You could do it manually but the ICSharpCode library will take care of it for you. Also something I've discovered: 'zip file is either corrupted or damaged' can also be a result of not adding your zip entry name correctly (such as an entry that sits in a chain of subfolders).
2:
I solved this problem by creating a compressionHelper utility. I had to dynamically compose and return zip files. Temp files were not an option as the process was to be run by a webservice.
The trick with this was a BeginZip(), AddEntry() and EndZip() methods (because I made it into a utility to be invoked. You could just use the code directly if need be).
Something I've excluded from the example are checks for initialization (like calling EndZip() first by mistake) and proper disposal code (best to implement IDisposable and close your zipfileStream and your memoryStream if applicable).
using System.IO;
using ICSharpCode.SharpZipLib.Zip;
public void BeginZipUpdate()
{
_memoryStream = new MemoryStream(200);
_zipOutputStream = new ZipOutputStream(_memoryStream);
}
public void EndZipUpdate()
{
_zipOutputStream.Finish();
_zipOutputStream.Close();
_zipOutputStream = null;
}
//Entry name could be 'somefile.txt' or 'Assemblies\MyAssembly.dll' to indicate a folder.
//Unsure where you'd be getting your file, I'm reading the data from the database.
public void AddEntry(string entryName, byte[] bytes)
{
ZipEntry entry = new ZipEntry(entryName);
entry.DateTime = DateTime.Now;
entry.Size = bytes.Length;
_zipOutputStream.PutNextEntry(entry);
_zipOutputStream.Write(bytes, 0, bytes.Length);
_zipOutputStreamEntries.Add(entryName);
}
So you're actually having the zipOutputStream write to a memoryStream. Then once _zipOutputStream is closed, you can return the contents of the memoryStream.
public byte[] GetResultingZipFile()
{
_zipOutputStream.Finish();
_zipOutputStream.Close();
_zipOutputStream = null;
return _memoryStream.ToArray();
}
Just be aware of how much you want to add to a zipfile (delay in process/IO/timeouts etc).

ZIP file created with SharpZipLib cannot be opened on Mac OS X

Argh, today is the day of stupid problems and me being an idiot.
I have an application which creates a zip file containing some JPEGs from a certain directory. I use this code in order to:
read all files from the directory
append each of them to a ZIP file
using (var outStream = new FileStream("Out2.zip", FileMode.Create))
{
using (var zipStream = new ZipOutputStream(outStream))
{
foreach (string pathname in pathnames)
{
byte[] buffer = File.ReadAllBytes(pathname);
ZipEntry entry = new ZipEntry(Path.GetFileName(pathname));
entry.DateTime = now;
zipStream.PutNextEntry(entry);
zipStream.Write(buffer, 0, buffer.Length);
}
}
}
All works well under Windows, when I open the file e. g. with WinRAR, the files are extracted. But as soon as I try to unzip my archive on Mac OS X, it only creates a .cpgz file. Pretty useless.
A normal .zip file created manually with the same files on Windows is extracted without any problems on Windows and Mac OS X.
I found the above code on the Internet, so I am not absolutely sure if the whole thing is correct. I wonder if it is needed to use zipStream.Write() in order to write directly to the stream?

got the exact same problem today. I tried implementing the CRC stuff as proposed but it didn't help.
I finaly found the solution on this page: http://community.sharpdevelop.net/forums/p/7957/23476.aspx#23476
As a result, I just had to add this line in my code:
oZIPStream.UseZip64 = UseZip64.Off;
And the file opens up as it should on MacOS X :-)
Cheers
fred

I don't know for sure, because I am not very familiar with either SharpZipLib or OSX , but I still might have some useful insight for you.
I've spent some time wading through the zip spec, and actually I wrote DotNetZip, which is a zip library for .NET, unrelated to SharpZipLib.
Currently on the user forums for DotNetZip, there's a discussion going on about zip files generated by DotNetZip that cannot be read on OSX. One of the people using the library is having a problem that seems similar to what you are seeing. Except I have no idea what a .cpgxz file is.
We tracked it down, a little. At this point the most promising theory is that OSX does not like "bit 3" in the "general purpose bitfield" in the header of each zip entry.
Bit 3 is not new. PKWare added bit 3 to the spec 17 years ago. It was intended to support streaming generation of archives, in the way that SharpZipLib works. DotNetZip also has a way to produce a zipfile as it is streamed out, and it will also set bit-3 in the zip file if used in this way, although normally DotNetZip will produce a zipfile with bit-3 unset in it.
From what we can tell, when bit 3 is set, the OSX zip reader (whatever it is - like I said I'm not familiar with OSX) chokes on the zip file. The same zip contents produced without bit 3, allows the zip file to be opened. Actually it's not as simple as just flipping one bit - the presence of the bit signals the presence of other metadata. So I am using "bit 3" as a shorthand for all that.
So the theory is that bit 3 causes the problem. I haven't tested this myself. There's been some impedance mismatch on the communication with the person who has the OSX machine - so it is unresolved as yet.
But, if this theory holds, it would explain your situation: that WinRar and any Windows machine can open the file, but OSX cannot.
On the DotNetZip forums, we had a discussion about what to do about the problem. As near as I can tell, the OSX zip reader is broken, and cannot handle bit 3, so the workaround is to produce a zip file with bit 3 unset. I don't know if SharpZipLib can be convinced to do that.
I do know that if you use DotNetZip, and use the normal ZipFile class, and save to a seekable stream (like a filesystem file), you will get a zip that does not have bit 3 set. If the theory is correct, it should open with no problem on the Mac, every time. This is the result the DotNetZip user has reported. It's just one result so not generalizable yet, but it looks plausible.
example code for your scenario:
using (ZipFile zip = new ZipFile()
{
zip.AddFiles(pathnames);
zip.Save("Out2.zip");
}
Just for the curious, in DotNetZip you will get bit 3 set if you use the ZipFile class and save it to a nonseekable stream (like ASPNET's Response.OutputStream) or if you use the ZipOutputStream class in DotNetZip, which always writes forward only (no seeking back).
I think SharpZipLib's ZipOutputStream is also always "forward only."

So, I searched for a few more examples on how to use SharpZipLib and I finally got it to work on Windows and os x. Basically I added the "Crc32" of the file to the zip archive. No idea what this is though.
Here is the code that worked for me:
using (var outStream = new FileStream("Out3.zip", FileMode.Create))
{
using (var zipStream = new ZipOutputStream(outStream))
{
Crc32 crc = new Crc32();
foreach (string pathname in pathnames)
{
byte[] buffer = File.ReadAllBytes(pathname);
ZipEntry entry = new ZipEntry(Path.GetFileName(pathname));
entry.DateTime = now;
entry.Size = buffer.Length;
crc.Reset();
crc.Update(buffer);
entry.Crc = crc.Value;
zipStream.PutNextEntry(entry);
zipStream.Write(buffer, 0, buffer.Length);
}
zipStream.Finish();
// I dont think this is required at all
zipStream.Flush();
zipStream.Close();
}
}
Explanation from cheeso:
CRC is Cyclic Redundancy Check - it's a checksum on the entry data. Normally the header for each entry in a zip file contains a bunch of metadata, including some things that cannot be known until all the entry data has been streamed - CRC, Uncompressed size, and compressed size. When generating a zipfile through a streamed output, the zip spec allows setting a bit (bit 3) to specify that these three data fields will immediately follow the entry data.
If you use ZipOutputStream, normally as you write the entry data, it is compressed and a CRC is calculated, and the 3 data fields are written immediately after the file data.
What you've done is streamed the data twice - the first time implicitly as you calculate the CRC on the file before writing it. If my theory is correct, the what is happening is this: When you provide the CRC to the zipStream before writing the file data, this allows the CRC to appear in its normal place in the entry header, which keeps OSX happy. I'm not sure what happens to the other two quantities (compressed and uncompressed size).

I had exactly the same problem, my mistake was (and in your example code as well) that I didn't supply the file lenght for each entry.
Example code:
...
ZipEntry entry = new ZipEntry(Path.GetFileName(pathname));
entry.DateTime = now;
var fileInfo = new FileInfo(pathname)
entry.size = fileInfo.lenght;
...

I was separating the folder names with a backslash... when I changed this to a forward slash it worked!

What's going on with the .cpgz file is that Archive Utility is being launched by a file with a .zip extension. Archive Utility examines the file and thinks it isn't compressed, so it's compressing it. For some bizarre reason, .cpgz (CPIO archiving + gzip compression) is the default. You can set a different default in Archive Utility's Preferences.
If you do indeed discover this is a problem with OS X's zip decoder, please file a bug. You can also try using the ditto command-line tool to unpack it; you may get a better error message. Of course, OS X also ships unzip, the Info-ZIP utility, but I'd expect that to work.

I agree with Cheeso's answer however if the Input file size is greater than 2GB then byte[] buffer = File.ReadAllBytes(pathname); will throw an IO exception.
So i modified Cheeso code and it works like a charm for all the files.
.
long maxDataToBuffer = 104857600;//100MB
using (var outStream = new FileStream("Out3.zip", FileMode.Create))
{
using (var zipStream = new ZipOutputStream(outStream))
{
Crc32 crc = new Crc32();
foreach (string pathname in pathnames)
{
tempBuffLength = maxDataToBuffer;
FileStream fs = System.IO.File.OpenRead(pathname);
ZipEntry entry = new ZipEntry(Path.GetFileName(pathname));
entry.DateTime = now;
entry.Size = buffer.Length;
crc.Reset();
long totalBuffLength = 0;
if (fs.Length <= tempBuffLength) tempBuffLength = fs.Length;
byte[] buffer = null;
while (totalBuffLength < fs.Length)
{
if ((fs.Length - totalBuffLength) <= tempBuffLength)
tempBuffLength = (fs.Length - totalBuffLength);
totalBuffLength += tempBuffLength;
buffer = new byte[tempBuffLength];
fs.Read(buffer, 0, buffer.Length);
crc.Update(buffer, 0, buffer.Length);
buffer = null;
}
entry.Crc = crc.Value;
zipStream.PutNextEntry(entry);
tempBuffLength = maxDataToBuffer;
fs = System.IO.File.OpenRead(pathname);
totalBuffLength = 0;
if (fs.Length <= tempBuffLength) tempBuffLength = fs.Length;
buffer = null;
while (totalBuffLength < fs.Length)
{
if ((fs.Length - totalBuffLength) <= tempBuffLength)
tempBuffLength = (fs.Length - totalBuffLength);
totalBuffLength += tempBuffLength;
buffer = new byte[tempBuffLength];
fs.Read(buffer, 0, buffer.Length);
zipStream.Write(buffer, 0, buffer.Length);
buffer = null;
}
fs.Close();
}
zipStream.Finish();
// I dont think this is required at all
zipStream.Flush();
zipStream.Close();
}
}

I had a similar problem but on Windows 7. I updated to the as of this writing latest version of ICSharpZipLib 0.86.0.518. From then on I could no longer decompress any ZIP archives created with the code that was working so far.
There error messages were different depending on the tool I tried to extract with:
Unknown compression method.
Compressed size in local header does not match that of central directory header in new zip file.
What did the trick was to remove the CRC calculation as mentioned here: http://community.sharpdevelop.net/forums/t/8630.aspx
So I removed the line that is:
entry.Crc = crc.Value
And from then on I could again unzip the ZIP archives with any third party tool. I hope this helps someone.

There are two things:
Ensure your underlying output stream is seekable, or SharpZipLib won't be able to back up and fill in any ZipEntry fields that you omitted (size, crc, compressed size, ...). As a result, SharpZipLib will force "bit 3" to be enabled. The background has been explained pretty well in previous answers.
Fill in ZipEntry.Size, or explicitly set stream.UseZip64 = UseZip64.Off. The default is to conservatively assume the stream could be very large. Unzipping then requires "pk 4.5" support.

I encountered weird behavior when archive is empty (no entries inside it) it can not be opened on MAC - generates cpgz only. The idea was to put a dummy .txt file in it in case when no files for archiving.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.