split zip files to volumes [duplicate] - c#

I need to create spanned (multi-volume) zip files using .NET, but I have been unable to find a library that enables me to do it.
A spanned zip is a zip-compressed file that is split across a number of files, which usually have extensions like .z01, .z02, and so on.
The library would have to be open-source or free, because I'm going to use it for an open-source project.
(It's a duplicate of this question, but there are no answers there and I'm not looking for anything ASP-specific anyway.)

DotNetZip example:
int segmentsCreated;
using (ZipFile zip = new ZipFile())
{
    zip.UseUnicode = true; // utf-8
    zip.AddDirectory(@"MyDocuments\ProjectX");
    zip.Comment = "This zip was created at " + System.DateTime.Now.ToString("G");
    zip.MaxOutputSegmentSize = 100 * 1024; // 100k segments
    zip.Save("MyFiles.zip");
    segmentsCreated = zip.NumberOfSegmentsForMostRecentSave;
}
If segmentsCreated comes back as 5, then you have the following files, each no more than 100 KB in size.
MyFiles.zip
MyFiles.z01
MyFiles.z02
MyFiles.z03
MyFiles.z04
Edited to note: DotNetZip used to live at Codeplex. Codeplex has been shut down. The old archive is still available at Codeplex. It looks like the code has migrated to GitHub:
https://github.com/DinoChiesa/DotNetZip. This looks to be the original author's repo.
https://github.com/haf/DotNetZip.Semverd. This looks to be the currently maintained version. It's also packaged up and available via NuGet at https://www.nuget.org/packages/DotNetZip/

DotNetZip allows you to do this. From their documentation:
The library supports zip passwords, Unicode, ZIP64, stream input and output,
AES encryption, multiple compression levels, self-extracting archives,
spanned archives, and more.

Take a look at the SevenZipSharp library. It supports multi-volume archives.

Related

Programmatically merging zip segments made by DotNetZip

I have a problem merging zip segments that I made using the DotNetZip library.
I'm zipping a big file, which produces files like abc.zip, abc.z01 and abc.z02.
using (ZipFile zip = new ZipFile())
{
    zip.AddDirectory(fullDir);
    zip.MaxOutputSegmentSize = 65536;
    zip.Save(packageFullName);
    return zip.NumberOfSegmentsForMostRecentSave;
}
In another service I want to download these files and merge them into a single zip file. I did this by simply concatenating their byte arrays. Sadly, I get an error saying that the archive I created isn't valid.
I'm not sure why my approach isn't right. I found this question: https://superuser.com/questions/15935/how-do-i-reassemble-a-zip-file-that-has-been-emailed-in-multiple-parts - but the accepted answer also produces an invalid archive.
Does anybody know how I can merge a few DotNetZip segment files? I don't really want to extract them in memory and pack them again, but maybe it's the only way.
DotNetZip can read segmented zip files without a problem. You can refer to its source code to see how it handles the segment files as one zip file. It's an internal class that you cannot use directly, but it may give you a clue about how to do it:
http://dotnetzip.codeplex.com/SourceControl/latest#Zip/ZipSegmentedStream.cs

C# re-zipping Docx after image replace won't open [duplicate]

I have been trying to write a simple Markdown -> docx parser/writer, but am completely stuck with the last part, which should be the easiest: i.e. compressing the folder into a .docx that Word, or any other .docx reader, will recognize.
My parser-writer is irrelevant really: I have this problem if I simply unzip any old Word-produced *.docx and then try to recompress it with the usual compression utilities, giving it the file-ending docx. Is there some mysterious header I should be adding, or do I need a special OPC compression utility, or what?
I don't so much want a tool that will do this, as to figure out what is supposed to be there. It seems to be independent of the WordprocessingML specification.
Needless to say I don't know anything about compression. Everything I can find via Google has to do with fancy utilities you can use in business, but I'm making a little executable that would be GPLd or something, and should work on anything.
The most common problem with manually zipping together Open XML documents is that it will not work if you zip the directory instead of its contents. In other words, the [Content_Types].xml file and the word, docProps, and _rels directories need to reside at the root level of the zip file.
Here are steps to unzip my.docx and re-zip:
% mkdir unzipped
% cd unzipped/
% unzip ../my.docx
% zip -r ../rezipped.docx *
% open ../rezipped.docx
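The same re-zip step can be sketched in C# with the built-in System.IO.Compression classes. This is a minimal sketch rather than a definitive tool: the class and method names (DocxRezipper, Rezip, HasContentTypesAtRoot) are made up for illustration, it assumes the output path lies outside the unzipped directory, and on .NET Framework it assumes a reference to the System.IO.Compression.FileSystem assembly. The key detail is includeBaseDirectory: false, which zips the contents of the folder rather than the folder itself.

```csharp
using System.IO;
using System.IO.Compression;

public static class DocxRezipper
{
    // Zips the *contents* of unzippedDir (not the directory itself) into docxPath,
    // so that [Content_Types].xml, _rels, word and docProps sit at the archive root.
    // Assumption: docxPath is located outside unzippedDir.
    public static void Rezip(string unzippedDir, string docxPath)
    {
        if (File.Exists(docxPath))
        {
            File.Delete(docxPath); // CreateFromDirectory throws if the target exists
        }

        ZipFile.CreateFromDirectory(
            unzippedDir,
            docxPath,
            CompressionLevel.Optimal,
            includeBaseDirectory: false);
    }

    // Returns true when [Content_Types].xml is a root-level entry of the archive.
    public static bool HasContentTypesAtRoot(string docxPath)
    {
        using (ZipArchive archive = ZipFile.OpenRead(docxPath))
        {
            return archive.GetEntry("[Content_Types].xml") != null;
        }
    }
}
```

Calling DocxRezipper.Rezip("unzipped", "rezipped.docx") mirrors the shell steps above; HasContentTypesAtRoot is only a quick sanity check on the layout, not a full validation of the package.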
The container is an ordinary ZIP archive whose entries are compressed with Deflate; no Base64 encoding is involved.
7-Zip seems to offer this, though I have not tested it.
Further to what Mica said, the contents of the ZIP file are organised according to the Open Packaging Convention; cf. Microsoft's Essentials of the Open Packaging Convention.
You can use the .NET System.IO.Packaging namespace to make and manipulate .docx files; it is also implemented in the Mono project.

Efficient compression of folder with same file copied multiple times

I am creating a *.zip using Ionic.Zip. However, my *.zip contains the same files multiple times, sometimes even 20x, and the ZIP format does not take advantage of that at all.
What's worse, Ionic.Zip sometimes crashes with an OutOfMemoryException, since I am compressing the files into a MemoryStream.
Is there a .NET library for compressing that takes advantage of redundancy between files?
Users decompress the files on their own, so it cannot be an exotic format.
I ended up creating a tar.gz using the SharpZipLib library. Using this solution on 1 file, the archive is 3kB. Using it on 20 identical files, the archive is only 6kB, whereas in .zip it was 64kB.
Nuget:
Install-Package SharpZipLib
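Usings:
using System.IO;
using ICSharpCode.SharpZipLib.GZip;
using ICSharpCode.SharpZipLib.Tar;
Code:
// files is assumed to be a list of file paths (e.g. List<string>).
var output = new MemoryStream();
using (var gzip = new GZipOutputStream(output))
using (var tar = TarArchive.CreateOutputTarArchive(gzip))
{
    // Keep the underlying MemoryStream open after the archive is disposed.
    tar.IsStreamOwner = false;
    gzip.IsStreamOwner = false;
    for (int i = 0; i < files.Count; i++)
    {
        var tarEntry = TarEntry.CreateEntryFromFile(files[i]);
        tar.WriteEntry(tarEntry, false); // false: do not recurse into directories
    }
}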
No, there is no such API exposed by well-known ones (such as GZip, PPMd, Zip, LZMA). They all operate per file (or stream of bytes to be more specific).
You could concatenate all the files, e.g. using a tarball format, and then apply a compression algorithm.
Or, it's trivial to implement your own check: compute a hash for each file and store it in a hash-to-filename dictionary. If the hash matches for a later file, you can decide what you want to do, such as ignoring this file completely, or perhaps noting its name and saving it in another file to mark duplicates.
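The hash-dictionary check described above could look roughly like this. It is a sketch under assumptions: the names DuplicateFinder and MapToFirstCopy are hypothetical, SHA-256 is just one reasonable choice of hash, and what you do with the detected duplicates (skip them, record them in a manifest) is left to the caller.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

public static class DuplicateFinder
{
    // Maps each file path to the first path seen with identical content.
    // A file that is unique, or the first of its kind, maps to itself.
    public static Dictionary<string, string> MapToFirstCopy(IEnumerable<string> paths)
    {
        var firstByHash = new Dictionary<string, string>();
        var result = new Dictionary<string, string>();

        using (SHA256 sha = SHA256.Create())
        {
            foreach (string path in paths)
            {
                string hash;
                using (FileStream stream = File.OpenRead(path))
                {
                    // Hash the file contents as a stream, without loading it all at once.
                    hash = BitConverter.ToString(sha.ComputeHash(stream));
                }

                if (!firstByHash.TryGetValue(hash, out string first))
                {
                    // First time this content is seen: remember it as the original.
                    first = path;
                    firstByHash[hash] = path;
                }
                result[path] = first;
            }
        }
        return result;
    }
}
```

Only the entries that map to themselves need to be added to the archive; the remaining entries describe which original each duplicate should be restored from.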
Yes, 7-Zip. There is a SevenZipSharp library you could use, but in my experience, launching a compression process directly from the command line is much faster.
My personal experience:
We used SevenZipSharp at a company to decompress archives of up to 1 GB, and it was terribly slow until I reworked it to use 7-Zip directly by running its command-line interface. Then it was as fast as decompressing manually in Windows Explorer.
I haven't tested this, but according to one answerer in "How many times can a file be compressed?":
If you have a large number of duplicate files, the zip format will zip each independently, and you can then zip the first zip file to remove duplicate zip information.

How to Decompress nested GZip (TGZ) files in C#

I am receiving a TGZ file that will contain one plain text file along with possibly one or more nested TGZ files. I have figured out how to decompress the main TGZ file and read the plain text file contained in it, but I have not been able to figure out how to recognize and decompress the nested TGZ files. Has anyone come across this problem before?
Also, I do not have control over the file I am receiving, so I cannot change the format of a TGZ file containing nested TGZ files. One other caveat (even though I don't think it matters) is that these files are being compressed and tarred in a Unix or Linux environment.
Thanks in advance for any help.
Try the SharpZipLib (http://www.icsharpcode.net/OpenSource/SharpZipLib/Download.aspx) free library.
It lets you work with TGZ and has methods to test files before trying to inflate them; so you can either rely on the file extensions being correct, or test them individually to see if you can read them as compressed files - then inflate them once the main file has been decompressed.
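If you prefer to test the extracted data yourself instead of trusting file extensions, one simple check is to look for the two GZip magic bytes (0x1F 0x8B, per RFC 1952) at the start of the stream. The sketch below uses only the base class library; the names GZipSniffer and LooksLikeGZip are made up for illustration, and it assumes a seekable stream such as a MemoryStream or FileStream. A match only means the data starts like a GZip stream, so decompression can still fail on corrupt input.

```csharp
using System.IO;

public static class GZipSniffer
{
    // Returns true when the stream starts with the GZip magic bytes 0x1F 0x8B.
    // Assumption: the stream is seekable, so its position can be restored.
    public static bool LooksLikeGZip(Stream stream)
    {
        long start = stream.Position;
        int first = stream.ReadByte();  // -1 if the stream is empty
        int second = stream.ReadByte();
        stream.Position = start;        // rewind so the caller can decompress from the beginning
        return first == 0x1F && second == 0x8B;
    }
}
```

After extracting each entry of the outer TGZ into a stream, call LooksLikeGZip on it and feed the entries that match back into the same decompression routine.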
To read and write .tar and .tgz (or .tar.gz ) files from .NET, you can use this one-file tar class:
http://cheesoexamples.codeplex.com/SourceControl/changeset/view/97756#1868643
Very simple usage. To create an archive:
string[] filenames = { ... };
Ionic.Tar.CreateArchive("archive.tar", filenames);
Create a compressed (gzip'd) tar archive:
string[] filenames = { ... };
Ionic.Tar.CreateArchive("archive.tgz", filenames, TarOptions.Compress);
Read a tar archive:
var entries = Ionic.Tar.List("archive.tar"); // also handles .tgz files
Extract all entries in a tar archive:
var entries = Ionic.Tar.Extract("archive.tar"); // also handles .tgz files
Take a look at DotNetZip on CodePlex.
"If all you want is a better DeflateStream or GZipStream class to replace the one that is built-into the .NET BCL, that is here, too. DotNetZip's DeflateStream and GZipStream are available in a standalone assembly, based on a .NET port of Zlib. These streams support compression levels and deliver much better performance that the built-in classes. There is also a ZlibStream to complete the set (RFC 1950, 1951, 1952)."
It appears that you can iterate through the compressed file and pull the individual files out of the archive. You can then test the files you extracted and see whether any of them are themselves GZip files.
Here is a snippet from their Examples page:
using (ZipFile zip = ZipFile.Read(ExistingZipFile))
{
    foreach (ZipEntry e in zip)
    {
        e.Extract(OutputStream);
    }
}
Keith

c# file container

I am searching for a way to add several files into one file, much like a Zip file. I need to be able to create a file container on the fly and add several Word documents, images and other important files to the container. My criteria are that you don't need to install any additional software on the computer (preferably only a .DLL file that I can include in my project), that the program is free, and that you can encrypt the data.
Does anyone know of any good container libraries that meet these criteria, or of any good information about how to create your own container?
Patrick
Does it have to be like a zip file, or can it be a zip file?
Are you using .NET Framework 3.0 or 3.5? If so, look at
System.IO.Packaging.ZipPackage
This discussion has a section about it.
In addition to DotNetZip (licensed with Microsoft Public License) that Jay Riggs mentions, there's SharpZipLib (licensed with GPL). Whichever you choose, be sure the terms of the license match your understanding of the word "free".
If you can use ZipPackage, one benefit is that you don't need to think about license terms (beyond those of developing any other .NET app).
EDIT: DotNetZip and SharpZipLib support encryption. I don't see that ZipPackage does, but you could look at System.IO.Packaging.EncryptedPackageEnvelope.
I used DotNetZip in a project and it worked really well. I would recommend using it. It supports encryption and is easy to use.
http://www.codeplex.com/DotNetZip
Use the .NET GZipStream class (System.IO.Compression namespace) to compress and decompress files. You can find more information on
MSDN Link
GZIP Compression
I have personally used this technique to decompress a .zip file. Click Here
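For reference, a minimal round trip with the built-in GZipStream might look like the sketch below. The helper names (GZipHelper, Compress, Decompress) are made up for illustration. Note that GZipStream handles the GZip format, which is a single compressed stream; it does not by itself provide a multi-file container or encryption, so it only covers part of what the question asks for.

```csharp
using System.IO;
using System.IO.Compression;

public static class GZipHelper
{
    // Compresses a byte array into GZip format.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            } // disposing the GZipStream flushes the remaining compressed data and the trailer

            return output.ToArray(); // ToArray still works after the stream is closed
        }
    }

    // Decompresses GZip data back into the original bytes.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

For files on disk, the same pattern applies with FileStream in place of MemoryStream.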
