ZIP file from files with Danish characters in the filenames - c#

I am trying to create a ZIP file from files with Danish characters in the filenames in C#. I have tried ICSharpCode (SharpZipLib), System.IO.Compression, and Ionic.Zip, but with none of them can I get the Danish characters into the ZIP file.
I need the filenames to be exactly the same as the originals, because I am uploading the ZIP files to a program I have no control over.
It looks as though I should have the filenames saved as Unicode by setting newEntry.IsUnicodeText = true, but this gives me something like +ª+++Ñ+å+ÿ+à instead of æøåÆØÅ when I open the ZIP file in Windows Explorer. With IsUnicodeText = false I get æ¢åÆ¥Å, which is close - only the ø becomes a cent character.
I get the same result from System.IO.Compression and Ionic.Zip if I choose UTF-8 encoding.
(I have also tried ZipConstants.DefaultCodePage = 850; this does not help.)
I can see that people have been struggling with this for years, but I don't see any clear answer. I would be grateful for any tips.
// This is the ICSharpCode (SharpZipLib) version.
// Requires: using System.IO; using ICSharpCode.SharpZipLib.Core;
string fileToZip = @"Kontrolplan_15_Kørestrøm.docx"; // a file
string entryName = "Danish Letters (æøåÆØÅ).docx";
string zipPath = @"ZsharpZIP.ZIP";
using (ICSharpCode.SharpZipLib.Zip.ZipOutputStream s = new ICSharpCode.SharpZipLib.Zip.ZipOutputStream(File.Create(zipPath)))
{
    byte[] buffer = new byte[4096];
    // describe the entry
    FileInfo fi = new FileInfo(fileToZip);
    ICSharpCode.SharpZipLib.Zip.ZipEntry newEntry = new ICSharpCode.SharpZipLib.Zip.ZipEntry(entryName);
    newEntry.Size = fi.Length;
    newEntry.IsUnicodeText = true;
    s.PutNextEntry(newEntry);
    // write file
    using (FileStream streamReader = File.OpenRead(fileToZip))
    {
        StreamUtils.Copy(streamReader, s, buffer);
    }
    s.CloseEntry();
}

It seems that Windows Explorer's built-in zip utility cannot handle UTF-8 file names. When I used 7-Zip instead, the problem disappeared.
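To separate "the library wrote the wrong name" from "the viewer decoded it wrongly", it can help to read the archive back in code. The following is a minimal sketch using System.IO.Compression (not the SharpZipLib code above); the class and method names are made up for illustration. It only shows that the name round-trips within .NET when UTF-8 is requested explicitly. Whether Windows Explorer or the receiving program displays the name correctly still depends on that tool.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

static class DanishZipSketch // hypothetical name, for illustration only
{
    // Builds a zip in memory with one entry whose name is encoded as UTF-8.
    public static byte[] CreateZip(string entryName, byte[] content)
    {
        using (var ms = new MemoryStream())
        {
            // leaveOpen: true keeps the MemoryStream usable after the archive is disposed.
            using (var zip = new ZipArchive(ms, ZipArchiveMode.Create, true, Encoding.UTF8))
            {
                ZipArchiveEntry entry = zip.CreateEntry(entryName);
                using (Stream s = entry.Open())
                {
                    s.Write(content, 0, content.Length);
                }
            }
            return ms.ToArray();
        }
    }

    // Reads the archive back and returns the name of its first entry.
    public static string FirstEntryName(byte[] zipBytes)
    {
        using (var ms = new MemoryStream(zipBytes))
        using (var zip = new ZipArchive(ms, ZipArchiveMode.Read, false, Encoding.UTF8))
        {
            return zip.Entries[0].FullName;
        }
    }

    static void Main()
    {
        string name = "Danish Letters (æøåÆØÅ).docx";
        byte[] zipBytes = CreateZip(name, new byte[] { 1, 2, 3 });
        Console.WriteLine("Name survived round trip: " + (FirstEntryName(zipBytes) == name));
    }
}
```

If the name survives here but looks wrong in another tool, the archive is probably fine and the tool is decoding the name with a different code page.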

Related

Error "GZIP header, first magic byte doesn't match" - How to decompress a .tar.gz file using SharpZipLib

I'm trying to decompress a compressed file that contains multiple files.
The following code works for .tgz files, but not for .tar.gz files.
What is wrong here?
FileInfo infoCompressed = new FileInfo(compressedFilePath);
DirectoryInfo destinyDir = new DirectoryInfo(infoCompressed.Directory.ToString());
if (!destinyDir.Exists)
destinyDir.Create();
using(Stream originFile = new GZipInputStream(infoCompressed.OpenRead()))
{
using(TarArchive tarFile = TarArchive.CreateInputTarArchive(originFile, TarBuffer.DefaultBlockFactor, Encoding.Default))
{
tarFile.ExtractContents(destinyDir.FullName);
}
}
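The error message in the title suggests that the stream does not begin with the gzip magic bytes (0x1F 0x8B), for example because the file is really a plain tar archive despite its extension. That is only a guess about this case. A small sketch for checking the first two bytes before handing the file to GZipInputStream follows; the class and method names are hypothetical.

```csharp
using System;
using System.IO;

static class GzipMagicSketch // hypothetical name, for illustration only
{
    // Returns true when the stream starts with the two gzip magic bytes.
    public static bool LooksLikeGzip(Stream stream)
    {
        int first = stream.ReadByte();   // -1 at end of stream
        int second = stream.ReadByte();
        return first == 0x1F && second == 0x8B;
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.WriteLine("usage: pass the path of the archive to inspect");
            return;
        }
        using (FileStream fs = File.OpenRead(args[0]))
        {
            Console.WriteLine(LooksLikeGzip(fs)
                ? "gzip header found"
                : "not a gzip stream; it may be a plain tar or another format");
        }
    }
}
```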

How to open up a zip file from MemoryStream

I am using DotNetZip.
What I need to do is open a zip file containing files from the server.
The user can then grab the files and store them locally on their machine.
What I did before was the following:
string path = "Q:\\ZipFiles\\zip" + npnum + ".zip";
zip.Save(path);
Process.Start(path);
Note that Q: is a drive on the server. Process.Start simply opens the zip file so that the user can access all the files. I would like to do the same, but show the archive from memory instead of storing the file on disk.
So, instead of saving the zip file on the server, I would like to open it from a MemoryStream.
I have the following, but it does not seem to work:
var ms = new MemoryStream();
zip.Save(ms);
I am not sure how to proceed from here, i.e. how to open the zip file from a memory stream so that the user can access all the files.
Here is a live piece of code (copied verbatim) which I wrote to download a series of blog posts as a zipped csv file. It's live and it works.
public ActionResult L2CSV()
{
var posts = _dataItemService.SelectStuff();
string csv = CSV.IEnumerableToCSV(posts);
// These first two lines simply get our required data as a long csv string
var fileData = Zip.CreateZip("LogPosts.csv", System.Text.Encoding.UTF8.GetBytes(csv));
var cd = new System.Net.Mime.ContentDisposition
{
FileName = "LogPosts.zip",
// always prompt the user for downloading, set to true if you want
// the browser to try to show the file inline
Inline = false,
};
Response.AppendHeader("Content-Disposition", cd.ToString());
return File(fileData, "application/octet-stream");
}
You can use:
zip.Save(ms);
// Set read point to beginning of stream
ms.Position = 0;
ZipFile newZip = ZipFile.Read(ms);
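The key step in the snippet above is rewinding the stream before reading. Here is the same pattern as a self-contained sketch, using the framework's System.IO.Compression types instead of DotNetZip so that it runs without an extra package; the class and method names are invented for the example.

```csharp
using System;
using System.IO;
using System.IO.Compression;

static class InMemoryZipSketch // hypothetical name, for illustration only
{
    // Writes one text entry into an in-memory zip, rewinds, and reads it back.
    public static string RoundTrip(string entryName, string text)
    {
        var ms = new MemoryStream();

        // leaveOpen: true so that disposing the archive does not close the MemoryStream.
        using (var zip = new ZipArchive(ms, ZipArchiveMode.Create, true))
        using (var writer = new StreamWriter(zip.CreateEntry(entryName).Open()))
        {
            writer.Write(text);
        }

        // Set the read point back to the beginning of the stream.
        ms.Position = 0;

        using (var zip = new ZipArchive(ms, ZipArchiveMode.Read))
        using (var reader = new StreamReader(zip.GetEntry(entryName).Open()))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        Console.WriteLine(RoundTrip("hello.txt", "Hello from memory"));
    }
}
```

Without the `ms.Position = 0` line the reader starts at the end of the data and finds no archive.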
See the DotNetZip documentation topic "Create a zip using content obtained from a stream".
using (ZipFile zip = new ZipFile())
{
ZipEntry e= zip.AddEntry("Content-From-Stream.bin", "basedirectory", StreamToRead);
e.Comment = "The content for entry in the zip file was obtained from a stream";
zip.AddFile("Readme.txt");
zip.Save(zipFileToCreate);
}
After saving it, you can then open it up as normal.

Creating an Epub file with a Zip library

Hi all,
I am trying to zip up an EPUB file I have made using C#.
Things I have tried:
DotNetZip http://dotnetzip.codeplex.com/
- DotNetZip works, but epubcheck fails the resulting file (**see edit below)
ZipStorer zipstorer.codeplex.com
- creates an EPUB file that passes validation, but the file won't open in Adobe Digital Editions
7-Zip
- I have not tried this from C#, but when I zip the file using their interface it tells me that the mimetype file name has a length of 9 and it should be 8
In all cases the mimetype file is the first file added to the archive and is not compressed.
The EPUB validator that I'm using is epubcheck http://code.google.com/p/epubcheck/
If anyone has successfully zipped an EPUB file with one of these libraries, please let me know how. If anyone has done so with any other open-source zip API, that would also work.
EDIT
DotNetZip works, see accepted answer below.
If you need to control the order of the entries in the ZIP file, you can use DotNetZip and the ZipOutputStream.
You said you tried DotNetZip and it (the epub validator) gave you an error complaining about the mime type thing. This is probably because you used the ZipFile type within DotNetZip. If you use ZipOutputStream, you can control the ordering of the zip entries, which is apparently important for epub (I don't know the format, just surmising).
EDIT
I just checked, and the epub page on Wikipedia describes how you need to format the .epub file. It says that the mimetype file must contain specific text, must be uncompressed and unencrypted, and must appear as the first file in the ZIP archive.
Using ZipOutputStream, you would do this by setting CompressionLevel = None on that particular ZipEntry - that value is not the default.
Here's some sample code:
private void Zipup()
{
    string _outputFileName = "Fargle.epub";
    using (FileStream fs = File.Open(_outputFileName, FileMode.Create, FileAccess.ReadWrite))
    {
        using (var output = new ZipOutputStream(fs))
        {
            // mimetype must be the first entry and must be stored uncompressed
            var e = output.PutNextEntry("mimetype");
            e.CompressionLevel = CompressionLevel.None;
            byte[] buffer = System.Text.Encoding.ASCII.GetBytes("application/epub+zip");
            output.Write(buffer, 0, buffer.Length);
            output.PutNextEntry("META-INF/container.xml");
            WriteExistingFile(output, "META-INF/container.xml");
            output.PutNextEntry("OPS/"); // another directory
            output.PutNextEntry("OPS/whatever.xhtml");
            WriteExistingFile(output, "OPS/whatever.xhtml");
            // ...
        }
    }
}
private void WriteExistingFile(Stream output, string filename)
{
    // Copy the file's bytes into the current zip entry.
    using (FileStream fs = File.OpenRead(filename))
    {
        int n;
        byte[] buffer = new byte[2048];
        while ((n = fs.Read(buffer, 0, buffer.Length)) > 0)
        {
            output.Write(buffer, 0, n);
        }
    }
}
See the DotNetZip documentation for ZipOutputStream.
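The requirements mentioned in this answer (mimetype entry first, uncompressed, unencrypted) can be checked without any zip library by reading the first local file header of the archive. The sketch below does that; the class and method names are hypothetical. The extra-field check goes beyond what is stated above: as far as I recall, the EPUB container specification also forbids an extra field on the mimetype entry, so treat that part as an assumption.

```csharp
using System;
using System.IO;
using System.Text;

static class EpubHeaderSketch // hypothetical name, for illustration only
{
    // Returns null when the first zip entry looks like a valid EPUB mimetype entry,
    // otherwise a short description of the problem.
    public static string CheckFirstEntry(Stream zip)
    {
        try
        {
            var r = new BinaryReader(zip);            // zip headers are little-endian
            if (r.ReadUInt32() != 0x04034b50)         // local file header signature "PK\3\4"
                return "no local file header at offset 0";
            r.ReadUInt16();                           // version needed to extract
            ushort flags = r.ReadUInt16();            // bit 0 set means encrypted
            ushort method = r.ReadUInt16();           // 0 = stored, 8 = deflate
            r.ReadBytes(16);                          // time, date, CRC-32, compressed and uncompressed size
            ushort nameLength = r.ReadUInt16();
            ushort extraLength = r.ReadUInt16();
            string name = Encoding.ASCII.GetString(r.ReadBytes(nameLength));

            if (name != "mimetype") return "first entry is '" + name + "', not 'mimetype'";
            if (method != 0) return "mimetype entry is compressed (method " + method + ")";
            if ((flags & 1) != 0) return "mimetype entry is encrypted";
            if (extraLength != 0) return "mimetype entry has an extra field of " + extraLength + " bytes";
            return null;
        }
        catch (EndOfStreamException)
        {
            return "file is too short to contain a zip header";
        }
    }

    static void Main(string[] args)
    {
        if (args.Length != 1) { Console.WriteLine("usage: pass the path of an .epub file"); return; }
        using (FileStream fs = File.OpenRead(args[0]))
        {
            Console.WriteLine(CheckFirstEntry(fs) ?? "first entry looks fine");
        }
    }
}
```

This is not a replacement for epubcheck; it only covers the first entry and says nothing about the rest of the archive.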
Why not make life easier?
private void IonicZip()
{
string sourcePath = "C:\\pulications\\";
string fileName = "filename.epub";
// Creating ZIP file and writing mimetype
using (ZipOutputStream zs = new ZipOutputStream(sourcePath + fileName))
{
var o = zs.PutNextEntry("mimetype");
o.CompressionLevel = CompressionLevel.None;
byte[] mimetype = System.Text.Encoding.ASCII.GetBytes("application/epub+zip");
zs.Write(mimetype, 0, mimetype.Length);
}
// Adding META-INF and OEBPS folders including files
using (ZipFile zip = new ZipFile(sourcePath + fileName))
{
zip.AddDirectory(sourcePath + "META-INF", "META-INF");
zip.AddDirectory(sourcePath + "OEBPS", "OEBPS");
zip.Save();
}
}
For anyone like me who's searching for other ways to do this, I would like to add that the ZipStorer class from Jaime Olivares is a great alternative. You can copy the code right into your project, and it's very easy to choose between 'deflate' and 'store'.
https://github.com/jaime-olivares/zipstorer
Here's my code for creating an EPUB:
Dictionary<string, string> FilesToZip = new Dictionary<string, string>()
{
    { ConfigPath + @"mimetype", @"mimetype"},
    { ConfigPath + @"container.xml", @"META-INF/container.xml" },
    { OutputFolder + Name.Output_OPF_Name, @"OEBPS/" + Name.Output_OPF_Name},
    { OutputFolder + Name.Output_XHTML_Name, @"OEBPS/" + Name.Output_XHTML_Name},
    { ConfigPath + @"style.css", @"OEBPS/style.css"},
    { OutputFolder + Name.Output_NCX_Name, @"OEBPS/" + Name.Output_NCX_Name}
};
using (ZipStorer EPUB = ZipStorer.Create(OutputFolder + "book.epub", ""))
{
bool First = true;
foreach (KeyValuePair<string, string> File in FilesToZip)
{
if (First) { EPUB.AddFile(ZipStorer.Compression.Store, File.Key, File.Value, ""); First = false; }
else EPUB.AddFile(ZipStorer.Compression.Deflate, File.Key, File.Value, "");
}
}
This code creates a perfectly valid EPUB file. However, if you don't need to worry about validation, it seems most eReaders will accept an EPUB with a 'deflate' mimetype. So my previous code using .NET's ZipArchive produced EPUBs that worked in Adobe Digital Editions and a PocketBook.

SharpZipLib : Compressing a single file to a single compressed file

I am currently working with SharpZipLib under .NET 2.0 and via this I need to compress a single file to a single compressed archive. In order to do this I am currently using the following:
string tempFilePath = @"C:\Users\Username\AppData\Local\Temp\tmp9AE0.tmp.xml";
string archiveFilePath = @"C:\Archive\Archive_[UTC TIMESTAMP].zip";
FileInfo inFileInfo = new FileInfo(tempFilePath);
ICSharpCode.SharpZipLib.Zip.FastZip fZip = new ICSharpCode.SharpZipLib.Zip.FastZip();
fZip.CreateZip(archiveFilePath, inFileInfo.Directory.FullName, false, inFileInfo.Name);
This works exactly (ish) as it should; however, while testing I have encountered a minor gotcha. Let's say that my temp directory (i.e. the directory that contains the uncompressed input file) contains the following files:
tmp9AE0.tmp.xml //The input file I want to compress
xxx_tmp9AE0.tmp.xml // Some other file
yyy_tmp9AE0.tmp.xml // Some other file
wibble.dat // Some other file
When I run the compression, all the .xml files are included in the compressed archive. The reason is the final fileFilter parameter passed to the CreateZip method. Under the hood SharpZipLib performs a pattern match, and this also picks up the files prefixed with xxx_ and yyy_. I assume it would also pick up anything with a suffix.
So the question is, how can I compress a single file with SharpZipLib? Then again maybe the question is how can I format that fileFilter so that the match can only ever pick up the file I want to compress and nothing else.
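The behavior described here is what you would expect if the filter is treated as an unanchored regular expression. The sketch below uses only System.Text.RegularExpressions to show the difference between the plain file name and an escaped, anchored pattern. Whether FastZip applies the filter to the full path or only to the file name is an assumption to verify, and ExactNameFilter is a hypothetical helper, not part of SharpZipLib.

```csharp
using System;
using System.Text.RegularExpressions;

static class FileFilterSketch // hypothetical name, for illustration only
{
    // Builds a pattern that matches only when the whole file name equals fileName:
    // it must follow the start of the string or a path separator, and end the string.
    public static string ExactNameFilter(string fileName)
    {
        return @"(^|[\\/])" + Regex.Escape(fileName) + "$";
    }

    static void Main()
    {
        string target = "tmp9AE0.tmp.xml";
        string[] candidates =
        {
            @"C:\Temp\tmp9AE0.tmp.xml",
            @"C:\Temp\xxx_tmp9AE0.tmp.xml",
            @"C:\Temp\yyy_tmp9AE0.tmp.xml",
            @"C:\Temp\wibble.dat"
        };

        foreach (string path in candidates)
        {
            bool plain = Regex.IsMatch(path, target);                      // unanchored: also matches prefixed names
            bool anchored = Regex.IsMatch(path, ExactNameFilter(target));  // matches the exact file name only
            Console.WriteLine("{0}: plain={1}, anchored={2}", path, plain, anchored);
        }
    }
}
```

The escaping matters as well: an unescaped "." in the file name matches any character.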
As an aside, is there any reason why System.IO.Compression does not include a ZipStream class? (It only supports GZipStream.)
EDIT : Solution (Derived from accepted answer from Hans Passant)
This is the compression method I implemented:
private static void CompressFile(string inputPath, string outputPath)
{
FileInfo outFileInfo = new FileInfo(outputPath);
FileInfo inFileInfo = new FileInfo(inputPath);
// Create the output directory if it does not exist
if (!Directory.Exists(outFileInfo.Directory.FullName))
{
Directory.CreateDirectory(outFileInfo.Directory.FullName);
}
// Compress
using (FileStream fsOut = File.Create(outputPath))
{
using (ICSharpCode.SharpZipLib.Zip.ZipOutputStream zipStream = new ICSharpCode.SharpZipLib.Zip.ZipOutputStream(fsOut))
{
zipStream.SetLevel(3);
ICSharpCode.SharpZipLib.Zip.ZipEntry newEntry = new ICSharpCode.SharpZipLib.Zip.ZipEntry(inFileInfo.Name);
newEntry.DateTime = DateTime.UtcNow;
zipStream.PutNextEntry(newEntry);
byte[] buffer = new byte[4096];
using (FileStream streamReader = File.OpenRead(inputPath))
{
ICSharpCode.SharpZipLib.Core.StreamUtils.Copy(streamReader, zipStream, buffer);
}
zipStream.CloseEntry();
zipStream.IsStreamOwner = true;
zipStream.Close();
}
}
}
This is an XY problem: just don't use FastZip. Follow the first example on this web page to avoid accidents.

Extracting files from a Zip archive programmatically using C# and System.IO.Packaging

I have a bunch of ZIP files that are in desperate need of some hierarchical reorganization and extraction. What I can do, currently, is create the directory structure and move the zip files to the proper location. The mystic cheese that I am missing is the part that extracts the files from the ZIP archive.
I have seen the MSDN articles on the ZipArchive class and understand them reasonably well. I have also seen the VBScript ways to extract. This is not a complex class, so extracting stuff should be pretty simple. In fact, it works "mostly". I have included my current code below for reference.
using (ZipPackage package = (ZipPackage)Package.Open(@"..\..\test.zip", FileMode.Open, FileAccess.Read))
{
PackagePartCollection packageParts = package.GetParts();
foreach (PackageRelationship relation in packageParts)
{
//Do Stuff but never gets here since packageParts is empty.
}
}
The problem seems to be somewhere in GetParts (or GetAnything, for that matter). It seems that the package, while open, is empty. Digging deeper, the debugger shows that the private member _zipArchive actually has parts. Parts with the right names and everything. Why won't the GetParts function retrieve them? I've tried casting the open package to a ZipArchive and that didn't help. Grrr.
If you are manipulating ZIP files, you may want to look into a 3rd-party library to help you.
For example, DotNetZip, which has been recently updated. The current version is now v1.8. Here's an example to create a zip:
using (ZipFile zip = new ZipFile())
{
zip.AddFile("c:\\photos\\personal\\7440-N49th.png");
zip.AddFile("c:\\Desktop\\2005_Annual_Report.pdf");
zip.AddFile("ReadMe.txt");
zip.Save("Archive.zip");
}
Here's an example to update an existing zip; you don't need to extract the files to do it:
using (ZipFile zip = ZipFile.Read("ExistingArchive.zip"))
{
// 1. remove an entry, given the name
zip.RemoveEntry("README.txt");
// 2. Update an existing entry, with content from the filesystem
zip.UpdateItem("Portfolio.doc");
// 3. modify the filename of an existing entry
// (rename it and move it to a sub directory)
ZipEntry e = zip["Table1.jpg"];
e.FileName = "images/Figure1.jpg";
// 4. insert or modify the comment on the zip archive
zip.Comment = "This zip archive was updated " + System.DateTime.Now.ToString("G");
// 5. finally, save the modified archive
zip.Save();
}
Here's an example that extracts entries:
using (ZipFile zip = ZipFile.Read("ExistingZipFile.zip"))
{
foreach (ZipEntry e in zip)
{
e.Extract(TargetDirectory, true); // true => overwrite existing files
}
}
DotNetZip supports multi-byte characters in filenames, Zip encryption, AES encryption, streams, Unicode, and self-extracting archives.
It also does ZIP64, for file lengths greater than 0xFFFFFFFF, or for archives with more than 65535 entries.
It is free and open source. It was originally distributed through CodePlex, which has since been discontinued and archived.
From MSDN,
In this sample, the Package class is used (as opposed to the ZipPackage.) Having worked with both, I've only seen flakiness happen when there's corruption in the zip file. Not necessarily corruption that throws the Windows extractor or Winzip, but something that the Packaging components have trouble handling.
Hope this helps, maybe it can provide you an alternative to debugging the issue.
using System;
using System.IO;
using System.IO.Packaging;
using System.Text;
class ExtractPackagedImages
{
static void Main(string[] paths)
{
foreach (string path in paths)
{
using (Package package = Package.Open(
path, FileMode.Open, FileAccess.Read))
{
DirectoryInfo dir = Directory.CreateDirectory(path + " Images");
foreach (PackagePart part in package.GetParts())
{
if (part.ContentType.ToLowerInvariant().StartsWith("image/"))
{
string target = Path.Combine(
dir.FullName, CreateFilenameFromUri(part.Uri));
using (Stream source = part.GetStream(
FileMode.Open, FileAccess.Read))
using (Stream destination = File.OpenWrite(target))
{
byte[] buffer = new byte[0x1000];
int read;
while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
{
destination.Write(buffer, 0, read);
}
}
Console.WriteLine("Extracted {0}", target);
}
}
}
}
Console.WriteLine("Done");
}
private static string CreateFilenameFromUri(Uri uri)
{
char [] invalidChars = Path.GetInvalidFileNameChars();
StringBuilder sb = new StringBuilder(uri.OriginalString.Length);
foreach (char c in uri.OriginalString)
{
sb.Append(Array.IndexOf(invalidChars, c) < 0 ? c : '_');
}
return sb.ToString();
}
}
From "ZipPackage Class" (MSDN):
While Packages are stored as Zip files* through the ZipPackage class, all Zip files are not ZipPackages. A ZipPackage has special requirements such as URI-compliant file (part) names and a "[Content_Types].xml" file that defines the MIME types for all the files contained in the Package. The ZipPackage class cannot be used to open arbitrary Zip files that do not conform to the Open Packaging Conventions standard.
For further details see Section 9.2 "Mapping to a ZIP Archive" of the ECMA International "Open Packaging Conventions" standard, http://www.ecma-international.org/publications/files/ECMA-ST/Office%20Open%20XML%20Part%202%20(DOCX).zip (342Kb) or http://www.ecma-international.org/publications/files/ECMA-ST/Office%20Open%20XML%20Part%202%20(PDF).zip (1.3Mb)
*You can simply add ".zip" to the extension of any ZipPackage-based file (.docx, .xlsx, .pptx, etc.) to open it in your favorite Zip utility.
I was having the exact same problem! To get the GetParts() method to return something, I had to add the [Content_Types].xml file to the root of the archive with a "Default" node for every file extension included. Once I added this (just using Windows Explorer), my code was able to read and extract the archived contents.
More information on the [Content_Types].xml file can be found here:
http://msdn.microsoft.com/en-us/magazine/cc163372.aspx - There is an example file below Figure 13 of the article.
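For illustration, a [Content_Types].xml with one "Default" node per file extension can be generated rather than written by hand. The sketch below builds such a document with LINQ to XML. The class name and the extension-to-MIME-type mapping are example assumptions; the namespace is the one used by Open Packaging Conventions content-type files.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

static class ContentTypesSketch // hypothetical name, for illustration only
{
    // Builds a [Content_Types].xml document with a Default node per extension.
    public static XDocument Build(IDictionary<string, string> mimeTypeByExtension)
    {
        XNamespace ns = "http://schemas.openxmlformats.org/package/2006/content-types";
        var root = new XElement(ns + "Types");
        foreach (KeyValuePair<string, string> pair in mimeTypeByExtension)
        {
            root.Add(new XElement(ns + "Default",
                new XAttribute("Extension", pair.Key),      // extension without the leading dot
                new XAttribute("ContentType", pair.Value)));
        }
        return new XDocument(root);
    }

    static void Main()
    {
        // Example mapping only; list the extensions that actually occur in your archive.
        var map = new Dictionary<string, string>
        {
            { "txt", "text/plain" },
            { "png", "image/png" },
            { "xml", "application/xml" }
        };
        Console.WriteLine(Build(map));
    }
}
```

The resulting text would then be saved as [Content_Types].xml in the root of the archive, as described above.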
var zipFilePath = "c:\\myfile.zip";
var tempFolderPath = "c:\\unzipped";
using (Package package = ZipPackage.Open(zipFilePath, FileMode.Open, FileAccess.Read))
{
foreach (PackagePart part in package.GetParts())
{
var target = Path.GetFullPath(Path.Combine(tempFolderPath, part.Uri.OriginalString.TrimStart('/')));
var targetDir = target.Remove(target.LastIndexOf('\\'));
if (!Directory.Exists(targetDir))
Directory.CreateDirectory(targetDir);
using (Stream source = part.GetStream(FileMode.Open, FileAccess.Read))
{
FileStream targetFile = File.OpenWrite(target);
source.CopyTo(targetFile);
targetFile.Close();
}
}
}
Note: this code uses the Stream.CopyTo method in .NET 4.0
I agree with Cheeso. System.IO.Packaging is awkward when handling generic zip files, seeing as it was designed for Office Open XML documents. I'd suggest using DotNetZip or SharpZipLib.
(This is basically a rephrasing of this answer)
It turns out that System.IO.Packaging.ZipPackage doesn't support arbitrary PKZIP archives, which is why no "parts" are returned when you open a "generic" ZIP file. This class only supports a specific flavor of ZIP file (see the comments at the bottom of the MSDN description), used among other things for Windows Azure service packages up to SDK 1.6. That is why a service package becomes invalid if you unpack it and then repack it with, say, the Info-ZIP packer.
