How to work with a ZipArchive in C#


I have a ZipArchive and need to access a file inside it. I am not sure how to do this, but I have a list:
List<ZipContents> importList = new List<ZipContents>();
Each ZipContents item has two properties:
a ZipArchive called ZipFile
a string called FileName
Inside each ZipArchive (import.ZipFile) I need to find an XML file that has the same name as the zip file.
Currently I have this:
foreach (var import in importList)
{
var fn = import.FileName; // This is the actual file name of the zip file
// that was added into the ZipArchive.
// I will need to access the specific XML file need in the Zip by associating with
// fn
// ToDo: Extract XML file needed
// ToDo: Begin to access its contents...
}
So, for example, when the code is looking into the ZipArchive with the name test.zip, there will be a file called test.xml whose contents I need to access.
I am sorry I have no code showing an attempt at this, but I have not been able to find anything to start from.
I have looked through a lot of the ZipArchive documentation (including: http://msdn.microsoft.com/en-us/library/system.io.compression.ziparchive%28v=vs.110%29.aspx) and other posts on SO on how to do this, but I have come up empty. Would anyone have an idea on how to do this? Any help would be much appreciated. Thanks!

You need to extract the archive to a directory (may as well use temp since I assume you don't want to keep these):
archive.ExtractToDirectory("path string");
//Get the directory info for the directory you just extracted to
DirectoryInfo di = new DirectoryInfo("path string");
//find the xml file you want
FileInfo fi = di.GetFiles(string.Format("{0}.xml", archiveName)).FirstOrDefault();
//if the file was found, do your thing
if(fi != null)
{
//Do your file stuff here.
}
//delete the extracted directory and its contents (a plain Delete() throws if the directory is not empty)
di.Delete(true);
Edit: To do the same thing just unpacking the file you care about:
//find your file
ZipArchiveEntry entry = archive
.Entries
.FirstOrDefault(e =>
e.Name == string.Format("{0}.xml", archiveName));
if(entry != null)
{
//unpack your file
entry.ExtractToFile("path to extract to");
//do your file stuff here
}
//delete file if you want

The MSDN you linked does a rather good job explaining how to access the files. Here it is applied to your example.
// iterate over the list items
foreach (var import in importList)
{
var fn = import.FileName;
// iterate over the actual archives
foreach (ZipArchiveEntry entry in import.ZipFile.Entries)
{
// only grab files that end in .xml
if (entry.FullName.EndsWith(".xml", StringComparison.OrdinalIgnoreCase))
{
// this extracts the file
entry.ExtractToFile(Path.Combine(@"C:\extract", entry.FullName));
// this opens the file as a stream
using(var stream = new StreamReader(entry.Open())){
// this opens file as xml stream
using(var xml = XmlReader.Create(stream)){
// do some xml access on an arbitrary node
xml.MoveToContent();
xml.ReadToDescendant("my-node");
var item = xml.ReadElementContentAsString();
}
}
}
}
}

The following will extract a single xml file called file.xml and read it to an XDocument object:
var xmlEntry = importList.SelectMany(y => y.ZipFile.Entries)
.FirstOrDefault(entry => entry.Name.Equals("file.xml",
StringComparison.OrdinalIgnoreCase));
if (xmlEntry == null)
{
return;
}
// Open a stream to the underlying ZipArchiveEntry
// (the stream is disposable; XDocument itself is not)
using (Stream stream = xmlEntry.Open())
{
XDocument xml = XDocument.Load(stream);
// Do stuff with XML
}
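Putting the answers above together for the original question (test.zip should yield test.xml), here is a minimal sketch that reads the matching entry directly from the archive without extracting anything to disk. The class and method names (ZipXmlReader, LoadMatchingXml) are made up for illustration, and the match is assumed to be on the entry's file name only, ignoring any folder inside the archive and ignoring case.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public static class ZipXmlReader
{
    // Hypothetical helper: returns the parsed XML entry whose name matches the
    // zip's file name (test.zip -> test.xml), or null if there is no such entry.
    public static XDocument LoadMatchingXml(ZipArchive archive, string zipFileName)
    {
        string xmlName = Path.GetFileNameWithoutExtension(zipFileName) + ".xml";

        // Assumption: compare on Name (file name only), case-insensitively.
        ZipArchiveEntry entry = archive.Entries.FirstOrDefault(e =>
            e.Name.Equals(xmlName, StringComparison.OrdinalIgnoreCase));
        if (entry == null)
        {
            return null;
        }

        // Read straight from the entry stream; nothing is written to disk.
        using (Stream stream = entry.Open())
        {
            return XDocument.Load(stream);
        }
    }
}
```

Inside the question's loop this would be called as ZipXmlReader.LoadMatchingXml(import.ZipFile, import.FileName), with a null check on the result before using it.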

Related

Upload Csv to solution explorer

I'm trying to build an application in which a user can upload a csv file and convert it into XML. Currently my controller can create .txt file in a temp folder. However when I open the txt file it comes out all corrupted as below:
I have two questions
1. How can I make it so that the file displays properly i.e. as items separated by commas?
2. How can I change my code to make the file upload into my solution explorer
Here is the relevant controller code:
[HttpPost("UploadFiles")]
public async Task<IActionResult> FileUpload(List<IFormFile> files)
{
long size = files.Sum(f => f.Length);
var filePaths = new List<string>();
foreach (var formFile in files)
{
if(formFile.Length > 0)
{
var filePath = Path.GetTempPath()+ Guid.NewGuid().ToString()+".txt";
filePaths.Add(filePath);
using (var stream = new FileStream(filePath, FileMode.Create, FileAccess.ReadWrite))
{
await formFile.CopyToAsync(stream);
}
}
}
return Ok(new { count = files.Count, size, filePaths });
}
Any suggestions would be much appreciated
Thanks in advance

When the file is corrupt I think the conversion doesn't work. Try for now uploading it without any conversion.
For the second question you can do the following
var filePath = Path.Combine(AppContext.BaseDirectory, $"{Guid.NewGuid().ToString()}.csv"); // or whatever extension you are actually having without modifying the original extension
This will store the file either in the "bin" directory path or in the directory where the source-code is located.

ZipFile.CreateEntryFromFile File is stored with extension, but want to add extension when zipped

My class currently adds documents to zip folder via
using (var zip = ZipFile.Open(zipPath, ZipArchiveMode.Create)) {
foreach (var filePath in files) {
if (File.Exists(filePath.Value)) {
zip.CreateEntryFromFile(filePath.Value, filePath.Key);
}
}
}
files is a Dictionary; the Value is a file path without a file extension, as the files are stored on the server without extensions (let's assume they are all .pdfs)
Is there a way I can add .pdf to the files as they are stored in zip? So that when the zip folder is extracted, the files have extensions?
Note: My assumption is that if I simply add .pdf to filePath, it won't be a valid path when trying to CreateEntryFromFile

Given that CreateEntryFromFile has separate parameters for the filename and the entry name, I'd expect you to just be able to modify the second argument:
using (var zip = ZipFile.Open(zipPath, ZipArchiveMode.Create))
{
foreach (var filePath in files)
{
if (File.Exists(filePath.Value))
{
// Note the second argument
zip.CreateEntryFromFile(filePath.Value, filePath.Key + ".pdf");
}
}
}
Alternatively, change whatever code is building the dictionary in the first place to include the extension in the dictionary key. (This may or may not be appropriate based on what else you use the dictionary for, or whether you even really need a dictionary.)

How to read data from a zip file without having to unzip the entire file

Is there any way in .NET (C#) to extract data from a zip file without decompressing the complete file?
I possibly want to extract data (a file) from the start of a zip file, assuming the compression algorithm compresses the files in a deterministic order.

With .Net Framework 4.5 (using ZipArchive):
using (ZipArchive zip = ZipFile.Open(zipfile, ZipArchiveMode.Read))
foreach (ZipArchiveEntry entry in zip.Entries)
if(entry.Name == "myfile")
entry.ExtractToFile("myfile");
Find "myfile" in zipfile and extract it.

DotNetZip is your friend here.
As easy as:
using (ZipFile zip = ZipFile.Read(ExistingZipFile))
{
ZipEntry e = zip["MyReport.doc"];
e.Extract(OutputStream);
}
(you can also extract to a file or other destinations).
Reading the zip file's table of contents is as easy as:
using (ZipFile zip = ZipFile.Read(ExistingZipFile))
{
foreach (ZipEntry e in zip)
{
if (header)
{
System.Console.WriteLine("Zipfile: {0}", zip.Name);
if ((zip.Comment != null) && (zip.Comment != ""))
System.Console.WriteLine("Comment: {0}", zip.Comment);
System.Console.WriteLine("\n{1,-22} {2,8} {3,5} {4,8} {5,3} {0}",
"Filename", "Modified", "Size", "Ratio", "Packed", "pw?");
System.Console.WriteLine(new System.String('-', 72));
header = false;
}
System.Console.WriteLine("{1,-22} {2,8} {3,5:F0}% {4,8} {5,3} {0}",
e.FileName,
e.LastModified.ToString("yyyy-MM-dd HH:mm:ss"),
e.UncompressedSize,
e.CompressionRatio,
e.CompressedSize,
(e.UsesEncryption) ? "Y" : "N");
}
}
Edited to note: DotNetZip used to live at Codeplex. Codeplex has been shut down, but the old archive is still available there. It looks like the code has migrated to GitHub:
https://github.com/DinoChiesa/DotNetZip. Looks to be the original author's repo.
https://github.com/haf/DotNetZip.Semverd. This looks to be the currently maintained version. It's also packaged up and available via NuGet at https://www.nuget.org/packages/DotNetZip/

Something like this will list and extract the files one by one, if you want to use SharpZipLib:
var zip = new ZipInputStream(File.OpenRead(@"C:\Users\Javi\Desktop\myzip.zip"));
var filestream = new FileStream(@"C:\Users\Javi\Desktop\myzip.zip", FileMode.Open, FileAccess.Read);
ZipFile zipfile = new ZipFile(filestream);
ZipEntry item;
while ((item = zip.GetNextEntry()) != null)
{
Console.WriteLine(item.Name);
using (StreamReader s = new StreamReader(zipfile.GetInputStream(item)))
{
// stream with the file
Console.WriteLine(s.ReadToEnd());
}
}
Based on this example: content inside zip file

Here is how a UTF8 text file can be read from a zip archive into a string variable (.NET Framework 4.5 and up):
string zipFileFullPath = "{{TypeYourZipFileFullPathHere}}";
string targetFileName = "{{TypeYourTargetFileNameHere}}";
string text = new string(
(new System.IO.StreamReader(
System.IO.Compression.ZipFile.OpenRead(zipFileFullPath)
.Entries.Where(x => x.Name.Equals(targetFileName,
StringComparison.InvariantCulture))
.FirstOrDefault()
.Open(), Encoding.UTF8)
.ReadToEnd())
.ToArray());

The following code reads a specific file as a byte array:
using ZipArchive zipArchive = ZipFile.OpenRead(zipFilePath);
foreach(ZipArchiveEntry zipArchiveEntry in zipArchive.Entries)
{
if(zipArchiveEntry.Name.Equals(fileName,StringComparison.OrdinalIgnoreCase))
{
using Stream stream = zipArchiveEntry.Open();
using MemoryStream memoryStream = new MemoryStream();
await stream.CopyToAsync(memoryStream);
return memoryStream.ToArray();
}
}

Zip files have a table of contents. Every zip utility should be able to query just the TOC. Or you can use a command-line program such as 7-Zip (7z l archive.zip) to print the table of contents and redirect it to a text file.

In that case you will need to parse the zip's local file header entries. Each file stored in a zip archive is preceded by a Local File Header, which (normally) contains enough information to decompress it. In general, you can scan the stream for these headers, select the file you need, copy its header plus compressed data to another file, and run unzip on that part (if you don't want to deal with the whole zip decompression code or a library).
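As a rough illustration of the header parsing described above, the following sketch reads one Local File Header at the current stream position using the fixed 30-byte layout from the ZIP format (signature 0x04034b50, little-endian fields). The names ZipLocalHeader and Read are hypothetical. It assumes the entry name is UTF-8, and it does not handle entries written with a trailing data descriptor (general-purpose flag bit 3), where the size fields in the local header are zero.

```csharp
using System;
using System.IO;
using System.Text;

public static class ZipLocalHeader
{
    private const uint Signature = 0x04034b50; // bytes "PK\x03\x04"

    // Hypothetical helper: parses the local file header at the current position.
    // On return the stream is positioned at the start of the entry's data.
    public static (string Name, ushort Method, uint CompressedSize) Read(Stream zip)
    {
        // BinaryReader reads little-endian values, which matches the ZIP format.
        var reader = new BinaryReader(zip, Encoding.UTF8, leaveOpen: true);

        if (reader.ReadUInt32() != Signature)
        {
            throw new InvalidDataException("Not positioned at a local file header.");
        }

        reader.ReadUInt16();                        // version needed to extract
        reader.ReadUInt16();                        // general-purpose bit flags
        ushort method = reader.ReadUInt16();        // 0 = stored, 8 = deflate
        reader.ReadUInt16();                        // last modified time
        reader.ReadUInt16();                        // last modified date
        reader.ReadUInt32();                        // CRC-32
        uint compressedSize = reader.ReadUInt32();  // zero if a data descriptor is used
        reader.ReadUInt32();                        // uncompressed size
        ushort nameLength = reader.ReadUInt16();
        ushort extraLength = reader.ReadUInt16();

        byte[] nameBytes = reader.ReadBytes(nameLength);
        reader.ReadBytes(extraLength);              // skip the extra field

        // Assumption: the name is UTF-8 (older archives may use code page 437).
        return (Encoding.UTF8.GetString(nameBytes), method, compressedSize);
    }
}
```

For most purposes the archive's central directory (which ZipArchive.Entries reads) is the more reliable source; walking local headers is mainly useful when only the start of the file is available.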

DotNetZip: How to extract files, but ignoring the path in the zipfile?

I am trying to extract files to a given folder while ignoring the path stored in the zip file, but there doesn't seem to be a way.
This seems a fairly basic requirement given all the other good stuff implemented in there.
What am I missing?
code is -
using (Ionic.Zip.ZipFile zf = Ionic.Zip.ZipFile.Read(zipPath))
{
zf.ExtractAll(appPath);
}

While you can't specify it for a specific call to Extract() or ExtractAll(), the ZipFile class has a FlattenFoldersOnExtract property. When set to true, it flattens all the extracted files into one folder:
var flattenFoldersOnExtract = zip.FlattenFoldersOnExtract;
zip.FlattenFoldersOnExtract = true;
zip.ExtractAll(appPath);
zip.FlattenFoldersOnExtract = flattenFoldersOnExtract;

You'll need to remove the directory part of the filename just prior to unzipping...
using (var zf = Ionic.Zip.ZipFile.Read(zipPath))
{
zf.ToList().ForEach(entry =>
{
entry.FileName = System.IO.Path.GetFileName(entry.FileName);
entry.Extract(appPath);
});
}

You can use the overload that takes a stream as a parameter. In this way you have full control of path where the files will be extracted to.
Example:
using (ZipFile zip = new ZipFile(ZipPath))
{
foreach (ZipEntry e in zip)
{
string newPath = Path.Combine(FolderToExtractTo, e.FileName);
if (e.IsDirectory)
{
Directory.CreateDirectory(newPath);
}
else
{
using (FileStream stream = new FileStream(newPath, FileMode.Create))
e.Extract(stream);
}
}
}

That will fail if there are two files with the same file name. For example:
files\additionalfiles\file1.txt
temp\file1.txt
The first file will be renamed to file1.txt in the zip file, and when the second file is renamed an exception is thrown saying that an item with the same key already exists.
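One way around the name collision is to work out a unique flat name for every entry up front and then extract each entry to that name (for example through a stream-based extract). The sketch below covers only the naming step and is library-agnostic; FlattenedNames, Assign, and the "name (2).ext" suffix scheme are illustrative choices, not part of DotNetZip.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class FlattenedNames
{
    // Hypothetical helper: maps each entry path to a unique file name with the
    // folder part removed. Duplicates get a numeric suffix: "file1 (2).txt".
    public static Dictionary<string, string> Assign(IEnumerable<string> entryPaths)
    {
        var used = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var result = new Dictionary<string, string>();

        foreach (string path in entryPaths)
        {
            // Normalize separators so this behaves the same on every platform.
            string name = Path.GetFileName(path.Replace('\\', '/'));
            if (name.Length == 0)
            {
                continue; // directory entry, nothing to extract
            }

            string candidate = name;
            int counter = 2;
            while (!used.Add(candidate))
            {
                candidate = Path.GetFileNameWithoutExtension(name)
                    + " (" + counter + ")" + Path.GetExtension(name);
                counter++;
            }

            result[path] = candidate;
        }

        return result;
    }
}
```

With the two paths from the example, the first entry keeps file1.txt and the second becomes file1 (2).txt, so neither overwrites the other.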

Why can't this file be deleted after using C1ZipFile?

The following code gives me a System.IO.IOException with the message 'The process cannot access the file'.
private void UnPackLegacyStats()
{
DirectoryInfo oDirectory;
XmlDocument oStatsXml;
//Get the directory
oDirectory = new DirectoryInfo(msLegacyStatZipsPath);
//Check if the directory exists
if (oDirectory.Exists)
{
//Loop files
foreach (FileInfo oFile in oDirectory.GetFiles())
{
//Check if file is a zip file
if (C1ZipFile.IsZipFile(oFile.FullName))
{
//Open the zip file
using (C1ZipFile oZipFile = new C1ZipFile(oFile.FullName, false))
{
//Check if the zip contains the stats
if (oZipFile.Entries.Contains("Stats.xml"))
{
//Get the stats as a stream
using (Stream oStatsStream = oZipFile.Entries["Stats.xml"].OpenReader())
{
//Load the stats as xml
oStatsXml = new XmlDocument();
oStatsXml.Load(oStatsStream);
//Close the stream
oStatsStream.Close();
}
//Loop hit elements
foreach (XmlElement oHitElement in oStatsXml.SelectNodes("/*/hits"))
{
//Do stuff
}
}
//Close the file
oZipFile.Close();
}
}
//Delete the file
oFile.Delete();
}
}
}
I am struggling to see where the file could still be locked. All objects that could be holding onto a handle to the file are in using blocks and are explicitly closed.
Is it something to do with using FileInfo objects rather than the strings returned by the static GetFiles method?
Any ideas?

I do not see any problems in your code; everything looks OK. To check whether the problem lies in C1ZipFile, I suggest you initialize the zip from a stream instead of from a file path, so that you close the stream explicitly:
//Open the zip file
using (Stream ZipStream = oFile.OpenRead())
using (C1ZipFile oZipFile = new C1ZipFile(ZipStream, false))
{
// ...
Several other suggestions:
You do not need to call Close() when you use using (...); remove those calls.
Move the XML processing (the loop over hit elements) outside the zip processing, i.e. after the zip file is closed, so you keep the file open for as short a time as possible.

I assume you're getting the error on the oFile.Delete call. I was able to reproduce this error. Interestingly, the error only occurs when the file is not a zip file. Is this the behavior you are seeing?
It appears that the C1ZipFile.IsZipFile call is not releasing the file when it's not a zip file. I was able to avoid this problem by using a FileStream instead of passing the file path as a string (the IsZipFile function accepts either).
So the following modification to your code seems to work:
if (oDirectory.Exists)
{
//Loop files
foreach (FileInfo oFile in oDirectory.GetFiles())
{
using (FileStream oStream = new FileStream(oFile.FullName, FileMode.Open))
{
//Check if file is a zip file
if (C1ZipFile.IsZipFile(oStream))
{
// ...
}
}
//Delete the file
oFile.Delete();
}
}
In response to the original question in the subject: I don't know if it's possible to know if a file can be deleted without attempting to delete it. You could always write a function that attempts to delete the file and catches the error if it can't and then returns a boolean indicating whether the delete was successful.
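A minimal sketch of the try-and-report function described above, assuming that a locked or protected file surfaces as an IOException or UnauthorizedAccessException. The names FileCleanup and TryDelete are made up for illustration. Note that File.Delete does not throw when the file is already gone, so that case also returns true.

```csharp
using System;
using System.IO;

public static class FileCleanup
{
    // Hypothetical helper: attempts the delete and reports whether it succeeded
    // instead of letting the exception propagate.
    public static bool TryDelete(string path)
    {
        try
        {
            File.Delete(path);
            return true;
        }
        catch (IOException)
        {
            // The file is in use (or the path is otherwise unusable).
            return false;
        }
        catch (UnauthorizedAccessException)
        {
            // Read-only file, a directory, or missing permissions.
            return false;
        }
    }
}
```

In the loop above, oFile.Delete() could be replaced with FileCleanup.TryDelete(oFile.FullName), logging or retrying when it returns false.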

I'm just guessing: are you sure that oZipFile.Close() is enough? Perhaps you have to call oZipFile.Dispose() or oZipFile.Finalize() to be sure it has actually released the resources.

More than likely it's not being disposed. Any time you access something outside of managed code (streams, files, etc.) you MUST dispose of it. I learned this the hard way with ASP.NET and image files: it will fill up your memory, crash your server, etc.

In the interest of completeness I am posting my working code, as the changes came from more than one source.
private void UnPackLegacyStats()
{
DirectoryInfo oDirectory;
XmlDocument oStatsXml;
//Get the directory
oDirectory = new DirectoryInfo(msLegacyStatZipsPath);
//Check if the directory exists
if (oDirectory.Exists)
{
//Loop files
foreach (FileInfo oFile in oDirectory.GetFiles())
{
//Set empty xml
oStatsXml = null;
//Load file into a stream
using (Stream oFileStream = oFile.OpenRead())
{
//Check if file is a zip file
if (C1ZipFile.IsZipFile(oFileStream))
{
//Open the zip file
using (C1ZipFile oZipFile = new C1ZipFile(oFileStream, false))
{
//Check if the zip contains the stats
if (oZipFile.Entries.Contains("Stats.xml"))
{
//Get the stats as a stream
using (Stream oStatsStream = oZipFile.Entries["Stats.xml"].OpenReader())
{
//Load the stats as xml
oStatsXml = new XmlDocument();
oStatsXml.Load(oStatsStream);
}
}
}
}
}
//Check if we have stats
if (oStatsXml != null)
{
//Process XML here
}
//Delete the file
oFile.Delete();
}
}
}
The main lesson I learned from this is to manage file access in one place in the calling code rather than letting other components manage their own file access. This is most appropriate when you want to use the file again after the other component has finished its task.
Although this takes a little more code, you can clearly see where the stream is disposed (at the end of the using block), compared to having to trust that a component has correctly disposed of the stream.
