SharpZipLib Examine and select contents of a ZIP file - c#

I am using SharpZipLib in a project and am wondering whether it is possible to use it to look inside a zip file and, if one of the files within has a date modified in a range I am searching for, to pick that file out and copy it to a new directory. Does anybody know if this is possible?

Yes, it is possible to enumerate the files of a zip file using SharpZipLib. You can also pick files out of the zip file and copy those files to a directory on your disk.
Here is a small example:
using (var fs = new FileStream(@"c:\temp\test.zip", FileMode.Open, FileAccess.Read))
{
    using (var zf = new ZipFile(fs))
    {
        foreach (ZipEntry ze in zf)
        {
            if (ze.IsDirectory)
                continue;
            Console.Out.WriteLine(ze.Name);
            using (Stream s = zf.GetInputStream(ze))
            {
                byte[] buf = new byte[4096];
                // Analyze file in memory using MemoryStream.
                using (MemoryStream ms = new MemoryStream())
                {
                    StreamUtils.Copy(s, ms, buf);
                }
                // To store the file on disk instead, replace the MemoryStream
                // block above with the following lines (the entry stream can
                // only be read once).
                /*using (FileStream outFile = File.Create(@"c:\temp\uncompress_" + ze.Name))
                {
                    StreamUtils.Copy(s, outFile, buf);
                }*/
            }
        }
    }
}
In the example above I use a MemoryStream to store the ZipEntry in memory (for further analysis). You could also store the ZipEntry (if it meets certain criteria) on disk.
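For the date range part of the question, a minimal sketch could compare each entry's ZipEntry.DateTime against the range before copying it out. The class name ZipDateFilter, the method name ExtractModifiedBetween and its parameters are hypothetical names chosen for this illustration. The sketch also assumes that flattening entries into one target directory is acceptable.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using ICSharpCode.SharpZipLib.Core;
using ICSharpCode.SharpZipLib.Zip;

// Hypothetical helper, not part of SharpZipLib.
public static class ZipDateFilter
{
    // Copies every file entry whose modified date lies in [from, to]
    // into targetDir and returns the paths that were written.
    public static List<string> ExtractModifiedBetween(
        string zipPath, DateTime from, DateTime to, string targetDir)
    {
        var extracted = new List<string>();
        Directory.CreateDirectory(targetDir);
        byte[] buffer = new byte[4096];

        using (var fs = new FileStream(zipPath, FileMode.Open, FileAccess.Read))
        using (var zf = new ZipFile(fs))
        {
            foreach (ZipEntry ze in zf)
            {
                if (ze.IsDirectory)
                    continue;
                // ZipEntry.DateTime is the modified time stored in the archive.
                if (ze.DateTime < from || ze.DateTime > to)
                    continue;

                // Assumption: drop any folder part of the entry name, so all
                // matches land directly in targetDir (same names overwrite).
                string target = Path.Combine(targetDir, Path.GetFileName(ze.Name));
                using (Stream s = zf.GetInputStream(ze))
                using (FileStream outFile = File.Create(target))
                {
                    StreamUtils.Copy(s, outFile, buffer);
                }
                extracted.Add(target);
            }
        }
        return extracted;
    }
}
```

Zip timestamps are stored with a coarse resolution and without a time zone, so it is probably wise to leave some slack at the edges of the range.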
Hope this helps.

Related

Extract tgz file in memory and access files in C#

I have a service that downloads a *.tgz file from a remote endpoint. I use SharpZipLib to extract and write the content of that compressed archive to disk. But now I want to prevent writing the files to disk (because that process doesn't have write permissions on that disk) and keep them in memory.
How can I access the decompressed files from memory? (Let's assume the archive holds simple text files)
Here is what I have so far:
public void Decompress(byte[] byteArray)
{
    Stream inStream = new MemoryStream(byteArray);
    Stream gzipStream = new GZipInputStream(inStream);
    TarArchive tarArchive = TarArchive.CreateInputTarArchive(gzipStream);
    tarArchive.ExtractContents(@".");
    tarArchive.Close();
    gzipStream.Close();
    inStream.Close();
}
Check this and this out.
Turns out, ExtractContents() works by iterating over TarInputStream. When you create your TarArchive like this:
TarArchive.CreateInputTarArchive(gzipStream);
it actually wraps the stream you're passing into a TarInputStream. Thus, if you want more fine-grained control over how you extract files, you must use TarInputStream directly.
See if you can iterate over files, directories and actual file contents like this:
Stream inStream = new MemoryStream(byteArray);
Stream gzipStream = new GZipInputStream(inStream);
using (var tarInputStream = new TarInputStream(gzipStream))
{
    TarEntry entry;
    while ((entry = tarInputStream.GetNextEntry()) != null)
    {
        var fileName = entry.Name;
        using (var fileContents = new MemoryStream())
        {
            // Copies the current entry's data into the MemoryStream.
            tarInputStream.CopyEntryContents(fileContents);
            // use entry, fileName or fileContents here
        }
    }
}

Does CopyTo store the whole thing in memory?

I have the following code snippet, which is designed to add files to a .zip file, while at the same time calculating their sha1 checksum.
However, it's running out of memory on large files.
Which part of it is causing the whole file to be in memory? Surely this should all be just streamed?
using (ZipArchive archive = ZipFile.Open(buildFile, ZipArchiveMode.Update))
{
    foreach (var fileName in nameList)
    {
        ZipArchiveEntry entry = archive.CreateEntry(fileName);
        using (Stream zipData = entry.Open())
        using (SHA1Managed shaForFile = new SHA1Managed())
        using (Stream sourceFileStream = File.OpenRead(fileName))
        using (Stream sourceData = new CryptoStream(sourceFileStream, shaForFile, CryptoStreamMode.Read))
        {
            sourceData.CopyTo(zipData);
            Console.WriteLine(fileName + ":" + BitConverter.ToString(shaForFile.Hash));
        }
    }
}
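(Copied from a comment - as this answers the question)
The problem is ZipArchiveMode.Update, which can require significant alterations to the file on disk. In that mode the content of the archive is held in memory, and nothing is written to the underlying file until the archive is disposed. Entries can only be streamed directly to disk when you use ZipArchiveMode.Create.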
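As a rough sketch of that suggestion, the question's loop can be run against a new archive opened with ZipArchiveMode.Create. The names StreamingZipWriter and ZipWithHashes are hypothetical. SHA1.Create() is used here in place of the question's SHA1Managed, and the entry name is assumed to be the bare file name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;

// Hypothetical helper illustrating ZipArchiveMode.Create.
public static class StreamingZipWriter
{
    // Writes each file into a NEW archive at zipPath and returns the
    // hex-encoded SHA-1 of each source file, keyed by its path.
    public static Dictionary<string, string> ZipWithHashes(
        string zipPath, IEnumerable<string> fileNames)
    {
        var hashes = new Dictionary<string, string>();

        // Create mode fails if zipPath already exists, and entries are
        // written to the file as they are produced.
        using (ZipArchive archive = ZipFile.Open(zipPath, ZipArchiveMode.Create))
        {
            foreach (string fileName in fileNames)
            {
                // Assumption: store entries under their bare file name.
                ZipArchiveEntry entry = archive.CreateEntry(Path.GetFileName(fileName));
                using (Stream zipData = entry.Open())
                using (SHA1 sha = SHA1.Create())
                using (Stream source = File.OpenRead(fileName))
                using (var hashing = new CryptoStream(source, sha, CryptoStreamMode.Read))
                {
                    // CopyTo reads to the end, which finalizes the hash.
                    hashing.CopyTo(zipData);
                    hashes[fileName] = BitConverter.ToString(sha.Hash).Replace("-", "");
                }
            }
        }
        return hashes;
    }
}
```

If files must be added to an existing archive, one option is to write a fresh archive in Create mode and copy the old entries across, instead of opening the original in Update mode.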

Zip File getting corrupted when i save it to isolated storage

I have downloaded a zip file from blob storage and saved it to the isolated storage of a Windows Phone like this (FileStream fs is downloaded from the blob):
public static void SaveToIsolatedStorage(FileStream fs, string fileName)
{
    var isolatedStorage = IsolatedStorageFile.GetUserStoreForApplication();
    using (var streamWriter =
        new StreamWriter(new IsolatedStorageFileStream(fileName,
            FileMode.Create,
            FileAccess.ReadWrite,
            isolatedStorage)))
    {
        streamWriter.Write(fs);
    }
}
But when I checked this zip file using IsoStoreSpy, it showed as corrupted. I also tried reading it back from isolated storage and unzipping it, but that did not work. I am sure the file is corrupted, because when I replace it with another zip using IsoStoreSpy and then unzip that, it works.
Edit:
Code for downloading from Blob
private async Task DownloadFileFromBlobStorage()
{
    var filename = "AppId_2.zip";
    var blobContainer = GetBlobClient.GetContainerReference("testwpclientiapcontainer");
    var blob = blobContainer.GetBlockBlobReference(filename);
    using (var filestream = new FileStream(filename, FileMode.Create))
    {
        await blob.DownloadToStreamAsync(filestream);
        SaveToIsolatedStorage(filestream, filename);
    }
}
So does anybody know how I can save the zip file to isolated storage without corrupting it?
You're using a StreamWriter. That's for text. You shouldn't be using it to copy a zip file at all. Never use any TextWriter for binary data.
Next you're using StreamWriter.Write(object), which is basically going to call ToString on the FileStream. That's not going to work either.
You should just create an IsolatedStorageFileStream, and then call fs.CopyTo(output).
public static void SaveToIsolatedStorage(Stream input, string fileName)
{
    using (var storage = IsolatedStorageFile.GetUserStoreForApplication())
    {
        // Simpler than calling the IsolatedStorageFileStream constructor...
        using (var output = storage.CreateFile(fileName))
        {
            input.CopyTo(output);
        }
    }
}
In your edit you've shown code which saves to a FileStream first, and then copies the stream from the current position. As you've noted in comments, you needed to rewind it first.
Personally I wouldn't use a FileStream at all here - why do you want to save it as a normal file and an isolated file? Just use a MemoryStream:
using (var stream = new MemoryStream())
{
    await blob.DownloadToStreamAsync(stream);
    // Rewind so that the copy starts from the beginning of the data.
    stream.Position = 0;
    SaveToIsolatedStorage(stream, filename);
}
(Note that your SaveToIsolatedStorage method is still synchronous... you may wish to consider an asynchronous version.)

sharpziplib compressed files to be uncompressed externally

I have a scenario whereby I want to zip an email attachment using SharpZipLib. The end user will then open the attachment and unzip the attached file.
Will a file zipped using SharpZipLib be easily unzipped by other programs for my end user?
It depends on how you use SharpZipLib. There is more than one way to compress the data with this library.
Here is an example of a method that will create a zip file that you will be able to open in pretty much any zip-aware application:
private static byte[] CreateZip(byte[] fileBytes, string fileName)
{
    using (var memoryStream = new MemoryStream())
    using (var zipStream = new ZipOutputStream(memoryStream))
    {
        var crc = new Crc32();
        crc.Reset();
        crc.Update(fileBytes);
        var zipEntry =
            new ZipEntry(fileName)
            {
                Crc = crc.Value,
                DateTime = DateTime.Now,
                Size = fileBytes.Length
            };
        zipStream.PutNextEntry(zipEntry);
        zipStream.Write(fileBytes, 0, fileBytes.Length);
        zipStream.Finish();
        zipStream.Close();
        // ToArray still works after the MemoryStream has been closed.
        return memoryStream.ToArray();
    }
}
Usage:
var fileBytes = File.ReadAllBytes(@"C:/1.xml");
var zipBytes = CreateZip(fileBytes, "MyFile.xml");
File.WriteAllBytes(@"C:/2.zip", zipBytes);
This CreateZip method is optimized for the cases when you already have bytes in memory and you just want to compress them and send without even saving to disk.

Create Zip archive from multiple in memory files in C#

Is there a way to create a Zip archive that contains multiple files, when the files are currently in memory? The files I want to save are really just text only and are stored in a string class in my application. But I would like to save multiple files in a single self-contained archive. They can all be in the root of the archive.
It would be nice to be able to do this using SharpZipLib.
Use ZipEntry and PutNextEntry() for this. The following shows how to do it for a file, but for an in-memory object just use a MemoryStream:
FileStream fZip = File.Create(compressedOutputFile);
ZipOutputStream zipOStream = new ZipOutputStream(fZip);
foreach (FileInfo fi in allfiles)
{
    ZipEntry entry = new ZipEntry((fi.Name));
    zipOStream.PutNextEntry(entry);
    FileStream fs = File.OpenRead(fi.FullName);
    try
    {
        byte[] transferBuffer = new byte[1024];
        int bytesRead;
        do
        {
            bytesRead = fs.Read(transferBuffer, 0, transferBuffer.Length);
            zipOStream.Write(transferBuffer, 0, bytesRead);
        }
        while (bytesRead > 0);
    }
    finally
    {
        fs.Close();
    }
}
zipOStream.Finish();
zipOStream.Close();
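For strings that are already in memory, the same ZipEntry and PutNextEntry() pattern might look like the sketch below. InMemoryZip and ZipStrings are hypothetical names, and UTF-8 is an assumed encoding for the text.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;
using ICSharpCode.SharpZipLib.Zip;

// Hypothetical helper, not part of SharpZipLib.
public static class InMemoryZip
{
    // Builds a zip archive in memory. Each dictionary key becomes an entry
    // name in the root of the archive and each value becomes its content.
    public static byte[] ZipStrings(IDictionary<string, string> files)
    {
        using (var output = new MemoryStream())
        {
            using (var zip = new ZipOutputStream(output))
            {
                // Keep the MemoryStream open when the zip stream is disposed.
                zip.IsStreamOwner = false;
                foreach (KeyValuePair<string, string> file in files)
                {
                    // Assumption: the text should be stored as UTF-8.
                    byte[] data = Encoding.UTF8.GetBytes(file.Value);
                    zip.PutNextEntry(new ZipEntry(file.Key));
                    zip.Write(data, 0, data.Length);
                    zip.CloseEntry();
                }
            }
            // Disposing the zip stream above has written the central directory.
            return output.ToArray();
        }
    }
}
```

The returned bytes can be written with File.WriteAllBytes or sent over the network without touching the disk.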
Using SharpZipLib for this seems pretty complicated. This is so much easier in DotNetZip. In v1.9, the code looks like this:
using (ZipFile zip = new ZipFile())
{
    zip.AddEntry("Readme.txt", stringContent1);
    zip.AddEntry("readings/Data.csv", stringContent2);
    zip.AddEntry("readings/Index.xml", stringContent3);
    zip.Save("Archive1.zip");
}
The code above assumes stringContent{1,2,3} contains the data to be stored in the files (or entries) in the zip archive. The first entry is "Readme.txt" and it is stored in the top level "Directory" in the zip archive. The next two entries are stored in the "readings" directory in the zip archive.
The strings are encoded in the default encoding. There is an overload of AddEntry(), not shown here, that allows you to explicitly specify the encoding to use.
If you have the content in a stream or byte array, not a string, there are overloads for AddEntry() that accept those types. There are also overloads that accept a Write delegate, a method of yours that is invoked to write data into the zip. This works for easily saving a DataSet into a zip file, for example.
DotNetZip is free and open source.
This function builds the zip in a MemoryStream from in-memory byte arrays and then writes it to a file. I've created a simple interface to describe each file:
public interface IHasDocumentProperties
{
    byte[] Content { get; set; }
    string Name { get; set; }
}
public void CreateZipFileContent(string filePath, IEnumerable<IHasDocumentProperties> fileInfos)
{
    using (var memoryStream = new MemoryStream())
    {
        // leaveOpen: true keeps memoryStream usable after the archive is disposed.
        using (var zipArchive = new ZipArchive(memoryStream, ZipArchiveMode.Create, true))
        {
            foreach (var fileInfo in fileInfos)
            {
                var entry = zipArchive.CreateEntry(fileInfo.Name);
                using (var entryStream = entry.Open())
                {
                    entryStream.Write(fileInfo.Content, 0, fileInfo.Content.Length);
                }
            }
        }
        // FileMode.Create truncates an existing file, so no stale bytes remain.
        using (var fileStream = new FileStream(filePath, FileMode.Create, System.IO.FileAccess.Write))
        {
            memoryStream.Position = 0;
            memoryStream.CopyTo(fileStream);
        }
    }
}
Yes, you can use SharpZipLib to do this - when you need to supply a stream to write to, use a MemoryStream.
I came across this problem. Using the MSDN example, I created this class:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO.Packaging;
using System.IO;
public class ZipSticle
{
    Package package;
    public ZipSticle(Stream s)
    {
        package = ZipPackage.Open(s, FileMode.Create);
    }
    public void Add(Stream stream, string Name)
    {
        Uri partUriDocument = PackUriHelper.CreatePartUri(new Uri(Name, UriKind.Relative));
        PackagePart packagePartDocument = package.CreatePart(partUriDocument, "");
        CopyStream(stream, packagePartDocument.GetStream());
        stream.Close();
    }
    private static void CopyStream(Stream source, Stream target)
    {
        const int bufSize = 0x1000;
        byte[] buf = new byte[bufSize];
        int bytesRead = 0;
        while ((bytesRead = source.Read(buf, 0, bufSize)) > 0)
            target.Write(buf, 0, bytesRead);
    }
    public void Close()
    {
        package.Close();
    }
}
You can then use it like this:
FileStream str = File.Open("MyAwesomeZip.zip", FileMode.Create);
ZipSticle zip = new ZipSticle(str);
zip.Add(File.OpenRead("C:/Users/C0BRA/SimpleFile.txt"), "Some directory/SimpleFile.txt");
zip.Add(File.OpenRead("C:/Users/C0BRA/Hurp.derp"), "hurp.Derp");
zip.Close();
str.Close();
You can pass a MemoryStream (or any Stream) to ZipSticle.Add such as:
FileStream str = File.Open("MyAwesomeZip.zip", FileMode.Create);
ZipSticle zip = new ZipSticle(str);
byte[] fileinmem = new byte[1000];
// Do stuff to FileInMemory
MemoryStream memstr = new MemoryStream(fileinmem);
zip.Add(memstr, "Some directory/SimpleFile.txt");
memstr.Close();
zip.Close();
str.Close();
Note this answer is outdated; since .Net 4.5, the ZipArchive class allows zipping files in-memory. See johnny 5's answer below for how to use it.
You could also do it a bit differently, using a Serializable object to store all strings:
[Serializable]
public class MyStrings {
    public string Foo { get; set; }
    public string Bar { get; set; }
}
Then, you could serialize it into a stream to save it.
To save on space you could use GZipStream (From System.IO.Compression) to compress it. (note: GZip is stream compression, not an archive of multiple files).
That is, of course if what you need is actually to save data, and not zip a few files in a specific format for other software.
Also, this would allow you to save many more types of data besides strings.
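A minimal sketch of this idea, assuming XML serialization is acceptable: the answer does not name a serializer, so XmlSerializer is a choice made for this illustration, and CompressedStore, Save and Load are hypothetical names.

```csharp
using System.IO;
using System.IO.Compression;
using System.Xml.Serialization;

// Same shape as the class above; XmlSerializer needs a public type
// with a parameterless constructor.
public class MyStrings
{
    public string Foo { get; set; }
    public string Bar { get; set; }
}

// Hypothetical helper that serializes to XML and compresses with GZip.
public static class CompressedStore
{
    public static byte[] Save(MyStrings value)
    {
        var serializer = new XmlSerializer(typeof(MyStrings));
        using (var buffer = new MemoryStream())
        {
            // leaveOpen: true so the buffer can be read after gzip is disposed.
            using (var gzip = new GZipStream(buffer, CompressionMode.Compress, true))
            {
                serializer.Serialize(gzip, value);
            }
            return buffer.ToArray();
        }
    }

    public static MyStrings Load(byte[] data)
    {
        var serializer = new XmlSerializer(typeof(MyStrings));
        using (var buffer = new MemoryStream(data))
        using (var gzip = new GZipStream(buffer, CompressionMode.Decompress))
        {
            return (MyStrings)serializer.Deserialize(gzip);
        }
    }
}
```

The result is a single compressed blob, not a zip archive, so other software will not see separate files inside it.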
I was utilizing Cheeso's answer by adding MemoryStreams as the source of the different Excel files. When I downloaded the zip, the files had nothing in them. This could be the way we were getting around trying to create and download a file over AJAX.
To get the contents of the different Excel files to be included in the Zip, I had to add each of the files as a byte[].
using (var memoryStream = new MemoryStream())
using (var zip = new ZipFile())
{
    zip.AddEntry("Excel File 1.xlsx", excelFileStream1.ToArray());
    zip.AddEntry("Excel File 2.xlsx", excelFileStream2.ToArray());
    // Keep the file off of disk, and in memory.
    zip.Save(memoryStream);
}
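To expose your string objects as Streams, encode each one to bytes (for example with Encoding.UTF8.GetBytes) and wrap the result in a MemoryStream. Note that a StringReader is a TextReader, not a Stream, so it cannot be passed where a Stream is expected.
That should make it easy to feed them to your zip-building code.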
