Extract files from stream containing zip - c#

I am making a GET request using HttpClient to download a zip file from the internet.
I want to extract all the files contained in the zip file without saving the zip file to disk.
Currently, I am able to download and save the zip file to disk, extract its contents and then delete the zip file from disk. This works perfectly fine. However, I want to optimize the process.
I found a way to extract the contents directly from the downloaded zip stream but I have to specify the filenames and extensions.
I am not sure how to extract the contents while preserving their original filenames and extensions without me specifying them.
Current Approach:
string requestUri = "https://www.nuget.org/api/v2/package/" + PackageName + "/" + PackageVersion;
HttpResponseMessage response = await client.GetAsync(requestUri);
response.EnsureSuccessStatusCode();
using Stream PackageStream = await response.Content.ReadAsStreamAsync();
SaveStream($"{DownloadPath}.zip", PackageStream);
ZipFile.ExtractToDirectory($"{DownloadPath}.zip", ExtractPath);
File.Delete($"{DownloadPath}.zip");
// Directly extract Zip contents without saving file and without losing filename and extension
using (ZipArchive archive = new ZipArchive(await response.Content.ReadAsStreamAsync()))
{
foreach (ZipArchiveEntry entry in archive.Entries)
{
using (Stream stream = entry.Open())
{
using (FileStream file = new FileStream("file.txt", FileMode.Create, FileAccess.Write))
{
stream.CopyTo(file);
}
}
}
}
.NET 4.8
.NET Core 3.1
C# 8.0
Any help in this regard would be appreciated.
Please feel free to comment on alternative approaches or suggestions.
Thank you in advance.

ZipArchiveEntry has Name and FullName properties that can be used to get the names of the files within the archive while preserving their original filenames and extensions.
The FullName property contains the relative path, including the subdirectory hierarchy, of an entry in a zip archive. (In contrast, the Name property contains only the name of the entry and does not include the subdirectory hierarchy.)
For example
using (ZipArchive archive = new ZipArchive(await response.Content.ReadAsStreamAsync())) {
    foreach (ZipArchiveEntry entry in archive.Entries) {
        using (Stream stream = entry.Open()) {
            string destination = Path.GetFullPath(Path.Combine(downloadPath, entry.FullName));
            var directory = Path.GetDirectoryName(destination);
            if (!Directory.Exists(directory))
                Directory.CreateDirectory(directory);
            using (FileStream file = new FileStream(destination, FileMode.Create, FileAccess.Write)) {
                await stream.CopyToAsync(file);
            }
        }
    }
}
This extracts the files into the same subdirectory hierarchy in which they were stored in the archive, whereas using entry.Name would extract all the files to the same location.
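The loop above can be wrapped into a reusable helper. The following is a minimal sketch, not a definitive implementation: the class name ZipStreamExtractor and the method ExtractAsync are hypothetical, and two checks are added that the original loop does not have (skipping directory entries, whose Name is empty, and rejecting entries whose path would resolve outside the target directory).

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

// Hypothetical helper, named here for illustration only.
public static class ZipStreamExtractor
{
    // Extracts every entry of a zip stream under extractPath, keeping the
    // relative paths stored in the archive (entry.FullName).
    public static async Task ExtractAsync(Stream zipStream, string extractPath)
    {
        // Normalize the root and end it with a separator so the StartsWith
        // check below cannot match a sibling such as "out2" for "out".
        string root = Path.GetFullPath(extractPath);
        if (!root.EndsWith(Path.DirectorySeparatorChar.ToString(), StringComparison.Ordinal))
            root += Path.DirectorySeparatorChar;

        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                string destination = Path.GetFullPath(Path.Combine(root, entry.FullName));

                // Reject entries such as "../evil.txt" that would land outside the root.
                if (!destination.StartsWith(root, StringComparison.Ordinal))
                    throw new IOException("Entry is outside the target directory: " + entry.FullName);

                // An empty Name means the entry is a directory, not a file.
                if (entry.Name.Length == 0)
                {
                    Directory.CreateDirectory(destination);
                    continue;
                }

                Directory.CreateDirectory(Path.GetDirectoryName(destination));
                using (Stream source = entry.Open())
                using (var target = new FileStream(destination, FileMode.Create, FileAccess.Write))
                {
                    await source.CopyToAsync(target);
                }
            }
        }
    }
}
```

If asynchronous copying is not needed, the built-in extension method archive.ExtractToDirectory(extractPath) from System.IO.Compression.ZipFileExtensions also preserves the entry paths and is shorter; on .NET Framework it requires a reference to the System.IO.Compression.FileSystem assembly.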

Related

Mvc - Download multiple files as Zip Error as "Unreadable content" while opening downloaded zipped file

I am using an ASP.NET MVC application and want to download multiple files as a zip. I have written code using MemoryStream and ZipArchive in my controller action method. With that code I can successfully download the files as a zipped folder, but when I unzip them and try to open them I get the following errors:
Opening a Word document with Microsoft Word: "Word found unreadable content" error
Opening an image file (.png): unsupported file format error
Here is my controller method code to zip the files
if (sendApplicationFiles != null && sendApplicationFiles.Any())
{
using (var compressedFileStream = new MemoryStream())
{
// Create an archive and store the stream in memory.
using (var zipArchive = new ZipArchive(compressedFileStream, ZipArchiveMode.Create, true))
{
foreach (var file in sendApplicationFiles)
{
// Create a zip entry for each attachment
var zipEntry = zipArchive.CreateEntry(file.FileName);
// Get the stream of the attachment
using (var originalFileStream = new MemoryStream(file.FileData))
using (var zipEntryStream = zipEntry.Open())
{
// Copy the attachment stream to the zip entry stream
originalFileStream.CopyTo(zipEntryStream);
}
}
}
return new FileContentResult(compressedFileStream.ToArray(), "application/zip") { FileDownloadName = "Filename.zip" };
}
}
I expect the document content to load without error.
I did not investigate your code, but this link might help. There are a few examples.
https://learn.microsoft.com/en-us/dotnet/standard/io/how-to-compress-and-extract-files
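One way to narrow down a problem like this is to check whether the zip step itself changes the bytes. The sketch below is only a diagnostic aid under stated assumptions: ZipRoundTrip, Zip and Unzip are hypothetical names, and the input is assumed to be entry names mapped to raw byte arrays, similar to file.FileName and file.FileData in the question. If the bytes survive the round trip, the corruption more likely happens before zipping (for example in how the byte arrays are populated), which would still need to be verified.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, named here for illustration only.
public static class ZipRoundTrip
{
    // Builds a zip in memory from (entry name -> raw bytes) pairs.
    public static byte[] Zip(IDictionary<string, byte[]> files)
    {
        using (var buffer = new MemoryStream())
        {
            using (var archive = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
            {
                foreach (KeyValuePair<string, byte[]> file in files)
                {
                    using (Stream entryStream = archive.CreateEntry(file.Key).Open())
                    {
                        entryStream.Write(file.Value, 0, file.Value.Length);
                    }
                }
            }
            // The archive is disposed above, so the central directory is already written.
            return buffer.ToArray();
        }
    }

    // Reads every entry of an in-memory zip back into byte arrays.
    public static Dictionary<string, byte[]> Unzip(byte[] zipBytes)
    {
        var result = new Dictionary<string, byte[]>();
        using (var archive = new ZipArchive(new MemoryStream(zipBytes), ZipArchiveMode.Read))
        {
            foreach (ZipArchiveEntry entry in archive.Entries)
            {
                using (Stream entryStream = entry.Open())
                using (var copy = new MemoryStream())
                {
                    entryStream.CopyTo(copy);
                    result[entry.FullName] = copy.ToArray();
                }
            }
        }
        return result;
    }
}
```

Comparing each array returned by Unzip with the corresponding input array shows whether the archive step preserves the content byte for byte.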

Zip created but no files in it

Can someone tell me what's wrong with my code? I want to zip multiple xml into one file yet the result file is always empty.
using (MemoryStream zipStream = new MemoryStream())
{
using (ZipArchive zip = new ZipArchive(zipStream, ZipArchiveMode.Create, true))
{
string[] xmls = Directory.GetFiles(@"c:\temp\test", "*.xml");
foreach (string xml in xmls)
{
var file = zip.CreateEntry(xml);
using (var entryStream = file.Open())
using (var streamWriter = new StreamWriter(entryStream))
{
streamWriter.Write(xml);
}
}
}
using (FileStream fs = new FileStream(@"C:\Temp\test.zip", FileMode.Create))
{
zipStream.Position = 0;
zipStream.CopyTo(fs);
}
}
See the remarks in the documentation (emphasis mine):
The entryName string should reflect the relative path of the entry you want to create within the zip archive. There is no restriction on the string you provide. However, if it is not formatted as a relative path, the entry is created, but you may get an exception when you extract the contents of the zip archive. If an entry with the specified path and name already exists in the archive, a second entry is created with the same path and name.
You are using an absolute path here:
var file = zip.CreateEntry(xml);
My guess is that when you try to open the archive, it is failing silently to show the entries.
Change your code to use the names of the files without their path:
var file = zip.CreateEntry(Path.GetFileName(xml));
As a separate issue, notice that you're just writing the name of the file to the ZIP entry, rather than the actual file. I imagine you want something like this instead:
var zipEntry = zip.CreateEntry(Path.GetFileName(xml));
// Open the entry that was just created and copy the file's bytes into it.
using (var entryStream = zipEntry.Open())
{
    using var fileStream = File.OpenRead(xml);
    fileStream.CopyTo(entryStream);
}
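Putting both fixes together (relative entry names and copying the file contents), the whole routine might look like the sketch below. The class XmlZipper and its ZipFiles method are hypothetical names, and the directory, pattern and output path are parameters rather than the hard-coded paths from the question.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper, named here for illustration only.
public static class XmlZipper
{
    // Zips every file in sourceDirectory that matches pattern into zipPath.
    public static void ZipFiles(string sourceDirectory, string pattern, string zipPath)
    {
        using (var zipStream = new MemoryStream())
        {
            // leaveOpen: true keeps zipStream usable after the archive is disposed;
            // disposing the archive is what writes the central directory.
            using (var zip = new ZipArchive(zipStream, ZipArchiveMode.Create, leaveOpen: true))
            {
                foreach (string path in Directory.GetFiles(sourceDirectory, pattern))
                {
                    // Relative entry name: the file name without its directory.
                    ZipArchiveEntry entry = zip.CreateEntry(Path.GetFileName(path));
                    using (Stream entryStream = entry.Open())
                    using (FileStream fileStream = File.OpenRead(path))
                    {
                        fileStream.CopyTo(entryStream);
                    }
                }
            }

            // Rewind before copying the finished archive to disk.
            zipStream.Position = 0;
            using (var output = new FileStream(zipPath, FileMode.Create))
            {
                zipStream.CopyTo(output);
            }
        }
    }
}
```

A call such as XmlZipper.ZipFiles(@"c:\temp\test", "*.xml", @"C:\Temp\test.zip") would correspond to the paths used in the question.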

How to download Azure Blobs by referencing the file?

I want to download files from Azure using C#, stream them into a MemoryStream, and then return them to the user in the front end with a link (the Azure URI, which points to the Azure blob) so that the user can view those PDF files in the browser or download them. There are multiple blobs/files in Azure, so I want to loop through each file and download it to a stream, for example using a foreach.
I'm not sure how to reference those blobs. In CloudBlockBlob blockBlob = container.GetBlockBlobReference(fileName); I could give the name of a specific file, but I have multiple files, so I'm not sure what should go in place of "fileName".
Code:
var files = container.ListBlobs();
foreach (var file in files)
{
using (var memoryStream = new MemoryStream())
{
CloudBlockBlob blockBlob = container.GetBlockBlobReference(fileName);
blockBlob.DownloadToStream(memoryStream);
}
}
I'm not sure whether I'm looping correctly in this code and downloading every blob.
Also, I tried replacing fileName with file.Uri.Segments.Last(),
which I guess gets the name of the blob.
The problem I'm having is that this foreach gets me just one PDF file whenever I try to use the links in the front end. So, I need to know how I can properly loop through each file and download them.
So, I need to know how can I properly loop through each file and download them?
We can't download multiple files from memory directly. If a zip file is acceptable, you could use a compressed file such as a zip to transfer multiple files instead. The following is my demo code; it works correctly on my side.
using (var ms = new MemoryStream())
{
using (var zipArchive = new ZipArchive(ms, ZipArchiveMode.Create, true))
{
foreach (var file in files)
{
if (file.GetType() != typeof(CloudBlockBlob)) continue;
var blob = (CloudBlockBlob) file;
var entry = zipArchive.CreateEntry(blob.Name, CompressionLevel.Fastest);
using (var entryStream = entry.Open())
{
CloudBlockBlob blockBlob = container.GetBlockBlobReference(blob.Name);
blockBlob.DownloadToStream(entryStream);
}
}
}
}

Attaching a file from .Zip folder

Using MailKit in .NET Core, an attachment can be loaded using:
bodyBuilder.Attachments.Add(FILE);
I'm trying to attach a file from inside a ZIP file using:
using System.IO.Compression;
string zipPath = @"./html-files.ZIP";
using (ZipArchive archive = ZipFile.OpenRead(zipPath))
{
// bodyBuilder.Attachments.Add("msg.html");
bodyBuilder.Attachments.Add(archive.GetEntry("msg.html"));
}
But it did not work, and gave me APP\"msg.html" not found, which means it is trying to load a file with the same name from the root directory instead of the zipped one.
bodyBuilder.Attachments.Add() doesn't have an overload that takes a ZipArchiveEntry, so using archive.GetEntry("msg.html") has no chance of working.
Most likely what is happening is that the compiler is casting the ZipArchiveEntry to a string which happens to be APP\"msg.html" which is why you get that error.
What you'll need to do is extract the content from the zip archive and then add that to the list of attachments.
using System.IO;
using System.IO.Compression;
string zipPath = @"./html-files.ZIP";
using (ZipArchive archive = ZipFile.OpenRead (zipPath)) {
ZipArchiveEntry entry = archive.GetEntry ("msg.html");
var stream = new MemoryStream ();
// extract the content from the zip archive entry
using (var content = entry.Open ())
content.CopyTo (stream);
// rewind the stream
stream.Position = 0;
bodyBuilder.Attachments.Add ("msg.html", stream);
}

How to read data from a zip file without having to unzip the entire file

Is there any way in .NET (C#) to extract data from a zip file without decompressing the complete file?
I may want to extract data (a file) from the start of a zip file, if the compression algorithm used compresses the files in a deterministic order.
With .Net Framework 4.5 (using ZipArchive):
using (ZipArchive zip = ZipFile.Open(zipfile, ZipArchiveMode.Read))
foreach (ZipArchiveEntry entry in zip.Entries)
if(entry.Name == "myfile")
entry.ExtractToFile("myfile");
Find "myfile" in zipfile and extract it.
DotNetZip is your friend here.
As easy as:
using (ZipFile zip = ZipFile.Read(ExistingZipFile))
{
ZipEntry e = zip["MyReport.doc"];
e.Extract(OutputStream);
}
(you can also extract to a file or other destinations).
Reading the zip file's table of contents is as easy as:
using (ZipFile zip = ZipFile.Read(ExistingZipFile))
{
foreach (ZipEntry e in zip)
{
if (header)
{
System.Console.WriteLine("Zipfile: {0}", zip.Name);
if ((zip.Comment != null) && (zip.Comment != ""))
System.Console.WriteLine("Comment: {0}", zip.Comment);
System.Console.WriteLine("\n{1,-22} {2,8} {3,5} {4,8} {5,3} {0}",
"Filename", "Modified", "Size", "Ratio", "Packed", "pw?");
System.Console.WriteLine(new System.String('-', 72));
header = false;
}
System.Console.WriteLine("{1,-22} {2,8} {3,5:F0}% {4,8} {5,3} {0}",
e.FileName,
e.LastModified.ToString("yyyy-MM-dd HH:mm:ss"),
e.UncompressedSize,
e.CompressionRatio,
e.CompressedSize,
(e.UsesEncryption) ? "Y" : "N");
}
}
Edited To Note: DotNetZip used to live at Codeplex. Codeplex has been shut down. The old archive is still available at Codeplex. It looks like the code has migrated to Github:
https://github.com/DinoChiesa/DotNetZip. Looks to be the original author's repo.
https://github.com/haf/DotNetZip.Semverd. This looks to be the currently maintained version. It's also packaged up and available via NuGet at https://www.nuget.org/packages/DotNetZip/
Something like this will list and extract the files one by one, if you want to use SharpZipLib:
var zip = new ZipInputStream(File.OpenRead(@"C:\Users\Javi\Desktop\myzip.zip"));
var filestream = new FileStream(@"C:\Users\Javi\Desktop\myzip.zip", FileMode.Open, FileAccess.Read);
ZipFile zipfile = new ZipFile(filestream);
ZipEntry item;
while ((item = zip.GetNextEntry()) != null)
{
Console.WriteLine(item.Name);
using (StreamReader s = new StreamReader(zipfile.GetInputStream(item)))
{
// stream with the file
Console.WriteLine(s.ReadToEnd());
}
}
Based on this example: content inside zip file
Here is how a UTF8 text file can be read from a zip archive into a string variable (.NET Framework 4.5 and up):
string zipFileFullPath = "{{TypeYourZipFileFullPathHere}}";
string targetFileName = "{{TypeYourTargetFileNameHere}}";
string text = new string(
(new System.IO.StreamReader(
System.IO.Compression.ZipFile.OpenRead(zipFileFullPath)
.Entries.Where(x => x.Name.Equals(targetFileName,
StringComparison.InvariantCulture))
.FirstOrDefault()
.Open(), Encoding.UTF8)
.ReadToEnd())
.ToArray());
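The expression above throws a NullReferenceException when no entry matches, because FirstOrDefault() returns null, and it never disposes the archive. A more defensive variant might look like the sketch below; ZipEntryReader and ReadEntryText are hypothetical names. Note one difference from the snippet above: ZipArchive.GetEntry matches the entry's full path within the archive (FullName) rather than just its Name.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper, named here for illustration only.
public static class ZipEntryReader
{
    // Returns the UTF-8 text of one entry, or null when the entry does not exist.
    // Only the requested entry is decompressed.
    public static string ReadEntryText(string zipPath, string entryFullName)
    {
        using (ZipArchive archive = ZipFile.OpenRead(zipPath))
        {
            // GetEntry returns null if no entry has this full path.
            ZipArchiveEntry entry = archive.GetEntry(entryFullName);
            if (entry == null)
                return null;

            using (var reader = new StreamReader(entry.Open(), Encoding.UTF8))
            {
                return reader.ReadToEnd();
            }
        }
    }
}
```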
The following code can read a specific file as a byte array:
using ZipArchive zipArchive = ZipFile.OpenRead(zipFilePath);
foreach(ZipArchiveEntry zipArchiveEntry in zipArchive.Entries)
{
if(zipArchiveEntry.Name.Equals(fileName,StringComparison.OrdinalIgnoreCase))
{
using Stream stream = zipArchiveEntry.Open();
using MemoryStream memoryStream = new MemoryStream();
await stream.CopyToAsync(memoryStream);
return memoryStream.ToArray();
}
}
Zip files have a table of contents. Every zip utility should have the ability to query just the TOC. Or you can use a command-line program such as 7-Zip (its l command lists an archive's contents) to print the table of contents and redirect it to a text file.
In that case you will need to parse the zip local file header entries. Each file stored in a zip file is preceded by a Local File Header entry, which (normally) contains enough information for decompression. Generally, you can do a simple parse of these entries in the stream, select the file you need, copy its header plus compressed data to another file, and call unzip on that part (if you don't want to deal with the whole zip decompression code or library).
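As a rough illustration of the header parsing described above, the sketch below reads the first Local File Header from a stream. The field order follows the fixed 30-byte header layout of the zip format; the type names LocalFileHeader and ZipHeaderReader are hypothetical. Two simplifying assumptions are made: names without the UTF-8 flag (bit 11) are decoded as ASCII, and Zip64 extra fields are ignored. When flag bit 3 is set, the sizes in the header are zero and the real values follow the compressed data, so the header alone is not enough to skip to the next entry in that case.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical container for the fields this sketch cares about.
public sealed class LocalFileHeader
{
    public ushort Flags;
    public ushort CompressionMethod; // 0 = stored, 8 = deflate
    public uint CompressedSize;
    public uint UncompressedSize;
    public string FileName;
}

// Hypothetical helper, named here for illustration only.
public static class ZipHeaderReader
{
    private const uint Signature = 0x04034b50; // bytes 'P' 'K' 3 4, read little-endian

    // Reads one Local File Header at the current position and leaves the
    // stream positioned at the start of that entry's (compressed) data.
    public static LocalFileHeader ReadFirst(Stream stream)
    {
        // BinaryReader always reads little-endian, which matches the zip format.
        var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true);
        if (reader.ReadUInt32() != Signature)
            throw new InvalidDataException("No zip local file header at this position.");

        var header = new LocalFileHeader();
        reader.ReadUInt16();                              // version needed to extract
        header.Flags = reader.ReadUInt16();               // general purpose bit flag
        header.CompressionMethod = reader.ReadUInt16();
        reader.ReadUInt16();                              // last modification time
        reader.ReadUInt16();                              // last modification date
        reader.ReadUInt32();                              // CRC-32
        header.CompressedSize = reader.ReadUInt32();
        header.UncompressedSize = reader.ReadUInt32();
        ushort nameLength = reader.ReadUInt16();
        ushort extraLength = reader.ReadUInt16();
        byte[] nameBytes = reader.ReadBytes(nameLength);
        reader.ReadBytes(extraLength);                    // skip the extra field

        // Bit 11 marks a UTF-8 name; ASCII is an assumption for everything else.
        Encoding encoding = (header.Flags & 0x0800) != 0 ? Encoding.UTF8 : Encoding.ASCII;
        header.FileName = encoding.GetString(nameBytes);
        return header;
    }
}
```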
