.NET 4.6.2
I read some files (which I know are valid) from a database and then try to combine them into a single ZIP file using the following function:
public static byte[] CompressData(IList<ZipFileData> zipFileDatas)
{
var buffer = new byte[(int)zipFileDatas.Sum(z => z.Length)];
using (var ms = new MemoryStream(buffer))
{
using (var zip = new ZipFile())
{
foreach (var zipFileData in zipFileDatas)
zip.AddEntry($"{zipFileData.FileName}.{zipFileData.FileType}", zipFileData.Data);
zip.Save(ms);
}
return ms.ToArray();
}
}
The parameter is a collection of these:
public class ZipFileData
{
public string FileName { get; set; }
public string FileType { get; set; } // Eg: PDF, JPG, XSLX
public byte[] Data { get; set; }
public long? Length { get; set; } // Length of the data
}
The function appears to work correctly, but later when I save the returned byte[] as "my.zip" and try to open it (from Windows 10), I get the error "The Compressed (zipped) Folder C:...\my.zip is invalid."
I'm trying to determine if this function (or some other code) is the cause of the issue.
Has anyone done something similar before or could verify that the function is (in)correct?
The buffer you allocate is too small:
var buffer = new byte[(int)zipFileDatas.Sum(z => z.Length)];
A non-compressed zip file will be slightly larger than the sum of the files inside: each zip entry has a header (and sometimes a footer), and there is a "table of contents" at the end of the zip file (the central directory).
var ms = new MemoryStream(buffer)
will create a non-resizable memory stream that is a bit too small. Unfortunately for you, the last bytes, the ones you are missing, are the most important ones: that's where the offset of the central directory is stored. Without it, you have a corrupted zip file.
To fix this, use a resizable memory stream:
var ms = new MemoryStream()
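To illustrate the difference, here is a minimal sketch using only `System.IO` (the `MemoryStreamDemo` class and its method names are hypothetical, introduced for illustration). It shows that a `MemoryStream` wrapped around a fixed array cannot grow, and that its `ToArray()` returns the whole array regardless of how much was written, whereas a default `MemoryStream` grows on demand and returns only the bytes written:

```csharp
using System;
using System.IO;

public static class MemoryStreamDemo // hypothetical name, for illustration only
{
    // True if writing `count` bytes into a stream over a fixed `capacity`-byte array fails.
    public static bool FixedStreamRejects(int capacity, int count)
    {
        using (var fixedStream = new MemoryStream(new byte[capacity]))
        {
            try
            {
                fixedStream.Write(new byte[count], 0, count);
                return false;
            }
            catch (NotSupportedException)
            {
                // A stream created over an existing array is not expandable.
                return true;
            }
        }
    }

    // Length of ToArray() after writing `count` bytes (count <= capacity) into a fixed stream.
    public static int FixedToArrayLength(int capacity, int count)
    {
        using (var fixedStream = new MemoryStream(new byte[capacity]))
        {
            fixedStream.Write(new byte[count], 0, count);
            return fixedStream.ToArray().Length; // the full array, not just the written part
        }
    }

    // Length of ToArray() after writing `count` bytes into a resizable stream.
    public static int ResizableToArrayLength(int count)
    {
        using (var ms = new MemoryStream())
        {
            ms.Write(new byte[count], 0, count);
            return ms.ToArray().Length; // only the bytes actually written
        }
    }
}
```

With these helpers, `FixedStreamRejects(4, 8)` is true, `FixedToArrayLength(8, 2)` is 8, and `ResizableToArrayLength(2)` is 2. Either way, a pre-sized buffer is a poor fit for zip output whose final size is not known in advance.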
I need to encode a zip file in Base64 format.
I used the following approach:
string text = File.ReadAllText("../../../SampleDat.dat");
byte[] compress0 = Compress(stringbyte);
string short_com0 = base64_encode(compress0);
public static byte[] Compress(byte[] data)
{
using (var compressedStream = new MemoryStream())
using (var zipStream = new GZipStream(compressedStream, CompressionMode.Compress))
{
zipStream.Write(data, 0, data.Length);
zipStream.Close();
return compressedStream.ToArray();
}
}
public string base64_encode(byte[] data)
{
if (data == null)
throw new ArgumentNullException("data");
return Convert.ToBase64String(data);
}
After using this I got this encoded string.
H4sIAAAAAAAEAJVQTU/CQBS8m/gfejHRgxQpoJJ4qGXBKlBsq6KXph8P2NjdrbuLleT9eBe/QvSgHt7hTWYmMzMmsdt3Yxe9lBe0SDVcisytqpLmqaaCkxctU5/PBQ5GZNabkjAxFwWThPhxQgYDNJd4bkyGQXifeEGfYKoUKMWA60nKYP+n5mwCTKyksjxJNUiaHmxpolzIf4tuZPk3iWcaLoRce6IAJPP5iHLwC5wC3ZSU7K30JwmjVcaoUgYynOGN38fI+OUQrZUGZrDtN6g5SAzhaUUV3dhMViwzyNey7//uzpiEQ/L74N/D46agaYZuwSinyvA0fQbLNQGVTrm2Di3CtVxbI3iGEjttXGpdqZ5t13XdyD9szLxVIxfMXlIJCkrItS2hElIrm/ICXuzH6V7rfL4oTx+CIMtY/+7aiaNZq7ZFnLfDinavZsFtBvfNpZ9HZIH4MyriUctpd7rHJ6dNvPDGDX88HaFz3MGO02w6r7wgTAN2AgAA
When I instead create the zip manually, read that file in code, and encode it:
//file zipped manually
string filePath1 = "../../../git_only/oraclehcm1/dbscripts/SampleDat.zip";
byte[] physicalfile1 = File.ReadAllBytes(filePath1);
string long_com1 = base64_encode(physicalfile1);
The response I get is
UEsDBBQAAAAIAECDYlK8IEwDbAEAAHYCAAANAAAAU2FtcGxlRGF0LmRhdJVQTU/CQBS8m/gfejHRgxQpqJB4qGXBKlBsq6KXph8P2NjdrbuLleT9eBc/4tdBPbzDvMxMZmZMYrfvxi56KS9okWo4F5lbVSXNU00FJ09apj6fCxyMyKw3JWFiLgomCfHjhAwGaC7x3JgMg/A28YI+wVQpUIoB15OUwe5PzckEmFhJZXmSapA03fukiXIh/y26kuXfJJ5puBBy7YkCkMznI8rBL3AKdFNSspfS7ySMVhmjSpmX4Qyv/D5Gxi+HaK00ML/4AoOag8QQHlZU0Y3NZMUykB/LvuLtrTEJh+T3wb+Hx01B0wzdglFOleFp+giWawIqnXJt7VuEa7m2RvAIJXbauNS6Uj3bruu6kb/ZmHmrRi6YvaQSFJSQa1tCJaRWNuUFPNn3053W6XxRdu+CIMtY/+bSiaNZq7ZFnLfDih5ezILrDG6bSz+PyALxZ1TEg5bT7hweHXebeOaNG/54OkLnqIMdp9l0ngFQSwECHwAUAAAACABAg2JSvCBMA2wBAAB2AgAADQAkAAAAAAAAACAAAAAAAAAAU2FtcGxlRGF0LmRhdAoAIAAAAAAAAQAYAEMpLaJSD9cBq6mosXsP1wFNJS5xSw7XAVBLBQYAAAAAAQABAF8AAACXAQAAAAA=
This is the output I actually need. I also noticed that the two zips are of different sizes, and that the files in the zip I created programmatically have no extensions.
Please help me produce the second encoding programmatically. The .NET version I am using is 4.5,
and I cannot use the Zip.createDirectory() method due to project dependencies.
Any help is appreciated.
Thanks in advance!
The first one is a gzip file, the second one is a zip file. If you want to make a zip file, try the ZipFile class as opposed to the GZipStream class.
I wouldn't expect two different Zip algorithms/libraries to yield the same output. For one, in the programmatic way, the file metadata (name, modification date, attributes) is not set, while the command line version will include all that information for unzipping purposes.
Plus libraries update at different cadence than standalones, and you might not have the fixes synchronized to reliably match the outputs.
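As a rough sketch of the suggestion above, the following uses `ZipArchive` from `System.IO.Compression` (available since .NET 4.5) to build a real zip archive in memory and Base64-encode it. The `ZipBase64` class and `ZipToBase64` method names are hypothetical, and the output should not be expected to match a manually created zip byte for byte, for the reasons given above:

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ZipBase64 // hypothetical helper, for illustration only
{
    // Packs one file into a zip archive held in memory and returns it Base64-encoded.
    public static string ZipToBase64(string entryName, byte[] data)
    {
        using (var ms = new MemoryStream())
        {
            // leaveOpen: true keeps ms usable after the archive is disposed.
            using (var archive = new ZipArchive(ms, ZipArchiveMode.Create, true))
            {
                ZipArchiveEntry entry = archive.CreateEntry(entryName, CompressionLevel.Optimal);
                using (Stream entryStream = entry.Open())
                {
                    entryStream.Write(data, 0, data.Length);
                }
            } // disposing the archive writes the central directory

            return Convert.ToBase64String(ms.ToArray());
        }
    }
}
```

A call might look like `ZipBase64.ZipToBase64("SampleDat.dat", File.ReadAllBytes("../../../SampleDat.dat"))`. Because the entry name includes the extension, the file inside the archive keeps it. The result starts with `UEsD` (the zip signature), like the manually created archive, rather than `H4sI` (the gzip signature).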
I would like to take the contents of a file and rename the file while in memory to send with a different file name using an API.
The Goals:
Not alter the original file (file on disk) in any way.
Not create additional files (like a copy of the file with a new name). I'm trying to keep IO access as low as possible and do everything in memory.
Change the Name of a file object (in memory) to a different name.
Upload the file object to a WebAPI on another machine.
Have "FileA.txt" on source MachineA and have "FileB.txt" on destination MachineB.
I don't think it would matter but I have no plans to write the file back to the system (MachineA) with the new name, it will only be used to send the file object (in memory) to MachineB via a Web API.
I found a solution that uses reflection to accomplish this...
FileStream fs = new FileStream(@"C:\myfile.txt", FileMode.Open);
var myField = fs.GetType()
.GetField("_fileName", BindingFlags.Instance | BindingFlags.NonPublic);
myField.SetValue(fs, "my_new_filename.txt");
However, it's been a few years since that solution was given. Is there a better way to do this in 2021?
One other way would be defining the filename when you save it on MachineB.
You could pass this filename as a payload through the Web API and use it as the file name.
//buffer as byte[] and fileName as string would come from the request
using (FileStream fs = new FileStream(fileName, FileMode.Create))
{
fs.Write(buffer, 0, buffer.Length);
}
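On the sending side, the same idea can be sketched with `HttpClient` and `MultipartFormDataContent`, where the file name travels in the request rather than in the file object. This is only an illustration: the `RenamedUpload` class, the `"file"` form field name, and the `url` parameter are assumptions, not part of the original question.

```csharp
using System.IO;
using System.Net.Http;
using System.Threading.Tasks;

public static class RenamedUpload // hypothetical helper, for illustration only
{
    // Builds a multipart body whose file part carries newFileName instead of the on-disk name.
    public static MultipartFormDataContent BuildContent(string filePath, string newFileName)
    {
        byte[] bytes = File.ReadAllBytes(filePath); // read-only; the original file is untouched
        var content = new MultipartFormDataContent();
        // "file" is an assumed form field name; the third argument is the name the receiver sees.
        content.Add(new ByteArrayContent(bytes), "file", newFileName);
        return content;
    }

    // Posts the file to the given (hypothetical) endpoint under the new name.
    public static async Task<HttpResponseMessage> UploadAsync(
        HttpClient client, string url, string filePath, string newFileName)
    {
        using (var content = BuildContent(filePath, newFileName))
        {
            return await client.PostAsync(url, content);
        }
    }
}
```

Nothing is written to disk on MachineA, and the receiving API only ever sees the name passed as `newFileName`.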
The best way I could come up with was using my old method from years ago. The following shows how I used it. I only do this to mask the original filename from the third-party WebAPI I'm sending it to.
// filePath: c:\test\my_secret_filename.txt
private byte[] GetBytesWithNewFileName(string filePath)
{
byte[] file = null;
using (var fs = new FileStream(filePath, FileMode.Open, FileAccess.Read))
{
// Change the name of the file in memory (does not affect the original file)
var fileNameField = fs.GetType().GetField(
"_fileName",
BindingFlags.Instance | BindingFlags.NonPublic
);
// If I leave out the next line, the file name field will have the full filePath
// string as its value in the resulting byte array. This will replace that with
// only the file name I wish to pass along "my_masked_filename.txt".
fileNameField.SetValue(fs, "my_masked_filename.txt");
// Get the filesize of the file and make sure it's compatible with
// the binaryreader object to be used
int fileSize;
try { fileSize = Convert.ToInt32(fs.Length); }
catch(OverflowException)
{ throw new Exception("The file is too big to convert using a binary reader."); }
// Get the file into a byte array
using (var br = new BinaryReader(fs)) { file = br.ReadBytes(fileSize); }
}
return file;
}
I have a console application written in C# on top of the .NET Core 2.2 framework.
I want to create an asynchronous Task that writes a full-size image to storage. Additionally, the process needs to create a thumbnail and write it to the default storage.
Following is the method that processes the logic. I documented each line to explain what I believe is happening.
// This method accepts FileObject and returns a task
// The generated task will write the file as is to the default storage
// Then it'll create a thumbnail of that image and store it to the default storage
public async Task ProcessImage(FileObject file, int thumbnailWidth = 250)
{
// The name of the full-size image
string filename = string.Format("{0}{1}", file.ObjectId, file.Extension);
// The name along with the exact path to the full-size image
string path = Path.Combine(file.ContentId, filename);
// Write the full-size image to the storage
await Storage.CreateAsync(file.Content, path)
.ContinueWith(task =>
{
// Reset the stream to the beginning since this will be the second time the stream is read
file.Content.Seek(0, SeekOrigin.Begin);
// Create original Image
Image image = Image.FromStream(file.Content);
// Calculate the height of the new thumbnail
int height = (thumbnailWidth * image.Height) / image.Width;
// Create the new thumbnail
Image thumb = image.GetThumbnailImage(thumbnailWidth, height, null, IntPtr.Zero);
using (MemoryStream thumbnailStream = new MemoryStream())
{
// Save the thumbnail to the memory stream
thumb.Save(thumbnailStream, image.RawFormat);
// The name of the new thumbnail
string thumbnailFilename = string.Format("thumbnail_{0}", filename);
// The name along with the exact path to the thumbnail
string thumbnailPath = Path.Combine(file.ContentId, thumbnailFilename);
// Write the thumbnail to storage
Storage.CreateAsync(thumbnailStream, thumbnailPath);
}
// Dispose the file object to ensure the Stream is disposed
file.Dispose();
image.Dispose();
thumb.Dispose();
});
}
Here is my FileObject if needed
public class FileObject : IDisposable
{
public string ContentId { get; set; }
public string ObjectId { get; set; }
public ContentType ContentType { get; set; }
public string Extension { get; set; }
public Stream Content { get; set; }
private bool IsDisposed;
public void Dispose()
{
Dispose(true);
GC.SuppressFinalize(this);
}
protected virtual void Dispose(bool disposing)
{
if (IsDisposed)
return;
if (disposing && Content != null)
{
Content.Close();
Content.Dispose();
}
IsDisposed = true;
}
}
The above code writes the correct full-size image to the storage drive. It also writes the thumbnail to storage. However, the thumbnail is always corrupted. In other words, the generated thumbnail file is always written with 0 bytes.
How can I correctly create my thumbnail from file.Content stream after writing the same stream to the storage?
I figured out the cause of the issue. The line thumb.Save(thumbnailStream, image.RawFormat); leaves thumbnailStream positioned at the end, so when it is then written to storage, nothing gets written.
The fix was to reset the seek position after writing to the stream, like this:
using (MemoryStream thumbnailStream = new MemoryStream())
{
// Save the thumbnail to the memory stream
thumb.Save(thumbnailStream, image.RawFormat);
// Reset the seek position to the beginning
thumbnailStream.Seek(0, SeekOrigin.Begin);
// The name of the new thumbnail
string thumbnailFilename = string.Format("thumbnail_{0}", filename);
// The name along with the exact path to the thumbnail
string thumbnailPath = Path.Combine(file.ContentId, thumbnailFilename);
// Write the thumbnail to storage
Storage.CreateAsync(thumbnailStream, thumbnailPath);
}
I am not sure what benefit is gained by thumb.Save(...) not resetting the position to 0 after writing into a new stream. I just feel that it should, since here it is always writing a new stream rather than appending to an existing one.
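The behavior is not specific to `Image.Save`: any write leaves a stream positioned after the last byte written, and readers start from the current position. A minimal sketch with a plain `MemoryStream` (the `StreamPositionDemo` class name is hypothetical) shows the effect of rewinding before the stream is read:

```csharp
using System.IO;

public static class StreamPositionDemo // hypothetical name, for illustration only
{
    // Copies whatever can be read from source into a new array, optionally rewinding first.
    public static byte[] CopyOut(MemoryStream source, bool rewind)
    {
        if (rewind)
            source.Seek(0, SeekOrigin.Begin);

        using (var target = new MemoryStream())
        {
            source.CopyTo(target); // copies from the current position to the end
            return target.ToArray();
        }
    }
}
```

After writing three bytes to a stream, `CopyOut(stream, rewind: false)` returns an empty array, while `CopyOut(stream, rewind: true)` returns all three bytes. Separately, `Storage.CreateAsync` is not awaited inside the `using` block above, so the stream may be disposed before the write completes; awaiting the call seems safer, though that depends on how `Storage` is implemented.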
I have a function I use for aggregating streams from a zip archive.
private void ExtractMiscellaneousFiles()
{
foreach (var miscellaneousFileName in _fileData.MiscellaneousFileNames)
{
var fileEntry = _archive.GetEntry(miscellaneousFileName);
if (fileEntry == null)
{
throw new ZipArchiveMissingFileException("Couldn't find " + miscellaneousFileName);
}
var stream = fileEntry.Open();
OtherFileStreams.Add(miscellaneousFileName, (DeflateStream) stream);
}
}
This works well in most cases. However, if I have a zip within a zip, I get an exception on casting the stream to a DeflateStream:
System.InvalidCastException: Unable to cast object of type 'System.IO.Compression.SubReadStream' to type 'System.IO.Compression.DeflateStream'.
I am unable to find Microsoft documentation for a SubReadStream. I would like my zip within a zip as a DeflateStream. Is this possible? If so how?
UPDATE
Still no success. I attempted @Sunshine's suggestion of copying the stream using the following code:
private void ExtractMiscellaneousFiles()
{
_logger.Log("Extracting misc files...");
foreach (var miscellaneousFileName in _fileData.MiscellaneousFileNames)
{
_logger.Log($"Opening misc file stream for {miscellaneousFileName}");
var fileEntry = _archive.GetEntry(miscellaneousFileName);
if (fileEntry == null)
{
throw new ZipArchiveMissingFileException("Couldn't find " + miscellaneousFileName);
}
var openStream = fileEntry.Open();
var deflateStream = openStream;
if (!(deflateStream is DeflateStream))
{
var memoryStream = new MemoryStream();
deflateStream.CopyTo(memoryStream);
memoryStream.Position = 0;
deflateStream = new DeflateStream(memoryStream, CompressionLevel.NoCompression, true);
}
OtherFileStreams.Add(miscellaneousFileName, (DeflateStream)deflateStream);
}
}
But I get a
System.NotSupportedException: Stream does not support reading.
I inspected deflateStream.CanRead and it is true.
I've discovered this happens not just on zips, but on files that are in the zip but are not compressed (because too small, for example). Surely there's a way to deal with this; surely someone has encountered this before. I'm opening a bounty on this question.
Here's the .NET source for SubReadStream, thanks to @Quantic.
The return type of ZipArchiveEntry.Open() is Stream. An abstract type, in practice it can be a DeflateStream (you'd be happy), a SubReadStream (boo) or a WrappedStream (boo). Woe be you if they decide to improve the class some day and use a ZopfliStream (boo). The workaround is not good, you are trying to deflate data that is not compressed (boo).
Too many boos.
The only good solution is to change the type of your OtherFileStreams member. We can't see it, but it smells like a List<DeflateStream>. It needs to be a List<Stream>.
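Since the question's code calls `Add` with a name and a stream, the member is presumably a dictionary, so a sketch of this suggestion might use `Dictionary<string, Stream>`. The `EntryStreams` class and `OpenAll` method below are hypothetical, and `FileNotFoundException` stands in for the question's custom exception type:

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class EntryStreams // hypothetical helper, for illustration only
{
    // Opens every named entry and stores it as a plain Stream,
    // whatever concrete type ZipArchiveEntry.Open() happens to return.
    public static Dictionary<string, Stream> OpenAll(ZipArchive archive, IEnumerable<string> names)
    {
        var streams = new Dictionary<string, Stream>();
        foreach (string name in names)
        {
            ZipArchiveEntry entry = archive.GetEntry(name);
            if (entry == null)
                throw new FileNotFoundException("Couldn't find " + name);

            streams.Add(name, entry.Open()); // no cast needed
        }
        return streams;
    }
}
```

Callers then read from each value through the `Stream` API alone, which works the same whether the entry was deflated or stored.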
So it looks like when a zip file is stored inside another zip, it isn't deflated; its content is just inlined with the rest of the files, along with some information that these entries are part of a sub zip file. That makes sense, because applying compression to something that is already compressed is a waste of time.
This zip file is marked as CompressionMethodValues.Stored in the archive, which causes .NET to just return the original stream it read instead of wrapping it in a DeflateStream.
Source here: https://github.com/dotnet/corefx/blob/master/src/System.IO.Compression/src/System/IO/Compression/ZipArchiveEntry.cs#L670
You could pass the stream into a ZipArchive, if it's not a DeflateStream (if you are interested in the file inside)
var stream = entry.Open();
if (!(stream is DeflateStream))
{
var subArchive = new ZipArchive(stream);
}
Or you can copy the stream to a FileStream (if you want to save it to disk)
var stream = entry.Open();
if (!(stream is DeflateStream))
{
var fs = File.Create(Path.GetTempFileName());
stream.CopyTo(fs);
fs.Close();
}
Or copy to any stream you are interested in using.
Note: This is also how .NET 4.6 behaves
Is there a way to create a Zip archive that contains multiple files, when the files are currently in memory? The files I want to save are really just text only and are stored in a string class in my application. But I would like to save multiple files in a single self-contained archive. They can all be in the root of the archive.
It would be nice to be able to do this using SharpZipLib.
Use ZipEntry and PutNextEntry() for this. The following shows how to do it for a file, but for an in-memory object just use a MemoryStream:
FileStream fZip = File.Create(compressedOutputFile);
ZipOutputStream zipOStream = new ZipOutputStream(fZip);
foreach (FileInfo fi in allfiles)
{
ZipEntry entry = new ZipEntry((fi.Name));
zipOStream.PutNextEntry(entry);
FileStream fs = File.OpenRead(fi.FullName);
try
{
byte[] transferBuffer = new byte[1024];
int bytesRead;
do
{
bytesRead = fs.Read(transferBuffer, 0, transferBuffer.Length);
zipOStream.Write(transferBuffer, 0, bytesRead);
}
while (bytesRead > 0);
}
finally
{
fs.Close();
}
}
zipOStream.Finish();
zipOStream.Close();
Using SharpZipLib for this seems pretty complicated. This is so much easier in DotNetZip. In v1.9, the code looks like this:
using (ZipFile zip = new ZipFile())
{
zip.AddEntry("Readme.txt", stringContent1);
zip.AddEntry("readings/Data.csv", stringContent2);
zip.AddEntry("readings/Index.xml", stringContent3);
zip.Save("Archive1.zip");
}
The code above assumes stringContent{1,2,3} contains the data to be stored in the files (or entries) in the zip archive. The first entry is "Readme.txt" and it is stored in the top level "Directory" in the zip archive. The next two entries are stored in the "readings" directory in the zip archive.
The strings are encoded in the default encoding. There is an overload of AddEntry(), not shown here, that allows you to explicitly specify the encoding to use.
If you have the content in a stream or byte array, not a string, there are overloads for AddEntry() that accept those types. There are also overloads that accept a Write delegate, a method of yours that is invoked to write data into the zip. This works for easily saving a DataSet into a zip file, for example.
DotNetZip is free and open source.
This function builds the zip archive in a MemoryStream and then writes it to a file. For simplicity, I've created a small interface describing the files:
public interface IHasDocumentProperties
{
byte[] Content { get; set; }
string Name { get; set; }
}
public void CreateZipFileContent(string filePath, IEnumerable<IHasDocumentProperties> fileInfos)
{
using (var memoryStream = new MemoryStream())
{
using (var zipArchive = new ZipArchive(memoryStream, ZipArchiveMode.Create, true))
{
foreach(var fileInfo in fileInfos)
{
var entry = zipArchive.CreateEntry(fileInfo.Name);
using (var entryStream = entry.Open())
{
entryStream.Write(fileInfo.Content, 0, fileInfo.Content.Length);
}
}
}
using (var fileStream = new FileStream(filePath, FileMode.Create, System.IO.FileAccess.Write))
{
memoryStream.Position = 0;
memoryStream.CopyTo(fileStream);
}
}
}
Yes, you can use SharpZipLib to do this - when you need to supply a stream to write to, use a MemoryStream.
I came across this problem; using the MSDN example, I created this class:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO.Packaging;
using System.IO;
public class ZipSticle
{
Package package;
public ZipSticle(Stream s)
{
package = ZipPackage.Open(s, FileMode.Create);
}
public void Add(Stream stream, string Name)
{
Uri partUriDocument = PackUriHelper.CreatePartUri(new Uri(Name, UriKind.Relative));
PackagePart packagePartDocument = package.CreatePart(partUriDocument, "");
CopyStream(stream, packagePartDocument.GetStream());
stream.Close();
}
private static void CopyStream(Stream source, Stream target)
{
const int bufSize = 0x1000;
byte[] buf = new byte[bufSize];
int bytesRead = 0;
while ((bytesRead = source.Read(buf, 0, bufSize)) > 0)
target.Write(buf, 0, bytesRead);
}
public void Close()
{
package.Close();
}
}
You can then use it like this:
FileStream str = File.Open("MyAwesomeZip.zip", FileMode.Create);
ZipSticle zip = new ZipSticle(str);
zip.Add(File.OpenRead("C:/Users/C0BRA/SimpleFile.txt"), "Some directory/SimpleFile.txt");
zip.Add(File.OpenRead("C:/Users/C0BRA/Hurp.derp"), "hurp.Derp");
zip.Close();
str.Close();
You can pass a MemoryStream (or any Stream) to ZipSticle.Add such as:
FileStream str = File.Open("MyAwesomeZip.zip", FileMode.Create);
ZipSticle zip = new ZipSticle(str);
byte[] fileinmem = new byte[1000];
// Do stuff to fileinmem
MemoryStream memstr = new MemoryStream(fileinmem);
zip.Add(memstr, "Some directory/SimpleFile.txt");
memstr.Close();
zip.Close();
str.Close();
Note: this answer is outdated; since .NET 4.5, the ZipArchive class allows zipping files in-memory. See johnny 5's answer below for how to use it.
You could also do it a bit differently, using a Serializable object to store all strings
[Serializable]
public class MyStrings {
public string Foo { get; set; }
public string Bar { get; set; }
}
Then, you could serialize it into a stream to save it.
To save on space you could use GZipStream (From System.IO.Compression) to compress it. (note: GZip is stream compression, not an archive of multiple files).
That is, of course if what you need is actually to save data, and not zip a few files in a specific format for other software.
Also, this would allow you to save many more types of data except strings.
I was utilizing Cheeso's answer by adding MemoryStreams as the source of the different Excel files. When I downloaded the zip, the files had nothing in them. This could be the way we were getting around trying to create and download a file over AJAX.
To get the contents of the different Excel files to be included in the Zip, I had to add each of the files as a byte[].
using (var memoryStream = new MemoryStream())
using (var zip = new ZipFile())
{
zip.AddEntry("Excel File 1.xlsx", excelFileStream1.ToArray());
zip.AddEntry("Excel File 2.xlsx", excelFileStream2.ToArray());
// Keep the file off of disk, and in memory.
zip.Save(memoryStream);
}
Use a StringReader to read from your string objects, or wrap their encoded bytes in a MemoryStream to expose them as Streams.
That should make it easy to feed them to your zip-building code.
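Note that `StringReader` derives from `TextReader` rather than `Stream`, so where zip-building code expects a `Stream`, one option is to wrap the string's encoded bytes in a `MemoryStream`. A minimal sketch, assuming UTF-8 is the desired encoding (the `StringStreams` helper name is hypothetical):

```csharp
using System.IO;
using System.Text;

public static class StringStreams // hypothetical helper, for illustration only
{
    // Wraps a string's UTF-8 bytes in a read-only MemoryStream positioned at the start.
    public static Stream ToStream(string text)
    {
        return new MemoryStream(Encoding.UTF8.GetBytes(text), writable: false);
    }
}
```

The returned stream can be passed to any API that accepts a `Stream`, such as a zip library's add-entry overloads.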