I have the following code snippet, which is designed to add files to a .zip file, while at the same time calculating their sha1 checksum.
However, it's running out of memory on large files.
Which part of it is causing the whole file to be in memory? Surely this should all be just streamed?
using (ZipArchive archive = ZipFile.Open(buildFile, ZipArchiveMode.Update))
{
    foreach (var fileName in nameList)
    {
        ZipArchiveEntry entry = archive.CreateEntry(fileName);
        using (Stream zipData = entry.Open())
        using (SHA1Managed shaForFile = new SHA1Managed())
        using (Stream sourceFileStream = File.OpenRead(fileName))
        using (Stream sourceData = new CryptoStream(sourceFileStream, shaForFile, CryptoStreamMode.Read))
        {
            sourceData.CopyTo(zipData);
            // Print the file name together with its SHA-1 as hex
            Console.WriteLine(fileName + ":" + BitConverter.ToString(shaForFile.Hash));
        }
    }
}
(Copied from a comment - as this answers the question)
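The problem is ZipArchiveMode.Update, which can require significant alterations to the file on disk; in that mode the archive's contents are held in memory and nothing is written until the archive is disposed. Entries can only be streamed directly to disk when you use ZipArchiveMode.Create.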
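A minimal sketch of the streaming variant, assuming the archive can be written from scratch rather than updated in place. The names `StreamingZip` and `AddFilesWithSha1`, and the choice to use the bare file name as the entry name, are illustrative assumptions and not part of the original snippet. It uses `SHA1.Create()` instead of `SHA1Managed`, which is marked obsolete in recent .NET versions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Security.Cryptography;

public static class StreamingZip
{
    // Writes each file into a new archive and returns its SHA-1 (hex), keyed by source path.
    public static Dictionary<string, string> AddFilesWithSha1(string zipPath, IEnumerable<string> fileNames)
    {
        var hashes = new Dictionary<string, string>();

        // FileMode.Create overwrites any existing archive at zipPath.
        using (var zipStream = new FileStream(zipPath, FileMode.Create, FileAccess.Write))
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Create))
        {
            foreach (var fileName in fileNames)
            {
                // Assumption: the entry is named after the file, without its directory.
                ZipArchiveEntry entry = archive.CreateEntry(Path.GetFileName(fileName));

                using (Stream zipData = entry.Open())
                using (SHA1 sha = SHA1.Create())
                using (Stream source = File.OpenRead(fileName))
                using (var hashing = new CryptoStream(source, sha, CryptoStreamMode.Read))
                {
                    // Data is hashed as it is read and compressed as it is written.
                    hashing.CopyTo(zipData);
                    hashes[fileName] = BitConverter.ToString(sha.Hash).Replace("-", "");
                }
            }
        }
        return hashes;
    }
}
```

In Create mode only one entry can be open for writing at a time, which is why each entry stream is disposed before the next entry is created.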
I'm scanning a PDF417 barcode, which returns a byte[] array. The DataString itself is a cryptic value that looks as if someone had mashed the keyboard. So I'm guessing that a zip file is stored in the barcode. The zip file should contain an XML file.
I have made several attempts so far to convert my byte[] array to a valid zip file, but I was never able to open the resulting zip file.
The barcodes are created by certified software solutions, so the barcode itself is surely not the problem.
I can't be the only one who has had this problem, right?
Output when reading a barcode with dummy data in it:
"\0\0\0\vz\0\u0002B\u0001\u0002PK\u0003\u0004\n\0\0\0\b\0 E\u0081Q|\u0015\u00163Î\u0001\0\0\u0004\u0004\0\0\u0004\0\0\0txab}SMo£0\u0010½ï¯°|\a\u001bª¤Ý\u0015Påc\u0093Fê&\b²9ôÆÂ$ \r¦²M\u0093üûNø\b$Që\u008bí7ï½\u0019{lçù\u0098ïÉ\aH\u0095\u0015Â¥\u0096É)\u0001\u0011\u0017I&v.ý»\u009e\u0019OôÙsÖ\u0004iB¹4Õúý\u0017c\u0087ÃÁT\u0087L©\u0004b3N\u0099\u008aSÈ#¦\u0012fs>ä6·X\u0018í#y\u009aB\u008cS¤Ñ}}\u001c\u008d)\t\u0017S\u0097ÎV+\\u009dÔ¦^z?\b\u000egRäï\u00918\u0091\u0097À\b&Æ2ÊÁ¥/\u0091Pä\u000fd ÉhNÉÛÂwé\u0013ç\u009c²Fäcé\u0085XLÉk¤´¨4\u0015\u009d\u0092Y&[äìÒ\u0017×ÚJ\u001fn\u008cQh,¥÷8\u0018\u009a\u0016\u000eÓÆa>ütØ%Tgbmªf\u001fö\0\u0094\u0015I\aTàV\u0016¹gãe\u0018|`pËa\u0015pÍ)\u0085Îö5ɲ\u008d\a$ÕHgÍn½\u009d\u0005ö'\ao\u0080'á&ç\u000ek\u0080\u008e1\u0093Ø>\u0018\u0083\u0080m¦\u0015ëEV:\u0005Ù\u0006njYÃQ{ãB\u0094ÊaÕú:\u001c\u0096¹g]r\u009ew½"¿ðuæ²Pª©ox\u0011÷Ñ\u008e;ÞÌ\u008dWß7&\u0085Ð2ûW\u009e\u001fÍM\r\u0001ìJ|Oxa\u008dS\v\ÓüRÆi¤ ª·â]\u0090^Íßçs\u0096 Û\u009b~lm:¬ãM!)ã³v¤Ã\u0002Óô²Þ\u0087:Ù$\u008dä\u000eTPî\u0081ÝÃ7\aú½Ý\u0002\u001a}\QÙ\u001d·nï×Ý\u000fð\u0093Êÿí×aø\u0082±ÓÞ'PK\u0001\u0002\u0014\0\n\0\0\0\b\0 E\u0081Q|\u0015\u00163Î\u0001\0\0\u0004\u0004\0\0\u0004\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0txabPK\u0005\u0006\0\0\0\0\u0001\0\u0001\02\0\0\0ð\u0001\0\0\0\0"
Don't pay too much attention to this function; at this stage I'm probably trying too hard to get to a solution. It is just one of my many test methods.
public void Test(byte[] bytes)
{
    byte[] zipBytes;
    using (var memoryStream = new MemoryStream())
    {
        using (var zipArchive = new ZipArchive(memoryStream, ZipArchiveMode.Create, leaveOpen: true))
        {
            var zipEntry = zipArchive.CreateEntry("test");
            using (Stream entryStream = zipEntry.Open())
            {
                entryStream.Write(bytes, 0, bytes.Length);
            }
        }
        zipBytes = memoryStream.ToArray();
    }
    using (var fileStream = new FileStream(@"C:\BarcodeReaderTesting\test.zip", FileMode.OpenOrCreate))
    {
        fileStream.Write(zipBytes, 0, zipBytes.Length);
    }
}
Any tips on this topic?
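One thing is visible in the sample output above: the ZIP local-file-header signature `PK\u0003\u0004` does not sit at offset 0 but after a short prefix. The sketch below assumes that the payload really is an ordinary ZIP archive preceded by a header of unknown meaning, and that the scanner delivers the raw bytes unaltered. It looks for the signature and opens the archive from there, rather than wrapping the bytes in a new archive. `BarcodeZip`, `FindZipStart` and `ExtractFirstEntry` are hypothetical names.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class BarcodeZip
{
    // Returns the index of the first "PK\x03\x04" signature, or -1 if there is none.
    public static int FindZipStart(byte[] data)
    {
        for (int i = 0; i + 3 < data.Length; i++)
        {
            if (data[i] == 0x50 && data[i + 1] == 0x4B &&
                data[i + 2] == 0x03 && data[i + 3] == 0x04)
            {
                return i;
            }
        }
        return -1;
    }

    // Skips whatever precedes the signature and returns the first entry's content.
    public static byte[] ExtractFirstEntry(byte[] data)
    {
        int start = FindZipStart(data);
        if (start < 0)
            throw new InvalidDataException("No ZIP signature found.");

        // Expose only the bytes from the signature onwards as a read-only stream.
        using (var zipStream = new MemoryStream(data, start, data.Length - start, false))
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read))
        using (Stream entryStream = archive.Entries[0].Open())
        using (var result = new MemoryStream())
        {
            entryStream.CopyTo(result);
            return result.ToArray();
        }
    }
}
```

If the bytes have already been converted to a string and back, they may no longer match what was encoded in the barcode, so this only has a chance of working on the original byte[].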
I want to create a zip-file and return it to the browser so that it downloads the zip to the downloads-folder.
var images = imageRepository.GetAll(allCountryId);
using (FileStream f2 = new FileStream("SudaAmerica", FileMode.Create))
using (GZipStream gz = new GZipStream(f2, CompressionMode.Compress, false))
{
    foreach (var image in images)
    {
        gz.Write(image.ImageData, 0, image.ImageData.Length);
    }
    return base.File(gz, "application/zip", "SudaAmerica");
}
I have tried the above, but then I get an error saying the stream is disposed.
Is this possible, or should I use a library other than GZipStream?
The problem here is exactly what it says: you are handing it something based on gz, but gz gets disposed the moment you leave the using.
One option would be to wait until outside the using block, then tell it to use the filename of the thing you just wrote ("SudaAmerica"). However, IMO you shouldn't actually be writing a file here at all. If you use a MemoryStream instead, you can use .ToArray() to get a byte[] of the contents, which you can use in the File method. This requires no IO access, which is a win in about 20 different ways. Well, maybe 3 ways. But...
var images = imageRepository.GetAll(allCountryId);
using (MemoryStream ms = new MemoryStream())
{
    using (GZipStream gz = new GZipStream(ms, CompressionMode.Compress, false))
    {
        foreach (var image in images)
        {
            gz.Write(image.ImageData, 0, image.ImageData.Length);
        }
    }
    return base.File(ms.ToArray(), "application/zip", "SudaAmerica");
}
Note that a gzip stream is not the same as a .zip archive, so I very much doubt this will have the result you want. Zip archive creation is available elsewhere in the .NET framework, but it is not via GZipStream.
You probably want ZipArchive
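A minimal sketch of what that could look like with `System.IO.Compression.ZipArchive` (available from .NET 4.5), building the archive in memory so that a byte[] can be handed to the `File` method as in the answer above. `ZipBuilder` and `BuildZip` are hypothetical names, and the entry names are an assumption, since the question does not show what the image objects provide.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class ZipBuilder
{
    // Packs named byte arrays into a .zip archive held in memory.
    public static byte[] BuildZip(IEnumerable<KeyValuePair<string, byte[]>> files)
    {
        using (var ms = new MemoryStream())
        {
            // Disposing the archive writes the central directory;
            // leaveOpen: true keeps ms open for the ToArray call afterwards.
            using (var archive = new ZipArchive(ms, ZipArchiveMode.Create, leaveOpen: true))
            {
                foreach (var file in files)
                {
                    ZipArchiveEntry entry = archive.CreateEntry(file.Key);
                    using (Stream entryStream = entry.Open())
                    {
                        entryStream.Write(file.Value, 0, file.Value.Length);
                    }
                }
            }
            return ms.ToArray();
        }
    }
}
```

In the controller the result could then be returned with something like `return base.File(ZipBuilder.BuildZip(files), "application/zip", "SudaAmerica.zip");`, where `files` pairs each image's data with whatever entry name suits the application.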
I am having a problem in my app where it reads a PDF from disk, and then has to write it back to a different location later.
The emitted file is not a valid PDF anymore.
In very simplified form, I have tried reading/writing it using
var bytes = File.ReadAllBytes(@"c:\myfile.pdf");
File.WriteAllBytes(@"c:\output.pdf", bytes);
and
var input = new StreamReader(@"c:\myfile.pdf").ReadToEnd();
File.WriteAllText(@"c:\output.pdf", input);
... and about 100 permutations of the above with various encodings being specified. None of the output files were valid PDFs.
Can someone please lend a hand? Many thanks!!
In C#/.Net 4.0:
using (var i = new FileStream(@"input.pdf", FileMode.Open, FileAccess.Read))
using (var o = File.Create(@"output.pdf"))
    i.CopyTo(o);
If you insist on having the byte[] first:
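using (var i = new FileStream(@"input.pdf", FileMode.Open, FileAccess.Read))
using (var ms = new MemoryStream())
{
    i.CopyTo(ms);
    // ToArray returns exactly the bytes written; GetBuffer may include unused capacity
    byte[] rawdata = ms.ToArray();
    // Rewind: after the first copy the position is at the end of the stream
    ms.Position = 0;
    using (var o = File.Create(@"output.pdf"))
        ms.CopyTo(o);
}
The memory stream has to be rewound (ms.Position = 0, or ms.Seek(0, SeekOrigin.Begin)) before the second CopyTo; otherwise nothing is copied, because the first CopyTo leaves its position at the end.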
You're using File.WriteAllText to write your file out.
Try File.WriteAllBytes.
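To illustrate why the text-based variant breaks a binary file, here is a small sketch that pushes bytes through a UTF-8 decode and encode, roughly what `StreamReader.ReadToEnd` followed by `File.WriteAllText` does with their default encodings. `TextRoundTrip` and `ThroughUtf8` are hypothetical names, and the sample bytes in the usage note are made up.

```csharp
using System.Text;

public static class TextRoundTrip
{
    // Decodes the bytes as UTF-8 text and encodes the text again.
    // Byte sequences that are not valid UTF-8 are replaced by U+FFFD
    // during decoding, so the original values cannot be recovered.
    public static byte[] ThroughUtf8(byte[] data)
    {
        string asText = Encoding.UTF8.GetString(data);
        return Encoding.UTF8.GetBytes(asText);
    }
}
```

For example, passing `new byte[] { 0x25, 0x50, 0x44, 0x46, 0xFF, 0xFE, 0x80 }` ("%PDF" followed by three bytes that are not valid UTF-8) returns an array that differs from the input, whereas `File.ReadAllBytes` and `File.WriteAllBytes` never interpret the data.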
My GZipStream will only decompress the first line of the file. Extracting the contents via 7-zip works as expected and gives me the entire file contents. It also extracts as expected using gunzip on cygwin and linux, so I expect this is O/S specific (Windows 7).
I'm not certain how to go about troubleshooting this, so any tips on that would help me a great deal. It sounds very similar to this, but using SharpZLib results in the same thing.
Here's what I'm doing:
var inputFile = String.Format(@"{0}\{1}", inputDir, fileName);
var outputFile = String.Format(@"{0}\{1}.gz", inputDir, fileName);
var dcmpFile = String.Format(@"{0}\{1}", outputDir, fileName);
using (var input = File.OpenRead(inputFile))
using (var fileOutput = File.Open(outputFile, FileMode.Append))
using (GZipStream gzOutput = new GZipStream(fileOutput, CompressionMode.Compress, true))
{
    input.CopyTo(gzOutput);
}
// Now, decompress
using (FileStream of = new FileStream(outputFile, FileMode.Open, FileAccess.Read))
using (GZipStream ogz = new GZipStream(of, CompressionMode.Decompress, false))
using (FileStream wf = new FileStream(dcmpFile, FileMode.Append, FileAccess.Write))
{
    ogz.CopyTo(wf);
}
Your output file only contains a single line (gzipped) - but it contains all of the text data other than the line breaks.
You're repeatedly calling ReadLine() which returns a line of text without the line break and converting that text to bytes. So if you had an input file which had:
abc
def
ghi
You'd end up with an output file which was the compressed version of
abcdefghi
If you don't want that behaviour, why even go through a StreamReader in the first place? Just copy from the input FileStream straight to the GZipStream a block at a time, or use Stream.CopyTo if you're using .NET 4:
// Note how much simpler the code is using File.*
using (var input = File.OpenRead(inputFile))
using (var fileOutput = File.Open(outputFile, FileMode.Append))
using (GZipStream gzOutput = new GZipStream(fileOutput, CompressionMode.Compress, true))
{
    input.CopyTo(gzOutput);
}
Also note that appending to a compressed file is rarely a good idea, unless you've got some sort of special handling for multiple "chunks" within a single file.
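For runtimes without `Stream.CopyTo`, the block-at-a-time copy mentioned above can be sketched as follows. `GzipCopy`, `CopyBlocks`, `CompressFile` and `DecompressFile` are hypothetical names and the 8192-byte buffer size is an arbitrary choice. Unlike the snippet above, this version uses `File.Create` so that it overwrites the output instead of appending to it.

```csharp
using System.IO;
using System.IO.Compression;

public static class GzipCopy
{
    // Copies everything from source to destination, one buffer at a time.
    public static void CopyBlocks(Stream source, Stream destination)
    {
        byte[] buffer = new byte[8192];
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            // Write only the bytes actually read, not the whole buffer.
            destination.Write(buffer, 0, read);
        }
    }

    public static void CompressFile(string inputFile, string outputFile)
    {
        using (var input = File.OpenRead(inputFile))
        using (var fileOutput = File.Create(outputFile)) // overwrite rather than append
        using (var gzOutput = new GZipStream(fileOutput, CompressionMode.Compress))
        {
            CopyBlocks(input, gzOutput);
        }
    }

    public static void DecompressFile(string inputFile, string outputFile)
    {
        using (var fileInput = File.OpenRead(inputFile))
        using (var gzInput = new GZipStream(fileInput, CompressionMode.Decompress))
        using (var output = File.Create(outputFile))
        {
            CopyBlocks(gzInput, output);
        }
    }
}
```

Because the bytes are never turned into text, line breaks in the input survive the round trip unchanged.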
I am using SharpZipLib in a project and am wondering whether it is possible to use it to look inside a zip file and, if one of the files within has a modified date in the range I am searching for, to pick that file out and copy it to a new directory. Does anybody know if this is possible?
Yes, it is possible to enumerate the files of a zip file using SharpZipLib. You can also pick files out of the zip file and copy those files to a directory on your disk.
Here is a small example:
using (var fs = new FileStream(@"c:\temp\test.zip", FileMode.Open, FileAccess.Read))
{
    using (var zf = new ZipFile(fs))
    {
        foreach (ZipEntry ze in zf)
        {
            if (ze.IsDirectory)
                continue;
            Console.Out.WriteLine(ze.Name);
            using (Stream s = zf.GetInputStream(ze))
            {
                byte[] buf = new byte[4096];
                // Analyze file in memory using MemoryStream.
                using (MemoryStream ms = new MemoryStream())
                {
                    StreamUtils.Copy(s, ms, buf);
                }
                // To store the file on disk instead, replace the
                // MemoryStream block above with the following lines
                // (the entry stream s can only be read once).
                /*using (FileStream outFile = File.Create(@"c:\temp\uncompress_" + ze.Name))
                {
                    StreamUtils.Copy(s, outFile, buf);
                }*/
            }
        }
    }
}
In the example above I use a MemoryStream to store the ZipEntry in memory (for further analysis). You could also store the ZipEntry (if it meets certain criteria) on disk.
Hope this helps.
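The date-range part of the question could be handled by checking each entry's `DateTime` property before copying. This is a sketch under assumptions: `ZipDateFilter` and `ExtractModifiedBetween` are hypothetical names, the range is treated as inclusive, and entries are written flat into the target directory.

```csharp
using System;
using System.IO;
using ICSharpCode.SharpZipLib.Core;
using ICSharpCode.SharpZipLib.Zip;

public static class ZipDateFilter
{
    // Copies every file entry whose stored timestamp lies in [from, to]
    // into targetDir and returns the number of files written.
    public static int ExtractModifiedBetween(string zipPath, string targetDir, DateTime from, DateTime to)
    {
        int extracted = 0;
        Directory.CreateDirectory(targetDir);

        using (var fs = new FileStream(zipPath, FileMode.Open, FileAccess.Read))
        using (var zf = new ZipFile(fs))
        {
            byte[] buf = new byte[4096];
            foreach (ZipEntry ze in zf)
            {
                if (ze.IsDirectory || ze.DateTime < from || ze.DateTime > to)
                    continue;

                // Path.GetFileName drops any folder part of the entry name, so entries
                // with the same file name in different folders overwrite each other.
                string target = Path.Combine(targetDir, Path.GetFileName(ze.Name));
                using (Stream s = zf.GetInputStream(ze))
                using (FileStream outFile = File.Create(target))
                {
                    StreamUtils.Copy(s, outFile, buf);
                }
                extracted++;
            }
        }
        return extracted;
    }
}
```

The timestamp stored in a zip entry carries no time zone, so the bounds are best expressed in the same local time as the machine that created the archive.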