I have a 453 MB XML file that I'm trying to compress into a ZIP using SharpZipLib.
Below is the code I'm using to create the zip, but it's throwing an OutOfMemoryException. The same code successfully compresses a 428 MB file.
Any idea why the exception is happening? I can't see a reason, since my system has plenty of memory available.
public void CompressFiles(List<string> pathnames, string zipPathname)
{
    try
    {
        using (FileStream stream = new FileStream(zipPathname, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            using (ZipOutputStream stream2 = new ZipOutputStream(stream))
            {
                foreach (string str in pathnames)
                {
                    FileStream stream3 = new FileStream(str, FileMode.Open, FileAccess.Read, FileShare.Read);
                    byte[] buffer = new byte[stream3.Length];
                    try
                    {
                        if (stream3.Read(buffer, 0, buffer.Length) != buffer.Length)
                        {
                            throw new Exception(string.Format("Error reading '{0}'.", str));
                        }
                    }
                    finally
                    {
                        stream3.Close();
                    }
                    ZipEntry entry = new ZipEntry(Path.GetFileName(str));
                    stream2.PutNextEntry(entry);
                    stream2.Write(buffer, 0, buffer.Length);
                }
                stream2.Finish();
            }
        }
    }
    catch (Exception)
    {
        File.Delete(zipPathname);
        throw;
    }
}
You're trying to create a buffer as big as the file. Instead, make the buffer a fixed size, read some bytes into it, and write only as many bytes as were actually read into the zip file.
Here's your code with a buffer of 4096 bytes (and some cleanup):
public static void CompressFiles(List<string> pathnames, string zipPathname)
{
    const int BufferSize = 4096;
    byte[] buffer = new byte[BufferSize];
    try
    {
        using (FileStream stream = new FileStream(zipPathname,
            FileMode.Create, FileAccess.Write, FileShare.None))
        using (ZipOutputStream stream2 = new ZipOutputStream(stream))
        {
            foreach (string str in pathnames)
            {
                using (FileStream stream3 = new FileStream(str,
                    FileMode.Open, FileAccess.Read, FileShare.Read))
                {
                    ZipEntry entry = new ZipEntry(Path.GetFileName(str));
                    stream2.PutNextEntry(entry);
                    int read;
                    while ((read = stream3.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        stream2.Write(buffer, 0, read);
                    }
                }
            }
            stream2.Finish();
        }
    }
    catch (Exception)
    {
        File.Delete(zipPathname);
        throw;
    }
}
Especially note this block:
const int BufferSize = 4096;
byte[] buffer = new byte[BufferSize];
// ...
int read;
while ((read = stream3.Read(buffer, 0, buffer.Length)) > 0)
{
    stream2.Write(buffer, 0, read);
}
This reads bytes into buffer. When there are no more bytes, the Read() method returns 0, so that's when we stop. When Read() succeeds, we can be sure there is some data in the buffer, but we don't know how many bytes. The whole buffer might be filled, or just a small portion of it. Therefore, we use the number of bytes read to determine how many bytes to write to the ZipOutputStream.
That block of code, by the way, can be replaced by a single statement that was added in .NET 4.0 and does exactly the same thing:
stream3.CopyTo(stream2);
So, your code could become:
public static void CompressFiles(List<string> pathnames, string zipPathname)
{
    try
    {
        using (FileStream stream = new FileStream(zipPathname,
            FileMode.Create, FileAccess.Write, FileShare.None))
        using (ZipOutputStream stream2 = new ZipOutputStream(stream))
        {
            foreach (string str in pathnames)
            {
                using (FileStream stream3 = new FileStream(str,
                    FileMode.Open, FileAccess.Read, FileShare.Read))
                {
                    ZipEntry entry = new ZipEntry(Path.GetFileName(str));
                    stream2.PutNextEntry(entry);
                    stream3.CopyTo(stream2);
                }
            }
            stream2.Finish();
        }
    }
    catch (Exception)
    {
        File.Delete(zipPathname);
        throw;
    }
}
And now you know why you got the error, and how to use buffers.
You're allocating a lot of memory for no good reason, and I bet you have a 32-bit process. A 32-bit process can only allocate up to 2 GB of virtual memory under normal conditions, and the library surely allocates memory too.
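If in doubt, you can check the bitness at runtime; here's a minimal sketch using the standard Environment and IntPtr APIs (Environment.Is64BitProcess requires .NET 4.0 or later):
// Prints whether this process is 64-bit; IntPtr.Size is 4 in a 32-bit process.
Console.WriteLine("64-bit process: {0}, pointer size: {1} bytes",
    Environment.Is64BitProcess, IntPtr.Size);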
Anyway, several things are wrong here:
byte[] buffer = new byte[stream3.Length];
Why? You don't need to store the whole thing in memory to process it.
if (stream3.Read(buffer, 0, buffer.Length) != buffer.Length)
This one is nasty. Stream.Read is explicitly allowed to return fewer bytes than you asked for, and that is still a valid result. When reading a stream into a buffer you have to call Read repeatedly until the buffer is filled or the end of the stream is reached.
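For illustration, a minimal sketch of such a loop (ReadFully is a hypothetical helper, not part of your code):
// Reads from 'stream' until 'buffer' is full or the end of the stream is reached.
// Returns the total number of bytes read, which may be less than buffer.Length.
static int ReadFully(Stream stream, byte[] buffer)
{
    int total = 0;
    while (total < buffer.Length)
    {
        int read = stream.Read(buffer, total, buffer.Length - total);
        if (read == 0) break; // end of stream
        total += read;
    }
    return total;
}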
Your variables should have more meaningful names. You can easily get lost among stream2, stream3, and so on.
A simple solution would be:
using (var zipFileStream = new FileStream(zipPathname, FileMode.Create, FileAccess.Write, FileShare.None))
using (ZipOutputStream zipStream = new ZipOutputStream(zipFileStream))
{
    foreach (string str in pathnames)
    {
        using (var itemStream = new FileStream(str, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            var entry = new ZipEntry(Path.GetFileName(str));
            zipStream.PutNextEntry(entry);
            itemStream.CopyTo(zipStream);
        }
    }
    zipStream.Finish();
}
Related
I'm building an archiver that reads a file block by block, compresses each block, and writes the compressed block to a FileStream.
I read blocks of 5 MB. The problem is that if I compress an 8 MB picture, then when I extract it from the resulting archive its hash doesn't match the original's and it only opens halfway, even though the size is the same. I don't know what else to try, so I'm asking for help.
The read-chunk method:
private byte[] ReadChunk(int chunkId)
{
    using (var inFile = new FileStream(sourceFile, FileMode.Open, FileAccess.Read, FileShare.Read))
    {
        long filePosition = chunkId * chunkDataSize;
        int bytesRead;
        if (inFile.Length - filePosition <= chunkDataSize)
        {
            bytesRead = (int)(inFile.Length - filePosition);
        }
        else
        {
            bytesRead = chunkDataSize;
        }
        var lastBuffer = new byte[bytesRead];
        inFile.Read(lastBuffer, 0, bytesRead);
        return lastBuffer;
    }
}
The compress-and-write method:
private void CompressBlock(byte[] bytesTo)
{
    using (MemoryStream ms = new MemoryStream())
    {
        using (GZipStream gs = new GZipStream(ms, CompressionMode.Compress))
        {
            gs.Write(bytesTo, 0, bytesTo.Length);
        }
        byte[] compressedData = ms.ToArray();
        using (var outFile = new FileStream(resultFile, FileMode.Append))
        {
            BitConverter.GetBytes(compressedData.Length).CopyTo(compressedData, 4);
            outFile.Write(compressedData, 0, compressedData.Length);
        }
    }
}
I have two methods; one works, the other doesn't.
Working method:
public static void CompressAndEncrypt(string sourceFile, string encrFile)
{
    int bufferSize = 5242880;
    using (var readStream = new FileStream(sourceFile, FileMode.Open, FileAccess.ReadWrite))
    {
        using (var writeStream = new FileStream(encrFile, FileMode.OpenOrCreate, FileAccess.Write, FileShare.ReadWrite))
        {
            DESCryptoServiceProvider cryptic = new DESCryptoServiceProvider();
            cryptic.Key = ASCIIEncoding.ASCII.GetBytes("ABCDEFGH");
            cryptic.IV = ASCIIEncoding.ASCII.GetBytes("ABCDEFGH");
            using (var crypto = new CryptoStream(writeStream, cryptic.CreateEncryptor(), CryptoStreamMode.Write))
            {
                using (var zip = new GZipStream(crypto, CompressionMode.Compress))
                {
                    int bytesRead = -1;
                    byte[] bytes = new byte[bufferSize];
                    while ((bytesRead = readStream.Read(bytes, 0, bufferSize)) > 0)
                    {
                        zip.Write(bytes, 0, bytesRead);
                    }
                }
            }
        }
    }
}
Nonworking method:
public static void CompressAndEncryptBlock(string sourceFile, string outputFile)
{
    int bufferSize = 5242880;
    int bytesRead;
    var bytes = new byte[bufferSize];
    using (var readStream = new FileStream(sourceFile, FileMode.Open, FileAccess.ReadWrite))
    {
        using (var writer = new FileStream(outputFile, FileMode.OpenOrCreate, FileAccess.Write, FileShare.ReadWrite))
        {
            while ((bytesRead = readStream.Read(bytes, 0, bufferSize)) > 0)
            {
                using (var writeStream = new MemoryStream())
                {
                    DESCryptoServiceProvider cryptic = new DESCryptoServiceProvider();
                    cryptic.Key = Encoding.ASCII.GetBytes("ABCDEFGH");
                    cryptic.IV = Encoding.ASCII.GetBytes("ABCDEFGH");
                    using (var crypto = new CryptoStream(writeStream, cryptic.CreateEncryptor(), CryptoStreamMode.Write))
                    {
                        using (var zip = new GZipStream(crypto, CompressionMode.Compress, true))
                        {
                            zip.Write(bytes, 0, bytesRead);
                            // After that, the Capacity of writeStream (MemoryStream) is somehow greater than its Length
                        }
                        var bytes1 = new byte[writeStream.Length];
                        writeStream.Read(bytes1, 0, bytes1.Length);
                        writer.Write(bytes1, 0, bytes1.Length);
                    }
                }
            }
        }
    }
}
Why does the second way of processing the file go wrong?
I will need the second method later to transfer the file in blocks (for now I'm just testing it by writing to disk).
After using the second method the file comes out slightly smaller than with the first one, and if I then decrypt and decompress it I get this exception:
System.IO.InvalidDataException occurred
  HResult=0x80131501
  Message=The magic number in GZip header is not correct. Make sure you are passing in a GZip stream.
  Source=System
  StackTrace:
   at System.IO.Compression.GZipDecoder.ReadHeader(InputBuffer input)
   at System.IO.Compression.Inflater.Decode()
   at System.IO.Compression.Inflater.Inflate(Byte[] bytes, Int32 offset, Int32 length)
   at System.IO.Compression.DeflateStream.Read(Byte[] array, Int32 offset, Int32 count)
   at System.IO.Compression.GZipStream.Read(Byte[] array, Int32 offset, Int32 count)
With the first method I can decrypt and decompress just fine (I use the same single method for decrypting and decompressing in both cases).
I have an active WAV recording, wave-file.wav, being written to the Source folder.
I need to replicate this file into a Destination folder under a new name, wave-file-copy.wav.
The recording and the replication should happen in parallel.
I have implemented a scheduled job that runs every 10 minutes and copies the source file to the destination.
private static void CopyWaveFile(string destinationFile, string sourceFile){
    using (var fs = File.Open(sourceFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)){
        using (var reader = new WaveFileReader(fs)){
            using (var writer = new WaveFileWriter(destinationFile, reader.WaveFormat)){
                reader.Position = 0;
                var endPos = (int)reader.Length;
                var buffer = new byte[1024];
                while (reader.Position < endPos){
                    var bytesRequired = (int)(endPos - reader.Position);
                    if (bytesRequired <= 0) continue;
                    var bytesToRead = Math.Min(bytesRequired, buffer.Length);
                    var bytesRead = reader.Read(buffer, 0, bytesToRead);
                    if (bytesRead > 0){
                        writer.Write(buffer, 0, bytesRead);
                    }
                }
            }
        }
    }
}
The copy operation works fine, even though the source file is updated continuously.
However, the time taken by the copy grows linearly, because I am copying the entire file every time.
I am trying to implement a new function, ConcatenateWavFiles(), which should update the destination file with only the latest available bytes of the source recording.
I have tried a few sample codes; the approach I am using is:
Read the destination file's metadata and get its length.
Set the destination file's length as the reader.Position of the source file's WaveFileReader.
Read the source file to the end, starting from that position.
public static void ConcatenateWavFiles(string destinationFile, string sourceFile){
    WaveFileWriter waveFileWriter = null;
    var sourceReadOffset = GetWaveFileSize(destinationFile);
    try{
        using (var fs = File.Open(sourceFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            using (var reader = new WaveFileReader(fs))
            {
                waveFileWriter = new WaveFileWriter(destinationFile, reader.WaveFormat);
                if (!reader.WaveFormat.Equals(waveFileWriter.WaveFormat)){
                    throw new InvalidOperationException(
                        "Can't append WAV Files that don't share the same format");
                }
                var startPos = sourceReadOffset - sourceReadOffset % reader.WaveFormat.BlockAlign;
                var endPos = (int) reader.Length;
                reader.Position = startPos;
                var bytesRequired = (int)(endPos - reader.Position);
                var buffer = new byte[bytesRequired];
                if (bytesRequired > 0)
                {
                    var bytesToRead = Math.Min(bytesRequired, buffer.Length);
                    var bytesRead = reader.Read(buffer, 0, bytesToRead);
                    if (bytesRead > 0)
                    {
                        waveFileWriter.Write(buffer, startPos, bytesRead);
                    }
                }
            }
        }
    }
    finally{
        if (waveFileWriter != null){
            waveFileWriter.Dispose();
        }
    }
}
I was able to get the new content.
Is it possible to append the latest content to the existing destination file?
If it is, what am I doing wrong in the code?
My code throws the following exception: "Offset and length were out of bounds for the array or count is greater than the number of elements from index to the end of the source collection."
I couldn't find a solution for replicating a WAV audio file with the NAudio library, but I implemented one using C# MemoryStreams and FileStreams:
Copy the source file to the destination if the destination file doesn't exist.
Append all the latest bytes (those recorded after the last operation) to the destination file.
Modify the WAV file header to reflect the newly appended bytes (otherwise only the file size grows while the reported duration stays the same).
Repeat the append operation at regular intervals.
public void ReplicateFile(string destinationFile, string sourceFile){
    if (!Directory.Exists(GetRoutePathFromFile(sourceFile)))
        return;
    if (!File.Exists(sourceFile))
        return;
    if (Directory.Exists(GetRoutePathFromFile(destinationFile))){
        if (File.Exists(destinationFile)){
            UpdateLatestWaveFileContent(destinationFile, sourceFile);
        }else{
            CopyWaveFile(destinationFile, sourceFile);
        }
    }else{
        Directory.CreateDirectory(GetRoutePathFromFile(destinationFile));
        CopyWaveFile(destinationFile, sourceFile);
    }
}

private static string GetRoutePathFromFile(string file){
    var rootPath = Directory.GetParent(file);
    return rootPath.FullName;
}

private static void CopyWaveFile(string destination, string source){
    var sourceMemoryStream = new MemoryStream();
    using (var fs = File.Open(source, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)){
        fs.CopyTo(sourceMemoryStream);
    }
    using (var fs = new FileStream(destination, FileMode.CreateNew, FileAccess.ReadWrite, FileShare.ReadWrite)){
        sourceMemoryStream.WriteTo(fs);
    }
}

private static void UpdateLatestWaveFileContent(string destinationFile, string sourceFile){
    var sourceMemoryStream = new MemoryStream();
    long offset = 0;
    using (var fs = File.Open(destinationFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)){
        offset = fs.Length;
    }
    using (var fs = File.Open(sourceFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite)){
        fs.CopyTo(sourceMemoryStream);
    }
    // Append only the bytes recorded since the last run.
    var length = sourceMemoryStream.Length - offset;
    var buffer = sourceMemoryStream.GetBuffer();
    using (var fs = new FileStream(destinationFile, FileMode.Append, FileAccess.Write, FileShare.ReadWrite)){
        fs.Write(buffer, (int)offset, (int)length);
    }
    // Copy the leading header bytes from the live source file; they carry the
    // updated RIFF/data size fields (the canonical WAV header is 44 bytes).
    var bytes = new byte[45];
    for (var i = 0; i < 45; i++){
        bytes[i] = buffer[i];
    }
    ModifyHeaderDataLength(destinationFile, 0, bytes);
}

private static void ModifyHeaderDataLength(string filename, int position, byte[] data){
    using (Stream stream = File.Open(filename, FileMode.OpenOrCreate, FileAccess.Write, FileShare.ReadWrite))
    {
        stream.Position = position;
        stream.Write(data, 0, data.Length);
    }
}
Try reading the source file one or two WAV blocks before the actual end of the file.
It could be that the code is judging the end of the source file too close for comfort.
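For illustration, here is a minimal sketch of that idea applied to the reader/writer/buffer loop from the question's CopyWaveFile (the two-block safety margin is an assumption, not a tested value):
// Stop a couple of blocks short of the live end of the file, so the copy
// never races the recorder over a partially written block.
const int safetyBlocks = 2; // assumed margin; tune as needed
int blockAlign = reader.WaveFormat.BlockAlign;
long safeEnd = reader.Length - safetyBlocks * blockAlign;
safeEnd -= safeEnd % blockAlign; // keep the end position block-aligned
while (reader.Position < safeEnd)
{
    var bytesToRead = (int)Math.Min(buffer.Length, safeEnd - reader.Position);
    var bytesRead = reader.Read(buffer, 0, bytesToRead);
    if (bytesRead <= 0) break;
    writer.Write(buffer, 0, bytesRead);
}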
I am working in WinForms and getting errors while performing the following operation.
It throws a System.OutOfMemoryException when I run the operation around 2-3 times in a row; it seems .NET is not able to free the resources used by the operation. The file I am working with is quite big, more than 500 MB.
My sample code is below. Please help me figure out how to resolve the error.
try
{
    using (FileStream target = new FileStream(strCompressedFileName, FileMode.Create, FileAccess.Write))
    using (GZipStream alg = new GZipStream(target, CompressionMode.Compress))
    {
        byte[] data = File.ReadAllBytes(strFileToBeCompressed);
        alg.Write(data, 0, data.Length);
        alg.Flush();
        data = null;
    }
}
catch (Exception ex)
{
    MessageBox.Show(ex.ToString());
}
Replace ReadAllBytes with Stream.CopyTo:
using (FileStream target = new FileStream(strCompressedFileName, FileMode.Create, FileAccess.Write))
using (GZipStream alg = new GZipStream(target, CompressionMode.Compress))
{
    // Open the source file from the question (strFileToBeCompressed) for reading.
    using (var fileToRead = File.Open(strFileToBeCompressed, FileMode.Open, FileAccess.Read))
    {
        fileToRead.CopyTo(alg);
    }
}
A very rough example could be:
// destFile - FileStream for the destination file
// srcFile - FileStream for the source file
using (GZipStream gz = new GZipStream(destFile, CompressionMode.Compress))
{
    byte[] src = new byte[1024];
    int count = srcFile.Read(src, 0, 1024);
    while (count != 0)
    {
        gz.Write(src, 0, count);
        count = srcFile.Read(src, 0, 1024);
    }
}
// flush, close, dispose ..
So basically I changed your ReadAllBytes to read only chunks of 1024 bytes.
You can try using this method to compress a file (MSDN link):
public static void Compress(FileInfo fileToCompress)
{
    using (FileStream originalFileStream = fileToCompress.OpenRead())
    {
        using (FileStream compressedFileStream = File.Create(fileToCompress.FullName + ".gz"))
        {
            using (GZipStream compressionStream = new GZipStream(compressedFileStream, CompressionMode.Compress))
            {
                originalFileStream.CopyTo(compressionStream);
            }
        }
    }
}
Usage:
string directoryPath = @"c:\users\public\reports";
DirectoryInfo directorySelected = new DirectoryInfo(directoryPath);
foreach (FileInfo fileToCompress in directorySelected.GetFiles())
{
    Compress(fileToCompress);
}
Can we merge two memory-mapped files? If so, how? If not, why not?
Here are my first experiences with MemoryMappedFiles; give this a try:
String f1Path = @"C:\Temp\Test1.txt";
String f2Path = @"C:\Temp\Test2.txt";
byte[] buffer;
int offset;
int length;
using (FileStream f1ReadStream = new FileStream(f1Path, FileMode.Open, FileAccess.Read))
{
    offset = (int)f1ReadStream.Length;
}
using (FileStream f2ReadStream = new FileStream(f2Path, FileMode.Open, FileAccess.Read))
{
    length = (int)f2ReadStream.Length;
}
// read file2 and append all of it to file1
using (var mappedFile2 = MemoryMappedFile.CreateFromFile(f2Path, FileMode.Open, null, length))
{
    using (var reader = mappedFile2.CreateViewStream(0, length, MemoryMappedFileAccess.Read))
    {
        // Read from MMF
        buffer = new byte[length];
        reader.Read(buffer, 0, length);
    }
}
using (var mappedFile1 = MemoryMappedFile.CreateFromFile(f1Path, FileMode.Open, null, offset + length))
{
    // Create writer to MMF
    using (var writer = mappedFile1.CreateViewAccessor(offset, length, MemoryMappedFileAccess.Write))
    {
        // Write to MMF
        writer.WriteArray<byte>(0, buffer, 0, length);
    }
}