When I use a single GZipStream to compress the file's blocks in a loop, the resulting file is compressed correctly:
public static void Compress1(string fi)
{
using (FileStream inFile = File.Open(fi,FileMode.Open,FileAccess.Read,FileShare.Read))
{
using (FileStream outFile = File.Create(fi + ".gz"))
{
using (GZipStream Compress = new GZipStream(outFile,
CompressionMode.Compress))
{
byte[] buffer = new byte[6315120];
int numRead;
while ((numRead = inFile.Read(buffer, 0, buffer.Length)) != 0)
{
Compress.Write(buffer, 0, numRead);
}
}
}
}
}
But when I compress the blocks separately, each in its own stream, the resulting file is corrupt:
public static void Compress2(string fi, int offset)
{
using (FileStream inFile = File.Open(fi,FileMode.Open))
{
using (FileStream outFile = File.OpenOrCreate(fi + ".gz"))
{
using (GZipStream Compress = new GZipStream(outFile,
CompressionMode.Compress))
{
// Copy the source file into the compression stream.
byte[] buffer = new byte[6315120];
int numRead=-1;
inFile.Seek(offset,SeekOrigin.Begin);
numRead = inFile.Read(buffer, 0, buffer.Length);
Compress.Write(buffer, 0, numRead);
}
}
}
}
In these examples I have a file of 12630240 bytes, which I divide into 2 blocks of 6315120 bytes each (the buffer size). The first block is compressed correctly by both methods, but the second block compressed by the second method differs from the output of the first method. What am I missing?
What is happening is that you are creating two different files, because each GZipStream writes its own headers.
By dividing the input this way you produce two separate GZ files, and if you write both of them to the same file, the result is a corrupt file.
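To make the point about headers concrete, here is a minimal sketch (the class and method names are hypothetical, not from the question). It compresses a block with its own GZipStream and checks that the output begins with the gzip magic bytes 0x1F 0x8B. Every block compressed this way is a complete gzip member with its own header and trailer, so two such outputs written back to back are not the same bytes a single stream would have produced.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class GzipMemberSketch
{
    // Compresses one block with a dedicated GZipStream and returns the bytes it produced.
    public static byte[] CompressBlock(byte[] block)
    {
        using (var ms = new MemoryStream())
        {
            using (var gz = new GZipStream(ms, CompressionMode.Compress))
            {
                gz.Write(block, 0, block.Length);
            } // disposing the GZipStream flushes the remaining data and the gzip trailer
            return ms.ToArray(); // ToArray still works after the MemoryStream is closed
        }
    }

    // True if the data begins with the two gzip magic bytes.
    public static bool StartsWithGzipHeader(byte[] data)
    {
        return data.Length >= 2 && data[0] == 0x1F && data[1] == 0x8B;
    }
}
```

Calling CompressBlock on each of the two blocks yields two outputs that both pass StartsWithGzipHeader, which is the "two different files" situation described above.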
Related
I am writing an archiver that reads a file block by block and compresses each block. I write each compressed block to a FileStream.
I read 5 MB blocks. The problem is that if I compress an 8 MB picture and then extract it from the resulting archive, its hash does not match the original and only half of the picture opens, although the size is the same. I don’t know what else to try. Please help.
Method that reads a chunk:
private byte[] ReadChunk(int chunkId)
{
using (var inFile = new FileStream(sourceFile, FileMode.Open, FileAccess.Read, FileShare.Read))
{
long filePosition = chunkId * chunkDataSize;
int bytesRead;
if (inFile.Length - filePosition <= chunkDataSize)
{
bytesRead = (int)(inFile.Length - filePosition);
}
else
{
bytesRead = chunkDataSize;
}
var lastBuffer = new byte[bytesRead];
inFile.Read(lastBuffer, 0, bytesRead);
return lastBuffer;
}
}
Method that compresses and writes a block:
private void CompressBlock(byte[] bytesTo)
{
using (MemoryStream ms = new MemoryStream())
{
using (GZipStream gs = new GZipStream(ms, CompressionMode.Compress))
{
gs.Write(bytesTo, 0, bytesTo.Length);
}
byte[] compressedData = ms.ToArray();
using (var outFile = new FileStream(resultFile, FileMode.Append))
{
BitConverter.GetBytes(compressedData.Length).CopyTo(compressedData, 4);
outFile.Write(compressedData, 0, compressedData.Length);
}
}
}
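One likely cause, judging only from the code shown: ReadChunk opens a fresh FileStream on every call and computes filePosition, but never seeks to it, so every chunk is read from the start of the file. That would match the symptoms (correct size, wrong hash, half a picture). The sketch below is one way to read a chunk from the right offset. The path and chunkSize parameters are hypothetical stand-ins for the sourceFile and chunkDataSize fields in the question.

```csharp
using System;
using System.IO;

public static class ChunkReaderSketch
{
    // Reads chunk number chunkId (zero-based) of chunkSize bytes from the file at path.
    // The last chunk may be shorter; an empty array is returned past the end of the file.
    public static byte[] ReadChunk(string path, int chunkId, int chunkSize)
    {
        using (var inFile = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            long position = (long)chunkId * chunkSize; // cast first to avoid int overflow
            long remaining = inFile.Length - position;
            if (remaining <= 0)
            {
                return new byte[0];
            }

            int count = (int)Math.Min(remaining, chunkSize);
            inFile.Seek(position, SeekOrigin.Begin); // the step missing from the question's code

            var buffer = new byte[count];
            int total = 0;
            while (total < count)
            {
                // Read may return fewer bytes than requested, so keep reading.
                int n = inFile.Read(buffer, total, count - total);
                if (n == 0)
                {
                    break; // unexpected end of file
                }
                total += n;
            }
            if (total != count)
            {
                Array.Resize(ref buffer, total);
            }
            return buffer;
        }
    }
}
```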
I'm working with GZipStream at the moment using .NET 3.5.
I have the two methods listed below. As the input I use a 2 MB text file consisting only of the character 's'. This code works fine on .NET 4.5, but on .NET 3.5, after compressing and decompressing, I get a 435 KB file, which of course is not the same as the source file.
If I decompress the file with WinRAR, the result looks good (identical to the source file).
If I decompress a file that was compressed by GZipStream from .NET 3.5 using GZipStream from .NET 4.5, the result is bad.
UPD:
In general I really do need to read the file as several separate gzip chunks. In this case all the bytes of the compressed file are read in a single call to Read(), so I still don't understand why decompression doesn't work.
public void CompressFile()
{
string fileIn = @"D:\sin2.txt";
string fileOut = @"D:\sin2.txt.pgz";
using (var fout = File.Create(fileOut))
{
using (var fin = File.OpenRead(fileIn))
{
using (var zip = new GZipStream(fout, CompressionMode.Compress))
{
var buffer = new byte[1024 * 1024 * 10];
int n = fin.Read(buffer, 0, buffer.Length);
zip.Write(buffer, 0, n);
}
}
}
}
public void DecompressFile()
{
string fileIn = @"D:\sin2.txt.pgz";
string fileOut = @"D:\sin2.1.txt";
using (var fsout = File.Create(fileOut))
{
using (var fsIn = File.OpenRead(fileIn))
{
var buffer = new byte[1024 * 1024 * 10];
int n;
while ((n = fsIn.Read(buffer, 0, buffer.Length)) > 0)
{
using (var ms = new MemoryStream(buffer, 0, n))
{
using (var zip = new GZipStream(ms, CompressionMode.Decompress))
{
int nRead = zip.Read(buffer, 0, buffer.Length);
fsout.Write(buffer, 0, nRead);
}
}
}
}
}
}
You're trying to decompress each "chunk" as if it's a separate gzip file. Don't do that - just read from the GZipStream in a loop:
using (var fsout = File.Create(fileOut))
{
using (var fsIn = File.OpenRead(fileIn))
{
using (var zip = new GZipStream(fsIn, CompressionMode.Decompress))
{
var buffer = new byte[1024 * 32];
int bytesRead;
while ((bytesRead = zip.Read(buffer, 0, buffer.Length)) > 0)
{
fsout.Write(buffer, 0, bytesRead);
}
}
}
}
Note that your compression code should look similar, reading in a loop rather than assuming a single call to Read will read all the data.
(Personally I'd skip fsIn and just use new GZipStream(File.OpenRead(fileIn), CompressionMode.Decompress), but that's just a personal preference.)
First, as @Jon Skeet mentioned, you are not using the Stream.Read method correctly. It doesn't matter whether your buffer is big enough or not: the stream is allowed to return fewer bytes than requested, with zero indicating that there is no more data, so reading from a stream should always be performed in a loop.
However, the main problem in your decompression code is the way you share the buffer. You read the input into a buffer, then wrap it in a MemoryStream (note that the constructor used does not make a copy of the passed array, but actually uses it as its internal buffer), and then you try to read from and write to that buffer at the same time. Taking into account that decompression writes data "faster" than it reads, it's surprising that your code works at all.
The correct implementation is quite simple:
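The buffer-sharing point can be checked in isolation. The following small sketch (the class and method names are made up for illustration) wraps an array in a MemoryStream using the same constructor, then modifies the array and reads from the stream. The stream sees the modification, because it reads directly from the caller's array.

```csharp
using System;
using System.IO;

public static class SharedBufferSketch
{
    // Wraps an array in a MemoryStream, overwrites the array afterwards,
    // and returns the first byte the stream reports.
    public static int ReadAfterOverwrite()
    {
        var data = new byte[] { 1, 2, 3 };
        using (var ms = new MemoryStream(data, 0, data.Length))
        {
            data[0] = 42;         // change the array after the stream was created
            return ms.ReadByte(); // the stream has no private copy, so it returns 42
        }
    }
}
```

In the question's DecompressFile, the same array is both the backing store of the MemoryStream and the destination of zip.Read, so decompressed output can overwrite compressed input that has not been consumed yet.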
static void CompressFile()
{
string fileIn = @"D:\sin2.txt";
string fileOut = @"D:\sin2.txt.pgz";
using (var input = File.OpenRead(fileIn))
using (var output = new GZipStream(File.Create(fileOut), CompressionMode.Compress))
Write(input, output);
}
static void DecompressFile()
{
string fileIn = @"D:\sin2.txt.pgz";
string fileOut = @"D:\sin2.1.txt";
using (var input = new GZipStream(File.OpenRead(fileIn), CompressionMode.Decompress))
using (var output = File.Create(fileOut))
Write(input, output);
}
static void Write(Stream input, Stream output, int bufferSize = 10 * 1024 * 1024)
{
var buffer = new byte[bufferSize];
for (int readCount; (readCount = input.Read(buffer, 0, buffer.Length)) > 0;)
output.Write(buffer, 0, readCount);
}
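If the file really does have to be stored as several separate gzip chunks, as the update to the question says, the reader needs some way to know where each chunk ends. The sketch below is one possible framing and is not taken from either answer: each chunk is written as a 4-byte length followed by the compressed bytes, and the reader uses that length to hand exactly one chunk to its own GZipStream. All names here are hypothetical.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ChunkedGzipSketch
{
    // Compresses the first count bytes of block and writes them as [length][gzip data].
    public static void WriteChunk(Stream output, byte[] block, int count)
    {
        byte[] compressed;
        using (var ms = new MemoryStream())
        {
            using (var gz = new GZipStream(ms, CompressionMode.Compress))
            {
                gz.Write(block, 0, count);
            }
            compressed = ms.ToArray();
        }
        output.Write(BitConverter.GetBytes(compressed.Length), 0, 4);
        output.Write(compressed, 0, compressed.Length);
    }

    // Reads and decompresses the next chunk; returns null when no complete length prefix is left.
    public static byte[] ReadChunk(Stream input)
    {
        var prefix = new byte[4];
        if (!Fill(input, prefix, 4))
        {
            return null;
        }
        var compressed = new byte[BitConverter.ToInt32(prefix, 0)];
        if (!Fill(input, compressed, compressed.Length))
        {
            throw new EndOfStreamException("Truncated chunk.");
        }
        using (var gz = new GZipStream(new MemoryStream(compressed), CompressionMode.Decompress))
        using (var result = new MemoryStream())
        {
            var buffer = new byte[32 * 1024]; // separate from the compressed array
            int n;
            while ((n = gz.Read(buffer, 0, buffer.Length)) > 0)
            {
                result.Write(buffer, 0, n);
            }
            return result.ToArray();
        }
    }

    // Reads exactly count bytes, looping because Read may return fewer than requested.
    private static bool Fill(Stream s, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = s.Read(buffer, total, count - total);
            if (n == 0)
            {
                return false;
            }
            total += n;
        }
        return true;
    }
}
```

A file written this way is not a plain .gz file, so tools such as WinRAR will not open it directly. That is the trade-off for being able to decompress chunks independently.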
I want to compress an Excel file into a file with a .zip or .cap extension. The code I use does compress the file, but the resulting zip file cannot be unzipped: when I try, I get an error saying the file is corrupted or cannot be unzipped.
The code I am using:
static public bool CompressFile(string file, string outputFile)
{
try
{
using (var inFile = File.OpenRead(file))
{
using (var outFile = File.Create(outputFile))
{
using (var compress = new GZipStream(outFile, CompressionMode.Compress, false))
{
byte[] buffer = new byte[inFile.Length];
int read = inFile.Read(buffer, 0, buffer.Length);
while (read > 0)
{
compress.Write(buffer, 0, read);
read = inFile.Read(buffer, 0, buffer.Length);
}
}
}
}
return true;
}
catch (IOException ex)
{
MessageBox.Show(string.Format("Error compressing file: {0}", ex.Message));
return false;
}
}
I have even followed some links looking for a proper solution, but nothing worked. I need a suggestion for how to do this properly. Any answer would be appreciated.
This code uses the SharpZipLib library and produces zip files that can be uncompressed without problems:
private void Zip()
{
string output = @"C:\TEMP\test.zip";
string input = @"C:\TEMP\test.xlsx";
using (var zipStream = new ZipOutputStream(System.IO.File.Create(output)))
{
zipStream.SetLevel(9);
var buffer = new byte[4096];
var entry = new ZipEntry(Path.GetFileName(input));
zipStream.PutNextEntry(entry);
using (FileStream fs = System.IO.File.OpenRead(input))
{
int sourceBytes;
do
{
sourceBytes = fs.Read(buffer, 0, buffer.Length);
zipStream.Write(buffer, 0, sourceBytes);
} while (sourceBytes > 0);
}
zipStream.Finish();
zipStream.Close();
}
}
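For context on why the question's code fails: GZipStream writes the gzip format, which is a single compressed stream and not a ZIP archive, so naming the output .zip does not make it readable by unzip tools. As an alternative to SharpZipLib, .NET 4.5 and later ship a ZipArchive class in System.IO.Compression. The following is a minimal sketch under that assumption. The class and method names are hypothetical, and on .NET Framework the project needs a reference to the System.IO.Compression assembly.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class ZipSketch
{
    // Creates a ZIP archive at zipPath that contains the single file at inputPath.
    public static void ZipSingleFile(string inputPath, string zipPath)
    {
        using (var zipFile = File.Create(zipPath))
        using (var archive = new ZipArchive(zipFile, ZipArchiveMode.Create))
        {
            // Store the entry under the file name only, without the directory part.
            ZipArchiveEntry entry = archive.CreateEntry(Path.GetFileName(inputPath));
            using (var entryStream = entry.Open())
            using (var source = File.OpenRead(inputPath))
            {
                source.CopyTo(entryStream);
            }
        }
    }
}
```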
I need to compress a file using GZip in batches of a specified size (not as a whole). I can successfully fill the byte[] buffer, but after copying it into the compression stream, the output stream is just left empty.
public void Compress(string source, string output)
{
FileInfo fi = new FileInfo(source);
byte[] buffer = new byte[BufferSize];
int total, current = 0;
using (FileStream inFile = fi.OpenRead())
{
using (FileStream outFile = File.Create(output + ".gz"))
{
while ((total = inFile.Read(buffer, 0, buffer.Length)) != 0)
{
using (MemoryStream compressedStream = new MemoryStream())
{
using (MemoryStream bufferStream = new MemoryStream())
{
CopyToStream(buffer, bufferStream);
using (GZipStream Compress = new GZipStream(compressedStream, CompressionMode.Compress, true))
{
bufferStream.Position = 0;
bufferStream.CopyTo(Compress);
current += total;
}
compressedStream.Position = 0;
compressedStream.CopyTo(outFile);
}
}
}
}
}
}
static void CopyToStream(byte[] buffer, Stream output)
{
output.Write(buffer, 0, buffer.Length);
}
You need to rewind compressedStream by setting Position=0 before compressedStream.CopyTo(outFile);.
You are overcomplicating things. You do not need the additional MemoryStreams or buffers.
Taken from MSDN: http://msdn.microsoft.com/en-us/library/system.io.compression.gzipstream.aspx
public static void Compress(FileInfo fi)
{
// Get the stream of the source file.
using (FileStream inFile = fi.OpenRead())
{
// Prevent compressing hidden and
// already compressed files.
if ((File.GetAttributes(fi.FullName)
& FileAttributes.Hidden)
!= FileAttributes.Hidden & fi.Extension != ".gz")
{
// Create the compressed file.
using (FileStream outFile =
File.Create(fi.FullName + ".gz"))
{
using (GZipStream Compress =
new GZipStream(outFile,
CompressionMode.Compress))
{
// Copy the source file into
// the compression stream.
inFile.CopyTo(Compress);
Console.WriteLine("Compressed {0} from {1} to {2} bytes.",
fi.Name, fi.Length.ToString(), outFile.Length.ToString());
}
}
}
}
}
I am attempting to create a new FileStream object from a byte array. I'm sure that made no sense at all so I will try to explain in further detail below.
Tasks I am completing:
1) Reading the source file which was previously compressed
2) Decompressing the data using GZipStream
3) Copying the decompressed data into a byte array.
What I would like to change:
1) I would like to be able to use File.ReadAllBytes to read the decompressed data.
2) I would then like to create a new FileStream object using this byte array.
In short, I want to do this entire operation using byte arrays. One of the parameters for GZipStream is a stream of some sort, so I figured I was stuck using a FileStream. But if some method exists where I can create a new instance of a FileStream from a byte array, then I should be fine.
Here is what I have so far:
FolderBrowserDialog fbd = new FolderBrowserDialog(); // Shows a browser dialog
fbd.ShowDialog();
// Path to directory of files to compress and decompress.
string dirpath = fbd.SelectedPath;
DirectoryInfo di = new DirectoryInfo(dirpath);
foreach (FileInfo fi in di.GetFiles())
{
zip.Program.Decompress(fi);
}
// Get the stream of the source file.
using (FileStream inFile = fi.OpenRead())
{
//Create the decompressed file.
string outfile = @"C:\Decompressed.exe";
{
using (GZipStream Decompress = new GZipStream(inFile,
CompressionMode.Decompress))
{
byte[] b = new byte[blen.Length];
Decompress.Read(b,0,b.Length);
File.WriteAllBytes(outfile, b);
}
}
}
Thanks for any help!
Regards,
Evan
It sounds like you need to use a MemoryStream.
Since you don't know how many bytes you'll be reading from the GZipStream, you can't really allocate an array for it. You need to read it all into a byte array and then use a MemoryStream to decompress.
const int BufferSize = 65536;
byte[] compressedBytes = File.ReadAllBytes("compressedFilename");
// create memory stream
using (var mstrm = new MemoryStream(compressedBytes))
{
using (var inStream = new GZipStream(mstrm, CompressionMode.Decompress))
{
using (var outStream = File.Create("outputfilename"))
{
var buffer = new byte[BufferSize];
int bytesRead;
while ((bytesRead = inStream.Read(buffer, 0, BufferSize)) != 0)
{
outStream.Write(buffer, 0, bytesRead);
}
}
}
}
Here is what I ended up doing. I realize that I did not give sufficient information in my question - and I apologize for that - but I do know the size of the file I need to decompress as I am using it earlier in my program. This buffer is referred to as "blen".
string fi = @"C:\Path To Compressed File";
// Get the stream of the source file.
// using (FileStream inFile = fi.OpenRead())
using (MemoryStream infile1 = new MemoryStream(File.ReadAllBytes(fi)))
{
//Create the decompressed file.
string outfile = @"C:\Decompressed.exe";
{
using (GZipStream Decompress = new GZipStream(infile1,
CompressionMode.Decompress))
{
byte[] b = new byte[blen.Length];
Decompress.Read(b,0,b.Length);
File.WriteAllBytes(outfile, b);
}
}
}
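One caveat about the code above: a single call to Decompress.Read is allowed to return fewer bytes than requested, even when the expected length is known, so the tail of b may be left as zeros. The sketch below shows a byte-array-to-byte-array variant that reads in a loop and does not need the decompressed size up front. The class and method names are hypothetical, and the Compress helper is included only so that the example can be exercised on its own.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class GzipBytesSketch
{
    // Decompresses a gzip byte array into a new byte array of whatever size results.
    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new GZipStream(new MemoryStream(compressed), CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            var buffer = new byte[64 * 1024];
            int n;
            // Keep reading until Read returns 0, which signals the end of the data.
            while ((n = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, n);
            }
            return output.ToArray();
        }
    }

    // Compresses a byte array into gzip format (helper for the round trip).
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            using (var gz = new GZipStream(output, CompressionMode.Compress))
            {
                gz.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }
}
```

With this helper, the body of the using block could become File.WriteAllBytes(outfile, GzipBytesSketch.Decompress(File.ReadAllBytes(fi))), which keeps the whole operation in byte arrays.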