I am writing an archiver that reads a file block by block, compresses each block, and writes the compressed block to a FileStream.
The block size is 5 MB. The problem: if I compress an 8 MB picture and then extract it from the resulting archive, its checksum does not match the original and the picture only opens halfway, even though the size is the same. I don't know what else to try and would appreciate help.
The method that reads a chunk:
private byte[] ReadChunk(int chunkId)
{
using (var inFile = new FileStream(sourceFile, FileMode.Open, FileAccess.Read, FileShare.Read))
{
long filePosition = chunkId * chunkDataSize;
int bytesRead;
if (inFile.Length - filePosition <= chunkDataSize)
{
bytesRead = (int)(inFile.Length - filePosition);
}
else
{
bytesRead = chunkDataSize;
}
var lastBuffer = new byte[bytesRead];
inFile.Read(lastBuffer, 0, bytesRead);
return lastBuffer;
}
}
The method that compresses and writes a block:
private void CompressBlock(byte[] bytesTo)
{
using (MemoryStream ms = new MemoryStream())
{
using (GZipStream gs = new GZipStream(ms, CompressionMode.Compress))
{
gs.Write(bytesTo, 0, bytesTo.Length);
}
byte[] compressedData = ms.ToArray();
using (var outFile = new FileStream(resultFile, FileMode.Append))
{
BitConverter.GetBytes(compressedData.Length).CopyTo(compressedData, 4);
outFile.Write(compressedData, 0, compressedData.Length);
}
}
}
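Judging only from the code shown, one likely cause is that ReadChunk never moves the stream to filePosition, so every call reads from the start of the file. For an 8 MB picture the second chunk would then repeat the first 3 MB, which would match a file of the right size that only opens halfway. Separately, chunkId * chunkDataSize is multiplied as int before being widened to long, and a single Read call is allowed to return fewer bytes than requested. Below is a minimal sketch of a reader that addresses all three points. It assumes chunkDataSize is an int; the class name ChunkReader and the path parameter are illustrative and not from the original code.

```csharp
using System;
using System.IO;

public static class ChunkReader
{
    // Reads chunk number chunkId (0-based) of the file at path.
    // Returns an empty array when the chunk starts at or past the end of the file.
    public static byte[] ReadChunk(string path, int chunkId, int chunkDataSize)
    {
        using (var inFile = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        {
            // Widen before multiplying so large files do not overflow int.
            long filePosition = (long)chunkId * chunkDataSize;
            long remaining = inFile.Length - filePosition;
            if (remaining <= 0)
                return Array.Empty<byte>();

            int toRead = (int)Math.Min(remaining, chunkDataSize);
            var buffer = new byte[toRead];

            // Move to the start of this chunk; without this every call reads from offset 0.
            inFile.Seek(filePosition, SeekOrigin.Begin);

            // Read may return fewer bytes than requested, so loop until the buffer is full.
            int total = 0;
            while (total < toRead)
            {
                int n = inFile.Read(buffer, total, toRead - total);
                if (n == 0)
                    break; // unexpected end of file
                total += n;
            }
            return buffer;
        }
    }
}
```

With a 10-byte file and a chunk size of 4, chunk 1 should contain bytes 4 to 7 and chunk 2 the last two bytes.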
Related
Does anyone know why I'm getting the "Unexpected end of data" error message when un-gzipping the gzip file?
To verify that the byte data is not corrupted, I write it to FooTest4.csv and can open that file successfully.
Both FooTest3.csv.gz and FooTest2.csv.gz fail with "Unexpected end of data" when un-gzipping.
public static List<byte> CompressFile(List<byte> parmRawBytes)
{
//Initialize variables...
List<byte> returnModifiedBytes = null;
File.WriteAllBytes(@"X:\FooTest4.csv", parmRawBytes.ToArray());
using (var memoryStream = new MemoryStream())
{
using (var gzipStream = new GZipStream(memoryStream, CompressionMode.Compress, false))
{
gzipStream.Write(parmRawBytes.ToArray(), 0, parmRawBytes.ToArray().Length);
gzipStream.Flush();
File.WriteAllBytes(@"X:\FooTest3.csv.gz", memoryStream.ToArray());
returnModifiedBytes = memoryStream.ToArray().ToList();
}
}
File.WriteAllBytes(@"X:\FooTest2.csv.gz", returnModifiedBytes.ToArray());
return returnModifiedBytes;
}
GZipStream needs to be closed so it can write some terminating data to the end of the buffer to complete the gzip encoding.
byte[] inputBytes = ...;
using (var compressedStream = new MemoryStream())
{
using (var compressor = new GZipStream(compressedStream, CompressionMode.Compress))
{
compressor.Write(inputBytes, 0, inputBytes.Length);
}
// get bytes after the gzip stream is closed
File.WriteAllBytes(pathToFile, compressedStream.ToArray());
}
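To make the point about closing concrete, here is a small sketch that measures the compressed length once after Flush and once after the GZipStream has been disposed. The class and method names are made up for illustration. The exact numbers depend on the runtime and the input, so none are claimed here; the only expectation is that the second length is larger, because the final block and trailer are written on dispose.

```csharp
using System.IO;
using System.IO.Compression;

public static class GzipCloseDemo
{
    // Returns the compressed length seen before and after the GZipStream is disposed.
    public static (int BeforeClose, int AfterClose) MeasureLengths(byte[] input)
    {
        using (var compressed = new MemoryStream())
        {
            int before;
            // leaveOpen: true keeps the MemoryStream usable after the GZipStream is disposed.
            using (var gzip = new GZipStream(compressed, CompressionMode.Compress, leaveOpen: true))
            {
                gzip.Write(input, 0, input.Length);
                gzip.Flush();
                before = (int)compressed.Length; // gzip trailer has not been written yet
            }
            int after = (int)compressed.Length;  // now includes the terminating data
            return (before, after);
        }
    }
}
```

This is why taking memoryStream.ToArray() inside the inner using block, as in the question, yields a file that ends too early.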
Instead of loading the bytes, compressing them and then saving them, you could compress and write in one step. Also, I don't know why you're using List<byte> instead of byte[]; that might be part of the problem.
void CompressFile(string inputPath, string outputPath)
{
    // The using blocks dispose in reverse order, so the GZipStream is closed
    // (and writes its trailer) before the output file is closed.
    using (Stream readStream = new FileStream(inputPath, FileMode.Open, FileAccess.Read))
    using (Stream writeStream = new FileStream(outputPath, FileMode.Create))
    using (Stream compressionStream = new GZipStream(writeStream, CompressionMode.Compress))
    {
        readStream.CopyTo(compressionStream);
    }
}
byte[] CompressFile(string inputPath)
{
    byte[] data = File.ReadAllBytes(inputPath);
    // The output stream must start empty; it receives the compressed bytes.
    using (var memStream = new MemoryStream())
    {
        using (var gzipStream = new GZipStream(memStream, CompressionMode.Compress))
        {
            gzipStream.Write(data, 0, data.Length);
        }
        // Read the result only after the GZipStream has been closed.
        return memStream.ToArray();
    }
}
PS: I wrote the code in a text editor, so there might be errors. Also, you say the error occurs when unzipping, so why not show us the unzip code?
I am trying not to exceed a maximum memory size, so after adding each file I check whether the MemoryStream has grown beyond the limit and, if so, flush it to the zip file stream. The problem is that the flushed memory stream replaces what is already in the file stream instead of being added to it. Is there another way to achieve the same result, without using any third-party DLL?
MemoryStream memoryStream = new MemoryStream();
FileStream fileStream = new FileStream(sbZipFolderName.ToString(),FileMode.Create);
foreach (FileInfo flInfo in ListfileFolderPaths)
{
using (var archive = new ZipArchive(memoryStream, ZipArchiveMode.Update, true))
archive.CreateEntryFromFile(flInfo.FullName, slastFolderName + "/" + flInfo.DirectoryName.Replace(new DirectoryInfo(sFolderPath.ToString()).FullName, "") + "/" + flInfo.Name);
if (memoryStream.Length > MaxSize)
{
using (fileStream = new FileStream(sFolderPath + "/" + slastFolderName + ".zip", FileMode.Create))
{
memoryStream.Seek(0, SeekOrigin.Begin);
memoryStream.CopyTo(fileStream);
memoryStream = new MemoryStream();
}
}
}
if ((memoryStream != null) && (memoryStream.Length > 0))
memoryStream.CopyTo(fileStream);
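One possible alternative, offered as a sketch rather than a drop-in fix: open the ZipArchive directly on the FileStream in ZipArchiveMode.Create. In that mode entries are written to the underlying stream as they are added, so the intermediate MemoryStream and the size check are not needed. The class name ZipToDisk, the method signature and the way the entry name is derived from entryRoot are assumptions made for this example; the original code builds entry names from its own variables.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

public static class ZipToDisk
{
    // Writes every file in files into a new zip at zipPath.
    // Assumes each file lives under entryRoot; that prefix is stripped to form the entry name.
    public static void CreateZip(string zipPath, string entryRoot, IEnumerable<FileInfo> files)
    {
        using (var fileStream = new FileStream(zipPath, FileMode.Create))
        using (var archive = new ZipArchive(fileStream, ZipArchiveMode.Create))
        {
            foreach (FileInfo file in files)
            {
                // Zip entry names use forward slashes and no leading separator.
                string relative = file.FullName
                    .Substring(entryRoot.Length)
                    .TrimStart('\\', '/')
                    .Replace('\\', '/');

                // Extension method from System.IO.Compression (ZipFileExtensions).
                archive.CreateEntryFromFile(file.FullName, relative);
            }
        }
    }
}
```

Only System.IO.Compression types from the framework are used, so no third-party library is involved.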
You can use a Gzip archive to compress a file.
This is the compression:
public static byte[] Compress(byte[] raw)
{
using (MemoryStream memory = new MemoryStream())
{
using (GZipStream gzip = new GZipStream(memory,
CompressionMode.Compress, true))
{
gzip.Write(raw, 0, raw.Length);
}
return memory.ToArray();
}
}
And this is the decompression:
static byte[] Decompress(byte[] gzip)
{
// Create a GZIP stream with decompression mode.
// ... Then create a buffer and write into while reading from the GZIP stream.
using (GZipStream stream = new GZipStream(new MemoryStream(gzip), CompressionMode.Decompress))
{
const int size = 4096;
byte[] buffer = new byte[size];
using (MemoryStream memory = new MemoryStream())
{
int count = 0;
do
{
count = stream.Read(buffer, 0, size);
if (count > 0)
{
memory.Write(buffer, 0, count);
}
}
while (count > 0);
return memory.ToArray();
}
}
}
Tell me if it worked.
Good luck.
Why does GZipStream, reading from a MemoryStream, read only 24576 bytes and then stop? I compress my .gz file in parts using multiple threads; WinRAR can decompress it, but GZipStream cannot.
static private bool ZipperWork2(string InputPath, string OutputPath, CompressionMode CompMode)
{
Mode = CompMode;
RawData = new byte[ThreadCount][];
CompressedData = new byte[ThreadCount][];
Thread[] CompressorThreads = new Thread[ThreadCount];
byte[] buffer = new byte[MAX_BLOCK_SIZE];
using (FileStream fs = new FileStream(InputPath, FileMode.Open))
{
using (FileStream fw = new FileStream(OutputPath, FileMode.Append))
{
while (fs.Position < fs.Length)
{
for (int i = 0; i < ThreadCount; i++)
{
int count = fs.Read(buffer, 0, MAX_BLOCK_SIZE);
RawData[i] = new byte[count];
Array.Copy(buffer, RawData[i], count);
CompressorThreads[i] = new Thread(WorkWithBlock2);
CompressorThreads[i].Start(i);
}
for (int i = 0; i < ThreadCount; i++)
{
CompressorThreads[i].Join();
fw.Write(CompressedData[i], 0, CompressedData[i].Length);
}
}
}
}
return true;
}
private static void WorkWithBlock2(object index)
{
int DataIndex = (int)index;
//int count=0,totalcount=0;
using (MemoryStream ms = new MemoryStream(RawData[DataIndex]))
{
using (MemoryStream output = new MemoryStream())
{
using (GZipStream gz = new GZipStream(ms, Mode))
{
gz.CopyTo(output);
}
CompressedData[DataIndex] = output.ToArray();
}
}
}
I have tried several different approaches, but the result is always the same: only 4 KB is decompressed (and this is unrelated to my MAX_BLOCK_SIZE variable).
Your gzip stream may actually consist of multiple concatenated gzip streams. This is permitted by the gzip format. GZipStream is likely just reading the first one. All you'd need to do is repeat the operation on the subsequent input bytes until all of the input bytes have been consumed.
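Repeating the operation on the subsequent bytes requires knowing where each compressed block ends, and a GZipStream may read ahead in the underlying stream, so its position is not a reliable boundary. One way around this, when you control both the writer and the reader, is to store each block's compressed length in front of it. The sketch below does that with a 4-byte little-endian prefix. The prefix is a private framing chosen for this example, not part of the gzip format, so the output is no longer a plain .gz file that other tools can open. The class name BlockGzip is hypothetical.

```csharp
using System.IO;
using System.IO.Compression;

public static class BlockGzip
{
    // Splits input into blocks of up to blockSize bytes and writes each one as
    // [4-byte little-endian length][independent gzip member].
    public static void Compress(Stream input, Stream output, int blockSize)
    {
        var buffer = new byte[blockSize];
        var writer = new BinaryWriter(output); // not disposed, so output stays open
        int read;
        while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
        {
            byte[] member;
            using (var ms = new MemoryStream())
            {
                using (var gz = new GZipStream(ms, CompressionMode.Compress))
                {
                    gz.Write(buffer, 0, read);
                }
                member = ms.ToArray(); // taken after the GZipStream is closed
            }
            writer.Write(member.Length); // BinaryWriter writes int as little-endian
            writer.Write(member);
        }
        writer.Flush();
    }

    // Reads length-prefixed gzip members until the input ends and
    // appends each decompressed block to output.
    public static void Decompress(Stream input, Stream output)
    {
        var reader = new BinaryReader(input);
        while (true)
        {
            byte[] prefix = reader.ReadBytes(4);
            if (prefix.Length == 0)
                break; // clean end of input
            if (prefix.Length < 4)
                throw new InvalidDataException("Truncated block length.");

            int length = prefix[0] | (prefix[1] << 8) | (prefix[2] << 16) | (prefix[3] << 24);
            if (length < 0)
                throw new InvalidDataException("Invalid block length.");

            byte[] member = reader.ReadBytes(length);
            if (member.Length < length)
                throw new InvalidDataException("Truncated block.");

            using (var gz = new GZipStream(new MemoryStream(member), CompressionMode.Decompress))
            {
                gz.CopyTo(output);
            }
        }
    }
}
```

Because every block is a self-contained member with a known length, the blocks can also be compressed on separate threads and written in order afterwards.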
When I use the same GZipStream to compress the file blocks in a loop, the resulting file is compressed correctly:
public static void Compress1(string fi)
{
using (FileStream inFile = File.Open(fi,FileMode.Open,FileAccess.Read,FileShare.Read))
{
using (FileStream outFile = File.Create(fi + ".gz"))
{
using (GZipStream Compress = new GZipStream(outFile,
CompressionMode.Compress))
{
byte[] buffer = new byte[6315120];
int numRead;
while ((numRead = inFile.Read(buffer, 0, buffer.Length)) != 0)
{
Compress.Write(buffer, 0, numRead);
}
}
}
}
}
But when I compress the file blocks separately, each in its own stream, the resulting file is corrupt:
public static void Compress2(string fi, int offset)
{
using (FileStream inFile = File.Open(fi,FileMode.Open))
{
using (FileStream outFile = File.OpenOrCreate(fi + ".gz"))
{
using (GZipStream Compress = new GZipStream(outFile,
CompressionMode.Compress))
{
// Copy the source file into the compression stream.
byte[] buffer = new byte[6315120];
int numRead=-1;
inFile.Seek(offset,SeekOrigin.Begin);
numRead = inFile.Read(buffer, 0, buffer.Length);
Compress.Write(buffer, 0, numRead);
}
}
}
}
In these examples the file size is 12630240 bytes, and I divide it into 2 blocks of 6315120 bytes each (the buffer size). The first block is compressed correctly by both methods, but in the second method the second block is compressed differently than in the first method. What am I missing?
What is happening is that you are creating two separate gzip outputs, because each GZipStream writes its own header and trailer.
By splitting the work this way you produce two independent gzip members, and if you write both into the same file, a reader that expects a single stream will not decode it as the original file.
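Independently of the header question, the way the output file is opened in Compress2 also matters. System.IO.File has no OpenOrCreate method, so the sketch below assumes the intent was FileMode.OpenOrCreate. That mode opens an existing file positioned at its start, so a second call would overwrite the beginning of the first block rather than follow it, whereas FileMode.Append starts at the end. The helper name WriteTwice is made up for this illustration.

```csharp
using System.IO;

public static class FileModeDemo
{
    // Writes first to path, then reopens the file with secondMode and writes second.
    // Returns the final file content so the two modes can be compared.
    public static byte[] WriteTwice(string path, FileMode secondMode, byte[] first, byte[] second)
    {
        File.WriteAllBytes(path, first);
        using (var fs = new FileStream(path, secondMode, FileAccess.Write))
        {
            fs.Write(second, 0, second.Length);
        }
        return File.ReadAllBytes(path);
    }
}
```

With first = { 1, 2, 3, 4 } and second = { 9, 9 }, FileMode.OpenOrCreate leaves { 9, 9, 3, 4 } in the file, while FileMode.Append leaves { 1, 2, 3, 4, 9, 9 }.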
I have code that should do the compression:
FileStream fs = new FileStream("g:\\gj.txt", FileMode.Open);
FileStream fd = new FileStream("g:\\gj.zip", FileMode.Create);
GZipStream csStream = new GZipStream(fd, CompressionMode.Compress);
byte[] compressedBuffer = new byte[500];
int offset = 0;
int nRead;
nRead = fs.Read(compressedBuffer, offset, compressedBuffer.Length);
while (nRead > 0)
{
csStream.Write(compressedBuffer, offset, nRead);
offset = offset + nRead;
nRead = fs.Read(compressedBuffer, offset, compressedBuffer.Length);
}
fd.Close();
fs.Close();
and I think it does. Now I want to decompress what was compressed above. I do something like this:
FileStream fd = new FileStream("g:\\gj.new", FileMode.Create);
FileStream fs = new FileStream("g:\\gj.zip", FileMode.Open);
GZipStream csStream = new GZipStream(fs, CompressionMode.Decompress);
byte[] decompressedBuffer = new byte[500];
int offset = 0;
int nRead;
nRead=csStream.Read(decompressedBuffer, offset, decompressedBuffer.Length);
while (nRead > 0)
{
fd.Write(decompressedBuffer, offset, nRead);
offset = offset + nRead;
nRead = csStream.Read(decompressedBuffer, offset, decompressedBuffer.Length);
}
fd.Close();
fs.Close();
and here it doesn't work: nRead is already 0 before entering the loop. What am I doing wrong?
The test file is a very simple text file (104 bytes).
My first thought is that you haven't closed csStream. If you wrap it in a using block, this happens automatically. Since gzip buffers data, you could be missing some.
Secondly, don't increment offset: it is the offset into the buffer, not into the stream. Leave it at 0:
using (Stream fs = File.OpenRead("gj.txt"))
using (Stream fd = File.Create("gj.zip"))
using (Stream csStream = new GZipStream(fd, CompressionMode.Compress))
{
byte[] buffer = new byte[1024];
int nRead;
while ((nRead = fs.Read(buffer, 0, buffer.Length))> 0)
{
csStream.Write(buffer, 0, nRead);
}
}
using (Stream fd = File.Create("gj.new.txt"))
using (Stream fs = File.OpenRead("gj.zip"))
using (Stream csStream = new GZipStream(fs, CompressionMode.Decompress))
{
byte[] buffer = new byte[1024];
int nRead;
while ((nRead = csStream.Read(buffer, 0, buffer.Length)) > 0)
{
fd.Write(buffer, 0, nRead);
}
}
The two methods I have are like James Roland mentioned.
private static byte[] Compress(HttpPostedFileBase file)
{
using var to = new MemoryStream();
using var gZipStream = new GZipStream(to, CompressionMode.Compress);
file.InputStream.CopyTo(gZipStream);
gZipStream.Flush();
return to.ToArray();
}
private static byte[] Decompress(byte[] compressed)
{
using var from = new MemoryStream(compressed);
using var to = new MemoryStream();
using var gZipStream = new GZipStream(from, CompressionMode.Decompress);
gZipStream.CopyTo(to);
return to.ToArray();
}
However, I'm using an upload with
Request.Files[0]
then compress and save in the db. Then I pull the img out, decompress and set a src with
$"data:image/gif;base64,{ToBase64String(Decompress(img))}";