I have some code that downloads gzipped files, and decompresses them. The problem is, I can't get it to decompress the whole file, it only reads the first 4096 bytes and then about 500 more.
Byte[] buffer = new Byte[4096];
int count = 0;
FileStream fileInput = new FileStream("input.gzip", FileMode.Open, FileAccess.Read, FileShare.Read);
FileStream fileOutput = new FileStream("output.dat", FileMode.Create, FileAccess.Write, FileShare.None);
GZipStream gzipStream = new GZipStream(fileInput, CompressionMode.Decompress, true);
// Read from gzip steam
while ((count = gzipStream.Read(buffer, 0, buffer.Length)) > 0)
{
// Write to output file
fileOutput.Write(buffer, 0, count);
}
// Close the streams
...
I've checked the downloaded file; it's 13MB when compressed, and contains one XML file. I've manually decompressed the XML file, and the content is all there. But when I do it with this code, it only outputs the very beginning of the XML file.
Anyone have any ideas why this might be happening?
EDIT
Try not leaving the GZipStream open:
GZipStream gzipStream = new GZipStream(fileInput, CompressionMode.Decompress,
false);
or
GZipStream gzipStream = new GZipStream(fileInput, CompressionMode.Decompress);
I ended up using a gzip executable to do the decompression instead of a GZipStream. It can't handle the file for some reason, but the executable can.
Same thing happened to me. In my case only reads up to 6 lines and then reached end of file. So I realized that although the extension is gz, it was compressed by another algorithm not supported by GZipStream. So I used SevenZipSharp library and it worked. This is my code
You can use SevenZipSharp library
using (var input = File.OpenRead(lstFiles[0]))
{
using (var ds = new SevenZipExtractor(input))
{
//ds.ExtractionFinished += DsOnExtractionFinished;
var mem = new MemoryStream();
ds.ExtractFile(0, mem);
using (var sr = new StreamReader(mem))
{
var iCount = 0;
String line;
mem.Position = 0;
while ((line = sr.ReadLine()) != null && iCount < 100)
{
iCount++;
LstOutput.Items.Add(line);
}
}
}
}
Are you calling Close or Flush on fileOutput? (Or just wrap it in a using, which is recommended practice.) If you don't the file might not be flushed to disk when your program ends.
Related
What is the equivalent of the Python function zlib.decompress() in C#? I need to decompress some zlib files using C# and I don't know how to do it.
Python example:
import zlib
file = open("myfile", mode = "rb")
data = zlib.decompress(file.read())
uncompressed_output = open("output_file", mode = "wb")
uncompressed_output.write(data)
I tried using the System.IO.Compression.DeflateStream class, but for every file I try it gives me an exception that the file contains invalid data while decoding.
byte[] binary = new byte[1000000];
using (DeflateStream compressed_file = new DeflateStream(new FileStream(#"myfile", FileMode.Open, FileAccess.Read), CompressionMode.Decompress))
compressed_file.Read(binary, 0, 1000000); //exception here
using (BinaryWriter outputFile = new BinaryWriter(new FileStream(#"output_file", FileMode.Create, FileAccess.Write)))
outputFile.Write(binary);
//Reading the file like normal with a BinaryReader and then turning it into a MemoryStream also didn't work
I should probably mention that the files are ZLIB compressed files. They start with the 78 9C header.
So, luckily, I found this post: https://stackoverflow.com/a/33855097/10505778
Basically the file must be stripped of its 2 header bytes (78 9C). While the 9C is important in decompression (it specifies whether a preset dictionary has been used or not), I don't need it, but I am pretty sure it is not that difficult to modify this to accomodate it:
byte[] binary, decompressed;
using (BinaryReader file = new BinaryReader(new FileStream(#"myfile", FileMode.Open, FileAccess.Read, FileShare.Read))
binary = file.ReadBytes(int.MaxValue); //read the entire file
output = new byte[int.MaxValue];
int outputSize;
using (MemoryStream memory_stream = new MemoryStream(binary, false))
{
memory_stream.Read(decompressed, 0, 2); //discard 2 bytes
using (DeflateStream compressed_file = new DeflateStream(memory_stream, CompressionMode.Decompress)
outputSize = compressed_file.Read(decompressed, 0, int.MaxValue);
}
binary = new byte[outputSize];
Array.Copy(decompressed, 0, binary, 0, outputSize);
using (BinaryWriter outputFile = new BinaryWriter(new FileStream(#"output_file", FileMode.Create, FileAccess.Write)))
outputFile.Write(binary);
I am using this method to compress files and it works great until I get to a file that is 2.4 GB then it gives me an overflow error:
void CompressThis (string inFile, string compressedFileName)
{
FileStream sourceFile = File.OpenRead(inFile);
FileStream destinationFile = File.Create(compressedFileName);
byte[] buffer = new byte[sourceFile.Length];
sourceFile.Read(buffer, 0, buffer.Length);
using (GZipStream output = new GZipStream(destinationFile,
CompressionMode.Compress))
{
output.Write(buffer, 0, buffer.Length);
}
// Close the files.
sourceFile.Close();
destinationFile.Close();
}
What can I do to compress huge files?
You should not to write the whole file to into the memory. Use Stream.CopyTo instead. This method reads the bytes from the current stream and writes them to another stream using a specified buffer size (81920 bytes by default).
Also you don't need to close Stream objects if use using keyword.
void CompressThis (string inFile, string compressedFileName)
{
using (FileStream sourceFile = File.OpenRead(inFile))
using (FileStream destinationFile = File.Create(compressedFileName))
using (GZipStream output = new GZipStream(destinationFile, CompressionMode.Compress))
{
sourceFile.CopyTo(output);
}
}
You can find a more complete example on Microsoft Docs (formerly MSDN).
You're trying to allocate all of this into memory. That just isn't necessary, you can feed the input stream directly into the output stream.
Alternative solution for zip format without allocating memory -
using (var sourceFileStream = new FileStream(this.GetFilePath(sourceFileName), FileMode.Open))
{
using (var destinationStream =
new FileStream(this.GetFilePath(zipFileName), FileMode.Create, FileAccess.ReadWrite))
{
using (var archive = new ZipArchive(destinationStream, ZipArchiveMode.Create, true))
{
var file = archive.CreateEntry(sourceFileName, CompressionLevel.Optimal);
using (var entryStream = file.Open())
{
var fileStream = sourceFileStream;
await fileStream.CopyTo(entryStream);
}
}
}
}
The solution will write directly from input stream to output stream
I am testing some code. I am stuck with the following. What ever I write as text, the length of the zipped stream is always 10? What am I doing wrong?
var inStream = new MemoryStream();
var inWriter = new StreamWriter(inStream);
str text = "HelloWorldsasdfghj123455667880fgsjfhdfasdferrbvbyjun hbwecwcxqsz edcrgvebrjnuj5juerqwetsrgfggshurhtnbvzkfjhguhgrgal;kjhao;rhl;zkfhg;aorihghg;oahrgarhguhh';aaeaeiaijeihjrhfidfhfidfidhh953453453";
inWriter.WriteLine(text);
inWriter.Flush();
inStream.Position = 0;
var outStream = new MemoryStream();
var compressStream = new GZipStream(outStream, CompressionMode.Compress);
inStream.CopyTo(compressStream);
compressStream.Flush();
outStream.Flush();
compressStream.Flush();
outStream.Position = 0;
Console.WriteLine(outStream.Position);
Console.WriteLine(outStream.Length);
Until you Close it the compression stream doesn't know you've finished writing to it - so cannot complete its compression algorithm. Flushing flushes those parts it can flush, but until its been told you have completed adding new bytes it cannot flush its last package of compressed data.
My company run an application that have to archive many kinds of files into some distants servers. The application works well but can't handle files larger than 1GB.
Here is the current function use to load the files to be uploaded to the distant server :
FileStream fs = File.OpenRead(fileToUploadPath);
byte[] fileArray = new byte[fs.Length];
fs.Read(fileArray, 0, fs.Length);
The byte array (when loaded successfully) was then splited into 100Mb bytes arrays and sent to the local server (using some WSDL web services) with the following function :
localServerWebService.SendData(subFileArray, filename);
I changed the function responsible for the file reading to use BufferendStream and I also wanted to improve the Webservice part so that it doesn't have to create a new stream at each call. I thought of somethings like this :
FileInfo source = new FileInfo(fileName);
using (FileStream reader = File.OpenRead(fileName))
{
using (FileStream distantWriter = localServerWebService.CreateWriteFileStream(fileName))
{
using (BufferedStream buffReader = new BufferedStream(reader))
{
using (BufferedStream buffWriter = new BufferedStream(distantWriter))
{
byte[] buffer = new byte[BUFFER_SIZE];
int bytesRead = 0;
long bytesToRead = source.Length;
while (bytesToRead > 0)
{
int nbBytesRead = buffReader.Read(buffer, 0, buffer.Length);
buffWriter.Write(buffer, 0, nbBytesRead);
bytesRead += nbBytesRead;
bytesToRead -= nbBytesRead;
}
}
}
}
}
But this code can't compile and always give me the error Cannot convert MyNameSpace.FileStream into System.IO.FileStream at line using (FileStream distantWriter = localServerWebService.CreateWriteFileStream(fileName)). I can't cast MyNameSpace.FileStream into System.IO.FileStream either.
The web service method :
[WebMethod]
public FileStream CreateWriteFileStream(String fileName)
{
String RepVaultUP =
System.Configuration.ConfigurationSettings.AppSettings.Get("SAS_Upload");
String desFile = Path.Combine(RepVaultUP, fileName);
return File.Open(desFile, FileMode.Create, FileAccess.Write);
}
So can you guys please explain to me why is this not working?
P.S.: English is not my mothertong so I hope what i wrote is clearly undestandable.
I am attempting to create a new FileStream object from a byte array. I'm sure that made no sense at all so I will try to explain in further detail below.
Tasks I am completing:
1) Reading the source file which was previously compressed
2) Decompressing the data using GZipStream
3) copying the decompressed data into a byte array.
What I would like to change:
1) I would like to be able to use File.ReadAllBytes to read the decompressed data.
2) I would then like to create a new filestream object usingg this byte array.
In short, I want to do this entire operating using byte arrays. One of the parameters for GZipStream is a stream of some sort, so I figured I was stuck using a filestream. But, if some method exists where I can create a new instance of a FileStream from a byte array - then I should be fine.
Here is what I have so far:
FolderBrowserDialog fbd = new FolderBrowserDialog(); // Shows a browser dialog
fbd.ShowDialog();
// Path to directory of files to compress and decompress.
string dirpath = fbd.SelectedPath;
DirectoryInfo di = new DirectoryInfo(dirpath);
foreach (FileInfo fi in di.GetFiles())
{
zip.Program.Decompress(fi);
}
// Get the stream of the source file.
using (FileStream inFile = fi.OpenRead())
{
//Create the decompressed file.
string outfile = #"C:\Decompressed.exe";
{
using (GZipStream Decompress = new GZipStream(inFile,
CompressionMode.Decompress))
{
byte[] b = new byte[blen.Length];
Decompress.Read(b,0,b.Length);
File.WriteAllBytes(outfile, b);
}
}
}
Thanks for any help!
Regards,
Evan
It sounds like you need to use a MemoryStream.
Since you don't know how many bytes you'll be reading from the GZipStream, you can't really allocate an array for it. You need to read it all into a byte array and then use a MemoryStream to decompress.
const int BufferSize = 65536;
byte[] compressedBytes = File.ReadAllBytes("compressedFilename");
// create memory stream
using (var mstrm = new MemoryStream(compressedBytes))
{
using(var inStream = new GzipStream(mstrm, CompressionMode.Decompress))
{
using (var outStream = File.Create("outputfilename"))
{
var buffer = new byte[BufferSize];
int bytesRead;
while ((bytesRead = inStream.Read(buffer, 0, BufferSize)) != 0)
{
outStream.Write(buffer, 0, bytesRead);
}
}
}
}
Here is what I ended up doing. I realize that I did not give sufficient information in my question - and I apologize for that - but I do know the size of the file I need to decompress as I am using it earlier in my program. This buffer is referred to as "blen".
string fi = #"C:\Path To Compressed File";
// Get the stream of the source file.
// using (FileStream inFile = fi.OpenRead())
using (MemoryStream infile1 = new MemoryStream(File.ReadAllBytes(fi)))
{
//Create the decompressed file.
string outfile = #"C:\Decompressed.exe";
{
using (GZipStream Decompress = new GZipStream(infile1,
CompressionMode.Decompress))
{
byte[] b = new byte[blen.Length];
Decompress.Read(b,0,b.Length);
File.WriteAllBytes(outfile, b);
}
}
}