I am writing software that deals with a large collection of files containing zlib-compressed data in different sections of the file, rather than the whole file being compressed. I know how to grab the section(s) I need, but I was having trouble getting the documented zlib stream classes to work properly. I googled and tried several different solutions, and the only one I could get to work uses a static method. The following code works just fine, but it does not use the ZlibStream class directly, as I would prefer:
// reader is the BinaryReader for the original file and...
// The "Data" section consists of a UInt32 and the compressed data
byte[] compressedStream = reader.ReadBytes((int)Size - 4); // name was kept for code compatibility
MemoryStream deflatedStream = new MemoryStream(ZlibStream.UncompressBuffer(compressedStream), true);
I don't really have any issues with using the code above, since it gives me the decompressed data I need. However, I am baffled as to why my original code, which instantiates the ZlibStream class directly, did not work (since both use the same basic API):
MemoryStream compressedStream = new MemoryStream(reader.ReadBytes((int)Size - 4));
ZlibStream deflatedStream = new ZlibStream(compressedStream, CompressionMode.Decompress, true);
Accessing almost any property of "deflatedStream" results in an error, so I assume it did not work. It might be worth noting that I have not actually used the DotNetZip library; I used Zlib.Portable instead (the second most popular such library). However, the API seems to be the same.
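For what it's worth, decompression streams like this are typically forward-only: properties such as Length or Position are often unsupported, and the actual inflation only happens when you read. A minimal sketch of how the instance-based API is usually consumed, assuming Zlib.Portable's ZlibStream behaves like other .NET decompression streams:
MemoryStream compressedStream = new MemoryStream(reader.ReadBytes((int)Size - 4));
MemoryStream deflatedStream = new MemoryStream();
using (var zlib = new ZlibStream(compressedStream, CompressionMode.Decompress))
{
    zlib.CopyTo(deflatedStream); // decompression happens during Read, not at construction
}
deflatedStream.Position = 0;     // rewind before handing the data on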
I need to generate multiple XML files at an SFTP location from C# code. For SFTP connectivity I am using Renci.SshNet. I found there are different methods to generate files, including WriteAllText() and UploadFile(). I am producing the XML string at runtime; currently I use the WriteAllText() method (to avoid creating the XML file locally and thus avoid the disk I/O):
using (SftpClient client = new SftpClient(host, port, sftpUser, sftpPassword))
{
    client.Connect();
    if (client.IsConnected)
    {
        client.BufferSize = 1024;
        var filePath = sftpDir + fileName;
        client.WriteAllText(filePath, contents);
        client.Disconnect();
    }
    // Note: the using block already disposes the client.
}
Will using UploadFile(), either from a FileStream or a MemoryStream, give me better performance in the long run?
The resulting documents are small, around 60 KB each.
Thanks!
SftpClient.UploadFile is optimized for uploads of large amounts of data.
But for 60 KB, I'm pretty sure it makes no difference whatsoever, so you can continue using the more convenient SftpClient.WriteAllText.
That said, most XML generators (like .NET's XmlWriter) can write XML to a Stream (a stream is usually the preferred output API, rather than a string), so using SftpClient.UploadFile may turn out to be more convenient in the end.
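A minimal sketch of that approach, assuming SSH.NET's UploadFile(Stream, string) overload (the document-building part is a placeholder):
using (var client = new SftpClient(host, port, sftpUser, sftpPassword))
using (var buffer = new MemoryStream())
{
    client.Connect();
    using (var writer = XmlWriter.Create(buffer))
    {
        // ... build the document with writer.WriteStartElement(...) etc. ...
    }                                  // disposing the writer flushes it into the buffer
    buffer.Position = 0;               // rewind before uploading
    client.UploadFile(buffer, sftpDir + fileName);
    client.Disconnect();
}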
See also What is the difference between SftpClient.UploadFile and SftpClient.WriteAllBytes?
I am looking into PdfReport.Core and have been asked to have our .NET Core 2.0 Web API return a PDF to the calling client. The client could be any HTTPS caller, such as an AJAX or MVC client.
Below is a bit of the code I am using. I am using Swashbuckle to test the API, which looks like it is returning the report, but when I try to open it in a PDF viewer it says the file is corrupted. I suspect I am not actually writing the complete PDF to the stream. Suggestions?
[HttpGet]
[Route("api/v1/pdf")]
public FileResult GetPDF()
{
var outputStream = new MemoryStream();
InMemoryPdfReport.CreateStreamingPdfReport(_hostingEnvironment.WebRootPath, outputStream);
outputStream.Position = 0;
return new FileStreamResult(outputStream, "application/pdf")
{
FileDownloadName = "report.pdf"
};
}
I'm not familiar with that particular library, but generally speaking with streams, file corruption is the result of either 1) the write not being flushed or 2) incorrect positioning within the stream.
Since you've set the position back to zero, I'm guessing the problem is that your write isn't being flushed correctly. Essentially, when you write to a stream, the data is not necessarily "complete" in the stream: writes are sometimes queued so they can be batched more efficiently, and some stream writers have cleanup tasks to perform to "finalize" everything. For example, a format like PDF requires end matter to be appended to the bytes; a stream writer producing PDF would take care of this in a flush operation, since it cannot do so until all writing is done.
Long and short, review the documentation of the library. In particular, look for any method/process that deals with "flushing". That's most likely what you're missing.
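To illustrate the general point with a plain StreamWriter rather than the PDF library itself: the wrapping writer must be flushed (or disposed) before the underlying stream is rewound and read.
var outputStream = new MemoryStream();
using (var writer = new StreamWriter(outputStream, Encoding.UTF8, 1024, leaveOpen: true))
{
    writer.Write("not a PDF, but the same principle applies");
}                           // disposing flushes the writer's buffer into the stream
outputStream.Position = 0;  // only now does the stream hold the complete bytes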
I am trying to use the LibTiff.Net library, rewriting the TiffCP merge tool API to use memory streams.
This library has a Tiff class, and by passing a stream to this class it can merge TIFF images into that stream.
For testing, I passed in a FileStream and got what I wanted: it merged, and I was able to see a multipage TIFF.
But when I pass a MemoryStream, I can verify that the page data is being added to the stream as I loop through; yet when I write it out to a file at the end, I see only the first page.
var mso = new MemoryStream();
var tso = new TiffStream(); // declaration was missing from the snippet; ClientOpen needs a TiffStream
var fso = new FileStream(@"C:\test\ttest.tif", FileMode.OpenOrCreate); // This works
using (Tiff outImage = Tiff.ClientOpen("custom", "w", mso, tso))
{
    //...
    //..
    System.Drawing.Image tiffImg = System.Drawing.Image.FromStream(mso, true);
    tiffImg.Save(@"C:\test\test2.tiff", System.Drawing.Imaging.ImageFormat.Tiff);
    tiffImg.Dispose();
    //..
    //..
}
P.S.: I need it in a MemoryStream because of folder permissions on the servers, plus vendor API reasons.
You are probably using the memory stream before the data is actually written into it.
Please use the Tiff.Flush() method before accessing the data in the memory stream, and make sure you call the Tiff.WriteDirectory() method for each page you create.
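A rough sketch of that ordering (the page-writing loop is a placeholder; only WriteDirectory() and Flush() are the point here):
using (Tiff outImage = Tiff.ClientOpen("custom", "w", mso, new TiffStream()))
{
    foreach (var page in pages)        // 'pages' stands in for your merge loop
    {
        // ... write the scanlines/strips for this page ...
        outImage.WriteDirectory();     // closes out the current page
    }
    outImage.Flush();                  // pushes buffered data into mso
}
mso.Position = 0;                      // only read from mso after the flush, and rewind first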
EDIT:
Please also take a look at Bob Powell's article on Generating Multi-Page TIFF files. The article shows how to use EncoderParameters to actually generate a multipage TIFF.
Using
tiffImg.Save(@"C:\test\test2.tiff", System.Drawing.Imaging.ImageFormat.Tiff);
you are probably saving only the first frame.
I would like to compress a file before sending it through the network. I think the best approach is 7-Zip, because it is free and open source.
How do I use 7-Zip with .NET?
I know that 7-Zip is free and that the source code is available in C#, but for some reason it is very slow in C#, so I would rather call the 7z.dll that comes with the 7-Zip installation, for performance reasons. The easiest way to marshal and call the methods in 7z.dll is with the help of the library called SevenZipSharp. For example, adding that DLL to my project enables me to do this:
// If you installed the 64-bit version of 7-Zip, make sure you change the
// platform target of your project to match (x64).
SevenZip.SevenZipCompressor.SetLibraryPath(@"C:\Program Files\7-Zip\7z.dll");
var stream = System.IO.File.OpenRead(@"SomeFileToCompress.txt");
var outputStream = System.IO.File.Create("Output.7z");
SevenZip.SevenZipCompressor compressor = new SevenZip.SevenZipCompressor();
compressor.CompressionMethod = SevenZip.CompressionMethod.Lzma2;
compressor.CompressionLevel = SevenZip.CompressionLevel.Ultra;
compressor.CompressStream(stream, outputStream);
That's how I use 7-Zip within C#.
Now my question is:
I would like to send a compressed file over the network. I know I could compress it first and then send it, but the file is 4 GB, so I would have to wait a long time for it to compress, waste a lot of hard drive space in the process, and only then finally send it. That seems too complicated. I was wondering how it might be possible to send the file while it is being compressed.
It seems to be a problem with SevenZipSharp:
Have you considered an alternate library - one that doesn't even require 7-Zip to be installed / available?
From the description posted at http://dotnetzip.codeplex.com/ :
creating zip files from stream content, saving to a stream, extracting
to a stream, reading from a stream
Unlike 7-Zip, DotNetZip is designed to work with C# / .Net.
Plenty of examples - including streaming, are available at http://dotnetzip.codeplex.com/wikipage?title=CS-Examples&referringTitle=Examples .
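For instance, a minimal DotNetZip sketch that compresses straight into whatever output stream you have (a network stream would work the same way as the file stream here):
using (var zip = new Ionic.Zip.ZipFile())
using (var input = System.IO.File.OpenRead(@"SomeFileToCompress.txt"))
using (var output = System.IO.File.Create("Output.zip"))
{
    zip.AddEntry("SomeFileToCompress.txt", input); // entry content is read from the stream
    zip.Save(output);                              // compresses as it writes to 'output'
}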
Another option is to use the 7-Zip command-line version (7z.exe) and write to/read from standard in/out. This would allow you to use the 7-Zip file format while keeping all of the core work in native code (though there likely won't be a significant difference).
Looking back at SevenZipSharp:
Since the 0.29 release, streaming is supported.
Looking at http://sevenzipsharp.codeplex.com/SourceControl/changeset/view/59007#364711 :
it seems you'd want this method:
public void CompressStream(Stream inStream, Stream outStream)
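Assuming CompressStream writes to its output stream incrementally (worth verifying for your SevenZipSharp version), you could hand it the network stream directly so that compression and sending overlap; remoteHost and remotePort are placeholders:
using (var client = new System.Net.Sockets.TcpClient(remoteHost, remotePort))
using (var network = client.GetStream())
using (var input = System.IO.File.OpenRead(@"SomeFileToCompress.txt"))
{
    var compressor = new SevenZip.SevenZipCompressor();
    compressor.CompressStream(input, network); // compressed bytes go out as they are produced
}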
Thank you for considering performance here! I think way too many people would do exactly what you're trying to avoid: compress to a temp file, then do something with the temp file.
CompressStream threw an exception. My code is as follows:
public void TestCompress()
{
    string fileToCompress = @"C:\Users\gary\Downloads\BD01.DAT";
    byte[] inputBytes = File.ReadAllBytes(fileToCompress);
    var inputStream = new MemoryStream(inputBytes);
    byte[] zipBytes = new byte[38000000]; // this memory size is large enough
    MemoryStream outputStream = new MemoryStream(zipBytes);
    string compressorEnginePath = @"C:\Engine\7z.dll";
    SevenZipCompressor.SetLibraryPath(compressorEnginePath);
    var compressor = new SevenZip.SevenZipCompressor();
    compressor.CompressionLevel = CompressionLevel.Fast;
    compressor.CompressionMethod = CompressionMethod.Lzma2;
    compressor.CompressStream(inputStream, outputStream);
    inputStream.Close();
    outputStream.Close();
}
The exception messages:
Message: Test method Test7zip.UnitTest1.TestCompress threw exception:
SevenZip.SevenZipException: The execution has failed due to the bug in the SevenZipSharp.
Please report about it to http://sevenzipsharp.codeplex.com/WorkItem/List.aspx, post the release number and attach the archive
I'm trying to implement file compression in an application. The application has been around for a while, so it needs to be able to read uncompressed documents written by previous versions. I expected that DeflateStream would be able to process an uncompressed file, but for GZipStream I get the "The magic number in GZip header is not correct" error, and for DeflateStream I get "Found invalid data while decoding". I guess it does not find the header that would mark the file as the expected type.
If it's not possible to simply process an uncompressed file, then second best would be a way to determine whether a file is compressed and choose the method of reading accordingly. I've found this link: http://blog.somecreativity.com/2008/04/08/how-to-check-if-a-file-is-compressed-in-c/, but the approach is very implementation-specific and doesn't feel right. It can also produce false positives (I'm sure this would be rare, but it does indicate that it's not the right approach).
A third option I've considered is to attempt to use DeflateStream and fall back to normal stream IO if an exception occurs. This also feels messy, and it causes VS to break at the exception (unless I untick that exception, which I don't really want to have to do).
Of course, I may simply be going about this the wrong way. This is the code I've tried in .NET 3.5:
Stream reader = new FileStream(fileName, FileMode.Open,
    readOnly ? FileAccess.Read : FileAccess.ReadWrite,
    readOnly ? FileShare.ReadWrite : FileShare.Read);
using (DeflateStream decompressedStream = new DeflateStream(reader, CompressionMode.Decompress))
{
workspace = (Workspace)new XmlSerializer(typeof(Workspace)).Deserialize(decompressedStream);
if (readOnly)
{
reader.Close();
workspace.FilePath = fileName;
}
else
workspace.SetOpen(reader, fileName);
}
Any ideas?
Thanks!
Luke.
Doesn't your file format have a header? If not, now is the time to add one (you're changing the file format by supporting compression, anyway). Pick a good magic value, make sure the header is extensible (add a version field, or use specific magic values for specific versions), and you're ready to go.
Upon loading, check for the magic value. If not present, use your current legacy loading routines. If present, the header will tell you whether the contents are compressed or not.
Update
Compressing the stream means the file is no longer an XML document, so there's no longer any reason the file can't contain more than just your data stream. You really do want a header identifying your file :)
Below is example (pseudo)code. I don't know if .NET has a "substream"; SubRangeStream is likely something you'll have to code yourself (DeflateStream probably adds its own header, so a substream might not be necessary, though it could turn out useful further down the road).
Int64 oldPosition = reader.Position;
reader.Read(magic, 0, magic.Length);
if (IsRightMagicValue(magic))
{
    Header header = ReadHeader(reader);
    Stream furtherReader = new SubRangeStream(reader, reader.Position, header.ContentLength);
    if (header.IsCompressed)
    {
        furtherReader = new DeflateStream(furtherReader, CompressionMode.Decompress);
    }
    XmlSerializer xml = new XmlSerializer(typeof(Workspace));
    workspace = (Workspace)xml.Deserialize(furtherReader);
}
else
{
    reader.Position = oldPosition;
    LegacyLoad(reader);
}
In real-life, I would do things a bit differently - some proper error handling and cleanup, for instance. Also, I wouldn't have the new loader code directly in the IsRightMagicValue block, but rather I'd spin off the work either based on the magic value (one magic value per file version), or I would keep a "common header" portion with fields common to all versions. For both, I'd use a Factory Method to return an IWorkspaceReader depending on the file version.
Can't you just create a wrapper class/function for reading the file and catch the exception? Note that DeflateStream typically doesn't fail until you actually read from it, so the attempt has to include a read. Something like:
try
{
    // Try to return the decompressed stream
}
catch (InvalidDataException e)
{
    // Assume it is already decompressed and return it as-is
}
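A fuller sketch of that fallback idea (the file is read into memory first so the bytes can be reused on failure; the manual copy loop keeps it .NET 3.5-friendly, since Stream.CopyTo arrived in .NET 4):
static Stream OpenPossiblyCompressed(string fileName)
{
    byte[] raw = File.ReadAllBytes(fileName);
    try
    {
        // Try to return the decompressed stream
        var decompressed = new MemoryStream();
        using (var deflate = new DeflateStream(new MemoryStream(raw), CompressionMode.Decompress))
        {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = deflate.Read(buffer, 0, buffer.Length)) > 0) // throws InvalidDataException on plain XML
                decompressed.Write(buffer, 0, read);
        }
        decompressed.Position = 0;
        return decompressed;
    }
    catch (InvalidDataException)
    {
        // Assume it is already decompressed and return it as-is
        return new MemoryStream(raw);
    }
}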