Stream chain wrapping done the right way - C#

I'm using the Bouncy Castle cryptographic API for C# to create armored output of encrypted data.
The code looks ugly; in particular, like this:
string result = string.Empty;
using (var outputStream = new MemoryStream())
{
    using (var armoredStream = AddArmorWrappingTo(outputStream))
    {
        using (var encryptedStream = AddEncryptionWrappingTo(armoredStream))
        {
            using (var literalStream = AddLiteralWrappingTo(encryptedStream))
            {
                using (var inputStream = new MemoryStream(input))
                {
                    this.Write(inputStream, literalStream);
                }
            }
        }
    }
    result = Encoding.ASCII.GetString(outputStream.ToArray());
}
return result;
The issue here is that if I need to add compression of the data, I cannot change this piece of code; I have to write a new one instead, since compression in Bouncy Castle's world is done as one more stream wrapper around the future output stream.
To work properly, the streams need to be wrapped in the correct order and closed properly, otherwise the operation produces no usable result.
In addition, all these intermediate streams have to be present (I cannot overwrite the same stream variable over and over).
I've created extension methods for the stream wrapper creators, and it looks like this now:
string result = string.Empty;
Stream[] pack = new Stream[3];
var outputStream = new MemoryStream();
var inputStream = new MemoryStream(input);
pack[0] = outputStream.Armor();
pack[1] = pack[0].EncryptWith(PublicKey);
pack[2] = pack[1].SplitByLiterals();
this.Write(inputStream, pack[2]);
pack[2].Close();
pack[1].Close();
pack[0].Close();
result = Encoding.ASCII.GetString(outputStream.ToArray());
return result;
I would say the code became even worse.
My question is: is it possible to optimize stream wrapping? Maybe create an array of delegates to wrap the streams one by one and close them afterwards?
What's your experience with such tasks? Is it possible to make this code more maintainable? Currently, adding compression or signing, or excluding armoring, is a pain...
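For illustration, the array-of-delegates idea might be sketched like this. This is a minimal sketch, not Bouncy Castle-specific: GZipStream stands in for the Armor/EncryptWith/SplitByLiterals wrappers, since they share the same Stream-in/Stream-out shape, and the helper name WriteThrough is made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

// Builds the chain from a list of wrapper delegates, writes the input
// through the innermost stream, then disposes the wrappers innermost-first
// so each layer can flush its trailer into the layer below it.
static byte[] WriteThrough(byte[] input, params Func<Stream, Stream>[] wrappers)
{
    using var output = new MemoryStream();
    var chain = new Stack<Stream>();
    Stream current = output;
    foreach (var wrap in wrappers)
    {
        current = wrap(current);   // first delegate is the outermost layer
        chain.Push(current);
    }
    using (var inputStream = new MemoryStream(input))
    {
        inputStream.CopyTo(current);
    }
    while (chain.Count > 0)
    {
        chain.Pop().Dispose();     // innermost first
    }
    return output.ToArray();
}

byte[] data = Encoding.ASCII.GetBytes("hello stream chain");
byte[] packed = WriteThrough(data,
    s => new GZipStream(s, CompressionMode.Compress, leaveOpen: true));
```

Adding compression (or signing, or dropping armoring) then becomes inserting or removing one delegate in the list. One caveat: each wrapper must leave its underlying stream usable when disposed (hence leaveOpen: true for GZipStream); whether each Bouncy Castle wrapper behaves that way on Close is worth verifying per layer.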

Related

Decompressing gzipped ReadOnlyMemory<byte> before I do JsonDocument.Parse

The WebSocket client is returning a ReadOnlyMemory<byte>.
The issue is that JsonDocument.Parse fails because the buffer has been compressed. I've got to decompress it somehow before I parse it. How do I do that? I cannot really change the WebSocket library code.
What I want is something like public Func<ReadOnlyMemory<byte>> DataInterpreterBytes = () => which optionally decompresses these bytes outside of this class. How do I do that? Is it possible to decompress a ReadOnlyMemory<byte>, and if the handler is unused, to basically do nothing?
private static string DecompressData(byte[] byteData)
{
    using var decompressedStream = new MemoryStream();
    using var compressedStream = new MemoryStream(byteData);
    using var deflateStream = new GZipStream(compressedStream, CompressionMode.Decompress);
    deflateStream.CopyTo(decompressedStream);
    decompressedStream.Position = 0;
    using var streamReader = new StreamReader(decompressedStream);
    return streamReader.ReadToEnd();
}
Snippet
private void OnMessageReceived(object? sender, MessageReceivedEventArgs e)
{
    var timestamp = DateTime.UtcNow;
    _logger.LogTrace("Message was received. {Message}", Encoding.UTF8.GetString(e.Message.Buffer.Span));
    // We dispose that object later on
    using var document = JsonDocument.Parse(e.Message.Buffer);
    var tokenData = document.RootElement;
    // ...
}
So, if you had a byte array, you'd do this:
private static JsonDocument DecompressData(byte[] byteData)
{
    using var compressedStream = new MemoryStream(byteData);
    using var deflateStream = new GZipStream(compressedStream, CompressionMode.Decompress);
    return JsonDocument.Parse(deflateStream);
}
This is similar to your snippet above, but with no need for the intermediate copy: just read straight from the GZipStream. JsonDocument.Parse also has an overload that takes a stream, so you can use that and avoid yet another useless copy.
Unfortunately, you don't have a byte array, you have a ReadOnlyMemory<byte>. There is no way out of the box to create a memory stream out of a ReadOnlyMemory<byte>. Honestly, it feels like an oversight, like they forgot to put that feature into .NET.
So here are your options instead.
The first option is to just convert the ReadOnlyMemory<byte> object to an array with ToArray():
// assuming e.Message.Buffer is a ReadOnlyMemory<byte>
using var document = DecompressData(e.Message.Buffer.ToArray());
This is really straightforward, but remember it actually copies the data, so for large documents it might not be a good idea if you want to avoid using too much memory.
The second is to try and extract the underlying array from the memory. This can be achieved with MemoryMarshal.TryGetArray, which gives you an ArraySegment<byte> (but might fail if the memory isn't actually backed by a managed array).
private static JsonDocument DecompressData(ReadOnlyMemory<byte> byteData)
{
    if (MemoryMarshal.TryGetArray(byteData, out var segment))
    {
        using var compressedStream = new MemoryStream(segment.Array, segment.Offset, segment.Count);
        // rest of the code goes here
    }
    else
    {
        // Welp, this memory isn't actually an array, so... tough luck?
    }
}
The third way might feel dirty, but if you're okay with using unsafe code, you can just pin the memory's span and then use UnmanagedMemoryStream:
private static unsafe JsonDocument DecompressData(ReadOnlyMemory<byte> byteData)
{
    fixed (byte* ptr = byteData.Span)
    {
        using var compressedStream = new UnmanagedMemoryStream(ptr, byteData.Length);
        using var deflateStream = new GZipStream(compressedStream, CompressionMode.Decompress);
        return JsonDocument.Parse(deflateStream);
    }
}
The other solution is to write your own Stream class that supports this. The Windows Community Toolkit has an extension method that returns a Stream wrapper around the memory object. If you're not okay with using an entire third party library just for that, you can probably just roll your own, it's not that much code.
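If rolling your own, a minimal sketch of such a wrapper might look like the following. The class name is illustrative, and only the members needed for reading are implemented; writing is unsupported by design.

```csharp
using System;
using System.IO;

// A minimal read-only, seekable Stream over a ReadOnlyMemory<byte>.
sealed class ReadOnlyMemoryStream : Stream
{
    private readonly ReadOnlyMemory<byte> _memory;
    private int _position;

    public ReadOnlyMemoryStream(ReadOnlyMemory<byte> memory) => _memory = memory;

    public override bool CanRead => true;
    public override bool CanSeek => true;
    public override bool CanWrite => false;
    public override long Length => _memory.Length;

    public override long Position
    {
        get => _position;
        set => _position = checked((int)value);
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int toCopy = Math.Min(count, _memory.Length - _position);
        if (toCopy <= 0) return 0; // end of stream
        _memory.Span.Slice(_position, toCopy).CopyTo(buffer.AsSpan(offset, toCopy));
        _position += toCopy;
        return toCopy;
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        long target = origin switch
        {
            SeekOrigin.Begin => offset,
            SeekOrigin.Current => _position + offset,
            SeekOrigin.End => _memory.Length + offset,
            _ => throw new ArgumentOutOfRangeException(nameof(origin)),
        };
        Position = target;
        return target;
    }

    public override void Flush() { }
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

With this in place, DecompressData can take the ReadOnlyMemory<byte> directly: wrap it in the stream, layer the GZipStream on top, and parse, with no copy at all.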

C# FlatBufferBuilder create String from Stream

Suppose you need to read a large string from a stream and you want to put that string into a flatbuffer.
Currently what I do is read the stream into a string and then use the FlatbufferBuilder.CreateString(string s) function.
This works fine but it does have as a drawback that the string is copied and loaded into memory twice: once by reading it from the stream into the string; and then a second time the string is copied into the flatbuffer.
I was wondering if there is a way to fill the flatbuffer string directly from a stream?
For a more concrete example:
Suppose your flatbuffer schema looks like:
table Message
{
_Data: string;
}
root_type Message;
We can then create a flatbuffer like this (with myData a string)
var fbb = new FlatBufferBuilder(myData.Length);
var dataOffset = fbb.CreateString(myData);
var message = Message.CreateMessage(fbb, dataOffset);
Message.FinishMessageBuffer(fbb, message);
So the question is can we somehow do the same thing, where myData is a System.IO.Stream?
Obviously the following works, but I'd like to avoid first reading the Stream into memory.
using (var reader = new StreamReader(myStream))
{
    var myData = reader.ReadToEnd();
    var fbb = new FlatBufferBuilder(myData.Length);
    var dataOffset = fbb.CreateString(myData);
    var message = Message.CreateMessage(fbb, dataOffset);
    Message.FinishMessageBuffer(fbb, message);
}
There is currently no way to avoid that double copy, as far as I know. It should be relatively simple to implement a version of CreateString that takes a stream and reduces it to one copy. You could have a go at that and open a PR on GitHub with the result.

Decompress Stream to String using SevenZipSharp

I'd like to compress a string using SevenZipSharp and have cobbled together a C# console application (I'm new to C#) using the following code, bits and pieces of which came from similar questions here on SO.
The compress part seems to work (albeit I'm passing in a file instead of a string); the output of the compressed string to the console looks like gibberish, but I'm stuck on the decompress...
I'm trying to do the same thing as here (I think):
https://stackoverflow.com/a/4305399/3451115
https://stackoverflow.com/a/45861659/3451115
https://stackoverflow.com/a/36331690/3451115
Appreciate any help, ideally the console will display the compressed string followed by the decompressed string.
Thanks :)
using System;
using System.IO;
using SevenZip;

namespace _7ZipWrapper
{
    public class Program
    {
        public static void Main()
        {
            SevenZipCompressor.SetLibraryPath(@"C:\Temp\7za64.dll");
            SevenZipCompressor compressor = new SevenZipCompressor();
            compressor.CompressionMethod = CompressionMethod.Ppmd;
            compressor.CompressionLevel = SevenZip.CompressionLevel.Ultra;
            compressor.ScanOnlyWritable = true;
            var compStream = new MemoryStream();
            var decompStream = new MemoryStream();
            compressor.CompressFiles(compStream, @"C:\Temp\a.txt");
            StreamReader readerC = new StreamReader(compStream);
            Console.WriteLine(readerC.ReadToEnd());
            Console.ReadKey();
            // works up to here... below here output to console is: ""
            SevenZipExtractor extractor = new SevenZip.SevenZipExtractor(compStream);
            extractor.ExtractFile(0, decompStream);
            StreamReader readerD = new StreamReader(decompStream);
            Console.WriteLine(readerD.ReadToEnd());
            Console.ReadKey();
        }
    }
}
The result of compression is binary data - it isn't a string. If you try to read it as a string, you'll just see garbage. That's to be expected - you shouldn't be treating it as a string.
The next problem is that you're trying to read from compStream twice, without "rewinding" it first. You're starting from the end of the stream, which means there's no data for it to decompress. If you just add:
compStream.Position = 0;
before you create the extractor, you may well find it works immediately. You may also need to rewind the decompStream before reading from it. So you'd have code like this:
// Rewind to the start of the stream before decompressing
compStream.Position = 0;
SevenZipExtractor extractor = new SevenZip.SevenZipExtractor(compStream);
extractor.ExtractFile(0, decompStream);
// Rewind to the start of the decompressed stream before reading
decompStream.Position = 0;
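The same rewind principle can be demonstrated with the built-in GZipStream, which needs no native 7-Zip library (the payload string here is just a stand-in for the contents of a.txt):

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Compress into a MemoryStream; leaveOpen keeps the stream usable afterwards.
var compStream = new MemoryStream();
using (var gzip = new GZipStream(compStream, CompressionMode.Compress, leaveOpen: true))
{
    var payload = Encoding.UTF8.GetBytes("hello from a.txt");
    gzip.Write(payload, 0, payload.Length);
}

// Without this rewind, decompression below would start at the end
// of the stream and find no data, just like in the question.
compStream.Position = 0;

var decompStream = new MemoryStream();
using (var gunzip = new GZipStream(compStream, CompressionMode.Decompress))
{
    gunzip.CopyTo(decompStream);
}

// Rewind again before reading the decompressed bytes back out.
decompStream.Position = 0;
string text = new StreamReader(decompStream).ReadToEnd();
```

The two Position = 0 lines are the whole fix: a MemoryStream that was just written to is positioned at its end, so any reader attached to it sees nothing until it is rewound.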

Uploading DataTable to Azure blob storage

I am trying to serialize a DataTable to XML and then upload it to Azure blob storage.
The below code works, but seems clunky and memory hungry. Is there a better way to do this? I'm especially referring to the fact that I am dumping a memory stream to a byte array and then creating a new memory stream from it.
var container = blobClient.GetContainerReference("container");
var blockBlob = container.GetBlockBlobReference("blob");
byte[] blobBytes;
using (var writeStream = new MemoryStream())
{
    using (var writer = new StreamWriter(writeStream))
    {
        table.WriteXml(writer, XmlWriteMode.WriteSchema);
    }
    blobBytes = writeStream.ToArray();
}
using (var readStream = new MemoryStream(blobBytes))
{
    blockBlob.UploadFromStream(readStream);
}
New answer:
I've learned of a better approach, which is to open a write stream directly to the blob. For example:
using (var writeStream = blockBlob.OpenWrite())
{
    using (var writer = new StreamWriter(writeStream))
    {
        table.WriteXml(writer, XmlWriteMode.WriteSchema);
    }
}
Per our developer, this does not require the entire table to be buffered in memory, and will probably involve less copying of data around.
Original answer:
You can use the CloudBlockBlob.UploadFromByteArray method, and upload the byte array directly, instead of creating the second stream.
See https://msdn.microsoft.com/en-us/library/microsoft.windowsazure.storage.blob.cloudblockblob.uploadfrombytearray.aspx for the method syntax.

Comparing two strings from two different streams are never equal even though they should be

I have a stream that reads the response from a site. I then save that stream as text in a text file.
If I then run it again and compare the string from the same site with the text saved in the file, it thinks they are different.
When I compare the two strings in a diff tool like WinMerge, it finds differences at apparently identical points.
What is happening? They are both using the default UTF8 encoder.
I appreciate this may be difficult to follow so I have written a working example for you.
Here is an example:
var request = WebRequest.Create("http://www.google.com");
using (var response = request.GetResponse())
using (var body = response.GetResponseStream())
using (var googReader = new StreamReader(body))
using (var googFileStream = File.Open("goog.txt", FileMode.OpenOrCreate))
using (var fileReader = new StreamReader(googFileStream))
{
    var googText = googReader.ReadToEnd();
    var fileText = fileReader.ReadToEnd();
    if (!string.Equals(googText, fileText))
    {
        googFileStream.Dispose();
        using (var msnWriter = new StreamWriter(File.Open("goog.txt", FileMode.Create)))
        {
            msnWriter.Write(googText);
        }
    }
}
Here is the apparent 'difference' as reported by WinMerge. It is apparently at the point between "html;" and "charset":
Your code seems fine. It's just that Google actually returns different content every time you send a request to it. Other than that, you might try simplifying your code and using a site which doesn't return different content every time:
var file = "goog.txt";
using (var client = new WebClient())
{
    var data = client.DownloadString("http://www.google.com");
    if (!File.Exists(file) || !string.Equals(File.ReadAllText(file), data))
    {
        File.WriteAllText(file, data);
    }
}
