How to change code that uses Span to use byte array instead


I want to try a code sample for libvlcsharp, found here:
https://code.videolan.org/mfkl/libvlcsharp-samples/-/blob/master/PreviewThumbnailExtractor/Program.cs#L113
I want to try it in a Framework 4.6.1 project, but the sample is targeted at .NET 6. I am having trouble getting one line to compile. The section in question is here:
private static async Task ProcessThumbnailsAsync(string destination, CancellationToken token)
{
var frameNumber = 0;
while (!token.IsCancellationRequested)
{
if (FilesToProcess.TryDequeue(out var file))
{
using (var image = new Image<SixLabors.ImageSharp.PixelFormats.Bgra32>((int)(Pitch / BytePerPixel), (int)Lines))
using (var sourceStream = file.file.CreateViewStream())
{
var mg = image.GetPixelMemoryGroup();
for(int i = 0; i < mg.Count; i++)
{
sourceStream.Read(MemoryMarshal.AsBytes(mg[i].Span));
}
Console.WriteLine($"Writing {frameNumber:0000}.jpg");
var fileName = Path.Combine(destination, $"{frameNumber:0000}.jpg");
using (var outputFile = File.Open(fileName, FileMode.Create))
{
image.Mutate(ctx => ctx.Crop((int)Width, (int)Height));
image.SaveAsJpeg(outputFile);
}
}
file.accessor.Dispose();
file.file.Dispose();
frameNumber++;
}
else
{
await Task.Delay(TimeSpan.FromSeconds(1), token);
}
}
}
The troublesome line is :
sourceStream.Read(MemoryMarshal.AsBytes(mg[i].Span));
In .NET 6 there are two overloads:
Read(Span<Byte>): reads all the bytes of this unmanaged memory stream into the specified span of bytes.
Read(Byte[], Int32, Int32): reads the specified number of bytes into the specified array.
But in .NET Framework 4.x there is just one:
Read(Byte[], Int32, Int32)
I am having trouble understanding what is going on here. Can someone please suggest a way to convert the line from the Read(Span<Byte>) style to the Read(Byte[], Int32, Int32) style so that it works the same? I don't have experience with C#.
Thanks for any advice.

To understand what's happening, consider the following .NET 4.6.1 code which would achieve the same:
var mg = image.GetPixelMemoryGroup();
for(int i = 0; i < mg.Count; i++)
{
Span<byte> span = MemoryMarshal.AsBytes(mg[i].Span);
byte[] buffer = new byte[span.Length];
sourceStream.Read(buffer, 0, buffer.Length);
buffer.CopyTo(span);
}
This is just for demonstration, though, as it allocates a new byte array on every iteration. You're better off "backporting" what newer .NET does by default (see this answer), especially since you'll run into this again: the SixLabors libraries were written with .NET Core in mind, AFAIK. It may also not be as performant as what newer .NET can do, in case memory-mapped file streams remove the need for the one copy done by default.
Also note that .NET 4.6.1 is no longer supported, and if you consider upgrading, you may find switching to .NET (Core) easier than pursuing backporting a whole library.
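As a rough sketch of what such a backport could look like: the default Stream.Read(Span<byte>) in newer .NET rents a temporary array from ArrayPool, calls the array-based Read, and copies the result into the span. The extension method below does the same. The names StreamSpanExtensions and ReadIntoSpan are made up for this example, and on .NET Framework it assumes the System.Memory and System.Buffers NuGet packages are referenced.

```csharp
using System;
using System.Buffers;
using System.IO;

public static class StreamSpanExtensions
{
    // Hypothetical helper: reads up to destination.Length bytes into the span
    // through a pooled array instead of allocating a new array per call.
    public static int ReadIntoSpan(this Stream stream, Span<byte> destination)
    {
        // The rented array may be larger than requested, so only
        // destination.Length bytes of it are ever used.
        byte[] rented = ArrayPool<byte>.Shared.Rent(destination.Length);
        try
        {
            int read = stream.Read(rented, 0, destination.Length);
            new ReadOnlySpan<byte>(rented, 0, read).CopyTo(destination);
            return read;
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(rented);
        }
    }
}
```

With that in place, the troublesome line would become sourceStream.ReadIntoSpan(MemoryMarshal.AsBytes(mg[i].Span)); and, like the original, it performs a single read per memory block.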

Related

How to calculate SHA512/256 in .Net 6?

How to calculate SHA512/256 or SHA512/224 without using external library?
In .Net 6, a SHA512 hash can be calculated (documentation). Here is my example:
public string GetHashStringSHA512(string data)
{
using (SHA512 sha512 = SHA512.Create())
{
byte[] bytes = sha512.ComputeHash(Encoding.UTF8.GetBytes(data));
StringBuilder builder = new StringBuilder();
for (int i = 0; i < bytes.Length; i++)
{
builder.Append(bytes[i].ToString("x2"));
}
return builder.ToString();
}
}

As noted in the comments, it appears that the .Net library has not implemented SHA512/256 or SHA512/224.
To calculate SHA512/256 or SHA512/224 without using external library, the specification would need to be implemented. There's a document on the Cryptology ePrint Archive that includes some sample code. See also the NIST example. There are a variety of open source solutions as well to use as a starting point for your own code, such as the SHA512 library at wolfSSL that includes both SHA512/256 and SHA512/224.

Decompressing gzipped ReadOnlyMemory<byte> before I do JsonDocument.Parse

The websocket client is returning a ReadOnlyMemory<byte>.
The issue is that JsonDocument.Parse fails due to the fact that the buffer has been compressed. I've got to decompress it somehow before I parse it. How do I do that? I cannot really change the websocket library code.
What I want is something like public Func<ReadOnlyMemory<byte>> DataInterpreterBytes = () => which optionally decompresses these bytes outside this class. How do I do that? Is it possible to decompress a ReadOnlyMemory<byte>, and to basically do nothing if the handler is unused?
private static string DecompressData(byte[] byteData)
{
using var decompressedStream = new MemoryStream();
using var compressedStream = new MemoryStream(byteData);
using var deflateStream = new GZipStream(compressedStream, CompressionMode.Decompress);
deflateStream.CopyTo(decompressedStream);
decompressedStream.Position = 0;
using var streamReader = new StreamReader(decompressedStream);
return streamReader.ReadToEnd();
}
Snippet
private void OnMessageReceived(object? sender, MessageReceivedEventArgs e)
{
var timestamp = DateTime.UtcNow;
_logger.LogTrace("Message was received. {Message}", Encoding.UTF8.GetString(e.Message.Buffer.Span));
// We dispose that object later on
using var document = JsonDocument.Parse(e.Message.Buffer);
var tokenData = document.RootElement;

So, if you had a byte array, you'd do this:
private static JsonDocument DecompressData(byte[] byteData)
{
using var compressedStream = new MemoryStream(byteData);
using var deflateStream = new GZipStream(compressedStream, CompressionMode.Decompress);
return JsonDocument.Parse(deflateStream);
}
This is similar to your snippet above, but with no need for the intermediate copy: just read straight from the GZipStream. JsonDocument.Parse also has an overload that takes a stream, so you can use that and avoid yet another useless copy.
Unfortunately, you don't have a byte array, you have a ReadOnlyMemory<byte>. There is no way out of the box to create a memory stream out of a ReadOnlyMemory<byte>. Honestly, it feels like an oversight, like they forgot to put that feature into .NET.
So here are your options instead.
The first option is to just convert the ReadOnlyMemory<byte> object to an array with ToArray():
// assuming e.Message.Buffer is a ReadOnlyMemory<byte>
using var document = DecompressData(e.Message.Buffer.ToArray());
This is really straightforward, but remember it actually copies the data, so for large documents it might not be a good idea if you want to avoid using too much memory.
The second is to try and extract the underlying array from the memory. This can be achieved with MemoryMarshal.TryGetArray, which gives you an ArraySegment (but might fail if the memory isn't actually a managed array).
private static JsonDocument DecompressData(ReadOnlyMemory<byte> byteData)
{
if(MemoryMarshal.TryGetArray(byteData, out var segment))
{
using var compressedStream = new MemoryStream(segment.Array, segment.Offset, segment.Count);
// rest of the code goes here
}
else
{
// Welp, this memory isn't actually an array, so... tough luck?
}
}
The third way might feel dirty, but if you're okay with using unsafe code, you can just pin the memory's span and then use UnmanagedMemoryStream:
private static unsafe JsonDocument DecompressData(ReadOnlyMemory<byte> byteData)
{
fixed (byte* ptr = byteData.Span)
{
using var compressedStream = new UnmanagedMemoryStream(ptr, byteData.Length);
using var deflateStream = new GZipStream(compressedStream, CompressionMode.Decompress);
return JsonDocument.Parse(deflateStream);
}
}
The other solution is to write your own Stream class that supports this. The Windows Community Toolkit has an extension method that returns a Stream wrapper around the memory object. If you're not okay with using an entire third party library just for that, you can probably just roll your own, it's not that much code.
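For the "roll your own" route, a minimal sketch of a read-only, seekable Stream over a ReadOnlyMemory<byte> might look like the following. The class name ReadOnlyMemoryStream is invented for this example, and the error handling is deliberately simplistic (for instance, seeking past the end throws instead of being allowed, unlike MemoryStream).

```csharp
using System;
using System.IO;

// Hypothetical wrapper: exposes a ReadOnlyMemory<byte> as a Stream without copying it.
public sealed class ReadOnlyMemoryStream : Stream
{
    private readonly ReadOnlyMemory<byte> _memory;
    private int _position;

    public ReadOnlyMemoryStream(ReadOnlyMemory<byte> memory) => _memory = memory;

    public override bool CanRead => true;
    public override bool CanSeek => true;
    public override bool CanWrite => false;
    public override long Length => _memory.Length;

    public override long Position
    {
        get => _position;
        set
        {
            if (value < 0 || value > _memory.Length)
                throw new ArgumentOutOfRangeException(nameof(value));
            _position = (int)value;
        }
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        if (buffer == null) throw new ArgumentNullException(nameof(buffer));
        if (offset < 0 || count < 0 || count > buffer.Length - offset)
            throw new ArgumentOutOfRangeException();

        // Copy as many bytes as are still available, up to count.
        int n = Math.Min(count, _memory.Length - _position);
        _memory.Span.Slice(_position, n).CopyTo(buffer.AsSpan(offset, n));
        _position += n;
        return n;
    }

    public override long Seek(long offset, SeekOrigin origin)
    {
        long target = origin switch
        {
            SeekOrigin.Begin => offset,
            SeekOrigin.Current => _position + offset,
            SeekOrigin.End => _memory.Length + offset,
            _ => throw new ArgumentOutOfRangeException(nameof(origin))
        };
        Position = target; // validates the range
        return target;
    }

    public override void Flush() { }
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

It could then be passed to the GZipStream in place of the MemoryStream, e.g. new GZipStream(new ReadOnlyMemoryStream(e.Message.Buffer), CompressionMode.Decompress), with the result handed to JsonDocument.Parse as before.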

How to perform a minimal-allocation conversion of 'Memory<T>' into 'Stream'?

There doesn't seem to exist a native way of converting a Memory<T> instance into a Stream in the framework. Is there a simple way to achieve this using some allocation-free approach that only uses the actual memory from the Memory<T> instance, or perhaps a way that leverages only a minimal buffer?
I wasn't able to find any examples going from Memory<T> to Stream: only the opposite is common since there are several newer overloads on Stream that allow handling Memory<T> instances.
It seems so intuitive that one would be able to convert from Memory<> to MemoryStream (due mostly to their names, admittedly) so I was a bit disappointed to find this wasn't the case.
I also wasn't able to find any easy way of creating a System.IO.Pipelines PipeReader or PipeWriter with a Memory<T>, as those have a AsStream extension.

The thing is, the Memory<T> might not actually be looking at an array: it might be using a MemoryManager, or doing something else.
That said, it's possible to use MemoryMarshal.TryGetArray to see whether it does reference an array, and take an optimized path if so:
ReadOnlyMemory<byte> memory = new byte[] { 1, 2, 3, 4, 5 }.AsMemory();
var stream = MemoryMarshal.TryGetArray(memory, out var arraySegment)
? new MemoryStream(arraySegment.Array, arraySegment.Offset, arraySegment.Count)
: new MemoryStream(memory.ToArray());

You will need to use unsafe code:
private static unsafe void DoUnsafeMemoryStuff<T>(ReadOnlyMemory<T> mem) where T : unmanaged
{
Stream stream;
MemoryHandle? memHandle = null;
GCHandle? byteHandle = null;
int itemSize = sizeof(T);
if (MemoryMarshal.TryGetArray(mem, out var arr))
{
byteHandle = GCHandle.Alloc(arr.Array, GCHandleType.Pinned);
int offsetBytes = arr.Offset * itemSize;
// arr.Count is already the length of the segment, so the offset is not subtracted from it
int totalBytes = arr.Count * itemSize;
stream = new UnmanagedMemoryStream((byte*)byteHandle.Value.AddrOfPinnedObject() + offsetBytes, totalBytes);
}
else
{
// common path, will be very fast to get the stream setup
memHandle = mem.Pin();
stream = new UnmanagedMemoryStream((byte*)memHandle.Value.Pointer, mem.Length * itemSize);
}
// use stream like any other stream - you will need to keep your memory object and the handle object alive until your stream usage is complete
try
{
// do stuff with stream
}
finally
{
// cleanup
if (memHandle.HasValue) { memHandle.Value.Dispose(); }
if (byteHandle.HasValue) { byteHandle.Value.Free(); }
}
}

Convert.ToBase64String throws 'System.OutOfMemoryException' for byte [] (file: large size)

I am trying to convert byte[] to base64 string format so that i can send that information to third party. My code as below:
byte[] ByteArray = System.IO.File.ReadAllBytes(path);
string base64Encoded = System.Convert.ToBase64String(ByteArray);
I am getting below error:
Exception of type 'System.OutOfMemoryException' was thrown. Can you
help me please ?

Update
I just spotted @PanagiotisKanavos' comment pointing to "Is there a Base64Stream for .NET?". This does essentially the same thing as my code below attempts to achieve (i.e. it allows you to process the file without holding the whole thing in memory at once), but uses a standard .Net library method for the job rather than carrying the overhead and risk of self-rolled code.
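For reference, a streaming approach along those lines can be sketched with ToBase64Transform and CryptoStream from System.Security.Cryptography, which encode the data as it is copied. The class and method names (Base64FileEncoder, EncodeFile) and both path parameters are placeholders for this example, not part of any library.

```csharp
using System.IO;
using System.Security.Cryptography;

public static class Base64FileEncoder
{
    // Hypothetical helper: streams inputPath through a Base64 transform into outputPath,
    // so neither the raw bytes nor the encoded text is held in memory all at once.
    public static void EncodeFile(string inputPath, string outputPath)
    {
        using (var input = File.OpenRead(inputPath))
        using (var output = File.Create(outputPath))
        using (var transform = new ToBase64Transform())
        using (var base64 = new CryptoStream(output, transform, CryptoStreamMode.Write))
        {
            // Disposing the CryptoStream flushes the final (padded) block.
            input.CopyTo(base64);
        }
    }
}
```

Unlike the WriteLine-based code in the original answer, the output here is one continuous Base64 string with no line breaks.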
Original
The below code will create a new temporary file containing the Base64 encoded version of your input file.
This should have a lower memory footprint, since rather than doing all data at once, we handle it several bytes at a time.
To avoid holding the output in memory, I've pushed that back to a temp file, which is returned. When you later need to use that data for some other process, you'd need to stream it (i.e. so that again you're not consuming all of this data at once).
You'll also notice that I've used WriteLine instead of Write; which will introduce non base64 encoded characters (i.e. the line breaks). That's deliberate, so that if you consume the temp file with a text reader you can easily process it line by line.
However, you can amend per your needs.
void Main()
{
var inputFilePath = @"c:\temp\bigfile.zip";
var convertedDataPath = ConvertToBase64TempFile(inputFilePath);
Console.WriteLine($"Take a look in {convertedDataPath} for your converted data");
}
//inputFilePath = where your source file can be found. This is not impacted by the below code
//bufferSizeInBytesDiv3 = how many bytes to read at a time (divided by 3); the larger this value, the more memory is required, but the better the performance. The Div3 part is because we later multiply this by 3, which ensures we never have to deal with remainders (i.e. since 3 bytes = 4 base64 chars)
public string ConvertToBase64TempFile(string inputFilePath, int bufferSizeInBytesDiv3 = 1024)
{
var tempFilePath = System.IO.Path.GetTempFileName();
using (var fileStream = File.Open(inputFilePath,FileMode.Open))
{
using (var reader = new BinaryReader(fileStream))
{
using (var writer = new StreamWriter(tempFilePath))
{
byte[] data;
while ((data = reader.ReadBytes(bufferSizeInBytesDiv3 * 3)).Length > 0)
{
writer.WriteLine(System.Convert.ToBase64String(data)); //NB: using WriteLine rather than Write; so when consuming this content consider removing line breaks (I've used this instead of write so you can easily stream the data in chunks later)
}
}
}
}
return tempFilePath;
}

C++ zlib inflate failing - translation of c# fixup?

I'm trying to inflate a string using zlib's deflate, but it's failing, apparently because it doesn't have the right header. I read elsewhere that the C# solution to this problem is:
public static byte[] FlateDecode(byte[] inp, bool strict) {
MemoryStream stream = new MemoryStream(inp);
InflaterInputStream zip = new InflaterInputStream(stream);
MemoryStream outp = new MemoryStream();
byte[] b = new byte[strict ? 4092 : 1];
try {
int n;
while ((n = zip.Read(b, 0, b.Length)) > 0) {
outp.Write(b, 0, n);
}
zip.Close();
outp.Close();
return outp.ToArray();
}
catch {
if (strict)
return null;
return outp.ToArray();
}
}
But I know nothing about C#. I can surmise that all it's doing is adding a prefix to the string, but what that prefix is, I have no idea. Would someone be able to phrase this function (or even just the header creation and string concatenation) in C++?
The data which I'm trying to inflate is taken from a PDF using zlib deflation.
Thanks a million,
Wyatt

I've had better luck using SharpZipLib for zlib interop than with the native .Net Framework classes. This correctly handles streams from C++ (zlib native) and from Java's compression classes without any funny business being needed.

I can't see any prefixes, sorry. Here's what the logic appears to be; sorry this isn't in C++:
MemoryStream stream = new MemoryStream(inp);
InflaterInputStream zip = new InflaterInputStream(stream);
Create an inflate stream from the data passed
MemoryStream outp = new MemoryStream();
Create a memory buffer stream for output
byte[] b = new byte[strict ? 4092 : 1];
try {
int n;
while ((n = zip.Read(b, 0, b.Length)) > 0) {
If you're in strict mode, read up to 4092 bytes - or 1 in non-strict mode - into a byte buffer
outp.Write(b, 0, n);
Write all the bytes decoded (may be less than the 4092) to the output memory buffer stream
zip.Close();
outp.Close();
return outp.ToArray();
Clean up, and return the output memory buffer stream as an array.
I'm a bit confused, though: why not just cut array b off at n elements and return that rather than go via a MemoryStream? The code also ought really to take care to clean up the memory streams and zip on exception (e.g. using using) since they're all IDisposable but I guess that's not really important since they don't correspond to I/O file handles, only memory structures.
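To illustrate the point about cleanup, here is a sketch of the same read-everything-and-return logic with using blocks. Note the assumptions: it swaps InflaterInputStream for ZLibStream from System.IO.Compression (available in .NET 6 and later), it assumes the input is zlib-wrapped deflate data, and it drops the strict flag, so it is not a drop-in replacement for the function in the question.

```csharp
using System.IO;
using System.IO.Compression;

public static class FlateDecoder
{
    // Sketch only: inflates zlib-wrapped data and returns the decoded bytes.
    public static byte[] FlateDecode(byte[] input)
    {
        using (var compressed = new MemoryStream(input))
        using (var zlib = new ZLibStream(compressed, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            // CopyTo performs the same read/write loop as the original code.
            zlib.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

All three streams are disposed even if decompression throws, which is what the manual Close calls in the original do not guarantee.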
