Convert List of bytes to memorystream without using ToArray() - c#

How can I save my list<byte> to MemoryStream() without using ToArray() or creating new array ?
This is my current method:
public Packet(List<byte> data)
{
// Create new stream from data buffer
using (Stream stream = new MemoryStream(data.ToArray()))
{
using (BinaryReader reader = new BinaryReader(stream))
{
Length = reader.ReadInt16();
pID = reader.ReadByte();
Result = reader.ReadByte();
Message = reader.ReadString();
ID = reader.ReadInt32();
}
}
}

The ToArray solution is the most efficient solution possible using documented APIs. MemoryStream will not copy the array. It will just store it. So the only copy is in List<T>.ToArray().
If you want to avoid that copy you need to pry List<T> open using reflection and access the backing array. I advise against that.
Instead, use a collection that allows you to obtain the backing array using legal means. Write your own, or use a MemoryStream in the first place.
A List<T> is not the most efficient way to move around bytes anyway. Storing them is fine, moving them usually has more overhead. For example, adding items bytewise will be far slower than a memcpy.

What about something like:
public Packet(List<byte> data)
{
using (Stream stream = new MemoryStream())
{
// Loop list and write out bytes
foreach(byte b in data)
stream.WriteByte(b);
// Reset stream position ready for read
stream.Seek(0, SeekOrigin.Begin);
using (BinaryReader reader = new BinaryReader(stream))
{
Length = reader.ReadInt16();
pID = reader.ReadByte();
Result = reader.ReadByte();
Message = reader.ReadString();
ID = reader.ReadInt32();
}
}
}
But why do you have a list in the first place? Can't you pass it into the method as a byte[] to start with? It'd be interesting to see how you populate that list.

Related

Decompressing gzipped ReadOnlyMemory<byte> before I do JsonDocument.Parse

The websocket client is returning a ReadOnlyMemory<byte>.
The issue is that JsonDocument.Parse fails due to the fact that the buffer has been compressed. I've got to decompress it somehow before I parse it. How do I do that? I cannot really change the websocket library code.
What I want is something like public Func<ReadOnlyMemory<byte>> DataInterpreterBytes = () => which optionally decompresses these bytes out of this class. How do I do that? Is it possible to decompress ReadOnlyMemory<byte> and if the handler is unused to basically to do nothing.
private static string DecompressData(byte[] byteData)
{
using var decompressedStream = new MemoryStream();
using var compressedStream = new MemoryStream(byteData);
using var deflateStream = new GZipStream(compressedStream, CompressionMode.Decompress);
deflateStream.CopyTo(decompressedStream);
decompressedStream.Position = 0;
using var streamReader = new StreamReader(decompressedStream);
return streamReader.ReadToEnd();
}
Snippet
private void OnMessageReceived(object? sender, MessageReceivedEventArgs e)
{
var timestamp = DateTime.UtcNow;
_logger.LogTrace("Message was received. {Message}", Encoding.UTF8.GetString(e.Message.Buffer.Span));
// We dispose that object later on
using var document = JsonDocument.Parse(e.Message.Buffer);
var tokenData = document.RootElement;
So, if you had a byte array, you'd do this:
private static JsonDocument DecompressData(byte[] byteData)
{
using var compressedStream = new MemoryStream(byteData);
using var deflateStream = new GZipStream(compressedStream, CompressionMode.Decompress);
return JsonDocument.Parse(deflateStream);
}
This is similar to your snippet above, but no need for the intermediate copy: just read straight from the GzipStream. JsonDocument.Parse also has an overload that takes a stream, so you can use that and avoid yet another useless copy.
Unfortunately, you don't have a byte array, you have a ReadOnlyMemory<byte>. There is no way out of the box to create a memory stream out of a ReadOnlyMemory<byte>. Honestly, it feels like an oversight, like they forgot to put that feature into .NET.
So here are your options instead.
The first option is to just convert the ReadOnlyMemory<byte> object to an array with ToArray():
// assuming e.Message.Buffer is a ReadOnlyMemory<byte>
using var document = DecompressData(e.Message.Buffer.ToArray());
This is really straightforward, but remember it actually copies the data, so for large documents it might not be a good idea if you want to avoid using too much memory.
The second is to try and extract the underlying array from the memory. This can be achieved with MemoryMarshal.TryGetArray, which gives you an ArraySegment (but might fail if the memory isn't actually a managed array).
private static JsonDocument DecompressData(ReadOnlyMemory<byte> byteData)
{
if(MemoryMarshal.TryGetArray(byteData, out var segment))
{
using var compressedStream = new MemoryStream(segment.Array, segment.Offset, segment.Count);
// rest of the code goes here
}
else
{
// Welp, this memory isn't actually an array, so... tough luck?
}
}
The third way might feel dirty, but if you're okay with using unsafe code, you can just pin the memory's span and then use UnmanagedMemoryStream:
private static unsafe JsonDocument DecompressData(ReadOnlyMemory<byte> byteData)
{
fixed (byte* ptr = byteData.Span)
{
using var compressedStream = new UnmanagedMemoryStream(ptr, byteData.Length);
using var deflateStream = new GZipStream(compressedStream, CompressionMode.Decompress);
return JsonDocument.Parse(deflateStream);
}
}
The other solution is to write your own Stream class that supports this. The Windows Community Toolkit has an extension method that returns a Stream wrapper around the memory object. If you're not okay with using an entire third party library just for that, you can probably just roll your own, it's not that much code.

ToArray() function limitation

I am using the .ToArray() method to convert my string to char array whose size i have kept char[] buffer = new char[1000000]; but when I am using the following code:
using (StreamReader streamReader = new StreamReader(path1))
{
buffer = streamReader.ReadToEnd().ToCharArray();
}
// buffer = result.ToArray();
threadfunc(data_path1);
The size of the buffer getting fixed up to 8190, even it is not reading the whole file after using .ToCharArray() or .ToArray().
What is the reason for this does .ToCharArray() or .ToArray() have size limitations? As if I do not use this function I'm able to read whole file in string format, but when trying to convert it into char array by using this function I am getting size limitations.
My guess is the problem is that read to end should finish before you call the ToCharArray(). This might help you. You don't need to define buffer since ToCharArray() creates a new instance of char[] itself.
string content;
using (StreamReader streamReader = new StreamReader(path1))
{
content = streamReader.ReadToEnd();
}
var buffer = content.ToCharArray();
ToCharArray() returns new instance of of array. So your buffer will refer to the new instance which is the size of data returned by ReadToEnd.
If you want keep buffer same size just add new array to the existed one
char[] buffer = new char[1000000];
using (StreamReader streamReader = new StreamReader(path1))
{
var tempArray = streamReader.ReadToEnd().ToCharArray();
tempArray.CopyTo(buffer, 0);
}
If you want just use the result array - you don't need to "predict" the size of array - just use returned one
public char[] GetArrayFromFile(string pathToFile)
{
using (StreamReader streamReader = new StreamReader(path1))
{
var data = streamReader.ReadToEnd();
}
return data.ToCharArray();
}
var arrayFromFile = GetArrayFromFile(#"..\path.file");
You are probably using incorrect encoding. By default StreamReader(String) uses UTF8 encoding:
The complete file path is specified by the path parameter. This
constructor initializes the encoding to UTF8Encoding and the buffer
size to 1024 bytes.
Don't pre-allocate the buffer size, unless you have a specific need.
If your file is in ASCII format, you need to update your StreamReader constructor:
char[] buffer = null;
using (StreamReader streamReader = new StreamReader(path1, Encoding.ASCII))
{
buffer = streamReader.ReadToEnd().ToCharArray();
}
// buffer = result.ToArray();
threadfunc(data_path1);
Does your file contain binary data? If it contains EOF character and the stream is opened in text mode (which StreamReader does), that character will signal end of file, even if it is not actually the end of the file.
I can reproduce this by reading random .exe files in text mode.

GZipStream - write not writing all compressed data even with flush?

I've got a pesky problem with gzipstream targeting .Net 3.5. This is my first time working with gzipstream, however I have modeled after a number of tutorials including here and I'm still stuck.
My app serializes a datatable to xml and inserts into a database, storing the compressed data into a varbinary(max) field as well as the original length of the uncompressed buffer. Then, when I need it, I retrieve this data and decompress it and recreates the datatable. The decompress is what seems to fail.
EDIT: Sadly after changing the GetBuffer to ToArray as suggested, my issue remains. Code Updated below
Compress code:
DataTable dt = new DataTable("MyUnit");
//do stuff with dt
//okay... now compress the table
using (MemoryStream xmlstream = new MemoryStream())
{
//instead of stream, use xmlwriter?
System.Xml.XmlWriterSettings settings = new System.Xml.XmlWriterSettings();
settings.Encoding = Encoding.GetEncoding(1252);
settings.Indent = false;
System.Xml.XmlWriter writer = System.Xml.XmlWriter.Create(xmlstream, settings);
try
{
dt.WriteXml(writer);
writer.Flush();
}
catch (ArgumentException)
{
//likely an encoding issue... okay, base64 encode it
var base64 = Convert.ToBase64String(xmlstream.ToArray());
xmlstream.Write(Encoding.GetEncoding(1252).GetBytes(base64), 0, Encoding.GetEncoding(1252).GetBytes(base64).Length);
}
using (MemoryStream zipstream = new MemoryStream())
{
GZipStream zip = new GZipStream(zipstream, CompressionMode.Compress);
log.DebugFormat("Compressing commands...");
zip.Write(xmlstream.GetBuffer(), 0, xmlstream.ToArray().Length);
zip.Flush();
float ratio = (float)zipstream.ToArray().Length / (float)xmlstream.ToArray().Length;
log.InfoFormat("Resulting compressed size is {0:P2} of original", ratio);
using (SqlCommand cmd = new SqlCommand())
{
cmd.CommandText = "INSERT INTO tinydup (lastid, command, compressedlength) VALUES (#lastid,#compressed,#length)";
cmd.Connection = db;
cmd.Parameters.Add("#lastid", SqlDbType.Int).Value = lastid;
cmd.Parameters.Add("#compressed", SqlDbType.VarBinary).Value = zipstream.ToArray();
cmd.Parameters.Add("#length", SqlDbType.Int).Value = xmlstream.ToArray().Length;
cmd.ExecuteNonQuery();
}
}
Decompress Code:
/* This is an encapsulation of what I get from the database
public class DupUnit{
public uint lastid;
public uint complength;
public byte[] compressed;
}*/
//I have already retrieved my list of work to do from the database in a List<Dupunit> dupunits
foreach (DupUnit unit in dupunits)
{
DataSet ds = new DataSet();
//DataTable dt = new DataTable();
//uncompress and extract to original datatable
try
{
using (MemoryStream zipstream = new MemoryStream(unit.compressed))
{
GZipStream zip = new GZipStream(zipstream, CompressionMode.Decompress);
byte[] xmlbits = new byte[unit.complength];
//WHY ARE YOU ALWAYS 0!!!!!!!!
int bytesdecompressed = zip.Read(xmlbits, 0, unit.compressed.Length);
MemoryStream xmlstream = new MemoryStream(xmlbits);
log.DebugFormat("Uncompressed XML against {0} is: {1}", m_source.DSN, Encoding.GetEncoding(1252).GetString(xmlstream.ToArray()));
try{
ds.ReadXml(xmlstream);
}catch(Exception)
{
//it may have been base64 encoded... decode first.
ds.ReadXml(Encoding.GetEncoding(1254).GetString(
Convert.FromBase64String(
Encoding.GetEncoding(1254).GetString(xmlstream.ToArray())))
);
}
xmlstream.Dispose();
}
}
catch (Exception e)
{
log.Error(e);
Thread.Sleep(1000);//sleep a sec!
continue;
}
Note the comment above... bytesdecompressed is always 0. Any ideas? Am I doing it wrong?
EDIT 2:
So this is weird. I added the following debug code to the decompression routine:
GZipStream zip = new GZipStream(zipstream, CompressionMode.Decompress);
byte[] xmlbits = new byte[unit.complength];
int offset = 0;
while (zip.CanRead && offset < xmlbits.Length)
{
while (zip.Read(xmlbits, offset, 1) == 0) ;
offset++;
}
When debugging, sometimes that loop would complete, but other times it would hang. When I'd stop the debugging, it would be at byte 1600 out of 1616. I'd continue, but it wouldn't move at all.
EDIT 3: The bug appears to be in the compress code. For whatever reason, it is not saving all of the data. When I try to decompress the data using a third party gzip mechanism, I only get part of the original data.
I'd start a bounty, but I really don't have much reputation to give as of now :-(
Finally found the answer. The compressed data wasn't complete because GZipStream.Flush() does absolutely nothing to ensure that all of the data is out of the buffer - you need to use GZipStream.Close() as pointed out here. Of course, if you get a bad compress, it all goes downhill - if you try to decompress it, you will always get 0 returned from the Read().
I'd say this line, at least, is the most wrong:
cmd.Parameters.Add("#compressed", SqlDbType.VarBinary).Value = zipstream.GetBuffer();
MemoryStream.GetBuffer:
Note that the buffer contains allocated bytes which might be unused. For example, if the string "test" is written into the MemoryStream object, the length of the buffer returned from GetBuffer is 256, not 4, with 252 bytes unused. To obtain only the data in the buffer, use the ToArray method.
It should be noted that in the zip format, it first works by locating data stored at the end of the file - so if you've stored more data than was required, the required entries at the "end" of the file don't exist.
As an aside, I'd also recommend a different name for your compressedlength column - I'd initially taken it (despite your narrative) as being intended to store, well, the length of the compressed data (and written part of my answer to address that). Maybe originalLength would be a better name?

Copy MemoryStream to FileStream and save the file?

I don't understand what I'm doing wrong here. I generate couple of memory streams and in debug-mode I see that they are populated. But when I try to copy MemoryStream to FileStream in order to save the file fileStream is not populated and file is 0bytes long (empty).
Here is my code
if (file.ContentLength > 0)
{
var bytes = ImageUploader.FilestreamToBytes(file); // bytes is populated
using (var inStream = new MemoryStream(bytes)) // inStream is populated
{
using (var outStream = new MemoryStream())
{
using (var imageFactory = new ImageFactory())
{
imageFactory.Load(inStream)
.Resize(new Size(320, 0))
.Format(ImageFormat.Jpeg)
.Quality(70)
.Save(outStream);
}
// outStream is populated here
var fileName = "test.jpg";
using (var fileStream = new FileStream(Server.MapPath("~/content/u/") + fileName, FileMode.CreateNew, FileAccess.ReadWrite))
{
outStream.CopyTo(fileStream); // fileStream is not populated
}
}
}
}
You need to reset the position of the stream before copying.
outStream.Position = 0;
outStream.CopyTo(fileStream);
You used the outStream when saving the file using the imageFactory. That function populated the outStream. While populating the outStream the position is set to the end of the populated area. That is so that when you keep on writing bytes to the steam, it doesn't override existing bytes. But then to read it (for copy purposes) you need to set the position to the start so you can start reading at the start.
If your objective is simply to dump the memory stream to a physical file (e.g. to look at the contents) - it can be done in one move:
System.IO.File.WriteAllBytes(#"C:\\filename", memoryStream.ToArray());
No need to set the stream position first either, since the .ToArray() operation explicitly ignores that, as per #BaconBits comment below https://learn.microsoft.com/en-us/dotnet/api/system.io.memorystream.toarray?view=netframework-4.7.2.
Another alternative to CopyTo is WriteTo.
Advantage:
No need to reset Position.
Usage:
outStream.WriteTo(fileStream);
Function Description:
Writes the entire contents of this memory stream to another stream.

How to use ICSharpCode.ZipLib with stream?

I'm very sorry for the conservative title and my question itself,but I'm lost.
The samples provided with ICsharpCode.ZipLib doesn't include what I'm searching for.
I want to decompress a byte[] by putting it in InflaterInputStream(ICSharpCode.SharpZipLib.Zip.Compression.Streams.InflaterInputStream)
I found a decompress function ,but it doesn't work.
public static byte[] Decompress(byte[] Bytes)
{
ICSharpCode.SharpZipLib.Zip.Compression.Streams.InflaterInputStream stream =
new ICSharpCode.SharpZipLib.Zip.Compression.Streams.InflaterInputStream(new MemoryStream(Bytes));
MemoryStream memory = new MemoryStream();
byte[] writeData = new byte[4096];
int size;
while (true)
{
size = stream.Read(writeData, 0, writeData.Length);
if (size > 0)
{
memory.Write(writeData, 0, size);
}
else break;
}
stream.Close();
return memory.ToArray();
}
It throws an exception at line(size = stream.Read(writeData, 0, writeData.Length);) saying it has a invalid header.
My question is not how to fix the function,this function is not provided with the library,I just found it googling.My question is,how to decompress the same way the function does with InflaterStream,but without exceptions.
Thanks and again - sorry for the conservative question.
the code in lucene is very nice.
public static byte[] Compress(byte[] input) {
// Create the compressor with highest level of compression
Deflater compressor = new Deflater();
compressor.SetLevel(Deflater.BEST_COMPRESSION);
// Give the compressor the data to compress
compressor.SetInput(input);
compressor.Finish();
/*
* Create an expandable byte array to hold the compressed data.
* You cannot use an array that's the same size as the orginal because
* there is no guarantee that the compressed data will be smaller than
* the uncompressed data.
*/
MemoryStream bos = new MemoryStream(input.Length);
// Compress the data
byte[] buf = new byte[1024];
while (!compressor.IsFinished) {
int count = compressor.Deflate(buf);
bos.Write(buf, 0, count);
}
// Get the compressed data
return bos.ToArray();
}
public static byte[] Uncompress(byte[] input) {
Inflater decompressor = new Inflater();
decompressor.SetInput(input);
// Create an expandable byte array to hold the decompressed data
MemoryStream bos = new MemoryStream(input.Length);
// Decompress the data
byte[] buf = new byte[1024];
while (!decompressor.IsFinished) {
int count = decompressor.Inflate(buf);
bos.Write(buf, 0, count);
}
// Get the decompressed data
return bos.ToArray();
}
Well it sounds like the data is just inappropriate, and that otherwise the code would work okay. (Admittedly I'd use a "using" statement for the streams instead of calling Close explicitly.)
Where did you get your data from?
Why don't you use the System.IO.Compression.DeflateStream class (available since .Net 2.0)? This uses the same compression/decompression method but doesn't require an extra library dependency.
Since .Net 2.0 you only need the ICSharpCode.ZipLib if you need the file container support.

Categories

Resources