Read an array of structs in C#

I've seen here, and also by googling for "marshal", several ways to convert a byte array to a struct.
But what I'm looking for is a way to read an array of structs from a file (or any other in-memory input) in one step.
Loading an array of structs from a file normally takes more CPU time (one read per field using a BinaryReader) than I/O time. Is there any workaround?
I'm trying to load about 400K structs from a file as fast as possible.
Thanks
pablo

Following URL may be of interest to you.
http://www.codeproject.com/KB/files/fastbinaryfileinput.aspx
Otherwise, I'd suggest something like the following sketch:
read the binary data in a single shot and convert it back to an array of structures.
public struct YourStruct
{
public int First;
public long Second;
public double Third;
}
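static unsafe byte[] YourStructToBytes( YourStruct[] s, int arrayLen )
{
byte[] arr = new byte[ sizeof(YourStruct) * arrayLen ];
fixed( byte* parr = arr )
{
// View the pinned byte buffer as an array of structs and copy element by element.
YourStruct* dst = (YourStruct*)parr;
for( int i = 0; i < arrayLen; i++ )
dst[i] = s[i];
}
return arr;
}
static unsafe YourStruct[] BytesToYourStruct( byte[] arr, int arrayLen )
{
if( arr.Length < (sizeof(YourStruct)*arrayLen) )
throw new ArgumentException();
YourStruct[] s = new YourStruct[arrayLen];
fixed( byte* parr = arr )
{
// Reinterpret the raw bytes as structs and copy them into the result array.
YourStruct* src = (YourStruct*)parr;
for( int i = 0; i < arrayLen; i++ )
s[i] = src[i];
}
return s;
}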
Now you can read the byte array from the file in a single shot and convert it back to structures using BytesToYourStruct.
Hope you can implement this idea and check...
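On newer runtimes the same idea can be written without unsafe code. The following is only a sketch, assuming .NET Core 2.1 or later (where Stream has Span overloads) and a struct that contains no references; the class name StructArrayIO is made up for illustration. It reinterprets the struct array as bytes with MemoryMarshal.AsBytes and lets the stream fill it directly, so there is no per-field read and no intermediate byte[].

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public struct YourStruct
{
    public int First;
    public long Second;
    public double Third;
}

// Hypothetical helper, not part of any library.
public static class StructArrayIO
{
    // Writes the raw memory of the array, including any padding bytes.
    public static void Write(Stream stream, YourStruct[] items)
    {
        stream.Write(MemoryMarshal.AsBytes(items.AsSpan()));
    }

    // Reads 'count' structs by filling the array's memory directly.
    public static YourStruct[] Read(Stream stream, int count)
    {
        var items = new YourStruct[count];
        Span<byte> bytes = MemoryMarshal.AsBytes(items.AsSpan());
        int total = 0;
        while (total < bytes.Length)
        {
            // Stream.Read may return fewer bytes than requested, so loop.
            int n = stream.Read(bytes.Slice(total));
            if (n == 0) throw new EndOfStreamException();
            total += n;
        }
        return items;
    }
}
```

The file produced this way mirrors the in-memory layout (padding and host byte order included), so it is only suitable when the same struct definition is used for writing and reading.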

I found a potential solution at this site -
http://www.eggheadcafe.com/software/aspnet/32846931/writingreading-an-array.aspx
It says basically to use Binary Formatter like this:
FileStream fs = new FileStream("DataFile.dat", FileMode.Create);
BinaryFormatter formatter = new BinaryFormatter();
formatter.Serialize(fs, somestruct);
I also found two questions from this site - Reading a C/C++ data structure in C# from a byte array
and
How to marshal an array of structs - (.Net/C# => C++)
I haven't done this before, being a C# .NET beginner myself. I hope this solution helps.

Related

Converting byte[] to an object

I'm trying to convert an object which I have in a byte[] to an object.
I've tried using this code I found online:
object byteArrayToObject(byte[] bytes)
{
try
{
MemoryStream ms = new MemoryStream(bytes);
BinaryFormatter bf = new BinaryFormatter();
//ms.Position = 0;
return bf.Deserialize(ms,null);
}
catch
{
return null;
}
}
I get a SerializationException: "End of Stream encountered before parsing was completed.".
I've tried it with the ms.Position = 0 line uncommented of course too...
bytes[] is only 8 bytes long, each byte isn't null.
Suggestions?
[edit]
The byte[] was written to a binary file from a c++ program using something along the lines of
template <typename T>
void WriteToFile (std::ostream& file, T* value)
{
file.write(reinterpret_cast<char*>(value), sizeof(T));
}
Where value may be a number of different types.
I can cast to some objects okay from the file using BitConverter, but anything BitConverter doesn't cover I can't do..
As was stated by cdhowie, you will need to manually deserialize the encoded data. Based on the limited information available, you may either want an array of objects or an object containing an array. It looks like you have a single long but there is no way to know from your code. You will need to recreate your object in its true form so take the below myLong as a simple example for a single long array. Since it was unspecified I'll assume you want a struct containing an array like:
public struct myLong {
public long[] value;
}
You could do the same thing with an array of structs, or classes with minor changes to the code posted below.
Your method will be something like this: (written in the editor)
private myLong byteArrayToObject(byte[] bytes) {
try
{
int len = sizeof(long);
myLong data = new myLong();
data.value = new long[bytes.Length / len];
int byteindex = 0;
for (int i = 0; i < data.value.Length; i++) {
data.value[i] = BitConverter.ToInt64(bytes,byteindex);
byteindex += len;
}
return data;
}
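catch
{
// myLong is a struct, so null is not a valid return value; return an empty instance instead.
return default(myLong);
}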
}
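BitConverter uses the byte order of the machine it runs on. If the C++ writer and the C# reader might run on different architectures, the byte order can be made explicit instead. The sketch below assumes the file holds little-endian 64-bit integers (that is an assumption, the question does not say) and uses BinaryPrimitives from System.Buffers.Binary; the class name RawLongs is made up.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper for illustration only.
public static class RawLongs
{
    // Decodes a buffer of little-endian 64-bit integers.
    public static long[] Decode(byte[] bytes)
    {
        if (bytes.Length % sizeof(long) != 0)
            throw new ArgumentException("Length must be a multiple of 8.", nameof(bytes));

        var result = new long[bytes.Length / sizeof(long)];
        for (int i = 0; i < result.Length; i++)
        {
            // Each value occupies 8 consecutive bytes.
            result[i] = BinaryPrimitives.ReadInt64LittleEndian(
                bytes.AsSpan(i * sizeof(long), sizeof(long)));
        }
        return result;
    }
}
```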

Reading a struct containing character array

I have the following struct:
unsafe struct Locomotive
{
public fixed char locotype[6];
public int roadno,HP;
}
I have successfully written this to a binary file. Here's the code:
Locomotive l1 = new Locomotive();
for (int i = 0; i <= 5; i++)
{
l1.locotype[i] = textBox1.Text[i];
}
l1.roadno = int.Parse(textBox2.Text);
l1.HP = int.Parse(textBox3.Text);
BinaryWriter bw = new BinaryWriter(File.Open(@"C:\Documents and Settings\Ruchir Sharma\Desktop\Locodata.bin", FileMode.Append));
IntPtr ip = Marshal.AllocHGlobal(Marshal.SizeOf(l1));
Marshal.StructureToPtr(l1, ip, true);
Byte[] b1 = new Byte[Marshal.SizeOf(l1)];
Marshal.Copy(ip, b1, 0, b1.Length - 1);
bw.Write(b1);
MessageBox.Show("Data written successfully");
Marshal.FreeHGlobal(ip);
bw.Close();
Now, while reading this struct back, the character array, i.e. locotype[6], is giving me a problem. I tried the method BinaryReader.ReadChars(), but it didn't work for me. Please help me read this struct.
Your "read" code should be the reverse of your "write" code. You didn't write it with WriteChars, so don't use ReadChars to read it. You should use ReadBytes then Marshal.Copy and PtrToStructure.
Frankly, though, this level of "unsafe" (fixed buffers in structs, Marshal, etc) is very rare and specialized - I worry you might be over-engineering this.
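The "reverse of the write code" advice can be sketched as a pair of generic helpers. This is only an illustration: StructBytes and Pair are made-up names, and Pair stands in for the real struct. In the actual program the byte[] passed to FromBytes would come from BinaryReader.ReadBytes(Marshal.SizeOf(typeof(Locomotive))). Note also that the write code in the question copies b1.Length - 1 bytes, so the last byte of each record is left as zero; the sketch copies the full size.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-in for the struct being stored.
[StructLayout(LayoutKind.Sequential)]
public struct Pair
{
    public int A;
    public int B;
}

// Hypothetical helper, not part of any library.
public static class StructBytes
{
    public static byte[] ToBytes<T>(T value) where T : struct
    {
        int size = Marshal.SizeOf<T>();
        byte[] bytes = new byte[size];
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            // false: the freshly allocated block holds nothing to release.
            Marshal.StructureToPtr(value, ptr, false);
            Marshal.Copy(ptr, bytes, 0, size); // copy all 'size' bytes
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
        return bytes;
    }

    public static T FromBytes<T>(byte[] bytes) where T : struct
    {
        int size = Marshal.SizeOf<T>();
        if (bytes.Length < size)
            throw new ArgumentException("Buffer is too small.", nameof(bytes));
        IntPtr ptr = Marshal.AllocHGlobal(size);
        try
        {
            Marshal.Copy(bytes, 0, ptr, size);
            return Marshal.PtrToStructure<T>(ptr);
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
    }
}
```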

Better ways of improving code serialization speed

I have the following code that serializes a List to a byte array for transport via Web Services. The code works relatively fast on smaller entities, but this is a list of 60,000 or so items. It takes several seconds to execute the formatter.Serialize method. Any way to speed this up?
public static byte[] ToBinary(Object objToBinary)
{
using (MemoryStream memStream = new MemoryStream())
{
BinaryFormatter formatter = new BinaryFormatter(null, new StreamingContext(StreamingContextStates.Clone));
formatter.Serialize(memStream, objToBinary);
memStream.Seek(0, SeekOrigin.Begin);
return memStream.ToArray();
}
}
The inefficiency you're experiencing comes from several sources:
The default serialization routine uses reflection to enumerate object fields and get their values.
The binary serialization format stores things in associative lists keyed by the string names of the fields.
You've got a spurious ToArray in there (as Danny mentioned).
You can get a pretty big improvement off the bat by implementing ISerializable on the object type that is contained in your List. That will cut out the default serialization behavior that uses reflection.
You can get a little more speed if you cut down the number of elements in the associative array that holds the serialized data. Make sure the elements you do store in that associative array are primitive types.
Finally, you can eliminate the ToArray but I doubt you'll even notice the bump that gives you.
If you want some real serialization speed, consider using protobuf-net, which is the C# version of Google's Protocol Buffers. It's supposed to be an order of magnitude faster than BinaryFormatter.
It would probably be much faster to serialize the entire array (or collection) of 60,000 items in one shot, into a single large byte[] array, instead of in separate chunks. Is having each of the individual objects be represented by its own byte[] array a requirement of other parts of the system you're working within? Also, are the actual Type's of the objects known? If you were using a specific Type (maybe some common base class to all of these 60,000 objects) then the framework would not have to do as much casting and searching for your prebuilt serialization assemblies. Right now you're only giving it Object.
.ToArray() creates a new array; it may be more efficient to copy the data to an existing array using unsafe methods (such as accessing the stream's memory using fixed, then copying the memory using MemCopy() via DllImport).
Also consider using a faster custom formatter.
I started a code-generator project that includes a binary DataContract serializer that beats at least Json.NET by a factor of 30. All you need are the generator NuGet package and an additional lib that comes with faster replacements of BitConverter.
Then you create a partial class and decorate it with DataContract and each serializable property with DataMember. The generator will then create a ToBytes-method and together with the additional lib you can serialize collections as well. Look at my example from this post:
var objects = new List<Td>();
for (int i = 0; i < 1000; i++)
{
var obj = new Td
{
Message = "Hello my friend",
Code = "Some code that can be put here",
StartDate = DateTime.Now.AddDays(-7),
EndDate = DateTime.Now.AddDays(2),
Cts = new List<Ct>(),
Tes = new List<Te>()
};
for (int j = 0; j < 10; j++)
{
obj.Cts.Add(new Ct { Foo = i * j });
obj.Tes.Add(new Te { Bar = i + j });
}
objects.Add(obj);
}
With this generated ToBytes() method:
public int Size
{
get
{
var size = 24;
// Add size for collections and strings
size += Cts == null ? 0 : Cts.Count * 4;
size += Tes == null ? 0 : Tes.Count * 4;
size += Code == null ? 0 : Code.Length;
size += Message == null ? 0 : Message.Length;
return size;
}
}
public byte[] ToBytes(byte[] bytes, ref int index)
{
if (index + Size > bytes.Length)
throw new ArgumentOutOfRangeException("index", "Object does not fit in array");
// Convert Cts
// Two bytes length information for each dimension
GeneratorByteConverter.Include((ushort)(Cts == null ? 0 : Cts.Count), bytes, ref index);
if (Cts != null)
{
for(var i = 0; i < Cts.Count; i++)
{
var value = Cts[i];
value.ToBytes(bytes, ref index);
}
}
// Convert Tes
// Two bytes length information for each dimension
GeneratorByteConverter.Include((ushort)(Tes == null ? 0 : Tes.Count), bytes, ref index);
if (Tes != null)
{
for(var i = 0; i < Tes.Count; i++)
{
var value = Tes[i];
value.ToBytes(bytes, ref index);
}
}
// Convert Code
GeneratorByteConverter.Include(Code, bytes, ref index);
// Convert Message
GeneratorByteConverter.Include(Message, bytes, ref index);
// Convert StartDate
GeneratorByteConverter.Include(StartDate.ToBinary(), bytes, ref index);
// Convert EndDate
GeneratorByteConverter.Include(EndDate.ToBinary(), bytes, ref index);
return bytes;
}
It serializes each object in ~1.5 microseconds, i.e. 1000 objects in 1.7 ms.

How to convert an object to a byte array in C#

I have a collection of objects that I need to write to a binary file.
I need the bytes in the file to be compact, so I can't use BinaryFormatter. BinaryFormatter throws in all sorts of info for deserialization needs.
If I try
byte[] myBytes = (byte[]) myObject
I get a runtime exception.
I need this to be fast so I'd rather not be copying arrays of bytes around. I'd just like the cast byte[] myBytes = (byte[]) myObject to work!
OK just to be clear, I cannot have any metadata in the output file. Just the object bytes. Packed object-to-object. Based on answers received, it looks like I'll be writing low-level Buffer.BlockCopy code. Perhaps using unsafe code.
To convert an object to a byte array:
// Convert an object to a byte array
public static byte[] ObjectToByteArray(Object obj)
{
BinaryFormatter bf = new BinaryFormatter();
using (var ms = new MemoryStream())
{
bf.Serialize(ms, obj);
return ms.ToArray();
}
}
You just need to copy this function into your code and pass it the object that you need to convert to a byte array. If you need to convert the byte array back to an object, you can use the function below:
// Convert a byte array to an Object
public static Object ByteArrayToObject(byte[] arrBytes)
{
using (var memStream = new MemoryStream())
{
var binForm = new BinaryFormatter();
memStream.Write(arrBytes, 0, arrBytes.Length);
memStream.Seek(0, SeekOrigin.Begin);
var obj = binForm.Deserialize(memStream);
return obj;
}
}
You can use these functions with custom classes. You just need to add the [Serializable] attribute to your class to enable serialization.
If you want the serialized data to be really compact, you can write serialization methods yourself. That way you will have a minimum of overhead.
Example:
public class MyClass {
public int Id { get; set; }
public string Name { get; set; }
public byte[] Serialize() {
using (MemoryStream m = new MemoryStream()) {
using (BinaryWriter writer = new BinaryWriter(m)) {
writer.Write(Id);
writer.Write(Name);
}
return m.ToArray();
}
}
public static MyClass Deserialize(byte[] data) {
MyClass result = new MyClass();
using (MemoryStream m = new MemoryStream(data)) {
using (BinaryReader reader = new BinaryReader(m)) {
result.Id = reader.ReadInt32();
result.Name = reader.ReadString();
}
}
return result;
}
}
Well a cast from myObject to byte[] is never going to work unless you've got an explicit conversion or if myObject is a byte[]. You need a serialization framework of some kind. There are plenty out there, including Protocol Buffers which is near and dear to me. It's pretty "lean and mean" in terms of both space and time.
You'll find that almost all serialization frameworks have significant restrictions on what you can serialize, however - Protocol Buffers more than some, due to being cross-platform.
If you can give more requirements, we can help you out more - but it's never going to be as simple as casting...
EDIT: Just to respond to this:
"I need my binary file to contain the object's bytes. Only the bytes, no metadata whatsoever. Packed object-to-object. So I'll be implementing custom serialization."
Please bear in mind that the bytes in your objects are quite often references... so you'll need to work out what to do with them.
I suspect you'll find that designing and implementing your own custom serialization framework is harder than you imagine.
I would personally recommend that if you only need to do this for a few specific types, you don't bother trying to come up with a general serialization framework. Just implement an instance method and a static method in all the types you need:
public void WriteTo(Stream stream)
public static WhateverType ReadFrom(Stream stream)
One thing to bear in mind: everything becomes more tricky if you've got inheritance involved. Without inheritance, if you know what type you're starting with, you don't need to include any type information. Of course, there's also the matter of versioning - do you need to worry about backward and forward compatibility with different versions of your types?
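As a rough illustration of the WriteTo / ReadFrom pattern suggested above, here is a minimal sketch for a made-up type (Point3 is hypothetical, as is its field layout). It writes only the field values, with no type information or other metadata, and leaves the stream open so that several objects can be packed back to back.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical type used only to illustrate the pattern.
public sealed class Point3
{
    public int X;
    public int Y;
    public int Z;

    // Writes exactly 12 bytes: the three fields, nothing else.
    public void WriteTo(Stream stream)
    {
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, true)) // true = leave stream open
        {
            writer.Write(X);
            writer.Write(Y);
            writer.Write(Z);
        }
    }

    // Reads the fields back in the same order they were written.
    public static Point3 ReadFrom(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, true))
        {
            var result = new Point3();
            result.X = reader.ReadInt32();
            result.Y = reader.ReadInt32();
            result.Z = reader.ReadInt32();
            return result;
        }
    }
}
```

Because no type or version marker is stored, the reader has to know in advance which type comes next, which is exactly the trade-off described above.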
I took Crystalonics' answer and turned them into extension methods. I hope someone else will find them useful:
public static byte[] SerializeToByteArray(this object obj)
{
if (obj == null)
{
return null;
}
var bf = new BinaryFormatter();
using (var ms = new MemoryStream())
{
bf.Serialize(ms, obj);
return ms.ToArray();
}
}
public static T Deserialize<T>(this byte[] byteArray) where T : class
{
if (byteArray == null)
{
return null;
}
using (var memStream = new MemoryStream())
{
var binForm = new BinaryFormatter();
memStream.Write(byteArray, 0, byteArray.Length);
memStream.Seek(0, SeekOrigin.Begin);
var obj = (T)binForm.Deserialize(memStream);
return obj;
}
}
Use of BinaryFormatter is now considered unsafe; see the Microsoft docs.
Just use System.Text.Json:
To serialize to bytes:
JsonSerializer.SerializeToUtf8Bytes(obj);
To deserialize to your type:
JsonSerializer.Deserialize<MyType>(byteArray);
You are really talking about serialization, which can take many forms. Since you want small and binary, protocol buffers may be a viable option - giving version tolerance and portability as well. Unlike BinaryFormatter, the protocol buffers wire format doesn't include all the type metadata; just very terse markers to identify data.
In .NET there are a few implementations; in particular
protobuf-net
dotnet-protobufs
I'd humbly argue that protobuf-net (which I wrote) allows more .NET-idiomatic usage with typical C# classes ("regular" protocol-buffers tends to demand code-generation); for example:
[ProtoContract]
public class Person {
[ProtoMember(1)]
public int Id {get;set;}
[ProtoMember(2)]
public string Name {get;set;}
}
....
Person person = new Person { Id = 123, Name = "abc" };
Serializer.Serialize(destStream, person);
...
Person anotherPerson = Serializer.Deserialize<Person>(sourceStream);
This worked for me:
byte[] bfoo = (byte[])foo;
foo is an Object that I'm 100% certain that is a byte array.
The method that worked best for me uses Newtonsoft.Json:
public TData ByteToObj<TData>(byte[] arr){
return JsonConvert.DeserializeObject<TData>(Encoding.UTF8.GetString(arr));
}
public byte[] ObjToByte<TData>(TData data){
var json = JsonConvert.SerializeObject(data);
return Encoding.UTF8.GetBytes(json);
}
Take a look at Serialization, a technique to "convert" an entire object to a byte stream. You may send it to the network or write it into a file and then restore it back to an object later.
To access the memory of an object directly (to do a "core dump") you'll need to head into unsafe code.
If you want something more compact than BinaryWriter or a raw memory dump will give you, then you need to write some custom serialisation code that extracts the critical information from the object and packs it in an optimal way.
edit
P.S. It's very easy to wrap the BinaryWriter approach into a DeflateStream to compress the data, which will usually roughly halve the size of the data.
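The DeflateStream wrapping mentioned in the P.S. might look like the sketch below. The payload (a length-prefixed int array) and the class name CompressedInts are assumptions made for illustration; how much the data shrinks depends entirely on its content.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Hypothetical helper for illustration only.
public static class CompressedInts
{
    public static byte[] Write(int[] values)
    {
        using (var ms = new MemoryStream())
        {
            // leaveOpen = true so the MemoryStream survives disposal of the wrappers.
            using (var deflate = new DeflateStream(ms, CompressionMode.Compress, true))
            using (var writer = new BinaryWriter(deflate))
            {
                writer.Write(values.Length);
                foreach (int v in values)
                    writer.Write(v);
            }
            // The compressed data is complete once the DeflateStream is disposed.
            return ms.ToArray();
        }
    }

    public static int[] Read(byte[] data)
    {
        using (var ms = new MemoryStream(data))
        using (var deflate = new DeflateStream(ms, CompressionMode.Decompress))
        using (var reader = new BinaryReader(deflate))
        {
            int count = reader.ReadInt32();
            var values = new int[count];
            for (int i = 0; i < count; i++)
                values[i] = reader.ReadInt32();
            return values;
        }
    }
}
```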
I believe what you're trying to do is impossible.
The junk that BinaryFormatter creates is necessary to recover the object from the file after your program stopped.
However it is possible to get the object data, you just need to know the exact size of it (more difficult than it sounds) :
public static unsafe byte[] Binarize(object obj, int size)
{
var r = new byte[size];
var rf = __makeref(obj);
var a = **(IntPtr**)(&rf);
Marshal.Copy(a, r, 0, size);
return r;
}
this can be recovered via:
public unsafe static dynamic ToObject(byte[] bytes)
{
var rf = __makeref(bytes);
**(int**)(&rf) += 8;
return GCHandle.Alloc(bytes).Target;
}
The reason why the above methods don't work for serialization is that the first four bytes in the returned data correspond to a RuntimeTypeHandle. The RuntimeTypeHandle describes the layout/type of the object, but its value changes every time the program is run.
EDIT: that is stupid don't do that -->
If you already know the type of the object to be deserialized for certain, you can switch those bytes for BitConverter.GetBytes((int)typeof(yourtype).TypeHandle.Value) at the time of deserialization.
I found another way to convert an object to a byte[], here is my solution:
IEnumerable en = (IEnumerable) myObject;
byte[] myBytes = en.OfType<byte>().ToArray();
Regards
This method returns an array of bytes from an object.
private byte[] ConvertBody(object model)
{
return Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(model));
}
Spans are very useful for something like this. To put it simply, they are very fast ref structs that have a pointer to the first element and a length. They guarantee a contiguous region of memory and the JIT compiler is able to optimize based on these guarantees. They work just like pointer arrays you can see all the time in the C and C++ languages.
Ever since spans were added, you have been able to use two MemoryMarshal functions that get all the bytes of an object without the overhead of streams. Under the hood, it is just a little bit of casting. Just like you asked, there are no extra allocations on the way down to the bytes unless you copy them to an array or another span. Here is an example of the two functions in use to get the bytes of a single struct:
public static Span<byte> GetBytes<T>(ref T o)
where T : struct
{
if (RuntimeHelpers.IsReferenceOrContainsReferences<T>())
throw new Exception($"Type {typeof(T).Name} is or contains a reference");
var singletonSpan = MemoryMarshal.CreateSpan(ref o, 1);
var bytes = MemoryMarshal.AsBytes(singletonSpan);
return bytes;
}
The first function, MemoryMarshal.CreateSpan, takes a reference to an object with a length for how many adjacent objects of the same type come immediately after it. They must be adjacent because spans guarantee contiguous regions of memory. In this case, the length is 1 because we are only working with the single object. Under the hood, it is done by creating a span beginning at the first element.
The second function, MemoryMarshal.AsBytes, takes a span and turns it into a span of bytes. Under the hood, it creates a span over bytes instead of T and adjusts the length accordingly. This span still covers the argument object, so any changes to the bytes will be reflected within the object. If you need an independent copy, spans have a method called ToArray which copies all of the contents from the span into a new array. If there's a span you want to copy into instead, there's the CopyTo method.
The if statement is there to ensure that you are not copying the bytes of a type that is or contains a reference for safety reasons. If it is not there, you may be copying a reference to an object that doesn't exist.
The type T must be a struct because MemoryMarshal.AsBytes requires a non-nullable type.
You can use the method below to convert a list of objects into a byte array using System.Text.Json serialization.
private static byte[] ConvertToByteArray(List<object> mergedResponse)
{
var options = new JsonSerializerOptions
{
PropertyNameCaseInsensitive = true,
};
if (mergedResponse != null && mergedResponse.Any())
{
return JsonSerializer.SerializeToUtf8Bytes(mergedResponse, options);
}
return new byte[] { };
}

C# Problem using blowfish NET: How to convert from Uint32[] to byte[]

In C#, I'm using the BlowfishECB.cs file from Blowfish.NET 2.1.3 (can be found here).
In C++, the library is unknown, but it is similar.
In C++, the Initialize (Blowfish) procedure is the following:
void cBlowFish::Initialize(BYTE key[], int keybytes)
In C#, the Initialize (Blowfish) procedure is the same:
public void Initialize(byte[] key, int ofs, int len)
This is the problem:
This is how the key is initialized in C++
DWORD keyArray[2] = {0}; //declaration
...some code
blowfish.Initialize((LPBYTE)keyArray, 8);
As you see, the key is an array of two DWORDs, which is 8 bytes in total.
In C# I declare it like this, but I get an error:
BlowfishECB blowfish = new BlowfishECB();
UInt32[] keyarray = new UInt32[2];
..some code
blowfish.Initialize(keyarray, 0, 8);
The error is:
Argument '1': cannot convert from 'uint[]' to 'byte[]'
What am I doing wrong?
Thanks in advance!
You can use BitConverter to get the bytes from a UInt32.
To do this, you'll need to convert each element in a loop. I would do something like:
private byte[] ConvertFromUInt32Array(UInt32[] array)
{
List<byte> results = new List<byte>();
foreach(UInt32 value in array)
{
byte[] converted = BitConverter.GetBytes(value);
results.AddRange(converted);
}
return results.ToArray();
}
To go back:
private UInt32[] ConvertFromByteArray(byte[] array)
{
List<UInt32> results = new List<UInt32>();
for(int i=0;i<array.Length;i += 4)
{
byte[] temp = new byte[4];
for (int j=0;j<4;++j)
temp[j] = array[i+j];
results.Add(BitConverter.ToUInt32(temp, 0));
}
return results.ToArray();
}
If you are using VS2008 (C# 3.0 with .NET 3.5) or later, try the following LINQ + BitConverter solution:
var converted =
keyArray
.Select(x => BitConverter.GetBytes(x))
.SelectMany(x => x)
.ToArray();
Breaking this down
The Select converts every UInt32 into a byte[]. The result is an IEnumerable<byte[]>
The SelectMany call flattens the IEnumerable<byte[]> to IEnumerable<byte>
ToArray() simply converts the enumerable into an array
EDIT Non LINQ solution that works just as well
List<byte> list = new List<byte>();
foreach ( UInt32 k in keyArray) {
list.AddRange(BitConverter.GetBytes(k));
}
return list.ToArray();
If you need a faster way to convert your value types, you can use the hack I described in the following answer: What is the fastest way to convert a float[] to a byte[]?
This hack avoids memory allocations and iterations. It gives you a different view of your array in O(1).
Of course you should only use this if performance is an issue (avoid premature optimization).
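A simpler middle ground between the per-element loop and the reinterpretation hack is Buffer.BlockCopy, which copies the raw bytes of a primitive array in one call. This is a different technique from the linked hack (it still allocates one new array), and the class name KeyBytes is made up for illustration. The resulting byte order is the machine's native order, the same as BitConverter.GetBytes produces.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class KeyBytes
{
    // Copies the raw memory of a uint[] into a new byte[].
    public static byte[] ToBytes(uint[] words)
    {
        var bytes = new byte[words.Length * sizeof(uint)];
        Buffer.BlockCopy(words, 0, bytes, 0, bytes.Length);
        return bytes;
    }

    // Copies the raw bytes back into a new uint[].
    public static uint[] ToUInt32s(byte[] bytes)
    {
        if (bytes.Length % sizeof(uint) != 0)
            throw new ArgumentException("Length must be a multiple of 4.", nameof(bytes));
        var words = new uint[bytes.Length / sizeof(uint)];
        Buffer.BlockCopy(bytes, 0, words, 0, bytes.Length);
        return words;
    }
}
```

With this helper the call from the question could become blowfish.Initialize(KeyBytes.ToBytes(keyarray), 0, 8).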
