I have a collection of objects that I need to write to a binary file.
I need the bytes in the file to be compact, so I can't use BinaryFormatter. BinaryFormatter throws in all sorts of info for deserialization needs.
If I try
byte[] myBytes = (byte[]) myObject
I get a runtime exception.
I need this to be fast so I'd rather not be copying arrays of bytes around. I'd just like the cast byte[] myBytes = (byte[]) myObject to work!
OK just to be clear, I cannot have any metadata in the output file. Just the object bytes. Packed object-to-object. Based on answers received, it looks like I'll be writing low-level Buffer.BlockCopy code. Perhaps using unsafe code.
To convert an object to a byte array:
// Convert an object to a byte array
public static byte[] ObjectToByteArray(Object obj)
{
BinaryFormatter bf = new BinaryFormatter();
using (var ms = new MemoryStream())
{
bf.Serialize(ms, obj);
return ms.ToArray();
}
}
You just need copy this function to your code and send to it the object that you need to convert to a byte array. If you need convert the byte array to an object again you can use the function below:
// Convert a byte array to an Object
public static Object ByteArrayToObject(byte[] arrBytes)
{
using (var memStream = new MemoryStream())
{
var binForm = new BinaryFormatter();
memStream.Write(arrBytes, 0, arrBytes.Length);
memStream.Seek(0, SeekOrigin.Begin);
var obj = binForm.Deserialize(memStream);
return obj;
}
}
You can use these functions with custom classes. You just need add the [Serializable] attribute in your class to enable serialization
If you want the serialized data to be really compact, you can write serialization methods yourself. That way you will have a minimum of overhead.
Example:
public class MyClass {
public int Id { get; set; }
public string Name { get; set; }
public byte[] Serialize() {
using (MemoryStream m = new MemoryStream()) {
using (BinaryWriter writer = new BinaryWriter(m)) {
writer.Write(Id);
writer.Write(Name);
}
return m.ToArray();
}
}
public static MyClass Desserialize(byte[] data) {
MyClass result = new MyClass();
using (MemoryStream m = new MemoryStream(data)) {
using (BinaryReader reader = new BinaryReader(m)) {
result.Id = reader.ReadInt32();
result.Name = reader.ReadString();
}
}
return result;
}
}
Well a cast from myObject to byte[] is never going to work unless you've got an explicit conversion or if myObject is a byte[]. You need a serialization framework of some kind. There are plenty out there, including Protocol Buffers which is near and dear to me. It's pretty "lean and mean" in terms of both space and time.
You'll find that almost all serialization frameworks have significant restrictions on what you can serialize, however - Protocol Buffers more than some, due to being cross-platform.
If you can give more requirements, we can help you out more - but it's never going to be as simple as casting...
EDIT: Just to respond to this:
I need my binary file to contain the
object's bytes. Only the bytes, no
metadata whatsoever. Packed
object-to-object. So I'll be
implementing custom serialization.
Please bear in mind that the bytes in your objects are quite often references... so you'll need to work out what to do with them.
I suspect you'll find that designing and implementing your own custom serialization framework is harder than you imagine.
I would personally recommend that if you only need to do this for a few specific types, you don't bother trying to come up with a general serialization framework. Just implement an instance method and a static method in all the types you need:
public void WriteTo(Stream stream)
public static WhateverType ReadFrom(Stream stream)
One thing to bear in mind: everything becomes more tricky if you've got inheritance involved. Without inheritance, if you know what type you're starting with, you don't need to include any type information. Of course, there's also the matter of versioning - do you need to worry about backward and forward compatibility with different versions of your types?
I took Crystalonics' answer and turned them into extension methods. I hope someone else will find them useful:
public static byte[] SerializeToByteArray(this object obj)
{
if (obj == null)
{
return null;
}
var bf = new BinaryFormatter();
using (var ms = new MemoryStream())
{
bf.Serialize(ms, obj);
return ms.ToArray();
}
}
public static T Deserialize<T>(this byte[] byteArray) where T : class
{
if (byteArray == null)
{
return null;
}
using (var memStream = new MemoryStream())
{
var binForm = new BinaryFormatter();
memStream.Write(byteArray, 0, byteArray.Length);
memStream.Seek(0, SeekOrigin.Begin);
var obj = (T)binForm.Deserialize(memStream);
return obj;
}
}
Use of binary formatter is now considered unsafe. see --> Docs Microsoft
Just use System.Text.Json:
To serialize to bytes:
JsonSerializer.SerializeToUtf8Bytes(obj);
To deserialize to your type:
JsonSerializer.Deserialize(byteArray);
You are really talking about serialization, which can take many forms. Since you want small and binary, protocol buffers may be a viable option - giving version tolerance and portability as well. Unlike BinaryFormatter, the protocol buffers wire format doesn't include all the type metadata; just very terse markers to identify data.
In .NET there are a few implementations; in particular
protobuf-net
dotnet-protobufs
I'd humbly argue that protobuf-net (which I wrote) allows more .NET-idiomatic usage with typical C# classes ("regular" protocol-buffers tends to demand code-generation); for example:
[ProtoContract]
public class Person {
[ProtoMember(1)]
public int Id {get;set;}
[ProtoMember(2)]
public string Name {get;set;}
}
....
Person person = new Person { Id = 123, Name = "abc" };
Serializer.Serialize(destStream, person);
...
Person anotherPerson = Serializer.Deserialize<Person>(sourceStream);
This worked for me:
byte[] bfoo = (byte[])foo;
foo is an Object that I'm 100% certain that is a byte array.
I found Best Way this method worked correcly for me
Use Newtonsoft.Json
public TData ByteToObj<TData>(byte[] arr){
return JsonConvert.DeserializeObject<TData>(Encoding.UTF8.GetString(arr));
}
public byte[] ObjToByte<TData>(TData data){
var json = JsonConvert.SerializeObject(data);
return Encoding.UTF8.GetBytes(json);
}
Take a look at Serialization, a technique to "convert" an entire object to a byte stream. You may send it to the network or write it into a file and then restore it back to an object later.
To access the memory of an object directly (to do a "core dump") you'll need to head into unsafe code.
If you want something more compact than BinaryWriter or a raw memory dump will give you, then you need to write some custom serialisation code that extracts the critical information from the object and packs it in an optimal way.
edit
P.S. It's very easy to wrap the BinaryWriter approach into a DeflateStream to compress the data, which will usually roughly halve the size of the data.
I believe what you're trying to do is impossible.
The junk that BinaryFormatter creates is necessary to recover the object from the file after your program stopped.
However it is possible to get the object data, you just need to know the exact size of it (more difficult than it sounds) :
public static unsafe byte[] Binarize(object obj, int size)
{
var r = new byte[size];
var rf = __makeref(obj);
var a = **(IntPtr**)(&rf);
Marshal.Copy(a, r, 0, size);
return res;
}
this can be recovered via:
public unsafe static dynamic ToObject(byte[] bytes)
{
var rf = __makeref(bytes);
**(int**)(&rf) += 8;
return GCHandle.Alloc(bytes).Target;
}
The reason why the above methods don't work for serialization is that the first four bytes in the returned data correspond to a RuntimeTypeHandle. The RuntimeTypeHandle describes the layout/type of the object but the value of it changes every time the program is ran.
EDIT: that is stupid don't do that -->
If you already know the type of the object to be deserialized for certain you can switch those bytes for BitConvertes.GetBytes((int)typeof(yourtype).TypeHandle.Value) at the time of deserialization.
I found another way to convert an object to a byte[], here is my solution:
IEnumerable en = (IEnumerable) myObject;
byte[] myBytes = en.OfType<byte>().ToArray();
Regards
This method returns an array of bytes from an object.
private byte[] ConvertBody(object model)
{
return Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(model));
}
Spans are very useful for something like this. To put it simply, they are very fast ref structs that have a pointer to the first element and a length. They guarantee a contiguous region of memory and the JIT compiler is able to optimize based on these guarantees. They work just like pointer arrays you can see all the time in the C and C++ languages.
Ever since spans have been added, you are able to use two MemoryMarshal functions that can get all bytes of an object without the overhead of streams. Under the hood, it is just a little bit of casting. Just like you asked, there are no extra allocations going down to the bytes unless you copy them to an array or another span. Here is an example of the two functions in use to get the bytes of one:
public static Span<byte> GetBytes<T>(ref T o)
where T : struct
{
if (RuntimeHelpers.IsReferenceOrContainsReferences<T>())
throw new Exception($"Type {nameof(T)} is or contains a reference");
var singletonSpan = MemoryMarshal.CreateSpan(ref o, 1);
var bytes = MemoryMarshal.AsBytes(singletonSpan);
return bytes;
}
The first function, MemoryMarshal.CreateSpan, takes a reference to an object with a length for how many adjacent objects of the same type come immediately after it. They must be adjacent because spans guarantee contiguous regions of memory. In this case, the length is 1 because we are only working with the single object. Under the hood, it is done by creating a span beginning at the first element.
The second function, MemoryMarshal.AsBytes, takes a span and turns it into a span of bytes. This span still covers the argument object so any changes to the bytes will be reflected within the object. Fortunately, spans have a method called ToArray which copies all of the contents from the span into a new array. Under the hood, it creates a span over bytes instead of T and adjusts the length accordingly. If there's a span you want to copy into instead, there's the CopyTo method.
The if statement is there to ensure that you are not copying the bytes of a type that is or contains a reference for safety reasons. If it is not there, you may be copying a reference to an object that doesn't exist.
The type T must be a struct because MemoryMarshal.AsBytes requires a non-nullable type.
You can use below method to convert list of objects into byte array using System.Text.Json serialization.
private static byte[] CovertToByteArray(List<object> mergedResponse)
{
var options = new JsonSerializerOptions
{
PropertyNameCaseInsensitive = true,
};
if (mergedResponse != null && mergedResponse.Any())
{
return JsonSerializer.SerializeToUtf8Bytes(mergedResponse, options);
}
return new byte[] { };
}
Related
I have an abstract class that I'm trying to serialize and deserialize the concrete implementations of. In my abstract base class I have this:
[DataContract]
public class MyAbstractBase
{
[DataMember]
public string Foo { get; set; }
// some other abstract methods that derived classes have to implement
}
And to that class I add a method to serialize:
public string SerializeBase64()
{
// Serialize to a base 64 string
byte[] bytes;
long length = 0;
MemoryStream ws = new MemoryStream();
DataContractSerializer serializer = new DataContractSerializer(this.GetType());
XmlDictionaryWriter binaryDictionaryWriter = XmlDictionaryWriter.CreateBinaryWriter(ws);
serializer.WriteObject(binaryDictionaryWriter, this);
binaryDictionaryWriter.Flush();
length = ws.Length;
bytes = ws.GetBuffer();
string encodedData = bytes.Length + ":" + Convert.ToBase64String(bytes, 0, bytes.Length, Base64FormattingOptions.None);
return encodedData;
}
This seems to work fine, in that it produces "something" and doesn't actually throw any errors.
Of course, the problem comes with deserialization. I added this:
public static MyAbstractBase DeserializeBase64(string s)
{
int p = s.IndexOf(':');
int length = Convert.ToInt32(s.Substring(0, p));
// Extract data from the base 64 string!
byte[] memorydata = Convert.FromBase64String(s.Substring(p + 1));
MemoryStream rs = new MemoryStream(memorydata, 0, length);
DataContractSerializer serializer = new DataContractSerializer(typeof(MyAbstractBase ), new List<Type>() { typeof(SomeOtherClass.MyDerivedClass) });
XmlDictionaryReader binaryDictionaryReader = XmlDictionaryReader.CreateBinaryReader(rs, XmlDictionaryReaderQuotas.Max);
return (MyAbstractBase)serializer.ReadObject(binaryDictionaryReader);
}
I thought by adding the "known types" to my DataContractSerializer, it would be able to figure out how to deserialize the derived class, but it appears that it doesn't. It complains with the error:
Expecting element 'MyAbstractBase' from namespace 'http://schemas.datacontract.org/2004/07/MyApp.Foo'.. Encountered 'Element' with name 'SomeOtherClass.MyDerivedClass', namespace 'http://schemas.datacontract.org/2004/07/MyApp.Foo.Bar'.
So any idea what I'm missing here?
I put together a simple demonstration of the problem on a dot net fiddle here:
http://dotnetfiddle.net/W7GCOw
Unfortunately, it won't run directly there because it doesn't include the System.Runtime.Serialization assemblies. But if you drop it into a Visual Studio project, it will serialize fine, but balks at deserialization.
When serializing your data, use the same overloaded method for serialization as you use for deserialization:
DataContractSerializer serializer = new DataContractSerializer(typeof(MyAbstractBase ), new List<Type>() { typeof(SomeOtherClass.MyDerivedClass) });
Also, Declare a KnownType attribute around your base class so it knows what possible derived classes it may deserialize:
[DataContract]
[KnownType(typeof(SomeOtherclass.MyDerivedClass))]
public class MyAbstractBase
{
[DataMember]
public string Foo { get; set; }
// some other abstract methods that derived classes have to implement
}
So I identified a couple of problems. The first is that the CreateBinaryWriter just doesn't seem to work at all. So I dropped it and just directly serialized with serializer.WriteObject(ws,this);
The second problem is on serialization I did this:
DataContractSerializer serializer = new DataContractSerializer(this.GetType(), new List<Type>() { typeof(SomeOtherClass.MyDerivedClass) });
The problem is that the type I'm passing there isn't the base type, it's whatever type I'm calling this function from. But in deserialization I have this:
DataContractSerializer serializer = new DataContractSerializer(typeof(MyAbstractBase ), new List<Type>() { typeof(SomeOtherClass.MyDerivedClass) });
This isn't the same serializer. The type is different. So changing both to typeof(MyAbstractBase) fixed the problem in my simple example.
Of course, in my real project I'm still getting errors with it complaining on deserialization with The data at the root level is invalid. Line 1, position 1. which is odd because I've compared both the data coming out of serialization and the data going into deserialization and they are absolutely identical. No rogue BOM or anything.
EDIT: I've solved the data at the root level is invalid problem. It seems that decorating my [DataContract] attributes with explicit Name and Namespace properties solved the problem and, as an added bonus, reduced the size of my data because I greatly shortened the namespaces. Quite why the serializer couldn't cope with the real namespaces I don't know.
Another Edit: One last wrinkle. I think the namespace thing was a red herring and I got it working only by pure coincidence. The root of the problem (along with the solution) is explained here:
https://msmvps.com/blogs/peterritchie/archive/2009/04/29/datacontractserializer-readobject-is-easily-confused.aspx
When you do GetBuffer() on the MemoryStream you can get excess null characters because of the way the underlying array resizes itself as needed. This puts a bunch of nulls on the end of your serialization (which can be spotted as a bunch of As after base 64'ing the array) and these are what are screwing up the deserialization with the very confusing The data at the root level is invalid. Line 1, position 1.. It's confusing because the problem isn't at the beginning at all, it's at the END!!!
In case anybody is interested, by serialization now looks like this:
public string SerializeBase64()
{
// Serialize to a base 64 string
byte[] bytes;
long length = 0;
using (MemoryStream ws = new MemoryStream())
{
XmlDictionaryWriter writer = XmlDictionaryWriter.CreateTextWriter(ws);
DataContractSerializer serializer = new DataContractSerializer(typeof(MyAbstractBase ), new List<Type>() { typeof(SomeOtherClass.MyDerivedClass) });
serializer.WriteObject(writer, this);
writer.Flush();
length = ws.Length;
// Note: https://msmvps.com/blogs/peterritchie/archive/2009/04/29/datacontractserializer-readobject-is-easily-confused.aspx
// We need to trim nulls from the buffer produced by the serializer because it'll barf on them when it tries to deserialize.
bytes = new byte[ws.Length];
Array.Copy(ws.GetBuffer(), bytes, bytes.Length);
}
string encodedData = bytes.Length + ":" + Convert.ToBase64String(bytes, 0, bytes.Length, Base64FormattingOptions.None);
return encodedData;
}
I'm trying to convert an object which I have in a byte[] to an object.
I've tried using this code I found online:
object byteArrayToObject(byte[] bytes)
{
try
{
MemoryStream ms = new MemoryStream(bytes);
BinaryFormatter bf = new BinaryFormatter();
//ms.Position = 0;
return bf.Deserialize(ms,null);
}
catch
{
return null;
}
}
SerializationException: "End of Stream encountered before parsing was
completed.".
I've tried it with the ms.Position = 0 line uncommented of course too...
bytes[] is only 8 bytes long, each byte isn't null.
Suggestions?
[edit]
The byte[] was written to a binary file from a c++ program using something along the lines of
void WriteToFile (std::ostream& file,T* value)
{
file.write(reinterpret_cast<char*>(value), sizeof(*T))
}
Where value may be a number of different types.
I can cast to some objects okay from the file using BitConverter, but anything BitConverter doesn't cover I can't do..
As was stated by cdhowie, you will need to manually deserialize the encoded data. Based on the limited information available, you may either want an array of objects or an object containing an array. It looks like you have a single long but there is no way to know from your code. You will need to recreate your object in its true form so take the below myLong as a simple example for a single long array. Since it was unspecified I'll assume you want a struct containing an array like:
public struct myLong {
public long[] value;
}
You could do the same thing with an array of structs, or classes with minor changes to the code posted below.
Your method will be something like this: (written in the editor)
private myLong byteArrayToObject(byte[] bytes) {
try
{
int len = sizeof(long);
myLong data = new myLong();
data.value = new long[bytes.Length / len];
int byteindex = 0;
for (int i = 0; i < data.value.Length; i++) {
data.value[i] = BitConverter.ToInt64(bytes,byteindex);
byteindex += len;
}
return data;
}
catch
{
return null;
}
}
Let's say I have a Person class. It has an editable Notes property.
I want to serialize the Person instance to a fixed size buffer, thus the Notes cannot be infinite long.
In the UI, I use a TextBox to let user edit the notes. I want to dynamically update a label saying how many more characters you can write.
This is my current implementation, is there any faster method? (I'm using rs282)
public Int32 GetSerializedLength()
{
Byte[] data;
using (MemoryStream ms = new MemoryStream())
{
Serializer.SerializeWithLengthPrefix<Person>(ms, this, PrefixStyle.Base128);
data = ms.ToArray();
}
using (MemoryStream ms = new MemoryStream(data))
{
Int32 length = 0;
if (Serializer.TryReadLengthPrefix(ms, PrefixStyle.Base128, out length))
return length;
else
return -1;
}
}
EDIT: I'm confused about the internal lengh of the serialized data and the total length of the serialized data.
Here is my final version:
private static MemoryStream _ms = new MemoryStream();
public static Int64 GetSerializedLength(Person person)
{
if(null == person) return 0;
_ms.SetLength(0);
Serializer.Serialize<Person>(_ms, person);
return _ms.Length;
}
With the edit, it sounds like you want the length without serializing it (since if you want the length with serializing it, you'd just serialize it and check the .Length).
Basically, no - this isn't available. I know that some of the other implementations build this data eagerly, that is in part because they are constructing the buffered data all the time, where-as protobuf-net works from an object graph.
protobuf-net does not do this - it builds the data by discovery during a single pass over the object graph. Is there a specific purpose you have in mind? Things can always be added (with effort, though).
Re the issue of a notes (string) field that you don't want to be over-sized; as a sanity check, note that protubuf uses UTF8 or string data, so personally I would just do:
if(theText.Length > MAX || Encoding.UTF8.GetByteCount(theText) > MAX
|| GetSerializedLength(obj) > MAX)
{
//
}
note we've checked this a bit more cheaply in the obvious cases
I am writing a prototype TCP connection and I am having some trouble homogenizing the data to be sent.
At the moment, I am sending nothing but strings, but in the future we want to be able to send any object.
The code is quite simple at the moment, because I thought everything could be cast into a byte array:
void SendData(object headerObject, object bodyObject)
{
byte[] header = (byte[])headerObject; //strings at runtime,
byte[] body = (byte[])bodyObject; //invalid cast exception
// Unable to cast object of type 'System.String' to type 'System.Byte[]'.
...
}
This of course is easily enough solved with a
if( state.headerObject is System.String ){...}
The problem is, if I do it that way, I need to check for EVERY type of object that can't be cast to a byte[] at runtime.
Since I do not know every object that can't be cast into a byte[] at runtime, this really isn't an option.
How does one convert any object at all into a byte array in C# .NET 4.0?
Use the BinaryFormatter:
byte[] ObjectToByteArray(object obj)
{
if(obj == null)
return null;
BinaryFormatter bf = new BinaryFormatter();
using (MemoryStream ms = new MemoryStream())
{
bf.Serialize(ms, obj);
return ms.ToArray();
}
}
Note that obj and any properties/fields within obj (and so-on for all of their properties/fields) will all need to be tagged with the Serializable attribute to successfully be serialized with this.
checkout this article :http://www.morgantechspace.com/2013/08/convert-object-to-byte-array-and-vice.html
Use the below code
// Convert an object to a byte array
private byte[] ObjectToByteArray(Object obj)
{
if(obj == null)
return null;
BinaryFormatter bf = new BinaryFormatter();
MemoryStream ms = new MemoryStream();
bf.Serialize(ms, obj);
return ms.ToArray();
}
// Convert a byte array to an Object
private Object ByteArrayToObject(byte[] arrBytes)
{
MemoryStream memStream = new MemoryStream();
BinaryFormatter binForm = new BinaryFormatter();
memStream.Write(arrBytes, 0, arrBytes.Length);
memStream.Seek(0, SeekOrigin.Begin);
Object obj = (Object) binForm.Deserialize(memStream);
return obj;
}
Like others have said before, you could use binary serialization, but it may produce an extra bytes or be deserialized into an objects with not exactly same data. Using reflection on the other hand is quite complicated and very slow.
There is an another solution that can strictly convert your objects to bytes and vise-versa - marshalling:
var size = Marshal.SizeOf(your_object);
// Both managed and unmanaged buffers required.
var bytes = new byte[size];
var ptr = Marshal.AllocHGlobal(size);
// Copy object byte-to-byte to unmanaged memory.
Marshal.StructureToPtr(your_object, ptr, false);
// Copy data from unmanaged memory to managed buffer.
Marshal.Copy(ptr, bytes, 0, size);
// Release unmanaged memory.
Marshal.FreeHGlobal(ptr);
And to convert bytes to object:
var bytes = new byte[size];
var ptr = Marshal.AllocHGlobal(size);
Marshal.Copy(bytes, 0, ptr, size);
var your_object = (YourType)Marshal.PtrToStructure(ptr, typeof(YourType));
Marshal.FreeHGlobal(ptr);
It's noticeably slower and partly unsafe to use this approach for small objects and structs comparing to your own serialization field by field (because of double copying from/to unmanaged memory), but it's easiest way to strictly convert object to byte[] without implementing serialization and without [Serializable] attribute.
Using Encoding.UTF8.GetBytes is faster than using MemoryStream.
Here, I am using NewtonsoftJson to convert input object to JSON string and then getting bytes from JSON string.
byte[] SerializeObject(object value) =>Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(value));
Benchmark for #Daniel DiPaolo's version with this version
Method | Mean | Error | StdDev | Median | Gen 0 | Allocated |
--------------------------|----------|-----------|-----------|----------|--------|-----------|
ObjectToByteArray | 4.983 us | 0.1183 us | 0.2622 us | 4.887 us | 0.9460 | 3.9 KB |
ObjectToByteArrayWithJson | 1.548 us | 0.0309 us | 0.0690 us | 1.528 us | 0.3090 | 1.27 KB |
What you're looking for is serialization. There are several forms of serialization available for the .Net platform
Binary Serialization
XML Serialization: Produces a string which is easily convertible to a byte[]
ProtoBuffers
public static class SerializerDeserializerExtensions
{
public static byte[] Serializer(this object _object)
{
byte[] bytes;
using (var _MemoryStream = new MemoryStream())
{
IFormatter _BinaryFormatter = new BinaryFormatter();
_BinaryFormatter.Serialize(_MemoryStream, _object);
bytes = _MemoryStream.ToArray();
}
return bytes;
}
public static T Deserializer<T>(this byte[] _byteArray)
{
T ReturnValue;
using (var _MemoryStream = new MemoryStream(_byteArray))
{
IFormatter _BinaryFormatter = new BinaryFormatter();
ReturnValue = (T)_BinaryFormatter.Deserialize(_MemoryStream);
}
return ReturnValue;
}
}
You can use it like below code.
DataTable _DataTable = new DataTable();
_DataTable.Columns.Add(new DataColumn("Col1"));
_DataTable.Columns.Add(new DataColumn("Col2"));
_DataTable.Columns.Add(new DataColumn("Col3"));
for (int i = 0; i < 10; i++) {
DataRow _DataRow = _DataTable.NewRow();
_DataRow["Col1"] = (i + 1) + "Column 1";
_DataRow["Col2"] = (i + 1) + "Column 2";
_DataRow["Col3"] = (i + 1) + "Column 3";
_DataTable.Rows.Add(_DataRow);
}
byte[] ByteArrayTest = _DataTable.Serializer();
DataTable dt = ByteArrayTest.Deserializer<DataTable>();
Combined Solutions in Extensions class:
public static class Extensions {
public static byte[] ToByteArray(this object obj) {
var size = Marshal.SizeOf(data);
var bytes = new byte[size];
var ptr = Marshal.AllocHGlobal(size);
Marshal.StructureToPtr(data, ptr, false);
Marshal.Copy(ptr, bytes, 0, size);
Marshal.FreeHGlobal(ptr);
return bytes;
}
public static string Serialize(this object obj) {
return JsonConvert.SerializeObject(obj);
}
}
How about something simple like this?
return ((object[])value).Cast<byte>().ToArray();
You could use the built-in serialization tools in the framework and serialize to a MemoryStream. This may be the most straightforward option, but might produce a larger byte[] than may be strictly necessary for your scenario.
If that is the case, you could utilize reflection to iterate over the fields and/or properties in the object to be serialized and manually write them to the MemoryStream, calling the serialization recursively if needed to serialize non-trivial types. This method is more complex and will take more time to implement, but allows you much more control over the serialized stream.
I'd rather use the expression "serialization" than "casting into bytes". Serializing an object means converting it into a byte array (or XML, or something else) that can be used on the remote box to re-construct the object. In .NET, the Serializable attribute marks types whose objects can be serialized.
One additional implementation, which uses Newtonsoft.Json binary JSON and does not require marking everything with the [Serializable] attribute. Only one drawback is that an object has to be wrapped in anonymous class, so byte array obtained with binary serialization can be different from this one.
public static byte[] ConvertToBytes(object obj)
{
using (var ms = new MemoryStream())
{
using (var writer = new BsonWriter(ms))
{
var serializer = new JsonSerializer();
serializer.Serialize(writer, new { Value = obj });
return ms.ToArray();
}
}
}
Anonymous class is used because BSON should start with a class or array.
I have not tried to deserialize byte[] back to object and not sure if it works, but have tested the speed of conversion to byte[] and it completely satisfies my needs.
Alternative way to convert object to byte array:
TypeConverter objConverter = TypeDescriptor.GetConverter(objMsg.GetType());
byte[] data = (byte[])objConverter.ConvertTo(objMsg, typeof(byte[]));
(this is a re-post of a question that I saw in my RSS, but which was deleted by the OP. I've re-added it because I've seen this question asked several times in different places; wiki for "good form")
Suddenly, I receive a ProtoException when deserializing and the message is: unknown wire-type 6
What is a wire-type?
What are the different wire-type values and their description?
I suspect a field is causing the problem, how to debug this?
First thing to check:
IS THE INPUT DATA PROTOBUF DATA? If you try and parse another format (json, xml, csv, binary-formatter), or simply broken data (an "internal server error" html placeholder text page, for example), then it won't work.
What is a wire-type?
It is a 3-bit flag that tells it (in broad terms; it is only 3 bits after all) what the next data looks like.
Each field in protocol buffers is prefixed by a header that tells it which field (number) it represents,
and what type of data is coming next; this "what type of data" is essential to support the case where
unanticipated data is in the stream (for example, you've added fields to the data-type at one end), as
it lets the serializer know how to read past that data (or store it for round-trip if required).
What are the different wire-type values and their description?
0: variant-length integer (up to 64 bits) - base-128 encoded with the MSB indicating continuation (used as the default for integer types, including enums)
1: 64-bit - 8 bytes of data (used for double, or electively for long/ulong)
2: length-prefixed - first read an integer using variant-length encoding; this tells you how many bytes of data follow (used for strings, byte[], "packed" arrays, and as the default for child objects properties / lists)
3: "start group" - an alternative mechanism for encoding child objects that uses start/end tags - largely deprecated by Google, it is more expensive to skip an entire child-object field since you can't just "seek" past an unexpected object
4: "end group" - twinned with 3
5: 32-bit - 4 bytes of data (used for float, or electively for int/uint and other small integer types)
I suspect a field is causing the problem, how to debug this?
Are you serializing to a file? The most likely cause (in my experience) is that you have overwritten an existing file, but have not truncated it; i.e. it was 200 bytes; you've re-written it, but with only 182 bytes. There are now 18 bytes of garbage on the end of your stream that is tripping it up. Files must be truncated when re-writing protocol buffers. You can do this with FileMode:
using(var file = new FileStream(path, FileMode.Truncate)) {
// write
}
or alternatively by SetLength after writing your data:
file.SetLength(file.Position);
Other possible cause
You are (accidentally) deserializing a stream into a different type than what was serialized. It's worth double-checking both sides of the conversation to ensure this is not happening.
Since the stack trace references this StackOverflow question, I thought I'd point out that you can also receive this exception if you (accidentally) deserialize a stream into a different type than what was serialized. So it's worth double-checking both sides of the conversation to ensure this is not happening.
This can also be caused by an attempt to write more than one protobuf message to a single stream. The solution is to use SerializeWithLengthPrefix and DeserializeWithLengthPrefix.
Why this happens:
The protobuf specification supports a fairly small number of wire-types (the binary storage formats) and data-types (the .NET etc data-types). Additionally, this is not 1:1, nor is is 1:many or many:1 - a single wire-type can be used for multiple data-types, and a single data-type can be encoded via any of multiple wire-types. As a consequence, you cannot fully understand a protobuf fragment unless you already know the scema, so you know how to interpret each value. When you are, say, reading an Int32 data-type, the supported wire-types might be "varint", "fixed32" and "fixed64", where-as when reading a String data-type, the only supported wire-type is "string".
If there is no compatible map between the data-type and wire-type, then the data cannot be read, and this error is raised.
Now let's look at why this occurs in the scenario here:
[ProtoContract]
public class Data1
{
[ProtoMember(1, IsRequired=true)]
public int A { get; set; }
}
[ProtoContract]
public class Data2
{
[ProtoMember(1, IsRequired = true)]
public string B { get; set; }
}
class Program
{
static void Main(string[] args)
{
var d1 = new Data1 { A = 1};
var d2 = new Data2 { B = "Hello" };
var ms = new MemoryStream();
Serializer.Serialize(ms, d1);
Serializer.Serialize(ms, d2);
ms.Position = 0;
var d3 = Serializer.Deserialize<Data1>(ms); // This will fail
var d4 = Serializer.Deserialize<Data2>(ms);
Console.WriteLine("{0} {1}", d3, d4);
}
}
In the above, two messages are written directly after each-other. The complication is: protobuf is an appendable format, with append meaning "merge". A protobuf message does not know its own length, so the default way of reading a message is: read until EOF. However, here we have appended two different types. If we read this back, it does not know when we have finished reading the first message, so it keeps reading. When it gets to data from the second message, we find ourselves reading a "string" wire-type, but we are still trying to populate a Data1 instance, for which member 1 is an Int32. There is no map between "string" and Int32, so it explodes.
The *WithLengthPrefix methods allow the serializer to know where each message finishes; so, if we serialize a Data1 and Data2 using the *WithLengthPrefix, then deserialize a Data1 and a Data2 using the *WithLengthPrefix methods, then it correctly splits the incoming data between the two instances, only reading the right value into the right object.
Additionally, when storing heterogeneous data like this, you might want to additionally assign (via *WithLengthPrefix) a different field-number to each class; this provides greater visibility of which type is being deserialized. There is also a method in Serializer.NonGeneric which can then be used to deserialize the data without needing to know in advance what we are deserializing:
// Data1 is "1", Data2 is "2"
Serializer.SerializeWithLengthPrefix(ms, d1, PrefixStyle.Base128, 1);
Serializer.SerializeWithLengthPrefix(ms, d2, PrefixStyle.Base128, 2);
ms.Position = 0;
var lookup = new Dictionary<int,Type> { {1, typeof(Data1)}, {2,typeof(Data2)}};
object obj;
while (Serializer.NonGeneric.TryDeserializeWithLengthPrefix(ms,
PrefixStyle.Base128, fieldNum => lookup[fieldNum], out obj))
{
Console.WriteLine(obj); // writes Data1 on the first iteration,
// and Data2 on the second iteration
}
Previous answers already explain the problem better than I can. I just want to add an even simpler way to reproduce the exception.
This error will also occur simply if the type of a serialized ProtoMember is different from the expected type during deserialization.
For instance if the client sends the following message:
public class DummyRequest
{
[ProtoMember(1)]
public int Foo{ get; set; }
}
But what the server deserializes the message into is the following class:
public class DummyRequest
{
[ProtoMember(1)]
public string Foo{ get; set; }
}
Then this will result in the for this case slightly misleading error message
ProtoBuf.ProtoException: Invalid wire-type; this usually means you have over-written a file without truncating or setting the length
It will even occur if the property name changed. Let's say the client sent the following instead:
public class DummyRequest
{
[ProtoMember(1)]
public int Bar{ get; set; }
}
This will still cause the server to deserialize the int Bar to string Foo which causes the same ProtoBuf.ProtoException.
I hope this helps somebody debugging their application.
Also check the obvious that all your subclasses have [ProtoContract] attribute. Sometimes you can miss it when you have rich DTO.
I've seen this issue when using the improper Encoding type to convert the bytes in and out of strings.
Need to use Encoding.Default and not Encoding.UTF8.
using (var ms = new MemoryStream())
{
Serializer.Serialize(ms, obj);
var bytes = ms.ToArray();
str = Encoding.Default.GetString(bytes);
}
If you are using SerializeWithLengthPrefix, please mind that casting instance to object type breaks the deserialization code and causes ProtoBuf.ProtoException : Invalid wire-type.
using (var ms = new MemoryStream())
{
var msg = new Message();
Serializer.SerializeWithLengthPrefix(ms, (object)msg, PrefixStyle.Base128); // Casting msg to object breaks the deserialization code.
ms.Position = 0;
Serializer.DeserializeWithLengthPrefix<Message>(ms, PrefixStyle.Base128)
}
This happened in my case because I had something like this:
var ms = new MemoryStream();
Serializer.Serialize(ms, batch);
_queue.Add(Convert.ToBase64String(ms.ToArray()));
So basically I was putting a base64 into a queue and then, on the consumer side I had:
var stream = new MemoryStream(Encoding.UTF8.GetBytes(myQueueItem));
var batch = Serializer.Deserialize<List<EventData>>(stream);
So though the type of each myQueueItem was correct, I forgot that I converted a string. The solution was to convert it once more:
var bytes = Convert.FromBase64String(myQueueItem);
var stream = new MemoryStream(bytes);
var batch = Serializer.Deserialize<List<EventData>>(stream);