Right now I'm working on a game engine. To be more efficient and to keep the data away from the end user, I'm trying to use serialization on a modified form of the Wavefront *.OBJ format. I have multiple structs set up to represent the data, and serialization of the objects works fine, except that it takes up a significant amount of file space (at least 5× the size of the original OBJ file).
To be specific, here's a quick example of what the final object would be (in a JSON-esque format):
{
[{float 5.0, float 2.0, float 1.0}, {float 7.0, float 2.0, float 1.0}, ...]
// ^^^ vertex positions
// other similar structures for colors, normals, texture coordinates
// ...
[[{int 1, int 1, int 1}, {int 2, int 2, int 1}, {int 3, int 3, int 2}], ...]
// each inner array represents one face, structured as follows:
//face[vertex{position index, text coords index, normal index}, vertex{}...]
}
Basically, my main issue with this method of serializing data (binary format) is that it saves the names of the structs along with the values. I'd love to keep the data in the format I already have, just without saving the struct metadata itself in my file. I want to save something similar to the above, while still being able to recompile with a different struct name later.
Here's the main object I'm serializing and saving to a file:
[Serializable()] //the included structs have this applied
public struct InstantGameworksObjectData
{
public Position[] Positions;
public TextureCoordinates[] TextureCoordinates;
public Position[] Normals;
public Face[] Faces;
}
Here's the method in which I serialize and save the data:
IFormatter formatter = new BinaryFormatter();
long Beginning = DateTime.Now.Ticks / 10000000;
foreach (string file in fileNames)
{
Console.WriteLine("Begin " + Path.GetFileName(file));
var output = InstantGameworksObject.ConvertOBJToIGWO(File.ReadAllLines(file));
Console.WriteLine("Writing file");
using (Stream fileOutputStream = new FileStream(outputPath + @"\" + Path.GetFileNameWithoutExtension(file) + ".igwo", FileMode.Create, FileAccess.Write, FileShare.None))
{
    formatter.Serialize(fileOutputStream, output);
}
Console.WriteLine(outputPath + @"\" + Path.GetFileNameWithoutExtension(file) + ".igwo");
}
The output, of course, is in binary/hex (depending on what program you use to view the file), and that's great. But running the file through a hex-to-text converter reveals the struct and member names stored as plain text.
In the long run, this could mean gigabytes worth of useless data. How can I save my C# object with the data in the correct format, just without the extra meta-clutter?
As you correctly note, the standard framework binary formatters include a host of metadata about the structure of the data. This is to try to keep the serialised data self-describing. If they were to separate the data from all that metadata, then the smallest change to the structure of classes would render the previously serialised data useless. By that token, I doubt you'd find any standard framework method of serialising binary data that didn't include all the metadata.
Even ProtoBuf includes the semantics of the data in the file data, albeit with less overhead.
Given that the structure of your data follows the reasonably common and well established form of 3D object data, you could roll your own format for your assets which strips the semantics and only stores the raw data. You can implement read and write methods easily using the BinaryReader/BinaryWriter classes (which would be my preference). If you're looking to obfuscate data from the end user, there are a variety of different ways that you could achieve that with this approach.
For example:
public static InstantGameworksObjectData ReadIgwoObject(BinaryReader pReader)
{
var lOutput = new InstantGameworksObjectData();
int lVersion = pReader.ReadInt32(); // Useful in case you ever want to change the format
int lPositionCount = pReader.ReadInt32(); // Store the length of the Position array before the data so you can pre-allocate the array.
lOutput.Positions = new Position[lPositionCount];
for ( int lPositionIndex = 0 ; lPositionIndex < lPositionCount ; ++ lPositionIndex )
{
lOutput.Positions[lPositionIndex] = new Position();
lOutput.Positions[lPositionIndex].X = pReader.ReadSingle();
lOutput.Positions[lPositionIndex].Y = pReader.ReadSingle();
lOutput.Positions[lPositionIndex].Z = pReader.ReadSingle();
// or if you prefer... lOutput.Positions[lPositionIndex] = Position.ReadPosition(pReader);
}
int lTextureCoordinateCount = pReader.ReadInt32();
lOutput.TextureCoordinates = new TextureCoordinate[lTextureCoordinateCount];
for ( int lTextureCoordinateIndex = 0 ; lTextureCoordinateIndex < lTextureCoordinateCount ; ++ lTextureCoordinateIndex )
{
lOutput.TextureCoordinates[lTextureCoordinateIndex] = new TextureCoordinate();
lOutput.TextureCoordinates[lTextureCoordinateIndex].X = pReader.ReadSingle();
lOutput.TextureCoordinates[lTextureCoordinateIndex].Y = pReader.ReadSingle();
lOutput.TextureCoordinates[lTextureCoordinateIndex].Z = pReader.ReadSingle();
// or if you prefer... lOutput.TextureCoordinates[lTextureCoordinateIndex] = TextureCoordinate.ReadTextureCoordinate(pReader);
}
// ... repeat for Normals and Faces ...
return lOutput;
}
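The matching write method follows the same pattern; here is a minimal sketch, assuming the same X/Y/Z layout as the reader above (the version constant is illustrative):
public static void WriteIgwoObject(BinaryWriter pWriter, InstantGameworksObjectData pData)
{
    pWriter.Write(1); // format version - must match what the read method expects
    pWriter.Write(pData.Positions.Length); // length first, so the reader can pre-allocate
    foreach (var lPosition in pData.Positions)
    {
        pWriter.Write(lPosition.X);
        pWriter.Write(lPosition.Y);
        pWriter.Write(lPosition.Z);
    }
    pWriter.Write(pData.TextureCoordinates.Length);
    foreach (var lTextureCoordinate in pData.TextureCoordinates)
    {
        pWriter.Write(lTextureCoordinate.X);
        pWriter.Write(lTextureCoordinate.Y);
        pWriter.Write(lTextureCoordinate.Z);
    }
    // ... repeat for Normals and Faces ...
}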
As far as space efficiency and speed go, this approach is hard to beat. It works well for 3D objects because they're fairly well-defined and the format is not likely to change, but it may not extend as well to the other assets that you want to store.
If you find you need to change class structures frequently, you may have to write lots of if-blocks based on version to correctly read a file (see the sketch below), and regularly debug issues where the data in the file is not quite in the format you expect. A happy medium might be to use something such as ProtoBuf for the bulk of your development until you're happy with the structure of your data object classes, and then write raw binary Read/Write methods for each of them before you release.
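For example, a version check in the read method might look like this (a sketch; the version number and the extra field are hypothetical):
int lVersion = pReader.ReadInt32();
// ... read the version-1 fields as before ...
if (lVersion >= 2)
{
    // Hypothetical field added in a later revision of the format;
    // files written before version 2 never reach this branch.
    lOutput.BoundingRadius = pReader.ReadSingle();
}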
I'd also recommend some Unit Tests to ensure that your Read and Write methods are correctly persisting the object to avoid pulling your hair out later.
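For instance, a round-trip test might look like the following sketch (NUnit-style; it assumes the Read/Write pair sketched above and public X/Y/Z members on Position):
[Test]
public void Positions_survive_a_round_trip()
{
    var lOriginal = new InstantGameworksObjectData
    {
        Positions = new[] { new Position { X = 5f, Y = 2f, Z = 1f } },
        TextureCoordinates = new TextureCoordinates[0],
        Normals = new Position[0],
        Faces = new Face[0]
    };
    using (var lStream = new MemoryStream())
    {
        using (var lWriter = new BinaryWriter(lStream, Encoding.UTF8, true))
        {
            WriteIgwoObject(lWriter, lOriginal);
        }
        lStream.Position = 0;
        using (var lReader = new BinaryReader(lStream))
        {
            var lClone = ReadIgwoObject(lReader);
            Assert.AreEqual(lOriginal.Positions.Length, lClone.Positions.Length);
            Assert.AreEqual(lOriginal.Positions[0].X, lClone.Positions[0].X);
        }
    }
}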
Hope this helps
Related
I have 2 questions regarding the serialization of very large objects.
1: What happens when you serialize an object >> 2GB with a length prefix? At first sight, it looks like the length prefix is an integer. Does protobuf-net support serializing such large objects with a length prefix?
2: Serialization of the following class seems to take forever (for 950,000,000 integers):
[ProtoContract]
public class Xyz
{
[ProtoMember(1, IsPacked = true)]
public int[] Field { get; set; }
}
Quick serialization code is:
int nn = 950000000;
Xyz xyz = new Xyz();
xyz.Field = new int[nn];
for (int i = 0; i < nn; i++)
{
xyz.Field[i] = i + 1;
}
RuntimeTypeModel xyzModel = RuntimeTypeModel.Create();
xyzModel.Add(typeof(Xyz), true);
TypeModel realModel = xyzModel.Compile();
using (var fs = new FileStream(@"C:\file.bin", FileMode.Create))
{
realModel.Serialize(fs, xyz);
}
To rule out a problem with the disk etc., I checked by writing directly:
using (var bw = new BinaryWriter(fs))
{
for (int i = 0; i < nn; i++)
{
bw.Write(xyz.Field[i]);
}
}
Writing everything with BinaryWriter directly takes very little time even for this number of elements.
I would expect protobuf-net to be somewhat slower, but I was hoping it would still be practical. I waited 15 minutes and the serialization still had not finished.
Is my usage wrong or is it expected that it is this slow?
NOTE This is just an example which is part of a bigger solution. I am interested in serializing such things with protobuf-net, even if for this particular example the obvious choice is to write all the integers manually :).
Regards, Iulian
1: At the wire level, "varint" should be fine - it can hold up to 64 bits; however, I doubt that the implementation has been tested beyond 2GB sizes. Note that Google's recommended usage of protocol buffers is much smaller than that.
2: Yes, serializing a billion things could take quite some considerable time. I haven't looked at that specific array scenario, but if I had to guess: it is trying to buffer it in-memory first; that could be a scenario I look at to optimize.
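If buffering is indeed the bottleneck, one possible workaround (a sketch only - Chunk and chunkSize are illustrative names, and this has not been profiled at the 950M scale) is to split the array into smaller length-prefixed messages so that no single message has to be buffered whole:
[ProtoContract]
public class Chunk
{
    [ProtoMember(1, IsPacked = true)]
    public int[] Values { get; set; }
}

const int chunkSize = 1000000;
using (var fs = new FileStream(@"C:\file.bin", FileMode.Create))
{
    for (int offset = 0; offset < nn; offset += chunkSize)
    {
        int count = Math.Min(chunkSize, nn - offset);
        var chunk = new Chunk { Values = new int[count] };
        Array.Copy(xyz.Field, offset, chunk.Values, 0, count);
        // Each chunk is an independent, length-prefixed message on the stream.
        Serializer.SerializeWithLengthPrefix(fs, chunk, PrefixStyle.Base128, 1);
    }
}
On the read side, the chunks can be read back in a loop with DeserializeWithLengthPrefix and concatenated.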
I'm running into a "compatibility" issue between two versions of the same program: the first one is written in Java, the second is a port to C#.
My goal is to write some data to a file (for example, in Java), like a sequence of numbers, then to have the ability to read it in C#. Obviously, the operation should work in the reversed order.
For example, I want to write 3 numbers in sequence, represented with the following schema:
first number as one 'byte' (8 bit)
second number as one 'integer' (32 bit)
third number as one 'integer' (32 bit)
So, I write the following sequence to a new file: 2 (as byte), 120 (as int32), 180 (as int32)
In Java, the writing procedure is more or less this one:
FileOutputStream outputStream;
byte[] byteToWrite;
// ... initialization....
// first byte
outputStream.write(first_byte);
// integers
byteToWrite = ByteBuffer.allocate(4).putInt(first_integer).array();
outputStream.write(byteToWrite);
byteToWrite = ByteBuffer.allocate(4).putInt(second_integer).array();
outputStream.write(byteToWrite);
outputStream.close();
While the reading part is the following:
FileInputStream inputStream;
ByteBuffer byteToRead;
// ... initialization....
// first byte
first_byte = inputStream.read();
// integers
byteToRead = ByteBuffer.allocate(4);
inputStream.read(byteToRead.array());
first_integer = byteToRead.getInt();
byteToRead = ByteBuffer.allocate(4);
inputStream.read(byteToRead.array());
second_integer = byteToRead.getInt();
inputStream.close();
C# code is the following. Writing:
FileStream fs;
byte[] byteToWrite;
// ... initialization....
// first byte
byteToWrite = new byte[1];
byteToWrite[0] = first_byte;
fs.Write(byteToWrite, 0, byteToWrite.Length);
// integers
byteToWrite = BitConverter.GetBytes(first_integer);
fs.Write(byteToWrite, 0, byteToWrite.Length);
byteToWrite = BitConverter.GetBytes(second_integer);
fs.Write(byteToWrite, 0, byteToWrite.Length);
Reading:
FileStream fs;
byte[] byteToRead;
// ... initialization....
// first byte
byte[] firstByteBuff = new byte[1];
fs.Read(firstByteBuff, 0, firstByteBuff.Length);
first_byte = firstByteBuff[0];
// integers
byteToRead = new byte[4 * 2];
fs.Read(byteToRead, 0, byteToRead.Length);
first_integer = BitConverter.ToInt32(byteToRead, 0);
second_integer = BitConverter.ToInt32(byteToRead, 4);
Please note that both procedures work when the same version (Java or C#) of the program writes and reads the file. The problem arises when the C# version tries to read a file written by the Java program, and vice versa. The integers read back are always "strange" numbers (like -1451020...).
There's surely a compatibility issue in the way Java stores and reads 32-bit integer values (always signed, right?) compared to C#. How do I handle this?
It's just an endian-ness issue. You can use my MiscUtil library to read big-endian data from .NET.
However, I would strongly advise a simpler approach to both your Java and your .NET:
In Java, use DataInputStream and DataOutputStream. There's no need to get complicated with ByteBuffer etc.
In .NET, use EndianBinaryReader from MiscUtil, which extends BinaryReader (and likewise EndianBinaryWriter for BinaryWriter)
Alternatively, consider just using text instead.
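If you'd rather avoid a library dependency, you can also reverse the byte order by hand on the .NET side; a minimal sketch (variable names are illustrative):
// Java's ByteBuffer/DataOutputStream default to big-endian; BitConverter
// on x86/x64 is little-endian, so reverse the bytes before converting.
byte[] buffer = new byte[4];
fs.Read(buffer, 0, buffer.Length);
if (BitConverter.IsLittleEndian)
    Array.Reverse(buffer);
int first_integer = BitConverter.ToInt32(buffer, 0);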
I'd consider using a standard format like XML or JSON to store your data. Then you can use standard serializers in both Java and C# to read/write the file. This sort of approach lets you easily name the data fields, read it from many languages, be easily understandable if someone opens the file in a text editor, and more easily add data to be serialized.
E.g. you can read/write JSON with Gson in Java and Json.NET in C#. The class might look like this in C#:
public class MyData
{
public byte FirstValue { get; set; }
public int SecondValue { get; set; }
public int ThirdValue { get; set; }
}
// serialize to string example
var myData = new MyData { FirstValue = 2, SecondValue = 5, ThirdValue = -1 };
string serialized = JsonConvert.SerializeObject(myData);
It would serialize to
{"FirstValue":2,"SecondValue":5,"ThirdValue":-1}
The Java would, similarly, be quite simple. You can find examples of how to read/write files in each library.
Or if an array would be a better model for your data:
string serialized = JsonConvert.SerializeObject(new[] { 2, 5, -1 }); // [2,5,-1]
I have a weird situation happening that I'm not quite understanding.
I have a 'dataset' class that holds various metadata about a monitoring buoy including a list of 'sensors'.
Each 'sensor' has a current 'sensorstate'.
Each 'sensorstate' has a bit of metadata about it (timestamp, reason for change etc) but most importantly it has a Dictionary<DateTime,float> of values.
These sensors generally have upwards of 50k data points (years' worth of 15-min readings), so I wanted something faster at serializing than the default .NET BinaryFormatter, and set up protobuf-net, which serializes fantastically fast.
Unfortunately, my problem occurs on deserialization: the dictionary of values throws an exception because an item with the same key has already been added, and the only way I can get it to deserialize is to enable 'OverwriteList'. I'm a little unsure why: there can't be duplicate keys when serializing (it's a dictionary), so why are there duplicate keys when I deserialize? This also raises data integrity concerns.
Any help in explaining this would be highly appreciated.
(On a side note: when giving ProtoMember attribute ids, do they need to be unique to the class or to the whole project? Also, I'm looking for lossless compression recommendations to use in conjunction with protobuf-net, as the files are getting pretty large.)
Edit:
I've just put my source up on GitHub and here is the class in question
SensorState (Note: it currently has OverwriteList = true in order to have it working for other development)
Here is an example raw data file
I had already tried using the SkipConstructor flag, but even with it set to true it gets an exception unless OverwriteList is also true for the values dictionary.
If OverwriteList fixes it, then it suggests to me that the dictionary has some data in it by default, perhaps via a constructor or similar. If it is indeed coming from the constructor, you can disable that with [ProtoContract(SkipConstructor=true)].
If I have misunderstood the above, it may help to illustrate with a reproducible example, if possible.
With regard to the ids, they only need to be unique inside each type, and it is recommended to keep them small (due to "varint" encoding of tags, small keys are "cheaper" than large keys).
If you want to really minimise size, I would actually suggest looking at the content of the data, too. For example, you say that this is 15 minute readings... well, I'm guessing there are occasional gaps, but could you do, for example:
Block (class)
Start Time (DateTime)
Values (float[])
and have a Block for every contiguous run of 15-minute values (the assumption here is that every value is 15 minutes after the last, else a new block is started). So you are storing multiple Block instances in place of a single dictionary. This has the following advantages:
far fewer DateTime values to store
you can use "packed" encoding on the floats, which means it doesn't need to add all the intermediate tags; you do this by marking an array/list as ([ProtoMember({key}, IsPacked = true)]) - noting that it only works on a few basic data-types (not sub-objects)
combined, these two tweaks could yield significant savings
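As a sketch, the Block described above might look something like this (names are illustrative):
[ProtoContract]
public class Block
{
    [ProtoMember(1)]
    public DateTime StartTime { get; set; }

    // Packed encoding: one header for the whole array rather than one per value.
    [ProtoMember(2, IsPacked = true)]
    public float[] Values { get; set; }
}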
If the data has a lot of strings, you could try GZIP/DEFLATE. You can of course try these either way, but without large amounts of string data I would be cautious of expecting too much extra from compression.
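If you do want to try it, compression can be layered over the protobuf stream without changing the contract at all; a minimal sketch using the framework's GZipStream (file names are illustrative, and SensorState stands in for whatever [ProtoContract] object you are serializing):
using System.IO;
using System.IO.Compression;
using ProtoBuf;

// Compress while serializing...
using (var file = File.Create("sensors.bin.gz"))
using (var gzip = new GZipStream(file, CompressionMode.Compress))
{
    Serializer.Serialize(gzip, data);
}

// ...and decompress while deserializing.
using (var file = File.OpenRead("sensors.bin.gz"))
using (var gzip = new GZipStream(file, CompressionMode.Decompress))
{
    var clone = Serializer.Deserialize<SensorState>(gzip);
}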
As an update based on the supplied (CSV) data file, there is no inherent problem here handling the dictionary - as shown:
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using ProtoBuf;
class Program
{
static void Main()
{
var data = new Data
{
Points =
{
{new DateTime(2009,09,1,0,0,0), 11.04F},
{new DateTime(2009,09,1,0,15,0), 11.04F},
{new DateTime(2009,09,1,0,30,0), 11.01F},
{new DateTime(2009,09,1,0,45,0), 11.01F},
{new DateTime(2009,09,1,1,0,0), 11F},
{new DateTime(2009,09,1,1,15,0), 10.98F},
{new DateTime(2009,09,1,1,30,0), 10.98F},
{new DateTime(2009,09,1,1,45,0), 10.92F},
{new DateTime(2009,09,1,2,00,0), 10.09F},
}
};
var ms = new MemoryStream();
Serializer.Serialize(ms, data);
ms.Position = 0;
var clone = Serializer.Deserialize<Data>(ms);
Console.WriteLine("{0} points:", clone.Points.Count);
foreach(var pair in clone.Points.OrderBy(x => x.Key))
{
float orig;
data.Points.TryGetValue(pair.Key, out orig);
Console.WriteLine("{0}: {1}", pair.Key, pair.Value == orig ? "correct" : "FAIL");
}
}
}
[ProtoContract]
class Data
{
private readonly Dictionary<DateTime, float> points = new Dictionary<DateTime, float>();
[ProtoMember(1)]
public Dictionary<DateTime, float> Points { get { return points; } }
}
This is where I apologize for ever suggesting it had anything to do with code that wasn't my own. And while I'm here, mad props to the team behind protobuf, and to Marc Gravell for protobuf-net - it's seriously fast.
What was happening was that in the Sensor class I had some logic to never let a couple of properties be null.
[ProtoMember(12)]
public SensorState CurrentState
{
get { return (_currentState == null) ? RawData : _currentState; }
set { _currentState = value; }
}
[ProtoMember(16)]
public SensorState RawData
{
get { return _rawData ?? (_rawData = new SensorState(this, DateTime.Now, new Dictionary<DateTime, float>(), "", true, null)); }
private set { _rawData = value; }
}
While this works fantastically when I'm using the properties, it messes up the serialization process.
The simple fix was to mark the underlying fields for serialization instead.
[ProtoMember(16)]
private SensorState _rawData;
[ProtoMember(12)]
private SensorState _currentState;
CString strFile = "c:\\test.txt";
CStdioFile aFile;
UINT nOpenFlags = CFile::modeWrite | CFile::modeCreate | CFile::typeText;
CFileException anError;
if (!aFile.Open(strFile, nOpenFlags, &anError))
{
return false;
}
int nSize = 2*sizeof(double); // matches the two doubles allocated below
double* pData = new double[2];
CString strLine, str;
// Write begin of header
strLine = _T(">>> Begin of header <<<\n");
aFile.WriteString(strLine);
// Retrieve current position of file pointer
long lFilePos = (long) aFile.GetPosition();
// Close file
aFile.Close();
nOpenFlags = CFile::modeWrite | CFile::typeBinary;
if (!aFile.Open(strFile, nOpenFlags, &anError))
{
return false;
}
for(int i = 0 ; i < 2 ; i++)
{
pData[i] = i;
}
// Set position of file pointer behind header
aFile.Seek(lFilePos, CFile::begin);
// Write complex vector
aFile.Write(pData, nSize);
// Write complex vector
aFile.Write(pData, nSize);
// Close file
aFile.Close();
delete[] pData;
The intention is to create a file which contains both text data and binary data. This code is written in MFC. I want to similarly create a file in C# that contains both text data and binary data. Please let me know which stream class should be used.
Text can be written as binary data => simply use binary mode for the whole file and be done.
The only thing text mode does is convert "\n" to "\r\n" on write and back on read. Since the file is partly binary and therefore not editable in a regular text editor anyway, you don't need that conversion. If the file is just for your application, you simply don't care; and if it's for another application, just write whatever newline sequence it requires manually.
As to C#, possibly this S.O. article can give you the answer you are looking for.
The C# solution could also guide you in writing something similar in C, though I suspect you are on your own there, i.e., you can use generic read/write calls on a file opened in binary mode. In C++, you also have the option of formatted input/output from/to streams using operator>> and operator<<.
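For the C# side specifically, a single FileStream wrapped in a BinaryWriter can produce the same layout as the MFC code above; a minimal sketch (file name and values are illustrative):
using System.IO;
using System.Text;

using (var fs = new FileStream("test.txt", FileMode.Create))
using (var writer = new BinaryWriter(fs))
{
    // The text header, written as raw bytes - no text-mode newline translation.
    writer.Write(Encoding.ASCII.GetBytes(">>> Begin of header <<<\n"));

    // The binary payload: two doubles, written twice, mirroring the MFC code.
    double[] data = { 0.0, 1.0 };
    for (int pass = 0; pass < 2; pass++)
        foreach (double d in data)
            writer.Write(d);
}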
(this is a re-post of a question that I saw in my RSS, but which was deleted by the OP. I've re-added it because I've seen this question asked several times in different places; wiki for "good form")
Suddenly, I receive a ProtoException when deserializing and the message is: unknown wire-type 6
What is a wire-type?
What are the different wire-type values and their description?
I suspect a field is causing the problem, how to debug this?
First thing to check:
IS THE INPUT DATA PROTOBUF DATA? If you try and parse another format (json, xml, csv, binary-formatter), or simply broken data (an "internal server error" html placeholder text page, for example), then it won't work.
What is a wire-type?
It is a 3-bit flag that tells it (in broad terms; it is only 3 bits after all) what the next data looks like.
Each field in protocol buffers is prefixed by a header that tells it which field (number) it represents, and what type of data is coming next; this "what type of data" is essential to support the case where unanticipated data is in the stream (for example, you've added fields to the data-type at one end), as it lets the serializer know how to read past that data (or store it for round-trip if required).
What are the different wire-type values and their description?
0: variable-length integer ("varint", up to 64 bits) - base-128 encoded with the MSB indicating continuation (used as the default for integer types, including enums)
1: 64-bit - 8 bytes of data (used for double, or electively for long/ulong)
2: length-prefixed - first read an integer using variable-length encoding; this tells you how many bytes of data follow (used for strings, byte[], "packed" arrays, and as the default for child object properties / lists)
3: "start group" - an alternative mechanism for encoding child objects that uses start/end tags - largely deprecated by Google, it is more expensive to skip an entire child-object field since you can't just "seek" past an unexpected object
4: "end group" - twinned with 3
5: 32-bit - 4 bytes of data (used for float, or electively for int/uint and other small integer types)
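To see these in the raw bytes: each field header is itself a varint that combines the field number and the wire-type. For headers that fit in a single byte (field numbers up to 15), you can decompose one like this sketch (stream is any readable Stream positioned at a field header):
int header = stream.ReadByte(); // e.g. 0x0A = field 1, wire-type 2 (length-prefixed)
int fieldNumber = header >> 3;  // upper bits: the field number
int wireType = header & 0x07;   // low 3 bits: one of the wire-type values above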
I suspect a field is causing the problem, how to debug this?
Are you serializing to a file? The most likely cause (in my experience) is that you have overwritten an existing file, but have not truncated it; i.e. it was 200 bytes; you've re-written it, but with only 182 bytes. There are now 18 bytes of garbage on the end of your stream that is tripping it up. Files must be truncated when re-writing protocol buffers. You can do this with FileMode:
using(var file = new FileStream(path, FileMode.Truncate)) {
// write
}
or alternatively by SetLength after writing your data:
file.SetLength(file.Position);
Other possible cause
You are (accidentally) deserializing a stream into a different type than what was serialized. It's worth double-checking both sides of the conversation to ensure this is not happening.
This can also be caused by an attempt to write more than one protobuf message to a single stream. The solution is to use SerializeWithLengthPrefix and DeserializeWithLengthPrefix.
Why this happens:
The protobuf specification supports a fairly small number of wire-types (the binary storage formats) and data-types (the .NET etc. data-types). Additionally, this is not 1:1, nor is it 1:many or many:1 - a single wire-type can be used for multiple data-types, and a single data-type can be encoded via any of multiple wire-types. As a consequence, you cannot fully understand a protobuf fragment unless you already know the schema, so you know how to interpret each value. When you are, say, reading an Int32 data-type, the supported wire-types might be "varint", "fixed32" and "fixed64", whereas when reading a String data-type, the only supported wire-type is "string".
If there is no compatible map between the data-type and wire-type, then the data cannot be read, and this error is raised.
Now let's look at why this occurs in the scenario here:
[ProtoContract]
public class Data1
{
[ProtoMember(1, IsRequired=true)]
public int A { get; set; }
}
[ProtoContract]
public class Data2
{
[ProtoMember(1, IsRequired = true)]
public string B { get; set; }
}
class Program
{
static void Main(string[] args)
{
var d1 = new Data1 { A = 1};
var d2 = new Data2 { B = "Hello" };
var ms = new MemoryStream();
Serializer.Serialize(ms, d1);
Serializer.Serialize(ms, d2);
ms.Position = 0;
var d3 = Serializer.Deserialize<Data1>(ms); // This will fail
var d4 = Serializer.Deserialize<Data2>(ms);
Console.WriteLine("{0} {1}", d3, d4);
}
}
In the above, two messages are written directly after each other. The complication: protobuf is an appendable format, with append meaning "merge". A protobuf message does not know its own length, so the default way of reading a message is: read until EOF. However, here we have appended two different types. If we read this back, the reader does not know when it has finished reading the first message, so it keeps reading. When it gets to data from the second message, we find ourselves reading a "string" wire-type, but we are still trying to populate a Data1 instance, for which member 1 is an Int32. There is no map between "string" and Int32, so it explodes.
The *WithLengthPrefix methods allow the serializer to know where each message finishes; so, if we serialize a Data1 and Data2 using the *WithLengthPrefix, then deserialize a Data1 and a Data2 using the *WithLengthPrefix methods, then it correctly splits the incoming data between the two instances, only reading the right value into the right object.
Additionally, when storing heterogeneous data like this, you might want to additionally assign (via *WithLengthPrefix) a different field-number to each class; this provides greater visibility of which type is being deserialized. There is also a method in Serializer.NonGeneric which can then be used to deserialize the data without needing to know in advance what we are deserializing:
// Data1 is "1", Data2 is "2"
Serializer.SerializeWithLengthPrefix(ms, d1, PrefixStyle.Base128, 1);
Serializer.SerializeWithLengthPrefix(ms, d2, PrefixStyle.Base128, 2);
ms.Position = 0;
var lookup = new Dictionary<int,Type> { {1, typeof(Data1)}, {2,typeof(Data2)}};
object obj;
while (Serializer.NonGeneric.TryDeserializeWithLengthPrefix(ms,
PrefixStyle.Base128, fieldNum => lookup[fieldNum], out obj))
{
Console.WriteLine(obj); // writes Data1 on the first iteration,
// and Data2 on the second iteration
}
Previous answers already explain the problem better than I can. I just want to add an even simpler way to reproduce the exception.
This error will also occur simply if the type of a serialized ProtoMember is different from the expected type during deserialization.
For instance if the client sends the following message:
[ProtoContract]
public class DummyRequest
{
    [ProtoMember(1)]
    public int Foo { get; set; }
}
But what the server deserializes the message into is the following class:
[ProtoContract]
public class DummyRequest
{
    [ProtoMember(1)]
    public string Foo { get; set; }
}
Then this will result in an error message that is, for this case, slightly misleading:
ProtoBuf.ProtoException: Invalid wire-type; this usually means you have over-written a file without truncating or setting the length
It will even occur if the property name changed. Let's say the client sent the following instead:
[ProtoContract]
public class DummyRequest
{
    [ProtoMember(1)]
    public int Bar { get; set; }
}
This will still cause the server to try to deserialize the int Bar into the string Foo, which causes the same ProtoBuf.ProtoException.
I hope this helps somebody debugging their application.
Also check the obvious: that all your subclasses have the [ProtoContract] attribute. Sometimes you can miss it when you have a rich DTO hierarchy.
I've seen this issue when using the wrong Encoding type to convert the bytes to and from strings.
You need to use Encoding.Default and not Encoding.UTF8.
using (var ms = new MemoryStream())
{
Serializer.Serialize(ms, obj);
var bytes = ms.ToArray();
str = Encoding.Default.GetString(bytes);
}
If you are using SerializeWithLengthPrefix, please mind that casting the instance to object breaks the deserialization code and causes ProtoBuf.ProtoException: Invalid wire-type.
using (var ms = new MemoryStream())
{
var msg = new Message();
Serializer.SerializeWithLengthPrefix(ms, (object)msg, PrefixStyle.Base128); // Casting msg to object breaks the deserialization code.
ms.Position = 0;
Serializer.DeserializeWithLengthPrefix<Message>(ms, PrefixStyle.Base128);
}
This happened in my case because I had something like this:
var ms = new MemoryStream();
Serializer.Serialize(ms, batch);
_queue.Add(Convert.ToBase64String(ms.ToArray()));
So basically I was putting a Base64 string into a queue, and then on the consumer side I had:
var stream = new MemoryStream(Encoding.UTF8.GetBytes(myQueueItem));
var batch = Serializer.Deserialize<List<EventData>>(stream);
So although the type of each myQueueItem was correct, I forgot that I had converted the data to a Base64 string. The solution was to convert it back:
var bytes = Convert.FromBase64String(myQueueItem);
var stream = new MemoryStream(bytes);
var batch = Serializer.Deserialize<List<EventData>>(stream);