Protobuf-net serialization/deserialization: C# vs Linux C++

I'm passing messages between a Windows C# client and a Linux C++ server via TCP sockets. The C# code uses protobuf-net v2; the Linux side uses Google's version of protobuf. The small test object I pass has 6 fields (Enum, Int, String). I need help with two problems:
1. The C# side is unable to deserialize data sent from Linux unless the MemoryStream used as storage for the data is initialized with the binary array in the constructor, and the array cannot be larger than the data sent from Linux (9 bytes in my case). Code sample:
byte[] data = new byte[9]; // copy data from the socket into the array
MemoryStream myStream = new MemoryStream(data);
// pass myStream to Serializer.Deserialize...
If I initialize the MemoryStream without a binary buffer, or with an array of 1024 bytes, Deserialize will create an empty object without processing the data.
2. When I serialize the same object with the same values in C#, the size of the data is 11 bytes vs 9 on Linux. I checked the byte array in the debugger: the C# version has the same 9 bytes as the Linux data at indexes 2-11 of the array; index 0 is 8 and index 1 is 9. I can try to work around the problem by modifying the Linux deserialization code; I just need to know whether I always have to deal with two extra bytes at the beginning of the message. Alternatively, I can add two extra fields to the messages generated on Linux if that is going to fix my deserialization in C#; I just need to know how to generate values for those two fields.
Thanks.
Alex.

Protobuf data is not self-terminating, simply. You can, however, create either a MemoryStream or a ProtoReader that takes a larger payload but is limited to a virtual length. If you are sending multiple messages, you will need to know the length of each payload - that is inescapable; this is often achieved via a length prefix. As for the oversized-buffer case: I would expect that to throw random errors - most likely "invalid field header 0" - and I wonder if you are swallowing that exception.
On the second problem: impossible to comment without the specific example; most likely it is something to do with the default value of field number 1 (header 8 === "field 1, varint").
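A sketch of both suggestions, assuming a protobuf-net contract type called MyMessage and variables networkStream/message as stand-ins for the real ones. Note that the two leading bytes observed above (8, then 9) decode as "field 1, varint, value 9", which fits the default-value explanation rather than any framing:
using System.IO;
using ProtoBuf;

// Deserialize from a larger buffer, but limit the stream to the known payload length:
byte[] buffer = new byte[1024]; // filled from the socket
int bytesRead = 9;              // however many bytes actually arrived
using (var ms = new MemoryStream(buffer, 0, bytesRead))
{
    MyMessage msg = Serializer.Deserialize<MyMessage>(ms);
}

// Or, if both ends can agree on framing, let protobuf-net write and read
// a length prefix itself:
Serializer.SerializeWithLengthPrefix(networkStream, message, PrefixStyle.Base128);
MyMessage received = Serializer.DeserializeWithLengthPrefix<MyMessage>(networkStream, PrefixStyle.Base128);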

Related

OutputBuffer not working for large C# list

I'm currently using SSIS to make an improvement on a project. I need to insert single documents into a MongoDB collection of type Time Series. At some point I want to retrieve rows of data after going through a C# transformation script. I did this:
foreach (BsonDocument bson in listBson)
{
    OutputBuffer.AddRow();
    OutputBuffer.DatalineX = (string) bson.GetValue("data");
}
But this piece of code, which works great with a small file, does not work with a 6-million-line file. That is, there are no lines in the output. The other following tasks validate, but react as if they had received nothing as input.
Where could the problem come from?
Your OutputBuffer has DatalineX defined as a string, either DT_STR or DT_WSTR, with a specific length. When you exceed that length, things go bad. For normal strings, you'd have a maximum length of 8k or 4k characters respectively.
Neither of which is useful for your use case of at least 6M characters. To handle that, you'll need to change your data type to DT_TEXT/DT_NTEXT. Those data types do not require a length, as they are "max" types. There are lots of things to be aware of when using the LOB types:
Performance can suck depending on whether SSIS can keep the data in memory (good) or has to write intermediate values to disk (bad)
You can't readily manipulate them in a data flow
You'll use a different syntax in a Script Component to work with them
e.g.
// DT_NTEXT takes raw bytes, so the string must be converted first;
// Unicode (UTF-16) is assumed here - picking the right encoding is
// exactly the question the linked answer wrestles with
byte[] bytes = System.Text.Encoding.Unicode.GetBytes((string)bson.GetValue("data"));
Output0Buffer.DatalineX.AddBlobData(bytes);
A longer example - of questionable accuracy with regard to encoding the bytes, which you get to solve - is at https://stackoverflow.com/a/74902194/181965

Read binary file using BinaryFormatter

I am making a dotnet app that needs to read a binary file from a 3rd party.
The file contains a 516-byte header record/struct (a couple of long identifiers and a couple of fixed-length char array strings) followed by a number of payload structs (240 bytes of integers, booleans and chars each).
I know I can read this file in dotnet using BinaryReader and deserialize the fields in the structs one by one.
I have POCOs/structs that correctly define the properties needed for the 2 record types, but I can't see any way of letting BinaryFormatter know which type (and how much) to read from the stream next, as the binder seems to rely on the type name being serialized along with the record payload, which it is not in this file.
I would like to know: is there a way of doing this via BinaryFormatter, deserializing the POCOs directly?
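For reference, a minimal sketch of the BinaryReader route the question already rules in. The field names, the split of the header's string bytes into two 250-byte strings, and the record offsets are illustrative assumptions, not taken from the actual 3rd-party spec:
using System;
using System.IO;
using System.Text;

using (var reader = new BinaryReader(File.OpenRead("data.bin"), Encoding.ASCII))
{
    // 516-byte header: two long identifiers (8 bytes each) plus two
    // fixed-length char strings (assumed to be 250 bytes each)
    long id1 = reader.ReadInt64();
    long id2 = reader.ReadInt64();
    string label1 = new string(reader.ReadChars(250)).TrimEnd('\0');
    string label2 = new string(reader.ReadChars(250)).TrimEnd('\0');

    // Payload records: fixed 240-byte blocks until end of file
    while (reader.BaseStream.Position + 240 <= reader.BaseStream.Length)
    {
        byte[] record = reader.ReadBytes(240);
        int someInt = BitConverter.ToInt32(record, 0);   // assumed offset 0
        bool someFlag = record[4] != 0;                   // assumed offset 4
        // ... map the remaining bytes to the POCO fields by offset
    }
}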

NetworkStream.Length substitute

I am using a NetworkStream to pass short strings around the network.
Now, on the receiving side I have encountered an issue.
Normally I would do the reading like this:
see if data is available at all
get the count of data available
read that many bytes into a buffer
convert the buffer content to a string.
In code that assumes all offered methods work as probably intended, that would look something like this:
NetworkStream stream = someTcpClient.GetStream();
while (!stream.DataAvailable)
    ; // spin until some data arrives
byte[] bufferByte = new byte[stream.Length];   // NetworkStream.Length: see below
stream.Read(bufferByte, 0, (int)stream.Length);
ASCIIEncoding enc = new ASCIIEncoding();
string result = enc.GetString(bufferByte);
However, MSDN says that NetworkStream.Length is not actually implemented and will always throw an exception when called.
Since the incoming data is of varying length, I cannot hard-code the count of bytes to expect (which would also be a case of the magic-number antipattern).
Question:
If I cannot get an accurate count of the number of bytes available for reading, how can I read from the stream properly, without risking all sorts of exceptions within NetworkStream.Read?
EDIT:
Although the provided answer leads to better overall code, I still want to share another option that I came across:
TcpClient.Available gives the number of bytes available to read. I knew there had to be a way to count the bytes in one's own inbox.
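A sketch of that option, with the caveat that Available only reports the bytes already buffered locally, not necessarily the sender's full message:
// Available counts bytes the OS has received so far, which may be
// less than what the other side intends to send
int available = someTcpClient.Available;
byte[] buffer = new byte[available];
int read = someTcpClient.GetStream().Read(buffer, 0, available);
string result = new ASCIIEncoding().GetString(buffer, 0, read);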
There's no guarantee that calls to Read on one side of the connection will match up 1-1 with calls to Write from the other side. If you're dealing with variable-length messages, it's up to you to provide the receiving side with this information.
One common way to do this is to first work out the length of the message you're going to send, and then send that length information first. On the receiving side, you obtain the length first, so you know how big a buffer to allocate, and then call Read in a loop until you've read the correct number of bytes (see the sketch after this answer). Note that, in your original code, you're currently ignoring the return value from Read, which tells you how many bytes were actually read. In a single call and return, this could be as low as 1, even if you're asking for more than 1 byte.
Another common way is to decide on message "formats" - where, e.g., message number 1 is always 32 bytes in length and has X structure, and message number 2 is 51 bytes in length and has Y structure. With this approach, rather than sending the message length before sending the message, you send the format information instead - first you send "here comes a message of type 1" and then you send the message.
A further common way, if applicable, is to use some form of sentinel - if your messages will never contain, say, a byte with value 0xff, then you scan the received bytes until you've received a 0xff byte, and everything before that byte was the message you wanted to receive.
But whatever you want to do, whether it's one of the above approaches or something else, it's up to you to have your sending and receiving sides work together to allow the receiver to discover each message.
I forgot to say, but a further way to change everything around is this: if you want to exchange messages and don't want to do any of the above fiddling around, switch to something that works at a higher level - e.g. WCF, or HTTP, or something else - where those systems already take care of message framing, and you can then just concentrate on what to do with your messages.
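A minimal sketch of the length-prefix approach, assuming ASCII strings and a 4-byte big-endian prefix (both arbitrary choices for illustration):
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Text;

static void SendMessage(NetworkStream stream, string message)
{
    byte[] payload = Encoding.ASCII.GetBytes(message);
    // 4-byte length prefix in network byte order
    byte[] prefix = BitConverter.GetBytes(IPAddress.HostToNetworkOrder(payload.Length));
    stream.Write(prefix, 0, prefix.Length);
    stream.Write(payload, 0, payload.Length);
}

static string ReceiveMessage(NetworkStream stream)
{
    int length = IPAddress.NetworkToHostOrder(BitConverter.ToInt32(ReadExactly(stream, 4), 0));
    return Encoding.ASCII.GetString(ReadExactly(stream, length));
}

// Loop until exactly 'count' bytes have arrived; a single Read may return fewer
static byte[] ReadExactly(NetworkStream stream, int count)
{
    byte[] buffer = new byte[count];
    int offset = 0;
    while (offset < count)
    {
        int read = stream.Read(buffer, offset, count - offset);
        if (read == 0)
            throw new EndOfStreamException("Connection closed mid-message.");
        offset += read;
    }
    return buffer;
}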
You could use StreamReader to read the stream to the end:
var streamReader = new StreamReader(someTcpClient.GetStream(), Encoding.ASCII);
string result = streamReader.ReadToEnd();

Inserting a byte array larger than 8k bytes

I'm using the code
cmd.Parameters.Add("@Array", SqlDbType.VarBinary).Value = Array;
The SqlDbType.VarBinary description states that it can only handle arrays up to 8K bytes. I have a byte array that represents an image and can go up to 10K bytes.
How do I store that in a varbinary(max) column using C#?
I have no trouble creating the array; I'm stuck at this 8K limit when trying to execute the query.
Edit: Let me clarify: on my machine, even pictures up to 15K bytes get stored in the varbinary(MAX) column when I run the asp.net application locally, but once I deployed it, the pictures would not get stored. I then resorted to drastically resizing the images to ensure their size was less than 8K, and now the images get stored without any problem.
Perhaps you could look at the SQL Server FILESTREAM feature, since it's meant for storing files. It basically stores a pointer to your file in the table, and the file is stored directly in the filesystem (in the database's data directory).
I like FILESTREAM since it means you continue to use the interface to the database (SqlClient, for example) rather than breaking out to an ad-hoc method to read/write files on the hard drive. This means security is managed for you, in that your app doesn't need special permissions to access the filesystem.
A quick google gave this article on using FILESTREAM in C#, but I'm sure there are many others.
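To illustrate that point, reading a FILESTREAM column still goes through SqlClient; the table and column names below, plus connectionString and pictureId, are made up for the sketch:
using System.Data.SqlClient;
using System.Data.SqlTypes;
using System.IO;

using (var conn = new SqlConnection(connectionString))
{
    conn.Open();
    // FILESTREAM access requires a transaction context
    using (var tx = conn.BeginTransaction())
    {
        var cmd = new SqlCommand(
            "SELECT Picture.PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT() " +
            "FROM Pictures WHERE Id = @id", conn, tx);
        cmd.Parameters.AddWithValue("@id", pictureId);
        using (var reader = cmd.ExecuteReader())
        {
            if (reader.Read())
            {
                string path = reader.GetString(0);
                byte[] txContext = (byte[])reader[1];
                // Stream the file contents under the database's security context
                using (var fs = new SqlFileStream(path, txContext, FileAccess.Read))
                {
                    byte[] image = new byte[fs.Length];
                    fs.Read(image, 0, image.Length);
                }
            }
        }
        tx.Commit();
    }
}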
UPDATE following OP EDIT
So once deployed to another server, the upload fails? Perhaps the problem is not the SQL insert but that there is an HTTP request content-length limit imposed - for example, in your web.config the httpRuntime element has the maxRequestLength attribute. If this is set to a low value, perhaps this is the problem. So you could set it to something like this (sets the max to 6MB, well over the 10K problem):
<system.web>
  <httpRuntime maxRequestLength="6144" />
</system.web>
The only thing to note here is that the limit is 4MB by default :|
No, this is what the description actually says:
"Array of type Byte. A variable-length stream of binary data ranging between 1 and 8,000 bytes. Implicit conversion fails if the byte array is greater than 8,000 bytes. Explicitly set the object when working with byte arrays larger than 8,000 bytes."
I would assume that what that actually means is that you cannot use AddWithValue to have a parameter infer the type as VarBinary if the byte array is over 8,000 elements. You would have to use Add, specify the type of the parameter yourself and then set the Value property, i.e. use this:
command.Parameters.Add("@MyColumn", SqlDbType.VarBinary).Value = myByteArray;
rather than this:
command.Parameters.AddWithValue("@MyColumn", myByteArray);
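If the target column is varbinary(MAX) specifically, it may also help to pass a Size of -1, which is how ADO.NET expresses the MAX length; a one-line sketch (the parameter name is just illustrative):
command.Parameters.Add("@MyColumn", SqlDbType.VarBinary, -1).Value = myByteArray;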
Adding the length of the data seems to be the fix:
var dataParam = cmd.Parameters.AddWithValue("@Data", (object)data.Data ?? DBNull.Value);
if (data.Data != null)
{
    dataParam.SqlDbType = SqlDbType.VarBinary;
    dataParam.Size = data.Data.Length;
}

H.225 User Information Packet Parsing

I'm writing some code using PacketDotNet and SharpPcap to parse H.225 packets for a VOIP phone system. I've been using Wireshark to look at the structure, but I'm stuck. I've been using this as a reference.
Most of the H.225 packets I see are of the user information type, with an empty message body, and the actual information apparently shows up as a list of NonStandardControls in Wireshark. I thought I'd just extract these controls and parse them later, but I don't really know where they start.
In almost all cases, the items start at the 10th byte of the H.225 data. Each item appears to begin with its length, which is recorded as 2 bytes. However, I am getting a packet that has items starting at the 11th byte.
The only difference I see in this packet is something in the message body called "open type length", which has a value of 1, whereas in the rest it is 0. Would the items start at 10 + open type length? Is there some document that explains what this open type length is for?
Thanks.
H.225 doesn't use a fixed-length encoding; it uses ASN.1 PER encoding (not BER).
You probably won't find a C# library. OPAL is adding a C API if you are able to use that.
