This is kind of a branch off of my other question. Read it if you like, but it's not necessary.
Basically, I realized that in order to effectively use C#'s BeginReceive() on large messages, I need to either (a) read the packet length first, then read exactly that many bytes or (b) use an end-of-packet delimiter. My question is, are either of these present in protocol buffers? I haven't used them yet, but going over the documentation it doesn't seem like there is a length header or a delimiter.
If not, what should I do? Should I just build the message then prefix/suffix it with the length header/EOP delimiter?
You need to include the size or an end marker in your protocol. Nothing is built into stream-based sockets (TCP/IP) other than support for an indefinite stream of octets arbitrarily broken up into separate packets (and packets can be split in transit as well).
A simple approach is to give each "message" a fixed-size header that includes a protocol version, the payload size, and any other fixed data, followed by the message content (payload).
Optionally a message footer (fixed size) could be added with a checksum or even a cryptographic signature (depending on your reliability/security requirements).
Knowing the payload size allows you to keep reading until you have enough bytes for the rest of the message (and if a read completes with less, to issue another read for the remaining bytes, until the whole message has been received).
Having an end-of-message indicator also works, but then you need to define how to handle a message whose content contains that same octet sequence...
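A minimal sketch of that read loop (the helper name and the use of Stream are mine, not part of the answer): given the payload size from the header, keep reading until every byte has arrived.

    using System.IO;

    // Reads exactly `count` bytes from the stream, looping until the full
    // payload has arrived or the peer closes the connection.
    static byte[] ReadExactly(Stream stream, int count)
    {
        var buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Connection closed before the full message arrived.");
            offset += read;
        }
        return buffer;
    }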
Apologies for arriving late at the party. I am the author of protobuf-net, one of the C# implementations. For network usage, you should consider the "[De]SerializeWithLengthPrefix" methods - that way, it will automatically handle the lengths for you. There are examples in the source.
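For reference, a hedged sketch of what using those helpers can look like (MyMessage and the wrapper methods are placeholders of mine; see the protobuf-net samples for the real thing):

    using System.Net.Sockets;
    using ProtoBuf;

    [ProtoContract]
    class MyMessage                      // stand-in for your own contract type
    {
        [ProtoMember(1)] public string Text { get; set; }
    }

    static void SendMessage(NetworkStream stream, MyMessage message)
    {
        // Writes a varint length prefix followed by the serialized message body.
        Serializer.SerializeWithLengthPrefix(stream, message, PrefixStyle.Base128);
    }

    static MyMessage ReceiveMessage(NetworkStream stream)
    {
        // Reads the prefix, then exactly that many bytes, and deserializes them.
        return Serializer.DeserializeWithLengthPrefix<MyMessage>(stream, PrefixStyle.Base128);
    }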
I won't go into huge detail on an old post, but if you want to know more, add a comment and I'll get back to you.
I agree with Matt that a header is better than a footer for Protocol Buffers, for the primary reason that as PB is a binary protocol it's problematic to come up with a footer that would not also be a valid message sequence. A lot of footer-based protocols (typically EOL ones) work because the message content is in a defined range (typically 0x20 - 0x7F ASCII).
A useful approach is to have your lowest level code just read buffers off of the socket and present them up to a framing layer that assembles complete messages and remembers partial ones (I present an async approach to this (using the CCR) here, albeit for a line protocol).
For consistency, you could always define your message as a PB message with three fields: a fixed-int as the length, an enum as the type, and a byte sequence that contains the actual data. This keeps your entire network protocol transparent.
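A hypothetical protobuf-net contract along those lines (the field numbers, the MessageType enum and the fixed-size option are illustrative, not an established protocol):

    using ProtoBuf;

    public enum MessageType { Chat = 1, Ping = 2 }   // illustrative message types

    [ProtoContract]
    public class Envelope
    {
        [ProtoMember(1, DataFormat = DataFormat.FixedSize)]
        public int Length { get; set; }        // length of Payload, as a fixed-width int

        [ProtoMember(2)]
        public MessageType Type { get; set; }  // discriminates the inner message

        [ProtoMember(3)]
        public byte[] Payload { get; set; }    // the serialized inner message
    }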
TCP/IP, as well as UDP, packets include some reference to their size. The IP header contains a 16-bit field that specifies the length of the IP header and data in bytes. The TCP header contains a 4-bit field that specifies the size of the TCP header in 32-bit words. The UDP header contains a 16-bit field that specifies the length of the UDP header and data in bytes.
Here's the thing.
Using the standard run-of-the-mill sockets in Windows, whether you're using the System.Net.Sockets namespace in C# or the native Winsock stuff in Win32, you never see the IP/TCP/UDP headers. These headers are stripped off so that what you get when you read the socket is the actual payload, i.e., the data that was sent.
The typical pattern from everything I've ever seen and done using sockets is that you define an application-level header that precedes the data you want to send. At a minimum, this header should include the size of the data to follow. This will allow you to read each "message" in its entirety without having to guess as to its size. You can get as fancy as you want with it, e.g., sync patterns, CRCs, version, type of message, etc., but the size of the "message" is all you really need.
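As a hedged illustration (the 8-byte layout and field choices below are mine, not a standard), building such a header might look like this:

    using System;

    // Prepends an 8-byte application header (version, message type, payload length)
    // to the payload. Bytes 2-3 are reserved padding, left as zero.
    static byte[] Frame(byte version, byte messageType, byte[] payload)
    {
        var frame = new byte[8 + payload.Length];
        frame[0] = version;
        frame[1] = messageType;
        // payload length in network byte order (big-endian) in bytes 4-7
        byte[] length = BitConverter.GetBytes(System.Net.IPAddress.HostToNetworkOrder(payload.Length));
        Buffer.BlockCopy(length, 0, frame, 4, 4);
        Buffer.BlockCopy(payload, 0, frame, 8, payload.Length);
        return frame;
    }

On the receive side you would read the fixed 8 header bytes first, pull the payload length out of bytes 4-7, and then read exactly that many payload bytes.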
And for what it's worth, I would suggest using a header instead of an end-of-packet delimiter. I'm not sure if there is a significant disadvantage to the EOP delimiter, but the header is the approach used by most IP protocols I've seen. In addition, it just seems more intuitive to me to process a message from the beginning rather than wait for some pattern to appear in my stream to indicate that my message is complete.
EDIT: I have only just become aware of the Google Protocol Buffers project. From what I can tell, it is a binary serialization/de-serialization scheme for WCF (I'm sure that's a gross oversimplification). If you are using WCF, you don't have to worry about the size of the messages being sent because the WCF plumbing takes care of this behind the scenes, which is probably why you haven't found anything related to message length in the Protocol Buffers documentation. However, in the case of sockets, knowing the size will help out tremendously as discussed above. My guess is that you will serialize your data using the Protocol Buffers and then tack on whatever application header you come up with before sending it. On the receive side, you'll pull off the header and then de-serialize the remainder of the message.
Related
I'm writing a simple chat program using sockets. When I send a long message, flush the stream, and then send a short message afterwards, the end of the long message gets appended to the short message. It looks like this:
Send "aaasdsd"
Receive "aaasdsd"
Send "bb"
Receive "bbasdsd"
Through debugging I've found that the Flush method, which is supposed to clear all data from the stream, does not do that. According to MSDN, this is the expected behaviour, because NetworkStream is not buffered. How do I clear the stream in that case? I could just follow every message with an empty one (consisting of \0 chars) of the same length, but I don't think it's correct to do that; also, it would screw up some features I need.
TCP doesn't work this way. It's as simple as that.
TCP is a stream-based protocol. That means that you shouldn't ever treat it as a message-based protocol (unlike, say, UDP). If you need to send messages over TCP, you have to add your own messaging protocol on top of TCP.
What you're trying to do here is send two separate messages, and receive two separate messages on the other side. This would work fine on UDP (which is message-based), but it will not work on TCP, because TCP is a stream with no organisation.
So yeah, Flush works just fine. It's just that no matter how many times you call Flush on one side, and how many times you call individual Sends, each Receive on the other end will get as much data as can fit in its buffer, with no respect to the Sends on the other side.
The solution you've devised (almost - just separate the strings with a single \0) is actually one of the proper ways to handle this. By doing that, you're working with messages on top of the stream again. This is called message framing - it allows you to tell individual messages apart. In your case, you've added delimiters between the messages. Think about writing the same data in a file - again, you'll need some way of your own to separate the individual messages (for example, using end lines).
Another way to handle message framing is a length prefix - before you send the string itself, send its length. Then, when you read on the other side, you know that before each string there will always be a length prefix, so the reader knows where the message ends.
Yet another way isn't probably very useful for your case - you can work with fixed-length data. So a message will always be exactly 100 bytes, for example. This is very powerful when combined with pre-defined message types - so message type 1 would contain exactly two integers, representing some coördinates, for example.
In any case, though, you'll need your own buffering on the receiving end. This is because (as you've already seen) a single receive can read multiple messages at once, and at the same time, it's not guaranteed to read the whole message in a single read. Writing your own networking is actually pretty tricky - unless you're doing this to actually learn network programming, I'd recommend using some ready-made technology - for example, Lidgren (a nice networking library, optimized for games but works fine for general networking as well) or WCF. For a chat system, simple HTTP (especially with the bi-directional WebSockets) might be just fine as well.
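To make the delimiter idea concrete, here is a rough sketch (the class name, the UTF-8 choice and the '\0' separator are illustrative, not from the question): it accumulates incoming bytes and yields a complete message each time the delimiter shows up.

    using System.Collections.Generic;
    using System.Text;

    class MessageFramer
    {
        private readonly List<byte> _pending = new List<byte>();

        // Feed it the buffer and the byte count returned by Read; it yields zero
        // or more complete messages and keeps any partial one for the next read.
        public IEnumerable<string> Feed(byte[] buffer, int bytesRead)
        {
            for (int i = 0; i < bytesRead; i++)
            {
                if (buffer[i] == 0)   // delimiter: one complete message accumulated
                {
                    yield return Encoding.UTF8.GetString(_pending.ToArray());
                    _pending.Clear();
                }
                else
                {
                    _pending.Add(buffer[i]);
                }
            }
        }
    }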
EDIT:
As Damien correctly noted, there seems to be another problem with your code - you seem to be ignoring the return value of Read. The return value tells you the number of bytes you've actually read. Since you apparently have a fixed-size persistent buffer on the receiving side, every byte beyond the number you've just read will still contain old data. To fix this, make sure you only work with as many bytes as Read returned. Also, since this suggests you're ignoring the Read return value altogether, make sure to properly handle the case where Read returns 0 - that means the other side has gracefully shut down its connection, and the receiving side should do the same.
I am working on a network application that can send live video feed asynchronously from one application to another, sort of like Skype. The main issue I am having is that I want to be able to send the frames but not have to know their size each time before receiving.
The way AForge.NET works when handling images is that the size of the current frame will most likely be different than the one before it. The size is not static so I was just wondering if there was a way to achieve this. And, I already tried sending the length first and then the frame, but that is not what I was looking for.
First, make sure you understand that TCP itself has no concept of "packet" at all, not at the user code level. If one is conceptualizing one's TCP network I/O in terms of packets, they are probably getting it wrong.
Now that said, you can impose a packet structure on the TCP stream of bytes. To do that where the packets are not always the same size, you can either transmit the length before the data, or delimit the data in some way, such as wrapping it in a self-describing encoding or terminating it with a marker sequence.
Note that adding structure around the data (encoding, terminating, whatever) when you're dealing with binary data is fraught with hassles, because binary data usually is required to support any combination of bytes. This introduces a need for escaping the data or otherwise being able to flag something that would normally look like a delimiter or terminator, so that it can be treated as binary data instead of some boundary of the data.
Personally, I'd just write a length before the data. It's a simple and commonly used technique. If you still don't want to do it that way, you should be specific and explain why you don't, so that your specific scenario can be better understood.
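For completeness, a hedged sketch of that length-before-the-data approach for variable-size frames (the 4-byte little-endian prefix via BitConverter is my assumption, not something AForge.NET prescribes):

    using System;
    using System.IO;

    // Writes a 4-byte length prefix, then the frame bytes.
    static void WriteFrame(Stream stream, byte[] frame)
    {
        byte[] lengthPrefix = BitConverter.GetBytes(frame.Length);
        stream.Write(lengthPrefix, 0, lengthPrefix.Length);
        stream.Write(frame, 0, frame.Length);
    }

    // Reads the 4-byte prefix, then exactly that many frame bytes.
    static byte[] ReadFrame(Stream stream)
    {
        byte[] lengthPrefix = ReadExactly(stream, 4);
        int length = BitConverter.ToInt32(lengthPrefix, 0);
        return ReadExactly(stream, length);
    }

    static byte[] ReadExactly(Stream stream, int count)
    {
        var buffer = new byte[count];
        for (int offset = 0; offset < count; )
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0) throw new EndOfStreamException();
            offset += read;
        }
        return buffer;
    }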
I've read around and I'm hearing mixed things about this. Do you have to split a file into chunks to send it over a stream, or does the OS do that for you?
I have a byte array of about 320,000 values, which i need to get across a network. I can get the first several thousand over but anything after that, it's just set to 0.
I'm using the NetworkStream class, creating a TcpListener / TcpClient, getting the stream from the listener once connected, and writing the array to the stream, then flushing. Without success.
Any help would be appreciated.
Cheers,
When using TCP sockets, sending 1024 bytes may or may not be split into chunks by the OS. This behavior should be considered undefined at our level, and the receiver should be able to handle a situation like that. What most protocols do is define a fixed (known) command/header size that carries information such as the file size, what range of data to read, etc. Each message the server constructs will have this header. You as the programmer can specify your chunk sizes, and each chunk must be reconstructed at the receiving end.
Here's a walkthrough:
Server sends a command to the client with information about the file, such as total size, file name, etc.
Client knows how big the command is based on a pre-programmed agreement of header size. If the command is 512 bytes, then the client will keep receiving data until it has filled a 512 byte buffer. Behind the scenes, the operating system could have picked that data up in multiple chunks, but that shouldn't be a worry for you. After all, you only care about reading exactly 512 bytes.
Server begins sending more commands, streaming a file to the client chunk by chunk (512 bytes at a time).
The client receives these chunks and constructs the file over the course of the connection.
Since the client knows how big the file is, it no longer reads on that socket.
The server terminates the connection.
That example is pretty basic, but it's a good groundwork on how communication works.
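As a rough sketch of the fixed-size command above (the 512-byte size and the size-then-name layout are purely illustrative, not a defined protocol):

    using System;
    using System.Text;

    // Builds a 512-byte command: bytes 0-7 hold the total file size,
    // the rest holds the UTF-8 file name (zero-padded).
    static byte[] BuildFileCommand(string fileName, long fileSize)
    {
        var command = new byte[512];
        BitConverter.GetBytes(fileSize).CopyTo(command, 0);
        Encoding.UTF8.GetBytes(fileName).CopyTo(command, 8);
        return command;
    }

    // The receiver first reads exactly 512 bytes (looping until the buffer is
    // full), then parses the fields back out.
    static void ParseFileCommand(byte[] command, out long fileSize, out string fileName)
    {
        fileSize = BitConverter.ToInt64(command, 0);
        fileName = Encoding.UTF8.GetString(command, 8, command.Length - 8).TrimEnd('\0');
    }

After the command, the receiver keeps reading chunks and writing them to the file until it has collected fileSize bytes.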
Recently I started working with sockets. I realized that when reading from a network stream, you cannot know how much data is coming in. So either you know in advance how many bytes have to be received, or you know which bytes to look for.
Since I am currently trying to implement a C# WebSocket server, I need to process HTTP requests. An HTTP request can have arbitrary length, so knowing the number of bytes in advance is out of the question. But an HTTP request always has a certain format. It starts with the request-line, followed by zero or more headers, etc. So with all this information it should be simple, right?
Nope.
One approach I came up with was reading all data until a specific sequence of bytes was recognized. The StreamReader class has the ReadLine method which, I believe, works like this. For HTTP a reasonable delimiter would be the empty line separating the message head from the body.
The obvious problem here is the requirement of a (preferably short) termination sequence, like a line break. Even the HTTP specification suggests that two adjacent CRLFs are not a good choice, since they could also occur at the beginning of the message. And after all, two CRLFs are not a simple delimiter anyway.
So expanding the method to arbitrary type-3 grammars, I concluded the best choice for parsing the data is a finite state machine. I can feed the data to the machine byte after byte, just as I am reading it from the network stream. And as soon as the machine accepts the input I can stop reading data. Also, the FSM could immediately capture the significant tokens.
But is this really the best solution? Reading byte after byte and validating it with a custom parser seems tedious and expensive. And the FSM would be either slow or quite ugly. So...
How do you process data from a network stream when the form is known but not the size?
How can classes like the HttpListener parse the messages and be fast at it too?
Did I miss something here? How would this usually be done?
HttpListener and other such components can parse the messages because the format is deterministic. The Request is well documented. The request header is a series of CRLF-terminated lines, followed by a blank line (two CRLF in a row).
The message body can be difficult to parse, but it's deterministic in that the header tells you what encoding is used, whether it's compressed, etc. Even multi-part messages are not terribly difficult to parse.
Yes, you do need a state machine to parse HTTP messages. And yes you have to parse it byte-by-byte. It's somewhat involved, but it's very fast. Typically you read a bunch of data from the stream into a buffer and then process that buffer byte-by-byte. You don't read the stream one byte at a time because the overhead will kill performance.
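A sketch of that buffered, byte-at-a-time pattern, assuming all we need is to find the blank line (CRLF CRLF) that ends the request head; the buffer size and the tiny four-state matcher are illustrative, not HttpListener's actual code:

    using System.IO;

    // Reads chunks from the stream and scans them byte by byte until the
    // "\r\n\r\n" that terminates the HTTP request head has been seen.
    static MemoryStream ReadRequestHead(Stream stream)
    {
        var head = new MemoryStream();
        var buffer = new byte[4096];
        int state = 0;                 // how many bytes of "\r\n\r\n" matched so far
        while (state < 4)
        {
            int read = stream.Read(buffer, 0, buffer.Length);
            if (read == 0) throw new EndOfStreamException();
            for (int i = 0; i < read && state < 4; i++)
            {
                byte b = buffer[i];
                if ((state % 2 == 0 && b == (byte)'\r') || (state % 2 == 1 && b == (byte)'\n'))
                    state++;
                else
                    state = (b == (byte)'\r') ? 1 : 0;
                head.WriteByte(b);
            }
            // Any bytes after the blank line belong to the body and would need
            // to be kept for later; that bookkeeping is omitted here.
        }
        return head;
    }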
You should take a look at the HttpListener source code to see how it all works. Go to http://referencesource.microsoft.com/netframework.aspx and download the .NET 4.5 Update 1 source.
Be prepared to spend a lot of time digging through that and through the HTTP spec.
By the way, it's not difficult to create a program that handles a small subset of HTTP requests. But I wonder why you'd want to do that when you can just use HttpListener and have all the details handled for you.
Update
You are talking about two different protocols. HTTP and WebSocket are two entirely different things. As the Wikipedia article says:
The WebSocket Protocol is an independent TCP-based protocol. Its only relationship to HTTP is that its handshake is interpreted by HTTP servers as an Upgrade request.
With HTTP, you know that the server will send the stream and then close the connection; it's a stream of bytes with a defined end. WebSocket is a message-based protocol; it enables a stream of messages. Those messages have to be delineated in some way; the sender has to tell the receiver where the end of the message is. That can be implicit or explicit. There are several different ways this is done:
The sender includes the length of message in the first few bytes of the message. For example, the first four bytes are a binary integer that says how many bytes follow in that message. So the receiver reads the first four bytes, converts that to an integer, and then reads that many bytes.
The length of the message is implicit. For example, sender and receiver agree that all messages are 80 bytes long.
The first byte of the message is a message type, and each message type has a defined length. For example, message type 1 is 40 bytes, message type 2 is 27 bytes, etc.
Messages have some terminator. In a line-oriented message system, for example, messages are terminated by CRLF. The sender sends the text and then CRLF. The receiver reads bytes until it receives CRLF.
Whatever the case, sender and receiver must agree on how messages are structured. Otherwise the case that you're worried about does crop up: the receiver is left waiting for bytes that will never be received.
In order to handle possible communications problems you set the ReceiveTimeout property on the socket, so that a Read will throw SocketException if it takes too long to receive a complete message. That way, your program won't be left waiting indefinitely for data that is not forthcoming. But this should only happen in the case of communications problems. Any reasonable message format will include a way to determine the length of a message; either you know how much data is coming, or you know when you've reached the end of a message.
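Putting the two together, a hedged sketch (the 4-byte prefix, the 30-second timeout and the helper names are mine): a length-prefixed read with ReceiveTimeout set, so a stalled peer produces a SocketException rather than an indefinite wait.

    using System;
    using System.IO;
    using System.Net.Sockets;

    static byte[] ReceiveMessage(Socket socket)
    {
        socket.ReceiveTimeout = 30000;              // 30 seconds, illustrative value
        byte[] prefix = ReceiveExactly(socket, 4);  // 4-byte length prefix
        int length = BitConverter.ToInt32(prefix, 0);
        return ReceiveExactly(socket, length);
    }

    static byte[] ReceiveExactly(Socket socket, int count)
    {
        var buffer = new byte[count];
        for (int offset = 0; offset < count; )
        {
            int read = socket.Receive(buffer, offset, count - offset, SocketFlags.None);
            if (read == 0)
                throw new EndOfStreamException("Peer closed the connection.");
            offset += read;
        }
        return buffer;
    }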
If you want to send a message, you can just prepend the size of the message to it: get the number of bytes in the message and prepend it as a ulong. At the receiver, read a ulong's worth of bytes, parse it, then read that many bytes from the stream and then close it.
In an HTTP header you can read: Content-Length - the length of the request body in octets (8-bit bytes).
We have a C# socket server with roughly 200-500 active connections, each one constantly sending messages to it.
About 70% of the time the messages are handled fine (in the correct order etc.); however, in the other 30% of cases we get jumbled-up messages and things get screwed up. We should note that some clients send data in Unicode and others in ASCII, and that's handled as well.
Messages sent to the server are variable-length strings that end in a char 3 (ETX); it's the char 3 that we break on; other than that we keep receiving data.
Could anyone shed any light on our ProcessReceive code and see what could possibly be causing us issues and how we can solve this small issue (here's hoping it's a small issue!)
Code below:
Firstly, I'm sure you know this, but it's always worth repeating: TCP is a stream of bytes. It knows nothing of any application-level "messages" that you may determine exist in that stream of bytes. All successful socket Recv calls, whether sync or async, can return any number of bytes between 1 and the size of the buffer supplied.
With that in mind, you should really be dealing with your message framing (i.e. looking for your delimiter) before you do anything else. If you don't find a delimiter, simply reissue the read using the same SocketAsyncEventArgs and the same buffer, with the offset set to where you currently are; this will read more data into the buffer, and you can look for the delimiter again once the next read has completed. Ideally you'd keep track of where you last got to when searching this buffer for a delimiter, to reduce repeated scanning.
Right now you're not doing that and your use of e.Buffer[e.Offset] == 255 will fail if you get a message that arrives in pieces as you could be referring to any byte in the message if the message is split over multiple reads.
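A rough sketch of that "reissue the read at the current offset" idea (the class, the ProcessMessage stub and the delimiter constant are placeholders, not the poster's actual ProcessReceive code):

    using System;
    using System.Net.Sockets;

    class DelimiterReceiver
    {
        private const byte Delimiter = 3; // the "char 3" terminator

        public void OnReceiveCompleted(Socket socket, SocketAsyncEventArgs e)
        {
            int bytesInBuffer = e.Offset + e.BytesTransferred;

            int delimiterIndex = Array.IndexOf(e.Buffer, Delimiter, 0, bytesInBuffer);
            if (delimiterIndex < 0)
            {
                // No complete message yet: read more data into the same buffer,
                // starting where the previous read stopped.
                e.SetBuffer(bytesInBuffer, e.Buffer.Length - bytesInBuffer);
                if (!socket.ReceiveAsync(e))
                    OnReceiveCompleted(socket, e);   // completed synchronously
                return;
            }

            // Everything before the delimiter is one complete message.
            ProcessMessage(e.Buffer, delimiterIndex);

            // Bytes after the delimiter belong to the next message and would need
            // to be shifted to the front of the buffer before reading again (omitted).
        }

        private void ProcessMessage(byte[] buffer, int length)
        {
            // Application-specific handling of one complete message goes here.
        }
    }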
The problem I am seeing is that you are calling Encoding.Unicode.GetString() on a buffer you received in the current read from the socket. However, the contents of that buffer might not be a valid Unicode encoding of a string (the read may have ended in the middle of a character).
What you need to do is to buffer your entire stream, and then decode it as a string in one final operation, after you have received all the data.
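A minimal sketch of that approach, assuming the sender closes the connection when it is done (so Read eventually returns 0):

    using System.IO;
    using System.Net.Sockets;
    using System.Text;

    // Copies every received byte into a MemoryStream and only decodes the
    // string once, after the sender has finished.
    static string ReceiveWholeString(NetworkStream stream)
    {
        var collected = new MemoryStream();
        var buffer = new byte[4096];
        int read;
        while ((read = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            collected.Write(buffer, 0, read);   // only the bytes actually read
        }
        return Encoding.Unicode.GetString(collected.ToArray());
    }

If the connection stays open, you would frame the messages instead (length prefix or delimiter), and for incremental decoding a System.Text.Decoder can carry partial characters across reads.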