I have a UDP sender and a UDP listener. Transferring messages works nicely. But...
It appears that when I am overfeeding (sending sustained data quickly), the listening socket may throw on the call to ReceiveFrom with error code 10040, which means some buffer was not large enough. The exception message is:
A message sent on a datagram socket was larger than the internal
message buffer or some other network limit, or the buffer used to
receive a datagram into was smaller than the datagram itself.
Fair enough. But the problem is that I will then get this exception on every following call to ReceiveFrom. The socket appears broken. I am willing to accept that the transfer failed, but I now want to flush the socket's receive buffer and continue.
I can prevent this from happening by setting a substantial receive buffer size of 128K on the listening socket (as opposed to the default of 8K). I can also fix it by having the sender pause for 1 ms after sending a chunk of 65507 bytes of a multi-chunk bulk message.
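For reference, a minimal sketch of that workaround (the port number here is just a placeholder):

    using System.Net;
    using System.Net.Sockets;

    // UDP listener with an enlarged OS receive buffer (port 11000 is just an example).
    var listener = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);
    listener.ReceiveBufferSize = 128 * 1024;   // the default is around 8K; 128K stopped the overruns in my tests
    listener.Bind(new IPEndPoint(IPAddress.Any, 11000));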
But I do not feel safe. If this exception still occurs, I want to log it and continue (better luck next time). Recreating the socket and restarting the listen thread seems blunt. Is there a better way?
Something unrelated I do not like: Socket.ReceiveFrom throws an exception after the timeout. This is stupid: timeouts are normal behavior. I would expect a TryReceiveFrom method, and I do not like using the exception handler as a flow-control statement, which seems to be the only option I have. Is there a better way?
[Edit]
On further scrutiny (I ran into exceptions being thrown again after sending messages in one piece in an effort to optimize) I found the main reason for my troubles. It turns out the ReceiveFrom method is not the friendliest API...
Here it says:
"With connectionless protocols, ReceiveFrom will read the first
enqueued datagram received into the local network buffer. If the
datagram you receive is larger than the size of buffer, the
ReceiveFrom method will fill buffer with as much of the message as is
possible, and throw a SocketException."
In other words: with UDP a full datagram will always be returned regardless of the size argument, which is effectively ignored in its capacity as a limiter, and you had better make sure your buffer is big enough.
So you want the buffer passed to ReceiveFrom to be at least 64K, so it is big enough for the biggest possible datagram; then check the return value to see how many bytes you actually got and work with that.
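A minimal sketch of that pattern (the port and endpoint are placeholders):

    using System;
    using System.Net;
    using System.Net.Sockets;

    var socket = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);
    socket.Bind(new IPEndPoint(IPAddress.Any, 11000));   // example port

    // 65535 bytes is enough for the largest possible datagram, so ReceiveFrom can never overrun it.
    var buffer = new byte[65535];
    EndPoint sender = new IPEndPoint(IPAddress.Any, 0);

    int received = socket.ReceiveFrom(buffer, ref sender);
    // Only the first 'received' bytes of the buffer are valid; work with that slice.
    Console.WriteLine($"Got {received} bytes from {sender}");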
It gets a little worse still: the size argument is not entirely ignored; if offset plus size exceeds the length of your buffer you also get an exception. So it is ignored on the one hand, because it does not limit the number of bytes written to your buffer, yet it is still sanity-checked.
After discovering this quirk and respecting it I have not had any overruns, no matter how hard I bashed it from the sending end (I send a large bitmap repeatedly without pausing). This report on my journey may save others some frustration.
I accept now that when the buffer overruns, the socket is broken and needs to be recreated.
The exception thrown by Socket.ReceiveFrom after a timeout can be prevented by first checking whether any data is available using the Socket.Poll method, which has its own timeout argument. So there is no point in setting ReceiveTimeout on the socket; using Poll in tandem with ReceiveFrom works much more nicely.
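A sketch of the Poll/ReceiveFrom combination, reusing the socket and buffer from the sketch above (the one-second timeout is arbitrary):

    // Poll takes its timeout in microseconds; here we wait up to one second for data.
    if (socket.Poll(1000 * 1000, SelectMode.SelectRead))
    {
        EndPoint remote = new IPEndPoint(IPAddress.Any, 0);
        int received = socket.ReceiveFrom(buffer, ref remote);
        // handle the datagram here
    }
    else
    {
        // Timeout: no data available and no exception thrown; just loop around and poll again.
    }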
Related
I am using a System.Net.Sockets.Socket in TCP mode to send data to some other machine. I implemented a listener to receive data formatted in a certain way and it is all working nicely. Now I am looking for ways to optimize use of bandwidth and delivery speed.
I wonder if it is worth sending a single array of bytes (or Span in the newer frameworks) instead of using a consecutive series of Send calls. I now send my encoded message in parts (head, length, tail, end, using separate Send calls).
Sure I could test this and I will but I do not have much experience with this and may be missing important considerations.
I realize there is a no-delay option that may have an impact, and that the socket and whatever sits lower in the stack may apply their own optimizing policy.
I would like to be able to prioritize delivery time (the time between the call to Send and reception at the other end) over bandwidth use for the messages where it matters, and be more lenient with messages that are not time critical. Then again, I create a new socket whenever I find there is something in my queue and use it until the queue is empty, which may cover more than one message, so this may not always work. Ideally I would be lenient so the socket can optimize the payload until a time-critical message hits the queue, and then tell the socket to hurry until no more time-critical messages are in the queue.
So my primary question is: should I build my whole message and call Send once (would that potentially do any good, or just waste CPU cycles), and are there any caveats an experienced TCP programmer could make me aware of?
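To make the question concrete, this is roughly the alternative I am considering; the part names are from my own protocol and SendCombined is just a sketch:

    using System;
    using System.Collections.Generic;
    using System.Net.Sockets;

    // head, lengthBytes, tail and end are the byte[] parts I currently push with separate Send calls.
    static void SendCombined(Socket socket, byte[] head, byte[] lengthBytes, byte[] tail, byte[] end)
    {
        // Option 1: copy everything into one array and make a single Send call.
        var combined = new byte[head.Length + lengthBytes.Length + tail.Length + end.Length];
        int offset = 0;
        foreach (var part in new[] { head, lengthBytes, tail, end })
        {
            Buffer.BlockCopy(part, 0, combined, offset, part.Length);
            offset += part.Length;
        }
        socket.Send(combined);

        // Option 2 (avoids the copy): the Send overload that takes a list of buffer segments.
        // socket.Send(new List<ArraySegment<byte>>
        // {
        //     new ArraySegment<byte>(head),
        //     new ArraySegment<byte>(lengthBytes),
        //     new ArraySegment<byte>(tail),
        //     new ArraySegment<byte>(end),
        // });
    }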
Background:
The application I am programming uses async sockets (using BeginSend, EndSend, BeginReceive, EndReceive) to send data between each other. The sockets are TCP, no socket flags, on IPV4.
It uses a scheme where it sends a 4-byte (int) length message, followed by a message body of the length specified in that prefix. I use helper functions that handle the MessageLength and the MessageBody. The flow is something like this (a simplified sketch of the pattern follows the list):
BeginReceive()
EndReceive()
MessageLengthReceived()
BeginReceive()
MessageBodyReceived()
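A simplified synchronous sketch of the pattern (the real code uses the Begin/End pairs above; ReceiveExactly is just an illustration):

    using System;
    using System.Net.Sockets;

    static byte[] ReceiveMessage(Socket socket)
    {
        byte[] lengthPrefix = ReceiveExactly(socket, 4);          // 4-byte (int) length
        int length = BitConverter.ToInt32(lengthPrefix, 0);
        return ReceiveExactly(socket, length);                    // then the body of that length
    }

    // TCP is a stream: a single Receive may return fewer bytes than asked for, so loop until 'count' bytes arrive.
    static byte[] ReceiveExactly(Socket socket, int count)
    {
        var buffer = new byte[count];
        int total = 0;
        while (total < count)
        {
            int read = socket.Receive(buffer, total, count - total, SocketFlags.None);
            if (read == 0)
                throw new SocketException((int)SocketError.ConnectionReset);   // peer closed the connection
            total += read;
        }
        return buffer;
    }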
Issue:
The issue arises when I send file data in chunks of 16 KB (with a small additional overhead: offset, pieceIndex, etc.). Occasionally, when receiving the MessageLength, it receives data from a random part of the previous message instead of the actual message length. Part of the issue is that it doesn't always happen at a set offset (e.g. the beginning or end of a file / piece / 16 KB chunk) and can happen with any file, but it happens more when I send more files or larger files.
There are internal messages that are sent (eg RequestMessages) that never experience this problem. All the internal messages are < 100 bytes.
I've tried waiting for the file chunk to save completely before requesting another chunk, but it still fails. I've also tried limiting how many chunks to send at a time, but this only resolves the issue when using 127.0.0.1 (local clients), and not cross network (LAN).
I've spent hours going through my application to see if there are any issues, but I have yet to find anywhere it could be sending the wrong data as a header. The issue always seems to be in between the send and the receive of the two clients. Are there socket settings or a method of sending that I should be using? Or could it be some sort of race condition? (I thought about a race condition, but the fact that the data can come from anywhere in a file made me rethink this.)
From the question, I guess the problem you are dealing with is inside the MonoTorrent library.
I myself have never encountered such a problem, and by looking at the code, I think the receive part is already ordered, because the network IO will not try to receive a second message until the first one has been handled. PieceMessages' write requests are queued in DiskIO as well, so that should not be the problem.
However, in the sending procedure, the ProcessQueue function can be called from several places, and the EnqueueSendMessage called indirectly by ProcessQueue doesn't actually enqueue the message to any queue; it simply calls Socket.BeginSend. I don't know whether Socket.BeginSend() has any queuing mechanism inside. If it does not, this may cause problems when multiple threads try to BeginSend different data on the same socket.
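If it turns out there really is no internal queue, one way to make sure only a single BeginSend is in flight per socket is something like the sketch below (just an illustration of the idea, not MonoTorrent code):

    using System;
    using System.Collections.Generic;
    using System.Net.Sockets;

    // Ensures only one BeginSend is outstanding on the socket at any time.
    class SerializedSender
    {
        private readonly Socket socket;
        private readonly Queue<byte[]> pending = new Queue<byte[]>();
        private bool sending;

        public SerializedSender(Socket socket) { this.socket = socket; }

        public void Enqueue(byte[] message)
        {
            lock (pending)
            {
                pending.Enqueue(message);
                if (sending) return;   // the running send chain will pick this up
                sending = true;
            }
            SendNext();
        }

        private void SendNext()
        {
            byte[] message;
            lock (pending)
            {
                if (pending.Count == 0) { sending = false; return; }
                message = pending.Dequeue();
            }
            socket.BeginSend(message, 0, message.Length, SocketFlags.None, OnSent, null);
        }

        private void OnSent(IAsyncResult ar)
        {
            // Error handling omitted; a full version should also check EndSend's return value for partial sends.
            socket.EndSend(ar);
            SendNext();
        }
    }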
We have a C# socket server with roughly 200-500 active connections, each one constantly sending messages to our server.
About 70% of the time the messages are handled fine (in the correct order, etc.); however, in the other 30% of cases we get jumbled-up messages and things get screwed up. We should note that some clients send data in Unicode and others in ASCII, so that is handled as well.
Messages sent to the server are variable-length strings that end in a char3; it's the char3 that we break on, and other than that we keep receiving data.
Could anyone shed some light on our ProcessReceive code and see what could possibly be causing the issues and how we can solve them (here's hoping it's a small issue!)?
Code below:
Firstly, I'm sure you know, but it's always worth repeating: TCP is a stream of bytes. It knows nothing of any application-level "messages" that you may determine exist in that stream of bytes. All successful socket Recv calls, whether sync or async, can return any number of bytes between 1 and the size of the buffer supplied.
With that in mind you should really be dealing with your message framing (i.e. looking for your delimiter) before you do anything else. If you don't find a delimiter, simply reissue the read using the same SocketAsyncEventArgs and the same buffer, and set the offset to where you currently are; this will read some more data into the buffer, and you can take another look for the delimiter once the next read has completed. Ideally you would keep track of where you last got to when searching for a delimiter in this buffer, to reduce repeated scanning.
Right now you're not doing that, and your use of e.Buffer[e.Offset] == 255 will fail if a message arrives in pieces, as you could be looking at any byte of the message if it is split over multiple reads.
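A skeleton of the delimiter-based framing described above; Delimiter, ProcessMessage and CloseConnection are placeholders, and it assumes a complete message always fits in the buffer:

    const byte Delimiter = 3;   // e.g. the char3 mentioned in the question

    // Called from the SocketAsyncEventArgs Completed handler for a receive.
    void OnReceiveCompleted(Socket socket, SocketAsyncEventArgs e)
    {
        if (e.BytesTransferred == 0) { CloseConnection(socket); return; }

        int used = e.Offset + e.BytesTransferred;   // how much of the buffer is filled so far

        // Extract every complete message currently in the buffer.
        int start = 0;
        int delimiterIndex;
        while ((delimiterIndex = Array.IndexOf(e.Buffer, Delimiter, start, used - start)) >= 0)
        {
            ProcessMessage(e.Buffer, start, delimiterIndex - start);
            start = delimiterIndex + 1;
        }

        // Move any partial trailing message to the front, then reissue the read
        // into the remaining space of the same buffer.
        int leftover = used - start;
        Buffer.BlockCopy(e.Buffer, start, e.Buffer, 0, leftover);
        e.SetBuffer(leftover, e.Buffer.Length - leftover);   // a real version should grow the buffer if it is full
        if (!socket.ReceiveAsync(e)) OnReceiveCompleted(socket, e);
    }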
The problem I am seeing is that you are calling Encoding.Unicode.GetString() on a buffer you received in the current read from the socket. However, the contents of that buffer might not be a valid Unicode encoding of a string.
What you need to do is to buffer your entire stream, and then decode it as a string in one final operation, after you have received all the data.
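A minimal sketch of that idea (where a message ends is up to your framing, e.g. a delimiter as discussed above):

    using System.IO;
    using System.Text;

    var pending = new MemoryStream();

    // Append the raw bytes from each read as they arrive...
    void OnBytesReceived(byte[] buffer, int count)
    {
        pending.Write(buffer, 0, count);
    }

    // ...and only decode once the whole message is in, so a character
    // split across two reads cannot be mangled.
    string DecodeCompleteMessage()
    {
        return Encoding.Unicode.GetString(pending.ToArray());
    }

If you really must decode incrementally, Encoding.Unicode.GetDecoder() gives you a Decoder that carries partial characters over between calls.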
I am writing a client for a server that typically sends data as strings of 500 or fewer bytes. However, the data will occasionally exceed that, and a single set of data could contain 200,000 bytes for all the client knows (on initialization or significant events). However, I would rather not have each client running with a 50 MB socket buffer (if that is even possible).
Each set of data is delimited by a null \0 character. What kind of structure should I look at for storing partially sent data sets?
For example, the server may send ABCDEFGHIJKLMNOPQRSTUV\0WXYZ\0123!\0. I would want to process ABCDEFGHIJKLMNOPQRSTUV, WXYZ, and 123! independently. Also, the server could send ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890LOL123HAHATHISISREALLYLONG without the terminating character. I would want that data set stored somewhere for later appending and processing.
Also, I'm using asynchronous socket methods (BeginSend, EndSend, BeginReceive, EndReceive) if that matters.
Currently I'm debating between List<Byte> and StringBuilder. Any comparison of the two for this situation would be very helpful.
Read the data from the socket into a buffer. When you get the terminating character, turn it into a message and send it on its way to the rest of your code.
Also, remember that TCP is a stream, not a packet. So you should never assume that you will get everything sent at one time in a single read.
As far as buffers go, you should probably only need one per connection at most. I'd probably start with the max size that you reasonably expect to receive, and if that fills, create a new buffer of a larger size - a typical strategy is to double the size when you run out to avoid churning through too many allocations.
If you have multiple incoming connections, you may want to do something like create a pool of buffers, and just return "big" ones to the pool when done with them.
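A minimal sketch of such a pool (on newer frameworks System.Buffers.ArrayPool<byte>.Shared already provides this):

    using System.Collections.Concurrent;

    // Hands out fixed-size buffers and recycles them when a connection is done with one.
    class BufferPool
    {
        private readonly ConcurrentBag<byte[]> buffers = new ConcurrentBag<byte[]>();
        private readonly int bufferSize;

        public BufferPool(int bufferSize) { this.bufferSize = bufferSize; }

        public byte[] Rent()
        {
            return buffers.TryTake(out byte[] buffer) ? buffer : new byte[bufferSize];
        }

        public void Return(byte[] buffer)
        {
            buffers.Add(buffer);
        }
    }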
You could just use a List<byte> as your buffer, so the .NET framework takes care of automatically expanding it as needed. When you find a null terminator you can use List.RemoveRange() to remove that message from the buffer and pass it to the next layer up.
You'd probably want to add a check and throw an exception if it exceeds a certain length, rather than just wait until the client runs out of memory.
(This is very similar to Ben S's answer, but I think a byte array is a bit more robust than a StringBuilder in the face of encoding issues. Decoding bytes to a string is best done higher up, once you have a complete message.)
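A quick sketch of that approach; HandleMessage and the ASCII decoding are placeholders for whatever the layer above does:

    using System.Collections.Generic;
    using System.Text;

    var buffer = new List<byte>();

    // Call this with the bytes from each completed read.
    void OnDataReceived(byte[] data, int count)
    {
        for (int i = 0; i < count; i++)
            buffer.Add(data[i]);

        int terminator;
        while ((terminator = buffer.IndexOf((byte)0)) >= 0)
        {
            // Everything up to the '\0' is one complete data set.
            byte[] message = buffer.GetRange(0, terminator).ToArray();
            buffer.RemoveRange(0, terminator + 1);   // drop the message plus its terminator
            HandleMessage(Encoding.ASCII.GetString(message));
        }
    }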
I would just use a StringBuilder and read in one character at a time, copying and emptying the builder whenever I hit a null terminator.
I wrote this answer regarding Java sockets but the concept is the same.
What's the best way to monitor a socket for new data and then process that data?
I have a question about socket programming. I am developing a TCP packet sniffer. I am using Socket.BeginAccept and Socket.BeginReceive to capture every packet, but when a packet is received I have to do some processing before calling BeginReceive again. It is a fast operation, but it can take a few milliseconds.
My question is, what would happen if some packets are sent while I am processing, and haven't called BeginReceive? Are packets lost, or are they buffered internally? Is there a limit?
In the Linux world, the kernel will buffer them for you; I'm assuming the Windows world does the same thing. But eventually, as deltreme said, the buffer will overflow (there's definitely a limit) and there's a possibility the data will be dropped silently.
If you're doing something as heavyweight as a few milliseconds per packet, you might want to consider using a thread pool to free up the network thread. That is, all your network thread should do is grab the packet, throw it onto a queue for processing by another thread, and go back to listening on the network. Other threads can then grab packets off the queue and process them; the nice thing is you might even be able to process multiple packets at a time, saving some overhead. Here your queue acts as the buffer, you control how big it is, and you can define your own drop policy.
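A rough sketch of that hand-off, using BlockingCollection as the queue; ProcessPacket, the capacity and the worker count are all placeholders:

    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    // The network callback only hands packets off; workers do the milliseconds of processing.
    var queue = new BlockingCollection<byte[]>(boundedCapacity: 10000);

    // Producer: called from the receive callback.
    void OnPacketReceived(byte[] packet)
    {
        if (!queue.TryAdd(packet))
        {
            // Queue full: this is where your own drop policy applies (drop, block, log, ...).
        }
    }

    // Consumers: a couple of worker tasks draining the queue.
    for (int i = 0; i < 2; i++)
    {
        Task.Run(() =>
        {
            foreach (var packet in queue.GetConsumingEnumerable())
                ProcessPacket(packet);   // the heavy work happens off the network thread
        });
    }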
They are buffered, but I don't know on what level or what the limit is.
http://tangentsoft.net/wskfaq/ is a great resource you might find useful for any winsock related issue.
TCP gives you a reliable stream, so data is not lost (assuming the underlying network doesn't fail).
The OS on both ends has buffers that take care of the bytes when you're not reading them. Those buffers have a finite size; if they fill up, TCP has flow control: essentially the sending end will discover that the buffers are full and stop sending until more space becomes available.