I'm sending a large amount of data in one go between a client and server written in C#. It works fine when I run the client and server on my local machine, but when I put the server on a remote computer on the internet it seems to drop data.
I send 20000 strings using the socket.Send() method and receive them using a loop which does socket.Receive(). Each string is delimited by unique characters which I use to count the number received (this is the protocol, if you like). The protocol is proven, in that even with fragmented messages each string is correctly counted. On my local machine I get all 20000; over the internet I get anything between 17000 and 20000. It seems to get worse the slower the remote computer's connection is. To add to the confusion, turning on Wireshark seems to reduce the number of dropped messages.
First of all, what is causing this? Is it a TCP/IP issue or something wrong with my code?
Secondly, how can I get round this? Receiving all of the 20000 strings is vital.
Socket receiving code:
private static readonly Encoding encoding = new ASCIIEncoding();
///...
while (socket.Connected)
{
byte[] recvBuffer = new byte[1024];
int bytesRead = 0;
try
{
bytesRead = socket.Receive(recvBuffer);
}
catch (SocketException e)
{
if (! socket.Connected)
{
return;
}
}
string input = encoding.GetString(recvBuffer, 0, bytesRead);
CountStringsIn(input);
}
Socket sending code:
private static readonly Encoding encoding = new ASCIIEncoding();
//...
socket.Send(encoding.GetBytes(message)); // message: one delimited string
If you're dropping packets, you'll see a delay in transmission, since TCP has to retransmit the dropped packets. This could be very significant, although there's a TCP option called selective acknowledgement which, if supported by both sides, triggers a resend of only those packets which were dropped rather than every packet since the dropped one. There's no way to control that in your code. By default, you can always assume that every packet is delivered in order for TCP, and if for some reason it can't deliver every packet in order, the connection will drop, either by a timeout or by one end of the connection sending a RST packet.
What you're seeing is most likely the result of Nagle's algorithm. Instead of sending each bit of data as you post it, it sends the first small piece and then waits for an ack from the other side. While it's waiting, it aggregates all the other data that you want to send, combines it into one bigger packet, and then sends that. Since the max size for TCP is 65k, it could in theory combine quite a bit of data into one packet, although it's extremely unlikely that this will occur, particularly since Winsock's default buffer size is about 10k or so (I forget the exact amount). Additionally, if the max window size of the receiver is less than 65k, it will only send as much as the receiver's last advertised window size. The window size therefore also limits how much data Nagle's algorithm can aggregate prior to sending, because it can't send more than the window size.
The reason you see this is that on the internet, unlike your local network, that first ack takes more time to return, so Nagle's algorithm aggregates more of your data into a single packet. Locally, the return is effectively instantaneous, so it's able to send your data as quickly as you can post it to the socket. You can disable Nagle's algorithm on the client side by using setsockopt with TCP_NODELAY (Winsock) or Socket.SetSocketOption (.NET), but I highly recommend that you DO NOT disable Nagle's algorithm on the socket unless you are 100% sure you know what you're doing. It's there for a very good reason.
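For reference only, here is a minimal sketch of what turning Nagle's algorithm off looks like in .NET. The class and method names are made up for illustration. Note that this only changes when small writes are put on the wire; the receiver still sees a byte stream with no message boundaries.

```csharp
using System.Net.Sockets;

public static class NagleExample
{
    // Creates an unconnected TCP socket with Nagle's algorithm disabled.
    // Hypothetical helper; in real code you would set this on your existing socket.
    public static Socket CreateNoDelaySocket()
    {
        var socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);

        // Equivalent to:
        // socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.NoDelay, true);
        socket.NoDelay = true;
        return socket;
    }
}
```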
Well there's one thing wrong with your code to start with, if you're counting the number of calls to Receive which complete: you appear to be assuming that you'll see as many Receive calls finish as you made Send calls.
TCP is a stream-based protocol - you shouldn't be worrying about individual packets or reads; you should be concerned with reading the data, expecting that sometimes you won't get a whole message in one packet and sometimes you may get more than one message in a single read. (One read may not correspond to one packet, too.)
You should either prefix each message with its length before sending, or have a delimiter between messages.
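A minimal sketch of the length-prefix option, assuming a 4-byte big-endian length followed by an ASCII payload. That layout and the names LengthPrefixedFraming, WriteMessage and ReadMessage are my own choices for illustration, not anything from the question. The important part is the loop in ReadExactly: a single Read may return fewer bytes than requested, so you keep reading until the whole header or body has arrived.

```csharp
using System;
using System.IO;
using System.Text;

public static class LengthPrefixedFraming
{
    // Writes a 4-byte big-endian length followed by the ASCII payload.
    public static void WriteMessage(Stream stream, string message)
    {
        byte[] payload = Encoding.ASCII.GetBytes(message);
        byte[] header =
        {
            (byte)(payload.Length >> 24), (byte)(payload.Length >> 16),
            (byte)(payload.Length >> 8), (byte)payload.Length
        };
        stream.Write(header, 0, header.Length);
        stream.Write(payload, 0, payload.Length);
    }

    // Reads one message; returns null if the stream ended cleanly before a new header.
    public static string ReadMessage(Stream stream)
    {
        byte[] header = new byte[4];
        if (!ReadExactly(stream, header, true)) return null;

        int length = (header[0] << 24) | (header[1] << 16) | (header[2] << 8) | header[3];
        if (length < 0) throw new InvalidDataException("Negative length prefix.");

        byte[] payload = new byte[length];
        ReadExactly(stream, payload, false);
        return Encoding.ASCII.GetString(payload);
    }

    // Keeps calling Read until the buffer is full, because one Read may return a partial result.
    private static bool ReadExactly(Stream stream, byte[] buffer, bool allowEndAtStart)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            int read = stream.Read(buffer, offset, buffer.Length - offset);
            if (read == 0)
            {
                if (offset == 0 && allowEndAtStart) return false;
                throw new EndOfStreamException("Stream ended in the middle of a message.");
            }
            offset += read;
        }
        return true;
    }
}
```

With a connected socket you could wrap it in a NetworkStream and pass that as the stream on both ends.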
It's definitely not TCP's fault. TCP guarantees in-order, exactly-once delivery.
Which strings are "missing"? I'd wager it's the last ones; try flushing from the sending end.
Moreover, your "protocol" here (I'm talking about the application-layer protocol you're inventing) is lacking: you should consider sending the number of objects and/or their lengths so the receiver knows when it has actually finished receiving them.
How long is each of the strings? If they aren't exactly 1024 bytes, they'll be merged by the remote TCP/IP stack into one big stream, which you read big blocks of in your Receive call.
For example, using three Send calls to send "A", "B", and "C" will most likely come to your remote client as "ABC" (as either the remote stack or your own stack will buffer the bytes until they are read). If you need each string to come without it being merged with other strings, look into adding in a "protocol" with an identifier to show the start and end of each string, or alternatively configure the socket to avoid buffering and combining packets.
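As a rough sketch of the delimiter approach: keep whatever is left over from one Receive and prepend it to the next, so a message (or even the delimiter itself) that is split across reads is still found. DelimitedMessageBuffer is a hypothetical helper, and the delimiter is whatever your protocol already uses.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public sealed class DelimitedMessageBuffer
{
    private readonly StringBuilder pending = new StringBuilder();
    private readonly string delimiter;

    public DelimitedMessageBuffer(string delimiter)
    {
        if (string.IsNullOrEmpty(delimiter))
            throw new ArgumentException("Delimiter must not be empty.", nameof(delimiter));
        this.delimiter = delimiter;
    }

    // Appends newly received text and returns every message completed so far.
    // Any trailing partial message stays buffered for the next call.
    public List<string> Append(string received)
    {
        pending.Append(received);
        string text = pending.ToString();
        var messages = new List<string>();

        int start = 0;
        int index;
        while ((index = text.IndexOf(delimiter, start, StringComparison.Ordinal)) >= 0)
        {
            messages.Add(text.Substring(start, index - start));
            start = index + delimiter.Length;
        }

        // Keep only the unfinished tail.
        pending.Clear();
        pending.Append(text, start, text.Length - start);
        return messages;
    }
}
```

In the receive loop you would call Append with the decoded text from each Receive and count the messages it returns, rather than counting Receive calls.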
Related
I'm currently writing some async UDP network code in C#. I'm sending small packets (less than 50 bytes of data in each so far) back and forth, and my first thought was to split each one into two logical parts: still send it as one packet, but receive it as two. So a header, or an extra information packet, is always added to the start of the real packet. That would contain an ID and the data length.
So I thought I could split it on the receiving end (async receive) and first receive the header and then the actual information. This is so that I don't have to worry about the order between packets and "packet headers".
So I wrote code that basically worked like this:
Client sends 30 bytes of data to the server, where the first 3 bytes are the packet header.
The server would have called (PACKET_HEADER_SIZE = 3):
socket.BeginReceiveFrom(state.Buffer, 0, PACKET_HEADER_SIZE, SocketFlags.None, ref endPoint, ReceivePacketInfo, state);
Then receives the data:
private void ReceivePacketInfo(IAsyncResult ar)
{
StateObj state = (StateObj) ar.AsyncState;
int bytesRead = socket.EndReceiveFrom(ar, ref endPoint);
state.BytesReceived += bytesRead;
if (state.BytesReceived < state.Buffer.Length)
{
socket.BeginReceiveFrom(state.Buffer, state.BytesReceived, state.Buffer.Length - state.BytesReceived, SocketFlags.None, ref endPoint, ReceivePacketInfo, state);
}
else
{
//my thought was to receive the rest of the packet here
}
}
but when calling socket.EndReceiveFrom(ar) I get a SocketException:
"A message sent on a datagram socket was larger than the internal message buffer or some other network limit, or the buffer used to receive a datagram into was smaller than the datagram itself"
So now I have a couple of questions.
Do I have to make sure I receive the whole packet (in this case both the header and the packet) before I call EndReceiveFrom?
Can I assume that I will either get the whole packet in one go or get nothing, so that my if-statement in ReceivePacketInfo would be redundant (as long as its size is less than the maximum packet size, of course)?
If I cannot, is there a good way of solving my problem? I could tag all my packet headers and all my packets to be able to map them together I suppose. I could also try to have a standardized "packet ending" so that I just read until I hit the end of the packet.
Thanks in advance for any help!
Can I assume that I will either get the whole packet in one go or get nothing
That's almost the only thing that UDP can guarantee: the content of a packet. If a packet is received, it is guaranteed to have the same size and the same content as when it was sent. So you have to make sure that your buffer is large enough for a whole packet.
Neither the order of the packets nor delivery itself is guaranteed. It is up to you and your application to handle dropped packets and out-of-order packets.
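So instead of receiving the header and the body separately, receive the whole datagram into one buffer and split it afterwards. A minimal sketch, assuming a 3-byte header where byte 0 is the ID and bytes 1-2 are the payload length in big-endian order; that layout and the DatagramParser name are assumptions for illustration, since the question doesn't say how the 3 bytes are laid out.

```csharp
using System;

public static class DatagramParser
{
    public const int HeaderSize = 3;

    // Splits one fully received datagram into its ID and payload.
    // Returns false if the datagram is too short or the declared length does not match.
    public static bool TryParse(byte[] datagram, int bytesReceived, out byte id, out byte[] payload)
    {
        id = 0;
        payload = Array.Empty<byte>();
        if (datagram == null || bytesReceived < HeaderSize || bytesReceived > datagram.Length)
            return false;

        int declaredLength = (datagram[1] << 8) | datagram[2];
        if (declaredLength != bytesReceived - HeaderSize)
            return false;

        id = datagram[0];
        payload = new byte[declaredLength];
        Buffer.BlockCopy(datagram, HeaderSize, payload, 0, declaredLength);
        return true;
    }
}
```

On the receiving side you would pass the full buffer (offset 0, size state.Buffer.Length) to BeginReceiveFrom, with the buffer at least as large as your biggest packet, and then hand the buffer plus the count returned by EndReceiveFrom to TryParse.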
My Context
I have a TCP networking program that sends large objects that have been serialized and encoded into base64 over a connection. I wrote a client library and a server library, and they both use NetworkStream's Begin/EndRead and Begin/EndWrite. Here's the (very much simplified version of the) code I'm using:
For the server:
var Server = new TcpServer(/* network stuffs */);
Server.Connect();
Server.OnClientConnect += new ClientConnectEventHandler(Server_OnClientConnect);
void Server_OnClientConnect()
{
LargeObject obj = CalculateLotsOfBoringStuff();
Server.Write(obj.SerializeAndEncodeBase64());
}
Then the client:
var Client = new TcpClient(/* more network stuffs */);
Client.Connect();
Client.OnMessageFromServer += new MessageEventHandler(Client_OnMessageFromServer);
void Client_OnMessageFromServer(MessageEventArgs mea)
{
DoSomethingWithLargeObject(mea.Data.DecodeBase64AndDeserialize());
}
The client library has a callback method for NetworkStream.BeginRead which triggers the event OnMessageFromServer that passes the data as a string through MessageEventArgs.
My Problem
When receiving large amounts of data through BeginRead/EndRead, however, it appears to be fragmented over multiple messages. For example, pretend this is a long message:
"This is a really long message except not because it's for explanatory purposes."
If that really were a long message, Client_OnMessageFromServer might be called... say three times with fragmented parts of the "long message":
"This is a really long messa"
"ge except not because it's for explanatory purpos"
"es."
Soooooooo.... takes deep breath
What would be the best way to have everything sent through one Begin/EndWrite to be received in one call to Client_OnMessageFromServer?
You can't. On TCP, how things arrive is not necessarily the same as how they were sent. It is the job of your code to know what constitutes a complete message, and if necessary to buffer incoming data until you have a complete message (taking care not to discard the start of the next message in the process).
In text protocols, this usually means "spot the newline / nul-char". For binary, it usually means "read the length-header in the preamble to the message".
TCP is a stream protocol, and has no fixed message boundaries. This means you can receive part of a message or the end of one and the beginning of another.
There are two ways to solve this:
Alter your protocol to add end-of-message markers. This way you continuously receive until you find the special marker. However, this can leave you with a buffer containing the end of one message and the beginning of another, which is why I recommend the next way.
Alter your protocol to first send the length of the message. Then you will know exactly how long the message is, and can count down while receiving so you won't read the beginning of the next message.
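The second way fits a BeginRead/EndRead callback quite naturally if you feed each chunk into a small decoder that only emits complete messages. This is a sketch under my own assumptions (a 4-byte big-endian length prefix and the made-up name LengthPrefixedDecoder); a real implementation would also want an upper bound on the message length.

```csharp
using System;
using System.Collections.Generic;

public sealed class LengthPrefixedDecoder
{
    // Bytes received so far that do not yet form a complete message.
    private readonly List<byte> pending = new List<byte>();

    // Feeds the bytes from one read callback; returns all messages that are now complete.
    public List<byte[]> Feed(byte[] buffer, int offset, int count)
    {
        for (int i = 0; i < count; i++) pending.Add(buffer[offset + i]);

        var messages = new List<byte[]>();
        while (pending.Count >= 4)
        {
            int length = (pending[0] << 24) | (pending[1] << 16) | (pending[2] << 8) | pending[3];
            if (length < 0) throw new InvalidOperationException("Negative length prefix.");
            if (pending.Count - 4 < length) break; // body not fully received yet

            messages.Add(pending.GetRange(4, length).ToArray());
            pending.RemoveRange(0, 4 + length);    // keep the start of the next message
        }
        return messages;
    }
}
```

The read callback would call Feed with the buffer and the count from EndRead, and raise OnMessageFromServer once per returned message instead of once per read.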
I have a socket connection that receives data, and reads it for processing.
When data is not processed/pulled fast enough from the socket, there is a bottleneck at the TCP level, and the data received is delayed (I can tell by the timestamps after parsing).
How can I see how many TCP bytes are waiting to be read by the socket? (via some external tool like Wireshark or something else)
private void InitiateRecv(IoContext rxContext)
{
rxContext._ipcSocket.BeginReceive(rxContext._ipcBuffer.Buffer, rxContext._ipcBuffer.WrIndex,
rxContext._ipcBuffer.Remaining(), 0, CompleteRecv, rxContext);
}
private void CompleteRecv(IAsyncResult ar)
{
IoContext rxContext = ar.AsyncState as IoContext;
if (rxContext != null)
{
int rxBytes = rxContext._ipcSocket.EndReceive(ar);
if (rxBytes > 0)
{
EventHandler<VfxIpcEventArgs> dispatch = EventDispatch;
dispatch (this, new VfxIpcEventArgs(rxContext._ipcBuffer));
InitiateRecv(rxContext);
}
}
}
My guess is that the "dispatch" is somehow blocking the reception until it is done, ending up in latency (i.e., data that is processed by the dispatch is delayed), hence my (false?) conclusion that data was accumulating at the socket level or before.
How can I see how many TCP bytes are waiting to be read by the socket
By specifying a protocol that indicates how many bytes it's about to send. Using sockets you operate a few layers above the byte level, and you can't see how many send() calls end up as receive() calls on your end because of buffering and delays.
If you specify the number of bytes beforehand, and send a string like "13|Hello, World!", then there's no problem when the message arrives in two parts, say "13|Hello" and ", World!", because you know you'll have to read 13 bytes.
You'll have to keep some sort of state and a buffer in between different receive() calls.
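A minimal sketch of that state and buffer for the "13|Hello, World!" format, assuming ASCII text and a decimal length before the '|'. PipeLengthParser is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public sealed class PipeLengthParser
{
    // Bytes carried over between receive calls.
    private readonly List<byte> pending = new List<byte>();

    // Adds the bytes from one receive call and returns every message that is now complete.
    public List<string> Feed(byte[] received, int count)
    {
        for (int i = 0; i < count; i++) pending.Add(received[i]);

        var messages = new List<string>();
        while (true)
        {
            int bar = pending.IndexOf((byte)'|');
            if (bar < 0) break; // length prefix not complete yet

            string digits = Encoding.ASCII.GetString(pending.GetRange(0, bar).ToArray());
            if (!int.TryParse(digits, out int length) || length < 0)
                throw new FormatException("Bad length prefix: " + digits);

            if (pending.Count - bar - 1 < length) break; // body not complete yet

            messages.Add(Encoding.ASCII.GetString(pending.GetRange(bar + 1, length).ToArray()));
            pending.RemoveRange(0, bar + 1 + length);
        }
        return messages;
    }
}
```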
When it comes to external tools like Wireshark, they cannot know how many bytes are left in the socket. They only know which packets have passed by the network interface.
The only way to check it with Wireshark is to actually know the last bytes you read from the socket, locate them in Wireshark, and count from there.
However, the best way to get this information is to check the Available property on the socket object in your .NET application.
You can use socket.Available if you are using the normal Socket class. Otherwise you have to define a header which gives the number of bytes to be sent from the other end.
I have an asynchronous read method...
private void read(IAsyncResult ar) {
//Get the Server State Object
ServerState state = (ServerState)ar.AsyncState;
//read from the socket
int readCount = state.socket.EndReceive(ar);
//check if reading is done, move on if so, trigger another read if not
if (readCount > 0) {
//purge the buffer and start another read
state.purgeBuffer();
state.socket.BeginReceive(state.buffer, 0, ServerState.bufferSize, 0, new AsyncCallback(read), state);
}
else {
//all bytes have been read, dispatch the message
dispatch(state);
}
}
The problem that I am having is that readCount is only 0 if the connection is closed. How do I say "this is the end of that message" and pass the data on to the dispatcher, while leaving the socket open to accept new messages?
Thank you!
You should not rely on what is in the TCP buffer. You must process the incoming bytes as a stream somewhere. From the socket alone you can't really know whether a message is complete. Only the layer above can know when the message has completed.
Example:
If you read HTTP responses the HTTP header will contain the byte count which is in the HTTP body. So you know how much to read.
You only know how much to read if the data follows a certain protocol and you interpret it. Imagine you receive a file over the socket. The first thing you would receive is the file size. Without that you would never know how much to read.
You should make your messages fit a particular format so that you can distinguish where they start and where they end. Even if it is a stream of data it should be sent in packets.
One option is to send the length of the message first, so you know how much data to expect. The problem with that is that if you lose sync you can never recover: you will never know what is a message length and what is content. It is good to use some special marker sequence to know when a message begins. It is not 100% error-proof (the sequence might appear in the data), but it certainly helps and allows you to recover from a loss of sync. This is particularly important when reading from a binary stream like a socket.
Even ancient RS232 serial protocol had its frame and stop bit to know when you got all the data.
I am sending a large serialized image object over a UDP socket. When I write all the received bytes into a memory stream and pass the memory stream object for deserialization, it throws an exception: No assembly ID for object type 'ImagePacket'.
Receiver End Code:
ImageStream = new MemoryStream();
while (AccumulatingBytes <= TotalSizeOfComplexObject)
{
byte[] Recievedbytes = UdpListener.Receive(ref RemoteEndPoint);
ImageStream.Write(Recievedbytes, 0, Recievedbytes.Length);
AccumulatingBytes += Recievedbytes.Length;
}
ImageStream.Position = 0;
imagecontainer = (ImageContainer)bformater.Deserialize(ImageStream);//Here the Code Segment Breaks and Exception thrown
I suspect the problem here is simply: you are using UDP like it is TCP. UDP is packet based, but a: doesn't guarantee that the packets will arrive in order, and b: doesn't guarantee that packets won't be dropped or duplicated.
I fully expect you have some out of order. If you are sending multiple messages, it is also possible some were dropped, and you've included a few from the next message.
To use the network the way your code wants to use it: use TCP. Otherwise, the responsibility for making sense of out-of-order, dropped and duplicated packets is entirely yours. This could be, for example, by adding a sequence number to the packet, and keeping track of what has been received - re-ordering them as necessary, dropping duplicates, and re-requesting any that died en-route. Basically, re-writing everything that TCP adds! Unless you have a very specific scenario, there's a good chance that the TCP stack (with NIC and OS level support) will do a better job of this than you will.
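If you do stay on UDP, the bookkeeping might start out something like the sketch below. It assumes the sender puts a zero-based sequence number on every chunk and that the receiver knows the total chunk count up front; ChunkReassembler and both of those conventions are hypothetical, and re-requesting lost chunks is left out entirely.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public sealed class ChunkReassembler
{
    private readonly byte[][] chunks;
    private int receivedCount;

    public ChunkReassembler(int totalChunks)
    {
        if (totalChunks <= 0) throw new ArgumentOutOfRangeException(nameof(totalChunks));
        chunks = new byte[totalChunks][];
    }

    public bool IsComplete => receivedCount == chunks.Length;

    // Stores a chunk under its sequence number.
    // Returns false for duplicates and out-of-range sequence numbers, which are ignored.
    public bool Add(int sequence, byte[] data)
    {
        if (data == null) throw new ArgumentNullException(nameof(data));
        if (sequence < 0 || sequence >= chunks.Length || chunks[sequence] != null) return false;
        chunks[sequence] = data;
        receivedCount++;
        return true;
    }

    // Sequence numbers that have not arrived yet (candidates for a re-request).
    public List<int> MissingSequences()
    {
        var missing = new List<int>();
        for (int i = 0; i < chunks.Length; i++)
            if (chunks[i] == null) missing.Add(i);
        return missing;
    }

    // Concatenates the chunks in sequence order once every one has arrived.
    public byte[] Assemble()
    {
        if (!IsComplete) throw new InvalidOperationException("Some chunks are still missing.");
        using (var stream = new MemoryStream())
        {
            foreach (byte[] chunk in chunks) stream.Write(chunk, 0, chunk.Length);
            return stream.ToArray();
        }
    }
}
```

Only once IsComplete is true would you hand the assembled bytes to the deserializer.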