C# network stream fragmented data

My Context
I have a TCP networking program that sends large objects that have been serialized and encoded into base64 over a connection. I wrote a client library and a server library, and they both use NetworkStream's Begin/EndRead and Begin/EndWrite. Here's the (very much simplified version of the) code I'm using:
For the server:
var Server = new TcpServer(/* network stuffs */);
Server.Connect();
Server.OnClientConnect += new ClientConnectEventHandler(Server_OnClientConnect);
void Server_OnClientConnect()
{
    LargeObject obj = CalculateLotsOfBoringStuff();
    Server.Write(obj.SerializeAndEncodeBase64());
}
Then the client:
var Client = new TcpClient(/* more network stuffs */);
Client.Connect();
Client.OnMessageFromServer += new MessageEventHandler(Client_OnMessageFromServer);
void Client_OnMessageFromServer(MessageEventArgs mea)
{
    DoSomethingWithLargeObject(mea.Data.DecodeBase64AndDeserialize());
}
The client library has a callback method for NetworkStream.BeginRead which triggers the event OnMessageFromServer that passes the data as a string through MessageEventArgs.
My Problem
When receiving large amounts of data through BeginRead/EndRead, however, the data appears to be fragmented over multiple messages. E.g., pretend this is a long message:
"This is a really long message except not because it's for explanatory purposes."
If that really were a long message, Client_OnMessageFromServer might be called... say three times with fragmented parts of the "long message":
"This is a really long messa"
"ge except not because it's for explanatory purpos"
"es."
Soooooooo.... takes deep breath
What would be the best way to have everything sent through one Begin/EndWrite to be received in one call to Client_OnMessageFromServer?

You can't. On TCP, how things arrive is not necessarily the same as how they were sent. It is the job of your code to know what constitutes a complete message, and if necessary to buffer incoming data until you have a complete message (taking care not to discard the start of the next message in the process).
In text protocols, this usually means "spot the newline / nul-char". For binary, it usually means "read the length-header in the preamble of the message".
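For the length-prefixed case, the write side can be as simple as the sketch below (WriteMessage is a hypothetical helper, not part of the asker's TcpServer wrapper):
// Hypothetical helper: prefix the payload with a 4-byte length, then send the frame.
void WriteMessage(NetworkStream stream, byte[] payload)
{
    byte[] lengthPrefix = BitConverter.GetBytes(payload.Length);
    stream.Write(lengthPrefix, 0, lengthPrefix.Length);
    stream.Write(payload, 0, payload.Length);
}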

TCP is a stream protocol, and has no fixed message boundaries. This means you can receive part of a message or the end of one and the beginning of another.
There are two ways to solve this:
Alter your protocol to add end-of-message markers. This way you continuously receive until you find the special marker. This can, however, leave you with a buffer containing the end of one message and the beginning of another, which is why I recommend the next approach.
Alter your protocol to first send the length of the message. Then you know exactly how long the message is and can count down while receiving, so you won't read into the beginning of the next message.
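A matching receive loop might look something like the following sketch (assuming a blocking NetworkStream, the 4-byte length prefix described above, and System.IO for EndOfStreamException; ReadExactly and ReadMessage are illustrative names):
// Read exactly 'count' bytes, looping because a single Read may return fewer.
byte[] ReadExactly(NetworkStream stream, int count)
{
    byte[] buffer = new byte[count];
    int offset = 0;
    while (offset < count)
    {
        int read = stream.Read(buffer, offset, count - offset);
        if (read == 0)
            throw new EndOfStreamException("Connection closed mid-message.");
        offset += read;
    }
    return buffer;
}

byte[] ReadMessage(NetworkStream stream)
{
    // First the 4-byte length prefix, then exactly that many payload bytes.
    int length = BitConverter.ToInt32(ReadExactly(stream, 4), 0);
    return ReadExactly(stream, length);
}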

Related

Debug Custom Server Socket Application

I've been struggling to debug an issue with my C# socket application. The application is part of a university assignment that I am working on. The scope of the assignment is to build a webserver and game/application server using raw sockets. In my case the webserver serves static files and acts as a proxy for the game server. Serving files seems to be working well, but when forwarding requests through to the game server, there is a data transfer issue.
The complete source code is available on GitHub for reference, but I think the problem is in the section that tries to read the body of the response from the game server to the webserver's request. Here is the relevant code:
if (header.ContainsKey("content-length"))
{
    var bodyLength = Convert.ToInt32(header["content-length"]);
    Console.WriteLine($"Receiving body from game server. Expecting {bodyLength} bytes");
    body = ReceiveBodyData(socket, bodyLength);
    Console.WriteLine($"Finished receiving body from game server. Received {body.Length} bytes.");
}
When it is executing this code the first message is written to the console, but the second message never prints because the method ReceiveBodyData never returns (ends up in an infinite loop trying to read the body data).
Examining the output from the game server, I see that the connection is closed from its end, but I'm not sure whether that kills the recipient's ability to read data or not.
Can anyone assist in debugging this issue? Please keep in mind that this is my assignment, so ideally don't write tons of code.
I think I have found the issue. I have committed changes that fix two bugs in my code.
First bug
Sure enough, the ReceiveBodyData method had a logic flaw in it. The method was as follows:
protected byte[] ReceiveBodyData(Socket handler, int bodyLength)
{
    // We know how many bytes to expect, so we create an appropriate array and keep track
    // of how many bytes we've received.
    var result = new byte[bodyLength];
    var totalBytesReceived = 0;
    while (true)
    {
        // The bug is in the line below
        var bytesCount = handler.Receive(result, totalBytesReceived, bodyLength - totalBytesReceived, SocketFlags.None);
        if (totalBytesReceived >= bodyLength)
        {
            return result;
        }
    }
}
I was assigning the number of bytes read to the variable bytesCount, but I wasn't updating totalBytesReceived, so the condition to exit the loop was never met.
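For comparison, a corrected sketch of the method might look like this (it also bails out if the connection closes before the full body arrives, which the original didn't handle):
protected byte[] ReceiveBodyData(Socket handler, int bodyLength)
{
    var result = new byte[bodyLength];
    var totalBytesReceived = 0;
    while (totalBytesReceived < bodyLength)
    {
        var bytesCount = handler.Receive(result, totalBytesReceived, bodyLength - totalBytesReceived, SocketFlags.None);
        if (bytesCount == 0)
        {
            // The peer closed the connection before the full body arrived.
            throw new SocketException((int)SocketError.ConnectionReset);
        }
        totalBytesReceived += bytesCount;
    }
    return result;
}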
Second bug
I don't quite understand why this bug occurs. Sometimes, after a valid connection is made and the request and response are exchanged, a second connection is created. My code gets stuck because it is listening for the request data to be transmitted, but it never is, because it doesn't seem to be a valid connection.
I don't know if the browser is trying to maintain a persistent connection or quite what is going on here. So I now return null if there is no data to receive, and the code that listens for connections just goes back to listening.

SocketAsyncEventArgs buffer is full of zeroes

I'm writing a message layer for my distributed system. I'm using IOCP, ie the Socket.XXXAsync methods.
Here's something pretty close to what I'm doing (in fact, my receive function is based on his):
http://vadmyst.blogspot.com/2008/05/sample-code-for-tcp-server-using.html
What I've found is that at the start of the program (two test servers talking to each other), I consistently get a number of SAEA objects whose .Buffer is entirely filled with zeroes, yet whose .BytesTransferred equals the size of the buffer (1024 in my case).
What does this mean? Is there a special condition I need to check for? My system interprets this as an incomplete message and moves on, but I'm wondering if I'm actually missing some data. I was under the impression that if nothing was being received, you'd not get a callback. In any case, I can see in WireShark that there aren't any zero-length packets coming in.
I've found the following when I Googled it, but I'm not sure my problem is the same:
http://social.msdn.microsoft.com/Forums/en-US/ncl/thread/40fe397c-b1da-428e-a355-ee5a6b0b4d2c
http://go4answers.webhost4life.com/Example/socketasynceventargs-buffer-not-ready-121918.aspx
I am not sure what is going on in the linked example. It appears to be using asynchronous sockets in a synchronous way; I cannot see any callbacks or similar in the code. You may need to rethink whether you need synchronous or asynchronous sockets :).
As to the problem at hand, it stems from the possibility that your functions are trying to read/write to the buffer before the network transmit/receive has completed. Try using the callback functionality included in the async Socket. E.g.
// This goes into your accept function, to begin receiving the data
socketName.BeginReceive(yourbuffer, 0, yourbuffer.Length,
    SocketFlags.None, new AsyncCallback(OnReceiveData), socketName);
// This callback fires when the receive is complete, so you know the socket
// has finished receiving data.
private void OnReceiveData(IAsyncResult input) {
    Socket inSocket = (Socket)input.AsyncState;  // This is just a typecast
    int bytesRead = inSocket.EndReceive(input);  // Number of bytes actually received
    // Pull the data out of the buffer as you already have before.
    // state.Data.Write ......
}

How do I know when an asynchronous socket read ends?

I have an asynchronous read method...
private void read(IAsyncResult ar) {
    // Get the Server State Object
    ServerState state = (ServerState)ar.AsyncState;
    // Read from the socket
    int readCount = state.socket.EndReceive(ar);
    // Check if reading is done; move on if so, trigger another read if not
    if (readCount > 0) {
        // Purge the buffer and start another read
        state.purgeBuffer();
        state.socket.BeginReceive(state.buffer, 0, ServerState.bufferSize, 0, new AsyncCallback(read), state);
    }
    else {
        // All bytes have been read, dispatch the message
        dispatch(state);
    }
}
The problem that I am having is that readCount is only 0 when the connection is closed. How do I say "this is the end of that message" and pass the data on to the dispatcher, while leaving the socket open to accept new messages?
Thank you!
You should not rely on what is in the TCP buffer. You must process the incoming bytes as a stream somewhere; you can't really know from the socket alone whether a message is complete. Only the layer above can know when the message is complete.
Example:
If you read HTTP responses, the HTTP header contains the byte count of the HTTP body, so you know how much to read.
You only know how much to read if the data follows a certain protocol and you interpret it. Imagine you receive a file over the socket: the first thing you would receive is the file size. Without that you would never know how much to read.
You should make your messages fit a particular format so that you can distinguish when they start and when end. Even if it is a stream of data it should be sent in packets.
One option is to send the length of the message first; then you know how much data to expect. The problem with that is that if you lose sync, you can never recover: you will never know which bytes are the message length and which are its content. It is good to use a special marker sequence to know when a message begins. It is not 100% error-proof (the sequence might appear in the data), but it certainly helps and allows you to recover from losing sync. This is particularly important when reading from a binary stream like a socket.
Even the ancient RS-232 serial protocol had its framing and stop bits so you knew when you had all the data.
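As a rough illustration of the marker idea, the receiver can append everything it reads to a buffer and carve out complete messages between markers; the marker strings and method name below are made up for the example:
const string StartMarker = "<MSG>";
const string EndMarker = "</MSG>";

// Pull every complete message out of the accumulation buffer; whatever is left
// (a partial message, or garbage before the next start marker) stays in the
// buffer for the next receive, which is also how you resync after an error.
List<string> ExtractMessages(StringBuilder accumulated)
{
    var messages = new List<string>();
    while (true)
    {
        string text = accumulated.ToString();
        int start = text.IndexOf(StartMarker);
        if (start < 0)
            break;
        int end = text.IndexOf(EndMarker, start + StartMarker.Length);
        if (end < 0)
            break;
        messages.Add(text.Substring(start + StartMarker.Length, end - start - StartMarker.Length));
        accumulated.Remove(0, end + EndMarker.Length);
    }
    return messages;
}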

C# Socket Trick

I am sending and receiving bytes between a server and a client. The server regularly sends some message in the form of bytes and client receives them.
Message format is below:
{Key:Value,Key:Value,Key:Value}
Now, at the client side, instead of receiving this message once, I am receiving multiple copies of it, which is not what I want.
The client is receiving like this:
{Key:Value,Key:Value,Key:Value}
{Key:Value,Key:Value,Key:Value}
{Key:Value,Key:Value,Key:Value}
{Key:Value,Key:Value,Key:Value}
{Key:Value,Key:Value,Key:Value}
{Key:Value,Key:Value,Key:Value}
{Key:Value,Key:Value,
Can someone help me figure out the problem?
Updated
This code is sending instructions.
var client = (param as System.Net.Sockets.Socket);
while (true)
{
    try
    {
        var instructions = "{";
        instructions += "Window:" + window + ",";
        instructions += "Time:" + System.DateTime.Now.ToShortTimeString() + ",";
        instructions += "Message:" + msgToSend + "";
        instructions += "}";
        var bytes = System.Text.Encoding.Default.GetBytes(instructions);
        client.Send(bytes, 0, bytes.Length, System.Net.Sockets.SocketFlags.None);
    }
    catch (Exception ex)
    {
        continue;
    }
}
This code is receiving at client side.
while (true)
{
    try
    {
        var data = new byte[tcpClient.ReceiveBufferSize];
        stream.Read(data, 0, tcpClient.ReceiveBufferSize);
        instructions = System.Text.Encoding.Default.GetString(data.ToArray());
    }
    catch (Exception ex)
    {
        continue;
    }
}
Okay, a few problems with this code:
You're using Encoding.Default, which is almost certainly not what you want to do
You're always decoding the whole string, rather than just the amount you've actually managed to read - you're ignoring the return value of stream.Read
You're just continuing after an exception, with no logging, error handling or anything
As Dean says, you're repeatedly sending the same data
Ideally, it would be useful for your messages to have a prefix saying how long each one is, in bytes. Then on the receiving side you can read that length, and loop to repeatedly read into a buffer until you've read all the data you need. Then perform the decoding.
If you can't change the protocol, you'll still need to loop round, but checking for the end delimiter ("}" presumably) explicitly, and noting that you may receive data from the next message, which you'll have to store until you next want to read.
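For the delimiter route, a rough sketch of the receive side might be the following (it uses the return value of Read, decodes with an explicit encoding, and keeps any bytes belonging to the next message; HandleInstruction is a hypothetical handler and the '}' delimiter follows the question's format):
var decoder = System.Text.Encoding.UTF8;        // explicit encoding; the content here is ASCII-safe
var pending = new System.Text.StringBuilder();  // data carried over between reads
var buffer = new byte[tcpClient.ReceiveBufferSize];

while (true)
{
    int bytesRead = stream.Read(buffer, 0, buffer.Length);
    if (bytesRead == 0)
        break;                                  // connection closed by the peer

    // Decode only what was actually read, then look for complete messages.
    pending.Append(decoder.GetString(buffer, 0, bytesRead));
    string text = pending.ToString();
    int end;
    while ((end = text.IndexOf('}')) >= 0)
    {
        HandleInstruction(text.Substring(0, end + 1));  // one complete {...} message
        text = text.Substring(end + 1);
    }
    pending.Clear();
    pending.Append(text);                       // keep the partial message for next time
}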
You've got:
while (true)
In the sender: it's just going to keep sending the same thing over and over...
Also, if you get an exception trying to send or receive the data, you can't just try again and expect it to work. Depending on the exact error, you might need to reestablish the connection, or it might be that the network has gone away completely. In any case, simply retrying again is almost always going to be the wrong thing to do.
The problem has been figured out, as Dean Harding said.
But besides that, you should be clearer about which side is the "client" and which is the "server".
Basically:
Only the server side should wait (in a loop) for messages. The client (sender) sends messages when needed or when a condition is met.
You can send messages in a loop, but you should control and regulate it with a "Sleep" or "Timer". That way you spare resources and give the receiver more time to process each message completely.
You are sending your data through TCP. TCP is a stream-oriented protocol, so you know the client will receive the same stream of bytes in the same order, but you lose the packet boundaries. Your protocol seems to be packet-oriented instead. Then you have a choice:
switch to a packet-oriented protocol (UDP) or
delimit the packets yourself at the receiving side (as Jon Skeet said, by looking for the delimiters).
Keep in mind that TCP has some reliability features not found in UDP. If reliability is not a concern, switch to UDP. Otherwise, finding the delimiters at the client side should be easier than implementing your own reliability layer.

Handling dropped TCP packets in C#

I'm sending a large amount of data in one go between a client and server written in C#. It works fine when I run the client and server on my local machine, but when I put the server on a remote computer on the internet it seems to drop data.
I send 20000 strings using the socket.Send() method and receive them using a loop which does socket.Receive(). Each string is delimited by unique characters which I use to count the number received (this is the protocol, if you like). The protocol is proven, in that even with fragmented messages each string is correctly counted. On my local machine I get all 20000; over the internet I get anything between 17000 and 20000. It seems to be worse the slower the remote computer's connection is. To add to the confusion, turning on Wireshark seems to reduce the number of dropped messages.
First of all, what is causing this? Is it a TCP/IP issue or something wrong with my code?
Secondly, how can I get round this? Receiving all of the 20000 strings is vital.
Socket receiving code:
private static readonly Encoding encoding = new ASCIIEncoding();
//...
while (socket.Connected)
{
    byte[] recvBuffer = new byte[1024];
    int bytesRead = 0;
    try
    {
        bytesRead = socket.Receive(recvBuffer);
    }
    catch (SocketException e)
    {
        if (!socket.Connected)
        {
            return;
        }
    }
    string input = encoding.GetString(recvBuffer, 0, bytesRead);
    CountStringsIn(input);
}
Socket sending code:
private static readonly Encoding encoding = new ASCIIEncoding();
//...
socket.Send(encoding.GetBytes(string));
If you're dropping packets, you'll see a delay in transmission, since TCP has to re-transmit the dropped packets. This could be very significant, although there's a TCP option called selective acknowledgement which, if supported by both sides, triggers a resend of only the packets that were dropped rather than every packet since the dropped one. There's no way to control that in your code. By default, you can assume that every packet is delivered in order for TCP, and if for some reason it can't deliver every packet in order, the connection will drop, either by a timeout or by one end of the connection sending a RST packet.
What you're seeing is most likely the result of Nagle's algorithm. Instead of sending each bit of data as you post it, it sends the first small packet and then waits for an ACK from the other side. While it's waiting, it aggregates all the other data you want to send and combines it into one big packet before sending it. Since the max size for TCP is 65k, it can combine quite a bit of data into one packet, although it's extremely unlikely that this will occur, particularly since winsock's default buffer size is about 10k or so (I forget the exact amount). Additionally, if the max window size of the receiver is less than 65k, it will only send as much as the last advertised window size of the receiver. The window size also affects Nagle's algorithm in terms of how much data it can aggregate prior to sending, because it can't send more than the window size.
The reason you see this is that on the internet, unlike your local network, that first ACK takes more time to return, so Nagle's algorithm aggregates more of your data into a single packet. Locally, the return is effectively instantaneous, so it's able to send your data as quickly as you can post it to the socket. You can disable Nagle's algorithm on the client side by using setsockopt (winsock) or Socket.SetSocketOption (.NET), but I highly recommend that you DO NOT disable Nagling on the socket unless you are 100% sure you know what you're doing. It's there for a very good reason.
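For reference, disabling Nagle's algorithm from .NET looks like the following (a minimal sketch; as noted above, the recommendation is to leave it enabled unless you're sure):
// Either via the socket option...
socket.SetSocketOption(SocketOptionLevel.Tcp, SocketOptionName.NoDelay, true);
// ...or via the convenience property on Socket / TcpClient.
socket.NoDelay = true;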
Well there's one thing wrong with your code to start with, if you're counting the number of calls to Receive which complete: you appear to be assuming that you'll see as many Receive calls finish as you made Send calls.
TCP is a stream-based protocol - you shouldn't be worrying about individual packets or reads; you should be concerned with reading the data, expecting that sometimes you won't get a whole message in one packet and sometimes you may get more than one message in a single read. (One read may not correspond to one packet, too.)
You should either prefix each message with its length before sending, or have a delimiter between messages.
It's definitely not TCP's fault. TCP guarantees in-order, exactly-once delivery.
Which strings are "missing"? I'd wager it's the last ones; try flushing from the sending end.
Moreover, your "protocol" here (I'm talking about the application-layer protocol you're inventing) is lacking: you should consider sending the number of objects and/or their lengths so the receiver knows when it's actually done receiving them.
How long are each of the strings? If they aren't exactly 1024 bytes, they'll be merged by the remote TCP/IP stack into one big stream, which you read big blocks of in your Receive call.
For example, using three Send calls to send "A", "B", and "C" will most likely come to your remote client as "ABC" (as either the remote stack or your own stack will buffer the bytes until they are read). If you need each string to come without it being merged with other strings, look into adding in a "protocol" with an identifier to show the start and end of each string, or alternatively configure the socket to avoid buffering and combining packets.
