I have a client-server program.
I'm sending data like this:
private void Sender(string s,TcpClient sock)
{
try
{
byte[] buffer = Encoding.UTF8.GetBytes(s);
sock.Client.Send(buffer);
}catch{}
}
and on the client side receiving like this:
byte[] buffer = new byte[PacketSize];
int size = client.Client.Receive(buffer);
String request = Encoding.UTF8.GetString(buffer, 0, size);
The problem is that data is not fully received always, sometimes it's only part of what I have sent. PacketSize is 10240 which is more than the bytes I send. I have also set SendBufferSize and ReceiveBufferSize at both sides.
The worst part is that sometimes data is fully received!
What might be the problem?
The size value returned by Socket.Receive (which is what client.Client.Receive calls) is not necessarily the length of the data you sent. There is no guarantee that a single Receive call returns all the data written by a single Send call. This behavior is intrinsic to the way TCP works: it is a stream-based protocol, not a message-based one.
You cannot solve the problem by using bigger buffers, as the buffers you provide can only limit the amount of data that Receive returns. Even if you provide a 1MB buffer and there is 1MB of data to read, Receive can legitimately return any number of bytes (even just 1).
What you need to do is make sure that you have buffered all the data before calling Encoding.GetString. To do that, you need to know how much data there is in the first place. So at the very least, you need to write the length of the string bytes when sending:
byte[] buffer = Encoding.UTF8.GetBytes(s);
byte[] length = BitConverter.GetBytes(buffer.Length);
sock.Client.Send(length);
sock.Client.Send(buffer);
When receiving, you first read the length (which has a known fixed size: 4 bytes) and then keep buffering the rest of the data in a temp buffer until you have length bytes. This might take any number of Receive calls, so you'll need a while loop, and strictly speaking the same applies to reading the 4 length bytes. Only then can you call Encoding.GetString and get your original string back.
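A minimal sketch of the length-prefixed approach described above, written against Stream so it works with the NetworkStream returned by TcpClient.GetStream(). The class and method names (Framing, SendString, ReadExactly, ReceiveString) are made up for illustration, and the length is sent in host byte order exactly as BitConverter produces it, which assumes both ends share the same endianness.

```csharp
using System;
using System.IO;
using System.Text;

static class Framing
{
    // Writes a 4-byte length prefix followed by the UTF-8 bytes of s.
    public static void SendString(Stream stream, string s)
    {
        byte[] payload = Encoding.UTF8.GetBytes(s);
        byte[] length = BitConverter.GetBytes(payload.Length); // host byte order (assumption)
        stream.Write(length, 0, length.Length);
        stream.Write(payload, 0, payload.Length);
    }

    // Keeps calling Read until exactly count bytes have arrived.
    public static byte[] ReadExactly(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Connection closed before the full message arrived.");
            offset += read;
        }
        return buffer;
    }

    // Reads the 4-byte length first, then exactly that many payload bytes.
    public static string ReceiveString(Stream stream)
    {
        int length = BitConverter.ToInt32(ReadExactly(stream, 4), 0);
        return Encoding.UTF8.GetString(ReadExactly(stream, length));
    }
}
```

On the sending side this would be called as Framing.SendString(sock.GetStream(), s), and on the receiving side as Framing.ReceiveString(client.GetStream()).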
Explanation of the behavior you observe:
Even though the network stack of the OS makes pretty much no guarantees, in practice it will usually give you the data that one TCP packet brings with one Receive call. Since the typical Ethernet MTU (maximum transmission unit) is 1500 bytes, which leaves a little less than that for the TCP payload, naive code will often appear to work as long as the strings are smaller than this. More than that and the data is split across multiple packets, and one Receive may then return only part of it.
Related
My code has to consume data from a NetworkStream, and the data read from the stream will contain three parts: metadata, a well-known delimiter, and data.
I'm trying to determine the most efficient way of reading from the NetworkStream, up to the end of the delimiter. The metadata portion is generally measured in hundreds of bytes (but could be as small as 32 bytes), the delimiter is a specific 2-byte sequence, and the data could range from zero bytes to several gigabytes in size (the metadata provides information on the data length). I should only read up to the delimiter, because the rest of the stream (containing payload data) needs to be used elsewhere, and NetworkStream doesn't support seek and the data may be so large that I can't dump it all into a MemoryStream.
I've been using the following, and it works, but it seems there could be a more efficient way of reading up to the delimiter. Since the minimum metadata size is 32 bytes, I start with a 34-byte buffer (32 bytes of metadata + 2 bytes delimiter), read from the stream, and check for the delimiter. If the delimiter is found (smallest possible metadata), the code then breaks and the balance of the stream contains the data. If the delimiter is not found, the code then loops reading a single byte at a time, checking the last two bytes of the StringBuilder used to hold what has been read from the stream, until the delimiter is found at the end.
(code reduced for brevity, removed checking of negative cases, etc)
string delim = "__";
StringBuilder sb = new StringBuilder();
byte[] buffer = new byte[1];
byte[] initialBuffer = new byte[34];
int bytesRead = stream.Read(initialBuffer, 0, 34); // yes I check bytesRead in the actual code
sb.Append(Encoding.UTF8.GetString(initialBuffer));
while (true)
{
string delimCheck = sb.ToString((sb.Length - 2), 2);
if (delimCheck.Equals(delim)) break;
else
{
buffer = new byte[1];
bytesRead = stream.Read(buffer, 0, 1); // yes I check bytesRead in the actual code
sb.Append(Encoding.UTF8.GetString(buffer));
}
}
The code works, but it seems really inefficient and slow to read one byte at a time to reach the end of the delimiter. Is anything readily apparent that might better optimize this code?
Thanks!
Do you see those Read(array, offset, count) return values you are putting into a variable bytesRead and then happily ignoring?
Those (along with setting the socket in non-blocking mode) are the solution to your problem. Then you can access "everything received so far" without getting stuck waiting for enough extra data to arrive to fill your array.
Even in blocking mode, ignoring that return value is a bug: Read may return fewer bytes than requested (bytesRead < bytesRequested) at any time, and it returns 0 once the peer has gracefully shut down the connection.
Regarding your concerns about how to save the extra data for later, Microsoft provided a class for that. See System.IO.BufferedStream and the example:
The following code examples show how to use the BufferedStream class over the NetworkStream class to increase the performance of certain I/O operations. Start the server on a remote computer before starting the client. Specify the remote computer name as a command-line argument when starting the client. Vary the dataArraySize and streamBufferSize constants to view their effect on performance.
Source: https://learn.microsoft.com/en-us/dotnet/api/system.io.bufferedstream
Not shown in the example is that you still need to put the socket into non-blocking mode to avoid having the BufferedStream block until an entire buffer chunk is received. The Socket class provides the Blocking property to make that easy.
https://learn.microsoft.com/en-us/dotnet/api/system.net.sockets.socket.blocking
If I send 1000 bytes over TCP, is it guaranteed that the receiver will get the entire 1000 bytes "together"? Or might it first get only 500 bytes, and receive the other bytes later?
EDIT: the question is from the application's point of view. If the 1000 bytes are reassembled into a single buffer before they reach the application, then I don't care whether they were fragmented on the way.
See Transmission Control Protocol:
TCP provides reliable, ordered delivery of a stream of bytes from a program on one computer to another program on another computer.
A "stream" means that there is no message boundary from the receiver's point of view. You could get one 1000 byte message or one thousand 1 byte messages depending on what's underneath and how often you call read/select.
Edit: Let me clarify from the application's point of view. No, TCP does not guarantee that a single read gives you all of the 1000 bytes (or 1MB or 1GB) the sender may have sent. Thus, a protocol on top of TCP usually contains a fixed-length header with the total content length in it. For example, you could always send 1 byte that indicates the total length of the content in bytes, which would support up to 255 bytes.
As other answers indicated, TCP is a stream protocol -- every byte sent will be received (once and in the same order), but there are no intrinsic "message boundaries" -- whether all bytes are sent in a single .send call, or multiple ones, they might still be received in one or multiple .receive calls.
So, if you need "message boundaries", you need to impose them on top of the TCP stream, IOW, essentially, at application level. For example, if you know the bytes you're sending will never contain a \0, null-terminated strings work fine; various methods of "escaping" let you send strings of bytes which obey no such limitations. (There are existing protocols for this but none is really widespread or widely accepted).
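To illustrate the null-terminated variant, here is a small sketch that collects whatever each receive call returns and only hands back complete messages. NullTerminatedSplitter and Feed are invented names, and it assumes the payload is UTF-8 text that never contains a \0 byte.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Collects received chunks and splits them into '\0'-terminated messages.
sealed class NullTerminatedSplitter
{
    // Bytes of the message that is still incomplete.
    private readonly List<byte> pending = new List<byte>();

    // Feed the bytes returned by one receive call; returns the messages
    // that were completed by this chunk (possibly none, possibly several).
    public List<string> Feed(byte[] chunk, int count)
    {
        var messages = new List<string>();
        for (int i = 0; i < count; i++)
        {
            if (chunk[i] == 0)
            {
                messages.Add(Encoding.UTF8.GetString(pending.ToArray()));
                pending.Clear();
            }
            else
            {
                pending.Add(chunk[i]);
            }
        }
        return messages;
    }
}
```

The caller would pass the receive buffer and the byte count returned by each receive call to Feed, so a message split across two reads is still delivered as one string.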
Basically, as far as TCP goes, it only guarantees that the data sent from one end will arrive at the other end in the same order.
Usually what you'll have to do is keep an internal buffer and loop until it has received your 1000-byte "packet",
because the recv call, as mentioned, returns how much has actually been received.
So usually you'll then have to implement a protocol on top of TCP to make sure you send data at an appropriate speed, because if you send() all the data in one go it will overload the underlying networking stack, which will cause complications.
So usually the protocol includes a tiny acknowledgement packet that is sent back to confirm that the packet of 1000 bytes was received.
You decide how many bytes each message shall contain; in your case it's 1000. The following is working C# code that does this. The method returns once it has read 1000 bytes. A read of 0 bytes is treated as the abort signal; you can tailor that to your needs.
Usage:
strMsg = ReadData(thisTcpClient.Client, 1000, out bDisconnected);
Following is the method:
string ReadData(Socket sckClient, int nBytesToRead, out bool bShouldDisconnect)
{
bShouldDisconnect = false;
byte[] byteBuffer = new byte[nBytesToRead];
Array.Clear(byteBuffer, 0, byteBuffer.Length);
int nDataRead = 0;
int nStartIndex = 0;
while (nDataRead < nBytesToRead)
{
int nBytesRead = sckClient.Receive(byteBuffer, nStartIndex, nBytesToRead - nStartIndex, SocketFlags.None);
if (0 == nBytesRead)
{
bShouldDisconnect = true;
//0 bytes received; assuming disconnect signal
break;
}
nDataRead += nBytesRead;
nStartIndex += nBytesRead;
}
return Encoding.Default.GetString(byteBuffer, 0, nDataRead);
}
Let us know if this didn't help you. Good luck.
Yes, there is a chance of receiving the data piece by piece. I hope this MSDN article and the following example (taken from the article for quick review) are helpful if you are using Windows sockets.
void CChatSocket::OnReceive(int nErrorCode)
{
CSocket::OnReceive(nErrorCode);
DWORD dwReceived;
if (IOCtl(FIONREAD, &dwReceived))
{
if (dwReceived >= dwExpected) // Process only if you have enough data
m_pDoc->ProcessPendingRead();
}
else
{
// Error handling here
}
}
TCP guarantees that the receiver will get all 1000 bytes. The underlying segments may not arrive in order on the wire, but TCP reorders them, so the receiving application always sees the bytes in order. They are not necessarily delivered all at once, though (unless you craft the packets yourself and make it so).
That said, for something as small as 1000 bytes, there is a good chance it will be sent in one packet as long as you do it in one call to send, though for larger transmissions it may not be.
The only thing that the TCP layer guarantees is that the receiver will receive:
all the bytes transmitted by the sender
in the same order
There are no guarantees at all about how the bytes might be split up into "packets". All the stuff you might read about MTU, packet fragmentation, maximum segment size, or whatever else is all below the layer of TCP sockets, and is irrelevant. TCP provides a stream service only.
With reference to your question, this means that the receiver may receive the first 500 bytes, then the next 500 bytes later. Or, the receiver might receive the data one byte at a time, if that's what it asks for. This is the reason that the recv() function takes a parameter that tells it the maximum amount of data to return, instead of it telling you how big a packet is.
The transmission control protocol guarantees successful delivery of all packets by requiring acknowledgment of the successful delivery of each packet to the sender by the receiver. By this definition the receiver will always receive the payload in chunks when the size of the payload exceeds the MTU (maximum transmission unit).
For more information please read Transmission Control Protocol.
The IP packets may get fragmented in transit.
So the destination machine may receive multiple packets, which will be reassembled by the TCP/IP stack. Depending on the network API you are using, the data will be given to you either reassembled or as raw packets.
It depends on the established MTU (Maximum Transmission Unit). If your established connection (once the handshake is done) uses an MTU of 512 bytes, you will need two or more TCP packets to send 1000 bytes.
Assume that I know the payload length beforehand. What is the correct way to receive all the bytes? Currently, I am doing something like this
byte[] buffer = new byte[payloadLength];
socket.Receive(buffer, buffer.Length, SocketFlags.None);
But then I thought: what if the payload is big and Receive is not able to receive all the data in one go? So I was planning to do something like this
byte[] buffer = new byte[payloadLength];
int remained = payloadLength;
int size = 0;
do {
size = socket.Receive(buffer, payloadLength - remained, remained, SocketFlags.None);
remained -= size;
} while (remained > 0 && size > 0);
Which one is more correct? Or do you guys have any better idea?
Definitely some variant like the second. Ignoring the return value from Receive is one of the most common beginner bugs found on SO, because all that the contract guarantees (if you don't ask for 0 bytes) when Receive returns is that it will have read at least one byte. It makes no guarantee that it will try to read as many bytes as you've asked for. [1]
Any message framing (such as here with fixed-size messages, apparently) is up to you to implement atop TCP's streams of bytes.
[1] Even for relatively small receive sizes, there's no guarantee. So if you know how big your message is going to be because you're sending the length first (another common means of message framing), you need to loop even to get the 2/4/8 bytes that make up the message length, whether that is a short, int or long.
In practice the first option usually works. However you cannot guarantee it, so if it's a serious program that you care about then take the 2nd approach.
Certainly if you switched to ReceiveAsync, which you should if you have high performance needs, you would need to take the 2nd approach.
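For the asynchronous case, a sketch of the same loop could look like the following. Note that it uses Stream.ReadAsync on the connection's stream rather than Socket.ReceiveAsync, which is a swap made here to keep the example short; AsyncReceive and ReadExactlyAsync are illustrative names, not part of any library.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class AsyncReceive
{
    // Awaits reads until exactly payloadLength bytes have been received.
    public static async Task<byte[]> ReadExactlyAsync(Stream stream, int payloadLength)
    {
        byte[] buffer = new byte[payloadLength];
        int received = 0;
        while (received < payloadLength)
        {
            int size = await stream.ReadAsync(buffer, received, payloadLength - received);
            if (size == 0)
                throw new EndOfStreamException("Connection closed before the full payload arrived.");
            received += size;
        }
        return buffer;
    }
}
```

Unlike the do/while loop in the question, this version throws when the connection closes early, so the caller cannot mistake a partially filled buffer for a complete payload.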
What is the behaviour of the NetworkStream.Write() method, if I send data via a TcpClient's NetworkStream, and the TcpClient.SendBufferSize is smaller than the data?
The MSDN documentation for SendBufferSize says
If the network buffer is smaller than the amount of data you provide
the Write method, several network send operations will be performed
for every call you make to the Write method. You can achieve greater
data throughput by ensuring that your network buffer is at least as
large as your application buffer.
So I know that the data will be sent in multiple operations, and the receiving TCP-Stack should reassemble it into one continuous stream transparently.
But what happens exactly in my program during this time?
If there is enough space in the SendBuffer, TcpClient.GetStream().Write() will not block at all, and return immediately and so will NetworkStream.Flush().
If I set the TcpClient.SendBufferSize to a value smaller than the data, will the Write() block until
either the first part of the data has been received and ACKed,
or the TcpClient.SendTimeout has expired?
Or does it work in some other way? Does it actually wait for a TCP ACK?
Are there any other drawbacks besides higher overhead to such a smaller buffer size? Are there problems with changing the SendBufferSize on the fly?
Example:
byte[] data = new byte[20]; // 20 byte data
tcpClient.SendBufferSize= 10; // 10 byte buffer
tcpClient.SendTimeout = 1000; // 1s timeout
tcpClient.GetStream().Write(data,0,20);
// will this block until the first 10 bytes have been fully transmitted?
// does it block until a TCP ACK has been received? or something else?
// will it throw if the first 10 bytes have not been received in 1 second?
tcpClient.GetStream().Flush(); // would this make any difference?
My goal here is mainly getting a better understanding of the network internals.
Also, I was wondering if this could be abused to react more quickly to a network failure. If data is sent only infrequently, each data packet is small enough to be transmitted at once, and the message protocol has no receipt messages, it could take a long time after a network error until the next Write() is called, and so a long time until an exception is thrown.
If the SendBuffer is very small, would an error be noticed more quickly, unless it happened at the end of the data?
Could I abuse this to measure the time it takes for a single packet to be transmitted and ACKed?