I have read more about this online and haven't found a solution. I know the problem is that my ReadAsync is faster than the connection is sending data. But I don't want to use Thread.Sleep(1), because 1 ms may not be enough; there may be a hiccup in the connection. Anything can happen.
This is my code. Sometimes dataBytes.Length == 0, but if I debug and set a breakpoint, there is always data. When I add a Thread.Sleep of 500 ms, it works.
So what is happening is that DataAvailable is sometimes false while more data is still coming.
using (var client = (TcpClient)c)
{
    using (NetworkStream stream = client.GetStream())
    {
        using (MemoryStream memory = new MemoryStream())
        {
            do
            {
                byte[] b = new byte[256];
                int read = await stream.ReadAsync(b, 0, b.Length);
                await memory.WriteAsync(b, 0, read);
            } while (stream.DataAvailable && stream.CanRead);
            memory.Seek(0, SeekOrigin.Begin);
            byte[] dataBytes = memory.ToArray();
        }
    }
}
To test this connection I use RestSharp to send an HTTP message to the code above.
My question is, how can I fix this in a way I am not depending on a Thread.Sleep.
DataAvailable doesn't tell you whether more data is coming; it only tells you whether data is available in the buffers right now, and can be useful in deciding whether to do a synchronous read vs an asynchronous read. It should not have any place in a while loop that determines the end of the read operation.
If you want to read a socket to the end of the data, you need to keep reading until Read returns a non-positive value (assuming you're not doing zero-length reads for async IO reasons, which you aren't). However, most socket work doesn't involve a scenario where you can just read from start to end in one big chunk; instead you are required to implement "framing" (see the second section here), which lets you detect individual messages (which are completely different from individual Read calls).
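A minimal sketch of the "keep reading until Read returns 0" loop, assuming the sender closes the connection when it has finished (on an HTTP keep-alive connection it will not, and framing is needed instead). The class and method names are illustrative only:

```csharp
using System.IO;
using System.Threading.Tasks;

public static class StreamReadSketch
{
    // Reads until the stream reports end of data (ReadAsync returns 0).
    // DataAvailable is deliberately not consulted.
    public static async Task<byte[]> ReadToEndAsync(Stream stream)
    {
        using (var memory = new MemoryStream())
        {
            var buffer = new byte[256];
            int read;
            while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                // Only the first 'read' bytes of the buffer are valid.
                memory.Write(buffer, 0, read);
            }
            return memory.ToArray();
        }
    }
}
```

With the question's code this would be called as `byte[] dataBytes = await StreamReadSketch.ReadToEndAsync(stream);`.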
I was wondering if there is any method or property that allows us to see whether there are bytes available to read in the stream associated with a BinaryReader (in my case, it is a NetworkStream, since I am performing TCP communication).
I have checked the documentation and the only method I have seen is PeekChar(), but it only checks whether there is a next byte (character), so if there are many bytes to read, running a while loop to increment a counter may be inefficient.
Regarding the TCP communication, the problem is that the application protocol on top of TCP was not defined by me, and I am just trying to figure out how it works! Of course there will be some "length field" that will give me some clues about the bytes to read, but right now I am just checking how it works and this question came to mind.
The BinaryReader itself doesn't have a DataAvailable property, but the NetworkStream does.
NetworkStream stream = new NetworkStream(socket);
BinaryReader reader = new BinaryReader(stream);
while (true)
{
    if (stream.DataAvailable)
    {
        // call reader.ReadBytes(...) like you normally would!
    }
    // do other stuff here!
    Thread.Sleep(100);
}
If you don't have a handle to the original NetworkStream at the point where you would call reader.ReadBytes(), you can use the BaseStream property.
while (true)
{
    NetworkStream stream = reader.BaseStream as NetworkStream;
    if (stream.DataAvailable)
    {
        // call reader.ReadBytes(...) like you normally would!
    }
    // do other stuff here!
    Thread.Sleep(100);
}
BinaryReader blocks until it has read all the bytes it needs. The only exception is when it detects the end of the stream. But a NetworkStream stays open and does not report end of stream until the remote side closes the connection. So you can either write a class with basic readers (ReadInt, ReadDouble, etc.) that peeks, reads byte by byte and does not block, or use an asynchronous technique.
I'm using the following code to send a file over TCP.
If I send the same file many times consecutively to test whether it is robust, I receive the first file correctly and the others messed up.
All the messed-up files have the same incorrect bytes, and if I Sleep(a while) all files are transferred correctly. I noticed I must instantiate a new buffer while reading my file to get everything right, but I don't understand why.
I fear my solution of reinstantiating the buffer could just be hiding another major problem. Any suggestions?
using (var fileStream = new FileStream(file, FileMode.Open, FileAccess.Read))
{
    using (var binaryReader = new BinaryReader(fileStream))
    {
        var _sendingBuffer = new byte[BUFFER_SIZE];
        int length = (int)fileStream.Length;
        int bytesRead = 0;
        // Ensure we reached the end of the stream regardless of encoding
        while (binaryReader.BaseStream.Position != binaryReader.BaseStream.Length)
        {
            bytesRead = binaryReader.Read(_sendingBuffer, 0, _sendingBuffer.Length);
            _socket.BeginSend(_sendingBuffer, 0, bytesRead, SocketFlags.None, SendFileCallback, null);
            // without this I received some messed up data
            _sendingBuffer = new byte[BUFFER_SIZE];
        }
    }
}
BeginSend is an asynchronous operation. Calling it only guarantees that the send has started; it will not have finished immediately. While the socket is still sending the data you passed, that data must not be mutated.
The end of the operation is signaled through the AsyncCallback callback parameter.
Your problem is exactly that you mutated the transmit buffer while the transmit was still in progress. Creating a new array for each transmit call fixes this.
Other ways to fix the problem:
Use the blocking Socket.Send function, which blocks until all the data has been sent and the buffer can be reused. This also makes your error handling much easier, because errors no longer surface through the AsyncCallback.
Make your complete program asynchronous, e.g. using C# 5's Task and async/await functionality.
Therefore:
1. First read one part of the file asynchronously.
2. When the async read finishes, send it asynchronously through the socket.
3. When this completes and there is more data to read, go back to 1.
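The three steps can be sketched with async/await as below. This is an illustration under assumptions, not the original poster's code: it works on any pair of streams (for a socket, the destination would be something like `new NetworkStream(_socket)`), and the class name, method name and default buffer size are made up. Because each write is awaited before the buffer is refilled, one buffer can be reused safely.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class FileSendSketch
{
    // Copies 'source' to 'destination' chunk by chunk and returns the byte count.
    public static async Task<long> SendAsync(Stream source, Stream destination, int bufferSize = 8192)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;
        // Step 1: read one part of the file asynchronously.
        while ((read = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            // Step 2: send it, and wait for the send to complete
            // before the buffer is overwritten by the next read.
            await destination.WriteAsync(buffer, 0, read);
            total += read;
            // Step 3: loop back to step 1 while there is more data.
        }
        return total;
    }
}
```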
If we create an HttpWebRequest and get the response stream from its response, is the data downloaded completely at once, or is it only downloaded from the network when we call ReadBytes on the stream?
The code sample I am referring to is below:
var webRequest = HttpWebRequest.Create("url of a big file approx 700MB") as HttpWebRequest;
var webResponse = webRequest.GetResponse();
using (BinaryReader ns = new BinaryReader(webResponse.GetResponseStream()))
{
    Thread.Sleep(60000); // Sleep for 60 seconds; hope the 700MB file gets downloaded in 60 seconds
    // At this point, is the response totally downloaded, or not downloaded at all?
    var buffer = ns.ReadBytes(bufferToRead);
    // Or is ReadBytes in the statement above responsible for downloading the content from the internet?
}
GetResponseStream opens and returns a Stream object. The stream object is sourced from the underlying Socket. This Socket is sent data by the network adapter asynchronously. The data just arrives and is buffered. GetResponseStream will block execution until the first data arrives.
ReadByte pulls the data up from the socket layer to C#. This method will block execution until there is a byte available.
Closing the stream prematurely will end the asynchronous transfer (closes the Socket, the sender will be notified of this as their connection will fail) and discard (flush) any buffered data that you have not used yet.
var webRequest = HttpWebRequest.Create("url of a big file approx 700MB") as HttpWebRequest;
Okay, we're set up ready to go. It's a bit different if you PUT or POST a stream of your own, but the differences are analogous.
var webResponse = webRequest.GetResponse();
When GetResponse() returns, it will at the very least have read all of the HTTP headers. It may well have read the headers of a redirect, and done another request to the URI it was redirected to. It's also possible that it's actually hitting a cache (either directly or because the web server sent 304 Not Modified), but by default the details of that are hidden from you.
There will likely be some more bytes in the socket's buffer.
using (BinaryReader ns = new BinaryReader(webResponse.GetResponseStream()))
{
At this point, we've got a stream representing the network stream.
Let's remove the Thread.Sleep(); it does nothing except add a risk of the connection timing out. Even assuming it doesn't time out while waiting, the connection will have "backed off" from sending bytes since you weren't reading them, so the effect will be to slow things even more than you did by adding a deliberate slow-down.
var buffer = ns.ReadBytes(bufferToRead);
At this point, either bufferToRead bytes have been read to create a byte[] or else fewer than bufferToRead because the total size of the stream was less than that, in which case buffer contains the entire stream. This will take as long as it takes.
}
At this point, because a successful HTTP GET was performed, the underlying web-access layer may cache the response (probably not if it's very large - the default assumption is that very large requests don't get repeated a lot and don't benefit from caching).
Error conditions will raise exceptions if they occur, and in that case no caching will ever be done (there is no point caching a buggy response).
There is no need to sleep, or otherwise "wait" on it.
It's worth considering the following variant that works at just a slightly lower level by manipulating the stream directly rather than through a reader:
using(var stm = webResponse.GetResponseStream())
{
We're going to work on the stream directly;
byte[] buffer = new byte[4096];
int read;
do
{
read = stm.Read(buffer, 0, buffer.Length);
This will return up to 4096 bytes. It may return fewer, because it already has a chunk of bytes available and returns that many immediately. It will only return 0 bytes if it is at the end of the stream, so this gives us a balance between waiting and not waiting: it promises to wait long enough to get at least one byte, but whether it waits until it has all 4096 bytes or returns fewer is up to the stream, depending on which it judges more efficient.
DoSomething(buffer, 0, read);
We work with the bytes we got.
} while(read != 0);
Read() only gives us zero bytes if it's at the end of the stream.
}
And again, when the stream is disposed, the response may or may not be cached.
As you can see, even at the lowest level .NET gives us access to when using HttpWebResponse, there's no need to add code to wait on anything, as that is always done for us.
You can use asynchronous access to the stream to avoid waiting, but then the asynchronous mechanism still means you get the result when it's available.
To answer your question about when streaming starts, GetResponseStream() will start receiving data from the server. However, at some point the network buffers will become full and the server will stop sending data if you don't read off the buffers. For a detailed description of the TCP buffers, etc., see here.
So your sleep of 60000 will not be helping you much as the network buffers along the way will fill up and data will stop arriving until you read it off. It is better to read it off and write it in chunks as you go.
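A rough sketch of reading the response in chunks and writing it to disk as it arrives, with no sleep. It leans on Stream.CopyToAsync, which loops over reads and writes internally; the helper name, the path parameter and the 81920-byte buffer size are assumptions chosen for illustration:

```csharp
using System.IO;
using System.Threading.Tasks;

public static class DownloadSketch
{
    // Streams 'responseStream' into a file without holding the whole body in memory.
    public static async Task SaveToFileAsync(Stream responseStream, string path)
    {
        using (var file = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            // Reads a chunk, writes it, and repeats until the source reports end of stream.
            await responseStream.CopyToAsync(file, 81920);
        }
    }
}
```

In the question's code this would replace the sleep and the ReadBytes call, for example `await DownloadSketch.SaveToFileAsync(webResponse.GetResponseStream(), "big.bin");` (the file name is hypothetical).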
More info on the workings of ResponseStream here.
If you are wondering about what buffer size to use, see here.
I have written a simple TCP client and server. The problem lies with the client.
I'm having some trouble reading the entire response from the server. I must let the thread sleep to allow all the data to be sent.
I've tried a few times to convert this code into a loop that runs until the server is finished sending data.
// Init & connect to client
TcpClient client = new TcpClient();
Console.WriteLine("Connecting.....");
client.Connect("192.168.1.160", 9988);
// Stream string to server
input += "\n";
Stream stm = client.GetStream();
ASCIIEncoding asen = new ASCIIEncoding();
byte[] ba = asen.GetBytes(input);
stm.Write(ba, 0, ba.Length);
// Read response from server.
byte[] buffer = new byte[1024];
System.Threading.Thread.Sleep(1000); // Huh, why do I need to wait?
int bytesRead = stm.Read(buffer, 0, buffer.Length);
response = Encoding.ASCII.GetString(buffer, 0, bytesRead);
Console.WriteLine("Response String: "+response);
client.Close();
The nature of streams that are built on top of sockets is that you have an open pipeline that transmits and receives data until the socket is closed.
However, because of the nature of client/server interactions, this pipeline isn't always guaranteed to have content on it to be read. The client and server have to agree to send content over the pipeline.
When you take the Stream abstraction in .NET and overlay it on the concept of sockets, the requirement for an agreement between the client and server still applies; you can call Stream.Read all you want, but if the socket that your Stream is connected to on the other side isn't sending content, the call will just wait until there is content.
This is why protocols exist. At their most basic level, they help define what a complete message that is sent between two parties is. Usually, the mechanism is something along the lines of:
A length-prefixed message where the number of bytes to be read is sent before the message
A pattern of characters used to mark the end of a message (this is less common depending on the content that is being sent, the more arbitrary any part of the message can be, the less likely this will be used)
That said, you aren't adhering to the above; your call to Stream.Read is just saying "read up to 1024 bytes" when in reality the whole response may not have arrived yet. Stream.Read blocks only until at least some data is available and then returns whatever is there, which may be just the first part of the response.
The reason the call to Thread.Sleep probably works is that by the time a second has gone by, the entire response has arrived and a single Read returns all of it.
Additionally, if you truly want to read 1024 bytes, you can't assume that the call to Stream.Read will populate 1024 bytes of data. The return value for the Stream.Read method tells you how many bytes were actually read. If you need more for your message, then you need to make additional calls to Stream.Read.
Jon Skeet wrote up the exact way to do this if you want a sample.
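The "additional calls to Stream.Read" idea can be sketched as a helper that keeps reading until a known number of bytes has arrived. This is an illustrative sketch, not the linked sample; ExactReadSketch and ReadExactly are made-up names:

```csharp
using System.IO;

public static class ExactReadSketch
{
    // Blocks until exactly 'count' bytes have been read, however many
    // Read calls that takes. Throws if the stream ends first.
    public static byte[] ReadExactly(Stream stream, int count)
    {
        var buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            // Read may return fewer bytes than requested; ask only for what is still missing.
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
            {
                throw new EndOfStreamException("Stream ended after " + offset + " of " + count + " bytes.");
            }
            offset += read;
        }
        return buffer;
    }
}
```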
Try repeating the call
int bytesRead = stm.Read(buffer, 0, buffer.Length);
while bytesRead > 0. As I remember, it is a common pattern for this.
Of course, don't forget to pass the appropriate offset and count for the buffer.
You don't know the size of the data you will be reading, so you have to set up a mechanism to decide. One is a timeout and another is using delimiters.
In your example you take whatever a single Read call returns. You have not set a read timeout, and by default there is none (Read waits indefinitely), so that one Read returns as soon as the first bytes arrive; the 1000 ms sleep only gives the rest of the response time to arrive. Setting the receive timeout to 1000 ms and reading in a loop until it expires gives the same effect.
I think using the length of the data as a prefix is not the real solution, because when the socket is closed by both sides, the socket TIME_WAIT situation cannot be handled properly. The same data can be sent to the server again and cause the server to throw an exception. We used start and end character sequences instead. After every read we check the data for the start and end character sequences; if we can't find the end characters, we call another read. But of course this only works if you control both the server-side and the client-side code.
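A simplified sketch of the delimiter approach, assuming a single end byte (for example '\n') rather than the start and end sequences described above, and assuming the delimiter never appears inside a message. The names are hypothetical. Reading one byte at a time is slow on a raw NetworkStream, so wrapping it in a BufferedStream would be a sensible refinement.

```csharp
using System.IO;
using System.Text;

public static class DelimiterSketch
{
    // Reads bytes until 'delimiter' is seen and returns the message without it.
    public static string ReadUntil(Stream stream, byte delimiter)
    {
        using (var message = new MemoryStream())
        {
            int b;
            // ReadByte returns -1 once the stream has ended.
            while ((b = stream.ReadByte()) != -1)
            {
                if (b == delimiter)
                {
                    return Encoding.ASCII.GetString(message.ToArray());
                }
                message.WriteByte((byte)b);
            }
            throw new EndOfStreamException("Connection closed before the delimiter arrived.");
        }
    }
}
```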
In the TCP Client / Server I just wrote I generate the packet I want to send to a memory stream, then take the length of that stream and use it as a prefix when sending the data. That way the client knows how many bytes of data it's going to need to read for a full packet.
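A minimal sketch of that length-prefix scheme, assuming a 4-byte big-endian prefix (the answer does not say which size or byte order it used). FramingSketch and its methods are hypothetical names:

```csharp
using System;
using System.IO;
using System.Net;

public static class FramingSketch
{
    // Writes a 4-byte big-endian length followed by the payload.
    public static void WriteMessage(Stream stream, byte[] payload)
    {
        byte[] prefix = BitConverter.GetBytes(IPAddress.HostToNetworkOrder(payload.Length));
        stream.Write(prefix, 0, prefix.Length);
        stream.Write(payload, 0, payload.Length);
    }

    // Reads the prefix, then exactly that many payload bytes.
    public static byte[] ReadMessage(Stream stream)
    {
        byte[] prefix = ReadExactly(stream, 4);
        int length = IPAddress.NetworkToHostOrder(BitConverter.ToInt32(prefix, 0));
        if (length < 0)
        {
            throw new InvalidDataException("Negative message length.");
        }
        return ReadExactly(stream, length);
    }

    // Loops because a single Read may return only part of what was asked for.
    private static byte[] ReadExactly(Stream stream, int count)
    {
        var buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
            {
                throw new EndOfStreamException("Connection closed mid-message.");
            }
            offset += read;
        }
        return buffer;
    }
}
```

A real receiver would probably also cap the accepted length so that a corrupt prefix cannot trigger a huge allocation.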
Can anyone point out the flaw in this code? I'm retrieving some HTML with TcpClient. NetworkStream.Read() never seems to finish when talking to an IIS server. If I go use the Fiddler proxy instead, it works fine, but when talking directly to the target server the .read() loop won't exit until the connection exceptions out with an error like "the remote server has closed the connection".
internal TcpClient Client { get; set; }
/// bunch of other code here...
try
{
    NetworkStream ns = Client.GetStream();
    StreamWriter sw = new StreamWriter(ns);
    sw.Write(request);
    sw.Flush();
    byte[] buffer = new byte[1024];
    int read = 0;
    try
    {
        while ((read = ns.Read(buffer, 0, buffer.Length)) > 0)
        {
            response.AppendFormat("{0}", Encoding.ASCII.GetString(buffer, 0, read));
        }
    }
    catch //(SocketException se)
    {
    }
    finally
    {
        Close();
    }
Update
In the debugger, I can see the entire response coming through immediately and being appended to my StringBuilder (response). It just appears that the connection isn't being closed when the server is done sending the response, or my code isn't detecting it.
Conclusion
As has been said here, it's best to take advantage of the offerings of the protocol (in the case of HTTP, the Content-Length header) to determine when a transaction is complete. However, I've found that not all pages have content-length set. So, I'm now using a hybrid solution:
For ALL transactions, set the request's Connection header to "close", so that the server is discouraged from keeping the socket open. This improves the chances that the server will close the connection when it is through responding to your request.
If Content-Length is set, use it to determine when a request is complete.
Else, set the NetworkStream's ReadTimeout property to a large, but reasonable, value like 1 second. Then, loop on NetworkStream.Read() until either a) the timeout occurs, or b) you read fewer bytes than you asked for.
Thanks to everyone for their excellent and detailed responses.
Contrary to what the documentation for NetworkStream.Read implies, the stream obtained from a TcpClient does not simply return 0 for the number of bytes read when there is no data available - it blocks.
If you look at the documentation for TcpClient, you will see this line:
The TcpClient class provides simple methods for connecting, sending, and receiving stream data over a network in synchronous blocking mode.
Now my guess is that if your Read call is blocking, it's because the server has decided not to send any data back. This is probably because the initial request is not getting through properly.
My first suggestion would be to eliminate the StreamWriter as a possible cause (i.e. buffering/encoding nuances), and write directly to the stream using the NetworkStream.Write method. If that works, make sure that you're using the correct parameters for the StreamWriter.
My second suggestion would be not to depend on the result of a Read call to break the loop. The NetworkStream class has a DataAvailable property that is designed for this. The correct way to write a receive loop is:
NetworkStream netStream = client.GetStream();
int read = 0;
byte[] buffer = new byte[1024];
StringBuilder response = new StringBuilder();
do
{
    read = netStream.Read(buffer, 0, buffer.Length);
    response.Append(Encoding.ASCII.GetString(buffer, 0, read));
}
while (netStream.DataAvailable);
Read the response until you reach a double CRLF. What you now have is the Response headers.
Parse the headers to read the Content-Length header which will be the count of bytes left in the response.
Here is a regular expression that can catch the Content-Length header.
David's Updated Regex
Content-Length: (?<1>\d+)\r\n
Content-Length
Note
If the server does not properly set this header I would not use it.
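The two steps (read up to the double CRLF, then parse Content-Length) might look like the sketch below. It is an assumption-laden illustration: the class and method names are invented, the regex is a simplified variant of the one quoted above, and responses that use chunked transfer encoding carry no Content-Length at all, in which case ParseContentLength returns null.

```csharp
using System.IO;
using System.Text;
using System.Text.RegularExpressions;

public static class HttpHeaderSketch
{
    private static readonly Regex ContentLengthPattern = new Regex(
        @"^Content-Length:[ \t]*(\d+)",
        RegexOptions.IgnoreCase | RegexOptions.Multiline);

    // Reads byte by byte until the blank line (CRLF CRLF) that ends the headers.
    public static string ReadHeaders(Stream stream)
    {
        var sb = new StringBuilder();
        int b;
        while ((b = stream.ReadByte()) != -1)
        {
            sb.Append((char)b);
            int n = sb.Length;
            if (n >= 4 && sb[n - 4] == '\r' && sb[n - 3] == '\n' && sb[n - 2] == '\r' && sb[n - 1] == '\n')
            {
                return sb.ToString();
            }
        }
        throw new EndOfStreamException("Connection closed before the headers ended.");
    }

    // Returns the Content-Length value, or null if the header is absent or unparsable.
    public static long? ParseContentLength(string headers)
    {
        Match match = ContentLengthPattern.Match(headers);
        long value;
        if (match.Success && long.TryParse(match.Groups[1].Value, out value))
        {
            return value;
        }
        return null;
    }
}
```

After ReadHeaders returns, the stream is positioned at the first body byte, so the body can be read with a loop that stops after the parsed number of bytes.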
Not sure if this is helpful or not but with HTTP 1.1 the underlying connection to the server might not be closed so maybe the stream doesn't get closed either? The idea being that you can reuse the connection to send a new request. I think you have to use the content-length. Alternatively use the WebClient or WebRequest classes instead.
I may be wrong, but it looks like your call to Write is writing (under the hood) to the stream ns (via StreamWriter). Later, you're reading from the same stream (ns). I don't quite understand why are you doing this?
Anyway, you may need to use Seek on the stream, to move to the location where you want to start reading. I'd guess that it seeks to the end after writing. But as I said, I'm not really sure if this is a useful answer!
Two Suggestions...
Have you tried using the DataAvailable property of NetworkStream? It should return true if there is data to be read from the stream.
while (ns.DataAvailable)
{
    // Do stuff here
}
Another option would be to change the ReadTimeout to a low value so you don't end up blocking for a long time. It can be done like this:
ns.ReadTimeout = 100;