Reading from Stream using Observable through FromAsyncPattern, how to close/cancel properly


Need: long-running program with TCP connections
A C# 4.0 (VS2010, XP) program needs to connect to a host using TCP, send and receive bytes, and sometimes close the connection properly and reopen it later. Surrounding code is written using Rx.Net Observable style. The volume of data is low, but the program should run continuously (avoiding memory leaks by taking care to dispose of resources properly).
The text below is long because I explain what I searched and found. It now appears to work.
Overall questions are: since Rx is sometimes unintuitive, are the solutions good? Will they be reliable (say, could the program run for years without trouble)?
Solution so far
Send
The program obtains a NetworkStream like this:
TcpClient tcpClient = new TcpClient();
LingerOption lingerOption = new LingerOption(false, 0); // Make sure that on call to Close(), connection is closed immediately even if some data is pending.
tcpClient.LingerState = lingerOption;
tcpClient.Connect(remoteHostPort);
return tcpClient.GetStream();
Asynchronous sending is easy enough. Rx.Net makes it possible to handle this with much shorter and cleaner code than traditional solutions. I created a dedicated thread with an EventLoopScheduler. The operations needing a send are expressed using IObservable. Using ObserveOn(sendRecvThreadScheduler) guarantees that all send operations are done on that thread.
sendRecvThreadScheduler = new EventLoopScheduler(
    ts =>
    {
        var thread = new System.Threading.Thread(ts) { Name = "my send+receive thread", IsBackground = true };
        return thread;
    });
// Loop code for sending not shown (too long and off-topic).
So far this is excellent and flawless.
Receive
It seems that to receive data, Rx.Net should also allow shorter and cleaner code than traditional solutions.
After reading several resources (e.g. http://www.introtorx.com/ ) and stackoverflow, it seems that a very simple solution is to bridge the Asynchronous Programming to Rx.Net like in https://stackoverflow.com/a/14464068/1429390 :
public static class Ext
{
    public static IObservable<byte[]> ReadObservable(this Stream stream, int bufferSize)
    {
        // to hold read data
        var buffer = new byte[bufferSize];
        // Step 1: async signature => observable factory
        var asyncRead = Observable.FromAsyncPattern<byte[], int, int, int>(
            stream.BeginRead,
            stream.EndRead);
        return Observable.While(
            // while there is data to be read
            () => stream.CanRead,
            // iteratively invoke the observable factory, which will
            // "recreate" it such that it will start from the current
            // stream position - hence "0" for offset
            Observable.Defer(() => asyncRead(buffer, 0, bufferSize))
                .Select(readBytes => buffer.Take(readBytes).ToArray()));
    }
}
It mostly works. I can send and receive bytes.
Close time
This is when things start to go wrong.
Sometimes I need to close the stream and keep things clean. Basically this means: stop reading, end the byte-receiving observable, open a new connection with a new one.
For one thing, when the connection is forcibly closed by the remote host, BeginRead()/EndRead() immediately loop, consuming all CPU and returning zero bytes. I let higher-level code notice this (with a Subscribe() to the ReadObservable in a context where high-level elements are available) and clean up (including closing and disposing of the stream). This works well, too, and I take care of disposing of the object returned by Subscribe().
someobject.readOneStreamObservableSubscription = myobject.readOneStreamObservable.Subscribe(buf =>
{
    if (buf.Length == 0)
    {
        MyLoggerLog("Read explicitly returned zero bytes. Closing stream.");
        this.pscDestroyIfAny();
    }
});
Sometimes, I just need to close the stream. But apparently this necessarily causes exceptions to be thrown in the asynchronous read (see the Stack Overflow question "Proper way to prematurely abort BeginRead and BeginWrite?").
I added a CancellationToken that causes Observable.While() to end the sequence. This does not help much to avoid these exceptions, since BeginRead() can sleep for a long time.
An unhandled exception in the observable caused the program to exit. Searching turned up the Stack Overflow question "Continue using subscription after exception", which suggested adding a Catch that effectively resumes the broken Observable with an empty one.
Code looks like this:
public static IObservable<byte[]> ReadObservable(this Stream stream, int bufferSize, CancellationToken token)
{
    // to hold read data
    var buffer = new byte[bufferSize];
    // Step 1: async signature => observable factory
    var asyncRead = Observable.FromAsyncPattern<byte[], int, int, int>(
        stream.BeginRead,
        stream.EndRead);
    return Observable.While(
        // while there is data to be read
        () =>
        {
            return (!token.IsCancellationRequested) && stream.CanRead;
        },
        // iteratively invoke the observable factory, which will
        // "recreate" it such that it will start from the current
        // stream position - hence "0" for offset
        Observable.Defer(() =>
            {
                if ((!token.IsCancellationRequested) && stream.CanRead)
                {
                    return asyncRead(buffer, 0, bufferSize);
                }
                else
                {
                    return Observable.Empty<int>();
                }
            })
            .Catch(Observable.Empty<int>()) // When BeginRead() or EndRead() causes an exception, don't choke but just end the Observable.
            .Select(readBytes => buffer.Take(readBytes).ToArray()));
}
What now? Question
This appears to work well. Conditions where remote host forcibly closed the connection or is just no longer reachable are detected, causing higher level code to close the connection and retry. So far so good.
I'm unsure if things feel quite right.
For one thing, that line:
.Catch(Observable.Empty<int>()) // When BeginRead() or EndRead() causes an exception, don't choke but just end the Observable.
feels like the bad practice of an empty catch block in imperative code. The actual code does log the exception, and higher-level code detects the absence of a reply and handles it correctly, so should it be considered fairly okay (see below)?
.Catch((Func<Exception, IObservable<int>>)(ex =>
{
    MyLoggerLogException("On asynchronous read from network.", ex);
    return Observable.Empty<int>();
})) // When BeginRead() or EndRead() causes an exception, don't choke but just end the Observable.
Also, this is indeed shorter than most traditional solutions.
Are the solutions correct or did I miss some simpler/cleaner ways?
Are there some dreadful problems that would look obvious to wizards of Reactive Extensions?
Thank you for your attention.

Related

Await doesn't give a result

Ok, I have some code to present. Here is extension method for NetworkStream object.
public async static Task<byte[]> ReadDataAsync(this NetworkStream clientStream)
{
    byte[] data = {};
    var buffer = new byte[1024];
    if (clientStream.CanRead)
    {
        using (var ms = new MemoryStream())
        {
            try
            {
                int bytesRead;
                while (clientStream.DataAvailable &&
                       (bytesRead = await clientStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
                {
                    await ms.WriteAsync(buffer, 0, bytesRead);
                }
            }
            catch (Exception ex)
            {
                Console.WriteLine(ex.ToString());
                return data;
            }
            data = ms.ToArray();
        }
    }
    else
    {
        Console.WriteLine("Closing clientStream.");
        clientStream.Close();
    }
    return data;
}
And the code where I am trying to call this method.
public async static Task Preform(Socket client)
{
    var stream = new NetworkStream(client);
    var data = await stream.ReadDataAsync();
    var message = await MessageFabrique.DeserializeMessage(data);
    ServerCollections.Instance.ServerIssueQueue.Add(new ServerIssue
    {
        Message = message,
        ClientStream = stream
    });
}
The ReadDataAsync method always returns an empty array, so when I try to deserialize the data an exception is thrown, because accessing data[0] fails. Please help me. Why is this happening, if await is supposed to guarantee that the result is there when it is needed?

clientStream.DataAvailable does not mean data might show up in the future. It means data is available right now for reading. Get rid of it and just read: the read will wait until data shows up, or will return 0 when the stream hits its end.

Scott's answer is right, but .Net already takes care of you...
You might consider Stream.CopyToAsync
await clientStream.CopyToAsync(ms)
for code with considerably fewer places to go wrong.

In addition to the other answers, you might also want to create a synchronization context. See this article for details.
The summary is that async/await works differently in console applications than it does in a UI application. WPF and WebForms applications have a synchronization context by default but console applications don't. The result (which is actually remarkably "poorly advertised" in the documentation) is that the behavior of async/await is much less predictable in a console application than it is in a UI application, and that this might make it not work "as advertised" under certain circumstances.
For example, in a UI application "async" doesn't necessarily mean that the code runs on a background thread. It's the equivalent of "come back to me later when I'm ready." As an analogy, consider going out to eat with 10 people: when the waiter comes by, the first person he asks to order isn't ready. Two bad solutions here would be to a) bring in a second waiter (either to wait for the first guy to become ready or to take the other 9 people's orders) or b) wait until the first guy is ready before taking any orders. The optimal thing is to take the other 9 people's orders and then come back to the first guy, hoping he'll be ready by that time. At the risk of oversimplifying, this is basically how async works in a UI (unless you're explicitly putting the code on a background thread with something like Task.Run). However, in a console application, when you use async there's no guarantee as to where the code will actually run.
If, however, you add a synchronization context as described in the article I link to, it'll behave in a much more predictable manner.

Port checking tool causes infinite packet received loop on TcpClient server

I've written a relatively basic asynchronous server which boots up a task to accept clients and each client then boots up a task to accept incoming packets, the code for which is as follows:
MessageListeningTask = new Task(async () => {
    while (true) {
        byte[] buffer = new byte[256];
        try {
            await tcpClient.GetStream().ReadAsync(buffer, 0, buffer.Length);
        } catch {
            break;
        }
        string data = Encoding.UTF8.GetString(buffer).Trim('\0', '\n', '\r', '\t', ' ');
        OnMessageReceived(data);
    }
});
This seems to work pretty well for most things, and after routing it through a class that splits the tokens based on a token at the start, is quite an effective listener.
Except, given my naivety to the topic, I seem to have done something stupidly somewhere in my implementation, and checking with this tool: http://www.yougetsignal.com/tools/open-ports/ seems to break this loop and cause it to trigger OnMessageReceived constantly with no data.
I'm not entirely sure what procedures to take to help diagnose this issue, and figure it's probably something to do with how the information stream operates, so I was hoping someone with experience in the topic could help me solve my issue but also explain what is causing it. If it's relevant, it's running under Mono on Ubuntu, but it usually runs perfectly so I can't see this being the issue.
I am happy to provide any additional information, or to check anything.
Thanks!

From the documentation of ReadAsync:
https://msdn.microsoft.com/en-us/library/hh137813(v=vs.110).aspx
"Return Value
Type: System.Threading.Tasks.Task<Int32>
A task that represents the asynchronous read operation. The value of the TResult parameter contains the total number of bytes read into the buffer. The result value can be less than the number of bytes requested if the number of bytes currently available is less than the requested number, or it can be 0 (zero) if the end of the stream has been reached."
In the loop ReadAsync must be constantly returning 0 because it reached the end of stream.
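A minimal sketch of the loop with the return value checked, assuming the same 256-byte buffer and UTF-8 text as in the question. The names MessageLoop, ReadMessagesAsync and onMessage are hypothetical, and the method takes any Stream so that it can be tried without a socket. It is an illustration of the point above, not a drop-in replacement for the original server code.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

public static class MessageLoop
{
    // Reads chunks until ReadAsync returns 0 (end of stream) or the read fails,
    // passing each chunk to onMessage as trimmed UTF-8 text.
    public static async Task ReadMessagesAsync(Stream stream, Action<string> onMessage)
    {
        var buffer = new byte[256];
        while (true)
        {
            int bytesRead;
            try
            {
                bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length);
            }
            catch (IOException)
            {
                break; // connection reset or otherwise broken
            }
            catch (ObjectDisposedException)
            {
                break; // stream was closed locally while the read was pending
            }

            if (bytesRead == 0)
            {
                break; // end of stream: the remote side closed the connection
            }

            // Decode only the bytes actually received, not the whole buffer.
            string data = Encoding.UTF8.GetString(buffer, 0, bytesRead)
                .Trim('\0', '\n', '\r', '\t', ' ');
            onMessage(data);
        }
    }
}
```

A port checker presumably connects and disconnects straight away, so the first read returns 0 and this loop exits instead of spinning. Note that a multi-byte UTF-8 character split across two reads would still be decoded incorrectly; the sketch does not address that.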

Is writing zero bytes to a network stream a reliable way to detect closed connections?

I'm working on an application where a client connects with a TCP connection which then triggers an amount of work that may potentially take a lot of time to complete. This work must be cancelled if the user drops the TCP connection.
Currently, what I'm doing is starting up a timer that periodically checks the networks streams connectivity by doing this:
// stream is a Stream instance
var abort = false;
using (new Timer(x => {
    try
    {
        stream.Write(new byte[0], 0, 0);
    }
    catch (Exception)
    {
        abort = true;
    }
}, null, 1000, 1000))
{
    // Do expensive work here and check abort periodically
}
I would have liked to read the CanWrite, CanRead or Connected but they report the last status of the stream. Is writing zero bytes a reliable way of testing connectivity, or can this itself cause issues? I cannot write or read any real data on the stream since that would mess up the client.

Let's just say that I have known it to work, decades ago, but there is no intrinsic reason why it should. Any of the API layers between you and the TCP stack is entitled to suppress the call to the next layer down, and even if it gets all the way into the stack it will only return an error if:
It checks for network errors before checking for zero length, which is implementation-dependent, and
There already was a network error, caused by some previous operation, or an incoming RST.
If you're expecting it to magically probe the network all the way to the other end, it definitely won't.

SocketAsyncEventArgs buffer is full of zeroes

I'm writing a message layer for my distributed system. I'm using IOCP, ie the Socket.XXXAsync methods.
Here's something pretty close to what I'm doing (in fact, my receive function is based on his):
http://vadmyst.blogspot.com/2008/05/sample-code-for-tcp-server-using.html
What I've found now is that at the start of the program (two test servers talking to each other) I each time get a number of SAEA objects where the .Buffer is entirely filled with zeroes, yet the .BytesTransferred is the size of the buffer (1024 in my case).
What does this mean? Is there a special condition I need to check for? My system interprets this as an incomplete message and moves on, but I'm wondering if I'm actually missing some data. I was under the impression that if nothing was being received, you'd not get a callback. In any case, I can see in WireShark that there aren't any zero-length packets coming in.
I've found the following when I Googled it, but I'm not sure my problem is the same:
http://social.msdn.microsoft.com/Forums/en-US/ncl/thread/40fe397c-b1da-428e-a355-ee5a6b0b4d2c
http://go4answers.webhost4life.com/Example/socketasynceventargs-buffer-not-ready-121918.aspx

I am not sure what is going on in the linked example. It appears to be using asynchronous sockets in a synchronous way. I cannot see any callbacks or similar in the code. You may need to rethink whether you need synchronous or asynchronous sockets :).
The problem at hand probably stems from your functions trying to read from or write to the buffer before the network transmit/receive has completed. Try using the callback functionality included in the async Socket. E.g.
// This goes into your accept function, to begin receiving the data
socketName.BeginReceive(yourbuffer, 0, yourbuffer.Length,
    SocketFlags.None, new AsyncCallback(OnRecieveData), socketName);
// In your callback function you know that the socket has finished receiving data
// This callback will fire when the receive is complete.
private void OnRecieveData(IAsyncResult input) {
    Socket inSocket = (Socket)input.AsyncState; // This is just a typecast
    // EndReceive returns the number of bytes actually received (0 means the remote side closed).
    int bytesReceived = inSocket.EndReceive(input);
    // Pull the first bytesReceived bytes out of the buffer as you already have before.
    // state.Data.Write ......
}

Sending multiple data in TCPSocket

I'm trying to create a chat with file transfer application using TCPSocket and here is my code..
SENDER:
public void sendData(string message)
{
    StreamWriter streamWriter = new StreamWriter(netStream); // netStream is
                                                             // connected
    streamWriter.WriteLine(message);
    streamWriter.WriteLine(message);
    logs.Add(string.Format("Message Sent! :{0}", message));
    //netStream.Flush();
    streamWriter.Flush();
}
RECEIVER:
private void ReceiveData()
{
    StreamReader streamReader = new StreamReader(ChatNetStream);
    StringBuilder dataAppends = new StringBuilder();
    bool doneTransfer = false;
    string data;
    while (!doneTransfer)
    {
        while ((data = streamReader.ReadLine()) != null)
        {
            dataAppends.Append(data);
        }
        doneTransfer = true;
        //ChatNetStream.Close();
        //streamReader
    }
    //do whatever i want with dataAppends.ToString() here..
    ReceiveData();
}
The problem is that I always end up in an infinite loop inside this statement:
while ((data = streamReader.ReadLine()) != null)
{
    dataAppends.Append(data);
}
even if I call streamWriter.Flush() on my sender.
Do I need to close/dispose the netStream/NetworkStream?
Also, can I use only one socket or connection to send a file and send a chat message at the same time? Or do I need to open a new socket connection every time I send a file?

You get an infinite loop because StreamReader.ReadLine will only return null when the end of the stream is reached. For a network stream, "end of stream" means "the other side has closed its half of the connection". Since the other side is your client, and it keeps the connection open while waiting for the user to type in more data, you will end up with an infinite loop.
What you want to do instead is fire off an operation that only completes if there is more data to read. There are two ways to go about this: either use a blocking read operation (on a dedicated thread, so that you don't block your application's other processing while waiting for messages), or use an async (event- or callback-based) approach.
For the synchronous (blocking) approach, see the documentation on NetworkStream.Read which includes example code that shows how to check if there is incoming data and how you can read it. The one point you absolutely need to know here is that when Read returns zero, it means that all data has been read and the connection has been closed from the other side (so you should close your end as well and not loop; the client has disconnected).
For low-level async network reads, the relevant operation is NetworkStream.BeginRead, which comes with its own example.
Both approaches are lower-level than what you currently have and will require you to manually assemble data inside a buffer and decide when "enough data" (i.e. a full line) has accumulated for you to process. You will then have to carefully pull that data out of the buffer and continue.
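As a rough illustration of "assembling data inside a buffer", here is a minimal sketch that collects raw chunks and hands back complete lines. LineAssembler and Append are hypothetical names, and UTF-8 text with '\n' or "\r\n" line endings is an assumption, not something the question specifies.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public sealed class LineAssembler
{
    // Bytes received so far that do not yet form a complete line.
    private readonly List<byte> pending = new List<byte>();

    // Feeds the first `count` bytes of `chunk` into the assembler and returns
    // every line completed by this chunk (possibly none).
    public IList<string> Append(byte[] chunk, int count)
    {
        var lines = new List<string>();
        for (int i = 0; i < count; i++)
        {
            if (chunk[i] == (byte)'\n')
            {
                // A full line has accumulated: decode it and start over.
                string line = Encoding.UTF8.GetString(pending.ToArray()).TrimEnd('\r');
                lines.Add(line);
                pending.Clear();
            }
            else
            {
                pending.Add(chunk[i]);
            }
        }
        return lines;
    }
}
```

Each call to Read or EndRead would pass its buffer and byte count to Append; a chunk that ends in the middle of a line simply returns no line until the rest arrives.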
For a higher-level approach that still allows you some degree of orchestrating things, look into using client sockets (and in particular the two sync and async options there). This functionality is introduced by the TcpClient (and server-side the corresponding TcpListener) classes.
Finally, as jValdron's comment says, you will either need a separate connection for transferring file data or have to engineer some custom protocol that allows you to interleave multiple kinds of data over the same network stream. The second solution generally has more technical merit, but it will also be harder for you to implement correctly.
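One possible shape for such a custom protocol, sketched under assumptions: every message is sent as a frame made of a 1-byte type, a 4-byte big-endian length, and the payload. FrameType, Framing, WriteFrame and TryReadFrame are hypothetical names invented for this sketch; nothing here comes from the question or from any library.

```csharp
using System;
using System.IO;

public enum FrameType : byte { Chat = 1, FileChunk = 2 }

public static class Framing
{
    // Writes one frame: [type:1][length:4, big-endian][payload].
    public static void WriteFrame(Stream stream, FrameType type, byte[] payload)
    {
        var header = new byte[5];
        header[0] = (byte)type;
        header[1] = (byte)(payload.Length >> 24);
        header[2] = (byte)(payload.Length >> 16);
        header[3] = (byte)(payload.Length >> 8);
        header[4] = (byte)payload.Length;
        stream.Write(header, 0, header.Length);
        stream.Write(payload, 0, payload.Length);
    }

    // Reads one frame. Returns false if the stream ended cleanly before a new
    // frame started; throws if it ended in the middle of a frame.
    public static bool TryReadFrame(Stream stream, out FrameType type, out byte[] payload)
    {
        type = 0;
        payload = new byte[0];
        var header = new byte[5];
        int got = ReadFully(stream, header, header.Length);
        if (got == 0) return false;
        if (got < header.Length) throw new EndOfStreamException("Truncated frame header.");

        int length = (header[1] << 24) | (header[2] << 16) | (header[3] << 8) | header[4];
        if (length < 0) throw new InvalidDataException("Negative frame length.");

        var body = new byte[length];
        if (ReadFully(stream, body, length) < length)
            throw new EndOfStreamException("Truncated frame payload.");

        type = (FrameType)header[0];
        payload = body;
        return true;
    }

    // Read may return fewer bytes than requested, so loop until `count` bytes
    // have arrived or the stream ends (Read returns 0).
    private static int ReadFully(Stream stream, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = stream.Read(buffer, total, count - total);
            if (n == 0) break;
            total += n;
        }
        return total;
    }
}
```

With framing like this, a large file can be split into several FileChunk frames, and Chat frames can be sent in between them over the same connection. A real protocol would likely also cap the accepted length to protect against bad input.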

Check out the BasicSend example in networkComms.net, which demonstrates a simple chat application using an open source library.
