ZeroMQ Non-Blocking Non-Queueing Push

I am using the C# wrapper for ZeroMQ but this seems more like an underlying issue with ZeroMQ.
Is there any way to push a message without blocking and without queueing? If the server is not up I would like the messages to be permanently disposed without blocking.
Here are the settings I've tried so far:
1)
Send (Blocking send)
High water mark = 0
This (strangely) does not block, but it seems to queue in memory until the socket is connected (memory keeps rising for the process).
2)
Send (Non-Blocking send)
High water mark = 1
This is a race condition. If I send two messages in rapid succession one message is sometimes thrown out for exceeding the high water mark.
3)
Poll the socket to figure out if it's going to block. This doesn't really help because I still have to put one (old) message in the queue before it starts blocking (if I set HWM = 1).
Non-blocking send with any high water mark is undesirable because as soon as the server comes back online it gets a bunch of old messages from clients.
Blocking send doesn't work because I don't want to block.

What you seem to be looking for is simply a PUB socket. This socket type never blocks on send, and discards any message it cannot send to a subscriber. See this page: http://api.zeromq.org/3-2:zmq-socket
On a side note, you do not need to use this socket for "real" pub/sub. You can use it for non-blocking communication between two nodes by having only one PUB and one SUB socket per endpoint.
Your server will not get "old" messages after a reconnect because the PUB socket will have dropped the messages it could not send while the server was disconnected. Nevertheless, I believe that while you cannot avoid some internal ZMQ "queuing", it should have little bearing on your use case.

Related

TCP socket fails to (reliably) deliver what is sent after a restart of application

I basically create a socket, perform a couple of Sends, Shutdown the socket and Dispose it. This is repeated for the duration of the application session as data is being queued to be sent.
This works without issue until I restart the receiving application. Then I find at the receiving end that the last byte array I send in this sequence is not always received. The receiver knows it should get an End marker at some point, telling it this is it, so it can close the receive socket. I found after an application restart the End marker does not get delivered reliably. The receiver then continues to wait for it, it keeps calling BeginReceive and getting zero bytes. This results in a high CPU load and the receive socket not being closed.
Restarting the receiving application does not fix it; a reboot of the machine, however, does. Also, restarting all applications and using different port numbers fixes it.
I can mitigate this by trying to receive three times with a short wait and giving up if nothing more comes in, assuming it is not going to happen anymore and this is indeed the end even if no explicit End marker was received. But I want to understand what is happening.
The experience suggests it has something to do with Windows recycling sockets under the hood.
Is there something I can do to get a "clean" socket other than using a different port number? Can anyone explain what is happening?
This is the crux of the sending code of which the End constant (which is just 5 fixed bytes) does not always make it to the receiver:
using (socket = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp))
{
    socket.SendBufferSize = 0x10000; // increase send buffer size from default 8K to 64K
    socket.Connect(this.Host, this.Port);
    bool oneOrMoreSent = false;
    while (this.queue.TryDequeue(out IMessage? msg))
    {
        if (msg != null)
        {
            byte[] packet = (msg as Message).Packet ??= CreatePacket(msg);
            _ = socket.Send(packet);
            oneOrMoreSent = true;
        }
    }
    if (oneOrMoreSent)
    {
        _ = socket.Send(End);
    }
    socket.Shutdown(SocketShutdown.Send);
}
[Edit]
The article about half closed connections pointed to by JonasH was helpful and confirmed the statement of the other commenter with the ridiculous name. The application I implemented the sending code in is not the most well behaved regarding shutdown logic and we do use kill commands to make it go away when needed so this would explain the emergence of half closed connections where the receiver waits for more data that should come but never does. It is still weird that restarting the receiving application apparently does not release/cleanup the receiving socket.
Understanding this a little better, I settled for a straightforward fix. If the receiver expects more data but repeatedly doesn't get it, waiting 1 ms in between BeginReceive calls, it will now accept that the sending end is dead and return from the handler. So the receiving logic will never stubbornly wait for data, thus locking up the socket forever. This safeguard should have been there in the first place and it fixes my problem.
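For what it's worth, on a TCP socket a receive that completes with zero bytes means the peer has closed or shut down its sending side, so the receiver can stop at the first zero-byte read instead of retrying. Below is a minimal sketch of such a loop, not the fix described above. It reads from a plain Stream so it can be tried without a network, and the names (EndMarkerReader, TryReadUntilEnd) and the marker bytes are made up for illustration, since the real End constant is not shown.

```csharp
using System;
using System.IO;

public static class EndMarkerReader
{
    // Hypothetical 5-byte end marker; the real End constant is not shown in the question.
    public static readonly byte[] End = { 0x45, 0x4E, 0x44, 0x0D, 0x0A };

    // Reads until the data received so far ends with the End marker, or until
    // the peer closes the connection. Returns true if the marker was seen.
    public static bool TryReadUntilEnd(Stream stream, out byte[] payload)
    {
        var received = new MemoryStream();
        var buffer = new byte[4096];
        bool sawEnd = false;
        while (!sawEnd)
        {
            int n = stream.Read(buffer, 0, buffer.Length);
            if (n == 0)
            {
                // Zero bytes: the sender closed or shut down its side, so no
                // more data will ever arrive. Stop instead of polling again.
                break;
            }
            received.Write(buffer, 0, n);
            sawEnd = EndsWith(received, End);
        }
        payload = received.ToArray();
        return sawEnd;
    }

    private static bool EndsWith(MemoryStream data, byte[] marker)
    {
        if (data.Length < marker.Length) return false;
        byte[] all = data.GetBuffer(); // allowed: data was created with the default constructor
        int start = (int)data.Length - marker.Length;
        for (int i = 0; i < marker.Length; i++)
        {
            if (all[start + i] != marker[i]) return false;
        }
        return true;
    }
}
```

With a real connection you would pass a NetworkStream; the same rule applies to EndReceive returning 0 in the BeginReceive pattern.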

How can I avoid the merging of message from the same socket

similar question:
C# Socket BeginReceive / EndReceive capturing multiple messages
I am currently managing the communication between a website and a WinForms application, which is done through a websocket created this way:
Socket socket = new Socket(AddressFamily.InterNetwork,
                           SocketType.Stream,
                           ProtocolType.Tcp);
If the emitter sends two messages A([TagBeginMessage:lengthMessageA] aaaaaaaaaaaaaaaa [EndMessage]) and B([TagBeginMessage: lengthMessageB] bbbbbbbbbbbbbbbbbbb [EndMessage]), I expect that the receiver will get
[TagBeginMessage:lengthMessageA] aaaaaaaaaaaaaaaa [EndMessage][TagBeginMessage: lengthMessageB] bbbbbbbbbbbbbbbbbbb [EndMessage]
or
[TagBeginMessage: lengthMessageB] bbbbbbbbbbbbbbbbbbb [EndMessage][TagBeginMessage:lengthMessageA] aaaaaaaaaaaaaaaa [EndMessage]
This is indeed the case for the vast majority of messages. However, the necessarily asynchronous nature of the reception sometimes causes a bug when message A is quite long and message B quite short, in which the receiver gets this:
[TagBeginMessage:lengthMessageA] aaaaaaaaaa[TagBeginMessageB: lengthMessageB] bbbbbbbbbbbbbbbbbbb [EndMessage]aaaaaa [EndMessageA]
This can still be parsed, although it requires a unique ending for each message. However, while I haven't seen it happen, I am afraid this means that the following case is also possible (because sockets send their data in packets):
[TagBeginMessage:lengthMessageA] aaaaa[TagBeginMessageB: lengthMessageB] bbbbbbaaaabbbbbbbbbbbbb [EndMessage]aaaaaa [EndMessageA]
This is unparseable. Adding a length (as suggested in http://blog.stephencleary.com/2009/04/sample-code-length-prefix-message.html) to the beginning of the message indicates the problem but doesn't solve it. What can I do to avoid this?
My current solutions are:
Send small messages. Not elegant but should work.
Send a very small message to signal that the buffer of the socket is empty.

It sounds like you are sending data on the same connection from multiple threads simultaneously. That's fine, but if you do that, make sure you lock the connection to the current thread for the duration of sending out a logical package (length + data) - that way you will receive a complete packet (sent from one thread) before receiving anything that a different thread sent.
Under TCP/IP, bytes are guaranteed to arrive in the same order that they were sent, but if you send, say, half a logical package (length + data) from one thread and then half a package from another, there is no way for the protocol layer or the receiving end to know that.
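A minimal sketch of that locking, assuming a 4-byte big-endian length prefix (the question does not fix a header format) and a hypothetical FramedSender wrapper. It writes to any Stream, so with a real connection you would hand it a NetworkStream.

```csharp
using System;
using System.IO;

public sealed class FramedSender
{
    private readonly Stream _stream;
    private readonly object _sendLock = new object();

    public FramedSender(Stream stream)
    {
        _stream = stream;
    }

    // Writes one logical package (4-byte big-endian length, then the payload).
    // The lock keeps header and payload together, so packages sent from
    // different threads through this instance cannot interleave.
    public void Send(byte[] payload)
    {
        byte[] header =
        {
            (byte)(payload.Length >> 24),
            (byte)(payload.Length >> 16),
            (byte)(payload.Length >> 8),
            (byte)payload.Length
        };
        lock (_sendLock)
        {
            _stream.Write(header, 0, header.Length);
            _stream.Write(payload, 0, payload.Length);
        }
    }
}
```

The lock only helps if every writer goes through the same FramedSender instance; anything that writes to the underlying stream directly bypasses it.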

You have already linked to Stephen Cleary's blog, so you know that TCP requires you to do some form of message framing. Here is another one of Cleary's posts that describes your problem.
In short, you must frame your TCP messages. You should also never see your final, unparseable example. TCP will deliver all of your data in the correct order.
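To make the receiving side of framing concrete, here is a minimal sketch assuming a 4-byte big-endian length prefix, which is one common choice and not necessarily the format used in the linked posts. FramedReader, TryReadMessage and the 16 MB limit are illustrative names and values.

```csharp
using System;
using System.IO;

public static class FramedReader
{
    // Arbitrary illustrative cap, so a corrupt header cannot request a huge buffer.
    private const int MaxMessageSize = 16 * 1024 * 1024;

    // Reads exactly count bytes. A single Read may return fewer bytes than
    // requested, so keep reading until the buffer is full or the stream ends.
    private static bool TryReadExactly(Stream stream, byte[] buffer, int count)
    {
        int offset = 0;
        while (offset < count)
        {
            int n = stream.Read(buffer, offset, count - offset);
            if (n == 0) return false; // peer closed before the data was complete
            offset += n;
        }
        return true;
    }

    // Reads one message: a 4-byte big-endian length followed by that many bytes.
    // Returns false if the stream ends before a whole message has arrived.
    public static bool TryReadMessage(Stream stream, out byte[] payload)
    {
        payload = Array.Empty<byte>();
        var header = new byte[4];
        if (!TryReadExactly(stream, header, header.Length)) return false;

        int length = (header[0] << 24) | (header[1] << 16) | (header[2] << 8) | header[3];
        if (length < 0 || length > MaxMessageSize)
        {
            throw new InvalidDataException("Implausible message length: " + length);
        }

        var body = new byte[length];
        if (!TryReadExactly(stream, body, length)) return false;
        payload = body;
        return true;
    }
}
```

Calling TryReadMessage in a loop yields whole messages regardless of how TCP happened to split or merge the bytes on the wire.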

I'm not an expert with sockets, but I have some experience. What you could do is split the data that's being sent. Say you have a very long string like "AAAAAAAABBBBBBCCCCDDDDDEEE". Instead of sending the entire string through the socket at once, you could send all the A characters, then the Bs, then the Cs, then the Ds, and so on. Right before the user receives the message you could merge all the characters together. That's just an idea.

How to safely stream data through a server socket to another socket?

I'm writing a server application for an iPhone application I'm designing. The iPhone app is written in C# (MonoTouch) and the server is written in C# too (.NET 4.0).
I'm using asynchronous sockets for the network layer. The server allows two or more iPhones ("devices") to connect to each other and be able to send data bi-directionally.
Depending on the incoming message, the server either processes the message itself , or relays the data through to the other device(s) in the same group as the sending device. It can make this decision by decoding the header of the packet first, and deciding what type of packet it is.
This is done by framing the stream in a way that the first 8 bytes are two integers, the length of the header and the length of the payload (which can be much larger than the header).
The server reads (asynchronously) from the socket the first 8 bytes so it has the lengths of the two sections. It then reads again, up to the total length of the header section.
It then deserializes the header, and based on the information within, can see if the remaining data (payload) should be forwarded onto another device, or is something that the server itself needs to work with.
If it needs to be forwarded onto another device, then the next step is to read data coming into the socket in chunks of say, 1024 bytes, and write these directly using an async send via another socket that is connected to the recipient device.
This reduces the memory requirements of the server, as I'm not loading the entire packet into a buffer and then re-sending it down the wire to the recipient.
However, because of the nature of async sockets, I am not guaranteed to receive the entire payload in one read, so I have to keep reading until I receive all the bytes. In the case of relaying onto its final destination, this means that I'm calling BeginSend() for each chunk of bytes I receive from the sender, and forwarding that chunk onto the recipient, one chunk at a time.
The issue with this is that because I am using async sockets, this leaves the possibility of another thread doing a similar operation with the same recipient (and therefore same final destination socket), and so it is likely that the chunks coming from both threads will get mixed up and corrupt all the data going to that recipient.
For example: If the first thread sends a chunk, and is waiting for the next chunk from the sender (so it can relay it onwards), the second thread could send one of its chunks of data, and corrupt the first thread's (and the second thread's for that matter) data.
As I write this, I'm just wondering: is it as simple as just locking the socket object?! Would this be the correct option, or could this cause other issues (e.g. issues with receiving data through the locked socket that's being sent BACK from the remote device)?
Thanks in advance!

I was facing a similar scenario a while back, I don't have the complete solution anymore, but here's pretty much what I did :
I didn't use sync sockets, decided to explore the async sockets in C# - fun ride
I don't allow multiple threads to share a single resource unless I really have to
My "packets" were containing information about size, index and total packet count for a message
My packet's 1st byte was unique to signify that it's a start of a message, I used 0xAA
My packet's last 2 bytes were the result of a CRC-CCITT checksum (ushort)
The objects that did the receiving bit contained a buffer with all received bytes. From that buffer I was extracting "complete" messages once the size was ok, and the checksum matched
The only "locking" I needed to do was on the temp buffer so I could safely analyze its contents between write/read operations
Hope that helps a bit

Not sure where the problem is. Since you mentioned servers, I assume TCP, yes?
A phone needs to communicate some of your PDU to another phone. It connects as a client to the server on the other phone. A socket-pair is established. It sends the data off to the server socket. The socket-pair is unique - no other streams that might be happening between the two phones should interrupt this, (will slow it up, of course).
I don't see how async/sync sockets, assuming implemented correctly, should affect this, either should work OK.
Is there something I cannot see here?
BTW, Maciek's plan to bolster up the protocol by adding an 'AA' start byte is an excellent idea - protocols depending on sending just a length as the first element always seem to screw up eventually and result in a node trying to dequeue more bytes than there are atoms in the universe.
Rgds,
Martin
OK, now I understand the problem, (I completely misunderstood the topology of the OP network - I thought each phone was running a TCP server as well as client/s, but there is just one server on PC/whatever a-la-chatrooms). I don't see why you could not lock the socket class with a mutex, so serializing the messages. You could queue the messages to the socket, but this has the memory implications that you are trying to avoid.
You could dedicate a connection to supplying only instructions to the phone, eg 'open another socket connection to me and return this GUID - a message will then be streamed on the socket'. This uses up a socket-pair just for control and halves the capacity of your server :(
Are you stuck with the protocol you have described, or can you break your messages up into chunks with some ID in each chunk? You could then multiplex the messages onto one socket pair.
Another alternative, that again would require chunking the messages, is to introduce a 'control message' (maybe a chunk with 55 at the start instead of AA) that contains a message ID (GUID?), which the phone uses to establish a second socket connection to the server. The phone passes up the ID and is then sent the second message on the new socket connection.
Another, (getting bored yet?), way of persuading the phone to recognise that a new message might be waiting would be to close the server socket that the phone is receiving a message over. The phone could then connect up again, tell the server that it only got xxxx bytes of message ID yyyy. The server could then reply with an instruction to open another socket for new message zzzz and then resume sending message yyyy. This might require some buffering on the server to ensure no data gets lost during the 'break'. You might want to implement this kind of 'restart streaming after break' functionality anyway since phones tend to go under bridges/tunnels just as the last KB of a 360MB video file is being streamed :( I know that TCP should take care of dropped packets, but if the phone wireless layer decides to close the socket for whatever reason...
None of these solutions is particularly satisfying. Interested to see what other ideas crop up.
Rgds,
Martin

Thanks for the help everyone. I've realised the simplest approach is to use synchronous send commands on the client, or at least a send command that must complete before the next item is sent. I'm handling this with my own send queue on the client, rather than various parts of the app just calling send() when they need to send something.
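For anyone landing here later, a send queue along those lines can be sketched roughly as below. This is not the code from the app: SendQueue is a made-up name, error handling is left out, and it assumes each queued byte array is already a complete framed message.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

public sealed class SendQueue : IDisposable
{
    private readonly BlockingCollection<byte[]> _pending = new BlockingCollection<byte[]>();
    private readonly Thread _writer;

    public SendQueue(Stream stream)
    {
        _writer = new Thread(() =>
        {
            // Only this thread ever writes to the stream, so each message is
            // written completely before the next one starts.
            // Sketch only: an exception from Write would end this thread unhandled.
            foreach (byte[] message in _pending.GetConsumingEnumerable())
            {
                stream.Write(message, 0, message.Length);
            }
        });
        _writer.IsBackground = true;
        _writer.Start();
    }

    // Safe to call from any part of the app; returns once the message is queued.
    public void Enqueue(byte[] message) => _pending.Add(message);

    // Stops accepting messages and waits until everything queued has been written.
    public void Dispose()
    {
        _pending.CompleteAdding();
        _writer.Join();
        _pending.Dispose();
    }
}
```

The queue is unbounded here, so it trades memory for simplicity; passing a capacity to the BlockingCollection constructor would make Enqueue block when the writer falls behind.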

Suggestions for developing a TCP/IP based message client

I've got a server side protocol that controls a telephony system, I've already implemented a client library that communicates with it which is in production now, however there are some problems with the system I have at the moment, so I am considering re-writing it.
My client library is currently written in Java but I am thinking of re-writing it in both C# and Java to allow for different clients to have access to the same back end.
The messages start with a keyword, followed by a number of bytes of metadata and then some data. The messages are always terminated by an end-of-message character.
Communication between the client and the server is duplex. It usually takes the form of a request from the client that provokes several responses from the server, but the server can also send notifications.
The messages are marked as being one of:
C: Command
P: Pending (server is still handling the request)
D: Data data as a response to
R: Response
B: Busy (Server is too busy to handle response at the moment)
N: Notification
My current architecture has each message being parsed and a thread spawned to handle it, however I'm finding that some of the Notifications are processed out of order which is causing me some trouble as they have to be handled in the same order they arrive.
The duplex messages tend to take the following message format:
Client -> Server: Command
Server -> Client: Pending (Optional)
Server -> Client: Data (optional)
Server -> Client: Response (2nd entry in message data denotes whether this is an error or not)
I've been using the protocol for over a year and I've never seen a Busy message, but that doesn't mean they don't happen.
The server can also send notifications to the client, and there are a few Response messages that are auto triggered by events on the server so they are sent without a corresponding Command being issued.
Some Notification Messages will arrive as part of sequence of messages, which are related for example:
NotificationName M00001
NotificationName M00001
NotificationName M00000
The string M0000X means that either there is more data to come or that this is the end of the messages.
At present the TCP client is fairly dumb: it just spawns a thread that raises an event on a subscriber to signal that the message has been received. The event is specific to the message keyword and the type of message (so Data, Responses and Notifications are handled separately). This works fairly effectively for Data and Response messages, but falls over with the Notification messages: they seem to arrive in rapid sequence, and a race condition sometimes causes the message end to be processed before the ones that carry the data, leading to lost message data.
Given this really badly written description of how the system works how would you go about writing the client side transport code?
The metadata does not have a message number, and I have no control over the underlying protocol as it's provided by a vendor.

The requirement that messages must be processed in the order in which they're received almost forces a producer/consumer design, where the listener gets requests from the client, parses them, and then places the parsed request into a queue. A separate thread (the consumer) takes each message from the queue in order, processes it, and sends a response to the client.
Alternately, the consumer could put the result into a queue so that another thread (perhaps the listener thread?) can send the result to the client. In that case you'd have two producer/consumer relationships:
Listener -> event queue -> processing thread -> output queue -> output thread
In .NET, this kind of thing is pretty easy to implement using BlockingCollection to handle the queues. I don't know if there is something similar in Java.
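As a rough illustration of the single-consumer case, here is a minimal sketch. OrderedPipeline and its parameters are invented for the example, and the incoming sequence stands in for whatever the socket listener parses off the wire.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class OrderedPipeline
{
    // Feeds messages through a bounded queue to one consumer task, which
    // handles them strictly in the order they were added.
    public static List<string> Run(IEnumerable<string> incoming, Func<string, string> handle)
    {
        var results = new List<string>();
        using (var queue = new BlockingCollection<string>(boundedCapacity: 100))
        {
            Task consumer = Task.Run(() =>
            {
                // GetConsumingEnumerable blocks while the queue is empty and
                // finishes once CompleteAdding has been called and the queue drains.
                foreach (string message in queue.GetConsumingEnumerable())
                {
                    results.Add(handle(message));
                }
            });

            // Producer side: in a real client this would be the listener thread.
            foreach (string message in incoming)
            {
                queue.Add(message);
            }
            queue.CompleteAdding();
            consumer.Wait();
        }
        return results;
    }
}
```

Because there is exactly one consumer, a notification's end marker can never be handled before the data messages that arrived ahead of it.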
The possibility of a multi-message request complicates things a little bit, as it seems like the listener will have to buffer messages until the last part of the request comes in before placing the entire thing into the queue.
To me, the beauty of the producer/consumer design is that it forces a hard separation between different parts of the program, making each much easier to debug and minimizing the possibility of shared state causing problems. The only slightly complicated part here is that you'll have to include the connection (socket or whatever) as part of the message that gets shared in the queues so that the output thread knows where to send the response.
It's not clear to me if you have to process all messages in the order they're received or if you just need to process messages for any particular client in the proper order. For example, if you have:
Client 1 message A
Client 1 message B
Client 2 message A
Is it okay to process the first message from Client 2 before you process the second message from Client 1? If so, then you can increase throughput by using what is logically multiple queues--one per client. Your "consumer" then becomes multiple threads. You just have to make sure that only one message per client is being processed at any time.

I would have one thread per client which does the parsing and processing. That way the processing would be in the order it is sent/arrives.
As you have stated, the tasks cannot be performed in parallel safely. Performing the parsing and processing in different threads is likely to add as much overhead as you might save.
If your processing is relatively simple and doesn't depend on external systems, a single thread should be able to handle 1K to 20K messages per second.
Are there any other issues you would want to fix?

I can only make recommendations for a Java-based solution.
I would use some already mature transport framework. By "some" I mean the only one I have worked with until now -- Apache MINA. However, it works and it's very flexible.
Regarding processing messages out of order: for messages which must be processed in the order they were received, you could build queues and put such messages into them.
To limit the number of queues, you could instantiate, say, 4 queues, and route each incoming message to a particular queue depending on the last 2 bits (indices 0-3) of the hash of the ordering part of the message (for example, the client_id contained in the message).
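The routing idea itself is language-neutral, so here is a minimal sketch in C# to match the rest of the thread rather than Java. QueueRouter and its members are hypothetical names; the consumers that would drain each queue on their own thread are left out.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class QueueRouter
{
    private readonly ConcurrentQueue<string>[] _queues;

    public QueueRouter(int queueCount)
    {
        // The bit mask in IndexFor only works when the count is a power of two.
        if (queueCount <= 0 || (queueCount & (queueCount - 1)) != 0)
        {
            throw new ArgumentException("queueCount must be a power of two", nameof(queueCount));
        }
        _queues = new ConcurrentQueue<string>[queueCount];
        for (int i = 0; i < queueCount; i++)
        {
            _queues[i] = new ConcurrentQueue<string>();
        }
    }

    // Keeps the low bits of the key's hash: with 4 queues that is the last
    // 2 bits, giving an index from 0 to 3. The hash only needs to be stable
    // within one process, which string.GetHashCode is.
    public int IndexFor(string orderingKey)
    {
        return orderingKey.GetHashCode() & (_queues.Length - 1);
    }

    // Messages with the same ordering key always land in the same queue,
    // so their relative order is preserved.
    public int Route(string orderingKey, string message)
    {
        int index = IndexFor(orderingKey);
        _queues[index].Enqueue(message);
        return index;
    }

    public bool TryTake(int index, out string message)
    {
        return _queues[index].TryDequeue(out message);
    }
}
```

With one consumer thread per queue, messages for different keys can be handled in parallel while each key's messages stay in order.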
If you have more concrete questions, I can update my answer appropriately.

NetworkStream.Write returns immediately - how can I tell when it has finished sending data?

Despite the documentation, NetworkStream.Write does not appear to wait until the data has been sent. Instead, it waits until the data has been copied to a buffer and then returns. That buffer is transmitted in the background.
This is the code I have at the moment. Whether I use ns.Write or ns.BeginWrite doesn't matter - both return immediately. The EndWrite also returns immediately (which makes sense since it is writing to the send buffer, not writing to the network).
bool done;
void SendData(TcpClient tcp, byte[] data)
{
    NetworkStream ns = tcp.GetStream();
    done = false;
    ns.BeginWrite(bytWriteBuffer, 0, data.Length, myWriteCallBack, ns);
    while (done == false) Thread.Sleep(10);
}

public void myWriteCallBack(IAsyncResult ar)
{
    NetworkStream ns = (NetworkStream)ar.AsyncState;
    ns.EndWrite(ar);
    done = true;
}
How can I tell when the data has actually been sent to the client?
I want to wait for 10 seconds (for example) for a response from the server after sending my data; otherwise I'll assume something was wrong. If it takes 15 seconds to send my data, then it will always time out, since I can only start counting from when NetworkStream.Write returns, which is before the data has been sent. I want to start counting 10 seconds from when the data has left my network card.
The amount of data and the time to send it could vary - it could take 1 second to send it, it could take 10 seconds to send it, it could take a minute to send it. The server does send a response when it has received the data (it's an SMTP server), but I don't want to wait forever if my data was malformed and the response will never come, which is why I need to know if I'm waiting for the data to be sent, or if I'm waiting for the server to respond.
I might want to show the status to the user - I'd like to show "sending data to server", and "waiting for response from server" - how could I do that?

I'm not a C# programmer, but the way you've asked this question is slightly misleading. The only way to know when your data has been "received", for any useful definition of "received", is to have a specific acknowledgment message in your protocol which indicates the data has been fully processed.
The data does not "leave" your network card, exactly. The best way to think of your program's relationship to the network is:
your program -> lots of confusing stuff -> the peer program
A list of things that might be in the "lots of confusing stuff":
the CLR
the operating system kernel
a virtualized network interface
a switch
a software firewall
a hardware firewall
a router performing network address translation
a router on the peer's end performing network address translation
So, if you are on a virtual machine, which is hosted under a different operating system, that has a software firewall which is controlling the virtual machine's network behavior - when has the data "really" left your network card? Even in the best case scenario, many of these components may drop a packet, which your network card will need to re-transmit. Has it "left" your network card when the first (unsuccessful) attempt has been made? Most networking APIs would say no, it hasn't been "sent" until the other end has sent a TCP acknowledgement.
That said, the documentation for NetworkStream.Write seems to indicate that it will not return until it has at least initiated the 'send' operation:
The Write method blocks until the requested number of bytes is sent or a SocketException is thrown.
Of course, "is sent" is somewhat vague for the reasons I gave above. There's also the possibility that the data will be "really" sent by your program and received by the peer program, but the peer will crash or otherwise not actually process the data. So you should do a Write followed by a Read of a message that will only be emitted by your peer when it has actually processed the message.

TCP is a "reliable" protocol, which means the data will be received at the other end if there are no socket errors. I have seen numerous efforts at second-guessing TCP with a higher level application confirmation, but IMHO this is usually a waste of time and bandwidth.
Typically the problem you describe is handled through normal client/server design, which in its simplest form goes like this...
The client sends a request to the server and does a blocking read on the socket waiting for some kind of response. If there is a problem with the TCP connection then that read will abort. The client should also use a timeout to detect any non-network related issue with the server. If the request fails or times out then the client can retry, report an error, etc.
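In C# that pattern might look roughly like the sketch below. RequestReply and SendAndReceive are made-up names; the timeout relies on NetworkStream.ReadTimeout, which makes a blocked Read throw an IOException wrapping a SocketException when the time runs out.

```csharp
using System;
using System.IO;
using System.Net.Sockets;

public static class RequestReply
{
    // Sends the request, then waits up to timeoutMs for the start of a reply.
    // Returns the number of bytes read into replyBuffer, 0 if the server
    // closed the connection, or -1 if nothing arrived within the timeout.
    public static int SendAndReceive(NetworkStream ns, byte[] request, byte[] replyBuffer, int timeoutMs)
    {
        ns.Write(request, 0, request.Length);

        ns.ReadTimeout = timeoutMs;
        try
        {
            return ns.Read(replyBuffer, 0, replyBuffer.Length);
        }
        catch (IOException ex) when (ex.InnerException is SocketException se
                                     && se.SocketErrorCode == SocketError.TimedOut)
        {
            // No reply in time: let the caller decide whether to retry or report.
            return -1;
        }
    }
}
```

Other network failures (a reset connection, for example) still surface as exceptions. After a timeout it is probably safest to close the connection and reconnect before retrying, rather than reuse the same socket.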
Once the server has processed the request and sent the response it usually no longer cares what happens - even if the socket goes away during the transaction - because it is up to the client to initiate any further interaction. Personally, I find it very comforting to be the server. :-)

In general, I would recommend sending an acknowledgment from the client anyway. That way you can be 100% sure the data was received, and received correctly.

If I had to guess, the NetworkStream considers the data to have been sent once it hands the buffer off to the Windows Socket. So, I'm not sure there's a way to accomplish what you want via TcpClient.

I can not think of a scenario where NetworkStream.Write wouldn't send the data to the server as soon as possible. Barring massive network congestion or disconnection, it should end up on the other end within a reasonable time. Is it possible that you have a protocol issue? For instance, with HTTP the request headers must end with a blank line, and the server will not send any response until one occurs -- does the protocol in use have a similar end-of-message characteristic?
Here's some cleaner code than your original version, removing the delegate, field, and Thread.Sleep. It performs the exact same way functionally.
void SendData(TcpClient tcp, byte[] data) {
    NetworkStream ns = tcp.GetStream();
    // BUG?: should bytWriteBuffer == data?
    IAsyncResult r = ns.BeginWrite(bytWriteBuffer, 0, data.Length, null, null);
    r.AsyncWaitHandle.WaitOne();
    ns.EndWrite(r);
}
Looks like the question was modified while I wrote the above. The .WaitOne() may help your timeout issue. It can be passed a timeout parameter. This is a lazy wait -- the thread will not be scheduled again until the result is finished, or the timeout expires.

I tried to understand the intent of the .NET NetworkStream designers, and I think they had to design it this way. After Write, the data to send is no longer handled by .NET. Therefore, it is reasonable that Write returns immediately (and the data will be sent out from the NIC some time soon).
So in your application design, you should follow this pattern rather than trying to make it work your way. For example, using a longer timeout while waiting for data from the NetworkStream can compensate for the time that passes before your command leaves the NIC.
In all, it is bad practice to hard code a timeout value inside source files. If the timeout value is configurable at runtime, everything should work fine.

How about using the Flush() method?
ns.Flush()
That should ensure the data is written before continuing.

Below .NET are Windows sockets, which use TCP.
TCP uses ACK packets to notify the sender the data has been transferred successfully.
So the sender machine knows when data has been transferred, but there is no way (that I am aware of) to get that information in .NET.
edit:
Just an idea, never tried:
Write() blocks only if the socket's buffer is full. So if we lower that buffer's size (SendBufferSize) to a very low value (8? 1? 0?) we may get what we want :)

Perhaps try setting
tcp.NoDelay = true
