zeromq receive times out with EAGAIN exactly the 255th time

Library: clrzmq4 (https://github.com/zeromq/clrzmq4) in a C# project.
I am using a ZeroMQ router-dealer configuration. The server is written in Python and runs on Linux. My dealer client, written in C#, runs on a Windows machine. It sends messages and waits for the response:
public Boolean sendMessage(Dictionary<String, String> msgDict)
{
    ZError err;
    String errStr;
    var reqFrame = new ZFrame(JsonConvert.SerializeObject(msgDict));
    retval = socket.Send(reqFrame, out err);
    if (err != null)
    {
        errStr = String.Format("Error while sending command {0} {1} {2}", err.Text, err.Number, err.Name);
        return false;
    }
    err = null;
    respFrame = socket.ReceiveFrame(out err);
    if (err != null)
    {
        errStr = String.Format("Error while receiving response data {0} {1} {2} {3}", err.Text, err.Number, err.Name, num_messages);
        return false;
    }
    return true;
}
I set the sendTimeout and receiveTimeout on the socket to 2 minutes each.
When I keep calling sendMessage, ReceiveFrame times out at exactly the 255th call. On the server I see the message being processed and the response being sent, as on every other call. After this point, my send also times out with the same error: "EAGAIN" Resource temporarily unavailable.
These are the things I tried:
Data of different lengths, from 2 KB to 20 MB
Setting the sendhighwatermark and receivehighwatermark to different values: 10, 1000, 10000
Polling on the socket instead of calling ReceiveFrame
Making the sockets completely blocking
In each of the above cases the failure occurred at exactly the 255th call. With blocking sockets, the call blocked at the 255th time too.
I can't use NetMQ, as much as I would like to, because it doesn't have CurveZMQ and the server needs it.
I also tried a dealer client from another Linux machine, and it had no issues at the 255th call or later.

Related

NetMQ response socket poll fails after succeeding one time

I'm new to the world of ZeroMQ and I'm working through the documentation of both NetMQ and ZeroMQ as I go. I'm currently implementing (or preparing to implement) the Paranoid Pirate Pattern, and hit a snag. I have a single app which is running the server(s), clients, and eventually queue, though I haven't implemented the queue yet. Right now, there should only be one server at a time running. I can launch as many clients as I like, all communicating with the single server. I am able to have my server "crash" and restart it (manually for now, automatically soon). That all works. Or at least, restarting the server works once.
To enforce that there's only a single server running, I have a thread (which I'll call the WatchThread) which opens a response socket that binds to an address and polls for messages. When the server dies, it signals its demise and the WatchThread decrements the count when it receives the signal. Here's the code snippet that is failing:
//This is the server's main loop:
public void Start(object? count)
{
num = (int)(count ?? -1);
_model.WriteMessage($"Server {num} up");
var rng = new Random();
using ResponseSocket server = new();
server.Bind(tcpLocalhost); //This is for talking to the clients
int cycles = 0;
while (true)
{
var message = server.ReceiveFrameString();
if (message == "Kill")
{
server.SendFrame("Dying");
return;
}
if (cycles++ > 3 && rng.Next(0, 16) == 0)
{
_model.WriteMessage($"Server {num} \"Crashing\"");
RequestSocket sock = new(); //This is for talking to the WatchThread
sock.Connect(WatchThreadString);
sock.SendFrame("Dying"); //This isn't working correctly
return;
}
if(cycles > 3 && rng.Next(0, 10) == 0)
{
_model.WriteMessage($"Server {num}: Slowdown");
Thread.Sleep(1000);
}
server.SendFrame($"Server{num}: {message}");
}
}
And here's the WatchThread code:
public const string WatchThreadString = "tcp://localhost:5000";
private void WatchServers()
{
    _watchThread = new ResponseSocket(WatchThreadString);
    _watchThread.ReceiveReady += OnWatchThreadOnReceiveReady;
    while (_listen)
    {
        bool result = _watchThread.Poll(TimeSpan.FromMilliseconds(1000));
    }
}
private void OnWatchThreadOnReceiveReady(object? s, NetMQSocketEventArgs a)
{
    lock (_countLock)
    {
        ServerCount--;
    }
    _watchThread.ReceiveFrameBytes();
}
As you can see, it's pretty straightforward. What am I missing? It seems like what should happen is exactly what happens the first time everything is instantiated: the server is supposed to go down, so it opens a new socket to the pre-existing WatchThread and sends a frame. The WatchThread receives the message and decrements the counter appropriately. It's only on the second server that things don't behave as expected...
Edit: I was able to get it to work by unbinding/closing _watchThread and recreating it... it's definitely suboptimal and it still seems like I'm missing something. It's almost as if for some reason I can only use that socket once, though I have other request sockets being used multiple times.
Additional Edit:
My netstat output with 6 clients running (kubernetes is in my host file as 127.0.0.1 as is detailed here):
TCP 127.0.0.1:5555 MyComputerName:0 LISTENING
TCP 127.0.0.1:5555 kubernetes:64243 ESTABLISHED
TCP 127.0.0.1:5555 kubernetes:64261 ESTABLISHED
TCP 127.0.0.1:5555 kubernetes:64264 ESTABLISHED
TCP 127.0.0.1:5555 kubernetes:64269 ESTABLISHED
TCP 127.0.0.1:5555 kubernetes:64272 ESTABLISHED
TCP 127.0.0.1:5555 kubernetes:64273 ESTABLISHED

IBM XMS Receive method not returning messages immediately

I use IBM XMS to connect to a third party to send and receive messages.
UPDATE:
Client .Net Core 3.1
IBM XMS library version from Nuget. Tried 9.2.4 and 9.1.5 with same results
Same code used to work fine a week ago - so something must have changed in the MQ manager or somewhere in my infrastructure
SSL and client certificates
I had been using a receive with a timeout for a while without problems, but since last week I stopped seeing any messages to pick up, even when they were there. Once I changed to the receive method without a timeout, I started picking up messages again, but only every 5 minutes.
Looking at the XMS logs, I can see the messages are actually read almost immediately, with and without a timeout, but XMS seems to be deciding to wait for those 5 minutes before returning the message...
I haven't changed anything on my side, and the third party assures me they haven't either.
My question is: given the below code used to receive is there anything there that may be the cause of the 5 minutes wait? Any ideas on things I can try? I can share the XMS logs too if that helps.
// This is used to set the default properties in the factory before calling the receive method
private void SetConnectionProperties(IConnectionFactory cf)
{
cf.SetStringProperty(XMSC.WMQ_HOST_NAME, _mqConfiguration.Host);
cf.SetIntProperty(XMSC.WMQ_PORT, _mqConfiguration.Port);
cf.SetStringProperty(XMSC.WMQ_CHANNEL, _mqConfiguration.Channel);
cf.SetStringProperty(XMSC.WMQ_QUEUE_MANAGER, _mqConfiguration.QueueManager);
cf.SetStringProperty(XMSC.WMQ_SSL_CLIENT_CERT_LABEL, _mqConfiguration.CertificateLabel);
cf.SetStringProperty(XMSC.WMQ_SSL_KEY_REPOSITORY, _mqConfiguration.KeyRepository);
cf.SetStringProperty(XMSC.WMQ_SSL_CIPHER_SPEC, _mqConfiguration.CipherSuite);
cf.SetIntProperty(XMSC.WMQ_CONNECTION_MODE, XMSC.WMQ_CM_CLIENT);
cf.SetIntProperty(XMSC.WMQ_CLIENT_RECONNECT_OPTIONS, XMSC.WMQ_CLIENT_RECONNECT);
cf.SetIntProperty(XMSC.WMQ_CLIENT_RECONNECT_TIMEOUT, XMSC.WMQ_CLIENT_RECONNECT_TIMEOUT_DEFAULT);
}
public IEnumerable<IMessage> ReceiveMessage()
{
using var connection = _connectionFactory.CreateConnection();
using var session = connection.CreateSession(false, AcknowledgeMode.AutoAcknowledge);
using var destination = session.CreateQueue(_mqConfiguration.ReceiveQueue);
using var consumer = session.CreateConsumer(destination);
connection.Start();
var result = new List<IMessage>();
var keepRunning = true;
while (keepRunning)
{
try
{
var sw = new Stopwatch();
sw.Start();
var message = _mqConfiguration.ConsumerTimeoutMs == 0 ? consumer.Receive()
: consumer.Receive(_mqConfiguration.ConsumerTimeoutMs);
if (message != null)
{
result.Add(message);
_messageLogger.LogInMessage(message);
var ellapsedMillis = sw.ElapsedMilliseconds;
if (_mqConfiguration.ConsumerTimeoutMs == 0)
{
keepRunning = false;
}
}
else
{
keepRunning = false;
}
}
catch (Exception e)
{
// We log the exception
keepRunning = false;
}
}
consumer.Close();
destination.Dispose();
session.Dispose();
connection.Close();
return result;
}

The symptoms look like a match for APAR IJ20591: Managed .NET SSL application making MQGET calls unexpectedly receives MQRC_CONNECTION_BROKEN when running in .NET Core. This impacts messages larger than 15kb and IBM MQ .net standard (core) libraries using TLS channels. See also this thread. This will be fixed in 9.2.0.5, no CDS release is listed.
It states:
Setting the heartbeat interval to lower values may reduce the frequency of occurrence.
If your .NET application is not using a CCDT you can lower the heartbeat by having the SVRCONN channel's HBINT lowered and reconnecting your application.

C# tcp async listener gets stuck on my on_receive callback after client closes socket

I've got a listener socket that accepts, receives and sends as a TCP server typically does. I've given my accept and receive code below, it's not that different from the example on Microsoft's documentation. The main difference is that my server doesn't kill a connection after it stops receiving data (I don't know if this is a bad design or not?).
private void on_accept(IAsyncResult xResult)
{
Socket listener = null;
Socket handler = null;
TStateObject state = null;
Task<int> consumer = null;
try
{
mxResetEvent.Set();
listener = (Socket)xResult.AsyncState;
handler = listener.EndAccept(xResult);
state = new TStateObject()
{
Socket = handler
};
consumer = async_input_consumer(state);
OnConnect?.Invoke(this, handler);
handler.BeginReceive(state.Buffer, 0, TStateObject.BufferSize, 0, new AsyncCallback(on_receive), state);
}
catch (SocketException se)
{
if (se.ErrorCode == 10054)
{
on_disconnect(state);
}
}
catch (ObjectDisposedException)
{
return;
}
catch (Exception ex)
{
System.Console.WriteLine("Exception in TCPServer::AcceptCallback, exception: " + ex.Message);
}
}
private void on_receive(IAsyncResult xResult)
{
Socket handler = null;
TStateObject state = null;
try
{
state = xResult.AsyncState as TStateObject;
handler = state.Socket;
int bytesRead = handler.EndReceive(xResult);
UInt16 id = TClientRegistry.GetIdBySocket(handler);
TContext context = TClientRegistry.GetContext(id);
if (bytesRead > 0)
{
var buffer_data = new byte[bytesRead];
Array.Copy(state.Buffer, buffer_data, bytesRead);
state.BufferBlock.Post(buffer_data);
}
Array.Clear(state.Buffer, 0, state.Buffer.Length);
handler.BeginReceive(state.Buffer, 0, TStateObject.BufferSize, 0, new AsyncCallback(on_receive), state);
}
catch (SocketException se)
{
if(se.ErrorCode == 10054)
{
on_disconnect(state);
}
}
catch (ObjectDisposedException)
{
return;
}
catch (Exception ex)
{
System.Console.WriteLine("Exception in TCPServer::ReadCallback, exception: " + ex.Message);
}
}
This code is used to connect to an embedded device and works (mostly) fine. I was investigating a memory leak and trying to speed up the process a bit by replicating exactly what the device does (our connection speeds are in the realm of about 70kbps to our device, and it took an entire weekend of stress testing to get the memory leak to double the memory footprint of the server).
So I wrote a C# program to replicate the data transactions, but I've run into an issue: when I disconnect the test program, the server gets caught in a loop where its on_receive callback is called endlessly. I was under the impression that the BeginReceive callback wouldn't fire until something was received. As I understand it, on_receive is called, ends the receive as an async callback should, and processes the data; then, because I want the connection to await more data, I call BeginReceive again.
The part of my test program where the issue occurs is in here:
private static void read_write_test()
{
mxConnection = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
mxConnection.Connect("12.12.12.18", 10);
if (mxConnection.Connected)
{
byte[] data = Encoding.ASCII.GetBytes("HANDSHAKESTRING"); //Connect string
int len = data.Length;
mxConnection.Send(data);
data = new byte[4];
len = mxConnection.Receive(data);
if (len == 0 || data[0] != '1')
{
mxConnection.Disconnect(false);
return;
}
}
//Meat of the test goes here but isn't relevant
mxConnection.Shutdown(SocketShutdown.Both);
mxConnection.Close();
}
Up until the Shutdown(SocketShutdown.Both) call, everything works as expected. When I make that call however, it seems like the server never gets notification that the client has closed the socket and gets stuck in a loop of endlessly trying to receive. I've done my homework and I think I am closing my connection properly as per this discussion. I've messed around with the disconnect section to just do mxConnection.Disconnect(false) as well, but the same thing occurs.
When the device disconnects from the server, my server catches a SocketException with error code 10054, which documentation says:
Connection reset by peer.
An existing connection was forcibly closed
by the remote host. This normally results if the peer application on
the remote host is suddenly stopped, the host is rebooted, the host or
remote network interface is disabled, or the remote host uses a hard
close (see setsockopt for more information on the SO_LINGER option on
the remote socket). This error may also result if a connection was
broken due to keep-alive activity detecting a failure while one or
more operations are in progress. Operations that were in progress fail
with WSAENETRESET. Subsequent operations fail with WSAECONNRESET.
I've used this to handle the socket being closed, and it has worked well for the most part. However, with my C# test program, it doesn't seem to work the same way.
Am I missing something here? I'd appreciate any input. Thanks.

The main difference is that my server doesn't kill a connection after it stops receiving data (I don't know if this is a bad design or not?).
Of course it is.
it seems like the server never gets notification that the client has closed the socket and gets stuck in a loop of endlessly trying to receive
The server does get notification. It's just that you ignore it. The notification is that your receive operation returns 0. When that happens, you just call BeginReceive() again. Which starts a new read operation. Which…returns 0! You just keep doing that over and over again.
When a receive operation returns 0, you're supposed to complete the graceful closure (with a call to Shutdown() and Close()) that the remote endpoint started. Do not try to receive again. You'll just keep getting the same result.
I strongly recommend you do more homework. A good place to start would be the Winsock Programmer's FAQ. It is a fairly old resource and doesn't address .NET at all. But for the most part, the things that novice network programmers are getting wrong in .NET are the same things that novice Winsock programmers were getting wrong twenty years ago. The document is still just as relevant today as it was then.
By the way, your client-side code has some issues as well. First, when the Connect() method returns successfully, the socket is connected. You don't have to check the Connected property (and in fact, should never have to check that property). Second, the Disconnect() method doesn't do anything useful. It's used when you want to re-use the underlying socket handle, but you should be disposing the Socket object here. Just use Shutdown() and Close(), per the usual socket API idioms. Third, any code that receives from a TCP socket must do that in a loop, and make use of the received byte-count value to determine what data has been read and whether enough has been read to do anything useful. TCP can return any positive number of bytes on a successful read, and it's your program's job to identify the start and end of any particular blocks of data that were sent.
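The two rules above (a read of 0 bytes means the peer closed the connection, and a TCP read may return fewer bytes than requested) can be sketched as a small helper. This is a minimal illustration, not the asker's code: the names SocketReadSketch, TryReceiveExactly and Demo are hypothetical, and the demo uses a loopback connection with synchronous calls purely to keep the example self-contained.

```csharp
using System;
using System.Net;
using System.Net.Sockets;

public static class SocketReadSketch
{
    // Reads exactly `count` bytes into `buffer`, looping because a single
    // Receive() call may return fewer bytes than requested.
    // Returns false if Receive() returns 0, i.e. the remote endpoint closed
    // the connection before `count` bytes arrived.
    public static bool TryReceiveExactly(Socket socket, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int read = socket.Receive(buffer, total, count - total, SocketFlags.None);
            if (read == 0)
            {
                return false; // graceful close by the peer: stop reading
            }
            total += read;
        }
        return true;
    }

    // Loopback demonstration (hypothetical): the client sends 4 bytes and
    // then shuts down its sending side. The first read gets all 4 bytes;
    // the second read sees the close and reports it instead of retrying.
    public static (bool gotAll, bool gotMore) Demo()
    {
        var listener = new TcpListener(IPAddress.Loopback, 0); // port 0: let the OS pick
        listener.Start();
        int port = ((IPEndPoint)listener.LocalEndpoint).Port;

        using var client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        client.Connect(IPAddress.Loopback, port);
        using Socket server = listener.AcceptSocket();
        listener.Stop();

        client.Send(new byte[] { 1, 2, 3, 4 });
        client.Shutdown(SocketShutdown.Send); // start the graceful close

        var buffer = new byte[4];
        bool gotAll = TryReceiveExactly(server, buffer, 4);
        bool gotMore = TryReceiveExactly(server, buffer, 1);

        // Complete the closure the client started; `using` then closes the socket.
        server.Shutdown(SocketShutdown.Both);
        return (gotAll, gotMore);
    }
}
```

In an asynchronous server the same check applies inside the receive callback: when EndReceive() returns 0, shut down and close the socket rather than calling BeginReceive() again.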

You missed this in the documentation for EndReceive() and Receive():
If the remote host shuts down the Socket connection with the Shutdown method, and all available data has been received, the Receive method will complete immediately and return zero bytes.
When you read zero bytes, you still start another BeginReceive(), instead of shutting down:
if (bytesRead > 0)
{
    var buffer_data = new byte[bytesRead];
    Array.Copy(state.Buffer, buffer_data, bytesRead);
    state.BufferBlock.Post(buffer_data);
}
Array.Clear(state.Buffer, 0, state.Buffer.Length);
handler.BeginReceive(state.Buffer, 0, TStateObject.BufferSize, 0, new AsyncCallback(on_receive), state);
Since you keep calling BeginReceive on a socket that's 'shutdown', you're going to keep getting callbacks to receive zero bytes.
Compare with the example from Microsoft in the documentation for EndReceive():
public static void Read_Callback(IAsyncResult ar){
    StateObject so = (StateObject) ar.AsyncState;
    Socket s = so.workSocket;
    int read = s.EndReceive(ar);
    if (read > 0) {
        so.sb.Append(Encoding.ASCII.GetString(so.buffer, 0, read));
        s.BeginReceive(so.buffer, 0, StateObject.BUFFER_SIZE, 0,
            new AsyncCallback(Async_Send_Receive.Read_Callback), so);
    }
    else{
        if (so.sb.Length > 1) {
            //All of the data has been read, so displays it to the console
            string strContent;
            strContent = so.sb.ToString();
            Console.WriteLine(String.Format("Read {0} byte from socket" +
                "data = {1} ", strContent.Length, strContent));
        }
        s.Close();
    }
}

Where is data sent by UDP stored?

I had never used UDP before, so I gave it a go. To see what would happen, I had the 'server' send data every half a second, and the client receive data every 3 seconds. So even though the server is sending data much faster than the client can receive, the client still receives it all neatly one by one.
Can anyone explain why/how this happens? Where is the data buffered exactly?
Send
class CSimpleSend
{
CSomeObjServer obj = new CSomeObjServer();
public CSimpleSend()
{
obj.changedVar = varUpdated;
obj.threadedChangeSomeVar();
}
private void varUpdated(int var)
{
string send = var.ToString();
byte[] packetData = System.Text.UTF8Encoding.UTF8.GetBytes(send);
string ip = "127.0.0.1";
int port = 11000;
IPEndPoint ep = new IPEndPoint(IPAddress.Parse(ip), port);
Socket client = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);
client.SendTo(packetData, ep);
Console.WriteLine("Sent Message: " + send);
Thread.Sleep(100);
}
}
All CSomeObjServer does is increment an integer by one every half second
Receive
class CSimpleReceive
{
CSomeObjClient obj = new CSomeObjClient();
public Action<string> showMessage;
Int32 port = 11000;
UdpClient udpClient;
public CSimpleReceive()
{
udpClient = new UdpClient(port);
showMessage = Console.WriteLine;
Thread t = new Thread(() => ReceiveMessage());
t.Start();
}
private void ReceiveMessage()
{
while (true)
{
//Thread.Sleep(1000);
IPEndPoint remoteIPEndPoint = new IPEndPoint(IPAddress.Any, port);
byte[] content = udpClient.Receive(ref remoteIPEndPoint);
if (content.Length > 0)
{
string message = Encoding.UTF8.GetString(content);
if (showMessage != null)
showMessage("Recv:" + message);
int var_out = -1;
bool succ = Int32.TryParse(message, out var_out);
if (succ)
{
obj.alterSomeVar(var_out);
Console.WriteLine("Altered var to :" + var_out);
}
}
Thread.Sleep(3000);
}
}
}
CSomeObjClient stores the variable and has one function (alterSomeVar) to update it.
Output:
Sent Message: 1
Recv:1
Altered var to :1
Sent Message: 2
Sent Message: 3
Sent Message: 4
Sent Message: 5
Recv:2
Altered var to :2
Sent Message: 6
Sent Message: 7
Sent Message: 8
Sent Message: 9
Sent Message: 10
Recv:3
Altered var to :3

The operating system kernel maintains separate send and receive buffers for each UDP and TCP socket. If you google SO_SNDBUF and SO_RCVBUF you'll find lots of information about them.
When you send data, it is copied from your application space into the send buffer. From there it is copied to the network interface card, and then onto the wire. The receive side is the reverse: NIC to receive buffer, where it waits until you read it. Additional copies and buffering can also occur, depending on the OS.
It is critical to note that the sizes of these buffers can vary radically. Some systems might default to as little as 4 kilobytes, while others give you 2 megabytes. You can find the current size using getsockopt() with SO_SNDBUF or SO_RCVBUF and likewise set it using setsockopt(). But many systems limit the size of the buffer, sometimes to arbitrarily small amounts. This is typically a kernel value like net.core.wmem_max or net.core.rmem_max, but the exact reference will vary by system.
Also note that setsockopt() can fail even if you request an amount less than the supposed limit. So to actually get a desired size, you need to repeatedly call setsockopt() using decreasing amounts until it finally succeeds.
The following page is a Tech Note from my company which touches on this topic a little bit and provides references for some common systems: http://www.dataexpedition.com/support/notes/tn0024.html
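In .NET, the getsockopt()/setsockopt() calls described above are exposed through Socket.GetSocketOption/SetSocketOption and the ReceiveBufferSize/SendBufferSize properties. The following is a rough sketch of the "retry with decreasing amounts" idea; the class name UdpBufferSketch, the method RequestReceiveBuffer and the halving strategy are assumptions made for illustration. Some systems do not fail at all and instead silently clamp or adjust the value, which is why the sketch returns what the OS reports rather than what was requested.

```csharp
using System;
using System.Net.Sockets;

public static class UdpBufferSketch
{
    // Asks the OS for a receive buffer of `desiredBytes`, halving the request
    // each time the OS rejects it, and never going below `minimumBytes`.
    // Returns the size the OS reports afterwards, which may differ from
    // the requested value.
    public static int RequestReceiveBuffer(Socket socket, int desiredBytes, int minimumBytes)
    {
        int floor = Math.Max(1, minimumBytes); // guard so the loop always terminates
        for (int size = desiredBytes; size >= floor; size /= 2)
        {
            try
            {
                socket.SetSocketOption(SocketOptionLevel.Socket, SocketOptionName.ReceiveBuffer, size);
                break; // the OS accepted this request
            }
            catch (SocketException)
            {
                // The OS rejected this size; fall through and try a smaller one.
            }
        }
        return socket.ReceiveBufferSize; // equivalent to getsockopt(SO_RCVBUF)
    }
}
```

A possible use, with arbitrary example sizes:

```csharp
using var udp = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);
int actual = UdpBufferSketch.RequestReceiveBuffer(udp, 1024 * 1024, 4096);
Console.WriteLine("Receive buffer is now " + actual + " bytes");
```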

It looks to me like the UdpClient class provides a buffer for received data. Try using a socket directly. You might also want to set that socket's ReceiveBufferSize to zero, even though I believe it is only used for TCP connections.

Determine when a remote system is rebooted

I am writing a C# program that calls a batch file that reboots a remote system to another partition. Is there a way to know when the system is done rebooting? I would like to know when I can access the remote system once it is rebooted.

I would ping it to determine when it's back online. There's a Ping built right into .net.

When system A tells system B to reboot, maybe it could supply its IP address (or other contact info), and then in system B's startup process it could read the file with the contact info and call back to system A.

You have a few options...
You can try to connect to a network service on the other system, e.g. use Ping as suggested by Jon B.
However, if the system is one under your control, you have other options. You could set up a scheduled task (or install a service) to start on boot, and have the other system notify you when it is online. This would prevent you from needing to ping repeatedly.

I wish I had used this site before I spent all day Saturday trying to figure this out. Thanks for the help!! Below is what I ended up doing. This code will ping a computer and determine when it is back online. The next step will be to determine if I can open a network connection to the remote desktop port as suggested earlier. Feel free to make any comments on the code.
Note: Here are the added using statements:
using System.Net; // added
using System.Net.NetworkInformation;// added
using System.ComponentModel; // added
using System.Threading; // added
static void Main(string[] args)
{
string systemName = "172.30.11.148"; // Name of system being pinged. System name used instead of ip due to DHCP
Ping pingSender = new Ping(); // Creates new ping object.
string data = "datadatadatadatadatadatadatadata"; // buffer of 32 bytes of data to transmit.
byte[] buffer = Encoding.ASCII.GetBytes(data); // buffer containing data.
int pingTimeout = 120; // Timeout in ms sent into pingSender.
int maxWaitTimeout = 60; // Maximum time to wait for system to reboot.
int counter = 1; // Used in while statement
bool connectStatus = false; // State of system connection.
Console.WriteLine("Wait for reboot to start before trying to establish connection.");
//Thread.Sleep(30000); // Waits 30 seconds at start of reboot to allow network to close.
Thread.Sleep(1000);
try
{
// Performs a ping on the system. If the system is available, while loop is exited and program
// continues. If the system is not available, program waits approximately 1 second and then
// pings the system again.
while (counter < maxWaitTimeout)
{
// Pings the system with 32 bytes of data and waits for timeout
Console.WriteLine("Connection attempt #: " + counter + " of " + maxWaitTimeout);
PingReply reply = pingSender.Send(systemName, pingTimeout, buffer);
Console.WriteLine("Status of ping to: {0} - {1}", systemName, reply.Status);
if (reply.Status == IPStatus.Success)
{
connectStatus = true;
Console.WriteLine("\nRoundTrip time: {0}", reply.RoundtripTime);
Console.WriteLine("Time to live: {0}", reply.Options.Ttl);
Console.WriteLine("Buffer size: {0}", reply.Buffer.Length);
break;
}
Thread.Sleep(880);
counter += 1;
} // end while loop
if (connectStatus == true)
{
Console.WriteLine("\nAble to establish connection to: {0}", systemName);
}
else
{
Console.WriteLine("\nUnable to establish network connection using ping to: {0}", systemName);
}
} // end try
catch (Exception ex)
{
System.Diagnostics.Debug.WriteLine(ex.Message);
}
Console.ReadLine(); // Pause console
} // end main
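The follow-up step mentioned above, checking whether a TCP connection to a service port can be opened, might look roughly like this. It is a sketch under assumptions: PortProbeSketch and CanConnect are hypothetical names, and the timeout handling via Task.Wait is one simple option among several. A successful ping only shows that the network stack is up, while a successful connect shows that the service itself is accepting connections.

```csharp
using System;
using System.Net.Sockets;

public static class PortProbeSketch
{
    // Returns true if a TCP connection to host:port is established within
    // `timeout`; returns false on timeout or on a refused or failed connection.
    public static bool CanConnect(string host, int port, TimeSpan timeout)
    {
        try
        {
            using var client = new TcpClient();
            var connectTask = client.ConnectAsync(host, port);
            // Wait() returns false on timeout and throws if the connect failed.
            return connectTask.Wait(timeout);
        }
        catch (AggregateException)
        {
            return false; // connect failed, e.g. connection refused or host unreachable
        }
        catch (SocketException)
        {
            return false; // failure reported synchronously
        }
    }
}
```

For example, CanConnect("172.30.11.148", 3389, TimeSpan.FromSeconds(2)) would probe the default Remote Desktop port on the system from the code above; it could be called in the same retry loop as the ping.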
