I have an MQTT listener written in C#.
The program runs in Azure, and
for some reason, after a period of time, it gets disconnected with one of these exceptions:
"The operation has timed out." or
"Exception of type 'MQTTnet.Exceptions.MqttCommunicationTimedOutException' was thrown."
In production the listener must always be online, so I reconnect in the disconnect event handler. The disconnects happen randomly, though: sometimes 4 times a day, and sometimes the listener stays online for a few days without one.
The question is: why is this happening? The device it listens to sends a timestamp request every few minutes, but handling that should be very fast and shouldn't cause a timeout.
Here is the code:
private static IMqttClient _client;
private static IMqttClientOptions _options;
static async Task Main(string[] args)
{
    try
    {
        //create subscriber client
        var factory = new MqttFactory();
        _client = factory.CreateMqttClient();
        //configure options
        _options = new MqttClientOptionsBuilder()
            .WithClientId("ListenerClient")
            .WithTcpServer(Utility.brokerIp, Utility.brokerPort)
            .WithCredentials(Utility.brokerUser, Utility.brokerPassword)
            .WithCleanSession()
            .Build();
        //Handlers
        _client.UseConnectedHandler(e =>
        {
            Console.WriteLine("Connected successfully with MQTT Brokers Topic.");
            WriteToLog("***Connected To MQTT Listener.***");
            //Subscribe to topics******************
        });
        _client.UseDisconnectedHandler(e =>
        {
            WriteToLog("***DisConnected From MQTT Listener.***");
            WriteToLog(e.Exception.Message);
            _client.ConnectAsync(_options).Wait();
            return;
        });
        _client.UseApplicationMessageReceivedHandler(async e =>
        {
            //manage messages
        });
        //Connect
        _client.ConnectAsync(_options).Wait();
        Task.Run(() => Thread.Sleep(Timeout.Infinite)).Wait();
        _client.DisconnectAsync().Wait();
    }
    catch (Exception e)
    {
        Console.WriteLine(e);
        throw;
    }
}
We had a similar issue at one time. I believe the queue you are trying to connect to has very intermittent traffic, and whatever service or server hosts the queue is set up to hibernate it when no traffic arrives for some predetermined period of time.
When you then try to use the queue, the "timeout" happens because the queue can't wake up from hibernation quickly enough for a message to be processed.
If the queue is Azure hosted, ask Azure support to confirm that this is the case. If you host it on premises, check whether your own configuration is set this way, reduce the "wait" period to something deliberately small, like 30 seconds, and verify that a hibernated queue causes a timeout.
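Whatever the root cause turns out to be, the reconnect in the question blocks inside the disconnect handler with .Wait() and makes only a single attempt. As a rough sketch only, a retry loop with exponential backoff could look like the following. The class and method names (Reconnect, DelayFor, UntilConnectedAsync) are hypothetical and not part of MQTTnet; the connect call is passed in as a delegate.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of any MQTT library.
public static class Reconnect
{
    // Delay before retrying after the given failed attempt (0-based):
    // baseDelay, 2x, 4x, ... capped at maxDelay.
    public static TimeSpan DelayFor(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        // Clamp the exponent so the multiplication cannot grow without bound.
        double factor = Math.Pow(2, Math.Min(attempt, 20));
        double ms = baseDelay.TotalMilliseconds * factor;
        return ms >= maxDelay.TotalMilliseconds ? maxDelay : TimeSpan.FromMilliseconds(ms);
    }

    // Calls connectAsync until it succeeds or the token is cancelled.
    // Returns the number of attempts that were needed.
    public static async Task<int> UntilConnectedAsync(
        Func<Task> connectAsync, TimeSpan baseDelay, TimeSpan maxDelay, CancellationToken token)
    {
        for (int attempt = 0; ; attempt++)
        {
            token.ThrowIfCancellationRequested();
            try
            {
                await connectAsync();
                return attempt + 1;
            }
            catch (Exception) when (!token.IsCancellationRequested)
            {
                // Connection failed: wait before the next attempt.
                await Task.Delay(DelayFor(attempt, baseDelay, maxDelay), token);
            }
        }
    }
}
```

In the question's disconnect handler, this could be awaited with () => _client.ConnectAsync(_options) as the delegate instead of calling .Wait(), assuming the handler is registered as an async lambda.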
Related
We have been using NATS in production for a couple of years now with a single server and around 1,500 consumers using the NATS.net client. We have finally started analyzing performance in detail and quite regularly see big delays in consuming messages on the consumer.
To keep things simple, we have a ping-pong style message that is generated on a consumer, sent to a central server via NATS which processes it and sends a reply back. Both messages have timestamps on them and identify the message that it is replying to.
What we are seeing is no issues at all between the consumer and central server, it seems to get them all the time, but at times there can be delays of several minutes before the reply message is consumed by the consumer.
To be clear we have 2 separate NATS connections for each direction of the flow.
This is the code where we are consuming from the subscription:
var thread = new Thread(() =>
{
    using (_subscription = _queueGroup == null ? NATSConnection.Connection.SubscribeSync(_subject) : NATSConnection.Connection.SubscribeSync(_subject, _queueGroup))
    {
        Connection.RaiseSubscriberConnected();
        while (_active)
        {
            try
            {
                var nextMessage = _subscription.NextMessage();
                if (nextMessage != null)
                {
                    Log.Debug("Subscriber Message Received");
                    using (var stream = new MemoryStream(nextMessage.Data))
                    {
                        NewSubscriptionItem.Invoke(Envelope.Parser.ParseFrom(stream));
                    }
                }
            }
            catch (Exception ex)
            {
                Connection.RaiseException(ex);
            }
        }
    }
})
{
    IsBackground = true
};
thread.Start();
}
The Log.Debug("Subscriber Message Received"); line does not get hit at all during the period where we are missing replies. Then, after a period of time, all the outstanding messages come in at once. It's as if there is a 'blockage' that gets cleared.
The machines that the consumers run on do have a lot going on, but CPU usage never exceeds around 50%.
Any pointers as to what to check next would be much appreciated!
I am testing how .NET WebSockets work when the client can't process data from the server side fast enough. For this purpose, I wrote an application that sends data continuously to a WebSocket, but includes an artificial delay in the receive loop. As expected, once the TCP window and other buffers fill, the SendAsync calls start to take long to return. But after a few minutes, one of these exceptions is thrown by SendAsync:
System.Net.HttpListenerException: The device does not recognize the command
System.Net.HttpListenerException: The I/O operation has been aborted because of either a thread exit or an application request.
What's weird is that this only happens with certain message sizes and certain timing. When the client is allowed to read all data unrestricted, the connection is stable. Also, when the client is blocked completely and does not read at all, the connection stays open.
Examining the data flow through Wireshark revealed that it is the server that is resetting the TCP connection while the client's TCP window is exhausted.
I tried to follow this answer (.NET WebSockets forcibly closed despite keep-alive and activity on the connection) without success. Tweaking the WebSocket keep alive interval has no effect. Also, I know that the final application needs to be able to handle unexpected disconnections gracefully, but I do not want them to occur if they can be avoided.
Has anybody encountered this? Is there some timeout tweaking that I can do? Running this should produce the error after one and a half to three minutes:
class Program
{
    static void Main(string[] args)
    {
        System.Net.ServicePointManager.MaxServicePointIdleTime = Int32.MaxValue; // has no effect
        HttpListener httpListener = new HttpListener();
        httpListener.Prefixes.Add("http://*/ws/");
        Listen(httpListener);
        Thread.Sleep(500);
        Receive("ws://localhost/ws/");
        Console.WriteLine("running...");
        Console.ReadKey();
    }
    private static async void Listen(HttpListener listener)
    {
        listener.Start();
        while (true)
        {
            HttpListenerContext ctx = await listener.GetContextAsync();
            if (!ctx.Request.IsWebSocketRequest)
            {
                ctx.Response.StatusCode = (int)HttpStatusCode.NotImplemented;
                ctx.Response.Close();
                return;
            }
            Send(ctx);
        }
    }
    private static async void Send(HttpListenerContext ctx)
    {
        TimeSpan keepAliveInterval = TimeSpan.FromSeconds(5); // tweaking has no effect
        HttpListenerWebSocketContext wsCtx = await ctx.AcceptWebSocketAsync(null, keepAliveInterval);
        WebSocket webSocket = wsCtx.WebSocket;
        byte[] buffer = new byte[100];
        while (true)
        {
            await webSocket.SendAsync(new ArraySegment<byte>(buffer), WebSocketMessageType.Binary, true, CancellationToken.None);
        }
    }
    private static async void Receive(string serverAddress)
    {
        ClientWebSocket webSocket = new ClientWebSocket();
        webSocket.Options.KeepAliveInterval = TimeSpan.FromSeconds(5); // tweaking has no effect
        await webSocket.ConnectAsync(new Uri(serverAddress), CancellationToken.None);
        byte[] receiveBuffer = new byte[10000];
        while (true)
        {
            await Task.Delay(10); // simulate a slow client
            var message = await webSocket.ReceiveAsync(new ArraySegment<byte>(receiveBuffer), CancellationToken.None);
            if (message.CloseStatus.HasValue)
                break;
        }
    }
}
I'm not a .NET developer, but from what I have seen of this kind of WebSocket problem, and in my own opinion, these can be the reasons:
1. A very short timeout setting on the WebSocket on either side.
2. Client- or server-side runtime exceptions (besides logging, check the onError and onClose methods to see why).
3. Internet or connection failures. A WebSocket sometimes goes idle too. You have to implement a heartbeat on the WebSocket to keep it alive, using ping and pong packets.
4. The maximum binary or text message size on the server side. Check it, and also set buffer sizes so that a large message does not cause a failure.
Since your error usually happens within a certain time, 1 and 2 should help you. Again, sorry that I can't provide code, but I have had the same problems in Java and found that these are the settings that must be set in order to work with WebSockets. Search for how to set them in your client and server implementations and you should be fine after that.
Apparently, I was hitting an HTTP.SYS low speed connection attack countermeasure, as roughly described in KB 3137046 (https://support.microsoft.com/en-us/help/3137046/http-sys-forcibly-disconnects-http-bindings-for-wcf-self-hosted-servic):
By default, Http.sys considers any speed rate of less than 150 bytes per second as a potential low speed connection attack, and it drops the TCP connection to release the resource.
When HTTP.SYS does that, there is a trace entry in the log at %windir%\System32\LogFiles\HTTPERR
Switching it off was simple from code:
httpListener.TimeoutManager.MinSendBytesPerSecond = UInt32.MaxValue;
I have successfully created an Azure application that sends DbTransactions to a ServiceBus Queue, and then, enqueues a 'notifying message' to a ServiceBus Topic for other clients to monitor (...so they can receive the updates automatically).
Now, I want to use SignalR to monitor and receive the SubscriptionClient messages, and I have test code that works just fine on its own.
I have found many examples for sending messages to an Azure Queue (that is easy). And, I have the code to receive a BrokeredMessage from a SubscriptionClient. However, I cannot get SignalR to continuously monitor my Distribute method.
How do I get SignalR to monitor the Topic?
CODE BEHIND: (updated)
public void Dequeue()
{
    SubscriptionClient subscription = GetTopicSubscriptionClient(TOPIC_NAME, SUBSCRIPTION_NAME);
    subscription.Receive();
    BrokeredMessage message = subscription.Receive();
    if (message != null)
    {
        try
        {
            var body = message.GetBody<string>();
            var contextXml = message.Properties[PROPERTIES_CONTEXT_XML].ToString();
            var transaction = message.Properties[PROPERTIES_TRANSACTION_TYPE].ToString();
            Console.WriteLine("Body: " + body);
            Console.WriteLine("MessageID: " + message.MessageId);
            Console.WriteLine("Custom Property [Transaction]: " + transaction);
            var context = XmlSerializer.Deserialize<Person>(contextXml);
            message.Complete();
            Clients.All.distribute(context, transaction);
        }
        catch (Exception ex)
        {
            // Manage later
        }
    }
}
CLIENT-SIDE CODE:
// TEST: Hub - GridUpdaterHub
var hubConnection = $.hubConnection();
var gridUpdaterHubProxy = hubConnection.createHubProxy('gridUpdaterHub');
gridUpdaterHubProxy.on('hello', function (message) {
    console.log(message);
});
// I want this automated
gridUpdaterHubProxy.on('distribute', function (context, transaction) {
    console.log('It is working');
});
connection.start().done(function () {
    // This is successful
    gridUpdaterHubProxy.invoke('hello', "Hello");
});
I would not do it like that. Your code consumes and holds on to an ASP.NET thread pool thread for each incoming connection, so if you have many clients it will not scale well at all. I do not know the internals of SignalR that deeply, but I'd guess that your never-ending method prevents SignalR from letting the client invoke your callbacks, because that requires the server method to end properly. Try replacing while(true) with something that exits after, say, 3 messages from the queue: you should be called back 3 times, and those calls will probably all happen together when your method exits.
If that is right, then you can move to something different, like dedicating a specific thread to consuming the queue and invoking the callbacks from there using GlobalHost.ConnectionManager.GetHubContext. Probably better, you could have a separate process consume the queue and do an HTTP POST to your web app, which in turn broadcasts to the clients.
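The dedicated-thread idea can be sketched without any SignalR or Service Bus dependency. In this minimal illustration, QueuePump is a hypothetical name, the BlockingCollection stands in for the Service Bus subscription, and the broadcast delegate stands in for a call made through the hub context returned by GlobalHost.ConnectionManager.GetHubContext; none of this is the poster's actual code.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical helper: pumps items from a queue on one dedicated background
// thread and hands each item to a broadcast callback.
public sealed class QueuePump<T> : IDisposable
{
    private readonly BlockingCollection<T> _source;
    private readonly Action<T> _broadcast;
    private readonly Thread _thread;

    public QueuePump(BlockingCollection<T> source, Action<T> broadcast)
    {
        _source = source;
        _broadcast = broadcast;
        _thread = new Thread(Run) { IsBackground = true };
        _thread.Start();
    }

    private void Run()
    {
        // GetConsumingEnumerable blocks while the queue is empty and ends
        // once CompleteAdding has been called and the queue is drained.
        foreach (T item in _source.GetConsumingEnumerable())
        {
            try
            {
                _broadcast(item);
            }
            catch (Exception ex)
            {
                // One failed broadcast should not stop the pump.
                Console.Error.WriteLine(ex);
            }
        }
    }

    public void Dispose()
    {
        // Signal the end of input, then wait for the pump thread to finish.
        _source.CompleteAdding();
        _thread.Join();
    }
}
```

The point of the design is that no hub method ever blocks: the pump is created once at application start, and hub methods stay short.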
I'm writing a test application with a SignalR server and a web client, and I want to know whether there is a way for the server to determine which transport method the client has established with it.
WebSockets keep a persistent two-way connection between the client and server, while long polling keeps a request open until the server responds and then closes the connection. Apart from losing the persistent two-way connection, is there any downside I should be aware of when the transport is not WebSockets, especially if many long-running requests will be made one after another?
I've noticed that multiple requests from a client are handled by the hub and returned as they finish. For example, I send a request to wait 10 seconds and then another request to wait 1 second; the hub responds to the 1-second request first and then to the 10-second one. I am curious whether a thread is created per request and attached to the client via the same persistent duplex connection.
Here is my example code.
class Startup
{
    public void Configuration(IAppBuilder app)
    {
        app.UseCors(CorsOptions.AllowAll);
        app.MapSignalR();
    }
}
public class RunningHub : Hub
{
    public void SendLongRunning(string name, string waitFor)
    {
        Clients.All.addMessage(name, "just requested a long running request I'll get back to you when im done");
        LongRunning(waitFor);
        Clients.All.addMessage(name, "I'm done with the long running request. which took " + waitFor + " ms");
    }
    private void LongRunning(string waitFor)
    {
        int waitTime = int.Parse(waitFor);
        Thread.Sleep(waitTime);
    }
}
jQuery sample:
$(function () {
    //Set the hubs URL for the connection
    $.connection.hub.url = "http://localhost:9090/signalr";
    // Declare a proxy to reference the hub.
    var signalHub = $.connection.runningHub;
    $('#url').append('<strong> Working With Port: ' + $.connection.hub.url + '</strong>');
    // Create a function that the hub can call to broadcast messages.
    signalHub.client.addMessage = function (name, message) {
        //handles the response the message here
    };
    // Start the connection.
    $.connection.hub.start().done(function () {
        $('#sendlongrequest').click(function() {
            signalHub.server.sendLongRunning($('#displayname').val(), $('#waitTime').val());
        });
    });
});
For ASP.NET Core:
var transportType = Context.Features.Get<IHttpTransportFeature>()?.TransportType;
Regarding the transport method:
You can inspect HubCallerContext.QueryString param transport:
public void SendLongRunning(string name, string waitFor)
{
var transport = Context.QueryString.First(p => p.Key == "transport").Value;
}
Regarding threading & long-running tasks:
Each request will be handled on a separate thread and the hub pipeline resolves the client-side promise when the hub method completes. This means that you can easily block your connection because of the connection limit in browsers (typically 6 connections at a time).
E.g.: if you use long-polling and you make six requests to the server, each triggering (or directly executing) a long-running operation, then you'll have six pending AJAX requests which only get resolved once the hub method is done, and you won't be able to make any further requests to the server until then. So you should use separate tasks for the long-running code and you should also not await those so the hub dispatcher can send its response without a delay.
If the client needs to know when the long-running task is done, then you should do a push notification from the server instead of relying on the .done() callback.
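A minimal sketch of that pattern, with no SignalR dependency: the work is started on the thread pool, the caller returns immediately, and a callback fires when the work ends. LongRunningWork and its parameters are hypothetical names; in a hub, the callback would be where the server pushes the "done" message to the client.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of SignalR.
public static class LongRunningWork
{
    // Starts the work on the thread pool and returns without waiting for it.
    // notifyDone receives null on success, or the exception if the work failed.
    public static Task Start(Action work, Action<Exception> notifyDone)
    {
        return Task.Run(() =>
        {
            Exception error = null;
            try
            {
                work();
            }
            catch (Exception ex)
            {
                error = ex;
            }
            // This is the point where a server would push its notification.
            notifyDone(error);
        });
    }
}
```

The hub method would call Start and return right away without awaiting the returned task, so the pending request is released while the work continues in the background.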
I get the following exception when a consumer is blocking to receive a message from the SharedQueue:
Unhandled Exception: System.IO.EndOfStreamException: SharedQueue closed
at RabbitMQ.Util.SharedQueue.EnsureIsOpen()
at RabbitMQ.Util.SharedQueue.Dequeue()
at Consumer.Program.Main(String[] args) in c:\Users\pdecker\Documents\Visual Studio 2012\Projects\RabbitMQTest1\Consumer\Program.cs:line 33
Here is the line of code that is being executed when the exception is thrown:
BasicDeliverEventArgs e = (BasicDeliverEventArgs)consumer.Queue.Dequeue();
So far I have seen the exception occurring when RabbitMQ is inactive. Our application needs to have the consumer always connected and listening for keystrokes. Does anyone know the cause of this problem? Does anyone know how to recover from it?
Thanks in advance.
The consumer is tied to the channel:
var consumer = new QueueingBasicConsumer(channel);
So if the channel has closed, then the consumer will not be able to fetch any additional events once the local Queue has been cleared.
Check for the channel to be open with
channel.IsOpen == true
and that the Queue has available events with
if( consumer.Queue.Count() > 0 )
before calling:
BasicDeliverEventArgs e = (BasicDeliverEventArgs)consumer.Queue.Dequeue();
To be more specific, I would check the following before calling Dequeue()
if( !channel.IsOpen || !connection.IsOpen )
{
    Your_Connection_Channel_Init_Function();
    consumer = new QueueingBasicConsumer(channel); // consumer is tied to channel
}
if( consumer.Queue.Any() )
{
    BasicDeliverEventArgs e = (BasicDeliverEventArgs)consumer.Queue.Dequeue();
}
Don't worry, this is just expected behavior: it means there are no messages left in the queue to process. Don't even try this, it is not going to work:
consumer.Queue.Any()
Just catch the EndOfStreamException:
private void ConsumeMessages(string queueName)
{
    using (IConnection conn = factory.CreateConnection())
    {
        using (IModel channel = conn.CreateModel())
        {
            var consumer = new QueueingBasicConsumer(channel);
            channel.BasicConsume(queueName, false, consumer);
            Trace.WriteLine(string.Format("Waiting for messages from: {0}", queueName));
            while (true)
            {
                BasicDeliverEventArgs ea = null;
                try
                {
                    ea = consumer.Queue.Dequeue();
                }
                catch (EndOfStreamException endOfStreamException)
                {
                    Trace.WriteLine(endOfStreamException);
                    // If you want to end listening end of queue call break;
                    break;
                }
                if (ea == null) break;
                var body = ea.Body;
                // Consume message how you want
                Thread.Sleep(300);
                channel.BasicAck(ea.DeliveryTag, false);
            }
        }
    }
}
There is another possible source of trouble: your corporate firewall.
That's because such a firewall can drop your connection to RabbitMQ when the connection has been idle for a certain amount of time.
Although a RabbitMQ connection has a heartbeat feature to prevent this, it is useless if the heartbeat fires only after the firewall's idle timeout has expired.
This is the default heartbeat interval configuration in seconds:
Default: 60 (580 prior to release 3.5.5)
From RabbitMQ:
Detecting Dead TCP Connections with Heartbeats
Introduction
Network can fail in many ways, sometimes pretty subtle (e.g. high
ratio packet loss). Disrupted TCP connections take a moderately long
time (about 11 minutes with default configuration on Linux, for
example) to be detected by the operating system. AMQP 0-9-1 offers a
heartbeat feature to ensure that the application layer promptly finds
out about disrupted connections (and also completely unresponsive
peers).
Heartbeats also defend against certain network equipment which
may terminate "idle" TCP connections.
That happened to us and we solved the problem by decreasing the Heartbeat Timeout Interval in the global configuration:
In your rabbitmq.config, find the heartbeat setting and set it to a value smaller than that of your firewall rule.
You can change the interval in your client, too:
Enabling Heartbeats with Java Client
To configure the heartbeat timeout in the Java client, set it with
ConnectionFactory#setRequestedHeartbeat before creating a connection:
ConnectionFactory cf = new ConnectionFactory();
// set the heartbeat timeout to 60 seconds
cf.setRequestedHeartbeat(60);
Enabling Heartbeats with the .NET Client
To configure the heartbeat timeout in the .NET client, set it with
ConnectionFactory.RequestedHeartbeat before creating a connection:
var cf = new ConnectionFactory();
//set the heartbeat timeout to 60 seconds
cf.RequestedHeartbeat = 60;
The answers here that say this is the expected behavior are correct; however, I would argue that it's bad design to have it throw an exception like this.
From the documentation: "Callers of Dequeue() will block if no items are available until some other thread calls Enqueue() or the queue is closed. In the latter case this method will throw EndOfStreamException."
So, as GlenH7 said, you have to check that the channel is open before calling Dequeue() (IModel.IsOpen).
However, what if the channel closes while Dequeue() is blocking? I think it's best to call Queue.DequeueNoWait(null), and block the thread yourself by waiting for it to return something that isn't null. So, something like:
while(channel.IsOpen)
{
    var args = consumer.Queue.DequeueNoWait(null);
    if(args == null)
    {
        Thread.Sleep(10); // nothing available yet: pause briefly instead of spinning
        continue;
    }
    //...
}
This way, it won't throw that exception.
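The same idea, polling with a bounded wait and re-checking an "is open" flag between waits, can be sketched independently of RabbitMQ. Here BlockingCollection.TryTake with a timeout is swapped in for the RabbitMQ queue, and PollingConsumer, isOpen, and handle are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper; BlockingCollection stands in for the consumer's queue.
public static class PollingConsumer
{
    // Takes items until isOpen() returns false. Each wait is bounded by
    // pollTimeout, so a closed channel is noticed within roughly one timeout
    // instead of the loop blocking indefinitely.
    public static int Consume<T>(BlockingCollection<T> queue, Func<bool> isOpen,
                                 TimeSpan pollTimeout, Action<T> handle)
    {
        int handled = 0;
        while (isOpen())
        {
            T item;
            if (!queue.TryTake(out item, pollTimeout))
                continue; // timed out: loop around and re-check isOpen()
            handle(item);
            handled++;
        }
        return handled;
    }
}
```

A bounded wait avoids both the exception from a blocking dequeue and the CPU cost of a tight polling loop. Whether the RabbitMQ client version in use offers a comparable timed dequeue is something to verify against its documentation.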