Big Delay In Consuming NATS Messages

We have been using NATS in production for a couple of years with a single server and around 1,500 consumers using the NATS.net client. We have only now started analyzing performance in detail, and we regularly see big delays in consuming messages on the consumer.
To keep things simple, we have a ping-pong style message that is generated on a consumer and sent via NATS to a central server, which processes it and sends a reply back. Both messages carry timestamps, and the reply identifies the message it is replying to.
We see no issues at all between the consumer and the central server: the server seems to receive every message promptly. At times, however, there can be delays of several minutes before the reply message is consumed by the consumer.
To be clear, we have 2 separate NATS connections, one for each direction of the flow.
This is the code where we consume from the subscription:
var thread = new Thread(() =>
{
    using (_subscription = _queueGroup == null ? NATSConnection.Connection.SubscribeSync(_subject) : NATSConnection.Connection.SubscribeSync(_subject, _queueGroup))
    {
        Connection.RaiseSubscriberConnected();
        while (_active)
        {
            try
            {
                var nextMessage = _subscription.NextMessage();
                if (nextMessage != null)
                {
                    Log.Debug("Subscriber Message Received");
                    using (var stream = new MemoryStream(nextMessage.Data))
                    {
                        NewSubscriptionItem.Invoke(Envelope.Parser.ParseFrom(stream));
                    }
                }
            }
            catch (Exception ex)
            {
                Connection.RaiseException(ex);
            }
        }
    }
})
{
    IsBackground = true
};
thread.Start();
}
The Log.Debug("Subscriber Message Received"); line does not get hit at all during the period where we are missing replies. Then, after a period of time, all the outstanding messages come in at once... it's as if there is a 'blockage' that gets cleared.
The machine(s) that the consumers are running on do have a lot going on, but CPU usage never exceeds around 50%.
Any pointers as to what to check next would be much appreciated!

Related

.NET IBM MQ Listener unacknowledged message and reading from the beginning of the queue

I have a C# application that sets up numerous MQ listeners (multiple threads and potentially multiple servers each with their own listeners). There are some messages that will come off the queue that I will want to leave on the queue, move on to the next message on the MQ, but then under some circumstances I will want to go back to re-read those messages...
var connectionFactory = XMSFactoryFactory.GetInstance(XMSC.CT_WMQ).CreateConnectionFactory();
connectionFactory.SetStringProperty(XMSC.WMQ_HOST_NAME, origination.Server);
connectionFactory.SetIntProperty(XMSC.WMQ_PORT, int.Parse(origination.Port));
connectionFactory.SetStringProperty(XMSC.WMQ_QUEUE_MANAGER, origination.QueueManager);
connectionFactory.SetStringProperty(XMSC.WMQ_CHANNEL, origination.Channel);
var connection = connectionFactory.CreateConnection(null, null);
_connections.Add(connection);
var session = connection.CreateSession(false, AcknowledgeMode.ClientAcknowledge); //changed to use ClientAcknowledge so that we will leave the message on the MQ until we're sure we're processing it
_sessions.Add(session);
var destination = session.CreateQueue(origination.Queue);
_destinations.Add(destination);
var consumer = session.CreateConsumer(destination);
_consumers.Add(consumer);
Logging.LogDebugMessage(Constants.ListenerStart);
connection.Start();
ThreadPool.QueueUserWorkItem((o) => Receive(forOrigination, consumer));
Then I have...
if (OnMQMessageReceived != null)
{
    var message = consumer.Receive();
    var identifier = string.Empty;
    if (message is ITextMessage)
    {
        //do stuff with the message here
        //populates identifier from the message
    }
    else
    {
        //do stuff with the message here
        //populates identifier from the message
    }
    if (!string.IsNullOrWhiteSpace(identifier) && OnMQMessageReceived != null)
    {
        if (/* some check to see if we should process the message now */)
        {
            //process message here
            message.Acknowledge(); //this really pulls it off of the MQ
            //here is where I want to trigger the next read to be from the beginning of the MQ
        }
        else
        {
            //We actually want to do nothing here. As in do not do Acknowledge
            //This leaves the message on the MQ and we'll pick it up again later
            //But we want to move on to the next message in the MQ
        }
    }
    else
    {
        message.Acknowledge(); //this really pulls it off of the MQ...it's useless to us anyway
    }
}
else
{
    Thread.Sleep(0);
}
ThreadPool.QueueUserWorkItem((o) => Receive(forOrigination, consumer));
So a couple of questions:
If I do not acknowledge the message it stays on the MQ, right?
If the message is not acknowledged then by default when I read from the MQ again with the same listener it reads the next one and does not go to the beginning, right?
How do I change the listener so that the next time I read I start at the beginning of the queue?

Leaving messages on a queue is an anti-pattern. If you don't want to or cannot process the message at a certain point of your logic, then you have a number of choices:
Get it off the queue and put to another queue/topic for a delayed/different processing.
Get it off the queue and dump to a database, flat file - whatever, if you want to process it outside of messaging flow, or don't want to process at all.
If it is feasible, you may want to change the message producer so it doesn't mix the messages with different processing requirements in the same queue/topic.
In any case, do not leave a message on the queue, and always move forward to the next message. This will make the application way more predictable and easier to reason about. You will also avoid all kinds of performance problems. If your application is or may ever become sensitive to the sequence of message delivery, then manual acknowledgement of selected messages will be at odds with it too.
To your questions:
The JMS spec is vague regarding the behavior of unacknowledged messages: they may be delivered out of order, and it is undefined exactly when they will be delivered. Also, the acknowledge method call will acknowledge all previously received and unacknowledged messages, which is probably not what you had in mind.
If you leave messages behind, the listener may or may not go back to them immediately. If you restart it, it will of course start afresh, but while it is sitting there waiting for messages the behavior is implementation dependent.
So if you try to make your design work, you may get it to kind of work under certain circumstances, but it will not be predictable or reliable.
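The first option in this answer (take the message off the queue and park it elsewhere) can be sketched without any MQ dependency. This is only an illustration under stated assumptions: `DeferringConsumer`, `Drain`, and the `shouldProcessNow` predicate are hypothetical names, and the two in-memory `Queue<string>` instances stand in for the real source queue and a second "deferred" destination.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: in-memory stand-ins for the main queue and a "deferred" queue.
public static class DeferringConsumer
{
    // Reads every message exactly once, in order. Messages that cannot be
    // processed yet are moved to the deferred queue instead of being left behind.
    public static List<string> Drain(
        Queue<string> main,
        Queue<string> deferred,
        Func<string, bool> shouldProcessNow)
    {
        var processed = new List<string>();
        while (main.Count > 0)
        {
            string message = main.Dequeue();   // always consume and move forward
            if (shouldProcessNow(message))
            {
                processed.Add(message);        // stands in for real processing
            }
            else
            {
                deferred.Enqueue(message);     // park it for later handling
            }
        }
        return processed;
    }
}
```

With this shape the reader never has to "go back to the beginning" of the main queue; anything postponed is handled by whatever consumes the deferred queue.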

RabbitMQ throws Shared Queue closed error

We have been using RabbitMQ as the messaging service in our project. We push a message into a queue; the message consumer receives it and tries to make an entry in the database. Once the values are entered into the database we send a positive acknowledgement back to the RabbitMQ server; if not, we send a negative acknowledgement.
I have created the message consumer as a Windows service. The message is received successfully by the message consumer (the entry is made in the table), but an exception is logged: "Shared Queue closed".
Please find the code block below.
while (true)
{
    try
    {
        if (!Connection.IsOpen || !Channel.IsOpen)
        {
            CreateConnection(existConnectionConfig, QueueName);
            consumer = new QueueingBasicConsumer(Channel);
            consumerTag = Channel.BasicConsume(QueueName, false, consumer);
        }
        BasicDeliverEventArgs e = (BasicDeliverEventArgs)consumer.Queue.Dequeue();
        IBasicProperties props = e.BasicProperties;
        byte[] body = e.Body;
        bool ack = onMessageReceived(body);
        if (ack == true)
        {
            Channel.BasicAck(e.DeliveryTag, false);
        }
        else
            Channel.BasicNack(e.DeliveryTag, false, true);
    }
    catch (Exception ex)
    {
        //Logged the exception in text file where i could see the
        //message as "Shared queue closed"
    }
}
I have searched the net too but couldn't find out what the problem is. It would be helpful if anyone could help me out.
Thanks in advance,
Selva

In answer to your question, I have experienced the same problem when my web client reset the connection because of app pool recycling, or when the connection was dropped for some other underlying reason outside my control. You may need to build in a retry mechanism to cope with this.
You might want to look at MassTransit. I have used this with RabbitMQ and it makes things a lot easier by effectively providing a management layer to RabbitMQ. MassTransit takes away the headache of retry mechanisms - see Connection management. It also provides a nice multi threaded concurrent consumer configuration.
This has the bonus of your implementation being more portable - you could easily change things to MSMQ should the requirement come along.
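If you prefer to hand-roll the retry mechanism mentioned above rather than adopt a library, a minimal sketch might look like the following. `Retry.WithBackoff` is a hypothetical helper, not part of the RabbitMQ client or MassTransit; the delegate you pass in would be your own "reconnect and consume" routine, and the doubling delay is just one possible policy.

```csharp
using System;
using System.Threading;

// Hypothetical helper: retries an operation with an exponentially growing delay.
public static class Retry
{
    // Runs 'action' until it succeeds or 'maxAttempts' is reached.
    // The exception from the final attempt is allowed to propagate.
    public static T WithBackoff<T>(Func<T> action, int maxAttempts, TimeSpan initialDelay)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException(nameof(maxAttempts));
        TimeSpan delay = initialDelay;
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return action();
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                Thread.Sleep(delay);                          // wait before retrying
                delay = TimeSpan.FromTicks(delay.Ticks * 2);  // double the delay each time
            }
        }
    }
}
```

In a real service you would probably catch only connection-related exceptions and cap the maximum delay; both are left out here to keep the sketch short.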

Odd Behavior of Azure Service Bus ReceiveBatch()

I am currently working with an Azure Service Bus Topic and running into an issue receiving my messages using the ReceiveBatch method. The results I get are not the results I expect. Here is the basic code setup; the use cases are below:
SubscriptionClient client = SubscriptionClient.CreateFromConnectionString(connectionString, convoTopic, subName);
IEnumerable<BrokeredMessage> messageList = client.ReceiveBatch(100);
foreach (BrokeredMessage message in messageList)
{
    try
    {
        Console.WriteLine(message.GetBody<string>() + message.MessageId);
        message.Complete();
    }
    catch (Exception ex)
    {
        message.Abandon();
    }
}
client.Close();
MessageBox.Show("Done");
1. Using the above code, if I send 4 messages and then poll, on the first run through I get only the first message. On the second run through I get the other 3. I'm expecting to get all 4 at the same time. It seems to always return a single message on the first poll and the rest on subsequent polls (same result with 3 and 5 messages: I get 1 message on the first try and the remaining n-1 on the second try).
2. If I have 0 messages to receive, the operation takes roughly 30-60 seconds to return the messageList (which has a count of 0). I need this to return instantly.
3. If I change the code to IEnumerable<BrokeredMessage> messageList = client.ReceiveBatch(100, new TimeSpan(0,0,0)); then issue #2 goes away, but issue #1 still persists: I have to call the code twice to get all the messages.
I'm assuming that issue #2 is caused by a default timeout value, which I override in #3 (though I find it confusing that if a message is there it responds immediately without waiting the default time). I am not sure, however, why I never receive the full number of messages in a single ReceiveBatch.

The way I got ReceiveBatch() to work properly was to do two things.
Disable Partitioning in the Topic (I had to make a new topic for this because you can't toggle that after creation)
Enable Batching on each subscription created like so:
SubscriptionDescription sd = new SubscriptionDescription(topicName, orgSubName);
sd.EnableBatchedOperations = true;
After I did those two things, I was able to get the topics to work as intended using IEnumerable<BrokeredMessage> messageList = client.ReceiveBatch(100, new TimeSpan(0,0,0));

I'm having a similar problem with an ASB Queue. I discovered that I could mitigate it somewhat by increasing the PrefetchCount on the client prior to receiving the batch:
SubscriptionClient client = SubscriptionClient.CreateFromConnectionString(connectionString, convoTopic, subName);
client.PrefetchCount = 100;
IEnumerable<BrokeredMessage> messageList = client.ReceiveBatch(100);
From the Azure Service Bus Best Practices for Performance Improvements Using Service Bus Brokered Messaging:
Prefetching enables the queue or subscription client to load additional messages from the service when it performs a receive operation.
...
When using the default lock expiration of 60 seconds, a good value for
SubscriptionClient.PrefetchCount is 20 times the maximum processing rates of all receivers of the factory. For example, a factory creates 3 receivers, and each receiver can process up to 10 messages per second. The prefetch count should not exceed 20*3*10 = 600.
...
Prefetching messages increases the overall throughput for a queue or subscription because it reduces the overall number of message operations, or round trips. Fetching the first message, however, will take longer (due to the increased message size). Receiving prefetched messages will be faster because these messages have already been downloaded by the client.
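The sizing rule quoted above is simple arithmetic, and it can be written down as a small helper. `PrefetchSizing` and `MaxPrefetchCount` are hypothetical names used only for illustration; the factor of 20 is taken from the quoted guidance and applies to the default 60-second lock expiration.

```csharp
using System;

// Hypothetical helper that encodes the quoted rule of thumb.
public static class PrefetchSizing
{
    // Upper bound for PrefetchCount:
    // 20 x (number of receivers) x (messages per second per receiver).
    public static int MaxPrefetchCount(int receivers, int messagesPerSecondPerReceiver)
    {
        if (receivers < 0) throw new ArgumentOutOfRangeException(nameof(receivers));
        if (messagesPerSecondPerReceiver < 0) throw new ArgumentOutOfRangeException(nameof(messagesPerSecondPerReceiver));
        return checked(20 * receivers * messagesPerSecondPerReceiver);
    }
}
```

For the example in the quote, `PrefetchSizing.MaxPrefetchCount(3, 10)` gives 600.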

Just a few more pieces to the puzzle. I still couldn't get it to work even after Enable Batching and Disable Partitioning - I still had to do two ReceiveBatch calls. I did find however:
Restarting the Service Bus services (I am using Service Bus for Windows Server) cleared up the issue for me.
Doing a single ReceiveBatch and taking no action (letting the message locks expire) and then doing another ReceiveBatch caused all of the messages to come through at the same time. (Doing an initial ReceiveBatch and calling Abandon on all of the messages didn't cause that behavior.)
So it appears to be some sort of corruption/bug in Service Bus's in-memory cache.

RabbitMQ c# System.IO.EndOfStreamException

I get the following exception when a consumer is blocking to receive a message from the SharedQueue:
Unhandled Exception: System.IO.EndOfStreamException: SharedQueue closed
at RabbitMQ.Util.SharedQueue.EnsureIsOpen()
at RabbitMQ.Util.SharedQueue.Dequeue()
at Consumer.Program.Main(String[] args) in c:\Users\pdecker\Documents\Visual Studio 2012\Projects\RabbitMQTest1\Consumer\Program.cs:line 33
Here is the line of code that is being executed when the exception is thrown:
BasicDeliverEventArgs e = (BasicDeliverEventArgs)consumer.Queue.Dequeue();
So far I have seen the exception occurring when RabbitMQ is inactive. Our application needs to have the consumer always connected and listening for keystrokes. Does anyone know the cause of this problem? Does anyone know how to recover from it?
Thanks in advance.

The consumer is tied to the channel:
var consumer = new QueueingBasicConsumer(channel);
So if the channel has closed, then the consumer will not be able to fetch any additional events once the local Queue has been cleared.
Check for the channel to be open with
channel.IsOpen == true
and that the Queue has available events with
if( consumer.Queue.Count() > 0 )
before calling:
BasicDeliverEventArgs e = (BasicDeliverEventArgs)consumer.Queue.Dequeue();
To be more specific, I would check the following before calling Dequeue()
if( !channel.IsOpen || !connection.IsOpen )
{
    Your_Connection_Channel_Init_Function();
    consumer = new QueueingBasicConsumer(channel); // consumer is tied to channel
}
if( consumer.Queue.Any() )
{
    // A declaration cannot be the sole statement of an if, so it is wrapped in a block
    BasicDeliverEventArgs e = (BasicDeliverEventArgs)consumer.Queue.Dequeue();
}

Don't worry, this is just expected behavior; it means there are no messages left in the queue to process. Don't even try the following check, as it is not going to work:
consumer.Queue.Any()
Just catch the EndOfStreamException instead:
private void ConsumeMessages(string queueName)
{
    using (IConnection conn = factory.CreateConnection())
    {
        using (IModel channel = conn.CreateModel())
        {
            var consumer = new QueueingBasicConsumer(channel);
            channel.BasicConsume(queueName, false, consumer);
            Trace.WriteLine(string.Format("Waiting for messages from: {0}", queueName));
            while (true)
            {
                BasicDeliverEventArgs ea = null;
                try
                {
                    ea = consumer.Queue.Dequeue();
                }
                catch (EndOfStreamException endOfStreamException)
                {
                    Trace.WriteLine(endOfStreamException);
                    // If you want to stop listening when the queue ends, call break;
                    break;
                }
                if (ea == null) break;
                var body = ea.Body;
                // Consume message how you want
                Thread.Sleep(300);
                channel.BasicAck(ea.DeliveryTag, false);
            }
        }
    }
}

There is another possible source of trouble: your corporate firewall.
That's because such a firewall can drop your connection to RabbitMQ when the connection has been idle for a certain amount of time.
Although a RabbitMQ connection has a heartbeat feature to prevent this, the heartbeat is useless if its interval is longer than the firewall's idle connection timeout.
This is the default heartbeat interval configuration in seconds:
Default: 60 (580 prior to release 3.5.5)
From RabbitMQ:
Detecting Dead TCP Connections with Heartbeats
Introduction
Network can fail in many ways, sometimes pretty subtle (e.g. high
ratio packet loss). Disrupted TCP connections take a moderately long
time (about 11 minutes with default configuration on Linux, for
example) to be detected by the operating system. AMQP 0-9-1 offers a
heartbeat feature to ensure that the application layer promptly finds
out about disrupted connections (and also completely unresponsive
peers).
Heartbeats also defend against certain network equipment which
may terminate "idle" TCP connections.
That happened to us and we solved the problem by decreasing the Heartbeat Timeout Interval in the global configuration:
In your rabbitmq.config, find the heartbeat and set it to a value smaller than that of your firewall rule.
You can change the interval in your client, too:
Enabling Heartbeats with Java Client
To configure the heartbeat timeout in the Java client, set it with ConnectionFactory#setRequestedHeartbeat before creating a connection:
ConnectionFactory cf = new ConnectionFactory();
// set the heartbeat timeout to 60 seconds
cf.setRequestedHeartbeat(60);
Enabling Heartbeats with the .NET Client
To configure the heartbeat timeout in the .NET client, set it with ConnectionFactory.RequestedHeartbeat before creating a connection:
var cf = new ConnectionFactory();
//set the heartbeat timeout to 60 seconds
cf.RequestedHeartbeat = 60;

The answers here that say that this is the expected behavior are correct, however I would argue that it's bad to have it throw an exception by design like this.
from the documentation: "Callers of Dequeue() will block if no items are available until some other thread calls Enqueue() or the queue is closed. In the latter case this method will throw EndOfStreamException."
So, like GlenH7 said, you have to check that channel is open before calling Dequeue() (IModel.IsOpen).
However, what if the channel closes while Dequeue() is blocking? I think it's best to call Queue.DequeueNoWait(null), and block the thread yourself by waiting for it to return something that isn't null. So, something like:
while(channel.IsOpen)
{
    var args = consumer.Queue.DequeueNoWait(null);
    if(args == null) continue;
    //...
}
This way, it won't throw that exception.
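One caveat with the loop above is that it polls without pausing, so it can keep a CPU core busy while the queue is empty. A variation on the same idea is to wait for each item with a short timeout and re-check the "open" condition after every timeout. The sketch below shows that pattern using `BlockingCollection<T>` from the .NET base class library as a stand-in for the RabbitMQ client's queue; `PollingConsumer`, `ConsumeWhileOpen`, and the `isOpen` delegate are hypothetical names, with `isOpen` playing the role of `channel.IsOpen`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical sketch: bounded-wait polling instead of a tight spin loop.
public static class PollingConsumer
{
    // Takes items for as long as 'isOpen' returns true, waiting up to
    // 'pollInterval' for each item. A timeout does not throw; the loop
    // simply re-checks 'isOpen' and tries again.
    public static List<T> ConsumeWhileOpen<T>(
        BlockingCollection<T> queue,
        Func<bool> isOpen,
        TimeSpan pollInterval)
    {
        var received = new List<T>();
        while (isOpen())
        {
            if (queue.TryTake(out T item, pollInterval))
            {
                received.Add(item); // stands in for real message handling
            }
        }
        return received;
    }
}
```

The poll interval bounds how long it takes to notice that the channel has closed, so it trades shutdown latency against idle wake-ups.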

Reading from MSMQ slows down when there is a lot of messages queued

Short introduction
I have a SEDA based system, and used MSMQ for communication (event triggering) between the different applications/services.
One of these services gets messages by file, so I have a file listener that reads the file content and inserts this into a queue (or actually 4 different queues, but that's not very important for the first question).
Server is Windows Server 2008
First question - read slows down
My application that reads these messages at the other side normally reads about 20 messages per second from the queue, but when the service that posts messages starts queuing some thousand messages, the read rate drops and the reading application only reads 2-4 messages per second. When there is no posting to the queue, the reading application can again read up to 20 messages per second.
The code in the reading application is pretty simple. It is developed in C#, and I use the Receive(TimeSpan timeout) method in System.Messaging.
Q: Why does the read slow down when a lot of messages are being posted to the queue?
Second question - limitations of TPS
An additional question is about the read itself. It seems to make no difference to how many messages I can read per second whether I use 1 or 5 threads to read from the queue. I've also tried implementing a "round robin solution" where the posting service posts to a random one of a set of 4 queues and the reading application has one thread listening to each of these queues, but I still get only 20 TPS whether I read from 1 queue with 1 thread, 1 queue with 4 threads, or 4 queues (with one thread per queue).
I know the processing in the thread takes about 50 ms, so 20 TPS is about right if only one message is processed at a time, but the point of multithreading should be that messages are handled in parallel rather than sequentially.
There are about 110 different queues on the server.
Q: Why can't I get more than 20 messages per second out of my queue, even with multithreading and the use of several queues?
This is the code running today:
// There are 4 BackgroundWorkers running this function
void bw_DoWork(object sender, DoWorkEventArgs e)
{
    using(var mq = new MessageQueue(".\\content"))
    {
        mq.Formatter = new BinaryMessageFormatter();
        // ShouldIRun is a bool set to false by OnStop()
        while(ShouldIRun)
        {
            try
            {
                using(var msg = mq.Receive(new TimeSpan(0,0,2)))
                {
                    ProcessMessageBody(msg.Body); // This takes 50 ms to complete
                }
            }
            catch(MessageQueueException mqe)
            {
                // This occurs every time TimeSpan in Receive() is reached
                if(mqe.MessageQueueErrorCode == MessageQueueErrorCode.IOTimeout)
                    continue;
            }
        }
    }
}
But even with 4 threads, it seems they all wait for the function to reach the "Receive" point again. I've also tried using 4 different queues (content1, content2, content3 and content4), but I still get 1 message processed every 50 ms.
Does this have anything to do with the TimeSpan in Receive(), and/or is it possible to omit it?
Another question is whether using private queues instead of public ones will solve anything.

Performance issues.
You don't mention if all the code is running on the server or if you have clients remotely accessing the queues on the server. From the speed, I'll assume the latter.
Also, are the queues transactional?
How large are the messages?
If you want to read a message from a queue, your application does not connect to the queue itself. Everything goes between the local queue manager and the remote queue manager. The queue manager is the only process that writes to, and reads from queues. Therefore having multiple queues or a single queue won't necessarily perform any differently.
The MSMQ queue manager is therefore going to be a bottleneck at some point as there is only so much work it can do at the same time. Your first question shows this - when you put a high load on the queue manager putting messages IN, your ability to take messages OUT slows down. I'd recommend looking at performance monitor to see if MQSVC.EXE is maxed out, for example.
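One way to separate the two possible causes (the reading code versus the queue manager) is to run the same worker pattern against an in-memory queue and see whether it scales with the number of threads. The sketch below is only a rough probe under stated assumptions: `ThroughputProbe` and `Measure` are hypothetical names, `Thread.Sleep` stands in for the 50 ms `ProcessMessageBody` call, and a `ConcurrentQueue<int>` replaces MSMQ entirely.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;

// Hypothetical probe: measures how long N workers take to drain an in-memory queue.
public static class ThroughputProbe
{
    // Drains 'messageCount' dummy messages using 'workers' dedicated threads,
    // each message costing 'workPerMessage' of simulated processing.
    // Returns the total elapsed wall-clock time.
    public static TimeSpan Measure(int messageCount, int workers, TimeSpan workPerMessage)
    {
        var queue = new ConcurrentQueue<int>();
        for (int i = 0; i < messageCount; i++) queue.Enqueue(i);

        var threads = new Thread[workers];
        var stopwatch = Stopwatch.StartNew();
        for (int w = 0; w < workers; w++)
        {
            threads[w] = new Thread(() =>
            {
                // Each worker keeps taking messages until the queue is empty.
                while (queue.TryDequeue(out _))
                {
                    Thread.Sleep(workPerMessage); // stands in for ProcessMessageBody
                }
            });
            threads[w].Start();
        }
        foreach (var t in threads) t.Join();
        return stopwatch.Elapsed;
    }
}
```

If the in-memory version gets noticeably faster with 4 workers than with 1 while the MSMQ version does not, that points away from the worker code and towards the queue manager or something shared inside the processing step.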

Why are you using a TimeSpan? That is a bad thing, and here is why.
When developing services and queues you need to program in a thread-safe manner. Each item in the queue will spawn a new thread. Using a TimeSpan forces each of the threads to use a single timer event thread, and these events have to wait for their turn at the event thread.
The norm is 1 thread per queue event. This is generally your System.Messaging.ReceiveCompletedEventArgs event. Another thread is your OnStart event...
20 threads or 20 reads per second is probably correct. Generally when thread pooling you can only spawn 36 threads at a time in .NET.
My advice is to drop the timer event and make your queue simply process the data.
Do something more like this:
using System;
using System.Configuration;
using System.ServiceProcess;
using System.Threading;

namespace MessageService
{
    public partial class MessageService : ServiceBase
    {
        public MessageService()
        {
            InitializeComponent();
        }

        private string MessageDirectory = ConfigurationManager.AppSettings["MessageDirectory"];
        private string MessageQueue = ConfigurationManager.AppSettings["MessageQueue"];
        private System.Messaging.MessageQueue messageQueue = null;
        private ManualResetEvent manualResetEvent = new ManualResetEvent(true);

        protected override void OnStart(string[] args)
        {
            // Create directories if needed
            if (!System.IO.Directory.Exists(MessageDirectory))
                System.IO.Directory.CreateDirectory(MessageDirectory);
            // Create new message queue instance
            messageQueue = new System.Messaging.MessageQueue(MessageQueue);
            try
            {
                // Set formatter to allow ASCII text
                messageQueue.Formatter = new System.Messaging.ActiveXMessageFormatter();
                // Assign event handler when message is received
                messageQueue.ReceiveCompleted +=
                    new System.Messaging.ReceiveCompletedEventHandler(messageQueue_ReceiveCompleted);
                // Start listening
                messageQueue.BeginReceive();
            }
            catch (Exception e)
            {
            }
        }

        protected override void OnStop()
        {
            //Make process synchronous before closing the queue
            manualResetEvent.WaitOne();
            // Clean up
            if (this.messageQueue != null)
            {
                this.messageQueue.Close();
                this.messageQueue = null;
            }
        }

        public void messageQueue_ReceiveCompleted(object sender, System.Messaging.ReceiveCompletedEventArgs e)
        {
            manualResetEvent.Reset();
            System.Messaging.Message completeMessage = null;
            System.IO.FileStream fileStream = null;
            System.IO.StreamWriter streamWriter = null;
            string fileName = null;
            byte[] bytes = new byte[2500000];
            string xmlstr = string.Empty;
            try
            {
                // Receive the message
                completeMessage = this.messageQueue.EndReceive(e.AsyncResult);
                completeMessage.BodyStream.Read(bytes, 0, bytes.Length);
                System.Text.ASCIIEncoding ascii = new System.Text.ASCIIEncoding();
                long len = completeMessage.BodyStream.Length;
                int intlen = Convert.ToInt32(len);
                xmlstr = ascii.GetString(bytes, 0, intlen);
            }
            catch (Exception ex0)
            {
                //Error converting message to string
            }
        }
    }
}
