Windows Services communicating via MSMQs - Do I need a service bus? - c#

I have a system made up of nodes (Windows services): some push messages to be processed, and others pull messages and process them.
The push nodes balance the load between queues by maintaining a round-robin list of queues and rotating after each send, so message 1 goes to queue 1, message 2 to queue 2, and so on. This part has been working great so far.
On the pull end we designed it so that messages are retrieved in a similar way: first from queue 1, then from queue 2, etc. In theory each pull node sits on a different machine, and in practice, so far, each has only listened on a single queue. But a recent requirement made us put a pull node on a machine that listens to more than one queue: one that is typically extremely busy and filled with millions of messages, and one that generally contains only a handful of messages.
The problem is that, as originally architected, a pull node goes from queue to queue until a message is found. If a receive times out (say after a second), it moves on to the next queue.
This doesn't work anymore, because Q1 (filled with millions of messages) is delayed by approximately a second per message: after each pull from Q1 we ask Q2 for a message, and if Q2 doesn't contain any we wait for a second.
So it goes like this:
Q1 contains 10 messages and Q2 contains none
Pull node asks for a message from Q1
Q1 returns message immediately
Pull node asks for a message from Q2
------------ Waiting for a second ------------- (Q2 is empty and request times out)
Pull node asks for a message from Q1
Q1 returns message immediately
Pull node asks for a message from Q2
------------ Waiting for a second ------------- (Q2 is empty and request times out)
etc.
So this is clearly wrong.
I guess I am looking for the best architectural solution here. Message processing does not need to be as real-time as possible but needs to be robust and no message should ever be lost!
I would like to hear your views on this problem.
Thanks in advance
Yannis

Maybe you could use the ReceiveCompleted event in the MessageQueue class? No need to poll then.

I ended up creating a set of threads - one for each msmq that needs to be processed. In the constructor I initialize those threads:
Storages.ForEach(queue =>
{
    Task task = Task.Factory.StartNew(() =>
    {
        LoggingManager.LogInfo("Starting a local thread to read in mime messages from queue " + queue.Name, this.GetType());
        while (true)
        {
            WorkItem mime = queue.WaitAndRetrieve();
            if (mime != null)
            {
                _Semaphore.WaitOne();
                _LocalStorage.Enqueue(mime);
                lock (_locker) Monitor.Pulse(_locker);
                LoggingManager.LogDebug("Adding no. " + _LocalStorage.Count + " item in queue", this.GetType());
            }
        }
    });
});
The _LocalStorage is a thread-safe queue (ConcurrentQueue, introduced in .NET 4.0).
The Semaphore is a counting semaphore that controls inserts into the _LocalStorage. The _LocalStorage is basically a buffer of received messages, but we don't want it to get too large while the processing nodes are busy doing work. Otherwise we could end up retrieving ALL the MSMQ messages into _LocalStorage while only 5 or so are actually being processed. This is bad both for resilience (if the program terminates unexpectedly we lose all these messages) and for performance, as holding all these items in memory would consume a huge amount of it. So we need to control how many items we hold in the _LocalStorage buffer queue.
We notify threads waiting for work (see below) that a new item was added to the queue with a simple Monitor.Pulse.
The code that dequeues work items from the queue is as follows:
lock (_locker)
{
    if (_LocalStorage.Count == 0)
        Monitor.Wait(_locker);
}
WorkItem result;
if (_LocalStorage.TryDequeue(out result))
{
    _Semaphore.Release();
    return result;
}
return null;
I hope this can help someone to sort out a similar issue.
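For comparison, the same "one reader per queue feeding a size-limited buffer" idea can be sketched with BlockingCollection, which combines the thread-safe queue, the capacity limit, and the wake-up signalling in one type. This is only a minimal sketch under assumptions: the sources are plain in-memory sequences standing in for MSMQ queues, and the names BoundedBufferSketch and Run are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BoundedBufferSketch
{
    // Drains every source into one bounded buffer and returns what the consumer saw.
    // Each source is a stand-in for one MSMQ queue (assumption for illustration).
    public static List<string> Run(IEnumerable<IEnumerable<string>> sources, int capacity)
    {
        var processed = new List<string>();
        using (var buffer = new BlockingCollection<string>(capacity))
        {
            // One reader task per source, like one thread per queue.
            Task[] readers = sources
                .Select(source => Task.Run(() =>
                {
                    foreach (string item in source)
                        buffer.Add(item); // blocks while the buffer is at capacity
                }))
                .ToArray();

            // Single consumer: blocks while the buffer is empty,
            // and stops once CompleteAdding has been called and the buffer is drained.
            Task consumer = Task.Run(() =>
            {
                foreach (string item in buffer.GetConsumingEnumerable())
                    processed.Add(item);
            });

            Task.WaitAll(readers);
            buffer.CompleteAdding();
            consumer.Wait();
        }
        return processed;
    }
}
```

As with the semaphore version, anything sitting in the buffer has already left its queue, so a small capacity keeps the number of messages at risk low.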

Related

Execution of parallel tasks while waiting for selected ones in c#

I have built a MQTT client that listens for certain status data. For each message I run a method which can take a while (up to 1 second). Since a lot of messages can arrive at once, I want to run the whole thing in parallel. My problem now is, when I receive a message belonging to topic A, I want to make sure that the previous task belonging to topic A has already finished before I start the new one. But I also need to be able to receive new messages during the time I am waiting for Task A to finish and add them to the queue if necessary. Of course, if the new message belongs to topic B, I don't care about the status of task A and I can run this method call in parallel.
In my mind, this is solved with a kind of dictionary that has different queues.
What about using a lock on an object related to the topic?
When a new item comes into the system you could retrieve or create a lock object from a ConcurrentDictionary and then use this object to lock the execution.
Something like this:
static ConcurrentDictionary<string, object> _locksByCategory =
    new ConcurrentDictionary<string, object>();
void ProcessItem(ItemType item) {
    // GetOrAdd returns the lock object already stored for this category,
    // or stores and returns a new one if none exists yet.
    var lockObject = _locksByCategory.GetOrAdd(item.Category, _ => new object());
    lock (lockObject) {
        // your code
    }
}
This isn't a production-ready solution, but it could help to start with.
I don't know exactly how you would do it, but it goes along the lines of:
On startup, create a (static? singleton?) Dictionary<Topic, ConcurrentQueue> and for each topic create a thread that does the following:
Wrap the ConcurrentQueue in a BlockingCollection
infinitely loop with BlockingCollection.Take at the start of the loop. This should block until an item is ready, execute the rest of the loop and listen for more items afterwards.
Whenever a message comes in, add it to the corresponding ConcurrentQueue.
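The per-topic queue idea above could be sketched roughly as follows. It is a minimal sketch under assumptions: topics and payloads are plain strings, the TopicDispatcher class and its members are hypothetical names, and workers are created lazily on the first message for a topic rather than on startup.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical helper: messages for the same topic are handled one at a time,
// in arrival order, while different topics are handled in parallel.
public sealed class TopicDispatcher
{
    // Lazy ensures only one queue (and one worker) is ever created per topic,
    // even if two threads enqueue the first message for a topic at the same time.
    private readonly ConcurrentDictionary<string, Lazy<BlockingCollection<string>>> _queues =
        new ConcurrentDictionary<string, Lazy<BlockingCollection<string>>>();
    private readonly ConcurrentBag<Task> _workers = new ConcurrentBag<Task>();
    private readonly Action<string, string> _handler;

    public TopicDispatcher(Action<string, string> handler)
    {
        _handler = handler;
    }

    // Never blocks on a running handler: the message is just queued for its topic.
    public void Enqueue(string topic, string payload)
    {
        var lazy = _queues.GetOrAdd(topic,
            t => new Lazy<BlockingCollection<string>>(() => StartWorker(t)));
        lazy.Value.Add(payload);
    }

    private BlockingCollection<string> StartWorker(string topic)
    {
        var queue = new BlockingCollection<string>();
        _workers.Add(Task.Factory.StartNew(() =>
        {
            // Blocks until an item is ready, handles it, then waits for the next one.
            foreach (string payload in queue.GetConsumingEnumerable())
                _handler(topic, payload);
        }, TaskCreationOptions.LongRunning));
        return queue;
    }

    // Stops accepting messages and waits for all queued work to finish.
    public void Complete()
    {
        foreach (var lazy in _queues.Values)
            lazy.Value.CompleteAdding();
        Task.WaitAll(_workers.ToArray());
    }
}
```

One dedicated worker per topic is simple but costs a thread per topic, so it fits best when the number of topics is small and known to stay that way.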

C# Producer/Consumer queue with delays

The problem is the classic N producers (with N possibly large) and X consumers with limited resources (X is currently 4). The producers' messages come in (say via MQTT) and are queued to be processed by consumers on a FIFO basis. The important part of the processing is that each consumer may need to send one or more "replies" back to the producer, and such replies should be at least some time apart (the exact delay is not important). The classic solution, where one starts X tasks that wait on the message queue, process, and loop, is easy to implement using, for example, System.Threading.Channels:
while (!cancellationToken.IsCancellationRequested && await queue.Reader.WaitToReadAsync()) {
    while (queue.Reader.TryRead(out IncomingMessage item)) {
        // Do some processing.
        SendResponse(1);
        // Do some more processing.
        if (needsToSend2Response) {
            await Task.Delay(500);
            SendResponse(2);
        }
    }
}
This works and works well except that if a task needs to delay it can't process any more messages and that's obviously bad.
Possible solutions I thought of:
Use an outbound queue that process the messages and makes sure there is at least a minimum delay between messages sent to the same producer.
Don't use a queue. Just start a new Task every time a new message comes in and arbitrate the limited resources using a semaphore. It works, but I don't see how to guarantee the FIFO requirement (sometimes messages from the same producer are processed in the wrong order).
Any other ideas?
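One way to picture the first option (an outbound queue that enforces a minimum gap per producer) is to chain each producer's replies onto a per-producer task, so the consumer hands off the reply and moves on to the next message instead of waiting. This is only a sketch under assumptions: SpacedSender and its members are hypothetical names, producers are identified by a string id, and the send delegate stands in for SendResponse.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical outbound stage: replies to the same producer go out in order,
// at least minGap apart, without blocking the consumer that queued them.
public sealed class SpacedSender
{
    private readonly TimeSpan _minGap;
    private readonly Func<string, string, Task> _send;
    private readonly Dictionary<string, Task> _tails = new Dictionary<string, Task>();
    private readonly object _gate = new object();

    public SpacedSender(TimeSpan minGap, Func<string, string, Task> send)
    {
        _minGap = minGap;
        _send = send;
    }

    // Returns immediately. The returned task completes once the reply has been
    // sent and its gap has elapsed.
    public Task Enqueue(string producerId, string reply)
    {
        lock (_gate)
        {
            Task previous;
            if (!_tails.TryGetValue(producerId, out previous))
                previous = Task.CompletedTask;
            Task next = SendAfter(previous, producerId, reply);
            _tails[producerId] = next; // the next reply will wait on this one
            return next;
        }
    }

    private async Task SendAfter(Task previous, string producerId, string reply)
    {
        await Task.Yield();                              // leave the caller's lock before doing any work
        await previous.ConfigureAwait(false);            // earlier reply sent and its gap elapsed
        await _send(producerId, reply).ConfigureAwait(false);
        await Task.Delay(_minGap).ConfigureAwait(false); // enforce the gap before the next reply
    }
}
```

Note that in this sketch a failed send faults the rest of that producer's chain, and the dictionary is never pruned, so a real version would need error handling and cleanup of idle producers.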

What is the role of "MaxAutoRenewDuration" in azure service bus?

I'm using Microsoft.Azure.ServiceBus. (doc)
I was getting an exception of:
The lock supplied is invalid. Either the lock expired, or the message
has already been removed from the queue.
With the help of these questions (1, 2, 3), I was able to avoid the exception by setting AutoComplete to false and by increasing the Azure queue's lock duration to its maximum (from 30 seconds to 5 minutes).
_queueClient.RegisterMessageHandler(ProcessMessagesAsync,
    new MessageHandlerOptions(ExceptionReceivedHandler)
    {
        MaxConcurrentCalls = 1,
        MaxAutoRenewDuration = TimeSpan.FromSeconds(10),
        AutoComplete = false
    });

private async Task ProcessMessagesAsync(Message message, CancellationToken token)
{
    await ProccesMessage(message);
}

private async Task ProccesMessage(Message message)
{
    // Complete the message before starting the long-running process
    await _queueClient.CompleteAsync(message.SystemProperties.LockToken);
    await DoFoo(message.Body); // some long-running process
}
My questions are:
This answer suggested that the exception was raised because the lock expired before the long-running process finished, but in my case I was marking the message as complete immediately (before the long-running process), so I'm not sure why changing the lock duration in Azure made any difference. When I change it back to 30 seconds I see the exception again.
Not sure if it's related to the question, but what is the purpose of MaxAutoRenewDuration? The official docs say: "The maximum duration during which locks are automatically renewed." In my case I have only one receiver app that reads from this queue, so is it unnecessary, since I don't need to lock the message to stop another app from capturing it? And why should this value be greater than the longest message lock duration?
There are a few things you need to consider.
Lock duration
Total time since a message was acquired from the broker
The lock duration is simple: it is how long a single competing consumer can lease a message without that message being leased to any other competing consumer.
The total time is a bit trickier. The callback ProcessMessagesAsync that you registered to receive messages is not the only thing involved. In the code sample you've provided, you're setting the concurrency to 1. If prefetch is configured (the client fetches more than one message with each request to the broker), the lock duration clock on the server starts ticking for all of those messages at once. So even if each message is processed in slightly under MaxLockDuration, the last prefetched message may, for the sake of example, have waited so long for its turn that it loses its lock, and the exception is thrown when you attempt to complete that message.
This is where MaxAutoRenewDuration comes into play. It extends the message lease with the broker, "re-locking" the message for the competing consumer that is currently handling it. MaxAutoRenewDuration should be set to the "possibly maximum processing time a lease will be required". In your sample, it's set to TimeSpan.FromSeconds(10), which is extremely low. It needs to be at least longer than MaxLockDuration and adjusted to the longest period of time ProccesMessage will need to run, taking prefetching into consideration.
To help visualize it, think of the client side as having an in-memory queue where messages are stored while your handler processes them serially, one by one. The lease starts the moment a message arrives from the broker in that in-memory queue. If the total time in the in-memory queue plus the processing time exceeds the lock duration, the lease is lost. Your options are:
1. Enable concurrent processing by setting MaxConcurrentCalls > 1
2. Increase MaxLockDuration
3. Reduce message prefetch (if you use it)
4. Configure MaxAutoRenewDuration to renew the lock and overcome the MaxLockDuration constraint
Note about #4: it's not a guaranteed operation, so there's a chance a call to the broker will fail and the message lock will not be extended. I recommend designing your solutions to work within the lock duration limit. Alternatively, persist the message information so that your processing doesn't have to be constrained by the messaging.
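As a rough illustration of the renewal option above, the handler options from the question could be built with a renewal window sized to the expected processing time instead of 10 seconds. This is a configuration sketch only: it assumes the Microsoft.Azure.ServiceBus package used in the question, and HandlerOptionsSketch, Build, and longestProcessing are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;

public static class HandlerOptionsSketch
{
    // longestProcessing is a hypothetical estimate of the slowest handler run,
    // including any time a prefetched message spends waiting for its turn.
    public static MessageHandlerOptions Build(TimeSpan longestProcessing)
    {
        return new MessageHandlerOptions(OnException)
        {
            MaxConcurrentCalls = 1,
            AutoComplete = false,
            // Keep renewing the lock for as long as processing may take.
            MaxAutoRenewDuration = longestProcessing
        };
    }

    private static Task OnException(ExceptionReceivedEventArgs args)
    {
        Console.Error.WriteLine(args.Exception);
        return Task.CompletedTask;
    }
}
```

The result would be passed as the second argument to RegisterMessageHandler. Because renewal can still fail, the handler should be prepared to see the same message delivered again.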

Task.Factory.StartNew - confused about the pool

Hi I'm getting myself tied up with Task.Factory.StartNew. Just as I think I get the idea of it someone has suggested I write the following code;
bool exitLoop = false;
while (!exitLoop)
{
    exitLoop = true;
    var messages = Queue.GetMessages(20);
    foreach (var message in messages)
    {
        exitLoop = false;
        Task.Factory.StartNew(() =>
        {
            DeliverMessage(message);
        });
    }
}
In theory this is going to drain a queue, 20 messages at a time, attempting to create a Task for every message in the queue. So if we had 1000 messages in the queue then in an instant we'd have 25 tasks and it would eat its way through all the messages. I previously thought I understood this: I thought StartNew would block once it ran out of pool entries, which in the old days would have been ~25. But given this is .NET 4.5, I'm now under the impression that the upper limit for the pool is pretty high. What puzzles me is that I would have assumed this would flood the pool with new tasks and start blocking, i.e. in an instant I would have 1000 tasks running. So if the pool limit is now hardly a limit, why am I not seeing 1000 tasks?
[Edit]
ok, so what I'm seeing is that 1000 tasks are queued to run, rather than are running. So how do I determine the number of running/runnable tasks?
I know this is quite a while after your post, but I hope this may help someone facing your specific challenge. Your last comment stated that the 'DeliverMessage' method was making HTTP requests.
If you are using the 'WebClient' object (for example) to make your requests, it will be bound by the ServicePointManager.DefaultConnectionLimit property. This means it will create at most two (by default) concurrent connections to the host. If you created 1,000 parallel tasks, all 1,000 of those would have to be serviced by those two connections.
You'll have to play around with different values for this setting to find the right balance between throughput in your application and load on the web server.

Throttling speed of email sending process

Sorry the title is a bit crappy, I couldn't quite word it properly.
Edit: I should note this is a console c# app
I've prototyped out a system that works like so (this is rough pseudo-codeish):
var collection = grabfromdb();
foreach (item in collection) {
    SendAnEmail();
}

SendAnEmail:
    SmtpClient mailClient = new SmtpClient;
    mailClient.SendCompleted += new SendCompletedEventHandler(SendComplete);
    mailClient.SendAsync('the mail message');

SendComplete:
    if (anyErrors) {
        errorHandling()
    }
    else {
        HitDBAndMarkAsSendOK();
    }
Obviously this setup is not ideal. If the initial collection has, say 10,000 records, then it's going to new up 10,000 instances of smtpclient in fairly short order as quickly as it can step through the rows - and likely asplode in the process.
My ideal end game is to have something like 10 concurrent email going out at once.
A hacky solution comes to mind: Add a counter, that increments when SendAnEmail() is called, and decrements when SendComplete is sent. Before SendAnEmail() is called in the initial loop, check the counter, if it's too high, then sleep for a small period of time and then check it again.
I'm not sure that's such a great idea, and figure the SO hive mind would have a way to do this properly.
I have very little knowledge of threading and not sure if it would be an appropriate use here. Eg sending email in a background thread, first check the number of child threads to ensure there's not too many being used. Or if there is some type of 'thread throttling' built in.
Update
Following in the advice of Steven A. Lowe, I now have:
A Dictionary holding my emails and a unique key (this is the email queue).
A FillQue method, which populates the dictionary.
A ProcessQue method, which runs on a background thread. It checks the queue and calls SendAsync for any email in it.
A SendCompleted delegate, which removes the email from the queue and calls FillQue again.
I have a few problems with this setup. I think I've missed the boat with the background thread: should I be spawning one of these for each item in the dictionary? And how can I get the thread to 'hang around', for lack of a better word? If the email queue empties, the thread ends.
final update
I've put a 'while(true) {}' in the background thread. If the queue is empty, it waits a few seconds and tries again. If the queue is repeatedly empty, I 'break' out of the while and the program ends... Works fine. I'm a bit worried about the 'while(true)' business though.
Short Answer
Use a queue as a finite buffer, processed by its own thread.
Long Answer
Call a fill-queue method to create a queue of emails, limited to (say) 10. Fill it with the first 10 unsent emails. Launch a thread to process the queue - for each email in the queue, send it asynch. When the queue is empty sleep for a while and check again. Have the completion delegate remove the sent or errored email from the queue and update the database, then call the fill-queue method to read more unsent emails into the queue (back up to the limit).
You'll only need locks around the queue operations, and will only have to manage (directly) the one thread to process the queue. You will never have more than N+1 threads active at once, where N is the queue limit.
I believe your hacky solution actually would work. Just make sure you have a lock statement around the bits where you increment and decrement the counter:
class EmailSender
{
    // The lock object must be created up front; lock(null) throws ArgumentNullException.
    readonly object SimultaneousEmailsLock = new object();
    int SimultaneousEmails;
    public string[] Recipients;

    void SendAll()
    {
        foreach (string Recipient in Recipients)
        {
            // Wait while 10 sends are already in flight.
            while (Volatile.Read(ref SimultaneousEmails) >= 10) Thread.Sleep(10);
            SendAnEmail(Recipient);
        }
    }

    void SendAnEmail(string Recipient)
    {
        lock (SimultaneousEmailsLock)
        {
            SimultaneousEmails++;
        }
        // ... send it ...
    }

    void FinishedEmailCallback()
    {
        lock (SimultaneousEmailsLock)
        {
            SimultaneousEmails--;
        }
        // ... etc ...
    }
}
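The counter-and-sleep approach is essentially a hand-rolled counting semaphore, so the same throttle can be sketched with SemaphoreSlim, which waits without polling. This is a minimal sketch under assumptions: ThrottledSender and SendAllAsync are hypothetical names, and sendAsync is a stand-in for whatever actually sends one email.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledSender
{
    // sendAsync is a hypothetical stand-in for the real SMTP call.
    public static async Task SendAllAsync(
        IEnumerable<string> recipients, Func<string, Task> sendAsync, int maxConcurrent)
    {
        using (var slots = new SemaphoreSlim(maxConcurrent))
        {
            var pending = new List<Task>();
            foreach (string recipient in recipients)
            {
                // Waits here once maxConcurrent sends are in flight.
                await slots.WaitAsync();
                pending.Add(SendOneAsync(recipient, sendAsync, slots));
            }
            // Wait for the last sends to finish before disposing the semaphore.
            await Task.WhenAll(pending);
        }
    }

    private static async Task SendOneAsync(
        string recipient, Func<string, Task> sendAsync, SemaphoreSlim slots)
    {
        try
        {
            await sendAsync(recipient);
        }
        finally
        {
            slots.Release(); // free the slot whether the send succeeded or failed
        }
    }
}
```

With maxConcurrent set to 10 this gives the "10 concurrent emails" behaviour from the question. Marking each row as sent or failed in the database would go inside the delegate passed as sendAsync.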
I would add all my messages to a Queue, and then spawn, e.g., 10 threads which send emails until the Queue is empty. Pseudo-ish C# (untested):
class EmailSender
{
    Queue<Message> messages;
    List<Thread> threads;

    public void Send(IEnumerable<Message> messages, int threadCount)
    {
        this.messages = new Queue<Message>(messages);
        this.threads = new List<Thread>();
        while (threadCount-- > 0)
            threads.Add(new Thread(SendMessages));
        threads.ForEach(t => t.Start());
        // Block until every worker has drained the queue and exited.
        threads.ForEach(t => t.Join());
    }

    private void SendMessages()
    {
        while (true)
        {
            Message m;
            lock (messages)
            {
                if (messages.Count == 0)
                    return; // No more messages
                m = messages.Dequeue();
            }
            // Send message m in some way. Not in an async way,
            // since we are already kind of async.
            Thread.Sleep(10); // Perhaps take a quick rest (arbitrary duration)
        }
    }
}
If the message is the same, and just having many recipients, just swap the Message with a Recipient, and add a single Message parameter to the Send method.
You could use a .NET Timer to setup the schedule for sending messages. Whenever the timer fires, grab the next 10 messages and send them all, and repeat. Or if you want a general (10 messages per second) rate you could have the timer fire every 100ms, and send a single message every time.
If you need more advanced scheduling, you could look at a scheduling framework like Quartz.NET
Isn't this something that Thread.Sleep() can handle?
You are correct in thinking that background threading can serve a good purpose here. Basically what you want to do is create a background thread for this process, let it run its own way, delays and all, and then terminate the thread when the process is done, or leave it on indefinitely (turning it into a Windows Service or something similar will be a good idea).
A little intro on multi-threading can be read here (with Thread.Sleep included!).
A nice intro on Windows Services can be read here.
