I have multiple queues that multiple clients insert messages into.
On the server side, I have multiple micro-services that access the queues and handle those messages. I want to lock a queue whenever a service is working on it, so that other services won't be able to work on that queue.
Meaning that if service A is processing a message from queue X, no other service can process a message from that queue, until service A has finished processing the message. Other services can process messages from any queue other than X.
Does anyone have an idea how to lock a queue and prevent others from accessing it? Preferably, the other services would receive an exception or something similar, so that they can try again on a different queue.
UPDATE
Another approach could be to assign the queues to the services, so that while a service is working on a queue, no other service is assigned to it until the work item has been processed. This is also not easy to achieve.
There are several built-in ways of doing this. If you only have a single worker, you can set MessageOptions.MaxConcurrentCalls = 1.
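For example, with the WebJobs SDK 2.x Service Bus extension, the wiring might look roughly like this (a sketch, not a drop-in host; connection configuration omitted):

```csharp
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.ServiceBus;

class Program
{
    static void Main()
    {
        var config = new JobHostConfiguration();
        var serviceBusConfig = new ServiceBusConfiguration();

        // Process at most one message at a time on this worker.
        serviceBusConfig.MessageOptions.MaxConcurrentCalls = 1;

        config.UseServiceBus(serviceBusConfig);
        new JobHost(config).RunAndBlock();
    }
}
```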
If you have multiple, you can use the Singleton attribute. This gives you the option of setting it in Listener mode or Function mode. The former gives the behavior you're asking for, a serially-processed FIFO queue. The latter lets you lock more granularly, so you can specifically lock around critical sections, ensuring consistency while allowing greater throughput, but doesn't necessarily preserve order.
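A rough sketch of the Listener-mode variant (the queue name "queue-x" is just a placeholder):

```csharp
using Microsoft.Azure.WebJobs;

public class Functions
{
    // Listener mode: only one listener across all hosts pulls from the
    // queue, so messages are processed one at a time, in order.
    [Singleton(Mode = SingletonMode.Listener)]
    public static void ProcessMessage([ServiceBusTrigger("queue-x")] string message)
    {
        // handle the message here
    }
}
```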
My guess is that the Singleton attribute is implemented similarly to your Redis approach, so performance should be equivalent. I haven't tested that, though.
You can achieve this by using Azure Service Bus message sessions.
All messages in your queue must be tagged with the same SessionId. In that case, when a client receives a message, it locks not only that message but all messages with the same SessionId (effectively the whole queue).
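A minimal sketch with the Azure.Messaging.ServiceBus client (connection string, queue name, and session id are placeholders; the queue must be created with sessions enabled):

```csharp
using System.Threading.Tasks;
using Azure.Messaging.ServiceBus;

class SessionsExample
{
    static async Task Main()
    {
        await using var client = new ServiceBusClient("<connection-string>");

        // Sender side: stamp every message for queue X with the same SessionId.
        ServiceBusSender sender = client.CreateSender("orders");
        await sender.SendMessageAsync(
            new ServiceBusMessage("work item") { SessionId = "queue-x" });

        // Receiver side: accepting the session locks ALL messages with that
        // SessionId for this receiver; other receivers get other sessions.
        ServiceBusSessionReceiver receiver =
            await client.AcceptNextSessionAsync("orders");
        ServiceBusReceivedMessage message = await receiver.ReceiveMessageAsync();
        // ... process ...
        await receiver.CompleteMessageAsync(message);
        await receiver.DisposeAsync();
    }
}
```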
The solution was to use Azure's Redis cache to store the locks in memory, with micro-services managing those locks through the Redis store.
The lock() and unlock() operations are atomic and the lock has a TTL, so that a queue won't be locked indefinitely.
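With StackExchange.Redis, the lock/unlock pair might look like this (the key naming and the 30-second TTL are assumptions, not what the original system used):

```csharp
using System;
using StackExchange.Redis;

class QueueLock
{
    readonly IDatabase db;
    // Token identifying this service instance, so only the owner can unlock.
    readonly string owner = Guid.NewGuid().ToString();

    public QueueLock(IDatabase db) { this.db = db; }

    // Atomic set-if-not-exists with a TTL, so a crashed service
    // cannot hold the lock forever.
    public bool TryLock(string queueName) =>
        db.LockTake($"lock:{queueName}", owner, TimeSpan.FromSeconds(30));

    // Releases the lock only if this instance still owns it.
    public bool Unlock(string queueName) =>
        db.LockRelease($"lock:{queueName}", owner);
}
```

A service that fails to take the lock simply moves on to the next queue and retries later.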
Azure Service Bus is a broker with competing consumers. You can't get what you're asking for with a general queue that all instances of your service consume.
Put the work items into a relational database. You can still use queues to push work to workers, but the queue items can now be empty: when a worker receives an item, it knows to look in the database instead, and the content of the message is disregarded.
That way messages are independent and idempotent. For queueing to work, these two properties usually must hold.
That way you can more easily sequence actions that actually are sequential. You can use transactions as well.
Maybe you don't need queues at all. Maybe it is enough to have a fixed number of workers polling the database for work. This loses auto-scaling with queues, though.
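If you go the database route, one way workers could claim items atomically is a claim query like the sketch below (it assumes SQL Server and a dbo.WorkItems table with Id, Payload, Status, and ClaimedAt columns):

```csharp
using Microsoft.Data.SqlClient;

class DbWorker
{
    // READPAST makes concurrent workers skip rows another worker has
    // already locked, so each pending item is claimed exactly once.
    const string ClaimSql = @"
        UPDATE TOP (1) dbo.WorkItems WITH (ROWLOCK, READPAST)
        SET Status = 'InProgress', ClaimedAt = SYSUTCDATETIME()
        OUTPUT inserted.Id, inserted.Payload
        WHERE Status = 'Pending';";

    public static (long Id, string Payload)? ClaimNext(string connectionString)
    {
        using var conn = new SqlConnection(connectionString);
        conn.Open();
        using var cmd = new SqlCommand(ClaimSql, conn);
        using var reader = cmd.ExecuteReader();
        return reader.Read()
            ? (reader.GetInt64(0), reader.GetString(1))
            : null;
    }
}
```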
Should I call CreateIfNotExistsAsync() before every read/write on Azure queue?
I know it results in a REST call, but does it do any IO on the queue?
I am using the .Net library for Azure Queue (if this info is important).
All that method does is try to create the queue and catch the AlreadyExists error, which you could just as easily replicate yourself by catching the 404 when you try to access the queue. There is bound to be some performance impact.
More importantly, it increases your costs: from the archive of Understanding Windows Azure Storage Billing – Bandwidth, Transactions, and Capacity [MSDN]
We have seen applications that perform a CreateIfNotExist [sic] on a Queue before every put message into that queue. This results in two separate requests to the storage system for every message they want to enqueue, with the create queue failing. Make sure you only create your Blob Containers, Tables and Queues at the start of their lifetime to avoid these extra transaction costs.
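In other words, create the queue once at startup and only enqueue afterwards. With the current Azure.Storage.Queues client that looks roughly like this (connection string and queue name are placeholders):

```csharp
using Azure.Storage.Queues;

// At application startup: one create call for the queue's lifetime.
var queue = new QueueClient("<connection-string>", "work-items");
await queue.CreateIfNotExistsAsync();

// Per message: just send, no existence check needed.
await queue.SendMessageAsync("hello");
```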
I am evaluating using RabbitMQ as message queue/message bus and have been looking at the example tutorials on the RabbitMQ page.
I am looking for a specific scenario not covered by the tutorials and I am not sure if and how it would be possible to do via RabbitMQ.
The setup:
Let's assume I have a service, let's call it "purchase orders", and two other services called "logistics" and "accounting".
When an order is sent, I want to send it as a message via RabbitMQ.
There are 2 "account" and 3 "logistic" service instances.
What would be the correct way to ensure that "account" and "logistic" each process the message only once? Using pub/sub would cause the messages to be processed twice (account) or three times (logistics), if I understand it correctly.
With work queues and prefetch=1, only one consumer would get the message, but I have 2 types of service and want each type to get it once.
Is there a way to combine both and have a work queue for each of the services, without sending 2 separate events/messages to two different exchanges?
Using pub/sub will cause the messages to be processed twice (account) or three times (logistics) if I understand it correctly.
Based on your description, you probably have 1 queue per worker, and you are routing the message to all worker queues. Therefore, each worker gets a copy of the message, because you routed it to all of the queues.
What you want is a single "account" queue and a single "logistic" queue. You will have multiple account services reading from the single account queue; the same goes for the logistic services and queue.
Setting prefetch=1 is important as well; it prevents a single worker from reading in too many messages at once.
Is there a way to combine both and have a work queue for each of the services, without sending 2 separate events/messages to two different exchanges?
Yes: don't use a fanout exchange. Use a topic or direct exchange, and use routing keys to route a single message to both the account and logistics queues, as sketched below.
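A sketch with the .NET client (exchange, queue, and routing-key names are made up for the example):

```csharp
using System.Text;
using RabbitMQ.Client;

var factory = new ConnectionFactory { HostName = "localhost" };
using var connection = factory.CreateConnection();
using var channel = connection.CreateModel();

// One direct exchange; both service queues bind with the same routing key,
// so a single publish delivers one copy to each queue.
channel.ExchangeDeclare("orders", ExchangeType.Direct, durable: true);
channel.QueueDeclare("account", durable: true, exclusive: false, autoDelete: false);
channel.QueueDeclare("logistics", durable: true, exclusive: false, autoDelete: false);
channel.QueueBind("account", "orders", routingKey: "order.created");
channel.QueueBind("logistics", "orders", routingKey: "order.created");

// Published once; the workers of each service type then compete for
// the copy in their own queue.
var body = Encoding.UTF8.GetBytes("{\"orderId\": 42}");
channel.BasicPublish("orders", "order.created", basicProperties: null, body: body);
```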
What would be the correct way to ensure that "account" and "logistic" each process the message only once?
There is no way to guarantee this, 100%. At some point, even with a proper setup like I've described, you will have a network failure, a worker crash, or some other problem, and a message will get processed twice. You have to account for this in your design, using some form of idempotence in your message processing.
Hope that helps!
I'm using RabbitMQ for the following scenario. When a user uses a premium search feature, I send a message via RabbitMQ to one of a few server instances. They run the same routine (DB queries and billing). I want to make sure I don't process the same message more than once.
I've come across this great tutorial, but the exchange type presented in it is "Topic", which does not work for me because I end up processing the same message more than once.
How can I implement the request-response pattern with worker queues in RabbitMQ so that each message is handled only once and there's load balancing?
Anton Gogolev's comment above is correct. You cannot guarantee a message will be processed only once, for many reasons. But this is often a requirement of systems: to produce the desired result only once.
The way to do that is through idempotence - the idea that no matter how many times a given message is processed, it will only make the desired change once.
There are a lot of ways to do this. One simple example is to use a shared database that tracks which messages have been processed. When you receive a message, you check to see if it has been processed already. If not, you process it. If it has, you just ignore it and move on.
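A minimal version of that check, assuming a SQL Server table dbo.ProcessedMessages with MessageId as its primary key:

```csharp
using Microsoft.Data.SqlClient;

static bool TryMarkProcessed(string connectionString, string messageId)
{
    using var conn = new SqlConnection(connectionString);
    conn.Open();
    using var cmd = new SqlCommand(
        "INSERT INTO dbo.ProcessedMessages (MessageId) VALUES (@id);", conn);
    cmd.Parameters.AddWithValue("@id", messageId);
    try
    {
        cmd.ExecuteNonQuery();
        return true;  // first time we see this message: process it
    }
    catch (SqlException ex) when (ex.Number == 2627) // primary key violation
    {
        return false; // duplicate delivery: skip it
    }
}
```

The primary key makes the insert-and-check atomic, so two consumers racing on the same message cannot both win.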
In your case, if you are doing request/response and want load balancing, you probably want multiple consumers on the same queue. You could have 2 or 10 or 300 instances of your request handler listening to the same queue, and you won't have to worry much about duplicate processing.
RabbitMQ will send a given message to a single consumer. It will wait for that consumer to say it is done processing, or if the consumer crashes or rejects the message, it will requeue the message for another consumer to try again.
In this way, you will generally have only 1 request handler per request. But it will always be possible for more than one to handle the same message, which is why idempotence is important.
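In code, the competing-consumer setup is just several processes running something like this against the same queue (the queue name is a placeholder):

```csharp
using System.Text;
using RabbitMQ.Client;
using RabbitMQ.Client.Events;

var factory = new ConnectionFactory { HostName = "localhost" };
using var connection = factory.CreateConnection();
using var channel = connection.CreateModel();

channel.QueueDeclare("premium-search", durable: true, exclusive: false, autoDelete: false);
// At most one unacked message per consumer keeps the load balanced.
channel.BasicQos(prefetchSize: 0, prefetchCount: 1, global: false);

var consumer = new EventingBasicConsumer(channel);
consumer.Received += (sender, ea) =>
{
    var request = Encoding.UTF8.GetString(ea.Body.ToArray());
    // ... run the DB queries and billing here ...
    // Ack only after the work is done; if this process crashes first,
    // RabbitMQ redelivers the message to another consumer.
    channel.BasicAck(ea.DeliveryTag, multiple: false);
};
channel.BasicConsume("premium-search", autoAck: false, consumer: consumer);
```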
Regarding the use of a topic exchange vs any other type of exchange - it doesn't make much difference. There will always be the possibility of more than one queue receiving the message that you are sending, because you can have multiple queues bound to the same exchange with the same binding keys.
I am currently writing a TCP listener to which many client applications send short messages. The TCP listener is a C# WinForms application, and what I need to do is process these logs in batches, to avoid hitting the database on every message I receive. Currently, for every message the listener receives, I do an Enqueue with the C# Queue class.
A separate thread executes every 5 minutes to check this Queue and start processing it if there are any queued items. There seems to be a concurrency/race condition in this design: when the 5-minute thread kicks off, newly received messages can no longer access the Queue, since I hold a lock on it during Dequeue, so these new messages get lost. It only seems to happen when large numbers of messages are being sent to the TCP listener.
Does anyone think this is a flawed design on my part, or would there be a much better solution for this? I am not allowed to use MSMQ or WCF, based on restrictions from the client applications that send the messages.
So you have a producer-consumer scenario, with multiple producers and one (buffered) consumer. You may want to take a look at Reactive Extensions (they have a version for .NET 3.5). At the very least, you could leverage their backport of BlockingCollection<T>.
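A sketch of what that buffered consumer could look like with BlockingCollection<T> (the batching interval and the database call are placeholders):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

class BatchProcessor
{
    // Producers (the TCP receive handlers) add to this without an explicit lock.
    static readonly BlockingCollection<string> Messages =
        new BlockingCollection<string>();

    // Called from the TCP listener for every incoming message; never blocks
    // on the consumer, so nothing is lost during a flush.
    public static void OnMessageReceived(string message) => Messages.Add(message);

    // Single consumer thread: drain whatever has accumulated, write one
    // batch, sleep, repeat.
    public static void ConsumerLoop()
    {
        while (true)
        {
            var batch = new List<string>();
            while (Messages.TryTake(out var msg))
                batch.Add(msg);

            if (batch.Count > 0)
                WriteBatchToDatabase(batch); // hypothetical bulk insert

            Thread.Sleep(TimeSpan.FromMinutes(5));
        }
    }

    static void WriteBatchToDatabase(List<string> batch)
    {
        // single round-trip to the database here
    }
}
```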
We have the following scenario using the .NET RabbitMQ library:
A worker thread picks up 'request' messages from a queue and dispatches them to multiple worker threads for processing. When complete, each worker thread then sends another message.
My question is: what 'pattern' do people recommend for the sender in order to get the best throughput and stability? For example:
1) A singleton 'publisher' instance that is accessed by all worker threads with a single Connection and IModel (using a 'lock' to synchronize access to the IModel)
2) A singleton 'publisher' instance that is accessed by all worker threads with a single Connection and which creates a new IModel for each send request.
or something else?
According to the RabbitMQ user guide, "IModel instances should not be used by more than one thread simultaneously: application code should maintain a clear notion of thread ownership for IModel instances".
If an IModel is shared, a lock should be used, as you said, but in my opinion this leads to more complex code, because the connection and model must be kept alive across disconnections.
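For reference, option 1 from the question might be sketched like this (reconnection handling deliberately omitted, which is exactly the complexity mentioned above):

```csharp
using System;
using RabbitMQ.Client;

// One connection, one channel, a lock around every publish so that only
// one worker thread touches the IModel at a time.
public sealed class LockedPublisher : IDisposable
{
    readonly IConnection connection;
    readonly IModel model;
    readonly object sync = new object();

    public LockedPublisher(ConnectionFactory factory)
    {
        connection = factory.CreateConnection();
        model = connection.CreateModel();
    }

    public void Publish(string exchange, string routingKey, byte[] body)
    {
        lock (sync) // IModel is not thread-safe
        {
            model.BasicPublish(exchange, routingKey, basicProperties: null, body: body);
        }
    }

    public void Dispose()
    {
        model?.Dispose();
        connection?.Dispose();
    }
}
```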
You don't mention whether transactions are required. To achieve more reliable message delivery, the channel must be put into transaction mode, and you may need a transaction per delivery.
Using a new model per delivery makes error management simpler, but it will obviously slow down throughput (as will using transactions).
You could also, depending on your requirements, use non-durable queues and direct or fanout exchanges, which provide better throughput.
As an example: on a development machine, a Windows service that consumes messages from a queue on a single thread (deserializing, running some business logic, and finally sending a new serialized message, using transactions and opening/closing the connection and model each time) can process about 3000 messages per second. (Serialization was done via XmlSerializer, which is slower than DataContractSerializer.)