I have an HTTP-triggered Azure Function that adds a message to the queue: outputQueue.AddAsync(myMessage); That message triggers a queue-triggered Azure Function, which adds 100 messages to the same queue. Each of those 100 messages is dequeued by this function and processed. This processing takes about 5-7 minutes. My functionTimeout is 10 minutes.

Sometimes (in about 10% of the calls) the same message is dequeued and processed twice or even more, even though the previous processing of this message was successful. I also noticed that each such redundant dequeue happens about 10 minutes after the previous dequeue of the same message (which seems related to my functionTimeout of 10 minutes). So it looks like after the processing is done the function does not end, and hence the message is not deleted from the queue, which causes another instance to dequeue it.
When I look at the Failures section of Application Insights I see that, for approximately 1K operations, I have about 10 WebExceptions and 2 TimeoutExceptions.
WebException:
Message: The remote server returned an error: (409) Conflict.
Failed method:
Microsoft.WindowsAzure.Storage.Shared.Protocol.HttpResponseParsers.ProcessExpectedStatusCodeNoException
FormattedMessage: An unhandled exception has occurred. Host is shutting down.
TimeoutException:
Message: The client could not finish the operation within specified timeout. The client could not finish the operation within specified timeout.
Failed method: Microsoft.WindowsAzure.Storage.Core.Executor.Executor.EndExecuteAsync
FormattedMessage: An unhandled exception has occurred. Host is shutting down.
I have a try..catch in my function entry point, but those two exceptions probably never reach the catch block.
My host.json is as follows:
{
    "functionTimeout": "00:10:00",
    "version": "2.0",
    "extensions": {
        "queues": {
            "maxPollingInterval": 1000,
            "visibilityTimeout": "01:00:00",
            "batchSize": 8,
            "maxDequeueCount": 5,
            "newBatchThreshold": 4
        }
    }
}
When I set "batchSize": 2 and "newBatchThreshold": 1 I get fewer redundant dequeues, but more instances are created (I know this from logging the IP of the server on every Azure Function call). If more servers process different messages, my static data is less re-used between instances.
Also note that I've set "visibilityTimeout" to 1 hour (I tried 30 minutes as well), but this value seems to be completely ignored and the message becomes visible again after 10 minutes.
Any idea how I can avoid duplicate processing of the same messages? One option I'm considering is writing the message info to a DB after successful processing, and on every dequeue checking whether the message was already processed, say, within the last hour; if so, skip it. Another option is setting "maxDequeueCount" to 1 (I have a restore mechanism in case some messages are never processed due to a real failure).
BTW, those 10% of redundant processings don't cause functional issues, but I still want to improve the performance.
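The DB-based de-duplication idea above could be sketched roughly like this. Everything here is hypothetical: TryMarkAsProcessedAsync stands in for any store that supports an atomic "insert if not exists" on the message ID (a SQL unique key, a Table Storage insert, Redis SET NX, ...) and should return false when the ID was already recorded within the last hour:

```csharp
// Hypothetical de-duplication guard around a queue-triggered function.
// Binding to CloudQueueMessage (instead of string) exposes the message ID.
[FunctionName("ProcessQueueItem")]
public static async Task Run(
    [QueueTrigger("myqueue")] CloudQueueMessage message)
{
    // Atomically record the message ID; returns false if it was
    // already recorded recently (i.e. this is a duplicate delivery).
    if (!await TryMarkAsProcessedAsync(message.Id))
    {
        return; // already processed successfully, skip silently
    }

    // ... the actual 5-7 minute processing goes here ...
}
```

Note that a guard like this only suppresses the redundant work; it does not stop the runtime from redelivering the message in the first place.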
As seen in the image below, queue items remain invisible for more than 7 minutes and no processing is being done in the meantime, as the Live Metrics Stream shows no processor activity until after (roughly) 7 minutes. It is as if the Queue Trigger is not being activated. The visibilityTimeout is set to '00:00:05'.
This happens very often and there are no exceptions or errors. Refreshing many times shows some of the messages appearing with a dequeue count of 1 and then instantly disappearing / getting processed (even if the QueueTrigger function is stopped).
The more servers there are online processing the queue messages, the more messages are likely to get stuck in this way.
This is how it is implemented:
[FunctionName(nameof(QueueFunc))]
public async Task QueueFuncProcess(
[QueueTrigger("%QueueFuncName%", Connection = "QueueFuncConnection")] string queueItem)
I have an Azure Function which is triggered by adding a new message to a queue.
It should download a file from an FTP server and the name of the file is a part of the message that I push into the queue.
At some point, the server hosting the files might become inaccessible and I will get exceptions, of course.
I would like to know how the queue behaves in these cases. Does it pop the message and drop it? Or does it keep it and call the function again and again until the task completes without any exceptions?
From the docs:
The Functions runtime receives a message in PeekLock mode. It calls Complete on the message if the function finishes successfully, or calls Abandon if the function fails. If the function runs longer than the PeekLock timeout, the lock is automatically renewed as long as the function is running.
So, if the function fails, the message will be available for the next run, up to a maximum of 10 retries. After 10 retries it goes to the dead-letter queue (source):
Service Bus Queues and Subscriptions each have a QueueDescription.MaxDeliveryCount and SubscriptionDescription.MaxDeliveryCount property, respectively. The default value is 10. Whenever a message has been delivered under a lock (ReceiveMode.PeekLock), but has been either explicitly abandoned or the lock has expired, the message's BrokeredMessage.DeliveryCount is incremented. When DeliveryCount exceeds MaxDeliveryCount, the message is moved to the DLQ, specifying the MaxDeliveryCountExceeded reason code.
I'm new to Service Bus and not able to figure this out.
Basically, I'm using an Azure Function app which is hooked onto a Service Bus queue. Let's say a trigger is fired from the Service Bus and I receive a message from the queue, and in the processing of that message something goes wrong in my code. In such cases how do I make sure to put that message back in the queue? Currently it just disappears into thin air, and when I restart my function app in VS, the next message from the queue is taken.
Ideally, only when all my data processing is done and I hit myMsg.Success() do I want it to be removed from the queue.
public static async Task RunAsync([ServiceBusTrigger("xx", "yy", AccessRights.Manage)] BrokeredMessage mySbMsg, TraceWriter log)
{
    try
    {
        // do something with mySbMsg
    }
    catch
    {
        // put mySbMsg back in the queue so it doesn't disappear, and throw the exception
    }
}
I was reading up on mySbMsg.Abandon(), but it looks like that puts the message in the dead-letter queue, and I am not sure how to access it. Is there a better way to handle errors?
Cloud queues are a bit different than in-memory queues because they need to be robust to the possibility of the client crashing after it received the queue message but before it finished processing the message.
When a queue message is received, the message becomes "invisible" so that other clients can't pick it up. This gives the client a chance to process it, and the client must mark it as completed when it is done (Azure Functions does this automatically when you return from the function). That way, if the client were to crash in the middle of processing the message (we're on the cloud, so be robust to random machine crashes due to power loss, etc.), the server will see the absence of the completed message, assume the client crashed, and eventually resend the message.
Practically, this means that if you receive a queue message and throw an exception (and thus don't mark the message as completed), it will be invisible for a few minutes, but then it will show up again and another client can attempt to handle it. Put another way: in Azure Functions, queue messages are automatically retried after exceptions, but the message is invisible for a few minutes in between retries.
If you want the message to remain on the queue to be retried, the function should not swallow the exception but rather rethrow it. That way the Functions runtime will not auto-complete the message and will retry it.
Keep in mind that this will cause the message to be retried and eventually, if the exception persists, to be moved to the dead-letter queue.
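In code, "not swallowing" simply means letting the exception escape the function. A minimal sketch (the queue name, connection setting and ProcessAsync are placeholders, not anything from the original post):

```csharp
[FunctionName("ProcessOrder")]
public static async Task Run(
    [ServiceBusTrigger("orders", Connection = "ServiceBusConnection")] string message,
    ILogger log)
{
    try
    {
        await ProcessAsync(message); // hypothetical: your actual work
    }
    catch (Exception ex)
    {
        log.LogError(ex, "Processing failed; message will be abandoned and retried");
        throw; // rethrow so the runtime abandons the message instead of completing it
    }
}
```

The try/catch here exists only for logging; the important part is the rethrow, which tells the runtime the invocation failed.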
As I understand it, what you are looking for is: if there is an error processing the message, it should retry the execution instead of swallowing the error. If you are using Azure Functions v2.0, you define the message handler options in host.json:
"extensions": {
"serviceBus": {
"prefetchCount": 100,
"messageHandlerOptions": {
"autoComplete": false,
"maxConcurrentCalls": 1
}
}
}
prefetchCount - Gets or sets the number of messages that the message receiver can simultaneously request.
autoComplete - Whether the trigger should automatically call complete after processing, or if the function code will manually call complete.
After retrying the message n times (n defaults to 10), the runtime moves the message to the DLQ.
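With autoComplete set to false as above, the function has to settle the message itself. In Functions v2 one way to do that is to bind a MessageReceiver alongside the message; this is a sketch under that assumption, with the queue and connection names as placeholders:

```csharp
[FunctionName("ManualSettle")]
public static async Task Run(
    [ServiceBusTrigger("myqueue", Connection = "ServiceBusConnection")] Message message,
    MessageReceiver messageReceiver)
{
    try
    {
        // ... process message.Body here ...

        // Explicitly complete: the message is removed from the queue
        // only after processing succeeded.
        await messageReceiver.CompleteAsync(message.SystemProperties.LockToken);
    }
    catch (Exception)
    {
        // Release the lock so the message is redelivered; after
        // MaxDeliveryCount failed attempts it moves to the dead-letter queue.
        await messageReceiver.AbandonAsync(message.SystemProperties.LockToken);
        throw;
    }
}
```

The explicit Abandon is optional (an expired lock has the same effect), but it makes the message available for redelivery immediately instead of after the lock duration elapses.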
I have an Azure Service Bus Queue.
It's configured with:
Requires Duplicate Detection: true
Requires Session: true
Enable Partitions: false
Max Delivery Count: 10
Lock Duration: 1 minute
Batch Operations Enabled: true
Deadletter on Expiration Enabled: false
Enforce message ordering: true
When retrieving a message from the queue I use the following OnMessageOptions:
AutoComplete: false
AutoRenewTimeout: 12 minutes
Each message takes on average 2 minutes to complete.
Some of them succeed, others throw a "SessionLockLostException".
Why does the lock "AutoRenew" not keep the message lock renewed? It's supposed to keep doing its job for 12 minutes, yet we get that exception after 2.
How do you debug the cause of the exception? The exception tells me roughly what happened, but not why. I can't find any information about logging within the Service Bus Queue client.
Where is the documentation? The MSDN in this instance is awful! It lacks even basic information about how these classes are supposed to work.
EDIT: As MaDeRkAn helpfully mentioned in a comment, the documentation for "SessionLockLostException" does mention that Azure can move messages between partitions.
When I originally created a test application to see if this approach worked I had the queue configured to use partitions. While figuring out the code needed to handle the various exceptions that occur in various situations I read about that exception.
I have discounted this as being the problem for two reasons:
I've (literally) triple checked that Partitions are disabled. I also checked that the Queue we're using is the same Queue I'm looking at for the properties.
If Azure were causing failures this often (every 2-5 messages) then the service would be pretty much unusable! And while Azure has issues at times, it's not normally totally broken like that.
In an Azure Web Job you can have functions triggered by an Azure queue in a continuously running job.
When a message has been read from a queue, it is deleted.
But if for some reason my job crashes (think a VM restart), the current job that was not finished will crash and I will lose the information in the message.
Is it possible to configure Azure Web Jobs not to delete messages from queue automatically, and do it manually when my job finishes?
There are two cases:
The function failed because the message was bad and we couldn't bind it - for example, you bind to a Person object but the message body is invalid JSON. In that case, we delete the message from the queue. We are going to have a mechanism for handling poison messages in a future release. (related question)
The function failed because an exception was thrown after the message was bound - for example, your own code threw an exception. Whenever we get a message from a queue (except in case #1) we set a lease of, I think, 10 minutes:
If the function still runs after 10 minutes, we renew the lease.
If the function completes, the message is deleted.
If the function throws for any reason, we leave the message there and do not renew the lease again. The message will show up in the queue again after the lease expires (at most 10 minutes).
To answer your question: if the VM restarts you are in case #2, and the message should show up again after, at most, 10 minutes.
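If you want full manual control outside the WebJobs SDK, the same lease-then-delete dance can be done directly against the storage client. A sketch using the WindowsAzure.Storage CloudQueue API (the queue instance and ProcessAsync are placeholders):

```csharp
// Dequeue with an explicit 10-minute visibility timeout; the message stays
// in the queue but is invisible to other consumers until it is deleted.
CloudQueueMessage message = await queue.GetMessageAsync(
    visibilityTimeout: TimeSpan.FromMinutes(10),
    options: null,
    operationContext: null);

if (message != null)
{
    await ProcessAsync(message.AsString); // hypothetical: your actual work

    // Delete only after successful processing; if the VM restarts before
    // this line runs, the message reappears once the visibility timeout
    // expires and another worker can pick it up.
    await queue.DeleteMessageAsync(message);
}
```

This mirrors what the WebJobs SDK does for you automatically: lease on receive, delete on success, reappear on failure.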