Recoverable Azure Web Job (don't delete from queue)

Recoverable Azure Web Job (don't delete from queue) - c#

In an Azure Web Job you can have functions triggered by an Azure queue in a continuously running job.
When a message has been read from a queue, it is deleted.
But if for some reason my Job crashes (think a VM restart) the current Job that was not finished will crash and I will lose the information in the message.
Is it possible to configure Azure Web Jobs not to delete messages from queue automatically, and do it manually when my job finishes?

There are two cases:
The function failed because the message was bad and we couldn't bind it - for example, you bind to an Person object but the message body is invalid JSON. In that case, we delete the message from the queue. We are going to have a mechanism of handling poison messages in a future release. (related question)
The function failed because an exception was thrown after the message was bound - for example, your own code thrown an exception. Whenever we get a message from a queue (except in case #1) we set a lease of, I think, 10 minutes:
If the function still runs after 10 minutes, we renew the lease.
If the function completes, the message is deleted.
If the function throws for any reason, we leave the message there and not renew the lease again. The message will show up in the queue again after the lease time expires (max 10 minutes).
The answer your question, if the VM restarts you are in case #2 and the message should show up again after, at most, 10 minutes.

Related

Exception in QueueTrigger Azure Function

I have an azure function which is triggered by adding a new message into a queue.
It should download a file from an FTP server and the name of the file is a part of the message that I push into the queue.
At some points, the server which is hosting files might become inaccessible and I will get exceptions, of course.
I would like to know how the queue behaves in these cases? does it pop the message and leave it? Or does it keep it and call the function again and again until the task gets completed without any exceptions?

From the docs:
The Functions runtime receives a message in PeekLock mode. It calls Complete on the message if the function finishes successfully, or calls Abandon if the function fails. If the function runs longer than the PeekLock timeout, the lock is automatically renewed as long as the function is running.
So, if the function fails it will be available for a next run, for a maximum of 10 retries. After 10 retries it goes to the DeadLetter Queue (source):
Service Bus Queues and Subscriptions each have a QueueDescription.MaxDeliveryCount and SubscriptionDescription.MaxDeliveryCount property respectively. the default value is 10. Whenever a message has been delivered under a lock (ReceiveMode.PeekLock), but has been either explicitly abandoned or the lock has expired, the message BrokeredMessage.DeliveryCount is incremented. When DeliveryCount exceeds MaxDeliveryCount, the message is moved to the DLQ, specifying the MaxDeliveryCountExceeded reason code.

Azure functions with service bus: How to keep a message in the queue if something goes wrong with its processing?

I'm new to service bus and not able to figure this out.
Basically i'm using Azure function app which is hooked onto the service bus queue. Let's say a trigger is fired from the service bus and I receive a message from the queue, and in the processing of that message something goes wrong in my code. In such cases how do I make sure to put that message back in the queue again? Currently its just disappearing into thin air and when I restart my function app on VS, the next message from the queue is taken.
Ideally only when all my data processing is done and when i hit myMsg.Success() do I want it to be removed from the queue.
public static async Task RunAsync([ServiceBusTrigger("xx", "yy", AccessRights.Manage)]BrokeredMessage mySbMsg, TraceWriter log)
{
try{ // do something with mySbMsg }
catch{ // put that mySbMsg back in the queue so it doesn't disappear. and throw exception}
}
I was reading up on mySbMsg.Abandon() but it looks like that puts the message in the dead letter queue and I am not sure how to access it? and if there is a better way to error handle?

Cloud queues are a bit different than in-memory queues because they need to be robust to the possibility of the client crashing after it received the queue message but before it finished processing the message.
When a queue message is received, the message becomes "invisible" so that other clients can't pick it up. This gives the client a chance to process it and the client must mark it as completed when it is done (Azure Functions will do this automatically when you return from the function). That way, if the client were to crash in the middle of processing the message (we're on the cloud, so be robust to random machine crashes due to powerloss, etc), the server will see the absence of the completed message, assume the client crashed, and eventually resend the message.
Practically, this means that if you receive a queue message, and throw an exception (and thus we don't mark the message as completed), it will be invisible for a few minutes, but then it will show up again after a few minutes and another client can attempt to handle it. Put another way, in Azure functions, queue messages are automatically retried after exceptions, but the message will be invisible for a few minutes inbetween retries.

If you want the message to remain on the queue to be retried, the function should not swallow exception and rather throw. That way Function will not auto-complete the message and retry it.
Keep in mind that this will cause message to be retried and eventually, if exception persists, to be moved into dead-letter queue.

As per my understanding, I think what you are for is if there is an error in processing the message it needs to retry the execution instead of swallowing it. If you are using Azure Functions V2.0 you define the message handler options in the host.json
"extensions": {
"serviceBus": {
"prefetchCount": 100,
"messageHandlerOptions": {
"autoComplete": false,
"maxConcurrentCalls": 1
}
}
}
prefetchCount - Gets or sets the number of messages that the message receiver can simultaneously request.
autoComplete - Whether the trigger should automatically call complete after processing, or if the function code will manually call complete.
After retrying the message n(defaults to 10) number of times it will transfer the message to DLQ.

Azure WebJob Service Bus requeue message infront of the queue on error

I have a WebJob getting messages from Service bus which is working fine. If the webjob fails to finish or throws exception the message is sent back in the queue which is fine in itself.
I have set MaxDequeueCount to 10 and it is not a problem if it fails in some cases as it will try to process it again. But the problem is that the message seems to be sent to the bottom of the queue and other messages are handled before we get back to the first failed one.
I would like to handle one message at a time because I might have multiple updates on the same entity coming in a row. If the order changes it would update the entity in wrong order.
Is if it is possible to send the message back infront of the queue on error or continue working on the same message until we reach the MaxDequeueCount?

Ideally, you should not be relying on message order.
Given your requirement, you could potentially go with the FIFO semantics of Azure Service Bus Sessions. When a message is handled within a session and message is aborted, it will be handled once again rather than go to the end of the queue. You need to keep in mind the following:
Can only process one message at a time
Requires session to be used
If message is not completed and not aborted, it will be left hanging in the queue and will be picked up when a session is restarted.

Azure cannot Add Message to Queue

I have an Azure WebJob and I use the CloudQueue to communicate to it.
From my Web Application:
logger.Info("Writing Conversion Request to Document Queue " + JsonConvert.SerializeObject(blobInfo));                       
var queueMessage = new CloudQueueMessage(JsonConvert.SerializeObject(blobInfo));                       
documentQueue.AddMessage(queueMessage);
I verify in my log file that I see the INFO statement being written.
However, when I go to my queue:
What baffles me even more ... this Queue was full of messages before, including timestamps of this evening. 
I went and cleared out my Queue, and after clearing it, it will no longer receive messages.
Has anyone ever seen this before?

As Gaurav Mantri mentioned in the comments that the message shoud be processed by your WebJob. When it is up to max attempts then will be moved to poison queue.
We could also get more details about poison messages from azure official tutorials. The following is the snippet from the tutorials.
Messages whose content causes a function to fail are called poison messages. When the function fails, the queue message is not deleted and eventually is picked up again, causing the cycle to be repeated. The SDK can automatically interrupt the cycle after a limited number of iterations, or you can do it manually.
The SDK will call a function up to 5 times to process a queue message. If the fifth try fails, the message is moved to a poison queue. The maximum number of retries is configurable.

Azure Service Bus Subscriber regularly phoning home?

We have pub/sub application that involves an external client subscribing to a Web Role publisher via an Azure Service Bus Topic. Our current billing cycle indicates we've sent/received >25K messages, while our dashboard indicates we've sent <100. We're investigating our implementation and checking our assumptions in order to understand the disparity.
As part of our investigation we've gathered wireshark captures of client<=>service bus traffic on the client machine. We've noticed a regular pattern of communication that we haven't seen documented and would like to better understand. The following exchange occurs once every 50s when there is otherwise no activity on the bus:
The client pushes ~200B to the service bus.
10s later, the service bus pushes ~800B to the client. The client registers the receipt of an empty message (determined via breakpoint.)
The client immediately responds by pushing ~1000B to the service bus.
Some relevant information:
This occurs when our web role is not actively pushing data to the service bus.
Upon receiving a legit message from the Web Role, the pattern described above will not occur again until a full 50s has passed.
Both client and server connect to sb://namespace.servicebus.windows.net via TCP.
Our application messages are <64 KB
Questions
What is responsible for the regular, 3-packet message exchange we're seeing? Is it some sort of keep-alive?
Do each of the 3 packets count as a separately billable message?
Is this behavior configurable or otherwise documented?
EDIT:
This is the code the receives the messages:
private void Listen()
{
_subscriptionClient.ReceiveAsync().ContinueWith(MessageReceived);
}
private void MessageReceived(Task<BrokeredMessage> task)
{
if (task.Status != TaskStatus.Faulted && task.Result != null)
{
task.Result.CompleteAsync();
// Do some things...
}
Listen();
}

I think what you are seeing is the Receive call in the background. Behind the scenes the Receive calls are all using long polling. Which means they call out to the Service Bus endpoint and ask for a message. The Service Bus service gets that request and if it has a message it will return it immediately. If it doesn't have a message it will hold the connection open for a time period in case a message arrives. If a message arrives within that time frame it will be returned to the client. If a message is not available by the end of the time frame a response is sent to the client indicating that no message was there (aka, your null BrokeredMessage). If you call Receive with no overloads (like you've done here) it will immediately make another request. This loop continues to happend until a message is received.
Thus, what you are seeing are the number of times the client requests a message but there isn't one there. The long polling makes it nicer than what the Windows Azure Storage Queues have because they will just immediately return a null result if there is no message. For both technologies it is common to implement an exponential back off for requests. There are lots of examples out there of how to do this. This cuts back on how often you need to go check the queue and can reduce your transaction count.
To answer your questions:
Yes, this is normal expected behaviour.
No, this is only one transaction. For Service Bus you get charged a transaction each time you put a message on a queue and each time a message is requested (which can be a little opaque given that Recieve makes calls multiple times in the background). Note that the docs point out that you get charged for each idle transaction (meaning a null result from a Receive call).
Again, you can implement a back off methodology so that you aren't hitting the queue so often. Another suggestion I've recently heard was if you have a queue that isn't seeing a lot of traffic you could also check the queue depth to see if it was > 0 before entering the loop for processing and if you get no messages back from a receive call you could go back to watching the queue depth. I've not tried that and it is possible that you could get throttled if you did the queue depth check too often I'd think.
If these are your production numbers then your subscription isn't really processing a lot of messages. It would likely be a really good idea to have a back off policy to a time that is acceptable to wait before it is processed. Like, if it is okay that a message sits for more than 10 minutes then create a back off approach that will eventually just be checking for a message every 10 minutes, then when it gets one process it and immediately check again.
Oh, there is a Receive overload that takes a timeout, but I'm not 100% that is a server timeout or a local timeout. If it is local then it could still be making the calls every X seconds to the service. I think this is based on the OperationTimeout value set on the Messaging Factory Settings when creating the SubscriptionClient. You'd have to test that.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.