Azure function: limit the number of calls per second

I have an Azure function triggered by queue messages. This function makes a request to a third-party API. Unfortunately, this API has a limit of 10 transactions per second, but I might have more than 10 messages per second in the Service Bus queue. How can I limit the number of calls of the Azure function to satisfy the third-party API's limit?

Unfortunately there is no built-in option for this.
The only reliable way to limit concurrent executions would be to run on a fixed App Service Plan (not Consumption Plan) with just 1 instance running all the time. You will have to pay for this instance.
Then set this option in the host.json file:
"serviceBus": {
    // The maximum number of concurrent calls to the callback the message
    // pump should initiate. The default is 16.
    "maxConcurrentCalls": 10
}
Finally, make sure each execution takes at least one second (or choose another minimum duration and adjust the concurrent calls accordingly), so that 10 concurrent executions cannot start more than 10 API calls per second.
As @SeanFeldman suggested, see some other ideas in this answer. It's about Storage Queues, but it applies to Service Bus too.
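The "takes a second to execute" step could be sketched roughly as follows. This is only an illustration, not part of the Functions runtime: MinimumDuration and RunAsync are hypothetical names, and the helper simply waits out whatever remains of the minimum duration after the real work finishes.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical helper: pads an operation so it never finishes faster than `minimum`.
public static class MinimumDuration
{
    public static async Task RunAsync(Func<Task> work, TimeSpan minimum)
    {
        var stopwatch = Stopwatch.StartNew();
        try
        {
            await work();
        }
        finally
        {
            // Wait out the rest of the minimum duration, even if the work threw.
            var remaining = minimum - stopwatch.Elapsed;
            if (remaining > TimeSpan.Zero)
            {
                await Task.Delay(remaining);
            }
        }
    }
}
```

Inside the function body this would be used as `await MinimumDuration.RunAsync(() => CallThirdPartyApiAsync(message), TimeSpan.FromSeconds(1));`, where CallThirdPartyApiAsync stands for your own API call. With maxConcurrentCalls set to 10 on a single instance, each of the 10 slots then starts at most one call per second.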

You can try writing some custom logic, e.g. implement your own in-memory queue in the Azure function to queue up requests and limit the calls to the third-party API. Until the call to the third-party API succeeds, you don't need to acknowledge the messages in the queue. This way reliability is also maintained.
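One possible shape for such custom logic is a sliding-window limiter, sketched below under some assumptions: SlidingWindowLimiter and TryAcquire are hypothetical names, the clock is passed in by the caller, and the state lives in process memory, so it only enforces the limit when a single function instance is running.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory limiter: allows at most `limit` calls per `window`.
public sealed class SlidingWindowLimiter
{
    private readonly int _limit;
    private readonly TimeSpan _window;
    private readonly Queue<DateTime> _recent = new Queue<DateTime>();
    private readonly object _gate = new object();

    public SlidingWindowLimiter(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    // Returns true and records the call if fewer than `limit` calls
    // were recorded within the window ending at `now`.
    public bool TryAcquire(DateTime now)
    {
        lock (_gate)
        {
            // Forget calls that have left the window.
            while (_recent.Count > 0 && now - _recent.Peek() >= _window)
            {
                _recent.Dequeue();
            }
            if (_recent.Count >= _limit)
            {
                return false;
            }
            _recent.Enqueue(now);
            return true;
        }
    }
}
```

When TryAcquire returns false, the function could wait briefly and try again, or leave the message unacknowledged so that it is delivered again later.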

The best way to maintain integrity of the system is to throttle the consumption of the Service Bus messages. You can control how your QueueClient processes the messages, see: https://learn.microsoft.com/en-us/azure/service-bus-messaging/service-bus-dotnet-get-started-with-queues#4-receive-messages-from-the-queue
Check out the MaxConcurrentCalls option:
static void RegisterOnMessageHandlerAndReceiveMessages()
{
    // Configure the message handler options in terms of exception handling, number of concurrent messages to deliver, etc.
    var messageHandlerOptions = new MessageHandlerOptions(ExceptionReceivedHandler)
    {
        // Maximum number of concurrent calls to the callback ProcessMessagesAsync(), set to 1 for simplicity.
        // Set it according to how many messages the application wants to process in parallel.
        MaxConcurrentCalls = 1,
        // Indicates whether the message pump should automatically complete the messages after returning from user callback.
        // False below indicates the complete operation is handled by the user callback as in ProcessMessagesAsync().
        AutoComplete = false
    };
    // Register the function that processes messages.
    queueClient.RegisterMessageHandler(ProcessMessagesAsync, messageHandlerOptions);
}

Do you want to discard the N-10 extra messages you receive in each one-second interval, or do you want to process every message while respecting the API throttling limit? For the latter, you can add the messages processed by your function to another queue, from which another function (timer trigger) reads a batch of 10 messages every second.
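The batching half of that idea can be sketched as below. This is a simplified illustration: an in-memory ConcurrentQueue stands in for the second queue, and BatchDrain and TakeBatch are hypothetical names. In a real function app, the timer-triggered function would receive the batch from the second Service Bus queue instead.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical helper called once per timer tick.
public static class BatchDrain
{
    // Removes at most maxBatchSize items from the queue and returns them in order.
    public static List<T> TakeBatch<T>(ConcurrentQueue<T> queue, int maxBatchSize)
    {
        var batch = new List<T>(maxBatchSize);
        while (batch.Count < maxBatchSize && queue.TryDequeue(out var item))
        {
            batch.Add(item);
        }
        return batch;
    }
}
```

Calling `BatchDrain.TakeBatch(queue, 10)` once per second caps the number of messages forwarded to the API at 10 per tick, while the remaining messages simply wait in the queue.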

Related

How to know if there is message ready to consume

I am new to Kafka and looking for a way to know whether a message is ready for the consumer before calling the consume method.
I am doing a POC on integrating C# with Kafka. Previously I did the same for RabbitMQ, which has a "MessageCount" method, but I cannot find anything similar for Kafka.

Actually, a Kafka consumer runs an infinite loop in which it calls the poll() function to fetch any new records from a partition.
The configuration max.poll.interval.ms specifies the interval after which, if poll() has not been called, the consumer is considered dead and a rebalance is triggered.
So, to answer your question, the consumer always calls poll() to check whether a message is available to be consumed. However, there are some consumer configurations that let you wait for a minimum amount of data before consuming:
fetch.min.bytes: wait until at least x bytes of messages are available before consuming them
fetch.max.wait.ms: the maximum time to wait for fetch.min.bytes to be gathered

In theory, if you could check whether messages exist, you would already be connecting to Kafka to do so. So you might as well just call consume inside a try/catch; the performance is the same.

Throttle/restrict how a ServiceBusTrigger picks up messages from a Service Bus queue

I have a Service Bus queue (SBQ), which receives a lot of message payloads.
I have a ServiceBusTrigger (SBT) with accessRights(manage), which continuously polls messages from the SBQ.
The problem I am facing is:
My SBT (16 instances at once) picks up messages (16 messages individually) at one time and creates a request to another server (call it S1) for each.
If the SBT continuously creates 500-600 requests, the server S1 stops responding.
What I am expecting:
I could throttle/restrict how many messages are picked up at once from the SBQ, so that I indirectly restrict the number of requests sent.
Please share your thoughts on what design I should follow. I couldn't find the exact solution by searching.

Restrict the maximum concurrent calls of Service Bus Trigger.
In host.json, add configuration to throttle concurrency (the default is 16 messages at once, which is what you have seen). Here is an example for a v2 function.
{
    "version": "2.0",
    "extensions": {
        "serviceBus": {
            "messageHandlerOptions": {
                "maxConcurrentCalls": 8
            }
        }
    }
}
Restrict the Function host instance count. When the host scales out, each instance runs its own Service Bus trigger, which reads multiple messages concurrently as set above.
If the trigger runs on a dedicated App Service plan, scale in the instance count to some small value. For functions on the Consumption plan, add the app setting WEBSITE_MAX_DYNAMIC_APPLICATION_SCALE_OUT with a reasonable value (<=5). Of course, we can set the count to 1 in order to control the behavior strictly.
If we have control over how the messages are sent, schedule the incoming messages to help decrease the request rate.
Use static clients to reuse connections to the server S1.

How do I do async RPC calls with RabbitMq

I'm trying to build a REST API (ASP.NET Core) that calls the backend (C#) through RabbitMQ. To handle many requests, I will need to call the backend asynchronously.
To me, the example code from RabbitMQ does not seem to be thread-safe, because it dequeues messages until the one with the correct correlation id is returned. All others are discarded. (link: https://www.rabbitmq.com/tutorials/tutorial-six-dotnet.html )
while (true)
{
    var ea = (BasicDeliverEventArgs)consumer.Queue.Dequeue();
    if (ea.BasicProperties.CorrelationId == corrId)
    {
        return Encoding.UTF8.GetString(ea.Body);
    }
}
I'm thinking of the following possibilities:
Possibility 1:
I could use the SimpleRpcClient and create a separate instance for each request. This means a new reply queue gets created for each request.
Possibility 2:
Create my own RPC client that creates one reply queue (probably per request type) and returns the right response to the right request depending on the correlation id.
What is the best practice for making multiple calls asynchronously? Are there already implementations of the second possibility, or do I need to implement this myself?

Design a job queue. Push each job to the queue from the generator and forget about it, so that the job generator remains responsive.
Run as many workers as there are available CPU threads (for optimized performance) to process jobs.
Have each worker dequeue a job from the main queue and put it, along with its results, into a new queue.
Keep features for:
Not processing jobs that are too old.
Terminating long-running jobs.
Picking high-priority jobs first.
If permitted, design remote job runner nodes.
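A minimal in-process sketch of part of this design follows, assuming C# 9 or later. Job, JobQueue, and the members shown are hypothetical names; the sketch covers fire-and-forget enqueueing, a fixed set of workers, a results queue, and skipping stale jobs. Priorities, termination of long-running jobs, and remote runners are left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical job type: an id, a creation time, and the work to run.
public sealed record Job(int Id, DateTime CreatedUtc, Func<int> Work);

public sealed class JobQueue
{
    private readonly BlockingCollection<Job> _jobs = new();
    private readonly TimeSpan _maxAge;
    private readonly Task[] _workers;

    // Finished jobs land here together with their results.
    public ConcurrentQueue<(int Id, int Result)> Results { get; } = new();

    public JobQueue(TimeSpan maxAge, int workerCount)
    {
        _maxAge = maxAge;
        _workers = Enumerable.Range(0, workerCount)
            .Select(_ => Task.Factory.StartNew(() => WorkerLoop(), TaskCreationOptions.LongRunning))
            .ToArray();
    }

    // Fire and forget: the producer returns immediately.
    public void Enqueue(Job job) => _jobs.Add(job);

    // Stops accepting jobs and waits until the workers have drained the queue.
    public void Complete()
    {
        _jobs.CompleteAdding();
        Task.WaitAll(_workers);
    }

    private void WorkerLoop()
    {
        foreach (var job in _jobs.GetConsumingEnumerable())
        {
            if (DateTime.UtcNow - job.CreatedUtc > _maxAge)
            {
                continue; // Too old: drop it without running.
            }
            Results.Enqueue((job.Id, job.Work()));
        }
    }
}
```

A typical construction would be `new JobQueue(TimeSpan.FromMinutes(5), Environment.ProcessorCount)`, which matches the suggestion of one worker per available CPU thread.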

What is a safe overhead for RequestAdditionalTime()?

I have a Windows service that spawns a set of child activities on separate threads and that should only terminate when all those activities have successfully completed. I do not know in advance how long it might take to terminate an activity after a stop signal is received. During OnStop(), I wait in intervals for that stop signal and keep requesting additional time for as long as the system is willing to grant it.
Here is the basic structure:
class MyService : ServiceBase
{
    private CancellationTokenSource stopAllActivities;
    private CountdownEvent runningActivities;

    protected override void OnStart(string[] args)
    {
        // ... start a set of activities that signal runningActivities
        // when they stop
        // ... initialize runningActivities to the number of activities
    }

    protected override void OnStop()
    {
        stopAllActivities.Cancel();
        while (!runningActivities.Wait(10000))
        {
            RequestAdditionalTime(15000); // NOTE: 5000 added for overhead
        }
    }
}
Just how much "overhead" should I be adding in the RequestAdditionalTime call? I'm concerned that the requests are cumulative, instead of based on the point in time when each RequestAdditionalTime call is made. If that's the case, adding overhead could result in the system eventually denying the request because it's too far out in the future. But if I don't add any overhead then my service could be terminated before it has a chance to request the next block of additional time.

This post wasn't exactly encouraging:
The MSDN documentation doesn’t mention this but it appears that the value specified in RequestAdditionalTime is not actually ‘additional’ time. Instead, it replaces the value in ServicesPipeTimeout. Worse still, any value greater than two minutes (120000 milliseconds) is ignored, i.e. capped at two minutes.
I hope that's not the case, but I'm posting this as a worst-case answer.
UPDATE: The author of that post was kind enough to post a very detailed reply to my comment, which I've copied below.
Lars, the short answer is no.
What I would say is that I now realise that Windows Services ought to be designed to start and terminate processing quickly when requested to do so.
As developers, we tend to focus on the implementation of the processing and then package it up and deliver it as a Windows Service.
However, this really isn’t the correct approach to designing Windows Services. Services must be able to respond quickly to requests to start and stop, not only when an administrator is making the request from the services console but also when the operating system is requesting a start as part of its start-up processing or a stop because it is shutting down.
Consider what happens when Windows is configured to shut down when a UPS signals that the power has failed. It’s not appropriate for the service to respond with “I need a few more minutes…”.
It’s possible to write services that react quickly to stop requests even when they implement long running processing tasks. Usually a long running process will consist of batch processing of data and the processing should check if a stop has been requested at the level of the smallest unit of work that ensures data consistency.
As an example, the first service where I found the stop timeout was a problem involved the processing of a notifications queue on a remote server. The processing retrieved a notification from the queue, called a web service to retrieve data related to the subject of the notification, and then wrote a data file for processing by another application.
I implemented the processing as a timer driven call to a single method. Once the method is called it doesn’t return until all the notifications in the queue have been processed. I realised this was a mistake for a Windows Service because occasionally there might be tens of thousands of notifications in the queue and processing might take several minutes.
The method is capable of processing 50 notifications per second. So, what I should have done was implement a check to see if a stop had been requested before processing each notification. This would have allowed the method to return when it has completed the processing of a notification but before it has started to process the next notification. This would have ensured that the service responds quickly to a stop request and any pending notifications remained queued for processing when the service is restarted.
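The "check before each notification" idea described above could look roughly like this. It is a sketch, not the quoted author's code: NotificationProcessor and ProcessPending are hypothetical names, and an in-memory Queue stands in for the remote notifications queue.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class NotificationProcessor
{
    // Processes queued items one at a time and returns how many were handled.
    // The token is checked before each item, so a stop request takes effect
    // once the current item is finished; the remaining items stay queued.
    public static int ProcessPending<T>(Queue<T> pending, Action<T> handle, CancellationToken stopRequested)
    {
        var processed = 0;
        while (pending.Count > 0 && !stopRequested.IsCancellationRequested)
        {
            handle(pending.Dequeue());
            processed++;
        }
        return processed;
    }
}
```

In the service from the question, the token would be stopAllActivities.Token, so OnStop only has to wait for the unit of work currently in flight rather than for the whole backlog.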

Azure Service Bus Subscriber regularly phoning home?

We have a pub/sub application that involves an external client subscribing to a Web Role publisher via an Azure Service Bus Topic. Our current billing cycle indicates we've sent/received >25K messages, while our dashboard indicates we've sent <100. We're investigating our implementation and checking our assumptions in order to understand the disparity.
As part of our investigation we've gathered wireshark captures of client<=>service bus traffic on the client machine. We've noticed a regular pattern of communication that we haven't seen documented and would like to better understand. The following exchange occurs once every 50s when there is otherwise no activity on the bus:
The client pushes ~200B to the service bus.
10s later, the service bus pushes ~800B to the client. The client registers the receipt of an empty message (determined via breakpoint.)
The client immediately responds by pushing ~1000B to the service bus.
Some relevant information:
This occurs when our web role is not actively pushing data to the service bus.
Upon receiving a legit message from the Web Role, the pattern described above will not occur again until a full 50s has passed.
Both client and server connect to sb://namespace.servicebus.windows.net via TCP.
Our application messages are <64 KB
Questions
What is responsible for the regular, 3-packet message exchange we're seeing? Is it some sort of keep-alive?
Do each of the 3 packets count as a separately billable message?
Is this behavior configurable or otherwise documented?
EDIT:
This is the code that receives the messages:
private void Listen()
{
    _subscriptionClient.ReceiveAsync().ContinueWith(MessageReceived);
}

private void MessageReceived(Task<BrokeredMessage> task)
{
    if (task.Status != TaskStatus.Faulted && task.Result != null)
    {
        task.Result.CompleteAsync();
        // Do some things...
    }
    Listen();
}

I think what you are seeing is the Receive call in the background. Behind the scenes, the Receive calls all use long polling, which means they call out to the Service Bus endpoint and ask for a message. The Service Bus service gets that request and, if it has a message, returns it immediately. If it doesn't have a message, it holds the connection open for a period of time in case a message arrives. If a message arrives within that time frame, it is returned to the client. If no message is available by the end of the time frame, a response is sent to the client indicating that no message was there (that is, your null BrokeredMessage). If you call Receive with no arguments (as you've done here), it will immediately make another request. This loop continues until a message is received.
Thus, what you are seeing are the number of times the client requests a message but there isn't one there. The long polling makes it nicer than what the Windows Azure Storage Queues have because they will just immediately return a null result if there is no message. For both technologies it is common to implement an exponential back off for requests. There are lots of examples out there of how to do this. This cuts back on how often you need to go check the queue and can reduce your transaction count.
To answer your questions:
Yes, this is normal expected behaviour.
No, this is only one transaction. For Service Bus you get charged a transaction each time you put a message on a queue and each time a message is requested (which can be a little opaque given that Receive makes calls multiple times in the background). Note that the docs point out that you get charged for each idle transaction (meaning a null result from a Receive call).
Again, you can implement a back-off approach so that you aren't hitting the queue so often. Another suggestion I've recently heard: if you have a queue that isn't seeing a lot of traffic, you could check the queue depth to see whether it is > 0 before entering the processing loop, and if you get no messages back from a receive call, go back to watching the queue depth. I've not tried that, and I'd think it is possible that you could get throttled if you did the queue depth check too often.
If these are your production numbers then your subscription isn't really processing a lot of messages. It would likely be a really good idea to have a back off policy to a time that is acceptable to wait before it is processed. Like, if it is okay that a message sits for more than 10 minutes then create a back off approach that will eventually just be checking for a message every 10 minutes, then when it gets one process it and immediately check again.
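The back-off policy described above can be sketched as a small delay calculator. PollingBackoff, NextDelay, and Reset are hypothetical names, and doubling the delay is just one common choice; the cap plays the role of the "acceptable time to wait" (10 minutes in the example above).

```csharp
using System;

// Hypothetical helper: tracks how long to wait between empty receive calls.
public sealed class PollingBackoff
{
    private readonly TimeSpan _initial;
    private readonly TimeSpan _max;
    private TimeSpan _current;

    public PollingBackoff(TimeSpan initial, TimeSpan max)
    {
        _initial = initial;
        _max = max;
        _current = initial;
    }

    // Call after an empty receive: returns how long to wait now,
    // then doubles the next delay up to the cap.
    public TimeSpan NextDelay()
    {
        var delay = _current;
        var doubled = TimeSpan.FromTicks(_current.Ticks * 2);
        _current = doubled < _max ? doubled : _max;
        return delay;
    }

    // Call after a message was received, so the next check happens promptly.
    public void Reset() => _current = _initial;
}
```

In a receive loop like the one in the question, a null message would lead to waiting for NextDelay() before the next receive call, and a real message would lead to Reset() followed by an immediate check.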
Oh, there is a Receive overload that takes a timeout, but I'm not 100% sure whether that is a server timeout or a local timeout. If it is local, then it could still be making the calls every X seconds to the service. I think this is based on the OperationTimeout value set on the Messaging Factory Settings when creating the SubscriptionClient. You'd have to test that.
