I have an Azure HttpTrigger function which processes POST requests and scales out under heavy load. The issue is that the caller of the function only waits 3 seconds for an HTTP 200 status code.
But when an Azure Function scales out, it takes 4-6 seconds until the request gets processed. If the caller sends a request during the scale-out, it is possible that they cancel the request and my service is never able to process it, which is the worst-case scenario.
Is there a way to prevent that? My ideal scenario would be an immediate HTTP 202 response to the caller, but I'm afraid that this is not possible during a scale-out.
A scale-out requires your app to be loaded onto another instance, so those requests incur some delay because of the time it takes to load your app onto the new instance.
As described in the Official Documentation:
The Consumption plan is the true serverless hosting plan, since it enables scaling to zero when idle, but some requests might incur additional latency at startup.
To get consistently low latency with autoscaling, you should move to the Premium hosting plan, which avoids cold starts by keeping perpetually warm instances.
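Once an instance is warm, the "immediate HTTP 202" pattern the question asks for can be implemented by decoupling acceptance from processing. Below is a minimal sketch, assuming the in-process C# programming model and an Azure Storage queue; the queue name "work-items" and the function names are placeholders:

```csharp
using System.IO;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class AsyncPattern
{
    // Accept the POST, enqueue the payload, and return HTTP 202 immediately.
    [FunctionName("Accept")]
    public static async Task<IActionResult> Accept(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequest req,
        [Queue("work-items")] IAsyncCollector<string> queue)
    {
        string body = await new StreamReader(req.Body).ReadToEndAsync();
        await queue.AddAsync(body);      // hand off to background processing
        return new AcceptedResult();     // 202: accepted, not yet processed
    }

    // The actual work happens here, detached from the caller's timeout.
    [FunctionName("Process")]
    public static void Process([QueueTrigger("work-items")] string item, ILogger log)
    {
        log.LogInformation($"Processing: {item}");
    }
}
```

This doesn't remove the cold start itself (the HTTP trigger still has to be loaded before it can answer anything), but it reduces the response to a quick enqueue, so a warm instance can return the 202 well within the caller's 3-second window.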
Related
I'm having an issue where I'm triggering an event which is being handled by an Azure Function Service Bus Trigger.
The trigger could run for anywhere up to an hour, which is fine, but I'm finding that after 5 minutes the message gets re-added to the queue, so it's being handled repeatedly.
I can hack around this by ensuring that this specific topic only reads the message once by changing the MaxDeliveryCount, but ideally I'd like the lock to have a longer expiry time than the function (max 1 hour).
According to the Microsoft documentation it should already handle this, but I'm still seeing the message get re-queued:
The Functions runtime receives a message in PeekLock mode. It calls Complete on the message if the function finishes successfully, or calls Abandon if the function fails. If the function runs longer than the PeekLock timeout, the lock is automatically renewed as long as the function is running.
Any ideas?
Azure Service Bus can only lock a message for a maximum of 5 minutes at a time, but it can renew the lock, which technically allows a message to stay locked for as long as needed, provided the renewal requests don't fail. In addition to that, there's a limit on how long functions can execute. For example, on the Consumption plan, a function won't run longer than the maximum of 10 minutes. For any processing longer than that, you should look into alternatives, which include but are not limited to the following (see the host.json sketch after this list):
Functions Premium
App Service
Containers (Container Apps Service looks very promising)
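If you stay on Functions, there are two knobs in play: the host-level function timeout and the Service Bus lock auto-renewal window. Here is a hedged host.json sketch, assuming the v2/v3 host with Service Bus extension 4.x (the setting names differ in extension 5.x, and a 1-hour functionTimeout requires a Premium or dedicated plan, not Consumption):

```json
{
  "version": "2.0",
  "functionTimeout": "01:00:00",
  "extensions": {
    "serviceBus": {
      "messageHandlerOptions": {
        "maxAutoRenewDuration": "01:00:00"
      }
    }
  }
}
```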
I am sending 1000 requests asynchronously to the API, with a 10-second timeout on each request.
But the trigger executes only 400-500 of the requests and ignores the rest.
My question is: does the HTTP trigger execute all requests in parallel or sequentially, and is there a limit on parallel threads in the HTTP trigger?
Requests should be executed in parallel; in your case it seems there aren't enough resources in your service plan to deal with all of them. For Azure Functions there are two different hosting modes, the Consumption plan and the Azure App Service plan, and the Azure documentation gives more detail:
The Consumption plan automatically allocates compute power when your code is running, scales out as necessary to handle load, and then scales down when code is not running.
In the App Service plan, your function apps run on dedicated VMs on Basic, Standard, and Premium SKUs, similar to Web Apps. Dedicated VMs are allocated to your App Service apps, which means the functions host is always running.
It seems that you are using an App Service plan; if that is the case, please try scaling your service plan up or out.
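It is also worth ruling out the client side: 1000 simultaneous requests fired from one process can exhaust local sockets or thread-pool threads, which looks exactly like the server "ignoring" requests. A hedged sketch of a throttled client, assuming a plain C# console app; the URL and concurrency cap are placeholders:

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

class LoadClient
{
    static readonly HttpClient Http = new HttpClient { Timeout = TimeSpan.FromSeconds(10) };
    static readonly SemaphoreSlim Gate = new SemaphoreSlim(100); // at most 100 in flight

    static async Task Main()
    {
        var tasks = Enumerable.Range(0, 1000).Select(async i =>
        {
            await Gate.WaitAsync();
            try
            {
                var resp = await Http.PostAsync(
                    "https://example.azurewebsites.net/api/myfunction", // placeholder
                    new StringContent($"request {i}"));
                return (int)resp.StatusCode;
            }
            catch (TaskCanceledException) { return 0; } // request timed out
            finally { Gate.Release(); }
        }).ToArray();

        var results = await Task.WhenAll(tasks);
        Console.WriteLine($"HTTP 200: {results.Count(r => r == 200)} of {results.Length}");
    }
}
```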
I have an ASP.NET Web API app (the backend for a mobile app) published on Azure. Most requests are lightweight and processed quickly, but every client makes a lot of requests, and does so rapidly on every interaction with the mobile application.
The problem is that the web application can't process even a small (10/sec) number of requests. The HTTP queue grows, but CPU usage doesn't.
I ran load testing with 250 requests/second, and the average response time grew from ~200 ms to 5 s.
Is the problem in my code? Or is it a hardware restriction? Can I increase the number of requests processed at one time?
First, it really matters which instance sizes you use (especially if you use Small or Extra Small instances) and how many instances you run; don't expect too much from one core and 2 GB of RAM on a server.
Use caching (WebApi.OutputCache.V2 to reduce the server's processing effort, Azure Redis Cache as fast cache storage); the database can also be a bottleneck.
If you get the same results after both adding more instances and adding caching, then you should take a look at your code and find the bottlenecks there.
These are only general recommendations; there is no code in the question.
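To illustrate the caching suggestion, here is a minimal sketch of output caching with WebApi.OutputCache.V2 (the Strathweb.CacheOutput.WebApi2 package); the controller and its data are placeholders:

```csharp
using System.Web.Http;
using WebApi.OutputCache.V2;

public class ProductsController : ApiController
{
    // Identical GETs within 60 seconds are served from the cache instead
    // of re-executing the action, cutting repeated processing effort.
    [HttpGet]
    [CacheOutput(ServerTimeSpan = 60, ClientTimeSpan = 60)]
    public IHttpActionResult GetAll()
    {
        // the expensive lookup now runs at most once per minute per cache key
        return Ok(new[] { "a", "b", "c" });
    }
}
```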
I have my OWIN application hosted as a Windows Service, and I am getting a lot of timeout issues from different clients. I have some metrics in place around request/response times, but the numbers are very different: for example, I can see the client taking around one minute to perform a request that, on the server, looks like it takes 3-4 seconds. I am therefore assuming that the number of requests that can be accepted has reached its limit and subsequent incoming requests get queued up. Am I right? If that's the case, is there any way I can monitor the number of incoming requests at a given time and how big the queue is (as in, the number of requests pending to be served)?
I am playing around with https://msdn.microsoft.com/en-us/library/microsoft.owin.host.httplistener.owinhttplistener.setrequestprocessinglimits(v=vs.113).aspx, but it doesn't seem to have any effect.
Any feedback is much appreciated.
Thanks!
HttpListener is built on top of Http.Sys so you need to use its performance counters and ETW traces to get this level of information.
https://msdn.microsoft.com/en-us/library/windows/desktop/cc307239%28v=vs.85%29.aspx
http://blogs.msdn.com/b/wndp/archive/2007/01/18/event-tracing-in-http-sys-part-1-capturing-a-trace.aspx
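For the queue-depth question specifically, Http.Sys exposes per-queue performance counters that can also be read programmatically. A hedged sketch using System.Diagnostics.PerformanceCounter; the category and counter names ("Http Service Request Queues", "CurrentQueueSize") are what recent Windows versions appear to expose, so verify the exact names in perfmon on your machine:

```csharp
using System;
using System.Diagnostics;

class QueueMonitor
{
    static void Main()
    {
        // Enumerate every Http.Sys request queue and print its current depth.
        var category = new PerformanceCounterCategory("Http Service Request Queues");
        foreach (var instance in category.GetInstanceNames())
        {
            using (var counter = new PerformanceCounter(
                "Http Service Request Queues", "CurrentQueueSize", instance))
            {
                Console.WriteLine($"{instance}: {counter.NextValue()} queued requests");
            }
        }
    }
}
```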
I am trying to simulate X number of concurrent requests for a WCF Service and measure the response time for each request. I want to have all the requests hit the Service at more or less the same time.
As the first step I spawned X threads, using the Thread class, and invoked the Start method. To synchronize all the requests, in the thread callback I open the connection and use Monitor.Wait to hold the request back from being fired until all the threads are created and started. Once all the threads are started, I call Monitor.PulseAll to trigger the method invocation on the WCF client proxy.
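For reference, this is the gate pattern being described; a minimal sketch, with the actual proxy call left as a placeholder comment (note that a thread which hasn't reached Wait when PulseAll fires would block forever, so real code should track readiness rather than sleep):

```csharp
using System;
using System.Threading;

class Gate
{
    static readonly object Lock = new object();

    static void Main()
    {
        for (int i = 0; i < 10; i++)
        {
            new Thread(() =>
            {
                lock (Lock) Monitor.Wait(Lock);   // park until released
                Console.WriteLine("fired");       // WCF proxy call would go here
            }).Start();
        }
        Thread.Sleep(1000);                       // crude: let all threads reach Wait
        lock (Lock) Monitor.PulseAll(Lock);       // release everyone at once
    }
}
```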
When I execute the requests this way, I see a huge delay in the responses. A request that should take just a few milliseconds is taking about a second.
I also noticed a huge lag between the time a request was dispatched and the time it was received at the service method. I measured this by sending the client time stamp as a parameter value to the service method for each request.
I have the following settings; assume "X" to be the number of concurrent requests I want to fire (a sketch of the throttling setup follows the list). Also note that with these settings I don't get any denial-of-service issues.
The call chain is as follows: Client -> Service1 -> Service2 -> Service3.
All Services are PerCall with Concurrency set to Multiple.
Throttling set to X Concurrent calls, X Concurrent Instances.
MaxConnections, ListenBacklog on the Service to X.
Min/Max Threads of ThreadPool set to X on both Client and Server (I have applied the patch provided by Microsoft).
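For concreteness, the throttling settings above can be applied in code via ServiceThrottlingBehavior (or equivalently in the serviceThrottling element of the config file); this is a hedged sketch, with the rest of the host setup omitted:

```csharp
using System.ServiceModel;
using System.ServiceModel.Description;

static class HostSetup
{
    // Apply "X concurrent calls, X concurrent instances" to a ServiceHost.
    public static void ApplyThrottle(ServiceHost host, int x)
    {
        var throttle = host.Description.Behaviors.Find<ServiceThrottlingBehavior>();
        if (throttle == null)
        {
            throttle = new ServiceThrottlingBehavior();
            host.Description.Behaviors.Add(throttle);
        }
        throttle.MaxConcurrentCalls = x;
        throttle.MaxConcurrentInstances = x;
        throttle.MaxConcurrentSessions = x;
    }
}
```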
I am not sure whether the response time I'm measuring is accurate. Am I missing something very trivial?
Any inputs on this would be of great help.
Thanks.
-Krishnan
I did find the answer myself, the hard way. All this while, the way I was measuring the response time was wrong. One should spawn X threads, where X is the number of concurrent users one wants to simulate. In each thread, open the connection only once and use a while loop that executes the WCF method under test for a given duration. Measure the response time of each call, accumulate it, and average it over the number of calls executed within the given duration.
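A minimal sketch of that measurement loop, assuming a C# console app; the proxy creation and the WCF operation ("client.Ping()") are placeholders for the service under test:

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class LoadRunner
{
    static void Main()
    {
        const int users = 50;                        // X concurrent users
        var duration = TimeSpan.FromSeconds(30);     // measurement window
        var threads = new Thread[users];

        for (int i = 0; i < users; i++)
        {
            threads[i] = new Thread(() =>
            {
                // var client = new MyServiceClient();  // open the connection once
                long calls = 0, totalMs = 0;
                var sw = new Stopwatch();
                var end = DateTime.UtcNow + duration;
                while (DateTime.UtcNow < end)
                {
                    sw.Restart();
                    // client.Ping();                   // the call being measured
                    sw.Stop();
                    totalMs += sw.ElapsedMilliseconds;
                    calls++;
                }
                Console.WriteLine($"avg {(double)totalMs / Math.Max(calls, 1):F2} ms over {calls} calls");
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
    }
}
```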
If all your outgoing calls are coming from a single process, it is likely that the runtime is either consolidating multiple requests onto a single open channel or capping the number of concurrent requests to a single target service. You may have better results if you move each simulated client into its own process, use a named EventWaitHandle to synchronize them, and call Set() to release them all at once.
There is also a limit to the number of simultaneous pending (TCP SYN or SYN/ACK state) outgoing TCP connections allowed by desktop builds of Windows; if you encounter this limit you will get event 4226 logged and the additional simultaneous connections will be deferred.
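The named-event synchronization from the first paragraph might look like this; a hedged sketch where the event name "LoadTestStart" is a placeholder, and the controller half is shown as a comment:

```csharp
using System;
using System.Threading;

class SimulatedClient
{
    static void Main()
    {
        // Every client process opens the same named manual-reset event.
        using (var gate = new EventWaitHandle(
            false, EventResetMode.ManualReset, "LoadTestStart"))
        {
            gate.WaitOne();                  // all processes park here
            Console.WriteLine("firing");     // WCF call would go here
        }
    }
}

// Controller process (separate executable):
//   using (var gate = new EventWaitHandle(false, EventResetMode.ManualReset, "LoadTestStart"))
//   {
//       Console.ReadLine();  // wait until all client processes are running
//       gate.Set();          // ManualReset: releases every waiting process at once
//   }
```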