I have a long running task in an Azure function which I want to run in a background thread using Task.Run. I don't care about the result.
public static async Task Run(...)
{
var taskA = await DoTaskA();
Task.Run(new Action(MethodB));
....
// return result based on taskA
}
Is this an acceptable pattern in Azure functions? (this is an HTTP trigger function)
I know this could also be done by adding a message to a queue and having another function execute the task, but I want to know the pros and cons of starting long-running tasks in a background thread in Azure Functions.
It might be best to have an Azure Function running TaskA and have it post a message in a ServiceBus which would trigger another Azure Function running TaskB when something is posted in that ServiceBus since no answer is needed anyway.
Here is the example shown on Microsoft's website:
[FunctionName("FunctionB")]
public static void Run(
[ServiceBusTrigger("myqueue", AccessRights.Manage, Connection = "ServiceBusConnection")]
string myQueueItem,
TraceWriter log)
{
log.Info($"C# ServiceBus queue trigger function processed message: {myQueueItem}");
MethodB();
}
In that situation, you do not have to start a new task. Just call MethodB().
That will give you the flexibility to adjust the plan of your Azure Functions (App Service vs Consumption plan) and minimize the overall cost.
Depending on how complex your scenario is, you may want to look into Durable Functions. Durable Functions gives you greater control over a variety of scenarios, including long-running tasks.
No, no and no.
Have your HTTP-triggered function return a 202 Accepted, and post the results to a blob URL later on. The 202 should include a Location header that points to the soon-to-exist blob URL, and maybe a Retry-After header as well if you have a rough idea how long the processing takes.
The long processing task should be a queue-triggered function. Why? Because things don't always go according to plan and you may need to retry processing. Why not have the retry built in?
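From the caller's side, the 202 contract described above could be consumed roughly as follows. This is only a sketch: the class and method names (AcceptedPolling, GetPollDelay, SubmitAndWaitAsync) are hypothetical, and it assumes the result URL answers 404 until the blob exists, which the answer does not state.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

public static class AcceptedPolling
{
    // Use the server's Retry-After hint when present, otherwise the fallback delay.
    public static TimeSpan GetPollDelay(HttpResponseMessage response, TimeSpan fallback)
    {
        var retryAfter = response.Headers.RetryAfter;
        if (retryAfter?.Delta is TimeSpan delta) return delta;
        if (retryAfter?.Date is DateTimeOffset date)
        {
            var wait = date - DateTimeOffset.UtcNow;
            return wait > TimeSpan.Zero ? wait : TimeSpan.Zero;
        }
        return fallback;
    }

    // Submit the work, then poll the Location URL until the result appears.
    // Polling only stops on success, on an error status, or when ct is cancelled.
    public static async Task<string> SubmitAndWaitAsync(
        HttpClient client, Uri submitUrl, HttpContent payload, CancellationToken ct)
    {
        using var accepted = await client.PostAsync(submitUrl, payload, ct);
        if (accepted.StatusCode != HttpStatusCode.Accepted)
            throw new InvalidOperationException($"Expected 202, got {(int)accepted.StatusCode}.");

        var resultUrl = accepted.Headers.Location
            ?? throw new InvalidOperationException("202 response had no Location header.");
        // The Location header may be relative to the submit URL.
        if (!resultUrl.IsAbsoluteUri) resultUrl = new Uri(submitUrl, resultUrl);

        var delay = GetPollDelay(accepted, TimeSpan.FromSeconds(5));
        while (true)
        {
            await Task.Delay(delay, ct);
            using var result = await client.GetAsync(resultUrl, ct);
            // Assumption: 404 means "not ready yet".
            if (result.StatusCode == HttpStatusCode.NotFound) continue;
            result.EnsureSuccessStatusCode();
            return await result.Content.ReadAsStringAsync();
        }
    }
}
```

Passing a CancellationToken with a timeout keeps the caller from polling forever if the processing never finishes.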
Related
We have a function app that builds a large JSON payload (about 2000 lines) every day and posts it to the API to be mapped and saved into a database.
We are using CQRS with MediatR, and the API side seems to take exceptionally long to create and save all the necessary information.
The problem we have is that the function's PostAsJsonAsync call waits for the API response and times out after a few minutes.
Any idea how to run this as a background task or just post and forget? Our API is only concerned that it received data.
Function side:
using (var client = new HttpClient())
{
client.Timeout = new TimeSpan(0, 10, 0);
var response = await client.PostAsJsonAsync($"{endpoint}/api/v1.0/BatchImport/Import", json); // Times out waiting for the API
response.EnsureSuccessStatusCode();
}
API MediatR handler side:
public async Task<Unit> Handle(CreateBatchOrderCommand request, CancellationToken cancellationToken)
{
foreach (var importOrder in request.Payload) // Takes a long time to process all the data
{
await PopulateImportDataAsync(importOrder, cancellationToken);
await CreateOrderAsync(importOrder, cancellationToken);
}
return Unit.Value;
}
Cheers
The problem we have is that the function's PostAsJsonAsync call waits for the API response and times out after a few minutes.
The easiest solution is going to be just increasing that timeout. If you are talking about Azure Functions, I believe you can increase the timeout to 10 minutes.
Any idea how to run this as a background task or just post and forget? Our API is only concerned that it received data.
Any fire-and-forget solution is not going to end well; you'll end up with lost data. I recommend that you not use fire-and-forget at all, and this advice goes double as soon as you're in the cloud.
Assuming increasing the timeout isn't sufficient, your solution is to use a basic distributed architecture, as described on my blog:
Have your API place the incoming request into a durable queue.
Have a separate backend (e.g., Azure (Durable) Function) process that request from the queue.
Assuming you’re on .NET Core, you could stick incoming requests into a queued background task:
https://learn.microsoft.com/en-us/aspnet/core/fundamentals/host/hosted-services?view=aspnetcore-6.0&tabs=visual-studio#queued-background-tasks
Keep in mind this chews up resources from servicing other web requests so it will not scale well with millions of requests. This same basic principle, a message queue item and offline processing, can also be distributed across multiple services to take some of the load off the web service.
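A minimal in-process version of that queued background task idea could look like the sketch below. The class name BackgroundWorkQueue and its members are hypothetical, not taken from the linked documentation, and the queue lives in memory only: unlike the durable queue recommended above, anything still queued is lost if the process restarts.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

public sealed class BackgroundWorkQueue
{
    private readonly Channel<Func<CancellationToken, Task>> _channel;

    public BackgroundWorkQueue(int capacity)
    {
        // Bounded, so producers wait instead of growing memory without limit.
        _channel = Channel.CreateBounded<Func<CancellationToken, Task>>(
            new BoundedChannelOptions(capacity) { FullMode = BoundedChannelFullMode.Wait });
    }

    // Called by the request handler: hand the work over and return right away.
    public ValueTask EnqueueAsync(Func<CancellationToken, Task> workItem, CancellationToken ct = default)
        => _channel.Writer.WriteAsync(workItem, ct);

    // Signal that no more work will be queued, so ProcessAsync can finish.
    public void Complete() => _channel.Writer.Complete();

    // Called once by a long-lived consumer (for example a hosted service).
    public async Task ProcessAsync(CancellationToken ct = default)
    {
        await foreach (var workItem in _channel.Reader.ReadAllAsync(ct))
        {
            try
            {
                await workItem(ct);
            }
            catch (Exception ex) when (!(ex is OperationCanceledException))
            {
                // One failing item should not stop the consumer loop.
                Console.Error.WriteLine($"Work item failed: {ex.Message}");
            }
        }
    }
}
```

The API handler would call EnqueueAsync and return immediately, while a single consumer runs ProcessAsync for the lifetime of the host.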
Does a durable function stay awake until an activity is invoked?
I'm about to implement a scheduler, and instead of using another library such as Hangfire or Quartz, I want to implement a durable function that will serve as the scheduler.
The piece I'm missing is what happens inside the function. Does the function get shut down until the next activity invocation? Does each invocation count as an execution?
[FunctionName("SchedulerRouter")]
public static async Task<HttpResponseMessage> HttpStart(
[HttpTrigger(AuthorizationLevel.Anonymous, "get", "post")]HttpRequestMessage req,
[OrchestrationClient]DurableOrchestrationClient starter, ILogger log)
{
var data = await req.Content.ReadAsAsync<JObject>();
var instanceId = await starter.StartNewAsync(FunctionsConsts.MAIN_DURABLE_SCHEDULER_NAME, data);
return starter.CreateCheckStatusResponse(req, instanceId);
}
It looks like you are confusing execution time with the maximum inactivity time in Azure Functions:
Durable Functions mainly changes how long a workflow can run. A regular function on the Consumption plan is subject to a timeout (5 minutes by default, configurable up to 10 minutes). A durable orchestration as a whole is not bound by that limit, although each activity function it calls still is. It also adds support for stateful execution: the orchestrator's progress and local state are checkpointed, so while it awaits an activity the orchestrator is unloaded from memory and later replayed from its history when the activity completes. This is an extension of the regular function patterns and needs some additional boilerplate code to work as expected. More details here: https://learn.microsoft.com/en-us/azure/azure-functions/durable/durable-functions-overview
Durable functions and normal functions share the same billing pattern, so cold starts will happen on durable functions as well especially when running in a consumption plan.
Azure Functions running in a Consumption plan will shut down after a period of inactivity, and are then reallocated and restarted when a new request arrives. This is called a cold start. You can mitigate it by building a timer-triggered function that wakes your function app every 5 to 10 minutes, but you will still incur cold starts from time to time if your host gets scaled up or down automatically by Azure.
If you want to completely remove the chance of cold starts you will have to move to an App service plan. As a side note, Function apps in Azure are stateless by design, and you should implement your logic with this requirement in mind.
Did you look into timer triggers for Azure Functions? Maybe they are more suitable for your use case. Basically, it is a CRON timer trigger that invokes the function according to the CRON setting.
I am trying to implement a task in a fire-and-forget manner.
Let's look at the piece of code below.
public IHttpActionResult Update(int id)
{
var updatedResult = _updater.update(id);
// fire and forget a task
sendEmailToUser();
return Ok();
}
private async Task sendEmailToUser()
{
var httpclient = new HttpClient();
// assume the client is initiated with required url and other headers
await httpclient.PostAsync("some url", null); // no request body in this example
}
Given the above code, can I safely assume that whenever the Update endpoint is called, the sendEmailToUser task is triggered and will run to completion?
No. You should almost never start background threads in a web application. HTTP is supposed to be stateless, and the web server was designed with that in mind.
The server might be put into a sleep state when there has been no incoming request for a set period of time. During that time all background execution is halted, including yours. It may or may not resume when the next request comes in.
Or, when IIS recycles your app domain on its schedule, your thread will get killed too.
If you really need background tasks, run them in a Windows service or as a separate console application.
Under normal conditions, it's reasonable to expect that the task will run to completion. It will go on independently.
Your biggest concerns, in this case, should be about the web API not being terminated, and the task not throwing an exception.
But if OP needs to be 100% sure, there are other safer ways to code that.
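If the fire-and-forget call is kept anyway, one way to address the exception concern is to attach a continuation that at least reports failures. The helper below is a hypothetical sketch (FireAndForget.Forget is not a framework API), and it does nothing about the host being recycled while the task is still running.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class FireAndForget
{
    // Lets the task run on its own, but routes any failure to onError
    // instead of leaving the exception unobserved.
    public static void Forget(this Task task, Action<Exception> onError)
    {
        _ = task.ContinueWith(
            t => onError(t.Exception.GetBaseException()),
            CancellationToken.None,
            TaskContinuationOptions.OnlyOnFaulted | TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default);
    }
}
```

In the question's code, the call would become something like sendEmailToUser().Forget(ex => log(ex)), where log stands for whatever logging the application already has.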
I am trying to implement file conversion using Azure Functions. The conversion can take a long time, so I don't want the calling server to wait for the response.
I wrote a function that returns a response immediately (to indicate that the service is available and the conversion has started) and runs the conversion in a separate thread. A callback URL is used to send the conversion result.
public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, Stream srcBlob, Binder binder, TraceWriter log)
{
log.Info($"C# HTTP trigger function processed a request. RequestUri={req.RequestUri}");
// Get request model
var input = await req.Content.ReadAsAsync<ConvertInputModel>();
//Run convert in separate thread
Task.Run( async () => {
//Read input blob -> convert -> upload output blob
var convertResult = await ConvertAndUploadFile(input, srcBlob, binder, log);
//return result using HttpClient
SendCallback(convertResult, input.CallbackUrl);
});
//Return response immediately
return req.CreateResponse(HttpStatusCode.OK);
}
The problem is that the new task breaks the bindings: I get an exception while accessing the parameters. So how can I run a long-running operation in a separate thread? Or is this approach totally wrong?
This pattern is not recommended (or supported) in Azure Functions, particularly when running in the Consumption plan, since the runtime won't be able to accurately manage your function's lifetime and will eventually shut down your service.
One of the recommended (and widely used) patterns here would be to queue up this work to be processed by another function, listening on that queue, and return the response to the client right away.
With this approach, you accomplish essentially the same thing: the actual processing is done asynchronously, but in a reliable and efficient way (benefiting from automatic scaling to properly handle increased loads, if needed).
Do keep in mind that, when using the consumption plan, there's a function timeout of 5 minutes. If the processing is expected to take longer, you'd need to run your function on a dedicated plan with AlwaysOn enabled.
Your solution of running the background work inside the Azure Function is wrong, as you suspected. You need a second service that is designed to run these long-running tasks. Here is the documentation on Microsoft's best practices for background jobs on Azure.
I have a WCF service set to PerCall
I would like to know how I can send a Start call from the client to start a long running process, and send a Cancel command to cancel it
My WCF service looks something like this
[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall)]
public class Service1 : IService1
{
CancellationTokenSource cancelToken = new CancellationTokenSource();
public void Start()
{
var compute = Task.Factory.StartNew(StartLongRunningTask, cancelToken.Token);
}
public void Stop()
{
cancelToken.Cancel();
}
private void StartLongRunningTask()
{
//process here
}
}
I guess the problem here is that, each time a call comes to the server, it's treated as a new request.
So how should starting and cancelling a long running task in WCF be done?
EDIT: I'm hosting it as a windows service
I have a WCF service set to PerCall
... the problem here is that, each time a call comes to the server, it's treated as a new request.
Yup, that's exactly what you're telling it to do. If you can, just change to InstanceContextMode.PerSession; then you can do what you're trying to do (assuming you're self-hosting).
If you can't do this, then you'll have to develop a more complex solution like @PeterRitchie commented. First, your host: IIS is not designed to have long-running operations independent of requests, so I'll assume you're self-hosting. Next, you'll need a form of token (like a GUID) that will act as an identifier for a long-running operation. Your Start method will allocate a GUID and CancellationTokenSource and start the operation, and your Stop method will take a GUID and use that to look up the CancellationTokenSource and cancel the operation. You'll need a shared (static, threadsafe) dictionary to act as lookup.
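The GUID lookup described above could be sketched like this for the self-hosted case. OperationRegistry and its members are hypothetical names; the service's Start method would return the GUID to the client and Stop would accept it as a parameter.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class OperationRegistry
{
    // Shared, thread-safe lookup from operation id to its cancellation source.
    private static readonly ConcurrentDictionary<Guid, CancellationTokenSource> Running =
        new ConcurrentDictionary<Guid, CancellationTokenSource>();

    // Starts the work on the thread pool and returns the id the client
    // must send back later to cancel it.
    public static Guid Start(Action<CancellationToken> work)
    {
        var id = Guid.NewGuid();
        var cts = new CancellationTokenSource();
        Running[id] = cts;

        Task.Run(() => work(cts.Token), cts.Token)
            .ContinueWith(_ =>
            {
                // Clean up once the work ends, unless Stop already removed the entry.
                // Exceptions thrown by the work are not observed in this sketch.
                if (Running.TryRemove(id, out var finished)) finished.Dispose();
            }, TaskScheduler.Default);

        return id;
    }

    // Returns false when the id is unknown or the operation has already ended.
    public static bool Stop(Guid id)
    {
        if (!Running.TryRemove(id, out var cts)) return false;
        cts.Cancel();
        return true;
    }
}
```

Because the dictionary is static, it survives the per-call service instances, which is what makes Start and Stop on different calls refer to the same operation.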
If your host is IIS, then your solution gets more complex... :)
First, you'll need a backend that's not hosted in IIS. Common choices are an Azure worker role or a Win32 service. Next, you'll need a reliable communications mechanism: an Azure queue, MSMQ, WebSphere, etc. Then you can build your WCF-over-IIS service to have the Start method generate a GUID identifier and drop a message on the queue to start processing. The Stop method takes the GUID and drops a message on the queue to cancel processing. All other logic gets moved to the backend service.
From how you've asked, the client seems to be aware of the async nature of the request.
@StephenCleary's and @PeterRitchie's points are excellent, but your first step is to redo your service/contract to properly implement an async service and add a means of communicating back to the client some information/handle for the long-running operation.
The Framework already contains several paradigms for asynchronous programming :-), but when it comes to WCF, you kind of fall back to "How to: Implement an Asynchronous Service Operation".
That will provide some infrastructure, but not necessarily the ability to automatically cancel an operation.
Speaking strictly about the cancellation (as this is your question): you will have to extend whatever your solution ends up being for cancellation. At the minimum you need to add necessary logic to your service “worker” to monitor and honor the cancellation token.
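As an illustration of a worker that monitors and honors the token, here is a minimal sketch. CancellableWorker.ProcessItems and the processOne callback are hypothetical; the point is only that the token is checked between units of work and that the method reports how far it got.

```csharp
using System;
using System.Threading;

public static class CancellableWorker
{
    // Processes items one at a time and checks the token before each one.
    // Returns how many items were processed before completion or cancellation.
    public static int ProcessItems(int total, CancellationToken token, Action<int> processOne)
    {
        int processed = 0;
        for (; processed < total; processed++)
        {
            // Cooperative cancellation: stop between items, never mid-item.
            if (token.IsCancellationRequested) break;
            processOne(processed);
        }
        return processed;
    }
}
```

Calling token.ThrowIfCancellationRequested() instead of breaking would end the work with an OperationCanceledException rather than a partial count; which one fits depends on what the client should be told.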
Other considerations that you may expect to encounter: returning a result from cancellation; cancelling a task that has managed to complete (what if you had updated the 1,000,000 records by the time the cancellation request came?); and exception handling (with task-based programming, exceptions are not thrown but bundled in the Task, or whatever other "vehicle" you use to describe the ongoing operation).
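To make the point about exceptions being bundled in the Task concrete, the sketch below inspects a finished task from the outside. TaskOutcome.Describe is a hypothetical helper; it relies on Task.Wait wrapping both faults and cancellation in an AggregateException.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskOutcome
{
    // Blocks until the task ends and reports how it ended.
    public static string Describe(Task task)
    {
        try
        {
            task.Wait();
            return "completed";
        }
        catch (AggregateException ex)
        {
            // The original exception is carried inside the AggregateException.
            return ex.InnerException is OperationCanceledException
                ? "cancelled"
                : "faulted: " + ex.InnerException.Message;
        }
    }
}
```

With await instead of Wait, the first inner exception is rethrown directly, so the catch clauses would name the concrete exception types rather than AggregateException.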