I need to create an ASP.NET API service that, when called, doesn't wait for a response from the web server. Basically, I have a long SQL task that I want to run and then, when it's completed, send an email to the user to tell them the job is done. It needs to avoid a server response timeout, so I want something that lets the user carry on without waiting around. I can't seem to find a way to do this in MVC. Is it possible?
IMHO, I would queue this job and process it using another process outside IIS.
For example, this would be the flow:
The user sends a request to your API to start the long task, but all the API does on the server side is queue the task.
The API returns a 200 OK response specifying that the job was queued successfully. You may use Azure Service Bus, Queues, MSMQ, RabbitMQ, Redis or even SQL Server using a table to maintain job state.
A Windows Service, Azure Worker Role or periodically scheduled task dequeues the task, processes it and, as soon as it finishes, sends an email to the user to say that the operation is done.
Queue the task and return the response immediately.
Basically, your server-side handler (controller action, Web API method, whatever) shouldn't invoke the long-running back-end task directly. It should do something relatively fast to just queue the task and then immediately return some indication that the task has been successfully queued. Another process entirely should actually execute the long-running task.
What I would recommend is two server-side applications. One is the web application, the other is either a Windows Service or a periodically scheduled Console Application. The web application would write a record to a database table to "queue" the process. This could contain simply:
User who queued the process
When it was queued
What process was queued (if there would ever be more than one, for example)
Status of the process ("queued" initially)
Anything else you might want to store.
Just insert a record here and then return to the user. At this point the web application has done its job.
The Windows Service (or Console Application) would check this database table for "queued" records. When it finds one, update the status to "processing" (so other executions don't try to run the same one) and invoke the long-running process. When the long-running process is complete, update the status to "complete" (or just delete the record if you don't want it anymore) and notify the user. (Handle error conditions accordingly, of course. Maybe re-try, maybe notify the user of the error, etc.)
By separating the concerns like this you place the appropriate responsibilities in the appropriate application contexts and provide the user with exactly the experience they're looking for. You additionally open the door for future functionality, such as queueing the process by means other than the web application or running reports on queued/running/failed/etc. processes by examining that database table.
Long story short: Don't try to hack a web application so that it doesn't behave like a web application. Use the technologies for their appropriate purposes.
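To make the "queued" to "processing" hand-off concrete, here is a minimal sketch that uses an in-memory dictionary as a stand-in for the database table. All type and member names (Job, JobStatus, InMemoryJobTable, TryClaimNext) are made up for illustration. With a real database, the claim step would typically be a conditional update that only succeeds while the row is still "queued".

```csharp
using System;
using System.Collections.Concurrent;

public enum JobStatus { Queued, Processing, Complete, Failed }

// One row of the hypothetical queue table described above.
public sealed record Job(Guid Id, string User, DateTime QueuedAtUtc, string ProcessName, JobStatus Status);

public sealed class InMemoryJobTable
{
    private readonly ConcurrentDictionary<Guid, Job> _rows = new();

    // What the web application does: insert a record and return immediately.
    public Guid Enqueue(string user, string processName)
    {
        var job = new Job(Guid.NewGuid(), user, DateTime.UtcNow, processName, JobStatus.Queued);
        _rows[job.Id] = job;
        return job.Id;
    }

    // What the worker does: atomically flip one row from Queued to Processing.
    // TryUpdate only succeeds if the row is still unchanged, so two workers
    // cannot claim the same job.
    public Job? TryClaimNext()
    {
        foreach (var row in _rows.Values)
        {
            if (row.Status != JobStatus.Queued) continue;
            var claimed = row with { Status = JobStatus.Processing };
            if (_rows.TryUpdate(row.Id, claimed, row)) return claimed;
        }
        return null;
    }

    // Called by the worker that owns the job once it finishes or fails.
    public void SetStatus(Guid id, JobStatus status)
    {
        if (_rows.TryGetValue(id, out var row))
            _rows[id] = row with { Status = status };
    }
}

public static class Program
{
    public static void Main()
    {
        var table = new InMemoryJobTable();
        Guid id = table.Enqueue("alice", "monthly-report");

        Job? first = table.TryClaimNext();   // claims the queued job
        Job? second = table.TryClaimNext();  // nothing left to claim

        Console.WriteLine(first?.Status);    // Processing
        Console.WriteLine(second is null);   // True

        table.SetStatus(id, JobStatus.Complete); // the user would be notified here
    }
}
```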
I'm working at an automation firm, so we create processes for industrial automation. Previously this automation was done on the machine side of things, but we're slowly transitioning to controlling the machines with C#.
On my current project the production for one day takes about 2 hours. The operators of the factory have a web interface that we created in C# using ASP.NET Core MVC, in which they can start/pause/stop this production process.
When starting the process we await a function in our controller that is basically a while loop that controls this 2h long production process.
The problem is that when I send out the REST request to start the production, this request takes 2h to complete. I would prefer that the request completes immediately and the production process starts in the background of my ASP.NET Core application.
First I thought I could just leave out the await and simply do this in my controller (simplified code):
_ = _productionController.StartLongProcess(); // This contains the while loop
return Ok();
But since _productionController is scoped and all its dependencies are as well, these immediately get disposed of when the method returns and I can't access my database anymore for example.
The process should be able to continuously talk to our database to save information about the production process in case something fails, so that we can always pick up where we left off.
My question to you is now, are we tackling this the wrong way? I imagine it's bad practice to start these long running processes in the asp.net controller.
How do I make sure I always have access to my DatabaseContext in this long running process even though the REST request has already ended? Create a separate scope only for this method?
Starting with ASP.NET Core 2.1, the right way to do this (within ASP.NET) is to extend BackgroundService (or implement IHostedService).
Incoming requests can tell the background service to start the long-running operation and return immediately. You'll of course need to handle cases where duplicate requests are sent in, or new requests are sent in before the existing request is completed.
The documentation page has an example where the BackgroundService reads commands off a queue and processes them one at a time.
How do I make sure I always have access to my DatabaseContext in this long running process even though the REST request has already ended? Create a separate scope only for this method?
Yes, create a separate scope.
My question to you is now, are we tackling this the wrong way? I imagine it's bad practice to start these long running processes in the asp.net controller.
We've done something similar in the past. As long as fault-tolerance (particularly w.r.t. app restarts) and idempotence are built into the long-running-operation's logic, you should be good to go.
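A rough sketch of how the BackgroundService, the queue and the separate scope could fit together. It assumes the Microsoft.Extensions.Hosting and Microsoft.Extensions.DependencyInjection packages; ProductionQueue, ProductionWorker and IProductionProcess are hypothetical names, with IProductionProcess standing in for the scoped class that owns the while loop and the DbContext.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

// Hypothetical stand-in for the scoped production logic (registered as scoped).
public interface IProductionProcess
{
    Task RunAsync(Guid runId, CancellationToken cancellationToken);
}

// Singleton queue: the controller writes to it, the background service reads from it.
public sealed class ProductionQueue
{
    private readonly Channel<Guid> _channel = Channel.CreateUnbounded<Guid>();

    public bool TryEnqueue(Guid runId) => _channel.Writer.TryWrite(runId);

    public IAsyncEnumerable<Guid> ReadAllAsync(CancellationToken cancellationToken) =>
        _channel.Reader.ReadAllAsync(cancellationToken);
}

public sealed class ProductionWorker : BackgroundService
{
    private readonly ProductionQueue _queue;
    private readonly IServiceScopeFactory _scopeFactory;

    public ProductionWorker(ProductionQueue queue, IServiceScopeFactory scopeFactory)
    {
        _queue = queue;
        _scopeFactory = scopeFactory;
    }

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        // Processes one run at a time, in the order the requests arrived.
        await foreach (Guid runId in _queue.ReadAllAsync(stoppingToken))
        {
            // A fresh scope per run: scoped services such as a DbContext now live
            // as long as the run, not as long as the HTTP request that started it.
            using IServiceScope scope = _scopeFactory.CreateScope();
            var process = scope.ServiceProvider.GetRequiredService<IProductionProcess>();
            await process.RunAsync(runId, stoppingToken);
        }
    }
}
```

The controller action would then only call TryEnqueue and return. Registration would be along the lines of AddSingleton<ProductionQueue>(), AddHostedService<ProductionWorker>() and a scoped registration for the IProductionProcess implementation. Error handling and persistence across app restarts are deliberately left out of this sketch.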
REST requests are expected to be short, a few seconds at maximum.
So best practice here would be to offload a long running task to a background service and return a token where you can poll the service if the operation has already finished.
The background service could be a BackgroundService in .NET Core. This is easy but not really fault tolerant, so some sort of DB and retry logic could be good.
If you are in an intranet, you could also move to an inherently asynchronous messaging system like RabbitMQ, where you send a StartOperation message and then receive a Completed message when the process has finished.
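The "return a token, then poll" idea could look roughly like this with ASP.NET Core minimal APIs (.NET 6 or later). The routes, the status strings and the in-memory dictionary are assumptions for illustration only. A real implementation would hand the work to a background service and keep the status somewhere durable.

```csharp
using System;
using System.Collections.Concurrent;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// In-memory status store, keyed by the token handed back to the client.
var jobs = new ConcurrentDictionary<Guid, string>();

// Start: record the job, hand it off, and return the token immediately.
app.MapPost("/jobs", () =>
{
    var id = Guid.NewGuid();
    jobs[id] = "Queued";
    // A background worker (not shown) would pick the job up here
    // and update its status as it makes progress.
    return Results.Accepted($"/jobs/{id}", new { id });
});

// Poll: the client asks with its token whether the operation has finished.
app.MapGet("/jobs/{id:guid}", (Guid id) =>
    jobs.TryGetValue(id, out var status)
        ? Results.Ok(new { id, status })
        : Results.NotFound());

app.Run();
```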
Another option would be to use Hangfire. It will allow you to Enqueue the work that you want to execute to a persistent store e.g. SQL Server, MSMQ, Redis depending on what you have in your infrastructure. The job will then be picked up by a worker which can also run in the ASP.NET process or a windows service. It's distributed too so you can have a number of instances of the workers running. Also supports retrying failed jobs and has a dashboard to view the jobs. Best of all, it's free!
var jobId = BackgroundJob.Enqueue(() => ExecuteLongRunningProcess(parameter1));
https://www.hangfire.io/
Following is my understanding of the issue that you have posted:
You want to initiate a long-running call via a REST API call.
You want to use an async call, but you are not sure how to keep the DB context alive for a long-running call that talks to the database regularly during the operation.
A couple of important points:
You seem unclear about how async calls work.
When you await an async call, the compiler-generated state machine captures the current synchronization context so that the continuation can resume there. No thread-pool thread is blocked while the operation is pending; for I/O, the waiting is handled by the OS and the hardware rather than by a thread.
You can use ConfigureAwait(false) in back-end code to avoid resuming on the captured synchronization context, which is better for performance.
The one challenge: for async calls to be genuinely asynchronous, the whole call chain has to be async from the entry point down, otherwise the benefits are lost. Using Task.Wait or Task.Result anywhere in the chain can even cause a deadlock in ASP.NET.
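A small console sketch of "async all the way" with ConfigureAwait(false). LoadCountAsync is a made-up stand-in for real I/O such as a database query.

```csharp
using System;
using System.Threading.Tasks;

public static class Program
{
    // Hypothetical stand-in for real I/O (for example a database query).
    private static async Task<int> LoadCountAsync()
    {
        // Library-style code: no need to resume on the captured context.
        await Task.Delay(100).ConfigureAwait(false);
        return 42;
    }

    // The entry point is async too, so nothing in the chain blocks a thread.
    public static async Task Main()
    {
        int count = await LoadCountAsync();
        Console.WriteLine(count); // prints 42

        // Avoid: LoadCountAsync().Result or LoadCountAsync().Wait().
        // Blocking like this holds a thread for the whole wait, and under a
        // single-threaded synchronization context it can deadlock.
    }
}
```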
Regarding the long-running operation, these are the options:
A simple async call as suggested above. It lets you serve a large number of concurrent calls (hence scalability), but the context is lost if the client goes away, and there is no way to get the status of the operation back.
A fire-and-forget call combined with a mechanism like ASP.NET SignalR, which is like IObservable over the network and can notify the client when processing finishes.
The best option would be a message queue like RabbitMQ, which doesn't run the risk of the client going down. It acts as a producer-consumer and can deliver the notification when the client comes back up: the queue is notified when the process finishes, and the client can then be informed. The queue can carry both the incoming request and the response message asynchronously.
If the client wants to come back periodically and check the status of the request, DB persistence is important: the long-running process updates its status there at regular intervals, and the client can check it.
My question to you is now, are we tackling this the wrong way? I imagine it's bad practice to start these long running processes in the asp.net controller.
Generally, yes. Ideally an ASP.NET service does not have any long-running processes inside it - or at the very least, no long-running processes that can't be easily and quickly shut down.
Doing work outside of an HTTP request (i.e., request-extrinsic code) is most reliably achieved by adding a durable queue with a separate background processor. The "background processor" can be a Win32 service, possibly even on the same machine. With this architecture, the HTTP request places a request on the queue, and the processor picks up requests from the queue and executes them.
I am working on a Service Fabric application that contains a bunch of ASP.NET Core Web APIs. When I run the application on my local Service Fabric cluster, which is configured with 5 nodes, it runs successfully and I am able to send POST requests to the exposed Web APIs. What I actually want is for different POST requests to the exposed APIs to hit the code running on the same cluster node.
To explain further: there is an API exposed on node '0' that accepts a POST request and executes a job, and there is also an API that aborts the running job. When I request to execute a job, it starts executing on node '0', but when I try to abort the job, the Service Fabric cluster forwards the request to a different node, say node '1'. As a result I am not able to abort the running job, because there is no running job on node '1'. I don't know how to handle this situation.
As for state, I am using a stateless service of type ASP.NET Core Web API and running the app on 5 nodes of my local Service Fabric cluster.
Please suggest what the best approach would be.
Your problem is that you are using your APIs to do a worker's job.
You should use your API to schedule the work in a background process (worker) and return a token or operation ID to the user. The user will use this token to request the status or cancel the task.
The first step: when your API is called the first time, generate a GUID (or insert a row in a DB), put a message on a queue (e.g. Service Bus), and then return the GUID to the caller.
The second step: a worker process runs in your cluster, listens for messages on this queue and processes each message as it arrives. You can make this a single-threaded service that processes message by message in a loop, or a multi-threaded service that processes multiple messages using one thread per message. It will depend on how complex you want it to be:
With a single-threaded listener, to scale your application you have to spawn multiple instances so that multiple tasks run in parallel. You can do that in SF with a simple scale command, and SF will distribute the service instances across your available nodes.
In a multi-threaded version you will have to manage the concurrency for better performance. You might have to consider memory, CPU, disk and so on, otherwise you risk putting too much load on a single node.
The third step, the cancellation: The cancellation process is easy and there are many approaches:
Using a similar approach and enqueueing a cancellation message:
Your service listens for the cancellation in a separate thread and cancels the running task (if it is running).
Using a different queue for the cancellation messages is better.
If you run multiple listener instances, you might consider a topic instead of a queue.
Using a cache key to store the job status and checking on every iteration whether cancellation has been requested.
Using a table with the job status, which you check on every iteration as you would with the cache key.
Creating a remote endpoint to make a direct call to the service and trigger a cancellation token.
There are many approaches. These are the simple ones, and you might combine several of them for better control of your tasks.
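As a sketch of the "listen for a cancellation message and cancel the running task" variant: the worker instance could keep a registry of CancellationTokenSource objects keyed by job ID. RunningJobs and its members are hypothetical names. The queue or topic listener that would call Cancel is not shown.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public sealed class RunningJobs
{
    private readonly ConcurrentDictionary<Guid, CancellationTokenSource> _running = new();

    // Runs the job and keeps its CancellationTokenSource findable by job ID.
    public async Task RunAsync(Guid jobId, Func<CancellationToken, Task> work)
    {
        using var cts = new CancellationTokenSource();
        _running[jobId] = cts;
        try
        {
            await work(cts.Token);
        }
        catch (OperationCanceledException)
        {
            // The job observed the token and stopped early.
        }
        finally
        {
            _running.TryRemove(jobId, out _);
        }
    }

    // Called by the listener when a cancellation message for jobId arrives.
    // Returns false if this instance is not running that job.
    public bool Cancel(Guid jobId)
    {
        if (!_running.TryGetValue(jobId, out var cts)) return false;
        try
        {
            cts.Cancel();
            return true;
        }
        catch (ObjectDisposedException)
        {
            return false; // the job finished just before the cancellation arrived
        }
    }
}
```

With a topic, every instance would receive the cancellation message, and only the instance that actually runs the job would get true back from Cancel.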
You'll need some storage to do that.
Create a table (e.g. JobQueue). Before starting to process the job, store it in the database together with its status (e.g. Running; it could be an enum), and then return the ID to the caller. When you need to abort/cancel the job, call the abort method of the API, sending the ID you want to abort. In the abort method, you just update the status of the job to Aborting. Inside the first method (which runs the job), you'll need to check this table once in a while; if the status is Aborting, you stop the job (and update the status to Aborted). Or you could just delete the row from the database once the job has been aborted or finished.
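The status-polling loop described above might look like this. IJobStateStore is a hypothetical abstraction over the JobQueue table. The in-memory implementation is only there to make the sketch self-contained, whereas across several nodes the store has to be shared (a database or a cache server).

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public enum JobState { Running, Aborting, Aborted, Finished }

// Hypothetical abstraction over the job status table.
public interface IJobStateStore
{
    JobState Get(Guid jobId);
    void Set(Guid jobId, JobState state);
}

public sealed class InMemoryJobStateStore : IJobStateStore
{
    private readonly ConcurrentDictionary<Guid, JobState> _states = new();
    public JobState Get(Guid jobId) => _states[jobId];
    public void Set(Guid jobId, JobState state) => _states[jobId] = state;
}

public static class JobRunner
{
    // The "first method": runs the job and checks the store on every iteration.
    public static async Task<JobState> RunAsync(Guid jobId, IJobStateStore store, int steps)
    {
        store.Set(jobId, JobState.Running);
        for (int i = 0; i < steps; i++)
        {
            if (store.Get(jobId) == JobState.Aborting)
            {
                store.Set(jobId, JobState.Aborted);
                return JobState.Aborted;
            }
            await Task.Delay(10); // stand-in for one unit of real work
        }
        store.Set(jobId, JobState.Finished);
        return JobState.Finished;
    }
}
```

The abort endpoint then only needs to call store.Set(jobId, JobState.Aborting), on whichever node receives the request.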
Alternatively, if you want the data to be temporary, you could use a sixth server as a cache server and store data there. This cache server could be a clustered server as well, but then you would need to use something like Redis.
I have a legacy ASMX service hosted in IIS. Client applications subscribe to or unsubscribe from the web service. Based on some internal logic, the web service needs to send messages to the subscribed applications periodically.
What is the best way to implement the long-running part? I understand that starting a thread with a long-running task is not a good idea under IIS.
ASMX services cannot do what you're asking for: they cannot just decide to send a message to the client. All they can do is respond if the client requests it.
You can hack around and come up with one method to start the long-running task, and another method to poll for the status of the task. This works, but it can be expensive.
The better model is to perform the long-running task in a separate Windows Service. Have that service host a simple WCF service which will only be used by the main service (the one that talks to the clients). The main (WCF) service would use a Duplex channel to communicate with the clients. That way, it can "call" the clients whenever there is news about one of the long-running tasks.
Usually in such cases, when you don't have a way to push the result back, you create a unique ID for the long-running task and send it back to the client. After that you run the task and keep a table in the database (or something similar) where you store the status of the task. The client periodically polls the service to see the task's status for the given ID. Once it finds that the task is completed, it retrieves the result.
And it is completely fine to have a thread running inside IIS doing its job.
We have an ASP MVC 3.0 application that reads data from the db using Entity framework (all on Azure). We have several long running queries (optimization has been done) and we want to make sure that the solution is scalable and prevent thread starvation.
We looked at async controllers and using I/O completion ports to run the query (using BeginExecute instead of the usual EF). However, async is hard to debug and increases the complexity of the code.
The proposed solution is as follows:
The web server (web role) gets a request that involves a long running query (example customer segmentation)
It enters the request information into a table along with the relevant parameters and returns thereby allowing the thread to process other requests.
We set a flag in the db that enables the UI to state that the query is in progress whenever a refresh to the page is done.
A worker role constantly queries this table and as soon as it finds this entry processes the long running query (customer segmentation) and updates the original customer table with the results.
In this case an immediate return of status to the users is not necessary. Users can check back within a couple of minutes to see if their request has been worked on. Instead of the table we were planning to use Azure Queues (but I guess Azure queues cannot notify a worker role, so a DB table will do just fine). Is this a workable solution? Are there any pitfalls to doing it this way?
While Windows Azure Storage queues don't give you a notification after a message has been processed, you could implement that yourself (perhaps with Windows Azure Storage tables). The nice part about queues: They handle concurrency and failed attempts.
For instance: If you have 2 worker instances processing messages off the same queue, every time a queue message is read, the message goes invisible in the queue, for an amount of time you specify. While invisible, only the worker instance that read the message has it. If that instance finishes processing, it can just delete the queue message (and update your notification table). If it fails (maybe due to the role instance crashing), the message re-appears on the queue after the invisibility timeout expires. Going one step further: Let's say it's simply a bad message that causes your code to crash every time. You can check the dequeue count before processing the message. If it's greater than, say, 2, simply store the message in a dead-letter table and inspect it manually.
One caveat with queues: the operations triggered by queue messages need to be idempotent (that is, a message may be processed more than once, and the end result should be exactly the same each time).
If you go with a table instead of a queue, you'll need to deal with scaling (multiple threads or role instances processing the table), and dead-letter handling.
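A sketch of the invisibility timeout and dequeue-count check, written against the current Azure.Storage.Queues client library (which postdates this answer). The two delegates, the 10-minute timeout and the threshold of 2 are assumptions. handleAsync would run the long query, and deadLetterAsync would store the message in a dead-letter table.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Azure.Storage.Queues;
using Azure.Storage.Queues.Models;

public static class QueueWorker
{
    private const int MaxDequeueCount = 2;

    public static async Task ProcessNextAsync(
        QueueClient queue,
        Func<string, Task> handleAsync,            // hypothetical: does the actual work
        Func<QueueMessage, Task> deadLetterAsync,  // hypothetical: stores bad messages
        CancellationToken cancellationToken = default)
    {
        // The message becomes invisible to other workers for 10 minutes.
        var response = await queue.ReceiveMessagesAsync(
            maxMessages: 1,
            visibilityTimeout: TimeSpan.FromMinutes(10),
            cancellationToken: cancellationToken);

        foreach (QueueMessage message in response.Value)
        {
            if (message.DequeueCount > MaxDequeueCount)
            {
                // Probably a poison message: park it for manual inspection.
                await deadLetterAsync(message);
                await queue.DeleteMessageAsync(message.MessageId, message.PopReceipt, cancellationToken);
                continue;
            }

            // If this throws (or the instance crashes), the message is not deleted
            // and reappears on the queue once the visibility timeout expires.
            await handleAsync(message.MessageText);
            await queue.DeleteMessageAsync(message.MessageId, message.PopReceipt, cancellationToken);
        }
    }
}
```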
This depends. If your worker role does nothing other than delegate the heavy work to a SQL database, it seems a waste of resources and money. Using a web role with async requests allows you to reduce the cost. If the worker role itself has to do heavy work, then it is a good approach.
You can also use AJAX or web socket. Start the database query, and return the response immediately. The client can either poll the web role to see if a query has finished (if you use HTTP), or the web role can notify the client directly (if you use web socket).
How would one use SignalR to implement notifications in a .NET 4.0 system that consists of an ASP.NET MVC 3 application (which uses forms authentication), a SQL Server 2008 database and an MSMQ WCF service (hosted in WAS) to process data? The runtime environment consists of IIS 7.5 running on Windows Server 2008 R2 Standard Edition.
I have only played with the samples and do not have extensive knowledge of SignalR.
Here is some background
The web application accepts data from the user and adds it to a table. It then calls a one-way operation (with the database key) of the WCF service to process the data (a task). The web application returns to a page telling the user that the data was submitted and that they will be notified when processing is done. The user can look at an "index" page and see which tasks are completed, failed or in progress. They can continue to submit more tasks (which are independent of previous data). They can close their browser and come back later.
The MSMQ-based WCF service reads the record from the database and processes the data. This may take anything from milliseconds to several minutes. When it's done processing the data, the record is updated with the corresponding status (completed or failed) and results.
Most of the time, the WCF service is not performing any processing. However, when it does, users generally want to know as soon as possible when it's done. The user will still use other parts of the web application even if they don't have data to be processed by the WCF service.
This is what I have done
In the primary navigation bar, I have an indicator (similar to Facebook or Google+) for the user to notify them when the status of tasks has changed. When they click on it, they get a summary of what was done and can then view the results if they wish to.
Using jQuery, I poll the server for changes. The controller action checks whether any processes were modified (completed or failed) and returns them; otherwise it waits a couple of seconds and checks again without returning to the client. To avoid a timeout on the client, it returns after 30 seconds if there were no changes. The jQuery script waits a while and tries again.
The problems
Performance degrades with every user that views a page. There is no need for them to do anything in particular. We've noticed that memory usage of Firefox 7+ and Safari increases over time.
Using SignalR
I'm hoping that switching to SignalR can reduce polling and thus reduce resource requirements, especially if nothing has changed task-wise in the database. I have trouble getting the WCF service to notify clients that it's done with processing a task, given that the application uses forms-based authentication.
By asking this question, I hope someone will give me better insight how they will redesign my notification scheme using SignalR, if at all.
If I understand correctly, you need a way of associating a task to a given user/client so that you can tell the client when their task has completed.
SignalR API documentation tells me you can call JS methods for specific clients based on the client id (https://github.com/SignalR/SignalR/wiki/SignalR-Client). In theory you could do something like:
Store the client id used by SignalR as part of the task metadata:
Queue the task as normal.
When the task is processed and de-queued:
Update your database with the status.
Using the client id stored as part of that task, use SignalR to send that client a notification:
You should be able to retrieve the connection that your client is using and send them a message:
string clientId = processedMessage.ClientId; // Stored when you originally queued it.
IConnection connection = Connection.GetConnection<ProcessNotificationsConnection>();
connection.Send(clientId, "Your data was processed");
This assumes you mapped this connection and the client used that connection to start the data processing request in the first place. Your "primary navigation bar" has the JS that started the connection to the ProcessNotificationsConnection endpoint you mapped earlier.
EDIT: From https://github.com/SignalR/SignalR/wiki/Hubs
public class MyHub : Hub
{
    public void Send(string data)
    {
        // Invoke a method on the calling client
        Caller.addMessage(data);
        // Similar to above, the more verbose way
        Clients[Context.ClientId].addMessage(data);
        // Invoke addMessage on all clients in group foo
        Clients["foo"].addMessage(data);
    }
}