We are developing a monolithic web application – very stateful. It handles both HTTP requests and long-lived SignalR connections. (In ASP.NET Core 3.1 – we will upgrade to .NET 5 later.)
We do a redirect from a login page to our “main page”. The main page takes a while to load and initialize; after that it connects via SignalR. We also have a lot of work to do on the server side. Doing the server work in the login request (before redirecting to the main page) would slow down the login.
“Oh, let’s use a Task then!”, I thought. That is, put the server work in a Task, save that in the user state, and let it execute in parallel with the loading of the main page. Something like this (simplified):
public static async Task ServerSideInit()
{
    // do a lot of init work
}

// at the end of the controller handling the login page POST:
UserState.BackgroundTask = ServerSideInit();
return Redirect(UrlToTheMainPage);

// when the main page connects via SignalR:
try {
    await UserState.BackgroundTask;
}
catch {
    // handle errors in the init work
}
This would really speed things up. It won’t matter if the page loading or the init work finishes first – we await the Task. And the work in ServerSideInit() isn’t critical. If something happens and the main page never connects, the UserState (and the Task) will be destroyed after a timeout – and that’s perfectly OK. (There are some caveats. We would e.g. have to use IServiceProvider to create/dispose a scope in ServerSideInit(), so we get a scoped DbContext outside of the controller. But that’s OK.)
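Roughly, ServerSideInit() would then look something like this sketch, where we create our own DI scope (MyDbContext stands in for our scoped DbContext; IServiceScopeFactory is the framework type we would inject or pass in):

// using Microsoft.Extensions.DependencyInjection;
public static async Task ServerSideInit(IServiceScopeFactory scopeFactory)
{
    // create our own scope: the login request's scope is gone by the time this runs
    using (var scope = scopeFactory.CreateScope())
    {
        var db = scope.ServiceProvider.GetRequiredService<MyDbContext>();
        // ... do the init work with 'db' ...
        await db.SaveChangesAsync();
    }
}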
But then I read that there is a risk the ASP.NET Core framework shuts down the Task when wrapping up the POST request! (Do you have to await async methods?) The simple HostingEnvironment.QueueBackgroundWorkItem isn’t available any longer. There is a new BackgroundService class, though. (https://learn.microsoft.com/en-us/aspnet/core/fundamentals/host/hosted-services?view=aspnetcore-3.1&tabs=visual-studio) But registering a service and queueing jobs seems like a very cumbersome solution… We just want to fire a task that will take a couple of seconds to complete, and let that continue to run after ASP.NET Core has finished handling the POST request.
I’m not very experienced with ASP.NET Core… So I’d be very grateful for some input! Will my simple solution not work? Will the task be terminated by the framework? Is there some easier way to tell the framework “please don’t touch this Task”? Or is BackgroundService the way to go?
Doing the server work in the login request (before redirecting to the main page) would slow down the login. “Oh, let’s use a Task then!”, I thought. That is, put the server work in a Task, save that in the user state, and let it execute in parallel with the loading of the main page.
So, you have a need for request-extrinsic work. I.e., work that your server does that is outside the scope of a request.
The first question you need to ask yourself is "does this work need to be done?" In other words, "am I OK with occasionally losing work?". If this work must be done for correctness reasons, then there is only one real solution: asynchronous messaging. If you're OK with occasionally losing work (e.g., if the main page will detect that the ServerSideInit is not done and will do it at that time), then what you're really talking about is a cache, and that's fine to have an in-memory solution for.
But then I read that there is a risk the ASP.NET Core framework shuts down the Task when wrapping up the POST request!
The first thing to recognize is that shutdowns are normal. Rolling updates during regular deployments, OS patches, etc... Your web server will voluntarily shut down sooner or later, and any code that assumes it will run forever is inherently buggy.
ASP.NET Core by default will consider itself "safe to shut down" when all requests have been responded to. This is the reasonable behavior for any HTTP service, and this logic extends to every HTTP framework, regardless of language or runtime. However, this is clearly a problem for request-extrinsic code.
So, if your code just starts a task by calling the method directly (or by Task.Run, another sadly popular option), then it is living dangerously: ASP.NET has no idea that request-extrinsic code even exists, and will happily exit when requested, abruptly terminating that code.
There are stopgap solutions like HostingEnvironment.QueueBackgroundWorkItem (pre-Core) and IHostedService / IHostApplicationLifetime (Core). These register the request-extrinsic code so that ASP.NET is aware of it, and will not shut down until that code completes. However, those solutions only go partway; since they are in-memory, they are also dangerous: ASP.NET is now aware of the request-extrinsic code, but HTTP proxies, load balancers, and deployment scripts are not.
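For illustration, here is a stripped-down sketch of that stopgap, roughly following the queued hosted service pattern from the Microsoft docs (the class and method names are invented for this sketch):

using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
using Microsoft.Extensions.Hosting;

// Registered via services.AddSingleton<BackgroundWorkQueue>() and services.AddHostedService<QueuedWorker>().
public class BackgroundWorkQueue
{
    private readonly Channel<Func<CancellationToken, Task>> _queue =
        Channel.CreateUnbounded<Func<CancellationToken, Task>>();

    public void Enqueue(Func<CancellationToken, Task> workItem) =>
        _queue.Writer.TryWrite(workItem);

    public ValueTask<Func<CancellationToken, Task>> DequeueAsync(CancellationToken ct) =>
        _queue.Reader.ReadAsync(ct);
}

public class QueuedWorker : BackgroundService
{
    private readonly BackgroundWorkQueue _queue;
    public QueuedWorker(BackgroundWorkQueue queue) => _queue = queue;

    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        while (!stoppingToken.IsCancellationRequested)
        {
            Func<CancellationToken, Task> workItem;
            try { workItem = await _queue.DequeueAsync(stoppingToken); }
            catch (OperationCanceledException) { break; } // host is shutting down

            try { await workItem(stoppingToken); }
            catch (Exception) { /* log it; there is no request left to report the error to */ }
        }
    }
}

The host gives ExecuteAsync a chance to drain on shutdown (within the shutdown timeout, configurable via HostOptions.ShutdownTimeout), but as noted above, nothing outside the process knows this queue exists.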
Is there some easier way to tell the framework “please don’t touch this Task”?
Back to the question at the beginning of this answer: "does this work need to be done?"
If it's just an optimization and doesn't need to be done, then just firing off the work with a Task.Run (or IHostedService) should be sufficient. I wouldn't keep it in UserState, though, since Tasks aren't serializable.
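For the "just an optimization" case, a sketch of what that could look like (WarmUpCacheAsync and _logger are placeholders), with a catch so a failure at least gets logged instead of silently disappearing:

// fire-and-forget purely as an optimization; the main page redoes the work if this never ran
_ = Task.Run(async () =>
{
    try
    {
        await WarmUpCacheAsync();
    }
    catch (Exception ex)
    {
        _logger.LogError(ex, "Background warm-up failed; the main page will redo the work.");
    }
});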
If the work needs to be done, then build an asynchronous messaging solution.
Related
In a .NET Core HTTP application we have a call that performs work, makes several HTTP calls out to backing services, and returns. It was all async/await, which meant that the call was waiting for the backing services to finish their work before returning to the client. This was causing the call to time out.
The solution that was presented was to remove the async/await as far down as we could, then essentially just wrap the HTTP calls (still async Task methods) in #pragma directives that suppress the compiler warning about the call not being awaited.
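Presumably the pattern looks roughly like this sketch (CallBackingServiceAsync stands in for the real calls):

#pragma warning disable CS4014 // "Because this call is not awaited, execution of the current method continues..."
CallBackingServiceAsync(request); // fire-and-forget: nothing observes completion or failure
#pragma warning restore CS4014
return Ok();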
This makes me nervous, because you have no guarantee that the final (or indeed any) of those HTTP requests is actually made before the calling thread returns and the async state machines associated with it are cleaned up.
Am I missing something? Is this one of those weird-but-usable situations? Or would it be more appropriate to spin off threads to handle those HTTP calls?
You're talking about fire-and-forget, and you're right to be worried. There are several problems with fire-and-forget. This is because "forget" means "forget", and it's almost always the wrong decision to have your application just forget about something.
If there's an exception calling one of those inner HTTP requests, that exception will be ignored, unless you have special logic handling that situation. And remember, there's no outer request anymore, so returning an error isn't possible (there's nowhere for it to return to). So you have the possibility of silently-swallowed errors. This should make you nervous.
Also, you have the problem of not informing ASP.NET that you have ongoing background work. By returning early, you're telling ASP.NET to send the response and that everything is fine, even though in reality, the work isn't done and you have no idea when it will be done or even whether it will succeed. The point here is that nothing upstream of your code (including ASP.NET, IIS/Kestrel, proxies, load balancers) has any idea that your code is still working - after all, your code did just tell all those things that it's done handling that request. Will ASP.NET respond to a shutdown request? Sure! Can IIS do its periodic app pool recycle? Sure! Can your proxy take that node out of rotation when doing a rolling upgrade? Sure! Will your load balancer send it more work since it's not doing anything? Sure! As far as any of those systems know, your app isn't actually handling that request, and that can cause problems, like your "fire and forget" work suddenly disappearing - again, with no exceptions or logs or anything. This should make you nervous.
I'd say the best approach is to fix downstream calls, if possible. Also look into asynchronous concurrency, e.g., starting several calls and then await Task.WhenAll. If these approaches aren't sufficient, then I'd recommend a proper distributed architecture: have the API write to a persistent queue, and have the background work done by a separate application that processes that queue.
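A sketch of the asynchronous-concurrency option (the URLs are placeholders): the calls are started together and awaited together, so the total time is roughly that of the slowest call rather than the sum of all of them.

var task1 = httpClient.GetAsync(serviceOneUrl);
var task2 = httpClient.GetAsync(serviceTwoUrl);
var task3 = httpClient.GetAsync(serviceThreeUrl);
await Task.WhenAll(task1, task2, task3);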
Is there a way to fire an Http call to an external web API within my own web API without having to wait for results?
The scenario I have is that I really don't care whether or not the call succeeds and I don't need the results of that query.
I'm currently doing something like this within one of my web API methods:
var client = new HttpClient() { BaseAddress = someOtherApiAddress };
client.PostAsync("DoSomething", null);
I cannot put this piece of code within a using statement because the call doesn't go through in that case. I also don't want to block on the task's .Result, because I don't want to wait for the query to finish.
I'm trying to understand the implications of doing something like this. I read all over that this is really dangerous, but I'm not sure why. What happens, for example, when my initial request ends? Will IIS dispose the thread and the client object, and can this cause problems at the other end of the call?
Is there a way to fire an Http call to an external web API within my own web API without having to wait for results?
Yes. It's called fire and forget. However, it seems like you already have discovered it.
I'm trying to understand the implications of doing something like this
One of the links in the answers you mentioned states the three risks:
An unhandled exception in a thread not associated with a request will take down the process. This occurs even if you have a handler setup via the Application_Error method.
This means that any exception thrown in your application or in the receiving application won't be caught (there are ways to get past this).
If you run your site in a Web Farm, you could end up with multiple instances of your app that all attempt to run the same task at the same time. A little more challenging to deal with than the first item, but still not too hard. One typical approach is to use a resource common to all the servers, such as the database, as a synchronization mechanism to coordinate tasks.
You could have multiple fire-and-forget calls when you mean to have just one.
The AppDomain your site runs in can go down for a number of reasons and take down your background task with it. This could corrupt data if it happens in the middle of your code execution.
Here is the danger. Should your AppDomain go down, it may corrupt the data that is being sent to the other API causing strange behavior at the other end.
I'm trying to understand the implications of doing something like this. I read all over that this is really dangerous
Dangerous is relative. If you execute something that you don't care at all if it completes or not, then you shouldn't care at all if IIS decides to recycle your app while it's executing either, should you? The thing you'll need to keep in mind is that offloading work without registration might also cause the entire process to terminate.
Will IIS dispose the thread and the client object?
IIS can recycle the AppDomain, causing your thread to abort abnormally. Whether it will do so depends on many factors, such as how recycling is configured in your IIS and whether you're doing any other operations that may trigger a recycle.
In many of his posts, Stephen Cleary tries to convey the point that offloading work without registering it with ASP.NET is dangerous and may cause undesirable side effects, for all the reasons you've read. That's also why there are libraries such as AspNetBackgroundTasks, or Hangfire for that matter.
The thing you should most worry about is that an unhandled exception on a thread which isn't associated with a request can cause your entire process to terminate:
An unhandled exception in a thread not associated with a request will take down the process. This occurs even if you have a handler setup via the Application_Error method.
Yes, there are a few ways to fire-and-forget a "task" or piece of work without needing confirmation. I've used Hangfire and it has worked well for me.
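With Hangfire configured against persistent storage, handing the work off is a one-liner (SendWelcomeEmail and userId are just placeholders): the job is stored, survives an app-pool recycle, and is retried if it throws.

BackgroundJob.Enqueue(() => SendWelcomeEmail(userId));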
The dangers, from what I understand, are that an exception in a fire-and-forget thread could bring down your entire IIS process.
See this excellent link about it.
I have this [HttpPost] action method:
[HttpPost]
public ActionResult AddReview(Review review)
{
    repository.Add(review);
    repository.Save();
    repository.UpdateSystemScoring(review.Id); // call SPROC with new Review ID
    return View("Success", review);
}
So, basically: a user clicks a button, I add the review to my database (via Entity Framework 4.0), save changes, and then I call a stored procedure with the identity field – that's the second-to-last line of code.
This needs to be done after the review is saved (as the identity field is only created once Save is called, and EF persists the changes), and it is a system-wide calculation.
From the user point of view, he/she doesn't/shouldn't care that this calculation is happening.
This procedure can take anywhere from 0-20 seconds. It does not return anything.
Is this a candidate for an asynchronous controller?
Is there a way I can add the Review, and let another asynchronous controller handle the long-running SPROC call, so the user can be taken to the Success page immediately?
I must admit (partially ashamed of this): this is a rewrite of an existing system, and in the original system (ASP.NET Web Forms) I fired off another thread in order to achieve the above task - which is why I was wondering if the same principle can be applied to ASP.NET MVC 3.
I always try to avoid multi-threading in ASP.NET, but user experience is the #1 priority, and I do not want the page to time out.
So - is this possible? I'm also happy to hear any other ideas. Also - I can't use triggers here; I don't really want to go into too much detail as to why, but I can't.
I would fire a new thread (not from the thread pool) to perform this task and return immediately, especially if you don't care about the results. Asynchronous controllers are useful in situations where most of the time is spent waiting for some other system to complete the task, and once that system completes the task your application is signaled to process the result; during the execution of the task no threads are consumed by your application. In your scenario the task could be performed by SQL Server using the asynchronous ADO.NET methods (e.g., SqlCommand.BeginExecuteNonQuery). You could use that if you need the results back. If you don't, firing a new thread will work just fine, as before.
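As a sketch of that "new thread, return immediately" approach (ScoringService is a made-up static wrapper around the SPROC; only the ID is captured, since anything request-scoped may already be disposed once the action returns):

int reviewId = review.Id;
new Thread(() =>
{
    try
    {
        ScoringService.UpdateSystemScoring(reviewId);
    }
    catch (Exception)
    {
        // log it; an app-pool recycle can still abort this thread mid-work
    }
}).Start();
return View("Success", review);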
I think asynchronous controllers are more for things where the request may take a long time to return a response, but the main thread would spend most of that time waiting for another thread/process. This is mostly useful for ajax calls rather than main page load, when it is acceptable to just show a progress indicator until the response is returned.
I use a separate queueing system for this type of task, which is more robust and easier to work with but does take a bit more work to set up. If you really need to do it within the ASP.net process, a separate request is probably the best option, though there is some potential for the task not to run - for example I'm not sure what happens if the connection drops or the app pool recycles while an async task is running.
Since the scoring system takes so long to run, I would recommend using a scheduled task in SQL Server or Windows to update the scores every X minutes. Since the user doesn't know about the request, it doesn't matter whether it runs immediately.
You could add the IDs to a queue and process the queue every 30 minutes.
Otherwise, if there is a reason this needs to run immediately, you could make an async call or see if you can trim some fat off the stored proc.
I have a very similar system that I wrote. Instead of doing things synchronously we do everything asynchronous using queues.
Action -> causes javascript request to web server
|
Web server puts notification on queue
|
Worker picks up message from queue and does point calculation
|
At some point in future user sees points adjusted
This allows us to handle large amounts of user load without worrying about it having an adverse effect on our calculation engine. It also means that we can add more workers under heavy load and remove them when load is light.
Sometimes there is a lot that needs to be done when a given Action is called. Many times, there is more that needs to be done than just what is needed to generate the next HTML for the user. In order to give the user a faster experience, I want to do only what I need to get them their next view and send it off, but still do more things afterwards. How can I do this, multi-threading? Would I then need to worry about making sure different threads don't step on each other's feet? Is there any built-in functionality for this type of thing in ASP.NET MVC?
As others have mentioned, you can use a spawned thread to do this. I would take care to consider the 'criticality' of several edge cases:
If your background task encounters an error and fails to do what the user expected to be done, do you have a mechanism for reporting this failure to the user?
Depending on how business-critical the various tasks are, using a robust/resilient message queue to store the 'background tasks to be processed' will help protect against a scenario where the user requests some action, and the server responsible crashes, is taken offline, the IIS service is restarted, etc., and the background thread never completes.
Just food for thought on other issues you might need to address.
How can I do this, multi-threading?
Yes!
Would I then need to worry about making sure different threads don't step on each other's feet?
This is something you need to take care of anyway, since two different ASP.NET request could arrive at the same time (from different clients) and be handled in two different worker threads simultaneously. So, any code accessing shared data needs to be coded in a thread-safe way anyway, even without your new feature.
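A minimal illustration of that point (the names are invented): any shared, mutable state needs a guard regardless of whether the second thread is another request or your own background work.

private static readonly object _sync = new object();
private static int _pendingJobs;

private void IncrementPendingJobs()
{
    lock (_sync)
    {
        _pendingJobs++;
    }
}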
Is there any built-in functionality for this type of thing in ASP.NET MVC?
The standard .net multi-threading techniques should work just fine here (manually starting threads, or using the Task features, or using the Async CTP, ...).
It depends on what you want to do and how reliable you need it to be. If the operations pending after the response is sent are OK to lose, then .NET async calls, the ThreadPool, or a new Thread will all work just fine. If the process crashes the pending work is lost, but you have already accepted that this can happen.
If the work requires any reliability guarantee - for instance, it updates the site database - then you cannot rely on in-process threading; you need to persist the request to do the work and then process it even after a process restart (an "app-pool recycle", as IIS so kindly calls it).
One way to do this is to use MSMQ. Another way is to use a database table as a queue. The most reliable way is to use the database activation mechanisms, as described in Asynchronous procedure execution.
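A rough sketch of the database-table-as-queue idea (the table, column, and variable names are invented): the request only inserts a row, and a separate worker process polls the table and does the actual work.

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand(
    "INSERT INTO WorkQueue (Payload, CreatedUtc) VALUES (@payload, SYSUTCDATETIME())", conn))
{
    cmd.Parameters.AddWithValue("@payload", workItemPayload);
    conn.Open();
    cmd.ExecuteNonQuery();
}
// The worker later claims rows with something like
//   SELECT TOP (1) ... FROM WorkQueue WITH (UPDLOCK, READPAST) ORDER BY Id
// and deletes them in the same transaction once the work succeeds.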
You can start a background task, then return from the action. This example uses the Task Parallel Library, found in .NET 4.0:
public ActionResult DoSomething()
{
    Task t = new Task(() => DoSomethingAsynchronously());
    t.Start();
    return View();
}
I would use MSMQ for this kind of work. Rather than spawning threads in an ASP.NET application, I'd use an Asynchronous out of process way to do this. It's very simple and very clean.
In fact I've been using MSMQ in ASP.NET applications for a very long time and have never had any issues with this approach. Furthermore, having a different process (an executable in a different app domain) do the long-running work is an ideal way to handle it, since your web application is not tied up doing that work. IIS, the thread pool, and your web application can continue to do what they need to, while other processes handle the long-running tasks.
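A sketch of the hand-off (the queue path and workItemId are illustrative, and System.Messaging must be referenced): the web app only sends a message, and a separate executable reads the queue and does the work.

const string queuePath = @".\private$\longRunningTasks";
if (!MessageQueue.Exists(queuePath))
    MessageQueue.Create(queuePath);

using (var queue = new MessageQueue(queuePath))
{
    queue.Send(workItemId, "LongRunningTask"); // second argument is the message label
}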
Maybe you should give it a try: Using an Asynchronous Controller in ASP.NET MVC
Is it possible to return the page response to the user, before you've finished all your server side work?
I.e., I've got a cheap hosting account with no database, but I'd like to log a certain event by calling a web service on my other, more expensive hosting account (i.e., a very slow logging operation)
I don't really want the user to have to wait for this slow logging operation to complete before their page is rendered.
Would I need to spin up a new thread, or make an asynchronous call? Or is it possible to return the page, and then continue working happily in the same thread/code?
Using ASP.Net (webforms) C# .Net 2.0 etc.
You would probably need a second thread. An easy option would be to use the ThreadPool, but in a more sophisticated setup a producer/consumer queue would work well.
At the simplest level:
ThreadPool.QueueUserWorkItem(delegate {
    DoLogging(details);
});
You sure can - try Response.Flush.
That being said, making an asynchronous call may be the best way to do what you want. Response.Flush simply flushes the output buffer to the client; an asynchronous call would allow you to fire off a logging call and not have it impact the client's load time.
Keep in mind that an asynchronous call made during the page's life cycle in ASP.NET may not return in time for you to do anything with the response.