Is await database.SaveChangesAsync() functionally equivalent to database.SaveChanges() ? If so, are there any benefits in using the async version if I need to update the database immediately ? Are there even more overheads in doing that ?
Thanks !
Update: I'm updating the question to give a better picture of my use case. This is in response to an answer below.
I'm writing a Web API that calls MediatR. I'm doing CRUD-type transactions, and I really need to update the database and then continue from there. I don't see a need to spawn a state machine (as I saw in a YouTube video) to process the database update. Am I thinking correctly? If that's the case, I really don't need to do an await database.SaveChangesAsync(), right? I should just call database.SaveChanges() and let the database update happen on the current thread. That's the purpose of my question. Is this thinking correct?
Lots of codebases provide both async and sync methods. The difference is that the sync method blocks your thread. Essentially, when you call 'await database.SaveChangesAsync()', the task sets up a callback to resume processing when the result arrives, and the thread returns to the pool. Performance-wise, for large systems, the more you make use of async/await, the lower the drain on system resources.
Using async await also lets you do things like this
private async Task Foo() {
var dataBaseResultTask = database.SaveChangesAsync();
SomeOtherWorkICanDoWithoutWaiting();
await dataBaseResultTask;
}
In general, if you can use the async version of a method, you should. But if you're calling the database from a sync method, it's not the end of the world to use the sync variant. They are functionally the same. Generally though, the more you use C# the more async will spread through your system like a virus, eventually you're gonna have to give in.
Updating for updated question:
In an API context, it makes a lot of sense to use the async variant for CRUD operations. Say you have 10 requests coming in at slightly different times. With the sync variant, the 10 method calls will reach the database step, block their threads and wait for the database to respond. You now have 10 threads locked up despite your API not actually doing any processing. On the other hand, if all the methods await the async database call, all 10 threads will spin up, trigger the database call, and then return to the pool, freeing them up for other API calls/processes. Then, when replies from the database come back, whatever threads are available will spin up, process the replies and return to the pool again.
For a small API you can get away with either method but best practices wise, your use case is textbook for async await. Tl;dr; The method itself will behave the same with either approach, but use less resources with async await.
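To make the "same behaviour, different thread usage" point concrete, here is a minimal console sketch. SaveChanges and SaveChangesAsync below are hypothetical stand-ins that only simulate a database round trip with Thread.Sleep and Task.Delay; they are not the Entity Framework methods. Both return the same value, but after the awaited call the continuation may be running on a different pool thread, which suggests the original thread was not held during the wait.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SaveChangesSketch
{
    // Hypothetical stand-in for database.SaveChanges(): blocks the calling thread.
    public static int SaveChanges()
    {
        Thread.Sleep(100); // simulated database round trip
        return 1;          // rows affected
    }

    // Hypothetical stand-in for database.SaveChangesAsync(): no thread is held during the wait.
    public static async Task<int> SaveChangesAsync()
    {
        await Task.Delay(100); // simulated database round trip
        return 1;              // rows affected
    }

    public static async Task Main()
    {
        Console.WriteLine($"Before saving:    thread {Environment.CurrentManagedThreadId}");

        int syncRows = SaveChanges();
        Console.WriteLine($"After sync save:  thread {Environment.CurrentManagedThreadId}, rows = {syncRows}");

        int asyncRows = await SaveChangesAsync();
        // In a console app the continuation may resume on a different pool thread.
        Console.WriteLine($"After async save: thread {Environment.CurrentManagedThreadId}, rows = {asyncRows}");
    }
}
```

Either way the code after the save only runs once the save has finished, so awaiting does not change the "update, then continue" ordering.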
Related
I wanted to ask you about async/await. Namely, why does it always need to be used? (all my friends say so)
Example 1.
public async Task Boo()
{
await WriteInfoIntoFile("file.txt");
// some other logic...
}
I have a Boo method, inside which I write something to a file and then execute some logic. Asynchrony is used here so that the thread is not blocked while the information is being written to the file. Everything is logical.
Example 2.
public async Task Bar()
{
var n = await GetNAsync(nId);
_uow.NRepository.Remove(n);
await _uow.CompleteAsync();
}
But for the second example, I have a question. Why here asynchronously get the entity, if without its presence it will still be impossible to work further?
why does it always need to be used?
It shouldn't always be used. Ideally (and especially for new code), it should be used for most I/O-based operations.
Why here asynchronously get the entity, if without its presence it will still be impossible to work further?
Asynchronous code is all about freeing up the calling thread. This brings two kinds of benefits, depending on where the code is running.
If the calling thread is a UI thread inside a GUI application, then asynchrony frees up the UI thread to handle user input. In other words, the application is more responsive.
If the calling thread is a server-side thread, e.g., an ASP.NET request thread, then asynchrony frees up that thread to handle other user requests. In other words, the server is able to scale further.
Depending on the context, you might or might not get some benefit. In case you call the second function from a desktop application, it allows the UI to stay responsive while the async code is being executed.
Why here asynchronously get the entity, if without its presence it will still be impossible to work further?
You are correct in the sense that this stream of work cannot proceed, but using async versions allows freeing up the thread to do other work:
I like this paragraph from Using Asynchronous Methods in ASP.NET MVC 4 to explain the benefits:
Processing Asynchronous Requests
In a web app that sees a large number of concurrent requests at start-up or has a bursty load (where concurrency increases suddenly), making web service calls asynchronous increases the responsiveness of the app. An asynchronous request takes the same amount of time to process as a synchronous request. If a request makes a web service call that requires two seconds to complete, the request takes two seconds whether it's performed synchronously or asynchronously. However during an asynchronous call, a thread isn't blocked from responding to other requests while it waits for the first request to complete. Therefore, asynchronous requests prevent request queuing and thread pool growth when there are many concurrent requests that invoke long-running operations.
Not sure what you mean by
without its presence it will still be impossible to work further
regarding example 2. As far as I can tell, this code gets an entity by id from its repository asynchronously, removes it, then completes the transaction on its Unit of Work. Do you mean why it does not simply remove the entry by id? That would certainly be an improvement, but it would still leave you with an asynchronous method, as CompleteAsync is obviously asynchronous.
As to your general question, I don't think there is a general consensus to always use async/await.
In your second example, with the async/await keywords you are getting the value of the n variable asynchronously. This might be necessary because the GetNAsync method is likely performing some time-consuming operation, such as querying a database or calling a web service downstream, that would otherwise block the calling thread. The rest of the Bar method still waits for the result, but while the query is in progress the thread is released to do other work instead of sitting idle.
But if GetNAsync just calls another local method that does some basic CPU-bound task, then the async is pointless in my view. Async works well when you are sure you need to wait, such as for network calls or other I/O-bound calls that will definitely add latency to your stack.
So I have been trying to get the grasp for quite some time now but couldn't see the sense in declaring every controller-endpoint as an async method.
Let's look at a GET-Request to visualize the question.
This is my way to go with simple requests, just do the work and send the response.
[HttpGet]
public IActionResult GetUsers()
{
// Do some work and get a list of all user-objects.
List<User> userList = _dbService.ReadAllUsers();
return Ok(userList);
}
Below is the async Task<IActionResult> option I see very often, it does the same as the method above but the method itself is returning a Task. One could think, that this one is better because you can have multiple requests coming in and they get worked on asynchronously BUT I tested this approach and the approach above with the same results. Both can take multiple requests at once. So why should I choose this signature instead of the one above? I can only see the negative effects of this like transforming the code into a state-machine due to being async.
[HttpGet]
public async Task<IActionResult> GetUsers()
{
// Do some work and get a list of all user-objects.
List<User> userList = _dbService.ReadAllUsers();
return Ok(userList);
}
The approach below is also something I don't fully grasp. I see a lot of code with exactly this setup: one async method that is awaited, and then the result is returned. Awaiting like this makes the code sequential again instead of having the benefits of multitasking/multithreading. Am I wrong on this one?
[HttpGet]
public async Task<IActionResult> GetUsers()
{
// Do some work and get a list of all user-objects.
List<User> userList = await _dbService.ReadAllUsersAsync();
return Ok(userList);
}
It would be nice if you could enlighten me with facts so I can either continue developing like I do right now or know that I have been doing it wrong due to misunderstanding the concept.
Please read the "Synchronous vs. Asynchronous Request Handling" section of Intro to Async/Await on ASP.NET.
Both can take multiple requests at once.
Yes. This is because ASP.NET is multithreaded. So, in the synchronous case, you just have multiple threads invoking the same action method (on different controller instances).
For non-multithreaded platforms (e.g., Node.js), you have to make the code asynchronous to handle multiple requests in the same process. But on ASP.NET it's optional.
Awaiting like this makes the code sequential again instead of having the benefits of Multitasking/Multithreading.
Yes, it is sequential, but it's not synchronous. It's sequential in the sense that the async method executes one statement at a time, and that request isn't complete until the async method completes. But it's not synchronous - the synchronous code is also sequential, but it blocks a thread until the method completes.
So why should I choose this signature instead of the one above?
If your backend can scale, then the benefit of asynchronous action methods is scalability. Specifically, asynchronous action methods yield their thread while the asynchronous operation is in progress - in this case, GetUsers is not taking up a thread while the database is performing its query.
The benefit can be hard to see in a testing environment, because your server has threads to spare, so there's no observable difference between calling an asynchronous method 10 times (taking up 0 threads) and calling a synchronous method 10 times (taking up 10 threads, with another 54 to spare). You can artificially restrict the number of threads in your ASP.NET server and then do some tests to see the difference.
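One rough way to try the "restrict the threads and test" idea outside ASP.NET is the console sketch below. It is an assumption-laden simulation, not a benchmark of a real server: the thread pool is capped at the processor count (the lowest value the runtime accepts), each "request" is a Task.Run work item, and Thread.Sleep / Task.Delay stand in for a blocking and a non-blocking database call. All names are hypothetical.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadPoolExperiment
{
    private const int WaitMs = 200; // simulated database latency

    // Each simulated request blocks a pool thread for the whole wait.
    public static Task RunBlockingRequests(int count) =>
        Task.WhenAll(Enumerable.Range(0, count)
            .Select(_ => Task.Run(() => Thread.Sleep(WaitMs))));

    // Each simulated request releases its pool thread during the wait.
    public static Task RunAsyncRequests(int count) =>
        Task.WhenAll(Enumerable.Range(0, count)
            .Select(_ => Task.Run(() => Task.Delay(WaitMs))));

    // Measures how long it takes for all simulated requests to finish.
    public static async Task<long> TimeAsync(Func<Task> run)
    {
        var stopwatch = Stopwatch.StartNew();
        await run();
        return stopwatch.ElapsedMilliseconds;
    }

    public static async Task Main()
    {
        // Cap worker threads at the processor count; keep the I/O thread limit unchanged.
        int cap = Environment.ProcessorCount;
        ThreadPool.GetMaxThreads(out _, out int ioThreads);
        bool capped = ThreadPool.SetMaxThreads(cap, ioThreads);

        int requests = cap * 4; // more requests than available threads
        long blockingMs = await TimeAsync(() => RunBlockingRequests(requests));
        long asyncMs = await TimeAsync(() => RunAsyncRequests(requests));

        Console.WriteLine($"Pool capped: {capped}, threads: {cap}, requests: {requests}");
        Console.WriteLine($"Blocking requests took {blockingMs} ms");
        Console.WriteLine($"Async requests took    {asyncMs} ms");
    }
}
```

With the pool capped, the blocking variant has to process the requests in batches, so it would be expected to take several multiples of the simulated latency, while the async variant should finish in roughly one latency period. Exact timings depend on the machine.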
In a real-world server, you usually want to make it as asynchronous as possible so that your threads are available for handling other requests. Or, as described here:
Bear in mind that asynchronous code does not replace the thread pool. This isn’t thread pool or asynchronous code; it’s thread pool and asynchronous code. Asynchronous code allows your application to make optimum use of the thread pool. It takes the existing thread pool and turns it up to 11.
Bear in mind the "if" above; this particularly applies to existing code. If you have just a single SQL server backend, and if pretty much all your actions query the db, then changing them to be asynchronous may not be useful, since the scalability bottleneck is usually the db server and not the web server. But if your web app could use threads to handle non-db requests, or if your db backend is scalable (NoSQL, SQL Azure, etc), then changing action methods to be asynchronous would likely help.
For new code, I recommend asynchronous methods by default. Asynchronous makes better use of server resources and is more cloud-friendly (i.e., less expensive for pay-as-you-go hosting).
If your DB Service class has an async method for getting the user then you should see benefits. As soon as the request goes out to the DB then it is waiting on a network or disk response back from the service at the other end. While it is doing this the thread can be freed up to do other work, such as servicing other requests. As it stands in the non-async version the thread will just block and wait for a response.
With async controller actions you can also get a CancellationToken which will allow you to quit early if the token is signalled because the client at the other end has terminated the connection (but that may not work with all web servers).
[HttpGet]
public async Task<IActionResult> GetUsers(CancellationToken ct)
{
ct.ThrowIfCancellationRequested();
// Do some work and get a list of all user-objects.
List<User> userList = await _dbService.ReadAllUsersAsync(ct);
return Ok(userList);
}
So, if you have an expensive set of operations and the client disconnects, you can stop processing the request because the client won't ever receive it. This frees up the application's time and resources to deal with requests where the client is interested in a result coming back.
However, if you have a very simple action that requires no async calls then I probably wouldn't worry about it, and leave it as it is.
You are missing something fundamental here.
When you use async Task<> you are effectively saying, "run all the I/O code asynchronously and free up processing threads to... process".
At scale, your app will be able to serve more requests/second because your I/O is not tying up your CPU intensive work and vice versa.
You're not seeing much benefit now, because you're probably testing locally with plenty of resources at hand.
See
Understanding CPU and I/O bound for asynchronous operations
Async in depth
As you know, ASP.NET is based on a multi-threaded model of request execution. To put it simply, each request is run in its own thread, contrary to other single thread execution approaches like the event-loop in nodejs.
Now, threads are a finite resource. If you exhaust your thread pool (the available threads), you can't continue handling tasks efficiently (or, in most cases, at all). If you use synchronous execution for your action handlers, you occupy those threads from the pool even when they don't need to be occupied. It is important to understand that those threads handle requests; they don't do the input/output themselves. I/O is carried out by the operating system and the hardware, independently of the ASP.NET request threads.
So, if your action contains a DB fetch operation that takes, let's say, 15 seconds to execute, you're forcing your current thread to wait idle for 15 seconds until the DB result is returned and it can continue executing code. Therefore, if you have 50 such requests, you'll have 50 threads occupied in what is basically sleep mode. Obviously, this is not very scalable and becomes a problem soon enough.
In order to use your resources efficiently, it is a good idea to preserve the state of the executing request when an I/O operation is reached and free the thread to handle another request in the meantime. When the I/O operation completes, the state is restored and given to a free thread in the pool to resume the handling. That's why you use async handling. It lets you use your finite resources more efficiently and prevents such thread starvation. Of course, it is not an ultimate solution. It helps you scale your application for higher load. If you don't have the need for it, don't use it, as it just adds overhead.
The result returns a task from an asynchronous (non-blocking) operation which represents some work that should be done. The task can tell you if the work is completed and if the operation returns a result, the task gives you the result which won't be available until the task is completed. You can learn more about C# asynchronous programming from the official Microsoft Docs:
https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.task?view=net-5.0
Async operations return a Task (it is not the result, but a promise that the result will be available once the task is completed).
Async methods should be awaited so that the caller waits until the task is completed. If you await an async method, the value you get back is no longer a task but the result you expected.
Imagine this async method:
async Task<SomeResult> SomeMethod()
{
...
}
the following code:
var task = SomeMethod(); // task is a Task<SomeResult> which may not have completed yet
var result = await task; // result is of type SomeResult
Now, why do we use async methods instead of sync methods:
Imagine that a method is doing some job that may take a long time to complete. When it is done synchronously, the calling thread is blocked until this long-running job completes and cannot serve anything else in the meantime. When it is async, the thread is released while the operation is pending, the rest of the method resumes afterwards on a (possibly different) thread, and other requests are not held up.
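The difference between holding the Task and awaiting it can be tried in a small console program. This is only a sketch: GetAnswerAsync is a hypothetical method, and Task.Delay stands in for real I/O.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskVersusResult
{
    // Hypothetical async method; Task.Delay stands in for a real I/O operation.
    public static async Task<int> GetAnswerAsync()
    {
        await Task.Delay(50);
        return 42;
    }

    public static async Task Main()
    {
        // Without await we only hold the "promise"; the work is usually still in flight.
        Task<int> pending = GetAnswerAsync();
        Console.WriteLine($"Completed right after the call: {pending.IsCompleted}");

        // Awaiting unwraps the Task<int> into the int it eventually produces.
        int answer = await pending;
        Console.WriteLine($"Completed after await: {pending.IsCompleted}, answer = {answer}");
    }
}
```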
I've looked over multiple similar questions on SO, but I still couldn't answer my own question.
I have a console app (an Azure Webjob actually) which does file processing and DB management. Some heavy data being downloaded from multiple sources and processed on the DB.
Here's an example of my code:
var dbLongIndpendentProcess = doProcesAsync();
var myfilesTasks = files.Select(file => Task.Run(
    async () =>
    {
        // files processing
    }));
await Task.WhenAll(myfilesTasks);
await dbLongIndpendentProcess;
// continue with other stuff;
It all works fine and does what I am expecting it to do. There are other tasks running in this whole process, but I guess the idea is clear from the code above.
My question: Is this a fair way of approaching this, or would I get more performance (or sense?) by doing the good old "manual" multithreading? The main reason I chose this approach was that it's simple and straightforward.
However, wasn't async/await primarily aimed at doing work asynchronously so as not to block the main (UI) thread? Here I don't have any UI, and I am not doing anything event-driven.
Thanks,
I don't think you're multithreading by using this approach (except the single Task.Run), async doesn't generally run things on separate threads, it only prevents things from blocking. See: https://msdn.microsoft.com/en-gb/library/mt674882.aspx#Anchor_5
The async and await keywords don't cause additional threads to be
created. Async methods don't require multithreading because an async
method doesn't run on its own thread. The method runs on the current
synchronization context and uses time on the thread only when the
method is active. You can use Task.Run to move CPU-bound work to a
background thread, but a background thread doesn't help with a process
that's just waiting for results to become available.
It would be much better to use tasks for the things you want to multithread, then you can take better advantage of machine cores and resources. You might want to look at a task based solution such as Pipelining (which may work in this scenario) etc...: https://msdn.microsoft.com/en-gb/library/ff963548.aspx or another alternative.
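As a rough illustration of the distinction drawn above, the sketch below awaits several I/O-style operations together (no extra threads needed while waiting) and only uses Task.Run for the CPU-bound step. DownloadAsync and Crunch are hypothetical placeholders for the downloading and processing work; Task.Delay stands in for the actual I/O.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class FileJobSketch
{
    // Hypothetical I/O-bound step (download, DB call); Task.Delay stands in for the wait.
    public static async Task<int> DownloadAsync(int fileId)
    {
        await Task.Delay(100);
        return fileId * 10; // pretend payload size
    }

    // Hypothetical CPU-bound step.
    public static long Crunch(int payload)
    {
        long sum = 0;
        for (int i = 0; i < payload * 1000; i++) sum += i % 7;
        return sum;
    }

    public static async Task<long[]> ProcessAsync(int[] fileIds)
    {
        // I/O: start every download, then await them together; the waits overlap
        // without a dedicated thread per download.
        int[] payloads = await Task.WhenAll(fileIds.Select(id => DownloadAsync(id)));

        // CPU: Task.Run moves each computation to a pool thread so several cores can be used.
        return await Task.WhenAll(payloads.Select(p => Task.Run(() => Crunch(p))));
    }

    public static async Task Main()
    {
        long[] results = await ProcessAsync(new[] { 1, 2, 3 });
        Console.WriteLine($"Processed {results.Length} files");
    }
}
```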
I have a method which has just one task to do and has to wait for that task to complete:
public async Task<JsonResult> GetAllAsync()
{
var result = await this.GetAllDBAsync();
return Json(result, JsonRequestBehavior.AllowGet);
}
public async Task<List<TblSubjectSubset>> GetAllDBAsync()
{
return await model.TblSubjectSubsets.ToListAsync();
}
It is significantly faster than when I run it without async-await.
We know
The async and await keywords don't cause additional threads to be
created. Async methods don't require multithreading because an async
method doesn't run on its own thread. The method runs on the current
synchronization context and uses time on the thread only when the
method is active
According to this link: https://msdn.microsoft.com/en-us/library/hh191443.aspx#BKMK_Threads. What is the reason for being faster when we don't have another thread to handle the job?
"Asynchronous" does not mean "faster."
"Asynchronous" means "performs its operation in a way that it does not require a thread for the duration of the operation, thus allowing that thread to be used for other work."
In this case, you're testing a single request. The asynchronous request will "yield" its thread to the ASP.NET thread pool... which has no other use for it, since there are no other requests.
I fully expect asynchronous handlers to run slower than synchronous handlers. This is for a variety of reasons: there's the overhead of the async/await state machine, and extra work when the task completes to have its thread enter the request context. Besides this, the Win32 API layer is still heavily optimized for synchronous calls (expect this to change gradually over the next decade or so).
So, why use asynchronous handlers then?
For scalability reasons.
Consider an ASP.NET server that is serving more than one request - hundreds or thousands of requests instead of a single one. In that case, ASP.NET will be very grateful for the thread returned to it during its request processing. It can immediately use that thread to handle other requests. Asynchronous requests allow ASP.NET to handle more requests with fewer threads.
This is assuming your backend can scale, of course. If every request has to hit a single SQL Server, then your scalability bottleneck will probably be your database, not your web server.
But if your situation calls for it, asynchronous code can be a great boost to your web server scalability.
For more information, see my article on async ASP.NET.
I agree with Orbittman when he mentions the overhead involved in the application architecture. It doesn't make for a very good benchmark premise since you can't be sure if the degradation can indeed be solely attributed to the async vs non-async calls.
I've created a really simple benchmark to get a rough comparison between an async and a synchronous call and async loses every time in the overall timing actually, though the data gathering section always seems to end up the same. Have a look: https://gist.github.com/mattGuima/25cb7893616d6baaf970
Having said that, the same thought regarding the architecture applies. Frameworks handle async calls differently: Async and await - difference between console, Windows Forms and ASP.NET
The main thing to remember is to never confuse async with a performance gain, because the two are unrelated, and most often it will result in no gain at all, especially with CPU-bound code. Look at the Parallel library for that instead.
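For CPU-bound work, the Parallel class is the kind of tool meant here. A minimal sketch follows, using prime counting as a made-up workload; the method names are hypothetical. Whether the parallel version is actually faster depends on the core count and on how expensive each iteration is.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CpuBoundSketch
{
    public static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int d = 2; (long)d * d <= n; d++)
            if (n % d == 0) return false;
        return true;
    }

    // Plain loop: one thread does all the work.
    public static int CountPrimesSequential(int limit)
    {
        int count = 0;
        for (int n = 0; n < limit; n++)
            if (IsPrime(n)) count++;
        return count;
    }

    // Parallel.For splits the range across pool threads; the shared counter
    // is updated with Interlocked to stay thread-safe.
    public static int CountPrimesParallel(int limit)
    {
        int count = 0;
        Parallel.For(0, limit, n =>
        {
            if (IsPrime(n)) Interlocked.Increment(ref count);
        });
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountPrimesSequential(100)); // prints 25
        Console.WriteLine(CountPrimesParallel(100));   // prints 25
    }
}
```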
Async await is not the silver bullet that some people think it is and in your example is not required. If you were processing the result of the awaitable operation after you received it then you would be able to return a task and continue on the calling thread. You wouldn't have to then wait for the rest of the operation to complete. You would be correct to remove the async/await in the above code.
It's not really possible to answer the question without seeing the calling code either, as it depends on what the context is trying to do with the response. What you are getting back is not just a Task but a task in the context of the method that will continue when complete. See http://codeblog.jonskeet.uk/category/eduasync/ for much better information regarding the inner workings of async/await.
Lastly, I would question your timings: with an Ajax request to a database and back, there are other areas with potentially greater latency, such as the HTTP request and response and the DB connection itself. I assume that you're using an ORM, and that alone can cause an overhead. I wonder whether it's the async/await that is the problem.
I'm writing a series of ASP.Net Web Api services that basically get data from a database and return it.
We decided for now to reuse previous poorly written Data Access Objects (let's call them PoorDAO) that use ADO.Net to call stored procedures in the database.
One improvement in the future will be to rewrite that data access layer to benefit from Async data calls with Entity Framework.
Because of this, we decided to wrap the PoorDAO's in Repositories implementing an interface that exposes asynchronous methods. The idea is to keep the same interfaces for future EF asynchronous repositories :
// future common interface
public interface ICountryRepository
{
Task<Country> GetAllCountries();
}
// current implementation hiding a PoorDAO in shame
public class CountryRepository : ICountryRepository
{
public Task<Country> GetAllCountries()
{
var countries = PoorCountryDAO.GetAllcountries(); // poor static API call
// some data transformation ...
return Task.FromResult(result);
}
}
What we have here is basically a synchronous operation hiding in asynchronous clothing. This is all fine, but my question is : while we're at it, wouldn't it be better to make the method entirely async and call await Task.Run(() => poorCountryDAO.GetAllcountries()) instead of just poorCountryDAO.GetAllcountries() ?
As far as I can tell, this would free up the IIS thread the Web Api service HTTP request is currently running on, and create or reuse another thread. This thread would be blocked waiting for the DB to respond instead of the IIS thread being blocked. Is that any better resource wise ? Did I totally misunderstand or overinterpret how Task.Run() works ?
Edit : I came across this article which claims that in some cases, asynchronous database calls can result in an 8 fold performance improvement. His scenario is very close to mine. I can't get my head around how that could be possible given the answers here and am a bit perplexed about what to do...
Is that any better resource wise?
No; it's provably worse. The existing Task.FromResult and await is the best solution.
Task.Run, Task.Factory.StartNew, and Task.Start should not be used in an ASP.NET application. They steal threads from the same thread pool that ASP.NET uses, causing extra thread switches. Also, if they are long-running, they will mess with the default ASP.NET thread pool heuristics, possibly causing it to create and destroy threads unnecessarily.
It's the same thing, you're locking up a thread while releasing another one. In theory performance is the same, although it will actually be slightly worse because of the overhead of context switching
A few points: first, for await Task.Run(() => poorCountryDAO.GetAllcountries()), Task.Run(() => poorCountryDAO.GetAllcountries()) already gives you a task, so you should just return that rather than awaiting it.
Note that in any case, the fact that this method's Task is really synchronous is an implementation detail. There may be a temptation to wrap the GetAllCountries() call itself in a background thread, but that's a bad idea.
In all of these cases, you're still going to be stuck wasting a thread. The scenario you desire where you free up the IIS thread completely requires the use of "Overlapped IO" for the database calls (as per your link).
Basically, in these cases right now, one way or another, a thread (either the main thread or a worker thread) is going to block when it calls PoorCountryDAO.GetAllcountries(). However, when you switch to the asynchronous DB calls, they will no longer burn a thread at all. If, however, the caller uses its own Task.Run, that will now come back to bite you.
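The two wrapping options discussed in this thread can be put side by side in a small sketch. Everything here is hypothetical: PoorCountryDao simulates the blocking DAO with Thread.Sleep, and the interface uses a list of strings rather than the Country type from the question. Both implementations block some thread for the duration of the call; the Task.Run variant merely moves the blocking to another pool thread.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public interface ICountryRepository
{
    Task<IReadOnlyList<string>> GetAllCountriesAsync();
}

// Hypothetical stand-in for the blocking DAO from the question.
public static class PoorCountryDao
{
    public static List<string> GetAllCountries()
    {
        Thread.Sleep(100); // simulated blocking ADO.NET call
        return new List<string> { "France", "Japan" };
    }
}

// Runs synchronously on the caller's thread, then wraps the value in a completed task.
public sealed class FromResultRepository : ICountryRepository
{
    public Task<IReadOnlyList<string>> GetAllCountriesAsync()
    {
        IReadOnlyList<string> result = PoorCountryDao.GetAllCountries();
        return Task.FromResult(result);
    }
}

// Frees the caller's thread but blocks a different pool thread instead: no net saving.
public sealed class TaskRunRepository : ICountryRepository
{
    public Task<IReadOnlyList<string>> GetAllCountriesAsync() =>
        Task.Run<IReadOnlyList<string>>(() => PoorCountryDao.GetAllCountries());
}

public static class Program
{
    public static async Task Main()
    {
        ICountryRepository[] repositories = { new FromResultRepository(), new TaskRunRepository() };
        foreach (ICountryRepository repository in repositories)
        {
            IReadOnlyList<string> countries = await repository.GetAllCountriesAsync();
            Console.WriteLine($"{repository.GetType().Name}: {countries.Count} countries");
        }
    }
}
```

Callers written against the interface would not need to change if a truly asynchronous implementation replaced these later.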