HttpClient provide not truly async operations? - c#

I'm confused about async IO operations. In this article Stephen Cleary explains that we should not use Task.Run(() => SomeIoMethod()) because truly async operations should use
standard P/Invoke asynchronous I/O system in .NET
http://blog.stephencleary.com/2013/11/there-is-no-thread.html
However, avoid “fake asynchrony” in libraries. Fake asynchrony is when
a component has an async-ready API, but it’s implemented by just
wrapping the synchronous API within a thread pool thread. That is
counterproductive to scalability on ASP.NET. One prominent example of
fake asynchrony is Newtonsoft JSON.NET, an otherwise excellent
library. It’s best to not call the (fake) asynchronous versions for
serializing JSON; just call the synchronous versions instead. A
trickier example of fake asynchrony is the BCL file streams. When a
file stream is opened, it must be explicitly opened for asynchronous
access; otherwise, it will use fake asynchrony, synchronously blocking
a thread pool thread on the file reads and writes.
And he advises to use HttpClient but internaly it use Task.Factory.StartNew()
Does this mean that HttpClient provides not truly async operations?

Does this mean that HttpClient provides not truly async operations?
Sort of. HttpClient is in an unusual position, since it's primary implementation uses HttpWebRequest, which is only partially asynchronous.
In particular, the DNS lookup is synchronous, and I think maybe the proxy resolution, too. After that, it's all asynchronous. So, for most scenarios, the DNS is fast (usually cached) and there isn't a proxy, so it acts asynchronously. Unfortunately, there are enough scenarios (particularly from within corporate networks) where the synchronous operations can cause significant lag.
So, when the team was writing HttpClient, they had three options:
Fix HttpWebRequest (and friends) allowing for fully-asynchronous operations. Unfortunately, this would have broken a fair amount of code. Due to the way inheritance is used as extension points in these objects, adding asynchronous methods would be backwards-incompatible.
Write their own HttpWebRequest equivalent. Unfortunately, this would take a lot of work and they'd lose all the interoperability with existing WebRequest-related code.
Queue requests to the thread pool to avoid the worst-case scenario (blocking synchronous code on the UI thread). Unfortunately, this has the side effects of degrading scalability on ASP.NET, being dependent on a free thread pool thread, and incurring the worst-case scenario cost even for best-case scenarios.
In an ideal world (i.e., when we have infinite developer and tester time), I would prefer (2), but I understand why they chose (3).
On a side note, the code you posted shows a dangerous use of StartNew, which has actually caused problems due to its use of TaskScheduler.Current. This has been fixed in .NET Core - not sure when the fix will roll back into .NET Framework proper.

No, your assumptions are wrong.
StartNew isn't equal to the Run method.
This code is from HttpClientHandler, not the HttpClient, and you didn't examine the this.startRequest code from this class. The code you're inspecting is a prepare method, which starts a task in new thread pool, and inside call actual code to start an http request.
HTTP-connection is created not on the .NET level of abstraction, and I'm sure that inside startRequest you'LL find some P/Invoke method, which will do actual work for:
DNS lookup
Socket connection
Sending the request
waiting for the answer
etc.
As you can see, all above are logic which really should be called in async manner, because it is outside the .NET framework, and some operation can be very time-consuming. This is exactly logic that should be called asynchroniously, and during the waiting for it .NET thread is being released in ThreadPool to process other tasks.

Related

Making third party I/O DLL asynchronous

I need to use a third-party DLL which implements a TCP socket client (in C++) using blocking calls. So basically (pseudocode);
void DoRequest()
{
send(myblockingSocket,data);
recv(myblockingSocket,responsedata);
}
What is the recommended way to make these calls accessible in .NET as asynchronous calls using async-await (without changing the original DLL) ?
I read: https://learn.microsoft.com/en-us/dotnet/standard/async-in-depth#deeper-dive-into-tasks-for-an-io-bound-operation and https://learn.microsoft.com/en-us/dotnet/csharp/async and several other pages and did not find another solution than spawning a new task, which is not recommended to do on I/O bound operations because of the task creation overhead.
What is the recommended way to make these calls accessible in .NET as asynchronous calls using async-await (without changing the original DLL) ?
There is no recommended solution because this isn't possible. Either the DLL itself must be changed/replaced so that it supports asynchrony, or the asynchronous calls will just be running the synchronous code on a background thread - what I call "fake asynchrony" because it appears asynchronous but is actually taking up a thread anyway.
... did not find another solution than spawning a new task, which is not recommended to do on I/O bound operations because of the task creation overhead.
It's actually not recommended for a couple of reasons:
It lies to the upstream code. It says "this API is asynchronous" when it's not. This can lead consumers to make incorrect decisions, e.g., preferring the asynchronous API in a server scenario.
It doesn't provide any actual benefit. Implementing a method with Task.Run forces the consumers to use an additional thread. If you just kept the API synchronous, then consumers can choose to call it with Task.Run or not, depending on their needs.

Is Task.Run or TaskFactory.StartNew always inappropriate to use in async methods?

I've heard that the responsibility for threading should lie on the application and I shouldn't use Task.Run or maybe TaskFactory.StartNew in async methods.
However if I have a library that has methods that do quite heavy computation, then to free the threads that for example are accepting asp .net core http requests, couldn't I make the method async and make it run a long running task? Or this should be a sync method and the asp .net core application should be responsible to start the task?
At first, let's think why we need Asynchrony?
Asynchrony is needed either for scalability or offloading.
In case of Scalability, exposing async version of that call does nothing. Because you’re typically still consuming the same amount of resources you would have if you’d invoked it synchronously, even a bit more. But, Scalability is achieved by decreasing the amount of resources you use. And you are not decreasing resources by using Task.Run().
In case of Offloading, you can expose async wrappers of your sync methods. Because it can be very useful for responsiveness, as it allows you to offload long-running operations to a different thread. And in that way, you are getting some benefit from that async wrapper of your method.
Result:
Wrapping a synchronous method with a simple asynchronous façade does not yield any scalability benefits, but yields offloading benefits. But in such cases, by exposing only the synchronous method, you get some nice benefits. For example:
Surface area of your library is reduced.
Your users will know whether there are actually scalability benefits to using exposed asynchronous APIs
If both the synchronous method and an asynchronous wrapper around it are exposed, the developer is then faced with thinking they should invoke the asynchronous version for scalability(?) reasons, but in reality will actually be hurting their throughput by paying for the additional offloading overhead without the scalability benefits.
The source is Should I expose asynchronous wrappers for synchronous methods? by Stepen Toub. And I strongly recommend to you to read it.
Update:
Question in the comment:
Scalability is well explained in that article, with one example. Let's take into account Thread.Sleep. There are two possible ways to implement async version of that call:
public Task SleepAsync(int millisecondsTimeout)
{
return Task.Run(() => Sleep(millisecondsTimeout));
}
And another new implementation:
public Task SleepAsync(int millisecondsTimeout)
{
TaskCompletionSource<bool> tcs = null;
var t = new Timer(delegate { tcs.TrySetResult(true); }, null, –1, -1);
tcs = new TaskCompletionSource<bool>(t);
t.Change(millisecondsTimeout, -1);
return tcs.Task;
}
Both of these implementations provide the same basic behavior, both completing the returned task after the timeout has expired. However, from a scalability perspective, the latter is much more scalable. The former implementation consumes a thread from the thread pool for the duration of the wait time, whereas the latter simply relies on an efficient timer to signal the Task when the duration has expired.
So, in your case, just wrapping call with Task.Run won't be exposed for scalability, but offloading. But, user of that library is not aware of that.
User of your library, can just wrap that call with Task.Run himself. And I really, think he must do it.
Not exactly answering the question (I think the other answer is good enought for that), but to add some additional advice: Becareful with using Task.Run in a library which other people can use. It can cause unexpected Thread pool starvation for the library users. For example a developer is using a lot of third party libraries and all of them use Task.Run() and stuff. Now the developer tries to use Task.Run in his app too, but it slows down his app, because the thread pool is already used up by the third party libraries.
When you want to parallel stuff with Parallel.ForEach it is a different issue.

Realizing the benefits of async/await in a Web API when a syncronous method is in the mix

My goal is increase scalability of a Web API I am working on. To that end, I've implemented the controller actions to be async. The idea being that request threads are released and are available to handle other incoming requests. However, the winning answer of Effectively use async/await with ASP.NET Web API states:
You'd need a truly asynchronous implementation to get the scalability
benefits of async.
Eventually my code needs to call a method from a 3rd party library that is very synchronous, single threaded and I/O bound. So really the only way to do it is through Task.Run(), which I am assuming will hold on to the thread - thereby cancelling the benefits of async/await.
So is there a way to realize the benefits of async/await in a Web API scenario when there is a synchronous operation in the mix?
So is there a way to realize the benefits of async/await in a Web API scenario when there is a synchronous operation in the mix?
No. The reason is simple: asynchrony provides its benefits by releasing its thread, and a synchronous operation blocks a thread.
The benefit of asynchrony on the server side is scalability. Asynchrony provides better scalability because it releases its thread when it's not needed.
A synchronous API blocks a thread, by definition. If the API is an I/O-bound operation, then it's blocking a thread that is mainly just waiting for I/O, i.e., the thread isn't needed but it also can't be released.
There's no way around this; the synchronous API must block a thread, so you cannot get the benefits of asynchrony.
If you had a GUI application, then there is a workaround. On the client side the primary benefit of asynchrony is responsiveness. Asynchrony provides better responsiveness by releasing the UI thread when it's not needed.
For GUI applications, you can use Task.Run to block a thread pool thread, and your UI thread remains responsive; this is an appropriate workaround. On server applications, using Task.Run like that is an antipattern, since it forces a thread switch and then you still end up with a blocked thread anyway (preventing any scalability benefit), so Task.Run just slows down your code for no benefit at all.
It all depends on what other code is involved. If there is a lot of other code that could be running while the I/O operations are executing then yes the Task.Run() would be ideal, just make sure to await the task where needed.
If there isn't code that could be running during the I/O operations then the Task.Run() is not a good plan as you would be blocking the main thread and adding a new thread anyways yielding worse scaling than just leaving it synchronous.
Ultimately if you are using a 3rd party synchronous library and want to optimize scaling your web API via async calls you should try to recreate the library yourself using async I/O calls. Feel free to check out https://learn.microsoft.com/en-us/dotnet/standard/io/asynchronous-file-i-o for more information.

Why does an async single task run faster than a normal single task?

I have a method which has just one task to do and has to wait for that task to complete:
public async Task<JsonResult> GetAllAsync()
{
var result = await this.GetAllDBAsync();
return Json(result, JsonRequestBehavior.AllowGet);
}
public async Task<List<TblSubjectSubset>> GetAllDBAsync()
{
return await model.TblSubjectSubsets.ToListAsync();
}
It is significantly faster than when I run it without async-await.
We know
The async and await keywords don't cause additional threads to be
created. Async methods don't require multithreading because an async
method doesn't run on its own thread. The method runs on the current
synchronization context and uses time on the thread only when the
method is active
According to this link: https://msdn.microsoft.com/en-us/library/hh191443.aspx#BKMK_Threads. What is the reason for being faster when we don't have another thread to handle the job?
"Asynchronous" does not mean "faster."
"Asynchronous" means "performs its operation in a way that it does not require a thread for the duration of the operation, thus allowing that thread to be used for other work."
In this case, you're testing a single request. The asynchronous request will "yield" its thread to the ASP.NET thread pool... which has no other use for it, since there are no other requests.
I fully expect asynchronous handlers to run slower than synchronous handlers. This is for a variety of reasons: there's the overhead of the async/await state machine, and extra work when the task completes to have its thread enter the request context. Besides this, the Win32 API layer is still heavily optimized for synchronous calls (expect this to change gradually over the next decade or so).
So, why use asynchronous handlers then?
For scalability reasons.
Consider an ASP.NET server that is serving more than one request - hundreds or thousands of requests instead of a single one. In that case, ASP.NET will be very grateful for the thread returned to it during its request processing. It can immediately use that thread to handle other requests. Asynchronous requests allow ASP.NET to handle more requests with fewer threads.
This is assuming your backend can scale, of course. If every request has to hit a single SQL Server, then your scalability bottleneck will probably be your database, not your web server.
But if your situation calls for it, asynchronous code can be a great boost to your web server scalability.
For more information, see my article on async ASP.NET.
I agree with Orbittman when he mentions the overhead involved in the application architecture. It doesn't make for a very good benchmark premise since you can't be sure if the degradation can indeed be solely attributed to the async vs non-async calls.
I've created a really simple benchmark to get a rough comparison between an async and a synchronous call and async loses every time in the overall timing actually, though the data gathering section always seems to end up the same. Have a look: https://gist.github.com/mattGuima/25cb7893616d6baaf970
Having said that, the same thought regarding the architecture applies. Frameworks handle async calls differently: Async and await - difference between console, Windows Forms and ASP.NET
The main thing to remember is to never confuse async with performance gain, because it is completely unrelated and most often it will result on no gain at all, specially with CPU-bound code. Look at the Parallel library for that instead.
Async await is not the silver bullet that some people think it is and in your example is not required. If you were processing the result of the awaitable operation after you received it then you would be able to return a task and continue on the calling thread. You wouldn't have to then wait for the rest of the operation to complete. You would be correct to remove the async/await in the above code.
It's not really possible to answer the question without seeing the calling code either as it depends on what the context is trying to trying to do with the response. What you are getting back is not just a Task but a task in the context of the method that will continue when complete. See http://codeblog.jonskeet.uk/category/eduasync/ for much better information regarding the inner workings of async/await.
Lastly I would question your timings as with an Ajax request to a database and back there other areas with potentially greater latency, such as the HTTP request and response and the DB connection itself. I assume that you're using an ORM and that alone can cause an overhead. I wonder whether it's the async/await that is the problem.

Clarification on tasks in .net

I'm trying to understand tasks in .net from what I understand is that they are better than threads because they represent work that needs to get done and when there is a idle thread it just gets picked up and worked on allowing the full cpu to be utilized.
I see the Task<ActionResult> all over a new mvc 5 project and I would like to know why this is happening?
Does it make sense to always do this, or just when there can be blocking work in the function?
I'm guessing since this does act like a thread there is still sync objects that may be needed is this correct?
MVC 5 uses Task<ActionResult> to allow it to be fully asynchronous. By using Task<T>, the methods can be implemented using the new async and await language features, which allows you to compose asynchronous IO functions with MVC in a simple manner.
When working with MVC, in general, the Task<T> will hopefully not be using threads - they'll be composing asynchronous operations (typically IO bound work). Using threads on a server, in general, will reduce your overall scalability.
A Task does not represent a thread, even logically. It's not just an alternate implementation of threads. It's a higher level concept. A Task is the representation of an asynchronous operation that will complete at some point (usually in the future).
That task could represent code being run on another thread, it could represent some asynchronous IO operation that relies on OS interrupts to (indirectly, through a few other layers of indirection) cause the task to be marked completed), it could be the result of two other tasks being completed, or the continuation of some other task being completed, it could be an indication of when an event next fires, or some custom TaskCompletionSource that has who knows what as its implementation.
But you don't need to worry about all of those options. That's the point. In other models you need to treat all of those different types of asynchronous operations differently, complicating your asynchronous programs. The use of Task allows you to write code that can easily be composed with any and every type of asynchronous operation.
I'm guessing since this does act like a thread there is still sync objects that may be needed is this correct?
Technically, yes. There are times where you may need to use these, but largely, no. Ideally, if you're using idiomatic practices, you can avoid this, at least in most cases. Generally when one task depends on code running in other tasks it should be the continuation of that task, and information is assessed between tasks through the tasks' Result property. The use of Result doesn't require any synchronization mechanisms, so usually you can avoid them entirely.
I see the Task all over a new mvc 5 project and I would like to know why this is happening?
When you're going to make something asynchronous it generally makes sense to make everything asynchronous (or nothing). Mixing and matching just...doesn't work. Asynchronous code relies on having every method take very little time to execute so that the message pump can get back to processing its queue of pending tasks/continuations. Mixing asynchronous code and synchronous code makes it very likely to deadlock your application, and also defeats most of the purposes of using asynchrony to begin with (which is to avoid blocking threads).

Categories

Resources