Making third party I/O DLL asynchronous

I need to use a third-party DLL which implements a TCP socket client (in C++) using blocking calls. Basically, in pseudocode:
void DoRequest()
{
send(myblockingSocket,data);
recv(myblockingSocket,responsedata);
}
What is the recommended way to make these calls accessible in .NET as asynchronous calls using async-await (without changing the original DLL)?
I read: https://learn.microsoft.com/en-us/dotnet/standard/async-in-depth#deeper-dive-into-tasks-for-an-io-bound-operation and https://learn.microsoft.com/en-us/dotnet/csharp/async and several other pages and did not find another solution than spawning a new task, which is not recommended to do on I/O bound operations because of the task creation overhead.

What is the recommended way to make these calls accessible in .NET as asynchronous calls using async-await (without changing the original DLL)?
There is no recommended solution because this isn't possible. Either the DLL itself must be changed/replaced so that it supports asynchrony, or the asynchronous calls will just be running the synchronous code on a background thread - what I call "fake asynchrony" because it appears asynchronous but is actually taking up a thread anyway.
... did not find another solution than spawning a new task, which is not recommended to do on I/O bound operations because of the task creation overhead.
It's actually not recommended for a couple of reasons:
It lies to the upstream code. It says "this API is asynchronous" when it's not. This can lead consumers to make incorrect decisions, e.g., preferring the asynchronous API in a server scenario.
It doesn't provide any actual benefit. Implementing a method with Task.Run forces the consumers to use an additional thread. If you just kept the API synchronous, then consumers can choose to call it with Task.Run or not, depending on their needs.
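A minimal sketch of that second point, assuming a hypothetical BlockingClient.DoRequest that stands in for the blocking DLL call (simulated here with Thread.Sleep; the real wrapper would call into the DLL through P/Invoke or C++/CLI). The library exposes only the synchronous method, and each consumer decides whether to offload it:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the wrapper around the third-party DLL.
public sealed class BlockingClient
{
    // Honest signature: this blocks the calling thread until the reply arrives.
    public string DoRequest(string data)
    {
        Thread.Sleep(50); // simulates the blocking send() + recv() pair
        return "reply:" + data;
    }
}

public static class Consumers
{
    // UI-style consumer: offloads so the UI thread stays responsive.
    public static Task<string> FromUiAsync(BlockingClient client, string data)
        => Task.Run(() => client.DoRequest(data));

    // Server-style consumer: already on a thread pool thread, so it calls
    // the method directly and avoids paying for a second thread.
    public static string FromServer(BlockingClient client, string data)
        => client.DoRequest(data);
}
```

Either way a thread is blocked for the duration of the call; the difference is only that the consumer, not the library, picks which thread that is.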

Related

Is Task.Run or TaskFactory.StartNew always inappropriate to use in async methods?

I've heard that the responsibility for threading should lie with the application, and that I shouldn't use Task.Run or TaskFactory.StartNew in async methods.
However, if I have a library whose methods do quite heavy computation, couldn't I make the method async and have it run a long-running task, in order to free the threads that are, for example, accepting ASP.NET Core HTTP requests? Or should this be a sync method, with the ASP.NET Core application responsible for starting the task?

First, let's consider why we need asynchrony.
Asynchrony is needed for either scalability or offloading.
For scalability, exposing an async version of the call does nothing: you typically consume the same amount of resources as you would by invoking it synchronously, or even slightly more. Scalability is achieved by reducing the resources you use, and Task.Run() does not reduce them.
For offloading, you can expose async wrappers of your sync methods. This can be very useful for responsiveness, because it lets you move long-running operations to a different thread, so the async wrapper does provide some benefit.
Result:
Wrapping a synchronous method with a simple asynchronous façade does not yield any scalability benefits, but yields offloading benefits. But in such cases, by exposing only the synchronous method, you get some nice benefits. For example:
Surface area of your library is reduced.
Your users will know whether there are actually scalability benefits to using the exposed asynchronous APIs.
If both the synchronous method and an asynchronous wrapper around it are exposed, the developer is then faced with thinking they should invoke the asynchronous version for scalability(?) reasons, but in reality will actually be hurting their throughput by paying for the additional offloading overhead without the scalability benefits.
The source is "Should I expose asynchronous wrappers for synchronous methods?" by Stephen Toub, and I strongly recommend reading it.
Update:
Question in the comment:
Scalability is well explained in that article, with one example. Consider Thread.Sleep. There are two possible ways to implement an async version of that call:
public Task SleepAsync(int millisecondsTimeout)
{
    return Task.Run(() => Sleep(millisecondsTimeout));
}
And another implementation:
public Task SleepAsync(int millisecondsTimeout)
{
    TaskCompletionSource<bool> tcs = null;
    var t = new Timer(delegate { tcs.TrySetResult(true); }, null, -1, -1);
    tcs = new TaskCompletionSource<bool>(t);
    t.Change(millisecondsTimeout, -1);
    return tcs.Task;
}
Both of these implementations provide the same basic behavior, both completing the returned task after the timeout has expired. However, from a scalability perspective, the latter is much more scalable. The former implementation consumes a thread from the thread pool for the duration of the wait time, whereas the latter simply relies on an efficient timer to signal the Task when the duration has expired.
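To see the difference for yourself, here is a rough sketch that starts many concurrent sleeps with each approach and times them. The names SleepComparison, SleepViaThreadAsync, SleepViaTimerAsync and MeasureAsync are made up for illustration, and Task.Delay is used as the built-in timer-based counterpart of the second implementation:

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class SleepComparison
{
    // Blocks one thread pool thread per call for the whole duration.
    public static Task SleepViaThreadAsync(int ms)
        => Task.Run(() => Thread.Sleep(ms));

    // Timer-based: no thread is held while waiting.
    public static Task SleepViaTimerAsync(int ms)
        => Task.Delay(ms);

    // Starts `count` concurrent sleeps and returns the elapsed wall-clock time.
    public static async Task<TimeSpan> MeasureAsync(Func<int, Task> sleep, int count, int ms)
    {
        var stopwatch = Stopwatch.StartNew();
        await Task.WhenAll(Enumerable.Range(0, count).Select(_ => sleep(ms)));
        return stopwatch.Elapsed;
    }
}
```

Calling MeasureAsync(SleepComparison.SleepViaThreadAsync, 200, 1000) and then the same with SleepViaTimerAsync should show the thread-based version falling behind once the number of sleeps exceeds the threads the pool is willing to provide; the exact numbers depend on the runtime and the machine.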
So, in your case, wrapping the call with Task.Run provides offloading, not scalability, but the user of the library has no way of knowing that.
The user of your library can wrap the call with Task.Run himself, and I really think that is where the decision belongs.

Not exactly answering the question (I think the other answer is good enough for that), but to add some additional advice: be careful with using Task.Run in a library that other people can use. It can cause unexpected thread pool starvation for the library's users. For example, a developer uses a lot of third-party libraries, and all of them use Task.Run() internally. Now the developer tries to use Task.Run in his own app too, but it slows the app down, because the thread pool is already used up by the third-party libraries.
Parallelizing work with Parallel.ForEach is a different issue.

HttpClient provide not truly async operations?

I'm confused about async IO operations. In this article Stephen Cleary explains that we should not use Task.Run(() => SomeIoMethod()) because truly async operations should use
standard P/Invoke asynchronous I/O system in .NET
http://blog.stephencleary.com/2013/11/there-is-no-thread.html
However, avoid “fake asynchrony” in libraries. Fake asynchrony is when
a component has an async-ready API, but it’s implemented by just
wrapping the synchronous API within a thread pool thread. That is
counterproductive to scalability on ASP.NET. One prominent example of
fake asynchrony is Newtonsoft JSON.NET, an otherwise excellent
library. It’s best to not call the (fake) asynchronous versions for
serializing JSON; just call the synchronous versions instead. A
trickier example of fake asynchrony is the BCL file streams. When a
file stream is opened, it must be explicitly opened for asynchronous
access; otherwise, it will use fake asynchrony, synchronously blocking
a thread pool thread on the file reads and writes.
And he advises using HttpClient, but internally it uses Task.Factory.StartNew().
Does this mean that HttpClient provides not truly async operations?

Does this mean that HttpClient provides not truly async operations?
Sort of. HttpClient is in an unusual position, since its primary implementation uses HttpWebRequest, which is only partially asynchronous.
In particular, the DNS lookup is synchronous, and I think maybe the proxy resolution, too. After that, it's all asynchronous. So, for most scenarios, the DNS is fast (usually cached) and there isn't a proxy, so it acts asynchronously. Unfortunately, there are enough scenarios (particularly from within corporate networks) where the synchronous operations can cause significant lag.
So, when the team was writing HttpClient, they had three options:
(1) Fix HttpWebRequest (and friends) allowing for fully-asynchronous operations. Unfortunately, this would have broken a fair amount of code. Due to the way inheritance is used as extension points in these objects, adding asynchronous methods would be backwards-incompatible.
(2) Write their own HttpWebRequest equivalent. Unfortunately, this would take a lot of work and they'd lose all the interoperability with existing WebRequest-related code.
(3) Queue requests to the thread pool to avoid the worst-case scenario (blocking synchronous code on the UI thread). Unfortunately, this has the side effects of degrading scalability on ASP.NET, being dependent on a free thread pool thread, and incurring the worst-case scenario cost even for best-case scenarios.
In an ideal world (i.e., when we have infinite developer and tester time), I would prefer (2), but I understand why they chose (3).
On a side note, the code you posted shows a dangerous use of StartNew, which has actually caused problems due to its use of TaskScheduler.Current. This has been fixed in .NET Core - not sure when the fix will roll back into .NET Framework proper.
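The TaskScheduler.Current issue is a general property of StartNew rather than something specific to the code in the question. As a sketch only (the helper name SafeStartNew.Run is made up here), this is what a StartNew call looks like when the scheduler is stated explicitly, so the work always goes to the thread pool no matter which scheduler the caller happens to be running on:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SafeStartNew
{
    public static Task<T> Run<T>(Func<T> work)
    {
        // Without the last argument, StartNew uses TaskScheduler.Current,
        // which may be a UI or custom scheduler if the caller is itself
        // running inside a task on that scheduler.
        return Task.Factory.StartNew(
            work,
            CancellationToken.None,
            TaskCreationOptions.DenyChildAttach,
            TaskScheduler.Default);
    }
}
```

For a plain synchronous delegate like this, Task.Run(work) behaves the same way and is the shorter spelling.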

No, your assumptions are wrong.
StartNew isn't equal to the Run method.
This code is from HttpClientHandler, not HttpClient, and you didn't examine the this.startRequest code in that class. The code you're inspecting is a preparation method, which starts a task on the thread pool and, inside it, calls the actual code that starts the HTTP request.
The HTTP connection is not created at the .NET level of abstraction, and I'm sure that inside startRequest you'll find some P/Invoke method, which does the actual work for:
DNS lookup
Socket connection
Sending the request
Waiting for the answer
etc.
As you can see, all of the above is logic that really should be called asynchronously, because it runs outside the .NET Framework and some of these operations can be very time-consuming. While waiting for them, the .NET thread is released back to the thread pool to process other tasks.

Using Task.Run for synchronous method in service

I've read a lot of articles about asynchronous programming, but I'm unsure about one thing. I have a third-party WinRT library, written in C++, and I want to wrap it. So now I have:
public Task LoginAsync()
{
    return Task.Run(() => winrtLibrary.Login());
}
According to Stephen Cleary's and Stephen Toub's blogs, this is not a good solution. But when I call the method synchronously, my UI is blocked and becomes unresponsive.
Is it better to expose service method synchronously and in UI use Task.Run?

What Stephen Toub means by
do not use Task.Run in the implementation of the method; instead, use Task.Run to call the method
is that you shouldn't use Task.Run to hide CPU-bound work behind async (Task-returning) methods. If you need to wrap external code, hide it behind an interface that reflects what this code does. Any asynchronous I/O can (and should) be exposed as Task-returning methods, and CPU-bound work must be exposed with the proper API. Let the consumers of your code decide for themselves how to use it. When you happen to be the consumer too, use Task.Run to run your synchronous code (now wrapped and exposed via an interface) where it is very clear that you are offloading CPU-bound work. In UI apps, for example, you should call Task.Run in your UI layer (and not deep down in your BL or even DA layers), where it is very clear that the UI offloads some CPU-bound work.
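Roughly, that layering could look like the sketch below. All of the names (ILoginService, FakeLoginService, LoginViewModel) are invented for illustration, the bool return value is an assumption, and Thread.Sleep stands in for the blocking winrtLibrary.Login() call:

```csharp
using System.Threading;
using System.Threading.Tasks;

// Hypothetical service abstraction over the third-party library.
public interface ILoginService
{
    // Synchronous on purpose: the underlying call blocks.
    bool Login();
}

public sealed class FakeLoginService : ILoginService
{
    public bool Login()
    {
        Thread.Sleep(50); // stands in for the blocking library call
        return true;
    }
}

public sealed class LoginViewModel
{
    private readonly ILoginService _service;

    public LoginViewModel(ILoginService service) => _service = service;

    public bool IsLoggedIn { get; private set; }

    // The offload happens here, at the UI boundary, where it is explicit.
    public async Task LoginAsync()
    {
        IsLoggedIn = await Task.Run(() => _service.Login());
    }
}
```

The service signature stays honest about blocking, and anyone reading the view model can see exactly where a thread pool thread is being spent.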
Why do you think, that I shouldn't call Task.Run in BL? What if I have
ViewModel, which references BL and BL references service layer (in my
case is it wrapper).
I think that a method signature should reflect exactly what the method does.
The best I can do is to redirect you back to Cleary's article:
When a developer sees two methods in an API winrtLibrary.Login() and winrtLibrary.LoginAsync(), the convention is that they represent a naturally-asynchronous
operation. In other words, the developer expects that winrtLibrary.LoginAsync() is
the “natural” implementation and that winrtLibrary.Login() is essentially a
synchronous (blocking) equivalent of that operation. That API implies
that winrtLibrary.Login will at some point have the calling thread enter a wait
state as it blocks for the naturally-asynchronous operation to
complete.
You can still hide synchronous code behind async method and follow Cleary's rule of thumb, if you sign your method as public Task OffloadLoginToTheThreadPool(). But I think (and apparently Cleary, too) that the alternative of simply calling Task.Run from the UI (or Controller) is a much better approach, and it follows the principles of Clean Code.

Clarification on tasks in .net

I'm trying to understand tasks in .NET. From what I understand, they are better than threads because they represent work that needs to get done, and when there is an idle thread the work just gets picked up and processed, allowing the full CPU to be utilized.
I see Task<ActionResult> all over a new MVC 5 project and I would like to know why this is happening.
Does it make sense to always do this, or just when there can be blocking work in the function?
I'm guessing that, since this does act like a thread, sync objects may still be needed. Is this correct?

MVC 5 uses Task<ActionResult> to allow it to be fully asynchronous. By using Task<T>, the methods can be implemented using the new async and await language features, which allows you to compose asynchronous IO functions with MVC in a simple manner.
When working with MVC, in general, the Task<T> will hopefully not be using threads - they'll be composing asynchronous operations (typically IO bound work). Using threads on a server, in general, will reduce your overall scalability.

A Task does not represent a thread, even logically. It's not just an alternate implementation of threads. It's a higher level concept. A Task is the representation of an asynchronous operation that will complete at some point (usually in the future).
That task could represent code being run on another thread; it could represent some asynchronous IO operation that relies on OS interrupts to (indirectly, through a few other layers of indirection) cause the task to be marked completed; it could be the result of two other tasks being completed, or the continuation of some other task being completed; it could be an indication of when an event next fires, or some custom TaskCompletionSource that has who knows what as its implementation.
But you don't need to worry about all of those options. That's the point. In other models you need to treat all of those different types of asynchronous operations differently, complicating your asynchronous programs. The use of Task allows you to write code that can easily be composed with any and every type of asynchronous operation.
I'm guessing that, since this does act like a thread, sync objects may still be needed. Is this correct?
Technically, yes. There are times where you may need to use these, but largely, no. Ideally, if you're using idiomatic practices, you can avoid this, at least in most cases. Generally, when one task depends on code running in other tasks, it should be a continuation of those tasks, and information is passed between tasks through the tasks' Result property. The use of Result doesn't require any synchronization mechanisms, so usually you can avoid them entirely.
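As a small illustration (the class and method names are made up), here two tasks run independently and a continuation combines their results. No lock or other sync object is involved, because the values only travel through the completed tasks:

```csharp
using System.Threading.Tasks;

public static class ResultPassing
{
    public static async Task<int> SumAsync()
    {
        // Two independent pieces of work, possibly on different threads.
        Task<int> first = Task.Run(() => 20);
        Task<int> second = Task.Run(() => 22);

        // The code after the await runs as a continuation once both are done;
        // their results are handed over by the tasks themselves.
        int[] parts = await Task.WhenAll(first, second);
        return parts[0] + parts[1];
    }
}
```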
I see Task<ActionResult> all over a new MVC 5 project and I would like to know why this is happening.
When you're going to make something asynchronous it generally makes sense to make everything asynchronous (or nothing). Mixing and matching just...doesn't work. Asynchronous code relies on having every method take very little time to execute so that the message pump can get back to processing its queue of pending tasks/continuations. Mixing asynchronous code and synchronous code makes it very likely to deadlock your application, and also defeats most of the purposes of using asynchrony to begin with (which is to avoid blocking threads).

Non-Blocking Threading

I am re-factoring a C# project that is used by several full-sized applications. This class interacts with hardware and often takes hundreds of milliseconds or more to execute some commands. In many cases, I am replacing Thread.Wait() calls that the previous programmer wrote with ThreadPool calls to perform these actions.
Now, some of the functions this project provides to the several projects using it take hundreds of milliseconds or more to execute and return a value to the calling program that the program must use. My question is whether or not there is some mechanism that I may use within this project to make these calls execute and return on some thread other than the main thread? In other words, I want to make these methods non-blocking from the perspective of this project, rather than require other applications using these functions to place calls in a separate thread.
Thanks

In other words, I want to make these methods non-blocking from the perspective of this project, rather than require other applications using these functions to place calls in a separate thread.
In general, the best approach is often to return a Task<T> in this type of scenario. This allows the caller to block if necessary, or use the new await and async keywords to cleanly coordinate with your library, without blocking or forcing them to move to a separate thread.

If you are using .NET 4.5, you can use Task.Run to execute the slow operations on a separate thread, and then ConfigureAwait(false) so that the code after the await does not resume on the main thread once they return.
await Task.Run(() => <slow operation>).ConfigureAwait(false);

Not knowing what version of the framework you're using, have a look at the begin/end async pattern. You should look at changing the API for the project to implement it.
http://msdn.microsoft.com/en-us/library/ms228963.aspx
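If the API does end up exposing Begin/End pairs, callers on newer framework versions can still consume it with await by adapting the pair through Task.Factory.FromAsync. A sketch, using Stream.BeginRead/EndRead only because they are a readily available Begin/End pair (the BeginEndAdapter name is made up, and the hardware API's own methods would take their place):

```csharp
using System.IO;
using System.Threading.Tasks;

public static class BeginEndAdapter
{
    // Wraps a Begin/End pair in a Task<int> that completes when EndRead would return.
    public static Task<int> ReadAsync(Stream stream, byte[] buffer)
    {
        return Task<int>.Factory.FromAsync(
            stream.BeginRead, stream.EndRead,
            buffer, 0, buffer.Length,
            null); // no state object needed
    }
}
```

Whether this is truly non-blocking depends entirely on how the Begin/End methods themselves are implemented.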

I worked on similar stuff. I would suggest using 'select' instead of threading. Have a look at this and see if it helps:
http://www.kegel.com/c10k.html
