I have been researching how to convert my (synchronous) algorithms into asynchronous ones. (TAP)
First, just to be clear, this question is not about "what async and await do" (I have already read the excellent posts of Stephen Cleary, for example Async and Await; if anyone is interested, read the link, it is very informative).
I have also read already the chapter on concurrency of "C# in a nutshell".
This question is not about how async functions use await to call functions either. I already know that.
Unfortunately, in almost everything I read, await Task.Delay(10) is used to "make an asynchronous function". For example:
public async Task<int> GetResult()
{
    int result = await GiveMeTheInt();
    return result;
}
public async Task<int> GiveMeTheInt() // <-- is async needed here? (Oops! I just realized it is...)
{
    await Task.Delay(100);
    return 10;
}
In this example, for instance, I already understand the magic of async/await in the GetResult() function, but the implementation of GiveMeTheInt() is not very useful. (They just put a Delay in as a generic asynchronous function and leave it at that.)
So my question is about the "GiveMeTheInt"-type functions (not the ones that call them).
The question
If I have an algorithm written in a function that so far has been synchronous, how can I convert this to be used asynchronously?
This is not a duplicate question. The closest I have found is Turning a Syncronous method async, in which the poster is told to use an async version of his method that already exists. In my case, this does not exist.
My algorithms consist mainly of Image processing so in essence scanning a large array and changing the values of each pixel. Something like
void DoSomethingToImage(int[,] Image)
{
    for (int i = 0; i < width; i++)
        for (int j = 0; j < height; j++)
        {
            Image[i, j] = 255;
        }
}
(This is a fictional example, the operation is different of course)
The closest I have gotten to an answer is to put the function inside a Task.Run(), but I am not sure if this is the way to do it.
Any help will be greatly appreciated
So take a look at your method:
void DoSomethingToImage(int[,] image)
{
    for (int i = 0; i < width; i++)
    {
        for (int j = 0; j < height; j++)
        {
            image[i, j] = 255;
        }
    }
}
Is this asynchronous? Obviously not. This is just CPU-bound work that will keep the processor busy for a bit. As such, it is not a good candidate to make asynchronous on its own. It should be run synchronously.
What if you are consuming this from an asynchronous part of the application? You certainly don’t want the user interface to block because you are iterating through a lot of pixels. So the solution is to offload the work to another thread. You do that using Task.Run:
await Task.Run(() => DoSomethingToImage(image));
So you would write that whenever you call the DoSomethingToImage method from an asynchronous method. Now, if you only use that method inside asynchronous contexts, you could argue that it might make sense to move the Task.Run into the function:
Task DoSomethingToImageAsync(int[,] image)
{
    return Task.Run(() => { … });
}
Is this a good idea? In general no, because now you are making the method look asynchronous when it’s in fact not. All it does is queue the work to a thread-pool thread and return a task that completes when that work is done. So now you are hiding that part and also making a method that does highly synchronous work decide that another thread should be used. That’s rarely a good idea. There is nothing wrong with keeping the method as it is, to show that it’s synchronous, and making the calling code responsible for deciding how that code should be run.
If I have an algorithm written in a function that so far has been synchronous, how can I convert this to be used asynchronously?
So, coming back to your actual question, this is actually difficult to answer. The answer is probably just this: “It depends”.
If a method does CPU-bound work, you’re better off keeping it synchronous and let calling code decide how to run it. If you are doing mostly I/O work where you interact with other interfaces (network, file system, etc.), then that’s a good candidate for making it asynchronous, especially considering that many of those interfaces will already offer asynchronous ways to communicate with them.
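As a minimal sketch of that split, the CPU-bound method below stays synchronous and only the caller decides to offload it. The ImageWork class and ProcessOffThreadAsync are hypothetical names introduced for illustration, and the array dimensions are read from the array itself rather than from the width and height fields the question assumes.

```csharp
using System;
using System.Threading.Tasks;

public static class ImageWork
{
    // Synchronous, CPU-bound work: sets every pixel, as in the fictional example.
    public static void DoSomethingToImage(int[,] image)
    {
        int width = image.GetLength(0);
        int height = image.GetLength(1);
        for (int i = 0; i < width; i++)
        {
            for (int j = 0; j < height; j++)
            {
                image[i, j] = 255;
            }
        }
    }

    // Hypothetical caller (for example a UI event handler) that chooses to
    // run the synchronous method on the thread pool and await its completion.
    public static async Task<int[,]> ProcessOffThreadAsync(int width, int height)
    {
        var image = new int[width, height];
        await Task.Run(() => DoSomethingToImage(image));
        return image;
    }
}
```

A caller that is already on a background thread, or in a web request, could simply call ImageWork.DoSomethingToImage directly, which is the point of leaving the decision outside the method.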
One final note regarding your “is async needed here?” question in your code: You need the async keyword whenever you want to use await inside the method. The mere presence of the async keyword does not make a method asynchronous though (nor does the return type indicate that).
I work with Node.js, so I got very used to its 'programming style' and its way of dealing with asynchronous operations through higher-order functions and callbacks, where most I/O events are handled in an async way by design and if I want to make a sync operation, I need to use Promises or the await shortcut. In synchronous programming languages like Java, C#, and C++, apparently I'd have to do the opposite, by somehow telling the compiler that the task I want to achieve must be performed asynchronously. I tried reading through the Microsoft docs and couldn't really understand how to achieve it. I mean, I could use Threads, but for the simple task I want to process, exploring Threads is just not worth the trouble of guaranteeing thread-safety.
I came across the Task class. So, suppose that I want to run a Task method multiple times in an async way, where the functions are being called in parallel. How can I do this?
private Task<int> MyCustomTask(string whatever)
{
// I/O event that I want to be processed in async manner
}
So basically, I wanted to run this method in 'parallel' without threading.
foreach (var x in y)
{
    MyCustomTask("");
}
If you don't want to await, you can do something like this.
public class AsyncExamples
{
    public List<string> whatevers = new List<string> { "1", "2", "3" };

    private void MyCustomTask(string whatever)
    {
        // I/O event that I want to be processed in async manner
    }

    public void FireAndForgetAsync(string whatever)
    {
        Task.Run(
            () =>
            {
                MyCustomTask(whatever);
            }
        );
    }

    public void DoParallelAsyncStuff()
    {
        foreach (var whatever in whatevers)
        {
            FireAndForgetAsync(whatever);
        }
    }
}
most I/O events are handled in an async way by design and if I want to make a sync operation, I need to use Promises or the await shortcut
I believe the difference you're expressing is the difference between functional and imperative programming, not the difference between asynchronous and synchronous programming. So I think what you're saying is that asynchronous programming fits more naturally with a functional style, which I would agree with. JavaScript is mostly functional, though it also has imperative and OOP aspects. C# is more imperative and OOP than functional, although it grows more functional with each year.
However, both JavaScript and C# are synchronous by default, not asynchronous by default. A method must "opt in" to asynchrony using async/await. In that way, they are very similar.
I tried reading through the Microsoft docs and couldn't really understand how to achieve it.
Cheat sheet if you're familiar with asynchronous JavaScript:
Task<T> is Promise<T>
If you need to write a wrapper for another API (e.g., the Promise<T> constructor using resolve/reject), then the C# type you need is TaskCompletionSource<T>.
async and await work practically the same way.
Task.WhenAll is Promise.all, and Task.WhenAny is Promise.any. There isn't a built-in equivalent for Promise.race.
Task.FromResult is Promise.resolve, and Task.FromException is Promise.reject.
So, suppose that I want to run a Task method multiple times in an async way, where the functions are being called in parallel. How can I do this?
(minor pedantic note: this is asynchronous concurrency; not parallelism, which implies threads)
To do this in JS, you would take your iterable, map it over an async method (resulting in an iterable of promises), and then Promise.all those promises.
To do the same thing in C#, you would take your enumerable, Select it over an async method (resulting in an enumerable of tasks), and then Task.WhenAll those tasks.
var tasks = y.Select(x => MyCustomTask(x)).ToList();
await Task.WhenAll(tasks);
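A self-contained sketch of the same pattern follows. The ConcurrencyDemo class, the body of MyCustomTask, and RunAllAsync are assumptions made for illustration; Task.Delay only stands in for whatever I/O the real method performs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ConcurrencyDemo
{
    // Hypothetical stand-in for an I/O-bound operation.
    public static async Task<int> MyCustomTask(string whatever)
    {
        await Task.Delay(100); // simulates waiting on I/O without blocking a thread
        return whatever.Length;
    }

    public static async Task<int[]> RunAllAsync(IEnumerable<string> y)
    {
        // ToList() forces the Select to run now, so every operation is started
        // before any of them is awaited.
        List<Task<int>> tasks = y.Select(x => MyCustomTask(x)).ToList();

        // Completes when all tasks have completed; results keep the input order.
        return await Task.WhenAll(tasks);
    }
}
```

Because all the delays overlap, awaiting RunAllAsync over three items should take roughly as long as one call rather than three.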
What I have
I have a set of asynchronous processing methods, similar to:
public class AsyncProcessor<T>
{
    //...rest of members, etc.
    public Task Process(T input)
    {
        //Some special processing, most likely inside a Task, so
        //maybe spawn a new Task, etc.
        Task task = Task.Run(/* maybe private method that does the processing*/);
        return task;
    }
}
What I want
I would like to chain them all together, to execute in sequential order.
What I tried
I have tried to do the following:
public class CompositeAsyncProcessor<T>
{
    private readonly IEnumerable<AsyncProcessor<T>> m_processors;
    //Constructor receives the IEnumerable<AsyncProcessor<T>> and
    //stores it in the field above.
    public Task ProcessInput(T input)
    {
        Task chainedTask = Task.CompletedTask;
        foreach (AsyncProcessor<T> processor in m_processors)
        {
            chainedTask = chainedTask.ContinueWith(t => processor.Process(input));
        }
        return chainedTask;
    }
}
What went wrong
However, tasks do not run in order because, from what I have understood, inside the call to ContinueWith, the processor.Process(input) call is performed immediately and the method returns independently of the status of the returned task. Therefore, all processing Tasks still begin almost simultaneously.
My question
My question is whether there is something elegant that I can do to chain the tasks in order (i.e. without execution overlap). Could I achieve this using the following statement, (I am struggling a bit with the details), for example?
chainedTask = chainedTask.ContinueWith(async t => await processor.Process(input));
Also, how would I do this without using async/await, only ContinueWith?
Why would I want to do this?
Because my Processor objects have access to, and request things from "thread-unsafe" resources. Also, I cannot just await all the methods because I have no idea about how many they are, so I cannot just write down the necessary lines of code.
What do I mean by thread-unsafe? A specific problem
Because I may be using the term incorrectly, an illustration is a bit better to explain this bit. Among the "resources" used by my Processor objects, all of them have access to an object such as the following:
public interface IRepository
{
void Add(object obj);
bool Remove(object obj);
IEnumerable<object> Items { get; }
}
The implementation currently used is relatively naive. So some Processor objects add things, while others retrieve the Items for inspection. Naturally, one of the exceptions I get all too often is:
InvalidOperationException: Collection was modified, enumeration
operation may not execute.
I could spend some time locking access and pre-running the enumerations. However, this was the second option I would get down to, while my first thought was to just make the processes run sequentially.
Why must I use Tasks?
While I have full control in this case, I could say that for the purposes of the question, I might not be able to change the base implementation, so what would happen if I were stuck with Tasks? Furthermore, the operations actually do represent relatively time-consuming CPU-bound operations plus I am trying to achieve a responsive user interface so I needed to unload some burden to asynchronous operations. While being useful and, in most of my use-cases, not having the necessity to chain multiple of them, rather a single one each time (or a couple, but always specific and of a specific count, so I was able to hook them together without iterations and async/await), one of the use-cases finally necessitated chaining an unknown number of Tasks together.
How I deal with this currently
The way I am dealing with this currently is to append a call to Wait() inside the ContinueWith call, i.e.:
foreach (AsyncProcessor<T> processor in m_processors)
{
    chainedTask = chainedTask.ContinueWith(t => processor.Process(input).Wait());
}
I would appreciate any idea on how I should do this, or how I could do it more elegantly (or, "async-properly", so to speak). Also, I would like to know how I can do this without async/await.
Why my question is different from this question, which did not answer my question entirely.
Because the linked question has two tasks, so the solution is to simply write the two lines required, while I have an arbitrary (and unknown) number of tasks, so I need a suitable iteration. Also, my method is not async. I now understand (from the single briefly available answer, which was deleted) that I could do it fairly easily if I changed my method to async and awaited each processor's Task method, but I still wish to know how this could be achieved without async/await syntax.
Why my question is not a duplicate of the other linked questions
Because none of them explains how to chain correctly using ContinueWith and I am interested in a solution that utilizes ContinueWith and does not make use of the async/await pattern. I know this pattern may be the preferable solution, I want to understand how to (if possible) make arbitrary chaining using ContinueWith calls properly. I now know I don't need ContinueWith. The question is, how do I do it with ContinueWith?
foreach + await will run the processors sequentially.
public async Task ProcessInputAsync(T input)
{
    foreach (var processor in m_processors)
    {
        await processor.Process(input);
    }
}
By the way, Process should be called ProcessAsync.
The method Task.ContinueWith does not understand async delegates the way Task.Run does, so when you return a Task it treats it as a normal return value and wraps it in another Task. So you end up receiving a Task<Task> instead of what you expected to get. The problem would be obvious if AsyncProcessor.Process returned a generic Task<T>. In that case you would get a compile error because of the illegal conversion from Task<Task<T>> to Task<T>. In your case you convert from Task<Task> to Task, which is legal, since Task<TResult> derives from Task.
Solving the problem is easy. You just need to unwrap the Task<Task> to a simple Task, and there is a built-in method Unwrap that does exactly that.
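To make the difference concrete, here is a small sketch that awaits either the raw Task<Task> or its unwrapped form and reports whether the inner task had finished by then. UnwrapDemo and InnerFinishedWhenOuterDoesAsync are hypothetical names, and Task.Delay stands in for processor.Process(input).

```csharp
using System;
using System.Threading.Tasks;

public static class UnwrapDemo
{
    public static async Task<bool> InnerFinishedWhenOuterDoesAsync(bool unwrap)
    {
        Task inner = null;

        // The delegate returns a Task, so ContinueWith produces a Task<Task>.
        Task<Task> nested = Task.CompletedTask.ContinueWith(_ =>
        {
            inner = Task.Delay(200); // stand-in for processor.Process(input)
            return inner;
        }, TaskScheduler.Default);

        // Without Unwrap, the outer task completes as soon as the delegate returns.
        // With Unwrap, it completes only when the inner task has completed.
        Task outer = unwrap ? nested.Unwrap() : nested;
        await outer;

        return inner.IsCompleted;
    }
}
```

With unwrap set to true the method returns true. With unwrap set to false it would normally return false, because the 200 ms delay is still pending when the outer task completes, which is exactly the overlap seen in the question.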
There is another problem that you need to solve though. Currently your code suppresses all exceptions that may occur in each individual AsyncProcessor.Process, which I don't think was intended. So you must decide which strategy to follow in this case. Are you going to propagate the first exception immediately, or do you prefer to cache them all and propagate them at the end bundled in an AggregateException, like Task.WhenAll does? The example below implements the first strategy.
public class CompositeAsyncProcessor<T>
{
    //...
    public Task Process(T input)
    {
        Task current = Task.CompletedTask;
        foreach (AsyncProcessor<T> processor in m_processors)
        {
            current = current.ContinueWith(antecessor =>
            {
                if (antecessor.IsFaulted)
                    return Task.FromException<T>(antecessor.Exception.InnerException);
                return processor.Process(input);
            },
                CancellationToken.None,
                TaskContinuationOptions.ExecuteSynchronously,
                TaskScheduler.Default
            ).Unwrap();
        }
        return current;
    }
}
I have used an overload of ContinueWith that allows configuring all the options, because the defaults are not ideal. The default TaskContinuationOptions is None. By configuring it to ExecuteSynchronously you minimize the thread switches, since each continuation will run on the same thread that completed the previous one.
The default task scheduler is TaskScheduler.Current. By specifying TaskScheduler.Default you make it explicit that you want the continuations to run in thread-pool threads (for some exceptional cases that won't be able to run synchronously). The TaskScheduler.Current is context specific, and if it ever surprises you it won't be in a good way.
As you can see, there are a lot of gotchas with the old-school ContinueWith approach. Using the modern await in a loop is a lot easier to implement, and a lot more difficult to get wrong.
So here I have a function
static bool Login(SignupData sd)
{
    bool success = false;
    /*
    Perform login-related actions here
    */
    return success;
}
And there is another function
static Task<bool> LoginAsync(SignupData sd)
{
    return Task.Run<bool>(() => Login(sd));
}
Now, I've come across a rather different implementation of this pattern, where you would add the async keyword to a function which returns Task<TResult> (so that it ends up looking like: async Task<TResult> LoginAsync(SignupData sd)). In this case, even if you return TResult instead of a Task<TResult>, the program still compiles.
My question here is, which implementation should be preferred?
static Task<bool> LoginAsync(SignupData sd)
{
    return Task.Run<bool>(() => Login(sd));
}
OR this one?
async static Task<bool> LoginAsync(SignupData sd)
{
    bool success = Login(sd);
    return success;
}
You shouldn't be doing either. Asynchronous methods are useful if they can prevent threads from being blocked. In your case, your method doesn't avoid that, it always blocks a thread.
How to handle long blocking calls depends on the application. For UI applications, you want to use Task.Run to make sure you don't block the UI thread. For e.g. web applications, you don't want to use Task.Run, you want to just use the thread you've got already to prevent two threads from being used where one suffices.
Your asynchronous method cannot reliably know what works best for the caller, so shouldn't indicate through its API that it knows best. You should just have your synchronous method and let the caller decide.
That said, I would recommend looking for a way to create a LoginAsync implementation that's really asynchronous. If it loads data from a database, for instance, open the connection using OpenAsync, retrieve data using ExecuteReaderAsync. If it connects to a web service, connect using the asynchronous methods for whatever protocol you're using. If it logs in some other way, do whatever you need to make that asynchronous.
If you're taking that approach, the async and await keywords make perfect sense and can make such an implementation very easy to create.
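As a rough sketch of what "really asynchronous" can look like, the example below uses asynchronous file reads in place of the database or web-service calls mentioned above. Everything here is an assumption for illustration: the LoginSketch class, the idea of a text file with one user name per line, and the usersFilePath parameter are not part of the original question.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class LoginSketch
{
    // Hypothetical login check: looks for the user name in a text file,
    // one name per line. No thread is blocked while the reads are pending.
    public static async Task<bool> LoginAsync(string userName, string usersFilePath)
    {
        using (var stream = new FileStream(usersFilePath, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, 4096, useAsync: true))
        using (var reader = new StreamReader(stream))
        {
            string line;
            while ((line = await reader.ReadLineAsync()) != null)
            {
                if (line == userName)
                {
                    return true;
                }
            }
        }
        return false;
    }
}
```

Here the async keyword earns its place because the method actually awaits an I/O operation, rather than wrapping a blocking call.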
While HVD is correct, I will dive into async in an attempt to describe its intended use.
The async keyword and the accompanying await keyword are a shortcut for implementing non-blocking code patterns within your application. While they play along perfectly with the rest of the Task Parallel Library (TPL), they aren't usually used in quite the same way. Their beauty is in the elegance of how the compiler weaves in the asynchronicity, and allows it to be handled without explicitly spinning off separate threads, which may or may not be what you want.
For Example, let's look at some code:
async static Task<bool> DoStuffAsync()
{
    var otherAsyncResult = doOtherStuffAsync();
    return await otherAsyncResult;
}
See the await keyword? It says, return to the caller, continue on until we have the result you need. Don't block, don't use a new thread, but basically return with a promise of a result when ready (A Task). The calling code can then carry on and not worry about the result until later when we have it.
Usually this ends up requiring that your code becomes non-blocking the whole way down (async all the way as it were), and often this is a difficult transition to understand. However, if you can it is incredibly powerful.
The better way to handle your code would be to make the synchronous code call the async one, and wait on it. That way you would be async as much as possible. It is always best to force that level as high as possible in your application, all the way to the UI if possible.
Hope that made sense. The TPL is a huge topic, and Async/Await really adds some interesting ways of structuring your code.
https://msdn.microsoft.com/en-us/library/hh191443.aspx
So I was asking this question about async, and I thought that it's just syntactic sugar for:
Task<..>...ContinueWith...
And finally inspect the Result property.
I even asked a question about it here and I was told:
But today I was corrected by Jon Skeet:
"It's a very long way from that".
So what are the core differences between those 2 approaches?
It is adding a continuation - but manually constructing that continuation can be very painful, due to the need to carry around all the information about where we'd got to and what the local state is.
As a very simple example, I suggest you try to come up with the equivalent of this async method:
public static async Task<int> SumTwoOperationsAsync()
{
    var firstTask = GetOperationOneAsync();
    var secondTask = GetOperationTwoAsync();
    return await firstTask + await secondTask;
}
// These are just examples - you don't need to translate them.
// They are static so that the static method above can call them.
private static async Task<int> GetOperationOneAsync()
{
    await Task.Delay(500); // Just to simulate an operation taking time
    return 10;
}
private static async Task<int> GetOperationTwoAsync()
{
    await Task.Delay(100); // Just to simulate an operation taking time
    return 5;
}
Really try to come up with the equivalent of the first method. I think you'll find it takes quite a lot of code - especially if you actually want to get back to an appropriate thread each time. (Imagine code in that async method also modified a WPF UI, for example.) Oh, and make sure that if either of the tasks fails, your returned task fails too. (The async method will actually "miss" the failure of the second task if the first task also fails, but that's a relatively minor problem IMO.)
Next, work out how you'd need to change your code if you needed the equivalent of try/finally in the async method. Again, that'll make the non-async method more complicated. It can all be done, but it's a pain in the neck.
So yes, it's "just" syntactic sugar. So is foreach. So is a for loop (or any other kind of loop). In the case of async/await, it's syntactic sugar which can do really rather a lot to transform your code.
There are lots of videos and blog posts around async, and I would expect that just watching/reading a few of them would give you enough insight to appreciate that this is far from a minor tweak: it radically changes how practical it is to write large amounts of asynchronous code correctly.
Additionally, being pattern-based, async/await doesn't only work on Task / Task<T>. You can await anything which adheres to the awaitable pattern. In practice very few developers will need to implement the pattern themselves, but it allows for methods like Task.Yield which returns a YieldAwaitable rather than a task.
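One small illustration of the pattern-based nature, offered as a sketch rather than a recommendation: an extension method named GetAwaiter is enough to make a type awaitable. The TimeSpanAwaitExtensions and AwaitablePatternDemo classes are hypothetical; the awaiter itself is just borrowed from Task.Delay.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public static class TimeSpanAwaitExtensions
{
    // Supplying GetAwaiter() is what makes "await someTimeSpan" compile.
    public static TaskAwaiter GetAwaiter(this TimeSpan delay)
    {
        return Task.Delay(delay).GetAwaiter();
    }
}

public static class AwaitablePatternDemo
{
    public static async Task<string> WaitThenReportAsync()
    {
        // TimeSpan is not a Task; the compiler resolves GetAwaiter through
        // the extension method above.
        await TimeSpan.FromMilliseconds(50);
        return "done";
    }
}
```

The compiler only requires that the awaited expression exposes a suitable GetAwaiter, which is why types other than Task can take part.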
I have a question about how customizable the new async/await keywords and the Task class in C# 5.0 (.NET 4.5) are.
First some background for understanding my problem: I am developing on a framework with the following design:
One thread has a list of "current things to do" (usually around 100 to 200 items), which are stored in their own data structure and held as a list. It has an Update() function that enumerates the list, looks at whether some "things" need to execute, and executes them. Basically it's like a big thread scheduler. To simplify things, let's assume the "things to do" are functions that return the boolean true when they are "finished" (and should not be called on the next Update) and false when the scheduler should call them again on the next update.
All the "things" must not run concurrently and also must run in this one thread (because of thread static variables)
There are other threads which do other stuff. They are structured in the same way: a big loop that iterates over a couple of hundred things to do in a big Update() function.
Threads can send each other messages, including "remote procedure calls". For these remote calls, the RPC system is returning some kind of future object to the result value. In the other thread, a new "thing to do" is inserted.
A common "thing" to do is just a sequence of RPCs chained together. At the moment, the syntax for this "chaining" is very verbose and complicated, since you manually have to check for the completion state of previous RPCs and invoke the next ones, etc.
An example:
Future f1, f2;
bool SomeThingToDo() // returns true when "finished"
{
    if (f1 == null)
        f1 = Remote1.CallF1();
    else if (f1.IsComplete && f2 == null)
        f2 = Remote2.CallF2();
    else if (f2 != null && f2.IsComplete)
        return true;
    return false;
}
Now this all sounds an awful lot like something async and await in C# 5.0 can help me with. I haven't 100% fully understood what it does under the hood (any good references?), but as I get it from a few talks I've watched, it does exactly what I want with this nicely simple code:
async Task SomeThingToDo() // returning task is completed when this is finished.
{
    await Remote1.CallF1();
    await Remote2.CallF2();
}
But I can't find a way to write my Update() function to make something like this happen. async and await seem to want to use the Task class, which in turn seems to need real threads?
My closest "solution" so far:
The first thread (which is running SomeThingToDo) calls their functions only once and stores the returned task and tests on every Update() whether the task is completed.
Remote1.CallF1 returns a new Task with an empty Action as constructor parameter and remembers the returned task. When F1 is actually finished, it calls RunSynchronously() on the task to mark it as completed.
That seems to me like a perversion of the task system. And besides, it creates shared memory (the Task's IsComplete boolean) between the two threads, which I would like to have replaced with our remote messaging system, if possible.
Finally, it does not solve my problem as it does not work with the await-like SomeThingToDo implementation above. It seems the auto-generated Task objects returned by an async function are completed immediately?
So finally my questions:
Can I hook into async/await to use my own implementations instead of Task<T>?
If that's not possible, can I use Task without anything that relates to "blocking" and "threads"?
Any good reference what exactly happens when I write async and await?
I haven't 100% fully understood what it does under the hood - any good references?
Back when we were designing the feature Mads, Stephen and I wrote some articles at a variety of different levels for MSDN magazine. The links are here:
http://blogs.msdn.com/b/ericlippert/archive/2011/10/03/async-articles.aspx
Start with my article, then Mads's, then Stephen's.
It seems the auto-generated Task objects returned by an async function are completed immediately?
No, they are completed when the code in the method body returns or throws, same as any other code.
Can I hook into async/await to use my own implementations instead of Task<T>?
A method which contains an await must return void, Task or Task<T>. However, the expression that is awaited can return any type so long as you can call GetAwaiter() on it. That need not be a Task.
If that's not possible, can I use Task without anything that relates to "blocking" and "threads"?
Absolutely. A Task just represents work that will complete in the future. Though that work is typically done on another thread, there is no requirement.
To answer your questions:
Can I hook into async/await to use my own implementations instead of Task?
Yes. You can await anything. However, I do not recommend this.
If that's not possible, can I use Task without anything that relates to "blocking" and "threads"?
The Task type represents a future. It does not necessarily "run" on a thread; it can represent the completion of a download, or a timer expiring, etc.
Any good reference what exactly happens when I write async and await?
If you mean as far as code transformations go, this blog post has a nice side-by-side. It's not 100% accurate in its details, but it's enough to write a simple custom awaiter.
If you really want to twist async to do your bidding, Jon Skeet's eduasync series is the best resource. However, I seriously do not recommend you do this in production.
You may find my async/await intro helpful as an introduction to the async concepts and recommended ways to use them. The official MSDN documentation is also unusually good.
I did write the AsyncContext and AsyncContextThread classes that may work for your situation; they define a single-threaded context for async/await methods. You can queue work (or send messages) to an AsyncContextThread by using its Factory property.
Can I hook into async/await to use my own implementations instead of Task?
Yes.
If that's not possible, can I use Task without anything that relates to "blocking" and "threads"?
Yes.
Any good reference what exactly happens when I write async and await?
Yes.
I would discourage you from asking yes/no questions. You probably don't just want yes/no answers.
async and await seem to want to use the Task - class which in turn seems to need real threads?
Nope, that's not true. A Task represents something that can be completed at some point in the future, possibly with a result. It's sometimes the result of some computation in another thread, but it doesn't need to be. It can be anything that is happening at some point in the future. For example, it could be the result of an IO operation.
Remote1.CallF1 returns a new Task with an empty Action as constructor parameter and remembers the returned task. When F1 is actually finished, it calls RunSynchronously() on the task to mark it as completed.
So what you're missing here is the TaskCompletionSource class. With that missing puzzle piece a lot should fit into place. You can create the TCS object, pass the Task from its Task property around to...whomever, and then use the SetResult method to signal its completion. Doing this doesn't result in the creation of any additional threads, or use the thread pool.
Note that if you don't have a result and just want a Task instead of a Task<T> then just use a TaskCompletionSource<bool> or something along those lines and then SetResult(false) or whatever is appropriate. By casting the Task<bool> to a Task you can hide that implementation from the public API.
That should also provide the "How" variations of the first two questions that you asked instead of the "can I" versions you asked. You can use a TaskCompletionSource to generate a task that is completed whenever you say it is, using whatever asynchronous construct you want, which may or may not involve the use of additional threads.
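A minimal sketch of that idea, under stated assumptions: RemoteCall is a hypothetical stand-in for the question's future object, Complete is whatever the messaging layer would call when a reply arrives, and the int result type is chosen only for the example.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the RPC system's future object.
public sealed class RemoteCall
{
    private readonly TaskCompletionSource<int> _tcs = new TaskCompletionSource<int>();

    // The task handed to callers; it completes only when Complete is called.
    public Task<int> Completion
    {
        get { return _tcs.Task; }
    }

    // Called by the messaging layer when the reply arrives. No thread is created.
    public void Complete(int result)
    {
        _tcs.SetResult(result);
    }
}

public static class TcsDemo
{
    // The "thing to do" written with await instead of manual state checks.
    public static async Task<int> SomeThingToDo(RemoteCall f1, RemoteCall f2)
    {
        int a = await f1.Completion;
        int b = await f2.Completion;
        return a + b;
    }
}
```

The task returned by SomeThingToDo stays incomplete until both calls have been completed, so an Update() loop could simply check its IsCompleted property. Which thread runs the code after each await depends on the current SynchronizationContext, so keeping everything on the one scheduler thread would still need something like the single-threaded context mentioned in the other answer.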