Hook when async method is started

Hook when async method is started - c#

Is there an integration point into the async/await mechanisms to hook in and know when an async method is started?
Can a custom TaskScheduler provide this hook? A custom SynchronizationContext?
I want to be able to track certain method calls and know which Task (or code) they're associated with.
Thanks!

According to the Task-based Asynchronous Pattern in section "Task Status":
All tasks returned from TAP methods must be “hot” [...] meaning that the asynchronous operations they represent have already been initiated and their TaskStatus is an enumeration value other than Created.
That means that you will always start the Task as soon as you create it (at least if the factory method implements the TAP, which is true for all methods provided by the .Net framework).
If you create and return a "cold" (non-running) Task using new Task(), there is no clean way to find out if the Task.Start() method was called. You can only check the TaskStatus property periodically.

Related

Observable.FromAsync vs Task.ToObservable

Does anyone have a steer on when to use one of these methods over the other. They seem to do the same thing in that they convert from TPL Task to an Observable.
Observable.FromAsync appear to support cancellation tokens which might be the subtle difference that allows the method generating the task to participate in cooperative cancellation if the observable is disposed.
Just wondering if I'm missing something obvious as to why you'd use one over the other.
Thanks

Observable.FromAsync accepts a TaskFactory in the form of Func<Task> or Func<Task<TResult>>,
in this case, the task is only created and executed, when the observable is subscribed to.
Where as .ToObservable() requires an already created (and thus started) Task.

#Sickboy answer is correct.
Observable.FromAsync() will start the task at the moment of subscription.
Task.ToObservable() needs an already running task.
One use for Observable.FromAsync is to control reentrancy for multiple calls to an async method.
This is an example where these two methods are not equivalent:
//ob is some IObservable<T>
//ExecuteQueryAsync is some async method
//Here, ExecuteQueryAsync will run **serially**, the second call will start
//only when the first one is already finished. This is an important property
//if ExecuteQueryAsync doesn't support reentrancy
ob
.Select(x => Observable.FromAsync(() => ExecuteQueryAsync(x))
.Concat()
.ObserveOnDispatcher()
.Subscribe(action)
vs
//ob is some IObservable<T>
//ExecuteQueryAsync is some async method
//Even when the `Subscribe` action order will be the same as the first
//example because of the `Concat`, ExecuteQueryAsync calls could be
//parallel, the second call to the method could start before the end of the
//first call.
.Select(x => ExecuteQueryAsync(x).ToObservable())
.Concat()
.Subscribe(action)
Note that on the first example one may need the ObserveOn() or ObserveOnDispatcher() method to ensure that the action is executed on the original dispatcher, since the Observable.FromAsync doesn't await the task, thus the continuation is executed on any available dispatcher

Looking at the code, it appears that (at least in some flows) that Observable.FromAsync calls into .ToObservable()*. I am sure the intent that they should be semantically equivalent (assuming you pass the same parameters e.g. Scheduler, CancellationToken etc.).
One is better suited to chaining/fluent syntax, one may read better in isolation. Whichever you coding style favors.
*https://github.com/Reactive-Extensions/Rx.NET/blob/859e6159cb07be67fd36b18c2ae2b9a62979cb6d/Rx.NET/Source/System.Reactive.Linq/Reactive/Linq/QueryLanguage.Async.cs#L727

Aside from being able to use a CancellationToken, FromAsync wraps in a defer so this allows changing the task logic based upon conditions at the time of subscription. Note that the Task will not be started, internally task.ToObservable will be called. The Func does allow you to start the task though when you create it.

async and await without "threads"? Can I customize what happens under-the-hood?

I have a question about how customizable the new async/await keywords and the Task class in C# 4.5 are.
First some background for understanding my problem: I am developing on a framework with the following design:
One thread has a list of "current things to do" (usually around 100 to 200 items) which are stored as an own data structure and hold as a list. It has an Update() function that enumerates the list and look whether some "things" need to execute and does so. Basically its like a big thread sheduler. To simplify things, lets assume the "things to do" are functions that return the boolean true when they are "finished" (and should not be called next Update) and false when the sheduler should call them again next update.
All the "things" must not run concurrently and also must run in this one thread (because of thread static variables)
There are other threads which do other stuff. They are structured in the same way: Big loop that iterates a couple of hundret things to do in a big Update() - function.
Threads can send each other messages, including "remote procedure calls". For these remote calls, the RPC system is returning some kind of future object to the result value. In the other thread, a new "thing to do" is inserted.
A common "thing" to do are just sequences of RPCs chained together. At the moment, the syntax for this "chaining" is very verbose and complicated, since you manually have to check for the completion state of previous RPCs and invoke the next ones etc..
An example:
Future f1, f2;
bool SomeThingToDo() // returns true when "finished"
{
if (f1 == null)
f1 = Remote1.CallF1();
else if (f1.IsComplete && f2 == null)
f2 = Remote2.CallF2();
else if (f2 != null && f2.IsComplete)
return true;
return false;
}
Now this all sound awefull like async and await of C# 5.0 can help me here. I haven't 100% fully understand what it does under the hood (any good references?), but as I get it from some few talks I've watched, it exactly does what I want with this nicely simple code:
async Task SomeThingToDo() // returning task is completed when this is finished.
{
await Remote1.CallF1();
await Remote2.CallF2();
}
But I can't find a way how write my Update() function to make something like this happen. async and await seem to want to use the Task - class which in turn seems to need real threads?
My closest "solution" so far:
The first thread (which is running SomeThingToDo) calls their functions only once and stores the returned task and tests on every Update() whether the task is completed.
Remote1.CallF1 returns a new Task with an empty Action as constructor parameter and remembers the returned task. When F1 is actually finished, it calls RunSynchronously() on the task to mark it as completed.
That seems to me like a pervertion of the task system. And beside, it creates shared memory (the Task's IsComplete boolean) between the two threads which I would like to have replaced with our remote messanging system, if possible.
Finally, it does not solve my problem as it does not work with the await-like SomeThingToDo implementation above. It seems the auto-generated Task objects returned by an async function are completed immediately?
So finally my questions:
Can I hook into async/await to use my own implementations instead of Task<T>?
If that's not possible, can I use Task without anything that relates to "blocking" and "threads"?
Any good reference what exactly happens when I write async and await?

I haven't 100% fully understand what it does under the hood - any good references?
Back when we were designing the feature Mads, Stephen and I wrote some articles at a variety of different levels for MSDN magazine. The links are here:
http://blogs.msdn.com/b/ericlippert/archive/2011/10/03/async-articles.aspx
Start with my article, then Mads's, then Stephen's.
It seems the auto-generated Task objects returned by an async function are completed immediately?
No, they are completed when the code in the method body returns or throws, same as any other code.
Can I hook into async/await to use my own implementations instead of Task<T>?
A method which contains an await must return void, Task or Task<T>. However, the expression that is awaited can return any type so long as you can call GetAwaiter() on it. That need not be a Task.
If that's not possible, can I use Task without anything that relates to "blocking" and "threads"?
Absolutely. A Task just represents work that will complete in the future. Though that work is typically done on another thread, there is no requirement.

To answer your questions:
Can I hook into async/await to use my own implementations instead of Task?
Yes. You can await anything. However, I do not recommend this.
If that's not possible, can I use Task without anything that relates to "blocking" and "threads"?
The Task type represents a future. It does not necessarily "run" on a thread; it can represent the completion of a download, or a timer expiring, etc.
Any good reference what exactly happens when I write async and await?
If you mean as far as code transformations go, this blog post has a nice side-by-side. It's not 100% accurate in its details, but it's enough to write a simple custom awaiter.
If you really want to twist async to do your bidding, Jon Skeet's eduasync series is the best resource. However, I seriously do not recommend you do this in production.
You may find my async/await intro helpful as an introduction to the async concepts and recommended ways to use them. The official MSDN documentation is also unusually good.
I did write the AsyncContext and AsyncContextThread classes that may work for your situation; they define a single-threaded context for async/await methods. You can queue work (or send messages) to an AsyncContextThread by using its Factory property.

Can I hook into async/await to use my own implementations instead of Task?
Yes.
If that's not possible, can I use Task without anything that relates to "blocking" and "threads"?
Yes.
Any good reference what exactly happens when I write async and await?
Yes.
I would discourage you from asking yes/no questions. You probably don't just want yes/no answers.
async and await seem to want to use the Task - class which in turn seems to need real threads?
Nope, that's not true. A Task represents something that can be completed at some point in the future, possibly with a result. It's sometimes the result of some computation in another thread, but it doesn't need to be. It can be anything that is happening at some point in the future. For example, it could be the result of an IO operation.
Remote1.CallF1 returns a new Task with an empty Action as constructor parameter and remembers the returned task. When F1 is actually finished, it calls RunSynchronously() on the task to mark it as completed.
So what you're missing here is the TaskCompletionSource class. With that missing puzzle piece a lot should fit into place. You can create the TCS object, pass the Task from it's Task property around to...whomever, and then use the SetResult property to signal it's completion. Doing this doesn't result in the creation of any additional threads, or use the thread pool.
Note that if you don't have a result and just want a Task instead of a Task<T> then just use a TaskCompletionSource<bool> or something along those lines and then SetResult(false) or whatever is appropriate. By casting the Task<bool> to a Task you can hide that implementation from the public API.
That should also provide the "How" variations of the first two questions that you asked instead of the "can I" versions you asked. You can use a TaskCompletionSource to generate a task that is completed whenever you say it is, using whatever asynchronous construct you want, which may or may not involve the use of additional threads.

Is there a way to signal a System.Threading.Tasks.Task complete?

I have a library that is a complicated arbiter of many network connections. Each method of it's primary object takes a delegate, which is called when the network responds to a given request.
I want to translate my library to use the new .NET 4.5 "async/await" pattern; this would require me to return a "Task" object, which would signal to the user that the asynchronous part of the call is complete. Creating this object requires a function for the task to represent - As far as my understanding, it is essentially a lightweight thread.
This doesn't really fit the design of my library - I would like the task to behave more like an event, and directly signal to the user that their request has completed, rather then representing a function. Is this possible? Should i avoid abusing the "async/await" pattern in this way?
I don't know if I'm wording this very well, I hope you understand my meaning. Thank you for any help.

As far as my understanding, it is essentially a lightweight thread.
No, that's not really true. I can be true, under certain circumstances, but that's just one usage of Task. You can start a thread by passing it a delegate, and having it execute it (usually asynchronously, possibly synchronously, and by default using the thread pool).
Another way of using threads is through the use of a TaskCompletionSource. When you do that the task is (potentially) not creating any threads, using the thread pool, or anything along those lines. One common usage of this model is converting an event-based API to a Task-based API:
Let's assume, just because it's a common example, that we want to have a Task that will be completed when a From that we have is closed. There is already a FormClosed event that fires when that event occurs:
public static Task WhenClosed(this Form form)
{
var tcs = new TaskCompletionSource<object>();
form.FormClosing += (_, args) =>
{
tcs.SetResult(null);
};
return tcs.Task;
}
We create a TaskCompletionSource, add a handler to the event in question, in that handler we signal the completion of the task, and the TaskCompletionSource provides us with a Task to return to the caller. This Task will not have resulted in any new threads being created, it won't use the thread pool, or anything like that.
You can have a Task/Event based model using this construct that appears quite asynchronous, but only using a single thread to do all work (the UI thread).
In general, anytime you want to have a Task that represents something other than the execution of a function you'll want to consider using a TaskCompletionSource. It's usually the appropriate conceptual way to approach the problem, other than possibly using one of the existing TPL methods, such as WhenAll, WhenAny, etc.
Should I avoid abusing the "async/await" pattern in this way?
No, because it's not abuse. It's a perfectly appropriate use of Task constructs, as well as async/await. Consider, for example, the code that you can write using the helper method that I have above:
private async void button1_Click(object sender, EventArgs e)
{
Form2 popup = new Form2();
this.Hide();
popup.Show();
await popup.WhenClosed();
this.Show();
}
This code now works just like it reads; create a new form, hide myself, show the popup, wait until the popup is closed, and then show myself again. However, since it's not a blocking wait the UI thread isn't blocked. We also don't need to bother with events; adding handlers, dealing with multiple contexts; moving our logic around, or any of it. (All of that happens, it's just hidden from us.)

I want to translate my library to use the new .NET 4.5 "async/await" pattern; this would require me to return a "Task" object, which would signal to the user that the asynchronous part of the call is complete.
Well, not really - you can return anything which implements the awaitable pattern - but a Task is the simplest way of doing this.
This doesn't really fit the design of my library - I would like the task to behave more like an event, and directly signal to the user that their request has completed, rather then representing a function.
You can call Task.ContinueWith to act as a "handler" to execute when the task completes. Indeed, that's what TaskAwaiter does under the hood.
Your question isn't terribly clear to be honest, but if you really want to create a Task which you can then force to completion whenever you like, I suspect you just want TaskCompletionSource<TResult> - you call the SetResult, SetCanceled or SetException methods to indicate the appropriate kind of completion. (Or you can call the TrySet... versions.) Use the Task property to return a task to whatever needs it.

Custom awaitables for dummies

In Async/Await FAQ, Stephen Toub says:
An awaitable is any type that exposes a GetAwaiter method which returns a valid awaiter.
...
An awaiter is any type returned from an awaitable’s GetAwaiter method and that conforms to a particular pattern.
So in order to be an awaiter, a type should:
Implement the INotifyCompletion interface.
Provide a boolean property called IsCompleted.
Provide a parameterless GetResult method that returns void or TResult.
(I'm ignoring ICriticalNotifyCompletion for now.)
I know the page I mentioned has a sample that shows how the compiler translates await operations but I'm stil having a hard time understanding.
When I await an awaitable,
When is IsCompleted checked? Where should I set it?
When is OnCompleted called?
Which thread calls OnCompleted?
I saw examples of both directly invoking the continuation parameter of OnCompleted and using Task.Run(continuation) in different examples, which should I go for and why?

Why would you want a custom awaiter?
You can see the compiler's interpretation of await here. Essentially:
var temp = e.GetAwaiter();
if (!temp.IsCompleted)
{
SAVE_STATE()
temp.OnCompleted(&cont);
return;
cont:
RESTORE_STATE()
}
var i = temp.GetResult();
Edit from comments: OnCompleted should schedule its argument as a continuation of the asynchronous operation.

In the vast majority of cases, you as a developer need not worry about this. Use the async and await keywords and the compiler and runtime handle all this for you.
To answer your questions:
When does the code checks IsCompleted? Where should I set it?
The task should set IsCompleted when the task has finished doing what it was doing. For example, if the task was loading data from a file, IsCompleted should return true when the data is loaded and the caller can access it.
When does it calls OnCompleted?
OnCompleted usually contains a delegate supplied by the caller to execute when the task has completed.
Does it call OnCompleted in parallel or should the code
inside OnCompleted be asynchronous?
The code in OnCompleted should be thread neutral (not care which thread it is called from). This may be problematic for updating COM objects in Single Threaded Apartments (like any UI classes in Metro/Windows8/Windows Store apps). It does not have to be asynchronous but may contain asynchronous code.
I saw examples of both directly invoking the continuation parameter of OnCompleted
and using Task.Run(continuation) in different examples, which should I go for and when?
Use async/await when you can. Otherwise, use Task.Run() or Task.Wait() because they follow the sequential programming model most people are used to. Using continuations may still be required, particularly in Metro apps where you have apartment issues.

C# Asynchronous Options for Processing a List

I am trying to better understand the Async and the Parallel options I have in C#. In the snippets below, I have included the 5 approaches I come across most. But I am not sure which to choose - or better yet, what criteria to consider when choosing:
Method 1: Task
(see http://msdn.microsoft.com/en-us/library/dd321439.aspx)
Calling StartNew is functionally equivalent to creating a Task using one of its constructors and then calling Start to schedule it for execution. However, unless creation and scheduling must be separated, StartNew is the recommended approach for both simplicity and performance.
TaskFactory's StartNew method should be the preferred mechanism for creating and scheduling computational tasks, but for scenarios where creation and scheduling must be separated, the constructors may be used, and the task's Start method may then be used to schedule the task for execution at a later time.
// using System.Threading.Tasks.Task.Factory
void Do_1()
{
var _List = GetList();
_List.ForEach(i => Task.Factory.StartNew(_ => { DoSomething(i); }));
}
Method 2: QueueUserWorkItem
(see http://msdn.microsoft.com/en-us/library/system.threading.threadpool.getmaxthreads.aspx)
You can queue as many thread pool requests as system memory allows. If there are more requests than thread pool threads, the additional requests remain queued until thread pool threads become available.
You can place data required by the queued method in the instance fields of the class in which the method is defined, or you can use the QueueUserWorkItem(WaitCallback, Object) overload that accepts an object containing the necessary data.
// using System.Threading.ThreadPool
void Do_2()
{
var _List = GetList();
var _Action = new WaitCallback((o) => { DoSomething(o); });
_List.ForEach(x => ThreadPool.QueueUserWorkItem(_Action));
}
Method 3: Parallel.Foreach
(see: http://msdn.microsoft.com/en-us/library/system.threading.tasks.parallel.foreach.aspx)
The Parallel class provides library-based data parallel replacements for common operations such as for loops, for each loops, and execution of a set of statements.
The body delegate is invoked once for each element in the source enumerable. It is provided with the current element as a parameter.
// using System.Threading.Tasks.Parallel
void Do_3()
{
var _List = GetList();
var _Action = new Action<object>((o) => { DoSomething(o); });
Parallel.ForEach(_List, _Action);
}
Method 4: IAsync.BeginInvoke
(see: http://msdn.microsoft.com/en-us/library/cc190824.aspx)
BeginInvoke is asynchronous; therefore, control returns immediately to the calling object after it is called.
// using IAsync.BeginInvoke()
void Do_4()
{
var _List = GetList();
var _Action = new Action<object>((o) => { DoSomething(o); });
_List.ForEach(x => _Action.BeginInvoke(x, null, null));
}
Method 5: BackgroundWorker
(see: http://msdn.microsoft.com/en-us/library/system.componentmodel.backgroundworker.aspx)
To set up for a background operation, add an event handler for the DoWork event. Call your time-consuming operation in this event handler. To start the operation, call RunWorkerAsync. To receive notifications of progress updates, handle the ProgressChanged event. To receive a notification when the operation is completed, handle the RunWorkerCompleted event.
// using System.ComponentModel.BackgroundWorker
void Do_5()
{
var _List = GetList();
using (BackgroundWorker _Worker = new BackgroundWorker())
{
_Worker.DoWork += (s, arg) =>
{
arg.Result = arg.Argument;
DoSomething(arg.Argument);
};
_Worker.RunWorkerCompleted += (s, arg) =>
{
_List.Remove(arg.Result);
if (_List.Any())
_Worker.RunWorkerAsync(_List[0]);
};
if (_List.Any())
_Worker.RunWorkerAsync(_List[0]);
}
}
I suppose the obvious critieria would be:
Is any better than the other for performance?
Is any better than the other for error handling?
Is any better than the other for monitoring/feedback?
But, how do you choose?
Thanks in advance for your insights.

Going to take these in an arbitrary order:
BackgroundWorker (#5)
I like to use BackgroundWorker when I'm doing things with a UI. The advantage that it has is having the progress and completion events fire on the UI thread which means you don't get nasty exceptions when you try to change UI elements. It also has a nice built-in way of reporting progress. One disadvantage that this mode has is that if you have blocking calls (like web requests) in your work, you'll have a thread sitting around doing nothing while the work is happening. This is probably not a problem if you only think you'll have a handful of them though.
IAsyncResult/Begin/End (APM, #4)
This is a widespread and powerful but difficult model to use. Error handling is troublesome since you need to re-catch exceptions on the End call, and uncaught exceptions won't necessarily make it back to any relevant pieces of code that can handle it. This has the danger of permanently hanging requests in ASP.NET or just having errors mysteriously disappear in other applications. You also have to be vigilant about the CompletedSynchronously property. If you don't track and report this properly, the program can hang and leak resources. The flip side of this is that if you're running inside the context of another APM, you have to make sure that any async methods you call also report this value. That means doing another APM call or using a Task and casting it to an IAsyncResult to get at its CompletedSynchronously property.
There's also a lot of overhead in the signatures: You have to support an arbitrary object to pass through, make your own IAsyncResult implementation if you're writing an async method that supports polling and wait handles (even if you're only using the callback). By the way, you should only be using callback here. When you use the wait handle or poll IsCompleted, you're wasting a thread while the operation is pending.
Event-based Asynchronous Pattern (EAP)
One that was not on your list but I'll mention for the sake of completeness. It's a little bit friendlier than the APM. There are events instead of callbacks and there's less junk hanging onto the method signatures. Error handling is a little easier since it's saved and available in the callback rather than re-thrown. CompletedSynchronously is also not part of the API.
Tasks (#1)
Tasks are another friendly async API. Error handling is straightforward: the exception is always there for inspection on the callback and nobody cares about CompletedSynchronously. You can do dependencies and it's a great way to handle execution of multiple async tasks. You can even wrap APM or EAP (one type you missed) async methods in them. Another good thing about using tasks is your code doesn't care how the operation is implemented. It may block on a thread or be totally asynchronous but the consuming code doesn't care about this. You can also mix APM and EAP operations easily with Tasks.
Parallel.For methods (#3)
These are additional helpers on top of Tasks. They can do some of the work to create tasks for you and make your code more readable, if your async tasks are suited to run in a loop.
ThreadPool.QueueUserWorkItem (#2)
This is a low-level utility that's actually used by ASP.NET for all requests. It doesn't have any built-in error handling like tasks so you have to catch everything and pipe it back up to your app if you want to know about it. It's suitable for CPU-intensive work but you don't want to put any blocking calls on it, such as a synchronous web request. That's because as long as it runs, it's using up a thread.
async / await Keywords
New in .NET 4.5, these keywords let you write async code without explicit callbacks. You can await on a Task and any code below it will wait for that async operation to complete, without consuming a thread.

Your first, third and forth examples use the ThreadPool implicitly because by default Tasks are scheduled on the ThreadPool and the TPL extensions use the ThreadPool as well, the API simply hides some of the complexity see here and here. BackgroundWorkers are part of the ComponentModel namespace because they are meant for use in UI scenarios.

Reactive extensions is another upcoming library for handling asynchronous programming, especially when it comes to composition of asynchronous events and methods.
It's not native, however it's developed by Ms labs. It's available both for .NET 3.5 and .NET 4.0 and is essentially a collection of extension methods on the .NET 4.0 introduced IObservable<T> interface.
There are a lot of examples and tutorials on their main site, and I strongly recommend checking some of them out. The pattern might seem a bit odd at first (at least for .NET programmers), but well worth it, even if it's just grasping the new concept.
The real strength of reactive extensions (Rx.NET) is when you need to compose multiple asynchronous sources and events. All operators are designed with this in mind and handles the ugly parts of asynchrony for you.
Main site: http://msdn.microsoft.com/en-us/data/gg577609
Beginner's guide: http://msdn.microsoft.com/en-us/data/gg577611
Examples: http://rxwiki.wikidot.com/101samples
That said, the best async pattern probably depends on what situation you're in. Some are better (simpler) for simpler stuff and some are more extensible and easier to handle when it comes to more complex scenarios. I cannot speak for all the ones you're mentioning though.

The last one is the best for 2,3 at least. It has built-in methods/properties for this.
Other variants are almost the same, just different versions/convinient wrappers

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.