ConfigureAwait for IObservable<T> - c#

I'm experiencing a deadlock when I use blocking code with Task.Wait(), waiting an async method that inside awaits an Rx LINQ query.
This is an example:
public void BlockingCode()
{
this.ExecuteAsync().Wait();
}
public async Task ExecuteAsync()
{
await this.service.GetFooAsync().ConfigureAwait(false);
//This is the RX query that doesn't support ConfigureAwaitawait
await this.service.Receiver
.FirstOrDefaultAsync(x => x == "foo")
.Timeout(TimeSpan.FromSeconds(1));
}
So, my question is if there is any equivalent for ConfigureAwait on awaitable IObservable to ensure that the continuation is not resumed on the same SynchronizationContext.

You have to comprehend what "awaiting an Observable" means. Check out this. So, semantically, your code
await this.service.Receiver
.FirstOrDefaultAsync(x => x == "foo")
.Timeout(TimeSpan.FromSeconds(1));
is equivalent to
await this.service.Receiver
.FirstOrDefaultAsync(x => x == "foo")
.Timeout(TimeSpan.FromSeconds(1))
.LastAsync()
.ToTask();
(Note that there is some kind of redundance here, calling FirstOrDefaultAsyncand LastAsync but that's no problem).
So there you got your task (it can take an additional CancellationToken if available). You may now use ConfigureAwait.

ConfigureAwait is not directly related to awaiters themselves, rather it is a functionality of the TPL to configure how Task should complete. It's problematic because this TPL method doesn't return a new Task, so you can't compose it with a conversion to an observable.
Rx itself is basically free-threaded. You can control the threads used during subscription and events with far finer control than Tasks - see here for more on this: ObserveOn and SubscribeOn - where the work is being done
It's hard to fix your code because you don't provide a small but complete working example - however, the built in functions in Rx won't ever try to marshall to a particular thread unless you specifically tell them to with one of the above operators.
If you combine an operator like Observable.FromAsync to convert the Task to an observable, you can use Observable.SubscribeOn(Scheduler.Default) to start the Task off the current SynchronizationContext.
Here's a gist to play with (designed for LINQPad, runs with nuget package rx-main): https://gist.github.com/james-world/82c3cc39babab7870f6d

Related

When should Task.ContinueWith be called with TaskScheduler.Current as an argument?

We are using this code snippet from StackOverflow to produce a Task that completes as soon as the first of a collection of tasks completes successfully. Due to the non-linear nature of its execution, async/await is not really viable, and so this code uses ContinueWith() instead. It doesn't specify a TaskScheduler, though, which a number of sources have mentioned can be dangerous because it uses TaskScheduler.Current when most developers usually expect TaskScheduler.Default behavior from continuations.
The prevailing wisdom appears to be that you should always pass an explicit TaskScheduler into ContinueWith. However, I haven't seen a clear explanation of when different TaskSchedulers would be most appropriate.
What is a specific example of a case where it would be best to pass TaskScheduler.Current into ContinueWith(), as opposed to TaskScheduler.Default? Are there rules of thumb to follow when making this decision?
For context, here's the code snippet I'm referring to:
public static Task<T> FirstSuccessfulTask<T>(IEnumerable<Task<T>> tasks)
{
var taskList = tasks.ToList();
var tcs = new TaskCompletionSource<T>();
int remainingTasks = taskList.Count;
foreach(var task in taskList)
{
task.ContinueWith(t =>
if(task.Status == TaskStatus.RanToCompletion)
tcs.TrySetResult(t.Result));
else
if(Interlocked.Decrement(ref remainingTasks) == 0)
tcs.SetException(new AggregateException(
tasks.SelectMany(t => t.Exception.InnerExceptions));
}
return tcs.Task;
}
Probably you need to choose a task scheduler that is appropriate for actions that an executing delegate instance performs.
Consider following examples:
Task ContinueWithUnknownAction(Task task, Action<Task> actionOfTheUnknownNature)
{
// We know nothing about what the action do, so we decide to respect environment
// in which current function is called
return task.ContinueWith(actionOfTheUnknownNature, TaskScheduler.Current);
}
int count;
Task ContinueWithKnownAction(Task task)
{
// We fully control a continuation action and we know that it can be safely
// executed by thread pool thread.
return task.ContinueWith(t => Interlocked.Increment(ref count), TaskScheduler.Default);
}
Func<int> cpuHeavyCalculation = () => 0;
Action<Task> printCalculationResultToUI = task => { };
void OnUserAction()
{
// Assert that SynchronizationContext.Current is not null.
// We know that continuation will modify an UI, and it can be safely executed
// only on an UI thread.
Task.Run(cpuHeavyCalculation)
.ContinueWith(printCalculationResultToUI, TaskScheduler.FromCurrentSynchronizationContext());
}
Your FirstSuccessfulTask() probably is the example where you can use TaskScheduler.Default, because the continuation delegate instance can be safely executed on a thread pool.
You can also use custom task scheduler to implement custom scheduling logic in your library. For example see Scheduler page on Orleans framework website.
For more information check:
It's All About the SynchronizationContext article by Stephen Cleary
TaskScheduler, threads and deadlocks article by Cosmin Lazar
StartNew is Dangerous article by Stephen Cleary
I'll have to rant a bit, this is getting way too many programmers into trouble. Every programming aid that was designed to make threading look easy creates five new problems that programmers have no chance to debug.
BackgroundWorker was the first one, a modest and sensible attempt to hide the complications. But nobody realizes that the worker runs on the threadpool so should never occupy itself with I/O. Everybody gets that wrong, not many ever notice. And forgetting to check e.Error in the RunWorkerCompleted event, hiding exceptions in threaded code is a universal problem with the wrappers.
The async/await pattern is the latest, it makes it really look easy. But it composes extraordinarily poorly, async turtles all the way down until you get to Main(). They had to fix that eventually in C# version 7.2 because everybody got stuck on it. But not fixing the drastic ConfigureAwait() problem in a library. It is completely biased towards library authors knowing what they are doing, notable is that a lot of them work for Microsoft and tinker with WinRT.
The Task class bridged the gap between the two, its design goal was to make it very composable. Good plan, they could not predict how programmers were going to use it. But also a liability, inspiring programmers to ContinueWith() up a storm to glue tasks together. Even when it doesn't make sense to do so because those tasks merely run sequentially. Notable is that they even added an optimization to ensure that the continuation runs on the same thread to avoid the context switch overhead. Good plan, but creating the undebuggable problem that this web site is named for.
So yes, the advice you saw was a good one. Task is useful to deal with asynchronicity. A common problem that you have to deal with when services move into the "cloud" and latency gets to be a detail you can no longer ignore. If you ContinueWith() that kind code then you invariably care about the specific thread that executes the continuation. Provided by TaskScheduler, low odds that it isn't the one provided by FromCurrentSynchronizationContext(). Which is how async/await happened.
If current task is a child task, then using TaskScheduler.Current will mean the scheduler will be that which the task it is in, is scheduled to; and if not inside another task, TaskScheduler.Current will be TaskScheduler.Default and thus use the ThreadPool.
If you use TaskScheduler.Default, then it will always go to the ThreadPool.
The only reason you would use TaskScheduler.Current:
To avoid the default scheduler issue, you should always pass an
explicit TaskScheduler to Task.ContinueWith and Task.Factory.StartNew.
From Stephen Cleary's post ContinueWith is Dangerous, Too.
There's further explanation here from Stephen Toub on his MSDN blog.
I most certainly don't think I am capable of providing bullet proof answer but I will give my five cents.
What is a specific example of a case where it would be best to pass TaskScheduler.Current into ContinueWith(), as opposed to TaskScheduler.Default?
Imagine you are working on some web api that webserver naturally makes multithreaded. So you need to compromise your parallelism because you don't want to use all the resources of your webserver, but at the same time you want to speed up your processing time, so you decide to make custom task scheduler with lowered concurrency level because why not.
Now your api needs to query some database and order the results, but these results are millions so you decide to do it via Merge Sort(divide and conquer), then you need all your child tasks of this algorithm to be complient with your custom task scheduler (TaskScheduler.Current) because otherwise you will end up taking all the resources for the algorithm and your webserver thread pool will starve.
When to use TaskScheduler.Current, TaskScheduler.Default, TaskScheduler.FromCurrentSynchronizationContext(), or some other TaskScheduler
TaskScheduler.FromCurrentSynchronizationContext() - Specific for WPF,
Forms applications UI thread context, you use this basically when you
want to get back to the UI thread after being offloaded some work to
non-UI thread
example taken from here
private void button_Click(…)
{
… // #1 on the UI thread
Task.Factory.StartNew(() =>
{
… // #2 long-running work, so offloaded to non-UI thread
}).ContinueWith(t =>
{
… // #3 back on the UI thread
}, TaskScheduler.FromCurrentSynchronizationContext());
}
TaskScheduler.Default - Almost all the time when you don't have any specific requirements, edge cases to collate with.
TaskScheduler.Current - I think I've given one generic example above, but in general it should be used when you have either custom scheduler or you explicitly passed TaskScheduler.FromCurrentSynchronizationContext() to TaskFactory or Task.StartNew method and later you use continuation tasks or inner tasks (so pretty damn rare imo).

What is the exact difference between Task.ContinueWith and ActionBlock.LinkTo?

I am new to TPL Dataflow ActionBlock, TransformBlock etc. I used to practice Task.ContinueWith() to create a pipeline if needed. I recently started practicing about the TPL Dataflow and its blocks.
But I am a bit confused about the exact difference between those two. So could you please advise me when to use what?
These are two separate methods that have similar behavior but really don't relate to one another. ContinueWith schedules a continuation for a Task. With async/await you should not really need to use ContinueWith since the async/await keywords already schedule the remainder of your method as continuation. For example the two methods AsyncAwait and Continuation produce the same result.
public async Task AsyncAwait()
{
await DoAsync();
DoSomethingElse();
}
public async Task Continuation()
{
await DoAsync().ContinueWith(_ => DoSomethingElse());
}
public Task DoAsync() => Task.Delay(TimeSpan.FromSeconds(1));
public void DoSomethingElse()
{
//More Work
}
LinkTo on the other hand creates a disposable link between two Tpl-Dataflow blocks. That link can be configured in a number of ways see DatflowLinkOptions. One of the most configuration items is to PropagateCompletion. As you can hopefully see a dataflow link can be much more than simple continuation. You can pass completion, add a predicate to filter data or even link blocks into a complex structure like a mesh or feedback loop. Also, dataflow links allow you setup "backpressure" to throttle a flow. If the downstream block becomes overloaded and it's input buffer fills the upstream blocks can pause processing. The complete behavior of a dataflow link is not easily implemented with continuations by hand.
public ITargetBlock<int> BuildPipeline()
{
var block1 = new TransformBlock<int, int>(x => x);
var block2 = new ActionBlock<int>(x => Console.WriteLine(x));
block1.LinkTo(block2 , new DataflowLinkOptions() { PropagateCompletion = true });
return block1;
}
Unless you're doing complex linking you should always prefer the use of async/await over raw continuations. async/await makes the code easier to write, understand and maintain. LinkTo only applies to dataflow blocks and should be viewed as something separate from continuations and used to construct dataflow networks.

Observable.FromAsync vs Task.ToObservable

Does anyone have a steer on when to use one of these methods over the other. They seem to do the same thing in that they convert from TPL Task to an Observable.
Observable.FromAsync appear to support cancellation tokens which might be the subtle difference that allows the method generating the task to participate in cooperative cancellation if the observable is disposed.
Just wondering if I'm missing something obvious as to why you'd use one over the other.
Thanks
Observable.FromAsync accepts a TaskFactory in the form of Func<Task> or Func<Task<TResult>>,
in this case, the task is only created and executed, when the observable is subscribed to.
Where as .ToObservable() requires an already created (and thus started) Task.
#Sickboy answer is correct.
Observable.FromAsync() will start the task at the moment of subscription.
Task.ToObservable() needs an already running task.
One use for Observable.FromAsync is to control reentrancy for multiple calls to an async method.
This is an example where these two methods are not equivalent:
//ob is some IObservable<T>
//ExecuteQueryAsync is some async method
//Here, ExecuteQueryAsync will run **serially**, the second call will start
//only when the first one is already finished. This is an important property
//if ExecuteQueryAsync doesn't support reentrancy
ob
.Select(x => Observable.FromAsync(() => ExecuteQueryAsync(x))
.Concat()
.ObserveOnDispatcher()
.Subscribe(action)
vs
//ob is some IObservable<T>
//ExecuteQueryAsync is some async method
//Even when the `Subscribe` action order will be the same as the first
//example because of the `Concat`, ExecuteQueryAsync calls could be
//parallel, the second call to the method could start before the end of the
//first call.
.Select(x => ExecuteQueryAsync(x).ToObservable())
.Concat()
.Subscribe(action)
Note that on the first example one may need the ObserveOn() or ObserveOnDispatcher() method to ensure that the action is executed on the original dispatcher, since the Observable.FromAsync doesn't await the task, thus the continuation is executed on any available dispatcher
Looking at the code, it appears that (at least in some flows) that Observable.FromAsync calls into .ToObservable()*. I am sure the intent that they should be semantically equivalent (assuming you pass the same parameters e.g. Scheduler, CancellationToken etc.).
One is better suited to chaining/fluent syntax, one may read better in isolation. Whichever you coding style favors.
*https://github.com/Reactive-Extensions/Rx.NET/blob/859e6159cb07be67fd36b18c2ae2b9a62979cb6d/Rx.NET/Source/System.Reactive.Linq/Reactive/Linq/QueryLanguage.Async.cs#L727
Aside from being able to use a CancellationToken, FromAsync wraps in a defer so this allows changing the task logic based upon conditions at the time of subscription. Note that the Task will not be started, internally task.ToObservable will be called. The Func does allow you to start the task though when you create it.

Task.ContinueWith method requires task argument?

I have a class with two methods, Load() and Process(). I want to be able to run these individually as background tasks, or in sequence. I like the ContinueWith() syntax, but I'm not able to get it to work. I have to take a Task parameter on the method I continue with and cannot have a Task parameter on the initial method.
I would like to do it without lambda expressions, but am I stuck either using them, forcing a task parameter on one of the methods, or creating a third method LoadAndProcess()?
void Run()
{
// doesn't work, but I'd like to do it
//Task.Factory.StartNew(MethodNoArguments).ContinueWith(MethodNoArguments);
Console.WriteLine("ContinueWith");
Task.Factory.StartNew(MethodNoArguments).ContinueWith(MethodWithTaskArgument).Wait();
Console.WriteLine("Lambda");
Task.Factory.StartNew(() => { MethodNoArguments(); MethodNoArguments(); }).Wait();
Console.WriteLine("ContinueWith Lambda");
Task.Factory.StartNew(MethodNoArguments).ContinueWith(x => {
MethodNoArguments();
}).Wait();
}
void MethodNoArguments()
{
Console.WriteLine("MethodNoArguments()");
}
void MethodWithTaskArgument(Task t = null)
{
Console.WriteLine("MethodWithTaskArgument()");
}
In all overloads of ContinueWith(), the first parameter is a delegate that takes a Task, so you can't pass a parameterless delegate to it.
I think that using lambdas is perfectly fine and that it doesn't hurt readability. And the lambda in your code is unnecessarily verbose:
Task.Factory.StartNew(MethodNoArguments).ContinueWith(_ => MethodNoArguments())
Or, as Cory Carson pointed out in a comment, you could write an extension method:
public static Task ContinueWith(this Task task, Action action)
{
return task.ContinueWith(_ => action());
}
Writing clean code when you use multiple continuations is not that easy, although you can follow a few rules to make code cleaner:
Use the shorter form from svick's answer
Avoid chaining the continuations. Store the result of each continuation in a separate variable and then call ContinueWith in a separate line
If the continuation code is long, store it in a lambda variable and then ContinueWith the lambda.
Use an extension method like this "Then" method to make your code even cleaner.
You can change your code to something like this, which is slightly cleaner:
var call1=Task.Factory.StartNew(()=>MethodNoArguments());
var call2 = call1.ContinueWith(_ => MethodNoArguments());
call2.Wait();
or even
var call1 = Task.Factory.StartNew<Task>(MethodNoArguments);
var call2 = call1.Then(MethodNoArguments);
call2.Wait();
Stephen Toub discusses the Then extension and other ways you can clean up your code in Processing Sequences of Asynchronous Operations with Tasks
This problem is solved for good in C# 4. In C# 5 you can use the async/await keywords to create clean code that looks like the original synchronous version, eg:
static async Task Run()
{
await MethodNoArguments();
await MethodNoArguments();
}
static async Task MethodNoArguments()
{
await Task.Run(()=>Console.WriteLine("MethodNoArguments()"));
}
Visual Studio 11 and .NET 4.5 have a Go Live license so you could probably start using them right away.
You can use the Async CTP in C# 4 to achieve the same result. You can use the Async CTP in production as it has a go live license. The downside is that you 'll have to make some small changes to your code when you move to .NET 4.5 due to differences between the CTP and .NET 4.5 (eg. CTP has TaskEx.Run instead of Task.Run).

Async/Await - is it *concurrent*?

I've been considering the new async stuff in C# 5, and one particular question came up.
I understand that the await keyword is a neat compiler trick/syntactic sugar to implement continuation passing, where the remainder of the method is broken up into Task objects and queued-up to be run in order, but where control is returned to the calling method.
My problem is that I've heard that currently this is all on a single thread. Does this mean that this async stuff is really just a way of turning continuation code into Task objects and then calling Application.DoEvents() after each task completes before starting the next one?
Or am I missing something? (This part of the question is rhetorical - I'm fully aware I'm missing something :) )
It is concurrent, in the sense that many outstanding asychronous operations may be in progress at any time. It may or may not be multithreaded.
By default, await will schedule the continuation back to the "current execution context". The "current execution context" is defined as SynchronizationContext.Current if it is non-null, or TaskScheduler.Current if there's no SynchronizationContext.
You can override this default behavior by calling ConfigureAwait and passing false for the continueOnCapturedContext parameter. In that case, the continuation will not be scheduled back to that execution context. This usually means it will be run on a threadpool thread.
Unless you're writing library code, the default behavior is exactly what's desired. WinForms, WPF, and Silverlight (i.e., all the UI frameworks) supply a SynchronizationContext, so the continuation executes on the UI thread (and can safely access UI objects). ASP.NET also supplies a SynchronizationContext that ensures the continuation executes in the correct request context.
Other threads (including threadpool threads, Thread, and BackgroundWorker) do not supply a SynchronizationContext. So Console apps and Win32 services by default do not have a SynchronizationContext at all. In this situation, continuations execute on threadpool threads. This is why Console app demos using await/async include a call to Console.ReadLine/ReadKey or do a blocking Wait on a Task.
If you find yourself needing a SynchronizationContext, you can use AsyncContext from my Nito.AsyncEx library; it basically just provides an async-compatible "main loop" with a SynchronizationContext. I find it useful for Console apps and unit tests (VS2012 now has built-in support for async Task unit tests).
For more information about SynchronizationContext, see my Feb MSDN article.
At no time is DoEvents or an equivalent called; rather, control flow returns all the way out, and the continuation (the rest of the function) is scheduled to be run later. This is a much cleaner solution because it doesn't cause reentrancy issues like you would have if DoEvents was used.
The whole idea behind async/await is that it performs continuation passing nicely, and doesn't allocate a new thread for the operation. The continuation may occur on a new thread, it may continue on the same thread.
The real "meat" (the asynchronous) part of async/await is normally done separately and the communication to the caller is done through TaskCompletionSource. As written here http://blogs.msdn.com/b/pfxteam/archive/2009/06/02/9685804.aspx
The TaskCompletionSource type serves two related purposes, both alluded to by its name: it is a source for creating a task, and the source for that task’s completion. In essence, a TaskCompletionSource acts as the producer for a Task and its completion.
and the example is quite clear:
public static Task<T> RunAsync<T>(Func<T> function)
{
if (function == null) throw new ArgumentNullException(“function”);
var tcs = new TaskCompletionSource<T>();
ThreadPool.QueueUserWorkItem(_ =>
{
try
{
T result = function();
tcs.SetResult(result);
}
catch(Exception exc) { tcs.SetException(exc); }
});
return tcs.Task;
}
Through the TaskCompletionSource you have access to a Task object that you can await, but it isn't through the async/await keywords that you created the multithreading.
Note that when many "slow" functions will be converted to the async/await syntax, you won't need to use TaskCompletionSource very much. They'll use it internally (but in the end somewhere there must be a TaskCompletionSource to have an asynchronous result)
The way I like to explain it is that the "await" keyword simply waits for a task to finish but yields execution to the calling thread while it waits. It then returns the result of the Task and continues from the statement after the "await" keyword once the Task is complete.
Some people I have noticed seem to think that the Task is run in the same thread as the calling thread, this is incorrect and can be proved by trying to alter a Windows.Forms GUI element within the method that await calls. However, the continuation is run in the calling thread where ever possible.
Its just a neat way of not having to have callback delegates or event handlers for when the Task completes.
I feel like this question needs a simpler answer for people. So I'm going to oversimplify.
The fact is, if you save the Tasks and don't await them, then async/await is "concurrent".
var a = await LongTask1(x);
var b = await LongTask2(y);
var c = ShortTask(a, b);
is not concurrent. LongTask1 will complete before LongTask2 starts.
var a = LongTask1(x);
var b = LongTask2(y);
var c = ShortTask(await a, await b);
is concurrent.
While I also urge people to get a deeper understanding and read up on this, you can use async/await for concurrency, and it's pretty simple.

Categories

Resources