Cancelling continuation chains from the inside - c#

I am working in C# on .NET 4.0 and have started replacing a number of nested BackgroundWorker setups with Task<T>.
The "nesting" is of this form:
var secondWorker = new BackgroundWorker();
secondWorker.DoWork += (sender, args) =>
{
MoreThings();
};
var firstWorker = new BackgroundWorker();
firstWorker.DoWork += (sender, args) =>
{
args.Result = this.Things();
};
firstWorker.RunWorkerCompleted += (sender, args) =>
{
var result = (bool)args.Result;
// possibly do things on UI
if (result) { secondWorker.RunWorkerAsync(); }
};
secondWorker here plays the role of a callback for firstWorker. The equivalent when using Task<T>, as I understand it, is a continuation created with ContinueWith(); however, that doesn't let me decide at run time, from inside the first step, whether the continuation should actually run.
A - from my understanding very unclean - workaround would be this:
var source = new CancellationTokenSource();
var uiScheduler = TaskScheduler.FromCurrentSynchronizationContext();
Task.Factory.StartNew(() => { return this.Things(); })
.ContinueWith(t =>
{
// do things on UI
if (!t.Result) { source.Cancel(); }
}, CancellationToken.None, TaskContinuationOptions.NotOnFaulted, uiScheduler)
.ContinueWith(t => { this.MoreThings(); }, source.Token);
This kind of works, but from all the examples I've seen, in this form (accessing the CancellationTokenSource from within the continuation chain - although the task that does is not using the token) it rather looks like abuse of the CancellationToken mechanism. How bad is this really? What would be the proper, "idiomatic" way to cancel the continuation chain based on information determined inside its flow?
(This code on the outside has the intended effect, but I assume it is the wrong way to solve the task with respect to using the existing tools. I am not looking for critique of my "solution" but for the proper way to do it. That's why I am putting this on SO rather than Code Review.)

Alongside continuations, C# also provides async methods (C# 5 with .NET 4.5 and later). With that feature, your code would look something like this:
async Task<bool> Things() { ... }
async Task MoreThings() { ... }
async Task RunStuff()
{
if (await Things())
{
await MoreThings();
}
}
The exact specifics depend on your implementation details. The important thing here is that via async and await, C# will automatically generate a state machine that will abstract all the tedium away from dealing with continuations, in a way that makes it easy to organize them.
EDIT: it occurred to me that you may not want to convert your actual Things() and MoreThings() methods to async, so you can do this instead:
async Task RunStuff()
{
if (await Task.Run(() => Things()))
{
await Task.Run(() => MoreThings());
}
}
That will wrap your specific methods in an asynchronous task.
EDIT 2: it was pointed out to me that I overlooked the pre-4.5 restriction here. Without async/await, the following should work:
void RunStuff()
{
Task.Factory.StartNew(() => Things()).ContinueWith(task =>
{
if (task.Result)
{
Task.Factory.StartNew(() => MoreThings());
}
});
}
Something like that, anyway.
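A slightly fuller sketch of the same idea, for callers who need a single Task that represents the whole chain (for example to observe exceptions or to know when MoreThings has finished). It uses only APIs available in .NET 4.0. The class name ConditionalChain and the parameters things, moreThings and uiScheduler are hypothetical stand-ins for the question's methods, not part of the original code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ConditionalChain
{
    // Runs 'things' on the thread pool, then runs the decision step on
    // 'uiScheduler', and starts 'moreThings' only if the result was true.
    // The returned task completes when the whole chain has finished.
    public static Task RunStuff(Func<bool> things, Action moreThings, TaskScheduler uiScheduler)
    {
        return Task.Factory
            .StartNew(things, CancellationToken.None,
                      TaskCreationOptions.None, TaskScheduler.Default)
            .ContinueWith(t =>
            {
                // UI work would go here. Reading t.Result rethrows
                // (wrapped in AggregateException) if 'things' faulted.
                if (t.Result)
                {
                    return Task.Factory.StartNew(moreThings, CancellationToken.None,
                        TaskCreationOptions.None, TaskScheduler.Default);
                }

                // Nothing more to do: return an already-completed task.
                // (Task.FromResult does not exist in .NET 4.0.)
                var done = new TaskCompletionSource<object>();
                done.SetResult(null);
                return (Task)done.Task;
            }, uiScheduler)
            .Unwrap(); // Task<Task> -> Task that tracks the inner task
    }
}
```

In a WPF or WinForms application you would pass TaskScheduler.FromCurrentSynchronizationContext(), captured on the UI thread, as uiScheduler. No CancellationTokenSource is needed, because the decision is ordinary control flow inside the continuation.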

Related

Switch new Task(()=>{ }) for Func<Task>

In an answer to one of my other questions, I was told that use of new Task(() => { }) is not something that is a normal use case. I was advised to use Func<Task> instead. I have tried to make that work, but I can't seem to figure it out. (Rather than drag it out in the comments, I am asking a separate question here.)
My specific scenario is that I need the Task to not start right when it is declared and to be able to wait for it later.
Here is a LinqPad example using new Task(() => { }). NOTE: This works perfectly! (Except that it uses new Task.)
static async void Main(string[] args)
{
// Line that I need to swap to a Func<Task> somehow.
// note that this is "cold" not started task
Task startupDone = new Task(() => { });
var runTask = DoStuff(() =>
{
//+++ This is where we want to task to "start"
startupDone.Start();
});
//+++ Here we wait for the task to possibly start and finish. Or timeout.
// Note that this times out at 1000ms even if "blocking = 10000" below.
var didStartup = startupDone.Wait(1000);
Console.WriteLine(!didStartup ? "Startup Timed Out" : "Startup Finished");
await runTask;
Console.Read();
}
public static async Task DoStuff(Action action)
{
// Swap to 1000 to simulate starting up blocking
var blocking = 1; //1000;
await Task.Delay(500 + blocking);
action();
// Do the rest of the stuff...
await Task.Delay(1000);
}
I tried swapping the second line with:
Func<Task> startupDone = new Func<Task>(async () => { });
But then the lines below the comments with +++ in them don't work right.
I swapped the startupDone.Start() with startupDone.Invoke().
But startupDone.Wait needs the task. Which is only returned in the lambda. I am not sure how to get access to the task outside the lambda so I can Wait for it.
How can I use a Func<Task>, start it in one part of my code, and Wait for it in another part of my code? (Like I can with new Task(() => { })).
The code you posted cannot be refactored to make use of a Func<Task> instead of a cold task, because the method that needs to await the task (the Main method) is not the same method that controls the creation/starting of the task (the lambda parameter of the DoStuff method). This could make the use of the Task constructor legitimate in this case, depending on whether the design decision to delegate the starting of the task to a lambda is justified. In this particular example the startupDone is used as a synchronization primitive, to signal that a condition has been met and the program can continue. This could be achieved equally well by using a specialized synchronization primitive, like for example a SemaphoreSlim:
static async Task Main(string[] args)
{
var startupSemaphore = new SemaphoreSlim(0);
Task runTask = RunAsync(startupSemaphore);
bool startupFinished = await startupSemaphore.WaitAsync(1000);
Console.WriteLine(startupFinished ? "Startup Finished" : "Startup Timed Out");
await runTask;
}
public static async Task RunAsync(SemaphoreSlim startupSemaphore)
{
await Task.Delay(500);
startupSemaphore.Release(); // Signal that the startup is done
await Task.Delay(1000);
}
In my opinion using a SemaphoreSlim is more meaningful in this case, and makes the intent of the code clearer. It also lets you asynchronously await the signal with a timeout, via WaitAsync(Int32), which is not something that you get from a Task out of the box (it is doable though).
Using cold tasks may be tempting in some cases, but when you revisit your code after a month or two you'll find yourself confused, because of how rare and unexpected it is to have to deal with tasks that may or may not have been started yet.
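For completeness, here is one way the "doable though" part could look: a minimal sketch that awaits any Task with a timeout by racing it against Task.Delay. The helper name CompletesWithinAsync is hypothetical. Note that the delay is not cancelled when the task wins the race, which is acceptable for a sketch but wasteful with long timeouts.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskTimeoutSketch
{
    // Returns true if 'task' finished within 'timeout', false otherwise.
    public static async Task<bool> CompletesWithinAsync(Task task, TimeSpan timeout)
    {
        Task winner = await Task.WhenAny(task, Task.Delay(timeout));
        if (winner != task)
            return false;   // the delay finished first

        await task;         // propagate an exception or cancellation, if any
        return true;
    }
}
```

Combined with a TaskCompletionSource<bool> used as a one-shot signal, the startup check becomes await TaskTimeoutSketch.CompletesWithinAsync(startupSignal.Task, TimeSpan.FromSeconds(1)), where startupSignal is a hypothetical variable that the worker completes with TrySetResult(true).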
I always try my hardest to never have blocking behavior when dealing with anything async or any type that represents potential async behavior such as Task. You can slightly modify your DoStuff to facilitate waiting on your Action.
static async void Main(string[] args)
{
Func<CancellationToken,Task> startupTask = async(token)=>
{
Console.WriteLine("Waiting");
await Task.Delay(3000, token);
Console.WriteLine("Completed");
};
using var source = new CancellationTokenSource(2000);
var runTask = DoStuff(() => startupTask(source.Token), source.Token);
var didStartup = await runTask;
Console.WriteLine(!didStartup ? "Startup Timed Out" : "Startup Finished");
Console.Read();
}
public static async Task<bool> DoStuff(Func<Task> action, CancellationToken token)
{
var blocking = 10000;
try
{
await Task.Delay(500 + blocking, token);
await action();
}
catch(TaskCanceledException ex)
{
return false;
}
await Task.Delay(1000);
return true;
}
First, the type of your "do this later" object is going to become Func<Task>. Then, when the task is started (by invoking the function), you get back a Task that represents the operation:
static async void Main(string[] args)
{
Func<Task> startupDoneDelegate = async () => { };
Task startupDoneTask = null;
await DoStuff(() =>
{
startupDoneTask = startupDoneDelegate();
});
var didStartup = startupDoneTask.Wait(1000);
Console.WriteLine(!didStartup ? "Startup Timed Out" : "Startup Finished");
}

Is async and unwrap necessary for StartNew()?

I have this code:
var task = Task.Factory.StartNew(() => service.StartAsync(ct), ct);
but I'm wondering if it should instead be this:
var task = Task.Factory.StartNew(async () => await service.StartAsync(ct), ct).Unwrap();
Is the first one correct to start my async service? Or is the second one better?
Consider the type of task returned: the first one yields Task<Task<int>>, while the second yields Task<int>. So the first one is really a Task representing the starting of the inner task, while the second, unwrapped, represents the Task returned by the inner method, that is, the service starting. Finally, you can also Unwrap the first and get the same effect without the async/await, which is unnecessary here. None of this covers whether StartNew is needed at all in this case; it just reviews the return types you're looking at.
Consider the following code:
public class AsyncTesting
{
public void StartServiceTest()
{
Task<Task<int>> tsk1 = Task.Factory.StartNew(() => StartAsync());
Task<int> tsk2 = Task.Factory.StartNew(() => StartAsync()).Unwrap();
Task<int> tsk3 = Task.Factory.StartNew(async () => await StartAsync()).Unwrap();
}
public Task<int> StartAsync() => Task.Delay(2500).ContinueWith(tsk => 1);
}
The call that does not Unwrap returns a Task that represents starting the internal Task, not the work it does.
As JSteward explains in their answer, the first line of code is wrong. It doesn't do what you expect it to do:
Task<Task> task = Task.Factory.StartNew(() => service.StartAsync(ct), ct); // Buggy
The second line has the correct behavior, but not because of the async/await. The async/await can be safely elided. What makes it correct is the Unwrap. It is still problematic though, because it violates the guideline CA2008 about not creating tasks without passing a TaskScheduler.
The best way to solve your problem is to use the Task.Run method:
Task task = Task.Run(() => service.StartAsync(ct), ct); // Correct
You can read about the differences between Task.Run and Task.Factory.StartNew in this article by Stephen Toub.
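The difference is easy to observe. In this minimal sketch a TaskCompletionSource<int> stands in for service.StartAsync(ct) (an assumption made so the timing is deterministic); the names UnwrapDemo and serviceStarted are hypothetical.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class UnwrapDemo
{
    public static async Task<(bool InnerDoneWhenOuterFinished, int Result)> RunAsync()
    {
        // Stand-in for the service start-up: completes only when we say so.
        var serviceStarted = new TaskCompletionSource<int>();

        // Without Unwrap: the outer task finishes as soon as the delegate
        // has *returned* the inner task, not when the inner task completes.
        Task<Task<int>> outer = Task.Factory.StartNew(
            () => serviceStarted.Task,
            CancellationToken.None, TaskCreationOptions.DenyChildAttach, TaskScheduler.Default);
        Task<int> inner = await outer;
        bool innerDone = inner.IsCompleted; // the service has not started yet

        // Task.Run unwraps automatically: this task tracks the inner one.
        Task<int> viaRun = Task.Run(() => serviceStarted.Task);

        serviceStarted.SetResult(1);        // now the "service" is started
        int result = await viaRun;

        return (innerDone, result);
    }
}
```

Awaiting outer alone would therefore report success before the service had actually started, which is the bug described above.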

Parallel.ForEach using Thread.Sleep equivalent

So here's the situation: I need to make a call to a web site that starts a search. This search continues for an unknown amount of time, and the only way I know if the search has finished is by periodically querying the website to see if there's a "Download Data" link somewhere on it (it uses some strange ajax call on a javascript timer to check the backend and update the page, I think).
So here's the trick: I have hundreds of items I need to search for, one at a time. So I have some code that looks a little bit like this:
var items = getItems();
Parallel.ForEach(items, item =>
{
startSearch(item);
var finished = isSearchFinished(item);
while(finished == false)
{
finished = isSearchFinished(item); //<--- How do I delay this action 30 Secs?
}
downloadData(item);
});
Now, obviously this isn't the real code, because there could be things that cause isSearchFinished to always be false.
Obvious infinite loop danger aside, how would I correctly keep isSearchFinished() from calling over and over and over, but instead call every, say, 30 seconds or 1 minute?
I know Thread.Sleep() isn't the right solution, and I think the solution might be accomplished by using Threading.Timer() but I'm not very familiar with it, and there are so many threading options that I'm just not sure which to use.
It's quite easy to implement with tasks and async/await, as noted by @KevinS in the comments:
async Task<ItemData> ProcessItemAsync(Item item)
{
while (true)
{
if (await isSearchFinishedAsync(item))
break;
await Task.Delay(30 * 1000);
}
return await downloadDataAsync(item);
}
// ...
var items = getItems();
var tasks = items.Select(i => ProcessItemAsync(i)).ToArray();
await Task.WhenAll(tasks);
var data = tasks.Select(t => t.Result);
This way, you don't block ThreadPool threads in vain for what is mostly a bunch of I/O-bound network operations. If you're not familiar with async/await, the async-await tag wiki might be a good place to start.
I assume you can convert your synchronous methods isSearchFinished and downloadData to asynchronous versions, using something like HttpClient for non-blocking HTTP requests and returning a Task<>. If you are unable to do so, you can still simply wrap them with Task.Run, as await Task.Run(() => isSearchFinished(item)) and await Task.Run(() => downloadData(item)). Normally this is not recommended, but as you have hundreds of items, it would still give you a much better level of concurrency than Parallel.ForEach in this case, because you won't be blocking pool threads for 30s, thanks to the asynchronous Task.Delay.
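The "obvious infinite loop danger" from the question can be handled by bounding the number of polls and accepting a CancellationToken. The following is a minimal sketch; the helper name PollUntilAsync and its parameters are hypothetical, and the check delegate is whatever asynchronous version of isSearchFinished you end up with.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Polling
{
    // Calls 'isFinishedAsync' until it returns true, waiting 'interval'
    // between attempts. Gives up and returns false after 'maxAttempts'.
    public static async Task<bool> PollUntilAsync(
        Func<Task<bool>> isFinishedAsync,
        TimeSpan interval,
        int maxAttempts,
        CancellationToken token = default(CancellationToken))
    {
        for (int attempt = 1; attempt <= maxAttempts; attempt++)
        {
            if (await isFinishedAsync())
                return true;

            // No thread is blocked while waiting; skip the wait after the last try.
            if (attempt < maxAttempts)
                await Task.Delay(interval, token);
        }
        return false;
    }
}
```

Inside ProcessItemAsync the while loop could then become something like if (!await Polling.PollUntilAsync(() => isSearchFinishedAsync(item), TimeSpan.FromSeconds(30), 40)) followed by whatever failure handling suits the application.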
You can also write a generic function using TaskCompletionSource and Threading.Timer to return a Task that becomes complete once a specified retry function succeeds.
public static Task RetryAsync(Func<bool> retryFunc, TimeSpan retryInterval)
{
return RetryAsync(retryFunc, retryInterval, CancellationToken.None);
}
public static Task RetryAsync(Func<bool> retryFunc, TimeSpan retryInterval, CancellationToken cancellationToken)
{
var tcs = new TaskCompletionSource<object>();
cancellationToken.Register(() => tcs.TrySetCanceled());
var timer = new Timer((state) =>
{
var taskCompletionSource = (TaskCompletionSource<object>) state;
try
{
if (retryFunc())
{
taskCompletionSource.TrySetResult(null);
}
}
catch (Exception ex)
{
taskCompletionSource.TrySetException(ex);
}
}, tcs, TimeSpan.FromMilliseconds(0), retryInterval);
// Once the task is complete, dispose of the timer so it doesn't keep firing. The closure also keeps
// a reference to the timer, so it is not garbage collected while the task is still pending.
tcs.Task.ContinueWith(t => timer.Dispose(),
CancellationToken.None,
TaskContinuationOptions.ExecuteSynchronously,
TaskScheduler.Default);
return tcs.Task;
}
You can then use RetryAsync like this:
var searchTasks = new List<Task>();
searchTasks.AddRange(items.Select(
downloadItem => RetryAsync( () => isSearchFinished(downloadItem), TimeSpan.FromSeconds(2)) // retry interval
.ContinueWith(t => downloadData(downloadItem),
CancellationToken.None,
TaskContinuationOptions.OnlyOnRanToCompletion,
TaskScheduler.Default)));
await Task.WhenAll(searchTasks.ToArray());
The ContinueWith part specifies what you do once the task has completed successfully. In this case it will run your downloadData method on a thread pool thread because we specified TaskScheduler.Default and the continuation will only execute if the task ran to completion, i.e. it was not canceled and no exception was thrown.

Moving from Event-Based Async to Task-Based Async

I am using a WCF service to load some data in a WPF application and, until recently, did so via the Event-Based Async methods that Visual Studio auto-generated for me:
//Old way
private void LoadFoos(int barId)
{
serviceClient.SelectFoosByBarIdCompleted += (s, e) =>
{
Foos = e.Result.OrderBy(f => f.Description).ToList();
};
serviceClient.SelectFoosByBarIdAsync(barId);
}
For whatever reason, we are moving to Tasks, and I have a question about the best way to do the same sort of thing:
//New way
private async void LoadFoos(int barId)
{
TaskScheduler uiTaskScheduler = TaskScheduler.FromCurrentSynchronizationContext();
serviceClient.SelectFoosByBarIdAsync(barId).ContinueWith(t =>
{
Foos = t.Result.OrderBy(f => f.Description).ToList();
}, uiTaskScheduler);
}
I think this is uglier because I have to manually set the context so I don't update things on the wrong thread (Foos is a data-bound property). Also, I thought I'd be able to do this:
//New way #2, doesn't sort ;(
private async void LoadFoos(int barId)
{
TaskScheduler uiTaskScheduler = TaskScheduler.FromCurrentSynchronizationContext();
var selectFoosTask = serviceClient.SelectFoosByBarIdAsync(barId);
Foos = selectFoosTask;
}
But then I can't order it according to Description.
The whole task concept is fairly new to me so maybe I am missing something. Is there a more succinct way than what I've listed above?
You would just use await, not a continuation:
private async Task LoadFoos(int barId)
{
var temp = await serviceClient.SelectFoosByBarIdAsync(barId);
Foos = temp.OrderBy(f => f.Description).ToList();
}
Note that using an async void method, or even just a Task-returning method, is probably not ideal. It would likely be better to rewrite this as (assuming Foos is a List<string>):
private async Task<List<string>> LoadFoosAsync(int barId)
{
var temp = await serviceClient.SelectFoosByBarIdAsync(barId);
return temp.OrderBy(f => f.Description).ToList();
}
Then, when you call this, use:
Foos = await LoadFoosAsync(id);
As for converting event based async methods to task based ones (Tasks are vastly superior by the way:) ) check out these blog posts
http://msdn.microsoft.com/en-us/magazine/ff959203.aspx
http://blogs.msdn.com/b/pfxteam/archive/2009/06/19/9791857.aspx
You can still do all the processing before you marshal back to the ui thread. You can also create your own ContinueWith helper method that puts the task on correct TaskScheduler. (if you can't use await, that is the simplest option)
Also note that newer versions (2012 and later, I think) of the wsdl tool actually generate Task-based async methods for services.
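If the proxy cannot be regenerated, an event-based method can be wrapped by hand with TaskCompletionSource. The sketch below is self-contained, so it uses a made-up client: LegacyFooClient, FooCompletedEventArgs and SelectFoosTaskAsync are all hypothetical names, and a real generated proxy would expose event args derived from AsyncCompletedEventArgs instead.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical event-based client standing in for the generated WCF proxy.
public class FooCompletedEventArgs : EventArgs
{
    public string[] Result { get; set; }
    public Exception Error { get; set; }
}

public class LegacyFooClient
{
    public event EventHandler<FooCompletedEventArgs> SelectFoosCompleted;

    public void SelectFoosAsync(int barId)
    {
        // Simulates the service answering later on another thread.
        Task.Run(() => SelectFoosCompleted?.Invoke(this,
            new FooCompletedEventArgs { Result = new[] { "b" + barId, "a" + barId } }));
    }
}

public static class LegacyFooClientExtensions
{
    // Wraps the event/method pair in a Task that completes when the event fires.
    public static Task<string[]> SelectFoosTaskAsync(this LegacyFooClient client, int barId)
    {
        var tcs = new TaskCompletionSource<string[]>();
        EventHandler<FooCompletedEventArgs> handler = null;
        handler = (sender, e) =>
        {
            client.SelectFoosCompleted -= handler;   // one-shot subscription
            if (e.Error != null) tcs.TrySetException(e.Error);
            else tcs.TrySetResult(e.Result);
        };
        client.SelectFoosCompleted += handler;       // subscribe before starting
        client.SelectFoosAsync(barId);
        return tcs.Task;
    }
}
```

The caller can then write var foos = await client.SelectFoosTaskAsync(barId); and sort the result on the UI thread. This simple version assumes only one call is in flight per client instance; overlapping calls would need a correlation value such as the userState argument that generated proxies usually offer.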
Since the method is async just await the task rather than manually adding a continuation:
private async Task LoadFoos(int barId)
{
Foos = (await serviceClient.SelectFoosByBarIdAsync(barId))
.OrderBy(f => f.Description).ToList();
}
Also note you should avoid async void methods whenever possible, as you then have no way of knowing when the asynchronous operation ends, or to access any exceptions thrown. Instead have the method return a Task. Better yet would be to have the method return a Task<T> where T is the data that this returns, rather than having the method set some other field.

Did I actually gain anything here using tasks and async

I am trying to get to grips with Tasks and the Async Await keywords. I have a little sample method which essentially invokes n number of methods. Two key points to note are
I don't care about the order the methods run in
All methods should be called on a background thread
With this here is the code.
public async void Handle<T>(T entry) {
await Task.Run(() => {
Parallel.ForEach(_handlers, pair => {
pair.Value.Invoke(_reference, new object[] {
entry
});
});
});
}
My question is did I actually gain any async or parallelism out of the code above?
I'm assuming that you're running in a UI application, since parallel code on a server is quite rare.
In this case, wrapping parallel code in a Task.Run is perfectly normal, and a common pattern when you wish to execute parallel code asynchronously.
I would make a few changes:
Avoid async void. Return a Task, so you can handle errors more cleanly.
Follow the Task-based Asynchronous Pattern naming conventions (i.e., end your method with Async).
Return the Task directly instead of async/await (if your entire async method is just to await a task, you can avoid some overhead just by returning the task directly).
Like this:
public Task HandleAsync<T>(T entry) {
return Task.Run(() => {
Parallel.ForEach(_handlers, pair => {
pair.Value.Invoke(_reference, new object[] {
entry
});
});
});
}
I'd also consider two other possibilities:
Consider using Parallel.Invoke instead of Parallel.ForEach. It seems to me that Parallel.Invoke is a closer match to what you're actually trying to do. (See svick's comment below).
Consider leaving the parallel code in a synchronous method (e.g., public void Handle<T>(T entry)) and calling it with Task.Run (e.g., await Task.Run(() => Handle(entry));). If your Handle[Async] method is intended to be part of a general-purpose library, then you want to avoid the use of Task.Run within your asynchronous methods.
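Both suggestions can be combined. The following minimal sketch keeps Handle synchronous and uses Parallel.Invoke. It assumes, unlike the original code, that the handlers are plain delegates rather than reflected methods invoked with _reference; HandlerRunner and Add are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class HandlerRunner
{
    // Assumption: handlers are stored as delegates, not as MethodInfo values.
    private readonly List<Action<object>> _handlers = new List<Action<object>>();

    public void Add(Action<object> handler)
    {
        _handlers.Add(handler);
    }

    // Synchronous: blocks until every handler has run, in no particular order.
    public void Handle<T>(T entry)
    {
        Action[] calls = _handlers
            .Select(h => (Action)(() => h(entry)))
            .ToArray();
        Parallel.Invoke(calls);
    }
}
```

A UI caller that must not block would then write await Task.Run(() => runner.Handle(entry)); so the decision to use a background thread stays with the caller rather than inside the library method.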
No, you've gained nothing here. Even if you made it return Task (so that the caller could check when everything had finished) you'd be better off without async:
public Task Handle<T>(T entry)
{
return Task.Run(() =>
{
Parallel.ForEach(_handlers, pair =>
{
pair.Value.Invoke(_reference, new object[] { entry });
});
});
}
