Async and Await - How is order of execution maintained?

Async and Await - How is order of execution maintained? - c#

I am actually reading some topics about the Task Parallel Library and the asynchronous programming with async and await. The book "C# 5.0 in a Nutshell" states that when awaiting an expression using the await keyword, the compiler transforms the code into something like this:
var awaiter = expression.GetAwaiter();
awaiter.OnCompleted (() =>
{
var result = awaiter.GetResult();
Let's assume, we have this asynchronous function (also from the referred book):
async Task DisplayPrimeCounts()
{
for (int i = 0; i < 10; i++)
Console.WriteLine (await GetPrimesCountAsync (i*1000000 + 2, 1000000) +
" primes between " + (i*1000000) + " and " + ((i+1)*1000000-1));
Console.WriteLine ("Done!");
}
The call of the 'GetPrimesCountAsync' method will be enqueued and executed on a pooled thread. In general invoking multiple threads from within a for loop has the potential for introducing race conditions.
So how does the CLR ensure that the requests will be processed in the order they were made? I doubt that the compiler simply transforms the code into the above manner, since this would decouple the 'GetPrimesCountAsync' method from the for loop.

Just for the sake of simplicity, I'm going to replace your example with one that's slightly simpler, but has all of the same meaningful properties:
async Task DisplayPrimeCounts()
{
for (int i = 0; i < 10; i++)
{
var value = await SomeExpensiveComputation(i);
Console.WriteLine(value);
}
Console.WriteLine("Done!");
}
The ordering is all maintained because of the definition of your code. Let's imagine stepping through it.
This method is first called
The first line of code is the for loop, so i is initialized.
The loop check passes, so we go to the body of the loop.
SomeExpensiveComputation is called. It should return a Task<T> very quickly, but the work that it'd doing will keep going on in the background.
The rest of the method is added as a continuation to the returned task; it will continue executing when that task finishes.
After the task returned from SomeExpensiveComputation finishes, we store the result in value.
value is printed to the console.
GOTO 3; note that the existing expensive operation has already finished before we get to step 4 for the second time and start the next one.
As far as how the C# compiler actually accomplishes step 5, it does so by creating a state machine. Basically every time there is an await there's a label indicating where it left off, and at the start of the method (or after it's resumed after any continuation fires) it checks the current state, and does a goto to the spot where it left off. It also needs to hoist all local variables into fields of a new class so that the state of those local variables is maintained.
Now this transformation isn't actually done in C# code, it's done in IL, but this is sort of the morale equivalent of the code I showed above in a state machine. Note that this isn't valid C# (you cannot goto into a a for loop like this, but that restriction doesn't apply to the IL code that is actually used. There are also going to be differences between this and what C# actually does, but is should give you a basic idea of what's going on here:
internal class Foo
{
public int i;
public long value;
private int state = 0;
private Task<int> task;
int result0;
public Task Bar()
{
var tcs = new TaskCompletionSource<object>();
Action continuation = null;
continuation = () =>
{
try
{
if (state == 1)
{
goto state1;
}
for (i = 0; i < 10; i++)
{
Task<int> task = SomeExpensiveComputation(i);
var awaiter = task.GetAwaiter();
if (!awaiter.IsCompleted)
{
awaiter.OnCompleted(() =>
{
result0 = awaiter.GetResult();
continuation();
});
state = 1;
return;
}
else
{
result0 = awaiter.GetResult();
}
state1:
Console.WriteLine(value);
}
Console.WriteLine("Done!");
tcs.SetResult(true);
}
catch (Exception e)
{
tcs.SetException(e);
}
};
continuation();
}
}
Note that I've ignored task cancellation for the sake of this example, I've ignored the whole concept of capturing the current synchronization context, there's a bit more going on with error handling, etc. Don't consider this a complete implementation.

The call of the 'GetPrimesCountAsync' method will be enqueued and executed on a pooled thread.
No. await does not initiate any kind of background processing. It waits for existing processing to complete. It is up to GetPrimesCountAsync to do that (e.g. using Task.Run). It's more clear this way:
var myRunningTask = GetPrimesCountAsync();
await myRunningTask;
The loop only continues when the awaited task has completed. There is never more than one task outstanding.
So how does the CLR ensure that the requests will be processed in the order they were made?
The CLR is not involved.
I doubt that the compiler simply transforms the code into the above manner, since this would decouple the 'GetPrimesCountAsync' method from the for loop.
The transform that you shows is basically right but notice that the next loop iteration is not started right away but in the callback. That's what serializes execution.

Related

Why a simple await Task.Delay(1) enables parallel execution?

I would like to ask simple question about code bellow:
static void Main(string[] args)
{
MainAsync()
//.Wait();
.GetAwaiter().GetResult();
}
static async Task MainAsync()
{
Console.WriteLine("Hello World!");
Task<int> a = Calc(18000);
Task<int> b = Calc(18000);
Task<int> c = Calc(18000);
await a;
await b;
await c;
Console.WriteLine(a.Result);
}
static async Task<int> Calc(int a)
{
//await Task.Delay(1);
Console.WriteLine("Calc started");
int result = 0;
for (int k = 0; k < a; ++k)
{
for (int l = 0; l < a; ++l)
{
result += l;
}
}
return result;
}
This example runs Calc functions in synchronous way. When the line //await Task.Delay(1); will be uncommented, the Calc functions will be executed in a parallel way.
The question is: Why, by adding simple await, the Calc function is then async?
I know about async/await pair requirements. I'm asking about what it's really happening when simple await Delay is added at the beginning of a function. Whole Calc function is then recognized to be run in another thread, but why?
Edit 1:
When I added a thread checking to code:
static async Task<int> Calc(int a)
{
await Task.Delay(1);
Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
int result = 0;
for (int k = 0; k < a; ++k)
{
for (int l = 0; l < a; ++l)
{
result += l;
}
}
return result;
}
it is possible to see (in console) different thread id's. If await Delay line is deleted, the thread id is always the same for all runs of Calc function. In my opinion it proves that code after await is (can be) runned in different threads. And it is the reason why code is faster (in my opinion of course).

It's important to understand how async methods work.
First, they start running synchronously, on the same thread, just like every other method. Without that await Task.Delay(1) line, the compiler will have warned you that the method would be completely synchronous. The async keyword doesn't, by itself, make your method asynchronous. It just enables the use of await.
The magic happens at the first await that acts on an incomplete Task. At that point the method returns. It returns a Task that you can use to check when the rest of the method has completed.
So when you have await Task.Delay(1) there, the method returns at that line, allowing your MainAsync method to move to the next line and start the next call to Calc.
How the continuation of Calc runs (everything after await Task.Delay(1)) depends on if there is a "synchronization context". In ASP.NET (not Core) or a UI application, for example, the synchronization context controls how the continuations run and they would run one after the other. In a UI app, it would be on the same thread it started from. In ASP.NET, it may be a different thread, but still one after the other. So in either case, you would not see any parallelism.
However, because this is a console app, which does not have a synchronization context, the continuations happen on any ThreadPool thread as soon as the Task from Task.Delay(1) completes. That means each continuation can happen in parallel.
Also worth noting: starting with C# 7.1 you can make your Main method async, eliminating the need for your MainAsync method:
static async Task Main(string[] args)

An async function returns the incomplete task to the caller at its first incomplete await. After that the await on the calling side will await that task to become complete.
Without the await Task.Delay(1), Calc() does not have any awaits of its own, so will only return to the caller when it runs to the end. At this point the returned Task is already complete, so the await on the calling site immediately uses the result without actually invoking the async machinery.

layman's version....
nothing in the process is yielding CPU time back without 'delay' and so it doesn't give anything else CPU time, you are confusing this with multiple threaded code. "async and await" is not about multiple threaded but about using the CPU (thread/threads) when its doing non CPU work" aka writing to disk. Writing to disk does not need the thread (CPU). So when something is async, it can free the thread and be used for something else instead of waiting for non CPU (oi task) to complete.
#sunside is saying the same thing just more technically.
static async Task<int> Calc(int a)
{
//faking a asynchronous .... this will give this thread to something else
// until done then return here...
// does not make sense... as your making this take longer for no gain.
await Task.Delay(1);
Console.WriteLine("Calc started");
int result = 0;
for (int k = 0; k < a; ++k)
{
for (int l = 0; l < a; ++l)
{
result += l;
}
}
return result;
}
vs
static async Task<int> Calc(int a)
{
using (var reader = File.OpenText("Words.txt"))
{
//real asynchronous .... this will give this thread to something else
var fileText = await reader.ReadToEndAsync();
// Do something with fileText...
}
Console.WriteLine("Calc started");
int result = 0;
for (int k = 0; k < a; ++k)
{
for (int l = 0; l < a; ++l)
{
result += l;
}
}
return result;
}
the reason it looks like its in "parallel way" is that its just give the others tasks CPU time.
example; aka without delay
await a; do this (no actual aysnc work)
await b; do this (no actual aysnc work)
await c; do this (no actual aysnc work)
example 2;aka with delay
await a; start this then pause this (fake async), start b but come back and finish a
await b; start this then pause this (fake async), start c but come back and finish b
await c; start this then pause this (fake async), come back and finish c
what you should find is although more is started sooner, the overall time will be longer as it as to jump between tasks for no benefit with a faked asynchronous task. where as, if the await Task.Delay(1) was a real async function aka asynchronous in nature then the benefit would be it can start the other work using the thread which would of been blocked... while it waits for something which does not require the thread.
update silly code to show its slower... Make sure you are in "Release" mode you should always ignore the first run... these test are silly and you would need to use https://github.com/dotnet/BenchmarkDotNet to really see the diff
static void Main(string[] args)
{
Console.WriteLine("Exmaple1 - no Delay, expecting it to be faster, shorter times on average");
for (int i = 0; i < 10; i++)
{
Exmaple1().GetAwaiter().GetResult();
}
Console.WriteLine("Exmaple2- with Delay, expecting it to be slower,longer times on average");
for (int i = 0; i < 10; i++)
{
Exmaple2().GetAwaiter().GetResult();
}
}
static async Task Exmaple1()
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
Task<int> a = Calc1(18000); await a;
Task<int> b = Calc1(18000); await b;
Task<int> c = Calc1(18000); await c;
stopwatch.Stop();
Console.WriteLine("Time elapsed: {0}", stopwatch.Elapsed);
}
static async Task<int> Calc1(int a)
{
int result = 0;
for (int k = 0; k < a; ++k) { for (int l = 0; l < a; ++l) { result += l; } }
return result;
}
static async Task Exmaple2()
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
Task<int> a = Calc2(18000); await a;
Task<int> b = Calc2(18000); await b;
Task<int> c = Calc2(18000); await c;
stopwatch.Stop();
Console.WriteLine("Time elapsed: {0}", stopwatch.Elapsed);
}
static async Task<int> Calc2(int a)
{
await Task.Delay(1);
int result = 0;
for (int k = 0; k < a; ++k){for (int l = 0; l < a; ++l) { result += l; } }
return result;
}

By using an async/await pattern, you intend for your Calc method to run as a task:
Task<int> a = Calc(18000);
We need to establish that tasks are generally asynchronous in their nature, but not parallel - parallelism is a feature of threads. However, under the hood, some thread will be used to execute your tasks on. Depending on the context your running your code in, multiple tasks may or may not be executed in parallel or sequentially - but they will be (allowed to be) asynchronous.
One good way of internalizing this is picturing a teacher (the thread) answering question (the tasks) in class. A teacher will never be able to answer two different questions simultaneously - i.e. in parallel - but she will be able to answer questions of multiple students, and can also be interrupted with new questions in between.
Specifically, async/await is a cooperative multiprocessing feature (emphasis on "cooperative") where tasks only get to be scheduled onto a thread if that thread is free - and if some task is already running on that thread, it needs to manually give way. (Whether and how many threads are available for execution is, in turn, dependent on the environment your code is executing in.)
When running Task.Delay(1) the code announces that it is intending to sleep, which signals to the scheduler that another task may execute, thereby enabling asynchronicity. The way it's used in your code it is, in essence, a slightly worse version of Task.Yield (you can find more details about that in the comments below).
Generally speaking, whenever you await, the method currently being executed is "returned" from, marking the code after it as a continuation. This continuation will only be executed when the task scheduler selects the current task for execution again. This may happen if no other task is currently executing and waiting to be executed - e.g. because they all yielded or await something "long-running".
In your example, the Calc method yields due to the Task.Delay and execution returns to the Main method. This, in turn enters the next Calc method and the pattern repeats. As was established in other answers, these continuations may or may not execute on different threads, depending on the environment - without a synchronization context (e.g. in a console application), it may happen. To be clear, this is neither a feature of Task.Delay nor async/await, but of the configuration your code executes in. If you require parallelism, use proper threads or ensure that your tasks are started such that they encourage use of multiple threads.
In another note: Whenever you intend to run synchronous code in an asynchronous manner, use Task.Run() to execute it. This will make sure it doesn't get in your way too much by always using a background thread. This SO answer on LongRunning tasks might be insightful.

Execute set of tasks in parallel but with a group timeout

I'm currently trying to write a status checking tool with a reliable timeout value. One way I'd seen how to do this was using Task.WhenAny() and including a Task.Delay, however it doesn't seem to produce the results I expect:
public void DoIUnderstandTasksTest()
{
var checkTasks = new List<Task>();
// Create a list of dummy tasks that should just delay or "wait"
// for some multiple of the timeout
for (int i = 0; i < 10; i++)
{
checkTasks.Add(Task.Delay(_timeoutMilliseconds/2));
}
// Wrap the group of tasks in a task that will wait till they all finish
var allChecks = Task.WhenAll(checkTasks);
// I think WhenAny is supposed to return the first task that completes
bool didntTimeOut = Task.WhenAny(allChecks, Task.Delay(_timeoutMilliseconds)) == allChecks;
Assert.True(didntTimeOut);
}
What am I missing here?

I think you're confusing the workings of the When... calls with Wait....
Task.WhenAny doesn't return the first task to complete among those you pass to it. Rather, it returns a new Task that will be completed when any of the internal tasks finish. This means your equality check will always return false - the new task will never equal the previous one.
The behavior you're expecting seems similar to Task.WaitAny, which will block current execution until any of the internal tasks complete, and return the index of the completed task.
Using WaitAny, your code will look like this:
// Wrap the group of tasks in a task that will wait till they all finish
var allChecks = Task.WhenAll(checkTasks);
var taskIndexThatCompleted = Task.WaitAny(allChecks, Task.Delay(_timeoutMilliseconds));
Assert.AreEqual(0, taskIndexThatCompleted);

Random tasks from Task.Factory.StartNew never finishes

I am using Async await with Task.Factory method.
public async Task<JobDto> ProcessJob(JobDto jobTask)
{
try
{
var T = Task.Factory.StartNew(() =>
{
JobWorker jobWorker = new JobWorker();
jobWorker.Execute(jobTask);
});
await T;
}
This method I am calling inside a loop like this
for(int i=0; i < jobList.Count(); i++)
{
tasks[i] = ProcessJob(jobList[i]);
}
What I notice is that new tasks opens up inside Process explorer and they also start working (based on log file). however out of 10 sometimes 8 or sometimes 7 finishes. Rest of them just never come back.
why would that be happening ?
Are they timing out ? Where can I set timeout for my tasks ?
UPDATE
Basically above, I would like each Task to start running as soon as they are called and wait for the response on AWAIT T keyword. I am assuming here that once they finish each of them will come back at Await T and do the next action. I am alraedy seeing this result for 7 out of 10 tasks but 3 of them are not coming back.
Thanks

It is hard to say what the issues is without the rest if the code, but you code can be simplified by making ProcessJob synchronous and then calling Task.Run with it.
public JobDto ProcessJob(JobDto jobTask)
{
JobWorker jobWorker = new JobWorker();
return jobWorker.Execute(jobTask);
}
Start tasks and wait for all tasks to finish. Prefer using Task.Run rather than Task.Factory.StartNew as it provides more favourable defaults for pushing work to the background. See here.
for(int i=0; i < jobList.Count(); i++)
{
tasks[i] = Task.Run(() => ProcessJob(jobList[i]));
}
try
{
await Task.WhenAll(tasks);
}
catch(Exception ex)
{
// handle exception
}

First, let's make a reproducible version of your code. This is NOT the best way to achieve what you are doing, but to show you what is happening in your code!
I'll keep the code almost same as your code, except I'll use simple int rather than your JobDto and on completion of the job Execute() I'll write in a file that we can verify later. Here's the code
public class SomeMainClass
{
public void StartProcessing()
{
var jobList = Enumerable.Range(1, 10).ToArray();
var tasks = new Task[10];
//[1] start 10 jobs, one-by-one
for (int i = 0; i < jobList.Count(); i++)
{
tasks[i] = ProcessJob(jobList[i]);
}
//[4] here we have 10 awaitable Task in tasks
//[5] do all other unrelated operations
Thread.Sleep(1500); //assume it works for 1.5 sec
// Task.WaitAll(tasks); //[6] wait for tasks to complete
// The PROCESS IS COMPLETE here
}
public async Task ProcessJob(int jobTask)
{
try
{
//[2] start job in a ThreadPool, Background thread
var T = Task.Factory.StartNew(() =>
{
JobWorker jobWorker = new JobWorker();
jobWorker.Execute(jobTask);
});
//[3] await here will keep context of calling thread
await T; //... and release the calling thread
}
catch (Exception) { /*handle*/ }
}
}
public class JobWorker
{
static object locker = new object();
const string _file = #"C:\YourDirectory\out.txt";
public void Execute(int jobTask) //on complete, writes in file
{
Thread.Sleep(500); //let's assume does something for 0.5 sec
lock(locker)
{
File.AppendAllText(_file,
Environment.NewLine + "Writing the value-" + jobTask);
}
}
}
After running just the StartProcessing(), this is what I get in the file
Writing the value-4
Writing the value-2
Writing the value-3
Writing the value-1
Writing the value-6
Writing the value-7
Writing the value-8
Writing the value-5
So, 8/10 jobs has completed. Obviously, every time you run this, the number and order might change. But, the point is, all the jobs did not complete!
Now, if I un-comment the step [6] Task.WaitAll(tasks);, this is what I get in my file
Writing the value-2
Writing the value-3
Writing the value-4
Writing the value-1
Writing the value-5
Writing the value-7
Writing the value-8
Writing the value-6
Writing the value-9
Writing the value-10
So, all my jobs completed here!
Why the code is behaving like this, is already explained in the code-comments. The main thing to note is, your tasks run in ThreadPool based Background threads. So, if you do not wait for them, they will be killed when the MAIN process ends and the main thread exits!!
If you still don't want to await the tasks there, you can return the list of tasks from this first method and await the tasks at the very end of the process, something like this
public Task[] StartProcessing()
{
...
for (int i = 0; i < jobList.Count(); i++)
{
tasks[i] = ProcessJob(jobList[i]);
}
...
return tasks;
}
//in the MAIN METHOD of your application/process
var tasks = new SomeMainClass().StartProcessing();
// do all other stuffs here, and just at the end of process
Task.WaitAll(tasks);
Hope this clears all confusion.

It's possible your code is swallowing exceptions. I would add a ContineWith call to the end of the part of the code that starts the new task. Something like this untested code:
var T = Task.Factory.StartNew(() =>
{
JobWorker jobWorker = new JobWorker();
jobWorker.Execute(jobTask);
}).ContinueWith(tsk =>
{
var flattenedException = tsk.Exception.Flatten();
Console.Log("Exception! " + flattenedException);
return true;
});
},TaskContinuationOptions.OnlyOnFaulted); //Only call if task is faulted
Another possibility is that something in one of the tasks is timing out (like you mentioned) or deadlocking. To track down whether a timeout (or maybe deadlock) is the root cause, you could add some timeout logic (as described in this SO answer):
int timeout = 1000; //set to something much greater than the time it should take your task to complete (at least for testing)
var task = TheMethodWhichWrapsYourAsyncLogic(cancellationToken);
if (await Task.WhenAny(task, Task.Delay(timeout, cancellationToken)) == task)
{
// Task completed within timeout.
// Consider that the task may have faulted or been canceled.
// We re-await the task so that any exceptions/cancellation is rethrown.
await task;
}
else
{
// timeout/cancellation logic
}
Check out the documentation on exception handling in the TPL on MSDN.

Calling an async method from a synchronous method

I am attempting to run async methods from a synchronous method. But I can't await the async method since I am in a synchronous method. I must not be understanding TPL as this is the fist time I'm using it.
private void GetAllData()
{
GetData1()
GetData2()
GetData3()
}
Each method needs the previous method to finish as the data from the first is used for the second.
However, inside each method I want to start multiple Task operations in order to speed up the performance. Then I want to wait for all of them to finish.
GetData1 looks like this
internal static void GetData1 ()
{
const int CONCURRENCY_LEVEL = 15;
List<Task<Data>> dataTasks = new List<Task<Data>>();
for (int item = 0; item < TotalItems; item++)
{
dataTasks.Add(MyAyncMethod(State[item]));
}
int taskIndex = 0;
//Schedule tasks to concurency level (or all)
List<Task<Data>> runningTasks = new List<Task<Data>>();
while (taskIndex < CONCURRENCY_LEVEL && taskIndex < dataTasks.Count)
{
runningTasks.Add(dataTasks[taskIndex]);
taskIndex++;
}
//Start tasks and wait for them to finish
while (runningTasks.Count > 0)
{
Task<Data> dataTask = await Task.WhenAny(runningTasks);
runningTasks.Remove(dataTask);
myData = await dataTask;
//Schedule next concurrent task
if (taskIndex < dataTasks.Count)
{
runningTasks.Add(dataTasks[taskIndex]);
taskIndex++;
}
}
Task.WaitAll(dataTasks.ToArray()); //This probably isn't necessary
}
I am using await here but get an Error
The 'await' operator can only be used within an async method. Consider
marking this method with the 'async' modifier and changing its return
type to 'Task'
However, if I use the async modifier this will be an asynchronous operation. Therefore, if my call to GetData1 doesn't use the await operator won't control go to GetData2 on the first await, which is what I am trying to avoid? Is it possible to keep GetData1 as a synchronous method that calls an asynchronous method? Am I designing the Asynchronous method incorrectly? As you can see I'm quite confused.
This could be a duplicate of How to call asynchronous method from synchronous method in C#? However, I'm not sure how to apply the solutions provided there as I'm starting multiple tasks, want to WaitAny, do a little more processing for that task, then wait for all tasks to finish before handing control back to the caller.
UPDATE
Here is the solution I went with based on the answers below:
private static List<T> RetrievePageTaskScheduler<T>(
List<T> items,
List<WebPageState> state,
Func<WebPageState, Task<List<T>>> func)
{
int taskIndex = 0;
// Schedule tasks to concurency level (or all)
List<Task<List<T>>> runningTasks = new List<Task<List<T>>>();
while (taskIndex < CONCURRENCY_LEVEL_PER_PROCESSOR * Environment.ProcessorCount
&& taskIndex < state.Count)
{
runningTasks.Add(func(state[taskIndex]));
taskIndex++;
}
// Start tasks and wait for them to finish
while (runningTasks.Count > 0)
{
Task<List<T>> task = Task.WhenAny(runningTasks).Result;
runningTasks.Remove(task);
try
{
items.AddRange(task.Result);
}
catch (AggregateException ex)
{
/* Throwing this exception means that if one task fails
* don't process any more of them */
// https://stackoverflow.com/questions/8853693/pattern-for-implementing-sync-methods-in-terms-of-non-parallel-task-translating
System.Runtime.ExceptionServices.ExceptionDispatchInfo.Capture(
ex.Flatten().InnerExceptions.First()).Throw();
}
// Schedule next concurrent task
if (taskIndex < state.Count)
{
runningTasks.Add(func(state[taskIndex]));
taskIndex++;
}
}
return items;
}

Task<TResult>.Result (or Task.Wait() when there's no result) is similar to await, but is a synchronous operation. You should change GetData1() to use this. Here's the portion to change:
Task<Data> dataTask = Task.WhenAny(runningTasks).Result;
runningTasks.Remove(dataTask);
myData = gameTask.Result;

First, I recommend that your "internal" tasks not use Task.Run in their implementation. You should use an async method that does the CPU-bound portion synchronously.
Once your MyAsyncMethod is an async method that does some CPU-bound processing, then you can wrap it in a Task and use parallel processing as such:
internal static void GetData1()
{
// Start the tasks
var dataTasks = Enumerable.Range(0, TotalItems)
.Select(item => Task.Run(() => MyAyncMethod(State[item]))).ToList();
// Wait for them all to complete
Task.WaitAll(dataTasks);
}
Your concurrency limiting in your original code won't work at all, so I removed it for simpilicity. If you want to apply a limit, you can either use SemaphoreSlim or TPL Dataflow.

You can call the following:
GetData1().Wait();
GetData2().Wait();
GetData3().Wait();

Cancelling a Task is throwing an exception

From what I've read about Tasks, the following code should cancel the currently executing task without throwing an exception. I was under the impression that the whole point of task cancellation was to politely "ask" the task to stop without aborting threads.
The output from the following program is:
Dumping exception
[OperationCanceledException]
Cancelling and returning last calculated prime.
I am trying to avoid any exceptions when cancelling. How can I accomplish this?
void Main()
{
var cancellationToken = new CancellationTokenSource();
var task = new Task<int>(() => {
return CalculatePrime(cancellationToken.Token, 10000);
}, cancellationToken.Token);
try
{
task.Start();
Thread.Sleep(100);
cancellationToken.Cancel();
task.Wait(cancellationToken.Token);
}
catch (Exception e)
{
Console.WriteLine("Dumping exception");
e.Dump();
}
}
int CalculatePrime(CancellationToken cancelToken, object digits)
{
int factor;
int lastPrime = 0;
int c = (int)digits;
for (int num = 2; num < c; num++)
{
bool isprime = true;
factor = 0;
if (cancelToken.IsCancellationRequested)
{
Console.WriteLine ("Cancelling and returning last calculated prime.");
//cancelToken.ThrowIfCancellationRequested();
return lastPrime;
}
// see if num is evenly divisible
for (int i = 2; i <= num/2; i++)
{
if ((num % i) == 0)
{
// num is evenly divisible -- not prime
isprime = false;
factor = i;
}
}
if (isprime)
{
lastPrime = num;
}
}
return lastPrime;
}

I am trying to avoid any exceptions when cancelling.
You shouldn't do that.
Throwing OperationCanceledException is the idiomatic way that "the method you called was cancelled" is expressed in TPL. Don't fight against that - just expect it.
It's a good thing, because it means that when you've got multiple operations using the same cancellation token, you don't need to pepper your code at every level with checks to see whether or not the method you've just called has actually completed normally or whether it's returned due to cancellation. You could use CancellationToken.IsCancellationRequested everywhere, but it'll make your code a lot less elegant in the long run.
Note that there are two pieces of code in your example which are throwing an exception - one within the task itself:
cancelToken.ThrowIfCancellationRequested()
and one where you wait for the task to complete:
task.Wait(cancellationToken.Token);
I don't think you really want to be passing the cancellation token into the task.Wait call, to be honest... that allows other code to cancel your waiting. Given that you know you've just cancelled that token, it's pointless - it's bound to throw an exception, whether the task has actually noticed the cancellation yet or not. Options:
Use a different cancellation token (so that other code can cancel your wait independently)
Use a time-out
Just wait for as long as it takes

You are explicitly throwing an Exception on this line:
cancelToken.ThrowIfCancellationRequested();
If you want to gracefully exit the task, then you simply need to get rid of that line.
Typically people use this as a control mechanism to ensure the current processing gets aborted without potentially running any extra code. Also, there is no need to check for cancellation when calling ThrowIfCancellationRequested() since it is functionally equivalent to:
if (token.IsCancellationRequested)
throw new OperationCanceledException(token);
When using ThrowIfCancellationRequested() your Task might look more like this:
int CalculatePrime(CancellationToken cancelToken, object digits) {
try{
while(true){
cancelToken.ThrowIfCancellationRequested();
//Long operation here...
}
}
finally{
//Do some cleanup
}
}
Also, Task.Wait(CancellationToken) will throw an exception if the token was cancelled. To use this method, you will need to wrap your Wait call in a Try...Catch block.
MSDN: How to Cancel a Task

Some of the above answers read as if ThrowIfCancellationRequested() would be an option. It is not in this case, because you won't get your resulting last prime. The idiomatic way that "the method you called was cancelled" is defined for cases when canceling means throwing away any (intermediate) results. If your definition of cancelling is "stop computation and return the last intermediate result" you already left that way.
Discussing the benefits especially in terms of runtime is also quite misleading:
The implemented algorithm sucks at runtime. Even a highly optimized cancellation will not do any good.
The easiest optimization would be to unroll this loop and skip some unneccessary cycles:
for(i=2; i <= num/2; i++) {
if((num % i) == 0) {
// num is evenly divisible -- not prime
isprime = false;
factor = i;
}
}
You can
save (num/2)-1 cycles for every even number, which is slightly less than 50% overall (unrolling),
save (num/2)-square_root_of(num) cycles for every prime (choose bound according to math of smallest prime factor),
save at least that much for every non-prime, expect much more savings, e.g. num = 999 finishes with 1 cycle instead of 499 (break, if answer is found) and
save another 50% of cycles, which is of course 25% overall (choose step according to math of primes, unrolling handles the special case 2).
That accounts to saving a guaranteed minimum of 75% (rough estimation: 90%) of cycles in the inner loop, just by replacing it with:
if ((num % 2) == 0) {
isprime = false;
factor = 2;
} else {
for(i=3; i <= (int)Math.sqrt(num); i+=2) {
if((num % i) == 0) {
// num is evenly divisible -- not prime
isprime = false;
factor = i;
break;
}
}
}
There are much faster algorithms (which I won't discuss because I'm far enough off-topic) but this optimization is quite easy and still proves my point:
Don't worry about micro-optimizing runtime when your algorithm is this far from optimal.

Another note about the benefit of using ThrowIfCancellationRequested rather than IsCancellationRequested: I've found that when needing to use ContinueWith with a continuation option of TaskContinuationOptions.OnlyOnCanceled, IsCancellationRequested will not cause the conditioned ContinueWith to fire. ThrowIfCancellationRequested, however, will set the Canceled condition of the task, causing the ContinueWith to fire.
Note: This is only true when the task is already running and not when the task is starting. This is why I added a Thread.Sleep() between the start and cancellation.
CancellationTokenSource cts = new CancellationTokenSource();
Task task1 = new Task(() => {
while(true){
if(cts.Token.IsCancellationRequested)
break;
}
}, cts.Token);
task1.ContinueWith((ant) => {
// Perform task1 post-cancellation logic.
// This will NOT fire when calling cst.Cancel().
}
Task task2 = new Task(() => {
while(true){
cts.Token.ThrowIfCancellationRequested();
}
}, cts.Token);
task2.ContinueWith((ant) => {
// Perform task2 post-cancellation logic.
// This will fire when calling cst.Cancel().
}
task1.Start();
task2.Start();
Thread.Sleep(3000);
cts.Cancel();

You have two things listening to the token, the calculate prime method and also the Task instance named task. The calculate prime method should return gracefully, but task gets cancelled while it is still running so it throws. When you construct task don't bother giving it the token.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.