Correct way to a-synchronize parallel tasks - c#

Currently we have this code which works fine:
Result result1 = null;
Result result2 = null;
var task1 = Task.Factory.StartNew(()=>
{
var records = DB.Read("..");
//Do A lot
result1 = Process(records);
});
var task2 = Task.Factory.StartNew(()=>
{
var records = DB.Read(".....");
//Do A lot
result2 = Process(records);
});
Task.WaitAll(task1, task2);
var result = Combine(result1, result2);
Now we would like to use async counterparts of DB Functions and we are using this new pattern:
Result result1 = null;
Result result2 = null;
var task1 = await Task.Factory.StartNew( async ()=>
{
var records = await DB.ReadAsync("..");
//Do A lot
result1 = Process(records);
});
var task2 = await Task.Factory.StartNew(async ()=>
{
var records = await DB.ReadAsync(".....");
//Do A lot
result2 = Process(records);
});
Task.WaitAll(task1, task2);
var result = Combine(result1, result2);
After we switched to async we started observing abnormal behavior. So I wonder: is this the correct pattern for parallelizing async calls?

Task.Factory.StartNew is a pre-async API. You should be using Task.Run which was designed with async-await in mind:
var task1 = Task.Run(async () =>
{
var records = await DB.ReadAsync("..");
//Do A lot
result1 = Process(records);
});
The issue is that an async lambda returns a Task so Task.Factory.StartNew returns a Task<Task> (the outer one because Task.Factory.StartNew returns a Task and the inner one which is the result of the async lambda).
This means that when you wait on task1 and task2 you aren't really waiting for the entire operation, just the synchronous part of it.
You can fix that by using Task.Unwrap on the returned Task<Task>:
Task<Task> task1 = Task.Factory.StartNew(async () =>
{
var records = await DB.ReadAsync("..");
//Do A lot
result1 = Process(records);
});
Task actualTask1 = task1.Unwrap();
await actualTask1;
But Task.Run does that implicitly for you.
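The difference can be seen in isolation with a minimal sketch. The class and method names are made up for illustration, and Task.Delay stands in for DB.ReadAsync. Each method reports whether the work had finished by the time the awaited task completed; with StartNew it is expected to still be pending, because only the outer task was awaited.

```csharp
using System;
using System.Threading.Tasks;

public static class UnwrapDemo
{
    // Hypothetical stand-in for the work in the question:
    // Task.Delay plays the role of DB.ReadAsync.
    private static async Task WorkAsync(Action onDone)
    {
        await Task.Delay(200);
        onDone();
    }

    // Returns whether the work had finished when the awaited task completed.
    public static async Task<bool> WithStartNewAsync()
    {
        bool done = false;
        // The outer Task<Task> completes as soon as the lambda returns,
        // which happens when WorkAsync reaches its first await.
        Task<Task> outer = Task.Factory.StartNew(() => WorkAsync(() => done = true));
        Task inner = await outer;  // waits for the outer task only
        bool observed = done;      // the 200 ms delay is normally still pending here
        await inner;               // wait for the real work before returning
        return observed;
    }

    public static async Task<bool> WithTaskRunAsync()
    {
        bool done = false;
        // Task.Run unwraps the inner task, so this await covers the whole operation.
        await Task.Run(() => WorkAsync(() => done = true));
        return done;
    }

    public static async Task Main()
    {
        Console.WriteLine($"StartNew saw finished work: {await WithStartNewAsync()}");
        Console.WriteLine($"Task.Run saw finished work: {await WithTaskRunAsync()}");
    }
}
```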
As a side note, you should realize that you don't need Task.Run to execute these operations concurrently. You can do that just by calling these methods and awaiting the results together with Task.WhenAll:
async Task MainAsync()
{
var task1 = FooAsync();
var task2 = BarAsync();
await Task.WhenAll(task1, task2);
var result = Combine(task1.Result, task2.Result);
}
async Task<Result> FooAsync()
{
var records = await DB.ReadAsync("..");
//Do A lot
return Process(records);
}
async Task<Result> BarAsync()
{
var records = await DB.ReadAsync(".....");
//Do A lot
return Process(records);
}
You only need Task.Run if you need to offload even the synchronous parts of these methods (the part before the first await) to the ThreadPool.

Using .WaitAll is not asynchronous programming, because you are blocking the current thread while waiting. Also, you don't call .Unwrap, which is why you only wait for the async lambda to be started (that is, to run up to its first await), not for the async lambda itself to complete.
Task.Run can unwrap the async lambda for you. But there's a simpler and cleaner way.
var task1 = DB.ReadAsync("..").ContinueWith(task => {
//Do A lot
return Process(task.Result);
}, TaskScheduler.Default);
var task2 = DB.ReadAsync("..").ContinueWith(task => {
//Do A lot
return Process(task.Result);
}, TaskScheduler.Default);
var result = Combine(await task1, await task2);
This way you get the result as soon as it is ready, and you don't need any additional tasks or variables.
Please note that ContinueWith is a tricky function. By default it schedules the continuation on TaskScheduler.Current, which is the scheduler of the task that is currently executing, or TaskScheduler.Default (the thread pool scheduler) when no task is executing. So it's safer to always specify the scheduler explicitly when calling this function.
Also, to keep the example clear I didn't include error checking, even though DB.ReadAsync can complete with an error. That is easy to handle yourself.

Task.Factory.StartNew starts a new task, that is, another independent unit of execution. So the simplest way to deal with this may look like:
var task1 = Task.Factory.StartNew(()=> //NO AWAIT
{
var records = DB.Read("....."); //NO ASYNC
//Do A lot
result1 = Process(records);
});
... another task definition
Task.WaitAll(task1, task2);
Read and process sequentially in one task, as you have data dependency.

Related

How to run multiple methods in parallel in ASP.NET

I have 8 methods in an ASP.NET Console App, like Fun1(), Fun2(), Fun3() ... and so on. In my console application I have called all these methods sequentially. But now the requirement is that I have to do this using parallel programming concepts. I have read about task and threading concepts in Java, but I am completely new to .NET parallel programming.
Here is the flow of methods I need in my console app.
As you can see in the diagram, Task1 and Task2 should run in parallel, and Task3 will only start after the previous two have completed.
The functions inside each task, for example Fun3 and Fun4 for the Task1, should run sequentially, the one after the other.
Can anyone please help me out?
One way to solve this is, by using WhenAll.
To take an example, I have created X number of methods with the name FuncX() like this:
async static Task<int> FuncX()
{
await Task.Delay(500);
var result = await Task.FromResult(1);
return result;
}
In this case, we have Func1, Func3, Func4, Func5, and Func6.
So we call methods and pass them to a list of Task.
var task1 = new List<Task<int>>();
task1.Add(Func3());
task1.Add(Func4());
var task2 = new List<Task<int>>();
task2.Add(Func1());
task2.Add(Func5());
task2.Add(Func6());
You have 2 options to get the result:
// option1
var eachFunctionIsDoneWithAwait1 = await Task.WhenAll(task1);
var eachFunctionIsDoneWithAwait2 = await Task.WhenAll(task2);
var sum1 = eachFunctionIsDoneWithAwait1.Sum() + eachFunctionIsDoneWithAwait2.Sum();
Console.WriteLine(sum1);
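// option2
var task3 = new List<List<Task<int>>>();
task3.Add(task1);
task3.Add(task2);
var sum2 = 0;
// A plain foreach is used here: List<T>.ForEach with an async lambda
// is async void, so the sum would be printed before the awaits finish
foreach (var group in task3)
{
var r = await Task.WhenAll(group);
sum2 += r.Sum();
}
Console.WriteLine(sum2);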
This is just example for inspiration, you can change it and do it the way you want.
Here is how you could create the tasks according to the diagram, using the Task.Run method:
Task task1 = Task.Run(() =>
{
Fun3();
Fun4();
});
Task task2 = Task.Run(() =>
{
Fun1();
Fun5();
Fun6();
});
Task task3 = Task.Run(async () =>
{
await Task.WhenAll(task1, task2);
Fun7();
Fun8();
});
The Task.Run invokes the delegate on the ThreadPool, not on a dedicated thread. If you have some reason to create a dedicated thread for each task, you could use the advanced Task.Factory.StartNew method with the TaskCreationOptions.LongRunning argument, as shown here.
It should be noted that the above implementation does not behave optimally in case of exceptions. If Fun3() fails immediately, the optimal behavior would be to stop the execution of task2 as soon as possible. Instead, this implementation will execute all three functions Fun1, Fun5 and Fun6 before propagating the error. You could fix this minor flaw by creating a CancellationTokenSource and invoking Token.ThrowIfCancellationRequested after each function, but it's going to be messy.
Another issue is that if both task1 and task2 fail, only the exception of task1 is going to be propagated through task3. Solving this issue is not trivial.
Update: Here is one way to solve the issue of partial exception propagation:
Task task3 = Task.WhenAll(task1, task2).ContinueWith(t =>
{
if (t.IsFaulted)
{
TaskCompletionSource tcs = new();
tcs.SetException(t.Exception.InnerExceptions);
return tcs.Task;
}
if (t.IsCanceled)
{
TaskCompletionSource tcs = new();
tcs.SetCanceled(new TaskCanceledException(t).CancellationToken);
return tcs.Task;
}
Debug.Assert(t.IsCompletedSuccessfully);
Fun7();
Fun8();
return Task.CompletedTask;
}, default, TaskContinuationOptions.DenyChildAttach, TaskScheduler.Default)
.Unwrap();
In case both task1 and task2 fail, the task3 will propagate the exceptions of both tasks.
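The underlying behavior can be reproduced on its own. The following is a minimal sketch with made-up names, using Task.FromException to create two already-failed tasks. Awaiting the Task.WhenAll task rethrows only one exception, while the Exception property of that task holds an AggregateException with both.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllExceptionsDemo
{
    // Returns the messages of every exception stored on the combined task.
    public static async Task<string[]> CollectAllFailuresAsync()
    {
        // Two already-faulted tasks standing in for task1 and task2.
        Task first = Task.FromException(new InvalidOperationException("task1 failed"));
        Task second = Task.FromException(new InvalidOperationException("task2 failed"));

        Task all = Task.WhenAll(first, second);
        try
        {
            await all; // await rethrows a single exception only
        }
        catch (InvalidOperationException)
        {
            // Swallowed on purpose: the full set is read from all.Exception below.
        }

        // all.Exception is an AggregateException that contains both failures.
        return all.Exception.InnerExceptions.Select(e => e.Message).ToArray();
    }

    public static async Task Main()
    {
        foreach (string message in await CollectAllFailuresAsync())
        {
            Console.WriteLine(message);
        }
    }
}
```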

Is there a way to wait for all tasks until a specific result is true, and then cancel the rest?

In my C# console application, I'm trying to run multiple tasks that do various data checks simultaneously. If one of the tasks returns true I should stop the other tasks, since I have my actionable result. It's also very possible that none of the functions return true.
I have the code to run the tasks together (I think), I'm just having trouble getting to the finish line:
Task task1 = Task.Run(() => Task1(stoppingToken));
Task task2 = Task.Run(() => Task2(stoppingToken));
Task task3 = Task.Run(() => Task3(stoppingToken));
Task task4 = Task.Run(() => Task4(stoppingToken));
Task task5 = Task.Run(() => Task5(stoppingToken));
Task task6 = Task.Run(() => Task6(stoppingToken));
Task.WaitAll(task1, task2, task3, task4, task5, task6);
This is a little different than the answer in the linked question where the desired result is known (timeout value). I'm waiting for any of these tasks to possibly return true and then cancel the remaining tasks if they are still running
Task.WhenAny with cancellation of the non completed tasks and timeout
Here's a solution based on continuation tasks. The idea is to append continuation tasks to each of the original (provided) tasks, and check the result there. If it's a match, the completion source will be set with a result (if there's no match, the result won't be set at all).
Then, the code will wait for whatever happens first: either all the continuation tasks complete, or the task completion result will be set. Either way, we'll be ready to check the result of the task associated with task completion source (that's why we wait for the continuation tasks to complete, not the original tasks) and if it's set, it's pretty much an indication that we have a match (the additional check at the end is a little paranoid, but better safe than sorry I guess... :D)
public static async Task<bool> WhenAnyHasResult<T>(Predicate<T> isExpectedResult, params Task<T>[] tasks)
{
const TaskContinuationOptions continuationTaskFlags = TaskContinuationOptions.ExecuteSynchronously | TaskContinuationOptions.OnlyOnRanToCompletion | TaskContinuationOptions.AttachedToParent;
// Prepare TaskCompletionSource to be set only when one of the provided tasks
// completes with expected result
var tcs = new TaskCompletionSource<T>();
// For every provided task, attach a continuation task that fires
// once the original task was completed
var taskContinuations = tasks.Select(task =>
{
return task.ContinueWith(x =>
{
var taskResult = x.Result;
if (isExpectedResult(taskResult))
{
// TrySetResult avoids an exception if a second task also matches
tcs.TrySetResult(taskResult);
}
},
continuationTaskFlags);
});
// We either wait for all the continuation tasks to be completed
// (it's most likely an indication that none of the provided tasks completed with the expected result)
// or for the TCS task to complete (which means a match was found)
await Task.WhenAny(Task.WhenAll(taskContinuations), tcs.Task);
// If the task from TCS has run to completion, it means the result has been set from
// the continuation task attached to one of the tasks provided in the arguments
var completionTask = tcs.Task;
if (completionTask.IsCompleted)
{
// We will check once more to make sure the result is set as expected
// and return this as our outcome
var tcsResult = completionTask.Result;
return isExpectedResult(tcsResult);
}
// TCS result was never set, which means we did not find a task matching the expected result.
tcs.SetCanceled();
return false;
}
Now, the usage will be as follows:
static async Task ExampleWithBooleans()
{
Console.WriteLine("Example with booleans");
var task1 = SampleTask(3000, true);
var task2 = SampleTask(5000, false);
var finalResult = await TaskUtils.WhenAnyHasResult(result => result == true, task1, task2);
// go ahead and cancel your cancellation token here
Console.WriteLine("Final result: " + finalResult);
Debug.Assert(finalResult == true);
Console.WriteLine();
}
What's nice about putting it into a generic method, is that it works with any type, not only booleans, as a result of the original task.
Assuming your tasks return bool you can do something like this:
CancellationTokenSource source = new CancellationTokenSource();
CancellationToken stoppingToken = source.Token;
Task<bool> task1 = Task.Run(() => Task1(stoppingToken));
....
var tasks = new List<Task<bool>>
{
task1, task2, task3, ...
};
bool taskResult = false;
do
{
var finished = await Task.WhenAny(tasks);
taskResult = finished.Result;
tasks.Remove(finished);
} while (tasks.Any() && !taskResult);
source.Cancel();
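As a self-contained illustration of the loop above, here is a minimal sketch. CheckAsync is a hypothetical check that waits and then returns a fixed answer, and the delays are arbitrary. After the first true result the remaining tasks are cancelled and then observed, so their cancellation exceptions are not left unhandled.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class FirstTrueDemo
{
    // Hypothetical check: waits, then reports the given answer; honors cancellation.
    private static async Task<bool> CheckAsync(int delayMs, bool answer, CancellationToken token)
    {
        await Task.Delay(delayMs, token);
        return answer;
    }

    public static async Task<bool> AnyTrueAsync()
    {
        using var cts = new CancellationTokenSource();
        var tasks = new List<Task<bool>>
        {
            CheckAsync(100, false, cts.Token),
            CheckAsync(200, true, cts.Token),
            CheckAsync(5000, false, cts.Token),
        };

        bool found = false;
        while (tasks.Count > 0 && !found)
        {
            Task<bool> finished = await Task.WhenAny(tasks);
            tasks.Remove(finished);
            found = await finished;
        }

        // Stop whatever is still running, then observe the cancelled tasks.
        cts.Cancel();
        try
        {
            await Task.WhenAll(tasks);
        }
        catch (OperationCanceledException)
        {
            // Expected for the tasks that were cancelled.
        }
        return found;
    }

    public static async Task Main()
    {
        Console.WriteLine($"Any check returned true: {await AnyTrueAsync()}");
    }
}
```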
You could use an asynchronous method that wraps a Task<bool> to another Task<bool>, and cancels a CancellationTokenSource if the result of the input task is true. In the example below this method is the IfTrueCancel, and it is implemented as local function. This way it captures the CancellationTokenSource, and so you don't have to pass it as argument on every call:
var cts = new CancellationTokenSource();
var stoppingToken = cts.Token;
var task1 = IfTrueCancel(Task.Run(() => Task1(stoppingToken)));
var task2 = IfTrueCancel(Task.Run(() => Task2(stoppingToken)));
var task3 = IfTrueCancel(Task.Run(() => Task3(stoppingToken)));
var task4 = IfTrueCancel(Task.Run(() => Task4(stoppingToken)));
var task5 = IfTrueCancel(Task.Run(() => Task5(stoppingToken)));
var task6 = IfTrueCancel(Task.Run(() => Task6(stoppingToken)));
Task.WaitAll(task1, task2, task3, task4, task5, task6);
async Task<bool> IfTrueCancel(Task<bool> task)
{
bool result = await task.ConfigureAwait(false);
if (result) cts.Cancel();
return result;
}
Another, quite different, solution to this problem could be to use PLINQ instead of explicitly created Tasks. PLINQ requires an IEnumerable of something in order to do parallel work on it, and in your case this something is the Task1, Task2 etc functions that you want to invoke. You could put them in an array of Func<CancellationToken, bool>, and solve the problem this way:
var functions = new Func<CancellationToken, bool>[]
{
Task1, Task2, Task3, Task4, Task5, Task6
};
bool success = functions
.AsParallel()
.WithDegreeOfParallelism(4)
.Select(function =>
{
try
{
bool result = function(stoppingToken);
if (result) cts.Cancel();
return result;
}
catch (OperationCanceledException)
{
return false;
}
})
.Any(result => result);
The advantage of this approach is that you can configure the degree of parallelism, and you don't have to rely on the ThreadPool availability for limiting the concurrency of the whole operation. The disadvantage is that all functions should have the same signature. You could overcome this disadvantage by declaring the functions as lambda expressions like this:
var functions = new Func<CancellationToken, bool>[]
{
ct => Task1(arg1, ct),
ct => Task2(arg1, arg2, ct),
ct => Task3(ct),
ct => Task4(arg1, arg2, arg3, ct),
ct => Task5(arg1, ct),
ct => Task6(ct)
};

Assign property, when Task completes

I want to ask you: I have code:
var task1 = await _connectionService.ValidateUriAsync(uri1);
OutputResult("ss", task1);
var task2 = await _connectionService.ValidateUriAsync(uri2);
OutputResult("bb", task2);
var task3 = await _connectionService.ValidateUriAsync(uri3);
OutputResult("cc", task3);
Right now I wait until each task finishes and then output its result. I would like to run all tasks independently (I know how to do that). What I don't know is how to output the result of each task as soon as that task completes. The output should be, for example, "task1 failed" or "task1 success".
I tried this solution, but I have to check which task completed and then its result (true/false). It is complex: if I had 100 tasks, I could not write 100 conditions.
var tasks = new[] {task1, task2, task3};
var process = tasks.Select(async task =>
{
var result = await task;
if(task == task1)assign property
});
await Task.WhenAll(process);
EDIT:
Here is ValidateUriAsync func:
public async Task<bool> ValidateUriAsync(Uri uri)
{
try
{
var request = WebRequest.CreateHttp(uri);
var result = await request.GetResponseAsync();
return true;
}
catch (Exception e)
{
return false;
}
}
when some task is completed I need to output result for each task.
Don't think about this in terms of "reacting to tasks as they complete". Instead, think of your ValidateUriAsync method as an "operation", and what you want is to create a new higher-level "operation" that is "validate and assign".
With that mindset, the solution is more clear - introduce a new async method for the new operation:
private async Task ValidateAndOutputResult(Uri uri, string name)
{
var result = await _connectionService.ValidateUriAsync(uri);
OutputResult(name, result);
}
Now you can call the higher-level method, and use Task.WhenAll:
var tasks = new[]
{
ValidateAndOutputResult(uri1, "ss"),
ValidateAndOutputResult(uri2, "bb"),
ValidateAndOutputResult(uri3, "cc"),
};
await Task.WhenAll(tasks);

Task.WhenAll in PCL

I'm trying to run a few async tasks concurrently inside my Portable Class Library, but the WhenAll method doesn't appear to be supported.
My current workaround is to start each task and then await each of them:
var task1 = myService.GetData(source1);
var task2 = myService.GetData(source2);
var task3 = myService.GetData(source3);
// Now everything's started, we can await them
var result1 = await task1;
var result2 = await task2;
var result3 = await task3;
Is there something I'm missing? Do I need to make do with the workaround?
Is there something I'm missing?
Yes: Microsoft.Bcl.Async.
Once you install that package, then you can use TaskEx.WhenAll as a substitute for Task.WhenAll:
var task1 = myService.GetData(source1);
var task2 = myService.GetData(source2);
var task3 = myService.GetData(source3);
// Now everything's started, we can await them
var results = await TaskEx.WhenAll(task1, task2, task3);
P.S. Consider using the term parallel for (CPU-bound) parallel processing, and the term concurrent for doing more than one thing at a time. In this case, the (I/O-bound) tasks are concurrent but not parallel.

Concurrent execution of async methods

Using the async/await model, I have a method which makes 3 different calls to a web service and then returns the union of the results.
var result1 = await myService.GetData(source1);
var result2 = await myService.GetData(source2);
var result3 = await myService.GetData(source3);
allResults = Union(result1, result2, result3);
Using typical await, these 3 calls will execute synchronously wrt each other. How would I go about letting them execute concurrently and join the results as they complete?
How would I go about letting them execute in parallel and join the results as they complete?
The simplest approach is just to create all the tasks and then await them:
var task1 = myService.GetData(source1);
var task2 = myService.GetData(source2);
var task3 = myService.GetData(source3);
// Now everything's started, we can await them
var result1 = await task1;
var result2 = await task2;
var result3 = await task3;
You might also consider Task.WhenAll. You need to consider the possibility that more than one task will fail... with the above code you wouldn't observe the failure of task3 for example, if task2 fails - because your async method will propagate the exception from task2 before you await task3.
I'm not suggesting a particular strategy here, because it will depend on your exact scenario. You may only care about success/failure and logging one cause of failure, in which case the above code is fine. Otherwise, you could potentially attach continuations to the original tasks to log all exceptions, for example.
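For comparison, a Task.WhenAll version might look like the sketch below. GetDataAsync is a hypothetical stand-in for myService.GetData that returns a small array after a delay, and the union is approximated by concatenating the arrays. Task.WhenAll returns the results in the same order as the tasks passed to it, regardless of which one finishes first.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ConcurrentCallsDemo
{
    // Hypothetical stand-in for myService.GetData(source).
    private static async Task<int[]> GetDataAsync(int source)
    {
        await Task.Delay(100);
        return new[] { source, source * 10 };
    }

    public static async Task<int[]> GetAllAsync()
    {
        // Start all three calls before awaiting any of them.
        Task<int[]> task1 = GetDataAsync(1);
        Task<int[]> task2 = GetDataAsync(2);
        Task<int[]> task3 = GetDataAsync(3);

        // One await for all three; results keep the order of the arguments.
        int[][] results = await Task.WhenAll(task1, task2, task3);
        return results.SelectMany(r => r).ToArray();
    }

    public static async Task Main()
    {
        Console.WriteLine(string.Join(", ", await GetAllAsync()));
    }
}
```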
You could use the Parallel class (this assumes a synchronous GetData that returns the data directly rather than a Task):
Parallel.Invoke(
() => result1 = myService.GetData(source1),
() => result2 = myService.GetData(source2),
() => result3 = myService.GetData(source3)
);
For more information visit: http://msdn.microsoft.com/en-us/library/system.threading.tasks.parallel(v=vs.110).aspx
As a more generic solution you can use the api I wrote below, it also allows you to define a real time throttling mechanism of max number of concurrent async requests.
The inputEnumerable will be the enumerable of your source and asyncProcessor is your async delegate (myservice.GetData in your example).
If the asyncProcessor - myservice.GetData - returns void or just a Task without any type, then you can simply update the api to reflect that. (just replace all Task<> references to Task)
public static async Task<TOut[]> ForEachAsync<TIn, TOut>(
IEnumerable<TIn> inputEnumerable,
Func<TIn, Task<TOut>> asyncProcessor,
int? maxDegreeOfParallelism = null)
{
IEnumerable<Task<TOut>> tasks;
if (maxDegreeOfParallelism != null)
{
SemaphoreSlim throttler = new SemaphoreSlim(maxDegreeOfParallelism.Value, maxDegreeOfParallelism.Value);
tasks = inputEnumerable.Select(
async input =>
{
await throttler.WaitAsync();
try
{
return await asyncProcessor(input).ConfigureAwait(false);
}
finally
{
throttler.Release();
}
});
}
else
{
tasks = inputEnumerable.Select(asyncProcessor);
}
// Return the collected results so the method satisfies its Task<TOut[]> signature
return await Task.WhenAll(tasks);
}
