I am creating a task processor that uses TPL Dataflow. It follows a producer-consumer model in which the producer occasionally produces items to be processed and the consumers wait for new items to arrive. Here is my code:
async Task Main()
{
var runner = new Runner();
CancellationTokenSource cts = new CancellationTokenSource();
Task runnerTask = runner.ExecuteAsync(cts.Token);
await Task.WhenAll(runnerTask);
}
public class Runner
{
public async Task ExecuteAsync(CancellationToken cancellationToken) {
var random = new Random();
ActionMeshProcessor processor = new ActionMeshProcessor();
await processor.Init(cancellationToken);
while (!cancellationToken.IsCancellationRequested)
{
await Task.Delay(TimeSpan.FromSeconds(1)); // wait before enqueuing more
int[] items = GetItems(random.Next(3, 7));
await processor.ProcessBlockAsync(items);
}
}
private int[] GetItems(int count)
{
Random randNum = new Random();
int[] arr = new int[count];
for (int i = 0; i < count; i++)
{
arr[i] = randNum.Next(10, 20);
}
return arr;
}
}
public class ActionMeshProcessor
{
private TransformBlock<int, int> Transformer { get; set; }
private ActionBlock<int> CompletionAnnouncer { get; set; }
public async Task Init(CancellationToken cancellationToken)
{
var options = new ExecutionDataflowBlockOptions
{
CancellationToken = cancellationToken,
MaxDegreeOfParallelism = 5,
BoundedCapacity = 5
};
this.Transformer = new TransformBlock<int, int>(async input => {
await Task.Delay(TimeSpan.FromSeconds(1)); // doing something complex here!
if (input > 15)
{
throw new Exception($"I can't handle this number: {input}");
}
return input + 1;
}, options);
this.CompletionAnnouncer = new ActionBlock<int>(async input =>
{
Console.WriteLine($"Completed: {input}");
await Task.FromResult(0);
}, options);
this.Transformer.LinkTo(this.CompletionAnnouncer);
await Task.FromResult(0); // what do I await here?
}
public async Task ProcessBlockAsync(int[] arr)
{
foreach (var item in arr)
{
await this.Transformer.SendAsync(item); // await if there are no free slots
}
}
}
I added a condition check above to throw an exception to mimic an exceptional case.
Here are my questions:
What is the best way I can handle exceptions in the above mesh without bringing the whole mesh down?
Is there a better way to initialize/start/continue a never ending DataFlow mesh?
Where do I await Completion?
I have looked in to this similar question
Exceptions
There's nothing asynchronous in your Init, so it could be a standard synchronous constructor. You can handle exceptions without taking the mesh down by putting a simple try/catch in the lambda you provide to the block. You can then deal with the failed item either by filtering it out of the mesh or by ignoring it in the following blocks. Below is an example of filtering. For the simple case of an int, you can use an int? and filter out any value that is null, or you could use some other sentinel value if you prefer. If you're actually passing around a reference type, you can either push out null or mark the data item as dirty in a way that the predicate on your link can examine.
public class ActionMeshProcessor {
private TransformBlock<int, int?> Transformer { get; set; }
private ActionBlock<int?> CompletionAnnouncer { get; set; }
public ActionMeshProcessor(CancellationToken cancellationToken) {
var options = new ExecutionDataflowBlockOptions {
CancellationToken = cancellationToken,
MaxDegreeOfParallelism = 5,
BoundedCapacity = 5
};
this.Transformer = new TransformBlock<int, int?>(async input => {
try {
await Task.Delay(TimeSpan.FromSeconds(1)); // doing something complex here!
if (input > 15) {
throw new Exception($"I can't handle this number: {input}");
}
return input + 1;
} catch (Exception ex) {
return null;
}
}, options);
this.CompletionAnnouncer = new ActionBlock<int?>(async input =>
{
if (input == null) throw new ArgumentNullException("input");
Console.WriteLine($"Completed: {input}");
await Task.FromResult(0);
}, options);
//Filtering
this.Transformer.LinkTo(this.CompletionAnnouncer, x => x != null);
this.Transformer.LinkTo(DataflowBlock.NullTarget<int?>());
}
public async Task ProcessBlockAsync(int[] arr) {
foreach (var item in arr) {
await this.Transformer.SendAsync(item); // await if there are no free slots
}
}
}
Completion
You can expose Complete() and Completion from your processor and use those to await completion when your app shuts down, assuming that's the only time you'd shut down the mesh. Also, make sure you propagate completion through your links properly.
//Filtering
this.Transformer.LinkTo(this.CompletionAnnouncer, new DataflowLinkOptions() { PropagateCompletion = true }, x => x != null);
this.Transformer.LinkTo(DataflowBlock.NullTarget<int?>());
}
public void Complete() {
Transformer.Complete();
}
public Task Completion {
get { return CompletionAnnouncer.Completion; }
}
Then, based on your sample the most likely place for completion is outside the loop driving your processing:
public async Task ExecuteAsync(CancellationToken cancellationToken) {
var random = new Random();
// the processor now takes the token in its constructor; there is no Init to await
var processor = new ActionMeshProcessor(cancellationToken);
while (!cancellationToken.IsCancellationRequested) {
await Task.Delay(TimeSpan.FromSeconds(1)); // wait before enqueuing more
int[] items = GetItems(random.Next(3, 7));
await processor.ProcessBlockAsync(items);
}
// assuming you don't intend to throw from cancellation
processor.Complete();
await processor.Completion; // Completion is a property, not a method
}
Hello I have the following problem:
I want to perform something similar to a transaction. I want to execute a number of async operations after I receive an external trigger. Therefore I am using a TaskCompletionSource that gets set in a method representing the trigger: TriggerTransaction. This trigger method gets called in Main on the thread pool when I press a specific console key.
After I press the A key, TriggerTransaction gets executed and the TaskCompletionSource-s get set. Still, the main thread does not compute the sum of the two awaited tasks.
class Program
{
public static Task<Task<int>> TransactionOperation1()
{
TaskCompletionSource<Task<int>> tcs = new TaskCompletionSource<Task<int>>();
tasks.Add(tcs);
Task<Task<int>> result = tcs.Task;
return result;
}
public static Task<Task<int>> TransactionOperation2()
{
TaskCompletionSource<Task<int>> tcs = new TaskCompletionSource<Task<int>>();
tasks.Add(tcs);
Task<Task<int>> result = tcs.Task;
return result;
}
public static async Task<int> ExecuteTransactionOnDB()
{
await Task.Delay(1000);
return 5;
}
public static async Task TriggerTransaction()
{
int value = await ExecuteTransactionOnDB();
foreach (var item in tasks)
{
item.SetResult(value);
}
}
public static List<dynamic> tasks = new List<dynamic>();
static async Task Main(string[] args)
{
Task<Task<int>> a = TransactionOperation1();
Task<Task<int>> b = TransactionOperation2();
Task.Run(async() =>
{
while (Console.ReadKey().Key != ConsoleKey.A) ;
await TriggerTransaction();
});
if (!File.Exists("D:\\data.txt"))
{
File.Create("D:\\data.txt");
}
using(FileStream stream=new FileStream("data.txt",FileMode.Append,FileAccess.Write))
{
int sum=await await a + await await b;//thread wont pass this line when tasks are set.
ReadOnlyMemory<byte> bytes = Encoding.UTF8.GetBytes(sum);
stream.Write(bytes.ToArray());
}
Console.WriteLine(await await a + await await b);
}
}
}
P.S. If you are wondering why I used a List<dynamic> to store the TaskCompletionSource-s, it's because the TransactionOperations will differ in return type. Some of them will return int, others string, bool, etc.
For a better understanding I made a schema.
As you will see, there are:
- A list where I want to store the TCS-es
- Some calls that are completed only after the external trigger is set (the transaction has been executed)
As you can see in the calls, all have different return types.
Why would you need a Task<Task<int>>? Simply Task<int> is enough, and accordingly, TaskCompletionSource<int>. And you also get rid of an awkward await await ..., which isn't required in your case either.
Note that I also added Close() to the stream returned by File.Create().
Here is a working version of the program:
class Program
{
public static Task<int> TransactionOperation1()
{
TaskCompletionSource<int> tcs = new TaskCompletionSource<int>();
tasks.Add(tcs);
return tcs.Task;
}
public static Task<int> TransactionOperation2()
{
TaskCompletionSource<int> tcs = new TaskCompletionSource<int>();
tasks.Add(tcs);
return tcs.Task;
}
public static async Task<int> ExecuteTransactionOnDB()
{
await Task.Delay(1000);
return 5;
}
public static async Task TriggerTransaction()
{
int value = await ExecuteTransactionOnDB();
foreach (var item in tasks)
{
item.SetResult(value);
}
}
public static List<dynamic> tasks = new List<dynamic>();
static async Task Main(string[] args)
{
Task<int> a = TransactionOperation1();
Task<int> b = TransactionOperation2();
Task input = Task.Run(async () => {
while (Console.ReadKey().Key != ConsoleKey.A);
await TriggerTransaction();
});
if (!File.Exists("C:\\temp\\data.txt"))
{
File.Create("C:\\temp\\data.txt").Close();
}
using (FileStream stream = new FileStream("C:\\temp\\data.txt", FileMode.Append, FileAccess.Write))
{
int sum = await a + await b; // now it works ok
var bytes = Encoding.UTF8.GetBytes(sum.ToString());
stream.Write(bytes);
}
Console.WriteLine(await a + await b);
}
}
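One likely reason the original version never got past the await: with a TaskCompletionSource<Task<int>> stored as dynamic, the call item.SetResult(value) passes an int where a Task<int> is expected. The runtime binder rejects that call, and because the Task.Run that triggers it is never awaited, the exception goes unobserved and the tasks are never completed. The sketch below reproduces just that binding failure; the class name DynamicSetResultDemo is a hypothetical name used only for illustration.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CSharp.RuntimeBinder;

public static class DynamicSetResultDemo
{
    // Returns true if the dynamic SetResult call is rejected at runtime.
    public static bool Run()
    {
        // Same shape as the question: the TCS expects a Task<int>, not an int.
        dynamic tcs = new TaskCompletionSource<Task<int>>();
        try
        {
            tcs.SetResult(5); // resolved at runtime; int does not convert to Task<int>
            return false;
        }
        catch (RuntimeBinderException)
        {
            // In the question this exception was thrown inside an un-awaited
            // Task.Run, so nothing ever reported it.
            return true;
        }
    }
}
```

Switching to TaskCompletionSource<int>, as in the working version above, makes the argument type match, so the dynamic call binds and the awaiting code resumes.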
Check out the modified version of the code. It produces the expected result by completing the tasks created via TaskCompletionSource. I have also made the code generic, so that you don't need the dynamic type and can define the data type at compile time:
static async Task Main(string[] args)
{
var a = Program<int>.TransactionOperation1();
var b = Program<int>.TransactionOperation2();
await Task.Run(async() =>
{
Console.ReadLine();
await Program<int>.TriggerTransaction(5);
});
if (!File.Exists("D:\\data.txt"))
{
File.Create("D:\\data.txt").Close(); // close the handle so the FileStream below can open the file
}
using (FileStream stream = new FileStream("D:\\data.txt", FileMode.Append, FileAccess.Write))
{
int sum = await a + await b; // both tasks were already completed by the trigger above
var bytes = Encoding.UTF8.GetBytes(sum.ToString());
stream.Write(bytes, 0, bytes.Length);
}
Console.WriteLine(await a + await b);
}
class Program<T>
{
public static Task<T> TransactionOperation1()
{
var tcs = new TaskCompletionSource<T>();
tasks.Add(tcs);
return tcs.Task;
}
public static Task<T> TransactionOperation2()
{
var tcs = new TaskCompletionSource<T>();
tasks.Add(tcs);
return tcs.Task;
}
public static async Task<T> ExecuteTransactionOnDB(T t)
{
return await Task.FromResult(t);
}
public static async Task TriggerTransaction(T t)
{
T value = await ExecuteTransactionOnDB(t);
foreach (var item in tasks)
{
item.SetResult(value);
}
}
public static List<TaskCompletionSource<T>> tasks = new List<TaskCompletionSource<T>>();
}
Following are the important modifications:
List<dynamic> is replaced by List<TaskCompletionSource<T>>
TransactionOperation1/2 have return type Task<T>, which is the Task created using the TaskCompletionSource<T>
Added an extra await to the Task.Run, which executes the TriggerTransaction internally, though you can replace the following code:
await Task.Run(async() =>
{
Console.ReadLine();
await Program<int>.TriggerTransaction(5);
});
with
await Program<int>.TriggerTransaction(5);
Now it produces the result you expect: it sums the two integers. There are a few more small changes, such as removing Task.Delay, which is not required.
EDIT 1 - Using Task.WhenAll
static async Task Main(string[] args)
{
var a = Program.TransactionOperation1(5);
var b = Program.TransactionOperation2(5);
Console.ReadLine();
var taskResults = await Task.WhenAll(a,b);
dynamic finalResult = 0;
foreach(var t in taskResults)
finalResult += t;
if (!File.Exists("D:\\data.txt"))
{
File.Create("D:\\data.txt").Close(); // close the handle so the FileStream below can open the file
}
using (FileStream stream = new FileStream("D:\\data.txt", FileMode.Append, FileAccess.Write))
{
var bytes = Encoding.UTF8.GetBytes(finalResult.ToString());
stream.Write(bytes, 0, bytes.Length);
}
Console.WriteLine(finalResult);
}
class Program
{
public static Task<dynamic> TransactionOperation1(dynamic val)
{
return Task<dynamic>.Run(() => val);
}
public static Task<dynamic> TransactionOperation2(dynamic val)
{
return Task<dynamic>.Run(() => val);
}
}
I have created dummy code to describe my issue as follows:
public class ItemGenerator
{
public bool isStopped;
public List<int> list = new List<int>();
public async Task GetItems(int itemsPerSecond)
{
int i = 0;
while (!isStopped)
{
list.Add(i);
await Task.Delay(1000);
i++;
}
}
}
[Test]
public async Task TestGetItems()
{
ItemGenerator gen = new ItemGenerator();
gen.GetItems(1000);
await Task.Delay(5000).ContinueWith(t =>
{
gen.isStopped = true;
Assert.True(gen.list.Count() == (5 * 1000));
});
}
Now the problem is that the assert fails sporadically. I guess it has to do with CPU performance and the fact that there is no guarantee that a delay of 1000 will always be exactly 1000 ms. What would be the best approach to unit test this kind of logic?
Here's how I would approach this. First, use the built-in CancellationToken:
public class ItemGenerator
{
public List<int> List { get; } = new List<int>();
public async Task GetItems(CancellationToken token)
{
int i = 0;
while(!token.IsCancellationRequested)
{
List.Add(i);
await Task.Delay(1000);
i++;
}
}
}
Then your test can make use of CancellationTokenSource and specifically CancelAfter method:
var gen = new ItemGenerator();
CancellationTokenSource src = new CancellationTokenSource();
src.CancelAfter(5000);
await gen.GetItems(src.Token);
Note you could pass the CancellationToken in to the constructor of ItemGenerator instead of the method if that is more appropriate.
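Because the number of items still depends on timer granularity, a test built this way tends to be more stable if it asserts a range rather than an exact count. The following is a minimal sketch under that assumption. The delayMs parameter, the ItemGeneratorCheck class, and the 100 ms / 550 ms timings are illustrative choices that are not part of the answer above; they only keep the check short.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public class ItemGenerator
{
    public List<int> List { get; } = new List<int>();

    // delayMs is a hypothetical parameter so the check runs quickly.
    public async Task GetItems(int delayMs, CancellationToken token)
    {
        int i = 0;
        while (!token.IsCancellationRequested)
        {
            List.Add(i);
            await Task.Delay(delayMs);
            i++;
        }
    }
}

public static class ItemGeneratorCheck
{
    // Runs the generator for roughly 550 ms and returns how many items it produced.
    public static async Task<int> RunAsync()
    {
        var gen = new ItemGenerator();
        using (var src = new CancellationTokenSource())
        {
            src.CancelAfter(550);               // request cancellation after ~550 ms
            await gen.GetItems(100, src.Token); // returns once the token is observed
        }
        return gen.List.Count;
    }
}
```

A test would then assert that the returned count falls within a tolerance (for example, at least one item and no more than a few above the nominal value) instead of comparing against a single exact number.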
I am working on a task-parallel problem in which I have many tasks that may or may not throw an exception.
I want to process all the tasks that finish properly and log the rest. Task.WhenAll propagates the task exception without allowing me to gather the remaining results.
static readonly Task<string> NormalTask1 = Task.FromResult("Task result 1");
static readonly Task<string> NormalTask2 = Task.FromResult("Task result 2");
static readonly Task<string> ExceptionTk = Task.FromException<string>(new Exception("Bad Task"));
var results = await Task.WhenAll(new []{ NormalTask1,NormalTask2,ExceptionTk});
Task.WhenAll throws the exception of ExceptionTk and discards the remaining results. How can I get the results while ignoring the exception and logging it at the same time?
I could wrap each task in another task that uses try{...}catch(){...} around the inner exception, but I don't have access to them and I hope I won't have to add this overhead.
You can create a method like this to use instead of Task.WhenAll:
public Task<ResultOrException<T>[]> WhenAllOrException<T>(IEnumerable<Task<T>> tasks)
{
return Task.WhenAll(
tasks.Select(
task => task.ContinueWith(
t => t.IsFaulted
? new ResultOrException<T>(t.Exception)
: new ResultOrException<T>(t.Result))));
}
public class ResultOrException<T>
{
public ResultOrException(T result)
{
IsSuccess = true;
Result = result;
}
public ResultOrException(Exception ex)
{
IsSuccess = false;
Exception = ex;
}
public bool IsSuccess { get; }
public T Result { get; }
public Exception Exception { get; }
}
Then you can check each result to see if it was successful or not.
EDIT: the code above doesn't handle cancellation; here's an alternative implementation:
public Task<ResultOrException<T>[]> WhenAllOrException<T>(IEnumerable<Task<T>> tasks)
{
return Task.WhenAll(tasks.Select(task => WrapResultOrException(task)));
}
private async Task<ResultOrException<T>> WrapResultOrException<T>(Task<T> task)
{
try
{
var result = await task;
return new ResultOrException<T>(result);
}
catch (Exception ex)
{
return new ResultOrException<T>(ex);
}
}
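To show how the WhenAllOrException helper might be consumed, here is a self-contained sketch that repeats the await-based variant and runs it against the three tasks from the question. The TaskHelpers class, the Wrap method name, and the DemoAsync method are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class ResultOrException<T>
{
    public ResultOrException(T result) { IsSuccess = true; Result = result; }
    public ResultOrException(Exception ex) { IsSuccess = false; Exception = ex; }
    public bool IsSuccess { get; }
    public T Result { get; }
    public Exception Exception { get; }
}

public static class TaskHelpers
{
    public static Task<ResultOrException<T>[]> WhenAllOrException<T>(IEnumerable<Task<T>> tasks)
    {
        return Task.WhenAll(tasks.Select(task => Wrap(task)));
    }

    // Converts a faulted or canceled task into a failed ResultOrException.
    private static async Task<ResultOrException<T>> Wrap<T>(Task<T> task)
    {
        try { return new ResultOrException<T>(await task); }
        catch (Exception ex) { return new ResultOrException<T>(ex); }
    }

    // Hypothetical demo: logs the failures and returns the successful results.
    public static async Task<string[]> DemoAsync()
    {
        var tasks = new[]
        {
            Task.FromResult("Task result 1"),
            Task.FromResult("Task result 2"),
            Task.FromException<string>(new Exception("Bad Task")),
        };

        ResultOrException<string>[] outcomes = await WhenAllOrException(tasks);

        foreach (var failed in outcomes.Where(o => !o.IsSuccess))
            Console.WriteLine($"Failed: {failed.Exception.Message}");

        return outcomes.Where(o => o.IsSuccess).Select(o => o.Result).ToArray();
    }
}
```

Because every wrapper task completes successfully, the outer Task.WhenAll never throws, and the caller decides per item what to log and what to process.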
You can get the result of each successfully completed Task<TResult> from its property Result.
var normalTask1 = Task.FromResult("Task result 1");
var normalTask2 = Task.FromResult("Task result 2");
var exceptionTk = Task.FromException<string>(new Exception("Bad Task"));
Task<string>[] tasks = new[] { normalTask1, normalTask2, exceptionTk };
Task whenAll = Task.WhenAll(tasks);
try
{
await whenAll;
}
catch
{
if (whenAll.IsFaulted) // There is also the possibility of being canceled
{
foreach (var ex in whenAll.Exception.InnerExceptions)
{
Console.WriteLine(ex); // Log each exception
}
}
}
string[] results = tasks
.Where(t => t.IsCompletedSuccessfully)
.Select(t => t.Result)
.ToArray();
Console.WriteLine($"Results: {String.Join(", ", results)}");
Output:
System.Exception: Bad Task
Results: Task result 1, Task result 2
You can add a higher-order wrapper with exception handling and then check each result for success.
class Program
{
static async Task Main(string[] args)
{
var itemsToProcess = new[] { "one", "two" };
var results = itemsToProcess.ToDictionary(x => x, async (item) =>
{
try
{
var result = await DoAsync();
return ((Exception)null, result);
}
catch (Exception ex)
{
return (ex, (object)null);
}
});
await Task.WhenAll(results.Values);
foreach(var item in results)
{
// parenthesize the conditional: + binds tighter than != and ?:
Console.WriteLine(item.Key + ((await item.Value).Item1 != null ? " Failed" : " Succeeded"));
}
}
public static async Task<object> DoAsync()
{
await Task.Delay(10);
throw new InvalidOperationException();
}
}
I have a simple class that does a synchronous thing,
public static class Synchronous
{
public static void DoTheWholeThing()
{
AStuff aStuff;
using (var a = new A())
{
aStuff = a.GetStuff();
}
BStuff bStuff;
using (var b = new B())
{
bStuff = b.GetStuff();
}
var combination = CombineStuff(aStuff, bStuff);
}
private static Combination CombineStuff(AStuff aStuff, BStuff bStuff)
{
//// Magic Here
}
}
Obviously, this code is not fully defined but it does illustrate my question.
Now, the classes A and B are both responsible for retrieving data from different remote sources. Consequently, the developers of A and B have implemented asynchronous entry points called GetStuffAsync which return Task<AStuff> and Task<BStuff> respectively.
I want to take maximum advantage of the asynchronous methods and call them concurrently so I can reduce the overall wait time of my code.
Here is what I've concocted, so far.
public static class Asynchronous
{
public async static Task DoTheWholeThing(CancellationToken cancellationToken)
{
var getAStuffTask = new Func<Task<AStuff>>(
async () =>
{
using (var a = new A())
{
return await a.GetStuffAsync(cancellationToken);
}
})();
var getBStuffTask = new Func<Task<BStuff>>(
async () =>
{
using (var b = new B())
{
return await b.GetStuffAsync(cancellationToken);
}
})();
var combination = CombineStuff(
await getAStuffTask,
await getBStuffTask);
}
private static Combination CombineStuff(AStuff aStuff, BStuff bStuff)
{
//// Magic Here
}
}
Aside from this code looking curiously like the JavaScript module pattern, is this the correct approach? I don't think I should be using Task.Run, as this code is clearly not CPU-bound.
It seems a bit "clunky" that I need to instantiate typed delegates to do this. Is there a better way?
EDIT
Following two good answers, I'm in a quandary between named functions and continuations.
The code becomes radically simpler when you simply extract the anonymous methods out into named methods:
public async static Task DoTheWholeThing(CancellationToken cancellationToken)
{
var getAStuffTask = GetAStuffAsync(cancellationToken);
var getBStuffTask = GetBStuffAsync(cancellationToken);
var combination = CombineStuff(
await getAStuffTask,
await getBStuffTask);
}
private static async Task<AStuff> GetAStuffAsync(CancellationToken cancellationToken)
{
using (var a = new A())
{
return await a.GetStuffAsync(cancellationToken);
}
}
private static async Task<BStuff> GetBStuffAsync(CancellationToken cancellationToken)
{
using (var b = new B())
{
return await b.GetStuffAsync(cancellationToken);
}
}
That said, if you really want to stick with the anonymous methods, you can create a helper method that will allow generic type inference and lambdas to implicitly figure out the type of the delegate:
public async static Task DoTheWholeThing(CancellationToken cancellationToken)
{
var getAStuffTask = Start(async () =>
{
using (var a = new A())
{
return await a.GetStuffAsync(cancellationToken);
}
});
var getBStuffTask = Start(async () =>
{
using (var b = new B())
{
return await b.GetStuffAsync(cancellationToken);
}
});
var combination = CombineStuff(
await getAStuffTask,
await getBStuffTask);
}
public static Task<T> Start<T>(Func<Task<T>> asyncOperation)
{
return asyncOperation();
}
Use TPL continuations to call Dispose as soon as the task is complete.
public async static Task DoTheWholeThing(CancellationToken cancellationToken)
{
var a = new A();
var b = new B();
// start the tasks and store them for awaiting later
var getAStuffTask = a.GetStuffAsync(cancellationToken);
var getBStuffTask = b.GetStuffAsync(cancellationToken);
// queue up continuations to dispose of the resource as soon as it is not needed
// ContinueWith passes the antecedent task to the delegate, so the lambda needs a parameter
getAStuffTask.ContinueWith(_ => a.Dispose());
getBStuffTask.ContinueWith(_ => b.Dispose());
// await as normal
var combination = CombineStuff(
await getAStuffTask,
await getBStuffTask);
}
I am unsure whether wrapping the whole method in an additional using block would accomplish anything, but it may provide peace of mind.
You don't need to wrap your async calls in delegates to get them to execute immediately. If you call the GetStuffAsync methods directly without awaiting them you will have the same result.
public static class Asynchronous
{
public async static Task DoTheWholeThing(CancellationToken cancellationToken)
{
using (var a = new A())
using (var b = new B()) {
var taskA = a.GetStuffAsync(cancellationToken);
var taskB = b.GetStuffAsync(cancellationToken);
await Task.WhenAll(taskA, taskB); // the tasks have different result types, so pass them as plain Tasks
var combination = CombineStuff(taskA.Result, taskB.Result);
}
}
private static Combination CombineStuff(AStuff aStuff, BStuff bStuff)
{
//// Magic Here
}
}
Note that this does keep the a and b objects alive during the call to CombineStuff, as @Servy notes. If that is a problem, the declarations of the Task objects can be moved outside of the using blocks, as below:
public static class Asynchronous
{
public async static Task DoTheWholeThing(CancellationToken cancellationToken)
{
// keep the generic task types so that .Result is available after the using blocks
Task<AStuff> taskA;
Task<BStuff> taskB;
using (var a = new A())
using (var b = new B()) {
taskA = a.GetStuffAsync(cancellationToken);
taskB = b.GetStuffAsync(cancellationToken);
await Task.WhenAll(taskA, taskB);
}
var combination = CombineStuff(taskA.Result, taskB.Result);
}
private static Combination CombineStuff(AStuff aStuff, BStuff bStuff)
{
//// Magic Here
}
}
This still holds on to a and b until both tasks have finished, rather than disposing of each as soon as its own task returns.
I was trying to use Task.WaitAny to wait on a bunch of tasks, but what I really want is to wait for the first task that reaches RanToCompletion rather than one that is Canceled.
So when I have a bunch of tasks whose statuses look like:
0 Canceled;1 Canceled;2 Canceled;3 Canceled;4 Canceled;5 RanToCompletion;
Ideally I would want Task.WaitAny to return 5 but what it returns is 0.
How should I wait for the first RanToCompletion task?
There is nothing available out of the box. We need to write some helper method as noted in comments.
Here's an implementation using TaskCompletionSource.
public class MyTask
{
private readonly TaskCompletionSource<Task> completionSource = new TaskCompletionSource<Task>();
private readonly Task[] tasks;
private int numberOfTasks;
private MyTask(Task[] tasks)
{
if (tasks.Length == 0)
{
throw new ArgumentException("No tasks");
}
this.tasks = tasks;
this.numberOfTasks= tasks.Length;
}
private int WaitAnyInternal()
{
foreach (var task in tasks)
{
task.ContinueWith(task1 => completionSource.TrySetResult(task1), TaskContinuationOptions.OnlyOnRanToCompletion);
}
foreach (var task in tasks)
{
task.ContinueWith(task1 =>
{
if (Interlocked.Decrement(ref numberOfTasks) == 0)
{
completionSource.SetCanceled();
}
}, TaskContinuationOptions.NotOnRanToCompletion);
}
try
{
completionSource.Task.Wait();
}
catch (AggregateException ex)
{
if (ex.Flatten().InnerExceptions.OfType<OperationCanceledException>().Any())
{
return -1;
}
}
return Array.IndexOf(tasks, completionSource.Task.Result);
}
public static int WaitAnyRanToCompletion(params Task[] tasks)
{
return new MyTask(tasks).WaitAnyInternal();
}
}
Then use it as:
var task1 = Task.Run(() =>
{
Thread.Sleep(1000);
throw new Exception();
});//Faulted task
var task2 = Task.Run(() =>
{
Thread.Sleep(5000);
});//Will complete first
var task3 = Task.Delay(10000);//Will complete, but not first
int index = MyTask.WaitAnyRanToCompletion(task1, task2, task3);
//Index will be 1, which means task2
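If blocking the calling thread is not desirable, the same idea can be expressed asynchronously with Task.WhenAny in a loop. This is only a sketch of an alternative, not the implementation above: the TaskExtras class and the WhenAnyRanToCompletion and DemoAsync names are hypothetical, and the method returns null when no task runs to completion.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class TaskExtras
{
    // Returns the first task that reaches RanToCompletion, or null if none does.
    public static async Task<Task> WhenAnyRanToCompletion(IEnumerable<Task> tasks)
    {
        var pending = new List<Task>(tasks);
        while (pending.Count > 0)
        {
            Task finished = await Task.WhenAny(pending);
            if (finished.Status == TaskStatus.RanToCompletion)
                return finished;

            _ = finished.Exception;   // observe a fault so it is not left unobserved
            pending.Remove(finished); // canceled or faulted: keep waiting for the rest
        }
        return null;
    }

    // Hypothetical demo: one canceled, one faulted, one successful task.
    public static async Task<int> DemoAsync()
    {
        var tasks = new Task[]
        {
            Task.FromCanceled(new CancellationToken(true)),
            Task.FromException(new Exception("faulted")),
            Task.Delay(50),
        };
        Task winner = await WhenAnyRanToCompletion(tasks);
        return Array.IndexOf(tasks, winner); // index of the task that ran to completion
    }
}
```

Each pass re-registers with every pending task, so the loop does quadratic work in the number of tasks; for a handful of tasks that is usually acceptable, while the continuation-based version above avoids it.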