I have a method (called via an AJAX request) that runs at the end of a sequence. In this method I save to the database, send emails, look up a bunch of info in other APIs/databases, correlate things together, etc.. I just refactored the original method into a second revision and used Tasks to make it asynchronous, and shaved off up to two seconds in wall time. I used Tasks mainly because it seemed easier (I'm not that experienced in async/await yet) and some tasks depend on other tasks (like task C D and E all depend on results from B, which itself depends on A). Basicalll all of the tasks are started at the same time (processing just zips down the to the Wait() call on the email task, which in one way or another requires all the others to complete. I generally do something like this except with something like eight tasks:
public thing() {
var FooTask<x> = Task<x>.Factory.StartNew(() => {
// ...
return x;
});
var BarTask<y> = Task<y>.Factory.StartNew(() => {
// ...
var y = FooTask.Result;
// ...
return y + n;
}
var BazTask<z> = Task<z>.Factory.StartNew(() => {
var y = FooTask.Result;
return y - n;
}
var BagTask<z> = Task<z>.Factory.StartNew(() => {
var g = BarTask.Result;
var h = BazTask.Result;
return 1;
}
// Lots of try/catch aggregate shenanigans.
BagTask.Wait();
return "yay";
}
Oh I also need to roll back previous things if something breaks, like remove a database row if the email fails to send, so there are a few levels of try/catches in there. Anyway, all of this works (amazingly, it all worked on the first try). My question is whether this sort of method would benefit from being rewritten to use async/await rather than Tasks. If so, how would the multiple-dependency scenario play out without re-running an async method that was already ran or awaited by another method? I guess some shared variable?
Update:
The // ... lines were supposed to indicate that the task was doing something, like looking up DB records. Sorry if that wasn't clear. About half of the tasks (there are 8 total) can take up to maybe five seconds to run, if the contexts aren't warmed up, and the other half of the tasks just collect/assemble/process/use that data.
I also need to roll back previous things if something breaks, like remove a database row if the email fails to send, so there are a few levels of try/catches in there.
You'll find that async/await (paired with Task.Run instead of StartNew) will make your code much cleaner:
var x = await Task.Run(() => {
...
return ...;
});
var y = await Task.Run(() => {
...
return x + n;
});
var bazTask = Task.Run(() => {
...
return y - n;
});
var bagTask = Task.Run(async () => {
...
var g = y;
var h = await bazTask;
return 1;
}
await bagTask;
return "yay";
You also have the option of using Task.WhenAll if you want to await multiple tasks completing. Error handling in particular is cleaner with await since it doesn't wrap exception in AggregateException.
However
called via an AJAX request
This is a bit of a problem. Both StartNew and Task.Run should be avoided on ASP.NET.
shaved off up to two seconds in wall time
Yes, parallel processing on ASP.NET (which is what the code is currently doing) will make individual requests execute faster, but at the expense of scalability. The server will be unable to handle as many requests if it is doing parallel processing on each one.
save to the database, send emails, look up a bunch of info in other APIs/databases
These are all I/O-bound operations, not CPU-bound. So the ideal solution is to create truly-async I/O methods and then just call them using await (and Task.WhenAll if necessary). By "truly-async", I mean calling the underlying asynchronous APIs (e.g., HttpClient.GetStringAsync instead of WebClient.DownloadString; or Entity Framework's ToListAsync instead of ToList, etc). Using StartNew or Task.Run is what I call "fake asynchrony".
Once you have asynchronous APIs, your top-level method really becomes simple:
X x = await databaseService.GetXFromDatabaseAsync();
Y y = await apiService.LookupValueAsync(x);
Task<Baz> bazTask = databaseSerivce.GetBazFromDatabaseAsync(y);
Task<Bag> bagTask = apiService.SecondaryLookupAsync(y);
await Task.WhenAll(bazTask, bagTask);
Baz baz = await bazTask;
Bag bag = await bagTask;
return baz + bag;
Related
I've got a loop that needs to be run in parallel as each iteration is slow and processor intensive but I also need to call an async method as part of each iteration in the loop.
I've seen questions on how to handle an async method in the loop but not a combination of async and synchronous, which is what I've got.
My (simplified) code is as follows - I know this won't work properly due to the async action being passed to foreach.
protected IDictionary<int, ReportData> GetReportData()
{
var results = new ConcurrentDictionary<int, ReportData>();
Parallel.ForEach(requestData, async data =>
{
// process data synchronously
var processedData = ProcessData(data);
// get some data async
var reportRequest = await BuildRequestAsync(processedData);
// synchronous building
var report = reportRequest.BuildReport();
results.TryAdd(data.ReportId, report);
});
// This needs to be populated before returning
return results;
}
Is there any way to get execute the action in parallel when the action has to be async in order to await the single async call.
It's not a practical option to convert the synchronous functions to async.
I don't want to split the action up and have a Parallel.ForEach followed by the async calls with a WhenAll and another Parallel.ForEach as the speed of each stage can vary greatly between different iterations so splitting it would be inefficient as the faster ones would be waiting for the slower ones before continuing.
I did wonder if a PLINQ ForAll could be used instead of the Parallel.ForEach but have never used PLINQ and not sure if it would wait for all of the iterations to be completed before returning, i.e. would the Tasks still be running at the end of the process.
Is there any way to get execute the action in parallel when the action has to be async in order to await the single async call.
Yes, but you'll need to understand what Parallel gives you that you lose when you take alternative approaches. Specifically, Parallel will automatically determine the appropriate number of threads and adjust based on usage.
It's not a practical option to convert the synchronous functions to async.
For CPU-bound methods, you shouldn't convert them.
I don't want to split the action up and have a Parallel.ForEach followed by the async calls with a WhenAll and another Parallel.ForEach as the speed of each stage can vary greatly between different iterations so splitting it would be inefficient as the faster ones would be waiting for the slower ones before continuing.
The first recommendation I would make is to look into TPL Dataflow. It allows you to define a "pipeline" of sorts that keeps the data flowing through while limiting the concurrency at each stage.
I did wonder if a PLINQ ForAll could be used instead of the Parallel.ForEach
No. PLINQ is very similar to Parallel in how they work. There's a few differences over how aggressive they are at CPU utilization, and some API differences - e.g., if you have a collection of results coming out the end, PLINQ is usually cleaner than Parallel - but at a high-level view they're very similar. Both only work on synchronous code.
However, you could use a simple Task.Run with Task.WhenAll as such:
protected async Task<IDictionary<int, ReportData>> GetReportDataAsync()
{
var tasks = requestData.Select(async data => Task.Run(() =>
{
// process data synchronously
var processedData = ProcessData(data);
// get some data async
var reportRequest = await BuildRequestAsync(processedData);
// synchronous building
var report = reportRequest.BuildReport();
return (Key: data.ReportId, Value: report);
})).ToList();
var results = await Task.WhenAll(tasks);
return results.ToDictionary(x => x.Key, x => x.Value);
}
You may need to apply a concurrency limit (which Parallel would have done for you). In the asynchronous world, this would look like:
protected async Task<IDictionary<int, ReportData>> GetReportDataAsync()
{
var throttle = new SemaphoreSlim(10);
var tasks = requestData.Select(data => Task.Run(async () =>
{
await throttle.WaitAsync();
try
{
// process data synchronously
var processedData = ProcessData(data);
// get some data async
var reportRequest = await BuildRequestAsync(processedData);
// synchronous building
var report = reportRequest.BuildReport();
return (Key: data.ReportId, Value: report);
}
finally
{
throttle.Release();
}
})).ToList();
var results = await Task.WhenAll(tasks);
return results.ToDictionary(x => x.Key, x => x.Value);
}
I'm working on a problem where I have to delete records using a service call. The issue is that I have a for each loop where i have multiple await operations.This is making the operation take lot of time and performance is lacking
foreach(var a in list<long>b)
{
await _serviceresolver().DeleteOperationAsync(id,a)
}
The issue is that I have a for each loop where i have multiple await operations.
This is making the operation take lot of time and performance is lacking
The number one solution is to reduce the number of calls. This is often called "chunky" over "chatty". So if your service supports some kind of bulk-delete operation, then expose it in your service type and then you can just do:
await _serviceresolver().BulkDeleteOperationAsync(id, b);
But if that isn't possible, then you can at least use asynchronous concurrency. This is quite different from parallelism; you don't want to use Parallel or PLINQ.
var service = _serviceresolver();
var tasks = b.Select(a => service.DeleteOperationAsync(id, a)).ToList();
await Task.WhenAll(tasks);
I do not know what code is behind this DeleteOperationAsync, but for sure async/await isn't designed to speed things up. It was designated to "spare" threads (colloquially speaking)
The best would be to change the method to take as a parameter the whole list of ids - instead of taking and sending just one id.
And then to perform this async/await heavy operation only once for all of the ids.
If that is not possible, you could just run it in parallel using TPL (but it is ready the worst-case scenario - really:) )
Parallel.ForEach(listOfIdsToDelete,
async idToDelete => await _serviceresolver().DeleteOperationAsync(id,idToDelete)
);
You're waiting for each async operation to finish right now. If you can fire them all off concurrently, you can just call them without the await, or if you need to know when they finish, you can just fire them all off and then wait for them all to finish by tracking the tasks in a list:
List<Task> tasks = new List<Task>();
foreach (var a in List<long> b)
tasks.Add(_serviceresolveer().DeleteOperationAsync(id, a));
await Task.WhenAll(tasks);
You can use PLINQ (to leverage of all the processors of your machine) and the Task.WhenAll method (to no freeze the calling thread). In code, resulting something like this:
class Program {
static async Task Main(string[] args) {
var list = new List<long> {
4, 3, 2
};
var service = new Service();
var response =
from item in list.AsParallel()
select service.DeleteOperationAsync(item);
await Task.WhenAll(response);
}
}
public class Service {
public async Task DeleteOperationAsync(long value) {
await Task.Delay(2000);
Console.WriteLine($"Finished... {value}");
}
}
I have read a few stackoverflow threads about multi-threading in a foreach loop, but I am not sure I am understanding and using it right.
I have tried multiple scenarios, but I am not seeing much increase in performance.
Here is what I believe runs Asynchronous tasks, but running synchronously in the loop using a single thread:
Stopwatch stopWatch = new Stopwatch();
stopWatch.Start();
foreach (IExchangeAPI selectedApi in selectedApis)
{
if (exchangeSymbols.TryGetValue(selectedApi.Name, out symbol))
{
ticker = await selectedApi.GetTickerAsync(symbol);
}
}
stopWatch.Stop();
Here is what I hoped to be running Asynchronously (still using a single thread) - I would have expected some speed improvement already here:
List<Task<ExchangeTicker>> exchTkrs = new List<Task<ExchangeTicker>>();
stopWatch.Start();
foreach (IExchangeAPI selectedApi in selectedApis)
{
if (exchangeSymbols.TryGetValue(selectedApi.Name, out symbol))
{
exchTkrs.Add(selectedApi.GetTickerAsync(symbol));
}
}
ExchangeTicker[] retTickers = await Task.WhenAll(exchTkrs);
stopWatch.Stop();
Here is what I would have hoped to run Asynchronously in Multi-thread:
stopWatch.Start();
Parallel.ForEach(selectedApis, async (IExchangeAPI selectedApi) =>
{
if (exchangeSymbols.TryGetValue(selectedApi.Name, out symbol))
{
ticker = await selectedApi.GetTickerAsync(symbol);
}
});
stopWatch.Stop();
Stop watch results interpreted as follows:
Console.WriteLine("Time elapsed (ns): {0}", stopWatch.Elapsed.TotalMilliseconds * 1000000);
Console outputs:
Time elapsed (ns): 4183308100
Time elapsed (ns): 4183946299.9999995
Time elapsed (ns): 4188032599.9999995
Now, the speed improvement looks minuscule. Am I doing something wrong or is that more or less what I should be expecting? I suppose writing to files would be a better to check that.
Would you mind also confirming I am interpreting the different use cases correctly?
Finally, using a foreach loop in order to get the ticker from multiple platforms in parallel may not be the best approach. Suggestions on how to improve this would be welcome.
EDIT
Note that I am using the ExchangeSharp code base that you can find here
Here is what the GerTickerAsync() method looks like:
public virtual async Task<ExchangeTicker> GetTickerAsync(string marketSymbol)
{
marketSymbol = NormalizeMarketSymbol(marketSymbol);
return await Cache.CacheMethod(MethodCachePolicy, async () => await OnGetTickerAsync(marketSymbol), nameof(GetTickerAsync), nameof(marketSymbol), marketSymbol);
}
For the Kraken API, you then have:
protected override async Task<ExchangeTicker> OnGetTickerAsync(string marketSymbol)
{
JToken apiTickers = await MakeJsonRequestAsync<JToken>("/0/public/Ticker", null, new Dictionary<string, object> { { "pair", NormalizeMarketSymbol(marketSymbol) } });
JToken ticker = apiTickers[marketSymbol];
return await ConvertToExchangeTickerAsync(marketSymbol, ticker);
}
And the Caching method:
public static async Task<T> CacheMethod<T>(this ICache cache, Dictionary<string, TimeSpan> methodCachePolicy, Func<Task<T>> method, params object?[] arguments) where T : class
{
await new SynchronizationContextRemover();
methodCachePolicy.ThrowIfNull(nameof(methodCachePolicy));
if (arguments.Length % 2 == 0)
{
throw new ArgumentException("Must pass function name and then name and value of each argument");
}
string methodName = (arguments[0] ?? string.Empty).ToStringInvariant();
string cacheKey = methodName;
for (int i = 1; i < arguments.Length;)
{
cacheKey += "|" + (arguments[i++] ?? string.Empty).ToStringInvariant() + "=" + (arguments[i++] ?? string.Empty).ToStringInvariant("(null)");
}
if (methodCachePolicy.TryGetValue(methodName, out TimeSpan cacheTime))
{
return (await cache.Get<T>(cacheKey, async () =>
{
T innerResult = await method();
return new CachedItem<T>(innerResult, CryptoUtility.UtcNow.Add(cacheTime));
})).Value;
}
else
{
return await method();
}
}
At first it should be pointed out that what you are trying to achieve is performance, not asynchrony. And you are trying to achieve it by running multiple operations concurrently, not in parallel. To keep the explanation simple I'll use a simplified version of your code, and I'll assume that each operation is a direct web request, without an intermediate caching layer, and with no dependencies to values existing in dictionaries.
foreach (var symbol in selectedSymbols)
{
var ticker = await selectedApi.GetTickerAsync(symbol);
}
The above code runs the operations sequentially. Each operation starts after the completion of the previous one.
var tasks = new List<Task<ExchangeTicker>>();
foreach (var symbol in selectedSymbols)
{
tasks.Add(selectedApi.GetTickerAsync(symbol));
}
var tickers = await Task.WhenAll(tasks);
The above code runs the operations concurrently. All operations start at once. The total duration is expected to be the duration of the longest running operation.
Parallel.ForEach(selectedSymbols, async symbol =>
{
var ticker = await selectedApi.GetTickerAsync(symbol);
});
The above code runs the operations concurrently, like the previous version with Task.WhenAll. It offers no advantage, while having the huge disadvantage that you no longer have a way to await the operations to complete. The Parallel.ForEach method will return immediately after launching the operations, because the Parallel class doesn't understand async delegates (it does not accept Func<Task> lambdas). Essentially there are a bunch of async void lambdas in there, that are running out of control, and in case of an exception they will bring down the process.
So the correct way to run the operations concurrently is the second way, using a list of tasks and the Task.WhenAll. Since you've already measured this method and haven't observed any performance improvements, I am assuming that there something else that serializes the concurrent operations. It could be something like a SemaphoreSlim hidden somewhere in your code, or some mechanism on the server side that throttles your requests. You'll have to investigate further to find where and why the throttling happens.
In general, when you do not see an increase by multi threading, it is because your task is not CPU limited or large enough to offset the overhead.
In your example, i.e.:
selectedApi.GetTickerAsync(symbol);
This can hae 2 reasons:
1: Looking up the ticker is brutally fast and it should not be an async to start with. I.e. when you look it up in a dictionary.
2: This is running via a http connection where the runtime is LIMITING THE NUMBER OF CONCURRENT CALLS. Regardless how many tasks you open, it will not use more than 4 at the same time.
Oh, and 3: you think async is using threads. It is not. It is particularly not the case in a codel ike this:
await selectedApi.GetTickerAsync(symbol);
Where you basically IMMEDIATELY WAIT FOR THE RESULT. There is no multi threading involved here at all.
foreach (IExchangeAPI selectedApi in selectedApis) {
if (exchangeSymbols.TryGetValue(selectedApi.Name, out symbol))
{
ticker = await selectedApi.GetTickerAsync(symbol);
} }
This is linear non threaded code using an async interface to not block the current thread while the (likely expensive IO) operation is in place. It starts one, THEN WAITS FOR THE RESULT. No 2 queries ever start at the same time.
If you want a possible (just as example) more scalable way:
In the foreach, do not await but add the task to a list of tasks.
Then start await once all the tasks have started. Likein a 2nd loop.
WAY not perfect, but at least the runtime has a CHANCE to do multiple lookups at the same time. Your await makes sure that you essentially run single threaded code, except async, so your thread goes back into the pool (and is not waiting for results), increasing your scalability - an item possibly not relevant in this case and definitely not measured in your test.
I am trying to understand parallel programming and I would like my async methods to run on multiple threads. I have written something but it does not work like I thought it should.
Code
public static async Task Main(string[] args)
{
var listAfterParallel = RunParallel(); // Running this function to return tasks
await Task.WhenAll(listAfterParallel); // I want the program exceution to stop until all tasks are returned or tasks are completed
Console.WriteLine("After Parallel Loop"); // But currently when I run program, after parallel loop command is printed first
Console.ReadLine();
}
public static async Task<ConcurrentBag<string>> RunParallel()
{
var client = new System.Net.Http.HttpClient();
client.DefaultRequestHeaders.Add("Accept", "application/json");
client.BaseAddress = new Uri("https://jsonplaceholder.typicode.com");
var list = new List<int>();
var listResults = new ConcurrentBag<string>();
for (int i = 1; i < 5; i++)
{
list.Add(i);
}
// Parallel for each branch to run await commands on multiple threads.
Parallel.ForEach(list, new ParallelOptions() { MaxDegreeOfParallelism = 2 }, async (index) =>
{
var response = await client.GetAsync("posts/" + index);
var contents = await response.Content.ReadAsStringAsync();
listResults.Add(contents);
Console.WriteLine(contents);
});
return listResults;
}
I would like RunParallel function to complete before "After parallel loop" is printed. Also I want my get posts method to run on multiple threads.
Any help would be appreciated!
What's happening here is that you're never waiting for the Parallel.ForEach block to complete - you're just returning the bag that it will eventually pump into. The reason for this is that because Parallel.ForEach expects Action delegates, you've created a lambda which returns void rather than Task. While async void methods are valid, they generally continue their work on a new thread and return to the caller as soon as they await a Task, and the Parallel.ForEach method therefore thinks the handler is done, even though it's kicked that remaining work off into a separate thread.
Instead, use a synchronous method here;
Parallel.ForEach(list, new ParallelOptions() { MaxDegreeOfParallelism = 2 }, index =>
{
var response = client.GetAsync("posts/" + index).Result;
var contents = response.Content.ReadAsStringAsync().Result;
listResults.Add(contents);
Console.WriteLine(contents);
});
If you absolutely must use await inside, Wrap it in Task.Run(...).GetAwaiter().GetResult();
Parallel.ForEach(list, new ParallelOptions() { MaxDegreeOfParallelism = 2 }, index => Task.Run(async () =>
{
var response = await client.GetAsync("posts/" + index);
var contents = await response.Content.ReadAsStringAsync();
listResults.Add(contents);
Console.WriteLine(contents);
}).GetAwaiter().GetResult();
In this case, however, Task.run generally goes to a new thread, so we've subverted most of the control of Parallel.ForEach; it's better to use async all the way down;
var tasks = list.Select(async (index) => {
var response = await client.GetAsync("posts/" + index);
var contents = await response.Content.ReadAsStringAsync();
listResults.Add(contents);
Console.WriteLine(contents);
});
await Task.WhenAll(tasks);
Since Select expects a Func<T, TResult>, it will interpret an async lambda with no return as an async Task method instead of async void, and thus give us something we can explicitly await
Take a look at this: There Is No Thread
When you are making multiple concurrent web requests it's not your CPU that is doing the hard work. It's the CPU of the web server that is serving your requests. Your CPU is doing nothing during this time. It's not in a special "Wait-state" or something. The hardware inside your box that is working is your network card, that writes data to your RAM. When the response is received then your CPU will be notified about the arrived data, so it can do something with them.
You need parallelism when you have heavy work to do inside your box, not when you want the heavy work to be done by the external world. From the point of view of your CPU, even your hard disk is part of the external world. So everything that applies to web requests, applies also to requests targeting filesystems and databases. These workloads are called I/O bound, to be distinguished from the so called CPU bound workloads.
For I/O bound workloads the tool offered by the .NET platform is the asynchronous Task. There are multiple APIs throughout the libraries that return Task objects. To achieve concurrency you typically start multiple tasks and then await them with Task.WhenAll. There are also more advanced tools like the TPL Dataflow library, that is build on top of Tasks. It offers capabilities like buffering, batching, configuring the maximum degree of concurrency, and much more.
I have a core task retreiving me some core data and multiple other sub-tasks fetching extra data. Would like to run some enricher process to the core data as soon as the core task and any of the sub-task is ready. Would you know how to do so?
Thought about something like this but not sure it's the doing what I want:
// Starting the tasks
var coreDataTask = new Task(...);
var extraDataTask1 = new Task(...);
var extraDataTask2 = new Task(...);
coreDataTask.Start();
extraDataTask1.Start();
extraDataTask2.Start();
// Enriching the results
Task.WaitAll(coreDataTask, extraDataTask1);
EnrichCore(coreDataTask.Results, extraDataTask1.Results);
Task.WaitAll(coreDataTask, extraDataTask2);
EnrichCore(coreDataTask.Results, extraDataTask2.Results);
Also given the enrichement is on the same core object, guess I would need to lock it somewhere?
Thanks in advance!
Here is another idea taking advantage of Task.WhenAny() to detect when tasks are completing.
For this minimal example, I just assume that the core data and extra data are strings. But you can adjust for whatever your type is.
Also, I am not actually doing any processing. You would have to plug in your processing.
Also, an assumption I am making, that is not really clear, is that you are mostly trying to parallelize the gathering of your data because that's the expensive part, but that the enriching part is actually pretty fast. Based on that assumption, you'll notice that the tasks run in parallel to gather the core data and extra data. But as the data becomes available, the core data is enriched synchronously to avoid having to complicate the code with locking.
If you copy-paste the code below, you should be able to run it as is to see how it works.
public static void Main(string[] args)
{
StartWork().Wait();
}
private async static Task StartWork()
{
// start core and extra tasks
Task<string> coreDataTask = Task.Run(() => "core data" /* do something more complicated here */);
List<Task<string>> extraDataTaskList = new List<Task<string>>();
for (int i = 0; i < 10; i++)
{
int x = i;
extraDataTaskList.Add(Task.Run(() => "extra data " + x /* do something more complicated here */));
}
// wait for core data to be ready first.
StringBuilder coreData = new StringBuilder(await coreDataTask);
// enrich core as the extra data tasks complete.
while (extraDataTaskList.Count != 0)
{
Task<string> completedExtraDataTask = await Task.WhenAny(extraDataTaskList);
extraDataTaskList.Remove(completedExtraDataTask);
EnrichCore(coreData, await completedExtraDataTask);
}
Console.WriteLine(coreData.ToString());
}
private static void EnrichCore(StringBuilder coreData, string extraData)
{
coreData.Append(" enriched with ").Append(extraData);
}
EDIT: .NET 4.0 version
Here is how I would change it for .NET 4.0, while still retaining the same overall design:
Task.Run() becomes Task.Factory.StartNew()
Instead of doing await on tasks, I call Result, which is a blocking call that waits for the task to complete.
Use Task.WaitAny instead of Task.WhenAny, which is also a blocking call.
The design remains very similar. The one big difference between both versions of the code is that in the .NET 4.5 version, whenever there is an await, the current thread is free to do other work. In the .NET 4.0 version, whenever you call Task.Result or Task.WaitAny, the current thread blocks until the Task completes. It's possible that this difference is not really important to you. But if it is, just make sure to wrap and run the whole block of code in a background thread or task to free up your main thread.
The other difference is with the exception handling. With the .NET 4.5 version, if any of your tasks fails with an unhandled exception, the exception is automatically unwrapped and propagated in a very transparent manner. With the .NET 4.0 version, you'll be getting AggregateExceptions that you will have to unwrap and handle yourself. If this is a concern, make sure you test this beforehand so you know what to expect.
Personally, I try to avoid Task.ContinueWith whenever I can. It tends to make the code really ugly and hard to read.
public static void Main(string[] args)
{
// start core and extra tasks
Task<string> coreDataTask = Task.Factory.StartNew(() => "core data" /* do something more complicated here */);
List<Task<string>> extraDataTaskList = new List<Task<string>>();
for (int i = 0; i < 10; i++)
{
int x = i;
extraDataTaskList.Add(Task.Factory.StartNew(() => "extra data " + x /* do something more complicated here */));
}
// wait for core data to be ready first.
StringBuilder coreData = new StringBuilder(coreDataTask.Result);
// enrich core as the extra data tasks complete.
while (extraDataTaskList.Count != 0)
{
int indexOfCompletedTask = Task.WaitAny(extraDataTaskList.ToArray());
Task<string> completedExtraDataTask = extraDataTaskList[indexOfCompletedTask];
extraDataTaskList.Remove(completedExtraDataTask);
EnrichCore(coreData, completedExtraDataTask.Result);
}
Console.WriteLine(coreData.ToString());
}
private static void EnrichCore(StringBuilder coreData, string extraData)
{
coreData.Append(" enriched with ").Append(extraData);
}
I think what you probably want is "ContinueWith" (Documentation here : https://msdn.microsoft.com/en-us/library/dd270696(v=vs.110).aspx). That is as long as your enriching doesn't need to be done in a specific order.
The code would look something like the following :
var coreTask = new Task<object>(() => { return null; });
var enrichTask1 = new Task<object>(() => { return null; });
var enrichTask2 = new Task<object>(() => { return null; });
coreTask.Start();
coreTask.Wait();
//Create your continue tasks here with the data you want.
enrichTask1.ContinueWith(task => {/*Do enriching here with task.Result*/});
//Start all enricher tasks here.
enrichTask1.Start();
//Wait for all the tasks to complete here.
Task.WaitAll(enrichTask1);
You still need to run your CoreTask first as that's required to finish before all enriching tasks. But from there you can start all tasks, and tell them when they are done to "ContinueWith" doing something else.
You should also take a quick look in the "Enricher Pattern" that may be able to help you in general with what you want to achieve (Outside of threading). Examples like here : http://www.enterpriseintegrationpatterns.com/DataEnricher.html