Why does async IO block in C#? [duplicate] - c#

This question already has answers here:
Why File.ReadAllLinesAsync() blocks the UI thread?
(2 answers)
Closed 11 months ago.
I've created a WPF app that targets a local document database for fun/practice. The idea is the document for an entity is a .json file that lives on disk and folders act as collections. In this implementation, I have a bunch of .json documents that provide data about a Video to create a sort of an IMDB clone.
I have this class:
public class VideoRepository : IVideoRepository
{
public async IAsyncEnumerable<Video> EnumerateEntities()
{
foreach (var file in new DirectoryInfo(Constants.JsonDatabaseVideoCollectionPath).GetFiles())
{
var json = await File.ReadAllTextAsync(file.FullName); // This blocks
var document = JsonConvert.DeserializeObject<VideoDocument>(json); // Newtonsoft
var domainObject = VideoMapper.Map(document); // A mapper to go from the document type to the domain type
yield return domainObject;
}
// Uncommenting the below lines and commenting out the above foreach loop doesn't lock up the UI.
//await Task.Delay(5000);
//yield return new Video();
}
// Rest of class.
}
Way up the call stack, though the API layer and into the UI layer, I have an ICommand in a ViewModel:
QueryCommand = new RelayCommand(async (query) => await SendQuery((string)query));
private async Task SendQuery(string query)
{
QueryStatus = "Querying...";
QueryResult.Clear();
await foreach (var video in _videoEndpoints.QueryOnTags(query))
QueryResult.Add(_mapperService.Map(video));
QueryStatus = $"{QueryResult.Count()} videos found.";
}
The goal is to show the user a message 'Querying...' while the query is being processed. However, that message is never shown and the UI locks up until the query is complete, at which point the result message shows.
In VideoRepository, if I comment out the foreach loop and uncomment the two lines below it, the UI doesn't lock up and the 'Querying...' message gets shown for 5 seconds.
Why does that happen? Is there a way to do IO without locking up the UI/blocking?
Fortunately, if this were behind a web API and hit a real database, I probably wouldn't see this issue. I'd still like the UI to not lock up with this implementation though.
EDIT:
Dupe of Why File.ReadAllLinesAsync() blocks the UI thread?
Turns out Microsoft didn't make their async method very async. Changing the IO line fixes everything:
//var json = await File.ReadAllTextAsync(file.FullName); // Bad
var json = await Task.Run(() => File.ReadAllText(file.FullName)); // Good

You are probably targeting a .NET version older than .NET 6. In these old versions the file-system APIs were not implemented efficiently, and were not even truly asynchronous. Things have been improved in .NET 6, but still the synchronous file-system APIs are more performant than their asynchronous counterparts. Your problem can be solved simply by switching from this:
var json = await File.ReadAllTextAsync(file.FullName);
to this:
var json = await Task.Run(() => File.ReadAllText(file.FullName));
If you want to get fancy, you could also solve the problem in the UI layer, by using a custom LINQ operator like this:
public static async IAsyncEnumerable<T> OnThreadPool<T>(
this IAsyncEnumerable<T> source,
[EnumeratorCancellation] CancellationToken cancellationToken = default)
{
var enumerator = await Task.Run(() => source
.GetAsyncEnumerator(cancellationToken)).ConfigureAwait(false);
try
{
while (true)
{
var (moved, current) = await Task.Run(async () =>
{
if (await enumerator.MoveNextAsync())
return (true, enumerator.Current);
else
return (false, default);
}).ConfigureAwait(false);
if (!moved) break;
yield return current;
}
}
finally
{
await Task.Run(async () => await enumerator
.DisposeAsync()).ConfigureAwait(false);
}
}
This operator offloads to the ThreadPool all the operations associated with enumerating an IAsyncEnumerable<T>. It can be used like this:
await foreach (var video in _videoEndpoints.QueryOnTags(query).OnThreadPool())
QueryResult.Add(_mapperService.Map(video));

Related

C# async file operations / eliminate UI Blocking

Maybe I'm just looking at this wrong, I don't normally have to write a UI with progress. I'm trying to do this in a WPF app with .Net 7.0. My code looks something like:
public async void ProcessAllFiles()
{
var fileItemCollection = await GetFilesToProcess(#"\\MyServer\Share");
foreach (var fileItem in fileItemCollection)
{
var result = await ProcessFile(fileItem);
PublishResultsToUI(result);
}
}
public async Task<ResultInfo> ProcessFile(FileInfo item)
{
var result = new ResultInfo();
await Task.Run(() =>
{
// add processing results to result object
....
));
return result;
}
The "GetFilesToProcess" locks the UI 10-15 seconds and each "ProcessFiles" locks the UI a second or 2 for each file.
The timing seems to be about the same with or without the async/await/Task<T> statements.
If the UI didn't freeze every few seconds all would be good.
Any ideas would be appreciated (I've read through a bunch of docs, and I just have not found ones that make sense to me).

Running parallel async tasks and return result in .NET Core Web API

Hi Recently i was working in .net core web api project which is downloading files from external api.
In this .net core api recently found some issues while the no of files is more say more than 100. API is downloading max of 50 files and skipping others. WebAPI is deployed on AWS Lambda and timeout is 15mnts.
Actually the operation is timing out due to the long download process
public async Task<bool> DownloadAttachmentsAsync(List<DownloadAttachment> downloadAttachment)
{
try
{
bool DownloadFlag = false;
foreach (DownloadAttachment downloadAttachment in downloadAttachments)
{
DownloadFlag = await DownloadAttachment(downloadAttachment.id);
//update the download status in database
if(DownloadFlag)
{
bool UpdateFlag = await _DocumentService.UpdateDownloadStatus(downloadAttachment.id);
if (UpdateFlag)
{
await DeleteAttachment(downloadAttachment.id);
}
}
}
return true;
}
catch (Exception ext)
{
log.Error(ext, "Error in Saving attachment {attachemntId}",downloadAttachment.id);
return false;
}
}
Document service code
public async Task<bool> UpdateAttachmentDownloadStatus(string AttachmentID)
{
return await _documentRepository.UpdateAttachmentDownloadStatus(AttachmentID);
}
And DB update code
public async Task<bool> UpdateAttachmentDownloadStatus(string AttachmentID)
{
using (var db = new SqlConnection(_connectionString.Value))
{
var Result = 0; bool SuccessFlag = false;
var parameters = new DynamicParameters();
parameters.Add("#pm_AttachmentID", AttachmentID);
parameters.Add("#pm_Result", Result, System.Data.DbType.Int32, System.Data.ParameterDirection.Output);
var result = await db.ExecuteAsync("[Loan].[UpdateDownloadStatus]", parameters, commandType: CommandType.StoredProcedure);
Result = parameters.Get<int>("#pm_Result");
if (Result > 0) { SuccessFlag = true; }
return SuccessFlag;
}
}
How can i move this async task to run parallel ? and get the result? i tried following code
var task = Task.Run(() => DownloadAttachment( downloadAttachment.id));
bool result = task.Result;
Is this approach is fine? how can improve the performance? how to get the result from each parallel task and update to DB and delete based on success flag? Or this error is due to AWS timeout?
Please help
If you extracted the code that handles individual files to a separate method :
private async Task DownloadSingleAttachment(DownloadAttachment attachment)
{
try
{
var download = await DownloadAttachment(downloadAttachment.id);
if(download)
{
var update = await _DocumentService.UpdateDownloadStatus(downloadAttachment.id);
if (update)
{
await DeleteAttachment(downloadAttachment.id);
}
}
}
catch(....)
{
....
}
}
public async Task<bool> DownloadAttachmentsAsync(List<DownloadAttachment> downloadAttachment)
{
try
{
foreach (var attachment in downloadAttachments)
{
await DownloadSingleAttachment(attachment);
}
}
....
}
It would be easy to start all downloads at once, although not very efficient :
public async Task<bool> DownloadAttachmentsAsync(List<DownloadAttachment> downloadAttachment)
{
try
{
//Start all of them
var tasks=downloadAttachments.Select(att=>DownloadSingleAttachment(att));
await Task.WhenAll(tasks);
}
....
}
This isn't very efficient because external services hate lots of concurrent calls from a single source as you do, and almost certainly impose throttling. The database doesn't like lots of concurrent calls either, because in all database products concurrent calls lead to blocking one way or another. Even in databases that use multiversioning, this comes with an overhead.
Using Dataflow classes - Single block
One easy way to fix this is to use .NET's Dataflow classes to break the operation into a pipeline of steps, and execute each one with a different number of concurrent tasks.
We could put the entire operation into a single block, but that could cause problems if the update and delete operations aren't thread-safe :
var dlOptions= new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 10,
};
var downloader=new ActionBlock<DownloadAttachment>(async att=>{
await DownloadSingleAttachment(att);
},dlOptions);
foreach (var attachment in downloadAttachments)
{
await downloader.SendAsync(attachement.id);
}
downloader.Complete();
await downloader.Completion;
Dataflow - Multiple steps
To avoid possible thread issues, the rest of the methods can go to their own blocks. They could both go into one ActionBlock that calls both Update and Delete, or they could go into separate blocks if the methods talk to different services with different concurrency requirements.
The downloader block will execute at most 10 concurrent downloads. By default, each block uses only a single task at a time.
The updater and deleter blocks have their default DOP=1, which means there's no risk of race conditions as long as they don't try to use eg the same connection at the same time.
var downloader=new TransformBlock<string,(string id,bool download)>(
async id=> {
var download=await DownloadAttachment(id);
return (id,download);
},dlOptions);
var updater=new TransformBlock<(string id,bool download),(string id,bool update)>(
async (id,download)=> {
if(download)
{
var update = await _DocumentService.UpdateDownloadStatus(id);
return (id,update);
}
return (id,false);
});
var deleter=new ActionBlock<(string id,bool update)>(
async (id,update)=> {
if(update)
{
await DeleteAttachment(id);
}
});
The blocks can be linked into a pipeline now and used. The setting PropagateCompletion = true means that as soon as a block is finished processing, it will tell all its connected blocks to finish as well :
var linkOptions=new DataflowLinkOptions { PropagateCompletion = true};
downloader.LinkTo(updater, linkOptions);
updater.LinkTo(deleter,linkOptions);
We can pump data into the head block as long as we need. When we're done, we call the head block's Complete() method. As each block finishes processing its data, it will propagate its completion to the next block in the pipeline. We need to await for the last (tail) block to complete to ensure all the attachments have been processed:
foreach (var attachment in downloadAttachments)
{
await downloader.SendAsync(attachement.id);
}
downloader.Complete();
await deleter.Completion;
Each block has an input and (when necessary) an output buffer, which means the "producer" and "consumers" of the messages don't have to be in sync, or even know of each other. All the "producer" needs to know is where to find the head block in a pipeline.
Throttling and backpressure
One way to throttle is to use a fixed number of tasks through MaxDegreeOfParallelism.
It's also possible to put a limit to the input buffer, thus blocking previous steps or producers if a block can't process messages fast enough. This can be done simply by setting the BoundedCapacity option for a block:
var dlOptions= new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 10,
BoundedCapacity=20,
};
var updaterOptions= new ExecutionDataflowBlockOptions
{
BoundedCapacity=20,
};
...
var downloader=new TransformBlock<...>(...,dlOptions);
var updater=new TransformBlock<...>(...,updaterOptions);
No other changes are necessary
To run multiple asynchronous operations you could do something like this:
public async Task RunMultipleAsync<T>(IEnumerable<T> myList)
{
const int myNumberOfConcurrentOperations = 10;
var mySemaphore = new SemaphoreSlim(myNumberOfConcurrentOperations);
var tasks = new List<Task>();
foreach(var myItem in myList)
{
await mySemaphore.WaitAsync();
var task = RunOperation(myItem);
tasks.Add(task);
task.ContinueWith(t => mySemaphore.Release());
}
await Task.WhenAll(tasks);
}
private async Task RunOperation<T>(T myItem)
{
// Do stuff
}
Put your code from DownloadAttachmentsAsync at the 'Do stuff' comment
This will use a semaphore to limit the number of concurrent operations, since running to many concurrent operations is often a bad idea due to contention. You would need to experiment to find the optimal number of concurrent operations for your use case. Also note that error handling have been omitted to keep the example short.

Await async call in for-each-loop [duplicate]

This question already has answers here:
Using async/await for multiple tasks
(8 answers)
Closed 4 years ago.
I have a method in which I'm retrieving a list of deployments. For each deployment I want to retrieve an associated release. Because all calls are made to an external API, I now have a foreach-loop in which those calls are made.
public static async Task<List<Deployment>> GetDeployments()
{
try
{
var depjson = await GetJson($"{BASEURL}release/deployments?deploymentStatus=succeeded&definitionId=2&definitionEnvironmentId=5&minStartedTime={MinDateTime}");
var deployments = (JsonConvert.DeserializeObject<DeploymentWrapper>(depjson))?.Value?.OrderByDescending(x => x.DeployedOn)?.ToList();
foreach (var deployment in deployments)
{
var reljson = await GetJson($"{BASEURL}release/releases/{deployment.ReleaseId}");
deployment.Release = JsonConvert.DeserializeObject<Release>(reljson);
}
return deployments;
}
catch (Exception)
{
throw;
}
}
This all works perfectly fine. However, I do not like the await in the foreach-loop at all. I also believe this is not considered good practice. I just don't see how to refactor this so the calls are made parallel, because the result of each call is used to set a property of the deployment.
I would appreciate any suggestions on how to make this method faster and, whenever possible, avoid the await-ing in the foreach-loop.
There is nothing wrong with what you are doing now. But there is a way to call all tasks at once instead of waiting for a single task, then processing it and then waiting for another one.
This is how you can turn this:
wait for one -> process -> wait for one -> process ...
into
wait for all -> process -> done
Convert this:
foreach (var deployment in deployments)
{
var reljson = await GetJson($"{BASEURL}release/releases/{deployment.ReleaseId}");
deployment.Release = JsonConvert.DeserializeObject<Release>(reljson);
}
To:
var deplTasks = deployments.Select(d => GetJson($"{BASEURL}release/releases/{d.ReleaseId}"));
var reljsons = await Task.WhenAll(deplTasks);
for(var index = 0; index < deployments.Count; index++)
{
deployments[index].Release = JsonConvert.DeserializeObject<Release>(reljsons[index]);
}
First you take a list of unfinished tasks. Then you await it and you get a collection of results (reljson's). Then you have to deserialize them and assign to Release.
By using await Task.WhenAll() you wait for all the tasks at the same time, so you should see a performance boost from that.
Let me know if there are typos, I didn't compile this code.
Fcin suggested to start all Tasks, await for them all to finish and then start deserializing the fetched data.
However, if the first Task is already finished, but the second task not, and internally the second task is awaiting, the first task could already start deserializing. This would shorten the time that your process is idly waiting.
So instead of:
var deplTasks = deployments.Select(d => GetJson($"{BASEURL}release/releases/{d.ReleaseId}"));
var reljsons = await Task.WhenAll(deplTasks);
for(var index = 0; index < deployments.Count; index++)
{
deployments[index].Release = JsonConvert.DeserializeObject<Release>(reljsons[index]);
}
I'd suggest the following slight change:
// async fetch the Release data of Deployment:
private async Task<Release> FetchReleaseDataAsync(Deployment deployment)
{
var reljson = await GetJson($"{BASEURL}release/releases/{deployment.ReleaseId}");
return JsonConvert.DeserializeObject<Release>(reljson);
}
// async fill the Release data of Deployment:
private async Task FillReleaseDataAsync(Deployment deployment)
{
deployment.Release = await FetchReleaseDataAsync(deployment);
}
Then your procedure is similar to the solution that Fcin suggested:
IEnumerable<Task> tasksFillDeploymentWithReleaseData = deployments.
.Select(deployment => FillReleaseDataAsync(deployment)
.ToList();
await Task.WhenAll(tasksFillDeploymentWithReleaseData);
Now if the first task has to wait while fetching the release data, the 2nd task begins and the third etc. If the first task already finished fetching the release data, but the other tasks are awaiting for their release data, the first task starts already deserializing it and assigns the result to deployment.Release, after which the first task is complete.
If for instance the 7th task got its data, but the 2nd task is still waiting, the 7th task can deserialize and assign the data to deployment.Release. Task 7 is completed.
This continues until all tasks are completed. Using this method there is less waiting time because as soon as one task has its data it is scheduled to start deserializing
If i understand you right and you want to make the var reljson = await GetJson parralel:
Try this:
Parallel.ForEach(deployments, (deployment) =>
{
var reljson = await GetJson($"{BASEURL}release/releases/{deployment.ReleaseId}");
deployment.Release = JsonConvert.DeserializeObject<Release>(reljson);
});
you might limit the number of parallel executions such as:
Parallel.ForEach(
deployments,
new ParallelOptions { MaxDegreeOfParallelism = 4 },
(deployment) =>
{
var reljson = await GetJson($"{BASEURL}release/releases/{deployment.ReleaseId}");
deployment.Release = JsonConvert.DeserializeObject<Release>(reljson);
});
you might also want to be able to break the loop:
Parallel.ForEach(deployments, (deployment, state) =>
{
var reljson = await GetJson($"{BASEURL}release/releases/{deployment.ReleaseId}");
deployment.Release = JsonConvert.DeserializeObject<Release>(reljson);
if (noFurtherProcessingRequired) state.Break();
});

Best practice for task/await in a foreach loop

I have some time consuming code in a foreach that uses task/await.
it includes pulling data from the database, generating html, POSTing that to an API, and saving the replies to the DB.
A mock-up looks like this
List<label> labels = db.labels.ToList();
foreach (var x in list)
{
var myLabels = labels.Where(q => !db.filter.Where(y => x.userid ==y.userid))
.Select(y => y.ID)
.Contains(q.id))
//Render the HTML
//do some fast stuff with objects
List<response> res = await api.sendMessage(object); //POST
//put all the responses in the db
foreach (var r in res)
{
db.responses.add(r);
}
db.SaveChanges();
}
Time wise, generating the Html and posting it to the API seem to be taking most of the time.
Ideally it would be great if I could generate the HTML for the next item, and wait for the post to finish, before posting the next item.
Other ideas are also welcome.
How would one go about this?
I first thought of adding a Task above the foreach and wait for that to finish before making the next POST, but then how do I process the last loop... it feels messy...
You can do it in parallel but you will need different context in each Task.
Entity framework is not thread safe, so if you can't use one context in parallel tasks.
var tasks = myLabels.Select( async label=>{
using(var db = new MyDbContext ()){
// do processing...
var response = await api.getresponse();
db.Responses.Add(response);
await db.SaveChangesAsync();
}
});
await Task.WhenAll(tasks);
In this case, all tasks will appear to run in parallel, and each task will have its own context.
If you don't create new Context per task, you will get error mentioned on this question Does Entity Framework support parallel async queries?
It's more an architecture problem than a code issue here, imo.
You could split your work into two separate parts:
Get data from database and generate HTML
Send API request and save response to database
You could run them both in parallel, and use a queue to coordinate that: whenever your HTML is ready it's added to a queue and another worker proceeds from there, taking that HTML and sending to the API.
Both parts can be done in multithreaded way too, e.g. you can process multiple items from the queue at the same time by having a set of workers looking for items to be processed in the queue.
This screams for the producer / consumer pattern: one producer produces data in a speed different than the consumer consumes it. Once the producer does not have anything to produce anymore it notifies the consumer that no data is expected anymore.
MSDN has a nice example of this pattern where several dataflowblocks are chained together: the output of one block is the input of another block.
Walkthrough: Creating a Dataflow Pipeline
The idea is as follows:
Create a class that will generate the HTML.
This class has an object of class System.Threading.Tasks.Dataflow.BufferBlock<T>
An async procedure creates all HTML output and await SendAsync the data to the bufferBlock
The buffer block implements interface ISourceBlock<T>. The class exposes this as a get property:
The code:
class MyProducer<T>
{
private System.Threading.Tasks.Dataflow.BufferBlock<T> bufferBlock = new BufferBlock<T>();
public ISourceBlock<T> Output {get {return this.bufferBlock;}
public async ProcessAsync()
{
while (somethingToProduce)
{
T producedData = ProduceOutput(...)
await this.bufferBlock.SendAsync(producedData);
}
// no date to send anymore. Mark the output complete:
this.bufferBlock.Complete()
}
}
A second class takes this ISourceBlock. It will wait at this source block until data arrives and processes it.
do this in an async function
stop when no more data is available
The code:
public class MyConsumer<T>
{
ISourceBlock<T> Source {get; set;}
public async Task ProcessAsync()
{
while (await this.Source.OutputAvailableAsync())
{ // there is input of type T, read it:
var input = await this.Source.ReceiveAsync();
// process input
}
// if here, no more input expected. finish.
}
}
Now put it together:
private async Task ProduceOutput<T>()
{
var producer = new MyProducer<T>();
var consumer = new MyConsumer<T>() {Source = producer.Output};
var producerTask = Task.Run( () => producer.ProcessAsync());
var consumerTask = Task.Run( () => consumer.ProcessAsync());
// while both tasks are working you can do other things.
// wait until both tasks are finished:
await Task.WhenAll(new Task[] {producerTask, consumerTask});
}
For simplicity I've left out exception handling and cancellation. StackOverFlow has artibles about exception handling and cancellation of Tasks:
Keep UI responsive using Tasks, Handle AggregateException
Cancel an Async Task or a List of Tasks
This is what I ended up using: (https://stackoverflow.com/a/25877042/275990)
List<ToSend> sendToAPI = new List<ToSend>();
List<label> labels = db.labels.ToList();
foreach (var x in list) {
var myLabels = labels.Where(q => !db.filter.Where(y => x.userid ==y.userid))
.Select(y => y.ID)
.Contains(q.id))
//Render the HTML
//do some fast stuff with objects
sendToAPI.add(the object with HTML);
}
int maxParallelPOSTs=5;
await TaskHelper.ForEachAsync(sendToAPI, maxParallelPOSTs, async i => {
using (NasContext db2 = new NasContext()) {
List<response> res = await api.sendMessage(i.object); //POST
//put all the responses in the db
foreach (var r in res)
{
db2.responses.add(r);
}
db2.SaveChanges();
}
});
public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body) {
return Task.WhenAll(
from partition in Partitioner.Create(source).GetPartitions(dop)
select Task.Run(async delegate {
using (partition)
while (partition.MoveNext()) {
await body(partition.Current).ContinueWith(t => {
if (t.Exception != null) {
string problem = t.Exception.ToString();
}
//observe exceptions
});
}
}));
}
basically lets me generate the HTML sync, which is fine, since it only takes a few seconds to generate 1000's but lets me post and save to DB async, with as many threads as I predefine. In this case I'm posting to the Mandrill API, parallel posts are no problem.

async i/o and process results as they become available

I has a simple console app where I want to call many Urls in a loop and put the result in a database table. I am using .Net 4.5 and using async i/o to fetch the URL data. Here is a simplified version of what I am doing. All methods are async except for the database operation. Do you guys see any issues with this? Are there better ways of optimizing?
private async Task Run(){
var items = repo.GetItems(); // sync method to get list from database
var tasks = new List<Task>();
// add each call to task list and process result as it becomes available
// rather than waiting for all downloads
foreach(Item item in items){
tasks.Add(GetFromWeb(item.url).ContinueWith(response => { AddToDatabase(response.Result);}));
}
await Task.WhenAll(tasks); // wait for all tasks to complete.
}
private async Task<string> GetFromWeb(url) {
HttpResponseMessage response = await GetAsync(url);
return await response.Content.ReadAsStringAsync();
}
private void AddToDatabase(string item){
// add data to database.
}
Your solution is acceptable. But you should check out TPL Dataflow, which allows you to set up a dataflow "mesh" (or "pipeline") and then shove the data through it.
For a problem this simple, Dataflow won't really add much other than getting rid of the ContinueWith (I always find manual continuations awkward). But if you plan to add more steps or change your data flow in the future, Dataflow should be something you consider.
Your solution is pretty much correct, with just two minor mistakes (both of which cause compiler errors). First, you don't call ContinueWith on the result of List.Add, you need call continue with on the task and then add the continuation to your list, this is solved by just moving a parenthesis. You also need to call Result on the reponse Task.
Here is the section with the two minor changes:
tasks.Add(GetFromWeb(item.url)
.ContinueWith(response => { AddToDatabase(response.Result);}));
Another option is to leverage a method that takes a sequence of tasks and orders them by the order that they are completed. Here is my implementation of such a method:
public static IEnumerable<Task<T>> Order<T>(this IEnumerable<Task<T>> tasks)
{
var taskList = tasks.ToList();
var taskSources = new BlockingCollection<TaskCompletionSource<T>>();
var taskSourceList = new List<TaskCompletionSource<T>>(taskList.Count);
foreach (var task in taskList)
{
var newSource = new TaskCompletionSource<T>();
taskSources.Add(newSource);
taskSourceList.Add(newSource);
task.ContinueWith(t =>
{
var source = taskSources.Take();
if (t.IsCanceled)
source.TrySetCanceled();
else if (t.IsFaulted)
source.TrySetException(t.Exception.InnerExceptions);
else if (t.IsCompleted)
source.TrySetResult(t.Result);
}, CancellationToken.None, TaskContinuationOptions.PreferFairness, TaskScheduler.Default);
}
return taskSourceList.Select(tcs => tcs.Task);
}
Using this your code can become:
private async Task Run()
{
IEnumerable<Item> items = repo.GetItems(); // sync method to get list from database
foreach (var task in items.Select(item => GetFromWeb(item.url))
.Order())
{
await task.ConfigureAwait(false);
AddToDatabase(task.Result);
}
}
Just though I'd throw in my hat as well with the Rx solution
using System.Reactive;
using System.Reactive.Linq;
private Task Run()
{
var fromWebObservable = from item in repo.GetItems.ToObservable(Scheduler.Default)
select GetFromWeb(item.url);
fromWebObservable
.Select(async x => await x)
.Do(AddToDatabase)
.ToTask();
}

Categories

Resources