I've set a bunch of Console.WriteLines and as far as I can tell none of them are being invoked when I run the following in .NET Fiddle.
using System;
using System.Net;
using System.Linq.Expressions;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Timers;
using System.Collections.Generic;
public class Program
{
private static readonly object locker = new object();
private static readonly string pageFormat = "http://www.letsrun.com/forum/forum.php?board=1&page={0}";
public static void Main()
{
var client = new WebClient();
// Queue up the requests we are going to make
var tasks = new Queue<Task<string>>(
Enumerable
.Repeat(0,50)
.Select(i => new Task<string>(() => client.DownloadString(string.Format(pageFormat,i))))
);
// Create set of 5 tasks which will be the at most 5
// requests we wait on
var runningTasks = new HashSet<Task<string>>();
for(int i = 0; i < 5; ++i)
{
runningTasks.Add(tasks.Dequeue());
}
var timer = new System.Timers.Timer
{
AutoReset = true,
Interval = 2000
};
// On each tick, go through the tasks that are supposed
// to have started running and if they have completed
// without error then store their result and run the
// next queued task if there is one. When we run out of
// any more tasks to run or wait for, stop the ticks.
timer.Elapsed += delegate
{
lock(locker)
{
foreach(var task in runningTasks)
{
if(task.IsCompleted)
{
if(!task.IsFaulted)
{
Console.WriteLine("Got a document: {0}",
task.Result.Substring(Math.Min(30, task.Result.Length)));
runningTasks.Remove(task);
if(tasks.Any())
{
runningTasks.Add(tasks.Dequeue());
}
}
else
{
Console.WriteLine("Uh-oh, task faulted, apparently");
}
}
else if(!task.Status.Equals(TaskStatus.Running)) // task not started
{
Console.WriteLine("About to start a task.");
task.Start();
}
else
{
Console.WriteLine("Apparently a task is running.");
}
}
if(!runningTasks.Any())
{
timer.Stop();
}
}
};
}
}
I'd also appreciate advice on how I can simplify or fix any faulty logic in this. The pattern I'm trying to do is like
(1) Create a queue of N tasks
(2) Create a set of M tasks, the first M dequeued items from (1)
(3) Start the M tasks running
(4) After X seconds, check for completed tasks.
(5) For any completed task, do something with the result, remove the task from the set and replace it with another task from the queue (if any are left in the queue).
(6) Repeat (4)-(5) indefinitely.
(7) If the set has no tasks left, we're done.
but perhaps there's a better way to implement it, or perhaps there's some .NET function that easily encapsulates what I'm trying to do (web requests in parallel with a specified max degree of parallelism).
There are several issues in your code, but since you are looking for a better way to implement it, you can use Parallel.For or Parallel.ForEach:
Parallel.For(0, 50, new ParallelOptions() { MaxDegreeOfParallelism = 5 }, (i) =>
{
// surround with try-catch
string result;
using (var client = new WebClient()) {
result = client.DownloadString(string.Format(pageFormat, i));
}
// do something with result
Console.WriteLine("Got a document: {0}", result.Substring(Math.Min(30, result.Length)));
});
It will execute the body in parallel (not more than 5 tasks at any given time). When one task is completed - next one is started, until they are all done, just like you want.
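To see the cap in action without hitting a real server, here is a minimal sketch that replaces the download with a short Thread.Sleep and records the highest number of bodies running at once. The names ParallelCapDemo and MeasurePeak are made up for this illustration; only Parallel.For and ParallelOptions come from the answer above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelCapDemo
{
    // Runs `count` simulated jobs and returns the highest number
    // that were observed running at the same time.
    public static int MeasurePeak(int count, int maxDegree)
    {
        int current = 0, peak = 0;
        var gate = new object();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxDegree };
        Parallel.For(0, count, options, i =>
        {
            lock (gate) { current++; if (current > peak) peak = current; }
            Thread.Sleep(50); // stand-in for the blocking DownloadString call
            lock (gate) { current--; }
        });
        return peak;
    }

    public static void Main()
    {
        // The reported peak never exceeds the configured limit of 5.
        Console.WriteLine("Peak concurrency: {0}", MeasurePeak(20, 5));
    }
}
```

MaxDegreeOfParallelism is an upper bound, so the measured peak may be lower than 5 if the thread pool has not yet ramped up.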
UPDATE. There are several ways to throttle tasks with this approach, but the most straightforward is to just sleep:
Parallel.For(0, 50, new ParallelOptions() { MaxDegreeOfParallelism = 5 },
(i) =>
{
// surround with try-catch
var watch = Stopwatch.StartNew();
string result;
using (var client = new WebClient()) {
result = client.DownloadString(string.Format(pageFormat, i));
}
// do something with result
Console.WriteLine("Got a document: {0}", result.Substring(Math.Min(30, result.Length)));
watch.Stop();
var sleep = 2000 - watch.ElapsedMilliseconds;
if (sleep > 0)
Thread.Sleep((int)sleep);
});
This isn't a direct answer to your question. I just wanted to suggest an alternative approach.
I'd recommend that you look into using Microsoft's Reactive Framework (NuGet "System.Reactive") for doing this kind of thing.
You could then do something like this:
var query =
Observable
.Range(0, 50)
.Select(i => string.Format(pageFormat, i))
.Select(u => Observable.Using(
() => new WebClient(),
wc => Observable.Start(() => new { url = u, content = wc.DownloadString(u) })))
.Merge(5);
IDisposable subscription = query.Subscribe(x =>
{
Console.WriteLine(x.url);
Console.WriteLine(x.content);
});
It's all async and the process can be stopped at any time by calling subscription.Dispose();
I have a collection of 1000 input messages to process. I'm looping over the input collection and starting a new task for each message to be processed.
//Assume this messages collection contains 1000 items
var messages = new List<string>();
foreach (var msg in messages)
{
Task.Factory.StartNew(() =>
{
Process(msg);
});
}
Can we guess the maximum number of messages that get processed simultaneously (assuming a normal quad-core processor), or can we limit the maximum number of messages processed at a time?
How can I ensure these messages get processed in the same sequence/order as the collection?
You could use Parallel.ForEach and rely on MaxDegreeOfParallelism instead.
Parallel.ForEach(messages, new ParallelOptions {MaxDegreeOfParallelism = 10},
msg =>
{
// logic
Process(msg);
});
SemaphoreSlim is a very good solution in this case and I highly recommend the OP try it, but @Manoj's answer has a flaw, as mentioned in the comments: the semaphore should be waited on before spawning the task, like this.
Updated Answer: As @Vasyl pointed out, the semaphore may be disposed before the tasks complete, which raises an exception when Release() is called, so you must wait for all created tasks to complete before exiting the using block.
int maxConcurrency=10;
var messages = new List<string>();
using(SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
{
List<Task> tasks = new List<Task>();
foreach(var msg in messages)
{
concurrencySemaphore.Wait();
var t = Task.Factory.StartNew(() =>
{
try
{
Process(msg);
}
finally
{
concurrencySemaphore.Release();
}
});
tasks.Add(t);
}
Task.WaitAll(tasks.ToArray());
}
Answer to Comments
For those who want to see how the semaphore can be disposed too early without Task.WaitAll:
Run the code below in a console app and this exception will be raised.
System.ObjectDisposedException: 'The semaphore has been disposed.'
static void Main(string[] args)
{
int maxConcurrency = 5;
List<string> messages = Enumerable.Range(1, 15).Select(e => e.ToString()).ToList();
using (SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
{
List<Task> tasks = new List<Task>();
foreach (var msg in messages)
{
concurrencySemaphore.Wait();
var t = Task.Factory.StartNew(() =>
{
try
{
Process(msg);
}
finally
{
concurrencySemaphore.Release();
}
});
tasks.Add(t);
}
// Task.WaitAll(tasks.ToArray());
}
Console.WriteLine("Exited using block");
Console.ReadKey();
}
private static void Process(string msg)
{
Thread.Sleep(2000);
Console.WriteLine(msg);
}
I think it would be better to use Parallel.ForEach:
Parallel.ForEach(messages,
new ParallelOptions { MaxDegreeOfParallelism = 4 },
x => Process(x)
);
where 4 is the MaxDegreeOfParallelism.
With .NET 5.0 and Core 3.0 channels were introduced.
The main benefit of this producer/consumer concurrency pattern is that you can also limit the input data processing to reduce resource impact.
This is especially helpful when processing millions of data records.
Instead of reading the whole dataset at once into memory, you can now consecutively query only chunks of the data and wait for the workers to process it before querying more.
Code sample with a queue capacity of 10 messages and 3 consumer tasks:
/// <exception cref="System.AggregateException">Thrown on Consumer Task exceptions.</exception>
public static async Task ProcessMessages(List<string> messages)
{
const int producerCapacity = 10, consumerTaskLimit = 3;
var channel = Channel.CreateBounded<string>(producerCapacity);
_ = Task.Run(async () =>
{
foreach (var msg in messages)
{
await channel.Writer.WriteAsync(msg);
// blocking when channel is full
// waiting for the consumer tasks to pop messages from the queue
}
channel.Writer.Complete();
// signaling the end of queue so that
// WaitToReadAsync will return false to stop the consumer tasks
});
var tokenSource = new CancellationTokenSource();
CancellationToken ct = tokenSource.Token;
var consumerTasks = Enumerable
.Range(1, consumerTaskLimit)
.Select(_ => Task.Run(async () =>
{
try
{
while (await channel.Reader.WaitToReadAsync(ct))
{
ct.ThrowIfCancellationRequested();
while (channel.Reader.TryRead(out var message))
{
await Task.Delay(500);
Console.WriteLine(message);
}
}
}
catch (OperationCanceledException) { }
catch
{
tokenSource.Cancel();
throw;
}
}))
.ToArray();
Task waitForConsumers = Task.WhenAll(consumerTasks);
try { await waitForConsumers; }
catch
{
foreach (var e in waitForConsumers.Exception.Flatten().InnerExceptions)
Console.WriteLine(e.ToString());
throw waitForConsumers.Exception.Flatten();
}
}
As pointed out by Theodor Zoulias:
On multiple consumer exceptions, the remaining tasks will continue to run and have to take the load of the killed tasks. To avoid this, I implemented a CancellationToken to stop all the remaining tasks and handle the exceptions combined in the AggregateException of waitForConsumers.Exception.
Side note:
The Task Parallel Library (TPL) might be good at automatically limiting the tasks based on your local resources. But when you are processing data remotely via RPC, it's necessary to manually limit your RPC calls to avoid filling the network/processing stack!
If your Process method is async you can't use Task.Factory.StartNew as it doesn't play well with an async delegate. Also there are some other nuances when using it (see this for example).
The proper way to do it in this case is to use Task.Run. Here's @ClearLogic's answer modified for an async Process method.
static void Main(string[] args)
{
int maxConcurrency = 5;
List<string> messages = Enumerable.Range(1, 15).Select(e => e.ToString()).ToList();
using (SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
{
List<Task> tasks = new List<Task>();
foreach (var msg in messages)
{
concurrencySemaphore.Wait();
var t = Task.Run(async () =>
{
try
{
await Process(msg);
}
finally
{
concurrencySemaphore.Release();
}
});
tasks.Add(t);
}
Task.WaitAll(tasks.ToArray());
}
Console.WriteLine("Exited using block");
Console.ReadKey();
}
private static async Task Process(string msg)
{
await Task.Delay(2000);
Console.WriteLine(msg);
}
You can create your own TaskScheduler and override QueueTask there.
protected virtual void QueueTask(Task task)
Then you can do anything you like.
One example here:
Limited concurrency level task scheduler (with task priority) handling wrapped tasks
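As a rough idea of what overriding QueueTask can look like, here is a simplified sketch of a scheduler that never runs more than a fixed number of tasks at once. The class names LimitedConcurrencyScheduler and SchedulerDemo are invented for this example, it has no task priorities, and it is not the code from the linked post.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical, simplified scheduler: at most maxConcurrency tasks run at once.
public sealed class LimitedConcurrencyScheduler : TaskScheduler
{
    private readonly LinkedList<Task> _queue = new LinkedList<Task>(); // guarded by lock (_queue)
    private readonly int _maxConcurrency;
    private int _running; // number of worker loops currently active

    public LimitedConcurrencyScheduler(int maxConcurrency)
    {
        if (maxConcurrency < 1) throw new ArgumentOutOfRangeException(nameof(maxConcurrency));
        _maxConcurrency = maxConcurrency;
    }

    public override int MaximumConcurrencyLevel => _maxConcurrency;

    protected override void QueueTask(Task task)
    {
        lock (_queue)
        {
            _queue.AddLast(task);
            if (_running >= _maxConcurrency) return; // an existing worker will pick it up
            _running++;
        }
        ThreadPool.QueueUserWorkItem(_ => Drain());
    }

    // Worker loop: keeps executing queued tasks until the queue is empty.
    private void Drain()
    {
        while (true)
        {
            Task next;
            lock (_queue)
            {
                if (_queue.Count == 0) { _running--; return; }
                next = _queue.First.Value;
                _queue.RemoveFirst();
            }
            TryExecuteTask(next);
        }
    }

    // Never run tasks inline on the waiting thread; that keeps the cap easy to reason about.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued) => false;

    protected override IEnumerable<Task> GetScheduledTasks()
    {
        lock (_queue) { return new List<Task>(_queue); }
    }
}

public static class SchedulerDemo
{
    // Returns the highest number of tasks observed running at the same time.
    public static int MeasurePeak(int taskCount, int maxConcurrency)
    {
        var factory = new TaskFactory(new LimitedConcurrencyScheduler(maxConcurrency));
        int current = 0, peak = 0;
        var gate = new object();
        var tasks = Enumerable.Range(0, taskCount).Select(_ => factory.StartNew(() =>
        {
            lock (gate) { current++; if (current > peak) peak = current; }
            Thread.Sleep(50); // stand-in for Process(msg)
            lock (gate) { current--; }
        })).ToArray();
        Task.WaitAll(tasks);
        return peak;
    }

    public static void Main()
    {
        Console.WriteLine("Peak concurrency: {0}", SchedulerDemo.MeasurePeak(10, 2));
    }
}
```

Tasks are routed through the scheduler by creating them with a TaskFactory bound to it, so the calling code does not need any semaphore of its own.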
You can simply set the maximum degree of concurrency like this:
int maxConcurrency=10;
var messages = new List<string>(); // assume this contains 1000 items
using(SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
{
foreach(var msg in messages)
{
Task.Factory.StartNew(() =>
{
concurrencySemaphore.Wait();
try
{
Process(msg);
}
finally
{
concurrencySemaphore.Release();
}
});
}
}
If you need in-order queuing (processing might finish in any order), there is no need for a semaphore. Old-fashioned if statements work fine:
const int maxConcurrency = 5;
List<Task> tasks = new List<Task>();
foreach (var arg in args)
{
    var t = Task.Run(() => { Process(arg); });
    tasks.Add(t);
    if (tasks.Count >= maxConcurrency)
    {
        // Wait for one slot to free up and drop the finished task,
        // otherwise WaitAny would return immediately on later iterations.
        int finished = Task.WaitAny(tasks.ToArray());
        tasks.RemoveAt(finished);
    }
}
Task.WaitAll(tasks.ToArray());
I ran into a similar problem where I wanted to produce 5000 results while calling apis, etc. So, I ran some speed tests.
Parallel.ForEach(products.Select(x => x.KeyValue).Distinct().Take(100), id =>
{
new ParallelOptions { MaxDegreeOfParallelism = 100 };
GetProductMetaData(productsMetaData, client, id).GetAwaiter().GetResult();
});
produced 100 results in 30 seconds.
Parallel.ForEach(products.Select(x => x.KeyValue).Distinct().Take(100), id =>
{
new ParallelOptions { MaxDegreeOfParallelism = 100 };
GetProductMetaData(productsMetaData, client, id);
});
Moving the GetAwaiter().GetResult() to the individual async api calls inside GetProductMetaData resulted in 14.09 seconds to produce 100 results.
foreach (var id in ids.Take(100))
{
GetProductMetaData(productsMetaData, client, id);
}
Complete non-async programming with the GetAwaiter().GetResult() in api calls resulted in 13.417 seconds.
var tasks = new List<Task>();
while (y < ids.Count())
{
foreach (var id in ids.Skip(y).Take(100))
{
tasks.Add(GetProductMetaData(productsMetaData, client, id));
}
y += 100;
Task.WhenAll(tasks).GetAwaiter().GetResult();
Console.WriteLine($"Finished {y}, {sw.Elapsed}");
}
Forming a task list and working through 100 at a time resulted in a speed of 7.36 seconds.
using (SemaphoreSlim cons = new SemaphoreSlim(10))
{
var tasks = new List<Task>();
foreach (var id in ids.Take(100))
{
cons.Wait();
var t = Task.Factory.StartNew(() =>
{
try
{
GetProductMetaData(productsMetaData, client, id);
}
finally
{
cons.Release();
}
});
tasks.Add(t);
}
Task.WaitAll(tasks.ToArray());
}
Using SemaphoreSlim resulted in 13.369 seconds, but also took a moment to boot to start using it.
var throttler = new SemaphoreSlim(initialCount: take);
foreach (var id in ids)
{
throttler.WaitAsync().GetAwaiter().GetResult();
tasks.Add(Task.Run(async () =>
{
try
{
skip += 1;
await GetProductMetaData(productsMetaData, client, id);
if (skip % 100 == 0)
{
Console.WriteLine($"started {skip}/{count}, {sw.Elapsed}");
}
}
finally
{
throttler.Release();
}
}));
}
Using Semaphore Slim with a throttler for my async task took 6.12 seconds.
The answer for me in this specific project was to use a throttler with SemaphoreSlim. Although the while/foreach task list did sometimes beat the throttler, the throttler won 4 out of 6 times for 1000 records.
I realize I'm not using the OPs code, but I think this is important and adds to this discussion because how is sometimes not the only question that should be asked, and the answer is sometimes "It depends on what you are trying to do."
Now to answer the specific questions:
How to limit the maximum number of parallel tasks in C#: I showed how to limit the number of tasks that run at a time.
Can we guess how many maximum messages simultaneously get processed at the time (assuming normal Quad core processor), or can we limit the maximum number of messages to be processed at the time? I cannot guess how many will be processed at a time unless I set an upper limit, but I can set an upper limit. Different computers run at different speeds depending on CPU, RAM, etc., on how many threads and cores the program itself has access to, and on what other programs are running on the same computer.
How to ensure this message get processed in the same sequence/order of the Collection? If you want to process everything in a specific order, that is synchronous programming. The point of running things asynchronously is that they can proceed without any particular order. As you can see from my code, the time difference over 100 records is minimal unless you use async code. If you need ordering at some point, use asynchronous programming up until that point, then await and do things synchronously from there. For example: task1a.start, task2a.start, then later task1a.await, task2a.await, then later task1b.start, task1b.await and task2b.start, task2b.await.
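If what matters is the order of the results, not the order in which the work finishes, one option is to let the tasks complete in any order and rely on Task.WhenAll, which returns results in the same order as the tasks it was given. The sketch below assumes that reading of the question; OrderedResults and ProcessAsync are hypothetical names and the delay is a stand-in for real work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class OrderedResults
{
    // Processes inputs with at most maxConcurrency in flight.
    // Completion order is arbitrary, but the returned array follows the input order.
    public static async Task<TOut[]> ProcessAsync<TIn, TOut>(
        IEnumerable<TIn> inputs, Func<TIn, Task<TOut>> process, int maxConcurrency)
    {
        using (var throttler = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = inputs.Select(async input =>
            {
                await throttler.WaitAsync();
                try { return await process(input); }
                finally { throttler.Release(); }
            }).ToList(); // materialize so every task is started exactly once

            return await Task.WhenAll(tasks); // element i belongs to input i
        }
    }

    public static async Task Main()
    {
        var results = await ProcessAsync(
            Enumerable.Range(1, 10),
            async n => { await Task.Delay(100 - n * 10); return n * 2; }, // later inputs finish sooner
            3);
        Console.WriteLine(string.Join(", ", results)); // prints 2, 4, 6, ... 20 in input order
    }
}
```

If the processing itself must happen strictly in sequence, no amount of parallelism helps and a plain loop is the honest answer.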
public static void RunTasks(List<NamedTask> importTaskList)
{
List<NamedTask> runningTasks = new List<NamedTask>();
try
{
foreach (NamedTask currentTask in importTaskList)
{
currentTask.Start();
runningTasks.Add(currentTask);
if (runningTasks.Where(x => x.Status == TaskStatus.Running).Count() >= MaxCountImportThread)
{
Task.WaitAny(runningTasks.ToArray());
}
}
Task.WaitAll(runningTasks.ToArray());
}
catch (Exception ex)
{
Log.Fatal("ERROR!", ex);
}
}
You can use a BlockingCollection. If the collection's capacity limit has been reached, the producer stops producing until the consumer has taken an item. I find this pattern easier to understand and implement than SemaphoreSlim.
int TasksLimit = 10;
BlockingCollection<Task> tasks = new BlockingCollection<Task>(new ConcurrentBag<Task>(), TasksLimit);
void ProduceAndConsume()
{
var producer = Task.Factory.StartNew(RunProducer);
var consumer = Task.Factory.StartNew(RunConsumer);
try
{
Task.WaitAll(new[] { producer, consumer });
}
catch (AggregateException ae) { }
}
void RunConsumer()
{
foreach (var task in tasks.GetConsumingEnumerable())
{
task.Start();
}
}
void RunProducer()
{
for (int i = 0; i < 1000; i++)
{
tasks.Add(new Task(() => Thread.Sleep(1000), TaskCreationOptions.AttachedToParent));
}
}
Note that RunProducer and RunConsumer are spawned as two independent tasks.
I have a TPL Dataflow pipeline with two sources and two targets linked in a many-to-many fashion. The target blocks appear to complete successfully; however, the pipeline usually drops one or more inputs. I've attached the simplest possible full repro I could come up with below. Any ideas?
Notes:
The problem only occurs if the artificial delay is used while generating the input.
Complete() is successfully called for both sources, but one of the source's Completion task hangs in the WaitingForActivation state, even though both Targets complete successfully.
I can't find any documentation stating many-to-many dataflows aren't supported, and this question's answer implies it is - https://social.msdn.microsoft.com/Forums/en-US/19d831af-2d3f-4d95-9672-b28ae53e6fa0/completion-of-complex-graph-dataflowgraph-object?forum=tpldataflow
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
class Program
{
private const int NumbersPerSource = 10;
private const int MaxDelayMilliseconds = 10;
static async Task Main(string[] args)
{
int numbersProcessed = 0;
var source1 = new BufferBlock<int>();
var source2 = new BufferBlock<int>();
var target1 = new ActionBlock<int>(i => Interlocked.Increment(ref numbersProcessed));
var target2 = new ActionBlock<int>(i => Interlocked.Increment(ref numbersProcessed));
var linkOptions = new DataflowLinkOptions() { PropagateCompletion = true };
source1.LinkTo(target1, linkOptions);
source1.LinkTo(target2, linkOptions);
source2.LinkTo(target1, linkOptions);
source2.LinkTo(target2, linkOptions);
var task1 = Task.Run(() => Post(source1));
var task2 = Task.Run(() => Post(source2));
// source1 or source2 Completion tasks may never complete even though Complete is always successfully called.
//await Task.WhenAll(task1, task2, source1.Completion, source2.Completion, target1.Completion, target2.Completion);
await Task.WhenAll(task1, task2, target1.Completion, target2.Completion);
Console.WriteLine($"{numbersProcessed} of {NumbersPerSource * 2} numbers processed.");
}
private static async Task Post(BufferBlock<int> source)
{
foreach (var i in Enumerable.Range(0, NumbersPerSource)) {
await Task.Delay(TimeSpan.FromMilliseconds(GetRandomMilliseconds()));
Debug.Assert(source.Post(i));
}
source.Complete();
}
private static Random Random = new Random();
private static int GetRandomMilliseconds()
{
lock (Random) {
return Random.Next(0, MaxDelayMilliseconds);
}
}
}
As @MikeJ pointed out in a comment, linking the blocks with PropagateCompletion in a many-to-many dataflow configuration can cause some target blocks to complete prematurely. In this case target1 and target2 are both marked as completed when either of the two source blocks completes, leaving the other source unable to complete, because there are still messages in its output buffer. These messages are never going to be consumed, because none of the linked target blocks is willing to accept them.
To fix this problem you could use the custom PropagateCompletion method below:
public static void PropagateCompletion(IDataflowBlock[] sources,
IDataflowBlock[] targets)
{
// Arguments validation omitted
Task allSourcesCompletion = Task.WhenAll(sources.Select(s => s.Completion));
ThreadPool.QueueUserWorkItem(async _ =>
{
try { await allSourcesCompletion.ConfigureAwait(false); } catch { }
Exception exception = allSourcesCompletion.IsFaulted ?
allSourcesCompletion.Exception : null;
foreach (var target in targets)
{
if (exception is null) target.Complete(); else target.Fault(exception);
}
});
}
Usage example:
source1.LinkTo(target1);
source1.LinkTo(target2);
source2.LinkTo(target1);
source2.LinkTo(target2);
PropagateCompletion(new[] { source1, source2 }, new[] { target1, target2 });
Notice that no DataflowLinkOptions are passed when linking the sources to the targets in this example.
I'm using Asp .Net 4.5.1.
I have tasks to run, which call a web-service, and some might fail. I need to run N successful tasks which perform some light CPU work and mainly call a web service, then stop, and I want to throttle.
For example, let's assume we have 300 URLs in some collection. We need to run a function called Task<bool> CheckUrlAsync(url) on each of them, with throttling, meaning, for example, having only 5 run "at the same time" (in other words, have maximum 5 connections used at any given time). Also, we only want to perform N (assume 100) successful operations, and then stop.
I've read this and this and still I'm not sure what would be the correct way to do it.
How would you do it?
Assume ASP .Net
Assume IO call (http call to web service), no heavy CPU operations.
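Use SemaphoreSlim.
var semaphore = new SemaphoreSlim(5);
// ToList() starts each task exactly once; a lazy query would re-run on every enumeration.
var tasks = urlCollection.Select(async url =>
{
    await semaphore.WaitAsync();
    try
    {
        return await CheckUrlAsync(url);
    }
    finally
    {
        semaphore.Release();
    }
}).ToList();
while (tasks.Count(t => t.IsCompleted) < 100)
{
    // Wait only on tasks that are still pending so the loop does not spin.
    await Task.WhenAny(tasks.Where(t => !t.IsCompleted));
}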
Although I would prefer to use Rx.Net to produce some better code.
using(var semaphore = new SemaphoreSlim(5))
{
var results = urlCollection.ToObservable()
.Select(async url =>
{
await semaphore.WaitAsync();
try
{
return await CheckUrlAsync(url);
}
finally
{
semaphore.Release();
}
}).Take(100).ToList();
}
Okay...this is going to be fun.
public static class SemaphoreHelper
{
    public static Task<T> ContinueWith<T>(
        this SemaphoreSlim semaphore,
        Func<Task<T>> action)
    {
        // Run the action once a slot is free; Unwrap flattens Task<Task<T>> to Task<T>.
        var ret = semaphore.WaitAsync()
            .ContinueWith(_ => action())
            .Unwrap();
        // Release the slot whether the action succeeded or failed.
        ret.ContinueWith(_ => semaphore.Release(), TaskContinuationOptions.None);
        return ret;
    }
}
var semaphore = new SemaphoreSlim(5);
var results = urlCollection.Select(
    url => semaphore.ContinueWith(() => CheckUrlAsync(url))).ToList();
I do need to add that the code as it stands will still run all 300 URLs; it will just return sooner, that's all. You would need to pass a cancellation token to semaphore.WaitAsync(token) to cancel the queued work. Again, I suggest using Rx.Net for that. It's just easier to use Rx.Net to get the cancellation token to work with .Take(100).
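Without Rx, the cancellation idea can be sketched roughly as follows: every queued check waits on the semaphore with a token, and the token is cancelled as soon as enough checks have succeeded. ThrottledUntilEnough, RunAsync and the fake check are hypothetical names for this illustration; CheckUrlAsync from the question would take the place of the fake.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledUntilEnough
{
    // Runs `check` over the items with at most maxConcurrency in flight and stops
    // starting new work once `needed` checks have returned true.
    // Checks already in flight are allowed to finish, so the count can end slightly above `needed`.
    public static async Task<int> RunAsync<T>(
        IEnumerable<T> items, Func<T, Task<bool>> check, int maxConcurrency, int needed)
    {
        int successes = 0;
        using (var cts = new CancellationTokenSource())
        using (var semaphore = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                try { await semaphore.WaitAsync(cts.Token); }
                catch (OperationCanceledException) { return; } // still queued: skip it

                try
                {
                    if (await check(item) && Interlocked.Increment(ref successes) >= needed)
                        cts.Cancel(); // enough successes: cancel everything still waiting
                }
                finally { semaphore.Release(); }
            }).ToList();

            await Task.WhenAll(tasks);
        }
        return successes;
    }

    public static async Task Main()
    {
        // Stand-in for CheckUrlAsync: every third "URL" fails.
        Func<int, Task<bool>> fakeCheck = async i => { await Task.Delay(10); return i % 3 != 0; };
        int ok = await RunAsync(Enumerable.Range(0, 300), fakeCheck, 5, 100);
        Console.WriteLine("Successful checks: {0}", ok);
    }
}
```

Exceptions thrown by the check itself are not handled here; they would surface from Task.WhenAll.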
Try something like this?
const int u_limit = 100;
const int c_limit = 5;
List<Task> tasks = new List<Task>();
int totalRun = 0;
while (totalRun < u_limit)
{
    for (int i = 0; i < c_limit; i++)
    {
        tasks.Add(Task.Run(() => {
            // Your code here.
        }));
    }
    // Task.WaitAll takes an array; clear the list so finished tasks are not waited on again.
    Task.WaitAll(tasks.ToArray());
    tasks.Clear();
    totalRun += c_limit;
}
I have this situation:
var tasks = new List<ITask> ...
Parallel.ForEach(tasks, currentTask => currentTask.Execute() );
Is it possible to instruct PLinq to wait for 500ms before the next thread is spawned?
System.Threading.Thread.Sleep(5000);
You are using Parallel.ForEach the wrong way. You should write a special enumerator that rate-limits itself to fetching data once every 500 ms.
I made some assumptions on how your DTO works due to you not providing any details.
private IEnumerable<SomeResource> GetRateLimitedResource()
{
SomeResource someResource = null;
do
{
someResource = _remoteProvider.GetData();
if(someResource != null)
{
yield return someResource;
Thread.Sleep(500);
}
} while (someResource != null);
}
Here is how your Parallel.ForEach call should look then:
Parallel.ForEach(GetRateLimitedResource(), SomeFunctionToProcessSomeResource);
There are already some good suggestions. I would agree with others that you are using PLINQ in a manner it wasn't meant to be used.
My suggestion would be to use System.Threading.Timer. This is probably better than writing a method that returns an IEnumerable<> that forces a half second delay, because you may not need to wait the full half second, depending on how much time has passed since your last API call.
With the timer, it will invoke a delegate that you've provided it at the interval you specify, so even if the first task isn't done, a half second later it will invoke your delegate on another thread, so there won't be any extra waiting.
From your example code, it sounds like you have a list of tasks. In that case, I would use System.Collections.Concurrent.ConcurrentQueue to keep track of the tasks. Once the queue is empty, turn off the timer.
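A rough sketch of the timer-plus-queue idea, under some assumptions: the work items are plain Action delegates that do not throw, and TimedDispatcher and RunAsync are names invented for this example. The returned task is only there so the caller can tell when everything has finished.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class TimedDispatcher
{
    // Starts one queued action per timer tick. Ticks do not wait for earlier
    // actions to finish, so slow actions overlap on different pool threads.
    // Assumes the actions do not throw (an exception in a timer callback is unhandled).
    public static Task RunAsync(IEnumerable<Action> work, TimeSpan interval)
    {
        var queue = new ConcurrentQueue<Action>(work);
        int remaining = queue.Count;
        var done = new TaskCompletionSource<bool>();
        if (remaining == 0) { done.SetResult(true); return done.Task; }

        Timer timer = null;
        timer = new Timer(_ =>
        {
            if (!queue.TryDequeue(out var action)) return;        // late tick, nothing left to start
            if (queue.IsEmpty)
                timer.Change(Timeout.Infinite, Timeout.Infinite); // queue drained: turn the timer off
            action();
            if (Interlocked.Decrement(ref remaining) == 0)
            {
                timer.Dispose();
                done.TrySetResult(true);
            }
        }, null, Timeout.Infinite, Timeout.Infinite);

        timer.Change(TimeSpan.Zero, interval); // start ticking only after `timer` is assigned
        return done.Task;
    }

    public static void Main()
    {
        var work = Enumerable.Range(1, 4).Select(i => (Action)(() =>
            Console.WriteLine("task {0} started at {1:HH:mm:ss.fff}", i, DateTime.Now)));
        RunAsync(work, TimeSpan.FromMilliseconds(500)).Wait();
    }
}
```

The timer is created in a stopped state and started with Change so the callback can never observe an unassigned timer variable.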
You could use Enumerable.Aggregate instead.
var task = tasks.Aggregate((t1, t2) =>
t1.ContinueWith(_ =>
{ Thread.Sleep(500); return t2.Result; }));
If you don't want the tasks chained then there is also the overload to Select assuming the tasks are in order of delay.
var tasks = Enumerable
.Range(1, 10)
.Select(x => Task.Run(() => x * 2))
.Select((x, i) => Task.Delay(TimeSpan.FromMilliseconds(i * 500))
.ContinueWith(_ => x.Result));
foreach(var result in tasks.Select(x => x.Result))
{
Console.WriteLine(result);
}
From the comments, a better option would be to guard the resource instead of using the time delay.
static object Locker = new object();
static int GetResultFromResource(int arg)
{
lock(Locker)
{
Thread.Sleep(500);
return arg * 2;
}
}
var tasks = Enumerable
.Range(1, 10)
.Select(x => Task.Run(() => GetResultFromResource(x)));
foreach(var result in tasks.Select(x => x.Result))
{
Console.WriteLine(result);
}
In this case how about a Producer-Consumer pattern with a BlockingCollection<T>?
var tasks = new BlockingCollection<ITask>();
// add tasks, if this is an expensive process, put it out onto a Task
// tasks.Add(x);
// we're done producin' (allows GetConsumingEnumerable to finish)
tasks.CompleteAdding();
RunTasks(tasks);
With a single consumer thread:
static void RunTasks(BlockingCollection<ITask> tasks)
{
foreach (var task in tasks.GetConsumingEnumerable())
{
task.Execute();
// this may not be as accurate as you would like
Thread.Sleep(500);
}
}
If you have access to .Net 4.5 you can use Task.Delay:
static void RunTasks(BlockingCollection<ITask> tasks)
{
foreach (var task in tasks.GetConsumingEnumerable())
{
Task.Delay(500)
.ContinueWith(_ => task.Execute())
.Wait();
}
}
I want to make 10 asynchronous http requests at once and only process the results when all have completed and in a single callback function. I also do not want to block any threads using WaitAll (it is my understanding that WaitAll blocks until all are complete). I think I want to make a custom IAsyncResult which will handle multiple calls. Am I on the right track? Are there any good resources or examples out there that describe handling this?
I like Darin's solution. But, if you want something more traditional, you can try this.
I would definitely use an array of wait handles and the WaitAll mechanism:
static void Main(string[] args)
{
WaitCallback del = state =>
{
ManualResetEvent[] resetEvents = new ManualResetEvent[10];
WebClient[] clients = new WebClient[10];
Console.WriteLine("Starting requests");
for (int index = 0; index < 10; index++)
{
resetEvents[index] = new ManualResetEvent(false);
clients[index] = new WebClient();
clients[index].OpenReadCompleted += new OpenReadCompletedEventHandler(client_OpenReadCompleted);
clients[index].OpenReadAsync(new Uri("http://www.google.com"), resetEvents[index]);
}
bool succeeded = ManualResetEvent.WaitAll(resetEvents, 10000);
Complete(succeeded);
for (int index = 0; index < 10; index++)
{
resetEvents[index].Dispose();
clients[index].Dispose();
}
};
ThreadPool.QueueUserWorkItem(del);
Console.WriteLine("Waiting...");
Console.ReadKey();
}
static void client_OpenReadCompleted(object sender, OpenReadCompletedEventArgs e)
{
// Do something with data...Then close the stream
e.Result.Close();
ManualResetEvent readCompletedEvent = (ManualResetEvent)e.UserState;
readCompletedEvent.Set();
Console.WriteLine("Received callback");
}
static void Complete(bool succeeded)
{
if (succeeded)
{
Console.WriteLine("Yeah!");
}
else
{
Console.WriteLine("Boohoo!");
}
}
In .NET 4.0 there's a nice parallel Task library that allows you to do things like:
using System;
using System.Linq;
using System.Net;
using System.Threading.Tasks;
class Program
{
public static void Main()
{
var urls = new[] { "http://www.google.com", "http://www.yahoo.com" };
Task.Factory.ContinueWhenAll(
urls.Select(url => Task.Factory.StartNew(u =>
{
using (var client = new WebClient())
{
return client.DownloadString((string)u);
}
}, url)).ToArray(),
tasks =>
{
var results = tasks.Select(t => t.Result);
foreach (var html in results)
{
Console.WriteLine(html);
}
});
Console.ReadLine();
}
}
As you can see for each url in the list a different task is started and once all tasks are completed the callback is invoked and passed the result of all tasks.
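On .NET 4.5 or later the same "single callback when everything is done" shape can also be written with async/await and Task.WhenAll, which does not block a thread while the requests are in flight. This is a sketch under assumptions: WhenAllDemo, FetchAllAsync and FakeFetch are hypothetical names, and FakeFetch stands in for a real download such as WebClient.DownloadStringTaskAsync.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Starts every fetch, then invokes the single callback once with all results.
    // No thread is blocked while the fetches are pending.
    public static async Task FetchAllAsync(
        string[] urls, Func<string, Task<string>> fetch, Action<string[]> onAllCompleted)
    {
        string[] results = await Task.WhenAll(urls.Select(fetch));
        onAllCompleted(results); // results[i] corresponds to urls[i]
    }

    // Stand-in for a real HTTP call, so the sketch runs without network access.
    private static async Task<string> FakeFetch(string url)
    {
        await Task.Delay(50);
        return "<html>" + url + "</html>";
    }

    public static async Task Main()
    {
        var urls = new[] { "http://www.google.com", "http://www.yahoo.com" };
        await FetchAllAsync(urls, FakeFetch, pages =>
        {
            foreach (var page in pages) Console.WriteLine(page);
        });
    }
}
```

If any fetch fails, the await on Task.WhenAll throws and the callback is not invoked, so error handling would need to be added around it.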
I think you are better off using the WaitAll approach. Otherwise you will be processing 10 IAsyncResult callbacks, and using a semaphore to determine that all 10 are finally complete.
Keep in mind that WaitAll is very efficient; it is not like the silliness of polling in a loop with Thread.Sleep, where the thread keeps waking up just to check whether the work is done. When a thread is "de-scheduled" because it hit a WaitAll, it consumes no processor time until the handles are signaled (or the timeout expires), although the thread itself stays tied up for the duration of the wait.