Using Task Parallel Library to handle frequent URL requests - c#

I am using .NET to build a stock quote updater. Suppose there are X stock symbols to be updated during market hours. In order to keep the updating at a pace that does not exceed the data provider's limit (e.g. Yahoo Finance), I will try to limit the number of requests/sec with a mechanism similar to a thread pool. Let's say I want to allow only 5 requests/sec; that corresponds to a pool of 5 threads.
I heard about the TPL and would like to use it, although I am inexperienced with it. How can I specify the number of threads in the pool that Task implicitly uses? Here is a loop to schedule the requests, where requestFunc(url) is the function that updates quotes. I would like some comments or suggestions from the experts on how to do this properly:
// X is a number much bigger than 5
List<Task> tasks = new List<Task>();
for (int i = 0; i < X; i++)
{
    Task t = Task.Factory.StartNew(() => { requestFunc(url); }, TaskCreationOptions.None);
    t.Wait(100); // slow down 100 ms. I am not sure if this is the right thing to do
    tasks.Add(t);
}
Task.WaitAll(tasks.ToArray());
OK, I added an outer loop to make it run continuously. When I made some changes to @steve16351's code, it only loops once. Why?
static void Main(string[] args)
{
    LimitedExecutionRateTaskScheduler scheduler = new LimitedExecutionRateTaskScheduler(5);
    TaskFactory factory = new TaskFactory(scheduler);
    List<string> symbolsToCheck = new List<string>() { "GOOG", "AAPL", "MSFT", "AGIO", "MNK", "SPY", "EBAY", "INTC" };
    while (true)
    {
        List<Task> tasks = new List<Task>();
        Console.WriteLine("Starting...");
        foreach (string symbol in symbolsToCheck)
        {
            Task t = factory.StartNew(() => { write(symbol); },
                CancellationToken.None, TaskCreationOptions.None, scheduler);
            tasks.Add(t);
        }
        //Task.WhenAll(tasks);
        Console.WriteLine("Ending...");
        Console.Read();
    }
    //Console.Read();
}

public static void write(string symbol)
{
    DateTime dateValue = DateTime.Now;
    //Console.WriteLine("[{0:HH:mm:ss}] Doing {1}..", DateTime.Now, symbol);
    Console.WriteLine("Date and Time with Milliseconds: {0} doing {1}..",
        dateValue.ToString("MM/dd/yyyy hh:mm:ss.fff tt"), symbol);
}

If you want to have a flow of URL requests while limiting to no more than 5 concurrent operations, you should use TPL Dataflow's ActionBlock:
var block = new ActionBlock<string>(
    url => requestFunc(url),
    new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 5 });

foreach (var url in urls)
{
    block.Post(url);
}

block.Complete();
await block.Completion;
You Post the urls to it, and for each of them it performs the request while making sure there are no more than MaxDegreeOfParallelism requests in flight at a time.
When you are done, you can call Complete to signal the block that no more items are coming, and await the Completion task to asynchronously wait until the block actually completes.
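Note that MaxDegreeOfParallelism caps concurrency, not the request rate. If the provider's limit is really "no more than 5 requests started per second", one rough option (my own sketch, not part of the ActionBlock API) is to pace the posting loop instead:

// Pace the producer: post one url every 200 ms, so at most ~5 requests
// are started per second regardless of how long each one takes.
foreach (var url in urls)
{
    block.Post(url);
    await Task.Delay(200);
}
block.Complete();
await block.Completion;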

Do not worry about the number of threads; just make sure that you are not exceeding the number of requests per second. Use a single timer to signal a ManualResetEvent every 200 ms and have the tasks wait on that ManualResetEvent inside a loop.
To create a timer and make it signal the ManualResetEvent every 200 ms:
resetEvent = new ManualResetEvent(false);
timer = new Timer(state => resetEvent.Set(), null, 0, 200);
Make sure you clean up the timer (call Dispose) when you no longer need it.
Let the number of threads be determined by the runtime.
Creating a single task per stock would be a poor implementation, because you would not know when each stock gets updated.
So you could just put all the stocks in a list and have a single task update each stock one after another.
By giving another list of stocks to another task, you could give that task a higher priority by setting its timer to fire every 250 ms while the low-priority timer fires every 1000 ms. That adds up to 5 requests per second, and the high-priority list is updated 4 times as often as the low-priority one.
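Here is a minimal sketch of that pacing idea. I use an AutoResetEvent rather than the ManualResetEvent mentioned above so that each tick releases exactly one waiter; PacedUpdater and updateQuote are illustrative names, not part of the answer above:

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class PacedUpdater : IDisposable
{
    private readonly AutoResetEvent _tick = new AutoResetEvent(false);
    private readonly Timer _timer;

    public PacedUpdater(int periodMs)
    {
        // Signal the event once per period; each tick releases one waiting update.
        _timer = new Timer(_ => _tick.Set(), null, 0, periodMs);
    }

    public Task Run(IEnumerable<string> symbols, Action<string> updateQuote)
    {
        return Task.Run(() =>
        {
            foreach (var symbol in symbols)
            {
                _tick.WaitOne();     // wait for the next tick before issuing a request
                updateQuote(symbol); // e.g. fetch and store the latest quote
            }
        });
    }

    public void Dispose() => _timer.Dispose();
}

With two instances, e.g. new PacedUpdater(250) for the high-priority list and new PacedUpdater(1000) for the low-priority list, you get the 4:1 split described above.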

You could use a custom task scheduler which limits the rate at which tasks can start.
In the code below, tasks are queued up and dequeued by a timer set to the period of your maximum allowed rate. So for 5 requests a second, the timer is set to 200 ms. On each tick, one pending task is dequeued and executed.
EDIT: In addition to the request rate, you can also extend it to control the maximum number of executing threads.
static void Main(string[] args)
{
    TaskFactory factory = new TaskFactory(new LimitedExecutionRateTaskScheduler(5, 5)); // 5 per second, 5 max executing
    List<string> symbolsToCheck = new List<string>() { "GOOG", "AAPL", "MSFT" };
    for (int i = 0; i < 5; i++)
        symbolsToCheck.AddRange(symbolsToCheck);

    foreach (string symbol in symbolsToCheck)
    {
        factory.StartNew(() =>
        {
            Console.WriteLine("[{0:HH:mm:ss}] [{1}] Doing {2}..", DateTime.Now, Thread.CurrentThread.ManagedThreadId, symbol);
            Thread.Sleep(5000);
            Console.WriteLine("[{0:HH:mm:ss}] [{1}] {2} is done", DateTime.Now, Thread.CurrentThread.ManagedThreadId, symbol);
        });
    }
    Console.Read();
}
public class LimitedExecutionRateTaskScheduler : TaskScheduler
{
    private readonly ConcurrentQueue<Task> _pendingTasks = new ConcurrentQueue<Task>();
    private readonly object _taskLocker = new object();
    private readonly List<Task> _executingTasks = new List<Task>();
    private readonly int _maximumConcurrencyLevel = 5;
    private readonly Timer _doWork;

    public LimitedExecutionRateTaskScheduler(double requestsPerSecond, int maximumDegreeOfParallelism)
    {
        _maximumConcurrencyLevel = maximumDegreeOfParallelism;
        long frequency = (long)(1000.0 / requestsPerSecond);
        _doWork = new Timer(ExecuteRequests, null, frequency, frequency);
    }

    public override int MaximumConcurrencyLevel
    {
        get { return _maximumConcurrencyLevel; }
    }

    protected override bool TryDequeue(Task task)
    {
        return base.TryDequeue(task);
    }

    protected override void QueueTask(Task task)
    {
        _pendingTasks.Enqueue(task);
    }

    private void ExecuteRequests(object state)
    {
        Task queuedTask = null;
        int currentlyExecutingTasks = 0;

        lock (_taskLocker)
        {
            for (int i = 0; i < _executingTasks.Count; i++)
                if (_executingTasks[i].IsCompleted)
                    _executingTasks.RemoveAt(i--);
            currentlyExecutingTasks = _executingTasks.Count;
        }

        if (currentlyExecutingTasks >= MaximumConcurrencyLevel)
            return;

        if (_pendingTasks.TryDequeue(out queuedTask) == false)
            return; // no work to do

        lock (_taskLocker)
            _executingTasks.Add(queuedTask);

        base.TryExecuteTask(queuedTask);
    }

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return false; // not properly implemented, just to complete the class
    }

    protected override IEnumerable<Task> GetScheduledTasks()
    {
        return new List<Task>(); // not properly implemented, just to complete the class
    }
}

You could use a while loop with a task delay to control when your requests are issued. Using an async void method to make your requests means you don't get blocked by a failing request.
Async void is fire-and-forget, which some devs don't like, but I think it would work as a possible solution in this case.
I also think Erno de Weerd makes a great suggestion about prioritising calls to more important stocks.
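A rough sketch of that loop, assuming requestFunc is an async method that performs one quote update (all names here are illustrative):

static Func<string, Task> requestFunc = async url =>
{
    // call the data provider here
    await Task.Delay(1);
};

static async Task PollForever(IEnumerable<string> urls, int delayMs, CancellationToken ct)
{
    while (!ct.IsCancellationRequested)
    {
        foreach (var url in urls)
        {
            FireAndForget(url);            // async void: a failed request cannot block the loop
            await Task.Delay(delayMs, ct); // paces requests to one per delayMs
        }
    }
}

static async void FireAndForget(string url)
{
    try { await requestFunc(url); }
    catch (Exception ex) { Console.WriteLine("Request to {0} failed: {1}", url, ex.Message); }
}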

Thanks @steve16351! It works like this:
static void Main(string[] args)
{
    LimitedExecutionRateTaskScheduler scheduler = new LimitedExecutionRateTaskScheduler(5);
    TaskFactory factory = new TaskFactory(scheduler);
    List<string> symbolsToCheck = new List<string>() { "GOOG", "AAPL", "MSFT", "AGIO", "MNK", "SPY", "EBAY", "INTC" };
    while (true)
    {
        List<Task> tasks = new List<Task>();
        foreach (string symbol in symbolsToCheck)
        {
            Task t = factory.StartNew(() =>
            {
                write(symbol);
            }, CancellationToken.None,
            TaskCreationOptions.None, scheduler);
            tasks.Add(t);
        }
    }
}

public static void write(string symbol)
{
    DateTime dateValue = DateTime.Now;
    Console.WriteLine("Date and Time with Milliseconds: {0} doing {1}..",
        dateValue.ToString("MM/dd/yyyy hh:mm:ss.fff tt"), symbol);
}


Limit the number of cores: new Task will block until core free from other tasks [duplicate]

I have a collection of 1000 input messages to process. I'm looping over the input collection and starting a new task for each message to get it processed.
// Assume this messages collection contains 1000 items
var messages = new List<string>();
foreach (var msg in messages)
{
    Task.Factory.StartNew(() =>
    {
        Process(msg);
    });
}
Can we guess how many messages will be processed simultaneously (assuming a normal quad-core processor), and can we limit the maximum number of messages processed at a time?
How do I ensure the messages get processed in the same sequence/order as the collection?
You could use Parallel.ForEach and rely on MaxDegreeOfParallelism instead.
Parallel.ForEach(messages, new ParallelOptions { MaxDegreeOfParallelism = 10 },
    msg =>
    {
        // logic
        Process(msg);
    });
SemaphoreSlim is a very good solution in this case and I highly recommend the OP to try it, but @Manoj's answer has a flaw, as mentioned in the comments: the semaphore should be waited on before spawning the task, like this.
Updated Answer: As @Vasyl pointed out, the semaphore may be disposed before completion of the tasks and will raise an exception when the Release() method is called, so before exiting the using block you must wait for the completion of all created tasks.
int maxConcurrency = 10;
var messages = new List<string>();
using (SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
{
    List<Task> tasks = new List<Task>();
    foreach (var msg in messages)
    {
        concurrencySemaphore.Wait();
        var t = Task.Factory.StartNew(() =>
        {
            try
            {
                Process(msg);
            }
            finally
            {
                concurrencySemaphore.Release();
            }
        });
        tasks.Add(t);
    }
    Task.WaitAll(tasks.ToArray());
}
Answer to comments
For those who want to see how the semaphore can be disposed without Task.WaitAll: run the code below in a console app and this exception will be raised:
System.ObjectDisposedException: 'The semaphore has been disposed.'
static void Main(string[] args)
{
    int maxConcurrency = 5;
    List<string> messages = Enumerable.Range(1, 15).Select(e => e.ToString()).ToList();
    using (SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
    {
        List<Task> tasks = new List<Task>();
        foreach (var msg in messages)
        {
            concurrencySemaphore.Wait();
            var t = Task.Factory.StartNew(() =>
            {
                try
                {
                    Process(msg);
                }
                finally
                {
                    concurrencySemaphore.Release();
                }
            });
            tasks.Add(t);
        }
        // Task.WaitAll(tasks.ToArray());
    }
    Console.WriteLine("Exited using block");
    Console.ReadKey();
}

private static void Process(string msg)
{
    Thread.Sleep(2000);
    Console.WriteLine(msg);
}
I think it would be better to use Parallel.ForEach with a capped degree of parallelism:
Parallel.ForEach(messages,
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    x => Process(x)
);
where MaxDegreeOfParallelism limits how many messages are processed concurrently.
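For comparison, actual Parallel LINQ (PLINQ) would look like the following; Parallel.ForEach above is part of the TPL rather than PLINQ:

// PLINQ version: ForAll runs the action in parallel without merging results.
messages
    .AsParallel()
    .WithDegreeOfParallelism(4)
    .ForAll(msg => Process(msg));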
Channels were introduced with .NET Core 3.0 (and are also available for earlier runtimes via the System.Threading.Channels NuGet package).
The main benefit of this producer/consumer concurrency pattern is that you can also limit the input data processing to reduce resource impact.
This is especially helpful when processing millions of data records.
Instead of reading the whole dataset into memory at once, you can consecutively query only chunks of the data and wait for the workers to process them before querying more.
Code sample with a queue capacity of 10 messages and 3 consumer tasks:
/// <exception cref="System.AggregateException">Thrown on consumer task exceptions.</exception>
public static async Task ProcessMessages(List<string> messages)
{
    const int producerCapacity = 10, consumerTaskLimit = 3;
    var channel = Channel.CreateBounded<string>(producerCapacity);

    _ = Task.Run(async () =>
    {
        foreach (var msg in messages)
        {
            await channel.Writer.WriteAsync(msg);
            // blocks when the channel is full,
            // waiting for the consumer tasks to pop messages from the queue
        }
        channel.Writer.Complete();
        // signals the end of the queue so that
        // WaitToReadAsync will return false and stop the consumer tasks
    });

    var tokenSource = new CancellationTokenSource();
    CancellationToken ct = tokenSource.Token;

    var consumerTasks = Enumerable
        .Range(1, consumerTaskLimit)
        .Select(_ => Task.Run(async () =>
        {
            try
            {
                while (await channel.Reader.WaitToReadAsync(ct))
                {
                    ct.ThrowIfCancellationRequested();
                    while (channel.Reader.TryRead(out var message))
                    {
                        await Task.Delay(500);
                        Console.WriteLine(message);
                    }
                }
            }
            catch (OperationCanceledException) { }
            catch
            {
                tokenSource.Cancel();
                throw;
            }
        }))
        .ToArray();

    Task waitForConsumers = Task.WhenAll(consumerTasks);
    try { await waitForConsumers; }
    catch
    {
        foreach (var e in waitForConsumers.Exception.Flatten().InnerExceptions)
            Console.WriteLine(e.ToString());
        throw waitForConsumers.Exception.Flatten();
    }
}
As pointed out by Theodor Zoulias:
On multiple consumer exceptions, the remaining tasks will continue to run and have to take the load of the killed tasks. To avoid this, I implemented a CancellationToken to stop all the remaining tasks and handle the exceptions combined in the AggregateException of waitForConsumers.Exception.
Side note:
The Task Parallel Library (TPL) might be good at automatically limiting the tasks based on your local resources. But when you are processing data remotely via RPC, it's necessary to manually limit your RPC calls to avoid filling the network/processing stack!
If your Process method is async, you can't use Task.Factory.StartNew, as it doesn't play well with an async delegate. There are also some other nuances when using it (see this for example).
The proper way to do it in this case is to use Task.Run. Here's @ClearLogic's answer modified for an async Process method.
static void Main(string[] args)
{
    int maxConcurrency = 5;
    List<string> messages = Enumerable.Range(1, 15).Select(e => e.ToString()).ToList();
    using (SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
    {
        List<Task> tasks = new List<Task>();
        foreach (var msg in messages)
        {
            concurrencySemaphore.Wait();
            var t = Task.Run(async () =>
            {
                try
                {
                    await Process(msg);
                }
                finally
                {
                    concurrencySemaphore.Release();
                }
            });
            tasks.Add(t);
        }
        Task.WaitAll(tasks.ToArray());
    }
    Console.WriteLine("Exited using block");
    Console.ReadKey();
}

private static async Task Process(string msg)
{
    await Task.Delay(2000);
    Console.WriteLine(msg);
}
You can create your own TaskScheduler and override QueueTask there.
protected virtual void QueueTask(Task task)
Then you can do anything you like.
One example here:
Limited concurrency level task scheduler (with task priority) handling wrapped tasks
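A minimal skeleton of that idea (a sketch only; the linked example is far more complete), which simply pushes queued tasks onto the thread pool:

public class SimpleScheduler : TaskScheduler
{
    protected override void QueueTask(Task task)
    {
        // This is the hook where you decide when and where a task runs:
        // queue it, delay it, throttle it, etc.
        ThreadPool.QueueUserWorkItem(_ => TryExecuteTask(task));
    }

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
        => TryExecuteTask(task);

    protected override IEnumerable<Task> GetScheduledTasks() => Enumerable.Empty<Task>();
}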
You can simply set the max concurrency degree like this:
int maxConcurrency = 10;
var messages = new List<string>(); // e.g. 1000 items
using (SemaphoreSlim concurrencySemaphore = new SemaphoreSlim(maxConcurrency))
{
    foreach (var msg in messages)
    {
        Task.Factory.StartNew(() =>
        {
            concurrencySemaphore.Wait();
            try
            {
                Process(msg);
            }
            finally
            {
                concurrencySemaphore.Release();
            }
        });
    }
}
If you need in-order queuing (processing might still finish in any order), there is no need for a semaphore. Old-fashioned if statements work fine:
const int maxConcurrency = 5;
List<Task> tasks = new List<Task>();
foreach (var arg in args)
{
    var t = Task.Run(() => { Process(arg); });
    tasks.Add(t);

    if (tasks.Count >= maxConcurrency)
    {
        Task.WaitAny(tasks.ToArray());
        tasks.RemoveAll(x => x.IsCompleted); // drop finished tasks so the cap keeps working
    }
}
Task.WaitAll(tasks.ToArray());
I ran into a similar problem where I wanted to produce 5000 results while calling apis, etc. So, I ran some speed tests.
Parallel.ForEach(products.Select(x => x.KeyValue).Distinct().Take(100), id =>
{
    // note: these options are constructed but never passed to Parallel.ForEach,
    // so they have no effect here
    new ParallelOptions { MaxDegreeOfParallelism = 100 };
    GetProductMetaData(productsMetaData, client, id).GetAwaiter().GetResult();
});
produced 100 results in 30 seconds.
Parallel.ForEach(products.Select(x => x.KeyValue).Distinct().Take(100), id =>
{
    // note: as above, these options have no effect here
    new ParallelOptions { MaxDegreeOfParallelism = 100 };
    GetProductMetaData(productsMetaData, client, id);
});
Moving the GetAwaiter().GetResult() to the individual async api calls inside GetProductMetaData resulted in 14.09 seconds to produce 100 results.
foreach (var id in ids.Take(100))
{
    GetProductMetaData(productsMetaData, client, id);
}
Complete non-async programming with the GetAwaiter().GetResult() in api calls resulted in 13.417 seconds.
var tasks = new List<Task>();
while (y < ids.Count())
{
    foreach (var id in ids.Skip(y).Take(100))
    {
        tasks.Add(GetProductMetaData(productsMetaData, client, id));
    }
    y += 100;
    Task.WhenAll(tasks).GetAwaiter().GetResult();
    Console.WriteLine($"Finished {y}, {sw.Elapsed}");
}
Forming a task list and working through 100 at a time resulted in a speed of 7.36 seconds.
using (SemaphoreSlim cons = new SemaphoreSlim(10))
{
    var tasks = new List<Task>();
    foreach (var id in ids.Take(100))
    {
        cons.Wait();
        var t = Task.Factory.StartNew(() =>
        {
            try
            {
                GetProductMetaData(productsMetaData, client, id);
            }
            finally
            {
                cons.Release();
            }
        });
        tasks.Add(t);
    }
    Task.WaitAll(tasks.ToArray());
}
Using SemaphoreSlim resulted in 13.369 seconds, but it also took a moment to spin up before it was in use.
var throttler = new SemaphoreSlim(initialCount: take);
foreach (var id in ids)
{
    throttler.WaitAsync().GetAwaiter().GetResult();
    tasks.Add(Task.Run(async () =>
    {
        try
        {
            skip += 1;
            await GetProductMetaData(productsMetaData, client, id);
            if (skip % 100 == 0)
            {
                Console.WriteLine($"started {skip}/{count}, {sw.Elapsed}");
            }
        }
        finally
        {
            throttler.Release();
        }
    }));
}
Using SemaphoreSlim with a throttler for my async tasks took 6.12 seconds.
The answer for me in this specific project was to use a throttler with SemaphoreSlim. Although the while/foreach task-list approach sometimes beat the throttler, the throttler won 4 out of 6 times for 1000 records.
I realize I'm not using the OP's code, but I think this is important and adds to the discussion, because "how" is sometimes not the only question that should be asked, and the answer is sometimes "it depends on what you are trying to do."
Now to answer the specific questions:
How to limit the maximum number of parallel tasks in C#: I showed how to limit the number of tasks running at a time.
Can we guess how many messages will be processed simultaneously (assuming a normal quad-core processor), or can we limit that number? I cannot guess how many will be processed at a time unless I set an upper limit, but I can set an upper limit. Obviously different computers run at different speeds depending on CPU, RAM, etc., on how many threads and cores the program itself has access to, and on other programs running in tandem on the same computer.
How to ensure the messages get processed in the same sequence/order as the collection? If you want to process everything in a specific order, that is synchronous programming; the point of being able to run things asynchronously is that the pieces do not need an order. As you can see from my code, the time difference is minimal for 100 records unless you use async code. If you need an order up to some point, use asynchronous programming up until that point, then await and do things synchronously from there. For example: start task1a and task2a, later await task1a and await task2a, then start task1b and await it, then start task2b and await it.
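As a small illustration of that last point (my own sketch, not the OP's code), inside an async method you can start everything in parallel and still observe the results in collection order:

// Work runs concurrently, but results are handled in the original order.
var tasks = messages.Select(msg => Task.Run(() => Process(msg))).ToList();
foreach (var task in tasks)
{
    await task; // completes in collection order from the caller's point of view
}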
public static void RunTasks(List<NamedTask> importTaskList)
{
    List<NamedTask> runningTasks = new List<NamedTask>();
    try
    {
        foreach (NamedTask currentTask in importTaskList)
        {
            currentTask.Start();
            runningTasks.Add(currentTask);
            if (runningTasks.Where(x => x.Status == TaskStatus.Running).Count() >= MaxCountImportThread)
            {
                Task.WaitAny(runningTasks.ToArray());
            }
        }
        Task.WaitAll(runningTasks.ToArray());
    }
    catch (Exception ex)
    {
        Log.Fatal("ERROR!", ex);
    }
}
You can use a BlockingCollection. If the collection's limit has been reached, the producer will stop producing until a consumer finishes an item. I find this pattern easier to understand and implement than SemaphoreSlim:
int TasksLimit = 10;
BlockingCollection<Task> tasks = new BlockingCollection<Task>(new ConcurrentBag<Task>(), TasksLimit);

void ProduceAndConsume()
{
    var producer = Task.Factory.StartNew(RunProducer);
    var consumer = Task.Factory.StartNew(RunConsumer);
    try
    {
        Task.WaitAll(new[] { producer, consumer });
    }
    catch (AggregateException ae) { }
}

void RunConsumer()
{
    foreach (var task in tasks.GetConsumingEnumerable())
    {
        task.Start();
    }
}

void RunProducer()
{
    for (int i = 0; i < 1000; i++)
    {
        tasks.Add(new Task(() => Thread.Sleep(1000), TaskCreationOptions.AttachedToParent));
    }
}
Note that RunProducer and RunConsumer are spawned as two independent tasks.

Executing N number of threads in parallel and in a sequential manner

I have an application where I have 1000+ small parts of one large file.
I have to upload a maximum of 16 parts at a time.
I used the Task Parallel Library of .NET.
I used Parallel.For to divide the file into multiple parts, assigned one method that should be executed for each part, and set DegreeOfParallelism to 16.
I need to execute one method with the checksum values that are generated by the different part uploads, so I have to set up a mechanism where I wait for all part uploads, say 1000, to complete.
One issue I am facing with the TPL is that it randomly executes any 16 of the 1000 parts.
I want a mechanism with which I can run the first 16 threads initially, and whenever any of those 16 completes its task, the 17th part should be started.
How can I achieve this?
One possible candidate for this is TPL Dataflow. This demonstration takes in a stream of integers and prints them out to the console. You set MaxDegreeOfParallelism to however many threads you wish to run in parallel:
void Main()
{
    var actionBlock = new ActionBlock<int>(
        i => Console.WriteLine(i),
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 16 });

    foreach (var i in Enumerable.Range(0, 200))
    {
        actionBlock.Post(i);
    }

    actionBlock.Complete();        // no more items will be posted
    actionBlock.Completion.Wait(); // wait for the queued items to finish
}
This can also scale well if you want to have multiple producer/consumers.
Here is the manual way of doing this.
You need a queue: a sequence of pending tasks. You dequeue items and put them in a list of working tasks. Whenever a task is done, remove it from the list of working tasks and take another from the queue. The main thread controls this process. Here is a sample of how to do this.
For the test I used a list of integers, but it should work for other types because it uses generics.
private static void Main()
{
    Random r = new Random();
    var items = Enumerable.Range(0, 100).Select(x => r.Next(100, 200)).ToList();
    ParallelQueue(items, DoWork);
}

private static void ParallelQueue<T>(List<T> items, Action<T> action)
{
    Queue<T> pending = new Queue<T>(items);
    List<Task> working = new List<Task>();

    while (pending.Count + working.Count != 0)
    {
        if (pending.Count != 0 && working.Count < 16) // maximum tasks
        {
            var item = pending.Dequeue();              // get item from queue
            working.Add(Task.Run(() => action(item))); // run task
        }
        else
        {
            Task.WaitAny(working.ToArray());
            working.RemoveAll(x => x.IsCompleted); // remove finished tasks
        }
    }
}

private static void DoWork(int i) // do your work here
{
    // this is just an example
    Task.Delay(i).Wait();
    Console.WriteLine(i);
}
Please let me know if you encounter problems implementing DoWork for yourself, because if you change the method signature you may need to make some changes.
Update
You can also do this with async/await, without blocking the main thread.
private static void Main()
{
    Random r = new Random();
    var items = Enumerable.Range(0, 100).Select(x => r.Next(100, 200)).ToList();
    Task t = ParallelQueue(items, DoWork);
    // able to do other things.
    t.Wait();
}

private static async Task ParallelQueue<T>(List<T> items, Func<T, Task> func)
{
    Queue<T> pending = new Queue<T>(items);
    List<Task> working = new List<Task>();

    while (pending.Count + working.Count != 0)
    {
        if (working.Count < 16 && pending.Count != 0)
        {
            var item = pending.Dequeue();
            working.Add(Task.Run(async () => await func(item)));
        }
        else
        {
            await Task.WhenAny(working);
            working.RemoveAll(x => x.IsCompleted);
        }
    }
}

private static async Task DoWork(int i)
{
    await Task.Delay(i);
}
var workitems = ... /* e.g. Enumerable.Range(0, 1000000) */;
SingleItemPartitioner.Create(workitems)
    .AsParallel()
    .AsOrdered()
    .WithDegreeOfParallelism(16)
    .WithMergeOptions(ParallelMergeOptions.NotBuffered)
    .ForAll(i => { Thread.Sleep(1000); Console.WriteLine(i); });
This should be all you need. I forgot how the methods are named exactly... Look at the documentation.
Test this by printing to the console after sleeping for 1 sec (which this sample code does).
Another option would be to use a BlockingCollection<T> as a queue between your file reader thread and your 16 uploader threads. Each uploader thread would just loop around consuming the blocking collection until it is complete.
And, if you want to limit memory consumption in the queue you can set an upper limit on the blocking collection such that the file-reader thread will pause when the buffer has reached capacity. This is particularly useful in a server environment where you may need to limit memory used per user/API call.
// Create a buffer of 4 chunks between the file reader and the senders
BlockingCollection<Chunk> queue = new BlockingCollection<Chunk>(4);
// Create a cancellation token source so you can stop this gracefully
CancellationTokenSource cts = ...
File reader thread:
...
queue.Add(chunk, cts.Token);
...
queue.CompleteAdding();
Sending threads:
for (int i = 0; i < 16; i++)
{
    Task.Run(() =>
    {
        foreach (var chunk in queue.GetConsumingEnumerable(cts.Token))
        {
            // ... do the upload
        }
    });
}

Why is this TAP async/await code slower than the TPL version?

I had to write a console application that called Microsoft Dynamics CRM web service to perform an action on over eight thousand CRM objects. The details of the web service call are irrelevant and not shown here but I needed a multi-threaded client so that I could make calls in parallel. I wanted to be able to control the number of threads used from a config setting and also for the application to cancel the whole operation if the number of service errors reached a config-defined threshold.
I wrote it using Task Parallel Library Task.Run and ContinueWith, keeping track of how many calls (threads) were in progress, how many errors we'd received, and whether the user had cancelled from the keyboard. Everything worked fine and I had extensive logging to assure myself that threads were finishing cleanly and that everything was tidy at the end of the run. I could see that the program was using the maximum number of threads in parallel and, if our maximum limit was reached, waiting until a running task completed before starting another one.
During my code review, my colleague suggested that it would be better to do it with async/await instead of tasks and continuations, so I created a branch and rewrote it that way. The results were interesting - the async/await version was almost twice as slow, and it never reached the maximum number of allowed parallel operations/threads. The TPL one always got to 10 threads in parallel whereas the async/await version never got beyond 5.
My question is: have I made a mistake in the way I have written the async/await code (or the TPL code even)? If I have not coded it wrong, can you explain why the async/await is less efficient, and does that mean it is better to carry on using TPL for multi-threaded code.
Note that the code I tested with did not actually call CRM - the CrmClient class simply thread-sleeps for a duration specified in the config (five seconds) and then throws an exception. This meant that there were no external variables that could affect the performance.
For the purposes of this question I created a stripped down program that combines both versions; which one is called is determined by a config setting. Each of them starts with a bootstrap runner that sets up the environment, creates the queue class, then uses a TaskCompletionSource to wait for completion. A CancellationTokenSource is used to signal a cancellation from the user. The list of ids to process is read from an embedded file and pushed onto a ConcurrentQueue. They both start off calling StartCrmRequest as many times as max-threads; subsequently, every time a result is processed, the ProcessResult method calls StartCrmRequest again, keeping going until all of our ids are processed.
You can clone/download the complete program from here: https://bitbucket.org/kentrob/pmgfixso/
Here is the relevant configuration:
<appSettings>
    <add key="TellUserAfterNCalls" value="5"/>
    <add key="CrmErrorsBeforeQuitting" value="20"/>
    <add key="MaxThreads" value="10"/>
    <add key="CallIntervalMsecs" value="5000"/>
    <add key="UseAsyncAwait" value="True" />
</appSettings>
Starting with the TPL version, here is the bootstrap runner that kicks off the queue manager:
public static class TplRunner
{
    private static readonly CancellationTokenSource CancellationTokenSource = new CancellationTokenSource();

    public static void StartQueue(RuntimeParameters parameters, IEnumerable<string> idList)
    {
        Console.CancelKeyPress += (s, args) =>
        {
            CancelCrmClient();
            args.Cancel = true;
        };

        var start = DateTime.Now;
        Program.TellUser("Start: " + start);

        var taskCompletionSource = new TplQueue(parameters)
            .Start(CancellationTokenSource.Token, idList);

        while (!taskCompletionSource.Task.IsCompleted)
        {
            if (Console.KeyAvailable)
            {
                if (Console.ReadKey().Key != ConsoleKey.Q) continue;
                Console.WriteLine("When all threads are complete, press any key to continue.");
                CancelCrmClient();
            }
        }

        var end = DateTime.Now;
        Program.TellUser("End: {0}. Elapsed = {1} secs.", end, (end - start).TotalSeconds);
    }

    private static void CancelCrmClient()
    {
        CancellationTokenSource.Cancel();
        Console.WriteLine("Cancelling Crm client. Web service calls in operation will have to run to completion.");
    }
}
Here is the TPL queue manager itself:
public class TplQueue
{
    private readonly RuntimeParameters parameters;
    private readonly object locker = new object();
    private ConcurrentQueue<string> idQueue = new ConcurrentQueue<string>();
    private readonly CrmClient crmClient;
    private readonly TaskCompletionSource<bool> taskCompletionSource = new TaskCompletionSource<bool>();
    private int threadCount;
    private int crmErrorCount;
    private int processedCount;
    private CancellationToken cancelToken;

    public TplQueue(RuntimeParameters parameters)
    {
        this.parameters = parameters;
        crmClient = new CrmClient();
    }

    public TaskCompletionSource<bool> Start(CancellationToken cancellationToken, IEnumerable<string> ids)
    {
        cancelToken = cancellationToken;
        foreach (var id in ids)
        {
            idQueue.Enqueue(id);
        }
        threadCount = 0;

        // Prime our thread pump with max threads.
        for (var i = 0; i < parameters.MaxThreads; i++)
        {
            Task.Run((Action) StartCrmRequest, cancellationToken);
        }

        return taskCompletionSource;
    }

    private void StartCrmRequest()
    {
        if (taskCompletionSource.Task.IsCompleted)
        {
            return;
        }

        if (cancelToken.IsCancellationRequested)
        {
            Program.TellUser("Crm client cancelling...");
            ClearQueue();
            return;
        }

        var count = GetThreadCount();
        if (count >= parameters.MaxThreads)
        {
            return;
        }

        string id;
        if (!idQueue.TryDequeue(out id)) return;

        IncrementThreadCount();
        crmClient.CompleteActivityAsync(new Guid(id), parameters.CallIntervalMsecs).ContinueWith(ProcessResult);

        processedCount += 1;
        if (parameters.TellUserAfterNCalls > 0 && processedCount % parameters.TellUserAfterNCalls == 0)
        {
            ShowProgress(processedCount);
        }
    }

    private void ProcessResult(Task<CrmResultMessage> response)
    {
        if (response.Result.CrmResult == CrmResult.Failed && ++crmErrorCount == parameters.CrmErrorsBeforeQuitting)
        {
            Program.TellUser(
                "Quitting because CRM error count is equal to {0}. Already queued web service calls will have to run to completion.",
                crmErrorCount);
            ClearQueue();
        }

        var count = DecrementThreadCount();
        if (idQueue.Count == 0 && count == 0)
        {
            taskCompletionSource.SetResult(true);
        }
        else
        {
            StartCrmRequest();
        }
    }

    private int GetThreadCount()
    {
        lock (locker)
        {
            return threadCount;
        }
    }

    private void IncrementThreadCount()
    {
        lock (locker)
        {
            threadCount = threadCount + 1;
        }
    }

    private int DecrementThreadCount()
    {
        lock (locker)
        {
            threadCount = threadCount - 1;
            return threadCount;
        }
    }

    private void ClearQueue()
    {
        idQueue = new ConcurrentQueue<string>();
    }

    private static void ShowProgress(int processedCount)
    {
        Program.TellUser("{0} activities processed.", processedCount);
    }
}
Note that I am aware that a couple of the counters are not thread safe but they are not critical; the threadCount variable is the only critical one.
Here is the dummy CRM client method:
public Task<CrmResultMessage> CompleteActivityAsync(Guid activityId, int callIntervalMsecs)
{
    // Here we would normally call a CRM web service.
    return Task.Run(() =>
    {
        try
        {
            if (callIntervalMsecs > 0)
            {
                Thread.Sleep(callIntervalMsecs);
            }
            throw new ApplicationException("Crm web service not available at the moment.");
        }
        catch
        {
            return new CrmResultMessage(activityId, CrmResult.Failed);
        }
    });
}
And here are the same async/await classes (with common methods removed for the sake of brevity):
public static class AsyncRunner
{
    private static readonly CancellationTokenSource CancellationTokenSource = new CancellationTokenSource();

    public static void StartQueue(RuntimeParameters parameters, IEnumerable<string> idList)
    {
        var start = DateTime.Now;
        Program.TellUser("Start: " + start);

        var taskCompletionSource = new AsyncQueue(parameters)
            .StartAsync(CancellationTokenSource.Token, idList).Result;

        while (!taskCompletionSource.Task.IsCompleted)
        {
            ...
        }

        var end = DateTime.Now;
        Program.TellUser("End: {0}. Elapsed = {1} secs.", end, (end - start).TotalSeconds);
    }
}
The async/await queue manager:
public class AsyncQueue
{
    private readonly RuntimeParameters parameters;
    private readonly object locker = new object();
    private readonly CrmClient crmClient;
    private readonly TaskCompletionSource<bool> taskCompletionSource = new TaskCompletionSource<bool>();
    private CancellationToken cancelToken;
    private ConcurrentQueue<string> idQueue = new ConcurrentQueue<string>();
    private int threadCount;
    private int crmErrorCount;
    private int processedCount;

    public AsyncQueue(RuntimeParameters parameters)
    {
        this.parameters = parameters;
        crmClient = new CrmClient();
    }

    public async Task<TaskCompletionSource<bool>> StartAsync(CancellationToken cancellationToken,
        IEnumerable<string> ids)
    {
        cancelToken = cancellationToken;
        foreach (var id in ids)
        {
            idQueue.Enqueue(id);
        }
        threadCount = 0;

        // Prime our thread pump with max threads.
        for (var i = 0; i < parameters.MaxThreads; i++)
        {
            await StartCrmRequest();
        }

        return taskCompletionSource;
    }

    private async Task StartCrmRequest()
    {
        if (taskCompletionSource.Task.IsCompleted)
        {
            return;
        }

        if (cancelToken.IsCancellationRequested)
        {
            ...
            return;
        }

        var count = GetThreadCount();
        if (count >= parameters.MaxThreads)
        {
            return;
        }

        string id;
        if (!idQueue.TryDequeue(out id)) return;

        IncrementThreadCount();
        var crmMessage = await crmClient.CompleteActivityAsync(new Guid(id), parameters.CallIntervalMsecs);
        ProcessResult(crmMessage);

        processedCount += 1;
        if (parameters.TellUserAfterNCalls > 0 && processedCount % parameters.TellUserAfterNCalls == 0)
        {
            ShowProgress(processedCount);
        }
    }

    private async void ProcessResult(CrmResultMessage response)
    {
        if (response.CrmResult == CrmResult.Failed && ++crmErrorCount == parameters.CrmErrorsBeforeQuitting)
        {
            Program.TellUser(
                "Quitting because CRM error count is equal to {0}. Already queued web service calls will have to run to completion.",
                crmErrorCount);
            ClearQueue();
        }

        var count = DecrementThreadCount();
        if (idQueue.Count == 0 && count == 0)
        {
            taskCompletionSource.SetResult(true);
        }
        else
        {
            await StartCrmRequest();
        }
    }
}
So, setting MaxThreads to 10 and CrmErrorsBeforeQuitting to 20, the TPL version on my machine completes in 19 seconds and the async/await version takes 35 seconds. Given that I have over 8000 calls to make this is a significant difference. Any ideas?
I think I'm seeing the problem here, or at least a part of it. Look closely at the two bits of code below; they are not equivalent.
// Prime our thread pump with max threads.
for (var i = 0; i < parameters.MaxThreads; i++)
{
    Task.Run((Action) StartCrmRequest, cancellationToken);
}
And:
// Prime our thread pump with max threads.
for (var i = 0; i < parameters.MaxThreads; i++)
{
    await StartCrmRequest();
}
In the original code (I am taking it as a given that it is functionally sound) there is a single call to ContinueWith. That is exactly how many await statements I would expect to see in a trivial rewrite if it is meant to preserve the original behaviour.
That is not a hard and fast rule, and it only applies in simple cases, but it is nevertheless a good thing to keep an eye out for.
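For illustration, a closer rewrite of the priming loop would start all the requests without awaiting each one, then await them together (a sketch based on the question's code):

// Prime our thread pump with max threads, concurrently this time.
var primers = new List<Task>();
for (var i = 0; i < parameters.MaxThreads; i++)
{
    primers.Add(StartCrmRequest()); // not awaited individually
}
await Task.WhenAll(primers);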
I think you overcomplicated your solution and ended up not getting what you wanted in either implementation.
First of all, connections to any single HTTP host are limited by the service point manager. The default limit for client environments is 2, but you can increase it yourself.
No matter how many threads you spawn, there won't be more active requests than the number allowed.
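For example (this applies to HttpWebRequest and related classes on the .NET Framework):

// Allow up to 10 concurrent connections per host instead of the default 2.
ServicePointManager.DefaultConnectionLimit = 10;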
Then, as someone pointed out, await logically blocks the execution flow.
And finally, you spent your time creating an AsyncQueue when you should have used TPL Dataflow.
When implemented with async/await, I would expect the I/O-bound algorithm to run on a single thread. Unlike @KirillShlenskiy, I believe that the bit responsible for "bringing back" to the caller's context is not responsible for the slow-down. I think you overrun the thread pool by trying to use it for I/O-bound operations; it's designed primarily for compute-bound ones.
Have a look at ForEachAsync. I feel that's what you're looking for (see Stephen Toub's discussion; you'll find Wischik's videos meaningful too):
http://blogs.msdn.com/b/pfxteam/archive/2012/03/05/10278165.aspx
(Use degree of concurrency to reduce memory footprint)
http://vimeo.com/43808831
http://vimeo.com/43808833

How to limit number of HttpWebRequest per second towards a webserver?

I need to implement a throttling mechanism (requests per second) when using HttpWebRequest to make parallel requests towards one application server. My C# app must issue no more than 80 requests per second to a remote server. The limit is imposed by the remote service's admins, not as a hard limit but as an "SLA" between my platform and theirs.
How can I control the number of requests per second when using HttpWebRequest?
I had the same problem and couldn't find a ready solution, so I made one; here it is. The idea is to use a BlockingCollection<T> to add items that need processing and use Reactive Extensions to subscribe with a rate-limited processor.
The Throttle class is a renamed version of this rate limiter.
public static class BlockingCollectionExtensions
{
    // TODO: devise a way to avoid problems if collection gets too big (produced faster than consumed)
    public static IObservable<T> AsRateLimitedObservable<T>(this BlockingCollection<T> sequence, int items, TimeSpan timePeriod, CancellationToken producerToken)
    {
        Subject<T> subject = new Subject<T>();

        // this is a dummy token, just so we can create the linked TokenSource
        // that we pass to the proxy class so it can cancel the task on disposal
        CancellationToken dummyToken = new CancellationToken();
        CancellationTokenSource tokenSource = CancellationTokenSource.CreateLinkedTokenSource(producerToken, dummyToken);

        var consumingTask = new Task(() =>
        {
            using (var throttle = new Throttle(items, timePeriod))
            {
                while (!sequence.IsCompleted)
                {
                    try
                    {
                        T item = sequence.Take(producerToken);
                        throttle.WaitToProceed();
                        try
                        {
                            subject.OnNext(item);
                        }
                        catch (Exception ex)
                        {
                            subject.OnError(ex);
                        }
                    }
                    catch (OperationCanceledException)
                    {
                        break;
                    }
                }
                subject.OnCompleted();
            }
        }, TaskCreationOptions.LongRunning);

        return new TaskAwareObservable<T>(subject, consumingTask, tokenSource);
    }

    private class TaskAwareObservable<T> : IObservable<T>, IDisposable
    {
        private readonly Task task;
        private readonly Subject<T> subject;
        private readonly CancellationTokenSource taskCancellationTokenSource;

        public TaskAwareObservable(Subject<T> subject, Task task, CancellationTokenSource tokenSource)
        {
            this.task = task;
            this.subject = subject;
            this.taskCancellationTokenSource = tokenSource;
        }

        public IDisposable Subscribe(IObserver<T> observer)
        {
            var disposable = subject.Subscribe(observer);
            if (task.Status == TaskStatus.Created)
                task.Start();
            return disposable;
        }

        public void Dispose()
        {
            // cancel consumption and wait for the task to finish
            taskCancellationTokenSource.Cancel();
            task.Wait();

            // dispose tokenSource and task
            taskCancellationTokenSource.Dispose();
            task.Dispose();

            // dispose subject
            subject.Dispose();
        }
    }
}
Unit test:
class BlockCollectionExtensionsTest
{
    [Fact]
    public void AsRateLimitedObservable()
    {
        const int maxItems = 1; // fix this to 1 to ease testing
        TimeSpan during = TimeSpan.FromSeconds(1);

        // populate collection
        int[] items = new[] { 1, 2, 3, 4 };
        BlockingCollection<int> collection = new BlockingCollection<int>();
        foreach (var i in items) collection.Add(i);
        collection.CompleteAdding();

        IObservable<int> observable = collection.AsRateLimitedObservable(maxItems, during, CancellationToken.None);
        BlockingCollection<int> processedItems = new BlockingCollection<int>();
        ManualResetEvent completed = new ManualResetEvent(false);
        DateTime last = DateTime.UtcNow;

        observable
            // this is so we'll receive exceptions
            .ObserveOn(new SynchronizationContext())
            .Subscribe(item =>
            {
                if (item == 1)
                    last = DateTime.UtcNow;
                else
                {
                    TimeSpan diff = (DateTime.UtcNow - last);
                    last = DateTime.UtcNow;
                    Assert.InRange(diff.TotalMilliseconds,
                        during.TotalMilliseconds - 30,
                        during.TotalMilliseconds + 30);
                }
                processedItems.Add(item);
            },
            () => completed.Set());

        completed.WaitOne();
        Assert.Equal(items, processedItems, new CollectionEqualityComparer<int>());
    }
}
The Throttle() and Sample() extension methods on IObservable allow you to regulate a fast sequence of events into a "slower" sequence.
Here is a blog post with an example of Sample(TimeSpan) that ensures a maximum rate.
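For example (note that Sample keeps only the latest item per interval and drops the rest, so it thins a stream rather than queueing it; source and SendRequest are placeholder names):

// Emit at most one item per 200 ms window from a fast source.
source.Sample(TimeSpan.FromMilliseconds(200))
      .Subscribe(item => SendRequest(item));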
My original post discussed how to add a throttling mechanism to WCF via client behavior extensions, but then it was pointed out that I misread the question (doh!).
Overall, the approach can be to check with a class that determines whether we are violating the rate limit. There's already been a lot of discussion around how to check for rate violations:
Throttling method calls to M requests in N seconds
If you are violating the rate limit, sleep for a fixed interval and check again. If not, go ahead and make the HttpWebRequest call.
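Here is a minimal sketch of such a check, a sliding-window counter (the names are illustrative): timestamps of recent calls are kept, old ones are purged, and the caller sleeps and re-checks while the window is full.

class RateGate
{
    private readonly Queue<DateTime> _calls = new Queue<DateTime>();
    private readonly int _maxPerWindow;
    private readonly TimeSpan _window;
    private readonly object _lock = new object();

    public RateGate(int maxPerWindow, TimeSpan window)
    {
        _maxPerWindow = maxPerWindow;
        _window = window;
    }

    public void WaitToProceed()
    {
        while (true)
        {
            lock (_lock)
            {
                // drop calls that have left the window
                var cutoff = DateTime.UtcNow - _window;
                while (_calls.Count > 0 && _calls.Peek() < cutoff)
                    _calls.Dequeue();

                if (_calls.Count < _maxPerWindow)
                {
                    _calls.Enqueue(DateTime.UtcNow);
                    return; // under the limit: go ahead
                }
            }
            Thread.Sleep(50); // fixed back-off before re-checking, as described above
        }
    }
}

Call gate.WaitToProceed() before each HttpWebRequest; with new RateGate(80, TimeSpan.FromSeconds(1)) you stay at or below 80 requests per second.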

Most efficient way to execute several threads

You may skip this part:
I am creating an application where the client needs to find the server on the same network.
The server:
public static void StartListening(Int32 port)
{
    TcpListener server = new TcpListener(IP.GetCurrentIP(), port);
    server.Start();
    Thread t = new Thread(new ThreadStart(() =>
    {
        while (true)
        {
            // wait for connection
            TcpClient client = server.AcceptTcpClient();
            if (stopListening)
            {
                break;
            }
        }
    }));
    t.IsBackground = true;
    t.Start();
}
Let's say the server is listening on port 12345.
Then the client:
gets the current IP address of the client; let's say it is 192.168.5.88
creates a list of all possible IP addresses. The server's IP address will probably be related to the client's IP if they are on the same local network, therefore I construct the list as:
192.168.5.0
192.168.5.1
192.168.5.2
192.168.5.3
.....etc
.....
192.168.0.88
192.168.1.88
192.168.2.88
192.168.3.88
...etc
192.0.5.88
192.1.5.88
192.2.5.88
192.3.5.88
192.4.5.88
..... etc
0.168.5.88
1.168.5.88
2.168.5.88
3.168.5.88
4.168.5.88
.... etc
Then I try to connect to every possible IP on port 12345. If one connection is successful, that means I have found the address of the server.
Now my question is:
I have done this in two ways. I know just the basics about threads, and I don't know if this is dangerous, but it works really fast.
// first way
foreach (var ip in ListOfIps)
{
    new Thread(new ThreadStart(() =>
    {
        TryConnect(ip);
    })).Start();
}
The second way, which I believe is safer, takes much more time:
// second way
foreach (var ip in ListOfIps)
{
    ThreadPool.QueueUserWorkItem(new WaitCallback(TryConnect), ip);
}
I have to call the TryConnect method about 1000 times, and each call takes about 2 seconds (I set the connection timeout to 2 seconds). What would be the most efficient and safe way of calling it 1000 times?
EDIT 2
Here are the results using different techniques:
1) Using the thread pool:
..
..
var now = DateTime.Now;
foreach (var item in allIps)
{
    ThreadPool.QueueUserWorkItem(new WaitCallback(DoWork), item);
}
ThreadPool.QueueUserWorkItem(new WaitCallback(PrintTimeDifference), now);
}

static void PrintTimeDifference(object startTime)
{
    Console.WriteLine("------------------Done!----------------------");
    var s = (DateTime)startTime;
    Console.WriteLine((DateTime.Now - s).Seconds);
}
It took 37 seconds to complete
2) Using threads:
..
..
var now = DateTime.Now;
foreach (var item in allIps)
{
    new Thread(new ThreadStart(() =>
    {
        DoWork(item);
    })).Start();
}
ThreadPool.QueueUserWorkItem(new WaitCallback(PrintTimeDifference), now);
It took 12 seconds to complete
3) Using tasks:
..
..
var now = DateTime.Now;
foreach (var item in allIps)
{
    var t = Task.Factory.StartNew(() =>
        DoWork(item)
    );
}
ThreadPool.QueueUserWorkItem(new WaitCallback(PrintTimeDifference), now);
}

static void PrintTimeDifference(object startTime)
{
    Console.WriteLine("------------------Done!----------------------");
    var s = (DateTime)startTime;
    Console.WriteLine((DateTime.Now - s).Seconds);
}
It took 8 seconds!!
In this case I would prefer the solution with ThreadPool threads, because creating 1000 threads is a heavy operation (when you think of the memory each thread gets).
But since .NET 4 there is another solution with the Task class.
Tasks are workloads which can be executed in parallel. You can define and run them like this:
var t = Task.Factory.StartNew(() => DoAction());
You don't have to care about the number of threads used, because the runtime environment handles that. So if you have the possibility to split your workload into smaller packages which can be executed in parallel, I would use Tasks to do the work.
Both methods run the risk of creating way too many threads.
A thread is expensive both in the time it takes to create and in memory consumption.
It does look like your second approach, using the ThreadPool, should work better: because of the long timeout (2 sec) it will still create many threads, but far fewer than 1000.
The better approach (requires Fx 4) would be to use Parallel.ForEach(...), but that too may require some tuning.
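For example, using the question's ListOfIps and TryConnect (the cap of 50 is something to tune):

// Try all addresses with at most 50 connection attempts in flight at once.
Parallel.ForEach(ListOfIps, new ParallelOptions { MaxDegreeOfParallelism = 50 },
    ip => TryConnect(ip));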
And a really good solution would use a broadcast (UDP) protocol to discover services.
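A minimal sketch of the discovery idea (my own illustration; the probe message and port are assumptions): the client broadcasts a probe and any reply reveals the server's address, avoiding the 1000-address scan entirely.

using (var udp = new UdpClient { EnableBroadcast = true })
{
    var probe = Encoding.ASCII.GetBytes("DISCOVER"); // assumed probe message
    udp.Send(probe, probe.Length, new IPEndPoint(IPAddress.Broadcast, 12345));

    var remote = new IPEndPoint(IPAddress.Any, 0);
    udp.Client.ReceiveTimeout = 2000; // throws SocketException if no reply in 2 s
    udp.Receive(ref remote);          // any reply reveals the server
    Console.WriteLine("Server found at " + remote.Address);
}

The server side would listen on the same UDP port and send any datagram back to the sender.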
Now I made my own benchmark.
Here is the code:
class Program {
private static long parallelIterations = 100;
private static long taskIterations = 100000000;
static void Main(string[] args) {
Console.WriteLine("Parallel Iterations: {0:n0}", parallelIterations);
Console.WriteLine("Task Iterations: {0:n0}", taskIterations);
Analyse("Simple Threads", ExecuteWorkWithSimpleThreads);
Analyse("ThreadPool Threads", ExecuteWorkWithThreadPoolThreads);
Analyse("Tasks", ExecuteWorkWithTasks);
Analyse("Parallel For", ExecuteWorkWithParallelFor);
Analyse("Async Delegates", ExecuteWorkWithAsyncDelegates);
}
private static void Analyse(string name, Action action) {
Stopwatch watch = new Stopwatch();
watch.Start();
action();
watch.Stop();
Console.WriteLine("{0}: {1} seconds", name.PadRight(20), watch.Elapsed.TotalSeconds);
}
private static void ExecuteWorkWithSimpleThreads() {
Thread[] threads = new Thread[parallelIterations];
for (long i = 0; i < parallelIterations; i++) {
threads[i] = new Thread(DoWork);
threads[i].Start();
}
for (long i = 0; i < parallelIterations; i++) {
threads[i].Join();
}
}
private static void ExecuteWorkWithThreadPoolThreads() {
object locker = new object();
EventWaitHandle waitHandle = new ManualResetEvent(false);
int finished = 0;
for (long i = 0; i < parallelIterations; i++) {
ThreadPool.QueueUserWorkItem((threadContext) => {
DoWork();
lock (locker) {
finished++;
if (finished == parallelIterations)
waitHandle.Set();
}
});
}
waitHandle.WaitOne();
}
private static void ExecuteWorkWithTasks() {
Task[] tasks = new Task[parallelIterations];
for (long i = 0; i < parallelIterations; i++) {
tasks[i] = Task.Factory.StartNew(DoWork);
}
Task.WaitAll(tasks);
}
private static void ExecuteWorkWithParallelFor() {
Parallel.For(0, parallelIterations, (n) => DoWork());
}
private static void ExecuteWorkWithAsyncDelegates() {
Action[] actions = new Action[parallelIterations];
IAsyncResult[] results = new IAsyncResult[parallelIterations];
for (long i = 0; i < parallelIterations; i++) {
actions[i] = DoWork;
results[i] = actions[i].BeginInvoke((result) => { }, null);
}
for (long i = 0; i < parallelIterations; i++) {
results[i].AsyncWaitHandle.WaitOne();
results[i].AsyncWaitHandle.Close();
}
}
private static void DoWork() {
//Thread.Sleep(TimeSpan.FromMilliseconds(taskDuration));
for (long i = 0; i < taskIterations; i++ ) { }
}
}
Here is the result with different settings:
Parallel Iterations: 100.000
Task Iterations: 100
Simple Threads : 13,4589412 seconds
ThreadPool Threads : 0,0682997 seconds
Tasks : 0,1327014 seconds
Parallel For : 0,0066053 seconds
Async Delegates : 2,3844015 seconds
Parallel Iterations: 100
Task Iterations: 100.000.000
Simple Threads : 5,6415113 seconds
ThreadPool Threads : 5,5798242 seconds
Tasks : 5,6261562 seconds
Parallel For : 5,8721274 seconds
Async Delegates : 5,6041608 seconds
As you can see, simple threads are not efficient when there are too many of them.
But when using only a few of them, they are very efficient because there is little overhead (e.g. synchronization).
Well, there are pros and cons to this approach:
Using an individual thread per connection will (in theory) let you make all connections in parallel; since this is a blocking I/O operation, all threads will be suspended until their respective connections succeed. However, creating 1000 threads is a bit of an overkill for the system.
Using the thread pool gives you the benefit of reusing threads, but only a limited number of connection tasks can be active at one time. For example, if the thread pool has 4 threads, then 4 connections will be attempted, then another 4, and so on. This is light on resources but may take too long because, as you said, a single connection needs about 2 seconds.
So I would advise a trade-off: create a thread pool with about 50 threads (using the SetMaxThreads method) and queue all the connections. That way it will be lighter on resources than 1000 threads and still process the connections reasonably fast.
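For example (note that SetMaxThreads affects the whole process and cannot be set below the processor count):

// Cap worker threads and I/O completion-port threads at 50 each.
ThreadPool.SetMaxThreads(50, 50);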
