I have bunch of async methods, which I invoke from Dispatcher. The methods does not perform any work in the background, they just waits for some I/O operations, or wait for response from the webserver.
async Task FetchAsync()
{
// Prepare request in UI thread
var response = await new WebClient().DownloadDataTaskAsync();
// Process response in UI thread
}
now, I want to perform load tests, by calling multiple FetchAsync() in parallel with some max degree of parallelism.
My first attempt was using Paralell.Foreach(), but id does not work well with async/await.
var option = new ParallelOptions {MaxDegreeOfParallelism = 10};
Parallel.ForEach(UnitsOfWork, uow => uow.FetchAsync().Wait());
I've been looking at reactive extensions, but I'm still not able to take advantage of Dispatcher and async/await.
My goal is to not create separate thread for each FetchAsync(). Can you give me some hints how to do it?
Just call Fetchasync without awaiting each call and then use Task.WhenAll to await all of them together.
var tasks = new List<Task>();
var max = 10;
for(int i = 0; i < max; i++)
{
tasks.Add(FetchAsync());
}
await Task.WhenAll(tasks);
Here is a generic reusable solution to your question that you can reuse not only with your FetchAsync method but for any async method that has the same signature. The api includes real time concurrent throttling support as well:
Parameters are self explanatory:
totalRequestCount: is how many async requests (FatchAsync calls) you want to do in total, async processor is the FetchAsync method itself, maxDegreeOfParallelism is the optional nullable parameter. If you want real time concurrent throttling with max number of concurrent async requests, set it, otherwise not.
public static Task ForEachAsync(
int totalRequestCount,
Func<Task> asyncProcessor,
int? maxDegreeOfParallelism = null)
{
IEnumerable<Task> tasks;
if (maxDegreeOfParallelism != null)
{
SemaphoreSlim throttler = new SemaphoreSlim(maxDegreeOfParallelism.Value, maxDegreeOfParallelism.Value);
tasks = Enumerable.Range(0, totalRequestCount).Select(async requestNumber =>
{
await throttler.WaitAsync();
try
{
await asyncProcessor().ConfigureAwait(false);
}
finally
{
throttler.Release();
}
});
}
else
{
tasks = Enumerable.Range(0, totalRequestCount).Select(requestNumber => asyncProcessor());
}
return Task.WhenAll(tasks);
}
Related
I am using the HTTPClient in System.Net.Http to make requests against an API. The API is limited to 10 requests per second.
My code is roughly like so:
List<Task> tasks = new List<Task>();
items..Select(i => tasks.Add(ProcessItem(i));
try
{
await Task.WhenAll(taskList.ToArray());
}
catch (Exception ex)
{
}
The ProcessItem method does a few things but always calls the API using the following:
await SendRequestAsync(..blah). Which looks like:
private async Task<Response> SendRequestAsync(HttpRequestMessage request, CancellationToken token)
{
token.ThrowIfCancellationRequested();
var response = await HttpClient
.SendAsync(request: request, cancellationToken: token).ConfigureAwait(continueOnCapturedContext: false);
token.ThrowIfCancellationRequested();
return await Response.BuildResponse(response);
}
Originally the code worked fine but when I started using Task.WhenAll I started getting 'Rate Limit Exceeded' messages from the API. How can I limit the rate at which requests are made?
Its worth noting that ProcessItem can make between 1-4 API calls depending on the item.
The API is limited to 10 requests per second.
Then just have your code do a batch of 10 requests, ensuring they take at least one second:
Items[] items = ...;
int index = 0;
while (index < items.Length)
{
var timer = Task.Delay(TimeSpan.FromSeconds(1.2)); // ".2" to make sure
var tasks = items.Skip(index).Take(10).Select(i => ProcessItemsAsync(i));
var tasksAndTimer = tasks.Concat(new[] { timer });
await Task.WhenAll(tasksAndTimer);
index += 10;
}
Update
My ProcessItems method makes 1-4 API calls depending on the item.
In this case, batching is not an appropriate solution. You need to limit an asynchronous method to a certain number, which implies a SemaphoreSlim. The tricky part is that you want to allow more calls over time.
I haven't tried this code, but the general idea I would go with is to have a periodic function that releases the semaphore up to 10 times. So, something like this:
private readonly SemaphoreSlim _semaphore = new SemaphoreSlim(10);
private async Task<Response> ThrottledSendRequestAsync(HttpRequestMessage request, CancellationToken token)
{
await _semaphore.WaitAsync(token);
return await SendRequestAsync(request, token);
}
private async Task PeriodicallyReleaseAsync(Task stop)
{
while (true)
{
var timer = Task.Delay(TimeSpan.FromSeconds(1.2));
if (await Task.WhenAny(timer, stop) == stop)
return;
// Release the semaphore at most 10 times.
for (int i = 0; i != 10; ++i)
{
try
{
_semaphore.Release();
}
catch (SemaphoreFullException)
{
break;
}
}
}
}
Usage:
// Start the periodic task, with a signal that we can use to stop it.
var stop = new TaskCompletionSource<object>();
var periodicTask = PeriodicallyReleaseAsync(stop.Task);
// Wait for all item processing.
await Task.WhenAll(taskList);
// Stop the periodic task.
stop.SetResult(null);
await periodicTask;
The answer is similar to this one.
Instead of using a list of tasks and WhenAll, use Parallel.ForEach and use ParallelOptions to limit the number of concurrent tasks to 10, and make sure each one takes at least 1 second:
Parallel.ForEach(
items,
new ParallelOptions { MaxDegreeOfParallelism = 10 },
async item => {
ProcessItems(item);
await Task.Delay(1000);
}
);
Or if you want to make sure each item takes as close to 1 second as possible:
Parallel.ForEach(
searches,
new ParallelOptions { MaxDegreeOfParallelism = 10 },
async item => {
var watch = new Stopwatch();
watch.Start();
ProcessItems(item);
watch.Stop();
if (watch.ElapsedMilliseconds < 1000) await Task.Delay((int)(1000 - watch.ElapsedMilliseconds));
}
);
Or:
Parallel.ForEach(
searches,
new ParallelOptions { MaxDegreeOfParallelism = 10 },
async item => {
await Task.WhenAll(
Task.Delay(1000),
Task.Run(() => { ProcessItems(item); })
);
}
);
UPDATED ANSWER
My ProcessItems method makes 1-4 API calls depending on the item. So with a batch size of 10 I still exceed the rate limit.
You need to implement a rolling window in SendRequestAsync. A queue containing timestamps of each request is a suitable data structure. You dequeue entries with a timestamp older than 10 seconds. As it so happens, there is an implementation as an answer to a similar question on SO.
ORIGINAL ANSWER
May still be useful to others
One straightforward way to handle this is to batch your requests in groups of 10, run those concurrently, and then wait until a total of 10 seconds has elapsed (if it hasn't already). This will bring you in right at the rate limit if the batch of requests can complete in 10 seconds, but is less than optimal if the batch of requests takes longer. Have a look at the .Batch() extension method in MoreLinq. Code would look approximately like
foreach (var taskList in tasks.Batch(10))
{
Stopwatch sw = Stopwatch.StartNew(); // From System.Diagnostics
await Task.WhenAll(taskList.ToArray());
if (sw.Elapsed.TotalSeconds < 10.0)
{
// Calculate how long you still have to wait and sleep that long
// You might want to wait 10.5 or 11 seconds just in case the rate
// limiting on the other side isn't perfectly implemented
}
}
https://github.com/thomhurst/EnumerableAsyncProcessor
I've written a library to help with this sort of logic.
Usage would be:
var responses = await AsyncProcessorBuilder.WithItems(items) // Or Extension Method: items.ToAsyncProcessorBuilder()
.SelectAsync(item => ProcessItem(item), CancellationToken.None)
.ProcessInParallel(levelOfParallelism: 10, TimeSpan.FromSeconds(1));
I currently have a function that perform a set of 10 tasks in parallel. After the 10 tasks completes i move on to the next 10 until my queue is empty. I am looking forward to increase the efficiency of that algorithm as right now if 9 of my tasks have completed in 1 min and my 10th task is taking another 10 min i need to wait for all the 10 task to complete even though i have 9 spot free for 9 other task to start using.
Is there a way that when a task is completed, i immediately send another task for processing within that same level(for each loop). I saw that concurrent Dictionary can be use. Can you please guide and provide some sample code.
public async Task Test()
{
List<task> listoftasks =new List<Task>();
foreach(level in levels)
{
Queue<Model1> queue=new Queue<Model1>(Store);
while(queue.Count>0)
{
for(int i=0;i<10;i++)
{
if(!queue.TryDequeue(out Model1 item))
{
break;
}
listoftasks.Add(Task.Run(()=>Dosomething(sql)))
}
await Task.WhenAll(listoftasks);
listoftasks .Clear();
}
}
}
You can just use LimitedConcurrencyLevelTaskScheduler to achieve desired behavour (https://learn.microsoft.com/en-us/dotnet/api/system.threading.tasks.taskscheduler?view=net-5.0). In this case you can just push all tasks at one moment and they will be executed with desired level of concurrency (not more then 10 tasks at the parallel in your case).
You can get each Task to dequeue an item. Use a ConcurrentQueue to ensure thread-safety.
It's kind of a poor-man's scheduler, but it's very lightweight.
ConcurrentQueue<Model1> queue;
void Dequeue()
{
while(queue.TryDequeue(out var item))
DoSomething(item);
}
public async Task Test()
{
queue = new ConcurrentQueue<Model1>(Store);
var listoftasks = new List<Task>();
for (var i = 0; i < Environment.ProcessorCount; i++)
listoftasks.Add(Task.Run(() => Dequeue()));
await Task.WhenAll(listoftasks);
}
Note: this does not handle exceptions, so all exceptions must be handled or swallowed
Personally I'd use an ActionBlock (out of the TPL Dataflow library). It has
built in MaxDegreeOfParallelism
Can easily deal with async IO Bound workloads, or non async CPU Bound workloads
Has cancelation support (if needed)
Can be built into larger pipelines
Can run as perpetual consumer in a multi-producer environment
Given
private ActionBlock<Model> _processor;
Setup
_processor = new ActionBlock<Model>(
DoSomething,
new ExecutionDataflowBlockOptions()
{
CancellationToken = SomeCancelationTokenIfNeeded,
MaxDegreeOfParallelism = 10,
SingleProducerConstrained = true
});
Some Method
public static void DoSomething(Model item)
{ ... }
Usage
await _processor.SendAsync(someItem);
I have an enumeration of items (RunData.Demand), each representing some work involving calling an API over HTTP. It works great if I just foreach through it all and call the API during each iteration. However, each iteration takes a second or two so I'd like to run 2-3 threads and divide up the work between them. Here's what I'm doing:
ThreadPool.SetMaxThreads(2, 5); // Trying to limit the amount of threads
var tasks = RunData.Demand
.Select(service => Task.Run(async delegate
{
var availabilityResponse = await client.QueryAvailability(service);
// Do some other stuff, not really important
}));
await Task.WhenAll(tasks);
The client.QueryAvailability call basically calls an API using the HttpClient class:
public async Task<QueryAvailabilityResponse> QueryAvailability(QueryAvailabilityMultidayRequest request)
{
var response = await client.PostAsJsonAsync("api/queryavailabilitymultiday", request);
if (response.IsSuccessStatusCode)
{
return await response.Content.ReadAsAsync<QueryAvailabilityResponse>();
}
throw new HttpException((int) response.StatusCode, response.ReasonPhrase);
}
This works great for a while, but eventually things start timing out. If I set the HttpClient Timeout to an hour, then I start getting weird internal server errors.
What I started doing was setting a Stopwatch within the QueryAvailability method to see what was going on.
What's happening is all 1200 items in RunData.Demand are being created at once and all 1200 await client.PostAsJsonAsync methods are being called. It appears it then uses the 2 threads to slowly check back on the tasks, so towards the end I have tasks that have been waiting for 9 or 10 minutes.
Here's the behavior I would like:
I'd like to create the 1,200 tasks, then run them 3-4 at a time as threads become available. I do not want to queue up 1,200 HTTP calls immediately.
Is there a good way to go about doing this?
As I always recommend.. what you need is TPL Dataflow (to install: Install-Package System.Threading.Tasks.Dataflow).
You create an ActionBlock with an action to perform on each item. Set MaxDegreeOfParallelism for throttling. Start posting into it and await its completion:
var block = new ActionBlock<QueryAvailabilityMultidayRequest>(async service =>
{
var availabilityResponse = await client.QueryAvailability(service);
// ...
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });
foreach (var service in RunData.Demand)
{
block.Post(service);
}
block.Complete();
await block.Completion;
Old question, but I would like to propose an alternative lightweight solution using the SemaphoreSlim class. Just reference System.Threading.
SemaphoreSlim sem = new SemaphoreSlim(4,4);
foreach (var service in RunData.Demand)
{
await sem.WaitAsync();
Task t = Task.Run(async () =>
{
var availabilityResponse = await client.QueryAvailability(serviceCopy));
// do your other stuff here with the result of QueryAvailability
}
t.ContinueWith(sem.Release());
}
The semaphore acts as a locking mechanism. You can only enter the semaphore by calling Wait (WaitAsync) which subtracts one from the count. Calling release adds one to the count.
You're using async HTTP calls, so limiting the number of threads will not help (nor will ParallelOptions.MaxDegreeOfParallelism in Parallel.ForEach as one of the answers suggests). Even a single thread can initiate all requests and process the results as they arrive.
One way to solve it is to use TPL Dataflow.
Another nice solution is to divide the source IEnumerable into partitions and process items in each partition sequentially as described in this blog post:
public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
{
return Task.WhenAll(
from partition in Partitioner.Create(source).GetPartitions(dop)
select Task.Run(async delegate
{
using (partition)
while (partition.MoveNext())
await body(partition.Current);
}));
}
While the Dataflow library is great, I think it's a bit heavy when not using block composition. I would tend to use something like the extension method below.
Also, unlike the Partitioner method, this runs the async methods on the calling context - the caveat being that if your code is not truly async, or takes a 'fast path', then it will effectively run synchronously since no threads are explicitly created.
public static async Task RunParallelAsync<T>(this IEnumerable<T> items, Func<T, Task> asyncAction, int maxParallel)
{
var tasks = new List<Task>();
foreach (var item in items)
{
tasks.Add(asyncAction(item));
if (tasks.Count < maxParallel)
continue;
var notCompleted = tasks.Where(t => !t.IsCompleted).ToList();
if (notCompleted.Count >= maxParallel)
await Task.WhenAny(notCompleted);
}
await Task.WhenAll(tasks);
}
I'm using Asp .Net 4.5.1.
I have tasks to run, which call a web-service, and some might fail. I need to run N successful tasks which perform some light CPU work and mainly call a web service, then stop, and I want to throttle.
For example, let's assume we have 300 URLs in some collection. We need to run a function called Task<bool> CheckUrlAsync(url) on each of them, with throttling, meaning, for example, having only 5 run "at the same time" (in other words, have maximum 5 connections used at any given time). Also, we only want to perform N (assume 100) successful operations, and then stop.
I've read this and this and still I'm not sure what would be the correct way to do it.
How would you do it?
Assume ASP .Net
Assume IO call (http call to web serice), no heavy CPU operations.
Use Semaphore slim.
var semaphore = new SemaphoreSlim(5);
var tasks = urlCollection.Select(async url =>
{
await semaphore.WaitAsync();
try
{
return await CheckUrlAsync(url);
}
finally
{
semaphore.Release();
}
};
while(tasks.Where(t => t.Completed).Count() < 100)
{
await.Task.WhenAny(tasks);
}
Although I would prefer to use Rx.Net to produce some better code.
using(var semaphore = new SemaphoreSlim(5))
{
var results = urlCollection.ToObservable()
.Select(async url =>
{
await semaphore.WaitAsync();
try
{
return await CheckUrlAsync(url);
}
finally
{
semaphore.Release();
}
}).Take(100).ToList();
}
Okay...this is going to be fun.
public static class SemaphoreHelper
{
public static Task<T> ContinueWith<T>(
this SemaphoreSlim semaphore,
Func<Task<T>> action)
var ret = semaphore.WaitAsync()
.ContinueWith(action);
ret.ContinueWith(_ => semaphore.Release(), TaskContinuationOptions.None);
return ret;
}
var semaphore = new SemaphoreSlim(5);
var results = urlCollection.Select(
url => semaphore.ContinueWith(() => CheckUrlAsync(url)).ToList();
I do need to add that the code as it stands will still run all 300 URLs, it just will return quicker...thats all. You would need to add the cancelation token to the semaphore.WaitAsync(token) to cancel the queued work. Again I suggest using Rx.Net for that. Its just easier to use Rx.Net to get the cancelation token to work with .Take(100).
Try something like this?
private const int u_limit = 100;
private const int c_limit = 5;
List<Task> tasks = new List<Task>();
int totalRun = 0;
while (totalRun < u_limit)
{
for (int i = 0; i < c_limit; i++)
{
tasks.Add(Task.Run (() => {
// Your code here.
}));
}
Task.WaitAll(tasks);
totalRun += c_limit;
}
I'm using ODP.NET, which doesn't provide any asych methods like the SQL driver does or other Oracle drivers.
I have lots of slow queries, sometimes I need to call several of them on a single MVC controller call. So I'm trying to wrap them in Task calls. So I'm thinking of using this pattern:
(This is somewhat contrived example I wouldn't call the same query 10 times for real, it would be some heterogeneous workload)
List<Task<DbDataReader>> list = new List<Task<DbDataReader>>(10);
for (int i = 0; i < 10; i++)
{
ERROR_LOG_PKG_DR package = new ERROR_LOG_PKG_DR();
//Could be a bunch of different queries.
//Returns a task that has the longrunningtask property set, so it's on it's own thread
list.Add(package.QUERY_ALL_ASYNC());
}
Task.WaitAll(list.ToArray());
And here is the task:
public Task<DbDataReader> QUERY_ALL_ASYNC()
{
CancellationToken ct = new CancellationToken();
return Task.Factory.StartNew(_ =>
{
System.Diagnostics.Debug.WriteLine("InTask: Thread: " +
Thread.CurrentThread.ManagedThreadId);
System.Diagnostics.Debug.WriteLine("InTask: Is Background: " +
Thread.CurrentThread.IsBackground);
System.Diagnostics.Debug.WriteLine("InTask: Is ThreadPool: " +
Thread.CurrentThread.IsThreadPoolThread);
return QUERY_ALL<DbDataReader>();
}, null, ct,TaskCreationOptions.LongRunning, TaskScheduler.Default);
}
This fires up 10 threads. What if I wanted some thread pool like behavior-- ie. only about 4 concurrent threads at a time-- re-use threads, etc, but I want it outside of the ASP.NET thread pool (which is being used to service requests)
How do I do that?
If you are happy to sacrifice your web app scalability, you could use SemaphoreSlim to throttle the number of parallel tasks:
const int MAX_PARALLEL_TASKS = 4;
DbDataReader GetData(CancellationToken token)
{
DbDataReader reader = ... // execute the synchronous DB API
return reader;
}
// this can be called form an async controller method
async Task ProcessAsync()
{
// list of synchronous methods or lambdas to offload to thread pool
var funcs = new Func<CancellationToken, DbDataReader>[] {
GetData, GetData2, ... };
Task<DbDataReader>[] tasks;
using (var semaphore = new SemaphoreSlim(MAX_PARALLEL_TASKS))
{
tasks = funcs.Select(async(func) =>
{
await semaphore.WaitAsync();
try
{
return await Task.Run(() => func(token));
}
finally
{
semaphore.Release();
}
}).ToArray();
await Task.WhenAll(tasks);
}
// process the results, e.g: tasks[0].Result
}
I think this is what you need - (msdn) How to: Create a Task Scheduler That Limits Concurrency.
Shortly put idea is to create custom TaskScheduler that limits number of concurrently runninga tasks.
Why not just use ThreadPool?
https://msdn.microsoft.com/en-us/library/kbf0f1ct.aspx
This provides system-aware throttling and has been around for a very long time