Concurrent parallel job processing with throttling using ActionBlock in TPL Dataflow - c#

I am using the code below to run jobs (selected by the user from the UI) asynchronously, without blocking the main thread, and concurrently with respect to each other, with some throttling to prevent too many jobs from hogging all the RAM. I drew on several sources, including Stephen Cleary's blog, this link on ActionBlock, and this one from i3arnon:
public class ActionBlockJobsAsyncImpl {
private ActionBlock<Job> qJobs;
private Dictionary<Job, CancellationTokenSource> cTokSrcs;
public ActionBlockJobsAsyncImpl () {
qJobs = new ActionBlock<Job>(
async a_job => await RunJobAsync(a_job),
new ExecutionDataflowBlockOptions
{
BoundedCapacity = boundedCapacity,
MaxDegreeOfParallelism = maxDegreeOfParallelism,
});
cTokSrcs = new Dictionary<Job, CancellationTokenSource>();
}
private async Task<bool> RunJobAsync(Job a_job) {
JobArgs args = JobAPI.GetJobArgs(a_job);
bool ok = await JobAPI.RunJobAsync(args, cTokSrcs[a_job].Token);
return ok;
}
private async Task Produce(IEnumerable<Job> jobs) {
foreach (var job in jobs)
{
await qJobs.SendAsync(job);
}
//qJobs.Complete();
}
public override async Task SubmitJobs(IEnumerable<Job> jobs) {
//-Establish new cancellation token and task status
foreach (var job in jobs) {
cTokSrcs[job] = new CancellationTokenSource();
}
// Start the producer.
var producer = Produce(jobs);
// Wait for everything to complete.
await Task.WhenAll(producer);
}
}
The reason I commented out the qJobs.Complete() method call was because the user should be able to submit jobs continuously from the UI (same ones or different ones), and I learnt from implementing and testing in my first pass using BufferBlock that I shouldn't have that Complete() call if I wanted such a continuous producer/consumer queue. But BufferBlock as I learnt doesn't support running concurrent jobs; hence this is my second pass with ActionBlock instead.
In the code above, when the user selects jobs and clicks run in the UI, the SubmitJobs method is called. The int parameters are boundedCapacity = 8 and maxDegreeOfParallelism = DataflowBlockOptions.Unbounded. But the code as written currently does nothing (i.e., it doesn't run any job). My analogous BufferBlock implementation, on the other hand, at least ran the jobs asynchronously, albeit sequentially with respect to each other. Here, none of the jobs ever runs, and I don't see any error messages either. I'd appreciate any ideas on what I'm doing wrong and how to fix the problem. Thanks for your interest.
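One thing worth ruling out with a long-lived ActionBlock: if the delegate throws, the block faults and stops processing, later SendAsync calls return false, and the exception only surfaces through the block's Completion task, so nothing is visibly reported. The following is a minimal sketch, not a diagnosis of the code above. JobQueue, its int job ids, and the runJob delegate (a stand-in for JobAPI.RunJobAsync) are hypothetical names. It catches failures per job so one bad job cannot fault the block, and it checks the SendAsync result:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical stand-in for the question's class; job ids are plain ints here.
public sealed class JobQueue
{
    private readonly ActionBlock<int> _block;

    public ConcurrentBag<int> Succeeded { get; } = new ConcurrentBag<int>();
    public ConcurrentBag<int> Failed { get; } = new ConcurrentBag<int>();

    public JobQueue(Func<int, Task> runJob, int boundedCapacity, int maxParallel)
    {
        _block = new ActionBlock<int>(async id =>
        {
            try
            {
                await runJob(id);
                Succeeded.Add(id);
            }
            catch (Exception)
            {
                // Handling the failure here keeps one bad job from faulting the whole block.
                Failed.Add(id);
            }
        },
        new ExecutionDataflowBlockOptions
        {
            BoundedCapacity = boundedCapacity,
            MaxDegreeOfParallelism = maxParallel
        });
    }

    // Can be called repeatedly; the block is not completed between calls.
    public async Task SubmitAsync(IEnumerable<int> ids)
    {
        foreach (var id in ids)
        {
            // SendAsync waits while the block is full and yields false if the block declined the item.
            if (!await _block.SendAsync(id))
                throw new InvalidOperationException("Block declined job " + id);
        }
    }

    // Call once at shutdown: stop accepting jobs and wait for the queue to drain.
    public Task ShutdownAsync()
    {
        _block.Complete();
        return _block.Completion;
    }
}
```

SubmitAsync returns once the jobs have been accepted, not once they have finished; awaiting ShutdownAsync (or tracking a per-job Task) is what tells you the work is done.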

Related

Running a long-running parallel task in the background, while allowing small async tasks to update the foreground

I have around 10 000 000 tasks that each takes from 1-10 seconds to complete. I am running those tasks on a powerful server, using 50 different threads, where each thread picks the first not-done task, runs it, and repeats.
Pseudo-code:
for i = 0 to 50:
run a new thread:
while True:
task = first available task
if no available tasks: exit thread
run task
Using this code, I can run all the tasks in parallel on any given number of threads.
In reality, the code uses C#'s Task.Run and Task.WhenAll, and looks like this:
ServicePointManager.DefaultConnectionLimit = threadCount; //Allow more HTTP request simultaneously
var currentIndex = -1;
var threads = new List<Task>(); //List of threads
for (int i = 0; i < threadCount; i++) //Generate the threads
{
var wc = CreateWebClient();
threads.Add(Task.Run(() =>
{
while (true) //Each thread should loop, picking the first available task, and executing it.
{
var index = Interlocked.Increment(ref currentIndex);
if (index >= tasks.Count) break;
var task = tasks[index];
RunTask(conn, wc, task, port);
}
}));
}
await Task.WhenAll(threads);
This works just as I wanted it to, but I have a problem: since this code takes a lot of time to run, I want the user to see some progress. The progress is displayed in a colored bitmap (representing a matrix), and also takes some time to generate (a few seconds).
Therefore, I want to generate this visualization on a background thread. But this other background thread is never executed. My suspicion is that it is using the same thread pool as the parallel code, and is therefore enqueued, and will not be executed before the parallel code is actually finished. (And that's a bit too late.)
Here's an example of how I generate the progress visualization:
private async void Refresh_Button_Clicked(object sender, RoutedEventArgs e)
{
var bitmap = await Task.Run(() => // <<< This task is never executed!
{
//bla, bla, various database calls, and generating a relatively large bitmap
});
//Convert the bitmap into a WPF image, and update the GUI
VisualizationImage = BitmapToImageSource(bitmap);
}
So, how could I best solve this problem? I could create a list of Tasks, where each Task represents one of my tasks, and run them with Parallel.Invoke, and pick another Thread pool (I think). But then I have to generate 10 million Task objects, instead of just 50 Task objects, running through my array of stuff to do. That sounds like it uses much more RAM than necessary. Any clever solutions to this?
EDIT:
As Panagiotis Kanavos suggested in one of his comments, I tried replacing some of my loop logic with ActionBlock, like this:
// Create an ActionBlock<ZoneTask> that performs some work.
var workerBlock = new ActionBlock<ZoneTask>(
t =>
{
var wc = CreateWebClient(); //This probably generates some unnecessary overhead, but that's a problem I can solve later.
RunTask(conn, wc, t, port);
},
// Specify a maximum degree of parallelism.
new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = threadCount
});
foreach (var t in tasks) //Note: the objects in the tasks array are not Task objects
workerBlock.Post(t);
workerBlock.Complete();
await workerBlock.Completion;
Note: RunTask just executes a web request using the WebClient, and parses the results. It's nothing in there that can create a dead lock.
This seems to work as the old parallelism code, except that it needs a minute or two to do the initial foreach loop to post the tasks. Is this delay really worth it?
Nevertheless, my progress task still seems to be blocked. I am ignoring the Progress<T> suggestion for now, since this reduced code still suffers from the same problem:
private async void Refresh_Button_Clicked(object sender, RoutedEventArgs e)
{
Debug.WriteLine("This happens");
var bitmap = await Task.Run(() =>
{
Debug.WriteLine("This does not!");
//Still doing some work here, so it's not optimized away.
});
VisualizationImage = BitmapToImageSource(bitmap);
}
So it still looks like new tasks are not executed as long as the parallel task is running. I even reduced MaxDegreeOfParallelism from 50 to 5 (on a 24-core server) to see if Peter Ritchie's suggestion was right, but there was no change. Any other suggestions?
ANOTHER EDIT:
The issue seems to have been that I overloaded the thread pool with all my simultaneous blocking I/O calls. I replaced WebClient with HttpClient and its async-functions, and now everything seems to be working nicely.
Thanks to everyone for the great suggestions! Even though not all of them directly solved the problem, I'm sure they all improved my code. :)
.NET already provides a mechanism to report progress: the IProgress<T> interface and its Progress<T> implementation.
The IProgress<T> interface allows clients to publish messages with the Report(T) method without having to worry about threading. The implementation ensures that the messages are processed on the appropriate thread, e.g. the UI thread. By using the simple IProgress<T> interface, the background methods are decoupled from whoever processes the messages.
You can find more information in the Async in 4.5: Enabling Progress and Cancellation in Async APIs article. The cancellation and progress APIs aren't specific to the TPL. They can be used to simplify cancellation and reporting even for raw threads.
Progress< T> processes messages on the thread on which it was created. This can be done either by passing a processing delegate when the class is instantiated, or by subscribing to an event. Copying from the article:
private async void Start_Button_Click(object sender, RoutedEventArgs e)
{
//construct Progress<T>, passing ReportProgress as the Action<T>
var progressIndicator = new Progress<int>(ReportProgress);
//call async method
int uploads=await UploadPicturesAsync(GenerateTestImages(), progressIndicator);
}
where ReportProgress is a method that accepts a parameter of int. It could also accept a complex class that reported work done, messages etc.
The asynchronous method only has to use IProgress.Report, eg:
async Task<int> UploadPicturesAsync(List<Image> imageList, IProgress<int> progress)
{
int totalCount = imageList.Count;
int processCount = await Task.Run<int>(async () => // async is required because the body awaits
{
int tempCount = 0;
foreach (var image in imageList)
{
//await the processing and uploading logic here
int processed = await UploadAndProcessAsync(image);
if (progress != null)
{
progress.Report((tempCount * 100 / totalCount));
}
tempCount++;
}
return tempCount;
});
return processCount;
}
This decouples the background method from whoever receives and processes the progress messages.
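To make the decoupling concrete, here is a minimal sketch under stated assumptions: ProgressSketch, ProcessAsync and RecordingProgress are hypothetical names, and Task.Delay stands in for the real upload work. The worker only knows about IProgress<int>. A UI would pass a Progress<int> created on the UI thread, while the recording implementation below simply collects the reported values on whatever thread calls Report:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ProgressSketch
{
    // Hypothetical worker: processes 'count' items and reports percent complete.
    public static async Task<int> ProcessAsync(int count, IProgress<int> progress)
    {
        int done = 0;
        for (int i = 0; i < count; i++)
        {
            await Task.Delay(1); // stand-in for real asynchronous work
            done++;
            // The worker neither knows nor cares who consumes the report.
            progress?.Report(done * 100 / count);
        }
        return done;
    }
}

// Simple IProgress<int> that records every report on the calling thread.
public sealed class RecordingProgress : IProgress<int>
{
    public List<int> Values { get; } = new List<int>();

    public void Report(int value)
    {
        lock (Values) Values.Add(value);
    }
}
```

Swapping RecordingProgress for new Progress<int>(ReportProgress) changes where the reports are handled without touching ProcessAsync.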

How to correctly queue up tasks to run in C#

I have an enumeration of items (RunData.Demand), each representing some work involving calling an API over HTTP. It works great if I just foreach through it all and call the API during each iteration. However, each iteration takes a second or two so I'd like to run 2-3 threads and divide up the work between them. Here's what I'm doing:
ThreadPool.SetMaxThreads(2, 5); // Trying to limit the amount of threads
var tasks = RunData.Demand
.Select(service => Task.Run(async delegate
{
var availabilityResponse = await client.QueryAvailability(service);
// Do some other stuff, not really important
}));
await Task.WhenAll(tasks);
The client.QueryAvailability call basically calls an API using the HttpClient class:
public async Task<QueryAvailabilityResponse> QueryAvailability(QueryAvailabilityMultidayRequest request)
{
var response = await client.PostAsJsonAsync("api/queryavailabilitymultiday", request);
if (response.IsSuccessStatusCode)
{
return await response.Content.ReadAsAsync<QueryAvailabilityResponse>();
}
throw new HttpException((int) response.StatusCode, response.ReasonPhrase);
}
This works great for a while, but eventually things start timing out. If I set the HttpClient Timeout to an hour, then I start getting weird internal server errors.
What I started doing was setting a Stopwatch within the QueryAvailability method to see what was going on.
What's happening is all 1200 items in RunData.Demand are being created at once and all 1200 await client.PostAsJsonAsync methods are being called. It appears it then uses the 2 threads to slowly check back on the tasks, so towards the end I have tasks that have been waiting for 9 or 10 minutes.
Here's the behavior I would like:
I'd like to create the 1,200 tasks, then run them 3-4 at a time as threads become available. I do not want to queue up 1,200 HTTP calls immediately.
Is there a good way to go about doing this?
As I always recommend.. what you need is TPL Dataflow (to install: Install-Package System.Threading.Tasks.Dataflow).
You create an ActionBlock with an action to perform on each item. Set MaxDegreeOfParallelism for throttling. Start posting into it and await its completion:
var block = new ActionBlock<QueryAvailabilityMultidayRequest>(async service =>
{
var availabilityResponse = await client.QueryAvailability(service);
// ...
},
new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });
foreach (var service in RunData.Demand)
{
block.Post(service);
}
block.Complete();
await block.Completion;
Old question, but I would like to propose an alternative lightweight solution using the SemaphoreSlim class. Just reference System.Threading.
SemaphoreSlim sem = new SemaphoreSlim(4, 4);
foreach (var service in RunData.Demand)
{
await sem.WaitAsync();
var serviceCopy = service; // copy the loop variable for the closure
Task t = Task.Run(async () =>
{
var availabilityResponse = await client.QueryAvailability(serviceCopy);
// do your other stuff here with the result of QueryAvailability
});
// Release the slot when the task finishes, whether it succeeded or failed
t.ContinueWith(_ => sem.Release());
}
The semaphore acts as a locking mechanism. You can only enter the semaphore by calling Wait (or WaitAsync), which subtracts one from the count. Calling Release adds one to the count.
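The counting described above can be sketched as a small reusable helper. This is an illustration only: SemaphoreThrottle and ForEachThrottledAsync are hypothetical names, and the helper calls the body directly rather than through Task.Run, which assumes the body is truly asynchronous. Placing Release in a finally block returns the slot even when the body throws:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SemaphoreThrottle
{
    // Runs 'body' for every item with at most 'limit' invocations in flight.
    public static async Task ForEachThrottledAsync<T>(
        IEnumerable<T> items, int limit, Func<T, Task> body)
    {
        using (var gate = new SemaphoreSlim(limit, limit))
        {
            var running = new List<Task>();
            foreach (var item in items)
            {
                await gate.WaitAsync();                 // count - 1; waits while the count is 0
                running.Add(RunOneAsync(gate, body, item));
            }
            // All releases happen before these tasks complete, so disposing afterwards is safe.
            await Task.WhenAll(running);
        }
    }

    private static async Task RunOneAsync<T>(SemaphoreSlim gate, Func<T, Task> body, T item)
    {
        try
        {
            await body(item);
        }
        finally
        {
            gate.Release();                             // count + 1, even if the body throws
        }
    }
}
```

Because the loop awaits WaitAsync before starting each item, the requests are started lazily: item 5 is not even created until one of the first four has finished.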
You're using async HTTP calls, so limiting the number of threads will not help (nor will ParallelOptions.MaxDegreeOfParallelism in Parallel.ForEach as one of the answers suggests). Even a single thread can initiate all requests and process the results as they arrive.
One way to solve it is to use TPL Dataflow.
Another nice solution is to divide the source IEnumerable into partitions and process items in each partition sequentially as described in this blog post:
public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
{
return Task.WhenAll(
from partition in Partitioner.Create(source).GetPartitions(dop)
select Task.Run(async delegate
{
using (partition)
while (partition.MoveNext())
await body(partition.Current);
}));
}
While the Dataflow library is great, I think it's a bit heavy when not using block composition. I would tend to use something like the extension method below.
Also, unlike the Partitioner method, this runs the async methods on the calling context - the caveat being that if your code is not truly async, or takes a 'fast path', then it will effectively run synchronously since no threads are explicitly created.
public static async Task RunParallelAsync<T>(this IEnumerable<T> items, Func<T, Task> asyncAction, int maxParallel)
{
var tasks = new List<Task>();
foreach (var item in items)
{
tasks.Add(asyncAction(item));
if (tasks.Count < maxParallel)
continue;
var notCompleted = tasks.Where(t => !t.IsCompleted).ToList();
if (notCompleted.Count >= maxParallel)
await Task.WhenAny(notCompleted);
}
await Task.WhenAll(tasks);
}

Running Task<T> on a custom scheduler

I am creating a generic helper class that will help prioritise requests made to an API whilst restricting parallelisation at which they occur.
Consider the key method of the application below;
public IQueuedTaskHandle<TResponse> InvokeRequest<TResponse>(Func<TClient, Task<TResponse>> invocation, QueuedClientPriority priority, CancellationToken ct) where TResponse : IServiceResponse
{
var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
_logger.Debug("Queueing task.");
var taskToQueue = Task.Factory.StartNew(async () =>
{
_logger.Debug("Starting request {0}", Task.CurrentId);
return await invocation(_client);
}, cts.Token, TaskCreationOptions.None, _schedulers[priority]).Unwrap();
taskToQueue.ContinueWith(task => _logger.Debug("Finished task {0}", task.Id), cts.Token);
return new EcosystemQueuedTaskHandle<TResponse>(cts, priority, taskToQueue);
}
Without going into too many details, I want to invoke tasks returned by Task<TResponse>> invocation when their turn in the queue arises. I am using a collection of queues constructed using QueuedTaskScheduler indexed by a unique enumeration;
_queuedTaskScheduler = new QueuedTaskScheduler(TaskScheduler.Default, 3);
_schedulers = new Dictionary<QueuedClientPriority, TaskScheduler>();
//Enumerate the priorities
foreach (var priority in Enum.GetValues(typeof(QueuedClientPriority)))
{
_schedulers.Add((QueuedClientPriority)priority, _queuedTaskScheduler.ActivateNewQueue((int)priority));
}
However, I have had little success: I can't get the tasks to execute with limited parallelism, and 100 API requests end up being constructed, fired, and completed in one big batch. I can tell this from a Fiddler session.
I have read some interesting articles and SO posts (here, here and here) that I thought would detail how to go about this, but so far I have not been able to figure it out. From what I understand, the async nature of the lambda is working in a continuation structure as designed, which is marking the generated task as complete, basically "insta-completing" it. This means that whilst the queues are working fine, running a generated Task<T> on a custom scheduler is turning out to be the problem.
"This means that whilst the queues are working fine, running a generated Task on a custom scheduler is turning out to be the problem."
Correct. One way to think about it[1] is that an async method is split into several tasks - it's broken up at each await point. Each one of these "sub-tasks" are then run on the task scheduler. So, the async method will run entirely on the task scheduler (assuming you don't use ConfigureAwait(false)), but at each await it will leave the task scheduler, and then re-enter that task scheduler after the await completes.
So, if you want to coordinate asynchronous work at a higher level, you need to take a different approach. It's possible to write the code yourself for this, but it can get messy. I recommend you first try ActionBlock<T> from the TPL Dataflow library, passing your custom task scheduler to its ExecutionDataflowBlockOptions.
[1] This is a simplification. The state machine will avoid creating actual task objects unless necessary (in this case, they are necessary because they're being scheduled to a task scheduler). Also, only await points where the awaitable isn't complete actually cause a "method split".
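As a rough illustration of that recommendation, the sketch below funnels asynchronous invocations through a single ActionBlock. ThrottledInvoker and Enqueue are hypothetical names, and the TaskCompletionSource plumbing is one assumed way of handing the result back to the caller. The key point is that the block treats an item as in progress until the Task returned by the delegate completes, so MaxDegreeOfParallelism limits whole requests rather than the fragments between awaits:

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical helper: funnels async invocations through one ActionBlock.
public sealed class ThrottledInvoker
{
    private readonly ActionBlock<Func<Task>> _block;

    public ThrottledInvoker(int maxParallel, TaskScheduler scheduler = null)
    {
        _block = new ActionBlock<Func<Task>>(
            work => work(), // the item counts as running until the returned Task completes
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = maxParallel,
                TaskScheduler = scheduler ?? TaskScheduler.Default
            });
    }

    // Queues the invocation and returns a Task for its eventual result.
    public Task<T> Enqueue<T>(Func<Task<T>> invocation)
    {
        var tcs = new TaskCompletionSource<T>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        Func<Task> work = async () =>
        {
            try
            {
                tcs.TrySetResult(await invocation().ConfigureAwait(false));
            }
            catch (Exception ex)
            {
                // Route the failure to the caller instead of faulting the block.
                tcs.TrySetException(ex);
            }
        };

        // The block is unbounded, so Post accepts unless the block has been completed.
        _block.Post(work);
        return tcs.Task;
    }
}
```

This sketch has no notion of priority; it only shows the throttling part.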
Stephen Cleary's answer explains well why you can't use TaskScheduler for this purpose and how you can use ActionBlock to limit the degree of parallelism. But if you want to add priorities to that, I think you'll have to do that manually. Your approach of using a Dictionary of queues is reasonable, a simple implementation (with no support for cancellation or completion) of that could look something like this:
class Scheduler
{
private static readonly Priority[] Priorities =
(Priority[])Enum.GetValues(typeof(Priority));
private readonly IReadOnlyDictionary<Priority, ConcurrentQueue<Func<Task>>> queues;
private readonly ActionBlock<Func<Task>> executor;
private readonly SemaphoreSlim semaphore;
public Scheduler(int degreeOfParallelism)
{
queues = Priorities.ToDictionary(
priority => priority, _ => new ConcurrentQueue<Func<Task>>());
executor = new ActionBlock<Func<Task>>(
invocation => invocation(),
new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = degreeOfParallelism,
BoundedCapacity = degreeOfParallelism
});
semaphore = new SemaphoreSlim(0);
Task.Run(Watch);
}
private async Task Watch()
{
while (true)
{
await semaphore.WaitAsync();
// find item with highest priority and send it for execution
foreach (var priority in Priorities.Reverse())
{
Func<Task> invocation;
if (queues[priority].TryDequeue(out invocation))
{
await executor.SendAsync(invocation);
// one semaphore release corresponds to one item, so stop after the highest-priority hit
break;
}
}
}
}
public void Invoke(Func<Task> invocation, Priority priority)
{
queues[priority].Enqueue(invocation);
semaphore.Release(1);
}
}

Task Scheduler with WCF Service Reference async function

I am trying to consume a service reference, making multiple requests at the same time using a task scheduler. The service includes a synchronous and an asynchronous function that each return a result set. I am a bit confused, so I have a couple of initial questions, and then I will share how far I got with each. I am using some logging, the Concurrency Visualizer, and Fiddler to investigate. Ultimately I want to use a reactive scheduler to make as many requests as possible.
1) Should I use the async function to make all the requests?
2) If I were to use the synchronous function in multiple tasks what would be the limited resources that would potentially starve my thread count?
Here is what I have so far:
var myScheduler = new myScheduler();
var myFactory = new Factory(myScheduler);
var myClientProxy = new ClientProxy();
var tasks = new List<Task<Response>>();
foreach( var request in Requests )
{
var localrequest = request;
tasks.Add( myFactory.StartNew( () =>
{
// log stuff
return client.GetResponsesAsync( localTransaction.Request );
// log some more stuff
}).Unwrap() );
}
Task.WaitAll( tasks.ToArray() );
// process all the requests after they are done
This runs, but according to Fiddler it just tries to do all of the requests at once. It could be the scheduler, but I trust that more than I do the code above.
I have also tried implementing it without the Unwrap call, using an async/await delegate instead, and it does the same thing. I have also tried referencing .Result, and that seems to run the requests sequentially. Using the non synchronous service function call with the scheduler/factory it only gets up to about 20 simultaneous requests at the same time per client.
1) Yes. It will allow your application to scale better by using fewer threads to accomplish more.
2) Threads. When you initiate a synchronous operation that is inherently asynchronous (e.g. I/O), you have a blocked thread waiting for the operation to complete. You could, however, be using this thread in the meantime to execute CPU-bound operations.
The simplest way to limit the amount of concurrent requests is to use a SemaphoreSlim which allows to asynchronously wait to enter it:
async Task ConsumeService()
{
var client = new ClientProxy();
var semaphore = new SemaphoreSlim(100);
var tasks = Requests.Select(async request =>
{
await semaphore.WaitAsync();
try
{
return await client.GetResponsesAsync(request);
}
finally
{
semaphore.Release();
}
}).ToList();
await Task.WhenAll(tasks);
// TODO: Process responses...
}
Regardless of whether you call the WCF service asynchronously or synchronously, you will be bound by the WCF serviceThrottling limits. You should look at these settings and possibly raise them (if they are set to low values for some reason). In .NET 4 the defaults are pretty good; in older versions of the .NET Framework, the defaults were much more conservative.
.NET 4.0
MaxConcurrentSessions: default is 100 * ProcessorCount
MaxConcurrentCalls: default is 16 * ProcessorCount
MaxConcurrentInstances: default is MaxConcurrentCalls+MaxConcurrentSessions
1.)Yes.
2.)Yes.
If you want to control the number of simultaneous requests, you can try using Stephen Toub's ForEachAsync method. It allows you to control how many tasks are processed at the same time.
public static class Extensions
{
public static Task ForEachAsync<T>(this IEnumerable<T> source, int dop, Func<T, Task> body)
{
return Task.WhenAll(
from partition in Partitioner.Create(source).GetPartitions(dop)
select Task.Run(async delegate {
using (partition)
while (partition.MoveNext())
await body(partition.Current);
}));
}
}
void Main()
{
var client = new ClientProxy();
var responses = new List<Response>();
// Max 10 concurrent requests
Requests.ForEachAsync<Request>(10, async (r) =>
{
var response = await client.GetResponsesAsync(r);
// List<T> is not thread-safe, so guard the concurrent adds
lock (responses) responses.Add(response);
}).Wait();
}

How (and if) to write a single-consumer queue using the TPL?

I've heard a bunch of podcasts recently about the TPL in .NET 4.0. Most of them describe background activities like downloading images or doing a computation, using tasks so that the work doesn't interfere with a GUI thread.
Most of the code I work on has more of a multiple-producer / single-consumer flavor, where work items from multiple sources must be queued and then processed in order. One example would be logging, where log lines from multiple threads are sequentialized into a single queue for eventual writing to a file or database. All the records from any single source must remain in order, and records from the same moment in time should be "close" to each other in the eventual output.
So multiple threads or tasks or whatever are all invoking a queuer:
lock( _queue ) // or use a lock-free queue!
{
_queue.enqueue( some_work );
_queueSemaphore.Release();
}
And a dedicated worker thread processes the queue:
while( _queueSemaphore.WaitOne() )
{
lock( _queue )
{
some_work = _queue.dequeue();
}
deal_with( some_work );
}
It's always seemed reasonable to dedicate a worker thread for the consumer side of these tasks. Should I write future programs using some construct from the TPL instead? Which one? Why?
You can use a long running Task to process items from a BlockingCollection as suggested by Wilka. Here's an example which pretty much meets your applications requirements. You'll see output something like this:
Log from task B
Log from task A
Log from task B1
Log from task D
Log from task C
Note that the outputs from A, B, C and D appear in random order because they depend on the start time of the threads, but B always appears before B1.
public class LogItem
{
public string Message { get; private set; }
public LogItem (string message)
{
Message = message;
}
}
public void Example()
{
BlockingCollection<LogItem> _queue = new BlockingCollection<LogItem>();
// Start queue listener...
CancellationTokenSource canceller = new CancellationTokenSource();
Task listener = Task.Factory.StartNew(() =>
{
while (!canceller.Token.IsCancellationRequested)
{
LogItem item;
if (_queue.TryTake(out item))
Console.WriteLine(item.Message);
}
},
canceller.Token,
TaskCreationOptions.LongRunning,
TaskScheduler.Default);
// Add some log messages in parallel...
Parallel.Invoke(
() => { _queue.Add(new LogItem("Log from task A")); },
() => {
_queue.Add(new LogItem("Log from task B"));
_queue.Add(new LogItem("Log from task B1"));
},
() => { _queue.Add(new LogItem("Log from task C")); },
() => { _queue.Add(new LogItem("Log from task D")); });
// Pretend to do other things...
Thread.Sleep(1000);
// Shut down the listener...
canceller.Cancel();
listener.Wait();
}
I know this answer is about a year late, but take a look at MSDN, which shows how to create a LimitedConcurrencyLevelTaskScheduler derived from the TaskScheduler class. By limiting the concurrency to a single task, it should then process your tasks in order as they are queued via:
LimitedConcurrencyLevelTaskScheduler lcts = new LimitedConcurrencyLevelTaskScheduler(1);
TaskFactory factory = new TaskFactory(lcts);
factory.StartNew(()=>
{
// your code
});
I'm not sure that TPL is adequate in your use case. From my understanding the main use case for TPL is to split one huge task into several smaller tasks that can be run side by side. For example if you have a big list and you want to apply the same transformation on each element. In this case you can have several tasks applying the transformation on a subset of the list.
The case you describe doesn't seem to fit this picture for me. In your case you don't have several tasks that do the same thing in parallel. You have several different tasks that each do their own job (the producers) and one task that consumes. Perhaps TPL could be used for the consumer part if you want to have multiple consumers, because in that case each consumer does the same job (assuming you find a way to enforce the temporal consistency you are looking for).
Well, this of course is just my personal view on the subject.
Live long and prosper
It sounds like BlockingCollection would be handy for you. So for your code above, you could use something like (assuming _queue is a BlockingCollection instance):
// for your producers
_queue.Add(some_work);
A dedicated worker thread processing the queue:
foreach (var some_work in _queue.GetConsumingEnumerable())
{
deal_with(some_work);
}
Note: when all your producers have finished producing stuff, you'll need to call CompleteAdding() on _queue otherwise your consumer will be stuck waiting for more work.
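Putting those pieces together, here is a minimal end-to-end sketch; SingleConsumerSketch and the "producer:index" item format are assumptions made for illustration. Several producers add items concurrently, a single long-running consumer drains the queue with GetConsumingEnumerable, and CompleteAdding is what lets the consumer loop end:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class SingleConsumerSketch
{
    // Starts 'producers' tasks that each add 'perProducer' items, and returns
    // the items in the order the single consumer saw them.
    public static List<string> Run(int producers, int perProducer)
    {
        var consumed = new List<string>();
        using (var queue = new BlockingCollection<string>())
        {
            // Dedicated consumer: blocks while the queue is empty and exits
            // once CompleteAdding has been called and the queue is drained.
            var consumer = Task.Factory.StartNew(() =>
            {
                foreach (var item in queue.GetConsumingEnumerable())
                    consumed.Add(item);
            }, TaskCreationOptions.LongRunning);

            var producerTasks = Enumerable.Range(0, producers)
                .Select(p => Task.Run(() =>
                {
                    for (int i = 0; i < perProducer; i++)
                        queue.Add(p + ":" + i);
                }))
                .ToArray();

            Task.WaitAll(producerTasks);
            queue.CompleteAdding(); // without this the consumer would wait forever
            consumer.Wait();
        }
        return consumed;
    }
}
```

Items from different producers interleave unpredictably, but the items from any single producer come out in the order that producer added them, which matches the ordering requirement in the question.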
