How (and if) to write a single-consumer queue using the TPL? - c#

I've heard a bunch of podcasts recently about the TPL in .NET 4.0. Most of them describe background activities like downloading images or doing a computation, using tasks so that the work doesn't interfere with a GUI thread.
Most of the code I work on has more of a multiple-producer / single-consumer flavor, where work items from multiple sources must be queued and then processed in order. One example would be logging, where log lines from multiple threads are sequentialized into a single queue for eventual writing to a file or database. All the records from any single source must remain in order, and records from the same moment in time should be "close" to each other in the eventual output.
So multiple threads or tasks or whatever are all invoking a queuer:
lock( _queue ) // or use a lock-free queue!
{
    _queue.Enqueue( some_work );
    _queueSemaphore.Release();
}
And a dedicated worker thread processes the queue:
while( _queueSemaphore.WaitOne() )
{
    lock( _queue )
    {
        some_work = _queue.Dequeue();
    }
    deal_with( some_work );
}
It's always seemed reasonable to dedicate a worker thread for the consumer side of these tasks. Should I write future programs using some construct from the TPL instead? Which one? Why?

You can use a long running Task to process items from a BlockingCollection, as suggested by Wilka. Here's an example which pretty much meets your application's requirements. You'll see output something like this:
Log from task B
Log from task A
Log from task B1
Log from task D
Log from task C
Note that the outputs from A, B, C & D appear random because they depend on the start time of the threads, but B always appears before B1.
public class LogItem
{
    public string Message { get; private set; }

    public LogItem (string message)
    {
        Message = message;
    }
}

public void Example()
{
    BlockingCollection<LogItem> _queue = new BlockingCollection<LogItem>();

    // Start queue listener...
    CancellationTokenSource canceller = new CancellationTokenSource();
    Task listener = Task.Factory.StartNew(() =>
    {
        while (!canceller.Token.IsCancellationRequested)
        {
            LogItem item;
            // Wait briefly for an item rather than spinning on an empty queue.
            if (_queue.TryTake(out item, 100))
                Console.WriteLine(item.Message);
        }
    },
    canceller.Token,
    TaskCreationOptions.LongRunning,
    TaskScheduler.Default);

    // Add some log messages in parallel...
    Parallel.Invoke(
        () => { _queue.Add(new LogItem("Log from task A")); },
        () => {
            _queue.Add(new LogItem("Log from task B"));
            _queue.Add(new LogItem("Log from task B1"));
        },
        () => { _queue.Add(new LogItem("Log from task C")); },
        () => { _queue.Add(new LogItem("Log from task D")); });

    // Pretend to do other things...
    Thread.Sleep(1000);

    // Shut down the listener...
    canceller.Cancel();
    listener.Wait();
}

I know this answer is about a year late, but take a look at the MSDN example, which shows how to create a LimitedConcurrencyLevelTaskScheduler from the TaskScheduler class. By limiting the concurrency to a single task, it will process your tasks in order as they are queued via:
LimitedConcurrencyLevelTaskScheduler lcts = new LimitedConcurrencyLevelTaskScheduler(1);
TaskFactory factory = new TaskFactory(lcts);

factory.StartNew(() =>
{
    // your code
});
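For example, tasks queued through that factory run strictly one at a time, in the order they were submitted:
// With the concurrency level limited to 1, these run one after another,
// in submission order, so the output is items 0 through 4 in sequence.
for (int i = 0; i < 5; i++)
{
    int n = i; // capture the loop variable for the closure
    factory.StartNew(() => Console.WriteLine("Processing item " + n));
}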

I'm not sure that the TPL is adequate for your use case. From my understanding, the main use case for the TPL is to split one huge task into several smaller tasks that can run side by side. For example, if you have a big list and you want to apply the same transformation to each element, you can have several tasks applying the transformation to different subsets of the list.
The case you describe doesn't seem to fit this picture. You don't have several tasks that do the same thing in parallel; you have several different tasks that each do their own job (the producers) and one task that consumes. Perhaps the TPL could be used for the consumer part if you want multiple consumers, because in that case each consumer does the same job (assuming you find a logic to enforce the temporal consistency you are looking for).
Well, this of course is just my personal view on the subject.
Live long and prosper

It sounds like BlockingCollection would be handy for you. So for your code above, you could use something like (assuming _queue is a BlockingCollection instance):
// for your producers
_queue.Add(some_work);
A dedicated worker thread processing the queue:
foreach (var some_work in _queue.GetConsumingEnumerable())
{
deal_with(some_work);
}
Note: when all your producers have finished producing stuff, you'll need to call CompleteAdding() on _queue otherwise your consumer will be stuck waiting for more work.
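Putting those pieces together, a minimal self-contained sketch (with string items standing in for your work items, and Console.WriteLine standing in for deal_with) looks like this:
// Minimal sketch: thread-safe producers, one dedicated consumer, clean shutdown.
var queue = new BlockingCollection<string>();

// Dedicated consumer: GetConsumingEnumerable blocks until items arrive and
// finishes once CompleteAdding() has been called and the queue is drained.
var consumer = Task.Factory.StartNew(() =>
{
    foreach (var line in queue.GetConsumingEnumerable())
        Console.WriteLine(line);   // stand-in for deal_with(some_work)
}, TaskCreationOptions.LongRunning);

// Any number of producers may call Add concurrently; BlockingCollection is thread-safe.
Parallel.For(0, 10, i => queue.Add("log line " + i));

// All producers are done: signal completion so the consumer's foreach ends.
queue.CompleteAdding();
consumer.Wait();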

Related

Concurrent parallel job processing with throttling using ActionBlock in TPL Dataflow

I am using the code snippet below to try to run jobs (selected by the user from the UI) non-blocking with respect to the main thread (asynchronously) and concurrently with respect to each other, with some throttling set up to prevent too many jobs from hogging all the RAM. I used many sources, such as Stephen Cleary's blog, this link on ActionBlock, as well as this one from i3arnon.
public class ActionBlockJobsAsyncImpl {
    private ActionBlock<Job> qJobs;
    private Dictionary<Job, CancellationTokenSource> cTokSrcs;

    public ActionBlockJobsAsyncImpl () {
        qJobs = new ActionBlock<Job>(
            async a_job => await RunJobAsync(a_job),
            new ExecutionDataflowBlockOptions
            {
                BoundedCapacity = boundedCapacity,
                MaxDegreeOfParallelism = maxDegreeOfParallelism,
            });
        cTokSrcs = new Dictionary<Job, CancellationTokenSource>();
    }

    private async Task<bool> RunJobAsync(Job a_job) {
        JobArgs args = JobAPI.GetJobArgs(a_job);
        bool ok = await JobAPI.RunJobAsync(args, cTokSrcs[a_job].Token);
        return ok;
    }

    private async Task Produce(IEnumerable<Job> jobs) {
        foreach (var job in jobs)
        {
            await qJobs.SendAsync(job);
        }
        //qJobs.Complete();
    }

    public override async Task SubmitJobs(IEnumerable<Job> jobs) {
        //-Establish new cancellation token and task status
        foreach (var job in jobs) {
            cTokSrcs[job] = new CancellationTokenSource();
        }
        // Start the producer.
        var producer = Produce(jobs);
        // Wait for everything to complete.
        await Task.WhenAll(producer);
    }
}
The reason I commented out the qJobs.Complete() method call is that the user should be able to submit jobs continuously from the UI (the same ones or different ones), and I learnt from implementing and testing my first pass using BufferBlock that I shouldn't make that Complete() call if I wanted such a continuous producer/consumer queue. But, as I learnt, BufferBlock doesn't support running jobs concurrently; hence this second pass with ActionBlock instead.
In the above code using ActionBlock, when the user selects jobs and clicks run in the UI, this calls the SubmitJobs method. The int parameters are boundedCapacity = 8 and maxDegreeOfParallelism = DataflowBlockOptions.Unbounded. But the code as it stands currently does nothing (i.e., it doesn't run any job) - my analogous BufferBlock implementation, on the other hand, used to at least run the jobs asynchronously, albeit sequentially with respect to each other. Here, it never runs any of the jobs and I don't see any error messages either. I appreciate any ideas on what I'm doing wrong and perhaps some useful ideas on how to fix the problem. Thanks for your interest.
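For reference, here is a bare-bones, self-contained version of the shape being described, run from an async method (placeholder job type and work; assumes System.Threading.Tasks.Dataflow); with a bounded buffer and no Complete() call it keeps accepting and running submissions:
// Bare-bones sketch of the intended shape: a long-lived ActionBlock with a
// bounded input buffer and unbounded parallelism that is never completed,
// so the UI can keep submitting batches. The int job id and Task.Delay are
// placeholders for the real Job type and JobAPI.RunJobAsync call.
static async Task Demo()
{
    var block = new ActionBlock<int>(
        async jobId =>
        {
            Console.WriteLine("Starting job " + jobId);
            await Task.Delay(500);              // stand-in for the real work
            Console.WriteLine("Finished job " + jobId);
        },
        new ExecutionDataflowBlockOptions
        {
            BoundedCapacity = 8,
            MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded
        });

    // "SubmitJobs" can be called repeatedly; SendAsync waits when the buffer is full.
    for (int id = 1; id <= 20; id++)
        await block.SendAsync(id);

    // No Complete() call: the block stays alive for the next batch.
}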

How to use tasks with ConcurrentDictionary

I have to write a program where I read the queues to process from a database; all the queues are run in parallel and managed on the parent thread using a ConcurrentDictionary.
I have a class that represents the queue, which has a constructor that takes in the queue information and the parent instance handle. The queue class also has the method that processes the queue.
Here is the Queue Class:
class MyQueue
{
    protected ServiceExecution _parent;
    protected string _queueID;

    public MyQueue(ServiceExecution parentThread, string queueID)
    {
        _parent = parentThread;
        _queueID = queueID;
    }

    public void Process()
    {
        try
        {
            //Do work to process
        }
        catch
        {
            //exception handling
        }
        finally
        {
            _parent.ThreadFinish(_queueID);
        }
    }
}
The parent thread loops through the dataset of queues and instantiates a new queue class. It spawns a new thread to execute the Process method of the Queue object asynchronously. This thread is added to the ConcurrentDictionary and then started as follows:
private ConcurrentDictionary<string, MyQueue> _runningQueues = new ConcurrentDictionary<string, MyQueue>();
foreach (DataRow dr in QueueDataset.Rows)
{
    MyQueue queue = new MyQueue(this, dr["QueueID"].ToString());
    Thread t = new Thread(() => queue.Process());
    if (_runningQueues.TryAdd(dr["QueueID"].ToString(), queue))
    {
        t.Start();
    }
}
//Method that gets called by the queue thread when it finishes
public void ThreadFinish(string queueID)
{
MyQueue queue;
_runningQueues.TryRemove(queueID, out queue);
}
I have a feeling this is not the right approach to manage the asynchronous queue processing and I'm wondering if perhaps I can run into deadlocks with this design? Furthermore, I would like to use Tasks to run the queues asynchronously instead of the new Threads. I need to keep track of the queues because I will not spawn a new thread or task for the same queue if the previous run is not complete yet. What is the best way to handle this type of parallelism?
Thanks in advance!
About your current approach
Indeed, it is not the right approach. A high number of queues read from the database will spawn a high number of threads, which might be bad. You will create a new thread each time; it is better to create some threads and then re-use them. And if you want tasks, it is better to create LongRunning tasks and re-use them.
Suggested Design
I'd suggest the following design:
Reserve only one task to read queues from the database and put those queues in a BlockingCollection;
Now start multiple LongRunning tasks to read a queue each from that BlockingCollection and process that queue;
When a task is done with processing the queue it took from the BlockingCollection, it will then take another queue from that BlockingCollection;
Optimize the number of these processing tasks so as to properly utilize the cores of your CPU. Usually, since DB interactions are slow, you can create about three times as many tasks as you have cores; however, YMMV.
Deadlock possibility
They will at least not happen on the application side. However, since the queues consist of database transactions, deadlocks may happen at the database end. You may have to write some logic to make your task start a transaction again if the database rolled it back because of a deadlock.
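A rough sketch of such retry logic, assuming SQL Server (error number 1205 is the deadlock-victim error; the attempt count and back-off are arbitrary, and the processing code is assumed to let the SqlException propagate):
// Retry a queue's processing when the database chose it as a deadlock victim.
private static void ProcessWithRetry(MyQueue queue, int maxAttempts = 3)
{
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            queue.Process();   // assumed to run its own transaction internally
            return;
        }
        catch (SqlException ex)
        {
            // 1205 = "chosen as deadlock victim" on SQL Server
            if (ex.Number != 1205 || attempt >= maxAttempts)
                throw;
            Thread.Sleep(100 * attempt); // brief back-off before retrying
        }
    }
}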
Sample Code
private static void TaskDesignedRun()
{
    var expectedParallelQueues = 1024; //Optimize it. I've chosen it randomly
    var parallelProcessingTaskCount = 4 * Environment.ProcessorCount; //Optimize this too.
    var baseProcessorTaskArray = new Task[parallelProcessingTaskCount];
    var taskFactory = new TaskFactory(TaskCreationOptions.LongRunning, TaskContinuationOptions.None);
    var itemsToProcess = new BlockingCollection<MyQueue>(expectedParallelQueues);

    //Start a new task to populate the "itemsToProcess"
    taskFactory.StartNew(() =>
    {
        // Add code to read queues and add them to itemsToProcess
        Console.WriteLine("Done reading all the queues...");
        // Finally signal that you are done by saying..
        itemsToProcess.CompleteAdding();
    });

    //Initializing the base tasks
    for (var index = 0; index < baseProcessorTaskArray.Length; index++)
    {
        baseProcessorTaskArray[index] = taskFactory.StartNew(() =>
        {
            // IsCompleted becomes true only when adding is complete AND the
            // collection is empty, so workers don't exit while the producer is
            // still reading queues.
            while (!itemsToProcess.IsCompleted)
            {
                MyQueue q;
                // Wait briefly for an item instead of spinning.
                if (!itemsToProcess.TryTake(out q, 100)) continue;
                //Process your queue
            }
        });
    }

    //Now just wait till all queues in your database have been read and processed.
    Task.WaitAll(baseProcessorTaskArray);
}

Running Task<T> on a custom scheduler

I am creating a generic helper class that will help prioritise requests made to an API whilst restricting the degree of parallelism at which they occur.
Consider the key method of the application below:
public IQueuedTaskHandle<TResponse> InvokeRequest<TResponse>(Func<TClient, Task<TResponse>> invocation, QueuedClientPriority priority, CancellationToken ct) where TResponse : IServiceResponse
{
    var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
    _logger.Debug("Queueing task.");
    var taskToQueue = Task.Factory.StartNew(async () =>
    {
        _logger.Debug("Starting request {0}", Task.CurrentId);
        return await invocation(_client);
    }, cts.Token, TaskCreationOptions.None, _schedulers[priority]).Unwrap();
    taskToQueue.ContinueWith(task => _logger.Debug("Finished task {0}", task.Id), cts.Token);
    return new EcosystemQueuedTaskHandle<TResponse>(cts, priority, taskToQueue);
}
Without going into too many details, I want to invoke the tasks returned by the Func<TClient, Task<TResponse>> invocation delegates when their turn in the queue arises. I am using a collection of queues constructed using QueuedTaskScheduler, indexed by a unique enumeration:
_queuedTaskScheduler = new QueuedTaskScheduler(TaskScheduler.Default, 3);
_schedulers = new Dictionary<QueuedClientPriority, TaskScheduler>();

//Enumerate the priorities
foreach (var priority in Enum.GetValues(typeof(QueuedClientPriority)))
{
    _schedulers.Add((QueuedClientPriority)priority, _queuedTaskScheduler.ActivateNewQueue((int)priority));
}
However, I have had little success getting the tasks to execute with limited parallelism: 100 API requests end up being constructed, fired, and completed in one big batch, as I can see in a Fiddler session.
I have read some interesting articles and SO posts (here, here and here) that I thought would detail how to go about this, but so far I have not been able to figure it out. From what I understand, the async nature of the lambda is working in a continuation structure as designed, which is marking the generated task as complete, basically "insta-completing" it. This means that whilst the queues are working fine, running a generated Task<T> on a custom scheduler is turning out to be the problem.
This means that whilst the queues are working fine, running a generated Task on a custom scheduler is turning out to be the problem.
Correct. One way to think about it[1] is that an async method is split into several tasks - it's broken up at each await point. Each one of these "sub-tasks" is then run on the task scheduler. So, the async method will run entirely on the task scheduler (assuming you don't use ConfigureAwait(false)), but at each await it will leave the task scheduler, and then re-enter that task scheduler after the await completes.
So, if you want to coordinate asynchronous work at a higher level, you need to take a different approach. It's possible to write the code yourself for this, but it can get messy. I recommend you first try ActionBlock<T> from the TPL Dataflow library, passing your custom task scheduler to its ExecutionDataflowBlockOptions.
[1] This is a simplification. The state machine will avoid creating actual task objects unless necessary (in this case, they are necessary because they're being scheduled to a task scheduler). Also, only await points where the awaitable isn't complete actually cause a "method split".
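A minimal sketch of that ActionBlock suggestion (here myCustomScheduler stands in for one of the per-priority schedulers built in the question):
// Sketch: an ActionBlock that runs async invocations with limited parallelism
// on a caller-supplied TaskScheduler. Because ActionBlock understands async
// delegates, the whole async invocation (across its awaits) counts against
// MaxDegreeOfParallelism, which is what the custom scheduler alone couldn't give you.
var executor = new ActionBlock<Func<Task>>(
    invocation => invocation(),
    new ExecutionDataflowBlockOptions
    {
        TaskScheduler = myCustomScheduler, // e.g. one queue of the QueuedTaskScheduler
        MaxDegreeOfParallelism = 3
    });

// Queueing a request then becomes:
executor.Post(() => invocation(_client));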
Stephen Cleary's answer explains well why you can't use TaskScheduler for this purpose and how you can use ActionBlock to limit the degree of parallelism. But if you want to add priorities to that, I think you'll have to do that manually. Your approach of using a Dictionary of queues is reasonable, a simple implementation (with no support for cancellation or completion) of that could look something like this:
class Scheduler
{
    private static readonly Priority[] Priorities =
        (Priority[])Enum.GetValues(typeof(Priority));

    private readonly IReadOnlyDictionary<Priority, ConcurrentQueue<Func<Task>>> queues;
    private readonly ActionBlock<Func<Task>> executor;
    private readonly SemaphoreSlim semaphore;

    public Scheduler(int degreeOfParallelism)
    {
        queues = Priorities.ToDictionary(
            priority => priority, _ => new ConcurrentQueue<Func<Task>>());

        executor = new ActionBlock<Func<Task>>(
            invocation => invocation(),
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = degreeOfParallelism,
                BoundedCapacity = degreeOfParallelism
            });

        semaphore = new SemaphoreSlim(0);

        Task.Run(Watch);
    }

    private async Task Watch()
    {
        while (true)
        {
            await semaphore.WaitAsync();

            // find the item with the highest priority and send it for execution
            foreach (var priority in Priorities.Reverse())
            {
                Func<Task> invocation;
                if (queues[priority].TryDequeue(out invocation))
                {
                    await executor.SendAsync(invocation);
                    break; // one semaphore count corresponds to one queued item
                }
            }
        }
    }

    public void Invoke(Func<Task> invocation, Priority priority)
    {
        queues[priority].Enqueue(invocation);
        semaphore.Release(1);
    }
}
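Usage would then look something like this (Priority is assumed to be an enum ordered from lowest to highest, and CallApiAsync is a placeholder for the actual API call):
// Hypothetical usage of the Scheduler sketch above.
var scheduler = new Scheduler(degreeOfParallelism: 2);

// Each call queues an async invocation; higher-priority items are dequeued first.
scheduler.Invoke(() => CallApiAsync("low priority request"), Priority.Low);
scheduler.Invoke(() => CallApiAsync("high priority request"), Priority.High);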

BlockingCollection vs Subject for use as a consumer

I'm trying to implement a consumer in C#. There are many publishers which could be executing concurrently. I've created three examples, one with Rx and subject, one with BlockingCollection and a third using ToObservable from the BlockingCollection. They all do the same thing in this simple example and I want them to work with multiple producers.
What are the different qualities of each approach?
I'm already using Rx, so I'd prefer this approach. But I'm concerned that OnNext has no thread-safety guarantee, and I don't know what the queuing semantics are of Subject and the default scheduler.
Is there a thread-safe subject?
Are all messages going to be processed?
Are there any other scenarios where this won't work? Is it processing concurrently?
void SubjectOnDefaultScheduler()
{
    var observable = new Subject<long>();

    observable.
        ObserveOn(Scheduler.Default).
        Subscribe(i => { DoWork(i); });

    observable.OnNext(1);
    observable.OnNext(2);
    observable.OnNext(3);
}
Not Rx, but easily adapted to use/subscribe it. It takes an item and then processes it. This should happen serially.
void BlockingCollectionAndConsumingTask()
{
    var blockingCollection = new BlockingCollection<long>();
    var taskFactory = new TaskFactory();

    taskFactory.StartNew(() =>
    {
        foreach (var i in blockingCollection.GetConsumingEnumerable())
        {
            DoWork(i);
        }
    });

    blockingCollection.Add(1);
    blockingCollection.Add(2);
    blockingCollection.Add(3);
}
Using a blocking collection a bit like a subject seems like a good compromise. I'm guessing it will implicitly schedule onto a task, so that I can use async/await. Is that correct?
void BlockingCollectionToObservable()
{
    var blockingCollection = new BlockingCollection<long>();

    blockingCollection.
        GetConsumingEnumerable().
        ToObservable(Scheduler.Default).
        Subscribe(i => { DoWork(i); });

    blockingCollection.Add(1);
    blockingCollection.Add(2);
    blockingCollection.Add(3);
}
Subject is not thread-safe. OnNexts issued concurrently will directly call an Observer concurrently. Personally I find this quite surprising given the extent to which other areas of Rx enforce the correct semantics. I can only assume this was done for performance considerations.
Subject is kind of a half-way house though, in that it does enforce termination with OnError or OnComplete - after either of these are raised, OnNext is a NOP. And this behaviour is thread-safe.
But use Observable.Synchronize() on a Subject and it will force outgoing calls to obey the proper Rx semantics. In particular, OnNext calls will block if made concurrently.
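Applied to the question's first example, that looks something like this:
// Synchronize() serializes concurrent OnNext calls before they reach
// ObserveOn and the subscriber.
var subject = new Subject<long>();

subject
    .Synchronize()
    .ObserveOn(Scheduler.Default)
    .Subscribe(i => DoWork(i));

// Producers on any thread can now call OnNext safely.
subject.OnNext(1);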
The underlying mechanism is the standard .NET lock. When the lock is contended by multiple threads they are granted the lock on a first-come first-served basis most of the time. There are certain conditions where fairness is violated. However, you will definitely get the serialized access you are looking for.
ObserveOn has behaviour that is platform specific - if available, you can supply a SynchronizationContext and OnNext calls are Posted to it. With a Scheduler, it ends up putting calls onto a ConcurrentQueue<T> and dispatching them serially via the scheduler - so the thread of execution will depend on the scheduler. Either way, the queuing behaviour will also enforce the correct semantics.
In both cases (Synchronize & ObserveOn), you certainly won't lose messages. With ObserveOn, you can implicitly choose the thread you'll process messages on by your choice of Scheduler/Context; with Synchronize you'll process messages on the calling thread. Which is better will depend on your scenario.
There's more to consider as well - such as what you want to do if your producers out-pace your consumer.
You might want to have a look at Rxx Consume as well: http://rxx.codeplex.com/SourceControl/changeset/view/63470#1100703
Sample code showing the Synchronize behaviour (NuGet Rx-Testing, NUnit). It's a bit hokey with the Thread.Sleep code, but doing it properly is quite fiddly and I was lazy :):
public class SubjectTests
{
    [Test]
    public void SubjectDoesNotRespectGrammar()
    {
        var subject = new Subject<int>();
        var spy = new ObserverSpy(Scheduler.Default);
        var sut = subject.Subscribe(spy);

        // Swap the following with the preceding to make this test pass
        //var sut = subject.Synchronize().Subscribe(spy);

        Task.Factory.StartNew(() => subject.OnNext(1));
        Task.Factory.StartNew(() => subject.OnNext(2));

        Thread.Sleep(2000);

        Assert.IsFalse(spy.ConcurrencyViolation);
    }

    private class ObserverSpy : IObserver<int>
    {
        private int _inOnNext;
        private readonly IScheduler _scheduler;

        public ObserverSpy(IScheduler scheduler)
        {
            _scheduler = scheduler;
        }

        public bool ConcurrencyViolation = false;

        public void OnNext(int value)
        {
            var isInOnNext = Interlocked.CompareExchange(ref _inOnNext, 1, 0);
            if (isInOnNext == 1)
            {
                ConcurrencyViolation = true;
                return;
            }

            var wait = new ManualResetEvent(false);
            _scheduler.Schedule(TimeSpan.FromSeconds(1), () => wait.Set());
            wait.WaitOne();

            _inOnNext = 0;
        }

        public void OnError(Exception error)
        {
        }

        public void OnCompleted()
        {
        }
    }
}

Why does Parallel.Foreach create endless threads?

The code below continues to create threads even when the queue is empty, until eventually an OutOfMemory exception occurs. If I replace the Parallel.ForEach with a regular foreach, this does not happen. Does anyone know why this may happen?
public delegate void DataChangedDelegate(DataItem obj);

public class Consumer
{
    public DataChangedDelegate OnCustomerChanged;
    public DataChangedDelegate OnOrdersChanged;

    private CancellationTokenSource cts;
    private CancellationToken ct;
    private BlockingCollection<DataItem> queue;

    public Consumer(BlockingCollection<DataItem> queue) {
        this.queue = queue;
        Start();
    }

    private void Start() {
        cts = new CancellationTokenSource();
        ct = cts.Token;
        Task.Factory.StartNew(() => DoWork(), ct);
    }

    private void DoWork() {
        Parallel.ForEach(queue.GetConsumingPartitioner(), item => {
            if (item.DataType == DataTypes.Customer) {
                OnCustomerChanged(item);
            } else if (item.DataType == DataTypes.Order) {
                OnOrdersChanged(item);
            }
        });
    }
}
I think Parallel.ForEach() was made primarily for processing bounded collections. And it doesn't expect collections like the one returned by GetConsumingPartitioner(), where MoveNext() blocks for a long time.
The problem is that Parallel.ForEach() tries to find the best degree of parallelism, so it starts as many Tasks as the TaskScheduler lets it run. But the TaskScheduler sees there are many Tasks that take a very long time to finish, and that they're not doing anything (they block) so it keeps on starting new ones.
I think the best solution is to set the MaxDegreeOfParallelism.
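Applied to the code in the question, that would look roughly like this (8 is an arbitrary cap; tune it for your workload):
private void DoWork() {
    // Cap the number of worker tasks so the blocking partitioner can't drive
    // unbounded thread injection.
    var options = new ParallelOptions {
        MaxDegreeOfParallelism = 8,
        CancellationToken = ct
    };
    Parallel.ForEach(queue.GetConsumingPartitioner(), options, item => {
        if (item.DataType == DataTypes.Customer) {
            OnCustomerChanged(item);
        } else if (item.DataType == DataTypes.Order) {
            OnOrdersChanged(item);
        }
    });
}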
As an alternative, you could use TPL Dataflow's ActionBlock. The main difference in this case is that ActionBlock doesn't block any threads when there are no items to process, so the number of threads wouldn't get anywhere near the limit.
The Producer/Consumer pattern is mainly used when there is just one Producer and one Consumer.
However, what you are trying to achieve (multiple consumers) more neatly fits the Worklist pattern. The following code was taken from unit 2, slide "2c - Shared Memory Patterns", of a parallel programming class taught at the University of Utah, which is available in the download at http://ppcp.codeplex.com/
BlockingCollection<Item> worklist;
CancellationTokenSource cts;
int itemcount;

public void Run()
{
    int num_workers = 4;

    //create worklist, filled with initial work
    worklist = new BlockingCollection<Item>(
        new ConcurrentQueue<Item>(GetInitialWork()));
    cts = new CancellationTokenSource();
    itemcount = worklist.Count();

    for (int i = 0; i < num_workers; i++)
        Task.Factory.StartNew(RunWorker);
}

IEnumerable<Item> GetInitialWork() { ... }

public void RunWorker()
{
    try
    {
        do
        {
            Item i = worklist.Take(cts.Token);
            //blocks until item available or cancelled
            Process(i);
            //exit loop if no more items left
        } while (Interlocked.Decrement(ref itemcount) > 0);
    }
    finally
    {
        if (!cts.IsCancellationRequested)
            cts.Cancel();
    }
}

public void AddWork(Item item)
{
    Interlocked.Increment(ref itemcount);
    worklist.Add(item);
}

public void Process(Item i)
{
    //Do what you want to the work item here.
}
The preceding code allows you to add worklist items to the queue, and lets you set an arbitrary number of workers (in this case, four) to pull items out of the queue and process them.
Another great resource for the Parallelism on .Net 4.0 is the book "Parallel Programming with Microsoft .Net" which is freely available at: http://msdn.microsoft.com/en-us/library/ff963553
Internally in the Task Parallel Library, the Parallel.For and Parallel.Foreach follow a hill-climbing algorithm to determine how much parallelism should be utilized for the operation.
More or less, they start with running the body on one task, move to two, and so on, until a break-point is reached and they need to reduce the number of tasks.
This works quite well for method bodies that complete quickly, but if the body takes a long time to run, it may take a long time before it realizes it needs to decrease the amount of parallelism. Until that point, it continues adding tasks, and possibly crashes the computer.
I learned the above during a lecture given by one of the developers of the Task Parallel Library.
Specifying the MaxDegreeOfParallelism is probably the easiest way to go.
