Multiple Task.Factories - C#

I'm really loving the TPL. Simply calling Task.Factory.StartNew() and not worrying about anything is quite amazing.
But, is it possible to have multiple Factories running on the same thread?
Basically, I would like to have two different queues, executing different types of tasks.
One queue handles tasks of type A while the second queue handles tasks of type B.
If queue A has nothing to do, it should ignore tasks in queue B and vice versa.
Is this possible to do, without making my own queues, or running multiple threads for the factories?
To clarify what I want to do:
I read data from a network device. I want to do two things with this data, totally independent of each other.
I want to log to a database.
I want to send to another device over network.
Sometimes the database log will take a while, and I don't want the network send to be delayed because of this.

If you use .NET 4.0:
LimitedConcurrencyLevelTaskScheduler (with concurrency level of 1; see here)
If you use .NET 4.5:
ConcurrentExclusiveSchedulerPair (take only the exclusive scheduler out of the pair; see here)
Create two schedulers and pass them to the appropriate StartNew calls. Or create two TaskFactories with these schedulers and use them to create and start the tasks.
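A minimal sketch of the .NET 4.5 approach, assuming the two kinds of work are database logging and a network send (names are illustrative): each queue gets its own exclusive scheduler, so tasks of one kind run serially among themselves but never wait behind tasks of the other kind.

```csharp
using System;
using System.Threading.Tasks;

class TwoQueues
{
    static void Main()
    {
        // One exclusive scheduler per queue: at most one task of each kind
        // runs at a time, but A-tasks never wait on B-tasks.
        var pairA = new ConcurrentExclusiveSchedulerPair();
        var pairB = new ConcurrentExclusiveSchedulerPair();

        var factoryA = new TaskFactory(pairA.ExclusiveScheduler); // e.g. database log
        var factoryB = new TaskFactory(pairB.ExclusiveScheduler); // e.g. network send

        Task log  = factoryA.StartNew(() => Console.WriteLine("logging to database"));
        Task send = factoryB.StartNew(() => Console.WriteLine("sending to device"));

        Task.WaitAll(log, send);
    }
}
```

Because the two factories use separate schedulers, a slow database write queued through factoryA cannot delay anything started through factoryB.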

You can define your own thread pool using a queue of threads.

Related

Multi-threaded queue consumer and task processing

I'm writing a service that has to read tasks from an AMQP message queue and perform a synchronous action based on the message type. These actions might be to send an email or hit a web service, but will generally take on the order of a couple hundred milliseconds, assuming no errors.
I want this to be extensible so that other actions can be added in the future. Either way, the volume of messages could be quite high, with bursts of hundreds per second coming in.
I'm playing around with several designs, but my questions are as follows:
What type of threading model should I go with? Do I:
a) Go with a single thread to consume from the queue and put tasks on a thread pool? If so, how do I represent those tasks?
b) Create multiple threads to host their own consumers and have them handle the task synchronously?
c) Create multiple threads to host their own consumers and have them all register a delegate to handle the tasks as they come in?
In the case of a or c, what's the best way to have the spawned thread communicate back with the main thread? I need to ack the message that came off the queue. Do I raise an event from the spawned thread that the main thread listens to?
Is there a guideline as to how many threads I should run, given x cores? Is it x, 2*x? There are other services running on this system too.
You should generally* avoid direct thread programming in favor of the Task Parallel Library and concurrent collections built into .NET 4.0 and higher. Fortunately, the producer/consumer problem you described is common and Microsoft has a general-purpose tool for this: the BlockingCollection. This article has a good summary of its features. You may also refer to this white paper for performance analysis of the BlockingCollection<T> (among other things).
However, before pursuing the BlockingCollection<T> or an equivalent, given the scenario you described, why not go for the simple solution of just using Tasks? The TPL gives you asynchronous execution of tasks with a lot of extras like cancellation and continuations. If, however, you need more advanced lifecycle management, then go for something like a BlockingCollection<T>.
* By "generally", I mean that the generic solution will not necessarily perform best in your specific case, as it's almost certain that a properly designed custom solution will be better. As with every decision, perform the cost/benefit analysis.
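To make the producer/consumer suggestion above concrete, here is a hedged sketch of a BlockingCollection<string> with a pool of consumer tasks. The Handle method and the message strings are placeholders for the real AMQP message handling:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class AmqpWorkerSketch
{
    static void Handle(string msg)
    {
        // Stand-in for "send an email or hit a web service".
        Console.WriteLine("handled " + msg);
    }

    static void Main()
    {
        // Bounded so a burst of messages applies back-pressure to the
        // producer instead of exhausting memory.
        var work = new BlockingCollection<string>(boundedCapacity: 1000);

        // One long-running consumer task per core; tune to taste.
        var consumers = new Task[Environment.ProcessorCount];
        for (int i = 0; i < consumers.Length; i++)
        {
            consumers[i] = Task.Factory.StartNew(() =>
            {
                // GetConsumingEnumerable blocks when the collection is empty
                // and ends cleanly once CompleteAdding has been called.
                foreach (var msg in work.GetConsumingEnumerable())
                    Handle(msg);
            }, TaskCreationOptions.LongRunning);
        }

        // Producer: in the real service this loop would be the AMQP consumer.
        for (int i = 0; i < 10; i++)
            work.Add("message " + i);

        work.CompleteAdding();   // signal: no more messages
        Task.WaitAll(consumers); // drain the queue and exit
    }
}
```

Acking the AMQP message could then happen inside Handle (at-least-once delivery) or on the producer side after Add returns, depending on the delivery guarantees you need.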

Conditionally handle and remove messages by multiple consumers from a MSMQ queue

I found this related question but my situation is a little bit different.
I have an ASP.NET application that produces long-running tasks that should be processed by a number of background processes (Windows services). Most of the tasks are similar and can be handled by most task runners. Due to different versions of the client application (where the tasks are generated by users), some tasks can only be processed by task runners of a specific version. The web server has no knowledge of the kind of task; it just sends all tasks to the same queue using MSMQ.
If a task enters the queue, the next free task runner should receive the task, decide whether it can handle this kind of task, remove the task from the queue, and run it.
If the runner that received the message is not able to process this kind of task, it should put the message back in the queue so that another runner can have a look at it.
I tried to implement a conditional receive using a transaction that I can abort if the task has the wrong format:
transaction.Begin();
var msg = queue.Receive(TimeSpan.FromSeconds(1000), transaction);
if (CanHandle(msg))
{
    transaction.Commit();
    // handle
}
else
{
    transaction.Abort();
}
It seems to work, but I don't know if this is the preferable way to go.
Another problem with this solution is that if there is no other free runner that can handle this message, I will receive it again and again.
Is there a way I can solve this problem using only MSMQ? The whole task data is already stored in a SQL database. The task runner accesses the task data over an HTTP API. (That's why I rule out solutions like SQL Server Service Broker.) The data sent to the message queue is only metadata used to identify the job.
If plain MSMQ is not the right tool, can I solve the problem using, for example, MassTransit? (I didn't like the fact that I have to install and run the additional MassTransit RuntimeServices + SQL database for it.) Other suggestions?
The way you are utilizing MSMQ is really circumventing some of the fundamental features of the technology. If a queue message cannot be universally handled by the reader, you are incurring a pretty sizable system performance penalty, where many of your task processing services can get sent back empty-handed when they ask for tasks. In an extreme scenario, imagine what would happen if there were only one service that could perform task type "A." If that service were to go down, and the first task to be pulled out of the queue were of type "A," then your entire system would shut down.
I would suggest one of two approaches:
Utilize multiple queues, as in one per task version. Hide task retrieval behind an API or some other service. Your service can request a task of one or more task types, or you can even allow for anything. The API would then be charged with figuring out which queue to pull from (i.e. map to a specific task type, pick one at random, do some sort of round-robin, etc.).
Opt for a different storage technology over queueing. If you write good enough SQL, a relational database would be more than up to the task. You just have to take a lot of care not to incur deadlocks.
Can you create another queue? If yes, then I would create multiple queues: a GenericTaskQ, which will hold all incoming tasks, plus an XTaskQ and a YTaskQ. Your X task runner picks tasks from the generic queue and, if it cannot process one, puts it in the YTaskQ (or whichever queue is appropriate). The same goes for the Y task runner: if it can't handle a message, it puts it in the XTaskQ. The X and Y task runners should always look in their respective queues first, and only look into the generic queue if nothing is there.
If you cannot create multiple queues, use message (task) labels (which should be unique; we normally use GUIDs) to remember which tasks a task runner has already seen and cannot process. Also use Peek to check whether a message has already been rejected before actually receiving it.
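The peek-then-receive idea from the last paragraph could look roughly like this. This is a sketch only, assuming System.Messaging against a transactional queue; CanHandle and Run are placeholders, and note that Peek only inspects the message at the front of the queue:

```csharp
using System;
using System.Collections.Generic;
using System.Messaging;

class SelectiveReceiver
{
    // Labels this runner has already peeked at and cannot handle.
    static readonly HashSet<string> rejected = new HashSet<string>();

    static void TryReceiveOne(MessageQueue queue)
    {
        // Peek first: the message stays in the queue until we decide.
        Message peeked = queue.Peek(TimeSpan.FromSeconds(1));
        if (rejected.Contains(peeked.Label))
            return; // rejected before; leave it for another runner

        using (var tx = new MessageQueueTransaction())
        {
            tx.Begin();
            Message msg = queue.Receive(TimeSpan.FromSeconds(1), tx);
            if (CanHandle(msg))
            {
                tx.Commit(); // message is removed from the queue
                Run(msg);
            }
            else
            {
                rejected.Add(msg.Label); // don't spin on it next time
                tx.Abort();              // message goes back to the queue
            }
        }
    }

    static bool CanHandle(Message m) { return true; } // placeholder
    static void Run(Message m) { }                    // placeholder
}
```

This still doesn't solve the starvation case where no runner at all can handle the front message; for that the multiple-queue approach above is the cleaner option.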

Multithreaded Service Engineering Questions

I am trying to leverage .NET 4.5 new threading capabilities to engineer a system to update a list of objects in memory.
I worked with multithreading years ago in Java, and I fear that my skills have become stagnant, especially in the .NET arena.
Basically, I have written a Feed abstract class and I inherit from that for each of my threads. The thread classes themselves are simple and run fine.
Each of these classes runs endlessly: it blocks until an event occurs and then updates the List.
So the first question is: how might I keep the parent thread alive while these threads run? So far I've prevented it from exiting by writing this as a dev console app with a Console.Read().
Second, I would like to set up a repository of List objects that I can access from the parent thread. How would I update those Lists from the child thread and expose them to another system? Trying to avoid SQL. Shared memory?
I hope I've made this clear. I was hoping for something like this: Writing multithreaded methods using async/await in .Net 4.5
except, we need the adaptability of external classes and of course, we need to somehow expose those Lists.
You can run the "parent" thread in a while loop with some flag to stop it:
while (keepRunning) // keepRunning: a volatile bool field another thread can clear
{
    // do the parent thread's work
}
You can expose a public List as a property of some class to hold your data. Remember to lock access in multithreaded code.
If the 'parent' thread is supposed to wait during the processing it could simply await the call(s) to the async method(s).
If it has to wait for specific events you could use a signaling object such as a Barrier.
If the thread has to 'do' things while waiting you could check the availability of the result or the progress: How to do progress reporting using Async/Await
If you're using tasks, you can use Task.WaitAll to wait for the tasks to complete. By default, Tasks and async/await use your system's ThreadPool, so I'd avoid placing anything but relatively short-running tasks there.
If you're using System.Threading.Thread (I prefer using these for long running threads), check out the accepted answer here: C# Waiting for multiple threads to finish
If you can fetch batches of data, you can expose services allowing access to the shared objects using self hosted Web API or something like NancyFX. WCF and remoting are also options if you prefer binary communication.
Shared memory, keep-alive TCP connections or UDP are options if you have many small transactions. Perhaps you could use ZeroMQ (it's not a traditional queue) with the C# binding they provide?
For concurrent access to the lists take a look at the classes in System.Collections.Concurrent before implementing your own locking.
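Putting the last few suggestions together, here is a sketch of the shape this could take: long-running feed tasks updating a shared concurrent collection, with the parent kept alive via Task.WaitAll. The feed names and the decimal payload are illustrative stand-ins for the real Feed subclasses:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class FeedHost
{
    // Thread-safe store the parent (or an exposed service such as a
    // Web API endpoint) can read without hand-rolled locking.
    public static readonly ConcurrentDictionary<string, ConcurrentQueue<decimal>> Feeds =
        new ConcurrentDictionary<string, ConcurrentQueue<decimal>>();

    static void RunFeed(string name)
    {
        // A real feed would block on an external event; here we just
        // enqueue a few updates and return.
        for (int i = 0; i < 3; i++)
            Feeds[name].Enqueue(i);
    }

    static void Main()
    {
        string[] names = { "feedA", "feedB" };
        var workers = new Task[names.Length];

        for (int i = 0; i < names.Length; i++)
        {
            string name = names[i]; // capture a copy, not the loop variable
            Feeds[name] = new ConcurrentQueue<decimal>();
            // LongRunning hints the scheduler to use a dedicated thread
            // rather than tying up a ThreadPool thread indefinitely.
            workers[i] = Task.Factory.StartNew(
                () => RunFeed(name), TaskCreationOptions.LongRunning);
        }

        // Keeps the parent alive until the feeds finish; for truly endless
        // feeds you would instead wait on a shutdown signal.
        Task.WaitAll(workers);
        Console.WriteLine(Feeds["feedA"].Count); // prints 3
    }
}
```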

How to mix multithreading with a sequential requirement?

I have a program which processes price data coming from the broker. The pseudocode is as follows:
Process[] process = new Process[50];

void tickEvent(object sender, EventArgs e)
{
    int contractNumber = e.contractNumber;
    doPriceProcess(process[contractNumber], e);
}
Now I would like to use multithreading to speed up my program. If the data are of different contract numbers, I would like to fire off different threads to speed up the processing. However, if the data are from the same contract, I would like the program to wait until the current processing finishes before continuing with the next data. How do I do it?
Can you provide some code please?
Thanks in advance.
You have many high-level architectural decisions to make here:
How many ticks do you expect to come from that broker?
After all, you should have some kind of dispatcher here.
Here is a simple description of what basically needs to be done:
Encapsulate the incoming ticks in packages, ideally single commands that have all the data needed.
Have a queue where you can easily (and thread-safely) store those commands.
Have a dispatcher that takes an item off the queue and assigns a worker to execute the command (or lets the command execute itself).
For the workers, you can have multiple threads, processes, or whatever, to work multiple commands seamlessly.
Maybe you want to do some dispatching for the input queue as well, depending on how many requests you want to be able to complete per time unit.
Here is some more information that can be helpful:
Command pattern in C#
Reactor pattern (with sample code)
Rather than holding onto an array of Processes, I would hold onto an array of BlockingCollections. Each blocking collection can correspond to a particular contract. Then you can have producer threads that add work onto the end of the corresponding contract's queue, and consumer tasks that consume the items from those collections. You can ensure that each thread (I would use threads for this, not processes) handles 1-n different queues, but that each queue is handled by no more than one thread. That way you can ensure that no bits of work from the same contract are worked on in parallel.
The threading aspect of this can be handled effectively using C#'s Task class. For your consumers you can create a new task for each BlockingCollection. That task's body will pretty much just be:
foreach (SomeType item in blockingCollections[contractNumber].GetConsumingEnumerable())
    processItem(item);
However, by using Tasks you let the computer schedule them as it sees fit. If it notices most of them sitting around waiting on empty queues, it will just have a few (or just one) actual thread rotating between the tasks it's using. If they have enough to do, and your computer can clearly support the load of additional threads, it will add more (possibly adding/removing dynamically as it goes). By letting much smarter people than you or me handle that scheduling, it's much more likely to be efficient without under- or over-parallelizing.
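A self-contained sketch of this one-queue-per-contract design, with illustrative contract numbers and integer prices standing in for the real tick data:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class TickRouter
{
    const int Contracts = 50;
    static readonly BlockingCollection<int>[] queues =
        new BlockingCollection<int>[Contracts];

    static void ProcessPrice(int contract, int price)
    {
        Console.WriteLine("contract " + contract + ": " + price);
    }

    static void Main()
    {
        var consumers = new Task[Contracts];
        for (int i = 0; i < Contracts; i++)
        {
            queues[i] = new BlockingCollection<int>();
            int contract = i; // capture a copy, not the loop variable
            consumers[contract] = Task.Factory.StartNew(() =>
            {
                // Ticks for one contract are processed strictly in order,
                // because only this task drains this queue.
                foreach (int price in queues[contract].GetConsumingEnumerable())
                    ProcessPrice(contract, price);
            });
        }

        // The tick event handler just routes by contract number:
        queues[7].Add(101);
        queues[7].Add(102);  // waits behind 101 within contract 7
        queues[12].Add(500); // runs in parallel with contract 7

        foreach (var q in queues)
            q.CompleteAdding();
        Task.WaitAll(consumers);
    }
}
```

Adding a tick is non-blocking for the event handler, same-contract ordering is preserved, and different contracts proceed in parallel, which is exactly the mix of sequential and concurrent behavior the question asks for.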

Simple asynchronous Queue datastructure in C#/mono

I want to write an application that needs a task queue. I should be able to add tasks into this queue, and these tasks can finish asynchronously (and should be removable from the queue once they are complete).
The data structure should also make it possible to get information about any task within the queue, given a unique queue-position identifier.
The data structure should also be able to provide the list of items in the queue at any time.
A LINQ interface to manage this queue will also be desirable.
Since this is a very common requirement for many applications (at least in my personal observation), I want to know if there are any standard data structures available as part of the C# library, instead of my writing something from scratch.
Any pointers ?
Seems to me you are conflating the data structure and the async task that it is designed to track. Are you sure they need to be the same thing?
Does ThreadPool.QueueUserWorkItem not suffice for running async tasks? You can maintain your own structure derived from List<TaskStatus> or HashSet<TaskStatus> to keep track of the results, and you can provide convenience methods to clear completed items, retrieve pending items, and so on.
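Separating the two concerns could look like the following sketch: Tasks do the asynchronous work, while a ConcurrentDictionary keyed by Guid is the queryable "queue" with unique identifiers, self-removal on completion, and a LINQ interface. The TaskRegistry name and Enqueue helper are made up for illustration:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

class TaskRegistry
{
    // Unique queue-position identifier -> the task tracking that work item.
    static readonly ConcurrentDictionary<Guid, Task> tasks =
        new ConcurrentDictionary<Guid, Task>();

    static Guid Enqueue(Action work)
    {
        var id = Guid.NewGuid();
        tasks[id] = Task.Factory.StartNew(work)
            .ContinueWith(t =>
            {
                Task removed;
                tasks.TryRemove(id, out removed); // drop entry on completion
            });
        return id;
    }

    static void Main()
    {
        Guid id = Enqueue(() => Console.WriteLine("working"));

        // LINQ over the live registry at any time:
        var pending = tasks.Where(kv => !kv.Value.IsCompleted)
                           .Select(kv => kv.Key)
                           .ToList();
        Console.WriteLine("pending: " + pending.Count);
    }
}
```

Task status, exceptions, and results all come for free from the Task object itself, so the data structure only has to manage the identifiers.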
