Thread Pool of workers in a Window service? - c#

I'm creating a Windows service with 2 separate components:
1 component creates jobs and inserts them to the database (1 thread)
The 2nd component processes these jobs (multiple FIXED # of threads in a thread pool)
These 2 components will always run as long as the service is running.
What I'm stuck on is determining how to implement this thread pool. I've done some research, and there seems to be many ways of doing this such as creating a class that overriddes the method "ThreadPoolCallback", and using ThreadPool.QueueUserWorkItem to queue a work item. http://msdn.microsoft.com/en-us/library/3dasc8as.aspx
However in the example given, it doesn't seem to fit my scenario. I want to create a FIXED number of threads in a thread pool initially. Then feed it jobs to process. How do I do this?
// Wrapper method for use with thread pool.
public void ThreadPoolCallback(Object threadContext)
{
int threadIndex = (int)threadContext;
Console.WriteLine("thread {0} started...", threadIndex);
_fibOfN = Calculate(_n);
Console.WriteLine("thread {0} result calculated...", threadIndex);
_doneEvent.Set();
}
Fibonacci[] fibArray = new Fibonacci[FibonacciCalculations];
const int FibonacciCalculations = 10;
for (int i = 0; i < FibonacciCalculations; i++)
{
ThreadPool.QueueUserWorkItem(f.ThreadPoolCallback, i);
}

Create a BlockingCollection of work items. The thread that creates jobs adds them to this collection.
Create a fixed number of persistent threads that read items from that BlockingCollection and process them. Something like:
BlockingCollection<WorkItem> WorkItems = new BlockingCollection<WorkItem>();
void WorkerThreadProc()
{
foreach (var item in WorkItems.GetConsumingEnumerable())
{
// process item
}
}
Multiple worker threads can be doing that concurrently. BlockingCollection supports multiple readers and writers, so there's no concurrency problems that you have to deal with.
See my blog post Simple Multithreading, part 2 for an example that uses one consumer and one producer. Adding multiple consumers is a very simple matter of spinning up a new task for each consumer.
Another way to do it is to use a semaphore that controls how many jobs are currently being processed. I show how to do that in this answer. However, I think the shared BlockingCollection is in general a better solution.

The .NET thread pool isn't really designed for a fixed number of threads. It's designed to use the resources of the machine in the best way possible to perform multiple relatively small jobs.
Maybe a better solution for you would be to instantiate a fixed number of BackgroundWorkers instead? There are some reasonable BW examples.

Related

How to avoid an OutOfMemoryException when using SetMaxThreads

I need to call a method many times (ten millon), therefore I use threads. But when the loop has 100 cycles of my method, it launchs an OutOfMemoryException.
I tried add SetMaxThreads to only run 50 threads simultaneous but don't works (because I don't know how to do it). Thanks in advance.
ThreadPool.SetMaxThreads(50, 50);
for (int i = 0; i < tablePersons.Rows.Count; i++)
{
Thread t = new Thread(RegisterPerson);
t.Start(tablePersons.Rows[i]);
}
static void RegisterPerson(object paramObject)
{
DataRow person = (DataRow)paramObject;
Call a service...
}
1) You are confusing thread pool threads with user created threads.
This creates a new thread (not a thread pool thread):
Thread t = new Thread(RegisterPerson);
Seeting the Threadpool to have a maximum of fifty threads:
ThreadPool.SetMaxThreads(50, 50);
has no effect on your loop, where you attempt to create a user thread for each row.
There are a number of ways to enter the thread pool:
Via the Task Parallel Library (from Framework 4.0)
By calling ThreadPool.QueueUserWorkItem
Via asynchronous delegates
Via BackgroundWorker
2) You should not be creating that many user threads.
I would suggest reading: Joe Albahari's excellent Threading in C#
Rather than creating that many separate threads manually, you should probably use Parallel.ForEach(), and let that handle the thread creation for you.
They won't all run simultaneously, but you won't run into memory issues.

What is the most efficient method for assigning threads based on the following scenario?

I can have a maximum of 5 threads running simultaneous at any one time which makes use of 5 separate hardware to speedup the computation of some complex calculations and return the result. The API (contains only one method) for each of this hardware is not thread safe and can only run on a single thread at any point in time. Once the computation is completed, the same thread can be re-used to start another computation on either the same or a different hardware depending on availability. Each computation is stand alone and does not depend on the results of the other computation. Hence, up to 5 threads may complete its execution in any order.
What is the most efficient C# (using .Net Framework 2.0) coding solution for keeping track of which hardware is free/available and assigning a thread to the appropriate hardware API for performing the computation? Note that other than the limitation of 5 concurrently running threads, I do not have any control over when or how the threads are fired.
Please correct me if I am wrong but a lock free solution is preferred as I believe it will result in increased efficiency and a more scalable solution.
Also note that this is not homework although it may sound like it...
.NET provides a thread pool that you can use. System.Threading.ThreadPool.QueueUserWorkItem() tells a thread in the pool to do some work for you.
Were I designing this, I'd not focus on mapping threads to your HW resources. Instead I'd expose a lockable object for each HW resource - this can simply be an array or queue of 5 Objects. Then for each bit of computation you have, call QueueUserWorkItem(). Inside the method you pass to QUWI, find the next available lockable object and lock it (aka, dequeue it). Use the HW resource, then re-enqueue the object, exit the QUWI method.
It won't matter how many times you call QUWI; there can be at most 5 locks held, each lock guards access to one instance of your special hardware device.
The doc page for Monitor.Enter() shows how to create a safe (blocking) Queue that can be accessed by multiple workers. In .NET 4.0, you would use the builtin BlockingCollection - it's the same thing.
That's basically what you want. Except don't call Thread.Create(). Use the thread pool.
cite: Advantage of using Thread.Start vs QueueUserWorkItem
// assume the SafeQueue class from the cited doc page.
SafeQueue<SpecialHardware> q = new SafeQueue<SpecialHardware>()
// set up the queue with objects protecting the 5 magic stones
private void Setup()
{
for (int i=0; i< 5; i++)
{
q.Enqueue(GetInstanceOfSpecialHardware(i));
}
}
// something like this gets called many times, by QueueUserWorkItem()
public void DoWork(WorkDescription d)
{
d.DoPrepWork();
// gain access to one of the special hardware devices
SpecialHardware shw = q.Dequeue();
try
{
shw.DoTheMagicThing();
}
finally
{
// ensure no matter what happens the HW device is released
q.Enqueue(shw);
// at this point another worker can use it.
}
d.DoFollowupWork();
}
A lock free solution is only beneficial if the computation time is very small.
I would create a facade for each hardware thread where jobs are enqueued and a callback is invoked each time a job finishes.
Something like:
public class Job
{
public string JobInfo {get;set;}
public Action<Job> Callback {get;set;}
}
public class MyHardwareService
{
Queue<Job> _jobs = new Queue<Job>();
Thread _hardwareThread;
ManualResetEvent _event = new ManualResetEvent(false);
public MyHardwareService()
{
_hardwareThread = new Thread(WorkerFunc);
}
public void Enqueue(Job job)
{
lock (_jobs)
_jobs.Enqueue(job);
_event.Set();
}
public void WorkerFunc()
{
while(true)
{
_event.Wait(Timeout.Infinite);
Job currentJob;
lock (_queue)
{
currentJob = jobs.Dequeue();
}
//invoke hardware here.
//trigger callback in a Thread Pool thread to be able
// to continue with the next job ASAP
ThreadPool.QueueUserWorkItem(() => job.Callback(job));
if (_queue.Count == 0)
_event.Reset();
}
}
}
Sounds like you need a thread pool with 5 threads where each one relinquishes the HW once it's done and adds it back to some queue. Would that work? If so, .Net makes thread pools very easy.
Sounds a lot like the Sleeping barber problem. I believe the standard solution to that is to use semaphores

Recursive function MultiThreading to perform one task at a time

I am writing a program to crawl the websites. The crawl function is a recursive one and may consume more time to complete, So I used Multi Threading to perform the crawl for multiple websites.
What exactly I need is, after completion crawling one website it call next one (which should be in Queqe) instead multiple websites crawling at a time.
I am using C# and ASP.NET.
The standard practice for doing this is to use a blocking queue. If you are using .NET 4.0 then you can take advantage of the BlockingCollection class otherwise you can use Stephen Toub's implementation.
What you will do is spin up as many worker threads as you feel necessary and have them go around in an infinite loop dequeueing items as they appear in the queue. Your main thread will be enqueueing the item. A blocking queue is designed to wait/block on the dequeue operation until an item becomes available.
public class Program
{
private static BlockingQueue<string> m_Queue = new BlockingQueue<string>();
public static void Main()
{
var thread1 = new Thread(Process);
var thread2 = new Thread(Process);
thread1.Start();
thread2.Start();
while (true)
{
string url = GetNextUrl();
m_Queue.Enqueue(url);
}
}
public static void Process()
{
while (true)
{
string url = m_Queue.Dequeue();
// Do whatever with the url here.
}
}
}
I don't usually think positive thoughts when it comes to web crawlers...
You want to use a threadpool.
ThreadPool.QueueUserWorkItem(new WaitCallback(CrawlSite), (object)s);
You simply 'push' you workload into the queue, and let the threadpool manage it.
I have to say - I'm not a Threading expert and my C# is quite rusty - but considering the requirements I would suggest something like this:
Define a Queue for the websites.
Define a Pool with Crawler threads.
The main process iterates over the website queue and retrieves the site address.
Retrieve an available thread from the pool - assign it the website address and allow it to start running. Set an indicator in the thread object that it should wait for all subsequent threads to finish (so you will not continue to the next site).
Once all the threads have ended - the main thread (started in step #4) will end and return to the main loop of the main process to continue to the next website.
The Crawler behavior should be something like this:
Investigate the content of the current address
Retrieve the hierarchy below the current level
For each child of the current node of the site tree - pull a new crawler thread from the pool and start it running in the background with the address of the child node
If the pool is empty, wait until a thread becomes available.
If the thread is marked to wait - wait for all the other threads to finish
I think there are some challenges here - but as a general flow I believe it can do do job.
Put all your url's in a queue, and pop one off the queue each time you are done with the previous one.
You could also put the recursive links in the queue, to better control how many downloads you are executing at a time.
You could set up X number of worker threads which all get a url off the queue in order to process more at a time. But this way you can throttle it yourself.
You can use ConcurrentQueue<T> in .Net to get a thread safe queue to work with.

In .NET is there a thread scheduler for long running threads?

Our scenario is a network scanner.
It connects to a set of hosts and scans them in parallel for a while using low priority background threads.
I want to be able to schedule lots of work but only have any given say ten or whatever number of hosts scanned in parallel. Even if I create my own threads, the many callbacks and other asynchronous goodness uses the ThreadPool and I end up running out of resources. I should look at MonoTorrent...
If I use THE ThreadPool, can I limit my application to some number that will leave enough for the rest of the application to Run smoothly?
Is there a threadpool that I can initialize to n long lived threads?
[Edit]
No one seems to have noticed that I made some comments on some responses so I will add a couple things here.
Threads should be cancellable both
gracefully and forcefully.
Threads should have low priority leaving the GUI responsive.
Threads are long running but in Order(minutes) and not Order(days).
Work for a given target host is basically:
For each test
Probe target (work is done mostly on the target end of an SSH connection)
Compare probe result to expected result (work is done on engine machine)
Prepare results for host
Can someone explain why using SmartThreadPool is marked wit ha negative usefulness?
In .NET 4 you have the integrated Task Parallel Library. When you create a new Task (the new thread abstraction) you can specify a Task to be long running. We have made good experiences with that (long being days rather than minutes or hours).
You can use it in .NET 2 as well but there it's actually an extension, check here.
In VS2010 the Debugging Parallel applications based on Tasks (not threads) has been radically improved. It's advised to use Tasks whenever possible rather than raw threads. Since it lets you handle parallelism in a more object oriented friendly way.
UPDATE
Tasks that are NOT specified as long running, are queued into the thread pool (or any other scheduler for that matter).
But if a task is specified to be long running, it just creates a standalone Thread, no thread pool is involved.
The CLR ThreadPool isn't appropriate for executing long-running tasks: it's for performing short tasks where the cost of creating a thread would be nearly as high as executing the method itself. (Or at least a significant percentage of the time it takes to execute the method.) As you've seen, .NET itself consumes thread pool threads, you can't reserve a block of them for yourself lest you risk starving the runtime.
Scheduling, throttling, and cancelling work is a different matter. There's no other built-in .NET worker-queue thread pool, so you'll have roll your own (managing the threads or BackgroundWorkers yourself) or find a preexisting one (Ami Bar's SmartThreadPool looks promising, though I haven't used it myself).
In your particular case, the best option would not be either threads or the thread pool or Background worker, but the async programming model (BeginXXX, EndXXX) provided by the framework.
The advantages of using the asynchronous model is that the TcpIp stack uses callbacks whenever there is data to read and the callback is automatically run on a thread from the thread pool.
Using the asynchronous model, you can control the number of requests per time interval initiated and also if you want you can initiate all the requests from a lower priority thread while processing the requests on a normal priority thread which means the packets will stay as little as possible in the internal Tcp Queue of the networking stack.
Asynchronous Client Socket Example - MSDN
P.S. For multiple concurrent and long running jobs that don't do allot of computation but mostly wait on IO (network, disk, etc) the better option always is to use a callback mechanism and not threads.
I'd create your own thread manager. In the following simple example a Queue is used to hold waiting threads and a Dictionary is used to hold active threads, keyed by ManagedThreadId. When a thread finishes, it removes itself from the active dictionary and launches another thread via a callback.
You can change the max running thread limit from your UI, and you can pass extra info to the ThreadDone callback for monitoring performance, etc. If a thread fails for say, a network timeout, you can reinsert back into the queue. Add extra control methods to Supervisor for pausing, stopping, etc.
using System;
using System.Collections.Generic;
using System.Threading;
namespace ConsoleApplication1
{
public delegate void CallbackDelegate(int idArg);
class Program
{
static void Main(string[] args)
{
new Supervisor().Run();
Console.WriteLine("Done");
Console.ReadKey();
}
}
class Supervisor
{
Queue<System.Threading.Thread> waitingThreads = new Queue<System.Threading.Thread>();
Dictionary<int, System.Threading.Thread> activeThreads = new Dictionary<int, System.Threading.Thread>();
int maxRunningThreads = 10;
object locker = new object();
volatile bool done;
public void Run()
{
// queue up some threads
for (int i = 0; i < 50; i++)
{
Thread newThread = new Thread(new Worker(ThreadDone).DoWork);
newThread.IsBackground = true;
waitingThreads.Enqueue(newThread);
}
LaunchWaitingThreads();
while (!done) Thread.Sleep(200);
}
// keep starting waiting threads until we max out
void LaunchWaitingThreads()
{
lock (locker)
{
while ((activeThreads.Count < maxRunningThreads) && (waitingThreads.Count > 0))
{
Thread nextThread = waitingThreads.Dequeue();
activeThreads.Add(nextThread.ManagedThreadId, nextThread);
nextThread.Start();
Console.WriteLine("Thread " + nextThread.ManagedThreadId.ToString() + " launched");
}
done = (activeThreads.Count == 0) && (waitingThreads.Count == 0);
}
}
// this is called by each thread when it's done
void ThreadDone(int threadIdArg)
{
lock (locker)
{
// remove thread from active pool
activeThreads.Remove(threadIdArg);
}
Console.WriteLine("Thread " + threadIdArg.ToString() + " finished");
LaunchWaitingThreads(); // this could instead be put in the wait loop at the end of Run()
}
}
class Worker
{
CallbackDelegate callback;
public Worker(CallbackDelegate callbackArg)
{
callback = callbackArg;
}
public void DoWork()
{
System.Threading.Thread.Sleep(new Random().Next(100, 1000));
callback(System.Threading.Thread.CurrentThread.ManagedThreadId);
}
}
}
Use the built-in threadpool. It has good capabilities.
Alternatively you can look at the Smart Thread Pool implementation here or at Extended Thread Pool for a limit on the maximum number of working threads.

process list items in different threads

At the moment I have List of Job objects that are queued to be processed sequentially shown in the code bellow.
List<Job> jobList = jobQueue.GetJobsWithStatus(Status.New);
foreach (Job job in jobList)
{
job.Process();
}
I am interested in running several Jobs at the same time in a limited number of threads (lets say 5 threads).
What is the best way to do this in c#?
Additional Notes:
A Job object does not share resources
with other jobs.
Each Job takes about 10 seconds to
process.
Each job could connect to
different resources.
Update: I have used a Semaphore because I could not limit the amount of active threads with a ThreadPool.
If you're feeling adventurous you can use C# 4.0 and the Task Parallel Library:
Parallel.ForEach(jobList, curJob => {
curJob.Process()
});
You will want to look into thread pools (for the simple answer). There is even a ThreadPool class in C# and it is quite easy to set up with great examples in the msdn library.
ThreadPool.QueueUserWorkItem()
This will queue a method for execution. It will execute when a thread pool thread is available.
If you meant to use Threading, simple coding:
private static void TransferThread(ThreadStart tstart)
{
Thread thread = new Thread(tstart);
thread.IsBackground = true;
thread.Start();
}
Or you can you ThreadPool. This allows you to use any available thread and return it back to the pool after you're done.
I would give the .NET ThreadPool a try.

Categories

Resources