I have a Windows service project that logs messages to a database (or another destination). The frequency of these messages can reach ten per second. Since sending and processing the messages shouldn't delay the main process of the service, I start a new thread to process each message. This means that if the main process needs to send 100 log messages, 100 threads are started, one per message. I learned that when a thread finishes it is cleaned up automatically, so I don't have to dispose of it. As long as I dispose of all objects used inside the thread, everything should work fine.
The service can run into an exception that forces it to shut down. Before the service shuts down, it should wait for all threads that are still logging messages. To achieve this, every thread is added to a list when it is started. When the wait-for-threads method is called, each thread in the list is checked to see whether it is still alive, and if so, Join is used to wait for it.
The code:
Creating the thread:
/// <summary>
/// Creates a new thread and sends the message
/// </summary>
/// <param name="logMessage"></param>
private static void ThreadSend(IMessage logMessage)
{
    ParameterizedThreadStart threadStart = new ParameterizedThreadStart(MessageHandler.HandleMessage);
    Thread messageThread = new Thread(threadStart);
    messageThread.Name = "LogMessageThread";
    messageThread.Start(logMessage);
    threads.Add(messageThread);
}
The waiting for threads to end:
/// <summary>
/// Waits for threads that are still being processed
/// </summary>
public static void WaitForThreads()
{
    int i = 0;
    foreach (Thread thread in threads)
    {
        i++;
        if (thread.IsAlive)
        {
            Debug.Print("waiting for {0} - {1} to end...", thread.Name, i);
            thread.Join();
        }
    }
}
Now my main concern is that if this service runs for a month, it will still have all those threads (millions of them, most dead) in the list. This will eat memory, and I don't know how much. This doesn't seem like good practice to me; I want to clean up finished threads, but I can't figure out how. Does anyone have a good or best practice for this?
Remove the threads from the list if they are dead?
/// <summary>
/// Waits for threads that are still being processed
/// </summary>
public static void WaitForThreads()
{
    List<Thread> toRemove = new List<Thread>();
    int i = 0;
    foreach (Thread thread in threads)
    {
        i++;
        if (thread.IsAlive)
        {
            Debug.Print("waiting for {0} - {1} to end...", thread.Name, i);
            thread.Join();
        }
        else
        {
            toRemove.Add(thread);
        }
    }
    threads.RemoveAll(x => toRemove.Contains(x));
}
Have a look at Task Parallelism
First of all: Creating one thread per log message is not a good idea. Either use ThreadPool or create a limited number of worker threads which handle the log items from a common queue (producer/consumer).
Second: Of course you also need to remove the thread references from the list! Either have the thread method remove itself when it ends, or do it on a regular basis; for example, have a timer run every half hour that checks the list for dead threads and removes them.
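For example, a minimal sketch of that timer cleanup, assuming the static threads list from the question plus a lock object (ThreadSend's Add call would need to take the same lock):
private static readonly object threadsLock = new object();
// Prune dead threads every half hour so the list cannot grow unbounded.
private static readonly System.Threading.Timer cleanupTimer = new System.Threading.Timer(_ =>
{
    lock (threadsLock)
    {
        threads.RemoveAll(t => !t.IsAlive);
    }
}, null, TimeSpan.FromMinutes(30), TimeSpan.FromMinutes(30));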
If all you're doing in those threads is logging, you should probably have a single logging thread and a shared queue that the main thread puts messages on. The logging thread can then read the queue and log. This is incredibly easy with the BlockingCollection.
Create the queue in the service's main thread:
BlockingCollection<IMessage> LogMessageQueue = new BlockingCollection<IMessage>();
Your service's main thread creates a Logger (see below) instance, which starts a thread to process log messages. The main thread adds items to the LogMessageQueue. The logger thread reads them from the queue. When the main thread wants to shut down, it calls LogMessageQueue.CompleteAdding. The logger will empty the queue and exit.
Main thread would look like this:
// start the logger
Logger _loggingThread = new Logger(LogMessageQueue);
// to log a message:
LogMessageQueue.Add(logMessage);
// when the program needs to shut down:
LogMessageQueue.CompleteAdding();
And the logger class:
class Logger
{
    BlockingCollection<IMessage> _queue;
    Thread _loggingThread;

    public Logger(BlockingCollection<IMessage> queue)
    {
        _queue = queue;
        _loggingThread = new Thread(LoggingThreadProc);
        _loggingThread.Start();
    }

    private void LoggingThreadProc()
    {
        IMessage msg;
        while (_queue.TryTake(out msg, Timeout.Infinite))
        {
            // log the item
        }
    }
}
This way you have just one additional thread, messages are guaranteed to be processed in the order they're sent (not true of your current approach), and you don't have to worry about keeping track of thread shutdown, etc.
Update
If some of your log messages will take time to process (the email you described, for example), you can process them asynchronously. For example:
while (_queue.TryTake(out msg, Timeout.Infinite))
{
    if (msg.Type == Email)
    {
        // start asynchronous task to send email
    }
    else
    {
        // write to log file
    }
}
This way, only those messages that potentially take lots of time will run asynchronously. You can also have a secondary queue there if you want, for the email messages. That way you won't get bogged down with a bunch of email threads. Rather, you limit it to one or two, or perhaps a handful.
Note that you can also have multiple Logger instances if you want, all reading from the same message queue. Just make sure they're each writing to a different log file. The queue itself will support multiple consumers.
I think that, in general, your approach to this issue is not the best practice.
I mean, instead of creating thousands of threads, you just want to store thousands of messages in a database, right? And it seems you want to do this asynchronously.
But creating a thread for each message is not really a good idea, and it actually does not solve that issue...
Instead I would try to implement something like message queues. You can have multiple queues, and each queue has its own thread. When messages come in, you send them to one of the queues (alternating)...
The queue either waits for a certain number of messages, or always waits a certain amount of time (e.g. 1 second, depending on how long it takes to store, say, 100 messages in the database) before it tries to store the queued messages in the database.
This way you always have a constant number of threads, and you shouldn't see any performance issues...
It would also let you batch-insert data, instead of inserting rows one by one with the overhead of DB connections etc...
Of course, if your database is slower than the tasks can store the messages, more and more messages will be queued... but that's true for your current solution, too.
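A minimal sketch of that batching idea, assuming .NET 4's BlockingCollection and a hypothetical WriteBatchToDatabase helper that performs one multi-row INSERT:
BlockingCollection<IMessage> queue = new BlockingCollection<IMessage>();
// One worker drains the queue and flushes up to 100 messages at a time,
// or whatever has accumulated after a second, whichever comes first.
Thread worker = new Thread(() =>
{
    var batch = new List<IMessage>(100);
    while (!queue.IsCompleted)
    {
        IMessage msg;
        bool got = queue.TryTake(out msg, TimeSpan.FromSeconds(1));
        if (got)
            batch.Add(msg);

        if (batch.Count >= 100 || (!got && batch.Count > 0))
        {
            WriteBatchToDatabase(batch); // hypothetical batch-insert helper
            batch.Clear();
        }
    }
});
worker.Start();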
Since multiple answers and comments led to my solution, I will post the complete code here.
I used the ThreadPool to manage the threads, and code from this page for the waiting function.
Creating the thread:
private static void ThreadSend(IMessage logMessage)
{
    ThreadPool.QueueUserWorkItem(MessageHandler.HandleMessage, logMessage);
}
Waiting for the threads to finish:
public static bool WaitForThreads(int maxWaitingTime)
{
    int maxThreads = 0;
    int placeHolder = 0;
    int availableThreads = 0;

    while (maxWaitingTime > 0)
    {
        System.Threading.ThreadPool.GetMaxThreads(out maxThreads, out placeHolder);
        System.Threading.ThreadPool.GetAvailableThreads(out availableThreads, out placeHolder);

        // Stop if all threads are available
        if (availableThreads == maxThreads)
        {
            return true;
        }

        System.Threading.Thread.Sleep(TimeSpan.FromMilliseconds(1000));
        --maxWaitingTime;
    }
    return false;
}
Optionally you can add this somewhere outside these methods to limit the number of threads in the pool.
System.Threading.ThreadPool.SetMaxThreads(MaxWorkerThreads, MaxCompletionPortThreads);
Related
I am working on a Windows service written in C# (.NET 4.5, VS2012) which uses RabbitMQ (receiving messages by subscription). There is a class which derives from DefaultBasicConsumer, and in this class are two actual consumers (so two channels). Because there are two channels, two threads handle incoming messages (from two different queues/routing keys), and both call the same HandleBasicDeliver(...) function.
Now, when the Windows service's OnStop() is called (when someone stops the service), I want to let both those threads finish handling their messages (if they are currently processing one), send the ack to the server, and then stop the service (abort the threads and so on).
I have thought of multiple solutions, but none of them seem to be really good. Here's what I tried:
using one mutex: each thread tries to take it when entering HandleBasicDeliver, then releases it afterwards. When OnStop() is called, the main thread tries to grab the same mutex, effectively preventing the RabbitMQ threads from processing any more messages. The disadvantage is that only one consumer thread can process a message at a time.
using two mutexes: each RabbitMQ thread uses a different mutex, so they won't block each other in HandleBasicDeliver() - I can tell which thread is handling the current message based on the routing key. Something like:
HandleBasicDeliver(...)
{
    if (routingKey == firstConsumerRoutingKey)
    {
        // Try to grab the mutex of the first consumer
    }
    else
    {
        // Try to grab the mutex of the second consumer
    }
}
When OnStop() is called, the main thread will try to grab both mutexes; once both mutexes are "in the hands" of the main thread, it can proceed with stopping the service. The problem: if another consumer were added to this class, I'd need to change a lot of code.
using a counter, or CountdownEvent. The counter starts at 0, and each time HandleBasicDeliver() is entered it is safely incremented using the Interlocked class. After the message is processed, the counter is decremented. When OnStop() is called, the main thread checks whether the counter is 0; if so, it continues. However, right after it checks the counter, some RabbitMQ thread might begin processing a message.
closing the connection to RabbitMQ when OnStop() is called (to make sure no new messages arrive), and then waiting a few seconds (in case any messages are still being processed) before closing the application. The problem is that the exact number of seconds to wait before shutting down the application is unknown, so this isn't an elegant or exact solution.
I realize the design does not conform to the Single Responsibility Principle, and that may contribute to the lack of solutions. However, could there be a good solution to this problem without having to redesign the project?
We do this in our application; the main idea is to use a CancellationTokenSource.
On your windows service add this:
private static readonly CancellationTokenSource CancellationTokenSource = new CancellationTokenSource();
Then in your rabbit consumers do this:
1. change from using Dequeue to DequeueNoWait
2. have your rabbit consumer check the cancellation token
Here is our code:
public async Task StartConsuming(IMessageBusConsumer consumer, MessageBusConsumerName fullConsumerName, CancellationToken cancellationToken)
{
    var queueName = GetQueueName(consumer.MessageBusConsumerEnum);
    using (var model = _rabbitConnection.CreateModel())
    {
        // Configure the quality of service for the model. Below is what each setting means.
        // BasicQos(0 = "Don't send me a new message until I've finished", _fetchSize = "Send me N messages at a time", false = "Apply to this model only")
        model.BasicQos(0, consumer.FetchCount.Value, false);

        var queueingConsumer = new QueueingBasicConsumer(model);
        model.BasicConsume(queueName, false, fullConsumerName, queueingConsumer);
        var queueEmpty = new BasicDeliverEventArgs(); // This is what gets returned if nothing in the queue is found.

        while (!cancellationToken.IsCancellationRequested)
        {
            var deliverEventArgs = queueingConsumer.Queue.DequeueNoWait(queueEmpty);
            if (deliverEventArgs == queueEmpty)
            {
                // This 100ms wait allows the processor to go do other work.
                // No sense in going back to an empty queue immediately.
                // CancellationToken intentionally not used!
                // ReSharper disable once MethodSupportsCancellation
                await Task.Delay(100);
                continue;
            }

            // DO YOUR WORK HERE!
        }
    }
}
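The service's OnStop side is then just a matter of signalling the token and giving the consumer task time to finish; a sketch, assuming _consumerTask holds the Task returned by StartConsuming:
protected override void OnStop()
{
    // Stop taking new messages; in-flight messages finish and get acked.
    CancellationTokenSource.Cancel();
    _consumerTask.Wait(TimeSpan.FromSeconds(30));
}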
Usually, the way we ensure a Windows service does not stop before processing completes is to use code like the example below. Hope it helps.
protected override void OnStart(string[] args)
{
    // start the worker thread
    _workerThread = new Thread(WorkMethod)
    {
        // !!! set to foreground so the windows service cannot stop
        // until the thread exits once all pending tasks are complete
        IsBackground = false
    };
    _workerThread.Start();
}

protected override void OnStop()
{
    // notify the worker thread to stop accepting new migration requests
    // and to exit when all tasks are completed
    // (some code to notify the worker thread to stop accepting new tasks internally)

    // wait for the worker thread to stop
    _workerThread.Join();
}
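For illustration, a sketch of what that stop notification could look like internally, assuming the worker consumes a BlockingCollection-based task queue (not part of the original code):
private readonly BlockingCollection<Action> _tasks = new BlockingCollection<Action>();

private void WorkMethod()
{
    // Blocks until a task arrives; the loop ends once CompleteAdding
    // has been called and the queue has been drained.
    foreach (Action task in _tasks.GetConsumingEnumerable())
    {
        task();
    }
}

// In OnStop, before _workerThread.Join():
//     _tasks.CompleteAdding();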
The situation I am uncertain of concerns the usage of a "threadsafe" PipeStream, where multiple threads can add messages to be written. If there is no queue of messages to be written, the current thread begins writing to the reading party. If there is a queue, and the queue grows while the pipe is writing, I want the thread that began writing to deplete the queue.
I "hope" that this design (demonstrated below) discourages the continuous entering/releasing of the SemaphoreSlim and decreases the number of tasks scheduled. I say "hope" because I should test whether this complication has any positive performance implications. However, before even testing this I should first understand whether the code does what I think it will, so please consider the following class, and below it a sequence of events:
Note: I understand that execution of tasks is not tied to any particular thread, but I find this is the easiest way to explain.
class SemaphoreExample
{
    // Wrapper around a NamedPipeClientStream
    private readonly MessagePipeClient m_pipe =
        new MessagePipeClient("somePipe");

    private readonly SemaphoreSlim m_semaphore =
        new SemaphoreSlim(1, 1);

    private readonly BlockingCollection<Message> m_messages =
        new BlockingCollection<Message>(new ConcurrentQueue<Message>());

    public Task Send<T>(T content)
        where T : class
    {
        if (!this.m_messages.TryAdd(new Message<T>(content)))
            throw new InvalidOperationException("No more requests!");

        Task dequeue = TryDequeue();
        return Task.FromResult(true);

        // In reality this class (and method) is more complex.
        // There is a similar pipe (and wrkr) in the other direction.
        // The "sent jobs" are kept in a dictionary and this method
        // returns a task belonging to a completion source tied
        // to the "sent job". The wrkr responsible for the other
        // pipe reads a response and sets the corresponding
        // completion source.
    }

    private async Task TryDequeue()
    {
        if (!this.m_semaphore.Wait(0))
            return; // someone else is already here

        try
        {
            Message message;
            while (this.m_messages.TryTake(out message))
            {
                await this.m_pipe.WriteAsync(message);
            }
        }
        finally { this.m_semaphore.Release(); }
    }
}
1. Wrkr1 finishes writing to the pipe. (in TryDequeue)
2. Wrkr1 determines the queue is empty. (in TryDequeue)
3. Wrkr2 adds an item to the queue. (in Send)
4. Wrkr2 determines that Wrkr1 occupies the semaphore and returns. (in Send)
5. Wrkr1 releases the semaphore. (in TryDequeue)
The queue is left with 1 item that won't be acted upon for x amount of time.
Is this sequence of events possible? Should I forget this idea altogether and have every call to "Send" await on "TryDequeue" and the semaphore within it? Perhaps the potential performance implication of scheduling another task per method call is negligible, even at a "high" frequency.
UPDATE:
Following the advice of Alex, I am doing the following:
Let the caller of "Send" specify a "maxWorkload" integer that says how many items the caller is prepared to process (on behalf of other callers, in the worst case) before delegating the remaining work to another thread. However, before creating the new thread, other callers of "Send" are given an opportunity to enter the semaphore, thereby possibly avoiding the additional thread.
So that no work is left lingering in the queue, any worker that successfully entered the semaphore and did some work must check whether any new work was added after it exited the semaphore. If so, the same worker tries to re-enter (if "maxWorkload" has not been reached) or delegates the work as described above.
Example below: "Send" now sets up "TryPool" as a continuation of "TryDequeue". "TryPool" only begins if "TryDequeue" returns true (i.e. it did some work while holding the semaphore).
// maxWorkload cannot be -1 for this method
private async Task<bool> TryDequeue(int maxWorkload)
{
    int currWorkload = 0;
    while (this.m_messages.Count != 0 && this.m_semaphore.Wait(0))
    {
        try
        {
            currWorkload = await Dequeue(currWorkload, maxWorkload);
            if (currWorkload >= maxWorkload)
                return true;
        }
        finally
        {
            this.m_semaphore.Release();
        }
    }
    return false;
}
private Task TryPool()
{
    if (this.m_messages.Count == 0 || !this.m_semaphore.Wait(0))
        return Task.FromResult(false);

    return Task.Run(async () =>
    {
        do
        {
            try
            {
                await Dequeue(0, -1);
            }
            finally
            {
                this.m_semaphore.Release();
            }
        }
        while (this.m_messages.Count != 0 && this.m_semaphore.Wait(0));
    });
}
private async Task<int> Dequeue(int currWorkload, int maxWorkload)
{
    while (currWorkload < maxWorkload || maxWorkload == -1)
    {
        Message message;
        if (!this.m_messages.TryTake(out message))
            return currWorkload;

        await this.m_pipe.WriteAsync(message);
        currWorkload++;
    }
    return maxWorkload;
}
I tend to call this pattern the "GatedBatchWriter", i.e. the first thread through the gate handles a batch of tasks; its own and a number of others on behalf of other writers, until it has done enough work.
This pattern is primarily useful, when it is more efficient to batch work, because of overheads associated with that work. E.g. writing larger blocks to disk in one go, instead of multiple small ones.
And yes, this particular pattern has a specific race condition to be aware of: the "responsible writer", i.e. the one that got through the gate, determines that no more messages are in the queue and stops before releasing the semaphore (i.e. its write responsibility). A second writer arrives between those two decision points and fails to acquire the write responsibility. Now there is a message in the queue that will not be delivered (or will be delivered late, when the next writer arrives).
Additionally, what you are doing now is not fair in terms of scheduling. If there are many messages, the queue might never be empty, and the writer that got through the gate will be busy writing messages on behalf of the others for all eternity. You need to limit the batch size for the responsible writer.
Some other things you may want to change are:
- Have your Message contain a task completion token.
- Have writers that could not acquire the write responsibility enqueue their message and wait for either of two task completions: the completion associated with their message, or the release of the write responsibility.
- Have the responsible writer set the completion for the messages that it processed.
- Have the responsible writer release its write responsibility when it has done enough work.
- When a waiting writer is woken up by one of the two task completions:
  - if it was due to the completion token on its message, it can go its merry way;
  - otherwise, try to acquire the write responsibility, rinse, repeat...
One more note: if there are a lot of messages, i.e. a high message load on average, a dedicated thread / long running task handling the queue will generally have a better performance.
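A minimal sketch of the completion-token idea from the list above, assuming a Message base class like the one in the question:
class Message
{
    // Completed by the responsible writer once this message has been written.
    public readonly TaskCompletionSource<bool> Completion =
        new TaskCompletionSource<bool>();
}

// A waiting writer, after enqueueing its message:
//     var woken = await Task.WhenAny(myMessage.Completion.Task, gateReleasedTask);
//     if (woken == myMessage.Completion.Task)
//         return; // our message was written on our behalf
//     // otherwise: try to acquire the write responsibility, rinse, repeat...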
I can have a maximum of 5 threads running simultaneously at any one time, which make use of 5 separate hardware devices to speed up the computation of some complex calculations and return the results. The API (containing only one method) for each of these devices is not thread safe and can only run on a single thread at any point in time. Once the computation is completed, the same thread can be re-used to start another computation on either the same or a different device, depending on availability. Each computation is standalone and does not depend on the results of the others. Hence, up to 5 threads may complete their execution in any order.
What is the most efficient C# (using .NET Framework 2.0) coding solution for keeping track of which hardware device is free/available and assigning a thread to the appropriate hardware API to perform the computation? Note that other than the limit of 5 concurrently running threads, I do not have any control over when or how the threads are fired.
Please correct me if I am wrong, but a lock-free solution is preferred, as I believe it would result in increased efficiency and a more scalable solution.
Also note that this is not homework although it may sound like it...
.NET provides a thread pool that you can use. System.Threading.ThreadPool.QueueUserWorkItem() tells a thread in the pool to do some work for you.
Were I designing this, I'd not focus on mapping threads to your HW resources. Instead I'd expose a lockable object for each HW resource - this can simply be an array or queue of 5 objects. Then for each bit of computation you have, call QueueUserWorkItem(). Inside the method you pass to QUWI, find the next available lockable object and lock it (i.e. dequeue it). Use the HW resource, then re-enqueue the object and exit the QUWI method.
It won't matter how many times you call QUWI; there can be at most 5 locks held, each lock guards access to one instance of your special hardware device.
The doc page for Monitor.Enter() shows how to create a safe (blocking) queue that can be accessed by multiple workers. In .NET 4.0 you would use the built-in BlockingCollection - it's the same thing.
That's basically what you want. Except don't call Thread.Create(). Use the thread pool.
cite: Advantage of using Thread.Start vs QueueUserWorkItem
// assume the SafeQueue class from the cited doc page
SafeQueue<SpecialHardware> q = new SafeQueue<SpecialHardware>();

// set up the queue with objects protecting the 5 magic stones
private void Setup()
{
    for (int i = 0; i < 5; i++)
    {
        q.Enqueue(GetInstanceOfSpecialHardware(i));
    }
}

// something like this gets called many times, by QueueUserWorkItem()
public void DoWork(WorkDescription d)
{
    d.DoPrepWork();

    // gain access to one of the special hardware devices
    SpecialHardware shw = q.Dequeue();
    try
    {
        shw.DoTheMagicThing();
    }
    finally
    {
        // ensure no matter what happens the HW device is released
        q.Enqueue(shw);
        // at this point another worker can use it
    }
    d.DoFollowupWork();
}
A lock free solution is only beneficial if the computation time is very small.
I would create a facade for each hardware thread where jobs are enqueued and a callback is invoked each time a job finishes.
Something like:
public class Job
{
    public string JobInfo { get; set; }
    public Action<Job> Callback { get; set; }
}

public class MyHardwareService
{
    Queue<Job> _jobs = new Queue<Job>();
    Thread _hardwareThread;
    ManualResetEvent _event = new ManualResetEvent(false);

    public MyHardwareService()
    {
        _hardwareThread = new Thread(WorkerFunc);
        _hardwareThread.Start();
    }

    public void Enqueue(Job job)
    {
        lock (_jobs)
            _jobs.Enqueue(job);
        _event.Set();
    }

    public void WorkerFunc()
    {
        while (true)
        {
            _event.WaitOne();
            Job currentJob;
            lock (_jobs)
            {
                currentJob = _jobs.Dequeue();
                if (_jobs.Count == 0)
                    _event.Reset();
            }
            // invoke hardware here.
            // trigger the callback on a thread-pool thread to be able
            // to continue with the next job ASAP
            ThreadPool.QueueUserWorkItem(_ => currentJob.Callback(currentJob));
        }
    }
}
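Usage could then look something like this (a sketch; jobNumber and the round-robin distribution are just illustrative):
var services = new List<MyHardwareService>();
for (int i = 0; i < 5; i++)
    services.Add(new MyHardwareService());

int jobNumber = 42; // e.g. a running job counter
services[jobNumber % 5].Enqueue(new Job
{
    JobInfo = "computation #" + jobNumber,
    Callback = job => Console.WriteLine(job.JobInfo + " done")
});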
Sounds like you need a thread pool with 5 threads, where each one relinquishes the HW once it's done and adds it back to a queue. Would that work? If so, .NET makes thread pools very easy.
Sounds a lot like the Sleeping barber problem. I believe the standard solution to that is to use semaphores
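In .NET 2.0 terms, a sketch of that semaphore approach might look like the following (HardwareDevice, GetFreeDevice, ReleaseDevice and Compute are hypothetical placeholders for the hardware API):
static Semaphore slots = new Semaphore(5, 5);

static void RunComputation(object input)
{
    slots.WaitOne(); // block until one of the 5 slots frees up
    try
    {
        HardwareDevice device = GetFreeDevice(); // lock-protected lookup
        try
        {
            device.Compute(input); // the non-thread-safe API call
        }
        finally
        {
            ReleaseDevice(device);
        }
    }
    finally
    {
        slots.Release();
    }
}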
Our scenario is a network scanner.
It connects to a set of hosts and scans them in parallel for a while using low priority background threads.
I want to be able to schedule lots of work, but only have, say, ten hosts (or whatever number) scanned in parallel at any given time. Even if I create my own threads, the many callbacks and other asynchronous goodness use the ThreadPool and I end up running out of resources. I should look at MonoTorrent...
If I use the ThreadPool, can I limit my application to some number of threads that will leave enough for the rest of the application to run smoothly?
Is there a thread pool that I can initialize to n long-lived threads?
[Edit]
No one seems to have noticed that I made some comments on some responses, so I will add a couple of things here.
- Threads should be cancellable both gracefully and forcefully.
- Threads should have low priority, leaving the GUI responsive.
- Threads are long running, but in Order(minutes) and not Order(days).
Work for a given target host is basically:
For each test:
- Probe target (work is done mostly on the target end of an SSH connection)
- Compare the probe result to the expected result (work is done on the engine machine)
Prepare results for the host
Can someone explain why using SmartThreadPool is marked with a negative usefulness?
In .NET 4 you have the integrated Task Parallel Library. When you create a new Task (the new thread abstraction) you can specify it as long running. We have had good experiences with that (long meaning days rather than minutes or hours).
You can use it in .NET 2 as well but there it's actually an extension, check here.
In VS2010, debugging parallel applications based on Tasks (not threads) has been radically improved. It's advised to use Tasks whenever possible rather than raw threads, since they let you handle parallelism in a more object-oriented-friendly way.
UPDATE
Tasks that are NOT specified as long running, are queued into the thread pool (or any other scheduler for that matter).
But if a task is specified to be long running, it just creates a standalone Thread, no thread pool is involved.
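For example, a sketch (ScanHost is a placeholder for your per-host work):
var cts = new CancellationTokenSource();

// LongRunning hints the scheduler to use a dedicated thread
// instead of occupying a thread-pool thread.
Task scanTask = Task.Factory.StartNew(
    () => ScanHost("10.0.0.1", cts.Token),
    cts.Token,
    TaskCreationOptions.LongRunning,
    TaskScheduler.Default);

// Graceful cancellation: ScanHost checks cts.Token periodically.
// cts.Cancel();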
The CLR ThreadPool isn't appropriate for executing long-running tasks: it's for performing short tasks where the cost of creating a thread would be nearly as high as executing the method itself. (Or at least a significant percentage of the time it takes to execute the method.) As you've seen, .NET itself consumes thread pool threads, you can't reserve a block of them for yourself lest you risk starving the runtime.
Scheduling, throttling, and cancelling work is a different matter. There's no other built-in .NET worker-queue thread pool, so you'll have to roll your own (managing the threads or BackgroundWorkers yourself) or find a preexisting one (Ami Bar's SmartThreadPool looks promising, though I haven't used it myself).
In your particular case, the best option would be neither threads, nor the thread pool, nor BackgroundWorker, but the asynchronous programming model (BeginXXX, EndXXX) provided by the framework.
The advantage of using the asynchronous model is that the TCP/IP stack uses callbacks whenever there is data to read, and the callback is automatically run on a thread from the thread pool.
Using the asynchronous model, you can control the number of requests initiated per time interval, and, if you want, you can initiate all the requests from a lower-priority thread while processing the requests on a normal-priority thread, which means the packets spend as little time as possible in the internal TCP queue of the networking stack.
Asynchronous Client Socket Example - MSDN
P.S. For multiple concurrent and long-running jobs that don't do a lot of computation but mostly wait on IO (network, disk, etc.), the better option is always to use a callback mechanism rather than threads.
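A minimal sketch of that Begin/End callback style with a raw socket (shaped like the MSDN example linked above; the host and port are illustrative):
Socket socket = new Socket(AddressFamily.InterNetwork,
                           SocketType.Stream, ProtocolType.Tcp);

// No thread is blocked while the connect is pending; the callback
// runs on a thread-pool thread when the connection completes.
socket.BeginConnect("host.example.com", 22, ar =>
{
    Socket s = (Socket)ar.AsyncState;
    s.EndConnect(ar);
    // probe the target here, then s.BeginReceive(...) for the reply
}, socket);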
I'd suggest creating your own thread manager. In the following simple example a Queue is used to hold waiting threads and a Dictionary is used to hold active threads, keyed by ManagedThreadId. When a thread finishes, it removes itself from the active dictionary and launches another thread via a callback.
You can change the max running thread limit from your UI, and you can pass extra info to the ThreadDone callback for monitoring performance, etc. If a thread fails for say, a network timeout, you can reinsert back into the queue. Add extra control methods to Supervisor for pausing, stopping, etc.
using System;
using System.Collections.Generic;
using System.Threading;

namespace ConsoleApplication1
{
    public delegate void CallbackDelegate(int idArg);

    class Program
    {
        static void Main(string[] args)
        {
            new Supervisor().Run();
            Console.WriteLine("Done");
            Console.ReadKey();
        }
    }

    class Supervisor
    {
        Queue<System.Threading.Thread> waitingThreads = new Queue<System.Threading.Thread>();
        Dictionary<int, System.Threading.Thread> activeThreads = new Dictionary<int, System.Threading.Thread>();
        int maxRunningThreads = 10;
        object locker = new object();
        volatile bool done;

        public void Run()
        {
            // queue up some threads
            for (int i = 0; i < 50; i++)
            {
                Thread newThread = new Thread(new Worker(ThreadDone).DoWork);
                newThread.IsBackground = true;
                waitingThreads.Enqueue(newThread);
            }
            LaunchWaitingThreads();
            while (!done) Thread.Sleep(200);
        }

        // keep starting waiting threads until we max out
        void LaunchWaitingThreads()
        {
            lock (locker)
            {
                while ((activeThreads.Count < maxRunningThreads) && (waitingThreads.Count > 0))
                {
                    Thread nextThread = waitingThreads.Dequeue();
                    activeThreads.Add(nextThread.ManagedThreadId, nextThread);
                    nextThread.Start();
                    Console.WriteLine("Thread " + nextThread.ManagedThreadId.ToString() + " launched");
                }
                done = (activeThreads.Count == 0) && (waitingThreads.Count == 0);
            }
        }

        // this is called by each thread when it's done
        void ThreadDone(int threadIdArg)
        {
            lock (locker)
            {
                // remove thread from active pool
                activeThreads.Remove(threadIdArg);
            }
            Console.WriteLine("Thread " + threadIdArg.ToString() + " finished");
            LaunchWaitingThreads(); // this could instead be put in the wait loop at the end of Run()
        }
    }

    class Worker
    {
        CallbackDelegate callback;

        public Worker(CallbackDelegate callbackArg)
        {
            callback = callbackArg;
        }

        public void DoWork()
        {
            System.Threading.Thread.Sleep(new Random().Next(100, 1000));
            callback(System.Threading.Thread.CurrentThread.ManagedThreadId);
        }
    }
}
Use the built-in threadpool. It has good capabilities.
Alternatively you can look at the Smart Thread Pool implementation here or at Extended Thread Pool for a limit on the maximum number of working threads.
Sorry the title is a bit crappy; I couldn't quite word it properly.
Edit: I should note this is a C# console app.
I've prototyped out a system that works like so (this is rough pseudo-codeish):
var collection = grabfromdb();

foreach (item in collection) {
    SendAnEmail();
}

SendAnEmail:
    SmtpClient mailClient = new SmtpClient();
    mailClient.SendCompleted += new SendCompletedEventHandler(SendComplete);
    mailClient.SendAsync('the mail message');

SendComplete:
    if (anyErrors) {
        errorHandling();
    }
    else {
        HitDBAndMarkAsSendOK();
    }
Obviously this setup is not ideal. If the initial collection has, say, 10,000 records, then it's going to new up 10,000 instances of SmtpClient in fairly short order as quickly as it can step through the rows - and likely asplode in the process.
My ideal end game is to have something like 10 concurrent emails going out at once.
A hacky solution comes to mind: add a counter that is incremented when SendAnEmail() is called and decremented when SendComplete fires. Before SendAnEmail() is called in the initial loop, check the counter; if it's too high, sleep for a small period of time and then check it again.
I'm not sure that's such a great idea, and figure the SO hive mind would have a way to do this properly.
I have very little knowledge of threading and am not sure whether it would be an appropriate use here, e.g. sending email in a background thread that first checks the number of child threads to ensure there aren't too many in use, or whether there is some type of 'thread throttling' built in.
Update
Following the advice of Steven A. Lowe, I now have:
A Dictionary holding my emails and a unique key (this is the email queue)
A FillQueue method, which populates the dictionary
A ProcessQueue method, which runs on a background thread. It checks the queue and SendAsyncs any email in it.
A SendCompleted delegate which removes the email from the queue and calls FillQueue again.
I have a few problems with this setup. I think I've missed the boat with the background thread; should I be spawning one of these for each item in the dictionary? And how can I get the thread to 'hang around', for lack of a better word? If the email queue empties, the thread ends.
final update
I've put a 'while(true) {}' in the background thread. If the queue is empty, it waits a few seconds and tries again. If the queue is repeatedly empty, I 'break' the while and the program ends... Works fine. I'm a bit worried about the 'while(true)' business though...
Short Answer
Use a queue as a finite buffer, processed by its own thread.
Long Answer
Call a fill-queue method to create a queue of emails, limited to (say) 10. Fill it with the first 10 unsent emails. Launch a thread to process the queue - for each email in the queue, send it asynch. When the queue is empty sleep for a while and check again. Have the completion delegate remove the sent or errored email from the queue and update the database, then call the fill-queue method to read more unsent emails into the queue (back up to the limit).
You'll only need locks around the queue operations, and will only have to manage (directly) the one thread to process the queue. You will never have more than N+1 threads active at once, where N is the queue limit.
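A rough sketch of that queue-processing loop, with FillQueue, SendAsync and the done flag as placeholders:
Queue<MailMessage> queue = new Queue<MailMessage>(); // capped at 10 by FillQueue
object queueLock = new object();

void QueueThread()
{
    while (!done) // 'done' is set once there is nothing left to send
    {
        MailMessage next = null;
        lock (queueLock)
        {
            if (queue.Count > 0)
                next = queue.Dequeue();
        }

        if (next != null)
            SendAsync(next); // completion delegate updates the DB and calls FillQueue()
        else
            Thread.Sleep(500); // queue empty: sleep a while and check again
    }
}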
I believe your hacky solution actually would work. Just make sure you have a lock statement around the bits where you increment and decrement the counter:
class EmailSender
{
    object SimultaneousEmailsLock = new object();
    int SimultaneousEmails;
    public string[] Recipients;

    void SendAll()
    {
        foreach (string Recipient in Recipients)
        {
            while (SimultaneousEmails > 10) Thread.Sleep(10);
            SendAnEmail(Recipient);
        }
    }

    void SendAnEmail(string Recipient)
    {
        lock (SimultaneousEmailsLock)
        {
            SimultaneousEmails++;
        }
        // ... send it ...
    }

    void FinishedEmailCallback()
    {
        lock (SimultaneousEmailsLock)
        {
            SimultaneousEmails--;
        }
        // ... etc ...
    }
}
I would add all my messages to a Queue, and then spawn e.g. 10 threads which send emails until the Queue is empty. Pseudo'ish C# (probably won't compile):
class EmailSender
{
    Queue<Message> messages;
    List<Thread> threads;

    public void Send(IEnumerable<Message> messages, int threadCount)
    {
        this.messages = new Queue<Message>(messages);
        this.threads = new List<Thread>();

        while (threadCount-- > 0)
            this.threads.Add(new Thread(SendMessages));

        this.threads.ForEach(t => t.Start());

        while (this.threads.Any(t => t.IsAlive))
            Thread.Sleep(50);
    }

    private void SendMessages()
    {
        while (true)
        {
            Message m;
            lock (messages)
            {
                try
                {
                    m = messages.Dequeue();
                }
                catch (InvalidOperationException)
                {
                    // No more messages
                    return;
                }
            }
            // Send message in some way. Not in an async way,
            // since we are already kind of async.
            Thread.Sleep(10); // Perhaps take a quick rest
        }
    }
}
If the message is the same, and just having many recipients, just swap the Message with a Recipient, and add a single Message parameter to the Send method.
You could use a .NET Timer to set up the schedule for sending messages. Whenever the timer fires, grab the next 10 messages and send them all, and repeat. Or if you want a general rate (10 messages per second), you could have the timer fire every 100ms and send a single message on each tick.
If you need more advanced scheduling, you could look at a scheduling framework like Quartz.NET
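A sketch of the timer variant (GetNextMessages and SendAnEmail are hypothetical placeholders):
// Fires once per second and sends the next batch of up to 10 messages.
var timer = new System.Timers.Timer(1000);
timer.Elapsed += (sender, e) =>
{
    foreach (MailMessage message in GetNextMessages(10))
        SendAnEmail(message);
};
timer.Start();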
Isn't this something that Thread.Sleep() can handle?
You are correct in thinking that background threading can serve a good purpose here. Basically what you want to do is create a background thread for this process, let it run its own way, delays and all, and then terminate the thread when the process is done, or leave it running indefinitely (turning it into a Windows Service or something similar would be a good idea).
A little intro on multi-threading can be read here (with Thread.Sleep included!).
A nice intro on Windows Services can be read here.