Prevent race condition in efficient consumer-producer model - C#

What I am trying to achieve is a consumer-producer setup. There can be many producers but only one consumer. A dedicated consumer is not an option because of scalability, so the idea is to have a producer start the consuming process if there is data to be consumed and there is currently no active consumer.
1. Many threads can be producing messages. (Asynchronous)
2. Only one thread can be consuming messages. (Synchronous)
3. We should only have a consumer running if there is data to be consumed.
4. A continuous consumer that waits for data would not be efficient once we add many of these classes.
In my example I have a set of methods that send data. Multiple threads can write data with Write(), but only one of those threads will loop and send data in SendNewData(). The reason only one loop may send data is that the order of the data must be preserved, and with a WriteAsync() whose timing is out of our control, we can only guarantee order by running one WriteAsync() at a time.
The problem I have is this: when a thread calls Write() to produce, it queues the data and checks Interlocked.CompareExchange to see if there is a consumer. If it sees that another thread is already in the consuming loop, it assumes that consumer will send the data. This is a problem if the looping consumer thread is at "Race Point A", because that consumer has already checked that there are no more messages to send and is about to shut down the consuming process.
Is there a way to prevent this race condition without locking a large part of the code? The real scenario has many queues and is a bit more complex than this.
In the real code, List<INetworkSerializable> is actually a byte[] BufferPool; I used List in the example to make the block easier to read.
With 1000s of these classes active at once, I cannot afford to have SendNewData loop continuously on a dedicated thread. The looping thread should only be active when there is data to send.
public void Write(INetworkSerializable messageToSend)
{
    Queue.Enqueue(messageToSend);
    // Check if there are any current consumers. If not, then we should instigate the consuming.
    if (Interlocked.CompareExchange(ref RunningWrites, 1, 0) == 0)
    {
        // We are now the thread that consumes and sends data
        SendNewData();
    }
}
//Only one thread should be looping here to keep consuming and sending data synchronously.
private void SendNewData()
{
    INetworkSerializable dataToSend;
    List<INetworkSerializable> dataToSendList = new List<INetworkSerializable>();
    while (true)
    {
        if (!Queue.TryDequeue(out dataToSend))
        {
            // Race Point A
            if (dataToSendList.Count == 0)
            {
                // All data is sent; return so that another thread can take responsibility.
                Interlocked.Decrement(ref RunningWrites);
                return;
            }
            // We have data in the list to send but nothing more to consume, so let's send the data that we do have.
            break;
        }
        dataToSendList.Add(dataToSend);
    }
    // Async callback is WriteAsyncCallback()
    WriteAsync(dataToSendList);
}
// Callback after WriteAsync() has sent the data.
private void WriteAsyncCallback()
{
    // Data was written to the sockets; now let's loop back for more data
    SendNewData();
}

It sounds like you would be better off with the producer-consumer pattern that is easily implemented with the BlockingCollection:
var toSend = new BlockingCollection<something>();

// producers
toSend.Add(something);

// when all producers are done
toSend.CompleteAdding();

// consumer -- this won't end until CompleteAdding is called
foreach (var item in toSend.GetConsumingEnumerable())
    Send(item);
To address the comment of knowing when to call CompleteAdding, I would launch the 1000s of producers as tasks, wait for all those tasks to complete (Task.WaitAll), and then call CompleteAdding. There are good overloads taking in CancellationTokens that would give you better control, if needed.
Also, TPL is pretty good about scheduling off blocked threads.
More complete code:
var toSend = new BlockingCollection<int>();
Parallel.Invoke(() => Produce(toSend), () => Consume(toSend));

...

private static void Consume(BlockingCollection<int> toSend)
{
    foreach (var value in toSend.GetConsumingEnumerable())
    {
        Console.WriteLine("Sending {0}", value);
    }
}

private static void Produce(BlockingCollection<int> toSend)
{
    Action<int> generateToSend = toSend.Add;
    var producers = Enumerable.Range(0, 1000)
        .Select(n => new Task(value => generateToSend((int)value), n))
        .ToArray();
    foreach (var p in producers)
    {
        p.Start();
    }
    Task.WaitAll(producers);
    toSend.CompleteAdding();
}
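As a hedged sketch of those CancellationToken overloads (the token source and the try/catch shape are mine, not part of the code above):
var cts = new CancellationTokenSource();
var queue = new BlockingCollection<int>();

// Producer: this Add overload can observe the token while waiting on a bounded collection.
queue.Add(42, cts.Token);

// Consumer: the enumeration throws OperationCanceledException if the token
// is cancelled before CompleteAdding is called.
try
{
    foreach (var item in queue.GetConsumingEnumerable(cts.Token))
        Console.WriteLine("Sending {0}", item);
}
catch (OperationCanceledException)
{
    // shutdown was requested before the producers finished
}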

Check this variant; there are some descriptive comments in the code.
Also notice how SendNewData and WriteAsyncCallback cooperate: the callback loops back into SendNewData once the asynchronous write completes, so the consumer role is not released until the queue is drained.
private int _pendingMessages;
private int _consuming;

public void Write(INetworkSerializable messageToSend)
{
    Interlocked.Increment(ref _pendingMessages);
    Queue.Enqueue(messageToSend);
    // Check if there is anyone consuming messages.
    // If not, we will have to become the consumer and process our own message,
    // and any further messages, until we have drained the queue.
    if (Interlocked.CompareExchange(ref _consuming, 1, 0) == 0)
    {
        // We are now the thread that consumes and sends data
        SendNewData();
    }
}
// Only one thread should be looping here to keep consuming and sending data synchronously.
private void SendNewData()
{
    INetworkSerializable dataToSend;
    var dataToSendList = new List<INetworkSerializable>();
    int messagesLeft = 1;
    do
    {
        if (!Queue.TryDequeue(out dataToSend))
        {
            // There is one possibility that we get here while _pendingMessages != 0:
            // some other thread has just increased _pendingMessages from 0 to 1 but hasn't put its message in the queue yet.
            if (dataToSendList.Count == 0)
            {
                if (_pendingMessages == 0)
                {
                    _consuming = 0;
                    // And since we have no data, this means we are safe to exit from the current thread.
                    return;
                }
                // A producer is mid-enqueue; spin and try the dequeue again.
                continue;
            }
            // We have data in the list to send but nothing more to consume, so let's send the data that we do have.
            break;
        }
        dataToSendList.Add(dataToSend);
        messagesLeft = Interlocked.Decrement(ref _pendingMessages);
    }
    while (messagesLeft > 0);

    // Async callback is WriteAsyncCallback()
    WriteAsync(dataToSendList);
}
private void WriteAsync(List<INetworkSerializable> dataToSendList)
{
    // some code
}

// Callback after WriteAsync() has sent the data.
private void WriteAsyncCallback()
{
    // ...
    SendNewData();
}
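One cautious aside on this variant: the plain store _consuming = 0 is not fenced. If you want to be conservative about how that store may be reordered around the _pendingMessages read next to it, an interlocked write is a drop-in swap:
// Conservative alternative to "_consuming = 0": a store with a full fence.
Interlocked.Exchange(ref _consuming, 0);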

The race condition can be prevented by the following change: double-check the Queue after we have declared that we are no longer the consumer.
if (dataToSendList.Count == 0)
{
    //Declare that we are no longer the consumer.
    Interlocked.Decrement(ref RunningWrites);
    //Double-check the queue to prevent race condition A.
    if (Queue.IsEmpty)
        return;
    //Race condition A occurred. There is data again.
    //Let's try to become a consumer.
    if (Interlocked.CompareExchange(ref RunningWrites, 1, 0) == 0)
        continue;
    //Another thread has nominated itself as the consumer. Our job is done.
    return;
}
break;
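For context, here is how that snippet slots into the question's SendNewData loop; everything outside the double-check block is unchanged from the question:
private void SendNewData()
{
    INetworkSerializable dataToSend;
    var dataToSendList = new List<INetworkSerializable>();
    while (true)
    {
        if (!Queue.TryDequeue(out dataToSend))
        {
            if (dataToSendList.Count == 0)
            {
                //Declare that we are no longer the consumer.
                Interlocked.Decrement(ref RunningWrites);
                //Double-check the queue to prevent race condition A.
                if (Queue.IsEmpty)
                    return;
                //Race condition A occurred: there is data again. Try to become the consumer.
                if (Interlocked.CompareExchange(ref RunningWrites, 1, 0) == 0)
                    continue;
                //Another thread has nominated itself as the consumer. Our job is done.
                return;
            }
            //Buffered data but nothing more to consume; send what we have.
            break;
        }
        dataToSendList.Add(dataToSend);
    }
    //Async callback is WriteAsyncCallback()
    WriteAsync(dataToSendList);
}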


How to use tasks with ConcurrentDictionary

I have to write a program where I'm reading from a database the queues to process and all the queues are run in parallel and managed on the parent thread using a ConcurrentDictionary.
I have a class that represents the queue, which has a constructor that takes in the queue information and the parent instance handle. The queue class also has the method that processes the queue.
Here is the Queue Class:
class MyQueue
{
    protected ServiceExecution _parent;
    protected string _queueID;

    public MyQueue(ServiceExecution parentThread, string queueID)
    {
        _parent = parentThread;
        _queueID = queueID;
    }

    public void Process()
    {
        try
        {
            //Do work to process
        }
        catch
        {
            //exception handling
        }
        finally
        {
            _parent.ThreadFinish(_queueID);
        }
    }
}
The parent thread loops through the dataset of queues and instantiates a new queue class. It spawns a new thread to execute the Process method of the Queue object asynchronously. The queue is added to the ConcurrentDictionary and its thread is then started, as follows:
private ConcurrentDictionary<string, MyQueue> _runningQueues = new ConcurrentDictionary<string, MyQueue>();

foreach (DataRow dr in QueueDataset.Rows)
{
    MyQueue queue = new MyQueue(this, dr["QueueID"].ToString());
    Thread t = new Thread(() => queue.Process());
    if (_runningQueues.TryAdd(dr["QueueID"].ToString(), queue))
    {
        t.Start();
    }
}

//Method that gets called by the queue thread when it finishes
public void ThreadFinish(string queueID)
{
    MyQueue queue;
    _runningQueues.TryRemove(queueID, out queue);
}
I have a feeling this is not the right approach to managing the asynchronous queue processing, and I'm wondering whether I can run into deadlocks with this design. Furthermore, I would like to use Tasks to run the queues asynchronously instead of new Threads. I need to keep track of the queues because I will not spawn a new thread or task for the same queue if the previous run is not complete yet. What is the best way to handle this type of parallelism?
Thanks in advance!
About your current approach
Indeed it is not the right approach. A high number of queues read from the database will spawn a high number of threads, which can be bad: you create a new thread each time. It is better to create a few threads and re-use them. And if you want tasks, it is better to create LongRunning tasks and re-use them.
Suggested Design
I'd suggest the following design:
Reserve only one task to read queues from the database and put those queues in a BlockingCollection;
Now start multiple LongRunning tasks to read a queue each from that BlockingCollection and process that queue;
When a task is done with processing the queue it took from the BlockingCollection, it will then take another queue from that BlockingCollection;
Optimize the number of these processing tasks so as to properly utilize the cores of your CPU. Usually, since DB interactions are slow, you can create 3 times more tasks than the number of cores; however, YMMV.
Deadlock possibility
They will at least not happen on the application side. However, since the queues consist of database transactions, deadlocks may still happen at the database end. You may have to write some logic to make your task start its transaction again if the database rolled it back as a deadlock victim; a hedged sketch follows.
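For illustration, a minimal retry sketch. The helper name, attempt count, and backoff are my own; the one hard fact used is that SQL Server reports deadlock victims with error number 1205 (surfaced as SqlException.Number):
// Hypothetical wrapper: re-run a queue's transactional work if the database
// rolled it back as a deadlock victim.
private static void ProcessWithRetry(MyQueue q, int maxAttempts = 3)
{
    for (int attempt = 1; ; attempt++)
    {
        try
        {
            q.Process(); // assumed to run its DB work inside a transaction
            return;
        }
        catch (SqlException ex)
        {
            if (ex.Number != 1205 || attempt >= maxAttempts)
                throw;
            Thread.Sleep(100 * attempt); // brief backoff, then start the transaction again
        }
    }
}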
Sample Code
private static void TaskDesignedRun()
{
    var expectedParallelQueues = 1024; // Optimize it. I've chosen it randomly.
    var parallelProcessingTaskCount = 4 * Environment.ProcessorCount; // Optimize this too.
    var baseProcessorTaskArray = new Task[parallelProcessingTaskCount];
    var taskFactory = new TaskFactory(TaskCreationOptions.LongRunning, TaskContinuationOptions.None);
    var itemsToProcess = new BlockingCollection<MyQueue>(expectedParallelQueues);

    //Start a new task to populate the "itemsToProcess"
    taskFactory.StartNew(() =>
    {
        // Add code to read queues and add them to itemsToProcess
        Console.WriteLine("Done reading all the queues...");
        // Finally signal that you are done by saying..
        itemsToProcess.CompleteAdding();
    });

    //Initializing the base tasks
    for (var index = 0; index < baseProcessorTaskArray.Length; index++)
    {
        baseProcessorTaskArray[index] = taskFactory.StartNew(() =>
        {
            // IsCompleted is true only once adding is complete AND the collection
            // is empty, so the workers drain everything before exiting.
            while (!itemsToProcess.IsCompleted)
            {
                MyQueue q;
                // The timeout keeps this from busy-spinning while the producer is slow.
                if (!itemsToProcess.TryTake(out q, 100)) continue;
                //Process your queue
            }
        });
    }

    //Now just wait till all queues in your database have been read and processed.
    Task.WaitAll(baseProcessorTaskArray);
}

BrokeredMessage Automatically Disposed after calling OnMessage()

I am trying to queue up items from an Azure Service Bus so I can process them in bulk. I am aware that the Azure Service Bus has a ReceiveBatch() but it seems problematic for the following reasons:
I can only get a max of 256 messages at a time, and even this can be random based on message size.
Even if I peek to see how many messages are waiting, I don't know how many ReceiveBatch calls to make, because I don't know how many messages each call will give me back. Since messages will keep coming in, I can't just continue to make requests until it's empty, since it will never be empty.
I decided to just use the message listener which is cheaper than doing wasted peeks and will give me more control.
Basically I am trying to let a set number of messages build up and then process them at once. I use a timer to force a delay, but I need to be able to queue my items as they come in.
Based on my timer requirement it seemed like the blocking collection was not a good option so I am trying to use ConcurrentBag.
var batchingQueue = new ConcurrentBag<BrokeredMessage>();

myQueueClient.OnMessage((m) =>
{
    Console.WriteLine("Queueing message");
    batchingQueue.Add(m);
});

while (true)
{
    var sw = WaitableStopwatch.StartNew();
    BrokeredMessage msg;
    while (batchingQueue.TryTake(out msg)) // <== Object is already disposed
    {
        // ...do this until I have a thousand ready to be written to DB in batch
        Console.WriteLine("Completing message");
        msg.Complete(); // <== ERRORS HERE
    }
    sw.Wait(MINIMUM_DELAY);
}
However, as soon as I access the message outside of the OnMessage pipeline, it shows the BrokeredMessage as already being disposed.
I am thinking this must be some automatic behavior of OnMessage and I don't see any way to do anything with the message other than process it right away which I don't want to do.
This is incredibly easy to do with BlockingCollection.
var batchingQueue = new BlockingCollection<BrokeredMessage>();

myQueueClient.OnMessage((m) =>
{
    Console.WriteLine("Queueing message");
    batchingQueue.Add(m);
});
And your consumer thread:
foreach (var msg in batchingQueue.GetConsumingEnumerable())
{
    Console.WriteLine("Completing message");
    msg.Complete();
}
GetConsumingEnumerable returns an iterator that consumes items in the queue until the IsCompleted property is set and the queue is empty. If the queue is empty but IsCompleted is False, it does a non-busy wait for the next item.
To cancel the consumer thread (i.e. shut down the program), you stop adding things to the queue and have the main thread call batchingQueue.CompleteAdding. The consumer will empty the queue, see that the IsCompleted property is True, and exit.
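A minimal sketch of that shutdown hand-off, reusing the names above (wrapping the consumer in a Task is my addition):
var consumer = Task.Run(() =>
{
    foreach (var msg in batchingQueue.GetConsumingEnumerable())
    {
        Console.WriteLine("Completing message");
        msg.Complete();
    }
});

// at shutdown, after the last producer is done:
batchingQueue.CompleteAdding(); // the consumer drains the queue, sees IsCompleted, and exits
consumer.Wait();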
Using BlockingCollection here is better than ConcurrentBag or ConcurrentQueue, because the BlockingCollection interface is easier to work with. In particular, the use of GetConsumingEnumerable relieves you from having to worry about checking the count or doing busy waits (polling loops). It just works.
Also note that ConcurrentBag has some rather strange removal behavior. In particular, the order in which items are removed differs depending on which thread removes the item. The thread that created the bag removes items in a different order than other threads. See Using the ConcurrentBag Collection for the details.
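To make the removal-order point concrete, a small single-threaded sketch; the order shown is an implementation detail, so treat it as typical rather than guaranteed:
var bag = new ConcurrentBag<int>();
bag.Add(1);
bag.Add(2);
bag.Add(3);

int item;
bag.TryTake(out item);
// On the thread that did the adding, this typically yields 3: each thread keeps
// a local list and takes from it LIFO, while other threads steal from the far end.
Console.WriteLine(item);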
You haven't said why you want to batch items on input. Unless there's an overriding performance reason to do so, it doesn't seem like a particularly good idea to complicate your code with that batching logic.
If you want to do batch writes to the database, then I would suggest using a simple List<T> to buffer the items. If you have to process the items before they're written to the database, then use the technique I showed above to process them. Then, rather than writing directly to the database, add the item to a list. When the list gets 1,000 items, or a given amount of time elapses, allocate a new list and start a task to write the old list to the database. Like this:
// at class scope

// Flush every 5 minutes.
private readonly TimeSpan FlushDelay = TimeSpan.FromMinutes(5);
private const int MaxBufferItems = 1000;

// The timer for the buffer flush. Create it in the constructor, since a field
// initializer cannot reference FlushDelay or the TimedFlush callback:
//   _flushTimer = new System.Threading.Timer(TimedFlush, null,
//       (int)FlushDelay.TotalMilliseconds, Timeout.Infinite);
private System.Threading.Timer _flushTimer;

// A lock for the list. Unless you're getting hundreds of thousands
// of items per second, this will not be a performance problem.
private object _listLock = new Object();
private List<BrokeredMessage> _recordBuffer = new List<BrokeredMessage>();
Then, in your consumer:
foreach (var msg in batchingQueue.GetConsumingEnumerable())
{
    // process the message
    Console.WriteLine("Completing message");
    msg.Complete();
    lock (_listLock)
    {
        _recordBuffer.Add(msg);
        if (_recordBuffer.Count >= MaxBufferItems)
        {
            // Stop the timer
            _flushTimer.Change(Timeout.Infinite, Timeout.Infinite);
            // Save the old list and allocate a new one
            var myList = _recordBuffer;
            _recordBuffer = new List<BrokeredMessage>();
            // Start a task to write to the database
            Task.Factory.StartNew(() => FlushBuffer(myList));
            // Restart the timer
            _flushTimer.Change((int)FlushDelay.TotalMilliseconds, Timeout.Infinite);
        }
    }
}
private void TimedFlush(object state) // matches the TimerCallback signature
{
    bool lockTaken = false;
    List<BrokeredMessage> myList = null;
    try
    {
        // This TryEnter overload takes the flag by ref and returns void.
        Monitor.TryEnter(_listLock, 0, ref lockTaken);
        if (lockTaken)
        {
            // Save the old list and allocate a new one
            myList = _recordBuffer;
            _recordBuffer = new List<BrokeredMessage>();
        }
    }
    finally
    {
        if (lockTaken)
        {
            Monitor.Exit(_listLock);
        }
    }
    if (myList != null)
    {
        FlushBuffer(myList);
    }
    // Restart the timer
    _flushTimer.Change((int)FlushDelay.TotalMilliseconds, Timeout.Infinite);
}
The idea here is that you get the old list out of the way, allocate a new list so that processing can continue, and then write the old list's items to the database. The lock is there to prevent the timer and the record counter from stepping on each other. Without the lock, things would likely appear to work fine for a while, and then you'd get weird crashes at unpredictable times.
I like this design because it eliminates polling by the consumer. The only thing I don't like is that the consumer has to be aware of the timer (i.e. it has to stop and then restart the timer). With a little more thought, I could eliminate that requirement. But it works well the way it's written.
Switching to OnMessageAsync solved the problem for me
_queueClient.OnMessageAsync(async receivedMessage =>
I reached out to Microsoft about the BrokeredMessage-being-disposed issue on MSDN; this is the response:
Very basic rule, and I am not sure if this is documented. The received message needs to be processed within the callback function's lifetime. In your case, messages will be disposed when the async callback completes; this is why your Complete attempts are failing with ObjectDisposedException in another thread.
I don't really see how queuing messages for further processing helps the throughput. This will add more burden to the client for sure. Try processing the message in the async callback; that should be performant enough.
In my case that means I can't use ServiceBus in the way I wanted to, and I have to re-think how I wanted things to work. Bugger.
I had the same issue when I started working with the Azure Service Bus service.
I found that the OnMessage method always disposes the BrokeredMessage object. The approach proposed by Jim Mischel didn't help me (but it was very interesting to read - thanks!).
After some investigation I found that the whole approach is wrong. Let me explain the right way to do what you want.
Use the BrokeredMessage.Complete() method only inside the OnMessage handler.
If you need to process the message outside of this method, you should use QueueClient.Complete(Guid lockToken). LockToken is a property of the BrokeredMessage object.
Example:
var messageOptions = new OnMessageOptions
{
    AutoComplete = false,
    AutoRenewTimeout = TimeSpan.FromMinutes(5),
    MaxConcurrentCalls = 1
};

var buffer = new Dictionary<string, Guid>();

// get messages from the queue
myQueueClient.OnMessage(
    m => buffer.Add(key: m.GetBody<string>(), value: m.LockToken),
    messageOptions // these options tell Service Bus to keep the message locked in the queue until we complete it
);

foreach (var item in buffer)
{
    try
    {
        Console.WriteLine($"Process item: {item.Key}");
        myQueueClient.Complete(item.Value); // you can also use CompleteBatch(...) to improve performance
    }
    catch
    {
        // give the message back: Defer parks it in the queue so it can be retrieved later
        myQueueClient.Defer(item.Value);
    }
}
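The CompleteBatch(...) variant mentioned in the comment would look roughly like this, assuming every buffered lock token is still valid:
// Complete all buffered messages in one service call instead of one at a time.
myQueueClient.CompleteBatch(buffer.Values);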
My solution was to get the message's SequenceNumber, then defer the message and add the SequenceNumber to the BlockingCollection. Once the BlockingCollection picks up a new item, it can receive the deferred message by its SequenceNumber and mark the message as complete. If for some reason the BlockingCollection doesn't process the SequenceNumber, it will remain in the queue as deferred so it can be picked up later when the process is restarted. This protects against losing messages if the process terminates abnormally while there are still items in the BlockingCollection.
BlockingCollection<long> queueSequenceNumbers = new BlockingCollection<long>();

//This finds any deferred/unfinished messages on startup.
BrokeredMessage existingMessage = client.Peek();
while (existingMessage != null)
{
    if (existingMessage.State == MessageState.Deferred)
    {
        queueSequenceNumbers.Add(existingMessage.SequenceNumber);
    }
    existingMessage = client.Peek();
}

//setup the message handler
Action<BrokeredMessage> processMessage = new Action<BrokeredMessage>((message) =>
{
    try
    {
        //skip deferred messages that are already in the queueSequenceNumbers collection.
        if (message.State != MessageState.Deferred || !queueSequenceNumbers.Any(x => x == message.SequenceNumber))
        {
            message.Defer();
            queueSequenceNumbers.Add(message.SequenceNumber);
        }
    }
    catch (Exception ex)
    {
        // Indicates a problem; unlock the message in the queue
        message.Abandon();
    }
});

// Callback to handle newly received messages
client.OnMessage(processMessage, new OnMessageOptions() { AutoComplete = false, MaxConcurrentCalls = 1 });

//start the blocking loop to process messages as they are added to the collection
foreach (var queueSequenceNumber in queueSequenceNumbers.GetConsumingEnumerable())
{
    var message = client.Receive(queueSequenceNumber);
    //mark the message as complete so it's removed from the queue
    message.Complete();
    //do something with the message
}

How will Parallel.Foreach behave when Iterating over the results of a method call?

Scope:
I am currently implementing an application that uses Amazon SQS Service as a provider of data for this program to process.
Since I need parallel processing over the messages dequeued from this queue, this is what I did.
Parallel.ForEach(GetMessages(msgAttributes), new ParallelOptions { MaxDegreeOfParallelism = threadCount }, message =>
{
    // Processing Logic
});
Here's the header of the "GetMessages" method:
private static IEnumerable<Message> GetMessages(List<String> messageAttributes = null)
{
    // Dequeueing Logic... 10 at a time
    // Yielding the messages to the Parallel loop
    foreach (Message awsMessage in messages)
    {
        yield return awsMessage;
    }
}
How will this work?
My initial thought was that the GetMessages method would be executed whenever the threads had no work (or a good number of threads had no work; something like an internal heuristic would measure this). GetMessages would then distribute the messages to the Parallel.ForEach worker threads, which would process the messages and wait for the handler to give them more messages to work on.
Problem? I was wrong...
The thing is, I was wrong, and I still have no idea what's happening in this situation.
The number of messages being dequeued is way too high, and it grows by powers of 2 every time a dequeue happens. The dequeue count (messages) goes as follows:
Dequeue is Called: Returns 80 Messages
Dequeue is Called: Returns 160 Messages
Dequeue is Called: Returns 320 Messages (and so forth)
After a certain point, the number of messages being dequeued, or, in this case, waiting to be processed by my application is too high and I end up running out of memory.
More Information:
I am using thread-safe Interlocked operations to increment the counters mentioned above.
The number of threads being used is 25 (for the Parallel.Foreach)
Each "GetMessages" will return up to 10 messages (as an IEnumerable, yielded).
Question: What exactly is happening in this scenario?
I am having a hard time trying to figure out what exactly is going on. Is my GetMessages method being invoked by each thread once it finishes the "Processing Loop", hence leading to more and more messages being dequeued?
Is the call to GetMessages made by a single thread, or is it being called by multiple threads?
I think there's an issue with Parallel.ForEach partitioning... Your question is a typical producer/consumer scenario. For such a case, you should have independent logic for dequeuing on one side and processing on the other. That respects separation of concerns and simplifies debugging.
BlockingCollection<T> lets you separate both: on one side you add items to be processed, and on the other you consume them. Here's an example of how to implement it.
You will need the ParallelExtensionsExtras NuGet package for BlockingCollection<T> workload partitioning (the .GetConsumingEnumerable() call in the process method).
public static class ProducerConsumer
{
    public static ConcurrentQueue<String> SqsQueue = new ConcurrentQueue<String>();
    public static BlockingCollection<String> Collection = new BlockingCollection<String>();
    public static ConcurrentBag<String> Result = new ConcurrentBag<String>();

    public static async Task TestMethod()
    {
        // Here we separate all the Tasks in distinct threads
        Task sqs = Task.Run(async () =>
        {
            Console.WriteLine("Amazon on thread " + Thread.CurrentThread.ManagedThreadId.ToString());
            while (true)
            {
                ProducerConsumer.BackgroundFakedAmazon(); // We produce 50 Strings each second
                await Task.Delay(1000);
            }
        });

        Task deq = Task.Run(async () =>
        {
            Console.WriteLine("Dequeue on thread " + Thread.CurrentThread.ManagedThreadId.ToString());
            while (true)
            {
                ProducerConsumer.DequeueData(); // Dequeue 20 Strings every 100ms
                await Task.Delay(100);
            }
        });

        Task process = Task.Run(() =>
        {
            Console.WriteLine("Process on thread " + Thread.CurrentThread.ManagedThreadId.ToString());
            ProducerConsumer.BackgroundParallelConsumer(); // Process all the Strings in the BlockingCollection
        });

        await Task.WhenAll(sqs, deq, process);
    }

    public static void DequeueData()
    {
        foreach (var i in Enumerable.Range(0, 20))
        {
            String dequeued = null;
            if (SqsQueue.TryDequeue(out dequeued))
            {
                Collection.Add(dequeued);
                Console.WriteLine("Dequeued : " + dequeued);
            }
        }
    }

    public static void BackgroundFakedAmazon()
    {
        Console.WriteLine(" ---------- Generate 50 items on amazon side ---------- ");
        foreach (var data in Enumerable.Range(0, 50).Select(i => Path.GetRandomFileName().Split('.').FirstOrDefault()))
            SqsQueue.Enqueue(data + " / ASQS");
    }

    public static void BackgroundParallelConsumer()
    {
        // Here we stay in Parallel.ForEach, waiting for data. Once processed, we keep waiting for the next chunks
        Parallel.ForEach(Collection.GetConsumingEnumerable(), (i) =>
        {
            // Processing Logic
            String processedData = "Processed : " + i;
            Result.Add(processedData);
            Console.WriteLine(processedData);
        });
    }
}
You can try it from a console app like this:
static void Main(string[] args)
{
    ProducerConsumer.TestMethod().Wait();
}
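On the partitioning issue mentioned at the start of this answer: if you are on .NET 4.5, a hedged alternative is a non-buffering partitioner, which makes each Parallel.ForEach worker pull one item at a time instead of buffering a chunk per thread (this sketch reuses the Collection and Result fields from the class above):
public static void BackgroundParallelConsumerNoBuffering()
{
    // NoBuffering: workers take single items from the BlockingCollection
    // rather than letting Parallel.ForEach buffer chunks per thread.
    var partitioner = Partitioner.Create(
        Collection.GetConsumingEnumerable(),
        EnumerablePartitionerOptions.NoBuffering);

    Parallel.ForEach(partitioner, i =>
    {
        Result.Add("Processed : " + i);
    });
}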

C# Threading in Windows service creating issue

I have an issue with an email-sending Windows service. The service runs every three minutes, gets the messages to send from the db, and starts sending them. Here is what the code looks like:
MessageFilesHandler MFHObj = new MessageFilesHandler();
List<Broadcostmsg> imidiateMsgs = Manager.GetImidiateBroadCastMsgs(conString);
if (imidiateMsgs.Count > 0)
{
    // WriteToFileImi(strLog);
    Thread imMsgThread = new Thread(new ParameterizedThreadStart(MFHObj.SendImidiatBroadcast));
    imMsgThread.IsBackground = true;
    imMsgThread.Start(imidiateMsgs);
}
This sends messages to large lists and takes a long time to finish sending to a larger list. Now the problem occurs when one message is still sending and the service gets a new message to send: the previous sending is halted and the new message sending starts, even though I am using threads; each time the service gets a message to send, it initiates a new thread.
Can you please help me find where the mistake in the code is?
I think you are using your code inside a loop which WAITS for new messages. Did you manage those waits? Let's see:
while (imidiateMsgs.Count == 0)
{
    //Wait for new message
}
//Now you have a new message here
//Make a new thread to process the message
There are different methods for that wait; I suggest using a blocking queue:
In the public area:
BlockingCollection<Broadcostmsg> imidiateMsgs = new BlockingCollection<Broadcostmsg>();
In your consumer (the thread that processes messages):
Broadcostmsg msg = imidiateMsgs.Take(); // this will block until a new message arrives
//Now you have a new message here
//Process the message
In your producer (the thread that generates messages):
imidiateMsgs.Add(msg);
And you should use a thread pool to answer messages rather than initializing a new thread each time; a rough consumer sketch follows.
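A rough sketch of such a consumer, reusing the question's names; whether SendImidiatBroadcast can accept a single message rather than a list is my assumption:
BlockingCollection<Broadcostmsg> imidiateMsgs = new BlockingCollection<Broadcostmsg>();

// One long-lived background consumer instead of a new Thread per batch.
Thread consumer = new Thread(() =>
{
    while (true)
    {
        Broadcostmsg msg = imidiateMsgs.Take(); // blocks until a message arrives
        MFHObj.SendImidiatBroadcast(msg);       // send one message at a time, preserving order
    }
});
consumer.IsBackground = true;
consumer.Start();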
It looks like the requirement is to build a producer-consumer queue, in which the producer keeps adding messages to a list and the consumer picks items from that list and does some work with them.
My only worry is that you are creating a new Thread to send each email rather than picking threads from a thread pool. If you keep creating more and more threads, the performance of your application will degrade due to the overhead of context switching.
If you are using .NET Framework 4.0, the solution becomes pretty easy. You can use System.Collections.Concurrent.ConcurrentQueue for enqueuing and dequeuing your items. It's thread safe, so no lock objects are required. Use Tasks to process your messages.
BlockingCollection takes an IProducerConsumerCollection in its constructor, or it will use a ConcurrentQueue by default if you call its empty constructor.
So, to enqueue your messages:
//define a blocking collection
var blockingCollection = new BlockingCollection<string>();
int count = 0;

//Producer
Task.Factory.StartNew(() =>
{
    while (true)
    {
        blockingCollection.Add("value" + count);
        count++;
    }
});

//Consumer
//GetConsumingEnumerable waits until it finds an item to work on;
//it replaces the while(true) loop we would otherwise put inside the consumer.
Task.Factory.StartNew(() =>
{
    foreach (string value in blockingCollection.GetConsumingEnumerable())
    {
        Console.WriteLine("Worker 1: " + value);
    }
});
UPDATE
Since you are using Framework 3.5, I suggest you have a look at Joseph Albahari's implementation of a producer/consumer queue. It's one of the best you will find.
Taking the code directly from the link above:
public class PCQueue
{
    readonly object _locker = new object();
    Thread[] _workers;
    Queue<Action> _itemQ = new Queue<Action>();

    public PCQueue(int workerCount)
    {
        _workers = new Thread[workerCount];

        // Create and start a separate thread for each worker
        for (int i = 0; i < workerCount; i++)
            (_workers[i] = new Thread(Consume)).Start();
    }

    public void Shutdown(bool waitForWorkers)
    {
        // Enqueue one null item per worker to make each exit.
        foreach (Thread worker in _workers)
            EnqueueItem(null);

        // Wait for workers to finish
        if (waitForWorkers)
            foreach (Thread worker in _workers)
                worker.Join();
    }

    public void EnqueueItem(Action item)
    {
        lock (_locker)
        {
            _itemQ.Enqueue(item);    // We must pulse because we're
            Monitor.Pulse(_locker);  // changing a blocking condition.
        }
    }

    void Consume()
    {
        while (true)                  // Keep consuming until
        {                             // told otherwise.
            Action item;
            lock (_locker)
            {
                while (_itemQ.Count == 0) Monitor.Wait(_locker);
                item = _itemQ.Dequeue();
            }
            if (item == null) return; // This signals our exit.
            item();                   // Execute item.
        }
    }
}
The advantage of this approach is that you control the number of threads created, which helps you optimize performance. With the thread-pool approach, although it's safe, you cannot control the number of threads that may run simultaneously.

Concurrent collections eating too much cpu without Thread.Sleep

What would be the correct usage of either BlockingCollection or ConcurrentQueue, so you can freely dequeue items without burning half or more of your CPU on a thread?
I was running some tests using 2 threads and unless I had a Thread.Sleep of at least 50~100ms it would always hit at least 50% of my CPU.
Here is a fictional example:
private void _DequeueItem()
{
    object o = null;
    while (socket.Connected)
    {
        while (!listOfQueueItems.IsEmpty)
        {
            if (listOfQueueItems.TryDequeue(out o))
            {
                // use the data
            }
        }
    }
}
With the above example I would have to add a Thread.Sleep so the CPU doesn't blow up.
Note: I have also tried it without the while loop for the IsEmpty check; the result was the same.
It is not because of the BlockingCollection or ConcurrentQueue, but the while loop:
while (socket.Connected)
{
    while (!listOfQueueItems.IsEmpty)
    { /*code*/ }
}
Of course it will take the CPU down, because if the queue is empty, then the while loop is just like:
while (true) ;
which in turn will eat the CPU resources.
This is not a good way of using ConcurrentQueue. You should use an AutoResetEvent with it, so that whenever an item is added you will be notified.
Example:
private ConcurrentQueue<Data> _queue = new ConcurrentQueue<Data>();
private AutoResetEvent _queueNotifier = new AutoResetEvent(false);

//at the producer:
_queue.Enqueue(new Data());
_queueNotifier.Set();

//at the consumer:
while (true) //or some condition
{
    _queueNotifier.WaitOne(); //here we block until we receive the signal notification
    Data data;
    //drain the queue: an AutoResetEvent does not count signals, so several
    //enqueues may be covered by a single wake-up.
    while (_queue.TryDequeue(out data))
    {
        //handle the data
    }
}
For good usage of the BlockingCollection, you should use GetConsumingEnumerable() to wait for items to be added, like:
//declare the buffer
private BlockingCollection<Data> _buffer = new BlockingCollection<Data>(new ConcurrentQueue<Data>());

//at the producer method:
_buffer.Add(new Data());

//at the consumer
//the loop blocks automatically, waiting for new items to be added, without taking the CPU down
foreach (Data data in _buffer.GetConsumingEnumerable())
{
    //handle the data here.
}
You really want to be using the BlockingCollection class in this case. It is designed to block until an item appears in the queue. A collection of this nature is often referred to as a blocking queue. This particular implementation is safe for multiple producers and multiple consumers. That is something that is surprisingly difficult to get right if you tried implementing it yourself. Here is what your code would look like if you used BlockingCollection.
private void _DequeueItem()
{
    while (socket.Connected)
    {
        object o = listOfQueueItems.Take();
        // use the data
    }
}
The Take method blocks automatically if the queue is empty. It blocks in a manner that puts the thread into the WaitSleepJoin state, so it will not consume CPU resources. The neat thing about BlockingCollection is that it also uses low-lock strategies to increase performance. What this means is that Take will check to see if there is an item in the queue and, if not, it will briefly perform a spin wait to prevent a context switch of the thread. If the queue is still empty, it will put the thread to sleep. This means that BlockingCollection has some of the performance benefits that ConcurrentQueue provides with regard to concurrent execution.
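One caveat with the Take version above: it blocks indefinitely, so socket.Connected is only re-checked after an item arrives. A hedged variant that wakes periodically uses the TryTake overload with a timeout:
private void _DequeueItem()
{
    object o;
    while (socket.Connected)
    {
        // Wake at least twice per second to re-check socket.Connected.
        if (listOfQueueItems.TryTake(out o, TimeSpan.FromMilliseconds(500)))
        {
            // use the data
        }
    }
}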
You can call Thread.Sleep() only when the queue is empty:
private void DequeueItem()
{
    object o = null;
    while (socket.Connected)
    {
        if (listOfQueueItems.IsEmpty)
        {
            Thread.Sleep(50);
        }
        else if (listOfQueueItems.TryDequeue(out o))
        {
            // use the data
        }
    }
}
Otherwise you should consider using events.
