Concurrent collections eating too much cpu without Thread.Sleep - c#

What would be the correct usage of either, BlockingCollection or ConcurrentQueue so you can freely dequeue items without burning out half or more of your CPU using a thread ?
I was running some tests using 2 threads and unless I had a Thread.Sleep of at least 50~100ms it would always hit at least 50% of my CPU.
Here is a fictional example:
private void _DequeueItem()
{
object o = null;
while(socket.Connected)
{
while (!listOfQueueItems.IsEmpty)
{
if (listOfQueueItems.TryDequeue(out o))
{
// use the data
}
}
}
}
With the above example I would have to set a thread.sleep so the cpu doesnt blow up.
Note: I have also tried it without the while for IsEmpty check, result was the same.

It is not because of the BlockingCollection or ConcurrentQueue, but the while loop:
while(socket.Connected)
{
while (!listOfQueueItems.IsEmpty)
{ /*code*/ }
}
Of course it will take the cpu down; because of if the queue is empty, then the while loop is just like:
while (true) ;
which in turn will eat the cpu resources.
This is not a good way of using ConcurrentQueue you should use AutoResetEvent with it so whenever item is added you will be notified.
Example:
private ConcurrentQueue<Data> _queue = new ConcurrentQueue<Data>();
private AutoResetEvent _queueNotifier = new AutoResetEvent(false);
//at the producer:
_queue.Enqueue(new Data());
_queueNotifier.Set();
//at the consumer:
while (true)//or some condition
{
_queueNotifier.WaitOne();//here we will block until receive signal notification.
Data data;
if (_queue.TryDequeue(out data))
{
//handle the data
}
}
For a good usage of the BlockingCollection you should use the GetConsumingEnumerable() to wait for the items to be added, Like:
//declare the buffer
private BlockingCollection<Data> _buffer = new BlockingCollection<Data>(new ConcurrentQueue<Data>());
//at the producer method:
_messageBuffer.Add(new Data());
//at the consumer
foreach (Data data in _buffer.GetConsumingEnumerable())//it will block here automatically waiting from new items to be added and it will not take cpu down
{
//handle the data here.
}

You really want to be using the BlockingCollection class in this case. It is designed to block until an item appears in the queue. A collection of this nature is often referred to as a blocking queue. This particular implementation is safe for multiple producers and multiple consumers. That is something that is surprisingly difficult to get right if you tried implementing it yourself. Here is what your code would look like if you used BlockingCollection.
private void _DequeueItem()
{
while(socket.Connected)
{
object o = listOfQueueItems.Take();
// use the data
}
}
The Take method blocks automatically if the queue is empty. It blocks in a manner that puts the thread in the SleepWaitJoin state so that it will not consume CPU resources. The neat thing about BlockingCollection is that it also uses low-lock strategies to increase performance. What this means is that Take will check to see if there is an item in the queue and if not then it will briefly perform a spin wait to prevent a context switch of the thread. If the queue is still empty then it will put the thread to sleep. This means that BlockingCollection will have some of the performance benefits that ConcurrentQueue provides in regards to concurrent execution.

You can call Thread.Sleep() only when queue is empty:
private void DequeueItem()
{
object o = null;
while(socket.Connected)
{
if (listOfQueueItems.IsEmpty)
{
Thread.Sleep(50);
}
else if (listOfQueueItems.TryDequeue(out o))
{
// use the data
}
}
}
Otherwise you should consider to use events.

Related

Understanding the C# lock [duplicate]

I am having hard time in understanding Wait(), Pulse(), PulseAll(). Will all of them avoid deadlock? I would appreciate if you explain how to use them?
Short version:
lock(obj) {...}
is short-hand for Monitor.Enter / Monitor.Exit (with exception handling etc). If nobody else has the lock, you can get it (and run your code) - otherwise your thread is blocked until the lock is aquired (by another thread releasing it).
Deadlock typically happens when either A: two threads lock things in different orders:
thread 1: lock(objA) { lock (objB) { ... } }
thread 2: lock(objB) { lock (objA) { ... } }
(here, if they each acquire the first lock, neither can ever get the second, since neither thread can exit to release their lock)
This scenario can be minimised by always locking in the same order; and you can recover (to a degree) by using Monitor.TryEnter (instead of Monitor.Enter/lock) and specifying a timeout.
or B: you can block yourself with things like winforms when thread-switching while holding a lock:
lock(obj) { // on worker
this.Invoke((MethodInvoker) delegate { // switch to UI
lock(obj) { // oopsiee!
...
}
});
}
The deadlock appears obvious above, but it isn't so obvious when you have spaghetti code; possible answers: don't thread-switch while holding locks, or use BeginInvoke so that you can at least exit the lock (letting the UI play).
Wait/Pulse/PulseAll are different; they are for signalling. I use this in this answer to signal so that:
Dequeue: if you try to dequeue data when the queue is empty, it waits for another thread to add data, which wakes up the blocked thread
Enqueue: if you try and enqueue data when the queue is full, it waits for another thread to remove data, which wakes up the blocked thread
Pulse only wakes up one thread - but I'm not brainy enough to prove that the next thread is always the one I want, so I tend to use PulseAll, and simply re-verify the conditions before continuing; as an example:
while (queue.Count >= maxSize)
{
Monitor.Wait(queue);
}
With this approach, I can safely add other meanings of Pulse, without my existing code assuming that "I woke up, therefore there is data" - which is handy when (in the same example) I later needed to add a Close() method.
Simple recipe for use of Monitor.Wait and Monitor.Pulse. It consists of a worker, a boss, and a phone they use to communicate:
object phone = new object();
A "Worker" thread:
lock(phone) // Sort of "Turn the phone on while at work"
{
while(true)
{
Monitor.Wait(phone); // Wait for a signal from the boss
DoWork();
Monitor.PulseAll(phone); // Signal boss we are done
}
}
A "Boss" thread:
PrepareWork();
lock(phone) // Grab the phone when I have something ready for the worker
{
Monitor.PulseAll(phone); // Signal worker there is work to do
Monitor.Wait(phone); // Wait for the work to be done
}
More complex examples follow...
A "Worker with something else to do":
lock(phone)
{
while(true)
{
if(Monitor.Wait(phone,1000)) // Wait for one second at most
{
DoWork();
Monitor.PulseAll(phone); // Signal boss we are done
}
else
DoSomethingElse();
}
}
An "Impatient Boss":
PrepareWork();
lock(phone)
{
Monitor.PulseAll(phone); // Signal worker there is work to do
if(Monitor.Wait(phone,1000)) // Wait for one second at most
Console.Writeline("Good work!");
}
No, they don't protect you from deadlocks. They are just more flexible tools for thread synchronization. Here is a very good explanation how to use them and very important pattern of usage - without this pattern you will break all the things:
http://www.albahari.com/threading/part4.aspx
Something that total threw me here is that Pulse just gives a "heads up" to a thread in a Wait. The Waiting thread will not continue until the thread that did the Pulse gives up the lock and the waiting thread successfully wins it.
lock(phone) // Grab the phone
{
Monitor.PulseAll(phone); // Signal worker
Monitor.Wait(phone); // ****** The lock on phone has been given up! ******
}
or
lock(phone) // Grab the phone when I have something ready for the worker
{
Monitor.PulseAll(phone); // Signal worker there is work to do
DoMoreWork();
} // ****** The lock on phone has been given up! ******
In both cases it's not until "the lock on phone has been given up" that another thread can get it.
There might be other threads waiting for that lock from Monitor.Wait(phone) or lock(phone). Only the one that wins the lock will get to continue.
They are tools for synchronizing and signaling between threads. As such they do nothing to prevent deadlocks, but if used correctly they can be used to synchronize and communicate between threads.
Unfortunately most of the work needed to write correct multithreaded code is currently the developers' responsibility in C# (and many other languages). Take a look at how F#, Haskell and Clojure handles this for an entirely different approach.
Unfortunately, none of Wait(), Pulse() or PulseAll() have the magical property which you are wishing for - which is that by using this API you will automatically avoid deadlock.
Consider the following code
object incomingMessages = new object(); //signal object
LoopOnMessages()
{
lock(incomingMessages)
{
Monitor.Wait(incomingMessages);
}
if (canGrabMessage()) handleMessage();
// loop
}
ReceiveMessagesAndSignalWaiters()
{
awaitMessages();
copyMessagesToReadyArea();
lock(incomingMessages) {
Monitor.PulseAll(incomingMessages); //or Monitor.Pulse
}
awaitReadyAreaHasFreeSpace();
}
This code will deadlock! Maybe not today, maybe not tomorrow. Most likely when your code is placed under stress because suddenly it has become popular or important, and you are being called to fix an urgent issue.
Why?
Eventually the following will happen:
All consumer threads are doing some work
Messages arrive, the ready area can't hold any more messages, and PulseAll() is called.
No consumer gets woken up, because none are waiting
All consumer threads call Wait() [DEADLOCK]
This particular example assumes that producer thread is never going to call PulseAll() again because it has no more space to put messages in. But there are many, many broken variations on this code possible. People will try to make it more robust by changing a line such as making Monitor.Wait(); into
if (!canGrabMessage()) Monitor.Wait(incomingMessages);
Unfortunately, that still isn't enough to fix it. To fix it you also need to change the locking scope where Monitor.PulseAll() is called:
LoopOnMessages()
{
lock(incomingMessages)
{
if (!canGrabMessage()) Monitor.Wait(incomingMessages);
}
if (canGrabMessage()) handleMessage();
// loop
}
ReceiveMessagesAndSignalWaiters()
{
awaitMessagesArrive();
lock(incomingMessages)
{
copyMessagesToReadyArea();
Monitor.PulseAll(incomingMessages); //or Monitor.Pulse
}
awaitReadyAreaHasFreeSpace();
}
The key point is that in the fixed code, the locks restrict the possible sequences of events:
A consumer threads does its work and loops
That thread acquires the lock
And thanks to locking it is now true that either:
a. Messages haven't yet arrived in the ready area, and it releases the lock by calling Wait() BEFORE the message receiver thread can acquire the lock and copy more messages into the ready area, or
b. Messages have already arrived in the ready area and it receives the messages INSTEAD OF calling Wait(). (And while it is making this decision it is impossible for the message receiver thread to e.g. acquire the lock and copy more messages into the ready area.)
As a result the problem of the original code now never occurs:
3. When PulseEvent() is called No consumer gets woken up, because none are waiting
Now observe that in this code you have to get the locking scope exactly right. (If, indeed I got it right!)
And also, since you must use the lock (or Monitor.Enter() etc.) in order to use Monitor.PulseAll() or Monitor.Wait() in a deadlock-free fashion, you still have to worry about possibility of other deadlocks which happen because of that locking.
Bottom line: these APIs are also easy to screw up and deadlock with, i.e. quite dangerous
This is a simple example of monitor use :
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
namespace ConsoleApp4
{
class Program
{
public static int[] X = new int[30];
static readonly object _object = new object();
public static int count=0;
public static void PutNumbers(int numbersS, int numbersE)
{
for (int i = numbersS; i < numbersE; i++)
{
Monitor.Enter(_object);
try
{
if(count<30)
{
X[count] = i;
count++;
Console.WriteLine("Punt in " + count + "nd: "+i);
Monitor.Pulse(_object);
}
else
{
Monitor.Wait(_object);
}
}
finally
{
Monitor.Exit(_object);
}
}
}
public static void RemoveNumbers(int numbersS)
{
for (int i = 0; i < numbersS; i++)
{
Monitor.Enter(_object);
try
{
if (count > 0)
{
X[count] = 0;
int x = count;
count--;
Console.WriteLine("Removed " + x + " element");
Monitor.Pulse(_object);
}
else
{
Monitor.Wait(_object);
}
}
finally
{
Monitor.Exit(_object);
}
}
}
static void Main(string[] args)
{
Thread W1 = new Thread(() => PutNumbers(10,50));
Thread W2 = new Thread(() => PutNumbers(1, 10));
Thread R1 = new Thread(() => RemoveNumbers(30));
Thread R2 = new Thread(() => RemoveNumbers(20));
W1.Start();
R1.Start();
W2.Start();
R2.Start();
W1.Join();
R1.Join();
W2.Join();
R2.Join();
}
}
}

Condition Variables C#/.NET

In my quest to build a condition variable class I stumbled on a trivially simple way of doing it and I'd like to share this with the stack overflow community. I was googling for the better part of an hour and was unable to actually find a good tutorial or .NET-ish example that felt right, hopefully this can be of use to other people out there.
It's actually incredibly simple, once you know about the semantics of lock and Monitor.
But first, you do need an object reference. You can use this, but remember that this is public, in the sense that anyone with a reference to your class can lock on that reference. If you are uncomfortable with this, you can create a new private reference, like this:
readonly object syncPrimitive = new object(); // this is legal
Somewhere in your code where you'd like to be able to provide notifications, it can be accomplished like this:
void Notify()
{
lock (syncPrimitive)
{
Monitor.Pulse(syncPrimitive);
}
}
And the place where you'd do the actual work is a simple looping construct, like this:
void RunLoop()
{
lock (syncPrimitive)
{
for (;;)
{
// do work here...
Monitor.Wait(syncPrimitive);
}
}
}
Yes, this looks incredibly deadlock-ish, but the locking protocol for Monitor is such that it will release the lock during the Monitor.Wait. In fact, it's a requirement that you have obtained the lock before you call either Monitor.Pulse, Monitor.PulseAll or Monitor.Wait.
There's one caveat with this approach that you should know about. Since the lock is required to be held before calling the communication methods of Monitor you should really only hang on to the lock for an as short duration as possible. A variation of the RunLoop that's more friendly towards long running background tasks would look like this:
void RunLoop()
{
for (;;)
{
// do work here...
lock (syncPrimitive)
{
Monitor.Wait(syncPrimitive);
}
}
}
But now we've changed up the problem a bit, because the lock is no longer protecting the shared resource throughout the processing. So, if some of your code in the do work here... bit needs to access a shared resource you'll need an separate lock managing access to that.
We can leverage the above to create a simple thread-safe producer consumer collection (although .NET already provides an excellent ConcurrentQueue<T> implementation; this is just to illustrate the simplicity of using Monitor in implementing such mechanisms).
class BlockingQueue<T>
{
// We base our queue on the (non-thread safe) .NET 2.0 Queue collection
readonly Queue<T> q = new Queue<T>();
public void Enqueue(T item)
{
lock (q)
{
q.Enqueue(item);
System.Threading.Monitor.Pulse(q);
}
}
public T Dequeue()
{
lock (q)
{
for (;;)
{
if (q.Count > 0)
{
return q.Dequeue();
}
System.Threading.Monitor.Wait(q);
}
}
}
}
Now the point here is not to build a blocking collection, that also available in the .NET framework (see BlockingCollection). The point is to illustrate how simple it is to build an event driven message system using the Monitor class in .NET to implement conditional variable. Hope you find this useful.
Use ManualResetEvent
The class that is similar to conditional variable is the ManualResetEvent, just that the method name is slightly different.
The notify_one() in C++ would be named Set() in C#.
The wait() in C++ would be named WaitOne() in C#.
Moreover, ManualResetEvent also provides a Reset() method to set the state of the event to non-signaled.
The accepted answer is not a good one.
According to the Dequeue() code, Wait() gets called in each loop, which causes unnecessary waiting thus excessive context switches. The correct paradigm should be, wait() is called when the waiting condition is met. In this case, the waiting condition is q.Count() == 0.
Here's a better pattern to follow when it comes to using a Monitor.
https://msdn.microsoft.com/en-us/library/windows/desktop/ms682052%28v=vs.85%29.aspx
Another comment on C# Monitor is, it does not make use of a condition variable(which will essentially wake up all threads waiting for that lock, regardless of the conditions in which they went to wait; consequently, some threads may grab the lock and immediately return to sleep when they find the waiting condition hasn't been changed). It does not provide you with as find-grained threading control as pthreads. But it's .Net anyway, so not completely unexpected.
=============upon the request of John, here's an improved version=============
class BlockingQueue<T>
{
readonly Queue<T> q = new Queue<T>();
public void Enqueue(T item)
{
lock (q)
{
while (false) // condition predicate(s) for producer; can be omitted in this particular case
{
System.Threading.Monitor.Wait(q);
}
// critical section
q.Enqueue(item);
}
// generally better to signal outside the lock scope
System.Threading.Monitor.Pulse(q);
}
public T Dequeue()
{
T t;
lock (q)
{
while (q.Count == 0) // condition predicate(s) for consumer
{
System.Threading.Monitor.Wait(q);
}
// critical section
t = q.Dequeue();
}
// this can be omitted in this particular case; but not if there's waiting condition for the producer as the producer needs to be woken up; and here's the problem caused by missing condition variable by C# monitor: all threads stay on the same waiting queue of the shared resource/lock.
System.Threading.Monitor.Pulse(q);
return t;
}
}
A few things I'd like to point out:
1, I think my solution captures the requirements & definitions more precisely than yours. Specifically, the consumer should be forced to wait if and only if there's nothing left in the queue; otherwise it's up to the OS/.Net runtime to schedule threads. In your solution, however, the consumer is forced to wait in each loop, regardless whether it has actually consumed anything or not - this is the excessive waiting/context switches I was talking about.
2, My solution is symmetric in the sense that both the consumer and the producer code share the same pattern while yours is not. If you did know the pattern and just omitted for this particular case, then I take back this point.
3, Your solution signals inside the lock scope, while my solutions signals outside the lock scope. Please refer to this answer as to why your solution is worse.
why should we signal outside the lock scope
I was talking about the flaw of missing condition variables in C# monitor, and here's its impact: there's simply no way for C# to implemented the solution of moving the waiting thread from the condition queue to the lock queue. Therefore, the excessive context switch is doomed to take place in the three-thread scenario proposed by the answer in the link.
Also, the lack of condition variable makes it impossible to distinguish between the various cases where threads wait on the same shared resource/lock, but for different reasons. All waiting threads are place on a big waiting queue for that shared resource, which undermines efficiency.
"But it's .Net anyway, so not completely unexpected" --- it's understandable that .Net does not pursue as high efficiency as C++, it's understandable. But it does not imply programmers should not know the differences and their impacts.
Go to deadlockempire.github.io/. They have an amazing tutorial that will help you understand the condition variable as well as locks and will cetainly help you write your desired class.
You can step through the following code at deadlockempire.github.io and trace it. Here is the code snippet
while (true) {
Monitor.Enter(mutex);
if (queue.Count == 0) {
Monitor.Wait(mutex);
}
queue.Dequeue();
Monitor.Exit(mutex);
}
while (true) {
Monitor.Enter(mutex);
if (queue.Count == 0) {
Monitor.Wait(mutex);
}
queue.Dequeue();
Monitor.Exit(mutex);
}
while (true) {
Monitor.Enter(mutex);
queue.Enqueue(42);
Monitor.PulseAll(mutex);
Monitor.Exit(mutex);
}
As has been pointed out by h9uest's answer and comments the Monitor's Wait interface does not allow for proper condition variables (i.e. it does not allow for waiting on multiple conditions per shared lock).
The good news is that the other synchronization primitives (e.g. SemaphoreSlim, lock keyword, Monitor.Enter/Exit) in .NET can be used to implement a proper condition variable.
The following ConditionVariable class will allow you to wait on multiple conditions using a shared lock.
class ConditionVariable
{
private int waiters = 0;
private object waitersLock = new object();
private SemaphoreSlim sema = new SemaphoreSlim(0, Int32.MaxValue);
public ConditionVariable() {
}
public void Pulse() {
bool release;
lock (waitersLock)
{
release = waiters > 0;
}
if (release) {
sema.Release();
}
}
public void Wait(object cs) {
lock (waitersLock) {
++waiters;
}
Monitor.Exit(cs);
sema.Wait();
lock (waitersLock) {
--waiters;
}
Monitor.Enter(cs);
}
}
All you need to do is create an instance of the ConditionVariable class for each condition you want to be able to wait on.
object queueLock = new object();
private ConditionVariable notFullCondition = new ConditionVariable();
private ConditionVariable notEmptyCondition = new ConditionVariable();
And then just like in the Monitor class, the ConditionVariable's Pulse and Wait methods must be invoked from within a synchronized block of code.
T Take() {
lock(queueLock) {
while(queue.Count == 0) {
// wait for queue to be not empty
notEmptyCondition.Wait(queueLock);
}
T item = queue.Dequeue();
if(queue.Count < 100) {
// notify producer queue not full anymore
notFullCondition.Pulse();
}
return item;
}
}
void Add(T item) {
lock(queueLock) {
while(queue.Count >= 100) {
// wait for queue to be not full
notFullCondition.Wait(queueLock);
}
queue.Enqueue(item);
// notify consumer queue not empty anymore
notEmptyCondition.Pulse();
}
}
Below is a link to the full source code of a proper Condition Variable class using 100% managed code in C#.
https://github.com/CodeExMachina/ConditionVariable
i think i found "The WAY" on the tipical problem of a
List<string> log;
used by multiple thread, one tha fill it and the other processing and the other one empting
avoiding empty
while(true){
//stuff
Thread.Sleep(100)
}
variables used in Program
public static readonly List<string> logList = new List<string>();
public static EventWaitHandle evtLogListFilled = new AutoResetEvent(false);
the processor work like
private void bw_DoWorkLog(object sender, DoWorkEventArgs e)
{
StringBuilder toFile = new StringBuilder();
while (true)
{
try
{
{
//waiting form a signal
Program.evtLogListFilled.WaitOne();
try
{
//critical section
Monitor.Enter(Program.logList);
int max = Program.logList.Count;
for (int i = 0; i < max; i++)
{
SetText(Program.logList[0]);
toFile.Append(Program.logList[0]);
toFile.Append("\r\n");
Program.logList.RemoveAt(0);
}
}
finally
{
Monitor.Exit(Program.logList);
// end critical section
}
try
{
if (toFile.Length > 0)
{
Logger.Log(toFile.ToString().Substring(0, toFile.Length - 2));
toFile.Clear();
}
}
catch
{
}
}
}
catch (Exception ex)
{
Logger.Log(System.Reflection.MethodBase.GetCurrentMethod(), ex);
}
Thread.Sleep(100);
}
}
On the filler thread we have
public static void logList_add(string str)
{
try
{
try
{
//critical section
Monitor.Enter(Program.logList);
Program.logList.Add(str);
}
finally
{
Monitor.Exit(Program.logList);
//end critical section
}
//set start
Program.evtLogListFilled.Set();
}
catch{}
}
this solution is fully tested, the istruction Program.evtLogListFilled.Set(); may release the lock on Program.evtLogListFilled.WaitOne() and also the next future lock.
I think this is the simpliest way.

Prevent Race Condition in Efficient Consumer Producer model

What I am trying to achieve is to have a consumer producer method. There can be many producers but only one consumer. There cannot be a dedicated consumer because of scalability, so the idea is to have the producer start the consuming process if there is data to be consumed and there is currently no active consumer.
1. Many threads can be producing messages. (Asynchronous)
2. Only one thread can be consuming messages. (Synchronous)
3. We should only have a consumer in process if there is data to be consumed
4. A continuous consumer that waits for data would not be efficient if we add many of these classes.
In my example I have a set of methods that send data. Multiple threads can write data Write() but only one of those threads will loop and Send data SendNewData(). The reason that only one loop can write data is because the order of data must be synchronous, and with a AsyncWrite() out of our control we can only guarantee order by running one AyncWrite() at a time.
The problem that I have is that if a thread gets called to Write() produce, it will queue the data and check the Interlocked.CompareExchance to see if there is a consumer. If it sees that another thread is in the loop already consuming, it will assume that this consumer will send the data. This is a problem if that looping thread consumer is at "Race Point A" since this consumer has already checked that there is no more messages to send and is about to shut down the consuming process.
Is there a way to prevent this race condition without locking a large part of the code. The real scenario has many queues and is a bit more complex than this.
In the real code List<INetworkSerializable> is actually a byte[] BufferPool. I used List for the example to make this block easier to read.
With 1000s of these classes being active at once, I cannot afford to have the SendNewData looping continuously with a dedicated thread. The looping thread should only be active if there is data to send.
public void Write(INetworkSerializable messageToSend)
{
Queue.Enqueue(messageToSend);
// Check if there are any current consumers. If not then we should instigate the consuming.
if (Interlocked.CompareExchange(ref RunningWrites, 1, 0) == 0)
{ //We are now the thread that consumes and sends data
SendNewData();
}
}
//Only one thread should be looping here to keep consuming and sending data synchronously.
private void SendNewData()
{
INetworkSerializable dataToSend;
List<INetworkSerializable> dataToSendList = new List<INetworkSerializable>();
while (true)
{
if (!Queue.TryDequeue(out dataToSend))
{
//Race Point A
if (dataToSendList.IsEmpty)
{
//All data is sent, return so that another thread can take responsibility.
Interlocked.Decrement(ref RunningWrites);
return;
}
//We have data in the list to send but nothing more to consume so lets send the data that we do have.
break;
}
dataToSendList.Add(dataToSend);
}
//Async callback is WriteAsyncCallback()
WriteAsync(dataToSendList);
}
//Callback after WriteAsync() has sent the data.
private void WriteAsyncCallback()
{
//Data was written to sockets, now lets loop back for more data
SendNewData();
}
It sounds like you would be better off with the producer-consumer pattern that is easily implemented with the BlockingCollection:
var toSend = new BlockingCollection<something>();
// producers
toSend.Add(something);
// when all producers are done
toSend.CompleteAdding();
// consumer -- this won't end until CompleteAdding is called
foreach(var item in toSend.GetConsumingEnumerable())
Send(item);
To address the comment of knowing when to call CompleteAdding, I would launch the 1000s of producers as tasks, wait for all those tasks to complete (Task.WaitAll), and then call CompleteAdding. There are good overloads taking in CancellationTokens that would give you better control, if needed.
Also, TPL is pretty good about scheduling off blocked threads.
More complete code:
var toSend = new BlockingCollection<int>();
Parallel.Invoke(() => Produce(toSend), () => Consume(toSend));
...
private static void Consume(BlockingCollection<int> toSend)
{
foreach (var value in toSend.GetConsumingEnumerable())
{
Console.WriteLine("Sending {0}", value);
}
}
private static void Produce(BlockingCollection<int> toSend)
{
Action<int> generateToSend = toSend.Add;
var producers = Enumerable.Range(0, 1000)
.Select(n => new Task(value => generateToSend((int) value), n))
.ToArray();
foreach(var p in producers)
{
p.Start();
}
Task.WaitAll(producers);
toSend.CompleteAdding();
}
Check this variant. There are some descriptive comments in code.
Also notice that WriteAsyncCallback now don't call SendNewData method anymore
private int _pendingMessages;
private int _consuming;
public void Write(INetworkSerializable messageToSend)
{
Interlocked.Increment(ref _pendingMessages);
Queue.Enqueue(messageToSend);
// Check if there is anyone consuming messages
// if not, we will have to become a consumer and process our own message,
// and any other further messages until we have cleaned the queue
if (Interlocked.CompareExchange(ref _consuming, 1, 0) == 0)
{
// We are now the thread that consumes and sends data
SendNewData();
}
}
// Only one thread should be looping here to keep consuming and sending data synchronously.
private void SendNewData()
{
INetworkSerializable dataToSend;
var dataToSendList = new List<INetworkSerializable>();
int messagesLeft;
do
{
if (!Queue.TryDequeue(out dataToSend))
{
// there is one possibility that we get here while _pendingMessages != 0:
// some other thread had just increased _pendingMessages from 0 to 1, but haven't put a message to queue.
if (dataToSendList.Count == 0)
{
if (_pendingMessages == 0)
{
_consuming = 0;
// and if we have no data this mean that we are safe to exit from current thread.
return;
}
}
else
{
// We have data in the list to send but nothing more to consume so lets send the data that we do have.
break;
}
}
dataToSendList.Add(dataToSend);
messagesLeft = Interlocked.Decrement(ref _pendingMessages);
}
while (messagesLeft > 0);
// Async callback is WriteAsyncCallback()
WriteAsync(dataToSendList);
}
private void WriteAsync(List<INetworkSerializable> dataToSendList)
{
// some code
}
// Callback after WriteAsync() has sent the data.
private void WriteAsyncCallback()
{
// ...
SendNewData();
}
The race condition can be prevented by adding the following and double checking the Queue after we have declared that we are no longer the consumer.
if (dataToSend.IsEmpty)
{
//Declare that we are no longer the consumer.
Interlocked.Decrement(ref RunningWrites);
//Double check the queue to prevent race condition A
if (Queue.IsEmpty)
return;
else
{ //Race condition A occurred. There is data again.
//Let's try to become a consumer.
if (Interlocked.CompareExchange(ref RunningWrites, 1, 0) == 0)
continue;
//Another thread has nominated itself as the consumer. Our job is done.
return;
}
}
break;

C# : Monitor - Wait,Pulse,PulseAll

I am having hard time in understanding Wait(), Pulse(), PulseAll(). Will all of them avoid deadlock? I would appreciate if you explain how to use them?
Short version:
lock(obj) {...}
is short-hand for Monitor.Enter / Monitor.Exit (with exception handling etc). If nobody else has the lock, you can get it (and run your code) - otherwise your thread is blocked until the lock is aquired (by another thread releasing it).
Deadlock typically happens when either A: two threads lock things in different orders:
thread 1: lock(objA) { lock (objB) { ... } }
thread 2: lock(objB) { lock (objA) { ... } }
(here, if they each acquire the first lock, neither can ever get the second, since neither thread can exit to release their lock)
This scenario can be minimised by always locking in the same order; and you can recover (to a degree) by using Monitor.TryEnter (instead of Monitor.Enter/lock) and specifying a timeout.
or B: you can block yourself with things like winforms when thread-switching while holding a lock:
lock(obj) { // on worker
this.Invoke((MethodInvoker) delegate { // switch to UI
lock(obj) { // oopsiee!
...
}
});
}
The deadlock appears obvious above, but it isn't so obvious when you have spaghetti code; possible answers: don't thread-switch while holding locks, or use BeginInvoke so that you can at least exit the lock (letting the UI play).
Wait/Pulse/PulseAll are different; they are for signalling. I use this in this answer to signal so that:
Dequeue: if you try to dequeue data when the queue is empty, it waits for another thread to add data, which wakes up the blocked thread
Enqueue: if you try and enqueue data when the queue is full, it waits for another thread to remove data, which wakes up the blocked thread
Pulse only wakes up one thread - but I'm not brainy enough to prove that the next thread is always the one I want, so I tend to use PulseAll, and simply re-verify the conditions before continuing; as an example:
while (queue.Count >= maxSize)
{
Monitor.Wait(queue);
}
With this approach, I can safely add other meanings of Pulse, without my existing code assuming that "I woke up, therefore there is data" - which is handy when (in the same example) I later needed to add a Close() method.
Simple recipe for use of Monitor.Wait and Monitor.Pulse. It consists of a worker, a boss, and a phone they use to communicate:
object phone = new object();
A "Worker" thread:
lock(phone) // Sort of "Turn the phone on while at work"
{
while(true)
{
Monitor.Wait(phone); // Wait for a signal from the boss
DoWork();
Monitor.PulseAll(phone); // Signal boss we are done
}
}
A "Boss" thread:
PrepareWork();
lock(phone) // Grab the phone when I have something ready for the worker
{
Monitor.PulseAll(phone); // Signal worker there is work to do
Monitor.Wait(phone); // Wait for the work to be done
}
More complex examples follow...
A "Worker with something else to do":
lock(phone)
{
while(true)
{
if(Monitor.Wait(phone,1000)) // Wait for one second at most
{
DoWork();
Monitor.PulseAll(phone); // Signal boss we are done
}
else
DoSomethingElse();
}
}
An "Impatient Boss":
PrepareWork();
lock(phone)
{
Monitor.PulseAll(phone); // Signal worker there is work to do
if(Monitor.Wait(phone,1000)) // Wait for one second at most
Console.Writeline("Good work!");
}
No, they don't protect you from deadlocks. They are just more flexible tools for thread synchronization. Here is a very good explanation how to use them and very important pattern of usage - without this pattern you will break all the things:
http://www.albahari.com/threading/part4.aspx
Something that total threw me here is that Pulse just gives a "heads up" to a thread in a Wait. The Waiting thread will not continue until the thread that did the Pulse gives up the lock and the waiting thread successfully wins it.
lock(phone) // Grab the phone
{
Monitor.PulseAll(phone); // Signal worker
Monitor.Wait(phone); // ****** The lock on phone has been given up! ******
}
or
lock(phone) // Grab the phone when I have something ready for the worker
{
Monitor.PulseAll(phone); // Signal worker there is work to do
DoMoreWork();
} // ****** The lock on phone has been given up! ******
In both cases it's not until "the lock on phone has been given up" that another thread can get it.
There might be other threads waiting for that lock from Monitor.Wait(phone) or lock(phone). Only the one that wins the lock will get to continue.
They are tools for synchronizing and signaling between threads. As such they do nothing to prevent deadlocks, but if used correctly they can be used to synchronize and communicate between threads.
Unfortunately most of the work needed to write correct multithreaded code is currently the developers' responsibility in C# (and many other languages). Take a look at how F#, Haskell and Clojure handles this for an entirely different approach.
Unfortunately, none of Wait(), Pulse() or PulseAll() have the magical property which you are wishing for - which is that by using this API you will automatically avoid deadlock.
Consider the following code
object incomingMessages = new object(); //signal object
LoopOnMessages()
{
lock(incomingMessages)
{
Monitor.Wait(incomingMessages);
}
if (canGrabMessage()) handleMessage();
// loop
}
ReceiveMessagesAndSignalWaiters()
{
awaitMessages();
copyMessagesToReadyArea();
lock(incomingMessages) {
Monitor.PulseAll(incomingMessages); //or Monitor.Pulse
}
awaitReadyAreaHasFreeSpace();
}
This code will deadlock! Maybe not today, maybe not tomorrow. Most likely when your code is placed under stress because suddenly it has become popular or important, and you are being called to fix an urgent issue.
Why?
Eventually the following will happen:
All consumer threads are doing some work
Messages arrive, the ready area can't hold any more messages, and PulseAll() is called.
No consumer gets woken up, because none are waiting
All consumer threads call Wait() [DEADLOCK]
This particular example assumes that producer thread is never going to call PulseAll() again because it has no more space to put messages in. But there are many, many broken variations on this code possible. People will try to make it more robust by changing a line such as making Monitor.Wait(); into
if (!canGrabMessage()) Monitor.Wait(incomingMessages);
Unfortunately, that still isn't enough to fix it. To fix it you also need to change the locking scope where Monitor.PulseAll() is called:
LoopOnMessages()
{
lock(incomingMessages)
{
if (!canGrabMessage()) Monitor.Wait(incomingMessages);
}
if (canGrabMessage()) handleMessage();
// loop
}
ReceiveMessagesAndSignalWaiters()
{
awaitMessagesArrive();
lock(incomingMessages)
{
copyMessagesToReadyArea();
Monitor.PulseAll(incomingMessages); //or Monitor.Pulse
}
awaitReadyAreaHasFreeSpace();
}
The key point is that in the fixed code, the locks restrict the possible sequences of events:
A consumer threads does its work and loops
That thread acquires the lock
And thanks to locking it is now true that either:
a. Messages haven't yet arrived in the ready area, and it releases the lock by calling Wait() BEFORE the message receiver thread can acquire the lock and copy more messages into the ready area, or
b. Messages have already arrived in the ready area and it receives the messages INSTEAD OF calling Wait(). (And while it is making this decision it is impossible for the message receiver thread to e.g. acquire the lock and copy more messages into the ready area.)
As a result the problem of the original code now never occurs:
3. When PulseEvent() is called No consumer gets woken up, because none are waiting
Now observe that in this code you have to get the locking scope exactly right. (If, indeed I got it right!)
And also, since you must use the lock (or Monitor.Enter() etc.) in order to use Monitor.PulseAll() or Monitor.Wait() in a deadlock-free fashion, you still have to worry about possibility of other deadlocks which happen because of that locking.
Bottom line: these APIs are also easy to screw up and deadlock with, i.e. quite dangerous
This is a simple example of monitor use :
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
namespace ConsoleApp4
{
class Program
{
public static int[] X = new int[30];
static readonly object _object = new object();
public static int count=0;
public static void PutNumbers(int numbersS, int numbersE)
{
for (int i = numbersS; i < numbersE; i++)
{
Monitor.Enter(_object);
try
{
if(count<30)
{
X[count] = i;
count++;
Console.WriteLine("Punt in " + count + "nd: "+i);
Monitor.Pulse(_object);
}
else
{
Monitor.Wait(_object);
}
}
finally
{
Monitor.Exit(_object);
}
}
}
public static void RemoveNumbers(int numbersS)
{
for (int i = 0; i < numbersS; i++)
{
Monitor.Enter(_object);
try
{
if (count > 0)
{
X[count] = 0;
int x = count;
count--;
Console.WriteLine("Removed " + x + " element");
Monitor.Pulse(_object);
}
else
{
Monitor.Wait(_object);
}
}
finally
{
Monitor.Exit(_object);
}
}
}
static void Main(string[] args)
{
Thread W1 = new Thread(() => PutNumbers(10,50));
Thread W2 = new Thread(() => PutNumbers(1, 10));
Thread R1 = new Thread(() => RemoveNumbers(30));
Thread R2 = new Thread(() => RemoveNumbers(20));
W1.Start();
R1.Start();
W2.Start();
R2.Start();
W1.Join();
R1.Join();
W2.Join();
R2.Join();
}
}
}

How to effectively log asynchronously?

I am using Enterprise Library 4 on one of my projects for logging (and other purposes). I've noticed that there is some cost to the logging that I am doing that I can mitigate by doing the logging on a separate thread.
The way I am doing this now is that I create a LogEntry object and then I call BeginInvoke on a delegate that calls Logger.Write.
new Action<LogEntry>(Logger.Write).BeginInvoke(le, null, null);
What I'd really like to do is add the log message to a queue and then have a single thread pulling LogEntry instances off the queue and performing the log operation. The benefit of this would be that logging is not interfering with the executing operation and not every logging operation results in a job getting thrown on the thread pool.
How can I create a shared queue that supports many writers and one reader in a thread safe way? Some examples of a queue implementation that is designed to support many writers (without causing synchronization/blocking) and a single reader would be really appreciated.
Recommendation regarding alternative approaches would also be appreciated, I am not interested in changing logging frameworks though.
I wrote this code a while back, feel free to use it.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
namespace MediaBrowser.Library.Logging {
public abstract class ThreadedLogger : LoggerBase {
Queue<Action> queue = new Queue<Action>();
AutoResetEvent hasNewItems = new AutoResetEvent(false);
volatile bool waiting = false;
public ThreadedLogger() : base() {
Thread loggingThread = new Thread(new ThreadStart(ProcessQueue));
loggingThread.IsBackground = true;
loggingThread.Start();
}
void ProcessQueue() {
while (true) {
waiting = true;
hasNewItems.WaitOne(10000,true);
waiting = false;
Queue<Action> queueCopy;
lock (queue) {
queueCopy = new Queue<Action>(queue);
queue.Clear();
}
foreach (var log in queueCopy) {
log();
}
}
}
public override void LogMessage(LogRow row) {
lock (queue) {
queue.Enqueue(() => AsyncLogMessage(row));
}
hasNewItems.Set();
}
protected abstract void AsyncLogMessage(LogRow row);
public override void Flush() {
while (!waiting) {
Thread.Sleep(1);
}
}
}
}
Some advantages:
It keeps the background logger alive, so it does not need to spin up and spin down threads.
It uses a single thread to service the queue, which means there will never be a situation where 100 threads are servicing the queue.
It copies the queues to ensure the queue is not blocked while the log operation is performed
It uses an AutoResetEvent to ensure the bg thread is in a wait state
It is, IMHO, very easy to follow
Here is a slightly improved version, keep in mind I performed very little testing on it, but it does address a few minor issues.
public abstract class ThreadedLogger : IDisposable {
Queue<Action> queue = new Queue<Action>();
ManualResetEvent hasNewItems = new ManualResetEvent(false);
ManualResetEvent terminate = new ManualResetEvent(false);
ManualResetEvent waiting = new ManualResetEvent(false);
Thread loggingThread;
public ThreadedLogger() {
loggingThread = new Thread(new ThreadStart(ProcessQueue));
loggingThread.IsBackground = true;
// this is performed from a bg thread, to ensure the queue is serviced from a single thread
loggingThread.Start();
}
void ProcessQueue() {
while (true) {
waiting.Set();
int i = ManualResetEvent.WaitAny(new WaitHandle[] { hasNewItems, terminate });
// terminate was signaled
if (i == 1) return;
hasNewItems.Reset();
waiting.Reset();
Queue<Action> queueCopy;
lock (queue) {
queueCopy = new Queue<Action>(queue);
queue.Clear();
}
foreach (var log in queueCopy) {
log();
}
}
}
public void LogMessage(LogRow row) {
lock (queue) {
queue.Enqueue(() => AsyncLogMessage(row));
}
hasNewItems.Set();
}
protected abstract void AsyncLogMessage(LogRow row);
public void Flush() {
waiting.WaitOne();
}
public void Dispose() {
terminate.Set();
loggingThread.Join();
}
}
Advantages over the original:
It's disposable, so you can get rid of the async logger
The flush semantics are improved
It will respond slightly better to a burst followed by silence
Yes, you need a producer/consumer queue. I have one example of this in my threading tutorial - if you look my "deadlocks / monitor methods" page you'll find the code in the second half.
There are plenty of other examples online, of course - and .NET 4.0 will ship with one in the framework too (rather more fully featured than mine!). In .NET 4.0 you'd probably wrap a ConcurrentQueue<T> in a BlockingCollection<T>.
The version on that page is non-generic (it was written a long time ago) but you'd probably want to make it generic - it would be trivial to do.
You would call Produce from each "normal" thread, and Consume from one thread, just looping round and logging whatever it consumes. It's probably easiest just to make the consumer thread a background thread, so you don't need to worry about "stopping" the queue when your app exits. That does mean there's a remote possibility of missing the final log entry though (if it's half way through writing it when the app exits) - or even more if you're producing faster than it can consume/log.
Here is what I came up with... also see Sam Saffron's answer. This answer is community wiki in case there are any problems that people see in the code and want to update.
/// <summary>
/// A singleton queue that manages writing log entries to the different logging sources (Enterprise Library Logging) off the executing thread.
/// This queue ensures that log entries are written in the order that they were executed and that logging is only utilizing one thread (backgroundworker) at any given time.
/// </summary>
public class AsyncLoggerQueue
{
//create singleton instance of logger queue
public static AsyncLoggerQueue Current = new AsyncLoggerQueue();
private static readonly object logEntryQueueLock = new object();
private Queue<LogEntry> _LogEntryQueue = new Queue<LogEntry>();
private BackgroundWorker _Logger = new BackgroundWorker();
private AsyncLoggerQueue()
{
//configure background worker
_Logger.WorkerSupportsCancellation = false;
_Logger.DoWork += new DoWorkEventHandler(_Logger_DoWork);
}
public void Enqueue(LogEntry le)
{
//lock during write
lock (logEntryQueueLock)
{
_LogEntryQueue.Enqueue(le);
//while locked check to see if the BW is running, if not start it
if (!_Logger.IsBusy)
_Logger.RunWorkerAsync();
}
}
private void _Logger_DoWork(object sender, DoWorkEventArgs e)
{
while (true)
{
LogEntry le = null;
bool skipEmptyCheck = false;
lock (logEntryQueueLock)
{
if (_LogEntryQueue.Count <= 0) //if queue is empty than BW is done
return;
else if (_LogEntryQueue.Count > 1) //if greater than 1 we can skip checking to see if anything has been enqueued during the logging operation
skipEmptyCheck = true;
//dequeue the LogEntry that will be written to the log
le = _LogEntryQueue.Dequeue();
}
//pass LogEntry to Enterprise Library
Logger.Write(le);
if (skipEmptyCheck) //if LogEntryQueue.Count was > 1 before we wrote the last LogEntry we know to continue without double checking
{
lock (logEntryQueueLock)
{
if (_LogEntryQueue.Count <= 0) //if queue is still empty than BW is done
return;
}
}
}
}
}
I suggest to start with measuring actual performance impact of logging on the overall system (i.e. by running profiler) and optionally switching to something faster like log4net (I've personally migrated to it from EntLib logging a long time ago).
If this does not work, you can try using this simple method from .NET Framework:
ThreadPool.QueueUserWorkItem
Queues a method for execution. The method executes when a thread pool thread becomes available.
MSDN Details
If this does not work either then you can resort to something like John Skeet has offered and actually code the async logging framework yourself.
In response to Sam Safrons post, I wanted to call flush and make sure everything was really finished writting. In my case, I am writing to a database in the queue thread and all my log events were getting queued up but sometimes the application stopped before everything was finished writing which is not acceptable in my situation. I changed several chunks of your code but the main thing I wanted to share was the flush:
public static void FlushLogs()
{
bool queueHasValues = true;
while (queueHasValues)
{
//wait for the current iteration to complete
m_waitingThreadEvent.WaitOne();
lock (m_loggerQueueSync)
{
queueHasValues = m_loggerQueue.Count > 0;
}
}
//force MEL to flush all its listeners
foreach (MEL.LogSource logSource in MEL.Logger.Writer.TraceSources.Values)
{
foreach (TraceListener listener in logSource.Listeners)
{
listener.Flush();
}
}
}
I hope that saves someone some frustration. It is especially apparent in parallel processes logging lots of data.
Thanks for sharing your solution, it set me into a good direction!
--Johnny S
I wanted to say that my previous post was kind of useless. You can simply set AutoFlush to true and you will not have to loop through all the listeners. However, I still had crazy problem with parallel threads trying to flush the logger. I had to create another boolean that was set to true during the copying of the queue and executing the LogEntry writes and then in the flush routine I had to check that boolean to make sure something was not already in the queue and the nothing was getting processed before returning.
Now multiple threads in parallel can hit this thing and when I call flush I know it is really flushed.
public static void FlushLogs()
{
int queueCount;
bool isProcessingLogs;
while (true)
{
//wait for the current iteration to complete
m_waitingThreadEvent.WaitOne();
//check to see if we are currently processing logs
lock (m_isProcessingLogsSync)
{
isProcessingLogs = m_isProcessingLogs;
}
//check to see if more events were added while the logger was processing the last batch
lock (m_loggerQueueSync)
{
queueCount = m_loggerQueue.Count;
}
if (queueCount == 0 && !isProcessingLogs)
break;
//since something is in the queue, reset the signal so we will not keep looping
Thread.Sleep(400);
}
}
Just an update:
Using enteprise library 5.0 with .NET 4.0 it can easily be done by:
static public void LogMessageAsync(LogEntry logEntry)
{
Task.Factory.StartNew(() => LogMessage(logEntry));
}
See:
http://randypaulo.wordpress.com/2011/07/28/c-enterprise-library-asynchronous-logging/
An extra level of indirection may help here.
Your first async method call can put messages onto a synchonized Queue and set an event -- so the locks are happening in the thread-pool, not on your worker threads -- and then have yet another thread pulling messages off the queue when the event is raised.
If you log something on a separate thread, the message may not be written if the application crashes, which makes it rather useless.
The reason goes why you should always flush after every written entry.
If what you have in mind is a SHARED queue, then I think you are going to have to synchronize the writes to it, the pushes and the pops.
But, I still think it's worth aiming at the shared queue design. In comparison to the IO of logging and probably in comparison to the other work your app is doing, the brief amount of blocking for the pushes and the pops will probably not be significant.

Categories

Resources