The below code uses a background worker thread to process work items one by one. The worker thread starts waiting on a ManualResetEvent whenever it runs out of work items. Main thread periodically adds new work items and wakes the worker thread.
The waking mechanism has a race condition. If a new item is added by main thread while worker thread is at the place indicated by *, the worker thread will not get woken.
Is there a simple and correct way of waking the worker thread that does not have this problem?
ManualResetEvent m_waitEvent;
// Worker thread processes work items one by one
void WorkerThread()
{
while (true)
{
m_waitEvent.WaitOne();
bool noMoreItems = ProcessOneWorkItem();
if (noMoreItems)
{
// *
m_waitEvent.Reset(); // No more items, wait for more
}
}
}
// Main thread code that adds a new work item
AddWorkItem();
m_waitEvent.Set(); // Wake worker thread
You're using the wrong synchronization mechanism. Rather than a MRE just use a Semaphore. The semaphore will then represent the number of items yet to be processed. You can set it to add one, or wait on it to reduce it by one. There is no if, you always do every semaphore action and as a result there is no race condition.
That said, you can avoid the problem entirely. Rather than managing the synchronization primitives yourself you can just use a BlockingCollection. Have the producer add items and the consumer consume them. The synchronization will all be taken care of for you by that class, and likely more efficiently than your implementation would be as well.
I tend to use a current work items counter and increment and decrement that counter. You can turn your processor thread into a loop that is looking at that counter then sleeping rather than run once and done. That way, no matter where you are when the item is added, you are 1 sleep cycle from the item being processed.
Related
Suppose that I have this mutex (or semaphore) called Mutex1
private static readonly SemaphoreSlim Mutex1 = new SemaphoreSlim(0, 1);
Will following code always work as my expectations?
Thread A
await Mutex1.WaitAsync(); // wait in thread "A" until thread "B" releases mutex
Thread B
Mutex1.Release();
await Mutex1.WaitAsync(); // thread A should continue and thread B should wait.
Is Mutex1.Release in thread B always guaranteed to cause continuation in thread A?
My guess answer is no, because before thread A continues, thread B may wait again and since "waiting threads" are not queued, thread B may continue again instead of thread A. am I right?
The safe approach that I'm currently using is with additional field called Mutex2.
private static readonly SemaphoreSlim Mutex2 = new SemaphoreSlim(0, 1);
Thread A
await Mutex1.WaitAsync(); // wait in thread "A" until thread "B" releases mutex
Thread B
Mutex1.Release();
await Mutex2.WaitAsync(); // notice Mutex TWO
Now I'm sure that switching between threads is handled correctly, but I wanted to know if I can save one field and use first approach safely or not. note that this operation is wrapped in lock so its single threaded and safe.
(for those who are curios this is the part of application for making synchronized communication between two apps using UWP app services)
From https://msdn.microsoft.com/en-us/library/system.threading.semaphoreslim(v=vs.110).aspx we have:
If multiple threads are blocked, there is no guaranteed order, such as FIFO or LIFO, that controls when threads enter the semaphore.
So, to me, this says that the slim semaphores do not queue up requests.
And, that your first approach has a race condition. That is, thread B can release the mutex and immediately reacquire it, locking out thread A for an indeterminate amount of time.
So, you probably have to use your two semaphore approach. Or, use a different synchronization primitive that supports a [fair] queue of waiters.
The way you are using this, you should try using a AutoResetEvent. https://msdn.microsoft.com/en-us/library/system.threading.autoresetevent(v=vs.110).aspx
That will do what you want.
Consider two threads run simultaneously. A is reading and B is writing. When A is reading, in the middle of code ,CPU time for A finishes then B thread continues.
Is there any way to don't give back CPU until A finishes, but B can start or continue?
You need to understand that you have almost no control over when CPU is given back and to whom it is given. The operating system does that. To have control on that, you'd need to be the operating system. The only things you can usually do are:
start a thread
set thread priority, so some threads are may more likely get time than others
put a thread to sleep, immediatelly and ask the operating system to wake it up upon some condition, maybe with some timeout (waiting time limit)
as a special case, or a typical use case, the second point is often also provided with a shorthand:
put a thread to sleep, immediatelly for a specified amount of time
By "sleep" I mean that this thread is paused and will not get any CPU time, even if all CPUs are idle, unless the thread is woken up by the OS due to some condition.
Furthermore, in a typical case, there is no "thread A and thread B that switch CPU time between them", but there is "lots of threads from various processes and the operating system itself, and you two threads". This means that when your thread A loses the CPU, most probably it will not be the thread B that gets the time now. Some other thread from somewhere else will get it, and at some future point of time, maybe your thread A or maybe thread B will get it back.
This means that there is very little you can be sure. You can be sure that your threads are
either dead
or sleeping
or proceeding 'forward' in a hard to determine order
If you need to ensure that some threads are synchronized, you must .. not start them simultaneously, or put them sleep in precise moments and wake them up in precise order.
You've just said in comments:
You know, if in the middle of A CPU time finishes, data that has been retrieved is not complete
This means that you need to ensure that thread B does not try to touch the data before thread A finishes writing it. But also, if you think about it, you need to ensure that thread A doesn't start writing next data if the thread B is now reading previous data.
This means synchronization. This means that threads A and B must wait if the other thread is touching the data. This means that they need to be put to sleep and woken up when the other thread finishes.
In C#, the easiest way to do that is to use lock(x) keyword. When a thread enters a lock() section, it proceeds only if it is able to get the lock. If not, it is put to sleep. It can't get the lock if any other thread was faster and got it before. However, a thread releases the lock when it ends its job. Upon that time, one of the sleeping threads is woken up and given the lock.
lock(fooo) { // <- this line means 'acquire the lock or sleep'
iam.doing(myjob);
very.important(things);
thatshouldnt.be.interrupted();
byother(threads);
} // <- this line means 'release the lock'
So, when a thread gets through the lock(fooo){ line, you can't be sure it won't be interrupted. Oh, surely it will be. OS will switch the threads back and forth to other processes, and so on. But you can be sure that no other threads of your app will be inside the code block. If they tried to get inside while your thread got that lock, they'd imediatelly fall asleep in the first lock line. One of them be will be later woken up when your thread gets out of that code.
There's one more thing. lock() keyword requires a parameter. I wrote foo there. You need to pass there something that will act as the lock. It can be any object, even plain object:
private object thelock = new object();
private void dosomething()
{
lock(thelock)
{
foobarize(thebaz);
}
}
however you must ensure that all threads try to use the same lock instance. Writing a code like
private void dosomething()
{
object thelock = new object();
lock(thelock)
{
foobarize(thebaz);
}
}
is a nonsense since every potential thread executing that lines will try lockin upon their own new object instance and will see it as "free" (it's new, just created, noone took it earlier) and will immediatelly get into the protected code block.
Now you wrote about using ConcurrentQueue. This class provides safely mechanisms against concurrency. You can be sure that adding or reading or removing items from that queue is already safe. This collection makes it safe. You don't need to add synchronization to add or remove items safely. It's safe. If you observe any ill effects, then most probably you have tried putting an item into that collection and then you were modifying that item. Concurrent collection will not guard you against that. It can only make sure that add/remove/etc are safe. But it has no knowledge or control on what you do to the items:
In short, if some thread B tries to read items from the collection, then in thread A this is NOT safe:
concurrentcoll.Add(item);
item.x = 5;
item.foobarize();
but this is safe:
item.x = 5;
item.foobarize();
concurrentcoll.Add(item);
// and do not touch the Item anymore here.
I was reading AutoResetEvent documentation on MSDN and following warning kinda bothers me..
"Important:
There is no guarantee that every call to the Set method will release a thread. If two calls are too close together, so that the second call occurs before a thread has been released, only one thread is released. It is as if the second call did not happen. Also, if Set is called when there are no threads waiting and the AutoResetEvent is already signaled, the call has no effect."
But this warning basically kills the very reason to have such a thread synchronization techniques. For example I have a list which will hold jobs. And there is only one producer which will add jobs to the list. I have consumers (more than one), waiting to get the job from the list.. something like this..
Producer:
void AddJob(Job j)
{
lock(qLock)
{
jobQ.Enqueue(j);
}
newJobEvent.Set(); // newJobEvent is AutoResetEvent
}
Consumer
void Run()
{
while(canRun)
{
newJobEvent.WaitOne();
IJob job = null;
lock(qLock)
{
job = jobQ.Dequeue();
}
// process job
}
}
If the above warning is true, then if I enqueue two jobs very quickly, only one thread will pick up the job, isn't it? I was under the assumption that Set will be atomic, that is it does the following:
Set the event
If threads are waiting, pick one thread to wake up
reset the event
run the selected thread.
So I am basically confused about the warning in MSDN. is it a valid warning?
Even if the warning isn't true and Set is atomic, why would you use an AutoResetEvent here? Let's say you have some producers queue up 3 events in row and there's one consumer. After processing the 2nd job, the consumer blocks and never processes the third.
I would use a ReaderWriterLockSlim for this type of synchronization. Basically, you need multiple producers to be able to have write locks, but you don't want consumers to lock out producers for a long time while they are only reading the queue size.
The message on MSDN is a valid message indeed. What's happening internally is something like this:
Thread A waits for the event
Thread B sets the event
[If thread A is in spinlock]
[yes] Thread a detects that the event is set, unsets it and resumes its work
[no] The event will tell thread A to wake up, once woken, thread A will unset the event resume its work.
Note that the internal logic is not synchronous since Thread B doesn't wait for Thread A to continue its business. You can make this synchronous by introducing a temporary ManualResetEvent that thread A has to signal once it continues its work and on which Thread B has to wait. This is not done by default due to the inner working of the windows threading model. I guess the documentation is misleading but correct for saying that the Set method only releases one or more waiting threads.
Alternatively i would suggest you to look at the BlockingCollection class in the System.Collections.Concurrent namespace of the BCL introduced in .NET 4.0 which does exactly what you are trying to do
I am using multiple threads in my application using while(true) loop and now i want to exit from loop when all the active threads complete their work.
Assuming that you have a list of the threads themselves, here are two approaches.
Solution the first:
Use Thread.Join() with a timespan parameter to synch up with each thread in turn. The return value tells you whether the thread has finished or not.
Solution the second:
Check Thread.IsAlive() to see if the thread is still running.
In either situation, make sure that your main thread yields processor time to the running threads, else your wait loop will consume most/all the CPU and starve your worker threads.
You can use Process.GetCurrentProcess().Threads.Count.
There are various approaches here, but utlimately most of them come down to your changing the executed threads to do something whenever they leave (success or via exception, which you don't want to do anyway). A simple approach might be to use Interlock.Decrement to reduce a counter - and if it is zero (or -ve, which probably means an error) release a ManualResetEvent or Monitor.Pulse an object; in either case, the original thread would be waiting on that object. A number of such approaches are discussed here.
Of course, it might be easier to look at the TPL bits in 4.0, which provide a lot of new options here (not least things like Parallel.For in PLINQ).
If you are using a synchronized work queue, it might also be possible to set that queue to close (drain) itself, and simply wait for the queue to be empty? The assumption here being that your worker threads are doing something like:
T workItem;
while(queue.TryDequeue(out workItem)) { // this may block until either something
ProcessWorkItem(workItem); // todo, or the queue is terminated
}
// queue has closed - exit the thread
in which case, once the queue is empty all your worker threads should already be in the process of suicide.
You can use Thread.Join(). The Join method will block the calling thread until the thread (the one on which the Join method is called) terminates.
So if you have a list of thread, then you can loop through and call Join on each thread. You loop will only exit when all the threads are dead. Something like this:
for(int i = 0 ;i < childThreadList.Count; i++)
{
childThreadList[i].Join();
}
///...The following code will execute when all threads in the list have been terminated...///
I find that using the Join() method is the cleanest way. I use multiple threads frequently, and each thread is typically loading data from different data sources (Informix, Oracle and SQL at the same time.) A simple loop, as mentioned above, calling Join() on each thread object (which I store in a simple List object) works!!!
Carlos Merighe.
I prefer using a HashSet of Threads:
// create a HashSet of heavy tasks (threads) to run
HashSet<Thread> Threadlist = new HashSet<Thread>();
Threadlist.Add(new Thread(() => SomeHeavyTask1()));
Threadlist.Add(new Thread(() => SomeHeavyTask2()));
Threadlist.Add(new Thread(() => SomeHeavyTask3()));
// start the threads
foreach (Thread T in Threadlist)
T.Start();
// these will execute sequential
NotSoHeavyTask1();
NotSoHeavyTask2();
NotSoHeavyTask3();
// loop through tasks to see if they are still active, and join them to main thread
foreach (Thread T in Threadlist)
if (T.ThreadState == ThreadState.Running)
T.Join();
// finally this code will execute
MoreTasksToDo();
I set the max thread to 10. Then I added 22000 task using ThreadPool.QueueUserWorkItem.
It is very likely that not all the 22000 task was completed after running the program. Is there a limitation how many task can be queued for avaiable threads?
If you need to wait for all of the tasks to process, you need to handle that yourself. The ThreadPool threads are all background threads, and will not keep the application alive.
This is a relatively clean way to handle this type of situation:
using (var mre = new ManualResetEvent(false))
{
int remainingToProcess = workItems.Count(); // Assuming workItems is a collection of "tasks"
foreach(var item in workItems)
{
// Delegate closure (in C# 4 and earlier) below will
// capture a reference to 'item', resulting in
// the incorrect item sent to ProcessTask each iteration. Use a local copy
// of the 'item' variable instead.
// C# 5/VS2012 will not require the local here.
var localItem = item;
ThreadPool.QueueUserWorkItem(delegate
{
// Replace this with your "work"
ProcessTask(localItem);
// This will (safely) decrement the remaining count, and allow the main thread to continue when we're done
if (Interlocked.Decrement(ref remainingToProcess) == 0)
mre.Set();
});
}
mre.WaitOne();
}
That being said, it's usually better to "group" together your work items if you have thousands of them, and not treat them as separate Work Items for the threadpool. This is some overhead involved in managing the list of items, and since you won't be able to process 22000 at a time, you're better off grouping these into blocks. Having single work items each process 50 or so will probably help your overall throughput quite a bit...
The queue has no practical limit however the pool itself will not exceed 64 wait handles, ie total threads active.
This is an implementation dependent question and the implementation of this function has changed a bit over time. But in .Net 4.0, you're essentially limited by the amount of memory in the system as the tasks are stored in an in memory queue. You can see this by digging through the implementation in reflector.
From the documentation of ThreadPool:
Note: The threads in the managed thread pool are background threads. That is, their IsBackground properties are true. This means that a ThreadPool thread will not keep an application running after all foreground threads have exited.
Is it possible that you're exiting before all tasks have been processed?