I'm using ConcurrentBag to store objects at run time. At some point I need to empty the bag and store its contents in a list. This is what I do:
IList<T> list = new List<T>();
lock (bag)
{
    T pixel;
    while (bag.TryTake(out pixel))
    {
        list.Add(pixel);
    }
}
My question is about synchronization. As far as I have read, lock is faster than the other synchronization methods. Source: http://www.albahari.com/threading/part2.aspx.
Performance is my second concern. I'd like to know whether I can use ReaderWriterLockSlim at this point. What would be the benefit of using ReaderWriterLockSlim? The reason I ask is that I don't want this operation to block incoming requests.
If yes, should I use an upgradeable lock?
Any ideas? Comments?
I'm not sure why you're using the lock. The whole idea behind ConcurrentBag is that it's concurrent.
Unless you're just trying to prevent some other thread from taking things or adding things to the bag while you're emptying it.
Re-reading your question, I'm pretty sure you don't want to synchronize access here at all. ConcurrentBag allows multiple threads to Take and Add, without you having to do any explicit synchronization.
If you lock the bag, then no other thread can add or remove things while your code is running. Assuming, of course, that you protect every other access to the bag with a lock. And once you do that, you've completely defeated the purpose of having a lock-free concurrent data structure. Your data structure has become a poorly-performing list that's controlled by a lock.
Same thing if you use a reader-writer lock. You'd have to synchronize every access.
You don't need to add any explicit synchronization in this case. Ditch the lock.
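As a rough sketch of what "ditch the lock" could look like (the Drain helper and the BagDrainDemo class are names made up for illustration, not part of the question's code), the bag is emptied with TryTake alone. Items that other threads add while the loop is running may or may not end up in the list; that is the trade-off of not blocking them.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BagDrainDemo
{
    // Hypothetical helper: moves whatever is currently in the bag into a list.
    // No lock is taken; ConcurrentBag<T>.TryTake is already thread-safe.
    public static List<T> Drain<T>(ConcurrentBag<T> bag)
    {
        var list = new List<T>();
        T item;
        while (bag.TryTake(out item))
        {
            list.Add(item);
        }
        return list;
    }

    public static void Main()
    {
        var bag = new ConcurrentBag<int>();
        Parallel.For(0, 1000, i => bag.Add(i)); // concurrent producers, no explicit lock

        // All producers have finished here, so this empties the bag completely.
        List<int> drained = Drain(bag);
        Console.WriteLine(drained.Count); // prints 1000
        Console.WriteLine(bag.IsEmpty);   // prints True
    }
}
```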
Lock is great when threads will do a lot of operations in a row (bursty, low contention).
RWSlim is great when you have a lot more read locks than write locks (read heavy, high read contention).
Lockless is great when you need multiple readers and/or writers all working at the same time (mix of reads and writes, lots of contention).
Related
Say you have an in-memory list of strings, and a multi-threaded system, with many readers but just one writer thread.
In general, is it possible to implement this kind of system in C#, without using a lock? Would the implementation make any assumptions about how the threads interact (or place restrictions on what they can do, when)?
Yes. The trick is to make sure the list remains immutable. The writer will snapshot the main collection, modify the snapshot, and then publish the snapshot to the variable holding the reference to the main collection. The following example demonstrates this.
public class Example
{
    // This is the immutable master collection.
    volatile List<string> collection = new List<string>();

    void Writer()
    {
        var copy = new List<string>(collection); // Snapshot the collection.
        copy.Add("hello world"); // Modify the snapshot.
        collection = copy; // Publish the snapshot.
    }

    void Reader()
    {
        List<string> local = collection; // Acquire a local reference for safe reading.
        if (local.Count > 0)
        {
            DoSomething(local[0]);
        }
    }
}
There are a couple of caveats with this approach.
It only works because there is a single writer.
Writes are O(n) operations.
Different readers may be using different versions of the list simultaneously.
This is a fairly dangerous trick. There are very specific reasons why volatile was used, why a local reference is acquired on the reader side, etc. If you do not understand these reasons then do not use the pattern. There is too much that can go wrong.
The notion that this is thread-safe is semantic. No, it will not throw exceptions, blow up, or tear a hole in spacetime. But there are other ways in which this pattern can cause problems. Know what the limitations are. This is not a miracle cure for every situation.
Because of the above constraints the scenarios where this would benefit you are quite limited. The biggest problem is that writes require a full copy first so they may be slow. But, if the writes are infrequent then this might be tolerable.
I describe more patterns in my answer here as well including one that is safe for multiple writers.
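For what it's worth, one common way to extend the snapshot-and-publish idea to several writers is a compare-and-swap retry loop. The sketch below is my own assumption of how that could look (the CopyOnWriteList class and its members are hypothetical names, and this is not necessarily the pattern from the linked answer): a writer only publishes its copy if nobody else published in the meantime, and otherwise retries.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical class name; a sketch of copy-on-write with a CAS retry loop.
public class CopyOnWriteList
{
    // Published lists are never modified again after being stored here.
    private List<string> items = new List<string>();

    public void Add(string value)
    {
        while (true)
        {
            List<string> snapshot = Volatile.Read(ref items); // current version
            var copy = new List<string>(snapshot);            // O(n) copy
            copy.Add(value);

            // Publish only if no other writer replaced the list meanwhile.
            List<string> seen = Interlocked.CompareExchange(ref items, copy, snapshot);
            if (ReferenceEquals(seen, snapshot))
            {
                return;
            }
            // Another writer won the race: retry against the newer version.
        }
    }

    // Readers take a local reference and must treat it as read-only.
    public List<string> Snapshot()
    {
        return Volatile.Read(ref items);
    }
}

public static class CopyOnWriteDemo
{
    public static void Main()
    {
        var list = new CopyOnWriteList();
        Parallel.For(0, 100, i => list.Add("item" + i)); // several writers
        Console.WriteLine(list.Snapshot().Count);        // prints 100
    }
}
```

The same caveats apply: every write copies the whole list, and under heavy write contention the retries make it even more expensive.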
That is a fairly common request for a threading library to fulfill - that sort of lock is generally just called a "reader-writer lock", or some variation on that theme. I haven't ever needed to use the C# implementation specifically, but there is one: http://msdn.microsoft.com/en-us/library/system.threading.readerwriterlockslim.aspx
Of course, you run into the issue that if readers will always be reading, you'll never be able to get the writer in to write. You'll have to handle that yourself, I believe.
(Ok, so it's still technically a "lock", but it's not the C# "lock" construct, it's a more sophisticated object specifically designed for the purpose stated in the question. So I guess whether it's a correct answer depends somewhat on semantics and on why he was asking the question.)
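A minimal sketch of how ReaderWriterLockSlim might be wrapped around the list of strings from the question (the GuardedStrings class and its method names are invented for this example): any number of readers can hold the read lock at once, while the writer takes the lock exclusively.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical wrapper; only the ReaderWriterLockSlim calls are real API.
public class GuardedStrings
{
    private readonly ReaderWriterLockSlim gate = new ReaderWriterLockSlim();
    private readonly List<string> items = new List<string>();

    // Writer: exclusive access while the list is modified.
    public void Add(string value)
    {
        gate.EnterWriteLock();
        try
        {
            items.Add(value);
        }
        finally
        {
            gate.ExitWriteLock();
        }
    }

    // Readers: shared access; many threads may be in here at the same time.
    public bool TryGetFirst(out string value)
    {
        gate.EnterReadLock();
        try
        {
            if (items.Count > 0)
            {
                value = items[0];
                return true;
            }
            value = null;
            return false;
        }
        finally
        {
            gate.ExitReadLock();
        }
    }
}

public static class GuardedStringsDemo
{
    public static void Main()
    {
        var strings = new GuardedStrings();
        strings.Add("hello world");

        string first;
        if (strings.TryGetFirst(out first))
        {
            Console.WriteLine(first); // prints hello world
        }
    }
}
```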
To avoid locks, you might want to consider Microsoft's concurrent collections. These collections provide thread safe access to collections of objects in both ordered and unordered forms. They use some neat tricks to avoid locking internally in as many instances as possible.
You can also use Microsoft's new Immutable Collections library: http://blogs.msdn.com/b/bclteam/archive/2012/12/18/preview-of-immutable-collections-released-on-nuget.aspx
Note: this is completely separate from the Concurrent Collections.
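A small sketch of how the immutable collections could be used for the readers-and-writer scenario, assuming the System.Collections.Immutable package is available (the ImmutableHolder class is a made-up name): the field always points at a complete, unchangeable list, and ImmutableInterlocked.Update swaps in a new version atomically.

```csharp
using System;
using System.Collections.Immutable;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical holder class for illustration.
public class ImmutableHolder
{
    private ImmutableList<string> items = ImmutableList<string>.Empty;

    // Update retries internally if another thread swapped the field first.
    public void Add(string value)
    {
        ImmutableInterlocked.Update(ref items, list => list.Add(value));
    }

    // Readers get a version that can never change underneath them.
    public ImmutableList<string> Snapshot()
    {
        return Volatile.Read(ref items);
    }
}

public static class ImmutableHolderDemo
{
    public static void Main()
    {
        var holder = new ImmutableHolder();
        ImmutableList<string> before = holder.Snapshot();

        Parallel.For(0, 100, i => holder.Add("item" + i));

        Console.WriteLine(before.Count);            // prints 0
        Console.WriteLine(holder.Snapshot().Count); // prints 100
    }
}
```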
A singly-linked-list approach can be used without locks provided the writer only inserts/deletes at either the head or the tail. In either case, if you construct the new node beforehand, you only need a single atomic operation (head = newHead; or tail.next = newTail) to make the operation visible to the readers.
In terms of performance, insertions and deletions are O(1), while length calculation is O(n).
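Here is a rough sketch of the head-insertion variant, assuming exactly one writer thread (Node and SingleWriterList are names I made up). Nodes are immutable once constructed, so the only shared write is the single assignment to head.

```csharp
using System;
using System.Collections.Generic;

// Immutable node: fully constructed before it becomes visible to readers.
public sealed class Node
{
    public readonly string Value;
    public readonly Node Next;

    public Node(string value, Node next)
    {
        Value = value;
        Next = next;
    }
}

// Hypothetical list type; safe only with ONE writer and any number of readers.
public class SingleWriterList
{
    private volatile Node head;

    // Writer: build the node first, then publish it with one reference write.
    public void AddFirst(string value)
    {
        head = new Node(value, head);
    }

    // Writer: unlinking the head is also a single reference write.
    public void RemoveFirst()
    {
        Node current = head;
        if (current != null)
        {
            head = current.Next;
        }
    }

    // Readers: read head once, then walk a chain that never changes.
    public IEnumerable<string> Read()
    {
        for (Node n = head; n != null; n = n.Next)
        {
            yield return n.Value;
        }
    }
}

public static class SingleWriterListDemo
{
    public static void Main()
    {
        var list = new SingleWriterList();
        list.AddFirst("a");
        list.AddFirst("b");
        list.AddFirst("c");
        Console.WriteLine(string.Join(",", list.Read())); // prints c,b,a

        list.RemoveFirst();
        Console.WriteLine(string.Join(",", list.Read())); // prints b,a
    }
}
```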
What is better:
to have large code area in lock statement
or
to have small locks in large area?..
The exchanges collection in this sample does not change.
lock (padLock)
{
    foreach (string ex in exchanges)
    {
        sub.Add(x.ID, new Subscription(ch, queue.QueueName, true));
        .........
    }
}
or
foreach (string ex in exchanges)
{
    lock (padLock)
    {
        sub.Add(x.ID, new Subscription(ch, queue.QueueName, true));
    }
    .....
}
The wider the lock, the less you gain from multithreading, and vice versa.
So the use of locks depends entirely on your logic. Lock only the things and places that change and must be run by only one thread at a time.
If you are locking only to use the collection sub, use the smaller lock, but if you run multiple simultaneous foreach loops in parallel
Good practice is to lock only the area that you want to be executed by one thread at a given time.
If that area is the whole foreach loop, then the first approach is fine,
but if that area is just one line, as you have shown in the second approach, then go for the second approach.
In this particular case, the best option is the first one, because otherwise you're just wasting time locking/unlocking since you have to execute the entire loop anyway. So there's not much opportunity for parallelism in a loop that executes individually atomic operations anyway.
For more general advice on critical section sizes check this article: http://software.intel.com/en-us/articles/managing-lock-contention-large-and-small-critical-sections/
I think there are two different questions:
1. Which would be correct?
2. Which would give better performance?
The correctness question is complicated. It depends on your data structures, and how you intend the lock to protect them. If the "sub" object is not thread-safe, you definitely need the big lock.
The performance question is simpler and less important (but for some reason, I think you're focused on it more).
Many small locks may be slower, because they just do more work. But if you manage to run a large portion of the loop code without lock, you get some concurrency, so it would be better.
You can't effectively judge which is "right" with the given code snippets. The first example says it is not OK for people to see sub with only part of the content from exchanges. The second example says it is OK for people to see sub with only part of the content from exchanges.
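To make the difference concrete, here is a sketch of both shapes side by side. The Subscriptions class, the Dictionary<string, int> standing in for sub, and the use of ex.Length as a placeholder for the real work are all assumptions for illustration. With the coarse lock, other threads see either none or all of the new entries; with the fine lock, they may see a partially filled collection, but the work between lock acquisitions can overlap with other threads.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the question's code.
public class Subscriptions
{
    private readonly object padLock = new object();
    private readonly Dictionary<string, int> sub = new Dictionary<string, int>();

    // One lock around the whole loop: the batch appears atomically.
    public void AddAllCoarse(IEnumerable<string> exchanges)
    {
        lock (padLock)
        {
            foreach (string ex in exchanges)
            {
                sub[ex] = ex.Length;
            }
        }
    }

    // One lock per item: observers may see only part of the batch.
    public void AddAllFine(IEnumerable<string> exchanges)
    {
        foreach (string ex in exchanges)
        {
            int value = ex.Length; // placeholder for work that needs no lock
            lock (padLock)
            {
                sub[ex] = value;
            }
        }
    }

    public int Count
    {
        get { lock (padLock) { return sub.Count; } }
    }
}

public static class SubscriptionsDemo
{
    public static void Main()
    {
        var subs = new Subscriptions();
        subs.AddAllCoarse(new[] { "a", "b" });
        Console.WriteLine(subs.Count); // prints 2
        subs.AddAllFine(new[] { "c", "d" });
        Console.WriteLine(subs.Count); // prints 4
    }
}
```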
I have two threads, a producer thread that places objects into a generic List collection and a consumer thread that pulls those objects out of the same generic List. I've got the reads and writes to the collection properly synchronized using the lock keyword, and everything is working fine.
What I want to know is if it is ok to access the Count property without first locking the collection.
JaredPar refers to the Count property in his blog as a decision procedure that can lead to race conditions, like this:
if (list.Count > 0)
{
    return list[0];
}
If the list has one item and that item is removed after the Count property is accessed but before the indexer, an exception will occur. I get that.
But would it be OK to use the Count property to, say, determine the initial size of a completely different collection? The MSDN documentation says that instance members are not guaranteed to be thread safe, so should I just lock the collection before accessing the Count property?
I suspect it's "safe" in terms of "it's not going to cause anything to go catastrophically wrong" - but that you may get stale data. That's because I suspect it's just held in a simple variable, and that that's likely to be the case in the future. That's not the same as a guarantee though.
Personally I'd keep it simple: if you're accessing shared mutable data, only do so in a lock (using the same lock for the same data). Lock-free programming is all very well if you've got appropriate isolation in place (so you know you've got appropriate memory barriers, and you know that you'll never be modifying it in one thread while you're reading from it in another) but it sounds like that isn't the case here.
The good news is that acquiring an uncontested lock is incredibly cheap - so I'd go for the safe route if I were you. Threading is hard enough without introducing race conditions which are likely to give no significant performance benefit but at the cost of rare and unreproducible bugs.
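A sketch of the "safe route", assuming the list is wrapped in a small class of your own (SharedItems and its members are hypothetical names): Count is read under the same lock as every other access, and the result is then used as a capacity hint for an unrelated collection.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: every access to the list goes through the same lock.
public class SharedItems<T>
{
    private readonly object gate = new object();
    private readonly List<T> items = new List<T>();

    public void Add(T item)
    {
        lock (gate) { items.Add(item); }
    }

    public int Count
    {
        get { lock (gate) { return items.Count; } }
    }
}

public static class SharedItemsDemo
{
    public static void Main()
    {
        var shared = new SharedItems<int>();
        shared.Add(1);
        shared.Add(2);
        shared.Add(3);

        // The count may be out of date a moment later, which is fine
        // when it is only used as an initial capacity.
        var other = new List<string>(shared.Count);
        Console.WriteLine(other.Capacity); // prints 3
    }
}
```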
When doing thread synchronization in C#, should I also lock an object when I read a value, or only when changing it?
For example, I have a Queue<T> object. Should I just lock it when doing the Enqueue and Dequeue, or should I also lock it when checking values like Count?
From MSDN:
A Queue<T> can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. To guarantee thread safety during enumeration, you can lock the collection during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
You should ensure no reader is active while an item is queued (a lock is probably a good idea).
Looking at the count in reflector reveals a read from a private field. This can be okay depending on what you do with the value. This means you shouldn't do stuff like this (without proper locking):
if (queue.Count > 0)
    queue.Dequeue();
Depends on what you want to do with lock. Usually this kind of locking needs a reader/writer locking mechanism.
Readers/writers locking means that readers share a lock, so you can have multiple readers reading the collection simultaneously, but to write, you should acquire an exclusive lock.
If you don't lock it, you may get an older value. A race condition could occur such that a write operation is performed changing Count, but you would get the value before the change. For example, if the queue has only one item, and a thread calls dequeue, another thread may read the count, find it still 1, and call dequeue again. The second call won't be done until the lock is granted, but at that time the queue would actually be empty.
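One way to avoid that race, sketched here under the assumption that the queue is wrapped in a class of your own (LockedQueue is a made-up name), is to make the Count check and the Dequeue a single operation inside one lock, so no other thread can empty the queue between the two steps.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper around Queue<T>; all access uses the same lock.
public class LockedQueue<T>
{
    private readonly object gate = new object();
    private readonly Queue<T> queue = new Queue<T>();

    public void Enqueue(T item)
    {
        lock (gate) { queue.Enqueue(item); }
    }

    // Check and dequeue happen atomically with respect to other callers.
    public bool TryDequeue(out T item)
    {
        lock (gate)
        {
            if (queue.Count > 0)
            {
                item = queue.Dequeue();
                return true;
            }
            item = default(T);
            return false;
        }
    }
}

public static class LockedQueueDemo
{
    public static void Main()
    {
        var queue = new LockedQueue<int>();
        queue.Enqueue(42);

        int value;
        Console.WriteLine(queue.TryDequeue(out value)); // prints True
        Console.WriteLine(value);                       // prints 42
        Console.WriteLine(queue.TryDequeue(out value)); // prints False
    }
}
```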
The CLR guarantees atomic reads for values up to the width of the processor. So if you're running on 32 bit, reading ints will be atomic. If you're running on a 64 bit machine, reading longs will be atomic. Ergo, if Count is an Int32, the read itself cannot be torn and needs no lock for atomicity, although the value may already be stale by the time you act on it.
This post is pertinent to your question.
I'm trying to get my multithreading understanding locked down. I'm doing my best to teach myself, but some of these issues need clarification.
I've gone through three iterations with a piece of code, experimenting with locking.
In this code, the only thing that needs locking is this.managerThreadPriority.
First, the simple, procedural approach, with minimalistic locking.
var managerThread = new Thread
(
    new ThreadStart(this.ManagerThreadEntryPoint)
);
lock (this.locker)
{
    managerThread.Priority = this.managerThreadPriority;
}
managerThread.Name = string.Format("Manager Thread ({0})", managerThread.GetHashCode());
managerThread.Start();
Next, a single statement to create and launch a new thread, but the lock appears to be scoped too large, to include the creation and launching of the thread. The compiler doesn't somehow magically know that the lock can be released after this.managerThreadPriority is used.
This kind of naive locking should be avoided, I would assume.
lock (this.locker)
{
    new Thread
    (
        new ThreadStart(this.ManagerThreadEntryPoint)
    )
    {
        Priority = this.managerThreadPriority,
        Name = string.Format("Manager Thread ({0})", GetHashCode())
    }
    .Start();
}
Last, a single statement to create and launch a new thread, with an "embedded" lock only around the shared field.
new Thread
(
    new ThreadStart(this.ManagerThreadEntryPoint)
)
{
    Priority = new Func<ThreadPriority>(() =>
    {
        lock (this.locker)
        {
            return this.managerThreadPriority;
        }
    })(),
    Name = string.Format("Manager Thread ({0})", GetHashCode())
}
.Start();
Care to comment about the scoping of lock statements? For example, if I need to use a field in an if statement and that field needs to be locked, should I avoid locking the entire if statement? E.g.
bool isDumb;
lock (this.locker) isDumb = this.FieldAccessibleByMultipleThreads;
if (isDumb) ...
Vs.
lock (this.locker)
{
    if (this.FieldAccessibleByMultipleThreads) ...
}
1) Before you even start the other thread, you don't have to worry about shared access to it at all.
2) Yes, you should lock all access to shared mutable data. (If it's immutable, no locking is required.)
3) Don't use GetHashCode() to indicate a thread ID. Use Thread.ManagedThreadId. I know, there are books which recommend Thread.GetHashCode() - but look at the docs.
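Putting those three points together, a sketch might look like the following. The Manager class and the StartManagerThread method are invented for illustration, and the lock around the priority field only matters on the assumption that other, already running threads can change it.

```csharp
using System;
using System.Threading;

// Hypothetical class mirroring the field names from the question.
public class Manager
{
    private readonly object locker = new object();
    private ThreadPriority managerThreadPriority = ThreadPriority.Normal;

    public Thread StartManagerThread()
    {
        // Copy the shared field under the lock, then release it right away.
        ThreadPriority priority;
        lock (this.locker)
        {
            priority = this.managerThreadPriority;
        }

        // The new thread is not running yet, so configuring it needs no lock.
        var managerThread = new Thread(this.ManagerThreadEntryPoint);
        managerThread.Priority = priority;
        managerThread.Name = string.Format(
            "Manager Thread ({0})", managerThread.ManagedThreadId);
        managerThread.Start();
        return managerThread;
    }

    private void ManagerThreadEntryPoint()
    {
        Console.WriteLine(Thread.CurrentThread.Name);
    }
}

public static class ManagerDemo
{
    public static void Main()
    {
        Thread thread = new Manager().StartManagerThread();
        thread.Join(); // wait for the manager thread to print its name
    }
}
```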
Care to comment about the scoping of lock statements? For example, if I need to use a field in an if statement and that field needs to be locked, should I avoid locking the entire if statement?
In general, it should be scoped for the portion of code that needs the resource being guarded, and no more than that. This is so it can be available for other threads to make use of it as soon as possible.
But it depends on whether the resource you are locking is part of a bigger picture that has to maintain consistency, or whether it is a standalone resource not related directly to any other.
If you have interrelated parts that need to all change in a synchronized manner, that whole set of parts needs to be locked for the duration of the whole process.
If you have an independent, single item uncoupled to anything else, then only that one item needs to be locked long enough for a portion of the process to access it.
Another way to say it is, are you protecting synchronous or asynchronous access to the resource?
Synchronous access needs to hold on to it longer in general because it cares about a bigger picture that the resource is a part of. It must maintain consistency with related resources. You may very well wrap an entire for-loop in such a case if you want to prevent interruptions until all are processed.
Asynchronous access should hold onto it as briefly as possible. Thus, the more appropriate place for the lock would be inside portions of code, such as inside a for-loop or if-statement so you can free up the individual elements right away even before other ones are processed.
Aside from these two considerations, I would add one more. Avoid nesting of locks involving two different locking objects. I have learned by experience that it is a likely source of deadlocks, particularly if other parts of the code use them. If the two objects are part of a group that needs to be treated as a single whole all the time, such nesting should be refactored out.
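As an illustration of that last refactoring (the Ledger class and its fields are a hypothetical example, not code from the question), two values that must always change together are guarded by one lock object rather than one lock each, so there is never a second lock to acquire while holding the first.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical example: two related fields treated as a single whole.
public class Ledger
{
    private readonly object gate = new object(); // one lock for both fields
    private int checking = 100;
    private int savings = 0;

    // Both fields change inside the same critical section, so no reader
    // can observe the money as missing or duplicated.
    public void Transfer(int amount)
    {
        lock (gate)
        {
            checking -= amount;
            savings += amount;
        }
    }

    public int Total
    {
        get { lock (gate) { return checking + savings; } }
    }
}

public static class LedgerDemo
{
    public static void Main()
    {
        var ledger = new Ledger();
        Parallel.For(0, 50, i => ledger.Transfer(1)); // concurrent transfers
        Console.WriteLine(ledger.Total);              // prints 100
    }
}
```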
There is no need to lock anything before you have started any threads.
If you are only going to read a variable there's no need for locks either. It's when you mix reads and writes that you need to use mutexes and similar locking, and you need to lock in both the reading and the writing thread.