Why are there no concurrent collections in C#? - c#

I am trying to get an overview of the thread safety theory behind the collections in C#.
Why are there no concurrent collections as there are in Java (see the Java docs)? Some collections appear thread safe, but it is not clear to me what the position is, for example, with regard to:
compound operations,
safety of using iterators,
write operations
I do not want to reinvent the wheel! (I am not a multi-threading guru and am definitely not underestimating how hard this would be anyway).
I hope the community can help.

.NET has had relatively "low level" concurrency support until now - but .NET 4.0 introduces the System.Collections.Concurrent namespace which contains various collections which are safe and useful.
Andrew's answer is entirely correct in terms of how to deal with collections before .NET 4.0 of course - and for most uses I'd just lock appropriately when accessing a "normal" shared collection. The concurrent collections, however, make it easy to use a producer/consumer queue, etc.
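As a rough illustration of the producer/consumer queue mentioned above, here is a minimal sketch using BlockingCollection<T> from System.Collections.Concurrent. The class name ProducerConsumerSketch, the method SumViaQueue and the capacity of 16 are made up for the example; only the BlockingCollection calls are real API.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ProducerConsumerSketch
{
    // Hypothetical helper: produces 0..count-1 on one task and sums them on another.
    public static int SumViaQueue(int count)
    {
        // Bounded to 16 items (arbitrary choice): Add blocks while the queue is full.
        using (var queue = new BlockingCollection<int>(boundedCapacity: 16))
        {
            Task producer = Task.Run(() =>
            {
                for (int i = 0; i < count; i++)
                    queue.Add(i);
                queue.CompleteAdding(); // lets the consumer's loop finish
            });

            Task<int> consumer = Task.Run(() =>
            {
                int sum = 0;
                // Blocks while the queue is empty; ends once adding is complete.
                foreach (int item in queue.GetConsumingEnumerable())
                    sum += item;
                return sum;
            });

            Task.WaitAll(producer, consumer);
            return consumer.Result;
        }
    }
}
```

No explicit lock appears in the calling code; the blocking and the hand-off between the two tasks are handled inside the collection.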

C# offers several ways to work with collections across multiple threads. For a good write-up of these techniques I would recommend that you start with Collections and Synchronization (Thread Safety):
By default, Collections classes are generally not thread safe. Multiple readers can read the collection with confidence; however, any modification to the collection produces undefined results for all threads that access the collection, including the reader threads.
Collections classes can be made thread safe using any of the following methods:
Create a thread-safe wrapper using the Synchronized method, and access the collection exclusively through that wrapper.
If the class does not have a Synchronized method, derive from the class and implement a Synchronized method using the SyncRoot property.
Use a locking mechanism, such as the lock statement in C# (SyncLock in Visual Basic), on the SyncRoot property when accessing the collection.
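The first and third methods from the quote can be sketched together with the non-generic ArrayList, which has a Synchronized wrapper and a SyncRoot property. The class and method names (SyncRootSketch, FillAndCount) are invented for the example.

```csharp
using System.Collections;
using System.Threading.Tasks;

public static class SyncRootSketch
{
    // Hypothetical helper: several threads add items, then one thread enumerates.
    public static int FillAndCount(int perThread, int threads)
    {
        // The wrapper takes a lock inside each individual member (Add, Count, ...).
        ArrayList list = ArrayList.Synchronized(new ArrayList());

        Parallel.For(0, threads, t =>
        {
            for (int i = 0; i < perThread; i++)
                list.Add(i);
        });

        // Enumeration spans many calls, so hold SyncRoot for the whole loop.
        int count = 0;
        lock (list.SyncRoot)
        {
            foreach (object item in list)
                count++;
        }
        return count;
    }
}
```

The wrapper only protects single calls. Anything that spans several calls, such as the foreach here, still needs the explicit lock on SyncRoot.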

As Jon Skeet mentioned, there are now "thread safe" collections in the System.Collections.Concurrent namespace in .NET 4.
One of the reasons (at least my guess) that no concurrent collections existed in prior .NET Framework versions is that it is very hard to guarantee thread safety, even with a concurrent collection.
(This is not entirely true as some collections offer a Synchronized method to return a thread safe collection from a non-thread safe collection so there are some thread safe collections...)
For example, assume one has a thread-safe Dictionary. If one only wants to do an insert when the key does not exist, one would first query the collection to see if the key exists, and then do the insert if it does not. These two operations together are not thread safe, though: between the ContainsKey query and the Add operation another thread could have inserted that key, so there is a race condition.
In other words, the individual operations of the collection are thread safe, but the usage of it is not necessarily. In this case one would need to fall back on traditional locking techniques (mutex/monitor/semaphore...) to achieve thread safety, so the concurrent collection has bought you nothing in terms of multi-threaded safety (and is probably worse for performance).
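For this particular check-then-insert case, the ConcurrentDictionary that arrived in .NET 4 exposes TryAdd, which folds the query and the insert into one atomic call. The sketch below (InsertIfAbsentSketch and CountWinners are made-up names) lets several threads race for the same key and counts how many inserts succeed. Compound operations that span more than one call would still need external locking, as described above.

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class InsertIfAbsentSketch
{
    // Hypothetical helper: returns how many competing threads won the insert.
    public static int CountWinners(int threads)
    {
        var map = new ConcurrentDictionary<string, int>();
        int winners = 0;

        Parallel.For(0, threads, i =>
        {
            // Racy alternative: if (!map.ContainsKey("k")) then insert.
            // TryAdd performs the check and the insert as a single atomic step.
            if (map.TryAdd("k", i))
                Interlocked.Increment(ref winners);
        });

        return winners;
    }
}
```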

Related

Does read-only concurrent access to .net collections need to be guarded?

I have a .NET collection (a dictionary) that will potentially see very high concurrent read-only access. Does it still need to be guarded, meaning that I should use a thread-safe version of the collection or synchronization mechanisms, or is thread safety only a concern when reads and writes happen concurrently?
Access to a collection needs to be synchronized only when reads occur concurrently with writes.
If your collection is constructed once at the beginning of the program, and then accessed only for reading its elements or iterating over its content, then there is no need to add any additional synchronization around the reads.
.NET framework offers an Immutable Collections package to ensure this flow of execution. You build your immutable collection upfront, and then your code has no way to modify it even inadvertently.
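A minimal sketch of that build-once, read-many flow, assuming the System.Collections.Immutable package is referenced. ReadOnlySharingSketch, SumFromManyThreads and the sample keys are invented for illustration.

```csharp
using System.Collections.Immutable;
using System.Threading;
using System.Threading.Tasks;

public static class ReadOnlySharingSketch
{
    // Hypothetical helper: many threads read the same immutable dictionary.
    public static long SumFromManyThreads(int threads)
    {
        // Build once, before any other thread can see the collection.
        var builder = ImmutableDictionary.CreateBuilder<string, int>();
        builder.Add("a", 1);
        builder.Add("b", 2);
        ImmutableDictionary<string, int> lookup = builder.ToImmutable();

        long total = 0;
        Parallel.For(0, threads, i =>
        {
            // Reads only; no lock around 'lookup' because nothing can mutate it.
            Interlocked.Add(ref total, lookup["a"] + lookup["b"]);
        });
        return total;
    }
}
```

The Interlocked call protects the shared counter, not the dictionary; the dictionary itself needs no synchronization here.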
Use ConcurrentDictionary, a thread-safe collection of key/value pairs that can be accessed by multiple threads concurrently. You can read more in the ConcurrentDictionary documentation.
ConcurrentDictionary<TKey, TValue> implements the IReadOnlyCollection<T> and IReadOnlyDictionary<TKey, TValue> interfaces starting with the .NET Framework 4.6; in previous versions of the .NET Framework, the ConcurrentDictionary class did not implement these interfaces.
All the operations of this class are atomic and thread-safe. The only exceptions are the methods that accept a delegate, that is, AddOrUpdate and GetOrAdd.
For modifications and write operations to the dictionary, ConcurrentDictionary<TKey, TValue> uses fine-grained locking to ensure thread safety.
Read operations on the dictionary are performed in a lock-free manner.
However, delegates for these methods are called outside the locks to avoid the problems that can arise from executing unknown code under a lock. Therefore, the code executed by these delegates is not subject to the atomicity of the operation.
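Because the delegate runs outside the lock, GetOrAdd may invoke it more than once for the same key under contention. One common workaround, sketched here as an assumption rather than anything the documentation prescribes, is to store Lazy<T> values so that only the wrapper which actually lands in the dictionary gets evaluated. GetOrAddSketch and CountFactoryRuns are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class GetOrAddSketch
{
    // Hypothetical helper: returns how many times the expensive factory ran.
    public static int CountFactoryRuns(int threads)
    {
        var cache = new ConcurrentDictionary<string, Lazy<int>>();
        int runs = 0;

        Parallel.For(0, threads, i =>
        {
            // Several Lazy<int> wrappers may be created under contention,
            // but only one is stored, and only that one's Value is evaluated.
            Lazy<int> entry = cache.GetOrAdd("key", k => new Lazy<int>(() =>
            {
                Interlocked.Increment(ref runs);
                return k.Length; // stand-in for expensive work
            }));
            int value = entry.Value; // forces evaluation of the stored wrapper
        });

        return runs;
    }
}
```

This relies on Lazy<T>'s default thread-safety mode, which lets a single thread run the value factory while the others wait for the result.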

Is it safe for multiple threads to read from a Lookup<TKey, TElement>?

Is it safe for multiple threads to read from a Lookup<TKey, TElement>?
Lookup<TKey, TElement> is immutable, however MSDN states:
Any public static (Shared in Visual Basic) members of this type are
thread safe. Any instance members are not guaranteed to be thread
safe.
Though I shudder to imagine it, I'm wondering if the machine that pumps out MSDN documentation could be incorrect.
Because I don't like to risk that I may have to debug an obscure multithreading related bug 1 year from now, I'm going to assume it's not safe to use this class without manual synchronization.
As long as there is no writing, doing just reading is thread-safe. This is valid in any case.
Your question is in a sense orthogonal to the notion of thread-safety. A write in combination with a write or read is not thread-safe, but multiple reads without writing are thread-safe.
What MSDN says about instance members not being guaranteed to be thread-safe can only be valid in case of non-thread-safe scenarios, which by definition imply a write operation.
This is a standard disclaimer for almost all classes, as you've probably noticed. Some methods may be thread safe, but "are not guaranteed".
Generally it is safe to read from a collection using multiple threads if there are no writers to the collection. If you need to update the collection at the same time, use appropriate synchronization or built-in thread-safe collections like SynchronizedKeyedCollection.
That Lookup<TKey,TElement> is immutable means that you will get the same values for all members. It does not mean that the items stored in it cannot be modified. So the collection is indeed not thread safe. A perfect example is that most LINQ is lazily evaluated, and creating the enumerator could involve executing the lazy code. Trying to enumerate in two separate threads could cause the collection to be realized twice, producing the wrong result.
Update:
Now that the source code is available on https://referencesource.microsoft.com, it is confirmed that internal state is set during method calls without regard to multithreading, meaning that you could have race conditions and that the Lookup<TKey,TElement> class is in fact not thread safe.

Threadsafety dictionary C#

A colleague of mine recently stated it is fine for multiple read/write threads to access a C# dictionary if you don't mind retrieving stale data. His justification was that since the program would repeatedly read from the dictionary, stale data won't be an issue.
I told him that locking a collection was always necessary when you have a writer thread because the internal state of the collection will get corrupted.
Am I mistaken?
You are correct, and your colleague is wrong: one can access a dictionary from multiple threads only in the absence of writers.
.NET 4.0 adds ConcurrentDictionary<K,T> class that does precisely what its name implies.
You are correct in that some form of locking is required for writing, though just having a write doesn't mean that you have to lock() { } every access to the collection.
As you say, the non-synchronized versions of the built-in collections are thread safe only for reading. Typically a ReaderWriterLockSlim is used to manage concurrent access in cases where writes can happen, as it allows multiple threads to access the collection as long as no writes are taking place, but only one thread (the writer) during a write.
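A minimal sketch of that pattern: a plain Dictionary guarded by a ReaderWriterLockSlim. The wrapper type RwLockedDictionary and its TryGet/Set members are invented for the example, and a real wrapper would also want to dispose the lock.

```csharp
using System.Collections.Generic;
using System.Threading;

// Hypothetical wrapper: the dictionary is only ever touched under the lock.
public sealed class RwLockedDictionary<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> _inner = new Dictionary<TKey, TValue>();
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    public bool TryGet(TKey key, out TValue value)
    {
        _lock.EnterReadLock();   // many readers may hold this at once
        try { return _inner.TryGetValue(key, out value); }
        finally { _lock.ExitReadLock(); }
    }

    public void Set(TKey key, TValue value)
    {
        _lock.EnterWriteLock();  // exclusive: waits until readers have left
        try { _inner[key] = value; }
        finally { _lock.ExitWriteLock(); }
    }
}
```

The try/finally blocks make sure the lock is released even if the dictionary call throws.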
From http://msdn.microsoft.com/en-us/library/xfhwa508.aspx :
A Dictionary can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with write accesses, the collection must be locked during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
For a thread-safe alternative, see ConcurrentDictionary<TKey, TValue>.

Can a List<T> be accessed by multiple threads?

I am planning to share a List between multiple threads. The list will be locked during changes, which happen infrequently. Is there a thread safety issue if multiple iterations are made from different threads through the list simultaneously?
If you can (if you can use .NET 4 that is), use BlockingCollection<T>:
Provides blocking and bounding capabilities for thread-safe collections that implement IProducerConsumerCollection<T>.
If not then encapsulate the list completely and add thread-safe methods that access the List<T>'s state. Don't make the reference to the list public or return it from any methods - always encapsulate the reference so you can guarantee that you are locking around all access to it.
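The encapsulation idea could look roughly like this. LockedList and its members are hypothetical names; the point is only that the List<T> reference never leaves the class and every access goes through the same lock.

```csharp
using System.Collections.Generic;

// Hypothetical wrapper: all access to the inner list happens under _gate.
public sealed class LockedList<T>
{
    private readonly List<T> _items = new List<T>(); // never exposed
    private readonly object _gate = new object();

    public void Add(T item)
    {
        lock (_gate) { _items.Add(item); }
    }

    public bool Remove(T item)
    {
        lock (_gate) { return _items.Remove(item); }
    }

    // Returns a copy so callers can iterate without holding the lock.
    public List<T> Snapshot()
    {
        lock (_gate) { return new List<T>(_items); }
    }
}
```

Handing out a snapshot instead of the list trades some copying for the guarantee that no caller enumerates the live list while another thread changes it.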
A List<T> is not a thread-safe class, but if you lock every time you read from or write to it there won't be any issues. According to the documentation:
Public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
A List<T> can support multiple readers concurrently, as long as the collection is not modified. Enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with one or more write accesses, the only way to ensure thread safety is to lock the collection during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
List<T> is not thread-safe generally. Having multiple readers will not cause any issues, however, you cannot write to the list while it is being read. So you would need to lock on both read and write or use something like a System.Threading.ReaderWriterLock (which allows multiple readers but only one writer).
It can be read from multiple threads simultaneously, if that's what you're asking. Consider a reader-writer lock if so.
To answer this question you first have to look at the documentation and then at the source code, with one caveat: the source code of List<T> can change over the years.
Darin Dimitrov quoted the documentation from 2010, and it differs from the 2021 version:
Public static (Shared in Visual Basic) members of this type are thread
safe. Any instance members are not guaranteed to be thread safe.
It is safe to perform multiple read operations on a List, but
issues can occur if the collection is modified while it's being read.
To ensure thread safety, lock the collection during a read or write
operation. To enable a collection to be accessed by multiple threads
for reading and writing, you must implement your own synchronization.
For collections with built-in synchronization, see the classes in the
System.Collections.Concurrent namespace. For an inherently thread-safe
alternative, see the ImmutableList class.
As you can see, the following sentence is no longer there:
Enumerating through a collection is intrinsically not a thread-safe procedure
So my advice is: check the documentation and implementation of List<T>, and track changes across .NET versions.
The answer to your question is - it depends.
If you use foreach to iterate through the list and the list is modified, even by just calling list[i] = value where value is equal to list[i], you will get an exception, because List<T>._version (which is checked by the enumerator object) changes on set.
A for loop will not throw an exception in this case if you modify a value in the list, but a change to the length of the list can be dangerous.
If the list is not modified at all during the iteration, then the iteration is thread safe.
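The enumerator's version check can be seen even on a single thread. This sketch uses Add as the modification, since a structural change is certain to invalidate the enumerator, and contrasts it with an index-based for loop. EnumerationSketch and its two methods are made-up names.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerationSketch
{
    // Hypothetical helper: modifies the list inside foreach and reports whether it threw.
    public static bool ForeachThrowsOnModification()
    {
        var list = new List<int> { 1, 2, 3 };
        try
        {
            foreach (int x in list)
                list.Add(x); // structural change invalidates the enumerator
            return false;
        }
        catch (InvalidOperationException)
        {
            return true; // thrown by the next MoveNext after the change
        }
    }

    // Hypothetical helper: writes through the indexer inside a plain for loop.
    public static int ForLoopSum()
    {
        var list = new List<int> { 1, 2, 3 };
        int sum = 0;
        for (int i = 0; i < list.Count; i++)
        {
            list[i] = list[i]; // no enumerator involved, so no version check
            sum += list[i];
        }
        return sum;
    }
}
```

The exception is a best-effort diagnostic and not a synchronization mechanism; with real concurrent writers the for loop can still observe inconsistent state without any exception being thrown.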

List<T> concurrent removing and adding

I am not too sure, so I thought I'd ask. Would removing and adding items to a System.Collections.Generic.List<> object be non-thread safe?
My situation:
When a connection is received, it is added to the list, but also at the same time, there's a worker that's removing dead connections and such.
Is there a problem? Will a lock do?
I also want to know if I'm allowed to use a lock on the list object with its ForEach method.
Yes, adding and removing items from a List<> is not thread safe, so you need to synchronise the access, for example using lock.
Mind that the lock keyword in no way locks the object that you use as an identifier; it only prevents two threads from simultaneously entering code blocks guarded by that same object. You will need locks around all code that accesses the list, using the same object as identifier.
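Applied to the scenario in the question, that could look like the sketch below: the accept path, the cleanup worker and any ForEach call all lock on the same private object. ConnectionRegistry, its members and the use of string ids for connections are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical registry: every access to the list locks on the same _sync object.
public sealed class ConnectionRegistry
{
    private readonly List<string> _connections = new List<string>();
    private readonly object _sync = new object();

    // Called when a connection is received.
    public void OnConnected(string id)
    {
        lock (_sync) { _connections.Add(id); }
    }

    // Called by the worker; returns how many dead entries were removed.
    public int RemoveDead(Predicate<string> isDead)
    {
        lock (_sync) { return _connections.RemoveAll(isDead); }
    }

    // ForEach is fine as long as it runs under the same lock as Add and RemoveAll.
    public void ForEach(Action<string> action)
    {
        lock (_sync) { _connections.ForEach(action); }
    }

    public int Count
    {
        get { lock (_sync) { return _connections.Count; } }
    }
}
```

Keep the work done inside ForEach short, since the lock is held for the whole iteration and blocks both new connections and the cleanup worker.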
At the time of the question .NET Framework 4 did not exist yet, but people who face this problem now should try the collections from the System.Collections.Concurrent namespace to deal with thread-safety issues.
List<T> is not thread-safe, so yes, you will need to control access to the list with a lock. If you have multiple threads accessing the List, make sure they all respect the lock or you will have issues. The best way to do this would be to subclass the List so that the locking happens automatically; otherwise you will more than likely end up forgetting eventually.
Using lock around the relevant code definitely makes it thread safe, but I do not agree with it for the current scenario.
You can implement a Synchronized method to make the collection thread safe. This link explains why and how to do that.
Another purely programmatic approach is mentioned in this link; I have never tested it firsthand, but it should work.
By the way, one bigger concern: are you trying to maintain something like a connection pool on your own? If yes, then why?
I take my answer back. Using locks is a better answer than using this method.
Actually, sometimes List<> is thread-safe, and sometimes not, according to Microsoft:
Public static members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
but that page goes on to say:
Enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with one or more write accesses, the only way to ensure thread safety is to lock the collection during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
