List<T> concurrent removing and adding - C#

I am not too sure, so I thought I'd ask. Would removing and adding items to a System.Collections.Generic.List<> object be non-thread-safe?
My situation:
When a connection is received, it is added to the list, but also at the same time, there's a worker that's removing dead connections and such.
Is there a problem? Will a lock do?
I also want to know whether I'm allowed to use a lock on the list object together with its ForEach method.

Yes, adding and removing items from a List<> is not thread safe, so you need to synchronise the access, for example using lock.
Mind that the lock keyword in no way locks the object that you use as an identifier; it only prevents two threads from entering the same code block at the same time. You will need locks around all code that accesses the list, using the same object as the identifier.
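A minimal sketch of that pattern for the connection scenario described above (the class and member names here are illustrative, not taken from the question):

```csharp
using System;
using System.Collections.Generic;

public class ConnectionRegistry
{
    private readonly List<string> _connections = new List<string>();
    private readonly object _sync = new object(); // the one shared lock identifier

    // Called when a connection is received.
    public void Add(string connection)
    {
        lock (_sync) { _connections.Add(connection); }
    }

    // Called by the worker that culls dead connections.
    public void RemoveDead(Predicate<string> isDead)
    {
        lock (_sync) { _connections.RemoveAll(isDead); }
    }

    // Even reads must take the same lock while writers exist.
    public int Count
    {
        get { lock (_sync) { return _connections.Count; } }
    }
}
```

Every member, readers included, takes the same `_sync` object; locking only the writers would leave readers free to observe the list mid-mutation.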

.NET Framework 4 did not exist at the time of the question, but anyone facing the problem now should consider the collections in the System.Collections.Concurrent namespace, which are designed for thread-safe access.
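For example, a sketch using ConcurrentBag<T>, which synchronizes internally so no explicit lock is needed (the connection strings are placeholders):

```csharp
using System;
using System.Collections.Concurrent;

var connections = new ConcurrentBag<string>();

connections.Add("conn-1"); // safe to call from any thread
connections.Add("conn-2");

// Removal is also safe; TryTake returns false if the bag is empty,
// so concurrent removers never race into an exception.
if (connections.TryTake(out var removed))
    Console.WriteLine($"removed {removed}");
```

ConcurrentQueue<T>, ConcurrentStack<T>, and ConcurrentDictionary<TKey, TValue> follow the same try-based pattern.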

List<T> is not thread-safe, so yes, you will need to control access to the list with a lock. If you have multiple threads accessing the list, make sure they all respect the lock or you will have issues. The best way to do this would be to wrap the list in your own class so that the locking happens automatically (List<T>'s members are not virtual, so subclassing alone cannot intercept them); otherwise you will more than likely end up forgetting eventually.

Using lock around particular code certainly makes that code thread safe, but I do not agree that it is the right fit for the current scenario.
You can use the Synchronized method to make a collection thread safe. This link explains why and how to do that.
Another purely programmatic approach is mentioned in this link; I have never tested it firsthand, but it should work.
By the way, one of the bigger concerns is: are you trying to maintain something like a connection pool on your own? If so, why?
I take my answer back. Using locks is a better answer than using this method.

Actually, sometimes List<> is thread-safe, and sometimes not, according to Microsoft:
Public static members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
but that page goes on to say:
Enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with one or more write accesses, the only way to ensure thread safety is to lock the collection during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.


Is it safe to apply double-checked locking to Dictionary?
I.e. is it safe to call TryGetValue and other "get/contains" methods from different threads (without calling other, non-get methods)?
Update: Would the collection be safe for N readers AND 1 writer? Assume 10 threads loop trying to access the element with key X using double-checked locking, and whichever thread gets it simply removes it. At some point I add an element with key X from another thread (using a lock). I expect that exactly one reader should obtain this element and delete it.
Update 2, about the answer: my question was confusing. I actually asked two questions:
Is it safe to call TryGetValue and other "get/contains" methods from different threads (without calling other, non-get methods)?
Would collection be safe for N readers AND 1 writer?
The answer for first question is Yes and the answer for second question is No.
So sometimes it is safe to apply double-checked locking and sometimes it is not. It depends on whether you are writing to the collection at the same time.
I assume you're talking about the generic Dictionary<TKey, TValue> class. That class is safe for N readers or 1 writer. So as long as you're not modifying it, you can have as many threads you want reading from it, no lock required.
If it's possible that a thread will want to modify the dictionary, you have to synchronize access to it. I would suggest ReaderWriterLockSlim.
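A sketch of that suggestion: a small wrapper (the class name is illustrative) where ReaderWriterLockSlim lets any number of readers proceed in parallel but gives a writer exclusive access:

```csharp
using System.Collections.Generic;
using System.Threading;

public class SafeLookup<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> _map = new Dictionary<TKey, TValue>();
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();

    public bool TryGetValue(TKey key, out TValue value)
    {
        _lock.EnterReadLock(); // many readers may hold this concurrently
        try { return _map.TryGetValue(key, out value); }
        finally { _lock.ExitReadLock(); }
    }

    public void Set(TKey key, TValue value)
    {
        _lock.EnterWriteLock(); // exclusive: waits for all readers to leave
        try { _map[key] = value; }
        finally { _lock.ExitWriteLock(); }
    }
}
```

Compared to a plain lock, this only pays off when reads vastly outnumber writes, which is exactly the N-readers/1-writer shape the question describes.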
This is not safe because it is not documented to be safe. You can't have one writer and N readers.
Here is an applicable sentence from the docs:
A Dictionary can support multiple readers concurrently, as long as the collection is not modified.
Actually, if you peek into Dictionary with Reflector you can see that it is unsafe, but that is not the point. The point is that you cannot rely on undocumented properties because they can change at any time, introducing bugs into production that nobody knows about.
You also cannot test this to be safe. It might work on your box and break somewhere else. This is the nature of threading bugs. Not worth it.

Is it safe for multiple threads to read from a Lookup<TKey, TElement>?

Lookup<TKey, TElement> is immutable, however MSDN states:
Any public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
Though I shudder to imagine it, I'm wondering if the machine that pumps out MSDN documentation could be incorrect.
Because I don't like to risk that I may have to debug an obscure multithreading related bug 1 year from now, I'm going to assume it's not safe to use this class without manual synchronization.
As long as there is no writing, doing just reading is thread-safe. This is valid in any case.
Your question is in a sense orthogonal to the notion of thread-safety. A write in combination with a write or read is not thread-safe, but multiple reads without writing are thread-safe.
What MSDN says about instance members not being guaranteed to be thread-safe can only be valid in case of non-thread-safe scenarios, which by definition imply a write operation.
This is the standard disclaimer for almost all classes, as you've probably noticed. Some methods may be thread safe, but they "are not guaranteed" to be.
Generally it is safe to read from a collection using multiple threads if there are no writers to the collection. If you need to update the collection at the same time, use appropriate synchronization or built-in thread-safe collections like SynchronizedKeyedCollection.
The fact that Lookup<TKey,TElement> is immutable means that you will get the same values from all its members. It does not mean that the items stored in it cannot be modified. So the collection is indeed not thread safe. A perfect example: most LINQ is lazily evaluated, and creating the enumerator could involve executing the lazy code. Trying to enumerate in two separate threads could cause the collection to be realized twice, producing the wrong result.
Update:
Now that the source code is available on https://referencesource.microsoft.com, it is confirmed that internal state is set during method calls without regard to multithreading, meaning that you could have race conditions; the Lookup<TKey,TElement> class is in fact not thread safe.

Threadsafety dictionary C#

A colleague of mine recently stated that it is fine for multiple reader and writer threads to access a C# dictionary if you don't mind retrieving stale data. His justification was that since the program would repeatedly read from the dictionary, stale data won't be an issue.
I told him that locking a collection was always necessary when you have a writer thread because the internal state of the collection will get corrupted.
Am I mistaken?
You are correct, and your colleague is wrong: one can access a dictionary from multiple threads only in the absence of writers.
.NET 4.0 adds the ConcurrentDictionary<TKey, TValue> class, which does precisely what its name implies.
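A minimal sketch of its use (key and values are placeholders): all of these calls are safe to make concurrently from any number of threads, and AddOrUpdate makes the read-modify-write a single atomic operation rather than a racy check-then-act:

```csharp
using System.Collections.Concurrent;

var map = new ConcurrentDictionary<string, int>();

map.TryAdd("hits", 1);                             // safe from any thread
map.AddOrUpdate("hits", 1, (key, old) => old + 1); // atomic read-modify-write
map.TryGetValue("hits", out var count);            // count == 2
map.TryRemove("hits", out _);                      // safe concurrent removal
```

The try-based API exists precisely because another thread may add or remove a key between any two of your calls.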
You are correct that some form of locking is required for writing, though having writes doesn't mean you have to wrap every single access to the collection in lock() { }.
As you say, the non-synchronized versions of the built-in collections are thread safe only for reading. Typically a ReaderWriterLockSlim is used to manage concurrent access in cases where writes can happen, as it allows multiple threads to access the collection as long as no writes are taking place, but only one thread (the writer) during a write.
From http://msdn.microsoft.com/en-us/library/xfhwa508.aspx :
A Dictionary can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with write accesses, the collection must be locked during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
For a thread-safe alternative, see ConcurrentDictionary<TKey, TValue>.

Can a List<t> be accessed by multiple threads?

I am planning to share a List between multiple threads. The list will be locked during changes, which happen infrequently. Is there a thread-safety issue if multiple iterations are made from different threads through the list simultaneously?
If you can (if you can use .NET 4 that is), use BlockingCollection<T>:
Provides blocking and bounding capabilities for thread-safe collections that implement IProducerConsumerCollection<T>.
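A sketch of the producer/consumer shape BlockingCollection<T> is built for (the bound of 100 is arbitrary): the producer blocks when the collection is full, and GetConsumingEnumerable blocks until items arrive and ends cleanly once CompleteAdding has been called:

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

var queue = new BlockingCollection<int>(boundedCapacity: 100);

var producer = Task.Run(() =>
{
    for (int i = 0; i < 10; i++)
        queue.Add(i);       // blocks if the bound is reached
    queue.CompleteAdding(); // signals consumers that no more items are coming
});

var consumer = Task.Run(() =>
{
    int sum = 0;
    // Blocks waiting for items; the loop exits when adding is complete
    // and the collection has drained.
    foreach (var item in queue.GetConsumingEnumerable())
        sum += item;
    return sum;
});

Task.WaitAll(producer, consumer);
```

No manual lock appears anywhere; the collection handles all synchronization internally.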
If not then encapsulate the list completely and add thread-safe methods that access the List<T>'s state. Don't make the reference to the list public or return it from any methods - always encapsulate the reference so you can guarantee that you are locking around all access to it.
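One sketch of that encapsulation (the class name is illustrative): the list reference never escapes, and iteration is served from a snapshot so callers never enumerate the live list while a writer mutates it:

```csharp
using System.Collections.Generic;

public class SharedList<T>
{
    private readonly List<T> _items = new List<T>();
    private readonly object _sync = new object();

    public void Add(T item)
    {
        lock (_sync) { _items.Add(item); }
    }

    public bool Remove(T item)
    {
        lock (_sync) { return _items.Remove(item); }
    }

    // Never return _items itself; hand callers a copy they can
    // iterate freely without holding the lock.
    public IReadOnlyList<T> Snapshot()
    {
        lock (_sync) { return _items.ToArray(); }
    }
}
```

The snapshot costs an array copy per read, which is a reasonable trade when, as in the question, changes are infrequent.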
A List<T> is not a thread-safe class, but if you lock every time you read/write to it there won't be any issues. According to the documentation:
Public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
A List<T> can support multiple readers concurrently, as long as the collection is not modified. Enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with one or more write accesses, the only way to ensure thread safety is to lock the collection during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
List<T> is not thread-safe in general. Having multiple readers will not cause any issues; however, you cannot write to the list while it is being read. So you would need to lock on both reads and writes, or use something like System.Threading.ReaderWriterLock (which allows multiple readers but only one writer).
It can be read from multiple threads simultaneously, if that's what you're asking. Consider a reader-writer lock if so.
To answer this question you have to look first at the documentation, then at the source code, with one caveat: the source code of List<T> can change over the years.
Darin Dimitrov quoted the documentation as it stood in 2010, and it differs from the 2021 version:
Public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
It is safe to perform multiple read operations on a List<T>, but issues can occur if the collection is modified while it's being read. To ensure thread safety, lock the collection during a read or write operation. To enable a collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization. For collections with built-in synchronization, see the classes in the System.Collections.Concurrent namespace. For an inherently thread-safe alternative, see the ImmutableList class.
As you can see, the sentence
Enumerating through a collection is intrinsically not a thread-safe procedure
is no longer there.
So my advice is: check the documentation and implementation of List<T>, and track changes across .NET Framework versions.
The answer to your question is - it depends.
If you use foreach to iterate through the list and the list is modified — even by just calling list[i] = value, where value equals list[i] — you will get an exception, since List<T>._version (which the enumerator object checks) is changed by the setter.
A for loop, by contrast, will not throw an exception if you modify a value in the list, but a change to the length of the list can be dangerous.
If the list is not modified at all during the iteration, then the iteration is thread safe.
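The enumerator's version check can be demonstrated directly. This sketch uses Add, which is unambiguously a structural change; the next MoveNext in the foreach detects the version mismatch and throws:

```csharp
using System;
using System.Collections.Generic;

var list = new List<int> { 1, 2, 3 };
var threw = false;

try
{
    foreach (var x in list)
        list.Add(x); // structural change bumps List<T>._version
}
catch (InvalidOperationException)
{
    threw = true;    // the enumerator detected the modification
}

Console.WriteLine(threw); // prints True
```

Note this single-threaded check is a convenience, not a concurrency guarantee: a modification from another thread can corrupt state before any enumerator ever notices.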

What is meant by a 'thread safe' object?

I have used the generic queue from the C# collections, and everyone says that it is better to use System.Collections.Generic.Queue because of thread safety.
Please advise on the right decision in using the Queue object, and explain how it is thread safe.
"Thread safe" is a bit of an unfortunate term because it doesn't really have a solid definition. Basically it means that certain operations on the object are guaranteed to behave sensibly when the object is being operated on via multiple threads.
Consider the simplest example: a counter. Suppose you have two threads that are incrementing a counter. If the sequence of events goes:
Thread one reads from the counter, gets zero.
Thread two reads from the counter, gets zero.
Thread one increments zero, writes one to the counter.
Thread two increments zero, writes one to the counter.
Then notice how the counter has "lost" one of the increments. Simple increment operations on counters are not threadsafe; to make them threadsafe you can use locks, or Interlocked.Increment.
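A sketch contrasting the racy increment with the atomic one (the iteration count of 100,000 is arbitrary):

```csharp
using System.Threading;
using System.Threading.Tasks;

int unsafeCounter = 0;
int safeCounter = 0;

Parallel.For(0, 100_000, _ =>
{
    // Racy: ++ compiles to read, add, write — three steps that can
    // interleave between threads, losing increments as described above.
    unsafeCounter++;

    // Atomic: the hardware performs the whole read-modify-write
    // as one indivisible operation.
    Interlocked.Increment(ref safeCounter);
});

// safeCounter is always 100000; unsafeCounter frequently comes up short.
```

Running this a few times usually shows the unsafe counter losing increments, exactly the interleaving walked through in the four steps above.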
Similarly with queues. Non-thread-safe queues can "lose" enqueues the same way that non-thread-safe counters can lose increments. Worse, they can even crash or produce crazy results if you use them improperly in a multi-threaded scenario.
The difficulty with "thread safe" is that it is not clearly defined. Does it simply mean "will not crash"? Does it mean that sensible results will be produced? For example, suppose you have a "threadsafe" collection. Is this code correct?
if (!collection.IsEmpty) Console.WriteLine(collection[0]);
No. Even if the collection is "threadsafe", that doesn't mean that this code is correct; another thread could have made the collection empty after the check but before the WriteLine, and therefore this code could crash even if the object is allegedly "threadsafe". Actually determining that every relevant combination of operations is threadsafe is an extremely difficult problem.
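The usual fix is to make the check and the act a single compound operation under one lock, so no other thread can intervene between them. A minimal sketch (collection contents and lock object are illustrative):

```csharp
using System.Collections.Generic;

var collection = new List<string> { "first" };
var sync = new object();

string head = null;

// The emptiness check and the indexer read happen under the same
// lock, as one atomic compound operation; a writer holding the same
// lock cannot empty the collection between the two steps.
lock (sync)
{
    if (collection.Count > 0)
        head = collection[0];
}
```

This only works if every writer takes the same lock; a "threadsafe" collection cannot supply that guarantee for you, which is the point being made above.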
Now to come to your actual situation: anyone who is telling you "you should use the Queue class, it is better because it is threadsafe" probably does not have a clear idea of what they're talking about. First off, Queue is not threadsafe. Second, whether Queue is threadsafe or not is completely irrelevant if you are only using the object on a single thread! If you have a collection that is going to be accessed on multiple threads, then, as I indicated in my example above, you have an extremely difficult problem to solve, regardless of whether the collection itself is "threadsafe". You have to determine that every combination of operations you perform on the collection is also threadsafe. This is a very difficult problem, and if it is one you face, then you should use the services of an expert on this difficult topic.
A type that is thread safe can be safely accessed from multiple threads without concern for concurrency. This usually means that the type is read-only.
Interestingly enough, Queue<T> is not thread safe - it can support concurrent reads as long as the queue isn't modified but that isn't the same thing as thread safety.
In order to think about thread safety consider what would happen if two threads were accessing a Queue<T> and a third thread came along and began either adding to or removing from this Queue<T>. Since this type does not restrict this behavior it is not thread safe.
In dealing with multithreading, you usually have to deal with concurrency issues. The term "concurrency issues" refers to issues that are specifically introduced by the possibility of interleaving instructions from two different execution contexts on a resource shared by both. Here, in terms of thread safety, the execution contexts are two threads within a process; however, in related subjects they might be processes.
Thread safety measures are put in place to achieve two goals primarily. First is to regain determinism with regard to what happens if the threads context-switch (which is otherwise controlled by the OS and thus basically nondeterministic in user-level programs), to prevent certain tasks from being left half-finished or two contexts writing to the same location in memory one after the other. Most measures simply use a little bit of hardware-supported test-and-set instructions and the like, as well as software-level synchronization constructs to force all other execution contexts to stay away from a data type while another one is doing work that should not be interrupted.
Usually, objects that are read-only are thread-safe. Many objects that are not read-only can still be read from multiple threads without issue, as long as the object is not modified in the meantime. But this is not thread safety. Thread safety is when all manner of measures are applied to a data type to prevent any modification of it by one thread from causing data corruption or deadlock, even in the face of many concurrent reads and writes.
