Thread safe OfType

Thread safe OfType - c#

I am facing the problem that this code:
enumerable.OfType<Foo>().Any(x => x.Footastic == true);
Isnt thread safe and throws an enumeration has changed exception.
Is there a good way to overcome this issue?
Already tried the following but it didnt always work (seems to not fire this often)
public class Foo
{
public void DoSomeMagicWithCollection(IEnumerable enumerable)
{
lock (enumerable)
{
enumerable.OfType<Foo>().Any(x => x.Footastic == true);
}
}
}

If you're getting an exception that the underlying collection has changed while enumerating it, given that this code clearly doesn't mutate the collection itself, it means that another thread is mutating the collection while you're trying to iterate it.
There is no possible solution to this problem other than simply not doing that. What's happening is that the enumerator of the List (or whatever collection type that is) is throwing the exception and preventing further enumeration because it can see that the list was modified during the enumeration. There is no way for the enumerators of OfType of Any that wrap it to possibly recover from that. The underlying enumerator is refusing to give them the data from the list. They can't do anything about that.
You need to use some sort of synchronization mechanism to prevent another thread from mutating the collection wnile this thread is enumerating this collection. Your lock doesn't prevent another thread from using the collection, it simply prevents any code that locks on the same instance from running. You need to have any code that could possibly mutate the list also lock on the same object to properly synchronize them.
Another possibility would be to use a collection that is inherently designed to be accessed from multiple threads at the same time. There are several such collections in the System.Collections.Concurrent namespace. They may or may not fit your needs. They will take care of synchronizing access to their data (to a point) on their own, without you needing to explicitly lock when accessing them.

Related

How do I freeze a collection so that I can iterate through it?

The following code gives me an exception:
var snapshot = BluetoothCapture.Instance.Snapshot();
var allowedDevice = snapshot.FirstOrDefault( _ => some_expression );
Collection was modified; enumeration operation may not execute.
I thought I could use a lock to freeze the collection so that I can iterate through it. However, I'm still getting the same exception.
The class definition below has a Snapshot method that attempts this:
public partial class BluetoothCapture
{
...
public void Capture()
{
_watcher = DeviceInformation.CreateWatcher();
_watcher.Added += (s, e) => { _devices.Add(e); };
_watcher.Start();
}
public IEnumerable<DeviceInformation> Snapshot()
{
lock (_devices)
{
return _devices.AsReadOnly();
}
}
}
Any suggestions?

Lock is used when you need to stop a block of code to execute in multiple threads (stop the parallel execution).
If Capture is called multiple times, so, yes, you can have a write being called before the previous one being finished.
You can use a ConcurrentBag.
The ConcurrentBag is a list like object, but it is thread safe (none of generic list is).
But, ConcurrentBag is unordered collection, so it does not guarantee ordering.
If you need some ordered list, you can see this link
Thread safe collections in .NET
You can also make a "lock" in the Add (not in the get)
_watcher.Added += (s, e) => { lock(_devices){_devices.Add(e); }};
But, if your app is running for a while, you can have a memory and performance problem (the add will be not async), even if the Capture is.

lock is really very useful concept but only when we use it wisely.
If in further code, your don't want to update reference of snapshot (the collection which you are getting from BluetoothCapture.Instance.Snapshot()) but just perform some Linq query to get filtered value to perform some logic.
you can avoid using lock.
It will be beneficial too, as by not doing lock you are actually not holding other threads to perform it's logic. - and we should not ignore the fact that terrible use of lock can cause serious problems like dead-lock too.
you are getting this exception, most likly, the collection on which you are performing linq query; is being updated by some other thread. (i too got that problem).
you can do one thing, instead of using general reference of collection (the one which you are getting from BluetoothCapture.Instance.Snapshot()) you can create a local list - as it is local it will not be updated by other threads.

is enumerator thread safe after getting with lock?

I am wondering if the returned enumerator is thread safe:
public IEnumerator<T> GetEnumerator()
{
lock (_sync) {
return _list.GetEnumerator();
}
}
If I have multiple threads whom are adding data (also within lock() blocks) to this list and one thread enumerating the contents of this list. When the enumerating thread is done it clears the list. Will it then be safe to use the enumerator gotten from this method.
I.e. does the enumerator point to a copy of the list at the instance it was asked for or does it always point back to the list itself, which may or may not be manipulated by another thread during its enumeration?
If the enumerator is not thread safe, then the only other course of action I can see is to create a copy of the list and return that. This is however not ideal as it will generate lots of garbage (this method is called about 60 times per second).

No, not at all. This lock synchronizes only access to _list.GetEnumerator method; where as enumerating a list is lot more than that. It includes reading the IEnumerator.Current property, calling IEnumerator.MoveNext etc.
You either need a lock over the foreach(I assume you enumerate via foreach), or you need to make a copy of list.
Better option is to take a look at Threadsafe collections provided out of the box.

According to the documentation, to guarantee thread-safity you have to lock collecton during entire iteration over it.
The enumerator does not have exclusive access to the collection;
therefore, enumerating through a collection is intrinsically not a
thread-safe procedure. To guarantee thread safety during enumeration,
you can lock the collection during the entire enumeration. To allow
the collection to be accessed by multiple threads for reading and
writing, you must implement your own synchronization.
Another option, may be define you own custom iterator, and for every thread create a new instance of it. So every thread will have it's own copy of Current read-only pointer to the same collection.

If I have multiple threads whom are adding data (also within lock() blocks) to this list and one thread enumerating the contents of this
list. When the enumerating thread is done it clears the list. Will it
then be safe to use the enumerator gotten from this method.
No. See reference here: http://msdn.microsoft.com/en-us/library/system.collections.ienumerator.aspx
An enumerator remains valid as long as the collection remains
unchanged. If changes are made to the collection, such as adding,
modifying, or deleting elements, the enumerator is irrecoverably
invalidated and the next call to MoveNext or Reset throws an
InvalidOperationException. If the collection is modified between
MoveNext and Current, Current returns the element that it is set to,
even if the enumerator is already invalidated. The enumerator does not
have exclusive access to the collection; therefore, enumerating
through a collection is intrinsically not a thread-safe procedure.
Even when a collection is synchronized, other threads can still modify
the collection, which causes the enumerator to throw an exception. To
guarantee thread safety during enumeration, you can either lock the
collection during the entire enumeration or catch the exceptions
resulting from changes made by other threads.
..
does the enumerator point to a copy of the list at the instance it was asked for or does it always point back to the list itself, which
may or may not be manipulated by another thread during its
enumeration?
Depends on the collection. See the Concurrent Collections. Concurrent Stack, ConcurrentQueue, ConcurrentBag all take snapshots of the collection when GetEnumerator() is called and returns elements from the snapshot. The underlying collection may change without changing the snapshot. On the other hand, ConcurrentDictionary doesn't take a snapshot, so changing the collection while iterating will immediately effect things per the rules above.
A trick I sometimes use in this case is to create a temporary collection to iterate so the original is free while I use the snapshot:
foreach(var item in items.ToList()) {
//
}
If your list is too large and this causes GC churn, then locking is probably your best bet. If locking is too heavy, maybe consider a partial iteration each timeslice, if that is feasible.
You said:
When the enumerating thread is done it clears the list.
Nothing says you have to process the whole list at a time. You could instead remove a range of items, move them to a separate enumerating thread, let that process, and repeat. Perhaps iteratation and lists aren't the best model here. Consider the ConcurrentQueue with which you could build producers and consumer model, and consumers just steadily remove items to process without iteration

Should I use a lock or ToList()

To avoid the famous error:
Collection was modified; enumeration operation may not execute
when iterating an IEnumerator, should I just use ToList() as suggested in this questions, or should I use lock when updating the list?
My concern is that ToList() is simply creating a new list each time, while lock will only have a small delay at some points when updating the list.
So, should I use ToList() or lock?

The error you are seeing tells you that you are trying to modify collection while you are enumerating it. The lock keyword is applicable in multi-threaded situation and it makes sure your only one thread is in critical section. It has nothing to do with "locking" collection so that you can enumerate it while also modifying it.
The error you are seeing typically occurs with code like below
foreach(var item in items)
{
//Inside this loop now you are enumerating collection
//This means you can't add or delete anything from items
items.Add(newItem); //This will produce the error you are seeing
}
This error might also occur if you have one thread enumerating collection while other thread adding or removing from collection. So if you know you have multi-threaded situation then you can use probably use lock although it is bit tricky and may impact your concurrency performance and even cause race condition. For multi-threaded situation, using just ToList may not be enough either because you need to make sure only one thread has access to collection while you are calling ToList. So you may need to use ToList along with lock and it may be overall simpler option depending your scenario. To see an example of how to do thread-safe modify-while-enumerate see this answer: Making a "modify-while-enumerating" collection thread-safe.
If you know you don't have multiple thread then ToList is probably your only option and you don't need lock at all. I would also suggest to make sure if you really do need to modify collection while enumerating it. In many cases (but not all), you can do design bit differently to avoid it in first place.

Does using ConcurrentDictionary TryGetValue within an if statement make the if contents thread-safe?

If I have a ConcurrentDictionary and use the TryGetValue within an if statement, does this make the if statement's contents thread safe? Or must you lock still within the if statement?
Example:
ConcurrentDictionary<Guid, Client> m_Clients;
Client client;
//Does this if make the contents within it thread-safe?
if (m_Clients.TryGetValue(clientGUID, out client))
{
//Users is a list.
client.Users.Add(item);
}
or do I have to do:
ConcurrentDictionary<Guid, Client> m_Clients;
Client client;
//Does this if make the contents within it thread-safe?
if (m_Clients.TryGetValue(clientGUID, out client))
{
lock (client)
{
//Users is a list.
client.Users.Add(item);
}
}

Yes you have to lock inside the if statement the only guarantee you get from concurrent dictionary is that its methods are thread save.

The accepted answer could be misleading, depending on your point of view and the scope of thread safety you are trying to achieve. This answer is aimed at people who stumble on this question while learning about threading and concurrency:
It's true that locking on the output of the dictionary retrieval (the Client object) makes some of the code thread safe, but only the code that is accessing that retrieved object within the lock. In the example, it's possible that another thread removes that object from the dictionary after the current thread retrieves it. (Even though there are no statements between the retrieval and the lock, other threads can still execute in between.) Then, this code would add the Client object to the Users list even though it is no longer in the concurrent dictionary. That could cause an exception, synchronization, or race condition.
It depends on what the rest of the program is doing. But in the scenario I'm describing, it would be safer to put the lock around the entire dictionary retrieval. And then a regular dictionary might be faster and simpler than a concurrent dictionary, as long as you always lock on it while using it!

While both of the current answers are technically true I think that the potential exists for them to be a little misleading and they don't express ConcurrentDictionary's big strengths. Maybe the OP's original way of solving the problem with locks worked in that specific circumstance but this answer is aimed more generally towards people learning about ConcurrentDictionary for the first time.
Concurrent Dictionary is designed so that you don't have to use locks. It has several specialty methods designed around the idea that some other thread could modify the object in the dictionary while you're currently working on it. For a simple example, the TryUpdate method lets you check to see if a key's value has changed between when you got it and the moment that you're trying to update it. If the value that you've got matches the value currently in the ConcurrentDictionary you can update it and TryUpdate returns true. If not, TryUpdate returns false. The documentation for the TryUpdate method can make this a little confusing because it doesn't make it explicitly clear why there is a comparison value but that's the idea behind the comparison value. If you wanted to have a little more control around adding or updating, you could use one of the overloads of the AddOrUpdate method to either add a value for a key if it doesn't exist at the moment that you're trying to add it or update the value if some other thread has already added a value for the key that is specified. The context of whatever you're trying to do will dictate the appropriate method to use. The point is that, rather than locking, try taking a look at the specialty methods that ConcurrentDictionary provides and prefer those over trying to come up with your own locking solution.
In the case of OP's original question, I would suggest that instead of this:
ConcurrentDictionary<Guid, Client> m_Clients;
Client client;
//Does this if make the contents within it thread-safe?
if (m_Clients.TryGetValue(clientGUID, out client))
{
//Users is a list.
client.Users.Add(item);
}
One might try the following instead*:
ConcurrentDictionary<Guid, Client> m_Clients;
Client originalClient;
if(m_Clients.TryGetValue(clientGUID, out originalClient)
{
//The Client object will need to implement IEquatable if more
//than an object instance comparison needs to be done. This
//sample code assumes that Client implements IEquatable.
//If copying a Client is not trivial, you'll probably want to
//also implement a simple type of copy in a method of the Client
//object. This sample code assumes that the Client object has
//a ShallowCopy method to do this copy for simplicity's sake.
Client modifiedClient = originalClient.ShallowCopy();
//Make whatever modifications to modifiedClient that need to get
//made...
modifiedClient.Users.Add(item);
//Now update the value in the ConcurrentDictionary
if(!m_Clients.TryUpdate(clientGuid, modifiedClient, originalClient))
{
//Do something if the Client object was updated in between
//when it was retrieved and when the code here tries to
//modify it.
}
}
*Note in the example above, I'm using TryUpate for ease of demonstrating the concept. In practice, if you need to make sure that an object gets added if it doesn't exist or updated if it does, the AddOrUpdate method would be the ideal option because the method handles all of the looping required to check for add vs update and take the appropriate action.
It might seem like it's a little harder at first because it may be necessary to implement IEquatable and, depending on how instances of Client need to be copied, some sort of copying functionality but it pays off in the long run if you're working with ConcurrentDictionary and objects within it in any serious way.

Safe to get Count value from generic collection without locking the collection?

I have two threads, a producer thread that places objects into a generic List collection and a consumer thread that pulls those objects out of the same generic List. I've got the reads and writes to the collection properly synchronized using the lock keyword, and everything is working fine.
What I want to know is if it is ok to access the Count property without first locking the collection.
JaredPar refers to the Count property in his blog as a decision procedure that can lead to race conditions, like this:
if (list.Count > 0)
{
return list[0];
}
If the list has one item and that item is removed after the Count property is accessed but before the indexer, an exception will occur. I get that.
But would it be ok to use the Count property to, say, determine the initial size a completely different collection? The MSDN documentation says that instance members are not guaranteed to be thread safe, so should I just lock the collection before accessing the Count property?

I suspect it's "safe" in terms of "it's not going to cause anything to go catastrophically wrong" - but that you may get stale data. That's because I suspect it's just held in a simple variable, and that that's likely to be the case in the future. That's not the same as a guarantee though.
Personally I'd keep it simple: if you're accessing shared mutable data, only do so in a lock (using the same lock for the same data). Lock-free programming is all very well if you've got appropriate isolation in place (so you know you've got appropriate memory barriers, and you know that you'll never be modifying it in one thread while you're reading from it in another) but it sounds like that isn't the case here.
The good news is that acquiring an uncontested lock is incredibly cheap - so I'd go for the safe route if I were you. Threading is hard enough without introducing race conditions which are likely to give no significant performance benefit but at the cost of rare and unreproducible bugs.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.