How to safely write to the same List - c#

I've got a public static List<MyDoggie> DoggieList;
DoggieList is appended to and written to by multiple threads throughout my application.
We run into this exception pretty frequently:
Collection was modified; enumeration operation may not execute
Assuming there are multiple classes writing to DoggieList how do we get around this exception?
Please note that this design is not great, but at this point we need to quickly fix it in production.
How can we perform mutations to this list safely from multiple threads?
I understand we can do something like:
lock(lockObject)
{
DoggieList.AddRange(...)
}
But can we do this from multiple classes against the same DoggieList?

You can also create your own class and encapsulate the locking inside it, for example as below.
You can add whatever methods you need, such as AddRange, Remove, etc.
class MyList {
private readonly object objLock = new object();
private readonly List<int> list = new List<int>();
public void Add(int value) {
lock (objLock) {
list.Add(value);
}
}
public int Get(int index) {
lock (objLock) {
return list[index];
}
}
// Returns a copy so callers can enumerate it without holding the lock.
public List<int> GetAll() {
lock (objLock) {
return new List<int>(list);
}
}
}
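On the question of whether several classes can lock around the same DoggieList: they can, as long as every one of them locks on the same object. Here is a minimal sketch of that idea. DoggieStore, Kennel and Shelter are hypothetical names made up for illustration, and the MyDoggie shape is assumed.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public class MyDoggie { public string Name { get; set; } } // assumed shape

// Hypothetical holder: the list and its lock live together, and nothing
// outside this class can touch the list without going through the lock.
public static class DoggieStore
{
    private static readonly object Sync = new object();
    private static readonly List<MyDoggie> DoggieList = new List<MyDoggie>();

    public static void AddRange(IEnumerable<MyDoggie> doggies)
    {
        lock (Sync) { DoggieList.AddRange(doggies); }
    }

    // Copy taken under the lock; safe to enumerate afterwards.
    public static List<MyDoggie> Snapshot()
    {
        lock (Sync) { return new List<MyDoggie>(DoggieList); }
    }
}

// Two unrelated writers; both end up locking on DoggieStore.Sync.
public static class Kennel
{
    public static void Register(string name) =>
        DoggieStore.AddRange(new[] { new MyDoggie { Name = name } });
}

public static class Shelter
{
    public static void Register(string name) =>
        DoggieStore.AddRange(new[] { new MyDoggie { Name = name } });
}
```

The point of the design is that the lock object is private to one place, so a new class cannot accidentally mutate the list without it.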
For a detailed treatment of concurrent collections, see http://www.albahari.com/threading/part5.aspx#_Concurrent_Collections
Using a concurrent collection such as the ConcurrentBag class can also resolve issues with updates from multiple threads. Note that ConcurrentBag is unordered.
Example:
using System.Collections.Concurrent;
using System.Threading.Tasks;
public static class Program
{
public static void Main()
{
var items = new[] { "item1", "item2", "item3" };
var bag = new ConcurrentBag<string>();
Parallel.ForEach(items, bag.Add);
}
}

Using lock has the disadvantage of preventing concurrent reads.
An efficient solution which does not require changing the collection type is to use a ReaderWriterLockSlim:
private static readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
With the following extension methods:
public static class ReaderWriterLockSlimExtensions
{
public static void ExecuteWrite(this ReaderWriterLockSlim aLock, Action action)
{
aLock.EnterWriteLock();
try
{
action();
}
finally
{
aLock.ExitWriteLock();
}
}
public static void ExecuteRead(this ReaderWriterLockSlim aLock, Action action)
{
aLock.EnterReadLock();
try
{
action();
}
finally
{
aLock.ExitReadLock();
}
}
}
which can be used the following way:
_lock.ExecuteWrite(() => DoggieList.Add(new MyDoggie()));
_lock.ExecuteRead(() =>
{
// safe iteration
foreach (MyDoggie item in DoggieList)
{
....
}
});
And finally if you want to build your own collection based on this:
public class SafeList<T>
{
private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
private readonly List<T> _list = new List<T>();
public T this[int index]
{
get
{
T result = default(T);
_lock.ExecuteRead(() => result = _list[index]);
return result;
}
}
public List<T> GetAll()
{
List<T> result = null;
_lock.ExecuteRead(() => result = _list.ToList());
return result;
}
public void ForEach(Action<T> action) =>
_lock.ExecuteRead(() => _list.ForEach(action));
public void Add(T item) => _lock.ExecuteWrite(() => _list.Add(item));
public void AddRange(IEnumerable<T> items) =>
_lock.ExecuteWrite(() => _list.AddRange(items));
}
This list is safe for concurrent use: multiple threads can add or get items without corrupting it. Additionally, multiple threads can read items in parallel without blocking each other; it's only when writing that a single thread has exclusive access to the collection.
Note that this collection does not implement IEnumerable<T> because you could get an enumerator and forget to dispose it which would leave the list locked in read mode.

Make DoggieList a ConcurrentStack<MyDoggie> and use its PushRange method, which is thread-safe. Note that PushRange takes an array.
using System.Collections.Concurrent;
var doggieList = new ConcurrentStack<MyDoggie>();
doggieList.PushRange(YourCode);
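A small sketch of PushRange under contention, using int items instead of MyDoggie to keep it short. StackDemo is a hypothetical helper, not part of the answer. Keep in mind that a ConcurrentStack is last-in-first-out and has no indexer, so it is not a drop-in replacement for List<T>.

```csharp
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class StackDemo // hypothetical helper, for illustration only
{
    // Pushes `perThread` items from each of `threads` parallel iterations
    // and returns how many items ended up on the stack.
    public static int PushFromManyThreads(int threads, int perThread)
    {
        var stack = new ConcurrentStack<int>();
        Parallel.For(0, threads, t =>
        {
            // PushRange takes an array and pushes it as one atomic operation.
            int[] batch = Enumerable.Range(t * perThread, perThread).ToArray();
            stack.PushRange(batch);
        });
        return stack.Count;
    }
}
```

Calling StackDemo.PushFromManyThreads(8, 100) should leave 800 items on the stack, with no lock in user code.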

Related

Make a thread-safe list of integers

I need to make a class that stores a List of int and read/write from it asynchronously.
here's my class:
public class ConcurrentIntegerList
{
private readonly object theLock = new object();
List<int> theNumbers = new List<int>();
public void AddNumber(int num)
{
lock (theLock)
{
theNumbers.Add(num);
}
}
public List<int> GetNumbers()
{
lock (theLock)
{
return theNumbers;
}
}
}
But it is still not thread-safe. When I perform multiple operations from different threads I get this error:
Collection was modified; enumeration operation may not execute.
What did I miss here?
public List<int> GetNumbers()
{
lock (theLock)
{
// return theNumbers;
return theNumbers.ToList();
}
}
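The original GetNumbers() handed out a reference to the internal list, so callers enumerated it outside the lock while other threads were still adding to it. Returning a copy fixes that, but the performance won't be very good this way, and GetNumbers() now returns a snapshot copy.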
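A minimal sketch of the copy-under-lock version in use. SnapshotDemo.Run is a hypothetical helper that is not part of the question: one task keeps adding numbers while the calling thread repeatedly enumerates snapshots.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class ConcurrentIntegerList
{
    private readonly object theLock = new object();
    private readonly List<int> theNumbers = new List<int>();

    public void AddNumber(int num)
    {
        lock (theLock) { theNumbers.Add(num); }
    }

    // Returns a copy taken under the lock; callers never see the live list.
    public List<int> GetNumbers()
    {
        lock (theLock) { return theNumbers.ToList(); }
    }
}

public static class SnapshotDemo // hypothetical helper, for illustration only
{
    // Adds `count` numbers on a background task while the calling thread
    // keeps enumerating snapshots. Returns the final item count.
    public static int Run(int count)
    {
        var numbers = new ConcurrentIntegerList();
        Task writer = Task.Run(() =>
        {
            for (int i = 0; i < count; i++) numbers.AddNumber(i);
        });

        while (!writer.IsCompleted)
        {
            // The snapshot is private to this thread, so the writer
            // cannot invalidate the enumerator.
            long sum = 0;
            foreach (int n in numbers.GetNumbers()) sum += n;
        }

        writer.Wait();
        return numbers.GetNumbers().Count;
    }
}
```

Each call to GetNumbers() copies the whole list, which is the performance cost mentioned above. For large lists read very often, a different data structure may be a better fit.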

Threading With List Property

public static class People
{
public static List<string> names {get; set;}
}
public class Threading
{
public static async Task DoSomething()
{
var t1 = Task1("bob");
var t2 = Task1("erin");
await Task.WhenAll(t1,t2);
}
private static async Task Task1(string name)
{
await Task.Run(() =>
{
if(People.names == null) People.names = new List<string>();
People.names.Add(name);
});
}
}
Is it dangerous to initialize a list within a thread? Is it possible that both threads could initialize the list, so that one of the names is lost?
So I was thinking of three options:
Leave it like this since it is simple, but only if it is safe.
Do the same but use a ConcurrentBag. I know it is thread-safe, but is the initialization safe?
Using [DataMember(EmitDefaultValue = new List())] and then just calling .Add in Task1 without worrying about initialization. The only downside is that sometimes the list won't be used at all, and it seems like a waste to initialize it every time.
Okay, so what worked best for my case was using a lock statement.
public class Class1
{
private static Object thisLock = new Object();
private static async Task Task1(string name)
{
await Task.Run(() =>
{
AddToList(name);
});
}
private static void AddToList(string name)
{
lock(thisLock)
{
if(People.names == null) People.names = new List<string>();
People.names.Add(name);
}
}
}
public static class People
{
public static List<string> names {get; set;}
}
For a simple case like this, the easiest way to get thread safety is to use the lock statement:
public static class People
{
static List<string> _names = new List<string>();
public static void AddName(string name)
{
lock (_names)
{
_names.Add(name);
}
}
public static IEnumerable<string> GetNames()
{
lock(_names)
{
return _names.ToArray();
}
}
}
public class Threading
{
public static async Task DoSomething()
{
var t1 = Task1("bob");
var t2 = Task1("erin");
await Task.WhenAll(t1,t2);
}
private static Task Task1(string name)
{
return Task.Run(() => People.AddName(name));
}
}
Of course this example is not very useful (why not just add without the threads?), but I hope you get the idea.
If you don't use some kind of lock and concurrently read from and write to a List, you will most likely get an InvalidOperationException saying the collection was modified during enumeration.
Because you don't really know when a caller will use the returned collection, the easiest way to get thread safety is to copy the collection into an array and return that.
If this is not practical (the collection is too large, for example), you have to use the classes in System.Collections.Concurrent, such as BlockingCollection, but those are a bit more involved.
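On the worry about creating the list when it may never be used: one option is to let Lazy<T> handle the one-time initialization and keep the lock only for the list operations. This is a sketch under that assumption, not the code from the question. The IsInitialized property is a hypothetical addition for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class People
{
    // Lazy<T> created with just a factory uses
    // LazyThreadSafetyMode.ExecutionAndPublication, so the factory runs
    // at most once even if several threads hit .Value at the same time.
    private static readonly Lazy<List<string>> _names =
        new Lazy<List<string>>(() => new List<string>());

    private static readonly object _sync = new object();

    // Hypothetical helper: true once the list has actually been created.
    public static bool IsInitialized => _names.IsValueCreated;

    public static void AddName(string name)
    {
        // List<T> itself is not thread-safe, so mutations still need a lock.
        lock (_sync) { _names.Value.Add(name); }
    }

    public static string[] GetNames()
    {
        lock (_sync) { return _names.Value.ToArray(); }
    }
}
```

With this shape the list is only allocated on first use, and two threads racing on the first AddName cannot replace each other's list.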

Parallel.ForEach stalled when integrated with BlockingCollection

I adapted my implementation of a parallel consumer from the code in this question:
class ParallelConsumer<T> : IDisposable
{
private readonly int _maxParallel;
private readonly Action<T> _action;
private readonly TaskFactory _factory = new TaskFactory();
private CancellationTokenSource _tokenSource;
private readonly BlockingCollection<T> _entries = new BlockingCollection<T>();
private Task _task;
public ParallelConsumer(int maxParallel, Action<T> action)
{
_maxParallel = maxParallel;
_action = action;
}
public void Start()
{
try
{
_tokenSource = new CancellationTokenSource();
_task = _factory.StartNew(
() =>
{
Parallel.ForEach(
_entries.GetConsumingEnumerable(),
new ParallelOptions { MaxDegreeOfParallelism = _maxParallel, CancellationToken = _tokenSource.Token },
(item, loopState) =>
{
Log("Taking" + item);
if (!_tokenSource.IsCancellationRequested)
{
_action(item);
Log("Finished" + item);
}
else
{
Log("Not Taking" + item);
_entries.CompleteAdding();
loopState.Stop();
}
});
},
_tokenSource.Token);
}
catch (OperationCanceledException oce)
{
System.Diagnostics.Debug.WriteLine(oce);
}
}
private void Log(string message)
{
Console.WriteLine(message);
}
public void Stop()
{
Dispose();
}
public void Enqueue(T entry)
{
Log("Enqueuing" + entry);
_entries.Add(entry);
}
public void Dispose()
{
if (_task == null)
{
return;
}
_tokenSource.Cancel();
while (!_task.IsCanceled)
{
}
_task.Dispose();
_tokenSource.Dispose();
_task = null;
}
}
And here is the test code:
class Program
{
static void Main(string[] args)
{
TestRepeatedEnqueue(100, 1);
}
private static void TestRepeatedEnqueue(int itemCount, int parallelCount)
{
bool[] flags = new bool[itemCount];
var consumer = new ParallelConsumer<int>(parallelCount,
(i) =>
{
flags[i] = true;
}
);
consumer.Start();
for (int i = 0; i < itemCount; i++)
{
consumer.Enqueue(i);
}
Thread.Sleep(1000);
Debug.Assert(flags.All(b => b == true));
}
}
The test always fails: it always gets stuck at around the 93rd item of the 100 tested. Any idea which part of my code causes this issue, and how to fix it?
You cannot use Parallel.ForEach() with BlockingCollection.GetConsumingEnumerable(), as you have discovered.
For an explanation, see this blog post:
https://devblogs.microsoft.com/pfxteam/parallelextensionsextras-tour-4-blockingcollectionextensions/
Excerpt from the blog:
BlockingCollection’s GetConsumingEnumerable implementation is using BlockingCollection’s internal synchronization which already supports multiple consumers concurrently, but ForEach doesn’t know that, and its enumerable-partitioning logic also needs to take a lock while accessing the enumerable.
As such, there’s more synchronization here than is actually necessary, resulting in a potentially non-negligable performance hit.
[Also] the partitioning algorithm employed by default by both Parallel.ForEach and PLINQ use chunking in order to minimize synchronization costs: rather than taking the lock once per element, it'll take the lock, grab a group of elements (a chunk), and then release the lock.
While this design can help with overall throughput, for scenarios that are focused more on low latency, that chunking can be prohibitive.
That blog also provides the source code for a method called GetConsumingPartitioner() which you can use to solve the problem.
public static class BlockingCollectionExtensions
{
public static Partitioner<T> GetConsumingPartitioner<T>(this BlockingCollection<T> collection)
{
return new BlockingCollectionPartitioner<T>(collection);
}
public class BlockingCollectionPartitioner<T> : Partitioner<T>
{
private BlockingCollection<T> _collection;
internal BlockingCollectionPartitioner(BlockingCollection<T> collection)
{
if (collection == null)
throw new ArgumentNullException("collection");
_collection = collection;
}
public override bool SupportsDynamicPartitions
{
get { return true; }
}
public override IList<IEnumerator<T>> GetPartitions(int partitionCount)
{
if (partitionCount < 1)
throw new ArgumentOutOfRangeException("partitionCount");
var dynamicPartitioner = GetDynamicPartitions();
return Enumerable.Range(0, partitionCount).Select(_ => dynamicPartitioner.GetEnumerator()).ToArray();
}
public override IEnumerable<T> GetDynamicPartitions()
{
return _collection.GetConsumingEnumerable();
}
}
}
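On .NET 4.5 and later there is also a built-in way to turn off chunking: Partitioner.Create has an overload that takes EnumerablePartitionerOptions.NoBuffering. The following is a minimal sketch of that alternative, not the blog's code. NoBufferingDemo and ConsumeAll are hypothetical names.

```csharp
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class NoBufferingDemo // hypothetical wrapper, for illustration only
{
    // Feeds `itemCount` items through a BlockingCollection and returns
    // how many of them the Parallel.ForEach body processed.
    public static int ConsumeAll(int itemCount, int maxParallel)
    {
        var entries = new BlockingCollection<int>();
        int processed = 0;

        Task consumer = Task.Run(() =>
        {
            // NoBuffering makes the partitioner hand out one item at a time
            // instead of waiting to fill a chunk.
            var partitioner = Partitioner.Create(
                entries.GetConsumingEnumerable(),
                EnumerablePartitionerOptions.NoBuffering);

            Parallel.ForEach(
                partitioner,
                new ParallelOptions { MaxDegreeOfParallelism = maxParallel },
                item => Interlocked.Increment(ref processed));
        });

        for (int i = 0; i < itemCount; i++) entries.Add(i);
        entries.CompleteAdding(); // lets the consuming enumerable finish
        consumer.Wait();
        return processed;
    }
}
```

Without buffering, each item is handed to the loop body as soon as it is taken from the collection, so the last few items are not held back waiting for a chunk to fill.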
The failure is caused by the following, as explained here:
The partitioning algorithm employed by default by both
Parallel.ForEach and PLINQ use chunking in order to minimize
synchronization costs: rather than taking the lock once per element,
it'll take the lock, grab a group of elements (a chunk), and then
release the lock.
To get it to work, you can add a method on your ParallelConsumer<T> class to indicate that the adding is completed, as below
public void StopAdding()
{
_entries.CompleteAdding();
}
And now call this method after your for loop , as below
consumer.Start();
for (int i = 0; i < itemCount; i++)
{
consumer.Enqueue(i);
}
consumer.StopAdding();
Otherwise, Parallel.ForEach() would wait for the threshold to be reached so as to grab the chunk and start processing.

Lock only on an Id

I have a method which needs to run a block of code exclusively, but I want to add this restriction only if it is really required. Depending on an Id value (an Int32) I would be loading/modifying distinct objects, so it doesn't make sense to lock access for all threads. Here's a first attempt at doing this:
private static readonly ConcurrentDictionary<int, Object> LockObjects = new ConcurrentDictionary<int, Object>();
void Method(int Id)
{
lock(LockObjects.GetOrAdd(Id,new Object()))
{
//Do the long running task here - db fetches, changes etc
Object Ref;
LockObjects.TryRemove(Id,out Ref);
}
}
I have my doubts about whether this would work: TryRemove can fail, which would cause the ConcurrentDictionary to keep growing.
A more obvious bug: TryRemove successfully removes the Object while other threads for the same Id are still waiting on it. A new thread with the same Id then comes in, adds a new Object, and starts processing immediately, since no one else is waiting on the Object it just added. It now runs concurrently with a thread that was waiting on the old Object.
Should I be using TPL or some sort of ConcurrentQueue to queue up my tasks instead? What's the simplest solution?
I use a similar approach to lock resources for related items rather than a blanket resource lock... It works perfectly.
You're almost there, but you really don't need to remove the object from the dictionary; just let the next caller with that id take the lock on the same object.
Surely there is a limit to the number of unique ids in your application? What is that limit?
The main semantic issue I see is that an object can be locked without being listed in the collection, because the last line inside the lock removes it while a waiting thread can still pick it up and lock it.
Change the collection to be a collection of objects that guard a lock. Do not name it LockedObjects, and do not remove the objects from the collection unless you no longer expect them to be needed.
I always think of this type of object as a key rather than a lock or a locked object; the object is not locked, it is a key to locked sequences of code.
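The "never remove the key" idea can be sketched as follows. PerIdLock and Run are hypothetical names, and the sketch assumes the set of ids is bounded so the dictionary cannot grow without limit.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class PerIdLock // hypothetical helper, for illustration only
{
    // One key object per id, created on first use and never removed.
    private static readonly ConcurrentDictionary<int, object> Keys =
        new ConcurrentDictionary<int, object>();

    public static void Run(int id, Action work)
    {
        // The factory overload avoids allocating a throwaway object on every
        // call. If two threads race on a new id, GetOrAdd still returns the
        // single stored object to both of them.
        object key = Keys.GetOrAdd(id, _ => new object());

        lock (key)
        {
            work(); // runs exclusively with respect to other callers using this id
        }
    }
}
```

Calls with different ids do not block each other, while calls with the same id are serialized.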
I used the following approach. Instead of using the original ID directly, reduce it to a small int hash code and use that to pick an existing lock object from a fixed array. The number of lock objects depends on your situation: the more lock objects, the lower the probability of a collision.
class ThreadLocker
{
const int DEFAULT_LOCKERS_COUNTER = 997;
int lockersCount;
object[] lockers;
public ThreadLocker(int MaxLockersCount)
{
if (MaxLockersCount < 1) throw new ArgumentOutOfRangeException("MaxLockersCount", MaxLockersCount, "Counter cannot be less than 1");
lockersCount = MaxLockersCount;
lockers = Enumerable.Range(0, lockersCount).Select(_ => new object()).ToArray();
}
public ThreadLocker() : this(DEFAULT_LOCKERS_COUNTER) { }
public object GetLocker(int ObjectID)
{
var idx = (ObjectID % lockersCount + lockersCount) % lockersCount;
return lockers[idx];
}
public object GetLocker(string ObjectID)
{
var hash = ObjectID.GetHashCode();
return GetLocker(hash);
}
public object GetLocker(Guid ObjectID)
{
var hash = ObjectID.GetHashCode();
return GetLocker(hash);
}
}
Usage:
partial class Program
{
static ThreadLocker locker = new ThreadLocker();
static void Main(string[] args)
{
var id = 10;
lock(locker.GetLocker(id))
{
}
}
}
Of course, you can use any hash-code function to get the corresponding array index.
If you want to use the ID itself and not allow the collisions caused by hash codes, you can use the next approach. Maintain a Dictionary of lock objects and store the number of threads that want to use each ID:
class ThreadLockerByID<T>
{
Dictionary<T, lockerObject<T>> lockers = new Dictionary<T, lockerObject<T>>();
public IDisposable AcquireLock(T ID)
{
lockerObject<T> locker;
lock (lockers)
{
if (lockers.ContainsKey(ID))
{
locker = lockers[ID];
}
else
{
locker = new lockerObject<T>(this, ID);
lockers.Add(ID, locker);
}
locker.counter++;
}
Monitor.Enter(locker);
return locker;
}
protected void ReleaseLock(T ID)
{
lock (lockers)
{
if (!lockers.ContainsKey(ID))
return;
var locker = lockers[ID];
locker.counter--;
if (Monitor.IsEntered(locker))
Monitor.Exit(locker);
if (locker.counter == 0)
lockers.Remove(locker.id);
}
}
class lockerObject<T> : IDisposable
{
readonly ThreadLockerByID<T> parent;
internal readonly T id;
internal int counter = 0;
public lockerObject(ThreadLockerByID<T> Parent, T ID)
{
parent = Parent;
id = ID;
}
public void Dispose()
{
parent.ReleaseLock(id);
}
}
}
Usage:
partial class Program
{
static ThreadLockerByID<int> locker = new ThreadLockerByID<int>();
static void Main(string[] args)
{
var id = 10;
using(locker.AcquireLock(id))
{
}
}
}
There are mini-libraries that do this for you, such as AsyncKeyedLock. I've used it and it saved me a lot of headaches.

How to use lock on a Dictionary containing a list of objects in C#?

I have the following class:
public static class HotspotsCache
{
private static Dictionary<short, List<HotSpot>> _companyHotspots = new Dictionary<short, List<HotSpot>>();
private static object Lock = new object();
public static List<HotSpot> GetCompanyHotspots(short companyId)
{
lock (Lock)
{
if (!_companyHotspots.ContainsKey(companyId))
{
RefreshCompanyHotspotCache(companyId);
}
return _companyHotspots[companyId];
}
}
private static void RefreshCompanyHotspotCache(short companyId)
{
....
hotspots = ServiceProvider.Instance.GetService<HotspotsService>().GetHotSpots(..);
_companyHotspots.Add(companyId, hotspots);
....
}
}
The issue I'm having is that getting the hotspots in the RefreshCompanyHotspotCache method takes a lot of time. So while one thread is performing the cache refresh for a certain companyId, all the other threads wait until this operation is finished, even though some of them may be requesting the list of hotspots for another companyId whose list is already loaded in the dictionary. I would like those threads not to be blocked. I also want all threads that request the list of hotspots for a company that is not yet loaded in the cache to wait until the list is fully retrieved and loaded into the dictionary.
Is there a way to lock only the threads that are reading/writing the cache for a certain companyId (for which the refresh is taking place) and let the other threads that are requesting data for another company do their job?
My thought was to use an array of locks:
lock (companyLocks[companyId])
{
...
}
But that didn't solve anything. The threads dealing with one company are still waiting for threads that are refreshing the cache for other companies.
Use the Double-checked lock mechanism also mentioned by Snowbear - this will prevent your code locking when it doesn't actually need to.
With your idea of an individual lock per client, I've used this mechanism in the past, though I used a dictionary of locks. I made a utility class for getting a lock object from a key:
/// <summary>
/// Provides a mechanism to lock based on a data item being retrieved
/// </summary>
/// <typeparam name="T">Type of the data being used as a key</typeparam>
public class LockProvider<T>
{
private object _syncRoot = new object();
private Dictionary<T, object> _lstLocks = new Dictionary<T, object>();
/// <summary>
/// Gets an object suitable for locking the specified data item
/// </summary>
/// <param name="key">The data key</param>
/// <returns></returns>
public object GetLock(T key)
{
if (!_lstLocks.ContainsKey(key))
{
lock (_syncRoot)
{
if (!_lstLocks.ContainsKey(key))
_lstLocks.Add(key, new object());
}
}
return _lstLocks[key];
}
}
So simply use this in the following manner...
private static LockProvider<short> _clientLocks = new LockProvider<short>();
private static Dictionary<short, List<HotSpot>> _companyHotspots = new Dictionary<short, List<HotSpot>>();
public static List<HotSpot> GetCompanyHotspots(short companyId)
{
if (!_companyHotspots.ContainsKey(companyId))
{
lock (_clientLocks.GetLock(companyId))
{
if (!_companyHotspots.ContainsKey(companyId))
{
// Add item to _companyHotspots here...
}
}
}
return _companyHotspots[companyId];
}
How about locking only one thread and letting it do the update, while everyone else uses the old list?
private static Dictionary<short, List<HotSpot>> _companyHotspots = new Dictionary<short, List<HotSpot>>();
private static Dictionary<short, List<HotSpot>> _companyHotspotsOld = new Dictionary<short, List<HotSpot>>();
private static bool _hotspotsUpdating = false;
private static object Lock = new object();
public static List<HotSpot> GetCompanyHotspots(short companyId)
{
if (!_hotspotsUpdating)
{
if (!_companyHotspots.ContainsKey(companyId))
{
lock (Lock)
{
_hotspotsUpdating = true;
_companyHotspotsOld = _companyHotspots;
RefreshCompanyHotspotCache(companyId);
_hotspotsUpdating = false;
return _companyHotspots[companyId];
}
}
else
{
return _companyHotspots[companyId];
}
}
else
{
return _companyHotspotsOld[companyId];
}
}
Have you looked into ReaderWriterLockSlim? That should let you get finer-grained locking where you only take a write lock when needed.
Another thing you may need to look out for is false sharing. I don't know how a lock is implemented exactly but if you lock on objects in an array they're bound to be close to each other in memory, possibly putting them on the same cacheline, so the lock may not behave as you expect.
Another idea, what happens if you change the last code snippet to
object l = companyLocks[companyId];
lock(l){
}
It could be that the lock statement wraps more here than intended.
New idea, with locking just the lists as they are created.
If you can guarantee that each company will have at least one hotspot, do this:
public static class HotspotsCache
{
private static Dictionary<short, List<HotSpot>> _companyHotspots = new Dictionary<short, List<HotSpot>>();
static HotspotsCache()
{
foreach(short companyId in allCompanies)
{
_companyHotspots.Add(companyId, new List<HotSpot>());
}
}
public static List<HotSpot> GetCompanyHotspots(short companyId)
{
List<HotSpot> result = _companyHotspots[companyId];
if(result.Count == 0)
{
lock(result)
{
if(result.Count == 0)
{
RefreshCompanyHotspotCache(companyId, result);
}
}
}
return result;
}
private static void RefreshCompanyHotspotCache(short companyId, List<HotSpot> resultList)
{
....
hotspots = ServiceProvider.Instance.GetService<HotspotsService>().GetHotSpots(..);
resultList.AddRange(hotspots);
....
}
}
Since the dictionary is not modified after its initial creation, there is no need to do any locking on it. We only need to lock the individual lists as we populate them; the read operation needs no locking (including the initial Count == 0).
If you're able to use .NET 4 then the answer is straightforward -- use a ConcurrentDictionary<K,V> instead and let that look after the concurrency details for you:
public static class HotSpotsCache
{
private static readonly ConcurrentDictionary<short, List<HotSpot>>
_hotSpotsMap = new ConcurrentDictionary<short, List<HotSpot>>();
public static List<HotSpot> GetCompanyHotSpots(short companyId)
{
return _hotSpotsMap.GetOrAdd(companyId, id => LoadHotSpots(id));
}
private static List<HotSpot> LoadHotSpots(short companyId)
{
return ServiceProvider.Instance
.GetService<HotSpotsService>()
.GetHotSpots(/* ... */);
}
}
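One caveat with GetOrAdd: when several threads ask for the same missing key at once, the value factory can run more than once, and all but one result is discarded. If the load is expensive, a common way around that is to store Lazy<T> values. The sketch below uses List<string> and a fake Load method as stand-ins for HotSpot and the real service call. LazyHotSpotsCache, Get and LoadCount are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class LazyHotSpotsCache // hypothetical, for illustration only
{
    private static readonly ConcurrentDictionary<short, Lazy<List<string>>> Map =
        new ConcurrentDictionary<short, Lazy<List<string>>>();

    public static int LoadCount; // counts how often the stand-in loader ran

    public static List<string> Get(short companyId)
    {
        // GetOrAdd may construct several Lazy wrappers under contention, but
        // only one is stored, and only the stored one has its factory run.
        return Map.GetOrAdd(
            companyId,
            id => new Lazy<List<string>>(() => Load(id))).Value;
    }

    // Stand-in for the slow service call in the question.
    private static List<string> Load(short companyId)
    {
        Interlocked.Increment(ref LoadCount);
        Thread.Sleep(50); // simulate a slow fetch
        return new List<string> { "hotspot-for-" + companyId };
    }
}
```

Threads asking for a company that is still loading block on that company's Lazy only, so requests for other companies are not held up.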
If you're not able to use .NET 4 then your idea of using several more granular locks is a good one:
public static class HotSpotsCache
{
private static readonly Dictionary<short, List<HotSpot>>
_hotSpotsMap = new Dictionary<short, List<HotSpot>>();
private static readonly object _bigLock = new object();
private static readonly Dictionary<short, object>
_miniLocks = new Dictionary<short, object>();
public static List<HotSpot> GetCompanyHotSpots(short companyId)
{
List<HotSpot> hotSpots;
object miniLock;
lock (_bigLock)
{
if (_hotSpotsMap.TryGetValue(companyId, out hotSpots))
return hotSpots;
if (!_miniLocks.TryGetValue(companyId, out miniLock))
{
miniLock = new object();
_miniLocks.Add(companyId, miniLock);
}
}
lock (miniLock)
{
if (!_hotSpotsMap.TryGetValue(companyId, out hotSpots))
{
hotSpots = LoadHotSpots(companyId);
lock (_bigLock)
{
_hotSpotsMap.Add(companyId, hotSpots);
_miniLocks.Remove(companyId);
}
}
return hotSpots;
}
}
private static List<HotSpot> LoadHotSpots(short companyId)
{
return ServiceProvider.Instance
.GetService<HotSpotsService>()
.GetHotSpots(/* ... */);
}
}
