Proper manipulation of IEnumerable<foo> - c#

Following up on my previous question.
In a multi-threaded program, different threads each generate a very long list of results. When a thread finishes its task, I would like to concatenate the different lists into a single list. Please consider the following:
public struct AccEntry
{
internal AccEntry(int accumulation)
: this()
{
Accumulation = accumulation;
}
public int Accumulation { private set; get; }
}
internal class Functions
{
internal Functions(Object lockOnMe, IEnumerable<AccEntry> results)
{
_lockOnMe = lockOnMe;
_results = results;
_result = new List<AccEntry>();
}
private IEnumerable<AccEntry> _results { set; get; }
private List<AccEntry> _result { set; get; }
internal void SomeFunction()
{
/// some time consuming process that builds _result
lock(_lockOnMe)
{
/// The problem is here! _results is always null.
if (_results == null) _results = _result;
else _results = _results.Concat(_result);
}
}
}
public class ParentClass
{
public void DoJob()
{
IEnumerable<AccEntry> results = null;
/// initialize and launch multiple threads where each
/// has a new instance of Functions, and call SomeFunction.
}
}
The problem, as mentioned in the code, is that _results is always null. When one thread sets it to _result, the other thread finds it null again. I also tried using the ref keyword for results in the Functions constructor, but it did not change anything.
Assuming that the following executes as expected, what am I missing in the code above?
List<int> listA = new List<int>();
List<int> listB = new List<int>();
listA.Add(10);
listB.Add(12);
IEnumerable<int> listC = null;
listC = listA;
listC = listC.Concat(listB);

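Concat does not modify either collection; it returns a new sequence, and assigning that back to the _results variable only replaces the reference held by that one Functions instance. The results variable in the caller and the _results of every other instance still hold the original value (null), so the new collection of items is local to that instance.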
Instead of using an IEnumerable<> that you have to replace to update it, use a List<> that you can add items to in place:
internal class Functions
{
internal Functions(Object lockOnMe, List<AccEntry> results)
{
_lockOnMe = lockOnMe;
_results = results;
_result = new List<AccEntry>();
}
private object _lockOnMe;
private List<AccEntry> _results;
private List<AccEntry> _result;
internal void SomeFunction()
{
/// some time consuming process that builds _result
lock(_lockOnMe)
{
_results.AddRange(_result);
}
}
}
Just make sure to create the list before creating the Functions instances.
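A minimal sketch of how the caller might wire this up. The Worker class is a hypothetical stand-in for Functions, it stores int instead of AccEntry to stay short, and the use of plain Thread objects is an assumption (the question does not say how the threads are launched). The point it illustrates is that the shared list and the lock object are created once, before any worker exists:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical stand-in for the Functions class above: builds a private
// list, then appends it to the shared list under the shared lock.
internal class Worker
{
    private readonly object _lockOnMe;
    private readonly List<int> _results;                   // shared, created by the caller
    private readonly List<int> _result = new List<int>();  // private to this worker
    private readonly int _seed;

    internal Worker(object lockOnMe, List<int> results, int seed)
    {
        _lockOnMe = lockOnMe;
        _results = results;
        _seed = seed;
    }

    internal void SomeFunction()
    {
        // Stand-in for the time-consuming process that builds _result.
        for (int i = 0; i < 1000; i++) _result.Add(_seed);

        lock (_lockOnMe)
        {
            _results.AddRange(_result);
        }
    }
}

public static class ParentDemo
{
    public static List<int> DoJob(int workerCount)
    {
        var sync = new object();
        var results = new List<int>(); // created before any worker
        var threads = new List<Thread>();

        for (int i = 0; i < workerCount; i++)
        {
            var worker = new Worker(sync, results, i);
            var thread = new Thread(worker.SomeFunction);
            threads.Add(thread);
            thread.Start();
        }

        // Wait for every worker before reading the combined list.
        foreach (var thread in threads) thread.Join();
        return results;
    }
}
```

Because every worker receives the same List instance and mutates it in place, nothing has to be assigned back to the caller's variable.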

Related

Make a thread-safe list of integers

I need to make a class that stores a List of int and reads from and writes to it asynchronously.
Here's my class:
public class ConcurrentIntegerList
{
private readonly object theLock = new object();
List<int> theNumbers = new List<int>();
public void AddNumber(int num)
{
lock (theLock)
{
theNumbers.Add(num);
}
}
public List<int> GetNumbers()
{
lock (theLock)
{
return theNumbers;
}
}
}
But it is still not thread-safe. When I do multiple operations from different threads I get this error:
Collection was modified; enumeration operation may not execute.
What did I miss here?
public List<int> GetNumbers()
{
lock (theLock)
{
// return theNumbers;
return theNumbers.ToList();
}
}
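The original GetNumbers() hands callers a reference to the live list, which they then enumerate outside the lock while other threads keep adding to it. Returning a copy made inside the lock avoids that, but the performance won't be very good this way because every call copies the list, and GetNumbers() now returns a snapshot copy.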
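As an illustration of the snapshot behaviour, here is a small self-contained sketch. The class name SnapshotIntegerList is made up for this example; it is the class from the question with the ToList() change applied:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SnapshotIntegerList
{
    private readonly object _theLock = new object();
    private readonly List<int> _theNumbers = new List<int>();

    public void AddNumber(int num)
    {
        lock (_theLock)
        {
            _theNumbers.Add(num);
        }
    }

    // Returns a copy taken under the lock; later AddNumber calls do not affect it.
    public List<int> GetNumbers()
    {
        lock (_theLock)
        {
            return _theNumbers.ToList();
        }
    }
}
```

A caller can enumerate the returned list and call AddNumber inside the loop without hitting the "Collection was modified" exception, because the loop runs over the copy rather than over the internal list.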

ConcurrentDictionary thread safe and efficient change of a property within a dictionary value

There are quite a few posts about thread-safe changes to a ConcurrentDictionary; however, all of the examples I have found are concerned with replacing the whole value. I think this is a slightly different question.
I'm after some guidance on the best way to change a property of an object value that is already in a ConcurrentDictionary.
For example, I could do the following, but I'm not sure if it's thread-safe:
CustomObject obj;
if (customObjectDictionary.TryGetValue(objectKeyToChangeProperty, out obj))
{
obj.Property2 = "NewData";
}
Another way is to copy the dictionary using ToArray and then get the required object, amend the property and then use the thread-safe AddOrUpdate method
obj = customObjectDictionary.ToArray().Select(x => x.Value).FirstOrDefault();
if(obj != null)
{
obj.Property2 = "NewData";
customObjectDictionary.AddOrUpdate(obj.Key, obj, (oldkey, oldvalue) => obj);
}
This seems a bit of a long-winded way of doing this, and I'm not sure whether calling ToArray will perform well if it is done many times.
Sample code is below:
public partial class Form1 : Form
{
private ConcurrentDictionary<int, CustomObject> customObjectDictionary { get; set; }
public Form1()
{
InitializeComponent();
InitialzeObjects();
Start();
}
private void InitialzeObjects()
{
customObjectDictionary = new ConcurrentDictionary<int, CustomObject>();
var o1 = new CustomObject() { Key = 1, Property1 = 1, Property2 = "Object1" };
customObjectDictionary.AddOrUpdate(o1.Key, o1, (oldkey, oldvalue) => o1);
var o2 = new CustomObject() { Key = 2, Property1 = 2, Property2 = "Object2" };
customObjectDictionary.AddOrUpdate(o2.Key, o2, (oldkey, oldvalue) => o2);
}
private async void Start()
{
bool complete = await Task.Run(() => Test());
}
private async Task<bool> Test()
{
int objectKeyToChangeProperty = 2;
CustomObject obj;
// Method 1 change local variable directly
if (customObjectDictionary.TryGetValue(objectKeyToChangeProperty, out obj))
{
obj.Property2 = "NewData";
}
// Method 2 - make copy first then
obj = customObjectDictionary.ToArray().Select(x => x.Value).FirstOrDefault();
if(obj != null)
{
obj.Property2 = "NewData";
customObjectDictionary.AddOrUpdate(obj.Key, obj, (oldkey, oldvalue) => obj);
}
return true;
}
}
public class CustomObject
{
public int Key { get; set; }
public int Property1 { get; set; }
public string Property2 { get; set; }
}
A concurrent collection is thread-safe in the respect that it will always be internally consistent, even if that comes at the cost of you getting snapshots of the collection (among other trickery). However, and it's a big however, neither the objects inside the collection nor the code you write that uses it are guaranteed to be thread-safe.
I'm not sure of the actual problem you are trying to solve or the threading nature of the properties you are trying to change. The safest bet, if you are unsure, is to just use a lock when accessing these properties.
It is not safe, so you must protect it somewhere else.
There are multiple patterns, and the right one depends on the scenario.
For example, if you need to increment a variable:
if (customObjectDictionary.TryGetValue(objectKeyToChangeProperty, out obj))
{
obj.Property1++;
}
Behind the scenes you are reading Property1, adding 1, and saving it again in one line, but not as one atomic instruction; in the middle, the value could have changed 50 times.
You can protect that code with a lock inside the if, but another developer can come along 2 years later and do the same thing somewhere else without realizing that he/she is forgetting the lock.
Instead, you could enforce the read and write inside the object, which does the locking itself, for example:
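public class CustomObject
{
private readonly object _sync = new object();
private int _Property1;
// Read-only from outside; all changes go through IncrementProperty1By.
public int Property1
{
get
{
lock (_sync) { return _Property1; }
}
}
public void IncrementProperty1By(int increment)
{
lock (_sync) { _Property1 += increment; }
}
}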
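For a plain int counter, a lock-free variant of the same idea is possible with Interlocked. This swaps the lock for a different technique than the answer describes, and the class name CounterObject is made up for this sketch:

```csharp
using System;
using System.Threading;

public class CounterObject
{
    private int _property1;

    // Volatile.Read makes sure the latest published value is observed.
    public int Property1
    {
        get { return Volatile.Read(ref _property1); }
    }

    // Interlocked.Add performs the read-add-write as one atomic operation
    // and returns the new value.
    public int IncrementProperty1By(int increment)
    {
        return Interlocked.Add(ref _property1, increment);
    }
}
```

This only covers a single numeric field; as soon as two properties have to change together, a lock inside the object is still the simpler choice.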

How to safely write to the same List

I've got a public static List<MyDoggie> DoggieList;
DoggieList is appended to and written to by multiple processes throughout my application.
We run into this exception pretty frequently:
Collection was modified; enumeration operation may not execute
Assuming there are multiple classes writing to DoggieList how do we get around this exception?
Please note that this design is not great, but at this point we need to quickly fix it in production.
How can we perform mutations to this list safely from multiple threads?
I understand we can do something like:
lock(lockObject)
{
DoggieList.AddRange(...)
}
But can we do this from multiple classes against the same DoggieList?
You can also create your own class and encapsulate the locking inside it, as shown below.
You can add whatever methods you need, such as AddRange, Remove, etc.
class MyList {
private readonly object objLock = new object();
private readonly List<int> list = new List<int>();
public void Add(int value) {
lock (objLock) {
list.Add(value);
}
}
public int Get(int index) {
lock (objLock) {
return list[index];
}
}
// Returns a copy so callers can enumerate it without holding the lock.
public List<int> GetAll() {
lock (objLock) {
return new List<int>(list);
}
}
}
Good reading on concurrent collections, in a lot of detail: http://www.albahari.com/threading/part5.aspx#_Concurrent_Collections
Using a concurrent collection such as the ConcurrentBag class can also resolve issues related to updates from multiple threads.
Example:
using System.Collections.Concurrent;
using System.Threading.Tasks;
public static class Program
{
public static void Main()
{
var items = new[] { "item1", "item2", "item3" };
var bag = new ConcurrentBag<string>();
Parallel.ForEach(items, bag.Add);
}
}
Using lock has the disadvantage of preventing concurrent reads.
An efficient solution which does not require changing the collection type is to use a ReaderWriterLockSlim:
private static readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
With the following extension methods:
public static class ReaderWriterLockSlimExtensions
{
public static void ExecuteWrite(this ReaderWriterLockSlim aLock, Action action)
{
aLock.EnterWriteLock();
try
{
action();
}
finally
{
aLock.ExitWriteLock();
}
}
public static void ExecuteRead(this ReaderWriterLockSlim aLock, Action action)
{
aLock.EnterReadLock();
try
{
action();
}
finally
{
aLock.ExitReadLock();
}
}
}
which can be used the following way:
_lock.ExecuteWrite(() => DoggieList.Add(new MyDoggie()));
_lock.ExecuteRead(() =>
{
// safe iteration
foreach (MyDoggie item in DoggieList)
{
....
}
});
And finally if you want to build your own collection based on this:
public class SafeList<T>
{
private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
private readonly List<T> _list = new List<T>();
public T this[int index]
{
get
{
T result = default(T);
_lock.ExecuteRead(() => result = _list[index]);
return result;
}
}
public List<T> GetAll()
{
List<T> result = null;
_lock.ExecuteRead(() => result = _list.ToList());
return result;
}
public void ForEach(Action<T> action) =>
_lock.ExecuteRead(() => _list.ForEach(action));
public void Add(T item) => _lock.ExecuteWrite(() => _list.Add(item));
public void AddRange(IEnumerable<T> items) =>
_lock.ExecuteWrite(() => _list.AddRange(items));
}
This list is safe for concurrent use: multiple threads can add or get items in parallel without any concurrency issue. Additionally, multiple threads can read in parallel without blocking each other; it is only when writing that a single thread has exclusive access to the collection.
Note that this collection does not implement IEnumerable<T> because you could get an enumerator and forget to dispose it which would leave the list locked in read mode.
Make DoggieList a ConcurrentStack<MyDoggie> and then use its PushRange method. It is thread-safe.
using System.Collections.Concurrent;
var doggieList = new ConcurrentStack<MyDoggie>();
// PushRange takes an array (MyDoggie[]).
doggieList.PushRange(YourCode);

thread-safe loading of a static collection

I have a few static Dictionary objects that hold lists of constants for me so that I don't have to load them from the database each time my website loads (for example: a list of countries, a list of categories).
So I have a static function that checks whether the instance is null and, if it is, queries the database, instantiates the static variable, and populates it with data.
Since it is a website, more than one person could try to access that information at the same time while the object is null, and all of them would run that process at the same time (which is really not necessary, causes unneeded queries against the DB, and could cause duplicated objects in the list).
I know there's a way to make this kind of loading thread-safe (just not really sure how). Could someone point me in the right direction? Should I use a lock?
Thanks
UPDATE II:
This is what I wrote (is this good thread-safe code?):
private static Lazy<List<ICountry>> _countries = new Lazy<List<ICountry>>(loadCountries);
private static List<ICountry> loadCountries()
{
List<ICountry> result = new List<ICountry>();
DataTable dtCountries = SqlHelper.ExecuteDataTable("stp_Data_Countries_Get");
foreach (DataRow dr in dtCountries.Rows)
{
result.Add(new Country
{
ID = Convert.ToInt32(dr["CountryId"]),
Name = dr["Name"].ToString()
});
}
return result;
}
public static List<ICountry> GetAllCountries()
{
return _countries.Value;
}
You can use Lazy to load a resource in a lazy and thread-safe manner:
Lazy<List<string>> countries =
new Lazy<List<string>>(()=> /* get your countries from db */);
Update:
public static class HelperTables
{
private static Lazy<List<ICountry>> _countries;
static HelperTables() // Static constructor
{
// Instantiating the lazy object in the static constructor will prevent race conditions
_countries = new Lazy<List<ICountry>>(() =>
{
List<ICountry> result = new List<ICountry>();
DataTable dtCountries = SqlHelper.ExecuteDataTable("stp_Data_Countries_Get");
foreach (DataRow dr in dtCountries.Rows)
{
result.Add(new Country
{
ID = Convert.ToInt32(dr["CountryId"]),
Name = dr["Name"].ToString()
});
}
return result;
});
}
public static List<ICountry> GetAllCountries()
{
return _countries.Value;
}
}
If you're using .NET 4.0, you can use the builtin Lazy generic class.
private static Lazy<YourObject> data = new Lazy<YourObject>(YourInitializationFunction);
public static YourObject Data { get { return data.Value; } }
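Note that adding a static constructor to the class where you define this makes the timing of the field initializer deterministic (without one, the type is marked beforefieldinit and the runtime may run the initializer earlier than the first access). The Lazy<T> created above is thread-safe either way, because the constructor taking only a factory defaults to LazyThreadSafetyMode.ExecutionAndPublication.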
If you're not on .NET 4.0+, you can just write your own code. The basic pattern looks something like this:
private static YourObject data;
private static object syncObject = new object();
public static YourObject Data
{
get
{
if (data == null)
{
lock (syncObject)
{
if (data != null)
return data;
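var obj = new YourObject();
// Interlocked.Exchange returns the previous value (null here),
// so publish the new object and return obj, not the result of Exchange.
Interlocked.Exchange(ref data, obj);
return obj;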
}
}
return data;
}
}

How to use lock on a Dictionary containing a list of objects in C#?

I have the following class:
public static class HotspotsCache
{
private static Dictionary<short, List<HotSpot>> _companyHotspots = new Dictionary<short, List<HotSpot>>();
private static object Lock = new object();
public static List<HotSpot> GetCompanyHotspots(short companyId)
{
lock (Lock)
{
if (!_companyHotspots.ContainsKey(companyId))
{
RefreshCompanyHotspotCache(companyId);
}
return _companyHotspots[companyId];
}
}
private static void RefreshCompanyHotspotCache(short companyId)
{
....
hotspots = ServiceProvider.Instance.GetService<HotspotsService>().GetHotSpots(..);
_companyHotspots.Add(companyId, hotspots);
....
}
}
The issue that I'm having is that the operation of getting the hotspots, in the RefreshCompanyHotspotCache method, takes a lot of time. So while one thread is performing the cache refresh for a certain companyId, all the other threads wait until this operation is finished, although there could be threads requesting the list of hotspots for another companyId for which the list is already loaded in the dictionary. I would like those threads not to be blocked. I also want all threads that request the list of hotspots for a company that is not yet loaded in the cache to wait until the list is fully retrieved and loaded in the dictionary.
Is there a way to lock only the threads that are reading/writing the cache for certain companyId (for which the refresh is taking place) and let the other threads that are requesting data for another company to do their job?
My thought was to use an array of locks:
lock (companyLocks[companyId])
{
...
}
But that didn't solve anything. The threads dealing with one company are still waiting for threads that are refreshing the cache for other companies.
Use the Double-checked lock mechanism also mentioned by Snowbear - this will prevent your code locking when it doesn't actually need to.
With your idea of an individual lock per client, I've used this mechanism in the past, though I used a dictionary of locks. I made a utility class for getting a lock object from a key:
/// <summary>
/// Provides a mechanism to lock based on a data item being retrieved
/// </summary>
/// <typeparam name="T">Type of the data being used as a key</typeparam>
public class LockProvider<T>
{
private object _syncRoot = new object();
private Dictionary<T, object> _lstLocks = new Dictionary<T, object>();
/// <summary>
/// Gets an object suitable for locking the specified data item
/// </summary>
/// <param name="key">The data key</param>
/// <returns></returns>
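public object GetLock(T key)
{
// Dictionary<TKey, TValue> is not safe to read while another thread writes,
// so both the lookup and the insert happen under _syncRoot.
lock (_syncRoot)
{
object keyLock;
if (!_lstLocks.TryGetValue(key, out keyLock))
{
keyLock = new object();
_lstLocks.Add(key, keyLock);
}
return keyLock;
}
}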
}
So simply use this in the following manner...
private static LockProvider<short> _clientLocks = new LockProvider<short>();
private static Dictionary<short, List<HotSpot>> _companyHotspots = new Dictionary<short, List<HotSpot>>();
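public static List<HotSpot> GetCompanyHotspots(short companyId)
{
if (!_companyHotspots.ContainsKey(companyId))
{
lock (_clientLocks.GetLock(companyId))
{
if (!_companyHotspots.ContainsKey(companyId))
{
// Add item to _companyHotspots here...
// Note: Dictionary<TKey, TValue> is not safe for concurrent writers, so the Add
// itself needs one shared lock (or a ConcurrentDictionary), not only the per-key lock.
}
}
}
return _companyHotspots[companyId];
}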
How about you only lock 1 thread, and let that update, while everyone else uses the old list?
private static Dictionary<short, List<HotSpot>> _companyHotspots = new Dictionary<short, List<HotSpot>>();
private static Dictionary<short, List<HotSpot>> _companyHotspotsOld = new Dictionary<short, List<HotSpot>>();
private static bool _hotspotsUpdating = false;
private static object Lock = new object();
public static List<HotSpot> GetCompanyHotspots(short companyId)
{
if (!_hotspotsUpdating)
{
if (!_companyHotspots.ContainsKey(companyId))
{
lock (Lock)
{
_hotspotsUpdating = true;
_companyHotspotsOld = _companyHotspots;
RefreshCompanyHotspotCache(companyId);
_hotspotsUpdating = false;
return _companyHotspots[companyId];
}
}
else
{
return _companyHotspots[companyId];
}
}
else
{
return _companyHotspotsOld[companyId];
}
}
Have you looked into ReaderWriterLockSlim? That should let you get finer-grained locking where you only take a write lock when needed.
Another thing you may need to look out for is false sharing. I don't know how a lock is implemented exactly but if you lock on objects in an array they're bound to be close to each other in memory, possibly putting them on the same cacheline, so the lock may not behave as you expect.
Another idea, what happens if you change the last code snippet to
object l = companyLocks[companyId];
lock(l){
}
It could be that the lock statement wraps more here than intended.
GJ
New idea, with locking just the lists as they are created.
If you can guarantee that each company will have at least one hotspot, do this:
public static class HotspotsCache
{
private static Dictionary<short, List<HotSpot>> _companyHotspots = new Dictionary<short, List<HotSpot>>();
static HotspotsCache()
{
foreach(short companyId in allCompanies)
{
_companyHotspots.Add(companyId, new List<HotSpot>());
}
}
public static List<HotSpot> GetCompanyHotspots(short companyId)
{
List<HotSpot> result = _companyHotspots[companyId];
if(result.Count == 0)
{
lock(result)
{
if(result.Count == 0)
{
RefreshCompanyHotspotCache(companyId, result);
}
}
}
return result;
}
private static void RefreshCompanyHotspotCache(short companyId, List<HotSpot> resultList)
{
....
hotspots = ServiceProvider.Instance.GetService<HotspotsService>().GetHotSpots(..);
resultList.AddRange(hotspots);
....
}
}
Since the dictionary is not modified after its initial creation, there is no need to do any locking on it. We only need to lock the individual lists as we populate them; the read operation needs no locking (including the initial Count == 0).
If you're able to use .NET 4 then the answer is straightforward -- use a ConcurrentDictionary<K,V> instead and let that look after the concurrency details for you:
public static class HotSpotsCache
{
private static readonly ConcurrentDictionary<short, List<HotSpot>>
_hotSpotsMap = new ConcurrentDictionary<short, List<HotSpot>>();
public static List<HotSpot> GetCompanyHotSpots(short companyId)
{
return _hotSpotsMap.GetOrAdd(companyId, id => LoadHotSpots(id));
}
private static List<HotSpot> LoadHotSpots(short companyId)
{
return ServiceProvider.Instance
.GetService<HotSpotsService>()
.GetHotSpots(/* ... */);
}
}
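One caveat with GetOrAdd is that the value factory can be invoked more than once when several threads ask for the same missing key at the same time; only one result is stored. If the load must run at most once per company, a common variation is to store Lazy<T> wrappers. The sketch below is an assumption-laden illustration, not the answer's code: LazyHotSpotsCache, LoadCount and the string payload are made up, and LoadHotSpots stands in for the slow service call.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class LazyHotSpotsCache
{
    // Counts loader invocations so the once-per-key behaviour can be observed.
    public static int LoadCount;

    private static readonly ConcurrentDictionary<short, Lazy<List<string>>> _map =
        new ConcurrentDictionary<short, Lazy<List<string>>>();

    public static List<string> GetCompanyHotSpots(short companyId)
    {
        // Several Lazy wrappers may be constructed under contention, but only one
        // is stored, every caller gets that stored wrapper back, and only its
        // Value is ever read, so the loader runs once per key.
        return _map.GetOrAdd(
            companyId,
            id => new Lazy<List<string>>(() => LoadHotSpots(id))).Value;
    }

    // Hypothetical loader standing in for the slow service call.
    private static List<string> LoadHotSpots(short companyId)
    {
        Interlocked.Increment(ref LoadCount);
        Thread.Sleep(50);
        return new List<string> { "hotspot-for-" + companyId };
    }
}
```

Threads asking for a company that is still loading block on that company's Lazy only, while requests for companies that are already cached return immediately.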
If you're not able to use .NET 4 then your idea of using several more granular locks is a good one:
public static class HotSpotsCache
{
private static readonly Dictionary<short, List<HotSpot>>
_hotSpotsMap = new Dictionary<short, List<HotSpot>>();
private static readonly object _bigLock = new object();
private static readonly Dictionary<short, object>
_miniLocks = new Dictionary<short, object>();
public static List<HotSpot> GetCompanyHotSpots(short companyId)
{
List<HotSpot> hotSpots;
object miniLock;
lock (_bigLock)
{
if (_hotSpotsMap.TryGetValue(companyId, out hotSpots))
return hotSpots;
if (!_miniLocks.TryGetValue(companyId, out miniLock))
{
miniLock = new object();
_miniLocks.Add(companyId, miniLock);
}
}
lock (miniLock)
{
if (!_hotSpotsMap.TryGetValue(companyId, out hotSpots))
{
hotSpots = LoadHotSpots(companyId);
lock (_bigLock)
{
_hotSpotsMap.Add(companyId, hotSpots);
_miniLocks.Remove(companyId);
}
}
return hotSpots;
}
}
private static List<HotSpot> LoadHotSpots(short companyId)
{
return ServiceProvider.Instance
.GetService<HotSpotsService>()
.GetHotSpots(/* ... */);
}
}
