How is it possible to pass a dictionary as a parameter to a thread function and then iterate through it?
Dictionary<string, Track> dic = allTracks;
updateThread = new Thread(() => toDB(dic));
updateThread.Start();
and the function:
public static void toDB( Dictionary<string, Track> dict)
{
foreach (KeyValuePair<string, Track> pair in dict)
{
//do something - but I do not alter anything in dictionary
}
}
I tried it like this, but I get an error:
Collection was modified; enumeration operation may not execute.
You will get this exception if the dictionary is modified, either in the main thread or in the thread you passed it to, while it is being enumerated. You can use ConcurrentDictionary or implement the locking yourself.
However, if you do not intend to modify the original collection inside the function you are calling in the thread, and you don't need the latest values either, then you can simply create a copy before passing it to your separate thread.
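A minimal sketch of the copy approach, assuming a stand-in Track class and a hypothetical ProcessOnWorker helper (neither is from the question; the real toDB body is not shown, so the worker just counts entries). Note that the copy is shallow, so the Track objects themselves are still shared, and that making the copy enumerates the original, so nothing else may be modifying it at that moment.

```csharp
using System.Collections.Generic;
using System.Threading;

public static class SnapshotExample
{
    // Hypothetical stand-in for the question's Track type.
    public sealed class Track
    {
        public string Title { get; set; }
    }

    // Hypothetical helper: copies the dictionary, then lets a worker thread
    // enumerate the copy while the caller keeps changing the original.
    public static int ProcessOnWorker(Dictionary<string, Track> allTracks)
    {
        // Shallow copy made on the calling thread. The copy constructor
        // enumerates allTracks, so no other thread may modify it right now.
        var snapshot = new Dictionary<string, Track>(allTracks);

        int processed = 0;
        var worker = new Thread(() =>
        {
            foreach (KeyValuePair<string, Track> pair in snapshot)
            {
                processed++; // placeholder for the real per-track work
            }
        });
        worker.Start();

        // Safe: the worker only ever touches the private copy.
        allTracks.Clear();

        worker.Join();
        return processed;
    }
}
```

Calling SnapshotExample.ProcessOnWorker(allTracks) visits every entry that was present when the copy was made, regardless of what happens to allTracks afterwards.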
I'm using a ConcurrentBag to contain a list of strings. Occasionally it will contain a duplicate.
However, I'm checking the contents of the bag before adding each new entry, so it should never have a duplicate.
ConcurrentDictionary<string, string> SystemFiles = PopulateSystemFiles();
ConcurrentBag<string> SystemNames = new ConcurrentBag<string>();
Parallel.ForEach(SystemFiles, file =>
{
string name = GetSystemName(file.Value);
if (!SystemNames.Contains(name))
{
SystemNames.Add(name);
}
});
My assumption is that the .Contains method is not thread safe. Am I correct?
ConcurrentBag is threadsafe, but your code isn't:
if (!SystemNames.Contains(name))
{
SystemNames.Add(name);
}
Contains will execute in a thread-safe way, then Add will also execute in a thread-safe way, but you have no guarantee that another thread hasn't added the same item in between the two calls.
For your needs, I recommend using a ConcurrentDictionary instead. Just ignore the value as you won't need it.
var SystemNames = new ConcurrentDictionary<string, bool>();
Then use the TryAdd method to do the "if not contains then add" in a single atomic operation:
SystemNames.TryAdd(name, true);
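Put together, the loop from the question could look roughly like this. It is only a sketch: UniqueNames.Collect is a hypothetical helper name, and it takes the names directly rather than calling GetSystemName, whose body is not shown in the question.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class UniqueNames
{
    // Hypothetical helper: collects the distinct names from a parallel loop.
    public static ICollection<string> Collect(IEnumerable<string> names)
    {
        // The bool value is a dummy; only the keys matter.
        var seen = new ConcurrentDictionary<string, bool>();

        Parallel.ForEach(names, name =>
        {
            // Atomic "add if absent": returns false (and changes nothing)
            // when another thread already added the same key.
            seen.TryAdd(name, true);
        });

        return seen.Keys;
    }
}
```

Because the check and the insert happen inside a single TryAdd call, there is no window between them for a second thread to slip in the same name.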
If I have the following code:
var dictionary = new ConcurrentDictionary<int, HashSet<string>>();
foreach (var user in users)
{
if (!dictionary.ContainsKey(user.GroupId))
{
dictionary.TryAdd(user.GroupId, new HashSet<string>());
}
dictionary[user.GroupId].Add(user.Id.ToString());
}
Is the act of adding an item into the HashSet inherently thread safe because HashSet is a value property of the concurrent dictionary?
No. Putting a container in a thread-safe container does not make the inner container thread safe.
dictionary[user.GroupId].Add(user.Id.ToString());
is calling HashSet's add after retrieving it from the ConcurrentDictionary. If this GroupId is looked up from two threads at once this would break your code with strange failure modes. I saw the result of one of my teammates making the mistake of not locking his sets, and it wasn't pretty.
This is a plausible solution. I'd do something different myself but this is closer to your code.
if (!dictionary.ContainsKey(user.GroupId))
{
dictionary.TryAdd(user.GroupId, new HashSet<string>());
}
var groups = dictionary[user.GroupId];
lock(groups)
{
groups.Add(user.Id.ToString());
}
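One possible variation, not necessarily what the answerer had in mind: GetOrAdd can replace the ContainsKey / TryAdd / indexer sequence with a single call, and the lock on the inner set stays. The User class and the GroupIndex.Build name below are assumptions made up for the sketch.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class GroupIndex
{
    // Hypothetical user shape; the question only shows Id and GroupId.
    public sealed class User
    {
        public int Id { get; set; }
        public int GroupId { get; set; }
    }

    public static ConcurrentDictionary<int, HashSet<string>> Build(IEnumerable<User> users)
    {
        var dictionary = new ConcurrentDictionary<int, HashSet<string>>();

        Parallel.ForEach(users, user =>
        {
            // The factory may run more than once under contention, but every
            // caller gets back the one set that was actually stored.
            HashSet<string> group = dictionary.GetOrAdd(user.GroupId, _ => new HashSet<string>());

            // HashSet<T> is not thread safe, so guard each mutation.
            lock (group)
            {
                group.Add(user.Id.ToString());
            }
        });

        return dictionary;
    }
}
```

Anything that later reads one of the sets while writers may still be running has to take the same lock on that set.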
No, the collection (the dictionary itself) is thread-safe, not whatever you put in it. You have a couple of options:
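Use AddOrUpdate as @TheGeneral mentioned. The update delegate has to return the set, and it runs outside the dictionary's internal locks, so the HashSet still needs its own lock:
dictionary.AddOrUpdate(user.GroupId, new HashSet<string> { user.Id.ToString() }, (k, v) => { lock (v) { v.Add(user.Id.ToString()); } return v; });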
Use a concurrent collection, like the ConcurrentBag<T>:
ConcurrentDictionary<int, ConcurrentBag<string>>
Whenever you are building the Dictionary, as in your code, you should be better off accessing it as little as possible. Think of something like this:
var dictionary = new ConcurrentDictionary<int, ConcurrentBag<string>>();
var groupedUsers = users.GroupBy(u => u.GroupId);
foreach (var group in groupedUsers)
{
// get the bag from the dictionary or create it if it doesn't exist
var currentBag = dictionary.GetOrAdd(group.Key, new ConcurrentBag<string>());
// load it with the users required
foreach (var user in group)
{
if (!currentBag.Contains(user.Id.ToString()))
{
currentBag.Add(user.Id.ToString());
}
}
}
If you actually want a built-in concurrent HashSet-like collection, you'd need to use ConcurrentDictionary<int, ConcurrentDictionary<string, string>>, and care either about the key or the value from the inner one.
I am in this situation that there is a large dictionary that is randomly updated by one thread at a fairly high frequency, and there is another thread that tries to take a snapshot of the dictionary to save as history.
I am currently using something like this:
Dictionary<string, object> dict = new Dictionary<string, object>();
var items = dict.Values.ToList();
This works fine most of the time, but it occasionally throws:
System.InvalidOperationException: Collection was modified; enumeration
operation may not execute.
I understand why this happens, but I don't know how to avoid the collection-modified error.
What is the best approach to iterate such collection?
I also tried ConcurrentDictionary, but no luck.
Why? Is ConcurrentDictionary thread safe only at item level?
According to the docs you should be able to use the GetEnumerator() method of ConcurrentDictionary to get a thread-safe iterator.
The enumerator returned from the dictionary is safe to use concurrently with reads and writes to the dictionary, however it does not represent a moment-in-time snapshot of the dictionary. The contents exposed through the enumerator may contain modifications made to the dictionary after GetEnumerator was called.
Since you're dealing with concurrent threads, it's not surprising to have some tradeoffs with consistency, but I would expect this approach to block less than the brute force approach given in other answers. This wouldn't have worked if you tried:
var items = concurrentDict.Items.ToList();
but it's supposed to work for
var items = concurrentDict.GetEnumerator();
or you could simply reference the iterator directly:
foreach(var item in concurrentDict)
{
valueList.Add(item.Value);
}
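If a consistent copy matters more than avoiding blocking, another option is the ToArray() instance method of ConcurrentDictionary. My understanding is that it takes all of the dictionary's internal locks while it copies, so the array reflects a single moment in time, at the cost of briefly holding up writers. A sketch, with HistorySnapshot.TakeValues as a made-up helper name:

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class HistorySnapshot
{
    // Hypothetical helper: returns the values as they were at one instant.
    public static List<object> TakeValues(ConcurrentDictionary<string, object> dict)
    {
        // Instance method on ConcurrentDictionary (System.Linq is not imported
        // here, so this cannot bind to the LINQ extension method instead).
        KeyValuePair<string, object>[] snapshot = dict.ToArray();

        var values = new List<object>(snapshot.Length);
        foreach (KeyValuePair<string, object> pair in snapshot)
        {
            values.Add(pair.Value);
        }
        return values;
    }
}
```

The array is detached from the dictionary, so it can be enumerated or saved as history at leisure while the updating thread carries on.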
An ImmutableDictionary might be appropriate for you, as it supports scalable multi-threading and snapshotting as part of its basic feature-set.
// initialize.
ImmutableDictionary<string, int> dict = ImmutableDictionary.Create<string,int>();
// create a new dictionary with "foo" key added.
ImmutableDictionary<string, int> newdict = dict.Add("foo", 0);
// replace dict, thread-safe, with a new dictionary with "bar" added.
// note this is using dict, not newdict, so there is no "foo" in it.
ImmutableInterlocked.TryAdd(ref dict, "bar", 1);
// take a snapshot, thread-safe.
ImmutableDictionary<string,int> snapshot = dict;
The immutable nature means that the dictionary can never change -- you can only add a value by creating a new dictionary. And because of this property, you take a "snapshot" of it by simply keeping a reference around from the point you want to snapshot.
It is optimized under the hood to be efficient, not copying the entire thing for every operation. That said, for other operations it isn't as efficient as ConcurrentDictionary, but it's all a trade-off in what you want. For instance, a ConcurrentDictionary can be enumerated while it is being modified, but its enumerator does not give you a moment-in-time snapshot.
You can use a monitor, via the lock keyword, to ensure that reads and writes never execute at the same time.
public class SnapshotDictionary<TKey, TValue> : IEnumerable<KeyValuePair<TKey, TValue>>
{
private readonly Dictionary<TKey, TValue> _dictionary = new Dictionary<TKey, TValue>();
private readonly object _lock = new object();
public void Add(TKey key, TValue value)
{
lock (_lock)
{
_dictionary.Add(key, value);
}
}
// TODO: Other necessary IDictionary methods
public Dictionary<TKey, TValue> GetSnapshot()
{
lock (_lock)
{
return new Dictionary<TKey, TValue>(_dictionary);
}
}
public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator()
{
return GetSnapshot().GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
The GetSnapshot method returns a snapshot of your dictionary.
I have also implemented GetEnumerator so that it creates a snapshot and then returns the snapshot's enumerator.
So this will work, because it is executed on a snapshot:
var items = snapshotDictionary.GetSnapshot().Values.ToList();
// or
foreach (var item in snapshotDictionary)
{
// ...
}
However, this approach does not allow concurrent writes: every writer and every snapshot takes the same lock, so they run one at a time.
Recently I was running into the following exception when using a generic dictionary
An InvalidOperationException has occurred. A collection was modified
I realized that this error was primarily because of thread safety issues on the static dictionary I was using.
A little background: I currently have an application which has 3 different methods that are related to this issue.
Method A iterates through the dictionary using foreach and returns a value.
Method B adds data to the dictionary.
Method C changes the value of the key in the dictionary.
Sometimes while iterating through the dictionary, data is also being added, which is the cause of this issue. I keep getting this exception in the foreach part of my code where I iterate over the contents of the dictionary. In order to resolve this issue, I replaced the generic dictionary with the ConcurrentDictionary and here are the details of what I did.
Aim: My main objective is to completely remove the exception.
For method B (which adds a new key to the dictionary) I replaced .Add with TryAdd.
For method C (which updates the value of the dictionary) I did not make any changes. A rough sketch of the code is as follows:
static public int ChangeContent(int para)
{
foreach (KeyValuePair<string, CustObject> pair in static_container)
{
if (pair.Value.propA != para ) //Pending cancel
{
pair.Value.data_id = prim_id; //I am updating the content
return 0;
}
}
return -2;
}
For method A, I am simply iterating over the dictionary, and this is where the running code stops (in debug mode) and Visual Studio informs me that the error occurred. The code I am using is similar to the following:
static public CustObject RetrieveOrderDetails(int para)
{
foreach (KeyValuePair<string, CustObject> pair in static_container)
{
if (pair.Value.cust_id.Equals(symbol))
{
if (pair.Value.OrderStatus != para)
{
return pair.Value; //Found
}
}
}
return null; //Not found
}
Are these changes going to resolve the exception that I am getting?
Edit:
It states on this page that the method GetEnumerator allows you to traverse the elements in parallel with writes (although it may be outdated). Isn't that the same as using foreach?
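On the foreach question: foreach simply calls GetEnumerator on the collection, so with a ConcurrentDictionary the two are the same thing. A small sketch that writes to the dictionary from inside its own foreach (EnumerateWhileWriting and the key name are made up for illustration); with a plain Dictionary the same loop would throw InvalidOperationException.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class EnumerateWhileWriting
{
    // Hypothetical demo: adds a key while the same dictionary is being enumerated.
    public static int Run(ConcurrentDictionary<string, int> container)
    {
        int visited = 0;
        foreach (KeyValuePair<string, int> pair in container)
        {
            // A write in the middle of enumeration. No exception is thrown,
            // but the loop may or may not get to see the new entry.
            container.TryAdd("added-during-enumeration", 0);
            visited++;
        }
        return visited;
    }
}
```

The trade-off is the one quoted from the docs elsewhere in this thread: the loop is safe, but what it sees is not a consistent snapshot.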
For modification of elements, one option is to manually iterate the dictionary using a for loop, e.g.:
Dictionary<string, string> test = new Dictionary<string, string>();
int dictionaryLength = test.Count();
for (int i = 0; i < dictionaryLength; i++)
{
test[test.ElementAt(i).Key] = "Some new content";
}
Be wary, though: if you're also adding to the Dictionary, you must increment dictionaryLength (or decrement it if you remove elements) appropriately.
Depending on what exactly you're doing, and if order matters, you may wish to use a SortedDictionary instead.
You could extend this by refreshing dictionaryLength with a call to test.Count() at each iteration. If there's a danger of missing any entries, you could also keep an additional list of the keys you've already modified. It really depends on what you're doing and what your needs are.
You can further get a list of keys using test.Keys.ToList(), that option would work as follows:
Dictionary<string, string> test = new Dictionary<string, string>();
List<string> keys = test.Keys.ToList();
foreach (string key in keys)
{
test[key] = "Some new content";
}
IEnumerable<string> newKeys = test.Keys.ToList().Except(keys);
if (newKeys.Any())
{
// Do it again or whatever.
}
Note that I've also shown an example of how to find out whether any new keys were added between you getting the initial list of keys, and completing iteration such that you could then loop round and handle the new keys.
Hopefully one of these options will suit (or you may even want to mix and match- for loop on the keys for example updating that as you go instead of the length) - as I say, it's as much about what precisely you're trying to do as much as anything.
Before doing foreach() try out copying container to a new instance
var unboundContainer = static_container.ToList();
foreach (KeyValuePair<string, CustObject> pair in unboundContainer)
Also, I think updating the Value property directly is not right from a thread-safety perspective; refactor your code to use TryUpdate() instead.
Continuing my recent pondering about locks in C# and .NET,
consider the following scenario:
I have a class which contains a specific collection (for this example, I've used a Dictionary<string, int>) which is updated from a data source every few minutes by a method whose body you can see below:
DataTable dataTable = dbClient.ExecuteDataSet(i_Query).GetFirstTable();
lock (r_MappingLock)
{
i_MapObj.Clear();
foreach (DataRow currRow in dataTable.Rows)
{
i_MapObj.Add(Convert.ToString(currRow[i_Column1]), Convert.ToInt32(currRow[i_Column2]));
}
}
r_MappingLock is an object dedicated to lock the critical section which refreshes the dictionary's contents.
i_MapObj is the dictionary object
i_Column1 and i_Column2 are the datatable's column names which contain the desired data for the mapping.
Now, I also have a class method which receives a string and returns the correct mapped int based on the mentioned dictionary.
I want this method to wait until the refresh method completes its execution, so at first glance one would consider the following implementation:
lock (r_MappingLock)
{
int? retVal = null;
if (i_MapObj.ContainsKey(i_Key))
{
retVal = i_MapObj[i_Key];
}
return retVal;
}
This will prevent unexpected behaviour and return value while the dictionary is being updated, but another issue arises:
Every thread that executes the above method tries to claim the lock, so if multiple threads call it at the same time, each has to wait until the previous thread has finished before it can claim the lock. This is obviously undesirable, since the method only reads.
I was thinking of adding a boolean member to the class that indicates whether the dictionary is currently being updated, and checking it within the "read only" method, but this gives rise to other race-condition issues...
Any ideas how to solve this gracefully?
Thanks again,
Mikey
Have a look at the built in ReaderWriterLock.
I would just switch to using a ConcurrentDictionary to avoid this situation altogether - manually locking is error-prone. Also as I can gather from "C#: The Curious ConcurrentDictionary", ConcurrentDictionary is already read-optimized.
Albin correctly pointed to ReaderWriterLock. I will add an even nicer one: ReaderWriterGate by Jeffrey Richter. Enjoy!
You might consider creating a new dictionary when updating, instead of locking. This way, you will always have consistent results, but reads during updates would return previous data:
private volatile Dictionary<string, int> i_MapObj = new Dictionary<string, int>();
private void Update()
{
DataTable dataTable = dbClient.ExecuteDataSet(i_Query).GetFirstTable();
var newData = new Dictionary<string, int>();
foreach (DataRow currRow in dataTable.Rows)
{
newData.Add(Convert.ToString(currRow[i_Column1]), Convert.ToInt32(currRow[i_Column2]));
}
// Start using new data - reference assignments are atomic
i_MapObj = newData;
}
private int? GetValue(string key)
{
int value;
if (i_MapObj.TryGetValue(key, out value))
return value;
return null;
}
In C# 4.0 there is the ReaderWriterLockSlim class, which is a lot faster!
It is almost as fast as a lock().
Keep the policy that disallows recursion (LockRecursionPolicy.NoRecursion) to keep performance that high.
Look at this page for more info.
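A rough sketch of how the mapping class from the question might use ReaderWriterLockSlim. The Mapping class and its Refresh signature are assumptions: the rows are passed in as key/value pairs instead of being read from the DataTable, to keep the example self-contained.

```csharp
using System.Collections.Generic;
using System.Threading;

public sealed class Mapping
{
    private readonly ReaderWriterLockSlim _lock =
        new ReaderWriterLockSlim(LockRecursionPolicy.NoRecursion);
    private readonly Dictionary<string, int> _map = new Dictionary<string, int>();

    // Writer: exclusive access while the contents are replaced.
    public void Refresh(IEnumerable<KeyValuePair<string, int>> rows)
    {
        _lock.EnterWriteLock();
        try
        {
            _map.Clear();
            foreach (KeyValuePair<string, int> row in rows)
            {
                _map[row.Key] = row.Value;
            }
        }
        finally
        {
            _lock.ExitWriteLock();
        }
    }

    // Reader: many threads may hold the read lock at once; they only
    // wait while a Refresh is in progress.
    public int? GetValue(string key)
    {
        _lock.EnterReadLock();
        try
        {
            int value;
            return _map.TryGetValue(key, out value) ? value : (int?)null;
        }
        finally
        {
            _lock.ExitReadLock();
        }
    }
}
```

Readers no longer queue behind each other, which was the objection to the plain lock, while a refresh still blocks all readers until it completes.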