Using the Concurrent Dictionary - Thread Safe Collection Modification - c#

Recently I was running into the following exception when using a generic dictionary:
An InvalidOperationException has occurred. A collection was modified
I realized that this error was primarily because of thread safety issues on the static dictionary I was using.
A little background: I currently have an application which has 3 different methods that are related to this issue.
Method A iterates through the dictionary using foreach and returns a value.
Method B adds data to the dictionary.
Method C changes the value of the key in the dictionary.
Sometimes while iterating through the dictionary, data is also being added, which is the cause of this issue. I keep getting this exception in the foreach part of my code where I iterate over the contents of the dictionary. In order to resolve this issue, I replaced the generic dictionary with the ConcurrentDictionary and here are the details of what I did.
Aim: My main objective is to get rid of the exception completely.
For method B (which adds a new key to the dictionary) I replaced .Add with TryAdd.
For method C (which updates the value of the dictionary) I did not make any changes. A rough sketch of the code is as follows:
static public int ChangeContent(int para)
{
    foreach (KeyValuePair<string, CustObject> pair in static_container)
    {
        if (pair.Value.propA != para) //Pending cancel
        {
            pair.Value.data_id = prim_id; //I am updating the content
            return 0;
        }
    }
    return -2;
}
For method A, I am simply iterating over the dictionary, and this is where the running code stops (in debug mode) and Visual Studio informs me that this is where the error occurred. The code I am using is similar to the following:
static public CustObject RetrieveOrderDetails(int para)
{
    foreach (KeyValuePair<string, CustObject> pair in static_container)
    {
        if (pair.Value.cust_id.Equals(symbol))
        {
            if (pair.Value.OrderStatus != para)
            {
                return pair.Value; //Found
            }
        }
    }
    return null; //Not found
}
Are these changes going to resolve the exception that I am getting?
Edit:
It states on this page that the method GetEnumerator allows you to traverse through the elements in parallel with writes (although it may be outdated). Isn't that the same as using foreach?

For modification of elements, one option is to manually iterate the dictionary using a for loop, e.g.:
Dictionary<string, string> test = new Dictionary<string, string>();
int dictionaryLength = test.Count();
for (int i = 0; i < dictionaryLength; i++)
{
    test[test.ElementAt(i).Key] = "Some new content";
}
Be wary, though: if you're also adding to the Dictionary, you must increment dictionaryLength (or decrement it if you remove elements) appropriately.
Depending on what exactly you're doing, and if order matters, you may wish to use a SortedDictionary instead.
You could extend this by refreshing dictionaryLength with test.Count() at each iteration, and by keeping an additional list of the keys you have already modified if there is a danger of missing any. It really depends on what you're doing and what your needs are.
You can also get a list of keys using test.Keys.ToList(). That option would work as follows:
Dictionary<string, string> test = new Dictionary<string, string>();
List<string> keys = test.Keys.ToList();
foreach (string key in keys)
{
    test[key] = "Some new content";
}
IEnumerable<string> newKeys = test.Keys.ToList().Except(keys);
if (newKeys.Any())
{
    // Do it again or whatever.
}
Note that I've also shown an example of how to find out whether any new keys were added between getting the initial list of keys and completing the iteration, so that you can then loop round and handle the new keys.
Hopefully one of these options will suit (or you may even want to mix and match, for example a for loop over the keys that updates the key list as you go instead of the length). As I say, it depends on what precisely you're trying to do.

Before doing the foreach, try copying the container to a new instance:
var unboundContainer = static_container.ToList();
foreach (KeyValuePair<string, CustObject> pair in unboundContainer)
Also, I think updating the Value property in place is not right from a thread-safety perspective; refactor your code to use TryUpdate() instead.
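As a rough sketch of that advice, assuming static_container becomes a ConcurrentDictionary<string, CustObject>: the CustObject shape, the Orders class and the method signatures below are hypothetical stand-ins for the question's code, not the original types. A foreach over a ConcurrentDictionary does not throw when another thread writes to it, but the enumerator is not a moment-in-time snapshot either, so the sketch iterates over the array returned by the dictionary's own ToArray() method and swaps in a new object with TryUpdate instead of mutating the stored one.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical, immutable stand-in for the question's CustObject.
public sealed class CustObject
{
    public CustObject(int propA, int dataId) { PropA = propA; DataId = dataId; }
    public int PropA { get; }
    public int DataId { get; }
}

public static class Orders
{
    private static readonly ConcurrentDictionary<string, CustObject> static_container =
        new ConcurrentDictionary<string, CustObject>();

    // Method B: TryAdd returns false instead of throwing when the key already exists.
    public static bool AddOrder(string key, CustObject value)
    {
        return static_container.TryAdd(key, value);
    }

    // Method C: replace the stored object rather than mutating it in place.
    public static int ChangeContent(int para, int primId)
    {
        // ToArray() copies the current contents, so the loop runs over a snapshot.
        foreach (KeyValuePair<string, CustObject> pair in static_container.ToArray())
        {
            if (pair.Value.PropA != para)
            {
                var updated = new CustObject(pair.Value.PropA, primId);
                // Succeeds only if the entry still holds the object we read.
                if (static_container.TryUpdate(pair.Key, updated, pair.Value))
                {
                    return 0;
                }
            }
        }
        return -2;
    }

    // Lookup helper (hypothetical) so callers can inspect the result.
    public static CustObject Get(string key)
    {
        CustObject value;
        return static_container.TryGetValue(key, out value) ? value : null;
    }
}
```

If TryUpdate fails because another thread replaced the entry in the meantime, this sketch simply moves on to the next entry; whether to retry instead depends on what the application needs.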

Related

In C# how does foreach behave when the enumerated container is modified

This seems like it should be answered but potential dupes I found were asking different things...
I noticed that this seems to work fine (sourceDirInclusion is a simple Dictionary<X,Y>)
foreach (string dir in sourceDirInclusion.Keys)
{
    if (sourceDirInclusion[dir] == null)
        sourceDirInclusion.Remove(dir);
}
Does that mean removing items from a collection in foreach is safe, or that I got lucky?
What about if I was adding more elements to the dictionary rather than removing?
The problem I'm trying to solve is that sourceDirInclusion is initially populated, but then each value can contribute new items to the dictionary in a second pass. e.g what I want to do is like:
foreach (string dir in sourceDirInclusion.Keys)
{
    X x = sourceDirInclusion[dir];
    sourceDirInclusion.Add(x.dir, x.val);
}
Short answer: This is not safe.
Long answer: From the IEnumerator<T> documentation:
An enumerator remains valid as long as the collection remains unchanged. If changes are made to the collection, such as adding, modifying, or deleting elements, the enumerator is irrecoverably invalidated and its behavior is undefined.
Note that the docs say the behavior is undefined, which means that it might work and it might not. One should never rely on undefined behavior.
In this case, it depends on the behavior of the Keys enumerable, regarding whether or not it creates a copy of the list of keys when you begin enumerating. In this specific case, we know from the docs that the return value from Dictionary<,>.Keys is a collection that refers back to the dictionary:
The returned Dictionary<TKey, TValue>.KeyCollection is not a static copy; instead, the Dictionary<TKey, TValue>.KeyCollection refers back to the keys in the original Dictionary<TKey, TValue>. Therefore, changes to the Dictionary<TKey, TValue> continue to be reflected in the Dictionary<TKey, TValue>.KeyCollection.
So it should be considered unsafe to modify the dictionary while enumerating the dictionary's keys.
You can correct this with one change. Alter this line:
foreach (string dir in sourceDirInclusion.Keys)
To this:
foreach (string dir in sourceDirInclusion.Keys.ToList())
The ToList() extension method will create an explicit copy of the list of keys, making it safe to modify the dictionary; the "underlying collection" will be the copy and not the original.
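A small sketch of the ToList() fix applied to the "second pass adds new items" case from the question. The KeySnapshotDemo class, the string value type and the "/sub" derived key are made up for illustration; only the Keys.ToList() pattern comes from the answer.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class KeySnapshotDemo
{
    // Adds one derived entry per existing key and returns the final count.
    public static int AddDerivedEntries(Dictionary<string, string> sourceDirInclusion)
    {
        // ToList() copies the keys up front, so the Add below does not
        // touch the collection that the loop is enumerating.
        foreach (string dir in sourceDirInclusion.Keys.ToList())
        {
            string derivedKey = dir + "/sub"; // hypothetical derived key
            if (!sourceDirInclusion.ContainsKey(derivedKey))
            {
                sourceDirInclusion.Add(derivedKey, sourceDirInclusion[dir]);
            }
        }
        return sourceDirInclusion.Count;
    }
}
```

Entries added during the loop are not visited by it, because they are not in the copied key list; a further pass would be needed to process them.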
It will throw:
InvalidOperationException: Message="Collection was modified; enumeration operation may not execute."
To avoid that, add the candidates for removal to an external list, then loop over that list and remove them from the target container (the dictionary).
List<string> list = new List<string>(sourceDirInclusion.Keys.Count);
foreach (string dir in sourceDirInclusion.Keys)
{
    if (sourceDirInclusion[dir] == null)
        list.Add(dir);
}
foreach (string dir in list)
{
    sourceDirInclusion.Remove(dir);
}
Check this out: What is the best way to modify a list in a 'foreach' loop?
In short:
The collection used in foreach must be treated as immutable while it is being enumerated. This is very much by design.
As it says on MSDN:
The foreach statement is used to iterate through the collection to get the information that you want, but can not be used to add or remove items from the source collection to avoid unpredictable side effects. If you need to add or remove items from the source collection, use a for loop.
UPDATE:
You can use a for loop instead:
for (int index = 0; index < dictionary.Count; index++)
{
    var item = dictionary.ElementAt(index);
    var itemKey = item.Key;
    var itemValue = item.Value;
}
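This happens to work because you are traversing sourceDirInclusion.Keys rather than the dictionary itself, but do not rely on it: the Keys collection is a live view of the dictionary, and on .NET Framework removing an entry while enumerating it throws an InvalidOperationException.
To be safe across versions of the Framework, I recommend that you use sourceDirInclusion.Keys.ToArray() in the foreach statement. This way you create a copy of the keys that you loop through.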
This will however not work:
foreach (KeyValuePair<string, object> item in sourceDirInclusion)
{
    if (item.Value == null)
        sourceDirInclusion.Remove(item.Key);
}
As a rule, you cannot modify a collection while it is traversed, but often you can make a new collection by using .ToArray() or .ToList() and traverse that while modifying the original collection.
Good luck with your quest.

C# Locking mechanism - write only locking

In continuation of my latest ponderings about locks in C# and .NET,
consider the following scenario:
I have a class which contains a specific collection (for this example, I've used a Dictionary<string, int>) which is updated from a data source every few minutes using a specific method whose body you can see below:
DataTable dataTable = dbClient.ExecuteDataSet(i_Query).GetFirstTable();
lock (r_MappingLock)
{
    i_MapObj.Clear();
    foreach (DataRow currRow in dataTable.Rows)
    {
        i_MapObj.Add(Convert.ToString(currRow[i_Column1]), Convert.ToInt32(currRow[i_Column2]));
    }
}
r_MappingLock is an object dedicated to lock the critical section which refreshes the dictionary's contents.
i_MapObj is the dictionary object
i_Column1 and i_Column2 are the datatable's column names which contain the desired data for the mapping.
Now, I also have a class method which receives a string and returns the correct mapped int based on the mentioned dictionary.
I want this method to wait until the refresh method completes its execution, so at first glance one would consider the following implementation:
lock (r_MappingLock)
{
    int? retVal = null;
    if (i_MapObj.ContainsKey(i_Key))
    {
        retVal = i_MapObj[i_Key];
    }
    return retVal;
}
This will prevent unexpected behaviour and return values while the dictionary is being updated, but another issue arises:
Since every thread that executes the above method tries to claim the lock, multiple threads calling it at the same time will each have to wait until the previous thread has finished. This is obviously undesirable, since the method only reads.
I was thinking of adding a boolean member to the class, set to true or false depending on whether the dictionary is being updated, and checking it within the "read only" method, but this raises other race-condition issues...
Any ideas how to solve this gracefully?
Thanks again,
Mikey
Have a look at the built in ReaderWriterLock.
I would just switch to using a ConcurrentDictionary to avoid this situation altogether - manually locking is error-prone. Also as I can gather from "C#: The Curious ConcurrentDictionary", ConcurrentDictionary is already read-optimized.
Albin correctly pointed to ReaderWriterLock. I will add an even nicer one: ReaderWriterGate by Jeffrey Richter. Enjoy!
You might consider creating a new dictionary when updating, instead of locking. This way, you will always have consistent results, but reads during updates would return previous data:
private volatile Dictionary<string, int> i_MapObj = new Dictionary<string, int>();
private void Update()
{
    DataTable dataTable = dbClient.ExecuteDataSet(i_Query).GetFirstTable();
    var newData = new Dictionary<string, int>();
    foreach (DataRow currRow in dataTable.Rows)
    {
        newData.Add(Convert.ToString(currRow[i_Column1]), Convert.ToInt32(currRow[i_Column2]));
    }
    // Start using new data - reference assignments are atomic
    i_MapObj = newData;
}
private int? GetValue(string key)
{
    int value;
    if (i_MapObj.TryGetValue(key, out value))
        return value;
    return null;
}
Since .NET Framework 3.5 there is the ReaderWriterLockSlim class, which is a lot faster than ReaderWriterLock!
Almost as fast as a lock().
Keep the policy that disallows recursion (LockRecursionPolicy.NoRecursion) to keep the performance that high.
Look at this page for more info.
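A minimal sketch of the reader/writer approach suggested in these answers, using ReaderWriterLockSlim. The Mapping class and its member names are hypothetical, and Refresh takes already-read key/value pairs instead of the question's DataTable so that the example stays self-contained.

```csharp
using System.Collections.Generic;
using System.Threading;

public sealed class Mapping
{
    private readonly ReaderWriterLockSlim rwLock =
        new ReaderWriterLockSlim(LockRecursionPolicy.NoRecursion);
    private readonly Dictionary<string, int> map = new Dictionary<string, int>();

    // Writer: holds the lock exclusively while the contents are replaced.
    public void Refresh(IEnumerable<KeyValuePair<string, int>> rows)
    {
        rwLock.EnterWriteLock();
        try
        {
            map.Clear();
            foreach (KeyValuePair<string, int> row in rows)
            {
                map[row.Key] = row.Value;
            }
        }
        finally
        {
            rwLock.ExitWriteLock();
        }
    }

    // Readers: may run concurrently with each other, but wait for a writer.
    public int? GetValue(string key)
    {
        rwLock.EnterReadLock();
        try
        {
            int value;
            return map.TryGetValue(key, out value) ? value : (int?)null;
        }
        finally
        {
            rwLock.ExitReadLock();
        }
    }
}
```

In this sketch the database query would run before Refresh is called, as in the question, so the write lock is held only for the in-memory rebuild.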

Quick mass-updating a Dictionary

I have a Dictionary<int, int> and would like to update certain elements all at once based on their current values, e.g. changing all elements with value 10 to having value 14 or something.
I imagined this would be easy with some LINQ/lambda stuff but it doesn't appear to be as simple as I thought. My current approach is this:
List<KeyValuePair<int, int>> kvps = dictionary.Where(d => d.Value == oldValue).ToList();
foreach (KeyValuePair<int, int> kvp in kvps)
{
    dictionary[kvp.Key] = newValue;
}
The problem is that dictionary is pretty big (hundreds of thousands of elements) and I'm running this code in a loop thousands of times, so it's incredibly slow. There must be a better way...
This might be the wrong data structure. You are attempting to look up dictionary entries based on their values which is the reverse of the usual pattern. Maybe you could store Sets of keys that currently map to certain values. Then you could quickly move these sets around instead of updating each entry separately.
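One way to picture the "sets of keys per value" idea is a reverse index from each value to the keys that currently hold it. The sketch below is only an illustration under that assumption: ValueIndex and its methods are invented names, and a real implementation would also need a key-to-value lookup and handling for keys that change value individually.

```csharp
using System.Collections.Generic;

public sealed class ValueIndex
{
    // value -> set of keys that currently map to that value
    private readonly Dictionary<int, HashSet<int>> keysByValue =
        new Dictionary<int, HashSet<int>>();

    // Registers key under value. Assumes the key is not already
    // registered under a different value (not checked in this sketch).
    public void Add(int key, int value)
    {
        HashSet<int> keys;
        if (!keysByValue.TryGetValue(value, out keys))
        {
            keys = new HashSet<int>();
            keysByValue[value] = keys;
        }
        keys.Add(key);
    }

    // Re-labels every key that currently has oldValue.
    public void ReplaceValue(int oldValue, int newValue)
    {
        HashSet<int> moved;
        if (oldValue == newValue || !keysByValue.TryGetValue(oldValue, out moved))
        {
            return;
        }
        keysByValue.Remove(oldValue);

        HashSet<int> existing;
        if (keysByValue.TryGetValue(newValue, out existing))
        {
            existing.UnionWith(moved); // merge into the set already at newValue
        }
        else
        {
            keysByValue[newValue] = moved; // no per-key work: the whole set moves
        }
    }

    public int CountWithValue(int value)
    {
        HashSet<int> keys;
        return keysByValue.TryGetValue(value, out keys) ? keys.Count : 0;
    }
}
```

When newValue is not yet in use, the replacement moves a single set reference instead of rewriting every matching entry, which is where the saving over the loop in the question would come from.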
I would consider writing your own collection type to achieve this whereby keys with the same value actually share the same value instance such that changing it in one place changes it for all keys.
Something like the following (obviously, lots of code omitted here - just for illustrative purposes):
public class SharedValueDictionary : IDictionary<int, int>
{
    private List<MyValueObject> values;
    private Dictionary<int, MyValueObject> keys;
    // Now, when you add a new key/value pair, you actually
    // look in the values collection to see if that value already
    // exists. If it does, you add an entry to keys that points to that existing object
    // otherwise you create a new MyValueObject to wrap the value and add entries to
    // both collections.
}
This scenario would require multiple versions of Add and Remove to allow for changing all keys with the same value, changing only one key of a set to be a new value, removing all keys with the same value and removing just one key from a value set. It shouldn't be difficult to code for these scenarios as and when needed.
You need to generate a new dictionary:
d = d.ToDictionary(w => w.Key, w => w.Value == 10 ? 14 : w.Value);
I think the thing that everybody must be missing is that it is exceeeeedingly trivial:
List<int> keys = dictionary.Keys.Where(d => d == oldValue).ToList();
You are NOT looking up keys by value (as has been offered by others).
Instead, keys.SingleOrDefault() will now by definition return the single key that equals oldValue if it exists in the dictionary. So the whole code should simplify to
if (dictionary.ContainsKey(oldValue))
    dictionary[oldValue] = newValue;
That is quick. Now I'm a little concerned that this might indeed not be what the OP intended, but it is what he had written. So if the existing code does what he needs, he will now have a highly performant version of the same :)
After the edit, this seems an immediate improvement:
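foreach (var kvp in dictionary.Where(d => d.Value == oldValue).ToList())
{
    dictionary[kvp.Key] = newValue;
}
Note that kvp.Value cannot be assigned directly, because KeyValuePair<TKey, TValue>.Value is read-only; the new value has to be written through the dictionary's indexer, and the ToList() call keeps the loop from enumerating the dictionary while it is being modified.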

C# - Removing Items from Dictionary in while loop

I have this and it all seems to work fine, but I'm not sure why, or whether it's valid.
Dictionary<string, List<string>> test = new Dictionary<string, List<string>>();
while (test.Count > 0)
{
    var obj = test.Last();
    MyMethod(obj);
    test.Remove(obj.Key);
}
Update: Thanks for the answers, I have updated my code to explain why I don't do Dictionary.Clear();
I don't understand why you are trying to process all Dictionary entries in reverse order, but your code is OK.
It might be a bit faster to get a list of all keys and process the entries by key instead of counting again and again.
E.g.:
var keys = test.Keys.OrderByDescending(o => o).ToList();
foreach (var key in keys)
{
    var obj = test[key];
    MyMethod(obj);
    test.Remove(key);
}
Dictionaries are fast when they are accessed by key. Last() is slower, and counting is not necessary: you can get a list of all (unique) keys.
There is nothing wrong with mutating a collection type in a while loop in this manner. Where you get into trouble is when you mutate a collection during a foreach block or, more generally, when you use an IEnumerator<T> after the underlying collection has been mutated.
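The difference can be seen in a small experiment. This is only an illustrative sketch; the MutationDemo class and its method names are made up, and the dictionary uses int values rather than the question's List<string> to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MutationDemo
{
    // Returns true if adding inside foreach raised InvalidOperationException.
    public static bool AddInsideForeachThrows()
    {
        var test = new Dictionary<string, int> { { "a", 1 }, { "b", 2 } };
        try
        {
            foreach (KeyValuePair<string, int> pair in test)
            {
                test[pair.Key + "x"] = pair.Value; // adds a new key mid-enumeration
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Drains the dictionary with a while loop; no enumerator is kept alive
    // across the Remove call, so nothing is invalidated.
    public static int DrainWithWhile()
    {
        var test = new Dictionary<string, int> { { "a", 1 }, { "b", 2 } };
        int processed = 0;
        while (test.Count > 0)
        {
            KeyValuePair<string, int> entry = test.First(); // fresh enumerator each pass
            test.Remove(entry.Key);
            processed++;
        }
        return processed;
    }
}
```

The while version creates and discards a new enumerator on every pass through First() (or Last() in the question), which is why it is valid but not cheap on a large dictionary.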
Although in this sample it would be a lot simpler to just call test.Clear() :)
That works fine, since you're not iterating over the dictionary while removing items. Each time you check test.Count, it's like it's checking it from scratch.
That being said, the above code could be written much more simply and efficiently:
test.Clear();
It works because Count is updated every time you remove an object. So say the count is 3: test.Remove will decrement the count to 2, and so on, until the count is 0, and then you will break out of the loop.
Yes, this should be valid, but why not just call Dictionary.Clear()?
All you're doing is taking the last item in the collection and removing it until there are no more items left in the Dictionary.
Nothing out of the ordinary and there's no reason it shouldn't work (as long as emptying the collection is what you want to do).
So, you're just trying to clear the Dictionary, correct? Couldn't you just do the following?
Dictionary<string, List<string>> test = new Dictionary<string, List<string>>();
test.Clear();
This seems like it will work, but it looks extremely expensive. This would be a problem if you were iterating over it with a foreach loop (you can't edit collections while you're iterating).
Dictionary.Clear() should do the trick (but you probably already knew that).
Despite your update, you can probably still use clear...
foreach (var item in test)
{
    MyMethod(item);
}
test.Clear();
Your call to .Last() is going to be extremely inefficient on a large dictionary, and it won't guarantee any particular processing order anyway (the Dictionary is an unordered collection).
I used this code to remove items conditionally.
var dict = new Dictionary<String, float>();
var keys = new String[dict.Count];
dict.Keys.CopyTo(keys, 0);
foreach (var key in keys)
{
    var v = dict[key];
    if (condition)
    {
        dict.Remove(key);
    }
}

C#: Easy access to the member of a singleton ICollection<>?

I have an ICollection that I know will only ever have one member. Currently, I loop through it, knowing the loop will only ever run once, to grab the value. Is there a cleaner way to do this?
I could alter the persistentState object to return single values, but that would complicate the rest of the interface. It's grabbing data from XML, and for the most part ICollections are appropriate.
// worldMapLinks ensured to be a singleton
ICollection<IDictionary<string, string>> worldMapLinks = persistentState.GetAllOfType("worldMapLink");
string levelName = ""; //worldMapLinks.GetEnumerator().Current['filePath'];
// this loop will only run once
foreach (IDictionary<string, string> dict in worldMapLinks) // hacky hack hack hack
{
    levelName = dict["filePath"];
}
// proceed with levelName
loadLevel(levelName);
Here is another example of the same issue:
// meta will be a singleton
ICollection<IDictionary<string, string>> meta = persistentState.GetAllOfType("meta");
foreach (IDictionary<string, string> dict in meta) // this loop should only run once. HACKS.
{
    currentLevelName = dict["name"];
    currentLevelCaption = dict["teaserCaption"];
}
Yet another example:
private Vector2 startPositionOfKV(ICollection<IDictionary<string, string>> dicts)
{
    Vector2 result = new Vector2();
    foreach (IDictionary<string, string> dict in dicts) // this loop will only ever run once
    {
        result.X = Single.Parse(dict["x"]);
        result.Y = Single.Parse(dict["y"]);
    }
    return result;
}
Why not use the Single or FirstOrDefault extension methods?
var levelName = worldMapLinks.Single()["filePath"];
Single has the advantage of enforcing your assumption that there is only 1 value in the enumeration. If this is not true an exception will be raised forcing you to reconsider your logic. FirstOrDefault will return a default value if there is not at least 1 element in the enumeration.
If you can use LINQ-to-objects in your class, use the Single() extension method on the collection if you know there will be exactly one member. Otherwise, if there could be zero or one, use SingleOrDefault().
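A short sketch of both suggestions applied to the question's worldMapLinks collection. The SingletonAccess class and its method names are hypothetical; Single() and SingleOrDefault() are the standard LINQ extension methods.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SingletonAccess
{
    // Throws InvalidOperationException unless the collection has exactly one element.
    public static string FilePathOf(ICollection<IDictionary<string, string>> worldMapLinks)
    {
        return worldMapLinks.Single()["filePath"];
    }

    // Returns null for an empty collection; still throws if there is more than one element.
    public static string FilePathOrNull(ICollection<IDictionary<string, string>> worldMapLinks)
    {
        IDictionary<string, string> only = worldMapLinks.SingleOrDefault();
        return only == null ? null : only["filePath"];
    }
}
```

Either call replaces the one-pass foreach loops in the question and makes the "exactly one element" assumption explicit.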
Why do you have a collection with only one member? It seems that the real answer should be to better design your system rather than rely on any method to retrieve one element from a collection. You say it makes it more complicated, but how? Isn't this solution itself a complication? Is it possible to change the interface to return one element where applicable and a collection elsewhere? Seems like a code smell to me.
