I have a large dictionary that is updated at random by one thread at a fairly high frequency, and another thread that tries to take a snapshot of the dictionary to save as history.
I am currently using something like this:
Dictionary<string, object> dict = new Dictionary<string, object>();
var items = dict.Values.ToList();
This works fine most of the time, except that it occasionally throws:
System.InvalidOperationException: Collection was modified; enumeration
operation may not execute.
I understand why this happens, but I don't know what I can do to avoid the "collection was modified" error.
What is the best approach to iterate such collection?
I also tried ConcurrentDictionary, but no luck.
Why? Is ConcurrentDictionary thread safe only at item level?
According to the docs you should be able to use the GetEnumerator() method of ConcurrentDictionary to get a thread-safe iterator.
The enumerator returned from the dictionary is safe to use concurrently with reads and writes to the dictionary, however it does not represent a moment-in-time snapshot of the dictionary. The contents exposed through the enumerator may contain modifications made to the dictionary after GetEnumerator was called.
Since you're dealing with concurrent threads, it's not surprising to have some tradeoffs with consistency, but I would expect this approach to block less than the brute force approach given in other answers. This wouldn't have worked if you tried:
var items = concurrentDict.Items.ToList();
but it's supposed to work for
var enumerator = concurrentDict.GetEnumerator();
or you could simply reference the iterator directly:
foreach (var item in concurrentDict)
{
    valueList.Add(item.Value);
}
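As a rough illustration of the trade-off described above, the sketch below enumerates a ConcurrentDictionary while a background task keeps overwriting values, and also shows the instance method ToArray(), which, as far as I know, copies the contents while holding the dictionary's internal locks and therefore gives a moment-in-time copy. The class and method names (ConcurrentEnumerationDemo, EnumerateWhileWriting, Snapshot) and the key scheme are made up for this example.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo class; not part of the original answer.
public static class ConcurrentEnumerationDemo
{
    // Enumerates the dictionary repeatedly while a background task keeps
    // overwriting values. Returns the item count seen by the last pass.
    public static int EnumerateWhileWriting()
    {
        var dict = new ConcurrentDictionary<string, int>();
        for (int i = 0; i < 1000; i++) dict["key" + i] = i;

        var cts = new CancellationTokenSource();
        var writer = Task.Run(() =>
        {
            int n = 0;
            while (!cts.IsCancellationRequested)
            {
                dict["key" + n] = n;   // overwrite existing keys only
                n = (n + 1) % 1000;
            }
        });

        var valueList = new List<int>();
        for (int pass = 0; pass < 20; pass++)
        {
            valueList.Clear();
            // Safe alongside the writer, but not a moment-in-time view.
            foreach (var item in dict) valueList.Add(item.Value);
        }

        cts.Cancel();
        writer.Wait();
        cts.Dispose();
        return valueList.Count;
    }

    // Copies all pairs into an array in one call.
    public static KeyValuePair<string, int>[] Snapshot(ConcurrentDictionary<string, int> dict)
    {
        return dict.ToArray();
    }
}
```

If a consistent copy matters more than minimal blocking, the ToArray() route is probably the one to reach for; the plain foreach is the cheaper option when slightly stale or mixed values are acceptable.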
An ImmutableDictionary might be appropriate for you, as it supports scalable multi-threading and snapshotting as part of its basic feature-set.
// initialize.
ImmutableDictionary<string, int> dict = ImmutableDictionary.Create<string,int>();
// create a new dictionary with "foo" key added.
ImmutableDictionary<string, int> newdict = dict.Add("foo", 0);
// replace dict, thread-safe, with a new dictionary with "bar" added.
// note this is using dict, not newdict, so there is no "foo" in it.
ImmutableInterlocked.TryAdd(ref dict, "bar", 1);
// take a snapshot, thread-safe.
ImmutableDictionary<string,int> snapshot = dict;
The immutable nature means that the dictionary can never change -- you can only add a value by creating a new dictionary. And because of this property, you take a "snapshot" of it by simply keeping a reference around from the point you want to snapshot.
It is optimized under the hood to be efficient, not copying the entire thing for every operation. That said, for other operations it isn't as efficient as ConcurrentDictionary, but it's all a trade-off in what you want. For instance, a ConcurrentDictionary can be enumerated concurrently, but its enumerator does not give you a moment-in-time snapshot.
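To make the writer/reader split concrete, here is a minimal sketch of how the two threads from the question might share an ImmutableDictionary. The ImmutableStore class and its Set and Snapshot members are hypothetical names for this example; ImmutableInterlocked.AddOrUpdate and Volatile.Read are existing library calls. On .NET Framework the System.Collections.Immutable NuGet package is assumed to be referenced.

```csharp
using System.Collections.Immutable;
using System.Threading;

// Hypothetical wrapper; a sketch, not a definitive implementation.
public sealed class ImmutableStore
{
    private ImmutableDictionary<string, int> _current =
        ImmutableDictionary<string, int>.Empty;

    // Writer side: atomically swaps in a new dictionary that contains
    // the added or overwritten entry.
    public void Set(string key, int value)
    {
        ImmutableInterlocked.AddOrUpdate(ref _current, key, value, (k, old) => value);
    }

    // Reader side: the current reference is already a snapshot, because
    // the instance it points to can never change.
    public ImmutableDictionary<string, int> Snapshot()
    {
        return Volatile.Read(ref _current);
    }
}
```

The history thread would call Snapshot() and keep the returned reference; later calls to Set do not affect it.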
You can use a monitor, via the lock keyword, to ensure that only one read or one write is executing at any given moment.
using System.Collections;
using System.Collections.Generic;

public class SnapshotDictionary<TKey, TValue> : IEnumerable<KeyValuePair<TKey, TValue>>
{
    private readonly Dictionary<TKey, TValue> _dictionary = new Dictionary<TKey, TValue>();
    private readonly object _lock = new object();

    public void Add(TKey key, TValue value)
    {
        lock (_lock)
        {
            _dictionary.Add(key, value);
        }
    }

    // TODO: Other necessary IDictionary methods

    // Copies the dictionary while holding the lock, so no write can interleave.
    public Dictionary<TKey, TValue> GetSnapshot()
    {
        lock (_lock)
        {
            return new Dictionary<TKey, TValue>(_dictionary);
        }
    }

    // Enumerates a private copy, never the live dictionary.
    public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator()
    {
        return GetSnapshot().GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
The GetSnapshot method returns a snapshot of your dictionary.
I have also implemented GetEnumerator so that it creates a snapshot and then returns the snapshot's enumerator.
So both of the following work, because they operate on a snapshot:
var items = snapshotDictionary.GetSnapshot().Values.ToList();
// or
foreach (var item in snapshotDictionary)
{
// ...
}
However, all writes go through a single lock, so this approach does not allow writers to run in parallel.
Related
I'm considering three approaches for returning references to internal Dictionary instances (C#), weighing code safety against the impact on readability, for a project I'm working on.
I've narrowed it down to the following three approaches, but am open to better suggestions. Currently I prefer #3 as the best balance of safety without extra boilerplate.
1) Use a second ReadOnlyDictionary instance to wrap the internal Dictionary, only ever letting the ReadOnlyDictionary escape the class.
2) Return the Dictionary instance as an IReadOnlyDictionary. A caller could cast it back and modify it, so this is not as safe as option #1 or #3.
3) Return Dictionary.ToImmutableDictionary() as an ImmutableDictionary when it escapes the containing class, so that the returned object is an immutable copy of the inner dictionary. This makes a new copy on every call and so costs more, but that should be fine with small, simple dictionaries (which mine are).
private readonly Dictionary<string, string> innerDictionary = new Dictionary<string, string>();

// Only required for Example #1
private readonly IReadOnlyDictionary<string, string> readonlyInnerDictionary;

public ExampleClass() {
    // Only required for Example #1
    readonlyInnerDictionary = new ReadOnlyDictionary<string, string>(innerDictionary);
}

public IReadOnlyDictionary<string, string> GetExampleOne() {
    // Requires a second dictionary, which is more boilerplate, but the object being returned is truly read-only
    return readonlyInnerDictionary;
}

public IReadOnlyDictionary<string, string> GetExampleTwo() {
    // Requires innerDictionary be defined as Dictionary (not IDictionary) but doesn't require the second dictionary,
    // which is less boilerplate, but the object returned could be re-cast to its mutable form, so it's not truly mutation safe.
    return innerDictionary;
}

public ImmutableDictionary<string, string> GetExampleThree() {
    // Truly immutable object returned, but a new instance is built for every call; fortunately all of my dictionaries are small (containing at most 9 keys)
    return innerDictionary.ToImmutableDictionary();
}
Option 1 is the way to go. You can recast ReadOnlyDictionary to IDictionary, but that will throw an Exception when trying to mutate:
void CastingTest()
{
    var dic1 = new Dictionary<string, string>();
    dic1.Add("Key", "Value");
    var dic2 = new ReadOnlyDictionary<string, string>(dic1);
    var castedDic = (IDictionary<string, string>)dic2;
    castedDic.Add("AnotherKey", "Another Value"); // System.NotSupportedException, Collection is read only
}
The ReadOnlyDictionary doesn't create another Dictionary. It points to the same reference of the first one, encapsulating it. So if you do:
void AddTest()
{
    var dic1 = new Dictionary<string, string>();
    dic1.Add("Key", "Value");
    var dic2 = new ReadOnlyDictionary<string, string>(dic1);
    dic1.Add("Key2", "Value2"); // Now dic2 has 2 entries too.
}
Never expose your innerDictionary and you'll be fine.
I determined that the neatest, easiest and safest (though not the most performant) solution is to use a ConcurrentDictionary (from System.Collections.Concurrent) internally, which ensures thread safety, and then to call dictionary.ToImmutableDictionary() (from System.Collections.Immutable) to create the dictionary that escapes the inner class. The interface signature returns ImmutableDictionary<KeyType, ValueType>.
This is not the most performant solution, but in my case, with dictionaries of fewer than 12 keys that mostly hold small, simple state objects, that is not a concern.
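A minimal sketch of that combination might look like the following. StateHolder, Set and GetState are names invented for this example, and string keys and values are an assumption; only ConcurrentDictionary and the ToImmutableDictionary() extension come from the libraries mentioned above.

```csharp
using System.Collections.Concurrent;
using System.Collections.Immutable;

// Hypothetical containing class; a sketch under the assumptions above.
public sealed class StateHolder
{
    // The mutable, thread-safe dictionary never leaves this class.
    private readonly ConcurrentDictionary<string, string> _state =
        new ConcurrentDictionary<string, string>();

    public void Set(string key, string value)
    {
        _state[key] = value; // indexer setter adds or overwrites
    }

    // Builds a fresh immutable copy on every call; callers cannot
    // mutate it and later writes do not show up in it.
    public ImmutableDictionary<string, string> GetState()
    {
        return _state.ToImmutableDictionary();
    }
}
```

Because ToImmutableDictionary() enumerates the ConcurrentDictionary, a copy taken while another thread is writing is not guaranteed to be a moment-in-time view; calling _state.ToArray() first and converting the array would be one way to tighten that, at the cost of an extra copy.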
I was exploring the ASP.NET Core sources on GitHub to see what kind of tricks the ASP.NET team used to speed up the framework, and I saw something that intrigued me. In the source code of the ServiceProvider, in the Dispose implementation, they enumerate a dictionary, and a comment indicates that this is a performance trick:
private readonly Dictionary<IService, object> _resolvedServices = new Dictionary<IService, object>();

// Code removed for brevity

public void Dispose()
{
    // Code removed for brevity

    // PERF: We're enumerating the dictionary so that we don't allocate to enumerate.
    // .Values allocates a ValueCollection on the heap, enumerating the dictionary allocates
    // a struct enumerator
    foreach (var entry in _resolvedServices)
    {
        (entry.Value as IDisposable)?.Dispose();
    }

    _resolvedServices.Clear();
}
What is the difference if the dictionary is enumerated like this instead?
foreach (var entry in _resolvedServices.Values)
{
    (entry as IDisposable)?.Dispose();
}
Does it have a performance impact? Or is it because allocating a ValueCollection consumes more memory?
You're right, this is about memory consumption. The difference is actually pretty well described in the comment: accessing the Values property of a Dictionary<TKey, TValue> will allocate a ValueCollection, which is a class (reference type), on the heap.
foreach'ing through the dictionary itself results in a call to GetEnumerator() which returns an Enumerator. This is a struct and will be allocated on the stack rather than on the heap.
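The struct-versus-class distinction can be checked with reflection. The sketch below only inspects the nested types of Dictionary<TKey, TValue>; the EnumeratorKindDemo class and its method names are made up for this example, and the string/object type arguments are arbitrary.

```csharp
using System.Collections.Generic;

// Hypothetical helper that reports what kind of type each member returns.
public static class EnumeratorKindDemo
{
    // Dictionary<TKey, TValue>.GetEnumerator() returns this nested type.
    public static bool DictionaryEnumeratorIsStruct()
    {
        return typeof(Dictionary<string, object>.Enumerator).IsValueType;
    }

    // Dictionary<TKey, TValue>.Values returns this nested type.
    public static bool ValueCollectionIsClass()
    {
        return typeof(Dictionary<string, object>.ValueCollection).IsClass;
    }
}
```

Note that the struct enumerator only stays off the heap when foreach sees the concrete Dictionary type; enumerating through an interface such as IEnumerable<KeyValuePair<TKey, TValue>> boxes it.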
How can I check whether a list of dictionaries contains a specific dictionary?
private List<Dictionary<string, object>> detailsDictionary;
private Dictionary<string, object> selectedItem;
Is there a direct way to check if selectedItem is in detailsDictionary?
Answer:
bool isPresent = false;
foreach (Dictionary<string, object> dic in detailsDictionary)
{
    if (DictionaryExtensionMethods.ContentEquals(selectedItem, dic))
    {
        isPresent = true;
        break;
    }
}
public static class DictionaryExtensionMethods
{
    public static bool ContentEquals<TKey, TValue>(this Dictionary<TKey, TValue> dictionary, Dictionary<TKey, TValue> otherDictionary)
    {
        return (otherDictionary ?? new Dictionary<TKey, TValue>())
            .OrderBy(kvp => kvp.Key)
            .SequenceEqual((dictionary ?? new Dictionary<TKey, TValue>())
                .OrderBy(kvp => kvp.Key));
    }
}
I manually compare each dictionary in the list with the selected dictionary and set isPresent = true if the two dictionaries are equal. This seems like a long-winded process, and I think there should be an easier way.
You can check whether or not any item is in a list by using the List<T>.Contains method:
bool contains = detailsDictionary.Contains(selectedItem);
Note that this has O(N) complexity, as it has to go through every item in the list until it finds a match or reaches the end. Because Dictionary does not override Equals, the match is by reference: only the very same instance is found. If the linear scan is a problem, you may want to keep the items in a HashSet, which has a Contains method that (in most cases) works much faster.
Or if you mean where selectedItem and the elements of detailsDictionary share the same keys, rather than that they are identical objects:
detailsDictionary.Any(dict => dict.Count == selectedItem.Count && dict.Keys.All(key => selectedItem.ContainsKey(key)));
Obviously this is slower and like Gediminas mentions, there is likely a better way if speed is of the essence.
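To illustrate the difference between "identical object" and "same content" that these answers turn on, here is a small sketch. The ContainsDemo class and its Check method are hypothetical; the variable names mirror the question.

```csharp
using System.Collections.Generic;

// Hypothetical demo of how List<T>.Contains treats dictionaries.
public static class ContainsDemo
{
    public static (bool sameReference, bool equalCopy) Check()
    {
        var selectedItem = new Dictionary<string, object> { ["id"] = 1 };

        // A second dictionary with exactly the same keys and values.
        var copy = new Dictionary<string, object>(selectedItem);

        var detailsDictionary = new List<Dictionary<string, object>> { selectedItem };

        // Contains uses the default equality comparer, which for
        // Dictionary means reference equality.
        return (detailsDictionary.Contains(selectedItem),
                detailsDictionary.Contains(copy));
    }
}
```

So Contains is only the right tool when the list holds the very instance being searched for; a content comparison needs one of the Any(...) variants or the ContentEquals extension from the question.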
A solution I used in my code checks whether selectedItem's values match the values of any item in detailsDictionary:
detailsDictionary.Any(x => x.Values.SequenceEqual(selectedItem.Values));
This assumes that the dictionaries in the list are all of the same kind (same keys, in the same order).
If not, you could do the following:
detailsDictionary.Any(x => x.Values.SequenceEqual(selectedItem.Values) &&
                           x.Keys.SequenceEqual(selectedItem.Keys));
However, I don't know much about the efficiency of this method.
I'm having difficulty comparing an updated version of a dictionary to its original version.
The first method receives the dictionary and then passes it to a static helper class that updates it.
Before I pass the original dictionary to the helper class, I want to make a copy of it so I can compare the two afterwards.
This is where I'm having trouble. After the helper class, the 'copy' of the dictionary has been updated too.
I've even tried making a struct that contains a dictionary, thinking that would copy the original dictionary's values, but that seems to be by reference too! Here is a snippet of the code.
public PartialViewResult updateItem(string submit, FormCollection Collection)
{
SurveyItem UpdatedItem = new SurveyItem();
ItemSettingsCopy OriginalSettings;
ItemBank CurrentSurvey = (ItemBank)Session["Survey"];
string _itemName = (string)Session["CurrentItem"];
OriginalSettings.ItemSettings = CurrentSurvey[_itemName].ItemSettings;
//this is where I'm trying to make a copy of the original settings.
UpdatedItem = BankManagerHelper.UpdateItem(CurrentSurvey[_itemName], Collection, submit); //static item now updates the fields in the item
//AT THIS POINT OriginalSettings.ItemSettings HAS BEEN CHANGED TOO
You need to clone the dictionary.
This answer is a way to go. https://stackoverflow.com/a/139841/61256
[This code was copied from the link]
public static Dictionary<TKey, TValue> CloneDictionaryCloningValues<TKey, TValue>
    (Dictionary<TKey, TValue> original) where TValue : ICloneable
{
    Dictionary<TKey, TValue> ret = new Dictionary<TKey, TValue>(original.Count,
                                                                original.Comparer);
    foreach (KeyValuePair<TKey, TValue> entry in original)
    {
        ret.Add(entry.Key, (TValue) entry.Value.Clone());
    }
    return ret;
}
This is where I'm having trouble. After the helper class, the 'copy'
of the dictionary has been updated too.
Yes, as you are actually updating the same object in memory. When you pass a Dictionary as a parameter to a function, you actually pass a reference to that one object.
What you could do is create a new dictionary that contains the same list of objects before you call your UpdateItem method. You'll then have two different dictionaries in memory, so you'll be able to compare them.
Note that you might want to create new instances of the items that are stored in your dictionary, or both dictionaries will contain references to the same objects (I don't know if you want to compare the dictionaries themselves or the objects stored in them).
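The "create a new dictionary before calling UpdateItem" step can be sketched with the Dictionary copy constructor. CopyDemo and CopyOf are hypothetical names, and string values are assumed here precisely because strings are immutable, so a shallow copy is enough; with mutable value objects the entries would still be shared, as the note above warns.

```csharp
using System.Collections.Generic;

// Hypothetical helper; a shallow copy, not a deep clone.
public static class CopyDemo
{
    // Creates a second dictionary with the same entries and comparer.
    // Adding, removing or replacing entries in one no longer affects
    // the other.
    public static Dictionary<string, string> CopyOf(Dictionary<string, string> original)
    {
        return new Dictionary<string, string>(original, original.Comparer);
    }
}
```

In the question's code this would mean taking the copy from CurrentSurvey[_itemName].ItemSettings before the helper runs and comparing against that copy afterwards.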
I am trying to re-write some code using Dictionary to use ConcurrentDictionary. I have reviewed some examples but I am still having trouble implementing the AddOrUpdate function. This is the original code:
dynamic a = HttpContext;
Dictionary<int, string> userDic = this.HttpContext.Application["UserSessionList"] as Dictionary<int, String>;
if (userDic != null)
{
    if (userDic.ContainsKey(authUser.UserId))
    {
        userDic.Remove(authUser.UserId);
    }
}
else
{
    userDic = new Dictionary<int,string>();
}
userDic.Add(authUser.UserId, a.Session.SessionID.ToString());
this.HttpContext.Application["UserDic"] = userDic;
I don't know what to add for the update portion:
userDic.AddOrUpdate(authUser.UserId,
                    a.Session.SessionID.ToString(),
                    /*** what to add here? ***/);
Any pointers would be appreciated.
You need to pass a Func which returns the value to be stored in the dictionary in case of an update. I guess in your case (since you don't distinguish between add and update) this would be:
var sessionId = a.Session.SessionID.ToString();
userDic.AddOrUpdate(
    authUser.UserId,
    sessionId,
    (key, oldValue) => sessionId);
I.e. the Func always returns the sessionId, so that both Add and Update set the same value.
BTW: there is a sample on the MSDN page.
I hope that I did not miss anything in your question, but why not just do it like this? It is easier, atomic and thread-safe (see below).
userDic[authUser.UserId] = sessionId;
Store a key/value pair into the dictionary unconditionally, overwriting any value for that key if the key already exists: Use the indexer’s setter
(See: http://blogs.msdn.com/b/pfxteam/archive/2010/01/08/9945809.aspx)
The indexer is atomic, too. If you pass a function instead, it might not be:
All of these operations are atomic and are thread-safe with regards to all other operations on the ConcurrentDictionary. The only caveat to the atomicity of each operation is for those which accept a delegate, namely AddOrUpdate and GetOrAdd. [...] these delegates are invoked outside of the locks
See: http://blogs.msdn.com/b/pfxteam/archive/2010/01/08/9945809.aspx
I ended up implementing an extension method:
static class ExtensionMethods
{
    // Either Add or overwrite
    public static void AddOrUpdate<K, V>(this ConcurrentDictionary<K, V> dictionary, K key, V value)
    {
        dictionary.AddOrUpdate(key, value, (oldkey, oldvalue) => value);
    }
}
For those who are interested, I am currently implementing a case that is a good example of using the "oldValue", i.e. the existing value, instead of forcing a new one. (Personally I don't like the term "oldValue", as the value is not that old when it was created just a few processor ticks ago by a parallel thread.)
dictionaryCacheQueues.AddOrUpdate(
    uid,
    new ConcurrentQueue<T>(),
    (existingUid, existingValue) => existingValue
);
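When the update delegate simply keeps the existing value, as in the call above, GetOrAdd expresses the same intent and only builds a new queue when the key is missing. The sketch below assumes string keys and invents the QueueCache class and its For method for illustration; the original snippet does not show the type of uid.

```csharp
using System.Collections.Concurrent;

// Hypothetical cache of per-uid queues; a sketch, assuming string uids.
public static class QueueCache<T>
{
    private static readonly ConcurrentDictionary<string, ConcurrentQueue<T>> Queues =
        new ConcurrentDictionary<string, ConcurrentQueue<T>>();

    // Returns the existing queue for uid, or adds a new empty one.
    // The factory may run more than once under contention, but only
    // one queue is ever stored per key.
    public static ConcurrentQueue<T> For(string uid)
    {
        return Queues.GetOrAdd(uid, _ => new ConcurrentQueue<T>());
    }
}
```

Unlike the AddOrUpdate form, this avoids allocating a ConcurrentQueue on every call when the key already exists.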