Enumerate Dictionary.Values vs Dictionary itself - c#

I was exploring the ASP.NET Core sources on GitHub to see what kind of tricks the ASP.NET team used to speed up the framework, and I saw something that intrigued me. In the source code of ServiceProvider, in the Dispose implementation, they enumerate a dictionary, and a comment indicates a performance trick:
private readonly Dictionary<IService, object> _resolvedServices = new Dictionary<IService, object>();
// Code removed for brevity
public void Dispose()
{
// Code removed for brevity
// PERF: We've enumerating the dictionary so that we don't allocate to enumerate.
// .Values allocates a KeyCollection on the heap, enumerating the dictionary allocates
// a struct enumerator
foreach (var entry in _resolvedServices)
{
(entry.Value as IDisposable)?.Dispose();
}
_resolvedServices.Clear();
}
What is the difference if the dictionary is enumerated like this instead?
foreach (var entry in _resolvedServices.Values)
{
(entry as IDisposable)?.Dispose();
}
Does it have a performance impact? Or is it because allocating a ValueCollection consumes more memory?

You're right, this is about memory consumption. The difference is actually pretty well described in the comment: accessing the Values property of a Dictionary<TKey, TValue> will allocate a ValueCollection, which is a class (reference type), on the heap.
Using foreach on the dictionary itself results in a call to GetEnumerator(), which returns an Enumerator. This is a struct, so it is allocated on the stack rather than on the heap.
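As a rough way to check the distinction described above, the sketch below asks the runtime whether each type is a value type. The class name EnumeratorKinds and the helper IsStruct are made up for this illustration and do not come from the ASP.NET Core sources.

```csharp
using System;
using System.Collections.Generic;

public static class EnumeratorKinds
{
    // True when the given type is a struct (value type) rather than a class.
    public static bool IsStruct(Type type) => type.IsValueType;

    public static void Main()
    {
        var services = new Dictionary<string, object> { ["a"] = new object() };

        // foreach over the dictionary binds to Dictionary<,>.GetEnumerator(),
        // whose return type is the struct Dictionary<,>.Enumerator.
        Console.WriteLine(IsStruct(services.GetEnumerator().GetType())); // True

        // .Values hands back a ValueCollection, which is a class.
        Console.WriteLine(IsStruct(services.Values.GetType()));          // False
    }
}
```

Calling GetType() boxes the enumerator here, which is fine for a check like this but is exactly the kind of allocation the original comment is trying to avoid in the hot path.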

Related

In C#, how to iterate an IEnumerable in a multithreaded environment

I have a large dictionary that is randomly updated by one thread at a fairly high frequency, and another thread that tries to take a snapshot of the dictionary to save as history.
I am currently using something like this:
Dictionary<string, object> dict = new Dictionary<string, object>();
var items = dict.Values.ToList();
This works fine most of the time, except it occasionally throws:
System.InvalidOperationException: Collection was modified; enumeration
operation may not execute.
I understand why this happens, but I don't know what I can do to avoid the collection-modified error.
What is the best approach to iterate such a collection?
I also tried ConcurrentDictionary, but no luck.
Why? Is ConcurrentDictionary thread safe only at item level?
According to the docs you should be able to use the GetEnumerator() method of ConcurrentDictionary to get a thread-safe iterator.
The enumerator returned from the dictionary is safe to use concurrently with reads and writes to the dictionary, however it does not represent a moment-in-time snapshot of the dictionary. The contents exposed through the enumerator may contain modifications made to the dictionary after GetEnumerator was called.
Since you're dealing with concurrent threads, it's not surprising to have some tradeoffs with consistency, but I would expect this approach to block less than the brute force approach given in other answers. This wouldn't have worked if you tried:
var items = concurrentDict.Items.ToList();
but it's supposed to work for:
var enumerator = concurrentDict.GetEnumerator();
or you could simply reference the iterator directly:
foreach(var item in concurrentDict)
{
valueList.Add(item.Value);
}
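A minimal sketch of the two behaviours discussed here, assuming a ConcurrentDictionary<string, int>: enumerating the live dictionary tolerates modification, while ToArray() copies the current contents into an array that later changes cannot affect. The names ConcurrentSnapshotSketch and TakeSnapshot are hypothetical, and the single-threaded Main only stands in for the real writer thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public static class ConcurrentSnapshotSketch
{
    // Copies the current key/value pairs into a new array; later changes
    // to the dictionary do not affect the returned array.
    public static KeyValuePair<string, int>[] TakeSnapshot(
        ConcurrentDictionary<string, int> source) => source.ToArray();

    public static void Main()
    {
        var live = new ConcurrentDictionary<string, int>();
        live["a"] = 1;
        live["b"] = 2;

        var snapshot = TakeSnapshot(live);

        // Modifying the dictionary while enumerating it does not throw,
        // unlike the same loop over a plain Dictionary.
        foreach (var pair in live)
        {
            live.TryRemove(pair.Key, out _);
        }

        Console.WriteLine(snapshot.Length); // 2
        Console.WriteLine(live.Count);
    }
}
```

The copy costs time proportional to the dictionary's size, so whether it beats plain enumeration depends on how much consistency the history thread actually needs.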
An ImmutableDictionary might be appropriate for you, as it supports scalable multi-threading and snapshotting as part of its basic feature-set.
// initialize.
ImmutableDictionary<string, int> dict = ImmutableDictionary.Create<string,int>();
// create a new dictionary with "foo" key added.
ImmutableDictionary<string, int> newdict = dict.Add("foo", 0);
// replace dict, thread-safe, with a new dictionary with "bar" added.
// note this is using dict, not newdict, so there is no "foo" in it.
ImmutableInterlocked.TryAdd(ref dict, "bar", 1);
// take a snapshot, thread-safe.
ImmutableDictionary<string,int> snapshot = dict;
The immutable nature means that the dictionary can never change -- you can only add a value by creating a new dictionary. And because of this property, you take a "snapshot" of it by simply keeping a reference around from the point you want to snapshot.
It is optimized under the hood to be efficient, not copying the entire thing for every operation. That said, for other operations it isn't as efficient as ConcurrentDictionary, but it's all a trade-off in what you want. For instance, a ConcurrentDictionary can be enumerated concurrently with writes, but its enumerator does not give you a moment-in-time snapshot; to get one you have to copy the contents (for example with ToArray()).
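To make the snapshot property concrete, here is a small sketch, assuming the System.Collections.Immutable package is available. The class name ImmutableSnapshotSketch and the field _current are invented for this example; in real multi-threaded code, readers would typically read the field with Volatile.Read.

```csharp
using System;
using System.Collections.Immutable;

public static class ImmutableSnapshotSketch
{
    // Shared field that writers replace atomically with a new dictionary.
    private static ImmutableDictionary<string, int> _current =
        ImmutableDictionary<string, int>.Empty;

    public static void Main()
    {
        ImmutableInterlocked.TryAdd(ref _current, "foo", 0);

        // Taking a snapshot is just copying the reference.
        ImmutableDictionary<string, int> snapshot = _current;

        // Later writes swap in a new dictionary; the snapshot is untouched.
        ImmutableInterlocked.TryAdd(ref _current, "bar", 1);

        Console.WriteLine(snapshot.Count);              // 1
        Console.WriteLine(snapshot.ContainsKey("bar")); // False
        Console.WriteLine(_current.Count);              // 2
    }
}
```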
You can use a monitor, via the lock keyword, to ensure that only one read or write is executing at any moment.
public class SnapshotDictionary<TKey, TValue> : IEnumerable<KeyValuePair<TKey, TValue>>
{
private readonly Dictionary<TKey, TValue> _dictionary = new Dictionary<TKey, TValue>();
private readonly object _lock = new object();
public void Add(TKey key, TValue value)
{
lock (_lock)
{
_dictionary.Add(key, value);
}
}
// TODO: Other necessary IDictionary methods
public Dictionary<TKey, TValue> GetSnapshot()
{
lock (_lock)
{
// Copy under the lock so no writer can interleave with the copy.
return new Dictionary<TKey, TValue>(_dictionary);
}
}
public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator()
{
return GetSnapshot().GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
The GetSnapshot method returns a snapshot of your dictionary.
I have also implemented GetEnumerator so that it creates a snapshot and then returns the snapshot's enumerator.
So, this will work because it is executed on a snapshot:
var items = snapshotDictionary.GetSnapshot().Values.ToList();
// or
foreach (var item in snapshotDictionary)
{
// ...
}
However, this approach does not allow concurrent writes: every writer, and every snapshot, has to wait for the lock.

C# returning Dictionary references safely

I'm considering three approaches for returning references to internal Dictionary instances (C#), weighing code safety against the impact on readability, for a project I'm working on.
I've narrowed it down to the following three approaches, but am open to better suggestions. Currently I prefer #3 as the best balance of safety without extra boilerplate.
1) Use a second ReadOnlyDictionary instance to wrap the internal Dictionary, only ever letting the ReadOnlyDictionary escape the class.
2) Return the Dictionary instance as an IReadOnlyDictionary. Casting it back would allow it to be modified, so this is not as safe as option #1 or #3.
3) Return Dictionary.ToImmutableDictionary() as an ImmutableDictionary when it escapes the containing class, so that the returned object is an immutable copy of the inner dictionary. This makes a new copy on every call, which costs more, but that should be fine with small, simple dictionaries (which mine are).
private readonly Dictionary<string, string> innerDictionary = new Dictionary<string, string>();
// Only required for Example #1
private readonly IReadOnlyDictionary<string, string> readonlyInnerDictionary;
public ExampleClass() {
// Only required for Example #1
readonlyInnerDictionary = new ReadOnlyDictionary<string, string>(innerDictionary);
}
public IReadOnlyDictionary<string, string> GetExampleOne() {
// Requires a second dictionary field, which is more boilerplate, but the object being returned is truly read-only
return readonlyInnerDictionary;
}
public IReadOnlyDictionary<string, string> GetExampleTwo() {
// Requires innerDictionary to be declared as Dictionary (not IDictionary) but doesn't require the second dictionary field,
// which is less boilerplate; however, the returned object can be cast back to its mutable form, so it's not truly mutation-safe.
return innerDictionary;
}
public ImmutableDictionary<string, string> GetExampleThree() {
// Truly immutable object returned, but a new instance is built for every call; fortunately all of my dictionaries are small (containing at most 9 keys)
return innerDictionary.ToImmutableDictionary();
}
Option 1 is the way to go. You can recast ReadOnlyDictionary to IDictionary, but that will throw an Exception when trying to mutate:
void CastingTest()
{
var dic1 = new Dictionary<string, string>();
dic1.Add("Key", "Value");
var dic2 = new ReadOnlyDictionary<string, string>(dic1);
var castedDic = (IDictionary<string, string>)dic2;
castedDic.Add("AnotherKey", "Another Value"); //System.NotSupportedException, Collection is read only
}
The ReadOnlyDictionary doesn't create another Dictionary. It wraps the original and keeps a reference to it, so changes to the original show through the wrapper. So if you do:
void AddTest()
{
var dic1 = new Dictionary<string, string>();
dic1.Add("Key", "Value");
var dic2 = new ReadOnlyDictionary<string, string>(dic1);
dic1.Add("Key2", "Value2"); // Now dic2 has 2 entries too.
}
Never expose your innerDictionary and you'll be fine.
I determined that the neatest, easiest and safest solution, though not the most performant, is to use a ConcurrentDictionary (from System.Collections.Concurrent) internally, which ensures thread safety, and then to call dictionary.ToImmutableDictionary() (from System.Collections.Immutable) to create the dictionary that escapes the inner class. The interface signature uses ImmutableDictionary<KeyType, ValueType>.
In my case the dictionaries have fewer than 12 keys and mostly hold small, simple objects representing state, so the performance cost is not a concern.

Query List Dictionary

I have a list of dictionaries
var ProductItemsDictionary = new List<Dictionary<string, string>>();
Is it possible to use linq to search the list and find a dictionary based on a key inside that dictionary and retrieve the value?
Sure it is, but is it worth it? See for instance dotctor's answer. It will do the job, but it is inefficient: 2 key lookups (one for checking and one for retrieving the value), a compiler-generated class and a heap allocation (because of the specificKey variable capture), etc. Less code? More readable? What about the non-LINQ equivalent:
static void Foo(List<Dictionary<string, string>> ProductItemsDictionary, string key)
{
string value;
foreach (var dictionary in ProductItemsDictionary)
if (dictionary.TryGetValue(key, out value)) { /* Use value */ }
// Not found
}
Zero allocations, minimum key lookups, good readability (IMO) - what else do we need? :-)
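One way to package that loop, sketched under the assumption that only the first match matters: a Try-style helper that stops at the first dictionary containing the key. The names ProductLookup and TryFindValue, and the sample keys, are invented for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ProductLookup
{
    // Returns true and the value from the first dictionary that contains the key;
    // returns false (and a null value) when no dictionary has it.
    public static bool TryFindValue(
        List<Dictionary<string, string>> dictionaries, string key, out string value)
    {
        foreach (var dictionary in dictionaries)
        {
            // A single lookup both checks for the key and retrieves the value.
            if (dictionary.TryGetValue(key, out value))
            {
                return true;
            }
        }

        value = null;
        return false;
    }

    public static void Main()
    {
        var productItems = new List<Dictionary<string, string>>
        {
            new Dictionary<string, string> { ["sku"] = "A-1" },
            new Dictionary<string, string> { ["color"] = "red" },
        };

        if (TryFindValue(productItems, "color", out var color))
        {
            Console.WriteLine(color); // red
        }
    }
}
```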

In C# how does foreach behave when the enumerated container is modified

This seems like it should be answered but potential dupes I found were asking different things...
I noticed that this seems to work fine (sourceDirInclusion is a simple Dictionary<X,Y>)
foreach (string dir in sourceDirInclusion.Keys)
{
if (sourceDirInclusion[dir] == null)
sourceDirInclusion.Remove(dir);
}
Does that mean removing items from a collection in foreach is safe, or that I got lucky?
What about if I was adding more elements to the dictionary rather than removing?
The problem I'm trying to solve is that sourceDirInclusion is initially populated, but then each value can contribute new items to the dictionary in a second pass, e.g. what I want to do is something like:
foreach (string dir in sourceDirInclusion.Keys)
{
X x = sourceDirInclusion[dir];
sourceDirInclusion.Add(x.dir, x.val);
}
Short answer: This is not safe.
Long answer: From the IEnumerator<T> documentation:
An enumerator remains valid as long as the collection remains unchanged. If changes are made to the collection, such as adding, modifying, or deleting elements, the enumerator is irrecoverably invalidated and its behavior is undefined.
Note that the docs say the behavior is undefined, which means that it might work and it might not. One should never rely on undefined behavior.
In this case, it depends on the behavior of the Keys enumerable, regarding whether or not it creates a copy of the list of keys when you begin enumerating. In this specific case, we know from the docs that the return value from Dictionary<,>.Keys is a collection that refers back to the dictionary:
The returned Dictionary<TKey, TValue>.KeyCollection is not a static copy; instead, the Dictionary<TKey, TValue>.KeyCollection refers back to the keys in the original Dictionary<TKey, TValue>. Therefore, changes to the Dictionary<TKey, TValue> continue to be reflected in the Dictionary<TKey, TValue>.KeyCollection.
So it should be considered unsafe to modify the dictionary while enumerating the dictionary's keys.
You can correct this with one change. Alter this line:
foreach (string dir in sourceDirInclusion.Keys)
To this:
foreach (string dir in sourceDirInclusion.Keys.ToList())
The ToList() extension method will create an explicit copy of the list of keys, making it safe to modify the dictionary; the "underlying collection" will be the copy and not the original.
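A small sketch of both passes from the question, each looping over a copy of the keys. The class and method names, and the "/sub" suffix used to derive new keys, are hypothetical stand-ins for whatever the real second pass adds.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SnapshotKeysSketch
{
    // Removes every entry whose value is null, iterating over a copy of the keys.
    public static void RemoveNullValues(Dictionary<string, string> map)
    {
        foreach (string key in map.Keys.ToList())
        {
            if (map[key] == null)
            {
                map.Remove(key);
            }
        }
    }

    // Second pass: adds one derived entry per existing key. Entries added here
    // are not visited, because the loop runs over the copied key list.
    public static void AddDerivedEntries(Dictionary<string, string> map)
    {
        foreach (string key in map.Keys.ToList())
        {
            map[key + "/sub"] = map[key];
        }
    }

    public static void Main()
    {
        var map = new Dictionary<string, string> { ["src"] = "x", ["tmp"] = null };

        RemoveNullValues(map);
        Console.WriteLine(map.Count); // 1

        AddDerivedEntries(map);
        Console.WriteLine(map.Count); // 2
    }
}
```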
It will throw:
InvalidOperationException: Message="Collection was modified; enumeration operation may not execute."
To avoid that, add the candidates for removal to a separate list, then loop over that list and remove them from the target container (the dictionary).
List<string> list = new List<string>(sourceDirInclusion.Keys.Count);
foreach (string dir in sourceDirInclusion.Keys)
{
if (sourceDirInclusion[dir] == null)
list.Add(dir);
}
foreach (string dir in list)
{
sourceDirInclusion.Remove(dir);
}
Check this out: What is the best way to modify a list in a 'foreach' loop?
In short:
The collection used in foreach is immutable. This is very much by design.
As it says on MSDN:
The foreach statement is used to iterate through the collection to get the information that you want, but can not be used to add or remove items from the source collection to avoid unpredictable side effects. If you need to add or remove items from the source collection, use a for loop.
UPDATE:
You can use a for loop instead:
for (int index = 0; index < dictionary.Count; index++) {
var item = dictionary.ElementAt(index);
var itemKey = item.Key;
var itemValue = item.Value;
}
This works because you are traversing sourceDirInclusion.Keys.
However, to be safe with future versions of the framework, I recommend using sourceDirInclusion.Keys.ToArray() in the foreach statement; that way you loop over a copy of the keys.
This will however not work:
foreach(KeyValuePair<string, object> item in sourceDirInclusion)
{
if (item.Value == null)
sourceDirInclusion.Remove(item.Key);
}
As a rule, you cannot modify a collection while it is traversed, but often you can make a new collection by using .ToArray() or .ToList() and traverse that while modifying the original collection.
Good luck with your quest.

Byref issues in C#

I'm having difficulty comparing an updated version of a dictionary to its original version.
The first method receives the dictionary and then passes it to a static helper class that updates it.
Before I pass the original dictionary to the helper class, I want to make a copy of it so I can compare the two.
This is where I'm having trouble. After the helper class, the 'copy' of the dictionary has been updated too.
I've even tried making a struct that contains a dictionary, thinking that would copy the original dictionary's values, but that seems to be by ref too! Here is a snippet of the code.
public PartialViewResult updateItem(string submit, FormCollection Collection)
{
SurveyItem UpdatedItem = new SurveyItem();
ItemSettingsCopy OriginalSettings;
ItemBank CurrentSurvey = (ItemBank)Session["Survey"];
string _itemName = (string)Session["CurrentItem"];
OriginalSettings.ItemSettings = CurrentSurvey[_itemName].ItemSettings;
//this is where I'm trying to make a copy of the original settings.
UpdatedItem = BankManagerHelper.UpdateItem(CurrentSurvey[_itemName], Collection, submit); //static item now updates the fields in the item
//AT THIS POINT OriginalSettings.ItemSettings HAS BEEN CHANGED TOO
You need to clone the dictionary.
This answer is a way to go. https://stackoverflow.com/a/139841/61256
[This code was copied from the link]
public static Dictionary<TKey, TValue> CloneDictionaryCloningValues<TKey, TValue>
(Dictionary<TKey, TValue> original) where TValue : ICloneable
{
Dictionary<TKey, TValue> ret = new Dictionary<TKey, TValue>(original.Count,
original.Comparer);
foreach (KeyValuePair<TKey, TValue> entry in original)
{
ret.Add(entry.Key, (TValue) entry.Value.Clone());
}
return ret;
}
This is where I'm having trouble. After the helper class, the 'copy' of the dictionary has been updated too.
Yes, because you are actually updating the same object in memory. When you pass a Dictionary as a parameter to a function, you pass a reference to the object, not a copy of it.
What you could do is create a new dictionary that contains the same list of objects before you call your UpdateItem method. You'll then have 2 different objects in memory, so you'll be able to compare them.
Note that you might want to create new instances of the items that are stored in your dictionary; otherwise both dictionaries will contain references to the same objects (I don't know whether you want to compare the dictionaries themselves or the objects stored in them).
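The difference between a second reference and a second dictionary can be sketched as follows. This assumes string values, which are immutable, so a shallow copy is enough; the names DictionaryCopySketch and ShallowCopy are made up for the example and are not part of the asker's code.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryCopySketch
{
    // Creates a new dictionary with the same entries and comparer.
    // The values themselves are not cloned.
    public static Dictionary<TKey, TValue> ShallowCopy<TKey, TValue>(
        Dictionary<TKey, TValue> source)
        => new Dictionary<TKey, TValue>(source, source.Comparer);

    public static void Main()
    {
        var original = new Dictionary<string, string> { ["title"] = "Old" };

        var alias = original;             // same object, second reference
        var copy = ShallowCopy(original); // separate dictionary

        original["title"] = "New";

        Console.WriteLine(alias["title"]); // New
        Console.WriteLine(copy["title"]);  // Old
    }
}
```

If the values are mutable reference types, the shallow copy still shares them with the original, which is the case the cloning helper quoted earlier is meant to handle.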
