Generic List Contains() performance and alternatives - c#

I need to store a large number of key/value pairs where the key is not unique. Both key and value are strings, and there are about 5 million items.
My goal is to hold only unique pairs.
I've tried using List<KeyValuePair<string, string>>, but Contains() is extremely slow.
LINQ's Any() looks a little faster, but is still too slow.
Are there any alternatives that search a generic list faster? Or should I use a different kind of storage?

I would use a Dictionary<string, HashSet<string>> mapping one key to all its values.
Here is a full solution. First, write two extension methods: one to add a (key, value) pair to your Dictionary and another to get all (key, value) pairs. Note that I use generic types for keys and values; you can substitute string without any problem.
You can even write these methods somewhere else instead of as extensions, or not use methods at all and just use this code somewhere in your program.
using System;
using System.Collections.Generic;
using System.Linq;
public static class Program
{
public static void Add<TKey, TValue>(
this Dictionary<TKey, HashSet<TValue>> data, TKey key, TValue value)
{
HashSet<TValue> values = null;
if (!data.TryGetValue(key, out values)) {
// first time using this key? create a new HashSet
values = new HashSet<TValue>();
data.Add(key, values);
}
values.Add(value);
}
public static IEnumerable<KeyValuePair<TKey, TValue>> KeyValuePairs<TKey, TValue>(
this Dictionary<TKey, HashSet<TValue>> data)
{
return data.SelectMany(k => k.Value,
(k, v) => new KeyValuePair<TKey, TValue>(k.Key, v));
}
}
Now you can use it as follows:
public static void Main(string[] args)
{
Dictionary<string, HashSet<string>> data = new Dictionary<string, HashSet<string>>();
data.Add("k1", "v1.1");
data.Add("k1", "v1.2");
data.Add("k1", "v1.1"); // already in, so nothing happens here
data.Add("k2", "v2.1");
foreach (var kv in data.KeyValuePairs())
Console.WriteLine(kv.Key + " : " + kv.Value);
}
Which will print this:
k1 : v1.1
k1 : v1.2
k2 : v2.1
If your key mapped to a List<string> then you would need to take care of duplicates yourself. HashSet<string> does that for you already.
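Since the original question was about a slow Contains(), here is a minimal sketch of the matching lookup for this structure. ContainsPair and the PairLookup class are hypothetical names, not part of the answer above; the idea is just one dictionary lookup followed by one set lookup.

```csharp
using System.Collections.Generic;

public static class PairLookup
{
    // Hypothetical helper: true only if the key exists and its set holds the value.
    public static bool ContainsPair<TKey, TValue>(
        this Dictionary<TKey, HashSet<TValue>> data, TKey key, TValue value)
    {
        return data.TryGetValue(key, out HashSet<TValue> values)
            && values.Contains(value);
    }
}
```

Both steps are hash lookups, so the cost should not grow with the number of stored pairs the way a List scan does.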

I guess that Dictionary<string, List<string>> will do the trick.

I would consider using some in-proc NoSQL database like RavenDB (RavenDB Embedded in this case) as they state on their website:
RavenDB can be used for application that needs to store millions of records and has fast query times.
Using it requires little boilerplate (example from the RavenDB website):
var myCompany = new Company
{
Name = "Hibernating Rhinos",
Employees = {
new Employee
{
Name = "Ayende Rahien"
}
},
Country = "Israel"
};
// Store the company in our RavenDB server
using (var session = documentStore.OpenSession())
{
session.Store(myCompany);
session.SaveChanges();
}
// Create a new session, retrieve an entity, and change it a bit
using (var session = documentStore.OpenSession())
{
Company entity = session.Query<Company>()
.Where(x => x.Country == "Israel")
.FirstOrDefault();
// We can also load by ID: session.Load<Company>(companyId);
entity.Name = "Another Company";
session.SaveChanges(); // will send the change to the database
}

To make a unique list you want to use .Distinct() to generate it, not .Contains(). However, whatever class holds your strings must implement .GetHashCode() and .Equals() correctly to get good performance, or you must pass in a custom comparer.
Here is how you could do it with a custom comparer:
private static void Main(string[] args)
{
List<KeyValuePair<string, string>> giantList = Populate();
var uniqueItems = giantList.Distinct(new MyStringEquater()).ToList();
}
class MyStringEquater : IEqualityComparer<KeyValuePair<string, string>>
{
//Choose which comparer you want based on whether you want your comparisons to be case sensitive or not
private static StringComparer comparer = StringComparer.OrdinalIgnoreCase;
public bool Equals(KeyValuePair<string, string> x, KeyValuePair<string, string> y)
{
return comparer.Equals(x.Key, y.Key) && comparer.Equals(x.Value, y.Value);
}
public int GetHashCode(KeyValuePair<string, string> obj)
{
unchecked
{
int x = 27;
x = x*11 + comparer.GetHashCode(obj.Key);
x = x*11 + comparer.GetHashCode(obj.Value);
return x;
}
}
}
Also, per your comment on the other answer, you could use the above comparer in a HashSet and have it store your unique items that way. You just need to pass the comparer in to the constructor.
var hashSetWithComparer = new HashSet<KeyValuePair<string, string>>(new MyStringEquater());

You will most likely see an improvement if you use a HashSet<KeyValuePair<string, string>>.
The test below finishes on my machine in about 10 seconds. If I change...
var collection = new HashSet<KeyValuePair<string, string>>();
...to...
var collection = new List<KeyValuePair<string, string>>();
...I get tired of waiting for it to complete (more than a few minutes).
Using a KeyValuePair<string, string> has the advantage that equality is determined by the values of Key and Value. KeyValuePair<TKey, TValue> is a struct, so its default equality compares the Key and Value fields using their own Equals methods, and string.Equals compares contents. Pairs with the same Key and Value are therefore considered equal by the runtime.
You can see that equality with this test:
var hs = new HashSet<KeyValuePair<string, string>>();
hs.Add(new KeyValuePair<string, string>("key", "value"));
var b = hs.Contains(new KeyValuePair<string, string>("key", "value"));
Console.WriteLine(b);
One thing that's important to remember, though, is that this comparison is ordinal and case sensitive. It does not depend on string interning: strings that come from a file or are built at runtime still compare equal as long as their contents match. If you need case-insensitive matching, pass a custom IEqualityComparer to the HashSet constructor.
using System;
using System.Collections.Generic;
using System.Diagnostics;
namespace ConsoleApplication1 {
internal class Program {
static void Main(string[] args) {
var key = default(string);
var value = default(string);
var collection = new HashSet<KeyValuePair<string, string>>();
for (var i = 0; i < 5000000; i++) {
if (key == null || i % 2 == 0) {
key = "k" + i;
}
value = "v" + i;
collection.Add(new KeyValuePair<string, string>(key, value));
}
var found = 0;
var sw = new Stopwatch();
sw.Start();
for (var i = 0; i < 5000000; i++) {
if (collection.Contains(new KeyValuePair<string, string>("k" + i, "v" + i))) {
found++;
}
}
sw.Stop();
Console.WriteLine("Found " + found);
Console.WriteLine(sw.Elapsed);
Console.ReadLine();
}
}
}
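Whether the lookup depends on interning can be checked directly. The sketch below (class and method names are made up for illustration) builds the probe strings at runtime so they are distinct objects from the literals, then asks the set for the pair. As far as I know, the struct's default equality ends up calling string.Equals, which compares contents, so the method returns true.

```csharp
using System;
using System.Collections.Generic;

public static class InterningCheck
{
    public static bool LookupWithNonInternedStrings()
    {
        var set = new HashSet<KeyValuePair<string, string>>();
        set.Add(new KeyValuePair<string, string>("key", "value"));

        // Strings built from char arrays are new objects, not the interned literals.
        string key = new string(new[] { 'k', 'e', 'y' });
        string value = new string(new[] { 'v', 'a', 'l', 'u', 'e' });

        // Same contents, different references: the pair is still found.
        return set.Contains(new KeyValuePair<string, string>(key, value));
    }
}
```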

Have you tried using a HashSet? It is much quicker than a List when large numbers of items are involved, although I don't know if it'd still be too slow.
This answer has a lot of information: HashSet vs. List performance

Related

How can I merge two dictionaries without getting ArgumentException on the key?

I can't figure out how to keep the keys and values when I try to merge two dictionaries. I keep getting an ArgumentException because of a duplicate key. When the keys match, I would just like to add the values together with += kvp.Value.
I have a list of dictionaries where:
1st dictionary: kvp = "jump", 2
2nd dictionary: kvp = "jump", 4
I would like to merge them and get something like:
Merged dictionary: kvp = "jump", 6
which I can later add to my list of dictionaries.
I've tried to run something I found in a Stack Overflow thread:
foreach (var dict in listOfDict)
{
dict.SelectMany(d => d)
.ToLookup(pair => pair.Key, pair => pair.Value)
.ToDictionary(group => group.Key, group => group.First());
}
But I keep getting:
cannot be inferred from the usage. Try specifying the type arguments explicitly.
I want to avoid getting all keys and all values on separate lists that I later loop through to add key and value on a new dictionary.
The simplest approach is a LINQ extension method for a list of dictionaries with double values:
public static class ExtListOfDict {
public static Dictionary<TKey, double> SumValue1<TKey>(this List<Dictionary<TKey, double>> list)
=> list?.SelectMany(i => i).ToLookup(i => i.Key, i => i.Value).ToDictionary(i => i.Key, i => i.Sum());
}
Without LINQ:
public static Dictionary<TKey, double> SumValue2<TKey>(this List<Dictionary<TKey, double>> list) {
if(list?.Count > 0) {
var dir = new Dictionary<TKey, double>(list[0]);
for(var i = 1; i < list.Count; i++)
foreach (var kv in list[i])
if (dir.TryGetValue(kv.Key, out double sum))
dir[kv.Key] = sum + kv.Value;
else
dir.Add(kv.Key, kv.Value);
return dir;
} else
return null;
}
If you like the LINQ approach, I would go with something like this:
var dictionaries = new List<Dictionary<string, int>>(); // this is the list of dictionaries you want to merge
var unifiedDictionary = new Dictionary<string, int>(); // this is the dictionary where you merge and add the values
foreach (var kvp in dictionaries.SelectMany(dictionary => dictionary))
{
if (unifiedDictionary.ContainsKey(kvp.Key))
{
unifiedDictionary[kvp.Key] += kvp.Value;
}
else
{
unifiedDictionary.Add(kvp.Key, kvp.Value);
}
}
However, if this is too hard to read (I am not always a fan of excessive LINQ over explicit code blocks), you can use the for-loop approach:
var dictionaries = new List<Dictionary<string, int>>(); // this is the list of dictionaries you want to merge
var unifiedDictionary = new Dictionary<string, int>(); // this is the dictionary where you merge and add the values
foreach (var dictionary in dictionaries)
{
foreach (var kvp in dictionary)
{
if (unifiedDictionary.ContainsKey(kvp.Key))
{
unifiedDictionary[kvp.Key] += kvp.Value;
}
else
{
unifiedDictionary.Add(kvp.Key, kvp.Value);
}
}
}
Hope this helps you. If further help and explanations are needed, please tell me.
Here is a solution based on the CollectionsMarshal.GetValueRefOrAddDefault API (.NET 6), and on the INumber<TSelf> interface (.NET 7):
public static Dictionary<TKey, TValue> ToSumDictionary<TKey, TValue>(
this IEnumerable<Dictionary<TKey, TValue>> dictionaries)
where TValue : struct, INumber<TValue>
{
ArgumentNullException.ThrowIfNull(dictionaries);
Dictionary<TKey, TValue> result = null;
foreach (var dictionary in dictionaries)
{
if (result is null)
{
result = new(dictionary, dictionary.Comparer);
continue;
}
if (!ReferenceEquals(dictionary.Comparer, result.Comparer))
throw new InvalidOperationException("Incompatible comparers.");
foreach (var (key, value) in dictionary)
{
ref TValue refValue = ref CollectionsMarshal
.GetValueRefOrAddDefault(result, key, out bool exists);
refValue = exists ? refValue + value : value;
}
}
result ??= new();
return result;
}
The key of each KeyValuePair<TKey, TValue> in each dictionary is hashed only once.
If you are getting an exception due to duplicate keys, then it sounds like you have duplicate keys!
Have you checked the two dictionaries before you try to merge them? Simply calling += kvp.Value without checking whether the first dictionary already has a key of that name is very likely to be your problem.
You need to check for an existing entry with that key, and if one is found, take whatever action is appropriate for your scenario (i.e. ignore, overwrite, ask the user to decide, etc.).
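For the "sum on collision" scenario from the question, the check could look roughly like this. MergeInto and MergeSketch are hypothetical names, and int values are assumed purely for illustration.

```csharp
using System.Collections.Generic;

public static class MergeSketch
{
    // Hypothetical helper: copies source into target, summing values when a key already exists.
    public static void MergeInto(Dictionary<string, int> target, Dictionary<string, int> source)
    {
        foreach (KeyValuePair<string, int> kvp in source)
        {
            // existing stays 0 when the key is not present yet.
            target.TryGetValue(kvp.Key, out int existing);
            // The indexer setter adds or overwrites, so it does not throw on duplicates.
            target[kvp.Key] = existing + kvp.Value;
        }
    }
}
```

Using the indexer instead of Add is what avoids the ArgumentException; swap the sum for whatever policy (ignore, overwrite) fits your case.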

How to iterate in reverse through an OrderedDictionary

How can I iterate through an OrderedDictionary in reverse and access its keys?
Since it doesn't have any support for LINQ extensions, I have tried the following:
var orderedDictionary= new OrderedDictionary();
orderedDictionary.Add("something", someObject);
orderedDictionary.Add("another", anotherObject);
for (var dictIndex = orderedDictionary.Count - 1; dictIndex != 0; dictIndex--)
{
// It gives me the value, but how do I get the key?
// E.g., "something" and "another".
var key = orderedDictionary[dictIndex];
}
May I suggest using SortedDictionary<K, V>? It supports LINQ and it is type-safe:
var orderedDictionary = new SortedDictionary<string, string>();
orderedDictionary.Add("something", "a");
orderedDictionary.Add("another", "b");
foreach (KeyValuePair<string, string> kvp in orderedDictionary.Reverse())
{
}
Also, as Ivan Stoev pointed out in a comment, the items returned by the OrderedDictionary aren't sorted at all (they are kept in insertion order), so SortedDictionary is what you want if you need sorted keys.
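To make the effect of Reverse() on a SortedDictionary concrete, here is a small sketch using the same two keys. The class and method names are invented for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ReverseSortedDemo
{
    public static List<string> KeysDescending()
    {
        var sorted = new SortedDictionary<string, string>
        {
            { "something", "a" },
            { "another", "b" }
        };
        // Enumeration is in ascending key order; Reverse() buffers it and yields it back to front.
        return sorted.Reverse().Select(kvp => kvp.Key).ToList();
    }
}
```

The result is "something" followed by "another": keys in descending order, regardless of the order in which they were added.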
You can lessen the complexity of this problem significantly by using a regular Dictionary (or SortedDictionary, depending on your requirements) and keep a secondary List to keep track of the keys' insertion order. You can even use a class to facilitate this organization:
public class DictionaryList<TKey, TValue>
{
private Dictionary<TKey, TValue> _dict;
private List<TKey> _list;
public TValue this[TKey key]
{
get { return _dict[key]; }
set { _dict[key] = value; }
}
public DictionaryList()
{
_dict = new Dictionary<TKey, TValue>();
_list = new List<TKey>();
}
public void Add(TKey key, TValue value)
{
_dict.Add(key, value);
_list.Add(key);
}
public IEnumerable<TValue> GetValuesReverse()
{
for (int i = _list.Count - 1; i >= 0; i--)
yield return _dict[_list[i]];
}
}
(And of course add whatever other methods you need as well.)
Since it doesn't have any support for LINQ extensions...
That's because it's a non-generic IEnumerable. You can make it generic by casting it to the right type:
foreach (var entry in orderedDictionary.Cast<DictionaryEntry>().Reverse()) {
var key = entry.Key;
var value = entry.Value;
}
You can get an element at an index like this:
orderedDictionary.Cast<DictionaryEntry>().ElementAt(dictIndex);
And for getting the Key:
orderedDictionary.Cast<DictionaryEntry>().ElementAt(dictIndex).Key.ToString();
I am not concerned with the ordering issue here. You can get the key by copying the keys to an indexable collection. Also, the loop condition needs to be changed to dictIndex > -1.
Please try this:
var orderedDictionary = new OrderedDictionary();
orderedDictionary.Add("something", someObject);
orderedDictionary.Add("another", anotherObject);
object[] keys = new object[orderedDictionary.Keys.Count];
orderedDictionary.Keys.CopyTo(keys, 0);
for (var dictIndex = orderedDictionary.Count-1; dictIndex > -1; dictIndex--)
{
// The integer indexer returns the value stored at that position.
var value = orderedDictionary[dictIndex];
// Get your key, e.g. "something" and "another"
var key = keys[dictIndex];
}
If you need an ordered dictionary, you can always use a SortedDictionary like below.
var orderedDictionary = new SortedDictionary<int, string>();
orderedDictionary.Add(1, "Abacus");
orderedDictionary.Add(2, "Lion");
orderedDictionary.Add(3, "Zebra");
var reverseList = orderedDictionary.ToList().OrderByDescending(pair => pair.Value);
foreach (var item in reverseList)
{
Debug.Print(item.Value);
}

Asp.Net KeyValuePair<string, { } > does not contain definition or extension [duplicate]

I've seen a few different ways to iterate over a dictionary in C#. Is there a standard way?
foreach(KeyValuePair<string, string> entry in myDictionary)
{
// do something with entry.Value or entry.Key
}
If you are trying to use a generic Dictionary in C# like you would use an associative array in another language:
foreach(var item in myDictionary)
{
foo(item.Key);
bar(item.Value);
}
Or, if you only need to iterate over the collection of keys, use
foreach(var item in myDictionary.Keys)
{
foo(item);
}
And lastly, if you're only interested in the values:
foreach(var item in myDictionary.Values)
{
foo(item);
}
(Note that the var keyword is available in C# 3.0 and above; you could also use the exact type of your keys/values here.)
In some cases you may need a counter, as a for loop would provide. For that, LINQ provides ElementAt, which enables the following:
for (int index = 0; index < dictionary.Count; index++) {
var item = dictionary.ElementAt(index);
var itemKey = item.Key;
var itemValue = item.Value;
}
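One thing to keep in mind with the loop above: a Dictionary has no positional indexer, so ElementAt walks the sequence from the start on each call and the loop does quadratic work overall. If all you need is a counter, a sketch like the following gets one in a single pass, using the Select overload that supplies the position. The names IndexedIteration and Describe are made up for the example.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IndexedIteration
{
    public static List<string> Describe(Dictionary<string, int> dictionary)
    {
        var lines = new List<string>();
        // The two-argument Select overload passes a zero-based position with each element.
        foreach (var item in dictionary.Select((kvp, index) => new { kvp.Key, kvp.Value, Index = index }))
        {
            lines.Add($"{item.Index}: {item.Key} = {item.Value}");
        }
        return lines;
    }
}
```

The index here is only the position in the current enumeration; it carries no meaning beyond that, since Dictionary does not promise any particular order.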
Depends on whether you're after the keys or the values...
From the MSDN Dictionary(TKey, TValue) Class description:
// When you use foreach to enumerate dictionary elements,
// the elements are retrieved as KeyValuePair objects.
Console.WriteLine();
foreach( KeyValuePair<string, string> kvp in openWith )
{
Console.WriteLine("Key = {0}, Value = {1}",
kvp.Key, kvp.Value);
}
// To get the values alone, use the Values property.
Dictionary<string, string>.ValueCollection valueColl =
openWith.Values;
// The elements of the ValueCollection are strongly typed
// with the type that was specified for dictionary values.
Console.WriteLine();
foreach( string s in valueColl )
{
Console.WriteLine("Value = {0}", s);
}
// To get the keys alone, use the Keys property.
Dictionary<string, string>.KeyCollection keyColl =
openWith.Keys;
// The elements of the KeyCollection are strongly typed
// with the type that was specified for dictionary keys.
Console.WriteLine();
foreach( string s in keyColl )
{
Console.WriteLine("Key = {0}", s);
}
Generally, asking for "the best way" without a specific context is like asking
what is the best color?
On the one hand, there are many colors and there's no best color. It depends on the need and often on taste, too.
On the other hand, there are many ways to iterate over a Dictionary in C# and there's no best way. It depends on the need and often on taste, too.
Most straightforward way
foreach (var kvp in items)
{
// key is kvp.Key
doStuff(kvp.Value);
}
If you need only the value (this lets you call it item, which is more readable than kvp.Value):
foreach (var item in items.Values)
{
doStuff(item);
}
If you need a specific sort order
Beginners are often surprised by the order in which a Dictionary is enumerated.
LINQ provides a concise syntax that lets you specify the order (and many other things), e.g.:
foreach (var kvp in items.OrderBy(kvp => kvp.Key))
{
// key is kvp.Key
doStuff(kvp.Value);
}
Again, you might only need the value. LINQ also provides a concise way to:
iterate directly on the value (this lets you call it item, which is more readable than kvp.Value)
but sorted by the keys
Here it is:
foreach (var item in items.OrderBy(kvp => kvp.Key).Select(kvp => kvp.Value))
{
doStuff(item);
}
Many more real-world use cases can be built from these examples.
If you don't need a specific order, just stick to the "most straightforward way" (see above)!
C# 7.0 introduced deconstructors, and if you are using a .NET Core 2.0+ application, the KeyValuePair<> struct already includes a Deconstruct() method for you. So you can do:
var dic = new Dictionary<int, string>() { { 1, "One" }, { 2, "Two" }, { 3, "Three" } };
foreach (var (key, value) in dic) {
Console.WriteLine($"Item [{key}] = {value}");
}
//Or
foreach (var (_, value) in dic) {
Console.WriteLine($"Item [NO_ID] = {value}");
}
//Or
foreach ((int key, string value) in dic) {
Console.WriteLine($"Item [{key}] = {value}");
}
I would say foreach is the standard way, though it obviously depends on what you're looking for
foreach(var kvp in my_dictionary) {
...
}
Is that what you're looking for?
You can also try this on big dictionaries for multithreaded processing.
dictionary
.AsParallel()
.ForAll(pair =>
{
// Process pair.Key and pair.Value here
});
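Because the ForAll lambda can run on several threads at once, anything it shares has to be updated safely. A minimal sketch, assuming int values and a simple sum as the work to do (ParallelSum and SumValues are names invented here):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class ParallelSum
{
    public static long SumValues(Dictionary<string, int> dictionary)
    {
        long total = 0;
        dictionary
            .AsParallel()
            .ForAll(pair =>
            {
                // Atomic add: a plain "total += pair.Value" could lose updates under contention.
                Interlocked.Add(ref total, pair.Value);
            });
        return total;
    }
}
```

For a plain sum, dictionary.Values.Sum() would be simpler; the point is only to show the shape of thread-safe accumulation inside ForAll.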
I appreciate this question has already had a lot of responses but I wanted to throw in a little research.
Iterating over a dictionary can be rather slow compared with iterating over something like an array. In my tests, an iteration over an array took 0.015003 seconds, whereas an iteration over a dictionary (with the same number of elements) took 0.0365073 seconds. That's 2.4 times as long, although I have seen much bigger differences. For comparison, a List was somewhere in between at 0.00215043 seconds.
However, that is like comparing apples and oranges. My point is that iterating over dictionaries is slow.
Dictionaries are optimised for lookups, so with that in mind I've created two methods. One simply does a foreach, the other iterates the keys then looks up.
public static string Normal(Dictionary<string, string> dictionary)
{
string value;
int count = 0;
foreach (var kvp in dictionary)
{
value = kvp.Value;
count++;
}
return "Normal";
}
This one loads the keys and iterates over them instead (I also tried pulling the keys into a string[], but the difference was negligible).
public static string Keys(Dictionary<string, string> dictionary)
{
string value;
int count = 0;
foreach (var key in dictionary.Keys)
{
value = dictionary[key];
count++;
}
return "Keys";
}
With this example the normal foreach test took 0.0310062 and the keys version took 0.2205441. Loading all the keys and iterating over all the lookups is clearly a LOT slower!
For a final test I've performed my iteration ten times to see if there are any benefits to using the keys here (by this point I was just curious):
Here's the RunTest method if that helps you visualise what's going on.
private static string RunTest<T>(T dictionary, Func<T, string> function)
{
DateTime start = DateTime.Now;
string name = null;
for (int i = 0; i < 10; i++)
{
name = function(dictionary);
}
DateTime end = DateTime.Now;
var duration = end.Subtract(start);
return string.Format("{0} took {1} seconds", name, duration.TotalSeconds);
}
Here the normal foreach run took 0.2820564 seconds (around ten times longer than a single iteration took - as you'd expect). The iteration over the keys took 2.2249449 seconds.
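As a side note on the measurement itself, DateTime.Now has fairly coarse resolution for timings this short. A hedged variant of the RunTest idea using Stopwatch might look like this; RunTimed and the repetitions parameter are my own additions, not part of the answer.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
    // Hypothetical variant of RunTest that measures elapsed time with Stopwatch.
    public static string RunTimed<T>(T input, Func<T, string> function, int repetitions = 10)
    {
        string name = null;
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < repetitions; i++)
        {
            name = function(input);
        }
        sw.Stop();
        return string.Format("{0} took {1} seconds", name, sw.Elapsed.TotalSeconds);
    }
}
```

It is called the same way as RunTest, e.g. Timing.RunTimed(dictionary, Normal).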
Edited To Add:
Reading some of the other answers made me question what would happen if I used a Dictionary with different key and value types. In this example the array took 0.0120024 seconds, the list 0.0185037 seconds and the dictionary 0.0465093 seconds. It's reasonable to expect that the data type makes a difference to how much slower the dictionary is.
What are my Conclusions?
Avoid iterating over a dictionary if you can; it is substantially slower than iterating over an array with the same data in it.
If you do choose to iterate over a dictionary, don't try to be too clever. Although it is slower than an array, you could do a lot worse than using the standard foreach method.
As already pointed out in another answer, KeyValuePair<TKey, TValue> implements a Deconstruct method starting with .NET Core 2.0, .NET Standard 2.1 and .NET 5.
With this, it's possible to iterate through a dictionary in a KeyValuePair agnostic way:
var dictionary = new Dictionary<int, string>();
// ...
foreach (var (key, value) in dictionary)
{
// ...
}
There are plenty of options. My personal favorite is by KeyValuePair
Dictionary<string, object> myDictionary = new Dictionary<string, object>();
// Populate your dictionary here
foreach (KeyValuePair<string,object> kvp in myDictionary)
{
// Do some interesting things
}
You can also use the Keys and Values Collections
With .NET Framework 4.7 one can use deconstruction:
var fruits = new Dictionary<string, int>();
...
foreach (var (fruit, number) in fruits)
{
Console.WriteLine(fruit + ": " + number);
}
To make this code work on older framework versions, add the System.ValueTuple NuGet package and put this extension somewhere:
public static class MyExtensions
{
public static void Deconstruct<T1, T2>(this KeyValuePair<T1, T2> tuple,
out T1 key, out T2 value)
{
key = tuple.Key;
value = tuple.Value;
}
}
As of C# 7, you can deconstruct objects into variables. I believe this to be the best way to iterate over a dictionary.
Example:
Create an extension method on KeyValuePair<TKey, TVal> that deconstructs it:
public static void Deconstruct<TKey, TVal>(this KeyValuePair<TKey, TVal> pair, out TKey key, out TVal value)
{
key = pair.Key;
value = pair.Value;
}
Iterate over any Dictionary<TKey, TVal> in the following manner
// Dictionary can be of any types, just using 'int' and 'string' as examples.
Dictionary<int, string> dict = new Dictionary<int, string>();
// Deconstructor gets called here.
foreach (var (key, value) in dict)
{
Console.WriteLine($"{key} : {value}");
}
foreach is fastest, and if you only iterate over ___.Values, it is faster still.
Using C# 7, add this extension method to any project of your solution:
public static class IDictionaryExtensions
{
public static IEnumerable<(TKey, TValue)> Tuples<TKey, TValue>(
this IDictionary<TKey, TValue> dict)
{
foreach (KeyValuePair<TKey, TValue> kvp in dict)
yield return (kvp.Key, kvp.Value);
}
}
And use this simple syntax
foreach (var(id, value) in dict.Tuples())
{
// your code using 'id' and 'value'
}
Or this one, if you prefer
foreach ((string id, object value) in dict.Tuples())
{
// your code using 'id' and 'value'
}
In place of the traditional
foreach (KeyValuePair<string, object> kvp in dict)
{
string id = kvp.Key;
object value = kvp.Value;
// your code using 'id' and 'value'
}
The extension method transforms the KeyValuePair of your IDictionary<TKey, TValue> into a strongly typed tuple, allowing you to use this new comfortable syntax.
It converts just the dictionary entries that are actually enumerated to tuples; it does NOT convert the whole dictionary up front, so there are no performance concerns related to that.
There is only a minor cost in calling the extension method to create a tuple compared with using the KeyValuePair directly, which should NOT be an issue if you are assigning the KeyValuePair's Key and Value properties to new loop variables anyway.
In practice, this new syntax suits most cases very well, except for low-level, ultra-high-performance scenarios, where you still have the option of simply not using it in that specific spot.
Check this out: MSDN Blog - New features in C# 7
Simplest form to iterate a dictionary:
foreach(var item in myDictionary)
{
Console.WriteLine(item.Key);
Console.WriteLine(item.Value);
}
I found this method in the documentation for the DictionaryBase class on MSDN:
foreach (DictionaryEntry de in myDictionary)
{
//Do some stuff with de.Value or de.Key
}
This was the only one I was able to get functioning correctly in a class that inherited from the DictionaryBase.
Sometimes, if you only need the values to be enumerated, use the dictionary's value collection:
foreach(var value in dictionary.Values)
{
// do something with value only
}
Reported by this post which states it is the fastest method:
http://alexpinsker.blogspot.hk/2010/02/c-fastest-way-to-iterate-over.html
I know this is a very old question, but I created some extension methods that might be useful:
public static void ForEach<T, U>(this Dictionary<T, U> d, Action<KeyValuePair<T, U>> a)
{
foreach (KeyValuePair<T, U> p in d) { a(p); }
}
public static void ForEach<T, U>(this Dictionary<T, U>.KeyCollection k, Action<T> a)
{
foreach (T t in k) { a(t); }
}
public static void ForEach<T, U>(this Dictionary<T, U>.ValueCollection v, Action<U> a)
{
foreach (U u in v) { a(u); }
}
This way I can write code like this:
myDictionary.ForEach(pair => Console.Write($"key: {pair.Key}, value: {pair.Value}"));
myDictionary.Keys.ForEach(key => Console.Write(key));
myDictionary.Values.ForEach(value => Console.Write(value));
If you want to use a for loop, you can do as below:
var keyList=new List<string>(dictionary.Keys);
for (int i = 0; i < keyList.Count; i++)
{
var key= keyList[i];
var value = dictionary[key];
}
I will take the advantage of .NET 4.0+ and provide an updated answer to the originally accepted one:
foreach(var entry in MyDic)
{
// do something with entry.Value or entry.Key
}
If, say, you want to iterate over the values collection by default, I believe you can implement IEnumerable<T>, where T is the type of the values in the dictionary and "this" is a Dictionary.
public new IEnumerator<T> GetEnumerator()
{
return this.Values.GetEnumerator();
}
The standard way to iterate over a non-generic dictionary (one that enumerates DictionaryEntry items, such as Hashtable), according to the official documentation on MSDN, is:
foreach (DictionaryEntry entry in myDictionary)
{
//Read entry.Key and entry.Value here
}
I wrote an extension to loop over a dictionary.
public static class DictionaryExtension
{
public static void ForEach<T1, T2>(this Dictionary<T1, T2> dictionary, Action<T1, T2> action) {
foreach(KeyValuePair<T1, T2> keyValue in dictionary) {
action(keyValue.Key, keyValue.Value);
}
}
}
Then you can call
myDictionary.ForEach((x,y) => Console.WriteLine(x + " - " + y));
Dictionary<TKey, TValue> is a generic collection class in C# that stores data in key/value format. A key must be unique and cannot be null, whereas a value can be duplicated and can be null. Each item in the dictionary is treated as a KeyValuePair<TKey, TValue> structure representing a key and its value, so KeyValuePair<TKey, TValue> is the element type to use when iterating. Below is an example.
Dictionary<int, string> dict = new Dictionary<int, string>();
dict.Add(1,"One");
dict.Add(2,"Two");
dict.Add(3,"Three");
foreach (KeyValuePair<int, string> item in dict)
{
Console.WriteLine("Key: {0}, Value: {1}", item.Key, item.Value);
}
The best answer is of course: consider whether you could use a more appropriate data structure than a dictionary if you plan to iterate over it, as Vikas Gupta already mentioned at the beginning of the discussion under the question. But that discussion, like this whole thread, surprisingly still lacks good alternatives. One is:
SortedList<string, string> x = new SortedList<string, string>();
x.Add("key1", "value1");
x.Add("key2", "value2");
x["key3"] = "value3";
foreach( KeyValuePair<string, string> kvPair in x )
Console.WriteLine($"{kvPair.Key}, {kvPair.Value}");
Why could iterating over a dictionary (e.g. with foreach (KeyValuePair<,>)) be argued to be a code smell?
A basic principle of Clean Coding:
"Express intent!"
Robert C. Martin writes in "Clean Code": "Choosing names that reveal intent". Obviously naming alone is too weak. "Express (reveal) intent with every coding decision" expresses it better.
A related principle is the "Principle of least surprise" (= Principle of Least Astonishment).
Why is this related to iterating over a dictionary? Choosing a dictionary expresses the intent of choosing a data structure that was made primarily for finding data by key. Nowadays there are so many alternatives in .NET that, if you want to iterate through key/value pairs, you could choose something else.
Moreover: if you iterate over something, you have to reveal something about how the items are (to be) ordered and expected to be ordered!
Although the known implementations of Dictionary enumerate the key collection in the order in which the items were added, AFAIK Dictionary has no assured specification about ordering (has it?).
But what are the alternatives?
TLDR:
SortedList: If your collection is not getting too large, a simple solution would be to use SortedList<,>, which also gives you full indexing of key/value pairs.
Microsoft has a long article mentioning and explaining suitable collections:
Keyed collection
To mention the most important: KeyedCollection<,> and SortedDictionary<,>.
SortedDictionary<,> is a bit faster than SortedList<,> for insertion alone when the collection gets large, but it lacks indexing and is needed only if O(log n) insertion is preferred over other operations. If you really need O(1) insertion and accept slower iteration in exchange, you have to stay with a simple Dictionary<,>.
Obviously there is no data structure that is the fastest for every possible operation.
Additionally there is ImmutableSortedDictionary<,>.
And if no single data structure is exactly what you need, then derive from Dictionary<,> or even from the new ConcurrentDictionary<,> and add explicit iteration/sorting functions!
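To illustrate what "full indexing" buys you with SortedList<,>, here is a small sketch that walks the list backwards by position. The class and method names are hypothetical.

```csharp
using System.Collections.Generic;

public static class SortedListIndexing
{
    public static List<string> ReverseByIndex(SortedList<string, string> list)
    {
        var result = new List<string>();
        // Keys and Values are exposed as IList<T>, so entries can be read by position.
        for (int i = list.Count - 1; i >= 0; i--)
        {
            result.Add(list.Keys[i] + "=" + list.Values[i]);
        }
        return result;
    }
}
```

With the x list from the example above (key1, key2, key3), this yields the entries from key3 down to key1.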
var dictionary = new Dictionary<string, int>
{
{ "Key", 12 }
};
var aggregateObjectCollection = dictionary.Select(
entry => new AggregateObject(entry.Key, entry.Value));
Just wanted to add my 2 cents, as most answers relate to the foreach loop.
Please take a look at the following code:
Dictionary<String, Double> myProductPrices = new Dictionary<String, Double>();
//Add some entries to the dictionary
myProductPrices.ToList().ForEach(kvp =>
{
// KeyValuePair is read-only, so compute the new price in a local variable
double newPrice = kvp.Value * 1.15;
Console.WriteLine(String.Format("Product '{0}' has a new price: {1} $", kvp.Key, newPrice));
});
Although this adds an additional call to '.ToList()', there might be a slight performance improvement (as pointed out here: foreach vs someList.ForEach(){}),
especially when working with large dictionaries where running in parallel is not an option or won't have an effect at all.
Also, please note that KeyValuePair<TKey, TValue> is read-only: you can't assign to its 'Key' or 'Value' property, either inside a foreach loop or inside the lambda. To change a stored value, assign through the dictionary's indexer.
When you just want to "read" keys and values, you might also use Enumerable.Select().
var newProductPrices = myProductPrices.Select(kvp => new { Name = kvp.Key, Price = kvp.Value * 1.15 } );
In addition to the highest-ranking posts, where there is a discussion between using
foreach(KeyValuePair<string, string> entry in myDictionary)
{
// do something with entry.Value or entry.Key
}
or
foreach(var entry in myDictionary)
{
// do something with entry.Value or entry.Key
}
the most complete form is the following, because you can see the dictionary type from the initialization, and kvp is a KeyValuePair:
var myDictionary = new Dictionary<string, string>(x);//fill dictionary with x
foreach(var kvp in myDictionary)//iterate over dictionary
{
// do something with kvp.Value or kvp.Key
}

Item duplication problem

My goal is to insert a new value into a column whose values are as follows:
100 * 100
150 * 150
200 * 200
200 * 200
I get the following error:
Item has already been added. Key in dictionary: '200 x 200' Key being added: '200 x 200'
For the following code:
SortedList sortedList = new SortedList();
foreach (ListItem listItem in ddldimension.Items)
sortedList.Add(listItem.Text, listItem.Value);
if (!sortedList.ContainsKey(CommonUtilities.GetCustomString("DefaultValues", "defaultEmbedDimension1")))
sortedList.Add(CommonUtilities.GetCustomString("DefaultValues", "defaultEmbedDimension1"), "defaultEmbedDimension1");
if (!sortedList.ContainsKey(CommonUtilities.GetCustomString("DefaultValues", "defaultEmbedDimension2")))
sortedList.Add(CommonUtilities.GetCustomString("DefaultValues", "defaultEmbedDimension2"), "defaultEmbedDimension2");
if (!sortedList.ContainsKey(CommonUtilities.GetCustomString("DefaultValues", "defaultEmbedDimension3")))
sortedList.Add(CommonUtilities.GetCustomString("DefaultValues", "defaultEmbedDimension3"), "defaultEmbedDimension3");
From the error message you're getting, and from the documentation for SortedList:
In either case, a SortedList does not allow duplicate keys.
So it would appear that a SortedList isn't the right structure for you to be using in your application. Unfortunately, you've provided insufficient information to allow me to suggest something better.
SortedList does not allow adding duplicate keys. Use List<> (along with KeyValuePair for example) instead (eg. List<KeyValuePair<string, object>>).
Here is the solution for your code:
var list = new List<KeyValuePair<string, string>>();
foreach (ListItem item in ddldimension.Items)
{
    list.Add(new KeyValuePair<string, string>(item.Text, item.Value));
}
var defaultEmbedDimension1 = CommonUtilities.GetCustomString("DefaultValues", "defaultEmbedDimension1");
// FindIndex returns -1 if there is no such Key. To search by Value, replace k.Key with k.Value.
int index = list.FindIndex(k => k.Key == defaultEmbedDimension1);
if (index < 0)
{
    // Mirrors the !ContainsKey check in the question: add the default only if it is missing.
    list.Add(new KeyValuePair<string, string>(defaultEmbedDimension1, "defaultEmbedDimension1"));
}
This way, the structure is able to hold duplicate keys. Note that your code invokes the same method twice for each default value. Store the result in a variable instead:
string defaultEmbedDimension1 = CommonUtilities.GetCustomString("DefaultValues", "defaultEmbedDimension1");
To populate list, you can alternatively use LINQ:
var list = ddldimension.Items.Cast<ListItem>().Select(item => new KeyValuePair<string, string>(item.Text, item.Value)).ToList();
Read also: C# KeyValuePair Collection Hints at Dot Net Perls.
But if you decide to disallow duplicates and handle them gracefully in SortedList, you can create an extension method:
public static class SortedListExtensions
{
public static bool AddIfNotContains<K, V>(this IDictionary<K, V> dictionary, K key, V value)
{
if (!dictionary.ContainsKey(key))
{
dictionary.Add(key, value);
return true;
}
return false;
}
}
And use it as I did below, without throwing exception:
var sortedList = new SortedList<string, string>();
sortedList.Add("a", "b");
sortedList.AddIfNotContains("a", "b"); // Will not be added
sortedList.AddIfNotContains("b", "b"); // Will be added

How to sort the list with duplicate keys?

I have a set of elements/keys that I'm reading from two different config files, so the keys may be the same but with different values associated with each of them.
I want to list them in sorted order. What can I do? I tried the SortedList class, but it does not allow duplicate keys.
How can I do it?
E.g., let's say I have 3 elements with keys 1, 2, 3. Then I get one more element with key 2 (but a different value). I want the new key to be inserted after the existing key 2 but before 3. If I again find an element with key 2, it should go after the most recently added key 2.
Please note that I'm using .NET 2.0.
I prefer to use LINQ for this type of thing:
using System.Linq;
...
var mySortedList = myList.OrderBy(l => l.Key)
.ThenBy(l => l.Value);
foreach (var sortedItem in mySortedList) {
//You'd see each item in the order you specified in the loop here.
}
Note: you must be using .NET 3.5 or later to accomplish this.
What you need is a Sort function with a custom comparison. What you have now is the default comparer that is used when you call Sort, which compares on a single field value.
When you create a custom comparison (you do this in your class by implementing the IComparable interface), your object compares itself to every other object in the list you sort.
This is done by a function (don't worry, VS will generate the stub for you when you reference the interface):
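public class ThisObjectClass : IComparable
{
    // Field types are assumed here; use whatever your class actually holds.
    private int key;
    private string value;

    public int CompareTo(object obj)
    {
        ThisObjectClass something = obj as ThisObjectClass;
        if (something == null)
            throw new ArgumentException("I am a dumb little rabbit, trying to compare different base classes");
        // Compare on the key first.
        int result = this.key.CompareTo(something.key);
        if (result != 0)
            return result;
        // Keys are equal: use your own "is more important than" logic here.
        // Comparing the values is only a placeholder for that logic.
        return string.CompareOrdinal(this.value, something.value);
    }
}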
read on the links above for better information.
I know I had some trouble understanding this myself in the beginning, so if you need any extra help, add a comment and I will elaborate.
I did it by creating a SortedList<int, List<string>>. Whenever I find the duplicate key, I simply insert the value in the existing list associated with the key already present in the SortedList object. This way, I can have list of values for a particular key.
Use your own comparer class!
If your keys in the sorted list are integers, you may use for example this comparer:
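public class DegreeComparer : IComparer<int>
{
    #region IComparer<int> Members
    public int Compare(int x, int y)
    {
        // Deliberately never returns 0, so equal keys are not rejected as duplicates.
        // The flip side is that key lookups (ContainsKey, the indexer, Remove) can no longer find a key.
        if (x < y)
            return -1;
        else
            return 1;
    }
    #endregion
}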
To instantiate a new SortedList with int keys and string values, use:
var mySortedList = new SortedList<int, string>(new DegreeComparer());
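A small sketch of what this trick buys you and what it costs. NeverEqualComparer and NeverEqualDemo are names made up for this example; the comparer simply mirrors the one above. Because the comparer never reports equality, Add accepts a repeated key, but for the same reason a key lookup will not find anything.

```csharp
using System;
using System.Collections.Generic;

// Same idea as the comparer above: equal keys are never reported as equal.
public class NeverEqualComparer : IComparer<int>
{
    public int Compare(int x, int y)
    {
        return x < y ? -1 : 1;
    }
}

public static class NeverEqualDemo
{
    public static SortedList<int, string> Build()
    {
        SortedList<int, string> list = new SortedList<int, string>(new NeverEqualComparer());
        list.Add(2, "two");
        list.Add(1, "one");
        list.Add(2, "another two"); // no ArgumentException: the comparer never returns 0
        return list;
    }
}
```

After Build(), the list holds three entries with keys 1, 2, 2, yet ContainsKey(2) returns false, so read the entries by position (Keys[i], Values[i]) or by enumerating rather than by key.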
If you don't really care about the sequence of the elements with equal keys, add everything to a list and then sort it by key:
static void Main(string[] args)
{
List<KeyValuePair<int, MyClass>> sortedList =
new List<KeyValuePair<int, MyClass>>() {
new KeyValuePair<int, MyClass>(4, new MyClass("four")),
new KeyValuePair<int, MyClass>(7, new MyClass("seven")),
new KeyValuePair<int, MyClass>(5, new MyClass("five")),
new KeyValuePair<int, MyClass>(4, new MyClass("four-b")),
new KeyValuePair<int, MyClass>(7, new MyClass("seven-b"))
};
sortedList.Sort(Compare);
}
static int Compare(KeyValuePair<int, MyClass> a, KeyValuePair<int, MyClass> b)
{
return a.Key.CompareTo(b.Key);
}
If you really want the items inserted later to be after those inserted earlier, sort them as they are inserted:
class Sorter : IComparer<KeyValuePair<int, MyClass>>
{
static void Main(string[] args)
{
List<KeyValuePair<int, MyClass>> sortedList = new List<KeyValuePair<int, MyClass>>();
Sorter sorter = new Sorter();
foreach (KeyValuePair<int, MyClass> kv in new KeyValuePair<int, MyClass>[] {
new KeyValuePair<int, MyClass>(4, new MyClass("four")),
new KeyValuePair<int, MyClass>(7, new MyClass("seven")),
new KeyValuePair<int, MyClass>(5, new MyClass("five")),
new KeyValuePair<int, MyClass>(4, new MyClass("four-b")),
new KeyValuePair<int, MyClass>(4, new MyClass("four-c")),
new KeyValuePair<int, MyClass>(7, new MyClass("seven-b")) })
{
sorter.Insert(sortedList, kv);
}
for (int i = 0; i < sortedList.Count; i++)
{
Console.WriteLine(sortedList[i].ToString());
}
}
void Insert(List<KeyValuePair<int, MyClass>> sortedList, KeyValuePair<int, MyClass> newItem)
{
int newIndex = sortedList.BinarySearch(newItem, this);
if (newIndex < 0)
sortedList.Insert(~newIndex, newItem);
else
{
while (newIndex < sortedList.Count && (sortedList[newIndex].Key == newItem.Key))
newIndex++;
sortedList.Insert(newIndex, newItem);
}
}
#region IComparer<KeyValuePair<int,MyClass>> Members
public int Compare(KeyValuePair<int, MyClass> x, KeyValuePair<int, MyClass> y)
{
return x.Key.CompareTo(y.Key);
}
#endregion
}
Or you could have a sorted list of lists:
static void Main(string[] args)
{
SortedDictionary<int, List<MyClass>> sortedList = new SortedDictionary<int,List<MyClass>>();
foreach (KeyValuePair<int, MyClass> kv in new KeyValuePair<int, MyClass>[] {
new KeyValuePair<int, MyClass>(4, new MyClass("four")),
new KeyValuePair<int, MyClass>(7, new MyClass("seven")),
new KeyValuePair<int, MyClass>(5, new MyClass("five")),
new KeyValuePair<int, MyClass>(4, new MyClass("four-b")),
new KeyValuePair<int, MyClass>(4, new MyClass("four-c")),
new KeyValuePair<int, MyClass>(7, new MyClass("seven-b")) })
{
List<MyClass> bucket;
if (!sortedList.TryGetValue(kv.Key, out bucket))
sortedList[kv.Key] = bucket = new List<MyClass>();
bucket.Add(kv.Value);
}
foreach(KeyValuePair<int, List<MyClass>> kv in sortedList)
{
for (int i = 0; i < kv.Value.Count; i++ )
Console.WriteLine(kv.Value[i].ToString());
}
}
I'm not sure if you can use List initializers in .NET 2.0 like I did in the first example above, but I'm sure you know how to populate a list with data.
.NET doesn't have huge support for stable sorts (meaning that equivalent elements maintain their relative order when sorted). However, you can write your own stable-sorted-insert using List.BinarySearch and a custom IComparer<T> (that returns -1 if the key is less than or equal to the target, and +1 if greater).
Note that List.Sort is not a stable sort, so you'd either have to write your own stable quicksort routine or just use insertion sort to initially populate the collection.
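As a rough illustration of the stable sorted insert described above, here is a minimal sketch. It hand-rolls an upper-bound binary search instead of passing a never-equal comparer to List.BinarySearch, so it does not depend on the order in which BinarySearch hands its arguments to the comparer. The names StableInsert, UpperBound and InsertSorted are invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class StableInsert
{
    // Returns the index just past the last element whose key is <= the given key.
    public static int UpperBound<TKey, TValue>(
        List<KeyValuePair<TKey, TValue>> list, TKey key, IComparer<TKey> comparer)
    {
        int lo = 0, hi = list.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (comparer.Compare(list[mid].Key, key) <= 0)
                lo = mid + 1; // element is less than or equal: look to the right
            else
                hi = mid;     // element is greater: look to the left
        }
        return lo;
    }

    // Inserts the pair after any existing entries with the same key,
    // so entries with equal keys stay in insertion order.
    public static void InsertSorted<TKey, TValue>(
        List<KeyValuePair<TKey, TValue>> list, TKey key, TValue value)
    {
        int index = UpperBound(list, key, Comparer<TKey>.Default);
        list.Insert(index, new KeyValuePair<TKey, TValue>(key, value));
    }
}
```

Each insert costs O(log n) for the search plus O(n) for shifting the list, which is the usual trade-off of insertion into a List.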
Did you consider the NameValueCollection class? It allows you to store multiple values per key. You could, for example, have the following:
NameValueCollection nvc = new NameValueCollection();
nvc.Add("1", "one");
nvc.Add("2", "two");
nvc.Add("3", "three");
nvc.Add("2", "another value for two");
nvc.Add("1", "one bis");
and then to retrieve the values you could have:
for (int i = 0; i < nvc.Count; i++)
{
if (nvc.GetValues(i).Length > 1)
{
for (int x = 0; x < nvc.GetValues(i).Length; x++)
{
Console.WriteLine("'{0}' = '{1}'", nvc.GetKey(i), nvc.GetValues(i).GetValue(x));
}
}
else
{
Console.WriteLine("'{0}' = '{1}'", nvc.GetKey(i), nvc.GetValues(i)[0]);
}
}
which gives the output:
'1' = 'one'
'1' = 'one bis'
'2' = 'two'
'2' = 'another value for two'
'3' = 'three'
In .NET 2.0 you can write:
List<KeyValuePair<string, string>> keyValueList = new List<KeyValuePair<string, string>>();
// Simulate your list of key/value pair which key could be duplicate
keyValueList.Add(new KeyValuePair<string,string>("1","One"));
keyValueList.Add(new KeyValuePair<string,string>("2","Two"));
keyValueList.Add(new KeyValuePair<string,string>("3","Three"));
// Here an entry with duplicate key and new value
keyValueList.Add(new KeyValuePair<string, string>("2", "NEW TWO"));
// Your final sorted list with one unique key
SortedList<string, string> sortedList = new SortedList<string, string>();
foreach (KeyValuePair<string, string> s in keyValueList)
{
// Use the Indexer instead of Add method
sortedList[s.Key] = s.Value;
}
Output :
[1, One]
[2, NEW TWO]
[3, Three]
How about this:
SortedList<string, List<string>> sl = new SortedList<string, List<string>>();
List<string> x = new List<string>();
x.Add("5");
x.Add("1");
x.Add("5");
// use this to load
foreach (string z in x)
{
    // Use a separate variable for the lookup instead of overwriting x, the list being iterated.
    List<string> bucket;
    if (!sl.TryGetValue(z, out bucket))
    {
        bucket = new List<string>();
        sl.Add(z, bucket);
    }
    bucket.Add("F" + z);
}
// use this to print
foreach (string key in sl.Keys)
{
Console.Write("key=" + key + Environment.NewLine);
foreach (string item in sl[key])
{
Console.WriteLine(item);
}
}
I had a similar issue when I was designing a game, similar in concept to chess, where the computer makes a move. Multiple pieces could make a move, so I needed multiple board states, and each BoardState needed to be ranked based on the position of the pieces. For argument's sake and simplicity, say my game was Noughts and Crosses, I was Noughts and the computer was Crosses. If the board state shows 3 Noughts in a row, this is the best state for me; if it shows 3 Crosses in a row, this is the worst state for me and the best for the computer. There are other states during the game that are more favourable to one side or the other, and furthermore there are multiple states that result in a draw, so how do I rank them when their rank scores are equal? This is what I came up with (apologies in advance if you are not a VB programmer).
My comparer class:
Class ByRankScoreComparer
Implements IComparer(Of BoardState)
Public Function Compare(ByVal bs1 As BoardState, ByVal bs2 As BoardState) As Integer Implements IComparer(Of BoardState).Compare
Dim result As Integer = bs2.RankScore.CompareTo(bs1.RankScore) 'DESCENDING order
If result = 0 Then
result = bs1.Index.CompareTo(bs2.Index)
End If
Return result
End Function
End Class
My declarations:
Dim boardStates As New SortedSet(Of BoardState)(New ByRankScoreComparer)
My Board-State implementation:
Class BoardState
Private Shared BoardStateIndex As Integer = 0
Public ReadOnly Index As Integer
...
Public Sub New ()
BoardStateIndex += 1
Index = BoardStateIndex
End Sub
...
End Class
As you can see, RankScores are maintained in descending order, and of any two states with the same rank score, the later one goes to the bottom, as it will always have a greater assigned Index; this is what allows duplicates. I can also safely call boardStates.Remove(myCurrentBoardState), which also uses the comparer; the comparer must return 0 in order to locate the object to be deleted.
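For readers who do not use VB, the same idea might look roughly like this in C#. This is a sketch, not a translation of the full classes: the parts elided above stay elided, and passing RankScore in through the constructor is an assumption made only so that the example is self-contained.

```csharp
using System;
using System.Collections.Generic;

public class BoardState
{
    private static int boardStateIndex = 0;
    public readonly int Index;
    public readonly int RankScore; // assumption: the real class computes this from the board

    public BoardState(int rankScore)
    {
        boardStateIndex += 1;
        Index = boardStateIndex; // later states always get a greater Index
        RankScore = rankScore;
    }
}

public class ByRankScoreComparer : IComparer<BoardState>
{
    public int Compare(BoardState bs1, BoardState bs2)
    {
        // Descending by rank score.
        int result = bs2.RankScore.CompareTo(bs1.RankScore);
        if (result == 0)
        {
            // Tie-break on creation order, so equal scores are not treated as duplicates.
            result = bs1.Index.CompareTo(bs2.Index);
        }
        return result;
    }
}
```

A SortedSet<BoardState> constructed with new ByRankScoreComparer() then keeps states with equal scores in creation order, and Remove still works because the comparer returns 0 when a state is compared with itself.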
