How to properly use SortedDictionary in C#?

I'm trying to do something very simple but it seems that I don't understand SortedDictionary.
What I'm trying to do is the following:
Create a sorted dictionary that sorts my items by some floating number, so I create a dictionary that looks like this
SortedDictionary<float, Node<T>> allNodes = new SortedDictionary<float, Node<T>>();
And now, after I add items, I want to remove them one by one (every removal should have a complexity of O(log(n))), from the smallest to the largest.
How do I do it? I thought that simply allNodes[0] would give me the smallest, but it doesn't.
Moreover, it seems like the dictionary can't handle duplicate keys. I feel like I'm using the wrong data structure...
Should I use something else if I have a bunch of nodes that I want sorted by their distance (a floating-point value)?

allNodes[0] will not give you the first item in the dictionary - it will give you the item with a float key value of 0.
If you want the first item, try allNodes.Values.First() instead; to find the first key, use allNodes.Keys.First().
To remove the items one by one, loop over a copy of the Keys collection and call allNodes.Remove(key):
foreach (var key in allNodes.Keys.ToList())
{
    allNodes.Remove(key);
}
To answer the addendum to your question: correct, SortedDictionary (any flavor of Dictionary, for that matter) will not handle duplicate keys - calling Add with an existing key throws an ArgumentException, while assigning through the indexer overwrites the previous value.
You could use a SortedDictionary<float, List<Node<T>>> but then you have the complexity of extracting collections versus items, needing to initialize each list rather than just adding an item, etc. It's all possible and may still be the fastest structure for adds and gets, but it does add a bit of complexity.
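To make that concrete, here is a minimal sketch of the bucketed approach, assuming you wrap it in a small helper class (DistanceQueue is a made-up name) and store whatever node type you have as T:
using System.Collections.Generic;
using System.Linq;

class DistanceQueue<T>
{
    private readonly SortedDictionary<float, List<T>> buckets =
        new SortedDictionary<float, List<T>>();

    public void Add(float distance, T node)
    {
        List<T> bucket;
        if (!buckets.TryGetValue(distance, out bucket))
        {
            bucket = new List<T>();
            buckets.Add(distance, bucket);   // O(log n)
        }
        bucket.Add(node);                    // duplicate distances share a bucket
    }

    public T RemoveSmallest()
    {
        var first = buckets.First();         // smallest key; O(log n)
        var bucket = first.Value;
        T node = bucket[bucket.Count - 1];
        bucket.RemoveAt(bucket.Count - 1);   // O(1) when taking from the end
        if (bucket.Count == 0)
            buckets.Remove(first.Key);       // drop the now-empty bucket
        return node;
    }
}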

Yes, you're right about complexity.
In SortedDictionary all the keys are sorted. If you want to iterate from the smallest to the largest, foreach will be enough:
foreach (KeyValuePair<float, Node<T>> kvp in allNodes)
{
    // Do Something...
}
You wrote that you want to remove items. Removing from a collection while iterating it with foreach is not allowed, so first create a copy of it and iterate over that.
EDIT:
Yes, if you have duplicated keys you can't use SortedDictionary. Instead, define a Node type that carries both your payload and the float distance (a dist field), then write a comparer:
public class NodeComparer : IComparer<Node>
{
    public int Compare(Node n1, Node n2)
    {
        // ascending: the smallest dist sorts first
        return n1.dist.CompareTo(n2.dist);
    }
}
And then put everything in a simple List<Node> allNodes and sort:
allNodes.Sort(new NodeComparer());
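After sorting, the list can be consumed from the smallest distance to the largest; a brief sketch:
// allNodes is sorted ascending by dist at this point
foreach (Node n in allNodes)
{
    // ... process n, smallest dist first ...
}
Sorting is O(n log n) once, plus an O(n) scan, which matches the cost of removing n items from a SortedDictionary one by one.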

As a Dictionary<TKey, TValue> must have unique keys, I'd use List<Node<T>> instead. For instance, if your Node<T> class has a Value property
class Node<T>
{
    public float Value { get; set; }   // public so LINQ can read it
    // other properties
}
and you want to sort by this property, use LINQ:
var list = new List<Node<T>>();
// populate list
var smallest = list.OrderBy(n => n.Value).FirstOrDefault();
To remove the nodes one by one, from the smallest to the largest, sort the list first and then remove from the front:
list.Sort((a, b) => a.Value.CompareTo(b.Value));
while (list.Count > 0)
{
    // list[0] is the smallest remaining node
    list.RemoveAt(0);
}
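Each RemoveAt(0) shifts the whole tail of the list, so the loop above is O(n²) overall. If that matters, a minimal alternative sketch: sort descending once, so the smallest Value sits at the end, and pop from there:
list.Sort((a, b) => b.Value.CompareTo(a.Value));   // descending
while (list.Count > 0)
{
    var smallest = list[list.Count - 1];
    list.RemoveAt(list.Count - 1);   // O(1): nothing needs to shift
    // ... process smallest ...
}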

Related

Sorting by name after each entry

I am using this LINQ statement to sort a list by product name (ascending order). The list contains product names (string) and the sizes available for each product (List<byte>):
LinkedList<FullItemDetails> itemDetails = new LinkedList<FullItemDetails>();
public class FullItemDetails
{
    public string ProductName { get; set; }
    public List<byte> Sizes { get; set; }
}
Now every time I input a new entry (e.g. Jacket, 6, 12, 18, 10), I think my program is sorting my list all over again:
itemDetails.AddLast(fullItemDetails);
//SortedProducts
itemDetails = Products.OrderBy(x => x.ProductName).ToList();
If the list is already sorted I only need to put the last entry in its correct place.
What is the best way to do it, and to reduce the complexity of the algorithm? Thanks.
This seems like an ideal problem for a SortedList, as you have a key (the name) and a value (a List<int> for the sizes).
Documentation is available here: http://msdn.microsoft.com/en-us/library/system.collections.sortedlist.aspx
The list declaration would look like this: SortedList<string, List<int>>. All inserts are kept sorted on the string key, and the values can be enumerated in key order.
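A minimal sketch of that structure (the product names and sizes here are made up):
using System;
using System.Collections.Generic;

var products = new SortedList<string, List<int>>();

// inserts land in key order automatically; each Add is O(n) in the worst
// case, because SortedList keeps its entries in a contiguous array
products.Add("Jacket", new List<int> { 6, 10, 12, 18 });
products.Add("Boots", new List<int> { 40, 41, 42 });

foreach (var pair in products)
{
    // enumerates in key order: Boots first, then Jacket
    Console.WriteLine(pair.Key + ": " + string.Join(", ", pair.Value));
}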
Instead of List<T>, use SortedList<TKey, TValue> or SortedSet<T>. You can pass in an IComparer<T> to use a specific sorting algorithm through the respective constructor. Should you want to use a Lambda expression, you can use a small wrapper class to wrap a Comparison<T>.
Which will result in something like:
ICollection<FullItemDetails> _itemList = new SortedSet<FullItemDetails>(new ComparisonComparer<FullItemDetails>((x, y) => x.ProductName.CompareTo(y.ProductName)));
Your collection will now always be ordered.
When you're using .NET 4.5, you can use Comparer<T>.Create to create an IComparer implementation from a lambda expression.
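A brief sketch of that .NET 4.5 route (byName is a made-up variable name):
// build the comparer from a lambda instead of a wrapper class
var byName = Comparer<FullItemDetails>.Create(
    (x, y) => x.ProductName.CompareTo(y.ProductName));

var items = new SortedSet<FullItemDetails>(byName);
// caveat: a SortedSet treats items that compare equal (here: identical
// ProductName) as duplicates and silently refuses to add them again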
You can use a SortedList<string, FullItemDetails>.
And you add your items like this: list.Add(fullItemDetails.ProductName, fullItemDetails)
[Edit]: The order will be preserved after adding or removing an element.
[Edit2] Using LINQ:
Use a plain list to store your items (adding/removing), List<FullItemDetails> originalList, and a separate property to read your sorted data:
IEnumerable<FullItemDetails> sortedList = originalList.OrderBy(e => e.ProductName).ThenBy(e => /* logic to order by another property */);
Now you can iterate through sortedList, and since it is an IEnumerable<T>, each time you iterate it you will see exactly the same elements as in your originalList (even after adding or removing items).
In other words : sortedList only contains the logic to read your originalList.
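A minimal sketch of that deferred behavior (requires System.Linq):
var originalList = new List<FullItemDetails>();
IEnumerable<FullItemDetails> sortedList =
    originalList.OrderBy(e => e.ProductName);    // nothing is sorted yet

originalList.Add(new FullItemDetails { ProductName = "Jacket" });
originalList.Add(new FullItemDetails { ProductName = "Boots" });

foreach (var item in sortedList)                 // the sort runs here
    Console.WriteLine(item.ProductName);         // prints Boots, then Jacket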
Hope this helps.
Regards.

SortedHashTable in C#

What I am trying to do is to implement a heuristic approach to NP complete problem: I have a list of objects (matches) each has a double score. I am taking the first element in the list sorted by the score desc and then remove it from the list. Then all elements bound to the first one are to be removed. I iterate through the list till I have no more elements.
I need a data structure which can efficiently solve this problem, so basically it should have the following properties:
1. Generic
2. Is always sorted
3. Has a fast key access
Right now SortedSet<T> looks like the best fit.
The question is: is it the most optimal choice in my case?
List<Match> result = new List<Match>();   // "Match" stands in for the match type
while (sortedItems.Any())
{
    var first = sortedItems.First();
    result.Add(first);
    sortedItems.Remove(first);
    foreach (var dependentFirst in first.DependentElements)
    {
        sortedItems.Remove(dependentFirst);
    }
}
What I need is something like a sorted hash table.
I assume you're not just wanting to clear the list, but you want to do something with each item as it's removed.
var toDelete = new HashSet<T>();
foreach (var item in sortedItems)
{
    if (!toDelete.Contains(item))
    {
        toDelete.Add(item);
        // do something with item here
    }
    foreach (var dependentFirst in item.DependentElements)
    {
        // note: test dependentFirst, not item, or the dependents never get added
        if (!toDelete.Contains(dependentFirst))
        {
            toDelete.Add(dependentFirst);
            // do something with dependentFirst here
        }
    }
}
sortedItems.RemoveAll(i => toDelete.Contains(i));
I think you should use two data structures - a heap and a set - heap for keeping the sorted items, set for keeping the removed items. Fill the heap with the items, then remove the top one, and add it and all its dependents to the set. Remove the second one - if it's already in the set, ignore it and move to the third, otherwise add it and its dependents to the set.
Each time you add an item to the set, also do whatever it is you plan to do with the items.
The complexity here is O(N log N); you won't do better than that, as you have to sort the list of items anyway. If you want to squeeze out more performance, you can add a Removed boolean to each item and set it to true instead of using a set to track the removed items. I don't know if that is applicable in your case.
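A sketch of that approach, using a sorted List as the "heap" and a HashSet for the removed items (Match, Score and DependentElements are names assumed from the question's code):
var removed = new HashSet<Match>();
var result = new List<Match>();

matches.Sort((a, b) => b.Score.CompareTo(a.Score));   // O(N log N), descending

foreach (var m in matches)                            // single O(N) scan
{
    if (removed.Contains(m))
        continue;                 // knocked out by an earlier, higher-scored pick
    result.Add(m);
    removed.Add(m);
    foreach (var dep in m.DependentElements)
        removed.Add(dep);         // exclude everything bound to this pick
}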
If I'm not mistaken, you want something like this:
var dictionary = new Dictionary<string, int>();
dictionary.Add("car", 2);
dictionary.Add("apple", 1);
dictionary.Add("zebra", 0);
dictionary.Add("mouse", 5);
dictionary.Add("year", 3);
dictionary = dictionary.OrderBy(o => o.Key).ToDictionary(o => o.Key, o => o.Value);

Correct Collection to use for ordered collection

I have a collection that needs to be ordered in the order it is created.
But then at any time the user can change the order (i.e. move the 4th item to the first position).
Are there any collections with pre-built methods,
or should I use a SortedList?
Add(key++, Object); //pseudo code
then to change item
SwapObject(int key, int swapKey)
{
    // swap the values stored under the two keys
    tempValue = this[key];
    this[key] = this[swapKey];
    this[swapKey] = tempValue;
}
You can use a generic List<> which has the Insert method, so you can insert an object in a given position any time.
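For example, a brief sketch of moving the 4th item to the first position:
var items = new List<string> { "a", "b", "c", "d", "e" };

string moved = items[3];
items.RemoveAt(3);        // O(n): shifts the elements after index 3
items.Insert(0, moved);   // O(n): shifts everything right by one slot
// items is now: d, a, b, c, e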
You can use a simple List<YourObject> as container and implement IComparer for sorting.
List also provides methods for sorting, inserting at a position, and removing from a position.

What's the best collection to use for uniquely identifying nodes?

Currently I am using a Dictionary<int,node> to store around 10,000 nodes. The key is used as an ID number for later look up and the 'node' is a class that contains some data. Other classes within the program use the ID number as a pointer to the node. (this may sound inefficient. However, explaining my reasoning for using a dictionary for this is beyond the scope of my question.)
However, 20% of the nodes are duplicates.
What I want to do is, when I add a node, check whether it already exists. If it does, use that ID number; if not, create a new one.
This is my current solution to the problem:
public class nodeDictionary
{
    Dictionary<int, node> dict = new Dictionary<int, node>( );

    public int addNewNode( latLng ll )
    {
        node n = new node( ll );
        if ( dict.ContainsValue( n ) )
        {
            foreach ( KeyValuePair<int, node> kv in dict )
            {
                if ( kv.Value == n )
                {
                    return kv.Key;
                }
            }
        }
        else
        {
            if ( dict.Count != 0 )
            {
                dict.Add( dict.Last( ).Key + 1, n );
                return dict.Last( ).Key + 1;
            }
            else
            {
                dict.Add( 0, n );
                return 0;
            }
        }
        throw new Exception( );
    }//end add new node
}
The problem with this is that when trying to add a new node to a list of 100,000 nodes, it takes 78 milliseconds to add the node. This is unacceptable because I could be adding an additional 1,000 nodes at any given time.
So, is there a better way to do this? I am not looking for someone to write the code for me, I am just looking for guidance.
It sounds like you want to:
1. make sure that LatLng overrides Equals/GetHashCode (preferably by implementing the IEquatable<LatLng> interface)
2. stuff all the items directly into a HashSet<LatLng>
For implementing GetHashCode, see here: Why is it important to override GetHashCode when Equals method is overridden?
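A minimal sketch of those equality members, assuming LatLng holds two float coordinates (the property names here are made up):
public class LatLng : IEquatable<LatLng>
{
    public float Lat { get; set; }
    public float Lng { get; set; }

    public bool Equals(LatLng other)
    {
        return other != null && Lat == other.Lat && Lng == other.Lng;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as LatLng);
    }

    public override int GetHashCode()
    {
        // combine both coordinate hashes so equal values hash alike
        return Lat.GetHashCode() * 397 ^ Lng.GetHashCode();
    }
}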
If you need to generate 'artificial' unique IDs in some fashion, I suggest you use the dictionary approach again, but 'in reverse':
// uses the same hash function for speedy lookup/insertion
IDictionary<LatLng, int> idMap = new Dictionary<LatLng, int>();
foreach (LatLng latLng in LatLngCoords)
{
    if (!idMap.ContainsKey(latLng))
        idMap.Add(latLng, idMap.Count + 1); // to start with 1
}
You can have the idMap replace the HashSet<>; the implementation (and performance characteristics) is essentially the same but as an associative container.
Here's a lookup function to get from LatLng to Id:
int IdLookup(LatLng latLng)
{
    int id;
    if (idMap.TryGetValue(latLng, out id))
        return id;
    throw new ArgumentException("Coordinate not in idMap");
}
You could just-in-time add it:
int IdFor(LatLng latLng)
{
    int id;
    if (idMap.TryGetValue(latLng, out id))
        return id;
    id = idMap.Count + 1;
    idMap.Add(latLng, id);
    return id;
}
I'd add a second dictionary for the reverse direction, i.e. Dictionary<Node, int>.
Then you either:
1. are content with reference equality and do nothing,
2. create an IEqualityComparer<Node> and supply it to the dictionary, or
3. override Equals and GetHashCode on Node.
In both cases a good implementation for the hashcode is essential to get good performance.
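A sketch of the second option, assuming the node exposes its latLng through a property (Position is a made-up name) and that latLng itself has value equality:
class NodeEqualityComparer : IEqualityComparer<node>
{
    public bool Equals(node a, node b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a == null || b == null) return false;
        return a.Position.Equals(b.Position);
    }

    public int GetHashCode(node n)
    {
        return n.Position.GetHashCode();   // must be consistent with Equals
    }
}

// usage: var reverse = new Dictionary<node, int>(new NodeEqualityComparer());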
Your solution is not only slow, but also wrong. The order of items in a Dictionary is undefined, so dict.Last() is not guaranteed to return the item that was added last. (Although it may often look that way.)
Using an id to identify an object in your application seems wrong too. You should consider using references to the object directly.
But if you want to keep your current design, and assuming that you compare nodes based on their latLng, you could create two dictionaries: the one you already have and a second one, Dictionary<latLng, int>, that can be used to efficiently find out whether a certain node already exists. And if it does, it gives you its id.
What exactly is the purpose of this code?
if ( dict.ContainsValue( n ) )
{
    foreach ( KeyValuePair<int, node> kv in dict )
    {
        if ( kv.Value == n )
        {
            return kv.Key;
        }
    }
}
ContainsValue searches for a value (instead of a key) and is very inefficient (O(n)). Ditto for the foreach. Worse, you do both when only one is necessary (you could completely remove the ContainsValue call by rearranging your ifs a little)!
You should probably maintain an additional dictionary that is the "reverse" of the original one (i.e. values in the old dictionary are keys in the new one and vice versa) to "cover" your search patterns (similarly to how databases maintain multiple indexes per table to cover the different ways a table can be queried).
You could try using HashSet<T>
You might want to consider restructuring this to just use a List (where the 'key' is just the index into the List) instead of a Dictionary. A few advantages:
Looking up an element by integer key is now O(1) (and a very fast O(1) given that it's just an array dereference internally).
When you insert a new element, you perform an O(n) search to see whether it already exists in the list. If it does not, you've also already traversed the list and can have recorded whether you encountered an entry with a null record. If you have, that index is the new key. If not, the new key is the current list Count. You're enumerating the collection once instead of multiple times and the enumeration itself is much faster than enumerating a Dictionary.

SortedSet and SortedList fail with different enums

The whole story: I have some KeyValuePairs that I need to store in a session, and my primary goal is to keep it small. Therefore I don't have the option of using many different collections. While each key is a value of a different enum type, the value is always a value of the same enum type, MyEnum. I have chosen a Hashtable for this approach, whose content looks like this (there are just many more entries):
// The Key-Value-Pairs
{ EnumTypA.ValueA1, MyEnum.ValueA },
{ EnumTypB.ValueB1, MyEnum.ValueB },
{ EnumTypC.ValueC1, MyEnum.ValueA },
{ EnumTypA.ValueA2, MyEnum.ValueC },
{ EnumTypB.ValueB1, MyEnum.ValueC }
Mostly I run Contains on that Hashtable, but I certainly also need to fetch a value at some point, and I need to loop through all elements. That all works fine, but now I have a new requirement to keep the order in which I added the entries to the Hashtable -> BANG
A Hashtable is a map, and that is not possible!
Now I thought about using a SortedList<object, MyEnum>, or, trading more data for slightly faster lookups, a SortedSet<object> in addition to the Hashtable.
Content below has been edited
The SortedList is implemented as
SortedList<Enum, MyEnum> mySortedList = new SortedList<Enum, MyEnum>();
the SortedSet is implemented as
SortedSet<Enum> mySortedSet = new SortedSet<Enum>();
The described key-value pairs are added to the SortedList with
void AddPair(Enum key, MyEnum value)
{
    mySortedList.Add(key, value);
}
and to the SortedSet like this:
void AddPair(Enum key)
{
    mySortedSet.Add(key);
}
Both are failing with the exception:
Object must be the same type as the enum
My question is: what goes wrong, and how can I achieve my goal?
Used Solution
I've decided to live with the downside of redundant data against slower lookups, and implemented a List<Enum> which retains the insert order in parallel to my already existing Hashtable.
In my case I just have about 50-150 elements, so I decided to benchmark the Hashtable against a List<KeyValuePair<object,object>>.
Therefore I created the following helper to add ContainsKey() to the List<KeyValuePair<object,object>>:
static bool ContainsKey(this List<KeyValuePair<object, object>> list, object key)
{
    foreach (KeyValuePair<object, object> p in list)
    {
        if (p.Key.Equals(key))
            return true;
    }
    return false;
}
I inserted the same 100 entries and, in a loop of 300,000 iterations, checked randomly for one of ten different entries. And... the difference was tiny, so I decided to go with the List<KeyValuePair<object,object>>.
I think you should store your data in an instance of List<KeyValuePair<Enum, MyEnum>> or Dictionary<Enum, MyEnum>.
SortedSet and SortedList are generic, but your keys are EnumTypeA/EnumTypeB, so you need to specify the generic T as their common base class (System.Enum), like:
SortedList<Enum, MyEnum> sorted = new SortedList<Enum, MyEnum>();
EDIT
Why you got this exception
SortedList and SortedSet use a comparer internally to check whether two keys are equal. Comparer<Enum>.Default is used as the comparer if you don't specify one in the constructor. Unfortunately, Comparer<Enum>.Default isn't implemented the way you might expect: it throws this exception if the two enums are not of the same type.
How to resolve the problem
If you don't want to use a List<KeyValuePair<Enum, MyEnum>> and insist on using SortedList, you need to pass a comparer to the constructor, like this:
class EnumComparer : IComparer<Enum>
{
    public int Compare(Enum x, Enum y)
    {
        // order by the underlying numeric value; CompareTo avoids the
        // overflow that subtracting two hash codes could produce
        return Convert.ToInt64(x).CompareTo(Convert.ToInt64(y));
    }
}
var sorted = new SortedList<Enum, MyEnum>(new EnumComparer());
Btw, I think you need to preserve the inserting order? If so, List<KeyValuePair<K,V>> is a better choice, because a SortedList orders by key rather than by insertion, and a SortedSet will additionally reject duplicated items.
