Data structure with unique elements and fast add and remove - c#

I need a data structure with the following properties:
Each element of the structure must be unique.
Add: Adds one element to the data structure unless the element already
exists.
Pop: Removes one element from the data structure and returns the element
removed. It's unimportant which element is removed.
No other operations are required for this structure. A naive implementation with a list will require almost O(1) time for Pop and O(N) time for Add (since the entire list must be checked to ensure
uniqueness). I am currently using a red-black tree to fulfill the needs of this data structure, but I am wondering if I can use something less complicated to achieve almost the same performance.
I prefer answers in C#, but Java, Javascript, and C++ are also acceptable.
My question is similar to this question, however I have no need to lookup or remove the maximum or minimum value (or indeed any particular kind of value), so I was hoping there would be improvements in that respect. If any of the structures in that question are appropriate here, however, let me know.
So, what data structure allows only unique elements, supports fast add and remove, and is less complicated than a red-black tree?

What about the built-in HashSet<T>?
It contains only unique elements. Remove (pop) is O(1) and Add is O(1) unless the internal array must be resized.

As said by Meta-Knight, a HashSet is the fastest data structure to do exactly that. Lookups and removals take constant O(1) time (except in rare cases when your hash sucks and then you require multiple rehashes or you use a bucket hashset). All operations on a hashset take O(1) time, the only drawback is that it requires more memory because the hash is used as an index into an array (or other allocated block of memory). So unless you're REALLY strict on memory then go with HashSet. I'm only explaining the reason why you should go with this approach and you should accept Meta-Knights answer as his was first.
Using hashes is OK because usually you override the HashCode() and Equals() functions. What the HashSet does internally is generate the hash, then if it is equal check for equality (just in case of hash collisions). If they are not it must call a method to do something called rehashing which generates a new hash which is usually at an odd prime offset from the original hash (not sure if .NET does this but other languages do) and repeats the process as necessary.

Remove a random element is quite easy from an hashset or a dictionary.
Everything is averaged O(1), that in real world means O(1).
Example:
public class MyNode
{
...
}
public class MyDataStructure
{
private HashSet<MyNode> nodes = new HashSet<MyNode>();
/// <summary>
/// Inserts an element to this data structure.
/// If the element already exists, returns false.
/// Complexity is averaged O(1).
/// </summary>
public bool Add(MyNode node)
{
return node != null && this.nodes.Add(node);
}
/// <summary>
/// Removes a random element from the data structure.
/// Returns the element if an element was found.
/// Returns null if the data structure is empty.
/// Complexity is averaged O(1).
/// </summary>
public MyNode Pop()
{
// This loop can execute 1 or 0 times.
foreach (MyNode node in nodes)
{
this.nodes.Remove(node);
return node;
}
return null;
}
}
Almost everything that can be compared can also be hashed :) in my experience.
I would like to know if there is someone that know something that cannot be hashed.
To my experience this applies also to some floating point comparations with tolerance using special techniques.
An hash function for an hash table don't need to be perfect, it just need to be good enough.
Also if your data is very complicated usually hash functions are less complicated than red black trees or avl trees.
They are useful because they keep things ordered, but you don't need this.
To show how to do a simple hashset i will consider a simple dictionary with integer keys.
This implementation is very fast and very good for sparse arrays for examples.
I didn't write the code to grow the bucket table, because it is annoying and usually a source of big bugs, but since this is a proof of concept, it should be enough.
I didn't write iterator neither.
I wrote it by scratch, there may be bugs.
public class FixedIntDictionary<T>
{
// Our internal node structure.
// We use structs instead of objects to not add pressure to the garbage collector.
// We mantains our own way to manage garbage through the use of a free list.
private struct Entry
{
// The key of the node
internal int Key;
// Next index in pEntries array.
// This field is both used in the free list, if node was removed
// or in the table, if node was inserted.
// -1 means null.
internal int Next;
// The value of the node.
internal T Value;
}
// The actual hash table. Contains indices to pEntries array.
// The hash table can be seen as an array of singlt linked list.
// We store indices to pEntries array instead of objects for performance
// and to avoid pressure to the garbage collector.
// An index -1 means null.
private int[] pBuckets;
// This array contains the memory for the nodes of the dictionary.
private Entry[] pEntries;
// This is the first node of a singly linked list of free nodes.
// This data structure is called the FreeList and we use it to
// reuse removed nodes instead of allocating new ones.
private int pFirstFreeEntry;
// Contains simply the number of items in this dictionary.
private int pCount;
// Contains the number of used entries (both in the dictionary or in the free list) in pEntries array.
// This field is going only to grow with insertions.
private int pEntriesCount;
///<summary>
/// Creates a new FixedIntDictionary.
/// tableBucketsCount should be a prime number
/// greater than the number of items that this
/// dictionary should store.
/// The performance of this hash table will be very bad
/// if you don't follow this rule!
/// </summary>
public FixedIntDictionary<T>(int tableBucketsCount)
{
// Our free list is initially empty.
this.pFirstFreeEntry = -1;
// Initializes the entries array with a minimal amount of items.
this.pEntries = new Entry[8];
// Allocate buckets and initialize every linked list as empty.
int[] buckets = new int[capacity];
for (int i = 0; i < buckets.Length; ++i)
buckets[i] = -1;
this.pBuckets = buckets;
}
///<summary>Gets the number of items in this dictionary. Complexity is O(1).</summary>
public int Count
{
get { return this.pCount; }
}
///<summary>
/// Adds a key value pair to the dictionary.
/// Complexity is averaged O(1).
/// Returns false if the key already exists.
/// </summary>
public bool Add(int key, T value)
{
// The hash table can be seen as an array of linked list.
// We find the right linked list using hash codes.
// Since the hash code of an integer is the integer itself, we have a perfect hash.
// After we get the hash code we need to remove the sign of it.
// To do that in a fast way we and it with 0x7FFFFFFF, that means, we remove the sign bit.
// Then we have to do the modulus of the found hash code with the size of our buckets array.
// For this reason the size of our bucket array should be a prime number,
// this because the more big is the prime number, the less is the chance to find an
// hash code that is divisible for that number. This reduces collisions.
// This implementation will not grow the buckets table when needed, this is the major
// problem with this implementation.
// Growing requires a little more code that i don't want to write now
// (we need a function that finds prime numbers, and it should be fast and we
// need to rehash everything using the new buckets array).
int bucketIndex = (key & 0x7FFFFFFF) % this.pBuckets.Length;
int bucket = this.pBuckets[bucketIndex];
// Now we iterate in the linked list of nodes.
// Since this is an hash table we hope these lists are very small.
// If the number of buckets is good and the hash function is good this will translate usually
// in a O(1) operation.
Entry[] entries = this.pEntries;
for (int current = entries[bucket]; current != -1; current = entries[current].Next)
{
if (entries[current].Key == key)
{
// Entry already exists.
return false;
}
}
// Ok, key not found, we can add the new key and value pair.
int entry = this.pFirstFreeEntry;
if (entry != -1)
{
// We found a deleted node in the free list.
// We can use that node without "allocating" another one.
this.pFirstFreeEntry = entries[entry].Next;
}
else
{
// Mhhh ok, the free list is empty, we need to allocate a new node.
// First we try to use an unused node from the array.
entry = this.pEntriesCount++;
if (entry >= this.pEntries)
{
// Mhhh ok, the entries array is full, we need to make it bigger.
// Here should go also the code for growing the bucket table, but i'm not writing it here.
Array.Resize(ref this.pEntries, this.pEntriesCount * 2);
entries = this.pEntries;
}
}
// Ok now we can add our item.
// We just overwrite key and value in the struct stored in entries array.
entries[entry].Key = key;
entries[entry].Value = value;
// Now we add the entry in the right linked list of the table.
entries[entry].Next = this.pBuckets[bucketIndex];
this.pBuckets[bucketIndex] = entry;
// Increments total number of items.
++this.pCount;
return true;
}
/// <summary>
/// Gets a value that indicates wether the specified key exists or not in this table.
/// Complexity is averaged O(1).
/// </summary>
public bool Contains(int key)
{
// This translate in a simple linear search in the linked list for the right bucket.
// The operation, if array size is well balanced and hash function is good, will be almost O(1).
int bucket = this.pBuckets[(key & 0x7FFFFFFF) % this.pBuckets.Length];
Entry[] entries = this.pEntries;
for (int current = entries[bucket]; current != -1; current = entries[current].Next)
{
if (entries[current].Key == key)
{
return true;
}
}
return false;
}
/// <summary>
/// Removes the specified item from the dictionary.
/// Returns true if item was found and removed, false if item doesn't exists.
/// Complexity is averaged O(1).
/// </summary>
public bool Remove(int key)
{
// Removal translate in a simple contains and removal from a singly linked list.
// Quite simple.
int bucketIndex = (key & 0x7FFFFFFF) % this.pBuckets.Length;
int bucket = this.pBuckets[bucketIndex];
Entry[] entries = this.pEntries;
int next;
int prev = -1;
int current = entries[bucket];
while (current != -1)
{
next = entries[current].Next;
if (entries[current].Key == key)
{
// Found! Remove from linked list.
if (prev != -1)
entries[prev].Next = next;
else
this.pBuckets[bucketIndex] = next;
// We now add the removed node to the free list,
// so we can use it later if we add new elements.
entries[current].Next = this.pFirstFreeEntry;
this.pFirstFreeEntry = current;
// Decrements total number of items.
--this.pCount;
return true;
}
prev = current;
current = next;
}
return false;
}
}
If you wander if this implementation is good or not, is a very similar implementation of what the .NET framework do for Dictionary class :)
To make it an hashset, just remove the T and you have an hashset of integers.
If you need to get hashcodes for generic objects, just use x.GetHashCode or provide your hash code function.
To write iterators you need to modify several things, but don't want to add too much other things in this post :)

Related

Grouping unique integer arrays

One module in my app generates a small array of integers. Typically the size is 25 integers. The integers tend to be pretty small, less than 10000. I'll like to save all the unique arrays in a container of some sort. The number of arrays generated can be in the millions.
So, for every new array I need to figure out if it already exits. And if it does what's the index.
A naive approach is to keep all arrays in a list and then just call:
MyList.FindIndex(x=>x.SequenceEqual(Small_Array));
But this becomes very slow if the number of arrays is getting into the thousands.
A less naive approach is to store all arrays in a dictionary where the key is a hash value from the array. If the hash is just another integer (32bit) than I have cannot find a good hashing algorithm which doesn't collides.
Which, I think leaves me to using a hashing algorithm like MD5 that can be converted into a 128bit integer. Is that a good way to tackle my problem?
Rather than make the key the hash, make it the array itself - with a custom comparer. The value would be the notional "index".
The comparer doesn't need to be hugely efficient, nor does the hash generation need to go to great length to avoid duplicates, so long as there aren't too many collisions. (You should potentially add logging to check that.) Here's a really simple start:
public class Int32ArrayEqualityComparer : IEqualityComparer<int[]>
{
// Note: SequenceEqual already checks the count before looking at content.
public bool Equals(int[] first, int[] second) =>
first.SequenceEqual(second);
public int GetHashCode(int[] array)
{
unchecked
{
int hash = 23;
foreach (var item in array)
{
hash = hash * 31 + item;
}
return hash;
}
}
}
You'd then create the dictionary like this:
var arrayMap = new Dictionary<int[], int>(new Int32ArrayEqualityComparer());
Then you'd have something like:
public int MaybeAddArray(int[] array)
{
if (!arrayMap.TryGetValue(array, out var index))
{
index = arrayMap.Count + 1;
arrayMap[array] = index;
}
return index;
}
Note that ConcurrentDictionary has simpler ways of doing this. Also note that the "index" is somewhat artificial here. You may not even need this, depending on what you're doing.

Why does ImmutableArray.Create copy an existing immutable array?

I am trying to make a slice of an existing ImmutableArray<T> in a method and thought I could use the construction method Create<T>(ImmutableArray<T> a, int offset, int count)like so:
var arr = ImmutableArray.Create('A', 'B', 'C', 'D');
var bc ImmutableArray.Create(arr, 1, 2);
I was hoping my two ImmutableArrays could share the underlying array here. But
when double checking this I see that the implementation doesn't:
From corefx/../ImmutableArray.cs at github at 15757a8 18 Mar 2017
/// <summary>
/// Initializes a new instance of the <see cref="ImmutableArray{T}"/> struct.
/// </summary>
/// <param name="items">The array to initialize the array with.
/// The selected array segment may be copied into a new array.</param>
/// <param name="start">The index of the first element in the source array to include in the resulting array.</param>
/// <param name="length">The number of elements from the source array to include in the resulting array.</param>
/// <remarks>
/// This overload allows helper methods or custom builder classes to efficiently avoid paying a redundant
/// tax for copying an array when the new array is a segment of an existing array.
/// </remarks>
[Pure]
public static ImmutableArray<T> Create<T>(ImmutableArray<T> items, int start, int length)
{
Requires.Range(start >= 0 && start <= items.Length, nameof(start));
Requires.Range(length >= 0 && start + length <= items.Length, nameof(length));
if (length == 0)
{
return Create<T>();
}
if (start == 0 && length == items.Length)
{
return items;
}
var array = new T[length];
Array.Copy(items.array, start, array, 0, length);
return new ImmutableArray<T>(array);
}
Why does it have to make the copy of the underlying items here?
And isn't the documentation for this method misleading/wrong? It says
This overload allows helper methods or custom builder classes to
efficiently avoid paying a redundant tax for copying an array when the
new array is a segment of an existing array.
But the segment case is exactly when it copies, and it only avoids the copy if the desired slice is empty or the whole input array.
Is there another way of accomplishing what I want, short of implementing some kind of ImmutableArraySpan?
I'm going to answer my own question with the aid of the comments:
An ImmutableArray can't represent a slice of the underlying array because it doesn't have the fields for it - and obviously adding 64/128 bits of range fields that are only rarely used would be too wasteful.
So the only possibility is to have a proper Slice/Span struct, and there isn't one at the moment apart from ArraySegment (which can't use ImmutableArray as backing data).
It's probably easy to write an ImmutableArraySegment implementing IReadOnlyList<T> etc so will probably be the solution here.
Regarding the documentation - it's as correct as it can be, it avoids the few copies it can (all, none) and copies otherwise.
There are new APIs with the new Span and ReadonlySpan types which will ship with the magical language and runtime features for low level code (ref returns/locals).The types are actually already shipping as part of the System.Memory nuget package, but until they are integrated there will be no way of using them to solve the problem of slicing an ImmutableArray which requires this method on ImmutableArray (which is in System.Collections.Immutable that doesn't depend on System.Memory types, yet)
public ReadonlySpan<T> Slice(int start, int count)
I'm guessing/hoping such APIs will come once the types are in place.

C# data structure, list which can dynamically resize up to a given limit, and allows fast access to any index

I'm implementing a memory system for an AI agent. It needs to have an internal list of state transitions which is capped at some number, say 10000.
If at capacity, adding a new memory should automatically remove the oldest memory.
Importantly, I should also need to be able to quickly access any item in this list.
A wrapper for Queue at first seemed obvious, but Queue does not allow fast access of any element. (O(n))
Similarly, remove an item from the beginning of a List structure takes O(n).
LinkedLists allow fast additions and removals, but again do not allow quick access to every index.
An array would allow random access but obviously it's not dynamically resizeable and deletion is problematic.
I've seen a HashMap being suggested but I'm ensure how that might be implemented.
Suggestions?
If you want the queue to be a fixed length, you could use a circular buffer which enables O(1) enqueue, dequeue and indexing operations and automatically overwrites old entries when the queue is full.
Try using a Dictionary with a LinkedList. The keys of the Dictionary are the indexes of the LinkedList nodes and the values of the Dictionary are of type LinkedListNode; that is, the LinkedList nodes.
The Dictionary would give you almost an O(1) on its operations and removing/adding LinkedListNode(s) to the beginning or end of a LinkedList is of O(1) as well.
Another alternative is to use a HashTable. However, in this case you have to know the capacity of the table beforehand (See Hashtable.Add Method) in order to get the O(1) performance:
If Count is less than the capacity of the Hashtable, this method is an O(1) operation. If the capacity needs to be increased to accommodate the new element, this method becomes an O(n) operation, where n is Count.
In the first solution, no matter what's the capcity of the LinkedList or the Dictionary you would still get almost an O(1) from both the Dictionary and the LinkedList. Of course that's going to be an O(3) or O(4) depending on the total number of operations that you perform on both the Dictionary and the LinkedList to do an add or remove operation inside your memory class. The search access is going to be always an O(1) because you will be using the Dictionary only.
HashMap is for Java, so the closest equivalent is Dictionary. C# Java HashMap equivalent. But I wouldn't say that this is the ultimate answer.
If you implement it as Dictionary, which key == the content, then you can search the content with O(1). However, you cannot have same key. Also, because it is not ordered, you may not know which the 1st content is.
If you implement it as Dictionary, which key == index, and value == the content, searching for the content still takes O(n) because you don't know the location of content.
A List or an Array will cost O(1) if you search the content by index reference. So, please double check your statement that it takes O(n)
If you search by index is sufficient, then circular array/ buffer which #Lee mentioned is good enough.
Otherwise, similar to DB, you might want to maintain in 2 separate data: 1 for storing the data (Circular Array) and the other one for search (Hash).
EDIT: #Lee has it right. A circular buffer seems to give you what you want. Answer left in place though.
I think the data structure you want might be a priority queue -- it depends on what you mean by 'quickly access any item'. If you mean 'able to enumerate all items in O(N)', then a priority queue fits the bill. If you mean 'enumerate the list in historical order', then it won't.
Assuming you need these operations;
add an item and associate with a time
remove the oldest item
enumerate all existing items in arbitrary order
Then you could easily extend this priority queue implementation I wrote a little while ago.
You'll want implement IEnumerable as a loop through the T[] data array from 0 to cursor. This will give you your enumeration.
Implement a GetItem(i) function which returns this.data[i] so long as i <= cursor.
Implement an automatic size limit by putting this into the Push() method;
if (queue.Size => 10000) {
queue.Pop();
}
I think this is O(ln n) for push and pop, and O(N) to enumerate ALL items, or O(i) to find ANY item, so long as you don't need them in order.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace Mindfire.DataStructures
{
public class PiorityQueue<T>
{
private int[] priorities;
private T[] data;
private int cursor;
private int capacity;
public int Size
{
get
{
return cursor+1;
}
}
public PiorityQueue(int capacity)
{
this.cursor = -1;
this.capacity = capacity;
this.priorities = new int[this.capacity];
this.data = new T[this.capacity];
}
public T Pop()
{
if (this.Size == 0)
{
throw new InvalidOperationException($"The {this.GetType().Name} is Empty");
}
var result = this.data[0];
this.data[0] = this.data[cursor];
this.priorities[0] = this.priorities[cursor];
this.cursor--;
var loc = 0;
while (true)
{
var l = loc * 2;
var r = loc * 2 + 1;
var leftIsBigger = l <= cursor && this.priorities[loc] < this.priorities[l];
var rightIsBigger = r <= cursor && this.priorities[loc] < this.priorities[r];
if (leftIsBigger)
{
Swap(loc, l);
loc = l;
}
else if (rightIsBigger)
{
Swap(loc, r);
loc = r;
}
else
{
break;
}
}
return result;
}
public void Push(int priority, T v)
{
this.cursor++;
if (this.cursor == this.capacity)
{
Resize(this.capacity * 2);
};
this.data[this.cursor] = v;
this.priorities[this.cursor] = priority;
var loc = (this.cursor -1)/ 2;
while (this.priorities[loc] < this.priorities[cursor])
{
// swap
this.Swap(loc, cursor);
}
}
private void Swap(int a, int b)
{
if (a == b) { return; }
var data = this.data[b];
var priority = this.priorities[b];
this.data[b] = this.data[a];
this.priorities[b] = this.priorities[a];
this.priorities[a] = priority;
this.data[a] = data;
}
private void Resize(int newCapacity)
{
var newPriorities = new int[newCapacity];
var newData = new T[newCapacity];
this.priorities.CopyTo(newPriorities, 0);
this.data.CopyTo(newData, 0);
this.data = newData;
this.priorities = newPriorities;
this.capacity = newCapacity;
}
public PiorityQueue() : this(1)
{
}
public T Peek()
{
if (this.cursor > 0)
{
return this.data[0];
}
else
{
return default(T);
}
}
public void Push(T item, int priority)
{
}
}
}

C# Time complexity of Array[T].Contains(T item) vs HashSet<T>.Contains(T item)

HashSet(T).Contains(T) (inherited from ICollection<T>.Contains(T)) has a time complexity of O(1).
So, I'm wondering what the complexity of a class member array containing integers would be as I strive to achieve O(1) and don't need the existence checks of HashSet(T).Add(T).
Since built-in types are not shown in the .NET reference source, I have no chance of finding found the array implementation of IList(T).Contains(T).
Any (further) reading material or reference would be very much appreciated.
You can see source code of Array with any reflector (maybe online too, didn't check). IList.Contains is just:
Array.IndexOf(this,value) >= this.GetLowerBound(0);
And Array.IndexOf calls Array.IndexOf<T>, which, after a bunch of consistency checks, redirects to
EqualityComparer<T>.Default.IndexOf(array, value, startIndex, count)
And that one finally does:
int num = startIndex + count;
for (int index = startIndex; index < num; ++index)
{
if (this.Equals(array[index], value))
return index;
}
return -1;
So just loops over array with average complexity O(N). Of course that was obvious from the beginning, but just to provide some more evidence.
Array source code for the .Net Framework (up to v4.8) is available in reference source, and can be de-compiled using ILSpy.
In reference source, you find at line 2753 then 2809:
// -----------------------------------------------------------
// ------- Implement ICollection<T> interface methods --------
// -----------------------------------------------------------
...
[SecuritySafeCritical]
bool Contains<T>(T value) {
//! Warning: "this" is an array, not an SZArrayHelper. See comments above
//! or you may introduce a security hole!
T[] _this = JitHelpers.UnsafeCast<T[]>(this);
return Array.IndexOf(_this, value) != -1;
}
And IndexOf ends up on this IndexOf which is a O(n) algorithm.
internal virtual int IndexOf(T[] array, T value, int startIndex, int count)
{
int endIndex = startIndex + count;
for (int i = startIndex; i < endIndex; i++) {
if (Equals(array[i], value)) return i;
}
return -1;
}
Those methods are on a special class SZArrayHelper in same source file, and as explained at line 2721, this is the implementation your are looking for.
// This class is needed to allow an SZ array of type T[] to expose IList<T>,
// IList<T.BaseType>, etc., etc. all the way up to IList<Object>. When the following call is
// made:
//
// ((IList<T>) (new U[n])).SomeIListMethod()
//
// the interface stub dispatcher treats this as a special case, loads up SZArrayHelper,
// finds the corresponding generic method (matched simply by method name), instantiates
// it for type <T> and executes it.
About achieving O(1) complexity, you should convert it to a HashSet:
var lookupHashSet = new HashSet<T>(yourArray);
...
var hasValue = lookupHashSet.Contains(testValue);
Of course, this conversion is an O(n) operation. If you do not have many lookup to do, it is moot.
Note from documentation on this constructor:
If collection contains duplicates, the set will contain one of each unique element. No exception will be thrown. Therefore, the size of the resulting set is not identical to the size of collection.
You actually can see the source for List<T>, but you need to look it up online. Here's one source.
Any pure list/array bool Contains(T item) check is O(N) complexity, because each element needs to be checked. .NET is no exception. (If you designed a data structure that manifested as a list but also contained a bloom filter helper data structure, that would be another story.)

More efficient way to build sum than for loop

I have two lists with equal size. Both contain numbers. The first list is generated and the second one is static. Since I have many of the generated lists, I want to find out which one is the best. For me the best list is the one which is most equal to the reference. Therefore I calculate the difference at each position and add it up.
Here is the code:
/// <summary>
/// Calculates a measure based on that the quality of a match can be evaluated
/// </summary>
/// <param name="Combination"></param>
/// <param name="histDates"></param>
/// <returns>fitting value</returns>
private static decimal getMatchFitting(IList<decimal> combination, IList<MyClass> histDates)
{
decimal fitting = 0;
if (combination.Count != histDates.Count)
{
return decimal.MaxValue;
}
//loop through all values, compare and add up the result
for (int i = 0; i < combination.Count; i++)
{
fitting += Math.Abs(combination[i] - histDates[i].Value);
}
return fitting;
}
Is there possibly a more elegant but more important and more efficient way to get the desired sum?
Thanks in advance!
You can do the same with LINQ as follows:
return histDates.Zip(combination, (x, y) => Math.Abs(x.Value - y)).Sum();
This could be considered more elegant, but it cannot be more efficient that what you already have. It can also work with any type of IEnumerable (so you don't need specifically an IList), but that does not have any practical importance in your situation.
You can also reject a histDates as soon as the running sum of differences becomes larger than the smallest sum seen so far if you have this information at hand.
This is possible without using lists. Instead of filling your two lists you just want to have the sum of each values for a single list, e.g. IList combination becomes int combinationSum.
Do the same for histDates list.
Then substract those two values. No loop in this case is needed.
you can do more elegant with LINQ but it will not be more efficient... if you can calculate the sums while adding the items to the list you might get an edge...
I don't think I want to garantee any direct improvment of efficiancy as I can't test it right now but this at least looks nicer:
if (combination.Count != histDates.Count)
return decimal.MaxValue;
return combination.Select((t, i) => Math.Abs(t - histDates[i].Value)).Sum();

Categories

Resources