Fastest way to store and retrieve value by two integer keys - c#

I need to store and retrieve values by two integer keys as fast as possible.
So I have input values uint Id1, uint Id2 and need to get uint Count.
I also know the max value of Id1 and Id2 (it is about 5 000 000).
My current implementation accounts for about 70% of the application's run time, which can be a few days.
It just uses standard .NET dictionaries and can of course be improved. I assume this is a common operation in computer science and that more efficient algorithms exist.
Here is my implementation:
void Main()
{
var rep = new Repository();
var sw = new Stopwatch();
sw.Start();
for (uint i = 0; i < 10000; i++)
{
for (uint j = 0; j < 1000; j++)
{
rep.Add(new DomainEntity(){Id1 = i, Id2 = j, Count = 1});
}
}
for (uint i = 0; i < 10000; i++)
{
for (uint j = 0; j < 1000; j++)
{
rep.GetDomainEntityByIds(i,j);
}
}
sw.Stop();
Console.WriteLine ("Elapsed:{0}", sw.Elapsed);
}
public class Repository
{
private readonly Dictionary<Tuple<UInt32, UInt32>, UInt32> _dictStore;
public Repository()
{
_dictStore = new Dictionary<Tuple<uint, uint>, uint>();
}
public uint Add(DomainEntity item)
{
var entry = MapToTableEntry(item);
_dictStore.Add(entry.Key,entry.Value);
return 0;
}
public void Update(DomainEntity item)
{
var entry = MapToTableEntry(item);
_dictStore[entry.Key] = entry.Value;
}
public IEnumerable<DomainEntity> GetAllItems()
{
return _dictStore.Select(MapToDomainEntity);
}
public DomainEntity GetDomainEntityByIds(uint articleId1, uint articleId2)
{
var tuple = new Tuple<uint, uint>(articleId1, articleId2);
if (_dictStore.ContainsKey(tuple))
{
return MapToDomainEntity(new KeyValuePair<Tuple<uint, uint>, uint>(tuple, _dictStore[tuple]));
}
return null;
}
private KeyValuePair<Tuple<uint, uint>, uint> MapToTableEntry(DomainEntity item)
{
return new KeyValuePair<Tuple<uint, uint>, uint>(new Tuple<uint, uint>(item.Id1,item.Id2), item.Count);
}
private DomainEntity MapToDomainEntity(KeyValuePair<Tuple<uint, uint>, uint> entry)
{
return new DomainEntity
{
Id1 = entry.Key.Item1,
Id2 = entry.Key.Item2,
Count = entry.Value,
};
}
}
public class DomainEntity
{
public uint Id1 { get; set; }
public uint Id2 { get; set; }
public uint Count { get; set; }
}

One minor(?) improvement: you can use TryGetValue to avoid looking up the dictionary twice:
public DomainEntity GetDomainEntityByIds(uint articleId1, uint articleId2)
{
var tuple = new Tuple<uint, uint>(articleId1, articleId2);
uint value;
if (_dictStore.TryGetValue(tuple, out value))
{
return MapToDomainEntity(new KeyValuePair<Tuple<uint, uint>, uint>(tuple, value));
}
return null;
}

What you want to do is create an efficient dictionary using an efficient key and hash. Since the dictionary always uses a 32-bit hash code and you have around 45 bits of key data, the hash can't be unique, but you should make it as good as you can.
Always use TryGetValue() rather than a double lookup.
When using dictionaries with value type keys, use a custom IEqualityComparer passed as the argument to the dictionary constructor.
Use a custom hash code to try to squeeze the maximum amount of information from the subkeys into the 32 bit hash.
Example:
public class Storage
{
private Dictionary<Key, DomainObject> dict;
public Storage()
{
dict = new Dictionary<Key, DomainObject>(Key.Comparer.Instance);
}
public DomainObject Get(uint a, uint b)
{
DomainObject obj;
dict.TryGetValue(new Key(a,b), out obj);
return obj;
}
internal struct Key
{
internal readonly uint a;
internal readonly uint b;
public Key(uint a, uint b)
{
this.a = a;
this.b = b;
}
internal class Comparer : IEqualityComparer<Key>
{
internal static readonly Comparer Instance = new Comparer();
private Comparer(){}
public bool Equals(Key x, Key y)
{
return x.a == y.a && x.b == y.b;
}
public int GetHashCode(Key x)
{
return (int)(((x.a & 0xffff) << 16) | (x.b & 0xffff));
}
}
}
}
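The example above keeps only the low 16 bits of each subkey, so ids that differ only in their upper bits land in the same bucket. A minimal sketch of an alternative, assuming a hypothetical struct named PairKey that implements IEquatable so the default comparer can be used without boxing. The multiplier is an illustrative choice, not a tuned one, and the hash is still not unique:

```csharp
using System;

// Hypothetical key type (not from the answers above): two uint ids as one value-type key.
public struct PairKey : IEquatable<PairKey>
{
    public readonly uint Id1;
    public readonly uint Id2;

    public PairKey(uint id1, uint id2)
    {
        Id1 = id1;
        Id2 = id2;
    }

    // Strongly typed equality; the default comparer picks this up, so no boxing occurs.
    public bool Equals(PairKey other)
    {
        return Id1 == other.Id1 && Id2 == other.Id2;
    }

    public override bool Equals(object obj)
    {
        return obj is PairKey && Equals((PairKey)obj);
    }

    public override int GetHashCode()
    {
        // Multiplying by a large odd constant spreads Id1 over all 32 bits
        // before Id2 is mixed in, so every bit of both ids affects the result.
        return unchecked((int)((Id1 * 2654435761u) ^ Id2));
    }
}
```

It would be used as the key of an ordinary Dictionary<PairKey, uint>, looked up with TryGetValue(new PairKey(id1, id2), out count). Whether it beats the 16-bit version depends on how your real ids are distributed, so measure with your own data.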

You're doing a lot of extra work in there, converting to and from KeyValuePair. Also, DomainEntity is a reference type, so you probably should just store references to those in the dictionary rather than having to create them from the key and value every time you look one up.
Create your dictionary as:
var _dictStore = new Dictionary<Tuple<uint, uint>, DomainEntity>();
Then:
public uint Add(DomainEntity item)
{
var key = new Tuple<uint, uint>(item.Id1, item.Id2);
_dictStore.Add(key, item);
return 0;
}
And lookup:
public DomainEntity GetDomainEntityByIds(uint articleId1, uint articleId2)
{
var key = new Tuple<uint, uint>(articleId1, articleId2);
DomainEntity value;
if (!_dictStore.TryGetValue(key, out value))
{
value = null;
}
return value;
}

Related

Get Type in QuickInfoSource

I have a class like this.
public static class MyTestClass
{
[MyCustomAttribute("MoreInformations")]
public static string MyProperty => "Sample";
}
I can use the class like this.
public static void Main()
{
var myTest = MyTestClass.MyProperty;
}
Now I have created a QuickInfoSource, and I can get the text "MyTestClass.MyProperty" when I hover over "MyProperty". But I want to get the Type "MyTestClass" so that I can read the custom attribute of "MyProperty".
Does anybody know how to get the Type?
Here is my experimental code for the "QuickInfoSource" class.
internal class TestQuickInfoSource : IAsyncQuickInfoSource
{
private TestQuickInfoSourceProvider m_provider;
private ITextBuffer m_subjectBuffer;
private Dictionary<string, string> m_dictionary;
public TestQuickInfoSource(TestQuickInfoSourceProvider provider, ITextBuffer subjectBuffer)
{
m_provider = provider;
m_subjectBuffer = subjectBuffer;
//these are the method names and their descriptions
m_dictionary = new Dictionary<string, string>();
m_dictionary.Add("add", "int add(int firstInt, int secondInt)\nAdds one integer to another.");
m_dictionary.Add("subtract", "int subtract(int firstInt, int secondInt)\nSubtracts one integer from another.");
m_dictionary.Add("multiply", "int multiply(int firstInt, int secondInt)\nMultiplies one integer by another.");
m_dictionary.Add("divide", "int divide(int firstInt, int secondInt)\nDivides one integer by another.");
}
public async Task<QuickInfoItem> GetQuickInfoItemAsync(IAsyncQuickInfoSession session, CancellationToken cancellationToken)
{
// Map the trigger point down to our buffer.
SnapshotPoint? subjectTriggerPoint = session.GetTriggerPoint(m_subjectBuffer.CurrentSnapshot);
if (!subjectTriggerPoint.HasValue)
{
return null;
}
ITextSnapshot currentSnapshot = subjectTriggerPoint.Value.Snapshot;
SnapshotSpan querySpan = new SnapshotSpan(subjectTriggerPoint.Value, 0);
//look for occurrences of our QuickInfo words in the span
ITextStructureNavigator navigator = m_provider.NavigatorService.GetTextStructureNavigator(m_subjectBuffer);
TextExtent extent = navigator.GetExtentOfWord(subjectTriggerPoint.Value);
SnapshotSpan span = navigator.GetSpanOfPreviousSibling(querySpan);
string searchText = extent.Span.GetText();
string searchText2 = span.GetText();
foreach (string key in m_dictionary.Keys)
{
int foundIndex = searchText.IndexOf(key, StringComparison.CurrentCultureIgnoreCase);
if (foundIndex > -1)
{
string value;
m_dictionary.TryGetValue(key, out value);
return new QuickInfoItem(session.ApplicableToSpan, value ?? string.Empty);
}
}
return null;
}
private bool m_isDisposed;
public void Dispose()
{
if (!m_isDisposed)
{
GC.SuppressFinalize(this);
m_isDisposed = true;
}
}
}

Int as array representation

I need an int array, from an int value.
The int value 123456 converts to int[] {1,2,3,4,5,6}.
Is there any better solution than this:
using System.Diagnostics;
namespace test
{
#if DEBUG
[DebuggerDisplay("{GetDebuggerDisplay()}")]
#endif
public class IntArray
{
#if DEBUG
[DebuggerBrowsable(DebuggerBrowsableState.Never)]
#endif
private int _value;
#if DEBUG
[DebuggerBrowsableAttribute(DebuggerBrowsableState.Never)]
#endif
private int[] _valueArray;
public IntArray(int intValue)
{
Value = intValue;
}
public int Value
{
get { return _value; }
set
{
_value = value;
_valueArray = null;
_valueArray = CreateIntArray(value);
}
}
public int[] Array
{
get { return _valueArray; }
}
private string GetDebuggerDisplay()
{
return string.Format("Value = {0}", Value);
}
private static int[] CreateIntArray(int value)
{
string s = value.ToString();
var intArray = new int[s.Length];
for (int i = 0; i < s.Length; i++)
intArray[i] = int.Parse(s[i].ToString());
return intArray;
}
}
}
Any help and criticism would be appreciated.
You can do it as follows using LINQ. This only covers creating the array from the int value.
var arrayOfInts = myint.ToString().Select(i => int.Parse(i.ToString())).ToArray();
EDIT :
This can also be made an extension method on int if you want to use it often.
public static class IntExtensions
{
public static int[] ToArray(this int i)
{
return i.ToString().Select(c => int.Parse(c.ToString())).ToArray();
}
}
Then you can use this extension by doing this :
var myArray = 123456.ToArray();
You can convert the int to a string, then use LINQ to convert each character to an integer and return an array of integers using .ToArray():
int a = 123456;
string tempA = a.ToString();
int[] temp = tempA.Select(r => Convert.ToInt32(r.ToString())).ToArray();
EDIT:
As per Styxxy comment:
int a = 123456;
int[] array = new int[a.ToString().Length];
int i = array.Length - 1;
while (a > 0)
{
array[i--] = a % 10;
a = a / 10;
}
Another approach:
public static int[] GetInts(this int value)
{
if (value == 0)
return new int[] { 0 };
else
{
int val = value;
List<int> values = new List<int>();
while (Math.Abs(val) >= 1)
{
values.Add(Math.Abs(val % 10));
val = val / 10;
}
values.Reverse();
return values.ToArray();
}
}
and use it:
int value = 123456;
int[] values = value.GetInts();
Edit: improved to work with negative numbers and zero
var res = 123456.ToString().Select(c => Int32.Parse(c.ToString())).ToArray();
Another way using char.GetNumericValue:
int[] ints = 123456.ToString().Select(c => (int)char.GetNumericValue(c)).ToArray();
or without Linq:
var chars = 123456.ToString();
int[] ints = new int[chars.Length];
for (int i = 0; i < chars.Length; i++)
ints[i] = (int)char.GetNumericValue(chars[i]);
As said in the comments, it is better to use basic arithmetic operations, rather than converting to a string, looping through a string and parsing strings to integers.
Here is an example (I made an extension method for an integer):
static class IntegerExtensions
{
public static int[] ToCypherArray(this int value)
{
var cyphers = new List<int>();
do
{
cyphers.Add(value % 10);
value = value / 10;
} while (value != 0);
cyphers.Reverse();
return cyphers.ToArray();
}
}
class Program
{
static void Main(string[] args)
{
int myNumber = 123456789;
int[] cypherArray = myNumber.ToCypherArray();
Array.ForEach(cypherArray, (i) => Console.WriteLine(i));
Console.ReadLine();
}
}
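The arithmetic versions above either build a List and reverse it or call ToString to size the array, and they differ in how they treat negative numbers. A minimal sketch that counts the digits first and fills the array from the end, assuming you want the sign dropped. DigitHelper and ToDigits are hypothetical names:

```csharp
using System;

public static class DigitHelper
{
    // Returns the decimal digits of value, most significant first.
    // The sign is dropped; widening to long keeps Math.Abs from
    // overflowing on int.MinValue.
    public static int[] ToDigits(int value)
    {
        long remaining = Math.Abs((long)value);

        // Count the digits first so the array is allocated exactly once.
        int length = 1;
        for (long probe = remaining; probe >= 10; probe /= 10)
        {
            length++;
        }

        // Fill from the last position backwards, peeling off the lowest digit each time.
        var digits = new int[length];
        for (int i = length - 1; i >= 0; i--)
        {
            digits[i] = (int)(remaining % 10);
            remaining /= 10;
        }
        return digits;
    }
}
```

Calling DigitHelper.ToDigits(123456) gives the array from the question. Placing the method in a top-level static class and adding `this` before the parameter would turn it into an extension method like the ones above.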

How to create dynamic incrementing variable using “for” loop in C#

How to create dynamic incrementing variable using "for" loop in C#? like this:
track_1, track_2, track_3, track_4. so on.
You can't create dynamically-named variables. All you can do is create a collection or array and work with that.
I think the best class for you is generic List<>:
List<String> listWithDynamic = new List<String>();
for (int i = 1; i < limit; i +=1)
{
listWithDynamic.Add(string.Format("track_{0}", i));
...
}
Assuming you want strings:
for (int i = 1; i < limit; i +=1)
{
string track = string.Format("track_{0}", i);
...
}
But when you already have variables called track_1, track_2, track_3, track_4 you will need an array or List:
var tracks = new TrackType[] { track_1, track_2, track_3, track_4 } ;
for (int i = 0; i < tracks.Length; i++)
{
var track = tracks[i]; // tracks[0] == track_1
...
}
Obvious Solution
for (var i = 0; i < 10; i++)
{
var track = string.Format("track_{0}", i);
}
Linq-Based Solution
foreach (var track in Enumerable.Range(0, 100).Select(x => string.Format("track_{0}", x)))
{
}
Operator-Based Solution
This is somewhat hacky, but fun nonetheless.
for (var i = new Frob(0, "track_{0}"); i < 100; i++)
{
Console.WriteLine(i.ValueDescription);
}
struct Frob
{
public int Value { get; private set; }
public string ValueDescription { get; private set; }
private string _format;
public Frob(int value, string format)
: this()
{
Value = value;
ValueDescription = string.Format(format, value);
_format = format;
}
public static Frob operator ++(Frob value)
{
return new Frob(value.Value + 1, value._format);
}
public static Frob operator --(Frob value)
{
return new Frob(value.Value - 1, value._format);
}
public static implicit operator int(Frob value)
{
return value.Value;
}
public static implicit operator string(Frob value)
{
return value.ValueDescription;
}
public override bool Equals(object obj)
{
if (obj is Frob)
{
return ((Frob)obj).Value == Value;
}
else if (obj is string)
{
return ((string)obj) == ValueDescription;
}
else if (obj is int)
{
return ((int)obj) == Value;
}
else
{
return base.Equals(obj);
}
}
public override int GetHashCode()
{
return Value;
}
public override string ToString()
{
return ValueDescription;
}
}
I'm not sure I understand your question, but I will try:
for(var i = 1; i < yourExclusiveUpperbound; i++)
{
var track = String.Format("$track_{0}", i);
// use track
}
or with some LINQ-Magic:
foreach(var track in Enumerable.Range(1, count)
.Select(i => String.Format("$track_{0}", i)))
{
// use track
}
Do as follows:
for (int i = 0; i < length; i++)
{
// do any work in the loop
}
No, we can't create dynamically named variables in a loop. But, there are other elegant ways to address the problem instead of creating dynamically named variables.
One could be, create an array or list before the loop and store values in array / list items in the loop. You can access the array / list later anywhere in your code. If you know which variable you want to use (track_1, track_2, ...), you can simply access it from the array / list (tracks[1], tracks[2], ...).
List<Track> tracks = new List<Track>();
for (int i = 1; i < limit; i++)
{
Track track = new Track();
tracks.Add(track);
...
}
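If the generated names themselves matter (for example, you want to look a track up by the text "track_3"), a dictionary keyed by name gets closest to dynamically named variables. A minimal sketch, assuming string values as a stand-in for whatever a track really holds. TrackRegistry and Create are hypothetical names:

```csharp
using System.Collections.Generic;

public static class TrackRegistry
{
    // Builds entries named track_1 .. track_<count>.
    // The string value is a placeholder for real track data.
    public static Dictionary<string, string> Create(int count)
    {
        var tracks = new Dictionary<string, string>(count);
        for (int i = 1; i <= count; i++)
        {
            string name = string.Format("track_{0}", i);
            tracks[name] = "contents of " + name;
        }
        return tracks;
    }
}
```

After var tracks = TrackRegistry.Create(4), the expression tracks["track_3"] plays the role that a variable named track_3 would have played.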

C# hashcode for array of ints

I have a class that internally is just an array of integers. Once constructed the array never changes. I'd like to pre-compute a good hashcode so that this class can be very efficiently used as a key in a Dictionary. The length of the array is less than about 30 items, and the integers are between -1000 and 1000 in general.
Not very clever, but sufficient for most practical purposes:
EDIT: changed due to comment of Henk Holterman, thanks for that.
int hc = array.Length;
foreach (int val in array)
{
hc = unchecked(hc * 314159 + val);
}
If you need something more sophisticated, look here.
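Since the array never changes once constructed, the loop above only needs to run once. A minimal sketch of a wrapper that caches the result, assuming a hypothetical class named IntArrayKey. It copies the input so the cached hash cannot go stale:

```csharp
using System;

// Hypothetical wrapper (not from the question): an immutable int[] key
// whose hash code is computed once in the constructor.
public sealed class IntArrayKey : IEquatable<IntArrayKey>
{
    private readonly int[] _values;
    private readonly int _hash;

    public IntArrayKey(int[] values)
    {
        if (values == null) throw new ArgumentNullException(nameof(values));

        // Copy so later changes to the caller's array cannot invalidate the cached hash.
        _values = (int[])values.Clone();

        // Same multiply-and-add loop as in the answer above.
        int hc = _values.Length;
        foreach (int val in _values)
        {
            hc = unchecked(hc * 314159 + val);
        }
        _hash = hc;
    }

    public bool Equals(IntArrayKey other)
    {
        if (ReferenceEquals(other, null)) return false;
        // Cheap rejections first: different hash or length means different content.
        if (_hash != other._hash || _values.Length != other._values.Length) return false;
        for (int i = 0; i < _values.Length; i++)
        {
            if (_values[i] != other._values[i]) return false;
        }
        return true;
    }

    public override bool Equals(object obj)
    {
        return Equals(obj as IntArrayKey);
    }

    public override int GetHashCode()
    {
        return _hash;
    }
}
```

The dictionary still calls Equals on a hash match, so the element-by-element comparison is needed for correctness. The cached hash just makes the common mismatch case cheap.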
For an array of values generally between -1000 and 1000, I would probably use something like this:
static int GetHashCode(int[] values)
{
int result = 0;
int shift = 0;
for (int i = 0; i < values.Length; i++)
{
shift = (shift + 11) % 21;
result ^= (values[i]+1024) << shift;
}
return result;
}
You may use CRC32 checksum. Here is the code:
[CLSCompliant(false)]
public class Crc32 {
uint[] table = new uint[256];
uint[] Table { get { return table; } }
public Crc32() {
MakeCrcTable();
}
void MakeCrcTable() {
for (uint n = 0; n < 256; n++) {
uint value = n;
for (int i = 0; i < 8; i++) {
if ((value & 1) != 0)
value = 0xedb88320 ^ (value >> 1);
else
value = value >> 1;
}
Table[n] = value;
}
}
public uint UpdateCrc(uint crc, byte[] buffer, int length) {
uint result = crc;
for (int n = 0; n < length; n++) {
result = Table[(result ^ buffer[n]) & 0xff] ^ (result >> 8);
}
return result;
}
public uint Calculate(Stream stream) {
long pos = stream.Position;
const int size = 0x32000;
byte[] buf = new byte[size];
int bytes = 0;
uint result = 0xffffffff;
do {
bytes = stream.Read(buf, 0, size);
result = UpdateCrc(result, buf, bytes);
}
while (bytes == size);
stream.Position = pos;
return ~result;
}
}
I think choosing a good hash-algorithm would have to be based on the distribution (in a probability sense) of the integer values.
Have a look at Wikipedia for a list of algorithms
Any CRC (or even XOR) should be ok.
You could take a different approach and use a recursive dictionary for each value in your int array. This way you can leave .net to do primitive type hashing.
internal class DictionaryEntry<TKey, TValue>
{
public Dictionary<TKey, DictionaryEntry<TKey, TValue>> Children { get; private set; }
public TValue Value { get; private set; }
public bool HasValue { get; private set; }
public void SetValue(TValue value)
{
Value = value;
HasValue = true;
}
public DictionaryEntry()
{
Children = new Dictionary<TKey, DictionaryEntry<TKey, TValue>>();
}
}
internal class KeyStackDictionary<TKey, TValue>
{
// Helper dictionary to work with a stack of keys
// Usage:
// var dict = new KeyStackDictionary<int, string>();
// int[] keyStack = new int[] {23, 43, 54};
// dict.SetValue(keyStack, "foo");
// string value;
// if (dict.GetValue(keyStack, out value))
// {
// }
private DictionaryEntry<TKey, TValue> _dict;
public KeyStackDictionary()
{
_dict = new DictionaryEntry<TKey, TValue>();
}
public void SetValue(TKey[] keyStack, TValue value)
{
DictionaryEntry<TKey, TValue> dict = _dict;
for (int i = 0; i < keyStack.Length; i++)
{
TKey key = keyStack[i];
if (dict.Children.ContainsKey(key))
{
dict = dict.Children[key];
}
else
{
var child = new DictionaryEntry<TKey, TValue>();
dict.Children.Add(key, child);
dict = child;
}
if (i == keyStack.Length - 1)
{
dict.SetValue(value);
}
}
}
// returns false if the value is not found using the key stack
public bool GetValue(TKey[] keyStack, out TValue value)
{
DictionaryEntry<TKey, TValue> dict = _dict;
for (int i = 0; i < keyStack.Length; i++)
{
TKey key = keyStack[i];
if (dict.Children.ContainsKey(key))
{
dict = dict.Children[key];
}
else
{
break;
}
if (i == keyStack.Length - 1 && dict.HasValue)
{
value = dict.Value;
return true;
}
}
value = default(TValue);
return false;
}
}
You can use Linq methods too:
var array = new int[10];
var hashCode = array.Aggregate(0, (a, v) =>
HashCode.Combine(a, v.GetHashCode()));
I'm using this here
var arrayHash = string.Join(string.Empty, array).GetHashCode();
If an element in the array changes, you will get a new hash.
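One thing to watch with the string.Join approach: with an empty separator, different arrays can produce the same string and therefore the same hash. A minimal sketch of the effect, using hypothetical helper names. Also, as far as I know, string hash codes are randomized per process on .NET Core and later, so they should not be persisted:

```csharp
public static class JoinedKey
{
    // Mirrors the approach above: elements are concatenated with no separator.
    public static string WithoutSeparator(int[] values)
    {
        return string.Join(string.Empty, values);
    }

    // Same idea, but the separator keeps element boundaries visible,
    // so { 1, 23 } and { 12, 3 } no longer produce the same text.
    public static string WithSeparator(int[] values)
    {
        return string.Join(",", values);
    }
}
```

Both { 1, 23 } and { 12, 3 } become "123" without a separator, while the comma version yields "1,23" and "12,3". Either way a string is allocated per call, so this is the convenient option rather than the fast one.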
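I would recommend System.HashCode, available in .NET Core 2.1 / .NET Standard 2.1 / .NET 5 and later. Note that HashCode.Combine(array) hashes only the array reference, not its elements, so add the elements one at a time:
var hc = new HashCode();
foreach (int v in array) hc.Add(v);
int hash = hc.ToHashCode();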

C# HashCode Builder

I used to use the Apache HashCodeBuilder a lot.
Does this exist for C#?
This is my homemade builder.
Usage:
hash = new HashCodeBuilder().
Add(a).
Add(b).
Add(c).
Add(d).
GetHashCode();
It does not matter what type fields a,b,c and d are, easy to extend, no need to create array.
Source:
public sealed class HashCodeBuilder
{
private int hash = 17;
public HashCodeBuilder Add(int value)
{
unchecked
{
hash = hash * 31 + value; // see Effective Java for reasoning
// can be any prime, but hash * 31 can be optimised by the VM to (hash << 5) - hash
}
return this;
}
public HashCodeBuilder Add(object value)
{
return Add(value != null ? value.GetHashCode() : 0);
}
public HashCodeBuilder Add(float value)
{
return Add(value.GetHashCode());
}
public HashCodeBuilder Add(double value)
{
return Add(value.GetHashCode());
}
public override int GetHashCode()
{
return hash;
}
}
Sample usage:
public sealed class Point
{
private readonly int _x;
private readonly int _y;
private readonly int _hash;
public Point(int x, int y)
{
_x = x;
_y = y;
_hash = new HashCodeBuilder().
Add(_x).
Add(_y).
GetHashCode();
}
public int X
{
get { return _x; }
}
public int Y
{
get { return _y; }
}
public override bool Equals(object obj)
{
return Equals(obj as Point);
}
public bool Equals(Point other)
{
if (other == null) return false;
return (other._x == _x) && (other._y == _y);
}
public override int GetHashCode()
{
return _hash;
}
}
I use the following:
public static int ComputeHashFrom(params object[] obj) {
ulong res = 0;
for(uint i=0;i<obj.Length;i++) {
object val = obj[i];
res += val == null ? i : (ulong)val.GetHashCode() * (1 + 2 * i);
}
return (int)(uint)(res ^ (res >> 32));
}
Using such a helper is quick, easy and reliable, but it has two potential downsides (which you aren't likely to encounter frequently, but are good to be aware of):
It can generate poor hashcodes for some distributions of params. For instance, for any int x small enough that x*-3 does not overflow, ComputeHashFrom(x*-3, x) == 0 - so if your objects have certain pathological properties you may get many hash code collisions resulting in poorly performing Dictionaries and HashSets. It's not likely to happen, but a type-aware hash code computation can avoid such problems more easily.
The computation of the hashcode is slower than a specialized computation could be. In particular, it involves the allocation of the params array and a loop - which is quite a bit of unnecessary overhead if you've just got two members to process.
Neither of the drawbacks causes any errors, merely inefficiency; and both will show up in a profiler as blips in either this method or in the internals of the hash-code consumer.
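The collision mentioned above is easy to reproduce. A small sketch that copies the helper into a hypothetical HashHelper class and adds a check. The unchecked block is my addition so the behaviour does not depend on compiler settings:

```csharp
public static class HashHelper
{
    // Copied from the answer above; only the unchecked block is new.
    public static int ComputeHashFrom(params object[] obj)
    {
        unchecked
        {
            ulong res = 0;
            for (uint i = 0; i < obj.Length; i++)
            {
                object val = obj[i];
                res += val == null ? i : (ulong)val.GetHashCode() * (1 + 2 * i);
            }
            return (int)(uint)(res ^ (res >> 32));
        }
    }

    // True when the pair (x * -3, x) hashes to zero.
    public static bool CollidesAtZero(int x)
    {
        return ComputeHashFrom(unchecked(x * -3), x) == 0;
    }
}
```

CollidesAtZero returns true for values such as 1, 7 or -1000, because the weights 1 and 3 cancel the two terms exactly. It stops holding once x * -3 overflows int, for example at x = 1000000000.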
C# doesn't have a built-in HashCode builder, but you can roll your own. I recently had this precise problem and created this hashcode generator that doesn't use boxing, by using generics, and implements a modified FNV algorithm for generating the specific hash. But you could use any algorithm you'd like, like one of those in System.Security.Cryptography.
public static int GetHashCode<T>(params T[] args)
{
return args.GetArrayHashCode();
}
public static int GetArrayHashCode<T>(this T[] objects)
{
int[] data = new int[objects.Length];
for (int i = 0; i < objects.Length; i++)
{
T obj = objects[i];
data[i] = obj == null ? 1 : obj.GetHashCode();
}
return GetFnvHash(data);
}
private static int GetFnvHash(int[] data)
{
unchecked
{
const int p = 16777619;
long hash = 2166136261;
for (int i = 0; i < data.Length; i++)
{
hash = (hash ^ data[i]) * p;
}
hash += hash << 13;
hash ^= hash >> 7;
hash += hash << 3;
hash ^= hash >> 17;
hash += hash << 5;
return (int)hash;
}
}
Microsoft has released a class for computing hash codes, System.HashCode. Please see https://learn.microsoft.com/en-us/dotnet/api/system.hashcode. It is built into .NET Core 2.1 and later; on older targets such as .NET Framework you need to include the NuGet package Microsoft.Bcl.HashCode in your project to use it.
Usage example:
using System;
public class MyClass {
public int MyVar { get; }
public string AnotherVar { get; }
public object MoreVars;
public override int GetHashCode()
=> HashCode.Combine(MyVar, AnotherVar, MoreVars);
}
Nowadays I leverage ValueTuples, reference-type Tuples or anonymous types:
var hash = (1, "seven").GetHashCode();
var hash2 = Tuple.Create(1, "seven").GetHashCode();
var hash3 = new { Number = 1, String = "seven" }.GetHashCode();
I believe value tuples will be fastest.
