I have a custom BindingList that I want to create a custom AddRange method for.
public class MyBindingList<I> : BindingList<I>
{
...
public void AddRange(IEnumerable<I> vals)
{
foreach (I v in vals)
Add(v);
}
}
My problem with this is that performance is terrible with large collections. The case I am debugging now tries to add roughly 30,000 records and takes an unacceptable amount of time.
After looking into this issue online, it seems the problem is that Add resizes the array with each addition. I think this answer summarizes it:
If you are using Add, it is resizing the inner array gradually as needed (doubling)
What can I do in my custom AddRange implementation to size the BindingList once, based on the item count, rather than letting it re-allocate the array repeatedly as items are added?
CSharpie explained in his answer that the bad performance is due to the ListChanged-event firing after each Add, and showed a way to implement AddRange for your custom BindingList.
An alternative would be to implement the AddRange functionality as an extension method for BindingList<T>, based on CSharpie's implementation:
using System;
using System.Collections.Generic;
/// <summary>
/// Extension methods for <see cref="System.ComponentModel.BindingList{T}"/>.
/// </summary>
public static class BindingListExtensions
{
/// <summary>
/// Adds the elements of the specified collection to the end of the <see cref="System.ComponentModel.BindingList{T}"/>,
/// while only firing the <see cref="System.ComponentModel.BindingList{T}.ListChanged"/>-event once.
/// </summary>
/// <typeparam name="T">
/// The type T of the values of the <see cref="System.ComponentModel.BindingList{T}"/>.
/// </typeparam>
/// <param name="bindingList">
/// The <see cref="System.ComponentModel.BindingList{T}"/> to which the values shall be added.
/// </param>
/// <param name="collection">
/// The collection whose elements should be added to the end of the <see cref="System.ComponentModel.BindingList{T}"/>.
/// The collection itself cannot be null, but it can contain elements that are null,
/// if type T is a reference type.
/// </param>
/// <exception cref="ArgumentNullException">
/// <paramref name="bindingList"/> or <paramref name="collection"/> is null.
/// </exception>
public static void AddRange<T>(this System.ComponentModel.BindingList<T> bindingList, IEnumerable<T> collection)
{
// Neither the binding list nor the given collection may be null.
if (bindingList == null)
throw new ArgumentNullException(nameof(bindingList));
if (collection == null)
throw new ArgumentNullException(nameof(collection));
// Remember the current setting for RaiseListChangedEvents
// (if it was already deactivated, we shouldn't activate it after adding!).
var oldRaiseEventsValue = bindingList.RaiseListChangedEvents;
// Try adding all of the elements to the binding list.
try
{
bindingList.RaiseListChangedEvents = false;
foreach (var value in collection)
bindingList.Add(value);
}
// Restore the old setting for RaiseListChangedEvents (even if there was an exception),
// and fire the ListChanged-event once (if RaiseListChangedEvents is activated).
finally
{
bindingList.RaiseListChangedEvents = oldRaiseEventsValue;
if (bindingList.RaiseListChangedEvents)
bindingList.ResetBindings();
}
}
}
This way, depending on your needs, you might not even need to write your own BindingList-subclass.
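To make the effect of suspending events concrete, here is a minimal sketch that counts how many ListChanged notifications fire while items are added, with and without RaiseListChangedEvents switched off. The class and method names (ListChangedDemo, CountEvents) are hypothetical and exist only for this illustration; the BindingList<T> members used are the same ones the extension method above relies on.

```csharp
using System;
using System.ComponentModel;

// Hypothetical helper, for illustration only.
public static class ListChangedDemo
{
    // Returns the number of ListChanged events raised while adding `count` items.
    public static int CountEvents(int count, bool suppress)
    {
        var list = new BindingList<int>();
        int events = 0;
        list.ListChanged += (sender, e) => events++;

        bool old = list.RaiseListChangedEvents;
        if (suppress)
        {
            list.RaiseListChangedEvents = false;
        }

        for (int i = 0; i < count; i++)
        {
            list.Add(i);
        }

        if (suppress)
        {
            // Restore the previous setting and notify listeners once.
            list.RaiseListChangedEvents = old;
            list.ResetBindings();
        }

        return events;
    }
}
```

With suppression, listeners see a single Reset notification instead of one ItemAdded notification per element, which is where a bound UI control typically spends its time.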
You can pass a List into the constructor and make use of List<T>.Capacity.
But I bet the most significant speedup will come from suspending events while adding a range, so I included both things in my example code.
It probably needs some fine-tuning to handle some worst cases and whatnot.
public class MyBindingList<I> : BindingList<I>
{
private readonly List<I> _baseList;
public MyBindingList() : this(new List<I>())
{
}
public MyBindingList(List<I> baseList) : base(baseList)
{
if(baseList == null)
throw new ArgumentNullException();
_baseList = baseList;
}
public void AddRange(IEnumerable<I> vals)
{
ICollection<I> collection = vals as ICollection<I>;
if (collection != null)
{
int requiredCapacity = Count + collection.Count;
if (requiredCapacity > _baseList.Capacity)
_baseList.Capacity = requiredCapacity;
}
bool restore = RaiseListChangedEvents;
try
{
RaiseListChangedEvents = false;
foreach (I v in vals)
Add(v); // We can't call _baseList.Add; otherwise the item's PropertyChanged event won't get hooked.
}
finally
{
RaiseListChangedEvents = restore;
if (RaiseListChangedEvents)
ResetBindings();
}
}
}
You cannot use _baseList.AddRange, since BindingList<T> won't hook the PropertyChanged event then. You can bypass this only by using reflection to call the private method HookPropertyChanged for each item after AddRange. This, however, only makes sense if vals (your method parameter) is a collection. Otherwise you risk enumerating the enumerable twice.
That's the closest you can get to "optimal" without writing your own BindingList,
which shouldn't be too difficult, as you could copy the source code from BindingList and alter the parts to suit your needs.
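To put the Capacity part in perspective, the following sketch counts how often a List<T> actually changes its capacity while items are added one by one. It assumes nothing beyond List<T> itself; CapacityDemo and CountResizes are hypothetical names for this illustration. Because the list grows geometrically, the number of reallocations stays small compared with the number of items, which fits the guess above that event suppression matters more than presizing.

```csharp
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class CapacityDemo
{
    // Counts how often the list's capacity changes while adding `count` items.
    public static int CountResizes(int count, bool presize)
    {
        var list = presize ? new List<int>(count) : new List<int>();
        int resizes = 0;
        int lastCapacity = list.Capacity;

        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                resizes++;
                lastCapacity = list.Capacity;
            }
        }

        return resizes;
    }
}
```

Presizing removes the reallocations entirely, so it is still worth doing when the item count is known up front; it just should not be expected to fix a slowdown caused by per-item notifications.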
I would like to implement a simple in-memory LRU cache system, and I was thinking about a solution based on an IDictionary implementation that could handle a hashed LRU mechanism.
Coming from Java, I have experience with LinkedHashMap, which works fine for what I need, but I can't find a similar solution for .NET anywhere.
Has anyone developed one, or does anyone have experience with something like this?
This is a very simple and fast implementation we developed for a web site we own.
We tried to improve the code as much as possible, while keeping it thread safe.
I think the code is very simple and clear, but if you need some explanation or a guide related to how to use it, don't hesitate to ask.
using System.Collections.Generic;
using System.Runtime.CompilerServices;
namespace LRUCache
{
public class LRUCache<K,V>
{
private int capacity;
private Dictionary<K, LinkedListNode<LRUCacheItem<K, V>>> cacheMap = new Dictionary<K, LinkedListNode<LRUCacheItem<K, V>>>();
private LinkedList<LRUCacheItem<K, V>> lruList = new LinkedList<LRUCacheItem<K, V>>();
public LRUCache(int capacity)
{
this.capacity = capacity;
}
[MethodImpl(MethodImplOptions.Synchronized)]
public V get(K key)
{
LinkedListNode<LRUCacheItem<K, V>> node;
if (cacheMap.TryGetValue(key, out node))
{
V value = node.Value.value;
lruList.Remove(node);
lruList.AddLast(node);
return value;
}
return default(V);
}
[MethodImpl(MethodImplOptions.Synchronized)]
public void add(K key, V val)
{
if (cacheMap.TryGetValue(key, out var existingNode))
{
lruList.Remove(existingNode);
}
else if (cacheMap.Count >= capacity)
{
RemoveFirst();
}
LRUCacheItem<K, V> cacheItem = new LRUCacheItem<K, V>(key, val);
LinkedListNode<LRUCacheItem<K, V>> node = new LinkedListNode<LRUCacheItem<K, V>>(cacheItem);
lruList.AddLast(node);
// cacheMap.Add(key, node) would throw if the key already exists, so use the indexer instead.
cacheMap[key] = node;
}
private void RemoveFirst()
{
// Remove from LRUPriority
LinkedListNode<LRUCacheItem<K,V>> node = lruList.First;
lruList.RemoveFirst();
// Remove from cache
cacheMap.Remove(node.Value.key);
}
}
class LRUCacheItem<K,V>
{
public LRUCacheItem(K k, V v)
{
key = k;
value = v;
}
public K key;
public V value;
}
}
There is nothing in the base class libraries that does this.
On the free side, maybe something like C5's HashedLinkedList would work.
If you're willing to pay, maybe check out this C# toolkit. It contains an implementation.
I've recently released a class called LurchTable to address the need for a C# variant of the LinkedHashMap. A brief discussion of the LurchTable can be found here.
Basic features:
Linked Concurrent Dictionary by Insertion, Modification, or Access
Dictionary/ConcurrentDictionary interface support
Peek/TryDequeue/Dequeue access to 'oldest' entry
Allows hard-limit on items enforced at insertion
Exposes events for add, update, and remove
Source Code: http://csharptest.net/browse/src/Library/Collections/LurchTable.cs
GitHub: https://github.com/csharptest/CSharpTest.Net.Collections
HTML Help: http://help.csharptest.net/
PM> Install-Package CSharpTest.Net.Collections
The LRUCache answer with sample code above uses MethodImplOptions.Synchronized, which is equivalent to putting lock(this) around each method call. Whilst correct, this global lock will significantly reduce throughput under concurrent load.
To solve this I implemented a thread safe pseudo LRU designed for concurrent workloads. Performance is very close to ConcurrentDictionary, ~10x faster than MemoryCache and hit rate is better than a conventional LRU. Full analysis provided in the github link below.
Usage looks like this:
int capacity = 666;
var lru = new ConcurrentLru<int, SomeItem>(capacity);
var value = lru.GetOrAdd(1, (k) => new SomeItem(k));
GitHub: https://github.com/bitfaster/BitFaster.Caching
Install-Package BitFaster.Caching
Found your answer while googling, and also found this:
http://code.google.com/p/csharp-lru-cache/
csharp-lru-cache: LRU cache collection class library
This is a collection class that functions as a least-recently-used cache. It implements ICollection<T>, but also exposes three other members:
Capacity, the maximum number of items the cache can contain. Once the collection is at capacity, adding a new item to the cache will cause the least recently used item to be discarded. If the Capacity is set to 0 at construction, the cache will not automatically discard items.
Oldest, the oldest (i.e. least recently used) item in the collection.
DiscardingOldestItem, an event raised when the cache is about to discard its oldest item.
This is an extremely simple implementation. While its Add and Remove methods are thread-safe, it shouldn't be used in heavy multithreading environments because the entire collection is locked during those methods.
This takes Martin's code with Mr T's suggestions and makes it Stylecop friendly. Oh, it also allows for disposal of values as they cycle out of the cache.
namespace LruCache
{
using System;
using System.Collections.Generic;
/// <summary>
/// A least-recently-used cache stored like a dictionary.
/// </summary>
/// <typeparam name="TKey">
/// The type of the key to the cached item
/// </typeparam>
/// <typeparam name="TValue">
/// The type of the cached item.
/// </typeparam>
/// <remarks>
/// Derived from https://stackoverflow.com/a/3719378/240845
/// </remarks>
public class LruCache<TKey, TValue>
{
private readonly Dictionary<TKey, LinkedListNode<LruCacheItem>> cacheMap =
new Dictionary<TKey, LinkedListNode<LruCacheItem>>();
private readonly LinkedList<LruCacheItem> lruList =
new LinkedList<LruCacheItem>();
private readonly Action<TValue> dispose;
/// <summary>
/// Initializes a new instance of the <see cref="LruCache{TKey, TValue}"/>
/// class.
/// </summary>
/// <param name="capacity">
/// Maximum number of elements to cache.
/// </param>
/// <param name="dispose">
/// When elements cycle out of the cache, disposes them. May be null.
/// </param>
public LruCache(int capacity, Action<TValue> dispose = null)
{
this.Capacity = capacity;
this.dispose = dispose;
}
/// <summary>
/// Gets the capacity of the cache.
/// </summary>
public int Capacity { get; }
/// <summary>Gets the value associated with the specified key.</summary>
/// <param name="key">
/// The key of the value to get.
/// </param>
/// <param name="value">
/// When this method returns, contains the value associated with the specified
/// key, if the key is found; otherwise, the default value for the type of the
/// <paramref name="value" /> parameter. This parameter is passed
/// uninitialized.
/// </param>
/// <returns>
/// true if the <see cref="T:System.Collections.Generic.Dictionary`2" />
/// contains an element with the specified key; otherwise, false.
/// </returns>
public bool TryGetValue(TKey key, out TValue value)
{
lock (this.cacheMap)
{
LinkedListNode<LruCacheItem> node;
if (this.cacheMap.TryGetValue(key, out node))
{
value = node.Value.Value;
this.lruList.Remove(node);
this.lruList.AddLast(node);
return true;
}
value = default(TValue);
return false;
}
}
/// <summary>
/// Looks for a value for the matching <paramref name="key"/>. If not found,
/// calls <paramref name="valueGenerator"/> to retrieve the value and add it to
/// the cache.
/// </summary>
/// <param name="key">
/// The key of the value to look up.
/// </param>
/// <param name="valueGenerator">
/// Generates a value if one isn't found.
/// </param>
/// <returns>
/// The requested value.
/// </returns>
public TValue Get(TKey key, Func<TValue> valueGenerator)
{
lock (this.cacheMap)
{
LinkedListNode<LruCacheItem> node;
TValue value;
if (this.cacheMap.TryGetValue(key, out node))
{
value = node.Value.Value;
this.lruList.Remove(node);
this.lruList.AddLast(node);
}
else
{
value = valueGenerator();
if (this.cacheMap.Count >= this.Capacity)
{
this.RemoveFirst();
}
LruCacheItem cacheItem = new LruCacheItem(key, value);
node = new LinkedListNode<LruCacheItem>(cacheItem);
this.lruList.AddLast(node);
this.cacheMap.Add(key, node);
}
return value;
}
}
/// <summary>
/// Adds the specified key and value to the dictionary.
/// </summary>
/// <param name="key">
/// The key of the element to add.
/// </param>
/// <param name="value">
/// The value of the element to add. The value can be null for reference types.
/// </param>
public void Add(TKey key, TValue value)
{
lock (this.cacheMap)
{
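LinkedListNode<LruCacheItem> existing;
if (this.cacheMap.TryGetValue(key, out existing))
{
// Replace the existing entry instead of letting Dictionary.Add throw below.
this.lruList.Remove(existing);
this.cacheMap.Remove(key);
}
else if (this.cacheMap.Count >= this.Capacity)
{
this.RemoveFirst();
}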
LruCacheItem cacheItem = new LruCacheItem(key, value);
LinkedListNode<LruCacheItem> node =
new LinkedListNode<LruCacheItem>(cacheItem);
this.lruList.AddLast(node);
this.cacheMap.Add(key, node);
}
}
private void RemoveFirst()
{
// Remove from LRUPriority
LinkedListNode<LruCacheItem> node = this.lruList.First;
this.lruList.RemoveFirst();
// Remove from cache
this.cacheMap.Remove(node.Value.Key);
// dispose
this.dispose?.Invoke(node.Value.Value);
}
private class LruCacheItem
{
public LruCacheItem(TKey k, TValue v)
{
this.Key = k;
this.Value = v;
}
public TKey Key { get; }
public TValue Value { get; }
}
}
}
The Caching Application Block of EntLib has an LRU scavenging option out of the box and can be in memory. It might be a bit heavyweight for what you want, though.
I don't believe so. I've certainly seen hand-rolled ones implemented several times in various unrelated projects, which more or less confirms this: if there was one, surely at least one of the projects would have used it.
It's pretty simple to implement, and usually gets done by creating a class which contains both a Dictionary and a List.
The keys go in the list (in-order) and the items go in the dictionary.
When you Add a new item to the collection, the function checks the length of the list, pulls out the last Key (if the list is too long) and then evicts that key and its value from the dictionary to match. There is not much more to it, really.
I like Lawrence's implementation. Hashtable + LinkedList is a good solution.
Regarding threading, I would not lock this with [MethodImpl(MethodImplOptions.Synchronized)], but rather use ReaderWriterLockSlim or a spin lock (since contention is usually brief) instead.
In the Get function I would check if it's already the 1st item first, rather than always removing and adding. This gives you the possibility to keep that within a reader lock that is not blocking other readers.
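The "check before moving" idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from any answer above: TinyLru, TryGet, Put and the Relinks counter are hypothetical names, the most recently used entry is kept at the end of the list (as in the implementations above), and a plain lock is used for simplicity, so the reader-lock part of the suggestion is not shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal LRU cache, for illustration only.
public class TinyLru<TKey, TValue>
{
    private readonly int capacity;
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> map =
        new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>();
    private readonly LinkedList<KeyValuePair<TKey, TValue>> order =
        new LinkedList<KeyValuePair<TKey, TValue>>();
    private readonly object gate = new object();

    public TinyLru(int capacity)
    {
        if (capacity <= 0) throw new ArgumentOutOfRangeException(nameof(capacity));
        this.capacity = capacity;
    }

    // Counts how often a lookup actually had to move a node (for demonstration).
    public int Relinks { get; private set; }

    public bool TryGet(TKey key, out TValue value)
    {
        lock (gate)
        {
            LinkedListNode<KeyValuePair<TKey, TValue>> node;
            if (!map.TryGetValue(key, out node))
            {
                value = default(TValue);
                return false;
            }

            value = node.Value.Value;
            // Only touch the list when the node is not already the most recent one.
            if (node != order.Last)
            {
                order.Remove(node);
                order.AddLast(node);
                Relinks++;
            }
            return true;
        }
    }

    public void Put(TKey key, TValue value)
    {
        lock (gate)
        {
            LinkedListNode<KeyValuePair<TKey, TValue>> existing;
            if (map.TryGetValue(key, out existing))
            {
                order.Remove(existing);
                map.Remove(key);
            }
            else if (map.Count >= capacity)
            {
                // Evict the least recently used entry (front of the list).
                map.Remove(order.First.Value.Key);
                order.RemoveFirst();
            }
            map[key] = order.AddLast(new KeyValuePair<TKey, TValue>(key, value));
        }
    }
}
```

Repeated hits on the hottest key then leave the list untouched, which is the precondition for serving such hits under a shared (reader) lock.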
I just stumbled upon LruCache.cs in aws-sdk-net: https://github.com/aws/aws-sdk-net/blob/master/sdk/src/Core/Amazon.Runtime/Internal/Util/LruCache.cs
If it's an ASP.NET app you can use the Cache class[1], but you'll be competing for space with other cached stuff, which may or may not be what you want.
[1] http://msdn.microsoft.com/en-us/library/system.web.caching.cache.aspx
Overview
We use Dapper to execute stored procedures in our internal applications. I need to build out a set of APIs that sit on top of Dapper, so the enterprise can avoid being tightly coupled to Dapper. I wrote the set of APIs and have them working and performing great.
A simple example of the usage is:
private async Task Delete()
{
// Get an instance of the graph builder from our factory.
IGraph graphBuilder = EntityGraphFactory.CreateEntityGraph();
// Associate the builder to a stored procedure, and map an entity instance to it.
// We provide the graph the entities primary key and value.
graphBuilder.MapToProcedure("DeleteAddress").MapEntity(this.CustomerAddress)
.DefineProperty(address => address.AddressId).IsKey();
// Get an instance of our repository and delete the entity defined in the graph.
IGraphRepository repository = GraphRepositoryFactory.CreateRepository();
await repository.DeleteAsync(graphBuilder);
this.CustomerAddress = new Address();
}
The problem
The challenge I now have is caching. I want the repository to handle the caching for us automatically. When we query for lookup data like this:
private async Task RestoreAddress()
{
IGraph graphBuilder = EntityGraphFactory.CreateEntityGraph();
IGraphRepository repository = GraphRepositoryFactory.CreateRepository();
// Map ourself to a stored procedure. Tell the graph we are going to
// take the entered Id, and pass it in to the stored procedure as a
// "AddressId" parameter.
// We then define each of the properties that the returned rows
// must map back to, renaming the columns to their associated properties.
graphBuilder.MapToProcedure("GetAddressbyId")
.MapFromEntity(this.CustomerAddress)
.DefineProperty(address => address.AddressId.ToString())
.MapToEntity<Address>()
.DefineProperty(address => address.Street == "AddressLine1")
.DefineProperty(address => address.City)
.DefineProperty(address => address.RowGuid.ToString() == "rowguid")
.DefineProperty(address => address.LastModified.ToString() == "ModifiedDate")
.DefineProperty(address => address.PostalCode)
.DefineProperty(address => address.AddressId)
.MapToEntity<StateProvince>()
.DefineProperty(province => province.StateProvinceId.ToString() == "StateProvinceId");
IEnumerable<Address> addresses = await repository.GetAsync<Address>(graphBuilder);
this.CustomerAddress = addresses.FirstOrDefault() ?? new Address();
this.SelectedProvince = this.Provinces.FirstOrDefault(
province => province.StateProvinceId == this.CustomerAddress.StateProvinceId);
}
Addresses in this example are a set of lookup data that won't change during the runtime of the app, at least not until a sync is performed, at which point the cache could be cleared. The issue, though, is that I'm not sure how to go about caching. In this example I am executing GetAddressById, but I could have executed GetAddressByStateId or GetAllAddresses. I then don't know which data has already been fetched and which still needs to be fetched.
Potential solutions
I have a few ideas on how to go about doing this, but I'm not sure if they're going to cause conflicts or issues if I were to implement them. So before I outline them, I want to show you the implementation of the IGraph interface.
/// <summary>
/// Exposes methods for retrieving mapping information and entity definitions.
/// </summary>
internal class Graph : IGraph
{
/// <summary>
/// Initializes a new instance of the <see cref="Graph"/> class.
/// </summary>
internal Graph()
{
this.ProcedureMapping = new ProcedureBuilder(this);
this.GraphMap = new Dictionary<Type, List<PropertyDefinition>>();
}
/// <summary>
/// Gets the graph definitions created for each Type registered with it.
/// </summary>
internal Dictionary<Type, List<PropertyDefinition>> GraphMap { get; private set; }
/// <summary>
/// Gets or sets the key used by the graph as its Primary Key.
/// </summary>
internal PropertyDefinition RootKey { get; set; }
/// <summary>
/// Gets the procedure mapping.
/// </summary>
internal ProcedureBuilder ProcedureMapping { get; private set; }
/// <summary>
/// Gets the graph generated for the given entity
/// </summary>
/// <typeparam name="TEntity">The entity type to retrieve definition information from.</typeparam>
/// <returns>
/// Returns a collection of PropertyDefinition objects
/// </returns>
public IEnumerable<PropertyDefinition> GetEntityGraph<TEntity>() where TEntity : class, new()
{
return this.GetEntityGraph(typeof(TEntity));
}
/// <summary>
/// Gets a collection of PropertyDefinition objects that make up the data graph for the Entity specified.
/// </summary>
/// <param name="entityType">The entity type to retrieve definition information from.</param>
/// <returns>
/// Returns a collection of PropertyDefinition objects
/// </returns>
public IEnumerable<PropertyDefinition> GetEntityGraph(Type entityType)
{
if (GraphMap.ContainsKey(entityType))
{
return GraphMap[entityType];
}
return Enumerable.Empty<PropertyDefinition>();
}
/// <summary>
/// Gets the graph generated by the graph for all entities graphed on it.
/// </summary>
/// <returns>
/// Returns a dictionary where the key is a mapped type and the value is its definition data.
/// </returns>
public Dictionary<Type, IEnumerable<PropertyDefinition>> GetBuilderGraph()
{
// Return a new dictionary containing the same values. This prevents someone from adding content to the
// dictionary we hold internally.
return this.GraphMap.ToDictionary(keySelector => keySelector.Key, valueSelector => valueSelector.Value as IEnumerable<PropertyDefinition>);
}
/// <summary>
/// Resets the graph so that it may be used in a fresh state.
/// </summary>
public void ClearGraph()
{
this.GraphMap.Clear();
this.RootKey = null;
this.ProcedureMapping = new ProcedureBuilder(this);
}
/// <summary>
/// Gets the primary key defined for this data graph.
/// </summary>
/// <returns>Returns the PropertyDefinition associated as the Builder Key.</returns>
public PropertyDefinition GetKey()
{
return this.RootKey;
}
/// <summary>
/// Gets the stored procedure for the operation type provided.
/// </summary>
/// <param name="operationType">Type of operation the procedure will perform when executed.</param>
/// <returns>
/// Returns the ProcedureDefinition mapped to this graph for the given operation type.
/// </returns>
public ProcedureDefinition GetProcedureForOperation(ProcedureOperationType operationType)
{
string procedureName = this.ProcedureMapping.ProcedureMap[operationType];
return new ProcedureDefinition(operationType, procedureName);
}
/// <summary>
/// Gets all of the associated stored procedure mappings.
/// </summary>
/// <returns>
/// Returns a collection of ProcedureDefinition objects mapped to this data graph.
/// </returns>
public IEnumerable<ProcedureDefinition> GetProcedureMappings()
{
// Convert the builders dictionary mapping of stored procedures to OperationType into a collection of ProcedureDefinition objects.
return this.ProcedureMapping.ProcedureMap
.Where(kvPair => !string.IsNullOrEmpty(kvPair.Value))
.Select(kvPair => new ProcedureDefinition(kvPair.Key, kvPair.Value));
}
/// <summary>
/// Maps the data defined in this graph to a stored procedure.
/// </summary>
/// <param name="procedureName">Name of the procedure responsible for receiving the data in this graph.</param>
/// <returns>
/// Returns the data graph.
/// </returns>
public IGraph MapToProcedure(string procedureName)
{
this.ProcedureMapping.DefineForAllOperations(procedureName);
return this;
}
/// <summary>
/// Allows for mapping the data in this graph to different stored procedures.
/// </summary>
/// <returns>
/// Returns an instance of IProcedureBuilder used to perform the mapping operation
/// </returns>
public IProcedureBuilder MapToProcedure()
{
return this.ProcedureMapping;
}
/// <summary>
/// Defines what Entity will be used to building out property definitions
/// </summary>
/// <typeparam name="TEntity">The type of the entity to use during the building process.</typeparam>
/// <returns>
/// Returns an instance of IEntityDefinition that can be used for building out the entity definition
/// </returns>
public IPropertyBuilderForInput<TEntity> MapFromEntity<TEntity>() where TEntity : class, new()
{
this.CreateDefinition<TEntity>();
return new PropertyBuilder<TEntity>(this, DefinitionDirection.In);
}
/// <summary>
/// Defines what Entity will be used to building out property definitions
/// </summary>
/// <typeparam name="TEntity">The type of the entity to use during the building process.</typeparam>
/// <param name="entity">An existing instance of the entity used during the building process.</param>
/// <returns>
/// Returns an instance of IEntityDefinition that can be used for building out the entity definition
/// </returns>
public IPropertyBuilderForInput<TEntity> MapFromEntity<TEntity>(TEntity entity) where TEntity : class, new()
{
this.CreateDefinition<TEntity>();
return new PropertyBuilder<TEntity>(this, DefinitionDirection.In, entity);
}
public IPropertyBuilderForOutput<TEntity> MapToEntity<TEntity>() where TEntity : class, new()
{
this.CreateDefinition<TEntity>();
return new PropertyBuilder<TEntity>(this, DefinitionDirection.Out);
}
public IPropertyBuilderForInput<TEntity> MapEntity<TEntity>() where TEntity : class, new()
{
this.CreateDefinition<TEntity>();
return new PropertyBuilder<TEntity>(this, DefinitionDirection.Both);
}
public IPropertyBuilderForInput<TEntity> MapEntity<TEntity>(TEntity entity) where TEntity : class, new()
{
this.CreateDefinition<TEntity>();
return new PropertyBuilder<TEntity>(this, DefinitionDirection.Both, entity);
}
private void CreateDefinition<TEntity>() where TEntity : class, new()
{
// A definition has already been created, so return.
if (this.GraphMap.ContainsKey(typeof(TEntity)))
{
return;
}
this.GraphMap.Add(typeof(TEntity), new List<PropertyDefinition>());
}
}
The point of this class is to let you map a Type to it, and then use interfaces that are returned on the MapEntity methods, to define properties and their characteristics. The repository then is given the builder, and pulls the Mappings from it, generating a Dapper DynamicParameters collection from it. The map is also used in a custom Dapper TypeMapper.
Since I am only caching things that are queried, I'll save some page-space and just share the query method on my repository, and its TypeMapper.
public async Task<IEnumerable<TEntity>> GetAsync<TEntity>(IGraph builder, IDataContext context = null)
{
IEnumerable<TEntity> items = null;
DynamicParameters parameters = this.GetParametersFromDefinition(builder, DefinitionDirection.In);
// Setup our mapping of the return results.
this.SetupSqlMapper<TEntity>(builder);
ProcedureDefinition mapping = builder.GetProcedureForOperation(ProcedureOperationType.Select);
// Query the database
await this.SetupConnection(
context,
async (connection, transaction) => items = await connection.QueryAsync<TEntity>(
mapping.StoredProcedure,
parameters,
commandType: CommandType.StoredProcedure,
transaction: transaction));
return items;
}
private async Task SetupConnection(IDataContext context, Func<IDbConnection, IDbTransaction, Task> communicateWithDatabase)
{
SqlDataContext connectionContext = await this.CreateConnectionContext(context);
IDbConnection databaseConnection = await connectionContext.GetConnection();
// Fetch the transaction, if any, associated with the context. If none exists, null is returned and passed
// in to the ExecuteAsync method.
IDbTransaction transaction = connectionContext.GetTransaction();
try
{
await communicateWithDatabase(databaseConnection, transaction);
}
catch (Exception)
{
this.RollbackChanges(connectionContext);
throw;
}
// If we are given a shared connection, we are not responsible for closing it.
if (context == null)
{
this.CloseConnection(connectionContext);
}
}
private DynamicParameters GetParametersFromDefinition(IGraph builder, DefinitionDirection direction)
{
// Fetch the model definition, then loop through each property we are saving and add it
// to a Dapper DynamicParameters collection.
Dictionary<Type, IEnumerable<PropertyDefinition>> definition = builder.GetBuilderGraph();
var parameters = new DynamicParameters();
foreach (var pair in definition)
{
IEnumerable<PropertyDefinition> properties =
pair.Value.Where(property => property.Direction == direction || property.Direction == DefinitionDirection.Both);
foreach (PropertyDefinition data in properties)
{
parameters.Add(data.ResolvedName, data.PropertyValue);
}
}
return parameters;
}
/// <summary>
/// Sets up the Dapper SQL Type mapper.
/// </summary>
/// <param name="type">The type we want to map the build definition to.</param>
/// <param name="builder">The graph that holds the property definitions for the type.</param>
private void SetupSqlMapper(Type type, IGraph builder)
{
SqlMapper.SetTypeMap(
type,
new CustomPropertyTypeMap(type, (typeToMap, columnName) =>
{
// Grab all of the property definitions on the entity defined with the IGraph
IEnumerable<PropertyDefinition> entityDefinition = builder.GetEntityGraph(typeToMap);
PropertyInfo propertyForColumn;
// Lookup a PropertyDefinition definition from the IGraph that can map to the columnName provided by the database.
PropertyDefinition propertyData = null;
if (this.dataStoreConfig.EnableSensitiveCasing)
{
propertyData = entityDefinition.FirstOrDefault(
definition => definition.ResolvedName.Equals(columnName) || definition.Property.Name.Equals(columnName));
}
else
{
propertyData = entityDefinition.FirstOrDefault(
definition =>
definition.ResolvedName.ToLower().Equals(columnName.ToLower()) ||
definition.Property.Name.ToLower().Equals(columnName.ToLower()));
}
// If a mapping definition was not found, use the TypePool to fetch the property info from the type cache.
// Otherwise we assign the property from the definition mapping.
if (propertyData == null)
{
propertyForColumn = this.dataStoreConfig.EnableSensitiveCasing
? TypePool.GetProperty(typeToMap, (info) => info.Name.Equals(columnName))
: TypePool.GetProperty(typeToMap, (info) => info.Name.ToLower().Equals(columnName.ToLower()));
}
else
{
propertyForColumn = propertyData.Property;
}
if (propertyForColumn == null)
{
Debug.WriteLine(string.Format("The column {0} could not be mapped to the Type {1}. It does not have a definition associated with the data graph, nor a property with a matching name.", columnName, typeToMap.Name));
}
return propertyForColumn;
}));
}
There are a couple different paths I'm considering here.
Override GetHashCode() on my IGraph implementation and have it return a hashed value of the Dictionary and RootKey properties. Then, in the repository, I ask for the graph's hash code, query the database and cache the returned results in a Dictionary of <HashCode, ResultSet>. The next time I create the builder, or re-use an existing one, the repository could do a key lookup and return the cached results.
Is this approach safe? Can I call GraphMap.GetHashCode() and rest assured that the hash will be based on the contents of the dictionary, and therefore (mostly) unique? Or would I have to iterate over each item in the dictionary, asking for hash codes on their members, to prevent hash code collisions?
Cache the expressions used to generate the map. Within my repository, I can generate a hash based on the hash code of each expression used on the builder. This way, if you ever use the same series of expressions to build the mapping, the repository would know and could return the previously fetched data.
Are hash codes a safe way to go, or should I be exploring different routes? Is there an industry-standard way of going about this?
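One relevant fact for the first option: Dictionary<TKey, TValue> does not override GetHashCode, so GraphMap.GetHashCode() reflects object identity rather than contents. Hash codes can also collide, so a bare hash is a risky cache key on its own. A possible alternative, sketched below under assumptions, is to build a deterministic string key from the procedure name and the input parameters. CacheKeyBuilder and Build are hypothetical names, and the parameters are modelled here as a plain name-to-value dictionary rather than the PropertyDefinition objects from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper, for illustration only.
public static class CacheKeyBuilder
{
    // Builds a key such as "GetAddressById|AddressId=5".
    // Parameters are sorted by name so insertion order does not matter.
    // Assumes names and values do not themselves contain '|' or '='.
    public static string Build(string procedureName, IDictionary<string, object> parameters)
    {
        if (procedureName == null) throw new ArgumentNullException(nameof(procedureName));
        if (parameters == null) throw new ArgumentNullException(nameof(parameters));

        var key = new StringBuilder(procedureName);
        foreach (var pair in parameters.OrderBy(p => p.Key, StringComparer.Ordinal))
        {
            key.Append('|').Append(pair.Key).Append('=').Append(pair.Value ?? "<null>");
        }
        return key.ToString();
    }
}
```

Using the full string as the dictionary key lets the dictionary resolve any hash collisions through equality comparison, so two different queries cannot silently share a cache entry. It does not solve the overlap problem between GetAddressById and GetAllAddresses; that would need a per-entity cache keyed by primary key.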
I am coding a parallel clustering algorithm in C#. I have a class called Classification which currently uses a Dictionary to associate a String label with a set of N-dimensional points. Each label has one or more points in its cluster. Classification also has a reverse index that associates each point with its label, which is in a second Dictionary.
The most expensive method currently performs many merge operations. A merge takes all members of a source cluster and moves them to the same target cluster. This operation updates both Dictionaries.
I have multiple processes that identify points that belong together in the same cluster. This relationship is transitive: if A belongs with B and B with C, then A, B and C belong in the same cluster. However, there is no easy way to segment the problem such that each process works on a disjoint subset of points. The processes will be operating on the same points.
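For reference, the merge operation described above can be sketched in its simplest thread-safe form, with one coarse lock guarding both dictionaries. This is only a baseline under stated assumptions, not a low-contention design: SimpleClassification, LabelOf and the gate field are hypothetical names, and each point is assumed to be added only once.

```csharp
using System.Collections.Generic;

// Hypothetical baseline, for illustration only: one lock guards both indexes.
public class SimpleClassification<TPoint, TLabel>
{
    private readonly Dictionary<TLabel, HashSet<TPoint>> labelToPoints =
        new Dictionary<TLabel, HashSet<TPoint>>();
    private readonly Dictionary<TPoint, TLabel> pointToLabel =
        new Dictionary<TPoint, TLabel>();
    private readonly object gate = new object();

    public int NumPartitions
    {
        get { lock (gate) { return labelToPoints.Count; } }
    }

    public TLabel LabelOf(TPoint point)
    {
        lock (gate) { return pointToLabel[point]; }
    }

    // Assumes the point has not been added before.
    public void Add(TPoint point, TLabel label)
    {
        lock (gate)
        {
            HashSet<TPoint> points;
            if (!labelToPoints.TryGetValue(label, out points))
            {
                points = new HashSet<TPoint>();
                labelToPoints[label] = points;
            }
            points.Add(point);
            pointToLabel[point] = label;
        }
    }

    // Moves every member of the source cluster into the target cluster.
    // Returns false if there was nothing to merge.
    public bool Merge(TLabel target, TLabel source)
    {
        lock (gate)
        {
            if (EqualityComparer<TLabel>.Default.Equals(target, source)) return false;

            HashSet<TPoint> sourcePoints;
            if (!labelToPoints.TryGetValue(source, out sourcePoints)) return false;

            HashSet<TPoint> targetPoints;
            if (!labelToPoints.TryGetValue(target, out targetPoints))
            {
                targetPoints = new HashSet<TPoint>();
                labelToPoints[target] = targetPoints;
            }

            foreach (TPoint point in sourcePoints)
            {
                targetPoints.Add(point);
                pointToLabel[point] = target; // keep the reverse index in sync
            }
            labelToPoints.Remove(source);
            return true;
        }
    }
}
```

Both dictionaries change inside the same critical section, so each merge is atomic, but all merges are serialized; that serialization is exactly the contention at issue here.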
How do I implement the merge method to minimize lock contention and ensure atomic operations? Here is a portion of my class definition. The final method, Merge, is the one I want to parallelize. That may require me to change my Dictionaries into parallel collections.
using System;
using System.Collections.Generic;
using System.Linq;
namespace Cluster
{
/// <summary>
/// Represent points grouped into class partitions where each point is associated with a class label
/// and each labeled class has a set of points.
///
/// This does not perform the unassisted clustering of points.
/// </summary>
/// <typeparam name="TPoint">Type of the points to be classified.</typeparam>
/// <typeparam name="TLabel">Type of the label that can be associated with a set within the classification.</typeparam>
public class Classification<TPoint, TLabel> where TLabel : IEquatable<TLabel>
{
#region Properties (LabelToPoints, PointToLabel, NumPartitions, NumPoints)
/// <summary>
/// Associates a class label with the points belonging to that class.
/// </summary>
public Dictionary<TLabel, ISet<TPoint>> LabelToPoints { get; set; }
/// <summary>
/// Associates a point with the label for its class.
/// </summary>
private Dictionary<TPoint, TLabel> PointToLabel { get; set; }
/// <summary>
/// Total number of class partitions that points are divided among.
/// </summary>
public int NumPartitions { get { return LabelToPoints.Count; } }
/// <summary>
/// Total number of points among all partitions.
/// </summary>
public int NumPoints { get { return PointToLabel.Count; } }
#endregion
#region Constructors
public Classification()
{
LabelToPoints = new Dictionary<TLabel, ISet<TPoint>>();
PointToLabel = new Dictionary<TPoint, TLabel>();
}
public Classification(IEnumerable<TPoint> points, Func<TPoint,TLabel> startingLabel) : this()
{
foreach (var point in points)
{
Add(point, startingLabel(point));
}
}
/// <summary>
/// Create a new Classification by randomly selecting a subset of the points in the current Classification.
/// </summary>
/// <param name="sampleSize">Number of points to include in the new Classification.</param>
/// <returns>A new Classification that has the given number of points.</returns>
public Classification<TPoint, TLabel> Sample(int sampleSize)
{
var subset = new Classification<TPoint, TLabel>();
foreach (var point in Points().TakeRandom(sampleSize, NumPoints))
subset.Add(point, GetClassLabel(point));
return subset;
}
#endregion
#region Modify the Classification (Add, Remove, Merge)
/// <summary>
/// Add a point to the classification with the associated label.
///
/// If the point was already classified, its old classification is removed.
/// </summary>
/// <param name="p">Point to add.</param>
/// <param name="classLabel">Label for classification.</param>
public void Add(TPoint p, TLabel classLabel)
{
Remove(p);
PointToLabel[p] = classLabel;
EnsurePartition(classLabel).Add(p);
}
private ISet<TPoint> EnsurePartition(TLabel classLabel)
{
ISet<TPoint> partition;
if (LabelToPoints.TryGetValue(classLabel, out partition)) return partition;
partition = new HashSet<TPoint>();
LabelToPoints[classLabel] = partition;
return partition;
}
/// <summary>
/// Remove a point from its class.
/// </summary>
/// <param name="p">Point to remove.</param>
/// <returns>True if the point was removed, false if it was not previously a member of any class.</returns>
public Boolean Remove(TPoint p)
{
TLabel label;
if (!PointToLabel.TryGetValue(p, out label)) return false;
PointToLabel.Remove(p);
var oldPoints = LabelToPoints[label];
var didRemove = oldPoints.Remove(p);
if (oldPoints.Count == 0)
LabelToPoints.Remove(label);
return didRemove;
}
/// <summary>
/// Merge all the members of the partitions labeled by any of the labels in sourceLabels
/// into the partition indicated by targetLabel.
/// </summary>
/// <param name="targetLabel">Move members into this labeled partition.</param>
/// <param name="sourceLabels">Move members out of these labeled partitions.</param>
public void Merge(TLabel targetLabel, IEnumerable<TLabel> sourceLabels)
{
var targetPartition = EnsurePartition(targetLabel);
foreach (var sourceLabel in sourceLabels.Where(sLabel => !sLabel.Equals(targetLabel)))
{
ISet<TPoint> singleSourcePoints;
if (!LabelToPoints.TryGetValue(sourceLabel, out singleSourcePoints)) continue;
// Add to LabelToPoints under new targetLabel
targetPartition.UnionWith(singleSourcePoints);
// Remove from LabelToPoints under old sourceLabel.
LabelToPoints.Remove(sourceLabel);
foreach (var p in singleSourcePoints)
PointToLabel[p] = targetLabel;
}
}
#endregion
}
}
UPDATE 1:
I am sure that changing the Dictionaries to ConcurrentDictionary is not enough. I think I need to take out locks on the target label and all the source labels. If any other process is in the middle of performing a merge with any of those labels as source or target, then I need to wait until that other process is done. How do you do this without creating deadlocks? I can sort the labels alphabetically to try and ensure the same order, but is that enough?
UPDATE 2:
New approach. I realized that the merge operations are order independent. I will create a multiple producer, single consumer queue. The producers find opportunities to merge, while the consumer serializes access to the Classification object. Since performing the merges takes longer than finding the merge candidates, the consumer won't starve. A better design would get better throughput, but this will still be an improvement.
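The queue described in UPDATE 2 could be sketched with BlockingCollection<T>, roughly as below. This is only an illustration under assumptions: MergeQueueDemo and MergeRequest are hypothetical names, a plain Dictionary of string labels stands in for the Classification class, and the sketch does not deal with requests that name a label which an earlier merge has already removed.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class MergeQueueDemo
{
    // Hypothetical merge request: move every source partition into the target.
    public sealed class MergeRequest
    {
        public string Target;
        public string[] Sources;
    }

    // Returns the number of partitions left after all queued merges are applied.
    public static int Run()
    {
        // Stand-in for Classification.LabelToPoints; only the consumer touches it.
        var labelToPoints = new Dictionary<string, HashSet<int>>();
        for (int i = 0; i < 8; i++)
            labelToPoints["L" + i] = new HashSet<int> { i };

        using (var queue = new BlockingCollection<MergeRequest>(1024))
        {
            // Single consumer: applies merges one at a time, so no locks are needed.
            var consumer = Task.Run(() =>
            {
                foreach (var request in queue.GetConsumingEnumerable())
                {
                    HashSet<int> target;
                    if (!labelToPoints.TryGetValue(request.Target, out target))
                    {
                        target = new HashSet<int>();
                        labelToPoints[request.Target] = target;
                    }
                    foreach (var source in request.Sources)
                    {
                        HashSet<int> points;
                        if (source == request.Target) continue;
                        if (!labelToPoints.TryGetValue(source, out points)) continue;
                        target.UnionWith(points);
                        labelToPoints.Remove(source);
                    }
                }
            });

            // Multiple producers: each one only enqueues the merges it has found.
            var producers = new[]
            {
                Task.Run(() => queue.Add(new MergeRequest { Target = "L0", Sources = new[] { "L1", "L2", "L3" } })),
                Task.Run(() => queue.Add(new MergeRequest { Target = "L4", Sources = new[] { "L5", "L6", "L7" } }))
            };
            Task.WaitAll(producers);

            queue.CompleteAdding(); // lets GetConsumingEnumerable finish
            consumer.Wait();
        }
        return labelToPoints.Count;
    }
}
```

The bounded capacity (1024 here, an arbitrary choice) makes producers block when the consumer falls behind, which keeps the queue from growing without limit.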
I would like to implement a simple in-memory LRU cache system and I was thinking about a solution based on an IDictionary implementation which could handle a hashed LRU mechanism.
Coming from Java, I have experience with LinkedHashMap, which works fine for what I need, but I can't find a similar solution for .NET anywhere.
Has anyone developed it or has anyone had experiences like this?
This is a very simple and fast implementation we developed for a web site we own.
We tried to improve the code as much as possible, while keeping it thread safe.
I think the code is very simple and clear, but if you need some explanation or a guide related to how to use it, don't hesitate to ask.
using System.Collections.Generic;
using System.Runtime.CompilerServices;
namespace LRUCache
{
public class LRUCache<K,V>
{
private int capacity;
private Dictionary<K, LinkedListNode<LRUCacheItem<K, V>>> cacheMap = new Dictionary<K, LinkedListNode<LRUCacheItem<K, V>>>();
private LinkedList<LRUCacheItem<K, V>> lruList = new LinkedList<LRUCacheItem<K, V>>();
public LRUCache(int capacity)
{
this.capacity = capacity;
}
[MethodImpl(MethodImplOptions.Synchronized)]
public V get(K key)
{
LinkedListNode<LRUCacheItem<K, V>> node;
if (cacheMap.TryGetValue(key, out node))
{
V value = node.Value.value;
lruList.Remove(node);
lruList.AddLast(node);
return value;
}
return default(V);
}
[MethodImpl(MethodImplOptions.Synchronized)]
public void add(K key, V val)
{
if (cacheMap.TryGetValue(key, out var existingNode))
{
lruList.Remove(existingNode);
}
else if (cacheMap.Count >= capacity)
{
RemoveFirst();
}
LRUCacheItem<K, V> cacheItem = new LRUCacheItem<K, V>(key, val);
LinkedListNode<LRUCacheItem<K, V>> node = new LinkedListNode<LRUCacheItem<K, V>>(cacheItem);
lruList.AddLast(node);
// Use the indexer rather than cacheMap.Add(key, node), which throws if the key already exists.
cacheMap[key] = node;
}
private void RemoveFirst()
{
// Remove from LRUPriority
LinkedListNode<LRUCacheItem<K,V>> node = lruList.First;
lruList.RemoveFirst();
// Remove from cache
cacheMap.Remove(node.Value.key);
}
}
class LRUCacheItem<K,V>
{
public LRUCacheItem(K k, V v)
{
key = k;
value = v;
}
public K key;
public V value;
}
}
There is nothing in the base class libraries that does this.
On the free side, maybe something like C5's HashedLinkedList would work.
If you're willing to pay, maybe check out this C# toolkit. It contains an implementation.
I've recently released a class called LurchTable to address the need for a C# variant of the LinkedHashMap. A brief discussion of the LurchTable can be found here.
Basic features:
Linked Concurrent Dictionary by Insertion, Modification, or Access
Dictionary/ConcurrentDictionary interface support
Peek/TryDequeue/Dequeue access to 'oldest' entry
Allows hard-limit on items enforced at insertion
Exposes events for add, update, and remove
Source Code: http://csharptest.net/browse/src/Library/Collections/LurchTable.cs
GitHub: https://github.com/csharptest/CSharpTest.Net.Collections
HTML Help: http://help.csharptest.net/
PM> Install-Package CSharpTest.Net.Collections
The LRUCache answer with sample code above uses MethodImplOptions.Synchronized, which is equivalent to putting lock(this) around each method call. Whilst correct, this global lock will significantly reduce throughput under concurrent load.
To solve this I implemented a thread-safe pseudo-LRU designed for concurrent workloads. Performance is very close to ConcurrentDictionary, ~10x faster than MemoryCache, and the hit rate is better than that of a conventional LRU. A full analysis is provided in the GitHub link below.
Usage looks like this:
int capacity = 666;
var lru = new ConcurrentLru<int, SomeItem>(capacity);
var value = lru.GetOrAdd(1, (k) => new SomeItem(k));
GitHub: https://github.com/bitfaster/BitFaster.Caching
Install-Package BitFaster.Caching
Found your answer while googling, and also found this:
http://code.google.com/p/csharp-lru-cache/
csharp-lru-cache: LRU cache collection class library
This is a collection class that functions as a least-recently-used cache. It implements ICollection<T>, but also exposes three other members:
Capacity, the maximum number of items the cache can contain. Once the collection is at capacity, adding a new item to the cache will cause the least recently used item to be discarded. If the Capacity is set to 0 at construction, the cache will not automatically discard items.
Oldest, the oldest (i.e. least recently used) item in the collection.
DiscardingOldestItem, an event raised when the cache is about to discard its oldest item.
This is an extremely simple implementation. While its Add and Remove methods are thread-safe, it shouldn't be used in heavy multithreading environments because the entire collection is locked during those methods.
This takes Martin's code with Mr T's suggestions and makes it Stylecop friendly. Oh, it also allows for disposal of values as they cycle out of the cache.
namespace LruCache
{
using System;
using System.Collections.Generic;
/// <summary>
/// A least-recently-used cache stored like a dictionary.
/// </summary>
/// <typeparam name="TKey">
/// The type of the key to the cached item
/// </typeparam>
/// <typeparam name="TValue">
/// The type of the cached item.
/// </typeparam>
/// <remarks>
/// Derived from https://stackoverflow.com/a/3719378/240845
/// </remarks>
public class LruCache<TKey, TValue>
{
private readonly Dictionary<TKey, LinkedListNode<LruCacheItem>> cacheMap =
new Dictionary<TKey, LinkedListNode<LruCacheItem>>();
private readonly LinkedList<LruCacheItem> lruList =
new LinkedList<LruCacheItem>();
private readonly Action<TValue> dispose;
/// <summary>
/// Initializes a new instance of the <see cref="LruCache{TKey, TValue}"/>
/// class.
/// </summary>
/// <param name="capacity">
/// Maximum number of elements to cache.
/// </param>
/// <param name="dispose">
/// When elements cycle out of the cache, disposes them. May be null.
/// </param>
public LruCache(int capacity, Action<TValue> dispose = null)
{
this.Capacity = capacity;
this.dispose = dispose;
}
/// <summary>
/// Gets the capacity of the cache.
/// </summary>
public int Capacity { get; }
/// <summary>Gets the value associated with the specified key.</summary>
/// <param name="key">
/// The key of the value to get.
/// </param>
/// <param name="value">
/// When this method returns, contains the value associated with the specified
/// key, if the key is found; otherwise, the default value for the type of the
/// <paramref name="value" /> parameter. This parameter is passed
/// uninitialized.
/// </param>
/// <returns>
/// true if the <see cref="T:System.Collections.Generic.Dictionary`2" />
/// contains an element with the specified key; otherwise, false.
/// </returns>
public bool TryGetValue(TKey key, out TValue value)
{
lock (this.cacheMap)
{
LinkedListNode<LruCacheItem> node;
if (this.cacheMap.TryGetValue(key, out node))
{
value = node.Value.Value;
this.lruList.Remove(node);
this.lruList.AddLast(node);
return true;
}
value = default(TValue);
return false;
}
}
/// <summary>
/// Looks for a value for the matching <paramref name="key"/>. If not found,
/// calls <paramref name="valueGenerator"/> to retrieve the value and add it to
/// the cache.
/// </summary>
/// <param name="key">
/// The key of the value to look up.
/// </param>
/// <param name="valueGenerator">
/// Generates a value if one isn't found.
/// </param>
/// <returns>
/// The requested value.
/// </returns>
public TValue Get(TKey key, Func<TValue> valueGenerator)
{
lock (this.cacheMap)
{
LinkedListNode<LruCacheItem> node;
TValue value;
if (this.cacheMap.TryGetValue(key, out node))
{
value = node.Value.Value;
this.lruList.Remove(node);
this.lruList.AddLast(node);
}
else
{
value = valueGenerator();
if (this.cacheMap.Count >= this.Capacity)
{
this.RemoveFirst();
}
LruCacheItem cacheItem = new LruCacheItem(key, value);
node = new LinkedListNode<LruCacheItem>(cacheItem);
this.lruList.AddLast(node);
this.cacheMap.Add(key, node);
}
return value;
}
}
/// <summary>
/// Adds the specified key and value to the dictionary.
/// </summary>
/// <param name="key">
/// The key of the element to add.
/// </param>
/// <param name="value">
/// The value of the element to add. The value can be null for reference types.
/// </param>
public void Add(TKey key, TValue value)
{
lock (this.cacheMap)
{
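LinkedListNode<LruCacheItem> existing;
if (this.cacheMap.TryGetValue(key, out existing))
{
// Replace the existing entry; Dictionary.Add below would otherwise throw
// after the list had already been modified.
this.lruList.Remove(existing);
this.cacheMap.Remove(key);
}
else if (this.cacheMap.Count >= this.Capacity)
{
this.RemoveFirst();
}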
LruCacheItem cacheItem = new LruCacheItem(key, value);
LinkedListNode<LruCacheItem> node =
new LinkedListNode<LruCacheItem>(cacheItem);
this.lruList.AddLast(node);
this.cacheMap.Add(key, node);
}
}
private void RemoveFirst()
{
// Remove from LRUPriority
LinkedListNode<LruCacheItem> node = this.lruList.First;
this.lruList.RemoveFirst();
// Remove from cache
this.cacheMap.Remove(node.Value.Key);
// dispose
this.dispose?.Invoke(node.Value.Value);
}
private class LruCacheItem
{
public LruCacheItem(TKey k, TValue v)
{
this.Key = k;
this.Value = v;
}
public TKey Key { get; }
public TValue Value { get; }
}
}
}
The Caching Application Block of EntLib has an LRU scavenging option out of the box and can be in memory. It might be a bit heavyweight for what you want, though.
I don't believe so. I've certainly seen hand-rolled ones implemented several times in various unrelated projects, which more or less confirms this: if there was one, surely at least one of the projects would have used it.
It's pretty simple to implement, and usually gets done by creating a class which contains both a Dictionary and a List.
The keys go in the list (in-order) and the items go in the dictionary.
When you add a new item to the collection, the function checks the length of the list, pulls out the last key if the list is too long, and then evicts that key and its value from the dictionary to match. Not much more to it, really.
I like Lawrence's implementation. Hashtable + LinkedList is a good solution.
Regarding threading, I would not lock this with [MethodImpl(MethodImplOptions.Synchronized)], but rather use ReaderWriterLockSlim or a spin lock (since contention is usually brief) instead.
In the Get function I would check if it's already the 1st item first, rather than always removing and adding. This gives you the possibility to keep that within a reader lock that is not blocking other readers.
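A rough sketch of that idea follows, assuming the most recently used node sits at the tail of the list as in the implementations above (so the check is against the last node rather than the first). RwLruCache and RwLruCacheDemo are hypothetical names; this is an illustration of the locking pattern, not a tuned implementation.

```csharp
using System.Collections.Generic;
using System.Threading;

public sealed class RwLruCache<TKey, TValue>
{
    private readonly int capacity;
    private readonly Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>> map =
        new Dictionary<TKey, LinkedListNode<KeyValuePair<TKey, TValue>>>();
    private readonly LinkedList<KeyValuePair<TKey, TValue>> order =
        new LinkedList<KeyValuePair<TKey, TValue>>();
    private readonly ReaderWriterLockSlim gate = new ReaderWriterLockSlim();

    public RwLruCache(int capacity) { this.capacity = capacity; }

    public bool TryGet(TKey key, out TValue value)
    {
        // Fast path under a read lock: a miss, or a hit that is already most recent.
        gate.EnterReadLock();
        try
        {
            LinkedListNode<KeyValuePair<TKey, TValue>> node;
            if (!map.TryGetValue(key, out node)) { value = default(TValue); return false; }
            if (ReferenceEquals(node, order.Last)) { value = node.Value.Value; return true; }
        }
        finally { gate.ExitReadLock(); }

        // Slow path: the node has to move, so take the write lock.
        gate.EnterWriteLock();
        try
        {
            LinkedListNode<KeyValuePair<TKey, TValue>> node;
            // Look up again: the entry may have been evicted between the two locks.
            if (!map.TryGetValue(key, out node)) { value = default(TValue); return false; }
            order.Remove(node);
            order.AddLast(node);
            value = node.Value.Value;
            return true;
        }
        finally { gate.ExitWriteLock(); }
    }

    public void Add(TKey key, TValue value)
    {
        gate.EnterWriteLock();
        try
        {
            LinkedListNode<KeyValuePair<TKey, TValue>> existing;
            if (map.TryGetValue(key, out existing))
            {
                order.Remove(existing); // replacing an existing key
            }
            else if (map.Count >= capacity && order.First != null)
            {
                map.Remove(order.First.Value.Key); // evict the least recently used entry
                order.RemoveFirst();
            }
            map[key] = order.AddLast(new KeyValuePair<TKey, TValue>(key, value));
        }
        finally { gate.ExitWriteLock(); }
    }
}

public static class RwLruCacheDemo
{
    // Returns true if "b" (least recently used) is evicted and "a" and "c" survive.
    public static bool Run()
    {
        var cache = new RwLruCache<string, int>(2);
        int v;
        cache.Add("a", 1);
        cache.Add("b", 2);
        cache.TryGet("a", out v); // "a" becomes the most recently used entry
        cache.Add("c", 3);        // evicts "b"
        return !cache.TryGet("b", out v) && cache.TryGet("a", out v) && cache.TryGet("c", out v);
    }
}
```

Whether this beats a plain lock depends on the access pattern: only repeated hits on the most recent entry and misses stay on the read path, and every other hit still needs the write lock.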
I just stumbled upon LruCache.cs in aws-sdk-net: https://github.com/aws/aws-sdk-net/blob/master/sdk/src/Core/Amazon.Runtime/Internal/Util/LruCache.cs
If it's an ASP.NET app you can use the Cache class[1], but you'll be competing for space with other cached stuff, which may or may not be what you want.
[1] http://msdn.microsoft.com/en-us/library/system.web.caching.cache.aspx