I have the following helper class (simplified):
public static class Cache
{
private static readonly object _syncRoot = new object();
private static Dictionary<Type, string> _lookup = new Dictionary<Type, string>();
public static void Add(Type type, string value)
{
lock (_syncRoot)
{
_lookup.Add(type, value);
}
}
public static string Lookup(Type type)
{
string result;
lock (_syncRoot)
{
_lookup.TryGetValue(type, out result);
}
return result;
}
}
Add will be called roughly 10 to 100 times in the application, and Lookup will be called by many threads, many thousands of times. What I would like is to get rid of the read lock.
How do you normally get rid of the read lock in this situation?
I have the following ideas:
Require that _lookup is stable before the application starts operation. It could be built up from an Attribute. This is done automatically through the static constructor the attribute is assigned to. Requiring the above would mean going through all types that could have the attribute and calling RuntimeHelpers.RunClassConstructor, which is an expensive operation.
Move to COW semantics.
public static void Add(Type type, string value)
{
lock (_syncRoot)
{
var lookup = new Dictionary<Type, string>(_lookup);
lookup.Add(type, value);
_lookup = lookup;
}
}
(With the lock (_syncRoot) removed in the Lookup method.) The problem with this is that it uses an unnecessary amount of memory (which might not be a problem), and I would probably make _lookup volatile, but I'm not sure how this should be applied. (Jon Skeet's comment here gives me pause.)
Using ReaderWriterLock. I believe this would make things worse since the region being locked is small.
Suggestions are very welcome.
UPDATE:
The values of the cache are immutable.
To remove locks completely (slightly different from "lock free", where locks are almost eliminated and the remaining ones are cleverly replaced with Interlocked instructions), you need to make sure that your dictionary is immutable. If the items in the dictionary are not immutable (and as a result have their own locks), you probably should not worry about locking at the dictionary level.
Option 1 is the best and easiest solution if you can use it.
Option 2 is reasonable and easy to debug. (Note: as written it does not work well for concurrent adding of the same item. Consider the double-checked locking pattern if needed - Double-checked locking in .NET)
Option 3: I would not do it if 1 or 2 is an option.
If you can use the new 4.0 collections, ConcurrentDictionary matches your criteria (see http://msdn.microsoft.com/en-us/library/dd997305.aspx and http://blogs.msdn.com/b/pfxteam/archive/2010/01/26/9953725.aspx).
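A minimal sketch of what the ConcurrentDictionary variant might look like, assuming .NET 4.0 or later. The class name ConcurrentCache is hypothetical, and Add returns a bool here instead of throwing on a duplicate, which is a deliberate change from the original Add.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical replacement for the Cache class in the question.
public static class ConcurrentCache
{
    // Reads on ConcurrentDictionary (such as TryGetValue) are performed without taking a lock.
    private static readonly ConcurrentDictionary<Type, string> _lookup =
        new ConcurrentDictionary<Type, string>();

    // Returns false (rather than throwing) when the type is already registered.
    public static bool Add(Type type, string value)
    {
        return _lookup.TryAdd(type, value);
    }

    // Returns null when the type has not been registered.
    public static string Lookup(Type type)
    {
        string result;
        _lookup.TryGetValue(type, out result);
        return result;
    }
}
```

Since the cached values are immutable strings, no further synchronization is needed around the values themselves.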
At work at the moment, so nothing elegant, came up with this (untested)
public static class Cache
{
private static readonly object _syncRoot = new object();
private static Dictionary<Type, string> _lookup = new Dictionary<Type, string>();
public static class OneToManyLocker
{
private static readonly Object WriteLocker = new Object();
private static readonly List<Object> ReadLockers = new List<Object>();
private static readonly Object myLocker = new Object();
public static Object GetLock(LockType lockType)
{
lock(WriteLocker)
{
if(lockType == LockType.Read)
{
var newReadLocker = new Object();
lock(myLocker)
{
ReadLockers.Add(newReadLocker);
}
return newReadLocker;
}
foreach(var readLocker in ReadLockers)
{
lock(readLocker) { }
}
return WriteLocker;
}
}
public enum LockType {Read, Write};
}
public static void Add(Type type, string value)
{
lock(OneToManyLocker.GetLock(OneToManyLocker.LockType.Write))
{
_lookup.Add(type, value);
}
}
public static string Lookup(Type type)
{
string result;
lock (OneToManyLocker.GetLock(OneToManyLocker.LockType.Read))
{
_lookup.TryGetValue(type, out result);
}
return result;
}
}
You will need some sort of cleanup for the read lockers, but this should be thread-safe, allowing multiple reads at a time while also locking on writes, unless I'm totally missing something.
Either:
Don't use normal locks; go with a spinlock if the lookup is fast (a dictionary lookup is not).
If that is not the case, then use ReaderWriterLock (http://msdn.microsoft.com/en-us/library/system.threading.readerwriterlock.aspx). This allows multiple readers and only one writer.
Related
Currently I'm facing an issue with Dictionary<ulong, IStoreableObject>: when two different types that implement IStoreableObject have the same Id, everything gets mixed up.
I came up with something like this in order to "fix it", but I'm not sure whether this is a bad design, whether it is thread-unsafe, or whether there will be other kinds of issues. Could someone help me figure out how to properly separate the dictionaries for different types that implement the same interface, so that colliding ids don't cause issues? This is my code so far:
using System.Collections.Generic;
namespace SqlExpress
{
internal sealed class Cache<T> where T : IStoreableObject
{
private readonly Dictionary<ulong, T> _cache = new Dictionary<ulong, T>();
private readonly object _lock = new object();
private static Cache<T> _instance = null;
internal static Cache<T> Instance
{
get
{
if (_instance is null)
{
_instance = new Cache<T>();
}
return _instance;
}
}
private Cache() { }
internal void AddOrUpdate(T obj)
{
if (_cache.ContainsKey(obj.Id))
{
_cache[obj.Id] = obj;
}
else
{
lock (_lock)
{
_cache.Add(obj.Id, obj);
}
}
}
internal T Get(ulong id)
{
if (_cache.ContainsKey(id))
{
return _cache[id];
}
return default(T);
}
internal void Remove(ulong id)
{
if (_cache.ContainsKey(id))
{
_cache.Remove(id);
}
}
}
}
Expected behavior:
Let's assume we have two types, Foo and Bar, which both implement IStoreableObject.
Two or more Foo objects mustn't have the same Id, and two or more Bar objects mustn't have the same Id either. However, the Foo and Bar caches should not be related to each other, so if an object in Cache<Foo> has the same id as another object in Cache<Bar>, nothing bad should happen. This can't be done with Dictionary<ulong, IStoreableObject>, since all underlying types are in the same cache. I want them in different caches, and each cache should exist only once: there should be only one Cache<Foo> and only one Cache<Bar>.
As others have pointed out, the only way to prevent duplicate keys is by including the Type in the key somehow.
You can do that by nesting dictionaries, like this:
Dictionary<Type, Dictionary<ulong, T>> _cache
But I would probably use a ValueTuple:
Dictionary<(Type, ulong), T> _cache
There also is a string concatenation approach $"{typeof(T).Name}_{id}", but that again can generate duplicates if the type happens to have the same name but lives in a separate namespace. So it's best to consider the complete Type.
Also, consider using ConcurrentDictionary as it seems like you're in a multi-threaded environment here.
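A rough sketch combining the ValueTuple key with ConcurrentDictionary. The shape of IStoreableObject (a ulong Id property) is inferred from the question, and TypedCache, Foo, and Bar are hypothetical names used only for illustration. Note that the key uses typeof(T), the compile-time type argument, not the runtime type of the object.

```csharp
using System;
using System.Collections.Concurrent;

// Assumed shape of the interface from the question.
public interface IStoreableObject { ulong Id { get; } }

// Hypothetical sample types that deliberately share ids.
public sealed class Foo : IStoreableObject { public ulong Id { get; } public Foo(ulong id) { Id = id; } }
public sealed class Bar : IStoreableObject { public ulong Id { get; } public Bar(ulong id) { Id = id; } }

public sealed class TypedCache
{
    // The key is (type, id), so Foo #1 and Bar #1 occupy different slots.
    private readonly ConcurrentDictionary<(Type, ulong), IStoreableObject> _cache =
        new ConcurrentDictionary<(Type, ulong), IStoreableObject>();

    public void AddOrUpdate<T>(T obj) where T : IStoreableObject
    {
        // The indexer adds the entry or overwrites an existing one atomically.
        _cache[(typeof(T), obj.Id)] = obj;
    }

    public T Get<T>(ulong id) where T : IStoreableObject
    {
        IStoreableObject found;
        return _cache.TryGetValue((typeof(T), id), out found) ? (T)found : default(T);
    }

    public bool Remove<T>(ulong id) where T : IStoreableObject
    {
        IStoreableObject removed;
        return _cache.TryRemove((typeof(T), id), out removed);
    }
}
```

With this layout a single cache instance can hold every type, and removing Foo 1 leaves Bar 1 untouched.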
I'm looking for a solution that allows multiple threads to read the shared resource (concurrency permitted) but blocks these reading threads once a thread enters a mutating block, to achieve the best of both worlds.
I've looked up this reference but it seems the solution is to lock both reading and writing threads.
class Foo {
List<string> sharedResource;
public void reading() // multiple reading threads allowed, concurrency ok, lock this only if a thread enters the mutating block below.
{
}
public void mutating() // this should lock any threads entering this block as well as lock the reading threads above
{
lock(this)
{
}
}
}
Is there such a solution in C#?
Edit
All threads entering both GetMultiton and the constructor should return the same instance. I want them to be thread safe.
class Foo: IFoo {
public static IFoo GetMultiton(string key, Func<IFoo> fooRef)
{
if (instances.TryGetValue(key, out IFoo obj))
{
return obj;
}
return fooRef();
}
public Foo(string key) {
instances.Add(key, this);
}
}
protected static readonly IDictionary<string, IFoo> instances = new ConcurrentDictionary<string, IFoo>();
Use
Foo.GetMultiton("key1", () => new Foo("key1"));
There is a pre-built class for this behavior ReaderWriterLockSlim
class Foo {
List<string> sharedResource;
ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
public void reading() // multiple reading threads allowed, concurrency ok, lock this only if a thread enters the mutating block below.
{
_lock.EnterReadLock();
try
{
//Do reading stuff here.
}
finally
{
_lock.ExitReadLock();
}
}
public void mutating() // this should lock any threads entering this block as well as lock the reading threads above
{
_lock.EnterWriteLock();
try
{
//Do writing stuff here.
}
finally
{
_lock.ExitWriteLock();
}
}
}
Multiple threads can hold the read lock at the same time. When a thread requests the write lock, it blocks until all current readers finish, and new readers and writers are blocked until the write lock is released.
With your update you don't need locks at all. Just use GetOrAdd from ConcurrentDictionary
class Foo: IFoo {
public static IFoo GetMultiton(string key, Func<IFoo> fooRef)
{
return instances.GetOrAdd(key, k=> fooRef());
}
public Foo(string key) {
instances.Add(key, this);
}
}
Note that fooRef() may be called more than once, but only the first one to return will be used as the result for all the threads. If you want fooRef() to only be called once it will require slightly more complicated code.
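class Foo: IFoo {
public static IFoo GetMultiton(string key, Func<IFoo> fooRef)
{
// Only the Lazy<IFoo> that ends up stored has its Value read, so fooRef runs at most once per key.
return instances.GetOrAdd(key, k => new Lazy<IFoo>(fooRef)).Value;
}
public Foo(string key) {
// TryAdd rather than Add: when construction was triggered by GetMultiton, the key is already present.
instances.TryAdd(key, new Lazy<IFoo>(() => this));
}
// Declared as ConcurrentDictionary (not IDictionary) so that GetOrAdd and TryAdd are available.
protected static readonly ConcurrentDictionary<string, Lazy<IFoo>> instances = new ConcurrentDictionary<string, Lazy<IFoo>>();
}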
The solution depends on your requirements. If the performance of ReaderWriterLockSlim is not sufficient (note that it is approximately twice as slow as a regular lock in the current .NET Framework, so it pays off only if you modify rarely and reading is a fairly heavy operation; otherwise the overhead outweighs the gain), you can try to create a copy of the data, modify it, and atomically swap the reference with the help of the Interlocked class (provided it is not a requirement that every thread sees the most recent data as soon as it changes).
class Foo
{
IReadOnlyList<string> sharedResource = new List<string>();
public void reading()
{
// Here you can safely* read from sharedResource
}
public void mutating()
{
var copyOfData = new List<string>(sharedResource);
// modify copyOfData here
// Following line is correct only in case of single writer:
Interlocked.Exchange(ref sharedResource, copyOfData);
}
}
Benefits of lock-free case:
We have no locks on read, so we get maximum performance.
Drawbacks:
We have to copy data => memory traffic (allocations, garbage collection)
Reader thread can observe not the most recent update (if it reads reference before it was updated)
If a reader uses the sharedResource reference multiple times and assumes it is the same collection each time, it must first copy the reference to a local variable (for example via Volatile.Read) and use only that local
If sharedResource is a list of mutable objects, then we must be careful with updating these objects in mutating, since a reader might be using them at the same moment => in this case it's better to make copies of these objects as well
If there are several updater threads, then we must use Interlocked.CompareExchange instead of Interlocked.Exchange in mutating, inside a retry loop
So, if you want to go lock-free, then it's better to use immutable objects. And anyway you will pay with memory allocations/GC for the performance.
UPDATE
Here is version that allows multiple writers as well:
class Foo
{
IReadOnlyList<string> sharedResource = new List<string>();
public void reading()
{
// Here you can safely* read from sharedResource
}
public void mutating()
{
IReadOnlyList<string> referenceToCollectionForCopying;
List<string> copyOfData;
do
{
referenceToCollectionForCopying = Volatile.Read(ref sharedResource);
copyOfData = new List<string>(referenceToCollectionForCopying);
// modify copyOfData here
} while (!ReferenceEquals(Interlocked.CompareExchange(ref sharedResource, copyOfData,
referenceToCollectionForCopying), referenceToCollectionForCopying));
}
}
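As a variation on the same idea, the copy-and-swap loop can be delegated to the immutable collections. This is only a sketch, assuming the System.Collections.Immutable package is available (it is a separate NuGet package on older frameworks); ImmutableFoo, Count, and Append are hypothetical names.

```csharp
using System.Collections.Immutable;
using System.Threading;

class ImmutableFoo
{
    // An ImmutableList never changes after creation, so a reader's snapshot stays consistent.
    private ImmutableList<string> sharedResource = ImmutableList<string>.Empty;

    public int Count()
    {
        // Take one snapshot and use only that local for the rest of the read.
        ImmutableList<string> snapshot = Volatile.Read(ref sharedResource);
        return snapshot.Count;
    }

    public void Append(string item)
    {
        // Update re-runs the transformation if another writer swapped the field first,
        // which is the same compare-and-swap retry loop written out by hand above.
        ImmutableInterlocked.Update(ref sharedResource, list => list.Add(item));
    }
}
```

Adding to an ImmutableList shares most of its internal structure with the previous version, so this avoids copying the whole list on every write, at the cost of slower indexing than List<string>.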
I need to make a critical section in an area on the basis of a finite set of strings. I want the lock to be shared for the same string instance, (somewhat similar to String.Intern approach).
I am considering the following implementation:
public class Foo
{
private readonly string _s;
private static readonly HashSet<string> _locks = new HashSet<string>();
public Foo(string s)
{
_s = s;
_locks.Add(s);
}
public void LockMethod()
{
lock(_locks.Single(l => l == _s))
{
...
}
}
}
Are there any problems with this approach? Is it OK to lock on a string object in this way, and are there any thread safety issues in using the HashSet<string>?
Is it better to, for example, create a Dictionary<string, object> that creates a new lock object for each string instance?
Final Implementation
Based on the suggestions I went with the following implementation:
public class Foo
{
private readonly string _s;
private static readonly ConcurrentDictionary<string, object> _locks = new ConcurrentDictionary<string, object>();
public Foo(string s)
{
_s = s;
}
public void LockMethod()
{
lock(_locks.GetOrAdd(_s, _ => new object()))
{
...
}
}
}
Locking on strings is discouraged. The main reason is that (because of string interning) some other code could lock on the same string instance without you knowing it, creating a potential for deadlock situations.
Now this is probably a far-fetched scenario in most concrete situations. It's more a general rule for libraries.
But on the other hand, what is the perceived benefit of strings?
So, point for point:
Are there any problems with this approach?
Yes, but mostly theoretical.
Is it OK to lock on a string object in this way, and are there any thread safety issues in using the HashSet?
The HashSet<> is not involved in the thread-safety as long as the threads only read concurrently.
Is it better to, for example, create a Dictionary that creates a new lock object for each string instance?
Yes. Just to be on the safe side. In a large system the main aim for avoiding deadlock is to keep the lock-objects as local and private as possible. Only a limited amount of code should be able to access them.
I'd say it's a really bad idea, personally. That isn't what strings are for.
(Personally I dislike the fact that every object has a monitor in the first place, but that's a slightly different concern.)
If you want an object which represents a lock which can be shared between different instances, why not create a specific type for that? You can give the lock a name easily enough for diagnostic purposes, but locking is really not the purpose of a string. Something like this:
public sealed class Lock
{
private readonly string name;
public string Name { get { return name; } }
public Lock(string name)
{
if (name == null)
{
throw new ArgumentNullException("name");
}
this.name = name;
}
}
Given the way that strings are sometimes interned and sometimes not (in a way which can occasionally be difficult to discern by simple inspection), you could easily end up with accidentally shared locks where you didn't intend them.
Locking on strings can be problematic, because interned strings are essentially global.
Interned strings are per process, so they are even shared among different AppDomains. The same goes for Type objects (so don't lock on typeof(x) either).
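A small illustration of why a string makes an unpredictable lock object: whether two equal strings are the same instance depends on how they were created. This is a sketch only; InternDemo and its method names are hypothetical, and the literal-sharing behaviour shown is what the runtime normally does for literals within one assembly.

```csharp
using System;

public static class InternDemo
{
    // Two identical literals in the same assembly normally refer to one interned instance,
    // so unrelated code locking on "shared-key" would contend for the same monitor.
    public static bool LiteralsShareInstance()
    {
        string a = "shared-key";
        string b = "shared-key";
        return object.ReferenceEquals(a, b);
    }

    // A string built at run time is a separate instance with its own monitor,
    // even though it compares equal with ==.
    public static bool RuntimeStringIsSeparate()
    {
        string literal = "shared-key";
        string built = new string("shared-key".ToCharArray());
        return literal == built && !object.ReferenceEquals(literal, built);
    }

    // string.Intern maps an equal string back to the instance held in the intern pool.
    public static bool InternReturnsPooledInstance()
    {
        string built = new string("shared-key".ToCharArray());
        return object.ReferenceEquals("shared-key", string.Intern(built));
    }
}
```

Because the outcome depends on where each string came from, a dedicated lock object per key is easier to reason about.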
I had a similar issue not long ago where I was looking for a good way to lock a section of code based on a string value. Here's what we have in place at the moment, that solves the problem of interned strings and has the granularity we want.
The main idea is to maintain a static ConcurrentDictionary of sync objects with a string key. When a thread enters the method, it immediately establishes a lock and attempts to add the sync object to the concurrent dictionary. If we can add to the concurrent dictionary, it means that no other threads have a lock based on our string key and we can continue our work. Otherwise, we'll use the sync object from the concurrent dictionary to establish a second lock, which will wait for the running thread to finish processing. When the second lock is released, we can attempt to add the current thread's sync object to the dictionary again.
One word of caution: the threads aren't queued, so if multiple threads with the same string key are competing simultaneously for a lock, there are no guarantees about the order in which they will be processed.
Feel free to critique if you think I've overlooked something.
public class Foo
{
private static ConcurrentDictionary<string, object> _lockDictionary = new ConcurrentDictionary<string, object>();
public void DoSomethingThreadCriticalByString(string lockString)
{
object thisThreadSyncObject = new object();
lock (thisThreadSyncObject)
{
try
{
for (; ; )
{
object runningThreadSyncObject = _lockDictionary.GetOrAdd(lockString, thisThreadSyncObject);
if (runningThreadSyncObject == thisThreadSyncObject)
break;
lock (runningThreadSyncObject)
{
// Wait for the currently processing thread to finish and try inserting into the dictionary again.
}
}
// Do your work here.
}
finally
{
// Remove the key from the lock dictionary
object dummy;
_lockDictionary.TryRemove(lockString, out dummy);
}
}
}
}
In the following code:
public class SomeItem { }
public class SomeItemsBag : ConcurrentBag< SomeItem > { }
public class SomeItemsList : List< SomeItem > { }
public static class Program
{
private static ConcurrentDictionary< string, SomeItemsBag > _SomeItemsBag;
private static ConcurrentDictionary< string, SomeItemsList > _SomeItemsList;
private static void GetItem(string key)
{
var bag = _SomeItemsBag[key];
var list= _SomeItemsList[key];
...
}
}
My assumption is that bag is threadsafe and list is not. Is this the right way to deal with a dictionary of lists in a multithreaded app?
Edited to add:
Only 1 thread would be adding to the bag/list and another thread would remove, but many threads could access.
Your assumptions that the ConcurrentBag is thread safe and the List is not are correct. But, you can synchronise access to the list, for example:
private static ConcurrentDictionary< string, SomeItemsBag > _SomeItemsBag;
private static ConcurrentDictionary< string, SomeItemsList > _SomeItemsList;
private static object _someItemsListLocker = new object();
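private static void GetItem(string key)
{
var bag = _SomeItemsBag[key];
// The dictionary lookup itself is already thread-safe; it is the List that needs protecting.
var list = _SomeItemsList[key];
lock (_someItemsListLocker) {
// Every read or write of list must happen here, while the lock is held.
}
}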
However, you're better off describing the situation completely if you want more holistic advice as to what data structure you should be using. Note that there are also ConcurrentQueue and ConcurrentStack which may be better for what you want over the list. They are optimised in multi-threaded scenarios since addition and removal can only happen on one side respectively (same sides for stack, opposite sides for queue).
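For the one-adder, one-remover scenario described in the question, a ConcurrentQueue per key might look like the sketch below. ItemQueues and its methods are hypothetical names, and SomeItem stands in for the class from the question.

```csharp
using System.Collections.Concurrent;

public class SomeItem { }

public static class ItemQueues
{
    private static readonly ConcurrentDictionary<string, ConcurrentQueue<SomeItem>> _queues =
        new ConcurrentDictionary<string, ConcurrentQueue<SomeItem>>();

    // Called by the adding thread: creates the queue for the key on first use.
    public static void Enqueue(string key, SomeItem item)
    {
        _queues.GetOrAdd(key, _ => new ConcurrentQueue<SomeItem>()).Enqueue(item);
    }

    // Called by the removing thread: returns false if the key is unknown or the queue is empty.
    public static bool TryDequeue(string key, out SomeItem item)
    {
        item = null;
        ConcurrentQueue<SomeItem> queue;
        return _queues.TryGetValue(key, out queue) && queue.TryDequeue(out item);
    }
}
```

No explicit lock is needed here, because both the dictionary and the queues handle their own synchronization.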
I have a static in-memory cache that is written to only once an hour (or longer), and read by many threads at an extremely high rate. Conventional wisdom suggests I follow a pattern such as the following:
public static class MyCache
{
private static IDictionary<int, string> _cache;
private static ReaderWriterLockSlim _sharedLock;
static MyCache()
{
_cache = new Dictionary<int, string>();
_sharedLock = new ReaderWriterLockSlim();
}
public static string GetData(int key)
{
_sharedLock.EnterReadLock();
try
{
string returnValue;
_cache.TryGetValue(key, out returnValue);
return returnValue;
}
finally
{
_sharedLock.ExitReadLock();
}
}
public static void AddData(int key, string data)
{
_sharedLock.EnterWriteLock();
try
{
if (!_cache.ContainsKey(key))
_cache.Add(key, data);
}
finally
{
_sharedLock.ExitWriteLock();
}
}
}
As an exercise in micro-optimization, how can I shave off even more ticks in the relative expense of shared read locks? Time to write can be expensive, since it rarely happens. I need to make reads as fast as possible. Can I just drop the read locks (below) and remain thread-safe in this scenario? Or is there a lock-free version I can use? I'm familiar with memory-fencing but don't know how to safely apply it in this instance.
Note: I'm not tied to either pattern, so any suggestions are welcome as long as the end result is faster and in C# 4.x.
public static class MyCache2
{
private static IDictionary<int, string> _cache;
private static object _fullLock;
static MyCache2()
{
_cache = new Dictionary<int, string>();
_fullLock = new object();
}
public static string GetData(int key)
{
//Note: There is no locking here... Is that ok?
string returnValue;
_cache.TryGetValue(key, out returnValue);
return returnValue;
}
public static void AddData(int key, string data)
{
lock (_fullLock)
{
if (!_cache.ContainsKey(key))
_cache.Add(key, data);
}
}
}
You don't need a lock when there are threads only ever reading from the data structure. So, since writes are so rare (and, I assume, not concurrent), an option might be to make a full copy of the dictionary, make the modifications to the copy, and then atomically exchange the old dictionary with the new one:
public static class MyCache2
{
private static IDictionary<int, string> _cache;
static MyCache2()
{
_cache = new Dictionary<int, string>();
}
public static string GetData(int key)
{
string returnValue;
_cache.TryGetValue(key, out returnValue);
return returnValue;
}
public static void AddData(int key, string data)
{
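// Copy constructor makes the full copy; only safe if AddData is never called concurrently.
IDictionary<int, string> clone = new Dictionary<int, string>(_cache);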
if (!clone.ContainsKey(key))
clone.Add(key, data);
Interlocked.Exchange(ref _cache, clone);
}
}
I would be looking to go lock free here, and achieve thread safety by simply not changing any published dictionary. What I mean is: when you need to add data, create a complete copy of the dictionary, and append/update/etc the copy. Since this is once an hour this shouldn't be a problem even for large data. Then, when you have made the changes, simply swap the reference from the old dictionary to the new dictionary (reference reads/writes are guaranteed to be atomic).
One caveat: any code that needs consistent state between multiple operations should capture the dictionary into a variable first, i.e.
var snapshot = someField;
// multiple reads on snapshot
This ensures that any related logic is all made using the same version of the data, to avoid confusion when the reference swaps during the operation.
I would also take a lock when writing (not when reading) to ensure no squabbling over the data. There are lock-free multi-writer approaches too (primarily Interlocked.CompareExchange and reapply if it fails), but I would use the simplest approach first, and a single writer is exactly that.
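The single-writer approach described above might be sketched as follows. The class name SnapshotCache is hypothetical, and marking the field volatile is one assumed way of making sure readers pick up the newly published dictionary.

```csharp
using System.Collections.Generic;

public static class SnapshotCache
{
    private static readonly object _writeLock = new object();

    // A published dictionary is never modified again; writers only replace the reference.
    private static volatile Dictionary<int, string> _cache = new Dictionary<int, string>();

    public static string GetData(int key)
    {
        // Capture the current dictionary once; no lock is taken on the read path.
        Dictionary<int, string> snapshot = _cache;
        string value;
        snapshot.TryGetValue(key, out value);
        return value;
    }

    public static void AddData(int key, string data)
    {
        // The lock serializes writers so that no update is lost between copy and swap.
        lock (_writeLock)
        {
            var copy = new Dictionary<int, string>(_cache);
            if (!copy.ContainsKey(key))
                copy.Add(key, data);
            _cache = copy; // reference assignment is atomic
        }
    }
}
```

A reader that started before the swap simply finishes against the old dictionary, which stays valid because nothing mutates it.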
Alternative option: the .net 1.x Hashtable (essentially Dictionary, minus the generics) has an interesting threading story; the reads are thread safe without locks - you only need to use locks to ensure at most one writer.
So: you might consider using a non-generic Hashtable, no locking on reads, and then take a lock during writes.
This is the main reason I still find myself using Hashtable sometimes, even in .net 4.x applications.
One problem though - it'll cause the int key to be boxed for both storage and query.
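A minimal sketch of the Hashtable variant, relying on the documented guarantee that Hashtable supports multiple readers alongside a single writer. HashtableCache is a hypothetical name.

```csharp
using System.Collections;

public static class HashtableCache
{
    private static readonly Hashtable _cache = new Hashtable();
    private static readonly object _writeLock = new object();

    public static string GetData(int key)
    {
        // No lock on reads. The int key is boxed here; a missing key yields null.
        return (string)_cache[key];
    }

    public static void AddData(int key, string data)
    {
        // The lock only ensures there is never more than one writer at a time.
        lock (_writeLock)
        {
            if (!_cache.ContainsKey(key))
                _cache.Add(key, data);
        }
    }
}
```

Unlike the copy-and-swap approach, this mutates the table in place, so writes are cheap, but every lookup pays for boxing the key.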
This makes a copy of the dictionary only when data is being added. A lock is used for adding, but you can take that out if you don't intend to add from more than one thread. If there is no copy, data is pulled from the original dictionary; otherwise the copy is used while the add is in progress.
The copy could get nulled out after it has been checked and found to be non-null, but before the value is retrieved. To cover that case I added a try/catch: in that rare event the data is pulled from the original dictionary under the lock. Again, this should happen very rarely, if at all.
public static class MyCache2
{
private static IDictionary<int, string> _cache;
private static IDictionary<int, string> _cacheClone;
private static Object _lock;
static MyCache2()
{
_cache = new Dictionary<int, string>();
_lock = new Object();
}
public static string GetData(int key)
{
string returnValue;
if (_cacheClone == null)
{
_cache.TryGetValue(key, out returnValue);
}
else
{
try
{
_cacheClone.TryGetValue(key, out returnValue);
}
catch
{
lock (_lock)
{
_cache.TryGetValue(key, out returnValue);
}
}
}
return returnValue;
}
public static void AddData(int key, string data)
{
lock (_lock)
{
_cacheClone = new Dictionary<int, string>(_cache);
if (!_cache.ContainsKey(key))
_cache.Add(key, data);
_cacheClone = null;
}
}
}
You might also look at lock free data structures. http://www.boyet.com/Articles/LockfreeStack.html is a good example