Does replacing a value associated with a ConcurrentDictionary key lock any dictionary operations beyond that key?
EDIT: For example, I'd like to know if either thread will ever block the other, besides when the keys are first added, in the following:
public static class Test {
private static ConcurrentDictionary<int, int> cd = new ConcurrentDictionary<int, int>();
static Test() { // a static constructor cannot declare an access modifier
new Thread(UpdateItem1).Start();
new Thread(UpdateItem2).Start();
}
private static void UpdateItem1() {
while (true) cd[1] = 0;
}
private static void UpdateItem2() {
while (true) cd[2] = 0;
}
}
Initially I assumed it does, because for example dictionary[key] = value; could refer to a key that is not present yet. However, while working I realized that if an add is necessary it could occur after a separate lock escalation.
I was drafting the following class, but the indirection provided by the AccountCacheLock class is unnecessary if the answer to this question (above) is "no". In fact, all of my own lock management is pretty much unneeded.
// A flattened subset of repository user values that are referenced for every member page access
public class AccountCache {
// The AccountCacheLock wrapper allows the AccountCache item to be updated in a locally-confined account-specific lock.
// Otherwise, one of the following would be necessary:
// Replace a ConcurrentDictionary item, requiring a lock on the ConcurrentDictionary object (unless the ConcurrentDictionary internally implements similar indirection)
// Update the contents of the AccountCache item, requiring either a copy to be returned or the lock to wrap the caller's use of it.
private static readonly ConcurrentDictionary<int, AccountCacheLock> dictionary = new ConcurrentDictionary<int, AccountCacheLock>();
public static AccountCache Get(int accountId, SiteEntities refreshSource) {
AccountCacheLock accountCacheLock = dictionary.GetOrAdd(accountId, k => new AccountCacheLock());
AccountCache accountCache;
lock (accountCacheLock) {
accountCache = accountCacheLock.AccountCache;
}
if (accountCache == null || accountCache.ExpiresOn < DateTime.UtcNow) {
accountCache = new AccountCache(refreshSource.Accounts.Single(a => a.Id == accountId));
lock (accountCacheLock) {
accountCacheLock.AccountCache = accountCache;
}
}
return accountCache;
}
public static void Invalidate(int accountId) {
// TODO
}
private AccountCache(Account account) {
ExpiresOn = DateTime.UtcNow.AddHours(1);
Status = account.Status;
CommunityRole = account.CommunityRole;
Email = account.Email;
}
public readonly DateTime ExpiresOn;
public readonly AccountStates Status;
public readonly CommunityRoles CommunityRole;
public readonly string Email;
private class AccountCacheLock {
public AccountCache AccountCache;
}
}
Side question: is there something in the ASP.NET framework that already does this?
You don't need to be doing any locks. The ConcurrentDictionary should handle that pretty well.
Side question: is there something in the ASP.NET framework that already does this?
Of course. It's not specifically related to ASP.NET but you may take a look at the System.Runtime.Caching namespace and more specifically the MemoryCache class. It adds things like expiration and callbacks on the top of a thread safe hashtable.
I don't quite understand the purpose of the AccountCache you have shown in your updated question. It's exactly what a simple caching layer gives you for free.
Obviously if you intend to be running your ASP.NET application in a web farm you should consider some distributed caching such as memcached for example. There are .NET implementations of the ObjectCache class on top of the memcached protocol.
I also wanted to note that I took a cursory peek inside ConcurrentDictionary, and it looks like item replacements are locked on neither the individual item nor the entire dictionary, but rather the hash of the item (i.e. a lock object associated with a dictionary "bucket"). It seems to be designed so that an initial introduction of a key also does not lock the entire dictionary, provided the dictionary need not be resized. I believe this also means that two updates can occur simultaneously provided they don't produce matching hashes.
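As a rough illustration of what that observation implies, here is a minimal sketch in which two threads replace values under different keys with no external lock. `DistinctKeyDemo` and `Run` are hypothetical names invented for this example; it shows only that both writers complete and each key ends up with its own thread's last write, not how the dictionary locks internally.

```csharp
using System.Collections.Concurrent;
using System.Threading;

public static class DistinctKeyDemo
{
    // Two threads each overwrite their own key many times; no lock statement is used.
    public static ConcurrentDictionary<int, int> Run(int iterations)
    {
        var cd = new ConcurrentDictionary<int, int>();
        var t1 = new Thread(() => { for (int i = 1; i <= iterations; i++) cd[1] = i; });
        var t2 = new Thread(() => { for (int i = 1; i <= iterations; i++) cd[2] = i; });
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();
        return cd; // each key holds the last value written by its own thread
    }
}
```

Calling `DistinctKeyDemo.Run(1000)` should leave both keys at 1000, since each key has exactly one writer.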
I am working on a caching manager for an MVC web application. For this app, I have some very large objects that are costly to build. During the application lifetime, I may need to create several of these objects, based upon user requests. When built, the user will be working with the data in the objects, resulting in many read actions. On occasion, I will need to update some minor data points in the cached object (create & replace would take too much time).
Below is a cache manager class that I have created to help me in this. Beyond basic thread safety, my goals were to:
Allow multiple reads against an object, but lock all reads to that object upon an update request
Ensure that the object is only ever created 1 time if it does not already exist (keep in mind that it's a long build action).
Allow the cache to store many objects, and maintain a lock per object (rather than one lock for all objects).
public class CacheManager
{
private static readonly ObjectCache Cache = MemoryCache.Default;
private static readonly ConcurrentDictionary<string, ReaderWriterLockSlim>
Locks = new ConcurrentDictionary<string, ReaderWriterLockSlim>();
private const int CacheLengthInHours = 1;
public object AddOrGetExisting(string key, Func<object> factoryMethod)
{
Locks.GetOrAdd(key, new ReaderWriterLockSlim());
var policy = new CacheItemPolicy
{
AbsoluteExpiration = DateTimeOffset.Now.AddHours(CacheLengthInHours)
};
return Cache.AddOrGetExisting
(key, new Lazy<object>(factoryMethod), policy);
}
public object Get(string key)
{
var targetLock = AcquireLockObject(key);
if (targetLock != null)
{
targetLock.EnterReadLock();
try
{
var cacheItem = Cache.GetCacheItem(key);
if(cacheItem!= null)
return cacheItem.Value;
}
finally
{
targetLock.ExitReadLock();
}
}
return null;
}
public void Update<T>(string key, Func<T, object> updateMethod)
{
var targetLock = AcquireLockObject(key);
var targetItem = (Lazy<object>) Get(key);
if (targetLock == null || key == null) return;
targetLock.EnterWriteLock();
try
{
updateMethod((T)targetItem.Value);
}
finally
{
targetLock.ExitWriteLock();
}
}
private ReaderWriterLockSlim AcquireLockObject(string key)
{
return Locks.ContainsKey(key) ? Locks[key] : null;
}
}
Am I accomplishing my goals while remaining thread safe? Do you all see a better way to achieve my goals?
Thanks!
UPDATE: So the bottom line here was that I was really trying to do too much in 1 area. For some reason, I was convinced that managing the Get / Update operations in the same class that managed the cache was a good idea. After looking at Groo's solution & rethinking the issue, I was able to do a good amount of refactoring which removed this issue I was facing.
Well, I don't think this class does what you need.
Allow multiple reads against the object, but lock all reads upon an update request
You may lock all reads to the cache manager, but you are not locking reads (nor updates) to the actual cached instance.
Ensure that the object is only ever created 1 time if it does not already exist (keep in mind that its a long build action).
I don't think you ensured that. You are not locking anything while adding the object to the dictionary (and, furthermore, you are adding a lazy constructor, so you don't even know when the object is going to be instantiated).
Edit: This part holds; the only thing I would change is to make Get return a Lazy<object>. While writing my program, I forgot to cast it, and calling ToString on the return value returned "Value not created".
Allow the cache to store many objects, and maintain a lock per object (rather than one lock for all objects).
That's the same as point 1: you are locking the dictionary, not the access to the object. And your update delegate has a strange signature (it accepts a typed generic parameter, and returns an object which is never used). This means you are really modifying the object's properties, and these changes are immediately visible to any part of your program holding a reference to that object.
How to resolve this
If your object is mutable (and I presume it is), there is no way to ensure transactional consistency unless each of your properties also acquires a lock on each read access. A way to simplify this is to make the object immutable (that's why immutable objects are so popular for multithreading).
Alternatively, you may consider breaking this large object into smaller pieces and caching each piece separately, making them immutable if needed.
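On the "created only once" goal discussed above, one commonly used pattern is to store `Lazy<T>` wrappers in a `ConcurrentDictionary` and read `.Value` inside the cache itself. The sketch below is an assumption about how that could look, not the asker's code; `BuildOnceCache` and `GetOrBuild` are hypothetical names, and expiration is left out entirely.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class BuildOnceCache
{
    private static readonly ConcurrentDictionary<string, Lazy<object>> Items =
        new ConcurrentDictionary<string, Lazy<object>>();

    // GetOrAdd may construct several Lazy wrappers under contention, but only one
    // is stored, and ExecutionAndPublication runs that wrapper's factory at most once.
    public static object GetOrBuild(string key, Func<object> factory)
    {
        Lazy<object> lazy = Items.GetOrAdd(
            key,
            _ => new Lazy<object>(factory, LazyThreadSafetyMode.ExecutionAndPublication));
        return lazy.Value; // blocks other callers for this key until the build completes
    }
}
```

The wrappers that lose the race are cheap to discard because their factories never run; only the expensive build is protected.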
[Edit] Added a race condition example:
class Program
{
static void Main(string[] args)
{
CacheManager cache = new CacheManager();
cache.AddOrGetExisting("item", () => new Test());
// let one thread modify the item
ThreadPool.QueueUserWorkItem(s =>
{
Thread.Sleep(250);
cache.Update<Test>("item", i =>
{
i.First = "CHANGED";
Thread.Sleep(500);
i.Second = "CHANGED";
return i;
});
});
// let one thread just read the item and print it
ThreadPool.QueueUserWorkItem(s =>
{
var item = ((Lazy<object>)cache.Get("item")).Value;
Log(item.ToString());
Thread.Sleep(500);
Log(item.ToString());
});
Console.Read();
}
class Test
{
private string _first = "Initial value";
public string First
{
get { return _first; }
set { _first = value; Log("First", value); }
}
private string _second = "Initial value";
public string Second
{
get { return _second; }
set { _second = value; Log("Second", value); }
}
public override string ToString()
{
return string.Format("--> PRINTING: First: [{0}], Second: [{1}]", First, Second);
}
}
private static void Log(string message)
{
Console.WriteLine("Thread {0}: {1}", Thread.CurrentThread.ManagedThreadId, message);
}
private static void Log(string property, string value)
{
Console.WriteLine("Thread {0}: {1} property was changed to [{2}]", Thread.CurrentThread.ManagedThreadId, property, value);
}
}
Something like this should happen:
t = 0ms : thread A gets the item and prints the initial value
t = 250ms: thread B modifies the first property
t = 500ms: thread A prints the INCONSISTENT value (only the first prop. changed)
t = 750ms: thread B modifies the second property
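One way to avoid the inconsistent read in the timeline above is the immutability suggestion from this answer: build a new object off to the side and publish it with a single reference swap. This is a minimal sketch under that assumption; `Snapshot` and `SnapshotHolder` are hypothetical names, and `Volatile.Read` assumes .NET 4.5 or later.

```csharp
using System.Threading;

// Immutable: both values are fixed at construction, so a reader holding a
// reference can never observe a half-applied update.
public sealed class Snapshot
{
    public readonly string First;
    public readonly string Second;

    public Snapshot(string first, string second)
    {
        First = first;
        Second = second;
    }
}

public sealed class SnapshotHolder
{
    private Snapshot _current = new Snapshot("Initial value", "Initial value");

    // Readers get whichever complete snapshot is current at this moment.
    public Snapshot Read()
    {
        return Volatile.Read(ref _current);
    }

    // Build the replacement first, then publish it with one atomic reference swap.
    public void Update(string first, string second)
    {
        Interlocked.Exchange(ref _current, new Snapshot(first, second));
    }
}
```

A reader that fetched a snapshot before the update keeps seeing the old pair of values, and a reader that fetches afterwards sees the new pair; no reader sees one of each. The trade-off is that each update allocates a new object, which may or may not be acceptable for very large cached items.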
I need to make a critical section in an area on the basis of a finite set of strings. I want the lock to be shared for the same string instance, (somewhat similar to String.Intern approach).
I am considering the following implementation:
public class Foo
{
private readonly string _s;
private static readonly HashSet<string> _locks = new HashSet<string>();
public Foo(string s)
{
_s = s;
_locks.Add(s);
}
public void LockMethod()
{
lock(_locks.Single(l => l == _s))
{
...
}
}
}
Are there any problems with this approach? Is it OK to lock on a string object in this way, and are there any thread safety issues in using the HashSet<string>?
Is it better to, for example, create a Dictionary<string, object> that creates a new lock object for each string instance?
Final Implementation
Based on the suggestions I went with the following implementation:
public class Foo
{
private readonly string _s;
private static readonly ConcurrentDictionary<string, object> _locks = new ConcurrentDictionary<string, object>();
public Foo(string s)
{
_s = s;
}
public void LockMethod()
{
lock(_locks.GetOrAdd(_s, _ => new object()))
{
...
}
}
}
Locking on strings is discouraged. The main reason is that (because of string interning) some other code could lock on the same string instance without you knowing it, creating the potential for deadlocks.
Now this is probably a far fetched scenario in most concrete situations. It's more a general rule for libraries.
But on the other hand, what is the perceived benefit of strings?
So, point for point:
Are there any problems with this approach?
Yes, but mostly theoretical.
Is it OK to lock on a string object in this way, and are there any thread safety issues in using the HashSet?
The HashSet<> is not involved in the thread-safety as long as the threads only read concurrently.
Is it better to, for example, create a Dictionary that creates a new lock object for each string instance?
Yes. Just to be on the safe side. In a large system the main aim for avoiding deadlock is to keep the lock-objects as local and private as possible. Only a limited amount of code should be able to access them.
I'd say it's a really bad idea, personally. That isn't what strings are for.
(Personally I dislike the fact that every object has a monitor in the first place, but that's a slightly different concern.)
If you want an object which represents a lock which can be shared between different instances, why not create a specific type for that? You can give the lock a name easily enough for diagnostic purposes, but locking is really not the purpose of a string. Something like this:
public sealed class Lock
{
private readonly string name;
public string Name { get { return name; } }
public Lock(string name)
{
if (name == null)
{
throw new ArgumentNullException("name");
}
this.name = name;
}
}
Given the way that strings are sometimes interned and sometimes not (in a way which can occasionally be difficult to discern by simple inspection), you could easily end up with accidentally shared locks where you didn't intend them.
Locking on strings can be problematic, because interned strings are essentially global.
Interned strings are per process, so they are even shared among different AppDomains. The same goes for type objects (so don't lock on typeof(x) either).
I had a similar issue not long ago where I was looking for a good way to lock a section of code based on a string value. Here's what we have in place at the moment, that solves the problem of interned strings and has the granularity we want.
The main idea is to maintain a static ConcurrentDictionary of sync objects with a string key. When a thread enters the method, it immediately establishes a lock and attempts to add the sync object to the concurrent dictionary. If we can add to the concurrent dictionary, it means that no other threads have a lock based on our string key and we can continue our work. Otherwise, we'll use the sync object from the concurrent dictionary to establish a second lock, which will wait for the running thread to finish processing. When the second lock is released, we can attempt to add the current thread's sync object to the dictionary again.
One word of caution: the threads aren't queued, so if multiple threads with the same string key are competing simultaneously for a lock, there are no guarantees about the order in which they will be processed.
Feel free to critique if you think I've overlooked something.
public class Foo
{
private static ConcurrentDictionary<string, object> _lockDictionary = new ConcurrentDictionary<string, object>();
public void DoSomethingThreadCriticalByString(string lockString)
{
object thisThreadSyncObject = new object();
lock (thisThreadSyncObject)
{
try
{
for (; ; )
{
object runningThreadSyncObject = _lockDictionary.GetOrAdd(lockString, thisThreadSyncObject);
if (runningThreadSyncObject == thisThreadSyncObject)
break;
lock (runningThreadSyncObject)
{
// Wait for the currently processing thread to finish and try inserting into the dictionary again.
}
}
// Do your work here.
}
finally
{
// Remove the key from the lock dictionary
object dummy;
_lockDictionary.TryRemove(lockString, out dummy);
}
}
}
}
I have a list that is accessed by multiple background threads to update/read. Update actions include both insertions and deletions.
To do this concurrently without synchronization problems, I am using a lock on a private readonly object in the class.
To minimize the time I need to lock the list when reading its data, I do a deep clone of it and return the deep clone and unlock the dictionary for insert/delete updates.
Due to this every read of the list increases the memory consumption of my service.
One point to note is that the inserts/deletes are internal to the class that contains the list. But the read is meant for public consumption.
My question is:
Is there any way, I can avoid cloning the list and still use it concurrently for reads using read/write locks?
public class ServiceCache
{
private static List<Users> activeUsers;
private static readonly object lockObject = new object();
private static ServiceCache instance = new ServiceCache();
public static ServiceCache Instance
{
get
{
return instance;
}
}
private void AddUser(User newUser)
{
lock (lockObject)
{
//... add user logic
}
}
private void RemoveUser(User currentUser)
{
lock (lockObject)
{
//... remove user logic
}
}
public List<Users> ActiveUsers
{
get
{
lock (lockObject)
{
//The cache returns deep copies of the users it holds, not links to the actual data.
return activeUsers.Select(au => au.DeepCopy()).ToList();
}
}
}
}
It sounds like you need to use the ConcurrentDictionary class, and create a key for each of the Users objects you are storing. Then it becomes as simple as this for adding / updating a user:
// addValueFactory receives only the key; updateValueFactory receives the key and the existing value
_dictionary.AddOrUpdate("key", k =>
{
return newUser;
}, (k, v) =>
{
return newUser;
});
And then for removing, you would do this:
Users value = null;
_dictionary.TryRemove("key", out value);
Getting the list of people would be super easy as well, since you would just need to do:
return _dictionary.Values.ToList(); // Values already yields the stored Users objects
Which should return a copy of the dictionary contents at that very moment.
And let the .NET runtime take care of the threading for you.
You can use a reader-writer lock to allow simultaneous reads.
However, it would be much faster to use a ConcurrentDictionary and thread-safe immutable values, then get rid of all synchronization.
Due to this every read of the list increases the memory consumption of my service.
Why? Are the callers not releasing the reference? They need to, since the content of the dictionary can change.
What you are doing with copying is I think very close to how a Concurrent data structure, e.g. copy-on-write collection works, except that the caller cannot hold on to the reference.
A couple of other approaches:
Return the same copy to all callers till the collection gets modified. The returned collection should be immutable
Expose all the functionality the caller would want from the copy and use a single lock to work with the original list
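The first approach above (hand every caller the same read-only copy until the collection changes) could be sketched roughly as follows. `SnapshotList<T>` is a hypothetical helper written for this illustration, and it assumes the elements themselves are immutable or are never modified by callers; otherwise a shared copy is no safer than the original list.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

public sealed class SnapshotList<T>
{
    private readonly object _gate = new object();
    private readonly List<T> _items = new List<T>();
    private ReadOnlyCollection<T> _snapshot; // null means "stale, rebuild on next read"

    public void Add(T item)
    {
        lock (_gate)
        {
            _items.Add(item);
            _snapshot = null;
        }
    }

    public bool Remove(T item)
    {
        lock (_gate)
        {
            bool removed = _items.Remove(item);
            if (removed) _snapshot = null;
            return removed;
        }
    }

    // Readers share one read-only copy until the next Add or Remove invalidates it.
    public ReadOnlyCollection<T> GetSnapshot()
    {
        lock (_gate)
        {
            if (_snapshot == null)
                _snapshot = new ReadOnlyCollection<T>(new List<T>(_items));
            return _snapshot;
        }
    }
}
```

With this shape, a copy is made at most once per modification rather than once per read, which addresses the memory growth when reads far outnumber writes.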
Using what I judged to be the best of all worlds from the excellent article Implementing the Singleton Pattern in C#, I have been successfully using the following class to persist user-defined data in memory (for data that is very rarely modified):
public class Params
{
static readonly Params Instance = new Params();
Params()
{
}
public static Params InMemory
{
get
{
return Instance;
}
}
private IEnumerable<Localization> _localizations;
public IEnumerable<Localization> Localizations
{
get
{
return _localizations ?? (_localizations = new Repository<Localization>().Get());
}
}
public int ChunkSize
{
get
{
// Loc uses the Localizations impl
return LC.Loc("params.chunksize").To<int>();
}
}
public void RebuildLocalizations()
{
_localizations = null;
}
// other similar values coming from the DB and staying in-memory,
// and their refresh methods
}
My usage would look something like this:
var allLocs = Params.InMemory.Localizations; //etc
Whenever I update the database, RebuildLocalizations gets called, so only part of my in-memory store is rebuilt. I have a single production environment out of about 10 that seems to misbehave when RebuildLocalizations gets called, not refreshing at all, but this also seems to be intermittent and very odd altogether.
My current suspicion falls on the singleton, although I think it does the job well, and all the unit tests show that the singleton mechanism, the refresh mechanism and the RAM performance work as expected.
That said, I am down to these possibilities:
This customer is lying when he says their environment is not using load balancing, a setup in which I would not expect the in-memory store to work properly (right?)
There is some non-standard pool configuration in their IIS which I am testing against (maybe in a Web Garden setting?)
The singleton is failing somehow, but not sure how.
Any suggestions?
.NET 3.5 so not much parallel juice available, and not ready to use the Reactive Extensions for now
Edit1: as per the suggestions, would the getter look something like:
public IEnumerable<Localization> Localizations
{
get
{
lock(_localizations) {
return _localizations ?? (_localizations = new Repository<Localization>().Get());
}
}
}
To expand on my comment, here is how you might make the Localizations property thread safe:
public class Params
{
private object _lock = new object();
private IEnumerable<Localization> _localizations;
public IEnumerable<Localization> Localizations
{
get
{
lock (_lock) {
if ( _localizations == null ) {
_localizations = new Repository<Localization>().Get();
}
return _localizations;
}
}
}
public void RebuildLocalizations()
{
lock(_lock) {
_localizations = null;
}
}
// other similar values coming from the DB and staying in-memory,
// and their refresh methods
}
There is no point in creating a thread safe singleton, if your properties are not going to be thread safe.
You should either lock around assignment of the _localizations field, or instantiate in your singleton's constructor (preferred). Any suggestion which applies to singleton instantiation applies to this lazy-instantiated property.
The same thing further applies to all properties (and their properties) of Localization. If this is a Singleton, it means that any thread can access it any time, and simply locking the getter will again do nothing.
For example, consider this case:
    Thread 1                              Thread 2
    // both threads access the singleton, but you are "safe" because you locked
1.  var loc1 = Params.Localizations;      var loc2 = Params.Localizations;
    // do stuff                           // thread 2 calls the same property...
2.  var value = loc1.ChunkSize;           var chunk = LC.Loc("params.chunksize");
    // invalidate                         // ...there is a slight pause here...
3.  loc1.RebuildLocalizations();
                                          // ...and gets the wrong value
4.                                        var value = chunk.To();
If you are only reading these values, then it might not matter, but you can see how you can easily get in trouble with this approach.
Remember that with threading, you never know whether a different thread will execute something between two instructions. Only individual reads and writes of references and of primitive types up to 32 bits are guaranteed to be atomic; anything composed of several steps is not.
This means that this line:
return LC.Loc("params.chunksize").To<int>();
is, as far as threading is concerned, equivalent to:
var loc = LC.Loc("params.chunksize");
Thread.Sleep(1); // anything can happen here :-(
return loc.To<int>();
Any thread can jump in between Loc and To.
I have a class that contains a static collection to store the logged-in users in an ASP.NET MVC application. I just want to know whether the code below is thread-safe. Do I need to lock whenever I add an item to or remove an item from the onlineUsers collection?
public class OnlineUsers
{
private static List<string> onlineUsers = new List<string>();
public static EventHandler<string> OnUserAdded;
public static EventHandler<string> OnUserRemoved;
private OnlineUsers()
{
}
static OnlineUsers()
{
}
public static int NoOfOnlineUsers
{
get
{
return onlineUsers.Count;
}
}
public static List<string> GetUsers()
{
return onlineUsers;
}
public static void AddUser(string userName)
{
if (!onlineUsers.Contains(userName))
{
onlineUsers.Add(userName);
if (OnUserAdded != null)
OnUserAdded(null, userName);
}
}
public static void RemoveUser(string userName)
{
if (onlineUsers.Contains(userName))
{
onlineUsers.Remove(userName);
if (OnUserRemoved != null)
OnUserRemoved(null, userName);
}
}
}
That is absolutely not thread safe. Any time 2 threads are doing something (very common in a web application), chaos is possible - exceptions, or silent data loss.
Yes you need some kind of synchronization such as lock; and static is usually a very bad idea for data storage, IMO (unless treated very carefully and limited to things like configuration data).
Also - static events are notorious as a way to keep object graphs alive unexpectedly. Treat those with caution too; if you subscribe once only, fine - but don't subscribe per request.
Also - it isn't just locking the operations, since this line:
return onlineUsers;
returns your list, now unprotected. All access to it must be synchronized. Personally I'd return a copy, i.e.
lock(syncObj) {
return onlineUsers.ToArray();
}
Finally, returning a .Count from such can be confusing - as it is not guaranteed to still be Count at any point. It is informational at that point in time only.
Yes, you need to lock the onlineUsers to make that code threadsafe.
A few notes:
Using a HashSet<string> instead of the List<string> may be a good idea, since it is much more efficient for operations like this (Contains and Remove especially). This does not change anything on the locking requirements though.
You can declare a class as "static" if it has only static members.
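Putting these notes together with the locking advice, a minimal sketch might look like the following. `OnlineUserRegistry` is a hypothetical name chosen for this example, and the events from the original class are left out on purpose, since raising them while holding the lock is a separate design decision.

```csharp
using System.Collections.Generic;

public static class OnlineUserRegistry
{
    private static readonly object Sync = new object();
    private static readonly HashSet<string> Users = new HashSet<string>();

    // HashSet.Add returns false when the name is already present, so the
    // check and the insert happen as one step under the lock.
    public static bool AddUser(string userName)
    {
        lock (Sync) { return Users.Add(userName); }
    }

    public static bool RemoveUser(string userName)
    {
        lock (Sync) { return Users.Remove(userName); }
    }

    // Callers get a private copy they can enumerate without holding the lock.
    public static string[] GetUsers()
    {
        lock (Sync)
        {
            var copy = new string[Users.Count];
            Users.CopyTo(copy);
            return copy;
        }
    }
}
```

The returned array is only a point-in-time view, so its length may already be out of date by the time the caller inspects it.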
Yes you do need to lock your code.
private readonly object padlock = new object();
public bool Contains(T item)
{
lock (padlock)
{
return items.Contains(item);
}
}
Yes. You need to lock the collection before you read or write to the collection, since multiple users are potentially being added from different threadpool workers. You should probably also do it on the count as well, though if you're not concerned with 100% accuracy that may not be an issue.
As per Lucero's answer, you need to lock onlineUsers. Also be careful about what clients of your class will do with the onlineUsers returned from GetUsers(). I suggest you change your interface - for example use IEnumerable<string> GetUsers() and make sure the lock is used in its implementation. Something like this:
public static IEnumerable<string> GetUsers() {
lock (...) {
foreach (var element in onlineUsers)
yield return element;
// We need foreach, just "return onlineUsers" would release the lock too early!
}
}
Note that this implementation can expose you to deadlocks if users try to call some other method of OnlineUsers that uses lock, while still iterating over the result of GetUsers().
That code is not thread-safe per se.
I will not make any suggestions relative to your "design", since you didn't ask any. I'll assume you found good reasons for those static members and exposing your list's contents as you did.
However, if you want to make your code thread-safe, you should basically use a lock object to lock on, and wrap the contents of your methods with a lock statement:
private readonly object syncObject = new object();
void SomeMethod()
{
lock (this.syncObject)
{
// Work with your list here
}
}
Beware that those events being raised have the potential to hold the lock for an extended period of time, depending on what the delegates do.
You could omit the lock from the NoOfOnlineUsers property while declaring your list as volatile. However, if you want the Count value to persist for as long as you are using it at a certain moment, use a lock there, as well.
As others suggested here, exposing your list directly, even with a lock, will still pose a "threat" to its contents. I would go with returning a copy (and that should fit most purposes) as Marc Gravell advised.
Now, since you said you are using this in an ASP.NET environment, it is worth saying that local variables, and the instance members of objects created per request, are not shared between threads; only shared state such as these static members needs synchronization.