In the following code:
public class StringCache
{
    private readonly object lockobj = new object();
    private readonly Dictionary<int, string> cache = new Dictionary<int, string>();

    public string GetMemberInfo(int key)
    {
        if (cache.ContainsKey(key))
            return cache[key];
        lock (lockobj)
        {
            if (!cache.ContainsKey(key))
                cache[key] = GetString(key);
        }
        return cache[key];
    }

    private static string GetString(int key)
    {
        return "Not Important";
    }
}
1) Is ContainsKey thread safe? IOW, what happens if that method is executing when another thread is adding something to the dictionary?
2) For the first return cache[key], is there any chance that it could return a garbled value?
TIA,
MB
The inherent thread safety of ContainsKey doesn't matter, since there is no synchronization between ContainsKey & cache[key].
For example:
if (cache.ContainsKey(key))
    // Switch to another thread, which deletes the key.
    return cache[key];
MSDN is pretty clear on this point:
To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
For more info, JaredPar posted a great blog entry at http://blogs.msdn.com/jaredpar/archive/2009/02/11/why-are-thread-safe-collections-so-hard.aspx on thread-safe collections.
No, ContainsKey is not thread-safe if you're writing values while you're trying to read.
Yes, there is a chance you could get back invalid results -- but you'll probably start seeing exceptions first.
Take a look at the ReaderWriterLockSlim for locking in situations like this -- it's built to do this kind of stuff.
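As a hedged sketch of what that could look like for the cache in the question (read lock for the fast path, write lock with a re-check for the miss; the names mirror the original post but the structure is only illustrative):

```csharp
using System.Collections.Generic;
using System.Threading;

public class StringCache
{
    private readonly ReaderWriterLockSlim rwLock = new ReaderWriterLockSlim();
    private readonly Dictionary<int, string> cache = new Dictionary<int, string>();

    public string GetMemberInfo(int key)
    {
        rwLock.EnterReadLock();
        try
        {
            // Fast path: many threads may hold the read lock at once.
            if (cache.TryGetValue(key, out string value))
                return value;
        }
        finally
        {
            rwLock.ExitReadLock();
        }

        rwLock.EnterWriteLock();
        try
        {
            // Re-check: another thread may have added the key between locks.
            if (!cache.TryGetValue(key, out string value))
            {
                value = GetString(key);
                cache[key] = value;
            }
            return value;
        }
        finally
        {
            rwLock.ExitWriteLock();
        }
    }

    private static string GetString(int key)
    {
        return "Not Important";
    }
}
```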
Here's what it says in the MSDN documentation:
Public static (Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe.
A Dictionary<TKey, TValue> can support multiple readers concurrently, as long as the collection is not modified. Even so, enumerating through a collection is intrinsically not a thread-safe procedure. In the rare case where an enumeration contends with write accesses, the collection must be locked during the entire enumeration. To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.
If I'm reading that correctly, I don't believe that it is thread safe.
Dictionary is not thread-safe.
If you say that
what happens if that method is executing when another thread is adding something to the dictionary?
then I suppose other functions access the cache as well. You need to synchronize accesses (reading and writing) to the cache. Use your lock object in all of these operations.
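As a minimal sketch of that advice applied to the cache in the question (every access, read or write, goes through the same lock; TryGetValue replaces the separate ContainsKey/indexer pair):

```csharp
using System.Collections.Generic;

public class StringCache
{
    private readonly object lockobj = new object();
    private readonly Dictionary<int, string> cache = new Dictionary<int, string>();

    public string GetMemberInfo(int key)
    {
        lock (lockobj) // both the check and the read happen inside the lock
        {
            if (!cache.TryGetValue(key, out string value))
            {
                value = GetString(key);
                cache[key] = value;
            }
            return value;
        }
    }

    private static string GetString(int key)
    {
        return "Not Important";
    }
}
```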
I believe it's not thread safe.
I would suggest going through the link below, which shows an implementation of a thread-safe dictionary; alternatively, you can implement your own synchronization.
http://lysaghtn.weebly.com/synchronised-dictionary.html
Related
I have a key-to-task mapping and I need to run the task only if the task for the given key is not already running. Pseudo code follows. I believe there is a lot of scope for improvement. I'm locking on the map and hence almost serializing access to CacheFreshener. Is there a better way of doing this? We know that when I'm trying to lock a key k1, there is no point in a cache-freshener call for key k2 waiting for the lock.
class CacheFreshener
{
    private ConcurrentDictionary<string, bool> lockMap;

    public void RefreshData(string key, Func<string, bool> cacheMissAction)
    {
        lock (lockMap)
        {
            if (lockMap.ContainsKey(key))
            {
                // no-op
                return;
            }
            else
            {
                lockMap.Add(key, true);
            }
        }
        // if you are here means task is not already present
        cacheMissAction(key);
        lock (lockMap) // Do we need to lock here??
        {
            lockMap.Remove(key);
        }
    }
}
As requested, here is an elaborated explanation of what I was getting at relative to my comments…
The basic issue here seems to be the question of concurrency, i.e. two or more threads accessing the same object at a time. This is the scenario ConcurrentDictionary is designed for. If you use the IDictionary methods of ContainsKey() and Add() separately, then you would need explicit synchronization (but only for that operation…in this particular scenario it wouldn't strictly be needed when calling Remove()) to ensure these are performed as a single atomic operation. But the ConcurrentDictionary class anticipates this need, and includes the TryAdd() method to accomplish the same, without the explicit synchronization.
<aside>
It is not entirely clear to me the intent behind the code example as given. The code appears to be meant to only store an object in the "cache" for the duration of the invocation of the cacheMissAction delegate. The key is removed immediately after. So it does seem like it's not really caching anything per se. It just prevents more than one thread from being in the process of invoking cacheMissAction at a time (subsequent threads will fail to invoke it, but also cannot count on it having completed by the time their call to the RefreshData() method has completed).
</aside>
But taking the code example as given, it's clear that no explicit locking is actually required. The ConcurrentDictionary class already provides thread-safe access (i.e. non-corruption of the data structure when used concurrently from multiple threads), and it provides the TryAdd() method as a mechanism for adding a key (and its value, though here that's just always a bool literal of true) to the dictionary that will ensure that only one thread ever has a key in the dictionary at a time.
So we can rewrite the code to look like this instead and accomplish the same goal:
private readonly ConcurrentDictionary<string, bool> lockMap =
    new ConcurrentDictionary<string, bool>();

public void RefreshData(string key, Func<string, bool> cacheMissAction)
{
    if (!lockMap.TryAdd(key, true))
    {
        return;
    }
    // if you are here means task was not already present
    cacheMissAction(key);
    lockMap.TryRemove(key, out _);
}
No lock statement is needed for either the add or remove, as the TryAdd() handles the entire "check for key and add if not present" operation atomically.
I will note that using a dictionary to do the job of a set could be considered inefficient. If the collection is likely not to be large, it's no big deal, but I do find it odd that Microsoft chose to make the same mistake they made originally when in the pre-generics days you had to use the non-generic dictionary object Hashtable to store a set, before HashSet<T> came along. Now we have all these easy-to-use classes in System.Collections.Concurrent, but no thread-safe implementation of ISet<T> in there. Sigh…
That said, if you do prefer a somewhat more efficient approach in terms of storage (this is not necessarily a faster implementation, depending on the concurrent access patterns of the object), something like this would work as an alternative:
private readonly HashSet<string> lockSet = new HashSet<string>();
private readonly object _lock = new object();

public void RefreshData(string key, Func<string, bool> cacheMissAction)
{
    lock (_lock)
    {
        if (!lockSet.Add(key))
        {
            return;
        }
    }
    // if you are here means task was not already present
    cacheMissAction(key);
    lock (_lock)
    {
        lockSet.Remove(key);
    }
}
In this case, you do need the lock statement, because the HashSet<T> class is not inherently thread-safe. This is of course very similar to your original implementation, just using the more set-like semantics of HashSet<T> instead.
Consider the following code:
Dictionary<string, string> list = new Dictionary<string, string>();
object lockObj = new object();

public void MyMethod(string a) {
    if (list.Contains(a))
        return;
    lock (lockObj) {
        list.Add(a, "someothervalue");
    }
}
Assuming I'm calling MyMethod("mystring") from different threads concurrently.
Would it be possible for more than one thread (say two) to enter the if (list.Contains(a)) check at nearly the same time (a few CPU cycles apart), with both threads evaluating it as false? One thread enters the critical region while the other is blocked on the lock; the second thread then enters after the first exits and adds "mystring" again, resulting in the dictionary throwing on the duplicate key?
No, it's not thread-safe. You need the lock around list.Contains too, as it is possible for a thread to be switched out and back in again between the if test and adding the data. Another thread may have added the data in the meantime.
You need to lock the entire operation (check and add) or multiple threads may attempt to add the same value.
I would recommend using ConcurrentDictionary<TKey, TValue>, since it is designed to be thread safe.
private readonly ConcurrentDictionary<string, string> _items =
    new ConcurrentDictionary<string, string>();

public void MyMethod(string item, string value)
{
    _items.AddOrUpdate(item, value, (i, v) => value);
}
You need to lock around the whole statement. With your code as it is now, it's possible to run into issues on the .Contains portion.
You should check the list again after locking, e.g.

if (list.ContainsKey(a))
    return;
lock (lockObj) {
    if (list.ContainsKey(a))
        return;
    list.Add(a, "someothervalue");
}
private Dictionary<string, string> list = new Dictionary<string, string>();

public void MyMethod(string a) {
    lock (list) {
        if (list.ContainsKey(a))
            return;
        list.Add(a, "someothervalue");
    }
}
Check out this guide to locking; it's good.
A few guidelines to bear in mind
Generally lock around a private static object when locking on multiple writeable values
Do not lock on things with scope outside the class or local method such as lock(this), which could lead to deadlocks!
You may lock on the object being changed if it is the only concurrently accessed object
Ensure the object you lock is not null!
You can only lock on reference types
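A small sketch following those guidelines (Counter is a hypothetical example class, not from the question; note the private static readonly lock object, a non-null reference type, and no lock(this)):

```csharp
public class Counter
{
    // Private, static, readonly reference type: never null, never exposed.
    private static readonly object Sync = new object();
    private static int count;

    public static int Increment()
    {
        lock (Sync) // never lock(this) or a publicly visible object
        {
            return ++count;
        }
    }
}
```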
I am going to assume that you meant to write ContainsKey instead of Contains. Contains on a Dictionary is explicitly implemented, so it is not accessible via the type you declared.¹
Your code is not safe. The reason is because there is nothing preventing ContainsKey and Add from executing at the same time. There are actually some quite remarkable failure scenarios that this would introduce. Because I looked at how the Dictionary is implemented I can see that your code could cause a situation where data structure contains duplicates. And I mean it literally contains duplicates. The exception will not necessarily be thrown. The other failure scenarios just keep getting stranger and stranger, but I will not go into those here.
One trivial modification to your code might involve a variation of the double-checked locking pattern.
public void MyMethod(string a)
{
    if (!dictionary.ContainsKey(a))
    {
        lock (dictionary)
        {
            if (!dictionary.ContainsKey(a))
            {
                dictionary.Add(a, "someothervalue");
            }
        }
    }
}
This, of course, is not any safer for the reason I already stated. Actually, the double-checked locking pattern is notoriously difficult to get right in all but the simplest cases (like the canonical implementation of a singleton). There are many variations on this theme. You can try it with TryGetValue or the default indexer, but ultimately all of these variations are just dead wrong.
So how could this be done correctly without taking a lock? You could try ConcurrentDictionary. It has the method GetOrAdd which is really useful in these scenarios. Your code would look like this.
public void MyMethod(string a)
{
    // The variable 'dictionary' is a ConcurrentDictionary.
    dictionary.GetOrAdd(a, "someothervalue");
}
That is all there is to it. The GetOrAdd function will check to see if the item exists. If it does not, it will be added. Otherwise, it will leave the data structure alone. This is all done in a thread-safe manner. In most cases the ConcurrentDictionary does this without waiting on a lock.²
¹By the way, your variable name is obnoxious too. If it were not for Servy's comment I might have missed the fact that we were talking about a Dictionary as opposed to a List. In fact, based on the Contains call I first thought we were talking about a List.
²On the ConcurrentDictionary, readers are completely lock free. However, writers always take a lock (adds and updates, that is; the remove operation is still lock free). This includes the GetOrAdd function. The difference is that the data structure maintains several possible lock options, so in most cases there is little or no lock contention. That is why this data structure is said to be "low lock" or "concurrent" as opposed to "lock free".
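If computing the value is expensive, GetOrAdd also has an overload that takes a value factory delegate. One caveat: under contention the factory may run more than once, although only one result is ever stored. A short sketch (ExpensiveLookup is a hypothetical helper, not from the question):

```csharp
using System;
using System.Collections.Concurrent;

var dictionary = new ConcurrentDictionary<string, string>();

// Stand-in for a costly computation.
static string ExpensiveLookup(string key) => "someothervalue";

// The factory runs only if the key is absent. Under contention it may run
// more than once, but exactly one result is stored and returned to everyone.
string value = dictionary.GetOrAdd("a", key => ExpensiveLookup(key));
Console.WriteLine(value); // someothervalue
```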
You can first do a non-locking check, but if you want to be thread-safe you need to repeat the check again within the lock. This way you don't lock unless you have to, and you still ensure thread safety.

Dictionary<string, string> list = new Dictionary<string, string>();
object lockObj = new object();

public void MyMethod(string a) {
    if (list.ContainsKey(a))
        return;
    lock (lockObj) {
        if (!list.ContainsKey(a)) {
            list.Add(a, "someothervalue");
        }
    }
}
I need to implement a class that performs a locking mechanism in our framework.
We have several threads, numbered 0, 1, 2, 3, ... We have a static class called ResourceHandler that should lock these threads on given objects. The requirement is that n Lock() invocations should be released by m Release() invocations, where n = [0..] and m = [0..]. So no matter how many locks were taken on a single object, a single Release() call is enough to unlock them all. Going further, if an object is not locked, a Release() call should do nothing. We also need to know which objects are locked on which threads.
I have this implementation:
public class ResourceHandler
{
    private readonly Dictionary<int, List<object>> _locks = new Dictionary<int, List<object>>();

    public static ResourceHandler Instance { /* Singleton */ }

    public virtual void Lock(int threadNumber, object obj)
    {
        Monitor.Enter(obj);
        if (!_locks.ContainsKey(threadNumber)) { _locks.Add(threadNumber, new List<object>()); }
        _locks[threadNumber].Add(obj);
    }

    public virtual void Release(int threadNumber, object obj)
    {
        // Check whether we have threadNumber in _locks and skip if not
        var count = _locks[threadNumber].Count(x => x == obj);
        _locks[threadNumber].RemoveAll(x => x == obj);
        for (int i = 0; i < count; i++)
        {
            Monitor.Exit(obj);
        }
    }

    // .....
}
Actually what I am worried here about is thread-safety. I'm actually not sure, is it thread-safe or not, and it's a real pain to fix that. Am I doing the task correctly and how can I ensure that this is thread-safe?
Your Lock method locks on the target objects but the _locks dictionary can be accessed by any thread at any time. You may want to add a private lock object for accessing the dictionary (in both the Lock and Release methods).
Also keep in mind that by using such a ResourceHandler it is the responsibility of the rest of the code (the consuming threads) to release all used objects (a regular lock () block for instance covers that problem since whenever you leave the lock's scope, the object is released).
You also may want to use ReferenceEquals when counting the number of times an object is locked instead of ==.
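Putting those suggestions together, here is a hedged sketch of what the handler might look like with a private _sync object guarding the dictionary and ReferenceEquals for the comparisons (the _sync field and the TryGetValue guard are additions, not part of the original code):

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public class ResourceHandler
{
    private readonly object _sync = new object(); // guards _locks only
    private readonly Dictionary<int, List<object>> _locks = new Dictionary<int, List<object>>();

    public virtual void Lock(int threadNumber, object obj)
    {
        Monitor.Enter(obj);
        lock (_sync)
        {
            if (!_locks.ContainsKey(threadNumber))
                _locks.Add(threadNumber, new List<object>());
            _locks[threadNumber].Add(obj);
        }
    }

    public virtual void Release(int threadNumber, object obj)
    {
        int count;
        lock (_sync)
        {
            if (!_locks.TryGetValue(threadNumber, out List<object> held))
                return; // nothing locked on this thread: do nothing
            count = held.Count(x => ReferenceEquals(x, obj));
            held.RemoveAll(x => ReferenceEquals(x, obj));
        }
        for (int i = 0; i < count; i++)
            Monitor.Exit(obj); // must run on the same thread that called Lock
    }
}
```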
You can ensure this class is thread safe by using a ConcurrentDictionary, but it won't help you with all the problems you will get from trying to develop your own locking mechanism.
There are a number of locking mechanisms that are already part of the .NET Framework; you should use those.
It sounds like you are going to need to use a combination of these, including wait handles, to achieve what you want.
EDIT
After reading more carefully, I think you might need an EventWaitHandle
What you have got conceptually looks dangerous. For calls to Monitor.Enter and Monitor.Exit to work like a lock statement, it is recommended that they be encapsulated in a try/finally block, to ensure the Exit is always executed after the Enter. Calling Monitor.Exit before Monitor.Enter will throw an exception.
These problems (if an exception is thrown, the lock for a given thread may or may not have been taken, and if a lock was taken it will not be released, resulting in a leaked lock) are best avoided by using one of the options provided in the other answers. However, if you do want to proceed with this mechanism, CLR 4.0 added the following overload to the Monitor.Enter method:
public static void Enter(object obj, ref bool lockTaken);
lockTaken is false if and only if the Enter method throws an exception and the lock was not taken. So, by sharing the lockTaken flag between your two methods, you can create something like the following (the example here is for a single locker; you would need a Dictionary of List<bool> corresponding to your threads, or better yet a Tuple). In your Lock method you would have something like:
bool lockTaken = false;
Monitor.Enter(locker, ref lockTaken);
and in the Release method:
if (lockTaken)
Monitor.Exit(locker);
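Put together in the usual try/finally shape (essentially the pattern the compiled lock statement follows on CLR 4.0), the two pieces look like:

```csharp
using System.Threading;

object locker = new object();
bool lockTaken = false;
try
{
    Monitor.Enter(locker, ref lockTaken);
    // ... critical section ...
}
finally
{
    // Exit only if the lock was actually acquired, so an exception
    // inside Enter cannot cause a spurious Exit.
    if (lockTaken)
        Monitor.Exit(locker);
}
```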
I hope this helps.
Edit: I don't think I fully appreciated your problem, but from what I can gather I would be using a concurrent collection. These are fully thread safe. Check out IProducerConsumerCollection<T> and ConcurrentBag<T>. These should facilitate what you want, with all thread safety taken care of by the framework (note: a thread-safe collection doesn't mean the code it executes is thread safe!). However, using a collection like this is likely to be slower than using locks.
IMO you need to use an atomic set of functions to make it safe.
http://msdn.microsoft.com/en-us/library/system.threading.mutex.aspx
Mutexes, I guess, will help you.
Is there a difference in the below code segments in the way we lock?
public Hashtable mySet = new Hashtable(); // mySet is visible to other threads

lock (mySet)
{
    mySet.Add("Hello", "World");
}

and

public Hashtable mySet = new Hashtable();

lock (mySet.SyncRoot)
{
    mySet.Add("Hello", "World");
}
lock doesn't actually lock the object in question, so it makes no difference which object is used. Instead it uses the object to establish a protocol and as long as all threads use the same object the protocol guarantees that only one thread will execute code guarded by that lock.
You can think of the object as the microphone on a talk show. Whoever holds the microphone is the only one allowed to talk (I know that is not always how it turns out on some of the shows, but that's the idea anyway).
As the object passed to the lock will only be used as a "flag holder", this will not make any difference.
Please see this
According to the MSDN documentation here, only a lock on the SyncRoot of a collection guarantees thread safety.
Enumerating through a collection is intrinsically not a thread-safe procedure. Even when a collection is synchronized, other threads can still modify the collection, which causes the enumerator to throw an exception. To guarantee thread safety during enumeration, you can either lock the collection during the entire enumeration or catch the exceptions resulting from changes made by other threads.
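A short sketch of that guidance: take the SyncRoot lock for the entire enumeration, and have any writers take the same lock:

```csharp
using System.Collections;

Hashtable mySet = new Hashtable();
mySet.Add("greeting", "Hello World");

int seen = 0;
lock (mySet.SyncRoot) // writers must acquire the same lock
{
    foreach (DictionaryEntry entry in mySet)
    {
        seen++; // read-only work only; don't modify mySet inside the loop
    }
}
```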
Suppose that I have a Dictionary<string, string>. The dictionary is declared as public static in my console program.
If I'm working with threads and I want to foreach over this Dictionary from one thread while, at the same time, another thread adds an item to it, that would cause a bug: we can't modify the Dictionary while a foreach loop in another thread is iterating over it.
To bypass this problem I created a lock statement on the same static object on each operation on the dictionary.
Is this the best way to bypass this problem? My Dictionary can be very big and I can have many threads that want to foreach on it. As it is currently, things can be very slow.
Try using a ConcurrentDictionary<TKey, TValue>, which is designed for this kind of scenario.
There's a nice tutorial here on how to use it.
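As a quick sketch of why it fits this scenario: the ConcurrentDictionary enumerator doesn't throw when the dictionary is modified mid-enumeration; it simply may or may not observe the concurrent change:

```csharp
using System.Collections.Concurrent;

var dict = new ConcurrentDictionary<string, string>();
dict.TryAdd("k1", "v1");

foreach (var pair in dict)
{
    // Safe with ConcurrentDictionary; a plain Dictionary enumerator
    // would throw InvalidOperationException after this add.
    dict.TryAdd("k2", "v2");
}
```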
The big question is: Do you need the foreach to be a snapshot?
If the answer is "no", then use a ConcurrentDictionary and you will probably be fine. (The one remaining question is whether the pattern of your inserts and reads hits the striped locks in a bad way, but if that were the case you'd find normal reads and writes to the dictionary suffering even worse.)
However, because its GetEnumerator doesn't provide a snapshot, the contents it enumerates at the end need not match what was there at the beginning. It could miss items, or return duplicate items. The question is whether that's a disaster for you or not.
If it would be a disaster if you had duplicates, but not otherwise, then you can filter out duplicates with Distinct() (whether keyed on the keys or both the key and value, as required).
If you really need it to be a hard snapshot, then take the following approach.
Have a ConcurrentDictionary (dict) and a ReaderWriterLockSlim (rwls). On both reads and writes obtain a reader lock (yes even though you're writing):
public static void AddToDict(string key, string value)
{
    rwls.EnterReadLock();
    try
    {
        dict[key] = value;
    }
    finally
    {
        rwls.ExitReadLock();
    }
}

public static bool ReadFromDict(string key, out string value)
{
    rwls.EnterReadLock();
    try
    {
        return dict.TryGetValue(key, out value);
    }
    finally
    {
        rwls.ExitReadLock();
    }
}
Now, when we want to enumerate the dictionary, we acquire the write lock (even though we're reading):
public static IEnumerable<KeyValuePair<string, string>> EnumerateDict()
{
    rwls.EnterWriteLock();
    try
    {
        return dict.ToList();
    }
    finally
    {
        rwls.ExitWriteLock();
    }
}
This way we obtain the shared lock for reading and writing, because ConcurrentDictionary deals with the conflicts involved in that for us. We obtain the exclusive lock for enumerating, but just for long enough to obtain a snapshot of the dictionary in a list, which is then used only in that thread and not shared with any other.
With .NET 4 you get a fancy new ConcurrentDictionary. I think there are some .NET 3.5-based implementations floating around.
Yes, you will have a problem updating the global dictionary while an enumeration is running in another thread.
Solutions:
Require all users of the dictionary to acquire a mutex lock before accessing the object, and release the lock afterwards.
Use .NET 4.0's ConcurrentDictionary class.