When calling TryUpdate you must specify the old value in addition to the key. Why is that required?
And an additional question: why do TryUpdate and similar methods have a while (true) loop inside?
The ConcurrentDictionary<TKey,TValue> collection is designed to support concurrent scenarios, where operations must be atomic. For example, let's say that you have a dictionary with string keys and int values, and you want to increment the value of the key "A". The following code is not atomic:
dictionary["A"]++;
Between reading the value and updating it, it is possible that another thread will change the value, resulting in the other thread's change being lost. It is easier to see it if we rewrite the above code like this:
var value = dictionary["A"];
value++;
dictionary["A"] = value;
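The solution is to avoid updating the dictionary through the indexer and to use TryUpdate instead. TryUpdate only writes the new value if the current value still equals the one we read. If another thread changes the value in the meantime, the call returns false and we have to start all over again, until we finally win the race to update this key: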
while (true)
{
var existing = dictionary["A"];
var updated = existing + 1;
if (dictionary.TryUpdate("A", updated, existing)) break;
}
Doing loops with while (true), also known as "spinning", is a typical technique in low-lock multithreaded programming.
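The same retry loop is what AddOrUpdate runs internally, so the increment can also be written without a hand-rolled loop. The following is a minimal sketch, not taken from the answer above; the class and method names (IncrementDemo, IncrementInParallel) are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class IncrementDemo
{
    // Increments the counter stored under "A" from many threads at once
    // and returns the final value.
    public static int IncrementInParallel(int times)
    {
        var dictionary = new ConcurrentDictionary<string, int>();
        dictionary["A"] = 0;

        Parallel.For(0, times, _ =>
        {
            // AddOrUpdate retries internally until its update wins the race,
            // so the update delegate may run more than once per call.
            dictionary.AddOrUpdate("A", 1, (key, existing) => existing + 1);
        });

        return dictionary["A"];
    }

    public static void Main()
    {
        Console.WriteLine(IncrementInParallel(1000)); // prints 1000
    }
}
```

Because the update delegate may be invoked several times, it should be free of side effects.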
Related question: Is there a way to use ConcurrentDictionary.TryUpdate with a lambda expression?
Why do we need the third argument, comparisonValue, in the ConcurrentDictionary.TryUpdate method?
And why does the update fail if the existing value is not equal to comparisonValue? Can't we just replace the existing value with the new one, just like in a normal Dictionary<,>?
This is the signature:
public bool TryUpdate(TKey key, TValue newValue, TValue comparisonValue)
The point is that you use a concurrent dictionary in scenarios where several threads access the dictionary at the same time. You don't know who changed the dictionary in the meantime, or how. Passing a comparison value is a very simple and effective way of making the change only if the state of the dictionary is the one you expect.
If you expect collisions to be relatively rare, this is a very efficient and performant way of handling shared state (no need for locking, and thus stopping all access). This pattern is the basis of lock-free code; you see it even on the hardware level. You can look up Compare and Exchange (or Compare and Swap) for more information.
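For comparison, here is roughly what the compare-and-swap pattern looks like on a plain integer, using Interlocked.CompareExchange from the framework. This is only an illustrative sketch; the class name CompareExchangeDemo and its members are made up for the example and are not part of any answer above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CompareExchangeDemo
{
    private static int _counter;

    // Adds one to _counter without a lock, retrying whenever another
    // thread changed the value between our read and our write.
    public static void IncrementWithCas()
    {
        while (true)
        {
            int seen = Volatile.Read(ref _counter);
            int desired = seen + 1;
            // Stores desired only if _counter still equals seen, and returns
            // the value that was in _counter before the call.
            if (Interlocked.CompareExchange(ref _counter, desired, seen) == seen)
            {
                return;
            }
        }
    }

    public static int Run(int times)
    {
        _counter = 0;
        Parallel.For(0, times, _ => IncrementWithCas());
        return _counter;
    }

    public static void Main()
    {
        Console.WriteLine(Run(1000)); // prints 1000
    }
}
```

TryUpdate plays the same role for a dictionary entry that CompareExchange plays here for a single field: the third argument is the value you believe is currently stored.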
If you want to update a key of a ConcurrentDictionary regardless of its current value, you can just use the set accessor of the indexer:
var dictionary = new ConcurrentDictionary<int, string>();
dictionary[1] = "Hello";
dictionary[2] = "World";
dictionary[1] = "Goodbye";
Console.WriteLine(String.Join(", ", dictionary));
Output:
[1, Goodbye], [2, World]
If each thread is working with an isolated set of keys, updating a ConcurrentDictionary like this might be sufficient. But if multiple threads are competing to update the same keys, chaos might ensue. In those cases it might be desirable to use the TryUpdate method, or more frequently the AddOrUpdate method. These methods let you update the dictionary conditionally, with the check and the update performed as one atomic operation.
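To make the conditional behaviour concrete, the sketch below calls TryUpdate twice with the same comparison value. The second call fails because the stored value has already changed. The class and method names (TryUpdateDemo, UpdateTwice) are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;

public static class TryUpdateDemo
{
    // Returns the results of two TryUpdate calls and the final stored value.
    public static Tuple<bool, bool, string> UpdateTwice()
    {
        var dictionary = new ConcurrentDictionary<int, string>();
        dictionary[1] = "Hello";

        // Succeeds: the current value equals the comparison value "Hello".
        bool first = dictionary.TryUpdate(1, "Goodbye", "Hello");

        // Fails: the current value is now "Goodbye", so "Hello" is stale.
        bool second = dictionary.TryUpdate(1, "Again", "Hello");

        return Tuple.Create(first, second, dictionary[1]);
    }

    public static void Main()
    {
        var result = UpdateTwice();
        Console.WriteLine(result.Item1); // True
        Console.WriteLine(result.Item2); // False
        Console.WriteLine(result.Item3); // Goodbye
    }
}
```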
The following question might offer some insights about how this API can be used in practice:
Is there a way to use ConcurrentDictionary.TryUpdate with a lambda expression?
I have a collection as below
private static readonly Dictionary<string,object> _AppCache = new Dictionary<string,object>();
Then I was wondering which one is better to use to check if a key exists (none of my keys has null value)
_AppCache.ContainsKey("x")
_AppCache["x"] != null
This code might be accessed by any number of threads.
The whole code is:
public void SetGlobalObject(string key, object value)
{
globalCacheLock.EnterWriteLock();
try
{
if (!_AppCache.ContainsKey(key))
{
_AppCache.Add(key, value);
}
}
finally
{
globalCacheLock.ExitWriteLock();
}
}
Update
I changed my code to use Dictionary to keep the focus of the question on ContainsKey versus the indexer.
I don't disagree with others' advice to use Dictionary. However, to answer your question, I think you should use ContainsKey to check if a key exists, for several reasons:
That is specifically what ContainsKey was written to do
For _AppCache["x"] != null to work your app must operate under an unenforced assumption (that no values will be null). That assumption may hold true now, but future maintainers may not know or understand this critical assumption, resulting in unintuitive bugs
Slightly less processing for ContainsKey, although this is not really important
Neither of the two choices is thread-safe, so that is not a deciding factor. For that, you either need to use locking, or use ConcurrentDictionary.
If you move to a Dictionary (per your question update), the answer is even more in favor of ContainsKey. If you used the index option, you would have to catch an exception to detect if the key is not in the Dictionary. ContainsKey would be much more straightforward in your code.
When the key is in the Dictionary, ContainsKey is slightly more efficient. Both options first call an internal method FindEntry. In the case of ContainsKey, it just returns the result of that. For the index option, it must also retrieve the value. In the case of the key not being in the Dictionary, the index option would be a fair amount less efficient, because it will be throwing an exception.
You are obviously checking for the existence of that key. In that case, _AppCache["x"] != null will give you a KeyNotFoundException if the key does not exist, which is probably not as desirable. If you really want to check if the key exists, without generating an exception by just checking, you have to use _AppCache.ContainsKey("x"). For checking if the key exists in the dictionary or hashtable, I would stick with ContainsKey. Any difference in performance, if != null is faster, would be offset by the additional code to deal with the exception if the key really does not exist.
In reality, _AppCache["x"] != null is not checking if the key exists, it is checking, given that key "x" exists, whether the associated value is null.
Neither way (although accomplishing different tasks) gives you any advantage on thread safety.
All of this holds true if you use ConcurrentDictionary - no difference in thread safety, the two ways accomplish different things, any possible gain in checking with !=null is offset by additional code to handle exception. So, use ContainsKey.
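The difference between the two checks is easiest to see with a key whose value is null. The sketch below also shows TryGetValue, which is not discussed in the answers above but is the usual choice when you need the value as well as the existence check, since it does both in one lookup without throwing. LookupDemo and Describe are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class LookupDemo
{
    // Classifies a key as "missing", "present with null value",
    // or "present with value", using a single lookup.
    public static string Describe(Dictionary<string, object> cache, string key)
    {
        object value;
        if (!cache.TryGetValue(key, out value))
        {
            return "missing";
        }
        return value == null ? "present with null value" : "present with value";
    }

    public static void Main()
    {
        var cache = new Dictionary<string, object> { { "x", null }, { "y", 42 } };

        Console.WriteLine(cache.ContainsKey("x")); // True: the key exists
        Console.WriteLine(cache["x"] != null);     // False: the stored value is null

        Console.WriteLine(Describe(cache, "x"));   // present with null value
        Console.WriteLine(Describe(cache, "y"));   // present with value
        Console.WriteLine(Describe(cache, "z"));   // missing (cache["z"] would throw)
    }
}
```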
If you're concerned about thread-safety, you should have a look at the ConcurrentDictionary class.
If you do not want to use ConcurrentDictionary, then you'll have to make sure that you synchronize access to your regular Dictionary<K,V> instance. That means making sure that no two threads can access your dictionary at the same time, by locking on each write and read operation.
For instance, if you want to add something to a regular Dictionary in a thread-safe way, you'll have to do it like this:
private readonly object _sync = new object();
// ...
lock( _sync )
{
if( _dictionary.ContainsKey(someKey) == false )
{
_dictionary.Add(someKey, somevalue);
}
}
You shouldn't be using Hashtable anymore, since the generic and therefore type-safe alternative, Dictionary<K,V>, was introduced in .NET 2.0.
One caveat though when using a Dictionary<K,V>: when you want to retrieve the value associated with a given key, the Dictionary will throw an exception when there is no entry for that specified key, whereas a Hashtable will return null in that case.
You should use a ConcurrentDictionary, which is itself thread-safe, rather than a Dictionary. Then you do not need the lock, which (generally *) improves performance, since locking mechanisms are rather expensive.
Now, only to check whether an entry exists I recommend ContainsKey, irrespective of which (Concurrent)Dictionary you use:
_AppCache.ContainsKey(key)
But what you do in two steps can be done in one step using the Concurrent Dictionary by using GetOrAdd:
_AppCache.GetOrAdd(key, value);
You need a lock for neither action:
public void SetGlobalObject(string key, object value)
{
_AppCache.GetOrAdd(key, value);
}
Not only does this (probably *) perform better, but I think it also expresses your intentions much more clearly and with less clutter.
(*) Using "probably" and "generally" here to emphasise that these data structures do have loads of baked-in optimisations for performance, however performance in your specific case must always be measured.
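As a quick check that GetOrAdd keeps the "add only if absent" behaviour of the original SetGlobalObject, the sketch below calls it twice for the same key. The second call leaves the stored value alone and returns the existing one. GetOrAddDemo and AddTwice are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;

public static class GetOrAddDemo
{
    // Calls GetOrAdd twice with the same key and returns both results.
    public static Tuple<object, object> AddTwice()
    {
        var cache = new ConcurrentDictionary<string, object>();

        // The key is absent, so "one" is stored and returned.
        object first = cache.GetOrAdd("x", "one");

        // The key is present, so "two" is ignored and "one" is returned.
        object second = cache.GetOrAdd("x", "two");

        return Tuple.Create(first, second);
    }

    public static void Main()
    {
        var result = AddTwice();
        Console.WriteLine(result.Item1); // one
        Console.WriteLine(result.Item2); // one
    }
}
```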
Are the following assumptions valid for this code? I put some background info under the code, but I don't think it's relevant.
Assumption 1: Since this is a single application, I'm making the assumption it will be handled by a single process. Thus, static variables are shared between threads, and declaring my collection of lock objects statically is valid.
Assumption 2: If I know the value is already in the dictionary, I don't need to lock on read. I could use a ConcurrentDictionary, but I believe this one will be safe since I'm not enumerating (or deleting), and the value will exist and not change when I call UnlockOnValue().
Assumption 3: I can lock on the Keys collection, since that reference won't change, even if the underlying data structure does.
private static Dictionary<String,Object> LockList =
new Dictionary<string,object>();
private void LockOnValue(String queryStringValue)
{
lock(LockList.Keys)
{
if(!LockList.Keys.Contains(queryStringValue))
{
LockList.Add(queryStringValue,new Object());
}
System.Threading.Monitor.Enter(LockList[queryStringValue]);
}
}
private void UnlockOnValue(String queryStringValue)
{
System.Threading.Monitor.Exit(LockList[queryStringValue]);
}
Then I would use this code like:
LockOnValue(Request.QueryString["foo"]);
//Check cache expiry
//if expired
//Load new values and cache them.
//else
//Load cached values
UnlockOnValue(Request.QueryString["foo"]);
Background: I'm creating an app in ASP.NET that downloads data based on a single user-defined variable in the query string. The number of values will be quite limited. I need to cache the results for each value for a specified period of time.
Approach: I decided to use local files to cache the data, which is not the best option, but I wanted to try it since this is non-critical and performance is not a big issue. I used 2 files per option, one with the cache expiry date, and one with the data.
Issue: I'm not sure what the best way to do locking is, and I'm not overly familiar with threading issues in .NET (one of the reasons I chose this approach). Based on what's available, and what I read, I thought the above should work, but I'm not sure and wanted a second opinion.
Your current solution looks pretty good. The two things I would change:
1: UnlockOnValue needs to go in a finally block. If an exception is thrown, it will never release its lock.
2: LockOnValue is somewhat inefficient, since it does a dictionary lookup twice. This isn't a big deal for a small dictionary, but for a larger one you will want to switch to TryGetValue.
Also, your assumption 3 holds - at least for now. But the Dictionary contract makes no guarantee that the Keys property always returns the same object. And since it's so easy to not rely on this, I'd recommend against it. Whenever I need an object to lock on, I just create an object for that sole purpose. Something like:
private static Object _lock = new Object();
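Putting these suggestions together, one possible shape is sketched below. It is not the asker's code: ValueLocks, GetLockFor and WithLock are hypothetical names, and the sketch takes the per-value lock with a lock statement (which compiles to Monitor.Enter and Monitor.Exit inside try/finally) rather than separate LockOnValue and UnlockOnValue calls. It assumes the set of values stays small, since lock objects are never removed.

```csharp
using System;
using System.Collections.Generic;

public static class ValueLocks
{
    // Dedicated object that guards the dictionary itself.
    private static readonly object _sync = new object();
    private static readonly Dictionary<string, object> _lockList =
        new Dictionary<string, object>();

    // Returns the lock object for a value, creating it on first use.
    // TryGetValue avoids the second dictionary lookup.
    public static object GetLockFor(string queryStringValue)
    {
        lock (_sync)
        {
            object valueLock;
            if (!_lockList.TryGetValue(queryStringValue, out valueLock))
            {
                valueLock = new object();
                _lockList.Add(queryStringValue, valueLock);
            }
            return valueLock;
        }
    }

    // Runs work while holding the per-value lock. The lock is released
    // even if work throws, because lock expands to try/finally.
    public static T WithLock<T>(string queryStringValue, Func<T> work)
    {
        lock (GetLockFor(queryStringValue))
        {
            return work();
        }
    }

    public static void Main()
    {
        string data = WithLock("foo", () => "cached or freshly loaded data");
        Console.WriteLine(data);
    }
}
```

The dictionary lock is held only long enough to find or create the per-value lock object, so a slow cache reload for one value does not block requests for other values.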
lock only has the scope of a single process. If you want to span processes, you'll have to use a primitive such as a named Mutex.
lock is the same as Monitor.Enter and Monitor.Exit. If you also call Monitor.Enter and Monitor.Exit, that is redundant.
You don't need to lock on read, but you do have to lock the "transaction" of checking that the value doesn't exist and then adding it. If you don't lock that series of instructions, another thread could add the key between your check and your Add, which results in an exception. The lock you're doing is sufficient for that (you don't need the additional calls to Enter and Exit, because lock will do that for you).
Am I right in thinking this is the correct use of a ConcurrentDictionary?
private ConcurrentDictionary<int,long> myDic = new ConcurrentDictionary<int,long>();
//Main thread at program startup
for(int i = 0; i < 4; i++)
{
myDic.TryAdd(i, 0); // ConcurrentDictionary exposes TryAdd; Add is not a public method
}
//Separate threads use this to update a value
myDic[InputID] = newLongValue;
I have no locks etc and am just updating the value in the dictionary even though multiple threads might be trying to do the same.
It depends on what you mean by thread-safe.
From MSDN - How to: Add and Remove Items from a ConcurrentDictionary:
ConcurrentDictionary<TKey, TValue> is designed for multithreaded scenarios. You do not have to use locks in your code to add or remove items from the collection. However, it is always possible for one thread to retrieve a value, and another thread to immediately update the collection by giving the same key a new value.
So, it is possible to get an inconsistent view of the value of an item in the dictionary.
The best way to find this out is to check the MSDN documentation.
For ConcurrentDictionary the page is http://msdn.microsoft.com/en-us/library/dd287191.aspx
Under thread safety section, it is stated "All public and protected members of ConcurrentDictionary(Of TKey, TValue) are thread-safe and may be used concurrently from multiple threads."
So from a concurrency point of view you are okay.
Yes, you are right.
That, and the ability to enumerate the dictionary on one thread while changing it on another thread, are the only reasons that class exists.
It depends, in my case I prefer using this method.
ConcurrentDictionary<TKey, TValue>.AddOrUpdate Method (TKey, Func<TKey, TValue>, Func<TKey, TValue, TValue>);
See MSDN Library for method usage details.
Sample usage:
results.AddOrUpdate(
Id,
id => new DbResult() {
Id = id,
Value = row.Value,
Rank = 1
},
(id, v) =>
{
v.Rank++;
return v;
});
Just a note: filling a ConcurrentDictionary from a sequential loop does not by itself justify using it, because the collection is then underutilized. A better alternative is to follow the recommendations of the Microsoft documentation, as mentioned by Oded, and use parallelism, as in the example below:
Parallel.For(0, 4, i =>
{
myDic.TryAdd(i, 0);
});
I've written a wrapper class around a 3rd party library that requires properties to be set by calling a Config method and passing a string formatted as "Property=Value"
I'd like to pass all the properties in a single call and process them iteratively.
I've considered the following:
1: creating a property/value class and then creating a List of these objects
2: building a string of multiple "Property=Value" separating them with a token (maybe "|")
3: using a hash table
All of these would work (and I'm thinking of using option 1) but is there a better way?
A bit more detail about my query:
The finished class will be included in a library for re-use in other applications. Whilst I don't currently see threading as a problem at the moment (our apps tend to just have a UI thread and a worker thread) it could become an issue in the future.
Garbage collection will not be an issue.
Access to arbitrary indices of the data source is not currently an issue.
Optimization is not currently an issue, but clearly defining the key/value pairs is important.
As you've already pointed out, any of the proposed solutions will accomplish the task as you've described it. What this means is that the only rational way to choose a particular method is to define your requirements:
Does your code need to support multiple threads accessing the data source simultaneously? If so, using a ConcurrentDictionary, as Yahia suggested, makes sense. Otherwise, there's no reason to incur the additional overhead and complexity of using a concurrent data structure.
Are you working in an environment where garbage collection is a problem (for example, an XNA game)? If so, any suggestion involving the concatenation of strings is going to be problematic.
Do you need O(1) access to arbitrary indices of the data source? If so, your third approach makes sense. On the other hand, if all you're doing is iterating over the collection, there's no reason to incur the additional overhead of inserting into a hashtable; use a List<KeyValuePair<String, String>> instead.
On the other hand, you may not be working in an environment where the optimization described above is necessary; the ability to clearly define the key/value pairs programmatically may be more important to you. In that case, using a Dictionary is a better choice.
You can't make an informed decision as to how to implement a feature without completely defining what the feature needs to do, and since you haven't done that, any answer given here will necessarily be incomplete.
Given your clarifications, I would personally suggest the following:
Avoid making your Config() method thread-safe by default, as per the MSDN guidelines:
By default, class libraries should not be thread safe. Adding locks to create thread-safe code decreases performance, increases lock contention, and creates the possibility for deadlock bugs to occur.
If thread safety becomes important later, make it the caller's responsibility.
Given that you don't have special performance requirements, stick with a dictionary to allow key/value pairs to be easily defined and read.
For simplicity's sake, and to avoid generating lots of unnecessary strings doing concatenations, just pass the dictionary in directly and iterate over it.
Consider the following example:
var configData = new Dictionary<String, String>();
configData["key1"] = "value1";
configData["key2"] = "value2";
myLibraryObject.Config(configData);
And the implementation of Config:
public void Config(Dictionary<String, String> values)
{
foreach(var kvp in values)
{
var configString = String.Format("{0}={1}", kvp.Key, kvp.Value);
// do whatever
}
}
You could use Dictionary<string,string>; the items are then of type KeyValuePair<string,string> (this corresponds to your first idea).
You can then use myDict.Select(kvp=>string.Format("{0}={1}",kvp.Key,kvp.Value)) to get a list of strings with the needed formatting
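A complete version of that Select call might look like the sketch below. ConfigFormatter and ToConfigStrings are hypothetical names; the code only produces the "Property=Value" strings and leaves the call to the third-party Config method to the caller.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConfigFormatter
{
    // Turns each key/value pair into a "Property=Value" string.
    public static List<string> ToConfigStrings(IDictionary<string, string> values)
    {
        return values
            .Select(kvp => string.Format("{0}={1}", kvp.Key, kvp.Value))
            .ToList();
    }

    public static void Main()
    {
        var myDict = new Dictionary<string, string>
        {
            { "key1", "value1" },
            { "key2", "value2" }
        };

        foreach (string configString in ToConfigStrings(myDict))
        {
            Console.WriteLine(configString);
        }
    }
}
```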
Use for example a ConcurrentDictionary<string,string> - it is thread-safe and really fast since most operations are implemented lock-free...
You could make a helper class that uses reflection to turn any class into a Property=Value collection
public static class PropertyValueHelper
{
public static IEnumerable<string> GetPropertyValues(object source)
{
Type t = source.GetType();
foreach (var property in t.GetProperties())
{
object value = property.GetValue(source, null);
if (value != null)
{
yield return property.Name + "=" + value.ToString();
}
else
{
yield return property.Name + "=";
}
}
}
}
You would need to add extra logic to handle enumerations, indexed properties, etc.
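As one example of that extra logic, the variant below skips indexed properties, which would otherwise make GetValue(source, null) throw because they require arguments. It is only a sketch of one possible approach; PropertyValueSketch is a hypothetical name, and enumerations are simply formatted through ToString.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class PropertyValueSketch
{
    public static IEnumerable<string> GetPropertyValues(object source)
    {
        foreach (PropertyInfo property in source.GetType().GetProperties())
        {
            // Indexed properties (such as the Chars indexer on string)
            // need index arguments, so they are skipped here.
            if (property.GetIndexParameters().Length > 0) continue;

            // Write-only properties cannot be read at all.
            if (!property.CanRead) continue;

            object value = property.GetValue(source, null);
            yield return property.Name + "=" + (value == null ? "" : value.ToString());
        }
    }

    public static void Main()
    {
        var settings = new { Host = "localhost", Port = 8080 };
        foreach (string line in GetPropertyValues(settings))
        {
            Console.WriteLine(line);
        }
    }
}
```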