ConcurrentDictionary<TKey,TValue> vs Dictionary<TKey,TValue> - C#

As MSDN says:
ConcurrentDictionary<TKey, TValue> Class: Represents a thread-safe collection of key-value pairs that can be accessed by multiple threads concurrently.
But as far as I know, the System.Collections.Concurrent classes are designed for PLINQ.
I have a Dictionary<Key,Value> which keeps the online clients on the server, and I make it thread-safe by locking an object whenever I access it.
Can I safely replace Dictionary<TKey,TValue> with ConcurrentDictionary<TKey,TValue> in my case? Will performance increase after the replacement?
Here, in Part 5, Joseph Albahari mentions that it is designed for parallel programming:
The concurrent collections are tuned for parallel programming. The conventional collections outperform them in all but highly concurrent scenarios.
A thread-safe collection doesn’t guarantee that the code using it will be thread-safe.
If you enumerate over a concurrent collection while another thread is modifying it, no exception is thrown. Instead, you get a mixture of old and new content.
There’s no concurrent version of List.
The concurrent stack, queue, and bag classes are implemented internally with linked lists. This makes them less memory-efficient than the nonconcurrent Stack and Queue classes, but better for concurrent access because linked lists are conducive to lock-free or low-lock implementations. (This is because inserting a node into a linked list requires updating just a couple of references, while inserting an element into a List-like structure may require moving thousands of existing elements.)

Without knowing more about what you're doing within the lock, it's impossible to say.
For instance, if all of your dictionary access looks like this:
lock(lockObject)
{
    foo = dict[key];
}
... // elsewhere
lock(lockObject)
{
    dict[key] = foo;
}
Then you'll be fine switching it out (though you likely won't see any difference in performance, so if it ain't broke, don't fix it). However, if you're doing anything fancy within the lock block where you interact with the dictionary, then you'll have to make sure that the dictionary provides a single function that can accomplish what you're doing within the lock block, otherwise you'll end up with code that is functionally different from what you had before. The biggest thing to remember is that the dictionary only guarantees that concurrent calls to the dictionary are executed in a serial fashion; it can't handle cases where you have a single action in your code that interacts with the dictionary multiple times. Cases like that, when not accounted for by the ConcurrentDictionary, require your own concurrency control.
Thankfully, the ConcurrentDictionary provides some helper functions for more common multi-step operations like AddOrUpdate or GetOrAdd, but they can't cover every circumstance. If you find yourself having to work to shoehorn your logic into these functions, it may be better to handle your own concurrency.
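A minimal sketch of those helper functions in action (the key name and values here are made up, purely for illustration):

```csharp
using System;
using System.Collections.Concurrent;

class Demo
{
    static void Main()
    {
        var dict = new ConcurrentDictionary<string, int>();

        // GetOrAdd returns the existing value, or atomically adds the given one.
        int a = dict.GetOrAdd("visits", 1);   // key missing: adds 1, returns 1
        int b = dict.GetOrAdd("visits", 99);  // key present: returns the stored 1

        // AddOrUpdate adds if the key is missing, otherwise applies the
        // update function to the current value, all as one atomic step.
        int c = dict.AddOrUpdate("visits", 1, (key, old) => old + 1); // 1 -> 2

        Console.WriteLine($"{a} {b} {c}"); // 1 1 2
    }
}
```

Note that the update delegate may be invoked more than once under contention, so it should be free of side effects.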

It's not as simple as replacing Dictionary with ConcurrentDictionary; you'll need to adapt your code, as these classes have new methods that behave differently, in order to guarantee thread safety.
E.g., instead of calling Add or Remove, you have TryAdd and TryRemove. It's important to use these methods because they behave atomically: if you make two calls where the second relies on the outcome of the first, you'll still have race conditions and will need a lock.
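A short sketch of those Try* methods (the client IDs and names are made up, just to show the return values):

```csharp
using System;
using System.Collections.Concurrent;

class Demo
{
    static void Main()
    {
        var clients = new ConcurrentDictionary<int, string>();

        // TryAdd reports failure instead of throwing on a duplicate key.
        bool first = clients.TryAdd(1, "alice");  // true: key was added
        bool second = clients.TryAdd(1, "bob");   // false: key already present

        // TryRemove hands back the removed value, collapsing the racy
        // "ContainsKey, then Remove" sequence into one atomic call.
        clients.TryRemove(1, out string removed);

        Console.WriteLine($"{first} {second} {removed}"); // True False alice
    }
}
```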

You can replace Dictionary<TKey, TValue> with ConcurrentDictionary<TKey, TValue>.
The effect on performance may not be what you want though (if there is a lot of locking/synchronization, performance may suffer...but at least your collection is thread-safe).

I'm unsure about replacement difficulties, but if there is anywhere you need to access multiple elements of the dictionary in the same "lock session", you'll need to modify your code.
It could give improved performance if Microsoft has implemented separate locks for reads and writes, since read operations shouldn't block other read operations.

Yes, you can safely replace it. However, a dictionary designed for PLINQ may carry some extra code for added functionality that you may not use, but the performance overhead will be marginal.


When should I use ConcurrentDictionary and Dictionary?

I'm always confused about which one of these to pick. As I see it, I use a Dictionary over a List if I want two data types as a key and value, so I can easily find a value by its key. But I am always confused about whether I should use a ConcurrentDictionary or a Dictionary.
Before you go off at me for not putting much research into this: I have tried, but it seems Google hasn't really got anything on Dictionary vs ConcurrentDictionary, only on each one individually.
I have asked a friend this before, but all they said was: "use ConcurrentDictionary if you use your dictionary a lot in code", and I didn't really want to pester them into explaining it in more detail. Could anyone expand on this?
"Use ConcurrentDictionary if you use your dictionary a lot in code" is kind of vague advice. I don't blame you for the confusion.
ConcurrentDictionary is primarily for use in an environment where you're updating the dictionary from multiple threads (or async tasks). You can use a standard Dictionary from as much code as you like if it's from a single thread ;)
If you look at the methods on a ConcurrentDictionary, you'll spot some interesting methods like TryAdd, TryGetValue, TryUpdate, and TryRemove.
For example, consider a typical pattern you might see for working with a normal Dictionary class.
// There are better ways to do this... but we need an example ;)
if (!dictionary.ContainsKey(id))
    dictionary.Add(id, value);
This has an issue: between the check for whether it contains a key and the call to Add, a different thread could call Add with that same id. When this thread then calls Add, it'll throw an exception. The TryAdd method handles that for you and returns true/false telling you whether it added the entry (or whether that key was already in the dictionary).
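The check-then-add pattern above collapses into a single atomic call (a sketch reusing the `dictionary`, `id`, and `value` names from the snippet; the values are stand-ins):

```csharp
using System;
using System.Collections.Concurrent;

class Demo
{
    static void Main()
    {
        var dictionary = new ConcurrentDictionary<int, string>();
        int id = 42;              // stand-ins for the variables
        string value = "example"; // in the snippet above

        // One atomic call replaces ContainsKey + Add: there is no window
        // between the check and the insert for another thread to sneak in.
        bool added = dictionary.TryAdd(id, value);

        Console.WriteLine(added); // True the first time for this key
    }
}
```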
So unless you're working in a multi-threaded section of code, you probably can just use the standard Dictionary class. That being said, you could theoretically have locks to prevent concurrent access to a dictionary; that question is already addressed in "Dictionary locking vs. ConcurrentDictionary".
The biggest reason to use ConcurrentDictionary over the normal Dictionary is thread safety. If your application will have multiple threads using the same dictionary at the same time, you need the thread-safe ConcurrentDictionary; this is particularly true when these threads are writing to or building the dictionary.
The downside to using ConcurrentDictionary without multi-threading is overhead. All the machinery that makes it thread-safe is still there; all the locks and checks still happen, taking processing time and using extra memory.
ConcurrentDictionary is useful when you need to access a dictionary across multiple threads (i.e. multithreading). Vanilla Dictionary objects do not possess this capability and therefore should only be used in a single-threaded manner.
A ConcurrentDictionary is useful when you want a high-performance dictionary that can be safely accessed by multiple threads concurrently. Compared to a standard Dictionary protected with a lock, it is more efficient under heavy usage because of its granular locking implementation. Instead of all threads competing for a single lock, the ConcurrentDictionary maintains multiple locks internally, minimizing contention that way and limiting the chance of it becoming a bottleneck.
Despite these nice characteristics, the number of scenarios where using a ConcurrentDictionary is the best option is actually quite small. There are two reasons for that:
The thread-safety guarantees offered by the ConcurrentDictionary are limited to the protection of its internal state. That's it. If you want to do anything slightly non-trivial, for example updating the dictionary and another variable as an atomic operation, you are out of luck. This is not a supported scenario for a ConcurrentDictionary. Even protecting the elements it contains (in case they are mutable objects) is not supported. If you try to update one of its values using the AddOrUpdate method, the dictionary will be protected but the value will not. "Update" in this context means replacing the existing value with another one, not modifying the existing value.
Whenever you find it tempting to use a ConcurrentDictionary, there are usually better alternatives available: alternatives that do not involve shared state, which is what a ConcurrentDictionary essentially is. No matter how efficient its locking scheme is, it will have a hard time beating an architecture with no shared state at all, where each thread does its own thing without interfering with the other threads. Commonly used libraries that follow this principle are PLINQ and the TPL Dataflow library. Below is a PLINQ example:
Dictionary<string, Product> dictionary = productIDs
    .AsParallel()
    .Select(id => GetProduct(id))
    .ToDictionary(product => product.Barcode);
Instead of creating a dictionary beforehand and then having multiple threads fill it concurrently with values, you can trust PLINQ to produce the dictionary using more efficient strategies: partitioning the initial workload and assigning each partition to a different worker thread. A single thread eventually aggregates the partial results and fills the dictionary.
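To illustrate the first point above: updating the dictionary together with any other piece of state still needs an external lock, because the ConcurrentDictionary only protects its own internals. A sketch with made-up names:

```csharp
using System;
using System.Collections.Concurrent;

class Demo
{
    static readonly ConcurrentDictionary<string, int> dict = new ConcurrentDictionary<string, int>();
    static readonly object gate = new object();
    static int total; // a second piece of shared state

    static void Record(string key, int value)
    {
        // The dictionary write alone would be thread-safe, but keeping
        // dict and total consistent with each other requires a lock.
        lock (gate)
        {
            dict[key] = value;
            total += value;
        }
    }

    static void Main()
    {
        Record("a", 2);
        Record("b", 3);
        Console.WriteLine(total); // 5
    }
}
```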
The accepted answer above is correct. However, it is worth mentioning explicitly that if a dictionary is not being modified, i.e. it is only ever read from, then Dictionary<TKey,TValue> is preferred regardless of the number of threads, because no synchronization is required.
E.g., caching configuration in a Dictionary<TKey,TValue> that is populated only once at startup and then read throughout the life of the application.
When to use a thread-safe collection : ConcurrentDictionary vs. Dictionary
If you are only reading key or values, the Dictionary<TKey,TValue> is faster because no synchronization is required if the dictionary is not being modified by any threads.

Dictionary(Of TKey, TValue) concurrent access: worst scenario

So I have a Dictionary<string, SomeClass> which will be accessed heavily by multiple concurrent threads: some will write, most will read. No locks, no synchronization: worker threads will make no checks and will simply read or write. The only guarantee is that no two threads will write a value with the same key. The question is: could this data structure become corrupted this way? By corrupted I mean no longer working, even with one thread.
could this data structure become corrupted this way?
Yes, most likely you would get an IndexOutOfRangeException or a similar exception.
And even if you catch and ignore the exceptions, you would no longer get reliable data. Both duplicates and missing values are possible.
So just don't do this.
The worst-case scenarios include:
A NullReferenceException or IndexOutOfRangeException thrown out of a Dictionary<,> method.
An arbitrary amount of data is lost. If two threads attempt to resize the Dictionary<,> table at the same time, they can stomp on each other, screw up, and lose data.
A wrong answer is returned by a read from the Dictionary<,>.
Basically, the Dictionary<,> can do just about anything bad you can think of, within the limits imposed by the CLR. Presumably, you still won't break type safety or corrupt the heap as you could in a native programming language. Probably, anyways :-)
If you're accessing a collection from multiple threads, it's safest to use one of the thread-safe varieties in .NET 4, such as ConcurrentDictionary.

Is the ConcurrentDictionary thread-safe to the point that I can use it for a static cache?

Basically, if I want to do the following:
public class SomeClass
{
    private static ConcurrentDictionary<..., ...> Cache { get; set; }
}
Does this let me avoid using locks all over the place?
Yes, it is thread-safe, and yes, it avoids you having to use locks all over the place (whatever that means). Of course that only gives you thread-safe access to the data stored in this dictionary; if the data itself is not thread-safe, then you need to synchronize access to it. Imagine, for example, that you have stored a List<T> in this cache. Now thread1 fetches this list (in a thread-safe manner, as the concurrent dictionary guarantees this) and then starts enumerating over it. At exactly the same time, thread2 fetches this very same list from the cache (in a thread-safe manner, as the concurrent dictionary guarantees this) and writes to the list (for example, it adds a value). Conclusion: if access to the list itself isn't synchronized, thread1 will get into trouble.
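The List<T> scenario described above, sketched in code (locking on the list instance is just one way to coordinate; the key and values are made up):

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

class Demo
{
    static void Main()
    {
        var cache = new ConcurrentDictionary<string, List<int>>();
        cache.TryAdd("scores", new List<int> { 1, 2, 3 });

        // Fetching the list is thread-safe; the List<int> itself is not.
        // Any thread that enumerates or mutates it must coordinate with
        // the others, e.g. by locking on the list instance.
        if (cache.TryGetValue("scores", out List<int> scores))
        {
            lock (scores)
            {
                scores.Add(4);
                Console.WriteLine(scores.Count); // 4
            }
        }
    }
}
```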
As far as using it as a cache is concerned, that's probably not a good idea. For caching I would recommend what is already built into the framework, such as the MemoryCache class. The reason is that what is built into the System.Runtime.Caching assembly is explicitly built for caching: it handles things like automatic expiration of data when you start running low on memory and callbacks for expired cache items, and you can even distribute your cache over multiple servers using things like memcached or AppFabric, all things you can't get from a concurrent dictionary.
You still may need to use locking in the same way that you might need a transaction in a database. The "concurrent" part means that the dictionary will continue to function correctly across multiple threads.
Built into the concurrent collection are TryGetValue and TryRemove which acknowledge that someone might delete an item first. Locking at a granular level is built in, but you still need to think about what to do in these situations. For Caching, it often doesn't matter -- i.e. it's an idempotent operation.
Re: caching. I feel that it depends on what you are storing in the cache and what you are doing with it. There are casting costs associated with storing items as object. For most web-based things, MemoryCache is probably better suited, as suggested above.

Is there a threadsafe and generic IList<T> in c#?

Is List<T> or HashSet<T> or anything else built in threadsafe for addition only?
My question is similar to "Threadsafe and generic arraylist?", but I'm only looking for safety when adding to this list from multiple threads, not for removal or reading.
System.Collections.Concurrent.BlockingCollection<T>
Link.
In .NET 4.0 you could use BlockingCollection<T>, but that is still designed to be thread-safe for all operations, not just addition.
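A sketch of concurrent additions with BlockingCollection<T> (which wraps a ConcurrentQueue<T> by default); the workload here is invented purely to demonstrate lock-free adds from many threads:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class Demo
{
    static void Main()
    {
        var collection = new BlockingCollection<int>();

        // Many threads adding concurrently, with no explicit locks.
        Parallel.For(0, 1000, i => collection.Add(i));
        collection.CompleteAdding();

        Console.WriteLine(collection.Count); // 1000
    }
}
```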
In general, it's uncommon to design a data structure that guarantees certain operations are safe for concurrency while others are not. If you're concerned about the overhead of accessing a collection for reading, you should do some benchmarking before going out of your way to look for specialized collections.

Concurrent Dictionary in C#

I need to implement a concurrent dictionary, because .NET does not yet contain concurrent implementations of its collections (.NET 4 will). Can I use the "Power Threading Library" from Jeffrey Richter for this, or are there existing implementations, or any advice on implementing one?
Thanks ...
I wrote a thread-safe wrapper for the normal Dictionary class that uses Interlocked to protect the internal dictionary. Interlocked is by far the fastest locking mechanism available and will give much better performance than ReaderWriterLockSlim, Monitor or any of the other available locks.
The code was used to implement a Cache class for Fasterflect, which is a library to speed up reflection. As such, we tried a number of different approaches in order to find the fastest possible solution. Interestingly, the new concurrent collections in .NET 4 are noticeably faster than my implementation, although both are pretty darn fast compared to solutions using a less performant locking mechanism. The implementation for .NET 3.5 is located inside a conditional region in the bottom half of the file.
You can use Reflector to view the source code of the concurrent implementations in the .NET 4.0 RC and copy it into your own code. That way you will have the fewest problems when migrating to .NET 4.0.
I wrote a concurrent dictionary myself (prior to .NET 4.0's System.Collections.Concurrent namespace); there's not much to it. You basically just want to make sure certain methods are not getting called at the same time, e.g., Contains and Remove or something like that.
What I did was to use a ReaderWriterLock (in .NET 3.5 and above, you could go with ReaderWriterLockSlim) and call AcquireReaderLock for all "read" operations (like this[TKey], ContainsKey, etc.) and AcquireWriterLock for all "write" operations (like this[TKey] = value, Add, Remove, etc.). Be sure to wrap any calls of this sort in a try/finally block, releasing the lock in the finally.
It's also a good idea to modify the behavior of GetEnumerator slightly: rather than enumerate over the existing collection, make a copy of it and allow enumeration over that. Otherwise you'll face potential deadlocks.
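A minimal sketch of the wrapper described above: ReaderWriterLockSlim, try/finally around every operation, and enumeration over a snapshot. The class name is made up, most members and error handling are omitted, and this is not production code:

```csharp
using System.Collections.Generic;
using System.Threading;

public class SynchronizedDictionary<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> inner = new Dictionary<TKey, TValue>();
    private readonly ReaderWriterLockSlim rw = new ReaderWriterLockSlim();

    public TValue this[TKey key]
    {
        get
        {
            rw.EnterReadLock();
            try { return inner[key]; }
            finally { rw.ExitReadLock(); }
        }
        set
        {
            rw.EnterWriteLock();
            try { inner[key] = value; }
            finally { rw.ExitWriteLock(); }
        }
    }

    public bool ContainsKey(TKey key)
    {
        rw.EnterReadLock();
        try { return inner.ContainsKey(key); }
        finally { rw.ExitReadLock(); }
    }

    // Enumerate a copy, so callers never hold the lock while iterating.
    public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator()
    {
        rw.EnterReadLock();
        try { return new List<KeyValuePair<TKey, TValue>>(inner).GetEnumerator(); }
        finally { rw.ExitReadLock(); }
    }
}
```

Multiple readers can hold the read lock simultaneously, while a writer gets exclusive access, which matches the read-mostly workload described in the question above.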
Here's a simple implementation that uses sane locking (though Interlocked would likely be faster):
http://www.tech.windowsapplication1.com/content/the-synchronized-dictionarytkey-tvalue
Essentially, just create a Dictionary wrapper/decorator and synchronize access to any read/write actions.
When you switch to .NET 4.0, just replace all of your overloads with delegated calls to the underlying ConcurrentDictionary.
