Concurrently Access and Write to a Synchronized Dictionary Efficiently in C# - c#

I'm looking for the most efficient way to store key value pairs in a static Synchronized Dictionary (.NET 3.5, so not ConcurrentDictionary) while being able to access them at the same time.
Dictionary.Add(key, value);
if (Dictionary.Count >= 200)
{
foreach (KeyValuePair<string, Info> pair in Dictionary)
{
Info entry = pair.Value;
StoreInDatabase(entry);
}
Dictionary.Clear();
}
This is where the problem lies. If one user is adding to the dictionary while another is accessing and storing to the database it breaks.
lock (Dictionary)
{
//Same Code Above
}
I put a lock in, and it seems to work fine, but I'm wondering if there is a more efficient way of doing this. It's not as efficient as I'd like it to be. Any suggestions would be much appreciated!
Note: I have to use the StoreInDatabase method to store the values.
REVISED CODE:
private static SynchronizedDictionary<string, Info> Dictionary = new SynchronizedDictionary<string, Info>();
...
Dictionary.Add(key, value);
if (Dictionary.Count >= 200)
{
SynchronizedDictionary<string, Info> temporaryDictionary = new SynchronizedDictionary<string, Info>();
lock (Dictionary)
{
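// Copy the entries: assigning the reference would alias the same dictionary,
// and Clear() would then empty the "temporary" one as well.
foreach (KeyValuePair<string, Info> pair in Dictionary)
{
temporaryDictionary.Add(pair.Key, pair.Value);
}
Dictionary.Clear();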
}
lock(temporaryDictionary)
{
foreach (KeyValuePair<string, Info> pair in temporaryDictionary)
{
Info entry = pair.Value;
StoreInDatabase(entry);
}
}
}
This greatly improved performance. Thanks flq!

With that lock you hold the dictionary for the whole duration of the DB operations, which take ages in comparison to in-memory activities.
Within the lock, you should only copy the values you want to store in the DB and clear the dictionary. Then you can release the dictionary, and some other thread can write the copied values to the DB.
Also it may make sense to use some private locking object in order to minimize the potential of deadlocks.
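A minimal sketch of that advice, under a few assumptions: the class name BufferedStore, the threshold parameter and the storeInDatabase delegate are made up for illustration, and Info is a stand-in for the asker's type. The full dictionary is swapped out while the lock is held, and the slow database writes happen after the lock has been released. It only uses features available in .NET 3.5.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the asker's Info type.
public class Info
{
    public string Name;
}

public class BufferedStore
{
    // Private lock object: no outside code can take this lock.
    private readonly object sync = new object();
    private Dictionary<string, Info> buffer = new Dictionary<string, Info>();
    private readonly int threshold;
    private readonly Action<Info> storeInDatabase;

    public BufferedStore(int threshold, Action<Info> storeInDatabase)
    {
        this.threshold = threshold;
        this.storeInDatabase = storeInDatabase;
    }

    public void Add(string key, Info value)
    {
        Dictionary<string, Info> toFlush = null;

        lock (sync)
        {
            buffer.Add(key, value);
            if (buffer.Count >= threshold)
            {
                // Swap in a fresh dictionary; the full one is now
                // visible to this thread only.
                toFlush = buffer;
                buffer = new Dictionary<string, Info>();
            }
        }

        // The slow database writes run outside the lock.
        if (toFlush != null)
        {
            foreach (KeyValuePair<string, Info> pair in toFlush)
            {
                storeInDatabase(pair.Value);
            }
        }
    }

    public int PendingCount
    {
        get { lock (sync) { return buffer.Count; } }
    }
}
```

Swapping the reference instead of copying keeps the time spent inside the lock short no matter how many entries are buffered. The thread that hits the threshold still pays for the database writes itself; handing toFlush to a worker thread would be a further step.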

I have implemented a thread-safe dictionary using Interlocked, which is the most lightweight synchronization mechanism available.
You can find the source code here. It's written for Fasterflect, a library that helps make reflection tasks easier and faster. The code uses an #if directive to conditionally enable the custom dictionary for .NET 3.5, as our benchmarks show the .NET 4.0 ConcurrentDictionary to be even faster.
As flq points out, accessing the database while holding a lock is a really bad idea. Like seriously, not something you would ever want to do. Find a better solution, such as copying data you need to store to a temporary data structure.

Related

Proper class definition and usage - thread safe - ASP.net

I wonder how to define a class properly and use it safely. I mean thread-safely, when thousands of concurrent calls are being made by website visitors.
I wrote something like the code below, but I wonder whether it is built properly.
public static class csPublicFunctions
{
private static Dictionary<string, clsUserTitles> dicAuthorities;
static csPublicFunctions()
{
dicAuthorities = new Dictionary<string, clsUserTitles>();
using (DataTable dtTemp = DbConnection.db_Select_DataTable("select * from myTable"))
{
foreach (DataRow drw in dtTemp.Rows)
{
clsUserTitles tempCLS = new clsUserTitles();
tempCLS.irAuthorityLevel = Int32.Parse(drw["Level"].ToString());
tempCLS.srTitle_tr = drw["Title_tr"].ToString();
tempCLS.srTitle_en = drw["Title_en"].ToString();
dicAuthorities.Add(drw["authorityLevel"].ToString(), tempCLS);
}
}
}
public class clsUserTitles
{
private string Title_tr;
public string srTitle_tr
{
get { return Title_tr; }
set { Title_tr = value; }
}
private string Title_en;
public string srTitle_en
{
get { return Title_en; }
set { Title_en = value; }
}
private int AuthorityLevel;
public int irAuthorityLevel
{
get { return AuthorityLevel; }
set { AuthorityLevel = value; }
}
}
public static clsUserTitles returnUserTitles(string srUserAuthority)
{
return dicAuthorities[srUserAuthority];
}
}
Dictionary will be initialized only 1 time. No add remove update later.
Dictionary supports thread safe reading. Here is the proof from MSDN:
A Dictionary can support multiple readers concurrently,
as long as the collection is not modified. Even so, enumerating
through a collection is intrinsically not a thread-safe procedure. In
the rare case where an enumeration contends with write accesses, the
collection must be locked during the entire enumeration. To allow the
collection to be accessed by multiple threads for reading and writing,
you must implement your own synchronization.
So, if you are planning to only read data from it, it should work. However, I do not believe that your dictionary is filled only once and won't be modified while your application runs. In that case, the others in this thread are correct: it is necessary to synchronize access to this dictionary, and it is best to use a ConcurrentDictionary object.
Now, I want to say a couple of words about the design itself. If you want to store data shared between users, use the ASP.NET Cache instead, which was designed for such purposes.
A quick look through your code and it seems to me that your first problem will be the publicly available dictionary dicAuthorities. Dictionaries are not thread safe. Depending on what you want to do with that Dictionary, you'll need to implement something that regulates access to it. See this related question:
Making dictionary access thread-safe?
As the others have said, Dictionary<TKey,TValue> is not inherently thread-safe. However, if your usage scenario is:
1. Fill the dictionary on startup
2. Use that dictionary as a lookup while the application is running
3. Never add or remove values after startup
then you should be fine.
However, if you use .NET 4.5, I would recommend making #3 explicit by using a ReadOnlyDictionary.
So, your implementation might look like this (changed the coding style to more C# friendly)
private static readonly ReadOnlyDictionary<string, UserTitles> authorities;
static PublicFunctions()
{
Dictionary<string, UserTitles> authoritiesFill = new Dictionary<string, UserTitles>();
using (DataTable dtTemp = DbConnection.db_Select_DataTable("select * from myTable"))
{
foreach (DataRow drw in dtTemp.Rows)
{
UserTitles userTitle = new UserTitles
{
AuthorityLevel = Int32.Parse(drw["Level"].ToString()),
TitleTurkish = drw["Title_tr"].ToString(),
TitleEnglish = drw["Title_en"].ToString()
};
authoritiesFill.Add(drw["authorityLevel"].ToString(), userTitle);
}
}
authorities = new ReadOnlyDictionary<string, UserTitles>(authoritiesFill);
}
I've also added a readonly modifier to the declaration itself, because this way you can be sure that it won't be replaced at runtime by another dictionary.
No, your code is not thread safe.
[EDIT does not apply - set/created inside static constructor] Dictionary (as pointed by System Down answer) is not thread safe while being updated. Dictionary is not read only - hence no way to guarantee that it is not modified over time.
[EDIT does not apply - set/created inside static constructor] Initialization is not protected by any locks so you end-up with multiple initializations at the same time
Your entries are mutable - so it is very hard to reason if you get consistent value of each entry
[EDIT does not apply - only modified in static constructor] Field that holds dictionary not read-only - depending on code you may end-up with inconsistent data if not caching pointer to dictionary itself.
Side note: try to follow coding guidelines for C# and call classes starting with upper case MySpecialClass and have names that reflect purpose of the class (or clearly sample names).
EDIT: most of my points do not apply as the only initialization of the dictionary is inside static constructor. Which makes initialization safe from thread-safety point of view.
Note that initialization inside static constructor will happen at non-deterministic moment "before first use". It can lead to unexpected behavior - i.e. when access to DB may use wrong "current" user account.
The answer to your question is no, it's not thread safe. Dictionary is not a thread-safe collection. If you want to use a thread-safe dictionary then use ConcurrentDictionary.
Besides that, it's difficult to say whether your csPublicFunctions is thread-safe or not because it depends on how you handle your database connections inside the call to DbConnection.db_Select_DataTable
The Dictionary is not the only thread-safety problem here.
Yes, filling the dictionary is thread-safe, but any other modification of it is not. As was written above, ConcurrentDictionary could help.
Another problem is that your class clsUserTitles is not thread-safe either.
If clsUserTitles is used only for reading, you could make each property setter of clsUserTitles private and initialize these properties from the clsUserTitles constructor.
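A rough sketch of that last suggestion, going one step further and dropping the setters entirely. The names UserTitles and AuthorityLookup are made up for illustration, and two hard-coded sample rows stand in for the database query from the question.

```csharp
using System.Collections.Generic;

// Immutable entry: every value is fixed in the constructor, so a reader
// on any thread always sees a consistent object.
public sealed class UserTitles
{
    private readonly int authorityLevel;
    private readonly string titleTurkish;
    private readonly string titleEnglish;

    public UserTitles(int authorityLevel, string titleTurkish, string titleEnglish)
    {
        this.authorityLevel = authorityLevel;
        this.titleTurkish = titleTurkish;
        this.titleEnglish = titleEnglish;
    }

    public int AuthorityLevel { get { return authorityLevel; } }
    public string TitleTurkish { get { return titleTurkish; } }
    public string TitleEnglish { get { return titleEnglish; } }
}

public static class AuthorityLookup
{
    // Filled once by the static initializer and never modified afterwards,
    // so concurrent reads are safe.
    private static readonly Dictionary<string, UserTitles> authorities = Load();

    // Hypothetical loader: sample rows stand in for the database query.
    private static Dictionary<string, UserTitles> Load()
    {
        var result = new Dictionary<string, UserTitles>();
        result.Add("1", new UserTitles(1, "Yönetici", "Administrator"));
        result.Add("2", new UserTitles(2, "Üye", "Member"));
        return result;
    }

    // TryGetValue avoids a KeyNotFoundException for unknown authority levels.
    public static bool TryGetUserTitles(string authority, out UserTitles titles)
    {
        return authorities.TryGetValue(authority, out titles);
    }
}
```

Because neither the dictionary nor its entries can change after initialization, no locking is needed on the read path.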

Parallel optimisation of string comparison

I'm trying to optimise the performance of a string comparison operation on each string key of a dictionary used as a database query cache. The current code looks like:
public void Clear(string tableName)
{
foreach (string key in cache.Keys.Where(key => key.IndexOf(tableName, StringComparison.Ordinal) >= 0).ToList())
{
cache.Remove(key);
}
}
I'm new to using C# parallel features and am wondering what the best way would be to convert this into a parallel operation so that multiple string comparisons can happen 'simultaneously'. The cache can often get quite large so maintenance on it with Clear() can get quite costly.
Make your cache object a ConcurrentDictionary and use TryRemove instead of Remove.
This will make your cache thread-safe; then you can invoke your current foreach loop like this:
Parallel.ForEach(cache.Keys, key =>
{
if(key.IndexOf(tableName, StringComparison.Ordinal) >= 0)
{
dynamic value; // just because I don't know your dictionary.
cache.TryRemove(key, out value);
}
});
Hope that gives you a starting point.
Your approach can't work well on a Dictionary<string, Whatever> because that class isn't thread-safe for multiple writers, so the simultaneous deletes could cause all sorts of problems.
You will therefore have to use a lock to synchronise the removals, which will therefore make the access of the dictionary essentially single-threaded. About the only thing that can be safely done across the threads simultaneously is the comparison in the Where.
You could use ConcurrentDictionary because its use of striped locks will reduce this impact. It still doesn't seem the best approach though.
If you build keys from strings so that you can test whether a key starts with a sub-key, and removing everything under a sub-key is a frequent need, then you could try using a Dictionary<string, Dictionary<string, Whatever>>. Adding or updating becomes a bit more expensive, but clearing becomes an O(1) removal of just the one value from the higher-level dictionary.
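The two-level dictionary idea could look roughly like this. The class name TableQueryCache and its methods are invented for illustration, and the sketch assumes every cached query belongs to exactly one table, which may not hold if a key can mention several tables (the original code matches with IndexOf). It is not thread-safe on its own.

```csharp
using System.Collections.Generic;

public class TableQueryCache<TValue>
{
    // Outer key: table name. Inner key: the query string.
    private readonly Dictionary<string, Dictionary<string, TValue>> tables =
        new Dictionary<string, Dictionary<string, TValue>>();

    public void Set(string tableName, string queryKey, TValue value)
    {
        Dictionary<string, TValue> perTable;
        if (!tables.TryGetValue(tableName, out perTable))
        {
            perTable = new Dictionary<string, TValue>();
            tables.Add(tableName, perTable);
        }
        perTable[queryKey] = value;
    }

    public bool TryGet(string tableName, string queryKey, out TValue value)
    {
        Dictionary<string, TValue> perTable;
        if (tables.TryGetValue(tableName, out perTable))
        {
            return perTable.TryGetValue(queryKey, out value);
        }
        value = default(TValue);
        return false;
    }

    // Drops every cached query for one table with a single removal,
    // with no string comparison over the remaining keys.
    public bool Clear(string tableName)
    {
        return tables.Remove(tableName);
    }
}
```

With this layout Clear no longer scans the keys at all, so there is nothing left to parallelise.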
I've used Dictionaries as caches before, and what I did was clean up the cache "on the fly": with each entry I also store its time of inclusion, and whenever an entry is requested I remove the old entries. The performance hit was minimal for me, but if needed you could implement a Queue (of Tuple<DateTime, TKey>, where TKey is the type of the keys in your dictionary) as an index to hold these timestamps, so you don't need to iterate over the entire dictionary every time. Anyway, if you're having to think about these issues, it's time to consider using a dedicated caching server. For me, Shared Cache (http://sharedcache.codeplex.com) has been good enough.
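Here is one way the timestamp queue could be sketched. ExpiringCache, maxAge and the explicit now parameter are assumptions made for the example (passing the time in keeps it easy to test), and it assumes entries are added in non-decreasing time order. It needs .NET 4 for Tuple and is not thread-safe on its own.

```csharp
using System;
using System.Collections.Generic;

public class ExpiringCache<TKey, TValue>
{
    private readonly Dictionary<TKey, TValue> values = new Dictionary<TKey, TValue>();

    // Insertion times in arrival order, so the oldest entry is always at the head.
    private readonly Queue<Tuple<DateTime, TKey>> insertions =
        new Queue<Tuple<DateTime, TKey>>();

    private readonly TimeSpan maxAge;

    public ExpiringCache(TimeSpan maxAge)
    {
        this.maxAge = maxAge;
    }

    // Throws if the key is already cached, which keeps exactly one
    // queue entry per live key.
    public void Add(TKey key, TValue value, DateTime now)
    {
        values.Add(key, value);
        insertions.Enqueue(Tuple.Create(now, key));
    }

    public bool TryGet(TKey key, DateTime now, out TValue value)
    {
        // Evict from the head until the oldest remaining entry is young enough.
        while (insertions.Count > 0 && now - insertions.Peek().Item1 > maxAge)
        {
            values.Remove(insertions.Dequeue().Item2);
        }
        return values.TryGetValue(key, out value);
    }
}
```

Each lookup only touches the expired entries at the head of the queue instead of walking the whole dictionary.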

What is the best way to implement a property=value collection

I've written a wrapper class around a 3rd party library that requires properties to be set by calling a Config method and passing a string formatted as "Property=Value"
I'd like to pass all the properties in a single call and process them iteratively.
I've considered the following:
1. Creating a property/value class and then creating a List of these objects
2. Building a string of multiple "Property=Value" pairs, separating them with a token (maybe "|")
3. Using a hash table
All of these would work (and I'm thinking of using option 1) but is there a better way?
A bit more detail about my query:
The finished class will be included in a library for re-use in other applications. Whilst I don't currently see threading as a problem at the moment (our apps tend to just have a UI thread and a worker thread) it could become an issue in the future.
Garbage collection will not be an issue.
Access to arbitrary indices of the data source is not currently an issue.
Optimization is not currently an issue, but clearly defining the key/value pairs is important.
As you've already pointed out, any of the proposed solutions will accomplish the task as you've described it. What this means is that the only rational way to choose a particular method is to define your requirements:
Does your code need to support multiple threads accessing the data source simultaneously? If so, using a ConcurrentDictionary, as Yahia suggested, makes sense. Otherwise, there's no reason to incur the additional overhead and complexity of using a concurrent data structure.
Are you working in an environment where garbage collection is a problem (for example, an XNA game)? If so, any suggestion involving the concatenation of strings is going to be problematic.
Do you need O(1) access to arbitrary indices of the data source? If so, your third approach makes sense. On the other hand, if all you're doing is iterating over the collection, there's no reason to incur the additional overhead of inserting into a hashtable; use a List<KeyValuePair<String, String>> instead.
On the other hand, you may not be working in an environment where the optimization described above is necessary; the ability to clearly define the key/value pairs programmatically may be more important to you. In that case, using a Dictionary is a better choice.
You can't make an informed decision as to how to implement a feature without completely defining what the feature needs to do, and since you haven't done that, any answer given here will necessarily be incomplete.
Given your clarifications, I would personally suggest the following:
Avoid making your Config() method thread-safe by default, as per the MSDN guidelines:
By default, class libraries should not be thread safe. Adding locks to create thread-safe code decreases performance, increases lock contention, and creates the possibility for deadlock bugs to occur.
If thread safety becomes important later, make it the caller's responsibility.
Given that you don't have special performance requirements, stick with a dictionary to allow key/value pairs to be easily defined and read.
For simplicity's sake, and to avoid generating lots of unnecessary strings doing concatenations, just pass the dictionary in directly and iterate over it.
Consider the following example:
var configData = new Dictionary<String, String>();
configData["key1"] = "value1";
configData["key2"] = "value2";
myLibraryObject.Config(configData);
And the implementation of Config:
public void Config(Dictionary<String, String> values)
{
foreach(var kvp in values)
{
var configString = String.Format("{0}={1}", kvp.Key, kvp.Value);
// do whatever
}
}
You could use Dictionary<string,string>; the items are then of type KeyValuePair<string,string> (this corresponds to your first idea).
You can then use myDict.Select(kvp=>string.Format("{0}={1}",kvp.Key,kvp.Value)) to get a list of strings with the needed formatting
Use for example a ConcurrentDictionary<string,string> - it is thread-safe and really fast since most operations are implemented lock-free...
You could make a helper class that uses reflection to turn any class into a Property=Value collection
public static class PropertyValueHelper
{
public static IEnumerable<string> GetPropertyValues(object source)
{
Type t = source.GetType();
foreach (var property in t.GetProperties())
{
object value = property.GetValue(source, null);
if (value != null)
{
yield return property.Name + "=" + value.ToString();
}
else
{
yield return property.Name + "=";
}
}
}
}
You would need to add extra logic to handle enumerations, indexed properties, etc.
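As a partial illustration of that extra logic, the variant below skips indexed properties, which would otherwise make GetValue throw because they need arguments. The class name PropertyValueFormatter is made up; treat it as a sketch rather than a complete solution.

```csharp
using System.Collections.Generic;
using System.Reflection;

public static class PropertyValueFormatter
{
    public static IEnumerable<string> GetPropertyValues(object source)
    {
        foreach (PropertyInfo property in source.GetType().GetProperties())
        {
            // Indexed properties (such as this[int]) need arguments, and
            // write-only properties have no getter; skip both.
            if (!property.CanRead || property.GetIndexParameters().Length > 0)
            {
                continue;
            }

            object value = property.GetValue(source, null);
            yield return property.Name + "=" + (value == null ? "" : value.ToString());
        }
    }
}
```

Enum values need no special case if the member name is what the third-party library expects, since ToString() on an enum returns the name; a numeric value would need an explicit conversion.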

C# Dictionary with ReaderWriterLockSlim

I'm very new to multi-threading and for some reason this class is giving me more trouble than it should.
I am setting up a dictionary in the ASP.net cache - It will be frequently queried for individual objects, enumerated occasionally, and written extremely infrequently. I'll note that the dictionary data is almost never changed, I'm planning on letting it expire daily with a callback to rebuild from the database when it leaves the cache.
I believe that the enumeration and access by keys are safe so long as the dictionary isn't being written. I'm thinking a ReaderWriterLockSlim based wrapper class is the way to go but I'm fuzzy on a few points.
If I use Lock I believe that I can either lock on a token or the actual object I'm protecting. I don't see how to do something similar using the ReaderWriter Lock. Am I correct in thinking that multiple instances of my wrapper will not lock properly as the ReaderWriterLocks are out of each other's scope?
What is the best practice for writing a wrapper like this? Building it as a static almost seems redundant, as the primary object is being maintained by the cache. Singletons seem to be frowned upon, and I'm concerned about the above-mentioned scoping issues for individual instances.
I've seen a few implementations of similar wrappers around but I haven't been able to answer these questions. I just want to make sure that I have a firm grasp on what I'm doing rather than cutting & pasting my way through. Thank you very much for your help!
**Edit: Hopefully this is a clearer summary of what I'm trying to find out- **
1. Am I correct in thinking that the lock does not affect the underlying data and is scoped like any other variable?
As an example lets say i have the following -
MyWrapperClass
{
ReaderWriterLockSlim lck = new ReaderWriterLockSlim();
Do stuff with this lock on the underlying cached dictionary object...
}
MyWrapperClass wrapA = new MyWrapperClass();
MyWrapperClass wrapB = new MyWrapperClass();
Am I right in thinking that the wrapA lock and wrapB lock won't interact, And that if wrapA & wrapB both attempt operations it will be unsafe?
2. If this is the case what is the best practice way to "share" the lock data?
This is an Asp.net app - there will be multiple pages that need to access the data which is why i'm doing this in the first place. What is the best practice for ensuring that the various wrappers are using the same lock? Should my wrapper be a static or singleton that all threads are using, if not what is the more elegant alternative?
You have multiple dictionary objects in the Cache, and you want each one locked independently. The "best" way is to just use a simple class that does it for you.
public class ReadWriteDictionary<K,V>
{
private readonly Dictionary<K,V> dict = new Dictionary<K,V>();
private readonly ReaderWriterLockSlim rwLock = new ReaderWriterLockSlim();
public V Get(K key)
{
return ReadLock(() => dict[key]);
}
public void Set(K key, V value)
{
WriteLock(() => dict[key] = value);
}
public IEnumerable<KeyValuePair<K, V>> GetPairs()
{
return ReadLock(() => dict.ToList());
}
private V2 ReadLock<V2>(Func<V2> func)
{
rwLock.EnterReadLock();
try
{
return func();
}
finally
{
rwLock.ExitReadLock();
}
}
private void WriteLock(Action action)
{
rwLock.EnterWriteLock();
try
{
action();
}
finally
{
rwLock.ExitWriteLock();
}
}
}
Cache["somekey"] = new ReadWriteDictionary<string,int>();
There is also a more complete example on the help page of ReaderWriterLockSlim on MSDN. It wouldn't be hard to make it generic.
edit To answer your new questions -
1.) You are correct wrapA and wrapB will not interact. They both have their own instance of ReaderWriterLockSlim.
2.) If you need a shared lock amongst all your wrapper classes, then it must be static.
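To illustrate point 2, a wrapper with a static lock might look like the sketch below. SharedLockWrapper is a hypothetical name, and the dictionary is passed in by the caller (for example after fetching it from the cache). Every instance uses the same ReaderWriterLockSlim, so wrapA and wrapB now exclude each other.

```csharp
using System.Collections.Generic;
using System.Threading;

public class SharedLockWrapper
{
    // static: a single lock shared by every wrapper instance.
    private static readonly ReaderWriterLockSlim SharedLock = new ReaderWriterLockSlim();

    private readonly Dictionary<string, int> dict;

    public SharedLockWrapper(Dictionary<string, int> sharedDictionary)
    {
        dict = sharedDictionary;
    }

    public bool TryGet(string key, out int value)
    {
        SharedLock.EnterReadLock();
        try
        {
            return dict.TryGetValue(key, out value);
        }
        finally
        {
            SharedLock.ExitReadLock();
        }
    }

    public void Set(string key, int value)
    {
        SharedLock.EnterWriteLock();
        try
        {
            dict[key] = value;
        }
        finally
        {
            SharedLock.ExitWriteLock();
        }
    }
}
```

The trade-off is that wrappers around unrelated dictionaries also contend for the one lock, which is why keeping the lock inside the cached object itself, as in ReadWriteDictionary above, is usually the simpler design.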
ConcurrentDictionary does everything you want and then some. It is part of the System.Collections.Concurrent namespace.
The standard way to lock is: object lck = new object(); ... lock(lck) { ... } in this instance the object lck represents the lock.
ReaderWriterLockSlim isn't much different; it's just that in this case the ReaderWriterLockSlim instance itself represents the lock, so everywhere you would have used lck you now use your ReaderWriterLockSlim.
ReaderWriterLockSlim lck = new ReaderWriterLockSlim();
...
lck.EnterReadLock();
try
{
...
}
finally
{
lck.ExitReadLock();
}

How to iterate through Dictionary without using foreach

I am not sure if the title formulates it well so sorry.
I basically have a bunch of elements listing targets for a communication. I placed them in a dictionary though i am open to moving them to a different data structure. My problem is that i have a tree-like structure where a key is a branch and each branch has many leaves. Both the branch and the leaves have names stored in strings (cannot be numeral).
private Dictionary < string, string[]> targets;
For each element in the dictionary i must send a communication, and when the target answers i go to the next target and start over. So after searching i am faced with these dilemmas:
I cannot use the usual foreach because i need to keep the pointer in memory to pass it in between threads.
Since dictionaries are random access it is difficult to keep a pointer
When i receive a communication i must verify if the origins are from a target, so i like the dictionary.contains method for that.
I am fairly new at C#, so the answer is probably obvious but i am finding a hard time finding a data structure that fits my needs. What would be the simplest solution? Can somebody suggest anything?
Thank you.
EDIT
I think my post has confused many, and they are sort of stuck on the terms pointers and threads. By threads i don`t mean that they are parallel, simply that i cannot use a foreach or a loop as the next thread that does the next iteration is triggered by incoming communication. This mechanism cannot be changed at the moment, just the iteration must be. By pointer i wasn't referring to the memory pointers often used in C, i just meant something that points to where you are in a list. Sorry i am a Java programmer so i might be using confusing terms.
I noticed the Enumerator is often inherited and that it can be used with structures such as Dictionary and Linked List. Examples i find talk about this sub structure being encapsulated, and shows foreach loops as examples.
Would it be possible to use GetEnumerator() in some way that the enumerator would remember the current position even when accessed through a different thread?
I am off to test these on my own, but any input from more experienced people is always appreciated!
I think you need to re-work your architecture a bit; the Dictionary itself is probably not the data structure you need for an ordered iteration.
I would consider moving your tree into a linked list instead.
When you kick off your communications, I would suggest having your threads call back a delegate to update your list data, or another shared data structure that keeps track of where you are in the communication process.
static LinkedList<LeafItem> TreeList = new LinkedList<LeafItem>( );
foreach (LeafItem li in TreeList) {
Thread newThread = new Thread(
new ParameterizedThreadStart(Work.DoWork));
newThread.Start(li);
}
You can enumerate over this in parallel using Parallel.ForEach method (from .NET 4). It has been backported as part of the Rx Framework for use in .NET 3.5sp1.
Note - this doesn't actually use one thread per item, but rather partitions the work using the thread pool, based on the hardware thread count of the system on which you're executing (which is usually better...). In .NET 4, it takes advantage of the ThreadPool's new hill climbing and work stealing algorithms, so is very efficient.
this one is a slight long shot, and I suspect I've messed it up somewhere here :/
basically the idea is to create a custom IEnumerator for your dictionary. The idea being that it contains a static variable that keeps the "location" of the enumeration, for continuing.
the following is some skeleton code for something that does work for pausing and restarting.
public class MyDictEnumerator<T> : IEnumerator<T>
{
private List<T> Dict;
private static int curLocation = -1;
public MyDictEnumerator(List<T> dictionary)
{
Dict = dictionary;
}
public T Current
{
get { return Dict[curLocation]; }
}
public void Dispose()
{ }
object System.Collections.IEnumerator.Current
{
get { return Dict[curLocation]; }
}
public bool MoveNext()
{
curLocation++;
if (curLocation >= Dict.Count)
return false;
return true;
}
public void Reset()
{
curLocation = -1;
}
}
Then to use:
MyDictEnumerator<KeyValuePair<string, int>> enumer = new MyDictEnumerator<KeyValuePair<string, int>>(test.ToList());
while (enumer.MoveNext())
{
Console.WriteLine(enumer.Current.Value);
}
I'll admit that this isn't the cleanest way of doing it. But if you break out of the enumerator, and create a new one on another thread, then it will continue at the same point (i think :/)
I hope this helps.
Edit: from your comments:
My algorithm is more like: Get the
first target Send the message to the
first target Thread DIES - Catch a
port reception event check if its the
right target do some actions - go to
the next target start the loop over.
If you want to process the items asynchronously but not in parallel, you should be able to achieve this by copying the dictionary's keys to a Queue<string> and passing both to the callback that handles your asynchronous responses.
Your completion handler pseudo-code might look like this:
// first extract your dictionary, key, and queue from whatever state
// object you're using to pass data back to the completion event
if (dictionary.ContainsKey(key)) {
// process the response
}
if (queue.Count > 0) {
string nextKey = queue.Dequeue();
string[] messages = dictionary[nextKey];
// send the messages, along with your state data and this callback
}
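The same idea packaged as a small class, as a sketch only: TargetSequencer and its method names are invented here, and it assumes that just one completion handler runs at a time, as described in the question. The queue remembers the position between callbacks, while the dictionary still gives a quick membership check for incoming messages.

```csharp
using System.Collections.Generic;

public class TargetSequencer
{
    private readonly Dictionary<string, string[]> targets;
    private readonly Queue<string> pending;

    public TargetSequencer(Dictionary<string, string[]> targets)
    {
        this.targets = targets;
        // Snapshot the keys once; the queue is the "pointer" into the list.
        pending = new Queue<string>(targets.Keys);
    }

    // Used to verify that an incoming message comes from a known target.
    public bool IsKnownTarget(string name)
    {
        return targets.ContainsKey(name);
    }

    // Hands out the next branch and its leaves; returns false once
    // every target has been handled.
    public bool TryGetNext(out string target, out string[] leaves)
    {
        if (pending.Count == 0)
        {
            target = null;
            leaves = null;
            return false;
        }

        target = pending.Dequeue();
        leaves = targets[target];
        return true;
    }
}
```

The handler would call IsKnownTarget on the sender, process the response, and then call TryGetNext to decide whether there is another target to contact. If handlers could ever overlap, the two methods would need a lock around them.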
