By default non-static methods have their own instance of variables for each thread when accessed via multiple threads, thus rendering them thread safe if they do not include a public variable etc.
On the other hand, variables in static methods are shared amongst threads rendering them non-thread safe by default.
Say, I have a class, having no static variables or methods whatsoever.
public class Profile {
private ConcurrentDictionary<int, int> cache =
new ConcurrentDictionary<int, int>();
public void AddToCache() {
}
public void RemoveFromCache() {
}
public void DoSomethingThatShouldBeThreadSafe() {
}
}
But then I create a static object from this class.
public static Profile objProfile = new Profile();
And then, objProfile is accessed with multiple threads.
The question is: are the methods of the Profile class (AddToCache, RemoveFromCache and DoSomethingThatShouldBeThreadSafe) going to be thread safe when used through objProfile? Will their variables be shared amongst threads, even though they are not static, because the whole instance of the class is static?
As long as you only access the ConcurrentDictionary<> instance cache, and don't overwrite cache with a new instance in one of Profile's methods, it is thread-safe.
Because of the second point, it's better to mark it readonly,
private readonly ConcurrentDictionary<int, int> cache =
new ConcurrentDictionary<int, int>();
because this says that you can write this member only during instantiation of Profile.
EDIT:
Although the ConcurrentDictionary<> itself is thread-safe, you still have the problem of non-atomicity of compound operations. Let's take a look at two possible GetFromCache() methods.
int? GetFromCacheNonAtomic(int key)
{
if (cache.ContainsKey(key)) // first access to cache
return cache[key]; // second access to cache
return null;
}
int? GetFromCacheAtomic(int key)
{
int value;
if (cache.TryGetValue(key, out value)) // single access to cache
return value;
return null;
}
Only the second one is atomic, because it uses the ConcurrentDictionary<>.TryGetValue() method and so touches the cache just once.
EDIT 2 (answer to 2nd comment of Chiao):
ConcurrentDictionary<> has the GetOrAdd() method, which takes a Func<TKey, TValue> delegate for non-existing values.
void AddToCacheIfItDoesntExist(int key)
{
cache.GetOrAdd(key, SlowMethod);
}
int SlowMethod(int key)
{
Thread.Sleep(1000);
return key * 10;
}
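One caveat with GetOrAdd(): the delegate runs outside the dictionary's internal locks, so when several threads race on the same missing key it can be called more than once, even though only one result is stored. If SlowMethod is expensive or has side effects, a common workaround is to store Lazy<T> values instead. The following is only a sketch under that assumption; LazyCache, GetOrCompute and CallCount are made-up names, not part of the original answer.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class LazyCache
{
    // Racing threads may each create a cheap Lazy wrapper, but GetOrAdd hands
    // every caller the one wrapper that was stored, so the factory runs once.
    private static readonly ConcurrentDictionary<int, Lazy<int>> cache =
        new ConcurrentDictionary<int, Lazy<int>>();

    private static int callCount; // how often SlowMethod actually ran

    public static int CallCount
    {
        get { return Volatile.Read(ref callCount); }
    }

    public static int GetOrCompute(int key)
    {
        Lazy<int> lazy = cache.GetOrAdd(
            key,
            k => new Lazy<int>(() => SlowMethod(k), LazyThreadSafetyMode.ExecutionAndPublication));
        return lazy.Value; // blocks until the single computation has finished
    }

    private static int SlowMethod(int key)
    {
        Interlocked.Increment(ref callCount);
        Thread.Sleep(100);
        return key * 10;
    }
}
```

Calling LazyCache.GetOrCompute(7) from several threads at once should give every caller 70 while SlowMethod runs a single time.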
You seem to me to be asserting that local variables of a static method are themselves static. This is not true.
Local variables are always local for both instance and static methods and so, excluding special cases like variable capture, live on the stack. Thus they are private to each separate invocation of the method.
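To illustrate the point about locals, here is a small sketch (LocalsDemo, SumTo and RunInParallel are hypothetical names). The static method keeps its working state in local variables only, so concurrent calls cannot interfere with each other.

```csharp
using System.Linq;

public static class LocalsDemo
{
    // 'sum' and 'i' are locals: every invocation gets its own copies,
    // even though the method itself is static.
    public static int SumTo(int n)
    {
        int sum = 0;
        for (int i = 1; i <= n; i++)
        {
            sum += i;
        }
        return sum;
    }

    // Calls SumTo from several threads at once (via PLINQ) and returns
    // the results in the same order as the inputs.
    public static int[] RunInParallel(int[] inputs)
    {
        return inputs.AsParallel().AsOrdered().Select(n => SumTo(n)).ToArray();
    }
}
```

The results stay correct no matter how the calls interleave, because nothing is shared between them. That changes as soon as the method reads or writes a static or instance field.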
Yes this should be a thread safe setup. All the functions will create their own 'copy' of the function local variables. Only when you explicitly 'touch' shared properties you'll get into problems.
However there will be only ONE cache, making the containing class static will make touching the cache NOT thread safe.
Does replacing a value associated with a ConcurrentDictionary key lock any dictionary operations beyond that key?
EDIT: For example, I'd like to know if either thread will ever block the other, besides when the keys are first added, in the following:
public static class Test {
private static ConcurrentDictionary<int, int> cd = new ConcurrentDictionary<int, int>();
static Test() { // static constructors cannot have an access modifier
new Thread(UpdateItem1).Start();
new Thread(UpdateItem2).Start();
}
private static void UpdateItem1() {
while (true) cd[1] = 0;
}
private static void UpdateItem2() {
while (true) cd[2] = 0;
}
}
Initially I assumed it does, because for example dictionary[key] = value; could refer to a key that is not present yet. However, while working I realized that if an add is necessary it could occur after a separate lock escalation.
I was drafting the following class, but the indirection provided by the AccountCacheLock class is unnecessary if the answer to this question (above) is "no". In fact, all of my own lock management is pretty much unneeded.
// A flattened subset of repository user values that are referenced for every member page access
public class AccountCache {
// The AccountCacheLock wrapper allows the AccountCache item to be updated in a locally-confined account-specific lock.
// Otherwise, one of the following would be necessary:
// Replace a ConcurrentDictionary item, requiring a lock on the ConcurrentDictionary object (unless the ConcurrentDictionary internally implements similar indirection)
// Update the contents of the AccountCache item, requiring either a copy to be returned or the lock to wrap the caller's use of it.
private static readonly ConcurrentDictionary<int, AccountCacheLock> dictionary = new ConcurrentDictionary<int, AccountCacheLock>();
public static AccountCache Get(int accountId, SiteEntities refreshSource) {
AccountCacheLock accountCacheLock = dictionary.GetOrAdd(accountId, k => new AccountCacheLock());
AccountCache accountCache;
lock (accountCacheLock) {
accountCache = accountCacheLock.AccountCache;
}
if (accountCache == null || accountCache.ExpiresOn < DateTime.UtcNow) {
accountCache = new AccountCache(refreshSource.Accounts.Single(a => a.Id == accountId));
lock (accountCacheLock) {
accountCacheLock.AccountCache = accountCache;
}
}
return accountCache;
}
public static void Invalidate(int accountId) {
// TODO
}
private AccountCache(Account account) {
ExpiresOn = DateTime.UtcNow.AddHours(1);
Status = account.Status;
CommunityRole = account.CommunityRole;
Email = account.Email;
}
public readonly DateTime ExpiresOn;
public readonly AccountStates Status;
public readonly CommunityRoles CommunityRole;
public readonly string Email;
private class AccountCacheLock {
public AccountCache AccountCache;
}
}
Side question: is there something in the ASP.NET framework that already does this?
You don't need to be doing any locks. The ConcurrentDictionary should handle that pretty well.
Side question: is there something in the ASP.NET framework that already does this?
Of course. It's not specifically related to ASP.NET but you may take a look at the System.Runtime.Caching namespace and more specifically the MemoryCache class. It adds things like expiration and callbacks on the top of a thread safe hashtable.
I don't quite understand the purpose of the AccountCache you have shown in your updated answer. It's exactly what a simple caching layer gives you for free.
Obviously if you intend to be running your ASP.NET application in a web farm you should consider some distributed caching such as memcached for example. There are .NET implementations of the ObjectCache class on top of the memcached protocol.
I also wanted to note that I took a cursory peek inside ConcurrentDictionary, and it looks like item replacements are locked on neither the individual item nor the entire dictionary, but rather the hash of the item (i.e. a lock object associated with a dictionary "bucket"). It seems to be designed so that an initial introduction of a key also does not lock the entire dictionary, provided the dictionary need not be resized. I believe this also means that two updates can occur simultaneously provided they don't produce matching hashes.
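For what it's worth, here is a minimal sketch of the two-writers scenario from the question, without any locking of my own (PerKeyUpdates and Run are made-up names). Whether keys 1 and 2 really end up behind different internal locks is an implementation detail I'm assuming, not something the API promises; what the API does promise is that each AddOrUpdate is applied atomically per key.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class PerKeyUpdates
{
    // Two tasks each increment their own key; no external lock is taken.
    public static ConcurrentDictionary<int, int> Run(int iterations)
    {
        var cd = new ConcurrentDictionary<int, int>();

        Task t1 = Task.Run(() =>
        {
            for (int i = 0; i < iterations; i++) cd.AddOrUpdate(1, 1, (k, v) => v + 1);
        });
        Task t2 = Task.Run(() =>
        {
            for (int i = 0; i < iterations; i++) cd.AddOrUpdate(2, 1, (k, v) => v + 1);
        });

        Task.WaitAll(t1, t2);
        return cd;
    }
}
```

After Run(n) both keys hold n, so no update is lost even though the two writers never coordinate with each other.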
I need to make a critical section in an area on the basis of a finite set of strings. I want the lock to be shared for the same string instance, (somewhat similar to String.Intern approach).
I am considering the following implementation:
public class Foo
{
private readonly string _s;
private static readonly HashSet<string> _locks = new HashSet<string>();
public Foo(string s)
{
_s = s;
_locks.Add(s);
}
public void LockMethod()
{
lock(_locks.Single(l => l == _s))
{
...
}
}
}
Are there any problems with this approach? Is it OK to lock on a string object in this way, and are there any thread safety issues in using the HashSet<string>?
Is it better to, for example, create a Dictionary<string, object> that creates a new lock object for each string instance?
Final Implementation
Based on the suggestions I went with the following implementation:
public class Foo
{
private readonly string _s;
private static readonly ConcurrentDictionary<string, object> _locks = new ConcurrentDictionary<string, object>();
public Foo(string s)
{
_s = s;
}
public void LockMethod()
{
lock(_locks.GetOrAdd(_s, _ => new object()))
{
...
}
}
}
Locking on strings is discouraged. The main reason is that, because of string interning, some other code could lock on the same string instance without you knowing it, which creates the potential for deadlocks.
Now this is probably a far fetched scenario in most concrete situations. It's more a general rule for libraries.
But on the other hand, what is the perceived benefit of strings?
So, point for point:
Are there any problems with this approach?
Yes, but mostly theoretical.
Is it OK to lock on a string object in this way, and are there any thread safety issues in using the HashSet?
The HashSet<> is not involved in the thread-safety as long as the threads only read concurrently.
Is it better to, for example, create a Dictionary that creates a new lock object for each string instance?
Yes. Just to be on the safe side. In a large system the main aim for avoiding deadlock is to keep the lock-objects as local and private as possible. Only a limited amount of code should be able to access them.
I'd say it's a really bad idea, personally. That isn't what strings are for.
(Personally I dislike the fact that every object has a monitor in the first place, but that's a slightly different concern.)
If you want an object which represents a lock which can be shared between different instances, why not create a specific type for that? You can give the lock a name easily enough for diagnostic purposes, but locking is really not the purpose of a string. Something like this:
public sealed class Lock
{
private readonly string name;
public string Name { get { return name; } }
public Lock(string name)
{
if (name == null)
{
throw new ArgumentNullException("name");
}
this.name = name;
}
}
Given the way that strings are sometimes interned and sometimes not (in a way which can occasionally be difficult to discern by simple inspection), you could easily end up with accidentally shared locks where you didn't intend them.
Locking on strings can be problematic, because interned strings are essentially global.
Interned strings are per process, so they are even shared among different AppDomains. The same goes for type objects, so don't lock on typeof(x) either.
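A quick sketch of why this is hard to see by inspection (InternDemo and the "account-42" value are just illustrative). Two equal literals share one instance, a string built at run time does not, and string.Intern maps it back to the shared instance.

```csharp
using System;

public static class InternDemo
{
    // Two identical literals refer to the same interned instance.
    public static bool LiteralsShareInstance()
    {
        string a = "account-42";
        string b = "account-42";
        return object.ReferenceEquals(a, b);
    }

    // A string built at run time is a distinct object, even with equal content.
    public static bool RuntimeStringSharesInstance()
    {
        string literal = "account-42";
        string built = new string(literal.ToCharArray());
        return object.ReferenceEquals(literal, built);
    }

    // string.Intern returns the shared instance for that content.
    public static bool InternedStringSharesInstance()
    {
        string literal = "account-42";
        string built = string.Intern(new string(literal.ToCharArray()));
        return object.ReferenceEquals(literal, built);
    }
}
```

So lock("account-42") in two unrelated pieces of code would contend on the same monitor, while a lock on a freshly built string with the same content would not protect anything.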
I had a similar issue not long ago where I was looking for a good way to lock a section of code based on a string value. Here's what we have in place at the moment, that solves the problem of interned strings and has the granularity we want.
The main idea is to maintain a static ConcurrentDictionary of sync objects with a string key. When a thread enters the method, it immediately establishes a lock and attempts to add the sync object to the concurrent dictionary. If we can add to the concurrent dictionary, it means that no other threads have a lock based on our string key and we can continue our work. Otherwise, we'll use the sync object from the concurrent dictionary to establish a second lock, which will wait for the running thread to finish processing. When the second lock is released, we can attempt to add the current thread's sync object to the dictionary again.
One word of caution: the threads aren't queued, so if multiple threads with the same string key are competing simultaneously for a lock, there are no guarantees about the order in which they will be processed.
Feel free to critique if you think I've overlooked something.
public class Foo
{
private static ConcurrentDictionary<string, object> _lockDictionary = new ConcurrentDictionary<string, object>();
public void DoSomethingThreadCriticalByString(string lockString)
{
object thisThreadSyncObject = new object();
lock (thisThreadSyncObject)
{
try
{
for (; ; )
{
object runningThreadSyncObject = _lockDictionary.GetOrAdd(lockString, thisThreadSyncObject);
if (runningThreadSyncObject == thisThreadSyncObject)
break;
lock (runningThreadSyncObject)
{
// Wait for the currently processing thread to finish and try inserting into the dictionary again.
}
}
// Do your work here.
}
finally
{
// Remove the key from the lock dictionary
object dummy;
_lockDictionary.TryRemove(lockString, out dummy);
}
}
}
}
Okay, newbie multi-threading question:
I have a Singleton class. The class has a Static List and essentially works like this:
class MyClass {
private static MyClass _instance;
private static List<string> _list;
private static bool IsRecording;
public static void StartRecording() {
_list = new List<string>();
IsRecording = true;
}
public static IEnumerable<string> StopRecording() {
IsRecording = false;
return new List<string>(_list).AsReadOnly();
}
public MyClass GetInstance(){
}
public void DoSomething(){
if(IsRecording) _list.Add("Something");
}
}
Basically a user can call StartRecording() to initialize a List and then all calls to an instance-method may add stuff to the list. However, multiple threads may hold an instance to MyClass, so multiple threads may add entries to the list.
However, both list creation and reading are single operations, so the usual Reader-Writer Problem in multi-threading situations does not apply. The only problem I could see is the insertion order being weird, but that is not a problem.
Can I leave the code as-is, or do I need to take any precautions for multi-threading? I should add that in the real application this is not a List of strings but a List of Custom Objects (so the code is _list.Add(new Object(somedata))), but these objects only hold data, no code besides a call to DateTime.Now.
Edit: Clarifications following some answers: DoSomething cannot be static (the class here is abbreviated, there is a lot of stuff going on that is using instance-variables, but these created by the constructor and then only read).
Is it good enough to do
lock(_list){
_list.Add(something);
}
and
lock(_list){
return new List<string>(_list).AsReadOnly();
}
or do I need some deeper magic?
You certainly must lock the _list. And since StartRecording() replaces _list with a new instance each time, you cannot lock on _list itself; use a separate object instead, something like:
private static object _listLock = new object();
As an aside, to follow a few best practices:
DoSomething(), as shown, can be static and so it should be.
For library classes the recommended pattern is to make static members thread-safe; that would apply to StartRecording(), StopRecording() and DoSomething().
I would also make StopRecording() set _list = null and check it for null in DoSomething().
And before you ask, all this takes so little time that there really are no performance reasons not to do it.
You need to lock the list if multiple threads are adding to it.
A few observations...
Maybe there's a reason not to, but I would suggest making the class static and hence all of its members static. There's no real reason, at least from what you've shown, to require clients of MyClass to call the GetInstance() method just so they can call an instance method, DoSomething() in this case.
I don't see what prevents someone from calling the StartRecording() method multiple times. You might consider putting a check in there so that if it is already recording you don't create a new list, pulling the rug out from everyone's feet.
Finally, when you lock the list, don't do it like this:
static object _sync = new object();
lock(_sync){
_list.Add(new object(somedata));
}
Minimize the amount of time spent inside the lock by moving the new object creation outside of the lock.
static object _sync = new object();
object data = new object(somedata);
lock(_sync){
_list.Add(data);
}
EDIT
You said that DoSomething() cannot be static, but I bet it can. You can still use an object of MyClass inside DoSomething() for any instance-related stuff you have to do. But from a programming usability perspective, don't require the users of MyClass to call GetInstance() first. Consider this:
class MyClass {
private static MyClass _instance;
private static List<string> _list;
private static bool IsRecording;
public static void StartRecording()
{
_list = new List<string>();
IsRecording = true;
}
public static IEnumerable<string> StopRecording()
{
IsRecording = false;
return new List<string>(_list).AsReadOnly();
}
private static MyClass GetInstance() // make this private, not public
{ return _instance; }
public static void DoSomething()
{
// use inst internally to the function to get access to instance variables
MyClass inst = GetInstance();
}
}
Doing this, the users of MyClass can go from
MyClass.GetInstance().DoSomething();
to
MyClass.DoSomething();
.NET collections are not fully thread-safe. From MSDN: "Multiple readers can read the collection with confidence; however, any modification to the collection produces undefined results for all threads that access the collection, including the reader threads." You can follow the suggestions on that MSDN page to make your accesses thread-safe.
One problem that you would probably run into with your current code is if StopRecording is called while some other thread is inside DoSomething. Since creating a new list from an existing one requires enumerating over it, you are likely to run into the old "Collection was modified; enumeration operation may not execute" problem.
The bottom line: practice safe threading!
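For reference, the exception mentioned above is easy to reproduce on a single thread. This is only a sketch (EnumerationDemo is a made-up name). With real threads the failure depends on timing and may show up differently, for example as a corrupted copy rather than an exception.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerationDemo
{
    // Returns true if List<T> detected a modification during enumeration.
    public static bool ModifyWhileEnumerating()
    {
        var list = new List<string> { "a", "b", "c" };
        try
        {
            foreach (string item in list)
            {
                list.Add(item + "!"); // changes the list while it is being enumerated
            }
        }
        catch (InvalidOperationException)
        {
            // "Collection was modified; enumeration operation may not execute."
            return true;
        }
        return false;
    }
}
```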
It's possible, albeit tricky, to write a linked list that allows simultaneous insertions from multiple threads without a lock, but this isn't it. It's just not safe to call _list.Add in parallel and hope for the best. Depending how it's written, you could lose one or both values, or corrupt the entire structure. Just lock it.
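"Just lock it" could look roughly like the following sketch. Recorder, Record and Snapshot are hypothetical names standing in for the DoSomething/StopRecording pair from the question. The important part is that every access to the list goes through the same static lock object.

```csharp
using System.Collections.Generic;

public static class Recorder
{
    // One lock object shared by all threads; it is never replaced.
    private static readonly object _sync = new object();
    private static readonly List<string> _list = new List<string>();

    public static void Record(string entry)
    {
        lock (_sync)
        {
            _list.Add(entry);
        }
    }

    // Copies the list under the same lock, so no Add can run during the copy.
    public static IReadOnlyList<string> Snapshot()
    {
        lock (_sync)
        {
            return new List<string>(_list).AsReadOnly();
        }
    }
}
```

With this in place, 10,000 parallel Record calls leave exactly 10,000 entries in the list. Without the lock the count can come up short or the list can be corrupted.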
Okay, I just can't get my head around multi-threading scenarios properly. Sorry for asking a similar question again, I'm just seeing many different "facts" around the internet.
public static class MyClass {
private static List<string> _myList = new List<string>();
private static bool _record;
public static void StartRecording()
{
_myList.Clear();
_record = true;
}
public static IEnumerable<string> StopRecording()
{
_record = false;
// Return a Read-Only copy of the list data
var result = new List<string>(_myList).AsReadOnly();
_myList.Clear();
return result;
}
public static void DoSomething()
{
if(_record) _myList.Add("Test");
// More, but unrelated actions
}
}
The idea is that if Recording is activated, calls to DoSomething() get recorded in an internal List, and returned when StopRecording() is called.
My specification is this:
StartRecording is not considered Thread-Safe. The user should call this while no other Thread is calling DoSomething(). But if it somehow could be, that would be great.
StopRecording is also not officially thread-safe. Again, it would be great if it could be, but that is not a requirement.
DoSomething has to be thread-safe
The usual way seems to be:
public static void DoSomething()
{
object _lock = new object();
lock(_lock){
if(_record) _myList.Add("Test");
}
// More, but unrelated actions
}
Alternatively, declaring a static variable:
private static object _lock;
public static void DoSomething()
{
lock(_lock){
if(_record) _myList.Add("Test");
}
// More, but unrelated actions
}
However, this answer says that this does not prevent other code from accessing it.
So I wonder
How would I properly lock a list?
Should I create the lock object in my function or as a static class variable?
Can I wrap the functionality of Start and StopRecording in a lock-block as well?
StopRecording() does two things: it sets a boolean variable to false (to prevent DoSomething() from adding more stuff) and then copies the list to return a copy of the data to the caller. I assume that _record = false; is atomic and will be in effect immediately? So normally I wouldn't have to worry about multi-threading here at all, unless some other thread calls StartRecording() again?
At the end of the day, I am looking for a way to express "Okay, this list is mine now, all other threads have to wait until I am done with it".
I will lock on the _myList itself here since it is private, but using a separate variable is more common. To improve on a few points:
public static class MyClass
{
private static List<string> _myList = new List<string>();
private static bool _record;
public static void StartRecording()
{
lock(_myList) // lock on the list
{
_myList.Clear();
_record = true;
}
}
public static IEnumerable<string> StopRecording()
{
lock(_myList)
{
_record = false;
// Return a Read-Only copy of the list data
var result = new List<string>(_myList).AsReadOnly();
_myList.Clear();
return result;
}
}
public static void DoSomething()
{
lock(_myList)
{
if(_record) _myList.Add("Test");
}
// More, but unrelated actions
}
}
Note that this code uses lock(_myList) to synchronize access to both _myList and _record. And you need to sync all actions on those two.
And to agree with the other answers here, lock(_myList) does nothing to _myList; it just uses _myList as a token (the runtime associates a monitor with the object). All methods must play fair by asking permission using the same token. A method on another thread can still use _myList without locking first, but with unpredictable results.
We can use any token so we often create one specially:
private static object _listLock = new object();
And then use lock(_listLock) instead of lock(_myList) everywhere.
This technique would have been advisable if _myList had been public, and it would have been absolutely necessary if you had re-created _myList instead of calling Clear().
Creating a new lock in DoSomething() would certainly be wrong - it would be pointless, as each call to DoSomething() would use a different lock. You should use the second form, but with an initializer:
private static object _lock = new object();
It's true that locking doesn't stop anything else from accessing your list, but unless you're exposing the list directly, that doesn't matter: nothing else will be accessing the list anyway.
Yes, you can wrap Start/StopRecording in locks in the same way.
Yes, setting a Boolean variable is atomic, but that doesn't make it thread-safe. If you only access the variable within the same lock, you're fine in terms of both atomicity and volatility though. Otherwise you might see "stale" values - e.g. you set the value to true in one thread, and another thread could use a cached value when reading it.
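If the flag is ever read outside a lock, one way to avoid the stale-value problem is to mark it volatile. The sketch below assumes a simple stop flag (Worker, Run, Stop and Iterations are made-up names); it is not a replacement for locking when the flag and the list have to change together.

```csharp
using System.Threading;

public sealed class Worker
{
    // volatile: every loop iteration re-reads the field instead of
    // reusing a value cached in a register.
    private volatile bool _stop;
    private long _iterations;

    public long Iterations
    {
        get { return Interlocked.Read(ref _iterations); }
    }

    // Spins until another thread calls Stop().
    public void Run()
    {
        while (!_stop)
        {
            Interlocked.Increment(ref _iterations);
        }
    }

    public void Stop()
    {
        _stop = true;
    }
}
```

Typical use is to start Run on its own thread, call Stop() from another thread, and then Join the worker thread.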
There are a few ways to lock the list. You can lock on _myList directly providing _myList is never changed to reference a new list.
lock (_myList)
{
// do something with the list...
}
You can create a locking object specifically for this purpose.
private static object _syncLock = new object();
lock (_syncLock)
{
// do something with the list...
}
If the static collection implements the System.Collections.ICollection interface (List<T> does), you can also synchronize using the SyncRoot property.
lock (((ICollection)_myList).SyncRoot)
{
// do something with the list...
}
The main point to understand is that you want one and only one object to use as your locking sentinel, which is why creating the locking sentinel inside the DoSomething() function won't work. As Jon said, each thread that calls DoSomething() will get its own object, so the lock on that object will succeed every time and grant immediate access to the list. By making the locking object static (via the list itself, a dedicated locking object, or the ICollection.SyncRoot property), it becomes shared across all threads and can effectively serialize access to your list.
The first way is wrong, as each caller will lock on a different object.
You could just lock on the list.
lock(_myList)
{
_myList.Add(...)
}
You may be misinterpreting that answer. What is actually being stated is that the lock statement does not protect the object in question from being modified; rather, it prevents any other code that uses that object as a locking source from executing.
What this really means is that while one thread holds the lock on an instance, the code inside any lock block that uses the same instance will not be executed by another thread until the lock is released.
In essence you are not really attempting to "lock" your list, you are attempting to have a common instance that can be used as a reference point for when you want to modify your list, when this is in use or "locked" you want to prevent other code from executing that would potentially modify the list.
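Here is a small sketch of that distinction (CooperativeLockDemo and its members are hypothetical names). While one thread holds the lock, a second thread that asks for the same token is refused, but a second thread that never asks can still modify the list.

```csharp
using System.Collections.Generic;
using System.Threading;

public static class CooperativeLockDemo
{
    private static readonly object _token = new object();
    private static readonly List<int> _list = new List<int>();

    // While the caller holds the lock, another thread tries to take it without waiting.
    public static bool OtherThreadCanEnterWhileHeld()
    {
        bool taken = true;
        lock (_token)
        {
            var other = new Thread(() =>
            {
                taken = Monitor.TryEnter(_token, 0); // zero timeout: do not wait
                if (taken) Monitor.Exit(_token);
            });
            other.Start();
            other.Join();
        }
        return taken;
    }

    // Code that ignores the token is not blocked by the lock at all.
    public static bool UnlockedWriteSucceedsWhileHeld()
    {
        lock (_token)
        {
            int before = _list.Count;
            var other = new Thread(() => _list.Add(1)); // never asks for _token
            other.Start();
            other.Join();
            return _list.Count == before + 1;
        }
    }
}
```

The first method returns false and the second returns true, which is why every method that touches the list has to take the same lock for the scheme to work.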
In a C# app, suppose I have a single global class that contains some configuration items, like so :
public class Options
{
int myConfigInt;
string myConfigString;
..etc.
}
static Options GlobalOptions;
The members of this class will be used across different threads:
Thread1: GlobalOptions.myConfigString = blah;
while
Thread2: string thingie = GlobalOptions.myConfigString;
Using a lock for access to the GlobalOptions object would also block unnecessarily when 2 threads are accessing different members, but on the other hand creating a sync-object for every member seems a bit over the top too.
Also, using a lock on the global options would make my code less nice I think;
if I have to write
string stringiwanttouse;
lock(GlobalOptions)
{
stringiwanttouse = GlobalOptions.myConfigString;
}
everywhere (and is this thread-safe or is stringiwanttouse now just a pointer to myConfigString ? Yeah, I'm new to C#....) instead of
string stringiwanttouse = GlobalOptions.myConfigString;
it makes the code look horrible.
So...
What is the best (and simplest!) way to ensure thread-safety ?
You could wrap the field in question (myConfigString in this case) in a property, and have code in the get/set accessors that uses either a lock (Monitor.Enter/Monitor.Exit) or a Mutex. Then, accessing the property only locks that single field, and doesn't lock the whole class.
Edit: adding code
private static object obj = new object(); // only used for locking
public static string MyConfigString {
get {
lock(obj)
{
return myConfigString;
}
}
set {
lock(obj)
{
myConfigString = value;
}
}
}
The following was written before the OP's edit:
public static class Options
{
private static int _myConfigInt;
private static string _myConfigString;
private static volatile bool _initialized = false; // volatile: the flag is read outside the lock
private static object _locker = new object();
private static void InitializeIfNeeded()
{
if (!_initialized) {
lock (_locker) {
if (!_initialized) {
ReadConfiguration();
_initialized = true;
}
}
}
}
private static void ReadConfiguration() { /* ... */ }
public static int MyConfigInt {
get {
InitializeIfNeeded();
return _myConfigInt;
}
}
public static string MyConfigString {
get {
InitializeIfNeeded();
return _myConfigString;
}
}
//..etc.
}
After that edit, I can say that you should do something like the above, and only set configuration in one place - the configuration class. That way, it will be the only class modifying the configuration at runtime, and only when a configuration option is to be retrieved.
Your configurations may be 'global', but they should not be exposed as a global variable. If configurations don't change, they should be used to construct the objects that need the information - either manually or through a factory object. If they can change, then an object that watches the configuration file/database/whatever and implements the Observer pattern should be used.
Global variables (even those that happen to be a class instance) are a Bad Thing™
What do you mean by thread safety here? It's not the global object that needs to be thread safe, it is the accessing code. If two threads write to a member variable near the same instant, one of them will "win", but is that a problem? If your client code depends on the global value staying constant until it is done with some unit of processing, then you will need to create a synchronization object for each property that needs to be locked. There isn't any great way around that. You could just cache a local copy of the value to avoid problems, but the applicability of that fix will depend on your circumstances. Also, I wouldn't create a synch object for each property by default, but instead as you realize you will need it.
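One way to get the "cache a local copy" behaviour without a sync object per property is to keep the options in an immutable object and swap the whole object when something changes. This is only a sketch under that assumption; OptionsSnapshot and OptionsHolder are made-up names, and the two members mirror the ones in the question.

```csharp
using System.Threading;

// Immutable: an instance never changes after construction, so readers need no lock.
public sealed class OptionsSnapshot
{
    public OptionsSnapshot(int myConfigInt, string myConfigString)
    {
        MyConfigInt = myConfigInt;
        MyConfigString = myConfigString;
    }

    public int MyConfigInt { get; }
    public string MyConfigString { get; }
}

public static class OptionsHolder
{
    private static OptionsSnapshot _current = new OptionsSnapshot(0, "");

    // Readers grab one reference and use it for a whole unit of work.
    public static OptionsSnapshot Current
    {
        get { return Volatile.Read(ref _current); }
    }

    // Writers publish a complete new snapshot; a reference assignment is atomic.
    public static void Replace(OptionsSnapshot next)
    {
        Volatile.Write(ref _current, next);
    }
}
```

A reader that does var options = OptionsHolder.Current; sees a consistent set of values for as long as it keeps that reference, even if another thread calls Replace in the meantime. The trade-off is that a reader may keep working with slightly outdated values.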