Why is specifying a synchronization object in the lock statement mandatory - c#

I'm trying to wrap my mind around what exactly happens in the lock statement.
If I understood correctly, the lock statement is syntactic sugar and the following...
Object _lock = new Object();
lock (_lock)
{
// Critical code section
}
...gets translated into something roughly like:
Object _lock = new Object();
Monitor.Enter(_lock);
try
{
// Critical code section
}
finally { Monitor.Exit (_lock); }
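For comparison, compilers from C# 4 onward are documented to use the Monitor.Enter overload that takes a ref bool, and to call it inside the try block, so the lock is released only if it was actually taken. A minimal sketch of that expansion follows; the names LockExpansionDemo, _gate and _counter are made up for illustration.

```csharp
using System;
using System.Threading;

public static class LockExpansionDemo
{
    // Hypothetical names, for illustration only.
    private static readonly object _gate = new object();
    private static int _counter;

    // What you write.
    public static void IncrementWithLock()
    {
        lock (_gate)
        {
            _counter++;
        }
    }

    // Roughly what the compiler generates for the method above (C# 4 and later).
    public static void IncrementExpanded()
    {
        bool lockTaken = false;
        try
        {
            Monitor.Enter(_gate, ref lockTaken);
            _counter++;
        }
        finally
        {
            // Release only if the lock was actually acquired.
            if (lockTaken) Monitor.Exit(_gate);
        }
    }

    public static int Run()
    {
        _counter = 0;
        IncrementWithLock();
        IncrementExpanded();
        return _counter;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```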
I have used the lock statement a few times, and always created a private field _lock, as a dedicated synchronization object. I do understand why you should not lock on public variables or types.
But why does the compiler not create that instance field as well? I feel there might in fact be situations where the developer wants to specify what to lock on, but from my experience, in most cases that is of absolutely no interest, you just want that lock! So why is there no parameterless overload of lock?
lock()
{
// First critical code section
}
lock()
{
// Second critical code section
}
would be translated into (or similar):
[DebuggerHidden]
private readonly object _lock1 = new object();
[DebuggerHidden]
private readonly object _lock2 = new object();
Monitor.Enter(_lock1);
try
{
// First critical code section
}
finally { Monitor.Exit(_lock1); }
Monitor.Enter(_lock2);
try
{
// Second critical code section
}
finally { Monitor.Exit(_lock2); }
EDIT: I have obviously been unclear concerning multiple lock statements. Updated the question to contain two lock statements.

The state of the lock, whether or not it has been entered, needs to be stored so that another thread that tries to enter the same lock can be blocked.
That requires a variable. Just a very simple one; a plain object is enough.
A hard requirement for such a variable is that it is created before any lock statement uses it. Trying to create it on the fly as you propose creates a new problem: there is now a need to use a lock to safely create the variable, so that only the first thread that enters the lock creates it and other threads trying to enter the lock are blocked until it is created. Which requires a variable. Et cetera, an unsolvable chicken-and-egg problem.
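The point that the lock's state lives with the object can be observed with Monitor.IsEntered (available since .NET 4.5), which reports whether the current thread holds the lock on a given object. This is only a sketch; LockStateDemo and _gate are hypothetical names, and the field initializer is what guarantees the object exists before any lock statement runs.

```csharp
using System;
using System.Threading;

public static class LockStateDemo
{
    // Created by the field initializer, so it exists before any lock uses it.
    private static readonly object _gate = new object();

    public static bool[] Run()
    {
        bool before = Monitor.IsEntered(_gate);
        bool inside;
        lock (_gate)
        {
            // The "entered" state is tracked per object, per owning thread.
            inside = Monitor.IsEntered(_gate);
        }
        bool after = Monitor.IsEntered(_gate);
        return new[] { before, inside, after };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```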

There can be situations where you need two different locks that are independent of each other, meaning that while one 'lockable' part of the code is locked, the other 'lockable' part should not be. That's why you can provide the lock object: you can have several of them for several independent locks.

In order for the no-variable thing to work, you'd have to either:
Have one auto-generated lock variable per lock block (what you did, which means that you can't have two different lock blocks locking on the same variable)
Use the same lock variable for all lock blocks in the same class (which means you can't have two independent things protected)
Plus, you'd also have the issue of deciding whether those should be instance-level or static.
In the end, I'm guessing the language designers didn't feel that the simplification in one specific case was worth the ambiguity introduced while reading code. Threading code (which is the reason to use locks) is already hard to write correctly and verify. Making it harder would be a not-good thing.

Allowing for an implicit lock object might encourage the use of a single lock object, which is considered bad practice. By enforcing the use of an explicit lock object, the language encourages you to name the lock something useful, such as "countIncrementLock".
A variable named thusly would not encourage developers to use the same lock object when performing a completely separate operation, such as writing to a stream of some kind.
Therefore, the object could be writing to a stream on one thread, while incrementing a counter on another thread, and neither of the threads would necessarily interfere with each other.
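As a rough sketch of that idea, the class below keeps one named lock per independent piece of state. Worker, WriteThen and the field names are invented for this example, and WriteThen (which runs a callback while holding a lock) exists only to make the demonstration deterministic; it is not something to copy into real code.

```csharp
using System;
using System.IO;
using System.Threading;

public sealed class Worker
{
    // Hypothetical names: one lock per independent piece of state.
    private readonly object _countIncrementLock = new object();
    private readonly object _streamWriteLock = new object();
    private readonly StringWriter _log = new StringWriter();
    private int _count;

    public int Count
    {
        get { lock (_countIncrementLock) { return _count; } }
    }

    public void Increment()
    {
        lock (_countIncrementLock) { _count++; }
    }

    // Demo-only helper: runs a callback while the stream lock is still held.
    public void WriteThen(string line, Action whileHoldingStreamLock)
    {
        lock (_streamWriteLock)
        {
            _log.WriteLine(line);
            whileHoldingStreamLock();
        }
    }
}

public static class IndependentLocksDemo
{
    public static int Run()
    {
        var worker = new Worker();
        worker.WriteThen("hello", () =>
        {
            // Another thread increments while this thread is inside the stream lock.
            var t = new Thread(worker.Increment);
            t.Start();
            t.Join(); // returns because Increment waits on a different lock
        });
        return worker.Count;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

Had both methods shared a single lock object, the Join call would never return, because the second thread would be waiting for a lock held by the thread that is waiting for it.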
The only reason why the language wouldn't do this is that it would look like a good practice, but in reality would be hiding a bad practice.
Edit:
Perhaps the designers of C# did not want implicit lock variables because they thought it might encourage bad behaviour.
Perhaps the designers did not think of implicit lock variables at all, because they had other more important things to think about first.
If every C# developer knew exactly what was happening when they wrote lock(), and they knew the implications, then there's no reason why it shouldn't exist, and no reason why it shouldn't work how you're suggesting.

Related

Why do you need a lock object for C#?

You can sometimes reuse the object itself for the lock, but quite often it is advised to use a different object anyway.
Don't we have a lot more typesafety and a lot better intention if there would just be a keyword for lock?
private object _MyLock = new object();
// someone would now be able to reassign breaking everything
_MyLock = new object();
lock ( _MyLock )
...
VS
private lock _MyLock;
// compiler error
_MyLock = new object();
lock ( _MyLock )
...
Before I get downvotes because you can't guess someone's intention: I'm pretty sure the language designers had a good reason, and there are more knowledgeable coders here who know why. I ask this to better understand programming principles.
Note that by using an object as a monitor, you can already do all the things you can do with any other object: Pass references so that multiple classes can share the same monitor, keep them in arrays, keep them inside other data structures.
If you had a special type of declaration for a lockable, you would need special syntax for passing it to a function, storing a reference to it inside another instance, associating it with a type instead of an instance, creating arrays (e.g. for LockMany operation), and so on.
Using the language rules that already exist for objects to handle all these common and not-so-common usages makes the language a whole lot simpler.
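To illustrate why ordinary objects are convenient here, the sketch below keeps lock objects in an array and picks one per key (sometimes called lock striping). StripedCounter and its members are hypothetical names chosen for this example, not an existing API.

```csharp
using System;
using System.Threading.Tasks;

public sealed class StripedCounter
{
    // Plain objects can live in an array like any other reference.
    private readonly object[] _locks;
    private readonly int[] _counts;

    public StripedCounter(int stripes)
    {
        _locks = new object[stripes];
        _counts = new int[stripes];
        for (int i = 0; i < stripes; i++) _locks[i] = new object();
    }

    public void Increment(int key)
    {
        // Mask off the sign bit so the index is never negative.
        int stripe = (key & int.MaxValue) % _locks.Length;
        lock (_locks[stripe]) { _counts[stripe]++; }
    }

    public int Total()
    {
        int total = 0;
        for (int i = 0; i < _locks.Length; i++)
        {
            lock (_locks[i]) { total += _counts[i]; }
        }
        return total;
    }
}

public static class StripedCounterDemo
{
    public static int Run()
    {
        var counter = new StripedCounter(4);
        Parallel.For(0, 10000, i => counter.Increment(i));
        return counter.Total();
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```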
Don't we have a lot more typesafety and a lot better intention if there would just be a keyword for lock?
It's not about type safety at all. It's about thread safety.
Sometimes that means running the same code in a single lock over and over. Perhaps you have a single large array, where some of your operations might need to swap two elements, and you want to make sure things are synchronized during the swap. In this kind of context, even a simple lock keyword by itself, where the object is created for you behind the scenes, might be good enough.
Sometimes you're sharing an object among very different sets of code. Now you need multiple lock sections that coordinate using a common object. In this case, the code you're talking about seems to make sense. Letting the compiler create a lock object for you isn't good enough because the different lock sections would not coordinate, but you also want to make sure the common lock object is fixed, and doesn't change somehow. For example, maybe you're working through an array with multiple threads, and you have different operations that might modify a shared index value that indicates which element is considered current or active. Each of these operations should lock on the same object.
But sometimes you share multiple object instances (often of the same type) among several sets of code. Think producer/consumer pattern, where multiple consumers from different threads need to coordinate access to a shared queue, and the consumers themselves are multi-threaded. In this kind of case, a single common lock object would be okay to retrieve an element from the queue, but a single shared object in different sections of the consumer could become a bottleneck for the application. Instead, you would only want to lock once per active object/consumer. You need the lock section to accept a variable that indicates which object needs protection, without locking your entire data set.
One solution may be to define the lock object as readonly.
static readonly object lockObject = new object();
In this case the compiler prevents a new object from being assigned to lockObject.

Locking on the object that is being synchronized or using a dedicated lock object? [duplicate]

This question already has answers here:
C# lock statement, what object to lock on?
(4 answers)
Closed 6 years ago.
As far as I've understood from colleagues and the web, it is bad practice to lock on the object that is being synchronized, but what I don't understand is why?
The following class is supposed to load settings to a dictionary, and it has a method to retrieve settings as well.
public class TingbogSettingService : ITingbogSettingService
{
private readonly ISettingRepository _settingRepository;
private readonly ICentralLog _centralLog;
private Dictionary<string, ISetting> _settingDictionary = new Dictionary<string, ISetting>();
public TingbogSettingService(ISettingRepository settingRepository, ICentralLog centralLog)
{
_settingRepository = settingRepository;
_centralLog = centralLog;
}
public ISetting GetSetting(string settingName)
{
ISetting setting;
if (!_settingDictionary.TryGetValue(settingName, out setting))
{
return null;
}
return setting;
}
public void LoadSettings()
{
var settings = _settingRepository.LoadSettings();
try
{
lock (_settingDictionary)
{
_settingDictionary = settings.ToDictionary(x => x.SettingName);
}
}
catch (Exception ex)
{
_centralLog.Log(Targets.Database, LogType.Error, $"LoadSettings error: Could not load the settings", new List<Exception>() { ex });
}
}
}
During the LoadSettings function I want to lock the _settingDictionary, so that GetSetting will be blocked, until the new settings are loaded.
Should I use a dedicated lock object instead?
For instance:
private static readonly object m_Lock = new object();
…
lock (m_Lock)
EDIT
I thought that lock(_settingDictionary) would lock the _settingDictionary itself, however I now realize that this is not the case. What I wanted was to prevent other threads from accessing _settingDictionary until the new settings were loaded (LoadSettings method completed). As only 1 thread is updating the _settingDictionary, I guess I don't need a lock there at all.
As for closing the question - something similar has been asked before, yes, but the scenario is not the same. I learned from your answers and it is going to be hard to pick a winner amongst y'all.
This is quite a broad subject, but let me focus on one major problem in your code: _settingDictionary changes.
You don't lock on the field, you lock on the instance. This means that when you lock on _settingDictionary, and then you change _settingDictionary, you're not preventing any concurrent access - anyone can lock on the new _settingDictionary.
lock doesn't prevent access to the object you're locking either. If you need synchronization, you must synchronize all access to the object, including your _settingDictionary.TryGetValue. Dictionary isn't thread-safe.
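One way to apply both points is sketched below: a dedicated readonly lock object that never changes, taken by readers as well as by the loader. This is a simplified stand-in for the class in the question, not a drop-in replacement: SettingCache, _gate, Get and Load are invented names, and plain strings replace ISetting to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

public sealed class SettingCache
{
    // Dedicated lock object: private, readonly, never reassigned.
    private readonly object _gate = new object();
    private Dictionary<string, string> _settings = new Dictionary<string, string>();

    public string Get(string name)
    {
        // Readers take the same lock as the writer.
        lock (_gate)
        {
            string value;
            return _settings.TryGetValue(name, out value) ? value : null;
        }
    }

    public void Load(IDictionary<string, string> source)
    {
        // Build the new dictionary outside the lock; only the swap needs protection.
        var fresh = new Dictionary<string, string>(source);
        lock (_gate)
        {
            _settings = fresh;
        }
    }
}

public static class SettingCacheDemo
{
    public static string[] Run()
    {
        var cache = new SettingCache();
        cache.Load(new Dictionary<string, string> { { "theme", "dark" } });
        return new[] { cache.Get("theme"), cache.Get("missing") };
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine(result[0]);
        Console.WriteLine(result[1] == null);
    }
}
```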
The main guidelines for what you should lock on are something like this:
The lock object is private to the locker - if it's not private, some other class might be holding a lock on your object, which may lead to deadlocks.
The field should be readonly - this is not a strict requirement, but it makes things easier. The main point is that you must not lock on an object that might change while the lock is being held; others trying to take the lock concurrently will succeed.
The lock object is a reference type - this kind of goes without saying, but you cannot lock on e.g. an int field, since it is boxed when you try to lock it - in effect, this is the same as the previous point - everyone locks on their own instance of the object, eliminating all synchronization.
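The boxing point can be checked directly. The lock statement itself refuses a value type at compile time, so this sketch (with the made-up name BoxingDemo) only shows the underlying reason: each conversion of an int to object produces a separate box, so no two callers would ever share a monitor.

```csharp
using System;

public static class BoxingDemo
{
    public static bool Run()
    {
        int value = 42;
        object first = value;   // boxes into one new object
        object second = value;  // boxes into a different new object
        // Two distinct boxes, so there is nothing common to synchronize on.
        return object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```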
Obligatory disclaimer: Multi-threading is hard. Seriously hard. Make sure you understand what's happening and what can possibly happen. Any multi-threaded code you write must be written in a way that's correct, first and foremost. http://www.albahari.com/threading/ is a great starter on all things multi-threaded in C#/.NET.
There is no "right" or "wrong" answer to this but there are some guidelines and some things to be aware of.
First, there are many who feel that Microsoft should never have allowed locking on arbitrary objects. Instead they should've encapsulated the locking functionality into a specific class and avoided potential overhead in every other object out there.
The biggest problem with allowing locking on arbitrary objects is that if you lock on an object you make publicly available to 3rd party code, you have no control over who else might be locking on the same object. You could write your code to the letter, dotting every I and it would still end up deadlocking because some other, 3rd party, code is locking on the same object out of your control.
So that point alone is guideline enough to say "don't ever lock on objects you make publicly available".
But what if the object you want to synchronize access to is private? Well, then it becomes more fuzzy. Presumably you have full control over the code you write yourself and thus if you then lock on the dictionary, as an example, then it will work just fine.
Still, my advice would be to always set up a separate object to lock on and get into this habit; then you won't so easily make mistakes if you later decide to expose a previously private object publicly and forget to separate the locking semantics from it.
The simplest locking object is just that, an object:
private readonly object _SomethingSomethingLock = new object();
Also know, though I think you already do, that locking on an object does not "lock the object". Any other piece of code that doesn't bother with locks can still access the object just fine.
Here is also something I just noticed about your code.
When you do this:
lock (x)
You don't lock on x, you lock on the object that x refers to at the time of the lock.
This is important when looking at this code:
lock (_settingDictionary)
{
_settingDictionary = settings.ToDictionary(x => x.SettingName);
}
Here you have two objects in play:
The dictionary that _settingDictionary refers to at the time of lock (_settingDictionary)
The new dictionary that .ToDictionary(...) returns
You have a lock on the first object, but not on the second. This is another scenario where having a dedicated locking object would not only make sense, but also be correct, as the above code is buggy in my opinion.
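A small sketch makes the two objects visible. It uses Monitor.IsEntered to ask which object the current thread actually holds after the field has been reassigned inside the lock; ReassignDemo and _settings are hypothetical names standing in for the code in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ReassignDemo
{
    private static Dictionary<string, int> _settings = new Dictionary<string, int>();

    public static bool[] Run()
    {
        bool holdsOld;
        bool holdsNew;
        var old = _settings;
        lock (_settings)
        {
            // The field now points at a different object than the one we locked.
            _settings = new Dictionary<string, int>();
            holdsOld = Monitor.IsEntered(old);
            holdsNew = Monitor.IsEntered(_settings);
        }
        return new[] { holdsOld, holdsNew };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```

The lock is held on the old dictionary only, so a second thread executing lock (_settings) at that moment would walk straight in.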
The problem you are talking about happens when you lock on an object to which external users of your class have access - most commonly, the object itself, i.e. lock (this).
If your code were locking on this instead of _settingDictionary, someone else could deadlock your code as follows:
TingbogSettingService svc = ...
lock (svc) {
Task.Run(() => {
svc.LoadSettings();
});
}
When you lock on a private object, such as _settingDictionary in your case, the harmful effect described above is avoided, because nobody outside your code can lock on the same object.
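The interference can be shown without actually hanging the program. In this sketch the hypothetical Service class uses Monitor.TryEnter(this, timeout) where real code would have written lock (this), purely so that the demonstration terminates; the outside caller holds a lock on the instance and the service cannot get in.

```csharp
using System;
using System.Threading;

public sealed class Service
{
    // Stand-in for a method that does lock (this); the timeout keeps the demo from hanging.
    public bool TryLoad(int timeoutMs)
    {
        if (!Monitor.TryEnter(this, timeoutMs)) return false;
        try
        {
            return true; // the protected work would happen here
        }
        finally
        {
            Monitor.Exit(this);
        }
    }
}

public static class LockThisDemo
{
    public static bool Run()
    {
        var svc = new Service();
        bool loaded = true;
        lock (svc) // outside code locks on the same instance
        {
            var t = new Thread(() => loaded = svc.TryLoad(100));
            t.Start();
            t.Join();
        }
        return loaded;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```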
Note: Using the lock in your code does not make it thread-safe, because the GetSetting method does not lock on _settingDictionary when reading from it. Moreover, the fact that you reassign _settingDictionary inside the lock makes locking irrelevant, because after the reassignment another thread can enter the protected section in the lock.
There are different things you could lock on:
a dedicated non static object: private readonly object m_Lock = new object();
a dedicated static object (your example): private static readonly object m_Lock = new object();
the object itself: lock (_settingDictionary)
this, typeof(MyClass)...
The first two are OK but actually different. Locking on a static object means the lock is shared between all instances of your classes. Locking on a non-static object means the lock is different for each instance of your class.
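The difference between the two can be sketched as follows. Account, SharedLock, _instanceLock and HoldBoth are invented names, and HoldBoth exists only so the demonstration can probe the locks from a second thread while they are held.

```csharp
using System;
using System.Threading;

public sealed class Account
{
    private static readonly object SharedLock = new object(); // one for all instances
    private readonly object _instanceLock = new object();     // one per instance

    private static bool CanEnter(object gate, int timeoutMs)
    {
        if (!Monitor.TryEnter(gate, timeoutMs)) return false;
        Monitor.Exit(gate);
        return true;
    }

    public bool CanEnterShared(int timeoutMs) { return CanEnter(SharedLock, timeoutMs); }
    public bool CanEnterInstance(int timeoutMs) { return CanEnter(_instanceLock, timeoutMs); }

    // Demo-only helper: holds both locks of this instance while running the callback.
    public void HoldBoth(Action action)
    {
        lock (SharedLock)
        lock (_instanceLock)
        {
            action();
        }
    }
}

public static class StaticVsInstanceDemo
{
    public static bool[] Run()
    {
        var a = new Account();
        var b = new Account();
        bool shared = true;
        bool instance = false;
        a.HoldBoth(() =>
        {
            var t = new Thread(() =>
            {
                shared = b.CanEnterShared(100);     // blocked: same static object
                instance = b.CanEnterInstance(100); // free: b has its own object
            });
            t.Start();
            t.Join();
        });
        return new[] { shared, instance };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```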
The third option is OK, it's the same as the first one. The only difference is that the object is not read-only (using a read-only field is slightly better as you ensure it won't ever change).
The last option is a bad option for various reasons, see Why is lock(this) {...} bad?
So be careful about what you lock, your example uses a static object while your initial code uses a non-static object. Those are really different use cases.
It is better to use a dedicated object that is not modified by the block of code or used for other purposes in some other methods. That way the object has a single responsibility so that you don't mix the usage of it as a synchronization object, with it being maybe set to null at some point or reinitialized by another method.
lock (_settingDictionary) doesn't lock the dictionary specified between (), it locks the next block of code by using _settingDictionary as a synchronization object (to know whether the block has been entered or left by another thread, by setting some flags on that object).

C# "lock" keyword: Why is an object necessary for the syntax?

To mark code as a critical section we do this:
Object lockThis = new Object();
lock (lockThis)
{
//Critical Section
}
Why is it necessary to have an object as a part of the lock syntax? In other words, why can't this work the same:
lock
{
//Critical Section
}
Because you don't just lock - you lock something (you lock a lock).
The point of locking is to disallow two threads from directly competing for the same resource. Therefore, you hide that resource behind an arbitrary object. That arbitrary object acts as a lock. When one thread enters a critical section, it locks the lock and the others can't get in. When the thread finishes its work in the critical section, it unlocks and leaves the keys out for whichever thread happens to come next.
If a program has one resource that's candidate for competing accesses, it's possible that it will have other such resources as well! But often these resources are independent from each other - in other words, it may make sense for one thread to be able to lock one particular resource and another thread in the meantime to be able to lock another resource, without those two interfering.
A resource may also need to be accessed from two critical sections. Those two will need to have the same lock. If each had their own, they wouldn't be effective in keeping the resource uncontested.
Obviously, then, we don't just lock - we lock each particular resource's own lock. But the compiler can't autogenerate that arbitrary lock object silently, because it doesn't know which resources should be locked using the same lock and which ones should have their own lock. That's why you have to explicitly state which lock protects which block (or blocks) of code.
Note that the correct usage of an object as a lock requires that the object is persistent (at least as persistent as the corresponding resource). In other words, you can't create it in a local scope, store it in a local variable and throw it away when the scope exits, because this means you're not actually locking anything. If you have one persistent object acting as a lock for a given resource, only one thread can enter that section. If you create a new lock object every time someone attempts to get in, then anyone can enter at all times.
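Here is a minimal sketch of a persistent lock guarding one resource, with hypothetical names (PersistentLockDemo, _counterLock, _counter). The lock object lives exactly as long as the counter it protects, so every thread contends for the same lock.

```csharp
using System;
using System.Threading;

public static class PersistentLockDemo
{
    // Lives as long as the resource it protects.
    private static readonly object _counterLock = new object();
    private static int _counter;

    public static int Run()
    {
        _counter = 0;
        var threads = new Thread[4];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int n = 0; n < 100000; n++)
                {
                    // Writing lock (new object()) here would protect nothing:
                    // every iteration would lock a fresh, uncontended object.
                    lock (_counterLock)
                    {
                        _counter++;
                    }
                }
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        return _counter;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```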
It needs something to use as the lock.
This way two different methods can share the same lock, so only one can be used at a time.
Object lockThis = new Object();
void read()
{
lock (lockThis)
{
//Critical Section
}
}
void write()
{
lock (lockThis)
{
//Critical Section
}
}
Well the simple answer is that that's just how the language is specified. Locks need to have some sort of object to lock on. This allows you to control what code must be locked by controlling the scope and lifecycle of this object. See lock Statement (C# Reference).
It's possible that the designers of the language could have provided an anonymous-style lock like you've suggested, and rely on the compiler to generate an appropriate object behind the scenes. But should the object that's created be a static or instance member? How would it be used if you had multiple methods that need a lock in one class? These are difficult questions, and I'm sure the designers of the language simply didn't feel the benefit of including such a construct would be worth the added complexity or confusion it would entail.

c# lock question: lock(this) vs lock(SyncRoot)

I have a class with a field of type collection.
Questions:
if I lock(this), do I effectively lock the collection too?
what is more efficient, to do lock(this) or to create a SyncRoot object and do lock(SyncRoot)?
Don't lock on this. It could be the case that someone else has used the instance as a lock object too. Use specifically designated lock objects.
1) if I lock(this), do I effectively lock the collection too?
No.
2) what is more efficient, to do lock(this) or to create a SyncRoot object and do lock(SyncRoot) ?
Efficient? Focus on semantics. locking on this is dangerous. Don't do it. The difference in performance, if any, is not material.
Seriously, it's akin to asking, what will get me to my destination faster, driving 100 MPH the wrong way down the freeway, or walking?
Always use lock(_syncRoot).
Where _syncRoot is a private field (just has to be an object).
There is no difference in terms of efficiency, but you're better off having a private field that you're in control of to lock on. If you lock on this, another object may also be locking on it.
See Why is lock(this) {...} bad? for a much better explanation. Also have a look at the msdn article on lock.
By locking on a collection, you aren't doing anything to stop it from being changed. A misunderstanding you might have, is that lock doesn't do anything special to stop that object being changed, it only works if every critical piece of code also calls lock.
When using lock you aren't doing anything magical to the object you put inside the lock - it isn't making it read only or anything like that. It is just making a note that something has a lock reference to that object. So if anybody else tries to get a lock on that object at the same time it will do what you expect (prevent simultaneous access).
What lock doesn't do is care about any properties, fields or anything else in the object you are locking. So no, you aren't locking the collection at all.
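A short sketch of that: while one thread holds a lock on a list, a second thread that never asks for the lock modifies the list without being stopped. NoMagicDemo is a made-up name for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class NoMagicDemo
{
    public static int Run()
    {
        var items = new List<int>();
        lock (items)
        {
            // This thread never takes the lock, so nothing holds it back.
            var t = new Thread(() => items.Add(1));
            t.Start();
            t.Join();
        }
        return items.Count;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```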
This is explained at greater length in this question: Why is lock(this) {...} bad? (which I got from other answers, but it is an excellent answer and I felt it should be included here too).
As for efficiency I wouldn't expect there to be a lot of performance difference between the two. However as others have said you should not lock on something that might be locked by something outside of your control. This is why more often than not you will find private variables being created for this.
Personally I'd give it a more descriptive name than synclock to describe exactly what the locking process is for (eg saveLock).
A lot of people are saying that lock(this) is dangerous. However, the MSDN description of ICollection.SyncRoot states:
For collections whose underlying store is not publicly available, the expected implementation is to return the current instance
If a class follows this guideline, then yes, lock(this) is effectively the same as lock(SyncRoot). But you shouldn't rely on implementation details like this, and should use the more explicit lock(SyncRoot).
Of course, if you don't want to publicly expose locking semantics, but use lock within your class implementation, then you should lock on a private object as others have recommended, and as MSDN recommends. But that doesn't seem to be what you're asking.
Both locking on an instance (lock(this)) and locking on a public property (lock(SyncRoot)) expose you to having the lock taken by code you don't control. If your intention is to expose locking semantics to callers, you have no choice but to do this.
Always lock on something over which the locking code has control. You have no control over a type or an instance of a type.

What is a best approach to make a function or set of statements thread safe in C#?

What is a best approach to make a function or set of statements thread safe in C#?
Don't use shared read/write state when possible. Go with immutable types.
Take a look at the C# lock statement. Read Jon Skeet's article on multi threading in .net.
It depends on what you're trying to accomplish.
If you want to make sure that in any given time only one thread would run a specific code use lock or Monitor:
public void Func(...)
{
lock(syncObject)
{
// only one thread can enter this code
}
}
On the other hand, if you want multiple threads to run the same code without causing race conditions by changing the same point in memory, don't write to static/shared objects which can be reached by multiple threads at the same time.
BTW - If you want to create a static object that would be shared only within a single thread use the ThreadStatic attribute (http://msdn.microsoft.com/en-us/library/system.threadstaticattribute(VS.71).aspx).
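As a quick sketch of ThreadStatic behaviour (ThreadStaticDemo and _perThreadValue are hypothetical names): each thread gets its own copy of the field, so a value set on one thread is not seen on another, which instead sees the type's default.

```csharp
using System;
using System.Threading;

public static class ThreadStaticDemo
{
    // Each thread has its own copy; no initializer, since it would run for one thread only.
    [ThreadStatic]
    private static int _perThreadValue;

    public static int[] Run()
    {
        _perThreadValue = 5; // set on the calling thread only
        int seenByOther = -1;
        var t = new Thread(() => seenByOther = _perThreadValue);
        t.Start();
        t.Join();
        return new[] { _perThreadValue, seenByOther };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```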
Use lock statement around shared state variables. Once you ensured thread safety, run code through code profiler to find bottlenecks and optimize those places with more advanced multi-threading constructs.
The best approach will vary depending on your exact problem at hand.
The simplest approach in C# is to "lock" resources shared by multiple threads using a lock statement. This creates a block of code which can only be accessed by one thread at a time: the one which has obtained the "lock" object. For example, this property is thread safe using the lock syntax:
public class MyClass
{
private int _myValue;
public int MyProperty
{
get
{
lock(this)
{
return _myValue;
}
}
set
{
lock(this)
{
_myValue = value;
}
}
}
}
A thread acquires the lock at the start of the block and only releases the lock at the end of the block. If the lock is not available, the thread will wait until the lock is available. Obviously, access to the private variable within the class is not thread-safe, so all threads must access the value through the property to be safe.
This is by far the simplest way for threads to have safe access to shared data, however it only touches the tip of the iceberg of techniques for threading.
Write the function in such a way that:
It does not modify its parameters in any way
It does not access any state outside of its local variables.
Otherwise, race conditions MAY occur. The code must be thoroughly examined for such conditions and appropriate thread synchronization must be implemented (locks, etc...). Writing code that does not require synchronization is the best way to make it thread-safe. Of course, this is often not possible - but should be the first option considered in most situations.
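A minimal sketch of code that needs no synchronization at all, assuming an immutable type is acceptable for the data in question. Point and Offset are illustrative names; the get-only auto-properties require C# 6 or later.

```csharp
using System;

public sealed class Point
{
    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Get-only properties: the state cannot change after construction.
    public int X { get; }
    public int Y { get; }

    // Touches no shared state and does not modify its inputs; returns a new instance.
    public Point Offset(int dx, int dy)
    {
        return new Point(X + dx, Y + dy);
    }
}

public static class ImmutableDemo
{
    public static int[] Run()
    {
        var p = new Point(1, 2);
        var q = p.Offset(3, 4);
        return new[] { p.X, p.Y, q.X, q.Y };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```

Because no instance ever changes, any number of threads can read p and q concurrently without a lock.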
There's a lot to understand when learning what "thread safe" means and all the issues that are introduced (synchronization, etc).
I'd recommend reading through this page in order to get a better feel for what you're asking: Threading in C#. It gives a pretty comprehensive overview of the subject, which sounds like it could be pretty helpful.
And Mehrdad's absolutely right -- go with immutable types if you can help it.
