Simple threading question, locking non local changes

First, a disclaimer: I'm really new to threading, so this may be a newbie question, but I searched Google and couldn't find an answer. As I understand it, a critical section is code that can be accessed by two or more threads, the danger being that one thread will overwrite a value before the other is finished, and vice versa. What can you do about changes made outside of your class? For example, I have a line-monitoring program:
int currentNumber = provider.GetCurrentNumber();
if(provider.CanPassNumber(false, currentNumber))
{
currentNumber++;
provider.SetNumber(currentNumber);
}
and on another thread I have something like this:
if(condition)
provider.SetNumber(numberToSet);
Now I'm afraid that the first function reads currentNumber as 5, right after that another thread sets the number to 7, and then the first function overwrites the 7 with 6, ignoring the change made by the thread that set it to 7.
Is there any way to lock provider.SetNumber until the first function finishes? The critical section is basically currentNumber, which can be changed from many places in the program.
I hope I made myself clear; if not, let me know and I will try to explain myself better.
EDIT:
Also, I made the functions really short for the example. In reality the function is much longer and changes currentNumber many times, so I don't really want to put a lock around the entire function. If I lock each call to provider.SetNumber and release the lock afterwards, the number can change while the lock is released, before I take it again for the next call. Honestly, I'm also worried about locking the entire function because of performance and deadlock.

Rather than using the lock keyword, I'd suggest seeing whether you can use the Interlocked class, which is designed for small operations. It has much less overhead than lock; in fact, an operation can come down to a single CPU instruction on some CPUs.
There are a couple of methods of interest for you, Exchange and Read, both of which are thread-safe.
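For the check-then-increment case in the question, Exchange and Read alone are not quite enough, because the check and the write still have to happen as one step. A minimal sketch of one way to do that with Interlocked.CompareExchange follows. The NumberProvider class, the TryIncrementBelow method and the "below a limit" rule are all assumptions made up for illustration; the real CanPassNumber logic is not shown in the question.

```csharp
using System.Threading;

// Hypothetical stand-in for the question's provider; only the number is modeled.
public sealed class NumberProvider
{
    private int _number;

    public int GetCurrentNumber() => Volatile.Read(ref _number);

    // Unconditional write, like the second thread in the question.
    public void SetNumber(int value) => Interlocked.Exchange(ref _number, value);

    // Increments only while the number is below 'limit' (an assumed rule).
    // Returns false if the rule rejected the increment.
    public bool TryIncrementBelow(int limit)
    {
        while (true)
        {
            int seen = Volatile.Read(ref _number);
            if (seen >= limit) return false;

            // Writes seen + 1 only if nobody changed the number since we read it.
            if (Interlocked.CompareExchange(ref _number, seen + 1, seen) == seen)
                return true;

            // Another thread changed the number in between; re-read and retry.
        }
    }
}
```

If another thread sets the number to 7 between the read and the write, the CompareExchange fails and the loop starts again from 7 instead of overwriting it with 6.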

You want to look into the lock keyword. You might also want to read this tutorial on Threading in C#.

As Filip said, lock is useful here.
Not only should you lock on provider.SetNumber(currentNumber), you also need to lock on any conditional that the setter depends on.
lock(someObject)
{
if(provider.CanPassNumber(false, currentNumber))
{
currentNumber++;
provider.SetNumber(currentNumber);
}
}
as well as
if(condition)
{
lock(someObject)
{
provider.SetNumber(numberToSet);
}
}
If condition relies on numberToSet, put the lock statement around the whole block, including the condition check. Also note that someObject must be the same object in both places.

You can use the lock statement to enter a critical section with mutual exclusion. The lock uses the object's reference to tell one critical section from another, so every lock that guards the same data must use the same reference.
// Define an object which can be locked in your class.
object locker = new object();
// Add around your critical sections the following :
lock (locker) { /* ... */ }
That will change your code to :
lock (locker)
{
    // Read inside the lock so the value cannot change between the read and the write.
    int currentNumber = provider.GetCurrentNumber();
    if (provider.CanPassNumber(false, currentNumber))
    {
        currentNumber++;
        provider.SetNumber(currentNumber);
    }
}
And :
if(condition)
{
lock (locker)
{
provider.SetNumber(numberToSet);
}
}

In your SetNumber method you can simply use a lock statement:
public class MyProvider {
object numberLock = new object();
...
public void SetNumber(int num) {
lock(numberLock) {
// Do Stuff
}
}
}
Also, note that in your example currentNumber is a local int (a value type), so it holds a copy: its value won't change if the provider's underlying field changes.
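A lock inside SetNumber protects a single write, but not a read followed by a write. One option is to let the provider expose the whole read-check-write sequence as a single method that holds the lock throughout. The sketch below assumes the provider just wraps an int; the IncrementIf method and the canPass delegate are hypothetical names, not part of the asker's code.

```csharp
using System;

public class MyProvider
{
    private readonly object numberLock = new object();
    private int number;

    public int GetCurrentNumber()
    {
        lock (numberLock) { return number; }
    }

    public void SetNumber(int num)
    {
        lock (numberLock) { number = num; }
    }

    // Hypothetical compound operation: the check and the write happen
    // under one lock acquisition, so no SetNumber call can slip in between.
    public bool IncrementIf(Func<int, bool> canPass)
    {
        lock (numberLock)
        {
            if (!canPass(number)) return false;
            number++;
            return true;
        }
    }
}
```

Keep the delegate short and free of other locks, since it runs while numberLock is held.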

Well, first off, I'm not that good with threading either, but a critical section is a part of your code that can only be accessed by one thread at a time, not the other way around.
Creating a critical section is easy:
lock (this)
{
    // Only one thread can run this at a time
}
Note that this should be replaced with some private internal object.

Related

Is it possible to create a deadlock in C# if nothing but the lock keyword is used around primitive data access?

I've written a lot of multi-threaded C# code, and I've never had a deadlock in any code I've released.
I use the following rules of thumb:
I tend to use nothing but the lock keyword (I also use other techniques such as reader/writer locks, but sparingly, and only if required for speed).
I use Interlocked.Increment if I am dealing with a long.
I tend to use the smallest granular unit of locking: I only tend to lock around primitive data structures such as long, dictionary or list.
I'm wondering if it's even possible to generate a deadlock if these rules of thumb are consistently followed, and if so, what the code would look like?
Update
I also use these rules of thumb:
Avoid adding a lock around anything that could pause indefinitely, especially I/O operations. If you absolutely have to do so, ensure that absolutely everything within the lock will time out after a set TimeSpan.
The objects I use for locking are always dedicated objects, e.g. object _lockDict = new object(); then lock(_lockDict) { // Access dictionary here }.
Update
Great answer from Jon Skeet. It also confirms why I never get deadlocks as I tend to instinctively avoid nested locks, and even if I do use them, I've always instinctively kept the entry order consistent.
And in response to my comment on tending to use nothing but the lock keyword, i.e. using Dictionary + lock instead of ConcurrentDictionary, Jon Skeet made this comment:
@Contango: That's exactly the approach I'd take too.
I'd go for simple code with locking over "clever" lock-free code every time, until there's evidence that it's causing an issue.

Yes, it's easy to deadlock, without actually accessing any data:
private readonly object lock1 = new object();
private readonly object lock2 = new object();
public void Method1()
{
lock(lock1)
{
Thread.Sleep(1000);
lock(lock2)
{
}
}
}
public void Method2()
{
lock(lock2)
{
Thread.Sleep(1000);
lock(lock1)
{
}
}
}
Call both Method1 and Method2 at roughly the same time, and boom - deadlock. Each thread will be waiting for the "inner" lock, which the other thread has acquired as its "outer" lock.
If you make sure you always acquire locks in the same order (e.g. "never acquire lock2 unless you already own lock1") and release the locks in the reverse order (which is implicit if you're acquiring/releasing with lock), then you won't get that sort of deadlock.
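When the two locks belong to objects that are only known at run time, a common way to get a consistent order is to sort by some stable key before locking. The following is only a sketch of that idea: the Account class, its Id key and the Transfer method are invented for illustration, and it assumes distinct accounts have distinct Ids.

```csharp
using System.Threading.Tasks;

public sealed class Account
{
    private readonly object _sync = new object();

    public int Id { get; }
    public decimal Balance { get; private set; }

    public Account(int id, decimal balance)
    {
        Id = id;
        Balance = balance;
    }

    public static void Transfer(Account from, Account to, decimal amount)
    {
        // Always lock the account with the smaller Id first,
        // regardless of the direction of the transfer.
        Account first = from.Id < to.Id ? from : to;
        Account second = from.Id < to.Id ? to : from;

        lock (first._sync)
        {
            lock (second._sync)
            {
                from.Balance -= amount;
                to.Balance += amount;
            }
        }
    }
}

public static class TransferDemo
{
    // Runs transfers in both directions at once and returns the combined balance.
    public static decimal Run()
    {
        var a = new Account(1, 1000m);
        var b = new Account(2, 1000m);

        Parallel.Invoke(
            () => { for (int i = 0; i < 1000; i++) Account.Transfer(a, b, 1m); },
            () => { for (int i = 0; i < 1000; i++) Account.Transfer(b, a, 1m); });

        return a.Balance + b.Balance;
    }
}
```

If Transfer locked from before to instead, the two loops in TransferDemo.Run would take the locks in opposite orders, which is exactly the Method1/Method2 situation above.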
You can still get a deadlock with async code, with just a single thread involved - but that involves Task as well:
public async Task FooAsync()
{
BarAsync().Wait(); // Don't do this!
}
public async Task BarAsync()
{
await Task.Delay(1000);
}
If you run that code from a WinForms thread, you'll deadlock in a single thread - FooAsync will be blocking on the task returned by BarAsync, and the continuation for BarAsync won't be able to run because it's waiting to get back onto the UI thread. Basically, you shouldn't issue blocking calls from the UI thread...
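The usual way out is to await the task instead of blocking on it, so the calling thread is released while the delay runs. A minimal sketch, with a made-up AsyncExample class wrapping the two methods and a shorter delay than in the snippet above:

```csharp
using System.Threading.Tasks;

public class AsyncExample
{
    public async Task FooAsync()
    {
        // Awaiting yields the thread instead of blocking it, so the
        // continuation of BarAsync can run on the original context.
        await BarAsync();
    }

    public async Task BarAsync()
    {
        await Task.Delay(100);
    }
}
```

In a WinForms event handler this would be called as await FooAsync(); from a handler that is itself marked async.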

As long as you only ever lock on one thing, it's impossible. If a thread tries to take multiple locks, then yes. The dining philosophers problem nicely illustrates a deadlock caused with simple data.
As the other answers have already shown:
void Thread1Method()
{
    lock (lock1)
    {
        // Do something
        lock (lock2)
        { }
    }
}
void Thread2Method()
{
    lock (lock2)
    {
        // Do something
        lock (lock1) // opposite order from Thread1Method: this is what deadlocks
        { }
    }
}

Addendum to what Skeet wrote:
The problem normally isn't with "only" two locks... (clearly there could be even with only two locks, but we want to play in Hard mode :-) )...
Let's say that in your program there are 10 lockable resources. Let's call them a1...a10. You must be sure that you always lock them in the same order, even for subsets of them. If one method needs a3, a5 and a7, and another method needs a4, a5 and a7, you must be sure that both will try to lock them in the "right" order. For simplicity's sake, in this case the order is clear: a1->a10.
Normally lock objects aren't numbered, and/or they aren't taken in a single method... For example:
void MethodA()
{
lock (Lock1)
{
CommonMethod();
}
}
void MethodB()
{
lock (Lock3)
{
CommonMethod();
}
}
void CommonMethod()
{
lock (Lock2)
{
}
}
void MethodC()
{
lock (Lock1)
{
lock (Lock2)
{
lock (Lock3)
{
}
}
}
}
Here, even with the Lock* objects numbered, it isn't immediately clear that the locks could be taken in the wrong order (MethodB+CommonMethod take Lock3+Lock2, while MethodC takes Lock1+Lock2+Lock3). And that is while playing with three very big advantages: we are talking about deadlocks, so we are looking for them; the locks are numbered; and the whole code is around 30 lines.

Lock for different methods but same variable

Hello friends, I have a question about a threaded application.
using System.Threading;

class Sample
{
    static volatile bool _shutdownThreads;
    static readonly object _lockerObject = new object();

    static void Main()
    {
        // Start the worker thread that runs SampleMethod.
        var worker = new Thread(SampleMethod);
        worker.Start();

        lock (_lockerObject)
        {
            _shutdownThreads = true;
        }
    }

    static void SampleMethod()
    {
        while (true)
        {
            lock (_lockerObject)
            {
                if (_shutdownThreads) break;
            }
        }
    }
}
(1) I guess you can see what I am trying to accomplish: I need a safe way to use the _shutdownThreads variable. Is this the right approach?
(2) If I lock a block of code, do all the variables inside the block get locked too? I mean, even other threads (for example Main) can't access the variable. Am I right?

Yes, you are right. The purpose of lock is to let one thread access a code block while other threads wait. However, in your specific case it does not make sense to lock a boolean assignment; it is atomic anyway.

"I need to have a safe way to use the _shutdownThreads variable. is
this the right approach?"
Yes, and no. It's safe, but you have a busy loop that will use A LOT of CPU for no good reason. There are better options for waiting for an event, but you can at least make it a lot less horrific by making the thread sleep a while between each check:
while(true) {
lock(_lockerObject) {
if(_shutdownThreads) break;
}
Thread.Sleep(100);
}
"if i lock a block of code all the variables inside the block gets
locked too?"
No, not at all. The lock doesn't keep any other thread from accessing any data whatsoever. The only thing the lock does is keep any other thread from entering a code block that locks on the same object reference (_lockerObject in your case).
To protect the data, you have to put locks around every code block that accesses the data, all using the same object reference.
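As for the "better options for waiting for an event" mentioned above, one of them is a ManualResetEventSlim, which lets the worker block until it is signalled instead of polling a flag. This is a rough sketch rather than a drop-in replacement; the ShutdownDemo class, the 10 ms wait and the iteration counter are assumptions made for the example.

```csharp
using System.Threading;

public static class ShutdownDemo
{
    // Starts a worker, signals shutdown, and returns how many
    // work iterations the worker completed before it stopped.
    public static int Run()
    {
        using (var shutdown = new ManualResetEventSlim(false))
        {
            int iterations = 0;

            var worker = new Thread(() =>
            {
                // Wait returns true as soon as the event is set,
                // or false after the 10 ms timeout elapses.
                while (!shutdown.Wait(10))
                {
                    iterations++; // placeholder for the periodic work
                }
            });

            worker.Start();
            Thread.Sleep(50);   // let the worker run for a moment
            shutdown.Set();     // request shutdown
            worker.Join();      // wait until the worker has exited
            return iterations;
        }
    }
}
```

The worker wakes up immediately when Set is called, so there is no need to pick a sleep interval that trades CPU usage against shutdown latency.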

Why is specifying a synchronization object in the lock statement mandatory

I'm trying to wrap my mind around what exactly happens in the lock statement.
If I understood correctly, the lock statement is syntactic sugar and the following...
Object _lock = new Object();
lock (_lock)
{
// Critical code section
}
...gets translated into something roughly like:
Object _lock = new Object();
Monitor.Enter(_lock);
try
{
// Critical code section
}
finally { Monitor.Exit (_lock); }
I have used the lock statement a few times, and always created a private field _lock, as a dedicated synchronization object. I do understand why you should not lock on public variables or types.
But why does the compiler not create that instance field as well? I feel there might in fact be situations where the developer wants to specify what to lock on, but from my experience, in most cases that is of absolutely no interest, you just want that lock! So why is there no parameterless overload of lock?
lock()
{
// First critical code section
}
lock()
{
// Second critical code section
}
would be translated into (or similar):
[DebuggerHidden]
private readonly object _lock1 = new object();
[DebuggerHidden]
private readonly object _lock2 = new object();
Monitor.Enter(_lock1);
try
{
// First critical code section
}
finally { Monitor.Exit(_lock1); }
Monitor.Enter(_lock2);
try
{
// Second critical code section
}
finally { Monitor.Exit(_lock2); }
EDIT: I have obviously been unclear concerning multiple lock statements. Updated the question to contain two lock statements.

The state of the lock, whether or not it has been entered, needs to be stored so that another thread that tries to enter the same lock can be blocked.
That requires a variable. Just a very simple one; a plain object is enough.
A hard requirement for such a variable is that it is created before any lock statement uses it. Trying to create it on the fly as you propose creates a new problem: there is now a need for a lock to safely create the variable, so that only the first thread that enters the lock creates it and other threads trying to enter the lock are blocked until it is created. Which requires a variable. Et cetera: an unsolvable chicken-and-egg problem.

There can be situations where you need two different locks that are independent of each other, meaning that when one 'lockable' part of the code is locked, the other should not be. That's why you can provide lock objects: you can have several of them for several independent locks.

In order for the no-variable thing to work, you'd have to either:
Have one auto-generated lock variable per lock block (what you did, which means that you can't have two different lock blocks locking on the same variable)
Use the same lock variable for all lock blocks in the same class (which means you can't have two independent things protected)
Plus, you'd also have the issue of deciding whether those should be instance-level or static.
In the end, I'm guessing the language designers didn't feel that the simplification in one specific case was worth the ambiguity introduced while reading code. Threading code (which is the reason to use locks) is already hard to write correctly and verify. Making it harder would be a not-good thing.

Allowing an implicit lock object might encourage the use of a single lock object, which is considered bad practice. By enforcing the use of an explicit lock object, the language encourages you to name the lock something useful, such as "countIncrementLock".
A variable with such a name would not encourage developers to use the same lock object when performing a completely separate operation, such as writing to a stream of some kind.
Therefore, the object could be writing to a stream on one thread while incrementing a counter on another thread, and neither thread would necessarily interfere with the other.
The only reason the language wouldn't do this is that it would look like a good practice but in reality would be hiding a bad practice.
Edit:
Perhaps the designers of C# did not want implicit lock variables because they thought it might encourage bad behaviour.
Perhaps the designers did not think of implicit lock variables at all, because they had other more important things to think about first.
If every C# developer knew exactly what was happening when they wrote lock(), and they knew the implications, then there's no reason why it shouldn't exist, and no reason why it shouldn't work how you're suggesting.

Parallel.For not handling lock properly

I've done the following test:
private static object threadLocker = new object();
private static long threadStaticVar;
public static long ThreadStaticVar
{
get
{
lock (threadLocker)
{
return threadStaticVar;
}
}
set
{
lock (threadLocker)
{
threadStaticVar = value;
}
}
}
Parallel.For(0, 20000, (x) =>
{
//lock (threadLocker) // works with this lock
//{
ThreadStaticVar++;
//}
});
This Parallel.For invokes the delegate with the values 0 to 19999, so it executes 20,000 times.
If I don't wrap ThreadStaticVar++; in a lock, the result is not 20000, even though the property has a lock in its get and set. If I uncomment the lock inside the .For, I get the right value.
My question is: how does this work? Why doesn't the lock in the get and set work? Why does it only work inside my For?

The ++ operator isn't an atomic increment. There will be a call to get followed by a call to set, and those calls can be interleaved among different threads since the lock is only on each individual operation. Think of it like this:
lock {tmp = var}
lock {var = tmp+1}
Those locks don't look so effective now, do they?

In your example ThreadStaticVar++ is not an atomic operation.
More precisely, ++ is not atomic because it locks your getter to read the value, then increments it, and then locks your setter to write the value. Between those two locks anything can happen :)
To do it properly, I would recommend an object-oriented approach instead of this procedural code. Implement an Increment() method on your object and make it responsible for locking and doing the ++ inside that method. In your parallel loop you just tell the object what to do; it is now the object's responsibility to make it happen and to figure out how.
So you implement your lock within the Increment() method and have no problems anywhere outside (really, consumers shouldn't know about, or even have to think about, such issues).
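A minimal sketch of what such an object could look like. The Counter and CounterDemo names are made up here; the point is only that the read, the add and the write all happen inside one lock.

```csharp
using System.Threading.Tasks;

public sealed class Counter
{
    private readonly object _sync = new object();
    private long _value;

    public long Value
    {
        get { lock (_sync) { return _value; } }
    }

    // The whole read-modify-write happens under a single lock acquisition.
    public void Increment()
    {
        lock (_sync) { _value++; }
    }
}

public static class CounterDemo
{
    public static long Run()
    {
        var counter = new Counter();
        Parallel.For(0, 20000, _ => counter.Increment());
        return counter.Value;
    }
}
```

Because no caller can get between the read and the write, CounterDemo.Run() returns 20000 every time.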

You can rename threadStaticVar and make it public. Then use Interlocked.Increment on it.
However, also consider whether a parallel for is appropriate. Even if the real code is more complex, running in parallel with locking may not be your best option.
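Interlocked.Increment needs a ref to the field itself, which is why a property cannot be used. As a rough illustration, the sketch below keeps the field private and does the increment inside the same class instead of making it public; InterlockedDemo and _total are hypothetical names.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class InterlockedDemo
{
    private static long _total;

    public static long Run()
    {
        Interlocked.Exchange(ref _total, 0); // reset before the loop

        // Each increment is a single atomic read-modify-write, no lock needed.
        Parallel.For(0, 20000, _ => Interlocked.Increment(ref _total));

        return Interlocked.Read(ref _total);
    }
}
```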

C# manual lock/unlock

I have a function in C# that can be called multiple times from multiple threads and I want it to be done only once so I thought about this:
class MyClass
{
bool done = false;
public void DoSomething()
{
lock(this)
if(!done)
{
done = true;
_DoSomething();
}
}
}
The problem is _DoSomething takes a long time and I don't want many threads to wait on it when they can just see that done is true.
Something like this can be a workaround:
class MyClass
{
bool done = false;
public void DoSomething()
{
bool doIt = false;
lock(this)
if(!done)
doIt = done = true;
if(doIt)
_DoSomething();
}
}
But just doing the locking and unlocking manually will be much better.
How can I manually lock and unlock just like the lock(object) does? I need it to use same interface as lock so that this manual way and lock will block each other (for more complex cases).

The lock keyword is just syntactic sugar for Monitor.Enter and Monitor.Exit:
Monitor.Enter(o);
try
{
//put your code here
}
finally
{
Monitor.Exit(o);
}
is the same as
lock(o)
{
//put your code here
}

Thomas suggests double-checked locking in his answer. This is problematic. First off, you should not use low-lock techniques unless you have demonstrated that you have a real performance problem that is solved by the low-lock technique. Low-lock techniques are insanely difficult to get right.
Second, it is problematic because we don't know what "_DoSomething" does or what consequences of its actions we are going to rely on.
Third, as I pointed out in a comment above, it seems crazy to return that the _DoSomething is "done" when another thread is in fact still in the process of doing it. I don't understand why you have that requirement, and I'm going to assume that it is a mistake. The problems with this pattern still exist even if we set "done" after "_DoSomething" does its thing.
Consider the following:
using System.Diagnostics;

class MyClass
{
    readonly object locker = new object();
    bool done = false;
    public void DoSomething()
    {
        if (!done)
        {
            lock(locker)
            {
                if(!done)
                {
                    ReallyDoSomething();
                    done = true;
                }
            }
        }
    }
    int x;
    void ReallyDoSomething()
    {
        x = 123;
    }
    void DoIt()
    {
        DoSomething();
        int y = x;
        Debug.Assert(y == 123); // Can this fire?
    }
}
Is this threadsafe in all possible implementations of C#? I don't think it is. Remember, non-volatile reads may be moved around in time by the processor cache. The C# language guarantees that volatile reads are consistently ordered with respect to critical execution points like locks, and it guarantees that non-volatile reads are consistent within a single thread of execution, but it does not guarantee that non-volatile reads are consistent in any way across threads of execution.
Let's look at an example.
Suppose there are two threads, Alpha and Bravo. Both call DoIt on a fresh instance of MyClass. What happens?
On thread Bravo, the processor cache happens to do a (non-volatile!) fetch of the memory location for x, which contains zero. "done" happens to be on a different page of memory which is not fetched into the cache quite yet.
On thread Alpha, at the "same time" on a different processor, DoIt calls DoSomething. Thread Alpha now runs everything in there. When thread Alpha has finished its work, done is true and x is 123 on Alpha's processor. Thread Alpha's processor flushes those facts back out to main memory.
Thread Bravo now runs DoSomething. It reads the page of main memory containing "done" into the processor cache and sees that it is true.
So now "done" is true, but "x" is still zero in the processor cache for thread Bravo. Thread Bravo is not required to invalidate the portion of the cache that contains "x" being zero because on thread Bravo neither the read of "done" nor the read of "x" were volatile reads.
The proposed version of double-checked locking is not actually double-checked locking at all. When you change the double-checked locking pattern you need to start over again from scratch and re-analyze everything.
The way to make this version of the pattern correct is to make at least the first read of "done" into a volatile read. Then the read of "x" will not be permitted to move "ahead" of the volatile read to "done".
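One way to express that fix is with the Volatile class, which makes the volatile accesses explicit at the call sites. This is a sketch of the idea under that assumption, not a claim about what any particular runtime requires; declaring the field volatile would be another way to get a volatile read. DoIt returns x here only so the result can be inspected.

```csharp
using System.Threading;

public class MyClass
{
    private readonly object locker = new object();
    private bool done;
    private int x;

    public void DoSomething()
    {
        // Volatile read: later reads (such as the read of x in DoIt)
        // may not be moved ahead of it.
        if (!Volatile.Read(ref done))
        {
            lock (locker)
            {
                if (!done)
                {
                    ReallyDoSomething();
                    // Volatile write: the write to x may not be moved after it.
                    Volatile.Write(ref done, true);
                }
            }
        }
    }

    private void ReallyDoSomething()
    {
        x = 123;
    }

    public int DoIt()
    {
        DoSomething();
        return x;
    }
}
```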

You can check the value of done before and after the lock:
if (!done)
{
lock(this)
{
if(!done)
{
done = true;
_DoSomething();
}
}
}
This way you won't enter the lock if done is true. The second check inside the lock is to cope with race conditions if two threads enter the first if at the same time.
BTW, you shouldn't lock on this, because it can cause deadlocks. Lock on a private field instead (like private readonly object _syncLock = new object())

The lock keyword is just syntactic sugar for the Monitor class. You could also call Monitor.Enter() and Monitor.Exit() directly.
But the Monitor class also has the methods TryEnter() and Wait(), which could help in your situation.
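As a rough illustration of TryEnter, the sketch below lets a thread skip the work instead of waiting when another thread already holds the lock. The OnceRunner class and its bool return value are assumptions made for the example, and a caller that gets false cannot tell whether the work has finished or is still running.

```csharp
using System.Threading;

public class OnceRunner
{
    private readonly object _sync = new object();
    private bool _done;

    // Returns true only for the single call that actually ran the work.
    public bool DoSomething()
    {
        // TryEnter without a timeout returns immediately:
        // false means another thread currently holds the lock.
        if (!Monitor.TryEnter(_sync)) return false;

        try
        {
            if (_done) return false;

            ReallyDoSomething();
            _done = true;
            return true;
        }
        finally
        {
            // Because this uses the same Monitor as lock(_sync),
            // it also blocks and is blocked by ordinary lock statements.
            Monitor.Exit(_sync);
        }
    }

    private void ReallyDoSomething()
    {
        // Placeholder for the long-running work.
    }
}
```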

I know this answer comes several years late, but none of the current answers seem to address your actual scenario, which only became apparent after your comment:
The other threads don't need to use any information generated by ReallyDoSomething.
If the other threads don't need to wait for the operation to complete, the second code snippet in your question would work fine. You can optimize it further by eliminating your lock entirely and using an atomic operation instead:
private int done = 0;
public void DoSomething()
{
if (Interlocked.Exchange(ref done, 1) == 0) // only evaluates to true ONCE
_DoSomething();
}
Furthermore, if your _DoSomething() is a fire-and-forget operation, then you might not even need the first thread to wait for it, allowing it to run asynchronously in a task on the thread pool:
int done = 0;
public void DoSomething()
{
if (Interlocked.Exchange(ref done, 1) == 0)
Task.Factory.StartNew(_DoSomething);
}
