Why is a lock required here? - c#

I'm looking at an example on p. 40 of Stephen Cleary's book:
// Note: this is not the most efficient implementation.
// This is just an example of using a lock to protect shared state.
static int ParallelSum(IEnumerable<int> values)
{
    object mutex = new object();
    int result = 0;
    Parallel.ForEach(source: values,
        localInit: () => 0,
        body: (item, state, localValue) => localValue + item,
        localFinally: localValue =>
        {
            lock (mutex)
                result += localValue;
        });
    return result;
}
and I'm a little confused why the lock is needed. Because if all we're doing is summing a collection of ints, say {1, 5, 6}, then we shouldn't need to care about the shared sum result being incremented in any order.
(1 + 5) + 6 = 1 + (5 + 6) = (1 + 6) + 5 = ...
Can someone explain where my thinking here is flawed?
I guess I'm also a little confused about why the body of the method can't simply be
int result = 0;
Parallel.ForEach(values, (val) => { result += val; });
return result;

Operations such as addition are not atomic and thus are not thread safe. In the example code, if the lock were omitted, it's completely possible that two addition operations execute near-simultaneously, resulting in one of the values being overwritten or incorrectly added. There is a thread-safe method for adding to integers: Interlocked.Add(ref int, int). Since that isn't being used here, the lock is required in the example code to ensure that at most one non-atomic addition operation is in progress at a time (in sequence, not in parallel).
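To make the alternative concrete, here is a sketch of the book's example with the lock replaced by Interlocked.Add in the localFinally step (the class and Main harness are mine, added just to make it runnable; note the ref parameter in the real signature):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class InterlockedSumExample
{
    // Same shape as the book's ParallelSum, but the localFinally step
    // uses Interlocked.Add instead of a lock. Each task still sums its
    // own items privately; only the final merge touches shared state.
    static int ParallelSum(IEnumerable<int> values)
    {
        int result = 0;
        Parallel.ForEach(source: values,
            localInit: () => 0,
            body: (item, state, localValue) => localValue + item,
            localFinally: localValue => Interlocked.Add(ref result, localValue));
        return result;
    }

    static void Main()
    {
        Console.WriteLine(ParallelSum(Enumerable.Range(1, 100))); // 5050
    }
}
```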

It's not about the order in which 'result' is updated; it's the race condition of updating it. Remember that the += operator is not atomic, so two threads may not see another thread's update before they touch it.

The statement result += localValue; is really saying result = result + localValue; You are both reading and updating a resource (variable) shared by different threads. This could easily lead to a race condition. lock makes sure this statement is accessed by a single thread at any given moment in time.

Related

Concurrent modification of double[][] elements without locking

I have a jagged double[][] array that may be modified concurrently by multiple threads. I should like to make it thread-safe, but if possible, without locks. The threads may well target the same element in the array, that is why the whole problem arises. I have found code to increment double values atomically using the Interlocked.CompareExchange method: Why is there no overload of Interlocked.Add that accepts Doubles as parameters?
My question is: will it stay atomic if there is a jagged array reference in Interlocked.CompareExchange? Your insights are much appreciated.
With an example:
public class Example
{
    double[][] items;

    public void AddToItem(int i, int j, double addendum)
    {
        double newCurrentValue = items[i][j];
        double currentValue;
        double newValue;
        SpinWait spin = new SpinWait();
        while (true)
        {
            currentValue = newCurrentValue;
            newValue = currentValue + addendum;
            // This is the step of which I am uncertain:
            newCurrentValue = Interlocked.CompareExchange(ref items[i][j], newValue, currentValue);
            if (newCurrentValue == currentValue) break;
            spin.SpinOnce();
        }
    }
}
Yes, it will still be atomic and thread-safe. Any calls to the same cell will be passing the same address-to-a-double. Details like whether it is in an array or a field on an object are irrelevant.
However, the line:
double newCurrentValue = items[i][j];
is not atomic - that could in theory give a torn value (especially on x86). That's actually OK in this case because in the torn-value scenario it will just hit the loop, count as a collision, and redo - this time using the known-atomic value from CompareExchange.
It seems that you want to eventually add some value to an array item.
I suppose there are only updates of values (the array itself stays the same piece of memory) and all updates are done via this AddToItem method.
So, you have to read the updated value each time (otherwise you lose changes done by other threads, or get an infinite loop).
public class Example
{
    double[][] items;

    public void AddToItem(int i, int j, double addendum)
    {
        var spin = new SpinWait();
        while (true)
        {
            var valueAtStart = Volatile.Read(ref items[i][j]);
            var newValue = valueAtStart + addendum;
            var oldValue = Interlocked.CompareExchange(ref items[i][j], newValue, valueAtStart);
            if (oldValue.Equals(valueAtStart))
                break;
            spin.SpinOnce();
        }
    }
}
Note that we have to use some volatile method to read items[i][j]. Volatile.Read is used to avoid some unwanted optimizations that are allowed under .NET Memory Model (see ECMA-334 and ECMA-335 specifications).
Since we update value atomically (via Interlocked.CompareExchange) it's enough to read items[i][j] via Volatile.Read.
If not all changes to this array are done in this method, then it's better to write a loop in which you create a local copy of the array, modify it, and update the reference to the new array (using Volatile.Read and Interlocked.CompareExchange).

Creating an example which show threads modifying a common variable

I want to create an example that shows failure when using parallel loops.
I am trying to make up an example that shows how, in Parallel.For, two threads might modify a common (non-thread-local) variable. To that end, I wrote the following example, in which a thread assigns its number to a variable, and then we add a delay to see whether any other thread will override this variable. In the output, the overriding never happens for me.
Parallel.For(0, result.Length, ii =>
{
    int threadNum = Thread.CurrentThread.ManagedThreadId;
    Thread.Sleep(10000);
    if (threadNum != Thread.CurrentThread.ManagedThreadId)
        Console.WriteLine("threadNum = {0}, Thread.CurrentThread.ManagedThreadId = {1}", threadNum, Thread.CurrentThread.ManagedThreadId);
});
One might argue that I am delaying all of the threads. So I add the delay to only one thread:
int num = -1;
Parallel.For(0, result.Length, ii =>
{
    if (num == -1)
        num = Thread.CurrentThread.ManagedThreadId;
    int threadNum = Thread.CurrentThread.ManagedThreadId;
    if (Thread.CurrentThread.ManagedThreadId == num)
    {
        Console.WriteLine("num = {0}", num);
        Thread.Sleep(10);
    }
    if (threadNum != Thread.CurrentThread.ManagedThreadId)
        Console.WriteLine("threadNum = {0}, Thread.CurrentThread.ManagedThreadId = {1}", threadNum, Thread.CurrentThread.ManagedThreadId);
});
Here it just remembers the first thread and only delays that one. Still, I don't observe any overriding of the variable 'threadNum' by other threads.
Any ideas?
Your threadNum is declared in the closure, so each thread will have its own copy.
You could move it outside the closure.
Perhaps a better way to demonstrate concurrency issues would be to have multiple threads incrementing the same variable (that is defined in a scope accessible to all threads). The ++ operator is not atomic, so you are unlikely to have the end result be NumberOfThreads * NumberOfIterationsPerThread (assuming you start at zero).
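That suggestion can be sketched as a small program (names like unsafeCount are mine): many threads increment a shared counter without synchronization, so increments overlap and get lost, while Interlocked.Increment keeps the count exact.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class LostUpdateDemo
{
    static void Main()
    {
        const int iterations = 1_000_000;

        int unsafeCount = 0;
        int safeCount = 0;

        // ++ is a read-modify-write: two threads can both read the same
        // old value, both add one, and one increment is lost.
        Parallel.For(0, iterations, _ => unsafeCount++);

        // Interlocked.Increment performs the read-modify-write atomically.
        Parallel.For(0, iterations, _ => Interlocked.Increment(ref safeCount));

        Console.WriteLine("unsafe: " + unsafeCount); // usually less than 1000000
        Console.WriteLine("safe: " + safeCount);     // always 1000000
    }
}
```

Run it a few times: the unsafe total varies from run to run, which is exactly the demonstration the question was after.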

How is a captured variable simultaneously exposed to multiple threads of execution?

In C# spec 4.0 section 7.15.5.1:
Note that unlike an uncaptured variable, a captured local variable can
be simultaneously exposed to multiple threads of execution.
What exactly does it mean by "multiple threads of execution"? Does this mean multiple threads, multiple execution paths or something else?
E.G.
private static void Main(string[] args)
{
    Action[] result = new Action[3];
    int x;
    for (int i = 0; i < 3; i++)
    {
        //int x = i * 2 + 1; //this behaves more intuitively. Outputs 1,3,5
        x = i * 2 + 1;
        result[i] = () => { Console.WriteLine(x); };
    }
    foreach (var a in result)
    {
        a(); //outputs 5 each time
    }

    //OR...
    int y = 1;
    Action one = new Action(() =>
    {
        Console.WriteLine(y); //Outputs 1
        y = 2;
    });
    Action two = new Action(() =>
    {
        Console.WriteLine(y); //Outputs 2. Working with same y
    });
    var t1 = Task.Factory.StartNew(one);
    t1.Wait();
    Task.Factory.StartNew(two);
    Console.Read();
}
Here x exhibits different behavior based upon where x is declared. In the case of y the same variable is captured and used by multiple threads, but IMO this behavior is intuitive.
What are they referring to?
"Multiple threads of execution" just means multiple threads; i.e., multiple threads that are executing concurrently. If a particular variable is exposed to multiple threads, any of those threads can read and write that variable's value.
This is potentially dangerous and should be avoided whenever possible. One possibility to avoid this, if your scenario allows, is to create local copies of variables in your tasks' methods.
If you modify the second part of your code a little:
int y = 1;
Action one = new Action(() =>
{
    Console.WriteLine(y); //Outputs 1
    y = 2;
});
Action two = new Action(() =>
{
    Console.WriteLine(y); //Outputs 2. Working with same y
    y = 1;
});
var t1 = Task.Factory.StartNew(one);
t1 = Task.Factory.StartNew(two);
t1.Wait();
t1 = Task.Factory.StartNew(one);
t1.Wait();
t1 = Task.Factory.StartNew(two);
Console.Read();
Run it a few times or put it in a loop. It will output different, seemingly random, results, e.g. 1 1 1 2 or 1 2 1 2.
Multiple threads are accessing the same variable and getting and setting it simultaneously, which may give unexpected results.
Refer to the link below.
A reference to the outer variable n is said to be captured when the delegate is created. Unlike local variables, the lifetime of a captured variable extends until the delegates that reference the anonymous methods are eligible for garbage collection.
In other words, the variable's memory location is captured when the method is created and then shared with that method, just as if a thread had access to that variable outside the anonymous method. In some cases the compiler will warn about this and suggest you copy the value to a local temp variable to avoid unintended consequences.
MSDN Anonymous Methods
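The capture behavior from the question's first loop, and the "local temp variable" fix the compiler warning suggests, can be boiled down to a runnable sketch (the variable name copy is mine):

```csharp
using System;

class ClosureCaptureDemo
{
    static void Main()
    {
        var actions = new Action[3];

        // All three lambdas capture the SAME variable x,
        // so each call prints its final value.
        int x = 0;
        for (int i = 0; i < 3; i++)
        {
            x = i * 2 + 1;
            actions[i] = () => Console.WriteLine(x);
        }
        foreach (var a in actions) a(); // 5 5 5

        // A fresh local declared inside the loop gives each lambda
        // its own captured variable (one closure per iteration).
        for (int i = 0; i < 3; i++)
        {
            int copy = i * 2 + 1;
            actions[i] = () => Console.WriteLine(copy);
        }
        foreach (var a in actions) a(); // 1 3 5
    }
}
```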

How does local initialization with Parallel ForEach work?

I am unsure about the use of the local init function in Parallel.ForEach, as described in the msdn article: http://msdn.microsoft.com/en-us/library/dd997393.aspx
Parallel.ForEach<int, long>(nums, // source collection
    () => 0, // method to initialize the local variable
    (j, loop, subtotal) => // method invoked by the loop on each iteration
    {
        subtotal += nums[j]; //modify local variable
        return subtotal; // value to be passed to next iteration
    },...
How does () => 0 initialize anything? What's the name of the variable and how can I use it in the loop logic?
With reference to the following overload of the static Parallel.ForEach method:
public static ParallelLoopResult ForEach<TSource, TLocal>(
    IEnumerable<TSource> source,
    Func<TLocal> localInit,
    Func<TSource, ParallelLoopState, TLocal, TLocal> taskBody,
    Action<TLocal> localFinally
)
In your specific example
The line:
() => 0, // method to initialize the local variable
is simply a lambda (anonymous function) which will return the constant integer zero. This lambda is passed as the localInit parameter to Parallel.ForEach - since the lambda returns an integer, it has type Func<int> and type TLocal can be inferred as int by the compiler (similarly, TSource can be inferred from the type of the collection passed as parameter source)
The return value (0) is then passed as the 3rd parameter (named subtotal) to the taskBody Func. This (0) is used as the initial seed for the body loop:
(j, loop, subtotal) =>
{
    subtotal += nums[j]; //modify local variable (Bad idea, see comment)
    return subtotal; // value to be passed to next iteration
}
This second lambda (passed to taskBody) is called N times, where N is the number of items allocated to this task by the TPL partitioner.
Each subsequent call to the second taskBody lambda will pass the new value of subTotal, effectively calculating a running partial total, for this Task. After all the items assigned to this task have been added, the third and last, localFinally function parameter will be called, again, passing the final value of the subtotal returned from taskBody. Because several such tasks will be operating in parallel, there will also need to be a final step to add up all the partial totals into the final 'grand' total. However, because multiple concurrent tasks (on different Threads) could be contending for the grandTotal variable, it is important that changes to it are done in a thread-safe manner.
(I've changed names of the MSDN variables to make it more clear)
long grandTotal = 0;
Parallel.ForEach(nums, // source collection
    () => 0, // method to initialize the local variable
    (j, loop, subtotal) => // method invoked by the loop on each iteration
        subtotal + nums[j], // value to be passed to next iteration subtotal
    // The final value of subtotal is passed to the localFinally function parameter
    (subtotal) => Interlocked.Add(ref grandTotal, subtotal)
);
In the MS Example, modification of the parameter subtotal inside the task body is a poor practice, and unnecessary. i.e. The code subtotal += nums[j]; return subtotal; would be better as just return subtotal + nums[j]; which could be abbreviated to the lambda shorthand projection (j, loop, subtotal) => subtotal + nums[j]
In General
The localInit / body / localFinally overloads of Parallel.For / Parallel.ForEach allow once-per task initialization and cleanup code to be run, before, and after (respectively) the taskBody iterations are performed by the Task.
(Noting the For range / Enumerable passed to the parallel For / Foreach will be partitioned into batches of IEnumerable<>, each of which will be allocated a Task)
In each Task, localInit will be called once, the body code will be repeatedly invoked, once per item in batch (0..N times), and localFinally will be called once upon completion.
In addition, you can pass any state required for the duration of the task (i.e. to the taskBody and localFinally delegates) via a generic TLocal return value from the localInit Func - I've called this variable taskLocals below.
Common uses of "localInit":
Creating and initializing expensive resources needed by the loop body, like a database connection or a web service connection.
Keeping Task-Local variables to hold (uncontended) running totals or collections
If you need to return multiple objects from localInit to the taskBody and localFinally, you can make use of a strongly typed class, a Tuple<,,> or, if you use only lambdas for the localInit / taskBody / localFinally, you can also pass data via an anonymous class. Note if you use the return from localInit to share a reference type among multiple tasks, that you will need to consider thread safety on this object - immutability is preferable.
Common uses of the "localFinally" Action:
To release resources such as IDisposables used in the taskLocals (e.g. database connections, file handles, web service clients, etc)
To aggregate / combine / reduce the work done by each task back into shared variable(s). These shared variables will be contended, so thread safety is a concern:
e.g. Interlocked.Increment on primitive types like integers
lock or similar will be required for write operations
Make use of the concurrent collections to save time and effort.
The taskBody is the tight part of the loop operation - you'll want to optimize this for performance.
This is all best summarized with a commented example:
public void MyParallelizedMethod()
{
    // Shared variable. Not thread safe
    var itemCount = 0;
    Parallel.ForEach(myEnumerable,
        // localInit - called once per Task.
        () =>
        {
            // Local `task` variables have no contention
            // since each Task can never be run by multiple threads concurrently
            var sqlConnection = new SqlConnection("connstring...");
            sqlConnection.Open();
            // This is the `task local` state we wish to carry for the duration of the task
            return new
            {
                Conn = sqlConnection,
                RunningTotal = 0
            };
        },
        // Task Body. Invoked once per item in the batch assigned to this task
        (item, loopState, taskLocals) =>
        {
            // ... Do some fancy Sql work here on our task's independent connection
            // Anonymous-type properties are read-only, so carry the new
            // running total forward by returning a fresh instance below.
            var runningTotal = taskLocals.RunningTotal;
            using (var command = taskLocals.Conn.CreateCommand())
            using (var reader = command.ExecuteReader(...))
            {
                if (reader.Read())
                {
                    // No contention for `taskLocal`
                    runningTotal += Convert.ToInt32(reader["countOfItems"]);
                }
            }
            // The same type as our `taskLocal` param must be returned from the body
            return new { taskLocals.Conn, RunningTotal = runningTotal };
        },
        // LocalFinally called once per Task after body completes
        // Also takes the taskLocal
        (taskLocals) =>
        {
            // Any cleanup work on our Task Locals (as you would do in a `finally` scope)
            if (taskLocals.Conn != null)
                taskLocals.Conn.Dispose();
            // Do any reduce / aggregate / synchronisation work.
            // NB: There is contention here!
            Interlocked.Add(ref itemCount, taskLocals.RunningTotal);
        });
}
And more examples:
Example of per-Task uncontended dictionaries
Example of per-Task database connections
As an extension to Honza Brestan's answer: the way Parallel.ForEach splits the work into tasks can also be important. It will group several loop iterations into a single task, so in practice localInit() is called once for every n iterations of the loop, and multiple groups can be started simultaneously.
The point of localInit and localFinally is to ensure that a parallel foreach loop can combine results from each iteration into a single result without you needing to specify lock statements in the body. To do this, you provide an initialization for the value you want to create (localInit); each body iteration can then process the local value; and you provide a method to combine values from each group (localFinally) in a thread-safe way.
If you don't need localInit for synchronising tasks, you can use lambda methods to reference values from the surrounding context as normal without any problems. See Threading in C# (Parallel.For and Parallel.ForEach) for a more in-depth tutorial on using localInit/Finally, and scroll down to Optimization with local values; Joseph Albahari is really my go-to source for all things threading.
You can get a hint on MSDN in the correct Parallel.ForEach overload.
The localInit delegate is invoked once for each thread that participates in the loop's execution and returns the initial local state for each of those tasks. These initial states are passed to the first body invocations on each task. Then, every subsequent body invocation returns a possibly modified state value that is passed to the next body invocation.
In your example () => 0 is a delegate just returning 0, so this value is used for the first iteration on each task.
From my side, a slightly simpler example:
class Program
{
    class Person
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public int Age { get; set; }
    }

    static List<Person> GetPerson() => new List<Person>()
    {
        new Person() { Id = 0, Name = "Artur", Age = 26 },
        new Person() { Id = 1, Name = "Edward", Age = 30 },
        new Person() { Id = 2, Name = "Krzysiek", Age = 67 },
        new Person() { Id = 3, Name = "Piotr", Age = 23 },
        new Person() { Id = 4, Name = "Adam", Age = 11 },
    };

    static void Main(string[] args)
    {
        List<Person> persons = GetPerson();
        int ageTotal = 0;
        Parallel.ForEach
        (
            persons,
            () => 0,
            (person, loopState, subtotal) => subtotal + person.Age,
            (subtotal) => Interlocked.Add(ref ageTotal, subtotal)
        );
        Console.WriteLine($"Age total: {ageTotal}");
        Console.ReadKey();
    }
}

Lock using atomic operations

Yes, I'm aware of that the following question could be answered with "Use the lock keyword instead" or something similar. But since this is just for "fun", I don't care about those.
I've made a simple lock using atomic operations:
public class LowLock
{
    volatile int locked = 0;

    public void Enter(Action action)
    {
        var s = new SpinWait();
        while (true)
        {
            var inLock = locked; // release-fence (read)
            // If CompareExchange equals 1, we won the race.
            if (Interlocked.CompareExchange(ref locked, 1, inLock) == 1)
            {
                action();
                locked = 0; // acquire fence (write)
                break; // exit the while loop
            }
            else s.SpinOnce(); // lost the race. Spin and try again.
        }
    }
}
I'm using the lock above in a simple for loop, that adds a string to a normal List<string>, with the purpose of making the add method thread-safe, when wrapped inside the Enter method from a LowLock.
The code looks like:
static void Main(string[] args)
{
    var low = new LowLock();
    var numbers = new List<int>();
    var sw = Stopwatch.StartNew();
    var cd = new CountdownEvent(10000);
    for (int i = 0; i < 10000; i++)
    {
        ThreadPool.QueueUserWorkItem(o =>
        {
            low.Enter(() => numbers.Add(i));
            cd.Signal();
        });
    }
    cd.Wait();
    sw.Stop();
    Console.WriteLine("Time = {0} | results = {1}", sw.ElapsedMilliseconds, numbers.Count);
    Console.ReadKey();
}
Now the tricky part is that when the main thread hits the Console.WriteLine that prints the time and the number of elements in the list, the number of elements should equal the count given to the CountdownEvent (10000). It works most of the time, but sometimes there are only 9983 elements in the list, other times 9993. What am I overlooking?
I suggest you take a look at the SpinLock structure as it appears to do exactly what you want.
That said, this all looks really dicey, but I'll have a stab at it.
You appear to be trying to use 0 to mean 'unlocked' and 1 to mean 'locked'.
In which case the line:
if (Interlocked.CompareExchange(ref locked, 1, inLock) == 1)
isn't correct at all. It's just replacing the locked variable with the value of 1 (locked) if its current value is the same as the last time you read it via inLock = locked (and acquiring the lock if so). Worse, it is entering the mutual exclusion section if the original value was 1 (locked), which is the exact opposite of what you want to be doing.
You should actually be atomically checking that the lock has not been taken (original value == 0) and take it if you can (new value == 1), by using 0 (unlocked) as both the comparand argument as well as the value to test the return value against:
if (Interlocked.CompareExchange(ref locked, 1, 0) == 0)
Now even if you fixed this, we also need to be certain that the List<T>.Add method will 'see' an up to date internal-state of the list to perform the append correctly. I think Interlocked.CompareExchange uses a full memory barrier, which should create this pleasant side-effect, but this does seem a little dangerous to rely on (I've never seen this documented anywhere).
I strongly recommend staying away from such low-lock patterns except in the most trivial (and obviously correct) of scenarios unless you are a genuine expert in low-lock programming. We mere mortals will get it wrong.
EDIT: Updated the compare value to 0.
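For completeness, here is a minimal sketch of the SpinLock suggestion from the top of this answer, protecting the same kind of List<int>.Add the question uses (the harness is mine). Note that SpinLock is a struct: pass it by reference and never copy it.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class SpinLockDemo
{
    static void Main()
    {
        // The framework's SpinLock does (correctly) what LowLock attempts.
        var spinLock = new SpinLock();
        var numbers = new List<int>();

        Parallel.For(0, 10_000, i =>
        {
            bool lockTaken = false;
            try
            {
                spinLock.Enter(ref lockTaken);
                numbers.Add(i); // protected by the spin lock
            }
            finally
            {
                if (lockTaken) spinLock.Exit();
            }
        });

        Console.WriteLine(numbers.Count); // 10000
    }
}
```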
Interlocked.CompareExchange returns the original value of the variable, so you want something like this:
public class LowLock
{
    int locked = 0;

    public void Enter(Action action)
    {
        var s = new SpinWait();
        while (true)
        {
            // If CompareExchange equals 0, we won the race.
            if (Interlocked.CompareExchange(ref locked, 1, 0) == 0)
            {
                action();
                Interlocked.Exchange(ref locked, 0);
                break; // exit the while loop
            }
            s.SpinOnce(); // lost the race. Spin and try again.
        }
    }
}
I've removed the volatile and used a full fence (Interlocked.Exchange) to reset the flag, because volatile is hard.
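Plugging the corrected LowLock into a harness like the question's (simplified here with Parallel.For; the Program class is mine) shows the full count coming back:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// The corrected LowLock from the answer above.
public class LowLock
{
    int locked = 0;

    public void Enter(Action action)
    {
        var s = new SpinWait();
        while (true)
        {
            // Atomically take the lock only if it is currently free (0).
            if (Interlocked.CompareExchange(ref locked, 1, 0) == 0)
            {
                action();
                Interlocked.Exchange(ref locked, 0); // release with a full fence
                break;
            }
            s.SpinOnce();
        }
    }
}

class Program
{
    static void Main()
    {
        var low = new LowLock();
        var numbers = new List<int>();

        Parallel.For(0, 10_000, i => low.Enter(() => numbers.Add(i)));

        Console.WriteLine(numbers.Count); // 10000 - no lost adds
    }
}
```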
