Creating an example that shows threads modifying a shared variable - C#

I want to create an example that shows how parallel loops can fail.
I am trying to construct an example showing that in Parallel.For, two threads might modify a common (non-thread-local) variable. To that end, I wrote the following code, in which a thread assigns its ID to a variable and then sleeps, so we can see whether another thread overwrites that variable. In my output, the overwrite never happens.
Parallel.For(0, result.Length, ii =>
{
    int threadNum = Thread.CurrentThread.ManagedThreadId;
    Thread.Sleep(10000);
    if (threadNum != Thread.CurrentThread.ManagedThreadId)
        Console.WriteLine("threadNum = {0}, Thread.CurrentThread.ManagedThreadId = {1}", threadNum, Thread.CurrentThread.ManagedThreadId);
});
One might argue that I am delaying all of the threads. So I add the delay to only one thread:
int num = -1;
Parallel.For(0, result.Length, ii =>
{
    if (num == -1)
        num = Thread.CurrentThread.ManagedThreadId;
    int threadNum = Thread.CurrentThread.ManagedThreadId;
    if (Thread.CurrentThread.ManagedThreadId == num)
    {
        Console.WriteLine("num = {0}", num);
        Thread.Sleep(10);
    }
    if (threadNum != Thread.CurrentThread.ManagedThreadId)
        Console.WriteLine("threadNum = {0}, Thread.CurrentThread.ManagedThreadId = {1}", threadNum, Thread.CurrentThread.ManagedThreadId);
});
Here it just remembers the first thread and only delays that one. Still, I don't observe any overwriting of the variable threadNum by other threads.
Any ideas?

Your threadNum is declared in the closure, so each thread will have its own copy.
You could move it outside the closure.
Perhaps a better way to demonstrate concurrency issues would be to have multiple threads incrementing the same variable (that is defined in a scope accessible to all threads). The ++ operator is not atomic, so you are unlikely to have the end result be NumberOfThreads * NumberOfIterationsPerThread (assuming you start at zero).
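That suggestion could look like the sketch below (the names and counts are mine): several tasks bump a shared counter with ++, which compiles to a non-atomic read-modify-write, so concurrent updates can be lost; an Interlocked.Increment counter runs alongside for comparison and always reaches the full total.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static int unsafeCount = 0;
    static int safeCount = 0;

    static void Main()
    {
        const int threads = 4, iterations = 100_000;

        var tasks = new Task[threads];
        for (int t = 0; t < threads; t++)
        {
            tasks[t] = Task.Run(() =>
            {
                for (int i = 0; i < iterations; i++)
                {
                    unsafeCount++;                        // racy: read, add, write
                    Interlocked.Increment(ref safeCount); // atomic
                }
            });
        }
        Task.WaitAll(tasks);

        Console.WriteLine($"expected = {threads * iterations}");
        Console.WriteLine($"unsafe   = {unsafeCount}"); // usually less than 400000
        Console.WriteLine($"safe     = {safeCount}");   // always 400000
    }
}
```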

Related

starting tasks with lambda expressions in loops in C#

In preparation for a C# exam at university, I found the following multiple-choice question:
Client applications call your library by passing a set of operations
to perform. Your library must ensure that system resources are most
effectively used. Jobs may be scheduled in any order, but your
library must log the position of each operation. You have declared this
code:
public IEnumerable<Task> Execute(Action[] jobs)
{
    var tasks = new Task[jobs.Length];
    for (var i = 0; i < jobs.Length; i++)
    {
        /* COMPLETION NEEDED */
    }
    return tasks;
}

public void RunJob(Action job, int index)
{
    // implementation omitted
}
Complete the method by inserting code in the for loop. Choose the
correct answer.
1.)
tasks[i] = new Task((idx) => RunJob(jobs[(int)idx], (int)idx), i);
tasks[i].Start();
2.)
tasks[i] = new Task(() => RunJob(jobs[i], i));
tasks[i].Start();
3.)
tasks[i] = Task.Run(() => RunJob(jobs[i], i));
I have opted for answer 3 since Task.Run() queues the specified work on the thread pool and returns a Task object that represents the work.
But the correct answer was 1, using the Task(Action, Object) constructor. The explanation says the following:
In answer 1, the second argument to the constructor is passed as the
only argument to the Action delegate. The current value of the
i variable is captured when the value is boxed and passed to the Task
constructor.
Answer 2 and 3 use a lambda expression that captures the i variable
from the enclosing method. The lambda expression will probably return
the final value of i, in this case 10, before the operating system
preempts the current thread and begins every task delegate created by
the loop. The exact value cannot be determined because the OS
schedules thread execution based on many factors external to your
program.
While I perfectly understand the explanation of answer 1, I don't get the point in the explanations for answer 2 and 3. Why would the lambda expression return the final value?
In options 2 and 3, the lambda captures the original i variable used in the for loop. There is no guarantee of when the tasks will run on the thread pool, so a possible behavior is: the for loop finishes, i = 10, and only then do the tasks start executing. All of them will then use i = 10.
You can see similar behavior here:
void Do()
{
    var actions = new List<Action>();
    for (int i = 0; i < 3; i++)
    {
        actions.Add(() => Console.WriteLine(i));
    }
    // actions executed after loop is finished
    foreach (var a in actions)
    {
        a();
    }
}
Output is:
3
3
3
You can fix it like this:
for (int i = 0; i < 3; i++)
{
    var local = i;
    actions.Add(() => Console.WriteLine(local));
}

Passing a parameter to Task.Run()

Consider the following code:
attempt = 0;
for (int counter = 0; counter < 8; counter++)
{
    if (attempt < totalitems)
    {
        Tasklist<output>.Add(Task.Run(() =>
        {
            return someasynctask(inputList[attempt]);
        }));
    }
    else
    {
        break;
    }
    attempt++;
}
await Task.WhenAll(Tasklist).ConfigureAwait(false);
await Task.WhenAll(Tasklist).ConfigureAwait(false);
I want to have for example 8 concurrent tasks, each working on different inputs concurrently, and finally check the result, when all of them have finished.
Because I'm not awaiting the completion of Task.Run(), attempt is incremented before the tasks start, so some items in inputList may be skipped or processed two or more times (because of the uncertainty in attempt's value).
How to do that?
The problem lies in the use of a "lambda": when Task.Run(() => return someasynctask(inputList[attempt])); is reached during execution, the variable attempt is captured, not its value (i.e., it is a "closure"). Consequently, when the lambda executes, the value the variable holds at that specific moment is used.
Just add a temporary copy of the variable before your lambda, and use that. E.g.
if (attempt < totalitems)
{
    int localAttempt = attempt;
    Tasklist<output>.Add(Task.Run(() =>
    {
        return someasynctask(inputList[localAttempt]);
    }));
}
Thanks to @gobes for his answer:
Try this (note that the copy must be made before the lambda, not inside it; a copy taken inside the lambda would still read attempt at execution time):
attempt = 0;
for (int counter = 0; counter < 8; counter++)
{
    if (attempt < totalitems)
    {
        int tmpAttempt = attempt; // copy taken BEFORE the lambda captures it
        Tasklist<output>.Add(Task.Run(() =>
        {
            return someasynctask(inputList[tmpAttempt]);
        }));
    }
    else
    {
        break;
    }
    attempt++;
}
await Task.WhenAll(Tasklist).ConfigureAwait(false);
Actually, what the compiler is doing is extracting your lambda into a method, located in an automagically generated class, which references the attempt variable. This is the important point: the generated code only references the variable from another class; it doesn't copy it. So every change to attempt is seen by the method.
What happens during the execution is roughly this:
enter the loop with attempt = 0
add a call of the lambda-like-method to your tasklist
increase attempt
repeat
After the loop, you have some method calls awaiting (no pun intended) execution, but each one references THE SAME VARIABLE, therefore sharing its value - the last one assigned to it.
For more details, I really recommend reading C# in depth, or some book of the same kind - there are a lot of resources about closure in C# on the web :)

Why is a lock required here?

I'm looking at an example on p. 40 of Stephen Cleary's book that is
// Note: this is not the most efficient implementation.
// This is just an example of using a lock to protect shared state.
static int ParallelSum(IEnumerable<int> values)
{
    object mutex = new object();
    int result = 0;
    Parallel.ForEach(source: values,
        localInit: () => 0,
        body: (item, state, localValue) => localValue + item,
        localFinally: localValue =>
        {
            lock (mutex)
                result += localValue;
        });
    return result;
}
and I'm a little confused why the lock is needed. Because if all we're doing is summing a collection of ints, say {1, 5, 6}, then we shouldn't need to care about the shared sum result being incremented in any order.
(1 + 5) + 6 = 1 + (5 + 6) = (1 + 6) + 5 = ...
Can someone explain where my thinking here is flawed?
I guess I'm also a little confused about why the body of the method can't simply be
int result = 0;
Parallel.ForEach(values, (val) => { result += val; });
return result;
Operations such as addition are not atomic and thus are not thread-safe. In the example code, if the lock were omitted, it is entirely possible for two addition operations to execute near-simultaneously, with one of the updates being lost. There is a thread-safe method for adding to an integer: Interlocked.Add(ref int, int). Since that isn't used here, the lock is required to ensure that at most one non-atomic addition operation is in progress at a time (in sequence, not in parallel).
It's not about the order in which result is updated; it's the race condition in updating it. Remember that the += operator is not atomic, so two threads may not see each other's update before they touch the variable.
The statement result += localValue; really means result = result + localValue;. You are both reading and updating a resource (variable) shared by different threads, which can easily lead to a race condition. lock makes sure this statement is accessed by a single thread at any given moment.
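The Interlocked.Add alternative mentioned above could look like the sketch below: same shape as the book's example, but the shared total is updated with an atomic add instead of a lock. (Note that a captured local becomes a field of a compiler-generated class, which is why it can be passed by ref here.)

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class Sums
{
    static int ParallelSum(IEnumerable<int> values)
    {
        int result = 0;
        Parallel.ForEach(source: values,
            localInit: () => 0,
            body: (item, state, localValue) => localValue + item,
            // Atomically fold each partition's subtotal into the shared total.
            localFinally: localValue => Interlocked.Add(ref result, localValue));
        return result;
    }

    static void Main()
    {
        Console.WriteLine(ParallelSum(Enumerable.Range(1, 100))); // 5050
    }
}
```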

Are Dictionary and SortedSet thread safe for READERS in C#?

If I don't modify a collection is it safe for 2 threads to run
foreach (var el in collection)
Console.Write(el);
at the same time?
The docs are somewhat ambiguous. They do say "not thread safe", but they don't clearly say whether the collections are thread-safe for readers only.
MSDN says :
"To allow the collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization."
What about just for reading?
I did a heavy pounding test, hoping to catch threads trampling over each other, but so far List, Dictionary, and SortedSet all seem not to get confused by multiple readers enumerating over them. Not enough to declare them thread-safe for reading, but some food for thought.
SortedSet<int> m_List = new SortedSet<int>();

void Test()
{
    // Put ~10,000,000 integers into a collection
    for (int i = 1; i < 10000000; i++)
        m_List.Add(i);

    // Start 9 threads
    for (int i = 1; i < 10; i++)
        (new Thread(Act)).Start(i);
}

Random rand = new Random();

private void Act(object id)
{
    Console.WriteLine("---- Started Thread: {0}", id);
    int i = 0;
    int r = rand.Next(10000);
    foreach (int j in m_List)
    {
        if (j != ++i) // the thread safety check:
        {
            Console.WriteLine("NOT THREAD SAFE!!!!!!!!!!!!");
        }
        if (j % 10000 == r)
        {
            //Console.WriteLine("{0} yielded at {1}", id, j);
            Thread.Yield(); // to encourage threads intermingling.
        }
    }
    Console.WriteLine("---- Finished Thread: {0}", id);
}
For Dictionary<TKey,TValue>, the documentation states clearly:
A Dictionary can support multiple readers concurrently,
as long as the collection is not modified.
So if you never modify the dictionary during read, it is fine to have multiple readers reading from the dictionary at once.
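A minimal sketch of that documented guarantee (sizes and names are mine): the dictionary is fully populated before any reader starts, no writes happen afterwards, and several concurrent readers each compute the same total.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class Program
{
    static long[] sums = new long[8];

    static void Main()
    {
        // Populate before any reader starts; no writes happen afterwards.
        var map = new Dictionary<int, int>();
        for (int i = 0; i < 100_000; i++)
            map[i] = i * 2;

        // Multiple concurrent readers over an unchanging dictionary:
        // safe per the guarantee quoted above.
        Parallel.For(0, sums.Length, r =>
        {
            long sum = 0;
            foreach (var kv in map)
                sum += kv.Value;
            sums[r] = sum;
        });

        Console.WriteLine(string.Join(", ", sums)); // every reader saw the same total
    }
}
```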
SortedSet<T> instance members are not guaranteed to be thread safe, as specified in the documentation.

Lock using atomic operations

Yes, I'm aware that the following question could be answered with "Use the lock keyword instead" or something similar. But since this is just for "fun", I don't care about those.
I've made a simple lock using atomic operations:
public class LowLock
{
    volatile int locked = 0;

    public void Enter(Action action)
    {
        var s = new SpinWait();
        while (true)
        {
            var inLock = locked; // release-fence (read)

            // If CompareExchange equals 1, we won the race.
            if (Interlocked.CompareExchange(ref locked, 1, inLock) == 1)
            {
                action();
                locked = 0; // acquire fence (write)
                break; // exit the while loop
            }
            else s.SpinOnce(); // lost the race. Spin and try again.
        }
    }
}
I'm using the lock above in a simple for loop, that adds a string to a normal List<string>, with the purpose of making the add method thread-safe, when wrapped inside the Enter method from a LowLock.
The code looks like:
static void Main(string[] args)
{
    var low = new LowLock();
    var numbers = new List<int>();
    var sw = Stopwatch.StartNew();
    var cd = new CountdownEvent(10000);
    for (int i = 0; i < 10000; i++)
    {
        ThreadPool.QueueUserWorkItem(o =>
        {
            low.Enter(() => numbers.Add(i));
            cd.Signal();
        });
    }
    cd.Wait();
    sw.Stop();
    Console.WriteLine("Time = {0} | results = {1}", sw.ElapsedMilliseconds, numbers.Count);
    Console.ReadKey();
}
Now the tricky part is that when the main thread hits the Console.WriteLine that prints time and number of elements in the list, the number of elements should be equal to the count given to the CountdownEvent (10000) - It works most of the time, but sometimes there's only 9983 elements in the list, other times 9993. What am I overlooking?
I suggest you take a look at the SpinLock structure as it appears to do exactly what you want.
That said, this all looks really dicey, but I'll have a stab at it.
You appear to be trying to use 0 to mean 'unlocked' and 1 to mean 'locked'.
In which case the line:
if (Interlocked.CompareExchange(ref locked, 1, inLock) == 1)
isn't correct at all. It's just replacing the locked variable with the value of 1 (locked) if its current value is the same as the last time you read it via inLock = locked (and acquiring the lock if so). Worse, it is entering the mutual exclusion section if the original value was 1 (locked), which is the exact opposite of what you want to be doing.
You should actually be atomically checking that the lock has not been taken (original value == 0) and take it if you can (new value == 1), by using 0 (unlocked) as both the comparand argument as well as the value to test the return value against:
if (Interlocked.CompareExchange(ref locked, 1, 0) == 0)
Now even if you fixed this, we also need to be certain that the List<T>.Add method will 'see' an up to date internal-state of the list to perform the append correctly. I think Interlocked.CompareExchange uses a full memory barrier, which should create this pleasant side-effect, but this does seem a little dangerous to rely on (I've never seen this documented anywhere).
I strongly recommend staying away from such low-lock patterns except in the most trivial (and obviously correct) of scenarios unless you are a genuine expert in low-lock programming. We mere mortals will get it wrong.
EDIT: Updated the compare value to 0.
Interlocked.CompareExchange returns the original value of the variable, so you want something like this:
public class LowLock
{
    int locked = 0;

    public void Enter( Action action )
    {
        var s = new SpinWait();
        while ( true )
        {
            // If CompareExchange returns 0, we won the race.
            if ( Interlocked.CompareExchange( ref locked, 1, 0 ) == 0 )
            {
                action();
                Interlocked.Exchange( ref locked, 0 );
                break; // exit the while loop
            }
            s.SpinOnce(); // lost the race. Spin and try again.
        }
    }
}
I've removed the volatile and used a full fence (Interlocked.Exchange) to reset the flag, because volatile is hard to reason about.
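For comparison, the framework's SpinLock mentioned at the top of the first answer handles the compare-exchange and memory-barrier details for you. A sketch of the same Enter wrapper built on it (the class name is mine):

```csharp
using System;
using System.Threading;

public class SpinLowLock
{
    // SpinLock is a struct: keep it in a (non-readonly) field and never copy it.
    private SpinLock spin = new SpinLock(enableThreadOwnerTracking: false);

    public void Enter(Action action)
    {
        bool taken = false;
        try
        {
            spin.Enter(ref taken); // taken is set to true once the lock is held
            action();
        }
        finally
        {
            if (taken) spin.Exit();
        }
    }
}
```

With this in place of LowLock, the CountdownEvent test above should always report 10000 items.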
