Parallel Programming Race Conditions - C#

I am working through a Parallel Programming example on race conditions. In the example they are demonstrating the isolation pattern to deal with race conditions.
Why is it that, in the following example, a race condition does not occur when the task is created and the stateObject is passed as part of the task creation?
I understand that we use the isolated balance (isobalance) to do the updating, but at the point where we assign isobalance = (int)stateObject, could another task's finished balance not already be there, i.e. not 0 but 100?
In other words, if there were enough tasks and the task scheduler started an early task that finished while a later task was being created and assigned account.Balance, wouldn't that later task start from 100 or so, rather than 0?
class BankAccount
{
public int Balance { get; set; }
}
class Program
{
static void Main(string[] args)
{
var account = new BankAccount();
var tasks = new Task<int>[1000];
for (int i = 0; i < 1000; i++)
{
tasks[i] = new Task<int>((stateObject)=>
{
int isobalance = (int) stateObject;
for (int j = 0; j < 1000; j++)
{
isobalance ++;
}
return isobalance;
}, account.Balance);
tasks[i].Start();
}
Task.WaitAll(tasks);
for (int i = 0; i < 1000; i++)
{
account.Balance += tasks[i].Result;
}
Console.WriteLine("Epectecd valeu {0}, Counter value {1}",1000000,account.Balance);
// wait for input before exiting
Console.WriteLine("Press enter to finish");
Console.ReadLine();
}
}

The method that you have passed to the Task constructor does not update account.Balance; it only uses the initial value of account.Balance. An int is passed by value. From MSDN:
A value-type variable contains its data directly as opposed to a reference-type variable, which contains a reference to its data. Therefore, passing a value-type variable to a method means passing a copy of the variable to the method. Any changes to the parameter that take place inside the method have no effect on the original data stored in the variable. If you want the called method to change the value of the parameter, you have to pass it by reference, using the ref or out keyword. For simplicity, the following examples use ref.
Therefore account.Balance is not updated until after Task.WaitAll(tasks); is called. Task.WaitAll() causes the code to stop there until all tasks have finished. Only after that, once all the results have been computed, will account.Balance be updated with the values returned from tasks[i].Result.
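For contrast, here is a minimal sketch (not from the example) of what the racy version would look like if the tasks updated the shared balance directly instead of working on an isolated copy; this is exactly the situation the isolation pattern avoids:

// RACY: many tasks read-modify-write the same property concurrently,
// so increments can be lost and the final balance usually ends up below 1,000,000.
var account = new BankAccount();
var tasks = new Task[1000];
for (int i = 0; i < 1000; i++)
{
    tasks[i] = Task.Factory.StartNew(() =>
    {
        for (int j = 0; j < 1000; j++)
        {
            account.Balance++; // not atomic: read, increment, write
        }
    });
}
Task.WaitAll(tasks);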

It does not cause a race condition because you only copy the current value of account.Balance and assign it to a local variable inside the task. Upon creation, each task simply copies the current value of account.Balance onto its stack and then into a local variable, but no task actually changes it; they all work on their local copy. Imagine this to be like a method call: when you pass an int to a method, it is copied by value, and even if you modify it inside the method you will not see any changes outside.
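To make the method-call analogy concrete, here is a minimal, self-contained sketch (the names are illustrative, not from the example above):

using System;

class PassByValueDemo
{
    static void Increment(int copy)
    {
        // 'copy' is an independent copy of the caller's variable.
        copy++;
    }

    static void Main()
    {
        int balance = 0;
        Increment(balance);
        Console.WriteLine(balance); // Still prints 0: only the copy was changed.
    }
}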
Having said this, my favorite example to illustrate what you are asking is the very common "assign a unique id to each thread" problem. Consider these two cases:
Not thread-safe:
for(int i = 0; i < n; i++)
{
Thread t = new Thread(
o =>
{
int index = i;
// do whatever
});
t.Start();
}
This is not thread-safe because the main thread continues to loop over i while the threads are using it inside their code. When each thread t actually starts, i may have already reached n.
Thread-safe:
for(int i = 0; i < n; i++)
{
Thread t = new Thread(
o =>
{
int index = (int)o;
// do whatever
});
t.Start(i);
}
This is thread-safe given my initial explanation. Each thread receives the current value of i upon creation and copies it into a local variable, so the threads will correctly have ids 0, 1, ..., n-1. I hope this example makes it clearer.
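The same idea applies to tasks. Here is a hedged sketch of the thread-safe version using the state-object overload of Task.Factory.StartNew, in the same spirit as the question's code (the names are illustrative):

using System;
using System.Threading.Tasks;

class UniqueIdWithTasks
{
    static void Main()
    {
        int n = 10;
        var tasks = new Task[n];
        for (int i = 0; i < n; i++)
        {
            // The current value of i is copied into the task's state object
            // at creation time, so each task sees its own id.
            tasks[i] = Task.Factory.StartNew(state =>
            {
                int index = (int)state;
                Console.WriteLine("Task id: {0}", index);
            }, i);
        }
        Task.WaitAll(tasks);
    }
}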

Related

Async and Await - How is order of execution maintained?

I am currently reading some topics about the Task Parallel Library and asynchronous programming with async and await. The book "C# 5.0 in a Nutshell" states that when awaiting an expression using the await keyword, the compiler transforms the code into something like this:
var awaiter = expression.GetAwaiter();
awaiter.OnCompleted (() =>
{
var result = awaiter.GetResult();
// ... the code that follows the await continues here ...
});
Let's assume we have this asynchronous function (also from the book referred to above):
async Task DisplayPrimeCounts()
{
for (int i = 0; i < 10; i++)
Console.WriteLine (await GetPrimesCountAsync (i*1000000 + 2, 1000000) +
" primes between " + (i*1000000) + " and " + ((i+1)*1000000-1));
Console.WriteLine ("Done!");
}
The call of the 'GetPrimesCountAsync' method will be enqueued and executed on a pooled thread. In general, invoking multiple threads from within a for loop has the potential to introduce race conditions.
So how does the CLR ensure that the requests will be processed in the order they were made? I doubt that the compiler simply transforms the code into the above manner, since this would decouple the 'GetPrimesCountAsync' method from the for loop.
Just for the sake of simplicity, I'm going to replace your example with one that's slightly simpler, but has all of the same meaningful properties:
async Task DisplayPrimeCounts()
{
for (int i = 0; i < 10; i++)
{
var value = await SomeExpensiveComputation(i);
Console.WriteLine(value);
}
Console.WriteLine("Done!");
}
The ordering is all maintained because of the definition of your code. Let's imagine stepping through it.
This method is first called
The first line of code is the for loop, so i is initialized.
The loop check passes, so we go to the body of the loop.
SomeExpensiveComputation is called. It should return a Task<T> very quickly, but the work that it's doing will keep going on in the background.
The rest of the method is added as a continuation to the returned task; it will continue executing when that task finishes.
After the task returned from SomeExpensiveComputation finishes, we store the result in value.
value is printed to the console.
GOTO 3; note that the existing expensive operation has already finished before we get to step 4 for the second time and start the next one.
As far as how the C# compiler actually accomplishes step 5, it does so by creating a state machine. Basically every time there is an await there's a label indicating where it left off, and at the start of the method (or after it's resumed after any continuation fires) it checks the current state, and does a goto to the spot where it left off. It also needs to hoist all local variables into fields of a new class so that the state of those local variables is maintained.
Now this transformation isn't actually done in C# code, it's done in IL, but this is sort of the moral equivalent of the code I showed above in a state machine. Note that this isn't valid C# (you cannot goto into a for loop like this), but that restriction doesn't apply to the IL code that is actually used. There are also going to be differences between this and what C# actually does, but it should give you a basic idea of what's going on here:
internal class Foo
{
public int i;
public long value;
private int state = 0;
private Task<int> task;
int result0;
public Task Bar()
{
var tcs = new TaskCompletionSource<object>();
Action continuation = null;
continuation = () =>
{
try
{
if (state == 1)
{
goto state1;
}
for (i = 0; i < 10; i++)
{
task = SomeExpensiveComputation(i); // uses the hoisted field, not a new local
var awaiter = task.GetAwaiter();
if (!awaiter.IsCompleted)
{
awaiter.OnCompleted(() =>
{
result0 = awaiter.GetResult();
continuation();
});
state = 1;
return;
}
else
{
result0 = awaiter.GetResult();
}
state1:
value = result0;
Console.WriteLine(value);
}
Console.WriteLine("Done!");
tcs.SetResult(true);
}
catch (Exception e)
{
tcs.SetException(e);
}
};
continuation();
return tcs.Task;
}
}
Note that I've ignored task cancellation for the sake of this example, I've ignored the whole concept of capturing the current synchronization context, there's a bit more going on with error handling, etc. Don't consider this a complete implementation.
The call of the 'GetPrimesCountAsync' method will be enqueued and executed on a pooled thread.
No. await does not initiate any kind of background processing. It waits for existing processing to complete. It is up to GetPrimesCountAsync to do that (e.g. using Task.Run). It's clearer this way:
var myRunningTask = GetPrimesCountAsync();
await myRunningTask;
The loop only continues when the awaited task has completed. There is never more than one task outstanding.
So how does the CLR ensure that the requests will be processed in the order they were made?
The CLR is not involved.
I doubt that the compiler simply transforms the code into the above manner, since this would decouple the 'GetPrimesCountAsync' method from the for loop.
The transform that you show is basically right, but notice that the next loop iteration is not started right away but in the callback. That's what serializes execution.
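To illustrate that point, here is a hedged sketch (assuming, as in the book's example, that GetPrimesCountAsync returns Task<int>) contrasting the serialized loop with a version that deliberately starts all the calls before awaiting them:

// Serialized: each iteration awaits before the next call is made,
// so there is never more than one outstanding task.
async Task DisplayPrimeCountsSequentially()
{
    for (int i = 0; i < 10; i++)
    {
        int count = await GetPrimesCountAsync(i * 1000000 + 2, 1000000);
        Console.WriteLine(count);
    }
}

// Concurrent: all calls are started first, then awaited together.
// Results are still reported in order because the array preserves it.
async Task DisplayPrimeCountsConcurrently()
{
    var tasks = new Task<int>[10];
    for (int i = 0; i < 10; i++)
        tasks[i] = GetPrimesCountAsync(i * 1000000 + 2, 1000000);

    int[] counts = await Task.WhenAll(tasks);
    foreach (int count in counts)
        Console.WriteLine(count);
}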

How do I execute multiple threads with the same function but different parameter concurrently

for (long key = 0; key < 5; key++)
{
var processingThread = new Thread(() => Setup(key));
processingThread.Start();
}
I want to execute the Setup(key) function with each key value, but at the same time, on multiple windows.
You need to capture a local copy of key within the for loop, otherwise by the time the threads actually call Setup the value of key has become 5. If you capture a local copy, then that value doesn't change and all works as expected.
for (long key = 0; key < 5; key++)
{
var localKey = key;
var processingThread = new Thread(() => Setup(localKey));
processingThread.Start();
}
Check out the Parallel.ForEach() and Parallel.For() methods.
https://msdn.microsoft.com/en-us/library/dd460720(v=vs.110).aspx
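For example, a minimal sketch of the same work using Parallel.For (assuming Setup accepts a long, as in the question):

using System.Threading.Tasks;

// Runs Setup(0) .. Setup(4) on thread-pool threads; the loop value is passed
// to each iteration by value, so there is no captured-variable problem.
Parallel.For(0L, 5L, key => Setup(key));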
Explicitly creating a new thread has a large overhead and should be avoided. Only do it if you have a good reason and have already considered using a thread-pool-based solution (such as the TPL or similar).
for (long key = 0; key < 5; key++)
{
var processingThread = new Thread(Setup);
processingThread.Start(key);
}
The Setup parameter type must be changed to object (and cast, if needed).
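A minimal sketch of what that Setup signature could look like (the body is illustrative):

static void Setup(object state)
{
    // Thread.Start(object) hands the argument over as object, so cast it back.
    long key = (long)state;
    Console.WriteLine("Setting up for key {0}", key);
}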
If Parallel.For() does not do the trick, you can pass them all an AutoResetEvent.
Call WaitOne() in all your delegates and then call Set() after creating all the threads.
Please take notice in the fact that the system does
// THIS ISN'T TESTED AND IS WRITTEN HERE, SO MIND THE SYNTAX, THIS MIGHT NOT COMPILE !!
AutoResetEvent handle = new AutoResetEvent(true);
for (long key = 0; key < 5; key++)
{
    var localKey = key; // local copy, so the closure does not see a changed key
    var processingThread = new Thread(() =>
    {
        handle.WaitOne();
        Setup(localKey);
    });
    processingThread.Start();
}
handle.Set();

How to ensure two threads cannot access the same folder

I am writing a multithreaded application; it is a Windows service. I have 20 folders. I create 15 threads in the OnStart method. I want the 15 threads to go to folders 1, 2, 3, ..., 15 sequentially. When one thread finishes, it creates another thread, and that new thread must go to the 16th folder. It must not go to folders that are already being worked on. How can I do this? That is, how can I be sure that two threads do not go to the same folder?
Could you not just have a static variable that would be a counter for the folder name?
Something like:
private static int _folderNameCounter = 0;
private static readonly object _padlock = new object();
public static int GetFolderCounter()
{
lock(_padlock)
{
_folderNameCounter++;
return _folderNameCounter;
}
}
public static void Main()
{
for(int i = 0; i < 20; i++)
{
Task.Factory.StartNew(() =>
{
var path = @"c:\temp\" + GetFolderCounter();
Directory.CreateDirectory(path);
// add your own code for the thread here
});
}
}
Note: I've used the TPL instead of using threads directly since I think the TPL is a better solution. You may of course have specific requirements which mean that threads are the better solution for your case.
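As an aside, a counter like this can also be implemented without an explicit lock. A minimal sketch using Interlocked.Increment (not part of the original answer):

using System.Threading;

private static int _folderNameCounter = 0;

public static int GetFolderCounter()
{
    // Interlocked.Increment atomically increments the field and returns the
    // new value, so no lock object is needed for a simple counter.
    return Interlocked.Increment(ref _folderNameCounter);
}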
Use a BlockingCollection<T> and fill the collection with the folder numbers. Each task handles an item of the collection, and the collection itself handles the multi-threading aspect so that each item is only handled by one consumer.
// Define the blocking collection with a maximum size of 15.
const int maxSize = 15;
var data = new BlockingCollection<int>(maxSize);
// Add the data to the collection.
// Do this in a separate task since BlockingCollection<T>.Add()
// blocks when the specified capacity is reached.
var addingTask = Task.Factory.StartNew(() => {
    for (int i = 1; i <= 20; i++) {
        data.Add(i);
    }
});
// Define a signal-to-stop bool
var stop = false;
// Create 15 handle tasks.
// You can change this to threads if necessary, but the general idea is that
// each consumer continues to consume until the stop-boolean is set.
// The Take method returns only when an item is/becomes available.
for (int t = 0; t < maxSize; t++) {
new Task(() => {
while (!stop) {
int item = data.Take();
// Note: the Take method will block until an item comes available.
HandleThisItem(item);
}
}).Start();
}
// Wait until you need to stop. When you do, set stop true
stop = true;
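One caveat with the stop flag: Take() blocks indefinitely while the collection is empty, so a consumer may never get back to checking stop. A hedged alternative sketch is to signal completion through the collection itself:

// Producer side: mark the collection as complete once all folder numbers are added.
data.CompleteAdding();

// Consumer side: GetConsumingEnumerable() keeps yielding items until the
// collection is marked complete and empty, so no separate stop flag is needed.
foreach (int item in data.GetConsumingEnumerable())
{
    HandleThisItem(item);
}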

c# multithreading unit test

I'm looking for some advice on writing unit tests for multi-threading in C#. Specifically, I want to check that an object is being locked correctly. However, in order to test this I need to assert against that object, which may have changed before the assert(s) are implemented (with the lock being released, another thread may change the object).
Using AutoResetEvent I have been able to control the flow in the unit test side, allowing me to effectively emulate the lock in the tested object. The issue with this is that I no longer need the lock for the test to pass.
What I'd like is to have a test that passes with the lock in and fails with it out.
Obviously, this is a simplified example. It's also .Net 4, so there is no async and await option (although if that would help, changing could be an option).
Suggestions welcome. Thanks.
Below is example code:
public class BasicClass
{
public int Val
{
get { lock (lockingObject) { return val; } }
private set { lock (lockingObject) { val = value; } }
}
private int val;
public BasicClass(int val = -1)
{
Val = val;
}
public void SetValue(int val)
{
Val = val;
}
private object lockingObject = new object();
}
This is the (NUnit) unit test:
[Test]
public void BasicClassTest()
{
for (int repeat = 0; repeat < 1000; repeat++) // Purely for dev testing and can get away with as no SetUp/TearDown
{
BasicClass b = new BasicClass();
int taskCount = 10;
Task[] tasks = new Task[taskCount];
var taskControl = new AutoResetEvent(false);
var resultControl = new AutoResetEvent(false);
int expected = -1;
for (int i = 0; i < taskCount; i++)
{
int temp = i;
tasks[temp] = new Task(() =>
{
taskControl.WaitOne(); // Hold the task here until set
b.SetValue(temp);
expected = temp;
resultControl.Set(); // Allows asserts to be processed.
});
}
// Start each task
foreach (var t in tasks)
t.Start();
// Assert results as tasks finish.
for (int i = 0; i < taskCount; i++)
{
taskControl.Set(); // Unblock, allow one thread to proceed.
resultControl.WaitOne(); // Wait for a task to set an expected value
Assert.That(b.Val, Is.EqualTo(expected));
Console.WriteLine("b.Val = {0}, expected = {1}", b.Val, expected); // Output values to ensure they are changing
}
// Wait for all tasks to finish, but not forever.
Task.WaitAll(tasks, 1000);
}
}
As with other system functions like DateTime.Now, I prefer to abstract threading functions like sleep, mutexes, signals and so on (yes, I know there are libraries for DateTime.Now and other system functions, but I think abstracting them is the better way).
So you end up with a kind of IThreading interface with methods for Sleep and so on. The disadvantage is that you can't use the handy lock statement in this case. Instead, you could have a Lock(object) method that returns an IDisposable, which you can use with the "using" statement to get nearly the same comfort.
using(threading.Lock(lockObject))
{
...
}
Now you can create a real implementation with the real functions, and a mock for your unit tests, which is injected. For example, in your tests you could shortcut any sleep call to a few ms in order to speed them up. And you can verify that all the functions you expected were called.
Sounds like a lot of work? Think about how much time you will spend debugging some nasty threading issue that crashes your production system from time to time, with your customer running amok.
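A minimal sketch of what such an abstraction could look like (IThreading, RealThreading and MonitorLock are illustrative names, not an existing library API):

using System;
using System.Threading;

public interface IThreading
{
    void Sleep(int milliseconds);
    IDisposable Lock(object lockObject);
}

public class RealThreading : IThreading
{
    public void Sleep(int milliseconds)
    {
        Thread.Sleep(milliseconds);
    }

    public IDisposable Lock(object lockObject)
    {
        return new MonitorLock(lockObject);
    }

    private sealed class MonitorLock : IDisposable
    {
        private readonly object _lockObject;

        public MonitorLock(object lockObject)
        {
            _lockObject = lockObject;
            Monitor.Enter(_lockObject); // acquire, like entering a lock block
        }

        public void Dispose()
        {
            Monitor.Exit(_lockObject); // release when the using block exits
        }
    }
}

In a unit test you would inject a mock IThreading instead, which lets the test observe and control the locking and sleeping behaviour.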

Passing value parameter to Task in c#

I have an issue with passing a long by value to a Task.
I have a list of IDs; I loop through each one, assign it to a local variable, then pass it as a parameter to a new Task. I do not wait for the task to complete before looping round and processing the next ID. I keep an array of Tasks, but this is irrelevant.
loop
long ID = list[index];
task[index] = Task.Factory.StartNew(() => doWork(ID));
end loop
If the list contained, for example, 100 and 200, I would want the first task called with 100
and the second task called with 200. But it does not: doWork receives 200 for both tasks, so there is an issue with when the value is copied.
I can demonstrate with some simple console code
class Program
{
static void Main(string[] args)
{
long num = 100;
Task one = Task.Factory.StartNew(() => doWork(num));
num = 200;
Console.ReadKey();
}
public static void doWork(long val)
{
Console.WriteLine("Method called with {0}", val);
}
}
The above code will always display
Method called with 200
I modified the code to wait for the task status to switch from WaitingToRun
static void Main(string[] args)
{
long num = 100;
Task one = Task.Factory.StartNew(() => doWork(num));
while(one.Status == TaskStatus.WaitingToRun)
{}
num = 200;
Console.ReadKey();
}
This improves things but is not 100% foolproof; after a few runs I still got "Method called with 200".
Also tried the following
while (true)
{
if (one.Status == TaskStatus.Running | one.IsCompleted == true)
break;
}
but again got 200 displayed.
Any ideas how you can guarantee the value passed to the task without waiting for the task to complete?
Any ideas how you can guarantee the value passed to the task without waiting for the task to complete?
Sure - just create a separate variable which isn't modified anywhere else. You can use a new scope to make that clear:
long num = 100;
Task one;
{
// Nothing can change copyOfNum!
long copyOfNum = num;
one = Task.Factory.StartNew(() => doWork(copyOfNum));
}
You can't change the C# compiler to capture "the value of the variable when the delegate is created" rather than capturing the variable itself, but you can make sure the variable isn't changed afterwards, which accomplishes the same thing.
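Another option, in the same spirit as the earlier answers, is to pass the value through the Task state object so that the current value is captured at creation time (a hedged sketch; doWork is the method from the question):

long num = 100;

// The second argument is evaluated immediately and handed to the task as its
// state object, so later changes to num do not affect what the task receives.
Task one = Task.Factory.StartNew(state => doWork((long)state), num);

num = 200; // The task still receives 100.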
