When should I think about Memory Barrier and Instruction Reorder?

When should I think about Memory Barrier and Instruction Reorder? - c#

I tried to implement Peterson Lock with C# like this
public class PetersonLock
{
private volatile bool[] flag = new bool[2];
private volatile int victim;
public int oneThreadId;
public void Lock() {
int i = Thread.CurrentThread.ManagedThreadId == oneThreadId ? 1 : 0;
int j = 1 - i;
flag[i] = true; /* A */
victim = i; /* B */
while (flag[j] && victim == i) { } /* C */
}
public void Unlock() {
int i = Thread.CurrentThread.ManagedThreadId == oneThreadId ? 1 : 0;
flag[i] = false;
}
}
But when I use this lock in 2 threads, it didn't work, someone said I should think about Instruction Reorder and use Memory Barrier. Do Line A, Line B and Line C reorder like A->B->C, or A->C->B, or C->A->B or other orders? So I changed my code to this:
public class PetersonLock
{
private volatile bool[] flag = new bool[2];
private volatile int victim;
public int oneThreadId;
public void Lock() {
int i = Thread.CurrentThread.ManagedThreadId == oneThreadId ? 1 : 0;
int j = 1 - i;
flag[i] = true;
Thread.MemoryBarrier(); // Is this line nesscessary?
victim = i;
Thread.MemoryBarrier(); // Is this line nesscessary?
while (flag[j] && victim == i) { }
}
public void Unlock() {
int i = Thread.CurrentThread.ManagedThreadId == oneThreadId ? 1 : 0;
flag[i] = false;
}
}
I don't know whether these two lines are both nesscessary?
Is there any rules to help me judge which line WILL be reordered and which
line WILL NOT be?
When should I use Memory Barrier?

According to Microsoft Documentation, there are simpler ways to synchronize code using locks and the Monitor class.
If you still want to stick to Thread.MemoryBarrier, I found this link quite interesting.
Regarding your code, I'm not very familiar with the PetersonLock, so don't take my words blindly. I would say that the first barrier is used to ensure that flag is set before the victim is set and the second barrier is needed to flush the memory to all cached memory lines so all the threads read the same value

Related

c# 1D-byte array to 2D-double array

I'm dealing with c# concurrent-queue and multi-threading in socket-programming tcp/ip
First, I've already done with socket-programming itself. That means, I've already finished coding about client, server and stuffs about communication itself
basic structure is pipe-lined(producer-consumer problem) and now I'm doing with bit conversion
below is brief summary about my code
client-socket ->server-socket -> concurrent_queue_1(with type byte[65536],Thread_1 process this) -> concurrent_queue_2(with type double[40,3500], Thread_2 process this) -> display-data or other work(It can be gpu-work)
*(double[40,3500] can be changed to other size)
Till now,I've implemented putting_data into queue1(Thread1) and just dequeuing all(Thread2) and, its speed is about 700Mbps
The reason I used two concurrent_queue is, I want communication,and type conversion work to be processed in background regardless of main procedure about control things.
Here is the code about my own concurrent_queue with Blocking
public class BlockingConcurrentQueue<T> : IDisposable
{
private readonly ConcurrentQueue<T> _internalQueue;
private AutoResetEvent _autoResetEvent;
private long _consumed;
private long _isAddingCompleted = 0;
private long _produced;
private long _sleeping;
public BlockingConcurrentQueue()
{
_internalQueue = new ConcurrentQueue<T>();
_produced = 0;
_consumed = 0;
_sleeping = 0;
_autoResetEvent = new AutoResetEvent(false);
}
public bool IsAddingCompleted
{
get
{
return Interlocked.Read(ref _isAddingCompleted) == 1;
}
}
public bool IsCompleted
{
get
{
if (Interlocked.Read(ref _isAddingCompleted) == 1 && _internalQueue.IsEmpty)
return true;
else
return false;
}
}
public void CompleteAdding()
{
Interlocked.Exchange(ref _isAddingCompleted, 1);
}
public void Dispose()
{
_autoResetEvent.Dispose();
}
public void Enqueue(T item)
{
_internalQueue.Enqueue(item);
if (Interlocked.Read(ref _isAddingCompleted) == 1)
throw new InvalidOperationException("Adding Completed.");
Interlocked.Increment(ref _produced);
if (Interlocked.Read(ref _sleeping) == 1)
{
Interlocked.Exchange(ref _sleeping, 0);
_autoResetEvent.Set();
}
}
public bool TryDequeue(out T result)
{
if (Interlocked.Read(ref _consumed) == Interlocked.Read(ref _produced))
{
Interlocked.Exchange(ref _sleeping, 1);
_autoResetEvent.WaitOne();
}
if (_internalQueue.TryDequeue(out result))
{
Interlocked.Increment(ref _consumed);
return true;
}
return false;
}
}
My question is here
As I mentioned above, concurrent_queue1's type is byte[65536] and 65536 bytes = 8192 double data.
(40 * 3500=8192 * 17.08984375)
I want merge multiple 8192 double data into form of double[40,3500](size can be changed)and enqueue to concurrent_queue2 with Thread2
It's easy to do it with naive-approach(using many complex for loop) but it's slow cuz, It copys all the
data and expose to upper class or layer.
I'm searching method automatically enqueuing with matched size like foreach loop automatically iterates through 2D-array in row-major way, not yet found
Is there any fast way to merge 1D-byte array into form of 2D-double array and enqueue it?
Thanks for your help!

I try to understand your conversion rule, so I write this conversion code. Use Parallel to speed up the calculation.
int maxSize = 65536;
byte[] dim1Array = new byte[maxSize];
for (int i = 0; i < maxSize; ++i)
{
dim1Array[i] = byte.Parse((i % 256).ToString());
}
int dim2Row = 40;
int dim2Column = 3500;
int byteToDoubleRatio = 8;
int toDoubleSize = maxSize / byteToDoubleRatio;
double[,] dim2Array = new double[dim2Row, dim2Column];
Parallel.For(0, toDoubleSize, i =>
{
int row = i / dim2Column;
int col = i % dim2Column;
int originByteIndex = row * dim2Column * byteToDoubleRatio + col * byteToDoubleRatio;
dim2Array[row, col] = BitConverter.ToDouble(
dim1Array,
originByteIndex);
});

Unmanaged code callback - method calls counting

My app have a method in C# called by a camera driver unmaged code, it is registered using a delegate. I want to count everytime this method is called.
Sample:
bool firtTime;
uint counter;
//firtTime is reseted (set to true) in another method.
private void MyMethod()
{
if (firtTime)
{
counter = 0;
firtTime = false;
}
counter++;
//Do stuff
}
Is my approach ok or may I get wrong values in counter?

If this is called from multiple threads at the same time, then this is not ok. You have several race conditions that can cause trouble. Consider using System.Thread.Interlocked.Increment to increment counter.
If you need to protect both counter and firtTime, then consider something like this:
bool firtTime;
uint counter;
object sync = new object();
//firtTime is reseted (set to true) in another method.
private void MyMethod()
{
lock (sync) {
if (firtTime)
{
counter = 0;
firtTime = false;
}
counter++;
}
//Do stuff
}
Be sure to use lock (sync) { } whenever messing with counter and firtTime in your other code

static int counter = 0;
private void MyMethod()
{
Interlocked.Increment(ref counter);
//Do stuff
}
Static variable are associated with type rather than objects;
counter++ is translated into counter = counter + 1, both a read and a write. So needs to be atomic with Interlocked.Increment
For counter
var myCounter = counter == 0 ? 0 : (counter - 1);

Garbage Collector generations not incrementing

I have a lot of questions about the garbage collection procedure, mainly when does it run, when the objects are set to an older generation, so on..
static void Main(string[] args)
{
int i = 0, j = 0;
int a = 0;
Holder prev = new Holder(null);
while(GC.CollectionCount(1) == 0)
{
int aux = GC.CollectionCount(0);
if(aux > a){
a = aux;
++j;
Console.WriteLine((i+1));
}
++i;
Holder h = new Holder(prev);
Console.WriteLine(GC.GetGeneration(prev));
prev = h;
}
}
I'm trying to get the number of objects in the gen1.
Why does j = 1; ?? the GC only runs once on the gen0 (to leave the while shouldn't it at least run 2 times)?
[EDIT]
by adding this after the while breaks, i got very confused
Console.WriteLine("#gc0 = "+GC.CollectionCount(0)); --> 2
Console.WriteLine("#gc1 = "+GC.CollectionCount(1)); --> 1
Console.WriteLine("#objs = "+ i);
Console.ReadLine();
how come the GC.CollectionCount(0) be only 2? I´ve been reading Richter clr via c# and he said this
the objects in generation 1 are examined
only when generation 1 reaches its budget, which usually requires several garbage collections of
generation 0.
[EDIT]
but at the same time, if the gc sees that all the object's survived, it grows g0 limit, maybe the reason for only 2 gc's on g0?

How about we spice things up with a bit of randomness like this:
class Program
{
static void Main(string[] args)
{
Random random = new Random();
int i = 0, j = 0;
int a = 0;
Holder prev = new Holder(null);
Holder prev2 = new Holder(null);
while (GC.CollectionCount(1) == 0)
{
int aux = GC.CollectionCount(0);
if (aux > a)
{
a = aux;
++j;
Console.WriteLine((i + 1));
}
++i;
var flag = random.Next(1) == 1;
Holder h = new Holder(flag ? prev : prev2);
Console.WriteLine("Prev: " + GC.GetGeneration(prev));
Console.WriteLine("Prev2: " + GC.GetGeneration(prev2));
if (flag)
{
prev = h;
}
else
{
prev2 = h;
}
}
}
}
internal class Holder
{
private Holder holder;
public Holder(Holder o)
{
holder = o;
}
}
The code sample you've provided was so simple that the CLR knew there was not point in moving your prev item to another generation.
It usage was simple and I think that the runtime had it optimized it to live on G0 only.
Adding a more complex logic breaks the runtime's optimizations and now one of prev or prev1 will go on the G1 depending on which of the objects was used less frequently (don't know the exact mechanics here).
You can try adding instead of prev and prev2 an prevs array and do a random there on the index and you can better see how the array elements will advance the generations.

Problem with delegates in C#

In the following program, DummyMethod always print 5. But if we use the commented code instead, we get different values (i.e. 1, 2, 3, 4). Can anybody please explain why this is happenning?
delegate int Methodx(object obj);
static int DummyMethod(int i)
{
Console.WriteLine("In DummyMethod method i = " + i);
return i + 10;
}
static void Main(string[] args)
{
List<Methodx> methods = new List<Methodx>();
for (int i = 0; i < 5; ++i)
{
methods.Add(delegate(object obj) { return DummyMethod(i); });
}
//methods.Add(delegate(object obj) { return DummyMethod(1); });
//methods.Add(delegate(object obj) { return DummyMethod(2); });
//methods.Add(delegate(object obj) { return DummyMethod(3); });
//methods.Add(delegate(object obj) { return DummyMethod(4); });
foreach (var method in methods)
{
int c = method(null);
Console.WriteLine("In main method c = " + c);
}
}
Also if the following code is used, I get the desired result.
for (int i = 0; i < 5; ++i)
{
int j = i;
methods.Add(delegate(object obj) { return DummyMethod(j); });
}

The problem is that you're capturing the same variable i in every delegate - which by the end of the loop just has the value 5.
Instead, you want each delegate to capture a different variable, which means declaring a new variable in the loop:
for (int i = 0; i < 5; ++i)
{
int localCopy = i;
methods.Add(delegate(object obj) { return DummyMethod(localCopy); });
}
This is a pretty common "gotcha" - you can read a bit more about captured variables and closures in my closures article.

This article will probably help you understand what is happening (i.e. what a closure is): http://blogs.msdn.com/oldnewthing/archive/2006/08/02/686456.aspx

If you look at the code generated (using Reflector) you can see the difference:
private static void Method2()
{
List<Methodx> list = new List<Methodx>();
Methodx item = null;
<>c__DisplayClassa classa = new <>c__DisplayClassa();
classa.i = 0;
while (classa.i < 5)
{
if (item == null)
{
item = new Methodx(classa.<Method2>b__8);
}
list.Add(item);
classa.i++;
}
foreach (Methodx methodx2 in list)
{
Console.WriteLine("In main method c = " + methodx2(null));
}
}
When you use the initial code it creates a temporary class in the background, this class holds a reference to the "i" variable, so as per Jon's answer, you only see the final value of this.
private sealed class <>c__DisplayClassa
{
// Fields
public int i;
// Methods
public <>c__DisplayClassa();
public int <Method2>b__8(object obj);
}
I really recommend looking at the code in Reflector to see what's going on, its how I made sense of captured variables. Make sure you set the Optimization of the code to ".NET 1.0" in the Option menu, otherwise it'll hide all the behind scenes stuff.

I think it is because the variable i is put to the heap (it's a captured variable)
Take a look at this answer.

Best C# solution for multithreaded threadsafe read/write locking?

What is the safest (and shortest) way do lock read/write access to static members in a multithreaded environment in C#?
Is it possible to do the threadsafe locking & unlocking on class level (so I don't keep repeating lock/unlock code every time static member access is needed)?
Edit: Sample code would be great :)
Edit: Should I use the volatile keyword or Thread.MemoryBarrier() to avoid multiprocessor caching or is that unnecessary? According to Jon Skeet only those will make changes visible to other processors? (Asked this separately here).

Small Values
For small values (basically any field that can be declared volatile), you can do the following:
private static volatile int backingField;
public static int Field
{
get { return backingField; }
set { backingField = value; }
}
Large Values
With large values the assignment won't be atomic if the value is larger then 32-bits on a 32-bit machine or 64-bits on a 64-bit machine. See the ECMA 335 12.6.6 spec. So for reference types and most of the built-in value types the assignment is atomic, however if you have some large struct, like:
struct BigStruct
{
public long value1, valuea0a, valuea0b, valuea0c, valuea0d, valuea0e;
public long value2, valuea0f, valuea0g, valuea0h, valuea0i, valuea0j;
public long value3;
}
In this case you will need some kind of locking around the get accessor. You could use ReaderWriterLockSlim for this which I've demonstrated below. Joe Duffy has advice on using ReaderWriterLockSlim vs ReaderWriterLock:
private static BigStruct notSafeField;
private static readonly ReaderWriterLockSlim slimLock =
new ReaderWriterLockSlim();
public static BigStruct Safe
{
get
{
slimLock.EnterReadLock();
var returnValue = notSafeField;
slimLock.ExitReadLock();
return returnValue;
}
set
{
slimLock.EnterWriteLock();
notSafeField = value;
slimLock.ExitWriteLock();
}
}
Unsafe Get-Accessor Demonstration
Here's the code I used to show the lack of atomicity when not using a lock in the get-accessor:
private static readonly object mutexLock = new object();
private static BigStruct notSafeField;
public static BigStruct NotSafe
{
get
{
// this operation is not atomic and not safe
return notSafeField;
}
set
{
lock (mutexLock)
{
notSafeField = value;
}
}
}
public static void Main(string[] args)
{
var t = new Thread(() =>
{
while (true)
{
var current = NotSafe;
if (current.value2 != (current.value1 * 2)
|| current.value3 != (current.value1 * 5))
{
throw new Exception(String.Format("{0},{1},{2}", current.value1, current.value2, current.value3));
}
}
});
t.Start();
for(int i=0; i<50; ++i)
{
var w = new Thread((state) =>
{
while(true)
{
var index = (int) state;
var newvalue = new BigStruct();
newvalue.value1 = index;
newvalue.value2 = index * 2;
newvalue.value3 = index * 5;
NotSafe = newvalue;
}
});
w.Start(i);
}
Console.ReadLine();
}

The safest and shortest way is to create a private, static field of type Object that is only used for locking (think of it as a "pad-lock" object). Use this and only this field to lock on as this prevent other types from locking up your code when then lock on the same type that you do.
If you lock on the type itself there is risk that another type will also decide to lock on your type and this could create deadlocks.
Here is an example:
class Test
{
static readonly Object fooLock = new Object();
static String foo;
public static String Foo
{
get { return foo; }
set
{
lock (fooLock)
{
foo = value;
}
}
}
}
Notice that I have create a private, static field for locking foo - I use that field to lock the write operations on that field.

Although you could just use a single mutex to control all the access to the class (effectively serializing the access to the class) I suggest you study the static class, determine which members are being used where and how and the use one or several ReaderWriterLock (code examples in the MSDN documentation) which provides access to several readers but only one writer at the same time.
That way you'll have a fine grained multithreaded class which will only block for writing but will allow several readers at the same time and which will allow writing to one member while reading another unrelated member.

class LockExample {
static object lockObject = new object();
static int _backingField = 17;
public static void NeedsLocking() {
lock(lockObject) {
// threadsafe now
}
}
public static int ReadWritePropertyThatNeedsLocking {
get {
lock(lockObject) {
// threadsafe now
return _backingField;
}
}
set {
lock(lockObject) {
// threadsafe now
_backingField = value;
}
}
}
}
lock on an object specifically created for this purpose rather than on typeof(LockExample) to prevent deadlock situations where others have locked on LockExample's type object.
Is it possible to do the threadsafe locking & unlocking on class level (so I don't keep repeating lock/unlock code every time static member access is needed)?
Only lock where you need it, and do it inside the callee rather than requiring the caller to do the locking.

Several others have already explained how to use the lock keyword with a private lock object, so I will just add this:
Be aware that even if you lock inside each method in your type, calling more than one method in a sequence can not be considered atomic. For example if you're implementing a dictionary and your interface has a Contains method and an Add method, calling Contains followed by Add will not be atomic. Someone could modify the dictionary between the calls to Contains and Add - i.e. there's a race condition. To work around this you would have to change the interface and offer a method like AddIfNotPresent (or similar) which encapsulates both the checking and the modification as a single action.
Jared Par has an excellent blog post on the topic (be sure to read the comments as well).

You should lock/unlock on each static member access, within the static accessor, as needed.
Keep a private object to use for locking, and lock as required. This keeps the locking as fine-grained as possible, which is very important. It also keeps the locking internal to the static class members. If you locked at the class level, your callers would become responsible for the locking, which would hurt usability.

I thank you all and I'm glad to share this demo program, inspired by the above contributions, that run 3 modes (not safe, mutex, slim).
Note that setting "Silent = false" will result in no conflict at all between the threads. Use this "Silent = false" option to make all threads write in the Console.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
namespace Test
{
class Program
{
//------------------------------------------------------------------------------
// Configuration.
const bool Silent = true;
const int Nb_Reading_Threads = 8;
const int Nb_Writing_Threads = 8;
//------------------------------------------------------------------------------
// Structured data.
public class Data_Set
{
public const int Size = 20;
public long[] T;
public Data_Set(long t)
{
T = new long[Size];
for (int i = 0; i < Size; i++)
T[i] = t;
}
public Data_Set(Data_Set DS)
{
Set(DS);
}
public void Set(Data_Set DS)
{
T = new long[Size];
for (int i = 0; i < Size; i++)
T[i] = DS.T[i];
}
}
private static Data_Set Data_Sample = new Data_Set(9999);
//------------------------------------------------------------------------------
// SAFE process.
public enum Mode { Unsafe, Mutex, Slim };
public static Mode Lock_Mode = Mode.Unsafe;
private static readonly object Mutex_Lock = new object();
private static readonly ReaderWriterLockSlim Slim_Lock = new ReaderWriterLockSlim();
public static Data_Set Safe_Data
{
get
{
switch (Lock_Mode)
{
case Mode.Mutex:
lock (Mutex_Lock)
{
return new Data_Set(Data_Sample);
}
case Mode.Slim:
Slim_Lock.EnterReadLock();
Data_Set DS = new Data_Set(Data_Sample);
Slim_Lock.ExitReadLock();
return DS;
default:
return new Data_Set(Data_Sample);
}
}
set
{
switch (Lock_Mode)
{
case Mode.Mutex:
lock (Mutex_Lock)
{
Data_Sample.Set(value);
}
break;
case Mode.Slim:
Slim_Lock.EnterWriteLock();
Data_Sample.Set(value);
Slim_Lock.ExitWriteLock();
break;
default:
Data_Sample.Set(value);
break;
}
}
}
//------------------------------------------------------------------------------
// Main function.
static void Main(string[] args)
{
// Console.
const int Columns = 120;
const int Lines = (Silent ? 50 : 500);
Console.SetBufferSize(Columns, Lines);
Console.SetWindowSize(Columns, 40);
// Threads.
const int Nb_Threads = Nb_Reading_Threads + Nb_Writing_Threads;
const int Max = (Silent ? 50000 : (Columns * (Lines - 5 - (3 * Nb_Threads))) / Nb_Threads);
while (true)
{
// Console.
Console.Clear();
Console.WriteLine("");
switch (Lock_Mode)
{
case Mode.Mutex:
Console.WriteLine("---------- Mutex ----------");
break;
case Mode.Slim:
Console.WriteLine("---------- Slim ----------");
break;
default:
Console.WriteLine("---------- Unsafe ----------");
break;
}
Console.WriteLine("");
Console.WriteLine(Nb_Reading_Threads + " reading threads + " + Nb_Writing_Threads + " writing threads");
Console.WriteLine("");
// Flags to monitor all threads.
bool[] Completed = new bool[Nb_Threads];
for (int i = 0; i < Nb_Threads; i++)
Completed[i] = false;
// Threads that change the values.
for (int W = 0; W < Nb_Writing_Threads; W++)
{
var Writing_Thread = new Thread((state) =>
{
int t = (int)state;
int u = t % 10;
Data_Set DS = new Data_Set(t + 1);
try
{
for (int k = 0; k < Max; k++)
{
Safe_Data = DS;
if (!Silent) Console.Write(u);
}
}
catch (Exception ex)
{
Console.WriteLine("\r\n" + "Writing thread " + (t + 1) + " / " + ex.Message + "\r\n");
}
Completed[Nb_Reading_Threads + t] = true;
});
Writing_Thread.Start(W);
}
// Threads that read the values.
for (int R = 0; R < Nb_Reading_Threads; R++)
{
var Reading_Thread = new Thread((state) =>
{
int t = (int)state;
char u = (char)((int)('A') + (t % 10));
try
{
for (int j = 0; j < Max; j++)
{
Data_Set DS = Safe_Data;
for (int i = 0; i < Data_Set.Size; i++)
{
if (DS.T[i] != DS.T[0])
{
string Log = "";
for (int k = 0; k < Data_Set.Size; k++)
Log += DS.T[k] + " ";
throw new Exception("Iteration " + (i + 1) + "\r\n" + Log);
}
}
if (!Silent) Console.Write(u);
}
}
catch (Exception ex)
{
Console.WriteLine("\r\n" + "Reading thread " + (t + 1) + " / " + ex.Message + "\r\n");
}
Completed[t] = true;
});
Reading_Thread.Start(R);
}
// Wait for all threads to complete.
bool All_Completed = false;
while (!All_Completed)
{
All_Completed = true;
for (int i = 0; i < Nb_Threads; i++)
All_Completed &= Completed[i];
}
// END.
Console.WriteLine("");
Console.WriteLine("Done!");
Console.ReadLine();
// Toogle mode.
switch (Lock_Mode)
{
case Mode.Unsafe:
Lock_Mode = Mode.Mutex;
break;
case Mode.Mutex:
Lock_Mode = Mode.Slim;
break;
case Mode.Slim:
Lock_Mode = Mode.Unsafe;
break;
}
}
}
}
}

Locking in static methods sounds like a bad idea, for one thing if you use these static methods from class constructor you could run into some interesting side-effects due to loader locks (and the fact that class loaders can ignore other locks).

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

When should I think about Memory Barrier and Instruction Reorder? - c#

Related

c# 1D-byte array to 2D-double array

Unmanaged code callback - method calls counting

Garbage Collector generations not incrementing

Problem with delegates in C#

Best C# solution for multithreaded threadsafe read/write locking?

Categories

Resources