How to use MemoryBarrier? Ways to ensure atomic operations? - c#

While learning threading, memory barriers (fences) have not been easy to understand. In my case I want to employ 10 threads that simultaneously increment an Int32 number x 100 times each (x++), so I should get the result 10 * 100 = 1000.
So this is really an atomicity problem, and as far as I know there are a number of ways to achieve it concurrently:
Interlocked.Increment
exclusive lock (lock, monitor, Mutex, Semaphore, etc.)
ReaderWriterLockSlim
If there are better ways, please kindly guide me. I tried a volatile read/write but failed:
for (int i = 0; i < 10000; i++)
{
    Thread.VolatileRead(ref x);
    Thread.VolatileWrite(ref x, x + 1);
}
My investigation code, tidied up, is below:
private const int MaxThreadCount = 10;
private Thread[] m_Workers = new Thread[MaxThreadCount];
private volatile int m_Counter = 0;
private Int32 x = 0;

protected void btn_DoWork_Click(object sender, EventArgs e)
{
    for (int i = 0; i < MaxThreadCount; i++)
    {
        m_Workers[i] = new Thread(IncreaseNumber) { Name = "Thread " + (i + 1) };
        m_Workers[i].Start();
    }
}

void IncreaseNumber()
{
    try
    {
        for (int i = 0; i < 10000; i++)
            Interlocked.Increment(ref x);

        // Increase the counter and decide whether to set the finish signal
        m_Counter++;
        if (m_Counter == MaxThreadCount)
        {
            // Print finish information on the UI
            m_Counter = 0;
        }
    }
    catch (Exception)
    {
        throw;
    }
}
My question is: how can I use a memory barrier to replace Interlocked, given that "All of Interlocked's methods generate a full fence."? I tried to modify the increment loop as below, but it failed and I don't understand why...
for (int i = 0; i < 10000; i++)
{
    Thread.MemoryBarrier();
    x++;
    Thread.MemoryBarrier();
}

The memory barrier just keeps memory operations from moving from one side of the barrier to the other. Your issue is this:
1. Thread A reads the value of X.
2. Thread B reads the value of X.
3. Thread A adds one to the value it read.
4. Thread B adds one to the value it read.
5. Thread A writes back the value it calculated.
6. Thread B writes back the value it calculated.
Oops, two increments only added one. Memory barriers are not atomic operations and they are not locks. They just enforce ordering, not atomicity.
Unfortunately, the x86 architecture does not offer any atomic operations that don't include a full fence. It is what it is. On the bright side, the full fence is heavily optimized. (For example, it does not ever lock any bus.)
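To make the lost update concrete, here is a minimal sketch (the class and field names are illustrative, not from the question): ten threads increment one field with a plain x++ and another with Interlocked.Increment, and only the latter reliably reaches 100000.
using System;
using System.Threading;

class LostUpdateDemo
{
    static int unsafeCounter;
    static int safeCounter;

    static void Main()
    {
        var threads = new Thread[10];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < 10000; j++)
                {
                    unsafeCounter++;                         // read-modify-write: not atomic
                    Interlocked.Increment(ref safeCounter);  // atomic read-modify-write
                }
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();

        // unsafeCounter is typically < 100000; safeCounter is always 100000.
        Console.WriteLine($"unsafe: {unsafeCounter}, safe: {safeCounter}");
    }
}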

You can't "use MemoryBarrier to replace Interlocked". They are two different tools.
Use MemoryBarrier, volatile etc to control re-ordering of reads and writes. Use Interlocked, lock etc for atomicity.
(Besides, are you aware that calling MemoryBarrier also generates a full fence*, as do VolatileRead and VolatileWrite? So if you're trying to avoid Interlocked, lock etc for performance reasons, there's a good chance that your alternatives will be less performant as well as more likely to be broken.)
*In the standard Microsoft CLR, at least. I'm not sure about Mono etc.
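For completeness, a sketch of the lock-based alternative (the locker object is a name introduced here, not from the question). lock gives you atomicity *and* implies the fences, so no explicit MemoryBarrier calls are needed:
private static readonly object locker = new object();
private static int x;

static void IncreaseNumber()
{
    for (int i = 0; i < 10000; i++)
    {
        lock (locker)   // only one thread at a time runs the increment
        {
            x++;
        }
    }
}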

Volatile reads/writes and instruction reordering

There is a very good article by Joe Albahari explaining volatile in C#: Threading in C#: PART 4: ADVANCED THREADING.
Considering instruction reordering Joe uses this example:
public class IfYouThinkYouUnderstandVolatile
{
    private volatile int x, y;

    private void Test1() // Executed on one thread
    {
        this.x = 1;      // Volatile write (release-fence)
        int a = this.y;  // Volatile read (acquire-fence)
    }

    private void Test2() // Executed on another thread
    {
        this.y = 1;      // Volatile write (release-fence)
        int b = this.x;  // Volatile read (acquire-fence)
    }
}
Basically what he is saying is that a and b could both end up containing 0 when the methods run on different threads in parallel.
IOW the optimizer or processor could reorder the instructions as follows:
public class IfYouThinkYouUnderstandVolatileReordered
{
    private volatile int x, y;

    private void Test1() // Executed on one thread
    {
        int tempY = this.y;  // Volatile read (reordered)
        this.x = 1;          // Volatile write
        int a = tempY;       // Use the already read value
    }

    private void Test2() // Executed on another thread
    {
        int tempX = this.x;  // Volatile read (reordered)
        this.y = 1;          // Volatile write (release-fence)
        int b = tempX;       // Use the already read value
    }
}
The reason this can happen even though we are using volatile is that a read instruction following a write instruction is allowed to move before the write instruction.
So far I understand what is happening here.
My question is: could this reordering happen across stack frames? I mean, can a volatile write be moved after a volatile read that happens in another method (or property accessor)?
Have a look at the following code: it works with properties instead of accessing the instance variables directly.
What about reordering in this case? Could it happen in any case? Or could it only happen if the property access is inlined by the compiler?
public class IfYouThinkYouUnderstandVolatileWithProps
{
    private volatile int x, y;

    public int PropX
    {
        get { return this.x; }
        set { this.x = value; }
    }

    public int PropY
    {
        get { return this.y; }
        set { this.y = value; }
    }

    private void Test1() // Executed on one thread
    {
        this.PropX = 1;          // Volatile write (release-fence)
        int a = this.PropY;      // Volatile read (acquire-fence)
    }

    private void Test2() // Executed on another thread
    {
        this.PropY = 1;          // Volatile write (release-fence)
        int b = this.PropX;      // Volatile read (acquire-fence)
    }
}
As stated in ECMA-335:
I.12.6.4 Optimization
Conforming implementations of the CLI are free to execute programs using any technology that guarantees, within a single thread of execution, that side-effects and exceptions generated by a thread are visible in the order specified by the CIL. For this purpose only volatile operations (including volatile reads) constitute visible side-effects. (Note that while only volatile operations constitute visible side-effects, volatile operations also affect the visibility of non-volatile references.)
Volatile operations are specified in §I.12.6.7. There are no ordering guarantees relative to exceptions injected into a thread by another thread (such exceptions are sometimes called "asynchronous exceptions", e.g., System.Threading.ThreadAbortException).
So, obviously it's allowed to inline all that code and then it's the same as it was.
You should not think about such high-level things, because you can't control them.
The JIT has many reasons to inline or not.
Reordering is a good concept that lets you reason about the possible outcomes of parallel code execution. But what really happens is not only the reordering of read/write operations: it can be actual reordering, or the JIT caching values in CPU registers, or effects of speculative execution by the CPU itself, or how the memory controller does its job.
Think in terms of reads and writes of pointer-sized (or smaller) pieces of memory. Use a reordering model of such reads and writes, and don't rely on today's specifics of the JIT or the CPU your program runs on.
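If you actually need to forbid that store-load reordering, the usual remedy is a full fence between the write and the following read. A minimal sketch based on Albahari's example (the class name here is illustrative); with the fences in place, at least one of a and b is guaranteed to end up as 1:
using System.Threading;

public class IfYouInsertFullFences
{
    private volatile int x, y;

    private void Test1() // Executed on one thread
    {
        this.x = 1;
        Thread.MemoryBarrier(); // full fence: the write to x cannot move past it
        int a = this.y;
    }

    private void Test2() // Executed on another thread
    {
        this.y = 1;
        Thread.MemoryBarrier(); // full fence: the write to y cannot move past it
        int b = this.x;
    }
}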

Will all the worker threads get generated at once?

Can anyone explain the flow of this code to me?
I wonder how the main thread generates worker threads. What I know is:
As soon as the main thread calls the Start method, a new thread is created.
But I am confused about how the behavior changes when multiple threads are started in a loop in Main.
static object theLock = new object();  // declarations assumed by the snippet
static int counter = 0;

static void Main()
{
    Thread[] tr = new Thread[10];
    for (int i = 0; i < 10; i++)
    {
        tr[i] = new Thread(new ThreadStart(count));
        tr[i].Start();
    }
}

static private void count()
{
    for (int i = 0; i < 10; ++i)
    {
        lock (theLock)
        {
            Console.WriteLine("Count {0} Thread{1}",
                counter++, Thread.CurrentThread.GetHashCode());
        }
    }
}
Is there a good way to debug and track a multithreaded program? After googling I found the Threads window in debug mode, but I couldn't make it useful even after giving the threads custom names.
I just can't understand the flow: how the threads are launched, how they work together, etc., since breakpoints seem to have no effect in a multithreaded application (at least in my case).
I want this output:
1 printed by Thread : 4551 [ThreadID]
2 printed by Thread : 4552
3 printed by Thread : 4553
4 printed by Thread : 4554
5 printed by Thread : 4555
6 printed by Thread : 4556
7 printed by Thread : 4557
8 printed by Thread : 4558
9 printed by Thread : 4559
10 printed by Thread : 4560
11 printed by Thread : 4551 [same thread id appears again as in 1]
12 printed by Thread : 4552
I'll try to describe what your code is doing as it interacts with the threading subsystem. The details I'm giving are from what I remember from my OS design university classes, so the actual implementation in the host operating system and/or the CLR internals may vary a bit from what I describe.
static void Main()
{
    Thread[] tr = new Thread[10];
    for (int i = 0; i < 10; i++)
    {
        tr[i] = new Thread(new ThreadStart(count));

        // The following line puts the thread in a "runnable" thread list that is
        // managed by the OS scheduler. The scheduler will allow threads to run by
        // considering many factors, such as how many processes are running on
        // the system, how much time a runnable thread has been waiting, the process
        // priority, the thread's priority, etc. This means you have little control
        // over the order of execution. The only certain fact is that your thread will
        // run, at some point in the near future.
        tr[i].Start();
    }

    // At this point you are exiting your Main function, so the program would
    // normally end; however, since you didn't flag your threads as background
    // threads, the program will keep running until every thread finishes.
}

static private void count()
{
    // The following loop is very short, and it is probable that the thread
    // might finish before the scheduler allows another thread to run.
    // Like user2864740 suggested, increasing the number of iterations will
    // increase the chance that you experience interleaved execution between
    // multiple threads.
    for (int i = 0; i < 10; ++i)
    {
        // Acquire a mutually-exclusive lock on theLock. Assuming that
        // theLock has been declared static, only a single thread will be
        // allowed to execute the code guarded by the lock.
        // Any running thread that tries to acquire the lock while it is
        // being held by a different thread will BLOCK. In this case, the
        // blocking operation will do the following:
        // 1. Register the thread that is about to be blocked in the
        //    lock's wait list (this is managed by a specialized class
        //    known as the Monitor).
        // 2. Remove the thread that is about to be blocked from the scheduler's
        //    runnable list. This way the scheduler won't try to yield
        //    the CPU to a thread that is waiting for a lock to be
        //    released. This saves CPU cycles.
        // 3. Yield execution (allow other threads to run).
        lock (theLock)
        {
            // Only a single thread can run the following code.
            Console.WriteLine("Count {0} Thread{1}",
                counter++, Thread.CurrentThread.GetHashCode());
        }
        // At this point the lock is released. The Monitor class will inspect
        // the released lock's wait list. If any threads were waiting for the
        // lock, one of them will be selected and returned to the scheduler's
        // runnable list, where eventually it will be given the chance to run
        // and contend for the lock. Again, many factors may be evaluated
        // when selecting which blocked thread to return to the runnable
        // list, so we can't make any guarantees about the order in which
        // the threads are unblocked.
    }
}
Hopefully things are clearer. The important thing here is to acknowledge that you have little control of how individual threads are scheduled for execution, making it impossible (without a fair amount of synchronization code) to replicate the output you are expecting. At most, you can change a thread's priority to hint the scheduler that a certain thread must be favored over other threads. However, this needs to be done very carefully, as it may lead to a nasty problem known as priority inversion. Unless you know exactly what you are doing, it is usually better not to change a thread's priority.
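If the goal is simply to observe the workers finishing (rather than to control their interleaving), naming the threads and joining them from Main is often enough; named threads are also easier to pick out in the debugger's Threads window. A minimal sketch, reusing the count method from the question:
static void Main()
{
    var threads = new Thread[10];
    for (int i = 0; i < threads.Length; i++)
    {
        // A Name makes the thread easy to identify in Debug > Windows > Threads.
        threads[i] = new Thread(count) { Name = "Worker " + (i + 1) };
        threads[i].Start();
    }

    // Join blocks until each worker has finished. The order in which the
    // workers actually ran is still entirely up to the scheduler.
    foreach (var t in threads) t.Join();
    Console.WriteLine("All workers done.");
}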
After continued attempts, I managed to complete the requirements of my task. Here is the code:
using System;
using System.Threading;

public class EntryPoint
{
    static private int counter = 0;
    static private object theLock = new Object();

    static private void count()
    {
        for (int i = 0; i < 10; i++)
        {
            lock (theLock)
            {
                Console.WriteLine("Count {0} Thread{1}",
                    counter++, Thread.CurrentThread.GetHashCode());
                if (counter >= 10)
                    Monitor.Pulse(theLock);
                Monitor.Wait(theLock);
            }
        }
    }

    static void Main()
    {
        Thread[] tr = new Thread[10];
        for (int i = 0; i < 10; i++)
        {
            tr[i] = new Thread(new ThreadStart(count));
            tr[i].Start();
        }
    }
}
Monitor maintains its ready queue in sequential order, hence I achieved what I wanted:
Cheers!

Handle leak in .Net threads

This has actually bitten me a couple times. If you do simple code like this:
private void button1_Click(object sender, EventArgs e)
{
    for (int i = 0; i < 1000000; i++)
    {
        Thread CE = new Thread(SendCEcho);
        CE.Priority = ThreadPriority.Normal;
        CE.IsBackground = true;
        CE.Start();
        Thread.Sleep(500);
        GC.Collect();
    }
}

private void SendCEcho()
{
    int Counter = 0;
    for (int i = 0; i < 5; i++)
    {
        Counter++;
        Thread.Sleep(25);
    }
}
Run this code and watch the handles fly! The Thread.Sleep is there so you can shut it down without it taking over your computer, and it should guarantee that each launched thread dies before the next one is launched. Calling GC.Collect() does nothing. From my observation this code leaks 10 handles per Task Manager refresh at the normal refresh rate.
It doesn't matter what is in the void SendCEcho() function, count to five if you want. When the thread dies there is one handle that does not get cleaned up.
With most programs this doesn't really matter because they do not run for extended periods of time. On some of the programs I've created they need to run for months and months on end.
If you exercise this code over and over, you can eventually leak handles to the point where the Windows OS becomes unstable and stops functioning. Reboot required.
I did solve the problem by using a thread pool and code like this:
ThreadPool.QueueUserWorkItem(new WaitCallback(ThreadJobMoveStudy));
My question though is why is there such a leak in .Net, and why has it existed for so long? Since like 1.0? I could not find the answer here.
You're creating threads that never end. The for loop in SendCEcho never terminates, so the threads never end and thus cannot be reclaimed. If I fix your loop code then the threads finish and are reclaimed as expected. I'm unable to reproduce the problem with the code below.
static void SendCEcho()
{
    int Counter = 0;
    for (int i = 0; i < 5; i++)
    {
        Counter++;
        Thread.Sleep(25);
    }
}
for (int i = 0; i < 5 + i++; )
A fairly bizarre typo; you have not re-created your real problem. There is one, though: a Thread object consumes five operating system handles, and its internal finalizer releases them. A .NET class normally has a Dispose() method to ensure that such handles can be released early, but Thread does not have one. That was a courageous design decision; such a Dispose() method would be very hard to call.
So having to rely on the finalizer is a hard requirement. In a program that has a "SendEcho" method and does not do anything else, you run the risk that the garbage collector never runs, so the finalizer can't do its job. It is then up to you to call GC.Collect() yourself; you'd consider doing so every, say, 1000 threads you start. Or use ThreadPool.QueueUserWorkItem() or Task.Run() so you recycle the threads, which is the logical approach.
Use Perfmon.exe to verify that the GC indeed doesn't run. Add the .NET CLR Memory > # Gen 0 Collections counter for your program.
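A minimal sketch of the recycling approach suggested above, assuming the same SendCEcho body; pool threads are reused, so no Thread objects pile up waiting for a finalizer:
using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        for (int i = 0; i < 1000000; i++)
        {
            // Task.Run reuses thread-pool threads instead of creating (and
            // later finalizing) a new Thread with its OS handles each time.
            Task.Run(SendCEcho);
            Thread.Sleep(500);
        }
    }

    static void SendCEcho()
    {
        int counter = 0;
        for (int i = 0; i < 5; i++)
        {
            counter++;
            Thread.Sleep(25);
        }
    }
}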
Try adding GC.WaitForPendingFinalizers(); after your call to GC.Collect(); I think that will get you what you are after. Complete source below:
EDIT
Also take a look at this from 2005: https://bytes.com/topic/net/answers/106014-net-1-1-possibly-1-0-also-threads-leaking-event-handles-bug. Almost the exact same code as yours.
using System;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        for (int i = 0; i < 1000000; i++)
        {
            Thread CE = new Thread(SendCEcho);
            CE.Priority = ThreadPriority.Normal;
            CE.IsBackground = true;
            CE.Start();
            Thread.Sleep(500);
            CE = null;
            GC.Collect();
            GC.WaitForPendingFinalizers();
        }
    }

    public static void SendCEcho()
    {
        int Counter = 0;
        for (int i = 0; i < 5; i++)
        {
            Counter++;
            Thread.Sleep(25);
        }
    }
}

Cancel request if taking too long to process

Hello, is this possible in C#?
If I have a loop say:
int x = 0;
for (int i = 0; i < 1000000; i++) {
    x++;
}
And if it takes more than 1 second to complete, is it possible to kill the code and move on?
Yes, if you have the loop running in a different thread, you can abort that thread from a different thread. This would effectively kill the code you're running, and would work fine in the simplified example you've given. The following code demonstrates how you might do this:
void Main()
{
    double foo = 0;
    var thread = new Thread(() => foo = GetFoo());
    thread.Start();
    string message;
    if (thread.Join(1000))   // wait up to one second for the thread to finish
    {
        message = foo.ToString();
    }
    else
    {
        message = "Process took too long";
        thread.Abort();
    }
    message.Dump();   // Dump() is a LINQPad extension; use Console.WriteLine elsewhere
}

// RandomProvider is assumed to be a shared Random instance, e.g.
// static readonly Random RandomProvider = new Random();
public double GetFoo()
{
    double a = RandomProvider.Next(2, 5);
    while (a == 3) { RandomProvider.Next(2, 5); }   // may spin forever: simulates work that takes too long
    double b = RandomProvider.Next(2, 5);
    double c = b / a;
    double e = RandomProvider.Next(8, 11);
    while (e == 9) { RandomProvider.Next(8, 11); }
    double f = b / e;
    return f;
}
However, as you can see from Eric Lippert's comment (and others), aborting a thread is not very graceful (a.k.a. "pure evil"). If you were to have important things happening inside this loop, you could end up with data in an unstable state by forcing the thread abortion in the middle of the loop. So it is possible, but whether you want to use this approach will depend on what exactly you're doing in the loop.
So a better option would be to have your loops voluntarily decide to exit when they're told to (Updated 2015/11/5 to use newer TPL classes):
// LINQPad-style entry point; in a console app use: static async Task Main()
async void Main()
{
    try
    {
        var cancelTokenSource = new CancellationTokenSource(TimeSpan.FromMilliseconds(1000));
        var foo = await GetFooAsync(cancelTokenSource.Token);
        Console.WriteLine(foo);
    }
    catch (OperationCanceledException)
    {
        Console.WriteLine("Process took too long");
    }
}

private Random RandomProvider = new Random();

public async Task<double> GetFooAsync(CancellationToken cancelToken)
{
    double a = RandomProvider.Next(2, 5);
    while (a == 3)
    {
        cancelToken.ThrowIfCancellationRequested();
        RandomProvider.Next(2, 5);
    }
    double b = RandomProvider.Next(2, 5);
    double c = b / a;
    double e = RandomProvider.Next(8, 11);
    while (e == 9)
    {
        cancelToken.ThrowIfCancellationRequested();
        RandomProvider.Next(8, 11);
    }
    double f = b / e;
    return f;
}
This gives GetFooAsync an opportunity to catch on to the fact that it's supposed to exit soon, and make sure it gets into a stable state before giving up the ghost.
If you control the code that needs to be shut down then write logic into it that enables it to be shut down cleanly from another thread.
If you do not control the code that needs to be shut down then there are two scenarios. First, it is hostile to you and actively resisting your attempts to shut it down. That is a bad situation to be in. Ideally you should run that code in another process, not another thread, and destroy the process. A hostile thread has lots of tricks it can do to keep you from shutting it down.
Second, if it is not hostile and it can't be shut down cleanly then either it is badly designed, or it is buggy. Fix it. If you can't fix it, run it in another process, just like it was hostile.
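For what it's worth, a sketch of that process-isolation idea ("worker.exe" is a placeholder for whatever process hosts the work):
using System;
using System.Diagnostics;

// Start the untrusted/long-running work in its own process.
var process = Process.Start("worker.exe");
if (!process.WaitForExit(1000))    // give it one second
{
    // Killing a process tears down everything it owns; unlike Thread.Abort,
    // it cannot corrupt the state of *our* process.
    process.Kill();
    Console.WriteLine("Process took too long");
}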
The last thing you should do is abort the thread. Thread aborts are pure evil and should only be done as a last resort. The right thing to do is to design the code so that it can be cancelled cleanly and quickly.
More thoughts:
http://blogs.msdn.com/b/ericlippert/archive/2010/02/22/should-i-specify-a-timeout.aspx
Alternatively, you could go with a single-threaded approach. Break the work up into small chunks where the last thing each chunk of work does is enqueues the next chunk of work onto a work queue. You then sit in a loop, pulling work out of the queue and executing it. To stop early, just clear the queue. No need to go with a multithreaded approach if you don't need to.
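A minimal sketch of that single-threaded, chunked approach (all the names here are illustrative, and the chunk size is arbitrary):
using System;
using System.Collections.Generic;

var workQueue = new Queue<Action>();
int x = 0;

// Each chunk does a small slice of the work and, as its last act,
// enqueues the next chunk.
void EnqueueChunk(int start)
{
    workQueue.Enqueue(() =>
    {
        int end = Math.Min(start + 1000, 1000000);
        for (int i = start; i < end; i++) x++;
        if (end < 1000000) EnqueueChunk(end);
    });
}

EnqueueChunk(0);

int deadline = Environment.TickCount + 1000;   // one-second budget
while (workQueue.Count > 0 && Environment.TickCount < deadline)
{
    workQueue.Dequeue()();    // run one chunk
}
// To stop early for any other reason, just clear the queue:
// workQueue.Clear();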
You can achieve this in a simple way without mucking around with threading:
int x = 0;
int startTime = Environment.TickCount;
for (int i = 0; i < 1000000; i++)
{
    x++;
    int currentTime = Environment.TickCount;
    if ((currentTime - startTime) > 1000)
    {
        break;
    }
}
You can replace Environment.TickCount with a DateTime call or System.Diagnostics.Stopwatch if you like.
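For example, a Stopwatch variant might look like this sketch:
using System.Diagnostics;

int x = 0;
var stopwatch = Stopwatch.StartNew();
for (int i = 0; i < 1000000; i++)
{
    x++;
    if (stopwatch.ElapsedMilliseconds > 1000)
    {
        break;   // time budget exhausted, stop early
    }
}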
You can use BackgroundWorker; this class supports cancellation.
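A minimal sketch of the BackgroundWorker cancellation pattern (the loop body is a stand-in for your real work):
using System.ComponentModel;
using System.Threading;

var worker = new BackgroundWorker { WorkerSupportsCancellation = true };
worker.DoWork += (sender, e) =>
{
    var bw = (BackgroundWorker)sender;
    for (int i = 0; i < 1000000; i++)
    {
        // Cooperative cancellation: check the flag and bail out cleanly.
        if (bw.CancellationPending) { e.Cancel = true; return; }
        // ... the actual work goes here ...
    }
};
worker.RunWorkerAsync();

Thread.Sleep(1000);      // give the work one second
worker.CancelAsync();    // request cooperative cancellation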
If the problem is the number of iterations in the loop rather than what you do inside it, you can save the time when the loop started, check the difference at each iteration (or maybe every 10th, 20th, or 100th), and cancel if the time you wanted to spend on that task is over.

C#: Create CPU Usage at Custom Percentage

I'm looking to test system responsiveness etc. on a few machines under certain CPU usage conditions. Unfortunately I can only create ~100% usage (infinite loop) or not enough CPU usage (I'm using C#).
Is there any way, in rough approximation, as other tasks are running on the system as well, to create CPU usage artificially at 20, 30, 40% (and so forth) steps?
I understand that there are differences between systems, obviously, as CPUs vary. It's more about algorithms/ideas for customizable CPU-intensive calculations that create enough load on the current CPU without maxing it out, which I can then tweak to produce the desired percentage.
This then?
DateTime lastSleep = DateTime.Now;
while (true)
{
    TimeSpan span = DateTime.Now - lastSleep;
    if (span.TotalMilliseconds > 700)
    {
        Thread.Sleep(300);
        lastSleep = DateTime.Now;
    }
}
You could use smaller numbers to get a steadier load, as long as the ratio stays whatever you want. This only loads one core, though, so you might have to do it on multiple threads.
You could add a threaded timer that wakes up on an interval and does some work. Then tweak the interval and amount of work until you approximate the load you want.
private void button1_Click(object sender, EventArgs e)
{
    m_timer = new Timer(DoWork);
    m_timer.Change(TimeSpan.Zero, TimeSpan.FromMilliseconds(10));
}

private static void DoWork(object state)
{
    long j = 0;
    for (int i = 0; i < 2000000; i++)
    {
        j += 1;
    }
    Console.WriteLine(j);
}
With that and tweaking the value of the loop I was able to add 20%, 60% and full load to my system. It will scale for multiple cores using additional threads for more even load.
There's also a utility that provides a simple slider-bar user interface, allowing you to place an arbitrary load on the processor(s) in your system. It automatically detects and handles multiple processors.
It worked very well for me when I downloaded it this morning.
Here is a function that utilizes all available processors/cores to a customizable percent, and can be cancelled at any time by the calling code.
private static CancellationTokenSource StressCPU(int percent)
{
    if (percent < 0 || percent > 100) throw new ArgumentOutOfRangeException(nameof(percent));
    var cts = new CancellationTokenSource();
    for (int i = 0; i < Environment.ProcessorCount; i++)
    {
        new Thread(() =>
        {
            var stopwatch = new Stopwatch();
            while (!cts.IsCancellationRequested)
            {
                stopwatch.Restart();
                while (stopwatch.ElapsedMilliseconds < percent) { } // hard work
                Thread.Sleep(100 - percent); // chill out
            }
        }).Start();
    }
    return cts;
}
Usage example:
var cts = StressCPU(50);
Thread.Sleep(15000);
cts.Cancel();
I haven't done this, but you could try working with prioritised threads running in multiple programs.
When you tried using sleep, did you do it like this, or did you just put the sleep inside the actual loop?
I think this will work:
while (true)
{
    Thread.Sleep(0);
    for (int i = 0; i < veryLargeNumber; i++)
    {
        // maybe add more loops here if looping to veryLargeNumber goes too quickly
        // also maybe add some dummy operation so the compiler doesn't optimize the loop away
    }
}
