This has actually bitten me a couple of times. If you run simple code like this:
private void button1_Click(object sender, EventArgs e)
{
for (int i = 0; i < 1000000; i++)
{
Thread CE = new Thread(SendCEcho);
CE.Priority = ThreadPriority.Normal;
CE.IsBackground = true;
CE.Start();
Thread.Sleep(500);
GC.Collect();
}
}
private void SendCEcho()
{
int Counter = 0;
for (int i = 0; i < 5; i++)
{
Counter++;
Thread.Sleep(25);
}
}
Run this code and watch the handles fly! The Thread.Sleep(500) is there so you can shut it down and it doesn't take over your computer. It should also guarantee that each launched thread dies before the next one is launched. Calling GC.Collect(); does nothing. From my observation, this code loses 10 handles on every Task Manager refresh at the normal refresh speed.
It doesn't matter what is in the void SendCEcho() function, count to five if you want. When the thread dies there is one handle that does not get cleaned up.
With most programs this doesn't really matter because they do not run for extended periods of time. Some of the programs I've created, however, need to run for months on end.
If you exercise this code over and over, you can eventually leak handles to the point where Windows becomes unstable and stops functioning. Reboot required.
I did solve the problem by using a thread pool and code like this:
ThreadPool.QueueUserWorkItem(new WaitCallback(ThreadJobMoveStudy));
My question though is why is there such a leak in .Net, and why has it existed for so long? Since like 1.0? I could not find the answer here.
You're creating threads that never end. The for loop in SendCEcho never terminates, so the threads never end and thus cannot be reclaimed. If I fix your loop code then the threads finish and are reclaimed as expected. I'm unable to reproduce the problem with the code below.
static void SendCEcho()
{
int Counter = 0;
for (int i = 0; i < 5; i++)
{
Counter++;
Thread.Sleep(25);
}
}
for (int i = 0; i < 5 + i++; )
Fairly bizarre typo; you have not re-created your real problem. There is a real one, though: a Thread object consumes 5 operating system handles, and its internal finalizer releases them. A .NET class normally has a Dispose() method to ensure that such handles can be released early, but Thread does not have one. That was courageous design; such a Dispose() method would be very hard to call.
So having to rely on the finalizer is a hard requirement. In a program that has a "SendEcho" method, and does not do anything else, you are running the risk that the garbage collector never runs. So the finalizer can't do its job. It is then up to you to call GC.Collect() yourself. You'd consider doing so every, say, 1000 threads you start. Or use ThreadPool.QueueUserWorkItem() or Task.Run() so you recycle the threads, the logical approach.
Use Perfmon.exe to verify that the GC indeed doesn't run. Add the .NET CLR Memory > # Gen 0 Collections counter for your program.
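As a rough illustration of the thread-pool route mentioned above, the sketch below hands the same short job to Task.Run instead of creating a dedicated Thread for every call. PooledEcho, RunBatch and the completion counter are names made up for this example; only SendCEcho comes from the question. Because pool threads are reused, no new Thread object (and none of its handles) is created per job.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class PooledEcho
{
    // Hypothetical counter, only here so the caller can see that every job ran.
    private static int _completed;

    // Same shape as the SendCEcho method from the question.
    private static void SendCEcho()
    {
        for (int i = 0; i < 5; i++)
        {
            Thread.Sleep(25);
        }
        Interlocked.Increment(ref _completed);
    }

    // Queues `jobs` work items on the thread pool and waits for all of them.
    public static int RunBatch(int jobs)
    {
        _completed = 0;
        var tasks = new Task[jobs];
        for (int i = 0; i < jobs; i++)
        {
            tasks[i] = Task.Run(() => SendCEcho());
        }
        Task.WaitAll(tasks);
        return _completed;
    }
}
```

Calling PooledEcho.RunBatch(8) runs eight echo jobs on pooled threads and returns how many finished.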
Try adding GC.WaitForPendingFinalizers(); after your call to GC.Collect(); I think that will get you what you are after. Complete source below:
EDIT
Also take a look at this from 2005: https://bytes.com/topic/net/answers/106014-net-1-1-possibly-1-0-also-threads-leaking-event-handles-bug. Almost the exact same code as yours.
using System;
using System.Threading;
class Program
{
static void Main(string[] args)
{
for (int i = 0; i < 1000000; i++)
{
Thread CE = new Thread(SendCEcho);
CE.Priority = ThreadPriority.Normal;
CE.IsBackground = true;
CE.Start();
Thread.Sleep(500);
CE = null;
GC.Collect();
GC.WaitForPendingFinalizers();
}
}
public static void SendCEcho()
{
int Counter = 0;
for (int i = 0; i < 5; i++ )
{
Counter++;
Thread.Sleep(25);
}
}
}
using System;
using System.Threading;
using System.Text;
class ThreadTest
{
static StringBuilder sb = new StringBuilder();
static void Main()
{
Thread t = new Thread(WriteY);
t.Start();
for(int i = 0; i < 1000; i++)
{
sb.Append("x");
}
Console.WriteLine(sb.ToString());
}
private static void WriteY()
{
for (int i = 0; i < 1000; i++)
{
sb.Append("y");
}
}
}
output:
{xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
y}
Questions:
Why does 'x' appear before 'y'?
Does StringBuilder accept only one thread?
Why doesn't the output appear like this: "xyxyxyxyx xyxyxyxy"?
Questions 1 and 3 are both related to the time slicing of the Windows scheduler. According to the Windows 2000 Performance Guide, a time slice on x86 processors is about 30 ms. That may have changed since Windows 2000, but should still be of this order of magnitude. Hence, t.Start() only adds the new thread to the scheduler but does not immediately trigger a context switch to it. The main thread still has the remaining part of its time slice, which obviously is enough time to append the 'x' 1,000 times.
Furthermore, when the new thread is actually scheduled, it has a whole time slice to print out 'y'. As this is plenty of time, you don't get the pattern "xyxyxy", but rather 'x's until the time slice of the main thread runs out and then 'y's until the end of the time slice of the new thread and then 'x's again. (At least if there are enough 'x's and 'y's to be printed, which, according to Simon's comment is the case with 10,000 'x's and 'y's.)
Question 2 is answered by the MSDN page on the StringBuilder. Under the topic "Thread Safety" it is written that "Any public static ( Shared in Visual Basic) members of this type are thread safe. Any instance members are not guaranteed to be thread safe." As the Append method is an instance method, you cannot call this reliably from different threads in parallel without further synchronization.
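To make the "further synchronization" concrete, here is a minimal sketch that guards every Append with a lock and joins the worker before reading the result. The class name SafeAppend, the gate object and the Run method are assumptions made for this example, not part of the original program.

```csharp
using System;
using System.Text;
using System.Threading;

static class SafeAppend
{
    private static readonly StringBuilder sb = new StringBuilder();
    private static readonly object gate = new object(); // hypothetical lock object

    // Appends the given character 1000 times, one locked Append per iteration.
    private static void Write(char c)
    {
        for (int i = 0; i < 1000; i++)
        {
            lock (gate)
            {
                sb.Append(c);
            }
        }
    }

    public static string Run()
    {
        lock (gate)
        {
            sb.Clear();
        }
        var t = new Thread(() => Write('y'));
        t.Start();
        Write('x');
        t.Join(); // wait for the worker so no 'y' is lost
        lock (gate)
        {
            return sb.ToString();
        }
    }
}
```

The lock only makes each Append safe; the order in which 'x' and 'y' interleave is still up to the scheduler.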
Is this what you are looking for?
class Program
{
static StringBuilder sb = new StringBuilder();
static void Main()
{
Thread t = new Thread(WriteY);
t.Start();
for (int i = 0; i < 1000; i++)
{
//Console.Write("x");
sb.Append("x");
Thread.Sleep(10);
}
//t.Join();
Console.WriteLine(sb.ToString());
}
private static void WriteY()
{
for (int i = 0; i < 1000; i++)
{
//Console.Write("y");
sb.Append("y");
Thread.Sleep(10);
}
}
}
Why does 'x' appear before 'y'?
Because the main thread is not blocked at any point and continues executing before the other thread, the one appending 'y', gets any CPU time.
Does StringBuilder accept only one thread?
No, that is not the case. Run the example above.
Why doesn't the output appear like this: "xyxyxyxyx xyxyxyxy"?
There is not much work to do, so to get that interleaved result you need to increase the duration, which the example does by using Sleep.
Update: in your original example you can see the interleaving if you increase the loop count to 100000 or greater. You also need to add t.Join(), otherwise your thread may not have finished its work when the result is printed.
Can anyone explain the flow of this code to me?
I wonder how the main thread generates worker threads. What I know is:
As soon as the main thread calls the Start method, a new thread is created.
But I am confused about how the behavior changes when the main thread starts multiple threads in a loop.
static void Main()
{
Thread[] tr = new Thread[10];
for (int i = 0; i < 10; i++)
{
tr[i] = new Thread(new ThreadStart(count));
tr[i].Start();
}
}
static private void count()
{
for (int i = 0; i < 10; ++i)
{
lock (theLock)
{
Console.WriteLine("Count {0} Thread{1}",
counter++, Thread.CurrentThread.GetHashCode());
}
}
}
Is there a good way to debug and track a multithreaded program? After searching on Google I found the Threads window available in debug mode, but I couldn't find it useful even after giving custom names to the threads.
I just can't understand the flow: how the threads are launched, how they work together, and so on, as breakpoints seem to have no effect in a multithreaded application. (At least in my case.)
I want this output:
1 printed by Thread : 4551 [ThreadID]
2 printed by Thread : 4552
3 printed by Thread : 4553
4 printed by Thread : 4554
5 printed by Thread : 4555
6 printed by Thread : 4556
7 printed by Thread : 4557
8 printed by Thread : 4558
9 printed by Thread : 4559
10 printed by Thread : 4560
11 printed by Thread : 4551 [Same Thread Id appears again as in 1]
12 printed by Thread : 4552
I'll try to describe what your code is doing as it interacts with the threading subsystem. The details I'm giving are from what I remember from my OS design university classes, so the actual implementation in the host operating system and/or the CLR internals may vary a bit from what I describe.
static void Main()
{
Thread[] tr = new Thread[10];
for (int i = 0; i < 10; i++)
{
tr[i] = new Thread(new ThreadStart(count));
// The following line puts the thread in a "runnable" thread list that is
// managed by the OS scheduler. The scheduler will allow threads to run by
// considering many factors, such as how many processes are running on
// the system, how much time a runnable thread has been waiting, the process
// priority, the thread's priority, etc. This means you have little control
// on the order of execution, The only certain fact is that your thread will
// run, at some point in the near future.
tr[i].Start();
}
// At this point you are exiting your main function, so the program should
// end, however, since you didn't flag your threads as BackgroundThreads,
// the program will keep running until every thread finishes.
}
static private void count()
{
// The following loop is very short, and it is probable that the thread
// might finish before the scheduler allows another thread to run
// Like user2864740 suggested, increasing the amount of iterations will
// increase the chance that you experience interleaved execution between
// multiple threads
for (int i = 0; i < 10; ++i)
{
// Acquire a mutually-exclusive lock on theLock. Assuming that
// theLock has been declared static, then only a single thread will be
// allowed to execute the code guarded by the lock.
// Any running thread that tries to acquire the lock that is
// being held by a different thread will BLOCK. In this case, the
// blocking operation will do the following:
// 1. Register the thread that is about to be blocked in the
// lock's wait list (this is managed by a specialized class
// known as the Monitor)
// 2. Remove the thread that is about to be blocked from the scheduler's
// runnable list. This way the scheduler won't try to yield
// the CPU to a thread that is waiting for a lock to be
// released. This saves CPU cycles.
// 3. Yield execution (allow other threads to run)
lock (theLock)
{
// Only a single thread can run the following code
Console.WriteLine("Count {0} Thread{1}",
counter++, Thread.CurrentThread.GetHashCode());
}
// At this point the lock is released. The Monitor class will inspect
// the released lock's wait list. If any threads were waiting for the
// lock, one of them will be selected and returned to the scheduler's
// runnable list, where eventually it will be given the chance to run
// and contend for the lock. Again, many factors may be evaluated
// when selecting which blocked thread to return to the runnable
// list, so we can't make any guarantees on the order the threads
// are unblocked
}
}
Hopefully things are clearer. The important thing here is to acknowledge that you have little control of how individual threads are scheduled for execution, making it impossible (without a fair amount of synchronization code) to replicate the output you are expecting. At most, you can change a thread's priority to hint the scheduler that a certain thread must be favored over other threads. However, this needs to be done very carefully, as it may lead to a nasty problem known as priority inversion. Unless you know exactly what you are doing, it is usually better not to change a thread's priority.
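For illustration, here is a minimal sketch of what such synchronization code could look like if the goal is a strict round-robin order. It is not taken from the question: RoundRobin, Run, the slot numbering and the "worker" label in the output are all hypothetical. Each thread waits on a shared lock until the counter says it is its turn, records one line, and then wakes the others.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

static class RoundRobin
{
    // Starts threadCount workers; each records `rounds` lines in strict turn order.
    public static List<string> Run(int threadCount, int rounds)
    {
        var gate = new object();
        var output = new List<string>();
        int counter = 0; // number of lines so far; also decides whose turn it is
        var threads = new Thread[threadCount];

        for (int t = 0; t < threadCount; t++)
        {
            int slot = t; // copy the loop variable for the closure
            threads[t] = new Thread(() =>
            {
                for (int r = 0; r < rounds; r++)
                {
                    lock (gate)
                    {
                        // Sleep until the counter reaches this worker's slot.
                        while (counter % threadCount != slot)
                        {
                            Monitor.Wait(gate);
                        }
                        output.Add((counter + 1) + " printed by worker " + slot);
                        counter++;
                        // Wake everyone; only the worker whose turn it is proceeds.
                        Monitor.PulseAll(gate);
                    }
                }
            });
            threads[t].Start();
        }

        foreach (var thread in threads)
        {
            thread.Join();
        }
        return output;
    }
}
```

With RoundRobin.Run(10, 2) the eleventh line is produced by the same worker as the first one. The price is that the threads no longer run in parallel at all; they merely take turns.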
After repeated attempts, I managed to meet the requirements of my task. Here is the code:
using System;
using System.Threading;
public class EntryPoint
{
static private int counter = 0;
static private object theLock = new Object();
static object obj = new object();
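static private void count()
{
for (int i = 0; i < 10; i++)
{
lock (theLock)
{
Console.WriteLine("Count {0} Thread{1}",
counter++, Thread.CurrentThread.GetHashCode());
// Once ten lines are out, wake the longest-waiting thread
// before this thread goes to sleep itself.
if (counter >= 10)
Monitor.Pulse(theLock);
// Caveat: after the last line is printed, the threads still blocked
// here are never pulsed again, so the process does not exit by itself.
Monitor.Wait(theLock);
}
}
}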
static void Main()
{
Thread[] tr = new Thread[10];
for (int i = 0; i < 10; i++)
{
tr[i] = new Thread(new ThreadStart(count));
tr[i].Start();
}
}
}
Monitor maintains a ready queue in a sequential order hence I achieved what I wanted:
Cheers!
I have a parallel algorithm which I have some barrier issues with. Before y'all scream "search", I can say I have looked at available posts and links, and I have followed the instructions for a barrier with Monitor.Wait and Monitor.PulseAll, but my issue is that the PulseAll from my main thread reaches all threads except the last one created (and started). Here is the basic layout of the code:
public static object signal = new object(); //This one is located as class variable, not in the method
public void RunAlgorithm(List<City> cities){
List<Thread> localThreads = new List<Thread>();
object[] temp = //some parameters here
for(int i = 0; i < numOfCitiesToCheck; i++){
Thread newThread = new Thread((o) => DoWork(o as object[]));
newThread.IsBackground = true;
newThread.Priority = ThreadPriority.AboveNormal;
newThread.Start(temp as object);
localThreads.Add(newThread);
}
//All threads initiated, now we pulse all
lock(signal){
Monitor.PulseAll(signal);
}
int counter = 0;
while(true){
if(counter == localThreads.Count){ break; }
localThreads[counter].Join();
counter++;
}
}
That's what the main thread does (I removed a few unnecessary pieces), and as stated before, the main thread always gets stuck in the Join() on the last thread in the list.
This is what the thread method looks like:
private void DoWork(object[] arguments){
lock(signal){
Monitor.Wait(signal);
}
GoDoWork(arguments);
}
Are there any other barriers I can use for this type of signaling? All I want is to let the main thread signal all threads at once so that they start at the same time. I want them to start together in order to run as close to parallel as possible (I measure the running time of the algorithm and a few other things). Is there any part of my barrier code that is flawed? I tried running an instance with fewer threads and it still gets stuck on the last one, and I don't know why. I have confirmed via the VS debugger that the last thread is sleeping (all other threads report IsAlive = false, while the last one reports IsAlive = true).
Any help appreciated!
I managed to solve it using the Barrier class. Many thanks to Damien_The_Unbeliever! Still can't believe I haven't heard of it before.
public Barrier barrier = new Barrier(1);
public void RunAlgorithm(List<City> cities){
List<Thread> localThreads = new List<Thread>();
object[] temp = //some parameters here
for(int i = 0; i < numOfCitiesToCheck; i++){
barrier.AddParticipant();
Thread newThread = new Thread((o) => DoWork(o as object[]));
newThread.IsBackground = true;
newThread.Priority = ThreadPriority.AboveNormal;
newThread.Start(temp as object);
localThreads.Add(newThread);
}
barrier.SignalAndWait();
int counter = 0;
while(true){
if(counter == localThreads.Count){ break; }
localThreads[counter].Join();
counter++;
}
}
private void DoWork(object[] arguments){
barrier.SignalAndWait();
GoDoWork(arguments);
}
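A likely reason the original Monitor.Wait/PulseAll version hung is that a pulse is not remembered: a worker that reaches Monitor.Wait after the main thread has already called PulseAll simply misses it. As an alternative to Barrier, the sketch below uses a latched start signal, so a late arrival still gets through. StartGate, Run and the finished counter are hypothetical names for this example, and the CountdownEvent is only there so the main thread can wait until every worker is ready.

```csharp
using System;
using System.Threading;

static class StartGate
{
    // Starts `workers` threads, releases them together, and returns how many ran.
    public static int Run(int workers)
    {
        int finished = 0;
        using (var ready = new CountdownEvent(workers))
        using (var go = new ManualResetEventSlim(false))
        {
            var threads = new Thread[workers];
            for (int i = 0; i < workers; i++)
            {
                threads[i] = new Thread(() =>
                {
                    ready.Signal(); // tell the main thread this worker is in place
                    go.Wait();      // returns at once if the gate is already open
                    Interlocked.Increment(ref finished); // stand-in for GoDoWork
                });
                threads[i].IsBackground = true;
                threads[i].Start();
            }

            ready.Wait(); // every worker has signalled and is about to wait on the gate
            go.Set();     // open the gate; the signal stays set for late arrivals

            foreach (var thread in threads)
            {
                thread.Join();
            }
        }
        return finished;
    }
}
```

Unlike a pulse, the set state of ManualResetEventSlim persists, which is what removes the race between the last worker and the main thread.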
I have a Windows Forms application written in .NET 4.0. Recently, while running some tests, I noticed a problem with handles. The table below shows the results:
As you can see, the only handle type that keeps increasing is Event.
So my question is: Is it possible that the described problem is caused by a Windows Forms application? I mean, I do not synchronize threads using AutoResetEvent or ManualResetEvent. I do use threads, but as can be seen from the table above, the number of thread handles seems to be OK. So, I assume that they are well managed by the CLR?
Can it be caused by any third party components I am also using in my app?
If something is unclear I will try to answer your questions. Thanks for the help!
This answer is a bit late, but I just ran across the question while investigating a very similar issue in some of my code and found the answer by placing a break point at the syscall in the disassembly of CreateEvent. Hopefully other people will find this answer useful, even if it is too late for your specific use case.
The answer is that .NET creates Event kernel objects for various threading primitives when there is contention. Notably, I have made a test application that can show they are created when using the "lock" statement, though, presumably, any of the Slim threading primitives will perform similar lazy creation.
It is important to note that the handles are NOT leaked, though an increasing number may indicate a leak elsewhere in your code. The handles will be released when the garbage collector collects the object that created them (eg, the object provided in the lock statement).
I have pasted my test code below which will showcase the leak on a small scale (around 100 leaked Event handles on my test machine - your mileage may vary).
A few specific points of interest:
Once the list is cleared and the GC.Collect() is run, any created handles will be cleaned up.
Setting ThreadCount to 1 will prevent any Event handles from being created.
Similarly, commenting out the lock statement will cause no handles to be created.
Removing the ThreadCount from the calculation of index (line 72 of the original listing, the int index = ... statement in ThreadMain) will drastically reduce contention and thus prevent nearly all the handles from being created.
No matter how long you let it run for, it will never create more than 200 handles (.NET seems to create 2 per object for some reason).
using System.Collections.Generic;
using System.Threading;
namespace Dummy.Net
{
public static class Program
{
private static readonly int ObjectCount = 100;
private static readonly int ThreadCount = System.Environment.ProcessorCount - 1;
private static readonly List<object> _objects = new List<object>(ObjectCount);
private static readonly List<Thread> _threads = new List<Thread>(ThreadCount);
private static int _currentIndex = 0;
private static volatile bool _finished = false;
private static readonly ManualResetEventSlim _ready = new ManualResetEventSlim(false, 1024);
public static void Main(string[] args)
{
for (int i = 0; i < ObjectCount; ++i)
{
_objects.Add(new object());
}
for (int i = 0; i < ThreadCount; ++i)
{
var thread = new Thread(ThreadMain);
thread.Name = $"Thread {i}";
thread.Start();
_threads.Add(thread);
}
System.Console.WriteLine("Ready.");
Thread.Sleep(10000);
_ready.Set();
System.Console.WriteLine("Started.");
Thread.Sleep(10000);
_finished = true;
foreach (var thread in _threads)
{
thread.Join();
}
System.Console.WriteLine("Finished.");
Thread.Sleep(3000);
System.Console.WriteLine("Collecting.");
_objects.Clear();
System.GC.Collect();
Thread.Sleep(3000);
System.Console.WriteLine("Collected.");
Thread.Sleep(3000);
}
private static void ThreadMain()
{
_ready.Wait();
while (!_finished)
{
int rawIndex = Interlocked.Increment(ref _currentIndex);
int index = (rawIndex / ThreadCount) % ObjectCount;
bool sleep = rawIndex % ThreadCount == 0;
if (!sleep)
{
Thread.Sleep(10);
}
object obj = _objects[index];
lock (obj)
{
if (sleep)
{
Thread.Sleep(250);
}
}
}
}
}
}
Events are a common source of memory leaks in .NET, and AutoResetEvent and ManualResetEvent are very badly named; they are not the cause.
When you see something like this:
myForm.OnClicked += Form_ClickHandler
That is the type of event this is talking about. When you register an event handler, the event source (like OnClicked) keeps a reference to the handler. If you create and register new handlers, you MUST unregister them (like myForm.OnClicked -= Form_ClickHandler), otherwise your memory use will keep growing.
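A small sketch of the subscribe/unsubscribe pairing described above, using made-up Publisher and Subscriber classes rather than a real form. The SubscriberCount property is only an illustrative way to observe that the publisher's reference to the handler goes away after the unsubscribe.

```csharp
using System;

class Publisher
{
    public event EventHandler Clicked;

    public void Raise()
    {
        var handler = Clicked;
        if (handler != null)
        {
            handler(this, EventArgs.Empty);
        }
    }

    // Illustrative only: how many handlers the event currently references.
    public int SubscriberCount
    {
        get { return Clicked == null ? 0 : Clicked.GetInvocationList().Length; }
    }
}

class Subscriber : IDisposable
{
    private readonly Publisher _source;
    public int Calls { get; private set; }

    public Subscriber(Publisher source)
    {
        _source = source;
        _source.Clicked += OnClicked; // the publisher now references this object
    }

    private void OnClicked(object sender, EventArgs e)
    {
        Calls++;
    }

    public void Dispose()
    {
        _source.Clicked -= OnClicked; // drop the reference so this object can be collected
    }
}
```

Tying the unsubscribe to Dispose is one design choice among several; the essential part is that every += has a matching -= when the subscriber is shorter-lived than the publisher.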
For more info:
Why and How to avoid Event Handler memory leaks?
C# Events Memory Leak
While learning threading, I found memory barriers (fences) really hard to understand. In my case I want 10 threads to simultaneously increment an Int32 number x, 100 times each (x++), and get the result 10 * 100 = 1000.
So this is actually an atomicity problem, and from what I know so far there are a number of ways to achieve that, each of which limits concurrency in some way:
Interlocked.Increment
exclusive lock (lock, monitor, Mutex, Semaphore, etc.)
ReaderWriterLockSlim
If there are better ways, please guide me. I tried to use a volatile read/write but failed:
for (int i = 0; i < 10000; i++)
{
Thread.VolatileRead(ref x);
Thread.VolatileWrite(ref x, x + 1);
}
My investigation code is tidied below:
private const int MaxThraedCount = 10;
private Thread[] m_Workers = new Thread[MaxThraedCount];
private volatile int m_Counter = 0;
private Int32 x = 0;
protected void btn_DoWork_Click(object sender, EventArgs e)
{
for (int i = 0; i < MaxThraedCount; i++)
{
m_Workers[i] = new Thread(IncreaseNumber) { Name = "Thread " + (i + 1) };
m_Workers[i].Start();
}
}
void IncreaseNumber()
{
try
{
for (int i = 0; i < 10000; i++)
Interlocked.Increment(ref x);
// Increases Counter and decides whether or not sets the finish signal
m_Counter++;
if (m_Counter == MaxThraedCount)
{
// Print finish information on UI
m_Counter = 0;
}
}
catch (Exception ex)
{
throw;
}
}
My question is: how can I use a memory barrier to replace Interlocked, since "All of Interlocked’s methods generate a full fence"? I tried to modify the increment loop as below, but it failed, and I don't understand why.
for (int i = 0; i < 10000; i++)
{
Thread.MemoryBarrier();
x++;
Thread.MemoryBarrier();
}
The memory barrier just keeps memory operations from moving from one side of the barrier to the other. Your issue is this:
Thread A reads the value of X.
Thread B reads the value of X.
Thread A adds one to the value it read.
Thread B adds one to the value it read.
Thread A writes back the value it calculated.
Thread B writes back the value it calculated.
Oops, two increments only added one. Memory barriers are not atomic operations and they are not locks. They just enforce ordering, not atomicity.
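To show what an atomic read-modify-write looks like in contrast to the interleaving above, here is a sketch that spells out an increment as a compare-and-swap retry loop. In practice Interlocked.Increment does the same job in one call; AtomicCounterDemo and Run are hypothetical names for this example. If another thread changed x between the read and the write, the CompareExchange fails and the loop retries, so no increment is lost.

```csharp
using System;
using System.Threading;

static class AtomicCounterDemo
{
    // Each of `threads` workers adds 1 to a shared int `perThread` times.
    public static int Run(int threads, int perThread)
    {
        int x = 0;
        var workers = new Thread[threads];
        for (int t = 0; t < threads; t++)
        {
            workers[t] = new Thread(() =>
            {
                for (int i = 0; i < perThread; i++)
                {
                    int seen, next;
                    do
                    {
                        seen = Volatile.Read(ref x); // value this attempt is based on
                        next = seen + 1;
                        // Write `next` only if x still equals `seen`; otherwise retry.
                    }
                    while (Interlocked.CompareExchange(ref x, next, seen) != seen);
                }
            });
            workers[t].Start();
        }

        foreach (var worker in workers)
        {
            worker.Join();
        }
        return x;
    }
}
```

AtomicCounterDemo.Run(10, 100) corresponds to the 10 threads times 100 increments from the question.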
Unfortunately, the x86 architecture does not offer any atomic read-modify-write operations that don't include a full fence. It is what it is. On the bright side, the full fence is heavily optimized. (For example, for normally aligned data on modern CPUs it locks only the affected cache line rather than a bus.)
You can't "use MemoryBarrier to replace Interlocked". They are two different tools.
Use MemoryBarrier, volatile etc to control re-ordering of reads and writes. Use Interlocked, lock etc for atomicity.
(Besides, are you aware that calling MemoryBarrier also generates a full fence*, as do VolatileRead and VolatileWrite? So if you're trying to avoid Interlocked, lock etc for performance reasons, there's a good chance that your alternatives will be less performant as well as more likely to be broken.)
*In the standard Microsoft CLR, at least. I'm not sure about Mono etc.
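The other half of that split, ordering rather than atomicity, can be sketched as a simple publish pattern: one thread writes a value and then sets a flag, and another thread waits for the flag before reading the value. PublishDemo and its fields are invented for this example, and it uses the Volatile class (available from .NET 4.5) rather than Thread.MemoryBarrier. Nothing here is incremented concurrently, so ordering alone is enough.

```csharp
using System;
using System.Threading;

static class PublishDemo
{
    private static int _payload;
    private static bool _ready;

    public static int Run()
    {
        _payload = 0;
        _ready = false;
        int observed = -1;

        var consumer = new Thread(() =>
        {
            // Acquire read: the payload read below cannot move before this check.
            while (!Volatile.Read(ref _ready))
            {
                Thread.Yield();
            }
            observed = _payload;
        });
        consumer.Start();

        _payload = 42;
        // Release write: the payload store above cannot move after this flag store.
        Volatile.Write(ref _ready, true);

        consumer.Join();
        return observed;
    }
}
```

PublishDemo.Run() returns the value the consumer saw after the flag was set.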