Way to work with threads properly in C#

I have some difficulties designing the way my code should work:
Serial #1 (receives data at any time) invokes Routine() if some particular received value A is > constant1, but only if Routine() is not running, otherwise ONLY the last invocation will run after Routine() ends
Serial #2 (receives data at any time) sets B and C with the received data
Routine() checks if C > constant2 and saves B and C to a file
Timer (every N seconds) runs another routine that checks the saved files and sends an email (without interfering with Routine() while is saving B and C)
My current design uses a couple of global booleans, but I think that is producing some problems (because a boolean can change between my checking it and setting it again to start the 'locked' procedure).
So, what is the recommended way to solve a sync problem like this? lock(someGlobalObject)? Using Monitor (and how do I discard multiple pending Routine() invocations)? Mutex? Semaphore?
Thanks!

First off, lock and Monitor are functionally the same.
Lock statement vs Monitor.Enter method
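For illustration, here is a minimal sketch of that equivalence; the second method is roughly what the compiler generates for the lock statement in the first (C# 4 and later). The class and counter are invented for the example:

```csharp
using System.Threading;

class LockExpansion
{
    private readonly object _gate = new object();
    public int Counter;

    public void WithLockStatement()
    {
        lock (_gate)
        {
            Counter++;   // critical section
        }
    }

    // Roughly what the compiler expands the lock statement into (C# 4+):
    public void WithMonitor()
    {
        bool lockTaken = false;
        try
        {
            Monitor.Enter(_gate, ref lockTaken);
            Counter++;   // critical section
        }
        finally
        {
            if (lockTaken) Monitor.Exit(_gate);
        }
    }
}
```

The lockTaken flag ensures Monitor.Exit is only called if the lock was actually acquired, even if Monitor.Enter throws.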
Now, generally, when I'm writing multi-threaded code, the decision is between a ReaderWriterLock (ReaderWriterLockSlim on .NET 3.5 or later) and lock(). lock() will perform better when you have multiple threads that need to write to the same object and few that need to read. A ReaderWriterLock will perform better when the shared data is read much more frequently (and from multiple threads) than it is written to.
ReaderWriterLock vs lock{}
If I understand your example correctly, B and C are objects that are accessed from multiple threads. It looks like they are written to from one thread and read from two other threads. So, I would suggest that it all depends on the frequency of the reads and writes. If B and C are written to much more frequently than they are read, try a lock() statement. Otherwise, if they are read more frequently, go with a ReaderWriterLock(Slim) object. Obviously, you'll need to run some tests to determine if one is better than the other. Either one should solve the problem, though one is most likely going to be faster than the other.
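As an illustrative sketch of the two options (the fields B and C are from the question; the class and method names are invented):

```csharp
using System.Threading;

class SharedState
{
    private readonly object _gate = new object();
    private readonly ReaderWriterLockSlim _rw = new ReaderWriterLockSlim();
    private double _b, _c;

    // Option 1: plain lock -- simplest, good when writes dominate.
    public void SetWithLock(double b, double c)
    {
        lock (_gate) { _b = b; _c = c; }
    }

    public (double B, double C) GetWithLock()
    {
        lock (_gate) { return (_b, _c); }
    }

    // Option 2: ReaderWriterLockSlim -- allows many concurrent readers,
    // better when reads far outnumber writes.
    public void SetWithRwLock(double b, double c)
    {
        _rw.EnterWriteLock();
        try { _b = b; _c = c; }
        finally { _rw.ExitWriteLock(); }
    }

    public (double B, double C) GetWithRwLock()
    {
        _rw.EnterReadLock();
        try { return (_b, _c); }
        finally { _rw.ExitReadLock(); }
    }
}
```

Either version guarantees that B and C are always read as a consistent pair; only the contention behaviour differs.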

Related

Using a double buffer technique for concurrent reading and writing?

I have a relatively simple case where:
My program will be receiving updates via WebSockets and will use these updates to update its local state. These updates will be very small (usually 1-1000 bytes of JSON, so < 1 ms to deserialize) but very frequent (up to ~1000/s).
At the same time, the program will be reading/evaluating from this local state and outputs its results.
Both of these tasks should run in parallel and will run for the duration of the program, i.e. never stop.
Local state size is relatively small, so memory usage isn't a big concern.
The tricky part is that updates need to happen "atomically", so that the reader never sees a local state that has, for example, only half of an update written. The state is not constrained to primitives and could contain arbitrary classes AFAICT atm, so I cannot solve it with something simple like Interlocked atomic operations. I plan on running each task on its own thread, so a total of two threads in this case.
To achieve this goal I thought to use a double buffer technique, where:
It keeps two copies of the state so one can be read from while the other is being written to.
The threads could communicate which copy they are using via a lock: the writer thread locks a copy while writing to it; the reader thread requests access to the lock after it's done with its current copy; the writer thread sees that the reader is using it, so it switches to the other copy.
The writing thread keeps track of the state updates it has applied to the current copy, so when it switches to the other copy it can "catch up".
That's the general gist of the idea, but the actual implementation will be a bit different of course.
I've tried to look up whether this is a common solution but couldn't really find much info, so it's got me wondering things like:
Is it viable, or am I missing something?
Is there a better approach?
Is it a common solution? If so what's it commonly referred to as?
(bonus) Is there a good resource I could read up on for topics related to this?
Pretty much I feel I've run into a dead-end where I cannot find (because I don't know what to search for) many more resources to see if this approach is "good". I plan on writing this in .NET C#, but I assume the techniques and solutions could translate to any language. All insights appreciated.
You actually need four buffers/objects. Two buffers/objects are owned by the reader, one by the writer, and one in the mailbox.
The reader -- each time it finishes a group of atomic operations on its newer object, it uses an interlocked exchange to swap its older object handle (pointer or index, it doesn't matter) with the mailbox one. Then it looks at the newly obtained object and compares its sequence number to the object it just read (and is still holding) to find out which is newer.
The writer -- writes a complete copy of the latest data into its object, then uses an interlocked exchange to swap the newly written object with the mailbox one.
As you can see, the writer can steal the mailbox object at any time, but never the one that the reader is using, so read operations stay atomic. And the reader can steal the mailbox object at any time, but never the one the writer is using, so write operations stay atomic.
As long as the interlocked-exchange function produces the correct memory fence (release for the swap done in the writer thread, acquire for the reader thread), the objects can themselves be arbitrarily complex.
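A minimal sketch of the mailbox exchange in C# (the State class, its sequence number, and the method names are assumptions for illustration; Interlocked.Exchange on a reference provides the required fences):

```csharp
using System.Threading;

// Hypothetical state object; the sequence number lets the reader
// decide which of two objects it holds is newer.
class State
{
    public long Sequence;
    public string Payload = "";
}

class Mailbox
{
    private State _slot = new State();   // the shared "mailbox" slot

    // Writer: fill a private object with a complete copy of the latest
    // data, then publish it atomically, taking back whatever object was
    // in the mailbox for reuse on the next write.
    public State Publish(State freshlyWritten)
    {
        return Interlocked.Exchange(ref _slot, freshlyWritten);
    }

    // Reader: swap in the older of its two private objects, then compare
    // sequence numbers against the object it is still holding.
    public State TakeLatest(State spare)
    {
        return Interlocked.Exchange(ref _slot, spare);
    }
}
```

The writer always has one private object to fill, the reader always has two (so it can compare sequence numbers), and the mailbox holds the fourth; no participant ever touches an object the other is using.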
If I understand correctly, the writes themselves are synchronous. If so, then maybe it's not necessary to keep two copies or even to use locks.
Maybe something like this could work?
State state = populateInitialState();
...
// Reader thread
public State doRead() {
    return makeCopyOfState(state);
}
...
// Writer thread
public void updateState() {
    State newState = makeCopyOfState(state);
    // make changes in newState
    state = newState;  // a single reference assignment publishes the whole update
                       // (the state field should be volatile so readers see it)
}
It looks like you are using the input-process-output pattern in a multithreaded pipeline. Sometimes the input and processing phases (or processing and output phases) are merged when the problem is simple.
You have added a C# tag so using something like a BlockingCollection might be a useful way to communicate between the input and output threads. Since the local state is relatively small (your words) then posting a data-object containing a copy of the local state from the input thread to the output thread could be a simple solution. This follows a share-nothing philosophy which satisfies the atomic requirement because a snapshot of the current state is queued. The "catch up" capability is satisfied because the queue contains the backlog of state changes.
Generally, Messaging Patterns and Conversation Patterns are useful resources when trying to work out what to communicate and how to communicate between 2 or more threads (or processes, services, servers, etc).
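A sketch of that share-nothing approach (the Snapshot class and pipeline wrapper are invented for illustration; BlockingCollection is the real .NET type):

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical immutable copy of the local state posted per update.
sealed class Snapshot
{
    public readonly int Version;
    public readonly string Data;
    public Snapshot(int version, string data) { Version = version; Data = data; }
}

class SnapshotPipeline
{
    // Bounded so a slow consumer applies back-pressure to the producer.
    private readonly BlockingCollection<Snapshot> _queue =
        new BlockingCollection<Snapshot>(1024);

    public void Post(Snapshot s) => _queue.Add(s);     // input thread
    public void Complete() => _queue.CompleteAdding();

    public void ConsumeAll(Action<Snapshot> handle)     // output thread
    {
        foreach (var s in _queue.GetConsumingEnumerable())
            handle(s);   // each snapshot is seen whole: atomicity by construction
    }
}
```

Because each queued item is a complete immutable snapshot, the output thread never observes a half-written state, and the queue itself is the "catch up" backlog.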

Should I lock a datatable in multithread paradigm?

In a project of windows services (C# .Net Platform), I need a suggestion.
In the project I have a class named Cache, in which I keep some data that I need frequently. There is a thread that updates the cache every 30 minutes, whereas there are multiple threads which use the cache data.
In the cache class there are getter and setter functions, used by the user threads and the cache-updater thread respectively. No one uses data objects like tables directly, because they are private members.
From the above context, do you think I should use locking functionality in the cache class?
The effects of not using locks when writing to a shared memory location (like a cache) really depend on the application. If the code were used in banking software the results could be catastrophic.
As a rule of thumb: when multiple threads access the same location, even if only one thread writes and all the others read, you should use locks (around the write operation). What can happen is that one thread starts reading data, gets swapped out by the updater thread, and ends up using a mixture of old and new data. Whether that really has an impact depends on the application and how sensitive the data is.
Key Point: If you don't lock on the reads there's a chance your read won't see the changes. A lock will force your read code to get values from main memory rather than pulling data from a cache or register. To avoid actually locking you could use Thread.MemoryBarrier(), which will do the same job without the overhead of actually locking anything.
Minor Points: Using lock would prevent a read from getting half old data and half new data. If you are reading more than one field, I recommend it. If you are really clever, you could keep all the data in an immutable object and return that object to anyone calling the getter and so avoid the need for a lock. (When new data comes in, you create a new immutable object, then replace the old with the new in one go. Use a lock for the actual write, or, if you're still feeling really clever, make the field referencing the object volatile.)
Also: when your getter is called, remember it's running on many other threads. There's a tendency to think that once something is running the Cache class's code it's all on the same thread, and it's not.
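The "immutable object + single reference swap" idea from the answer might be sketched like this (the CacheData fields are invented for illustration):

```csharp
// Hypothetical immutable holder: all fields are set once in the
// constructor, so any reference to it is a consistent snapshot.
sealed class CacheData
{
    public readonly int Version;
    public readonly string Payload;
    public CacheData(int version, string payload)
    {
        Version = version;
        Payload = payload;
    }
}

class Cache
{
    // volatile makes the reference swap immediately visible to readers.
    private volatile CacheData _current = new CacheData(0, "");

    // Readers get a consistent snapshot with no lock: every field they
    // see belongs to one fully constructed object.
    public CacheData Get() => _current;

    // The updater thread builds a complete new object, then publishes it
    // in one reference assignment.
    public void Refresh(int version, string payload)
    {
        _current = new CacheData(version, payload);
    }
}
```

Readers can never observe half old and half new data, because the two halves live in different objects and the reference is swapped in a single atomic assignment.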

guarantee that up-to-date value of variable is always visible to several threads on multi-processor system

I'm using such configuration:
.NET framework 4.5
Windows Server 2008 R2
HP DL360p Gen8 (2 * Xeon E5-2640, x64)
I have such field somewhere in my program:
protected int HedgeVolume;
I access this field from several threads. As I have a multi-processor system, I assume these threads may be executing on different processors.
What should I do to guarantee that any time I use this field the most recent value is read? And to make sure that when I write a value it becomes visible to all other threads immediately?
What should I do?
just leave field as is.
declare it volatile
use Interlocked class to access the field
use .NET 4.5 Volatile.Read, Volatile.Write methods to access the field
use lock
I only need the simplest way to make my program work on this configuration; I don't need it to work on other computers, servers or operating systems. Also, I want minimal latency, so I'm looking for the fastest solution that will always work on this standard configuration (multiprocessor Intel x64, .NET 4.5).
Your question is missing one key element... How important is the integrity of the data in that field?
volatile gives you performance, but if a thread is currently writing changes to the field, you won't see that data until it's done, so you might read out-of-date information and potentially overwrite changes another thread is making. If the data is sensitive, you might get bugs that are very hard to track down. However, if you are doing very quick updates, overwriting the value without reading it, and don't care that once in a while you get data that is outdated (by a few ms), go for it.
lock guarantees that only one thread can access the field at a time. You can put it only on the methods that write the field and leave the reading methods alone. The downside is that it is slower and may block a thread while another is performing its task. However, you are sure your data stays valid.
Interlocked provides atomic read-modify-write operations on a single field (increment, exchange, compare-and-exchange). My opinion? Don't use it unless you know exactly why you would be using it and exactly how to use it. It gives you options, but with those options come pitfalls: it makes a single operation atomic, but it won't prevent parallel threads from performing their tasks simultaneously, and it might not do what you think it does.
You want to use Volatile.Read().
As you are running on x86/x64, all writes in C# are already the equivalent of Volatile.Write(); you would only need explicit Volatile.Write() on an architecture such as Itanium.
Volatile.Read() will ensure that you get the latest copy regardless of which thread last wrote it.
There is a fantastic write up here, C# Memory Model Explained
Summary of it includes,
On some processors, not only must the compiler avoid certain
optimizations on volatile reads and writes, it also has to use special
instructions. On a multi-core machine, different cores have different
caches. The processors may not bother to keep those caches coherent by
default, and special instructions may be needed to flush and refresh
the caches.
Hopefully that much is obvious, other than the need for volatile to stop the compiler from optimising it, there is the processor as well.
However, in C# all writes are volatile (unlike say in Java),
regardless of whether you write to a volatile or a non-volatile field.
So, the above situation actually never happens in C#. A volatile write
updates the thread’s cache, and then flushes the entire cache to main
memory.
You do not need Volatile.Write(). A more authoritative source is here: Joe Duffy, CLR Memory Model. However, you may need it to stop the compiler reordering the write.
Since all C# writes are volatile, you can think of all writes as going
straight to main memory. A regular, non-volatile read can read the
value from the thread's cache, rather than from main memory.
You need Volatile.Read()
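A minimal sketch of that advice applied to the field from the question (the wrapper class and method names are invented):

```csharp
using System.Threading;

class Hedge
{
    protected int HedgeVolume;

    // Every write in C# already has release semantics on x86/x64, but
    // Volatile.Write also stops the compiler reordering the store.
    public void SetVolume(int value) => Volatile.Write(ref HedgeVolume, value);

    // Volatile.Read guarantees the load has acquire semantics and is not
    // satisfied from a stale register or compiler-cached value.
    public int GetVolume() => Volatile.Read(ref HedgeVolume);
}
```

Note this only makes individual reads and writes safe; a read-modify-write such as HedgeVolume += x still needs Interlocked or a lock.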
When you start designing a concurrent program, you should consider these options in order of preference:
1) Isolation: each thread has its own private data
2) Immutability: threads can see shared state, but it never changes
3) Mutable shared state: protect all access to shared state with locks
If you get to (3), then how fast do you actually need this to be?
Acquiring an uncontested lock takes on the order of 10 ns (10^-8 seconds) - that's fast enough for most applications and is the easiest way to guarantee correctness.
Using any of the other options you mention takes you into the realm of low-lock programming, which is insanely difficult to get correct.
If you want to learn how to write concurrent software, you should read these:
Intro: Joe Albahari's free e-book - will take about a day to read
Bible: Joe Duffy's "Concurrent Programming on Windows" - will take about a month to read
Depends what you DO. For reading only, volatile is easiest; Interlocked allows a little more control. A lock is unnecessary, as it is heavier-weight than the problem you describe requires. Not sure about Volatile.Read/Write, never used them.
volatile - bad, there are some issues (see Joe Duffy's blog)
if all you do is read the value or unconditionally write a value - use Volatile.Read and Volatile.Write
if you need to read and subsequently write an updated value - use the lock syntax. You can achieve the same effect without lock using the Interlocked class, but this is more complex (it involves CompareExchange to ensure that you are updating the value you read, i.e. that it has not been modified since the read operation, plus logic to retry if it was).
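That retry pattern might look like the following; the AddCapped operation itself is just an invented example of a read-then-write update:

```csharp
using System;
using System.Threading;

static class AtomicOps
{
    // Lock-free read-modify-write: read the field, compute the new value,
    // and retry if another thread changed the field in between.
    public static int AddCapped(ref int target, int delta, int cap)
    {
        while (true)
        {
            int seen = Volatile.Read(ref target);
            int want = Math.Min(seen + delta, cap);
            // Publish only if nobody changed the field since we read it;
            // CompareExchange returns the value it found before swapping.
            if (Interlocked.CompareExchange(ref target, want, seen) == seen)
                return want;
        }
    }
}
```

The loop only re-executes when some other thread has successfully updated the field, which is what makes this pattern lock-free rather than merely spinning.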
From this I understand that you want to be able to read the last value that was written to a field. Let's make an analogy with the SQL data-concurrency problem. If you want to be able to read the last value of a field, the instructions must be atomic: if someone is writing to the field, all of the threads must be blocked from reading until that thread finishes the write transaction. After that, every read of the field will be safe. The problem is not with reading as much as with writing. A lock around that field whenever it's written should be enough, if you ask me...
First have a look here: Volatile vs. Interlocked vs. lock
The volatile modifier surely is a good option for a multi-core CPU.
But is it enough? It depends on how you calculate the new HedgeVolume value!
If your new HedgeVolume does not depend on the current HedgeVolume, then you're done with volatile.
But if HedgeVolume[x] = f(HedgeVolume[x-1]) then you need some thread synchronisation to guarantee that HedgeVolume doesn't change while you calculate and assign the new value. Both the lock and Interlocked scenarios would be suitable in this case.
I had a similar question and found this article to be extremely helpful. It's a very long read, but I learned a LOT!

Is it possible to make a piece of code atomic (C#)?

When I said atomic, I meant that a set of instructions will execute without any context switch to another thread in the same process (other kinds of switches have to happen, of course). The only solution I came up with is to suspend all threads except the currently executing one before the part and resume them after it. Is there a more elegant way?
The reason I want to do this is to collect a coherent state of objects running on multiple threads. However, their code cannot be changed (they're already compiled), so I cannot insert mutexes, semaphores, etc. into it. The atomic operation is of course the state collection (i.e. copying some variables).
There are some atomic operations in the Interlocked class but it only provides a few very simple operations. It can't be used to create an entire atomic block of code.
I'd advise using locking carefully to make sure that your code will still work even if the context changes.
Well, you can use locks, but you can't prevent context switching exactly. But if your threads lock on the same object, then the threads waiting obviously won't be running, so there's no context switching involved since there's nothing to run.
You might want to look at this page too.
No. You can surround a block of code with a Monitor to make it thread-safe, but you cannot make general code snippets atomic.
object lck = new object();
lock(lck)
{
    // thread safe code goes in here
}
No, that's against the nature of multi-tasking.
Unless it's a very simple operation like an increment - which is not the subject of your question.
It is possible to obtain a global state from a shared memory composed of a collection (array) of atomic single-reader/multi-writer registers. The solution is simple but not trivial. You can read the algorithm published in the paper "Atomic Snapshots of Shared Memory", or chapter 4 of The Art of Multiprocessor Programming; there you can get ideas on an implementation in the Java language. Of course, once you are familiar with the idea you should be able to transport it to any other language.

Do I need to synchronize thread access to an int?

I've just written a method that is called by multiple threads simultaneously and I need to keep track of when all the threads have completed. The code uses this pattern:
private void RunReport()
{
    _reportsRunning++;
    try
    {
        //code to run the report
    }
    finally
    {
        _reportsRunning--;
    }
}
This is the only place within the code that _reportsRunning's value is changed, and the method takes about a second to run.
Occasionally, when I have more than six or so threads running reports together, the final value of _reportsRunning can get down to -1. If I wrap the _reportsRunning++ and _reportsRunning-- calls in a lock then the behaviour appears to be correct and consistent.
So, to the question: When I was learning multithreading in C++ I was taught that you didn't need to synchronize calls to increment and decrement operations because they were always one assembly instruction and therefore it was impossible for the thread to be switched out mid-call. Was I taught correctly, and if so, how come that doesn't hold true for C#?
The ++ operator is not atomic in C# (and I doubt it is guaranteed to be atomic in C++), so yes, your counting is subject to race conditions.
Use Interlocked.Increment and .Decrement
System.Threading.Interlocked.Increment(ref _reportsRunning);
try
{
    ...
}
finally
{
    System.Threading.Interlocked.Decrement(ref _reportsRunning);
}
So, to the question: When I was learning multithreading in C++ I was taught that you didn't need to synchronize calls to increment and decrement operations because they were always one assembly instruction and therefore it was impossible for the thread to be switched out mid-call. Was I taught correctly, and if so how come that doesn't hold true for C#?
This is incredibly wrong.
On some architectures, like x86, there are single increment and decrement instructions. Many architectures do not have them and need to do separate loads and stores. Even on x86, there is no guarantee the compiler will generate the memory version of these instructions - it'll likely load into a register first, especially if it needs to do several operations with the result.
Even if the compiler could be guaranteed to always generate the memory version of increment and decrement on x86, that still does not guarantee atomicity - two CPU's could modify the variable simultaneously and get inconsistent results. The instruction would need the lock prefix to force it to be an atomic operation - compilers never emit the lock variant by default since it is less performant since it guarantees the action is atomic.
Consider the following x86 assembly instruction:
inc [i]
If i is initially 0 and the code is run on two threads on two cores, the value after both threads finish could legally be either 1 or 2, since there is no guarantee that one thread will complete its read before the other thread finishes its write, or that one thread's write will even be visible before the other thread's read.
Changing this to:
lock inc [i]
Will result in getting a final value of 2.
Win32's InterlockedIncrement and InterlockedDecrement and .NET's Interlocked.Increment and Interlocked.Decrement result in doing the equivalent (possibly the exact same machine code) of lock inc.
You were taught wrong.
There does exist hardware with atomic integer increment, so it's possible that what you were taught was right for the hardware and compiler you were using at the time. But in general in C++ you can't even guarantee that incrementing a non-volatile variable writes memory consecutively with reading it, let alone atomically with reading.
Incrementing the int is one instruction, but what about loading the value into the register?
That's what i++ effectively does:
load i into a register
increment the register
unload the register into i
As you can see there are 3 instructions (this may be different on other platforms), and at any stage the CPU can context switch to a different thread, leaving your variable in an unknown state.
You should use Interlocked.Increment and Interlocked.Decrement to solve that.
No, you need to synchronize access. On Windows you can do this easily with InterlockedIncrement() and InterlockedDecrement(). I'm sure there are equivalents for other platforms.
EDIT: Just noticed the C# tag. Do what the other guy said. See also: I've heard i++ isn't thread safe, is ++i thread-safe?
Any kind of increment/decrement operation in a higher level language (and yes, even C is higher level compared to machine instructions) is not atomic by nature. However, each processor platform usually has primitives that support various atomic operations.
If your lecturer was referring to machine instructions, Increment and Decrement operations are likely to be atomic. Yet, that is not always correct on the ever increasing multi-core platforms of today, unless they guarantee coherency.
The higher level languages usually implement support for atomic transactions using low level atomic machine instructions. This is provided as the interlock mechanism by the higher level API.
x++ probably isn't atomic, but ++x might be (not sure offhand, but if you consider the difference between post- and pre-increment it should be clear why pre- is more amenable to atomicity).
A bigger point is, if these runs take a second to run each, the amount of time added by a lock is going to be noise compared to the runtime of the method itself. It's probably not worth monkeying with trying to remove the lock in this case - you've got a correct solution with locking, that will likely not have a visible difference in performance from the non-locking solution.
On a single-processor machine, if one isn't using virtual memory, x++ (rvalue ignored) is likely to translate into a single atomic INC instruction on x86 architectures (if x is long, the operation is only atomic when using a 32-bit compiler). Also, movsb/movsw/movsl are atomic ways of moving a byte/word/longword; a compiler isn't apt to use those as the normal way of assigning variables, but one could have an atomic-move utility function. It would be possible for a virtual memory manager to be written in such a way that those instructions would behave atomically if a page fault occurs on the write, but I don't think that's normally guaranteed.
On a multi-processor machine, all bets are off unless one uses explicit interlocked instructions (invokable via special library calls). The most versatile instruction which is commonly available is CompareExchange. That instruction will alter a memory location only if it contains an expected value; it will return the value it had when it decided whether or not to alter it. If one wishes to "xor" a variable with 1, one could do something like (in vb.net)
Dim OldValue As Integer
Do
    OldValue = Variable
Loop While Threading.Interlocked.CompareExchange(Variable, OldValue Xor 1, OldValue) <> OldValue
This approach allows one to perform any sort of atomic update to a variable whose new value should depend on the old value. For certain common operations like increment and decrement, there are faster alternatives, but the CompareExchange allows one to implement other useful patterns as well.
Important caveats: (1) keep the loop as short as possible; the longer the loop, the more likely it is that another task will hit the variable during the loop, and the more time will be wasted each time that happens; (2) a specified number of updates, divided arbitrarily among threads, will always complete, since the only way a thread can be forced to re-execute the loop is if some other thread has made useful progress; if some threads can perform updates without making forward progress toward completion, however, the code may become live-locked.
