Examples of thread safe classes in the .NET Framework? - c#

I'm currently working on code which is supposed to be thread safe. Lots of asynchronous calls and events and stuff that generally requires quite a bit of work to keep synchronized and thread safe.
Are there any classes in the .NET framework which deal with this sort of thing, which I could look at (decompile), to see how things are supposed to be done? The more complex the better really...

.NET 4.0 Thread-Safe Collections
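To get a feel for how those are used, here is a minimal sketch (the key names and iteration counts are just illustrative) built on the System.Collections.Concurrent types, which do their own locking internally so several threads can read and write them safely:

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class ConcurrentCollectionsDemo
{
    static void Main()
    {
        var hits = new ConcurrentDictionary<string, int>();
        var queue = new ConcurrentQueue<string>();

        // Many iterations mutate the same collections in parallel, with no explicit locks.
        Parallel.For(0, 1000, i =>
        {
            string key = "page" + (i % 10);
            hits.AddOrUpdate(key, 1, (k, count) => count + 1); // atomic update per key
            queue.Enqueue(key);
        });

        Console.WriteLine(hits["page3"]); // 100
        Console.WriteLine(queue.Count);   // 1000
    }
}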
Though targeting Windows rather than just .NET, Joe Duffy's book is worth noting: Concurrent Programming on Windows

MSDN has some good information on asynchronous programming in .NET. Check out Asynchronous Programming Design Patterns.
Also check out the Monitor and Mutex classes in System.Threading.
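As a rough sketch of the difference between the two (the class, field, and mutex names here are invented for illustration): Monitor, usually via the lock statement, gives you mutual exclusion within one process, while a named Mutex can also synchronize across processes.

using System;
using System.Threading;

class SyncSketch
{
    private static readonly object _gate = new object();
    private static int _counter;

    // Monitor (via the lock statement): mutual exclusion within one process.
    public static void Increment()
    {
        lock (_gate)
        {
            _counter++;
        }
    }

    // A named Mutex can synchronize across processes as well.
    public static void DoExclusiveWork()
    {
        using (var mutex = new Mutex(false, "Global\\MyAppSingleWriter"))
        {
            mutex.WaitOne(); // block until we own the mutex
            try
            {
                // ... touch the shared resource ...
            }
            finally
            {
                mutex.ReleaseMutex();
            }
        }
    }
}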

There are a few ways to keep things atomic. There are classes that support memory fences, such that code is guaranteed to execute in a certain order. You can use the volatile keyword to ensure that a variable is read from memory on each use, so you never see a stale value; note, however, that volatile does not make compound operations such as increments atomic - the Interlocked class is the tool for those. These are just a few tools; I would suggest a simple Google of "atomic C#, F#".
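To make that distinction concrete, here is a small sketch (the field names are just illustrative): volatile gives you visibility, while Interlocked gives you the atomic read-modify-write.

using System.Threading;

class Counters
{
    // volatile guarantees visibility (each read goes back to memory),
    // but it does NOT make compound operations like ++ atomic.
    private volatile bool _stopRequested;

    private int _count; // updated only via Interlocked

    public void RequestStop()
    {
        _stopRequested = true;
    }

    public void Work()
    {
        while (!_stopRequested)                // always sees the latest value
        {
            Interlocked.Increment(ref _count); // atomic read-modify-write
        }
    }
}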

Related

in C# how to run method async in the same thread

Is it possible to define and call a method asynchronously in the same thread as the caller?
Suppose I have just one core and I don't want the thread-management overhead of something like 100 threads.
Edit
The reason I ask is the node.js model of doing things - everything on one thread, never blocking anything - which has proved to be very efficient, and which made me wonder if the same thing is possible in C# (I couldn't achieve it myself).
Edit2 Well, as noted in the comments, node isn't single-threaded after all (although a simple load test shows that it uses just one core...), but I think what makes it so efficient is the implicit requirement to write only non-blocking code. Which is possible in C#, just not required :) Anyway, thanks everyone...
More info in
this SO post and even more in this one
It's not really clear exactly what context you're talking about, but the async/await feature of C# 5 already helps to support this sort of thing. The difference is that whereas in node.js everything is forced to be single threaded by default (as I understand it; quite possibly incorrectly given the links in the comments), server-side applications in .NET using asynchrony will use very few threads without limiting themselves to that. If everything really can be served by a single thread, it may well be - if you never have more than one thing to do in terms of physical processing, then that's fine.
But what if one request comes in while another is doing a bit of work? Maybe it's doing just a small amount of encryption, or something like that. Do you really want to make the second request wait for the first one to complete? If you do, you could model that in .NET fairly easily with a TaskScheduler associated with a single-thread thread-pool... but more commonly, you'd use the thread-pool built into .NET, which will work efficiently while still allowing concurrency.
First off, you should make sure you're using .NET 4.5 - it has far more asynchronous APIs (e.g. for database and file access) than earlier versions of .NET. You want to use APIs which conform to the Task-based Asynchronous Pattern (TAP). Then, using async/await you can write your server-side code so that it reads a bit like synchronous code, but actually executes asynchronously. So for example, you might have:
Guid userId = await FetchUserIdAsync();
IEnumerable<Message> messages = await FetchMessagesAsync(userId);
Even though you need to "wait" while each of these operations takes place, you do so without blocking a thread. The C# compiler takes care of building a state machine for you. There's no need to explicitly write callbacks which frequently turn into spaghetti code - the compiler does it all.
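Put into a complete method, that might look something like the sketch below (it assumes the same hypothetical TAP-style FetchUserIdAsync and FetchMessagesAsync methods as the snippet above, and requires C# 5 / .NET 4.5):

private async Task ShowMessagesAsync()
{
    Guid userId = await FetchUserIdAsync();                           // no thread is blocked while we wait
    IEnumerable<Message> messages = await FetchMessagesAsync(userId);

    foreach (Message message in messages)
    {
        Console.WriteLine(message);
    }
}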
You need to make sure that whatever web/web-service framework you use supports asynchrony, and it should just manage the rest, using as few threads as it can get away with.
Follow the two links above to find out more about the asynchrony support in C# 5 - it's a huge topic, and I'm sure you'll have more questions afterwards, but when you've read a bit more you should be able to ask very specific questions instead of the current broad one.

differences between multithread and multitask

What's the difference between multithreaded programming and multitasking in C# / .NET 4?
I need some technical reviews.
I am doing some research on the topic and I need something to help me.
Multitasking is a somewhat imprecise term that can mean different things in different contexts. It can refer to:
multi-processing (time sharing between separate processes),
multiple threads or tasks in an embedded system,
a particular form or framework for multi-threading,
or even just plain multithreading.
I think that the 'multitasking' term you're asking about is regarding the "Task Parallelism" support added in .NET 4: http://msdn.microsoft.com/en-us/library/dd537609.aspx
That model would fall into the 3rd item above - it's an abstraction for performing work in parallel that uses threading but tries to keep much of the mechanics of threads under the covers.
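As a rough illustration of that abstraction (the method names below are made up for the example), .NET 4's Task and Parallel types let you describe the units of work and leave thread management to the runtime:

using System;
using System.Threading.Tasks;

class TaskParallelismSketch
{
    static void Main()
    {
        // Describe work as tasks; the runtime schedules them on pooled threads.
        Task<int> sum = Task.Factory.StartNew(() => ComputeSum());

        // Or run independent actions in parallel and wait for all of them.
        Parallel.Invoke(
            () => ProcessOrders(),
            () => RefreshCache());

        Console.WriteLine(sum.Result); // Result blocks only if the task hasn't finished yet
    }

    static int ComputeSum() { return 42; }
    static void ProcessOrders() { }
    static void RefreshCache() { }
}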

Java's ThreadPoolExecutor equivalent for C#?

I used to make good use of Java's ThreadPoolExecutor class and have yet to find a good equivalent in C#. I know of ThreadPool.QueueUserWorkItem which is useful in many cases but no good if you want to control the number of threads assigned to a task or have multiple individual queues for different task types.
For example I liked to use a ThreadPoolExecutor with a single thread to guarantee sequential execution of asynchronous calls. Is there an easy way to do this in C#? Is there a non-static thread pool implementation?
Until .NET 4.0 and the TPL, there was no such feature built in.
However, see this article.
As part of the Reactive Extensions (Rx), the Task Parallel Library was backported to .NET 3.5. If you add a reference to the System.Threading.dll included in its distribution, you can use the TPL with .NET 3.5.
There are also thread pools built into the Concurrency and Coordination Runtime, which is freely available. See this MSDN article for usage.
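If the main thing you miss is the single-threaded executor that guarantees sequential execution, a minimal sketch (the class name is made up) can be built on BlockingCollection from .NET 4: one dedicated thread drains a queue of delegates in order.

using System;
using System.Collections.Concurrent;
using System.Threading;

// Runs queued actions one at a time, in order, on a single dedicated thread -
// roughly what a single-threaded ThreadPoolExecutor gives you in Java.
class SingleThreadExecutor : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread _worker;

    public SingleThreadExecutor()
    {
        _worker = new Thread(() =>
        {
            foreach (Action action in _queue.GetConsumingEnumerable())
            {
                action(); // each work item runs sequentially
            }
        });
        _worker.IsBackground = true;
        _worker.Start();
    }

    public void Execute(Action action)
    {
        _queue.Add(action);
    }

    public void Dispose()
    {
        _queue.CompleteAdding(); // let the worker drain the queue and exit
        _worker.Join();
    }
}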

Alternative to Threads

I've read that threads are very problematic. What alternatives are available? Something that handles blocking and stuff automatically?
A lot of people recommend the background worker, but I've no idea why.
Anyone care to explain "easy" alternatives? The user will be able to select the number of threads to use (depending on their speed needs and computer power).
Any ideas?
To summarize the problems with threads:
if threads share memory, you can get race conditions
if you avoid races by liberally using locks, you can get deadlocks (see the dining philosophers problem)
An example of a race: suppose two threads share access to some memory where a number is stored. Thread 1 reads from the memory address and stores it in a CPU register. Thread 2 does the same. Now thread 1 increments the number and writes it back to memory. Thread 2 then does the same. End result: the number was only incremented by 1, while both threads tried to increment it. The outcome of such interactions depends on timing. Worse, your code may seem to work bug-free, but once in a blue moon the timing is wrong and bad things happen.
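That lost-update scenario is easy to reproduce; here is a minimal sketch (the counter and iteration count are arbitrary), with the atomic fix shown in a comment:

using System;
using System.Threading;

class RaceDemo
{
    private static int _counter;

    static void Main()
    {
        Thread t1 = new Thread(IncrementManyTimes);
        Thread t2 = new Thread(IncrementManyTimes);
        t1.Start();
        t2.Start();
        t1.Join();
        t2.Join();

        // Frequently prints less than 2000000, because _counter++ is a
        // read-modify-write and the two threads' updates interleave.
        Console.WriteLine(_counter);
    }

    static void IncrementManyTimes()
    {
        for (int i = 0; i < 1000000; i++)
        {
            _counter++;                              // racy
            // Interlocked.Increment(ref _counter); // the atomic fix
        }
    }
}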
To avoid these problems, the answer is simple: avoid sharing writable memory. Instead, use message passing to communicate between threads. An extreme example is to put the threads in separate processes and communicate via TCP/IP connections or named pipes.
Another approach is to share only read-only data structures, which is why functional programming languages can work so well with multiple threads.
This is a bit of a higher-level answer, but it may be useful if you want to consider other alternatives to threads. Anyway, most of the answers discussed solutions based on threads (or thread pools) or maybe tasks from .NET 4.0, but there is one more alternative, which is called message-passing. This has been successfully used in Erlang (a functional language used by Ericsson). Since functional programming is becoming more mainstream these days (e.g. F#), I thought I could mention it. In general:
Threads (or thread pools) can usually be used when you have some relatively long-running computation. When it needs to share state with other threads, it gets tricky (you have to correctly use locks or other synchronization primitives).
Tasks (available in the TPL in .NET 4.0) are very lightweight - you can split your program into thousands of tasks and then let the runtime run them (it will use an optimal number of threads). If you can write your algorithm using tasks instead of threads, it sounds like a good idea - you can avoid some synchronization when you run the computation in smaller steps.
Declarative approaches (PLINQ in .NET 4.0 is a great option) - if you have some higher-level data-processing operation that can be encoded using LINQ primitives, then you can use this technique. The runtime will automatically parallelize your code, because LINQ doesn't specify how exactly it should evaluate the results (you just say what results you want to get); there is a small sketch of this after the list.
Message-passing allows you to write a program as concurrently running processes that perform some (relatively simple) tasks and communicate by sending messages to each other. This is great, because you can share some state (send messages) without the usual synchronization issues (you just send a message, then do other things or wait for messages). Here is a good introduction to message-passing in F# from Robert Pickering.
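For the declarative approach mentioned above, a minimal PLINQ sketch (the range and the IsPrime check are just illustrative):

using System.Linq;

class PlinqSketch
{
    static void Main()
    {
        // Declarative: say what you want; AsParallel() lets the runtime decide
        // how many threads to use to compute it.
        int[] primes = Enumerable.Range(2, 1000000)
                                 .AsParallel()
                                 .Where(IsPrime)
                                 .ToArray();

        System.Console.WriteLine(primes.Length);
    }

    static bool IsPrime(int n)
    {
        for (int i = 2; i * i <= n; i++)
        {
            if (n % i == 0) return false;
        }
        return true;
    }
}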
Note that the last three techniques are quite related to functional programming - in functional programming, you design programs differently - as computations that return a result (which makes it easier to use Tasks). You also often write declarative and higher-level code (which makes it easier to use the declarative approaches).
When it comes to actual implementation, F# has a wonderful message-passing library right in the core libraries. In C#, you can use Concurrency & Coordination Runtime, which feels a bit "hacky", but is probably quite powerful too (but may look too complicated).
Won't the parallel programming options in .Net 4 be an "easy" way to use threads? I'm not sure what I'd suggest for .Net 3.5 and earlier...
This MSDN link to the Parallel Computing Developer Center has links to lots of info on Parallel Programming, including links to videos, etc.
I can recommend this project: Smart Thread Pool
Project Description
Smart Thread Pool is a thread pool written in C#. It is far more advanced than the .NET built-in thread pool.
Here is a list of the thread pool features:
The number of threads dynamically changes according to the workload on the threads in the pool.
Work items can return a value.
A work item can be cancelled.
The caller thread's context is used when the work item is executed (limited).
Usage of minimum number of Win32 event handles, so the handle count of the application won't explode.
The caller can wait for multiple or all the work items to complete.
A work item can have a PostExecute callback, which is called as soon as the work item is completed.
The state object that accompanies the work item can be disposed automatically.
Work item exceptions are sent back to the caller.
Work items have priority.
Work items can be grouped.
The caller can suspend the start of a thread pool and work items group.
Threads have priority.
Can run COM objects that have a single-threaded apartment.
Supports Action and Func delegates.
Support for Windows CE (limited).
The MaxThreads and MinThreads can be changed at run time.
Cancel behavior is improved.
"Problematic" is not the word I would use to describe working with threads. "Tedious" is a more appropriate description.
If you are new to threaded programming, I would suggest reading this thread as a starting point. It is by no means exhaustive but has some good introductory information. From there, I would continue to scour this website and other programming sites for information related to specific threading questions you may have.
As for specific threading options in C#, here are some suggestions on when to use each one.
Use BackgroundWorker if you have a single task that runs in the background and needs to interact with the UI. The task of marshalling data and method calls to the UI thread is handled automatically through its event-based model. Avoid BackgroundWorker if (1) your assembly does not already reference the System.Windows.Forms assembly, (2) you need the thread to be a foreground thread, or (3) you need to manipulate the thread priority.
Use a ThreadPool thread when efficiency is desired. The ThreadPool helps avoid the overhead associated with creating, starting, and stopping threads. Avoid using the ThreadPool if (1) the task runs for the lifetime of your application, (2) you need the thread to be a foreground thread, (3) you need to manipulate the thread priority, or (4) you need the thread to have a fixed identity (aborting, suspending, discovering).
Use the Thread class for long-running tasks and when you require features offered by a formal threading model, e.g., choosing between foreground and background threads, tweaking the thread priority, fine-grained control over thread execution, etc.
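As a minimal sketch of the BackgroundWorker option above (the form, label, and report method are invented for illustration), the event model keeps the UI updates on the UI thread for you:

using System.ComponentModel;
using System.Windows.Forms;

public class ReportForm : Form
{
    private readonly BackgroundWorker _worker = new BackgroundWorker();
    private readonly Label _statusLabel = new Label { Dock = DockStyle.Top };

    public ReportForm()
    {
        Controls.Add(_statusLabel);

        _worker.DoWork += (s, e) =>
        {
            // Runs on a background thread; don't touch the UI here.
            e.Result = BuildReport();
        };
        _worker.RunWorkerCompleted += (s, e) =>
        {
            // Raised back on the UI thread; safe to update controls.
            _statusLabel.Text = (string)e.Result;
        };

        Load += (s, e) => _worker.RunWorkerAsync();
    }

    private static string BuildReport()
    {
        System.Threading.Thread.Sleep(2000); // simulate a slow operation
        return "Report ready";
    }
}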
Any time you introduce multiple threads, each running at once, you open up the potential for race conditions. To avoid these, you tend to need to add synchronization, which adds complexity, as well as the potential for deadlocks.
Many tools make this easier. .NET has quite a few classes specifically meant to ease the pain of dealing with multiple threads, including the BackgroundWorker class, which makes running background work and interacting with a user interface much simpler.
.NET 4 is going to do a lot to ease this even more. The Task Parallel Library and PLINQ dramatically ease working with multiple threads.
As for your last comment:
The user will be able to select the number of threads to use (depending on their speed needs and computer power).
Most of the routines in .NET are built upon the ThreadPool. In .NET 4, when using the TPL, the workload will actually scale at runtime for you, eliminating the burden of having to specify the number of threads to use. However, there are ways to do this now.
Currently, you can use ThreadPool.SetMaxThreads to help limit the number of threads generated. In the TPL, you can specify ParallelOptions.MaxDegreeOfParallelism and pass an instance of the ParallelOptions into your routine to control this. The default behavior scales up with more threads as you add more processing cores, which is usually the best behavior in any case.
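For example, honoring a user-chosen thread count might look like this sketch (userSelectedThreads and the per-item work are made up for illustration):

using System.Threading.Tasks;

class DegreeOfParallelismSketch
{
    static void Process(int[] items, int userSelectedThreads)
    {
        var options = new ParallelOptions
        {
            // Cap the number of iterations that run concurrently.
            MaxDegreeOfParallelism = userSelectedThreads
        };

        Parallel.ForEach(items, options, item =>
        {
            // ... per-item work ...
        });
    }
}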
Threads are not problematic if you understand what causes problems with them.
For example, if you avoid statics and you know which APIs to use (e.g. use synchronized streams), you will avoid many of the issues that come up from their bad utilization.
If threading is a problem (this can happen if you have unsafe/unmanaged 3rd-party DLLs that cannot support multithreading), an option is to create a mechanism to queue the operations, i.e. store the parameters of the action in a database and just run through them one at a time. This can be done in a Windows service. Obviously this will take longer, but in some cases it is the only option.
Threads are indispensable tools for solving many problems, and it behooves the maturing developer to know how to effectively use them. But like many tools, they can cause some very difficult-to-find bugs.
Don't shy away from something so useful just because it can cause problems; instead, study and practice until you become the go-to guy for multi-threaded apps.
A great place to start is Joe Albahari's article: http://www.albahari.com/threading/.

What are the important threading API calls to understand before writing multi-threaded code?

Recently I was blogging about the oft over-used idea of multi-threading in .NET. I put together a starting list for "APIs you should know first":
Thread
ThreadPool
ManualResetEvent
AutoResetEvent
EventWaitHandle
WaitHandle
Monitor
Mutex
Semaphore
Interlocked
BackgroundWorker
AsyncOperation
lock Statement
volatile
ThreadStaticAttribute
Thread.MemoryBarrier
Thread.VolatileRead
Thread.VolatileWrite
Then I started thinking maybe not all of these are important. For instance, Thread.MemoryBarrier could probably be safely removed from the list. Add to that the obvious statement that I don't know everything and I decided to turn here.
So this is a broad and opinionated question, but I'm curious as to the collective's opinion as to a best-practice study list. Essentially I'm looking for a short hit list for new and/or Jr. developers to work from when beginning to write multi-threading code in C#.
So without further commentary, what should be added or removed from the above list?
I think you need to classify the levels of multithreading, not the different API's. Depending on your threading needs, you may or may not need to know certain subsets of the API's you have listed. If I were to organize them, I would do it along these lines:
Basic Multi-threading:
Requirements
Need to run concurrent processes.
Do not need access to shared resources.
Maximizing utilization of available hardware resources.
API Knowledge
Thread
ThreadPool
BackgroundWorker
Asynchronous Operations/Delegates
Shared Resource Multi-threading:
Requirements
Basic Multi-threading requirements
Use of shared resources
API Knowledge
Basic Multi-threading API's
lock()/Monitor (they are the same thing - see the sketch after this list)
Interlocked
ReaderWriterLock and variants
volatile
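For the lock()/Monitor point above, a minimal sketch of the equivalence (the field names are illustrative); the second method is roughly what the compiler emits for the first:

using System.Threading;

class LockVersusMonitor
{
    private readonly object _gate = new object();
    private int _balance;

    public void DepositWithLock(int amount)
    {
        lock (_gate) // expands to Monitor.Enter/Exit in a try/finally
        {
            _balance += amount;
        }
    }

    public void DepositWithMonitor(int amount)
    {
        Monitor.Enter(_gate); // roughly what the lock statement does
        try
        {
            _balance += amount;
        }
        finally
        {
            Monitor.Exit(_gate);
        }
    }
}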
Multi-thread Synchronization
Requirements
Basic Multi-threading requirements
Shared Resource Multi-threading requirements
Synchronization of behavior across multiple threads
API Knowledge
Basic Multi-threading API's
Shared Resource Multi-threading API's
WaitHandle
Manual/AutoResetEvent
Mutex
Semaphore
Concurrent Shared Resource Multi-threading (hyperthreading)
Requirements
Basic Multi-threading requirements
Shared Resource Multi-threading requirements
Concurrent read/write access to shared collections
API Knowledge
Basic Multi-threading API's
Shared Resource Multi-threading API's
Parallel Extensions to .NET/.NET 4.0
The rest of the API's I would simply lump into general threading knowledge, stuff that could be picked up as needed, as it fits into all of the levels above. Things like MemoryBarrier are pretty fringe, and there are usually better ways to accomplish the same thing it accomplishes, with less ambiguity about their behavior and meaning.
IMHO BackgroundWorker should be on the VERY top of your list.
It's fairly simple to explain. It can be used in most cases and delivers a lot of "bang for the buck". For somebody new to threading, this gives them something to actually work with, without having to study for hours before getting the first thing right.
How about the ParameterizedThreadStart and ThreadStart delegates?
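Those two delegates are what the Thread constructor accepts; a minimal sketch (the method names are just for illustration):

using System;
using System.Threading;

class ThreadStartSketch
{
    static void Main()
    {
        // ThreadStart: no arguments.
        Thread t1 = new Thread(new ThreadStart(PrintTime));
        t1.Start();

        // ParameterizedThreadStart: takes a single object argument.
        Thread t2 = new Thread(new ParameterizedThreadStart(PrintMessage));
        t2.Start("hello from another thread");

        t1.Join();
        t2.Join();
    }

    static void PrintTime()
    {
        Console.WriteLine(DateTime.Now);
    }

    static void PrintMessage(object message)
    {
        Console.WriteLine(message);
    }
}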
.NET 4.0 will be bringing some new tools to this problem; many of these tools will hide the details of the low-level threading APIs. You can start getting ready to leverage this new functionality today by doing things such as using LINQ or learning functional-programming techniques.
I'd recommend looking at the Parallel Extensions coming in .NET 4.0, the ThreadPool, BackgroundWorker (if they're working in WinForms) and the lock keyword. Those provide most of the functionality that you'll need from multi-threading, whilst still being a relatively safe environment in which to experiment. Also, you should add the Dispatcher from WPF to your list; developers are more likely to come across that than VolatileRead.
