Related
Asynchronous programming is a technique that calls a long running method in the background so that the UI thread remains responsive. It should be used while calling a web service or database query or any I/O bound operation. when the asynchronous method completes, it returns the result to the main thread. In this way, the program's main thread does not have to wait for the result of an I/O bound operation and continues to execute further without blocking/freezing the UI. This is ok.
As far as I know the asynchronous method executes on a background worker thread. The runtime makes availabe the thread either from the threadpool or it may create a brand new thread for its execution.
But I have read in many posts that an asynchronous operation may execute on a separate thread or without using any thread. Now I am very confused.
1) Could you please help clarifying in what situation an asynchronous operation will not use a thread?
2) What is the role of processor core in asynchronous operation?
3) How it is different from multithreading? I know one thing that multithreading is usefult with compute-bound operation.
Please help.
IO (let's say a database-operation over the network) is a good example for all three:
you basically just register a callback the OS will finally call (maybe on a then newly created thread) when the IO-Operation finished. There is no thread sitting around and waiting - the resurrection will be triggered by hardware-events (or at least by a OS process usually outside user-space)
it might have none (see 1)
in Multithreading you use more than one thread (your background-thread) and there one might idle sit there doing nothing (but using up system-resources) - this is of course different if you have something to compute (so the thread is not idle waiting for external results) - there it makes sense to use a background-worker-thread
Asynchronous operations don't actually imply much of anything about how they are processed, only that they would like the option to get back to you later with your results. By way of example:
They may (as you've mentioned) split off a compute-bound task onto an independent thread, but this is not the only use case.
They may sometimes complete synchronously within the call that launches them, in which case no additional thread is used. This may happen with an I/O request if there is already enough buffer content (input) or free buffer space (output) to service the request.
They may simply drop off a long-running I/O request to the system; in this case the callback is likely to occur on a background thread after receiving notification from an I/O completion port.
On completion, a callback may be delivered later on the same thread; this is especially common with events within a UI framework, such as navigation in a WebBrowser.
Asynchronity doesn't say anything about thread. Its about having some kind of callbacks which will be handled inside a "statemachine" (not really correct but you can think of it like events ). Asynchronity does not raise threads nor significantly allocate system ressources. You can run as many asynchronous methods as you want to.
Threads do have a real imply on your system and you have a hughe but limited number you can have at once.
Io operations are mostly related to others controllers (HDD, NIC,...) What now happens if you create a thread is that a thread of your application which has nothing to do waits for the controllers to finish. In async as Carsten and Jeffrey already mentioned you just get some kind of callback mechanism so your thread continues to do other work, methods and so on.
Also keep in mind that each thread costs ressources (RAM, Performance,handles Garbage Collection got worse,...) and may even and up in exceptions (OutOfMemoryException...)
So when to use Threads? Absolutly only if you really need it. If there is a async api use it unless you have really important reasons to not use it.
In past days the async api was really painfull, thats why many people used threads when ever they need just asynchronity.
For example node.js refuses the use of mulptile thread at all!
This is specially important if you handle multiple requests for example in services / websites where there is always work to do. There is also a this short webcast with Jeffrey Richter about this which helped me to understand
Also have a look at this MSDN article
PS: As a side effect sourcecode with async and await tend to be more readable
When do you use threads in a application? For example, in simple CRUD operations, use of smtp, calling webservices that may take a few time if the server is facing bandwith issues, etc.
To be honest, i don't know how to determine if i need to use a thread (i know that it must be when we're excepting that a operation will take a few time to be done).
This may be a "noob" question but it'll be great if you share with me your experience in threads.
Thanks
I added C# and .NET tags to your question because you mention C# in your title. If that is not accurate, feel free to remove the tags.
There are different styles of multithreading. For example, there are asynchronous operations with callback functions. .NET 4 introduces the parallel Linq library. The style of multithreading you would use, or whether to use any at all, depends on what you are trying to accomplish.
Parallel execution, such as what parallel Linq would generally be trying to do, takes advantage of multiple processor cores executing instructions that do not need to wait for data from each other. There are many sources for such algorithms outside Linq, such as this. However, it is possible that parallel execution may be unable to you or that it does not suit your application.
More traditional multithreading takes advantage of threading within the .NET library (in this case) as provided by System.Thread. Remember that there is some overhead in starting processes on threads, so only use threads when the advantages of doing so outweigh this overhead. Generally speaking, you would only want to use this type of single-processor multithreading when the task running under the thread will have long gaps in which the processor could be doing something else. For example, I/O from hard disk (and, consequently, from a database system that uses one) is many orders of magnitude slower than memory access. Network access can also be slow, as another example. Multithreading could allow another process to be running while waiting for these slow (compared to the processor) operations to complete.
Another example when I have used traditional multithreading is to cache some values the first time a particular ASP.NET page is accessed within a session. I kick off a thread so that the user does not have to wait for the caching to complete before interacting with the page. I also regulate the behavior when the caching does not complete before the user requests another page so that, if the caching does not complete, it is not a problem. It simply makes some further requests faster that were previously too slow.
Consider also the cost that multithreading has to the maintainability of your application. Threaded applications can be harder to debug, for example.
I hope this answers your question at least somewhat.
Joseph Albahari summarized it very well here:
Maintaining a responsive user interface
Making efficient use of an otherwise blocked CPU
Parallel programming
Speculative execution
Allowing requests to be processed simultaneously
One reason to use threads is to split large, CPU-bound tasks across a number of CPUs/cores, to finish faster. Another is to let an extended task execute asynchronously, so the foreground can remain responsive while it runs.
Your examples seem to be concentrating on the second of these. While it can be a good reason, if you can use asynchronous I/O instead, that's usually preferable (e.g., almost anything using sockets can/will be better off using the socket(s) asynchronously). Asynchronous I/O is easier to cancel, and it'll usually have lower CPU overhead as well.
You can use threads when you need different execution paths. This leads(when done correctly) to more responsive and/or faster applications but also leads to more complex code and debugging.
In a simple CRUD scenario maybe is not that useful, but maybe your UI is consuming a slow web service. If you your code is tied to your UI thread you will have unresponsive UI between the service calls.
In that case, using System.Threading.Threads maybe be overkill because you don't need so much control. Using a BackgrounWorker maybe a better choice.
Threading is something difficult to master, but the benefits when used correctly are huge, performance is the most common.
Somehow you have answered your question by yourself. Using threads whenever you execute time consuming operations is right choice. Also you should it in situations when you want to make things faster. For example you want to process some amount of files - each file can be processed by different thread.
By using threads you can better utilize power of multi-core/processor machines.
Monitoring some data in background of your application.
There are dozens of such scenarios.
Realising my comment might suffice as an answer ...
I like to view multi-threading scenarios from a resource perspective. In other words, UI (graphics), networking, disk IO, CPU (cores), RAM etc. I find that helps when deciding where to use multi-threading in the general sense at least.
The reasoning behind this is simply that I can take advantage of one resource on a specific thread (eg. Disk IO) while at the same time using another thread to accomplish something else using a different resource.
In a previous question, I made a bit of a faux pas. You see, I'd been reading about threads and had got the impression that they were the tastiest things since kiwi jello.
Imagine my confusion then, when I read stuff like this:
[T]hreads are A Very Bad Thing. Or, at least, explicit management of threads is a bad thing
and
Updating the UI across threads is usually a sign that you are abusing threads.
Since I kill a puppy every time something confuses me, consider this your chance get your karma back in the black...
How should I be using thread?
Enthusiam for learning about threading is great; don't get me wrong. Enthusiasm for using lots of threads, by contrast, is symptomatic of what I call Thread Happiness Disease.
Developers who have just learned about the power of threads start asking questions like "how many threads can I possible create in one program?" This is rather like an English major asking "how many words can I use in a sentence?" Typical advice for writers is to keep your sentences short and to the point, rather than trying to cram as many words and ideas into one sentence as possible. Threads are the same way; the right question is not "how many can I get away with creating?" but rather "how can I write this program so that the number of threads is the minimum necessary to get the job done?"
Threads solve a lot of problems, it's true, but they also introduce huge problems:
Performance analysis of multi-threaded programs is often extremely difficult and deeply counterintuitive. I've seen real-world examples in heavily multi-threaded programs in which making a function faster without slowing down any other function or using more memory makes the total throughput of the system smaller. Why? Because threads are often like streets downtown. Imagine taking every street and magically making it shorter without re-timing the traffic lights. Would traffic jams get better, or worse? Writing faster functions in multi-threaded programs drives the processors towards congestion faster.
What you want is for threads to be like interstate highways: no traffic lights, highly parallel, intersecting at a small number of very well-defined, carefully engineered points. That is very hard to do. Most heavily multi-threaded programs are more like dense urban cores with stoplights everywhere.
Writing your own custom management of threads is insanely difficult to get right. The reason is because when you are writing a regular single-threaded program in a well-designed program, the amount of "global state" you have to reason about is typically small. Ideally you write objects that have well-defined boundaries, and that do not care about the control flow that invokes their members. You want to invoke an object in a loop, or a switch, or whatever, you go right ahead.
Multi-threaded programs with custom thread management require global understanding of everything that a thread is going to do that could possibly affect data that is visible from another thread. You pretty much have to have the entire program in your head, and understand all the possible ways that two threads could be interacting in order to get it right and prevent deadlocks or data corruption. That is a large cost to pay, and highly prone to bugs.
Essentially, threads make your methods lie. Let me give you an example. Suppose you have:
if (!queue.IsEmpty) queue.RemoveWorkItem().Execute();
Is that code correct? If it is single threaded, probably. If it is multi-threaded, what is stopping another thread from removing the last remaining item after the call to IsEmpty is executed? Nothing, that's what. This code, which locally looks just fine, is a bomb waiting to go off in a multi-threaded program. Basically that code is actually:
if (queue.WasNotEmptyAtSomePointInThePast) ...
which obviously is pretty useless.
So suppose you decide to fix the problem by locking the queue. Is this right?
lock(queue) {if (!queue.IsEmpty) queue.RemoveWorkItem().Execute(); }
That's not right either, necessarily. Suppose the execution causes code to run which waits on a resource currently locked by another thread, but that thread is waiting on the lock for queue - what happens? Both threads wait forever. Putting a lock around a hunk of code requires you to know everything that code could possibly do with any shared resource, so that you can work out whether there will be any deadlocks. Again, that is an extremely heavy burden to put on someone writing what ought to be very simple code. (The right thing to do here is probably to extract the work item in the lock and then execute it outside the lock. But... what if the items are in a queue because they have to be executed in a particular order? Now that code is wrong too because other threads can then execute later jobs first.)
It gets worse. The C# language spec guarantees that a single-threaded program will have observable behaviour that is exactly as the program is specified. That is, if you have something like "if (M(ref x)) b = 10;" then you know that the code generated will behave as though x is accessed by M before b is written. Now, the compiler, jitter and CPU are all free to optimize that. If one of them can determine that M is going to be true and if we know that on this thread, the value of b is not read after the call to M, then b can be assigned before x is accessed. All that is guaranteed is that the single-threaded program seems to work like it was written.
Multi-threaded programs do not make that guarantee. If you are examining b and x on a different thread while this one is running then you can see b change before x is accessed, if that optimization is performed. Reads and writes can logically be moved forwards and backwards in time with respect to each other in single threaded programs, and those moves can be observed in multi-threaded programs.
This means that in order to write multi-threaded programs where there is a dependency in the logic on things being observed to happen in the same order as the code is actually written, you have to have a detailed understanding of the "memory model" of the language and the runtime. You have to know precisely what guarantees are made about how accesses can move around in time. And you cannot simply test on your x86 box and hope for the best; the x86 chips have pretty conservative optimizations compared to some other chips out there.
That's just a brief overview of just a few of the problems you run into when writing your own multithreaded logic. There are plenty more. So, some advice:
Do learn about threading.
Do not attempt to write your own thread management in production code.
Use higher-level libraries written by experts to solve problems with threads. If you have a bunch of work that needs to be done in the background and want to farm it out to worker threads, use a thread pool rather than writing your own thread creation logic. If you have a problem that is amenable to solution by multiple processors at once, use the Task Parallel Library. If you want to lazily initialize a resource, use the lazy initialization class rather than trying to write lock free code yourself.
Avoid shared state.
If you can't avoid shared state, share immutable state.
If you have to share mutable state, prefer using locks to lock-free techniques.
Explicit management of threads is not intrinsically a bad thing, but it's frought with dangers and shouldn't be done unless absolutely necessary.
Saying threads are absolutely a good thing would be like saying a propeller is absolutely a good thing: propellers work great on airplanes (when jet engines aren't a better alternative), but wouldn't be a good idea on a car.
You cannot appreciate what kind of problems threading can cause unless you've debugged a three-way deadlock. Or spent a month chasing a race condition that happens only once a day. So, go ahead and jump in with both feet and make all the kind of mistakes you need to make to learn to fear the Beast and what to do to stay out of trouble.
There's no way I could offer a better answer than what's already here. But I can offer a concrete example of some multithreaded code that we actually had at my work that was disastrous.
One of my coworkers, like you, was very enthusiastic about threads when he first learned about them. So there started to be code like this throughout the program:
Thread t = new Thread(LongRunningMethod);
t.Start(GetThreadParameters());
Basically, he was creating threads all over the place.
So eventually another coworker discovered this and told the developer responsible: don't do that! Creating threads is expensive, you should use the thread pool, etc. etc. So a lot of places in the code that originally looked like the above snippet started getting rewritten as:
ThreadPool.QueueUserWorkItem(LongRunningMethod, GetThreadParameters());
Big improvement, right? Everything's sane again?
Well, except that there was a particular call in that LongRunningMethod that could potentially block -- for a long time. Suddenly every now and then we started seeing it happen that something our software should have reacted to right away... it just didn't. In fact, it might not have reacted for several seconds (clarification: I work for a trading firm, so this was a complete catastrophe).
What had ended up happening was that the thread pool was actually filling up with long-blocking calls, leading to other code that was supposed to happen very quickly getting queued up and not running until significantly later than it should have.
The moral of this story is not, of course, that the first approach of creating your own threads is the right thing to do (it isn't). It's really just that using threads is tough, and error-prone, and that, as others have already said, you should be very careful when you use them.
In our particular situation, many mistakes were made:
Creating new threads in the first place was wrong because it was far more costly than the developer realized.
Queuing all background work on the thread pool was wrong because it treated all background tasks indiscriminately and did not account for the possibility of asynchronous calls actually being blocked.
Having a long-blocking method by itself was the result of some careless and very lazy use of the lock keyword.
Insufficient attention was given to ensuring that the code that was being run on background threads was thread-safe (it wasn't).
Insufficient thought was given to the question of whether making a lot of the affected code multithreaded was even worth doing to begin with. In plenty of cases, the answer was no: multithreading just introduced complexity and bugs, made the code less comprehensible, and (here's the kicker): hurt performance.
I'm happy to say that today, we're still alive and our code is in a much healthier state than it once was. And we do use multithreading in plenty of places where we've decided it's appropriate and have measured performance gains (such as reduced latency between receiving a market data tick and having an outgoing quote confirmed by the exchange). But we learned some pretty important lessons the hard way. Chances are, if you ever work on a large, highly multithreaded system, you will too.
Unless you are on the level of being able to write a fully-fledged kernel scheduler, you will get explicit thread management always wrong.
Threads can be the most awesome thing since hot chocolate, but parallel programming is incredibly complex. However, if you design your threads to be independent then you can't shoot yourself in the foot.
As fore rule of the thumb, if a problem is decomposed into threads, they should be as independent as possible, with as few but well defined shared resources as possible, with the most minimalistic management concept.
I think the first statement is best explained as such: with the many advanced APIs now available, manually writing your own thread code is almost never necessary. The new APIs are a lot easier to use, and a lot harder to mess up!. Whereas, with the old-style threading, you have to be quite good to not mess up. The old-style APIs (Thread et. al.) are still available, but the new APIs (Task Parallel Library, Parallel LINQ, and Reactive Extensions) are the way of the future.
The second statement is from more of a design perspective, IMO. In a design that has a clean separation of concerns, a background task should not really be reaching directly into the UI to report updates. There should be some separation there, using a pattern like MVVM or MVC.
I would start by questioning this perception:
I'd been reading about threads and had got the impression that they were the tastiest things since kiwi jello.
Don’t get me wrong – threads are a very versatile tool – but this degree of enthusiasm seems weird. In particular, it indicates that you might be using threads in a lot of situations where they simply don’t make sense (but then again, I might just mistake your enthusiasm).
As others have indicated, thread handling is additionally quite complex and complicated. Wrappers for threads exist and only in rare occasions do they have to be handled explicitly. For most applications, threads can be implied.
For example, if you just want to push a computation to the background while leaving the GUI responsive, a better solution is often to either use callback (that makes it seem as though the computation is done in the background while really being executed on the same thread), or by using a convenience wrapper such as the BackgroundWorker that takes and hides all the explicit thread handling.
A last thing, creating a thread is actually very expensive. Using a thread pool mitigates this cost because here, the runtime creates a number of threads that are subsequently reused. When people say that explicit management of threads is bad, this is all they might be referring to.
Many advanced GUI Applications usually consist of two threads, one for the UI, one (or sometimes more) for Processing of data (copying files, making heavy calculations, loading data from a database, etc).
The processing threads shouldn't update the UI directly, the UI should be a black box to them (check Wikipedia for Encapsulation).
They just say "I'm done processing" or "I completed task 7 of 9" and call an Event or other callback method. The UI subscribes to the event, checks what has changed and updates the UI accordingly.
If you update the UI from the Processing Thread you won't be able to reuse your code and you will have bigger problems if you want to change parts of your code.
I think you should experiement as much as possible with Threads and get to know the benefits and pitfalls of using them. Only by experimentation and usage will your understanding of them grow. Read as much as you can on the subject.
When it comes to C# and the userinterface (which is single threaded and you can only modify userinterface elements on code executed on the UI thread). I use the following utility to keep myself sane and sleep soundly at night.
public static class UIThreadSafe {
public static void Perform(Control c, MethodInvoker inv) {
if(c == null)
return;
if(c.InvokeRequired) {
c.Invoke(inv, null);
}
else {
inv();
}
}
}
You can use this in any thread that needs to change a UI element, like thus:
UIThreadSafe.Perform(myForm, delegate() {
myForm.Title = "I Love Threads!";
});
A huge reason to try to keep the UI thread and the processing thread as independent as possible is that if the UI thread freezes, the user will notice and be unhappy. Having the UI thread be blazing fast is important. If you start moving UI stuff out of the UI thread or moving processing stuff into the UI thread, you run a higher risk of having your application become unresponsive.
Also, a lot of the framework code is deliberately written with the expectation that you will separate the UI and processing; programs will just work better when you separate the two out, and will hit errors and problems when you don't. I don't recall any specifics issues that I encountered as a result of this, though I have vague recollections of in the past trying to set certain properties of stuff the UI was responsible for outside of the UI and having the code refuse to work; I don't recall whether it didn't compile or it threw an exception.
Threads are a very good thing, I think. But, working with them is very hard and needs a lot of knowledge and training. The main problem is when we want to access shared resources from two other threads which can cause undesirable effects.
Consider classic example: you have a two threads which get some items from a shared list and after doing something they remove the item from the list.
The thread method that is called periodically could look like this:
void Thread()
{
if (list.Count > 0)
{
/// Do stuff
list.RemoveAt(0);
}
}
Remember that the threads, in theory, can switch at any line of your code that is not synchronized. So if the list contains only one item, one thread could pass the list.Count condition, just before list.Remove the threads switch and another thread passes the list.Count (list still contains one item). Now the first thread continues to list.Remove and after that second thread continues to list.Remove, but the last item already has been removed by the first thread, so the second one crashes. That's why it would have to be synchronized using lock statement, so that there can't be a situation where two threads are inside the if statement.
So that is the reason why UI which is not synchronized must always run in a single thread and no other thread should interfere with UI.
In previous versions of .NET if you wanted to update UI in another thread, you would have to synchronize using Invoke methods, but as it was hard enough to implement, new versions of .NET come with BackgroundWorker class which simplifies a thing by wrapping all the stuff and letting you do the asynchronous stuff in a DoWork event and updating UI in ProgressChanged event.
A couple of things are important to note when updating the UI from a non-UI thread:
If you use "Invoke" frequently, the performance of your non-UI thread may be severely adversely affected if other stuff makes the UI thread run sluggishly. I prefer to avoid using "Invoke" unless the non-UI thread needs to wait for the UI-thread action to be performed before it continues.
If you use "BeginInvoke" recklessly for things like control updates, an excessive number of invocation delegates may get queued, some of which may well be pretty useless by the time they actually occur.
My preferred style in many cases is to have each control's state encapsulated in an immutable class, and then have a flag which indicates whether an update is not needed, pending, or needed but not pending (the latter situation may occur if a request is made to update a control before it is fully created). The control's update routine should, if an update is needed, start by clearing the update flag, grabbing the state, and drawing the control. If the update flag is set, it should re-loop. To request another thread, a routine should use Interlocked.Exchange to set the update flag to update pending and--if it wasn't pending--try to BeginInvoke the update routine; if the BeginInvoke fails, set the update flag to "needed but not pending".
If an attempt to control occurs just after the control's update routine checks and clears its update flag, it may well happen that the first update will reflect the new value but the update flag will have been set anyway, forcing an extra screen redraw. On the occasions when this happens, it will be relatively harmless. The important thing is that the control will end up being drawn in the correct state, without there ever having been more than one BeginInvoke pending.
I've read that threads are very problematic. What alternatives are available? Something that handles blocking and stuff automatically?
A lot of people recommend the background worker, but I've no idea why.
Anyone care to explain "easy" alternatives? The user will be able to select the number of threads to use (depending on their speed needs and computer power).
Any ideas?
To summarize the problems with threads:
if threads share memory, you can get
race conditions
if you avoid races by liberally using locks, you
can get deadlocks (see the dining philosophers problem)
An example of a race: suppose two threads share access to some memory where a number is stored. Thread 1 reads from the memory address and stores it in a CPU register. Thread 2 does the same. Now thread 1 increments the number and writes it back to memory. Thread 2 then does the same. End result: the number was only incremented by 1, while both threads tried to increment it. The outcome of such interactions depend on timing. Worse, your code may seem to work bug-free but once in a blue moon the timing is wrong and bad things happen.
To avoid these problems, the answer is simple: avoid sharing writable memory. Instead, use message passing to communicate between threads. An extreme example is to put the threads in separate processes and communicate via TCP/IP connections or named pipes.
Another approach is to share only read-only data structures, which is why functional programming languages can work so well with multiple threads.
This is a bit higher-level answer, but it may be useful if you want to consider other alternatives to threads. Anyway, most of the answers discussed solutions based on threads (or thread pools) or maybe tasks from .NET 4.0, but there is one more alternative, which is called message-passing. This has been successfuly used in Erlang (a functional language used by Ericsson). Since functional programming is becoming more mainstream in these days (e.g. F#), I thought I could mention it. In genral:
Threads (or thread pools) can usually used when you have some relatively long-running computation. When it needs to share state with other threads, it gets tricky (you have to correctly use locks or other synchronization primitives).
Tasks (available in TPL in .NET 4.0) are very lightweight - you can split your program into thousands of tasks and then let the runtime run them (it will use optimal number of threads). If you can write your algorithm using tasks instead of threads, it sounds like a good idea - you can avoid some synchronization when you run computation using smaller steps.
Declarative approaches (PLINQ in .NET 4.0 is a great option) if you have some higher-level data processing operation that can be encoded using LINQ primitives, then you can use this technique. The runtime will automatically parallelize your code, because LINQ doesn't specify how exactly should it evaluate the results (you just say what results you want to get).
Message-passing allows you two write program as concurrently running processes that perform some (relatively simple) tasks and communicate by sending messages to each other. This is great, because you can share some state (send messages) without the usual synchronization issues (you just send a message, then do other thing or wait for messages). Here is a good introduction to message-passing in F# from Robert Pickering.
Note that the last three techniques are quite related to functional programming - in functional programming, you desing programs differently - as computations that return result (which makes it easier to use Tasks). You also often write declarative and higher-level code (which makes it easier to use Declarative approaches).
When it comes to actual implementation, F# has a wonderful message-passing library right in the core libraries. In C#, you can use Concurrency & Coordination Runtime, which feels a bit "hacky", but is probably quite powerful too (but may look too complicated).
Won't the parallel programming options in .Net 4 be an "easy" way to use threads? I'm not sure what I'd suggest for .Net 3.5 and earlier...
This MSDN link to the Parallel Computing Developer Center has links to lots of info on Parellel Programming including links to videos, etc.
I can recommend this project. Smart Thread Pool
Project Description
Smart Thread Pool is a thread pool written in C#. It is far more advanced than the .NET built-in thread pool.
Here is a list of the thread pool features:
The number of threads dynamically changes according to the workload on the threads in the pool.
Work items can return a value.
A work item can be cancelled.
The caller thread's context is used when the work item is executed (limited).
Usage of minimum number of Win32 event handles, so the handle count of the application won't explode.
The caller can wait for multiple or all the work items to complete.
Work item can have a PostExecute callback, which is called as soon the work item is completed.
The state object, that accompanies the work item, can be disposed automatically.
Work item exceptions are sent back to the caller.
Work items have priority.
Work items group.
The caller can suspend the start of a thread pool and work items group.
Threads have priority.
Can run COM objects that have single threaded apartment.
Support Action and Func delegates.
Support for WindowsCE (limited)
The MaxThreads and MinThreads can be changed at run time.
Cancel behavior is imporved.
"Problematic" is not the word I would use to describe working with threads. "Tedious" is a more appropriate description.
If you are new to threaded programming, I would suggest reading this thread as a starting point. It is by no means exhaustive but has some good introductory information. From there, I would continue to scour this website and other programming sites for information related to specific threading questions you may have.
As for specific threading options in C#, here's some suggestions on when to use each one.
Use BackgroundWorker if you have a single task that runs in the background and needs to interact with the UI. The task of marshalling data and method calls to the UI thread are handled automatically through its event-based model. Avoid BackgroundWorker if (1) your assembly does not already reference the System.Windows.Form assembly, (2) you need the thread to be a foreground thread, or (3) you need to manipulate the thread priority.
Use a ThreadPool thread when efficiency is desired. The ThreadPool helps avoid the overhead associated with creating, starting, and stopping threads. Avoid using the ThreadPool if (1) the task runs for the lifetime of your application, (2) you need the thread to be a foreground thread, (3) you need to manipulate the thread priority, or (4) you need the thread to have a fixed identity (aborting, suspending, discovering).
Use the Thread class for long-running tasks and when you require features offered by a formal threading model, e.g., choosing between foreground and background threads, tweaking the thread priority, fine-grained control over thread execution, etc.
Any time you introduce multiple threads, each running at once, you open up the potential for race conditions. To avoid these, you tend to need to add synchronization, which adds complexity, as well as the potential for deadlocks.
Many tools make this easier. .NET has quite a few classes specifically meant to ease the pain of dealing with multiple threads, including the BackgroundWorker class, which makes running background work and interacting with a user interface much simpler.
.NET 4 is going to do a lot to ease this even more. The Task Parallel Library and PLINQ dramatically ease working with multiple threads.
As for your last comment:
The user will be able to select the number of threads to use (depending on their speed needs and computer power).
Most of the routines in .NET are built upon the ThreadPool. In .NET 4, when using the TPL, the work load will actually scale at runtime, for you, eliminating the burden of having to specify the number of threads to use. However, there are ways to do this now.
Currently, you can use ThreadPool.SetMaxThreads to help limit the number of threads generated. In TPL, you can specify ParallelOptions.MaxDegreesOfParallelism, and pass an instance of the ParallelOptions into your routine to control this. The default behavior scales up with more threads as you add more processing cores, which is usually the best behavior in any case.
Threads are not problematic if you understand what causes problems with them.
For ex. if you avoid statics, you know which API's to use (e.g. use synchronized streams), you will avoid many of the issues that come up for their bad utilization.
If threading is a problem (this can happen if you have unsafe/unmanaged 3rd party dll's that cannot support multithreading. In this can an option is to create a meachism to queue the operations. ie store the parameters of the action to a database and just run through them one at a time. This can be done in a windows service. Obviously this will take longer but in some cases is the only option.
Threads are indispensable tools for solving many problems, and it behooves the maturing developer to know how to effectively use them. But like many tools, they can cause some very difficult-to-find bugs.
Don't shy away from some so useful just because it can cause problems, instead study and practice until you become the go-to guy for multi-threaded apps.
A great place to start is Joe Albahari's article: http://www.albahari.com/threading/.
I understand the main function of the lock key word from MSDN
lock Statement (C# Reference)
The lock keyword marks a statement
block as a critical section by
obtaining the mutual-exclusion lock
for a given object, executing a
statement, and then releasing the
lock.
When should the lock be used?
For instance it makes sense with multi-threaded applications because it protects the data. But is it necessary when the application does not spin off any other threads?
Is there performance issues with using lock?
I have just inherited an application that is using lock everywhere, and it is single threaded and I want to know should I leave them in, are they even necessary?
Please note this is more of a general knowledge question, the application speed is fine, I want to know if that is a good design pattern to follow in the future or should this be avoided unless absolutely needed.
When should the lock be used?
A lock should be used to protect shared resources in multithreaded code. Not for anything else.
But is it necessary when the application does not spin off any other threads?
Absolutely not. It's just a time waster. However do be sure that you're not implicitly using system threads. For example if you use asynchronous I/O you may receive callbacks from a random thread, not your original thread.
Is there performance issues with using lock?
Yes. They're not very big in a single-threaded application, but why make calls you don't need?
...if that is a good design pattern to follow in the future[?]
Locking everything willy-nilly is a terrible design pattern. If your code is cluttered with random locking and then you do decide to use a background thread for some work, you're likely to run into deadlocks. Sharing a resource between multiple threads requires careful design, and the more you can isolate the tricky part, the better.
All the answers here seem right: locks' usefulness is to block threads from acessing locked code concurrently. However, there are many subtleties in this field, one of which is that locked blocks of code are automatically marked as critical regions by the Common Language Runtime.
The effect of code being marked as critical is that, if the entire region cannot be entirely executed, the runtime may consider that your entire Application Domain is potentially jeopardized and, therefore, unload it from memory. To quote MSDN:
For example, consider a task that attempts to allocate memory while holding a lock. If the memory allocation fails, aborting the current task is not sufficient to ensure stability of the AppDomain, because there can be other tasks in the domain waiting for the same lock. If the current task is terminated, other tasks could be deadlocked.
Therefore, even though your application is single-threaded, this may be a hazard for you. Consider that one method in a locked block throws an exception that is eventually not handled within the block. Even if the exception is dealt as it bubbles up through the call stack, your critical region of code didn't finish normally. And who knows how the CLR will react?
For more info, read this article on the perils of Thread.Abort().
Bear in mind that there might be reasons why your application is not as single-threaded as you think. Async I/O in .NET may well call-back on a pool thread, for example, as do some of the various timer classes (not the Windows Forms Timer, though).
Generally speaking if your application is single threaded, you're not going to get much use out of the lock statement. Not knowing your application exactly, I don't know if they're useful or not - but I suspect not. Further, if you're application is using lock everywhere I don't know that I would feel all that confident about it working in a multi-threaded environment anyways - did the original developer actually know how to develop multi-threaded code, or did they just add lock statements everywhere in the vague hope that that would do the trick?
lock should be used around the code that modifies shared state, state that is modified by other threads concurrently, and those other treads must take the same lock.
A lock is actually a memory access serializer, the threads (that take the lock) will wait on the lock to enter until the current thread exits the lock, so memory access is serialized.
To answer you question lock is not needed in a single threaded application, and it does have performance side effects. because locks in C# are based on kernel sync objects and every lock you take creates a transition to kernel mode from user mode.
If you're interested in multithreading performance a good place to start is MSDN threading guidelines
You can have performance issues with locking variables, but normally, you'd construct your code to minimize the lengths of time that are spent inside a 'locked' block of code.
As far as removing the locks. It'll depend on what exactly the code is doing. Even though it's single threaded, if your object is implemented as a Singleton, it's possible that you'll have multiple clients using an instance of it (in memory, on a server) at the same time..
Yes, there will be some performance penalty when using lock but it is generally neglible enough to not matter.
Using locks (or any other mutual-exclusion statement or construct) is generally only needed in multi-threaded scenarios where multiple threads (either of your own making or from your caller) have the opportunity to interact with the object and change the underlying state or data maintained. For example, if you have a collection that can be accessed by multiple threads you don't want one thread changing the contents of that collection by removing an item while another thread is trying to read it.
Lock(token) is only used to mark one or more blocks of code that should not run simultaneously in multiple threads. If your application is single-threaded, it's protecting against a condition that can't exist.
And locking does invoke a performance hit, adding instructions to check for simultaneous access before code is executed. It should only be used where necessary.
See the question about 'Mutex' in C#. And then look at these two questions regarding use of the 'lock(Object)' statement specifically.
There is no point in having locks in the app if there is only one thread and yes, it is a performance hit although it does take a fair number of calls for that hit to stack up into something significant.