in my game I have entities (about 2000-5000) that do very performance heavy calculations every frame (or every 2nd or 3rd).
I know that parallel programming will help in this scenario.
The question is how do I use multiple threads / cpu-cores in the best way possible.
I tried the static Parallel class in .NET, but the startup cost for every Parallel.For / Parallel.ForEach is simply to big.
Does it start a number of threads each time I call a function of the Parallel class ??
The game's target fps is 60 fps, so whatevery approach I use, it should never involve starting multiple threads every game-frame!
So my question is:
What other alternatives are there to parallel-process multiple entities?
Should I create the threads myself, and if so, what pitfalls are there?
Is the threadpool a good alternative?
By default the Parallel.For/Parallel.Foreach uses ThreadPool thread. Infact TPL api like Parallel class, Task instance etc uses ThreadPool thread.
The number of threads started by TPL will depend upon lots of factors like number of available ideal thread in threadpool etc. So assuming TPL will create lots of threads as soon as you will submit request will not be entirely correct.
Important point is that you can control, the number of threads used the TPL for executing the tasks via MaxDegreeOfParallelism flag. By setting this value to say number "X", you limit the number of concurrent items being processed to "X". Personally, I feel this is a very powerful feature of TPL library.
Also, if you do not want your task to block other requests in queue, then you can set the TaskCreationOptions to be of type "LongRunning", in that case, TPL will run the task on a dedicated thread.
In your case, you can try using Parallel.For/ForEach and set the MaxDegreeOfParalleleism value to some optimal value and see how that behaves.
You need some computers or servers to test your parallel programmed game. Other wise on a single computer, since it will work on thread, it will cost more.
Additionally, since the game is calculated and drawn 60 times in a second, working on threads is not logical, your game is already running on threads.
Maybe you have to change or optimize your algorithm. Do not let everything is done by CPU, use GPU as much as possible.
There isn't a simple answer.
When every entity of your game can be calculated without knowing any other entity then you can simply move that calcultion to a given number of threads. A threadpool would be a good start.
You could set this up when your programm starts.
But even when you do so, you have to check if switching threads performs better than having one single thread. Context switch is here the keyword you can search for.
But what happens when entity A on thread A does a calculation with affects entity B on thread B and forces entity B to be destroyed which forces entity C on thread C to be destroyed also?
And now to make it harder :P entity C was calculated before entity B, before entity A...
What should happen? Is it ok that those entities will be cleaned up in 2 or 3 frames in the future?
Does it affect or break your game logic?
If so then things are getting complicated.
Related
I have an interesting exercise to solve from my professor. But I need a little bit of help so it does not become boring during the holidays.
The exercise is to
create a multithreaded load balancer, that reads 1 measuring point from 5 sensors every second. (therefore 5 values every second).
Then do some "complex" calculations with those values.
Printing results of the calculations on the screen. (like max value or average value of sensor 1-5 and so on, of course multithreaded)
As an additional task I also have to ensure that if in the future for example 500 sensors would be read every second the computer doesn't quit the job.(load balancing).
I have a csv textfile with ~400 measuring points from 5 imaginary sensors.
What I think I have to do:
Read the measuring points into an array
Ensure thread safe access to that array
Spawn a new thread for every value that calculates some math stuff
Set a max value for maximum concurrent working threads
I am new to multithreading applications in c# but I think using threadpool is the right way. I am currently working on a queue and maybe starting it inside a task so it wont block the application.
What would you recommend?
There are a couple of environment dependencies here:
What version of .NET are you using?
What UI are you using - desktop (WPF/WinForms) or ASP.NET?
Let's assume that it's .NET 4.0 or higher and a desktop app.
Reading the sensors
In a WPF or WinForms application, I would use a single BackgroundWorker to read data from the sensors. 500 reads per second is trivial - even 500,00 is usually trivial. And the BackgroundWorker type is specifically designed for interacting with desktop apps, for example handing-off results to the UI without worrying about thread interactions.
Processing the calculations
Then you need to process the "complex" calculations. This depends on how long-lived these calculations are. If we assume they're short-lived (say less than 1 second each), then I think using the TaskScheduler and the standard ThreadPool will be fine. So you create a Task for each calculation, and then let the TaskScheduler take care of allocating tasks to threads.
The job of the TaskScheduler is to load-balance the work by queuing lightweight tasks to more heavyweight threads, and managing the ThreadPool to best balance the workload vs the number of cores on the machine. You can even override the default TaskScheduler to schedule tasks in whatever manner you want.
The ThreadPool is a FIFO queue of work items that need to be processed. In .NET 4.0, the ThreadPool has improved performance by making the work queue a thread-safe ConcurrentQueue collection.
Measuring task throughput and efficiency
You can use PerformanceCounter to measure both CPU and memory usage. This will give you a good idea of whether the cores and memory are being used efficiently. The task throughput is simply measured by looking at the rate at which tasks are being processed and supplying results.
Note that I haven't included any code here, as I assume you want to deal with the implementation details for your professor :-)
In an attempt to speed up processing of physics objects in C# I decided to change a linear update algorithm into a parallel algorithm. I believed the best approach was to use the ThreadPool as it is built for completing a queue of jobs.
When I first implemented the parallel algorithm, I queued up a job for every physics object. Keep in mind, a single job completes fairly quickly (updates forces, velocity, position, checks for collision with the old state of any surrounding objects to make it thread safe, etc). I would then wait on all jobs to be finished using a single wait handle, with an interlocked integer that I decremented each time a physics object completed (upon hitting zero, I then set the wait handle). The wait was required as the next task I needed to do involved having the objects all be updated.
The first thing I noticed was that performance was crazy. When averaged, the thread pooling seemed to be going a bit faster, but had massive spikes in performance (on the order of 10 ms per update, with random jumps to 40-60ms). I attempted to profile this using ANTS, however I could not gain any insight into why the spikes were occurring.
My next approach was to still use the ThreadPool, however instead I split all the objects into groups. I initially started with only 8 groups, as that was how any cores my computer had. The performance was great. It far outperformed the single threaded approach, and had no spikes (about 6ms per update).
The only thing I thought about was that, if one job completed before the others, there would be an idle core. Therefore, I increased the number of jobs to about 20, and even up to 500. As I expected, it dropped to 5ms.
So my questions are as follows:
Why would spikes occur when I made the job sizes quick / many?
Is there any insight into how the ThreadPool is implemented that would help me to understand how best to use it?
Using threads has a price - you need context switching, you need locking (the job queue is most probably locked when a thread tries to fetch a new job) - it all comes at a price. This price is usually small compared to the actual work your thread is doing, but if the work ends quickly, the price becomes meaningful.
Your solution seems correct. A reasonable rule of thumb is to have twice as many threads as there are cores.
As you probably expect yourself, the spikes are likely caused by the code that manages the thread pools and distributes tasks to them.
For parallel programming, there are more sophisticated approaches than "manually" distributing work across different threads (even if using the threadpool).
See Parallel Programming in the .NET Framework for instance for an overview and different options. In your case, the "solution" may be as simple as this:
Parallel.ForEach(physicObjects, physicObject => Process(physicObject));
Here's my take on your two questions:
I'd like to start with question 2 (how the thread pool works) because it actually holds the key to answering question 1. The thread pool is implemented (without going into details) as a (thread-safe) work queue and a group of worker threads (which may shrink or enlarge as needed). As the user calls QueueUserWorkItem the task is put into the work queue. The workers keep polling the queue and taking work if they are idle. Once they manage to take a task, they execute it and then return to the queue for more work (this is very important!). So the work is done by the workers on-demand: as the workers become idle they take more pieces of work to do.
Having said the above, it's simple to see what is the answer to question 1 (why did you see a performance difference with more fine-grained tasks): it's because with fine-grain you get more load-balancing (a very desirable property), i.e. your workers do more or less the same amount of work and all cores are exploited uniformly. As you said, with a coarse-grain task distribution, there may be longer and shorter tasks, so one or more cores may be lagging behind, slowing down the overall computation, while other do nothing. With small tasks the problem goes away. Each worker thread takes one small task at a time and then goes back for more. If one thread picks up a shorter task it will go to the queue more often, If it takes a longer task it will go to the queue less often, so things are balanced.
Finally, when the jobs are too fine-grained, and considering that the pool may enlarge to over 1K threads, there is very high contention on the queue when all threads go back to take more work (which happens very often), which may account for the spikes you are seeing. If the underlying implementation uses a blocking lock to access the queue, then context switches are very frequent which hurts performance a lot and makes it seem rather random.
answer of question 1:
this is because of Thread switching , thread switching (or context switching in OS concepts) is CPU clocks that takes to switch between each thread , most of times multi-threading increases the speed of programs and process but when it's process is so small and quick size then context switching will take more time than thread's self process so the whole program throughput decreases, you can find more information about this in O.S concepts books .
answer of question 2:
actually i have a overall insight of ThreadPool , and i cant explain what is it's structure exactly.
to learn more about ThreadPool start here ThreadPool Class
each version of .NET Framework adds more and more capabilities utilizing ThreadPool indirectly. such as Parallel.ForEach Method mentioned before added in .NET 4 along with System.Threading.Tasks which makes code more readable and neat. You can learn more on this here Task Schedulers as well.
At very basic level what it does is: it creates let's say 20 threads and puts them into a lits. Each time it receives a delegate to execute async it takes idle thread from the list and executes delegate. if no available threads found it puts it into a queue. every time deletegate execution completes it will check if queue has any item and if so peeks one and executes in the same thread.
From what I understand about the difference between Task & Thread is that task happened in the thread-pool while the thread is something that I need to managed by myself .. ( and that task can be cancel and return to the thread-pool in the end of his mission )
But in some blog I read that if the operating system need to create task and create thread => it will be easier to create ( and destroy ) task.
Someone can explain please why creating task is simple that thread ?
( or maybe I missing something here ... )
I think that what you are talking about when you say Task is a System.Threading.Task. If that's the case then you can think about it this way:
A program can have many threads, but a processor core can only run one Thread at a time.
Threads are very expensive, and switching between the threads that are running is also very expensive.
So... Having thousands of threads doing stuff is inefficient. Imagine if your teacher gave you 10,000 tasks to do. You'd spend so much time cycling between them that you'd never get anything done. The same thing can happen to the CPU if you start too many threads.
To get around this, the .NET framework allows you to create Tasks. Tasks are a bit of work bundled up into an object, and they allow you to do interesting things like capture the output of that work and chain pieces of work together (first go to the store, then buy a magazine).
Tasks are scheduled on a pool of threads. The specific number of threads depends on the scheduler used, but the default scheduler tries to pick a number of threads that is optimal for the number of CPU cores that you have and how much time your tasks are spending actually using CPU time. If you want to, you can even write your own scheduler that does something specific like making sure that all Tasks for that scheduler always operate on a single thread.
So think of Tasks as items in your to-do list. You might be able to do 5 things at once, but if your boss gives you 10000, they will pile up in your inbox until the first 5 that you are doing get done. The difference between Tasks and the ThreadPool is that Tasks (as I mentioned earlier) give you better control over the relationship between different items of work (imagine to-do items with multiple instructions stapled together), whereas the ThreadPool just allows you to queue up a bunch of individual, single-stage items (Functions).
You are hearing two different notions of task. The first is the notion of a job, and the second is the notion of a process.
A long time ago (in computer terms), there were no threads. Each running instance of a program was called a process, since it simply performed one step after another after another until it exited. This matches the intuitive idea of a process as a series of steps, like that of a factory assembly line. The operating system manages the process abstraction.
Then, developers began to add multiple assembly lines to the factories. Now a program could do more than one thing at once, and either a library or (more commonly today) the operating system would manage the scheduling of the steps within each thread. A thread is kind of a lightweight process, but a thread belongs to a process, and all the threads in a process share memory. On the other hand, multiple processes can't mess with each others' memory. So, the multiple threads in your web server can each access the same information about the connection, but Word can't access Excel's in-memory data structures because Word and Excel are running as separate processes. The idea of a process as a series of steps doesn't really match the model of a process with threads, so some people took to calling the "abstraction formerly known as a process" a task. This is the second definition of task that you saw in the blog post. Note that plenty of people still use the word process to mean this thing.
Well, as threads became more commmon, developers added even more abstractions over top of them to make them easier to use. This led to the rise of the thread pool, which is a library-managed "pool" of threads. You pass the library a job, and the library picks a thread and runs the job on that thread. The .NET framework has a thread pool implementation, and the first time you heard about a "task" the documentation really meant a job that you pass to the thread pool.
So in a sense, both the documentation and the blog post are right. The overloading of the term task is the unfortunate source of confusion.
Threads have been a part of .Net from v1.0, Tasks were introduced in the Task Parallel Library TPL which was released in .Net 4.0.
You can consider a Task as a more sophisticated version of a Thread. They are very easy to use and have a lot of advantages over Threads as follows:
You can create return types to Tasks as if they are functions.
You can the "ContinueWith" method, which will wait for the previous task and then start the execution. (Abstracting wait)
Abstracts Locks which should be avoided as per guidlines of my company.
You can use Task.WaitAll and pass an array of tasks so you can wait till all tasks are complete.
You can attach task to the parent task, thus you can decide whether the parent or the child will exist first.
You can achieve data parallelism with LINQ queries.
You can create parallel for and foreach loops
Very easy to handle exceptions with tasks.
*Most important thing is if the same code is run on single core machine it will just act as a single process without any overhead of threads.
Disadvantage of tasks over threads:
You need .Net 4.0
Newcomers who have learned operating systems can understand threads better.
New to the framework so not much assistance available.
Some tips:-
Always use Task.Factory.StartNew method which is semantically perfect and standard.
Take a look at Task Parallel Libray for more information
http://msdn.microsoft.com/en-us/library/dd460717.aspx
Expanding on the comment by Eric Lippert:
Threads are a way that allows your application to do several things in parallel. For example, your application might have one thread that processes the events from the user, like button clicks, and another thread that performs some long computation. This way, you can do two different things “at the same time”. If you didn't do that, the user wouldn't be to click buttons until the computation finished. So, Thread is something that can execute some code you wrote.
Task, on the other hand represents an abstract notion of some job. That job can have a result, and you can wait until the job finishes (by calling Wait()) or say that you want to do something after the job finishes (by calling ContinueWith()).
The most common job that you want to represent is to perform some computation in parallel with the current code. And Task offers you a simple way to do that. How and when the code actually runs is defined by TaskScheduler. The default one uses a ThreadPool: a set of threads that can run any code. This is done because creating and switching threads in inefficient.
But Task doesn't have to be directly associated with some code. You can use TaskCompletionSource to create a Task and then set its result whenever you want. For example, you could create a Task and mark it as completed when the user clicks a button. Some other code could wait on that Task and while it's waiting, there is no code executing for that Task.
If you want to know when to use Task and when to use Thread: Task is simpler to use and more efficient that creating your own Threads. But sometimes, you need more control than what is offered by Task. In those cases, it makese sense to use Thread directly.
Tasks really are just a wrapper for the boilerplate code of spinning up threads manually. At the root, there is no difference. Tasks just make the management of threads easier, as well as they are generally more expressive due to the lessening of the boilerplate noise.
I was thinking the other day about creating a little life simulator. I have only brushed over the idea and I was wondering the best way to implement so it will run efficiently.
At the lowest level there will be male and female entities wondering around and when they meet they will produce offspring.
I would like to use multithreading to manage the entities but im not sure of number of threads etc.
Would it be best to have a couple of threads that manage males and females, or would it be ok to start a thread for each entity so each instance is running on its own thread?
I have read a few posts about maximum number of threads and the limit ranges from 20-1000 threads in an app.
Anyone have any suggestions on architecture?
N
DO NOT HAVE ONE THREAD PER ENTITY. That is a recipe for disaster. That is not what threads were designed for. Threads are extremely heavyweight; remember, every thread consumes one million bytes of virtual address space immediately. Also, remember that every time two threads interact on a shared data structure -- like your simulated world -- they need to take out locks to prevent corruption. Your program will be a mass of hundreds of blocked threads, and blocked threads are useless; they can't work. High contention is anathema to good performance, and lots of threads sharing one data structure is nothing but constant contention.
The number of threads in your program should be one unless you have an extremely good reason to have two. In general the number of threads in a program should be as small as you can possibly make it while still getting the performance characteristics you need.
If your program really is "embarrassingly parallel" -- that is, it is extremely easy to do calculations in parallel without locking a shared data structure -- then the correct number of threads is equal to the number of processor cores in the machine. Remember, threads slow each other down. If you have four bank tellers, and each one is servicing one customer, things go fast. You are describing a situation where you have four bank tellers (CPU cores) each servicing a hundred people at the same time by handing out pennies round-robin.
Simulations are rarely embarrassingly parallel because it is hard to break up the work into independent parts. Ray tracing, for example, is embarrassingly parallel; the contents of one pixel do not depend on the contents of any other pixel, so they can be calculated in parallel. But you would not have one thread per pixel! You'd have one thread per processor, and let each of four processors work on a quarter of the pixels. It's hard to do that with simulations because the entities do interact with each other.
High quality professional simulators such as physics engines that need to deal with interacting entities do not solve their problems with threads. Typically they'll have one thread doing the simulation and one thread running the user interface, so that a costly simulation calculation does not hang the UI.
The right architecture for you is probably to have one thread going like blazes just doing the simulation. Work out all the interactions of the entities for a single frame and then signal the UI thread that it needs to update. This will allow you to figure out what your maximum frame rate is, by measuring how many microseconds it takes to work out all the interactions of every entity.
Using 100s of Threads (that would Sleep() a lot) is not a good solution. You will quickly run out of memory.
The TPL might make this idea workable but it wasn't really designed for this either.
Look into Discrete Event Simulation and Fibers. There are pseudo-Fiber libraries for .NET
I assume that your application manages the timeline in delta-Ts.
You could use C# 4.0's parallel framework and work with TASKs not threads.
Each delta-T run a parallel-for to update the entities.
The parallel framework will decide how many threads to open and manage the threads.
I've read that threads are very problematic. What alternatives are available? Something that handles blocking and stuff automatically?
A lot of people recommend the background worker, but I've no idea why.
Anyone care to explain "easy" alternatives? The user will be able to select the number of threads to use (depending on their speed needs and computer power).
Any ideas?
To summarize the problems with threads:
if threads share memory, you can get
race conditions
if you avoid races by liberally using locks, you
can get deadlocks (see the dining philosophers problem)
An example of a race: suppose two threads share access to some memory where a number is stored. Thread 1 reads from the memory address and stores it in a CPU register. Thread 2 does the same. Now thread 1 increments the number and writes it back to memory. Thread 2 then does the same. End result: the number was only incremented by 1, while both threads tried to increment it. The outcome of such interactions depend on timing. Worse, your code may seem to work bug-free but once in a blue moon the timing is wrong and bad things happen.
To avoid these problems, the answer is simple: avoid sharing writable memory. Instead, use message passing to communicate between threads. An extreme example is to put the threads in separate processes and communicate via TCP/IP connections or named pipes.
Another approach is to share only read-only data structures, which is why functional programming languages can work so well with multiple threads.
This is a bit higher-level answer, but it may be useful if you want to consider other alternatives to threads. Anyway, most of the answers discussed solutions based on threads (or thread pools) or maybe tasks from .NET 4.0, but there is one more alternative, which is called message-passing. This has been successfuly used in Erlang (a functional language used by Ericsson). Since functional programming is becoming more mainstream in these days (e.g. F#), I thought I could mention it. In genral:
Threads (or thread pools) can usually used when you have some relatively long-running computation. When it needs to share state with other threads, it gets tricky (you have to correctly use locks or other synchronization primitives).
Tasks (available in TPL in .NET 4.0) are very lightweight - you can split your program into thousands of tasks and then let the runtime run them (it will use optimal number of threads). If you can write your algorithm using tasks instead of threads, it sounds like a good idea - you can avoid some synchronization when you run computation using smaller steps.
Declarative approaches (PLINQ in .NET 4.0 is a great option) if you have some higher-level data processing operation that can be encoded using LINQ primitives, then you can use this technique. The runtime will automatically parallelize your code, because LINQ doesn't specify how exactly should it evaluate the results (you just say what results you want to get).
Message-passing allows you two write program as concurrently running processes that perform some (relatively simple) tasks and communicate by sending messages to each other. This is great, because you can share some state (send messages) without the usual synchronization issues (you just send a message, then do other thing or wait for messages). Here is a good introduction to message-passing in F# from Robert Pickering.
Note that the last three techniques are quite related to functional programming - in functional programming, you desing programs differently - as computations that return result (which makes it easier to use Tasks). You also often write declarative and higher-level code (which makes it easier to use Declarative approaches).
When it comes to actual implementation, F# has a wonderful message-passing library right in the core libraries. In C#, you can use Concurrency & Coordination Runtime, which feels a bit "hacky", but is probably quite powerful too (but may look too complicated).
Won't the parallel programming options in .Net 4 be an "easy" way to use threads? I'm not sure what I'd suggest for .Net 3.5 and earlier...
This MSDN link to the Parallel Computing Developer Center has links to lots of info on Parellel Programming including links to videos, etc.
I can recommend this project. Smart Thread Pool
Project Description
Smart Thread Pool is a thread pool written in C#. It is far more advanced than the .NET built-in thread pool.
Here is a list of the thread pool features:
The number of threads dynamically changes according to the workload on the threads in the pool.
Work items can return a value.
A work item can be cancelled.
The caller thread's context is used when the work item is executed (limited).
Usage of minimum number of Win32 event handles, so the handle count of the application won't explode.
The caller can wait for multiple or all the work items to complete.
Work item can have a PostExecute callback, which is called as soon the work item is completed.
The state object, that accompanies the work item, can be disposed automatically.
Work item exceptions are sent back to the caller.
Work items have priority.
Work items group.
The caller can suspend the start of a thread pool and work items group.
Threads have priority.
Can run COM objects that have single threaded apartment.
Support Action and Func delegates.
Support for WindowsCE (limited)
The MaxThreads and MinThreads can be changed at run time.
Cancel behavior is imporved.
"Problematic" is not the word I would use to describe working with threads. "Tedious" is a more appropriate description.
If you are new to threaded programming, I would suggest reading this thread as a starting point. It is by no means exhaustive but has some good introductory information. From there, I would continue to scour this website and other programming sites for information related to specific threading questions you may have.
As for specific threading options in C#, here's some suggestions on when to use each one.
Use BackgroundWorker if you have a single task that runs in the background and needs to interact with the UI. The task of marshalling data and method calls to the UI thread are handled automatically through its event-based model. Avoid BackgroundWorker if (1) your assembly does not already reference the System.Windows.Form assembly, (2) you need the thread to be a foreground thread, or (3) you need to manipulate the thread priority.
Use a ThreadPool thread when efficiency is desired. The ThreadPool helps avoid the overhead associated with creating, starting, and stopping threads. Avoid using the ThreadPool if (1) the task runs for the lifetime of your application, (2) you need the thread to be a foreground thread, (3) you need to manipulate the thread priority, or (4) you need the thread to have a fixed identity (aborting, suspending, discovering).
Use the Thread class for long-running tasks and when you require features offered by a formal threading model, e.g., choosing between foreground and background threads, tweaking the thread priority, fine-grained control over thread execution, etc.
Any time you introduce multiple threads, each running at once, you open up the potential for race conditions. To avoid these, you tend to need to add synchronization, which adds complexity, as well as the potential for deadlocks.
Many tools make this easier. .NET has quite a few classes specifically meant to ease the pain of dealing with multiple threads, including the BackgroundWorker class, which makes running background work and interacting with a user interface much simpler.
.NET 4 is going to do a lot to ease this even more. The Task Parallel Library and PLINQ dramatically ease working with multiple threads.
As for your last comment:
The user will be able to select the number of threads to use (depending on their speed needs and computer power).
Most of the routines in .NET are built upon the ThreadPool. In .NET 4, when using the TPL, the work load will actually scale at runtime, for you, eliminating the burden of having to specify the number of threads to use. However, there are ways to do this now.
Currently, you can use ThreadPool.SetMaxThreads to help limit the number of threads generated. In TPL, you can specify ParallelOptions.MaxDegreesOfParallelism, and pass an instance of the ParallelOptions into your routine to control this. The default behavior scales up with more threads as you add more processing cores, which is usually the best behavior in any case.
Threads are not problematic if you understand what causes problems with them.
For ex. if you avoid statics, you know which API's to use (e.g. use synchronized streams), you will avoid many of the issues that come up for their bad utilization.
If threading is a problem (this can happen if you have unsafe/unmanaged 3rd party dll's that cannot support multithreading. In this can an option is to create a meachism to queue the operations. ie store the parameters of the action to a database and just run through them one at a time. This can be done in a windows service. Obviously this will take longer but in some cases is the only option.
Threads are indispensable tools for solving many problems, and it behooves the maturing developer to know how to effectively use them. But like many tools, they can cause some very difficult-to-find bugs.
Don't shy away from some so useful just because it can cause problems, instead study and practice until you become the go-to guy for multi-threaded apps.
A great place to start is Joe Albahari's article: http://www.albahari.com/threading/.