In my HFT trading application I have several places where I receive data from network. In most cases this is just a thread that only receives and process data. Below is part of such processing:
public Reciver(IPAddress mcastGroup, int mcastPort, IPAddress ipSource)
{
thread = new Thread(ReceiveData);
s = new Socket(AddressFamily.InterNetwork, SocketType.Dgram, ProtocolType.Udp);
s.ReceiveBufferSize = ReceiveBufferSize;
var ipPort = new IPEndPoint(LISTEN_INTERFACE/* IPAddress.Any*/, mcastPort);
s.Bind(ipPort);
option = new byte[12];
Buffer.BlockCopy(mcastGroup.GetAddressBytes(), 0, option, 0, 4);
Buffer.BlockCopy(ipSource.GetAddressBytes(), 0, option, 4, 4);
Buffer.BlockCopy(/*IPAddress.Any.GetAddressBytes()*/LISTEN_INTERFACE.GetAddressBytes(), 0, option, 8, 4);
}
public void ReceiveData()
{
byte[] byteIn = new byte[4096];
while (needReceive)
{
if (IsConnected)
{
int count = 0;
try
{
count = s.Receive(byteIn);
}
catch (Exception e6)
{
Console.WriteLine(e6.Message);
Log.Push(LogItemType.Error, e6.Message);
return;
}
if (count > 0)
{
OnNewMessage(new NewMessageEventArgs(byteIn, count));
}
}
}
}
This thread works forever once created. I just wonder if I should configure this thread to run on certain core? As I need lowest latency I want to avoid context switch. As I want to avoid context switch I better to run the same thread on the same processor core, right?
Taking into account that i need lowest latency is that correct that:
It would be better to set "thread afinity" for the most part of the "long-running" threads?
It would be better to set "thread afinity" for the thread from my example above?
I rewriting above code to c++ right now to port to Linux later if this is important however I assume that my question is more about hardware than language or OS.
I think the algorithm that has as little latency as possible would be to pin your threads to one core and set them to realtime priority (or whatever is the highest one).
This will cause the OS to evict any other thread which happens to use that core.
Hopefully the CPU cache will still contain useful data when your thread gets scheduled there. For that reason I like the idea of pinning to a core.
You should probably set your entire process to a high priority class and minimize other activity on your box. Also turn off unused hardware because it might generate interrupts. Fix your NIC's interrupts to a different CPU core (some better NICs can do that).
As I want to avoid context switch I better to run the same thread on the same processor core, right?
No. A context switch will not necessarily be avoided by setting affinity to one CPU. You have no control over context switches, they are in the hands of the OS thread scheduler. They occur when a thread quantum (time slice) has elapsed or when a higher priority thread interrupts your thread.
Latency you talk about, I assume is network or memory latency, is not at all avoided by setting thread affinity. Memory latency can be avoided by making your code cache friendly (ie it can all be in the L1 - L2 caches, for example). Network latency is really just part of any network, and not something I suspect you can do much about.
As Tony The Lion has already answered your question, I would like to address your comment:
"why not setting thread afinity to my code? why thread from my example need to travel between cores?"
Your thread doesn't travel anywhere.
Context switch happens when OS thread scheduler decides to give your thread a slice of time to execute. Then the environment is prepared for your thread, e.g. the CPU registers
are set up to correct values etc. This is called context switch.
So regardless of thread affinity, the same CPU setup work has to be done, whether it is the same CPU/core which was used in previous slice when your thread was running or another one. And at this moments, your computer has more info to do it properly then you do at compile time.
You seem to believe that thread somehow resides on the CPU, but it is not so. What you use is a logical thread and there can be hundreds or even thousands of them. Common CPUs, OTOH, usually have 1 or 2 hardware threads per core, and your logical thread gets mapped to one of these every time it is scheduled, even if OS always picks the same HW thread.
EDIT: it seems that you have already picked the answer you want to hear and I don't like long discussion threads on answers so I will put it here.
you should try and measure it. I believe that you will be dissapointed
running some threads on high priority thread might easily mess up other processes
you are worried about context switch latency, but you have no problems that GC thread will freeze your thread? BTW, on which core will your GC thread run? :)
what if your highest priority thread blocks GC thread? memory leaks? do you know what is priority of that thread so you are sure it would work?
really, why not C or hand optimized assembly if microseconds are important?
as someone suggested, you should use an RTOS if you want to control this aspect of execution
it doesn't seem likely that your data travels through data center just 4-5 times slower than it takes to setup a thread context on one machine, but who knows...
Related
I understand that thread pool priority should/can not be changed by the running process, but is the priority of particular task running on the thread pool somewhat priced-in with the calling process priority?
in other words, do all tasks in thread pool run in the same priority regardless the calling process priority?
thank you
update 1: i should have been more specific, i refer to thread inside Parallel.ForEach
I understand that thread pool priority should/can not be changed by the running process,
That's not exact. You can change Thread Pool's thread priority (inside delegate itself) and it'll run with new priority but default one will be restored when its task finishes and it'll be send back to pool.
ThreadPool.QueueUserWorkItem(delegate(object state) {
Thread.CurrentThread.Priority = ThreadPriority.Highest;
// Code in this function will run with Highest priority
});
is the priority of particular task running on the thread pool somewhat priced-in with the calling process priority?
Yes, and it doesn't apply to Thread Pool's threads only. In Windows process' priority is given by its class (from IDLE_PRIORITY_CLASS to REALTIME_PRIORITY_CLASS). Together with thread's priority (from THREAD_PRIORITY_IDLE to THREAD_PRIORITY_TIME_CRITICAL) it'll be used to calculate thread's final priority.
From MSDN:
The process priority class and thread priority level are combined to form the base priority of each thread.
Note that it's not simply a base priority plus an offset:
NORMAL_PRIORITY_CLASS + THREAD_PRIORITY_IDLE == 1
NORMAL_PRIORITY_CLASS + THREAD_PRIORITY_TIME_CRITICAL == 15
But:
REALTIME_PRIORITY_CLASS + THREAD_PRIORITY_IDLE == 16
REALTIME_PRIORITY_CLASS + THREAD_PRIORITY_TIME_CRITICAL == 31
Moreover threads can have a temporary boost (decided and managed by Windows Scheduler). Be aware that a process can also change its own priority class.
in other words, do all tasks in thread pool run in the same priority regardless the calling process priority?
No, thread's priority depends on process' priority (see previous paragraph) and each thread in pool can temporary have a different priority. Also note that thread priority isn't affected by caller thread's priority:
ThreadPool.QueueUserWorkItem(delegate(object s1) {
Thread.CurrentThread.Priority = ThreadPriority.Highest;
ThreadPool.QueueUserWorkItem(delegate(object s2) {
// This code is executed with ThreadPriority.Normal
Thread.CurrentThread.Priority = ThreadPriority.Lowest;
// This code is executed with ThreadPriority.Lowest
});
// This code is executed with ThreadPriority.Highest
});
EDIT: .NET tasks uses Thread Pool than what wrote above still applies. If, for example, you're enumerating a collecition with Parallel.ForEach to increase thread priority you have to do it inside your loop:
Parallel.ForEach(items, item => {
Thread.CurrentThread.Priority = ThreadPriority.Highest;
// Your code here...
});
Just a warning: be careful when you change priorities. If, for example, two threads use a shared resource (protected by a lock), there are many races to acquire that resources and one of them has highest priority then you may end with a very high CPU usage (because of spinning behavior of Monitor.Enter). This is just one issue, please refer to MSDN for more details (increasing thread's priority may even result is worse performance).
do all tasks in thread pool run in the same priority regardless the calling process priority?
They have to. The only thing dropped off at the pool is a delegate. That holds a reference to an object but not to the Thread that dropped it off.
The ones that are currently running have the same priority. But there's a queue for the ones that aren't running yet - so in practice, there is a "priority". Even more confusingly, thread priorities can be boosted (and limited) by the OS, for example when two threads in the threadpool depend on each other (e.g. one is blocking the other). And of course, any time something is blocking on the threadpool, you're wasting resources :D
That said, you shouldn't really change thread priorities at all, threadpool or not. You don't really need to, and thread (and process) priorities don't work the way you probably expect them to - it's just not worth it. Keep everything at normal, and just ignore that there's a Priority property, and you'll avoid a lot of unnecessary problems :)
You'll find a lot of nice explanations on the internet - for example, http://blog.codinghorror.com/thread-priorities-are-evil/. Those are usually dated, of course - but so is the concept of thread priorities, really - they were designed for single-core machines at a time when OSes weren't all that good at pre-emptive multitasking, really.
System.Threading.ThreadPool.SetMaxThreads(50, 50);
File.ReadLines().AsParallel().WithDegreeOfParallelism(100).ForAll((s)->{
/*
some code which is waiting external API call
and do not utilize CPU
*/
});
I have never got threads count more than CPU count in my system.
Can I use PLINQ and get more than one thread per CPU?
If you're calling external web API, you might be hitting the limit of concurrent simultaneous connections, which is set to 2. In the begining of your application do the following:
System.Net.ServicePointManager.DefaultConnectionLimit = 4096;
System.Net.ServicePointManager.Expect100Continue = false;
Try if that helps. If not, there might be some other bottleneck within the routine you're trying to parallelize.
Also, just like other responders said, ThreadPool decides how many threads to spin up based on load. In my experience with TPL I've seen that thread cound increases by time: longer the app runs, and heavier load gets, more threads are spun up.
PLINQ uses a hill-climbing algorithm to determine the optimum size of the thread pool which is used by the TPL. I think that if you put a lot of I/O in your tasks, seeing more threads than the cpu count is likeable.
That said, I've never seen more threads than the cpu count :) . But maybe I never had the right situation.
I tested this with the following code:
var lines = Enumerable.Range(0, 200).ToArray();
int currentThreads = 0;
int maxThreads = 0;
object l = new object();
lines.AsParallel().WithDegreeOfParallelism(100).ForAll(
s =>
{
lock (l)
{
currentThreads++;
if (currentThreads > maxThreads)
{
maxThreads = currentThreads;
Console.WriteLine(maxThreads);
}
}
Thread.Sleep(3000);
lock (l)
{
currentThreads--;
}
});
Console.WriteLine();
Console.WriteLine(maxThreads);
Basically, it records the current number of concurrently executing iterations and then saves the maximum encountered value.
The results vary quite a bit, between 15 and 25, but it's always much more than the number of CPUs my computer has (4). Increasing the sleep time increases the maximum number of concurrent threads. So it looks like the limiting factor here is the ThreadPool: it will create new threads slowly, especially when jobs are being completed relatively quickly.
If you want to increase the number of threads used, you would need to use SetMinThreads() (not SetMaxThreads()). If I set the minimum to 50, the number of threads actually used is around 60.
But having dozens of threads that do nothing but wait is quite inefficient, especially when it comes to memory consumption. You should consider using asynchronous methods instead.
PLINQ does not fit in this case.
I have found next article useful for me.
http://msdn.microsoft.com/en-us/library/hh228609(v=vs.110).aspx
Short answer: nope.
The amount of threading is simply up to the .Net Framework runtime. There is no developer control for controlling the number of threads for TPL (Task Parallel Library) usage.
EDIT
Thanks to some other feedback: it is actually possible--but not recommended--to manually control the number of threads in the ThreadPool, which PLINQ and TPL use.
It's my opinion that any parallelization problem needs to be carefully thought out, and carefully constructed and tested. There's a lot of subtlety in this.
In my C#/.NET 3.5 program I am using Threadpool threads ( delegate+BeginInvoke/EndInvoke) to parallelize and speed up some file loading. SystemInternals tool ProcessExplorer shows that number of threads in process is increasing over time, while I would expect to stay the same. Looks like some Threads/Threads handles stay hanging around for no reason.
Interestingly enough, I can not find pattern how threads grow and seems that happen sporadically, without repeatable pattern each time I start application. I spend some time analyzing and here are some observations:
1) code looks like this:
ArrayList IAsyncResult_s = new ArrayList();
AsyncProcessing thread1 = processRasterLayer;
... ArrayList filesToRender....
foreach (string FileName in filesToRender)
{
string fileName2 = FileName;
GeoImage partialImage1;
IAsyncResult asyncResult = thread1.BeginInvoke(
fileName2, .....,
out partialImage1, ..., null, null);
IAsyncResult_s.Add(asyncResult);
asyncResult = null;
}
.................
//block and render all
foreach (IAsyncResult asyncResult in IAsyncResult_s)
{
GeoImage partialImage1;
thread1.EndInvoke(
out partialImage1, , asyncResult);
//render image.. some calls to render partial image here
partialImage1.Dispose();
partialImage1 = null;
}
IAsyncResult_s.Clear();
IAsyncResult_s = null;
thread1 = null;
2) Number of Process Threads
My trace shows that during execution inside loop, ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads); gives numbers like 493, 1000.
At the end of loops , , ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads); gives numbers 500, 1000. So, number of available thread returns to same
Number of process threads reported by SystemInternals ProcessExplorer and API System.Diagnostics.Process.GetCurrentProcess().Threads.Count is the 16 before loops, and around 21 after loops.
If I call againg those loops, number of threads in process grows, but not by fixed nubmer each time, but grows 1-4 each time I repeat above code, so grows like 16->21->22->26->31...
3)Forced garbage collection didn’t htelp
I tried to froce garbage collection to get rid of those extra threads, but that didn’t removed them from process.
4)Profling tools
I was using RedGates Memory and Performace profilers, but hasen’t found obvious reason. I saw several extra threas and their object (ThreadContext etc) hanging, but saw no object holding those threads in memory. I am prety sure those extra threads were involved into loops work above, since I added thread name inside calls, and they still have that name I gave them.
5) Intelitrace
Intelitrace debuging showed also extra threads hanging. They still have names I gave them. But interestingly, it also showed that same thread that is hanging now, was used by above loop in the past, but also same thread was executing some timer related evens from timers form my code.
6) Locating issue
So, When I disable above loops that process filse Asynchroniously, and load files sequentialy, I do not have extra threads, and number of threads in my application is constant and and around 16.
7) Regarding SetMaxThreads :Here how it looks on my machine (XP, .NET 3.5):
Code like this:
ThreadPool.GetAvailableThreads(out AvailableWorkerThreads, out AvailableCompletionPortThreads);
ThreadPool.GetMaxThreads(out MaxWorkerThreads, out MaxCompletionPortThreads);
ThreadPool.GetMinThreads(out MinWorkerThreads, out MinCompletionPortThreads);
Gives result:
MinWorkerThreads:2 MaxWorkerThreads:500 MinCompletionPortThreads:2 MaxCompletionPortThreads:1000 AvailableWorkerThreads:500 AvailableCompletionPortThreads:1000
My app is using maybe 8 worker threads at the same time. I see no problem with SetMaxThreads.
8)
Functionally, I have no problems so far with this solution above. But somehow, if tools report that number of threads in my app is growing, it looks like “resource leak” of some kind, and I would like to address it. It looks like some thread handles are hanging around for no reason.
9) Here is one interesting article. It sasy that thread resources are cleaned once EndInvoke is called. I am doing so in my code. Article sasy: ..”. Because EndInvoke cleans up after the spawned thread, you must make sure that an EndInvoke is called for each BeginInvoke.” “If the thread pool thread has exited, EndInvoke does the following: It cleans up the exited thread's loose ends and disposes of its resources.” See: http://en.csharp-online.net/Asynchronous_Programming%E2%80%94BeginInvoke_EndInvoke
10) Another interesting article. Author says he had thread handle leaks because he was creating controls from non-gui thread. It is pretty elaborate article, see: http://msmvps.com/blogs/senthil/archive/2008/05/29/the-case-of-the-leaking-thread-handles.aspx
11) Another interesting article. It talks about ThreadPool.SetMinThreads property. It seems that it is not ThreadPool.SetMaxThreads but ThreadPool.SetMinThreads that enables useful control over ThradPool. This article is an eye-opener for me, and made me think about how ThreadPool works and performance problems it might cause. Article is: http_://www.dotnetperls.com/threadpool-setminthreads . Anoter similar one is : http_://www.codeproject.com/Articles/3813/NET-s-ThreadPool-Class-Behind-The-Scenes
12) Another interesting article. It is talking about throttling issue with ThreadPool. Article mentions ThreadPool limit of 2 new threads per second increase. See http_://social. msdn. microsoft. com/forums/en-US/clr/thread/3325cb32-371b-4f3e-965f-6ca88538dc3e/
13) So, in maybe 30 tests I saw only 2 times that number of threads allocated would shrink. But, it did happen. I saw once thread number going like 16->....->31->61-> ->30->16. So, it went back to 16. It doesn’t happen often, and it is not about time waited, it was like big activity in process, followed by a period of constant low level activity.
14) ThreadPool.SetMinThreads Method documentation. It talks about 2 new threads per second limit for threadpool. It is not clear if setting this property would remove that limit. http_://msdn.microsoft. com/en-ca/library/system. threading.threadpool.setminthreads(v=vs.90).aspx
So the answer is: there's no leak here. This is how the thread pool works. It keeps around threads that finished working so you don't have to pay the price of thread creation next time you need one. If you have many concurrent work items then the number of threads in the pool will increase but they'll max out at MaxWorkerThreads. (And it has nothing to do with the garbage collector.)
See this article for more info:
http://msdn.microsoft.com/en-us/library/0ka9477y.aspx
i would consider a consumer producer pattern. the idea behind a threadpool is to recycle threads, not create hundreds of new. in best case you have for each cpu one thread, and queue the work. this will be sure faster as you avoid useless context switches and waits for creating new threads, as far as i remember the net threadpool waits about one second until a new thread is created, to give other threads a chance to get recycled.
I have a game in XNA which needs to do network calls. In the update method I determine what needs to be sent and then add it to a list of stuff to send. Then I run the network call. This slows down the application alot obviously. So I first tried creating a new thread like this in the update to make it do it on a seperate thread:
Thread thread;
thread = new Thread(
new ThreadStart(DoNetworkThing));
thread.Start();
I presume creating threads has overhead etc which results in this being even slower.
Finally, I made a method that has while(true){DoNetworkThing();} in it which will keep looping round and run the network call over and over again(it does check if its already busy with one, and whether there is stuff to send). That method I called in the LoadContent method in a thread, so it will run alongside the game in its own thread.
But that is really slow too.
So what am I doing wrong? What is the best way of doing this?
Thanks
I had this exact problem when initially attempting to use threads on XNA -- adding a thread slowed everything down. It turned out that thread affinity was the issue.
On Xbox 360, by default, all threads run on the same processor core; this behaviour differs from Windows, where the kernel places threads onto other cores for you. (See answers on this thread at MSDN social for detail.)
To get around it, you need to set the affinity for your thread to another core in your thread function:
void DoNetworkThing()
{
#ifdef XBOX
Thread.SetProcessorAffinity(3); // see note below
#endif
/* your code goes here */
}
Thread thread = new Thread(
new ThreadStart(DoNetworkThing));
thread.Start();
The documentation for Thread.SetProcessorAffinity states that on XNA, cores 0 and 2 are reserved for the framework; core 1 and cores 3-5 are free for your use. The main thread (the thread containing your main() function) will be on core 1. The code above arbitrarily sets the thread to run on core 3, but you can choose a different core, or add code to programmatically choose a core (which you may want to do if you have more than a few threads).
Finally, note the #ifdef guard - Thread.SetProcessorAffinity is only available on Xbox; it won't even compile on Windows!
OK, so first of all- creating a new thread does have some minimal overhead, but you're only creating the thread once (or a few times)... so unless you're creating hundreds of threads (which you shouldn't be), then you will not need to worry about the overhead.
Let's look at your first example:
Thread thread;
thread = new Thread(new ThreadStart(DoNetworkThing));
// you should set the thread to background, unless you
// want it to live on even after your application closes
thread.IsBackground = true;
thread.Start();
If you were sticking with that model, the DoNetworkThing function would look like this:
void DoNetworkThing()
{
while(someConditionIsTrue)
{
// do the networking stuff here
}
}
I presume that in your next attempt you did something like this:
Thread thread = new Thread(()=>
{
while(true)
{
DoNetworkThing();
}
});
thread.IsBackground = true;
thread.Start();
Both approaches are fine, but the only difference is the content of DoNetworkingThing. In the second approach it will look like this:
void DoNetworkThing()
{
// do the networking thing, but note
// that this time your infinite while
// loop is outside the function
}
Now you said that both of these attempts are really slow, but nothing given in your examples would indicate that there should be any noticeable performance impact. It would be great if you can:
Give us an example that would demonstrate the slow down.
Tell us how many cores you have on the machine that the game is running on.
Finally, if you're mingling with threads then I would STRONGLY suggest that you pickup a good book on multithreading and really familiarize yourself with the concepts behind it, go through the exercises and write a couple of simple programs. It will take you a couple of months, but if you're not familiar with threading then it will help you learn how to avoid a lot of ugly mistakes.
A lot of people recommend Joe Duffy's Concurrent Programming on Windows, but feel free to check some other C# multithreading books too.
It might be worthwhile to check if using a ThreadPool improves performance. Creating many threads may be expensive.
http://msdn.microsoft.com/en-us/library/h4732ks0(v=vs.100).aspx
Alright...I've given the site a fair search and have read over many posts about this topic. I found this question: Code for a simple thread pool in C# especially helpful.
However, as it always seems, what I need varies slightly.
I have looked over the MSDN example and adapted it to my needs somewhat. The example I refer to is here: http://msdn.microsoft.com/en-us/library/3dasc8as(VS.80,printer).aspx
My issue is this. I have a fairly simple set of code that loads a web page via the HttpWebRequest and WebResponse classes and reads the results via a Stream. I fire off this method in a thread as it will need to executed many times. The method itself is pretty short, but the number of times it needs to be fired (with varied data for each time) varies. It can be anywhere from 1 to 200.
Everything I've read seems to indicate the ThreadPool class being the prime candidate. Here is what things get tricky. I might need to fire off this thing say 100 times, but I can only have 3 threads at most running (for this particular task).
I've tried setting the MaxThreads on the ThreadPool via:
ThreadPool.SetMaxThreads(3, 3);
I'm not entirely convinced this approach is working. Furthermore, I don't want to clobber other web sites or programs running on the system this will be running on. So, by limiting the # of threads on the ThreadPool, can I be certain that this pertains to my code and my threads only?
The MSDN example uses the event drive approach and calls WaitHandle.WaitAll(doneEvents); which is how I'm doing this.
So the heart of my question is, how does one ensure or specify a maximum number of threads that can be run for their code, but have the code keep running more threads as the previous ones finish up until some arbitrary point? Am I tackling this the right way?
Sincerely,
Jason
Okay, I've added a semaphore approach and completely removed the ThreadPool code. It seems simple enough. I got my info from: http://www.albahari.com/threading/part2.aspx
It's this example that showed me how:
[text below here is a copy/paste from the site]
A Semaphore with a capacity of one is similar to a Mutex or lock, except that the Semaphore has no "owner" – it's thread-agnostic. Any thread can call Release on a Semaphore, while with Mutex and lock, only the thread that obtained the resource can release it.
In this following example, ten threads execute a loop with a Sleep statement in the middle. A Semaphore ensures that not more than three threads can execute that Sleep statement at once:
class SemaphoreTest
{
static Semaphore s = new Semaphore(3, 3); // Available=3; Capacity=3
static void Main()
{
for (int i = 0; i < 10; i++)
new Thread(Go).Start();
}
static void Go()
{
while (true)
{
s.WaitOne();
Thread.Sleep(100); // Only 3 threads can get here at once
s.Release();
}
}
}
Note: if you are limiting this to "3" just so you don't overwhelm the machine running your app, I'd make sure this is a problem first. The threadpool is supposed to manage this for you. On the other hand, if you don't want to overwhelm some other resource, then read on!
You can't manage the size of the threadpool (or really much of anything about it).
In this case, I'd use a semaphore to manage access to your resource. In your case, your resource is running the web scrape, or calculating some report, etc.
To do this, in your static class, create a semaphore object:
System.Threading.Semaphore S = new System.Threading.Semaphore(3, 3);
Then, in each thread, you do this:
System.Threading.Semaphore S = new System.Threading.Semaphore(3, 3);
try
{
// wait your turn (decrement)
S.WaitOne();
// do your thing
}
finally {
// release so others can go (increment)
S.Release();
}
Each thread will block on the S.WaitOne() until it is given the signal to proceed. Once S has been decremented 3 times, all threads will block until one of them increments the counter.
This solution isn't perfect.
If you want something a little cleaner, and more efficient, I'd recommend going with a BlockingQueue approach wherein you enqueue the work you want performed into a global Blocking Queue object.
Meanwhile, you have three threads (which you created--not in the threadpool), popping work out of the queue to perform. This isn't that tricky to setup and is very fast and simple.
Examples:
Best threading queue example / best practice
Best method to get objects from a BlockingQueue in a concurrent program?
It's a static class like any other, which means that anything you do with it affects every other thread in the current process. It doesn't affect other processes.
I consider this one of the larger design flaws in .NET, however. Who came up with the brilliant idea of making the thread pool static? As your example shows, we often want a thread pool dedicated to our task, without having it interfere with unrelated tasks elsewhere in the system.