I've noticed that in many of my services (which use multiple threads), the thread IDs keep increasing. Is this a sign of trouble? Am I somehow not returning threads to the pool, or is this increase normal behavior?
As long as your threads are returning (and not blocked, waiting, sleeping, or stuck in an infinite loop), you're okay. ManagedThreadId is just a unique identifier; it isn't a "thread count" at all (http://msdn.microsoft.com/en-us/library/system.threading.thread.managedthreadid.aspx):
Thread.ManagedThreadId
An integer that represents a unique identifier for this managed thread.
To be sure your threads are returning, pause your process in the VS debugger, tell it to freeze all threads, and have a look at the Threads debug window. In a runtime environment, I'd modify the thread code to increment a locked integer when the thread starts and decrement the same locked integer when the thread returns (use a try/finally block to ensure a thrown exception doesn't cause the decrement to be missed).
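The locked-integer bookkeeping can be sketched with `Interlocked` (a sketch; the class and method names here are illustrative, not from the original code):

```csharp
using System;
using System.Threading;

static class ActiveThreadCounter
{
    private static int _active; // threads currently inside DoWork

    public static int Active => Volatile.Read(ref _active);

    public static void DoWork(Action body)
    {
        Interlocked.Increment(ref _active);
        try
        {
            body(); // the real thread work goes here
        }
        finally
        {
            // finally guarantees the decrement even if body() throws
            Interlocked.Decrement(ref _active);
        }
    }
}
```

If `Active` keeps growing over time, threads are being started faster than they return.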
The 'correct' answer is: no, it is not normal. It's not that the CLR is broken; your app should (most of the time, unless you have some very good reason, and I can't even imagine what that might be) use Thread threads sparingly. If you are creating more than 100 threads, you are 99% certain to be doing something wrong.
You are either killing threads where you should be re-using them, OR you should be using thread-pool threads where you are currently using Thread threads.
EDIT: OK, you might not trust me. But MSDN says the same:
The value of the ManagedThreadId property does not vary over time, even if unmanaged code that hosts the common language runtime implements the thread as a fiber.
So, just to stress it again (which I hadn't made clear in my first attempt): you are not seeing thread IDs changing on existing threads. You are seeing different threads popping up (in the hundreds, by your own words). A new thread gets a new ID; an old thread does not change its ID.
Related
I saw David Fowler and Damian at NDC, where they talked about scaling.
At the beginning of the presentation they asked the audience: "How many threads are involved in this code?"
void Main()
{
    Task.Delay(1000).Wait();
}
Then Jon Skeet said: "at least 2".
The first thread is the main thread, and I assume the second is the one used by Delay (the timer), which at the end grabs another thread from the thread pool (I hope I'm right on this one). There is no await here, so I don't think a state machine is involved.
Question
But why is there another option for another thread ? ( he said at least 2). Can someone please clarify what's the thread usage in this simple example?
But why is there another option for another thread ?
Speculation, but: we know that there is not an OS level timer per delay; instead, as an implementation detail there is a linked-list (ordered by timeout) of pending timers, and only the first node is actually scheduled to the OS.
Now imagine the OS-level timeout triggers; it needs to do multiple things:
activate the callbacks of all items with the same timeout value
schedule an OS timeout for the next item with a later timeout
book-keeping
The infrastructure code probably doesn't want one slow callback to delay all the others, so it almost certainly hands callback activation to the thread pool rather than invoking the callbacks synchronously. It is possible, but not guaranteed, that the book-keeping etc. will happen fast enough that the same worker thread picks up the callbacks from the pool; the more likely outcome is that an unrelated thread-pool thread deals with that.
So; we have
your primary thread
the thread handling the OS timeout and scheduling callbacks onto the thread-pool
the thread-pool thread picking up the callback
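One way to observe this split from code (a sketch; the continuation below stands in for the timer callback, and the exact thread IDs vary per run):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        Console.WriteLine($"Main thread: {Thread.CurrentThread.ManagedThreadId}");

        // The timer infrastructure hands the completion to the pool, so the
        // continuation typically runs on a thread-pool thread while Main is
        // blocked in Wait(), just like the NDC example.
        Task.Delay(100).ContinueWith(_ =>
        {
            Console.WriteLine($"Continuation thread: {Thread.CurrentThread.ManagedThreadId}, " +
                              $"pool thread: {Thread.CurrentThread.IsThreadPoolThread}");
        }).Wait();
    }
}
```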
For a definitive answer, only Jon can answer the question, since he's the one who uttered the phrase you're asking about. Fortunately, in this case there's a real possibility he might.
That said, I would say the "at least" is mainly an acknowledgement that there are any number of other possible sources of threads, never mind that it depends on what the original question actually meant by "involved here". For example, simply accessing the thread pool could result in some minimum number of threads being created immediately; they may not be used, but they could still be there.
Furthermore, .NET has for some time had a multithreaded garbage collector. So the mere fact you're dealing with a .NET program means there could be that GC thread involved. For that matter, there could also be the finalizer thread.
All that said, I would say that generally you could expect there to be just the two threads. The thread pool by default will create threads immediately, but only as needed, up to some maximum number. And in the given code example there's not going to be any demand for garbage collection. When I run the example you show in a default .NET 5 project, I get just the two threads you'd expect.
Okay, so I wanted to know what happens when I use TaskCreationOptions.LongRunning. From this answer, I learned that for long-running tasks I should use this option because it creates a thread outside the thread pool.
Cool. But what advantage would I get when I create a thread outside the thread pool? And when should I do it, and when should I avoid it?
what advantage would I get when I create a thread outside threadpool?
The thread pool, as its name states, is a pool of threads which are allocated once and re-used, in order to save the time and resources needed to allocate a thread. The pool itself re-sizes on demand. If you queue more work items than there are workers in the pool, it will allocate more threads at 500 ms intervals, one at a time (this throttle exists to avoid allocating several threads at once when existing threads may be about to finish and become free to serve requests). If many long-running operations are performed on the thread pool, this causes "thread starvation": delegates get queued and run only once a thread frees up. That's why you want to avoid doing lengthy work on thread-pool threads.
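You can inspect the pool's sizing from code (a small sketch; the exact numbers vary per machine and runtime version):

```csharp
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        // The pool grows on demand; these calls show its current bounds.
        ThreadPool.GetMinThreads(out int minWorker, out int minIo);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIo);
        Console.WriteLine($"workers: min {minWorker}, max {maxWorker}");
        Console.WriteLine($"I/O completion: min {minIo}, max {maxIo}");

        // Queue more blocking work than minWorker and the extra threads are
        // injected slowly (roughly one per 500 ms), which is where the
        // starvation delays described above come from.
    }
}
```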
The Managed Thread-Pool docs also have a section on this question:
There are several scenarios in which it is appropriate to create and manage your own threads instead of using thread pool threads:
You require a foreground thread.
You require a thread to have a particular priority.
You have tasks that cause the thread to block for long periods of time. The thread pool has a maximum number of threads, so a large number of blocked thread pool threads might prevent tasks from starting.
You need to place threads into a single-threaded apartment. All ThreadPool threads are in the multithreaded apartment.
You need to have a stable identity associated with the thread, or to dedicate a thread to a task.
For more, see:
Thread vs ThreadPool
When should I not use the ThreadPool in .Net?
Dedicated thread or thread-pool thread?
"Long running" can be quantified pretty well, a thread that takes more than half a second is running long. That's a mountain of processor instructions on a modern machine, you'd have to burn a fat five billion of them per second. Pretty hard to do in a constructive way unless you are calculating the value of Pi to thousands of decimals in the fraction.
Practical threads can only take that long when they are not burning core but waiting a lot. Invariably on an I/O completion, like reading data from a disk, a network, a database server. And often that's the reason you'd start considering using a thread in the first place.
The threadpool has a "manager". It determines when a threadpool thread is allowed to start. It doesn't happen immediately when you start it in your program. The manager tries to limit the number of running threads to the number of CPU cores you have. It is much more efficient that way, context switching between too many active threads is expensive. And a good throttle, preventing your program from consuming too many resources in a burst.
But the threadpool manager has the very common problem with managers, it doesn't know enough about what is going on. Just like my manager doesn't know that I'm goofing off at Stackoverflow.com, the tp manager doesn't know that a thread is waiting for something and not actually performing useful work. Without that knowledge it cannot make good decisions. A thread that does a lot of waiting should be ignored and another one should be allowed to run in its place. Actually doing real work.
Just like you tell your manager that you go on vacation, so he can expect no work to get done, you tell the threadpool manager the same thing with LongRunning.
Do note that it isn't quite as bad as it perhaps sounds in this answer. In particular, .NET 4.0 hired a new manager that's a lot smarter at figuring out the optimum number of running threads. It does so with a feedback loop, collecting data to discover whether active threads actually get work done, and adjusts the optimum accordingly. The only problem with this approach is the common one when you close a feedback loop: you have to make it slow so the loop cannot become unstable. In other words, it isn't particularly quick at driving up the number of active threads.
If you know ahead of time that the thread is pretty abysmal, running for many seconds with no real cpu load then always pick LongRunning. Otherwise it is a tuning job, observing the program when it is done and tinkering with it to make it more optimal.
This is pretty simple. We have code like this:
// Get (or create) the named TLS slot, then store a value on the current thread
var slot = Thread.GetNamedDataSlot("myslot");
Thread.SetData(slot, value);
The current code exits the thread. Eventually the thread is re-allocated for more work. We expect (according to doc and many assertions in SO) that the value will still be there in the slot. And yet, at least sometimes, it isn't. It comes up null. The ManagedThreadId is the same as the one we set the value for, but the value has gone null.
We do call some opaque third-party assemblies, but I don't think that there's any way that other code could clear that slot without knowing its name.
Any thoughts on how this could happen? Could it be that .NET destroys the thread and later creates another one with the same ID? Does a thread live for the duration of the app domain?
The answer is that threads are not forever. A thread returned to the pool might be reused, or might be discarded. Take care when leaving something in TLS on a thread, if you don't code a destructor you could have a resource leak.
Here's a post that describes the same issue: http://rocksolid.gibraltarsoftware.com/development/logging/managed-thread-ids-unique-ids-that-arent-unique
Threadpool threads do not belong to you. You're not supposed to rely on their context at all, and that includes stuff like ThreadStatic data and LocalDataStoreSlot. There's so many things the runtime can do with threadpool threads that will break your code, it's not even funny. This gets even crazier when you start using await, for example (the same method can easily execute on multiple different threads, some from the thread pool, some not...).
As an implementation detail (nothing you should rely on), the .NET runtime manages the thread pool to be as big as required. In a properly asynchronous application, this means it will only have about 1-2x the number of CPU cores. However, if those threads become tied up, it will start creating new ones to accommodate the queued work items (unless, of course, the pool threads are actually saturating the CPU - new threads will not help in that case). When the peak load is done, it will similarly start releasing the threads.
ManagedThreadId is not unique over the lifetime of the AppDomain - it is only unique at any given moment. You shouldn't rely on it being unique, especially when dealing with threadpool threads. The ID will stay the same for a given thread over its lifetime (even if the underlying system thread changes - assuming, of course, the managed thread is actually implemented on top of a system thread). When you're working with threadpool threads, though, you are not working with actual threads - you're just posting work items to the thread pool.
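If what you actually need is state tied to a logical operation rather than to a physical thread, `AsyncLocal<T>` (available since .NET Framework 4.6) is the usual replacement for data slots; a sketch:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    // AsyncLocal flows with the logical call context, so the value survives
    // the hops between pool threads that await can introduce.
    private static readonly AsyncLocal<string> Current = new AsyncLocal<string>();

    static async Task Main()
    {
        Current.Value = "request-42";
        await Task.Delay(50);             // may resume on a different pool thread
        Console.WriteLine(Current.Value); // still "request-42"
    }
}
```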
By default, the CLR runs tasks on pooled threads, which is ideal for short-running compute-bound work. For longer-running and blocking operations, you can prevent use of a pooled thread as follows:
Task task = Task.Factory.StartNew(() => ...,
    TaskCreationOptions.LongRunning);
I am reading topic about thread and task. Can you explain to me what are "long[er]-running" and "short-running" tasks?
In general thread pooling, you distinguish short-running and long-running threads based on the comparison between their start-up time and run time.
Threads generally take some time to be created and get up to the point where they can start running your code.
This means that if you run a large number of threads where each takes a minute to start but only runs for a second (not accurate times, but the intent here is simply to show the relationship), the run time of each will be swamped by the time taken to get them going in the first place.
That's one of the reasons for using a thread pool: the threads aren't terminated once their work is done. Instead, they hang around to be reused so that the start-up time isn't incurred again.
So, in that sense, a long running thread is one whose run time is far greater than the time required to start it. In that case, the start-up time is far less important than it is for short running threads.
Conversely, short running threads are ones whose run time is less than or comparable to the start-up time.
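The relationship can be made concrete by timing a thread's start-up plus run time (a sketch; the measured numbers depend entirely on your machine):

```csharp
using System;
using System.Diagnostics;
using System.Threading;

class Program
{
    static void Main()
    {
        var sw = Stopwatch.StartNew();
        var t = new Thread(() => Thread.Sleep(1)); // a deliberately short-running body
        t.Start();
        t.Join();
        sw.Stop();
        // For a body this short, creation and scheduling overhead dominates
        // the total; for a thread running for minutes it would be noise.
        Console.WriteLine($"start-up + run: {sw.Elapsed.TotalMilliseconds:F2} ms");
    }
}
```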
For .NET specifically, it's a little different in operation. The thread pooling code will, once it's reached the minimum number of threads, attempt to limit thread creation to one per half-second.
Hence, if you know your thread is going to be long running, you should notify the scheduler so that it can adjust itself accordingly. This will probably mean just creating a new thread rather than grabbing one from the pool, so that the pool can be left to service short-running tasks as intended (no guarantees on that behaviour but it would make sense to do it that way).
However, that doesn't change the meaning of long-running and short-running, all it means is that there's some threshold at which it makes sense to distinguish between the two. For .NET, I would suggest the half-second figure would be a decent choice.
Is there any way to find out:
how many threads are waiting on a semaphore?
how many threads currently occupy the semaphore?
If I use a threadpool thread to wait on the semaphore, how do I let the main thread wait until the threadpool thread is finished?
Thanks.
This is forbidden knowledge in thread synchronization. Because it is utterly impossible to ever make this accurate. It represents an unsolvable race condition. When you use Habjan's approach, you'll conclude that there are, say, two threads waiting. A microsecond later another thread calls WaitOne() and there are three. But you'll make decisions based on that stale value.
Race conditions are nothing to mess with, they are incredibly hard to debug. They have a habit of making your code fail only once a week. As soon as you add instrumenting code to try to diagnose the reason your code fails, they'll stop occurring because that added code changed the timing.
Never do this.
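The third part of the question, making the main thread wait for a pool thread, does have a race-free answer, unlike the counting parts: have the worker signal an event when it's done. A minimal sketch:

```csharp
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        using var done = new ManualResetEventSlim(false);

        ThreadPool.QueueUserWorkItem(_ =>
        {
            try
            {
                // ... the work that waits on the semaphore goes here ...
            }
            finally
            {
                done.Set(); // signal even if the work throws
            }
        });

        done.Wait(); // main thread blocks until the worker signals
        Console.WriteLine("worker finished");
    }
}
```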