I am creating an app that processes a huge amount of data. I want to use threading in C# to make the processing faster. Please see the example code below.
private static void MyProcess(Object someData)
{
    // Do some data processing
}

static void Main(string[] args)
{
    for (int task = 1; task < 10; task++)
    {
        ThreadPool.QueueUserWorkItem(new WaitCallback(MyProcess), task);
    }
}
Does this mean that a new thread will be created on each loop iteration, passing the task number to the "MyProcess" method (10 threads total)? Also, are the threads going to process concurrently?
The number of threads a thread pool will start depends on multiple factors; see The managed thread pool.
Basically, you are queuing 10 work items here, which are likely to start threads immediately.
The threads will most likely run concurrently, depending on the machine and the number of processors.
If you queue a large number of work items, they will end up in a queue and start running as soon as a thread becomes available.
The calls will be scheduled on the thread pool. It does not guarantee that 10 threads will be created nor that all 10 tasks will be executed concurrently. The number of threads in the thread pool depends on the hardware and is chosen automatically to provide the best performance.
These articles contain good explanations of how it works:
https://owlcation.com/stem/C-ThreadPool-and-its-Task-Queue-Example
https://learn.microsoft.com/en-us/dotnet/api/system.threading.threadpool?redirectedfrom=MSDN&view=netframework-4.8
https://www.c-sharpcorner.com/article/thread-pool-in-net-core-and-c-sharp/
This Stackoverflow question explains the difference between ThreadPool and Thread:
Thread vs ThreadPool
Your method will be queued 9 times (you start at 1, not 0) and will be executed when a thread pool thread becomes available.
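To see the scheduling behaviour for yourself, here is a minimal sketch (the CountdownEvent wait and the console output are my own additions, not part of the question's code) that prints which pool thread each queued item actually runs on:

using System;
using System.Threading;

class Program
{
    static void Main()
    {
        // One signal per queued work item so Main can wait for all of them.
        using (var done = new CountdownEvent(9))
        {
            for (int task = 1; task < 10; task++)
            {
                ThreadPool.QueueUserWorkItem(state =>
                {
                    Console.WriteLine("Item {0} on pool thread {1}",
                        state, Thread.CurrentThread.ManagedThreadId);
                    Thread.Sleep(500); // simulate some work
                    done.Signal();
                }, task);
            }
            done.Wait(); // typically far fewer than 9 distinct thread ids appear
        }
    }
}

On an idle pool the first few items usually get threads at once; the rest reuse those threads as they free up, so you rarely see one thread per item.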
I have a Task which I do not await because I want it to continue its own logic in the background. Part of that logic is to delay 60 seconds and check back in to see if some minute work is to be done. The abbreviated code looks something like this:
public Dictionary<string, Task> taskQueue = new Dictionary<string, Task>();

// Entry point
public void DoMainWork(string workId, XmlDocument workInstructions)
{
    // A work task (i.e. "workInstructions") is actually a plugin which might
    // use its own tasks internally or any other logic it sees fit.
    var workTask = Task.Factory.StartNew(() =>
    {
        // Main work code that interprets workInstructions
        // .........
        // .........
        // etc.
    }, TaskCreationOptions.LongRunning);

    // Add the work task to the queue of currently running tasks
    taskQueue.Add(workId, workTask);

    // Delay a period of time and then see if we need to extend our timeout
    // for doing main work code
    this.QueueCheckinOnWorkTask(workId); // Note the non-awaited task
}

private async Task QueueCheckinOnWorkTask(string workId)
{
    DateTime startTime = DateTime.Now;

    // Delay 60 seconds
    await Task.Delay(60 * 1000).ConfigureAwait(false);

    // Find out how long Task.Delay delayed for.
    TimeSpan duration = DateTime.Now - startTime; // THIS SOMETIMES DENOTES TIMES MUCH LARGER THAN EXPECTED, I.E. 80+ SECONDS VS. 60

    if (!taskQueue.ContainsKey(workId))
    {
        // Do something based on work being complete
    }
    else
    {
        // Work is not complete, inform outside source we're still working
        QueueCheckinOnWorkTask(workId); // Note the non-awaited task
    }
}
Keep in mind, this is example code, just to show an extremely minimal version of what is going on in my actual program.
My problem is that Task.Delay() is delaying for longer than the time specified. Something is blocking this from continuing in a reasonable timeframe.
Unfortunately I haven't been able to replicate the issue on my development machine and it only happens on the server every couple of days. Lastly, it seems related to the number of work tasks we have running at a time.
What would cause this to delay longer than expected? Additionally, how might one go about debugging this type of situation?
This is a follow-up to my other question, which did not receive an answer: await Task.Delay() delaying for longer than expected
Most often that happens because of thread pool saturation. You can clearly see its effect with this simple console application (I measure time the same way you do; it doesn't matter in this case whether we use a Stopwatch or not):
public class Program {
    public static void Main() {
        for (int j = 0; j < 10; j++)
            for (int i = 1; i < 10; i++) {
                TestDelay(i * 1000);
            }
        Console.ReadKey();
    }

    static async Task TestDelay(int expected) {
        var startTime = DateTime.Now;
        await Task.Delay(expected).ConfigureAwait(false);
        var actual = (int) (DateTime.Now - startTime).TotalMilliseconds;
        ThreadPool.GetAvailableThreads(out int aw, out _);
        ThreadPool.GetMaxThreads(out int mw, out _);
        Console.WriteLine("Thread: {3}, Total threads in pool: {4}, Expected: {0}, Actual: {1}, Diff: {2}", expected, actual, actual - expected, Thread.CurrentThread.ManagedThreadId, mw - aw);
        Thread.Sleep(5000);
    }
}
This program starts 90 tasks which await Task.Delay for 1 to 9 seconds, and then use Thread.Sleep for 5 seconds to simulate work on the thread on which the continuation runs (a thread pool thread). It also outputs the total number of threads in the thread pool, so you can see how it increases over time.
If you run it, you will see that in almost all cases (except the first 8) the actual time after the delay is much longer than expected, in some cases 5 times longer (you delayed for 3 seconds but 15 seconds have passed).
That's not because Task.Delay is so imprecise. The reason is that the continuation after the await has to be executed on a thread pool thread, and the thread pool will not always give you a thread when you request one. It can decide that, instead of creating a new thread, it's better to wait for one of the currently busy threads to finish its work. It waits for a certain time, and if no thread becomes free, it still creates a new thread. If you request 10 thread pool threads at once and none are free, it will wait X ms and create a new one; now you have 9 requests in the queue. It will again wait X ms and create another one; now you have 8 in the queue, and so on. This waiting for a thread pool thread to become free is what causes the increased delay in this console application (and most likely in your real program): we keep the thread pool threads busy with long Thread.Sleep calls, and the thread pool is saturated.
Some parameters of the heuristics used by the thread pool are available for you to control. The most influential one is the "minimum" number of threads in the pool. The thread pool is expected to always create a new thread without delay until the total number of threads in the pool reaches the configurable "minimum". After that, if you request a thread, it might either still create a new one or wait for an existing one to become free.
So the most straightforward way to remove this delay is to increase the minimum number of threads in the pool. For example, if you do this:
ThreadPool.GetMinThreads(out int wt, out int ct);
ThreadPool.SetMinThreads(100, ct); // increase min worker threads to 100
All tasks in the example above will complete at the expected time with no additional delay.
This is usually not the recommended way to solve the problem, though. It's better to avoid performing long-running, heavy operations on thread pool threads, because the thread pool is a global resource and doing so affects your whole application. For example, if we remove the Thread.Sleep(5000) in the example above, all tasks delay for the expected amount of time, because all that keeps a thread pool thread busy now is the Console.WriteLine statement, which completes in no time, making the thread available for other work.
So to sum up: identify the places where you perform heavy work on thread pool threads and avoid doing that (perform the heavy work on separate, non-thread-pool threads instead). Alternatively, you might consider increasing the minimum number of threads in the pool to a reasonable amount.
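As an illustration of the first option, here is a hedged rewrite of the console demo above (my own variant, not the original answer's code) that moves the blocking work onto a dedicated thread via TaskCreationOptions.LongRunning, so pool threads stay free to run Task.Delay continuations:

static async Task TestDelay(int expected) {
    var startTime = DateTime.Now;
    await Task.Delay(expected).ConfigureAwait(false);
    var actual = (int) (DateTime.Now - startTime).TotalMilliseconds;
    Console.WriteLine("Expected: {0}, Actual: {1}", expected, actual);
    // LongRunning asks the default scheduler for a dedicated thread,
    // so the 5-second block no longer occupies a pool thread.
    await Task.Factory.StartNew(() => Thread.Sleep(5000),
        CancellationToken.None,
        TaskCreationOptions.LongRunning,
        TaskScheduler.Default);
}

With this change the measured delays should stay close to the requested values even with all 90 tasks in flight, because nothing blocks inside the pool itself.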
Why is it that the following program will only run a limited number of blocked tasks? The limiting number seems to be the number of cores on the machine.
Initially when I wrote this I expected to see the following:
Job complete output of Jobs 1 - 24
A 2 second gap
Output of Jobs 25 - 48
However the output was:
Job complete output of Jobs 1 - 4
Then jobs completing at random every couple of hundred milliseconds.
When running on server with 32 cores, the program did run as I had expected.
class Program
{
    private static object _lock = new object();

    static void Main(string[] args)
    {
        int completeJobs = 0;
        var limiter = new MyThreadLimiter();
        for (int iii = 1; iii < 100000000; iii++)
        {
            var jobId = iii;
            limiter.Schedule()
                .ContinueWith(t =>
                {
                    lock (_lock)
                    {
                        completeJobs++;
                        Console.WriteLine("Job: " + completeJobs + " scheduled");
                    }
                });
        }
        Console.ReadLine();
    }
}

class MyThreadLimiter
{
    readonly SemaphoreSlim _semaphore = new SemaphoreSlim(24);

    public async Task Schedule()
    {
        await _semaphore.WaitAsync();
        Task.Run(() => Thread.Sleep(2000))
            .ContinueWith(t => _semaphore.Release());
    }
}
However replacing the Thread.Sleep with Task.Delay gives my expected results.
public async Task Schedule()
{
    await _semaphore.WaitAsync();
    Task.Delay(2000)
        .ContinueWith(t => _semaphore.Release());
}
And using a Thread gives my expected results
public async Task Schedule()
{
    await _semaphore.WaitAsync();
    var thread = new Thread(() =>
    {
        Thread.Sleep(2000);
        _semaphore.Release();
    });
    thread.Start();
}
How does Task.Run() work? Is it limited to the number of cores?
Task.Run schedules the work to run on the thread pool. The thread pool is given wide latitude to schedule the work as best it can in order to maximize throughput. It will create additional threads when it thinks they will be helpful, and remove threads from the pool when it doesn't think there will be enough work for them.
Creating more threads than your processor is able to run at the same time isn't going to be productive when you have CPU-bound work. Adding more threads will just result in dramatically more context switches, increasing overhead and reducing throughput.
Yes, for compute-bound operations Task.Run() internally uses the CLR's thread pool, which throttles the number of new threads to avoid CPU oversubscription. Initially it runs a number of threads equal to the number of CPU cores concurrently. Then it continually optimises the number of threads using a hill-climbing algorithm, based on factors like the number of requests the thread pool receives and overall machine resources, to either create more threads or fewer.
In fact, this is one of the main benefits of using pooled threads over raw threads (e.g. new Thread(() => {}).Start()), as the pool not only recycles threads but also optimises performance internally for you. As mentioned in the other answer, it's generally a bad idea to block pooled threads, because it "misleads" the thread pool's optimisation; similarly, using many pooled threads for very long-running computation can lead the thread pool to create more threads, increasing context-switch overhead, and later destroy the extra threads.
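To make that concrete, here is a sketch (my own, not from the question) of the Schedule method rewritten so the two-second wait never blocks a pool thread; note it also changes the returned Task to complete after the work finishes rather than after admission through the semaphore:

public async Task Schedule()
{
    await _semaphore.WaitAsync();
    try
    {
        // Non-blocking wait: the pool thread is handed back to the pool
        // for the duration of the delay instead of sleeping on it.
        await Task.Delay(2000);
    }
    finally
    {
        _semaphore.Release();
    }
}

Because no thread sleeps, the SemaphoreSlim (24 here), not the pool's injection rate, becomes the only concurrency limit.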
Task.Run() runs work on the CLR thread pool.
There is a concept called "oversubscription", which means there are more active threads than CPU cores and they must be time-sliced.
In the thread pool, when the number of threads that must be scheduled on CPU cores increases, context switching rises and, as a result, performance suffers.
The CLR, which manages the thread pool, avoids oversubscription by queuing and throttling thread startup, always trying to balance the workload.
Just for fun, I wrote this code to simulate a deadlock. Then, I sat and watched it run patiently until the total number of available worker threads that the thread pool had went down to zero. I was curious to see what would happen. Would it throw an exception?
using System;
using System.Diagnostics;
using System.Threading;

namespace Deadlock
{
    class Program
    {
        private static readonly object lockA = new object();
        private static readonly object lockB = new object();

        static void Main(string[] args)
        {
            int worker, io;
            ThreadPool.GetAvailableThreads(out worker, out io);
            Console.WriteLine($"Total number of thread pool threads: {worker}, {io}");
            Console.WriteLine($"Total threads in my process: {Process.GetCurrentProcess().Threads.Count}");
            Console.ReadKey();

            try
            {
                for (int i = 0; i < 1000000; i++)
                {
                    AutoResetEvent auto1 = new AutoResetEvent(false);
                    AutoResetEvent auto2 = new AutoResetEvent(false);
                    ThreadPool.QueueUserWorkItem(ThreadProc1, auto1);
                    ThreadPool.QueueUserWorkItem(ThreadProc2, auto2);
                    var allCompleted = WaitHandle.WaitAll(new[] { auto1, auto2 }, 20);
                    ThreadPool.GetAvailableThreads(out worker, out io);
                    var total = Process.GetCurrentProcess().Threads.Count;
                    if (allCompleted)
                    {
                        Console.WriteLine($"All threads done: (Iteration #{i + 1}). Total: {total}, Available: {worker}, {io}\n");
                    }
                    else
                    {
                        Console.WriteLine($"Timed out: (Iteration #{i + 1}). Total: {total}, Available: {worker}, {io}\n");
                    }
                }
                Console.WriteLine("Press any key to exit...");
            }
            catch (Exception ex)
            {
                Console.WriteLine("An exception occurred.");
                Console.WriteLine($"{ex.GetType().Name}: {ex.Message}");
                Console.WriteLine("The program will now exit. Press any key to terminate the program...");
            }
            Console.ReadKey();
        }

        static void ThreadProc1(object state)
        {
            lock (lockA)
            {
                Console.WriteLine("ThreadProc1 entered lockA. Going to acquire lockB");
                lock (lockB)
                {
                    Console.WriteLine("ThreadProc1 acquired both locks: lockA and lockB.");
                    // Do stuff
                    Console.WriteLine("ThreadProc1 running...");
                }
            }
            if (state != null)
            {
                ((AutoResetEvent)state).Set();
            }
        }

        static void ThreadProc2(object state)
        {
            lock (lockB)
            {
                Console.WriteLine("ThreadProc2 entered lockB. Going to acquire lockA.");
                lock (lockA)
                {
                    Console.WriteLine("ThreadProc2 acquired both locks: lockA and lockB.");
                    // Do stuff
                    Console.WriteLine("ThreadProc2 running...");
                }
            }
            if (state != null)
            {
                ((AutoResetEvent)state).Set();
            }
        }
    }
}
Meanwhile, I also kept the Windows Task Manager's Performance tab running and watched the total number of operating system threads go up as my program ate up more threads.
Here is what I observed:
The OS didn't create a new thread every time the .NET thread pool was handed a work item. In fact, for every four or five iterations of my for loop, the OS thread count would go up by only one or two. This was interesting, but it isn't my question; it confirms what has already been established.
More interestingly, I observed that the number of available threads did not decrease by 2 on every iteration of my for loop. I expected it to go down by 2 per iteration, because none of my deadlocked threads should ever return; they are deadlocked, waiting on each other.
I also observed that when the total number of available worker threads in the thread pool went down to zero, the program still kept running more iterations of my for loop. This made me curious as to where those new threads were coming from, if the thread pool had already run out of threads and none of them had returned.
So, to clarify, my two questions, which are perhaps related in that a single answer may explain both:
When a single iteration of my for-loop ran, for some of those iterations, no thread pool threads were created. Why? And where did the thread pool get the threads to run these iterations on?
Where did the thread pool get the threads from when it ran out of its total number of available worker threads and still kept running my for loop?
ThreadPool.GetAvailableThreads(out worker, out io);
That's not a great statistic for showing how the thread pool works. The primary problem is that it is a ridiculously large number on all recent .NET versions. On my dual-core laptop it starts out at 1020 in 32-bit mode and 32767 in 64-bit mode, far, far larger than such an anemic CPU could reasonably handle. The number has inflated significantly over the years; it started out at 50x the number of cores back in .NET 2.0. It is now calculated dynamically based on machine capabilities, the job of the CLR host. It uses a glass that's well over half-full.
The primary job of the thread pool manager is to keep threading efficient. The sweet spot is to keep the number of executing threads limited to the number of processor cores; running more reduces perf, since the OS then has to context-switch between threads and that adds overhead.
That ideal cannot always be met, however. Practical tp threads that programmers write are not always well-behaved: in practice they take too long and/or spend too much time blocking on I/O or a lock instead of executing code. Your example is of course a rather extreme case of blocking.
The thread pool manager is not otherwise aware of exactly why a tp thread takes too long to execute. All it can see is that it takes too long to complete. Getting deep insight into exactly why a thread takes too long is not practical; it takes a debugger and the kind of heavily trained, massively parallel neural network that programmers have between their ears.
Twice a second, the thread pool manager re-evaluates the workload and allows an extra tp thread to start when none of the active ones have completed, even though that is beyond the optimum. The theory is that this is likely to get more work done, since presumably the active threads are blocking too much and not using the available cores efficiently. It is also important for resolving some deadlock scenarios, albeit that you never want to need that. Such a thread is just a regular thread like any other; the underlying OS call is CreateThread().
So that's what you see: the number of available threads drops by one twice a second. This is time-based, independent of your code. There is actually a feedback loop implemented in the manager that tries to dynamically calculate the optimum number of extra threads; you never got there, with all your threads blocking.
This does not go on forever; you ultimately reach the high upper limit set by the default SetMaxThreads(). No exception; assuming you did not hit an OutOfMemoryException first, as you commonly would in real life, it just stops adding more threads. You are still adding execution requests to the thread pool (this covers your third observation); they just never actually get started. Eventually you'll run out of memory when the number of requests gets too large, but you'll have to wait a long time; it takes a while to fill up a gigabyte.
The cause is QueueUserWorkItem: "Queues a method for execution. The method executes when a thread pool thread becomes available." https://msdn.microsoft.com/en-us/library/kbf0f1ct(v=vs.110).aspx
In my understanding, the thread pool just slowly increases the number of threads to meet your demand; this is what you see in Task Manager. I think you could verify this by giving your threads some actual work to do.
Edit: What I mean is, you just queue the work items; the first threads start immediately, and then slowly (every 500 ms, https://blogs.msdn.microsoft.com/pedram/2007/08/05/dedicated-thread-or-a-threadpool-thread/) more and more threads are added until the limits are reached. Afterwards you can still queue new items.
The thread pool (almost) never runs out of threads. There's an injection heuristic that adds new threads in an (almost) unbounded way when it thinks this is helping throughput. This also is a guard against deadlocks based on too few threads being available.
This can be a big problem because memory usage is (almost) unbounded.
"Almost" because there is a maximum thread count but that tends to be extremely high in practice (thousands of threads).
When a single iteration of my for-loop ran, for some of those iterations, no thread pool threads were created.
The reason is not apparent to me from the data shown. You should probably measure Process.GetCurrentProcess().Threads.Count after each iteration.
Maybe the deadlock was avoided in some cases? It's not a deterministic deadlock.
On the current CLR running threads appear to be 1:1 with OS threads.
Maybe you should run a simpler benchmark?
for (int i = 0; i < 10000000; i++)
{
    Task.Run(() => Thread.Sleep(Timeout.Infinite));
    int workerThreads;
    int cpThreads;
    ThreadPool.GetAvailableThreads(out workerThreads, out cpThreads);
    Console.WriteLine($"Queued: {i}, threads: {Process.GetCurrentProcess().Threads.Count}, workerThreads: {workerThreads}, completionPortThreads: {cpThreads}");
    Thread.Sleep(100);
}
I have the following code:
static void Main(string[] args)
{
    Console.Write("Press ENTER to start...");
    Console.ReadLine();
    Console.WriteLine("Scheduling work...");
    for (int i = 0; i < 1000; i++)
    {
        //ThreadPool.QueueUserWorkItem(_ =>
        new Thread(_ =>
        {
            Thread.Sleep(1000);
        }).Start();
    }
    Console.ReadLine();
}
According to the textbook C# 4.0 Unleashed by Bart De Smet (page 1466), using new Thread should mean using many more threads than if you use ThreadPool.QueueUserWorkItem which is commented out in my code.
However, I've tried both and seen in Resource Monitor that with new Thread there are about 11 threads allocated, whereas when I use ThreadPool.QueueUserWorkItem there are about 50.
Why am I getting the opposite outcome of what is mentioned in this book?
Also, why do you get many more threads allocated when using ThreadPool.QueueUserWorkItem if you increase the sleep time?
new Thread() just creates a Thread object; you forgot to call Start() (which creates the actual thread that you see in resource monitor).
Also, if you are looking at the number of threads after the sleep has completed, you won't see any of the new Threads as they have already exited.
On the other hand, the ThreadPool keeps threads around for some time so it can reuse them, so in that case you can still see the threads even after the sleep has completed.
With new Thread(), you might be seeing the number staying around 160 because it took one second to start that many threads, so by the time the 161st thread is started, the first thread is already finished. You should see a higher number of threads if you increase the sleep time.
As for the ThreadPool, it is designed to use as few threads as possible while also keeping the CPU busy. Ideally, the number of busy threads is equal to the number of CPU cores. However, if the pool detects that its threads are currently not using the CPU (sleeping, or waiting for another thread), it starts up more threads (at a rate of 1/second, up to some maximum) to keep the CPU busy.
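You can watch that start-up rate directly. Here is a small sketch (my own, not from the answer above; exact timings vary by runtime version) that queues 20 permanently blocking work items and prints when each one actually starts:

using System;
using System.Diagnostics;
using System.Threading;

class Program
{
    static void Main()
    {
        var clock = Stopwatch.StartNew();
        for (int i = 0; i < 20; i++)
        {
            int id = i;
            ThreadPool.QueueUserWorkItem(_ =>
            {
                Console.WriteLine("Item {0} started at {1} ms", id, clock.ElapsedMilliseconds);
                Thread.Sleep(Timeout.Infinite); // hold the thread forever
            });
        }
        Console.ReadLine();
    }
}

The first batch (up to the pool's minimum thread count) starts at once; after that, each item typically waits for the pool to inject another thread, which is where the "rate of 1/second" shows up.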
Our scenario is a network scanner.
It connects to a set of hosts and scans them in parallel for a while using low priority background threads.
I want to be able to schedule lots of work but only have a given number of hosts, say ten, scanned in parallel at any time. Even if I create my own threads, the many callbacks and other asynchronous goodness use the ThreadPool and I end up running out of resources. I should look at MonoTorrent...
If I use the ThreadPool, can I limit my application to some number of threads that will leave enough for the rest of the application to run smoothly?
Is there a threadpool that I can initialize to n long lived threads?
[Edit]
No one seems to have noticed that I made some comments on some responses so I will add a couple things here.
Threads should be cancellable both gracefully and forcefully.
Threads should have low priority leaving the GUI responsive.
Threads are long running but in Order(minutes) and not Order(days).
Work for a given target host is basically:
For each test
Probe target (work is done mostly on the target end of an SSH connection)
Compare probe result to expected result (work is done on engine machine)
Prepare results for host
Can someone explain why using SmartThreadPool is marked with a negative usefulness?
In .NET 4 you have the integrated Task Parallel Library. When you create a new Task (the new thread abstraction) you can specify that it is long-running. We have had good experiences with that (long being days rather than minutes or hours).
You can use it in .NET 2 as well, but there it's actually an extension; check here.
In VS2010, debugging parallel applications based on Tasks (not threads) has been radically improved. It's advisable to use Tasks whenever possible rather than raw threads, since they let you handle parallelism in a more object-oriented way.
UPDATE
Tasks that are NOT specified as long-running are queued onto the thread pool (or any other scheduler, for that matter).
But if a task is specified as long-running, it just gets a standalone thread; no thread pool is involved.
The CLR ThreadPool isn't appropriate for executing long-running tasks: it's for performing short tasks where the cost of creating a thread would be nearly as high as executing the method itself (or at least a significant percentage of the time it takes to execute the method). As you've seen, .NET itself consumes thread pool threads, so you can't reserve a block of them for yourself lest you risk starving the runtime.
Scheduling, throttling, and cancelling work is a different matter. There's no other built-in .NET worker-queue thread pool, so you'll have to roll your own (managing the threads or BackgroundWorkers yourself) or find a preexisting one (Ami Bar's SmartThreadPool looks promising, though I haven't used it myself).
In your particular case, the best option would be neither threads, nor the thread pool, nor BackgroundWorker, but the asynchronous programming model (BeginXXX, EndXXX) provided by the framework.
The advantage of the asynchronous model is that the TCP/IP stack uses callbacks whenever there is data to read, and the callback is automatically run on a thread from the thread pool.
Using the asynchronous model, you can control the number of requests initiated per time interval, and, if you want, you can initiate all the requests from a lower-priority thread while processing them on a normal-priority thread, which means packets spend as little time as possible in the networking stack's internal TCP queue.
Asynchronous Client Socket Example - MSDN
P.S. For multiple concurrent, long-running jobs that don't do a lot of computation but mostly wait on I/O (network, disk, etc.), the better option is almost always a callback mechanism rather than threads.
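On newer frameworks you can get the same throttling with async/await instead of the Begin/End pattern. Here is a sketch under that assumption (ScanHostAsync is a hypothetical stand-in for the real probe logic), limiting the scan to ten hosts in flight at a time without dedicating any threads to waiting:

using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class Scanner
{
    // At most ten scans in flight; waiting scans consume no thread.
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(10);

    public async Task ScanAllAsync(IEnumerable<string> hosts)
    {
        var scans = new List<Task>();
        foreach (var host in hosts)
        {
            await _gate.WaitAsync();
            scans.Add(ScanOneAsync(host));
        }
        await Task.WhenAll(scans);
    }

    private async Task ScanOneAsync(string host)
    {
        try
        {
            await ScanHostAsync(host); // placeholder for the real probe
        }
        finally
        {
            _gate.Release();
        }
    }

    private Task ScanHostAsync(string host) =>
        Task.Delay(1000); // stand-in for the actual SSH probe logic
}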
I'd write my own thread manager. In the following simple example, a Queue holds waiting threads and a Dictionary holds active threads, keyed by ManagedThreadId. When a thread finishes, it removes itself from the active dictionary and launches another thread via a callback.
You can change the maximum running-thread limit from your UI, and you can pass extra info to the ThreadDone callback for monitoring performance, etc. If a thread fails (due to, say, a network timeout), you can reinsert it into the queue. Add extra control methods to Supervisor for pausing, stopping, etc.
using System;
using System.Collections.Generic;
using System.Threading;

namespace ConsoleApplication1
{
    public delegate void CallbackDelegate(int idArg);

    class Program
    {
        static void Main(string[] args)
        {
            new Supervisor().Run();
            Console.WriteLine("Done");
            Console.ReadKey();
        }
    }

    class Supervisor
    {
        Queue<Thread> waitingThreads = new Queue<Thread>();
        Dictionary<int, Thread> activeThreads = new Dictionary<int, Thread>();
        int maxRunningThreads = 10;
        object locker = new object();
        volatile bool done;

        public void Run()
        {
            // queue up some threads
            for (int i = 0; i < 50; i++)
            {
                Thread newThread = new Thread(new Worker(ThreadDone).DoWork);
                newThread.IsBackground = true;
                waitingThreads.Enqueue(newThread);
            }
            LaunchWaitingThreads();
            while (!done) Thread.Sleep(200);
        }

        // keep starting waiting threads until we max out
        void LaunchWaitingThreads()
        {
            lock (locker)
            {
                while ((activeThreads.Count < maxRunningThreads) && (waitingThreads.Count > 0))
                {
                    Thread nextThread = waitingThreads.Dequeue();
                    activeThreads.Add(nextThread.ManagedThreadId, nextThread);
                    nextThread.Start();
                    Console.WriteLine("Thread " + nextThread.ManagedThreadId.ToString() + " launched");
                }
                done = (activeThreads.Count == 0) && (waitingThreads.Count == 0);
            }
        }

        // this is called by each thread when it's done
        void ThreadDone(int threadIdArg)
        {
            lock (locker)
            {
                // remove thread from active pool
                activeThreads.Remove(threadIdArg);
            }
            Console.WriteLine("Thread " + threadIdArg.ToString() + " finished");
            LaunchWaitingThreads(); // this could instead be put in the wait loop at the end of Run()
        }
    }

    class Worker
    {
        CallbackDelegate callback;

        public Worker(CallbackDelegate callbackArg)
        {
            callback = callbackArg;
        }

        public void DoWork()
        {
            // simulate work of random duration, then report back
            Thread.Sleep(new Random().Next(100, 1000));
            callback(Thread.CurrentThread.ManagedThreadId);
        }
    }
}
Use the built-in thread pool. It has good capabilities.
Alternatively, you can look at the Smart Thread Pool implementation here, or at Extended Thread Pool, for a limit on the maximum number of working threads.