Change thread priority from lowest to highest in .net - c#

I am trying to speed things up by having one thread write to a linked list and another thread process the linked list.
For some reason, if I run the method that writes to the linked list as a task and the method that reads from the linked list on a low-priority thread, the program as a whole finishes much faster. In other words, I get the fastest results when doing:
Task.Factory.StartNew( AddItems );
new Thread( startProcessingItems ) { Priority = ThreadPriority.Lowest }.Start();
while (completed == false)
    Thread.Sleep(0);
Maybe it's because the first task does so much more work than the other thread that everything finishes faster overall when I give the second method a low priority.
Anyway, my question is: startProcessingItems runs with ThreadPriority.Lowest. How can I change its priority to Highest? If I create a new Task in that method, will it run with low priority? Basically, startProcessingItems ends up with a list, and once it has that list I would like to continue executing with the highest priority.

This is not a good approach. First off, LinkedList<T> is not thread safe, so writing to it and reading from it in two threads will cause race conditions.
A better approach would be to use BlockingCollection<T>. This allows you to add items (producer thread) and read items (consumer thread) without worrying about thread safety, as it's completely thread safe.
The reading thread can just call blockingCollection.GetConsumingEnumerable() in a foreach to get the elements, and the write thread just adds them. The reading thread will automatically block, so there's no need to mess with priorities.
When the writing thread is "done", you just call CompleteAdding, which will, in turn, allow the reading thread to finish automatically.
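The producer/consumer arrangement described above could be sketched roughly as follows. This is a minimal illustration, not the asker's code: the item type (int), the bounded capacity of 100, and the "multiply by two" processing step are all assumptions made for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Queue shared by producer and consumer; the capacity of 100 is an arbitrary assumption.
var queue = new BlockingCollection<int>(boundedCapacity: 100);
var processed = new List<int>(); // only touched by the consumer until it finishes

// Producer: adds items, then signals that no more are coming.
var producer = Task.Run(() =>
{
    for (int i = 0; i < 1000; i++)
    {
        queue.Add(i); // blocks if the queue is full
    }
    queue.CompleteAdding();
});

// Consumer: blocks while the queue is empty; the loop ends after CompleteAdding.
var consumer = Task.Run(() =>
{
    foreach (int item in queue.GetConsumingEnumerable())
    {
        processed.Add(item * 2); // stand-in for real processing
    }
});

// Wait for both tasks without spinning.
Task.WaitAll(producer, consumer);
Console.WriteLine($"Processed {processed.Count} items");
```

Neither side touches thread priorities here; the consumer simply sleeps inside the collection whenever there is nothing to take.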

You can improve the performance of your program by changing the inherent design, rather than by changing thread/process priorities.
A big part of your problem is that you're doing a busywait:
while (completed == false)
    Thread.Sleep(0);
This consumes lots of CPU cycles for no productive work, and it's why lowering its priority makes it execute quicker. If you aren't busywaiting then that won't be an issue any more.
As Reed has suggested, BlockingCollection is tailor made for this situation. You can have a producer thread adding items using Add, and a consumer thread using Take knowing that the method will simply block if there are no more items to be removed.
You can also store the Task you create and use Task.Result or Task.Wait to have the main thread wait for the other task to finish (without wasting CPU cycles). (If you are using threads directly you can use Join.)
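As a rough sketch of waiting without a busy loop: the work done inside the task and the thread below is invented for illustration, and only Task.Result and Thread.Join are the calls under discussion.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Keep a reference to the task instead of polling a "completed" flag.
Task<int> addItems = Task.Run(() =>
{
    int count = 0;
    for (int i = 0; i < 5; i++)
    {
        Thread.Sleep(10); // simulated work
        count++;
    }
    return count;
});

// Result blocks the calling thread until the task has finished.
int added = addItems.Result;

// The equivalent for a manually created thread is Join.
int processedCount = 0;
var worker = new Thread(() => { processedCount = added * 2; });
worker.Start();
worker.Join();

Console.WriteLine($"added={added}, processed={processedCount}");
```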

In addition to what Reed and Servy have said:
Thread priority is relative to the process's priority.
The Windows scheduler takes all other threads into account when it schedules time for threads. Threads with higher priority take time away from other threads, which can artificially slow down the rest of the system. The system usually has a good reason for not giving your thread more CPU time. Priority only has an effect when something else takes the CPU away from the thread, and that happens for a reason. If nothing takes the CPU away from the thread, it won't magically run faster with a priority of Highest.
Setting thread priority to Highest is almost always the wrong thing to do.
Likely the overhead of synchronization between the two threads will kill any performance gain you think you might get.
Also, Thread.Sleep(0) only relinquishes the time slice to threads of equal priority that are ready to run, which could lead to thread starvation. http://msdn.microsoft.com/en-us/library/d00bd51t(v=vs.80).aspx

Related

How does the Parallel class dynamically adjust the level of parallelism?

What feedback does TPL use to dynamically adjust the number of worker threads?
My previous understanding was that it measures the rate of task completion to see if adding or removing threads is worth it. But then, why does this code keep increasing the number of threads, even though there is a bottleneck introduced by a semaphore?
Surely, there can be no more than 20 task completions per second, and more than 2 threads will not improve that.
var activeThreads = 0;
var semaphore = new SemaphoreSlim(2);
var res = Parallel.For(0, 1_000_000, i =>
{
    Interlocked.Increment(ref activeThreads);
    semaphore.Wait();
    try
    {
        Thread.Sleep(100);
        Console.WriteLine("Threads: " + activeThreads);
    }
    finally
    {
        Interlocked.Decrement(ref activeThreads);
        semaphore.Release();
    }
});
I believe the ParallelOptions is what you are looking for to specify the amount of parallelism.
Parallel.For(0, 1000, new ParallelOptions
{
    MaxDegreeOfParallelism = 2
}, i => { Console.WriteLine(i); });
Personally, I think the TPL will work in a lot of cases, but it isn't really smart about how it distributes execution. Whenever you have bottlenecks in the execution of your application, have a look at the pipeline pattern, for example. Here is a link that, in my opinion, describes different approaches to parallel execution very well: https://www.dotnetcurry.com/patterns-practices/1407/producer-consumer-pattern-dotnet-csharp
TL;DR: The thing that you are doing in your code that the TPL uses to justify creating a new thread is blocking. (Synchronizing or sleeping, or performing I/O would all count as blocking.)
A longer explanation...
When your task runs, it takes its thread hostage for 100 ms (because you Sleep(100)). While you are sleeping, that thread cannot be used to run other tasks because it would risk not being in a runnable state when the sleep time period expires. Typically we sleep rather than perform an asynchronous action because we need to keep our call stack intact. We are therefore relying on the stack to maintain our state. And the stack is a one-of-a-kind resource for the thread. (There's not actually a lot more to a thread than its stack.)
So the TPL (Thread pool, specifically) tries to keep occupancy high but the thread count low. One way it achieves this is by making sure that there are approximately just as many runnable threads in the system as there are virtual processors. Each time it needs to increase the thread count, it must create a relatively expensive stack for the thread, so it's best not to have so many. And a thread that is not runnable cannot be scheduled, so when the CPU becomes free, you need something to schedule to make use of the processing resources available. If the thread is sleeping, it cannot be scheduled to run. So instead, a thread will be added to the thread pool and the next task will be scheduled on it.
When you are writing parallel code like this (as in your parallel for loop) that can be partitioned and managed by the TPL you should be careful about putting your thread into a non-runnable state. Performing synchronous I/O, waiting for a synchronization object (e.g. semaphore, event or mutex etc.), or sleeping will put the thread into a state where nothing else can be done on the thread until the I/O completes, the sleep interval expires, or the synchronization object becomes signalled. The thread is no good to the TPL during this period.
In your case, you do several of these things: you wait on a semaphore, you sleep, and you perform I/O by writing to the console. The first thing is waiting on that semaphore. If it's not signalled, then you immediately have the situation where the thread is not runnable and the next task of your million-or-so tasks that need to be run must be scheduled on a different thread. If there isn't one, then the TPL can justify creating a new thread to get more tasks started. After all, what if it's thread #987,321 that will actually wind up setting the semaphore to unblock task #1? The TPL doesn't know what your code does, so it can delay creating threads for a while in the spirit of efficiency, but for correctness, ultimately it will have to create more threads to start chipping away at the task list. There is a complex, implementation-specific heuristic that it applies to monitor, predict and otherwise get this efficiency guess right.
Now your specific question actually asked what feedback does it use to adjust the number of threads. Like I said, the actual implementation is complex and you should probably think of it as a black-box. But in a nutshell, if there are no runnable threads, it may create another thread to keep chipping away at the task list (or may wait a while before doing so, hoping that things will free up), and if there are too many idle threads, it will terminate the idle threads to reclaim their resources.
And to reiterate, as I said at the top, and to hopefully answer your question this time, the one thing you do that allows the TPL to justify creating a new thread is to block. ...even on that first semaphore.
Ran into an article analysing the thread injection algorithm in 2017. As of 2019-08-01, the hillclimbing.cpp file on GitHub hasn't really changed so the article should still be up to date.
Relevant details:
The .NET thread pool has two main mechanisms for injecting threads: a
starvation-avoidance mechanism that adds worker threads if it sees no
progress being made on queued items and a hill-climbing heuristic that
tries to maximize throughput while using as few threads as possible.
...
It calculates the desired number of threads based on the ‘current
throughput’, which is the ‘# of tasks completed’ (numCompletions)
during the current time-period (sampleDuration in seconds).
...
It also takes the current thread count (currentThreadCount) into
consideration.
...
The real .NET Thread Pool only increases the thread-count by one
thread every 500 milliseconds. It keeps doing this until the ‘# of
threads’ has reached the amount that the hill-climbing algorithm
suggests.
...
The [hill-climbing] algorithm only returns values that respect the limits
specified by ThreadPool.SetMinThreads(..) and
ThreadPool.SetMaxThreads(..)
...
In addition, [the hill-climbing algorithm] will only recommend
increasing the thread count if the CPU Utilization is below 95%
So it turns out the thread pool does have a feedback mechanism based on task completion rate. It also doesn't explicitly check whether its threads are blocked or running, but it does keep an eye on overall CPU utilization to detect blockages. All this also means it should be roughly aware of what the other threads and processes are doing.
On the other hand, it will always eagerly spawn at least as many threads as told by ThreadPool.SetMinThreads(), which defaults to the number of logical processors on the machine.
In conclusion, the test code in question was doing two things which make it keep piling up more threads:
there are lots of tasks queued up and sitting in the queue for ages, which indicates starvation
CPU utilization is negligible, which means that a new thread should be able to use it
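To see the limits that bound the thread-injection behaviour described above, one can query the pool directly. The sketch below only reads and nudges the configured limits; it does not reproduce the hill-climbing heuristic, and raising the minimum by 4 is an arbitrary choice for illustration.

```csharp
using System;
using System.Threading;

// Current lower and upper bounds for worker and I/O completion threads.
ThreadPool.GetMinThreads(out int minWorkers, out int minIo);
ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIo);

Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");
Console.WriteLine($"Worker threads: min={minWorkers}, max={maxWorkers}");
Console.WriteLine($"I/O completion threads: min={minIo}, max={maxIo}");

// Raising the minimum lets the pool create threads up to that count on demand,
// without waiting for the slower injection mechanisms.
// SetMinThreads returns false if the requested values are rejected.
bool changed = ThreadPool.SetMinThreads(minWorkers + 4, minIo);
Console.WriteLine($"Minimum raised: {changed}");
```

Raising the minimum is usually a workaround for bursts of blocking work; limiting parallelism explicitly or avoiding blocking calls tends to be the cleaner fix.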

What is the advantage of creating a thread outside threadpool?

Okay, So I wanted to know what happens when I use TaskCreationOptions.LongRunning. By this answer, I came to know that for long running tasks, I should use this options because it creates a thread outside of threadpool.
Cool. But what advantage would I get when I create a thread outside threadpool? And when to do it and avoid it?
what advantage would I get when I create a thread outside threadpool?
The threadpool, as its name states, is a pool of threads which are allocated once and re-used throughout, in order to save the time and resources necessary to allocate a thread. The pool itself re-sizes on demand. If you queue more work than actual workers exist in the pool, it will allocate more threads in 500ms intervals, one at a time (this exists to avoid allocation of multiple threads simultaneously where existing threads may already finish executing and can serve requests). If many long running operations are performed on the thread-pool, it causes "thread starvation", meaning delegates will start getting queued and run only once a thread frees up. That's why you'd want to avoid a large number of threads doing lengthy work with thread-pool threads.
The Managed Thread-Pool docs also have a section on this question:
There are several scenarios in which it is appropriate to create and
manage your own threads instead of using thread pool threads:
You require a foreground thread.
You require a thread to have a particular priority.
You have tasks that cause the thread to block for long periods of time. The thread pool has a maximum number of threads, so a large
number of blocked thread pool threads might prevent tasks from
starting.
You need to place threads into a single-threaded apartment. All ThreadPool threads are in the multithreaded apartment.
You need to have a stable identity associated with the thread, or to dedicate a thread to a task.
For more, see:
Thread vs ThreadPool
When should I not use the ThreadPool in .Net?
Dedicated thread or thread-pool thread?
"Long running" can be quantified pretty well, a thread that takes more than half a second is running long. That's a mountain of processor instructions on a modern machine, you'd have to burn a fat five billion of them per second. Pretty hard to do in a constructive way unless you are calculating the value of Pi to thousands of decimals in the fraction.
Practical threads can only take that long when they are not burning core but are waiting a lot. Invariably on an I/O completion, like reading data from a disk, a network, a dbase server. And often the reason you'd start considering using a thread in the first place.
The threadpool has a "manager". It determines when a threadpool thread is allowed to start. It doesn't happen immediately when you start it in your program. The manager tries to limit the number of running threads to the number of CPU cores you have. It is much more efficient that way, context switching between too many active threads is expensive. And a good throttle, preventing your program from consuming too many resources in a burst.
But the threadpool manager has the very common problem with managers, it doesn't know enough about what is going on. Just like my manager doesn't know that I'm goofing off at Stackoverflow.com, the tp manager doesn't know that a thread is waiting for something and not actually performing useful work. Without that knowledge it cannot make good decisions. A thread that does a lot of waiting should be ignored and another one should be allowed to run in its place. Actually doing real work.
Just like you tell your manager that you go on vacation, so he can expect no work to get done, you tell the threadpool manager the same thing with LongRunning.
Do note that it isn't quite as bad as it perhaps sounds in this answer. In particular, .NET 4.0 hired a new manager that's a lot smarter at figuring out the optimum number of running threads. It does so with a feedback loop, collecting data to discover if active threads actually get work done, and adjusts the optimum accordingly. The only problem with this approach is the common one when you close a feedback loop: you have to make it slow so the loop cannot become unstable. In other words, it isn't particularly quick at driving up the number of active threads.
If you know ahead of time that the thread is pretty abysmal, running for many seconds with no real cpu load then always pick LongRunning. Otherwise it is a tuning job, observing the program when it is done and tinkering with it to make it more optimal.
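A small sketch of how the LongRunning hint is passed, and one way to observe its effect. Whether a LongRunning task gets a dedicated thread is an implementation detail of the default scheduler rather than a documented guarantee, so the second value printed below should be treated as an observation, not a contract.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// An ordinary task: scheduled on a thread-pool thread by the default scheduler.
bool pooled = Task.Run(() => Thread.CurrentThread.IsThreadPoolThread).Result;

// Same work with the LongRunning hint, telling the scheduler that this task
// will occupy its thread for a long time.
bool longRunningPooled = Task.Factory.StartNew(
    () => Thread.CurrentThread.IsThreadPoolThread,
    CancellationToken.None,
    TaskCreationOptions.LongRunning,
    TaskScheduler.Default).Result;

Console.WriteLine($"Task.Run on pool thread: {pooled}");
Console.WriteLine($"LongRunning on pool thread: {longRunningPooled}");
```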

Change Thread.Priority to make program more responsive

I have written a .net winforms application that does some heavy processing and slows my computer down pretty much. I have read something about
Thread.CurrentThread.Priority
but I don't really understand whether I should give the main thread more priority or lower its priority to remove the "lagging" and the slowing of my computer.
Thank you.
It depends entirely what your application and any additional threads are doing. You shouldn't really boost your UI thread priority, however you could lower any background thread priorities.
To keep the UI responsive, don't do any heavy processing on that thread - do the work on a background thread.
It's a bit vague perhaps, but then so is your question. Happy to go into more detail if you can too. Hope that helps!
Yes, it will solve your problem. Set it to ThreadPriority.BelowNormal (or Lowest) and any thread that is started by other processes on your machine will get scheduled ahead of your worker thread. It notably keeps any program you use interactively more responsive. The consequence is that your worker thread can get starved for cpu time when another process is burning cpu. It will still run occasionally, just not very often.
In general, avoid starting more threads than you have cpu cores (see Environment.ProcessorCount). The threadpool scheduler already does this automatically, but doesn't pay attention to other processes.
If you're experiencing general "slowing" and "lag" then boosting the priority of any individual thread is only going to make things worse.
It depends exactly how your app is structured, but if you have some heavy processing going on and it's using enough processor time to have a negative impact on the rest of the system then you have two main options:
Reduce the priority of the processing threads so that activity in the rest of the system takes priority.
Introduce artificial breaks in the processing thread. Typically this is done by regularly sending the thread to sleep or yielding to other threads or an event loop.
You should only be increasing the priority on threads that require a higher priority within the system, not just because they are running slowly. Priority means that your thread should be considered more important than other threads. In this case, I don't think that this is the case.
If you lower the priority, then you will probably find it appears to run slower, as other tasks may take the processing time.
What you need to do is reassess your processing, possibly add a thread (or more threads), or consider how the processing can be improved in other ways. Priority is not the answer.
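For reference, moving heavy work off the UI thread onto a lower-priority background thread might look roughly like this. The summation loop is a made-up stand-in for the real processing, and a WinForms application would not call Join on the UI thread; it appears here only so the console sketch can print its result.

```csharp
using System;
using System.Threading;

long sum = 0;

// Worker thread for the heavy processing, scheduled below normal priority
// so that interactive threads are preferred by the scheduler.
var worker = new Thread(() =>
{
    for (int i = 1; i <= 1_000_000; i++)
    {
        sum += i; // stand-in for real work
    }
})
{
    IsBackground = true,                  // does not keep the process alive
    Priority = ThreadPriority.BelowNormal // lower, never higher, for bulk work
};

worker.Start();
worker.Join(); // console demo only; a UI thread should not block like this
Console.WriteLine(sum);
```

How strongly the priority setting affects scheduling depends on the operating system, so it is best treated as a hint.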

How to "Free" a thread

I have 20 threads running at a time in my program (create 20, wait for them to finish, start another 20), and after a while my program slows way down. Do I need to free the tasks or do anything special? If so, how? If not, is there a common reason why a program like this would slow down?
You might want to consider using the ThreadPool, either directly, or via the Task Parallel Library (my preferred option). This is likely a better, simpler, and cleaner design than spawning your own threads and blocking on them repeatedly.
That being said, if your program is getting progressively slower, this is something where a profiler can help dramatically. Without seeing code, it's very difficult to diagnose. For example, depending on the work that you're doing, you may be causing the GC to become less efficient over time, which could cause the % of time spent in GC to climb as the program continues its execution. Profiling should give you a good indication of what is taking time as your program executes.
Reed's answer is probably the best way to deal with your issue; however, if you do want to manage the threads yourself, and not use the ThreadPool or TPL, I'd have to ask why you would let 20 threads die and create 20 more. Creating threads is an expensive process, which is why the thread pool exists. If you continually have the same number of parallel tasks, or a maximum number, they should be created once and reused. You can use locking constructs such as semaphore and mutex and have the threads wait when they are done, and just give them new data to work with and release them to proceed again. Waiting on a lock is a very inexpensive operation -- orders of magnitude cheaper than recreating a thread.
So for example, a thread might look like this (pseudocode):
while (program_not_ending)
{
    wait_for_new_data_release;          // Wait on thread's personal mutex
    process_new_data;
    resignal_my_mutex;                  // Cause the beginning of loop to wait again
    release_semaphore_saying_I_am_done; // Increment parent semaphore count
}
The parent would then wait for its semaphore to fill up that 20 threads completed, reset the data buckets, and clear all of the thread mutexes.
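A runnable variation on the same idea, creating the 20 threads once and feeding them work, is sketched below. Note that it swaps the per-thread mutex and parent semaphore of the pseudocode for a shared BlockingCollection, which handles the waiting and signalling internally; the work items (the numbers 1 to 100) and the summing step are invented for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

const int WorkerCount = 20;
var work = new BlockingCollection<int>();
long total = 0;

// Create the worker threads once; each keeps pulling items until the
// collection is marked complete, instead of dying after a single job.
var workers = new Thread[WorkerCount];
for (int w = 0; w < WorkerCount; w++)
{
    workers[w] = new Thread(() =>
    {
        foreach (int item in work.GetConsumingEnumerable())
        {
            Interlocked.Add(ref total, item); // stand-in for processing
        }
    });
    workers[w].Start();
}

// The parent hands out work, then signals that nothing more is coming.
for (int i = 1; i <= 100; i++)
{
    work.Add(i);
}
work.CompleteAdding();

foreach (var t in workers)
{
    t.Join();
}
Console.WriteLine(total);
```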

Threading a large amount of threads

So I have a loop that runs about 20000 times. Each time it loops, I create a new thread. The thread basically calls one method. The method is rather slow; it takes four seconds to complete. It goes out and scrapes some page (which we have permission to scrape). I add a one second delay in the loop, which should make sure only 4 pages are being scraped at once. My question: what happens to that thread once the method is completed?
Thread.Sleep(1000);
Thread t = new Thread(() => scraping(asin.Trim(), sku.Trim()));
t.Start();
My question what happens to that thread once the method is completed?
It will get destroyed as soon as the method completes.
That being said, this is not an ideal approach. You should use the ThreadPool to avoid creating many threads.
Instead of using new Thread, consider using ThreadPool.QueueUserWorkItem to start off the task.
In addition, if you're using .NET 4, you can use Parallel.ForEach to loop through your entire collection concurrently. This will use the ThreadPool (by default) to schedule all of your tasks.
Finally, you probably should eliminate the Thread.Sleep in your loop - it will just slow down the overall operation, and probably not provide you any gains (once you've switched to using the ThreadPool).
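A sketch of the Parallel.ForEach approach with the concurrency capped at four, matching the limit the question was trying to achieve with the one second delay. The Scrape function below is a hypothetical stand-in for the asker's scraping(asin, sku) method, and the input list is invented.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the real scraping method.
string Scrape(string asin)
{
    Thread.Sleep(20); // simulated network latency
    return asin.Trim().ToUpperInvariant();
}

// Invented input; the real program would read its ASIN list from elsewhere.
var asins = Enumerable.Range(0, 40).Select(i => $" item{i} ").ToList();
var results = new ConcurrentBag<string>(); // safe to add to from several threads

// At most four items are processed concurrently; no Thread.Sleep throttle needed.
var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };
Parallel.ForEach(asins, options, asin => results.Add(Scrape(asin)));

Console.WriteLine(results.Count);
```

Because the work here is mostly waiting on the network, an asynchronous approach would tie up fewer threads, but that goes beyond what the answers above propose.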
It exits and gets destroyed.
Maybe you'd be more interested in using a ThreadPool instead and setting its maximum thread count to 4?
It will reuse the same threads for your tasks. Constructing new threads involves allocating new memory for their stacks, so doing that 20000 times might be one of your bottlenecks.
It gets collected and discarded by the operating system. You might be better off with ThreadPool.QueueUserWorkItem. This will re-use threads so you don't have to incur the setup/teardown cost 20000 times.
Based on your edited post, the thread will be destroyed and its resources garbage collected, as @Karim has also mentioned.
If you were using a ThreadPool it would be returned to the pool. If you know exactly how many threads you plan to keep active at any given time you could create a pool with that number to save some overhead.
