MSDN documentation indicates that threads started by the TPL will enjoy better scheduling. However, since the threads are based upon ThreadPool, they will be implemented as background threads.
Now, there are some tasks I would like to be carried out in parallel, but it is imperative that these tasks be carried out until completion.
So, how do I create such tasks that are essentially foreground threads, but still enjoy the enhanced scheduling provided by the TPL?
You could write your own TaskScheduler implementation. Have a look in the TaskScheduler documentation for an example of implementing a TaskScheduler - hopefully it's relatively simple from there.
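As a rough sketch of that idea, the scheduler below runs every queued task on its own foreground thread. The class name ForegroundThreadTaskScheduler is made up for illustration and is not part of the TPL. Note the trade-off: a scheduler like this gives up the thread pool's scheduling heuristics, so it only addresses the "foreground" half of the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical scheduler (not part of the TPL): one foreground thread per task.
public sealed class ForegroundThreadTaskScheduler : TaskScheduler
{
    protected override void QueueTask(Task task)
    {
        var thread = new Thread(() => TryExecuteTask(task));
        thread.IsBackground = false; // foreground: the process stays alive until it ends
        thread.Start();
    }

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        // Never run a task inline on the waiting thread; always use a dedicated thread.
        return false;
    }

    protected override IEnumerable<Task> GetScheduledTasks()
    {
        // Tasks start immediately, so there is no queue to report to the debugger.
        return Enumerable.Empty<Task>();
    }
}

public static class ForegroundSchedulerDemo
{
    // Returns true if the task body ran on a foreground thread.
    public static bool RunsOnForegroundThread()
    {
        var scheduler = new ForegroundThreadTaskScheduler();
        Task<bool> task = Task.Factory.StartNew(
            () => !Thread.CurrentThread.IsBackground,
            CancellationToken.None,
            TaskCreationOptions.None,
            scheduler);
        return task.Result;
    }

    public static void Main()
    {
        Console.WriteLine(RunsOnForegroundThread());
    }
}
```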
The TPL does not really give you Threads, it lets you create Tasks. Tasks can be executed on different threads, so Task != Thread.
As with the plain ThreadPool, it would not be a good idea to change any thread properties.
But your problem could easily be solved by waiting for any outstanding tasks from the main thread. You usually want to catch and handle their exceptions too.
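A minimal sketch of waiting from the main thread, assuming two placeholder work items (WaitForTasksDemo, DoWork and Fail are illustrative names). Task.WaitAll blocks until every task has finished and reports failures as a single AggregateException.

```csharp
using System;
using System.Threading.Tasks;

public static class WaitForTasksDemo
{
    private static void DoWork()
    {
        Console.WriteLine("work item 1 done");
    }

    private static void Fail()
    {
        throw new InvalidOperationException("work item 2 failed");
    }

    // Waits for all tasks and returns the number of exceptions observed.
    public static int RunAndWait()
    {
        Task[] tasks =
        {
            Task.Factory.StartNew(() => DoWork()),
            Task.Factory.StartNew(() => Fail())
        };

        try
        {
            // Blocks the calling (main) thread until both tasks have completed.
            Task.WaitAll(tasks);
            return 0;
        }
        catch (AggregateException ex)
        {
            foreach (Exception inner in ex.InnerExceptions)
            {
                Console.WriteLine("Task failed: " + inner.Message);
            }
            return ex.InnerExceptions.Count;
        }
    }

    public static void Main()
    {
        Console.WriteLine(RunAndWait());
    }
}
```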
The IsBackground property can be assigned to. I'm not sure if this is "okay" or "not-pants-on-head-dumb" though.
Happy coding.
It is imperative that these tasks be
carried out until completion.
I assume you mean that you want to make sure those tasks complete even if the primary thread is shut down?
I wouldn't suggest depending on the foreground thread staying active if the main thread shuts down. Under normal circumstances, you can keep the main thread active, waiting for the tasks to complete. You can also write a handler that can trap unhandled exceptions and do a graceful shutdown--including waiting for the tasks to complete. If something escapes your unhandled exceptions trap, then your process is probably so corrupt that you shouldn't trust whatever results the tasks deliver.
And, of course, nothing you do will prevent a user from shutting down the threads using Task Manager or something similar.
I know the differences between a thread and a task, but I cannot understand whether creating threads inside tasks is the same as creating only threads.
It depends on how you use the multithreaded capabilities and the asynchronous programming semantics of the language.
Simple facts first. Assume you have an initial, simple, single-threaded, and nearly empty application (one that just reads a line of input with Console.ReadLine, for simplicity's sake). If you create a new Thread, then you've created it from within another thread, the main thread. Therefore, creating a thread from within a thread is a perfectly valid operation, and the starting point of any multithreaded application.
Now, a Task is not a thread per se, but when you call Task.Run it executes on a thread selected from the .NET managed thread pool. As such, if you create a new thread from within a task, you're essentially creating a thread from within a thread (same as above, no harm done). The caveat is that you don't control the pool thread or its lifetime: you can't kill it, suspend it, resume it, etc., because you don't have a handle to that thread. If you want some unit of work done and you don't care which thread does it, just that it's not the current one, then Task.Run is basically the way to go. That said, you can always start a new thread from within a task; you can even start a task from within a task, and there is official documentation on unwrapping nested tasks.
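To illustrate the nested-task case, here is a small sketch (NestedTaskDemo and RunNested are illustrative names). Starting a task from within a task yields a Task<Task<int>>, and Unwrap turns it into a single Task<int> that completes when the inner task does.

```csharp
using System;
using System.Threading.Tasks;

public static class NestedTaskDemo
{
    public static int RunNested()
    {
        // The outer task's result is itself a task.
        Task<Task<int>> outer = Task.Factory.StartNew(
            () => Task.Factory.StartNew(() => 21 * 2));

        // Unwrap creates a proxy task that represents the inner task's completion.
        Task<int> inner = outer.Unwrap();
        return inner.Result;
    }

    public static void Main()
    {
        Console.WriteLine(RunNested()); // prints 42
    }
}
```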
Also, you can await inside a task, and create a new thread inside an async method if you want. However, the usual pattern for async and await is to use them for I/O-bound operations. These are operations that require little CPU time but can take long because they need to wait for something, such as network requests and disk access. For responsive UI implementations, this technique is often used to prevent another operation from blocking the UI.
As for being pointless or not, it depends on the use case. I've faced situations where a thread inside a task could have been the solution, but I found that redesigning my program logic to use two tasks, instead of one task plus an inner thread, gave me a cleaner and more readable code structure. That's just personal preference, though.
As a final note, here are some links to official documentation and another post regarding multithreaded programming in C#:
Async in Depth
Task based asynchronous programming
Chaining Tasks using Continuation Tasks
Start multiple async Tasks and process them as they complete
Should one use Task.Run within another Task
It depends how you use tasks and what your reason is for wanting another thread.
Task.Run
If you use Task.Run, the work will "run on the ThreadPool". It will be done on a different thread than the one you call it from. This is useful in a desktop application where you have a long-running processor-intensive operation that you just need to get off the UI thread.
The difference is that you don't have a handle to the thread, so you can't control that thread in any way (suspend, resume, kill, reuse, etc.). Essentially, you use Task.Run when you don't care which thread the work happens on, as long as it's not the current one.
So if you use Task.Run to start a task, there's nothing stopping you from starting a new thread within, if you know why you're doing it. You could pass the thread handle between tasks if you specifically want to reuse it for a specific purpose.
Async methods
Methods that use async and await are meant for operations that use very little processing time but involve I/O, that is, operations that require waiting: for example, network requests, reading from or writing to local storage, etc. Using async and await means that the thread is free to do other things while you wait for a response. The benefits depend on the type of application:
Desktop app: The UI thread will be free to respond to user input while you wait for a response. I'm sure you've seen some programs that totally freeze while waiting for a response from something. This is what asynchronous programming helps you avoid.
Web app: The current thread will be freed up to do any other work required. This can include serving other incoming requests. The result is that your application can handle a bigger load than it could if you didn't use async and await.
There is nothing stopping you from starting a thread inside an async method too. You might want to move some processor-intensive work to another thread. But in that case you could use Task.Run too. So it all depends on why you want another thread.
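The sketch below contrasts the two cases under simple assumptions: Task.Delay stands in for a real network or disk call, and the summing loop stands in for processor-intensive work. AsyncDemo, FetchAsync and SumAsync are illustrative names.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncDemo
{
    // I/O-style wait: no thread is blocked while the delay is pending.
    public static async Task<string> FetchAsync()
    {
        await Task.Delay(100); // placeholder for a network or disk operation
        return "response";
    }

    // CPU-bound work: pushed onto a thread pool thread with Task.Run.
    public static Task<long> SumAsync(int n)
    {
        return Task.Run(() =>
        {
            long total = 0;
            for (int i = 1; i <= n; i++)
            {
                total += i;
            }
            return total;
        });
    }

    public static async Task<string> RunAsync()
    {
        Task<string> io = FetchAsync();
        Task<long> cpu = SumAsync(1000);
        await Task.WhenAll(io, cpu); // both operations proceed concurrently
        return io.Result + ":" + cpu.Result;
    }

    public static void Main()
    {
        Console.WriteLine(RunAsync().GetAwaiter().GetResult()); // prints response:500500
    }
}
```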
It would be pointless in most cases of everyday programming.
There are situations where you would create threads.
I updated my code to use Tasks instead of threads....
Looking at memory usage and CPU, I do not notice any improvement on the multi-core PC. Is this expected?
My application essentially starts up threads/tasks in different objects when it runs...
All I'm doing is a simple
Task a = new Task(...)
a.Start();
There are various implications to using Tasks instead of Threads, but performance isn't a major one (assuming you weren't creating huge numbers of threads.) A few key differences:
The default TaskScheduler will use thread pooling, so some Tasks may not start until other pending Tasks have completed. If you use Thread directly, every use will start a new Thread.
When an exception occurs in a Task, it gets wrapped into an AggregateException that calling code can receive when it waits for the Task to complete or if you register a continuation on the Task. This is because you can also do things like wait on multiple Tasks to complete, in which case multiple exceptions can be thrown and aggregated.
If you don't observe an unhandled exception thrown by a Task, it will (well, may) eventually be thrown by the finalizer of the Task, which is particularly nasty. (That is the .NET 4.0 behavior; from .NET 4.5 on, an unobserved exception no longer takes the process down by default.) I always recommend hooking the TaskScheduler.UnobservedTaskException event so that you can at least log these failures before the application blows up. This is different from Thread exceptions, which show up in the AppDomain.UnhandledException event.
If you simply replaced every usage of Thread with Task and made no other changes, I would expect virtually the same performance. The Task API is really just that: an API over an existing set of constructs. Under the hood it uses threads to schedule its activities and hence has similar performance characteristics.
What's great about Task is the new things you can do with it:
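A small sketch of both behaviors (TaskExceptionDemo, Fail and ObserveFailure are illustrative names). Waiting on the task observes the failure as an AggregateException; the UnobservedTaskException handler is a last-resort log for failures that nobody waited on. That event fires only when a faulted task is garbage collected, so the handler may never run in a short program like this one.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskExceptionDemo
{
    private static void Fail()
    {
        throw new InvalidOperationException("task failed");
    }

    // Observes the failure by waiting; returns the inner exception's message.
    public static string ObserveFailure()
    {
        Task task = Task.Factory.StartNew(() => Fail());
        try
        {
            task.Wait();
            return null;
        }
        catch (AggregateException ex)
        {
            // The original exception is wrapped inside the AggregateException.
            return ex.InnerException.Message;
        }
    }

    public static void Main()
    {
        // Last-chance logging for task failures that were never observed.
        TaskScheduler.UnobservedTaskException += (sender, e) =>
        {
            Console.WriteLine("Unobserved task exception: " + e.Exception.InnerException.Message);
            e.SetObserved(); // mark as handled
        };

        Console.WriteLine(ObserveFailure());
    }
}
```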
Composition with ContinueWith
Cancellation
Hierarchies
Etc ...
One great improvement of Tasks over Threads is that you can easily build chains of tasks. You can specify when a task should start after the previous task ("OnSuccess", "OnError", and so on), and you can specify whether there should be a synchronization context switch. That gives you the opportunity to run a long-running task in the background and, after that, a UI-refreshing task on the UI thread.
If you are using .NET 4.0, then you can use the Parallel.Invoke method like so:
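Here is a sketch of such a chain using TaskContinuationOptions (ContinuationDemo and RunChain are illustrative names). In a UI application you could additionally pass TaskScheduler.FromCurrentSynchronizationContext() to ContinueWith to run the continuation on the UI thread; that part is left out here because a console program has no synchronization context.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    public static string RunChain(bool fail)
    {
        Task<int> first = Task.Factory.StartNew(() =>
        {
            if (fail)
            {
                throw new InvalidOperationException("boom");
            }
            return 42;
        });

        // Runs only if the first task completed successfully.
        Task<string> onSuccess = first.ContinueWith(
            t => "result: " + t.Result,
            TaskContinuationOptions.OnlyOnRanToCompletion);

        // Runs only if the first task threw; reading t.Exception observes the failure.
        Task<string> onError = first.ContinueWith(
            t => "error: " + t.Exception.InnerException.Message,
            TaskContinuationOptions.OnlyOnFaulted);

        // Exactly one continuation runs; the other one is canceled.
        // For this demonstration we know which one from the 'fail' flag.
        Task<string> chosen = fail ? onError : onSuccess;
        return chosen.Result;
    }

    public static void Main()
    {
        Console.WriteLine(RunChain(false)); // prints result: 42
        Console.WriteLine(RunChain(true));  // prints error: boom
    }
}
```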
Parallel.Invoke(()=> {
// Whatever code you add here will be run by Parallel.Invoke, possibly in parallel with other actions you pass.
});
For more info, see:
http://msdn.microsoft.com/en-us/library/dd992634.aspx
You would see a difference only if your original or converted code did not utilize the CPU completely. For example, if the original code always limited the number of threads to 2, on a quad-core machine it would run at about 50% load with manually created threads and potentially 100% load with tasks (if your tasks can actually be parallelized). So it looks like either your original code was reasonable from a performance point of view, or both implementations suffer from issues that show similar underutilization of the CPU.
My model of how threads work is that some ThreadManager gives each thread a turn. When it's a thread's turn, it gets to execute a few lines of code.
To pause a thread, couldn't one just have the ThreadManager (momentarily) stop allowing that thread to have a turn?
To abort a thread, couldn't the ThreadManager just never give that thread another turn?
What's the problem?
Quote from MSDN about pausing threads:
You have no way of knowing what code a
thread is executing when you suspend
it. If you suspend a thread while it
holds locks during a security
permission evaluation, other threads
in the AppDomain might be blocked. If
you suspend a thread while it is
executing a class constructor, other
threads in the AppDomain that attempt
to use that class are blocked.
Deadlocks can occur very easily.
An aborted thread can lead to unpredictable behavior. There is a good article about this: http://www.bluebytesoftware.com/blog/2009/03/13/ManagedCodeAndAsynchronousExceptionHardening.aspx
I agree with Alex, but to elaborate further, if you need to "pause" a thread, it will probably be better to look at some sort of locking mechanism like Semaphores, Mutexes, or one of the many other ones available.
That said, I don't know your code, and Windows is a preemptive multitasking environment, so usually this is not needed: just let your threads run, and the underlying OS scheduler will make sure they all get a fair turn.
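If a pause really is required, one cooperative approach is a gate that the worker checks at points where it is safe to stop. The sketch below uses ManualResetEventSlim as that gate; this is one possible primitive, chosen for illustration, and PauseDemo and Run are made-up names. The worker only ever blocks at gate.Wait(), never while holding a lock.

```csharp
using System;
using System.Threading;

public static class PauseDemo
{
    // Returns { steps seen while paused, steps after resuming }.
    public static int[] Run()
    {
        var gate = new ManualResetEventSlim(false); // not set: the worker starts paused
        int steps = 0;

        var worker = new Thread(() =>
        {
            for (int i = 0; i < 5; i++)
            {
                gate.Wait(); // safe point: blocks while the gate is closed
                Interlocked.Increment(ref steps);
            }
        });
        worker.Start();

        Thread.Sleep(50);
        int whilePaused = Volatile.Read(ref steps); // still 0: the worker is parked at the gate

        gate.Set();    // resume
        worker.Join(); // wait for the worker to finish its five steps
        return new[] { whilePaused, steps };
    }

    public static void Main()
    {
        int[] result = Run();
        Console.WriteLine(result[0] + " " + result[1]); // prints 0 5
    }
}
```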
I have a Main thread that spawns around 20 worker threads.
I need to stop the Main thread until all the other threads are finished.
I know about (thread).Join. But that only works for one thread.
and multiple Joins hurt performance like this.
t1.Join()
t2.Join()
...
t20.Join()
as the program waits one by one for each to stop.
How would I make it such that
the main thread waits for all of a set of threads to end?
You should really look into Task Parallelism (Task Parallel Library). It uses a thread pool, but also manages work-stealing etc.
Quote: "The TPL scales the degree of concurrency dynamically to most efficiently use all the processors that are available. In addition, the TPL handles the partitioning of the work, the scheduling of threads on the ThreadPool, cancellation support, state management, and other low-level details." on Task Parallel Library
You can use it like this:
Task[] tasks = new Task[3]
{
Task.Factory.StartNew(() => MethodA()),
Task.Factory.StartNew(() => MethodB()),
Task.Factory.StartNew(() => MethodC())
};
//Block until all tasks complete.
Task.WaitAll(tasks);
Or if you use some kind of a loop to spawn your threads:
Data Parallelism (Task Parallel Library)
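For the loop case, a minimal sketch with Parallel.For (ParallelLoopDemo and ProcessAll are illustrative names, and the loop body is placeholder work). Parallel.For does not return until every iteration has finished, so no explicit joining is needed.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelLoopDemo
{
    // Runs 'workerCount' iterations in parallel and returns how many completed.
    public static int ProcessAll(int workerCount)
    {
        int completed = 0;

        // Blocks until all iterations are done.
        Parallel.For(0, workerCount, i =>
        {
            // Placeholder for the real per-item work.
            Interlocked.Increment(ref completed);
        });

        return completed;
    }

    public static void Main()
    {
        Console.WriteLine(ProcessAll(20)); // prints 20
    }
}
```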
The joins are fine if that's what you want it to do. The main thread still has to wait for all the worker threads to terminate. Check out this website which is a sample chapter from C# in a Nutshell. It just so happens to be the threading chapter: http://www.albahari.com/threading/part4.aspx.
I can't see an obvious performance penalty for waiting for the threads to finish one by one. So, a simple foreach does what you want without any unnecessary bells and whistles:
foreach (Thread t in threads) t.Join();
Note: Of course, there's a Win32 API function that allows waiting for several objects (threads, in this case) at once — WaitForMultipleObjectsEx. There are many helper classes or threading frameworks out there on the Internet that utilize it for what you want. But do you really need them for a simple case?
and multiple Joins hurt performance
like this.
There's no "performance hurting" here. If you want to wait for all of your threads to exit, you call .Join() on each of them.
Stuff your threads in a list and do
foreach (var t in myThreads)
    t.Join();
If you are sure you will always have < 64 threads then you could have each new thread reliably set an Event before it exits, and WaitAll on the events in your main thread, once all threads are started up. The Event object would be created in the main thread and passed to the relevant child thread in a thread-safe way at thread creation time.
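A sketch of that event-based approach, assuming placeholder work in each thread (WaitAllEventsDemo and RunWorkers are illustrative names). Each worker sets its own ManualResetEvent in a finally block so the signal is raised even if the work throws, and the main thread waits on all events at once. WaitHandle.WaitAll rejects more than 64 handles, and it does not support multiple handles on an STA thread.

```csharp
using System;
using System.Threading;

public static class WaitAllEventsDemo
{
    // Starts 'count' worker threads and waits until all have signaled completion.
    public static bool RunWorkers(int count)
    {
        var done = new WaitHandle[count];

        for (int i = 0; i < count; i++)
        {
            // Created on the main thread and captured by exactly one worker.
            var signal = new ManualResetEvent(false);
            done[i] = signal;

            new Thread(() =>
            {
                try
                {
                    Thread.Sleep(10); // placeholder for the real work
                }
                finally
                {
                    signal.Set(); // always signal, even on failure
                }
            }).Start();
        }

        // Returns true once every event has been set.
        return WaitHandle.WaitAll(done);
    }

    public static void Main()
    {
        Console.WriteLine(RunWorkers(20)); // prints True
    }
}
```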
In native code you could do the same thing on the thread handles themselves, but I'm not sure how to do this in .NET.
See also this prior question: C#: Waiting for all threads to complete
So my question is how to implement cancel/interrupt feature into all (I mean ALL) thread workers in your application in best and most elegant way?
It's not important if it's an HttpWebRequest, IO operation or calculation. The user should be able to cancel every action/thread at any moment.
Use .NET 4.0 Tasks with CancellationTokens - they are the new universal cancellation system.
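A minimal sketch of cooperative cancellation with a CancellationTokenSource (CancellationDemo and RunAndCancel are illustrative names; the loop body is placeholder work). The worker decides where it is safe to stop by calling ThrowIfCancellationRequested at those points.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellationDemo
{
    // Starts a worker, cancels it, and reports whether it ended in the Canceled state.
    public static bool RunAndCancel()
    {
        var cts = new CancellationTokenSource();
        var started = new ManualResetEventSlim(false);

        Task worker = Task.Factory.StartNew(() =>
        {
            started.Set();
            for (int i = 0; i < 1000; i++)
            {
                // Safe point: throws OperationCanceledException once cancellation is requested.
                cts.Token.ThrowIfCancellationRequested();
                Thread.Sleep(10); // placeholder for one unit of work
            }
        }, cts.Token);

        started.Wait(); // make sure the worker is actually running
        cts.Cancel();   // e.g. triggered by the user pressing a Cancel button

        try
        {
            worker.Wait();
        }
        catch (AggregateException ex)
        {
            // Cancellation surfaces as a TaskCanceledException inside the aggregate.
            ex.Handle(e => e is OperationCanceledException);
        }

        return worker.IsCanceled;
    }

    public static void Main()
    {
        Console.WriteLine(RunAndCancel()); // prints True
    }
}
```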
The user should be able to
cancel every action/thread at any
moment.
Threading is a practice, not a design... and believe me it has been tried as a design, but it failed miserably. The basic problem with simply canceling any action at any moment is that in a multithreaded environment it's just evil! Imagine that you have a section of code guarded by a lock and you have two threads running in parallel:
Thread 1 acquires the lock.
Thread 2 waits until the lock is released so it can acquire it.
Thread 1 is canceled while it's holding the lock and it doesn't release the lock.
DEADLOCK: Thread 2 is waiting for the lock which will never be released.
This is the simplest example and technically we can take care of this situation in the design, i.e. automatically release any locks that the thread has acquired, but instead of locks think of object states, resource utilization, client dependencies, etc. If your thread is modifying a big object and it's canceled in the middle of the modification, then the state of the object may be inconsistent, the resource which you're utilizing might get hung up, the client depending on that thread might crash... there is a slew of things which can happen and there is simply no way to design for them. In this case you make it a practice to manage the threads: you ensure a safe cancellation of your threads.
Others have already mentioned various methods for starting threads that can be canceled, but I just wanted to touch on the principles. Even in the cases where there is a way to cancel your threads, you still have to keep in mind that you're responsible for determining the safest way to cancel your thread.
It's not important if it's an HttpWebRequest, IO operation or calculation.
I hope now you understand why it's the MOST important thing! Unless you specifically know what your thread is doing, then there is no safe way to automatically cancel it.
P.S.
One thing to remember is that if you don't want hanging threads then for each one of them you can set the Thread.IsBackground flag to true and they will automatically be closed when your application exits.
Your worker threads need a way to check with your main thread to see if they should keep going. One way is to share a static volatile bool that's set by your UI and periodically checked by the worker threads.
My preference is to create your own threads that run instances of a worker class that periodically invoke a callback method provided by your main thread. This callback returns a value that tells the worker to continue, pause, or stop.
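One way this callback design could look, as a sketch only: WorkerCommand, Worker and the check-in delegate are hypothetical names, and the "work" is just a counter. The worker asks the callback what to do before each unit of work, so it only ever stops at a point it has chosen itself.

```csharp
using System;
using System.Threading;

public enum WorkerCommand { Continue, Pause, Stop }

// Hypothetical worker class that checks in with its owner before each unit of work.
public sealed class Worker
{
    private readonly Func<WorkerCommand> _checkIn;

    public int StepsCompleted { get; private set; }

    public Worker(Func<WorkerCommand> checkIn)
    {
        _checkIn = checkIn;
    }

    public void Run()
    {
        while (true)
        {
            WorkerCommand command = _checkIn();
            if (command == WorkerCommand.Stop)
            {
                return; // clean exit at a safe point
            }
            if (command == WorkerCommand.Pause)
            {
                Thread.Sleep(10); // idle briefly, then ask again
                continue;
            }
            StepsCompleted++; // placeholder for one unit of real work
        }
    }
}

public static class CallbackWorkerDemo
{
    // Lets the worker run 'stepsBeforeStop' steps, then tells it to stop.
    public static int RunUntilStopped(int stepsBeforeStop)
    {
        int calls = 0;
        var worker = new Worker(() =>
            Interlocked.Increment(ref calls) > stepsBeforeStop
                ? WorkerCommand.Stop
                : WorkerCommand.Continue);

        var thread = new Thread(worker.Run);
        thread.Start();
        thread.Join();
        return worker.StepsCompleted;
    }

    public static void Main()
    {
        Console.WriteLine(RunUntilStopped(3)); // prints 3
    }
}
```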
Avoid the temptation to use Thread.Abort() to kill worker threads: Manipulating a thread from a different thread.