I've read about the advantages of Tasks in "Difference between Task (System.Threading.Task) and Thread".
MSDN also says that "...in the .NET Framework 4, tasks are the preferred API for writing multi-threaded, asynchronous, and parallel code."
My program currently contains code like this, which receives multicast data over UDP:
thread = new Thread(WhileTrueFunctionToReceiveDataFromUdp);
.....
thread.Start();
I have several such threads for each socket.
Would I be better off replacing this code with Tasks?
It depends on what you're doing - if you're not going to use any of the new features in Task and the TPL, and your existing code works, there's no reason to change.
However, Task has many advantages - especially for operations that you want to run in a thread pool thread and return a result.
Also - given that you're using "threads for each socket", you will likely have long-lived threads. As such, if you do switch to Task.Factory.StartNew, you'll potentially want to specify that the tasks are LongRunning, or you'll wind up tying up a lot of ThreadPool threads with your socket data (with the default scheduler).
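For illustration, a minimal sketch of that switch might look like the following, reusing the WhileTrueFunctionToReceiveDataFromUdp method from the question (assumed to take no parameters):

using System.Threading.Tasks;

// One long-running task per socket instead of a dedicated Thread.
// LongRunning hints to the default scheduler that this work should get
// its own thread rather than occupying a regular ThreadPool thread.
Task receiveTask = Task.Factory.StartNew(
    () => WhileTrueFunctionToReceiveDataFromUdp(),
    TaskCreationOptions.LongRunning);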
Do not change code that already works and will keep working (at least according to Microsoft). Change it only for reasons like:
You want to use new features offered by Tasks
Personal study.
Remember that at the OS level they both ultimately end up as the same kernel objects.
Hope this helps.
Related
I have an ETL project with a few processing components. Each component is a producer-consumer based on BlockingCollection. All of the components are started via Task.Run() and run in parallel: they wait for items to arrive from other components, process them, and put the results into their output collections (think pipelines).
Is it possible to force the tasks to run on a single core (I don't want them to take 100% of a multi-core CPU) without setting processor affinity for the process (which seems like overkill)?
Please note that I still want the tasks to run in a parallel fashion - just on a single core.
A Task executes on a thread; the OS decides which core that thread runs on.
I don't think there is any way other than setting processor affinity.
See here: https://msdn.microsoft.com/en-us/library/system.diagnostics.processthread.processoraffinity.aspx
Are you sure that running them in parallel on one core will actually benefit performance? Why do you not want to allow the process to use 100% of the CPU if it needs to? The OS will still balance it against other processes and won't necessarily allow that anyway.
You could also just lower the Thread/Process priority if what worries you is your process straining other OS processes:
Process Priority: https://msdn.microsoft.com/en-us/library/system.diagnostics.process.priorityclass.aspx
Thread Priority: https://msdn.microsoft.com/en-us/library/system.threading.thread.priority(v=vs.110).aspx
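As a rough sketch of the priority approach (note that this only deprioritizes the work, it does not pin it to one core):

using System.Diagnostics;
using System.Threading;

// Lower the whole process's priority so other processes get the CPU first.
Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.BelowNormal;

// Or lower just the current thread's priority.
Thread.CurrentThread.Priority = ThreadPriority.BelowNormal;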
Yes, this is entirely possible. You just need to implement your own TaskScheduler.
In fact, the example in the TaskScheduler API docs illustrates how to accomplish exactly what you want--they implement a LimitedConcurrencyLevelTaskScheduler that lets you set the number of worker threads that you want to use.
The links in the Remarks section of the API docs are also valuable. The Samples for Parallel Programming with the .NET Framework 4 project contains a slew of alternative task schedulers, described in detail here. They may inspire you to think of alternative approaches to scheduling these tasks.
The only twist here is that you can't use the Task.Run() shortcut anymore--you'll need to go through a TaskFactory instead.
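As a hedged sketch, assuming the LimitedConcurrencyLevelTaskScheduler class from that docs sample has been copied into your project (component.Run() is a placeholder for whatever delegate you currently pass to Task.Run):

using System.Threading.Tasks;

// A scheduler that never runs more than one task at a time, built from the
// LimitedConcurrencyLevelTaskScheduler sample in the TaskScheduler API docs.
var scheduler = new LimitedConcurrencyLevelTaskScheduler(1);
var factory = new TaskFactory(scheduler);

// Instead of Task.Run(() => component.Run()):
Task componentTask = factory.StartNew(() => component.Run());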
When using Task.Run(), you have very little control over the jobs and everything runs in parallel, unless you use a custom scheduler.
Rather than that technical solution, I suggest using TPL Dataflow, which can be viewed as a higher-level layer for handling threaded jobs.
In TPL Dataflow, you can choose block types to process your data, and even connect blocks to each other, so that when an item finishes processing, the result can be enqueued into the next block.
You can use an ActionBlock<T>: you define the code to execute for each item, and whenever data is posted to the block with .Post(), it is automatically processed... in parallel. For your need, though, you can specify MaxDegreeOfParallelism = 1.
With this method you cannot control which core your code runs on, but you ensure that all items are processed sequentially and never use more than one core at a time.
var workerBlock = new ActionBlock<int>(
// Simulate work by suspending the current thread.
millisecondsTimeout => Thread.Sleep(millisecondsTimeout),
// Specify a maximum degree of parallelism.
new ExecutionDataflowBlockOptions
{
MaxDegreeOfParallelism = 1
});
// Source: https://learn.microsoft.com/fr-fr/dotnet/api/system.threading.tasks.dataflow.actionblock-1?view=netcore-3.1
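For completeness, a small usage sketch for the block above (assuming a reference to System.Threading.Tasks.Dataflow; the posted values are just example timeouts):

using System.Threading.Tasks.Dataflow;

// Post a couple of items; with MaxDegreeOfParallelism = 1 they are processed one at a time.
workerBlock.Post(1000);
workerBlock.Post(500);

// Signal that no more items will arrive and wait for the queue to drain.
workerBlock.Complete();
workerBlock.Completion.Wait();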
You can also read this complete article about TPL; it's very interesting.
After reading how the thread pool and tasks work in this article I came up with this question -
If I have a complex program in which some modules use tasks and some use thread pool, is it possible that there will be some scheduling problems due to the different uses?
Tasks are often implemented using the thread pool (one can of course also have tasks use other types of schedulers that give different behavior, but this is the default). In terms of the actual code being executed (assuming your tasks represent delegates being run), there really isn't much difference.
Tasks simply create a wrapper around that thread pool call to provide additional functionality for gathering information about, and processing the results of, that asynchronous operation. If you want to leverage that additional functionality then use tasks. If you have no need for it in some particular context, there's nothing wrong with using the thread pool directly.
Mixing the two, so long as you don't have trouble getting what you want out of the results of those operations, is not a problem at all.
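To illustrate the point, both of the calls below end up on the same managed thread pool by default; DoWork and ComputeResult are placeholder methods:

using System.Threading;
using System.Threading.Tasks;

// Fire-and-forget: no handle back to observe completion, results, or exceptions.
ThreadPool.QueueUserWorkItem(_ => DoWork());

// Same pool underneath, but with a Task you get a result, continuations,
// and exception propagation for free.
Task<int> task = Task.Run(() => ComputeResult());
int result = task.Result;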
No. And there actually isn't much in the way of memory or performance inefficiencies when mixing approaches; by default tasks use the same thread pool that thread pool threads use.
The only significant disadvantage of mixing both is a lack of consistency in your codebase. If you were to pick one, I would use the TPL, since it has a rich API for handling many aspects of multi-threading and takes advantage of the async/await language features.
Since your usage is divided down module lines, you don't have much to worry about.
No, there wouldn't be problems - you would just be inefficient in doing both. Use what is really needed and stick with that pattern. Also remember to make your app thread-safe, especially if you are accessing the same resources/variables etc. from different threads, regardless of which threading approach you use.
There shouldn't be any scheduling problems as such, but of course it's better to use Tasks and let the Framework decide what to do with the scheduled work. In the current version of the framework (4.5) the work will be queued through the ThreadPool unless the LongRunning option is used, but this behaviour may of course change in the future.
Verdict: mixing Tasks and the ThreadPool isn't a problem, but for new applications it's recommended to use Tasks instead of queueing work items directly on the ThreadPool (one reason being that the ThreadPool isn't directly available in the Windows 8 Runtime / Modern UI apps).
As part of trying to learn C#, I'm writing a small app that goes through a list of proxies. For each proxy it creates an HttpWebRequest to a proxytest.php page, which prints generic data about the given proxy (or doesn't, in which case the proxy is discarded).
Clearly the web request code needs to run on a separate thread - especially since I'm planning on going through rather large lists. But even on a separate thread, going through 5,000 proxies will take forever, so I think this means I need to create multiple threads (correct me if I'm wrong).
I looked through MSDN and random threading tutorials and there are several different classes available. What's the difference between Dispatcher, BackgroundWorker and Parallel? I was given this snippet:
Parallel.ForEach(URLsList, new ParallelOptions() { MaxDegreeOfParallelism = S0 }, (m, i, j) =>
{
string[] UP = m.Split('|');
string User = UP[0];
string Pass = UP[1];
// make call here
});
I'm not really sure how it's different from something like starting 5 separate BackgroundWorkers.
So what are the differences between those three and what would be a good (easy) approach to this problem?
Thanks
The Dispatcher is an object that models the message loop of WPF applications. If that doesn't mean anything to you then forget you ever heard of it.
BackgroundWorker is a convenience class over a thread that is part of the managed thread pool. It exists to provide some commonly requested functionality over manually assigning work to the thread pool with ThreadPool.QueueUserWorkItem.
The Thread class is very much like using the managed thread pool, with the difference being that you are in absolute control of the thread's lifetime (on the flip side, it's worse than using the thread pool if you intend to launch lots of short tasks).
The Task Parallel Library (TPL) (i.e. using Parallel.ForEach) would indeed be the best approach, since it not only takes care of assigning work units to a number of threads (from the managed thread pool) but it will also automatically divide the work units among those threads.
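A hedged sketch of that approach for the proxy-checking scenario might look like the following; LoadProxies, the test URL, and the limit of 20 concurrent requests are all placeholders, and the proxy list is assumed to hold "host:port" strings:

using System;
using System.Collections.Generic;
using System.Net;
using System.Threading.Tasks;

List<string> proxies = LoadProxies();   // assumed helper returning the proxy list

Parallel.ForEach(
    proxies,
    new ParallelOptions { MaxDegreeOfParallelism = 20 },
    proxy =>
    {
        string[] parts = proxy.Split(':');
        var request = (HttpWebRequest)WebRequest.Create("http://example.com/proxytest.php");
        request.Proxy = new WebProxy(parts[0], int.Parse(parts[1]));
        request.Timeout = 5000;
        try
        {
            using (request.GetResponse())
            {
                // The proxy answered: keep it.
            }
        }
        catch (WebException)
        {
            // The proxy failed or timed out: discard it.
        }
    });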
I would say use the Task Parallel Library. It is a new library that wraps all the manual threading code you would otherwise have to write.
The Task Parallel Library (TPL) is a collection of new classes specifically designed to make it easier and more efficient to execute very fine-grained parallel workloads on modern hardware. TPL has been available separately as a CTP for some time now, and was included in the Visual Studio 2010 CTP, but in those releases it was built on its own dedicated work scheduler. For Beta 1 of CLR 4.0, the default scheduler for TPL will be the CLR thread pool, which allows TPL-style workloads to “play nice” with existing, QUWI-based code, and allows us to reuse much of the underlying technology in the thread pool - in particular, the thread-injection algorithm, which we will discuss in a future post.
from
http://blogs.msdn.com/b/ericeil/archive/2009/04/23/clr-4-0-threadpool-improvements-part-1.aspx
I found working with this new .NET 4 library really easy. This blog post shows the old BackgroundWorker way of doing things and the new Task way of doing things.
http://nitoprograms.blogspot.com/2010/06/reporting-progress-from-tasks.html
I have created a renderer in Silverlight/C#. Currently I'm using System.Threading.ThreadPool to schedule rendering of tiles in parallel. This works well right now, but I would like to limit the number of threads used.
Since this runs on Silverlight there are a couple of restrictions:
If I call ThreadPool.SetMaxThreads the application crashes as documented.
There is no Task Parallel Library
I see a few options:
Find an OSS/third party Thread Pool
Implement my own Thread Pool (I'd rather not)
Use Rx (which I do in other places)
Are there any tested alternative Thread Pools that work with Silverlight out there?
Or can anyone come up with an Rx expression that spawns a limited number of threads and queues work on them?
If you're using Rx, check out:
https://github.com/xpaulbettsx/ReactiveUI/blob/master/ReactiveUI/ObservableAsyncMRUCache.cs
(Copying this one file into your app should be pretty easy, just nuke the this.Log() lines and the IEnableLogger interface)
Using it is pretty easy, just change your SelectMany to CachedSelectMany:
someArray.ToObservable()
.CachedSelectMany(webService)
.Subscribe(x => { /* do stuff */ });
If you use Rx then it seems like you could quite easily write your own implementation of IScheduler. This could just apply a simple semaphore and then pass the work on to the ThreadPool. With this approach you get to leverage the ThreadPool, and because you are coding against an interface you also get good seams for testing.
Furthermore, as you have written this yourself, you could actually use a small-ish (<10) set of threads that you manage yourself (instead of the thread pool), so you can avoid ThreadPool starvation.
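As a rough illustration of the core idea (a simplified sketch of the "limit concurrency, then hand off to the ThreadPool" approach, not a full Rx IScheduler implementation; the class and member names are made up):

using System;
using System.Collections.Generic;
using System.Threading;

// Queues work items and lets at most maxConcurrency of them run on the
// ThreadPool at a time.
public class LimitedWorkQueue
{
    private readonly Queue<Action> _queue = new Queue<Action>();
    private readonly int _maxConcurrency;
    private int _running;

    public LimitedWorkQueue(int maxConcurrency)
    {
        _maxConcurrency = maxConcurrency;
    }

    public void Enqueue(Action work)
    {
        lock (_queue)
        {
            _queue.Enqueue(work);
        }
        TryDispatch();
    }

    private void TryDispatch()
    {
        Action next = null;
        lock (_queue)
        {
            if (_running < _maxConcurrency && _queue.Count > 0)
            {
                next = _queue.Dequeue();
                _running++;
            }
        }
        if (next != null)
        {
            ThreadPool.QueueUserWorkItem(_ =>
            {
                try { next(); }
                finally
                {
                    lock (_queue) { _running--; }
                    TryDispatch();   // start the next queued item, if any
                }
            });
        }
    }
}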
Check out Ami Bar's SmartThreadPool. It's got a ton of features missing from the default .NET threadpool, allows you to set a MaxThreads property per threadpool instance, and supports Silverlight.
I came across this comprehensive explanation of the new .NET TPL library recently, and it sounded pretty impressive. Having read the article, it appears that the new task manager is so clever it can even tell whether your parallel tasks would be faster if run serially on the same thread rather than parcelled out to worker threads. That can often be a difficult decision.
Having written a lot of code using what threading was available previously, it now seems as though everything ought to be written with tasks, which would hand over a lot of the work to the taskmanager.
Am I right in thinking that whatever I previously did with threads should now be done with tasks? Of course there will always be cases where you need fine control, but should one generally throw ordinary background work onto a task rather than a new thread? I.e., has the default "I need this to run in the background => new thread" become "new task" instead?
Basically, yes, you want to use tasks and let them take care of the thread use. In practice, the tasks are processed by a thread pool.
Tasks are managed by the TaskScheduler. The default TaskScheduler runs tasks on ThreadPool threads and as such you have the same issues as you normally would when using the ThreadPool: It is hard to control the setup (priority, locale, background/foreground, etc.) on threads in the pool. If you need to control any of these aspects it may be better to manage the threads yourself. You may also implement your own scheduler to handle some of these issues.
For most other parts the new Task class works very well.
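To make that trade-off concrete, a small hedged sketch (DoBackgroundWork is a placeholder for your own method):

using System.Threading;
using System.Threading.Tasks;

// Typical case: hand the work to the default scheduler (the thread pool).
Task.Factory.StartNew(() => DoBackgroundWork());

// When you need control over priority or foreground/background behaviour,
// a manually created thread is still the simplest option.
var worker = new Thread(DoBackgroundWork)
{
    IsBackground = true,
    Priority = ThreadPriority.BelowNormal
};
worker.Start();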