Since the launch of .NET 4.0, a term that has come into the limelight is parallel computing. Does parallel computing provide real benefits, or is it just another concept or feature?
Also, is .NET really going to use it in applications?
And is parallel computing different from parallel programming?
Kindly shed some light on the issue from a .NET perspective; some examples would be helpful.
It is not exactly a new term, just a new emphasis.
And yes, more programmers will have to write more parallel code if they want to profit from new hardware. See "The Free Lunch Is Over".
Is parallel computing different from parallel programming?
No.
And here are some samples on MSDN (the PLINQ raytracer is cool).
You use parallel programming methods to enable parallel computing of your operations. .NET will utilize it if you tell it to via code.
The benefit of parallel computing is overall speed of execution. As you may have noticed over the past few years, processors aren't getting any faster, but the number of CPU cores per system is increasing. Parallel programming is the means by which you can take advantage of this form of upgrade, by splitting large jobs into smaller tasks that can be handled concurrently by separate cores.
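As a rough illustration, here is a minimal sketch (the arithmetic in the loop body is just placeholder CPU-bound work) of how Parallel.For from the Task Parallel Library splits an index range across the available cores:

using System;
using System.Threading.Tasks;

class ParallelForDemo
{
    static void Main()
    {
        double[] results = new double[10000];

        // Parallel.For partitions the index range and runs the chunks
        // concurrently on thread-pool worker threads.
        Parallel.For(0, results.Length, i =>
        {
            results[i] = Math.Sqrt(i) * Math.Sin(i);   // placeholder CPU-bound work
        });

        Console.WriteLine("Finished; {0} logical processors available.", Environment.ProcessorCount);
    }
}

On a single-core machine the loop still runs correctly; it simply degrades to roughly sequential execution.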
I have a C# desktop application. Its purpose is twofold:
1) To display a live feed from an IP camera in my WinForms application.
2) To send any captured motion to my server.
It is (2) that is labour-intensive. I believe I have optimised it as much as I can, and the RAM usage is manageable.
However, in my quest to learn and to make my code even more efficient, I am always open to new approaches.
Today I came across parallel processing. But reading some links, it seems there would not be much performance gain from using it. Indeed, in all my travels (contracts) I have never seen anyone use parallel processing in C# development.
Should I take early heed and not bother looking into this, or should I see whether there is anything to gain by 'off-loading' my motion detection code to a separate parallel process?
People's advice/experience would be greatly appreciated.
Thanks
I would recommend taking a look at the Task Parallel Library (TPL) provided in the .NET Framework. It is built around the idea that a piece of work is a Task, and it gives you an abstraction over creating and managing threads manually.
Tasks can run in parallel on their own threads or run on the same thread, depending on the workload and configuration. The TPL is also great for asynchronous operations and works very well with I/O, where waiting on hardware (reading from a hard drive, for example) would otherwise block a thread and cause performance issues in your application.
I suggest running a profiler on your application; Visual Studio Professional and above come with a built-in profiler that will let you trace and pinpoint intensive operations that could be improved with concurrency. If your application is running smoothly then there is no immediate need, but there is nothing wrong with forward thinking: learning the Task Parallel Library now means you will know how to implement concurrency when the point comes where it benefits you.
I've used the TPL to solve various performance issues with large database calls in iterative loops, and it is great for these I/O operations. The TPL also takes into account the hardware it is being executed on and, used correctly, adapts to it. You could take the same piece of code, run it on a 2-core machine, and it will still do the best the hardware can provide without you having to worry about creating too many threads, etc.
Personally, I'd say some asynchronous operations could be a good addition to your application, since you are dealing with external network camera devices that could otherwise cause blocked threads in your application.
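For example, here is a minimal sketch of that idea (DetectMotion, UploadToServer and LogError are hypothetical placeholders for whatever your real methods are called): the expensive upload is pushed onto a Task so the UI thread and the live feed stay responsive.

// Sketch only: DetectMotion, UploadToServer and LogError stand in for
// methods in your own application.
private void OnFrameReceived(Bitmap frame)
{
    bool motion = DetectMotion(frame);   // quick check on the capture thread

    if (motion)
    {
        // Off-load the labour-intensive upload to a thread-pool task so the
        // WinForms UI is never blocked while the server call is in flight.
        Task.Factory.StartNew(() => UploadToServer(frame))
            .ContinueWith(t =>
            {
                if (t.IsFaulted)
                    LogError(t.Exception);   // hypothetical error-logging helper
            });
    }
}

This assumes the usual System.Drawing and System.Threading.Tasks usings, and that the frame is safe to use off the UI thread (you may need to clone the bitmap before handing it to the task).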
I need to scrape data from a website.
I have over 1,000 links I need to access. Previously I was dividing the links 10 per thread, and would start 100 threads, each pulling 10. After a few test cases, 100 threads was the best count to minimize the time it took to retrieve the content for all the links.
I realized that .NET 4.0 offers better support for multi-threading out of the box, but the defaults are based on how many cores you have, which in my case does not spawn enough threads. I guess what I am asking is: what is the best way to optimize pulling the 1,000 links? Should I use Parallel.ForEach and let the Parallel extensions control the number of threads that get spawned, or find a way to tell it how many threads to start and how to divide the work?
I have not worked with Parallel before, so my approach may be wrong.
You can use the MaxDegreeOfParallelism property of ParallelOptions with Parallel.ForEach to control the number of threads that will be spawned.
Here's a code snippet:
// Cap this loop at 5 concurrent work items; tune the value for your workload.
ParallelOptions opt = new ParallelOptions();
opt.MaxDegreeOfParallelism = 5;
Parallel.ForEach(Directory.GetDirectories(Constants.RootFolder), opt, MyMethod);
In general, Parallel.ForEach() is quite good at optimizing the number of threads. It accounts for the number of cores in the system, but also takes into account what the threads are doing (CPU bound, IO bound, how long the method runs, etc.).
You can control the maximum degree of parallelization, but there's no mechanism to force more threads to be used.
Make sure your benchmarks are correct and can be compared fairly (e.g. same websites, allow for a warm-up period before you start measuring, and do many runs, since response-time variance can be quite high when scraping websites). If, after careful measurement, your own threading code is still faster, you can conclude that you have optimized for your particular case better than .NET does and stick with your own code.
Something worth checking out is the TPL Dataflow library.
DataFlow on MSDN.
See Nesting await in Parallel.ForEach
The whole idea behind Parallel.ForEach() is that you have a set of threads and each thread processes part of the collection. As you noticed, this doesn't work with async-await, where you want to release the thread for the duration of the async call.
Also, the walkthrough Creating a Dataflow Pipeline specifically sets up and processes multiple web page downloads. TPL Dataflow really was designed for that scenario.
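As a rough sketch of that walkthrough's idea (assuming the System.Threading.Tasks.Dataflow NuGet package plus .NET 4.5's async-await and HttpClient; on .NET 4.0 you would use WebClient and continuations instead), an ActionBlock lets you cap the number of simultaneous downloads while each one awaits asynchronously:

using System;
using System.Net.Http;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class ScraperSketch
{
    static void Main()
    {
        RunAsync(new[] { "http://example.com/" }).Wait();
    }

    static async Task RunAsync(string[] urls)
    {
        var client = new HttpClient();

        // Each posted URL is handled by the async delegate; at most
        // 10 downloads are in flight at any one time.
        var downloader = new ActionBlock<string>(async url =>
        {
            string html = await client.GetStringAsync(url);
            Console.WriteLine("{0}: {1} chars", url, html.Length);
        },
        new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 10 });

        foreach (var url in urls)
            downloader.Post(url);

        downloader.Complete();          // no more input will arrive
        await downloader.Completion;    // wait for the block to drain
    }
}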
Hard to say without looking at your code and how the collection is defined. I've found that Parallel.Invoke is the most flexible. Try MSDN? ... It sounds like you are looking for the Parallel.For method (Int32, Int32, Action<Int32, ParallelLoopState>).
What is the difference between multithreaded programming and multitasking in C# (.NET 4)?
I need some technical input.
I am doing some research on the topic and I need something to help me.
Multitasking is a somewhat imprecise term that can mean different things in different contexts. It can refer to:
multi-processing (time sharing between separate processes),
multiple threads or tasks in an embedded system,
a particular form or framework of multi-threading,
or even just plain multithreading.
I think that the 'multitasking' term you're asking about is regarding the "Task Parallelism" support added in .NET 4: http://msdn.microsoft.com/en-us/library/dd537609.aspx
That model would fall into the 3rd item above - it's an abstraction for performing work in parallel that uses threading but tries to keep much of the mechanics of threads under the covers.
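A minimal illustration of that abstraction (both snippets just print a message; the work itself is a placeholder): the same job is done once with a raw Thread, which you create and manage yourself, and once with a .NET 4 Task, where the runtime decides which thread-pool thread runs it.

using System;
using System.Threading;
using System.Threading.Tasks;

class ThreadVsTask
{
    static void Main()
    {
        // Explicit multithreading: you create, start and join the thread yourself.
        var thread = new Thread(() => Console.WriteLine("work on a dedicated thread"));
        thread.Start();
        thread.Join();

        // Task parallelism (.NET 4): you describe the work as a Task and the
        // scheduler maps it onto a thread-pool thread for you.
        Task task = Task.Factory.StartNew(() => Console.WriteLine("work as a task"));
        task.Wait();
    }
}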
I am new to the .NET platform. I did a search and found that there are several ways to do parallel computing in .NET:
Parallel tasks in the Task Parallel Library, which is .NET 3.5.
PLINQ, .NET 4.0.
Asynchronous programming, .NET 2.0 (async is mainly used for I/O-heavy tasks; F# has a concise syntax supporting this). I list this because Mono seems to have no TPL or PLINQ, so if I need to write cross-platform parallel programs I can use async.
.NET threads. No version limitation.
Could you give some short comments on these or add more methods in this list?
You do need to do a fair amount of research in order to multithread effectively. There are some good technical articles on the Microsoft Parallel Computing team's site.
Off the top of my head, there are several ways to go about multithreading:
Thread class.
ThreadPool, which also has support for I/O-bound operations and an I/O completion port.
Begin*/End* asynchronous operations.
Event-based asynchronous programming (or "EBAP") components, which use SynchronizationContext.
BackgroundWorker, which is an EBAP that defines an asynchronous operation.
Task class (Task Parallel Library) in .NET 4.
Parallel LINQ. There is a good article on Parallel.ForEach (Task Parallel Library) vs. PLINQ.
Rx or "LINQ to Events", which does not yet have a non-Beta version but is nearing completion and looks promising.
(F# only) Asynchronous workflows.
Update: There is an article Understanding and Applying Parallel Patterns with the .NET Framework 4 available for download that gives some direction on which solutions to use for which kinds of parallel scenarios (though it assumes .NET 4 and doesn't cover Rx).
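To make a few of the options above concrete, here is a brief sketch showing the same placeholder workload expressed with the ThreadPool, a Task and PLINQ (Compute is just a stand-in for real work):

using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class ApproachesSketch
{
    static int Compute(int n) { return n * n; }   // placeholder workload

    static void Main()
    {
        // ThreadPool: a fire-and-forget work item.
        ThreadPool.QueueUserWorkItem(_ => Compute(1));

        // Task (TPL, .NET 4): a work item with a handle you can wait on.
        Task<int> task = Task.Factory.StartNew(() => Compute(2));
        Console.WriteLine(task.Result);

        // PLINQ: declarative data parallelism over a sequence.
        int[] squares = Enumerable.Range(0, 100).AsParallel().Select(Compute).ToArray();
        Console.WriteLine(squares.Length);
    }
}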
Strictly speaking, a distinction should be made here between parallel, asynchronous and concurrent.
Parallel means that a "task" is split among several smaller sub-"tasks" that can be run at the same time. This requires a multi-core CPU or a multi-CPU computer, where each task has its dedicated core or CPU. Or multiple computers. PLINQ (data parallelism) and TPL (task parallelism) fall into this category.
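A small sketch of those two flavours (the delegates are placeholders): PLINQ parallelizes the same operation over many data items, while the TPL runs distinct tasks side by side.

using System;
using System.Linq;
using System.Threading.Tasks;

class ParallelFlavours
{
    static void Main()
    {
        // Data parallelism (PLINQ): one operation applied to many items at once.
        int[] evens = Enumerable.Range(0, 1000)
                                .AsParallel()
                                .Where(n => n % 2 == 0)
                                .ToArray();
        Console.WriteLine(evens.Length);

        // Task parallelism (TPL): different, independent operations run together.
        Parallel.Invoke(
            () => Console.WriteLine("first independent task"),
            () => Console.WriteLine("second independent task"));
    }
}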
Asynchronous means that tasks run without blocking each other. F#'s async expressions, Rx and the Begin/End pattern are all APIs for async programming.
Concurrency is a broader concept than parallelism and asynchrony.
Concurrency means that several "tasks" run at the same time, interacting with each other. But these "tasks" don't have to run on separate physical computing units, as is meant by parallelism. For example, a multitasking operating system can execute multiple processes concurrently even on a single-core, single-CPU computer, using time slices.
Concurrency can be achieved, for example, with the Actor model and message passing (e.g. F#'s MailboxProcessor, Erlang processes, or Retlang in .NET).
Threads are a relatively low-level concept compared to the concepts above. Threads are tasks running within a process, running concurrently and managed directly by the operating system's scheduler. You can implement parallelization when the operating system maps each thread to a separate core, or an Actor model by implementing message queuing, routing, etc on each thread.
There are also some .NET libraries for data parallel programming which target the Graphics Processing Unit (GPU) including:
Microsoft Accelerator is for data-parallel programming and can target either the GPU or multi-core processors.
Brama is for LINQ style data transformations that run on the GPU.
CUDA.NET provides a wrapper that allows CUDA to be used from .NET programs.
There are also the Reactive Extensions for .NET (Rx).
Rx is basically LINQ queries for events. It allows you to process and combine asynchronous data streams in the same way LINQ allows you to work with collections. So you would probably use it in conjunction with other parallel technologies, as a way of bringing the results of your parallel operations together without having to worry about locks and other low-level threading primitives.
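As a rough sketch of the idea (assuming the Rx assemblies are referenced; the namespace below is the one used by current releases, and older betas differ), Rx lets you treat a stream of values arriving over time as something you can query like a collection:

using System;
using System.Reactive.Linq;   // Rx; namespace may differ in older (beta) releases

class RxSketch
{
    static void Main()
    {
        // Treat a timer as an asynchronous stream and query it like a sequence.
        IDisposable subscription = Observable
            .Interval(TimeSpan.FromMilliseconds(250))   // produces 0, 1, 2, ... over time
            .Where(n => n % 2 == 0)
            .Select(n => string.Format("tick {0}", n))
            .Subscribe(Console.WriteLine);

        Console.ReadLine();        // let a few ticks arrive
        subscription.Dispose();    // unsubscribe
    }
}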
Expert to Expert: Brian Beckman and Erik Meijer - Inside the .NET Reactive Framework (Rx) gives a good overview of what Rx is all about.
EDIT: Another library worth mentioning is the Concurrency and Coordination Runtime (CCR). It has been around for a long time (since before 2006) and is shipped as part of Microsoft Robotics Studio.
Rx has a lot of the same cool ideas that the CCR has inside it, but with a much nicer API in my opinion. There's still some interesting stuff in the CCR though so it might be worth checking out. There's also a distributed services framework that works with the CCR that might make it useful depending on what you're doing.
Expert to Expert: Meijer and Chrysanthakopoulos - Concurrency, Coordination and the CCR
One more is the new Task Parallel Library in .NET 4.0, which is similar to and along the lines of what you've already discovered, but this may be an interesting read:
Task Parallel Library
The two major ways to do parallel work are threads and the new task-based library, the TPL.
The asynchronous programming you mention is nothing more than a new thread in the thread pool.
PLINQ, Rx and the others mentioned are actually extensions sitting on top of the new task scheduler.
The best article explaining the new architecture of the task scheduler and all the libraries on top of it (Visual Studio 2010 and the new .NET 4.0 task-based parallelism in the TPL) is here, by Steve Teixeira, Product Unit Manager for Parallel Developer Tools at Microsoft:
http://www.drdobbs.com/visualstudio/224400670
Otherwise, Dr. Dobb's has a dedicated parallel programming section here: http://www.drdobbs.com/go-parallel/index.jhtml
The main difference between, say, threads and the new task-based parallel programming is that you no longer need to think in terms of threads, how you manage pools, or the underlying OS and hardware. The TPL takes care of that; you just use tasks. That is a huge change in how you do parallel programming at any level, including the level of abstraction.
So in .NET you do not actually have many choices:
Thread
The new task-based approach, with the task scheduler.
Obviously the task-based approach is the way to go.
cheers
Valko
I have some embarrassingly parallel work in a .NET 3.5 console app and I want to take advantage of hyper-threading and multi-core processors. How do I pick the number of worker threads that makes the best use of either on an arbitrary system? For example, if it's a dual core I will want 2 threads; quad core I will want 4 threads. What I'm ultimately after is determining the processor characteristics so I can know how many threads to create.
I'm not asking how to split up the work nor how to do threading; I'm asking how I determine the "optimal" number of threads on an arbitrary machine this console app will run on.
I'd suggest that you don't try to determine it yourself. Use the ThreadPool and let .NET manage the threads for you.
You can use Environment.ProcessorCount if that's the only thing you're after. But usually using a ThreadPool is indeed the better option.
The .NET thread pool also has provisions for sometimes allocating more threads than you have cores to maximise throughput in certain scenarios where many threads are waiting for I/O to finish.
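For reference, a minimal sketch of both suggestions (the queued work is a placeholder): read Environment.ProcessorCount if you only need the number, or simply queue your partitions and let the ThreadPool size itself.

using System;
using System.Threading;

class WorkerCountSketch
{
    static void Main()
    {
        // Number of logical processors (hyper-threaded cores count individually).
        Console.WriteLine("Logical processors: " + Environment.ProcessorCount);

        // Or skip the arithmetic: queue the work items and let the pool decide
        // how many threads actually run them.
        int pending = 4;
        using (var allDone = new ManualResetEvent(false))
        {
            for (int i = 0; i < 4; i++)
            {
                int piece = i;   // capture the loop value, not the loop variable
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    Console.WriteLine("processing piece " + piece);   // placeholder work
                    if (Interlocked.Decrement(ref pending) == 0)
                        allDone.Set();
                });
            }
            allDone.WaitOne();
        }
    }
}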
The correct number is obviously 42.
Now, on a serious note: just use the thread pool, always.
1) If you have a lengthy processing task (i.e. CPU-intensive) that can be partitioned into multiple work pieces, then you should partition your task and submit all the individual work items to the ThreadPool. The thread pool will pick up the work items and start churning through them dynamically: it has self-monitoring capabilities that include starting new threads as needed, and it can be configured at deployment time by administrators according to the deployment site's requirements, as opposed to pre-computing the numbers at development time. While it is true that the proper partition size for your processing task can take into account the number of CPUs available, the right answer depends so much on the nature of the task and the data that it is not even worth discussing at this stage (and besides, the primary concerns should be your NUMA nodes, memory locality and interlocked cache contention, and only after that the number of cores).
2) If you're doing I/O (including DB calls), then you should use asynchronous I/O and complete the calls in ThreadPool-invoked completion routines (see the sketch after this list).
These two are the only valid reasons why you should have multiple threads, and they're both best handled by using the ThreadPool. Anything else, including starting a thread per 'request' or 'connection', is in fact an anti-pattern in the Win32 API world (fork is a valid pattern in *nix, but definitely not on Windows).
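As an illustration of the second point (a sketch only; "data.bin" is a placeholder path), asynchronous I/O hands the wait to the operating system and the completion callback then runs on an I/O thread-pool thread:

using System;
using System.IO;

class AsyncIoSketch
{
    static void Main()
    {
        // FileOptions.Asynchronous requests overlapped (truly asynchronous) I/O.
        var stream = new FileStream("data.bin", FileMode.Open, FileAccess.Read,
                                    FileShare.Read, 4096, FileOptions.Asynchronous);
        var buffer = new byte[4096];

        stream.BeginRead(buffer, 0, buffer.Length, ar =>
        {
            int bytesRead = stream.EndRead(ar);   // completes on an I/O pool thread
            Console.WriteLine("Read {0} bytes", bytesRead);
            stream.Dispose();
        }, null);

        Console.ReadLine();   // keep the process alive long enough for the demo
    }
}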
For a more specialized and way, way more detailed discussion of the topic I can only recommend the Rick Vicik papers on the subject:
designing-applications-for-high-performance-part-1.aspx
designing-applications-for-high-performance-part-ii.aspx
designing-applications-for-high-performance-part-iii.aspx
The optimal number would just be the processor count. Optimally you would always have one thread running per CPU (logical or physical) to minimise context switches and the overhead that comes with them.
Whether that is the right number depends (very much, as everyone has said) on what you are doing. The thread pool (if I understand it correctly) pretty much tries to use as few threads as possible, but spins up another one each time a thread blocks.
Blocking is never optimal, but if you are doing any form of blocking then the answer changes dramatically.
The simplest and easiest way to get good (not necessarily optimal) behaviour is to use the thread pool. In my opinion it is really hard to do better than the thread pool, so that is simply the best place to start; only think about something else if you can demonstrate why it is not good enough.
A good rule of thumb, given that you're completely CPU-bound, is processorCount + 1.
That's +1 because you will always get some tasks started/stopped/interrupted and n tasks will almost never completely fill up n processors.
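In code that is simply the following (Environment.ProcessorCount already counts logical processors, so hyper-threading is included):

// Rule of thumb for a purely CPU-bound workload.
int workerThreads = Environment.ProcessorCount + 1;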
The only way to know is a combination of data and code analysis based on performance measurements.
Different CPU families and speeds vs. memory speed vs. other activities on the system are all going to make the tuning different.
Potentially some self-tuning is possible, but this will mean having some form of live performance tuning and self adjustment.
Or even better than the ThreadPool, use .NET 4.0 Task instances from the TPL. The Task Parallel Library is built on a foundation in the .NET 4.0 framework that will actually determine the optimal number of threads to perform the tasks as efficiently as possible for you.
I read something on this recently (see the accepted answer to this question for example).
The simple answer is that you let the operating system decide. It can do a far better job of deciding what's optimal than you can.
There are a number of questions on a similar theme; searching for "optimal number threads" (without the quotes) gives you a couple of pages of results.
I would say it also depends on what you are doing. If you are making a server application, then using all you can get out of the CPUs via either Environment.ProcessorCount or a thread pool is a good idea.
But if this is running on a desktop or a machine that is not dedicated to this task, you might want to leave some CPU idle so the machine still "functions" for the user.
It can be argued that the real way to pick the best number of threads is for the application to profile itself and adaptively change its threading behavior based on what gives the best performance.
I wrote a simple number-crunching app that used multiple threads, and found that on my quad-core system it completed the most work in a fixed period using 6 threads.
I think the only real way to determine it is through trialling or profiling.
In addition to processor count, you may want to take into account the process's processor affinity by counting bits in the affinity mask returned by the GetProcessAffinityMask function.
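A sketch of that check (Process.ProcessorAffinity exposes the same affinity mask for the current process; counting its set bits gives the number of processors the process is actually allowed to use):

using System;
using System.Diagnostics;

class AffinitySketch
{
    static void Main()
    {
        ulong mask = (ulong)(long)Process.GetCurrentProcess().ProcessorAffinity;

        // Each set bit in the mask represents one logical processor the
        // process is allowed to run on.
        int allowed = 0;
        while (mask != 0)
        {
            allowed += (int)(mask & 1);
            mask >>= 1;
        }

        Console.WriteLine("Process may run on {0} of {1} logical processors.",
                          allowed, Environment.ProcessorCount);
    }
}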
If there is no excessive I/O processing and there are no heavy system calls while the threads are running, then the number of threads (excluding the main thread) should in general be equal to the number of processors/cores in your system; otherwise you can try increasing the number of threads through testing.