Task parallel library - Parallelism on single core

Task parallel library - Parallelism on single core - c#

I am working on a WPF application.
In a screen/View i have to make 6 calls to a WCF service. None of those calls are related in the sense they dont share data neither are they dependent on each other. I am planning to use TPL and make these 6 WCF service calls as 6 tasks. Now the application might be either deployed on a single core machine or multiple core machine.
I am being told that usage of TPL on single core machine would actually increase the time take for the tasks to complete because of the overhead that would be placed on the cpu scheduler to time splice different tasks. Is this true. If yes should i still continue with my design or should i look at alternatives.
if i have to look at alternatives, what are those alternatives :) ?

When doing something CPU intensive, you would be adding overhead by running parallel threads on a single core machine.
In your case the tasks are not CPU intensive, they are waiting for a service call to respond, so you can very well run parallel threads on a single core machine.
Depending on how the server handles the calls, there might not be any time increase anyway. If the calls are queued on the server, it will take about the same time to run all calls anyway. In that case it would be better to run the calls in sequence, just because it's simpler.

Your best bet is to profile using multi-core and single core. Most bios's can set the number of active core's so it shouldn't be a big problem. You can do some mock testing to find out if it will work for you.
Obviously using task switching has overhead issues but as long as each task's time is much longer than the setup time you won't notice it.
There are many ways to implement multi-tasking behavior and if you do not know which is best then chances are you need to actually write some test cases and do some profiling. This is not difficult to do. If you are simply trying to use multi-core systems then it generally is quite easy with the latest version of .NET and you can even set it up for multi-core but revert back to single core by using appropriate constructs.
the async/await pattern, for example, can easily be ran synchronously by either using #ifdef or removing all await keywords(with a search and replace tool). Parallel.For loops are easily convertible to normal for loops either directly or by changing MaxDegreeOfParallelism. Tasks can easily be ran synchronously.
If you would like to make it more transparent you could use some pre-processing scripting like T4.

In general, When running multi threads on single core it will be slower since it has Context Switch between the threads.
I think the following diagram will explain you the difference:
As you can see the diagram refer to 4 threads running on single core, first time in multi-tasking and the second time Sequential.
you can see that in multi-tasking all threads will finish at a later time than Sequential tasking.
In your specific case in probably won't be the same and I think #Guffa is right in his answer since its involving WCF calling

Related

Using ThreadPool threads in library

I'm working on a .net core library that will get used mostly in web apps. This library is being built with performance in mind as this is the main design decision. There is some code that is fairly heavy and due to this, will get cached so that subsequent calls are quick. As you can imagine, the first call is slower and I don't want that. I want to execute this code at the earliest possible time to warm up the cache without affecting the other operations. I was thinking of using Task.Start() without awaiting to to achieve this.
My question is, is it frowned upon to use threadpool threads in a library, i.e what is the etiquette on this? As this will be mostly used on web apps, I feel I don't want to interfere with the client's threadpool. That being said, the library will only use one background thread and this will be less than a second. Or should I just let the client take the performance hit for first calls?

If I understand you correctly; it's perfectly legitimate to use multi-threading in a library; as a matter of fact: it happens all the time.
Basically, a lot of async Task methods do this in one way or another. (Sometimes there is no thread)
If it's so heavy you need multiple parallel threads for a long period in time, than it's best to create an explicit initialize routine, and warn the caller in the docs.
Task.Run is typically used for such processing.

Proper understanding of Tasks

At the risk of asking a stupid question (and I will voluntarily delete the question myself if my peers think it is a stupid question)..
I have a C# desktop app.
I upload data to my server using a WCF Service.
I am experimenting with using Tasks.
This code calls my web service...
Task t = Task.Run(() => { wcf.UploadMotionDynamicRaw(bytes); });
I am stress testing this line of code.
I call it as many times in 1 second for a period of X time.
Will this 'flood' my router if the internet is slow for whatever reason?
I can sort of guess that this will be the case...
So, how can I test whether the task has completed before calling it again? In doing this will the extra plumbing slow down my speed gains by using Task?
Finally, is using Tasks making usage of multiple cores?

Will this 'flood' my router if the internet is slow for whatever
reason?
This depends on the size of the file you are uploading and you connection speed. To figure that out, just run .
So, how can I test whether the task has completed before calling it
again?
You can use Task.ContinueWith function (any of available overloads) to "catch" task completion and run some other method, may be in recursion too.
In doing this will the extra plumbing slow down my speed gains by
using Task?
It depends on workload, your processor and timing you expect. So, in other words, run it to measure, there is no possible generic answer to this.
is using Tasks making usage of multiple cores?
Yes, whenever it figures out it is possible. Running single task one after another will not spread the single function work on multiple cores. For this you need to use Parallel.For and similar artifacts. And again, .NET does not provide you with a mechanism for SIMD orchestration, so you are not guaranteed that it will run on multicores, but most probably will.

Why set the ProcessorAffinity for a thread?

I've been unable to find a good explanation as to why a multithreaded executable would want to set the ProcessorAffinity per thread. To me, it seems like this is trying to override the CLR/Operating system; something I don't think I'm smart enough to be doing.
Why would I want to get involved in setting the ProcessorAffinity for threads on a multi-core system?

If you tell a thread to run with a non-set affinity, then it'll be allowed to run on any core. This means however, that when one core is busy, it'll move your thread onto a different core, this stopping and possible moving is called a Context Switch. In most cases you'll never notice it, however, in cases like gaming consoles, context switch's can be a surprisingly expensive process.
In these cases it might be better to move something like the audio loop and the video loop onto "private" core's where they are locked to that core, and as such won't switch, giving possible optimisations.

Only very specific types of applications really benefit from the use of manual thread affinity, mostly applications with long running parallel processes. I could imagine it being used in virus scanners, or math heavy applications like Seti#Home.
Another theoretical advantage is that the processor can make use of its cache if you have small processes that run multiple times. But again, in reality you'd need a really specific type of application to make the difference noticable.
I have never had the need to bother with it. Usually the operating system knows best.

Processor caching.
And can use it to throttle.
Might have lower priority process that you don't want to dominate.
On a 4 processor machine could limit it to one processor.
Throttle can also be done with thread priority.
Would only use this if the process benefits from caching.
I like it because in task manager I can see it hammering one CPU.

When to use Multithread?

When do you use threads in a application? For example, in simple CRUD operations, use of smtp, calling webservices that may take a few time if the server is facing bandwith issues, etc.
To be honest, i don't know how to determine if i need to use a thread (i know that it must be when we're excepting that a operation will take a few time to be done).
This may be a "noob" question but it'll be great if you share with me your experience in threads.
Thanks

I added C# and .NET tags to your question because you mention C# in your title. If that is not accurate, feel free to remove the tags.
There are different styles of multithreading. For example, there are asynchronous operations with callback functions. .NET 4 introduces the parallel Linq library. The style of multithreading you would use, or whether to use any at all, depends on what you are trying to accomplish.
Parallel execution, such as what parallel Linq would generally be trying to do, takes advantage of multiple processor cores executing instructions that do not need to wait for data from each other. There are many sources for such algorithms outside Linq, such as this. However, it is possible that parallel execution may be unable to you or that it does not suit your application.
More traditional multithreading takes advantage of threading within the .NET library (in this case) as provided by System.Thread. Remember that there is some overhead in starting processes on threads, so only use threads when the advantages of doing so outweigh this overhead. Generally speaking, you would only want to use this type of single-processor multithreading when the task running under the thread will have long gaps in which the processor could be doing something else. For example, I/O from hard disk (and, consequently, from a database system that uses one) is many orders of magnitude slower than memory access. Network access can also be slow, as another example. Multithreading could allow another process to be running while waiting for these slow (compared to the processor) operations to complete.
Another example when I have used traditional multithreading is to cache some values the first time a particular ASP.NET page is accessed within a session. I kick off a thread so that the user does not have to wait for the caching to complete before interacting with the page. I also regulate the behavior when the caching does not complete before the user requests another page so that, if the caching does not complete, it is not a problem. It simply makes some further requests faster that were previously too slow.
Consider also the cost that multithreading has to the maintainability of your application. Threaded applications can be harder to debug, for example.
I hope this answers your question at least somewhat.

Joseph Albahari summarized it very well here:
Maintaining a responsive user interface
Making efficient use of an otherwise blocked CPU
Parallel programming
Speculative execution
Allowing requests to be processed simultaneously

One reason to use threads is to split large, CPU-bound tasks across a number of CPUs/cores, to finish faster. Another is to let an extended task execute asynchronously, so the foreground can remain responsive while it runs.
Your examples seem to be concentrating on the second of these. While it can be a good reason, if you can use asynchronous I/O instead, that's usually preferable (e.g., almost anything using sockets can/will be better off using the socket(s) asynchronously). Asynchronous I/O is easier to cancel, and it'll usually have lower CPU overhead as well.

You can use threads when you need different execution paths. This leads(when done correctly) to more responsive and/or faster applications but also leads to more complex code and debugging.
In a simple CRUD scenario maybe is not that useful, but maybe your UI is consuming a slow web service. If you your code is tied to your UI thread you will have unresponsive UI between the service calls.
In that case, using System.Threading.Threads maybe be overkill because you don't need so much control. Using a BackgrounWorker maybe a better choice.
Threading is something difficult to master, but the benefits when used correctly are huge, performance is the most common.

Somehow you have answered your question by yourself. Using threads whenever you execute time consuming operations is right choice. Also you should it in situations when you want to make things faster. For example you want to process some amount of files - each file can be processed by different thread.
By using threads you can better utilize power of multi-core/processor machines.
Monitoring some data in background of your application.
There are dozens of such scenarios.

Realising my comment might suffice as an answer ...
I like to view multi-threading scenarios from a resource perspective. In other words, UI (graphics), networking, disk IO, CPU (cores), RAM etc. I find that helps when deciding where to use multi-threading in the general sense at least.
The reasoning behind this is simply that I can take advantage of one resource on a specific thread (eg. Disk IO) while at the same time using another thread to accomplish something else using a different resource.

Appropriate Multi-Threading Option

Scenario
I have a very heavy number-crunching process that pools large datasets from 3 different databases and then does a bit of processing on each to eventually produce a result.
This process is fine if it is only used by a single asset. However I now have 3500 assets that I need to process, which takes about 1hr30mins in the state of the current process.
Question
What is my best option for speeding this process up in terms of a multi-threaded c# application? Realistically I don't have to share anything between the processing of each asset, so I'm confident that being able to run process multiple assets at a time shouldn't cause too many issues.
Thoughts
I've heard good things about thread pools, but I guess realistically I want something that isn't too huge to implement, is easily understandable and can run off a decent number of threads at a time.
Help would be greatly appreciated.

In .net you can use the existing Thread Pool, no need to implement one yourself. Here is the relevant MSDN.
You should take care not to run too many processes at once (3500 are a bit much), but using the supplied queuing mechanism should get you started in the right direction.
Another thing to try is using PLINQ.

If you don't have a multi-core processor, multiple machines, and/or the thread processes are not I/O bound, multithreading will not help. Start by profiling the current processing to see where the time is going.
Thread pools are fine, and you can use a task queue to do simple load-balancing, but if there's no spare CPU cycles in the current application this would be a waste of time.

The nicest option would be to use the new Task Parallel Library in .NET 4, if you can do this using VS 2010 RC. This has built-in load balancing and work stealing queues, so it will make this task easy to thread, and very scalable.
However, if you need to do this in .NET 3.5, I would recommend using the ThreadPool, and just using ThreadPool.QueueUserWorkItem to start each task.
If your tasks are all very computationally intensive for their entire lifetime, you may want to prevent having too many running concurrently. Some form of queue, which you pull work from and execute, can be beneficial in this case. Just place all of your work items into a queue, and have threads pull work from the queue (with appropriate locking), and process.
If you have a multi-core system, and CPU cycles are your bottleneck, this should scale very well.

The .Net built in ThreadPool will solve both of your requirements of running a decent number of threads as well as being simple to work with. I have previously written an article on the subject which you can find here.

With using SQL Server 2005 or later, you can create user-defined functions in C# and use them from within T-SQL procedures, which can give a marked speedup for number crunching. SQL Server is multi-threaded and does a good job with it, so consider keeping as much of the processing in the database engine as you can.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.