How many threads can my current machine handle optimally? - c#

Original Question
Is there a heuristic or algorithm to programmatically find out how many threads I can open in order to obtain maximum throughput of an async operation such as writing to a socket?
Further explained question
I'm assisting an algorithms professor at my college, and he posted an assignment where the students are supposed to learn the basics of distributed computing, in his words: Sockets... The assignment is to create a "server" that listens on a given port, receives a string, performs a simple operation on it (I think it's supposed to count its length) and returns Ok or Rejected... The "server" must be able to handle a minimum of 60k submissions per second... My job is to create a little app to simulate 60K clients...
I've managed to automate the distribution of the servers and the clients across a university lab in order to test 10 servers at a time (network infrastructure became the bottleneck). The problem here is: a lab is homogeneous, 2 labs are not! If not tuned correctly, the "client" usually can't simulate 60k users and report back to me, especially when the lab is an older one, AND I would like to provide the client to the students so they can test their own "server" more reliably... The ability to determine the optimal number of threads to spawn has now become vital! PS: Fire-and-forget is not an option because the client also tests whether the returned value is correct, e.g. if I send "Short sentence" I know the result will be "Rejected" and I have to check it...
A class has 60 students, and there's a morning class and a night class, so each week there will be 120 "servers" to test, because as the semester moves along the "server" part will have to do more stuff; the client won't (it will always only send a string and receive "Ok"/"Rejected")... So there's enough work to be done to justify all this work I'm doing...
Edit1
- Changed from Console to an async operation
- I don't want the maximum number of threads, I want the number that will provide maximum throughput! I imagine that on a 6-core PC the number will be higher than on a 2-core PC
Edit2
- I'm building a simple console app to perform some tests on another app... one of those is a specific kind of load test (RUDY attack) where I have to simulate a lot of clients performing a specific attack... The thing is that there's a curve between throughput and number of threads, where after a given point, opening more threads actually decreases my throughput...
Edit3
Added more context to the initial question...

The Windows console isn't really meant to be used by more than one thread; otherwise you get interleaved writes. So the thread count for maximum console output would be one.
It's when you're doing computation that multiple threads make sense. Then, it's rarely useful to use more than one thread per logical processor - or one background thread plus one UI thread for UI apps on a single-core processor.

It depends entirely on the situation - so the actual answer to your question of "is there a magical algorithm that will give me the perfect setup for max throughput?" is ... no.
Sure, more cores means more threads that can run and less context-switching. That said, you've edited your question to include an IO-bound example. IO-bound operations generally make use of completion ports for async operations. So, in that particular case, removing your use of your own dedicated threads for such an operation would be your main concern towards achieving maximum throughput.

Since you changed the question, I'll provide another answer.
It depends on the workload. If you're doing compute-heavy tasks, then use every logical processor. If you're doing IO, then use async calls rather than spawning new threads.
Of course, .NET has a way of managing this for you - the Thread Pool. Use it. Don't worry about how many threads you need, just kick off tasks.
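As a rough illustration of "use async calls rather than spawning new threads", the sketch below runs many asynchronous operations while capping how many are in flight at once. The BoundedAsync class, its RunAsync method and the maxInFlight parameter are hypothetical names invented for this example, not part of .NET; the work delegate stands in for whatever async send/receive the simulated client performs.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not a framework type): caps the number of concurrent async operations.
public static class BoundedAsync
{
    // Runs `work` for every item, allowing at most `maxInFlight` operations at the same time.
    public static async Task RunAsync<T>(
        IEnumerable<T> items, int maxInFlight, Func<T, Task> work)
    {
        using (var gate = new SemaphoreSlim(maxInFlight))
        {
            var running = new List<Task>();
            foreach (var item in items)
            {
                // Wait asynchronously for a free slot; no thread is blocked while waiting.
                await gate.WaitAsync();
                running.Add(RunOneAsync(item, work, gate));
            }
            // Wait for the remaining operations before the semaphore is disposed.
            await Task.WhenAll(running);
        }
    }

    private static async Task RunOneAsync<T>(T item, Func<T, Task> work, SemaphoreSlim gate)
    {
        try
        {
            await work(item);
        }
        finally
        {
            // Free the slot even if the operation failed.
            gate.Release();
        }
    }
}
```

With this shape, the number to tune is the in-flight limit rather than a thread count; the thread pool decides how many threads actually service the completions.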

If you are actually trying to do something productive (instead of just printing to the console), you should use System.Threading.Tasks.Task.Factory.StartNew. You can start as many tasks as you want. The runtime will try to distribute them amongst the available hardware threads as well as it can.

Related

Dispatching chunks of work to backgroundworkers

Using C#.
I have 100,000+ pieces of test data that need to have some calculations run with. My actual data set will be in the millions of pieces of data. The test data currently runs sequentially and takes about a minute to process. I want to split this work up and have backgroundworkers process back to back so I will hopefully get the processing done quicker.
What I have in mind is to do a foreach loop with the data and start a backgroundworker with each piece of data. I know I need to limit the number of bw's to three as I have 4 cores on this machine. I have done some testing with simple bw's but not three at the same time.
I have no idea how to go about this. How would one execute three background workers to process this data?
The BackgroundWorker is designed mostly for early learning work, and maybe the odd alternative threading scenario. What you are doing sounds like a very advanced operation. You can still use BGW, but raw Threads, Tasks, thread pools and the like would be better at this point.
There is also the general question of whether this operation can even be accelerated with multithreading. I like to say "multithreading has to pick its problems carefully". Pick it in the wrong scenario and you end up with a program that needs more memory, is more prone to errors and is slower than a single BGW or a sequential program.
Your case could be one of the rare cases of a pleasingly parallel operation. Or it could be mostly memory bound, which means you run into parallel slowdown almost instantly. Resist attempts at hardcoding the number of threads. Usually you can leave that load-balancing work to a ThreadPool. To get a better answer you need to get a lot more specific.
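If the calculations really are independent of each other, leaving the load balancing to the thread pool could look like the sketch below. BatchProcessor and ProcessAll are hypothetical names for this example; Parallel.For is the framework call (available from .NET 4) that decides how many workers to use.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: applies an independent calculation to every element in parallel.
public static class BatchProcessor
{
    public static TResult[] ProcessAll<TInput, TResult>(
        TInput[] data, Func<TInput, TResult> calculate)
    {
        var results = new TResult[data.Length];

        // Parallel.For partitions the index range across thread-pool threads.
        // Each iteration writes only to its own slot, so no locking is needed.
        Parallel.For(0, data.Length, i =>
        {
            results[i] = calculate(data[i]);
        });

        return results;
    }
}
```

Timing this against the sequential loop on the real data set is the only reliable way to find out whether the work is compute bound (and speeds up) or memory bound (and does not).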

Preventing a bottleneck in device communication

I've got quite an abstract question. I'm working on a project that requires constant device communication. I'm integrating multiple devices onto an external processing unit with a touchpanel to execute certain methods. I.e. the "start videocall" button on the touchpanel activates a relay, turns a display-device, camera-device and microphone-device on, etc.
On the flipside, I'm also trying to monitor these devices. What status do they currently have? Are they enabled/disabled ? What input is the display device currently on?
So far, I've come up with two solutions to prevent a bottleneck in the communication where I'm constantly polling (i.e. every two to five seconds to keep an accurate and up-to-date status) the on-state and input-state of the display-device.
1. Make use of threading so I can enqueue the different commands and execute them async. By also reading the response async, all communication should be nicely spaced out, but I'd have a very "busy" communication line, taking its toll on the processing unit.
2. With the help of events, have the display-device notify the processor of its changed status. This would take a lot of stress off the communication line, but I feel like this is very easily disrupted. If the device doesn't raise its events correctly (or the events are missed), the monitored state does not correspond with the actual state.
I'm curious if there are other ways of going about this issue. As of now, I'm leaning towards the second one because it stresses the processing unit a whole lot less; I just feel like I should be building in a lot of safeguards to prevent an inaccurate representation of the actual device-states.
The project runs in C# on .Net 3.5.
Polling works, but it isn't fun or optimal. Reactive is best, but as you've mentioned there may be a hiccup in ensuring you're still listening to the device and not just standing by for nothing. In this situation it makes sense to optimize both processes: poll when you're waiting or haven't heard a response in a while, and listen when your polling returns good info, pausing the polling.
That said, you shouldn't worry about taxing the unit too much with polling on various threads. This sounds like a purpose-built device, so as long as you're not running it hot or stressing it to the max all the time, using its resources is perfectly fine.
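The hybrid described above (listen for events, fall back to polling after a period of silence) might be tracked with something like the following sketch. StalenessWatchdog and its members are hypothetical names, and the syntax is kept to what .NET 3.5 supports; the caller passes in the current time so the logic stays easy to test.

```csharp
using System;

// Hypothetical helper: decides when silence from a device has lasted long enough to justify a poll.
public sealed class StalenessWatchdog
{
    private readonly object _sync = new object();
    private readonly TimeSpan _maxSilence;
    private DateTime _lastHeardUtc;

    public StalenessWatchdog(TimeSpan maxSilence, DateTime nowUtc)
    {
        _maxSilence = maxSilence;
        _lastHeardUtc = nowUtc;
    }

    // Call from the device's event handler, and after every successful poll response.
    public void NoteMessage(DateTime nowUtc)
    {
        lock (_sync) { _lastHeardUtc = nowUtc; }
    }

    // Call from a low-frequency timer; true means the device has been silent too long.
    public bool ShouldPoll(DateTime nowUtc)
    {
        lock (_sync) { return nowUtc - _lastHeardUtc >= _maxSilence; }
    }
}
```

A timer callback would check ShouldPoll(DateTime.UtcNow) and send a status query only when it returns true, so a chatty device is never polled and a silent one is checked at most once per interval.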

Running variable threads in C# application based on Processor type

I am writing a Windows Application in C# that will essentially be a multi-threaded one. But I am in a fix because this application can be run on a Celeron/P-IV system to a Core i7 system. So, I am unable to decide how to determine the number of threads to spawn for this application.
Is there any way to determine how many threads I can spawn depending on the processor used, to leverage the maximum power of the CPU while making sure my application does not lag/slow down/freeze? Is there any kind of general formula that you use?
Thanks.
I'd consider using the ThreadPool. As far as I know, the .NET framework manages the optimum number of threads itself (http://msdn.microsoft.com/en-us/library/0ka9477y.aspx).
Environment.ProcessorCount is what you're looking for.
For the second part of your question, the best way to do that is to test with different numbers of threads. It all depends on what exactly your threads are doing: they may spend a lot of time blocking, share common system resources, or hit each other's cache lines, so you can't really give a general formula for this sort of thing. It's all heavily dependent on your specific thread behavior.
To get the number of logical processors you can use System.Environment.ProcessorCount; however, it will differ from the actual processor/core count on HyperThreading-enabled systems. To get more accurate information you will need to query WMI: Win32_ComputerSystem.NumberOfProcessors (physical processors, i.e. sockets) and Win32_ComputerSystem.NumberOfLogicalProcessors; the per-processor core count is exposed as Win32_Processor.NumberOfCores.
However I would recommend to let the system take care of scheduling and use one of the high level multithreading subsystems like Tasks or ThreadPool
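For cases where a number is still wanted, the "test with different numbers of threads" advice can be automated along these lines. This is only a sketch: ThreadCountSearch, DefaultCandidates and FindBest are hypothetical names, and measureThroughput is a caller-supplied function assumed to run a short benchmark with the given thread count and return operations per second.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: picks the thread count that measured best on the current machine.
public static class ThreadCountSearch
{
    // Candidate counts 1, 2, 4, ... up to four times the logical processor count.
    // The factor of four is an arbitrary assumption for this sketch.
    public static IEnumerable<int> DefaultCandidates()
    {
        int limit = Environment.ProcessorCount * 4;
        for (int n = 1; n <= limit; n *= 2)
        {
            yield return n;
        }
    }

    // Runs the supplied measurement once per candidate and returns the best-scoring count.
    // Returns 1 if no candidates are given.
    public static int FindBest(IEnumerable<int> candidates, Func<int, double> measureThroughput)
    {
        int best = 1;
        double bestThroughput = double.NegativeInfinity;
        foreach (int n in candidates)
        {
            double throughput = measureThroughput(n);
            if (throughput > bestThroughput)
            {
                bestThroughput = throughput;
                best = n;
            }
        }
        return best;
    }
}
```

Because the result depends on the workload as much as on the hardware, the measurement has to exercise the real operation, and it is worth repeating on each class of machine.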

How many threads to use?

I know there are some existing questions and they provide a very good general perspective on things. I'm hoping to get some details on the C#/VB.Net side for the actual implementation (not philosophy) of some of these perspectives.
My Particular Case
I have a WCF Service which, amongst other things, receives files. For most of the service's life this particular area is actually just sat doing nothing - when work does come it arrives in high bursts of greatly varying quantities.
For each file received (which at a max can be thousands per second) the service needs to work on the files for between 1-10 seconds (each) depending on a number of other services, local resources, and network IO wait times.
To aid the service with these burst workloads I implemented a Queue system. Those thousands of files received per second are placed onto the Queue. A controller calculates the number of threads to use based on the size of the queue, up until it reaches a "Peak Max Threads" setting which prevents it from creating additional threads. These threads are placed in a thread pool, and reused to cycle through the queue. The controller will, at intervals, recalculate the number of threads required. If the queue size reduces, a relevant number of threads are released.
The age old problem
How many threads should I peak at? Clearly, adding a new thread every time a file is received would be silly, for lack of a better word - the performance would, at best, deteriorate. Capping the threads when CPU utilization is only 10% across each core also doesn't seem to be the best use of resources.
So, is there an appropriate way to determine how many threads to cap at? I would rather the service could determine this for itself by sampling available resources, but is there a performance hit from doing so? I know the common answer is to monitor workloads, adjust the counts through trial and error until I find a number I like, but due to the nature of this service (long periods of idle followed by high/burst workloads) it could take a long time to get that kind of information.
What then if we move the server's image to a different host which is faster/slower/different to the first? I have to re-sample the process all over again?
Ideally what I'm after, is for the co-ordinator to intelligently increase the size of the threadpool until CPU utilisation is at x% (would 80% be reasonable? 90%? 99%?). Clearly, I want to do this without adding more threads than is necessary to hit x% otherwise all I'll end up with is threads not just waiting on IO resources, but awaiting each other too.
Thanks in advance!
Related questions (if you want some generic ideas):
How many threads to create?
How many threads is too many?
How many threads to create and when?
A Complication for you
Where would be the fun if I didn't make the problem more difficult?
As it currently stands, the service does hit 100% cpu during these bursts, regularly. The issue is the CPU utilisation spikes. It goes from idle (0-10%) to 100%, and back down again. I'm not sure I can help that - ideally I wouldn't take it all the way to 100%. The problem exists because the files mentioned are in fact images, and part of the services' process is to pass the image through to the System.Windows.Media blackbox which does some complex image processing for me.
There are then lulls in between the spikes because of the IO waits and other processing that goes on. If the spikes hitting 100% can't be helped (and I'm all for knowing how to prevent that, or if I should) how should I aim for the CPU utilisation graph to look? Sat constantly at 100%? Bouncing between 50-100? If I do go through the effort of sampling to decide what does seem to work best, is it guaranteed that switching the virtual servers' host will also work best with the same graph?
This added complexity I won't take into consideration for those of you willing to answer. Feel free to ignore this section. However, any answer that also accounts for this complication, or even answers that just provide tips on how to handle it, I'll at the very least upvote!
Heck of a long question - sorry about that - and thanks for reading so much!!
PerformanceCounter allows you to query for processor usage.
However, have you tried something the framework provides?
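    // Assumes "using System.Threading.Tasks;" and that files is the collection of received items.
    foreach (var file in files)
    {
        var workitem = file; // copy the loop variable so each closure captures its own item
        Task.Factory.StartNew(() =>
        {
            // do work on workitem
        }, TaskCreationOptions.LongRunning | TaskCreationOptions.PreferFairness);
    }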
You can tune the concurrency level for Tasks in the Task.Factory.
By default, the .NET 4 thread pool schedules the number of threads it finds performs best on the hardware where it runs, but you can change how that works with the previous link.
You probably need a custom solution, but it would be worth benchmarking yours against the standard one.
Edit: (comment note):
No links needed; I may have used an invented term, since English is not my first language. What I mean is: have a variable where you store the variance from before the last check (prevDelta), and call the current one delta. Each time you 'check', add delta to the variable averageDelta and divide by 2. The variable averageDelta will mostly be low, since you have no activity. Then have another set of delta variables: take the one you already have (delta - prevDelta) and store it in a delta variable that is not the average of all deltas but the average of the deltas over a short timespan (you will have to come up with an algorithm to calculate this temporal variance accurately). Once that is done, you can compare the average delta with the 'temporal delta'. The average delta will mostly be low and will rise slowly when bursts come; over the same period the temporal delta will rise really fast. When the burst stops, the average delta falls slowly and the 'temporal' one falls really fast.
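One possible reading of the comment above is a pair of moving averages, a slow one and a fast one, with a burst flagged while the fast one runs well ahead of the slow one. The sketch below is a simplification under that assumption: it tracks the load signal itself (for example the queue length) rather than deltas, and BurstDetector, its smoothing factors and the ratio threshold are all hypothetical choices, not values from the answer.

```csharp
using System;

// Hypothetical helper: flags a burst when a fast-moving average outruns a slow-moving one.
public sealed class BurstDetector
{
    private readonly double _slowAlpha;
    private readonly double _fastAlpha;
    private readonly double _ratio;
    private double _slow;
    private double _fast;
    private bool _primed;

    // slowAlpha/fastAlpha are exponential smoothing factors in (0, 1]; the defaults are illustrative.
    public BurstDetector(double slowAlpha = 0.05, double fastAlpha = 0.5, double ratio = 2.0)
    {
        _slowAlpha = slowAlpha;
        _fastAlpha = fastAlpha;
        _ratio = ratio;
    }

    // Feed one sample per check; returns true while the recent load is far above the long-run load.
    public bool Observe(double load)
    {
        if (!_primed)
        {
            // First sample initialises both averages; there is nothing to compare yet.
            _slow = load;
            _fast = load;
            _primed = true;
            return false;
        }

        _slow += _slowAlpha * (load - _slow); // reacts slowly: long-run level
        _fast += _fastAlpha * (load - _fast); // reacts quickly: recent level
        return _fast > _ratio * _slow;
    }
}
```

When the burst persists, the slow average catches up and the flag clears, which matches the behaviour described in the comment: the long-run value moves slowly while the short-term one moves fast in both directions.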
You could use I/O Completion Ports to asynchronously fetch your images without tying up any threads until it comes time to process what you have fetched.
You could then limit your thread pool based on the number of cores on your client PC, making sure to leave a core free for other processes to use.
What about a dynamic thread manager that monitors their overall performance and according to this spawns new threads or kills old ones? The main problem here is only how to define the performance measurement function. The rest can be done with a periodically scheduled job that increases or decreases the number of threads according to the previous number of threads and performance in that case or something like that. Maybe also in connection to resources utilization (CPU, disks, network...).
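A minimal sketch of such a manager's decision step is shown below, using simple hill climbing: keep moving the worker count in the same direction while measured throughput improves, and reverse when it gets worse. HillClimbingController is a hypothetical class for this example; how throughput is measured (files per second, for instance) is left to the caller, which is exactly the hard part the answer points out.

```csharp
using System;

// Hypothetical helper: adjusts a worker count one step at a time based on measured throughput.
public sealed class HillClimbingController
{
    private readonly int _min;
    private readonly int _max;
    private int _direction = 1;                    // +1 = add a worker, -1 = remove one
    private double _lastThroughput = double.NaN;   // NaN until the first measurement arrives

    public int WorkerCount { get; private set; }

    public HillClimbingController(int min, int max, int initial)
    {
        if (min < 1 || max < min) throw new ArgumentOutOfRangeException("max");
        _min = min;
        _max = max;
        WorkerCount = Math.Min(max, Math.Max(min, initial));
    }

    // Call periodically with the throughput observed at the current WorkerCount.
    // Returns the worker count to use for the next period.
    public int Update(double throughput)
    {
        // If the previous step made things worse, turn around.
        if (!double.IsNaN(_lastThroughput) && throughput < _lastThroughput)
        {
            _direction = -_direction;
        }
        _lastThroughput = throughput;

        WorkerCount = Math.Min(_max, Math.Max(_min, WorkerCount + _direction));
        return WorkerCount;
    }
}
```

On a smooth throughput curve this settles into a small oscillation around the peak; with noisy measurements it would need averaging over several periods before each step.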

Is Threading Necessary/Useful?

Basically, I'm wondering if threading is useful or necessary, or possibly more specifically the uses and situations in which you would use it. I don't know much about threading, and have never used it (I primarily use C#) and have wondered if there are any gains to performance or stability if you use them. If anyone would be so kind to explain, I would be grateful.
In the world of desktop applications (my domain), threading is a vital construct in creating responsive user interfaces. Whenever a time-or-computationally-intensive operation needs to run, it's almost essential to run that operation in a separate thread. Otherwise, the user interface locks up and, in some cases, Windows will decide that the whole application has become unresponsive.
Threading is also a vital tool in animation, audio and communications. Basically, any situation in which you find yourself needing to do several things at once lends itself to the use of threads.
There are definitely no gains in stability :). I would suggest you get a basic understanding of threading, but don't jump to using it in any real production application until you have a real need. You use C#, so I'm not sure if you are building websites or WinForms apps.
Usually the first threading use case for WinForms is when a user clicks a button and you want to run some expensive operation (a database or web service call) but you don't want the screen to freeze up.
A good way to learn to deal with that situation is to look at the BackgroundWorker class in C#, as this will give you a first taste of this space, and then you can go from there.
There was a time when our applications would speed up when we deployed them on a new CPU. That speed-up was to a large extent because CPU clock speeds were increasing by large factors.
But several years ago, CPU manufacturers stopped increasing CPU clocks because of physical limits (e.g. heat dissipation). Instead, they started adding additional cores to CPUs.
Now, if your application runs on only one thread, it cannot take advantage of the complete CPU (e.g. of 4 cores it uses only 1).
So today, to fully utilize the CPU, we must make the effort to divide the task across multiple threads.
For ASP.NET this is already done for us by ASP.NET architecture and IIS.
See: The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software
Here is a simple example of how threading can improve performance. You have n numbers that all need to be added together. In a single-threaded application, it will take n time units to add all of the numbers together for the final sum. However, if you broke your numbers into 2 groups, you could have the same operation running side by side, each with a group of n/2 numbers. Each would take n/2 time units to find its respective sum, and then an additional unit to find the full sum. By creating two threads, you have effectively cut the compute time in half.
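The two-group sum described above could be sketched like this. SplitSum is a hypothetical class name; Task.Run is used to hand each half to a thread-pool thread, and the real speed-up depends on having at least two cores and enough numbers to outweigh the cost of starting the tasks.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: sums an array by adding its two halves concurrently.
public static class SplitSum
{
    public static long Sum(int[] numbers)
    {
        int mid = numbers.Length / 2;

        // Each half is summed on its own thread-pool thread.
        Task<long> firstHalf = Task.Run(() => SumRange(numbers, 0, mid));
        Task<long> secondHalf = Task.Run(() => SumRange(numbers, mid, numbers.Length));

        // The final "additional unit" of work: combine the two partial sums.
        return firstHalf.Result + secondHalf.Result;
    }

    // Adds numbers[start] up to, but not including, numbers[end].
    private static long SumRange(int[] numbers, int start, int end)
    {
        long total = 0;
        for (int i = start; i < end; i++)
        {
            total += numbers[i];
        }
        return total;
    }
}
```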
Technically on a single core processor, there is no such thing as multi-threading, just the illusion that multiple tasks are happening in parallel since each task gets a small amount of time.
However, that being said, threading is very useful if you have to do some work that takes a long time but you want your application to be responsive (i.e. be able to do other things) while you wait for that task to finish. A good example is GUI applications.
On multi-core / multi-processor systems, you can have one process doing many things at once so the performance gain there is obvious :)
