How can I run applications at a fixed frequency? - C#

Is there a way to make an application, or a thread, run at a fixed rate?
I'm trying to do some deterministic simulations between networked clients and would like both machines (Windows) to run or process the data at a fixed, unchanging rate. Is this possible?

You can't make an existing application run at a particular speed (there could be VM-based solutions that normalize execution speed, but I'm not aware of any myself).
If you're writing your own code, the usual approach is basically to sleep between iterations. This is commonly done for (simple) games, where there is less processing to do than there is CPU power.
Pseudocode:
while (true)
{
    ExecuteStep();
    var delay = GetTimeForNextStep() - DateTime.UtcNow;
    if (delay > TimeSpan.Zero)
        await Task.Delay(delay);   // skip the wait entirely if the step overran its slot
}
Note that precise synchronization is not possible on a consumer-grade OS (Windows/Linux/macOS); you need an RTOS for precise millisecond-level timing.

Related

Limits of (soft real-time) timing requirements in Windows OS

In the company I work for we build machines which are controlled by software running on Windows. A C# application communicates with a bus controller (via a DLL). The bus controller runs at a takt time of 15 ms; that means we get updates of the actual sensors in the system with a heartbeat of 15 ms from the bus controller (which is real time).
Now the machines are evolving into a next generation, where we get a new bus controller which runs at a takt of 1 ms. Since everybody realizes that Windows is not a real-time OS, the question arises: should we move the controlling part of the software to a real-time application (on a real-time OS, e.g. a (soft) PLC)?
If we stay on the Windows platform, we do not have guaranteed responsiveness. That in itself is not necessarily a problem; if we miss a few bus cycles (have a few hiccups), the machine will just produce slightly slower (which is acceptable).
The part that worries me is thread synchronization between the main machine-controlling thread and the updates we receive from the real-time controller (every millisecond).
Where can I learn more about how Windows / .NET C# behaves when it goes down the path of thread synchronization at the millisecond level? I know that e.g. Thread.Sleep(1) can take up to 15 ms because Windows preempts other tasks, so how does this play out when I synchronize between two threads with Monitor.PulseAll every millisecond? Can I expect the same unpredictable behavior? Am I asking for trouble by moving into the soft real-time requirement of 1 ms in a Windows application?
I hope somebody with experience on these aspects of threading can shed some light on this. If I need to clarify more, by all means, shoot.
Your scenario sounds like a candidate for a kiosk-mode/dedicated application.
In the company I work for we build machines which are controlled by software running on Windows OS.
If so, you could rig the machines such that your low-latency I/O thread could run on a dedicated core with thread and process priorities maximized. Furthermore, ensure the machine has enough cores to handle a buffering thread as well as any others that process your data in transit. The buffer should allocate memory upfront if possible to avoid garbage collection bottlenecks.
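For the affinity part, a rough sketch of what that rigging could look like (the mask value is an assumption; note that .NET easily exposes only process-wide affinity, and pinning a single managed thread to a core takes extra work via ProcessThread):

using System;
using System.Diagnostics;

var proc = Process.GetCurrentProcess();
proc.PriorityClass = ProcessPriorityClass.RealTime; // maximize process priority
proc.ProcessorAffinity = (IntPtr)0x1;               // pin the whole process to core 0 (assumed mask)
Console.WriteLine($"Affinity mask: {proc.ProcessorAffinity}, priority: {proc.PriorityClass}");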
Aron's example is good for situations where data integrity can be compromised to a certain extent. In audio, latency matters a lot during recording for multiple reasons, but for pure playback, data loss is acceptable to a certain degree. I am assuming this is not an option in your case.
Of course Windows is not designed to be a real-time OS, but if you are using it for a dedicated app, you have control over every aspect of it and can turn off all unrelated services and background processes.
I have had a reasonable amount of success writing software to monitor how well UPS units cope with power fluctuations by measuring their power compensation response times (disclaimer: not for commercial purposes though). Since the data to measure per sample was very small, the GC was not problematic and we cycled pre-allocated memory blocks for buffers.
Some micro-optimizations that came in handy:
Using immutable structs to poll I/O data (see the sketch after this list).
Optimizing data structures to work well with memory allocation.
Optimizing processing algorithms to minimize CPU cache misses.
Using an optimized buffer class to hold data in transit.
Using the Monitor and Interlocked classes for synchronization.
Using unsafe code with (void*) to gain easy access to buffer arrays in various ways to decrease processing time. Minimal use of Marshal and Buffer.BlockCopy.
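As a rough illustration of the first and fourth items (all names here are mine, not from any real project):

// Immutable struct for polling I/O data: a value type, so polling allocates
// nothing on the heap, and a sample cannot be mutated after it is read.
readonly struct Sample
{
    public readonly long TimestampTicks;
    public readonly double Value;
    public Sample(long timestampTicks, double value)
    {
        TimestampTicks = timestampTicks;
        Value = value;
    }
}

// Pre-allocated ring buffer for data in transit: all memory is claimed
// upfront, so the garbage collector has nothing to do mid-measurement.
sealed class SampleRing
{
    private readonly Sample[] _slots;
    private int _head;
    public SampleRing(int capacity) { _slots = new Sample[capacity]; }
    public void Write(in Sample s)
    {
        _slots[_head] = s;
        _head = (_head + 1) % _slots.Length;
    }
}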
Lastly, you could go the DDK way and write a small driver. Albeit off-topic, DFMirage is a good example of a video driver that provides both an event-based and a polling model for differential screen capture, such that the consumer application can choose on the fly based on system load.
As for Thread.Sleep, use it as sparingly as your energy consumption boundaries allow. With redundant processes out of the way, Thread.Sleep(1) should not be as bad as you think. Try the following to see what you get. Note that this was coded in the SO editor, so I may have made mistakes.
using System;
using System.Diagnostics;
using System.Threading;

Thread.CurrentThread.Priority = ThreadPriority.Highest;
Process.GetCurrentProcess().PriorityClass = ProcessPriorityClass.RealTime;

var ticks = 0L;
var iteration = 0D;
var timer = new Stopwatch();
do
{
    iteration++;
    timer.Restart();
    Thread.Sleep(1);   // measure how long a nominal 1 ms sleep actually takes
    timer.Stop();
    ticks += timer.Elapsed.Ticks;
    if (Console.KeyAvailable) { if (Console.ReadKey(true).Key == ConsoleKey.Escape) { break; } }
    Console.WriteLine("Elapsed (ms): Last Iteration = {0:N2}, Average = {1:N2}.",
        timer.Elapsed.TotalMilliseconds, TimeSpan.FromTicks((long)(ticks / iteration)).TotalMilliseconds);
}
while (true);

Console.WriteLine();
Console.WriteLine();
Console.Write("Press any key to continue...");
Console.ReadKey(true);
Coming back to the actual problem itself: processing data every 1 ms is pretty easy. When considering audio recording as an analogous (pun not intended) problem, you might be able to find some inspiration in how to achieve your goals.
Bear in mind:
Even a modest setup can achieve a 44.1 kHz @ 16-bit-per-channel sampling rate (roughly one sample every 23 microseconds, a small fraction of your 1 ms target).
Using ASIO you can achieve sub-10 ms latencies.
Most methods of achieving high sampling rates work by increasing your buffer size and sending data to your system in batches.
To achieve the best throughput, don't use threads; use DMA and interrupts to call back into your processing loop.
Given that sound cards can routinely achieve your goals, you should have a chance.

How many threads can my current machine handle optimally?

Original Question
Is there a heuristic or algorithm to programmatically find out how many threads I can open in order to obtain maximum throughput of an async operation such as writing to a socket?
Further explained question
I'm assisting an algorithms professor in my college and he posted an assignment where the students are supposed to learn the basics of distributed computing, in his words: Sockets... The assignment is to create a "server" that listens on a given port, receives a string, performs a simple operation on it (I think it's supposed to count its length) and returns Ok or Rejected... The "server" must be able to handle a minimum of 60k submissions per second... My job is to create a little app to simulate 60k clients...
I've managed to automate the distribution of the servers and the clients across a university lab in order to test 10 servers at a time (network infrastructure became the bottleneck). The problem here is: a lab is homogeneous, 2 labs are not! If not tuned correctly, the "client" usually can't simulate 60k users and report back to me, especially when the lab is an older one, AND I would like to provide the client to the students so they could test their own "server" more reliably... The ability to determine the optimal number of threads to spawn has now become vital! PS: Fire-and-forget is not an option, because the client also tests whether the returned value is correct; e.g. if I send "Short sentence" I know the result will be "Rejected" and I have to check it...
A class has 60 students... and there's the morning class and the night class, so each week there will be 120 "servers" to test, because as the semester moves along the "server" part will have to do more stuff, while the client won't (it will always only send a string and receive "Ok"/"Rejected")... So there's enough work to be done to justify all the effort I'm putting in...
Edit1
- Changed from Console to an async operation
- I don't want the maximum number of threads, I want the number that will provide maximum throughput! I imagine that on a 6-core PC the number will be higher than on a 2-core PC
Edit2
- I'm building a simple console app to perform some tests against another app... one of those is a specific kind of load test (RUDY attack) where I have to simulate a lot of clients performing a specific attack... The thing is that there's a curve between throughput and number of threads, where after a given point, opening more threads actually decreases my throughput...
Edit3
Added more context to the initial question...
The Windows console isn't really meant to be used by more than one thread; otherwise you get interleaved writes. So the thread count for maximum console output would be one.
It's when you're doing computation that multiple threads make sense. Then it's rarely useful to use more than one thread per logical processor - or one background thread plus one UI thread for UI apps on a single-core processor.
It depends entirely on the situation - so the actual answer to your question of "is there a magical algorithm that will give me the perfect setup for max throughput?" is ... no.
Sure, more cores means more threads that can run and less context-switching. That said, you've edited your question to include an IO-bound example. IO-bound operations generally make use of completion ports for async operations. So, in that particular case, removing your use of your own dedicated threads for such an operation would be your main concern towards achieving maximum throughput.
Since you changed the question, I'll provide another answer.
It depends on the workload. If you're doing compute-heavy tasks, then use every logical processor. If you're doing IO, then use async calls rather than spawning new threads.
Of course, .NET has a way of managing this for you - the Thread Pool. Use it. Don't worry about how many threads you need, just kick off tasks.
If you are actually trying to do something productive (instead of just printing to the console), you should use System.Threading.Tasks.Task.Factory.StartNew. You can start as many tasks as you want. The runtime will try to distribute them amongst the available hardware threads as well as it can.
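To make that concrete, here is a minimal sketch of an async client simulator (the host, port, payload and reply protocol are assumptions for illustration; a real 60k-client run would also need to batch the connections to avoid exhausting ephemeral ports):

using System;
using System.Linq;
using System.Net.Sockets;
using System.Text;
using System.Threading.Tasks;

class ClientSimulator
{
    // One simulated client: connect, send the string, verify the reply.
    static async Task<bool> RunClientAsync(string host, int port, string payload)
    {
        using var client = new TcpClient();
        await client.ConnectAsync(host, port);
        var stream = client.GetStream();
        var request = Encoding.ASCII.GetBytes(payload);
        await stream.WriteAsync(request, 0, request.Length);

        var buffer = new byte[256];
        int read = await stream.ReadAsync(buffer, 0, buffer.Length);
        var reply = Encoding.ASCII.GetString(buffer, 0, read);
        return reply == "Rejected";   // the expected answer for this payload, per the question
    }

    static async Task Main()
    {
        const int Clients = 60_000;
        var tasks = new Task<bool>[Clients];
        for (int i = 0; i < Clients; i++)
            tasks[i] = RunClientAsync("127.0.0.1", 9000, "Short sentence");

        var results = await Task.WhenAll(tasks);
        Console.WriteLine($"Correct replies: {results.Count(r => r)} of {Clients}");
    }
}

The point is that no dedicated thread sits blocked per client; the awaits ride on IO completion ports, which is exactly the "remove your own dedicated threads" advice above.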

How many threads to use?

I know there are some existing questions and they provide a very good general perspective on things. I'm hoping to get some details on the C#/VB.Net side for the actual implementation (not philosophy) of some of these perspectives.
My Particular Case
I have a WCF service which, amongst other things, receives files. For most of the service's life this particular area actually just sits doing nothing - when work does come it arrives in high bursts of greatly varying quantities.
For each file received (which at peak can be thousands per second) the service needs to work on the file for between 1 and 10 seconds, depending on a number of other services, local resources, and network IO wait times.
To aid the service with these burst workloads I implemented a queue system. The thousands of files received per second are placed onto the queue. A controller calculates the number of threads to use based on the size of the queue, up until it reaches a "Peak Max Threads" setting which prevents it from creating additional threads. These threads are placed in a thread pool and reused to cycle through the queue. The controller will, at intervals, recalculate the number of threads required. If the queue size reduces, a corresponding number of threads is released.
The age old problem
How many threads should I peak at? Clearly, adding a new thread every time a file is received would be silly, for lack of a better word - the performance, at best, would deteriorate. But capping the threads when CPU utilization is only 10% across each core also doesn't seem the best use of resources.
So, is there an appropriate way to determine how many threads to cap at? I would rather the service could determine this for itself by sampling available resources, but is there a performance hit from doing so? I know the common answer is to monitor workloads, adjust the counts through trial and error until I find a number I like, but due to the nature of this service (long periods of idle followed by high/burst workloads) it could take a long time to get that kind of information.
What then if we move the server's image to a different host which is faster/slower/different from the first? Do I have to re-sample the process all over again?
Ideally what I'm after, is for the co-ordinator to intelligently increase the size of the threadpool until CPU utilisation is at x% (would 80% be reasonable? 90%? 99%?). Clearly, I want to do this without adding more threads than is necessary to hit x% otherwise all I'll end up with is threads not just waiting on IO resources, but awaiting each other too.
Thanks in advance!
Related questions (if you want some generic ideas):
How many threads to create?
How many threads is too many?
How many threads to create and when?
A Complication for you
Where would be the fun if I didn't make the problem more difficult?
As it currently stands, the service regularly does hit 100% CPU during these bursts. The issue is that the CPU utilisation spikes: it goes from idle (0-10%) to 100% and back down again. I'm not sure I can help that - ideally I wouldn't take it all the way to 100%. The problem exists because the files mentioned are in fact images, and part of the service's process is to pass the image through to the System.Windows.Media black box, which does some complex image processing for me.
There are then lulls in between the spikes because of the IO waits and other processing that goes on. If the spikes hitting 100% can't be helped (and I'm all for knowing how to prevent that, or if I should) how should I aim for the CPU utilisation graph to look? Sat constantly at 100%? Bouncing between 50-100? If I do go through the effort of sampling to decide what does seem to work best, is it guaranteed that switching the virtual servers' host will also work best with the same graph?
This added complexity I won't take into consideration for those of you willing to answer. Feel free to ignore this section. However, any answer that also accounts for this complication, or even answers that just provide tips on how to handle it, I'll at the very least upvote!
Heck of a long question - sorry about that - and thanks for reading so much!!
PerformanceCounter allows you to query for processor usage.
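For example, sampling total CPU usage could look like this (the counter names are the standard English ones; note that the first reading of a rate counter is always 0, hence the priming call):

using System;
using System.Diagnostics;
using System.Threading;

var cpu = new PerformanceCounter("Processor", "% Processor Time", "_Total");
cpu.NextValue();        // prime the counter; rate counters need two samples
Thread.Sleep(1000);     // wait one sampling interval
Console.WriteLine($"CPU: {cpu.NextValue():F1}%");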
However, have you tried something the framework provides?
foreach (var file in files)
{
    var workitem = file;
    Task.Factory.StartNew(() =>
    {
        // do work on workitem
    }, TaskCreationOptions.LongRunning | TaskCreationOptions.PreferFairness);
}
You can tune the concurrency level for Tasks in the Task.Factory.
The .NET 4 thread pool will by default schedule the number of threads it finds most performant on the hardware where it runs, but you can change how that works via the previous link.
You probably need a custom solution, but it would be worthwhile to benchmark yours against the standard one.
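If you do want an explicit cap, one way (my assumption of what "tuning the concurrency level" can look like; the cap of 4 is arbitrary) is to hand the factory a scheduler with a fixed degree of parallelism:

using System;
using System.Threading.Tasks;

var pair = new ConcurrentExclusiveSchedulerPair(TaskScheduler.Default, maxConcurrencyLevel: 4);
var factory = new TaskFactory(pair.ConcurrentScheduler);

for (int i = 0; i < 16; i++)
{
    int id = i;   // capture a copy for the closure
    factory.StartNew(() => Console.WriteLine($"Work item {id}"));   // at most 4 run at once
}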
Edit: (comment note):
No links needed; I may have used an invented term, since English is not my language. What I mean is: keep a variable holding the change since the last check (delta), with the previous value in prevDelta. Each time you check, fold delta into a running averageDelta by adding it and dividing by 2. averageDelta will stay mostly low, since there is no activity most of the time. Then keep a second average over only the deltas in a small timespan (you will have to come up with an algorithm to calculate this temporal variance accurately). Comparing the two tells you where you are: when a burst arrives, the temporal delta rises really fast while averageDelta climbs slowly; when the burst stops, the temporal delta drops really fast while averageDelta falls slowly.
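A minimal sketch of that burst detector (the names and smoothing factors are my assumptions; only the averageDelta formula comes from the comment above):

// Feed this one delta per check interval.
class BurstDetector
{
    double averageDelta;    // long-term average: climbs and falls slowly
    double temporalDelta;   // short-window average: reacts quickly

    public void Record(double delta)
    {
        averageDelta = (averageDelta + delta) / 2.0;        // as described above
        temporalDelta = 0.2 * temporalDelta + 0.8 * delta;  // assumed faster window
    }

    // A burst is in progress while the short-window average races ahead of
    // the long-term one; the factor of 2 is an arbitrary threshold.
    public bool BurstInProgress => temporalDelta > 2.0 * averageDelta;
}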
You could use I/O Completion Ports to asynchronously fetch your images without tying up any threads until it comes time to process what you have fetched.
You could then limit your thread pool based on the number of cores on your client PC, making sure to leave a core free for other processes to use.
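A minimal sketch of that cap (the "leave one core free" policy is from this answer; the rest is my assumption of how to apply it to the managed thread pool):

using System;
using System.Threading;

int cores = Environment.ProcessorCount;
int workerCap = Math.Max(1, cores - 1);            // leave a core free for other processes
ThreadPool.GetMaxThreads(out _, out int ioMax);    // keep the IO completion-port cap unchanged
ThreadPool.SetMinThreads(workerCap, 1);            // the max cannot go below the min, so lower the min first
bool applied = ThreadPool.SetMaxThreads(workerCap, ioMax);
Console.WriteLine($"Worker cap of {workerCap} applied: {applied}");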
What about a dynamic thread manager that monitors the threads' overall performance and, based on it, spawns new threads or kills old ones? The main problem here is only how to define the performance measurement function. The rest can be done with a periodically scheduled job that increases or decreases the number of threads according to the previous thread count and the measured performance, perhaps also in connection with resource utilization (CPU, disks, network...).

Is Threading Necessary/Useful?

Basically, I'm wondering if threading is useful or necessary, or possibly more specifically the uses and situations in which you would use it. I don't know much about threading, and have never used it (I primarily use C#) and have wondered if there are any gains to performance or stability if you use them. If anyone would be so kind to explain, I would be grateful.
In the world of desktop applications (my domain), threading is a vital construct in creating responsive user interfaces. Whenever a time-or-computationally-intensive operation needs to run, it's almost essential to run that operation in a separate thread. Otherwise, the user interface locks up and, in some cases, Windows will decide that the whole application has become unresponsive.
Threading is also a vital tool in animation, audio and communications. Basically, any situation in which you find yourself needing to do several things at once lends itself to the use of threads.
There are definitely no gains to stability :). I would suggest you get a basic understanding of threading, but don't jump to use it in any real production application until you have a real need. You have C#, so I'm not sure if you are building websites or WinForms.
Usually the first threading use case in WinForms is when a user clicks a button and you want to run some expensive operation (a database or web-service call), but you don't want the screen to freeze up.
A good way into that situation is to look at the BackgroundWorker class in C#, as it will give you a first flavor of this space, and then you can go on from there.
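A minimal BackgroundWorker sketch for that button-click scenario (button1_Click, label1 and CallExpensiveService are hypothetical placeholders, not real API names):

using System;
using System.ComponentModel;

void button1_Click(object sender, EventArgs e)
{
    var worker = new BackgroundWorker();
    worker.DoWork += (s, args) =>
        args.Result = CallExpensiveService();   // runs on a pool thread; the UI stays responsive
    worker.RunWorkerCompleted += (s, args) =>
        label1.Text = (string)args.Result;      // raised back on the UI thread; safe to touch controls
    worker.RunWorkerAsync();                    // returns immediately
}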
There was a time when our applications would speed up just by being deployed on a new CPU, and that speedup happened to a large extent because CPU speeds (clocks) were incremented by large factors.
But several years ago, CPU manufacturers stopped increasing CPU clocks because of physical limits (e.g. heat dissipation), and instead started adding additional cores to CPUs.
Now, if your application runs on only one thread, it cannot take advantage of the complete CPU (e.g. of 4 cores it uses only 1).
So today, to fully utilize the CPU, we must make the effort to divide the work across multiple threads.
For ASP.NET this is already done for us by ASP.NET architecture and IIS.
Look here: The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software
Here is a simple example of how threading can improve performance. You have n numbers that all need to be added together. In a single-threaded application, it will take n time units to add all of the numbers into the final sum. However, if you break your numbers into 2 groups, you can have the same operation running side by side, each instance with a group of n/2 numbers. Each would take n/2 time units to find its respective sum, plus an additional unit to find the full sum. By creating two threads, you have effectively cut the compute time in half.
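A minimal sketch of that two-group summation using tasks (the array size and contents are arbitrary):

using System;
using System.Threading.Tasks;

long[] numbers = new long[10_000_000];
for (int i = 0; i < numbers.Length; i++) numbers[i] = i;

int mid = numbers.Length / 2;
var left = Task.Run(() => { long s = 0; for (int i = 0; i < mid; i++) s += numbers[i]; return s; });
var right = Task.Run(() => { long s = 0; for (int i = mid; i < numbers.Length; i++) s += numbers[i]; return s; });

// Each task sums n/2 numbers side by side; one extra addition merges them.
Console.WriteLine(left.Result + right.Result);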
Technically, on a single-core processor there is no such thing as multi-threading, just the illusion that multiple tasks are happening in parallel, since each task gets a small slice of time.
However, that being said, threading is very useful if you have to do some work that takes a long time but you want your application to be responsive (i.e. be able to do other things) while you wait for that task to finish. A good example is GUI applications.
On multi-core / multi-processor systems, you can have one process doing many things at once so the performance gain there is obvious :)

How to simulate different CPU frequency and limit RAM

I have to build a simulator with C#. This simulator should be able to run a second thread with a configurable CPU speed and a limited RAM size, e.g. 144 MHz and 50 MB.
Of course I know that a simulator can never be as accurate as the real hardware. But I try to get almost similar performance.
At the moment I'm thinking about creating a thread which I will stop/sleep from time to time. Depending on the desired CPU speed, the simulator would adjust the sleep time of this thread and thereby simulate a different CPU frequency. To measure the achieved speed I thought about using PerformanceCounters. But with this approach I have the problem that I don't know how to limit the RAM size the thread can use.
Do you have any ideas how to realize such a simulator?
Thanks in advance!!
Limiting memory is easy with virtual machines like VMware. You can change the CPU speed with some overclocking tools, for example http://cpu.rightmark.org/products/rmclock.shtml
Good luck!
CPU speed limiting? You should check this; perhaps it will be useful (to some degree at least).
CPU Emulation and locking to a specific clock speed
If you are concerned with simulating an operating system environment then one answer would be to use a virtual machines environment where you can control memory and CPU parameters, etc.
The thread pause/stop approach may help you simulate CPU frequency, but it is going to be terribly inaccurate: when you pause the thread it is de-scheduled, and then it's up to the operating system to re-schedule it at some "random" point in time, i.e. a point over which you have no control.
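For what it's worth, a duty-cycle sketch of that pause/sleep idea (every name and number here is an assumption, and, as said above, scheduler jitter makes the result coarse):

using System;
using System.Diagnostics;
using System.Threading;

const double TargetFraction = 0.1;   // e.g. simulate 144 MHz on a ~1.44 GHz host
const int SliceMs = 10;              // run/sleep granularity

var clock = Stopwatch.StartNew();
while (true)
{
    long runUntil = clock.ElapsedMilliseconds + (long)(SliceMs * TargetFraction);
    while (clock.ElapsedMilliseconds < runUntil)
        DoOneUnitOfSimulatedWork();                         // hypothetical guest step
    Thread.Sleep((int)(SliceMs * (1 - TargetFraction)));    // idle out the rest of the slice
}

static void DoOneUnitOfSimulatedWork() { /* execute one guest opcode */ }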
As for limiting the memory, starting a new process that will host your code is an option, and then limiting the memory of that process, e.g.:
http://www.codeproject.com/KB/threads/Setting_Max_Memory_Limit.aspx
This will not really simulate overall OS memory limitations though.
A thread that sleeps to slow down the software execution of your guest opcodes? I think it works, but it would look a little weird, like fast-forward, pause, fast-forward, pause, etc...
If you just want to slow down a process, try this: use the CPU single-step feature and "debug" the process. You have to write a custom handler for the CPU single-stepping trap, whose only job is a big loop of NOPs.
That gives you a fine-grained delay between each instruction.
