I am using C# to control a hardware device. The program is structured as follows:
A hardware control thread (normal CPU priority)
    while (notFinished)
    {
        Prepare();
        await DeviceCommunication();
        autoResetEvent.WaitOne();
    }
A UI thread (normal CPU priority)
A heavy computational thread (below normal CPU priority)
There is a device API layer written with C# Tasks. The AutoResetEvent delay after a Task continuation is sometimes as high as 500 ms, depending on the state of the PC (even when the heavy computational thread is not running). That is generally fine, except during some critical hardware control moments, which require a 10 ms response time.
I tested setting the consumer thread to above-normal priority and mocking the asynchronous function to force it to run synchronously. That seemed to solve the problem. However, the real asynchronous functions contain awaits. They immediately release the thread, and the continuations run on thread-pool threads.
Question 1
Is the 500ms delay normal? I am using VirtualBox with i5 2 threads. I expect the target PC will perform similarly to mine.
Assuming my findings are valid, my choices for solving the problem are:
Use Task.GetAwaiter().GetResult() to turn async to sync. It should not cause deadlocks.
Rewrite the device API layer to support true sync operations. It is elegant and follows the general rules but they are just nice to have.
Set Task scheduling priority and CPU priority
Use a third-party Task library
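Choice 1 could look roughly like the sketch below. It assumes the control loop runs on its own thread; HardwareLoopSketch and DeviceCommunicationAsync are hypothetical stand-ins for the real loop and device call, and Task.Delay only simulates the device. Blocking the dedicated thread does not change where the API's internal continuations run, so those still land on thread-pool threads.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class HardwareLoopSketch
{
    // Hypothetical stand-in for the real Task-based device call.
    public static async Task<int> DeviceCommunicationAsync(int step)
    {
        // The code after this await still runs on a thread-pool thread.
        await Task.Delay(1).ConfigureAwait(false);
        return step * 2;
    }

    // Runs the loop on a dedicated above-normal-priority thread and blocks
    // that thread on each device call instead of awaiting it.
    public static int[] RunOnDedicatedThread(int iterations)
    {
        var results = new int[iterations];
        var thread = new Thread(() =>
        {
            for (int i = 0; i < iterations; i++)
            {
                // A plain thread has no SynchronizationContext to capture,
                // so blocking here avoids the classic sync-over-async deadlock.
                results[i] = DeviceCommunicationAsync(i).GetAwaiter().GetResult();
            }
        });
        thread.Priority = ThreadPriority.AboveNormal;
        thread.Start();
        thread.Join();
        return results;
    }
}
```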
Question 2
Are there better choices?
Question 3
How to do choice 3 (Set Task scheduling priority and CPU priority)? Is a custom TaskScheduler the only way to do it?
Windows does not provide any scheduling guarantees of any kind. Period. We frequently see extra delays of several seconds for C# code on a Win7 system while device drivers are busy. If you really have a hard real time requirement, you need to be running on an RTOS.
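On Question 3 specifically: a custom TaskScheduler is the usual way to control which thread, and therefore which CPU priority, runs a Task. Below is a minimal sketch, assuming one dedicated thread is enough; DedicatedThreadTaskScheduler is a made-up name, not a framework type. It only changes where the code runs. It does not add any timing guarantee, and any continuation inside the device API that uses ConfigureAwait(false) will still go to the thread pool.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical scheduler: runs every queued task on one dedicated thread.
public sealed class DedicatedThreadTaskScheduler : TaskScheduler, IDisposable
{
    private readonly BlockingCollection<Task> _queue = new BlockingCollection<Task>();
    private readonly Thread _thread;

    public DedicatedThreadTaskScheduler(ThreadPriority priority)
    {
        _thread = new Thread(() =>
        {
            // Blocks until work arrives; ends after Dispose() drains the queue.
            foreach (Task task in _queue.GetConsumingEnumerable())
            {
                TryExecuteTask(task);
            }
        })
        { IsBackground = true, Priority = priority };
        _thread.Start();
    }

    protected override void QueueTask(Task task) => _queue.Add(task);

    // Only allow inlining when we are already on the dedicated thread.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
        => Thread.CurrentThread == _thread && TryExecuteTask(task);

    protected override IEnumerable<Task> GetScheduledTasks() => _queue.ToArray();

    public override int MaximumConcurrencyLevel => 1;

    public void Dispose() => _queue.CompleteAdding();
}
```

Work is started on it with Task.Factory.StartNew(action, CancellationToken.None, TaskCreationOptions.None, scheduler).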
Context: we have a task which might take from 30 seconds to 5 minutes depending on a service we are consuming in some Azure Functions.
We are planning to monitor the current status of that task object to make sure it's running and has not been cancelled/faulted.
There are two ways to go about it:
Create a Task, run it and then cancel it when the main task is finished. Alternatively, maybe use Task.Delay along with a while with a condition.
Create a Thread, run it and wait for it to finish (with a while condition to avoid a while that runs forever).
We have done some research and have realised that both have pros and cons. But we are still not sure about which one would be the best approach and why.
In a similar scenario, what would you use? A task, a thread, or something else?
Using a thread is a bit wasteful, but slightly more reliable.
It is wasteful because each thread allocates 1 MB of memory just for its mere existence.
It is more reliable because it doesn't depend on the availability of ThreadPool threads for running a timer event. A sudden burst in demand for ThreadPool threads could leave the ThreadPool starved for several seconds, or even minutes (in extreme scenarios).
So if wasting 1 MB of memory is a non-issue for the app, use a thread. On the other hand if the absolute precision in the timing of the events is something unimportant, use a task.
You could also use a task started with the option LongRunning, but this is essentially a thread in disguise.
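If precise timing is unimportant and you go with a task, the monitor can be a plain Task.Delay loop. A rough sketch follows; TaskMonitorSketch, MonitorAsync and the report callback are made-up names for illustration, and what you do with each reported status is up to you.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskMonitorSketch
{
    // Reports the status of mainTask every `interval` until it finishes,
    // then reports and returns the final status (RanToCompletion, Faulted
    // or Canceled).
    public static async Task<TaskStatus> MonitorAsync(
        Task mainTask, TimeSpan interval, Action<TaskStatus> report)
    {
        while (!mainTask.IsCompleted)
        {
            report(mainTask.Status);

            // Wake up early if the main task finishes before the interval elapses.
            // Task.WhenAny does not throw even if mainTask faults.
            await Task.WhenAny(mainTask, Task.Delay(interval)).ConfigureAwait(false);
        }

        report(mainTask.Status);
        return mainTask.Status;
    }
}
```

The Task.Delay timer fires on a ThreadPool thread, so this version is exactly the one that can be late when the pool is starved.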
I am using Task.Run(() => this.someMethod()) to schedule a background job. I am not interested in the operation's result and need to move on with the flow of the application.
But sometimes my background task does not get scheduled for a long time. This started to happen when we moved to .NET 4.7 from 4.5. Even while debugging, the breakpoints are either not hit or hit after a considerable delay (> 10 minutes).
Has anyone noticed this behavior, or does anyone know what's causing it?
I am running on i7 core, 16 GB RAM.
A Task taking 10 minutes to even start sounds fishy. My guess is your system is under heavy load, or you have a lot of tasks running.
I'm going to attack the latter (for a specific situation).
TaskCreationOptions
LongRunning Specifies that a task will be a long-running,
coarse-grained operation involving fewer, larger components than
fine-grained systems. It provides a hint to the TaskScheduler that
oversubscription may be warranted. Oversubscription lets you create
more threads than the available number of hardware threads. It also
provides a hint to the task scheduler that an additional thread might
be required for the task so that it does not block the forward
progress of other threads or work items on the local thread-pool
queue.
var task = new Task(() => MyLongRunningMethod(), TaskCreationOptions.LongRunning);
task.Start();
This is a quote from Stephen Toub - MSFT on this post
Under the covers, it's going to result in a higher number of threads
being used, because its purpose is to allow the ThreadPool to continue
to process work items even though one task is running for an extended
period of time; if that task were running in a thread from the pool,
that thread wouldn't be able to service other tasks. You'd typically
only use LongRunning if you found through performance testing that not
using it was causing long delays in the processing of other work.
It's hard to know what your problem is without looking at all your code; I posted this only as a suggestion.
I don't know how many tasks you start this way, but unless the number is really high I would focus the debugging on the method being called, not the caller. A delay of 10 minutes is more likely caused by a deadlock or network issue than the task scheduling.
Some ideas:
For a start, I would add something to the beginning and the end of the method being called that lets you know when it starts to execute and when it finishes. Like a Debug.WriteLine() with a timestamp and task ID.
Make sure the method being called releases all resources, even if it crashes. Crashing threads/tasks may go unnoticed, because they don't cause the application to crash.
Double check that the method being called is thread-safe. You may have been lucky in the past and some new framework optimization is now causing havoc.
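The first idea could be sketched like this. TracedJobSketch and its log callback are hypothetical names; pass something like s => Debug.WriteLine(s) as the log to get the output in the debugger.

```csharp
using System;
using System.Threading.Tasks;

public static class TracedJobSketch
{
    // Wraps a background job so that its start, failure and end are logged
    // with a UTC timestamp and the task ID.
    public static Task Run(Action job, Action<string> log)
    {
        return Task.Run(() =>
        {
            int? id = Task.CurrentId;
            log($"{DateTime.UtcNow:O} task {id} started");
            try
            {
                job();
            }
            catch (Exception ex)
            {
                // Without this, a crash in a fire-and-forget task goes unnoticed.
                log($"{DateTime.UtcNow:O} task {id} failed: {ex.Message}");
                throw;
            }
            finally
            {
                log($"{DateTime.UtcNow:O} task {id} finished");
            }
        });
    }
}
```

A large gap between the call to Run and the "started" line points at scheduling. A large gap between "started" and "finished" points at the method itself.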
I am aware of how async await works. I know that when execution reaches an await, it releases the thread, and after IO completes, it fetches a thread from the thread pool and runs the remaining code. This way threads are efficiently utilized. But I am confused about some use cases:
Should we use async methods for very fast IO methods, like cache read/write methods? Would they not result in unnecessary context switches? If we use a sync method, execution will complete on the same thread and a context switch may not happen.
Does async-await save only memory (by creating fewer threads)? Or does it save CPU as well? As far as I know, in the case of sync IO, the thread goes to sleep while the IO takes place. That means it does not consume CPU. Is this understanding correct?
I am aware of how async await works.
You are not.
I know that when execution reaches an await, it releases the thread
It does not. When execution reaches an await, the awaitable operand is evaluated, and then it is checked to see if the operation is complete. If it is not, then the remainder of the method is signed up as the continuation of the awaitable, and a task representing the work of the current method is returned to the caller.
None of that is "releasing the thread". Rather, control returns to the caller, and the caller keeps executing on the current thread. Of course, if the current caller was the only thing on this thread, then the thread is done. But there is no requirement that an async method be the only call on a thread!
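A small sketch makes the ordering visible. The names are made up for illustration, and a TaskCompletionSource stands in for an operation that has not finished yet. The caller keeps running on its own thread as soon as the await sees an incomplete task.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitOrderSketch
{
    private static async Task WorkAsync(List<string> log, Task gate)
    {
        log.Add("before await");
        await gate;               // gate is not complete: control returns to the caller
        log.Add("after await");   // the continuation runs once gate completes
    }

    public static List<string> Run()
    {
        var log = new List<string>();
        var gate = new TaskCompletionSource<bool>();

        Task work = WorkAsync(log, gate.Task);   // returns at the first incomplete await
        log.Add("caller continues");             // same thread, still doing work

        gate.SetResult(true);                    // complete the awaited operation
        work.GetAwaiter().GetResult();           // wait for the continuation to finish
        return log;
    }
}
```

Run() returns the entries in the order "before await", "caller continues", "after await".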
after IO completes
An awaitable need not be an IO operation, but let's suppose that it is.
it fetches a thread from the thread pool and runs the remaining code.
No. It schedules the remaining code to run on the correct context. That context might be a threadpool thread. It might be the UI thread. It might be the current thread. It might be any number of things.
Should we use async methods for very fast IO methods, like cache read/write methods?
The awaitable is evaluated. If the awaitable knows that it can complete the operation in a reasonable amount of time then it is perfectly within its rights to do the operation and return a completed task. In which case there is no penalty; you're just checking a boolean to see if the task is completed.
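To illustrate the already-completed case, here is a sketch in which a hypothetical cache read (ReadFromCacheAsync, a made-up name) returns a completed task. The await then simply continues on the same thread.

```csharp
using System;
using System.Threading.Tasks;

public static class CompletedAwaitSketch
{
    // Hypothetical cache hit: the value is already available, so the method
    // returns a task that is complete before anyone awaits it.
    public static Task<int> ReadFromCacheAsync() => Task.FromResult(42);

    // Returns true if the code before and after the await ran on the same thread.
    public static async Task<bool> StaysOnSameThreadAsync()
    {
        int before = Environment.CurrentManagedThreadId;
        int value = await ReadFromCacheAsync();   // already complete: no continuation is scheduled
        int after = Environment.CurrentManagedThreadId;
        return before == after && value == 42;
    }
}
```

Because nothing was actually awaited, the task returned by StaysOnSameThreadAsync() is itself already complete when the call returns.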
Would they not result in unnecessary context switches?
Not necessarily.
If we use a sync method, execution will complete on the same thread and a context switch may not happen.
I am confused as to why you think a context switch happens on an IO operation. IO operations run on hardware, below the level of OS threads. There's no thread sitting there servicing IO tasks.
Does async-await save only memory (by creating fewer threads)?
The purpose of await is to (1) make more efficient use of expensive worker threads by allowing workflows to become more asynchronous, and thereby freeing up threads to do work while waiting for high-latency results, and (2) to make the source code for asynchronous workflows resemble the source code for synchronous workflows.
As far as I know, in the case of sync IO, the thread goes to sleep while the IO takes place. That means it does not consume CPU. Is this understanding correct?
Sure, but you have this completely backwards. YOU WANT TO CONSUME CPU. You want to be consuming as much CPU as possible all the time! The CPU is doing work on behalf of the user and if it is idle then it's not getting its work done as fast as it could. Don't hire a worker and then pay them to sleep! Hire a worker, and as soon as they are blocked on a high-latency task, put them to work doing something else so the CPU stays as hot as possible all the time. The owner of that machine paid good money for that CPU; it should be running at 100% all the time that there is work to be done!
So let's come back to your fundamental question:
Does async await increase context switching?
I know a great way to find out. Write a program using await, write another one without, run them both, and measure the number of context switches per second. Then you'll know.
But I don't see why context switches per second is a relevant metric. Let's consider two banks with lots of customers and lots of employees. At bank #1 the employees work on one task until it is complete; they never switch context. If an employee is blocked on waiting for a result from another, they go to sleep. At bank #2, employees switch from one task to another when they are blocked, and are constantly servicing customer requests. Which bank do you think has faster customer service?
Should we use async methods for very fast IO methods, like cache read/write methods?
Such an IO would not block in the classical sense. "Blocking" is a loosely defined term. Normally it means that the CPU must wait for the hardware.
This type of IO is purely CPU work and there are no context switches. This would typically happen if the app reads a file or socket slower than data can be provided. Here, async IO does not help performance at all. I'm not even sure it would be suitable to unblock the UI thread since all tasks might complete synchronously.
Or does it save CPU as well?
It generally increases CPU usage in real-world loads. This is because the async machinery adds processing, allocations and synchronization. Also, we need to transition to kernel mode two times instead of once (first to initiate the IO, then to dequeue the IO completion notification).
Typical workloads run with <<100% CPU. A production server with >60% CPU would worry me since there is no margin for error. In such cases the thread pool work queues are almost always empty. Therefore, there are no context switching savings caused by processing multiple IO completions on one context switch.
That's why CPU usage generally increases (slightly), except if the machine is very high on CPU load and the work queues are often capable of delivering a new item immediately.
On the server async IO is mainly useful for saving threads. If you have ample threads available you will realize zero or negative gains. In particular any single IO will not become one bit faster.
That means it does not consume CPU.
It would be a waste to leave the CPU unavailable while an IO is in progress. To the kernel an IO is just a data structure. While it's in progress there is no CPU work to be done.
An anonymous person said:
For IO-bound tasks there may not be a major performance advantage to using separate threads just to wait for a result.
Pushing the same work to a different thread certainly does not help with throughput. This is added work, not reduced work. It's a shell game. (And async IO does not use a thread while it's running so all of this is based on a false assumption.)
A simple way to convince yourself that async IO generally costs more CPU than sync IO is to run a simple TCP ping/pong benchmark sync and async. Sync is faster. This is kind of an artificial load so it's just a hint at what's going on and not a comprehensive measurement.
When working with tasks, a rule of thumb appears to be that the thread pool - typically used by e.g. invoking Task.Run(), or Parallel.Invoke() - should be used for relatively short operations. When working with long running operations, we are supposed to use the TaskCreationOptions.LongRunning flag in order to - as far as I understand it - avoid clogging the thread pool queue, i.e. to push work to a newly-created thread.
But what exactly is a long running operation? How long is long, in terms of time? Are there other factors besides the expected task duration to be considered when deciding whether or not to use the LongRunning, like the anticipated CPU architecture (frequency, the number of cores, ...) or the number of tasks that will be attempted to be run at once from the programmer's perspective?
For example, suppose I have 500 tasks to process in a dedicated application, each taking 10-20 seconds to complete. Should I just start all 500 tasks using Task.Run (e.g. in a loop) and then await them all, perhaps as LongRunning, while leaving the default max level of concurrency? Then again, if I set LongRunning in such case, wouldn't this create 500 new threads and actually cause a lot of overhead and higher memory usage (due to extra threads being allocated) as compared to omitting LongRunning? This is assuming that no new tasks will be scheduled for execution while these 500 are being awaited.
I would guess that the decision to set LongRunning depends on the number of requests made to the thread pool in a given time interval, and that LongRunning should only be used for tasks that are expected to take significantly longer than the majority of the thread pool-placed tasks - by definition, at most a small percentage of all tasks. In other words, this appears to be a queuing and thread pool utilization optimization problem that should likely be solved case-by-case through testing, if at all. Am I correct?
It kind of doesn't matter. The problem isn't really about time, it's about what your code is doing. If you're doing asynchronous I/O, you're only using the thread for the short amount of time between individual requests. If you're doing CPU work... well, you're using the CPU. There's no "thread-pool starvation", because the CPUs are fully utilized.
The real problem is when you're doing blocking work that doesn't use the CPU. In a case like that, thread-pool starvation leads to CPU underutilization - you said "I need the CPU for my work" and then you don't actually use it.
If you're not using blocking APIs, there's no point in using Task.Run with LongRunning. If you have to run some legacy blocking code asynchronously, using LongRunning may be a good idea. Total work time isn't as important as "how often you are doing this". If you spin up one thread based on a user clicking on a GUI, the cost is tiny compared to all the latencies already included in the act of clicking a button in the first place, and you can use LongRunning just fine to avoid the thread-pool. If you're running a loop that spawns lots of blocking tasks... stop doing that. It's a bad idea :D
For example, imagine there is no asynchronous API alternative to File.Exists. So if you see that this is giving you trouble (e.g. over a faulty network connection), you'd fire it up using Task.Run - and since you're not doing CPU work, you'd use LongRunning.
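Task.Run has no overload that accepts TaskCreationOptions, so in practice this means Task.Factory.StartNew. A minimal sketch, with BlockingFileChecks and ExistsAsync as made-up names; passing TaskScheduler.Default explicitly avoids picking up whatever scheduler happens to be current:

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class BlockingFileChecks
{
    // Wraps the blocking File.Exists call in a LongRunning task so that a slow
    // path (for example a flaky network share) does not tie up a pool thread.
    public static Task<bool> ExistsAsync(string path)
    {
        return Task.Factory.StartNew(
            () => File.Exists(path),
            CancellationToken.None,
            TaskCreationOptions.LongRunning,
            TaskScheduler.Default);
    }
}
```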
In contrast, if you need to do some image manipulation that's basically 100% CPU work, it doesn't matter how long the operation takes - it's not a LongRunning thing.
And finally, the most common scenario for using LongRunning is when your "work" is actually the old-school "loop and periodically check if something should be done, do it and then loop again". Long running, but 99% of the time just blocking on some wait handle or something like that. Again, this is only useful when dealing with code that isn't CPU-bound, but that doesn't have proper asynchronous APIs. You might find something like this if you ever need to write your own SynchronizationContext, for example.
Now, how do we apply this to your example? Well, we can't, not without more information. If your code is CPU-bound, Parallel.For and friends are what you want - those ensure you only use enough threads to saturate the CPUs, and it's fine to use the thread-pool for that. If it's not CPU-bound... you don't really have any option besides using LongRunning if you want to run the tasks in parallel. Ideally, such work would consist of asynchronous calls you can safely invoke and await Task.WhenAll(...) from your own thread.
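The two "good" shapes could look roughly like this. It is only a sketch: BatchWork and its members are made-up names, the squaring stands in for real CPU work, and Task.Delay stands in for a real asynchronous I/O call.

```csharp
using System;
using System.Threading.Tasks;

public static class BatchWork
{
    // CPU-bound case: Parallel.For spreads the iterations over pool threads
    // and decides how many to use; no LongRunning involved.
    public static long[] SquaresCpuBound(int count)
    {
        var results = new long[count];
        Parallel.For(0, count, i => results[i] = (long)i * i);
        return results;
    }

    // Placeholder for a genuinely asynchronous operation.
    private static async Task<int> SimulatedIoAsync(int id)
    {
        await Task.Delay(10).ConfigureAwait(false);
        return id;
    }

    // I/O-bound case: start all operations, then await them together.
    // No thread is held while the operations are in flight.
    public static async Task<int[]> RunAllAsync(int count)
    {
        var tasks = new Task<int>[count];
        for (int i = 0; i < count; i++)
        {
            tasks[i] = SimulatedIoAsync(i);
        }
        // Results come back in the same order as the input tasks.
        return await Task.WhenAll(tasks).ConfigureAwait(false);
    }
}
```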
When working with tasks, a rule of thumb appears to be that the thread pool - typically used by e.g. invoking Task.Run(), or Parallel.Invoke() - should be used for relatively short operations. When working with long running operations, we are supposed to set the TaskCreationOptions.LongRunning to true in order to - as far as I understand it - avoid clogging the thread pool queue, i.e. to push work to a newly-created thread.
The vast majority of the time, you don't need to use LongRunning at all, because the thread pool will adjust to "losing" a thread to a long-running operation after 2 seconds.
The main problem with LongRunning is that it forces you to use the very dangerous StartNew API.
In other words, this appears to be a queuing and thread pool utilization optimization problem that should likely be solved case-by-case through testing, if at all. Am I correct?
Yes. You should never set LongRunning when first writing code. If you are seeing delays due to the thread pool injection rate, then you can carefully add LongRunning.
You should not use TaskCreationOptions.LongRunning in your case. I would use Parallel.For.
The LongRunning option should not be used if you are going to create a lot of tasks, as in your case. It is meant for creating a couple of tasks that will run for a long time.
By the way, I have never used this option in any similar scenario.
As you point out, TaskCreationOptions.LongRunning's purpose is
to allow the ThreadPool to continue to process work items even though one task is running for an extended period of time
As for when to use it:
It's not a specific length per se...You'd typically only use LongRunning if you found through performance testing that not using it was causing long delays in the processing of other work.
Source
I'm currently working on an application that relies on many different web services to get data. Since I want to modularize each service and have a bit of dependency in there (service1 must run before service 2 and 3 etc), I'm running each service in its own task.
The tasks themselves are either
running actively, meaning they're sending their request to the web service and are waiting for a response or processing the response
waiting (via monitor and timeout) - once a task finishes all waiting tasks wake up and check if their dependencies have finished
Now, the system is running with what I would call good performance (especially since the performance is rather negligible) - however, the application generates quite a number of tasks.
So, to my question: are ~200 tasks in this scenario too many? Do they generate that much overhead so that a basically non-threaded approach would be better?
The general answer is "Measure, Measure, Measure" :) if you're not experiencing any problems with performance, you shouldn't start optimizing.
I'd say 200 tasks are fine though. The beauty of tasks is their low overhead compared to "real" threads and even the thread pool. The TaskScheduler makes sure all the hardware threads are utilized as much as possible with the least amount of thread switching. It does this with various tricks, such as running child tasks serially, stealing work from queues on other threads, and so on.
You can also give the TaskScheduler some hints about what a specific task is going to do via the TaskCreationOptions
If you want some numbers, check out this post. As you can see, the TPL is pretty cheap in terms of overhead:
.NET 4.0 - Performance of Task Parallel Library (TPL), by Daniel Palme
This is another interesting article on the subject:
CLR Inside Out: Using concurrency for scalability, by Joe Duffy