When is using Task.Run wasteful or delusional?

I have an interface that reads/writes an object to storage. In one case the storage is a database with async methods. In the other case it's just a cookie.
I gather that it's recommended to use async back along the path ending at an async call, so it seems to make sense for the interface to be async as well. But in the cookie case, I'm just setting a couple of fields and sticking it in the response, so there isn't any async there yet. I can wrap that bit in await Task.Run() to match the new interface, but I don't know if this is advisable or if it has some negative impact on performance.
What to do?
public interface IProfileStore
{
Task SetProfile(UserProfile profile); // async is not allowed on a body-less interface member
}
public async Task SetProfile(UserProfile profile)
{
// Look mom, I'm needlessly async
await Task.Run(() =>
{
var cookie = new HttpCookie(AnonymousCookieName);
cookie["name"] = profile.FullName;
HttpContext.Current.Response.Cookies.Add(cookie);
});
}

You should not do that; you're just creating needless threadpool churn.
Instead, remove the async keyword from the method and simply return Task.FromResult(0) to return a synchronously completed task.

If you're performing a very short quickly completed operation then you're quite right that there is likely no need to use Task.Run to push the work to another thread. The act of scheduling the code in the thread pool is likely going to take longer than just doing it.
As for how to do that, just remove the await Task.Run that you have no need for and voila, you're all set. You have a synchronous operation that is still wrapped in a Task and so still matches the required interface.

Almost as SLaks suggests. If you were doing something genuinely asynchronous, you would drop async/await and return the Task directly:
public Task SetProfile(UserProfile profile)
{
return Task.Run(() =>
{
var cookie = new HttpCookie(AnonymousCookieName);
cookie["name"] = profile.FullName;
HttpContext.Current.Response.Cookies.Add(cookie);
});
}
However as he suggests in this case:
public Task SetProfile(UserProfile profile)
{
var cookie = new HttpCookie(AnonymousCookieName);
cookie["name"] = profile.FullName;
HttpContext.Current.Response.Cookies.Add(cookie);
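return Task.FromResult<object>(null);
}
Return a null result, as it's a system-cached completed Task. The explicit <object> type argument is needed because the compiler cannot infer a type from null.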
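On .NET Framework 4.6 and later (and on .NET Core), Task.CompletedTask is another way to hand back an already-completed, non-generic Task. The following is a minimal sketch of the synchronous store written that way. InMemoryProfileStore and its Values dictionary are stand-ins assumed here in place of HttpCookie, so that the example runs outside ASP.NET.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class UserProfile
{
    public string FullName { get; set; }
}

public interface IProfileStore
{
    Task SetProfile(UserProfile profile);
}

// Hypothetical stand-in for the cookie-backed store: the work is purely synchronous.
public class InMemoryProfileStore : IProfileStore
{
    public Dictionary<string, string> Values { get; } = new Dictionary<string, string>();

    // No async keyword and no Task.Run: do the work inline, then return
    // a task that is already complete.
    public Task SetProfile(UserProfile profile)
    {
        Values["name"] = profile.FullName;
        return Task.CompletedTask;
    }
}

public static class Program
{
    public static async Task Main()
    {
        IProfileStore store = new InMemoryProfileStore();
        Task task = store.SetProfile(new UserProfile { FullName = "Ada" });
        Console.WriteLine(task.IsCompleted); // the task is complete before it is awaited
        await task;
    }
}
```

One difference from an async method is worth keeping in mind: an exception thrown inside this SetProfile propagates directly to the caller instead of being stored in the returned task.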

Related

Can I run sync code as async to gain performance? [duplicate]

I've read a bunch of forums, tutorials and blogs talking about the usage of Async/Await in C#. The more I read the more confusing it gets.
Also, people mainly talk about calling async stuff in a sync method but not calling sync stuff in an async method.
As I am a junior developer and do not have much experience with async programming I'll post a new question here and hope for some enlightenment.
Consider this:
I have a Web API endpoint that does some calculations and model building and returns some data.
public async Task<JsonResult> GetData()
{
Task<Data> stuff1Task = CalculateStuff1();
Task<Data> stuff2Task = CalculateStuff2();
Task<Data> stuff3Task = CalculateStuff3();
return Json(
new
{
stuff1 = await stuff1Task,
stuff2 = await stuff2Task,
stuff3 = await stuff3Task
}, JsonRequestBehavior.AllowGet
);
}
private async Task<Data> CalculateStuff1()
{
return await SomeAsyncCalculation();
}
private async Task<Data> CalculateStuff2()
{
return SomeSyncCalculation();
}
private async Task<Data> CalculateStuff3()
{
Task<Data> dataTask1 = SomeAsyncCalculation();
Task<Data> dataTask2 = AnotherAsyncCalculation();
Data data1 = await dataTask1;
Data data2 = await dataTask2;
Data combindedData = SyncMethodToCombineData(data1, data2);
return combindedData;
}
The reason I'm considering mixing async and sync code is to get better performance.
In this case let's pretend SomeAsyncCalculation(), SomeSyncCalculation() and AnotherAsyncCalculation() are pretty costly methods. My goal is to get the methods to run somewhat in parallel to get faster response times.
I know it is best to go "async all the way", but let's be real: rebuilding half the project is not always a priority or a possibility.
Also I might have some integrations with other systems that do not support async operations.
The warning I get for CalculateStuff2() adds to the confusion:
this async method lacks 'await' operators and will run synchronously
In my understanding the "async" keyword is only good for wrapping the method and allowing me to use await keyword. It also allows me to just return the data and I don't need to manage Task returning results. It also handles exceptions.
The Task<TResult> return type is what makes the method execute on a different thread (although it is not guaranteed it will execute on a different thread).
Concluding questions:
1. Will the async method that does not use await (CalculateStuff2()) run synchronously on its own thread (if it runs on another thread because it is a Task) or will it run in the main thread of the API call, and always block it no matter what?
2. Is it bad practice to use async without await just to have a nicely wrapped task method out of the box?

You don't need async on a synchronous method. async makes the compiler generate a state machine, which is redundant if you never await.
Consider this somewhat optimized example.
public async Task<JsonResult> GetData()
{
Task<Data> stuff1Task = CalculateStuff1();
Task<Data> stuff3Task = CalculateStuff3();
Data stuff2data = CalculateStuff2(); // run sync method after launching async ones
return Json(new
{
stuff1 = await stuff1Task,
stuff2 = stuff2data,
stuff3 = await stuff3Task
}, JsonRequestBehavior.AllowGet);
}
private Task<Data> CalculateStuff1() // optimized
{
return SomeAsyncCalculation();
}
private Data CalculateStuff2()
{
return SomeSyncCalculation();
}
private async Task<Data> CalculateStuff3()
{
//use combinator to simplify the code
Data[] data = await Task.WhenAll(SomeAsyncCalculation(), AnotherAsyncCalculation());
Data combindedData = SyncMethodToCombineData(data[0], data[1]);
return combindedData;
}
Also consider distinguishing between CPU-bound and I/O-bound operations; look at this article. The right async approach depends on what exactly you're launching.
Direct answers
Will the async method that does not use await (CalculateStuff2()) run synchronously on its own thread (if it runs on another thread because it is a Task) or will it run in the main thread of the API call, and always block it no matter what?
Yes, it will run synchronously on the caller's thread. If you want to run a sync method on its own thread, use Task.Run():
Task<Data> stuff2Task = Task.Run(() => CalculateStuff2());
and then await it.
Is it bad practice to use async without await just to have a nicely wrapped task method out of the box?
Yes, it's bad practice. The redundant state machine adds overhead that buys nothing in this case.
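The difference between the two cases can be observed with a small console sketch. ThreadDemo and its method names are made up for illustration; the only thing compared is which managed thread runs each body.

```csharp
using System;
using System.Threading.Tasks;

public static class ThreadDemo
{
    // async without await: the body runs synchronously on the calling thread.
    // The compiler reports warning CS1998 here, which is the warning quoted in the question.
#pragma warning disable CS1998
    public static async Task<int> AsyncWithoutAwait()
    {
        return Environment.CurrentManagedThreadId;
    }
#pragma warning restore CS1998

    // Task.Run: the delegate is queued to the thread pool.
    public static Task<int> ViaTaskRun()
    {
        return Task.Run(() => Environment.CurrentManagedThreadId);
    }

    public static async Task Main()
    {
        int caller = Environment.CurrentManagedThreadId;
        Task<int> sync = AsyncWithoutAwait();

        Console.WriteLine($"caller thread:       {caller}");
        Console.WriteLine($"async without await: {sync.Result} (already completed: {sync.IsCompleted})");
        Console.WriteLine($"Task.Run:            {await ViaTaskRun()}");
    }
}
```

The task returned by AsyncWithoutAwait is already complete when the call returns, and it reports the caller's thread ID. The ID reported by Task.Run depends on which pool thread picks up the work.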

CancellationToken: why not to use AsyncLocal context instead of passing parameter to every async method?

I see Microsoft enforces this pattern for async methods:
async Task<object> DoMyOperationAsync(int par1, string par2,..., CancellationToken token=default(CancellationToken))
{
...
token.ThrowIfCancellationRequested();
...
}
Every single method should have that ugly CancellationToken token=default(CancellationToken) parameter even though most of the time it's not even being used, but just passed through at best.
Why can't we instead use some sort of CancellationTokenContext, and use it only in the methods that actually need it?
public class CancellationTokenContext
{
static AsyncLocal<CancellationToken> asyncContext = new AsyncLocal<CancellationToken>();
public static CancellationToken Current {
get {
return asyncContext.Value;
}
set {
asyncContext.Value = value;
}
}
public static void ThrowIfRequested() {
Current.ThrowIfCancellationRequested();
}
}
public class MyClassWithAsyncMethod{
public async Task<object> DoMyOperationAsync(int par1, string par2,...)
{
...
CancellationTokenContext.ThrowIfRequested();
...
}
}

The main reason for not using AsyncLocal is that it's much slower. In fact, it's a dictionary lookup, so it should be approximately 20x slower (~20 ns vs. 1 ns).
One other factor is: explicit is usually better than implicit. Though I agree it's a very annoying part of async implementation.

Maybe the main reason is that CancellationToken is a struct. If you pass it as a parameter, it is kept on the call stack, with no need to allocate memory on the heap. If you use AsyncLocal instead, it will box and unbox the cancellation token. When async operations are heavily used, e.g. in an ASP.NET Core app, this may even affect overall performance because of the additional workload for the garbage collector.
Another possible problem I see here is the need to set up and then restore the AsyncLocal for every sub-call that requires a different cancellation token. For example:
private async Task DoSomeWork()
{
var currentCancellationToken = _ambientCancellationToken.Value;
_ambientCancellationToken.Value = CancellationToken.None;
// i.e. we should finish logging regardless of the calling method's cancellation signal, for whatever reason
await _myCustomLogger.AsyncInfo("Do 1st step");
// restore the ambient cancellation token for the next step of the async flow
_ambientCancellationToken.Value = currentCancellationToken;
await Do1StepOfWork();
_ambientCancellationToken.Value = CancellationToken.None;
// ...and here we go again
await _myCustomLogger.AsyncInfo("Do 2nd step");
// ...it has become annoying...
_ambientCancellationToken.Value = currentCancellationToken;
await Do2StepOfWork();
}
On the other hand, if you are sure that your app (not a public library/framework) will not start several thousand async flows per second, and you are sure that you will not mix cancellation contexts, this approach seems OK.
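If the ambient approach is used anyway, the save-and-restore repetition shown above could be wrapped in a disposable scope. This is only a sketch under that assumption: AmbientCancellation, Use and Scope are hypothetical names, not part of any framework API.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AmbientCancellation
{
    private static readonly AsyncLocal<CancellationToken> Ambient = new AsyncLocal<CancellationToken>();

    public static CancellationToken Current => Ambient.Value;

    // Swap in a token and return a scope that puts the previous one back on Dispose.
    public static IDisposable Use(CancellationToken token)
    {
        CancellationToken previous = Ambient.Value;
        Ambient.Value = token;
        return new Scope(previous);
    }

    private sealed class Scope : IDisposable
    {
        private readonly CancellationToken _previous;
        public Scope(CancellationToken previous) { _previous = previous; }
        public void Dispose() { Ambient.Value = _previous; }
    }
}

public static class Program
{
    public static async Task Main()
    {
        using (var cts = new CancellationTokenSource())
        using (AmbientCancellation.Use(cts.Token))
        {
            using (AmbientCancellation.Use(CancellationToken.None))
            {
                await Task.Delay(10); // e.g. logging that must not be cancelled
                Console.WriteLine(AmbientCancellation.Current.CanBeCanceled);
            }
            // Back to the outer, cancellable token.
            Console.WriteLine(AmbientCancellation.Current.CanBeCanceled);
        }
    }
}
```

The scope has to be opened and disposed inside the same async method: a change made to an AsyncLocal inside an async method does not flow back to that method's caller.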

"Storing" a task for later completion

I'm trying to "store" an async task for later completion - I've found the async cache example but this is effectively caching task results in a concurrent dictionary so that their results can be reloaded without re-doing the task again (the HTML implementation is here).
Basically what I'm trying to design is a dictionary of tasks, with correlation IDs (GUIDs) as the key. This is for co-ordinating incoming results from another place (XML identified by the GUID correlation ID) and my aim is for the task to suspend execution until the results come in (probably from a queue).
Is this going to work? This is my first foray into proper async coding and I can't find anything similar to my hoped-for solution, so I may well be entirely on the wrong track.
Can I effectively "store" a task for later completion, with the task result being set at completion time?
Edit: I've just found out about TaskCompletionSource (based on this question). Is that viable?

If I understand your use-case correctly, you can use TaskCompletionSource.
An example of implementation:
public class AsyncCache
{
private readonly Dictionary<Guid, Task<string>> _cache = new Dictionary<Guid, Task<string>>();
public Task<string> GetAsync(Guid guid)
{
if (_cache.TryGetValue(guid, out var task))
{
// The value is either there or already queued
return task;
}
var tcs = new TaskCompletionSource<string>(TaskCreationOptions.RunContinuationsAsynchronously);
_queue.Enqueue(() => {
var result = LoadResult();
tcs.TrySetResult(result);
});
_cache.Add(guid, tcs.Task);
return tcs.Task;
}
}
Here, _queue is whatever queuing mechanism you're going to use to process the data.
Of course, you would also have to make that code thread-safe.
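As a rough illustration of the thread-safe variant, the sketch below keeps one TaskCompletionSource per correlation ID in a ConcurrentDictionary. PendingResults, WaitFor and Complete are hypothetical names; the queue that delivers results is left out, and Complete stands in for whatever consumer receives them.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical registry: callers wait on a correlation ID, and whoever receives
// the result later (e.g. a queue consumer) completes the matching task.
public class PendingResults
{
    private readonly ConcurrentDictionary<Guid, TaskCompletionSource<string>> _pending =
        new ConcurrentDictionary<Guid, TaskCompletionSource<string>>();

    private static TaskCompletionSource<string> Create(Guid id)
    {
        return new TaskCompletionSource<string>(TaskCreationOptions.RunContinuationsAsynchronously);
    }

    // Returns the task for this ID, creating it on first request.
    public Task<string> WaitFor(Guid correlationId)
    {
        return _pending.GetOrAdd(correlationId, Create).Task;
    }

    // Completes the task for this ID; returns false if it was already completed.
    public bool Complete(Guid correlationId, string result)
    {
        return _pending.GetOrAdd(correlationId, Create).TrySetResult(result);
    }
}

public static class Program
{
    public static async Task Main()
    {
        var pending = new PendingResults();
        Guid id = Guid.NewGuid();

        Task<string> waiting = pending.WaitFor(id); // not completed yet
        pending.Complete(id, "<result/>");          // the result arrives later
        Console.WriteLine(await waiting);
    }
}
```

Because Complete also uses GetOrAdd, a result that arrives before anyone has asked for it is not lost. Entries are never removed in this sketch, so a real implementation would need some eviction policy.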

Are you thinking of lazy loading? You could use Lazy<Task> (which will initialise the task but not queue it to run).
var tasks = new Dictionary<Guid, Lazy<Task>>();
tasks.Add(Task1Guid, new Lazy<Task>(() => { /* return whatever the 1st task is */ }));
tasks.Add(Task2Guid, new Lazy<Task>(() => { /* return whatever the 2nd task is */ }));
async Task RunTaskAsync(Guid guid)
{
await tasks[guid].Value;
}

How should I use async await between my business/service/application layer

Right now my methods look something like this.
ProductManager class in business
public static async Task<List<ProductItem>> GetAllProducts()
{
var context = GetMyContext();
return await context.Products
.Select(x => new ProductItem { /* create item */ })
.ToListAsync();
}
ProductService class in service.
public async Task<List<ProductItem>> GetAllProducts()
{
return await ProductManager.GetAllProducts();
}
ProductController in application.
public async Task<ActionResult> Index()
{
var ps = new ProductService();
var productsAsync = ps.GetAllProducts();
// Do other work.
var products = await productsAsync;
return View(products);
}
This application gets high usage,
Is this way of doing it totally wrong?
Should I be awaiting every method?
Will this start a new thread every time await is called ?

This application gets high usage, Is this way of doing it totally wrong?
No; it looks good to me.
Should I be awaiting every method?
Yes. Once you put in the first await (in ProductManager), then its callers should be awaited, and their callers awaited, and so on, up to the controller action method. This "growth" of async is entirely natural; it's called "async all the way" in my MSDN article on async best practices.
Will this start a new thread every time await is called?
No. Await is about freeing up threads, not using more threads. I have an async intro on my blog that describes how async and await work.

await simply waits for something to complete. If you don't need the result of a task inside your method, you don't need to await it. GetAllProducts should simply return the task produced by ToListAsync.
public static Task<List<ProductItem>> GetAllProducts()
{
var context = GetMyContext();
return context.Products
.Select(x => new ProductItem { /* create item */ })
.ToListAsync();
}
async/await adds a bit of overhead, since the compiler has to generate a state machine that stores the original synchronization context, waits for the awaited task to finish and then restores the original synchronization context.
Adding async/await to a method that doesn't need to process the result of the task simply adds overhead. In fact, there are some Roslyn analyzers that detect and fix this issue.

Task.Yield() in library needs ConfigureAwait(false)

It's recommended that one use ConfigureAwait(false) whenever you can, especially in libraries, because it can help avoid deadlocks and improve performance.
I have written a library that makes heavy use of async (accesses web services for a DB). The users of the library were getting a deadlock and after much painful debugging and tinkering I tracked it down to the single use of await Task.Yield(). Everywhere else that I have an await, I use .ConfigureAwait(false), however that is not supported on Task.Yield().
What is the recommended solution for situations where one needs the equivalent of Task.Yield().ConfigureAwait(false)?
I've read about how there was a SwitchTo method that was removed. I can see why that could be dangerous, but why is there no equivalent of Task.Yield().ConfigureAwait(false)?
Edit:
To provide further context for my question, here is some code. I am implementing an open source library for accessing DynamoDB (a distributed database as a service from AWS) that supports async. A number of operations return IAsyncEnumerable<T> as provided by the IX-Async library. That library doesn't provide a good way of generating async enumerables from data sources that provide rows in "chunks" i.e. each async request returns many items. So I have my own generic type for this. The library supports a read ahead option allowing the user to specify how much data should be requested ahead of when it is actually needed by a call to MoveNext().
Basically, how this works is that I make requests for chunks by calling GetMore() and passing along state between these. I put those tasks in a chunks queue and dequeue them and turn them into actual results that I put in a separate queue. The NextChunk() method is the issue here. Depending on the value of ReadAhead I will keep getting the next chunk as soon as the last one is done (All) or not until a value is needed but not available (None) or only get the next chunk beyond the values that are currently being used (Some). Because of that, getting the next chunk should run in parallel/not block getting the next value. The enumerator code for this is:
private class ChunkedAsyncEnumerator<TState, TResult> : IAsyncEnumerator<TResult>
{
private readonly ChunkedAsyncEnumerable<TState, TResult> enumerable;
private readonly ConcurrentQueue<Task<TState>> chunks = new ConcurrentQueue<Task<TState>>();
private readonly Queue<TResult> results = new Queue<TResult>();
private CancellationTokenSource cts = new CancellationTokenSource();
private TState lastState;
private TResult current;
private bool complete; // whether we have reached the end
public ChunkedAsyncEnumerator(ChunkedAsyncEnumerable<TState, TResult> enumerable, TState initialState)
{
this.enumerable = enumerable;
lastState = initialState;
if(enumerable.ReadAhead != ReadAhead.None)
chunks.Enqueue(NextChunk(initialState));
}
private async Task<TState> NextChunk(TState state, CancellationToken? cancellationToken = null)
{
await Task.Yield(); // ** causes deadlock
var nextState = await enumerable.GetMore(state, cancellationToken ?? cts.Token).ConfigureAwait(false);
if(enumerable.ReadAhead == ReadAhead.All && !enumerable.IsComplete(nextState))
chunks.Enqueue(NextChunk(nextState)); // This is a read ahead, so it shouldn't be tied to our token
return nextState;
}
public Task<bool> MoveNext(CancellationToken cancellationToken)
{
cancellationToken.ThrowIfCancellationRequested();
if(results.Count > 0)
{
current = results.Dequeue();
return TaskConstants.True;
}
return complete ? TaskConstants.False : MoveNextAsync(cancellationToken);
}
private async Task<bool> MoveNextAsync(CancellationToken cancellationToken)
{
Task<TState> nextStateTask;
if(chunks.TryDequeue(out nextStateTask))
lastState = await nextStateTask.WithCancellation(cancellationToken).ConfigureAwait(false);
else
lastState = await NextChunk(lastState, cancellationToken).ConfigureAwait(false);
complete = enumerable.IsComplete(lastState);
foreach(var result in enumerable.GetResults(lastState))
results.Enqueue(result);
if(!complete && enumerable.ReadAhead == ReadAhead.Some)
chunks.Enqueue(NextChunk(lastState)); // This is a read ahead, so it shouldn't be tied to our token
return await MoveNext(cancellationToken).ConfigureAwait(false);
}
public TResult Current { get { return current; } }
// Dispose() implementation omitted
}
I make no claim this code is perfect. Sorry it is so long, wasn't sure how to simplify. The important part is the NextChunk method and the call to Task.Yield(). This functionality is used through a static construction method:
internal static class AsyncEnumerableEx
{
public static IAsyncEnumerable<TResult> GenerateChunked<TState, TResult>(
TState initialState,
Func<TState, CancellationToken, Task<TState>> getMore,
Func<TState, IEnumerable<TResult>> getResults,
Func<TState, bool> isComplete,
ReadAhead readAhead = ReadAhead.None)
{ ... }
}

The exact equivalent of Task.Yield().ConfigureAwait(false) (which doesn't exist since ConfigureAwait is a method on Task and Task.Yield returns a custom awaitable) is simply using Task.Factory.StartNew with CancellationToken.None, TaskCreationOptions.PreferFairness and TaskScheduler.Current. In most cases however, Task.Run (which uses the default TaskScheduler) is close enough.
You can verify that by looking at the source for YieldAwaiter and see that it uses ThreadPool.QueueUserWorkItem/ThreadPool.UnsafeQueueUserWorkItem when TaskScheduler.Current is the default one (i.e. thread pool) and Task.Factory.StartNew when it isn't.
You can however create your own awaitable (as I did) that mimics YieldAwaitable but disregards the SynchronizationContext:
async Task Run(int input)
{
await new NoContextYieldAwaitable();
// executed on a ThreadPool thread
}
public struct NoContextYieldAwaitable
{
public NoContextYieldAwaiter GetAwaiter() { return new NoContextYieldAwaiter(); }
public struct NoContextYieldAwaiter : INotifyCompletion
{
public bool IsCompleted { get { return false; } }
public void OnCompleted(Action continuation)
{
var scheduler = TaskScheduler.Current;
if (scheduler == TaskScheduler.Default)
{
ThreadPool.QueueUserWorkItem(RunAction, continuation);
}
else
{
Task.Factory.StartNew(continuation, CancellationToken.None, TaskCreationOptions.PreferFairness, scheduler);
}
}
public void GetResult() { }
private static void RunAction(object state) { ((Action)state)(); }
}
}
Note: I don't recommend actually using NoContextYieldAwaitable; it's just an answer to your question. You should be using Task.Run (or Task.Factory.StartNew with a specific TaskScheduler).

I noticed you edited your question after you accepted the existing answer, so perhaps you're interested in more rants on the subject. Here you go :)
It's recommended that one use ConfigureAwait(false) whenever you can, especially in libraries because it can help avoid deadlocks and improve performance.
That is recommended only if you're absolutely sure that no API you're calling in your implementation (including Framework APIs) depends on any properties of the synchronization context. That's especially important for library code, and even more so if the library is suitable for both client-side and server-side use. For example, CurrentCulture is commonly overlooked: it would never be an issue for a desktop app, but it may well be for an ASP.NET app.
Back to your code:
private async Task<TState> NextChunk(...)
{
await Task.Yield(); // ** causes deadlock
var nextState = await enumerable.GetMore(...);
// ...
return nextState;
}
Most likely, the deadlock is caused by the client of your library, because they use Task.Result (or Task.Wait, Task.WaitAll, Task.IAsyncResult.AsyncWaitHandle etc., let them search) somewhere in the outer frame of the call chain. Although Task.Yield() is redundant here, this is not your problem in the first place, but rather theirs: they shouldn't be blocking on the asynchronous APIs and should be using "Async All the Way", as also explained in the Stephen Cleary article you linked.
Removing Task.Yield() may or may not solve this problem, because enumerable.GetMore() can also use some await SomeApiAsync() without ConfigureAwait(false), thus posting the continuation back to the caller's synchronization context. Moreover, "SomeApiAsync" can happen to be a well established Framework API which is still vulnerable to a deadlock, like SendMailAsync, we'll get back to it later.
Overall, you should only be using Task.Yield() if for some reason you want to return to the caller immediately ("yield" the execution control back to the caller), and then continue asynchronously, at the mercy of the SynchronizationContext installed on the calling thread (or ThreadPool, if SynchronizationContext.Current == null). The continuation well may be executed on the same thread upon the next iteration of the app's core message loop. Some more details can be found here:
Task.Yield - real usages?
So, the right thing would be to avoid blocking code all the way. However, say, you still want to make your code deadlock-proof, you don't care about synchronization context and you're sure the same is true about any system or 3rd party API you use in your implementation.
Then, instead of reinventing ThreadPoolEx.SwitchTo (which was removed for a good reason), you could just use Task.Run, as suggested in the comments:
private Task<TState> NextChunk(...)
{
// jump to a pool thread without SC to avoid deadlocks
return Task.Run(async() =>
{
var nextState = await enumerable.GetMore(...);
// ...
return nextState;
});
}
IMO, this is still a hack, with the same net effect, although a much more readable one than using a variation of ThreadPoolEx.SwitchTo(). Same as SwitchTo, it still has an associated cost: a redundant thread switch which may hurt ASP.NET performance.
There is another (IMO better) hack, which I proposed here to address the deadlock with aforementioned SendMailAsync. It doesn't incur an extra thread switch:
private Task<TState> NextChunk(...)
{
return TaskExt.WithNoContext(async() =>
{
var nextState = await enumerable.GetMore(...);
// ...
return nextState;
});
}
public static class TaskExt
{
public static Task<TResult> WithNoContext<TResult>(Func<Task<TResult>> func)
{
Task<TResult> task;
var sc = SynchronizationContext.Current;
try
{
SynchronizationContext.SetSynchronizationContext(null);
task = func(); // do not await here
}
finally
{
SynchronizationContext.SetSynchronizationContext(sc);
}
return task;
}
}
This hack works by temporarily removing the synchronization context for the synchronous scope of the original NextChunk method, so it won't be captured for the 1st await continuation inside the async lambda, effectively solving the deadlock problem.
Stephen has provided a slightly different implementation while answering the same question. His IgnoreSynchronizationContext restores the original synchronization context on whatever happens to be the continuation's thread after await (which could be a completely different, random pool thread). I'd rather not restore it after await at all, as long as I don't care about it.
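The behaviour of WithNoContext can be checked with a small console program. This is a sketch only: the plain SynchronizationContext instance stands in for a UI or ASP.NET context, which is an assumption made so the example runs anywhere, and the try/finally is condensed slightly from the version above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskExt
{
    public static Task<TResult> WithNoContext<TResult>(Func<Task<TResult>> func)
    {
        var sc = SynchronizationContext.Current;
        try
        {
            SynchronizationContext.SetSynchronizationContext(null);
            return func(); // do not await here
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(sc);
        }
    }
}

public static class Program
{
    public static async Task Main()
    {
        // Stand-in for a UI or ASP.NET context (assumption for the demo).
        var original = new SynchronizationContext();
        SynchronizationContext.SetSynchronizationContext(original);

        Task<bool> sawNoContext = TaskExt.WithNoContext(async () =>
        {
            bool before = SynchronizationContext.Current == null;
            await Task.Delay(10); // nothing to capture, so the continuation runs on the pool
            return before && SynchronizationContext.Current == null;
        });

        // The caller's context is back in place as soon as WithNoContext returns.
        Console.WriteLine(ReferenceEquals(SynchronizationContext.Current, original));
        Console.WriteLine(await sawNoContext);
    }
}
```

The lambda sees no synchronization context both before and after its first await, while the caller gets its original context back immediately, without an extra thread switch.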

Inasmuch as the useful and legit API you're looking for is missing, I filed this request proposing its addition to .NET.
I also added it to vs-threading so that the next release of the Microsoft.VisualStudio.Threading NuGet package will include this API. Note that this library is not VS-specific, so you can use it in your app.
