Correct way to convert method to async in C#?

I'm attempting to convert the following method (simplified example) to be asynchronous, as the cacheMissResolver call may be expensive in terms of time (database lookup, network call):
// Synchronous version
public class ThingCache
{
private static readonly object _lockObj = new object();
// ... other stuff
public Thing Get(string key, Func<Thing> cacheMissResolver)
{
if (cache.Contains(key))
return cache[key];
Thing item;
lock(_lockObj)
{
if (cache.Contains(key))
return cache[key];
item = cacheMissResolver();
cache.Add(key, item);
}
return item;
}
}
There are plenty of materials online about consuming async methods, but the advice I have found on producing them seems less clear-cut. Given that this is intended to be part of a library, is either of my attempts below correct?
// Asynchronous attempts
public class ThingCache
{
private static readonly SemaphoreSlim _lockObj = new SemaphoreSlim(1);
// ... other stuff
// attempt #1
public async Task<Thing> Get(string key, Func<Thing> cacheMissResolver)
{
if (cache.Contains(key))
return await Task.FromResult(cache[key]);
Thing item;
await _lockObj.WaitAsync();
try
{
if (cache.Contains(key))
return await Task.FromResult(cache[key]);
item = await Task.Run(cacheMissResolver).ConfigureAwait(false);
cache.Add(key, item);
}
finally
{
_lockObj.Release();
}
return item;
}
// attempt #2
public async Task<Thing> Get(string key, Func<Task<Thing>> cacheMissResolver)
{
if (cache.Contains(key))
return await Task.FromResult(cache[key]);
Thing item;
await _lockObj.WaitAsync();
try
{
if (cache.Contains(key))
return await Task.FromResult(cache[key]);
item = await cacheMissResolver().ConfigureAwait(false);
cache.Add(key, item);
}
finally
{
_lockObj.Release();
}
return item;
}
}
Is using SemaphoreSlim the correct way to replace a lock statement in an async method? (I can't await in the body of a lock statement.)
Should I make the cacheMissResolver argument of type Func<Task<Thing>> instead? This puts the burden of making the resolver func async on the caller (by wrapping it in Task.Run, I know it will be offloaded to a background thread if it takes a long time).
Thanks.

Is using SemaphoreSlim the correct way to replace a lock statement in an async method?
Yes.
Should I make the cacheMissResolver argument of type Func<Task<Thing>> instead?
Yes. It'll allow the caller to provide an inherently asynchronous operation (such as IO) rather than making this only suitable for work that is long running CPU bound work. (While still supporting CPU bound work by simply having the caller use Task.Run themselves, if that's what they want to do.)
Other than that, just note that there's no point in having await Task.FromResult(...). Wrapping a value in a Task just to immediately unwrap it is pointless. Just use the result directly in such situations; in this case, return the cached value directly. What you're doing isn't really wrong, it's just needlessly complicating the code.
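Putting those three points together, a minimal sketch might look like the following. The Thing type, the Dictionary-backed store, and the _gate field name are assumptions made for illustration; the original cache type is not shown in the question. Unlike the question's code, this sketch also reads the cache only while holding the semaphore, because a plain Dictionary is not safe to read while another caller is writing to it.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical payload type, standing in for the question's Thing.
public class Thing
{
    public string Name { get; set; }
}

public class ThingCache
{
    // One permit: behaves like a lock that can be awaited.
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);

    // Assumed backing store; the question does not show its cache type.
    private readonly Dictionary<string, Thing> _cache = new Dictionary<string, Thing>();

    public async Task<Thing> GetAsync(string key, Func<Task<Thing>> cacheMissResolver)
    {
        await _gate.WaitAsync().ConfigureAwait(false);
        try
        {
            // Cache hit: return the value directly, no Task.FromResult needed.
            if (_cache.TryGetValue(key, out var cached))
                return cached;

            // Cache miss: the caller decides how the work is made asynchronous.
            var item = await cacheMissResolver().ConfigureAwait(false);
            _cache[key] = item;
            return item;
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

A caller with CPU-bound work can still pass () => Task.Run(() => Compute()), where Compute is whatever synchronous method they have.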

If your cache is in-memory (it looks like it is), then consider caching the tasks rather than the results. This has a nice side effect: if two callers request the same key, only a single resolving request is made. Also, since only the cache is locked (and not the resolving operations), you can continue using a simple lock.
public class ThingCache
{
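private static readonly object _lockObj = new object();
// The cache stores Task<Thing>, so the method returns the task itself and does not need to be async.
public Task<Thing> GetAsync(string key, Func<Task<Thing>> cacheMissResolver)
{
lock (_lockObj)
{
if (cache.Contains(key))
return cache[key];
var task = cacheMissResolver();
cache.Add(key, task);
return task;
}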
}
}
However, this will also cache exceptions, which you may not want. One way to avoid this is to permit the exception task to enter the cache initially, but then prune it when the next request is made:
public class ThingCache
{
private static readonly object _lockObj = new object();
public Task<Thing> GetAsync(string key, Func<Task<Thing>> cacheMissResolver)
{
lock (_lockObj)
{
if (cache.Contains(key))
{
// Reuse tasks that succeeded or are still running; evict only failed or cancelled ones.
if (!cache[key].IsFaulted && !cache[key].IsCanceled)
return cache[key];
cache.Remove(key);
}
var task = cacheMissResolver();
cache.Add(key, task);
return task;
}
}
}
You may decide this extra check is unnecessary if you have another process pruning the cache periodically.
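For reference, here is a self-contained sketch of the task-caching idea with the pruning check included. The generic TaskCache<T> name and the Dictionary store are assumptions for illustration, not part of the original answer. Note that the resolver is invoked while the lock is held, so this only works well if the resolver returns its task quickly and does the slow part asynchronously.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical generic variant of the ThingCache above.
public class TaskCache<T>
{
    private readonly object _lockObj = new object();
    private readonly Dictionary<string, Task<T>> _tasks = new Dictionary<string, Task<T>>();

    public Task<T> GetAsync(string key, Func<Task<T>> cacheMissResolver)
    {
        lock (_lockObj)
        {
            if (_tasks.TryGetValue(key, out var existing))
            {
                // Keep successful and still-running tasks; drop failed or cancelled ones.
                if (!existing.IsFaulted && !existing.IsCanceled)
                    return existing;
                _tasks.Remove(key);
            }

            // Only the bookkeeping happens under the lock; the returned task
            // is awaited by callers after the lock has been released.
            var task = cacheMissResolver();
            _tasks.Add(key, task);
            return task;
        }
    }
}
```

Two callers asking for the same key while the first request is still in flight receive the same Task<T> instance, so the resolver runs once for both of them.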

Related

Is it possible to replace yield by await?

Both yield and await are capable of interrupting the execution of a method. Both of them can be used to let the caller continue execution. Only the await seems to me to be a stronger tool that can do not only this but much more. Is it true?
I would like to replace the following:
yield item;
By this:
await buffer.SendAsync(item);
Is it possible? One thing I like about yield is that everything happens on one thread. Could it be the same with the await approach?
I tried to implement it as follows:
class Program
{
public static async Task Main()
{
await ConsumeAsync();
}
// instead of IEnumerable<int> Produce()
static async Task ProduceAsync(Buffer<int> buffer)
{
for (int i = 0; ; i++)
{
Console.WriteLine($"Produced {i} on thread {Thread.CurrentThread.ManagedThreadId}");
await buffer.SendAsync(i); // instead of yield i;
}
}
static async Task ConsumeAsync()
{
Buffer<int> buffer = new Buffer<int>(ProduceAsync); // instead enumerable.GetEnumerator
while (true)
{
int i = await buffer.ProduceAsync(); // instead of enumerator.MoveNext(); enumerator.Current
Console.WriteLine($"Consumed {i} on thread {Thread.CurrentThread.ManagedThreadId}");
}
}
}
class Buffer<T>
{
private T item;
public Buffer(Func<Buffer<T>, Task> producer)
{
producer(this); // starts the producer
}
public async Task<T> ProduceAsync()
{
// block the consumer
// continue the execution of producer
// await until producer produces something
return item;
}
public async Task SendAsync(T item)
{
this.item = item;
// block the producer
// continue the execution of consumer
// await until the consumer is requesting next item
}
}
But I did not know how to do the synchronization to keep everything on one thread.
Yes, it is possible to use patterns like that. I have done so when writing state machines for UI interactions, where the user needs to place a sequence of clicks, and different things should happen between each click. Using something like await mouse.GetDownTask() allows the code to be written in a fairly linear way.
But that does not necessarily mean that it was easy to read or understand. The underlying mechanisms to make it all work were quite horrible. So you should be aware that this is abuse of the system, and that other readers of your code will probably not expect such a usage. So use with care, and make sure you are on top of your documentation.
To make this work you likely need to use TaskCompletionSource. Something like this might work:
class Buffer<T>
{
private TaskCompletionSource<T> tcs = new ();
public Buffer() { }
public Task<T> Receive() => tcs.Task;
public void Send(T item)
{
tcs.SetResult(item);
tcs = new TaskCompletionSource<T>();
}
}
You should also be aware of deadlocks. This is a potential problem when using fake async on a single thread like this.
You could just use a Channel<T>:
class Buffer<T>
{
private readonly Channel<T> _channel = Channel.CreateBounded<T>(1);
public Task<T> ProduceAsync()
{
// block the consumer
// continue the execution of producer
// await until producer produces something
return _channel.Reader.ReadAsync().AsTask();
}
public Task SendAsync(T item)
{
return _channel.Writer.WriteAsync(item).AsTask();
// block the producer
// continue the execution of consumer
// await until the consumer is requesting next item
}
}
The built-in Channel<T> implementations have a minimum bounded capacity of 1. My understanding is that you want a channel with zero capacity, like the channels in Go language. Implementing one is doable, but not trivial. For a starting point, see this answer.
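As a rough illustration of the bounded-channel approach, the sketch below runs a producer and a consumer against a Channel<int> with capacity 1. The ChannelDemo class and its method names are hypothetical. Because the capacity is 1 rather than 0, the producer can run one item ahead of the consumer, and nothing here forces both sides onto a single thread.

```csharp
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical demo type; not part of the original answer.
public static class ChannelDemo
{
    public static async Task<List<int>> RunAsync(int count)
    {
        // Capacity 1: WriteAsync waits whenever one unread item is already buffered.
        var channel = Channel.CreateBounded<int>(1);

        // Start the producer without awaiting it yet, so both sides make progress.
        Task producer = ProduceAsync(channel.Writer, count);

        var consumed = new List<int>();
        // ReadAllAsync completes once the writer is marked complete and drained.
        await foreach (int item in channel.Reader.ReadAllAsync())
            consumed.Add(item);

        await producer; // surface any producer exception
        return consumed;
    }

    private static async Task ProduceAsync(ChannelWriter<int> writer, int count)
    {
        for (int i = 0; i < count; i++)
            await writer.WriteAsync(i); // plays the role of "yield i"
        writer.Complete();
    }
}
```

Unlike the infinite loop in the question, this producer stops after count items and calls Complete so that the consumer's loop can end.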

Calling Entity Framework async code, synchronously, within a lock

I have an async method which will load some info from the database via Entity Framework.
In one circumstance I want to call that code synchronously from within a lock.
Do I need two copies of the code, one async, one not, or is there a way of calling the async code synchronously?
For example something like this:
using System;
using System.Threading.Tasks;
public class Program
{
public static void Main()
{
new Test().Go();
}
}
public class Test
{
private object someLock = new object();
public void Go()
{
lock(someLock)
{
Task<int> task = Task.Run(async () => await DoSomethingAsync());
var result = task.Result;
}
}
public async Task<int> DoSomethingAsync()
{
// This will make a database call
return await Task.FromResult(0);
}
}
Edit: as a number of the comments are saying the same thing, I thought I'd elaborate a little
Background: normally, trying to do this is a bad idea. Lock and async are polar opposites as documented in lots of places, there's no reason to have an async call in a lock.
So why do it here? I can make the database call synchronously but that requires duplicating some methods which isn't ideal. Ideally the language would let you call the same method synchronously or asynchronously
Scenario: this is a Web API. The application starts, a number of Web API calls execute and they all want some info that's in the database that's provided by a service provider dedicated for that purpose (i.e. a call added via AddScoped in the Startup.cs). Without something like a lock they will all try to get the info from the database. EF Core is only relevant in that every other call to the database is async, this one is the exception.
You simply cannot use a lock with asynchronous code; the entire point of async/await is to switch away from a strict thread-based model, but lock, aka System.Threading.Monitor, is entirely thread-focused. Frankly, you also shouldn't attempt to synchronously call asynchronous code; that is simply not valid, and no "solution" is correct.
SemaphoreSlim makes a good alternative to lock as an asynchronous-aware synchronization primitive. However, you should either acquire/release the semaphore inside the async operation in your Task.Run, or you should make your Go an asynchronous method, i.e. public async Task GoAsync(), and do the same there; of course, at that point it becomes redundant to use Task.Run, so: just execute await DoSomethingAsync() directly:
private readonly SemaphoreSlim someLock = new SemaphoreSlim(1, 1);
public async Task GoAsync()
{
await someLock.WaitAsync();
try
{
await DoSomethingAsync();
}
finally
{
someLock.Release();
}
}
If the try/finally bothers you, perhaps cheat!
public async Task GoAsync()
{
using (await someLock.LockAsync())
{
await DoSomethingAsync();
}
}
with
internal static class SemaphoreExtensions
{
public static ValueTask<SemaphoreToken> LockAsync(this SemaphoreSlim semaphore)
{
// try to take synchronously
if (semaphore.Wait(0)) return new(new SemaphoreToken(semaphore));
return SlowLockAsync(semaphore);
static async ValueTask<SemaphoreToken> SlowLockAsync(SemaphoreSlim semaphore)
{
await semaphore.WaitAsync().ConfigureAwait(false);
return new(semaphore);
}
}
}
internal readonly struct SemaphoreToken : IDisposable
{
private readonly SemaphoreSlim _semaphore;
public void Dispose() => _semaphore?.Release();
internal SemaphoreToken(SemaphoreSlim semaphore) => _semaphore = semaphore;
}
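To check that a SemaphoreSlim(1, 1) really does serialize an awaited section, a small sketch like the following can help. The Worker class and its MaxInside counter are hypothetical, and Task.Delay stands in for the database call.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical demo type that records how many callers were ever inside the guarded section at once.
public class Worker
{
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);
    private int _inside;

    // Highest number of callers observed inside the guarded section.
    public int MaxInside;

    public async Task GoAsync()
    {
        await _gate.WaitAsync();
        try
        {
            int now = Interlocked.Increment(ref _inside);
            MaxInside = Math.Max(MaxInside, now); // safe: only one caller is in here at a time

            await Task.Delay(10); // stand-in for the awaited database call

            Interlocked.Decrement(ref _inside);
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

Starting several GoAsync calls at once and awaiting them with Task.WhenAll should leave MaxInside at 1.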

What should an async method do if a Task is conditionally executed?

Suppose I have a method that awaits a Task. This method also returns a Task. For example:
public async virtual Task Save(String path)
{
if (NewWords.Any())
{
await FileManager.WriteDictionary(path, NewWords, true);
}
else await Task.Run(() => { });
}
Is the
else await Task.Run(() => { });
necessary here or am I free to leave it? Is there any difference if it is present/absent? Maybe there is some other approach to this I should take?
It's worse than unnecessary, as you're spinning up a thread to do nothing and then waiting until after it's finished doing nothing.
The simplest way to do nothing, is to do nothing. In an async method the method will still have returned a Task, but that Task will be completed already, so something awaiting it further up will get straight onto the next thing it needs to do:
public async virtual Task Save(String path)
{
if (NewWords.Any())
{
await FileManager.WriteDictionary(path, NewWords, true);
}
}
(Also, it would be more in line with convention if SaveAsync and WriteDictionaryAsync were the method names here).
If not using async (and there's no need to here, but I understand it's an example) use Task.CompletedTask:
public virtual Task Save(String path)
{
if (NewWords.Any())
{
return FileManager.WriteDictionary(path, NewWords, true);
}
return Task.CompletedTask;
}
If you are coding against an earlier framework than 4.6 and therefore don't have CompletedTask available, then Task.Delay(0) is useful as Delay special cases the value 0 to return a cached completed task (in fact, the same one that CompletedTask returns):
public virtual Task Save(String path)
{
if (NewWords.Any())
{
return FileManager.WriteDictionary(path, NewWords, true);
}
return Task.Delay(0);
}
But the 4.6 way is clearer as to your intent, rather than depending on a quirk of implementation.
It's not necessary. The async keyword is only needed if at least one await is used. Everything inside the method is executed synchronously except for the await part.
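A quick way to convince yourself of this is to inspect the task returned when the await branch is skipped. In this sketch the SaveDemo class and the hasWork flag are hypothetical stand-ins for NewWords.Any(), and Task.Delay stands in for the file write.

```csharp
using System.Threading.Tasks;

// Hypothetical demo type; hasWork plays the role of NewWords.Any().
public static class SaveDemo
{
    public static async Task SaveAsync(bool hasWork)
    {
        if (hasWork)
        {
            await Task.Delay(50); // stand-in for FileManager.WriteDictionary(...)
        }
        // No else branch: falling off the end completes the returned Task.
    }
}
```

When hasWork is false, the method runs synchronously to the end, so the Task it returns is already completed by the time the caller receives it.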

When a method uses a cache, but occasionally does I/O, should it return T, or Task<T>?

So my question basically comes down to, should I return a Task, so the caller can be async, even if 99% of the time, there is no I/O?
Say my code is something like:
var countryList = _cacheService.Get<List<Country>>("Countries", LoadCountrys);
In this example, LoadCountrys is a method that will make the I/O call.
So the first time this is called, it will do some I/O to load all the countries into a cache, then each call after that will just read from an in-memory cache (no I/O).
The method signature would look like:
List<Country> GetCountryList();
The other way to do this would be:
var countryList = await _cacheService.GetAsync<List<Country>>("Countries", LoadCountrysAsync);
The method signature would look like:
Task<List<Country>> GetCountryListAsync();
It seems wasteful to me, paying for the overhead of a Task, when most of the time the code is not truly async.
However in Joe Duffy's blog, in the section Variable latency and asynchrony, he goes on to say even if you rarely do I/O, you should still return a Task.
So are my gut instincts wrong? Or maybe the Task overhead is so small it doesn't matter? Or is this a "It depends" kind of answer?
The best way to handle this is to return a Task<T>; however, you don't need to make your "fast path" async and incur the overhead. Just use Task.FromResult, or call a second async method to actually execute the factory method to get the value.
public static class CacheManager
{
public static Task<T> GetAsync<T>(string cacheKey, Func<Task<T>> factory)
where T : class
{
var result = (T)MemoryCache.Default.Get(cacheKey);
if (result != null)
{
return Task.FromResult(result);
}
return RunFactory<T>(cacheKey, factory);
}
private static async Task<T> RunFactory<T>(string cacheKey, Func<Task<T>> factory)
where T : class
{
await PurgeOldLocks();
var cacheLock = _locks.GetOrAdd(cacheKey, (key) => new SemaphoreSlim(1));
//Wait for anyone currently running the factory.
//Acquire before entering the try so the finally never releases a semaphore we do not hold.
await cacheLock.WaitAsync();
try
{
//Check to see if another factory has already run while we waited.
var oldResult = (T)MemoryCache.Default.Get(cacheKey);
if (oldResult != null)
{
return oldResult;
}
//Run the factory then cache the result.
var newResult = await factory();
MemoryCache.Default.Add(cacheKey, newResult, _myDefaultPolicy);
return newResult;
}
finally
{
cacheLock.Release();
}
}
private static async Task PurgeOldLocks()
{
//Only one thread can run the purge;
await _purgeLock.WaitAsync();
try
{
if ((DateTime.UtcNow - _lastPurge).Duration() > MinPurgeFrequency)
{
_lastPurge = DateTime.UtcNow;
var locksSnapshot = _locks.ToList();
foreach (var kvp in locksSnapshot)
{
//Try to take the lock but do not wait for it.
var waited = await kvp.Value.WaitAsync(0);
if (waited)
{
//We were able to take the lock so remove it from the collection and dispose it.
SemaphoreSlim _;
_locks.TryRemove(kvp.Key, out _);
kvp.Value.Dispose();
}
}
}
}
finally
{
_purgeLock.Release();
}
}
public static TimeSpan MinPurgeFrequency { get; set; } = TimeSpan.FromHours(1);
private static DateTime _lastPurge = DateTime.MinValue;
private static readonly SemaphoreSlim _purgeLock = new SemaphoreSlim(1);
private static readonly ConcurrentDictionary<string, SemaphoreSlim> _locks = new ConcurrentDictionary<string, SemaphoreSlim>();
private static CacheItemPolicy _myDefaultPolicy = //...
}
In order to make sure two threads don't run a factory at the same time, I keep a ConcurrentDictionary<string, SemaphoreSlim> to lock running of the factory. However, this will lead to memory leaks, so once an hour I go through the list under an exclusive lock and delete any key I can take the lock on.
should I return a Task, so the caller can be async, even if 99% of the time, there is no I/O?
My answer is yes. In theory, async methods are slightly slower, because the compiler generates more code (the state machine) that has to be executed. But that cost is negligible compared with what async brings.

If my interface must return Task what is the best way to have a no-operation implementation?

In the code below, due to the interface, the class LazyBar must return a task from its method (and for argument's sake can't be changed). If LazyBar's implementation is unusual in that it happens to run quickly and synchronously, what is the best way to return a No-Operation task from the method?
I have gone with Task.Delay(0) below, however I would like to know if this has any performance side-effects if the function is called a lot (for argument's sake, say hundreds of times a second):
Does this syntactic sugar un-wind to something big?
Does it start clogging up my application's thread pool?
Is the compiler clever enough to deal with Delay(0) differently?
Would return Task.Run(() => { }); be any different?
Is there a better way?
using System.Threading.Tasks;
namespace MyAsyncTest
{
internal interface IFooFace
{
Task WillBeLongRunningAsyncInTheMajorityOfImplementations();
}
/// <summary>
/// An implementation, that unlike most cases, will not have a long-running
/// operation in 'WillBeLongRunningAsyncInTheMajorityOfImplementations'
/// </summary>
internal class LazyBar : IFooFace
{
#region IFooFace Members
public Task WillBeLongRunningAsyncInTheMajorityOfImplementations()
{
// First, do something really quick
var x = 1;
// Can't return 'null' here! Does 'Task.Delay(0)' have any performance considerations?
// Is it a real no-op, or if I call this a lot, will it adversely affect the
// underlying thread-pool? Better way?
return Task.Delay(0);
// Any different?
// return Task.Run(() => { });
// If my task returned something, I would do:
// return Task.FromResult<int>(12345);
}
#endregion
}
internal class Program
{
private static void Main(string[] args)
{
Test();
}
private static async void Test()
{
IFooFace foo = FactoryCreate();
await foo.WillBeLongRunningAsyncInTheMajorityOfImplementations();
return;
}
private static IFooFace FactoryCreate()
{
return new LazyBar();
}
}
}
Today, I would recommend using Task.CompletedTask to accomplish this.
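A minimal sketch of what that looks like for the interface in the question follows. The method name is shortened to DoWorkAsync here for readability, and the Calls counter is a hypothetical addition so the effect of a call can be observed.

```csharp
using System.Threading.Tasks;

// Same shape as the question's interface, with a shortened (hypothetical) method name.
public interface IFooFace
{
    Task DoWorkAsync();
}

public sealed class LazyBar : IFooFace
{
    // Hypothetical counter so the synchronous work is observable.
    public int Calls { get; private set; }

    public Task DoWorkAsync()
    {
        // Do the quick synchronous work first...
        Calls++;

        // ...then hand back an already-completed task: nothing is scheduled
        // and no thread-pool thread is involved.
        return Task.CompletedTask;
    }
}
```

Awaiting the returned task continues immediately, because the task is already complete when the caller receives it.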
Pre .net 4.6:
Using Task.FromResult(0) or Task.FromResult<object>(null) will incur less overhead than creating a Task with a no-op expression. When creating a Task with a result pre-determined, there is no scheduling overhead involved.
To add to Reed Copsey's answer about using Task.FromResult, you can improve performance even more if you cache the already completed task since all instances of completed tasks are the same:
public static class TaskExtensions
{
public static readonly Task CompletedTask = Task.FromResult(false);
}
With TaskExtensions.CompletedTask you can use the same instance throughout the entire app domain.
Version 4.6 of the .NET Framework adds just that with the Task.CompletedTask static property:
Task completedTask = Task.CompletedTask;
Task.Delay(0) as in the accepted answer was a good approach, as it is a cached copy of a completed Task.
As of 4.6 there's now Task.CompletedTask which is more explicit in its purpose, but not only does Task.Delay(0) still return a single cached instance, it returns the same single cached instance as does Task.CompletedTask.
Neither is guaranteed to remain cached, but the caching is purely an implementation-dependent optimisation (code relying on either would still work correctly if the implementation changed to something else that was still valid), so the use of Task.Delay(0) was better than the accepted answer.
return Task.CompletedTask; // this will make the compiler happy
Recently encountered this and kept getting warnings/errors about the method being void.
We're in the business of placating the compiler and this clears it up:
public async Task MyVoidAsyncMethod()
{
await Task.CompletedTask;
}
This brings together the best of all the advice here so far. No return statement is necessary unless you're actually doing something in the method.
When you must return specified type:
Task.FromResult<MyClass>(null);
I prefer the Task completedTask = Task.CompletedTask; solution of .NET 4.6, but another approach is to mark the method async and leave its body empty:
public async Task WillBeLongRunningAsyncInTheMajorityOfImplementations()
{
}
You'll get a warning (CS1998 - Async function without await expression), but this is safe to ignore in this context.
If you are using generics, the answers above will give a compile error. You can use return default(T); instead. The sample below illustrates this.
public async Task<T> GetItemAsync<T>(string id)
{
try
{
var response = await this._container.ReadItemAsync<T>(id, new PartitionKey(id));
return response.Resource;
}
catch (CosmosException ex) when (ex.StatusCode == System.Net.HttpStatusCode.NotFound)
{
return default(T);
}
}
return await Task.FromResult(new MyClass());
