Both yield and await are capable of suspending the execution of a method, and both can be used to let the caller continue execution. However, await seems to me to be a stronger tool that can do not only this but much more. Is that true?
I would like to replace the following:
yield return item;
By this:
await buffer.SendAsync(item);
Is it possible? One thing I like about yield is that everything happens on one thread. Could it be the same with the await approach?
I tried to implement it as follows:
class Program
{
public static async Task Main()
{
await ConsumeAsync();
}
// instead of IEnumerable<int> Produce()
static async Task ProduceAsync(Buffer<int> buffer)
{
for (int i = 0; ; i++)
{
Console.WriteLine($"Produced {i} on thread {Thread.CurrentThread.ManagedThreadId}");
await buffer.SendAsync(i); // instead of yield return i;
}
}
static async Task ConsumeAsync()
{
Buffer<int> buffer = new Buffer<int>(ProduceAsync); // instead of enumerable.GetEnumerator()
while (true)
{
int i = await buffer.ProduceAsync(); // instead of enumerator.MoveNext(); enumerator.Current
Console.WriteLine($"Consumed {i} on thread {Thread.CurrentThread.ManagedThreadId}");
}
}
}
class Buffer<T>
{
private T item;
public Buffer(Func<Buffer<T>, Task> producer)
{
producer(this); // starts the producer
}
public async Task<T> ProduceAsync()
{
// block the consumer
// continue the execution of producer
// await until producer produces something
return item;
}
public async Task SendAsync(T item)
{
this.item = item;
// block the producer
// continue the execution of consumer
// await until the consumer is requesting next item
}
}
But I did not know how to do the synchronization to keep everything on one thread.
Yes, it is possible to use patterns like that. I have done so when writing state machines for UI interactions, where the user needs to place a sequence of clicks, and different things should happen between each click. Using something like await mouse.GetDownTask() allows the code to be written in a fairly linear way.
But that does not necessarily mean that it was easy to read or understand. The underlying mechanisms to make it all work were quite horrible. So you should be aware that this is an abuse of the system, and that other readers of your code will probably not expect such a usage. Use it with care, and make sure you are on top of your documentation.
To make this work you likely need to use TaskCompletionSource. Something like this might work:
class Buffer<T>
{
private TaskCompletionSource<T> tcs = new ();
public Buffer() { }
public Task<T> Receive() => tcs.Task;
public void Send(T item)
{
tcs.SetResult(item);
tcs = new TaskCompletionSource<T>();
}
}
You should also be aware of deadlocks. This is a potential problem when using fake async on a single thread like this.
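For instance, here is a minimal sketch of how such a deadlock can arise (hypothetical consumer code, not from the question): in a single-threaded design the producer can only ever run on the same thread as the consumer, so any synchronous block on that thread means Send() is never reached and the awaited task never completes.
var buffer = new Buffer<int>();
Task<int> pending = buffer.Receive();
int value = pending.Result; // blocks the only thread; the producer never gets to call Send() => deadlock
Awaiting the task instead of blocking on .Result is what keeps the single-threaded scheme alive.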
You could just use a Channel<T>:
class Buffer<T>
{
private readonly Channel<T> _channel = Channel.CreateBounded<T>(1);
public Task<T> ProduceAsync()
{
// block the consumer
// continue the execution of producer
// await until producer produces something
return _channel.Reader.ReadAsync().AsTask();
}
public Task SendAsync(T item)
{
return _channel.Writer.WriteAsync(item).AsTask();
// block the producer
// continue the execution of consumer
// await until the consumer is requesting next item
}
}
The built-in Channel<T> implementations have a minimum bounded capacity of 1. My understanding is that you want a channel with zero capacity, like the channels in the Go language. Implementing one is doable, but not trivial. For a starting point, see this answer.
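For illustration, here is roughly how the Channel-backed Buffer plugs into the consumer from the question (a sketch; the producer is started explicitly because this Buffer has no producer-starting constructor, and with a capacity of 1 the producer simply runs at most one item ahead of the consumer):
static async Task ConsumeAsync()
{
    var buffer = new Buffer<int>();
    _ = ProduceAsync(buffer); // start the producer, as the old Buffer constructor did
    while (true)
    {
        int i = await buffer.ProduceAsync(); // reads the next produced item
        Console.WriteLine($"Consumed {i} on thread {Thread.CurrentThread.ManagedThreadId}");
    }
}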
Related
I have an async method which will load some info from the database via Entity Framework.
In one circumstance I want to call that code synchronously from within a lock.
Do I need two copies of the code, one async, one not, or is there a way of calling the async code synchronously?
For example something like this:
using System;
using System.Threading.Tasks;
public class Program
{
public static void Main()
{
new Test().Go();
}
}
public class Test
{
private object someLock = new object();
public void Go()
{
lock(someLock)
{
Task<int> task = Task.Run(async () => await DoSomethingAsync());
var result = task.Result;
}
}
public async Task<int> DoSomethingAsync()
{
// This will make a database call
return await Task.FromResult(0);
}
}
Edit: as a number of the comments are saying the same thing, I thought I'd elaborate a little
Background: normally, trying to do this is a bad idea. lock and async are polar opposites, as documented in lots of places; there's no reason to have an async call inside a lock.
So why do it here? I can make the database call synchronously, but that requires duplicating some methods, which isn't ideal. Ideally the language would let you call the same method synchronously or asynchronously.
Scenario: this is a Web API. The application starts, a number of Web API calls execute, and they all want some info that's in the database, provided by a service dedicated to that purpose (i.e. registered via AddScoped in Startup.cs). Without something like a lock they will all try to get the info from the database. EF Core is only relevant in that every other call to the database is async; this one is the exception.
You simply cannot use a lock with asynchronous code; the entire point of async/await is to switch away from a strict thread-based model, but lock aka System.Monitor is entirely thread focused. Frankly, you also shouldn't attempt to synchronously call asynchronous code; that is simply not valid, and no "solution" is correct.
SemaphoreSlim makes a good alternative to lock as an asynchronous-aware synchronization primitive. However, you should either acquire/release the semaphore inside the async operation in your Task.Run, or you should make your Go an asynchronous method, i.e. public async Task GoAsync(), and do the same there; of course, at that point it becomes redundant to use Task.Run, so: just execute await DoSomethingAsync() directly:
private readonly SemaphoreSlim someLock = new SemaphoreSlim(1, 1);
public async Task GoAsync()
{
await someLock.WaitAsync();
try
{
await DoSomethingAsync();
}
finally
{
someLock.Release();
}
}
If the try/finally bothers you, perhaps cheat!
public async Task GoAsync()
{
using (await someLock.LockAsync())
{
await DoSomethingAsync();
}
}
with
internal static class SemaphoreExtensions
{
public static ValueTask<SemaphoreToken> LockAsync(this SemaphoreSlim semaphore)
{
// try to take synchronously
if (semaphore.Wait(0)) return new(new SemaphoreToken(semaphore));
return SlowLockAsync(semaphore);
static async ValueTask<SemaphoreToken> SlowLockAsync(SemaphoreSlim semaphore)
{
await semaphore.WaitAsync().ConfigureAwait(false);
return new(semaphore);
}
}
}
internal readonly struct SemaphoreToken : IDisposable
{
private readonly SemaphoreSlim _semaphore;
public void Dispose() => _semaphore?.Release();
internal SemaphoreToken(SemaphoreSlim semaphore) => _semaphore = semaphore;
}
I have developed many algorithms that did most of the threading themselves using regular Threads. The approach was always as follows:
float[] GetData(int requestedItemIndex)
With the method above, an index was pushed into a message queue that was processed by the thread of the individual algorithm. So in the end the interface of the algorithm looked like this:
public abstract class AlgorithmBase
{
private readonly AlgorithmBase Parent;
private void RequestQueue()
{
}
public float[] GetData(int requestedItemIndex) => Parent.GetData(requestedItemIndex);
}
The example is very primitive, but it gets the idea across. The key point is that I can chain algorithms, which currently works fine with my solution. As you can see, every GetData calls the GetData of a parent algorithm. This can of course get more complex, and there needs to be a final parent acting as the data source, otherwise I would get a StackOverflowException.
Now I try to change this behavior by using async/await. My question here is that if I rewrite my code I would get something like this:
public abstract class AlgorithmBase
{
private readonly AlgorithmBase Parent;
public async Task<float[]> GetDataAsync(int requestedItemIndex, CancellationToken token = default)
{
var data = await Parent.GetDataAsync(requestedItemIndex);
return await Task.Run<float[]>(async () => ProcessData());
}
}
Now I have chained the algorithms, and every new algorithm spawns another Task, which can be quite time-consuming when this is done many times.
So my question is whether there is a way for the next task to be embedded in the already running task, using the defined interface?
There is no need to explicitly use Task.Run. You should avoid that, and leave that choice to the consumer of the AlgorithmBase class.
So you can implement the async version quite similarly, with the Task being propagated from parents to children:
public abstract class AlgorithmBase
{
// protected so that derived algorithms (like SortAlgorithm below) can reach their parent
protected readonly AlgorithmBase Parent;
private void RequestQueue()
{
}
// virtual so that a concrete algorithm can override it
public virtual Task<float[]> GetDataAsync(int requestedItemIndex)
=> Parent.GetDataAsync(requestedItemIndex);
}
Eventually, some "parent" will implement GetDataAsync, in the same manner as the synchronous counterpart.
public class SortAlgorithm : AlgorithmBase
{
public override async Task<float[]> GetDataAsync(int requestedItemIndex)
{
// asynchronously get data
var data = await Parent.GetDataAsync(requestedItemIndex);
// synchronously process data and return from asynchronous method
return this.ProcessData(data);
}
private float[] ProcessData(float[] data)
{
// ... synchronously process the data (e.g. sort it) and return the result ...
return data;
}
}
In the end, the consumer of SortAlgorithm can decide whether to await it or just fire-and-forget it.
var algo = new SortAlgorithm();
// asynchronously wait until it's finished
var data = await algo.GetDataAsync(1);
// start processing without waiting for the result
algo.GetDataAsync(1);
// not needed - GetDataAsync already returns Task, Task.Run is not needed in this case
Task.Run(() => algo.GetDataAsync(1));
When awaiting in library code you normally want to avoid capturing and restoring the context each and every time, especially if you are awaiting in a loop. So to improve the performance of your library consider using .ConfigureAwait(false) on all awaits.
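For example, the override shown above could apply it like this (a sketch based on the SortAlgorithm from earlier):
public override async Task<float[]> GetDataAsync(int requestedItemIndex)
{
    // ConfigureAwait(false): do not capture and restore the caller's context for the continuation
    var data = await Parent.GetDataAsync(requestedItemIndex).ConfigureAwait(false);
    return this.ProcessData(data);
}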
Regarding the right worker method signature I need to understand the following:
is there a point in returning Task instead of void for Worker method (if going sync)?
Should I really wait (call Wait()) on the Worker method (if going sync)?
what should be the return value of Worker method when marked as returning Task object (both if going sync/async)?
what signature and body of Worker method should be, given the work it completes is long-running CPU/IO-bound work? Should I follow this recommendation (if going mixed/async)?
Note
Despite the CPU-bound code, there is a choice to call async versions of the IO-bound methods (SQL queries). So the code in the Worker method may be all sync or partially async.
public class LoopingService
{
private CancellationTokenSource cts;
// ..
void Worker(CancellationToken cancellationToken)
{
while(!cancellationToken.IsCancellationRequested)
{
// mixed, CPU/IO-bound code
try {
// sql query (can be called either as sync/async)
var lastId = documentService.GetLastDocument().Id;
// get next document from a public resource (third-party code, sync)
// can be moved to a web api
var document = thirdPartyDocumentService.GetNextDocument(lastId);
// apply different processors in parallel
var tasksList = new List<Task>();
foreach(var processor in documentService.Processors) {
// each processor checks if it's applicable
// which may include xml-parsing, additional db calls, regexes
// if it's applicable then document data is inserted into the db
var task = new Task(() => processor.Process(document));
tasksList.Add(task);
task.Start();
}
// or
// var tasksList = documentService.ProcessParallel(document);
Task.WaitAll(tasksList.ToArray(), cancellationToken);
}
catch(Exception ex) {
logger.log(ex);
}
}
}
public void Start()
{
this.cts = new CancellationTokenSource();
Task.Run(() => this.Worker(cts.Token));
}
public void Stop()
{
this.cts.Cancel();
this.cts.Dispose();
}
}
is there a point in returning Task instead of void for Worker method?
If Worker is a truly asynchronous method it should return a Task for you to be able to await it. If it's just a synchronous method running on a background thread there is no point in changing the return type from void, provided that the method is not supposed to return anything.
what should be the return value of Worker method when marked as returning Task object?
Nothing. Provided that the method is asynchronous and marked as async with a return type of Task, it shouldn't return any value:
async Task Worker(CancellationToken cancellationToken) { ... }
Note that there is no point in defining the method as async unless you actually use the await keyword in it.
what signature and body of Worker method should be, given the work it completes is long-running CPU/IO-bound work? Should I follow this recommendation?
Yes, probably. If you for some reason are doing both asynchronous and synchronous (CPU-bound) work in the same method, you should prefer using an asynchronous signature, but not wrap the synchronous stuff in Task.Run. Then your service would look something like this:
public class LoopingService
{
private CancellationTokenSource cts;
async Task Worker(CancellationToken cancellationToken)
{
while (!cancellationToken.IsCancellationRequested)
{
await ...
}
}
public async Task Start()
{
this.cts = new CancellationTokenSource();
await this.Worker(cts.Token).ConfigureAwait(false);
}
public void Stop()
{
this.cts.Cancel();
this.cts.Dispose();
}
}
Ideally your method should be either asynchronous or CPU-bound but not both though.
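As a rough sketch under those recommendations, the Worker body from the question might become something like the following (GetLastDocumentAsync is an assumed async counterpart of GetLastDocument; the other names are taken from the question, the explicit parallel tasks for the processors are kept as in the original code, and Select requires System.Linq):
async Task Worker(CancellationToken cancellationToken)
{
    while (!cancellationToken.IsCancellationRequested)
    {
        try
        {
            // async version of the sql query
            var lastDocument = await documentService.GetLastDocumentAsync().ConfigureAwait(false);
            // third-party call stays synchronous
            var document = thirdPartyDocumentService.GetNextDocument(lastDocument.Id);
            // run the processors in parallel and await them instead of blocking with WaitAll
            var tasks = documentService.Processors
                .Select(processor => Task.Run(() => processor.Process(document), cancellationToken));
            await Task.WhenAll(tasks).ConfigureAwait(false);
        }
        catch (Exception ex)
        {
            logger.log(ex);
        }
    }
}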
I'm trying to optimize an async version of something similar (in basic functionality) to the Monitor.Wait and Monitor.Pulse methods. The idea is to use this over an async method.
Requirements:
1) I have one Task running that is in charge of waiting until someone pulses my monitor.
2) That task may perform a complex (i.e. time-consuming) operation. In the meantime, the Pulse method could be called several times without doing anything (as the main task is already doing some processing).
3) Once the main task finishes, it starts to Wait again until another Pulse comes in.
Worst case scenario is Wait>Pulse>Wait>Pulse>Wait..., but usually I have tens/hundreds of pulses for every wait.
So, I have the following class (working, but I think it can be optimized a bit based on my requirements)
internal sealed class Awaiter
{
private readonly ConcurrentQueue<TaskCompletionSource<byte>> _waiting = new ConcurrentQueue<TaskCompletionSource<byte>>();
public void Pulse()
{
TaskCompletionSource<byte> tcs;
if (_waiting.TryDequeue(out tcs))
{
tcs.TrySetResult(1);
}
}
public Task Wait()
{
TaskCompletionSource<byte> tcs;
if (_waiting.TryPeek(out tcs))
{
return tcs.Task;
}
tcs = new TaskCompletionSource<byte>();
_waiting.Enqueue(tcs);
return tcs.Task;
}
}
The problem with the above class is the baggage I'm using just for synchronization. Since I will be waiting from one and only one thread, there is really no need to have a ConcurrentQueue, as I always have only one item in it.
So, I simplified it a bit and wrote the following:
internal sealed class Awaiter2
{
private readonly object _mutex = new object();
private TaskCompletionSource<byte> _waiting;
public void Pulse()
{
var w = _waiting;
if (w == null)
{
return;
}
lock (_mutex)
{
w = _waiting;
if (w == null)
{
return;
}
_waiting = null;
w.TrySetResult(1);
}
}
public Task Wait()
{
var w = _waiting;
if (w != null)
{
return w.Task;
}
lock (_mutex)
{
w = _waiting;
if (w != null)
{
return w.Task;
}
w = _waiting = new TaskCompletionSource<byte>();
return w.Task;
}
}
}
That new version is also working ok, but I'm still thinking it can be optimized a bit more, by removing the locks.
I'm looking for suggestions on how I can optimize the second version. Any ideas?
If you don't need the Wait() call to return a Task but are content with being able to await Wait() then you can implement a custom awaiter/awaitable.
See this link for an overview of the await pattern used by the compiler.
When implementing custom awaitables you will just be dealing with delegates and the actual "waiting" is left up to you. When you want to "await" for a condition it is often possible to keep a list of pending continuations and whenever the condition comes true you can invoke those continuations. You just need to deal with the synchronization coming from the fact that await can be called from arbitrary threads. If you know that you'll only ever await from one thread (say the UI thread) then you don't need any synchronization at all!
I'll try to give you a lock-free implementation but no guarantees that it is correct. If you don't understand why all race conditions are safe you should not use it and implement the async/await protocol using lock-statements or other techniques which you know how to debug.
public sealed class AsyncMonitor
{
private PulseAwaitable _currentWaiter;
public AsyncMonitor()
{
_currentWaiter = new PulseAwaitable();
}
public void Pulse()
{
// Optimize for the case when calling Pulse() when nobody is waiting.
//
// This has an inherent race condition when calling Pulse() and Wait()
// at the same time. The question this was written for did not specify
// how to resolve this, so it is a valid answer to tolerate either
// result and just allow the race condition.
//
if (_currentWaiter.HasWaitingContinuations)
Interlocked.Exchange(ref _currentWaiter, new PulseAwaitable()).Complete();
}
public PulseAwaitable Wait()
{
return _currentWaiter;
}
}
// This class maintains a list of waiting continuations to be executed when
// the owning AsyncMonitor is pulsed.
public sealed class PulseAwaitable : INotifyCompletion
{
// List of pending 'await' delegates.
private Action _pendingContinuations;
// Flag whether we have been pulsed. This is the primary variable
// around which we build the lock free synchronization.
private int _pulsed;
// AsyncMonitor creates instances as required.
internal PulseAwaitable()
{
}
// This check has a race condition which is tolerated.
// It is used to optimize for cases when the PulseAwaitable has no waiters.
internal bool HasWaitingContinuations
{
get { return Volatile.Read(ref _pendingContinuations) != null; }
}
// Called by the AsyncMonitor when it is pulsed.
internal void Complete()
{
// Set pulsed flag first because that is the variable around which
// we build the lock free protocol. Everything else this method does
// is free to have race conditions.
Interlocked.Exchange(ref _pulsed, 1);
// Execute pending continuations. This is free to race with calls
// of OnCompleted seeing the pulsed flag first.
Interlocked.Exchange(ref _pendingContinuations, null)?.Invoke();
}
#region Awaitable
// There is no need to separate the awaiter from the awaitable
// so we use one class to implement both parts of the protocol.
public PulseAwaitable GetAwaiter()
{
return this;
}
#endregion
#region Awaiter
public bool IsCompleted
{
// The return value of this property does not need to be up to date so we could omit the 'Volatile.Read' if we wanted to.
// What is not allowed is returning "true" even if we are not completed, but this cannot happen since we never transition back to incomplete.
get { return Volatile.Read(ref _pulsed) == 1; }
}
public void OnCompleted(Action continuation)
{
// Protected against manual invocations. The compiler-generated code never passes null so you can remove this check in release builds if you want to.
if (continuation == null)
throw new ArgumentNullException(nameof(continuation));
// Standard pattern of maintaining a lock free immutable variable: read-modify-write cycle.
// See for example here: https://blogs.msdn.microsoft.com/oldnewthing/20140516-00/?p=973
// Again the 'Volatile.Read' is not really needed since outdated values will be detected at the first iteration.
var oldContinuations = Volatile.Read(ref _pendingContinuations);
for (;;)
{
var newContinuations = (oldContinuations + continuation);
var actualContinuations = Interlocked.CompareExchange(ref _pendingContinuations, newContinuations, oldContinuations);
if (actualContinuations == oldContinuations)
break;
oldContinuations = actualContinuations;
}
// Now comes the interesting part where the actual lock free synchronization happens.
// If we are completed then somebody needs to clean up remaining continuations.
// This happens last so the first part of the method can race with pulsing us.
if (IsCompleted)
Interlocked.Exchange(ref _pendingContinuations, null)?.Invoke();
}
public void GetResult()
{
// This is just to check against manual calls. The compiler will never call this when IsCompleted is false.
// (Assuming your OnCompleted implementation is bug-free and you don't execute continuations before IsCompleted becomes true.)
if (!IsCompleted)
throw new NotSupportedException("Synchronous waits are not supported. Use 'await' or OnCompleted to wait asynchronously");
}
#endregion
}
You usually don't bother on which thread the continuations run because if they are async methods the compiler has already inserted code (in the continuation) to switch back to the right thread, no need to do it manually in every awaitable implementation.
[edit]
As a starting point for how a locking implementation can look I'll provide one using a lock-statement. It should be easy to replace it by a spinlock or some other locking technique. By using a struct as the awaitable it even has the advantage that it does no additional allocation except for the initial object. (There are of course allocations in the async/await framework in the compiler magic on the calling side, but you can't get rid of these.)
Note that the iteration counter will increment only for every Wait+Pulse pair and will eventually overflow into negative, but that is ok. We just need to bridge the time from the continuation being invoked until it can call GetResult. 4 billion Wait+Pulse pairs should be plenty of time for any pending continuation to call its GetResult method. If you don't want that risk you could use a long or Guid for a more unique iteration counter, but IMHO an int is good for almost all scenarios.
public sealed class AsyncMonitor
{
public struct Awaitable : INotifyCompletion
{
// We use a struct to avoid allocations. Note that this means the compiler will copy
// the struct around in the calling code when doing 'await', so for your own debugging
// sanity make all variables readonly.
private readonly AsyncMonitor _monitor;
private readonly int _iteration;
public Awaitable(AsyncMonitor monitor)
{
lock (monitor)
{
_monitor = monitor;
_iteration = monitor._iteration;
}
}
public Awaitable GetAwaiter()
{
return this;
}
public bool IsCompleted
{
get
{
// We use the iteration counter as an indicator when we should be complete.
lock (_monitor)
{
return _monitor._iteration != _iteration;
}
}
}
public void OnCompleted(Action continuation)
{
// The compiler never passes null, but someone may call it manually.
if (continuation == null)
throw new ArgumentNullException(nameof(continuation));
lock (_monitor)
{
// Not calling IsCompleted since we already have a lock.
if (_monitor._iteration == _iteration)
{
_monitor._waiting += continuation;
// null the continuation to indicate the following code
// that we completed and don't want it executed.
continuation = null;
}
}
// If we were already completed then we didn't null the continuation.
// (We should invoke the continuation outside of the lock because it
// may want to Wait/Pulse again and we want to avoid reentrancy issues.)
continuation?.Invoke();
}
public void GetResult()
{
lock (_monitor)
{
// Not calling IsCompleted since we already have a lock.
if (_monitor._iteration == _iteration)
throw new NotSupportedException("Synchronous wait is not supported. Use await or OnCompleted.");
}
}
}
private Action _waiting;
private int _iteration;
public AsyncMonitor()
{
}
public void Pulse(bool executeAsync)
{
Action execute = null;
lock (this)
{
// If nobody is waiting we don't need to increment the iteration counter.
if (_waiting != null)
{
_iteration++;
execute = _waiting;
_waiting = null;
}
}
// Important: execute the callbacks outside the lock because they might Pulse or Wait again.
if (execute != null)
{
// If the caller doesn't want inlined execution (maybe he holds a lock)
// then execute it on the thread pool.
if (executeAsync)
Task.Run(execute);
else
execute();
}
}
public Awaitable Wait()
{
return new Awaitable(this);
}
}
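A short usage sketch of the locking variant above (DoExpensiveWork is a placeholder for the caller's processing; it is not part of the answer):
class Consumer
{
    private readonly AsyncMonitor _monitor = new AsyncMonitor();

    public async Task RunAsync()
    {
        while (true)
        {
            await _monitor.Wait();   // suspends until the next Pulse
            DoExpensiveWork();       // pulses arriving during this call find no waiter and are ignored
        }
    }

    // called by producers, possibly many times per wait
    public void Signal() => _monitor.Pulse(executeAsync: true);

    private void DoExpensiveWork() { /* placeholder */ }
}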
Here is my simple async implementation that I use in my projects:
internal sealed class Pulsar
{
private static TaskCompletionSource<bool> Init() => new TaskCompletionSource<bool>();
private TaskCompletionSource<bool> _tcs = Init();
public void Pulse()
{
Interlocked.Exchange(ref _tcs, Init()).SetResult(true);
}
public Task AwaitPulse(CancellationToken token)
{
return token.CanBeCanceled ? _tcs.Task.WithCancellation(token) : _tcs.Task;
}
}
Add TaskCreationOptions.RunContinuationsAsynchronously to the TCS for async continuations.
The WithCancellation can be omitted of course, if you do not need cancellations.
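Note that WithCancellation is not a built-in method on Task; a commonly used implementation looks roughly like the sketch below, and the RunContinuationsAsynchronously option mentioned above simply goes into the Init factory: new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously).
internal static class TaskExtensions
{
    // One well-known way to make an await on a task abandonable via a CancellationToken.
    public static async Task WithCancellation(this Task task, CancellationToken token)
    {
        var cancelTcs = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
        using (token.Register(state => ((TaskCompletionSource<bool>)state).TrySetResult(true), cancelTcs))
        {
            if (task != await Task.WhenAny(task, cancelTcs.Task).ConfigureAwait(false))
                throw new OperationCanceledException(token);
        }
        await task.ConfigureAwait(false); // propagate any exception from the original task
    }
}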
Because you only ever have one task waiting, your class can be simplified to:
internal sealed class Awaiter3
{
private volatile TaskCompletionSource<byte> _waiting;
public void Pulse()
{
var w = _waiting;
if (w == null)
{
return;
}
_waiting = null;
#if NET_46_OR_GREATER
w.TrySetResult(1);
#else
Task.Run(() => w.TrySetResult(1));
#endif
}
//This method is not thread safe and can only be called by one thread at a time.
// To make it thread safe put a lock around the null check and the assignment,
// you do not need to have a lock on Pulse, "volatile" takes care of that side.
public Task Wait()
{
if(_waiting != null)
throw new InvalidOperationException("Only one waiter is allowed to exist at a time!");
#if NET_46_OR_GREATER
_waiting = new TaskCompletionSource<byte>(TaskCreationOptions.RunContinuationsAsynchronously);
#else
_waiting = new TaskCompletionSource<byte>();
#endif
return _waiting.Task;
}
}
One behavior I did change: if you are using .NET 4.6 or newer, use the code in the #if NET_46_OR_GREATER blocks; if older, use the #else blocks. When you call TrySetResult the continuation may run synchronously, which can cause Pulse() to take a long time to complete. Using TaskCreationOptions.RunContinuationsAsynchronously on .NET 4.6, or wrapping the TrySetResult in a Task.Run for pre-4.6, makes sure that Pulse() is not blocked by the continuation of the task.
See the SO question Detect target framework version at compile time on how to make a NET_46_OR_GREATER definition that works in your code.
A simple way to do this is to use SemaphoreSlim which uses Monitor.
public class AsyncMonitor
{
private readonly SemaphoreSlim signal = new SemaphoreSlim(0, 1);
public void Pulse()
{
try
{
signal.Release();
}
catch (SemaphoreFullException) { }
}
public async Task WaitAsync(CancellationToken cancellationToken)
{
await signal.WaitAsync(cancellationToken).ConfigureAwait(false);
}
}
I'm attempting to convert the following method (simplified example) to be asynchronous, as the cacheMissResolver call may be expensive in terms of time (database lookup, network call):
// Synchronous version
public class ThingCache
{
private static readonly object _lockObj = new object();
// ... other stuff
public Thing Get(string key, Func<Thing> cacheMissResolver)
{
if (cache.Contains(key))
return cache[key];
Thing item;
lock(_lockObj)
{
if (cache.Contains(key))
return cache[key];
item = cacheMissResolver();
cache.Add(key, item);
}
return item;
}
}
There are plenty of materials on-line about consuming async methods, but the advice I have found on producing them seems less clear-cut. Given that this is intended to be part of a library, are either of my attempts below correct?
// Asynchronous attempts
public class ThingCache
{
private static readonly SemaphoreSlim _lockObj = new SemaphoreSlim(1);
// ... other stuff
// attempt #1
public async Task<Thing> Get(string key, Func<Thing> cacheMissResolver)
{
if (cache.Contains(key))
return await Task.FromResult(cache[key]);
Thing item;
await _lockObj.WaitAsync();
try
{
if (cache.Contains(key))
return await Task.FromResult(cache[key]);
item = await Task.Run(cacheMissResolver).ConfigureAwait(false);
_cache.Add(key, item);
}
finally
{
_lockObj.Release();
}
return item;
}
// attempt #2
public async Task<Thing> Get(string key, Func<Task<Thing>> cacheMissResolver)
{
if (cache.Contains(key))
return await Task.FromResult(cache[key]);
Thing item;
await _lockObj.WaitAsync();
try
{
if (cache.Contains(key))
return await Task.FromResult(cache[key]);
item = await cacheMissResolver().ConfigureAwait(false);
_cache.Add(key, item);
}
finally
{
_lockObj.Release();
}
return item;
}
}
Is using SemaphoreSlim the correct way to replace a lock statement in an async method? (I can't await in the body of a lock statement.)
Should I make the cacheMissResolver argument of type Func<Task<Thing>> instead? Although this puts the burden of making sure the resolver func is async on the caller (wrapping in Task.Run, I know it will be offloaded to a background thread if it takes a long time).
Thanks.
Is using SemaphoreSlim the correct way to replace a lock statement in an async method?
Yes.
Should I make the cacheMissResolver argument of type Func<Task<Thing>> instead?
Yes. It'll allow the caller to provide an inherently asynchronous operation (such as IO) rather than making this only suitable for work that is long running CPU bound work. (While still supporting CPU bound work by simply having the caller use Task.Run themselves, if that's what they want to do.)
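For example, a caller can then pass either kind of resolver (a sketch against attempt #2; LoadThingFromDbAsync and ExpensiveComputeThing are placeholders):
// IO-bound resolver: no extra thread involved
var thing = await cache.Get("key", () => LoadThingFromDbAsync("key"));

// CPU-bound resolver: the caller opts into Task.Run explicitly
var other = await cache.Get("key2", () => Task.Run(() => ExpensiveComputeThing()));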
Other than that, just note that there's no point in having await Task.FromResult(...); wrapping a value in a Task just to immediately unwrap it is pointless. Just use the result directly in such situations; in this case, return the cached value directly. What you're doing isn't really wrong, it's just needlessly complicating/confusing the code.
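In other words, the early-return branch can simply be:
if (cache.Contains(key))
    return cache[key]; // no Task.FromResult / await wrapper needed inside an async method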
If your cache is in-memory (it looks like it is), then consider caching the tasks rather than the results. This has a nice side property: if two methods request the same key, only a single resolving request is made. Also, since only the cache is locked (and not the resolving operations), you can continue using a simple lock.
public class ThingCache
{
private static readonly object _lockObj = new object();
// the cache now stores Task<Thing> entries rather than Thing values
public Task<Thing> GetAsync(string key, Func<Task<Thing>> cacheMissResolver)
{
lock (_lockObj)
{
if (cache.Contains(key))
return cache[key];
var task = cacheMissResolver();
cache.Add(key, task);
return task;
}
}
}
However, this will also cache exceptions, which you may not want. One way to avoid this is to permit the exception task to enter the cache initially, but then prune it when the next request is made:
public class ThingCache
{
private static readonly object _lockObj = new object();
public Task<Thing> GetAsync(string key, Func<Task<Thing>> cacheMissResolver)
{
lock (_lockObj)
{
if (cache.Contains(key))
{
var cached = cache[key];
// reuse the cached task unless it has already failed;
// faulted/cancelled entries are pruned and resolved again
if (!cached.IsFaulted && !cached.IsCanceled)
return cached;
cache.Remove(key);
}
var task = cacheMissResolver();
cache.Add(key, task);
return task;
}
}
}
You may decide this extra check is unnecessary if you have another process pruning the cache periodically.
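For reference, the cache member in the snippets above is left abstract; one plausible backing store (an assumption, not part of the original answer) is a plain dictionary whose values are the cached tasks themselves:
// Hypothetical backing field for the snippets above.
// With this declaration, cache.Contains(key) corresponds to cache.ContainsKey(key).
private static readonly Dictionary<string, Task<Thing>> cache =
    new Dictionary<string, Task<Thing>>();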