I have a Singleton Class which loads some data on its construction. The problem is that loading this data requires calling async methods, but the constructor cannot be async.
In other words, my class has the following structure:
public class Singleton
{
private static Singleton instance;
private Singleton()
{
LoadData();
}
public static Singleton Instance
{
get
{
if (instance == null)
{
instance = new Singleton();
}
return instance;
}
}
}
LoadData() is an async function which calls lots of other async functions and performs initialization.
How can I call LoadData() properly so that everything initializes correctly?
The solution for a thread-safe, async singleton is actually super simple, if we only let the inner mechanisms of the Task class work for us!
So, how does a Task work? Let's say you have an instance of a Task<T> and you await it. Once the task has completed, it holds the produced value of T, and the await hands that value to you. What if you await the same task instance again? The work is not repeated: the await sees that the task is already complete and returns the stored value immediately, in a completely synchronous manner.
And what if you await the same task instance simultaneously from multiple threads (where you would normally get a race condition)? The work behind the task still runs only once, and every awaiter waits for that single result. When the result has been produced, all the awaits finish (virtually) simultaneously and return the same value.
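To make the behavior described above concrete, here is a minimal sketch. The names TaskCachingDemo, LoadAsync and RunAsync are hypothetical and only serve the illustration; the counter is there so you can observe how often the work actually runs.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskCachingDemo
{
    private static int _loadCount;

    // Hypothetical stand-in for an expensive asynchronous load.
    private static async Task<string> LoadAsync()
    {
        Interlocked.Increment(ref _loadCount);
        await Task.Delay(50);
        return "loaded";
    }

    // Starts the load once, awaits the same task from several consumers,
    // and returns how many times the load body actually ran.
    public static async Task<int> RunAsync()
    {
        _loadCount = 0;
        Task<string> shared = LoadAsync(); // the work is started here, once

        // Ten concurrent consumers all await the same task instance.
        var consumers = new Task<string>[10];
        for (int i = 0; i < consumers.Length; i++)
        {
            consumers[i] = Task.Run(async () => await shared);
        }
        string[] results = await Task.WhenAll(consumers);

        // Awaiting again after completion hands back the stored result.
        string again = await shared;
        if (again != results[0])
        {
            throw new InvalidOperationException("unexpected result");
        }

        return _loadCount;
    }
}
```

Calling TaskCachingDemo.RunAsync() should yield 1: ten consumers plus one late await, but only a single execution of the load.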
So the solution for an async singleton that is thread-safe is actually super simple:
public class Singleton
{
private static readonly Task<Singleton> _getInstanceTask = CreateSingleton();
public static Task<Singleton> Instance
{
get { return _getInstanceTask; }
}
private Singleton(SomeData someData)
{
SomeData = someData;
}
public SomeData SomeData { get; private set; }
private static async Task<Singleton> CreateSingleton()
{
SomeData someData = await LoadData();
return new Singleton(someData);
}
}
Now you can access the singleton this way:
Singleton mySingleton = await Singleton.Instance;
or
Singleton mySingleton = Singleton.Instance.Result;
or
SomeData mySingletonData = (await Singleton.Instance).SomeData;
or
SomeData mySingletonData = Singleton.Instance.Result.SomeData;
Read more here: Async singleton initialization
The problem is that loading this data requires calling async methods, but the constructor cannot be async.
While you can't make the constructor itself asynchronous, you can call asynchronous methods from within the constructor. You just will not get the results back immediately.
Provided the asynchronous methods return Task or Task<T>, you can always use a continuation on the task to set your data within the class once the asynchronous operation completes, or just block on the results, depending on what makes the most sense in your scenario. Without knowing the requirements for construction of this object, it's difficult to know what is appropriate in this scenario.
Edit:
One option, given the goals listed above, would be to change your Singleton declaration so that the Instance is retrieved through a method rather than a property. This would allow you to make it asynchronous:
public class Singleton
{
private static Singleton instance;
private Singleton()
{
// Don't load the data here - will be called separately
}
public static async Task<Singleton> GetInstance()
{
if (instance == null)
{
instance = new Singleton();
await instance.LoadData();
}
return instance;
}
}
This would allow you to use await on the call to actually retrieve the instance. The nice thing about this is that it does make it very clear that you're calling an asynchronous operation, and you will get proper handling of the results, as the result will come back like any other async method.
Be aware, however, that this isn't thread safe (though the original wasn't either), so if you're going to use this Singleton from multiple threads, you may have to rethink the overall design.
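If you do need the method-based approach to be safe under concurrent callers, one possibility is to guard the null check with a SemaphoreSlim, which can be awaited. This is only a sketch under assumptions: GuardedSingleton, gate and LoadCount are hypothetical names, and LoadData is a placeholder for the real loading logic.

```csharp
using System.Threading;
using System.Threading.Tasks;

public class GuardedSingleton
{
    private static GuardedSingleton instance;
    private static readonly SemaphoreSlim gate = new SemaphoreSlim(1, 1);

    // Counts how often LoadData runs; only here to make the behavior observable.
    public static int LoadCount;

    private GuardedSingleton()
    {
    }

    // Hypothetical placeholder for the real asynchronous loading logic.
    private async Task LoadData()
    {
        Interlocked.Increment(ref LoadCount);
        await Task.Delay(50);
    }

    public static async Task<GuardedSingleton> GetInstance()
    {
        // Only one caller at a time gets past this point.
        await gate.WaitAsync();
        try
        {
            if (instance == null)
            {
                var created = new GuardedSingleton();
                await created.LoadData();
                instance = created; // publish only after loading has finished
            }
            return instance;
        }
        finally
        {
            gate.Release();
        }
    }
}
```

Every caller pays for the semaphore on each access, so this trades some overhead for simplicity; caching a Task<GuardedSingleton> would avoid that.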
The other option would be to make your Singleton class not load data automatically. Instead, make the methods that retrieve the data from the class asynchronous. This provides some real advantages: the usage is probably a bit more standard, and because you control the data loading process, you can support calls from multiple threads more easily than you could by making access to the class instance itself asynchronous.
You can use asynchronous lazy initialization for this:
public class Singleton
{
private static readonly AsyncLazy<Singleton> instance =
new AsyncLazy<Singleton>(CreateAndLoadData);
private Singleton()
{
}
// This method could also be an async lambda passed to the AsyncLazy constructor.
private static async Task<Singleton> CreateAndLoadData()
{
var ret = new Singleton();
await ret.LoadDataAsync();
return ret;
}
public static AsyncLazy<Singleton> Instance
{
get { return instance; }
}
}
And then you can use it like this:
Singleton singleton = await Singleton.Instance;
One benefit of using AsyncLazy<T> is that it is threadsafe. However, be aware that it always executes its delegate on a thread pool thread.
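AsyncLazy<T> is not part of the base class library, so for readers without it at hand, here is a rough idea of how such a type can work. This is a simplified stand-in built on Lazy<Task<T>>, not the library's actual implementation; SimpleAsyncLazy and the demo class are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Simplified stand-in for illustration only; not the library's AsyncLazy<T>.
public sealed class SimpleAsyncLazy<T>
{
    private readonly Lazy<Task<T>> _lazy;

    public SimpleAsyncLazy(Func<Task<T>> factory)
    {
        // Lazy<T> ensures the factory is invoked at most once.
        // Task.Run moves that invocation onto a thread pool thread.
        _lazy = new Lazy<Task<T>>(() => Task.Run<T>(factory));
    }

    // Makes "await lazyInstance" work by forwarding to the underlying task.
    public TaskAwaiter<T> GetAwaiter() => _lazy.Value.GetAwaiter();
}

public static class SimpleAsyncLazyDemo
{
    private static int _factoryCalls;

    // Awaits the same lazy twice and returns how often the factory ran.
    public static async Task<int> RunAsync()
    {
        _factoryCalls = 0;
        var lazy = new SimpleAsyncLazy<int>(async () =>
        {
            Interlocked.Increment(ref _factoryCalls);
            await Task.Delay(20);
            return 13;
        });

        int first = await lazy;
        int second = await lazy;
        if (first != 13 || second != 13)
        {
            throw new InvalidOperationException("unexpected value");
        }
        return _factoryCalls;
    }
}
```

The Task.Run call in the constructor is what puts the delegate on a thread pool thread; if the delegate must run on the caller's context, that wrapper would have to go.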
Well, it doesn't make much sense to asynchronously initialize a singleton. If you simply want to call a method that returns a Task in your initialization, you can simply do:
var task = MyAsyncMethod();
task.Wait();
return task.Result;
Without the need to make the method async.
But, if what you want is for the singleton value to be a task, you can use Lazy as such:
Lazy<Task<int>> l = new Lazy<Task<int>>(async () => { int i = await calculateNumber(); return i; });
In addition, Lazy<T> is the preferred method for implementing "singletons". Singleton classes are hard to get right (or hard to keep right)...
I have an async method which will load some info from the database via Entity Framework.
In one circumstance I want to call that code synchronously from within a lock.
Do I need two copies of the code, one async, one not, or is there a way of calling the async code synchronously?
For example something like this:
using System;
using System.Threading.Tasks;
public class Program
{
public static void Main()
{
new Test().Go();
}
}
public class Test
{
private object someLock = new object();
public void Go()
{
lock(someLock)
{
Task<int> task = Task.Run(async () => await DoSomethingAsync());
var result = task.Result;
}
}
public async Task<int> DoSomethingAsync()
{
// This will make a database call
return await Task.FromResult(0);
}
}
Edit: as a number of the comments are saying the same thing, I thought I'd elaborate a little
Background: normally, trying to do this is a bad idea. Lock and async are polar opposites, as documented in lots of places; there's no reason to have an async call in a lock.
So why do it here? I can make the database call synchronously, but that requires duplicating some methods, which isn't ideal. Ideally the language would let you call the same method synchronously or asynchronously.
Scenario: this is a Web API. The application starts, a number of Web API calls execute and they all want some info that's in the database that's provided by a service provider dedicated for that purpose (i.e. a call added via AddScoped in the Startup.cs). Without something like a lock they will all try to get the info from the database. EF Core is only relevant in that every other call to the database is async, this one is the exception.
You simply cannot use a lock with asynchronous code; the entire point of async/await is to switch away from a strict thread-based model, but lock (aka System.Threading.Monitor) is entirely thread-focused. Frankly, you also shouldn't attempt to synchronously call asynchronous code; that is simply not valid, and no "solution" is correct.
SemaphoreSlim makes a good alternative to lock as an asynchronous-aware synchronization primitive. However, you should either acquire/release the semaphore inside the async operation in your Task.Run, or you should make your Go an asynchronous method, i.e. public async Task GoAsync(), and do the same there. Of course, at that point it becomes redundant to use Task.Run, so just execute await DoSomethingAsync() directly:
private readonly SemaphoreSlim someLock = new SemaphoreSlim(1, 1);
public async Task GoAsync()
{
await someLock.WaitAsync();
try
{
await DoSomethingAsync();
}
finally
{
someLock.Release();
}
}
If the try/finally bothers you, perhaps cheat:
public async Task GoAsync()
{
using (await someLock.LockAsync())
{
await DoSomethingAsync();
}
}
with
internal static class SemaphoreExtensions
{
public static ValueTask<SemaphoreToken> LockAsync(this SemaphoreSlim semaphore)
{
// try to take synchronously
if (semaphore.Wait(0)) return new(new SemaphoreToken(semaphore));
return SlowLockAsync(semaphore);
static async ValueTask<SemaphoreToken> SlowLockAsync(SemaphoreSlim semaphore)
{
await semaphore.WaitAsync().ConfigureAwait(false);
return new(semaphore);
}
}
}
internal readonly struct SemaphoreToken : IDisposable
{
private readonly SemaphoreSlim _semaphore;
public void Dispose() => _semaphore?.Release();
internal SemaphoreToken(SemaphoreSlim semaphore) => _semaphore = semaphore;
}
I have developed many algorithms that did most of their threading themselves, using regular threads. The approach was always as follows:
float[] GetData(int requestedItemIndex)
With the method above, an index was pushed into a message queue that was processed by the thread of the individual algorithm. So in the end the interface of the algorithm looked like this:
public abstract class AlgorithmBase
{
private readonly AlgorithmBase Parent;
private void RequestQueue()
{
}
public float[] GetData(int requestedItemIndex) => Parent.GetData(requestedItemIndex);
}
The example is very primitive, but it is just meant to give the idea. The point is that I can chain algorithms, which currently works fine with my solution. As you can see, every GetData calls the GetData of a parent algorithm. This can of course get more complex, and of course there needs to be a final parent as the data source; otherwise I would get a StackOverflowException.
Now I am trying to change this behavior to use async/await. If I rewrite my code, I get something like this:
public abstract class AlgorithmBase
{
private readonly AlgorithmBase Parent;
public async Task<float[]> GetDataAsync(int requestedItemIndex, CancellationToken token = default)
{
var data = await Parent.GetDataAsync(requestedItemIndex);
return await Task.Run<float[]>(async () => ProcessData());
}
}
Now that I have chained the algorithms, every new algorithm spawns another Task, which can be quite time-consuming when this is done many times.
So my question is whether there is a way to embed the next task in the already running task while keeping the defined interface.
There is no need to explicitly use Task.Run. You should avoid that, and leave that choice to the consumer of AlgorithmBase class.
So, you can implement the async version quite similarly, with the Task object propagated from parents to children:
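public abstract class AlgorithmBase
{
// protected so that derived algorithms can call their parent
protected readonly AlgorithmBase Parent;
private void RequestQueue()
{
}
// virtual so that derived algorithms can override it
public virtual Task<float[]> GetDataAsync(int requestedItemIndex)
=> Parent.GetDataAsync(requestedItemIndex);
}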
Eventually, some "parent" will implement GetDataAsync, in the same manner as its synchronous counterpart.
public class SortAlgorithm : AlgorithmBase
{
public override async Task<float[]> GetDataAsync(int requestedItemIndex)
{
// asynchronously get data
var data = await Parent.GetDataAsync(requestedItemIndex);
// synchronously process data and return from asynchronous method
return this.ProcessData(data);
}
private float[] ProcessData(float[] data)
{
// placeholder: the actual processing goes here
return data;
}
}
In the end, the consumer of SortAlgorithm can decide whether to await it or just fire-and-forget it.
var algo = new SortAlgorithm();
// asynchronously wait until it's finished
var data = await algo.GetDataAsync(1);
// start processing without waiting for the result
algo.GetDataAsync(1);
// not needed - GetDataAsync already returns Task, Task.Run is not needed in this case
Task.Run(() => algo.GetDataAsync(1));
When awaiting in library code you normally want to avoid capturing and restoring the context each and every time, especially if you are awaiting in a loop. So to improve the performance of your library consider using .ConfigureAwait(false) on all awaits.
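As a small illustration of that advice, the sketch below awaits in a loop with .ConfigureAwait(false) on every await. LibraryCode, FetchAsync and SumAsync are hypothetical names, and FetchAsync merely simulates a parent data source.

```csharp
using System.Threading.Tasks;

public static class LibraryCode
{
    // Hypothetical data source standing in for a parent algorithm.
    private static async Task<float[]> FetchAsync(int index)
    {
        await Task.Delay(10).ConfigureAwait(false);
        return new float[] { index, index + 1 };
    }

    // Sums the values of the first 'count' chunks.
    public static async Task<float> SumAsync(int count)
    {
        float total = 0;
        for (int i = 0; i < count; i++)
        {
            // ConfigureAwait(false): the continuation does not need to be
            // marshalled back to the captured context on each iteration.
            float[] chunk = await FetchAsync(i).ConfigureAwait(false);
            foreach (float value in chunk)
            {
                total += value;
            }
        }
        return total;
    }
}
```

Note that this only belongs in code that does not touch context-bound state (such as UI controls) after the await.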
Since I create the readonly static instance as soon as someone uses the class, no lazy loading, this code is thread safe and I do not need to follow the Double-checked locking design pattern, correct?
public class BusSingleton<T> where T : IEmpireEndpointConfig, new()
{
private static readonly BusSingleton<T> instance = new BusSingleton<T>();
private IBus bus;
public IBus Bus
{
get { return this.bus; }
}
public static BusSingleton<T> Instance
{
get
{
return instance;
}
}
private BusSingleton()
{
T config = new T();
bus = NServiceBus.Bus.Create(config.CreateConfiguration());
((IStartableBus) bus).Start();
}
}
During the static initializer the run-time puts a lock around the object's type so two instances of the initializer can not be run at the same time.
The only thing you must be careful of is whether NServiceBus.Bus.Create, config.CreateConfiguration, or bus.Start() use multiple threads internally. If one of them touches your type from another thread and does not return until that internal thread is done, you can deadlock yourself, because the other thread blocks until the static initializer has finished.
When you do the traditional "lazy singleton" with double checked locking the static initializer will have already finished and you don't run the risk of deadlocking yourself.
So if you are confident that those 3 functions will not try to access your type on another thread, then it is fine to not use double-checked locking for your use case.
That looks safe as long as you don't need to delay the instantiation to run initialization code or anything like that, which it sounds like you don't need.
https://msdn.microsoft.com/en-us/library/ff650316.aspx
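Should you later need to delay instantiation after all, Lazy<T> gives you thread-safe lazy construction without hand-written double-checked locking. The following is a sketch with a hypothetical LazyBus class; the counter exists only to show that the constructor runs once.

```csharp
using System;
using System.Threading;

public sealed class LazyBus
{
    // Lazy<T> defaults to LazyThreadSafetyMode.ExecutionAndPublication,
    // so the factory runs at most once even under concurrent access.
    private static readonly Lazy<LazyBus> lazy =
        new Lazy<LazyBus>(() => new LazyBus());

    // Counts constructor runs; only here to make the behavior observable.
    public static int ConstructionCount;

    public static LazyBus Instance => lazy.Value;

    private LazyBus()
    {
        Interlocked.Increment(ref ConstructionCount);
    }
}
```

Unlike the static readonly field, construction here happens on the first access to Instance rather than during type initialization.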
I have a large codebase using my repositories that all implement IRepository and I'm implementing async versions of the methods:
T Find(id);
Task<T> FindAsync(id);
...etc...
There are several kinds of repository. The simplest is based on an immutable collection where the universe of entities is small enough to merit loading them all at once from the DB. This load happens the first time anyone calls any of the IRepository methods. Find(4), for example, will trigger the load if it hasn't happened already.
I've implemented this with Lazy<T>. Very handy and has been working for years.
I can't go cold-turkey on Async so I have to add Async alongside the sync versions. My problem is, I don't know which will be called first - a sync or async method on the repository.
I don't know how to declare my Lazy - if I do it as I've always done it,
Lazy<MyCollection<T>>
then loading it won't be async when FindAsync() is called first. On the other hand, if I go
Lazy<Task<MyCollection<T>>>
This would be great for FindAsync() but how will a synchronous method trigger the initial load w/o running afoul of Mr. Cleary's warnings about deadlock from calling Task.Result?
Thank you for your time!
The problem with Lazy<T> is that there's only one factory method. What you really want is a synchronous factory method if the first call is synchronous, and an asynchronous factory method if the first call is asynchronous. Lazy<T> won't do that for you, and AFAIK there's nothing else built-in that offers these semantics either.
You can, however, build one yourself:
public sealed class SyncAsyncLazy<T>
{
private readonly object _mutex = new object();
private readonly Func<T> _syncFunc;
private readonly Func<Task<T>> _asyncFunc;
private Task<T> _task;
public SyncAsyncLazy(Func<T> syncFunc, Func<Task<T>> asyncFunc)
{
_syncFunc = syncFunc;
_asyncFunc = asyncFunc;
}
public T Get()
{
return GetAsync(true).GetAwaiter().GetResult();
}
public Task<T> GetAsync()
{
return GetAsync(false);
}
private Task<T> GetAsync(bool sync)
{
lock (_mutex)
{
if (_task == null)
_task = DoGetAsync(sync);
return _task;
}
}
private async Task<T> DoGetAsync(bool sync)
{
return sync ? _syncFunc() : await _asyncFunc().ConfigureAwait(false);
}
}
Or you can just use this pattern without encapsulating it:
private readonly object _mutex = new object();
private Task<MyCollection<T>> _collectionTask;
private Task<MyCollection<T>> LoadCollectionAsync(bool sync)
{
lock (_mutex)
{
if (_collectionTask == null)
_collectionTask = DoLoadCollectionAsync(sync);
return _collectionTask;
}
}
private async Task<MyCollection<T>> DoLoadCollectionAsync(bool sync)
{
if (sync)
return LoadCollectionSynchronously();
else
return await LoadCollectionAsynchronously();
}
The "bool sync" pattern is one Stephen Toub showed me recently. AFAIK there's no blogs or anything about it yet.
Tasks run only once, but you can await them as many times as you want, and you can also call Wait() or Result on them after they have completed without blocking.
Asynchronous methods are converted into a state machine that schedules the code after each await to run once the awaitable has completed. However, there's an optimization: if the awaitable is already completed, the code runs immediately. So awaiting completed awaitables bears little overhead.
For those small in-memory repositories, you can return a completed Task using Task.FromResult. And you can cache any Task and await it any time.
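A minimal sketch of that idea, assuming a hypothetical InMemoryRepository whose data already sits in memory: the async method simply wraps the synchronous lookup in an already completed task.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical small repository whose entities are already in memory.
public sealed class InMemoryRepository
{
    private readonly Dictionary<int, string> _items = new Dictionary<int, string>
    {
        { 1, "one" },
        { 4, "four" },
    };

    // Synchronous lookup; returns null when the id is unknown.
    public string Find(int id)
    {
        return _items.TryGetValue(id, out string value) ? value : null;
    }

    // No real asynchronous work is needed, so return a completed task.
    public Task<string> FindAsync(int id)
    {
        return Task.FromResult(Find(id));
    }
}
```

Because the returned task is already complete, awaiting it continues synchronously, and reading Result on it does not block.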
how will a synchronous method trigger the initial load w/o running afoul of Mr. Cleary's warnings about deadlock from calling Task.Result?
You can use the synchronous version and use Task.FromResult to load your Lazy<Task<MyCollection<T>>>. If this lazily initialized async operation is exposed to the outside, it may confuse callers, since it will block. If this is an internal, single-call situation, I would go with:
private Lazy<Task<MyCollection<T>>> myCollection = new Lazy<Task<MyCollection<T>>>(() =>
{
var collection = myRepo.GetCollection<T>();
return Task.FromResult(collection);
});
I'm trying to follow the RAII pattern in my service classes, meaning that when an object is constructed, it is fully initialized. However, I'm facing difficulties with asynchronous APIs. The structure of the class in question looks like the following:
class ServiceProvider : IServiceProvider // Is only used through this interface
{
public int ImportantValue { get; set; }
public event EventHandler ImportantValueUpdated;
public ServiceProvider(IDependency1 dep1, IDependency2 dep2)
{
// IDependency1 provide an input value to calculate ImportantValue
// IDependency2 provide an async algorithm to calculate ImportantValue
}
}
I'm also targeting to get rid of side-effects in ImportantValue getter, to make it thread-safe.
Now users of ServiceProvider will create an instance of it, subscribe to an event of ImportantValue change, and get the initial ImportantValue. And here comes the problem, with the initial value. Since the ImportantValue is calculated asynchronously, the class cannot be fully initialized in constructor. It may be okay to have this value as null initially, but then I need to have some place where it will be calculated first time. A natural place for that could be the ImportantValue's getter, but I'm targeting to make it thread-safe and with no side-effects.
So I'm basically stuck with these contradictions. Could you please help me and offer some alternative? Having value initialized in constructor while nice is not really necessary, but no side-effects and thread-safety of property is mandatory.
Thanks in advance.
EDIT: One more thing to add. I'm using Ninject for instantiation, and as far as I understand, it doesn't support async methods to create a binding. While approach with initiating some Task-based operation in constructor will work, I cannot await its result.
I.e. the next two approaches (offered as answers so far) will not compile, since a Task is returned, not my object:
Kernel.Bind<IServiceProvider>().ToMethod(async ctx => await ServiceProvider.CreateAsync())
or
Kernel.Bind<IServiceProvider>().ToMethod(async ctx =>
{
var sp = new ServiceProvider();
await sp.InitializeAsync();
})
Simple binding will work, but I'm not awaiting the result of asynchronous initialization started in constructor, as proposed by Stephen Cleary:
Kernel.Bind<IServiceProvider>().To<ServiceProvider>();
... and that's not looking good for me.
I have a blog post that describes several approaches to async construction.
I recommend the asynchronous factory method as described by Reed, but sometimes that's not possible (e.g., dependency injection). In these cases, you can use an asynchronous initialization pattern like this:
public sealed class MyType
{
public MyType()
{
Initialization = InitializeAsync();
}
public Task Initialization { get; private set; }
private async Task InitializeAsync()
{
// Asynchronously initialize this instance.
await Task.Delay(100);
}
}
You can then construct the type normally, but keep in mind that construction only starts the asynchronous initialization. When you need the type to be initialized, your code can do:
await myTypeInstance.Initialization;
Note that if Initialization is already complete, execution (synchronously) continues past the await.
If you do want an actual asynchronous property, I have a blog post for that, too. Your situation sounds like it may benefit from AsyncLazy<T>:
public sealed class MyClass
{
public MyClass()
{
MyProperty = new AsyncLazy<int>(async () =>
{
await Task.Delay(100);
return 13;
});
}
public AsyncLazy<int> MyProperty { get; private set; }
}
One potential option would be to move this to a factory method instead of using a constructor.
Your factory method could then return a Task<ServiceProvider>, which would allow you to perform the initialization asynchronously, but not return the constructed ServiceProvider until ImportantValue has been (asynchronously) computed.
This would allow your users to write code like:
var sp = await ServiceProvider.CreateAsync();
int iv = sp.ImportantValue; // Will be initialized at this point
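A factory method along those lines could look roughly like the sketch below. ValueService and ComputeAsync are hypothetical and heavily simplified: the real ServiceProvider would take its two dependencies instead of a plain int.

```csharp
using System.Threading.Tasks;

// Hypothetical, simplified stand-in for ServiceProvider.
public sealed class ValueService
{
    public int ImportantValue { get; }

    // Private constructor: instances can only come from CreateAsync,
    // so an instance always has its value computed.
    private ValueService(int importantValue)
    {
        ImportantValue = importantValue;
    }

    // Placeholder for the asynchronous algorithm a dependency would provide.
    private static async Task<int> ComputeAsync(int input)
    {
        await Task.Delay(10);
        return input * 2;
    }

    public static async Task<ValueService> CreateAsync(int input)
    {
        int value = await ComputeAsync(input);
        return new ValueService(value);
    }
}
```

Since the property has no setter and no lazy logic, reading it has no side effects and is safe from any thread.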
This is a slight modification to Stephen Cleary's pattern of async initialization.
The difference being the caller doesn't need to 'remember' to await the InitializationTask, or even know anything about the initializationTask (in fact it is now changed to private).
The way it works is that in every method that uses the initialized data there is an initial call to await _initializationTask. This returns instantly the second time around - because the _initializationTask object itself will have a boolean set (IsCompleted which the 'await' mechanism checks) - so don't worry about it initializing multiple times.
The only catch I'm aware of is you mustn't forget to call it in every method that uses the data.
public sealed class MyType
{
public MyType()
{
_initializationTask = InitializeAsync();
}
private Task _initializationTask;
private async Task InitializeAsync()
{
// Asynchronously initialize this instance.
_customers = await LoadCustomersAsync();
}
public async Task<Customer> LookupCustomer(string name)
{
// Waits to ensure the class has been initialized properly
// The task will only ever run once, triggered initially by the constructor
// If the task failed this will raise an exception
// Note: there are no () since this is not a method call
await _initializationTask;
return _customers[name];
}
// one way of clearing the cache
public void ClearCache()
{
InitializeAsync();
}
// another approach to clearing the cache, will wait until complete
// I don't really see a benefit to this method since any call using the
// data (like LookupCustomer) will await the initialization anyway
public async Task ClearCache2()
{
await InitializeAsync();
}
}
You could use my AsyncContainer IoC container which supports the exact same scenario as you.
It also supports other handy scenarios such as async initializers, run-time conditional factories, and dependencies on both async and sync factory functions:
//The email service factory is an async method
public static async Task<EmailService> EmailServiceFactory()
{
await Task.Delay(1000);
return new EmailService();
}
class Service
{
//Constructor dependencies will be solved asynchronously:
public Service(IEmailService email)
{
}
}
var container = new Container();
//Register an async factory:
container.Register<IEmailService>(EmailServiceFactory);
//Asynchronous GetInstance:
var service = await container.GetInstanceAsync<Service>();
//Safe synchronous, will fail if the solving path is not fully synchronous:
var service = container.GetInstance<Service>();
I know this is an old question, but it's the first which appears on Google and, quite frankly, the accepted answer is a poor answer. You should never force a delay just so you can use the await operator.
A better approach to an initialization method:
private async Task<bool> InitializeAsync()
{
try{
// Initialize this instance.
}
catch{
// Handle issues
return await Task.FromResult(false);
}
return await Task.FromResult(true);
}
This will use the async framework to initialize your object, but then it will return a boolean value.
Why is this a better approach? First off, you're not forcing a delay in your code which IMHO totally defeats the purpose of using the async framework. Second, it's a good rule of thumb to return something from an async method. This way, you know if your async method actually worked/did what it was supposed to. Returning just Task is the equivalent of returning void on a non-async method.
I have a variation of Stephen Cleary's example of an asynchronous initialization pattern. You could encapsulate the Initialization property and await it in the class methods. In this case, the client code will not need to await the initialization task.
public class ClassWithAsyncInit
{
public ClassWithAsyncInit()
{
Initialization = InitializeAsync();
}
private Task Initialization { get; set; }
private async Task InitializeAsync()
{
// your async init code
}
public async Task FirstMethod()
{
await Initialization;
// ... other code
}
}
The drawback is that it's not convenient if you have a lot of methods in your class and need to await the Initialization task in each one. But sometimes it is okay. Let's say you have a simple interface for saving some JSON objects:
public interface IDataSaver
{
Task Save(string json);
}
And you need to implement it for a database with the asynchronous initialization logic. Considering that you would have only one public method it makes sense to encapsulate the Initialization property and await it in the Save method:
public class SomeDbDataSaver: IDataSaver
{
protected DatabaseClient DbClient { get; set; }
public SomeDbDataSaver()
{
DbClient = new DatabaseClient();
Initialization = InitializeAsync(); // start off the async init
}
private Task Initialization { get; set; }
private async Task InitializeAsync()
{
await DbClient.CreateDatabaseIfNotExistsAsync("DatabaseName");
}
public async Task Save(string json)
{
await Initialization;
// ... code for saving a data item to the database
}
}