With business logic encapsulated behind synchronous service calls e.g.:
interface IFooService
{
Foo GetFooById(int id);
int SaveFoo(Foo foo);
}
What is the best way to extend/use these service calls in an asynchronous fashion?
At present I've created a simple AsyncUtils class:
public static class AsyncUtils
{
public static void Execute<T>(Func<T> asyncFunc)
{
Execute(asyncFunc, null, null);
}
public static void Execute<T>(Func<T> asyncFunc, Action<T> successCallback)
{
Execute(asyncFunc, successCallback, null);
}
public static void Execute<T>(Func<T> asyncFunc, Action<T> successCallback, Action<Exception> failureCallback)
{
ThreadPool.UnsafeQueueUserWorkItem(state => ExecuteAndHandleError(asyncFunc, successCallback, failureCallback), null);
}
private static void ExecuteAndHandleError<T>(Func<T> asyncFunc, Action<T> successCallback, Action<Exception> failureCallback)
{
try
{
T result = asyncFunc();
if (successCallback != null)
{
successCallback(result);
}
}
catch (Exception e)
{
if (failureCallback != null)
{
failureCallback(e);
}
}
}
}
Which lets me call anything asynchronously:
AsyncUtils(
() => _fooService.SaveFoo(foo),
id => HandleFooSavedSuccessfully(id),
ex => HandleFooSaveError(ex));
Whilst this works in simple use cases it quickly gets tricky if other processes need to coordinate about the results, for example if I need to save three objects asynchronously before the current thread can continue then I'd like a way to wait-on/join the worker threads.
Options I've thought of so far include:
having AsyncUtils return a WaitHandle
having AsyncUtils use an AsyncMethodCaller and return an IAsyncResult
rewriting the API to include Begin, End async calls
e.g. something resembling:
interface IFooService
{
Foo GetFooById(int id);
IAsyncResult BeginGetFooById(int id);
Foo EndGetFooById(IAsyncResult result);
int SaveFoo(Foo foo);
IAsyncResult BeginSaveFoo(Foo foo);
int EndSaveFoo(IAsyncResult result);
}
Are there other approaches I should consider? What are the benefits and potential pitfalls of each?
Ideally I'd like to keep the service layer simple/synchronous and provide some easy to use utility methods for calling them asynchronously. I'd be interested in hearing about solutions and ideas applicable to C# 3.5 and C# 4 (we haven't upgraded yet but will do in the relatively near future).
Looking forward to your ideas.
Given your requirement to stay .NET 2.0 only, and not work on 3.5 or 4.0, this is probably the best option.
I do have three remarks on your current implementation.
Is there a specific reason you're using ThreadPool.UnsafeQueueUserWorkItem? Unless there is a specific reason this is required, I would recommend using ThreadPool.QueueUserWorkItem instead, especially if you're in a large development team. The Unsafe version can potentially allow security flaws to appear as you lose the calling stack, and as a result, the ability to control permissions as closely.
The current design of your exception handling, using the failureCallback, will swallow all exceptions, and provide no feedback, unless a callback is defined. It might be better to propogate the exception and let it bubble up if you're not going to handle it properly. Alternatively, you could push this back onto the calling thread in some fashion, though this would require using something more like IAsyncResult.
You currently have no way to tell if an asynchronous call is completed. This would be the other advantage of using IAsyncResult in your design (though it does add some complexity to the implementation).
Once you upgrade to .NET 4, however, I would recommend just putting this in a Task or Task<T>, as it was designed to handle this very cleanly. Instead of:
AsyncUtils(
() => _fooService.SaveFoo(foo),
id => HandleFooSavedSuccessfully(id),
ex => HandleFooSaveError(ex));
You can use the built-in tools and just write:
var task = Task.Factory.StartNew(
() => return _fooService.SaveFoo(foo) );
task.ContinueWith(
t => HandleFooSavedSuccessfully(t.Result),
TaskContinuationOptions.NotOnFaulted);
task.ContinueWith(
t => try { t.Wait(); } catch( Exception e) { HandleFooSaveError(e); },
TaskContinuationOptions.OnlyOnFaulted );
Granted, the last line there is a bit odd, but that's mainly because I tried to keep your existing API. If you reworked it a bit, you could simplify it...
Asynchronous interface (based on IAsyncResult) is useful only when you have some non-blocking call under the cover. The main point of the interface is to make it possible to do the call without blocking the caller thread.
This is useful in scenarios when you can make some system call and the system will notify you back when something happens (e.g. when a HTTP response is received or when an event happens).
The price for using IAsyncResult based interface is that you have to write code in a somewhat awkward way (by making every call using callback). Even worse, asynchronous API makes it impossible to use standard language constructs like while, for, or try..catch.
I don't really see the point of wrapping synchronous API into asynchronous interface, because you won't get the benefit (there will always be some thread blocked) and you'll just get more awkward way of calling it.
Of course, it makes a perfect sense to run the synchronous code on a background thread somehow (to avoid blocking the main application thread). Either using Task<T> on .NET 4.0 or using QueueUserWorkItem on .NET 2.0. However, I'm not sure if this should be done automatically in the service - it feels like doing this on the caller side would be easier, because you may need to perform multiple calls to the service. Using asynchronous API, you'd have to write something like:
svc.BeginGetFooId(ar1 => {
var foo = ar1.Result;
foo.Prop = 123;
svc.BeginSaveFoo(foo, ar2 => {
// etc...
}
});
When using synchronous API, you'd write something like:
ThreadPool.QueueUserWorkItem(() => {
var foo = svc.GetFooId();
foo.Prop = 123;
svc.SaveFoo(foo);
});
The following is a response to Reed's follow-up question. I'm not suggesting that it's the right way to go.
public static int PerformSlowly(int id)
{
// Addition isn't so hard, but let's pretend.
Thread.Sleep(10000);
return 42 + id;
}
public static Task<int> PerformTask(int id)
{
// Here's the straightforward approach.
return Task.Factory.StartNew(() => PerformSlowly(id));
}
public static Lazy<int> PerformLazily(int id)
{
// Start performing it now, but don't block.
var task = PerformTask(id);
// JIT for the value being checked, block and retrieve.
return new Lazy<int>(() => task.Result);
}
static void Main(string[] args)
{
int i;
// Start calculating the result, using a Lazy<int> as the future value.
var result = PerformLazily(7);
// Do assorted work, then get result.
i = result.Value;
// The alternative is to use the Task as the future value.
var task = PerformTask(7);
// Do assorted work, then get result.
i = task.Result;
}
Related
I have read a number of articles and questions here on StackOverflow about wrapping a callback based API with a Task based one using a TaskCompletionSource, and I'm trying to use that sort of technique when communicating with a Solace PubSub+ message broker.
My initial observation was that this technique seems to shift responsibility for concurrency. For example, the Solace broker library has a Send() method which can possibly block, and then we get a callback after the network communication is complete to indicate "real" success or failure. So this Send() method can be called very quickly, and the vendor library limits concurrency internally.
When you put a Task around that it seems you either serialize the operations ( foreach message await SendWrapperAsync(message) ), or take over responsibility for concurrency yourself by deciding how many tasks to start (eg, using TPL dataflow).
In any case, I decided to wrap the Send call with a guarantor that will retry forever until the callback indicates success, as well as take responsibility for concurrency. This is a "guaranteed" messaging system. Failure is not an option. This requires that the guarantor can apply backpressure, but that's not really in the scope of this question. I have a couple of comments about it in my example code below.
What it does mean is that my hot path, which wraps the send + callback, is "extra hot" because of the retry logic. And so there's a lot of TaskCompletionSource creation here.
The vendor's own documentation makes recommendations about reusing their Message objects where possible rather then recreating them for every Send. I have decided to use a Channel as a ring buffer for this. But that made me wonder - is there some alternative to the TaskCompletionSource approach - maybe some other object that can also be cached in the ring buffer and reused, achieving the same outcome?
I realise this is probably an overzealous attempt at micro-optimisation, and to be honest I am exploring several aspects of C# which are above my pay grade (I'm a SQL guy, really), so I could be missing something obvious. If the answer is "you don't actually need this optimisation", that's not going to put my mind at ease. If the answer is "that's really the only sensible way", my curiosity would be satisfied.
Here is a fully functioning console application which simulates the behaviour of the Solace library in the MockBroker object, and my attempt to wrap it. My hot path is the SendOneAsync method in the Guarantor class. The code is probably a bit too long for SO, but it is as minimal a demo I could create that captures all of the important elements.
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;
internal class Message { public bool sent; public int payload; public object correlator; }
// simulate third party library behaviour
internal class MockBroker
{
public bool TrySend(Message m, Action<Message> callback)
{
if (r.NextDouble() < 0.5) return false; // simulate chance of immediate failure / "would block" response
Task.Run(() => { Thread.Sleep(100); m.sent = r.NextDouble() < 0.5; callback(m); }); // simulate network call
return true;
}
private Random r = new();
}
// Turns MockBroker into a "guaranteed" sender with an async concurrency limit
internal class Guarantor
{
public Guarantor(int maxConcurrency)
{
_broker = new MockBroker();
// avoid message allocations in SendOneAsync
_ringBuffer = Channel.CreateBounded<Message>(maxConcurrency);
for (int i = 0; i < maxConcurrency; i++) _ringBuffer.Writer.TryWrite(new Message());
}
// real code pushing into a T.T.T.DataFlow block with bounded capacity and parallelism
// execution options both equal to maxConcurrency here, providing concurrency and backpressure
public async Task Post(int payload) => await SendOneAsync(payload);
private async Task SendOneAsync(int payload)
{
Message msg = await _ringBuffer.Reader.ReadAsync();
msg.payload = payload;
// send must eventually succeed
while (true)
{
// *** can this allocation be avoided? ***
var tcs = new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);
msg.correlator = tcs;
// class method in real code, inlined here to make the logic more apparent
Action<Message> callback = (msg) => (msg.correlator as TaskCompletionSource<bool>).SetResult(msg.sent);
if (_broker.TrySend(msg, callback) && await tcs.Task) break;
else
{
// simple demo retry logic
Console.WriteLine($"retrying {msg.payload}");
await Task.Delay(500);
}
}
// real code raising an event here to indicate successful delivery
await _ringBuffer.Writer.WriteAsync(msg);
Console.WriteLine(payload);
}
private Channel<Message> _ringBuffer;
private MockBroker _broker;
}
internal class Program
{
private static async Task Main(string[] args)
{
// at most 10 concurrent sends
Guarantor g = new(10);
// hacky simulation since in this demo there's nothing generating continuous events,
// no DataFlowBlock providing concurrency (it will be limited by the Channel instead),
// and nobody to notify when messages are successfully sent
List<Task> sends = new(100);
for (int i = 0; i < 100; i++) sends.Add(g.Post(i));
await Task.WhenAll(sends);
}
}
Yes, you can avoid the allocation of TaskCompletionSource instances, by using lightweight ValueTasks instead of Tasks. At first you need a reusable object that can implement the IValueTaskSource<T> interface, and the Message seems like the perfect candidate. For implementing this interface you can use the ManualResetValueTaskSourceCore<T> struct. This is a mutable struct, so it should not be declared as readonly. You just need to delegate the interface methods to the corresponding methods of this struct with the very long name:
using System.Threading.Tasks.Sources;
internal class Message : IValueTaskSource<bool>
{
public bool sent; public int payload; public object correlator;
private ManualResetValueTaskSourceCore<bool> _source; // Mutable struct, not readonly
public void Reset() => _source.Reset();
public short Version => _source.Version;
public void SetResult(bool result) => _source.SetResult(result);
ValueTaskSourceStatus IValueTaskSource<bool>.GetStatus(short token)
=> _source.GetStatus(token);
void IValueTaskSource<bool>.OnCompleted(Action<object> continuation,
object state, short token, ValueTaskSourceOnCompletedFlags flags)
=> _source.OnCompleted(continuation, state, token, flags);
bool IValueTaskSource<bool>.GetResult(short token) => _source.GetResult(token);
}
The three members GetStatus, OnCompleted and GetResult are required for implementing the interface. The other three members (Reset, Version and SetResult) will be used for creating and controlling the ValueTask<bool>s.
Now lets wrap the TrySend method of the MockBroker class in an asynchronous method TrySendAsync, that returns a ValueTask<bool>
static class MockBrokerExtensions
{
public static ValueTask<bool> TrySendAsync(this MockBroker source, Message message)
{
message.Reset();
bool result = source.TrySend(message, m => m.SetResult(m.sent));
if (!result) message.SetResult(false);
return new ValueTask<bool>(message, message.Version);
}
}
The message.Reset(); resets the IValueTaskSource<bool>, and declares that the previous asynchronous operation has completed. A IValueTaskSource<T> supports only one asynchronous operation at a time, the produced ValueTask<T> can be awaited only once, and it can no longer be awaited after the next Reset(). That's the price you have to pay for avoiding the allocation of an object: you must follow stricter rules. If you try to bend the rules (intentionally or unintentionally), the ManualResetValueTaskSourceCore<T> will start throwing InvalidOperationExceptions all over the place.
Now lets use the TrySendAsync extension method:
while (true)
{
if (await _broker.TrySendAsync(msg)) break;
// simple demo retry logic
Console.WriteLine($"retrying {msg.payload}");
await Task.Delay(500);
}
You can print in the Console the GC.GetTotalAllocatedBytes(true) before and after the whole operation, to see the difference. Make sure to run the application in Release mode, to see the real picture. You might see that the difference in not that impressive, because the size of a TaskCompletionSource instance is pretty small compared to the bytes allocated by the Task.Delay, and by all the strings generated for writing stuff in the Console.
I'm trying to design a public method that returns a quick result and, if needed, kicks off a long-running background task.
The result is retrieved within the method pretty quickly, and once it is retrieved it needs to be immediately available to the caller, without waiting for the potential background task to complete.
The background task is either not needed at all, or it must run for a significant amount of time - longer than the caller and all other logic in the program would take to complete without it.
Looking through the MS docs, the best design option I can come up with is to return two things from this method: the result, as well as a Task for the background task.
I don't see a better way to do this, but there are some downsides: first, the responsibility for making sure the background task completes falls on the caller of the method, even though the caller really just wants to consume the immediate result and not be concerned with the behind-the-scene stuff. Second, it is awkward to return null if the background task isn't needed to begin with: now the caller must ensure the task isn't null, in addition to making sure it completes if it is null.
What other options are available for this sort of situation?
Here's an example of what my current design looks like:
public async Task<Tuple<string, Task>> GetAndSave() {
var networkResult = await GetFromNetworkAsync();
if (NeedsToSave(networkResult)) {
var saveTask = SaveToDiskAsync(networkResult);
return new Tuple<string, Task>(networkResult, saveTask);
}
else {
return new Tuple<string, Task>(networkResult, null);
}
}
first, the responsibility for making sure the background task completes falls on the caller of the method, even though the caller really just wants to consume the immediate result and not be concerned with the behind-the-scene stuff.
If it's important to make sure the background task completes then instead of returning the Task you could hand it off to another object (that has been injected into the class that has your GetAndSave method). For example:
public class Foo
{
readonly Action<Task> _ensureCompletion;
public Foo(Action<Task> ensureCompletion)
{
_ensureCompletion = ensureCompletion;
}
public async Task<string> GetAndSaveAsync() // Your method
{
string networkResult = await GetFromNetworkAsync();
if (NeedsToSave(networkResult))
{
Task saveTask = SaveToDiskAsync(networkResult);
_ensureCompletion(saveTask); // This is a synchronous call... no await keyword here. And it returns `void`
}
return networkResult;
}
Task<string> GetFromNetworkAsync() {...}
bool NeedsToSave(string x) {...}
Task SaveToDiskAsync(string x) {...}
}
Now you can inject whatever follow-up behavior you desire for the saveTask. For example, you could write stuff out to the console depending on how it goes:
async Task DoStuff()
{
var foo = new Foo(async task =>
// ^^^^^ More on this in a moment
{
try
{
await task;
Console.Writeline("It worked!");
}
catch (Exception e)
{
Console.Writeline(e.ToString());
}
});
var immediateResult = await foo.GetAndSaveAsync();
// do stuff with `immediateResult`
}
Now, the thing that might be confusing about this (and the power behind this type of solution) is how you can have a synchronous call on the one hand:
_ensureCompletion(saveTask); // This is a synchronous call... no await keyword here
...that does asynchronous things:
var foo = new Foo(async task => ...);
// ^^^^^ This statement lambda is asynchronous
(The injected delegate might even write out to the console on a different thread than whatever thread called GetAndSaveAsync()!)
There's no magic here. It all comes down to SynchronizationContext and the inner workings of async/await.
When the compiler encounters the await keyword (in an async context) then it will do a few things:
Turn everything after the await into a continuation (basically a delegate with some state)
Replace the await keyword with something like SynchronizationContext.Current.Post(continuation)
In other words, this:
async void EnsureCompletion(Task task)
{
try
{
await task;
Console.Writeline("It worked!");
}
catch (Exception e)
{
Console.Writeline(e.ToString());
}
}
...gets turned into something like this:
void EnsureCompletion(Task task)
{
try
{
task.ContinueWith(t => SynchronizationContext.Current.Post(_ =>
{
if (t.IsCompletedSuccessfully)
{
Console.Writeline("It worked!");
}
else
{
Console.Writeline(task.Exception.ToString());
}
});
}
catch (Exception e)
{
Console.Writeline(e.ToString());
}
}
As you can see, EnsureCompletion is an entirely synchronous function that does (almost) the exact same thing as its asynchronous form. Notice how it returns void and everything. This is how you can jam an asynchronous statement lambda into a delegate parameter that has a synchronous signature.
I hinted that the console writing might happen on a totally different thread. That's because async/await is orthogonal to threading. It depends on whatever the currently assigned implementation of SynchronizationContext is programmed to do. Unless you're using WPF or WinForms then by default there will be no SynchronizationContext and continuations (the things that get passed to SynchronizationContext.Post) will just be tossed over to whatever thread happens to be free in the default thread pool.
Second, it is awkward to return null if the background task isn't needed to begin with: now the caller must ensure the task isn't null, in addition to making sure it completes if it is null.
I'm of the opinion that null was a design mistake in C#. (Ask me what some other ones are :) ). If you return null from this method:
public Task DoSomethingAsync()
{
return null; // Compiles just fine
}
...then anyone who uses it will encounter a NullReferenceException when they await it.
async Task Blah()
{
await DoSomethingAsync(); // Wham! NullReferenceException :(
}
(the reason why comes back to how the compiler desugurs the await keyword)
...and it's awkward to have to check for nulls everywhere.
Much better would be to just return Task.CompletedTask as #juharr said.
But if you just hand off the background task like I showed above then you don't have to worry about this.
It's recommended that one use ConfigureAwait(false) whenever when you can, especially in libraries because it can help avoid deadlocks and improve performance.
I have written a library that makes heavy use of async (accesses web services for a DB). The users of the library were getting a deadlock and after much painful debugging and tinkering I tracked it down to the single use of await Task.Yield(). Everywhere else that I have an await, I use .ConfigureAwait(false), however that is not supported on Task.Yield().
What is the recommended solution for situations where one needs the equivalent of Task.Yield().ConfigureAwait(false)?
I've read about how there was a SwitchTo method that was removed. I can see why that could be dangerous, but why is there no equivalent of Task.Yield().ConfigureAwait(false)?
Edit:
To provide further context for my question, here is some code. I am implementing an open source library for accessing DynamoDB (a distributed database as a service from AWS) that supports async. A number of operations return IAsyncEnumerable<T> as provided by the IX-Async library. That library doesn't provide a good way of generating async enumerables from data sources that provide rows in "chunks" i.e. each async request returns many items. So I have my own generic type for this. The library supports a read ahead option allowing the user to specify how much data should be requested ahead of when it is actually needed by a call to MoveNext().
Basically, how this works is that I make requests for chunks by calling GetMore() and passing along state between these. I put those tasks in a chunks queue and dequeue them and turn them into actual results that I put in a separate queue. The NextChunk() method is the issue here. Depending on the value of ReadAhead I will keeping getting the next chunk as soon as the last one is done (All) or not until a value is needed but not available (None) or only get the next chunk beyond the values that are currently being used (Some). Because of that, getting the next chunk should run in parallel/not block getting the next value. The enumerator code for this is:
private class ChunkedAsyncEnumerator<TState, TResult> : IAsyncEnumerator<TResult>
{
private readonly ChunkedAsyncEnumerable<TState, TResult> enumerable;
private readonly ConcurrentQueue<Task<TState>> chunks = new ConcurrentQueue<Task<TState>>();
private readonly Queue<TResult> results = new Queue<TResult>();
private CancellationTokenSource cts = new CancellationTokenSource();
private TState lastState;
private TResult current;
private bool complete; // whether we have reached the end
public ChunkedAsyncEnumerator(ChunkedAsyncEnumerable<TState, TResult> enumerable, TState initialState)
{
this.enumerable = enumerable;
lastState = initialState;
if(enumerable.ReadAhead != ReadAhead.None)
chunks.Enqueue(NextChunk(initialState));
}
private async Task<TState> NextChunk(TState state, CancellationToken? cancellationToken = null)
{
await Task.Yield(); // ** causes deadlock
var nextState = await enumerable.GetMore(state, cancellationToken ?? cts.Token).ConfigureAwait(false);
if(enumerable.ReadAhead == ReadAhead.All && !enumerable.IsComplete(nextState))
chunks.Enqueue(NextChunk(nextState)); // This is a read ahead, so it shouldn't be tied to our token
return nextState;
}
public Task<bool> MoveNext(CancellationToken cancellationToken)
{
cancellationToken.ThrowIfCancellationRequested();
if(results.Count > 0)
{
current = results.Dequeue();
return TaskConstants.True;
}
return complete ? TaskConstants.False : MoveNextAsync(cancellationToken);
}
private async Task<bool> MoveNextAsync(CancellationToken cancellationToken)
{
Task<TState> nextStateTask;
if(chunks.TryDequeue(out nextStateTask))
lastState = await nextStateTask.WithCancellation(cancellationToken).ConfigureAwait(false);
else
lastState = await NextChunk(lastState, cancellationToken).ConfigureAwait(false);
complete = enumerable.IsComplete(lastState);
foreach(var result in enumerable.GetResults(lastState))
results.Enqueue(result);
if(!complete && enumerable.ReadAhead == ReadAhead.Some)
chunks.Enqueue(NextChunk(lastState)); // This is a read ahead, so it shouldn't be tied to our token
return await MoveNext(cancellationToken).ConfigureAwait(false);
}
public TResult Current { get { return current; } }
// Dispose() implementation omitted
}
I make no claim this code is perfect. Sorry it is so long, wasn't sure how to simplify. The important part is the NextChunk method and the call to Task.Yield(). This functionality is used through a static construction method:
internal static class AsyncEnumerableEx
{
public static IAsyncEnumerable<TResult> GenerateChunked<TState, TResult>(
TState initialState,
Func<TState, CancellationToken, Task<TState>> getMore,
Func<TState, IEnumerable<TResult>> getResults,
Func<TState, bool> isComplete,
ReadAhead readAhead = ReadAhead.None)
{ ... }
}
The exact equivalent of Task.Yield().ConfigureAwait(false) (which doesn't exist since ConfigureAwait is a method on Task and Task.Yield returns a custom awaitable) is simply using Task.Factory.StartNew with CancellationToken.None, TaskCreationOptions.PreferFairness and TaskScheduler.Current. In most cases however, Task.Run (which uses the default TaskScheduler) is close enough.
You can verify that by looking at the source for YieldAwaiter and see that it uses ThreadPool.QueueUserWorkItem/ThreadPool.UnsafeQueueUserWorkItem when TaskScheduler.Current is the default one (i.e. thread pool) and Task.Factory.StartNew when it isn't.
You can however create your own awaitable (as I did) that mimics YieldAwaitable but disregards the SynchronizationContext:
async Task Run(int input)
{
await new NoContextYieldAwaitable();
// executed on a ThreadPool thread
}
public struct NoContextYieldAwaitable
{
public NoContextYieldAwaiter GetAwaiter() { return new NoContextYieldAwaiter(); }
public struct NoContextYieldAwaiter : INotifyCompletion
{
public bool IsCompleted { get { return false; } }
public void OnCompleted(Action continuation)
{
var scheduler = TaskScheduler.Current;
if (scheduler == TaskScheduler.Default)
{
ThreadPool.QueueUserWorkItem(RunAction, continuation);
}
else
{
Task.Factory.StartNew(continuation, CancellationToken.None, TaskCreationOptions.PreferFairness, scheduler);
}
}
public void GetResult() { }
private static void RunAction(object state) { ((Action)state)(); }
}
}
Note: I don't recommend actually using NoContextYieldAwaitable, it's just an answer to your question. You should be using Task.Run (or Task.Factory.StartNew with a specific TaskScheduler)
I noticed you edited your question after you accepted the existing answer, so perhaps you're interested in more rants on the subject. Here you go :)
It's recommended that one use ConfigureAwait(false) whenever when you
can, especially in libraries because it can help avoid deadlocks and
improve performance.
It's recommended so, only if you're absolutely sure that any API you're calling in your implementation (including Framework APIs) doesn't depend on any properties of synchronization context. That's especially important for a library code, and even more so if the library is suitable for both client-side and server-side use. E.g, CurrentCulture is a common overlook: it would never be an issue for a desktop app, but it well may be for an ASP.NET app.
Back to your code:
private async Task<TState> NextChunk(...)
{
await Task.Yield(); // ** causes deadlock
var nextState = await enumerable.GetMore(...);
// ...
return nextState;
}
Most likely, the deadlock is caused by the client of your library, because they use Task.Result (or Task.Wait, Task.WaitAll, Task.IAsyncResult.AsyncWaitHandle etc, let them search) somewhere in the outer frame of the call chain. Albeit Task.Yield() is redundant here, this is not your problem in the first place, but rather theirs: they shouldn't be blocking on the asynchronous APIs and should be using "Async All the Way", as also explained in the Stephen Cleary's article you linked.
Removing Task.Yield() may or may not solve this problem, because enumerable.GetMore() can also use some await SomeApiAsync() without ConfigureAwait(false), thus posting the continuation back to the caller's synchronization context. Moreover, "SomeApiAsync" can happen to be a well established Framework API which is still vulnerable to a deadlock, like SendMailAsync, we'll get back to it later.
Overall, you should only be using Task.Yield() if for some reason you want to return to the caller immediately ("yield" the execution control back to the caller), and then continue asynchronously, at the mercy of the SynchronizationContext installed on the calling thread (or ThreadPool, if SynchronizationContext.Current == null). The continuation well may be executed on the same thread upon the next iteration of the app's core message loop. Some more details can be found here:
Task.Yield - real usages?
So, the right thing would be to avoid blocking code all the way. However, say, you still want to make your code deadlock-proof, you don't care about synchronization context and you're sure the same is true about any system or 3rd party API you use in your implementation.
Then, instead of reinventing ThreadPoolEx.SwitchTo (which was removed for a good reason), you could just use Task.Run, as suggested in the comments:
private Task<TState> NextChunk(...)
{
// jump to a pool thread without SC to avoid deadlocks
return Task.Run(async() =>
{
var nextState = await enumerable.GetMore(...);
// ...
return nextState;
});
}
IMO, this is still a hack, with the same net effect, although a much more readable one than using a variation of ThreadPoolEx.SwitchTo(). Same as SwitchTo, it still has an associated cost: a redundant thread switch which may hurt ASP.NET performance.
There is another (IMO better) hack, which I proposed here to address the deadlock with aforementioned SendMailAsync. It doesn't incur an extra thread switch:
private Task<TState> NextChunk(...)
{
return TaskExt.WithNoContext(async() =>
{
var nextState = await enumerable.GetMore(...);
// ...
return nextState;
});
}
public static class TaskExt
{
public static Task<TResult> WithNoContext<TResult>(Func<Task<TResult>> func)
{
Task<TResult> task;
var sc = SynchronizationContext.Current;
try
{
SynchronizationContext.SetSynchronizationContext(null);
task = func(); // do not await here
}
finally
{
SynchronizationContext.SetSynchronizationContext(sc);
}
return task;
}
}
This hack works in the way it temporarily removes the synchronization context for the synchronous scope of the original NextChunk method, so it won't be captured for the 1st await continuation inside the async lambda, effectively solving the deadlock problem.
Stephen has provided a slightly different implementation while answering the same question. His IgnoreSynchronizationContext restores the original synchronization context on whatever happens to be the continuation's thread after await (which could be a completely different, random pool thread). I'd rather not restore it after await at all, as long as I don't care about it.
Inasmuch as the useful and legit API you're looking for is missing, I filed this request proposing its addition to .NET.
I also added it to vs-threading so that the next release of the Microsoft.VisualStudio.Threading NuGet package will include this API. Note that this library is not VS-specific, so you can use it in your app.
I have a parametrized rest call that should be executed every five seconds with different params:
Observable<TResult> restCall = api.method1(param1);
I need to create an Observable<TResult> which will poll the restCall every 5 seconds with different values for param1. If the api call fails I need to get an error and make the next call in 5 seconds. The interval between calls should be measured only when restCall is finished (success/error).
I'm currently using RxJava, but a .NET example would also be good.
Introduction
First, an admission, I'm a .NET guy, and I know this approach uses some idioms that have no direct equivalent in Java. But I'm taking you at your word and proceeding on the basis that this is a great question that .NET guys will enjoy, and that hopefully it will lead you down the right path in rx-java, which I have never looked at. This is quite a long answer, but it's mostly explanation - the solution code itself is pretty short!
Use of Either
We will need sort some tools out first to help with this solution. The first is the use of the Either<TLeft, TRight> type. This is important, because you have two possible outcomes of each call either a good result, or an error. But we need to wrap these in a single type - we can't use OnError to send errors back since this would terminate the result stream. Either looks a bit like a Tuple and makes it easier to deal with this situation. The Rxx library has a very full and good implementation of Either, but here is a simple generic example of usage followed by a simple implementation good for our purposes:
var goodResult = Either.Right<Exception,int>(1);
var exception = Either.Left<Exception,int>(new Exception());
/* base class for LeftValue and RightValue types */
public abstract class Either<TLeft, TRight>
{
public abstract bool IsLeft { get; }
public bool IsRight { get { return !IsLeft; } }
public abstract TLeft Left { get; }
public abstract TRight Right { get; }
}
public static class Either
{
public sealed class LeftValue<TLeft, TRight> : Either<TLeft, TRight>
{
TLeft _leftValue;
public LeftValue(TLeft leftValue)
{
_leftValue = leftValue;
}
public override TLeft Left { get { return _leftValue; } }
public override TRight Right { get { return default(TRight); } }
public override bool IsLeft { get { return true; } }
}
public sealed class RightValue<TLeft, TRight> : Either<TLeft, TRight>
{
TRight _rightValue;
public RightValue(TRight rightValue)
{
_rightValue = rightValue;
}
public override TLeft Left { get { return default(TLeft); } }
public override TRight Right { get { return _rightValue; } }
public override bool IsLeft { get { return false; } }
}
// Factory functions to create left or right-valued Either instances
public static Either<TLeft, TRight> Left<TLeft, TRight>(TLeft leftValue)
{
return new LeftValue<TLeft, TRight>(leftValue);
}
public static Either<TLeft, TRight> Right<TLeft, TRight>(TRight rightValue)
{
return new RightValue<TLeft, TRight>(rightValue);
}
}
Note that by convention when using Either to model a success or failure, the Right side is used for the successful value, because it's "Right" of course :)
Some Helper Functions
I'm going to simulate two aspects of your problem with some helper functions. First, here is a factory to generate parameters - each time it is called it will return the next integer in the sequence of integers starting with 1:
// An infinite supply of parameters
private static int count = 0;
public int ParameterFactory()
{
return ++count;
}
Next, here is a function that simulates your Rest call as an IObservable. This function accepts an integer and:
If the integer is even it returns an Observable that immediately sends an OnError.
If the integer is odd it returns a string concatenating the integer with "-ret", but only after a second has passed. We will use this to check the polling interval is behaving as you requested - as a pause between completed invocations however long they take, rather than a regular interval.
Here it is:
// A asynchronous function representing the REST call
public IObservable<string> SomeRestCall(int x)
{
return x % 2 == 0
? Observable.Throw<string>(new Exception())
: Observable.Return(x + "-ret").Delay(TimeSpan.FromSeconds(1));
}
Now The Good Bit
Below is a reasonably generic reusable function I have called Poll. It accepts an asynchronous function that will be polled, a parameter factory for that function, the desired rest (no pun intended!) interval, and finally an IScheduler to use.
The simplest approach I could come up with is to use Observable.Create that uses a scheduler to drive the result stream. ScheduleAsync is a way of Scheduling that uses the .NET async/await form. This is a .NET idiom that allows you to write asynchronous code in an imperative fashion. The async keyword introduces an asynchronous function that can then await one or more asynchronous calls in it's body and will continue on only when the call completes. I wrote a long explanation of this style of scheduling in this question, which includes the older recursive the style that might be easier to implement in an rx-java approach. The code looks like this:
public IObservable<Either<Exception, TResult>> Poll<TResult, TArg>(
Func<TArg, IObservable<TResult>> asyncFunction,
Func<TArg> parameterFactory,
TimeSpan interval,
IScheduler scheduler)
{
return Observable.Create<Either<Exception, TResult>>(observer =>
{
return scheduler.ScheduleAsync(async (ctrl, ct) => {
while(!ct.IsCancellationRequested)
{
try
{
var result = await asyncFunction(parameterFactory());
observer.OnNext(Either.Right<Exception,TResult>(result));
}
catch(Exception ex)
{
observer.OnNext(Either.Left<Exception, TResult>(ex));
}
await ctrl.Sleep(interval, ct);
}
});
});
}
Breaking this down, Observable.Create in general is a factory for creating IObservables that gives you a great deal of control over how results are posted to observers. It's often overlooked in favour of unnecessarily complex composition of primitives.
In this case, we are using it to create a stream of Either<TResult, Exception> so that we can return the successful and failed polling results.
The Create function accepts an observer that represents the Subscriber to which we pass results to via OnNext/OnError/OnCompleted. We need to return an IDisposable within the Create call - in .NET this is a handle by which the Subscriber can cancel their subscription. It's particularly important here because Polling will otherwise go on forever - or at least it won't ever OnComplete.
The result of ScheduleAsync (or plain Schedule) is such a handle. When disposed, it will cancel any pending event we Scheduled - thereby ending the the polling loop. In our case, the Sleep we use to manage the interval is the cancellable operation, although the Poll function could easily be modified to accept a cancellable asyncFunction that accepts a CancellationToken as well.
The ScheduleAsync method accepts a function that will be called to schedule events. It is passed two arguments, the first ctrl is the scheduler itself. The second ct is a CancellationToken we can use to see if cancellation has been requested (by the Subscriber disposing their subscription handle).
The polling itself is performed via an infinite while loop that terminates only if the CancellationToken indicates cancellation has been requested.
In the loop, we can use the magic of async/await to asynchronously invoke the polling function yet still wrap it in an exception handler. This is so awesome! Assuming no error, we send the result as the right value of an Either to the observer via OnNext. If there was an exception, we send that as the left value of an Either to the observer. Finally, we use the Sleep function on the scheduler to schedule a wake-up call after the rest interval - not to be confused with a Thread.Sleep call, this one typically doesn't block any threads. Note that Sleep accepts the CancellationToken enabling that to be aborted as well!
I think you'll agree this is a pretty cool use of async/await to simplify what would have been an awfully tricky problem!
Example Usage
Finally, here is some test code that calls Poll, along with sample output - for LINQPad fans all the code together in this answer will run in LINQPad with Rx 2.1 assemblies referenced:
void Main()
{
var subscription = Poll(SomeRestCall,
ParameterFactory,
TimeSpan.FromSeconds(5),
ThreadPoolScheduler.Instance)
.TimeInterval()
.Subscribe(x => {
Console.Write("Interval: " + x.Interval);
var result = x.Value;
if(result.IsRight)
Console.WriteLine(" Success: " + result.Right);
else
Console.WriteLine(" Error: " + result.Left.Message);
});
Console.ReadLine();
subscription.Dispose();
}
Interval: 00:00:01.0027668 Success: 1-ret
Interval: 00:00:05.0012461 Error: Exception of type 'System.Exception' was thrown.
Interval: 00:00:06.0009684 Success: 3-ret
Interval: 00:00:05.0003127 Error: Exception of type 'System.Exception' was thrown.
Interval: 00:00:06.0113053 Success: 5-ret
Interval: 00:00:05.0013136 Error: Exception of type 'System.Exception' was thrown.
Note the interval between results is either 5 seconds (the polling interval) if an error was immediately returned, or 6 seconds (the polling interval plus the simulated REST call duration) for a successful result.
EDIT - Here is an alternative implementation that doesn't use ScheduleAsync, but uses old style recursive scheduling and no async/await syntax. As you can see, it's a lot messier - but it does also support cancelling the asyncFunction observable.
public IObservable<Either<Exception, TResult>> Poll<TResult, TArg>(
Func<TArg, IObservable<TResult>> asyncFunction,
Func<TArg> parameterFactory,
TimeSpan interval,
IScheduler scheduler)
{
return Observable.Create<Either<Exception, TResult>>(
observer =>
{
var disposable = new CompositeDisposable();
var funcDisposable = new SerialDisposable();
bool cancelRequested = false;
disposable.Add(Disposable.Create(() => { cancelRequested = true; }));
disposable.Add(funcDisposable);
disposable.Add(scheduler.Schedule(interval, self =>
{
funcDisposable.Disposable = asyncFunction(parameterFactory())
.Finally(() =>
{
if (!cancelRequested) self(interval);
})
.Subscribe(
res => observer.OnNext(Either.Right<Exception, TResult>(res)),
ex => observer.OnNext(Either.Left<Exception, TResult>(ex)));
}));
return disposable;
});
}
See my other answer for a different approach that avoids .NET 4.5 async/await features and doesn't use Schedule calls.
I do hope that is some help to the rx-java guys!
I've cleaned up the approach that doesn't use the Schedule call directly - using the Either type from my other answer - it will also work with the same test code and give the same results:
public IObservable<Either<Exception, TResult>> Poll2<TResult, TArg>(
Func<TArg, IObservable<TResult>> asyncFunction,
Func<TArg> parameterFactory,
TimeSpan interval,
IScheduler scheduler)
{
return Observable.Create<Either<Exception, TResult>>(
observer =>
Observable.Defer(() => asyncFunction(parameterFactory()))
.Select(Either.Right<Exception, TResult>)
.Catch<Either<Exception, TResult>, Exception>(
ex => Observable.Return(Either.Left<Exception, TResult>(ex)))
.Concat(Observable.Defer(
() => Observable.Empty<Either<Exception, TResult>>()
.Delay(interval, scheduler)))
.Repeat().Subscribe(observer));
}
This has proper cancellation semantics.
Notes on implementation
The whole construction uses a Repeat to get the looping behaviour.
The initial Defer is used to ensure a different parameter is passed on each iteration
The Select projects the OnNext result to an Either on right side
The Catch projects an OnError result to an Either on the left side - note that this exception terminates the current asyncFunction observable, hence the need for repeat
The Concat adds in the interval delay
My opinion is that the scheduling version is more readable, but this one doesn't use async/await and is therefore .NET 4.0 compatible.
I am looking for a simple way to do a task in background, then update something (on the main thread) when it completes. It's in a low level 'model' class so I can't call InvokeOnMainThread as I don't have an NSObject around. I have this method:
public void GetItemsAsync(Action<Item[]> handleResult)
{
Item[] items=null;
Task task=Task.Factory.StartNew(() =>
{
items=this.CreateItems(); // May take a second or two
});
task.ContinueWith(delegate
{
handleResult(items);
}, TaskScheduler.FromCurrentSynchronizationContext());
}
This seems to work OK, but
1) Is this the best (simplest) way?
2) I'm worried about the local variable:
Item{} items=null
What stops that disappearing when the method returns before the background thread completes?
Thanks.
I think your method slightly violates a Single Responsibility Principle, because it doing too much.
First of all I suggest to change CreateItems to return Task instead of wrapping it in the GetItemsAsync:
public Task<Item[]> CreateItems(CancellationToken token)
{
return Task.Factory.StartNew(() =>
// obtaining the data...
{});
}
CancellationToken is optional but can help you if you able to cancel this long running operation.
With this method you can remove GetItemsAsync entirely because its so simple to handle results by your client without passing this delegate:
// Somewhere in the client of your class
var task = yourClass.CreateItems(token);
task.ContinueWith(t =>
// Code of the delegate that previously
// passed to GetItemsAsync method
{}, TaskScheduler.FromCurrentSynchronizationContext());
Using this approach you'll get more clear code with only one responsibility. Task class itself is a perfect tool for representing asynchronous operation as a first class object. Using proposed technique you can easily mock you current implementation with a fake behavior for unit testing without changing your clients code.
Something like this:
public void GetItemsAsync(Action<Item[]> handleResult)
{
int Id = 11;
Task<Item[]> task = Task.Factory.StartNew(() => CreateItems(Id)); // May take a second or two
task.ContinueWith(t => handleResult(t.Result), TaskScheduler.FromCurrentSynchronizationContext());
}
Your code looks fine.
It's a perfect example of when to use async / await if you can use C# 5, but if not, you have to write it as you have done with continuations.
The items variable is captured in your lambdas, so that's fine too.
When you write a lambda that uses an outside variable, the C# compiler creates a class that contains the variable. This is called a closure and means you can access the variable inside your lambda.
Here is a nicer soultion:
public void GetItemsAsync(Action<Item[]> handleResult)
{
var task = Task.Factory.StartNew<Item[]>(() =>
{
return this.CreateItems(); // May take a second or two
});
task.ContinueWith(delegate
{
handleResult(task.Result);
}, TaskScheduler.FromCurrentSynchronizationContext());
}