Reactive Extensions Observerable.FromAsync: How to wait until async operation is finished - c#

In my application I´m getting events from a message bus (rabbit mq, but actually it does not really matter). The processing of this events is taking rather long and the events are produced in bursts. Only one event should be processed at a time. To fulfill this I´m using Rx in order to serialize the events and execute the processing asynchronously in order to not block the producer.
Here is the (simplified) sample code:
static void Main(string[] args)
{
var input = Observable.Interval(TimeSpan.FromMilliseconds(500));
var subscription = input.Select(i => Observable.FromAsync(ct => Process(ct, i)))
.Concat()
.Subscribe();
Console.ReadLine();
subscription.Dispose();
// Wait until finished?
Console.WriteLine("Completed");
}
private static async Task Process(CancellationToken cts, long i)
{
Console.WriteLine($"Processing {i} ...");
await Task.Delay(1000).ConfigureAwait(false);
Console.WriteLine($"Finished processing {i}");
}
The sample application is disposing the subscription and then the application is terminated. But in this sample, the application is terminating while the last event received before the subscription is still being processed.
Question: What is the best way to wait until the last event is processed after the subscription is disposed? I´m still trying to wrap my head around the async Rx stuff. I assume there is a rather easy way to do this that I´m just not seeing right now.

If you are OK with a quick and easy solution, that is applicable for simple cases like this, then you could just Wait the IObservable instead of subscribing to it. And if you want to receive subscription-like notifications, use the Do operator just before the final Wait:
static void Main(string[] args)
{
var input = Observable.Interval(TimeSpan.FromMilliseconds(500));
input.Select(i => Observable.FromAsync(ct => Process(ct, i)))
.Concat()
.Do(onNext: x => { }, onError: ex => { }, onCompleted: () => { })
.Wait();
Console.WriteLine("Completed");
}

Thanks to the suggestion from Theodor I now found a solution that works for me:
static async Task Main(string[] args)
{
var inputSequence = Observable.Interval(TimeSpan.FromMilliseconds(500));
var terminate = new Subject<Unit>();
var task = Execute(inputSequence, terminate);
Console.ReadLine();
terminate.OnNext(Unit.Default);
await task.ConfigureAwait(false);
Console.WriteLine($"Completed");
Console.ReadLine();
}
private static async Task Process(CancellationToken cts, long i)
{
Console.WriteLine($"Processing {i} ...");
await Task.Delay(1000).ConfigureAwait(false);
Console.WriteLine($"Finished processing {i}");
}
private static async Task Execute(IObservable<long> input, IObservable<Unit> terminate)
{
await input
.TakeUntil(terminate)
.Select(i => Observable.FromAsync(ct => Process(ct, i)))
.Concat();
}

Related

Switch new Task(()=>{ }) for Func<Task>

In an answer to one of my other questions, I was told that use of new Task(() => { }) is not something that is a normal use case. I was advised to use Func<Task> instead. I have tried to make that work, but I can't seem to figure it out. (Rather than drag it out in the comments, I am asking a separate question here.)
My specific scenario is that I need the Task to not start right when it is declared and to be able to wait for it later.
Here is a LinqPad example using new Task(() => { }). NOTE: This works perfectly! (Except that it uses new Task.)
static async void Main(string[] args)
{
// Line that I need to swap to a Func<Task> somehow.
// note that this is "cold" not started task
Task startupDone = new Task(() => { });
var runTask = DoStuff(() =>
{
//+++ This is where we want to task to "start"
startupDone.Start();
});
//+++ Here we wait for the task to possibly start and finish. Or timeout.
// Note that this times out at 1000ms even if "blocking = 10000" below.
var didStartup = startupDone.Wait(1000);
Console.WriteLine(!didStartup ? "Startup Timed Out" : "Startup Finished");
await runTask;
Console.Read();
}
public static async Task DoStuff(Action action)
{
// Swap to 1000 to simulate starting up blocking
var blocking = 1; //1000;
await Task.Delay(500 + blocking);
action();
// Do the rest of the stuff...
await Task.Delay(1000);
}
I tried swapping the second line with:
Func<Task> startupDone = new Func<Task>(async () => { });
But then the lines below the comments with +++ in them don't work right.
I swapped the startupDone.Start() with startupDone.Invoke().
But startupDone.Wait needs the task. Which is only returned in the lambda. I am not sure how to get access to the task outside the lambda so I can Wait for it.
How can use a Func<Task> and start it in one part of my code and do a Wait for it in another part of my code? (Like I can with new Task(() => { })).
The code you posted cannot be refactored to make use of a Func<Task> instead of a cold task, because the method that needs to await the task (the Main method) is not the same method that controls the creation/starting of the task (the lambda parameter of the DoStuff method). This could make the use of the Task constructor legitimate in this case, depending on whether the design decision to delegate the starting of the task to a lambda is justified. In this particular example the startupDone is used as a synchronization primitive, to signal that a condition has been met and the program can continue. This could be achieved equally well by using a specialized synchronization primitive, like for example a SemaphoreSlim:
static async Task Main(string[] args)
{
var startupSemaphore = new SemaphoreSlim(0);
Task runTask = RunAsync(startupSemaphore);
bool startupFinished = await startupSemaphore.WaitAsync(1000);
Console.WriteLine(startupFinished ? "Startup Finished" : "Startup Timed Out");
await runTask;
}
public static async Task RunAsync(SemaphoreSlim startupSemaphore)
{
await Task.Delay(500);
startupSemaphore.Release(); // Signal that the startup is done
await Task.Delay(1000);
}
In my opinion using a SemaphoreSlim is more meaningful in this case, and makes the intent of the code clearer. It also allows to await asynchronously the signal with a timeout WaitAsync(Int32), which is not something that you get from a Task out of the box (it is doable though).
Using cold tasks may be tempting in some cases, but when you revisit your code after a month or two you'll find yourself confused, because of how rare and unexpected is to have to deal with tasks that may or may have not been started yet.
I always try my hardest to never have blocking behavior when dealing with anything async or any type that represents potential async behavior such as Task. You can slightly modify your DoStuff to facilitate waiting on your Action.
static async void Main(string[] args)
{
Func<CancellationToken,Task> startupTask = async(token)=>
{
Console.WriteLine("Waiting");
await Task.Delay(3000, token);
Console.WriteLine("Completed");
};
using var source = new CancellationTokenSource(2000);
var runTask = DoStuff(() => startupTask(source.Token), source.Token);
var didStartup = await runTask;
Console.WriteLine(!didStartup ? "Startup Timed Out" : "Startup Finished");
Console.Read();
}
public static async Task<bool> DoStuff(Func<Task> action, CancellationToken token)
{
var blocking = 10000;
try
{
await Task.Delay(500 + blocking, token);
await action();
}
catch(TaskCanceledException ex)
{
return false;
}
await Task.Delay(1000);
return true;
}
First, the type of your "do this later" object is going to become Func<Task>. Then, when the task is started (by invoking the function), you get back a Task that represents the operation:
static async void Main(string[] args)
{
Func<Task> startupDoneDelegate = async () => { };
Task startupDoneTask = null;
var runTask = await DoStuff(() =>
{
startupDoneTask = startupDoneDelegate();
});
var didStartup = startupDoneTask.Wait(1000);
Console.WriteLine(!didStartup ? "Startup Timed Out" : "Startup Finished");
}

Is there a proper way to cancel an asynchronous task?

I encountered a problem how to properly cancel an asynchronous task.
Here is some draft.
My entry point runs two asynchronous task. The first task does some 'long' work and the second one cancels it.
Entry point:
private static void Main()
{
var ctc = new CancellationTokenSource();
var cancellable = ExecuteLongCancellableMethod(ctc.Token);
var cancelationTask = Task.Run(() =>
{
Thread.Sleep(2000);
Console.WriteLine("[Before cancellation]");
ctc.Cancel();
});
try
{
Task.WaitAll(cancellable, cancelationTask);
}
catch (Exception e)
{
Console.WriteLine($"An exception occurred with type {e.GetType().Name}");
}
}
Method that returns cancel-able task:
private static Task ExecuteLongCancellableMethod(CancellationToken token)
{
return Task.Run(() =>
{
token.ThrowIfCancellationRequested();
Console.WriteLine("1st");
Thread.Sleep(1000);
Console.WriteLine("2nd");
Thread.Sleep(1000);
Console.WriteLine("3rd");
Thread.Sleep(1000);
Console.WriteLine("4th");
Thread.Sleep(1000);
Console.WriteLine("[Completed]");
}, token);
}
My purpose is to stop writing '1st','2nd','3rd' immediately after cancellation is called. But I get following results:
1st
2nd
3rd
[Before cancellation]
4th
[Completed]
For obvious reason I didn't get an exception that throws when cancellation is requested. So I tried to rewrite method as following:
private static Task ExecuteLongCancellableAdvancedMethod(CancellationToken token)
{
return Task.Run(() =>
{
var actions = new List<Action>
{
() => Console.WriteLine("1st"),
() => Console.WriteLine("2nd"),
() => Console.WriteLine("3rd"),
() => Console.WriteLine("4th"),
() => Console.WriteLine("[Completed]")
};
foreach (var action in actions)
{
token.ThrowIfCancellationRequested();
action.Invoke();
Thread.Sleep(1000);
}
}, token);
}
And now I got what I want:
1st
2nd
[Before cancellation]
3rd
An exception occurred with type AggregateException
But I guess creating of a collection of Action delegates and looping through it is not the most convenient way to deal with my problem.
So what's the proper way to do it? And why do I need to pass my cancellation token into Task.Run method as the second argument?
The Task won't cancel its self, it is up you you to detect the cancellation request and cleanly abort your work. That's what token.ThrowIfCancellationRequested(); does.
You should place those checks throughout your code, in places where execution can be cleanly stopped, or rolled back to a safe state.
In your second example, you call it once per iteration of the loop, and it works fine. The first example only calls it once, at the very start. If the token hasn't been canceled by that point, the task will run to completion, just like you are seeing.
If you changed it to look like this, you would also see the results you expect.
return Task.Run(() =>
{
token.ThrowIfCancellationRequested();
Console.WriteLine("1st");
Thread.Sleep(1000);
token.ThrowIfCancellationRequested();
Console.WriteLine("2nd");
Thread.Sleep(1000);
token.ThrowIfCancellationRequested();
Console.WriteLine("3rd");
Thread.Sleep(1000);
token.ThrowIfCancellationRequested();
Console.WriteLine("4th");
Thread.Sleep(1000);
Console.WriteLine("[Completed]");
}, token);

Ensuring completion of async OnNext code before process terminates

The unit test below will never print "Async 3" because the test finishes first. How can I ensure it runs to completion? The best I could come up with was an arbitrary Task.Delay at the end or WriteAsync().Result, neither are ideal.
public async Task TestMethod1() // eg. webjob
{
TestContext.WriteLine("Starting test...");
var observable = Observable.Create<int>(async ob =>
{
ob.OnNext(1);
await Task.Delay(1000); // Fake async REST api call
ob.OnNext(2);
await Task.Delay(1000);
ob.OnNext(3);
ob.OnCompleted();
});
observable.Subscribe(i => TestContext.WriteLine($"Sync {i}"));
observable.SelectMany(i => WriteAsync(i).ToObservable()).Subscribe();
await observable;
TestContext.WriteLine("Complete.");
}
public async Task WriteAsync(int value) // Fake async DB call
{
await Task.Delay(1000);
TestContext.WriteLine($"Async {value}");
}
Edit
I realise that mentioning unit tests was probably misleading.This isn't a testing question. The code is a simulation of a real issue of a process running in an Azure WebJob, where both the producer and consumer need to call some Async IO. The issue is that the webjob runs to completion before the consumer has really finished. This is because I can't figure out how to properly await anything from the consumer side. Maybe this just isn't possible with RX...
EDIT:
You're basically looking for a blocking operator. The old blocking operators (like ForEach) were deprecated in favor of async versions. You want to await the last item like so:
public async Task TestMethod1()
{
TestContext.WriteLine("Starting test...");
var observable = Observable.Create<int>(async ob =>
{
ob.OnNext(1);
await Task.Delay(1000);
ob.OnNext(2);
await Task.Delay(1000);
ob.OnNext(3);
ob.OnCompleted();
});
observable.Subscribe(i => TestContext.WriteLine($"Sync {i}"));
var selectManyObservable = observable.SelectMany(i => WriteAsync(i).ToObservable()).Publish().RefCount();
selectManyObservable.Subscribe();
await selectManyObservable.LastOrDefaultAsync();
TestContext.WriteLine("Complete.");
}
While that will solve your immediate problem, it looks like you're going to keep running into issues because of the below (and I added two more). Rx is very powerful when used right, and confusing as hell when not.
Old answer:
A couple things:
Mixing async/await and Rx generally results in getting the pitfalls of both and the benefits of neither.
Rx has robust testing functionality. You're not using it.
Side-Effects, like a WriteLine are best performed exclusively in a subscribe, and not in an operator like SelectMany.
You may want to brush up on cold vs hot observables.
The reason it isn't running to completion is because of your test runner. Your test runner is terminating the test at the conclusion of TestMethod1. The Rx subscription would live on otherwise. When I run your code in Linqpad, I get the following output:
Starting test...
Sync 1
Sync 2
Async 1
Sync 3
Async 2
Complete.
Async 3
...which is what I'm assuming you want to see, except you probably want the Complete after the Async 3.
Using Rx only, your code would look something like this:
public void TestMethod1()
{
TestContext.WriteLine("Starting test...");
var observable = Observable.Concat<int>(
Observable.Return(1),
Observable.Empty<int>().Delay(TimeSpan.FromSeconds(1)),
Observable.Return(2),
Observable.Empty<int>().Delay(TimeSpan.FromSeconds(1)),
Observable.Return(3)
);
var syncOutput = observable
.Select(i => $"Sync {i}");
syncOutput.Subscribe(s => TestContext.WriteLine(s));
var asyncOutput = observable
.SelectMany(i => WriteAsync(i, scheduler));
asyncOutput.Subscribe(s => TestContext.WriteLine(s), () => TestContext.WriteLine("Complete."));
}
public IObservable<string> WriteAsync(int value, IScheduler scheduler)
{
return Observable.Return(value)
.Delay(TimeSpan.FromSeconds(1), scheduler)
.Select(i => $"Async {value}");
}
public static class TestContext
{
public static void WriteLine(string s)
{
Console.WriteLine(s);
}
}
This still isn't taking advantage of Rx's testing functionality. That would look like this:
public void TestMethod1()
{
var scheduler = new TestScheduler();
TestContext.WriteLine("Starting test...");
var observable = Observable.Concat<int>(
Observable.Return(1),
Observable.Empty<int>().Delay(TimeSpan.FromSeconds(1), scheduler),
Observable.Return(2),
Observable.Empty<int>().Delay(TimeSpan.FromSeconds(1), scheduler),
Observable.Return(3)
);
var syncOutput = observable
.Select(i => $"Sync {i}");
syncOutput.Subscribe(s => TestContext.WriteLine(s));
var asyncOutput = observable
.SelectMany(i => WriteAsync(i, scheduler));
asyncOutput.Subscribe(s => TestContext.WriteLine(s), () => TestContext.WriteLine("Complete."));
var asyncExpected = scheduler.CreateColdObservable<string>(
ReactiveTest.OnNext(1000.Ms(), "Async 1"),
ReactiveTest.OnNext(2000.Ms(), "Async 2"),
ReactiveTest.OnNext(3000.Ms(), "Async 3"),
ReactiveTest.OnCompleted<string>(3000.Ms() + 1) //+1 because you can't have two notifications on same tick
);
var syncExpected = scheduler.CreateColdObservable<string>(
ReactiveTest.OnNext(0000.Ms(), "Sync 1"),
ReactiveTest.OnNext(1000.Ms(), "Sync 2"),
ReactiveTest.OnNext(2000.Ms(), "Sync 3"),
ReactiveTest.OnCompleted<string>(2000.Ms()) //why no +1 here?
);
var asyncObserver = scheduler.CreateObserver<string>();
asyncOutput.Subscribe(asyncObserver);
var syncObserver = scheduler.CreateObserver<string>();
syncOutput.Subscribe(syncObserver);
scheduler.Start();
ReactiveAssert.AreElementsEqual(
asyncExpected.Messages,
asyncObserver.Messages);
ReactiveAssert.AreElementsEqual(
syncExpected.Messages,
syncObserver.Messages);
}
public static class MyExtensions
{
public static long Ms(this int ms)
{
return TimeSpan.FromMilliseconds(ms).Ticks;
}
}
...So unlike your Task tests, you don't have to wait. The test executes instantly. You can bump up the Delay times to minutes or hours, and the TestScheduler will essentially mock the time for you. And then your test runner will probably be happy.
Well, you can use Observable.ForEach to block until an IObservable has terminated:
observable.ForEach(unusedValue => { });
Can you make TestMethod1 a normal, non-async method, and then replace await observable; with this?

Where does Task Execution go?

Based on following code, my expectation was that console would emit
SayTaskHelloCalled
Task Executing
SayHelloAfterSleepTask
But task doesn't runs. Only first line emits in console. Please suggest why?
static void Main(string[] args)
{
var task2 = SayHelloTask();
var result = task2.Result;
Console.WriteLine(result);
}
public static Task<string> SayHelloTask()
{
Thread.Sleep(2000);
Console.WriteLine("SayHelloTaskCalled");
return new Task<string>(() => {
Console.WriteLine("Task Executing");
return "SayHelloAfterSleepTask";
});
Creating a new Task using its one of the constructors hands you back a "Cold Task". Meaning that the Task isn't started yet. Since you've never started the Task, you don't see the expected output.
You need to call Task.Start to start it. In order to return a "Hot Task"(Started Task), you need to use Task.Factory.StartNew or Task.Run.
Following should work:
public static Task<string> SayHelloTask()
{
Thread.Sleep(2000);
Console.WriteLine("SayHelloTaskCalled");
return Task.Run(() => {
Console.WriteLine("Task Executing");
return "SayHelloAfterSleepTask";
});
}
If you prefer your Task to be the "Cold Task" itself, then modify your calling code as below.
static void Main(string[] args)
{
var task2 = SayHelloTask();
task2.Start();//<--Start a "Cold task"
var result = task2.Result;
Console.WriteLine(result);
}

Handle tasks which complete after Task.WhenAll().Wait() specified timeout

I am trying to use Task.WhenAll(tasks).Wait(timeout) to wait for tasks to complete and after that process task results.
Consider this example:
var tasks = new List<Task<Foo>>();
tasks.Add(Task.Run(() => GetData1()));
tasks.Add(Task.Run(() => GetData2()));
Task.WhenAll(tasks).Wait(TimeSpan.FromSeconds(5));
var completedTasks = tasks
.Where(t => t.Status == TaskStatus.RanToCompletion)
.Select(t => t.Result)
.ToList();
// Process completed tasks
// ...
private Foo GetData1()
{
Thread.Sleep(TimeSpan.FromSeconds(4));
return new Foo();
}
private Foo GetData2()
{
Thread.Sleep(TimeSpan.FromSeconds(10));
// How can I get the result of this task once it completes?
return new Foo();
}
It is possible that one of these tasks will not complete their execution within 5 second timeout.
Is it possible to somehow process results of the tasks that have completed after specified timeout? Maybe I am not using right approach in this situation?
EDIT:
I am trying to get all task results that managed to complete within specified timeout. There could be the following outcomes after Task.WhenAll(tasks).Wait(TimeSpan.FromSeconds(5)):
First task completes within 5 seconds.
Second task completes within 5 seconds.
Both tasks complete within 5 seconds.
None of the tasks complete within 5 seconds. Is it possible to get task results that haven't completed within 5 seconds, but have completed later, lets say, after 10 seconds?
In the end with help of the user who removed his answer, I ended up with this solution:
private const int TimeoutInSeconds = 5;
private static void Main(string[] args)
{
var tasks = new List<Task>()
{
Task.Run( async() => await Task.Delay(30)),
Task.Run( async() => await Task.Delay(300)),
Task.Run( async() => await Task.Delay(6000)),
Task.Run( async() => await Task.Delay(8000))
};
Task.WhenAll(tasks).Wait(TimeSpan.FromSeconds(TimeoutInSeconds));
var completedTasks = tasks
.Where(t => t.Status == TaskStatus.RanToCompletion).ToList();
var incompleteTasks = tasks
.Where(t => t.Status != TaskStatus.RanToCompletion).ToList();
Task.WhenAll(incompleteTasks)
.ContinueWith(t => { ProcessDelayedTasks(incompleteTasks); });
ProcessCompletedTasks(completedTasks);
Console.ReadKey();
}
private static void ProcessCompletedTasks(IEnumerable<Task> delayedTasks)
{
Console.WriteLine("Processing completed tasks...");
}
private static void ProcessDelayedTasks(IEnumerable<Task> delayedTasks)
{
Console.WriteLine("Processing delayed tasks...");
}
Instead of Waitall, you probably just want to do some sort of Spin/sleep of 5 seconds and then query the list as you are above.
You should then be able to enumerate again after a few more seconds to see what else has finished.
If performance is a concern, you may want to have additional 'wrapping' to see if All tasks have completed before 5 seconds.
I think there's a possible loss of task items between
var completedTasks = tasks.Where(t => t.Status == TaskStatus.RanToCompletion).ToList();
and
var incompleteTasks = tasks.Where(t => t.Status != TaskStatus.RanToCompletion).ToList();
because some tasks may ran to completition during this time.
As a workaround (not correct though) you coud swap these lines. In this case some tasks may present in each (completedTasks and incompleteTasks) list. But maybe it's better than to be lost completely.
A unit test to compare number of started tasks and number of tasks in completedTasks and incompleteTasks lists may also be useful.

Categories

Resources