Unit tests failing with Observable.FromAsync and Observable.Switch - c#

I'm having troubles testing a class that makes use of Observable.FromAsync<T>() and Observable.Switch<T>(). What it does is to wait for a trigger observable to produce a value, then it starts an async operation, and finally recollects all operations' results in a single output sequence. The gist of it is something like:
var outputStream = triggerStream
.Select(_ => Observable
.FromAsync(token => taskProducer.DoSomethingAsync(token)))
.Switch();
I put up some sanity check tests with the bare minimum parts to understand what's going on, here's the test with results in comments:
class test_with_rx : nspec
{
void Given_async_task_and_switch()
{
Subject<Unit> triggerStream = null;
TaskCompletionSource<long> taskDriver = null;
ITestableObserver<long> testObserver = null;
IDisposable subscription = null;
before = () =>
{
TestScheduler scheduler = new TestScheduler();
testObserver = scheduler.CreateObserver<long>();
triggerStream = new Subject<Unit>();
taskDriver = new TaskCompletionSource<long>();
// build stream under test
IObservable<long> streamUnderTest = triggerStream
.Select(_ => Observable
.FromAsync(token => taskDriver.Task))
.Switch();
/* Also tried with this Switch() overload
IObservable<long> streamUnderTest = triggerStream
.Select(_ => taskDriver.Task)
.Switch(); */
subscription = streamUnderTest.Subscribe(testObserver);
};
context["Before trigger"] = () =>
{
it["Should not notify"] = () => testObserver.Messages.Count.Should().Be(0);
// PASSED
};
context["After trigger"] = () =>
{
before = () => triggerStream.OnNext(Unit.Default);
context["When task completes"] = () =>
{
long result = -1;
before = () =>
{
taskDriver.SetResult(result);
//taskDriver.Task.Wait(); // tried with this too
};
it["Should notify once"] = () => testObserver.Messages.Count.Should().Be(1);
// FAILED: expected 1, actual 0
it["Should notify task result"] = () => testObserver.Messages[0].Value.Value.Should().Be(result);
// FAILED: of course, index out of bound
};
};
after = () =>
{
taskDriver.TrySetCanceled();
taskDriver.Task.Dispose();
subscription.Dispose();
};
}
}
In other tests I've done with mocks too, I can see that the Func passed to FromAsync is actually invoked (e.g. taskProducer.DoSomethingAsync(token)), but then it looks like nothing more follows, and the output stream doesn't produce the value.
I also tried inserting some Task.Delay(x).Wait(), or some taskDriver.Task.Wait() before hitting expectations, but with no luck.
I read this SO thread and I'm aware of schedulers, but at a first look I thought I didn't need them, no ObserveOn() is being used. Was I wrong? What am I missing? TA
Just for completeness, testing framework is NSpec, assertion library is FluentAssertions.

What you're hitting is a case of testing Rx and TPL together.
An exhaustive explanation can be found here but I'll try to give advice for your particular code.
Basically your code is working fine, but your test is not.
Observable.FromAsync will transform into a ContinueWith on the provided task, which will be executed on the taskpool, hence asynchronously.
Many ways to fix your test: (from ugly to complex)
Sleep after result set (note wait doesn't work because Wait doesn't wait for continuations)
taskDriver.SetResult(result);
Thread.Sleep(50);
Set the result before executing FromAsync (because FromAsync will return an immediate IObservable if the task is finished, aka will skip ContinueWith)
taskDriver.SetResult(result);
triggerStream.OnNext(Unit.Default);
Replace FromAsync by a testable alternative, e.g
public static IObservable<T> ToObservable<T>(Task<T> task, TaskScheduler scheduler)
{
if (task.IsCompleted)
{
return task.ToObservable();
}
else
{
AsyncSubject<T> asyncSubject = new AsyncSubject<T>();
task.ContinueWith(t => task.ToObservable().Subscribe(asyncSubject), scheduler);
return asyncSubject.AsObservable<T>();
}
}
(using either a synchronous TaskScheduler, or a testable one)

Related

Parallelize tasks using polly

Let's say I have a list of objects myObjs of type List<Foo>.
I have a polly policy :
var policy = Policy.Handle<Exception>().RetryForever();
which i want to run methods on in parralel, but keep retrying each as they fail.
for (int i = 0; i < myObjs.Count; i++)
{
var obj = myObjs[i];
policy.Execute(() => Task.Factory.StartNew(() => obj.Do(), TaskCreationOptions.LongRunning));
}
Will this be called in parallel and retry each obj? So if myObjs[5].do() fails, will only that get retried while other objects just get executed once?
Also, am I supposed to use the ExecuteAsync() method which accepts Func<Task> instead of the Execute(Action) one as shown in the example? Do() is just a synchronous method, being launched in a separate thread.
Actual code looks like this where each() is just a foreach wrapper()
_consumers.ForEach(c => policy.Execute(() => Task.Factory.StartNew(() => c.Consume(startFromBeg), TaskCreationOptions.LongRunning)));
EDIT:
I tried the code:
class Foo
{
private int _i;
public Foo(int i)
{
_i = i;
}
public void Do()
{
//var rnd = new Random();
if (_i==2)
{
Console.WriteLine("err"+_i);
throw new Exception();
}
Console.WriteLine(_i);
}
}
var policy = Policy.Handle<Exception>().Retry(3);
var foos=Enumerable.Range(0, 5).Select(x => new Foo(x)).ToList();
foos.ForEach(c => policy.Execute(() => Task.Factory.StartNew(() => c.Do(), TaskCreationOptions.LongRunning)));
but am getting result:
0 1 err2 3 4 5
I thought it would retry 2 a few more times, but doesn't. Any idea why?
Whatever owns the tasks must wait for them somehow. Otherwise, exceptions will be ignored and the code will end before the tasks actually complete. So yes, you should probably be using policy.ExecuteAsync() instead. It would look something like this:
var tasks = myObjs
.Select(obj => Task.Factory.StartNew(() => obj.Do(), TaskCreationOptions.LongRunning))
.ToList();
// sometime later
await Task.WhenAll(tasks);

Ensuring completion of async OnNext code before process terminates

The unit test below will never print "Async 3" because the test finishes first. How can I ensure it runs to completion? The best I could come up with was an arbitrary Task.Delay at the end or WriteAsync().Result, neither are ideal.
public async Task TestMethod1() // eg. webjob
{
TestContext.WriteLine("Starting test...");
var observable = Observable.Create<int>(async ob =>
{
ob.OnNext(1);
await Task.Delay(1000); // Fake async REST api call
ob.OnNext(2);
await Task.Delay(1000);
ob.OnNext(3);
ob.OnCompleted();
});
observable.Subscribe(i => TestContext.WriteLine($"Sync {i}"));
observable.SelectMany(i => WriteAsync(i).ToObservable()).Subscribe();
await observable;
TestContext.WriteLine("Complete.");
}
public async Task WriteAsync(int value) // Fake async DB call
{
await Task.Delay(1000);
TestContext.WriteLine($"Async {value}");
}
Edit
I realise that mentioning unit tests was probably misleading.This isn't a testing question. The code is a simulation of a real issue of a process running in an Azure WebJob, where both the producer and consumer need to call some Async IO. The issue is that the webjob runs to completion before the consumer has really finished. This is because I can't figure out how to properly await anything from the consumer side. Maybe this just isn't possible with RX...
EDIT:
You're basically looking for a blocking operator. The old blocking operators (like ForEach) were deprecated in favor of async versions. You want to await the last item like so:
public async Task TestMethod1()
{
TestContext.WriteLine("Starting test...");
var observable = Observable.Create<int>(async ob =>
{
ob.OnNext(1);
await Task.Delay(1000);
ob.OnNext(2);
await Task.Delay(1000);
ob.OnNext(3);
ob.OnCompleted();
});
observable.Subscribe(i => TestContext.WriteLine($"Sync {i}"));
var selectManyObservable = observable.SelectMany(i => WriteAsync(i).ToObservable()).Publish().RefCount();
selectManyObservable.Subscribe();
await selectManyObservable.LastOrDefaultAsync();
TestContext.WriteLine("Complete.");
}
While that will solve your immediate problem, it looks like you're going to keep running into issues because of the below (and I added two more). Rx is very powerful when used right, and confusing as hell when not.
Old answer:
A couple things:
Mixing async/await and Rx generally results in getting the pitfalls of both and the benefits of neither.
Rx has robust testing functionality. You're not using it.
Side-Effects, like a WriteLine are best performed exclusively in a subscribe, and not in an operator like SelectMany.
You may want to brush up on cold vs hot observables.
The reason it isn't running to completion is because of your test runner. Your test runner is terminating the test at the conclusion of TestMethod1. The Rx subscription would live on otherwise. When I run your code in Linqpad, I get the following output:
Starting test...
Sync 1
Sync 2
Async 1
Sync 3
Async 2
Complete.
Async 3
...which is what I'm assuming you want to see, except you probably want the Complete after the Async 3.
Using Rx only, your code would look something like this:
public void TestMethod1()
{
TestContext.WriteLine("Starting test...");
var observable = Observable.Concat<int>(
Observable.Return(1),
Observable.Empty<int>().Delay(TimeSpan.FromSeconds(1)),
Observable.Return(2),
Observable.Empty<int>().Delay(TimeSpan.FromSeconds(1)),
Observable.Return(3)
);
var syncOutput = observable
.Select(i => $"Sync {i}");
syncOutput.Subscribe(s => TestContext.WriteLine(s));
var asyncOutput = observable
.SelectMany(i => WriteAsync(i, scheduler));
asyncOutput.Subscribe(s => TestContext.WriteLine(s), () => TestContext.WriteLine("Complete."));
}
public IObservable<string> WriteAsync(int value, IScheduler scheduler)
{
return Observable.Return(value)
.Delay(TimeSpan.FromSeconds(1), scheduler)
.Select(i => $"Async {value}");
}
public static class TestContext
{
public static void WriteLine(string s)
{
Console.WriteLine(s);
}
}
This still isn't taking advantage of Rx's testing functionality. That would look like this:
public void TestMethod1()
{
var scheduler = new TestScheduler();
TestContext.WriteLine("Starting test...");
var observable = Observable.Concat<int>(
Observable.Return(1),
Observable.Empty<int>().Delay(TimeSpan.FromSeconds(1), scheduler),
Observable.Return(2),
Observable.Empty<int>().Delay(TimeSpan.FromSeconds(1), scheduler),
Observable.Return(3)
);
var syncOutput = observable
.Select(i => $"Sync {i}");
syncOutput.Subscribe(s => TestContext.WriteLine(s));
var asyncOutput = observable
.SelectMany(i => WriteAsync(i, scheduler));
asyncOutput.Subscribe(s => TestContext.WriteLine(s), () => TestContext.WriteLine("Complete."));
var asyncExpected = scheduler.CreateColdObservable<string>(
ReactiveTest.OnNext(1000.Ms(), "Async 1"),
ReactiveTest.OnNext(2000.Ms(), "Async 2"),
ReactiveTest.OnNext(3000.Ms(), "Async 3"),
ReactiveTest.OnCompleted<string>(3000.Ms() + 1) //+1 because you can't have two notifications on same tick
);
var syncExpected = scheduler.CreateColdObservable<string>(
ReactiveTest.OnNext(0000.Ms(), "Sync 1"),
ReactiveTest.OnNext(1000.Ms(), "Sync 2"),
ReactiveTest.OnNext(2000.Ms(), "Sync 3"),
ReactiveTest.OnCompleted<string>(2000.Ms()) //why no +1 here?
);
var asyncObserver = scheduler.CreateObserver<string>();
asyncOutput.Subscribe(asyncObserver);
var syncObserver = scheduler.CreateObserver<string>();
syncOutput.Subscribe(syncObserver);
scheduler.Start();
ReactiveAssert.AreElementsEqual(
asyncExpected.Messages,
asyncObserver.Messages);
ReactiveAssert.AreElementsEqual(
syncExpected.Messages,
syncObserver.Messages);
}
public static class MyExtensions
{
public static long Ms(this int ms)
{
return TimeSpan.FromMilliseconds(ms).Ticks;
}
}
...So unlike your Task tests, you don't have to wait. The test executes instantly. You can bump up the Delay times to minutes or hours, and the TestScheduler will essentially mock the time for you. And then your test runner will probably be happy.
Well, you can use Observable.ForEach to block until an IObservable has terminated:
observable.ForEach(unusedValue => { });
Can you make TestMethod1 a normal, non-async method, and then replace await observable; with this?

Rx and tasks - cancel running task when new task is spawned?

I have an user interaction scenario I'd like to handle with Rx.
The scenario is similar to the canonical "when user stops typing, do some work" (usually, search for what the user has typed so far) (1) - but I also need to :
(2) only get the latest of the results of "do some work" units (see below)
(3) when a new unit of work starts, cancel any work in progress (in my case it's CPU intensive)
For (1) I use an IObservable for the user events, throttled with .Throttle() to only trigger on pauses between events ("user stops typing").
From that, i .Select(_ => CreateMyTask(...).ToObservable()).
This gives me an IObservable<IObservable<T>> where each of the inner observables wraps a single task.
To get (2) I finally apply .Switch() to only get the results from the newest unit of work.
What about (3) - cancel pending tasks ?
If I understand correctly, whenever there's a new inner IObservable<T>, the .Switch() method subscribes to it and unsubscribes from the previous one(s), causing them to Dispose().
Maybe that can be somehow wired to trigger the task to cancel?
You can just use Observable.FromAsync which will generate tokens that are cancelled when the observer unsubcribes:
input.Throttle(...)
.Select(_ => Observable.FromAsync(token => CreateMyTask(..., token)))
.Switch()
.Subscribe(...);
This will generate a new token for each unit of work and cancel it every time Switch switches to the new one.
Do you have to work with Tasks?
If you're happy to work purely with Observables then you can do this nicely yourself.
Try doing something like this:
var query =
Observable.Create<int>(o =>
{
var cancelling = false;
var cancel = Disposable.Create(() =>
{
cancelling = true;
});
var subscription = Observable.Start(() =>
{
for (var i = 0; i < 100; i++)
{
Thread.Sleep(10); //1000 ms in total
if (cancelling)
{
Console.WriteLine("Cancelled on {0}", i);
return -1;
}
}
Console.WriteLine("Done");
return 42;
}).Subscribe(o);
return new CompositeDisposable(cancel, subscription);
});
This observable is doing some hard work in the for loop with the Thread.Sleep(10);, but when the observable is disposed the loop is exited and the intensive CPU work ceases. Then you can use the standard Rx Dispose with the Switch to cancel the in progress work.
If you'd like that bundled up in a method, then try this:
public static IObservable<T> Start<T>(Func<Func<bool>, T> work)
{
return Observable.Create<T>(o =>
{
var cancelling = false;
var cancel = Disposable
.Create(() => cancelling = true);
var subscription = Observable
.Start(() => work(() => cancelling))
.Subscribe(o);
return new CompositeDisposable(cancel, subscription);
});
}
And then call it with a function like this:
Func<Func<bool>, int> work = cancelling =>
{
for (var i = 0; i < 100; i++)
{
Thread.Sleep(10); //1000 ms in total
if (cancelling())
{
Console.WriteLine("Cancelled on {0}", i);
return -1;
}
}
Console.WriteLine("Done");
return 42;
};
Here's my code that proved this worked:
var disposable =
ObservableEx
.Start(work)
.Subscribe(x => Console.WriteLine(x));
Thread.Sleep(500);
disposable.Dispose();
I got "Cancelled on 50" (sometime "Cancelled on 51") as my output.

Cold Tasks and TaskExtensions.Unwrap

I've got a caching class that uses cold (unstarted) tasks to avoid running the expensive thing multiple times.
public class AsyncConcurrentDictionary<TKey, TValue> : System.Collections.Concurrent.ConcurrentDictionary<TKey, Task<TValue>>
{
internal Task<TValue> GetOrAddAsync(TKey key, Task<TValue> newTask)
{
var cachedTask = base.GetOrAdd(key, newTask);
if (cachedTask == newTask && cachedTask.Status == TaskStatus.Created) // We won! our task is now the cached task, so run it
cachedTask.Start();
return cachedTask;
}
}
This works great right up until your task is actually implemented using C#5's await, ala
cache.GetOrAddAsync("key", new Task(async () => {
var r = await AsyncOperation();
return r.FastSynchronousTransform();
}));)`
Now it looks like TaskExtensions.Unwrap() does exactly what I need by turning Task<Task<T>> into a Task<T>, but it seems that wrapper it returns doesn't actually support Start() - it throws an exception.
TaskCompletionSource (my go to for slightly special Task needs) doesn't seem to have any facilities for this sort of thing either.
Is there an alternative to TaskExtensions.Unwrap() that supports "cold tasks"?
All you need to do is to keep the Task before unwrapping it around and start that:
public Task<TValue> GetOrAddAsync(TKey key, Func<Task<TValue>> taskFunc)
{
Task<Task<TValue>> wrappedTask = new Task<Task<TValue>>(taskFunc);
Task<TValue> unwrappedTask = wrappedTask.Unwrap();
Task<TValue> cachedTask = base.GetOrAdd(key, unwrappedTask);
if (cachedTask == unwrappedTask)
wrappedTask.Start();
return cachedTask;
}
Usage:
cache.GetOrAddAsync(
"key", async () =>
{
var r = await AsyncOperation();
return r.FastSynchronousTransform();
});

Speculative execution using the TPL

I have a List<Task<bool>> that I want to enumerate in parallel finding the first task to complete with a result of true and not waiting for or observe exceptions on any of the other tasks still pending.
var tasks = new List<Task<bool>>
{
Task.Delay(2000).ContinueWith(x => false),
Task.Delay(0).ContinueWith(x => true),
};
I have tried to use PLINQ to do something like:
var task = tasks.AsParallel().FirstOrDefault(t => t.Result);
Which executes in parallel, but doesn't return as soon as it finds a satisfying result. because accessing the Result property is blocking. In order for this to work using PLINQ, I'd have to write this aweful statement:
var cts = new CancellationTokenSource();
var task = tasks.AsParallel()
.FirstOrDefault(t =>
{
try
{
t.Wait(cts.Token);
if (t.Result)
{
cts.Cancel();
}
return t.Result;
}
catch (OperationCanceledException)
{
return false;
}
} );
I've written up an extension method that yields tasks as they complete like so.
public static class Exts
{
public static IEnumerable<Task<T>> InCompletionOrder<T>(this IEnumerable<Task<T>> source)
{
var tasks = source.ToList();
while (tasks.Any())
{
var t = Task.WhenAny(tasks);
yield return t.Result;
tasks.Remove(t.Result);
}
}
}
// and run like so
var task = tasks.InCompletionOrder().FirstOrDefault(t => t.Result);
But it feels like this is something common enough that there is a better way. Suggestions?
Maybe something like this?
var tcs = new TaskCompletionSource<Task<bool>>();
foreach (var task in tasks)
{
task.ContinueWith((t, state) =>
{
if (t.Result)
{
((TaskCompletionSource<Task<bool>>)state).TrySetResult(t);
}
},
tcs,
TaskContinuationOptions.OnlyOnRanToCompletion |
TaskContinuationOptions.ExecuteSynchronously);
}
var firstTaskToComplete = tcs.Task;
Perhaps you could try the Rx.Net library. Its very good for in effect providing Linq to Work.
Try this snippet in LinqPad after you reference the Microsoft Rx.Net assemblies.
using System
using System.Linq
using System.Reactive.Concurrency
using System.Reactive.Linq
using System.Reactive.Threading.Tasks
using System.Threading.Tasks
void Main()
{
var tasks = new List<Task<bool>>
{
Task.Delay(2000).ContinueWith(x => false),
Task.Delay(0).ContinueWith(x => true),
};
var observable = (from t in tasks.ToObservable()
//Convert task to an observable
let o = t.ToObservable()
//SelectMany
from x in o
select x);
var foo = observable
.SubscribeOn(Scheduler.Default) //Run the tasks on the threadpool
.ToList()
.First();
Console.WriteLine(foo);
}
First, I don't understand why are you trying to use PLINQ here. Enumerating a list of Tasks shouldn't take long, so I don't think you're going to gain anything from parallelizing it.
Now, to get the first Task that already completed with true, you can use the (non-blocking) IsCompleted property:
var task = tasks.FirstOrDefault(t => t.IsCompleted && t.Result);
If you wanted to get a collection of Tasks, ordered by their completion, have a look at Stephen Toub's article Processing tasks as they complete. If you want to list those that return true first, you would need to modify that code. If you don't want to modify it, you can use a version of this approach from Stephen Cleary's AsyncEx library.
Also, in the specific case in your question, you could “fix” your code by adding .WithMergeOptions(ParallelMergeOptions.NotBuffered) to the PLINQ query. But doing so still wouldn't work most of the time and can waste threads a lot even when it does. That's because PLINQ uses a constant number of threads and partitioning and using Result would block those threads most of the time.

Categories

Resources