Enforcing one async observable at a time

I'm trying to integrate some TPL async into a larger Rx chain using Observable.FromAsync, like in this small example:
using System;
using System.Reactive.Linq;
using System.Threading.Tasks;
namespace rxtest
{
class Program
{
static void Main(string[] args)
{
MainAsync().Wait();
}
static async Task MainAsync()
{
await Observable.Generate(new Random(), x => true,
x => x, x => x.Next(250, 500))
.SelectMany((x, idx) => Observable.FromAsync(async ct =>
{
Console.WriteLine("start: " + idx.ToString());
await Task.Delay(x, ct);
Console.WriteLine("finish: " + idx.ToString());
return idx;
}))
.Take(10)
.LastOrDefaultAsync();
}
}
}
However, I've noticed that this starts all the async tasks concurrently rather than running them one at a time, which causes the app's memory usage to balloon. The SelectMany appears to behave no differently from a Merge.
Here, I see output like this:
start: 0
start: 1
start: 2
...
I'd like to see:
start: 0
finish: 0
start: 1
finish: 1
start: 2
finish: 2
...
How can I achieve this?

Change the SelectMany to a Select with a Concat:
static async Task MainAsync()
{
await Observable.Generate(new Random(), x => true,
x => x, x => x.Next(250, 500))
.Take(10)
.Select((x, idx) => Observable.FromAsync(async ct =>
{
Console.WriteLine("start: " + idx.ToString());
await Task.Delay(x, ct);
Console.WriteLine("finish: " + idx.ToString());
return idx;
}))
.Concat()
.LastOrDefaultAsync();
}
EDIT: I moved the Take(10) up the chain because Generate doesn't block; taking the first 10 items before the Select stops the source from running away.
The Select projects each event into a stream representing an async task that will start on Subscription. Concat accepts a stream of streams and subscribes to each successive sub-stream when the previous has completed, concatenating all the streams into a single flat stream.
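The difference between the two operators can be made visible with a minimal sketch, assuming the System.Reactive package is referenced. ConcatDemo and RunAsync are names made up for illustration, and the sketch records the start/finish events in a list instead of printing them so the ordering can be inspected.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reactive.Linq;
using System.Threading.Tasks;

public static class ConcatDemo
{
    // Runs three fake async jobs and returns the order of their start/finish events.
    // sequential: true uses Select + Concat, false uses SelectMany.
    public static async Task<List<string>> RunAsync(bool sequential)
    {
        var log = new List<string>();
        void Record(string entry) { lock (log) { log.Add(entry); } }

        // Each job is cold: nothing runs until something subscribes to it.
        IObservable<int> Job(int idx) => Observable.FromAsync(async ct =>
        {
            Record("start: " + idx);
            await Task.Delay(100, ct);
            Record("finish: " + idx);
            return idx;
        });

        var source = Observable.Range(0, 3);
        IObservable<int> results = sequential
            ? source.Select(i => Job(i)).Concat()   // subscribes to one job at a time
            : source.SelectMany(i => Job(i));       // subscribes to every job as it arrives

        await results.LastOrDefaultAsync();
        return log;
    }
}
```

With sequential set to true, each job's finish entry is recorded before the next job's start entry. With false, the start entries are not separated by finish entries, which matches the behaviour described in the question.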

Related

Observable.FromAsync what is IScheduler argument for?

I was trying to execute this Rx + ReactiveUI code:
Observable.FromAsync(
async () => { /* async code that must run on the UI thread */ },
RxApp.MainThreadScheduler);
But the async code doesn't get executed on the main UI thread.
Can anyone tell me what the point of the IScheduler parameter is, since the code doesn't seem to run on the UI thread?

Try running this code:
void Main()
{
Console.WriteLine($"M {System.Threading.Thread.CurrentThread.ManagedThreadId}");
using (var els = new EventLoopScheduler())
{
els.Schedule(() => Console.WriteLine($"ELS {System.Threading.Thread.CurrentThread.ManagedThreadId}"));
Observable.FromAsync(() => GetNumberAsync(1)).Subscribe(x => Console.WriteLine($"S1 {System.Threading.Thread.CurrentThread.ManagedThreadId}"));
Observable.FromAsync(() => GetNumberAsync(2), els).Subscribe(x => Console.WriteLine($"S2 {System.Threading.Thread.CurrentThread.ManagedThreadId}"));
Console.ReadLine();
}
}
Task<int> GetNumberAsync(int run)
{
Console.WriteLine($"{run} GNA {System.Threading.Thread.CurrentThread.ManagedThreadId}");
return Task.Run(() =>
{
Console.WriteLine($"{run} TR {System.Threading.Thread.CurrentThread.ManagedThreadId}");
return 42;
});
}
I get an output like this:
M 1
ELS 20
1 GNA 1
2 GNA 1
1 TR 12
2 TR 13
S1 11
S2 20
You'll note that the EventLoopScheduler creates a thread with ID 20, and that the continuation out of the FromAsync that was given els runs on that thread.
The scheduler does not determine where the task itself runs; it determines where the task's result is delivered to the next operator in the pipeline.

So I came up with this solution:
Observable
...
.ObserveOnDispatcher()
.Select(_ => InvokingReturningTaskMethod().ToObservable())
.Concat()
.Subscribe()
That seems to be running on the Dispatcher thread.
I wanted to find a better solution, though: something that doesn't require calling
InvokingReturningTaskMethod().ToObservable()

Reactive Extensions Observable.FromAsync: How to wait until async operation is finished

In my application I'm getting events from a message bus (RabbitMQ, but that doesn't really matter). Processing these events takes rather long, and the events are produced in bursts. Only one event should be processed at a time. To achieve this I'm using Rx to serialize the events and process them asynchronously so that the producer isn't blocked.
Here is the (simplified) sample code:
static void Main(string[] args)
{
var input = Observable.Interval(TimeSpan.FromMilliseconds(500));
var subscription = input.Select(i => Observable.FromAsync(ct => Process(ct, i)))
.Concat()
.Subscribe();
Console.ReadLine();
subscription.Dispose();
// Wait until finished?
Console.WriteLine("Completed");
}
private static async Task Process(CancellationToken cts, long i)
{
Console.WriteLine($"Processing {i} ...");
await Task.Delay(1000).ConfigureAwait(false);
Console.WriteLine($"Finished processing {i}");
}
The sample application disposes the subscription and then terminates. But in this sample, the application terminates while the last event received before the subscription was disposed is still being processed.
Question: What is the best way to wait until the last event has been processed after the subscription is disposed? I'm still trying to wrap my head around the async Rx stuff. I assume there is a rather easy way to do this that I'm just not seeing right now.

If you are OK with a quick and easy solution that is applicable to simple cases like this, then you could just call Wait on the IObservable instead of subscribing to it. If you want to receive subscription-like notifications, use the Do operator just before the final Wait:
static void Main(string[] args)
{
var input = Observable.Interval(TimeSpan.FromMilliseconds(500));
input.Select(i => Observable.FromAsync(ct => Process(ct, i)))
.Concat()
.Do(onNext: x => { }, onError: ex => { }, onCompleted: () => { })
.Wait();
Console.WriteLine("Completed");
}

Thanks to the suggestion from Theodor I now found a solution that works for me:
static async Task Main(string[] args)
{
var inputSequence = Observable.Interval(TimeSpan.FromMilliseconds(500));
var terminate = new Subject<Unit>();
var task = Execute(inputSequence, terminate);
Console.ReadLine();
terminate.OnNext(Unit.Default);
await task.ConfigureAwait(false);
Console.WriteLine($"Completed");
Console.ReadLine();
}
private static async Task Process(CancellationToken cts, long i)
{
Console.WriteLine($"Processing {i} ...");
await Task.Delay(1000).ConfigureAwait(false);
Console.WriteLine($"Finished processing {i}");
}
private static async Task Execute(IObservable<long> input, IObservable<Unit> terminate)
{
await input
.TakeUntil(terminate)
.Select(i => Observable.FromAsync(ct => Process(ct, i)))
.Concat();
}

Ensuring completion of async OnNext code before process terminates

The unit test below will never print "Async 3" because the test finishes first. How can I ensure it runs to completion? The best I could come up with was an arbitrary Task.Delay at the end or WriteAsync().Result, neither of which is ideal.
public async Task TestMethod1() // eg. webjob
{
TestContext.WriteLine("Starting test...");
var observable = Observable.Create<int>(async ob =>
{
ob.OnNext(1);
await Task.Delay(1000); // Fake async REST api call
ob.OnNext(2);
await Task.Delay(1000);
ob.OnNext(3);
ob.OnCompleted();
});
observable.Subscribe(i => TestContext.WriteLine($"Sync {i}"));
observable.SelectMany(i => WriteAsync(i).ToObservable()).Subscribe();
await observable;
TestContext.WriteLine("Complete.");
}
public async Task WriteAsync(int value) // Fake async DB call
{
await Task.Delay(1000);
TestContext.WriteLine($"Async {value}");
}
Edit
I realise that mentioning unit tests was probably misleading. This isn't a testing question. The code is a simulation of a real issue with a process running in an Azure WebJob, where both the producer and the consumer need to call some async IO. The issue is that the WebJob runs to completion before the consumer has really finished, because I can't figure out how to properly await anything from the consumer side. Maybe this just isn't possible with Rx...

EDIT:
You're basically looking for a blocking operator. The old blocking operators (like ForEach) were deprecated in favor of async versions. You want to await the last item like so:
public async Task TestMethod1()
{
TestContext.WriteLine("Starting test...");
var observable = Observable.Create<int>(async ob =>
{
ob.OnNext(1);
await Task.Delay(1000);
ob.OnNext(2);
await Task.Delay(1000);
ob.OnNext(3);
ob.OnCompleted();
});
observable.Subscribe(i => TestContext.WriteLine($"Sync {i}"));
var selectManyObservable = observable.SelectMany(i => WriteAsync(i).ToObservable()).Publish().RefCount();
selectManyObservable.Subscribe();
await selectManyObservable.LastOrDefaultAsync();
TestContext.WriteLine("Complete.");
}
While that will solve your immediate problem, it looks like you're going to keep running into issues because of the below (and I added two more). Rx is very powerful when used right, and confusing as hell when not.
Old answer:
A couple things:
Mixing async/await and Rx generally results in getting the pitfalls of both and the benefits of neither.
Rx has robust testing functionality. You're not using it.
Side effects, like a WriteLine, are best performed exclusively in a Subscribe, and not in an operator like SelectMany.
You may want to brush up on cold vs hot observables.
The reason it isn't running to completion is because of your test runner. Your test runner is terminating the test at the conclusion of TestMethod1. The Rx subscription would live on otherwise. When I run your code in Linqpad, I get the following output:
Starting test...
Sync 1
Sync 2
Async 1
Sync 3
Async 2
Complete.
Async 3
...which is what I'm assuming you want to see, except you probably want the Complete after the Async 3.
Using Rx only, your code would look something like this:
public void TestMethod1()
{
TestContext.WriteLine("Starting test...");
var observable = Observable.Concat<int>(
Observable.Return(1),
Observable.Empty<int>().Delay(TimeSpan.FromSeconds(1)),
Observable.Return(2),
Observable.Empty<int>().Delay(TimeSpan.FromSeconds(1)),
Observable.Return(3)
);
var syncOutput = observable
.Select(i => $"Sync {i}");
syncOutput.Subscribe(s => TestContext.WriteLine(s));
var asyncOutput = observable
.SelectMany(i => WriteAsync(i, Scheduler.Default)); // Scheduler.Default lives in System.Reactive.Concurrency
asyncOutput.Subscribe(s => TestContext.WriteLine(s), () => TestContext.WriteLine("Complete."));
}
public IObservable<string> WriteAsync(int value, IScheduler scheduler)
{
return Observable.Return(value)
.Delay(TimeSpan.FromSeconds(1), scheduler)
.Select(i => $"Async {value}");
}
public static class TestContext
{
public static void WriteLine(string s)
{
Console.WriteLine(s);
}
}
This still isn't taking advantage of Rx's testing functionality. That would look like this:
public void TestMethod1()
{
var scheduler = new TestScheduler();
TestContext.WriteLine("Starting test...");
var observable = Observable.Concat<int>(
Observable.Return(1),
Observable.Empty<int>().Delay(TimeSpan.FromSeconds(1), scheduler),
Observable.Return(2),
Observable.Empty<int>().Delay(TimeSpan.FromSeconds(1), scheduler),
Observable.Return(3)
);
var syncOutput = observable
.Select(i => $"Sync {i}");
syncOutput.Subscribe(s => TestContext.WriteLine(s));
var asyncOutput = observable
.SelectMany(i => WriteAsync(i, scheduler));
asyncOutput.Subscribe(s => TestContext.WriteLine(s), () => TestContext.WriteLine("Complete."));
var asyncExpected = scheduler.CreateColdObservable<string>(
ReactiveTest.OnNext(1000.Ms(), "Async 1"),
ReactiveTest.OnNext(2000.Ms(), "Async 2"),
ReactiveTest.OnNext(3000.Ms(), "Async 3"),
ReactiveTest.OnCompleted<string>(3000.Ms() + 1) //+1 because you can't have two notifications on same tick
);
var syncExpected = scheduler.CreateColdObservable<string>(
ReactiveTest.OnNext(0000.Ms(), "Sync 1"),
ReactiveTest.OnNext(1000.Ms(), "Sync 2"),
ReactiveTest.OnNext(2000.Ms(), "Sync 3"),
ReactiveTest.OnCompleted<string>(2000.Ms()) //why no +1 here?
);
var asyncObserver = scheduler.CreateObserver<string>();
asyncOutput.Subscribe(asyncObserver);
var syncObserver = scheduler.CreateObserver<string>();
syncOutput.Subscribe(syncObserver);
scheduler.Start();
ReactiveAssert.AreElementsEqual(
asyncExpected.Messages,
asyncObserver.Messages);
ReactiveAssert.AreElementsEqual(
syncExpected.Messages,
syncObserver.Messages);
}
public static class MyExtensions
{
public static long Ms(this int ms)
{
return TimeSpan.FromMilliseconds(ms).Ticks;
}
}
...So unlike your Task tests, you don't have to wait. The test executes instantly. You can bump up the Delay times to minutes or hours, and the TestScheduler will essentially mock the time for you. And then your test runner will probably be happy.

Well, you can use Observable.ForEach to block until an IObservable has terminated:
observable.ForEach(unusedValue => { });
Can you make TestMethod1 a normal, non-async method, and then replace await observable; with this?

Handle tasks which complete after Task.WhenAll().Wait() specified timeout

I am trying to use Task.WhenAll(tasks).Wait(timeout) to wait for tasks to complete and after that process task results.
Consider this example:
var tasks = new List<Task<Foo>>();
tasks.Add(Task.Run(() => GetData1()));
tasks.Add(Task.Run(() => GetData2()));
Task.WhenAll(tasks).Wait(TimeSpan.FromSeconds(5));
var completedTasks = tasks
.Where(t => t.Status == TaskStatus.RanToCompletion)
.Select(t => t.Result)
.ToList();
// Process completed tasks
// ...
private Foo GetData1()
{
Thread.Sleep(TimeSpan.FromSeconds(4));
return new Foo();
}
private Foo GetData2()
{
Thread.Sleep(TimeSpan.FromSeconds(10));
// How can I get the result of this task once it completes?
return new Foo();
}
It is possible that one of these tasks will not complete within the 5-second timeout.
Is it possible to somehow process results of the tasks that have completed after specified timeout? Maybe I am not using right approach in this situation?
EDIT:
I am trying to get all task results that managed to complete within specified timeout. There could be the following outcomes after Task.WhenAll(tasks).Wait(TimeSpan.FromSeconds(5)):
First task completes within 5 seconds.
Second task completes within 5 seconds.
Both tasks complete within 5 seconds.
None of the tasks complete within 5 seconds.
Is it possible to get the results of tasks that did not complete within 5 seconds but completed later, say after 10 seconds?

In the end with help of the user who removed his answer, I ended up with this solution:
private const int TimeoutInSeconds = 5;
private static void Main(string[] args)
{
var tasks = new List<Task>()
{
Task.Run( async() => await Task.Delay(30)),
Task.Run( async() => await Task.Delay(300)),
Task.Run( async() => await Task.Delay(6000)),
Task.Run( async() => await Task.Delay(8000))
};
Task.WhenAll(tasks).Wait(TimeSpan.FromSeconds(TimeoutInSeconds));
var completedTasks = tasks
.Where(t => t.Status == TaskStatus.RanToCompletion).ToList();
var incompleteTasks = tasks
.Where(t => t.Status != TaskStatus.RanToCompletion).ToList();
Task.WhenAll(incompleteTasks)
.ContinueWith(t => { ProcessDelayedTasks(incompleteTasks); });
ProcessCompletedTasks(completedTasks);
Console.ReadKey();
}
private static void ProcessCompletedTasks(IEnumerable<Task> completedTasks)
{
Console.WriteLine("Processing completed tasks...");
}
private static void ProcessDelayedTasks(IEnumerable<Task> delayedTasks)
{
Console.WriteLine("Processing delayed tasks...");
}

Instead of waiting on all the tasks, you probably just want to spin or sleep for 5 seconds and then query the list as you do above.
You should then be able to enumerate the list again a few seconds later to see what else has finished.
If performance is a concern, you may want some additional 'wrapping' that checks whether all tasks have completed before the 5 seconds are up.
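One way to sketch that wrapping is to race Task.WhenAll against Task.Delay, so the wait ends as soon as either every task has finished or the timeout has elapsed. TimeoutHelper and ResultsWithinAsync are names made up for illustration, and the sketch assumes that faulted or cancelled tasks should simply be skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class TimeoutHelper
{
    // Waits until all tasks finish or the timeout elapses, whichever comes first,
    // then returns the results of the tasks that have run to completion so far.
    public static async Task<List<T>> ResultsWithinAsync<T>(
        IEnumerable<Task<T>> tasks, TimeSpan timeout)
    {
        var list = tasks.ToList();
        Task all = Task.WhenAll(list);

        // WhenAny returns as soon as either task finishes; it does not throw
        // if one of the underlying tasks faulted.
        await Task.WhenAny(all, Task.Delay(timeout));

        return list
            .Where(t => t.Status == TaskStatus.RanToCompletion)
            .Select(t => t.Result) // does not block: the task has already completed
            .ToList();
    }
}
```

Tasks that are still running when the timeout elapses are not cancelled; they keep running, so a continuation can pick up their results later.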

I think there's a possible loss of task items between
var completedTasks = tasks.Where(t => t.Status == TaskStatus.RanToCompletion).ToList();
and
var incompleteTasks = tasks.Where(t => t.Status != TaskStatus.RanToCompletion).ToList();
because some tasks may run to completion during this time.
As a workaround (not a correct one, though) you could swap these lines. In that case some tasks may appear in both lists (completedTasks and incompleteTasks), but that may be better than losing them completely.
A unit test that compares the number of started tasks with the number of tasks in the completedTasks and incompleteTasks lists may also be useful.
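Another option, sketched below, is to read each task's status exactly once and build both lists in a single pass, so a task can be neither lost nor counted twice. TaskPartition and Split are hypothetical names; as in the code above, "pending" here means anything that has not run to completion, which includes faulted and cancelled tasks.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class TaskPartition
{
    // Reads each task's status exactly once, so every task lands in
    // exactly one of the two lists.
    public static (List<Task<T>> Completed, List<Task<T>> Pending) Split<T>(
        IEnumerable<Task<T>> tasks)
    {
        var completed = new List<Task<T>>();
        var pending = new List<Task<T>>();

        foreach (var task in tasks)
        {
            if (task.Status == TaskStatus.RanToCompletion)
            {
                completed.Add(task);
            }
            else
            {
                pending.Add(task);
            }
        }

        return (completed, pending);
    }
}
```

A task in the pending list may finish immediately after the snapshot is taken. That is harmless as long as the pending tasks are processed through a continuation, because a continuation attached to an already-completed task still runs.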

Parallel ForEach wait 500 ms before spawning

I have this situation:
var tasks = new List<ITask> ...
Parallel.ForEach(tasks, currentTask => currentTask.Execute() );
Is it possible to instruct PLinq to wait for 500ms before the next thread is spawned?
System.Threading.Thread.Sleep(5000);

You are using Parallel.ForEach incorrectly. You should write a special enumerator that rate-limits itself to fetching data once every 500 ms.
I made some assumptions about how your DTO works, since you didn't provide any details.
private IEnumerable<SomeResource> GetRateLimitedResource()
{
SomeResource someResource = null;
do
{
someResource = _remoteProvider.GetData();
if(someResource != null)
{
yield return someResource;
Thread.Sleep(500);
}
} while (someResource != null);
}
Here is how your Parallel.ForEach call should look then:
Parallel.ForEach(GetRateLimitedResource(), SomeFunctionToProcessSomeResource);

There are already some good suggestions. I would agree with others that you are using PLINQ in a manner it wasn't meant to be used.
My suggestion would be to use System.Threading.Timer. This is probably better than writing a method that returns an IEnumerable<> that forces a half second delay, because you may not need to wait the full half second, depending on how much time has passed since your last API call.
With the timer, it will invoke a delegate that you've provided it at the interval you specify, so even if the first task isn't done, a half second later it will invoke your delegate on another thread, so there won't be any extra waiting.
From your example code, it sounds like you have a list of tasks. In that case, I would use System.Collections.Concurrent.ConcurrentQueue to keep track of the tasks. Once the queue is empty, turn off the timer.
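A rough sketch of the timer-plus-queue idea follows. It is not a definitive implementation: TimedDispatcher and RunAsync are invented names, and the work items are plain Action delegates standing in for the question's ITask.Execute() calls.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class TimedDispatcher
{
    // Starts one queued action per timer tick. The returned task completes
    // once every action has finished running.
    public static Task RunAsync(IEnumerable<Action> work, TimeSpan interval)
    {
        var queue = new ConcurrentQueue<Action>(work);
        int remaining = queue.Count;
        var done = new TaskCompletionSource<bool>();
        if (remaining == 0)
        {
            done.SetResult(true);
            return done.Task;
        }

        Timer timer = null;
        // Created stopped so that 'timer' is assigned before the first callback runs.
        timer = new Timer(_ =>
        {
            if (!queue.TryDequeue(out var action))
            {
                timer.Dispose(); // queue drained: turn the timer off
                return;
            }

            try
            {
                action(); // runs on a thread-pool thread; may overlap with later ticks
            }
            catch (Exception ex)
            {
                done.TrySetException(ex); // surface the first failure to the caller
            }
            finally
            {
                if (Interlocked.Decrement(ref remaining) == 0)
                {
                    timer.Dispose();
                    done.TrySetResult(true);
                }
            }
        }, null, Timeout.Infinite, Timeout.Infinite);

        timer.Change(TimeSpan.Zero, interval); // first tick now, then one per interval
        return done.Task;
    }
}
```

Because each tick runs on its own thread-pool thread, a slow action does not delay the start of the next one, which is the property described above.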

You could use Enumerable.Aggregate instead.
var task = tasks.Aggregate((t1, t2) =>
t1.ContinueWith(_ =>
{ Thread.Sleep(500); return t2.Result; }));
If you don't want the tasks chained then there is also the overload to Select assuming the tasks are in order of delay.
var tasks = Enumerable
.Range(1, 10)
.Select(x => Task.Run(() => x * 2))
.Select((x, i) => Task.Delay(TimeSpan.FromMilliseconds(i * 500))
.ContinueWith(_ => x.Result))
.ToList(); // materialize the query so every delay starts now, not lazily during enumeration
foreach(var result in tasks.Select(x => x.Result))
{
Console.WriteLine(result);
}
Based on the comments, a better option would be to guard the resource instead of using a time delay.
static object Locker = new object();
static int GetResultFromResource(int arg)
{
lock(Locker)
{
Thread.Sleep(500);
return arg * 2;
}
}
var tasks = Enumerable
.Range(1, 10)
.Select(x => Task.Run(() => GetResultFromResource(x)));
foreach(var result in tasks.Select(x => x.Result))
{
Console.WriteLine(result);
}

In this case how about a Producer-Consumer pattern with a BlockingCollection<T>?
var tasks = new BlockingCollection<ITask>();
// add tasks, if this is an expensive process, put it out onto a Task
// tasks.Add(x);
// we're done producin' (allows GetConsumingEnumerable to finish)
tasks.CompleteAdding();
RunTasks(tasks);
With a single consumer thread:
static void RunTasks(BlockingCollection<ITask> tasks)
{
foreach (var task in tasks.GetConsumingEnumerable())
{
task.Execute();
// this may not be as accurate as you would like
Thread.Sleep(500);
}
}
If you have access to .NET 4.5 you can use Task.Delay:
static void RunTasks(BlockingCollection<ITask> tasks)
{
foreach (var task in tasks.GetConsumingEnumerable())
{
Task.Delay(500)
.ContinueWith(_ => task.Execute())
.Wait();
}
}
