ReactiveExtensions Observable FromAsync calling twice Function - c#

Ok, I'm trying to understand Rx and I'm kind of lost here.
FromAsyncPattern is now deprecated, so I took the example from here (section "Light up Task with Rx"), and it works. I made just a few changes: instead of using await, I Wait on the observable and also subscribe to it.
What I don't understand is: why is the function SumSquareRoots called twice?
var res = Observable.FromAsync(ct => SumSquareRoots(x, ct))
.Timeout(TimeSpan.FromSeconds(5));
res.Subscribe(y => Console.WriteLine(y));
res.Wait();
class Program
{
static void Main(string[] args)
{
Samples();
}
static void Samples()
{
var x = 100000000;
try
{
var res = Observable.FromAsync(ct => SumSquareRoots(x, ct))
.Timeout(TimeSpan.FromSeconds(5));
res.Subscribe(y => Console.WriteLine(y));
res.Wait();
}
catch (TimeoutException)
{
Console.WriteLine("Timed out :-(");
}
}
static Task<double> SumSquareRoots(long count, CancellationToken ct)
{
return Task.Run(() =>
{
var res = 0.0;
Console.WriteLine("Why I'm called twice");
for (long i = 0; i < count; i++)
{
res += Math.Sqrt(i);
if (i % 10000 == 0 && ct.IsCancellationRequested)
{
Console.WriteLine("Noticed cancellation!");
ct.ThrowIfCancellationRequested();
}
}
return res;
});
}
}

The reason that this is calling SumSquareRoots twice is because you're Subscribing twice:
// Subscribes to res
res.Subscribe(y => Console.WriteLine(y));
// Also Subscribes to res, since it *must* produce a result, even
// if that result is then discarded (i.e. Wait doesn't return IObservable)
res.Wait();
Subscribe is the foreach of Rx: just as running foreach over an IEnumerable twice can end up doing twice the work, multiple Subscribes mean multiple times the work. To avoid this, you could use a blocking call that doesn't discard the result:
Console.WriteLine(res.First());
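Or, you could use Publish to share a single subscription to the source among more than one subscriber (kind of like how you'd use ToArray in LINQ):
// Publish returns an IConnectableObservable, so it needs its own variable
var published = res.Publish();
// Connect is what actually subscribes to the source, exactly once
published.Connect();
// Both subscriptions get the same result, SumSquareRoots is only called once.
// Publish does not replay, so subscribers must attach before the value arrives
// (here the computation runs long enough for that).
published.Subscribe(Console.WriteLine);
published.Wait();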
The general rule you can follow is, that any Rx method that doesn't return IObservable<T> or Task<T> will result in a Subscription(*)
* - Not technically correct. But your brain will feel better if you think of it this way.
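The "Subscribe is the foreach of Rx" point can be sketched without Rx at all, using only the IObservable<T> and IObserver<T> interfaces from the base class library. This is a rough illustration, not how Observable.FromAsync is implemented; ColdSum, PrintObserver, NoopDisposable and ColdDemo are made-up names. The idea it shows is that a cold observable does its work inside Subscribe, so every subscriber triggers a fresh run.

```csharp
using System;

// Hypothetical hand-rolled cold observable: the work runs once per Subscribe call.
sealed class ColdSum : IObservable<double>
{
    public int Runs { get; private set; }

    public IDisposable Subscribe(IObserver<double> observer)
    {
        Runs++; // count how many times the work was started
        double sum = 0.0;
        for (int i = 0; i < 1000; i++)
        {
            sum += Math.Sqrt(i);
        }
        observer.OnNext(sum);
        observer.OnCompleted();
        return NoopDisposable.Instance;
    }
}

sealed class NoopDisposable : IDisposable
{
    public static readonly NoopDisposable Instance = new NoopDisposable();
    public void Dispose() { }
}

sealed class PrintObserver : IObserver<double>
{
    public void OnNext(double value) { Console.WriteLine(value); }
    public void OnError(Exception error) { Console.WriteLine(error.Message); }
    public void OnCompleted() { }
}

static class ColdDemo
{
    public static int Run()
    {
        var source = new ColdSum();
        source.Subscribe(new PrintObserver()); // plays the role of res.Subscribe(...)
        source.Subscribe(new PrintObserver()); // plays the role of res.Wait(), which subscribes internally
        return source.Runs;
    }

    static void Main()
    {
        Console.WriteLine("Work ran " + Run() + " times");
    }
}
```

Two subscriptions mean the loop runs twice, which is the same effect as seeing "Why I'm called twice" printed two times in the question.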

Related

Adding async value during the iteration c#

I have a method called by my controller, and I'm trying to add a new value during the iteration:
public async Task<MyResult> GetItems(int itemId) {
try
{
//returns a lists of items
var resList = await _unit.repository.GetAll(x => x.ItemId.Equals(itemId));
if(resList.Count() == 0)
throw new Exception(err.Message);
//Here i need to list the sub-items, but the method returns async
resList.ToList().ForEach(async res => {
switch (res.Type)
{
case Itemtype.None :
res.SubItems = await _unit.repository.GetAll(y => y.ItemId(res.ItemId));
break;
case Itemtype.Low :
//get from another place
break;
}
});
return new MyResult {
res = resList
};
}
catch (Exception ex)
{
throw ex;
}
}
My items are returned, but without the sub-items.
Note that using await inside a foreach loop will pause the iteration until the Task completes.
This can lead to pretty poor performance if you have a large number of items.
Allow the tasks to run simultaneously:
var tasks = resList.Select(async res =>
{
switch (res.Type)
{
case Itemtype.None :
res.SubItems = await _unit.repository.GetAll(y => y.ItemId(res.ItemId));
break;
case Itemtype.Low :
//get from another place
break;
}
});
Then use Task.WhenAll to allow them all to complete:
await Task.WhenAll(tasks);
Enumerable.Select works differently from List<T>.ForEach in that it returns whatever the provided delegate returns, which in this case is a Task.
It's important to realize that the await keyword will return when it acts on an incomplete Task. Usually it returns its own incomplete Task that the caller can use to wait until it's done, but if the method signature is void, it will return nothing.
Whatever that ForEach() is, it's likely that it does not accept a delegate that returns a Task, which means that your anonymous method (async res => { ... }) has a void return type.
That means your anonymous method returns as soon as the network request in GetAll() is sent. Because ForEach() isn't waiting for it, it moves on to the next iteration. By the time ForEach() has finished, all it has done is send the requests, not wait for them. So when you reach your return statement, you can't be sure anything has completed.
Replace that .ForEach() call with just a regular foreach so that you aren't doing that work inside a void delegate. Then you should see it behave more like you expect. Even better, use Johnathan's answer to start all the tasks inside the loop, then wait for them after.
Microsoft has a very well written series of articles on Asynchronous programming with async and await that I think you will benefit from reading. You can find the rest of the series in the table of contents on the left side of that page.
Switch from resList.ToList().ForEach(...) to ordinary foreach:
foreach (var res in resList)
{
switch (res.Type)
{
case Itemtype.None:
res.SubItems = await _unit.repository.GetAll(y => y.ItemId(res.ItemId));
break;
case Itemtype.Low:
//get from another place
break;
}
}
It seems that you are using List<T>.ForEach. It accepts an Action<T>, and since the action you pass to it is asynchronous, your current code will just create resList.Count() tasks and proceed to the return statement without actually awaiting them. A simple reproducer can look like this:
class MyClass { public int i; }
var col = new[] { new MyClass { i = 1 }, new MyClass { i = 2 } };
col.ToList().ForEach(async i => { await Task.Delay(1); i.i *= 10; });
//col.ToList().ForEach(i => { i.i *= 10; });
Console.WriteLine(string.Join(" ", col.Select(mc => mc.i))); // will print "1 2"
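For comparison, here is a small self-contained sketch of the same reproducer next to the Select plus Task.WhenAll fix suggested above. Item and ForEachVersusWhenAll are made-up names, and the delays are arbitrary; treat it as an illustration rather than a drop-in replacement for the repository code in the question.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

class Item { public int Value; }

static class ForEachVersusWhenAll
{
    // List<T>.ForEach takes an Action<T>, so the async lambda becomes async void:
    // ForEach returns as soon as each lambda hits its first incomplete await.
    public static string FireAndForget()
    {
        var items = new[] { new Item { Value = 1 }, new Item { Value = 2 } };
        items.ToList().ForEach(async item => { await Task.Delay(200); item.Value *= 10; });
        // Read immediately: the multiplications have typically not happened yet.
        return string.Join(" ", items.Select(x => x.Value));
    }

    // Select hands back the Task produced by each async lambda,
    // so the caller has something to wait on.
    public static async Task<string> Awaited()
    {
        var items = new[] { new Item { Value = 1 }, new Item { Value = 2 } };
        var tasks = items.Select(async item => { await Task.Delay(1); item.Value *= 10; });
        await Task.WhenAll(tasks); // enumerates the sequence once and waits for every task
        return string.Join(" ", items.Select(x => x.Value));
    }

    static async Task Main()
    {
        Console.WriteLine(FireAndForget());
        Console.WriteLine(await Awaited()); // prints "10 20"
    }
}
```

The only structural difference between the two methods is whether the tasks are handed back to the caller and awaited.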

Why are async functions called twice?

I'm using Threading timer to do some periodic job:
private static async void TimerCallback(object state)
{
if (Interlocked.CompareExchange(ref currentlyRunningTasksCount, 1, 0) != 0)
{
return;
}
var tasksRead = Enumerable.Range(3, 35).Select(i => ReadSensorsAsync(i));
await Task.WhenAll(tasksRead);
var tasksRecord = tasksRead.Where(x => x.Result != null).Select(x => RecordReadingAsync(x.Result));
await Task.WhenAll(tasksRecord);
Interlocked.Decrement(ref currentlyRunningTasksCount);
}
I made the timer callback async and used WhenAll. Each async worker function writes one line to the Console, which shows activity. The problem is that on the second timer event each async function runs twice for some reason. The timer is set to a long period. The application is a Windows console application. Is it Select that somehow makes it run twice?
This:
var tasksRead = Enumerable.Range(3, 35).Select(i => ReadSensorsAsync(i));
creates a lazily evaluated IEnumerable which maps numbers to method invocation results. ReadSensorsAsync is not invoked here, it will be invoked during evaluation.
This IEnumerable is evaluated twice. Here:
await Task.WhenAll(tasksRead);
and here:
// Here, another lazy IEnumerable is created based on tasksRead.
var tasksRecord = tasksRead.Where(...).Select(...);
await Task.WhenAll(tasksRecord); // Here, it is evaluated.
Thus, ReadSensorsAsync is invoked twice.
As csharpfolk suggested in the comments, materializing the IEnumerable should fix this:
var tasksRead = Enumerable.Range(3, 35).Select(i => ReadSensorsAsync(i)).ToList();
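To make the double evaluation visible, here is a small sketch that counts how often the mapped method is invoked with and without materializing the sequence. ReadSensorAsync below is a hypothetical stand-in for ReadSensorsAsync that completes synchronously, and LazyEnumerationDemo is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

static class LazyEnumerationDemo
{
    static int calls;

    // Hypothetical stand-in for ReadSensorsAsync. It returns an already
    // completed task, so reading .Result below never blocks.
    static Task<int?> ReadSensorAsync(int i)
    {
        calls++;
        return Task.FromResult<int?>(i);
    }

    public static async Task<int> CountCallsAsync(bool materialize)
    {
        calls = 0;
        IEnumerable<Task<int?>> tasksRead =
            Enumerable.Range(3, 35).Select(i => ReadSensorAsync(i));
        if (materialize)
        {
            tasksRead = tasksRead.ToList(); // run the Select once and keep the tasks
        }

        await Task.WhenAll(tasksRead);      // first pass over tasksRead
        var nonNull = tasksRead.Where(t => t.Result != null);
        await Task.WhenAll(nonNull);        // second pass over tasksRead
        return calls;
    }

    static async Task Main()
    {
        Console.WriteLine("lazy: " + await CountCallsAsync(false));        // prints "lazy: 70"
        Console.WriteLine("materialized: " + await CountCallsAsync(true)); // prints "materialized: 35"
    }
}
```

With the lazy sequence each of the 35 inputs is mapped twice, once per pass; after ToList the second pass only walks the stored tasks.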
When you use Task.WhenAll on an IEnumerable<Task<T>>, it returns a T[] containing the results of the completed tasks. You need to save that array and use it, or else you will end up with the multiple enumerations that Henzi mentioned in his answer.
Here is a solution without the unnecessary call to .ToList():
private static async void TimerCallback(object state)
{
if (Interlocked.CompareExchange(ref currentlyRunningTasksCount, 1, 0) != 0)
{
return;
}
var tasksRead = Enumerable.Range(3, 35).Select(i => ReadSensorsAsync(i));
var finishedTasks = await Task.WhenAll(tasksRead);
var tasksRecord = finishedTasks.Where(x => x != null).Select(x => RecordReadingAsync(x));
await Task.WhenAll(tasksRecord);
Interlocked.Decrement(ref currentlyRunningTasksCount);
}
I think I know why!
In short: a function that uses await implicitly registers a callback. Jeffrey Richter explains this in the video at https://wintellectnow.com/Videos/Watch?videoId=performing-i-o-bound-asynchronous-operations starting at 00:17:25.
Just try it:
var tasksRead = Enumerable.Range(3, 35).Select(i => ReadSensorsAsync(i));
var tasksRecord = tasksRead.Where(x => x.Result != null).Select(x => RecordReadingAsync(x.Result));
await Task.WhenAll(tasksRead);
await Task.WhenAll(tasksRecord);

Await list of async predicates, but drop out on first false

Imagine the following class:
public class Checker
{
public async Task<bool> Check() { ... }
}
Now, imagine a list of instances of this class:
IEnumerable<Checker> checkers = ...
Now I want to check that every instance returns true:
checkers.All(c => c.Check());
Now, this won't compile, since Check() returns a Task<bool> not a bool.
So my question is: How can I best enumerate the list of checkers?
And how can I shortcut the enumeration as soon as a checker returns false?
(something I presume All( ) does already)
"Asynchronous sequences" can always cause some confusion. For example, it's not clear whether your desired semantics are:
Start all checks simultaneously, and evaluate them as they complete.
Start the checks one at a time, evaluating them in sequence order.
There's a third possibility (start all checks simultaneously, and evaluate them in sequence order), but that would be silly in this scenario.
I recommend using Rx for asynchronous sequences. It gives you a lot of options, and although it is a bit hard to learn, it forces you to think about exactly what you want.
The following code will start all checks simultaneously and evaluate them as they complete:
IObservable<bool> result = checkers.ToObservable()
.SelectMany(c => c.Check()).All(b => b);
It first converts the sequence of checkers to an observable, calls Check on them all, and checks whether they are all true. The first Check that completes with a false value will cause result to produce a false value.
In contrast, the following code will start the checks one at a time, evaluating them in sequence order:
IObservable<bool> result = checkers.Select(c => c.Check().ToObservable())
.Concat().All(b => b);
It first converts the sequence of checkers to a sequence of observables, and then concatenates those sequences (which starts them one at a time).
If you do not wish to use observables much and don't want to mess with subscriptions, you can await them directly. E.g., to call Check on all checkers and evaluate the results as they complete:
bool all = await checkers.ToObservable().SelectMany(c => c.Check()).All(b => b);
And how can I shortcut the enumeration as soon as a checker returns false?
This will check the tasks' result in order of completion. So if task #5 is the first to complete, and returns false, the method returns false immediately, regardless of the other tasks. Slower tasks (#1, #2, etc) would never be checked.
public static async Task<bool> AllAsync(this IEnumerable<Task<bool>> source)
{
var tasks = source.ToList();
while(tasks.Count != 0)
{
var finishedTask = await Task.WhenAny(tasks);
if(! finishedTask.Result)
return false;
tasks.Remove(finishedTask);
}
return true;
}
Usage:
bool result = await checkers.Select(c => c.Check())
.AllAsync();
All wasn't built with async in mind (like all LINQ), so you would need to implement that yourself:
async Task<bool> CheckAll()
{
foreach(var checker in checkers)
{
if (!await checker.Check())
{
return false;
}
}
return true;
}
You could make it more reusable with a generic extension method:
public static async Task<bool> AllAsync<TSource>(this IEnumerable<TSource> source, Func<TSource, Task<bool>> predicate)
{
foreach (var item in source)
{
if (!await predicate(item))
{
return false;
}
}
return true;
}
And use it like this:
var result = await checkers.AllAsync(c => c.Check());
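As a quick check that the sequential version really stops at the first false, here is a self-contained sketch that repeats the extension method and counts how many predicates were started. AsyncEnumerableExtensions and ShortCircuitDemo are made-up names, and the integers stand in for the Checker instances from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AsyncEnumerableExtensions
{
    // Same shape as the extension method above: await each predicate in turn
    // and stop at the first one that yields false.
    public static async Task<bool> AllAsync<TSource>(
        this IEnumerable<TSource> source, Func<TSource, Task<bool>> predicate)
    {
        foreach (var item in source)
        {
            if (!await predicate(item))
            {
                return false;
            }
        }
        return true;
    }
}

public static class ShortCircuitDemo
{
    public static async Task<(bool Result, int ChecksStarted)> RunAsync()
    {
        int started = 0;
        var inputs = new[] { 1, 2, 3, 4, 5 };

        // Hypothetical check: everything passes except the value 3.
        bool result = await inputs.AllAsync(async n =>
        {
            started++;
            await Task.Delay(10);
            return n != 3;
        });

        return (result, started);
    }

    static async Task Main()
    {
        var (result, started) = await RunAsync();
        Console.WriteLine(result + " after " + started + " checks"); // prints "False after 3 checks"
    }
}
```

Checks 4 and 5 are never started, which is the trade-off against the WhenAny approach: no wasted work, but also no overlap between checks.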
You could do
checkers.All(c => c.Check().Result);
but that would run the tasks synchronously, which may be very slow depending on the implementation of Check().
Here's a fully functional test program, following in the steps of dcastro:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace AsyncCheckerTest
{
public class Checker
{
public int Seconds { get; private set; }
public Checker(int seconds)
{
Seconds = seconds;
}
public async Task<bool> CheckAsync()
{
await Task.Delay(Seconds * 1000);
return Seconds != 3;
}
}
class Program
{
static void Main(string[] args)
{
var task = RunAsync();
task.Wait();
Console.WriteLine("Overall result: " + task.Result);
Console.ReadLine();
}
public static async Task<bool> RunAsync()
{
var checkers = new List<Checker>();
checkers
.AddRange(Enumerable.Range(1, 5)
.Select(i => new Checker(i)));
return await checkers
.Select(c => c.CheckAsync())
.AllAsync();
}
}
public static class ExtensionMethods
{
public static async Task<bool> AllAsync(this IEnumerable<Task<bool>> source)
{
var tasks = source.ToList();
while (tasks.Count != 0)
{
Task<bool> finishedTask = await Task.WhenAny(tasks);
bool checkResult = finishedTask.Result;
if (!checkResult)
{
Console.WriteLine("Completed at " + DateTimeOffset.Now + "...false");
return false;
}
Console.WriteLine("Working... " + DateTimeOffset.Now);
tasks.Remove(finishedTask);
}
return true;
}
}
}
Here's sample output:
Working... 6/27/2014 1:47:35 AM -05:00
Working... 6/27/2014 1:47:36 AM -05:00
Completed at 6/27/2014 1:47:37 AM -05:00...false
Overall result: False
Note that the entire evaluation ended as soon as the exit condition was reached, without waiting for the rest of the tasks to finish.
As a more out-of-the-box alternative, this seems to run the tasks in parallel and return shortly after the first failure:
var allResult = checkers
.Select(c => Task.Factory.StartNew(() => c.Check().Result))
.AsParallel()
.All(t => t.Result);
I'm not too hot on TPL and PLINQ so feel free to tell me what's wrong with this.

Parallel ForEach wait 500 ms before spawning

I have this situation:
var tasks = new List<ITask> ...
Parallel.ForEach(tasks, currentTask => currentTask.Execute() );
Is it possible to instruct PLinq to wait for 500ms before the next thread is spawned?
System.Threading.Thread.Sleep(5000);
You are using Parallel.ForEach in a way it was not designed for. You should write a special enumerator that rate-limits itself to fetching data once every 500 ms.
I made some assumptions about how your DTO works, since you did not provide any details.
private IEnumerable<SomeResource> GetRateLimitedResource()
{
SomeResource someResource = null;
do
{
someResource = _remoteProvider.GetData();
if(someResource != null)
{
yield return someResource;
Thread.Sleep(500);
}
} while (someResource != null);
}
Here is how your Parallel.ForEach call should look then:
Parallel.ForEach(GetRateLimitedResource(), SomeFunctionToProcessSomeResource);
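One detail worth keeping in mind with this approach: Parallel.ForEach takes an IEnumerable<T>, and by default it pulls items from the enumerable in chunks, which can blur the spacing between items. Here is a rough sketch of a generic rate-limited wrapper combined with a non-buffering partitioner. RateLimitedForEach and RateLimited are made-up names, and the integers stand in for whatever resource is being fetched.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class RateLimitedForEach
{
    // Hypothetical wrapper: hands out items from source no faster than one per gap.
    static IEnumerable<T> RateLimited<T>(IEnumerable<T> source, TimeSpan gap)
    {
        bool first = true;
        foreach (var item in source)
        {
            if (!first)
            {
                Thread.Sleep(gap); // wait before releasing the next item
            }
            first = false;
            yield return item;
        }
    }

    public static int[] Run(TimeSpan gap)
    {
        var processed = new ConcurrentBag<int>();
        var source = RateLimited(Enumerable.Range(1, 4), gap);

        // NoBuffering makes each worker take one item at a time instead of a chunk.
        var partitioner = Partitioner.Create(source, EnumerablePartitionerOptions.NoBuffering);
        Parallel.ForEach(partitioner, item => processed.Add(item * 2));

        return processed.OrderBy(x => x).ToArray();
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" ", Run(TimeSpan.FromMilliseconds(500)))); // prints "2 4 6 8"
    }
}
```

The items are still processed in parallel once handed out; only the hand-out itself is spaced.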
There are already some good suggestions. I would agree with others that you are using PLINQ in a manner it wasn't meant to be used.
My suggestion would be to use System.Threading.Timer. This is probably better than writing a method that returns an IEnumerable<> that forces a half second delay, because you may not need to wait the full half second, depending on how much time has passed since your last API call.
With the timer, it will invoke a delegate that you've provided it at the interval you specify, so even if the first task isn't done, a half second later it will invoke your delegate on another thread, so there won't be any extra waiting.
From your example code, it sounds like you have a list of tasks. In this case, I would use System.Collections.Concurrent.ConcurrentQueue to keep track of the tasks. Once the queue is empty, turn off the timer.
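A minimal sketch of the timer plus ConcurrentQueue idea could look like the following. TimedDispatcher and RunAsync are made-up names, plain Action delegates stand in for the ITask.Execute() calls from the question, and there is no error handling: an exception thrown by a work item would escape on a thread-pool thread.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class TimedDispatcher
{
    // Starts one queued action per timer tick. The returned task completes
    // once every action has finished, at which point the timer is disposed.
    public static Task RunAsync(IEnumerable<Action> work, TimeSpan interval)
    {
        var queue = new ConcurrentQueue<Action>(work);
        int remaining = queue.Count;
        var done = new TaskCompletionSource<bool>();
        if (remaining == 0)
        {
            done.SetResult(true);
            return done.Task;
        }

        Timer timer = null;
        timer = new Timer(_ =>
        {
            Action action;
            if (!queue.TryDequeue(out action))
            {
                return; // queue already drained; a later tick has nothing to do
            }
            try
            {
                action();
            }
            finally
            {
                if (Interlocked.Decrement(ref remaining) == 0)
                {
                    timer.Dispose(); // turn the timer off after the last item
                    done.TrySetResult(true);
                }
            }
        }, null, Timeout.Infinite, Timeout.Infinite);

        // Start ticking only after the timer variable has been assigned.
        timer.Change(TimeSpan.Zero, interval);
        return done.Task;
    }

    static async Task Main()
    {
        int executed = 0;
        var work = new List<Action>();
        for (int i = 0; i < 3; i++)
        {
            work.Add(() => Interlocked.Increment(ref executed));
        }
        await RunAsync(work, TimeSpan.FromMilliseconds(500));
        Console.WriteLine(executed + " tasks executed"); // prints "3 tasks executed"
    }
}
```

Because each tick runs on a thread-pool thread, a slow work item does not delay the start of the next one, which is the behaviour described above.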
You could use Enumerable.Aggregate instead.
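// The continuation is a plain (non-async) lambda: it blocks for 500 ms,
// then returns the next task's result, so the chain stays a Task<T>.
var task = tasks.Aggregate((t1, t2) =>
t1.ContinueWith(_ =>
{ Thread.Sleep(500); return t2.Result; }));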
If you don't want the tasks chained then there is also the overload to Select assuming the tasks are in order of delay.
var tasks = Enumerable
.Range(1, 10)
.Select(x => Task.Run(() => x * 2))
.Select((x, i) => Task.Delay(TimeSpan.FromMilliseconds(i * 500))
.ContinueWith(_ => x.Result));
foreach(var result in tasks.Select(x => x.Result))
{
Console.WriteLine(result);
}
From the comments, a better option would be to guard the resource instead of using a time delay.
static object Locker = new object();
static int GetResultFromResource(int arg)
{
lock(Locker)
{
Thread.Sleep(500);
return arg * 2;
}
}
var tasks = Enumerable
.Range(1, 10)
.Select(x => Task.Run(() => GetResultFromResource(x)));
foreach(var result in tasks.Select(x => x.Result))
{
Console.WriteLine(result);
}
In this case how about a Producer-Consumer pattern with a BlockingCollection<T>?
var tasks = new BlockingCollection<ITask>();
// add tasks, if this is an expensive process, put it out onto a Task
// tasks.Add(x);
// we're done producin' (allows GetConsumingEnumerable to finish)
tasks.CompleteAdding();
RunTasks(tasks);
With a single consumer thread:
static void RunTasks(BlockingCollection<ITask> tasks)
{
foreach (var task in tasks.GetConsumingEnumerable())
{
task.Execute();
// this may not be as accurate as you would like
Thread.Sleep(500);
}
}
If you have access to .Net 4.5 you can use Task.Delay:
static void RunTasks(BlockingCollection<ITask> tasks)
{
foreach (var task in tasks.GetConsumingEnumerable())
{
Task.Delay(500)
.ContinueWith(_ => task.Execute())
.Wait();
}
}

Processing a batch of request with Reactive Extension

I'm learning Reactive Extensions, and I've been trying to find out if it's a match for a task like this.
I have a Process() method that processes a batch of requests as a unit of work and invokes a callback when all requests have completed.
The important thing here is that each request will call the callback either synchronously or asynchronously depending on its implementation, and the batch processor must be able to handle both.
But no threads are started from the batch processor; any new threads (or other async execution) will be initiated from inside the request handlers if necessary. I don't know if this matches the use cases of Rx.
My current working code looks (almost) like this:
public void Process(ICollection<IRequest> requests, Action<List<IResponse>> onCompleted)
{
IUnitOfWork uow = null;
try
{
uow = unitOfWorkFactory.Create();
var responses = new List<IResponse>();
var outstandingRequests = requests.Count;
foreach (var request in requests)
{
var correlationId = request.CorrelationId;
Action<IResponse> requestCallback = response =>
{
response.CorrelationId = correlationId;
responses.Add(response);
outstandingRequests--;
if (outstandingRequests != 0)
return;
uow.Commit();
onCompleted(responses);
};
requestProcessor.Process(request, requestCallback);
}
}
catch(Exception)
{
if (uow != null)
uow.Rollback();
}
if (uow != null)
uow.Commit();
}
How would you implement this using rx? Is it reasonable?
Note, that the unit of work is to be committed synchronously even if there are async requests that have not yet returned.
My approach to this is two-step.
First create a general-purpose operator that turns Action<T, Action<R>> into Func<T, IObservable<R>>:
public static class ObservableEx
{
public static Func<T, IObservable<R>> FromAsyncCallbackPattern<T, R>(
this Action<T, Action<R>> call)
{
if (call == null) throw new ArgumentNullException("call");
return t =>
{
var subject = new AsyncSubject<R>();
try
{
Action<R> callback = r =>
{
subject.OnNext(r);
subject.OnCompleted();
};
call(t, callback);
}
catch (Exception ex)
{
return Observable.Throw<R>(ex, Scheduler.ThreadPool);
}
return subject.AsObservable<R>();
};
}
}
Next, turn the call void Process(ICollection<IRequest> requests, Action<List<IResponse>> onCompleted) into IObservable<IResponse> Process(IObservable<IRequest> requests):
public IObservable<IResponse> Process(IObservable<IRequest> requests)
{
Func<IRequest, IObservable<IResponse>> rq2rp =
ObservableEx.FromAsyncCallbackPattern
<IRequest, IResponse>(requestProcessor.Process);
var query = (
from rq in requests
select rq2rp(rq)).Concat();
var uow = unitOfWorkFactory.Create();
var subject = new ReplaySubject<IResponse>();
query.Subscribe(
r => subject.OnNext(r),
ex =>
{
uow.Rollback();
subject.OnError(ex);
},
() =>
{
uow.Commit();
subject.OnCompleted();
});
return subject.AsObservable();
}
Now, not only does this run the processing async, but it also ensures the correct order of the results.
In fact, since you are starting with a collection, you could even do this:
var rqs = requests.ToObservable();
var rqrps = rqs.Zip(Process(rqs),
(rq, rp) => new
{
Request = rq,
Response = rp,
});
Then you would have an observable that pairs up each request/response without the need for a CorrelationId property.
I hope this helps.
This is part of the genius of Rx, as you're free to return results either synchronously or asynchronously:
public IObservable<int> AddNumbers(int a, int b) {
return Observable.Return(a + b);
}
public IObservable<int> AddNumbersAsync(int a, int b) {
return Observable.Start(() => a + b, Scheduler.NewThread);
}
They both have the IObservable type, so they work identically. If you want to find out when all IObservables complete, Aggregate will do this, as it will turn 'n' items in an Observable into 1 item that is returned at the end:
IObservable<int>[] listOfObservables = { AddNumbers(1, 2), AddNumbersAsync(3, 4) };
listOfObservables.ToObservable()
.Merge()
.Aggregate(0, (acc, x) => acc+1)
.Subscribe(x => Console.WriteLine("{0} items were run", x));
