Why are async functions called twice? - c#

I'm using a System.Threading.Timer to do a periodic job:
private static async void TimerCallback(object state)
{
    if (Interlocked.CompareExchange(ref currentlyRunningTasksCount, 1, 0) != 0)
    {
        return;
    }
    var tasksRead = Enumerable.Range(3, 35).Select(i => ReadSensorsAsync(i));
    await Task.WhenAll(tasksRead);
    var tasksRecord = tasksRead.Where(x => x.Result != null).Select(x => RecordReadingAsync(x.Result));
    await Task.WhenAll(tasksRecord);
    Interlocked.Decrement(ref currentlyRunningTasksCount);
}
I made the timer callback async and used WhenAll. Each async worker function writes one line to the console to show activity. The problem is that on the second timer event, each async function runs twice for some reason. The timer is set to a long period, and the application is a Windows console app. Is it Select that somehow makes it run twice?

This:
var tasksRead = Enumerable.Range(3, 35).Select(i => ReadSensorsAsync(i));
creates a lazily evaluated IEnumerable which maps numbers to method invocation results. ReadSensorsAsync is not invoked here, it will be invoked during evaluation.
This IEnumerable is evaluated twice. Here:
await Task.WhenAll(tasksRead);
and here:
// Here, another lazy IEnumerable is created based on tasksRead.
var tasksRecord = tasksRead.Where(...).Select(...);
await Task.WhenAll(tasksRecord); // Here, it is evaluated.
Thus, ReadSensorsAsync is invoked twice.
As csharpfolk suggested in the comments, materializing the IEnumerable should fix this:
var tasksRead = Enumerable.Range(3, 35).Select(i => ReadSensorsAsync(i)).ToList();
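The double evaluation described above can be reproduced in isolation. The following is a minimal sketch, not the asker's code: FakeReadSensorsAsync is a hypothetical stand-in for ReadSensorsAsync that only counts how often it is invoked, and LazyEnumerationDemo and CountCallsAsync are names made up for this illustration.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LazyEnumerationDemo
{
    private static int _calls;

    // Hypothetical stand-in for ReadSensorsAsync; it only counts invocations.
    private static Task<int?> FakeReadSensorsAsync(int i)
    {
        Interlocked.Increment(ref _calls);
        return Task.FromResult<int?>(i);
    }

    // Returns how many times the fake reader was invoked for 35 inputs.
    public static async Task<int> CountCallsAsync(bool materialize)
    {
        _calls = 0;
        IEnumerable<Task<int?>> tasksRead =
            Enumerable.Range(3, 35).Select(i => FakeReadSensorsAsync(i));

        if (materialize)
        {
            // ToList runs the projection once and stores the resulting tasks.
            tasksRead = tasksRead.ToList();
        }

        // First enumeration of tasksRead.
        await Task.WhenAll(tasksRead);

        // Second enumeration: on a lazy sequence this re-runs the projection.
        tasksRead.Where(t => t.Result != null).ToList();

        return _calls;
    }
}
```

With materialize set to false, the reader should be invoked 70 times (35 per enumeration); with true, 35 times.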

When you use Task.WhenAll on an IEnumerable<Task<T>>, it returns a T[] containing the results of the completed tasks. You need to save that array and use it, or else you will end up with the multiple enumerations that Henzi mentioned in his answer.
Here is a solution without the unnecessary call to .ToList():
private static async void TimerCallback(object state)
{
    if (Interlocked.CompareExchange(ref currentlyRunningTasksCount, 1, 0) != 0)
    {
        return;
    }
    var tasksRead = Enumerable.Range(3, 35).Select(i => ReadSensorsAsync(i));
    var finishedTasks = await Task.WhenAll(tasksRead);
    var tasksRecord = finishedTasks.Where(x => x != null).Select(x => RecordReadingAsync(x));
    await Task.WhenAll(tasksRecord);
    Interlocked.Decrement(ref currentlyRunningTasksCount);
}

I think I know why!
In short: a function that uses await implicitly creates a callback thread. Jeffrey Richter explains it better in this video, starting at 00:17:25: https://wintellectnow.com/Videos/Watch?videoId=performing-i-o-bound-asynchronous-operations
Just try this:
var tasksRead = Enumerable.Range(3, 35).Select(i => ReadSensorsAsync(i));
var tasksRecord = tasksRead.Where(x => x.Result != null).Select(x => RecordReadingAsync(x.Result));
await Task.WhenAll(tasksRead);
await Task.WhenAll(tasksRecord);

Related

Await for IEnumerable items, (wait after await)

I have this iterator. Its return type is IEnumerable<Task<Event>> so that I can await each item later:
private IEnumerable<Task<Event>> GetEventsAsync(long length)
{
var pos = _reader.BaseStream.Position;
while (_reader.BaseStream.Position - pos < length)
{
yield return ReadEvent(); // this method is async
}
}
Now I want to pass this in to constructor.
private async Task<EventManager> ReadEventManagerAsync()
{
// some other stuff
var length = Reverse(await _reader.ReadInt32()); // bytes to read
var events = GetEventsAsync(length); // cant await iterator. so use linq
return new EventManager(events.Select(async e => await e).Select(x => x.Result));
}
The constructor takes this parameter.
internal EventManager([NotNull, NoEnumeration]IEnumerable<Event> events) {...}
Will this code run asynchronously?
Because I can't await inside a lambda even if the method is marked as async, I tried a hack.
I understand why Select(async e => await e) returns IEnumerable<Task<Event>>: async e => await e is an async method, so its return type must be wrapped in a Task.
My question is: will .Select(x => x.Result) still run asynchronously? I'm awaiting the items before this, so it shouldn't be a problem, right? It's like writing this:
await e;
return e.Result; // is it safe?
I don't want to use Task.WhenAll because that will enumerate everything, which is what I'm trying to avoid. I want to keep this iterator untouched until it's passed to the constructor; I want to keep the execution deferred. (I will use a factory method if this approach is not possible.)
Will this code run asynchronously?
No. When the enumerable is enumerated, it will run synchronously, complete with possibility of deadlocks.
Select(async e => await e) doesn't do anything. It's the same as Select(e => e). It is not the same as await e;.
The whole converting-stream-to-IEnumerable<T> is extremely odd. Perhaps you meant to convert it to IObservable<T>?
I will use factory method if this approach is not possible
Yes, you cannot use await in a constructor, and blocking can cause deadlocks. An asynchronous factory method is probably your best bet.
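An asynchronous factory method along these lines might look as follows. This is only a sketch under assumptions: the Event and EventManager types below are hypothetical minimal versions (the real ones are not shown in the question), and CreateAsync is an invented name. It awaits the items one at a time, so the iterator advances only after the previous read has finished, and the constructor receives finished data.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical minimal type; the real Event class is not shown in the question.
public sealed class Event
{
    public Event(int id) { Id = id; }
    public int Id { get; }
}

// Hypothetical minimal type with a private constructor and an async factory.
public sealed class EventManager
{
    private readonly List<Event> _events;

    // The constructor stays synchronous and only receives finished data.
    private EventManager(List<Event> events) { _events = events; }

    public IReadOnlyList<Event> Events => _events;

    // All awaiting happens here, before the constructor runs.
    public static async Task<EventManager> CreateAsync(IEnumerable<Task<Event>> pending)
    {
        var events = new List<Event>();
        foreach (Task<Event> task in pending)
        {
            // Await one item at a time so a lazy source iterator advances
            // only after the previous read has completed.
            events.Add(await task);
        }
        return new EventManager(events);
    }
}
```

The caller would then write something like `var manager = await EventManager.CreateAsync(GetEventsAsync(length));`. The trade-off is that the sequence is consumed inside the factory, so execution is no longer deferred past construction.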
No, it will run synchronously (constructors can't be async after all).
That doesn't mean your performance is impacted if (since) you never actually execute the tasks. events.Select(async e => await e).Select(x => x.Result) returns an enumerable after all. It won't call the first Select, which means that the awaited code is never called.
You can use WhenAll for this and await it asynchronously. WhenAll returns the results, so pass those on instead of enumerating events a second time (which would call ReadEvent again):
private async Task<EventManager> ReadEventManagerAsync()
{
    // some other stuff
    var length = Reverse(await _reader.ReadInt32()); // bytes to read
    var events = GetEventsAsync(length); // can't await an iterator
    // WhenAll enumerates the iterator once and returns the results
    Event[] results = await Task.WhenAll(events);
    return new EventManager(results);
}
yield does not work with await as you might expect; maybe the next C# version will have better support. For now, you just have to await each event and store it in a list. Also, if your ReadEvent depends on Stream.Position, you cannot use Task.WhenAll or any similar approach; simply enumerate and store the results in a list.
private async Task<IEnumerable<Event>> GetEventsAsync(long length)
{
    List<Event> events = new List<Event>();
    var pos = _reader.BaseStream.Position;
    while (_reader.BaseStream.Position - pos < length)
    {
        // If ReadEvent depends on _reader.BaseStream.Position,
        // you cannot use Task.WhenAll, because it lets all
        // tasks run concurrently, whereas here you must await
        // the previous ReadEvent before starting the next one.
        Event e = await ReadEvent(); // this method is async
        events.Add(e);
    }
    return events;
}

Unit tests failing with Observable.FromAsync and Observable.Switch

I'm having troubles testing a class that makes use of Observable.FromAsync<T>() and Observable.Switch<T>(). What it does is to wait for a trigger observable to produce a value, then it starts an async operation, and finally recollects all operations' results in a single output sequence. The gist of it is something like:
var outputStream = triggerStream
.Select(_ => Observable
.FromAsync(token => taskProducer.DoSomethingAsync(token)))
.Switch();
I put up some sanity check tests with the bare minimum parts to understand what's going on, here's the test with results in comments:
class test_with_rx : nspec
{
    void Given_async_task_and_switch()
    {
        Subject<Unit> triggerStream = null;
        TaskCompletionSource<long> taskDriver = null;
        ITestableObserver<long> testObserver = null;
        IDisposable subscription = null;
        before = () =>
        {
            TestScheduler scheduler = new TestScheduler();
            testObserver = scheduler.CreateObserver<long>();
            triggerStream = new Subject<Unit>();
            taskDriver = new TaskCompletionSource<long>();
            // build stream under test
            IObservable<long> streamUnderTest = triggerStream
                .Select(_ => Observable
                    .FromAsync(token => taskDriver.Task))
                .Switch();
            /* Also tried with this Switch() overload
            IObservable<long> streamUnderTest = triggerStream
                .Select(_ => taskDriver.Task)
                .Switch(); */
            subscription = streamUnderTest.Subscribe(testObserver);
        };
        context["Before trigger"] = () =>
        {
            it["Should not notify"] = () => testObserver.Messages.Count.Should().Be(0);
            // PASSED
        };
        context["After trigger"] = () =>
        {
            before = () => triggerStream.OnNext(Unit.Default);
            context["When task completes"] = () =>
            {
                long result = -1;
                before = () =>
                {
                    taskDriver.SetResult(result);
                    //taskDriver.Task.Wait(); // tried with this too
                };
                it["Should notify once"] = () => testObserver.Messages.Count.Should().Be(1);
                // FAILED: expected 1, actual 0
                it["Should notify task result"] = () => testObserver.Messages[0].Value.Value.Should().Be(result);
                // FAILED: of course, index out of bound
            };
        };
        after = () =>
        {
            taskDriver.TrySetCanceled();
            taskDriver.Task.Dispose();
            subscription.Dispose();
        };
    }
}
In other tests I've done with mocks too, I can see that the Func passed to FromAsync is actually invoked (e.g. taskProducer.DoSomethingAsync(token)), but then it looks like nothing more follows, and the output stream doesn't produce the value.
I also tried inserting some Task.Delay(x).Wait(), or some taskDriver.Task.Wait() before hitting expectations, but with no luck.
I read this SO thread and I'm aware of schedulers, but at a first look I thought I didn't need them, no ObserveOn() is being used. Was I wrong? What am I missing? TA
Just for completeness, testing framework is NSpec, assertion library is FluentAssertions.
What you're hitting is a case of testing Rx and TPL together.
An exhaustive explanation can be found here but I'll try to give advice for your particular code.
Basically your code is working fine, but your test is not.
Observable.FromAsync will transform into a ContinueWith on the provided task, which will be executed on the taskpool, hence asynchronously.
There are several ways to fix your test (from ugly to complex):
Sleep after setting the result (note that Wait doesn't help, because Wait doesn't wait for continuations)
taskDriver.SetResult(result);
Thread.Sleep(50);
Set the result before executing FromAsync (because FromAsync returns an immediate IObservable if the task is already finished, i.e. it skips ContinueWith)
taskDriver.SetResult(result);
triggerStream.OnNext(Unit.Default);
Replace FromAsync with a testable alternative, e.g.
public static IObservable<T> ToObservable<T>(Task<T> task, TaskScheduler scheduler)
{
if (task.IsCompleted)
{
return task.ToObservable();
}
else
{
AsyncSubject<T> asyncSubject = new AsyncSubject<T>();
task.ContinueWith(t => task.ToObservable().Subscribe(asyncSubject), scheduler);
return asyncSubject.AsObservable<T>();
}
}
(using either a synchronous TaskScheduler, or a testable one)
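For the "synchronous TaskScheduler" option, one possible sketch is a scheduler that executes every task inline on the thread that queues it. SynchronousTaskScheduler and SynchronousContinuationDemo are names invented for this illustration, not part of Rx or the TPL. The demo uses plain TPL only (no Rx) to show the effect: with such a scheduler passed to ContinueWith, I would expect the continuation to have run by the time SetResult returns, which is what makes the test deterministic.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Illustrative scheduler: runs every queued task immediately on the queuing thread.
public sealed class SynchronousTaskScheduler : TaskScheduler
{
    public override int MaximumConcurrencyLevel => 1;

    protected override void QueueTask(Task task) => TryExecuteTask(task);

    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
        => TryExecuteTask(task);

    // Nothing is ever left waiting in a queue.
    protected override IEnumerable<Task> GetScheduledTasks() => Enumerable.Empty<Task>();
}

public static class SynchronousContinuationDemo
{
    // Returns true if the continuation had already run when SetResult returned.
    public static bool ContinuationRanBeforeSetResultReturned()
    {
        var taskDriver = new TaskCompletionSource<long>();
        bool ran = false;

        // The continuation is scheduled on the inline scheduler instead of the thread pool.
        taskDriver.Task.ContinueWith(t => { ran = true; }, new SynchronousTaskScheduler());

        taskDriver.SetResult(-1);
        return ran;
    }
}
```

Passing an instance of this scheduler to the ToObservable helper above would avoid the need for Thread.Sleep in the test.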

Cold Tasks and TaskExtensions.Unwrap

I've got a caching class that uses cold (unstarted) tasks to avoid running the expensive thing multiple times.
public class AsyncConcurrentDictionary<TKey, TValue> : System.Collections.Concurrent.ConcurrentDictionary<TKey, Task<TValue>>
{
internal Task<TValue> GetOrAddAsync(TKey key, Task<TValue> newTask)
{
var cachedTask = base.GetOrAdd(key, newTask);
if (cachedTask == newTask && cachedTask.Status == TaskStatus.Created) // We won! our task is now the cached task, so run it
cachedTask.Start();
return cachedTask;
}
}
This works great right up until your task is actually implemented using C# 5's await, à la
cache.GetOrAddAsync("key", new Task(async () => {
    var r = await AsyncOperation();
    return r.FastSynchronousTransform();
}));
Now it looks like TaskExtensions.Unwrap() does exactly what I need by turning Task<Task<T>> into a Task<T>, but it seems that the wrapper it returns doesn't actually support Start(): it throws an exception.
TaskCompletionSource (my go to for slightly special Task needs) doesn't seem to have any facilities for this sort of thing either.
Is there an alternative to TaskExtensions.Unwrap() that supports "cold tasks"?
All you need to do is keep the original Task (the one from before unwrapping) around and start that one:
public Task<TValue> GetOrAddAsync(TKey key, Func<Task<TValue>> taskFunc)
{
    Task<Task<TValue>> wrappedTask = new Task<Task<TValue>>(taskFunc);
    Task<TValue> unwrappedTask = wrappedTask.Unwrap();
    Task<TValue> cachedTask = base.GetOrAdd(key, unwrappedTask);
    if (cachedTask == unwrappedTask)
        wrappedTask.Start();
    return cachedTask;
}
Usage:
cache.GetOrAddAsync(
"key", async () =>
{
var r = await AsyncOperation();
return r.FastSynchronousTransform();
});
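The wrapped/unwrapped relationship can also be tried outside the cache. The following is a small sketch under assumptions: ColdUnwrapDemo, CreateCold and StartAndAwaitAsync are invented names, and Task.Delay plus a constant stand in for AsyncOperation() and FastSynchronousTransform().

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class ColdUnwrapDemo
{
    // Incremented each time the async body starts running.
    public static int BodyRuns;

    // Builds a cold (unstarted) outer task whose body is an async lambda.
    public static Task<Task<int>> CreateCold()
    {
        return new Task<Task<int>>(async () =>
        {
            Interlocked.Increment(ref BodyRuns);
            await Task.Delay(10); // stands in for AsyncOperation()
            return 42;            // stands in for r.FastSynchronousTransform()
        });
    }

    public static async Task<int> StartAndAwaitAsync()
    {
        Task<Task<int>> wrapped = CreateCold();

        // Unwrap returns a proxy task; the proxy itself cannot be started.
        Task<int> unwrapped = wrapped.Unwrap();

        // Start the outer task; the proxy completes when the inner task does.
        wrapped.Start();

        return await unwrapped;
    }
}
```

Awaiting StartAndAwaitAsync() should yield 42, with the body having run exactly once.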

Parallel ForEach wait 500 ms before spawning

I have this situation:
var tasks = new List<ITask> ...
Parallel.ForEach(tasks, currentTask => currentTask.Execute() );
Is it possible to instruct PLinq to wait for 500ms before the next thread is spawned?
System.Threading.Thread.Sleep(5000);
You are using Parallel.ForEach totally wrong. You should make a special enumerator that rate-limits itself to getting data once every 500 ms.
I made some assumptions about how your DTO works, since you didn't provide any details.
private IEnumerable<SomeResource> GetRateLimitedResource()
{
    SomeResource someResource = null;
    do
    {
        someResource = _remoteProvider.GetData();
        if (someResource != null)
        {
            yield return someResource;
            Thread.Sleep(500);
        }
    } while (someResource != null);
}
Here is how your parallel loop should look then:
Parallel.ForEach(GetRateLimitedResource(), SomeFunctionToProcessSomeResource);
There are already some good suggestions. I would agree with others that you are using PLINQ in a manner it wasn't meant to be used.
My suggestion would be to use System.Threading.Timer. This is probably better than writing a method that returns an IEnumerable<> that forces a half second delay, because you may not need to wait the full half second, depending on how much time has passed since your last API call.
With the timer, it will invoke a delegate that you've provided it at the interval you specify, so even if the first task isn't done, a half second later it will invoke your delegate on another thread, so there won't be any extra waiting.
From your example code, it sounds like you have a list of tasks. In that case, I would use System.Collections.Concurrent.ConcurrentQueue to keep track of the tasks. Once the queue is empty, turn off the timer.
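The timer-plus-queue idea could be sketched roughly like this. It is one possible reading of the suggestion, not a definitive implementation: ITask here is a hypothetical stand-in for the interface in the question, and DelegateTask and TimedRunner are names invented for the example. Each tick starts at most one task, so a slow task does not hold up the next tick; the timer is disposed once every task has finished. A task that throws would surface as an unhandled exception on a thread-pool thread, which real code would need to handle.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical stand-in for the ITask interface from the question.
public interface ITask
{
    void Execute();
}

// Hypothetical helper that wraps a delegate as an ITask.
public sealed class DelegateTask : ITask
{
    private readonly Action _action;
    public DelegateTask(Action action) { _action = action; }
    public void Execute() { _action(); }
}

public static class TimedRunner
{
    // Starts one queued task per tick and returns after all of them have finished.
    // Assumes nothing else adds to or removes from the queue while this runs.
    public static void Run(ConcurrentQueue<ITask> queue, TimeSpan interval)
    {
        int pending = queue.Count;
        if (pending == 0) return;

        using (var allDone = new CountdownEvent(pending))
        {
            TimerCallback tick = _ =>
            {
                ITask task;
                if (queue.TryDequeue(out task))
                {
                    try { task.Execute(); }
                    finally { allDone.Signal(); }
                }
                // An empty queue means this tick has nothing to do.
            };

            // First tick fires immediately, later ticks every 'interval'.
            using (var timer = new Timer(tick, null, TimeSpan.Zero, interval))
            {
                allDone.Wait();
            } // disposing the timer turns it off once the queue is drained
        }
    }
}
```

For the question's scenario, the call would be something like `TimedRunner.Run(queue, TimeSpan.FromMilliseconds(500));`.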
You could use Enumerable.Aggregate instead.
var task = tasks.Aggregate((t1, t2) =>
    t1.ContinueWith(_ =>
    { Thread.Sleep(500); return t2.Result; }));
If you don't want the tasks chained then there is also the overload to Select assuming the tasks are in order of delay.
var tasks = Enumerable
.Range(1, 10)
.Select(x => Task.Run(() => x * 2))
.Select((x, i) => Task.Delay(TimeSpan.FromMilliseconds(i * 500))
.ContinueWith(_ => x.Result));
foreach(var result in tasks.Select(x => x.Result))
{
Console.WriteLine(result);
}
From the comments, a better option would be to guard the resource instead of using a time delay.
static object Locker = new object();
static int GetResultFromResource(int arg)
{
lock(Locker)
{
Thread.Sleep(500);
return arg * 2;
}
}
var tasks = Enumerable
.Range(1, 10)
.Select(x => Task.Run(() => GetResultFromResource(x)));
foreach(var result in tasks.Select(x => x.Result))
{
Console.WriteLine(result);
}
In this case how about a Producer-Consumer pattern with a BlockingCollection<T>?
var tasks = new BlockingCollection<ITask>();
// add tasks, if this is an expensive process, put it out onto a Task
// tasks.Add(x);
// we're done producin' (allows GetConsumingEnumerable to finish)
tasks.CompleteAdding();
RunTasks(tasks);
With a single consumer thread:
static void RunTasks(BlockingCollection<ITask> tasks)
{
foreach (var task in tasks.GetConsumingEnumerable())
{
task.Execute();
// this may not be as accurate as you would like
Thread.Sleep(500);
}
}
If you have access to .Net 4.5 you can use Task.Delay:
static void RunTasks(BlockingCollection<ITask> tasks)
{
    foreach (var task in tasks.GetConsumingEnumerable())
    {
        // ContinueWith passes the antecedent task to the delegate
        Task.Delay(500)
            .ContinueWith(_ => task.Execute())
            .Wait();
    }
}

ReactiveExtensions Observable FromAsync calling twice Function

OK, I'm trying to understand Rx and I'm kind of lost here.
FromAsyncPattern is now deprecated, so I took the example from here (section "Light up Task with Rx"), and it works. I just made a few changes: instead of using await, I wait on the observable and subscribe to it.
What I don't understand is: why is the function SumSquareRoots called twice?
var res = Observable.FromAsync(ct => SumSquareRoots(x, ct))
.Timeout(TimeSpan.FromSeconds(5));
res.Subscribe(y => Console.WriteLine(y));
res.Wait();
class Program
{
    static void Main(string[] args)
    {
        Samples();
    }

    static void Samples()
    {
        var x = 100000000;
        try
        {
            var res = Observable.FromAsync(ct => SumSquareRoots(x, ct))
                .Timeout(TimeSpan.FromSeconds(5));
            res.Subscribe(y => Console.WriteLine(y));
            res.Wait();
        }
        catch (TimeoutException)
        {
            Console.WriteLine("Timed out :-(");
        }
    }

    static Task<double> SumSquareRoots(long count, CancellationToken ct)
    {
        return Task.Run(() =>
        {
            var res = 0.0;
            Console.WriteLine("Why I'm called twice");
            for (long i = 0; i < count; i++)
            {
                res += Math.Sqrt(i);
                if (i % 10000 == 0 && ct.IsCancellationRequested)
                {
                    Console.WriteLine("Noticed cancellation!");
                    ct.ThrowIfCancellationRequested();
                }
            }
            return res;
        });
    }
}
The reason that this is calling SumSquareRoots twice is because you're Subscribing twice:
// Subscribes to res
res.Subscribe(y => Console.WriteLine(y));
// Also Subscribes to res, since it *must* produce a result, even
// if that result is then discarded (i.e. Wait doesn't return IObservable)
res.Wait();
Subscribe is the foreach of Rx: just as you could end up doing twice the work if you foreach an IEnumerable twice, multiple Subscribes mean multiple times the work. To undo this, you could use a blocking call that doesn't discard the result:
Console.WriteLine(res.First());
Or, you could use Publish to "freeze" the result and play it back to > 1 subscriber (kind of like how you'd use ToArray in LINQ):
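// Publish returns an IConnectableObservable, so keep it in its own variable
var published = res.Publish();
published.Connect();
// Both subscriptions get the same result, SumSquareRoots is only called once
published.Subscribe(Console.WriteLine);
published.Wait();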
The general rule you can follow is, that any Rx method that doesn't return IObservable<T> or Task<T> will result in a Subscription(*)
* - Not technically correct. But your brain will feel better if you think of it this way.
