How do I prevent my Rx test from hanging? - c#

I am reproducing my Rx issue with the simplified test case below. The test hangs. I am sure I am missing something small but fundamental, but I can't put my finger on it.
public class Service
{
    private ISubject<double> _subject = new Subject<double>();
    public void Reset()
    {
        _subject.OnNext(0.0);
    }
    public IObservable<double> GetProgress()
    {
        return _subject;
    }
}
public class ObTest
{
    [Fact]
    private async Task SimpleTest()
    {
        var service = new Service();
        var result = service.GetProgress().Take(1);
        var task = Task.Run(async () =>
        {
            service.Reset();
        });
        await result;
    }
}
UPDATE
My attempt above was to simplify the problem a little and understand it. In my case GetProgress() is a merge of various Observables that publish the download progress. One of these Observables is a Subject<double> that publishes 0 every time somebody calls a method to delete the download.
The race condition identified by Enigmativity and Theodor Zoulias may(??) happen in real life: I display a view which attempts to get the progress, but quick fingers delete the download just in time.
What I need to understand a bit more is what happens if the download is started again (by now the subscription has taken place, because the view that was displayed has already subscribed) and somebody deletes it again.
public class Service
{
private ISubject<double> _deleteSubject = new Subject<double>();
public void Reset()
{
_deleteSubject.OnNext(0.0);
}
public IObservable<double> GetProgress()
{
return _deleteSubject.Merge(downloadProgress);
}
}

Your code isn't hanging. It's awaiting an observable that sometimes never gets a value.
You have a race condition.
The Task.Run is sometimes executing to completion before the await result creates the subscription to the observable - so it never sees the value.
Try this code instead:
private async Task SimpleTest()
{
    var service = new Service();
    var result = service.GetProgress().Take(1);
    var awaiter = result.GetAwaiter();
    var task = Task.Run(() =>
    {
        service.Reset();
    });
    await awaiter;
}

The line await result creates a subscription to the observable. The problem is that the notification _subject.OnNext(0.0) may occur before this subscription, in which case the value passes unobserved and await result keeps waiting for a notification forever. In this particular example the notification is always missed, at least on my PC, because the subscription is delayed by around 30 msec (measured with a Stopwatch). That is longer than the time the task that resets the service needs to complete, probably because the JIT compiler must load and compile some Rx-related assembly. The situation changes when I do a warm-up by calling new Subject<int>().FirstAsync().Subscribe() before running the example. In that case the notification is almost always observed, and the hang is avoided.
I can think of two robust solutions to this problem:
1. The solution suggested by Enigmativity: create an awaitable subscription before starting the task that resets the service. This can be done with either GetAwaiter or ToTask.
2. Use a ReplaySubject<T> instead of a plain vanilla Subject<T>. The documentation describes it as follows:
"Represents an object that is both an observable sequence as well as an observer. Each notification is broadcasted to all subscribed and future observers, subject to buffer trimming policies."
The ReplaySubject will cache the value so that it can be observed by the future subscription, eliminating the race condition. You could initialize it with a bufferSize of 1 to minimize the memory footprint of the buffer.
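As a minimal sketch of the ReplaySubject approach, assuming the System.Reactive NuGet package is referenced: the names ReplayService and ReplayDemo are made up for illustration and are not part of the question. The value is published before anyone subscribes, which is exactly the ordering that made the original test wait forever.

```csharp
using System;
using System.Reactive.Linq;
using System.Reactive.Subjects;
using System.Threading.Tasks;

// Hypothetical variant of the Service class from the question.
public class ReplayService
{
    // A buffer of 1 keeps only the most recent notification for late subscribers.
    private readonly ISubject<double> _subject = new ReplaySubject<double>(1);

    public void Reset() => _subject.OnNext(0.0);

    public IObservable<double> GetProgress() => _subject;
}

public static class ReplayDemo
{
    public static async Task<double> RunAsync()
    {
        var service = new ReplayService();
        var result = service.GetProgress().Take(1);

        // Publish before any subscription exists; the value is buffered.
        service.Reset();

        // Awaiting subscribes now, and the buffered value is replayed,
        // so Take(1) completes instead of waiting for a later notification.
        return await result;
    }
}
```

Keep in mind that this changes the semantics for every subscriber: anyone who subscribes later will first receive the last published value, which may or may not be what a progress view wants.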

Related

Unexpected values for AsyncLocal.Value when mixing ExecutionContext.SuppressFlow and tasks

In an application I am experiencing odd behavior due to wrong/unexpected values of AsyncLocal: although I suppressed the flow of the execution context, the AsyncLocal.Value property is sometimes not reset within the execution scope of a newly spawned Task.
Below I created a minimal reproducible sample which demonstrates the problem:
private static readonly AsyncLocal<object> AsyncLocal = new AsyncLocal<object>();
[TestMethod]
public void Test()
{
Trace.WriteLine(System.Runtime.InteropServices.RuntimeInformation.FrameworkDescription);
var mainTask = Task.Factory.StartNew(() =>
{
AsyncLocal.Value = "1";
Task anotherTask;
using (ExecutionContext.SuppressFlow())
{
anotherTask = Task.Run(() =>
{
Trace.WriteLine(AsyncLocal.Value); // "1" <- ???
Assert.IsNull(AsyncLocal.Value); // BOOM - FAILS
AsyncLocal.Value = "2";
});
}
Task.WaitAll(anotherTask);
});
mainTask.Wait(500000, CancellationToken.None);
}
In nine out of ten runs (on my pc) the outcome of the Test-method is:
.NET 6.0.2
"1"
-> The test fails
As you can see, the test fails because within the action executed by Task.Run the previous value is still present in AsyncLocal.Value (Message: 1).
My concrete questions are:
Why does this happen?
I suspect this happens because Task.Run may use the current thread to execute the work load. In that case, I assume lack of async/await-operators does not force the creation of a new/separate ExecutionContext for the action. Like Stephen Cleary said "from the logical call context’s perspective, all synchronous invocations are “collapsed” - they’re actually part of the context of the closest async method further up the call stack". If that’s the case I do understand why the same context is used within the action.
Is this the correct explanation for this behavior? In addition, why does it work flawlessly sometimes (about 1 run out of 10 on my machine)?
How can I fix this?
Assuming that my theory above is true it should be enough to forcefully introduce a new async "layer", like below:
private static readonly AsyncLocal<object> AsyncLocal = new AsyncLocal<object>();
[TestMethod]
public void Test()
{
Trace.WriteLine(System.Runtime.InteropServices.RuntimeInformation.FrameworkDescription);
var mainTask = Task.Factory.StartNew(() =>
{
AsyncLocal.Value = "1";
Task anotherTask;
using (ExecutionContext.SuppressFlow())
{
var wrapper = () =>
{
Trace.WriteLine(AsyncLocal.Value);
Assert.IsNull(AsyncLocal.Value);
AsyncLocal.Value = "2";
return Task.CompletedTask;
};
anotherTask = Task.Run(async () => await wrapper());
}
Task.WaitAll(anotherTask);
});
mainTask.Wait(500000, CancellationToken.None);
}
This seems to fix the problem (it consistently works on my machine), but I want to be sure that this is a correct fix for this problem.
Many thanks in advance
Why does this happen? I suspect this happens because Task.Run may use the current thread to execute the work load.
I suspect that it happens because Task.WaitAll will use the current thread to execute the task inline.
Specifically, Task.WaitAll calls Task.WaitAllCore, which will attempt to run it inline by calling Task.WrappedTryRunInline. I'm going to assume the default task scheduler is used throughout. In that case, this will invoke TaskScheduler.TryRunInline, which will return false if the delegate is already invoked. So, if the task has already started running on a thread pool thread, this will return back to WaitAllCore, which will just do a normal wait, and your code will work as expected (1 out of 10).
If a thread pool thread hasn't picked it up yet (9 out of 10), then TaskScheduler.TryRunInline will call TaskScheduler.TryExecuteTaskInline, the default implementation of which will call Task.ExecuteEntryUnsafe, which calls Task.ExecuteWithThreadLocal. Task.ExecuteWithThreadLocal has logic for applying an ExecutionContext if one was captured. Assuming none was captured, the task's delegate is just invoked directly.
So, it seems like each step is behaving logically. Technically, what ExecutionContext.SuppressFlow means is "don't capture the ExecutionContext", and that is what is happening. It doesn't mean "clear the ExecutionContext". Sometimes the task is run on a thread pool thread (without the captured ExecutionContext), and WaitAll will just wait for it to complete. Other times the task will be executed inline by WaitAll instead of a thread pool thread, and in that case the ExecutionContext is not cleared (and technically isn't captured, either).
You can test this theory by capturing the current thread id within your wrapper and comparing it to the thread id doing the Task.WaitAll. I expect that they will be the same thread for the runs where the async local value is (unexpectedly) inherited, and they will be different threads for the runs where the async local value works as expected.
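A rough way to run that check without a test framework might look like the sketch below. The InlineCheck class and its member names are hypothetical, and the result differs from run to run because it depends on whether a thread pool thread picks the task up before WaitAll does.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class InlineCheck
{
    private static readonly AsyncLocal<object> Local = new AsyncLocal<object>();

    // Returns the id of the thread that waits, the id of the thread that
    // actually ran the inner task, and the value the inner task observed.
    public static (int WaiterThreadId, int WorkerThreadId, object Observed) Run()
    {
        return Task.Factory.StartNew(() =>
        {
            Local.Value = "1";
            int workerThreadId = -1;
            object observed = null;
            Task anotherTask;
            using (ExecutionContext.SuppressFlow())
            {
                anotherTask = Task.Run(() =>
                {
                    workerThreadId = Environment.CurrentManagedThreadId;
                    observed = Local.Value;
                });
            }
            int waiterThreadId = Environment.CurrentManagedThreadId;
            // WaitAll may execute anotherTask inline on this very thread.
            Task.WaitAll(anotherTask);
            return (waiterThreadId, workerThreadId, observed);
        }).Result;
    }
}
```

Calling InlineCheck.Run() repeatedly and printing the tuple should show whether the runs that observe "1" are the ones where both thread ids match.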
If you can, I'd first consider whether it's possible to replace the thread-specific caches with a single shared cache. The app likely predates useful types such as ConcurrentDictionary.
If it isn't possible to use a singleton cache, then you can use a stack of async local values. Stacking async local values is a common pattern. I prefer wrapping the stack logic into a separate type (AsyncLocalValue in the code below):
public sealed class AsyncLocalValue
{
private static readonly AsyncLocal<ImmutableStack<object>> _asyncLocal = new();
public object Value => _asyncLocal.Value?.Peek();
public IDisposable PushValue(object value)
{
var originalValue = _asyncLocal.Value;
var newValue = (originalValue ?? ImmutableStack<object>.Empty).Push(value);
_asyncLocal.Value = newValue;
return Disposable.Create(() => _asyncLocal.Value = originalValue);
}
}
private static AsyncLocalValue AsyncLocal = new();
[TestMethod]
public void Test()
{
Console.WriteLine(System.Runtime.InteropServices.RuntimeInformation.FrameworkDescription);
var mainTask = Task.Factory.StartNew(() =>
{
Task anotherTask;
using (AsyncLocal.PushValue("1"))
{
using (AsyncLocal.PushValue(null))
{
anotherTask = Task.Run(() =>
{
Console.WriteLine("Observed: " + AsyncLocal.Value);
using (AsyncLocal.PushValue("2"))
{
}
});
}
}
Task.WaitAll(anotherTask);
});
mainTask.Wait(500000, CancellationToken.None);
}
This code sample uses Disposable.Create from my Nito.Disposables library.

Should/Could this "recursive Task" be expressed as a TaskContinuation?

In my application I need to continually process some piece(s) of Work at some set interval(s). I had originally written a Task that continually checked a given Task.Delay to see if it had completed; if so, the Work corresponding to that Task.Delay would be processed. The drawback of this method is that the Task that checks these Task.Delays would be in a pseudo-infinite loop when no Task.Delay is completed.
To solve this problem I found that I could create a "recursive Task" (I am not sure what the jargon for this would be) that processes the work at the given interval as needed.
// New Recurring Work can be added by simply creating
// the Task below and adding an entry into this Dictionary.
// Recurring Work can be removed/stopped by looking
// it up in this Dictionary and calling its CTS.Cancel method.
private readonly object _LockRecurWork = new object();
private Dictionary<Work, Tuple<Task, CancellationTokenSource>> RecurringWork { get; set; }
...
private Task CreateRecurringWorkTask(Work workToDo, CancellationTokenSource taskTokenSource)
{
return Task.Run(async () =>
{
// Do the Work, then wait the prescribed amount of time before doing it again
DoWork(workToDo);
await Task.Delay(workToDo.RecurRate, taskTokenSource.Token);
// If this Work's CancellationTokenSource is not
// cancelled then "schedule" the next Work execution
if (!taskTokenSource.IsCancellationRequested)
{
lock(_LockRecurWork)
{
RecurringWork[workToDo] = new Tuple<Task, CancellationTokenSource>
(CreateRecurringWorkTask(workToDo, taskTokenSource), taskTokenSource);
}
}
}, taskTokenSource.Token);
}
Should/Could this be represented with a chain of Task.ContinueWith? Would there be any benefit to such an implementation? Is there anything majorly wrong with the current implementation?
Yes!
Calling ContinueWith tells the Task to call your code as soon as it finishes. This is far faster than manually polling it.
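One way the continuation chain could be written is sketched below. This is only an illustration under assumptions: RecurringWorkSketch and Schedule are hypothetical names, a plain Action stands in for DoWork(workToDo), and the dictionary bookkeeping from the question is left out.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RecurringWorkSketch
{
    // Runs the work once, then schedules itself again after the interval.
    // The first run happens on the caller's thread, later runs on the pool.
    public static Task Schedule(Action work, TimeSpan interval, CancellationToken token)
    {
        work();
        return Task.Delay(interval, token)
            .ContinueWith(
                _ => Schedule(work, interval, token),
                token,
                // Do not reschedule if the delay was cancelled.
                TaskContinuationOptions.OnlyOnRanToCompletion,
                TaskScheduler.Default)
            // Flatten Task<Task> so callers see a single Task.
            .Unwrap();
    }
}
```

The returned task ends in the cancelled state once the token is cancelled. Because each round wraps the next one, the chain keeps growing for as long as the work recurs, so for very long-running schedules a simple async loop around Task.Delay may be the leaner choice.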

Non blocking and reoccurring producer/consumer notifier implementation

I searched hard for a piece of code which does what I want and which I am happy with. Reading this and this helped a lot.
I have a scenario where I need a single consumer to be notified by a single producer when new data is available, but I would also like the consumer to be notified periodically regardless of whether new data is available.
It is fine if the consumer is notified more often than the recurring period, but it should not be notified less frequently.
It is possible that multiple notifications for 'new data' occur while the consumer is already notified and working. (So SemaphoreSlim was not a good fit.)
Hence, a consumer which is slower than the rate of producer notifications would not queue up subsequent notifications; they would just "re-signal" that same "data available" flag without effect.
I would also like the consumer to asynchronously wait for the notifications (without blocking a thread).
I have stitched together the below class which wraps around TaskCompletionSource and also uses an internal Timer.
public class PeriodicalNotifier : IDisposable
{
// Need some dummy type since TaskCompletionSource has only the generic version
internal struct VoidTypeStruct { }
// Always reuse this allocation
private static VoidTypeStruct dummyStruct;
private TaskCompletionSource<VoidTypeStruct> internalCompletionSource;
private Timer reSendTimer;
public PeriodicalNotifier(int autoNotifyIntervalMs)
{
internalCompletionSource = new TaskCompletionSource<VoidTypeStruct>();
reSendTimer = new Timer(_ => Notify(), null, 0, autoNotifyIntervalMs);
}
public async Task WaitForNotifictionAsync(CancellationToken cancellationToken)
{
using (cancellationToken.Register(() => internalCompletionSource.TrySetCanceled()))
{
await internalCompletionSource.Task;
// Recreate - to be able to set again upon the next wait
internalCompletionSource = new TaskCompletionSource<VoidTypeStruct>();
}
}
public void Notify()
{
internalCompletionSource.TrySetResult(dummyStruct);
}
public void Dispose()
{
reSendTimer.Dispose();
internalCompletionSource.TrySetCanceled();
}
}
Users of this class can do something like this:
private PeriodicalNotifier notifier = new PeriodicalNotifier(100);
// ... In some task - which should be non-blocking
while (some condition)
{
await notifier.WaitForNotifictionAsync(_tokenSource.Token);
// Do some work...
}
// ... In some thread, producer added new data
notifier.Notify();
Efficiency is important to me, since the scenario is a high-frequency data stream, and so I had in mind:
The non-blocking nature of the wait.
I assume Timer is more efficient than recreating Task.Delay and cancelling it if it's not the one to notify.
A concern about the recreation of the TaskCompletionSource.
My questions are:
Does my code correctly solve the problem? Any hidden pitfalls?
Am I missing some trivial solution / existing block for this use case?
Update:
I have reached the conclusion that, aside from reimplementing a leaner task-completion structure (like in here and here), I have no more optimizations to make. I hope that helps anyone looking at a similar scenario.
Yes, your implementation makes sense but the TaskCompletionSource recreation should be outside the using scope, otherwise the "old" cancellation token may cancel the "new" TaskCompletionSource.
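A sketch of the question's class with that change applied might look as follows. PeriodicalNotifierSketch and NewSource are hypothetical names, and bool replaces the dummy struct purely for brevity. Besides moving the recreation out of the using block, the sketch captures the current source in a local so the cancellation callback can never reach a newer one.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public sealed class PeriodicalNotifierSketch : IDisposable
{
    private TaskCompletionSource<bool> _source = NewSource();
    private readonly Timer _timer;

    public PeriodicalNotifierSketch(int autoNotifyIntervalMs)
    {
        _timer = new Timer(_ => Notify(), null, 0, autoNotifyIntervalMs);
    }

    private static TaskCompletionSource<bool> NewSource() =>
        new TaskCompletionSource<bool>(TaskCreationOptions.RunContinuationsAsynchronously);

    public async Task WaitForNotificationAsync(CancellationToken cancellationToken)
    {
        // Capture the source this wait belongs to, so the cancellation
        // callback can only cancel this one and never a later one.
        var current = Volatile.Read(ref _source);
        using (cancellationToken.Register(() => current.TrySetCanceled()))
        {
            await current.Task.ConfigureAwait(false);
        }
        // The registration is disposed by now. Swap in a fresh source
        // for the next wait, unless it has already been swapped.
        Interlocked.CompareExchange(ref _source, NewSource(), current);
    }

    public void Notify() => Volatile.Read(ref _source).TrySetResult(true);

    public void Dispose()
    {
        _timer.Dispose();
        Volatile.Read(ref _source).TrySetCanceled();
    }
}
```

As in the original, a Notify that lands between the completed await and the swap is absorbed by the already completed source; the periodic timer bounds how long such a missed signal can go unnoticed.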
I think using some kind of AsyncManualResetEvent combined with a Timer would be simpler and less error-prone. There's a very nice namespace with async tools in the Visual Studio SDK by Microsoft. You need to install the SDK and then reference the Microsoft.VisualStudio.Threading assembly. Here's an implementation using their AsyncManualResetEvent with the same API:
public class PeriodicalNotifier : IDisposable
{
private readonly Timer _timer;
private readonly AsyncManualResetEvent _asyncManualResetEvent;
public PeriodicalNotifier(TimeSpan autoNotifyInterval)
{
_asyncManualResetEvent = new AsyncManualResetEvent();
_timer = new Timer(_ => Notify(), null, TimeSpan.Zero, autoNotifyInterval);
}
public async Task WaitForNotifictionAsync(CancellationToken cancellationToken)
{
await _asyncManualResetEvent.WaitAsync().WithCancellation(cancellationToken);
_asyncManualResetEvent.Reset();
}
public void Notify()
{
_asyncManualResetEvent.Set();
}
public void Dispose()
{
_timer.Dispose();
}
}
You notify by setting the reset event, asynchronously wait using WaitAsync, enable Cancellation using the WithCancellation extension method and then reset the event. Multiple notifications are "merged" by setting the same reset event.
Subject<Result> notifier = new Subject<Result>();
notifier
    .Select(value => Observable.Interval(TimeSpan.FromMilliseconds(100))
        .Select(_ => value))
    .Switch()
    .Subscribe(value => DoSomething(value));
// Some other thread...
notifier.OnNext(...);
This Rx query will keep sending value, every 100 milliseconds, until a new value turns up. Then we notify that value every 100 milliseconds.
If we receive values faster than once every 100 milliseconds, then we basically have the same output as input.

Rx and async nunit test

I'm trying to create an async unit test for the project, but cannot understand how to wait for the async subject to complete:
[Test]
public async void MicroTest()
{
var value = 2;
var first = new AsyncSubject<int>();
var second = new AsyncSubject<int>();
first.Subscribe(_ =>
{
value = _;
second.OnCompleted();
});
first.OnNext(1);
// how to wait for the second subject to complete?
Assert.AreEqual(value, 1);
}
The sync version of this test works well:
[Test]
public void MicroTest()
{
var value = 2;
var first = new Subject<int>();
var second = new Subject<int>();
first.Subscribe(_ =>
{
value = _;
second.OnCompleted();
});
first.OnNext(1);
Assert.AreEqual(value, 1);
}
AsyncSubject versus Subject
First off, it's worth pointing out that AsyncSubject<T> is not an asynchronous version of Subject<T>. Both are in fact free-threaded* (see footnote).
AsyncSubject is a specialization of Subject intended to be used to model an operation that completes asynchronously and returns a single result. It has two noteworthy features:
Only the last result is published
The result is cached and is available to observers subscribing after it has completed.
It is used internally in various places, including by the ToObservable() extension method defined on Task and Task<T>.
The issue with the test
Recall AsyncSubject<T> will only return the final result received. It does this by waiting for OnCompleted() so it knows what the final result is. Because you do not call OnCompleted() on first your test is flawed as the OnNext() handler - the lambda function passed in your Subscribe call - will never be invoked.
Additionally, it is invalid not to call OnNext() at least once on an AsyncSubject<T>, so when you call await second; you will get an InvalidOperationException if you haven't done this.
If you write your test as follows, all is well:
[Test]
public async void MicroTest()
{
var value = 2;
var first = new AsyncSubject<int>();
var second = new AsyncSubject<int>();
first.Subscribe(_ =>
{
// won't be called until an OnCompleted() has
// been invoked on first
value = _;
// you must send *some* value to second
second.OnNext(_);
second.OnCompleted();
});
first.OnNext(1);
// you must do this for OnNext handler to be called
first.OnCompleted();
// how to wait for the second subject to complete
await second;
Assert.AreEqual(value, 1);
}
About asynchronous tests
As a general rule I would avoid writing asynchronous tests that could wait forever. This gets particularly annoying when it causes resource drains on build servers. Use some kind of timeout e.g:
await second.Timeout(TimeSpan.FromSeconds(1));
No need to handle the exception since that is enough for the test to fail.
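To make the effect of Timeout visible outside a test runner, the following sketch (assuming the System.Reactive package; TimeoutSketch is a hypothetical name) awaits a subject that never completes and catches the resulting exception. In a real test you would simply let the exception propagate.

```csharp
using System;
using System.Reactive.Linq;
using System.Reactive.Subjects;
using System.Threading.Tasks;

public static class TimeoutSketch
{
    public static async Task<bool> TimesOutAsync()
    {
        // Never receives OnCompleted, so awaiting it directly would wait forever.
        var never = new AsyncSubject<int>();
        try
        {
            await never.Timeout(TimeSpan.FromMilliseconds(100));
            return false;
        }
        catch (TimeoutException)
        {
            // Timeout signals OnError with a TimeoutException, which await rethrows.
            return true;
        }
    }
}
```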
* I've borrowed this term from the COM lexicon. In this sense I mean that they, as with most of the Rx framework components, will generally run on whatever thread you happen to invoke their methods on. Being free-threaded doesn't necessarily mean being fully thread safe though. In particular, unlike AsyncSubject<T>, Subject<T> doesn't protect you from the Rx grammar violation of making overlapping calls to OnNext. Use Subject.Synchronize or Observable.Synchronize for this protection.

How do I know when it's safe to call Dispose?

I have a search application that takes some time (10 to 15 seconds) to return results for some requests. It's not uncommon to have multiple concurrent requests for the same information. As it stands, I have to process those independently, which makes for quite a bit of unnecessary processing.
I've come up with a design that should allow me to avoid the unnecessary processing, but there's one lingering problem.
Each request has a key that identifies the data being requested. I maintain a dictionary of requests, keyed by the request key. The request object has some state information and a WaitHandle that is used to wait on the results.
When a client calls my Search method, the code checks the dictionary to see if a request already exists for that key. If so, the client just waits on the WaitHandle. If no request exists, I create one, add it to the dictionary, and issue an asynchronous call to get the information. Again, the code waits on the event.
When the asynchronous process has obtained the results, it updates the request object, removes the request from the dictionary, and then signals the event.
This all works great. Except I don't know when to dispose of the request object. That is, since I don't know when the last client is using it, I can't call Dispose on it. I have to wait for the garbage collector to come along and clean up.
Here's the code:
class SearchRequest: IDisposable
{
public readonly string RequestKey;
public string Results { get; set; }
public ManualResetEvent WaitEvent { get; private set; }
public SearchRequest(string key)
{
RequestKey = key;
WaitEvent = new ManualResetEvent(false);
}
public void Dispose()
{
WaitEvent.Dispose();
GC.SuppressFinalize(this);
}
}
ConcurrentDictionary<string, SearchRequest> Requests = new ConcurrentDictionary<string, SearchRequest>();
string Search(string key)
{
SearchRequest req;
bool addedNew = false;
req = Requests.GetOrAdd(key, (s) =>
{
// Create a new request.
var r = new SearchRequest(s);
Console.WriteLine("Added new request with key {0}", key);
addedNew = true;
return r;
});
if (addedNew)
{
// A new request was created.
// Start a search.
ThreadPool.QueueUserWorkItem((obj) =>
{
// Get the results
req.Results = DoSearch(req.RequestKey); // DoSearch takes several seconds
// Remove the request from the pending list
SearchRequest trash;
Requests.TryRemove(req.RequestKey, out trash);
// And signal that the request is finished
req.WaitEvent.Set();
});
}
Console.WriteLine("Waiting for results from request with key {0}", key);
req.WaitEvent.WaitOne();
return req.Results;
}
Basically, I don't know when the last client will be released. No matter how I slice it here, I have a race condition. Consider:
Thread A creates a new request, starts Thread B, and waits on the wait handle.
Thread B Begins processing the request.
Thread C detects that there's a pending request, and then gets swapped out.
Thread B Completes the request, removes the item from the dictionary, and sets the event.
Thread A's wait is satisfied, and it returns the result.
Thread C wakes up, calls WaitOne, is released, and returns the result.
If I use some kind of reference counting so that the "last" client calls Dispose, then the object would be disposed by Thread A in the above scenario. Thread C would then die when it tried to wait on the disposed WaitHandle.
The only way I can see to fix this is to use a reference counting scheme and protect access to the dictionary with a lock (in which case using ConcurrentDictionary is pointless) so that a lookup is always accompanied by an increment of the reference count. Whereas that would work, it seems like an ugly hack.
Another solution would be to ditch the WaitHandle and use an event-like mechanism with callbacks. But that, too, would require me to protect the lookups with a lock, and I have the added complication of dealing with an event or a naked multicast delegate. That seems like a hack, too.
This probably isn't a problem currently, because this application doesn't yet get enough traffic for those abandoned handles to add up before the next GC pass comes and cleans them up. And maybe it won't ever be a problem? It worries me, though, that I'm leaving them to be cleaned up by the GC when I should be calling Dispose to get rid of them.
Ideas? Is this a potential problem? If so, do you have a clean solution?
Consider using Lazy<T> for SearchRequest.Results maybe? But that would probably entail a bit of redesign. Haven't thought this out completely.
But what would probably be almost a drop-in replacement for your use case is to implement your own Wait() and Set() methods in SearchRequest. Something like:
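private readonly object _resultLock = new object();
private bool _hasResult;
void Wait()
{
    lock (_resultLock)
    {
        // Re-check the flag after every wake-up
        while (!_hasResult)
            Monitor.Wait(_resultLock);
    }
}
void Set(string results)
{
    lock (_resultLock)
    {
        Results = results;
        _hasResult = true;
        // Release every thread currently waiting for the result
        Monitor.PulseAll(_resultLock);
    }
}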
No need to dispose. :)
I think that your best bet to make this work is to use the TPL for all of your multi-threading needs. That's what it is good at.
As per my comment on your question, you need to keep in mind that ConcurrentDictionary does have side-effects. If multiple threads try to call GetOrAdd at the same time then the factory can be invoked for all of them, but only one will win. The values produced for the other threads will just be discarded, however by then the compute has been done.
Since you also said that doing searches is expensive, the cost of taking a lock and then using a standard dictionary would be minimal.
So this is what I suggest:
private Dictionary<string, Task<string>> _requests
= new Dictionary<string, Task<string>>();
public string Search(string key)
{
Task<string> task;
lock (_requests)
{
if (_requests.ContainsKey(key))
{
task = _requests[key];
}
else
{
task = Task<string>
.Factory
.StartNew(() => DoSearch(key));
_requests[key] = task;
task.ContinueWith(t =>
{
lock(_requests)
{
_requests.Remove(key);
}
});
}
}
return task.Result;
}
This option nicely runs the search, remembers the task throughout the duration of the search and then removes it from the dictionary when it completes. All requests for the same key while a search is executing get the same task and so will get the same result once the task is complete.
I've tested the code and it works.
