The difference between Rx Throttle(...).ObserveOn(scheduler) and Throttle(..., scheduler)

I have the following code:
IDisposable subscription = myObservable.Throttle(TimeSpan.FromMilliseconds(50), RxApp.MainThreadScheduler)
.Subscribe(_ => UpdateUi());
As expected, UpdateUi() will always execute on the main thread. When I change the code to
IDisposable subscription = myObservable.Throttle(TimeSpan.FromMilliseconds(50))
.ObserveOn(RxApp.MainThreadScheduler)
.Subscribe(_ => UpdateUi());
UpdateUi() will be executed on a background thread.
Why isn't Throttle(...).ObserveOn(scheduler) equivalent to Throttle(..., scheduler)?

In both of the code examples you've given, UpdateUi will always be invoked on the scheduler specified by RxApp.MainThreadScheduler. I can say this with some certainty since ObserveOn is a decorator that ensures the OnNext handler of subscribers is called on the specified scheduler. See here for an in-depth analysis.
So that said, this is a bit puzzling. Either RxApp.MainThreadScheduler is not referring to the correct dispatcher scheduler, or UpdateUi is transitioning off the dispatcher thread. The former is not unprecedented - see https://github.com/reactiveui/ReactiveUI/issues/768 where others have run into this. I have no idea what the issue was in that case. Perhaps @PaulBetts can weigh in, or you could raise an issue at https://github.com/reactiveui/. Whatever the case, I would carefully check your assumptions here, since I would expect this to be a well-tested area. Do you have a complete repro?
As to your specific question, the difference between Throttle(...).ObserveOn(scheduler) and Throttle(..., scheduler) is as follows:
In the first case, when Throttle is specified without a scheduler, it will use the default platform scheduler to introduce the concurrency necessary to run its timer - on WPF this would use a thread pool thread. So all the throttling will be done on a background thread and, due to the following ObserveOn, only the released events will be passed to the subscriber on the specified scheduler.
In the case where Throttle specifies a scheduler, the throttling is done on that scheduler - both suppressed events and released events will be managed on that scheduler and the subscriber will be called on that same scheduler too.
So either way, the UpdateUi will be called on the RxApp.MainThreadScheduler.
You are best off throttling UI events on the dispatcher in most cases, since it's generally more costly to run separate timers on a background thread and pay for the context switch if only a fraction of events are going to make it through the throttle.
So, just to check you haven't run into an issue with RxApp.MainThreadScheduler, I would try specifying the scheduler or SynchronizationContext explicitly via another means. How to do this will depend on the platform you are on - ObserveOnDispatcher() is hopefully available, or use a suitable ObserveOn overload. There are options for controls, SynchronizationContexts and schedulers, provided the correct Rx libraries are imported.

After some investigation I believe this is caused by a different version of Rx being used at run time than I expected (I develop a plugin for a third-party application).
I'm not sure why, but it seems that the default RxApp.MainThreadScheduler fails to initialize correctly. The default instance is a WaitForDispatcherScheduler (source). All functions in this class rely on attemptToCreateScheduler:
IScheduler attemptToCreateScheduler()
{
    if (_innerScheduler != null) return _innerScheduler;
    try {
        _innerScheduler = _schedulerFactory();
        return _innerScheduler;
    } catch (Exception) {
        // NB: Dispatcher's not ready yet. Keep using CurrentThread
        return CurrentThreadScheduler.Instance;
    }
}
What seems to happen in my case is that _schedulerFactory() throws, so CurrentThreadScheduler.Instance is returned instead.
After manually initializing RxApp.MainThreadScheduler to new SynchronizationContextScheduler(SynchronizationContext.Current), the behavior is as expected.

I've just bumped into an issue that first led me to this question and then to some experimenting.
It turns out, Throttle(timeSpan, scheduler) is clever enough to "cancel" an already scheduled debounced event X, in case the source emits another event Y before X gets observed. Thus, only Y will be eventually observed (provided it's the last debounced event).
With Throttle(timeSpan).ObserveOn(scheduler), both X and Y will be observed.
So, conceptually, that's an important difference between the two approaches. Sadly, Rx.NET docs are scarce, but I believe this behavior is by design and it makes sense to me.
To illustrate this with an example (fiddle):
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Diagnostics;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Reactive.Subjects;
using static System.Console;
public class Program
{
static async Task ThrottleWithScheduler()
{
WriteLine($"\n{nameof(ThrottleWithScheduler)}\n");
var sc = new CustomSyncContext();
var scheduler = new SynchronizationContextScheduler(sc);
var subj = new BehaviorSubject<string>("A");
subj
.Do(v => WriteLine($"Emitted {v} on {sc.Elapsed}ms"))
.Throttle(TimeSpan.FromMilliseconds(500), scheduler)
.Subscribe(v => WriteLine($"Observed {v} on {sc.Elapsed}ms"));
await Task.Delay(100);
subj.OnNext("B");
await Task.Delay(200);
subj.OnNext("X");
await Task.Delay(550);
subj.OnNext("Y");
await Task.Delay(2000);
WriteLine("Finished!");
}
static async Task ThrottleWithObserveOn()
{
WriteLine($"\n{nameof(ThrottleWithObserveOn)}\n");
var sc = new CustomSyncContext();
var scheduler = new SynchronizationContextScheduler(sc);
var subj = new BehaviorSubject<string>("A");
subj
.Do(v => WriteLine($"Emitted {v} on {sc.Elapsed}ms"))
.Throttle(TimeSpan.FromMilliseconds(500))
.ObserveOn(scheduler)
.Subscribe(v => WriteLine($"Observed {v} on {sc.Elapsed}ms"));
await Task.Delay(100);
subj.OnNext("B");
await Task.Delay(200);
subj.OnNext("X");
await Task.Delay(550);
subj.OnNext("Y");
await Task.Delay(2000);
WriteLine("Finished!");
}
public static async Task Main()
{
await ThrottleWithScheduler();
await ThrottleWithObserveOn();
}
}
class CustomSyncContext : SynchronizationContext
{
private readonly Stopwatch _sw = Stopwatch.StartNew();
public long Elapsed { get { lock (_sw) { return _sw.ElapsedMilliseconds; } } }
public override void Post(SendOrPostCallback d, object? state)
{
WriteLine($"Scheduled on {Elapsed}ms");
Task.Delay(100).ContinueWith(
continuationAction: _ =>
{
WriteLine($"Executed on {Elapsed}ms");
d(state);
},
continuationOptions: TaskContinuationOptions.ExecuteSynchronously);
}
}
Output:
ThrottleWithScheduler
Emitted A on 18ms
Emitted B on 142ms
Emitted X on 351ms
Scheduled on 861ms
Emitted Y on 907ms
Executed on 972ms
Scheduled on 1421ms
Executed on 1536ms
Observed Y on 1539ms
Finished!
ThrottleWithObserveOn
Emitted A on 4ms
Emitted B on 113ms
Emitted X on 315ms
Scheduled on 837ms
Emitted Y on 886ms
Executed on 951ms
Observed X on 953ms
Scheduled on 1391ms
Executed on 1508ms
Observed Y on 1508ms
Finished!

Related

Unexpected values for AsyncLocal.Value when mixing ExecutionContext.SuppressFlow and tasks

In an application I am experiencing odd behavior due to wrong/unexpected values of AsyncLocal: although I suppressed the flow of the execution context, the AsyncLocal.Value property is sometimes not reset within the execution scope of a newly spawned Task.
Below I created a minimal reproducible sample which demonstrates the problem:
private static readonly AsyncLocal<object> AsyncLocal = new AsyncLocal<object>();
[TestMethod]
public void Test()
{
Trace.WriteLine(System.Runtime.InteropServices.RuntimeInformation.FrameworkDescription);
var mainTask = Task.Factory.StartNew(() =>
{
AsyncLocal.Value = "1";
Task anotherTask;
using (ExecutionContext.SuppressFlow())
{
anotherTask = Task.Run(() =>
{
Trace.WriteLine(AsyncLocal.Value); // "1" <- ???
Assert.IsNull(AsyncLocal.Value); // BOOM - FAILS
AsyncLocal.Value = "2";
});
}
Task.WaitAll(anotherTask);
});
mainTask.Wait(500000, CancellationToken.None);
}
In nine out of ten runs (on my pc) the outcome of the Test-method is:
.NET 6.0.2
"1"
-> The test fails
As you can see, the test fails because, within the action executed by Task.Run, the previous value is still present in AsyncLocal.Value (Message: 1).
My concrete questions are:
Why does this happen?
I suspect this happens because Task.Run may use the current thread to execute the work load. In that case, I assume lack of async/await-operators does not force the creation of a new/separate ExecutionContext for the action. Like Stephen Cleary said "from the logical call context’s perspective, all synchronous invocations are “collapsed” - they’re actually part of the context of the closest async method further up the call stack". If that’s the case I do understand why the same context is used within the action.
Is this the correct explanation for this behavior? In addition, why does it work flawlessly sometimes (about 1 run out of 10 on my machine)?
How can I fix this?
Assuming that my theory above is true it should be enough to forcefully introduce a new async "layer", like below:
private static readonly AsyncLocal<object> AsyncLocal = new AsyncLocal<object>();
[TestMethod]
public void Test()
{
Trace.WriteLine(System.Runtime.InteropServices.RuntimeInformation.FrameworkDescription);
var mainTask = Task.Factory.StartNew(() =>
{
AsyncLocal.Value = "1";
Task anotherTask;
using (ExecutionContext.SuppressFlow())
{
var wrapper = () =>
{
Trace.WriteLine(AsyncLocal.Value);
Assert.IsNull(AsyncLocal.Value);
AsyncLocal.Value = "2";
return Task.CompletedTask;
};
anotherTask = Task.Run(async () => await wrapper());
}
Task.WaitAll(anotherTask);
});
mainTask.Wait(500000, CancellationToken.None);
}
This seems to fix the problem (it consistently works on my machine), but I want to be sure that this is a correct fix for this problem.
Many thanks in advance
Why does this happen? I suspect this happens because Task.Run may use the current thread to execute the work load.
I suspect that it happens because Task.WaitAll will use the current thread to execute the task inline.
Specifically, Task.WaitAll calls Task.WaitAllCore, which will attempt to run it inline by calling Task.WrappedTryRunInline. I'm going to assume the default task scheduler is used throughout. In that case, this will invoke TaskScheduler.TryRunInline, which will return false if the delegate is already invoked. So, if the task has already started running on a thread pool thread, this will return back to WaitAllCore, which will just do a normal wait, and your code will work as expected (1 out of 10).
If a thread pool thread hasn't picked it up yet (9 out of 10), then TaskScheduler.TryRunInline will call TaskScheduler.TryExecuteTaskInline, the default implementation of which will call Task.ExecuteEntryUnsafe, which calls Task.ExecuteWithThreadLocal. Task.ExecuteWithThreadLocal has logic for applying an ExecutionContext if one was captured. Assuming none was captured, the task's delegate is just invoked directly.
So, it seems like each step is behaving logically. Technically, what ExecutionContext.SuppressFlow means is "don't capture the ExecutionContext", and that is what is happening. It doesn't mean "clear the ExecutionContext". Sometimes the task is run on a thread pool thread (without the captured ExecutionContext), and WaitAll will just wait for it to complete. Other times the task will be executed inline by WaitAll instead of a thread pool thread, and in that case the ExecutionContext is not cleared (and technically isn't captured, either).
You can test this theory by capturing the current thread id within your wrapper and comparing it to the thread id doing the Task.WaitAll. I expect that they will be the same thread for the runs where the async local value is (unexpectedly) inherited, and they will be different threads for the runs where the async local value works as expected.
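A minimal sketch of that check, assuming the default task scheduler is in use. The names InlineWaitProbe and Ambient are made up for illustration and are not part of the original test:

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

public static class InlineWaitProbe
{
    // Hypothetical stand-in for the AsyncLocal used in the question.
    private static readonly AsyncLocal<string?> Ambient = new AsyncLocal<string?>();

    // Reports whether the inner task ran on the thread that called WaitAll,
    // and which AsyncLocal value the inner task saw.
    public static (bool RanOnWaitingThread, string? Observed) Run()
    {
        return Task.Factory.StartNew(() =>
        {
            Ambient.Value = "1";
            int waiterThreadId = Environment.CurrentManagedThreadId;
            int workerThreadId = -1;
            string? observed = null;
            Task another;
            using (ExecutionContext.SuppressFlow())
            {
                // No ExecutionContext is captured for this task.
                another = Task.Run(() =>
                {
                    workerThreadId = Environment.CurrentManagedThreadId;
                    observed = Ambient.Value;
                });
            }
            // May execute `another` inline on this thread if no pool thread has started it.
            Task.WaitAll(another);
            return (workerThreadId == waiterThreadId, observed);
        }).Result;
    }
}
```

If the theory holds, the value "1" should be observed exactly on the runs where RanOnWaitingThread is true, and null on the runs where a different pool thread picked up the task.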
If you can, I'd first consider whether it's possible to replace the thread-specific caches with a single shared cache. The app likely predates useful types such as ConcurrentDictionary.
If it isn't possible to use a singleton cache, then you can use a stack of async local values. Stacking async local values is a common pattern. I prefer wrapping the stack logic into a separate type (AsyncLocalValue in the code below):
public sealed class AsyncLocalValue
{
private static readonly AsyncLocal<ImmutableStack<object>> _asyncLocal = new();
public object Value => _asyncLocal.Value?.Peek();
public IDisposable PushValue(object value)
{
var originalValue = _asyncLocal.Value;
var newValue = (originalValue ?? ImmutableStack<object>.Empty).Push(value);
_asyncLocal.Value = newValue;
return Disposable.Create(() => _asyncLocal.Value = originalValue);
}
}
private static AsyncLocalValue AsyncLocal = new();
[TestMethod]
public void Test()
{
Console.WriteLine(System.Runtime.InteropServices.RuntimeInformation.FrameworkDescription);
var mainTask = Task.Factory.StartNew(() =>
{
Task anotherTask;
using (AsyncLocal.PushValue("1"))
{
using (AsyncLocal.PushValue(null))
{
anotherTask = Task.Run(() =>
{
Console.WriteLine("Observed: " + AsyncLocal.Value);
using (AsyncLocal.PushValue("2"))
{
}
});
}
}
Task.WaitAll(anotherTask);
});
mainTask.Wait(500000, CancellationToken.None);
}
This code sample uses Disposable.Create from my Nito.Disposables library.

In .NET does a new thread get created for each System.Threading.Timer object in a process?

I want to execute methods at different intervals and was looking at using the Timer class to schedule this. However, I wanted to understand whether the Timer spins up a new thread for each new schedule, which could then potentially impact the performance of the application.
For System.Threading.Timer:
Short-answer: 90% of the time: no. It uses the Thread Pool to get an existing thread (that isn't doing anything).
Long-answer: Possibly! If all threads in the pool are busy then a new thread will need to be created by the OS and added to the pool and then used by the Timer.
https://learn.microsoft.com/en-us/dotnet/api/system.threading.timer?view=netframework-4.8
Provides a mechanism for executing a method on a thread pool thread at specified intervals. This class cannot be inherited.
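One way to check this for yourself is to record, inside the callback, whether the executing thread belongs to the thread pool. This is only a sketch; TimerThreadProbe is a hypothetical name:

```csharp
using System;
using System.Threading;

public static class TimerThreadProbe
{
    // Fires a one-shot System.Threading.Timer and reports whether its
    // callback was executed on a thread pool thread.
    public static bool CallbackRunsOnPoolThread()
    {
        using var fired = new ManualResetEventSlim(false);
        bool onPool = false;
        using var timer = new Timer(_ =>
        {
            onPool = Thread.CurrentThread.IsThreadPoolThread;
            fired.Set();
        }, null, dueTime: 10, period: Timeout.Infinite);

        fired.Wait(); // block until the callback has run once
        return onPool;
    }
}
```

Going by the documentation quoted above, this should report true: the timer borrows a pool thread for the duration of the callback rather than owning a thread of its own.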
For other timer types:
System.Windows.Forms.Timer triggers a new Win32 Window Message to be sent to the Form at a set interval. It does not use a dedicated thread nor a pool thread, instead it uses Win32's SetTimer function.
System.Timers.Timer will also use a pool thread by default (just like System.Threading.Timer) but lets you perform thread synchronization. See https://learn.microsoft.com/en-us/dotnet/api/system.timers.timer?view=netframework-4.8
A single-threaded alternative using coroutines:
I recommend you look at using await Task.Delay instead - as it won't cause a new thread to be used (remember that Tasks are not Threads) - though if you do use Task.Run to run the coroutine in a pool thread then it may run on a new thread:
using System;
using System.Threading.Tasks;

public static class Foo
{
    public static async Task<Int32> Main( String[] args )
    {
        // Static methods are called directly; there is no `this` in a static class.
        Task loop30 = Every30Seconds();
        Task loop20 = Every20Seconds();
        Task loop10 = Every10Seconds();
        await Task.WhenAll( loop30, loop20, loop10 );
        return 0;
    }
    public static async Task Every30Seconds()
    {
        while( true )
        {
            Console.WriteLine("Ping!");
            await Task.Delay( 30 * 1000 );
        }
    }
    public static async Task Every20Seconds()
    {
        while( true )
        {
            Console.WriteLine("Pong!");
            await Task.Delay( 20 * 1000 );
        }
    }
    public static async Task Every10Seconds()
    {
        while( true )
        {
            Console.WriteLine("Pang!");
            await Task.Delay( 10 * 1000 );
        }
    }
}
Dai's done a good job of answering the question regarding timers and threads.
I thought I'd give you an alternative way of writing your code. You should use Microsoft's Reactive Framework (aka Rx) - NuGet System.Reactive and add using System.Reactive.Linq; and using System.Reactive.Concurrency; - then you can do this:
Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
EventLoopScheduler els = new EventLoopScheduler();
els.Schedule(() => Console.WriteLine(Thread.CurrentThread.ManagedThreadId));
IObservable<string> pings = Observable.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(30), els).Select(x => "Ping!");
IObservable<string> pongs = Observable.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(20), els).Select(x => "Pong!");
IObservable<string> pangs = Observable.Timer(TimeSpan.Zero, TimeSpan.FromSeconds(10), els).Select(x => "Pang!");
IObservable<string> query = Observable.Merge(els, pings, pongs, pangs);
IDisposable subscription = query.Subscribe(x => Console.WriteLine($"{x} ({Thread.CurrentThread.ManagedThreadId})"));
Console.ReadLine();
subscription.Dispose();
els.Dispose();
The EventLoopScheduler creates a single, dedicated, re-usable thread that you can use until you call .Dispose() on it.
Both Observable.Timer and Observable.Merge allow you to specify that you want to use the EventLoopScheduler to ensure that the code is run on that thread.

Why does my code run on multiple threads?

For quite a long time I've been trying to understand async-await in .NET, but I struggle to succeed; there's always something totally unexpected happening when I use async.
Here's my application:
using System;
using System.Threading.Tasks;

namespace ConsoleApp3
{
    class Program
    {
        static async Task Main(string[] args)
        {
            Console.WriteLine("Hello World!");
            var work1 = new WorkClass();
            var work2 = new WorkClass();
            while(true)
            {
                // Note: the returned tasks are deliberately not awaited here.
                work1.DoWork(500);
                work2.DoWork(1500);
            }
        }
    }
    public class WorkClass
    {
        public async Task DoWork(int delayMs)
        {
            var x = 1;
            await Task.Delay(delayMs);
            var y = 2;
        }
    }
}
It's just a sample that I created to check how the code will be executed. There are a few things that surprise me.
First off, there are many threads involved! If I set a breakpoint on var y = 2; I can see that the thread ID is not the same there; it can be 1, or 5, or 6, or something else.
Why is that? I thought that async/await does not use additional threads on its own unless I explicitly ask for that (by using Task.Run or creating a new Thread). At least, I think that's what this article is trying to say.
OK, but let's say that there are some other threads for whatever reason - even if there are, my await Task.Delay(delayMs); does not have ConfigureAwait(false)! As I understand it, without this call, the thread shouldn't change.
It's really difficult for me to grasp the concept well, because I cannot find any good resource that would contain all information instead of just a few pieces of information.
When an asynchronous method awaits something, if it's not complete, it schedules a continuation and then returns. The question is which thread the continuation runs on. If there's a synchronization context, the continuation is scheduled to run within that context - typically a UI thread, or potentially a specific pool of threads.
In your case, you're running a console app which means there is no synchronization context (SynchronizationContext.Current will return null). In that case, continuations are run on thread pool threads. It's not that a new thread is specifically created to run the continuation - it's just that the thread pool will pick up the continuation, whereas the "main" thread won't run the continuation.
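To see this for yourself, a small probe along these lines can help. It is a sketch with a hypothetical name, ContinuationProbe, and it assumes it is called from a plain console app with the default task scheduler:

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class ContinuationProbe
{
    // Reports whether a SynchronizationContext was absent before the await,
    // and whether the code after the await resumed on a thread pool thread.
    public static async Task<(bool NoContext, bool ResumedOnPool)> RunAsync()
    {
        bool noContext = SynchronizationContext.Current == null;
        await Task.Delay(50); // not yet complete, so a continuation is scheduled
        return (noContext, Thread.CurrentThread.IsThreadPoolThread);
    }
}
```

In a console app, NoContext should be true and the continuation should resume on a pool thread. Under a UI framework or a test runner that installs its own context, the first value may be false and the second may differ.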
ConfigureAwait(false) is used to indicate that you don't want to return to the current synchronization context for the continuation - but as there's no synchronization context anyway in your case, it would make no difference.
Async/await does not use additional threads on its own, but in your example it is not on its own. You are calling Task.Delay, and this method schedules a continuation to run in a thread-pool thread. There is no thread blocked during the delay though. A new thread is not created. When the time comes an existing thread is used to run the continuation, which in your case has very little work to do (just run the var y = 2 assignment), because you are not even awaiting the task returned by DoWork. When this work is done (a fraction of a microsecond later) the thread-pool thread is free again to do other jobs.
Instead of Task.Delay you could await another method that makes no use of threads at all, or a method that creates a dedicated long running thread, or a method that starts a new process. Async/await is not responsible for any of these. Async/await is just a mechanism for creating task continuations in a developer-friendly way.
Here is your application modified for a world without async/await:
class Program
{
static Task Main(string[] args)
{
Console.WriteLine("Hello World!");
var work1 = new WorkClass();
var work2 = new WorkClass();
while (true)
{
work1.DoWork(500);
work2.DoWork(1500);
}
}
}
public class WorkClass
{
public Task DoWork(int delayMs)
{
var x = 1;
int y;
return Task.Delay(delayMs).ContinueWith(_ =>
{
y = 2;
});
}
}

Running Task<T> on a custom scheduler

I am creating a generic helper class that will help prioritise requests made to an API whilst restricting the degree of parallelism at which they run.
Consider the key method of the application below:
public IQueuedTaskHandle<TResponse> InvokeRequest<TResponse>(Func<TClient, Task<TResponse>> invocation, QueuedClientPriority priority, CancellationToken ct) where TResponse : IServiceResponse
{
var cts = CancellationTokenSource.CreateLinkedTokenSource(ct);
_logger.Debug("Queueing task.");
var taskToQueue = Task.Factory.StartNew(async () =>
{
_logger.Debug("Starting request {0}", Task.CurrentId);
return await invocation(_client);
}, cts.Token, TaskCreationOptions.None, _schedulers[priority]).Unwrap();
taskToQueue.ContinueWith(task => _logger.Debug("Finished task {0}", task.Id), cts.Token);
return new EcosystemQueuedTaskHandle<TResponse>(cts, priority, taskToQueue);
}
Without going into too many details, I want to invoke the tasks returned by Func<TClient, Task<TResponse>> invocation when their turn in the queue arises. I am using a collection of queues constructed using QueuedTaskScheduler, indexed by a unique enumeration:
_queuedTaskScheduler = new QueuedTaskScheduler(TaskScheduler.Default, 3);
_schedulers = new Dictionary<QueuedClientPriority, TaskScheduler>();
//Enumerate the priorities
foreach (var priority in Enum.GetValues(typeof(QueuedClientPriority)))
{
_schedulers.Add((QueuedClientPriority)priority, _queuedTaskScheduler.ActivateNewQueue((int)priority));
}
However, I can't get the tasks to execute with limited parallelism; instead, 100 API requests are constructed, fired, and completed in one big batch. I can see this in a Fiddler session.
I have read some interesting articles and SO posts (here, here and here) that I thought would detail how to go about this, but so far I have not been able to figure it out. From what I understand, the async nature of the lambda is working in a continuation structure as designed, which is marking the generated task as complete, basically "insta-completing" it. This means that whilst the queues are working fine, running a generated Task<T> on a custom scheduler is turning out to be the problem.
This means that whilst the queues are working fine, running a generated Task<T> on a custom scheduler is turning out to be the problem.
Correct. One way to think about it[1] is that an async method is split into several tasks - it's broken up at each await point. Each one of these "sub-tasks" is then run on the task scheduler. So, the async method will run entirely on the task scheduler (assuming you don't use ConfigureAwait(false)), but at each await it will leave the task scheduler, and then re-enter that task scheduler after the await completes.
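The "insta-completing" effect can be observed directly. The sketch below uses a hypothetical StartNewAsyncProbe class and TaskScheduler.Default rather than the QueuedTaskScheduler from the question. It shows that the task returned by StartNew is already finished while the asynchronous work it started is still pending:

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class StartNewAsyncProbe
{
    public static async Task<(bool OuterDone, bool InnerDone)> RunAsync()
    {
        // Stands in for an API call that has not finished yet.
        var gate = new TaskCompletionSource<int>();

        // With an async lambda, StartNew returns Task<Task<int>>.
        Task<Task<int>> outer = Task.Factory.StartNew(
            async () => await gate.Task,
            CancellationToken.None, TaskCreationOptions.None, TaskScheduler.Default);

        // The outer task completes as soon as the lambda hits its first await.
        Task<int> inner = await outer;
        var snapshot = (outer.IsCompleted, inner.IsCompleted);

        gate.SetResult(42); // let the "request" finish
        await inner;
        return snapshot;    // (true, false)
    }
}
```

From the scheduler's point of view the work item is over once the outer task completes, so it is free to start the next one even though the request itself is still in flight.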
So, if you want to coordinate asynchronous work at a higher level, you need to take a different approach. It's possible to write the code yourself for this, but it can get messy. I recommend you first try ActionBlock<T> from the TPL Dataflow library, passing your custom task scheduler to its ExecutionDataflowBlockOptions.
[1] This is a simplification. The state machine will avoid creating actual task objects unless necessary (in this case, they are necessary because they're being scheduled to a task scheduler). Also, only await points where the awaitable isn't complete actually cause a "method split".
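For comparison, here is a rough sketch of the "write it yourself" route. It deliberately uses a SemaphoreSlim rather than ActionBlock<T>, so it is a different technique from the one recommended above, and it has no priorities. AsyncThrottleSketch and its parameters are illustrative names only:

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncThrottleSketch
{
    // Runs `jobs` simulated requests with at most `limit` in flight,
    // and returns the highest concurrency actually observed.
    public static async Task<int> MaxObservedConcurrencyAsync(int limit, int jobs)
    {
        using var gate = new SemaphoreSlim(limit, limit);
        var sync = new object();
        int running = 0, max = 0;

        async Task RunOneAsync()
        {
            await gate.WaitAsync(); // held across the await below, unlike a scheduler slot
            try
            {
                lock (sync) { running++; if (running > max) max = running; }
                await Task.Delay(20); // stands in for the real API call
            }
            finally
            {
                lock (sync) { running--; }
                gate.Release();
            }
        }

        await Task.WhenAll(Enumerable.Range(0, jobs).Select(_ => RunOneAsync()));
        return max;
    }
}
```

The point of the sketch is that the limit is held for the whole asynchronous operation, including the time spent awaiting, which is what a TaskScheduler cannot do for an async lambda.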
Stephen Cleary's answer explains well why you can't use TaskScheduler for this purpose and how you can use ActionBlock to limit the degree of parallelism. But if you want to add priorities to that, I think you'll have to do that manually. Your approach of using a Dictionary of queues is reasonable; a simple implementation (with no support for cancellation or completion) of that could look something like this:
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

class Scheduler
{
    private static readonly Priority[] Priorities =
        (Priority[])Enum.GetValues(typeof(Priority));
    private readonly IReadOnlyDictionary<Priority, ConcurrentQueue<Func<Task>>> queues;
    private readonly ActionBlock<Func<Task>> executor;
    private readonly SemaphoreSlim semaphore;
    public Scheduler(int degreeOfParallelism)
    {
        queues = Priorities.ToDictionary(
            priority => priority, _ => new ConcurrentQueue<Func<Task>>());
        executor = new ActionBlock<Func<Task>>(
            invocation => invocation(),
            new ExecutionDataflowBlockOptions
            {
                MaxDegreeOfParallelism = degreeOfParallelism,
                BoundedCapacity = degreeOfParallelism
            });
        semaphore = new SemaphoreSlim(0);
        Task.Run(Watch);
    }
    private async Task Watch()
    {
        while (true)
        {
            // one semaphore count corresponds to one enqueued item
            await semaphore.WaitAsync();
            // find item with highest priority and send it for execution
            foreach (var priority in Priorities.Reverse())
            {
                Func<Task> invocation;
                if (queues[priority].TryDequeue(out invocation))
                {
                    await executor.SendAsync(invocation);
                    // take only one item per signal, then re-check from the highest priority
                    break;
                }
            }
        }
    }
    public void Invoke(Func<Task> invocation, Priority priority)
    {
        queues[priority].Enqueue(invocation);
        semaphore.Release(1);
    }
}

BlockingCollection vs Subject for use as a consumer

I'm trying to implement a consumer in C#. There are many publishers which could be executing concurrently. I've created three examples, one with Rx and subject, one with BlockingCollection and a third using ToObservable from the BlockingCollection. They all do the same thing in this simple example and I want them to work with multiple producers.
What are the different qualities of each approach?
I'm already using Rx, so I'd prefer this approach. But I'm concerned that OnNext has no thread safe guarantee and I don't know what the queuing semantics are of Subject and the default scheduler.
Is there a thread safe subject?
Are all messages going to be processed?
Are there any other scenarios when this won't work? Is it processing concurrently?
void SubjectOnDefaultScheduler()
{
var observable = new Subject<long>();
observable.
ObserveOn(Scheduler.Default).
Subscribe(i => { DoWork(i); });
observable.OnNext(1);
observable.OnNext(2);
observable.OnNext(3);
}
Not Rx, but easily adapted to use/subscribe it. It takes an item and then processes it. This should happen serially.
void BlockingCollectionAndConsumingTask()
{
var blockingCollection = new BlockingCollection<long>();
var taskFactory = new TaskFactory();
taskFactory.StartNew(() =>
{
foreach (var i in blockingCollection.GetConsumingEnumerable())
{
DoWork(i);
}
});
blockingCollection.Add(1);
blockingCollection.Add(2);
blockingCollection.Add(3);
}
Using a blocking collection a bit like a subject seems like a good compromise. I'm guessing this will implicitly schedule onto a task, so that I can use async/await; is that correct?
void BlockingCollectionToObservable()
{
var blockingCollection = new BlockingCollection<long>();
blockingCollection.
GetConsumingEnumerable().
ToObservable(Scheduler.Default).
Subscribe(i => { DoWork(i); });
blockingCollection.Add(1);
blockingCollection.Add(2);
blockingCollection.Add(3);
}
Subject is not thread-safe. OnNext calls issued concurrently will directly call an Observer concurrently. Personally I find this quite surprising given the extent to which other areas of Rx enforce the correct semantics. I can only assume this was done for performance considerations.
Subject is kind of a half-way house though, in that it does enforce termination with OnError or OnCompleted - after either of these is raised, OnNext is a NOP. And this behaviour is thread-safe.
But use Observable.Synchronize() on a Subject and it will force outgoing calls to obey the proper Rx semantics. In particular, OnNext calls will block if made concurrently.
The underlying mechanism is the standard .NET lock. When the lock is contended by multiple threads they are granted the lock on a first-come first-served basis most of the time. There are certain conditions where fairness is violated. However, you will definitely get the serialized access you are looking for.
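To make the idea concrete, here is a sketch of an observer wrapper that serializes calls with a lock. It is not Rx's actual implementation of Synchronize, only an illustration of the mechanism described above; LockingObserver and OverlapDetector are hypothetical names, and only the BCL's IObserver<T> is used:

```csharp
using System;
using System.Threading;

// Forwards every notification to the inner observer under a single lock,
// so concurrent callers are admitted one at a time.
public sealed class LockingObserver<T> : IObserver<T>
{
    private readonly IObserver<T> _inner;
    private readonly object _gate = new object();
    public LockingObserver(IObserver<T> inner) => _inner = inner;
    public void OnNext(T value) { lock (_gate) _inner.OnNext(value); }
    public void OnError(Exception error) { lock (_gate) _inner.OnError(error); }
    public void OnCompleted() { lock (_gate) _inner.OnCompleted(); }
}

// Test double that notices when two OnNext calls overlap in time.
public sealed class OverlapDetector : IObserver<int>
{
    private int _inside;
    public bool Overlapped { get; private set; }
    public int Count { get; private set; }
    public void OnNext(int value)
    {
        if (Interlocked.Exchange(ref _inside, 1) == 1) Overlapped = true;
        Count++;               // only safe when calls are serialized
        Thread.SpinWait(1000); // widen the window for overlap
        Interlocked.Exchange(ref _inside, 0);
    }
    public void OnError(Exception error) { }
    public void OnCompleted() { }
}
```

Calling OnNext on the wrapper from many threads should never trip the detector, whereas calling the detector directly from many threads usually will, which mirrors the difference between a Subject with and without Synchronize.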
ObserveOn has behaviour that is platform specific - if available, you can supply a SynchronizationContext and OnNext calls are Posted to it. With a Scheduler, it ends up putting calls onto a ConcurrentQueue<T> and dispatching them serially via the scheduler - so the thread of execution will depend on the scheduler. Either way, the queuing behaviour will also enforce the correct semantics.
In both cases (Synchronize & ObserveOn), you certainly won't lose messages. With ObserveOn, you can implicitly choose the thread you'll process messages on by your choice of Scheduler/Context; with Synchronize, you'll process messages on the calling thread. Which is better will depend on your scenario.
There's more to consider as well - such as what you want to do if your producers out-pace your consumer.
You might want to have a look at Rxx Consume as well: http://rxx.codeplex.com/SourceControl/changeset/view/63470#1100703
Sample code showing Synchronize behaviour (NuGet Rx-Testing, NUnit) - it's a bit hokey with the Thread.Sleep code but it's quite fiddly to be bad and I was lazy :):
public class SubjectTests
{
[Test]
public void SubjectDoesNotRespectGrammar()
{
var subject = new Subject<int>();
var spy = new ObserverSpy(Scheduler.Default);
var sut = subject.Subscribe(spy);
// Swap the following with the preceding to make this test pass
//var sut = subject.Synchronize().Subscribe(spy);
Task.Factory.StartNew(() => subject.OnNext(1));
Task.Factory.StartNew(() => subject.OnNext(2));
Thread.Sleep(2000);
Assert.IsFalse(spy.ConcurrencyViolation);
}
private class ObserverSpy : IObserver<int>
{
private int _inOnNext;
public ObserverSpy(IScheduler scheduler)
{
_scheduler = scheduler;
}
public bool ConcurrencyViolation = false;
private readonly IScheduler _scheduler;
public void OnNext(int value)
{
var isInOnNext = Interlocked.CompareExchange(ref _inOnNext, 1, 0);
if (isInOnNext == 1)
{
ConcurrencyViolation = true;
return;
}
var wait = new ManualResetEvent(false);
_scheduler.Schedule(TimeSpan.FromSeconds(1), () => wait.Set());
wait.WaitOne();
_inOnNext = 0;
}
public void OnError(Exception error)
{
}
public void OnCompleted()
{
}
}
}
