Observe values not seen in other observers - c#

I have an observable that emits unique values e.g.
var source=Observable.Range(1,100).Publish();
source.Connect();
I want to observe its values from, e.g., two observers, but each observer should be notified only of values not seen by the other observers.
So if the first observer receives the value 10, the second observer should never be notified of the value 10.
Update
I chose @Asti's answer because it was first and, although buggy, it pointed in the right direction; I also up-voted @Shlomo's answer. Too bad I cannot accept both answers, as @Shlomo's answer was more correct and I really appreciate all the help we get from him on this tag.

Observables aren't supposed to behave differently for different observers; a better approach would be to give each observer its own filtered observable.
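As a minimal sketch of the "own filtered observable" idea, assuming the values can be partitioned by a simple predicate (odd/even here is only an illustrative choice, as are the `one` and `two` lists), each observer subscribes to its own `Where` over the shared published source:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reactive.Linq;

// Shared hot source; nothing flows until Connect() is called.
var source = Observable.Range(1, 6).Publish();

// Illustrative sinks so the partition can be inspected afterwards.
var one = new List<int>();
var two = new List<int>();

// Each observer gets its own filtered view, so no value reaches both.
source.Where(i => i % 2 == 1)
      .Subscribe(i => { one.Add(i); Console.WriteLine($"One sees {i}"); });
source.Where(i => i % 2 == 0)
      .Subscribe(i => { two.Add(i); Console.WriteLine($"Two sees {i}"); });

source.Connect();

// The two partitions never overlap.
Console.WriteLine($"Overlap: {one.Intersect(two).Count()}");
```

The filter is fixed up front here, which is what makes the behavior the same regardless of who subscribes when.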
That being said, if your constraints require this behavior in a single observable, we can use a round-robin method.
public static IEnumerable<T> Repeat<T>(this IEnumerable<T> source)
{
for (; ; )
foreach (var item in source.ToArray())
yield return item;
}
public static IObservable<T> RoundRobin<T>(this IObservable<T> source)
{
var subscribers = new List<IObserver<T>>();
var shared = source
.Zip(subscribers.Repeat(), (value, observer) => (value, observer))
.Publish()
.RefCount();
return Observable.Create<T>(observer =>
{
subscribers.Add(observer);
var subscription =
shared
.Where(pair => pair.observer == observer)
.Select(pair => pair.value)
.Subscribe(observer);
var dispose = Disposable.Create(() => subscribers.Remove(observer));
return new CompositeDisposable(subscription, dispose);
});
}
Usage:
var source = Observable.Range(1, 100).Publish();
var dist = source.RoundRobin();
dist.Subscribe(i => Console.WriteLine($"One sees {i}"));
dist.Subscribe(i => Console.WriteLine($"Two sees {i}"));
source.Connect();
Result:
One sees 1
Two sees 2
One sees 3
Two sees 4
One sees 5
Two sees 6
One sees 7
Two sees 8
One sees 9
Two sees 10
If you already have a list of observers, the code becomes much simpler.

EDIT: @Asti fixed his bug, and I fixed mine based on his answer. Our answers are now largely similar. I have an idea for a purely reactive version; if I have time, I'll post it later.
Fixed code:
public static IObservable<T> RoundRobin2<T>(this IObservable<T> source)
{
var subscribers = new BehaviorSubject<ImmutableList<IObserver<T>>>(ImmutableList<IObserver<T>>.Empty);
ImmutableList<IObserver<T>> latest = ImmutableList<IObserver<T>>.Empty;
subscribers.Subscribe(l => latest = l);
var shared = source
.Select((v, i) => (v, i))
.WithLatestFrom(subscribers, (t, s) => (t.v, t.i, s))
.Publish()
.RefCount();
return Observable.Create<T>(observer =>
{
subscribers.OnNext(latest.Add(observer));
var dispose = Disposable.Create(() => subscribers.OnNext(latest.Remove(observer)));
var sub = shared
.Where(t => t.i % t.s.Count == t.s.FindIndex(o => o == observer))
.Select(t => t.v)
.Subscribe(observer);
return new CompositeDisposable(dispose, sub);
});
}
Original answer:
I upvoted @Asti's answer because he's largely correct: just because you can doesn't mean you should. His answer largely works, but it's subject to a bug:
This works fine:
var source = Observable.Range(1, 20).Publish();
var dist = source.RoundRobin();
dist.Subscribe(i => Console.WriteLine($"One sees {i}"));
dist.Take(1).Subscribe(i => Console.WriteLine($"Two sees {i}"));
This doesn't:
var source = Observable.Range(1, 20).Publish();
var dist = source.RoundRobin();
dist.Take(1).Subscribe(i => Console.WriteLine($"One sees {i}"));
dist.Subscribe(i => Console.WriteLine($"Two sees {i}"));
Output is:
One sees 1
Two sees 1
Two sees 2
Two sees 3
Two sees 4
...
I first thought the bug was Halloween-related, but now I'm not sure: the .ToArray() in Repeat should take care of that. I also wrote a pure-ish observable implementation which has the same bug. This implementation doesn't guarantee a perfect round-robin, but that wasn't in the question:
public static IObservable<T> RoundRobin2<T>(this IObservable<T> source)
{
var subscribers = new BehaviorSubject<ImmutableList<IObserver<T>>>(ImmutableList<IObserver<T>>.Empty);
ImmutableList<IObserver<T>> latest = ImmutableList<IObserver<T>>.Empty;
subscribers.Subscribe(l => latest = l);
var shared = source
.Select((v, i) => (v, i))
.WithLatestFrom(subscribers, (t, s) => (t.v, t.i, s))
.Publish()
.RefCount();
return Observable.Create<T>(observer =>
{
subscribers.OnNext(latest.Add(observer));
var dispose = Disposable.Create(() => subscribers.OnNext(latest.Remove(observer)));
var sub = shared
.Where(t => t.i % t.s.Count == t.s.FindIndex(o => o == observer))
.Select(t => t.v)
.Subscribe(observer);
return new CompositeDisposable(dispose, sub);
});
}

This is a simple distributed queue implementation using TPL Dataflow. With respect to different observers not seeing the same value, there's little chance of it behaving incorrectly. It's not round-robin; it has back-pressure semantics instead.
public static IObservable<T> Distribute<T>(this IObservable<T> source)
{
var buffer = new BufferBlock<T>();
source.Subscribe(buffer.AsObserver());
return Observable.Create<T>(observer =>
buffer.LinkTo(new ActionBlock<T>(observer.OnNext, new ExecutionDataflowBlockOptions { BoundedCapacity = 1 }))
);
}
Output
One sees 1
Two sees 2
One sees 3
Two sees 4
One sees 5
One sees 6
One sees 7
One sees 8
One sees 9
One sees 10
I might prefer skipping Rx entirely and just using TPL Dataflow.
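A rough sketch of what skipping Rx could look like, assuming the consumers can be modelled as `ActionBlock<int>` instances (the `Worker` helper and the `seen` bag are hypothetical names used only for this illustration). A `BufferBlock` offers each item to the first linked target that has room, so every value is handled by exactly one consumer:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

var buffer = new BufferBlock<int>();
var seen = new ConcurrentBag<(string Who, int Value)>();

// Hypothetical helper: a consumer that holds at most one item at a time,
// so a busy consumer declines new items and the buffer offers them elsewhere.
ActionBlock<int> Worker(string name) =>
    new ActionBlock<int>(
        i => { seen.Add((name, i)); Console.WriteLine($"{name} sees {i}"); },
        new ExecutionDataflowBlockOptions { BoundedCapacity = 1 });

var one = Worker("One");
var two = Worker("Two");

// Propagate completion so both workers finish once the buffer drains.
var link = new DataflowLinkOptions { PropagateCompletion = true };
buffer.LinkTo(one, link);
buffer.LinkTo(two, link);

for (var i = 1; i <= 10; i++) buffer.Post(i);
buffer.Complete();
await Task.WhenAll(one.Completion, two.Completion);

// Every value was delivered once; which worker got it is not deterministic.
Console.WriteLine($"Distinct values handled: {seen.Select(s => s.Value).Distinct().Count()}");
```

As with the Distribute operator above, the split between consumers depends on timing rather than on a strict alternation.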

Related

DistinctUntilChanged fires multiple times on multiple subscribers

I have one observable (mainSequence). If a condition is met, it should invoke an async method once until the condition changes. The method's return value indicates success.
On failure, I have a subscription which will inform the user.
Other observables are likely to subscribe to mainSequence and have a similar error-handling pattern.
But each subsequent observer of mainSequence causes mainSequence to be invoked again. I would like it to be invoked only once, hence my DistinctUntilChanged.
The example below outputs:
Working on 6
Working on 6
Working on 100
Working on 6
Working on 101
The output I want is:
Working on 6
Working on 100
Working on 101
I'm missing a reactive operator on my mainSequence; I just don't know which one.
public static void Main()
{
bool IsNumberOk(int n) => n > 5;
Task<bool> DoSomethingAsync(int n)
{
Console.WriteLine($"Working on {n}");
return Task.FromResult(true);
}
var mainSequence = Observable.Range(0, 10)
.Where(IsNumberOk)
.DistinctUntilChanged(IsNumberOk)
.SelectMany(DoSomethingAsync);
// sequence one error handling
mainSequence.Where(x => !x).Subscribe(_ => Console.WriteLine($"Something went wrong with {nameof(mainSequence)}"));
for (var i = 0; i < 2; i++)
{
var iTemp = 100 + i;
var consecutive = mainSequence
.Where(x => x) // if no error on mainSequence
.Select(_ => iTemp)
.DistinctUntilChanged()
.SelectMany(DoSomethingAsync);
consecutive.Where(x => !x).Subscribe(_ => Console.WriteLine($"Something went wrong with {iTemp}"));
}
}
You have a misunderstanding about the distinction between an observable and a subscription. They are two distinct things.
The best parallel, in my mind, is that an observable is like a class and a subscription is like an instance of a class. Like a class, the observable is defined once. Each subscription is a new instance of the observable.
Let's take this code - somewhat cut-down from your code in the question.
Task<int> DoSomethingAsync(int n)
{
Console.WriteLine($"Working on {n}");
return Task.FromResult(-n);
}
IObservable<int> mainSequence =
Observable
.Range(0, 3)
.SelectMany(DoSomethingAsync);
That's a single observable.
Now let's do this:
IDisposable mainSubscription1 =
mainSequence
.Subscribe(x => Console.WriteLine($"(1){nameof(mainSequence)}OnNext({x})"));
IDisposable mainSubscription2 =
mainSequence
.Subscribe(x => Console.WriteLine($"(2){nameof(mainSequence)}OnNext({x})"));
I have created two subscriptions, so I get two completely distinct instances of the observable. They run entirely separately from each other. In fact, Observable.Range outputs its values immediately, so each subscription blocks until it is complete. You get this output:
Working on 0
(1)mainSequenceOnNext(0)
Working on 1
(1)mainSequenceOnNext(-1)
Working on 2
(1)mainSequenceOnNext(-2)
Working on 0
(2)mainSequenceOnNext(0)
Working on 1
(2)mainSequenceOnNext(-1)
Working on 2
(2)mainSequenceOnNext(-2)
You can get Observable.Range to not block like this:
IObservable<int> mainSequence =
Observable
.Range(0, 3, Scheduler.Default)
.SelectMany(DoSomethingAsync);
But you still have two completely independent instances of the observable running. You get something like this:
Working on 0
Working on 0
(1)mainSequenceOnNext(0)
Working on 1
(2)mainSequenceOnNext(0)
Working on 1
(1)mainSequenceOnNext(-1)
Working on 2
(2)mainSequenceOnNext(-1)
Working on 2
(1)mainSequenceOnNext(-2)
(2)mainSequenceOnNext(-2)
Now, if you want to share a single observable then you need to Publish it and Connect to the published observable to get the values flowing.
Here's the full code:
IConnectableObservable<int> mainSequence =
Observable
.Range(0, 3, Scheduler.Default)
.SelectMany(DoSomethingAsync)
.Publish();
IDisposable mainSubscription1 =
mainSequence
.Subscribe(x => Console.WriteLine($"(1){nameof(mainSequence)}OnNext({x})"));
IDisposable mainSubscription2 =
mainSequence
.Subscribe(x => Console.WriteLine($"(2){nameof(mainSequence)}OnNext({x})"));
IDisposable mainConnection =
mainSequence
.Connect();
Now when I run that, the two subscriptions don't start producing values until the .Connect() is called.
You get this:
Working on 0
(1)mainSequenceOnNext(0)
(2)mainSequenceOnNext(0)
Working on 1
(1)mainSequenceOnNext(-1)
(2)mainSequenceOnNext(-1)
Working on 2
(1)mainSequenceOnNext(-2)
(2)mainSequenceOnNext(-2)
Now if I had to get your code working, here's what it would look like:
public static void Main()
{
bool IsNumberOk(int n) => n > 5;
Task<bool> DoSomethingAsync(int n)
{
Console.WriteLine($"Working on {n}");
return Task.FromResult(true);
}
var mainSequence =
Observable
.Range(0, 10)
.Where(IsNumberOk)
.DistinctUntilChanged(IsNumberOk)
.SelectMany(DoSomethingAsync)
.Publish();
mainSequence
.Where(x => !x)
.Subscribe(_ => Console.WriteLine($"Something went wrong with {nameof(mainSequence)}"));
for (var i = 0; i < 2; i++)
{
var iTemp = 100 + i;
var consecutive =
mainSequence
.Where(x => x)
.Select(_ => iTemp)
.DistinctUntilChanged()
.SelectMany(DoSomethingAsync);
consecutive
.Where(x => !x)
.Subscribe(_ => Console.WriteLine($"Something went wrong with {iTemp}"));
}
IDisposable mainConnection =
mainSequence
.Connect();
}
It now produces this:
Working on 6
Working on 100
Working on 101

rx.net locking up from use of ToEnumerable

I am trying to convert the below statement so that I can get the key alongside the selected list:
var feed = new Subject<TradeExecuted>();
feed
.GroupByUntil(x => (x.Execution.Contract.Symbol, x.Execution.AccountId, x.Tenant, x.UserId), x => Observable.Timer(TimeSpan.FromSeconds(5)))
.SelectMany(x => x.ToList())
.Select(trades => Observable.FromAsync(() => Mediator.Publish(trades, cts.Token)))
.Concat() // Ensure that the results are serialized.
.Subscribe(cts.Token); // Check status of calls.
The above works, whereas the below does not: when I try to iterate over the list, it locks up.
feed
.GroupByUntil(x => (x.Execution.Contract.Symbol, x.Execution.AccountId, x.Tenant, x.UserId), x => Observable.Timer(timespan))
.Select(x => Observable.FromAsync(() =>
{
var list = x.ToEnumerable(); // <---- LOCK UP if we use list.First() etc
var aggregate = AggregateTrades(x.Key.Symbol, x.Key.AccountId, x.Key.Tenant, list);
return Mediator.Publish(aggregate, cts.Token);
}))
.Concat()
.Subscribe(cts.Token); // Check status of calls.
I am clearly doing something wrong and probably horrific!
Going back to the original code, how can I get the Key alongside the enumerable list (and avoiding the hack below)?
As a side note, the code below works, but it is a nasty hack where I get the keys from the first list item:
feed
.GroupByUntil(x => (x.Execution.Contract.Symbol, x.Execution.AccountId, x.Tenant, x.UserId), x => Observable.Timer(TimeSpan.FromSeconds(5)))
.SelectMany(x => x.ToList())
.Select(trades => Observable.FromAsync(() =>
{
var firstTrade = trades.First();
var aggregate = AggregateTrades(firstTrade.Execution.Contract.Symbol, firstTrade.Execution.AccountId, firstTrade.Tenant, trades);
return Mediator.Publish(aggregate, cts.Token);
}))
.Concat() // Ensure that the results are serialized.
.Subscribe(cts.Token); // Check status of calls.
All versions of your code suffer from trying to eagerly evaluate the grouped sub-observable. Since in v1 and v3 your group observable will run a maximum of 5 seconds, that isn't horrible/awful, but it's still not great. In v2, I don't know what timespan is, but assuming it's 5 seconds, you have the same problem: Trying to turn the grouped sub-observable into a list or an enumerable means waiting for the sub-observable to complete, blocking the thread (or the task).
You can fix this by using the Buffer operator to lazily evaluate the grouped sub-observable:
var timespan = TimeSpan.FromSeconds(5);
feed
.GroupByUntil(x => (x.Execution.Contract.Symbol, x.Execution.AccountId, x.Tenant, x.UserId), x => Observable.Timer(timespan))
.SelectMany(x => x
.Buffer(timespan)
.Select(list => Observable.FromAsync(() =>
{
var aggregate = AggregateTrades(x.Key.Symbol, x.Key.AccountId, x.Key.Tenant, list);
return Mediator.Publish(aggregate, cts.Token);
}))
)
.Concat() // Ensure that the results are serialized.
.Subscribe(cts.Token); // Check status of calls.
This essentially means that until timespan is up, the items in the group by gather in a list inside Buffer. Once timespan is up, they're released as a list, and the mediator publish happens.

System.Reactive - Buffer/group immediately available values of an Observable

Let's say you have an IObservable<T> that may supply a few values immediately, with more being pushed continuously afterwards:
var immediate_values = new [] { "currently", "available", "values" }.ToObservable();
var future_values = Observable.Timer(TimeSpan.FromSeconds(5), TimeSpan.FromSeconds(1)).Select(x => "new value!");
IObservable<string> input = immediate_values.Concat(future_values);
Is there any way to transform input into an IObservable<string[]>, where the first array being pushed consists of all immediately available values, and each subsequent array consists of only 1 value (each one being pushed thereafter)?
The above is just example data, naturally; this would need to work on any IObservable<T> without knowing the individual input streams.
IObservable<string[]> buffered = input.BufferSomehow();
// should push values:
// First value: string[] = ["currently", "available", "values"]
// Second value: string[] = ["new value!"]
// Third value: string[] = ["new value!"]
// .....
I've thought of the .Buffer() function of course, but I don't really want to buffer by any particular TimeSpan, and can't think of any way to produce an observable with buffer window closing signals.
Can anyone think of a reasonable way to achieve this, or is this not really possible at all?
Thanks!
There is no direct way to distinguish between the on-start-up values of an observable and the subsequent values. My suggestion would be to infer it:
var autoBufferedInput1 = input.Publish(_input => _input
.Buffer(_input.Throttle(TimeSpan.FromSeconds(.1)))
.Select(l => l.ToArray())
);
This sets your buffer boundary to a rolling, extending window of .1 seconds: Each time a value comes in, it extends the window to .1 seconds from the time the value came in, and adds the value to the buffer. If .1 seconds go by with no values, then the buffer is flushed out.
This will have the side-effect that if you have near-simultaneous "hot" values (within .1 seconds of each other), then those will be buffered together. If that's undesired, you can Switch out, though that makes things more complicated:
var autoBufferedInput2 = input.Publish(_input =>
_input.Throttle(TimeSpan.FromSeconds(.1)).Publish(_boundary => _boundary
.Take(1)
.Select(_ => _input.Select(s => new[] { s }))
.StartWith(_input
.Buffer(_boundary)
.Select(l => l.ToArray())
)
.Switch()
)
);
autoBufferedInput2 uses the .1 second inference method until the first buffered list, then switches to simply selecting out and wrapping values in an array.
EDIT: If you want an absolute 1 second gate as well, then the snippets would look like this:
var autoBufferedInput1 = input.Publish(_input => _input
.Buffer(
Observable.Merge(
Observable.Timer(TimeSpan.FromSeconds(1)).Select(_ => Unit.Default),
_input.Throttle(TimeSpan.FromSeconds(.1)).Select(_ => Unit.Default)
)
)
.Select(l => l.ToArray())
);
var autoBufferedInput2 = input.Publish(_input =>
Observable.Merge(
_input.Throttle(TimeSpan.FromSeconds(.1)).Select(_ => Unit.Default),
Observable.Timer(TimeSpan.FromSeconds(1)).Select(_ => Unit.Default)
)
.Publish(_boundary => _boundary
.Take(1)
.Select(_ => _input.Select(s => new[] { s }))
.StartWith(_input
.Buffer(_boundary)
.Select(l => l.ToArray())
)
.Switch()
)
);
For any IObservable<T>, you'd need to do:
var sequence = ongoingSequence.StartWith(initialSequence);
You could take advantage of the fact that the immediately available values are propagated synchronously during the subscription, and toggle some switch after the Subscribe method returns. The implementation below is based on this idea. During the subscription all incoming messages are buffered, after the subscription the buffer is emitted, and after that all future incoming messages are emitted immediately one by one.
public static IObservable<T[]> BufferImmediatelyAvailable<T>(
this IObservable<T> source)
{
return Observable.Create<T[]>(observer =>
{
var buffer = new List<T>();
var subscription = source.Subscribe(x =>
{
if (buffer != null)
buffer.Add(x);
else
observer.OnNext(new[] { x });
}, ex =>
{
buffer = null;
observer.OnError(ex);
}, () =>
{
if (buffer != null)
{
var output = buffer.ToArray();
buffer = null;
observer.OnNext(output);
}
observer.OnCompleted();
});
if (buffer != null)
{
var output = buffer.ToArray();
buffer = null;
observer.OnNext(output);
}
return subscription;
});
}

combining one observable with latest from another observable

I'm trying to combine two observables whose values share some key.
I want to produce a new value whenever the first observable produces a new value, combined with the latest value from the second observable, where the selection depends on the latest value from the first observable.
pseudo code example:
var obs1 = Observable.Interval(TimeSpan.FromSeconds(1)).Select(x => Tuple.Create(SomeKeyThatVaries, x));
var obs2 = Observable.Interval(TimeSpan.FromMilliseconds(1)).Select(x => Tuple.Create(SomeKeyThatVaries, x));
from x in obs1
let latestFromObs2WhereKeyMatches = …
select Tuple.create(x, latestFromObs2WhereKeyMatches)
Any suggestions?
Clearly this could be implemented by subscribing to the second observable and creating a dictionary with the latest values indexable by the key. But I'm looking for a different approach.
Usage scenario: one minute price bars computed from a stream of stock quotes. In this case the key is the ticker and the dictionary contains latest ask and bid prices for concrete tickers, which are then used in the computation.
(By the way, thank you Dave and James this has been a very fruitful discussion)
(sorry about the formatting, hard to get right on an iPad..)
...why are you looking for a different approach? Sounds like you are on the right lines to me. It's short, simple code... roughly speaking it will be something like:
var cache = new ConcurrentDictionary<long, long>();
obs2.Subscribe(x => cache[x.Item1] = x.Item2);
var results = obs1.Select(x => new {
obs1 = x.Item2,
obs2 = cache.TryGetValue(x.Item1, out var latest) ? latest : 0L // anonymous-type members need a name
});
At the end of the day, C# is an OO language and the heavy lifting of the thread-safe mutable collections is already all done for you.
There may be a fancy Rx approach (it feels like joins might be involved)... but how maintainable will it be? And how will it perform?
$0.02
I'd like to know the purpose of such a query. Would you mind describing the usage scenario a bit?
Nevertheless, it seems like the following query may solve your problem. The initial projections aren't necessary if you already have some way of identifying the origin of each value, but I've included them for the sake of generalization, to be consistent with your extremely abstract mode of questioning. ;-)
Note: I'm assuming that someKeyThatVaries is not shared data as you've shown it, which is why I've also included the term anotherKeyThatVaries; otherwise, the entire query really makes no sense to me.
var obs1 = Observable.Interval(TimeSpan.FromSeconds(1))
.Select(x => Tuple.Create(someKeyThatVaries, x));
var obs2 = Observable.Interval(TimeSpan.FromSeconds(.25))
.Select(x => Tuple.Create(anotherKeyThatVaries, x));
var results = obs1.Select(t => new { Key = t.Item1, Value = t.Item2, Kind = 1 })
.Merge(
obs2.Select(t => new { Key = t.Item1, Value = t.Item2, Kind = 2 }))
.GroupBy(t => t.Key, t => new { t.Value, t.Kind })
.SelectMany(g =>
g.Scan(
new { X = -1L, Y = -1L, Yield = false },
(acc, cur) => cur.Kind == 1
? new { X = cur.Value, Y = acc.Y, Yield = true }
: new { X = acc.X, Y = cur.Value, Yield = false })
.Where(s => s.Yield)
.Select(s => Tuple.Create(s.X, s.Y)));

How to optimize LINQ OrderBy if the keySelector is slow?

I want to sort a list of objects using a value that can take some time to compute. For now I have code like this:
public IEnumerable<Foo> SortFoo(IEnumerable<Foo> original)
{
return original.OrderByDescending(foo => CalculateBar(foo));
}
private int CalculateBar(Foo foo)
{
//some slow process here
}
The problem with the above code is that it will calculate the value several times for each item, which is not good. A possible optimization is to use a cached value (maybe a dictionary), but that would mean SortFoo has to clear the cache after each sort (to avoid a memory leak, and because I do want the value to be recalculated on each SortFoo call).
Is there a cleaner and more elegant solution to this problem?
It appears that .OrderBy() is already optimized for slow keySelectors.
Based on the following, .OrderBy() seems to cache the result of the keySelector delegate you supply it.
var random = new Random(0);
var ordered = Enumerable
.Range(0, 10)
.OrderBy(x => {
var result = random.Next(20);
Console.WriteLine("keySelector({0}) => {1}", x, result);
return result;
});
Console.WriteLine(String.Join(", ", ordered));
Here's the output:
keySelector(0) => 14
keySelector(1) => 16
keySelector(2) => 15
keySelector(3) => 11
keySelector(4) => 4
keySelector(5) => 11
keySelector(6) => 18
keySelector(7) => 8
keySelector(8) => 19
keySelector(9) => 5
4, 9, 7, 3, 5, 0, 2, 1, 6, 8
If it were running the delegate once per comparison, I'd see more than just one invocation of my keySelector delegate per item.
Because each item is compared against other items multiple times during a sort, you can cheaply cache the computation so that it runs once per item.
If you're often running the calculation against the same values, memoizing the function would be your best bet. Otherwise, compute the key once per item before sorting:
public IEnumerable<Foo> SortFoo(IEnumerable<Foo> original)
{
return original
.Select(f => new { Foo = f, SortBy = CalculateBar(f) })
.OrderByDescending(f=> f.SortBy)
.Select(f => f.Foo);
}
This will reduce the calculations to once per item.
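For the memoization option mentioned above, a minimal sketch might look like the following. `Memoize`, `SlowLength` and the sample words are hypothetical names for illustration only; the cache lives as long as the returned delegate, so creating a fresh delegate per sort keeps values from leaking between calls:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: wraps a function with a dictionary-backed cache.
static Func<TIn, TOut> Memoize<TIn, TOut>(Func<TIn, TOut> f) where TIn : notnull
{
    var cache = new Dictionary<TIn, TOut>();
    return x => cache.TryGetValue(x, out var hit) ? hit : cache[x] = f(x);
}

// Stand-in for the slow calculation; counts how often it really runs.
var calls = 0;
int SlowLength(string s) { calls++; return s.Length; }

var fastLength = Memoize<string, int>(SlowLength);
var words = new[] { "pear", "fig", "pear", "banana", "fig" };
var sorted = words.OrderByDescending(fastLength).ToList();

Console.WriteLine(string.Join(", ", sorted)); // banana, pear, pear, fig, fig
Console.WriteLine(calls);                     // 3: once per distinct input
```

This only pays off when the same inputs recur; for a single pass over distinct items, the key projection above already computes each value once.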
