How to use Rx Throttle throttleDurationSelector in a complex grouping setting - c#

So, I asked how to change the Throttle TimeSpan in the middle of a running query here, to which James replied that there's an overload and provided an example too (all well and good, and I learned some techniques from it as well).
During the previous weekend I produced a piece of code where the Throttle interval is defined by the incoming stream itself. As a practical example, the stream could be a series of structs defined as follows:
struct SomeEvent
{
public int Id;
public DateTimeOffset TimeStamp;
}
And then the accepting stream would check the TimeStamp fields and calculate the absence intervals based on them. Altering James' linked example a bit, the stream could be produced like this:
// throttleDuration is not defined in the question; two seconds is an assumed value.
var throttleDuration = TimeSpan.FromSeconds(2);
Func<SomeEvent, IObservable<long>> throttleFactory = e => Observable.Timer(TimeSpan.FromTicks(throttleDuration.Ticks - (DateTimeOffset.Now.Ticks - e.TimeStamp.Ticks)));
var sequence = Observable.Interval(TimeSpan.FromSeconds(1)).Select(_ => new SomeEvent { Id = 0, TimeStamp = DateTimeOffset.Now.AddTicks(-1) }).Throttle(throttleFactory);
var subscription = sequence.Subscribe(e => Console.WriteLine(e.TimeStamp));
The time shift, a few ticks, is just for illustration purposes.
Then I had a more elaborate example here, again, helped a lot by James. In short, the idea here was that there could be "a tower of alert lights" per ID (akin to traffic lights), having colours like yellow and red, which are lit each on their turn defined by how long there has been absence of events. Then when an event arrives, all the lights are switched off and "absence timers" start from zero again.
The snag I've hit is that I seem to be unable to alter this particular example so that it would use this idea to produce the Throttle value. In particular, I can't seem to get the grouping to play out nicely on the line grp => grp.Throttle(thresholdSelector(grp.Key, level), scheduler)) in James' code here. Maybe I'm just too exhausted from debugging and all, but I'd sure appreciate it if someone could provide a nudge in the right direction!
What's the big idea? Well, the events could be timestamped at the source, but the transmission could add a delay that needs to be accounted for. Judging from the F# Users Group discussion that's geared towards distributed computing (and being somewhat familiar with integration issues myself), a scenario in which the events are timestamped somewhere and then relayed through different queuing systems creates two kinds of cases:
1. A technical timeout: No events have been observed in some endpoint within a defined interval.
2. A business timeout: There may be plenty of events, e.g. a temporary, sustained burst of events (even duplicates) from a queuing system, but they are timestamped too long ago.
Edit: Brandon makes a valid point regarding my example given in 2. How should one actually interpret the absence of "business timeouts"? In case events haven't arrived, the only valid timeout event to produce is the "technical" one in 1. If they do arrive in a burst, is the receiver interested in the time difference between the events and does it want to raise color events accordingly? Or should the timer just be reset according to the timestamps in business events and then, when a burst arrives, take the timestamp of the last one (which, again, could be longer than the allowed timeout period)? It gets complicated and messy, so it's better to drop this one as an example.
That being written, I'd still be interested to know how to perform the join in grp => grp.Throttle(thresholdSelector(grp.Key, level), scheduler)). I'm inclined to mark Brandon's post as an answer too if this gets complicated (as I feel it might get, that grouping is fairly complex, I feel).

It sounds like throttling is no longer what you want. Is this what you are trying to do?
var alarms = events
    .GroupBy(e => e.Id)
    .SelectMany(grp =>
    {
        // Determine light color based on delay between events
        // go black if event arrives that is not stale
        var black = grp
            .Where(ev => (DateTimeOffset.Now - ev.TimeStamp) < TimeSpan.FromSeconds(2))
            .Select(ev => "black");
        // go yellow if no events after 1 second
        var yellow = black
            .Select(b => Observable.Timer(TimeSpan.FromSeconds(1)))
            .SwitchLatest()
            .Select(t => "yellow");
        // go red if no events after 2 seconds
        var red = black
            .Select(b => Observable.Timer(TimeSpan.FromSeconds(2)))
            .SwitchLatest()
            .Select(t => "red");
        return Observable
            .Merge(black, yellow, red)
            .Select(color => new { Id = grp.Key, Color = color });
    });

Related

How can I modify an IObservable<char> such that I collect characters until there have been no characters for a period of time?

I would like to write an Rx query that takes an IObservable<char> and produces an IObservable<string>. The characters should be buffered until none have been produced for a specified time.
The data source is a serial port from which I have captured the DataReceived event and from that I produce an IObservable<char>. The protocol I am dealing with is fundamentally character based, but it is not very consistent in its implementation so I need to observe the character stream in various different ways. In some cases there is an end-of-response terminator (but not a newline) and in one case, I get a string of unknown length and the only way I know it has all arrived is that nothing else arrives for a few hundred milliseconds. That is the problem I am trying to solve.
I have discovered
var result = source.Buffer(TimeSpan.FromMilliseconds(200))
.Select(s=>new string(s.ToArray()));
Buffer(TimeSpan) is almost what I need but not quite. I need the timer to reset every time a new character arrives, so that the buffer is only produced when sufficient time has elapsed since the last character.
Please, can anyone offer a suggestion on how to achieve this?
[Update]
While I was waiting for an answer, I came up with a solution of my own which essentially re-invents Throttle:
public virtual IObservable<string> BufferUntilQuiescentFor(IObservable<char> source, TimeSpan quietTime)
{
    var shared = source.Publish().RefCount();
    var timer = new Timer(quietTime.TotalMilliseconds);
    var bufferCloser = new Subject<Unit>();
    // Hook up the timer's Elapsed event so that it notifies the bufferCloser sequence
    timer.Elapsed += (sender, args) =>
    {
        timer.Stop();
        bufferCloser.OnNext(Unit.Default); // close the buffer
    };
    // Whenever the shared source sequence produces a value, reset the timer, which will prevent the buffer from closing.
    shared.Subscribe(value =>
    {
        timer.Stop();
        timer.Start();
    });
    // Finally, return the buffered sequence projected into IObservable<string>
    var sequence = shared.Buffer(() => bufferCloser).Select(s => new string(s.ToArray()));
    return sequence;
}
I wasn't understanding Throttle correctly; I thought it behaved differently than it actually does. Now that I've had it explained to me with a 'marble diagram' and I understand it correctly, I believe it is actually a much more elegant solution than what I came up with (I haven't tested my code yet, either). It was an interesting exercise though ;-)
All credit for this goes to Enigmativity - I'm just repeating it here to go with the explanation I'm adding.
var dueTime = TimeSpan.FromMilliseconds(200);
var result = source
    .Publish(o => o.Buffer(() => o.Throttle(dueTime)))
    .Select(cs => new string(cs.ToArray()));
The way it works is shown in this figure (where dueTime corresponds to three dashes of time):
source:    -----h--el--l--o----wo-r--l-d---|
throttled: ------------------o------------d|
buffer[0]: -----h--el--l--o--|
buffer[1]:                   -wo-r--l-d--|
result:    ------------------"hello"------"world"
The use of Publish is just to make sure that Buffer and Throttle share a single subscription to the underlying source. From the documentation for Throttle:
Ignores the values from an observable sequence which are followed by another value before due time...
The overload of Buffer being used takes a sequence of "buffer closings." Each time the sequence emits a value, the current buffer is ended and the next is started.
Does this do what you need?
var result =
    source
        .Publish(hot =>
            hot.Buffer(() =>
                hot.Throttle(TimeSpan.FromMilliseconds(200))))
        .Select(s => new string(s.ToArray()));

How to merge observables on a regular interval?

I'm trying to merge two sensor data streams on a regular interval and I'm having trouble doing this properly in Rx. The best I've come up with is the sample below, however I doubt this is optimal use of Rx.
Is there a better way?
I've tried Sample() but the sensors produce values at irregular intervals, slow (>1sec) and fast (<1sec). Sample() only seems to deal with fast data.
IObservable<SensorA> sensorA = ... /* hot */
IObservable<SensorB> sensorB = ... /* hot */
SensorA lastKnownSensorA = null;
SensorB lastKnownSensorB = null;
sensorA.Subscribe(s => lastKnownSensorA = s);
sensorB.Subscribe(s => lastKnownSensorB = s);
var combined = Observable.Interval(TimeSpan.FromSeconds(1))
    .Where(t => lastKnownSensorA != null)
    .Select(t => new SensorAB(lastKnownSensorA, lastKnownSensorB));
I think @JonasChapuis's answer may be what you are after, but there are a couple of issues which might be problematic:
CombineLatest does not emit a value until all sources have emitted at least one value each, which can cause loss of data from faster sources up until that point. That can be mitigated by using StartWith to seed a null object or default value on each sensor stream.
Sample will not emit a value if no new values have been observed in the sample period. I can't tell from the question if this is desirable or not, but if not there is an interesting trick to address this using a "pace" stream, described below, to create a fixed frequency instead of the maximum frequency obtained with Sample.
To address the CombineLatest issue, determine appropriate null values for your sensor streams - I usually make these available via a static Null property on the type - which makes the intention very clear. For value types use of Nullable<T> can also be a good option:
IObservable<SensorA> sensorA = ... .StartWith(SensorA.Null);
IObservable<SensorB> sensorB = ... .StartWith(SensorB.Null);
N.B. Don't make the common mistake of applying StartWith only to the output of CombineLatest... that won't help!
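To make the seeding point concrete, here is a minimal sketch, assuming the System.Reactive package. The two Subject<int> instances are hypothetical stand-ins for the hot sensor streams, and int? plays the role of the "null reading"; none of these names come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical stand-ins for the two hot sensor streams.
var sensorA = new Subject<int>();
var sensorB = new Subject<int>();

var unseeded = new List<string>();
var seeded = new List<string>();

// Render a missing reading as the text "null".
static string Show(int? reading) => reading?.ToString() ?? "null";

// Without seeding: nothing is emitted until both sensors have produced a value.
sensorA.CombineLatest(sensorB, (a, b) => $"{a}/{b}")
       .Subscribe(s => unseeded.Add(s));

// With seeding: each stream starts with a null reading, so pairs flow at once.
sensorA.Select(a => (int?)a).StartWith(default(int?))
       .CombineLatest(
           sensorB.Select(b => (int?)b).StartWith(default(int?)),
           (a, b) => $"{Show(a)}/{Show(b)}")
       .Subscribe(s => seeded.Add(s));

sensorA.OnNext(1);
sensorA.OnNext(2);
sensorB.OnNext(10);

Console.WriteLine(string.Join(", ", unseeded)); // 2/10
Console.WriteLine(string.Join(", ", seeded));   // null/null, 1/null, 2/null, 2/10
```

The unseeded pipeline loses the first reading from sensorA entirely, which is the data loss described above.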
Now, if you need regular results (which naturally could include repeats of the most recent readings), create a "pace" stream that emits at the desired interval:
var pace = Observable.Interval(TimeSpan.FromSeconds(1));
Then combine as follows, omitting the pace value from results:
var sensorReadings = Observable.CombineLatest(
    pace, sensorA, sensorB,
    (_, a, b) => new SensorAB(a, b));
It's also worth knowing about the MostRecent operator which can be combined with Zip very effectively if you want to drive output at the speed of a specific stream. See these answers where I demonstrate that approach: How to combine a slow moving observable with the most recent value of a fast moving observable and the more interesting tweak to handle multiple streams: How do I combine three observables such that
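For reference, the MostRecent plus Zip combination might look roughly like the sketch below (assuming the System.Reactive package). The subjects are hypothetical: in real code the pace would be something like Observable.Interval, and 0 is an assumed initial value returned until the sensor has produced anything.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var pace = new Subject<Unit>();   // hypothetical driver stream
var sensor = new Subject<int>();  // hypothetical sensor stream
var results = new List<int>();

// Every pace tick pulls whatever the sensor produced most recently.
pace.Zip(sensor.MostRecent(0), (_, reading) => reading)
    .Subscribe(r => results.Add(r));

pace.OnNext(Unit.Default);  // no reading yet, so the initial value 0
sensor.OnNext(1);
sensor.OnNext(2);
pace.OnNext(Unit.Default);  // 2
pace.OnNext(Unit.Default);  // 2 again, repeated because nothing new arrived
sensor.OnNext(3);
pace.OnNext(Unit.Default);  // 3

Console.WriteLine(string.Join(", ", results)); // 0, 2, 2, 3
```

Output is driven purely by the pace stream, so readings that arrive between ticks (the 1 here) are dropped and stale readings are repeated.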
How about using the CombineLatest() operator to merge the latest values of the sensors every time either produces a value, followed by Sample() to ensure a max frequency of one measurement per second?
sensorA.CombineLatest(sensorB, (a, b) => new {A=a, B=b}).Sample(TimeSpan.FromSeconds(1))

Aggregate function before IObservable sequence is completed

Is there a way to use an aggregate function (Max, Count, ...) with Buffer before a sequence is completed?
When the sequence completes this produces results, but with a continuous stream it does not give any results.
I was expecting there to be some way to make this work with Buffer.
IObservable<long> source;
IObservable<IGroupedObservable<long, long>> group = source
.Buffer(TimeSpan.FromSeconds(5))
.GroupBy(i => i % 3);
IObservable<long> sub = group.SelectMany(grp => grp.Max());
sub.Subscribe(l =>
{
Console.WriteLine("working");
});
Use Scan instead of Aggregate. Scan works just like Aggregate except that it sends out intermediate values as the stream advances. It is good for "running totals", which appears to be what you are asking for.
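A small sketch of the difference, assuming the System.Reactive package and a hypothetical Subject<long> as the never-completing source:

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var source = new Subject<long>();   // hypothetical stream that never completes
var running = new List<long>();
var aggregated = new List<long>();

// Scan emits the running maximum after every element.
source.Scan((max, x) => Math.Max(max, x)).Subscribe(m => running.Add(m));

// Max waits for completion, which never happens here.
source.Max().Subscribe(m => aggregated.Add(m));

foreach (var v in new long[] { 3, 1, 4, 1, 5 })
    source.OnNext(v);

Console.WriteLine(string.Join(", ", running)); // 3, 3, 4, 4, 5
Console.WriteLine(aggregated.Count);           // 0
```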
All the "statistical" operators in Rx (Min/Max/Sum/Count/Average) propagate the calculated value only when the source sequence completes, and that is the big difference between Scan and Aggregate. Basically, if you want to be notified each time a new value is pushed, you need to use Scan.
In your case, if you want to keep the same logic, you should combine it with the GroupByUntil or Window operators. Both can create and complete the group subscription regularly, and each completion is what pushes the next aggregated value.
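As a rough illustration of the Window idea (assuming the System.Reactive package), the sketch below uses count-based windows of three elements so that the result is deterministic; a time-based Window(TimeSpan) would be wired the same way. The Subject<long> is a hypothetical source.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var source = new Subject<long>();   // hypothetical stream that never completes
var maxima = new List<long>();

// Each window completes after three elements, which lets Max emit for it.
source.Window(3)
      .SelectMany(window => window.Max())
      .Subscribe(m => maxima.Add(m));

foreach (var v in new long[] { 1, 5, 2, 7, 3, 9, 4 })
    source.OnNext(v);

// The trailing 4 sits in a window that is still open, so it has no result yet.
Console.WriteLine(string.Join(", ", maxima)); // 5, 9
```

One thing to watch for with time-based windows: a window can close without any elements, and Max on an empty sequence of a non-nullable type signals an error, so it may be worth guarding against empty windows.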
You can get more info here: http://www.introtorx.com/content/v1.0.10621.0/07_Aggregation.html#BuildYourOwn
By the way I wrote a text related to what you want. Check in: http://www.codeproject.com/Tips/853256/Real-time-statistics-with-Rx-Statistical-Demo-App

Smoothing Rx Observables

Very similar to this question: Rx IObservable buffering to smooth out bursts of events, I am interested in smoothing out observables that may occur in bursts.
Hopefully the diagram below illustrates what I am aiming for:
Raw:      A--B--CDE-F--------------G-----------------------
Interval: o--o--o--o--o--o--o--o--o--o--o--o--o--o--o--o--o
Output:   A--B--C--D--E--F-----------G---------------------
Given the raw stream, I wish to stretch these events over regular intervals.
Throttling does not work as then I end up losing elements of the raw sequence.
Zip works well if the raw stream is more frequent than the timer, but fails if there are periods where there are no raw events.
EDIT
In response to Dan's answer, the problem with Buffer is that if bursts of many events arrive within a short time interval then I receive the events too often. Below shows what could happen with a buffer size of 3, and a timeout configured to the required interval:
Raw:      -ABC-DEF-----------G-H-------------------------------
Interval: o--------o--------o--------o--------o--------o--------
Buffered: ---A---D-------------------G--------------------------
             B   E                   H
             C   F
Desired:  ---------A--------B--------C--------D--------E ..etc.
How about this? (inspired by James' answer mentioned in the comments)...
public static IObservable<T> Regulate<T>(this IObservable<T> source, TimeSpan period)
{
    var interval = Observable.Interval(period).Publish().RefCount();
    return source.Select(x => Observable.Return(x)
                                        .CombineLatest(interval, (v, _) => v)
                                        .Take(1))
                 .Concat();
}
It turns each value in the raw observable into its own observable. The CombineLatest means it won't produce a value until the interval does. Then we just take one value from each of these observables and concatenate.
The first value in the raw observable gets delayed by one period. I'm not sure if that is an issue for you or not.
It looks like what you want to use is Buffer. One of the overloads allows you to specify an interval as well as the buffer length. You could conceivably set the length to 1.
Raw.Buffer(interval, 1);
For some more examples of its use, you can refer to the IntroToRX site.

Is it possible to Observable.Buffer on something other than time

I've been looking for examples on how to use Observable.Buffer in rx but can't find anything more substantial than boiler plate time buffered stuff.
There does seem to be an overload to specify a "bufferClosingSelector" but I can't wrap my mind around it.
What I'm trying to do is create a sequence that buffers by time or by an "accumulation".
Consider a request stream where every request has some sort of weight to it and I do not want to process more than x accumulated weight at a time, or, if not enough has accumulated, just give me what has come in during the last timeframe (regular Buffer functionality).
bufferClosingSelector is a function called every time to get an Observable which will produce a value when the buffer is expected to be closed.
For example,
source.Buffer(() => Observable.Timer(TimeSpan.FromSeconds(1))) works like the regular Buffer(time) overload.
If you want to weight a sequence, you can apply a Scan over the sequence and then decide on your aggregating condition.
E.g., source.Scan((a,c) => a + c).SkipWhile(a => a < 100) gives you a sequence which produces a value once the source sequence has added up to 100 or more.
You can use Amb to race these two closing conditions to see which reacts first:
source.Buffer(() => Observable.Amb
    (
        Observable.Timer(TimeSpan.FromSeconds(1)),
        source.Scan((a,c) => a + c).SkipWhile(a => a < 100)
    )
)
You can use any series of combinators which produces any value for the buffer to be closed at that point.
Note:
The value given to the closing selector doesn't matter - it's the notification that matters. So to combine sources of different types with Amb, simply project each of them to System.Reactive.Unit.
Observable.Amb(stream1.Select(_ => new Unit()), stream2.Select(_ => new Unit()))
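Putting the pieces above together, a minimal sketch could look like this (assuming the System.Reactive package). The Subject<int> is a hypothetical hot stream in which each value is a request weight, and the limit of 100 and the one second timeout are taken from the snippets above. Because the source is hot, every new closing sequence starts its Scan from zero.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive;
using System.Reactive.Linq;
using System.Reactive.Subjects;

var source = new Subject<int>();   // hypothetical hot stream of request weights
var batches = new List<string>();

source.Buffer(() => Observable.Amb(
            // Close after one second of buffering...
            Observable.Timer(TimeSpan.FromSeconds(1)).Select(_ => Unit.Default),
            // ...or as soon as the accumulated weight reaches 100, whichever is first.
            source.Scan((a, c) => a + c).SkipWhile(a => a < 100).Select(_ => Unit.Default)))
      .Subscribe(batch => batches.Add(string.Join("+", batch)));

source.OnNext(40);
source.OnNext(70);   // 40 + 70 reaches the limit, closing the first buffer
source.OnNext(100);  // reaches the limit on its own
source.OnNext(30);   // below the limit
source.OnCompleted(); // completion flushes the buffer that is still open

Console.WriteLine(string.Join(" | ", batches)); // 40+70 | 100 | 30
```

Note that the element which tips the total over the limit ends up in the closing buffer, so a buffer can exceed the limit by up to one element's weight.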
