Is it possible to Observable.Buffer on something other than time - c#

I've been looking for examples of how to use Observable.Buffer in Rx but can't find anything more substantial than boilerplate time-buffered examples.
There does seem to be an overload to specify a "bufferClosingSelector" but I can't wrap my mind around it.
What I'm trying to do is create a sequence that buffers by time or by an "accumulation".
Consider a request stream where every request has a weight, and I do not want to process more than x accumulated weight at a time; if not enough weight has accumulated, just give me whatever has come in during the last timeframe (the regular Buffer functionality).

bufferClosingSelector is a function that is called each time a new buffer opens. It returns an Observable, and the buffer is closed when that Observable produces a value.
For example,
source.Buffer(() => Observable.Timer(TimeSpan.FromSeconds(1))) works like the regular Buffer(time) overload.
If you want to weight a sequence, you can apply a Scan over the sequence and then decide on your aggregating condition.
E.g., source.Scan((a,c) => a + c).SkipWhile(a => a < 100) gives you a sequence which produces its first value once the source sequence has added up to at least 100.
You can use Amb to race these two closing conditions to see which reacts first:
.Buffer(() => Observable.Amb
(
Observable.Timer(TimeSpan.FromSeconds(1)),
source.Scan((a,c) => a + c).SkipWhile(a => a < 100)
)
)
You can use any combination of operators as the closing sequence: whenever it produces a value, the buffer is closed at that point.
Note:
The value produced by the closing sequence doesn't matter - it's the notification that matters. So to combine sources of different types with Amb, simply project each of them to System.Reactive.Unit.
Observable.Amb(stream1.Select(_ => Unit.Default), stream2.Select(_ => Unit.Default))
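Putting the pieces above together, a minimal sketch of a time-or-weight buffer might look like the following. It assumes the System.Reactive package and plain integer weights; BufferByTimeOrWeight, maxTime and maxWeight are illustrative names, not part of Rx. Publish is used here so that the buffer and the running total share a single subscription to the source, and both closing sequences are projected to Unit so that Amb can race them.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive;
using System.Reactive.Linq;

public static class WeightedBufferExtensions
{
    // Hypothetical helper: closes the current buffer when either maxTime has
    // elapsed or the weights seen since the buffer opened add up to maxWeight.
    public static IObservable<IList<int>> BufferByTimeOrWeight(
        this IObservable<int> weights, TimeSpan maxTime, int maxWeight)
    {
        return weights.Publish(shared =>
            shared.Buffer(() => Observable.Amb(
                // Closing condition 1: the time limit for this buffer.
                Observable.Timer(maxTime).Select(_ => Unit.Default),
                // Closing condition 2: the running total reaches maxWeight.
                shared.Scan((total, weight) => total + weight)
                      .SkipWhile(total => total < maxWeight)
                      .Select(_ => Unit.Default))));
    }
}
```

Because the closing selector is invoked again every time a new buffer opens, Scan is re-subscribed and the running total starts over for each buffer. For a stream of richer request objects you would first project each request to its weight for the closing sequence, which is left out of this sketch.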

Related

How can I modify an IObservable<char> such that I collect characters until there have been no characters for a period of time?

I would like to write an Rx query that takes an IObservable<char> and produces an IObservable<string>. The strings should be buffered until there have been no characters produced for a specified time.
The data source is a serial port from which I have captured the DataReceived event and from that I produce an IObservable<char>. The protocol I am dealing with is fundamentally character based, but it is not very consistent in its implementation so I need to observe the character stream in various different ways. In some cases there is an end-of-response terminator (but not a newline) and in one case, I get a string of unknown length and the only way I know it has all arrived is that nothing else arrives for a few hundred milliseconds. That is the problem I am trying to solve.
I have discovered
var result = source.Buffer(TimeSpan.FromMilliseconds(200))
.Select(s=>new string(s.ToArray()));
Buffer(TimeSpan) is almost what I need but not quite. I need the timer to reset every time a new character arrives, so that the buffer is only produced when sufficient time has elapsed since the last character.
Please, can anyone offer a suggestion on how to achieve this?
[Update]
While I was waiting for an answer, I came up with a solution of my own which essentially re-invents Throttle:
public virtual IObservable<string> BufferUntilQuiescentFor(IObservable<char> source, TimeSpan quietTime)
{
var shared = source.Publish().RefCount();
var timer = new Timer(quietTime.TotalMilliseconds);
var bufferCloser = new Subject<Unit>();
// Hook up the timer's Elapsed event so that it notifies the bufferCloser sequence
timer.Elapsed += (sender, args) =>
{
timer.Stop();
bufferCloser.OnNext(Unit.Default); // close the buffer
};
// Whenever the shared source sequence produces a value, reset the timer, which will prevent the buffer from closing.
shared.Subscribe(value =>
{
timer.Stop();
timer.Start();
});
// Finally, return the buffered sequence projected into IObservable<string>
var sequence = shared.Buffer(() => bufferCloser).Select(s=>new string(s.ToArray()));
return sequence;
}
I wasn't understanding Throttle correctly; I thought it behaved differently than it actually does. Now that I've had it explained to me with a 'marble diagram' and I understand it correctly, I believe it is actually a much more elegant solution than what I came up with (I haven't tested my code yet, either). It was an interesting exercise though ;-)
All credit for this goes to Enigmativity - I'm just repeating it here to go with the explanation I'm adding.
var dueTime = TimeSpan.FromMilliseconds(200);
var result = source
.Publish(o => o.Buffer(() => o.Throttle(dueTime)))
.Select(cs => new string(cs.ToArray()));
The way it works is shown in this figure (where dueTime corresponds to three dashes of time):
source:    -----h--el--l--o----wo-r--l-d---|
throttled: ------------------o------------d|
buffer[0]: -----h--el--l--o--|
buffer[1]:                    -wo-r--l-d--|
result:    ------------------"hello"------"world"
The use of Publish is just to make sure that Buffer and Throttle share a single subscription to the underlying source. From the documentation for Throttle:
Ignores the values from an observable sequence which are followed by another value before due time...
The overload of Buffer being used takes a sequence of "buffer closings." Each time the sequence emits a value, the current buffer is ended and the next is started.
Does this do what you need?
var result =
source
.Publish(hot =>
hot.Buffer(() =>
hot.Throttle(TimeSpan.FromMilliseconds(200))))
.Select(s => new string(s.ToArray()));

How to merge observables on a regular interval?

I'm trying to merge two sensor data streams on a regular interval and I'm having trouble doing this properly in Rx. The best I've come up with is the sample below, however I doubt this is optimal use of Rx.
Is there a better way?
I've tried Sample() but the sensors produce values at irregular intervals, slow (>1sec) and fast (<1sec). Sample() only seems to deal with fast data.
IObservable<SensorA> sensorA = ... /* hot */
IObservable<SensorB> sensorB = ... /* hot */
SensorA lastKnownSensorA;
SensorB lastKnownSensorB;
sensorA.Subscribe(s => lastKnownSensorA = s);
sensorB.Subscribe(s => lastKnownSensorB = s);
var combined = Observable.Interval(TimeSpan.FromSeconds(1))
.Where(t => lastKnownSensorA != null)
.Select(t => new SensorAB(lastKnownSensorA, lastKnownSensorB));
I think @JonasChapuis's answer may be what you are after, but there are a couple of issues which might be problematic:
CombineLatest does not emit a value until all sources have emitted at least one value each, which can cause loss of data from faster sources up until that point. That can be mitigated by using StartWith to seed a null object or default value on each sensor stream.
Sample will not emit a value if no new values have been observed in the sample period. I can't tell from the question if this is desirable or not, but if not, there is an interesting trick to address this using a "pace" stream, described below, which creates a fixed frequency instead of the maximum frequency obtained with Sample.
To address the CombineLatest issue, determine appropriate null values for your sensor streams - I usually make these available via a static Null property on the type - which makes the intention very clear. For value types use of Nullable<T> can also be a good option:
IObservable<SensorA> sensorA = ... .StartWith(SensorA.Null);
IObservable<SensorB> sensorB = ... .StartWith(SensorB.Null);
N.B. Don't make the common mistake of applying StartWith only to the output of CombineLatest... that won't help!
Now, if you need regular results (which naturally could include repeats of the most recent readings), create a "pace" stream that emits at the desired interval:
var pace = Observable.Interval(TimeSpan.FromSeconds(1));
Then combine as follows, omitting the pace value from results:
var sensorReadings = Observable.CombineLatest(
pace, sensorA, sensorB,
(_, a, b) => new SensorAB(a,b));
It's also worth knowing about the MostRecent operator which can be combined with Zip very effectively if you want to drive output at the speed of a specific stream. See these answers where I demonstrate that approach: How to combine a slow moving observable with the most recent value of a fast moving observable and the more interesting tweak to handle multiple streams: How do I combine three observables such that
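The MostRecent-plus-Zip idea mentioned above can be sketched roughly as follows. This assumes the System.Reactive package; SampleLatestOn and its parameter names are hypothetical, chosen only for this illustration. The pace stream drives the output, and each tick is paired with whatever value the sensor produced most recently.

```csharp
using System;
using System.Reactive.Linq;

public static class PacedSampling
{
    // Hypothetical helper: emits once per tick of `pace`, yielding the latest
    // value seen on `sensor` (or `initial` if nothing has arrived yet).
    public static IObservable<T> SampleLatestOn<T, TPace>(
        this IObservable<T> sensor, IObservable<TPace> pace, T initial)
    {
        // MostRecent exposes the sensor as a non-blocking enumerable that always
        // returns the latest value; Zip pulls one item from it per pace tick.
        return pace.Zip(sensor.MostRecent(initial), (_, latest) => latest);
    }
}
```

For example, sensorA.SampleLatestOn(Observable.Interval(TimeSpan.FromSeconds(1)), SensorA.Null) would produce one reading per second, repeating the last reading when the sensor has been quiet.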
How about using the CombineLatest() operator to merge the latest values of the sensors every time either produces a value, followed by Sample() to ensure a max frequency of one measurement per second?
sensorA.CombineLatest(sensorB, (a, b) => new {A=a, B=b}).Sample(TimeSpan.FromSeconds(1))

Aggregate function before IObservable sequence is completed

Is there a way to use an aggregate function (Max, Count, ...) with Buffer before a sequence is completed?
When the sequence completes this produces results, but with a continuous stream it does not give
any results.
I was expecting there to be some way to make this work with Buffer.
IObservable<long> source;
IObservable<IGroupedObservable<long, long>> group = source
.Buffer(TimeSpan.FromSeconds(5))
.GroupBy(i => i % 3);
IObservable<long> sub = group.SelectMany(grp => grp.Max());
sub.Subscribe(l =>
{
Console.WriteLine("working");
});
Use Scan instead of Aggregate. Scan works just like Aggregate except that it sends out intermediate values as the stream advances. It is good for "running totals", which appears to be what you are asking for.
All the "statistical" operators in Rx (Min/Max/Sum/Count/Average) propagate the calculated value only when the source sequence completes. That is the big difference between Scan and Aggregate: if you want to be notified each time a new value is pushed, you need to use Scan.
In your case, if you want to keep the same logic, you should combine it with the GroupByUntil or Window operators. Both let you create and complete the group (or window) subscription at regular intervals, and each completion is what pushes the next aggregated value.
You can get more info here: http://www.introtorx.com/content/v1.0.10621.0/07_Aggregation.html#BuildYourOwn
By the way I wrote a text related to what you want. Check in: http://www.codeproject.com/Tips/853256/Real-time-statistics-with-Rx-Statistical-Demo-App
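The difference between Scan and Aggregate described in the answers above can be illustrated with a small sketch. It assumes the System.Reactive package; RunningMax and FinalMax are hypothetical helper names used only for this comparison.

```csharp
using System;
using System.Reactive.Linq;

public static class RunningStats
{
    // Emits the largest value seen so far each time the source produces a value.
    public static IObservable<long> RunningMax(this IObservable<long> source)
    {
        return source.Scan((max, value) => Math.Max(max, value));
    }

    // For contrast: emits a single value, and only once the source completes.
    public static IObservable<long> FinalMax(this IObservable<long> source)
    {
        return source.Aggregate((max, value) => Math.Max(max, value));
    }
}
```

On a stream that never completes, a subscriber to RunningMax sees a value for every element, while a subscriber to FinalMax sees nothing.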

How to measure RFT metric in NDepend?

Does NDepend have a direct way to measure RFC (RFT) with CQL? Or do we have to write a CQL query ourselves that recursively counts the methods invoked in the used classes (types)? If so, what does it look like? Similar to this?
This is indeed possible thanks to the magic FillIterative() method. The only issue is that this code query can take a few seconds to run and you might need to adjust the query time-out (set to 2 seconds by default). Notice that this performance cost is due to the RFT definition; it doesn't come from a problem with the algorithm used.
Here is the code:
// <Name>RFT</Name>
from t in Application.Types
where !t.IsInterface && !t.IsEnumeration
let methodsUsedDirectly =
t.TypesUsed
.Where(tUsed => !tUsed.IsThirdParty)
.ChildMethods()
.Where(m => m.IsUsedBy(t))
let methodsUsedIndirectly = methodsUsedDirectly
.FillIterative(
methods => methods.SelectMany(
m => m.MethodsCalled
.Where(mCalled => !mCalled.IsThirdParty))
)
.DefinitionDomain.ToArray()
let rftLoC = methodsUsedIndirectly.Sum(m=>m.NbLinesOfCode)
select new { t,
methodsUsedDirectly,
methodsUsedIndirectly,
rft = methodsUsedIndirectly.Length,
rftLoC }
By running this query, you not only get the RFT measure for each type, but also the RFT in terms of lines of code (i.e. the sum of the LoC of the methods in the response), and for each type you can list all methods in the response.

Linq keyword extraction - limit extraction scope

With regards to this solution.
Is there a way to limit the number of keywords to be taken into consideration? For example, I'd like only the first 1000 words of text to be calculated. There's a "Take" method in Linq, but it serves a different purpose - all words will be calculated, and N records will be returned. What's the right alternative to do this correctly?
Simply apply Take earlier - straight after the call to Split:
var results = src.Split()
.Take(1000)
.GroupBy(...) // etc
Well, strictly speaking LINQ is not necessarily going to read everything; Take will stop as soon as it can. The problem is that in the related question you look at Count, and it is hard to get a Count without consuming all the data. Likewise, string.Split will look at everything.
But if you wrote a lazy non-buffering Split function (using yield return) and you wanted the first 1000 unique words, then
var words = LazySplit(text).Distinct().Take(1000);
would work.
Enumerable.Take does in fact stream results out; it doesn't buffer up its source entirely and then return only the first N. Looking at your original solution though, the problem is that the input to where you would want to do a Take is String.Split. Unfortunately, this method doesn't use any sort of deferred execution; it eagerly creates an array of all the 'splits' and then returns it.
Consequently, the technique to get a streaming sequence of words from some text would be something like:
var words = src.StreamingSplit() // you'll have to implement that
.Take(1000);
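A minimal sketch of such a lazy splitter is shown below. StreamingSplit is not a framework method; this is one hypothetical way to write it with yield return. It splits on whitespace only and, unlike the parameterless string.Split(), skips empty entries between consecutive separators.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class StringSplitExtensions
{
    // Hypothetical lazy counterpart to string.Split(): yields whitespace-separated
    // words one at a time and reads no further once the caller stops enumerating.
    public static IEnumerable<string> StreamingSplit(this string text)
    {
        var word = new StringBuilder();
        foreach (char c in text)
        {
            if (!char.IsWhiteSpace(c))
            {
                word.Append(c);
            }
            else if (word.Length > 0)
            {
                yield return word.ToString();
                word.Clear();
            }
        }
        if (word.Length > 0)
        {
            yield return word.ToString(); // trailing word with no separator after it
        }
    }
}
```

Because the iterator only advances when the consumer asks for the next word, chaining .Take(1000) means the scan stops shortly after the 1,000th word instead of processing the whole text.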
However, I do note that the rest of your query is:
...
.GroupBy(str => str) // group words by the value
.Select(g => new
{
str = g.Key, // the value
count = g.Count() // the count of that value
});
Do note that GroupBy is a buffering operation - you can expect that all of the 1,000 words from its source will end up getting stored somewhere in the process of the groups being piped out.
As I see it, the options are:
1. If you don't mind going through all of the text for splitting purposes, then src.Split().Take(1000) is fine. The downside is wasted time (to continue splitting after it is no longer necessary) and wasted space (to store all of the words in an array even though only the first 1,000 will be needed). However, the rest of the query will not operate on any more words than necessary.
2. If you can't afford to do (1) because of time / memory constraints, go with src.StreamingSplit().Take(1000) or equivalent. In this case, none of the original text will be processed after 1,000 words have been found.
Do note that those 1,000 words themselves will end up getting buffered by the GroupBy clause in both cases.
