Does Reactive Extensions support rolling buffers? - C#

I'm using reactive extensions to collate data into buffers of 100ms:
this.subscription = this.dataService
.Where(x => !string.Equals("FOO", x.Key.Source))
.Buffer(TimeSpan.FromMilliseconds(100))
.ObserveOn(this.dispatcherService)
.Where(x => x.Count != 0)
.Subscribe(this.OnBufferReceived);
This works fine. However, I want slightly different behavior than that provided by the Buffer operation. Essentially, I want to reset the timer if another data item is received. Only when no data has been received for the entire 100ms do I want to handle it. This opens up the possibility of never handling the data, so I should also be able to specify a maximum count. I would imagine something along the lines of:
.SlidingBuffer(TimeSpan.FromMilliseconds(100), 10000)
I've had a look around and haven't been able to find anything like this in Rx. Can anyone confirm or deny this?

This is possible by combining the built-in Window and Throttle methods of Observable. First, let's solve the simpler problem where we ignore the maximum count condition:
public static IObservable<IList<T>> BufferUntilInactive<T>(this IObservable<T> stream, TimeSpan delay)
{
var closes = stream.Throttle(delay);
return stream.Window(() => closes).SelectMany(window => window.ToList());
}
The powerful Window method did the heavy lifting. Now it's easy enough to see how to add a maximum count:
public static IObservable<IList<T>> BufferUntilInactive<T>(this IObservable<T> stream, TimeSpan delay, Int32? max=null)
{
var closes = stream.Throttle(delay);
if (max != null)
{
var overflows = stream.Where((x,index) => index+1>=max);
closes = closes.Merge(overflows);
}
return stream.Window(() => closes).SelectMany(window => window.ToList());
}
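For example, plugged into the pipeline from the question it would look like this (a sketch; dataService, dispatcherService and OnBufferReceived are the question's own members):
this.subscription = this.dataService
.Where(x => !string.Equals("FOO", x.Key.Source))
.BufferUntilInactive(TimeSpan.FromMilliseconds(100), 10000)
.ObserveOn(this.dispatcherService)
.Where(x => x.Count != 0)
.Subscribe(this.OnBufferReceived);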
I'll write a post explaining this on my blog. https://gist.github.com/2244036
Documentation for the Window method:
http://leecampbell.blogspot.co.uk/2011/03/rx-part-9join-window-buffer-and-group.html
http://enumeratethis.com/2011/07/26/financial-charts-reactive-extensions/

I wrote an extension to do most of what you're after - BufferWithInactivity.
Here it is:
public static IObservable<IEnumerable<T>> BufferWithInactivity<T>(
this IObservable<T> source,
TimeSpan inactivity,
int maximumBufferSize)
{
return Observable.Create<IEnumerable<T>>(o =>
{
var gate = new object();
var buffer = new List<T>();
var mutable = new SerialDisposable();
var subscription = (IDisposable)null;
var scheduler = Scheduler.ThreadPool;
Action dump = () =>
{
var bts = buffer.ToArray();
buffer = new List<T>();
if (o != null)
{
o.OnNext(bts);
}
};
Action dispose = () =>
{
if (subscription != null)
{
subscription.Dispose();
}
mutable.Dispose();
};
Action<Action<IObserver<IEnumerable<T>>>> onErrorOrCompleted =
onAction =>
{
lock (gate)
{
dispose();
dump();
if (o != null)
{
onAction(o);
}
}
};
Action<Exception> onError = ex =>
onErrorOrCompleted(x => x.OnError(ex));
Action onCompleted = () => onErrorOrCompleted(x => x.OnCompleted());
Action<T> onNext = t =>
{
lock (gate)
{
buffer.Add(t);
if (buffer.Count == maximumBufferSize)
{
dump();
mutable.Disposable = Disposable.Empty;
}
else
{
mutable.Disposable = scheduler.Schedule(inactivity, () =>
{
lock (gate)
{
dump();
}
});
}
}
};
subscription =
source
.ObserveOn(scheduler)
.Subscribe(onNext, onError, onCompleted);
return () =>
{
lock (gate)
{
o = null;
dispose();
}
};
});
}

With Reactive Extensions 2.0, you can meet both requirements with a new Buffer overload that accepts both a timeout and a maximum size:
this.subscription = this.dataService
.Where(x => !string.Equals("FOO", x.Key.Source))
.Buffer(TimeSpan.FromMilliseconds(100), 1)
.ObserveOn(this.dispatcherService)
.Where(x => x.Count != 0)
.Subscribe(this.OnBufferReceived);
See https://msdn.microsoft.com/en-us/library/hh229200(v=vs.103).aspx for the documentation.

I guess this can be implemented on top of the Buffer method, as shown below:
public static IObservable<IList<T>> SlidingBuffer<T>(this IObservable<T> obs, TimeSpan span, int max)
{
return Observable.CreateWithDisposable<IList<T>>(cl =>
{
var acc = new List<T>();
return obs.Buffer(span)
.Subscribe(next =>
{
if (next.Count == 0) //no activity in time span
{
cl.OnNext(acc.ToList()); //emit a copy so clearing the buffer below doesn't affect what the observer received
acc.Clear();
}
else
{
acc.AddRange(next);
if (acc.Count >= max) //max items collected
{
cl.OnNext(acc.ToList()); //emit a copy so clearing the buffer below doesn't affect what the observer received
acc.Clear();
}
}
}, err => cl.OnError(err), () => { cl.OnNext(acc); cl.OnCompleted(); });
});
}
NOTE: I haven't tested it, but I hope it gives you the idea.

Colonel Panic's solution is almost perfect. The only thing that is missing is a Publish component, in order to make the solution work with cold sequences too.
/// <summary>
/// Projects each element of an observable sequence into a buffer that's sent out
/// when either a given inactivity timespan has elapsed, or it's full,
/// using the specified scheduler to run timers.
/// </summary>
public static IObservable<IList<T>> BufferUntilInactive<T>(
this IObservable<T> source, TimeSpan dueTime, int maxCount,
IScheduler scheduler = default)
{
if (maxCount < 1) throw new ArgumentOutOfRangeException(nameof(maxCount));
scheduler ??= Scheduler.Default;
return source.Publish(published =>
{
var combinedBoundaries = Observable.Merge
(
published.Throttle(dueTime, scheduler),
published.Skip(maxCount - 1)
);
return published
.Window(() => combinedBoundaries)
.SelectMany(window => window.ToList());
});
}
Beyond adding the Publish, I've also replaced the original .Where((_, index) => index + 1 >= maxCount) with the equivalent but shorter .Skip(maxCount - 1). For completeness there is also an IScheduler parameter, which configures the scheduler where the timer is run.
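Because the timer runs on the supplied scheduler, the operator can also be driven deterministically in tests (a sketch; source stands in for any observable, and TestScheduler comes from the Microsoft.Reactive.Testing package):
var scheduler = new TestScheduler();
var buffers = source.BufferUntilInactive(TimeSpan.FromMilliseconds(100), maxCount: 10000, scheduler: scheduler);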

Related

Most efficient way to find all modes in a List of randomly generated integers and how often they occurred

If a method written in C# will be passed either a null or anywhere from 0 to 6,000,000 randomly generated, unsorted integers, what is the most efficient way to determine all modes and how many times they occurred? In particular, can anyone help me with a LINQ-based solution, which I'm struggling with?
Here is what I have so far:
My closest LINQ solution so far only grabs the first mode it finds and does not report the number of occurrences. It is also about 7 times as slow on my computer as my ugly, bulky implementation.
int mode = numbers.GroupBy(number => number).OrderByDescending(group => group.Count()).Select(k => k.Key).FirstOrDefault();
My manually coded method.
public class NumberCount
{
public int Value;
public int Occurrences;
public NumberCount(int value, int occurrences)
{
Value = value;
Occurrences = occurrences;
}
}
private static List<NumberCount> findMostCommon(List<int> integers)
{
if (integers == null)
return null;
else if (integers.Count < 1)
return new List<NumberCount>();
List<NumberCount> mostCommon = new List<NumberCount>();
integers.Sort();
mostCommon.Add(new NumberCount(integers[0], 1));
for (int i=1; i<integers.Count; i++)
{
if (mostCommon[mostCommon.Count - 1].Value != integers[i])
mostCommon.Add(new NumberCount(integers[i], 1));
else
mostCommon[mostCommon.Count - 1].Occurrences++;
}
List<NumberCount> answer = new List<NumberCount>();
answer.Add(mostCommon[0]);
for (int i=1; i<mostCommon.Count; i++)
{
if (mostCommon[i].Occurrences > answer[0].Occurrences)
{
if (answer.Count == 1)
{
answer[0] = mostCommon[i];
}
else
{
answer = new List<NumberCount>();
answer.Add(mostCommon[i]);
}
}
else if (mostCommon[i].Occurrences == answer[0].Occurrences)
{
answer.Add(mostCommon[i]);
}
}
return answer;
}
Basically, I'm trying to get an elegant, compact LINQ solution at least as fast as my ugly method. Thanks in advance for any suggestions.
I would personally use a ConcurrentDictionary to update a counter per value; dictionaries are fast to access. I use this approach quite a lot and find it more readable.
// create a dictionary
var dictionary = new ConcurrentDictionary<int, int>();
// list of you integers
var numbers = new List<int>();
// parallelize the iteration (we can, because ConcurrentDictionary is thread-safe-ish)
numbers.AsParallel().ForAll((number) =>
{
// add the key with a value of 1 if it's not there; if it is there, the lambda increments the existing value by 1
dictionary.AddOrUpdate(number, 1, (key, old) => old + 1);
});
Then it's only a matter of getting the most frequent entry, and there are many ways to do that. I don't fully understand your version, but finding the single most frequent value is just one Aggregate call, like so:
var topMostOccurence = dictionary.Aggregate((x, y) => { return x.Value > y.Value ? x : y; });
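If you want all tied modes and their counts (as the question asks) rather than just one, a small follow-up over the same dictionary works too (a sketch, not benchmarked; it reuses the question's NumberCount class):
var maxCount = dictionary.Values.Max();
var allModes = dictionary
.Where(kv => kv.Value == maxCount)
.Select(kv => new NumberCount(kv.Key, kv.Value))
.ToList();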
What you want: two or more numbers could appear the same number of times in an array, like {1,1,1,2,2,2,3,3,3}.
Your current code comes from here: Find the most occurring number in a List<int>
But that returns a single number only, which is the wrong result here.
The problem with LINQ is that you cannot end the loop early unless you make it stop.
Still, here is a list-returning LINQ solution, as you required:
List<NumberCount> MaxOccurrences(List<int> integers)
{
return integers?.AsParallel()
.GroupBy(x => x)//group numbers, key is number, count is count
.Select(k => new NumberCount(k.Key, k.Count()))
.GroupBy(x => x.Occurrences)//group by Occurrences, key is Occurrences, value is result
.OrderByDescending(x => x.Key) //sort
.FirstOrDefault()? //the first one is result
.ToList();
}
Test details:
Array Size:30000
30000
MaxOccurrences only
MaxOccurrences1: 207
MaxOccurrences2: 38
=============
Full List
Original1: 28
Original2: 23
ConcurrentDictionary1: 32
ConcurrentDictionary2: 34
AsParallel1: 27
AsParallel2: 19
AsParallel3: 36
ArraySize: 3000000
3000000
MaxOccurrences only
MaxOccurrences1: 3009
MaxOccurrences2: 1962 //<==this is the best one in big loop.
=============
Full List
Original1: 3200
Original2: 3234
ConcurrentDictionary1: 3391
ConcurrentDictionary2: 2681
AsParallel1: 3776
AsParallel2: 2389
AsParallel3: 2155
Here is the code:
class Program
{
static void Main(string[] args)
{
const int listSize = 3000000;
var rnd = new Random();
var randomList = Enumerable.Range(1, listSize).OrderBy(e => rnd.Next()).ToList();
// the code that you want to measure comes here
Console.WriteLine(randomList.Count);
Console.WriteLine("MaxOccurrences only");
Test(randomList, MaxOccurrences1);
Test(randomList, MaxOccurrences2);
Console.WriteLine("=============");
Console.WriteLine("Full List");
Test(randomList, Original1);
Test(randomList, Original2);
Test(randomList, AsParallel1);
Test(randomList, AsParallel2);
Test(randomList, AsParallel3);
Console.ReadLine();
}
private static void Test(List<int> data, Action<List<int>> method)
{
var watch = System.Diagnostics.Stopwatch.StartNew();
method(data);
watch.Stop();
Console.WriteLine($"{method.Method.Name}: {watch.ElapsedMilliseconds}");
}
private static void Original1(List<int> integers)
{
integers?.GroupBy(number => number)
.OrderByDescending(group => group.Count())
.Select(k => new NumberCount(k.Key, k.Count()))
.ToList();
}
private static void Original2(List<int> integers)
{
integers?.GroupBy(number => number)
.Select(k => new NumberCount(k.Key, k.Count()))
.OrderByDescending(x => x.Occurrences)
.ToList();
}
private static void AsParallel1(List<int> integers)
{
integers?.GroupBy(number => number)
.AsParallel() //each group will be counted by a CPU unit
.Select(k => new NumberCount(k.Key, k.Count())) //grab the result before sorting
.OrderByDescending(x => x.Occurrences) //sort after projecting the result
.ToList();
}
private static void AsParallel2(List<int> integers)
{
integers?.AsParallel()
.GroupBy(number => number)
.Select(k => new
{
Key = k.Key,
Occurrences = k.Count()
}) //grab the result before sorting
.OrderByDescending(x => x.Occurrences) //sort after projecting the result
.ToList();
}
private static void AsParallel3(List<int> integers)
{
integers?.AsParallel()
.GroupBy(number => number)
.Select(k => new NumberCount(k.Key, k.Count())) //grab the result before sorting
.OrderByDescending(x => x.Occurrences) //sort after projecting the result
.ToList();
}
private static void MaxOccurrences1(List<int> integers)
{
integers?.AsParallel()
.GroupBy(number => number)
.GroupBy(x => x.Count())
.OrderByDescending(x => x.Key)
.FirstOrDefault()?
.ToList()
.Select(k => new NumberCount(k.Key, k.Count()))
.ToList();
}
private static void MaxOccurrences2(List<int> integers)
{
integers?.AsParallel()
.GroupBy(x => x)//group numbers, key is number, count is count
.Select(k => new NumberCount(k.Key, k.Count()))
.GroupBy(x => x.Occurrences)//group by Occurrences, key is Occurrences, value is result
.OrderByDescending(x => x.Key) //sort
.FirstOrDefault()? //the first one is result
.ToList();
}
private static void ConcurrentDictionary1(List<int> integers)
{
ConcurrentDictionary<int, int> result = new ConcurrentDictionary<int, int>();
integers?.ForEach(x => { result.AddOrUpdate(x, 1, (key, old) => old + 1); });
result.OrderByDescending(x => x.Value).ToList();
}
private static void ConcurrentDictionary2(List<int> integers)
{
ConcurrentDictionary<int, int> result = new ConcurrentDictionary<int, int>();
integers?.AsParallel().ForAll(x => { result.AddOrUpdate(x, 1, (key, old) => old + 1); });
result.OrderByDescending(x => x.Value).ToList();
}
}
public class NumberCount
{
public int Value;
public int Occurrences;
public NumberCount(int value, int occurrences)
{
Value = value;
Occurrences = occurrences;
}
}
Different code is more efficient at different lengths, but as the length approaches 6 million, this approach seems fastest. In general, LINQ is not for improving the speed of code but for improving understanding and maintainability, depending on how you feel about functional programming styles.
Your code is fairly fast and beats the simple LINQ approaches using GroupBy. It gains a good advantage from the fact that List.Sort is highly optimized; my code uses that as well, but on a local copy of the list to avoid changing the source. My code is similar in approach to yours, but is designed around a single pass doing all the computation needed. It uses an extension method I re-optimized for this problem, called GroupByRuns, that returns an IEnumerable<IGrouping<T,T>>. It is also hand-expanded rather than falling back on the generic GroupByRuns that takes extra arguments for key and result selection. Since .NET doesn't have an end-user-accessible IGrouping<,> implementation (!), I rolled my own that implements ICollection<T> to optimize Count().
This code runs about 1.3x as fast as yours (after I slightly optimized yours by 5%).
First, the RunGrouping class to return a group of runs:
public class RunGrouping<T> : IGrouping<T, T>, ICollection<T> {
public T Key { get; }
int Count;
int ICollection<T>.Count => Count;
public bool IsReadOnly => true;
public RunGrouping(T key, int count) {
Key = key;
Count = count;
}
public IEnumerator<T> GetEnumerator() {
for (int j1 = 0; j1 < Count; ++j1)
yield return Key;
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
public void Add(T item) => throw new NotImplementedException();
public void Clear() => throw new NotImplementedException();
public bool Contains(T item) => Count > 0 && EqualityComparer<T>.Default.Equals(Key, item);
public void CopyTo(T[] array, int arrayIndex) => throw new NotImplementedException();
public bool Remove(T item) => throw new NotImplementedException();
}
Second, the extension method on IEnumerable that groups the runs:
public static class IEnumerableExt {
public static IEnumerable<IGrouping<T, T>> GroupByRuns<T>(this IEnumerable<T> src) {
var cmp = EqualityComparer<T>.Default;
bool notAtEnd = true;
using (var e = src.GetEnumerator()) {
IGrouping<T, T> NextRun() {
var prev = e.Current;
var ct = 0;
while (notAtEnd && cmp.Equals(e.Current, prev)) {
++ct;
notAtEnd = e.MoveNext();
}
return new RunGrouping<T>(prev, ct);
}
notAtEnd = e.MoveNext();
while (notAtEnd)
yield return NextRun();
}
}
}
Finally, the extension method that finds the modes with the maximum count. Basically it goes through the runs and keeps a record of the values with the current longest run count.
public static class IEnumerableIntExt {
public static IEnumerable<KeyValuePair<int, int>> MostCommon(this IEnumerable<int> src) {
var mysrc = new List<int>(src);
mysrc.Sort();
var maxc = 0;
var maxmodes = new List<int>();
foreach (var g in mysrc.GroupByRuns()) {
var gc = g.Count();
if (gc > maxc) {
maxmodes.Clear();
maxmodes.Add(g.Key);
maxc = gc;
}
else if (gc == maxc)
maxmodes.Add(g.Key);
}
return maxmodes.Select(m => new KeyValuePair<int, int>(m, maxc));
}
}
Given an existing random list of integers rl, you can get the answer with:
var ans = rl.MostCommon();
I tested with the code below on my Intel i7-8700K and achieved the following results:
Lambda: found 78 in 134 ms.
Manual: found 78 in 368 ms.
Dictionary: found 78 in 195 ms.
static IEnumerable<int> GenerateNumbers(int amount)
{
Random r = new Random();
for (int i = 0; i < amount; i++)
yield return r.Next(100);
}
static void Main(string[] args)
{
var numbers = GenerateNumbers(6_000_000).ToList();
Stopwatch sw = Stopwatch.StartNew();
int mode = numbers.GroupBy(number => number).OrderByDescending(group => group.Count()).Select(k =>
{
int count = k.Count();
return new { Mode = k.Key, Count = count };
}).FirstOrDefault().Mode;
sw.Stop();
Console.WriteLine($"Lambda: found {mode} in {sw.ElapsedMilliseconds} ms.");
sw = Stopwatch.StartNew();
mode = findMostCommon(numbers)[0].Value;
sw.Stop();
Console.WriteLine($"Manual: found {mode} in {sw.ElapsedMilliseconds} ms.");
// create a dictionary
var dictionary = new ConcurrentDictionary<int, int>();
sw = Stopwatch.StartNew();
// parallelize the iteration (we can, because ConcurrentDictionary is thread-safe-ish)
numbers.AsParallel().ForAll((number) =>
{
// add the key with a value of 1 if it's not there; if it is there, the lambda increments the existing value by 1
dictionary.AddOrUpdate(number, 1, (key, old) => old + 1);
});
mode = dictionary.Aggregate((x, y) => { return x.Value > y.Value ? x : y; }).Key;
sw.Stop();
Console.WriteLine($"Dictionary: found {mode} in {sw.ElapsedMilliseconds} ms.");
Console.ReadLine();
}
So far, Netmage's is the fastest I've found. The only thing I have been able to make that can beat it only works on my computer with values ranging from 1 to 500,000,000 or smaller, because I only have 8 GB of RAM. This prevents me from testing it with the full 1 to int.MaxValue range, and I suspect it would fall behind in speed at that size, as it appears to struggle more and more with larger ranges. It uses the values as indexes into an array, and the value at each index as the occurrence count. With 6 million randomly generated positive 16-bit integers, it is about 20 times as fast as my original method, with both in Release mode. It is only about 1.6 times as fast with 32-bit integers ranging from 1 to 500,000,000.
private static List<NumberCount> findMostCommon(List<int> integers)
{
List<NumberCount> answers = new List<NumberCount>();
int[] mostCommon = new int[_Max];
int max = 0;
for (int i = 1; i < integers.Count; i++)
{
int iValue = integers[i];
mostCommon[iValue]++;
int intVal = mostCommon[iValue];
if (intVal > 1)
{
if (intVal > max)
{
max++;
answers.Clear();
answers.Add(new NumberCount(iValue, max));
}
else if (intVal == max)
{
answers.Add(new NumberCount(iValue, max));
}
}
}
if (answers.Count < 1)
answers.Add(new NumberCount(0, -100)); // This -100 Occurrences value signifies that all values are equal.
return answers;
}
Perhaps branching like this would be optimal:
if (list.Count < sizeLimit)
answers = getFromSmallRangeMethod(list);
else
answers = getFromStandardMethod(list);

Sort Observable by predefined order in Reactive Extensions

Say I have a type T:
class T {
public int identifier; //Arbitrary but unique for each character (Guids in real-life)
public char character; //In real life not a char, but I chose char here for easy demo purposes
}
And I have a predefined ordered sequence of identifiers:
int[] identifierSequence = new int[]{
9, 3, 4, 4, 7
};
I now need to order an IObservable<T> which produces the following sequence of objects:
{identifier: 3, character 'e'},
{identifier: 9, character 'h'},
{identifier: 4, character 'l'},
{identifier: 4, character 'l'},
{identifier: 7, character 'o'}
So that the resulting IObservable produces hello.
I don't want to use ToArray, as I want to receive objects as soon as they arrive and not wait until everything is observed.
More specifically, I would like to receive them like this:
Input: e h l l o
Output: he l l o
What would be the proper reactive way to do this?
The best I could come up with is this:
Dictionary<int, T> buffer = new Dictionary<int, T>();
int curIndex = 0;
inputObserable.SelectMany(item =>
{
buffer[item.identifier] = item;
IEnumerable<ReportTemplate> GetReadyElements()
{
while (true)
{
int nextItemIdentifier = identifierSequence[curIndex];
T nextItem;
if (buffer.TryGetValue(nextItemIdentifier, out nextItem))
{
buffer.Remove(nextItem.identifier);
curIndex++;
yield return nextItem;
}
else
{
break;
}
}
}
return GetReadyElements();
});
EDIT:
Schlomo raised some very valid issues with my code, which is why I marked his answer as correct. I made some modifications to his to code for it to be usable:
Generic identifier and object type
Iteration instead of recursion, to prevent a potential stack overflow on very large observables
Convert the anonymous type to a real class for readability
Wherever possible, look up a value in a dictionary only once and store it in a variable instead of looking it up multiple times
Fixed type
This gives me the following code:
public static IObservable<T> OrderByIdentifierSequence<T, TId>(this IObservable<T> source, IList<TId> identifierSequence, Func<T, TId> identifierFunc)
{
var initialState = new OrderByIdentifierSequenceState<T, TId>(0, ImmutableDictionary<TId, ImmutableList<T>>.Empty, Enumerable.Empty<T>());
return source.Scan(initialState, (oldState, item) =>
{
//Function to be called upon receiving new item
//If we can pattern match the first item, then it is moved into Output, and concatted continuously with the next possible item
//Otherwise, if nothing is available yet, just return the input state
OrderByIdentifierSequenceState<T, TId> GetOutput(OrderByIdentifierSequenceState<T, TId> state)
{
int index = state.Index;
ImmutableDictionary<TId, ImmutableList<T>> buffer = state.Buffer;
IList<T> output = new List<T>();
while (index < identifierSequence.Count)
{
TId key = identifierSequence[index];
ImmutableList<T> nextValues;
if (!buffer.TryGetValue(key, out nextValues) || nextValues.IsEmpty)
{
//No values available yet
break;
}
T toOutput = nextValues[nextValues.Count - 1];
output.Add(toOutput);
buffer = buffer.SetItem(key, nextValues.RemoveAt(nextValues.Count - 1));
index++;
}
return new OrderByIdentifierSequenceState<T, TId>(index, buffer, output);
}
//Before calling the recursive function, add the new item to the buffer
TId itemIdentifier = identifierFunc(item);
ImmutableList<T> valuesList;
if (!oldState.Buffer.TryGetValue(itemIdentifier, out valuesList))
{
valuesList = ImmutableList<T>.Empty;
}
var remodifiedBuffer = oldState.Buffer.SetItem(itemIdentifier, valuesList.Add(item));
return GetOutput(new OrderByIdentifierSequenceState<T, TId>(oldState.Index, remodifiedBuffer, Enumerable.Empty<T>()));
})
// Use Dematerialize/Notifications to detect and emit end of stream.
.SelectMany(output =>
{
var notifications = output.Output
.Select(item => Notification.CreateOnNext(item))
.ToList();
if (output.Index == identifierSequence.Count)
{
notifications.Add(Notification.CreateOnCompleted<T>());
}
return notifications;
})
.Dematerialize();
}
class OrderByIdentifierSequenceState<T, TId>
{
//Index shows what T we're waiting on
public int Index { get; }
//Buffer holds T that have arrived that we aren't ready yet for
public ImmutableDictionary<TId, ImmutableList<T>> Buffer { get; }
//Output holds T that can be safely emitted.
public IEnumerable<T> Output { get; }
public OrderByIdentifierSequenceState(int index, ImmutableDictionary<TId, ImmutableList<T>> buffer, IEnumerable<T> output)
{
this.Index = index;
this.Buffer = buffer;
this.Output = output;
}
}
However, this code still has a couple of problems:
Constant copying of the state (mainly the ImmutableDictionary), which can be very expensive. Possible solution: maintain a separate state per observer, instead of per item received.
When one or more of the elements in identifierSequence are not present in the source observable, a problem appears: the ordered observable blocks and will never finish. Possible solutions: a timeout, throwing an exception when the source observable completes, returning all available items when the source observable completes, ...
When the source observable contains more elements than identifierSequence, we get a potential memory leak. Items that are in the source observable but not in identifierSequence currently get added to the dictionary, but are never removed before the source observable completes. Possible solutions: check whether the item is in identifierSequence before adding it to the dictionary, bypass the code and immediately output the item, ...
MY SOLUTION:
/// <summary>
/// Takes the items from the source observable, and returns them in the order specified in identifierSequence.
/// If an item is missing from the source observable, the returned observable returns items up until the missing item and then blocks until the source observable is completed.
/// All available items are then returned in order. Note that this means that while a correct order is guaranteed, there might be missing items in the result observable.
/// If there are items in the source observable that are not in identifierSequence, these items will be ignored.
/// </summary>
/// <typeparam name="T">The type that is produced by the source observable</typeparam>
/// <typeparam name="TId">The type of the identifiers used to uniquely identify a T</typeparam>
/// <param name="source">The source observable</param>
/// <param name="identifierSequence">A list of identifiers that defines the sequence in which the source observable is to be ordered</param>
/// <param name="identifierFunc">A function that takes a T and outputs its unique identifier</param>
/// <returns>An observable with the same elements as the source, but ordered by the sequence of items in identifierSequence</returns>
public static IObservable<T> OrderByIdentifierSequence<T, TId>(this IObservable<T> source, IList<TId> identifierSequence, Func<T, TId> identifierFunc)
{
if (source == null)
{
throw new ArgumentNullException(nameof(source));
}
if (identifierSequence == null)
{
throw new ArgumentNullException(nameof(identifierSequence));
}
if (identifierFunc == null)
{
throw new ArgumentNullException(nameof(identifierFunc));
}
if (identifierSequence.Count == 0)
{
return Observable.Empty<T>();
}
HashSet<TId> identifiersInSequence = new HashSet<TId>(identifierSequence);
return Observable.Create<T>(observer =>
{
//current index of pending item in identifierSequence
int index = 0;
//buffer of items we have received but are not ready for yet
Dictionary<TId, List<T>> buffer = new Dictionary<TId, List<T>>();
return source.Select(
item =>
{
//Function to be called upon receiving new item
//We search for the current pending item in the buffer. If it is available, we yield return it and repeat.
//If it is not available yet, stop.
IEnumerable<T> GetAvailableOutput()
{
while (index < identifierSequence.Count)
{
TId key = identifierSequence[index];
List<T> nextValues;
if (!buffer.TryGetValue(key, out nextValues) || nextValues.Count == 0)
{
//No values available yet
break;
}
yield return nextValues[nextValues.Count - 1];
nextValues.RemoveAt(nextValues.Count - 1);
index++;
}
}
//Get the identifier for this item
TId itemIdentifier = identifierFunc(item);
//If this item is not in identifiersInSequence, we ignore it.
if (!identifiersInSequence.Contains(itemIdentifier))
{
return Enumerable.Empty<T>();
}
//Add the new item to the buffer
List<T> valuesList;
if (!buffer.TryGetValue(itemIdentifier, out valuesList))
{
valuesList = new List<T>();
buffer[itemIdentifier] = valuesList;
}
valuesList.Add(item);
//Return all available items
return GetAvailableOutput();
})
.Subscribe(output =>
{
foreach (T cur in output)
{
observer.OnNext(cur);
}
if (index == identifierSequence.Count)
{
observer.OnCompleted();
}
},(ex) =>
{
observer.OnError(ex);
}, () =>
{
//When source observable is completed, return the remaining available items
while (index < identifierSequence.Count)
{
TId key = identifierSequence[index];
List<T> nextValues;
if (!buffer.TryGetValue(key, out nextValues) || nextValues.Count == 0)
{
//No values available
index++;
continue;
}
observer.OnNext(nextValues[nextValues.Count - 1]);
nextValues.RemoveAt(nextValues.Count - 1);
index++;
}
//Mark observable as completed
observer.OnCompleted();
});
});
}
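A call site for the question's scenario might look like this (a sketch; T and identifierSequence are as defined in the question, and source is the input observable):
IObservable<T> ordered = source.OrderByIdentifierSequence(identifierSequence, item => item.identifier);
ordered.Subscribe(item => Console.Write(item.character)); // prints "hello" for the sample input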
Please note that your implementation has a few problems:
If the two 'l's come before their time, one gets swallowed, holding up the whole sequence. Your dictionary should map to a collection, not a single item.
There's no OnCompleted message.
Multiple subscribers can screw up the state. Try this (where GetPatternMatchOriginal is your code):
var stateMachine = src.GetPatternMatchOriginal(new int[] { 9, 3, 4, 4, 7 });
stateMachine.Take(3).Dump(); //Linqpad
stateMachine.Take(3).Dump(); //Linqpad
The first output is h e l, the second output is l o. They should both output h e l.
This implementation fixes those problems, and also is side-effect free using immutable data structures:
public static class X
{
public static IObservable<T> GetStateMachine(this IObservable<T> source, string identifierSequence)
{
//State is held in an anonymous type:
// Index shows what character we're waiting on,
// Buffer holds characters that have arrived that we aren't ready yet for
// Output holds characters that can be safely emitted.
return source
.Scan(new { Index = 0, Buffer = ImmutableDictionary<int, ImmutableList<T>>.Empty, Output = Enumerable.Empty<T>() },
(state, item) =>
{
//Function to be called recursively upon receiving new item
//If we can pattern match the first item, then it is moved into Output, and concatted recursively with the next possible item
//Otherwise just return the inputs
(int Index, ImmutableDictionary<int, ImmutableList<T>> Buffer, IEnumerable<T> Output) GetOutput(int index, ImmutableDictionary<int, ImmutableList<T>> buffer, IEnumerable<T> results)
{
if (index == identifierSequence.Length)
return (index, buffer, results);
var key = identifierSequence[index];
if (buffer.ContainsKey(key) && buffer[key].Any())
{
var toOutput = buffer[key][buffer[key].Count - 1];
return GetOutput(index + 1, buffer.SetItem(key, buffer[key].RemoveAt(buffer[key].Count - 1)), results.Concat(new[] { toOutput }));
}
else
return (index, buffer, results);
}
//Before calling the recursive function, add the new item to the buffer
var modifiedBuffer = state.Buffer.ContainsKey(item.Identifier)
? state.Buffer
: state.Buffer.Add(item.Identifier, ImmutableList<T>.Empty);
var remodifiedBuffer = modifiedBuffer.SetItem(item.Identifier, modifiedBuffer[item.Identifier].Add(item));
var output = GetOutput(state.Index, remodifiedBuffer, Enumerable.Empty<T>());
return new { Index = output.Index, Buffer = output.Buffer, Output = output.Output };
})
// Use Dematerialize/Notifications to detect and emit end of stream.
.SelectMany(output =>
{
var notifications = output.Output
.Select(item => Notification.CreateOnNext(item))
.ToList();
if (output.Index == identifierSequence.Length)
notifications.Add(Notification.CreateOnCompleted<T>());
return notifications;
})
.Dematerialize();
}
}
then you can call it like so:
var stateMachine = src.GetStateMachine(new int[] { 9, 3, 4, 4, 7 });
stateMachine.Dump(); //LinqPad
src.OnNext(new T { Identifier = 4, Character = 'l' });
src.OnNext(new T { Identifier = 4, Character = 'l' });
src.OnNext(new T { Identifier = 7, Character = 'o' });
src.OnNext(new T { Identifier = 3, Character = 'e' });
src.OnNext(new T { Identifier = 9, Character = 'h' });
Given you have this:
IObservable<T> source = new []
{
new T() { identifier = 3, character = 'e' },
new T() { identifier = 9, character = 'h'},
new T() { identifier = 4, character = 'l'},
new T() { identifier = 4, character = 'l'},
new T() { identifier = 7, character = 'o'}
}.ToObservable();
int[] identifierSequence = new int[]
{
9, 3, 4, 4, 7
};
...then this works:
IObservable<T> query =
source
.Scan(new { index = 0, pendings = new List<T>(), outputs = new List<T>() }, (a, t) =>
{
var i = a.index;
var o = new List<T>();
a.pendings.Add(t);
var r = a.pendings.Where(x => x.identifier == identifierSequence[i]).FirstOrDefault();
while (r != null)
{
o.Add(r);
a.pendings.Remove(r);
i++;
r = a.pendings.Where(x => x.identifier == identifierSequence[i]).FirstOrDefault();
}
return new { index = i, a.pendings, outputs = o };
})
.SelectMany(x => x.outputs);
Nice question :-)
Given the multiple identical keys, it looks like pattern matching in an arbitrary order to me. Here's what I came up with:
Edit: modified to look up expected items in a dictionary.
public static class MyExtensions
{
public static IObservable<TSource> MatchByKeys<TSource, TKey>(this IObservable<TSource> source, IEnumerable<TKey> keys, Func<TSource, TKey> keySelector, IEqualityComparer<TKey> keyComparer = null)
{
if (source == null) throw new ArgumentNullException(nameof(source));
if (keys == null) throw new ArgumentNullException(nameof(keys));
if (keySelector == null) throw new ArgumentNullException(nameof(keySelector));
if (keyComparer == null) keyComparer = EqualityComparer<TKey>.Default;
return Observable.Create<TSource>(observer =>
{
var pattern = new LinkedList<SingleAssignment<TSource>>();
var matchesByKey = new Dictionary<TKey, LinkedList<SingleAssignment<TSource>>>(keyComparer);
foreach (var key in keys)
{
var match = new SingleAssignment<TSource>();
pattern.AddLast(match);
LinkedList<SingleAssignment<TSource>> matches;
if (!matchesByKey.TryGetValue(key, out matches))
{
matches = new LinkedList<SingleAssignment<TSource>>();
matchesByKey.Add(key, matches);
}
matches.AddLast(match);
}
if (pattern.First == null)
{
observer.OnCompleted();
return Disposable.Empty;
}
var sourceSubscription = new SingleAssignmentDisposable();
Action dispose = () =>
{
sourceSubscription.Dispose();
pattern.Clear();
matchesByKey.Clear();
};
sourceSubscription.Disposable = source.Subscribe(
value =>
{
try
{
var key = keySelector(value);
LinkedList<SingleAssignment<TSource>> matches;
if (!matchesByKey.TryGetValue(key, out matches)) return;
matches.First.Value.Value = value;
matches.RemoveFirst();
if (matches.First == null) matchesByKey.Remove(key);
while (pattern.First != null && pattern.First.Value.HasValue)
{
var match = pattern.First.Value;
pattern.RemoveFirst();
observer.OnNext(match.Value);
}
if (pattern.First != null) return;
dispose();
observer.OnCompleted();
}
catch (Exception ex)
{
dispose();
observer.OnError(ex);
}
},
error =>
{
dispose();
observer.OnError(error);
},
() =>
{
dispose();
observer.OnCompleted();
});
return Disposable.Create(dispose);
});
}
private sealed class SingleAssignment<T>
{
public bool HasValue { get; private set; }
private T _value;
public T Value
{
get
{
if (!HasValue) throw new InvalidOperationException("No value has been set.");
return _value;
}
set
{
if (HasValue) throw new InvalidOperationException("Value has already been set.");
HasValue = true;
_value = value;
}
}
}
}
Test code:
var src = new Subject<T>();
var ordered = src.MatchByKeys(new[] { 9, 3, 4, 4, 7 }, t => t.Identifier);
var result = new List<T>();
using (ordered.Subscribe(result.Add))
{
src.OnNext(new T { Identifier = 3, Character = 'e' });
src.OnNext(new T { Identifier = 9, Character = 'h' });
src.OnNext(new T { Identifier = 4, Character = 'l' });
src.OnNext(new T { Identifier = 4, Character = 'l' });
src.OnNext(new T { Identifier = 7, Character = 'o' });
src.OnCompleted();
}
Console.WriteLine(new string(result.Select(t => t.Character).ToArray()));

Parallel to async task in C#

I have a Parallel.For loop for the following statements, but I want to use async tasks, not Parallel. Any idea how I can use async tasks for the same statements? I don't need fully working code, just an idea of how to replace Parallel with async tasks. Happy coding.
Parallel.For(0, allRequests.Count(), i =>
{
var rand = new Random();
var token = allTokens.ElementAt(rand.Next(allTokens.Count()));
var accessKey = token.AccessKey;
var secretKey = token.SecretKey;
using (var ctx = new db_mytestdb())
{
var firstRequest = allRequests[i];
Console.WriteLine("Started scan for: " + firstRequest.SearchedUser.EbayUsername + " and using token: " + allTokens[i % allTokens.Count].TokenName);
var bulkScannedItems = new ConcurrentBag<BulkScannedItems>();
var userPreferences = ctx.UserPreferences.FirstOrDefault(x => x.UserId == firstRequest.UserId);
var userBrekEven = userPreferences.BreakEven;
var intoPast = DateTime.Now.Subtract(TimeSpan.FromDays(firstRequest.Range));
var filteredProducts = ctx.EbayUserTransactions.Where(x => x.SearchedUserID == firstRequest.SearchedUserID && x.TransactionDate >= intoPast && x.TransactionDate <= firstRequest.SearchedUser.LastUpdatedAt)
.ToList()
.GroupBy(x => x.ItemID).Select(x => new ResultItem()
{
ItemID = x.Key,
SaleNumber = x.Sum(y => y.QuantityPurchased)
})
.Where(x => x.SaleNumber >= firstRequest.MinSales)
.ToList();
var itemSpecifics = ctx.SearchedUserItems.Where(x => x.SearchedUserID == firstRequest.SearchedUserID).ToList();
foreach (var item in itemSpecifics)
{
foreach (var filtered in filteredProducts)
{
if (item.ItemID == filtered.ItemID)
{
if (item.UPC != null)
{
filtered.UPC = item.UPC;
}
else
{
filtered.UPC = "does not apply";
}
if (item.EAN != null)
{
filtered.EAN = item.EAN;
}
else
{
filtered.EAN = "does not apply";
}
if (item.MPN != null)
{
filtered.MPN = item.MPN;
}
else
{
filtered.MPN = "does not apply";
}
}
}
}
var bulkScanner = new BulkScannerAlgorithm();
foreach (var dbItem in filteredProducts)
{
var amazonItem = bulkScanner.Found(dbItem.UPC, dbItem.ItemID, accessKey, secretKey);
if (amazonItem.Found)
{
bulkScanner.InsertAmazonData(firstRequest, bulkScannedItems, userBrekEven, amazonItem);
continue;
}
amazonItem = bulkScanner.Found(dbItem.EAN, dbItem.ItemID, accessKey, secretKey);
if (amazonItem.Found)
{
bulkScanner.InsertAmazonData(firstRequest, bulkScannedItems, userBrekEven, amazonItem);
continue;
}
amazonItem = bulkScanner.Found(dbItem.MPN, dbItem.ItemID, accessKey, secretKey);
if (amazonItem.Found)
{
bulkScanner.InsertAmazonData(firstRequest, bulkScannedItems, userBrekEven, amazonItem);
continue;
}
}
List<BulkScannedItems> filteredCompleteBulk;
if (firstRequest.IsPrimeOnly == true)
{
filteredCompleteBulk = bulkScannedItems.Where(x => x.CalculatedProfit >= firstRequest.MinProfit && x.IsPrime == true && x.EbayPrice >= firstRequest.minPrice && x.EbayPrice <= firstRequest.maxPrice).DistinctBy(x => x.ASIN).ToList();
}
else
{
filteredCompleteBulk = bulkScannedItems.Where(x => x.CalculatedProfit >= firstRequest.MinProfit && x.EbayPrice >= firstRequest.minPrice && x.EbayPrice <= firstRequest.maxPrice).DistinctBy(x => x.ASIN).ToList();
}
EFBatchOperation.For(ctx, ctx.BulkScannedItems).InsertAll(filteredCompleteBulk);
ctx.user_scanReq_update(firstRequest.UserSellerScanRequestId);
Console.WriteLine("Scan complete for user: " + firstRequest.SearchedUser.EbayUsername);
}
});
Parallelism and asynchrony are both forms of concurrency, but parallelism works by dividing the problem among multiple threads, and asynchrony works by freeing up threads. So they're kind of opposites in how they work.
That said, to make the code asynchronous, you'd start from your lowest-level I/O calls, e.g., the EF ToList and presumably also whatever APIs are used in the implementation of InsertAll. Replace those with asynchronous equivalents (e.g., ToListAsync) and call them with await.
Next, you'd need to replace the Parallel.For loop with code that creates a collection of asynchronous tasks and then (asynchronously) waits for them all, something like:
var tasks = allRequests.Select(async request => { ... });
await Task.WhenAll(tasks);
That's the basic pattern for asynchronous concurrency.
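For the code in the question, that pattern might look roughly like this (only a sketch; it assumes EF6's async query extensions such as FirstOrDefaultAsync/ToListAsync are available, and it elides the per-request work that stays the same):
var tasks = allRequests.Select(async request =>
{
    using (var ctx = new db_mytestdb())
    {
        var userPreferences = await ctx.UserPreferences.FirstOrDefaultAsync(x => x.UserId == request.UserId);
        // ... the rest of the original loop body, with each EF query awaited
        //     (e.g. ToListAsync instead of ToList) ...
    }
});
await Task.WhenAll(tasks);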
If you find that you do need true parallelism (multiple threads) in addition to asynchrony, consider using TPL Dataflow.
A little clarification: whether you are using Parallel.For/ForEach, Tasks, or async/await, they are all using the same thing behind the scenes (albeit slightly differently). You should always pick the one that fits your problem best.
If you want to replace the Parallel.For with a method that returns a Task, that is straightforward enough, but you would end up waiting for this piece to be done before you continued your processing.
Async/await is generally used when dealing with UIs and web calls; it doesn't appear to be useful here.
What is it that you are trying to accomplish? Why the need to replace Parallel with async tasks?
The general way you would offload a method to a task would be:
Task<T> task = Task<T>.Factory.StartNew(() =>
{
});
or
public Task<T> task()
{
.....
}

RX Throttle with timeout [duplicate]

I want to effectively throttle an event stream, so that my delegate is called when the first event is received but then not for 1 second if subsequent events are received. After expiry of that timeout (1 second), if a subsequent event was received I want my delegate to be called.
Is there a simple way to do this using Reactive Extensions?
Sample code:
static void Main(string[] args)
{
Console.WriteLine("Running...");
var generator = Observable
.GenerateWithTime(1, x => x <= 100, x => x, x => TimeSpan.FromMilliseconds(1), x => x + 1)
.Timestamp();
var builder = new StringBuilder();
generator
.Sample(TimeSpan.FromSeconds(1))
.Finally(() => Console.WriteLine(builder.ToString()))
.Subscribe(feed =>
builder.AppendLine(string.Format("Observed {0:000}, generated at {1}, observed at {2}",
feed.Value,
feed.Timestamp.ToString("mm:ss.fff"),
DateTime.Now.ToString("mm:ss.fff"))));
Console.ReadKey();
}
Current output:
Running...
Observed 064, generated at 41:43.602, observed at 41:43.602
Observed 100, generated at 41:44.165, observed at 41:44.602
But I want to observe (timestamps obviously will change)
Running...
Observed 001, generated at 41:43.602, observed at 41:43.602
....
Observed 100, generated at 41:44.165, observed at 41:44.602
Okay, you have 4 scenarios here:
1) I would like to get one value from the event stream every second.
Means: if it produces more events per second, you will get an ever-growing buffer.
observableStream.Throttle(timeSpan)
2) I would like to get the latest event that was produced before the second elapses.
Means: other events get dropped.
observableStream.Sample(TimeSpan.FromSeconds(1))
3) You would like to get all events that happened in the last second, and that every second.
observableStream.BufferWithTime(timeSpan)
4) You want to select what happens in between the seconds with all the values, until the second has passed and your result is returned.
observableStream.CombineLatest(Observable.Interval(TimeSpan.FromSeconds(1)), selectorOnEachEvent)
Here's what I got with some help from the Rx Forum:
The idea is to issue a series of "tickets" for the original sequence to fire. These "tickets" are delayed for the timeout, excluding the very first one, which is immediately pre-pended to the ticket sequence. When an event comes in and there is a ticket waiting, the event fires immediately, otherwise it waits till the ticket and then fires. When it fires, the next ticket is issued, and so on...
To combine the tickets and original events, we need a combinator. Unfortunately, the "standard" .CombineLatest cannot be used here because it would fire on tickets and events that were used previously. So I had to create my own combinator, which is basically a filtered .CombineLatest that fires only when both elements in the combination are "fresh" - were never returned before. I call it .CombineVeryLatest aka .BrokenZip ;)
Using .CombineVeryLatest, the above idea can be implemented as such:
public static IObservable<T> SampleResponsive<T>(
this IObservable<T> source, TimeSpan delay)
{
return source.Publish(src =>
{
var fire = new Subject<T>();
var whenCanFire = fire
.Select(u => new Unit())
.Delay(delay)
.StartWith(new Unit());
var subscription = src
.CombineVeryLatest(whenCanFire, (x, flag) => x)
.Subscribe(fire);
return fire.Finally(subscription.Dispose);
});
}
public static IObservable<TResult> CombineVeryLatest
<TLeft, TRight, TResult>(this IObservable<TLeft> leftSource,
IObservable<TRight> rightSource, Func<TLeft, TRight, TResult> selector)
{
var ls = leftSource.Select(x => new Used<TLeft>(x));
var rs = rightSource.Select(x => new Used<TRight>(x));
var cmb = ls.CombineLatest(rs, (x, y) => new { x, y });
var fltCmb = cmb
.Where(a => !(a.x.IsUsed || a.y.IsUsed))
.Do(a => { a.x.IsUsed = true; a.y.IsUsed = true; });
return fltCmb.Select(a => selector(a.x.Value, a.y.Value));
}
private class Used<T>
{
internal T Value { get; private set; }
internal bool IsUsed { get; set; }
internal Used(T value)
{
Value = value;
}
}
Edit: here's another more compact variation of CombineVeryLatest proposed by Andreas Köpf on the forum:
public static IObservable<TResult> CombineVeryLatest
<TLeft, TRight, TResult>(this IObservable<TLeft> leftSource,
IObservable<TRight> rightSource, Func<TLeft, TRight, TResult> selector)
{
return Observable.Defer(() =>
{
int l = -1, r = -1;
return Observable.CombineLatest(
leftSource.Select(Tuple.Create<TLeft, int>),
rightSource.Select(Tuple.Create<TRight, int>),
(x, y) => new { x, y })
.Where(t => t.x.Item2 != l && t.y.Item2 != r)
.Do(t => { l = t.x.Item2; r = t.y.Item2; })
.Select(t => selector(t.x.Item1, t.y.Item1));
});
}
I was struggling with this same problem last night, and believe I've found a more elegant (or at least shorter) solution:
var delay = Observable.Empty<T>().Delay(TimeSpan.FromSeconds(1));
var throttledSource = source.Take(1).Concat(delay).Repeat();
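For example, with a hot source such as a Subject (a sketch; note that with a cold source, Repeat would restart the sequence from scratch instead):
var source = new Subject<int>();
var delay = Observable.Empty<int>().Delay(TimeSpan.FromSeconds(1));
var throttledSource = source.Take(1).Concat(delay).Repeat();
throttledSource.Subscribe(x => Console.WriteLine(x)); // the first value fires immediately; values during the next second are ignored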
This is what I posted as an answer to this question in the Rx forum:
UPDATE:
Here is a new version that does no longer delay event forwarding when events occur with a time difference of more than one second:
public static IObservable<T> ThrottleResponsive3<T>(this IObservable<T> source, TimeSpan minInterval)
{
return Observable.CreateWithDisposable<T>(o =>
{
object gate = new Object();
Notification<T> last = null, lastNonTerminal = null;
DateTime referenceTime = DateTime.UtcNow - minInterval;
var delayedReplay = new MutableDisposable();
return new CompositeDisposable(source.Materialize().Subscribe(x =>
{
lock (gate)
{
var elapsed = DateTime.UtcNow - referenceTime;
if (elapsed >= minInterval && delayedReplay.Disposable == null)
{
referenceTime = DateTime.UtcNow;
x.Accept(o);
}
else
{
if (x.Kind == NotificationKind.OnNext)
lastNonTerminal = x;
last = x;
if (delayedReplay.Disposable == null)
{
delayedReplay.Disposable = Scheduler.ThreadPool.Schedule(() =>
{
lock (gate)
{
referenceTime = DateTime.UtcNow;
if (lastNonTerminal != null && lastNonTerminal != last)
lastNonTerminal.Accept(o);
last.Accept(o);
last = lastNonTerminal = null;
delayedReplay.Disposable = null;
}
}, minInterval - elapsed);
}
}
}
}), delayedReplay);
});
}
This was my earlier try:
var source = Observable.GenerateWithTime(1,
x => x <= 100, x => x, x => TimeSpan.FromMilliseconds(1), x => x + 1)
.Timestamp();
source.Publish(o =>
o.Take(1).Merge(o.Skip(1).Sample(TimeSpan.FromSeconds(1)))
).Run(x => Console.WriteLine(x));
Ok, here's one solution. I don't particularly like it, but... oh well.
Hat tips to Jon for pointing me at SkipWhile, and to cRichter for the BufferWithTime. Thanks guys.
static void Main(string[] args)
{
Console.WriteLine("Running...");
var generator = Observable
.GenerateWithTime(1, x => x <= 100, x => x, x => TimeSpan.FromMilliseconds(1), x => x + 1)
.Timestamp();
var bufferedAtOneSec = generator.BufferWithTime(TimeSpan.FromSeconds(1));
var action = new Action<Timestamped<int>>(
feed => Console.WriteLine("Observed {0:000}, generated at {1}, observed at {2}",
feed.Value,
feed.Timestamp.ToString("mm:ss.fff"),
DateTime.Now.ToString("mm:ss.fff")));
var reactImmediately = true;
bufferedAtOneSec.Subscribe(list =>
{
if (list.Count == 0)
{
reactImmediately = true;
}
else
{
action(list.Last());
}
});
generator
.SkipWhile(item => reactImmediately == false)
.Subscribe(feed =>
{
if(reactImmediately)
{
reactImmediately = false;
action(feed);
}
});
Console.ReadKey();
}
Have you tried the Throttle extension method?
From the docs:
Ignores values from an observable sequence which are followed by another value before dueTime
It's not quite clear to me whether that's going to do what you want or not - in that you want to ignore the following values rather than the first value... but I would expect it to be what you want. Give it a try :)
EDIT: Hmmm... no, I don't think Throttle is the right thing, after all. I believe I see what you want to do, but I can't see anything in the framework to do it. I may well have missed something though. Have you asked on the Rx forum? It may well be that if it's not there now, they'd be happy to add it :)
I suspect you could do it cunningly with SkipUntil and SelectMany somehow... but I think it should be in its own method.
What you are searching for is CombineLatest.
public static IObservable<TResult> CombineLatest<TLeft, TRight, TResult>(
IObservable<TLeft> leftSource,
IObservable<TRight> rightSource,
Func<TLeft, TRight, TResult> selector
)
It merges 2 observables, returning all values when the selector (time) has a value.
Edit: Jon is right, that is maybe not the preferred solution.
Inspired by Blueling's answer, I provide here a version that compiles with Reactive Extensions 2.2.5.
This particular version counts the number of samples and also provides the last sampled value. To do this, the following class is used:
class Sample<T> {
public Sample(T lastValue, Int32 count) {
LastValue = lastValue;
Count = count;
}
public T LastValue { get; private set; }
public Int32 Count { get; private set; }
}
Here is the operator:
public static IObservable<Sample<T>> SampleResponsive<T>(this IObservable<T> source, TimeSpan interval, IScheduler scheduler = null) {
if (source == null)
throw new ArgumentNullException(nameof(source));
return Observable.Create<Sample<T>>(
observer => {
var gate = new Object();
var lastSampleValue = default(T);
var lastSampleTime = default(DateTime);
var sampleCount = 0;
var scheduledTask = new SerialDisposable();
return new CompositeDisposable(
source.Subscribe(
value => {
lock (gate) {
var now = DateTime.UtcNow;
var elapsed = now - lastSampleTime;
if (elapsed >= interval) {
observer.OnNext(new Sample<T>(value, 1));
lastSampleValue = value;
lastSampleTime = now;
sampleCount = 0;
}
else {
if (scheduledTask.Disposable == null) {
scheduledTask.Disposable = (scheduler ?? Scheduler.Default).Schedule(
interval - elapsed,
() => {
lock (gate) {
if (sampleCount > 0) {
lastSampleTime = DateTime.UtcNow;
observer.OnNext(new Sample<T>(lastSampleValue, sampleCount));
sampleCount = 0;
}
scheduledTask.Disposable = null;
}
}
);
}
lastSampleValue = value;
sampleCount += 1;
}
}
},
error => {
if (sampleCount > 0)
observer.OnNext(new Sample<T>(lastSampleValue, sampleCount));
observer.OnError(error);
},
() => {
if (sampleCount > 0)
observer.OnNext(new Sample<T>(lastSampleValue, sampleCount));
observer.OnCompleted();
}
),
scheduledTask
);
}
);
}

How can I use Reactive Extensions to throttle Events using a max window size?

Scenario:
I am building a UI application that gets notifications from a backend service every few milliseconds. Once I get a new notification, I want to update the UI as soon as possible.
As I can get lots of notifications within a short amount of time, and as I always only care about the latest event, I use the Throttle() method of the Reactive Extensions framework. This allows me to ignore notification events that are immediately followed by a new notification and so my UI stays responsive.
Problem:
Say I throttle the event stream of notification events to 50ms and the backend sends a notification every 10ms: the Throttle() method will never return an event, as it keeps resetting its sliding window again and again. Here I need some additional behaviour to specify something like a timeout, so that I can retrieve at least one event per second or so in the case of such a high throughput of events. How can I do this with Reactive Extensions?
As James stated, Observable.Sample will give you the latest value yielded. However, it will do so on a timer, and not in accordance with when the first event in the throttle occurred. More importantly, if your sample time is high (say ten seconds) and your event fires right after a sample is taken, you won't get that new event for almost ten seconds.
If you need something a little tighter, you'll need to implement your own function. I've taken the liberty of doing so. This code could definitely use some clean up, but I believe it does what you've asked for.
public static class ObservableEx
{
public static IObservable<T> ThrottleMax<T>(this IObservable<T> source, TimeSpan dueTime, TimeSpan maxTime)
{
return source.ThrottleMax(dueTime, maxTime, Scheduler.Default);
}
public static IObservable<T> ThrottleMax<T>(this IObservable<T> source, TimeSpan dueTime, TimeSpan maxTime, IScheduler scheduler)
{
return Observable.Create<T>(o =>
{
var hasValue = false;
T value = default(T);
var maxTimeDisposable = new SerialDisposable();
var dueTimeDisposable = new SerialDisposable();
Action action = () =>
{
if (hasValue)
{
maxTimeDisposable.Disposable = Disposable.Empty;
dueTimeDisposable.Disposable = Disposable.Empty;
o.OnNext(value);
hasValue = false;
}
};
return source.Subscribe(
x =>
{
if (!hasValue)
{
maxTimeDisposable.Disposable = scheduler.Schedule(maxTime, action);
}
hasValue = true;
value = x;
dueTimeDisposable.Disposable = scheduler.Schedule(dueTime, action);
},
o.OnError,
o.OnCompleted
);
});
}
}
And a few tests...
[TestClass]
public class ThrottleMaxTests : ReactiveTest
{
[TestMethod]
public void CanThrottle()
{
var scheduler = new TestScheduler();
var results = scheduler.CreateObserver<int>();
var source = scheduler.CreateColdObservable(
OnNext(100, 1)
);
var dueTime = TimeSpan.FromTicks(100);
var maxTime = TimeSpan.FromTicks(250);
source.ThrottleMax(dueTime, maxTime, scheduler)
.Subscribe(results);
scheduler.AdvanceTo(1000);
results.Messages.AssertEqual(
OnNext(200, 1)
);
}
[TestMethod]
public void CanThrottleWithMaximumInterval()
{
var scheduler = new TestScheduler();
var results = scheduler.CreateObserver<int>();
var source = scheduler.CreateColdObservable(
OnNext(100, 1),
OnNext(175, 2),
OnNext(250, 3),
OnNext(325, 4),
OnNext(400, 5)
);
var dueTime = TimeSpan.FromTicks(100);
var maxTime = TimeSpan.FromTicks(250);
source.ThrottleMax(dueTime, maxTime, scheduler)
.Subscribe(results);
scheduler.AdvanceTo(1000);
results.Messages.AssertEqual(
OnNext(350, 4),
OnNext(500, 5)
);
}
[TestMethod]
public void CanThrottleWithoutMaximumIntervalInterferance()
{
var scheduler = new TestScheduler();
var results = scheduler.CreateObserver<int>();
var source = scheduler.CreateColdObservable(
OnNext(100, 1),
OnNext(325, 2)
);
var dueTime = TimeSpan.FromTicks(100);
var maxTime = TimeSpan.FromTicks(250);
source.ThrottleMax(dueTime, maxTime, scheduler)
.Subscribe(results);
scheduler.AdvanceTo(1000);
results.Messages.AssertEqual(
OnNext(200, 1),
OnNext(425, 2)
);
}
}
Don't use Observable.Throttle, use Observable.Sample like this, where the TimeSpan gives the desired minimum interval between updates:
source.Sample(TimeSpan.FromMilliseconds(50))
