My server side sends me batches of messages. The number of messages in a batch and their frequency are arbitrary. At times I get messages at 1-minute intervals and sometimes no messages for an hour; anywhere from 1 to 10 messages per batch.
My current implementation uses Observable.Buffer(TimeSpan.FromSeconds(5)) to group the messages and send them to the subscriber.
Instead of having to check every 5 seconds, is there a way to configure the Observable to send its buffered messages to the subscriber when there is a delay of x seconds between two messages?
How to avoid an unnecessary timer ticking every 5 seconds? (I'm open to other suggestions to optimize the batch processing.)
Using a bufferClosingSelector factory method
decPL suggested using the overload of Buffer that accepts a bufferClosingSelector - a factory function that is called at the opening of each new buffer. It produces a stream whose first OnNext() or OnCompleted() signals flushing the current buffer. decPL's code looked like this:
observable.Buffer(() => observable.Throttle(TimeSpan.FromSeconds(5)))
This makes considerable progress towards a solution, but it has a couple of problems:
No buffer will be emitted during sustained periods of activity in which messages are consistently published within the throttle duration. This could result in large, infrequently published lists.
There are multiple subscriptions to the source; if it is cold this may have unintended side effects. The bufferClosingSelector factory is called after each buffer closing, so if the source is cold it would be throttling from the initial events, rather than the most recent.
Preventing indefinite throttling
We need to use an additional mechanism to limit the buffer length and prevent indefinite throttling. Buffer has an overload that allows you to specify a maximum length, but unfortunately you can't combine this with a closing selector.
Let's call the desired buffer length limit n. Recall that the first OnNext from the closing selector is enough to close the buffer, so all we need to do is Merge the throttle with a counting stream that sends an OnNext after n events from the source. We can use .Take(n).LastAsync() to do this: take the first n events, but ignore all but the last of them. This is a very useful pattern in Rx.
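For example, a minimal sketch of such a counting stream, assuming n = 3 and with source standing for any event stream (the name bufferFull is made up for illustration):
// Emits a single notification as soon as the 3rd event arrives, then completes.
var bufferFull = source.Take(3).LastAsync();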
Making the source "hot"
In order to address the issue of the bufferClosingSelector factory resubscribing to the source, we need to use the common pattern of .Publish().RefCount() on the source. This gives us a stream that shares a single underlying subscription among all subscribers, so each new closing selector sees only events from that point on rather than replaying the source from the beginning. This is also a very useful pattern to remember.
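As a minimal illustration (source stands for any cold observable):
// Each Subscribe on a cold source would otherwise restart it from scratch.
var shared = source.Publish().RefCount();
// The buffer and every closing selector now share one subscription to source.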
Solution
Here is the reworked code, where the throttle duration is merged with a counter:
var throttleDuration = TimeSpan.FromSeconds(5);
var bufferSize = 3;
// single subscription to source
var sourcePub = source.Publish().RefCount();
var output = sourcePub.Buffer(
() => sourcePub.Throttle(throttleDuration)
.Merge(sourcePub.Take(bufferSize).LastAsync()));
Production Ready Code & Tests
Here is a production-ready implementation with tests (using the NuGet packages rx-testing and nunit). Note the parameterization of the scheduler to support testing.
public static partial class ObservableExtensions
{
public static IObservable<IList<TSource>> BufferNearEvents<TSource>(
this IObservable<TSource> source,
TimeSpan maxInterval,
int maxBufferSize,
IScheduler scheduler)
{
if (scheduler == null) scheduler = ThreadPoolScheduler.Instance;
if (maxBufferSize <= 0)
throw new ArgumentOutOfRangeException(
"maxBufferSize", "maxBufferSize must be positive");
var publishedSource = source.Publish().RefCount();
return publishedSource.Buffer(
() => publishedSource
.Throttle(maxInterval, scheduler)
.Merge(publishedSource.Take(maxBufferSize).LastAsync()));
}
}
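For example, a hypothetical usage against a message stream (messages and the batch-sending code are assumptions for illustration):
var batches = messages.BufferNearEvents(
    TimeSpan.FromSeconds(5), 10, ThreadPoolScheduler.Instance);
batches.Subscribe(batch => Console.WriteLine("Sending {0} messages", batch.Count));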
public class BufferNearEventsTests : ReactiveTest
{
[Test]
public void CloseEventsAreBuffered()
{
TimeSpan maxInterval = TimeSpan.FromTicks(200);
const int maxBufferSize = 1000;
var scheduler = new TestScheduler();
var source = scheduler.CreateColdObservable(
OnNext(100, 1),
OnNext(200, 2),
OnNext(300, 3));
IList<int> expectedBuffer = new [] {1, 2, 3};
var expectedTime = maxInterval.Ticks + 300;
var results = scheduler.CreateObserver<IList<int>>();
source.BufferNearEvents(maxInterval, maxBufferSize, scheduler)
.Subscribe(results);
scheduler.AdvanceTo(1000);
results.Messages.AssertEqual(
OnNext<IList<int>>(expectedTime, buffer => CheckBuffer(expectedBuffer, buffer)));
}
[Test]
public void FarEventsAreUnbuffered()
{
TimeSpan maxInterval = TimeSpan.FromTicks(200);
const int maxBufferSize = 1000;
var scheduler = new TestScheduler();
var source = scheduler.CreateColdObservable(
OnNext(1000, 1),
OnNext(2000, 2),
OnNext(3000, 3));
IList<int>[] expectedBuffers =
{
new[] {1},
new[] {2},
new[] {3}
};
var expectedTimes = new[]
{
maxInterval.Ticks + 1000,
maxInterval.Ticks + 2000,
maxInterval.Ticks + 3000
};
var results = scheduler.CreateObserver<IList<int>>();
source.BufferNearEvents(maxInterval, maxBufferSize, scheduler)
.Subscribe(results);
scheduler.AdvanceTo(10000);
results.Messages.AssertEqual(
OnNext<IList<int>>(expectedTimes[0], buffer => CheckBuffer(expectedBuffers[0], buffer)),
OnNext<IList<int>>(expectedTimes[1], buffer => CheckBuffer(expectedBuffers[1], buffer)),
OnNext<IList<int>>(expectedTimes[2], buffer => CheckBuffer(expectedBuffers[2], buffer)));
}
[Test]
public void UpToMaxEventsAreBuffered()
{
TimeSpan maxInterval = TimeSpan.FromTicks(200);
const int maxBufferSize = 2;
var scheduler = new TestScheduler();
var source = scheduler.CreateColdObservable(
OnNext(100, 1),
OnNext(200, 2),
OnNext(300, 3));
IList<int>[] expectedBuffers =
{
new[] {1,2},
new[] {3}
};
var expectedTimes = new[]
{
200, /* Buffer cap reached */
maxInterval.Ticks + 300
};
var results = scheduler.CreateObserver<IList<int>>();
source.BufferNearEvents(maxInterval, maxBufferSize, scheduler)
.Subscribe(results);
scheduler.AdvanceTo(10000);
results.Messages.AssertEqual(
OnNext<IList<int>>(expectedTimes[0], buffer => CheckBuffer(expectedBuffers[0], buffer)),
OnNext<IList<int>>(expectedTimes[1], buffer => CheckBuffer(expectedBuffers[1], buffer)));
}
private static bool CheckBuffer<T>(IEnumerable<T> expected, IEnumerable<T> actual)
{
CollectionAssert.AreEquivalent(expected, actual);
return true;
}
}
If I understood your description correctly, Observable.Buffer is still your friend, just using the overload that lets an observable event dictate when the buffered items should be sent. Something like the following:
observable.Buffer(() => observable.Throttle(TimeSpan.FromSeconds(5)))
This is an old question, but it seems related to my recent question. Enigmativity found a nice way to do what I think you want to achieve so I thought I'd share. I wrapped the solution in an extension method:
public static class ObservableExtensions
{
public static IObservable<T[]> Batch<T>(this IObservable<T> observable, TimeSpan timespan)
{
return observable.GroupByUntil(x => 1, g => Observable.Timer(timespan))
.Select(x => x.ToArray())
.Switch();
}
}
And it could be used like this:
observableSource.Batch(TimeSpan.FromSeconds(5));
I am using TPL Dataflow (TDF) for my application, which works great so far. Unfortunately, I have stumbled upon a specific problem that, it seems, cannot be handled directly with existing Dataflow mechanisms:
I have N producers (in this case BufferBlocks) which are all linked to only 1 (all to the same) ActionBlock. This block always processes 1 item at a time, and also only has capacity for 1 item.
To the link from the producers to the ActionBlock I also want to add a filter, but the special case here is that the filter condition can change independently of the processed item, and the item must not be discarded!
So basically I want to process all items, but the order and time at which an item is processed can change.
Unfortunately I learned that if an item is "declined" once (the filter condition evaluates to false) and the item is not passed to another block (e.g. NullTarget), the target block does not retry the same item and does not re-evaluate the filter.
public class ConsumeTest
{
private readonly BufferBlock<int> m_bufferBlock1;
private readonly BufferBlock<int> m_bufferBlock2;
private readonly ActionBlock<int> m_actionBlock;
public ConsumeTest()
{
m_bufferBlock1 = new BufferBlock<int>();
m_bufferBlock2 = new BufferBlock<int>();
var options = new ExecutionDataflowBlockOptions() { BoundedCapacity = 1, MaxDegreeOfParallelism = 1 };
m_actionBlock = new ActionBlock<int>((item) => BlockAction(item), options);
var start = DateTime.Now;
var elapsed = TimeSpan.FromMinutes(1);
m_bufferBlock1.LinkTo(m_actionBlock, x => IsTimeElapsed(start, elapsed));
m_bufferBlock2.LinkTo(m_actionBlock);
FillBuffers();
}
private void BlockAction(int item)
{
Console.WriteLine(item);
Thread.Sleep(2000);
}
private void FillBuffers()
{
for (int i = 0; i < 1000; i++)
{
if (i % 2 == 0)
{
m_bufferBlock1.Post(i);
}
else
{
m_bufferBlock2.Post(i);
}
}
}
private bool IsTimeElapsed(DateTime start, TimeSpan elapsed)
{
Console.WriteLine("checking time elapsed");
return DateTime.Now > (start + elapsed);
}
public async Task Start()
{
await m_actionBlock.Completion;
}
}
The code sets up a testing pipeline and fills the two buffers with odd and even numbers. Both BufferBlocks are connected to one single ActionBlock that just prints the "processed" number and waits two seconds.
The filter condition between m_bufferBlock1 and m_actionBlock checks (for testing purposes) whether one minute has elapsed since we started the whole thing.
If we run this, it generates the following output:
1
checking time elapsed
3
5
7
9
11
13
15
17
19
As we can see, the ActionBlock takes the first element from the BufferBlock without the filter, then tries to take an element from the BufferBlock with the filter. The filter evaluates to false, and the block then continues to take all elements from the block without the filter.
My expectation was that, after an element from the BufferBlock without the filter had been processed, the ActionBlock would try to take the element from the other BufferBlock (the one with the filter) again, re-evaluating the filter.
This would be my expected (or desired) result:
1
checking time elapsed
3
checking time elapsed
5
checking time elapsed
7
checking time elapsed
9
checking time elapsed
11
checking time elapsed
13
checking time elapsed
15
// after timer has elapsed take elements also from other buffer
2
17
4
19
My question now is: is there a way to "reset" the already "declined" message so that it is evaluated again, or is there another way to model this? To be clear, it is NOT important that items really are pulled from both buffers in strict alternation (I know that this is scheduling-dependent, and it is totally fine if two items from the same block are dequeued in a row from time to time).
But it is important that the "declined" message must not be discarded or re-queued, as the order within one buffer is important.
Thank you in advance
One idea is to refresh the link between the two blocks, periodically or on demand. Implementing a periodically refreshable LinkTo is not very difficult. Here is an implementation:
public static IDisposable LinkTo<TOutput>(this ISourceBlock<TOutput> source,
ITargetBlock<TOutput> target, Predicate<TOutput> predicate,
TimeSpan refreshInterval, DataflowLinkOptions linkOptions = null)
{
if (source == null) throw new ArgumentNullException(nameof(source));
if (target == null) throw new ArgumentNullException(nameof(target));
if (predicate == null) throw new ArgumentNullException(nameof(predicate));
if (refreshInterval < TimeSpan.Zero)
throw new ArgumentOutOfRangeException(nameof(refreshInterval));
linkOptions = linkOptions ?? new DataflowLinkOptions();
var locker = new object();
var cts = new CancellationTokenSource();
var token = cts.Token;
var currentLink = source.LinkTo(target, linkOptions, predicate);
var loopTask = Task.Run(async () =>
{
try
{
while (true)
{
await Task.Delay(refreshInterval, token).ConfigureAwait(false);
currentLink.Dispose();
currentLink = source.LinkTo(target, linkOptions, predicate);
}
}
finally
{
lock (locker) { cts.Dispose(); cts = null; }
}
}, token);
_ = Task.Factory.ContinueWhenAny(new[] { source.Completion, target.Completion },
_ => { lock (locker) cts?.Cancel(); }, token, TaskContinuationOptions.None,
TaskScheduler.Default);
return new Unlinker(() =>
{
lock (locker) cts?.Cancel();
// Synchronously wait for the task to complete, ignoring cancellation exceptions.
try { loopTask.GetAwaiter().GetResult(); } catch (OperationCanceledException) { }
currentLink.Dispose();
});
}
private struct Unlinker : IDisposable
{
private readonly Action _action;
public Unlinker(Action disposeAction) => _action = disposeAction;
void IDisposable.Dispose() => _action?.Invoke();
}
Usage example:
m_bufferBlock1.LinkTo(m_actionBlock, x => IsTimeElapsed(start, elapsed),
refreshInterval: TimeSpan.FromSeconds(10));
The link between the m_bufferBlock1 and the m_actionBlock will be refreshed every 10 seconds, until one of the two blocks completes.
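For the "on demand" variant mentioned above, a minimal sketch is simply to dispose and recreate the standard link whenever the filter's inputs change (the RefreshLink helper is hypothetical):
IDisposable link = m_bufferBlock1.LinkTo(m_actionBlock,
    x => IsTimeElapsed(start, elapsed));
void RefreshLink() // call whenever the filter's inputs change
{
    link.Dispose();
    link = m_bufferBlock1.LinkTo(m_actionBlock, x => IsTimeElapsed(start, elapsed));
}
Each relink makes the BufferBlock offer its queued messages to the target again, so the predicate is re-evaluated without any message being discarded.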
Using Rx's Buffer operator allows the creation of batches after a certain number of results have appeared, or after a specified time, whichever is sooner. This is very useful when piping results to, say, a database on another machine, where one wants to keep latency down but avoid sending huge numbers of requests (one per result).
I have an additional requirement, which is to preserve the ordering of results into the database (some are updates, which must come after the corresponding adds). This means that outgoing requests cannot overlap in case they get out of order.
Ideally each buffer should continue filling up even after it would normally emit if a previous database request has not yet returned, as this will minimise latency and the number of requests going to the database.
How could the following code be modified to make this work?
source
.Buffer(TimeSpan.FromSeconds(1), 25)
.Subscribe(async batch => await SendToDatabase(batch));
To force outgoing requests to wait until the previous one has returned before being processed, there is an Rx trick which turns each batch into an observable that completes only when its processing has finished. By combining these with Concat, the next one will not be started until the previous one completes.
source
.Buffer(TimeSpan.FromSeconds(1), 25)
.Select(batch =>
Observable.FromAsync(async () =>
await SendToDatabase(batch)
)
)
.Concat()
.Subscribe(); // the sends happen inside the concatenated FromAsync observables
This will still produce batches while waiting, though, so is not a perfect solution.
I have written a new observable extension BufferAndAct which does this.
In summary, it takes a time interval, a number (of items), and an action to be applied to each batch. It tries to act on a batch when the time interval expires or when the number of items has been reached, but it will never start acting on a new batch until the previous one has completed, so there is no limit on the potential size of a batch. Modifications could be made to bring this in line with some of the other overloads of Buffer.
It uses a further extension Split which acts like one of the overloads of Buffer, turning an observable of source items into an observable of observables of source items, splitting them when a signal is received from an input observable.
BufferAndAct uses Split to create an observable which gives a tick when a normal, timed, buffer would be emitted on the source observable, and is reset when the actual buffer is released. This could be later, because there is another observable which ticks when there is no request currently in progress. By zipping these two ticks together, Buffer can be used to emit a batch as soon as both criteria are met.
Usage is as follows:
source
.BufferAndAct(TimeSpan.FromSeconds(1), 25, async batch =>
await SendToDatabase(batch)
)
.Subscribe(r => {})
And the source for both extensions:
public static IObservable<TDest> BufferAndAct<TSource, TDest>(
this IObservable<TSource> source,
TimeSpan timeSpan,
int count,
Func<IList<TSource>, Task<TDest>> action
)
{
return new AnonymousObservable<TDest>(observer =>
{
var actionStartedObserver = new Subject<Unit>();
var actionCompleteObserver = new Subject<Unit>();
var published = source.Publish();
var batchReady = published
    .Select(i => Unit.Default)
    .Split(actionStartedObserver)
    .Select(s => s.Buffer(timeSpan, count).Select(u => Unit.Default).Take(1))
    .Concat();
var disposable = published
    .Buffer(Observable.Zip(actionCompleteObserver.StartWith(Unit.Default), batchReady))
    .SelectMany(async list =>
{
actionStartedObserver.OnNext(Unit.Default);
try
{
return await action(list);
}
finally
{
actionCompleteObserver.OnNext(Unit.Default);
}
}).Finally(() => {}).Subscribe(observer);
published.Connect();
return Disposable.Create(() =>
{
disposable.Dispose();
actionCompleteObserver.Dispose();
});
});
}
public static IObservable<Unit> BufferAndAct<TSource>(
this IObservable<TSource> source,
TimeSpan timeSpan,
int count,
Func<IList<TSource>, Task> action
)
{
return BufferAndAct(source, timeSpan, count, async s =>
{
    // Await the action so the batch isn't fired and forgotten.
    await action(s);
    return Unit.Default;
});
}
public static IObservable<IObservable<TSource>> Split<TSource>(
this IObservable<TSource> source,
IObservable<Unit> boundaries
)
{
return Observable.Create<IObservable<TSource>>(observer =>
{
var tuple = Split(observer);
var d1 = boundaries.Subscribe(tuple.Item2);
var d2 = source.Subscribe(tuple.Item1);
return Disposable.Create(() =>
{
d2.Dispose();
d1.Dispose();
});
});
}
private static Tuple<IObserver<TSource>, IObserver<Unit>> Split<TSource>(this IObserver<IObservable<TSource>> output)
{
ReplaySubject<TSource> obs = null;
var completed = 0; // int not bool to use in interlocked
Action newObservable = () =>
{
obs?.OnCompleted();
obs = new ReplaySubject<TSource>();
output.OnNext(obs);
};
Action completeOutput = () =>
{
if (Interlocked.CompareExchange(ref completed, 1, 0) == 0)
{
output.OnCompleted();
}
};
newObservable();
return new Tuple<IObserver<TSource>, IObserver<Unit>>(
    Observer.Create<TSource>(
        v => obs.OnNext(v), // lambda, so values reach the *current* subject
        output.OnError,
        () =>
        {
            obs.OnCompleted();
            completeOutput();
        }),
    Observer.Create<Unit>(
        s => newObservable(),
        output.OnError,
        () => completeOutput()));
}
I'm using Reactive Extensions for C#. I want several threads to enqueue items on a ConcurrentQueue. Then I want to Subscribe to that queue, but only get 1 element every 1 second. This answer almost works, but not when I add more elements to the queue.
Given a queue of ints: [1, 2, 3, 4, 5, 6]. I want Subscribe(Console.WriteLine) to print a value every second. I want to add more ints from another thread onto the queue while Rx is printing these numbers out. Any ideas?
To pace an input stream to output no faster than at a rate described by a Timespan interval, use this:
var paced = input.Select(i => Observable.Empty<T>()
.Delay(interval)
.StartWith(i)).Concat();
See here for an explanation. Here's an example implementation tailored to a concurrent queue that dequeues quickly. Note that using the ToObservable extension of IEnumerable<T> to convert ConcurrentQueue<T> to an observable directly would be a mistake, because sadly this observable completes as soon as the queue is empty. It's jolly annoying that - at least as far as I can see - there's no asynchronous dequeue on a ConcurrentQueue<T> and so I had to introduce a polling mechanism. Other abstractions (e.g. BlockingCollection<T>) may serve you better!
public static class ObservableExtensions
{
public static IObservable<T> Pace<T>(this ConcurrentQueue<T> queue,
TimeSpan interval)
{
var source = Observable.Create<T>(async (o, ct) => {
while(!ct.IsCancellationRequested)
{
T next;
while(queue.TryDequeue(out next))
o.OnNext(next);
// You might want to use some arbitrary shorter interval here
// to allow the stream to resume after a long delay in source
// events more promptly
await Task.Delay(interval, ct);
}
ct.ThrowIfCancellationRequested();
});
// this does the pacing
return source.Select(i => Observable.Empty<T>()
.Delay(interval)
.StartWith(i)).Concat()
.Publish().RefCount(); // to allow multiple subscribers
}
}
Example usage:
public static void Main()
{
var queue = new ConcurrentQueue<int>();
var stopwatch = new Stopwatch();
queue.Pace(TimeSpan.FromSeconds(1))
.Subscribe(
x => Console.WriteLine(stopwatch.ElapsedMilliseconds + ": x" + x),
e => Console.WriteLine(e.Message),
() => Console.WriteLine("Done"));
stopwatch.Start();
queue.Enqueue(1);
queue.Enqueue(2);
Thread.Sleep(500);
queue.Enqueue(3);
Thread.Sleep(5000);
queue.Enqueue(4);
queue.Enqueue(5);
queue.Enqueue(6);
Console.ReadLine();
}
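As an aside on the BlockingCollection<T> remark above, here is a minimal sketch of a polling-free alternative (assuming producers call Add; completion and disposal are left out for brevity):
var collection = new BlockingCollection<int>();
var observable = collection.GetConsumingEnumerable()
    .ToObservable(Scheduler.Default); // ties up a scheduler thread while waiting
Because GetConsumingEnumerable blocks until an item is available, no polling interval is needed, and the same Select/Delay/Concat pacing can be applied to the resulting observable as before.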
Maybe you will be satisfied with one of the Observable.Buffer overloads. But consider not using buffering with long-running subscriptions, because buffered elements can stress your RAM.
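For example, this overload emits a batch every second, or as soon as 10 items have arrived, whichever comes first (source standing for any observable over your items):
source.Buffer(TimeSpan.FromSeconds(1), 10);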
You can also build your own extension method with any desired behavior using Observable.Generate:
void Main()
{
var queue = new ConcurrentQueue<int>();
queue.Enqueue(1);
queue.Enqueue(2);
queue.Enqueue(3);
queue.Enqueue(4);
queue.ObserveConcurrentQueue(TimeSpan.FromSeconds(1)).DumpLive("queue");
}
// Define other methods and classes here
public static class Ex {
public static IObservable<T> ObserveConcurrentQueue<T>(this ConcurrentQueue<T> queue, TimeSpan period)
{
return Observable
.Generate(
queue,
x => true,
x => x,
x => x.DequeueOrDefault(),
x => period)
.Where(x => !Equals(x, default(T))); // static Equals avoids NRE for reference types
}
public static T DequeueOrDefault<T>(this ConcurrentQueue<T> queue)
{
T result;
if (queue.TryDequeue(out result))
return result;
else
return default(T);
}
}
I have a program where I'm receiving events and want to process them in batches, so that all items that come in while I'm processing the current batch will appear in the next batch.
The simple TimeSpan- and count-based Buffer methods in Rx will give me multiple batches of items instead of one big batch of everything that has come in (in cases where the subscriber takes longer than the specified TimeSpan, or where more items than the specified count arrive while it is busy).
I looked at using the more complex Buffer overloads that take Func<IObservable<TBufferClosing>> or IObservable<TBufferOpening> and Func<TBufferOpening, IObservable<TBufferClosing>>, but I can't find examples of how to use these, much less figure out how to apply them to what I'm trying to do.
Does this do what you want?
var xs = new Subject<int>();
var ys = new Subject<Unit>();
var zss =
xs.Buffer(ys);
zss
.ObserveOn(Scheduler.Default)
.Subscribe(zs =>
{
Thread.Sleep(1000);
Console.WriteLine(String.Join("-", zs));
ys.OnNext(Unit.Default);
});
ys.OnNext(Unit.Default);
xs.OnNext(1);
Thread.Sleep(200);
xs.OnNext(2);
Thread.Sleep(600);
xs.OnNext(3);
Thread.Sleep(400);
xs.OnNext(4);
Thread.Sleep(300);
xs.OnNext(5);
Thread.Sleep(900);
xs.OnNext(6);
Thread.Sleep(100);
xs.OnNext(7);
Thread.Sleep(1000);
My Result:
1-2-3
4-5
6-7
What you need is something that buffers the values and then, when the worker is ready, hands over the current buffer and resets it. This can be done with a combination of Rx and Task:
class TicTac<Stuff>
{
    private TaskCompletionSource<List<Stuff>> _tcs = new TaskCompletionSource<List<Stuff>>();
    private List<Stuff> _items; // null until the first Push after a reset

    public void Push(Stuff stuff)
    {
        lock (this)
        {
            if (_items == null)
            {
                _items = new List<Stuff>();
                _tcs.SetResult(_items); // wake up the waiting worker
            }
            _items.Add(stuff);
        }
    }

    private void Reset()
    {
        lock (this)
        {
            _tcs = new TaskCompletionSource<List<Stuff>>();
            _items = null;
        }
    }

    public async Task<List<Stuff>> Items()
    {
        List<Stuff> list = await _tcs.Task;
        Reset();
        return list;
    }
}
then
var tictac = new TicTac<double>();
IObservable<double> source = ....
source.Subscribe(x=>tictac.Push(x));
Then in your worker
while (true)
{
    var items = await tictac.Items();
    Thread.Sleep(100);
    foreach (var item in items)
    {
        Console.WriteLine(item);
    }
}
The way I have done this before is to pull up the ObserveOn method in DotPeek/Reflector and adapt its queuing concept to our requirements. For example, in UI applications with fast-ticking data (like finance) the UI thread can get flooded with events and sometimes it can't update quickly enough. In these cases we want to drop all events except the last one (for a particular instrument), so we changed the internal queue of ObserveOn to a single value of T (look for ObserveLatestOn(IScheduler)). In your case you want to keep the queue, but push the whole queue rather than just the first value. This should get you started.
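A minimal sketch of that adaptation, as an illustration rather than the decompiled Rx source (the name ObserveQueueOn is made up, and it assumes a scheduler that serializes its work, such as a UI scheduler):
public static IObservable<IList<T>> ObserveQueueOn<T>(
    this IObservable<T> source, IScheduler scheduler)
{
    return Observable.Create<IList<T>>(observer =>
    {
        var gate = new object();
        var queue = new List<T>();
        var scheduled = false;
        return source.Subscribe(
            x =>
            {
                bool schedule;
                lock (gate)
                {
                    queue.Add(x);
                    schedule = !scheduled; // at most one flush pending at a time
                    scheduled = true;
                }
                if (schedule)
                {
                    scheduler.Schedule(() =>
                    {
                        List<T> batch;
                        lock (gate)
                        {
                            batch = queue;        // hand over the whole queue...
                            queue = new List<T>();
                            scheduled = false;
                        }
                        observer.OnNext(batch);   // ...not just the latest value
                    });
                }
            },
            observer.OnError,
            observer.OnCompleted);
    });
}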
Kind of an expansion of @Enigmativity's answer. I have used this to solve the problem:
public static IObservable<(Action ready, IReadOnlyList<T> values)> BufferUntilReady<T>(this IObservable<T> stream)
{
var gate = new BehaviorSubject<Guid>(Guid.NewGuid());
void Ready() => gate.OnNext(Guid.NewGuid());
return stream.Publish(shared => shared
.Buffer(gate.CombineLatest(shared, ValueTuple.Create)
.DistinctUntilChanged(new AnyEqualityComparer<Guid, T>()))
.Where(x => x.Any())
.Select(x => ((Action) Ready, (IReadOnlyList<T>) x)));
}
public class AnyEqualityComparer<T1, T2> : IEqualityComparer<(T1 a, T2 b)>
{
public bool Equals((T1 a, T2 b) x, (T1 a, T2 b) y) => Equals(x.a, y.a) || Equals(x.b, y.b);
public int GetHashCode((T1 a, T2 b) obj) => throw new NotSupportedException();
}
The subscriber receives a Ready() function to be called when it is ready to receive the next buffer. I don't observe each buffer on the same thread, to avoid cycles, but I guess you could break the cycle somewhere else if you need each buffer to be handled on the same thread.
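A hypothetical usage (ProcessBatch stands in for the real work; as noted above, observing on another scheduler avoids cycles between ready and the next buffer):
stream.BufferUntilReady()
      .ObserveOn(Scheduler.Default)
      .Subscribe(t =>
      {
          ProcessBatch(t.values); // placeholder for the real processing
          t.ready();              // signal that the next buffer may be emitted
      });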
My scenario:
I have a computation that should run about once a second. After it runs there should be a wait of about 200ms for other stuff to catch up. If the computation is still running after a second, it should not be started a second time; instead the program should wait until it finishes and start the next computation 200ms afterwards.
The way I am doing it now:
_refreshFinished = new Subject<bool>();
_autoRefresher = Observable.Interval(TimeSpan.FromMilliseconds(1000))
.Zip(_refreshFinished, (x,y) => x)
.Subscribe(x => AutoRefresh(stuff));
The problem with this code is that I see no way to insert a delay after a computation finishes.
The Delay method only delays the first element of the observable collection. Usually this behaviour is the right one, since you would have to buffer an endless number of elements if you wanted to delay every one; but since delaying the call to AutoRefresh by 200ms delays the output of _refreshFinished by 200ms as well, there would be no buffer overhead here.
Basically I want an Observable that fires every MaxTime(some_call, 1000ms) and is then delayed by 200ms, or even better by some dynamic value. At this point I don't really care about the values running through it, although that might change in the future.
I'm open to any suggestions.
Observable.Generate() has a number of overloads which will let you dynamically adjust the time in which the next item is created.
For instance
IScheduler schd = Scheduler.TaskPool;
var timeout = TimeSpan.FromSeconds(1);
var shortDelay = TimeSpan.FromMilliseconds(200);
var longerDelay = TimeSpan.FromMilliseconds(500);
Observable.Generate(schd.Now,
time => true,
time => schd.Now,
time => new object(), // your code here
time => schd.Now.Subtract(time) > timeout ? shortDelay : longerDelay ,
schd);
This sounds more like a job for the new async framework http://msdn.microsoft.com/en-us/vstudio/gg316360
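For instance, a minimal async/await sketch of the same scheduling rule (RunComputation is a placeholder for the actual computation):
while (true)
{
    var cadence = Task.Delay(1000);   // the once-per-second cadence
    await RunComputation();           // the computation never overlaps itself
    await Task.Delay(200);            // 200ms recovery after finishing
    await cadence;                    // but never start faster than once a second
}
The next run starts at whichever is later: one second after the previous start, or 200ms after the previous computation finished.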
There is a way to do it. It's not the easiest thing ever, since the wait time has to be dynamically calculated for each value, but it works and is pretty generic.
When you use this code you can just insert the code that should be called into YOURCODE and everything else works automatically. Your code will basically be called every Max(yourCodeTime + extraDelay, usualCallTime + extraDelay). This means YOURCODE won't be called twice at the same time, and the app will always have extraDelay of time to do other stuff.
If there is some easier or other way to do this I would love to hear it.
double usualCallTime = 1000;
double extraDelay = 100;
var sub = new Subject<double>();
var subscription =
    sub.TimeInterval()
       .Select(x =>
       {
           // x.Value carries the previous timeToWait, so subtracting it
           // leaves just the time YOURCODE took to run.
           var processingTime = x.Interval.TotalMilliseconds - x.Value;
           double timeToWait =
               Math.Max(0, usualCallTime - processingTime) + extraDelay;
           return Observable.Timer(TimeSpan.FromMilliseconds(timeToWait))
                            .Select(ignore => timeToWait);
       })
       .Switch()
       .Subscribe(x => { YOURCODE(); sub.OnNext(x); });
sub.OnNext(0);

private static void YOURCODE()
{
    // do stuff here
}
If I understand your problem correctly, you have a long-running compute function such as this:
static String compute()
{
int t = 300 + new Random().Next(1000);
Console.Write("[{0}...", t);
Thread.Sleep(t);
Console.Write("]");
return Guid.NewGuid().ToString();
}
And you want to call this function at least once per second but without overlapping calls, and with a minimum 200ms recovery time between calls. The code below works for this situation.
I started with a more functional approach (using Scan() and Timestamp()), more in the style of Rx, because I was looking for a good Rx exercise; but in the end, this non-aggregating approach was just simpler.
static void Main()
{
TimeSpan period = TimeSpan.FromMilliseconds(1000);
TimeSpan recovery = TimeSpan.FromMilliseconds(200);
Observable
.Repeat(Unit.Default)
.Select(_ =>
{
var s = DateTimeOffset.Now;
var x = compute();
var delay = period - (DateTimeOffset.Now - s);
if (delay < recovery)
delay = recovery;
Console.Write("+{0} ", (int)delay.TotalMilliseconds);
return Observable.Return(x).Delay(delay).First();
})
.Subscribe(Console.WriteLine);
}
Here's the output:
[1144...]+200 a7cb5d3d-34b9-4d44-95c9-3e363f518e52
[1183...]+200 359ad966-3be7-4027-8b95-1051e3fb20c2
[831...]+200 f433b4dc-d075-49fe-9c84-b790274982d9
[766...]+219 310c9521-7bee-4acc-bbca-81c706a4632a
[505...]+485 0715abfc-db9b-42e2-9ec7-880d7ff58126
[1244...]+200 30a3002a-924a-4a64-9669-095152906d85
[1284...]+200 e5b1cd79-da73-477c-bca0-0870f4b5c640
[354...]+641 a43c9df5-53e8-4b58-a0df-7561cf4b0483
[1094...]+200 8f25019c-77a0-4507-b05e-c9ab8b34bcc3
[993...]+200 840281bd-c8fd-4627-9324-372636f8dea3
[edit: this sample uses Rx 2.0(RC) 2.0.20612.0]
Suppose you have an existing IObservable; then the following will work:
var delay = TimeSpan.FromSeconds(1.0);
var actual = source.Scan(
new ConcurrentQueue<object>(),
(q, i) =>
{
q.Enqueue(i);
return q;
}).CombineLatest(
Observable.Interval(delay),
(q, t) =>
{
object item;
if (q.TryDequeue(out item))
{
return item;
}
return null;
}).Where(v => v != null);
'actual' is your resultant observable. But keep in mind that the above code has turned it into a hot observable if it wasn't hot already, so you won't get 'OnCompleted' called.