How to make error handling more efficient when using IObservable - C#

I was reading a textbook on reactive programming and learned that when an exception occurs in a stream, everything downstream of it is dead: that branch of the dataflow dies while the rest of the graph keeps functioning. Below is some code provided by the author that prevents the system from getting into such a state when an exception happens:
public static void Main(string[] args)
{
var inputs = new Subject<string>();
var (rates, errors) = inputs.Safely(RatesApi.GetRateAsync); // rates is IObservable<decimal>, errors is IObservable<Exception>
IObservable<string> outputs = rates
.Select(Decimal.ToString) // Decimal.ToString is from a third party library
.Merge(errors.Select(ex => ex.Message))
.StartWith("Enter a currency pair like 'EURUSD', or 'q' to quit");
using (outputs.Subscribe(Console.WriteLine))
for (string input; (input = Console.ReadLine().ToUpper()) != "Q";)
inputs.OnNext(input);
}
public static (IObservable<R> Completed, IObservable<Exception> Faulted) Safely<T, R>(this IObservable<T> ts, Func<T, Task<R>> f) // uses a third-party functional programming library, which is not relevant to this question
{
// Applies a Task-returning function to each element in a stream. The result is a pair of streams: a stream of successfully computed values and a stream of exceptions.
// It also uses a try/catch block to catch the exceptions, wraps them in a wrapper type, then unwraps them back to Exception.
}
So Safely allows us to create two branches, each of which can be processed independently until a uniform representation for both cases is obtained, at which point they can be merged.
But there is a problem with this approach: RatesApi.GetRateAsync will be called twice every time the user enters a string value like 'EURUSD', because the program uses rates twice and it is not materialized. If I materialize rates by calling ToList(), I get an IObservable<IList<decimal>>, which doesn't fit into the chain. So how can we make the program call RatesApi.GetRateAsync only once per input while still preserving the Rx flow?

Related

Reentrance method and partial synchronized calls

I do have a singleton component that manages some information blocks. An information block is a calculated information identified by some characteristics (concrete an Id and a time period). These calculations may take some seconds. All information blocks are stored in a collection.
Some other consumers are using these information blocks. The calculation should start when the first request for this Id and time period comes. I had following flow in mind:
The first consumer requests the data identified by Id and time period.
The component checks if the information block already exists
If not: Create the information block, put it into the collection and start the calculation in a background task. If yes: Take it from the collection
After that the flow goes to the information block:
When the calculation is already finished (by a former call), a callback from the consumer is called with the result of the calculation.
When the calculation is still in process, the callback is called when the calculation is finished.
So far, so good.
The critical section comes when the second (or any subsequent) call arrives while the calculation is still running. The idea is that the information block holds each consumer's callback and, once the calculation is finished, calls all of the consumers' callbacks.
public class SingletonInformationService
{
private readonly Collection<InformationBlock> blocks = new();
private object syncObject = new();
public void GetInformationBlock(Guid id, TimePeriod timePeriod,
Action<InformationBlock> callOnFinish)
{
InformationBlock block = null;
lock(syncObject)
{
// check out if the block already exists
block = blocks.SingleOrDefault(b => b.Id ...);
if (block == null)
{
block = new InformationBlock(...);
blocks.Add(block);
}
}
block?.BeginCalculation(callOnFinish);
}
}
public class InformationBlock
{
private Task calculationTask = null;
private CalculationState isCalculating = CalculationState.Unknown;
private List<Action<InformationBlock>> waitingRoom = new();
internal void BeginCalculation(Action<InformationBlock> callOnFinish)
{
if (isCalculating == CalculationState.Finished)
{
callOnFinish(this);
return;
}
else if (isCalculating == CalculationState.IsRunning)
{
waitingRoom.Add(callOnFinish);
return;
}
// add the first call to the waitingRoom
waitingRoom.Add(callOnFinish);
isCalculating = CalculationState.IsRunning;
calculationTask = Task.Run(() => { /* run the calculation and return its result */ })
.ContinueWith(taskResult =>
{
//.. apply the calculation result to local properties
this.Property1 = taskResult.Result.Property1;
// set the state to mark this instance as complete
isCalculating = CalculationState.Finished;
// inform all calls about the result
waitingRoom.ForEach(c => c(this));
waitingRoom.Clear();
}, TaskScheduler.FromCurrentSynchronizationContext());
}
}
Is that approach a good idea? Do you see any failures or possible deadlocks? The method BeginCalculation might be called more than once while the calculation is running. Should I await the calculationTask?
To get a deadlock, you need a cycle: object A depends on object B, which in turn depends on object A again. As far as I can see, that's not your case, since the InformationBlock class doesn't access the service but is only called by it.
The lock block is also very small, so it probably won't cause you trouble.
You could look at the thread-safe collections in the .NET standard library. They could simplify your code.
I suggest using a ConcurrentDictionary, because a keyed lookup is faster than iterating over the collection on every request.
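As a rough sketch of that suggestion (not the asker's actual types), the lock, the collection scan and the waiting list can be replaced by a ConcurrentDictionary whose values are Lazy<Task<T>>. Every caller for the same key then awaits the same task, so the calculation starts once. The InformationCache class, the string period key and the calculate delegate are all hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical cache: one calculation per (id, period) key, shared by all callers.
public sealed class InformationCache<TResult>
{
    private readonly ConcurrentDictionary<(Guid Id, string Period), Lazy<Task<TResult>>> blocks = new();
    private readonly Func<Guid, string, TResult> calculate;

    public InformationCache(Func<Guid, string, TResult> calculate) => this.calculate = calculate;

    public Task<TResult> GetAsync(Guid id, string period)
    {
        // GetOrAdd may run this factory more than once under contention, but only one
        // Lazy instance is stored, and that Lazy starts its task at most once.
        var lazy = blocks.GetOrAdd(
            (id, period),
            key => new Lazy<Task<TResult>>(
                () => Task.Run(() => calculate(key.Id, key.Period)),
                LazyThreadSafetyMode.ExecutionAndPublication));

        // Callers that arrive while the calculation is running get the same task;
        // callers that arrive later get the already-completed task.
        return lazy.Value;
    }
}
```

A consumer would then write something like `var block = await cache.GetAsync(id, "2024-Q1");` instead of passing a callback. One design consequence worth noting: a faulted task stays cached, so a real implementation would need its own policy for retrying failed calculations.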

How to listen to change feed continuously RethinkDB

I am having the following problem with RethinkDB: the RunChangesAsync method runs once and, when used, starts listening to changes on a given query. When the query result changes, you are given a Cursor<Change<Class>>, which is a delta between the initial state and the actual state.
My question is how can I make this run continuously?
If I use:
while(true)
{
code.... //changes happening while program is here
....../
...RunChangesAsync();
/......processed buffered items
code //new changes here
}
If changes happen at the points I marked in the code, they are not caught by RunChanges. The only changes that are caught are the ones that happen while RunChanges is listening, not before it starts or after it retrieves the results.
So I tried wrapping the RunChanges in an observable but it does not listen continuously for changes as I would have expected...it just retrieves 2 null items (garbage I suppose) and ends.
Observable
public IObservable<Cursor<Change<UserStatus?>>> GetObservable() =>
r.Db(Constants.DB_NAME).Table(Constants.CLIENT_TABLE).RunChangesAsync<UserStatus?>(this.con,CancellationToken.None).ToObservable();
Observer
class PlayerSubscriber : IObserver<Cursor<Change<UserStatus?>>>
{
public void OnCompleted() => Console.WriteLine("Finished");
public void OnError(Exception error) => Console.WriteLine("error");
public void OnNext(Cursor<Change<UserStatus?>> value)
{
foreach (var item in value.BufferedItems)
Console.WriteLine(item);
}
}
Program
class Program
{
public static RethinkDB r = RethinkDB.R;
public static bool End = false;
static async Task Main(string[] args)
{
var address = new Address { Host = "127.0.0.1", Port = 28015 };
var con = await r.Connection().Hostname(address.Host).Port(address.Port).ConnectAsync();
var database = new Database(r, con);
var obs = database.GetObservable();
var sub = new PlayerSubscriber();
var disp = obs.Subscribe(sub);
Console.ReadKey();
Console.WriteLine("Hello World!");
}
}
When debugging, I can see that the OnNext method of the observer is executed only once (it receives two null objects) and then the stream closes.
P.S: Database is just a wrapper around rethinkdb queries. The only method used is GetObservable which I posted it. The UserStatus is a POCO.
When creating a change feed, you'll want to create one change feed object. For example, when you get back a Cursor<Change<T>> after running .RunChangesAsync(); that is really all you need.
The cursor object you get back from query.RunChangesAsync() is your change feed object that you will use for the entire lifetime you want to receive changes.
In your example:
while(true)
{
code.... //changes happening while program is here
....../
...RunChangesAsync();
/......processed buffered items
code //new changes here
}
Having .RunChangesAsync(); in a while loop is not the correct approach. You don't need to re-run the query again and get another Cursor<Change<T>>. I'll explain how this works at the end of this post.
Also, do not use cursor.BufferedItems on the cursor object. The cursor.BufferedItems property is not meant to be consumed by your code directly; it is only exposed for those special situations where you want to "peek ahead" inside the cursor object (client-side) for items that are ready to be consumed and that are specific to your change feed query.
The proper way to consume items in your change feed is to enumerate over the cursor object itself as shown below:
var cursor = await query.RunChangesAsync(conn);
foreach (var item in cursor){
Console.WriteLine(item);
}
When the cursor runs out of items, it will make a request to the RethinkDB server for more items. Keep in mind that each iteration of the foreach loop can potentially be a blocking call. For example, the foreach loop can block indefinitely when 1) there are no items on the client side to be consumed (.BufferedItems.Count == 0) and 2) no items have changed on the server side according to your change feed query criteria. Under these circumstances, the foreach loop will block until the RethinkDB server sends you an item that is ready to be consumed.
Documentation about using Reactive Extensions and RethinkDB in C#
There is a driver unit test that shows how .NET Reactive Extensions can work here.
Specifically, Lines 31 - 47 in this unit test set up a change feed with Reactive Extensions:
var changes = R.Db(DbName).Table(TableName)
//.changes()[new {include_states = true, include_initial = true}]
.Changes()
.RunChanges<JObject>(conn);
changes.IsFeed.Should().BeTrue();
var observable = changes.ToObservable();
//use a new thread if you want to continue,
//otherwise, subscription will block.
observable.SubscribeOn(NewThreadScheduler.Default)
.Subscribe(
x => OnNext(x),
e => OnError(e),
() => OnCompleted()
);
Additionally, here is a good example and explanation of what happens and how to consume a change feed with C#:
Hope that helps.
Thanks,
Brian
If you have an operation that has the signature Task<int> ReadAsync(), then the way to set up polling, is like this:
IObservable<int> PollRead(TimeSpan interval)
{
return
Observable
.Interval(interval)
.SelectMany(n => Observable.FromAsync(() => ReadAsync()));
}
I'd also caution against creating your own implementation of IObservable<T> - it's fraught with danger. You should use Observer.Create(...) if you are creating your own observer that you want to hand around. Generally you don't even do that.
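For illustration, here is a minimal sketch of building an observer with Observer.Create instead of a hand-written IObserver<T> class. It assumes the System.Reactive package is referenced; the ObserverSketch class, the CreateLoggingObserver method and the log callback are hypothetical names, not part of Rx or of the code above.

```csharp
using System;
using System.Reactive;

public static class ObserverSketch
{
    // Builds an IObserver<int> from three delegates. Observer.Create supplies the
    // Rx contract handling (for example, ignoring notifications after completion)
    // that a hand-rolled IObserver<T> implementation would otherwise need to get right.
    public static IObserver<int> CreateLoggingObserver(Action<string> log) =>
        Observer.Create<int>(
            value => log($"next: {value}"),
            error => log($"error: {error.Message}"),
            () => log("completed"));
}
```

The resulting observer can be passed to any `Subscribe` call, for example `source.Subscribe(ObserverSketch.CreateLoggingObserver(Console.WriteLine))`, where `source` is whatever `IObservable<int>` you already have.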

Adding Sample(TimeSpan span) to Reactive Extensions pipeline causes threading issues

Using Reactive Extensions, I have created a rolling buffer of values that caches a small history of recent values in a data stream for use in a plotting application. Since the values arrive much faster than I am interested in displaying them, I would like to use the Sample(TimeSpan span) method in my Reactive pipeline to slow things down. However, adding it to the sample below causes an exception to be thrown after a bit in the WriteEnumerable method (collection was modified). This is obviously a threading issue related to Sample, but I'm stumped on how exactly to alleviate it. I've tried setting the scheduler used by the Sample method, to no avail.
Any advice?
class Program
{
static void Main(string[] args)
{
Observable.Interval(TimeSpan.FromSeconds(0.1))
.Take(500)
.TimedRollingBuffer(TimeSpan.FromSeconds(10))
.Sample(TimeSpan.FromSeconds(0.5))
.Subscribe(frame => WriteEnumerable(frame));
var input = "";
while (input != "exit")
{
input = Console.ReadLine();
}
}
private static void WriteEnumerable<T>(IEnumerable<T> enumerable)
{
foreach (T thing in enumerable)
Console.WriteLine(thing + " " + DateTime.UtcNow);
Console.WriteLine(Environment.NewLine);
}
}
public static class Extensions
{
public static IObservable<IEnumerable<Timestamped<T>>> TimedRollingBuffer<T>(this IObservable<T> observable, TimeSpan timeRange)
{
return Observable.Create<IEnumerable<Timestamped<T>>>(
o =>
{
var queue = new Queue<Timestamped<T>>();
return observable.Timestamp().Subscribe(
tx =>
{
queue.Enqueue(tx);
DateTime now = DateTime.Now;
while (queue.Peek().Timestamp < now.Subtract(timeRange))
queue.Dequeue();
o.OnNext(queue);
},
ex => o.OnError(ex),
() => o.OnCompleted()
);
});
}
}
credit where credit is due: reactive extensions sliding time window
The exception is due to the fact you are modifying the queue whilst enumerating it.
The RollingBuffer implementation by @Enigmativity that you cited does the right thing - you will notice that in his implementation OnNext is invoked with a ToArray(), ensuring that a copy of the list as it stands is dispatched to observers rather than the mutating original.
In your case, the introduction of Sample introduces concurrency (which is fine - that's what it is supposed to do) - however, you are passing the queue itself, which will mutate whilst enumeration is occurring due to this introduced concurrency. This is a bug.
In your TimedRollingBuffer, if you were to use o.OnNext(queue.ToArray()) instead of o.OnNext(queue) you wouldn't have this problem.
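The difference between handing out the live queue and handing out a snapshot can be seen without Rx at all. The sketch below is a single-threaded stand-in for the race described above: it mutates the queue while a consumer is enumerating, once over the queue itself and once over a ToArray() copy. SnapshotDemo and ThrowsWhenMutated are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SnapshotDemo
{
    // Enumerates `items` while enqueueing into `queue` on every step.
    // Returns true if the enumeration fails because the collection was modified.
    public static bool ThrowsWhenMutated(Queue<int> queue, IEnumerable<int> items)
    {
        try
        {
            foreach (var item in items)
                queue.Enqueue(item + 100); // stands in for the producer adding a new value
            return false;
        }
        catch (InvalidOperationException)
        {
            // Queue<T> enumerators detect modification and refuse to continue.
            return true;
        }
    }
}
```

With `var queue = new Queue<int>(new[] { 1, 2, 3 });`, calling `SnapshotDemo.ThrowsWhenMutated(queue, queue)` returns true, while `SnapshotDemo.ThrowsWhenMutated(queue, queue.ToArray())` returns false, because the array is an independent copy that later Enqueue calls cannot touch.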

Creating generated sequence of events as a cold sequence

FWIW - I'm scrapping the previous version of this question in favor of different one along the same way after asking for advice on meta
I have a webservice that contains configuration data. I would like to call it at regular intervals Tok in order to refresh the configuration data in the application that uses it. If the service is in error (timeout, down, etc) I want to keep the data from the previous call and call the service again after a different time interval Tnotok. Finally I want the behavior to be testable.
Since managing time sequences and testability seems like a strong point of the Reactive Extensions, I started using an Observable that will be fed by a generated sequence. Here is how I create the sequence:
Observable.Generate<DataProviderResult, DataProviderResult>(
// we start with some empty data
new DataProviderResult() {
Failures = 0
, Informations = new List<Information>()},
// never stop
(r) => true,
// there is no iteration
(r) => r,
// we get the next value from a call to the webservice
(r) => FetchNextResults(r),
// we select time for next msg depending on the current failures
(r) => r.Failures > 0 ? tnotok : tok,
// we pass a TestScheduler
scheduler)
.Subscribe(r => HandleResults(r));
I have two problems currently:
It looks like I am creating a hot observable. Even when I try to use Publish/Connect, the subscribed action misses the first event. How can I create it as a cold observable?
myObservable = myObservable.Publish();
myObservable.Subscribe(r => HandleResults(r));
myObservable.Connect(); // doesn't call onNext for first element in sequence
When I subscribe, the order of the subscription and the generation seems off, since for any frame the subscription method is fired before the FetchNextResults method. Is that normal? I would expect the sequence to call the method for frame f, not f+1.
Here is the code that I'm using for fetching and subscription:
private DataProviderResult FetchNextResults(DataProviderResult previousResult)
{
Console.WriteLine(string.Format("Fetching at {0:hh:mm:ss:fff}", scheduler.Now));
try
{
return new DataProviderResult() { Informations = dataProvider.GetInformation().ToList(), Failures = 0};
}
catch (Exception)
{}
previousResult.Failures++;
return previousResult;
}
private void HandleResults(DataProviderResult result)
{
Console.WriteLine(string.Format("Managing at {0:hh:mm:ss:fff}", scheduler.Now));
dataResult = result;
}
Here is what I'm seeing that prompted me articulating these questions:
Starting at 12:00:00:000
Fetching at 12:00:00:000 < no managing the result that has been fetched here
Managing at 12:00:01:000 < managing before fetching for frame f
Fetching at 12:00:01:000
Managing at 12:00:02:000
Fetching at 12:00:02:000
EDIT: Here is a bare bones copy-pastable program that illustrates the problem.
/*using System;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using Microsoft.Reactive.Testing;*/
private static int fetchData(int i, IScheduler scheduler)
{
writeTime("fetching " + (i+1).ToString(), scheduler);
return i+1;
}
private static void manageData(int i, IScheduler scheduler)
{
writeTime("managing " + i.ToString(), scheduler);
}
private static void writeTime(string msg, IScheduler scheduler)
{
Console.WriteLine(string.Format("{0:mm:ss:fff} {1}", scheduler.Now, msg));
}
private static void Main(string[] args)
{
var scheduler = new TestScheduler();
writeTime("start", scheduler);
var datas = Observable.Generate<int, int>(fetchData(0, scheduler),
(d) => true,
(d) => fetchData(d, scheduler),
(d) => d,
(d) => TimeSpan.FromMilliseconds(1000),
scheduler)
.Subscribe(i => manageData(i, scheduler));
scheduler.AdvanceBy(TimeSpan.FromMilliseconds(3000).Ticks);
}
This outputs the following:
00:00:000 start
00:00:000 fetching 1
00:01:000 managing 1
00:01:000 fetching 2
00:02:000 managing 2
00:02:000 fetching 3
I don't understand why the managing of the first element is not picked up immediately after its fetching. There is one second between the sequence effectively pulling the data and the data being handed to the observer. Am I missing something here or is it expected behavior? If so is there a way to have the observer react immediately to the new value?
You are misunderstanding the purpose of the timeSelector parameter. It is called each time a value is generated and it returns a time which indicates how long to delay before delivering that value to observers and then generating the next value.
Here's a non-Generate way to tackle your problem.
private DataProviderResult FetchNextResult()
{
// let exceptions throw
return new DataProviderResult { Informations = dataProvider.GetInformation().ToList() };
}
private IObservable<DataProviderResult> CreateObservable(IScheduler scheduler)
{
// an observable that produces a single result then completes
var fetch = Observable.Defer(
() => Observable.Return(FetchNextResult()));
// concatenate this observable with one that will pause
// for "tok" time before completing.
// This observable will send the result
// then pause before completing.
var fetchThenPause = fetch.Concat(Observable
.Empty<DataProviderResult>()
.Delay(tok, scheduler));
// Now, if fetchThenPause fails, we want to consume/ignore the exception
// and then pause for tnotok time before completing with no results
var fetchPauseOnErrors = fetchThenPause.Catch(Observable
.Empty<DataProviderResult>()
.Delay(tnotok, scheduler));
// Now, whenever our observable completes (after its pause), start it again.
var fetchLoop = fetchPauseOnErrors.Repeat();
// Now use Publish(initialValue) so that we remember the most recent value
var fetchLoopWithMemory = fetchLoop.Publish(null);
// YMMV from here on. Lets use RefCount() to start the
// connection the first time someone subscribes
var fetchLoopAuto = fetchLoopWithMemory.RefCount();
// And lets filter out that first null that will arrive before
// we ever get the first result from the data provider
return fetchLoopAuto.Where(t => t != null);
}
public MyClass(IScheduler scheduler)
{
Information = CreateObservable(scheduler);
}
public IObservable<DataProviderResult> Information { get; private set; }
Generate produces cold observable sequences, so that is my first alarm bell.
I tried to pull your code into LINQPad* and run it, and changed it a bit to focus on the problem. It seems to me that you have the Iterator and ResultSelector functions confused. These are back-to-front. When you iterate, you should take the value from your last iteration and use it to produce your next value. The result selector is used to pick off (Select) the value from the instance you are iterating on.
So in your case, the type you are iterating on is the type you want to produce values of. Therefore keep your ResultSelector function as just the identity function x=>x, and your IteratorFunction should be the one that makes the WebService call.
Observable.Generate<DataProviderResult, DataProviderResult>(
// we start with some empty data
new DataProviderResult() {
Failures = 0
, Informations = new List<Information>()},
// never stop
(r) => true,
// we get the next value(iterate) by making a call to the webservice
(r) => FetchNextResults(r),
// there is no projection
(r) => r,
// we select time for next msg depending on the current failures
(r) => r.Failures > 0 ? tnotok : tok,
// we pass a TestScheduler
scheduler)
.Subscribe(r => HandleResults(r));
As a side note, try to prefer immutable types instead of mutating values as you iterate.
*Please provide an autonomous working snippet of code so people can better answer your question. :-)

store retrieve IObservable subscription state in Rx

[ this question is in the realm of Reactive Extensions (Rx) ]
A subscription that needs to continue on application restart
int nValuesBeforeOutput = 123;
myStream.Buffer(nValuesBeforeOutput).Subscribe(
i => Debug.WriteLine("Something Critical on Every 123rd Value"));
Now I need to serialize and deserialize the state of this subscription so that next time the application is started the buffer count does NOT start from zero, but from whatever the buffer count got to before application exit.
How could you persist the state of IObservable.Subscribe() in this case and later load it?
Is there a general solution to saving observer state in Rx?
From Answer to Solution
Based on Paul Betts approach, here's a semi-generalizable implementation that worked in my initial testing
Use
int nValuesBeforeOutput = 123;
var myRecordableStream = myStream.Record(serializer);
myRecordableStream.Buffer(nValuesBeforeOutput).ClearRecords(serializer).Subscribe(
i => Debug.WriteLine("Something Critical on Every 123rd Value"));
Extension methods
private static bool _alreadyRecording;
public static IObservable<T> Record<T>(this IObservable<T> input,
IRepositor repositor)
{
IObservable<T> output = input;
List<T> records = null;
if (repositor.Deserialize(ref records))
{
ISubject<T> history = new ReplaySubject<T>();
records.ForEach(history.OnNext);
output = input.Merge(history);
}
if (!_alreadyRecording)
{
_alreadyRecording = true;
input.Subscribe(i => repositor.SerializeAppend(new List<T> {i}));
}
return output;
}
public static IObservable<T> ClearRecords<T>(this IObservable<T> input,
IRepositor repositor)
{
input.Subscribe(i => repositor.Clear());
return input;
}
Notes
This will not work for storing states that depend on time-intervals between the values produced
You need a serializer implementation that supports serializing T
_alreadyRecording is needed if you subscribe to myRecordableStream more than once
_alreadyRecording is a static boolean, very ugly, and prevents the extension methods from being used in more than one place if needing parallel subscriptions - needs to be re-implemented for future use
There is no general solution for this, and making one would be NonTrivial™. The closest thing you can do is make myStream some sort of replay Observable (i.e. instead of serializing the state, serialize the state of myStream and redo the work to get you back to where you were).
