FWIW: after asking for advice on meta, I'm scrapping the previous version of this question in favor of a different one along the same lines.
I have a webservice that contains configuration data. I would like to call it at regular intervals Tok in order to refresh the configuration data in the application that uses it. If the service is in error (timeout, down, etc) I want to keep the data from the previous call and call the service again after a different time interval Tnotok. Finally I want the behavior to be testable.
Since managing time sequences and testability seem like strong points of the Reactive Extensions, I started using an Observable fed by a generated sequence. Here is how I create the sequence:
Observable.Generate<DataProviderResult, DataProviderResult>(
// we start with some empty data
new DataProviderResult() {
Failures = 0
, Informations = new List<Information>()},
// never stop
(r) => true,
// there is no iteration
(r) => r,
// we get the next value from a call to the webservice
(r) => FetchNextResults(r),
// we select time for next msg depending on the current failures
(r) => r.Failures > 0 ? tnotok : tok,
// we pass a TestScheduler
scheduler)
.Subscribe(r => HandleResults(r));
I have two problems currently:
It looks like I am creating a hot observable. Even when I use Publish/Connect, the subscribed action misses the first event. How can I create it as a cold observable?
myObservable = myObservable.Publish();
myObservable.Subscribe(r => HandleResults(r));
myObservable.Connect(); // doesn't call onNext for first element in sequence
When I subscribe, the order of subscription and generation seems off: for any frame, the subscription method is fired before the FetchNextResults method. Is that normal? I would expect the sequence to call the method for frame f, not f+1.
Here is the code that I'm using for fetching and subscription:
private DataProviderResult FetchNextResults(DataProviderResult previousResult)
{
Console.WriteLine(string.Format("Fetching at {0:hh:mm:ss:fff}", scheduler.Now));
try
{
return new DataProviderResult() { Informations = dataProvider.GetInformation().ToList(), Failures = 0};
}
catch (Exception)
{}
previousResult.Failures++;
return previousResult;
}
private void HandleResults(DataProviderResult result)
{
Console.WriteLine(string.Format("Managing at {0:hh:mm:ss:fff}", scheduler.Now));
dataResult = result;
}
Here is what I'm seeing that prompted me to articulate these questions:
Starting at 12:00:00:000
Fetching at 12:00:00:000 < no managing the result that has been fetched here
Managing at 12:00:01:000 < managing before fetching for frame f
Fetching at 12:00:01:000
Managing at 12:00:02:000
Fetching at 12:00:02:000
EDIT: Here is a bare bones copy-pastable program that illustrates the problem.
/*using System;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using Microsoft.Reactive.Testing;*/
private static int fetchData(int i, IScheduler scheduler)
{
writeTime("fetching " + (i+1).ToString(), scheduler);
return i+1;
}
private static void manageData(int i, IScheduler scheduler)
{
writeTime("managing " + i.ToString(), scheduler);
}
private static void writeTime(string msg, IScheduler scheduler)
{
Console.WriteLine(string.Format("{0:mm:ss:fff} {1}", scheduler.Now, msg));
}
private static void Main(string[] args)
{
var scheduler = new TestScheduler();
writeTime("start", scheduler);
var datas = Observable.Generate<int, int>(fetchData(0, scheduler),
(d) => true,
(d) => fetchData(d, scheduler),
(d) => d,
(d) => TimeSpan.FromMilliseconds(1000),
scheduler)
.Subscribe(i => manageData(i, scheduler));
scheduler.AdvanceBy(TimeSpan.FromMilliseconds(3000).Ticks);
}
This outputs the following:
00:00:000 start
00:00:000 fetching 1
00:01:000 managing 1
00:01:000 fetching 2
00:02:000 managing 2
00:02:000 fetching 3
I don't understand why the managing of the first element is not picked up immediately after its fetching. There is one second between the sequence effectively pulling the data and the data being handed to the observer. Am I missing something here or is it expected behavior? If so is there a way to have the observer react immediately to the new value?
You are misunderstanding the purpose of the timeSelector parameter. It is called each time a value is generated and it returns a time which indicates how long to delay before delivering that value to observers and then generating the next value.
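To make that timing concrete, here is a minimal sketch. It assumes the System.Reactive package and uses its HistoricalScheduler as a stand-in for the TestScheduler from the question; GenerateTimingDemo and Run are names made up for illustration, not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Concurrency;
using System.Reactive.Linq;

// Hypothetical demo class: records when each generated value reaches the observer.
public static class GenerateTimingDemo
{
    public static List<(TimeSpan At, int Value)> Run()
    {
        var scheduler = new HistoricalScheduler();   // virtual time, nothing really sleeps
        var start = scheduler.Now;
        var seen = new List<(TimeSpan At, int Value)>();

        Observable.Generate(
                0,                                   // initial state
                state => true,                       // condition: never stop
                state => state + 1,                  // iterate: produce the next state
                state => state,                      // resultSelector: identity
                state => TimeSpan.FromSeconds(1),    // timeSelector: hold each value for 1s
                scheduler)
            .Subscribe(value => seen.Add((scheduler.Now - start, value)));

        // Move virtual time forward by three seconds and run whatever is due.
        scheduler.AdvanceBy(TimeSpan.FromSeconds(3));
        return seen;
    }

    public static void Main()
    {
        foreach (var (at, value) in Run())
            Console.WriteLine($"{at} value {value}");
    }
}
```

With a constant one-second timeSelector, nothing should be delivered at subscription time: the initial value is held for a second before the observer sees it, which is the same gap that shows up in the question's output.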
Here's a non-Generate way to tackle your problem.
private DataProviderResult FetchNextResult()
{
// let exceptions throw
return new DataProviderResult { Informations = dataProvider.GetInformation().ToList() };
}
private IObservable<DataProviderResult> CreateObservable(IScheduler scheduler)
{
// an observable that produces a single result then completes
var fetch = Observable.Defer(
() => Observable.Return(FetchNextResult()));
// concatenate this observable with one that will pause
// for "tok" time before completing.
// This observable will send the result
// then pause before completing.
var fetchThenPause = fetch.Concat(Observable
.Empty<DataProviderResult>()
.Delay(tok, scheduler));
// Now, if fetchThenPause fails, we want to consume/ignore the exception
// and then pause for tnotok time before completing with no results
var fetchPauseOnErrors = fetchThenPause.Catch(Observable
.Empty<DataProviderResult>()
.Delay(tnotok, scheduler));
// Now, whenever our observable completes (after its pause), start it again.
var fetchLoop = fetchPauseOnErrors.Repeat();
// Now use Publish(initialValue) so that we remember the most recent value
var fetchLoopWithMemory = fetchLoop.Publish(null);
// YMMV from here on. Lets use RefCount() to start the
// connection the first time someone subscribes
var fetchLoopAuto = fetchLoopWithMemory.RefCount();
// And lets filter out that first null that will arrive before
// we ever get the first result from the data provider
return fetchLoopAuto.Where(t => t != null);
}
public MyClass(IScheduler scheduler)
{
Information = CreateObservable(scheduler);
}
public IObservable<DataProviderResult> Information { get; private set; }
Generate produces cold observable sequences, so that is my first alarm bell.
I pulled your code into LINQPad* and ran it, changing it a bit to focus on the problem. It seems to me that you have the Iterator and ResultSelector functions confused: they are back to front. When you iterate, you should take the value from your last iteration and use it to produce your next value. The result selector is used to pick off (Select) the value from the instance you are iterating on.
So in your case, the type you are iterating on is the type you want to produce values of. Therefore keep your ResultSelector function as just the identity function x => x, and make your Iterator function the one that makes the web service call.
Observable.Generate<DataProviderResult, DataProviderResult>(
// we start with some empty data
new DataProviderResult() {
Failures = 0
, Informations = new List<Information>()},
// never stop
(r) => true,
// we get the next value(iterate) by making a call to the webservice
(r) => FetchNextResults(r),
// there is no projection
(r) => r,
// we select time for next msg depending on the current failures
(r) => r.Failures > 0 ? tnotok : tok,
// we pass a TestScheduler
scheduler)
.Subscribe(r => HandleResults(r));
As a side note, try to prefer immutable types instead of mutating values as you iterate.
*Please provide an autonomous working snippet of code so people can better answer your question. :-)
I have a singleton component that manages some information blocks. An information block is a calculated piece of information identified by some characteristics (concretely, an Id and a time period). These calculations may take a few seconds. All information blocks are stored in a collection.
Some other consumers use these information blocks. The calculation should start when the first request for a given Id and time period arrives. I had the following flow in mind:
The first consumer requests the data identified by Id and time period.
The component checks if the information block already exists
If not: Create the information block, put it into the collection and start the calculation in a background task. If yes: Take it from the collection
After that the flow goes to the information block:
When the calculation is already finished (by a former call), a callback from the consumer is called with the result of the calculation.
When the calculation is still in process, the callback is called when the calculation is finished.
So far, so good.
The critical section comes when a second (or any subsequent) call arrives while the calculation is still running. The idea is that the calculation method holds each consumer's callback, and when the calculation finishes, all consumers' callbacks are called.
public class SingletonInformationService
{
private readonly Collection<InformationBlock> blocks = new();
private object syncObject = new();
public void GetInformationBlock(Guid id, TimePeriod timePeriod,
Action<InformationBlock> callOnFinish)
{
InformationBlock block = null;
lock(syncObject)
{
// check out if the block already exists
block = blocks.SingleOrDefault(b => b.Id ...);
if (block == null)
{
block = new InformationBlock(...);
blocks.Add(block);
}
}
block?.BeginCalculation(callOnFinish);
}
}
public class InformationBlock
{
private Task calculationTask = null;
private CalculationState isCalculating = CalculationState.Unknown;
private List<Action<InformationBlock>> waitingRoom = new();
internal void BeginCalculation(Action<InformationBlock> callOnFinish)
{
if (isCalculating == CalculationState.Finished)
{
callOnFinish(this);
return;
}
else if (isCalculating == CalculationState.IsRunning)
{
waitingRoom.Add(callOnFinish);
return;
}
// add the first call to the waitingRoom
waitingRoom.Add(callOnFinish);
isCalculating = CalculationState.IsRunning;
calculationTask = Task.Run(() => { /* run the calculation */ })
.ContinueWith(taskResult =>
{
//.. apply the calculation result to local properties
this.Property1 = taskResult.Result.Property1;
// set the state to mark this instance as complete
isCalculating = CalculationState.Finished;
// inform all calls about the result
waitingRoom.ForEach(c => c(this));
waitingRoom.Clear();
}, TaskScheduler.FromCurrentSynchronizationContext());
}
}
Is this approach a good idea? Do you see any flaws or possible deadlocks? The method BeginCalculation might be called more than once while the calculation is running. Should I await the calculationTask?
To get a deadlock, you need a cycle: object A depends on object B, which in turn depends on object A. As far as I can see, that's not your case, since the InformationBlock class doesn't access the service but is only called by it.
The lock block is also very small, so it probably won't cause you trouble.
You could look at the thread-safe collections in the .NET standard library (System.Collections.Concurrent). They could simplify your code.
I suggest using a ConcurrentDictionary, because a keyed lookup is faster than iterating over the collection on every request.
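As a rough sketch of that suggestion, the cache below keys a ConcurrentDictionary on the id and time period and stores a Lazy<Task<string>> per key. This swaps the question's callback list for an awaitable task, so it is a different technique from the original code. InformationCache, GetAsync, Calculate and the string result are hypothetical placeholders.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the question's singleton service.
public class InformationCache
{
    private readonly ConcurrentDictionary<(Guid Id, DateTime From, DateTime To), Lazy<Task<string>>> blocks = new();
    private int calculations;

    // How many times the slow calculation has actually started.
    public int CalculationCount => Volatile.Read(ref calculations);

    public Task<string> GetAsync(Guid id, DateTime from, DateTime to)
    {
        // Under contention GetOrAdd may construct more than one Lazy, but only the
        // stored one is returned, so the calculation starts once per key.
        var lazy = blocks.GetOrAdd(
            (id, from, to),
            key => new Lazy<Task<string>>(() => Task.Run(() => Calculate(key.Id))));
        return lazy.Value;
    }

    private string Calculate(Guid id)
    {
        Interlocked.Increment(ref calculations);
        Thread.Sleep(100); // placeholder for the slow calculation
        return $"result for {id}";
    }
}
```

Every consumer awaits the same task: callers that arrive while the calculation is running wait for it, and callers that arrive afterwards get the finished result immediately, so no explicit waiting room or state flag is needed.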
I have a controller which returns a large JSON object. If this object does not exist, it is generated and then returned. The generation takes about 5 seconds, and if the client sends the request multiple times, the object gets generated with x times the children. So my question is: is there a way to block the second request until the first one has finished, regardless of who sent the request?
Normally I would do it with a singleton, but because I have scoped services, a singleton does not work here.
Warning: this is very opinionated and maybe not suitable for Stack Overflow, but here it is anyway.
Although I'll provide no code: when things take a while to generate, you don't usually spend that time directly in controller code, but do something like "start a background task to generate the result, and provide a 'task id' which can be queried on a different call".
So, my preferred course of action for this would be having two different controller actions:
Generate, which creates the background job, assigns it some id, and returns the id
GetResult, to which you pass the task id, and which returns either an error code for "job id doesn't exist" or "job id isn't finished", or a 200 with the result.
This way, your clients will need to call both; however, in Generate you can check whether the job is already being created and return the existing job id.
This of course moves the need to "retry and check" to your client: in exchange, you don't leave the connection to the server open during those 5 seconds (which could potentially be multiplied by a number of clients) and you return fast.
Otherwise, if you don't care about having your clients wait for a response during those 5 seconds, you could do a simple:
if(resultDoesntExist) {
resultDoesntExist = false; // You can use locks for the boolean setters or Interlocked instead of just setting a member
resultIsBeingGenerated = true;
generateResult(); // <-- this is what takes 5 seconds
resultIsBeingGenerated = false;
}
while(resultIsBeingGenerated) { await Task.Delay(10); } // <-- other clients will wait here
var result = getResult(); // <-- this should be fast once the result is already created
return result;
Note: those booleans and the actual loop could live in the controller, in the service, or wherever you see fit; just be careful to make them thread-safe in whatever way you see appropriate.
So you basically make other clients wait till the first one generates the result, with "almost" no CPU load on the server... however with a connection open and a thread from the threadpool used, so I just DO NOT recommend this :-)
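If waiting on the server is acceptable, the polling loop can be replaced by an awaitable gate. The sketch below uses SemaphoreSlim, which is a swapped-in technique rather than the code above; ReportCache, GetOrGenerateAsync and the string result are hypothetical, and the instance has to be shared across requests (for example via a singleton registration or a static field).

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical cache; one shared instance guards the slow generation.
public class ReportCache
{
    private readonly SemaphoreSlim gate = new SemaphoreSlim(1, 1);
    private string cached;      // null until the result has been generated
    private int generations;

    // How many times the slow generation has actually run.
    public int Generations => Volatile.Read(ref generations);

    public async Task<string> GetOrGenerateAsync()
    {
        var existing = Volatile.Read(ref cached);
        if (existing != null) return existing;    // fast path once the result exists

        await gate.WaitAsync();                    // later callers wait here without polling
        try
        {
            if (cached == null)                    // re-check after acquiring the gate
            {
                Interlocked.Increment(ref generations);
                await Task.Delay(200);             // placeholder for the 5-second generation
                cached = "generated result";
            }
            return cached;
        }
        finally
        {
            gate.Release();
        }
    }
}
```

The first caller generates the result while the others wait asynchronously on the semaphore; once it is released they find the cached value and return it without generating again.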
PS: @Leaky's solution above is also good, but it also shifts the responsibility to retry to the client, and if you are going to do that, I'd probably go directly with a "background job id" instead of having the first request (the one that generates the result) take 5 seconds. IMO, if it can be avoided, no API action should ever take 5 seconds to return :-)
Do you have an example for Interlocked.CompareExchange?
Sure. I'm definitely not the most knowledgeable person when it comes to multi-threading stuff, but this is quite simple (as you might know, Interlocked has no support for bool, so it's customary to represent it with an integral type):
public class QueryStatus
{
private static int _flag;
// Returns false if the query has already started.
public bool TrySetStarted()
=> Interlocked.CompareExchange(ref _flag, 1, 0) == 0;
public void SetFinished()
=> Interlocked.Exchange(ref _flag, 0);
}
I think it's the safest if you use it like this, with a 'Try' method, which tries to set the value and tells you if it was already set, in an atomic way.
Besides simply adding this (I mean just the field and the methods) to your existing component, you can also use it as a separate component, injected from the IOC container as scoped. Or even injected as a singleton, and then you don't have to use a static field.
Storing state like this should be good for as long as the application is running, but if the hosted application is recycled due to inactivity, it's obviously lost. Though, that won't happen while a request is still processing, and definitely won't happen in 5 seconds.
(And if you wanted to synchronize between app service instances, you could 'quickly' save a flag to the database, in a transaction with proper isolation level set. Or use e.g. Azure Redis Cache.)
Example solution
As Kit noted, rightly so, I didn't provide a full solution above.
So, a crude implementation could go like this:
public class SomeQueryService : ISomeQueryService
{
private static int _hasStartedFlag;
private static bool TrySetStarted()
=> Interlocked.CompareExchange(ref _hasStartedFlag, 1, 0) == 0;
private static void SetFinished()
=> Interlocked.Exchange(ref _hasStartedFlag, 0);
public async Task<(bool couldExecute, object result)> TryExecute()
{
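if (!TrySetStarted())
return (couldExecute: false, result: null);
try
{
// Safely execute long query; the null below is a placeholder for its result.
object result = await Task.FromResult<object>(null);
return (couldExecute: true, result: result);
}
finally
{
// Reset the flag even if the query throws.
SetFinished();
}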
}
}
// In the controller, obviously
[HttpGet()]
public async Task<IActionResult> DoLongQuery([FromServices] ISomeQueryService someQueryService)
{
var (couldExecute, result) = await someQueryService.TryExecute();
if (!couldExecute)
{
return new ObjectResult(new ProblemDetails
{
Status = StatusCodes.Status503ServiceUnavailable,
Title = "Another request has already started. Try again later.",
Type = "https://tools.ietf.org/html/rfc7231#section-6.6.4"
})
{ StatusCode = StatusCodes.Status503ServiceUnavailable };
}
return Ok(result);
}
Of course possibly you'd want to extract the 'blocking' logic from the controller action into somewhere else, for example an action filter. In that case the flag should also go into a separate component that could be shared between the query service and the filter.
General use action filter
I felt bad about my inelegant solution above, and I realized that this problem can be generalized into basically a connection number limiter on an endpoint.
I wrote this small action filter that can be applied to any endpoint (multiple endpoints), and it accepts the number of allowed connections:
[AttributeUsage(AttributeTargets.Method, AllowMultiple = false)]
public class ConcurrencyLimiterAttribute : ActionFilterAttribute
{
private readonly int _allowedConnections;
private static readonly ConcurrentDictionary<string, int> _connections = new ConcurrentDictionary<string, int>();
public ConcurrencyLimiterAttribute(int allowedConnections = 1)
=> _allowedConnections = allowedConnections;
public override async Task OnActionExecutionAsync(ActionExecutingContext context, ActionExecutionDelegate next)
{
var key = context.HttpContext.Request.Path;
if (_connections.AddOrUpdate(key, 1, (k, v) => ++v) > _allowedConnections)
{
Close(withError: true);
return;
}
try
{
await next();
}
finally
{
Close();
}
void Close(bool withError = false)
{
if (withError)
{
context.Result = new ObjectResult(new ProblemDetails
{
Status = StatusCodes.Status503ServiceUnavailable,
Title = $"Maximum {_allowedConnections} simultaneous connections are allowed. Try again later.",
Type = "https://tools.ietf.org/html/rfc7231#section-6.6.4"
})
{ StatusCode = StatusCodes.Status503ServiceUnavailable };
}
_connections.AddOrUpdate(key, 0, (k, v) => --v);
}
}
}
I am having the following problem with RethinkDB: the RunChangesAsync method runs once and, when called, starts listening for changes on a given query. When the query's results change, you are given a Cursor<Change<Class>>, which is a delta between the initial state and the actual state.
My question is how can I make this run continuously?
If I use:
while(true)
{
code.... //changes happening while program is here
....../
...RunChangesAsync();
/......processed buffered items
code //new changes here
}
If changes happen where I pointed in the code, they would not be caught by RunChanges. The only changes that would be caught are those that occur while RunChanges is listening, not before or after it retrieves the results.
So I tried wrapping the RunChanges in an observable but it does not listen continuously for changes as I would have expected...it just retrieves 2 null items (garbage I suppose) and ends.
Observable
public IObservable<Cursor<Change<UserStatus?>>> GetObservable() =>
r.Db(Constants.DB_NAME).Table(Constants.CLIENT_TABLE).RunChangesAsync<UserStatus?>(this.con,CancellationToken.None).ToObservable();
Observer
class PlayerSubscriber : IObserver<Cursor<Change<UserStatus?>>>
{
public void OnCompleted() => Console.WriteLine("Finished");
public void OnError(Exception error) => Console.WriteLine("error");
public void OnNext(Cursor<Change<UserStatus?>> value)
{
foreach (var item in value.BufferedItems)
Console.WriteLine(item);
}
}
Program
class Program
{
public static RethinkDB r = RethinkDB.R;
public static bool End = false;
static async Task Main(string[] args)
{
var address = new Address { Host = "127.0.0.1", Port = 28015 };
var con = await r.Connection().Hostname(address.Host).Port(address.Port).ConnectAsync();
var database = new Database(r, con);
var obs = database.GetObservable();
var sub = new PlayerSubscriber();
var disp = obs.Subscribe(sub);
Console.ReadKey();
Console.WriteLine("Hello World!");
}
}
When I debug, the OnNext method of the observer is executed only once (it returns two null objects) and then it closes.
P.S: Database is just a wrapper around rethinkdb queries. The only method used is GetObservable which I posted it. The UserStatus is a POCO.
When creating a change feed, you'll want to create one change feed object. For example, when you get back a Cursor<Change<T>> after running .RunChangesAsync(); that is really all you need.
The cursor object you get back from query.RunChangesAsync() is your change feed object that you will use for the entire lifetime you want to receive changes.
In your example:
while(true)
{
code.... //changes happening while program is here
....../
...RunChangesAsync();
/......processed buffered items
code //new changes here
}
Having .RunChangesAsync(); in a while loop is not the correct approach. You don't need to re-run the query again and get another Cursor<Change<T>>. I'll explain how this works at the end of this post.
Also, do not use cursor.BufferedItems on the cursor object. The cursor.BufferedItems property is not meant to be consumed by your code directly; it is only exposed for those special situations where you want to "peek ahead" inside the cursor object (client-side) for items that are ready to be consumed and that are specific to your change feed query.
The proper way to consume items in your change feed is to enumerate over the cursor object itself as shown below:
var cursor = await query.RunChangesAsync(conn);
foreach (var item in cursor){
Console.WriteLine(item);
}
When the cursor runs out of items, it will make a request to the RethinkDB server for more items. Keep in mind that each iteration of the foreach loop can potentially be a blocking call. For example, the foreach loop can block indefinitely when 1) there are no items on the client side to be consumed (.BufferedItems.Count == 0) and 2) no items have changed on the server side according to your change feed query criteria. Under these circumstances, the foreach loop will block until the RethinkDB server sends you an item that is ready to be consumed.
Documentation about using Reactive Extensions and RethinkDB in C#
There is a driver unit test that shows how .NET Reactive Extensions can work here.
Specifically, Lines 31 - 47 in this unit test set up a change feed with Reactive Extensions:
var changes = R.Db(DbName).Table(TableName)
//.changes()[new {include_states = true, include_initial = true}]
.Changes()
.RunChanges<JObject>(conn);
changes.IsFeed.Should().BeTrue();
var observable = changes.ToObservable();
//use a new thread if you want to continue,
//otherwise, subscription will block.
observable.SubscribeOn(NewThreadScheduler.Default)
.Subscribe(
x => OnNext(x),
e => OnError(e),
() => OnCompleted()
);
Additionally, here is a good example and explanation of what happens and how to consume a change feed with C#:
Hope that helps.
Thanks,
Brian
If you have an operation that has the signature Task<int> ReadAsync(), then the way to set up polling, is like this:
IObservable<int> PollRead(TimeSpan interval)
{
return
Observable
.Interval(interval)
.SelectMany(n => Observable.FromAsync(() => ReadAsync()));
}
I'd also caution against creating your own implementation of IObserver<T>; it's fraught with danger. You should use Observer.Create(...) if you are creating your own observer that you want to hand around. Generally you don't even do that.
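For reference, a minimal sketch of Observer.Create, assuming the System.Reactive package. ObserverCreateDemo, Run and the Observable.Range source are made up for illustration; in the question the source would be the change feed observable instead.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive;
using System.Reactive.Linq;

// Hypothetical demo: builds an observer from three delegates instead of a hand-written class.
public static class ObserverCreateDemo
{
    public static List<string> Run()
    {
        var log = new List<string>();

        IObserver<int> observer = Observer.Create<int>(
            value => log.Add($"next {value}"),             // OnNext
            error => log.Add($"error {error.Message}"),    // OnError
            () => log.Add("completed"));                   // OnCompleted

        // Any IObservable<int> would do here; Range is just a small finite source.
        Observable.Range(1, 3).Subscribe(observer);
        return log;
    }
}
```

The observer returned by Observer.Create follows the observer contract for you, for example it stops forwarding notifications after OnError or OnCompleted, which a hand-written class has to get right on its own.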
I've got data access layer that has two types of method GetLatestX and GetX. GetLatestX looks something like this:
public IElementType GetLatestElementType(Guid id)
{
IElementType record = null;
using (DatabaseSession session = CreateSession())
{
record = session.Connection.Get<ElementTypeRecord>(id);
}
return record;
}
That's reasonably easy to unit test.
However, GetX wraps this GetLatest in a RefCount observable and emits new values in response to a messaging system. Testing this method is a lot more complex. I want to check the following, rather complex behavior:
You can subscribe and it retrieves a value from the database
It starts listening for messages
Subscribing again doesn't result in a repeated database call
When the mock message system simulates a message a new database access is called and the subscriptions get the new versions. Only one additional database call is used.
Unsubscribing the second subscription doesn't result in the system stopping listening to messages.
Unsubscribing to the first subscription results in disposal of resources, and unsubscription from the messages.
So, I've got all this in a single unit test, which is hideous. However, I'm not sure how I could break it up. I could only test 1, but to test 2 I'd have to go through 1 and 2, for 3 I'd still have to go through steps 1, 2, 3 etc. So I'd just be copying the same giant test, but having Asserts in different places each time.
And the code I'm testing in this method:
public IObservable<IElement> GetElement(Guid id)
{
return CreateObservableFor(GetLatestElement(id), GetLatestElement);
}
It's a single line, half of which has been tested earlier. The other half is private:
private IObservable<T> CreateObservableFor<T>(T entity, Func<Guid, T> getLatest)
{
Guid id = (entity as ConfigurationEntity).ID;
//return subject;
return Observable.Create<T>(observer =>
{
// publish first value
observer.OnNext(entity);
// listen to internal or external update notifications from messages
Action<ConfigurationMessage> callback = (message) =>
{
// TODO, check time-stamp is after previous?
// use callback to get latest value
observer.OnNext(getLatest(id));
};
messageService.SubscribeToConfiguration(id.ToString(), callback);
// when the subscription is disposed, stop listening to messages
return Disposable.Create(() =>
{
Console.WriteLine("Unsubscribing from topic " + id);
messageService.UnsubscribeToConfiguration(id.ToString(), callback);
});
}).Publish().RefCount();
}
But it behaves the same way for all the GetX methods.
My first thought is I should split the GetLatestX into an interface I can test separately then mock - but that seems to split the data access class into two for no good reason other than unit tests. They don't really conceptually belong as separate units in my mind. Is there another way of 'mocking' this dependency within a class? Or should I just split them up for the sake of testing?
In the same vein, testing the functionality of GetX is effectively repeatedly testing the logic of CreateObservableFor. I see why I should be testing each API method rather than what is really the internals of the API (in case something changes), but it seems so... inefficient.
How can I structure this unit test in a better way?
Example test:
[Test]
public void GetElementTypeTest()
{
// test data
var id = Guid.NewGuid();
var nameA = "TestNameA";
var nameB = "TestNameB";
// mock database
var columnNames = new[] { "ID", "Name" };
// data values A will be the first set of data returned, and after configuration update B will be returned
var dataValuesA = new List<object[]>();
dataValuesA.Add(new object[] { id, nameA });
var dataValuesB = new List<object[]>();
dataValuesB.Add(new object[] { id, nameB });
mockDbProviderFactory = new MockDbProviderFactory()
.AddDatareaderCommand(columnNames, dataValuesA)
.AddDatareaderCommand(columnNames, dataValuesB);
// test method
IEMF emf = new EMF(mockMessageService.Object, new MockHistorian(), mockDbProviderFactory.Object, "");
var resultObservable = emf.GetElementType(id);
// check subscription to config changes has not occurred and the database has been accessed exactly once
mockDbProviderFactory.Verify(f => f.CreateConnection(), Times.Once);
mockMessageService.Verify(ms => ms.SubscribeToConfiguration(It.IsAny<string>(), It.IsAny<Action<ConfigurationMessage>>()), Times.Never);
//subscribe to observable
int sub1Count = 0;
var subscription = resultObservable.Subscribe(result => {
sub1Count++;
// check result
Assert.AreEqual(new ElementTypeRecord(id, (sub1Count == 1 ? nameA : nameB)), result, "Result from EMF does not match data");
});
// check subscribed to config changes and subscription called
Assert.IsTrue(sub1Count == 1, "Subscription not called");
mockMessageService.Verify(ms => ms.SubscribeToConfiguration(It.IsAny<string>(), It.IsAny<Action<ConfigurationMessage>>()), Times.Once);
// check we've subscribed with our id
Assert.AreEqual(this.configCallbacks[0].Item1, id.ToString(), "Unexpected message system subscription topic");
// open a second, short term subscription and ensure that the system does not re-subscribe to updates, or read the data again
int sub2Count = 0;
resultObservable.Take(1).Subscribe(result => {
sub2Count++;
// check result (should be second data item)
Assert.AreEqual(new ElementTypeRecord(id, nameB), result, "Result from EMF does not match data");
});
// check subscribed to config changes has not changed
mockMessageService.Verify(ms => ms.SubscribeToConfiguration(It.IsAny<string>(), It.IsAny<Action<ConfigurationMessage>>()), Times.Once);
//emit a new value by simulating a configuration change message
this.configCallbacks[0].Item2(new ConfigurationMessage(DateTime.Now));
// check subscriptions called
Assert.IsTrue(sub1Count == 2, "Subscription not called");
Assert.IsTrue(sub2Count == 1, "Subscription not called");
// unsubscribe
mockMessageService.Verify(ms => ms.UnsubscribeToConfiguration(It.IsAny<string>(), It.IsAny<Action<ConfigurationMessage>>()), Times.Never);
subscription.Dispose();
// verify subscription removed
mockMessageService.Verify(ms => ms.UnsubscribeToConfiguration(It.IsAny<string>(), It.IsAny<Action<ConfigurationMessage>>()), Times.Once);
Assert.IsTrue(this.configCallbacks.Count == 0, "Unexpected message system unsubscription topic");
// validate that the connection, command and reader were used correctly
mockDbProviderFactory.Verify(f => f.CreateConnection(), Times.Exactly(2));
mockDbProviderFactory.MockConnection.Verify(c => c.Open(), Times.Exactly(2));
mockDbProviderFactory.MockConnection.Verify(c => c.Close(), Times.Exactly(2));
//first data call
mockDbProviderFactory.MockCommands[0].Verify(c => c.PublicExecuteDbDataReader(It.IsAny<CommandBehavior>()), Times.Once);
mockDbProviderFactory.MockCommands[0].MockDatareader.Verify(dr => dr.Read(), Times.Exactly(2));
//second data call
mockDbProviderFactory.MockCommands[1].Verify(c => c.PublicExecuteDbDataReader(It.IsAny<CommandBehavior>()), Times.Once);
mockDbProviderFactory.MockCommands[1].MockDatareader.Verify(dr => dr.Read(), Times.Exactly(2));
}
I've been trying to create an observable which streams a state-of-the-world (snapshot) from a repository cache, followed by live updates from a separate feed. The catch is that the snapshot call is blocking, so the updates have to be buffered during that time.
This is what I've come up with, a little simplified. The GetStream() method is the one I'm concerned with. I'm wondering whether there is a more elegant solution. Assume GetDataFeed() pulses updates to the cache all day long.
private static readonly IConnectableObservable<long> _updateStream;
public static Constructor()
{
_updateStream = GetDataFeed().Publish();
_updateStream.Connect();
}
static void Main(string[] args)
{
_updateStream.Subscribe(Console.WriteLine);
Console.ReadLine();
GetStream().Subscribe(l => Console.WriteLine("Stream: " + l));
Console.ReadLine();
}
public static IObservable<long> GetStream()
{
return Observable.Create<long>(observer =>
{
var bufferedStream = new ReplaySubject<long>();
_updateStream.Subscribe(bufferedStream);
var data = GetSnapshot();
// This returns the ticks from GetSnapshot
// followed by the buffered ticks from _updateStream
// followed by any subsequent ticks from _updateStream
data.ToObservable().Concat(bufferedStream).Subscribe(observer);
return Disposable.Empty;
});
}
private static IObservable<long> GetDataFeed()
{
var feed = Observable.Interval(TimeSpan.FromSeconds(1));
return Observable.Create<long>(observer =>
{
feed.Subscribe(observer);
return Disposable.Empty;
});
}
Popular opinion opposes Subjects as they are not 'functional', but I can't find a way of doing this without a ReplaySubject. The Replay filter on a hot observable wouldn't work because it would replay everything (potentially a whole day's worth of stale updates).
I'm also concerned about race conditions. Is there a way to guarantee sequencing of some sort, should an earlier update be buffered before the snapshot? Can the whole thing be done more safely and elegantly with other RX operators?
Thanks.
-Will
Whether you use a ReplaySubject or the Replay function really makes no difference. Replay uses a ReplaySubject under the hood. I'll also note that you are leaking subscriptions: every Observable.Create callback in your code returns Disposable.Empty, so the inner subscriptions are never disposed. Also, you put no limit on the size of the replay buffer. If you watch the observable all day long, that replay buffer will keep growing and growing. You should put a limit on it to prevent that.
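To illustrate the leak with the smallest piece of your code: GetDataFeed subscribes to the interval but hands back Disposable.Empty, so disposing the outer subscription never stops the inner one. A minimal sketch of one way to avoid that, returning the inner subscription directly (the class name Feed is only a placeholder for this example):

```csharp
using System;
using System.Reactive.Linq;

public static class Feed
{
    public static IObservable<long> GetDataFeed()
    {
        var feed = Observable.Interval(TimeSpan.FromSeconds(1));
        // Return the subscription itself instead of Disposable.Empty,
        // so that disposing the outer subscription also stops the inner one.
        return Observable.Create<long>(observer => feed.Subscribe(observer));
    }
}
```

The same idea applies to GetStream: whatever you subscribe to inside Observable.Create should be reachable from the disposable you return.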
Here is an updated version of GetStream. In this version I take the simplistic approach of just limiting the Replay to the most recent 1 minute of data. This assumes that GetSnapshot will always complete and the observer will observe the results within that 1 minute. Your mileage may vary and you can probably improve upon this scheme. But at least this way, when you have watched the observable all day long, that buffer will not have grown unbounded and will still only contain a minute's worth of updates.
public static IObservable<long> GetStream()
{
return Observable.Create<long>(observer =>
{
var updateStreamSubscription = new SingleAssignmentDisposable();
var sequenceDisposable = new SingleAssignmentDisposable();
var subscriptions = new CompositeDisposable(updateStreamSubscription, sequenceDisposable);
// start buffering the updates
var bufferedStream = _updateStream.Replay(TimeSpan.FromMinutes(1));
updateStreamSubscription.Disposable = bufferedStream.Connect();
// now retrieve the initial snapshot data
var data = GetSnapshot();
// subscribe to the snapshot followed by the buffered data
sequenceDisposable.Disposable = data.ToObservable().Concat(bufferedStream).Subscribe(observer);
// return the composite disposable which will unsubscribe when the observer wishes
return subscriptions;
});
}
As for your questions about race conditions and filtering out "old" updates: if your snapshot data includes some sort of version information, and your update stream also provides version information, then you can take the latest version returned by your snapshot query and filter the buffered stream to ignore values for older versions. Here is a rough example:
public static IObservable<long> GetStream()
{
return Observable.Create<long>(observer =>
{
var updateStreamSubscription = new SingleAssignmentDisposable();
var sequenceDisposable = new SingleAssignmentDisposable();
var subscriptions = new CompositeDisposable(updateStreamSubscription, sequenceDisposable);
// start buffering the updates
var bufferedStream = _updateStream.Replay(TimeSpan.FromMinutes(1));
updateStreamSubscription.Disposable = bufferedStream.Connect();
// now retrieve the initial snapshot data
var data = GetSnapshot();
var snapshotVersion = data.Length > 0 ? data[data.Length - 1].Version : 0;
var filteredUpdates = bufferedStream.Where(update => update.Version > snapshotVersion);
// subscribe to the snapshot followed by the buffered data
sequenceDisposable.Disposable = data.ToObservable().Concat(filteredUpdates).Subscribe(observer);
// return the composite disposable which will unsubscribe when the observer wishes
return subscriptions;
});
}
I have successfully used this pattern when merging live updates with a stored snapshot. I haven't yet found an elegant Rx operator that already does this without any race conditions. But the above method could probably be turned into such. :)
Edit: Note I have left out error handling in the examples above. In theory the call to GetSnapshot could fail and you'd leak the subscription to the update stream. I suggest wrapping everything after the CompositeDisposable declaration in a try/catch block, and in the catch handler, calling subscriptions.Dispose() before re-throwing the exception.
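A rough sketch of what that try/catch could look like, based on the first GetStream example. To keep it self-contained, the update stream and the snapshot call are passed in as parameters here (updates and getSnapshot are placeholder names standing in for _updateStream and GetSnapshot), and the class name SnapshotStream is made up for the example:

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Disposables;
using System.Reactive.Linq;

public static class SnapshotStream
{
    // Assumption: "updates" is the hot update feed and "getSnapshot" is the blocking snapshot call.
    public static IObservable<long> GetStream(IObservable<long> updates, Func<IEnumerable<long>> getSnapshot)
    {
        return Observable.Create<long>(observer =>
        {
            var updateStreamSubscription = new SingleAssignmentDisposable();
            var sequenceDisposable = new SingleAssignmentDisposable();
            var subscriptions = new CompositeDisposable(updateStreamSubscription, sequenceDisposable);
            try
            {
                // start buffering the updates
                var bufferedStream = updates.Replay(TimeSpan.FromMinutes(1));
                updateStreamSubscription.Disposable = bufferedStream.Connect();
                // now retrieve the initial snapshot data (this is the call that may throw)
                var data = getSnapshot();
                // subscribe to the snapshot followed by the buffered data
                sequenceDisposable.Disposable = data.ToObservable().Concat(bufferedStream).Subscribe(observer);
            }
            catch
            {
                // release the connection to the update stream before propagating the failure
                subscriptions.Dispose();
                throw;
            }
            return subscriptions;
        });
    }
}
```

If you would rather surface the failure through the stream than re-throw it, calling observer.OnError in the catch handler after disposing is another option.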