ReactiveX: Make Observable.Create() be called only once - c#

I'm trying to build a data access layer with ReactiveX (more precisely, Rx.Net) and SQLite.Net.
Part of the job is making an observable that returns the database connection, so that it can be opened lazily, only when needed. This is what I came up with so far:
var connection = Observable.Create<SQLiteConnection>(observer =>
{
    Debug.WriteLine("CheckInStore: Opening database connection");
    var database = new SQLiteConnection(configuration.ConnectionString.DatabasePath);
    observer.OnNext(database);
    observer.OnCompleted();
    return Disposable.Create(() =>
    {
        Debug.WriteLine("CheckInStore: Closing database connection");
        database.Close();
    });
});
// Further down the line, a query would look like this:
var objects = connection.SelectMany(db => db.Query<MyTable>("select * from MyTable"));
Unfortunately, every time somebody subscribes to this observable, a new connection is created. And it is also closed once the subscription is disposed.
I tried using .Replay(1).RefCount(), but it didn't change anything. I'm not sure I understand that whole RefCount thing anyway.
How can I make this database connection a singleton?

Have a look at this code, which is equivalent, but doesn't open a DB connection:
var conn = Observable.Create<int>(o =>
{
    Debug.WriteLine("Opening");
    o.OnNext(1);
    o.OnCompleted(); // This forces closing code to be called. Comment me out.
    return Disposable.Create(() =>
    {
        Debug.WriteLine("Closing");
    });
})
//.Replay(1)
//.RefCount() // .Replay(1).RefCount() is necessary if you want to cache the result
;
var sub1 = conn.SelectMany(i => Observable.Return(i)).Subscribe(i => Debug.WriteLine($"1: {i}"));
var sub2 = conn.SelectMany(i => Observable.Return(i)).Subscribe(i => Debug.WriteLine($"2: {i}"));
sub1.Dispose();
sub2.Dispose();
var sub3 = conn.SelectMany(i => Observable.Return(i)).Subscribe(i => Debug.WriteLine($"3: {i}"));
sub3.Dispose();
There are a number of problems here:
1. Your dispose/unsubscription code will get called every time you either unsubscribe or complete the observable. Since you're calling OnCompleted, the connection is going to be opened and closed on every subscription.
2. If you want to re-use the same connection, you need to use .Replay(1).RefCount(). Observable.Create runs the whole function every time a subscriber connects; there's nothing (except .Replay(1).RefCount()) that caches it for you.
3. Even if you add .Replay(1).RefCount() and remove OnCompleted, you will still get disposal (meaning DB-closed) behavior once there are no outstanding subscriptions (like after the sub2.Dispose() call).
4. If you don't dispose the subscriptions, either through using(var sub = connection.SelectMany(...)) or explicitly via sub.Dispose(), you'll never unsubscribe, since this Observable has no way of terminating. In other words, the opposite of problem 3: your Close code will never run.
I hope you get the picture: this is a pretty error-prone way of doing things. I would recommend a simple iterative call, since that tends to work better for DB calls anyway. If you insist on Rx, I would look at Observable.Using for your DB connection initialization; a sketch follows below.
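For illustration, here is a minimal sketch of the Observable.Using approach, assuming SQLiteConnection implements IDisposable and MyTable is the mapped row type; the connection's lifetime is then tied to each query's subscription:
// Minimal sketch (assumptions: SQLiteConnection implements IDisposable,
// MyTable is the mapped row type). Each subscription opens a connection,
// runs the query, and disposes the connection when it completes.
var objects = Observable.Using(
    () => new SQLiteConnection(configuration.ConnectionString.DatabasePath),
    db => db.Query<MyTable>("select * from MyTable").ToObservable());
Whether the connection outlives a single query then becomes an explicit design choice rather than an accident of subscription timing.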

Related

Ensure Polly Policy Runs at Least Once

So, I'm writing some retry logic for acquiring a lock using Polly. The overall timeout value will be provided by the API caller. I know I can wrap a policy in an overall timeout. However, if the supplied timeout value is too low, is there a way I can ensure that the policy is executed at least once?
Obviously I could call the delegate separately before the policy is executed, but I was just wondering if there was a way to express this requirement in Polly.
var result = Policy.Timeout(timeoutFromApiCaller)
    .Wrap(Policy.HandleResult(false)
        .WaitAndRetryForever(_ => TimeSpan.FromMilliseconds(500)))
    .Execute(() => this.TryEnterLock());
If timeoutFromApiCaller is, say, 1 tick and there's a good chance it takes longer than that to reach the timeout policy, then the delegate wouldn't get called (the policy would time out and throw TimeoutRejectedException).
What I'd like to happen can be expressed as:
var result = this.TryEnterLock();
if (!result)
{
    result = Policy.Timeout(timeoutFromApiCaller)
        .Wrap(Policy.HandleResult(false)
            .WaitAndRetryForever(_ => TimeSpan.FromMilliseconds(500)))
        .Execute(() => this.TryEnterLock());
}
But it'd be really nice if it could be expressed in pure-Polly...
To be honest, I don't understand what 1 tick means in your case. Is it a nanosecond or greater than that? Your global timeout should be greater than your local timeout.
But as far as I can see you have not specified a local one. TryEnterLock should receive a TimeSpan so that it does not block the caller indefinitely. If you look at the built-in sync primitives, most of them provide such a capability: Monitor.TryEnter, SpinLock.TryEnter, WaitHandle.WaitOne, etc.
So, just to wrap it up:
var timeoutPolicy = Policy.Timeout(TimeSpan.FromMilliseconds(1000));
var retryPolicy = Policy.HandleResult(false)
.WaitAndRetryForever(_ => TimeSpan.FromMilliseconds(500));
var resilientStrategy = Policy.Wrap(timeoutPolicy, retryPolicy);
var result = resilientStrategy.Execute(() => this.TryEnterLock(TimeSpan.FromMilliseconds(100)));
The timeout and delay values should be adjusted to your business needs. I highly encourage you to log when the global Timeout fires (onTimeout / onTimeoutAsync) and when the retries fire (onRetry / onRetryAsync), so you can fine-tune / calibrate these values.
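For example, a rough sketch of wiring up those hooks (the exact overload shapes may vary slightly between Polly versions):
// Sketch only: log when the global timeout elapses and before each retry.
var timeoutPolicy = Policy.Timeout(
    TimeSpan.FromMilliseconds(1000),
    onTimeout: (context, timespan, task) =>
        Console.WriteLine("Global timeout of " + timespan + " elapsed."));

var retryPolicy = Policy.HandleResult(false)
    .WaitAndRetryForever(
        _ => TimeSpan.FromMilliseconds(500),
        onRetry: (outcome, delay) =>
            Console.WriteLine("Lock not acquired, retrying in " + delay + "."));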
EDIT: Based on the comments on this post
As it turned out, there is no control over timeoutFromApiCaller, so it can be arbitrarily small. (In the given example it is just a few nanoseconds, with the intent to emphasize the problem.) So, in order to have an at-least-one-call guarantee, we have to make use of the Fallback policy.
Instead of manually calling TryEnterLock upfront, outside the policies, we should call it as the last action to satisfy the requirement. Policies escalate: whenever an inner policy fails, it delegates the problem to the next outer policy.
So, if the provided timeout is so tiny that the action cannot finish within that period, a TimeoutRejectedException will be thrown. With the Fallback we can handle that, and the action can be performed again, this time without any timeout constraint. This gives us the desired at-least-once guarantee.
var atLeastOnce = Policy.Handle<TimeoutRejectedException>()
    .Fallback((ct) => this.TryEnterLock());
var globalTimeout = Policy.Timeout(TimeSpan.FromMilliseconds(1000));
var foreverRetry = Policy.HandleResult(false)
    .WaitAndRetryForever(_ => TimeSpan.FromMilliseconds(500));
var resilientStrategy = Policy.Wrap(atLeastOnce, globalTimeout, foreverRetry);
var result = resilientStrategy.Execute(() => this.TryEnterLock());

Reactive Extensions error handling with Observable SelectMany

I'm trying to write a file watcher for a certain folder using the Reactive Extensions library.
The idea is to monitor a hard drive folder for new files, wait until a file is written completely, and push an event to the subscriber. I do not want to use FileSystemWatcher since it raises the Changed event twice for the same file.
So I've written it in the "reactive way" (I hope) like below:
var provider = new MessageProviderFake();
var source = Observable
    .Interval(TimeSpan.FromSeconds(2), NewThreadScheduler.Default)
    .SelectMany(_ => provider.GetFiles());
using (source.Subscribe(_ => Console.WriteLine(_.Name), () => Console.WriteLine("completed to Console")))
{
    Console.WriteLine("press Enter to stop");
    Console.ReadLine();
}
However, I can't find a "reactive way" to handle errors. For example, the file directory can be located on an external drive and become unavailable because of a connection problem.
So I've added GetFilesSafe, which handles exceptions before they reach the Reactive Extensions pipeline:
static IEnumerable<MessageArg> GetFilesSafe(IMessageProvider provider)
{
    try
    {
        return provider.GetFiles();
    }
    catch (Exception e)
    {
        Console.WriteLine(e.Message);
        return new MessageArg[0];
    }
}
and used it like
var source = Observable.Interval(TimeSpan.FromSeconds(2), NewThreadScheduler.Default).SelectMany(_ => GetFilesSafe(provider));
Is there a better way to make SelectMany keep calling provider.GetFiles() even after an exception has been raised? In such cases I use an error counter to repeat the reading operation N times and then fail (terminate the process).
Is there a "try N times and wait Q seconds between attempts" in the Reactive Extensions?
There is also a problem with GetFilesSafe: it returns IEnumerable<MessageArg> for lazy reading, so it can still throw during iteration, and the exception will surface somewhere inside the SelectMany.
There's a Retry extension that just subscribes to the observable again if the current one errors, but it sounds like that won't offer the flexibility you want.
You could build something using Catch, which subscribes to the observable you give it if an error occurs on the outer one. Something like the following (untested):
IObservable<Thing> GetFilesObs(int times, bool delay) {
    return Observable
        .Return(0)
        .Delay(TimeSpan.FromSeconds(delay ? <delay_time> : 0))
        .SelectMany(_ => Observable.Defer(() => GetFilesErroringObservable()))
        .Catch(Observable.Defer(() => GetFilesObs(times - 1, true)));
}
// call with:
GetFilesObs(<number_of_tries>, false);
As written, this doesn't do anything with the errors other than trigger a retry. In particular, when enough errors have happened, it will just complete without an error, which might not be what you want.
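If you do want the failure to surface once the attempts are exhausted, a hedged variation (untested, using the same placeholders as above) is to guard the recursion and rethrow the last error:
IObservable<Thing> GetFilesObs(int times, bool delay) {
    return Observable
        .Return(0)
        .Delay(TimeSpan.FromSeconds(delay ? <delay_time> : 0))
        .SelectMany(_ => Observable.Defer(() => GetFilesErroringObservable()))
        .Catch((Exception ex) => times <= 1
            ? Observable.Throw<Thing>(ex) // attempts exhausted: propagate the error
            : Observable.Defer(() => GetFilesObs(times - 1, true)));
}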

AWS X-Ray shows absurdly long invoke time for P2P lambda call

So I have a lambda that makes a point-to-point call to another lambda. We have AWS X-Ray set up so we can monitor performance. However, X-Ray shows this odd result where, even though the invocation itself takes only a second, the "invoke" call from the original lambda takes a minute and a half.
This makes no sense, since we are calling the lambda as an event (ack and forget) and using an async call on which we do not await. It really causes problems because, even though all lambdas successfully complete and do their work (as we can see from CloudWatch logs and the resulting data in our data store), occasionally that secondary lambda call takes so long that X-Ray times out, which bombs the whole rest of the trace.
Other notes:
We have Active tracing enabled on both lambdas
We do occasionally have cold start times, but as you can see from the screenshot, there is no "initialization" step here, so both lambdas are warm
This particular example was a singular action with no other activity in the system, so it's not like there was a bottleneck due to high load
Does anyone have an explanation for this, and hopefully what we can do to fix it?
Our invocation code (simplified):
var assetIds = new List<Guid> { Guid.NewGuid() };
var request = new AddBulkAssetHistoryRequest();
request.AssetIds = assetIds.ToList();
request.EventType = AssetHistoryEventTypeConstants.AssetDownloaded;
request.UserId = tokenUserId.Value;
var invokeRequest = new InvokeRequest
{
    FunctionName = "devkarl02-BulkAddAssetHistory",
    InvocationType = InvocationType.Event,
    Payload = JsonConvert.SerializeObject(request)
};
var region = RegionEndpoint.GetBySystemName("us-east-1");
var lambdaClient = new AmazonLambdaClient(region);
_ = lambdaClient.InvokeAsync(invokeRequest);
This is also posted over in the AWS Forums (for whatever that is worth): https://forums.aws.amazon.com/thread.jspa?threadID=307615
So it turns out the issue was that we weren't using the await operator. For some reason, that made the calls interminably slow. Making this small change:
_ = await lambdaClient.InvokeAsync(invokeRequest);
made everything else behave properly, both in the logs and in X-Ray. Not sure why, but hey, it solved the issue.
As far as I understand, not adding the await causes the call to execute synchronously, while adding the await causes the call to happen asynchronously.
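For completeness, a minimal illustrative sketch (hypothetical method name) of the shape the calling code takes once await is involved; the enclosing method must be async and return a Task:
// Illustrative only: awaiting requires an async method.
public async Task NotifyHistoryAsync(InvokeRequest invokeRequest)
{
    var region = RegionEndpoint.GetBySystemName("us-east-1");
    using (var lambdaClient = new AmazonLambdaClient(region))
    {
        // InvocationType.Event still returns quickly; awaiting just ensures
        // the request has actually been sent before the handler moves on.
        await lambdaClient.InvokeAsync(invokeRequest);
    }
}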

Reading from Stream using Observable through FromAsyncPattern, how to close/cancel properly

Need: long-running program with TCP connections
A C# 4.0 (VS2010, XP) program needs to connect to a host using TCP, send and receive bytes, sometimes close the connection properly, and reopen it later. Surrounding code is written in the Rx.Net Observable style. The volume of data is low, but the program should run continuously (avoiding memory leaks by taking care to properly dispose of resources).
The text below is long because I explain what I searched and found. It now appears to work.
Overall questions are: since Rx is sometimes unintuitive, are the solutions good? Will it be reliable (say, could it run for years without trouble)?
Solution so far
Send
The program obtains a NetworkStream like this:
TcpClient tcpClient = new TcpClient();
LingerOption lingerOption = new LingerOption(false, 0); // Make sure that on call to Close(), connection is closed immediately even if some data is pending.
tcpClient.LingerState = lingerOption;
tcpClient.Connect(remoteHostPort);
return tcpClient.GetStream();
Asynchronous sending is easy enough. Rx.Net handles this with much shorter and cleaner code than traditional solutions. I created a dedicated thread with an EventLoopScheduler. The operations needing a send are expressed using IObservable. Using ObserveOn(sendRecvThreadScheduler) guarantees that all send operations are done on that thread.
sendRecvThreadScheduler = new EventLoopScheduler(
    ts =>
    {
        var thread = new System.Threading.Thread(ts) { Name = "my send+receive thread", IsBackground = true };
        return thread;
    });
// Loop code for sending not shown (too long and off-topic).
So far this is excellent and flawless.
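The send loop itself is not shown above; purely as a hedged illustration (hypothetical names and shapes), marshalling outgoing packets onto that thread might look something like this:
// Hedged sketch only: outgoingPackets and stream are hypothetical.
IDisposable WireUpSender(IObservable<byte[]> outgoingPackets, NetworkStream stream)
{
    return outgoingPackets
        .ObserveOn(sendRecvThreadScheduler) // all writes happen on the dedicated thread
        .Subscribe(bytes => stream.Write(bytes, 0, bytes.Length));
}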
Receive
It seems that to receive data, Rx.Net should also allow shorter and cleaner code than traditional solutions.
After reading several resources (e.g. http://www.introtorx.com/) and Stack Overflow, it seems that a very simple solution is to bridge the Asynchronous Programming Model to Rx.Net as in https://stackoverflow.com/a/14464068/1429390:
public static class Ext
{
    public static IObservable<byte[]> ReadObservable(this Stream stream, int bufferSize)
    {
        // to hold read data
        var buffer = new byte[bufferSize];

        // Step 1: async signature => observable factory
        var asyncRead = Observable.FromAsyncPattern<byte[], int, int, int>(
            stream.BeginRead,
            stream.EndRead);

        return Observable.While(
            // while there is data to be read
            () => stream.CanRead,
            // iteratively invoke the observable factory, which will
            // "recreate" it such that it will start from the current
            // stream position - hence "0" for offset
            Observable.Defer(() => asyncRead(buffer, 0, bufferSize))
                .Select(readBytes => buffer.Take(readBytes).ToArray()));
    }
}
It mostly works. I can send and receive bytes.
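As a point of reference, a hedged usage sketch of the extension (variable names are illustrative):
// Usage sketch: push incoming chunks to a handler.
IDisposable readSubscription = networkStream
    .ReadObservable(4096)
    .Subscribe(chunk => Debug.WriteLine("Received " + chunk.Length + " bytes"));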
Close time
This is when things start to go wrong.
Sometimes I need to close the stream and keep things clean. Basically this means: stop reading, end the byte-receiving observable, and later open a new connection with a new stream.
For one thing, when the connection is forcibly closed by the remote host, BeginRead()/EndRead() immediately loop, consuming all CPU while returning zero bytes. I let higher-level code notice this (with a Subscribe() on the ReadObservable in a context where high-level elements are available) and clean up (including closing and disposing of the stream). This works well too, and I take care of disposing of the object returned by Subscribe().
someobject.readOneStreamObservableSubscription = myobject.readOneStreamObservable.Subscribe(buf =>
{
    if (buf.Length == 0)
    {
        MyLoggerLog("Read explicitly returned zero bytes. Closing stream.");
        this.pscDestroyIfAny();
    }
});
Sometimes, I just need to close the stream. But apparently this must cause exceptions to be thrown in the asynchronous read (see "Proper way to prematurely abort BeginRead and BeginWrite?" on Stack Overflow).
I added a CancellationToken that causes Observable.While() to end the sequence. This does not help much to avoid these exceptions since BeginRead() can sleep for a long time.
An unhandled exception in the observable caused the program to exit. Searching turned up "Continue using subscription after exception" on Stack Overflow, which suggested adding a Catch that effectively resumes the broken Observable with an empty one.
Code looks like this:
public static IObservable<byte[]> ReadObservable(this Stream stream, int bufferSize, CancellationToken token)
{
    // to hold read data
    var buffer = new byte[bufferSize];

    // Step 1: async signature => observable factory
    var asyncRead = Observable.FromAsyncPattern<byte[], int, int, int>(
        stream.BeginRead,
        stream.EndRead);

    return Observable.While(
        // while there is data to be read
        () =>
        {
            return (!token.IsCancellationRequested) && stream.CanRead;
        },
        // iteratively invoke the observable factory, which will
        // "recreate" it such that it will start from the current
        // stream position - hence "0" for offset
        Observable.Defer(() =>
        {
            if ((!token.IsCancellationRequested) && stream.CanRead)
            {
                return asyncRead(buffer, 0, bufferSize);
            }
            else
            {
                return Observable.Empty<int>();
            }
        })
        .Catch(Observable.Empty<int>()) // When BeginRead() or EndRead() causes an exception, don't choke but just end the Observable.
        .Select(readBytes => buffer.Take(readBytes).ToArray()));
}
What now? Question
This appears to work well. Conditions where remote host forcibly closed the connection or is just no longer reachable are detected, causing higher level code to close the connection and retry. So far so good.
I'm unsure if things feel quite right.
For one thing, that line:
.Catch(Observable.Empty<int>()) // When BeginRead() or EndRead() causes an exception, don't choke but just end the Observable.
feels like the bad practice of an empty catch block in imperative code. The actual code does log the exception, and higher-level code detects the absence of a reply and handles it correctly, so should it be considered fairly okay (see below)?
.Catch((Func<Exception, IObservable<int>>)(ex =>
{
    MyLoggerLogException("On asynchronous read from network.", ex);
    return Observable.Empty<int>();
})) // When BeginRead() or EndRead() causes an exception, don't choke but just end the Observable.
Also, this is indeed shorter than most traditional solutions.
Are the solutions correct or did I miss some simpler/cleaner ways?
Are there some dreadful problems that would look obvious to wizards of Reactive Extensions?
Thank you for your attention.

ISubscriber, .Subscribe() and .Unsubscribe() scope

I'm trying to understand the scoping of SE.Redis objects, specifically in the area of subscribing to (and unsubscribing from) notifications.
I'd like to do something like the following to wait for a remote node to indicate it has changed/freed a resource (a very dumb distributed semaphore):
var t = new TaskCompletionSource<bool>();
sub.Subscribe(key, (c, v) =>
{
    t.TrySetResult(true);
    sub.UnsubscribeAll();
});
return t.Task;
I am fairly certain that is wrong :) and in a multithreaded environment with the ConnectionMultiplexer shared I'm probably going to end up in a race where one thread subscribes to a particular RedisChannel whilst another unsubscribes from it.
Is it possible to safely/efficiently implement this pattern or am I trying to simplify this problem too much and need a per process 'subscription manager' to coordinate my subscriptions?
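One narrower variation worth considering, sketched here with no claim that it resolves the race: unsubscribe only this channel and handler instead of calling UnsubscribeAll(), so other subscriptions sharing the multiplexer are left alone.
// Hedged sketch: scope the unsubscribe to this channel and handler only.
var t = new TaskCompletionSource<bool>();
Action<RedisChannel, RedisValue> handler = null;
handler = (c, v) =>
{
    t.TrySetResult(true);
    sub.Unsubscribe(key, handler); // removes only this handler on this channel
};
sub.Subscribe(key, handler);
return t.Task;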
