Reactive: Trying to understand how Subject<T> works - c#

I'm trying to understand how Subject<T>, ReplaySubject<T> and the others work. Here is an example:
(A Subject is both an observable and an observer.)
public IObservable<int> CreateObservable()
{
Subject<int> subj = new Subject<int>(); // case 1
// ReplaySubject<int> subj = new ReplaySubject<int>(); // case 2 (use instead of the line above)
Random rnd = new Random();
int maxValue = rnd.Next(20);
Trace.TraceInformation("Max value is: " + maxValue.ToString());
subj.OnNext(-1); // specific value
for(int iCounter = 0; iCounter < maxValue; iCounter++)
{
Trace.TraceInformation("Value: " + iCounter.ToString() + " is about to publish");
subj.OnNext(iCounter);
}
Trace.TraceInformation("Publish complete");
subj.OnCompleted();
return subj;
}
public void Main()
{
//
// First subscription
CreateObservable()
.Subscribe(
onNext: (x)=>{
Trace.TraceInformation("X is: " + x.ToString());
});
//
// Second subscribe
CreateObservable()
.Subscribe(
onNext: (x2)=>{
Trace.TraceInformation("X2 is: " + x2.ToString());
});
}
Case 1: The strange situation is that when I use Subject<T>, no subscription seems to be made. I never see the "X is: " text; I only see the "Value: ..." and "Max value is: ..." messages. Why does Subject<T> not push values to the subscription?
Case 2: If I use ReplaySubject<T>, I do see the values in the subscription, but I could not apply Defer to anything: not to the Subject and not to the Observable. So every subscription will receive different values, because CreateObservable behaves as a cold observable. Where is Defer?

Whenever you need to create an observable out of thin air, Observable.Create should be the first thing to think of. Subjects enter the picture in two cases:
You need some kind of "addressable endpoint" to feed data to in order for all subscribers to receive it. Compare this to a .NET event, which has both an invocation side (through delegate invocation) and a subscription side (through delegate combine with += and -= syntax). You'll find that in a lot of cases you can achieve the same effect using Observable.Create.
You need multicasting of messages in a query pipeline, effectively sharing an observable sequence by many forks in your query logic, without triggering multiple subscriptions. (Think of subscribing to your favorite magazine once for your dorm and putting a photo copier right behind the letter box. You still pay one subscription, though all of your friends can read the magazine delivered through OnNext on the letter box.)
Also, in a lot of cases, there's already a built-in primitive in Rx that does exactly what you need. For example, there are From* factory methods to bridge with existing concepts (such as events, tasks, asynchronous methods, enumerable sequences), some of which use a subject under the covers. For the second case of multicasting logic, there's the Publish, Replay, etc. family of operators.

You need to be mindful of when code is executed.
In "Case 1", when you use a Subject<T>, you'll notice that all of the calls to OnNext and OnCompleted finish before the observable is returned by the CreateObservable method. Since a Subject<T> does not replay earlier values, any subsequent subscription will have missed all of them, so you should expect to get what you got: nothing.
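To make the timing concrete, here is a minimal sketch that uses only the IObservable<T>/IObserver<T> interfaces from the base class library. SimpleSubject<T>, RecordingObserver<T> and SubjectTimingDemo are hypothetical names made up for this illustration; this is not how Rx implements Subject<T> (there is no thread safety and no handling of the completed state), it only shows that values pushed before a subscription are never seen by that subscriber.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Subject<T>: it forwards each value only to the
// observers that are subscribed at the moment OnNext is called.
public sealed class SimpleSubject<T> : IObservable<T>, IObserver<T>
{
    private readonly List<IObserver<T>> observers = new List<IObserver<T>>();

    public IDisposable Subscribe(IObserver<T> observer)
    {
        observers.Add(observer);
        return new Unsubscriber(() => observers.Remove(observer));
    }

    public void OnNext(T value)
    {
        foreach (var o in observers.ToArray()) o.OnNext(value);
    }

    public void OnError(Exception error)
    {
        foreach (var o in observers.ToArray()) o.OnError(error);
    }

    public void OnCompleted()
    {
        foreach (var o in observers.ToArray()) o.OnCompleted();
    }

    private sealed class Unsubscriber : IDisposable
    {
        private readonly Action dispose;
        public Unsubscriber(Action dispose) { this.dispose = dispose; }
        public void Dispose() { dispose(); }
    }
}

// Hypothetical observer that records every value it receives.
public sealed class RecordingObserver<T> : IObserver<T>
{
    public readonly List<T> Values = new List<T>();
    public void OnNext(T value) { Values.Add(value); }
    public void OnError(Exception error) { }
    public void OnCompleted() { }
}

public static class SubjectTimingDemo
{
    // Mirrors "Case 1": every value is pushed before anyone subscribes.
    public static int CountValuesSeenByLateSubscriber()
    {
        var subj = new SimpleSubject<int>();
        subj.OnNext(-1);
        subj.OnNext(0);
        subj.OnCompleted();

        var late = new RecordingObserver<int>();
        subj.Subscribe(late);
        return late.Values.Count;
    }

    // Subscribing first, then pushing, delivers both values.
    public static int CountValuesSeenByEarlySubscriber()
    {
        var subj = new SimpleSubject<int>();
        var early = new RecordingObserver<int>();
        subj.Subscribe(early);
        subj.OnNext(-1);
        subj.OnNext(0);
        return early.Values.Count;
    }
}
```

The only difference between the two methods is the order of Subscribe and OnNext, which is exactly the difference between the question's code and the Create-based version.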
You have to delay the operations on the subject until you have the observer subscribed. To do that, use the Create method. Here's how:
public IObservable<int> CreateObservable()
{
return Observable.Create<int>(o =>
{
var subj = new Subject<int>();
var disposable = subj.Subscribe(o);
var rnd = new Random();
var maxValue = rnd.Next(20);
subj.OnNext(-1);
for(int iCounter = 0; iCounter < maxValue; iCounter++)
{
subj.OnNext(iCounter);
}
subj.OnCompleted();
return disposable;
});
}
I've removed all the trace code for succinctness.
So now, for every subscriber, you get a new execution of the code inside the Create method and you would now get the values from the internal Subject<T>.
The use of the Create method is generally the correct way to create observables that you return from methods.
Alternatively you could use a ReplaySubject<T> and avoid the use of the Create method. However, this is unattractive for a number of reasons. It forces the computation of the entire sequence at creation time, and it gives you a cold observable that you could have produced more efficiently without using a replay subject.
Now, as an aside, you should try to avoid using subjects at all. The general rule is that if you're using a subject then you're doing something wrong. The CreateObservable method would be better written as this:
public IObservable<int> CreateObservable()
{
return Observable.Create<int>(o =>
{
var rnd = new Random();
var maxValue = rnd.Next(20);
return Observable.Range(-1, maxValue + 1).Subscribe(o);
});
}
No need for a subject at all.
Let me know if this helps.

Related

Do references to collections cause any trouble with threads?

I have something like the following code:
public class MainAppClass : BaseClass
{
public IList<Token> TokenList
{
get;
set;
}
// This is executed before any thread is created
public override void OnStart()
{
MyDataBaseContext dbcontext = new MyDataBaseContext();
this.TokenList = dbcontext.GetTokenList();
}
// After this the application will create a list of many items to be iterated
// and will create as many threads as are defined in the configuration (5 at the moment),
// then it will distribute those items among the threads for parallel processing.
// The OnProcessItem will be executed for every item and could be running on different threads
protected override void OnProcessItem(AppItem processingItem)
{
string expression = getExpressionFromItem();
expression = Utils.ReplaceTokens(processingItem, expression, this);
}
}
public class Utils
{
public static string ReplaceTokens(AppItem currentProcessingItem, string expression, MainAppClass mainAppClass)
{
Regex tokenMatchExpression = new Regex(@"\[[^+~][^$*]+?\]", RegexOptions.IgnoreCase);
Match tokenMatch = tokenMatchExpression.Match(expression);
if(tokenMatch.Success == false)
{
return expression;
}
string tokenName = tokenMatch.Value;
// This line is my principal suspect of messing in some way with the multiple threads
Token tokenDefinition = mainAppClass.TokenList.Where(x => x.Name == tokenName).First();
Regex tokenElementExpression = new Regex(tokenDefinition.Value);
MyRegexSearchResult evaluationResult = Utils.GetRegexMatches(currentProcessingItem, tokenElementExpression).FirstOrDefault();
string tokenValue = string.Empty;
if (evaluationResult != null && evaluationResult.match.Groups.Count > 1)
{
tokenValue = evaluationResult.match.Groups[1].Value;
}
else if (evaluationResult != null && evaluationResult.match.Groups.Count == 1)
{
tokenValue = evaluationResult.match.Groups[0].Value;
}
expression = expression.Replace("[" + tokenName + "]", tokenValue);
return expression;
}
}
The problem I have right now is that for some reason the value of the token replaced in the expression gets confused with one from another thread, resulting in an incorrect replacement where a different value was expected, i.e.:
Expression: Hello [Name]
Expected result for item 1: Hello Nick
Expected result for item 2: Hello Sally
Actual result for item 1: Hello Nick
Actual result for item 2: Hello Nick
The actual result is not always the same: sometimes it is the expected one, sometimes both expressions are replaced with the value expected for item 1, and sometimes both are replaced with the value expected for item 2.
I'm not able to find what's wrong with the code, as I was expecting all the variables within the static method to be in their own scope for every thread, but that doesn't seem to be the case.
Any help will be much appreciated!
Yeah, static objects only have one instance throughout the program - creating new threads doesn't create separate instances of those objects.
You've got a couple different ways of dealing with this.
Door #1. If the threads need to operate on different instances, you'll need to un-static the appropriate places. Give each thread its own instance of the object you need it to modify.
Door #2. Thread-safe objects (as mentioned by Fildor). I'll admit, I'm a bit less familiar with this door, but it's probably the right approach if you can get it to work (less complexity in code is awesome).
Door #3. Lock on the object directly. One option, when modifying the global static, is to put the modification inside a lock(myObject) { } block. Locks are pretty simple and straightforward (so much simpler than the old C/C++ days), and they ensure that concurrent modifications don't corrupt the object.
Door #4. Padlock the encapsulated class. Don't allow outside callers to modify the static variable at all. Instead, they have to call global getters/setters. Then, have a private object inside the class that serves simply as a lockable object - and have the getters/setters lock that lockable object whenever they're reading/writing it.
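As a rough illustration of Door #4, here is a minimal sketch. SharedTokens, GetSnapshot and Replace are hypothetical names, and the list holds strings rather than the question's Token type to keep the example self-contained. The shared list is private, and every read or write goes through a lock on a private object.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical holder illustrating "Door #4": outside callers never touch
// the static list directly, only through methods that take the lock.
public static class SharedTokens
{
    private static readonly object Gate = new object();
    private static List<string> tokens = new List<string>();

    // Returns a copy, so callers never iterate the shared list itself.
    public static List<string> GetSnapshot()
    {
        lock (Gate)
        {
            return new List<string>(tokens);
        }
    }

    // Swaps in a new list; the copy is built before taking the lock so the
    // lock is held only for the assignment.
    public static void Replace(IEnumerable<string> newTokens)
    {
        var copy = new List<string>(newTokens);
        lock (Gate)
        {
            tokens = copy;
        }
    }
}
```

Returning a snapshot costs an allocation per read, but it means a thread that is iterating the tokens cannot be affected by another thread replacing them.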
The tokenValue that you're replacing the token with is coming from evaluationResult.
evaluationResult is based on Utils.GetRegexMatches(currentProcessingItem, tokenElementExpression).
You might want to check GetRegexMatches to see if it's using any static resources, but my best guess is that it's being passed the same currentProcessingItem value in multiple threads.
Look at the code that splits up the AppItems. You may have an "access to modified closure" in there. For example:
for(int i = 0; i < appItems.Length; i++)
{
var thread = new Thread(() => {
// Since the variable `i` is shared across all of the
// iterations of this loop, `appItems[i]` is going to be
// based on the value of `i` at the time that this line
// of code is run, not at the time when the thread is created.
var appItem = appItems[i];
...
});
...
}
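If that turns out to be the cause, the usual fix is to copy the loop variable into a local declared inside the loop body, so that each lambda captures its own copy. A minimal sketch follows; ClosureFixDemo and ProcessAll are hypothetical names, and strings stand in for AppItem.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class ClosureFixDemo
{
    // Starts one thread per item and returns what each thread actually saw.
    public static List<string> ProcessAll(string[] appItems)
    {
        var seen = new List<string>();
        var threads = new List<Thread>();

        for (int i = 0; i < appItems.Length; i++)
        {
            // Copy the loop variable: each iteration gets its own `index`,
            // so the lambda no longer depends on the later value of `i`.
            int index = i;
            var thread = new Thread(() =>
            {
                var appItem = appItems[index];
                lock (seen) { seen.Add(appItem); }
            });
            threads.Add(thread);
            thread.Start();
        }

        foreach (var thread in threads) thread.Join();

        // Threads finish in no particular order, so sort for a stable result.
        seen.Sort(StringComparer.Ordinal);
        return seen;
    }
}
```

With the copy in place, every item is processed exactly once, regardless of when each thread actually starts running.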

Functionally pure dice rolls in C#

I am writing a dice-based game in C#. I want all of my game-logic to be pure, so I have devised a dice-roll generator like this:
public static IEnumerable<int> CreateDiceStream(int seed)
{
var random = new Random(seed);
while (true)
{
yield return random.Next(1, 7); // the upper bound is exclusive, so this yields 1..6
}
}
Now I can use this in my game logic:
var playerRolls = players.Zip(diceRolls, (player, roll) => Tuple.Create(player, roll));
The problem is that the next time I take from diceRolls I want to skip the rolls that I have already taken:
var secondPlayerRolls = players.Zip(
diceRolls.Skip(playerRolls.Count()),
(player, roll) => Tuple.Create(player, roll));
This is already quite ugly and error prone. It doesn't scale well as the code becomes more complex.
It also means that I have to be careful when using a dice roll sequence between functions:
var x = DoSomeGameLogic(diceRolls);
var nextRoll = diceRolls.Skip(x.NumberOfDiceRollsUsed).First();
Is there a good design pattern that I should be using here?
Note that it is important that my functions remain pure due to synchronisation and play-back requirements.
This question is not about correctly initializing System.Random. Please read what I have written, and leave a comment if it is unclear.
That's a very nice puzzle.
Since manipulating diceRolls's state is out of the question (otherwise, we'd have those sync and replaying issues you mentioned), we need an operation which returns both (a) the values to be consumed and (b) a new diceRolls enumerable which starts after the consumed items.
My suggestion would be to use the return value for (a) and an out parameter for (b):
static IEnumerable<int> Consume(this IEnumerable<int> rolls, int count, out IEnumerable<int> remainder)
{
remainder = rolls.Skip(count);
return rolls.Take(count);
}
Usage:
var firstRolls = diceRolls.Consume(players.Count(), out diceRolls);
var secondRolls = diceRolls.Consume(players.Count(), out diceRolls);
DoSomeGameLogic would use Consume internally and return the remaining rolls. Thus, it would need to be called as follows:
var x = DoSomeGameLogic(diceRolls, out diceRolls);
// or
var x = DoSomeGameLogic(ref diceRolls);
// or
x = DoSomeGameLogic(diceRolls);
diceRolls = x.RemainingDiceRolls;
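Here is a self-contained sketch of that usage, assuming a six-sided die. DiceExtensions and TwoRounds are hypothetical names added for the illustration. Because the stream is an iterator that creates a fresh Random(seed) each time it is enumerated, Skip and Take always see the same underlying sequence, which is what keeps Consume pure.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DiceExtensions
{
    // Seeded roll stream; each enumeration restarts from the same seed.
    public static IEnumerable<int> CreateDiceStream(int seed)
    {
        var random = new Random(seed);
        while (true)
        {
            yield return random.Next(1, 7); // assumption: six-sided die, 1..6
        }
    }

    // Returns the next `count` rolls and hands back the rest of the stream.
    public static IEnumerable<int> Consume(
        this IEnumerable<int> rolls, int count, out IEnumerable<int> remainder)
    {
        remainder = rolls.Skip(count);
        return rolls.Take(count);
    }

    // Hypothetical helper: two rounds of `playerCount` rolls each.
    public static int[][] TwoRounds(int seed, int playerCount)
    {
        IEnumerable<int> diceRolls = CreateDiceStream(seed);
        var first = diceRolls.Consume(playerCount, out diceRolls).ToArray();
        var second = diceRolls.Consume(playerCount, out diceRolls).ToArray();
        return new[] { first, second };
    }
}
```

One thing to keep in mind with this design: every Consume stacks another Skip on top of the stream, so each enumeration replays the generator from the start. That is fine for a handful of rolls but grows with the number of rolls already taken.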
The "classic" way to implement pure random generators is to use a specialized form of a state monad (more explanation here), which wraps the carrying around of the current state of the generator. So, instead of implementing (note that my C# is quite rusty, so please consider this as pseudocode):
Int Next() {
nextState, nextValue = NextRandom(globalState);
globalState = nextState;
return nextValue;
}
you define something like this:
class Random<T> {
private Func<Int, Tuple<Int, T>> transition;
private static Tuple<Int, Int> NextRandom(Int state) { ... whatever, see below ... }
public static Random<A> Unit<A>(A a) {
return new Random<A>(s => Tuple(s, a));
}
public static Random<Int> GetRandom() {
return new Random<Int>(s => NextRandom(s));
}
public Random<U> SelectMany<U>(Func<T, Random<U>> f) {
return new Random<U>(s => {
nextS, a = this.transition(s);
return f(a).transition(nextS);
});
}
public T Run(Int seed) {
// discard the final state and return only the value
nextS, a = this.transition(seed);
return a;
}
}
Which should be usable with LINQ, if I did everything right:
// player1 = bla, player2 = blub, ...
Random<Tuple<Player, Int>> playerOneRoll = from roll in GetRandom()
select Tuple(player1, roll);
Random<Tuple<Player, Int>> playerTwoRoll = from roll in GetRandom()
select Tuple(player2, roll);
Random<List<Tuple<Player, Int>>> randomRolls = from t1 in playerOneRoll
from t2 in playerTwoRoll
select List(t1, t2);
var actualRolls = randomRolls.Run(234324);
etc., possibly using some combinators. The trick here is to represent the whole "random action" parametrized by the current input state; but this is also the problem, since you'd need a good implementation of NextRandom.
It would be nice if you could just reuse the internals of the .NET Random implementation, but as it seems, you cannot access its internal state. However, I'm sure there are enough sufficiently good PRNG state functions around on the internet (this one looks good; you might have to change the state type).
Another disadvantage of monads is that once you start working in them (i.e., construct things in Random), you need to "carry that through" the whole control flow, up to the top level, at which you should call Run once and for all. This is something one needs to get used to, and is more tedious in C# than in functional languages optimized for such things.
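For readers who want something that compiles, here is one possible sketch of the idea in real C#. All names (Rand<T>, Rand.Roll, Rand.TwoRolls) are hypothetical, and the generator step is a simple linear congruential generator chosen only for illustration; it is not the PRNG referred to above and is not suitable where statistical quality matters. Select and the two-argument SelectMany are the methods the LINQ query syntax compiles to.

```csharp
using System;

// A random computation: given a state, produce the next state and a value.
public sealed class Rand<T>
{
    private readonly Func<uint, Tuple<uint, T>> transition;

    public Rand(Func<uint, Tuple<uint, T>> transition)
    {
        this.transition = transition;
    }

    // Runs the computation from a seed and returns (final state, result).
    public Tuple<uint, T> Run(uint seed) { return transition(seed); }

    public Rand<U> Select<U>(Func<T, U> f)
    {
        return new Rand<U>(s =>
        {
            var r = transition(s);
            return Tuple.Create(r.Item1, f(r.Item2));
        });
    }

    public Rand<V> SelectMany<U, V>(Func<T, Rand<U>> f, Func<T, U, V> project)
    {
        return new Rand<V>(s =>
        {
            var r1 = transition(s);
            // Thread the state produced by the first step into the second.
            var r2 = f(r1.Item2).Run(r1.Item1);
            return Tuple.Create(r2.Item1, project(r1.Item2, r2.Item2));
        });
    }
}

public static class Rand
{
    // One LCG step (illustrative constants), mapped to a die face 1..6.
    public static readonly Rand<int> Roll = new Rand<int>(s =>
    {
        uint next = unchecked(s * 1664525u + 1013904223u);
        return Tuple.Create(next, (int)((next >> 16) % 6u) + 1);
    });

    // Two rolls in sequence, written with LINQ query syntax.
    public static readonly Rand<Tuple<int, int>> TwoRolls =
        from a in Roll
        from b in Roll
        select Tuple.Create(a, b);
}
```

Calling Rand.TwoRolls.Run(seed) with the same seed always returns the same pair of rolls and the same final state, which can then be passed to the next Run to continue the sequence.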

Getting the latest item in an observable sequence using RX in C#

Take the following as an example:
var ob = Observable.Interval(TimeSpan.FromSeconds(1)).StartWith(500).Replay(1).RefCount();
What I'm trying to achieve here is to obtain the value of the latest item in the sequence at any given time, synchronously. That means extensions like FirstAsync won't do it for me.
The StartWith and Replay bit ensures that there will always be a value, and the RefCount bit is necessary in my actual code to detect when I can do some disposal actions.
So to simulate this "any given time" part, let's try getting the latest value after 5 seconds:
Observable.Timer(TimeSpan.FromSeconds(5)).Subscribe(x =>
{
// Try to get latest value from "ob" here.
});
So with a 5 second delay, I need to get the value 5 out of the sequence and these are what I have tried so far with no success:
ob.First() - returns 500
ob.Latest().Take(1) - same as above
ob.MostRecent(-1).First() - same as above
ob.MostRecent(-1) - gives me an IEnumerable<long> full of "500"
ob.Last() - never returns because it's waiting for the sequence to complete which it never will
ob.Latest().Last() - same as above
ob.ToTask().Result - same as above
ob.ToEnumerable() - same as above
ob.MostRecent().Last() - same as above
There don't seem to be many resources showing that this can actually be done. The closest I can find is this: "Rx: operator for getting first and most recent value from an Observable stream", but it is not a synchronous call after all (still using a subscription) so it doesn't work for me.
Does anybody know if this is actually doable?
To point out why your code probably isn't working as you expect it to:
var ob = Observable.Interval(TimeSpan.FromSeconds(1)).StartWith(500).Replay(1).RefCount();
//Note at this point `ob` has never been subscribed to,
// so the Reference-count is 0 i.e. has not been connected.
Observable.Timer(TimeSpan.FromSeconds(5)).Subscribe(x =>
{
// Try to get latest value from "ob" here.
//Here we make our first subscription to the `ob` sequence.
// This will connect the sequence (invoke subscribe)
// which will
// 1) invoke StartWith
// 2) invoke onNext(500)
// 3) invoke First()
// 4) First() will then unsubscribe() as it has the single value it needs
// 5) The refCount will now return to 0
// 6) The sequence will be unsubscribed to.
ob.First().Dump();
//Any future calls like `ob.First()` will thus always get the value 500.
});
Potentially what you want is
var ob = Observable.Interval(TimeSpan.FromSeconds(1))
.Publish(500);
var connection = ob.Connect();
//Note: Connect() has been called, so `ob` is already producing values even though nothing has subscribed to it yet.
var subscription = Observable.Timer(TimeSpan.FromSeconds(5)).Subscribe(x =>
{
// Try to get latest value from "ob" here.
ob.First().Dump();
});
//Sometime later
subscription.Dispose();
connection.Dispose();
HOWEVER, you really don't want to be mixing synchronous calls with Rx. You also generally don't want to be subscribing within a subscription (and .First() is a subscription). What you probably mean to be doing is getting the latest value and stashing it somewhere. Using .First() is just a slippery slope. You would probably be better off writing something like:
var subscription = Observable.Timer(TimeSpan.FromSeconds(5))
.SelectMany(_=>ob.Take(1))
.Subscribe(x =>
{
//Do something with X here.
x.Dump();
});
You need to do something like this:
var ob = Observable.Interval(TimeSpan.FromSeconds(1)).StartWith(500);
var latestAndThenTheRest =
Observable
.Create<long>(o =>
{
var bs = new BehaviorSubject<long>(1);
var s1 = ob.Subscribe(bs);
var s2 = bs.Subscribe(o);
return new CompositeDisposable(s1, s2);
});
The only thing that you need to consider here is that ob must be a hot observable for this to even make sense. If it were cold then every subscriber would get a brand new subscription to the start of the ob sequence.
Just to clarify this a bit, with thanks to @LeeCampbell's answer.
What was not working:
var ob = Observable.Interval(TimeSpan.FromSeconds(1)).StartWith(500).Replay(1).RefCount();
Observable.Timer(TimeSpan.FromSeconds(5)).Subscribe(x =>
{
ob.First().Dump();
// This gives you 500.
// Because this is the first time anyone subscribes to the observable,
// so it starts right here and gives you the initial value.
});
What would actually work:
var ob = Observable.Interval(TimeSpan.FromSeconds(1)).StartWith(500).Replay(1).RefCount();
ob.Subscribe(); // Subscribe to start the above hot observable immediately.
Observable.Timer(TimeSpan.FromSeconds(5)).Subscribe(x =>
{
ob.First().Dump();
// This would give you either 3 or 4, depending on the speed and timing of your computer.
});
I'm not sure if this answer helps you, but have you looked into BehaviorSubject? It's an IObservable that remembers its latest value. It's a bit like a combination of a regular variable and an observable in one.
Otherwise, why don't you subscribe to 'ob' and store the latest value in a variable yourself?
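A minimal sketch of that last suggestion, using only the IObserver<T> interface from the base class library. LatestValue<T> is a hypothetical helper, not an Rx type. Because it implements IObserver<T>, it could be passed to ob.Subscribe(...), and its Value property can then be read synchronously from any thread.

```csharp
using System;

// Hypothetical helper: remembers the most recent value pushed to it.
public sealed class LatestValue<T> : IObserver<T>
{
    private readonly object gate = new object();
    private T value;

    // The initial value plays the same role as StartWith(500) in the question.
    public LatestValue(T initial) { value = initial; }

    // Synchronous read of whatever arrived last.
    public T Value
    {
        get { lock (gate) { return value; } }
    }

    public void OnNext(T next)
    {
        lock (gate) { value = next; }
    }

    // Errors and completion are ignored here: the last value stays readable.
    public void OnError(Exception error) { }
    public void OnCompleted() { }
}
```

Usage would look something like `var latest = new LatestValue<long>(500); var subscription = ob.Subscribe(latest);` and later `latest.Value` wherever the current value is needed. Disposing the subscription is still what releases the RefCount.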

How can a TPL Dataflow block downstream get data produced by a source?

I'm processing images using TPL Dataflow. I receive a processing request, read an image from a stream, apply several transformations, then write the resulting image to another stream:
Request -> Stream -> Image -> Image ... -> Stream
For that I use the blocks:
BufferBlock<Request>
TransformBlock<Request,Stream>
TransformBlock<Stream,Image>
TransformBlock<Image,Image>
TransformBlock<Image,Image>
...
writerBlock = new ActionBlock<Image>
The problem is the initial Request is what contains some data necessary to create the resulting Stream along with some additional info I need at that point. Do I have to pass the original Request (or some other context object) down the line to the writerBlock across all the other blocks like this:
TransformBlock<Request,Tuple<Request,Stream>>
TransformBlock<Tuple<Request,Stream>,Tuple<Request,Image>>
TransformBlock<Tuple<Request,Image>,Tuple<Request,Image>>
...
(which is ugly), or is there a way to link the first block to the last one (or, generalizing, to the ones that need the additional data)?
Yes, you pretty much need to do what you described, passing the additional data from every block to the next one.
But using a couple of helper methods, you can make this much simpler:
public static IPropagatorBlock<TInput, Tuple<TOutput, TInput>>
CreateExtendedSource<TInput, TOutput>(Func<TInput, TOutput> transform)
{
return new TransformBlock<TInput, Tuple<TOutput, TInput>>(
input => Tuple.Create(transform(input), input));
}
public static IPropagatorBlock<Tuple<TInput, TExtension>, Tuple<TOutput, TExtension>>
CreateExtendedTransform<TInput, TOutput, TExtension>(Func<TInput, TOutput> transform)
{
return new TransformBlock<Tuple<TInput, TExtension>, Tuple<TOutput, TExtension>>(
tuple => Tuple.Create(transform(tuple.Item1), tuple.Item2));
}
The signatures look daunting, but they are actually not that bad.
Also, you might want to add overloads that pass options to the created block, or overloads that take async delegates.
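As a rough idea of what such an overload could look like, here is a sketch that accepts both an async delegate and block options. The containing class name ExtendedBlocks is made up for the illustration. It relies on the TransformBlock constructor that takes a Func<TInput, Task<TOutput>> together with ExecutionDataflowBlockOptions, and it assumes the System.Threading.Tasks.Dataflow library is referenced.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ExtendedBlocks
{
    // Async variant: awaits the transform on the "current" value and carries
    // the extension (e.g. the original request) along unchanged.
    public static IPropagatorBlock<Tuple<TInput, TExtension>, Tuple<TOutput, TExtension>>
        CreateExtendedTransform<TInput, TOutput, TExtension>(
            Func<TInput, Task<TOutput>> transform,
            ExecutionDataflowBlockOptions options)
    {
        return new TransformBlock<Tuple<TInput, TExtension>, Tuple<TOutput, TExtension>>(
            async tuple => Tuple.Create(await transform(tuple.Item1), tuple.Item2),
            options);
    }
}
```

A call might look like `ExtendedBlocks.CreateExtendedTransform<int, string, string>(async i => { await Task.Yield(); return i.ToString(); }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 2 })`. A TransformBlock emits its outputs in input order by default, even with parallelism above one.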
For example, if you wanted to perform some operations on a number using separate blocks, while passing the original number along the way, you could do something like:
var source = new BufferBlock<int>();
var divided = CreateExtendedSource<int, double>(i => i / 2.0);
var formatted = CreateExtendedTransform<double, string, int>(d => d.ToString("0.0"));
var writer = new ActionBlock<Tuple<string, int>>(tuple => Console.WriteLine(tuple));
source.LinkTo(divided);
divided.LinkTo(formatted);
formatted.LinkTo(writer);
for (int i = 0; i < 10; i++)
source.Post(i);
As you can see, your lambdas (except for the last one) deal only with the “current” value (int, double or string, depending on the stage of the pipeline), while the “original” value (always int) is passed along automatically. At any point, you can use a block created using the normal constructor to access both values (like the final ActionBlock in the example).
(That BufferBlock isn't actually necessary, but I added it to more closely match your design.)
I may be in over my head since I am only starting to play with TPL Dataflow.
But I believe you can accomplish that using a BroadcastBlock as an intermediary between your source and your first target.
BroadcastBlock can offer the message to many targets, so you use it to offer to your target, and also to a JoinBlock, at the end that will merge the result with the original message.
source -> Broadcast ->-----------------------------------------> JoinBlock <source, result>
-> Transformation1 -> Transformation 'n' ->
For example:
var source = new BufferBlock<int>();
var transformation = new TransformBlock<int, int>(i => i * 100);
var broadCast = new BroadcastBlock<int>(null);
source.LinkTo(broadCast);
broadCast.LinkTo(transformation);
var jb = new JoinBlock<int, int>();
broadCast.LinkTo(jb.Target1);
transformation.LinkTo(jb.Target2);
jb.LinkTo(new ActionBlock<Tuple<int, int>>(
c => Console.WriteLine("Source:{0}, Target Result: {1}", c.Item1, c.Item2)));
source.Post(1);
source.Post(2);
source.Complete();
yields...
Source:1, Target Result: 100
Source:2, Target Result: 200
I am just not too sure about how it would behave in an asynchronous environment.

What's the use of chords?

Languages such as Nemerle support the idea of chords. I'd like to know what their practical use is.
The construct also seems to exist in the Cω language (as well as Polyphonic C#), at least according to [Wikipedia](http://en.wikipedia.org/wiki/Chord_(concurrency)).
The primary usage of chords appears to involve database programming (more specifically, join calculus), which is unsurprising given that it is a concurrency construct. More than that, I'm afraid I don't know.
A chord is used for concurrency. The definition is available here.
The bit you are looking for:
In most languages, including C#, methods in the signature of a class are in bijective correspondence with the code of their implementations -- for each method which is declared, there is a single, distinct definition of what happens when that method is called. In Cω, however, a body may be associated with a set of (synchronous and/or asynchronous) methods. We call such a definition a chord, and a particular method may appear in the header of several chords. The body of a chord can only execute once all the methods in its header have been called. Thus, when a method is called there may be zero, one, or more chords which are enabled:
If no chord is enabled then the method invocation is queued up. If the method is asynchronous, then this simply involves adding the arguments (the contents of the message) to a queue. If the method is synchronous, then the calling thread is blocked.
If there is a single enabled chord, then the arguments of the calls involved in the match are de-queued, any blocked thread involved in the match is awakened, and the body runs. When a chord which involves only asynchronous methods runs, then it does so in a new thread.
If there are several chords which are enabled then an unspecified one of them is chosen to run. Similarly, if there are multiple calls to a particular method queued up, we do not specify which call will be de-queued when there is a match.
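To give a feel for what that means in practice, here is a rough emulation in plain C# of a single chord that joins a synchronous Get() with an asynchronous Put(s). Plain C# has no chord syntax, so this is only an approximation under that assumption: ChordBuffer is a hypothetical class, and a BlockingCollection stands in for the queueing and blocking rules quoted above.

```csharp
using System.Collections.Concurrent;

// Hypothetical emulation of a buffer built from one chord, roughly
//     Get() & Put(s) { return s; }
// The body "runs" only once both a Get and a Put have been called.
public sealed class ChordBuffer
{
    private readonly BlockingCollection<string> pending =
        new BlockingCollection<string>();

    // Asynchronous half: never blocks, it just queues its argument.
    public void Put(string s) { pending.Add(s); }

    // Synchronous half: blocks the caller until a queued Put is available,
    // then de-queues that argument and returns it.
    public string Get() { return pending.Take(); }
}
```

The part about purely asynchronous chords running on a new thread is not modelled here, and unlike the quoted description this emulation does pick queued calls in a fixed (first-in, first-out) order.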
Try Nemerle Computation Expressions:
https://code.google.com/p/nemerle/source/browse/nemerle/trunk/snippets/ComputationExpressions/
Some examples:
def upTo (n : int)
{
comp enumerable
{
mutable i = 0;
while (i < n)
{
i ++;
yield i
}
}
}
def manyTimes : IEnumerable [int] =
comp enumerable
{
yieldcomp upTo(2); // 1 2
yield 100; // 100
yieldcomp upTo(3); // 1 2 3
yield 100; // 100
yieldcomp upTo(10); // 1 2 3 .. 10
}
def fn(n)
{
comp async
{
if (n < 20)
returncomp fn(n + 1);
else
return n;
}
}
def f(n1, n2)
{
comp async
{
defcomp n1 = fn(n1);
defcomp n2 = fn(n2);
return $"$n1 $n2";
}
}
private HttpGet(url : string) : Async[string]
{
comp async
{
def req = WebRequest.Create(url);
using (defcomp resp = req.AsyncGetResponse())
using (stream = resp.GetResponseStream())
using (reader = StreamReader(stream))
return reader.ReadToEnd();
}
}
Some more examples here (the article is in Russian, but the code is in English :) ): http://habrahabr.ru/blogs/programming/108184/
