Reactive Extensions and Retry

So a series of articles popped up on my radar this morning. It started with this question, which led to the original example and source code on GitHub.
I rewrote it slightly so that I can use it in console and service applications:
public static class Extensions
{
static readonly TaskPoolScheduler Scheduler = new TaskPoolScheduler(new TaskFactory());
// Licensed under the MIT license with <3 by GitHub
/// <summary>
/// A back off strategy which starts with 1 second and then 4, 9, 16... (n squared, despite the name)
/// </summary>
[SuppressMessage("Microsoft.Security", "CA2104:DoNotDeclareReadOnlyMutableReferenceTypes")]
public static readonly Func<int, TimeSpan> ExponentialBackoff = n => TimeSpan.FromSeconds(Math.Pow(n, 2));
/// <summary>
/// A linear strategy which starts with 1 second and then 2, 3, 4...
/// </summary>
[SuppressMessage("Microsoft.Security", "CA2104:DoNotDeclareReadOnlyMutableReferenceTypes")]
public static readonly Func<int, TimeSpan> LinearStrategy = n => TimeSpan.FromSeconds(1*n);
/// <summary>
/// Returns a cold observable which retries (re-subscribes to) the source observable on error up to the
/// specified number of times or until it successfully terminates. Allows for customizable back off strategy.
/// </summary>
/// <param name="source">The source observable.</param>
/// <param name="retryCount">The number of attempts of running the source observable before failing.</param>
/// <param name="strategy">The strategy to use in backing off, exponential by default.</param>
/// <param name="retryOnError">A predicate determining for which exceptions to retry. Defaults to all</param>
/// <param name="scheduler">The scheduler.</param>
/// <returns>
/// A cold observable which retries (re-subscribes to) the source observable on error up to the
/// specified number of times or until it successfully terminates.
/// </returns>
[SuppressMessage("Microsoft.Reliability", "CA2000:Dispose objects before losing scope")]
public static IObservable<T> RetryWithBackoffStrategy<T>(
this IObservable<T> source,
int retryCount = 3,
Func<int, TimeSpan> strategy = null,
Func<Exception, bool> retryOnError = null,
IScheduler scheduler = null)
{
strategy = strategy ?? ExponentialBackoff;
scheduler = scheduler ?? Scheduler;
if (retryOnError == null)
retryOnError = e => true;
int attempt = 0;
return Observable.Defer(() =>
{
return ((++attempt == 1) ? source : source.DelaySubscription(strategy(attempt - 1), scheduler))
.Select(item => new Tuple<bool, T, Exception>(true, item, null))
.Catch<Tuple<bool, T, Exception>, Exception>(e => retryOnError(e)
? Observable.Throw<Tuple<bool, T, Exception>>(e)
: Observable.Return(new Tuple<bool, T, Exception>(false, default(T), e)));
})
.Retry(retryCount)
.SelectMany(t => t.Item1
? Observable.Return(t.Item2)
: Observable.Throw<T>(t.Item3));
}
}
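To make the two built-in strategies concrete, here is a minimal standalone sketch that prints the delay each one yields for the first few retries. The `BackoffDemo` class and the `Delays` helper are hypothetical names used only for illustration; the two lambdas are copied from the class above. Note that `Math.Pow(n, 2)` squares the retry number, so the delays grow as 1, 4, 9, 16 seconds.

```csharp
using System;
using System.Linq;

public static class BackoffDemo
{
    // Same formulas as Extensions.ExponentialBackoff and Extensions.LinearStrategy above.
    public static readonly Func<int, TimeSpan> ExponentialBackoff =
        n => TimeSpan.FromSeconds(Math.Pow(n, 2));
    public static readonly Func<int, TimeSpan> LinearStrategy =
        n => TimeSpan.FromSeconds(1 * n);

    // Hypothetical helper: the delays, in seconds, for retries 1..count.
    public static double[] Delays(Func<int, TimeSpan> strategy, int count) =>
        Enumerable.Range(1, count).Select(n => strategy(n).TotalSeconds).ToArray();

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Delays(ExponentialBackoff, 4))); // 1, 4, 9, 16
        Console.WriteLine(string.Join(", ", Delays(LinearStrategy, 4)));     // 1, 2, 3, 4
    }
}
```

Since the extension calls `strategy(attempt - 1)`, the first retry waits for `strategy(1)`, which is 1 second with either strategy.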
Now to test how it works, I've written this small program:
class Program
{
static void Main(string[] args)
{
int tryCount = 0;
var cts = new CancellationTokenSource();
var sched = new TaskPoolScheduler(new TaskFactory());
var source = Observable.Defer(
() =>
{
Console.WriteLine("Action {0}", tryCount);
var a = 5/tryCount++;
return Observable.Return("yolo");
});
source.RetryWithBackoffStrategy(scheduler: sched, strategy: Extensions.LinearStrategy, retryOnError: exception => exception is DivideByZeroException);
while (!cts.IsCancellationRequested)
source.Subscribe(
res => { Console.WriteLine("Result: {0}", res); },
ex =>
{
Console.WriteLine("Error: {0}", ex.Message);
},
() =>
{
cts.Cancel();
Console.WriteLine("End Processing after {0} attempts", tryCount);
});
}
}
Initially I thought that the act of subscribing would automatically trigger all the subsequent retries. That was not the case, so I had to implement a cancellation token and loop until it signals that all retries have been exhausted.
The other option is to use AutoResetEvent:
class Program
{
static void Main(string[] args)
{
int tryCount = 0;
var auto = new AutoResetEvent(false);
var source = Observable.Defer(
() =>
{
Console.WriteLine("Action {0}", tryCount);
var a = 5/tryCount++;
return Observable.Return("yolo");
});
source.RetryWithBackoffStrategy(strategy: Extensions.LinearStrategy, retryOnError: exception => exception is DivideByZeroException);
while (!auto.WaitOne(1))
{
source.Subscribe(
res => { Console.WriteLine("Result: {0}", res); },
ex =>
{
Console.WriteLine("Error: {0}", ex.Message);
},
() =>
{
Console.WriteLine("End Processing after {0} attempts", tryCount);
auto.Set();
});
}
}
}
In both scenarios it will display these lines:
Action 0
Error: Attempted to divide by zero.
Action 1
Result: yolo
End Processing after 2 attempts
The question I have for this crowd is: Is this the best way to use this extension? Or is there a way to subscribe to the Observable so that it will re-fire itself, up to the number of retries?
FINAL UPDATE
Based on Brandon's suggestion, this is the proper way of subscribing:
internal class Program
{
#region Methods
private static void Main(string[] args)
{
int tryCount = 0;
IObservable<string> source = Observable.Defer(
() =>
{
Console.WriteLine("Action {0}", tryCount);
int a = 5 / tryCount++;
return Observable.Return("yolo");
});
source.RetryWithBackoffStrategy(strategy: Extensions.ExponentialBackoff, retryOnError: exception => exception is DivideByZeroException, scheduler: Scheduler.Immediate)
.Subscribe(
res => { Console.WriteLine("Result: {0}", res); },
ex => { Console.WriteLine("Error: {0}", ex.Message); },
() =>
{
Console.WriteLine("End Processing after {0} attempts", tryCount);
});
}
#endregion
}
The output will be slightly different:
Action 0
Action 1
Result: yolo
End Processing after 2 attempts
This turned out to be quite a useful extension. Here is another example of how it can be used, where the strategy and the error processing are supplied as delegates.
internal class Program
{
#region Methods
private static void Main(string[] args)
{
int tryCount = 0;
IObservable<string> source = Observable.Defer(
() =>
{
Console.WriteLine("Action {0}", tryCount);
int a = 5 / tryCount++;
return Observable.Return("yolo");
});
source.RetryWithBackoffStrategy(
strategy: i => TimeSpan.FromMilliseconds(1),
retryOnError: exception =>
{
if (exception is DivideByZeroException)
{
Console.WriteLine("Tried to divide by zero");
return true;
}
return false;
},
scheduler: Scheduler.Immediate).Subscribe(
res => { Console.WriteLine("Result: {0}", res); },
ex => { Console.WriteLine("Error: {0}", ex.Message); },
() =>
{
Console.WriteLine("Succeeded after {0} attempts", tryCount);
});
}
#endregion
}
Output:
Action 0
Tried to divide by zero
Action 1
Result: yolo
Succeeded after 2 attempts

Yeah, Rx is generally asynchronous, so when writing tests you need to wait for it to finish (otherwise Main just exits right after your call to Subscribe).
Also, make sure you subscribe to the observable produced by calling source.RetryWithBackoffStrategy(...). That produces a new observable that has the retry semantics.
Easiest solution in cases like this is to literally use Wait:
try
{
var source2 = source.RetryWithBackoffStrategy(/*...*/);
// blocks the current thread until the source finishes
var result = source2.Wait();
Console.WriteLine("result=" + result);
}
catch (Exception err)
{
Console.WriteLine("uh oh: {0}", err);
}
If you use something like NUnit (which supports asynchronous tests) to write your tests, then you can do:
[Test]
public async Task MyTest()
{
var source = // ...;
var source2 = source.RetryWithBackoffStrategy(/*...*/);
var result = await source2; // you can await observables
Assert.That(result, Is.EqualTo(5));
}

Related

How to keep track of faulted items in TPL pipeline in (thread)safe way

I am using a TPL pipeline design together with Stephen Cleary's Try library. In short, it wraps a value or an exception and floats it down the pipeline. So even items that have thrown exceptions inside their processing methods have Status=RunToCompletion at the end, when I await resultsBlock.Completion;. So I need another way to register faulted items. Here is a small sample:
var downloadBlock = new TransformBlock<int, Try<int>>(construct => Try.Create(() =>
{
//SomeProcessingMethod();
return 1;
}));
var processBlock = new TransformBlock<Try<int>, Try<int>>(construct => construct.Map(value =>
{
//SomeProcessingMethod();
return 1;
}));
var resultsBlock = new ActionBlock<Try<int>>(construct =>
{
if (construct.IsException)
{
var exception = construct.Exception;
switch (exception)
{
case GoogleApiException gex:
//_notificationService.NotifyUser("OMG, my dear sir, I think I messed something up:/"
//Register that this item was faulted, so we know that we need to retry it.
break;
default:
break;
}
}
});
One solution would be to create a List<int> FaultedItems; where I would insert all faulted items in my Exception handling block and then after await resultsBlock.Completion; I could check if the list is not empty and create new pipeline for faulted items. My question is if I use a List<int> am I at risk of running into problems with thread safety if I decide to play with MaxDegreeOfParallelism settings and I'd be better off using some ConcurrentCollection? Or maybe this approach is flawed in some other way?

I converted a retry-block implementation from an answer to a similar question, to work with Stephen Cleary's Try types as input and output. The method CreateRetryTransformBlock returns a TransformBlock<Try<TInput>, Try<TOutput>>, and the method CreateRetryActionBlock returns something that is practically an ActionBlock<Try<TInput>>.
Three more options are available, the MaxAttemptsPerItem, MinimumRetryDelay and MaxRetriesTotal, on top of the standard execution options.
public class RetryExecutionDataflowBlockOptions : ExecutionDataflowBlockOptions
{
/// <summary>The limit after which an item is returned as failed.</summary>
public int MaxAttemptsPerItem { get; set; } = 1;
/// <summary>The minimum delay duration before retrying an item.</summary>
public TimeSpan MinimumRetryDelay { get; set; } = TimeSpan.Zero;
/// <summary>The limit after which the block transitions to a faulted
/// state (unlimited is the default).</summary>
public int MaxRetriesTotal { get; set; } = -1;
}
public class RetryLimitException : Exception
{
public RetryLimitException(string message, Exception innerException)
: base(message, innerException) { }
}
public static TransformBlock<Try<TInput>, Try<TOutput>>
CreateRetryTransformBlock<TInput, TOutput>(
Func<TInput, Task<TOutput>> transform,
RetryExecutionDataflowBlockOptions dataflowBlockOptions)
{
if (transform == null) throw new ArgumentNullException(nameof(transform));
if (dataflowBlockOptions == null)
throw new ArgumentNullException(nameof(dataflowBlockOptions));
int maxAttemptsPerItem = dataflowBlockOptions.MaxAttemptsPerItem;
int maxRetriesTotal = dataflowBlockOptions.MaxRetriesTotal;
TimeSpan retryDelay = dataflowBlockOptions.MinimumRetryDelay;
if (maxAttemptsPerItem < 1) throw new ArgumentOutOfRangeException(
nameof(dataflowBlockOptions.MaxAttemptsPerItem));
if (maxRetriesTotal < -1) throw new ArgumentOutOfRangeException(
nameof(dataflowBlockOptions.MaxRetriesTotal));
if (retryDelay < TimeSpan.Zero) throw new ArgumentOutOfRangeException(
nameof(dataflowBlockOptions.MinimumRetryDelay));
var internalCTS = CancellationTokenSource
.CreateLinkedTokenSource(dataflowBlockOptions.CancellationToken);
var maxDOP = dataflowBlockOptions.MaxDegreeOfParallelism;
var taskScheduler = dataflowBlockOptions.TaskScheduler;
var exceptionsCount = 0;
SemaphoreSlim semaphore;
if (maxDOP == DataflowBlockOptions.Unbounded)
{
semaphore = new SemaphoreSlim(Int32.MaxValue);
}
else
{
semaphore = new SemaphoreSlim(maxDOP, maxDOP);
// The degree of parallelism is controlled by the semaphore
dataflowBlockOptions.MaxDegreeOfParallelism = DataflowBlockOptions.Unbounded;
// Use a limited-concurrency scheduler for preserving the processing order
dataflowBlockOptions.TaskScheduler = new ConcurrentExclusiveSchedulerPair(
taskScheduler, maxDOP).ConcurrentScheduler;
}
var block = new TransformBlock<Try<TInput>, Try<TOutput>>(async item =>
{
// Continue on captured context after every await
if (item.IsException) return Try<TOutput>.FromException(item.Exception);
var result1 = await ProcessOnceAsync(item);
if (item.IsException || result1.IsValue) return result1;
for (int i = 2; i <= maxAttemptsPerItem; i++)
{
await Task.Delay(retryDelay, internalCTS.Token);
var result = await ProcessOnceAsync(item);
if (result.IsValue) return result;
}
return result1; // Return the first-attempt exception
}, dataflowBlockOptions);
dataflowBlockOptions.MaxDegreeOfParallelism = maxDOP; // Restore initial value
dataflowBlockOptions.TaskScheduler = taskScheduler; // Restore initial value
_ = block.Completion.ContinueWith(_ => internalCTS.Dispose(),
TaskScheduler.Default);
return block;
async Task<Try<TOutput>> ProcessOnceAsync(Try<TInput> item)
{
await semaphore.WaitAsync(internalCTS.Token);
try
{
var result = await item.Map(transform);
if (item.IsValue && result.IsException)
{
ObserveNewException(result.Exception);
}
return result;
}
finally
{
semaphore.Release();
}
}
void ObserveNewException(Exception ex)
{
if (maxRetriesTotal == -1) return;
uint newCount = (uint)Interlocked.Increment(ref exceptionsCount);
if (newCount <= (uint)maxRetriesTotal) return;
if (newCount == (uint)maxRetriesTotal + 1)
{
internalCTS.Cancel(); // The block has failed
throw new RetryLimitException($"The max retry limit " +
$"({maxRetriesTotal}) has been reached.", ex);
}
throw new OperationCanceledException();
}
}
public static ITargetBlock<Try<TInput>> CreateRetryActionBlock<TInput>(
Func<TInput, Task> action,
RetryExecutionDataflowBlockOptions dataflowBlockOptions)
{
if (action == null) throw new ArgumentNullException(nameof(action));
var block = CreateRetryTransformBlock<TInput, object>(async input =>
{
await action(input).ConfigureAwait(false); return null;
}, dataflowBlockOptions);
var nullTarget = DataflowBlock.NullTarget<Try<object>>();
block.LinkTo(nullTarget);
return block;
}
Usage example:
var downloadBlock = CreateRetryTransformBlock(async (int construct) =>
{
int result = await DownloadAsync(construct);
return result;
}, new RetryExecutionDataflowBlockOptions()
{
MaxDegreeOfParallelism = 10,
MaxAttemptsPerItem = 3,
MaxRetriesTotal = 100,
MinimumRetryDelay = TimeSpan.FromSeconds(10)
});
var processBlock = new TransformBlock<Try<int>, Try<int>>(
construct => construct.Map(async value =>
{
return await ProcessAsync(value);
}));
downloadBlock.LinkTo(processBlock,
new DataflowLinkOptions() { PropagateCompletion = true });
To keep things simple, in case that an item has been retried the maximum number of times, the exception preserved is the first one that occurred. The subsequent exceptions are lost. In most cases the lost exceptions are going to be of the same type as the first one anyway.
Caution: The above implementation does not have an efficient input queue. If you feed this block with millions of items, the memory usage will explode.
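On the narrower question about `List<int>`: `List<T>` is not safe for concurrent writes. With the default `MaxDegreeOfParallelism` of 1 a block runs its delegate for one item at a time, but as soon as that setting is raised, a concurrent collection is the safer choice. Below is a minimal sketch of the idea, using `Parallel.For` as a stand-in for a block that processes items in parallel. The `FaultedItemsDemo` class, the `CollectFaulted` method and the `IsFaulted` check (a placeholder for `construct.IsException`) are hypothetical names for illustration only.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class FaultedItemsDemo
{
    // Hypothetical stand-in for "this item threw": every third item fails.
    static bool IsFaulted(int item) => item % 3 == 0;

    public static int[] CollectFaulted(int itemCount)
    {
        // ConcurrentQueue<T> tolerates Enqueue calls from many threads at once.
        var faulted = new ConcurrentQueue<int>();
        Parallel.For(0, itemCount, item =>
        {
            if (IsFaulted(item)) faulted.Enqueue(item);
        });
        // Arrival order is nondeterministic under parallelism, so sort before returning.
        return faulted.OrderBy(x => x).ToArray();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", CollectFaulted(10))); // 0, 3, 6, 9
    }
}
```

After the processing has completed, the collected items can be fed into a new pipeline for another attempt, as described in the question.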

Parallel.For<int> not working as expected

I wrote a simple Parallel.For loop, but when I run the code, I get random results. I expect var total to be 15 (1+2+3+4+5). I used Interlocked.Add to prevent race conditions and strange behavior. Can someone explain why the output is random and not 15?
public class Program
{
public static void Main(string[] args)
{
Console.WriteLine("before Dowork");
DoWork();
Console.WriteLine("After Dowork");
Console.ReadLine();
}
public static void DoWork()
{
try
{
int total = 0;
var result = Parallel.For<int>(0, 6,
() => 0,
(i, status, y) =>
{
return i;
},
(x) =>
{
Interlocked.Add(ref total, x);
});
if (result.IsCompleted)
Console.WriteLine($"total is: {total}");
else Console.WriteLine("loop not ready yet");
}
catch(Exception e)
{
Console.WriteLine(e.Message);
}
}
}

Instead of using
(i, status, y) =>
{
return i;
}
you should use
(i, status, y) =>
{
return y + i;
}
Parallel.For splits the source sequence into several partitions. The items in each partition are processed sequentially, but multiple partitions may be executed in parallel.
Each partition has a local state. The local state is the return value of the above lambda function, and it is also passed in as the y parameter. So the reason for returning y + i should be clear now: you should update the local state to the sum of the previous state and the input value i.
After every item of a partition has been processed, the final value of the local state is passed to the last function, where you sum up all the states:
(x) =>
{
Interlocked.Add(ref total, x);
}
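Putting the pieces together, here is a minimal complete sketch of the corrected loop. The `ParallelSumDemo` class and its `Sum` method are hypothetical names used for illustration; the three lambdas correspond to the local-state initializer, the loop body and the final merge step described above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelSumDemo
{
    // Sums the integers in [fromInclusive, toExclusive) using per-partition local state.
    public static int Sum(int fromInclusive, int toExclusive)
    {
        int total = 0;
        Parallel.For<int>(fromInclusive, toExclusive,
            () => 0,                                   // each partition starts with a local sum of 0
            (i, state, local) => local + i,            // add the current index to the partition's sum
            local => Interlocked.Add(ref total, local) // merge each partition's sum exactly once
        );
        return total;
    }

    public static void Main()
    {
        Console.WriteLine($"total is: {Sum(0, 6)}"); // total is: 15
    }
}
```

The result no longer depends on how the range is partitioned, because every index contributes to exactly one local sum and every local sum is added to `total` once.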

Observable with Time interval not displaying results on subscribe

I am trying to add a time interval to this Observable sequence (that is, produce an integer sequence at a specific timespan), but it does not seem to be working. When I remove the timer, it works fine. Am I applying the timer wrongly?
var timer = Observable.Interval(TimeSpan.FromSeconds(2)).Take(4);
var nums = Observable.Range(1,1200).Where(a => a % 2 == 0);
var sourcenumbs = timer.SelectMany(nums);
var results = sourcenumbs.Subscribe(
x => Console.WriteLine("OnNext: {0}",x),
ex => Console.WriteLine("OnError: {0}",ex.Message),
() => Console.WriteLine("OnComplete")
);
This code displays no output. Does it get disposed before it reaches the Subscribe?
But if I add a for loop with a Thread.Sleep in it, then it works. Why?
for (int i = 0; i < 10; i++)
{
Thread.Sleep(TimeSpan.FromSeconds(0.9));
}

Is this what you want?
static void Main(string[] args)
{
Execute();
Console.ReadKey();
}
private static async void Execute()
{
var intervals = Observable.Interval(TimeSpan.FromSeconds(2)).StartWith(0);
var evenNumbers = Enumerable.Range(1, 1200).Where(a => a % 2 == 0);
var evenNumbersAtIntervals = intervals.Zip(evenNumbers, (_, num) => num);
try
{
await evenNumbersAtIntervals.ForEachAsync(
x => Console.WriteLine("OnNext: {0}", x)
);
Console.WriteLine("Complete");
}
catch(Exception e)
{
Console.WriteLine("Exception " + e);
}
}
Take note that evenNumbers is an Enumerable and not an Observable.

Data Propagation in TPL Dataflow Pipeline with Batchblock.Triggerbatch()

In my Producer-Consumer scenario, I have multiple consumers, and each of the consumers send an action to external hardware, which may take some time. My Pipeline looks somewhat like this:
BatchBlock --> TransformBlock --> BufferBlock --> (Several) ActionBlocks
I have assigned BoundedCapacity of my ActionBlocks to 1.
What I want, in theory, is to trigger the BatchBlock to send a group of items to the TransformBlock only when one of my ActionBlocks is available for operation. Until then the BatchBlock should just keep buffering elements and not pass them on to the TransformBlock. My batch sizes are variable. As a batch size is mandatory, I do have a really high upper limit for the BatchBlock batch size; however, I really don't wish to reach that limit. I would like to trigger my batches depending on the availability of the ActionBlocks performing the said task.
I have achieved this with the help of the TriggerBatch() method. I am calling BatchBlock.TriggerBatch() as the last action in my ActionBlock. However, interestingly, after several days of working properly the pipeline has come to a halt. Upon checking, I found out that sometimes the inputs to the BatchBlock come in after the ActionBlocks are done with their work. In this case the ActionBlocks do actually call TriggerBatch at the end of their work; however, since at this point there is no input to the BatchBlock at all, the call to TriggerBatch is fruitless. And after a while, when inputs do flow in to the BatchBlock, there is no one left to call TriggerBatch and restart the pipeline. I was looking for something where I could just check whether something is in fact present in the input buffer of the BatchBlock, but there is no such feature available. I also could not find a way to check whether the call to TriggerBatch was fruitful.
Could anyone suggest a possible solution to my problem? Unfortunately, using a Timer to trigger batches is not an option for me. Except for the start of the pipeline, the throttling should be governed only by the availability of one of the ActionBlocks.
The example code is here:
static BatchBlock<int> _groupReadTags;
static void Main(string[] args)
{
_groupReadTags = new BatchBlock<int>(1000);
var bufferOptions = new DataflowBlockOptions{BoundedCapacity = 2};
BufferBlock<int> _frameBuffer = new BufferBlock<int>(bufferOptions);
var consumerOptions = new ExecutionDataflowBlockOptions { BoundedCapacity = 1};
int batchNo = 1;
TransformBlock<int[], int> _workingBlock = new TransformBlock<int[], int>(list =>
{
Console.WriteLine("\n\nWorking on Batch Number {0}", batchNo);
//_groupReadTags.TriggerBatch();
int sum = 0;
foreach (int item in list)
{
Console.WriteLine("Elements in batch {0} :: {1}", batchNo, item);
sum += item;
}
batchNo++;
return sum;
});
ActionBlock<int> _worker1 = new ActionBlock<int>(async x =>
{
Console.WriteLine("Number from ONE :{0}",x);
await Task.Delay(500);
Console.WriteLine("BatchBlock Output Count : {0}", _groupReadTags.OutputCount);
_groupReadTags.TriggerBatch();
},consumerOptions);
ActionBlock<int> _worker2 = new ActionBlock<int>(async x =>
{
Console.WriteLine("Number from TWO :{0}", x);
await Task.Delay(2000);
_groupReadTags.TriggerBatch();
}, consumerOptions);
_groupReadTags.LinkTo(_workingBlock);
_workingBlock.LinkTo(_frameBuffer);
_frameBuffer.LinkTo(_worker1);
_frameBuffer.LinkTo(_worker2);
_groupReadTags.Post(10);
_groupReadTags.Post(20);
_groupReadTags.TriggerBatch();
Task postingTask = new Task(() => PostStuff());
postingTask.Start();
Console.ReadLine();
}
static void PostStuff()
{
for (int i = 0; i < 10; i++)
{
_groupReadTags.Post(i);
Thread.Sleep(100);
}
Parallel.Invoke(
() => _groupReadTags.Post(100),
() => _groupReadTags.Post(200),
() => _groupReadTags.Post(300),
() => _groupReadTags.Post(400),
() => _groupReadTags.Post(500),
() => _groupReadTags.Post(600),
() => _groupReadTags.Post(700),
() => _groupReadTags.Post(800)
);
}

Here is an alternative BatchBlock implementation with some extra features. It includes a TriggerBatch method with this signature:
public int TriggerBatch(int nextMinBatchSizeIfEmpty);
Invoking this method will trigger a batch immediately if the input queue is not empty; otherwise it will set a temporary MinBatchSize that affects only the next batch. You could invoke this method with a small value for nextMinBatchSizeIfEmpty to ensure that, in case a batch cannot currently be produced, the next batch will occur sooner than the BatchSize configured in the block's constructor.
This method returns the size of the batch produced. It returns 0 in case that the input queue is empty, or the output queue is full, or the block has completed.
public class BatchBlockEx<T> : ITargetBlock<T>, ISourceBlock<T[]>
{
private readonly ITargetBlock<T> _input;
private readonly IPropagatorBlock<T[], T[]> _output;
private readonly Queue<T> _queue;
private readonly object _locker = new object();
private int _nextMinBatchSize = Int32.MaxValue;
public Task Completion { get; }
public int InputCount { get { lock (_locker) return _queue.Count; } }
public int OutputCount => ((BufferBlock<T[]>)_output).Count;
public int BatchSize { get; }
public BatchBlockEx(int batchSize, DataflowBlockOptions dataflowBlockOptions = null)
{
if (batchSize < 1) throw new ArgumentOutOfRangeException(nameof(batchSize));
dataflowBlockOptions = dataflowBlockOptions ?? new DataflowBlockOptions();
if (dataflowBlockOptions.BoundedCapacity != DataflowBlockOptions.Unbounded &&
dataflowBlockOptions.BoundedCapacity < batchSize)
throw new ArgumentOutOfRangeException(nameof(batchSize),
"Number must be no greater than the value specified in BoundedCapacity.");
this.BatchSize = batchSize;
_output = new BufferBlock<T[]>(dataflowBlockOptions);
_queue = new Queue<T>(batchSize);
_input = new ActionBlock<T>(async item =>
{
T[] batch = null;
lock (_locker)
{
_queue.Enqueue(item);
if (_queue.Count == batchSize || _queue.Count >= _nextMinBatchSize)
{
batch = _queue.ToArray(); _queue.Clear();
_nextMinBatchSize = Int32.MaxValue;
}
}
if (batch != null) await _output.SendAsync(batch).ConfigureAwait(false);
}, new ExecutionDataflowBlockOptions()
{
BoundedCapacity = 1,
CancellationToken = dataflowBlockOptions.CancellationToken
});
var inputContinuation = _input.Completion.ContinueWith(async t =>
{
try
{
T[] batch = null;
lock (_locker)
{
if (_queue.Count > 0)
{
batch = _queue.ToArray(); _queue.Clear();
}
}
if (batch != null) await _output.SendAsync(batch).ConfigureAwait(false);
}
finally
{
if (t.IsFaulted)
{
_output.Fault(t.Exception.InnerException);
}
else
{
_output.Complete();
}
}
}, TaskScheduler.Default).Unwrap();
this.Completion = Task.WhenAll(inputContinuation, _output.Completion);
}
public void Complete() => _input.Complete();
void IDataflowBlock.Fault(Exception ex) => _input.Fault(ex);
public int TriggerBatch(Func<T[], bool> condition, int nextMinBatchSizeIfEmpty)
{
if (nextMinBatchSizeIfEmpty < 1)
throw new ArgumentOutOfRangeException(nameof(nextMinBatchSizeIfEmpty));
int count = 0;
lock (_locker)
{
if (_queue.Count > 0)
{
T[] batch = _queue.ToArray();
if (condition == null || condition(batch))
{
bool accepted = _output.Post(batch);
if (accepted) { _queue.Clear(); count = batch.Length; }
}
_nextMinBatchSize = Int32.MaxValue;
}
else
{
_nextMinBatchSize = nextMinBatchSizeIfEmpty;
}
}
return count;
}
public int TriggerBatch(Func<T[], bool> condition)
=> TriggerBatch(condition, Int32.MaxValue);
public int TriggerBatch(int nextMinBatchSizeIfEmpty)
=> TriggerBatch(null, nextMinBatchSizeIfEmpty);
public int TriggerBatch() => TriggerBatch(null, Int32.MaxValue);
DataflowMessageStatus ITargetBlock<T>.OfferMessage(
DataflowMessageHeader messageHeader, T messageValue,
ISourceBlock<T> source, bool consumeToAccept)
{
return _input.OfferMessage(messageHeader, messageValue, source,
consumeToAccept);
}
T[] ISourceBlock<T[]>.ConsumeMessage(DataflowMessageHeader messageHeader,
ITargetBlock<T[]> target, out bool messageConsumed)
{
return _output.ConsumeMessage(messageHeader, target, out messageConsumed);
}
bool ISourceBlock<T[]>.ReserveMessage(DataflowMessageHeader messageHeader,
ITargetBlock<T[]> target)
{
return _output.ReserveMessage(messageHeader, target);
}
void ISourceBlock<T[]>.ReleaseReservation(DataflowMessageHeader messageHeader,
ITargetBlock<T[]> target)
{
_output.ReleaseReservation(messageHeader, target);
}
IDisposable ISourceBlock<T[]>.LinkTo(ITargetBlock<T[]> target,
DataflowLinkOptions linkOptions)
{
return _output.LinkTo(target, linkOptions);
}
}
Another overload of the TriggerBatch method lets you examine the batch that can currently be produced and decide whether it should be triggered or not:
public int TriggerBatch(Func<T[], bool> condition);
The BatchBlockEx class does not support the Greedy and MaxNumberOfGroups options of the built-in BatchBlock.

I have found that using TriggerBatch in this way is unreliable:
_groupReadTags.Post(10);
_groupReadTags.Post(20);
_groupReadTags.TriggerBatch();
Apparently TriggerBatch is intended to be used inside the block, not outside it like this. I have seen this result in odd timing issues, like items from the next batch being included in the current batch, even though TriggerBatch was called first.
Please see my answer to this question for an alternative using DataflowBlock.Encapsulate: BatchBlock produces batch with elements sent after TriggerBatch()

Observing an asynchronous sequence with 'yield return'

The following sample works fine:
static IEnumerable<int> GenerateNum(int sequenceLength)
{
for(int i = 0; i < sequenceLength; i++)
{
yield return i;
}
}
static void Main(string[] args)
{
//var observ = Observable.Start(() => GenerateNum(1000));
var observ = GenerateNum(1000).ToObservable();
observ.Subscribe(
(x) => Console.WriteLine("test:" + x),
(Exception ex) => Console.WriteLine("Error received from source: {0}.", ex.Message),
() => Console.WriteLine("End of sequence.")
);
Console.ReadKey();
}
However, what I really want is to use the commented out line - i.e. I want to run the 'number generator' asynchronously, and every time it yields a new value, I want it to be output to the console. It doesn't seem to work - how can I modify this code to work?

When doing this for asynchronous execution in a console app, you may want to use the ToObservable(IEnumerable<TSource>, IScheduler) overload (see Observable.ToObservable Method (IEnumerable, IScheduler)). To use the built-in thread pool scheduler, for example, try
var observ = GenerateNum(1000).ToObservable(Scheduler.ThreadPool);
It works for me. To expand, the following complete example works exactly as I think you intend:
static Random r = new Random();
static void Main(string[] args) {
var observ = GenerateNum(1000).ToObservable(Scheduler.ThreadPool );
observ.Subscribe(
(x) => Console.WriteLine("test:" + x),
(Exception ex) => Console.WriteLine("Error received from source: {0}.", ex.Message),
() => Console.WriteLine("End of sequence.")
);
while (Console.ReadKey(true).Key != ConsoleKey.Escape) {
Console.WriteLine("You pressed a key.");
}
}
static IEnumerable<int> GenerateNum(int sequenceLength) {
for (int i = 0; i < sequenceLength; i++) {
Thread.Sleep(r.Next(1, 200));
yield return i;
}
}
