I'm looking for a way to implement the following:
A says: "Yo, X happened. Bye."
Others see that and start doing some work.
In other words, I would like to fire an event and let others handle that in a fire and forget way.
So I've looked into the observer pattern: https://msdn.microsoft.com/en-us/library/dd783449(v=vs.110).aspx. However this example is synchronous, and if the observers take a long time to do their work, the notify method blocks for a long time.
I also looked at how to raise events: https://msdn.microsoft.com/en-us/library/9aackb16(v=vs.110).aspx. Also this example is synchronous, and blocks the sender for a long time when the handler takes long to handle the event.
My question is:
How do I do fire and forget events/messages/delegates in C#?
Probably you should meet Task Parallel Library (TPL) Dataflows. There's one data flow called ActionBlock<TInput> that should be a good start for you:
The ActionBlock<TInput> class is a target block that calls a delegate
when it receives data. Think of a ActionBlock<TInput> object as a
delegate that runs asynchronously when data becomes available. The
delegate that you provide to an ActionBlock<TInput> object can be of
type Action or type System.Func<TInput, Task>[...]
Therefore, what about giving a Func<TInput, Task> to ActionBlock<TInput> to perform asynchronous stuff? I've modified the sample found on this TPL Dataflow MSDN article:
List<Func<int, Task>> observers = new List<Func<int, Task>>
{
n => Console.WriteLine(n),
n => Console.WriteLine(n * i),
n => Console.WriteLine(n * n / i)
};
// Create an ActionBlock<int> object that prints values
// to the console.
var actionBlock = new ActionBlock<int>
(
n =>
{
// Fire and forget call to all observers
foreach(Func<int, Task> observer in observers)
{
// Don't await for its completion
observer(n);
}
}
);
// Post several messages to the block.
for (int i = 0; i < 3; i++)
{
actionBlock.Post(i * 10);
}
// Set the block to the completed state
actionBlock.Complete();
// See how I commented out the following sentence.
// You don't wait actions to complete as you want the fire
// and forget behavior!
// actionBlock.Completion.Wait();
You might also want to take a look at BufferBlock<T>.
Related
I have some c# code (MVC WebAPI) which iterates over an array of IDs in parallel and makes an API call for each entry. In the first version, the whole code was a simple, synchronous for loop. Now we changed that to a combination of Task.WhenAll and a LINQ select:
private async Task RunHeavyLoad(IProgress<float> progress) {
List<MyObj> myElements = new List<MyObj>(someEntries);
float totalSteps = 1f / myElements.Count();
int currentStep = 0;
await Task.WhenAll(myElements.Select(async elem => {
var result = await SomeHeavyApiCall(elem);
DoSomethingWithThe(result);
progress.Report(totalSteps * System.Threading.Interlocked.Increment(ref currentStep) * .1f);
}
// Do some more stuff
}
This is a simplified version of the original method! The actual method EnforceImmediateImport is called by this SignalR hub method:
public class ImportStatusHub : Hub {
public async Task RunUnscheduledImportAsync(DateTime? fromDate, DateTime? toDate) {
Clients.Others.enableManualImport(false);
try {
Progress<float> progress = new Progress<float>((p) => Clients.All.updateProgress(p));
await MvcApplication.GlobalScheduler.EnforceImmediateImport(progress, fromDate, toDate);
} catch (Exception ex) {
Clients.All.importError(ex.Message);
}
Clients.Others.enableManualImport(true);
}
}
Now I wonder, if this is "thread safe" per se, or if I need to do something with the progress.Report calls to prevent anything from going wrong.
From the docs:
Remarks
Any handler provided to the constructor or event handlers
registered with the ProgressChanged event are invoked through a
SynchronizationContext instance captured when the instance is
constructed. If there is no current SynchronizationContext at the time
of construction, the callbacks will be invoked on the ThreadPool.
For more information and a code example, see the article Async in 4.5:
Enabling Progress and Cancellation in Async APIs in the .NET Framework
blog.
Like anything else using the SynchronizationContext, it's safe to post from multiple threads.
Custom implementations of IProgress<T> should have their behavior defined.
On your question, internally, Progress only does invoking. It is up to the code you wrote to handle the progress on the other side. I would say that the line progress.Report(totalSteps * System.Threading.Interlocked.Increment(ref currentStep) * .1f); can cause a potential progress reporting issue due to the multiplication which is not atomic.
This is what happens internally inside Progress when you call Report
protected virtual void OnReport(T value)
{
// If there's no handler, don't bother going through the sync context.
// Inside the callback, we'll need to check again, in case
// an event handler is removed between now and then.
Action<T> handler = m_handler;
EventHandler<T> changedEvent = ProgressChanged;
if (handler != null || changedEvent != null)
{
// Post the processing to the sync context.
// (If T is a value type, it will get boxed here.)
m_synchronizationContext.Post(m_invokeHandlers, value);
}
}
On the code though, a better way to run in parallel is to use PLinq. In your current code, if the list contains many items, it will spin up tasks for every single item at the same time and wait for all of them to complete. However, in PLinq, the number of concurrent executions will be determined for you to optimize performance.
myElements.AsParallel().ForAll(async elem =>
{
var result = await SomeHeavyApiCall(elem);
DoSomethingWithThe(result);
progress.Report(totalSteps * System.Threading.Interlocked.Increment(ref currentStep) * .1f);
}
Please keep in mind that AsParallel().ForAll() will immediately return when using async func. So you might want to capture all the tasks and wait for them before you proceed.
One last thing, if your list is being edited while it is being processed, i recommend using ConcurrentQueue or ConcurrentDictionary or ConcurrentBag.
I've been trying to implement a simple producer-consumer pattern using Rx and observable collections. I also need to be able to throttle the number of subscribers easily. I have seen lots of references to LimitedConcurrencyLevelTaskScheduler in parallel extensions but I don't seem to be able to get this to use multiple threads.
I think I'm doing something silly so I was hoping someone could explain what. In the unit test below, I expect multiple (2) threads to be used to consume the strings in the blocking collection. What am I doing wrong?
[TestClass]
public class LimitedConcurrencyLevelTaskSchedulerTestscs
{
private ConcurrentBag<string> _testStrings = new ConcurrentBag<string>();
ConcurrentBag<int> _threadIds= new ConcurrentBag<int>();
[TestMethod]
public void WhenConsumingFromBlockingCollection_GivenLimitOfTwoThreads_TwoThreadsAreUsed()
{
// Setup the command queue for processing combinations
var commandQueue = new BlockingCollection<string>();
var taskFactory = new TaskFactory(new LimitedConcurrencyLevelTaskScheduler(2));
var scheduler = new TaskPoolScheduler(taskFactory);
commandQueue.GetConsumingEnumerable()
.ToObservable(scheduler)
.Subscribe(Go, ex => { throw ex; });
var iterationCount = 100;
for (int i = 0; i < iterationCount; i++)
{
commandQueue.Add(string.Format("string {0}", i));
}
commandQueue.CompleteAdding();
while (!commandQueue.IsCompleted)
{
Thread.Sleep(100);
}
Assert.AreEqual(iterationCount, _testStrings.Count);
Assert.AreEqual(2, _threadIds.Distinct().Count());
}
private void Go(string testString)
{
_testStrings.Add(testString);
_threadIds.Add(Thread.CurrentThread.ManagedThreadId);
}
}
Everyone seems to go through the same learning curve with Rx. The thing to understand is that Rx doesn't do parallel processing unless you explicitly make a query that forces parallelism. Schedulers do not introduce parallelism.
Rx has a contract of behaviour that says zero or more values are produced in series (regardless of how many threads might be used), one after another, with no overlap, finally to be followed by an optional single error or a single complete message, and then nothing else.
This is often written as OnNext*(OnError|OnCompleted).
All that schedulers do is define the rule to determine which thread a new value is processed on if the scheduler has no pending values it is processing for the current observable.
Now take your code:
var taskFactory = new TaskFactory(new LimitedConcurrencyLevelTaskScheduler(2));
var scheduler = new TaskPoolScheduler(taskFactory);
This says that the scheduler will run values for a subscription on one of two threads. But it doesn't mean that it will do this for every value produced. Remember, since values are produced in series, one after another, it is better to re-use an existing thread than to go to the high cost of creating a new thread. So what Rx does is re-use the existing thread if a new value is scheduled on the scheduler before the current value is finished being processed.
This is the key - it re-uses the thread if a new value is scheduled before the processing of existing values is complete.
So your code does this:
commandQueue.GetConsumingEnumerable()
.ToObservable(scheduler)
.Subscribe(Go, ex => { throw ex; });
It means that the scheduler will only create a thread when the first value comes along. But by the time the expensive thread creation operation is complete then the code that adds values to the commandQueue is also done so it has queued them all and hence it can more efficiently use a single thread rather than create a costly second one.
To avoid this you need to construct the query to introduce parallelism.
Here's how:
public void WhenConsumingFromBlockingCollection_GivenLimitOfTwoThreads_TwoThreadsAreUsed()
{
var taskFactory = new TaskFactory(new LimitedConcurrencyLevelTaskScheduler(2));
var scheduler = new TaskPoolScheduler(taskFactory);
var iterationCount = 100;
Observable
.Range(0, iterationCount)
.SelectMany(n => Observable.Start(() => n.ToString(), scheduler)
.Do(x => Go(x)))
.Wait();
(iterationCount == _testStrings.Count).Dump();
(2 == _threadIds.Distinct().Count()).Dump();
}
Now, I've used the Do(...)/.Wait() combo to give you the equivalent of a blocking .Subscribe(...) method.
This results is your asserts both returning true.
I have found that by modifying the subscription as follows I can add 5 subscribers but only two threads will process the contents of the collection so this serves my purpose.
for(int i = 0; i < 5; i++)
observable.Subscribe(Go, ex => { throw ex; });
I'd be interested to know if there is a better or more elegant way to achieve this!
I'm trying to create an async unit test for the project, but cannot understand how to wait for the async subject to complete:
[Test]
public async void MicroTest()
{
var value = 2;
var first = new AsyncSubject<int>();
var second = new AsyncSubject<int>();
first.Subscribe(_ =>
{
value = _;
second.OnCompleted();
});
first.OnNext(1);
// how to wait for the second subject to complete?
Assert.AreEqual(value, 1);
}
Sync version of this test is works well:
[Test]
public void MicroTest()
{
var value = 2;
var first = new Subject<int>();
var second = new Subject<int>();
first.Subscribe(_ =>
{
value = _;
second.OnCompleted();
});
first.OnNext(1);
Assert.AreEqual(value, 1);
}
AsyncSubject versus Subject
First off, it's worth pointing out that AsyncSubject<T> is not an asynchronous version of Subject<T>. Both are in fact free-threaded* (see footnote).
AsyncSubject is a specialization of Subject intended to be used to model an operation that completes asynchronously and returns a single result. It has two noteworthy features:
Only the last result is published
The result is cached and is available to observers subscribing after it has completed.
It is used internally in various places, including by the ToObservable() extension method defined on Task and Task<T>.
The issue with the test
Recall AsyncSubject<T> will only return the final result received. It does this by waiting for OnCompleted() so it knows what the final result is. Because you do not call OnCompleted() on first your test is flawed as the OnNext() handler - the lambda function passed in your Subscribe call - will never be invoked.
Additionally, it is invalid not to call OnNext() at least once on an AsyncSubject<T>, so when you call await second; you will get an InvalidOperationException if you haven't done this.
If you write your test as follows, all is well:
[Test]
public async void MicroTest()
{
var value = 2;
var first = new AsyncSubject<int>();
var second = new AsyncSubject<int>();
first.Subscribe(_ =>
{
// won't be called until an OnCompleted() has
// been invoked on first
value = _;
// you must send *some* value to second
second.OnNext(_);
second.OnCompleted();
});
first.OnNext(1);
// you must do this for OnNext handler to be called
first.OnCompleted();
// how to wait for the second subject to complete
await second;
Assert.AreEqual(value, 1);
}
About asynchronous tests
As a general rule I would avoid writing asynchronous tests that could wait forever. This gets particularly annoying when it causes resource drains on build servers. Use some kind of timeout e.g:
await second.Timeout(TimeSpan.FromSeconds(1));
No need to handle the exception since that is enough for the test to fail.
**I've borrowed this term from the COM lexicon. In this sense I mean that they, as with most of the Rx framework components, will generally run on whatever thread you happen to invoke their methods on. Being free-threaded doesn't necessarily mean being fully thread safe though. In particular, unlike AsyncSubject<T>, Subject<T> doesn't protect you from the Rx grammar violation of making overlapping calls to OnNext. Use Subject.Synchronize or Observable.Synchronize for this protection.*
I needed to alternate between two states with each state having a different interval time.
The best way I could think of doing this was to use Reactive Extensions' Observable.Generate
which is pretty awsome.
From what I read on msdn and other sites, Observable.Finally() should fire if the
observable "terminates gracefully or exceptionally". I was testing the following code
(in LINQPad) to see how it works, but I can not get .Finall() to fire at all.
var ia = TimeSpan.FromSeconds(1);
var ib = TimeSpan.FromSeconds(.2);
var start = DateTime.UtcNow;
var ct = new CancellationTokenSource();
var o = Observable.Generate(
true,
// s => !ct.IsCancellationRequested,
s => (DateTime.UtcNow-start) < TimeSpan.FromSeconds(3) && !ct.IsCancellationRequested,
s => !s,
s => s ? "on" : "off",
s => s? ib : ia)
// .TakeUntil(start+TimeSpan.FromSeconds(3))
.Concat(Observable.Return("end"));
o.Subscribe( s=> s.Dump(), ct.Token);
var t = o.ToTask(ct.Token);
t.ContinueWith(x => x.Dump("done"));
o.Finally(() => "finallY".Dump()); // never gets called?
Thread.Sleep(10000);
ct.Cancel();
If I make Thread.Sleep 10s, the observable sequence finishes and the Task.ContinueWith fires,
but not .Finally().
If I make Thread.Sleep 2s, the observable sequence is canceled and the Task.ContinueWith again fires,
but not .Finally().
Why not?
Look at the return type of the Finally method; should give you a hint. Just like the Concat method returns a new IObservable with the new sequence concatenated to it, but doesn't change the original, the Finally method returns a new IObservable that has that final action, but you're subscribing to the original IObservable. Put the following line in front of your Subscribe call and it'll work.
o = o.Finally(() => "finallY".Dump());
I agree it's an odd API choice though; I'd think of Finally as being more akin to Subscribe than to Concat. You're subscribing to the finally "event"; it's odd that the API forces you to create a completely new IObservable and then subscribe to that just to get the Finally thing to happen. Plus it allows a potential error (made evident if we use the function in your question) that if you subscribe twice to that new IObservable, your Finally function will execute twice. So you have to make sure that one of your subscriptions is on the "finallied" IObservable and the others are all on the original. Just seems unusual.
I guess the way to think about it is that Finally isn't meant to modify the observable, but rather to modify the subscription itself. i.e., they don't expect you typically to make openly-accessible named observables that have Finally things (var o = Observable.[...].Finally(...);) rather it's meant to go inline with the subscription call itself (var subscription = o.Finally(...).Subscribe(...);)
I am new to asynchronous programming. I have a C# dll with an asynchronous method that gets called, takes a function pointer (delegate) and calls this callback function after "result" is calculated.
public delegate void CreatedDelegate(Foo result);
public void CreateAsync(CreatedDelegate createdCallback)
{
Task t = Task.Factory.StartNew(() =>
{
Foo result = ...
createdCallback(result);
});
}
The delegate callback of type "CreatedDelegate" is (in my case) a function pointer to a C++/CLI method that works with the result.
void CreatedCallback(Foo^ result)
{
// do something with result
}
So this asynchronous concept seems to work quite well in most cases, but sometimes I encounter some errors. How can I achieve it if the function "CreateAsync" is called multiple times with different computation effort, that the resulting calls to "CreatedCallback" just happen in the same order as originally "CreateAsync" was called? To make it clearer: The first call to "CreateAsync" should result in the first call to "CreatedCallback" even if a succeeding call of "CreateAsync" is faster and would actually call the callback earlier.
Maybe this can be done by allowing only one active new thread in the asynchronous "CreateAsync" at a time?
To process the callbacks in order, you'll need to implement some queueing of work items. The easiest way is probably to use BlockingCollection type (see MSDN documentation).
Instead of calling the callback, your CreateAsync method would add the task (together with the callback) to the queue:
// Queue to keep tasks and their callbacks
private BlockingCollection<Tuple<Task<Foo>, CreatedDelegate>>
queue = new BlockingCollection<Tuple<Task<Foo>, CreatedDelegate>>()
public void CreateAsync(CreatedDelegate createdCallback) {
Task<Foo> t = Task.Factory.StartNew(() => {
Foo result = ...
return result; });
queue.Add(Tuple.Create(t, createdCallback));
// ..
}
This will only add tasks and callbacks to the queue - to actually call the callback, you'll need another task that waits for the tasks in the queue (in the order in which they were added) and calls the callback:
Task.Factory.StartNew(() => {
while(true) { // while you keep calling 'CreateAsync'
// Get next task (in order) and its callback
Tuple<Task<Foo>, CreatedDelegate> op = queue.Take();
// Wait for the result and give it to callback
op.Item2(op.Item1.Result);
}
}
If order is important, then using Threads might be better:
thread queue = empty
for each task
{
if there are no free 'cpu'
wait on first thread in queue
remove thread from queue
call delegate
create thread
add thread to queue
}
while queue has threads
wait on first thread in queue
remove thread from queue
call delegate