Subject.HasObservers is not immediately true in the sample code attached for an undetermined number of ticks. If I take out the SubscribeOn(), HasObservers is always true, so I know it's to do with IScheduler initialization.
This was causing a problem in our production software where the first few calls to OnNext() were going nowhere despite a guarantee that the IDisposable subscription variable had been initialized before the thread that called OnNext() was allowed to proceed. Is this a bug in RX?
What are other ways to use System.Reactive classes to guarantee the subscription is setup with a scheduler without polling?
I have tried Subject.Synchronize(), but that made no difference.
static void Main(string[] args)
{
for (int i = 0; i < 100; i++)
{
var source = new Subject<long>();
IDisposable subscription = source
.SubscribeOn(ThreadPoolScheduler.Instance)
.Subscribe(Console.WriteLine);
// 0 and 668,000 ticks for subscription setup, but rarely 0.
int iterations = 0;
while (!source.HasObservers)
{
iterations++;
Thread.SpinWait(1);
}
// Next line would rarely output to Console without while loop
source.OnNext(iterations);
subscription.Dispose();
source.Dispose();
}
}
I expected Subject.HasObservers to be true without polling.
As I understand, the problem is that your subscription is done asynchronously: the call is not blocked, so the real subscription will be done later on other thread.
I didn't find the exact way of knowing if the subscription has really landed (it might be even not possible at all). If your problem is the race between the first OnNext and subscription, than maybe you need to convert your Observable into a Connectable Observable using Replay() + Connect(). This way you'll ensure that every subscriber gets exactly the same sequence.
using (var source = new Subject<long>())
{
var connectableSource = source.Replay();
connectableSource.Connect();
using (var subscription = connectableSource
.SubscribeOn(ThreadPoolScheduler.Instance)
.Subscribe(Console.WriteLine))
{
source.OnNext(42); // outputs 42 always
Console.ReadKey(false);
}
}
In my code I still need Console.ReadKey because of the race between subscription done on the other thread and unsubscription.
The solution I came up with for now that I'm hoping someone can improve upon:
public class SubscribedSubject<T> : ISubject<T>, IDisposable
{
private readonly Subject<T> _subject = new Subject<T>();
private readonly ManualResetEventSlim _subscribed = new ManualResetEventSlim();
public bool HasObservers => _subject.HasObservers;
public void Dispose() => _subject.Dispose();
public void OnCompleted() => Wait().OnCompleted();
public void OnError(Exception error) => Wait().OnError(error);
public void OnNext(T value) => Wait().OnNext(value);
public IDisposable Subscribe(IObserver<T> observer)
{
IDisposable disposable = _subject.Subscribe(observer);
_subscribed.Set();
return disposable;
}
private Subject<T> Wait()
{
_subscribed.Wait();
return _subject;
}
}
Example use:
using (var source = new SubscribedSubject<long>())
{
using (source
.SubscribeOn(ThreadPoolScheduler.Instance)
.Subscribe(Console.WriteLine))
{
source.OnNext(42);
Console.ReadKey();
}
}
Related
I want to use IAsyncEnumerable like a Source for akka streams. But I not found, how do it.
No sutiable method in Source class for this code.
using System.Collections.Generic;
using System.Threading.Tasks;
using Akka.Streams.Dsl;
namespace ConsoleApp1
{
class Program
{
static async Task Main(string[] args)
{
Source.From(await AsyncEnumerable())
.Via(/*some action*/)
//.....
}
private static async IAsyncEnumerable<int> AsyncEnumerable()
{
//some async enumerable
}
}
}
How use IAsyncEnumerbale for Source?
This has been done in the past as a part of Akka.NET Streams contrib package, but since I don't see it there anymore, let's go through on how to implement such source. The topic can be quite long, as:
Akka.NET Streams is really about graph processing - we're talking about many-inputs/many-outputs configurations (in Akka.NET they're called inlets and outlets) with support for cycles in graphs.
Akka.NET is not build on top of .NET async/await or even on top of .NET standard thread pool library - they're both pluggable, which means that the lowest barier is basically using callbacks and encoding what C# compiler sometimes does for us.
Akka.NET streams is capable of both pushing and pulling values between stages/operators. IAsyncEnumerable<T> can only pull data while IObservable<T> can only push it, so we get more expressive power here, but this comes at a cost.
The basics of low level API used to implement custom stages can be found in the docs.
The starter boilerplate looks like this:
public static class AsyncEnumerableExtensions {
// Helper method to change IAsyncEnumerable into Akka.NET Source.
public static Source<T, NotUsed> AsSource<T>(this IAsyncEnumerable<T> source) =>
Source.FromGraph(new AsyncEnumerableSource<T>(source));
}
// Source stage is description of a part of the graph that doesn't consume
// any data, only produce it using a single output channel.
public sealed class AsyncEnumerableSource<T> : GraphStage<SourceShape<T>>
{
private readonly IAsyncEnumerable<T> _enumerable;
public AsyncEnumerableSource(IAsyncEnumerable<T> enumerable)
{
_enumerable = enumerable;
Outlet = new Outlet<T>("asyncenumerable.out");
Shape = new SourceShape<T>(Outlet);
}
public Outlet<T> Outlet { get; }
public override SourceShape<T> Shape { get; }
/// Logic if to a graph stage, what enumerator is to enumerable.
protected override GraphStageLogic CreateLogic(Attributes inheritedAttributes) =>
new Logic(this);
sealed class Logic: OutGraphStageLogic
{
public override void OnPull()
{
// method called whenever a consumer asks for new data
}
public override void OnDownstreamFinish()
{
// method called whenever a consumer stage finishes,used for disposals
}
}
}
As mentioned, we don't use async/await straight away here: even more, calling Logic methods in asynchronous context is unsafe. To make it safe we need to register out methods that may be called from other threads using GetAsyncCallback<T> and call them via returned wrappers. This will ensure, that not data races will happen when executing asynchronous code.
sealed class Logic : OutGraphStageLogic
{
private readonly Outlet<T> _outlet;
// enumerator we'll call for MoveNextAsync, and eventually dispose
private readonly IAsyncEnumerator<T> _enumerator;
// callback called whenever _enumerator.MoveNextAsync completes asynchronously
private readonly Action<Task<bool>> _onMoveNext;
// callback called whenever _enumerator.DisposeAsync completes asynchronously
private readonly Action<Task> _onDisposed;
// cache used for errors thrown by _enumerator.MoveNextAsync, that
// should be rethrown after _enumerator.DisposeAsync
private Exception? _failReason = null;
public Logic(AsyncEnumerableSource<T> source) : base(source.Shape)
{
_outlet = source.Outlet;
_enumerator = source._enumerable.GetAsyncEnumerator();
_onMoveNext = GetAsyncCallback<Task<bool>>(OnMoveNext);
_onDisposed = GetAsyncCallback<Task>(OnDisposed);
}
// ... other methods
}
The last part to do are methods overriden on `Logic:
OnPull used whenever the downstream stage calls for new data. Here we need to call for next element of async enumerator sequence.
OnDownstreamFinish called whenever the downstream stage has finished and will not ask for any new data. It's the place for us to dispose our enumerator.
Thing is these methods are not async/await, while their enumerator's equivalent are. What we basically need to do there is to:
Call corresponding async methods of underlying enumerator (OnPull → MoveNextAsync and OnDownstreamFinish → DisposeAsync).
See, if we can take their results immediately - it's important part that usually is done for us as part of C# compiler in async/await calls.
If not, and we need to wait for the results - call ContinueWith to register our callback wrappers to be called once async methods are done.
sealed class Logic : OutGraphStageLogic
{
// ... constructor and fields
public override void OnPull()
{
var hasNext = _enumerator.MoveNextAsync();
if (hasNext.IsCompletedSuccessfully)
{
// first try short-path: values is returned immediately
if (hasNext.Result)
// check if there was next value and push it downstream
Push(_outlet, _enumerator.Current);
else
// if there was none, we reached end of async enumerable
// and we can dispose it
DisposeAndComplete();
}
else
// we need to wait for the result
hasNext.AsTask().ContinueWith(_onMoveNext);
}
// This method is called when another stage downstream has been completed
public override void OnDownstreamFinish() =>
// dispose enumerator on downstream finish
DisposeAndComplete();
private void DisposeAndComplete()
{
var disposed = _enumerator.DisposeAsync();
if (disposed.IsCompletedSuccessfully)
{
// enumerator disposal completed immediately
if (_failReason is not null)
// if we close this stream in result of error in MoveNextAsync,
// fail the stage
FailStage(_failReason);
else
// we can close the stage with no issues
CompleteStage();
}
else
// we need to await for enumerator to be disposed
disposed.AsTask().ContinueWith(_onDisposed);
}
private void OnMoveNext(Task<bool> task)
{
// since this is callback, it will always be completed, we just need
// to check for exceptions
if (task.IsCompletedSuccessfully)
{
if (task.Result)
// if task returns true, it means we read a value
Push(_outlet, _enumerator.Current);
else
// otherwise there are no more values to read and we can close the source
DisposeAndComplete();
}
else
{
// task either failed or has been cancelled
_failReason = task.Exception as Exception ?? new TaskCanceledException(task);
FailStage(_failReason);
}
}
private void OnDisposed(Task task)
{
if (task.IsCompletedSuccessfully) CompleteStage();
else {
var reason = task.Exception as Exception
?? _failReason
?? new TaskCanceledException(task);
FailStage(reason);
}
}
}
As of Akka.NET v1.4.30 this is now natively supported inside Akka.Streams via the RunAsAsyncEnumerable method:
var input = Enumerable.Range(1, 6).ToList();
var cts = new CancellationTokenSource();
var token = cts.Token;
var asyncEnumerable = Source.From(input).RunAsAsyncEnumerable(Materializer);
var output = input.ToArray();
bool caught = false;
try
{
await foreach (var a in asyncEnumerable.WithCancellation(token))
{
cts.Cancel();
}
}
catch (OperationCanceledException e)
{
caught = true;
}
caught.ShouldBeTrue();
I copied that sample from the Akka.NET test suite, in case you're wondering.
You can also use an existing primitive for streaming large collections of data. Here is an example of using Source.unfoldAsync to stream pages of data - in this case github repositories using Octokit - until there is no more.
var source = Source.UnfoldAsync<int, RepositoryPage>(startPage, page =>
{
var pageTask = client.GetRepositoriesAsync(page, pageSize);
var next = pageTask.ContinueWith(task =>
{
var page = task.Result;
if (page.PageNumber * pageSize > page.Total) return Option<(int, RepositoryPage)>.None;
else return new Option<(int, RepositoryPage)>((page.PageNumber + 1, page));
});
return next;
});
To run
using var sys = ActorSystem.Create("system");
using var mat = sys.Materializer();
int startPage = 1;
int pageSize = 50;
var client = new GitHubClient(new ProductHeaderValue("github-search-app"));
var source = ...
var sink = Sink.ForEach<RepositoryPage>(Console.WriteLine);
var result = source.RunWith(sink, mat);
await result.ContinueWith(_ => sys.Terminate());
class Page<T>
{
public Page(IReadOnlyList<T> contents, int page, long total)
{
Contents = contents;
PageNumber = page;
Total = total;
}
public IReadOnlyList<T> Contents { get; set; } = new List<T>();
public int PageNumber { get; set; }
public long Total { get; set; }
}
class RepositoryPage : Page<Repository>
{
public RepositoryPage(IReadOnlyList<Repository> contents, int page, long total)
: base(contents, page, total)
{
}
public override string ToString() =>
$"Page {PageNumber}\n{string.Join("", Contents.Select(x => x.Name + "\n"))}";
}
static class GitHubClientExtensions
{
public static async Task<RepositoryPage> GetRepositoriesAsync(this GitHubClient client, int page, int size)
{
// specify a search term here
var request = new SearchRepositoriesRequest("bootstrap")
{
Page = page,
PerPage = size
};
var result = await client.Search.SearchRepo(request);
return new RepositoryPage(result.Items, page, result.TotalCount);
}
}
Attempt:
public class KeyLock : IDisposable
{
private string key;
private static ISet<string> lockedKeys = new HashSet<string>();
private static object locker1 = new object();
private static object locker2 = new object();
public KeyLock(string key)
{
lock(locker2)
{
// wait for key to be freed up
while(lockedKeys.Contains(key));
this.lockedKeys.Add(this.key = key);
}
}
public void Dispose()
{
lock(locker)
{
lockedKeys.Remove(this.key);
}
}
}
to be used like
using(new KeyLock(str))
{
// section that is critical based on str
}
I test by firing the method twice in the same timespan
private async Task DoStuffAsync(string str)
{
using(new KeyLock(str))
{
await Task.Delay(1000);
}
}
// ...
await Task.WhenAll(DoStuffAsync("foo"), DoStuffAsync("foo"))
but, strangely enough, when I debug I see that the second time it goes straight through the lock and in fact somehow lockedKeys.Contains(key) evaluates to false even through I can see in my debugger windows that the key is there.
Where is the flaw and how do I fix it?
Take a look at lock statement (C# Reference)
It basically breaks down to
object __lockObj = x;
bool __lockWasTaken = false;
try
{
System.Threading.Monitor.Enter(__lockObj, ref __lockWasTaken);
// Your code...
}
finally
{
if (__lockWasTaken) System.Threading.Monitor.Exit(__lockObj);
}
Enter(Object)
Acquires an exclusive lock on the specified object.
What you need to do instead, is keep around and obtain the same reference. You could probably use a thread safe Dictionary ConcurrentDictionary
public static ConcurrentDictionary<string, object> LockMap = new ConcurrentDictionary<string, object> ();
...
lock (LockMap.GetOrAdd(str, x => new object ()))
{
// do locky stuff
}
Note : This is just one example of many ways to do this, you will obviously need to tweak it for your needs
The main problems that I notice are as follows:
※Super dangerous infinite loop in the constructor, and super wasteful as well.
※When accessing the private field lockedKeys, you use different objects to lock on→ Not good
However, why your code does not seem to be working I think is because of the short delay you set. Since it is only 1 second of delay during the debugging when you step from statement to statement, 1 second already passes and it gets disposed.
using(new KeyLock(str)){
await Task.Delay(1000);
}
Luckily for you, I came across a similar problem before and I have a solution, too. Look here for my small solution.
Usage:
//Resource to be shared
private AsyncLock _asyncLock = new AsyncLock();
....
....
private async Task DoStuffAsync()
{
using(await _asyncLock.LockAsync())
{
await Task.Delay(1000);
}
}
// ...
await Task.WhenAll(DoStuffAsync(), DoStuffAsync())
Here's what I'm trying to do:
Keep a queue in memory of items that need processed (i.e. IsProcessed = 0)
Every 5 seconds, get unprocessed items from the db, and if they're not already in the queue, add them
Continuous pull items from the queue, process them, and each time an item is processed, update it in the db (IsProcessed = 1)
Do this all "as parallel as possible"
I have a constructor for my service like
public MyService()
{
Ticker.Elapsed += FillQueue;
}
and I start that timer when the service starts like
protected override void OnStart(string[] args)
{
Ticker.Enabled = true;
Task.Run(() => { ConsumeWork(); });
}
and my FillQueue is like
private static async void FillQueue(object source, ElapsedEventArgs e)
{
var items = GetUnprocessedItemsFromDb();
foreach(var item in items)
{
if(!Work.Contains(item))
{
Work.Enqueue(item);
}
}
}
and my ConsumeWork is like
private static void ConsumeWork()
{
while(true)
{
if(Work.Count > 0)
{
var item = Work.Peek();
Process(item);
Work.Dequeue();
}
else
{
Thread.Sleep(500);
}
}
}
However this is probably a naive implementation and I'm wondering whether .NET has any type of class that is exactly what I need for this type of situation.
Though #JSteward' answer is a good start, you can improve it with mixing up the TPL-Dataflow and Rx.NET extensions, as a dataflow block may easily become an observer for your data, and with Rx Timer it will be much less effort for you (Rx.Timer explanation).
We can adjust MSDN article for your needs, like this:
private const int EventIntervalInSeconds = 5;
private const int DueIntervalInSeconds = 60;
var source =
// sequence of Int64 numbers, starting from 0
// https://msdn.microsoft.com/en-us/library/hh229435.aspx
Observable.Timer(
// fire first event after 1 minute waiting
TimeSpan.FromSeconds(DueIntervalInSeconds),
// fire all next events each 5 seconds
TimeSpan.FromSeconds(EventIntervalInSeconds))
// each number will have a timestamp
.Timestamp()
// each time we select some items to process
.SelectMany(GetItemsFromDB)
// filter already added
.Where(i => !_processedItemIds.Contains(i.Id));
var action = new ActionBlock<Item>(ProcessItem, new ExecutionDataflowBlockOptions
{
// we can start as many item processing as processor count
MaxDegreeOfParallelism = Environment.ProcessorCount,
});
IDisposable subscription = source.Subscribe(action.AsObserver());
Also, your check for item being already processed isn't quite accurate, as there is a possibility to item get selected as unprocessed from db right at the time you've finished it's processing, yet didn't update it in database. In this case item will be removed from Queue<T>, and after that added there again by producer, this is why I've added the ConcurrentBag<T> to this solution (HashSet<T> isn't thread-safe):
private static async Task ProcessItem(Item item)
{
if (_processedItemIds.Contains(item.Id))
{
return;
}
_processedItemIds.Add(item.Id);
// actual work here
// save item as processed in database
// we need to wait to ensure item not to appear in queue again
await Task.Delay(TimeSpan.FromSeconds(EventIntervalInSeconds * 2));
// clear the processed cache to reduce memory usage
_processedItemIds.Remove(item.Id);
}
public class Item
{
public Guid Id { get; set; }
}
// temporary cache for items in process
private static ConcurrentBag<Guid> _processedItemIds = new ConcurrentBag<Guid>();
private static IEnumerable<Item> GetItemsFromDB(Timestamped<long> time)
{
// log event timing
Console.WriteLine($"Event # {time.Value} at {time.Timestamp}");
// return items from DB
return new[] { new Item { Id = Guid.NewGuid() } };
}
You can implement cache clean up in other way, for example, start a "GC" timer, which will remove processed items from cache on regular basis.
To stop events and processing items you should Dispose the subscription and, maybe, Complete the ActionBlock:
subscription.Dispose();
action.Complete();
You can find more information about Rx.Net in their guidelines on github.
You could use an ActionBlock to do your processing, it has a built in queue that you can post work to. You can read up on tpl-dataflow here: Intro to TPL-Dataflow also Introduction to Dataflow, Part 1. Finally, this is a quick sample to get you going. I've left out a lot but it should at least get you started.
using System;
using System.Threading;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;
namespace MyWorkProcessor {
public class WorkProcessor {
public WorkProcessor() {
Processor = CreatePipeline();
}
public async Task StartProcessing() {
try {
await Task.Run(() => GetWorkFromDatabase());
} catch (OperationCanceledException) {
//handle cancel
}
}
private CancellationTokenSource cts {
get;
set;
}
private ITargetBlock<WorkItem> Processor {
get;
}
private TimeSpan DatabasePollingFrequency {
get;
} = TimeSpan.FromSeconds(5);
private ITargetBlock<WorkItem> CreatePipeline() {
var options = new ExecutionDataflowBlockOptions() {
BoundedCapacity = 100,
CancellationToken = cts.Token
};
return new ActionBlock<WorkItem>(item => ProcessWork(item), options);
}
private async Task GetWorkFromDatabase() {
while (!cts.IsCancellationRequested) {
var work = await GetWork();
await Processor.SendAsync(work);
await Task.Delay(DatabasePollingFrequency);
}
}
private async Task<WorkItem> GetWork() {
return await Context.GetWork();
}
private void ProcessWork(WorkItem item) {
//do processing
}
}
}
I have an application running an observable interval with multiple observers. Every 0.5s the interval loads some XML data from a web server, then the observers do some application specific processing on a background thread. Once the data isn't needed anymore, the subscriptions and the interval observable get disposed, so observer's OnNext/OnCompleted/OnError won't be called anymore. So far so good.
My problem: In some rare cases it is possible, that after calling Dispose my observer's OnNext method is still running! Before proceeding to further operations after disposing, I would like to ensure that OnNext has been completed.
My current solution: I've introduced a locker field in my observer class (see code). After disposing I try to acquire a lock and continue only after the lock has been acquired. While this solution works (?), it somehow just feels wrong to me.
Question: Is there a more elegant, more "Rx Way" to solve this problem?
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
namespace RxExperimental
{
internal sealed class MyXmlDataFromWeb
{
public string SomeXmlDataFromWeb { get; set; }
}
internal sealed class MyObserver : IObserver<MyXmlDataFromWeb>
{
private readonly object _locker = new object();
private readonly string _observerName;
public MyObserver(string observerName) {
this._observerName = observerName;
}
public object Locker {
get { return this._locker; }
}
public void OnCompleted() {
lock (this._locker) {
Console.WriteLine("{0}: Completed.", this._observerName);
}
}
public void OnError(Exception error) {
lock (this._locker) {
Console.WriteLine("{0}: An error occured: {1}", this._observerName, error.Message);
}
}
public void OnNext(MyXmlDataFromWeb value) {
lock (this._locker) {
Console.WriteLine(" {0}: OnNext running on thread {1}... ", this._observerName, Thread.CurrentThread.ManagedThreadId);
Console.WriteLine(" {0}: XML received: {1}", this._observerName, value.SomeXmlDataFromWeb);
Thread.Sleep(5000); // simulate some long running operation
Console.WriteLine(" {0}: OnNext running on thread {1}... Done.", this._observerName, Thread.CurrentThread.ManagedThreadId);
}
}
}
internal sealed class Program
{
private static void Main() {
const int interval = 500;
//
var dataSource = Observable.Interval(TimeSpan.FromMilliseconds(interval), NewThreadScheduler.Default).Select(_ => {
var data = new MyXmlDataFromWeb {
SomeXmlDataFromWeb = String.Format("<timestamp>{0:yyyy.MM.dd HH:mm:ss:fff}</timestamp>", DateTime.Now)
};
return data;
}).Publish();
//
var observer1 = new MyObserver("Observer 1");
var observer2 = new MyObserver("Observer 2");
//
var subscription1 = dataSource.ObserveOn(NewThreadScheduler.Default).Subscribe(observer1);
var subscription2 = dataSource.ObserveOn(NewThreadScheduler.Default).Subscribe(observer2);
//
var connection = dataSource.Connect();
//
Console.WriteLine("Press any key to cancel ...");
Console.ReadLine();
//
subscription1.Dispose();
subscription2.Dispose();
connection.Dispose();
//
lock (observer1.Locker) {
Console.WriteLine("Observer 1 completed.");
}
lock (observer2.Locker) {
Console.WriteLine("Observer 2 completed.");
}
//
Console.WriteLine("Can only be executed, after all observers completed.");
}
}
}
Yes, there's a more Rx-way of doing this.
The first observation is that unsubscribing from an observable stream is essentially independent from what is currently happening within the observer. There's not really any feedback. Since you have the requirement that you know definitively when the observations have ended, you need to model this into your observable stream. In other words, instead of unsubscribing from the stream, you should complete the stream so you will be able to observe the OnComplete event. In your case, you can use TakeUntil to end the observable instead of unsubscribing from it.
The second observation is that your main program needs to observe when your "observer" finishes its work. But since you made your "observer" an actual IObservable, you don't really have a way to do this. This is a common source of confusion I see when people first start using Rx. If you model your "observer" as just another link in the observable chain, then your main program can observer. Specifically, your "observer" is nothing more than a mapping operation (with side effects) that maps incoming Xml data into "done" messages.
So if you refactor your code, you can get what you want...
public class MyObserver
{
private readonly string _name;
public MyObserver(string name) { _name = name; }
public IObservable<Unit> Handle(IObservable<MyXmlDataFromWeb source)
{
return source.Select(value =>
{
Thread.Sleep(5000); // simulate work
return Unit.Default;
});
}
}
// main
var endSignal = new Subject<Unit>();
var dataSource = Observable
.Interval(...)
.Select(...)
.TakeUntil(endSignal)
.Publish();
var observer1 = new MyObserver("Observer 1");
var observer2 = new MyObserver("Observer 2");
var results1 = observer1.Handle(dataSource.ObserveOn(...));
var results2 = observer2.Handle(dataSource.ObserveOn(...));
// since you just want to know when they are all done
// just merge them.
// use ToTask() to subscribe them and collect the results
// as a Task
var processingDone = results1.Merge(results2).Count().ToTask();
dataSource.Connect();
Console.WriteLine("Press any key to cancel ...");
Console.ReadLine();
// end the stream
endSignal.OnNext(Unit.Default);
// wait for the processing to complete.
// use await, or Task.Result
var numProcessed = await processingDone;
I have a method that queues some work to be executed asynchronously. I'd like to return some sort of handle to the caller that can be polled, waited on, or used to fetch the return value from the operation, but I can't find a class or interface that's suitable for the task.
BackgroundWorker comes close, but it's geared to the case where the worker has its own dedicated thread, which isn't true in my case. IAsyncResult looks promising, but the provided AsyncResult implementation is also unusable for me. Should I implement IAsyncResult myself?
Clarification:
I have a class that conceptually looks like this:
class AsyncScheduler
{
private List<object> _workList = new List<object>();
private bool _finished = false;
public SomeHandle QueueAsyncWork(object workObject)
{
// simplified for the sake of example
_workList.Add(workObject);
return SomeHandle;
}
private void WorkThread()
{
// simplified for the sake of example
while (!_finished)
{
foreach (object workObject in _workList)
{
if (!workObject.IsFinished)
{
workObject.DoSomeWork();
}
}
Thread.Sleep(1000);
}
}
}
The QueueAsyncWork function pushes a work item onto the polling list for a dedicated work thread, of which there will only over be one. My problem is not with writing the QueueAsyncWork function--that's fine. My question is, what do I return to the caller? What should SomeHandle be?
The existing .Net classes for this are geared towards the situation where the asynchronous operation can be encapsulated in a single method call that returns. That's not the case here--all of the work objects do their work on the same thread, and a complete work operation might span multiple calls to workObject.DoSomeWork(). In this case, what's a reasonable approach for offering the caller some handle for progress notification, completion, and getting the final outcome of the operation?
Yes, implement IAsyncResult (or rather, an extended version of it, to provide for progress reporting).
public class WorkObjectHandle : IAsyncResult, IDisposable
{
private int _percentComplete;
private ManualResetEvent _waitHandle;
public int PercentComplete {
get {return _percentComplete;}
set
{
if (value < 0 || value > 100) throw new InvalidArgumentException("Percent complete should be between 0 and 100");
if (_percentComplete = 100) throw new InvalidOperationException("Already complete");
if (value == 100 && Complete != null) Complete(this, new CompleteArgs(WorkObject));
_percentComplete = value;
}
public IWorkObject WorkObject {get; private set;}
public object AsyncState {get {return WorkObject;}}
public bool IsCompleted {get {return _percentComplete == 100;}}
public event EventHandler<CompleteArgs> Complete; // CompleteArgs in a usual pattern
// you may also want to have Progress event
public bool CompletedSynchronously {get {return false;}}
public WaitHandle
{
get
{
// initialize it lazily
if (_waitHandle == null)
{
ManualResetEvent newWaitHandle = new ManualResetEvent(false);
if (Interlocked.CompareExchange(ref _waitHandle, newWaitHandle, null) != null)
newWaitHandle.Dispose();
}
return _waitHandle;
}
}
public void Dispose()
{
if (_waitHandle != null)
_waitHandle.Dispose();
// dispose _workObject too, if needed
}
public WorkObjectHandle(IWorkObject workObject)
{
WorkObject = workObject;
_percentComplete = 0;
}
}
public class AsyncScheduler
{
private Queue<WorkObjectHandle> _workQueue = new Queue<WorkObjectHandle>();
private bool _finished = false;
public WorkObjectHandle QueueAsyncWork(IWorkObject workObject)
{
var handle = new WorkObjectHandle(workObject);
lock(_workQueue)
{
_workQueue.Enqueue(handle);
}
return handle;
}
private void WorkThread()
{
// simplified for the sake of example
while (!_finished)
{
WorkObjectHandle handle;
lock(_workQueue)
{
if (_workQueue.Count == 0) break;
handle = _workQueue.Dequeue();
}
try
{
var workObject = handle.WorkObject;
// do whatever you want with workObject, set handle.PercentCompleted, etc.
}
finally
{
handle.Dispose();
}
}
}
}
If I understand correctly you have a collection of work objects (IWorkObject) that each complete a task via multiple calls to a DoSomeWork method. When an IWorkObject object has finished its work you'd like to respond to that somehow and during the process you'd like to respond to any reported progress?
In that case I'd suggest you take a slightly different approach. You could take a look at the Parallel Extension framework (blog). Using the framework, you could write something like this:
public void QueueWork(IWorkObject workObject)
{
Task.TaskFactory.StartNew(() =>
{
while (!workObject.Finished)
{
int progress = workObject.DoSomeWork();
DoSomethingWithReportedProgress(workObject, progress);
}
WorkObjectIsFinished(workObject);
});
}
Some things to note:
QueueWork now returns void. The reason for this is that the actions that occur when progress is reported or when the task completes have become part of the thread that executes the work. You could of course return the Task that the factory creates and return that from the method (to enable polling for example).
The progress-reporting and finish-handling are now part of the thread because you should always avoid polling when possible. Polling is more expensive because usually you either poll too frequently (too early) or not often enough (too late). There is no reason you can't report on the progress and finishing of the task from within the thread that is running the task.
The above could also be implemented using the (lower level) ThreadPool.QueueUserWorkItem method.
Using QueueUserWorkItem:
public void QueueWork(IWorkObject workObject)
{
ThreadPool.QueueUserWorkItem(() =>
{
while (!workObject.Finished)
{
int progress = workObject.DoSomeWork();
DoSomethingWithReportedProgress(workObject, progress);
}
WorkObjectIsFinished(workObject);
});
}
The WorkObject class can contain the properties that need to be tracked.
public class WorkObject
{
public PercentComplete { get; private set; }
public IsFinished { get; private set; }
public void DoSomeWork()
{
// work done here
this.PercentComplete = 50;
// some more work done here
this.PercentComplete = 100;
this.IsFinished = true;
}
}
Then in your example:
Change the collection from a List to a Dictionary that can hold Guid values (or any other means of uniquely identifying the value).
Expose the correct WorkObject's properties by having the caller pass the Guid that it received from QueueAsyncWork.
I'm assuming that you'll start WorkThread asynchronously (albeit, the only asynchronous thread); plus, you'll have to make retrieving the dictionary values and WorkObject properties thread-safe.
private Dictionary<Guid, WorkObject> _workList =
new Dictionary<Guid, WorkObject>();
private bool _finished = false;
public Guid QueueAsyncWork(WorkObject workObject)
{
Guid guid = Guid.NewGuid();
// simplified for the sake of example
_workList.Add(guid, workObject);
return guid;
}
private void WorkThread()
{
// simplified for the sake of example
while (!_finished)
{
foreach (WorkObject workObject in _workList)
{
if (!workObject.IsFinished)
{
workObject.DoSomeWork();
}
}
Thread.Sleep(1000);
}
}
// an example of getting the WorkObject's property
public int GetPercentComplete(Guid guid)
{
WorkObject workObject = null;
if (!_workList.TryGetValue(guid, out workObject)
throw new Exception("Unable to find Guid");
return workObject.PercentComplete;
}
The simplest way to do this is described here. Suppose you have a method string DoSomeWork(int). You then create a delegate of the correct type, for example:
Func<int, string> myDelegate = DoSomeWork;
Then you call the BeginInvoke method on the delegate:
int parameter = 10;
myDelegate.BeginInvoke(parameter, Callback, null);
The Callback delegate will be called once your asynchronous call has completed. You can define this method as follows:
void Callback(IAsyncResult result)
{
var asyncResult = (AsyncResult) result;
var #delegate = (Func<int, string>) asyncResult.AsyncDelegate;
string methodReturnValue = #delegate.EndInvoke(result);
}
Using the described scenario, you can also poll for results or wait on them. Take a look at the url I provided for more info.
Regards,
Ronald
If you don't want to use async callbacks, you can use an explicit WaitHandle, such as a ManualResetEvent:
public abstract class WorkObject : IDispose
{
ManualResetEvent _waitHandle = new ManualResetEvent(false);
public void DoSomeWork()
{
try
{
this.DoSomeWorkOverride();
}
finally
{
_waitHandle.Set();
}
}
protected abstract DoSomeWorkOverride();
public void WaitForCompletion()
{
_waitHandle.WaitOne();
}
public void Dispose()
{
_waitHandle.Dispose();
}
}
And in your code you could say
using (var workObject = new SomeConcreteWorkObject())
{
asyncScheduler.QueueAsyncWork(workObject);
workObject.WaitForCompletion();
}
Don't forget to call Dispose on your workObject though.
You can always use alternate implementations which create a wrapper like this for every work object, and who call _waitHandle.Dispose() in WaitForCompletion(), you can lazily instantiate the wait handle (careful: race conditions ahead), etc. (That's pretty much what BeginInvoke does for delegates.)