Processing a batch of request with Reactive Extension - c#

I'm learning Reactive Extensions, and I've been trying to find out if it's a match for a task like this.
I have a Process() method that processes a batch of requests as a unit of work, and invoking a callback when all requests have completed.
The important thing here is that each request will call the callback either synchronous or asynchronous depending on it's implementation, and the batch processor must be able to handle both.
But no threads are started from the batch processor, any new threads (or other async execution) will be initiated from inside the request handlers if necessary. I don't know if this match the use cases of rx.
My current working code looks (almost) like this:
public void Process(ICollection<IRequest> requests, Action<List<IResponse>> onCompleted)
{
IUnitOfWork uow = null;
try
{
uow = unitOfWorkFactory.Create();
var responses = new List<IResponse>();
var outstandingRequests = requests.Count;
foreach (var request in requests)
{
var correlationId = request.CorrelationId;
Action<IResponse> requestCallback = response =>
{
response.CorrelationId = correlationId;
responses.Add(response);
outstandingRequests--;
if (outstandingRequests != 0)
return;
uow.Commit();
onCompleted(responses);
};
requestProcessor.Process(request, requestCallback);
}
}
catch(Exception)
{
if (uow != null)
uow.Rollback();
}
if (uow != null)
uow.Commit();
}
How would you implement this using rx? Is it reasonable?
Note, that the unit of work is to be committed synchronously even if there are async requests that have not yet returned.

My approach to this is two-step.
First create a general-purpose operator that turns Action<T, Action<R>> into Func<T, IObservable<R>>:
public static class ObservableEx
{
public static Func<T, IObservable<R>> FromAsyncCallbackPattern<T, R>(
this Action<T, Action<R>> call)
{
if (call == null) throw new ArgumentNullException("call");
return t =>
{
var subject = new AsyncSubject<R>();
try
{
Action<R> callback = r =>
{
subject.OnNext(r);
subject.OnCompleted();
};
call(t, callback);
}
catch (Exception ex)
{
return Observable.Throw<R>(ex, Scheduler.ThreadPool);
}
return subject.AsObservable<R>();
};
}
}
Next, turn the call void Process(ICollection<IRequest> requests, Action<List<IResponse>> onCompleted) into IObservable<IResponse> Process(IObservable<IRequest> requests):
public IObservable<IResponse> Process(IObservable<IRequest> requests)
{
Func<IRequest, IObservable<IResponse>> rq2rp =
ObservableEx.FromAsyncCallbackPattern
<IRequest, IResponse>(requestProcessor.Process);
var query = (
from rq in requests
select rq2rp(rq)).Concat();
var uow = unitOfWorkFactory.Create();
var subject = new ReplaySubject<IResponse>();
query.Subscribe(
r => subject.OnNext(r),
ex =>
{
uow.Rollback();
subject.OnError(ex);
},
() =>
{
uow.Commit();
subject.OnCompleted();
});
return subject.AsObservable();
}
Now, not only does this run the processing async, but it also ensures the correct order of the results.
In fact, since you are starting with a collection, you could even do this:
var rqs = requests.ToObservable();
var rqrps = rqs.Zip(Process(rqs),
(rq, rp) => new
{
Request = rq,
Response = rp,
});
Then you would have an observable that pairs up each request/response without the need for a CorrelationId property.
I hope this helps.

This is part of the genius of Rx, as you're free to return results either synchronously or asynchronously:
public IObservable<int> AddNumbers(int a, int b) {
return Observable.Return(a + b);
}
public IObservable<int> AddNumbersAsync(int a, int b) {
return Observable.Start(() => a + b, Scheduler.NewThread);
}
They both have the IObservable type, so they work identically. If you want to find out when all IObservables complete, Aggregate will do this, as it will turn 'n' items in an Observable into 1 item that is returned at the end:
IObservable<int> listOfObservables[];
listObservables.ToObservable()
.Merge()
.Aggregate(0, (acc, x) => acc+1)
.Subscribe(x => Console.WriteLine("{0} items were run", x));

Related

How to wrap async callback with Observable or async/await?

I am using a network API which works with callbacks. So basically, I have a bunch of method calls that I need to use in this 3rd party library which look like this:
void SendNetworkRequest(string requestType, Action<Response> callback)
I find the code getting a little bit wacky because all of my methods which depend on network resources from this 3rd party API also need to implement callbacks themselves. For example, in my main scene I may want to get the players information and my code would look something like this:
void InitMainScene()
{
_networkHelper.SendNetworkRequest("GetPlayersInfo",OnPlayerInfoResponse);
}
void OnPlayerInfoResponse(Response response)
{
_playerInfo = response.Info;
}
I have recently gotten into RX and am using it abundently in my code. I have read read some about async/await. I have experimented quite a bit, especially with RX, and tried working with Observable.FromAsync() but could not get it to work..
What am I missing here? How can I write code which is cleaner and doesn't require the callback using RX or async/await? The following is the psuedocode which I'm looking for:
void InitMainScene()
{
_playerInfo = Overservable.DoMagicThing<Response>( () => {
_networkHelper.SendNetworkRequest("GetPlayersInfo",(r) =>{return r;}); });
}
Here's how to do it with Rx. I swapped out Response for string just so I could compile and test, but trivial to swap back.:
public IObservable<string> SendNetworkRequestObservable(string requestType)
{
return Observable.Create<string>(observer =>
{
SendNetworkRequest(requestType, s => observer.OnNext(s));
observer.OnCompleted();
return Disposable.Empty;
});
}
// Define other methods and classes here
public void SendNetworkRequest(string requestType, Action<string> callback)
{
callback(requestType); // implementation for testing purposes
}
I am not sure about RX. However, you can convert callback-based SendNetworkRequest to an awaitable task like this:
void SendNetworkRequest(string requestType, Action<Response> callback)
{
// This code by third party, this is here just for testing
if (callback != null)
{
var result = new Response();
callback(result);
}
}
Task<Response> SendNetworkRequestAsync(string requestType)
{
return Task.Run(() =>
{
var t = new TaskCompletionSource<Response>();
SendNetworkRequest(requestType, s => t.TrySetResult(s));
return t.Task;
});
}
Now you can consume SendNetworkRequestAsync with async/await more naturally
Here's my take on doing this with Rx.
Inside the NetworkHelper class add this method:
public IObservable<Response> SendNetworkRequestObservable(string requestType)
{
return
Observable
.Create<Response>(observer =>
{
Action<Response> callback = null;
var callbackQuery =
Observable
.FromEvent<Response>(h => callback += h, h => callback -= h)
.Take(1);
var subscription = callbackQuery.Subscribe(observer);
this.SendNetworkRequest(requestType, callback);
return subscription;
});
}
This creates an observable that internally uses Observable.FromEvent(...).Take(1) to create an observable based on expecting a single call to the Action<Response> callback passed to the SendNetworkRequest method.
The only issue with this is that the call occurs on the current thread right to completion. If you want this code to run on a background thread, but return the result to the current thread, then you can do this:
public IObservable<Response> SendNetworkRequestObservable(string requestType)
{
return
Observable
.Start(() =>
Observable
.Create<Response>(observer =>
{
Action<Response> callback = null;
var callbackQuery =
Observable
.FromEvent<Response>(h => callback += h, h => callback -= h)
.Take(1);
var subscription = callbackQuery.Subscribe(observer);
this.SendNetworkRequest(requestType, callback);
return subscription;
}))
.Merge();
}
Either way you'd call it like this:
var _playerInfo =
_networkHelper
.SendNetworkRequestObservable("GetPlayersInfo")
.Wait();
Just a side-note: your DoMagicThing is a synchronous call. In Rx that's a .Wait() call, but it's better to just use a normal .Subscribe when you can.
As per the comment, you could do this with async:
async void InitMainScene()
{
_playerInfo = await _networkHelper.SendNetworkRequestObservable("GetPlayersInfo");
}

How to mark a TPL dataflow cycle to complete?

Given the following setup in TPL dataflow.
var directory = new DirectoryInfo(#"C:\dev\kortforsyningen_dsm\tiles");
var dirBroadcast=new BroadcastBlock<DirectoryInfo>(dir=>dir);
var dirfinder = new TransformManyBlock<DirectoryInfo, DirectoryInfo>((dir) =>
{
return directory.GetDirectories();
});
var tileFilder = new TransformManyBlock<DirectoryInfo, FileInfo>((dir) =>
{
return directory.GetFiles();
});
dirBroadcast.LinkTo(dirfinder);
dirBroadcast.LinkTo(tileFilder);
dirfinder.LinkTo(dirBroadcast);
var block = new XYZTileCombinerBlock<FileInfo>(3, (file) =>
{
var coordinate = file.FullName.Split('\\').Reverse().Take(3).Reverse().Select(s => int.Parse(Path.GetFileNameWithoutExtension(s))).ToArray();
return XYZTileCombinerBlock<CloudBlockBlob>.TileXYToQuadKey(coordinate[0], coordinate[1], coordinate[2]);
},
(quad) =>
XYZTileCombinerBlock<FileInfo>.QuadKeyToTileXY(quad,
(z, x, y) => new FileInfo(Path.Combine(directory.FullName,string.Format("{0}/{1}/{2}.png", z, x, y)))),
() => new TransformBlock<string, string>((s) =>
{
Trace.TraceInformation("Combining {0}", s);
return s;
}));
tileFilder.LinkTo(block);
using (new TraceTimer("Time"))
{
dirBroadcast.Post(directory);
block.LinkTo(new ActionBlock<FileInfo>((s) =>
{
Trace.TraceInformation("Done combining : {0}", s.Name);
}));
block.Complete();
block.Completion.Wait();
}
i am wondering how I can mark this to complete because of the cycle. A directory is posted to the dirBroadcast broadcaster which posts to the dirfinder that might post back new dirs to the broadcaster, so i cant simply mark it as complete because it would block any directories being added from the dirfinder. Should i redesign it to keep track of the number of dirs or is there anything for this in TPL.
If the purpose of your code is to traverse the directory structure using some sort of parallelism then I would suggest not using TPL Dataflow and use Microsoft's Reactive Framework instead. I think it becomes much simpler.
Here's how I would do it.
First define a recursive function to build the list of directories:
Func<DirectoryInfo, IObservable<DirectoryInfo>> recurse = null;
recurse = di =>
Observable
.Return(di)
.Concat(di.GetDirectories()
.ToObservable()
.SelectMany(di2 => recurse(di2)))
.ObserveOn(Scheduler.Default);
This performs the recurse of the directories and uses the default Rx scheduler which causes the observable to run in parallel.
So by calling recurse with an input DirectoryInfo I get an observable list of the input directory and all of its descendants.
Now I can build a fairly straight-forward query to get the results I want:
var query =
from di in recurse(new DirectoryInfo(#"C:\dev\kortforsyningen_dsm\tiles"))
from fi in di.GetFiles().ToObservable()
let zxy =
fi
.FullName
.Split('\\')
.Reverse()
.Take(3)
.Reverse()
.Select(s => int.Parse(Path.GetFileNameWithoutExtension(s)))
.ToArray()
let suffix = String.Format("{0}/{1}/{2}.png", zxy[0], zxy[1], zxy[2])
select new FileInfo(Path.Combine(di.FullName, suffix));
Now I can action the query like this:
query
.Subscribe(s =>
{
Trace.TraceInformation("Done combining : {0}", s.Name);
});
Now I may have missed a little bit in your custom code but if this is an approach you want to take I'm sure you can fix any logical issues quite easily.
This code automatically handles completion when it runs out of child directories and files.
To add Rx to your project look for "Rx-Main" in NuGet.
I am sure this is not always possible, but in many cases (including directory enumeration) you can use a running counter and the Interlocked functions to have a cyclic one-to-many dataflow that completes:
public static ISourceBlock<string> GetDirectoryEnumeratorBlock(string path, int maxParallel = 5)
{
var outputBuffer = new BufferBlock<string>();
var count = 1;
var broadcastBlock = new BroadcastBlock<string>(s => s);
var getDirectoriesBlock = new TransformManyBlock<string, string>(d =>
{
var files = Directory.EnumerateDirectories(d).ToList();
Interlocked.Add(ref count, files.Count - 1); //Adds the subdir count, minus 1 for the current directory.
if (count == 0) //if count reaches 0 then all directories have been enumerated.
broadcastBlock.Complete();
return files;
}, new ExecutionDataflowBlockOptions() { MaxDegreeOfParallelism = maxParallel });
broadcastBlock.LinkTo(outputBuffer, new DataflowLinkOptions() { PropagateCompletion = true });
broadcastBlock.LinkTo(getDirectoriesBlock, new DataflowLinkOptions() { PropagateCompletion = true });
getDirectoriesBlock.LinkTo(broadcastBlock);
getDirectoriesBlock.Post(path);
return outputBuffer;
}
I have used this with a slight modification to enumerate files, but it works well. Be careful with the max degree of parallelism, this can quickly saturate a network file system!
I don't see any way this can be done, because each block (dirBroadcast and tileFilder) depends on the other one and can't complete on its own.
I suggest you redesign your directory traversal without TPL Dataflow, which isn't a good fit for this kind of problem. A better approach in my opinion would simply be to recursively scan the directories and fill your block with a stream of files:
private static void FillBlock(DirectoryInfo directoryInfo, XYZTileCombinerBlock<FileInfo> block)
{
foreach (var fileInfo in directoryInfo.GetFiles())
{
block.Post(fileInfo);
}
foreach (var subDirectory in directoryInfo.GetDirectories())
{
FillBlock(subDirectory, block);
}
}
FillBlock(directory, block);
block.Complete();
await block.Completion;
Here is a generalized approach of Andrew Hanlon's solution. It returns a TransformBlock that supports posting messages recursively to itself, and completes automatically when there are no more messages to process.
The transform lambda has three arguments instead of the usual one. The first argument is the item being processed. The second argument is the "path" of the processed message, which is a sequence IEnumerable<TInput> containing its parent messages. The third argument is an Action<TInput> that posts new messages to the block, as children of the current message.
/// <summary>Creates a dataflow block that supports posting messages to itself,
/// and knows when it has completed processing all messages.</summary>
public static IPropagatorBlock<TInput, TOutput>
CreateRecursiveTransformBlock<TInput, TOutput>(
Func<TInput, IEnumerable<TInput>, Action<TInput>, Task<TOutput>> transform,
ExecutionDataflowBlockOptions dataflowBlockOptions = null)
{
if (transform == null) throw new ArgumentNullException(nameof(transform));
dataflowBlockOptions = dataflowBlockOptions ?? new ExecutionDataflowBlockOptions();
int pendingCount = 1; // The initial 1 represents the completion of input1 block
var input1 = new TransformBlock<TInput, (TInput, IEnumerable<TInput>)>(item =>
{
Interlocked.Increment(ref pendingCount);
return (item, Enumerable.Empty<TInput>());
}, new ExecutionDataflowBlockOptions()
{
CancellationToken = dataflowBlockOptions.CancellationToken,
BoundedCapacity = dataflowBlockOptions.BoundedCapacity
});
var input2 = new BufferBlock<(TInput, IEnumerable<TInput>)>(new DataflowBlockOptions()
{
CancellationToken = dataflowBlockOptions.CancellationToken
// Unbounded capacity
});
var output = new TransformBlock<(TInput, IEnumerable<TInput>), TOutput>(async entry =>
{
try
{
var (item, path) = entry;
var postChildAction = CreatePostAction(item, path);
return await transform(item, path, postChildAction).ConfigureAwait(false);
}
finally
{
if (Interlocked.Decrement(ref pendingCount) == 0) input2.Complete();
}
}, dataflowBlockOptions);
Action<TInput> CreatePostAction(TInput parentItem, IEnumerable<TInput> parentPath)
{
return item =>
{
// The Post will be unsuccessful only in case of block failure
// or cancellation, so no specific action is needed here.
if (input2.Post((item, parentPath.Append(parentItem))))
{
Interlocked.Increment(ref pendingCount);
}
};
}
input1.LinkTo(output);
input2.LinkTo(output);
PropagateCompletion(input1, input2,
condition: () => Interlocked.Decrement(ref pendingCount) == 0);
PropagateCompletion(input2, output);
PropagateFailure(output, input1, input2); // Ensure that all blocks are faulted
return DataflowBlock.Encapsulate(input1, output);
async void PropagateCompletion(IDataflowBlock block1, IDataflowBlock block2,
Func<bool> condition = null)
{
try
{
await block1.Completion.ConfigureAwait(false);
}
catch { }
if (block1.Completion.Exception != null)
{
block2.Fault(block1.Completion.Exception.InnerException);
}
else
{
if (block1.Completion.IsCanceled) return; // On cancellation do nothing
if (condition == null || condition()) block2.Complete();
}
}
async void PropagateFailure(IDataflowBlock block1, IDataflowBlock block2,
IDataflowBlock block3)
{
try
{
await block1.Completion.ConfigureAwait(false);
}
catch (Exception ex)
{
if (block1.Completion.IsCanceled) return; // On cancellation do nothing
block2.Fault(ex); block3.Fault(ex);
}
}
}
// Overload with synchronous delegate
public static IPropagatorBlock<TInput, TOutput>
CreateRecursiveTransformBlock<TInput, TOutput>(
Func<TInput, IEnumerable<TInput>, Action<TInput>, TOutput> transform,
ExecutionDataflowBlockOptions dataflowBlockOptions = null)
{
return CreateRecursiveTransformBlock<TInput, TOutput>((item, path, postAction) =>
Task.FromResult(transform(item, path, postAction)), dataflowBlockOptions);
}
The resulting block consists internally of three blocks: two input blocks that receive messages, and one output block that processes the messages. The first input block receives messages from outside, and the second input block receives messages from inside. The second input block has unbounded capacity, so an infinite recursion will eventually result to an OutOfMemoryException.
Usage example:
var fileCounter = CreateRecursiveTransformBlock<string, int>(
(folderPath, parentPaths, postChild) =>
{
var subfolders = Directory.EnumerateDirectories(folderPath);
foreach (var subfolder in subfolders) postChild(subfolder);
var files = Directory.EnumerateFiles(folderPath);
Console.WriteLine($"{folderPath} has {files.Count()} files"
+ $", and is {parentPaths.Count()} levels deep");
return files.Count();
});
fileCounter.LinkTo(DataflowBlock.NullTarget<int>());
fileCounter.Post(Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments));
fileCounter.Complete();
fileCounter.Completion.Wait();
The above code prints in the console all the subfolders of the folder "MyDocuments".
Just to show my real answer, a combination of TPL and Rx.
Func<DirectoryInfo, IObservable<DirectoryInfo>> recurse = null;
recurse = di =>
Observable
.Return(di)
.Concat(di.GetDirectories()
.Where(d => int.Parse(d.Name) <= br_tile[0] && int.Parse(d.Name) >= tl_tile[0])
.ToObservable()
.SelectMany(di2 => recurse(di2)))
.ObserveOn(Scheduler.Default);
var query =
from di in recurse(new DirectoryInfo(Path.Combine(directory.FullName, baselvl.ToString())))
from fi in di.GetFiles().Where(f => int.Parse(Path.GetFileNameWithoutExtension(f.Name)) >= br_tile[1]
&& int.Parse(Path.GetFileNameWithoutExtension(f.Name)) <= tl_tile[1]).ToObservable()
select fi;
query.Subscribe(block.AsObserver());
Console.WriteLine("Done subscribing");
block.Complete();
block.Completion.Wait();
Console.WriteLine("Done TPL Block");
where block is my var block = new XYZTileCombinerBlock<FileInfo>

How can I make this Observable more reusable?

This is an observable sequence that retrieves paginated data from a web service. Each web service response contains a nextRecordsUrl that indicates where to get the next set of records. What is the best way to convert this Observable to something more reusable?
Web services setup:
var auth = new AuthenticationClient ();
await auth.UsernamePasswordAsync (consumerKey, consumerSecret, userName, password + passwordSecurityToken);
var forceClient = new ForceClient (auth.InstanceUrl, auth.AccessToken, auth.ApiVersion);
The Observable:
var observable = Observable.Create<QueryResult<Account>> (async (IObserver<QueryResult<Account>> o) =>
{
try
{
var queryResult = await forceClient.QueryAsync<Account> ("SELECT Id, Name from Account");
if (queryResult != null)
{
o.OnNext (queryResult);
while (!string.IsNullOrEmpty (queryResult.nextRecordsUrl))
{
queryResult = await forceClient.QueryContinuationAsync<Account> (queryResult.nextRecordsUrl);
if (queryResult != null)
{
o.OnNext (queryResult);
}
}
}
o.OnCompleted ();
}
catch (Exception ex)
{
o.OnError (ex);
}
return () => {};
});
Subscribing to the observable and collecting the results:
var accounts = new List<Account> ();
observable.Subscribe (
observer => accounts.AddRange (observer.records),
ex => Console.WriteLine (ex.Message),
() => {});
EDIT: Using Brandon's solution I can now generate the list of results with Aggregate
List<Account> accounts = await forceClient.QueryPages<Account> ("SELECT Id, Name from Account")
.Aggregate (new List<Account> (), (list, value) =>
{
list.AddRange (value.records);
return list;
});
Believe it or not, the Rx-Expiremental library (also maintained by MS) has an operator for this called Expand. Expand is used to take each element from an observable and run it through a function which produces another observable of the same type. That observable is then flattened in to the original, and each item from that goes through the same process.
Imagine being given a tree node with an observable of child nodes. You could use expand to easily traverse this tree. Since a linked-list is just a constrained version of a tree, and since what you have is effectively a linked list where each node is an observable, you can use expand.
public static IObservable<QueryResult<TResult>> QueryPages<TResult>(this ForceClient forceClient, string query)
{
return Observable.FromAsync(() => forceClient.QueryAsync<TResult>(query))
.Where(QueryResultIsValid)
.Expand(result =>
Observable.FromAsync(() => forceClient.QueryContinuationAsync<TResult>(queryResult.nextRecordsUrl))
.Where(QueryResultIsValid)
);
}
public static bool QueryResultIsValid(QueryResult<TResult> result)
{
return result != null;
}
Is something like this what you are looking for?
public static IObservable<QueryResult<TResult>> QueryPages<TResult>(this ForceClient forceClient, string query)
{
return Observable.Create<QueryResult<T>> (async (observer, token) =>
{
// No need for try/catch. Create() will call OnError if your task fails.
// Also no need for OnCompleted(). Create() calls it when your task completes
var queryResult = await forceClient.QueryAsync<TResult> (query);
while (queryResult != null)
{
observer.OnNext (queryResult);
// check the token *after* we call OnNext
// because if an observer unsubscribes
// it typically occurs during the notification
// e.g. they are using .Take(..) or
// something.
if (string.IsNullOrEmpty(queryResult.nextRecordsUrl) ||
token.IsCancellationRequested)
{
break;
}
queryResult = await forceClient.QueryContinuationAsync<TResult> (queryResult.nextRecordsUrl);
}
// No need to return anything. Just the Task itself is all that Create() wants.
}
});
// Usage:
var forceClient = // ...
var foos = forceClient.QueryPages<Foo>("SELECT A, B, C FROM Foo");
Notice I switched it to the overload that provides a cancellation token so that you can stop fetching pages if the observer unsubscribes (Your original version would have continued fetching pages even though the observer had stopped listening). Also note that the async Create awaits your Task and calls OnError or OnCompleted for you so you do not need to worry about that most of the time.

ReactiveExtensions Observable FromAsync calling twice Function

Ok, Trying to understand Rx, kinda of lost here.
FromAsyncPattern is now deprecated so I took the example from here (section Light up Task with Rx), and it works, I just made a few changes, not using await just wait the observable and subscribing.....
What I don't understand is Why is called Twice the function SumSquareRoots?
var res = Observable.FromAsync(ct => SumSquareRoots(x, ct))
.Timeout(TimeSpan.FromSeconds(5));
res.Subscribe(y => Console.WriteLine(y));
res.Wait();
class Program
{
static void Main(string[] args)
{
Samples();
}
static void Samples()
{
var x = 100000000;
try
{
var res = Observable.FromAsync(ct => SumSquareRoots(x, ct))
.Timeout(TimeSpan.FromSeconds(5));
res.Subscribe(y => Console.WriteLine(y));
res.Wait();
}
catch (TimeoutException)
{
Console.WriteLine("Timed out :-(");
}
}
static Task<double> SumSquareRoots(long count, CancellationToken ct)
{
return Task.Run(() =>
{
var res = 0.0;
Console.WriteLine("Why I'm called twice");
for (long i = 0; i < count; i++)
{
res += Math.Sqrt(i);
if (i % 10000 == 0 && ct.IsCancellationRequested)
{
Console.WriteLine("Noticed cancellation!");
ct.ThrowIfCancellationRequested();
}
}
return res;
});
}
}
The reason that this is calling SumSquareRoots twice is because you're Subscribing twice:
// Subscribes to res
res.Subscribe(y => Console.WriteLine(y));
// Also Subscribes to res, since it *must* produce a result, even
// if that result is then discarded (i.e. Wait doesn't return IObservable)
res.Wait();
Subscribe is the foreach of Rx - just like if you foreach an IEnumerable twice, you could end up doing 2x the work, multiple Subscribes means multiple the work. To undo this, you could use a blocking call that doesn't discard the result:
Console.WriteLine(res.First());
Or, you could use Publish to "freeze" the result and play it back to > 1 subscriber (kind of like how you'd use ToArray in LINQ):
res = res.Publish();
res.Connect();
// Both subscriptions get the same result, SumSquareRoots is only called once
res.Subscribe(Console.WriteLine);
res.Wait();
The general rule you can follow is, that any Rx method that doesn't return IObservable<T> or Task<T> will result in a Subscription(*)
* - Not technically correct. But your brain will feel better if you think of it this way.

How should I implement this pattern of calling asynchronously in a loop without C#'s async?

I'm trying to implement this trivial task of listing all objects in an AmazonS3 bucket with paged requests asynchronously in C#4. I have it working in C#5 using the following snippet:
var listRequest = new ListObjectsRequest().WithBucketName(bucketName);
ListObjectsResponse listResponse = null;
var list = new List<List<S3Object>>();
while (listResponse == null || listResponse.IsTruncated)
{
listResponse = await Task<ListObjectsResponse>.Factory.FromAsync(
client.BeginListObjects, client.EndListObjects, listRequest, null);
list.Add(listResponse.S3Objects);
if (listResponse.IsTruncated)
{
listRequest.Marker = listResponse.NextMarker;
}
}
return list.SelectMany(l => l);
I'm calling the BeginListObjects/EndListObjects pair asynchronously, but I have to repeat that call every time the response says it's truncated. This piece of code works for me.
However, I now want to do this in C#4's TPL, where I don't have the luxury of using async/await and want to understand if this can be done using continuations.
How do I do this same thing in C#4?
Okay, so rather than putting the items into a list with each task/continuation it's easier in a non-await model to just have each task/continuation return the entire sequence. Given that, I used the following helper method to add each one's iterative results onto the aggregate total.
public static Task<IEnumerable<T>> Concat<T>(Task<IEnumerable<T>> first
, Task<IEnumerable<T>> second)
{
return Task.Factory.ContinueWhenAll(new[] { first, second }, _ =>
{
return first.Result.Concat(second.Result);
});
}
Next, I used the follow method to take a task of a single result and turn it into a task of a sequence (containing just that one item).
public static Task<IEnumerable<T>> ToSequence<T>(this Task<T> task)
{
var tcs = new TaskCompletionSource<IEnumerable<T>>();
task.ContinueWith(_ =>
{
if (task.IsCanceled)
tcs.SetCanceled();
else if (task.IsFaulted)
tcs.SetException(task.Exception);
else
tcs.SetResult(Enumerable.Repeat(task.Result, 1));
});
return tcs.Task;
}
Note here that you have some fields/locals not defined; I'm assuming you can add them to the appropriate method without difficulty.
private Task<IEnumerable<S3Object>> method(object sender, EventArgs e)
{
ListObjectsResponse listResponse = null;
return Task<ListObjectsResponse>.Factory.FromAsync(
client.BeginListObjects, client.EndListObjects, listRequest, null)
.ToSequence()
.ContinueWith(continuation);
}
Here is where the real magic happens. Basically,
public Task<IEnumerable<S3Object>> continuation(Task<IEnumerable<S3Object>> task)
{
if (task.Result == null) //not quite sure what null means here//may need to edit this recursive case
{
return Task<ListObjectsResponse>.Factory.FromAsync(
client.BeginListObjects, client.EndListObjects, listRequest, null)
.ToSequence()
.ContinueWith(continuation);
}
else if (task.Result.First().IsTruncated)
{
//if the results were trunctated then concat those results with
//TODO modify the request marker here; either create a new one or store the request as a field and mutate.
Task<IEnumerable<S3Object>> nextBatch = Task<ListObjectsResponse>.Factory.FromAsync(
client.BeginListObjects, client.EndListObjects, listRequest, null)
.ToSequence()
.ContinueWith(continuation);
return Concat(nextBatch, task);//recursive continuation call
}
else //if we're done it means the existing results are sufficient
{
return task;
}
}

Categories

Resources