Producer consumer collection with ability to read and write batches of data - c#

I'm looking for a collection like BufferBlock
but with methods like:
SendAsync<T>(T[])
T[] ReceiveAsync<T>()
Can anyone help with this?

The following methods aren't available, because SendAsync<T> only takes a single T and ReceiveAsync<T> only returns a single T, not arrays:
SendAsync<T>(T[])
T[] ReceiveAsync<T>()
However, there is TryReceiveAll<T>(out IList<T> items), and you can call SendAsync<T> in a loop to send an array into the BufferBlock, or write your own extension method, something like this:
public static async Task SendAllAsync<T>(this ITargetBlock<T> block, IEnumerable<T> items)
{
    foreach (var item in items)
    {
        // SendAsync completes once the block has accepted or declined the item.
        await block.SendAsync(item);
    }
}
Note that SendAsync returns a bool indicating whether the message was accepted. You could return an array of booleans, or stop and return false as soon as any of them comes back false, but that's up to you.
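One way to act on that bool is sketched below. TrySendAllAsync is a hypothetical name, not part of the Dataflow library, and the choice to stop at the first declined item is an assumption; returning one bool per item would work just as well. The sketch needs the System.Threading.Tasks.Dataflow package.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class DataflowBatchExtensions
{
    // Hypothetical helper: sends items one by one and stops at the first
    // item the block declines. Returns true only if every item was accepted.
    public static async Task<bool> TrySendAllAsync<T>(
        this ITargetBlock<T> block, IEnumerable<T> items)
    {
        foreach (var item in items)
        {
            if (!await block.SendAsync(item))
            {
                return false;
            }
        }
        return true;
    }
}
```

On the receiving side, buffer.TryReceiveAll(out IList<T> items) drains whatever is currently buffered in one call, which covers the "read a batch" half of the question.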
It would likely be easier to use a BatchBlock<T>: you send items to it one at a time in a loop, and it emits them in batches. That is easier than using TryReceiveAll if you're building a pipeline. See BatchBlock Walkthrough and BatchBlock Example.
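A minimal sketch of the BatchBlock<T> approach follows. The class and method names (BatchBlockSketch, RunAsync) are made up for illustration, and the block is left unbounded, which is an assumption; a real pipeline would more likely link the block to a consumer with LinkTo than poll it like this.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class BatchBlockSketch
{
    // Sends every item individually and collects the arrays the block emits.
    public static async Task<List<int[]>> RunAsync(IEnumerable<int> items, int batchSize)
    {
        var batcher = new BatchBlock<int>(batchSize);

        foreach (var item in items)
        {
            await batcher.SendAsync(item);
        }

        // Completing the block flushes any remaining partial batch.
        batcher.Complete();

        var batches = new List<int[]>();
        while (await batcher.OutputAvailableAsync())
        {
            batches.Add(await batcher.ReceiveAsync());
        }
        return batches;
    }
}
```

With seven items and a batch size of three, this yields two full batches followed by a final batch holding the one leftover item.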

ReceiveAsync and SendAsync are available as extension methods on the ISourceBlock<T> and ITargetBlock<T> interfaces. BufferBlock<T> implements both, so the methods can be called on the block itself or through either interface, eg :
var buffer = new BufferBlock<string>();
var source = (ISourceBlock<string>)buffer;
var target = (ITargetBlock<string>)buffer;
await target.SendAsync("something");
In practice you rarely need the concrete type, because the Dataflow methods accept interfaces, not concrete types, eg :
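static async Task MyProducer(ITargetBlock<string> target)
{
    ...
    await target.SendAsync(..);
    ...
    // Signal that no more messages will be sent.
    target.Complete();
}
static async Task MyConsumer(ISourceBlock<string> source)
{
    ...
    var message = await source.ReceiveAsync();
    ...
}
public static async Task Main()
{
    var buffer = new BufferBlock<string>();
    // Start the producer without awaiting it yet, so it runs alongside the consumer.
    var producer = MyProducer(buffer);
    await MyConsumer(buffer);
    // Await the producer as well, so that any exception it threw is observed.
    await producer;
}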

Related

Convert `IObservable<T>` to `IEnumerable<Task<T>>`?

I have a couple of asynchronous APIs that use callbacks or events instead of async. I successfully used TaskCompletionSource to wrap them as described here.
Now, I would like to use an API that returns IObservable<T> and yields multiple objects. I read about Rx for .NET, which seems the way to go. However, I'm hesitant to include another dependency and another new paradigm, since I'm already using a lot of things that are new for me in this app (like XAML, MVVM, C#'s async/await).
Is there any way to wrap IObservable<T> analogously to how you wrap a single callback API? I would like to call the API as such:
foreach (var t in GetMultipleInstancesAsync()) {
    var res = await t;
    Console.WriteLine("Received item: {0}", res);
}
If the observable emits multiple values, you can convert them to Task<T> and add them to any IEnumerable structure.
Check IObservable ToTask. As discussed here, the observable must complete before you await it, otherwise more values might still arrive.
This guide here might do the trick for you too:
public static Task<IList<T>> BufferAllAsync<T>(this IObservable<T> observable)
{
    List<T> result = new List<T>();
    object gate = new object();
    TaskCompletionSource<IList<T>> finalTask = new TaskCompletionSource<IList<T>>();
    observable.Subscribe(
        value =>
        {
            lock (gate)
            {
                result.Add(value);
            }
        },
        exception => finalTask.TrySetException(exception),
        () => finalTask.SetResult(result.AsReadOnly())
    );
    return finalTask.Task;
}
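The lambda-based Subscribe overload used in BufferAllAsync comes from Rx. Since the question hoped to avoid that dependency, here is a sketch of the same idea using only the IObservable<T> and IObserver<T> interfaces from the base class library. CollectAsync, CollectingObserver and FixedObservable are hypothetical names introduced for this illustration, and FixedObservable exists only so the sketch has something to subscribe to.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ObservableTaskExtensions
{
    // Completes with every value once the observable signals OnCompleted,
    // or faults if the observable signals OnError.
    public static Task<IReadOnlyList<T>> CollectAsync<T>(this IObservable<T> source)
    {
        var observer = new CollectingObserver<T>();
        source.Subscribe(observer);
        return observer.Completion;
    }

    private sealed class CollectingObserver<T> : IObserver<T>
    {
        private readonly List<T> _items = new List<T>();
        private readonly object _gate = new object();
        private readonly TaskCompletionSource<IReadOnlyList<T>> _tcs =
            new TaskCompletionSource<IReadOnlyList<T>>();

        public Task<IReadOnlyList<T>> Completion => _tcs.Task;

        public void OnNext(T value)
        {
            lock (_gate) { _items.Add(value); }
        }

        public void OnError(Exception error) => _tcs.TrySetException(error);

        public void OnCompleted()
        {
            lock (_gate) { _tcs.TrySetResult(_items.ToArray()); }
        }
    }
}

// Hypothetical observable that pushes a fixed set of values synchronously.
public sealed class FixedObservable<T> : IObservable<T>
{
    private readonly T[] _values;

    public FixedObservable(params T[] values) => _values = values;

    public IDisposable Subscribe(IObserver<T> observer)
    {
        foreach (var value in _values) observer.OnNext(value);
        observer.OnCompleted();
        return new NoopDisposable();
    }

    private sealed class NoopDisposable : IDisposable
    {
        public void Dispose() { }
    }
}
```

Usage would look like `var items = await new FixedObservable<int>(1, 2, 3).CollectAsync();`. As with BufferAllAsync, the task only completes if the observable completes, so this does not suit endless streams.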
If you would like to use Rx, you can convert the returned list to an observable:
GetMultipleInstancesAsync().ToObservable().Subscribe(...);
The Subscribe call also accepts OnCompleted/OnError handlers.
You can also await the whole list of tasks at once:
var result = await Task.WhenAll(GetMultipleInstancesAsync().ToArray());
That gives you an array of the results.

ForEach lambda async vs Task.WhenAll

I have an async method like this:
private async Task SendAsync(string text) {
...
}
I also have to call this method once for each item in a List:
List<string> textsToSend = new Service().GetMessages();
Currently my implementation is this:
List<string> textsToSend = new Service().GetMessages();
List<Task> tasks = new List<Task>(textsToSend.Count);
textsToSend.ForEach(t => tasks.Add(SendAsync(t)));
await Task.WhenAll(tasks);
With this code, I get a Task for each message, each running the send method asynchronously.
However, I don't know whether there is any difference between my implementation and this one:
List<string> textsToSend = new Service().GetMessages();
textsToSend.ForEach(async t => await SendAsync(t));
In the second one I avoid the List<Task> allocation, but I think the first one launches all the tasks in parallel and the second one runs them one by one.
Could you help me clarify whether there is any difference between the first and second samples?
PS: I know that C# 8 supports await foreach, but I'm using C# 7.
You don't even need a list, much less ForEach, to execute multiple tasks and await all of them. In any case, ForEach is just a convenience function that uses foreach.
To execute some async calls concurrently based on a list of inputs, all you need is Enumerable.Select. To await all of them, you only need Task.WhenAll :
var tasks=textsToSend.Select(text=>SendAsync(text));
await Task.WhenAll(tasks);
LINQ and IEnumerable in general use lazy evaluation, which means Select's lambda won't be executed until the returned IEnumerable is iterated. In this case it doesn't matter, because it's iterated on the very next line. If you wanted to force all tasks to start immediately, a call to ToArray() would be enough, eg :
var tasks=textsToSend.Select(SendAsync).ToArray();
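The lazy-evaluation point can be checked with a small counter. This is only an illustrative sketch: LazySelectSketch and Start are made-up names, and Start returns an already completed task so the only thing being observed is when the Select lambda runs.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class LazySelectSketch
{
    // Returns how many tasks had been started before and after enumeration.
    public static (int beforeEnumeration, int afterToArray) Run()
    {
        int started = 0;

        // Stand-in for SendAsync: counts the call and returns a finished task.
        Task Start(int n)
        {
            started++;
            return Task.CompletedTask;
        }

        // Nothing runs here: Select only describes the work.
        IEnumerable<Task> lazy = Enumerable.Range(1, 3).Select(n => Start(n));
        int before = started;

        // ToArray enumerates the sequence, which invokes Start for each input.
        Task[] tasks = lazy.ToArray();
        int after = started;

        return (before, after);
    }
}
```

A related consequence is that enumerating the lazy sequence a second time would call Start again for every input, so materializing it once with ToArray() is the safer habit when the tasks have side effects.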
If you wanted to execute those async calls sequentially, ie one after the other, you could use a simple foreach. There's no need for C# 8's await foreach :
foreach(var text in textsToSend)
{
await SendAsync(text);
}
The Bug
This line is simply a bug :
textsToSend.ForEach(async t => await SendAsync(t));
ForEach doesn't know anything about tasks, so it never awaits the generated tasks. In fact, the tasks can't be awaited at all. Because ForEach takes an Action<T>, the async t => ... lambda becomes an async void delegate. It's equivalent to :
async void MyMethod(string t)
{
    await SendAsync(t);
}
textsToSend.ForEach(t => MyMethod(t));
This brings all the problems of async void methods. Since the application knows nothing about those async void calls, it could easily terminate before those methods complete, resulting in NREs, ObjectDisposedExceptions and other weird problems.
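The difference can be observed with a sketch like the one below. The names (ForEachVsWhenAll, the two-argument SendAsync stand-in, the 100 ms delay) are assumptions made for the demonstration and are not taken from the question's code.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ForEachVsWhenAll
{
    // Stand-in for the real SendAsync: records the text after a short delay.
    private static async Task SendAsync(string text, ConcurrentBag<string> sent)
    {
        await Task.Delay(100);
        sent.Add(text);
    }

    public static async Task<(int afterForEach, int afterWhenAll)> RunAsync()
    {
        var texts = new List<string> { "a", "b", "c" };

        // async void lambdas: ForEach returns as soon as each one hits its first await.
        var sentByForEach = new ConcurrentBag<string>();
        texts.ForEach(async t => await SendAsync(t, sentByForEach));
        int afterForEach = sentByForEach.Count; // usually 0, nothing has finished yet

        // Task-returning lambdas: WhenAll completes only after every send has finished.
        var sentByWhenAll = new ConcurrentBag<string>();
        await Task.WhenAll(texts.Select(t => SendAsync(t, sentByWhenAll)));
        int afterWhenAll = sentByWhenAll.Count;

        return (afterForEach, afterWhenAll);
    }
}
```

The first count depends on timing, which is exactly the problem: the caller has no way to know when, or whether, the ForEach sends completed.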
For reference check David Fowler's Implicit async void delegates
C# 8 and await foreach
C# 8's IAsyncEnumerable would be useful in the sequential case, if we wanted to return the results of each async operation in an iterator, as soon as we got them.
Before C# 8 there would be no way to avoid awaiting all results, even with sequential execution. We'd have to collect all of them in a list. Assuming each operation returned a string, we'd have to write :
async Task<List<string>> SendTexts(IEnumerable<string> textsToSend)
{
    var results = new List<string>();
    foreach (var text in textsToSend)
    {
        var result = await SendAsync(text);
        results.Add(result);
    }
    return results;
}
And use it with :
var results=await SendTexts(texts);
In C# 8 we can return individual results and use them asynchronously. We don't need to cache the results before returning them either :
async IAsyncEnumerable<string> SendTexts(IEnumerable<string> textsToSend)
{
    foreach (var text in textsToSend)
    {
        var result = await SendAsync(text);
        // Hand each result to the consumer as soon as it is available.
        yield return result;
    }
}
await foreach (var result in SendTexts(texts))
{
    ...
}
await foreach is only needed to consume the IAsyncEnumerable result, not to produce it.
"the first one launch all Task in parallel"
Correct. And await Task.WhenAll(tasks); waits until all messages are sent.
The second one also sends the messages in parallel, but doesn't wait until they are all sent, since you don't await any task.
In your case:
textsToSend.ForEach(async t => await SendAsync(t));
is equivalent to
textsToSend.ForEach(t => SendAsync(t));
The async t => await SendAsync(t) delegate may return a task, like SendAsync(t) does, depending on the delegate type it is assigned to. When passed to ForEach, both async t => await SendAsync(t) and t => SendAsync(t) are converted to Action<string>, so no task is returned.
Also, the first code will throw an exception if any SendAsync throws an exception. In the second code, the caller never sees any exception.

How to use yield in async C# task

I am trying to use yield to return a result from converting X into Y in an async task. But I am getting an error on select.
The error is:
Error CS1942 The type of the expression in the select clause is
incorrect. Type inference failed in the call to 'Select'.
public async Task<Result<dynamic>> GetYAsync(IEnumerable<X> infos)
{
return Task.WhenAll(from info in infos.ToArray() select async ()=>
{
yield return await new Y(info.Id, "Start");
});
}
Short answer: you can't use an asynchronous yield statement.
But in most cases, you don't need to. Using LINQ you can aggregate all tasks before passing them into Task.WhenAll. I simplified your example to return an IEnumerable<int>, but this will work with any type.
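using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
public class Program
{
    // Stand-in for an asynchronous conversion of a single item.
    public static Task<int> X(int x)
    {
        return Task.FromResult(x);
    }
    public static async Task<IEnumerable<int>> GetYAsync(IEnumerable<int> infos)
    {
        // Start one task per item, then await all of them together.
        var res = await Task.WhenAll(infos.Select(info => X(info)));
        return res;
    }
    // async Task Main (C# 7.1+) rather than async void, so the process waits for completion.
    public static async Task Main()
    {
        var test = await GetYAsync(new [] {1, 2, 3});
        Console.WriteLine(string.Join(", ", test));
    }
}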
Your example has another error: await new Y(...). A constructor cannot be asynchronous, so I replaced it with an asynchronous function. (As hinted in the comments, it is technically possible to create a custom awaitable type and create an instance of it with new, although this is rarely done.)
The example above uses infos.Select to create a sequence of pending tasks, returned by invoking the function X. These tasks are then awaited together and their results returned.
This workaround should fit most cases. Real asynchronous iterators, as found for example in JavaScript, were not supported in .NET when this answer was written.
Update: This feature is currently suggested as a language proposal: Async Streams. So maybe we will see this in the future.
Update: If you need asynchronous iterators, there are a few options currently available:
Reactive Stream, RX.Net, which provides you with asynchronous observable streams based on events.
There are implementations of asynchronous iterators or asynchronous enumerables: AsyncEnumerable or .Net Async Enumerable.
You can't. Support for async enumerables (and yield exists to implement enumerables) is expected to arrive with C# 8 at some point in 2019. So, for now, the answer is simply that you can't.
The reason you get the error is that you also cannot return a Result that way. yield return is specific to implementing enumerations, and your method signature does not match.

Nested async lambda not awaited

The following code does not return the entire collection it is iterating. The returned array has an arbitrary length on every run. What's wrong?
public async Task<IHttpActionResult> GetClients()
{
var clientInfoCollection = new ConcurrentBag<ClientInfoModel>();
await _client.Iterate(async (client) =>
{
clientInfoCollection.Add(new ClientInfoModel
{
name = client.name,
userCount = await _user.Count(clientId)
});
});
return Ok(clientInfoCollection.ToArray());
}
The following code uses the new async MongoDB C# driver:
public async Task Iterate(Action<TDocument> processor)
{
await _collection.Find<TDocument>(_ => true).ForEachAsync(processor);
}
The reason you're seeing an arbitrary number of values is that Iterate receives a delegate of type Action<T>. An async lambda converted to Action<T> is equivalent to async void, which effectively makes this a "fire-and-forget" style of execution.
The inner method isn't aware that an async delegate has been passed to it, so it iterates the collection without asynchronously waiting for each item to complete.
What you need to do instead is make the method parameter a delegate of type Func<TDocument, Task> and use the proper overload of ForEachAsync:
public Task Iterate(Func<TDocument, Task> processor)
{
    return _collection.Find<TDocument>(_ => true).ForEachAsync(processor);
}
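The effect of the delegate type can be reproduced without MongoDB. The sketch below uses an in-memory sequence in place of the driver's cursor; IterateSketch, IterateWithAction, IterateWithFunc and CountProcessedAsync are hypothetical names for illustration only.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class IterateSketch
{
    // Action<T>: an async lambda passed here becomes async void,
    // so the loop cannot wait for the work it starts.
    public static void IterateWithAction<T>(IEnumerable<T> items, Action<T> processor)
    {
        foreach (var item in items)
        {
            processor(item);
        }
    }

    // Func<T, Task>: the loop receives a task per item and can await it.
    public static async Task IterateWithFunc<T>(IEnumerable<T> items, Func<T, Task> processor)
    {
        foreach (var item in items)
        {
            await processor(item);
        }
    }

    // Processes five items through the awaitable version and reports how many finished.
    public static async Task<int> CountProcessedAsync()
    {
        var results = new ConcurrentBag<int>();
        await IterateWithFunc(Enumerable.Range(1, 5), async n =>
        {
            await Task.Delay(10);
            results.Add(n * 2);
        });
        return results.Count;
    }
}
```

Passing the same async lambda to IterateWithAction would compile too, but the method would return while the delays were still pending, which matches the arbitrary array length described in the question.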
You can see the source here:
public static async Task ForEachAsync<TDocument>(
    this IAsyncCursor<TDocument> source,
    Func<TDocument, int, Task> processor,
    CancellationToken cancellationToken = default(CancellationToken))
{
    Ensure.IsNotNull(source, "source");
    Ensure.IsNotNull(processor, "processor");
    // yes, we are taking ownership... assumption being that they've
    // exhausted the thing and don't need it anymore.
    using (source)
    {
        var index = 0;
        while (await source.MoveNextAsync(cancellationToken).ConfigureAwait(false))
        {
            foreach (var document in source.Current)
            {
                await processor(document, index++).ConfigureAwait(false);
                cancellationToken.ThrowIfCancellationRequested();
            }
        }
    }
}
You start the threads and set them off. From there you can't know what happens. But your code's next step is to return, so you are gambling that the threads will finish before your main thread gets there.
In normal threading scenarios, you would join the threads that are adding items to the bag. A join makes one thread wait for the other threads to finish, so the work still runs in parallel but nothing is returned before everything has completed.
This is explained well here: http://www.dotnetperls.com/thread-join

Convert IEnumerable<Task<T>> to IObservable<T>

I'm trying to use the Reactive Extensions (Rx) to buffer an enumeration of Tasks as they complete. Does anyone know if there is a clean built-in way of doing this? The ToObservable extension method will just make an IObservable<Task<T>>, which is not what I want, I want an IObservable<T>, that I can then use Buffer on.
Contrived example:
//Method designed to be awaitable
public static Task<int> makeInt()
{
return Task.Run(() => 5);
}
//In practice, however, I don't want to await each individual task
//I want to await chunks of them at a time, which *should* be easy with Observable.Buffer
public static void Main()
{
//Make a bunch of tasks
IEnumerable<Task<int>> futureInts = Enumerable.Range(1, 100).Select(t => makeInt());
//Is there a built in way to turn this into an Observable that I can then buffer?
IObservable<int> buffered = futureInts.TasksToObservable().Buffer(15); //????
buffered.Subscribe(ints => {
Console.WriteLine(ints.Count()); //Should be 15
});
}
You can use the fact that a Task can be converted to an observable using another overload of ToObservable().
When you have a collection of (single-item) observables, you can create a single observable that contains the items as they complete using Merge().
So, your code could look like this:
futureInts.Select(t => t.ToObservable())
.Merge()
.Buffer(15)
.Subscribe(ints => Console.WriteLine(ints.Count));
