Can BlockingCollection.GetConsumingEnumerable Deadlock

Can BlockingCollection.GetConsumingEnumerable Deadlock - c#

Assuming I have an async builder method and I need to convert a collection of db models into a collection of UI models, I wrote the code below (simplified for example).
After reading this medium article, the code was refactoring to bubble ASYNC all the way up but I'd still like to know
will collection.GetConsumingEnumerable() block, leading to deadlocks, similar to .Wait or .Result
what is the preferred way to stream items in an async method
public IEnumerable<GenericDevice> GetGenericDevices(SearchRequestModel srModel)
{
var collection = new BlockingCollection<GenericDevice>(new ConcurrentBag<GenericDevice>());
Task.Run(() => QueueGenericDevices(collection, srModel));
foreach (var genericDevice in collection.GetConsumingEnumerable())
{
yield return genericDevice;
}
}
private async Task QueueGenericDevices(BlockingCollection<GenericDevice> collection, SearchRequestModel srModel)
{
var efDevices = _dbDeviceRepo.GetEfDevices(srModel);
var tasks = efDevices
.Select(async efDevice =>
{
var genericDevice = await BuildGenericDevice(efDevice, srModel);
collection.Add(genericDevice);
});
await Task.WhenAll(tasks);
collection.CompleteAdding();
}

will collection.GetConsumingEnumerable() block, leading to deadlocks, similar to .Wait or .Result
No. There are two parts to the classic deadlock:
A context that only allows one thread at a time (usually a UI SynchronizationContext or ASP.NET pre-Core SynchronizationContext).
Blocking on code that uses await (which uses that context in order to complete the async method).
GetConsumingEnumerable is a blocking call, but it's not blocking on asynchronous code, so there's no danger of the classic deadlock there. More generally, using a queue like this inserts a "barrier" of sorts between two pieces of code.
what is the preferred way to stream items in an async method
Modern code should use IAsyncEnumerable<T>.
However, if you're not on that platform yet, you can use a queue as a "barrier". You wouldn't want to use BlockingCollection<T>, though, because it is designed for synchronous producers and synchronous consumers. For asynchronous producers/consumers, I'd recommend Channels.

Related

async or not async method

I am writing a blazor server web application.
This application works with a database and Entity Framework.
Here is a method i've wrote:
private void MyMethod()
{
var query = mydbcontext.mytable1.Where(t => ...);
foreach (var item in query)
{
...
}
}
As you can see this method is not declared with "async Task". So she will be called without "await" keyword.
I can declare it with "async Task" and call it with "await". It works but it gives me a warning because i have no async call inside.
Let's suppose i decide to not declare it with "async Task" in order to avoid the warning.
What will happen if i need to change something in my function later which needs an Async call (For example this):
private async Task MyMethod()
{
var query = mydbcontext.mytable1.Where(t => ...);
foreach (var item in query)
{
...
}
var countresult = await query.CountAsync();
}
I will need to search all calls of MyMethod and add "await" on each of this calls.
To prevent that, I am wondering if i should not declare all my methods with "async Task". But this is ugly because i will get warnings.
What is the best practice ?
And is there a way to do this loop as ASync ?
private async Task MyMethod()
{
var query = mydbcontext.mytable1.Where(t => ...);
await foreachAsync (var item in query)
{
...
}
var countresult = await query.CountAsync();
}
Thanks

... I am wondering if i should not declare all my methods with "async Task".
If there is no I/O being performed by the method then there is no need to make the method return a Task or Task<T> and by extension no need to use the async keyword.
If the method does do operations that use I/O (like a database call) then you have a couple of options and what you choose depends on the context your code is being used in.
When to use Asynchronous Operations
If the context can benefit from using asynchronous I/O operations, like the context of a web application, then your methods should return Task or Task<T>. The benefit of using asynchronous I/O is that threads are not locked waiting on I/O to complete which is important in applications that need to service many simultaneous requests that are thread resource intensive (like a web application). A windows forms or wpf application is also applicable because it allows for easy coding so the main thread can resume operations while I/O completes (the ui won't appear frozen).
When not to use Asynchronous Operations
If the context cannot benefit from using asynchronous I/O operations then make all your methods synchronous and do not return Task or Task<T>. An example of this would be a console application or a windows service application. You mentioned a Blazor web application in your question so this scenario does not appear to apply to you.
Provide both Asynchronous Operations and Synchronous Operations
If you want to allow the caller to choose then you can include 2 variations for each call that executes an I/O operation. A good example of this are the extensions written for Entity Framework. Almost all of the operations can be performed asynchronously or synchronously. Examples include Single/SingleAsync, All/AllAsync, ToArray/ToArrayAsync, etc.
This approach will also allow you to extend a public method later by adding an async overload if you extend the operation to include an I/O call at a future point in time. There are ways to do this without creating too much code duplication. See Async Programming - Brownfield Async Development by Stephen Cleary for various techniques. My favorite approach is the The Flag Argument Hack.
Side note: If you do not need to consume the results of an I/O call in your method and there is only one I/O method you do not need to use the async/await keywords. You can return the result of the I/O operation directly.

private void MyMethod()
{
var query = mydbcontext.mytable1.Where(t => ...);
foreach (var item in query) // Here you will retrieve your data from the DB.
{
...
}
// If you're going to count it here, don't use the query. You already retrieved
// the data for your foreach.
}
It would be better to retrieve the data before the foreach. That way you will not call the DB twice for 1. the foreach, 2. the count. I.E.
private async Task MyMethod()
{
var data = await mydbcontext.mytable1.Where(t => ...).ToListAsync();
foreach (var item in data)
{
...
}
// Don't call the DB here. You already have the data, so just count the items in
// the list.
var countresult = await data.Count;
}

Handling tasks in parallel

Consider an API that returns Tasks with some values.
I want to update the UI based on that values in parallel (when one of the values is ready I want to update it without waiting for the second one assuming the update of each value as its own update method).
public async Task MyFunc()
{
Task<First> firstTask = MyAPI.GetFirstValue();
Task<Second> secondTask = MyAPI.GetSecondValue();
UpdateFirstValueUI(await firstTask)
UpdateSecondValueUI(await secondTask)
}
the code example will wait for the first value, update the UI, wait for the second value and update the UI again.
What is the best practice for that scenario? I was wondering if ContinueWith is best practice because I mostly see it in legacy code (before there was async-await).
edit with a better example:
assuming we have two implementations of that API and the code looks like that
public async Task MyFunc()
{
Task<First> firstTask = null
Task<Second> secondTask = null
if (someCondition)
{
firstTask = MyAPI1.GetFirstValue();
secondTask = MyAPI1.GetSecondValue();
}
else
{
firstTask = MyAPI2.GetFirstValue();
secondTask = MyAPI2.GetSecondValue();
}
UpdateFirstValueUI(await firstTask)
UpdateSecondValueUI(await secondTask)
}
now as you see I don't want call the update methods in two different branches (assuming we split that method for each API after the branching)
so looking for a way to change only the update calls so they could happen in parallel

The ContinueWith is a primitive method that has some rare uses in library code, and should generally be avoided in application code. The main problem with using the ContinueWith in your case is that it's going to execute the continuation on a ThreadPool, which is not what you want, because your intention is to update the UI. And updating the UI from any other thread than the UI thread is a no no. It is possible to solve this¹ problem by configuring the ContinueWith with a suitable TaskScheduler, but it's much simpler to solve it with async/await composition. My suggestion is to add the Run method below in some static class in your project:
public static class UF // Useful Functions
{
public static async Task Run(Func<Task> action) => await action();
}
This method just invokes and awaits the supplied asynchronous delegate. You could use this method to combine your asynchronous API calls with their UI-updating continuations like this:
public async Task MyFunc()
{
Task<First> task1;
Task<Second> task2;
if (someCondition)
{
task1 = MyAPI1.GetFirstValueAsync();
task2 = MyAPI1.GetSecondValueAsync();
}
else
{
task1 = MyAPI2.GetFirstValueAsync();
task2 = MyAPI2.GetSecondValueAsync();
}
Task compositeTask1 = UF.Run(async () => UpdateFirstValueUI(await task1));
Task compositeTask2 = UF.Run(async () => UpdateSecondValueUI(await task2));
await Task.WhenAll(compositeTask1, compositeTask2);
}
This will ensure that the UI will be updated immediately after each asynchronous operation completes.
As a side note, if you have any suspicion that the MyAPI asynchronous methods may contain blocking code, you could offload them to the ThreadPool by using the Task.Run method, like this:
task1 = Task.Run(() => MyAPI1.GetFirstValueAsync());
For a thorough explanation about why this is a good idea, you can check out this answer.
The difference between the built-in Task.Run method and the custom UF.Run method presented above, is that the Task.Run invokes the asynchronous delegate on the ThreadPool, while the UF.Run invokes it on the current thread. If you have any idea about a better name than Run, please suggest. :-)
¹ The ContinueWith comes with a boatload of other problems as well, like wrapping errors in AggregateExceptions, making it easy to swallow exceptions by mistake, making it hard to propagate the IsCanceled status of the antecedent task, making it trivial to leak fire-and-forget tasks, requiring to Unwrap nested Task<Task>s created by async delegates etc.

What should Task.Run(Func<Task>) be used for? [duplicate]

I would like to ask you on your opinion about the correct architecture when to use Task.Run. I am experiencing laggy UI in our WPF .NET 4.5
application (with Caliburn Micro framework).
Basically I am doing (very simplified code snippets):
public class PageViewModel : IHandle<SomeMessage>
{
...
public async void Handle(SomeMessage message)
{
ShowLoadingAnimation();
// Makes UI very laggy, but still not dead
await this.contentLoader.LoadContentAsync();
HideLoadingAnimation();
}
}
public class ContentLoader
{
public async Task LoadContentAsync()
{
await DoCpuBoundWorkAsync();
await DoIoBoundWorkAsync();
await DoCpuBoundWorkAsync();
// I am not really sure what all I can consider as CPU bound as slowing down the UI
await DoSomeOtherWorkAsync();
}
}
From the articles/videos I read/saw, I know that await async is not necessarily running on a background thread and to start work in the background you need to wrap it with await Task.Run(async () => ... ). Using async await does not block the UI, but still it is running on the UI thread, so it is making it laggy.
Where is the best place to put Task.Run?
Should I just
Wrap the outer call because this is less threading work for .NET
, or should I wrap only CPU-bound methods internally running with Task.Run as this makes it reusable for other places? I am not sure here if starting work on background threads deep in core is a good idea.
Ad (1), the first solution would be like this:
public async void Handle(SomeMessage message)
{
ShowLoadingAnimation();
await Task.Run(async () => await this.contentLoader.LoadContentAsync());
HideLoadingAnimation();
}
// Other methods do not use Task.Run as everything regardless
// if I/O or CPU bound would now run in the background.
Ad (2), the second solution would be like this:
public async Task DoCpuBoundWorkAsync()
{
await Task.Run(() => {
// Do lot of work here
});
}
public async Task DoSomeOtherWorkAsync(
{
// I am not sure how to handle this methods -
// probably need to test one by one, if it is slowing down UI
}

Note the guidelines for performing work on a UI thread, collected on my blog:
Don't block the UI thread for more than 50ms at a time.
You can schedule ~100 continuations on the UI thread per second; 1000 is too much.
There are two techniques you should use:
1) Use ConfigureAwait(false) when you can.
E.g., await MyAsync().ConfigureAwait(false); instead of await MyAsync();.
ConfigureAwait(false) tells the await that you do not need to resume on the current context (in this case, "on the current context" means "on the UI thread"). However, for the rest of that async method (after the ConfigureAwait), you cannot do anything that assumes you're in the current context (e.g., update UI elements).
For more information, see my MSDN article Best Practices in Asynchronous Programming.
2) Use Task.Run to call CPU-bound methods.
You should use Task.Run, but not within any code you want to be reusable (i.e., library code). So you use Task.Run to call the method, not as part of the implementation of the method.
So purely CPU-bound work would look like this:
// Documentation: This method is CPU-bound.
void DoWork();
Which you would call using Task.Run:
await Task.Run(() => DoWork());
Methods that are a mixture of CPU-bound and I/O-bound should have an Async signature with documentation pointing out their CPU-bound nature:
// Documentation: This method is CPU-bound.
Task DoWorkAsync();
Which you would also call using Task.Run (since it is partially CPU-bound):
await Task.Run(() => DoWorkAsync());

One issue with your ContentLoader is that internally it operates sequentially. A better pattern is to parallelize the work and then sychronize at the end, so we get
public class PageViewModel : IHandle<SomeMessage>
{
...
public async void Handle(SomeMessage message)
{
ShowLoadingAnimation();
// makes UI very laggy, but still not dead
await this.contentLoader.LoadContentAsync();
HideLoadingAnimation();
}
}
public class ContentLoader
{
public async Task LoadContentAsync()
{
var tasks = new List<Task>();
tasks.Add(DoCpuBoundWorkAsync());
tasks.Add(DoIoBoundWorkAsync());
tasks.Add(DoCpuBoundWorkAsync());
tasks.Add(DoSomeOtherWorkAsync());
await Task.WhenAll(tasks).ConfigureAwait(false);
}
}
Obviously, this doesn't work if any of the tasks require data from other earlier tasks, but should give you better overall throughput for most scenarios.

Difference between calling an async method and Task.Run an async method

I have a method in my view model
private async void SyncData(SyncMessage syncMessage)
{
if (syncMessage.State == SyncState.SyncContacts)
{
this.SyncContacts();
}
}
private async Task SyncContacts()
{
foreach(var contact in this.AllContacts)
{
// do synchronous data analysis
}
// ...
// AddContacts is an async method
CloudInstance.AddContacts(contactsToUpload);
}
When I call SyncData from the UI commands and I'm syncing a large chunk of data UI freezes. But when I call SyncContacts with this approach
private void SyncData(SyncMessage syncMessage)
{
if (syncMessage.State == SyncState.SyncContacts)
{
Task.Run(() => this.SyncContacts());
}
}
Everything is fine. Should not they be the same?
I was thinking that not using await for calling an async method creates a new thread.

Should not they be the same? I was thinking that not using await for
calling an async method creates a new thread.
No, async does not magically allocate a new thread for it's method invocation. async-await is mainly about taking advantage of naturally asynchronous APIs, such as a network call to a database or a remote web-service.
When you use Task.Run, you explicitly use a thread-pool thread to execute your delegate. If you mark a method with the async keyword, but don't await anything internally, it will execute synchronously.
I'm not sure what your SyncContacts() method actually does (since you haven't provided it's implementation), but marking it async by itself will gain you nothing.
Edit:
Now that you've added the implementation, i see two things:
I'm not sure how CPU intensive is your synchronous data analysis, but it may be enough for the UI to get unresponsive.
You're not awaiting your asynchronous operation. It needs to look like this:
private async Task SyncDataAsync(SyncMessage syncMessage)
{
if (syncMessage.State == SyncState.SyncContacts)
{
await this.SyncContactsAsync();
}
}
private Task SyncContactsAsync()
{
foreach(var contact in this.AllContacts)
{
// do synchronous data analysis
}
// ...
// AddContacts is an async method
return CloudInstance.AddContactsAsync(contactsToUpload);
}

What your line Task.Run(() => this.SyncContacts()); really does is creating a new task starting it and returning it to the caller (which is not used for any further purposes in your case). That's the reason why it will do its work in the background and the UI will keep working. If you need to (a)wait for the task to complete, you could use await Task.Run(() => this.SyncContacts());. If you just want to ensure that SyncContacts has finished when you return your SyncData method, you could using the returning task and awaiting it at the end of your SyncData method. As it has been suggested in the comments: If you're not interested in whether the task has finished or not you just can return it.
However, Microsoft recommend to don't mix blocking code and async code and that async methods end with Async (https://msdn.microsoft.com/en-us/magazine/jj991977.aspx). Therefore, you should consider renaming your methods and don't mark methods with async, when you don't use the await keyword.

Just to clarify why the UI freezes - the work done in the tight foreach loop is likely CPU-bound and will block the original caller's thread until the loop completes.
So, irrespective of whether the Task returned from SyncContacts is awaited or not, the CPU bound work prior to calling AddContactsAsync will still occur synchronously on, and block, the caller's thread.
private Task SyncContacts()
{
foreach(var contact in this.AllContacts)
{
// ** CPU intensive work here.
}
// Will return immediately with a Task which will complete asynchronously
return CloudInstance.AddContactsAsync(contactsToUpload);
}
(Re : No why async / return await on SyncContacts- see Yuval's point - making the method async and awaiting the result would have been wasteful in this instance)
For a WPF project, it should be OK to use Task.Run to do the CPU bound work off the calling thread (but not so for MVC or WebAPI Asp.Net projects).
Also, assuming the contactsToUpload mapping work is thread-safe, and that your app has full usage of the user's resources, you could also consider parallelizing the mapping to reduce overall execution time:
var contactsToUpload = this.AllContacts
.AsParallel()
.Select(contact => MapToUploadContact(contact));
// or simpler, .Select(MapToUploadContact);

When would I use Task.Yield()?

I'm using async/await and Task a lot but have never been using Task.Yield() and to be honest even with all the explanations I do not understand why I would need this method.
Can somebody give a good example where Yield() is required?

When you use async/await, there is no guarantee that the method you call when you do await FooAsync() will actually run asynchronously. The internal implementation is free to return using a completely synchronous path.
If you're making an API where it's critical that you don't block and you run some code asynchronously, and there's a chance that the called method will run synchronously (effectively blocking), using await Task.Yield() will force your method to be asynchronous, and return control at that point. The rest of the code will execute at a later time (at which point, it still may run synchronously) on the current context.
This can also be useful if you make an asynchronous method that requires some "long running" initialization, ie:
private async void button_Click(object sender, EventArgs e)
{
await Task.Yield(); // Make us async right away
var data = ExecuteFooOnUIThread(); // This will run on the UI thread at some point later
await UseDataAsync(data);
}
Without the Task.Yield() call, the method will execute synchronously all the way up to the first call to await.

Internally, await Task.Yield() simply queues the continuation on either the current synchronization context or on a random pool thread, if SynchronizationContext.Current is null.
It is efficiently implemented as custom awaiter. A less efficient code producing the identical effect might be as simple as this:
var tcs = new TaskCompletionSource<bool>();
var sc = SynchronizationContext.Current;
if (sc != null)
sc.Post(_ => tcs.SetResult(true), null);
else
ThreadPool.QueueUserWorkItem(_ => tcs.SetResult(true));
await tcs.Task;
Task.Yield() can be used as a short-cut for some weird execution flow alterations. For example:
async Task DoDialogAsync()
{
var dialog = new Form();
Func<Task> showAsync = async () =>
{
await Task.Yield();
dialog.ShowDialog();
}
var dialogTask = showAsync();
await Task.Yield();
// now we're on the dialog's nested message loop started by dialog.ShowDialog
MessageBox.Show("The dialog is visible, click OK to close");
dialog.Close();
await dialogTask;
// we're back to the main message loop
}
That said, I can't think of any case where Task.Yield() cannot be replaced with Task.Factory.StartNew w/ proper task scheduler.
See also:
"await Task.Yield()" and its alternatives
Task.Yield - real usages?

One use of Task.Yield() is to prevent a stack overflow when doing async recursion. Task.Yield() prevents syncronous continuation. Note, however, that this can cause an OutOfMemory exception (as noted by Triynko). Endless recursion is still not safe and you're probably better off rewriting the recursion as a loop.
private static void Main()
{
RecursiveMethod().Wait();
}
private static async Task RecursiveMethod()
{
await Task.Delay(1);
//await Task.Yield(); // Uncomment this line to prevent stackoverlfow.
await RecursiveMethod();
}

Task.Yield() is like a counterpart of Thread.Yield() in async-await but with much more specific conditions. How many times do you even need Thread.Yield()? I will answer the title "when would you use Task.Yield()" broadly first. You would when the following conditions are fulfilled:
want to return the control to the async context (suggesting the task scheduler to execute other tasks in the queue first)
need to continue in the async context
prefer to continue immediately when the task scheduler is free
do not want to be cancelled
prefer shorter code
The term "async context" here means "SynchronizationContext first then TaskScheduler". It was used by Stephen Cleary.
Task.Yield() is approximately doing this (many posts get it slightly wrong here and there):
await Task.Factory.StartNew(
() => {},
CancellationToken.None,
TaskCreationOptions.PreferFairness,
SynchronizationContext.Current != null?
TaskScheduler.FromCurrentSynchronizationContext():
TaskScheduler.Current);
If any one of the conditions is broken, you need to use other alternatives instead.
If the continuation of a task should be in Task.DefaultScheduler, you normally use ConfigureAwait(false). On the contrary, Task.Yield() gives you an awaitable not having ConfigureAwait(bool). You need to use the approximated code with TaskScheduler.Default.
If Task.Yield() obstructs the queue, you need to restructure your code instead as explained by noseratio.
If you need the continuation to happen much later, say, in the order of millisecond, you would use Task.Delay.
If you want the task to be cancellable in the queue but do not want to check the cancellation token nor throw an exception yourself, you need to use the approximated code with a cancellation token.
Task.Yield() is so niche and easily dodged. I only have one imaginary example by mixing my experience. It is to solve an async dining philosopher problem constrained by a custom scheduler. In my multi-thread helper library InSync, it supports unordered acquisitions of async locks. It enqueues an async acquisition if the current one failed. The code is here. It needs ConfigureAwait(false) as a general purpose library so I need to use Task.Factory.StartNew. In a closed source project, my program needs to execute significant synchronous code mixed with async code with
a high thread priority for semi-realtime work
a low thread priority for some background work
a normal thread priority for UI
Thus, I need a custom scheduler. I could easily imagine some poor developers somehow need to mix sync and async code together with some special schedulers in a parallel universe (one universe probably does not contain such developers); but why wouldn't they just use the more robust approximated code so they do not need to write a lengthy comment to explain why and what it does?

Task.Yield() may be used in mock implementations of async methods.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.