Suppose I have a simple class with an async method in it:
public class Writer
{
public Task WriteAsync(string message);
}
This is an internal class that is absolutely negligible for the application's business logic.
The main idea is that when the method is called, it must immediately return control to the caller, to avoid any possible delay in the important, business-logic-heavy calling method (a delay inside WriteAsync itself is possible, of course).
This method is called from many different places, very often, and we don't really care whether it succeeds or fails to write the last messages in some unexpected situation. That's fine.
So, the question is: how can I call WriteAsync so that it causes no possible delay in the calling method?
I thought about Task.Run(() => WriteAsync(message)) (without await; we don't need to wait for this!), but won't that fill my thread pool with a lot of useless work? And it's quite onerous to write such code everywhere...
You may queue the writes and process the queue, i.e. perform the writing, on a dedicated background thread. This is kind of what happens when you call Task.Run, i.e. you queue up delegates in the thread pool. If you require more control, you may for example use a BlockingCollection<T>.
There is an example of how to use a BlockingCollection<T> to read and write items concurrently available on MSDN.
Using this approach, calling WriteAsync will only block for the time it takes to add the message to the queue and this time should be negligible.
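Here is a minimal sketch of that approach, assuming a dedicated consumer task; the type and member names are illustrative, and the console write stands in for whatever the real write does:

    using System;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    // Minimal sketch (not production code): messages are queued on the caller's
    // thread and written by a single long-running consumer task.
    public class QueuedWriter
    {
        private readonly BlockingCollection<string> _queue = new BlockingCollection<string>();

        public QueuedWriter()
        {
            // Dedicated consumer; LongRunning hints that this should get its own thread.
            Task.Factory.StartNew(Consume, TaskCreationOptions.LongRunning);
        }

        // The caller only blocks for the time it takes to add the message to the queue.
        public void Write(string message) => _queue.Add(message);

        private void Consume()
        {
            // GetConsumingEnumerable blocks until an item is available or CompleteAdding is called.
            foreach (var message in _queue.GetConsumingEnumerable())
            {
                Console.WriteLine(message);   // stand-in for the real write
            }
        }

        public void Complete() => _queue.CompleteAdding();
    }

Messages that are still queued when the process dies are simply lost, which matches the "we don't care about the last messages" requirement in the question.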
Because the method is asynchronous, by definition it already returns control to the caller immediately. If the implementation of that method isn't actually asynchronous, then it should either stop returning a Task and drop Async from its name, making it clear to callers that it's synchronous, or it should fix the bug in its implementation that makes it block the caller for an extended period of time. Callers of the method will rightfully expect that, being an asynchronous method, it returns control immediately when called normally. If the method has a bug that makes it not do that, you shouldn't work around that bug by having callers treat it as a synchronous method when it claims it isn't.
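For illustration, assuming WriteAsync really is asynchronous, a fire-and-forget call is just a normal call whose task you don't await. The Processor class and the parameterless Writer constructor below are hypothetical; exceptions from the write go unobserved, which the question says is acceptable:

    public class Processor
    {
        private readonly Writer _writer = new Writer();   // hypothetical: assumes a parameterless constructor

        public void DoImportantWork(string message)
        {
            // Fire and forget: if WriteAsync is truly asynchronous, control comes back
            // immediately. The discard makes the intent explicit; any exception from
            // the write is unobserved.
            _ = _writer.WriteAsync(message);

            // ... business logic continues without waiting for the write ...
        }
    }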
Related
I just encountered this code. I immediately started cringing and talking to myself (not nice things). The thing is, I don't really understand why, and I can't reasonably articulate it. It just looks really bad to me - maybe I'm wrong.
public async Task<IHttpActionResult> ProcessAsync()
{
var userName = Username.LogonName(User.Identity.Name);
var user = await _user.GetUserAsync(userName);
ThreadPool.QueueUserWorkItem((arg) =>
{
Task.Run(() => _billing.ProcessAsync(user)).Wait();
});
return Ok();
}
This code looks to me like it's needlessly creating threads with ThreadPool.QueueUserWorkItem and Task.Run. Plus, it looks like it has the potential to deadlock or create serious resource issues when under heavy load. Am I correct?
The _billing.ProcessAsync() method is awaitable(async), so I would expect that a simple "await" keyword would be the right thing to do and not all this other baggage.
I believe Scott is correct with his guess that ThreadPool.QueueUserWorkItem should have been HostingEnvironment.QueueBackgroundWorkItem. The calls to Task.Run and Wait, however, are entirely nonsensical: they're pushing work to the thread pool and blocking a thread pool thread on it, when the code is already on the thread pool.
The _billing.ProcessAsync() method is awaitable(async), so I would expect that a simple "await" keyword would be the right thing to do and not all this other baggage.
I strongly agree.
However, this will change the behavior of the action. It will now wait until Billing.ProcessAsync is completed, whereas before it would return early. Note that returning early on ASP.NET is almost always a mistake - and I would say returning early from any "billing" processing is even more certainly a mistake. So, replacing this mess with await will make the app more correct, but it will cause the ProcessAsync action to take longer to return to the client.
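A sketch of both options, reusing the identifiers from the question's snippet (this is not a complete controller, the second method name is just for illustration, and the second option assumes ASP.NET 4.5.2+ for HostingEnvironment.QueueBackgroundWorkItem in System.Web.Hosting):

    // Option 1: await the billing work; the action now completes only after billing does.
    public async Task<IHttpActionResult> ProcessAsync()
    {
        var userName = Username.LogonName(User.Identity.Name);
        var user = await _user.GetUserAsync(userName);
        await _billing.ProcessAsync(user);
        return Ok();
    }

    // Option 2: if returning early really is intended, register the work with the host
    // so ASP.NET at least knows about it and can delay shutdown while it runs.
    public async Task<IHttpActionResult> ProcessAndReturnEarlyAsync()
    {
        var userName = Username.LogonName(User.Identity.Name);
        var user = await _user.GetUserAsync(userName);
        HostingEnvironment.QueueBackgroundWorkItem(ct => _billing.ProcessAsync(user));
        return Ok();
    }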
It's strange, but depending on what the author is trying to achieve, it seems ok to me to queue a work item in the thread pool from inside an async method.
This is not the same as starting a thread; it just queues an action to be executed on a ThreadPool thread when one becomes free. So the async method (ProcessAsync) can continue and doesn't need to care about the result.
The weird part is the code inside the lambda that gets enqueued on the ThreadPool. Not only is the Task.Run() superfluous (it just causes unnecessary overhead), but calling an async method without waiting for it to finish is a bad idea inside a method that the ThreadPool is supposed to run, because such a method returns control flow to its caller as soon as it awaits something.
So the ThreadPool eventually thinks the method is finished (and the thread is free for the next action in the queue), while the method actually wants to be resumed later.
This can lead to very unpredictable behaviour. The code may have been working (in certain circumstances), but I would not rely on it or use it in production code.
(The same goes for calling a non-awaited async method inside Task.Run(): the Task "thinks" it's finished while the method actually wants to be resumed later.)
As solution I'd propose to simply await that async method, too:
await _billing.ProcessAsync(user);
But of course, without any knowledge of the context of the code snippet, I can't guarantee anything. Note that this would change the behaviour: until now the code did not wait for _billing.ProcessAsync() to finish, but now it would. So maybe leaving out the await and just firing and forgetting
_billing.ProcessAsync(user);
may be good enough, too.
I am going to use this method in a load test, which means thousands of calls may happen very quickly from different threads. I am wondering whether I have to consider what happens on a subsequent call, where a new WebClient is created before the prior await has completed.
public static async Task<string> SendRequest(this string url)
{
using (var wc = new WebClient())
{
var bytes = await wc.DownloadDataTaskAsync(url);
using (var reader = new StreamReader(new MemoryStream(bytes)))
{
return await reader.ReadToEndAsync();
}
}
}
I use the term reentrant to describe the fact that this method will be called by one or more threads.
So we want to know what potential problems could arise from using this method in a multithreaded context, either through a single call in an environment that has multiple threads, or where multiple calls are being made from one or more threads.
The first thing to look at is what this method exposes externally. If we're designing this method, we can control what it does, but not what the callers do. We need to assume that callers can do anything with whatever they pass into our method, with the value we return, and with the type/object instance the method is called on. Let's look at each of these in turn.
The URL:
Obviously the caller can pass in an invalid URL, but that's not an issue that's specific to asynchrony or multithreading. They can't really do anything else with this parameter. They can't mutate the string from another thread after passing it to us, because string is immutable (or at least observably immutable externally).
The return value:
So at first glance, this may in fact appear to be a problem. We're returning an object instance (a Task); that object is being mutated by the method we're writing (to mark it as completed, faulted, or canceled) and it is also likely to be mutated by the caller of this method (to add continuations). It's also quite plausible for this Task to end up being mutated from multiple different threads (the task could be passed to any number of other threads, which could mutate it by adding continuations, or read values while we're mutating it).
Fortunately, Task was very specifically designed to support all of these situations, and it will function properly due to the synchronization that it performs internally. As authors of this method, we don't need to care who adds what continuations to our task, from what thread, whether or not different people are adding them at the same time, what order things happen in, whether continuations are added before or after we mark the task as completed, or any of that. While the task can be mutated externally, even from other threads, there's nothing that they could do that would be observable to us, from this method. Likewise, their continuations are going to function appropriately regardless of what we do. Their continuations will always fire some time after the task is marked as completed, or immediately if it was already completed. It doesn't have the possible race conditions that an event based model has of adding an event handler after the event is fired to signal completion.
Finally, we have the state of the type/instance.
This one is easy. It's a static method, so there are no instance fields that we could access even if we wanted to. There are also no static fields that this method accesses, so no state is shared between threads that way that we need to be concerned about.
Other than the string input and the Task output, the state that this method uses consists entirely of local variables that are never accessible outside of the method. Since the method does everything on a single thread (if there is a synchronization context), or at least does everything sequentially even when thread pool threads are used, we don't need to worry about any threading issues internally, only about what the caller could be doing externally.
When you're concerned about methods being called multiple times before previous calls have finished, the primary concern here is around access to fields. If the method was accessing instance/static fields, then one would need to consider the implications not only of a method being called with any given input state, but also with what's going on if other methods are accessing those fields at the same time. Since we access none, this is moot for this method.
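For contrast, here is a hypothetical variant that does touch shared static state. The counter is purely illustrative, but it shows where overlapping calls would start to matter and why synchronization (Interlocked here) becomes necessary; it assumes the question's SendRequest extension method is in scope:

    using System.Threading;
    using System.Threading.Tasks;

    public static class RequestCounter
    {
        // Shared static state: overlapping calls from different threads now interact here.
        private static int _inFlight;

        public static async Task<string> SendRequestCounted(string url)
        {
            // Interlocked is needed precisely because calls can overlap on different threads.
            Interlocked.Increment(ref _inFlight);
            try
            {
                return await url.SendRequest();   // the extension method from the question
            }
            finally
            {
                Interlocked.Decrement(ref _inFlight);
            }
        }

        public static int InFlight => Volatile.Read(ref _inFlight);
    }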
In Async/Await FAQ, Stephen Toub says:
An awaitable is any type that exposes a GetAwaiter method which returns a valid awaiter.
...
An awaiter is any type returned from an awaitable’s GetAwaiter method and that conforms to a particular pattern.
So in order to be an awaiter, a type should:
Implement the INotifyCompletion interface.
Provide a boolean property called IsCompleted.
Provide a parameterless GetResult method that returns void or TResult.
(I'm ignoring ICriticalNotifyCompletion for now.)
I know the page I mentioned has a sample that shows how the compiler translates await operations, but I'm still having a hard time understanding.
When I await an awaitable,
When is IsCompleted checked? Where should I set it?
When is OnCompleted called?
Which thread calls OnCompleted?
I saw examples of both directly invoking the continuation parameter of OnCompleted and using Task.Run(continuation) in different examples, which should I go for and why?
Why would you want a custom awaiter?
You can see the compiler's interpretation of await here. Essentially:
var temp = e.GetAwaiter();
if (!temp.IsCompleted)
{
    SAVE_STATE()              // pseudocode: capture locals into the state machine
    temp.OnCompleted(&cont);  // pass the resumption point as the continuation
    return;                   // give control back to the caller
cont:
    RESTORE_STATE()           // restore locals when the continuation runs
}
var i = temp.GetResult();     // get the result (or rethrow the captured exception)
Edit from comments: OnCompleted should schedule its argument as a continuation of the asynchronous operation.
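To make the pattern concrete, here is a minimal, illustrative awaitable/awaiter built on a Timer. It is not a replacement for Task.Delay; it exists only to show where IsCompleted, OnCompleted, and GetResult fit:

    using System;
    using System.Runtime.CompilerServices;
    using System.Threading;

    // Illustrative only: an awaitable that completes after a timer fires.
    public class TimerAwaitable
    {
        private readonly int _milliseconds;
        public TimerAwaitable(int milliseconds) { _milliseconds = milliseconds; }

        // The awaitable pattern: expose a GetAwaiter method returning a valid awaiter.
        public TimerAwaiter GetAwaiter() { return new TimerAwaiter(_milliseconds); }
    }

    public class TimerAwaiter : INotifyCompletion
    {
        private readonly int _milliseconds;
        private Timer _timer;

        public TimerAwaiter(int milliseconds) { _milliseconds = milliseconds; }

        // Checked first by the compiler-generated code; if true, the continuation
        // machinery is skipped entirely and GetResult is called straight away.
        public bool IsCompleted { get { return _milliseconds <= 0; } }

        // Called only when IsCompleted was false. It must arrange for `continuation`
        // to run once the operation finishes; here the timer's callback (on a thread
        // pool thread) simply invokes it directly.
        public void OnCompleted(Action continuation)
        {
            _timer = new Timer(_ => continuation(), null, _milliseconds, Timeout.Infinite);
        }

        // Called last, to retrieve the result (or rethrow a captured exception).
        public void GetResult()
        {
            if (_timer != null) _timer.Dispose();
        }
    }

    // Usage (inside an async method):  await new TimerAwaitable(500);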
In the vast majority of cases, you as a developer need not worry about this. Use the async and await keywords and the compiler and runtime handle all this for you.
To answer your questions:
When does the code check IsCompleted? Where should I set it?
The task should set IsCompleted when the task has finished doing what it was doing. For example, if the task was loading data from a file, IsCompleted should return true when the data is loaded and the caller can access it.
When does it call OnCompleted?
OnCompleted usually contains a delegate supplied by the caller to execute when the task has completed.
Does it call OnCompleted in parallel or should the code
inside OnCompleted be asynchronous?
The code in OnCompleted should be thread neutral (not care which thread it is called from). This may be problematic for updating COM objects in Single Threaded Apartments (like any UI classes in Metro/Windows8/Windows Store apps). It does not have to be asynchronous but may contain asynchronous code.
I saw examples of both directly invoking the continuation parameter of OnCompleted
and using Task.Run(continuation) in different examples, which should I go for and when?
Use async/await when you can. Otherwise, use Task.Run() or Task.Wait() because they follow the sequential programming model most people are used to. Using continuations may still be required, particularly in Metro apps where you have apartment issues.
I came across a piece of C# code like this today:
lock(obj)
{
// perform various operations
...
// send a message via a queue but in the same process, Post(yourData, callback)
messagingBus.Post(data, () =>
{
// perform operation
...
if(condition == true)
{
// perform a long running, out of process operation
operation.Perform();
}
});
}
My question is this: can the callback function ever be invoked in such a way as to cause the lock(obj) to not be released before operation.Perform() is called? i.e., is there a way that the callback function can be invoked on the same thread that is holding the lock, and before that thread has released the lock?
EDIT: messagingBus.Post(...) can be assumed to be an insert on to a queue, that then returns immediately. The callback is invoked on some other thread, probably from the thread pool.
For the operation.Perform() you can read it as Thread.Sleep(10000) - just something that runs for a long time and doesn't share or mutate any state.
I'm going to guess.
Post in .net generally implies that the work will be done by another thread or at another time.
So yes, it's not only possible that the lock on obj will be released before Perform is called, it's fairly likely it will happen. However, it's not guaranteed. Perform may complete before the lock is released.
That doesn't mean it's a problem. The "perform various actions" part may need the lock. messagingBus may need the lock to queue the action. The work inside may not need the lock at all, in which case the code is thread safe.
This is all a guess because there's no notion of what work is being done, why it must be inside a lock, or what Post or Perform does. So the code may be perfectly safe, or it may be horribly flawed.
Without knowing what messagingBus.Post is doing, you can't tell. If Post invokes the delegate it is given (the lambda expression in your example), then the lock will be in place while that lambda executes. If Post schedules that delegate for execution at a later time, then the lock will not be in place while the lambda executes. It's not clear what the lock(obj) is for - to serialize calls to messagingBus.Post, or something else. Detailing the type (including full namespace) of the messagingBus variable would go a long way toward providing better details.
If the callback executes asynchronously, then yes, the lock may still be held when Perform() runs, unless Post() does something specific to avoid that case (which would be unusual).
If the callback were scheduled on the same thread as the call to Post() (e.g. in the extreme example where the thread pool has only one thread), a typical thread pool implementation would not execute the callback until the thread finishes its current task, which in this case would require releasing the lock before executing Perform().
It's impossible to answer your question without knowing how messagingBus.Post is implemented. Async APIs typically provide no guarantee that the callback will be executed truly concurrently. For example, .NET APM methods such as FileStream.BeginRead may decide to perform the operation synchronously, in which case the callback will be executed on the same thread that called BeginRead. The returned IAsyncResult.CompletedSynchronously will be true in that case.
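A small illustration of that last point. The file name is a placeholder, and whether the read actually completes synchronously depends on the OS and the stream; the sketch just shows how to observe it:

    using System;
    using System.IO;
    using System.Threading;

    class ApmCallbackDemo
    {
        static void Main()
        {
            var buffer = new byte[16];
            using (var stream = new FileStream("example.dat", FileMode.OpenOrCreate,
                                               FileAccess.ReadWrite, FileShare.Read, 16, useAsync: true))
            {
                int callerThreadId = Thread.CurrentThread.ManagedThreadId;

                stream.BeginRead(buffer, 0, buffer.Length, ar =>
                {
                    int bytesRead = stream.EndRead(ar);   // always pair BeginRead with EndRead

                    // If the operation completed synchronously, this callback may have run
                    // on the caller's thread, before BeginRead even returned.
                    Console.WriteLine($"CompletedSynchronously: {ar.CompletedSynchronously}, " +
                                      $"callback on caller's thread: {Thread.CurrentThread.ManagedThreadId == callerThreadId}");
                }, null);

                Console.ReadLine();   // demo only: keep the stream alive for the callback
            }
        }
    }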
Reading this article I found several ways to call a method.
Method to call:
public static void SendData(string value) { }
Calls:
delegate void MyDelegate(string value);
//Slow method - NOT RECOMMENDED IN PRODUCTION!
SendData("Update");
// Fast method - STRONGLY RECOMMENDED FOR PRODUCTION!
MyDelegate d = new MyDelegate(SendData);
d.BeginInvoke("Update", null, null);
Is it true? Is it faster?
Action send = () => SendData("Update");
send();
Or maybe this?
I need to call a method inside a SQL CLR trigger with maximum performance, so even a small speed increase matters.
Which is "faster"?
1) Ask Bob to mow your lawn. Wait until he's done. Then go to the mall.
2) Ask Bob to mow your lawn. Go to the mall while he's mowing your lawn.
The second technique gets you to the mall a lot faster. The price you pay is that you have no idea whether the lawn is going to be mowed by the time you get home or not. With the first technique, you know that when you get home from the mall the lawn will be mowed because you waited until it was before you left in the first place. If your logic depends on knowing that the lawn is mowed by the time you get back then the second technique is wrong.
Now the important bit: Obviously neither technique gets your lawn mowed faster than the other. When you're asking "which is faster?" you have to indicate what operation you're measuring the speed of.
Using a delegate is no faster than directly calling the method (in all reality, creating a delegate and then calling it would be more expensive).
The reason that this is going to seem faster is because directly calling the method blocks the executing thread while the method runs. Your delegate example calls the method asynchronously (using BeginInvoke) so the calling thread continues to execute while the method is executed.
Also, whenever you have a call to BeginInvoke on a delegate you should also have the corresponding EndInvoke, which you're missing in your example (see the sketch after the links below):
Is EndInvoke() optional, sort-of optional, or definitely not optional?
and
IanG on Tap: EndInvoke Not Optional
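For completeness, a sketch of the properly paired calls. Note that delegate BeginInvoke/EndInvoke is only supported on the .NET Framework, not on .NET Core/.NET 5+:

    using System;

    public static class DelegateInvokeDemo
    {
        delegate void MyDelegate(string value);

        public static void SendData(string value) { }

        public static void CallWithEndInvoke()
        {
            MyDelegate d = SendData;

            // Kick the call off on a thread pool thread and keep going...
            IAsyncResult ar = d.BeginInvoke("Update", null, null);

            // ...other work here...

            // Always pair BeginInvoke with EndInvoke: it releases the resources held
            // by the async call and rethrows any exception that SendData produced.
            d.EndInvoke(ar);
        }
    }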
It's a placebo speed improvement from the point of view of when SendData returns to the caller. BeginInvoke will take a ThreadPool thread and start the method on that thread, then return to the caller immediately - the actual work happens on another thread. The time it takes to do that work remains the same regardless of the thread it's on. It might improve the responsiveness of your application, depending on the work, but delegates are not faster than direct method calls - as I say, in your situation it only seems faster because it returns immediately.
Try this: change BeginInvoke to Invoke - the caller is now blocking, the same as calling SendData normally.
Assuming the code comments are not yours (i.e., "RECOMMENDED FOR PRODUCTION"), I would quickly find the developer responsible and make sure they are aware of what Delegate.BeginInvoke does and the fact that they are making their app multi-threaded without realising it...
To answer the question, a direct method call is always the fastest way - delegates or reflection incur overhead.
Your best chance to increase performance would be to optimize the code in the method that will be in the SQL CLR stored procedure that the trigger will call. Could you post more information about that?
Note that in the article you cite, the author is talking about WCF calls, notably calls for inserting and updating a database.
The keys points to note in that specific case are:
The work is being done by another machine.
The only information you are getting back is "Success!" (usually) or (occasionally) "Failure" (which the author doesn't seem to care about)
Hence, in that specific case, the background call was better. For general-purpose use, direct calls are better.