Task Parallel Library consuming lots of space on production server - c#

I am using –
Task task = new Task(delegate { GetRecordsForEmailReplies(headingList, partialEntity); });
task.Start();
to run some heavy methods, but the problem is that it consumes a lot of
CPU on the server. Sometimes the IIS worker process rises above 60%,
which is why the server gets stuck.
Is there any way to manage this problem, or any other option to run these heavy methods without blocking the page load?

The suggested means for creating a new Task or Task<T> is to use Task.Factory.StartNew with .NET 4.0 or Task.Run with .NET 4.5. There is a detailed explanation from Stephen Toub here on the topic.
He explains that it's more efficient:
For example, we take a lot of care within TPL to make sure that when accessing tasks from multiple threads concurrently, the "right" thing happens. A Task is only ever executed once, and that means we need to ensure that multiple calls to a task's Start method from multiple threads concurrently will only result in the task being scheduled once.
Again, use Task.Run instead:
Task.Run(() => GetRecordsForEmailReplies(headingList, partialEntity));
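As a rough sketch of the two recommended forms side by side: Task.Run(action) is shorthand for Task.Factory.StartNew with CancellationToken.None, TaskCreationOptions.DenyChildAttach and TaskScheduler.Default. HeavyWork below is a hypothetical stand-in for a method like GetRecordsForEmailReplies, not code from the question.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class StartingTasks
{
    // Hypothetical stand-in for a heavy method such as GetRecordsForEmailReplies.
    public static int HeavyWork(int n)
    {
        int sum = 0;
        for (int i = 1; i <= n; i++) sum += i;
        return sum;
    }

    // .NET 4.5+: the short form.
    public static Task<int> StartWithRun(int n)
    {
        return Task.Run(() => HeavyWork(n));
    }

    // The long form that Task.Run is shorthand for. DenyChildAttach needs
    // .NET 4.5; on .NET 4.0 use TaskCreationOptions.None instead.
    public static Task<int> StartWithFactory(int n)
    {
        return Task.Factory.StartNew(
            () => HeavyWork(n),
            CancellationToken.None,
            TaskCreationOptions.DenyChildAttach,
            TaskScheduler.Default);
    }

    public static async Task Main()
    {
        int a = await StartWithRun(100);
        int b = await StartWithFactory(100);
        Console.WriteLine(a == b); // prints True: both compute the same sum
    }
}
```

Neither form reduces the CPU time the work itself needs; the choice only affects how the task is created and scheduled.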

Related

C# .NET choice of Multithreading approach

I've looked over multiple similar questions on SO, but I still couldn't answer my own question.
I have a console app (an Azure WebJob, actually) which does file processing and DB management. Some heavy data is downloaded from multiple sources and processed in the DB.
Here's an example of my code:
var dbLongIndpendentProcess = doProcesAsync();
var myfilesTasks = files.Select(file => Task.Run(
async () =>
{
// files processing
}));
// Task.WhenAll is a static method that takes the collection of tasks.
await Task.WhenAll(myfilesTasks);
await dbLongIndpendentProcess;
// continue with other stuff;
It all works fine and does what I am expecting it to do. There are other tasks running in this whole process, but I guess the idea is clear from the code above.
My question: Is this a fair way of approaching this, or would I get more performance (or sense?) by doing the good old "manual" multithreading? The main reason I chose this approach was that it's simple and straightforward.
However, wasn't async/await primarily aimed at doing asynchronous work so as not to block the main (UI) thread? Here I don't have any UI and I am not doing anything event-driven.
Thanks,
I don't think you're multithreading by using this approach (except for the Task.Run). async doesn't generally run things on separate threads; it only prevents things from blocking. See: https://msdn.microsoft.com/en-gb/library/mt674882.aspx#Anchor_5
The async and await keywords don't cause additional threads to be
created. Async methods don't require multithreading because an async
method doesn't run on its own thread. The method runs on the current
synchronization context and uses time on the thread only when the
method is active. You can use Task.Run to move CPU-bound work to a
background thread, but a background thread doesn't help with a process
that's just waiting for results to become available.
It would be much better to use tasks for the things you want to multithread, then you can take better advantage of machine cores and resources. You might want to look at a task based solution such as Pipelining (which may work in this scenario) etc...: https://msdn.microsoft.com/en-gb/library/ff963548.aspx or another alternative.
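The distinction in the quoted documentation can be sketched as follows: awaiting I/O needs no extra threads, while Task.Run is what moves CPU-bound work onto pool threads. This is only an illustration under assumptions; SimulatedDownloadAsync and Crunch are hypothetical stand-ins (Task.Delay plays the role of real I/O), not the asker's code.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class FileProcessing
{
    // Stand-in for asynchronous I/O (a download, a DB call): no thread is
    // held while the delay is pending.
    public static async Task<int> SimulatedDownloadAsync(int fileId)
    {
        await Task.Delay(50);
        return fileId * 10;
    }

    // Stand-in for CPU-bound processing of the downloaded data.
    public static int Crunch(int data)
    {
        int acc = 0;
        for (int i = 0; i < 1000; i++) acc = (acc + data * i) % 9973;
        return acc;
    }

    public static async Task<int[]> ProcessAllAsync(int[] fileIds)
    {
        // I/O-bound part: start all downloads, then await them together.
        int[] downloaded = await Task.WhenAll(
            fileIds.Select(id => SimulatedDownloadAsync(id)));

        // CPU-bound part: Task.Run pushes each computation to the thread
        // pool, so several cores can work at once.
        return await Task.WhenAll(
            downloaded.Select(d => Task.Run(() => Crunch(d))));
    }

    public static async Task Main()
    {
        int[] results = await ProcessAllAsync(new[] { 1, 2, 3, 4 });
        Console.WriteLine(results.Length);
    }
}
```

Task.WhenAll keeps results in the same order as the input tasks, so results[i] corresponds to fileIds[i].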

What am I losing between Task.Run(() => MyTask).Result, await MyTask and MyTask.Result?

I'm not sure if I'm even asking the question correctly, so bear with me; here's what I'm dealing with:
In my MVC4 project (targeting .NET 4.5.1), if I do await SomeAsyncMethod(...), then the task completes in the background but appears to never return. I believe this has something to do with the thread being returned to the pool and then resuming on a different thread. The workaround I've been using is Task.Run(() => SomeTask).Result;.
So, I find myself having to do Task.Run(() => SomeAsyncMethod).Result; a lot in my MVC projects lest I end up with deadlocks. Isn't this just another syntax for running the Task synchronously? I'm not sure if this is a limitation of MVC 4 (versus MVC 5) or if that's just how the API works. Am I essentially gaining nothing in terms of asynchronicity by doing this?
We've written a small library here where all of the operations are async Task<T> and it is in a separate assembly, so at least we can use it "properly" elsewhere (e.g. a Windows Phone app), but this MVC 4 project is a consumer of said library, and it feels like we're basically stepping around the benefits of async/await in order to avoid deadlocks, so I'm looking for help in seeing the bigger picture here. It would help to better understand what I'm gaining by consuming asynchronous tasks in a synchronous manner (if anything), what I'm losing, whether there's a solution that gives me back the ability to await these tasks without deadlocking, and whether or not the situation is different between MVC 4 and MVC 5+.
TIA
In my MVC4 project (targeting .NET 4.5.1), if I do await SomeAsyncMethod(...), then the task completes in the background but appears to never return.
This is almost certainly due to one of two things. Either:
Code further up the call stack is calling Result or Wait on a task. This will cause a deadlock in ASP.NET. The correct solution is to replace Result/Wait with await. I have more details on my blog.
The httpRuntime#targetFramework is not set to 4.5 or higher in your web.config. This is a common scenario for ASP.NET projects upgraded from an earlier version; you need to explicitly set this value for await to work correctly. There are more details in this blog post.
So, I find myself having to do Task.Run(() => SomeAsyncMethod).Result; a lot in my MVC projects lest I end up with deadlocks. Isn't this just another syntax for running the Task synchronously?
Pretty much. What actually happens is that SomeAsyncMethod is run on a thread pool thread and then the request thread is blocked until that method is complete.
Am I essentially gaining nothing in terms of asynchronicity by doing this?
Correct. In fact, you're netting a negative benefit.
The whole point of asynchrony on the server side is to increase scalability by freeing up the request threads whenever they aren't needed. The Task.Run(..).Result technique not only prevents the request thread from being freed, it also uses other threads to do the actual work. So it's worse than just doing it all synchronously.
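The two shapes being compared can be sketched outside ASP.NET. Note the assumptions: SomeAsyncMethod here is a hypothetical library method with Task.Delay standing in for real I/O, and a console app has no request SynchronizationContext, so nothing deadlocks in this sketch; it only shows which thread waits where.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncAllTheWay
{
    // Hypothetical library method; Task.Delay stands in for real async I/O.
    public static async Task<string> SomeAsyncMethod()
    {
        await Task.Delay(100);
        return "done";
    }

    // The workaround from the question: a pool thread starts the method
    // while the calling thread sits blocked in .Result.
    public static string CallBlocking()
    {
        return Task.Run(() => SomeAsyncMethod()).Result;
    }

    // The recommended shape: the caller is async too, so its thread is
    // released while the operation is in flight.
    public static async Task<string> CallAsync()
    {
        return await SomeAsyncMethod();
    }

    public static async Task Main()
    {
        Console.WriteLine(CallBlocking());
        Console.WriteLine(await CallAsync());
    }
}
```

In an MVC controller, the second shape corresponds to declaring the action as async Task<ActionResult> and awaiting the library call inside it.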
ASP.NET MVC 4 should be aware of the async keyword on Actions and should correctly handle resuming the request when using the await keyword. Are you sure the Action in question that uses await returns a Task and doesn't try to return T itself?
This is handled by ASP.NET MVC using a SynchronizationContext to ensure that the request is resumed correctly after awaiting even if it is on a different thread.
And yes, if you just call .Result, it's blocking the calling thread until the Task completes and you end up using (potentially) two threads for the same request without any benefit.
http://www.asp.net/mvc/overview/performance/using-asynchronous-methods-in-aspnet-mvc-4
In addition to @Stephen's answer, my 2 cents:
I believe this has something to do with the thread being returned to the pool and then resuming on a different thread.
Edit:
No, it all happens on a single thread (usually the UI thread; in the case of MVC, the thread allocated for the request). Async/await works on a message pump, which is an infinite loop running on a single thread. Each await puts a message on the message pump and checks whether it has finished.
The above does not exactly apply to ASP.NET. See @Jeff's comments below.
=========================================================
One rule of the async framework: if it's async, keep it async all the way.
Creating a synchronous wrapper over an async method often results in blocking the main thread, with the main thread and the task thread waiting for each other to respond.

Is my Task long running if it is fully async?

Consider such code:
private static async Task ProcessSomethingAsync()
{
while (true)
{
var message = await GetMessageAsync();
await WriteAsync(message);
}
}
Assume that the GetMessageAsync and WriteAsync methods use asynchronous I/O.
Imagine that I have several (from 2 to N) tasks like this, which live as long as the application lives.
In my opinion, since the code inside the loop is fully async, it is better not to use the LongRunning option when I start such tasks, so that we can leverage the ThreadPool instead of creating a thread per Task.
Is this correct or am I missing something?
it is better not to use LongRunning option when I start such tasks, so that we will be able to leverage ThreadPool instead of creating thread per Task.
When you're running async code, you should not specify LongRunning. If you do, then (as of today's implementation), the thread pool will start a new thread just to run the first part of your async code. As soon as your code yields at an await, that new thread will be disposed and the rest of the code will run on regular thread pool threads. So, LongRunning is usually counterproductive for async code.
I have a blog post on why StartNew is dangerous, and I (briefly) cover all of the TaskCreationOptions in that post:
AttachedToParent shouldn't be used in async tasks, so that's out. DenyChildAttach should always be used with async tasks (hint: if you didn't already know that, then StartNew isn't the tool you need). DenyChildAttach is passed by Task.Run. HideScheduler might be useful in some really obscure scheduling scenarios but in general should be avoided for async tasks. That only leaves LongRunning and PreferFairness, which are both optimization hints that should only be specified after application profiling. I often see LongRunning misused in particular. In the vast majority of situations, the threadpool will adjust to any long-running task in 0.5 seconds - without the LongRunning flag. Most likely, you don't really need it.
Yes, specifying LongRunning will potentially allow more threads to be created, because you are telling the scheduler that your task is going to hog a thread for a long time.
Async methods are exactly the opposite: they free up the thread to do other things instead of blocking it.
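A minimal sketch of starting several such loops without LongRunning, assuming Task.Run as the launcher and Task.Delay as a stand-in for the asynchronous I/O. The cancellation token, the counter, and the bodies of GetMessageAsync and WriteAsync are illustrative additions, not part of the question.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class MessageLoops
{
    private static int processed;

    public static int Processed { get { return Volatile.Read(ref processed); } }

    // Hypothetical stand-ins for GetMessageAsync and WriteAsync.
    private static async Task<int> GetMessageAsync(CancellationToken token)
    {
        await Task.Delay(10, token);
        return 1;
    }

    private static async Task WriteAsync(int message)
    {
        await Task.Delay(1);
        Interlocked.Add(ref processed, message);
    }

    private static async Task ProcessSomethingAsync(CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            var message = await GetMessageAsync(token);
            await WriteAsync(message);
        }
    }

    // Starts 'count' loops on the thread pool with Task.Run (no LongRunning)
    // and cancels them after 'runFor'.
    public static async Task RunLoopsAsync(int count, TimeSpan runFor)
    {
        using (var cts = new CancellationTokenSource(runFor))
        {
            CancellationToken token = cts.Token;
            var loops = new Task[count];
            for (int i = 0; i < count; i++)
                loops[i] = Task.Run(() => ProcessSomethingAsync(token));

            try { await Task.WhenAll(loops); }
            catch (OperationCanceledException) { /* expected on shutdown */ }
        }
    }

    public static async Task Main()
    {
        await RunLoopsAsync(3, TimeSpan.FromMilliseconds(200));
        Console.WriteLine("messages processed: " + Processed);
    }
}
```

Each loop occupies a pool thread only between awaits, so a handful of pool threads can serve many such loops.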

Blocking Methods within Task

I'm currently developing a small server application and getting to grips with Task<>, and other associated operations.
I'm wondering how Blocking operations work within a Task.
So for example, I currently use several Libraries with "blocking" operations. One of them is Npgsql (PostgreSQL provider.)
If I do the following...
Task myTask = new Task(() =>
{
using (var db = new PostgresqlDatabaseConnection())
{
db.ExecuteQuery("SELECT takes 50 ms to get data...");
db.Insert(anObject); // etc.
}
});
myTask.Start(); // Start() returns void, so it cannot be chained into the assignment
And say, chain it to a bunch of other tasks that process that data.
Is this efficient? I.e., let's say that ExecuteQuery calls some kind of Thread.Sleep(1) or somehow blocks the thread; is this going to affect my Task execution?
I'm asking because my server uses several libraries that would have to be rewritten to accommodate a totally asynchronous methodology. Or is this async enough?
My thoughts:
I'm really just not sure.
I know that if, for example, db.ExecuteQuery() just ran a while(true) loop until it got its data, it would almost certainly be blocking my server, because a lot of time would be spent processing while(true). Or is Task smart enough to know that it should spend less time on this? Or if internally it is using some waiting mechanism, does the Task library know? Does it know that it should be processing some other task while it waits?
I'm currently developing a small server application and getting to
grips with Task<>, and other associated operations.
You won't benefit from using new Task, Task.Factory.StartNew, or Task.Run in a server-side application, unless the number of concurrent client connections is really low. Check this and this for some more details.
You would, however, greatly benefit from using a naturally asynchronous API. Such APIs don't block a pool thread while "in-flight", so the thread is returned to the pool and can then get busy serving another client request. This improves your server app's scalability.
I'm not sure if PostgreSQL provides such an API; look for something like ExecuteQueryAsync or BeginExecuteQuery/EndExecuteQuery. If it doesn't have that, just use the synchronous ExecuteQuery method, but do not offload it to a pool thread as you do in your code fragment.
Using the async/await features of C# 5 will definitely make things easier. It can make asynchronous code easier to write, as you write it very similar to how you would write synchronous code.
Take the following example. I am using Thread.Sleep to simulate a long-running operation, so any libraries that don't support async natively can still be used via Task.Run. While the Thread.Sleep is holding up the pool thread, your UI is still responsive. If you had written this code synchronously, your UI would hold up for 1.5 seconds until the Thread.Sleep was finished.
private async void button1_Click(object sender, EventArgs e)
{
Console.WriteLine("0");
await DoWorkAsync();
Console.WriteLine("3");
}
private async Task DoWorkAsync()
{
Console.WriteLine("1");
await Task.Run(()=>
{
// Do your db work here.
Thread.Sleep(1500);
});
Console.WriteLine("2");
}
So in short, if you have long-running database operations and you want to keep your UI responsive, you should leverage async/await. While this does keep your UI responsive, it does introduce new challenges, such as what happens if the user clicks a button multiple times, or what happens if the user closes the window while you are still processing, to name some simple cases.
I encourage you to read further on the subject. Jon Skeet has a multi-part series on async. There are also numerous MSDN articles on the subject: 1 2 3 ...
Async programming does nothing to improve the efficiency of your logic or of the database. All it influences is the performance of switching between operations.
You cannot make a query or a computation faster wrapping it in a Task. You only add overhead.
Async IO is used on the server to achieve scalability to hundreds of concurrent requests. You don't need it here.

Should I notice a difference in using Task vs Threads in .Net 4.0?

I updated my code to use Tasks instead of threads....
Looking at memory usage and CPU, I do not notice any improvements on the multi-core PC. Is this expected?
My application essentially starts up threads/tasks in different objects when it runs...
All I'm doing is a simple
Task a = new Task(...)
a.Start();
There are various implications to using Tasks instead of Threads, but performance isn't a major one (assuming you weren't creating huge numbers of threads.) A few key differences:
The default TaskScheduler will use thread pooling, so some Tasks may not start until other pending Tasks have completed. If you use Thread directly, every use will start a new Thread.
When an exception occurs in a Task, it gets wrapped into an AggregateException that calling code can receive when it waits for the Task to complete or if you register a continuation on the Task. This is because you can also do things like wait on multiple Tasks to complete, in which case multiple exceptions can be thrown and aggregated.
If you don't observe an unhandled exception thrown by a Task, it will (well, may) eventually be thrown by the finalizer of the Task, which is particularly nasty. I always recommend hooking the TaskScheduler.UnobservedTaskException event so that you can at least log these failures before the application blows up. This is different from Thread exceptions, which show up in the AppDomain.UnhandledException event.
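The exception behaviour described above can be sketched like this. The class and method names are made up for illustration; the await case is an addition to what the answer states (await rethrows the first inner exception rather than the AggregateException wrapper).

```csharp
using System;
using System.Threading.Tasks;

public static class TaskExceptions
{
    public static Task Failing()
    {
        // Explicit Action avoids any overload ambiguity for a throw-only lambda.
        Action work = () => { throw new InvalidOperationException("boom"); };
        return Task.Run(work);
    }

    // Wait() and Result wrap failures in an AggregateException.
    public static string ObserveWithWait()
    {
        try
        {
            Failing().Wait();
            return "no exception";
        }
        catch (AggregateException ex)
        {
            return ex.InnerException.GetType().Name;
        }
    }

    // await rethrows the first inner exception directly.
    public static async Task<string> ObserveWithAwait()
    {
        try
        {
            await Failing();
            return "no exception";
        }
        catch (InvalidOperationException ex)
        {
            return ex.Message;
        }
    }

    public static async Task Main()
    {
        // Last-chance logging for task exceptions that nobody observed.
        TaskScheduler.UnobservedTaskException += (sender, e) =>
        {
            Console.Error.WriteLine("Unobserved: " + e.Exception);
            e.SetObserved();
        };

        Console.WriteLine(ObserveWithWait());        // InvalidOperationException
        Console.WriteLine(await ObserveWithAwait()); // boom
    }
}
```

The UnobservedTaskException handler only fires when a faulted task is garbage collected without its exception having been read, so it is a logging safety net rather than a replacement for try/catch.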
If you simply replaced every usage of Thread with Task and made no other changes, I would expect virtually the same performance. The Task API is really just that: an API over an existing set of constructs. Under the hood it uses threads to schedule its activities and hence has similar performance characteristics.
What's great about Task are the new things you can do with them
Composition with ContinueWith
Cancellation
Hierarchies
Etc ...
One great improvement of Tasks over Threads is that you can easily build chains of tasks. You can specify when a task should start after the previous task ("OnSuccess", "OnError", and so on), and you can specify whether there should be a synchronization context switch. That gives you the opportunity to run a long-running task in the background and, after that, a UI-refreshing task on the UI thread.
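A small sketch of such a chain, assuming the "OnSuccess"/"OnError" cases map to TaskContinuationOptions.OnlyOnRanToCompletion and OnlyOnFaulted. The helper name RunWithContinuations is invented for this example.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskChains
{
    // Runs 'work' and reports which of the two continuations fired.
    public static async Task<string> RunWithContinuations(Func<int> work)
    {
        Task<int> first = Task.Run(work);

        // Scheduled only if 'first' completes successfully.
        Task<string> onSuccess = first.ContinueWith(
            t => "success: " + t.Result,
            TaskContinuationOptions.OnlyOnRanToCompletion);

        // Scheduled only if 'first' throws.
        Task<string> onError = first.ContinueWith(
            t => "error: " + t.Exception.InnerException.Message,
            TaskContinuationOptions.OnlyOnFaulted);

        try
        {
            return await onSuccess;
        }
        catch (OperationCanceledException)
        {
            // onSuccess was canceled because 'first' faulted.
            return await onError;
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(await RunWithContinuations(() => 42));
        Console.WriteLine(await RunWithContinuations(
            () => { throw new InvalidOperationException("boom"); }));
    }
}
```

In a UI application, passing TaskScheduler.FromCurrentSynchronizationContext() as the scheduler argument of ContinueWith runs that continuation on the UI thread; it is left out here because a console app has no such context.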
If you are using .Net 4.0 then you can use the Parallel.Invoke method like so
Parallel.Invoke(()=> {
// Whatever code you add here may run in parallel with any other actions passed to Parallel.Invoke.
});
for more info see
http://msdn.microsoft.com/en-us/library/dd992634.aspx
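A slightly fuller sketch of Parallel.Invoke with several actions, using the overload that takes a ParallelOptions to cap how many run at once. The class name, the sleep durations, and the cap of 2 are arbitrary choices for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelInvokeDemo
{
    // Runs three independent actions, at most 'maxThreads' at a time.
    public static int RunAll(int maxThreads)
    {
        int completed = 0;
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxThreads };

        Parallel.Invoke(
            options,
            () => { Thread.Sleep(50); Interlocked.Increment(ref completed); },
            () => { Thread.Sleep(50); Interlocked.Increment(ref completed); },
            () => { Thread.Sleep(50); Interlocked.Increment(ref completed); });

        // Parallel.Invoke returns only after every action has finished.
        return completed;
    }

    public static void Main()
    {
        Console.WriteLine(RunAll(2)); // prints 3
    }
}
```

Parallel.Invoke blocks the calling thread until all actions complete, so it suits CPU-bound fan-out rather than work that should not hold up the caller.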
You would see a difference if your original or converted code does not utilize the CPU completely. For example, if the original code always limited the number of threads to 2, then on a quad-core machine it would run at about 50% load with manually created threads and potentially 100% load with tasks (if your tasks can actually be parallelized). So it looks like either your original code was reasonable from a performance point of view, or both implementations suffer from issues that show similar underutilization of the CPU.
