parallel programming - I need some clarifications - c#

Okay , let me try to put it in sentences ...
Lets consider an example,
where I create an async method and call it with await keyword,
As far as my knowledge tells me,
The main thread will be released
In a separate thread, async method will start executing
Once it is executed, The pointer will resume from last position It left in main thread.
Question 1 : Will it come back to main thread or it will be a new thread ?
Question 2: Does it make any difference if the async method is CPU bound or network bound ? If yes, what ?
The important question
Question 3 : Assuming that is was a CPU bound method, What did I achieve? I mean - main thread was released, but at the same time, another thread was used from thread pool. what's the point ?

async does not start a new thread. Neither does await. I recommend you read my async intro post and follow up with the resources at the bottom.
async is not about parallel programming; it's about asynchronous programming. If you need parallel programming, then use the Task Parallel Library (e.g., PLINQ, Parallel, or - in very complex cases - raw Tasks).
For example, you could have an async method that does I/O-bound operations. There's no need for another thread in this scenario, and none will be created.
If you do have a CPU-bound method, then you can use Task.Run to create an awaitable Task that executes that method on a thread pool thread. For example, you could do something like await Task.Run(() => Parallel...); to treat some parallel processing as an asynchronous operation.

Execution of the caller and async method will be entirely on the current thread. async methods don't create a new thread and using async/await does not actually create additional threads. Instead, thread completions/callbacks are used with a synchronization context and suspending/giving control (think Node.js style programming). However, when control is issued to or returns to the await statement, it may end up being on a different completion thread (this depends on your application and some other factors).
Yes, it will run slower if it is CPU or Network bound. Thus the await will take longer.
The benefit is not in terms of threads believe it or not... Asynchronous programming does not necessarily mean multiple threads. The benefit is that you can continue doing other work that doesn't require the async result, before waiting for the async result... An example is a web server HTTP listener thread pool. If you have a pool of size 20 then your limit is 20 concurrent requests... If all of these requests spend 90% of their time waiting on database work, you could async/await the database work and the time during which you await the database result callback will be freed... The thread will return to the HTTP listener thread pool and another user can access your site while the original one waits for the DB work to be done, upping your total limit.
It's really about freeing up threads that wait on externally-bound and slow operations to do other things while those operations execute... Taking advantage of built-in thread pools.

Don't forget that the async part could be some long-running job, e.g. running a giant database query over the network, downloading a file from the internet, etc.

Related

Is it pointless to use Threads inside Tasks in C#?

I know the differences between a thread and a task., but I cannot understand if creating threads inside tasks is the same as creating only threads.
It depends on how you use the multithreaded capabilities and the asynchronous programming semantics of the language.
Simple facts first. Assume you have an initial, simple, single-threaded, and near empty application (that just reads a line of input with Console.ReadLine for simplicity sake). If you create a new Thread, then you've created it from within another thread, the main thread. Therefore, creating a thread from within a thread is a perfectly valid operation, and the starting point of any multithreaded application.
Now, a Task is not a thread per se, but it gets executed in one when you do Task.Run which is selected from a .NET managed thread pool. As such, if you create a new thread from within a task, you're essentially creating a thread from within a thread (same as above, no harm done). The caveat here is, that you don't have control of the thread or its lifetime, that is, you can't kill it, suspend it, resume it, etc., because you don't have a handle to that thread. If you want some unit of work done, and you don't care which thread does it, just that's it not the current one, then Task.Run is basically the way to go. With that said, you can always start a new thread from within a task, actually, you can even start a task from within a task, and here is some official documentation on unwrapping nested tasks.
Also, you can await inside a task, and create a new thread inside an async method if you want. However, the usability pattern for async and await is that you use them for I/O bound operations, these are operations that require little CPU time but can take long because they need to wait for something, such as network requests, and disk access. For responsive UI implementations, this technique is often used to prevent blocking of the UI by another operation.
As for being pointless or not, it's a use case scenario. I've faced situations where that could have been the solution, but found that redesigning my program logic so that if I need to use a thread from within a task, then what I do is to have two tasks instead of one task plus the inner thread, gave me a cleaner, and more readable code structure, but that it's just personal flair.
As a final note, here are some links to official documentation and another post regarding multithreaded programming in C#:
Async in Depth
Task based asynchronous programming
Chaining Tasks using Continuation Tasks
Start multiple async Tasks and process them as they complete
Should one use Task.Run within another Task
It depends how you use tasks and what your reason is for wanting another thread.
Task.Run
If you use Task.Run, the work will "run on the ThreadPool". It will be done on a different thread than the one you call it from. This is useful in a desktop application where you have a long-running processor-intensive operation that you just need to get off the UI thread.
The difference is that you don't have a handle to the thread, so you can't control that thread in any way (suspend, resume, kill, reuse, etc.). Essentially, you use Task.Run when you don't care which thread the work happens on, as long as it's not the current one.
So if you use Task.Run to start a task, there's nothing stopping you from starting a new thread within, if you know why you're doing it. You could pass the thread handle between tasks if you specifically want to reuse it for a specific purpose.
Async methods
Methods that use async and await are used for operations that use very little processing time, but have I/O operations - operations that require waiting. For example, network requests, read/writing local storage, etc. Using async and await means that the thread is free to do other things while you wait for a response. The benefits depend on the type of application:
Desktop app: The UI thread will be free to respond to user input while you wait for a response. I'm sure you've seen some programs that totally freeze while waiting for a response from something. This is what asynchronous programming helps you avoid.
Web app: The current thread will be freed up to do any other work required. This can include serving other incoming requests. The result is that your application can handle a bigger load than it could if you didn't use async and await.
There is nothing stopping you from starting a thread inside an async method too. You might want to move some processor-intensive work to another thread. But in that case you could use Task.Run too. So it all depends on why you want another thread.
It would be pointless in most cases of everyday programming.
There are situations where you would create threads.

Using Default ThreadPool SynchronisationContext in ASP.NET

Reading these two articles, SynchronisationContext and Async/Await. I'm very familiar with the way the ASP.NET SynchronisationContext handles the re-entry of async method execution back onto the request thread.
One of the big issues with re-entry is that if you use .Wait() or .Result you can cause a deadlock, why:
When the await completes, it attempts to execute the remainder of the async method within the captured context. But that context already has a thread in it, which is (synchronously) waiting for the async method to complete.
You get this because of the single thread that is being used by the SynchronisationContext and the way it permits only one chunk of code to run at a time.
What would be the implications of using Task.Factory.StartNew and assigning it a ThreadPool SynchronisationContext within an ASP.NET application?
I'm not bothered by why you would do this, rather more what would happen?
I know you are never supposed to Fire-and-Forget an async operation in ASP.NET because the thread may be cleared up before the async operation has completed. Would this also be true in the case of a ThreadPool SynchronisationContext.
What would be the implications of using Task.Factory.StartNew and assigning it a ThreadPool SynchronisationContext within an ASP.NET application?
There's no reason to use StartNew; Task.Run is superior in almost every use case. Also, you don't need to "assign" a thread pool SynchronizationContext, because thread pool threads have that naturally.
So, sure, you could do this:
await Task.Run(async () => { ... });
and all of your ... code will run on a thread pool thread (outside the request context), and resume on a thread pool thread (outside the request context).
This usage also doesn't have the problem of fire-and-forget, because our code is awaiting the result.
But I generally discourage this on ASP.NET, because think about what it's doing:
The code queues work (...) to the thread pool.
Then it asynchronously waits for that work to complete. During this time, the original request thread is returned to the thread pool.
When the work completes, the calling method continues in the request context.
So it causes an extra thread switch while giving you no benefit. It's possible to have a small amount of parallelism, but then you're talking about some potentially serious scalability limitations. Every time I've done parallelism on ASP.NET in production, I've ended up ripping it out.
I know you are never supposed to Fire-and-Forget an async operation in ASP.NET because the thread may be cleared up before the async operation has completed. Would this also be true in the case of a ThreadPool SynchronisationContext.
Actually, it's not because of a particular thread getting aborted. It's because the entire AppDomain is torn down. This includes aborting all threads.
So, yes, on ASP.NET, fire-and-forget using the thread pool is just as bad as any other kind of fire-and-forget:
Task.Run(() => ...); // bad!
At the very least, you should register the work with the ASP.NET runtime (which does not make the work reliable in the true sense of the word - it only minimizes the chance that the work will be aborted). Modern ASP.NET has HostingEnvironment.QueueBackgroundWorkItem for this.
Before ASP.NET added this method, I had a library that registered the work in a very similar way. I used Task.Run on purpose to run the registered work outside any request context and with a thread pool SynchronizationContext.

c# webapi: Await Task.Run vs more granualar await

I'm using async/await in WebApi Controllers according to this article:
https://msdn.microsoft.com/en-us/magazine/dn802603.aspx
Hava look at this simplified code in my controller:
DataBaseData = await Task.Run( () => GetDataFunction() );
GetDataFunction is a function that will open a database connection, open a reader and read the data from the database.
In many examples I see it handled differently. The GetDataFunction itself is awaited. And within the function every single step is awaited.
For example:
connection.OpenAsync
reader.ReadAsycnc
reader.IsDBNullAsync
Why is this good practice? Why not just start a thread for the whole database access (with Task.Run)?
Update:
Thanks for the help. Now I got it. I did not get, that the asynchronous Api's do not start threads themselves. This really helped: blog.stephencleary.com/2013/11/there-is-no-thread.html
The article you linked to states:
You can kick off some background work by awaiting Task.Run, but there’s no point in doing so. In fact, that will actually hurt your scalability by interfering with the ASP.NET thread pool heuristics... As a general rule, don’t queue work to the thread pool on ASP.NET.
In other words, avoid Task.Run on ASP.NET.
Why is this good practice? Why not just start a thread for the whole database access (with Task.Run)?
It's because the asynchronous APIs do not use other threads. The entire point of async/await is to free up the current thread; not use another thread. I have a blog post describing how async works without needing threads.
So, the Task.Run (or custom thread) approach will use up (block) a thread while getting data from the database. Proper asynchronous methods (e.g., EF6 or the ADO.NET asynchronous APIs) do not; they allow the request thread to be used for other requests while that request is waiting for the database response.
I assume you are wondering why using
await Task.Run(() => GetDataFunction());
instead of
await GetDataFunction();
As you can see in Task.Run(Func) under the section Remarks there is written:
The Run(Func) method is used by language compilers to support the async and await keywords. It is not intended to be called directly from user code.
That means, you should use await GetDataFunction.
It's all about resource management and sharing.
When using connection.Open() method the thread that calls it has to wait for the connection to actually open. While waiting it does nothing but consume the CPU slice allocated to it by the operating system. However, when you await connection.OpenAsync() the thread frees up its resources and the OS can redistribute them to other threads.
Now, there's a catch: to actually gain from asynchronous calls they should take longer than the time it takes for the OS to switch contexts otherwise it will incur a drop in application performance.
The practice of awaiting on connection.OpenAsync(), reader.ReadAsync() etc. is due to the fact that these are potentially long running operations. In a system where de database resides on another machine and it is under a heavy load of requests, connecting to the database and getting the results of a query will take some time. Instead of blocking the CPU while waiting on the results to arrive why not allocate that time slice to another worker thread that waits in the scheduler queue to render the response for another client?
Now, about starting another thread for data access: don't do that!. Each new thread gets allocated about 1MB of memory space so beside wasting CPU time while waiting for database operations to finish you'll be wasting memory also. Furthermore, having such a heavy memory footprint would require a lot more runs for Garbage Collector which will freeze all of your threads when running.
Using Task.Run() will schedule the data access operations on a thread from the ThreadPool but you will lose the benefit of sharing the CPU time with another thread while waiting for database server to respond.

Task.Factory.New vs Task.Run vs async/await, best way to not block UI thread

I need to make sure some of my code runs in the background and doesn't stop block the UI thread, and I thought I knew how to use Task.Factory.New, but as I read more and more, it seems that it doesn't always create a new thread but sometimes runs it on the UI thread if the thread pool thinks that's the way to go so now I am pretty confused over what to use to be safe that the method is not blocking the UI thread. I should also add that I don't need a return value from the method , I just need it to run asynchronously ( in the background ).
So what's the difference between these methods below and what is best to use to make sure it doesn't block the UI thread ?
And is it really true that Task.Factory.StartNew doesn't always create a thread that runs in the background and doesn't block the UI?
public Form1()
{
Task.Factory.StartNew(() => Thread.Sleep(1000));
Task.Run(() => Thread.Sleep(1000));
Task<int> longRunningTask = LongRunningOperationAsync();
}
public async Task<int> LongRunningOperationAsync() // assume we return an int from this long running operation
{
await Task.Delay(1000);
return 1;
}
I should also add that I don't need a return value from the method
Beware of a common problem: just because you don't need a return value doesn't mean you shouldn't await the returned task. awaiting the task provides you with exception information as well. The only time it's appropriate to ignore the returned task is when you don't care about the return value and you don't care about any exceptions the background work may raise (that is, you're OK with silently swallowing them).
In easily >99% of cases, the appropriate behavior is to await the task.
So what's the difference between these methods below
StartNew is an outdated way to run code on a thread pool thread. You should use Run instead of StartNew. I describe why in painful detail on my blog, but the short answer is that StartNew does not understand async delegates and has inappropriate default options.
async is completely different. Run executes code on a thread pool thread, but async rewrites the code so that it doesn't need a thread while there's an asynchronous operation in progress. So, the Run example will block a thread pool thread for 1000ms in the Thread.Sleep call. The async example will run on the UI thread and then not use the UI thread for 1000ms in the await Task.Delay call.
and what is best to use to make sure it doesn't block the UI thread?
Both Task.Run and async are appropriate in UI apps, depending on the nature of the background code.
If your code is asynchronous (usually all I/O-bound code), then you should use async/await. Note that imposing asynchrony "top-down" is hard. Don't think about "forcing code off the UI thread"; instead, identify the naturally-asynchronous parts of your system (DB, file, WebAPI calls, etc), change them to be async at the lowest level, and work up from there.
If your code is synchronous (CPU-bound), then you can use Task.Run if it takes too long to run on your UI thread.
And is it really true that Task.Factory.StartNew doesn't always create a thread that runs in the background and doesn't block the UI?
Run always executes code on threads from the thread pool. The rules for StartNew are much more complex, but in your example example code, if TaskScheduler.Current == TaskScheduler.Default, then StartNew will also use a thread from the thread pool.
The two examples you give are pretty much the same and depend of the context you are using them on.
Tasks are a way to describe a packet of work with meta information to know about its state, synchronize multiple tasks, cancel them etc. The main idea about tasks and asynchronicity is that you don't have to care about "where" (on which thread) this unit of work is executed, it is performed asynchronously, that's all you need to know. So no worries for your UI-Thread, as long as you are working asynchronously, it won't be locked (that's one of the main improvements since Win8 and the async/await keywords, every UI Manipulation has to be async not to bloat the UI).
As for async&await, those are in fact syntactic sugar. In many simple cases, you don't need to care about the task itself, you just want the asynchronous computation "do this and give me the answer when you have it".
int x = await ComputeXAsync(); //Returns a Task<int> which will be automatically unwrapped as an int
If you need to synchronize tasks, for example start multiple tasks in parallel and wait that all of them are done before continuing, you can't rely on the simple await Task<int> format, you then need to work with the task itself, its Status property and in this case the Task.WaitAll(task[]) API.
Edit: I really recommend the chapter of JonSkeets Book C# in Depth (last edition). The Async/Await part is really well explained and will take you way further than any reading online will do.
The first two are actually the same. see here
Well you can think of it as a task is something you want to be done.
A thread is something like a worker which does the task for you. So basically a threads helps you doing a task, but it doesn't return a value.
a task does not create its own OS thread. Instead, tasks are executed by a TaskScheduler; the default scheduler simply runs on the ThreadPool.
See Here
An example is. A remote request you can do as task, you want the job done but dont know when it will be. But a TCPListener for the server I would run as a thread, because it always needs to listen to the socket.
Like Andreas Niedermair pointed out, you can also use longliving tasks with the TaskCreationOptions

Last time. Is await working with threads?

I saw some similar questions before, just want to clarify it.
In this article, it is said "There is no thread" for async calls.
However, in another one, it is said
Here, however, we’re running the callback to update the Text of
textBox1on some arbitrary thread, wherever the Task Parallel Library
(TPL) implementation of ContinueWith happened to put it.
Also, in some cases, when i was calling ContinueWith in my project, i also got "cross-thread access exception.
So, who is right?
ANSWER: thanks to i3arnon. After reading first article more carefully, i found this place
So, we see that there was no thread while the request was in flight.
When the request completed, various threads were “borrowed” or had
work briefly queued to them. This work is usually on the order of a
millisecond or so (e.g., the APC running on the thread pool) down to a
microsecond or so (e.g., the ISR). But there is no thread that was
blocked, just waiting for that request to complete.
Both are. When you have code running on your CPU there's always a thread running it. The question is what happens when you don't have code to run, for example when you are waiting for an IO operation to complete.
If you use async await where you should there would be no thread idly waiting for that operation to complete, and only after it has completed a thread will be given (usually by the Thread Pool) to continue running code on your CPU.
When you don't use async-await (or a different asynchronous paradigm like Begin-End) you would hold a thread throughout the operation, even in the IO parts of it, which is a waste of resources.
It's important to add that although most asynchronous examples regard IO operations, that's not always the case. In some cases it's reasonable to treat a CPU bound operation (where you do hold a thread throughout the whole operation) asynchronously.

Categories

Resources