Creating a responsive version of a complex function [duplicate] - c#

I am working on a small project. I need to implement an algorithm that will usually use a lot of CPU and therefore take some time to execute and return. I want this method to stay responsive and report its progress. I might also want to run other work while these computations are in progress.
Consider this class that has the complex method
class Engine
{
public int ComplexMethod(int arg)
{
int result = 0;
for (int i = 0; i < 100000; i++)
{
for (int j = 0; j < 10000; j++)
{
// some complex and time-consuming computations
}
// it would be nice to get notified on arriving this point for example
}
return result;
}
}
What is the best approach to this case?
EDIT: I should have mentioned that it is an app with UI (a WPF application).

You can run the computation on a thread-pool thread using Task.Run, and use the IProgress<T> interface to report progress:
class Engine
{
public int ComplexMethod(int arg, IProgress<double> progress)
{
int result = 0;
for (int i = 0; i < 100000; i++)
{
for (int j = 0; j < 10000; j++)
{
// some complex and time-consuming computations
}
// Divide as double: i / 100000 is integer division and would always report 0.
progress.Report((i + 1) / 100000.0);
}
return result;
}
}
...
var progress = new Progress<double>(p => ShowProgress(p));
var result = await Task.Run(() => engine.ComplexMethod(arg, progress));
ShowResult(result);
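If you have a UI (which is likely), the progress delegate will automatically be called on the UI thread, provided that the Progress<T> instance was created on the UI thread. Progress<T> captures the current SynchronizationContext when it is constructed and posts each report to it, so in Windows Forms and WPF the call is marshalled to the UI thread asynchronously (in the manner of Control.BeginInvoke or Dispatcher.BeginInvoke, not a blocking Invoke).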
Note that async/await won't help (inside the method) if the computation is CPU-bound. However it can be used to make it easier to retrieve the result, as shown above. If for some reason you can't or don't want to use await, you can use ContinueWith, specifying TaskScheduler.FromCurrentSynchronizationContext for the scheduler parameter.
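The ContinueWith alternative mentioned above could look roughly like the following sketch. ContinuationSketch, RunThenContinue and the stand-in ComplexMethod are hypothetical names used only for illustration; the scheduler is a parameter so that a UI caller can pass TaskScheduler.FromCurrentSynchronizationContext().

```csharp
using System;
using System.Threading.Tasks;

static class ContinuationSketch
{
    // Hypothetical stand-in for Engine.ComplexMethod (the real one is CPU-heavy).
    public static int ComplexMethod(int arg)
    {
        return arg * 2;
    }

    // Runs the computation on the thread pool, then invokes onDone
    // with the result on whichever scheduler the caller supplies.
    public static Task RunThenContinue(int arg, Action<int> onDone, TaskScheduler scheduler)
    {
        return Task.Run(() => ComplexMethod(arg))
                   .ContinueWith(t => onDone(t.Result), scheduler);
    }
}
```

From a UI event handler this would be called as ContinuationSketch.RunThenContinue(arg, ShowResult, TaskScheduler.FromCurrentSynchronizationContext()). That call has to be made on the UI thread, because FromCurrentSynchronizationContext throws when no synchronization context is present.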

Assuming you are using .NET 4.5 (C# 5), you can use TPL (http://msdn.microsoft.com/en-us/library/dd997423(v=vs.110).aspx).
Without knowing your algorithm, all I can suggest is that you return a Task<int> instead of an int. This allows the function to be run easily in parallel with other tasks.
I would recommend the following:
public Task<int> ComplexMethodAsync(int arg)
{
// Return the task so that callers can await it.
return Task.Run(() => ComplexMethod(arg));
}
Now, when you run this method, ComplexMethod(arg) will be called on a separate thread from the ThreadPool. Call it with:
await ComplexMethodAsync(xyz);
Check out async/await for more information.

Related

Are there more tasks being performed than threads? What's happening?

I wrote a simple code. The machine has 32 threads. At the twentieth second, I see the number 54 in the console. This means that 54 tasks have started. Each task uses thread suspension. I don't understand why tasks continue to run if tasks have already been created and started in all possible threads and the thread suspension code is running in each task.
What's going on, how does it work?
void MyMethod(int i)
{
Console.WriteLine(i);
Thread.Sleep(int.MaxValue);
}
Console.WriteLine(Environment.ProcessorCount);
for (int i = 0; i < int.MaxValue; i++)
{
Thread.Sleep(50);
int j = i;
Task.Run(() => MyMethod(j));
}
And why does this code create so many tasks? (Environment.ProcessorCount => 32)
using System.Net;
void MyMethod(int i)
{
Console.WriteLine(WebRequest.Create("https://192.168.1.1").GetResponse().ContentLength);
}
for (int i = 0; i < Environment.ProcessorCount; i++)
{
int j = i;
Task.Run(() => MyMethod(j));
}
Thread.Sleep(int.MaxValue);
The Task.Run method runs the code on the ThreadPool, and the ThreadPool creates more threads when it becomes saturated. Initially it creates immediately on demand as many threads as the number of cores. After that point it is said to be saturated, and creates one new thread every second until the demand is satisfied. This rate is not documented. It is found by experimentation on .NET 6, and might change in future .NET versions.
You are able to control the saturation threshold with the ThreadPool.SetMinThreads method. For example ThreadPool.SetMinThreads(100, 100). If you give it too large values, this method does nothing and returns false.
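As a small sketch of the SetMinThreads call described above: the helper below reads the current minimums and raises only the worker-thread minimum. MinThreadsSketch and RaiseWorkerMinimum are hypothetical names, and the amount to raise by is up to the caller.

```csharp
using System;
using System.Threading;

static class MinThreadsSketch
{
    // Raises the worker-thread minimum by 'extra' and leaves the I/O minimum unchanged.
    // Returns what SetMinThreads returns: false means the request was rejected
    // and the minimums were not changed.
    public static bool RaiseWorkerMinimum(int extra)
    {
        ThreadPool.GetMinThreads(out int worker, out int io);
        return ThreadPool.SetMinThreads(worker + extra, io);
    }
}
```

Checking the return value matters, because a rejected request is otherwise silent.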

Why a simple await Task.Delay(1) enables parallel execution?

I would like to ask a simple question about the code below:
static void Main(string[] args)
{
MainAsync()
//.Wait();
.GetAwaiter().GetResult();
}
static async Task MainAsync()
{
Console.WriteLine("Hello World!");
Task<int> a = Calc(18000);
Task<int> b = Calc(18000);
Task<int> c = Calc(18000);
await a;
await b;
await c;
Console.WriteLine(a.Result);
}
static async Task<int> Calc(int a)
{
//await Task.Delay(1);
Console.WriteLine("Calc started");
int result = 0;
for (int k = 0; k < a; ++k)
{
for (int l = 0; l < a; ++l)
{
result += l;
}
}
return result;
}
This example runs the Calc calls synchronously. When the line //await Task.Delay(1); is uncommented, the Calc calls execute in parallel.
The question is: why does adding a simple await make the Calc function asynchronous?
I know about the async/await pair requirements. I'm asking what really happens when a simple await Task.Delay is added at the beginning of the function. The whole Calc function then seems to run on another thread, but why?
Edit 1:
When I added a thread checking to code:
static async Task<int> Calc(int a)
{
await Task.Delay(1);
Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
int result = 0;
for (int k = 0; k < a; ++k)
{
for (int l = 0; l < a; ++l)
{
result += l;
}
}
return result;
}
it is possible to see (in the console) different thread IDs. If the await Task.Delay line is deleted, the thread ID is always the same for all runs of the Calc function. In my opinion this proves that the code after await is (or can be) run on different threads. And that is the reason why the code is faster (in my opinion, of course).
It's important to understand how async methods work.
First, they start running synchronously, on the same thread, just like every other method. Without that await Task.Delay(1) line, the compiler will have warned you that the method would be completely synchronous. The async keyword doesn't, by itself, make your method asynchronous. It just enables the use of await.
The magic happens at the first await that acts on an incomplete Task. At that point the method returns. It returns a Task that you can use to check when the rest of the method has completed.
So when you have await Task.Delay(1) there, the method returns at that line, allowing your MainAsync method to move to the next line and start the next call to Calc.
How the continuation of Calc runs (everything after await Task.Delay(1)) depends on if there is a "synchronization context". In ASP.NET (not Core) or a UI application, for example, the synchronization context controls how the continuations run and they would run one after the other. In a UI app, it would be on the same thread it started from. In ASP.NET, it may be a different thread, but still one after the other. So in either case, you would not see any parallelism.
However, because this is a console app, which does not have a synchronization context, the continuations happen on any ThreadPool thread as soon as the Task from Task.Delay(1) completes. That means each continuation can happen in parallel.
Also worth noting: starting with C# 7.1 you can make your Main method async, eliminating the need for your MainAsync method:
static async Task Main(string[] args)
An async function returns the incomplete task to the caller at its first incomplete await. After that the await on the calling side will await that task to become complete.
Without the await Task.Delay(1), Calc() does not have any awaits of its own, so will only return to the caller when it runs to the end. At this point the returned Task is already complete, so the await on the calling site immediately uses the result without actually invoking the async machinery.
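The difference between the two cases can be observed directly by checking IsCompleted on the returned task. This is a minimal sketch with hypothetical names (FirstAwaitSketch, NoAwait, WithAwait, Observe); the 200 ms delay is an arbitrary value chosen so that the second task is still pending when it is inspected.

```csharp
using System;
using System.Threading.Tasks;

static class FirstAwaitSketch
{
#pragma warning disable CS1998 // async method without await: intentional here
    // No await inside: the body runs to the end before the caller gets the task.
    public static async Task<int> NoAwait()
    {
        return 1;
    }
#pragma warning restore CS1998

    // Returns to the caller at the first incomplete await.
    public static async Task<int> WithAwait()
    {
        await Task.Delay(200);
        return 1;
    }

    // Item1: was the task without await already complete on return?
    // Item2: was the task with await already complete on return?
    public static (bool, bool) Observe()
    {
        Task<int> finished = NoAwait();
        Task<int> pending = WithAwait();
        return (finished.IsCompleted, pending.IsCompleted);
    }
}
```

The first flag is true because the method never gave control back before finishing; the second should be false because the method returned at its await while the delay was still running.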
Layman's version:
Without the delay, nothing in the method yields the thread, so nothing else gets to run on it. You are confusing this with multithreaded code. async and await are not about multithreading; they are about letting a thread do other things while an operation that needs no CPU, such as writing to disk, is in progress. Writing to disk does not need the thread (CPU). So when something is async, it can free the thread to be used for something else instead of having it wait for the non-CPU (I/O) task to complete.
@sunside is saying the same thing, just more technically.
static async Task<int> Calc(int a)
{
//faking a asynchronous .... this will give this thread to something else
// until done then return here...
// does not make sense... as your making this take longer for no gain.
await Task.Delay(1);
Console.WriteLine("Calc started");
int result = 0;
for (int k = 0; k < a; ++k)
{
for (int l = 0; l < a; ++l)
{
result += l;
}
}
return result;
}
vs
static async Task<int> Calc(int a)
{
using (var reader = File.OpenText("Words.txt"))
{
//real asynchronous .... this will give this thread to something else
var fileText = await reader.ReadToEndAsync();
// Do something with fileText...
}
Console.WriteLine("Calc started");
int result = 0;
for (int k = 0; k < a; ++k)
{
for (int l = 0; l < a; ++l)
{
result += l;
}
}
return result;
}
The reason it looks like it runs in a "parallel way" is that it just gives the other tasks CPU time.
Example 1, without the delay:
await a; do this (no actual async work)
await b; do this (no actual async work)
await c; do this (no actual async work)
Example 2, with the delay:
await a; start this then pause this (fake async), start b but come back and finish a
await b; start this then pause this (fake async), start c but come back and finish b
await c; start this then pause this (fake async), come back and finish c
What you should find is that although more is started sooner, the overall time will be longer, as it has to jump between tasks for no benefit with a faked asynchronous task. Whereas if the await Task.Delay(1) were a real async operation, i.e. asynchronous in nature, the benefit would be that other work can start on the thread that would otherwise have been blocked while it waits for something that does not require the thread.
Update: silly code to show that it is slower. Make sure you are in "Release" mode, and always ignore the first run. These tests are silly, and you would need to use https://github.com/dotnet/BenchmarkDotNet to really see the difference.
static void Main(string[] args)
{
Console.WriteLine("Exmaple1 - no Delay, expecting it to be faster, shorter times on average");
for (int i = 0; i < 10; i++)
{
Exmaple1().GetAwaiter().GetResult();
}
Console.WriteLine("Exmaple2- with Delay, expecting it to be slower,longer times on average");
for (int i = 0; i < 10; i++)
{
Exmaple2().GetAwaiter().GetResult();
}
}
static async Task Exmaple1()
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
Task<int> a = Calc1(18000); await a;
Task<int> b = Calc1(18000); await b;
Task<int> c = Calc1(18000); await c;
stopwatch.Stop();
Console.WriteLine("Time elapsed: {0}", stopwatch.Elapsed);
}
static async Task<int> Calc1(int a)
{
int result = 0;
for (int k = 0; k < a; ++k) { for (int l = 0; l < a; ++l) { result += l; } }
return result;
}
static async Task Exmaple2()
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
Task<int> a = Calc2(18000); await a;
Task<int> b = Calc2(18000); await b;
Task<int> c = Calc2(18000); await c;
stopwatch.Stop();
Console.WriteLine("Time elapsed: {0}", stopwatch.Elapsed);
}
static async Task<int> Calc2(int a)
{
await Task.Delay(1);
int result = 0;
for (int k = 0; k < a; ++k){for (int l = 0; l < a; ++l) { result += l; } }
return result;
}
By using an async/await pattern, you intend for your Calc method to run as a task:
Task<int> a = Calc(18000);
We need to establish that tasks are generally asynchronous in their nature, but not parallel - parallelism is a feature of threads. However, under the hood, some thread will be used to execute your tasks on. Depending on the context you're running your code in, multiple tasks may be executed in parallel or sequentially - but they will be (allowed to be) asynchronous.
One good way of internalizing this is picturing a teacher (the thread) answering questions (the tasks) in class. A teacher will never be able to answer two different questions simultaneously - i.e. in parallel - but she will be able to answer questions of multiple students, and can also be interrupted with new questions in between.
Specifically, async/await is a cooperative multitasking feature (emphasis on "cooperative") where tasks only get to be scheduled onto a thread if that thread is free - and if some task is already running on that thread, it needs to manually give way. (Whether and how many threads are available for execution is, in turn, dependent on the environment your code is executing in.)
When running Task.Delay(1) the code announces that it is intending to sleep, which signals to the scheduler that another task may execute, thereby enabling asynchronicity. The way it's used in your code it is, in essence, a slightly worse version of Task.Yield (you can find more details about that in the comments below).
Generally speaking, whenever you await, the method currently being executed is "returned" from, marking the code after it as a continuation. This continuation will only be executed when the task scheduler selects the current task for execution again. This may happen if no other task is currently executing and waiting to be executed - e.g. because they all yielded or await something "long-running".
In your example, the Calc method yields due to the Task.Delay and execution returns to the Main method. This, in turn enters the next Calc method and the pattern repeats. As was established in other answers, these continuations may or may not execute on different threads, depending on the environment - without a synchronization context (e.g. in a console application), it may happen. To be clear, this is neither a feature of Task.Delay nor async/await, but of the configuration your code executes in. If you require parallelism, use proper threads or ensure that your tasks are started such that they encourage use of multiple threads.
On another note: whenever you intend to run synchronous code in an asynchronous manner, use Task.Run() to execute it. This will make sure it doesn't get in your way too much by always using a background thread. This SO answer on LongRunning tasks might be insightful.
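Applied to the Calc example from the question, the Task.Run advice might look like the sketch below. ParallelCalcSketch and RunThreeAsync are hypothetical names. The accumulator is a long here, which is a deliberate change from the question's code: with a = 18000 the sum is larger than an int can hold, so the original int silently wraps around.

```csharp
using System;
using System.Threading.Tasks;

static class ParallelCalcSketch
{
    // Plain synchronous CPU-bound work; no async keyword needed.
    public static long Calc(int a)
    {
        long result = 0;
        for (int k = 0; k < a; ++k)
        {
            for (int l = 0; l < a; ++l)
            {
                result += l;
            }
        }
        return result;
    }

    // Queues three independent calls on the thread pool and
    // completes when all of them have finished.
    public static Task<long[]> RunThreeAsync(int a)
    {
        return Task.WhenAll(
            Task.Run(() => Calc(a)),
            Task.Run(() => Calc(a)),
            Task.Run(() => Calc(a)));
    }
}
```

A caller would write long[] results = await ParallelCalcSketch.RunThreeAsync(18000); and the three computations can then run on separate thread-pool threads regardless of whether a synchronization context is present.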

Asynchronous Tasks take too much time

I have been trying to make an asynchronous version of my CPU-bound function, which computes some aggregate functions. The problem is that there seems to be a deadlock (I suppose), because the calculation times vary too much. I am really a newbie in this Task Parallel world. I have also read Stephen Cleary's articles, but I am still unsure about all aspects of this asynchronous approach.
My Code:
private static void Main(string[] args)
{
PIServer server = ConnectToDefaultPIServer();
AFTimeRange timeRange = new AFTimeRange("1/1/2012", "6/30/2012");
Program p = new Program();
for (int i = 0; i < 10; i++)
{
p.TestAsynchronousCall(server, timeRange);
//p.TestAsynchronousCall(server, timeRange).Wait();-same results
}
Console.WriteLine("Main check-disconnected done");
Console.ReadKey();
}
private async Task TestAsynchronousCall(PIServer server, AFTimeRange timeRange)
{
AsyncClass asyn;
for (int i = 0; i < 1; i++)
{
asyn = new AsyncClass();
await asyn.DoAsyncTask(server, timeRange);
//asyn.DoAsyncTask(server, timeRange);-same results
}
}
public async Task DoAsyncTask(PIServer server, AFTimeRange timeRange)
{
var timeRanges = DivideTheTimeRange(timeRange);
Task<Dictionary<PIPoint, AFValues>>[] tasksArray = new Task<Dictionary<PIPoint, AFValues>>[2];
tasksArray[0] = (Task.Run(() => CalculationClass.AverageValueOfTagPerDay(server, timeRanges[0])));
// tasksArray[1] = tasksArray[0].ContinueWith((x) => CalculationClass.AverageValueOfTagPerDay(server, timeRanges[1]));
tasksArray[1] = (Task.Run(() => CalculationClass.AverageValueOfTagPerDay(server, timeRanges[1])));
Task.WaitAll(tasksArray);
//await Task.WhenAll(tasksArray); -same results
for (int i = 0; i < tasksArray.Length; i++)
{
Program.Show(tasksArray[i].Result);
}
}
I measure the time with a Stopwatch inside the AverageValueOfTagPerDay function. This function is synchronous (is that a problem?). Each task takes 12 seconds. But when I uncomment the line and use the ContinueWith() approach, these tasks take 5-6 seconds each (which is desirable). How is that possible?
Stranger still, when I set the for loop in Main() to 10 iterations, it sometimes takes 5 seconds, the same as when I use ContinueWith(). So I guess there is a deadlock somewhere, but I am unable to find it.
Sorry for my English; I still have trouble writing good sentences when I try to explain difficulties.
I have been trying to make an asynchronous version of my CPU-bound function, which computes some aggregate functions.
"Asynchronous" and "CPU-bound" are not terms that go together. If you have a CPU-bound process, then you should use parallel technologies (Parallel, Parallel LINQ, TPL Dataflow).
I am really a newbie in this Task Parallel world. I have also read Stephen Cleary's articles, but I am still unsure about all aspects of this asynchronous approach.
Possibly because I do not cover parallel technologies in any of my articles or blog posts. :) I do cover them in my book, but not online. My online work focuses on asynchrony, which is ideal for I/O-based operations.
To solve your problem, you should use a parallel approach:
public Dictionary<PIPoint, AFValues>[] DoTask(PIServer server, AFTimeRange timeRange)
{
var timeRanges = DivideTheTimeRange(timeRange);
var result = timeRanges.AsParallel().AsOrdered().
Select(range => CalculationClass.AverageValueOfTagPerDay(server, range)).
ToArray();
return result;
}
Of course, this approach assumes that PIServer is threadsafe. It also assumes that there's no I/O being done by the "server" class; if there is, then TPL Dataflow may be a better choice than Parallel LINQ.
If you are planning to use this code in a UI application and don't want to block the UI thread, then you can call the code asynchronously like this:
var results = await Task.Run(() => DoTask(server, timeRange));
foreach (var result in results)
Program.Show(result);
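The PLINQ pattern used in DoTask can be tried in isolation, without the PI server types. In this sketch OrderedPlinqSketch and MapInParallel are hypothetical names; it shows only that AsOrdered() keeps the results in the same order as the inputs even though the work items may run concurrently.

```csharp
using System;
using System.Linq;

static class OrderedPlinqSketch
{
    // Applies 'work' to every input, possibly in parallel,
    // and returns the results in input order.
    public static TResult[] MapInParallel<T, TResult>(T[] inputs, Func<T, TResult> work)
    {
        return inputs.AsParallel().AsOrdered().Select(work).ToArray();
    }
}
```

For example, OrderedPlinqSketch.MapInParallel(new[] { 1, 2, 3, 4, 5 }, x => x * x) yields the squares in the order 1, 4, 9, 16, 25. As with the original answer, the delegate passed in has to be safe to call from several threads at once.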

Difference between delegate.BeginInvoke and using ThreadPool threads in C#

In C# is there any difference between using a delegate to do some work asynchronously (calling BeginInvoke()) and using a ThreadPool thread as shown below
public void asynchronousWork(object num)
{
//asynchronous work to be done
Console.WriteLine(num);
}
public void test()
{
Action<object> myCustomDelegate = this.asynchronousWork;
int x = 7;
//Using Delegate
myCustomDelegate.BeginInvoke(7, null, null);
//Using Threadpool
ThreadPool.QueueUserWorkItem(new WaitCallback(asynchronousWork), 7);
Thread.Sleep(2000);
}
Edit:
BeginInvoke makes sure that a thread from the thread pool is used to execute the asynchronous code , so is there any difference?
Joe Duffy, in his Concurrent Programming on Windows book (page 418), says this about Delegate.BeginInvoke:
All delegate types, by convention offer a BeginInvoke and EndInvoke method alongside the ordinary synchronous Invoke method. While this is a nice programming model feature, you should stay away from them wherever possible. The implementation uses remoting infrastructure which imposes a sizable overhead to asynchronous invocation. Queue work to the thread pool directly is often a better approach, though that means you have to co-ordinate the rendezvous logic yourself.
EDIT: I created the following simple test of the relative overheads:
int counter = 0;
int iterations = 1000000;
Action d = () => { Interlocked.Increment(ref counter); };
var stopwatch = new System.Diagnostics.Stopwatch();
stopwatch.Start();
for (int i = 0; i < iterations; i++)
{
var asyncResult = d.BeginInvoke(null, null);
}
do { } while(counter < iterations);
stopwatch.Stop();
Console.WriteLine("Took {0}ms", stopwatch.ElapsedMilliseconds);
Console.ReadLine();
On my machine the above test runs in around 20 seconds. Replacing the BeginInvoke call with
System.Threading.ThreadPool.QueueUserWorkItem(state =>
{
Interlocked.Increment(ref counter);
});
changes the running time to 864ms.
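One further point in favour of queueing to the thread pool: on .NET Core and .NET 5 or later, calling BeginInvoke on a delegate throws PlatformNotSupportedException, so the first variant of the test only runs on .NET Framework. The sketch below is a variation on the thread-pool version, not the original benchmark: it waits with a CountdownEvent instead of spinning in an empty loop. QueueSketch and RunQueued are hypothetical names.

```csharp
using System;
using System.Threading;

static class QueueSketch
{
    // Queues 'iterations' work items and blocks until all of them have run.
    // Returns the final counter value, which should equal 'iterations'.
    public static int RunQueued(int iterations)
    {
        int counter = 0;
        using (var done = new CountdownEvent(iterations))
        {
            for (int i = 0; i < iterations; i++)
            {
                ThreadPool.QueueUserWorkItem(state =>
                {
                    Interlocked.Increment(ref counter);
                    done.Signal(); // one work item finished
                });
            }
            done.Wait(); // blocks without burning CPU until the count reaches zero
        }
        return counter;
    }
}
```

Wrapping the call in a Stopwatch gives a timing comparable to the one above, with the caveat that the extra Signal call adds a little overhead per item.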

ThreadPool - WaitAll 64 Handle Limit

I am trying to bypass the 64-handle wait limit that .NET 3.5 imposes.
I have seen this thread: Workaround for the WaitHandle.WaitAll 64 handle limit?
So I understand the general idea, but I am having difficulty because I am not using a delegate; rather,
I am basically working off this example:
http://msdn.microsoft.com/en-us/library/3dasc8as%28VS.80%29.aspx
This link http://www.switchonthecode.com/tutorials/csharp-tutorial-using-the-threadpool
is similar but again the int variable keeping track of the tasks is a member variable.
Where in the above example would I pass the threadCount integer?
Do I pass it in the callback method as an object? I think I am having trouble with the callback method and passing by reference.
Thanks Stephen,
That link is not entirely clear to me.
Let me post my code to help myself clarify:
for (int flows = 0; flows < NumFlows; flows++)
{
ResetEvents[flows] = new ManualResetEvent(false);
ICalculator calculator = new NewtonRaphson(Perturbations);
Calculators[flows] = calculator;
ThreadPool.QueueUserWorkItem(calculator.ThreadPoolCallback, flows);
}
resetEvent.WaitOne();
Where would I pass in my threadCount variable? I assume it needs to be decremented in calculator.ThreadPoolCallback?
You should not be using multiple wait handles to wait for the completion of multiple work items in the ThreadPool. Not only is it not scalable, but you will eventually bump into the 64-handle limit imposed by the WaitHandle.WaitAll method (as you have done already). The correct pattern to use in this situation is a counting wait handle. One is available for .NET 3.5 in the Reactive Extensions download, as the CountdownEvent class.
var finished = new CountdownEvent(1);
for (int flows = 0; flows < NumFlows; flows++)
{
finished.AddCount();
ICalculator calculator = new NewtonRaphson(Perturbations);
Calculators[flows] = calculator;
ThreadPool.QueueUserWorkItem(
(state) =>
{
try
{
calculator.ThreadPoolCallback(state);
}
finally
{
finished.Signal();
}
}, flows);
}
finished.Signal();
finished.Wait();
An anonymous method might be easiest:
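// Start at 1 so the count cannot reach zero while items are still being queued.
int threadCount = 1;
for (int flows = 0; flows < NumFlows; flows++)
{
ICalculator calculator = new NewtonRaphson(Perturbations);
Calculators[flows] = calculator;
// We're about to queue a new piece of work:
// make a note of the fact a new work item is starting
Interlocked.Increment(ref threadCount);
ThreadPool.QueueUserWorkItem(
delegate(object state)
{
// state holds this iteration's value of flows. Capturing the loop
// variable itself would read whatever value it has when the item runs.
calculator.ThreadPoolCallback(state);
// We've finished this piece of work...
if (Interlocked.Decrement(ref threadCount) == 0)
{
// ...and we're the last one.
// Signal back to the main thread.
resetEvent.Set();
}
}, flows);
}
// Release the initial count now that every item has been queued.
if (Interlocked.Decrement(ref threadCount) == 0)
{
resetEvent.Set();
}
resetEvent.WaitOne();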
