When I create an array of tasks like this:
var taskArray = new Task<double>[]
{
Task.Factory.StartNew(() => new Random().NextDouble()),
Task.Factory.StartNew(() => new Random().NextDouble()),
Task.Factory.StartNew(() => new Random().NextDouble())
};
Will this create 3 threads for sure, or is it up to the CLR to create threads as it sees fit?
And if I do this inside a web request, does that mean at least 4 threads will be created to service the request (the web request thread + 1 for each task)?
Will this create 3 threads for sure, or is it up to the CLR to create threads as it sees fit?
The latter. In particular, as those tasks complete so quickly I wouldn't be surprised if they all executed on the same thread (although not the same thread as the one calling StartNew) - particularly if this is in a "clean" process and the thread pool hasn't had to fire up many threads yet. (IIRC, the thread pool only starts one new thread every 0.5 seconds, which would give plenty of time for all of your tasks to execute on a single thread.)
You can use your own custom TaskScheduler if you really want to, but that would be relatively extreme.
You should read the MSDN article on task schedulers (including the default one) for more information.
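If you want to see this for yourself, here is a minimal sketch (not from the original question) that has each task report the pool thread it ran on; on a "cold" pool the ids often come out identical:
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class ThreadIdDemo
{
    static void Main()
    {
        var taskArray = new Task<int>[]
        {
            // Each task simply reports the managed thread id it executed on.
            Task.Factory.StartNew(() => Thread.CurrentThread.ManagedThreadId),
            Task.Factory.StartNew(() => Thread.CurrentThread.ManagedThreadId),
            Task.Factory.StartNew(() => Thread.CurrentThread.ManagedThreadId)
        };

        Task.WaitAll(taskArray);

        // Frequently all three ids are the same, showing that
        // three tasks do not imply three threads.
        Console.WriteLine(string.Join(", ", taskArray.Select(t => t.Result)));
    }
}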
There's no easy answer. It depends on the resources available on the server. The work is placed in a queue, and if the server can run 3 threads at the same time it will run them; otherwise the work stays queued.
According to MSDN, Task.Run(Action) uses a thread in the ThreadPool to actually execute the Action. My question is (as the title states): would Task.Run start a new thread if there are none available in the ThreadPool? Or would it just wait until one becomes available?
[tl;dr] I'm currently enqueuing some calls directly to the ThreadPool:
ThreadPool.QueueUserWorkItem(x => ...);
But I've noticed that under some loads the application does run out of threads in the pool (a Parallel.ForEach somewhere else in the program being at fault).
I know that increasing the number of threads in the pool will probably NOT solve anything, so I'm thinking of using MaxDegreeOfParallelism (ParallelOptions) to control the number of threads used by Parallel.ForEach.
Anyhow, I would still like to know the answer to the stated question.
Thanks =]
would Task.Run start a new Thread if there are none available in the ThreadPool?
Not immediately. The Task would be queued.
But the ThreadPool does manage its threads: when the queue fills up it will create new worker threads at a rate of 2 per second.
And when the queue runs empty, threads will be destroyed at the same rate.
The actual algorithm is a little more involved (from .NET 4 on) but it does mean that the Pool does exercise some relatively simple resource management.
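As for the MaxDegreeOfParallelism idea from the question, the shape is roughly this (items and Process are placeholders, and 4 is an arbitrary example value):
using System.Threading.Tasks;

// Cap the loop so it cannot saturate the thread pool on its own.
var options = new ParallelOptions { MaxDegreeOfParallelism = 4 };

Parallel.ForEach(items, options, item =>
{
    Process(item);   // at most 4 items are processed concurrently
});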
I have been tasked to take over an old bit of code that uses delegates.
SearchDelegate[] dlgt = new SearchDelegate[numSearches];
IAsyncResult[] ar = new IAsyncResult[numSearches];
It then loops to start multiple delegate functions:
for (int i = 0; i < numSearches; i++)
{
ar[i] = dlgt[i].BeginInvoke(....);
}
It then does a timed loop to get the results from the ar object.
It all seems to work fine. The issue I am having is that sometimes some of these delegate functions can take 3 to 4 seconds to start, even longer if the count goes above 10. Is this a common problem, or is there a setting I can tweak?
This is running on IIS. I can replicate the issue locally with the minimal machine resources being used.
Thanks all.
Daz
can take 3 to 4 seconds to start
This is caused by the thread pool. When all its threads are busy, it creates new threads only slowly (roughly 2 per second).
You could raise the minimum number of threads in the pool, but especially for a web app you should research, test, and measure that extensively. ASP.NET is also a big stakeholder in the thread pool.
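If, after measuring, you do decide to raise the minimum, the call itself is a one-liner; the values below are purely illustrative:
using System;
using System.Threading;

// Read the current minimums first so you only raise the worker count.
ThreadPool.GetMinThreads(out int workerMin, out int ioMin);

// 50 is an example value, not a recommendation; measure under realistic load.
ThreadPool.SetMinThreads(Math.Max(workerMin, 50), ioMin);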
The BeginInvoke method dispatches the actual work to the thread pool, as described in this article. It may indeed take some time when there are no idle threads available; the thread pool may decide to wait for some work items to complete or to add additional threads, respecting its min and max limits.
Some additional info can be found in The managed threadpool, in Simple description of worker and IO threads in .NET, and in the Remarks section of ThreadPool.SetMinThreads.
You should be aware that the same thread pool is used for processing HTTP requests, so it's usually pointless to offload custom non-IO-bound work to the thread pool in web apps: it won't give you any benefit and may even hurt performance due to additional thread switches. And BeginInvoke doesn't look like the invocation of an asynchronous IO operation.
It doesn't actually matter which concrete thread executes the work: the client still has to wait the same amount of time for the response. It may look like you could gain some time by performing the work in parallel, but that won't hold under load, since there won't be enough thread pool threads available to process both the HTTP requests and your custom work items.
You may want to check this thread for some additional details on this topic. It's about Task, but that doesn't matter, since both BeginInvoke and Task.Run use the same thread pool under the hood.
My app does a lot of background work, and for each such action I create a new thread:
Thread StreamResponse = new Thread(() =>
{
DB.ToDb(flag);
});
StreamResponse.Start();
These actions happen thousands of times per minute. I noticed that the app starts eating RAM. That's expected, because the application creates thousands of threads and never closes them.
So my question is: how do I run these actions on a separate thread? For example, I could create a thread in a separate class and perform the actions on that thread. Or is there another way, where a thread is created for the task and automatically closed once the action completes? Or perhaps it's simply more correct to do all of this work on one other thread. What do you think? How should I do it?
Maybe using Dispatcher.BeginInvoke?
It seems that you could benefit from using the ThreadPool Class. From MSDN:
Provides a pool of threads that can be used to execute tasks, post work items, process asynchronous I/O, wait on behalf of other threads, and process timers.
Here is some sample code for using it:
ThreadPool.QueueUserWorkItem((x) =>
{
DB.ToDb(flag);
});
And you can set the maximum number of concurrent threads available in the thread pool using the ThreadPool.SetMaxThreads Method in order to improve performance and limit memory usage in your app.
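For completeness, a hedged sketch of capping the pool (64 is just an example figure; lowering the maximum is rarely the right first move):
using System.Threading;

// Keep the existing IO completion-port limit and only cap worker threads.
ThreadPool.GetMaxThreads(out int workerMax, out int ioMax);
ThreadPool.SetMaxThreads(64, ioMax);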
Why not use task factory?
var task = Task.Factory.StartNew(()=>{ /*do stuff*/ });
Task.Factory works much the same way as QueueUserWorkItem does, but has the advantage of letting you handle any return information more elegantly:
var result = await task;
The factory will only spawn threads when it makes sense to, so essentially it behaves the same as the thread pool.
If you want to schedule longer-running tasks, this is also considered the better option in terms of recommended practice, but you would need to specify that information when creating the task.
If you need further information, there is a good answer available here: ThreadPool.QueueUserWorkItem vs Task.Factory.StartNew
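For the longer-running case mentioned above, the usual way to pass that hint is TaskCreationOptions.LongRunning; a rough sketch reusing the DB.ToDb call from the question:
using System.Threading.Tasks;

async Task SaveInBackgroundAsync(bool flag)
{
    // LongRunning hints that the work may block for a while, so the default
    // scheduler typically gives it its own thread instead of a pool thread.
    var task = Task.Factory.StartNew(
        () => DB.ToDb(flag),                // same call as in the question
        TaskCreationOptions.LongRunning);

    await task;                             // surfaces any exception from DB.ToDb
}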
Maybe I did not understand the whole Parallel class issue correctly :(
But from what I am reading now, I understand that when I use Parallel I actually mobilize all the threads that exist in the ThreadPool for some task/mission.
For example:
var arrayStrings = new string[1000];
Parallel.ForEach<string>(arrayStrings, someString =>
{
DoSomething(someString);
});
So Parallel.ForEach in this case is mobilizing all the threads that exist in the ThreadPool for the 'DoSomething' task/mission.
But will the call to Parallel.ForEach create any new threads at all?
It's clear that there won't be 1000 new threads. But let's assume the case where the ThreadPool has released all the threads it holds; in that case, will Parallel.ForEach create any new threads?
Short answer: Parallel.ForEach() does not “mobilize all the threads”. And any operation that schedules some work on the ThreadPool (which Parallel.ForEach() does) can cause the creation of a new thread in the pool.
Long answer: To understand this properly, you need to know how three levels of abstraction work: Parallel.ForEach(), TaskScheduler and ThreadPool:
Parallel.ForEach() (and Parallel.For()) schedule their work on a TaskScheduler. If you don't specify a scheduler explicitly, the current one will be used.
Parallel.ForEach() splits the work between several Tasks. Each Task will process a part of the input sequence, and when it's done, it will request another part if one is available, and so on.
How many Tasks will Parallel.ForEach() create? As many as the TaskScheduler will let it run. The way this is done is that each Task first enqueues a copy of itself when it starts executing (unless doing so would violate MaxDegreeOfParallelism, if you set it). This way, the actual concurrency level is up to the TaskScheduler.
Also, the first Task will actually execute on the current thread, if the TaskScheduler supports it (this is done using RunSynchronously()).
The default TaskScheduler simply enqueues each Task to the ThreadPool queue. (Actually, it's more complicated if you start a Task from another Task, but that's not relevant here.) Other TaskSchedulers can do completely different things and some of them (like TaskScheduler.FromCurrentSynchronizationContext()) are completely unsuitable for use with Parallel.ForEach().
The ThreadPool uses quite a complex algorithm to decide exactly how many threads should be running at any given time. But the most important thing here is that scheduling a new work item can cause the creation of a new thread (although not necessarily immediately). And because with Parallel.ForEach() there is always some item queued to be executed, it's completely up to the internal algorithm of the ThreadPool to decide the number of threads.
Put together, it's pretty much impossible to predict how many threads will be used by Parallel.ForEach(), because it depends on many variables. Both extremes are possible: the loop may run completely synchronously on the current thread, or each item may run on its own, newly created thread.
But generally, it should be close to optimal efficiency and you probably don't have to worry about all those details.
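If you want to see that variability for yourself, here is a small sketch (mirroring the question's array, otherwise illustrative only) that counts how many distinct threads the loop actually used:
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

class ParallelThreadsDemo
{
    static void Main()
    {
        var arrayStrings = new string[1000];
        var threadIds = new ConcurrentDictionary<int, byte>();

        Parallel.ForEach(arrayStrings, someString =>
        {
            // Record the managed thread id of whichever thread ran this item.
            threadIds.TryAdd(Thread.CurrentThread.ManagedThreadId, 0);
        });

        // Typically a handful of threads, not 1000, and the count varies per run.
        Console.WriteLine($"Distinct threads used: {threadIds.Count}");
    }
}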
Parallel.ForEach does not create new threads, nor does it "mobilize all the threads". It uses a limited number of threads from the ThreadPool and submits tasks to them for parallel execution. In the current implementation the default is to use one thread per core.
I think you have this the wrong way round.
From PATTERNS OF PARALLEL PROGRAMMING you'll see that Parallel.ForEach is just really syntactic sugar.
Parallel.ForEach largely boils down to something like this:
for (int p = 0; p < arrayStrings.Count(); p++)
{
    var item = arrayStrings[p]; // capture a copy so the closure sees the right element
    ThreadPool.QueueUserWorkItem(_ => DoSomething(item));
}
The ThreadPool takes care of the scheduling. There are some excellent articles around how the ThreadPool's scheduler behaves to some degree if you're interested, but that's nothing to do with TPL.
Parallel does not deal with threads at all - it schedules TASKS to the task framework. That then has a scheduler, and the default scheduler goes to the ThreadPool. This one will try to find a good number of threads (better in 4.5 than 4.0), and the ThreadPool may slowly spin up new threads.
But that is not a function of Parallel.ForEach ;)
will the Parallel.ForEach create any new threads???
It never will. As I said, if the foreach has 1000 items, it queues the corresponding tasks, period. The task factory scheduler will do what it is programmed to do (you can replace it). Generally, with the default scheduler: yes, new threads will slowly spring up, WITHIN REASON.
I am cleaning up some old code converting it to work asynchronously.
psDelegate.GetStops decStops = psLoadRetrieve.GetLoadStopsByLoadID;
var arStops = decStops.BeginInvoke(loadID, null, null);
WaitHandle.WaitAll(new WaitHandle[] { arStops.AsyncWaitHandle });
var stops = decStops.EndInvoke(arStops);
Above is a single example of what I am doing for asynchronous work. My plan is to have close to 20 different delegates running. All will call BeginInvoke and wait until they are all complete before calling EndInvoke.
My question is: will having so many delegates running cause problems? I understand that BeginInvoke uses the ThreadPool to do work and that it has a limit of 25 threads. 20 is under that limit, but it is very likely that other parts of the system could be using any number of threads from the ThreadPool as well.
Thanks!
No, the ThreadPool manager was designed to deal with this situation. It won't let all thread pool threads run at the same time. It starts off allowing as many threads to run as you have CPU cores. As soon as one completes, it allows another one to run.
Every half second, it steps in if the active threads are not completing. It assumes they are stuck and allows another one to run. On a 2 core CPU, you'd now have 3 threads running.
Getting to the maximum, 500 threads on a 2 core CPU, would take quite a while. You would have to have threads that don't complete for over 4 minutes. If your threads behave that way then you don't want to use threadpool threads.
The current default MaxThreads is 250 per processor, so effectively there should be no limit unless your application is just spewing out calls to BeginInvoke. See http://msdn.microsoft.com/en-us/library/system.threading.threadpool.aspx. Also, the thread pool will try to reuse existing threads before creating new ones in order to reduce the overhead of creating new threads. If your invokes are all fast, you will probably not see the thread pool create a lot of threads.
For long running tasks or blocking tasks it is usually better to avoid the thread pool and managed the threads yourself.
However, trying to schedule that many threads will probably not yield the best results on most current machines.
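If you want to check how close you are to those limits at runtime, a quick hedged sketch:
using System;
using System.Threading;

// Inspect the pool before firing off a batch of BeginInvoke calls.
ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIo);
ThreadPool.GetAvailableThreads(out int freeWorkers, out int freeIo);

Console.WriteLine($"Worker threads available: {freeWorkers} of {maxWorkers}");
Console.WriteLine($"IO threads available:     {freeIo} of {maxIo}");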