On page 88 of Stephen Toub's book
http://www.microsoft.com/download/en/details.aspx?id=19222
there is the following code:
private BlockingCollection<T> _streamingData = new BlockingCollection<T>();
// Parallel.ForEach
Parallel.ForEach(_streamingData.GetConsumingEnumerable(),
                 item => Process(item));
// PLINQ
var q = from item in _streamingData.GetConsumingEnumerable().AsParallel()
        ...
        select item;
Stephen then mentions:
"when passing the result of calling GetConsumingEnumerable as the data source to Parallel.ForEach, the threads used by the loop have the potential to block when the collection becomes empty. And a blocked thread may not be released by Parallel.ForEach back to the ThreadPool for retirement or other uses. As such, with the code as shown above, if there are any periods of time where the collection is empty, the thread count in the process may steadily grow;"
I do not understand why the thread count would grow. If the collection is empty, wouldn't the BlockingCollection simply not request any further threads? In that case there would be no need to use WithDegreeOfParallelism to limit the number of threads used on the BlockingCollection.
The thread pool has a hill-climbing algorithm that it uses to estimate the appropriate number of threads. As long as adding threads increases throughput, the thread pool will create more of them. It assumes that some blocking or IO is happening and tries to saturate the CPU by going over the number of processors in the system.
That is why doing IO and other blocking work on thread pool threads can be dangerous.
Here is a fully working example of said behavior:
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

BlockingCollection<string> _streamingData = new BlockingCollection<string>();

// Producer: adds one item every 100 ms, so the collection is empty most of the time.
Task.Factory.StartNew(() =>
{
    for (int i = 0; i < 100; i++)
    {
        _streamingData.Add(i.ToString());
        Thread.Sleep(100);
    }
});

// Monitor: prints the process thread count once per second.
new Thread(() =>
{
    while (true)
    {
        Thread.Sleep(1000);
        Console.WriteLine("Thread count: " + Process.GetCurrentProcess().Threads.Count);
    }
}).Start();

// Consumer: the loop's worker threads block whenever the collection is empty,
// and the thread pool keeps injecting more threads in response.
Parallel.ForEach(_streamingData.GetConsumingEnumerable(), item =>
{
});
I do not know why the thread count keeps climbing although the extra threads do not increase throughput. According to the model that I explained it would not grow, but I do not know whether my model is actually correct.
Maybe the thread pool has an additional heuristic that makes it spawn threads when it sees no progress at all (measured in tasks completed per second). That would make sense, because it would likely prevent a lot of deadlocks in applications. Deadlocks can happen when important tasks cannot run because they are waiting for existing tasks to exit and make threads available. This is a well-known problem with the thread pool.
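If the growth is a concern, the usual mitigation is to cap the loop's parallelism explicitly, so that no more than a fixed number of threads can ever be blocked inside the loop. A minimal sketch (the cap value here is an assumption; tune it for your workload):

using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

BlockingCollection<string> streamingData = new BlockingCollection<string>();
// (a producer elsewhere would Add items and call CompleteAdding() to end the loop)

var options = new ParallelOptions
{
    // At most this many threads can be blocked inside the loop at once.
    MaxDegreeOfParallelism = Environment.ProcessorCount
};

Parallel.ForEach(streamingData.GetConsumingEnumerable(), options, item =>
{
    // Process(item) would go here.
});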
Related
I completely don't understand the practical meaning of async/await.
I just started learning async/await, and I know there are already a huge number of topics on it. If I understand correctly, async/await is only needed for operations that involve a long wait that is not caused by a long computation: for example, a database response, a network request, or file handling. Many people write that async/await is also needed so as not to block the main thread, and here it is completely unclear to me why the thread should be blocked at all. You can avoid blocking without async/await: just create a task. So I am trying to create code that will wait a long time for a response from the network.
I created an example. I can see in the Windows Task Manager that the while (i < int.MaxValue) work is processed first, taking up the entire processor, even though I started DownloadFile first. Only when the processor is freed do I see that the file downloads are making progress. On my machine, the example runs in about 54 seconds.
Question: how can I run DownloadFile asynchronously first, so that the threads do not idle uselessly but can work on while (i < int.MaxValue)?
using System.Net;
string PathProject = Directory.GetParent(Directory.GetCurrentDirectory()).Parent.Parent.Parent.FullName;
// Create folder 1 in the project folder (a DirectoryInfo by itself does not create the folder on disk)
DirectoryInfo Path = Directory.CreateDirectory($"{PathProject}\\1");
int Iterations = Environment.ProcessorCount * 3;
string file = "https://s182vla.storage.yandex.net/rdisk/82b08d86b9920a5e889c6947e4221eb1350374db8d799ee9161395f7195b0b0e/62f75403/geIEA69cusBRNOpxmtup5BdJ7AbRoezTJE9GH4TIzcUe-Cp7uoav-lLks4AknK2SfU_yxi16QmxiuZOGFm-hLQ==?uid=0&filename=004%20-%2002%20Lesnik.mp3&disposition=attachment&hash=e0E3gNC19eqNvFi1rXJjnP1y8SAS38sn5%2ByGEWhnzE5cwAGsEnlbazlMDWSjXpyvq/J6bpmRyOJonT3VoXnDag%3D%3D&limit=0&content_type=audio%2Fmpeg&owner_uid=160716081&fsize=3862987&hid=98984d857027117759bc5ce6092eaa6a&media_type=audio&tknv=v2&rtoken=k9xogU6296eg&force_default=no&ycrid=na-2bc914314062204f1cbf810798018afd-downloader16e&ts=5e61a6daac6c0&s=eef8b08190dc7b22befd6bad89e1393b394869a1668d9b8af3730cce4774e8ad&pb=U2FsdGVkX1__q3AvjJzgzWG4wVR80Oh8XMl-0Dlfyu9FhqAYQVVkoBV0dtBmajpmOkCXKUXPbREOS-MZCxMNu2rkAkKq_n-AXcZ85svtSFs";
List<Task> tasks = new List<Task>();

void MyMethod1(int i)
{
    // Synchronous download: blocks the pool thread for the whole transfer
    using WebClient client = new WebClient();
    client.DownloadFile(file, $"{Path}\\{i}.mp3");
}

void MyMethod2()
{
    // Pure CPU-bound spin
    int i = 0;
    while (i < int.MaxValue)
    {
        i++;
    }
}

DateTime dateTimeStart = DateTime.Now;
for (int i = 0; i < Iterations; i++)
{
    int j = i; // capture the loop variable for the lambda
    tasks.Add(Task.Run(() => MyMethod1(j)));
}
for (int i = 0; i < Iterations; i++)
{
    tasks.Add(Task.Run(() => { MyMethod2(); MyMethod2(); }));
}
Task.WaitAll(tasks.ToArray());
Console.WriteLine(DateTime.Now - dateTimeStart);

while (true)
{
    Thread.Sleep(100);
    if (Path.GetFiles().Length == Iterations)
    {
        Thread.Sleep(1000);
        foreach (FileInfo f in Path.GetFiles())
        {
            f.Delete();
        }
        return;
    }
}
If there are two web servers that talk to a database, and they run on two machines with the same spec, the web server with async code will be able to handle more concurrent requests.
The following is from the 2014 article Async Programming : Introduction to Async/Await on ASP.NET:
Why Not Increase the Thread Pool Size?
At this point, a question is always asked: Why not just increase the size of the thread pool? The answer is twofold: Asynchronous code scales both further and faster than blocking thread pool threads.
Asynchronous code can scale further than blocking threads because it uses much less memory; every thread pool thread on a modern OS has a 1MB stack, plus an unpageable kernel stack. That doesn’t sound like a lot until you start getting a whole lot of threads on your server. In contrast, the memory overhead for an asynchronous operation is much smaller. So, a request with an asynchronous operation has much less memory pressure than a request with a blocked thread. Asynchronous code allows you to use more of your memory for other things (caching, for example).
Asynchronous code can scale faster than blocking threads because the thread pool has a limited injection rate. As of this writing, the rate is one thread every two seconds. This injection rate limit is a good thing; it avoids constant thread construction and destruction. However, consider what happens when a sudden flood of requests comes in. Synchronous code can easily get bogged down as the requests use up all available threads and the remaining requests have to wait for the thread pool to inject new threads. On the other hand, asynchronous code doesn’t need a limit like this; it’s “always on,” so to speak. Asynchronous code is more responsive to sudden swings in request volume.
(These days, threads are added every 0.5 seconds.)
WebRequest.Create("https://192.168.1.1").GetResponse()
At some point the above code will probably hit the OS method recv(). The OS will suspend your thread until data becomes available. The state of your function, in CPU registers and the thread stack, will be preserved by the OS while the thread is suspended. In the meantime, this thread can't be used for anything else.
If you start that method via Task.Run(), then your method will consume a thread from a thread pool that has been prepared for you by the runtime. Since these threads aren't used for anything else, your program can continue handling other requests on other threads. However, creating a large number of OS threads has significant overheads.
Every OS thread must have some memory reserved for its stack, and the OS must use some memory to store the full state of the CPU for any suspended thread. Switching threads can have a significant performance cost. For maximum performance, you want to keep a small number of threads busy, rather than having a large number of suspended threads which the OS must keep swapping in and out of each CPU core.
When you use async and await, the C# compiler transforms your method into a coroutine, ensuring that any state your program needs to remember is no longer stored in CPU registers or on the OS thread stack. Instead, all of that state is stored in heap memory while your task is suspended. When your task is suspended and resumed, only the data which you actually need is loaded and stored, rather than the entire CPU state.
If you change your code to use .GetResponseAsync(), the runtime will call an OS method that supports overlapped I/O. While your task is suspended, no OS thread will be busy. When data is available, the runtime will continue to execute your task on a thread from the thread pool.
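A minimal sketch of the difference, reusing the placeholder URL from above (error handling omitted; both are just fragments to contrast the two call shapes):

using System.Net;
using System.Threading.Tasks;

// Blocking version: the calling thread is suspended inside the OS until data arrives.
void FetchBlocking()
{
    using var response = WebRequest.Create("https://192.168.1.1").GetResponse();
}

// Overlapped I/O version: no thread is tied up while waiting; the method resumes
// on a thread-pool thread once the response is available.
async Task FetchAsync()
{
    using var response = await WebRequest.Create("https://192.168.1.1").GetResponseAsync();
}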
Is this going to impact the program you are writing today? Will you be able to tell the difference? Not until the CPU starts to become the bottleneck, when you are attempting to scale your program to thousands of concurrent requests.
If you are writing new code, look for the Async version of any I/O method. Sprinkle async & await around. It doesn't cost you anything.
If I understand correctly, async/await is only needed for operations that involve a long wait that is not caused by a long computation.
It's kind of recursive, but async is best used whenever there's something asynchronous. In other words, anything where the CPU would be wasted if it had to just spin (or block) while waiting for the operation to complete. Operations that are naturally asynchronous are generally I/O-based (as you mention, DB and other network calls, as well as file I/O), but they can be more arbitrary events, too (e.g., timers). Anything where there isn't actual code to run to get the response.
Many people write that async/await is also needed so as not to block the main thread.
At a higher level, there are two primary benefits to async/await, depending on what kind of code you're talking about:
On the server side (e.g., web apps), async/await provides scalability by using fewer threads per request.
On the client side (e.g., UI apps), async/await provides responsiveness by keeping the UI thread free to respond to user input.
Developers tend to emphasize one or the other depending on the kind of work they normally do. So if you see an async article talking about "not blocking the main thread", they're talking about UI apps specifically.
And here it is completely unclear to me why the thread should be blocked at all. You can avoid blocking without async/await: just create a task.
That works just fine for many situations. But it doesn't work well in others.
E.g., it would be a bad idea to just Task.Run onto a background thread in a web app. The primary benefit of async in a web app is to provide scalability by using fewer threads per request, so using Task.Run does not provide any benefits at all (in fact, scalability is reduced). So, the idea of "use Task.Run instead of async/await" cannot be adopted as a universal principle.
The other problem is in resource-constrained environments, such as mobile devices. You can only have so many threads there before you start running into other problems.
But if you're talking about desktop apps (e.g., WPF and friends), then sure, you can use async/await to free up the UI thread, or you can use Task.Run to free up the UI thread. They both achieve the same goal.
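For example, a sketch assuming a WPF window's code-behind (the handlers, DoWorkAsync, and DoCpuWork are all hypothetical names):

// I/O-bound work: the operation itself is asynchronous.
private async void FetchButton_Click(object sender, RoutedEventArgs e)
{
    await DoWorkAsync();               // UI thread is free while the await is pending
}

// CPU-bound work: push it onto a thread-pool thread.
private async void ComputeButton_Click(object sender, RoutedEventArgs e)
{
    await Task.Run(() => DoCpuWork()); // UI thread is free while DoCpuWork runs
}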
Question: how can I run DownloadFile asynchronously first, so that the threads do not idle uselessly but can work on while (i < int.MaxValue)?
There's nothing in your code that is asynchronous at all. So really, you're dealing with multithreading/parallelism. In general, I recommend using higher-level constructs such as Parallel for parallelism rather than Task.Run.
But regardless of the API used, the underlying problem is that you're kicking off Environment.ProcessorCount * 6 threads' worth of blocking work. You'll want to ensure that your thread pool is ready for that many threads by calling ThreadPool.SetMinThreads with workerThreads set to a high enough number.
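A sketch of that call (the worker minimum is an assumption chosen to match the Environment.ProcessorCount * 6 tasks in the question):

using System;
using System.Threading;

// Keep the existing I/O completion thread minimum; raise only the worker minimum
// so the pool can start all the queued tasks without its slow thread-injection delay.
ThreadPool.GetMinThreads(out int workerThreads, out int completionPortThreads);
ThreadPool.SetMinThreads(Environment.ProcessorCount * 6, completionPortThreads);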
It's not web requests but here's a toy example:
Test:
n: 1 await: 00:00:00.1373839 sleep: 00:00:00.1195186
n: 10 await: 00:00:00.1290465 sleep: 00:00:00.1086578
n: 100 await: 00:00:00.1101379 sleep: 00:00:00.6517959
n: 300 await: 00:00:00.1207069 sleep: 00:00:02.0564836
n: 500 await: 00:00:00.1211736 sleep: 00:00:02.2742309
n: 1000 await: 00:00:00.1571661 sleep: 00:00:05.3987737
Code:
using System.Diagnostics;
foreach (var n in new[] { 1, 10, 100, 300, 500, 1000 })
{
    var sw = Stopwatch.StartNew();

    // n tasks that await a 100 ms delay: no thread is blocked while waiting
    var tasks = Enumerable.Range(0, n)
        .Select(i => Task.Run(async () =>
        {
            await Task.Delay(TimeSpan.FromMilliseconds(100));
        }));
    await Task.WhenAll(tasks);
    var tAwait = sw.Elapsed;

    sw = Stopwatch.StartNew();

    // n tasks that each block a thread-pool thread for 100 ms
    var tasks2 = Enumerable.Range(0, n)
        .Select(i => Task.Run(() =>
        {
            Thread.Sleep(TimeSpan.FromMilliseconds(100));
        }));
    await Task.WhenAll(tasks2);
    var tSleep = sw.Elapsed;

    Console.WriteLine($"n: {n,4} await: {tAwait} sleep: {tSleep}");
}
I understand a Barrier can be used to have several tasks synchronise their completion before a second phase runs.
I would like to have several tasks synchronise multiple steps like so:
state is 1;
Task1 runs and pauses waiting for state to become 2;
Task2 runs and pauses waiting for state to become 2;
Task2 is final Task and causes the state to progress to state 2;
Task1 runs and pauses waiting for state to become 3;
Task2 runs and pauses waiting for state to become 3;
Task2 is final Task and causes the state to progress to state 3;
state 3 is final state and so all tasks exit.
I know I can spin up new tasks at the end of each state, but since each task does not take too long, I want to avoid creating new tasks for each step.
I can run the above synchronously using for loops, but the final state can be 100,000, so I would like to make use of more than one thread to run the process faster, as the process is CPU-bound.
I have tried using a counter to keep track of the number of completed tasks, incremented by each task on completion. If a task is the final one to complete, it changes the state to the next state. All completed tasks then wait using while (iterationState == state) await Task.Yield(), but the performance is terrible, and it seems to me a very crude way of doing it.
What is the most efficient way to get the above done? Surely there is an optimised tool for this?
I'm using Parallel.For, creating 300 tasks, and each task needs to run through up to 100,000 states. Each task running through one state completes in less than a second, and creating 300 * 100,000 tasks is a huge overhead that makes running the whole thing synchronously much faster, even when using a single thread.
So I'd like to create 300 tasks and have them synchronise moving through the 100,000 states. Hopefully the overhead of creating only 300 tasks, instead of 300 * 100,000 tasks, combined with optimised synchronisation between the tasks, will make this run faster than doing it synchronously on a single thread.
Each state must complete fully before the next state can be run.
So - what's the optimal synchronisation technique for this scenario? Thanks!
while (iterationState == state) await Task.Yield() is indeed a terrible way to synchronize your 300 tasks (and no, 300 tasks aren't necessarily super-expensive: you'll only get a reasonable number of threads allocated).
The key problem here isn't the Parallel.For; it's synchronizing across 300 tasks so that each waits efficiently until all of them have completed a given phase.
The simplest and cleanest solution here would probably be a for loop over the stages with a Parallel.For over the bit you want parallelized:
for (int stage = 0; stage < 10000; stage++)
{
    // the next line blocks until all 300 have completed
    // will use thread pool threads as necessary
    Parallel.For( ... 300 items to process this stage ... );
}
No extra synchronization primitives needed, no spin-waiting consuming CPU, no needless thrashing between threads trying to see if they are ready to progress.
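A more concrete version of that sketch (ProcessItem is a hypothetical placeholder for whatever per-item work each stage does):

using System.Threading.Tasks;

const int Stages = 100_000;
const int Items = 300;

for (int stage = 0; stage < Stages; stage++)
{
    int s = stage; // capture the current stage for the lambda
    // Parallel.For blocks here until all 300 items of this stage are done,
    // so stage s + 1 never starts before stage s has fully completed.
    Parallel.For(0, Items, i =>
    {
        ProcessItem(s, i);
    });
}

void ProcessItem(int stage, int item)
{
    // hypothetical per-item work
}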
I think I am understanding what you are trying to do, so here is a suggested way to handle it. Note - I am using Action as the type for the blocking collection, but you can change it to whatever would work best in your scenario.
// Shared variables
CountdownEvent workItemsCompleted = new CountdownEvent(300);
BlockingCollection<Action> workItems = new BlockingCollection<Action>();
CancellationTokenSource cancelSource = new CancellationTokenSource();

// Work item queue thread
for (int i = 0; i < stages; ++i)
{
    // Arm the countdown with the number of items in this stage
    workItemsCompleted.Reset(workItemsForStage[i].Count);
    for (int j = 0; j < workItemsForStage[i].Count; ++j)
    {
        workItems.Add(() => { }); // Add your work item here
    }
    workItemsCompleted.Wait(token); // token should be passed in from cancelSource.Token
}

// Worker threads that are making use of the queue
// token should be passed to the threads from cancelSource.Token
while (!token.IsCancellationRequested)
{
    var item = workItems.Take(token); // Blocks until an item is available or the token is cancelled
    item();
    workItemsCompleted.Signal();
}
You can use cancelSource from your main thread to cancel the running operations if you need to. In your worker threads you would then need to handle the OperationCanceledException. With this setup you can launch as many worker threads as you need and easily benchmark where you get your optimal performance (maybe it is with only 10 worker threads, etc.). Just launch as many workers as you want and then queue up the work items on the work-item queue thread. It's basically a producer-consumer model, except that the producer queues up one phase of the work, then blocks until that phase is done, and then queues up the next round of work.
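A sketch of the worker-launch side, tying the pieces above together (workerCount is an assumption to benchmark; the loop body is the one shown above, wrapped so a cancelled token exits cleanly):

int workerCount = 10; // benchmark to find your optimal number
CancellationToken token = cancelSource.Token;

for (int w = 0; w < workerCount; ++w)
{
    new Thread(() =>
    {
        try
        {
            while (!token.IsCancellationRequested)
            {
                var item = workItems.Take(token); // blocks until an item arrives
                item();
                workItemsCompleted.Signal();
            }
        }
        catch (OperationCanceledException)
        {
            // Take(token) was cancelled; let the worker exit.
        }
    }).Start();
}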
Why is it that the following program will only run a limited number of blocked tasks? The limiting number seems to be the number of cores on the machine.
Initially when I wrote this I expected to see the following:
Job complete output of Jobs 1 - 24
A 2 second gap
Output of Jobs 25 - 48
However the output was:
Job complete output of Jobs 1 - 4
Then jobs completed randomly every couple of hundred milliseconds.
When running on a server with 32 cores, the program did run as I had expected.
class Program
{
    private static object _lock = new object();

    static void Main(string[] args)
    {
        int completeJobs = 1;
        var limiter = new MyThreadLimiter();
        for (int iii = 1; iii < 100000000; iii++)
        {
            var jobId = iii;
            limiter.Schedule()
                .ContinueWith(t =>
                {
                    lock (_lock)
                    {
                        completeJobs++;
                        Console.WriteLine("Job: " + completeJobs + " scheduled");
                    }
                });
        }
        Console.ReadLine();
    }
}

class MyThreadLimiter
{
    readonly SemaphoreSlim _semaphore = new SemaphoreSlim(24);

    public async Task Schedule()
    {
        await _semaphore.WaitAsync();
        Task.Run(() => Thread.Sleep(2000))
            .ContinueWith(t => _semaphore.Release());
    }
}
However, replacing the Thread.Sleep with Task.Delay gives my expected results:
public async Task Schedule()
{
    await _semaphore.WaitAsync();
    Task.Delay(2000)
        .ContinueWith(t => _semaphore.Release());
}
And using a Thread also gives my expected results:
public async Task Schedule()
{
    await _semaphore.WaitAsync();
    var thread = new Thread(() =>
    {
        Thread.Sleep(2000);
        _semaphore.Release();
    });
    thread.Start();
}
How does Task.Run() work? Is it limited to the number of cores?
Task.Run schedules the work to run in the thread pool. The thread pool is given wide latitude to schedule the work as best as it can in order to maximize throughput. It will create additional threads when it feels they will be helpful, and remove threads from the pool when it doesn't think it will be able to have enough work for them.
Creating more threads than your processor is able to run at the same time isn't going to be productive when you have CPU-bound work. Adding more threads will just result in dramatically more context switches, increasing overhead and reducing throughput.
Yes. For compute-bound operations, Task.Run() internally uses the CLR's thread pool, which throttles the number of new threads to avoid CPU over-subscription. Initially it runs a number of threads equal to the number of CPU cores concurrently. Then it continually optimises the number of threads using a hill-climbing algorithm, based on factors like the number of requests the thread pool receives and overall computer resources, to decide whether to create more threads or fewer.
In fact, this is one of the main benefits of using pooled threads over raw threads (e.g. new Thread(() => {}).Start()): the pool not only recycles threads but also optimises performance internally for you. As mentioned in the other answer, it's generally a bad idea to block pooled threads, because it will "mislead" the thread pool's optimisation; similarly, using many pooled threads for very long-running computations can lead the thread pool to create more threads, increasing context-switch overhead, and later to destroy the extra threads in the pool.
Task.Run() runs on the CLR thread pool. There is a concept called "oversubscription", which means there are more active threads than CPU cores and they must be time-sliced. In the thread pool, when the number of threads that must be scheduled on the CPU cores increases, context switching rises and, as a result, performance suffers. The CLR, which manages the thread pool, avoids oversubscription by queuing work and throttling thread startup, always trying to balance the workload.
I need to control one thread for my own purposes: calculating, waiting, reporting, etc...
In all other cases I'm using the ThreadPool or TaskEx.
In the debugger, when I call Thread.Sleep(), I notice that some parts of the UI become less responsive. Without the debugger, though, it seems to work fine.
The question is: if I create a new Thread and Sleep() it, can that affect the ThreadPool/Tasks?
EDIT: here are code samples:
One random place in my app:
ThreadPool.QueueUserWorkItem((state) =>
{
    LoadImageSource(imageUri, imageSourceRef);
});
Another random place in my app:
var parsedResult = await TaskEx.Run(() => JsonConvert.DeserializeObject<PocoProductItem>(resultString, Constants.JsonSerializerSettings));
My ConcurrentQueue (modified, original is taken from here):
The thread for the queue is created like this:
public void Process(T request, bool Async = true, bool isRecurssive = false)
{
    if (processThread == null || !processThread.IsAlive)
    {
        processThread = new Thread(ProcessQueue);
        processThread.Name = "Process thread # " + Environment.TickCount;
        processThread.Start();
    }
    // ...
}
If one of the tasks reports networking problems, I want this thread to wait a bit:
if (ProcessRequest(requestToProcess, true))
{
    RequestQueue.Dequeue();
}
else
{
    DoWhenTaskReturnedFalse();
    Thread.Sleep(3000);
}
So, the question one more time: can Thread.Sleep(3000), called from the new Thread(ProcessQueue) thread, affect the ThreadPool or TaskEx.Run()?
Assuming the thread you put to sleep was obtained from the thread pool, then it surely does affect the thread pool. If you explicitly make a thread sleep, it cannot be reused by the thread pool during that time. This may cause the thread pool to spawn new threads if there are jobs waiting to be scheduled. Creating a new thread is always expensive; threads are system resources.
You can, however, look at the Task.Delay method (along with async and await), which suspends the executing code in a more intelligent way, allowing the thread to be reused while waiting.
Refer to this Thread.Sleep vs. Task.Delay article.
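Applied to the queue above, the difference is just this (a sketch; it assumes the containing method can be made async):

// Blocking: the thread is unusable for the full 3 seconds.
Thread.Sleep(3000);

// Non-blocking: inside an async method, the thread is freed during the wait.
await Task.Delay(3000);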
Thread.Sleep() affects the thread it's called from. If you're calling Thread.Sleep() on a ThreadPool thread while trying to queue up more work, you may be hitting the maximum count of ThreadPool threads, so each new work item has to wait for a thread to finish before it can execute.
http://msdn.microsoft.com/en-us/library/system.threading.threadpool.setmaxthreads.aspx
No, Thread.Sleep() affects only the current thread. From the Thread.Sleep(Int32) documentation:
The number of milliseconds for which the thread is suspended.
I have the following code:
static void Main(string[] args)
{
    Console.Write("Press ENTER to start...");
    Console.ReadLine();
    Console.WriteLine("Scheduling work...");
    for (int i = 0; i < 1000; i++)
    {
        //ThreadPool.QueueUserWorkItem(_ =>
        new Thread(_ =>
        {
            Thread.Sleep(1000);
        }).Start();
    }
    Console.ReadLine();
}
According to the textbook C# 4.0 Unleashed by Bart De Smet (page 1466), using new Thread should mean using many more threads than if you use ThreadPool.QueueUserWorkItem, which is commented out in my code.
However I've tried both, and have seen in Resource Monitor that with new Thread about 11 threads are allocated, whereas with ThreadPool.QueueUserWorkItem there are about 50.
Why am I getting the opposite outcome of what is mentioned in this book?
Also, why do you get many more threads allocated with ThreadPool.QueueUserWorkItem if you increase the sleep time?
new Thread() just creates a Thread object; the actual OS thread that you see in Resource Monitor does not exist until Start() is called.
Also, if you are looking at the number of threads after the sleep has completed, you won't see any of the new Threads as they have already exited.
On the other hand, the ThreadPool keeps threads around for some time so it can reuse them, so in that case you can still see the threads even after the sleep has completed.
With new Thread(), you might be seeing the number staying around 160 because it took one second to start that many threads, so by the time the 161st thread is started, the first thread is already finished. You should see a higher number of threads if you increase the sleep time.
As for the ThreadPool, it is designed to use as few threads as possible while also keeping the CPU busy. Ideally, the number of busy threads is equal to the number of CPU cores. However, if the pool detects that its threads are currently not using the CPU (sleeping, or waiting for another thread), it starts up more threads (at a rate of 1/second, up to some maximum) to keep the CPU busy.