Task.Factory.StartNew - confused about the pool - C#

Hi, I'm getting myself tied up with Task.Factory.StartNew. Just as I think I've got the idea of it, someone has suggested I write the following code:
bool exitLoop = false;
while (!exitLoop)
{
    exitLoop = true;
    var messages = Queue.GetMessages(20);
    foreach (var message in messages)
    {
        exitLoop = false;
        Task.Factory.StartNew(() =>
        {
            DeliverMessage(message);
        });
    }
}
In theory this is going to drain a queue, 20 messages at a time, attempting to create a Task for every message in the queue. So if we had 1,000 messages in the queue, then in an instant we'd have 25 tasks eating their way through all the messages. I previously thought I understood this: I thought StartNew would block once the pool ran out of threads - in the old days that would have been around 25. But this is .NET 4.5, and I'm now under the impression that the upper limit for the pool is pretty high. What puzzles me is that I would have assumed this would flood the pool with new tasks, i.e. in an instant I'd now have 1,000 tasks running. So if the pool limit is now hardly a limit at all, why am I not seeing 1,000 tasks?
[Edit]
OK, so what I'm seeing is that 1,000 tasks are queued to run rather than actually running. So how do I determine the number of running/runnable tasks?
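(One rough, illustrative way to estimate this from the thread pool counters: the difference between the maximum and available worker threads approximates how many work items are actually executing, while the rest sit in the pool's queue, which is not directly observable on .NET 4.5.)
int availableWorker, availableIo, maxWorker, maxIo;
ThreadPool.GetAvailableThreads(out availableWorker, out availableIo);
ThreadPool.GetMaxThreads(out maxWorker, out maxIo);
// Approximate count of tasks currently running on pool threads.
Console.WriteLine("~{0} tasks running; the remainder are queued", maxWorker - availableWorker);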

I know this is quite a while after your post, but I hope this may help someone facing your specific challenge. Your last comment stated that the 'DeliverMessage' method was making HTTP requests.
If you are using the 'WebClient' object (for example) to make your requests, it will be bound by the ServicePointManager.DefaultConnectionLimit property. This means it will create at most two (by default) concurrent connections to the host. If you created 1,000 parallel tasks, all 1,000 of those would have to be serviced by those two connections.
You'll have to play around with different values for this setting to find the right balance between throughput in your application and load on the web server.
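If you need more concurrency, the limit can be raised before any requests are made; a minimal sketch (the value 20 is illustrative, not a recommendation):
// Allows up to 20 concurrent connections per host (the default is 2).
ServicePointManager.DefaultConnectionLimit = 20;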

Related

How to share Threads in C# Properly?

I built an OCR application which reads PDF files and OCRs them. I built it using multithreading with the Parallel.ForEach function.
This works brilliantly, but I noticed that the way the threads are divided seems to work differently from what I'm expecting.
Scenario: when allocating only 10 threads using MaxDegreeOfParallelism, it divides the workload and I can see 10 threads being started immediately. However, there are 100 items that need to be processed. When it gets to around 80/100 items processed, it slows down to only 2 of the 10 threads running. I suspect this is because 8 of the 10 threads have successfully completed their portion of the work, but some PDFs took longer on a certain thread, so that thread is still processing its portion of the work.
So my question is: how can I write this better so that even when it gets to 80/100 there are ALWAYS 10 active threads? (Of course when it gets to 90+ the threads will die down, but at least it won't process one by one while the last thread still has a workload to complete.)
I hope this makes sense. Here is a snippet of my code:
Parallel.ForEach(F.files, new ParallelOptions { MaxDegreeOfParallelism = iNumberOfThreads }, items =>
{
    // do work here
});
Thanks to Panagiotis Kanavos, I've implemented ActionBlock<T>, which resolves my problem.
// Requires: using System.Threading.Tasks.Dataflow; (System.Threading.Tasks.Dataflow package)
var getData = new ActionBlock<JsonPDFReader.File>(items =>
{
    // Code Here
}, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = iNumberOfThreads });

foreach (JsonPDFReader.File items in F.files)
{
    getData.Post(items);
}
getData.Complete();
getData.Completion.Wait();
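A brief note on why this resolves the tail-off: Parallel.ForEach typically hands the input out in pre-computed chunks, so a slow chunk can leave the other threads idle near the end, whereas ActionBlock pulls items from a single queue one at a time, keeping all MaxDegreeOfParallelism workers busy until the queue drains. If the producer can outrun the consumers, a bounded variant adds backpressure; a minimal sketch, with an illustrative BoundedCapacity of 100:
var getData = new ActionBlock<JsonPDFReader.File>(items =>
{
    // Code Here
}, new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = iNumberOfThreads,
    BoundedCapacity = 100 // illustrative; bounds the in-memory buffer
});

foreach (JsonPDFReader.File items in F.files)
{
    // Unlike Post, SendAsync waits (asynchronously) when the buffer is full.
    getData.SendAsync(items).Wait();
}
getData.Complete();
getData.Completion.Wait();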

Requests are queuing in Azure App Service even though the thread pool has enough threads

I have written an API using ASP.NET Web API and deployed it in Azure as an App Service. The name of my controller is TestController, and my action method is as below.
[Route("Test/Get")]
public string Get()
{
Thread.Sleep(10000);
return "value";
}
So for each request it should wait 10 seconds before returning the string "value". I have also written another endpoint to see the number of thread pool threads working to execute requests. That action is as below.
[Route("Test/ThreadInfo")]
public ThreadPoolInfo Get()
{
int availableWorker, availableIO;
int maxWorker, maxIO;
ThreadPool.GetAvailableThreads(out availableWorker, out availableIO);
ThreadPool.GetMaxThreads(out maxWorker, out maxIO);
return new ThreadPoolInfo
{
AvailableWorkerThreads = availableWorker,
MaxWorkerThreads = maxWorker,
OccupiedThreads = maxWorker - availableWorker
};
}
Now, when we make 29 concurrent GET calls to the Test/Get endpoint, it takes almost 11 seconds for all the requests to succeed, so the server executes all the requests concurrently (in about 11 seconds). To see the thread status, a call to Test/ThreadInfo made right after the calls to Test/Get returns immediately (without waiting):
{
    "AvailableWorkerThreads": 8161,
    "MaxWorkerThreads": 8191,
    "OccupiedThreads": 30
}
It seems 29 threads are executing Test/Get requests and 1 thread is executing the Test/ThreadInfo request.
When I make 60 GET calls to Test/Get, it takes almost 36 seconds for them all to succeed. A call to Test/ThreadInfo (which now takes some time) returns:
{
    "AvailableWorkerThreads": 8161,
    "MaxWorkerThreads": 8191,
    "OccupiedThreads": 30
}
If we increase the number of requests, the value of OccupiedThreads increases. For example, 1,000 requests take 2 min 22 sec, and the value of OccupiedThreads is 129.
So requests seem to get queued after 30 concurrent calls, even though lots of worker threads are available. The pool gradually adds threads for concurrent execution, but that is not enough (129 threads for 1,000 requests).
As our services make a lot of IO calls (some of them external API calls and some database queries), the latency is also high. Since we make all IO calls the async way, the server can serve lots of requests concurrently, but we need more concurrency when the processors are doing real work. We are using the S2 service plan with one instance. Increasing the instance count would increase concurrency, but we need more concurrency from a single instance.
After reading some blogs and documentation on IIS, we have seen there is a setting, minFreeThreads. If the number of available threads in the thread pool falls below the value of this setting, IIS starts to queue requests. Is there anything like this in App Service? And is it really possible to get more concurrency from an Azure App Service, or are we missing some configuration there?
At last I got the answer to my question. The ASP.NET thread pool maintains a pool of threads that have already incurred the thread-initialization costs and are easy to reuse. The .NET thread pool is also self-tuning: it monitors CPU and other resource utilization, and it adds new threads or trims the pool size as needed. When there are a lot of requests and not enough threads available in the pool, the thread pool starts to add new threads, but first it runs its own algorithm to check the memory and CPU usage of the system, which takes a long time. That is why we see the worker-thread count increase slowly and a lot of requests get queued. Luckily, there is an option to set the number of worker threads the pool will create before it switches to that algorithm. The code is something like the below.
public string Settings()
{
    int minWorker, minIOC;
    ThreadPool.GetMinThreads(out minWorker, out minIOC);
    if (ThreadPool.SetMinThreads(300, minIOC))
    {
        return "The minimum number of threads was set successfully.";
    }
    else
    {
        return "The minimum number of threads was not changed.";
    }
}
Here ThreadPool.SetMinThreads(300, minIOC) sets the minimum number of threads the thread pool will create on demand before switching to its algorithm for adding or removing threads. I added this method as an action on my Web API controller, and after invoking it with a request, when I made 300 concurrent requests to the Test/Get endpoint, all of them ran concurrently and completed in 11 seconds, with no requests queued.
Per my understanding, you need to check MinWorkerThreads via ThreadPool.GetMinThreads, which retrieves the number of idle threads the ThreadPool maintains in anticipation of new requests. I would recommend returning the current thread pool info after your action (Test/Get) has executed, instead of making a separate request to Test/ThreadInfo. Based on your code, I tested it on my web app with the B1 pricing tier as follows:
1,000 concurrent requests, sleeping 15 s in each action: overall elapsed time 3 min 20 sec.
The thread pool creates and destroys worker threads in order to optimize throughput; if the current threads can handle the requests, no more threads are created. Once those threads finish executing their activities, they are returned to the thread pool.
Then I used asynchronous programming and changed the action as follows:
public async Task<ActionResult> DoJob(int id)
{
    await Task.Delay(TimeSpan.FromSeconds(15));
    return ThreadInfo();
}
Result: nearly 3,000 concurrent requests were handled.
In general, AvailableWorkerThreads just means the number of additional worker threads that can be started, not the number of worker threads that have been created for you. I would recommend using asynchronous programming, deploying your real workload to the Azure web app, and then checking the real performance and investigating the relevant approaches if there are any bottlenecks.
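To make that recommendation concrete, here is a minimal sketch of an I/O-bound action written asynchronously (the route and URL are placeholders): the worker thread is returned to the pool while the external call is in flight, which is what lets a single instance serve far more concurrent requests.
[Route("Test/GetExternal")]
public async Task<string> GetExternal()
{
    using (var client = new HttpClient())
    {
        // No thread is blocked while the response is awaited.
        return await client.GetStringAsync("https://example.com/"); // placeholder URL
    }
}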

Timeout Exception - Queuing of Requests? Not enough threads?

Background:
I have a service which aggregates data from multiple other services. To make things happen in a timely manner I use async throughout the code, and then gather the various requests into a list of tasks.
Here is some excerpts from the code:
private async Task<List<Foo>> Baz(..., int timeout)
{
    var tasks = new List<Task<IEnumerable<Foo>>>();
    tasks.Add(GetFoo1(..., timeout));
    tasks.Add(GetFoo2(..., timeout));
    // Up to 6, depending on other parameters. Some tasks return multiple objects.
    return await Task.WhenAll(tasks)
        .ContinueWith(antecedent => antecedent.Result.SelectMany(f => f).ToList())
        .ConfigureAwait(false);
}
private async Task<IEnumerable<Foo>> GetFoo1(..., int timeout)
{
    Stopwatch sw = new Stopwatch();
    sw.Start();
    var value = await SomeAsyncronousService.GetAsync(..., timeout).ConfigureAwait(false);
    sw.Stop();
    // Record timing...
    return new[] { new Foo(..., value) };
}
private async Task<IEnumerable<Foo>> GetFoo2(..., int timeout)
{
    return await Task.Run(() =>
    {
        Stopwatch sw = new Stopwatch();
        sw.Start();
        var r = new[] { new Foo(..., SomeSyncronousService.Get(..., timeout)) };
        sw.Stop();
        // Record timing...
        return r;
    }).ConfigureAwait(false);
}
// In class SomeAsyncronousService
public async Task<string> GetAsync(..., int timeout)
{
    ...
    try
    {
        using (var httpClient = HttpClientFactory.Create())
        {
            // I have tried it with both Timeout and a CTS. The behavior is the same.
            //httpClient.Timeout = TimeSpan.FromMilliseconds(timeout);
            var cts = new CancellationTokenSource();
            cts.CancelAfter(timeout);
            var content = ...;
            var responseMessage = await httpClient.PostAsync(Endpoint, content, cts.Token).ConfigureAwait(false);
            if (responseMessage.IsSuccessStatusCode)
            {
                var contentData = await responseMessage.Content.ReadAsStringAsync().ConfigureAwait(false);
                ...
                return ...
            }
            ...
        }
    }
    catch (OperationCanceledException ex)
    {
        // Log statement ...
    }
    catch (Exception ex)
    {
        // Log statement ...
    }
    return ...;
}
The Symptoms:
This code works great on my local machine, and it works fine on our test servers most of the time. However, occasionally we get a batch of mass recorded timeouts - recorded by the "Record timing" comments above and by the log statements on OperationCanceledExceptions. I do not have any way of telling whether the services I call actually timed out.
Now, when I say a series of timeouts, I mean that most or all of the tasks (and the HttpClients that all but one of them use; the other uses a WCF service) time out at about the same time.
Now, I know what you are thinking: I am passing in the same timeout. That's right, but I pass in 250 ms, and the run times being reported by the various stopwatches are around 800 ms or higher.
Now, I do see the OperationCanceledExceptions in the log, but the timestamp of the exception is the same as the timestamp of when the stopwatch ended (or within 2-3 ms), and my service is failing because clients expect it to respond in 500 ms or less, not 800 ms.
Now, normally the various services respond in less than 100 ms, with a wide variance among the results. When a problem occurs, and most/all return in 800 ms or more, they vary by only ~10 ms. The dependencies I call are all on different domains. It seems highly unlikely that all of them are really taking that long to respond, all at the same time.
I suppose there could be a network issue affecting all requests at the same time, but the other services in our network do not experience the same behavior - it is limited to the new service I am writing.
Even if that were the case, I would expect the cancellation exceptions to occur after 250 ms, and then for the task to end and the stopwatch to record 250 ms (plus 5-20 ms or so for exception handling).
So I do not think it is a network issue. Now, I am sure that at least part of the problem is related to me not cancelling/timing out correctly, but it seems to me that all of the outgoing requests from the service are being affected at the same time, independent of HttpClient.
The reason I say that is that the WCF service also shows 800+ ms (according to the stopwatch) when the rest of the requests time out. The WCF call is not asynchronous. Its timeout is set like this:
var binding = new BasicHttpBinding()
{
    Security = new BasicHttpSecurity()
    {
        Mode = BasicHttpSecurityMode.TransportCredentialOnly,
        Transport = new HttpTransportSecurity()
        {
            ClientCredentialType = HttpClientCredentialType.Ntlm
        }
    },
    ReceiveTimeout = TimeSpan.FromMilliseconds(timeout)
};
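(An aside worth verifying against the WCF documentation: on the client side it is usually SendTimeout, not ReceiveTimeout, that bounds how long a request/reply operation waits for a response; ReceiveTimeout mainly governs idle-connection lifetime. A sketch setting both:)
var binding = new BasicHttpBinding
{
    SendTimeout = TimeSpan.FromMilliseconds(timeout),    // bounds the request/reply exchange
    ReceiveTimeout = TimeSpan.FromMilliseconds(timeout)  // mainly idle/session lifetime
};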
The Problem:
So, in short, I think that something is causing all outgoing requests to any domain to pause or queue, which is causing the observed behavior.
I have spent days trying to figure out what is going on, but have had no luck. Any ideas?
EDIT
I think what is happening is that the requests are being put on hold because there isn't a thread available, and then a few hundred milliseconds later a thread becomes available and the task starts. Timing the method call shows that it takes 800 ms, but the timeout on the HttpClient doesn't start until a thread is available to run the async call.
That would also explain why I see the method take 800+ ms, yet sometimes it still completes without a timeout exception. Other times it throws a timeout exception and does not complete.
I have tried setting the ServicePointManager.DefaultConnectionLimit to 200 in Application_Start, but that did not solve the issue.
The service isn't taking that much traffic compared to our other services, and none of the others appear to have the same problem.
Any ideas?
Edit 2
I logged into the box and monitored netstat while doing (minor) load tests.
Using HttpClient, with 1-2 requests per second the ports would show ESTABLISHED, then move to TIME_WAIT for about 4 minutes. With 3+ requests per second I would end up with roughly a constant 100 × requests-per-second ESTABLISHED ports (so 300 for a 3-per-second load test), and then I would start seeing them go to CLOSE_WAIT instead of TIME_WAIT - indicating an error condition on close. At the same time I would see a spike in the number of exceptions and in the time to execute the requests. (TcpTimedWaitDelay does not apply to CLOSE_WAIT.)
So I rewrote the whole thing to use HttpWebRequest in serial, instead of HttpClient in parallel, and ran the same tests.
Now the ESTABLISHED ports equal 0-2 × requests per second, and the ports then move on to TIME_WAIT as expected. Performance and throughput improved, but the problem didn't clear up completely.
Then I set TcpTimedWaitDelay to 30 (default 240). Performance increased dramatically. I have a primitive load test that hits it with 40 requests per second without any issues. I will get a more thorough test set up, but I think the problem has been solved.
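(For reference, TcpTimedWaitDelay is a machine-wide Windows registry value; 30 is the value used above, in seconds:)
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
    TcpTimedWaitDelay (REG_DWORD) = 30   ; seconds; default is 240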
I don't know what is going on, but it appears that HttpClient was not closing the ephemeral ports correctly underneath. Many of the developers and architects at my company looked at it and couldn't see anything wrong with the code. I tried having a single HttpClient in a using statement per request, as well as a single HttpClient per API I call on the back end. I tried using HttpClient in parallel and in serial, and with async/await and without. No matter what I tried, the behavior was the same.
I would like to be able to use HttpClient, but I can't spend any more time on this issue, as I have it working with HttpWebRequest. My next step is to make the HttpWebRequests occur in parallel.
Thank you for your input.
I've experienced similar frustrations with the HttpClient. In my scenario I found setting MaxServicePointIdleTime to a much lower value and DefaultConnectionLimit to a high value on the ServicePointManager resolved my issues. I believe in my case I was experiencing pool starvation as the connections were being held open.
You may also want to test without the debugger attached, in release, if you are not already doing so, as the TaskScheduler behaves differently when debugging.
The following MSDN article is very helpful: http://blogs.msdn.com/b/jpsanders/archive/2009/05/20/understanding-maxservicepointidletime-and-defaultconnectionlimit.aspx
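A minimal sketch of the tuning described above (both values are illustrative and should be validated under load):
// Allow more concurrent connections per host (the default is 2).
ServicePointManager.DefaultConnectionLimit = 100;
// Close idle connections sooner so they are not held open (the default is 100,000 ms).
ServicePointManager.MaxServicePointIdleTime = 10000; // milliseconds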

await Task.Delay takes longer than expected

I wrote a multithreaded app which uses async/await extensively. It is supposed to download some stuff at a scheduled time. To achieve that, it uses await Task.Delay. Sometimes it sends thousands of requests every minute.
It works as expected, but sometimes my program needs to log something big. When it does, it serializes many objects and saves them to a file. During that time, I noticed that my scheduled tasks execute too late. I've put all the logging on a separate thread with the lowest priority, and the problem doesn't occur as often anymore, but it still happens. The thing is, I want to know when it happens, and in order to know that I have to use something like this:
var delayTestDate = DateTime.Now;
await Task.Delay(5000);
// Delays of up to 1 second are tolerated.
if ((DateTime.Now - delayTestDate).TotalMilliseconds > 6000) Console.WriteLine("The task has been delayed!");
Moreover, I have found that Task.Run, which I also use, can also cause delays. To monitor that, I have to use even uglier code:
var delayTestDate = DateTime.Now;
await Task.Run(() =>
{
    // Delays of up to 1 second are tolerated.
    if ((DateTime.Now - delayTestDate).TotalMilliseconds > 1000) Console.WriteLine("The task has been delayed!");
    // do some stuff
    delayTestDate = DateTime.Now;
});
// Delays of up to 1 second are tolerated.
if ((DateTime.Now - delayTestDate).TotalMilliseconds > 1000) Console.WriteLine("The task has been delayed!");
I have to use this before and after every await and Task.Run, and inside every async function, which is ugly and inconvenient. I can't put it into a separate function, since it would have to be async and I would have to await it anyway. Does anybody have an idea for a more elegant solution?
EDIT:
Some information I provided in the comments:
As @YuvalItzchakov noticed, the problem may be caused by thread pool starvation. That's why I used System.Threading.Thread to take care of the logging outside of the thread pool, but, as I said, the problem still occurs sometimes.
I have a processor with four cores, and by subtracting the results of ThreadPool.GetAvailableThreads from ThreadPool.GetMaxThreads I get 0 busy worker threads and 1-2 busy completion port threads. Process.GetCurrentProcess().Threads.Count usually returns about 30. It's a Windows Forms app, and although it only has a tray icon with a menu, it starts with 11 threads. When it gets to sending thousands of requests per minute, it quickly gets up to 30.
As @Noseratio suggested, I tried playing with ThreadPool.SetMinThreads and ThreadPool.SetMaxThreads, but it didn't even change the numbers of busy threads mentioned above.
When you execute Task.Run, it uses thread pool threads to execute those tasks. When you have long-running tasks, you starve the thread pool, since its resources are occupied by those long-running tasks.
Two suggestions:
1. When running long-running tasks, use Task.Factory.StartNew with TaskCreationOptions.LongRunning, which will trigger the creation of a new dedicated thread. Be cautious here as well: spinning up too many new threads will cause excessive context switching, which will slow your app down. (A sketch follows this list.)
2. Use true async where you have to do IO-bound work: use APIs that support the TAP, such as HttpClient and Stream, which won't consume a thread to execute blocking work.
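A minimal sketch of the first suggestion (the work item is a placeholder):
// LongRunning hints the scheduler to give this work a dedicated thread,
// so a blocking loop here won't starve the thread pool.
var longRunning = Task.Factory.StartNew(
    () => DoLongBlockingWork(),        // placeholder for your long-running work
    CancellationToken.None,
    TaskCreationOptions.LongRunning,
    TaskScheduler.Default);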
There are overheads in async/await, and the tasks themselves are executed at a lower priority. If you need something to happen reliably at an accurate interval, async/await/TPL is not the interface to use.
Try creating an independent background thread that loops, checking whether it is scheduled to do work. That way you can control the priority and timing directly, without going through TPL/async.
Thread backgroundThread = new Thread(BackgroundWork) { IsBackground = true };
DateTime nextInterval = DateTime.Now;

public void BackgroundWork()
{
    // Loop forever; the thread sleeps briefly between schedule checks.
    while (true)
    {
        if (DateTime.Now > nextInterval)
        {
            DoWork();
            nextInterval = nextInterval.Add(new TimeSpan(0, 0, 0, 10)); // 10 seconds
        }
        Thread.Sleep(100);
    }
}

// Start it from your initialization code:
// backgroundThread.Start();
Adjust the Sleep(..) and interval values as needed.
I think you're experiencing the situation described by Joe Duffy in his "CLR thread pool injection, stuttering problems" blog post:
One silly thing our thread pool currently does has to do with how it
creates new threads. Namely, it severely throttles creation of new
threads once you surpass the “minimum” number of threads, which, by
default, is the number of CPUs on the machine. We limit ourselves to
at most one new thread per 500ms once we reach or surpass this number.
One solution might be to explicitly increase the minimum number of thread pool threads before making any use of TPL, e.g.:
ThreadPool.SetMaxThreads(workerThreads: 200, completionPortThreads: 200);
ThreadPool.SetMinThreads(workerThreads: 100, completionPortThreads: 100);
Try playing with these numbers and see if the problem goes away.

How to stop BackgroundWorker from queuing?

I have the following code:
for (int i = 1; i <= 500; i++)
{
    BackgroundWorker t = new BackgroundWorker();
    t.DoWork += SOME DB METHOD THAT TAKES 5 SECONDS;
    t.RunWorkerAsync();
}
When I profile this in SQL, I notice that the BackgroundWorker appears to queue the work in such a way that only 4 or 5 active connections are open at the same time, versus all 500 connections opening at once. I get no timeouts or blocking from my DB. How can I prevent this queuing and hit the database with all 500 concurrent threads at once?
BackgroundWorker uses the ThreadPool. You can adjust the ThreadPool with ThreadPool.SetMinThreads and ThreadPool.SetMaxThreads (as sketched below). Whether it will actually be possible to establish that many connections to your database server is another question (and may cause other problems).
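A minimal sketch of that adjustment (500 matches the loop above and is purely illustrative; the pool normally ramps up far more slowly):
int minWorker, minIo;
ThreadPool.GetMinThreads(out minWorker, out minIo);
// Raise only the worker-thread floor; keep the IO completion floor as-is.
ThreadPool.SetMinThreads(500, minIo);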
However, it is not advisable to start 500 BackgroundWorker instances! A better solution is provided by the Task Parallel Library with the Task class.
Something like this should help:
Task.Factory.StartNew(
() => { SOME DB METHOD THAT TAKES 5 SECONDS },
TaskCreationOptions.LongRunning
);
From the MSDN documentation:
LongRunning - Specifies that a task will be a long-running,
coarse-grained operation involving fewer, larger components than
fine-grained systems. It provides a hint to the TaskScheduler that
oversubscription may be warranted. Oversubscription lets you create
more threads than the available number of hardware threads.
Or, you could completely bypass the thread pool and use the Thread class directly:
var t = new Thread(() => { SOME DB METHOD THAT TAKES 5 SECONDS });
t.Start();
"Raw" threads will be harder to work with than tasks, though...
You don't, since your computer can't possibly run 500 threads simultaneously. Most probably you have 8 to 16 logical processors, and 4 or 5 is what's left available when you run your code. Seems 100% legit.
