The code below continues to create threads, even when the queue is empty, until eventually an OutOfMemoryException occurs. If I replace the Parallel.ForEach with a regular foreach, this does not happen. Does anyone know why this may happen?
public delegate void DataChangedDelegate(DataItem obj);

public class Consumer
{
    public DataChangedDelegate OnCustomerChanged;
    public DataChangedDelegate OnOrdersChanged;

    private CancellationTokenSource cts;
    private CancellationToken ct;
    private BlockingCollection<DataItem> queue;

    public Consumer(BlockingCollection<DataItem> queue) {
        this.queue = queue;
        Start();
    }

    private void Start() {
        cts = new CancellationTokenSource();
        ct = cts.Token;
        Task.Factory.StartNew(() => DoWork(), ct);
    }

    private void DoWork() {
        Parallel.ForEach(queue.GetConsumingPartitioner(), item => {
            if (item.DataType == DataTypes.Customer) {
                OnCustomerChanged(item);
            } else if (item.DataType == DataTypes.Order) {
                OnOrdersChanged(item);
            }
        });
    }
}
I think Parallel.ForEach() was made primarily for processing bounded collections, and it doesn't expect collections like the one returned by GetConsumingPartitioner(), where MoveNext() can block for a long time.
The problem is that Parallel.ForEach() tries to find the best degree of parallelism, so it starts as many Tasks as the TaskScheduler lets it run. But the TaskScheduler sees many Tasks that take a very long time to finish and that aren't doing anything (they block), so it keeps starting new ones.
I think the best solution is to set the MaxDegreeOfParallelism.
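For example, a minimal sketch of capping the loop with ParallelOptions (the value 8 is an arbitrary illustration, not a tuned recommendation):

private void DoWork() {
    var options = new ParallelOptions {
        // Cap the number of concurrent iterations so blocked
        // MoveNext() calls can't trigger unbounded task creation.
        MaxDegreeOfParallelism = 8 // arbitrary example value
    };
    Parallel.ForEach(queue.GetConsumingPartitioner(), options, item => {
        if (item.DataType == DataTypes.Customer) {
            OnCustomerChanged(item);
        } else if (item.DataType == DataTypes.Order) {
            OnOrdersChanged(item);
        }
    });
}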
As an alternative, you could use TPL Dataflow's ActionBlock. The main difference in this case is that ActionBlock doesn't block any threads when there are no items to process, so the number of threads wouldn't get anywhere near the limit.
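A rough sketch of that alternative, reusing the question's DataItem handling (the degree of parallelism shown is an arbitrary example; producers would call block.Post(item) instead of adding to the BlockingCollection):

private ActionBlock<DataItem> block;

private void Start() {
    // The block runs its delegate on pool threads only while items
    // are available; an idle block holds no threads at all.
    block = new ActionBlock<DataItem>(item => {
        if (item.DataType == DataTypes.Customer) {
            OnCustomerChanged(item);
        } else if (item.DataType == DataTypes.Order) {
            OnOrdersChanged(item);
        }
    }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 }); // arbitrary example
}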
The Producer/Consumer pattern is mainly used when there is just one Producer and one Consumer.
However, what you are trying to achieve (multiple consumers) fits more neatly into the Worklist pattern. The following code was taken from the Unit 2 slide deck "2c - Shared Memory Patterns" from a parallel programming class taught at the University of Utah, which is available in the download at http://ppcp.codeplex.com/
BlockingCollection<Item> worklist;
CancellationTokenSource cts;
int itemcount;

public void Run()
{
    int num_workers = 4;
    //create worklist, filled with initial work
    worklist = new BlockingCollection<Item>(
        new ConcurrentQueue<Item>(GetInitialWork()));
    cts = new CancellationTokenSource();
    itemcount = worklist.Count;
    for (int i = 0; i < num_workers; i++)
        Task.Factory.StartNew(RunWorker);
}

IEnumerable<Item> GetInitialWork() { ... }

public void RunWorker()
{
    try
    {
        do
        {
            Item i = worklist.Take(cts.Token);
            //blocks until item available or cancelled
            Process(i);
            //exit loop if no more items left
        } while (Interlocked.Decrement(ref itemcount) > 0);
    }
    finally
    {
        if (!cts.IsCancellationRequested)
            cts.Cancel();
    }
}

public void AddWork(Item item)
{
    Interlocked.Increment(ref itemcount);
    worklist.Add(item);
}

public void Process(Item i)
{
    //Do what you want with the work item here.
}
The preceding code allows you to add worklist items to the queue, and lets you set an arbitrary number of workers (in this case, four) to pull items out of the queue and process them.
Another great resource for the Parallelism on .Net 4.0 is the book "Parallel Programming with Microsoft .Net" which is freely available at: http://msdn.microsoft.com/en-us/library/ff963553
Internally, the Task Parallel Library's Parallel.For and Parallel.ForEach follow a hill-climbing algorithm to determine how much parallelism should be used for the operation.
More or less, they start by running the body on one task, move to two, and so on, until a breaking point is reached and they need to reduce the number of tasks.
This works quite well for method bodies that complete quickly, but if the body takes a long time to run, it may take a long time before the algorithm realizes it needs to decrease the amount of parallelism. Until that point, it continues adding tasks, and can possibly crash the computer.
I learned the above during a lecture given by one of the developers of the Task Parallel Library.
Specifying the MaxDegreeOfParallelism is probably the easiest way to go.
Related
I was noticing an initial slowdown in my process, and after taking multiple hang dumps, I was able to isolate the issue and reproduce the scenario using the following code. I am using a library that takes locks internally and eventually calls the user-side implementation of certain methods. These methods make async calls using HttpClient. These async calls are made from within those locks inside the library.
Now, my theory as to what is happening (do correct me if I am wrong):
The tasks that get spun up try to acquire the lock and hold on to their threads while doing so, fast enough that the continuation of the first PingAsync call has to wait for the default task scheduler to spin up a new thread for it to run on, which takes about 0.5 s under the default .NET thread-injection algorithm. This is why I think I notice delays once the total task count exceeds 32, and why the delay grows linearly with the total task count.
The workaround:
Increase the min threads count, which I think treats the symptom rather than the actual problem.
Another way is to limit concurrency to control the number of tasks fired. But these tasks are spun up by a web server for incoming HTTP requests, and typically we will not have control over that (or will we?)
I understand that mixing async and non-async code is bad design, and that using a semaphore's async methods would be a better way to go. Assuming I do not have control over this library, how does one go about mitigating this problem?
const int ParallelCount = 16;
const int TotalTasks = 33;
static object _lockObj = new object();
static HttpClient _httpClient = new HttpClient();
static int count = 0;

static void Main(string[] args)
{
    ThreadPool.GetMinThreads(out int workerThreads, out int ioThreads);
    Console.WriteLine($"Min threads count. Worker: {workerThreads}. IoThreads: {ioThreads}");
    ThreadPool.GetMaxThreads(out workerThreads, out ioThreads);
    Console.WriteLine($"Max threads count. Worker: {workerThreads}. IoThreads: {ioThreads}");

    //var done = ThreadPool.SetMaxThreads(1024, 1000);
    //ThreadPool.GetMaxThreads(out workerThreads, out ioThreads);
    //Console.WriteLine($"Set Max Threads success? {done}.");
    //Console.WriteLine($"Max threads count. Worker: {workerThreads}. IoThreads: {ioThreads}");

    //var done = ThreadPool.SetMinThreads(1024, 1000);
    //ThreadPool.GetMinThreads(out workerThreads, out ioThreads);
    //Console.WriteLine($"Set Min Threads success? {done}.");
    //Console.WriteLine($"Min threads count. Worker: {workerThreads}. IoThreads: {ioThreads}");

    var startTime = DateTime.UtcNow;
    var tasks = new List<Task>();
    for (int i = 0; i < TotalTasks; i++)
    {
        tasks.Add(Task.Run(() => LibraryMethod()));
        //while (tasks.Count > ParallelCount)
        //{
        //    var task = Task.WhenAny(tasks.ToArray()).GetAwaiter().GetResult();
        //    if (task.IsFaulted)
        //    {
        //        throw task.Exception;
        //    }
        //    tasks.Remove(task);
        //}
    }
    Task.WaitAll(tasks.ToArray());
    //while (tasks.Count > 0)
    //{
    //    var task = Task.WhenAny(tasks.ToArray()).GetAwaiter().GetResult();
    //    if (task.IsFaulted)
    //    {
    //        throw task.Exception;
    //    }
    //    tasks.Remove(task);
    //    Console.Write(".");
    //}
    Console.Write($"\nDone in {(DateTime.UtcNow - startTime).TotalMilliseconds}");
    Console.ReadLine();
}
Assuming this is the part where library methods are called,
public static void LibraryMethod()
{
    lock (_lockObj)
    {
        SimpleNonAsync();
    }
}
Eventually, the user implementation of this method gets called, which is async.
public static void SimpleNonAsync()
{
    //PingAsync().Result;
    //PingAsync().ConfigureAwait(false).Wait();
    PingAsync().Wait();
}
private static async Task PingAsync()
{
    Console.Write($"{Interlocked.Increment(ref count)}.");
    await _httpClient.SendAsync(new HttpRequestMessage
    {
        RequestUri = new Uri("http://127.0.0.1"),
        Method = HttpMethod.Get
    });
}
These async calls are made from within these locks inside the library.
This is a design flaw. No one should call arbitrary code while under a lock.
That said, the locks have nothing to do with the problem you're seeing.
I understand that mixing async and non-async code is bad design, and that using a semaphore's async methods would be a better way to go. Assuming I do not have control over this library, how does one go about mitigating this problem?
The problem is that the library is forcing your code to be synchronous. This means one thread is being blocked for every download; there's no way around that as long as the library's callbacks are synchronous.
Increase the min threads count, which I think treats the symptom rather than the actual problem.
If you can't modify the library, then you must use one thread per request, and this becomes a viable workaround. You have to treat the symptom because you can't fix the problem (i.e., the library).
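For instance, a minimal sketch of that workaround (the worker count 64 is an arbitrary illustration; size it to your expected number of concurrently blocked requests):

// Pre-grow the pool so blocked requests don't wait on the thread
// pool's slow one-thread-at-a-time injection heuristic.
ThreadPool.GetMinThreads(out int workerThreads, out int ioThreads);
ThreadPool.SetMinThreads(Math.Max(workerThreads, 64), ioThreads); // 64 is an arbitrary example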
Another way is to limit concurrency to control the number of tasks fired. But these tasks are spun up by a web server for incoming HTTP requests, and typically we will not have control over that (or will we?)
No; the tasks causing problems are the ones you're spinning up yourself using Task.Run. The tasks on the server are completely independent; your code can't influence or even detect them.
If you want higher concurrency without waiting for thread injection, then you'll need to increase min threads, and you'll also probably need to increase ServicePointManager.DefaultConnectionLimit. You can then continue to use Task.Run, or (as I would prefer) Parallel or Parallel LINQ to do parallel processing. One nice aspect of Parallel / Parallel LINQ is that it has built-in support for throttling, if that is also desired.
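As a rough sketch of the Parallel approach under those assumptions (requests is a hypothetical collection of work items, and both limits are arbitrary example values):

// Raise the outbound connection cap alongside the thread count.
ServicePointManager.DefaultConnectionLimit = 100; // arbitrary example

var options = new ParallelOptions { MaxDegreeOfParallelism = 16 }; // arbitrary example
Parallel.ForEach(requests, options, request =>
{
    // Each iteration still blocks one thread, since the library's
    // callbacks are synchronous, but the total is now throttled.
    LibraryMethod();
});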
I have around 10 000 000 tasks that each takes from 1-10 seconds to complete. I am running those tasks on a powerful server, using 50 different threads, where each thread picks the first not-done task, runs it, and repeats.
Pseudo-code:
for i = 0 to 50:
    run a new thread:
        while True:
            task = first available task
            if no available tasks: exit thread
            run task
Using this code, I can run all the tasks in parallel on any given number of threads.
In reality, the code uses C#'s Task.WhenAll, and looks like this:
ServicePointManager.DefaultConnectionLimit = threadCount; //Allow more simultaneous HTTP requests

var currentIndex = -1;
var threads = new List<Task>(); //List of threads
for (int i = 0; i < threadCount; i++) //Generate the threads
{
    var wc = CreateWebClient();
    threads.Add(Task.Run(() =>
    {
        while (true) //Each thread loops, picking the first available task and executing it.
        {
            var index = Interlocked.Increment(ref currentIndex);
            if (index >= tasks.Count) break;
            var task = tasks[index];
            RunTask(conn, wc, task, port);
        }
    }));
}
await Task.WhenAll(threads);
This works just as I wanted it to, but I have a problem: since this code takes a lot of time to run, I want the user to see some progress. The progress is displayed in a colored bitmap (representing a matrix), and also takes some time to generate (a few seconds).
Therefore, I want to generate this visualization on a background thread. But this other background thread is never executed. My suspicion is that it is using the same thread pool as the parallel code, and is therefore enqueued, and will not be executed before the parallel code is actually finished. (And that's a bit too late.)
Here's an example of how I generate the progress visualization:
private async void Refresh_Button_Clicked(object sender, RoutedEventArgs e)
{
    var bitmap = await Task.Run(() => // <<< This task is never executed!
    {
        //bla, bla, various database calls, and generating a relatively large bitmap
    });
    //Convert the bitmap into a WPF image, and update the GUI
    VisualizationImage = BitmapToImageSource(bitmap);
}
So, how could I best solve this problem? I could create a list of Tasks, where each Task represents one of my tasks, and run them with Parallel.Invoke, and pick another Thread pool (I think). But then I have to generate 10 million Task objects, instead of just 50 Task objects, running through my array of stuff to do. That sounds like it uses much more RAM than necessary. Any clever solutions to this?
EDIT:
As Panagiotis Kanavos suggested in one of his comments, I tried replacing some of my loop logic with ActionBlock, like this:
// Create an ActionBlock<ZoneTask> that performs some work.
var workerBlock = new ActionBlock<ZoneTask>(
    t =>
    {
        var wc = CreateWebClient(); //This probably generates some unnecessary overhead, but that's a problem I can solve later.
        RunTask(conn, wc, t, port);
    },
    // Specify a maximum degree of parallelism.
    new ExecutionDataflowBlockOptions
    {
        MaxDegreeOfParallelism = threadCount
    });

foreach (var t in tasks) //Note: the objects in the tasks array are not Task objects
    workerBlock.Post(t);
workerBlock.Complete();
await workerBlock.Completion;
Note: RunTask just executes a web request using the WebClient and parses the results. There's nothing in there that can create a deadlock.
This seems to work like the old parallelism code, except that it needs a minute or two to run the initial foreach loop that posts the tasks. Is this delay really worth it?
Nevertheless, my progress task still seems to be blocked. Ignoring the Progress<T> suggestion for now, since this reduced code still suffers from the same problem:
private async void Refresh_Button_Clicked(object sender, RoutedEventArgs e)
{
    Debug.WriteLine("This happens");
    var bitmap = await Task.Run(() =>
    {
        Debug.WriteLine("This does not!");
        //Still doing some work here, so it's not optimized away.
    });
    VisualizationImage = BitmapToImageSource(bitmap);
}
So it still looks like new tasks are not executed as long as the parallel task is running. I even reduced the "MaxDegreeOfParallelism" from 50 to 5 (on a 24-core server) to see if Peter Ritchie's suggestion was right, but no change. Any other suggestions?
ANOTHER EDIT:
The issue seems to have been that I overloaded the thread pool with all my simultaneous blocking I/O calls. I replaced WebClient with HttpClient and its async functions, and now everything seems to be working nicely.
Thanks to everyone for the great suggestions! Even though not all of them directly solved the problem, I'm sure they all improved my code. :)
.NET already provides a mechanism to report progress with the IProgress<T> interface and the Progress<T> implementation.
The IProgress<T> interface allows clients to publish messages with the Report(T) method without having to worry about threading. The implementation ensures that the messages are processed on the appropriate thread, e.g. the UI thread. By using the simple IProgress<T> interface, the background methods are decoupled from whoever processes the messages.
You can find more information in the Async in 4.5: Enabling Progress and Cancellation in Async APIs article. The cancellation and progress APIs aren't specific to the TPL; they can be used to simplify cancellation and reporting even for raw threads.
Progress<T> processes messages on the thread on which it was created. This can be done either by passing a processing delegate when the class is instantiated, or by subscribing to an event. Copying from the article:
private async void Start_Button_Click(object sender, RoutedEventArgs e)
{
    //construct Progress<T>, passing ReportProgress as the Action<T>
    var progressIndicator = new Progress<int>(ReportProgress);
    //call async method
    int uploads = await UploadPicturesAsync(GenerateTestImages(), progressIndicator);
}
where ReportProgress is a method that accepts an int parameter. It could also accept a complex class that reports work done, messages, etc.
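Alternatively, a minimal sketch of the event-based option mentioned above (progressBar is a hypothetical UI control):

var progressIndicator = new Progress<int>();
progressIndicator.ProgressChanged += (sender, percentComplete) =>
{
    // Raised on the SynchronizationContext captured at construction,
    // e.g. the UI thread, so it's safe to touch controls here.
    progressBar.Value = percentComplete; // hypothetical UI control
};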
The asynchronous method only has to call IProgress<T>.Report, e.g.:
async Task<int> UploadPicturesAsync(List<Image> imageList, IProgress<int> progress)
{
    int totalCount = imageList.Count;
    int processCount = await Task.Run<int>(async () =>
    {
        int tempCount = 0;
        foreach (var image in imageList)
        {
            //await the processing and uploading logic here
            int processed = await UploadAndProcessAsync(image);
            if (progress != null)
            {
                progress.Report((tempCount * 100 / totalCount));
            }
            tempCount++;
        }
        return tempCount;
    });
    return processCount;
}
This decouples the background method from whoever receives and processes the progress messages.
I have an application that works with a queue of strings (which correspond to different tasks the application needs to perform). At random moments the queue can be filled with strings (sometimes several times a minute, but it can also take a few hours).
Until now I've had a timer that checked the queue every few seconds for items and removed them.
I think there must be a nicer solution than this way. Is there any way to get an event or so when an item is added to the queue?
Yes. Take a look at TPL Dataflow, in particular the BufferBlock<T>, which does more or less the same as BlockingCollection but, by leveraging async/await, without the nasty side effect of jamming up your threads.
So you can:
async Task Main()
{
    var b = new BufferBlock<string>();
    // Run the producer and the consumer concurrently and await both.
    await Task.WhenAll(AddToBlockAsync(b), ReadFromBlockAsync(b));
}

public async Task AddToBlockAsync(BufferBlock<string> b)
{
    while (true)
    {
        b.Post("hello");
        await Task.Delay(1000);
    }
}

public async Task ReadFromBlockAsync(BufferBlock<string> b)
{
    await Task.Delay(10000); //let some messages buffer up...
    while (true)
    {
        var msg = await b.ReceiveAsync();
        Console.WriteLine(msg);
    }
}
I'd take a look at BlockingCollection.GetConsumingEnumerable. The collection will be backed by a queue by default, and it is a nice way to automatically take values from the queue as they are added, using a simple foreach loop.
There is also an overload that allows you to supply a CancellationToken, meaning you can cleanly break out.
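A minimal sketch of that overload (the queue and token names are illustrative):

var queue = new BlockingCollection<string>();
var cts = new CancellationTokenSource();

try
{
    // Blocks waiting for items; the loop ends when the token is
    // cancelled or CompleteAdding() is called on the collection.
    foreach (var item in queue.GetConsumingEnumerable(cts.Token))
    {
        Console.WriteLine(item);
    }
}
catch (OperationCanceledException)
{
    // Cancellation surfaces as an exception; treat it as a clean exit.
}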
Have you looked at BlockingCollection? The GetConsumingEnumerable() method allows an indefinite loop to be run on the consumer, to which new items will be yielded as they become available, with no need for timers or Thread.Sleep calls:
// Common:
BlockingCollection<string> _blockingCollection =
    new BlockingCollection<string>();

// Producer:
for (var i = 0; i < 100; i++)
{
    _blockingCollection.Add(i.ToString());
    Thread.Sleep(500); // So you can track the consumer synchronization. Remove.
}

// Consumer:
foreach (var item in _blockingCollection.GetConsumingEnumerable())
{
    Debug.WriteLine(item);
}
This seems like it has a very simple solution, but I've been looking on and off for months without finding a definitive answer.
I have an object being created by the UI thread. I'm actually creating several of the same type of object. Sometimes I'll create 1 or 2 every minute, sometimes I'll create 30 in a second. Each time an object is created, I need to perform some calculations on it. The combination of potentially 30 objects created in a short time and the expensive calculations I'm performing on them can really lag the UI.
I've tried performing the calculations via Tasks and BackgroundWorkers and all sorts of threading, but they all perform the calculations out of order, and it's imperative that the objects are calculated in the order they are created and that one doesn't start its calculations until the object ahead of it finishes its own.
I can find all sorts of information about how to perform these tasks in parallel, but can anyone explain to me how I can force them to happen sequentially, just not on the UI Thread? Any help would be greatly appreciated. I've been trying to figure this out for months :(
IMO, the easiest way to run tasks in sequential order here is to use Task.ContinueWith. Note TaskContinuationOptions.LazyCancellation, it's used to make sure that cancellation doesn't break the order of the sequence:
Task _currentTask = Task.FromResult(Type.Missing);
readonly object _lock = new Object();

void QueueTask(Action action)
{
    lock (_lock)
    {
        _currentTask = _currentTask.ContinueWith(
            lastTask =>
            {
                // re-throw the error of the last completed task (if any)
                lastTask.GetAwaiter().GetResult();
                // run the new task
                action();
            },
            CancellationToken.None,
            TaskContinuationOptions.LazyCancellation,
            TaskScheduler.Default);
    }
}

private void button1_Click(object sender, EventArgs e)
{
    for (var i = 0; i < 10; i++)
    {
        var sleep = 1000 - i * 100;
        QueueTask(() =>
        {
            Thread.Sleep(sleep);
            Debug.WriteLine("Slept for {0} ms", sleep);
        });
    }
}
I think a proper solution would be to use a Queue, since it is FIFO, processed by a task running in the background.
Your code could look something like this:
Edit
Edited to use Queue.Synchronized as @rwong mentioned (thanks!)
// Note: Queue.Synchronized exists only for the non-generic System.Collections.Queue.
var queue = new Queue();
//add objects to the queue..

var mySyncedQueue = Queue.Synchronized(queue);

Task.Factory.StartNew(() =>
{
    while (true)
    {
        // With a single consumer, checking Count before Dequeue is safe.
        if (mySyncedQueue.Count > 0)
        {
            var myObj = (MyCustomObject)mySyncedQueue.Dequeue();
            // do work...
        }
    }
}, TaskCreationOptions.LongRunning);
This is just to get you started; I'm sure your code could be more efficient once you know exactly what is needed :)
Edit 2
As you are accessing the queue from multiple threads, it is better to use a ConcurrentQueue.
The method won't look much different:
var concurrentQueue = new ConcurrentQueue<MyCustomObject>();
//add objects to the queue..

Task.Factory.StartNew(() =>
{
    while (true)
    {
        MyCustomObject myObj;
        if (concurrentQueue.TryDequeue(out myObj))
        {
            // do work...
        }
    }
}, TaskCreationOptions.LongRunning);
Create exactly one background task, then let it process all items from a thread-safe FIFO buffer. Add objects to the FIFO from your GUI thread.
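A minimal sketch of that idea, assuming a BlockingCollection as the FIFO buffer (MyCustomObject and Calculate are illustrative names):

// Shared FIFO buffer; BlockingCollection wraps a ConcurrentQueue by default.
var buffer = new BlockingCollection<MyCustomObject>();

// Exactly one background consumer, so items are processed strictly in order.
Task.Factory.StartNew(() =>
{
    foreach (var obj in buffer.GetConsumingEnumerable())
    {
        Calculate(obj); // hypothetical expensive calculation
    }
}, TaskCreationOptions.LongRunning);

// On the GUI thread, whenever an object is created:
// buffer.Add(newObject);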
I've heard a bunch of podcasts recently about the TPL in .NET 4.0. Most of them describe background activities like downloading images or doing a computation, using tasks so that the work doesn't interfere with a GUI thread.
Most of the code I work on has more of a multiple-producer / single-consumer flavor, where work items from multiple sources must be queued and then processed in order. One example would be logging, where log lines from multiple threads are sequentialized into a single queue for eventual writing to a file or database. All the records from any single source must remain in order, and records from the same moment in time should be "close" to each other in the eventual output.
So multiple threads or tasks or whatever are all invoking a queuer:
lock (_queue) // or use a lock-free queue!
{
    _queue.Enqueue(some_work);
    _queueSemaphore.Release();
}
And a dedicated worker thread processes the queue:
while (_queueSemaphore.WaitOne())
{
    lock (_queue)
    {
        some_work = _queue.Dequeue();
    }
    deal_with(some_work);
}
It's always seemed reasonable to dedicate a worker thread for the consumer side of these tasks. Should I write future programs using some construct from the TPL instead? Which one? Why?
You can use a long running Task to process items from a BlockingCollection, as suggested by Wilka. Here's an example which pretty much meets your application's requirements. You'll see output something like this:
Log from task B
Log from task A
Log from task B1
Log from task D
Log from task C
Note that the outputs from A, B, C and D appear in random order because they depend on the start times of the threads, but B always appears before B1.
public class LogItem
{
    public string Message { get; private set; }

    public LogItem(string message)
    {
        Message = message;
    }
}

public void Example()
{
    BlockingCollection<LogItem> _queue = new BlockingCollection<LogItem>();

    // Start queue listener...
    CancellationTokenSource canceller = new CancellationTokenSource();
    Task listener = Task.Factory.StartNew(() =>
    {
        while (!canceller.Token.IsCancellationRequested)
        {
            LogItem item;
            if (_queue.TryTake(out item))
                Console.WriteLine(item.Message);
        }
    },
    canceller.Token,
    TaskCreationOptions.LongRunning,
    TaskScheduler.Default);

    // Add some log messages in parallel...
    Parallel.Invoke(
        () => { _queue.Add(new LogItem("Log from task A")); },
        () =>
        {
            _queue.Add(new LogItem("Log from task B"));
            _queue.Add(new LogItem("Log from task B1"));
        },
        () => { _queue.Add(new LogItem("Log from task C")); },
        () => { _queue.Add(new LogItem("Log from task D")); });

    // Pretend to do other things...
    Thread.Sleep(1000);

    // Shut down the listener...
    canceller.Cancel();
    listener.Wait();
}
I know this answer is about a year late, but take a look at MSDN, which shows how to create a LimitedConcurrencyLevelTaskScheduler from the TaskScheduler class. By limiting the concurrency to a single task, it will process your tasks in order as they are queued via:
LimitedConcurrencyLevelTaskScheduler lcts = new LimitedConcurrencyLevelTaskScheduler(1);
TaskFactory factory = new TaskFactory(lcts);
factory.StartNew(() =>
{
    // your code
});
I'm not sure that the TPL is adequate for your use case. From my understanding, the main use case for the TPL is to split one huge task into several smaller tasks that can run side by side. For example, if you have a big list and you want to apply the same transformation to each element, you can have several tasks applying the transformation to subsets of the list.
The case you describe doesn't seem to fit that picture. In your case, you don't have several tasks doing the same thing in parallel. You have several different tasks that each do their own job (the producers) and one task that consumes. Perhaps the TPL could be used for the consumer part if you want multiple consumers, because in that case each consumer does the same job (assuming you find a way to enforce the temporal consistency you are looking for).
Well, this of course is just my personal view on the subject.
Live long and prosper!
It sounds like BlockingCollection would be handy for you. So for your code above, you could use something like (assuming _queue is a BlockingCollection instance):
// for your producers
_queue.Add(some_work);
A dedicated worker thread processing the queue:
foreach (var some_work in _queue.GetConsumingEnumerable())
{
    deal_with(some_work);
}
Note: when all your producers have finished producing stuff, you'll need to call CompleteAdding() on _queue otherwise your consumer will be stuck waiting for more work.
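For example, a minimal sketch of that shutdown handshake:

// Producer side, once all work has been queued:
_queue.CompleteAdding(); // signals that no more items will be added

// The consumer's foreach over GetConsumingEnumerable() then drains
// any remaining items and exits instead of blocking forever.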