Simple Multithreading Question - c#

Ok I should already know the answer but...
I want to execute a number of different tasks in parallel on separate threads and wait for all of them to finish before continuing. I am aware that I could have used ThreadPool.QueueUserWorkItem() or a BackgroundWorker, but I did not want to use either (for no particular reason).
So is the code below the correct way to execute tasks in parallel on a background thread and wait for them to finish processing?
Thread[] threads = new Thread[3];
for (int i = 0; i < threads.Length; i++)
{
    threads[i] = new Thread(SomeActionDelegate);
    threads[i].Start();
}
for (int i = 0; i < threads.Length; i++)
{
    threads[i].Join();
}
I know this question must have been asked 100 times before so thanks for your answer and patience.

Yes, this is a correct way to do it. But if SomeActionDelegate is relatively small, the overhead of creating three threads will be significant. That's why you probably should use the ThreadPool anyway, even though it has no clean equivalent to Join.
A BackgroundWorker is mainly useful when interacting with the GUI.
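Although the ThreadPool has no direct equivalent of Join, a CountdownEvent (available since .NET 4) can emulate it. A minimal sketch, assuming SomeAction stands in for the SomeActionDelegate from the question:

```csharp
using System;
using System.Threading;

class Program
{
    static void Main()
    {
        const int workItems = 3;
        // CountdownEvent lets you wait for a known number of queued
        // work items, emulating Thread.Join for pool threads.
        using (var countdown = new CountdownEvent(workItems))
        {
            for (int i = 0; i < workItems; i++)
            {
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    try { SomeAction(); }
                    finally { countdown.Signal(); } // always count down, even on failure
                });
            }
            countdown.Wait(); // blocks until all three items have signaled
        }
        Console.WriteLine("All work items finished.");
    }

    // Stand-in for the SomeActionDelegate from the question.
    static void SomeAction() => Thread.Sleep(100);
}
```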

It's always hard to say what the "correct" way is, but your approach does work correctly.
However, by using Join, the UI thread (if the main thread is a UI thread) will be blocked, freezing the UI. If that is not the case, your approach is basically fine, even though it has other problems (scalability, number of concurrent threads, etc.).

If you are (or will be) using .NET 4.0, try the Task Parallel Library.
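With the TPL, the Thread/Join loop from the question becomes a sketch like the following (SomeAction is a stand-in for SomeActionDelegate):

```csharp
using System;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        // Start three tasks and wait for all of them, the TPL
        // equivalent of the Thread/Join loops in the question.
        Task[] tasks = new Task[3];
        for (int i = 0; i < tasks.Length; i++)
        {
            tasks[i] = Task.Factory.StartNew(SomeAction);
        }
        Task.WaitAll(tasks); // blocks until every task has completed
        Console.WriteLine("All tasks finished.");
    }

    // Stand-in for the SomeActionDelegate from the question.
    static void SomeAction() { /* work */ }
}
```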

Related

Let threads wait for all tasks to complete before starting on the next set of tasks

I have a pipeline that consists of several stages. Jobs in the same stage can be worked on in parallel, but all jobs in stage 1 have to be completed before anyone can start working on jobs in stage 2, and so on.
I was thinking of synchronizing this work using a CountDownEvent.
My basis structure would be
this.WorkerCountdownEvent = new CountdownEvent(MaxJobsInStage);
this.WorkerCountdownEvent.Signal(MaxJobsInStage); // Starts all threads
// Each thread runs the following code
for (this.currentStage = 0; this.currentStage < this.PipelineStages.Count; this.currentStage++)
{
    this.WorkerCountdownEvent.Wait();
    var stage = this.PipelineStages[this.currentStage];
    if (threadIndex < stage.Systems.Count)
    {
        var system = stage.Systems[threadIndex];
        system.Process();
    }
    this.WorkerCountdownEvent.Signal(); // <--
}
This would work well for processing one stage. But the first thread that reaches this.WorkerCountdownEvent.Signal() will crash the application, as it's trying to decrement the count below zero.
Of course, if I want to prevent this and have the jobs wait again, I have to call this.WorkerCountdownEvent.Reset(). But I would have to call it after all threads have started working, yet before any thread is done with its work, which seems impossible.
Am I using the wrong synchronization primitive? Or should I use two countdown events? Or am I missing something completely?
(Btw, jobs will usually take less than a millisecond, so bonus points if someone has a better way to do this using 'slim' primitives like ManualResetEventSlim. ThreadPools or Task<> are not the direction I'm looking in, since these threads will live for a very long time (hours) and need to go through the pipeline 60 times per second, so the overhead of stopping/starting tasks is considerable here.)
Edit: this question was flagged as a duplicate of two questions. One of those questions was answered with "use thread.Join()" and the other with "use the TPL"; both answers are (in my opinion) clearly not answers to a question about pipelining and threading primitives such as CountdownEvent.
I think that the most suitable synchronization primitive for this case is the Barrier.
Enables multiple tasks to cooperatively work on an algorithm in parallel through multiple phases.
Usage example:
private Barrier _barrier = new Barrier(WorkersCount);

// Each worker thread runs the following code
for (int i = 0; i < this.StagesCount; i++)
{
    // Here goes the work of a single worker for a single stage...
    _barrier.SignalAndWait();
}
Update: In case you want the workers to wait for the signal asynchronously, there is an AsyncBarrier implementation here.
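To make the idea concrete, here is a minimal runnable sketch of the barrier-per-stage pattern; the worker and stage counts are placeholders for the WorkersCount/StagesCount values in the snippet above:

```csharp
using System;
using System.Threading;

class Pipeline
{
    const int Workers = 4;
    const int Stages = 3;

    static readonly Barrier _barrier = new Barrier(Workers,
        // Optional post-phase action: runs exactly once after every
        // worker has signaled, before any of them is released.
        b => Console.WriteLine($"Stage {b.CurrentPhaseNumber} done"));

    static void Main()
    {
        var threads = new Thread[Workers];
        for (int w = 0; w < Workers; w++)
        {
            threads[w] = new Thread(() =>
            {
                for (int stage = 0; stage < Stages; stage++)
                {
                    // Here goes the work of one worker for one stage...
                    _barrier.SignalAndWait(); // no worker enters the next stage early
                }
            });
            threads[w].Start();
        }
        foreach (var t in threads) t.Join();
    }
}
```

Unlike CountdownEvent, the Barrier resets itself automatically between phases, which is exactly the "reset after everyone started but before anyone finished" behavior the question found impossible to schedule by hand.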

ThreadPool or Task.Factory

I have a Windows service that can receive requests, and I want to handle each request in a separate thread.
I also want to limit the number of threads, i.e. a maximum of 5 threads.
And I want to wait for all threads before closing the application.
What is the best way to do that?
What I tried:
for (int i = 0; i < 10; i++)
{
    var i1 = i;
    Task.Factory.StartNew(() => RequestHandle(i1.ToString())).ContinueWith(t => Console.WriteLine("Done"));
}
Task.WaitAll();//Not waiting actually for all threads, why?
In this way, can I limit the number of threads?
Or
var events = new ManualResetEvent[10];
ThreadPool.SetMaxThreads(5, 5);
for (int i = 0; i < 10; i++)
{
    var i1 = i;
    events[i1] = new ManualResetEvent(false);
    ThreadPool.QueueUserWorkItem(x =>
    {
        Test(i1.ToString());
        events[i1].Set();
    });
}
WaitHandle.WaitAll(events);
Is there another way to implement that?
Both approaches should work, but neither guarantees that each operation will run on a different thread (and that is a good thing). Controlling the maximum number of threads is another matter...
You can set the maximum number of threads of the ThreadPool with SetMaxThreads. Remember that this change is global; you only have one ThreadPool.
The default TaskScheduler will use the thread pool (except in some particular situations where it can run the tasks inline on the same thread that is calling them), so changing the parameters of the ThreadPool will also affect the Tasks.
Now, notice I said the default TaskScheduler. You can roll your own, which will give you more control over how the tasks run, but creating a custom TaskScheduler is not advised (unless you really need it and you know what you are doing).
For the embedded question:
Task.WaitAll();//Not waiting actually for all threads, why?
To correctly call Task.WaitAll you need to pass the tasks you want to wait for as parameters. There is no way to wait for all currently existing tasks implicitly, you need to tell it what tasks you want to wait for.
Task.WaitAll();//Not waiting actually for all threads, why?
you have to do the following:
List<Task> tasks = new List<Task>();
for (int i = 0; i < 10; i++)
{
    var i1 = i;
    tasks.Add(Task.Factory.StartNew(() => RequestHandle(i1.ToString())).ContinueWith(t => Console.WriteLine("Done")));
}
Task.WaitAll(tasks.ToArray());
I think that you should understand the difference between a Task and a thread.
Tasks are not threads; a Task is just a promise of a result in the future, and your code can execute on only one thread even if you have many tasks scheduled.
A Thread is a low-level concept: if you start a thread, you know that it will be a separate thread.
I think your first implementation is good enough to be sure that all your code is executed, but you should use Task.Run instead of Task.Factory.StartNew.
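Putting those two fixes together (Task.Run, and passing the tasks to Task.WaitAll), a sketch of the first approach might look like this; RequestHandle is a stand-in for the method from the question:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        // Task.Run queues work to the thread pool with safe defaults
        // (default scheduler, child tasks are not attached).
        Task[] tasks = Enumerable.Range(0, 10)
            .Select(i => Task.Run(() => RequestHandle(i.ToString()))
                             .ContinueWith(t => Console.WriteLine("Done")))
            .ToArray();

        // WaitAll now has the tasks to wait for; waiting on the
        // continuations also waits for the antecedent work.
        Task.WaitAll(tasks);
    }

    // Stand-in for the RequestHandle method from the question.
    static void RequestHandle(string id) => Console.WriteLine($"Handling {id}");
}
```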

Task.Factory.StartNew isn't running how I expected

I've been trying to determine the best way of running code concurrently/in parallel with the rest of my code, probably using a thread. From what I've read, using the Thread type is a no-no in modern C#. Initially I thought of Parallel.Invoke(), but that turns out to be a blocking call until all the inner work is complete.
In my application, I don't need to wait for anything to complete, I don't care about getting a result, I need code that is completely independent of the current thread. Basically a "fire and forget" idea.
From what I thought I understand, Task.Factory.StartNew() is the correct way of running a piece of code concurrently/in parallel with the currently running code.
Based on that, I thought the following code would randomly print out "ABABBABABAA".
void Main()
{
    Task.Factory.StartNew(() =>
    {
        for (int i = 0; i < 10; i++)
        {
            Console.Write("A");
        }
    });
    for (int i = 0; i < 10; i++)
    {
        Console.Write("B");
    }
}
However, it:
Prints out "BBBBBBBBBBAAAAAAAAAA"
If I swap the Task.Factory.StartNew block with the for loop and vice versa, the same sequence is printed out, which seems bizarre.
So this leads me to think that Task.Factory.StartNew() is never actually scheduling work on another thread, almost as if calling StartNew were a blocking call.
Given the requirement that I don't need to get any results or wait/await, would it be easier for me to simply create a new Thread and run my code there? The only problem that I have with this is that using a Thread seems to be the opposite of modern best practices and the book "Concurrency in C# 6.0" states:
As soon as you type new Thread(), it’s over; your project already has legacy code
Actually, Task.Factory.StartNew does run on a separate thread; the only reason it loses the race every time is task creation time. Here is code to prove it:
static void Main(string[] args)
{
    Task.Factory.StartNew(() =>
    {
        for (int i = 0; i < 10; i++)
        {
            Console.Write("A");
            Thread.Sleep(1);
        }
    });
    for (int i = 0; i < 10; i++)
    {
        Console.Write("B");
        Thread.Sleep(1);
    }
    Console.ReadKey();
}
In my application, I don't need to wait for anything to complete, I don't care about getting a result, I need code that is completely independent of the current thread. Basically a "fire and forget" idea.
Are you sure? Fire-and-forget really means "I'm perfectly OK with exceptions being ignored." Since this is not what most people want, one common approach is to queue the work and save a Task representing that work, and join with it at the "end" of whatever your program does, just to make sure the not-quite-fire-and-forget work does complete successfully.
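One way to sketch that "queue the work, keep the Task, join at the end" idea (the Worker class and method names here are hypothetical, invented for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class Worker
{
    // Keep the "fire and forget" tasks so their exceptions are observed.
    private readonly List<Task> _pending = new List<Task>();

    public void QueueBackgroundWork(Action work)
    {
        _pending.Add(Task.Run(work));
    }

    public void Shutdown()
    {
        try
        {
            // Join with the background work at the "end" of the program;
            // any exception thrown in a task surfaces here instead of
            // being silently swallowed.
            Task.WaitAll(_pending.ToArray());
        }
        catch (AggregateException ex)
        {
            foreach (var inner in ex.InnerExceptions)
                Console.WriteLine($"Background work failed: {inner.Message}");
        }
    }
}
```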
I've been trying to determine the best way of running code concurrently/in parallel with the rest of my code, probably using a thread. From what I've read, using Thread type is a no-no in modern C#. Initially I thought Parallel.Invoke(), but that turns out to be a blocking call till all the inner work is complete.
True, parallel code will block the calling thread. This can be avoided by wrapping the parallel call in a Task.Run (illustrated by Recipe 7.4 in my book).
In your particular case (i.e., in a fire-and-forget scenario), you can just drop the await and have a bare Task.Run. Though as I mention above, it's probably better to just stash the Task away someplace and await it later.
From what I thought I understand, Task.Factory.StartNew() is the correct way of running a piece of code concurrently/in parallel with the currently running code.
No, StartNew is dangerous and should only be used as a last resort. The proper technique is Task.Run, and if you have truly parallel work to do (i.e., many chunks of CPU-bound code), then a Task.Run wrapper around Parallel/PLINQ would be best.
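A minimal sketch of that Task.Run-around-Parallel shape (the method name and doubling "work" are placeholders, not from the original):

```csharp
using System;
using System.Threading.Tasks;

class Example
{
    // Wrapping the blocking Parallel.For in Task.Run keeps the calling
    // thread (e.g. a UI thread) free while the CPU-bound chunks run
    // on the thread pool.
    public static Task ProcessAllAsync(int[] data)
    {
        return Task.Run(() =>
            Parallel.For(0, data.Length, i =>
            {
                data[i] = data[i] * 2; // placeholder CPU-bound work
            }));
    }
}
```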
Based on that, I thought the following code would randomly print out "ABABBABABAA".
As others have noted, this is just a race condition. It takes time to queue work to the thread pool, and computers can count to 10 really fast. (Writing output is a lot slower, but it's still too fast here). The same problem would occur with Task.Run (or a manual Thread).

C# Multi-Threading - Limiting the amount of concurrent threads

I have a question about controlling the number of concurrent threads I want running. Let me explain with what I currently do, for example:
var myItems = getItems(); // is just some generic list

// cycle through the items, picking 10 at a time
int index = 0;
int itemsToTake = myItems.Count >= 10 ? 10 : myItems.Count;

while (index < myItems.Count)
{
    var itemRange = myItems.GetRange(index, itemsToTake);
    AutoResetEvent[] handles = new AutoResetEvent[itemsToTake];
    for (int i = 0; i < itemRange.Count; i++)
    {
        var item = itemRange[i];
        handles[i] = new AutoResetEvent(false);
        // set up the thread
        ThreadPool.QueueUserWorkItem(processItems, new Item_Thread(handles[i], item));
    }
    // wait for all the threads to finish
    WaitHandle.WaitAll(handles);
    // update the index
    index += itemsToTake;
    // make sure that the next batch of items to get is within range
    itemsToTake = (itemsToTake + index < myItems.Count) ? itemsToTake : myItems.Count - index;
}
This is the path I currently take. However, I do not like it at all. I know I can 'manage' the thread pool itself, but I have heard it is not advisable to do so. So what is the alternative? The Semaphore class?
Thanks.
Instead of using ThreadPool directly, you might also consider using TPL or PLINQ. For example, with PLINQ you could do something like this:
getItems().AsParallel()
    .WithDegreeOfParallelism(numberOfThreadsYouWant)
    .ForAll(item => process(item));
or using Parallel:
var options = new ParallelOptions { MaxDegreeOfParallelism = numberOfThreadsYouWant };
Parallel.ForEach(getItems(), options, item => process(item));
Make sure that specifying the degree of parallelism does actually improve performance of your application. TPL and PLINQ use ThreadPool by default, which does a very good job of managing the number of threads that are running. In .NET 4, ThreadPool implements algorithms that add more processing threads only if that improves performance.
Don't use THE thread pool; get another one (just search Google, there are half a dozen implementations out there) and manage that yourself.
Managing THE thread pool is not advisable, as a lot of internal work may go through there; managing your OWN thread pool instance is totally OK.
It looks like you can control the maximum number of threads using ThreadPool.SetMaxThreads, although I haven't tested this.
Assuming the question is "How do I limit the number of worker threads?", the answer would be: use a producer-consumer queue where you control the number of worker threads. Just queue your items and let it handle the workers.
Here is a generic implementation you could use.
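A producer-consumer queue can be sketched with BlockingCollection (available since .NET 4); the worker count, item list, and Process method here are placeholders for the question's batch of items and processItems callback:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

class Program
{
    static void Main()
    {
        const int workerCount = 10; // the fixed number of worker threads
        using (var queue = new BlockingCollection<string>())
        {
            var workers = new Thread[workerCount];
            for (int i = 0; i < workerCount; i++)
            {
                workers[i] = new Thread(() =>
                {
                    // GetConsumingEnumerable blocks until an item is
                    // available and exits when the queue is completed.
                    foreach (var item in queue.GetConsumingEnumerable())
                        Process(item);
                });
                workers[i].Start();
            }

            foreach (var item in new[] { "a", "b", "c" })
                queue.Add(item);

            queue.CompleteAdding();              // no more work is coming
            foreach (var w in workers) w.Join(); // wait for the workers to drain
        }
    }

    // Stand-in for the per-item processing in the question.
    static void Process(string item) => Console.WriteLine($"Processed {item}");
}
```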
You can use the ThreadPool.SetMaxThreads method:
http://msdn.microsoft.com/en-us/library/system.threading.threadpool.setmaxthreads.aspx
In the documentation, there is a mention of SetMaxThreads ...
public static bool SetMaxThreads(
    int workerThreads,
    int completionPortThreads
)
Sets the number of requests to the thread pool that can be active concurrently. All requests above that number remain queued until thread pool threads become available.
However:
You cannot set the number of worker threads or the number of I/O completion threads to a number smaller than the number of processors in the computer.
But I guess you are better served anyway by using a non-singleton thread pool.
There is no reason to mix hybrid thread synchronization constructs (such as AutoResetEvent) with the ThreadPool.
You can use a class that acts as a coordinator responsible for executing all of your code asynchronously.
Wrap what "Item_Thread" does using a Task or the APM pattern, then use the AsyncCoordinator class by Jeffrey Richter (it can be found in the code from the book CLR via C#, 3rd Edition).

Threading Question - Best way to spawn multiple threads

I've created a method, let's call it MemoryStressTest(). I would like to call this from another method, let's call it InitiateMemoryStressTest(). I would like the InitiateMemoryStressTest() method to call multiple instances of MemoryStressTest() via different threads. The threads don't need to be aware of each other and will not be dependent on each other. I'm using .NET 4. What would be the best way to do this?
Keep it as simple as possible:
int threadCount = 100;
for (int i = 0; i < threadCount; i++)
{
    (new Thread(() => MemoryStressTest())).Start();
}
If you just want new threads - and don't want thread pool threads, tasks etc, then it's very straightforward:
for (int i = 0; i < numberOfThreads; i++)
{
    Thread t = new Thread(MemoryStressTest);
    t.Start();
    // Remember t if you need to wait for them all to finish etc.
}
(One benefit of this approach over using the thread pool is that you don't get the "smart" behaviour of .NET in terms of ramping up threads in the thread pool slowly etc. All very well for normal situations, but this is slightly different :)
How about using .NET 4.0 Parallel Framework's tasks instead of threads, and letting the system decide how many actual threads to use? This can be done with a parallel for, or other techniques.
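For example, a minimal parallel-for sketch (the iteration count is an arbitrary placeholder; MemoryStressTest is the method from the question):

```csharp
using System;
using System.Threading.Tasks;

class Program
{
    static void Main()
    {
        // Let the TPL decide how many threads actually run the
        // iterations; each call is independent, as in the question.
        Parallel.For(0, 100, i => MemoryStressTest());

        // Parallel.For blocks until every iteration has completed.
        Console.WriteLine("All iterations finished.");
    }

    // Stand-in for the MemoryStressTest method from the question.
    static void MemoryStressTest() { /* allocate / exercise memory here */ }
}
```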
