I'm writing a C# program which involves multithreading and synchronization between multiple threads. The threads all need perform some iterative work independently, and after some thread has finished a specified number of iterations, it must wait for the other threads to come up. After all of them have finished given number of iterations and obtained some intermediate result, they should do some synchronized work and then continue execution again, until another synchronization point is reached and so on.
Here's my attempt to achieve this (a thread should pause after just one iteration, then wait for the others):
int nThreads = Environment.ProcessorCount;
Thread[] threads = new Thread[nThreads];
ManualResetEvent[] manualResetEvents = new ManualResetEvent[nThreads];
for (int i = 0; i < nThreads; i++)
{
manualResetEvents[i] = new ManualResetEvent(false);
}
int nSteps = 5;
Random rnd = new Random();
for (int i = 0; i < nThreads; i++)
{
int idx = i;
threads[i] = new Thread(delegate ()
{
int cStep = nSteps;
while (cStep > 0)
{
manualResetEvents[idx].Reset();
Console.Write("\nThread {0} working... cStep = {1}\n", idx, cStep);
Thread.Sleep(rnd.Next(1000));
manualResetEvents[idx].Set();
Console.WriteLine("\nThread {0} work done. Waiting Others...cStep = {1}\n", idx, cStep);
WaitHandle.WaitAll(manualResetEvents);
cStep--;
}
});
}
for (int i = 0; i < nThreads; i++)
{
threads[i].Start();
}
for (int i = 0; i < nThreads; i++)
{
threads[i].Join();
}
But the code above appears not working, for not any thread waits for all of the other threads to perform one iteration for some reason. I think I misunderstand the purpose of ManualResetEvent or use it the wrong way, what could you suggest?
Your code is prone to race conditions. After all threads have completed the first iteration, all events are still set; if a single thread then runs through the loop before the others get to reset their events, it'll see the other events still set, and stop waiting early.
There's plenty of ways to resolve this bug, but the best solution for you is System.Threading.Barrier (also see Threading objects and features with section about Barrier). It's explicitly designed for this situation, where you want multiple threads to work in parallel through a multi-step algorithm.
Related
I wrote a simple code. The machine has 32 threads. At the twentieth second, I see the number 54 in the console. This means that 54 tasks have started. Each task uses thread suspension. I don't understand why tasks continue to run if tasks have already been created and started in all possible threads and the thread suspension code is running in each task.
What's going on, how does it work?
void MyMethod(int i)
{
Console.WriteLine(i);
Thread.Sleep(int.MaxValue);
}
Console.WriteLine(Environment.ProcessorCount);
for (int i = 0; i < int.MaxValue; i++)
{
Thread.Sleep(50);
int j = i;
Task.Run(() => MyMethod(j));
}
And why does this code create so many tasks? (Environment.ProcessorCount => 32)
using System.Net;
void MyMethod(int i)
{
Console.WriteLine(WebRequest.Create("https://192.168.1.1").GetResponse().ContentLength);
}
for (int i = 0; i < Environment.ProcessorCount; i++)
{
int j = i;
Task.Run(() => MyMethod(j));
}
Thread.Sleep(int.MaxValue);
The Task.Run method runs the code on the ThreadPool, and the ThreadPool creates more threads when it becomes saturated. Initially it creates immediately on demand as many threads as the number of cores. After that point it is said to be saturated, and creates one new thread every second until the demand is satisfied. This rate is not documented. It is found by experimentation on .NET 6, and might change in future .NET versions.
You are able to control the saturation threshold with the ThreadPool.SetMinThreads method. For example ThreadPool.SetMinThreads(100, 100). If you give it too large values, this method does nothing and returns false.
I'm trying to come up with the best solution for going through a list of strings and performing a POST request with each one of them.
My previous attempt was to make a Queue<String> of the strings 200 or more threads and each thread had a task to Dequeue a string from the list and perform the task, which performed worse than I expected.
What I'm doing wrong here?
My code:
class Checker
{
public Queue<string> pins;
public Checker()
{
pins = GetPins();
StartThreads(1000);
}
public void StartThreads(int threadsCount)
{
Console.WriteLine("Starting Threads");
for (int n = 0; n < 200; n++)
{
var thread = new Thread(Printer);
thread.Name = String.Format("Thread Number ({0})", n);
thread.Start();
}
}
public Queue<string> GetPins()
{
Queue<string> numbers = new Queue<string>();
for (int n = 0; n < 100000; n++)
{
numbers.Enqueue(n.ToString().PadLeft(5, '0'));
}
Console.WriteLine("Got Pins");
return numbers;
}
void Printer()
{
while (pins.Count > 0)
{
var num = pins.Dequeue();
Console.WriteLine();
Console.WriteLine(String.Format("{0} - {1}", num, Thread.CurrentThread.Name));
}
}
}
As you can see I generate 100.000 5 digits long pins and perform a task (output them through console) and assuming i have 1000 threads, it has to be incredibly fast, which is not.
Please tell me what I'm doing wrong and anything I can improve. Thank you!
Queue is not a thread-safe collection. See ConcurrentQueue
Your StartTheads() method does not use the threadsCount arg and therefore only starting 200 threads.
You also need to be careful with threads. Consider using Tasks or ThreadPool instead. IIRC, This will let your app decide how many threads it needs depending on the task count.
I have a c# server and it has a packet handler that handles packets of clients like this:
for (int i = 0; i < packets.Count; i++)
{
new Thread(new HandlePacket(packets[i].bytes, packets[i].count).handle).Start();
}
Usually the handle packet have lots of code to execute (depands on packet id, sometimes even nested loops), the threads are to prevent hanging (or long execution time) the packet handler because one packet handle can sometimes do nested loop like
for (int i = 0; i < 32; i++)
{
for (int j = 0; j < 8; j++)
{
for (int k = 0; k < 32; k++)
{
}
}
}
Is it efficient to do such thing?
If not, what are the alternatives to prevent the hanging of packet handler thread?
The server is for a game, long execution (or hang) time is not acceptable.
Additionally every server have 32 packet handlers (one reserved for each client)
In order to enable data parallelism on the collection, you could make use of the Parallel.ForEach().
Therefore, your code will look something like this:
Parallel.ForEach(packets, (item) =>
{
new HandlePacket(item.bytes, item.count).handle();
});
If you use the Task class you allow the runtime to determine how to use Threads to run your code in parallel.
You would do it like this:
for (int i = 0; i < packets.Count; i++)
{
Task.Run(() => new HandlePacket(packets[i].bytes, packets[i].count).handle).Start());
}
There are some other things you can configure like it, but this will be
much better than starting a thread for each package.
By using the below code firstly some of the calls are not getting made lets say out of 250 , 238 calls are made and rest doesn't.Secondly I am not sure if the calls are made at the rate of 20 calls per 10 seconds.
public List<ShowData> GetAllShowAndTheirCast()
{
ShowResponse allShows = GetAllShows();
ShowCasts showCast = new ShowCasts();
showCast.showCastList = new List<ShowData>();
using (Semaphore pool = new Semaphore(20, 20))
{
for (int i = 0; i < allShows.Shows.Length; i++)
{
pool.WaitOne();
Thread t = new Thread(new ParameterizedThreadStart((taskId) =>
{
showCast.showCastList.Add(MapResponse(allShows.Shows[i]));
}));
pool.Release();
t.Start(i);
}
}
//for (int i = 0; i < allShows.Shows.Length; i++)
//{
// showCast.showCastList.Add(MapResponse(allShows.Shows[i]));
//}
return showCast.showCastList;
}
public ShowData MapResponse(Show s)
{
CastResponse castres = new CastResponse();
castres.CastlistResponse = (GetShowCast(s.id)).CastlistResponse;
ShowData sd = new ShowData();
sd.id = s.id;
sd.name = s.name;
if (castres.CastlistResponse != null && castres.CastlistResponse.Any())
{
sd.cast = new List<CastData>();
foreach (var item in castres.CastlistResponse)
{
CastData cd = new CastData();
cd.birthday = item.person.birthday;
cd.id = item.person.id;
cd.name = item.person.name;
sd.cast.Add(cd);
}
}
return sd;
}
public ShowResponse GetAllShows()
{
ShowResponse response = new ShowResponse();
string showUrl = ClientAPIUtils.apiUrl + "shows";
response.Shows = JsonConvert.DeserializeObject<Show[]>(ClientAPIUtils.GetDataFromUrl(showUrl));
return response;
}
public CastResponse GetShowCast(int showid)
{
CastResponse res = new CastResponse();
string castUrl = ClientAPIUtils.apiUrl + "shows/" + showid + "/cast";
res.CastlistResponse = JsonConvert.DeserializeObject<List<Cast>>(ClientAPIUtils.GetDataFromUrl(castUrl));
return res;
}
All the Calls should be made , but I am not sure where they are getting aborted and even please let me know how to check the rate of calls being made.
I'm assuming that your goal is to process all data about shows but no more than 20 at once.
For that kind of task you should probably use ThreadPool and limit maximum number of concurrent threads using SetMaxThreads.
https://learn.microsoft.com/en-us/dotnet/api/system.threading.threadpool?view=netframework-4.7.2
You have to make sure that collection that you are using to store your results is thread-safe.
showCast.showCastList = new List<ShowData>();
I don't think that standard List is thread-safe. Thread-safe collection is ConcurrentBag (there are others as well). You can make standard list thread-safe but it requires more code. After you are done processing and need to have results in list or array you can convert collection to desired type.
https://learn.microsoft.com/en-us/dotnet/api/system.collections.concurrent.concurrentbag-1?view=netframework-4.7.2
Now to usage of semaphore. What your semaphore is doing is ensuring that maximum 20 threads can be created at once. Assuming that this loop runs in your app main thread your semaphore has no purpose. To make it work you need to release semaphore once thread is completed; but you are calling thread Start() after calling Release(). That results in thread being executed outside "critical area".
using (Semaphore pool = new Semaphore(20, 20)) {
for (int i = 0; i < allShows.Shows.Length; i++) {
pool.WaitOne();
Thread t = new Thread(new ParameterizedThreadStart((taskId) =>
{
showCast.showCastList.Add(MapResponse(allShows.Shows[i]));
pool.Release();
}));
t.Start(i);
}
}
I did not test this solution; additional problems might arise.
Another issue with this program is that it does not wait for all threads to complete. Once all threads are started; program will end. It is possible (and in your case I'm sure) that not all threads completed its operation; this is why ~240 data packets are done when program finishes.
thread.Join();
But if called right after Start() it will stop main thread until it is completed so to keep program concurrent you need to create a list of threads and Join() them at the end of program. It is not the best solution. How to wait on all threads that program can add to ThreadPool
Wait until all threads finished their work in ThreadPool
As final note you cannot access loop counter like that. Final value of loop counter is evaluated later and with test I ran; code has tendency to process odd records twice and skip even. This is happening because loop increases counter before previous thread is executed and causes to access elements outside bounds of array.
Possible solution to that is to create method that will create thread. Having it in separate method will evaluate allShows.Shows[i] to show before next loop pass.
public void CreateAndStartThread(Show show, Semaphore pool, ShowCasts showCast)
{
pool.WaitOne();
Thread t = new Thread(new ParameterizedThreadStart((s) => {
showCast.showCastList.Add(MapResponse((Show)s));
pool.Release();
}));
t.Start(show);
}
Concurrent programming is tricky and I would highly recommend to do some exercises with examples on common pitfalls. Books on C# programming are sure to have a chapter or two on the topic. There are plenty of online courses and tutorials on this topic to learn from.
Edit:
Working solution. Still might have some issues.
public ShowCasts GetAllShowAndTheirCast()
{
ShowResponse allShows = GetAllShows();
ConcurrentBag<ShowData> result = new ConcurrentBag<ShowData>();
using (var countdownEvent = new CountdownEvent(allShows.Shows.Length))
{
using (Semaphore pool = new Semaphore(20, 20))
{
for (int i = 0; i < allShows.Shows.Length; i++)
{
CreateAndStartThread(allShows.Shows[i], pool, result, countdownEvent);
}
countdownEvent.Wait();
}
}
return new ShowCasts() { showCastList = result.ToList() };
}
public void CreateAndStartThread(Show show, Semaphore pool, ConcurrentBag<ShowData> result, CountdownEvent countdownEvent)
{
pool.WaitOne();
Thread t = new Thread(new ParameterizedThreadStart((s) =>
{
result.Add(MapResponse((Show)s));
pool.Release();
countdownEvent.Signal();
}));
t.Start(show);
}
I create dynamic threads in C# and I need to get the status of those running threads.
List<string>[] list;
list = dbConnect.Select();
for (int i = 0; i < list[0].Count; i++)
{
Thread th = new Thread(() =>{
sendMessage(list[0]['1']);
//calling callback function
});
th.Name = "SID"+i;
th.Start();
}
for (int i = 0; i < list[0].Count; i++)
{
// here how can i get list of running thread here.
}
How can you get list of running threads?
On Threads
I would avoid explicitly creating threads on your own.
It is much more preferable to use the ThreadPool.QueueUserWorkItem or if you do can use .Net 4.0 you get the much more powerful Task parallel library which also allows you to use a ThreadPool threads in a much more powerful way (Task.Factory.StartNew is worth a look)
What if we choose to go by the approach of explicitly creating threads?
Let's suppose that your list[0].Count returns 1000 items. Let's also assume that you are performing this on a high-end (at the time of this writing) 16core machine. The immediate effect is that we have 1000 threads competing for these limited resources (the 16 cores).
The larger the number of tasks and the longer each of them runs, the more time will be spent in context switching. In addition, creating threads is expensive, this overhead creating each thread explicitly could be avoided if an approach of reusing existing threads is used.
So while the initial intent of multithreading may be to increase speed, as we can see it can have quite the opposite effect.
How do we overcome 'over'-threading?
This is where the ThreadPool comes into play.
A thread pool is a collection of threads that can be used to perform a number of tasks in the background.
How do they work:
Once a thread in the pool completes its task, it is returned to a queue of waiting threads, where it can be reused. This reuse enables applications to avoid the cost of creating a new thread for each task.
Thread pools typically have a maximum number of threads. If all the threads are busy, additional tasks are placed in queue until they can be serviced as threads become available.
So we can see that by using a thread pool threads we are more efficient both
in terms of maximizing the actual work getting done. Since we are not over saturating the processors with threads, less time is spent switching between threads and more time actually executing the code that a thread is supposed to do.
Faster thread startup: Each threadpool thread is readily available as opposed to waiting until a new thread gets constructed.
in terms of minimising memory consumption, the threadpool will limit the number of threads to the threadpool size enqueuing any requests that are beyond the threadpool size limit. (see ThreadPool.GetMaxThreads). The primary reason behind this design choice, is of course so that we don't over-saturate the limited number of cores with too many thread requests keeping context switching to lower levels.
Too much Theory, let's put all this theory to the test!
Right, it's nice to know all this in theory, but let's put it to practice and see what
the numbers tell us, with a simplified crude version of the application that can give us a coarse indication of the difference in orders of magnitude. We will do a comparison between new Thread, ThreadPool and Task Parallel Library (TPL)
new Thread
static void Main(string[] args)
{
int itemCount = 1000;
Stopwatch stopwatch = new Stopwatch();
long initialMemoryFootPrint = GC.GetTotalMemory(true);
stopwatch.Start();
for (int i = 0; i < itemCount; i++)
{
int iCopy = i; // You should not use 'i' directly in the thread start as it creates a closure over a changing value which is not thread safe. You should create a copy that will be used for that specific variable.
Thread thread = new Thread(() =>
{
// lets simulate something that takes a while
int k = 0;
while (true)
{
if (k++ > 100000)
break;
}
if ((iCopy + 1) % 200 == 0) // By the way, what does your sendMessage(list[0]['1']); mean? what is this '1'? if it is i you are not thread safe.
Console.WriteLine(iCopy + " - Time elapsed: (ms)" + stopwatch.ElapsedMilliseconds);
});
thread.Name = "SID" + iCopy; // you can also use i here.
thread.Start();
}
Console.ReadKey();
Console.WriteLine(GC.GetTotalMemory(false) - initialMemoryFootPrint);
Console.ReadKey();
}
Result:
ThreadPool.EnqueueUserWorkItem
static void Main(string[] args)
{
int itemCount = 1000;
Stopwatch stopwatch = new Stopwatch();
long initialMemoryFootPrint = GC.GetTotalMemory(true);
stopwatch.Start();
for (int i = 0; i < itemCount; i++)
{
int iCopy = i; // You should not use 'i' directly in the thread start as it creates a closure over a changing value which is not thread safe. You should create a copy that will be used for that specific variable.
ThreadPool.QueueUserWorkItem((w) =>
{
// lets simulate something that takes a while
int k = 0;
while (true)
{
if (k++ > 100000)
break;
}
if ((iCopy + 1) % 200 == 0)
Console.WriteLine(iCopy + " - Time elapsed: (ms)" + stopwatch.ElapsedMilliseconds);
});
}
Console.ReadKey();
Console.WriteLine("Memory usage: " + (GC.GetTotalMemory(false) - initialMemoryFootPrint));
Console.ReadKey();
}
Result:
Task Parallel Library (TPL)
static void Main(string[] args)
{
int itemCount = 1000;
Stopwatch stopwatch = new Stopwatch();
long initialMemoryFootPrint = GC.GetTotalMemory(true);
stopwatch.Start();
for (int i = 0; i < itemCount; i++)
{
int iCopy = i; // You should not use 'i' directly in the thread start as it creates a closure over a changing value which is not thread safe. You should create a copy that will be used for that specific variable.
Task.Factory.StartNew(() =>
{
// lets simulate something that takes a while
int k = 0;
while (true)
{
if (k++ > 100000)
break;
}
if ((iCopy + 1) % 200 == 0) // By the way, what does your sendMessage(list[0]['1']); mean? what is this '1'? if it is i you are not thread safe.
Console.WriteLine(iCopy + " - Time elapsed: (ms)" + stopwatch.ElapsedMilliseconds);
});
}
Console.ReadKey();
Console.WriteLine("Memory usage: " + (GC.GetTotalMemory(false) - initialMemoryFootPrint));
Console.ReadKey();
}
Result:
So we can see that:
+--------+------------+------------+--------+
| | new Thread | ThreadPool | TPL |
+--------+------------+------------+--------+
| Time | 6749 | 228ms | 222ms |
| Memory | ≈300kb | ≈103kb | ≈123kb |
+--------+------------+------------+--------+
The above falls nicely inline to what we anticipated in theory. High memory for new Thread as well as slower overall performance when compared to ThreadPool. ThreadPool and TPL have equivalent performance with TPL having a slightly higher memory footprint than a pure thread pool but it's probably a price worth paying given the added flexibility Tasks provide (such as cancellation, waiting for completion querying status of task)
At this point, we have proven that using ThreadPool threads is the preferable option in terms of speed and memory.
Still, we have not answered your question. How to track the state of the threads running.
To answer your question
Given the insights we have gathered, this is how I would approach it:
List<string>[] list = listdbConnect.Select()
int itemCount = list[0].Count;
Task[] tasks = new Task[itemCount];
stopwatch.Start();
for (int i = 0; i < itemCount; i++)
{
tasks[i] = Task.Factory.StartNew(() =>
{
// NOTE: Do not use i in here as it is not thread safe to do so!
sendMessage(list[0]['1']);
//calling callback function
});
}
// if required you can wait for all tasks to complete
Task.WaitAll(tasks);
// or for any task you can check its state with properties such as:
tasks[1].IsCanceled
tasks[1].IsCompleted
tasks[1].IsFaulted
tasks[1].Status
As a final note, you can not use the variable i in your Thread.Start, since it would create a closure over a changing variable which would effectively be shared amongst all Threads. To get around this (assuming you need to access i), simply create a copy of the variable and pass the copy in, this would make one closure per thread which would make it thread safe.
Good luck!
Use Process.Threads:
var currentProcess = Process.GetCurrentProcess();
var threads = currentProcess.Threads;
Note: any threads owned by the current process will show up here, including those not explicitly created by you.
If you only want the threads that you created, well, why don't you just keep track of them when you create them?
Create a List<Thread> and store each new thread in your first for loop in it.
List<string>[] list;
List<Thread> threads = new List<Thread>();
list = dbConnect.Select();
for (int i = 0; i < list[0].Count; i++)
{
Thread th = new Thread(() =>{
sendMessage(list[0]['1']);
//calling callback function
});
th.Name = "SID"+i;
th.Start();
threads.add(th)
}
for (int i = 0; i < list[0].Count; i++)
{
threads[i].DoStuff()
}
However if you don't need i you can make the second loop a foreach instead of a for
As a side note, if your sendMessage function does not take very long to execute you should somthing lighter weight then a full Thread, use a ThreadPool.QueueUserWorkItem or if it is available to you, a Task
Process.GetCurrentProcess().Threads
This gives you a list of all threads running in the current process, but beware that there are threads other than those you started yourself.
Use Process.Threads to iterate through your threads.