Difference between delegate.BeginInvoke and using ThreadPool threads in C# - c#

In C# is there any difference between using a delegate to do some work asynchronously (calling BeginInvoke()) and using a ThreadPool thread as shown below
public void asynchronousWork(object num)
{
//asynchronous work to be done
Console.WriteLine(num);
}
public void test()
{
Action<object> myCustomDelegate = this.asynchronousWork;
int x = 7;
//Using Delegate
myCustomDelegate.BeginInvoke(7, null, null);
//Using Threadpool
ThreadPool.QueueUserWorkItem(new WaitCallback(asynchronousWork), 7);
Thread.Sleep(2000);
}
Edit:
BeginInvoke makes sure that a thread from the thread pool is used to execute the asynchronous code , so is there any difference?

Joe Duffy, in his Concurrent Programming on Windows book (page 418), says this about Delegate.BeginInvoke:
All delegate types, by convention offer a BeginInvoke and EndInvoke method alongside the ordinary synchronous Invoke method. While this is a nice programming model feature, you should stay away from them wherever possible. The implementation uses remoting infrastructure which imposes a sizable overhead to asynchronous invocation. Queue work to the thread pool directly is often a better approach, though that means you have to co-ordinate the rendezvous logic yourself.
EDIT: I created the following simple test of the relative overheads:
int counter = 0;
int iterations = 1000000;
Action d = () => { Interlocked.Increment(ref counter); };
var stopwatch = new System.Diagnostics.Stopwatch();
stopwatch.Start();
for (int i = 0; i < iterations; i++)
{
var asyncResult = d.BeginInvoke(null, null);
}
do { } while(counter < iterations);
stopwatch.Stop();
Console.WriteLine("Took {0}ms", stopwatch.ElapsedMilliseconds);
Console.ReadLine();
On my machine the above test runs in around 20 seconds. Replacing the BeginInvoke call with
System.Threading.ThreadPool.QueueUserWorkItem(state =>
{
Interlocked.Increment(ref counter);
});
changes the running time to 864ms.

Related

Why a simple await Task.Delay(1) enables parallel execution?

I would like to ask simple question about code bellow:
static void Main(string[] args)
{
MainAsync()
//.Wait();
.GetAwaiter().GetResult();
}
static async Task MainAsync()
{
Console.WriteLine("Hello World!");
Task<int> a = Calc(18000);
Task<int> b = Calc(18000);
Task<int> c = Calc(18000);
await a;
await b;
await c;
Console.WriteLine(a.Result);
}
static async Task<int> Calc(int a)
{
//await Task.Delay(1);
Console.WriteLine("Calc started");
int result = 0;
for (int k = 0; k < a; ++k)
{
for (int l = 0; l < a; ++l)
{
result += l;
}
}
return result;
}
This example runs Calc functions in synchronous way. When the line //await Task.Delay(1); will be uncommented, the Calc functions will be executed in a parallel way.
The question is: Why, by adding simple await, the Calc function is then async?
I know about async/await pair requirements. I'm asking about what it's really happening when simple await Delay is added at the beginning of a function. Whole Calc function is then recognized to be run in another thread, but why?
Edit 1:
When I added a thread checking to code:
static async Task<int> Calc(int a)
{
await Task.Delay(1);
Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
int result = 0;
for (int k = 0; k < a; ++k)
{
for (int l = 0; l < a; ++l)
{
result += l;
}
}
return result;
}
it is possible to see (in console) different thread id's. If await Delay line is deleted, the thread id is always the same for all runs of Calc function. In my opinion it proves that code after await is (can be) runned in different threads. And it is the reason why code is faster (in my opinion of course).
It's important to understand how async methods work.
First, they start running synchronously, on the same thread, just like every other method. Without that await Task.Delay(1) line, the compiler will have warned you that the method would be completely synchronous. The async keyword doesn't, by itself, make your method asynchronous. It just enables the use of await.
The magic happens at the first await that acts on an incomplete Task. At that point the method returns. It returns a Task that you can use to check when the rest of the method has completed.
So when you have await Task.Delay(1) there, the method returns at that line, allowing your MainAsync method to move to the next line and start the next call to Calc.
How the continuation of Calc runs (everything after await Task.Delay(1)) depends on if there is a "synchronization context". In ASP.NET (not Core) or a UI application, for example, the synchronization context controls how the continuations run and they would run one after the other. In a UI app, it would be on the same thread it started from. In ASP.NET, it may be a different thread, but still one after the other. So in either case, you would not see any parallelism.
However, because this is a console app, which does not have a synchronization context, the continuations happen on any ThreadPool thread as soon as the Task from Task.Delay(1) completes. That means each continuation can happen in parallel.
Also worth noting: starting with C# 7.1 you can make your Main method async, eliminating the need for your MainAsync method:
static async Task Main(string[] args)
An async function returns the incomplete task to the caller at its first incomplete await. After that the await on the calling side will await that task to become complete.
Without the await Task.Delay(1), Calc() does not have any awaits of its own, so will only return to the caller when it runs to the end. At this point the returned Task is already complete, so the await on the calling site immediately uses the result without actually invoking the async machinery.
layman's version....
nothing in the process is yielding CPU time back without 'delay' and so it doesn't give anything else CPU time, you are confusing this with multiple threaded code. "async and await" is not about multiple threaded but about using the CPU (thread/threads) when its doing non CPU work" aka writing to disk. Writing to disk does not need the thread (CPU). So when something is async, it can free the thread and be used for something else instead of waiting for non CPU (oi task) to complete.
#sunside is saying the same thing just more technically.
static async Task<int> Calc(int a)
{
//faking a asynchronous .... this will give this thread to something else
// until done then return here...
// does not make sense... as your making this take longer for no gain.
await Task.Delay(1);
Console.WriteLine("Calc started");
int result = 0;
for (int k = 0; k < a; ++k)
{
for (int l = 0; l < a; ++l)
{
result += l;
}
}
return result;
}
vs
static async Task<int> Calc(int a)
{
using (var reader = File.OpenText("Words.txt"))
{
//real asynchronous .... this will give this thread to something else
var fileText = await reader.ReadToEndAsync();
// Do something with fileText...
}
Console.WriteLine("Calc started");
int result = 0;
for (int k = 0; k < a; ++k)
{
for (int l = 0; l < a; ++l)
{
result += l;
}
}
return result;
}
the reason it looks like its in "parallel way" is that its just give the others tasks CPU time.
example; aka without delay
await a; do this (no actual aysnc work)
await b; do this (no actual aysnc work)
await c; do this (no actual aysnc work)
example 2;aka with delay
await a; start this then pause this (fake async), start b but come back and finish a
await b; start this then pause this (fake async), start c but come back and finish b
await c; start this then pause this (fake async), come back and finish c
what you should find is although more is started sooner, the overall time will be longer as it as to jump between tasks for no benefit with a faked asynchronous task. where as, if the await Task.Delay(1) was a real async function aka asynchronous in nature then the benefit would be it can start the other work using the thread which would of been blocked... while it waits for something which does not require the thread.
update silly code to show its slower... Make sure you are in "Release" mode you should always ignore the first run... these test are silly and you would need to use https://github.com/dotnet/BenchmarkDotNet to really see the diff
static void Main(string[] args)
{
Console.WriteLine("Exmaple1 - no Delay, expecting it to be faster, shorter times on average");
for (int i = 0; i < 10; i++)
{
Exmaple1().GetAwaiter().GetResult();
}
Console.WriteLine("Exmaple2- with Delay, expecting it to be slower,longer times on average");
for (int i = 0; i < 10; i++)
{
Exmaple2().GetAwaiter().GetResult();
}
}
static async Task Exmaple1()
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
Task<int> a = Calc1(18000); await a;
Task<int> b = Calc1(18000); await b;
Task<int> c = Calc1(18000); await c;
stopwatch.Stop();
Console.WriteLine("Time elapsed: {0}", stopwatch.Elapsed);
}
static async Task<int> Calc1(int a)
{
int result = 0;
for (int k = 0; k < a; ++k) { for (int l = 0; l < a; ++l) { result += l; } }
return result;
}
static async Task Exmaple2()
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
Task<int> a = Calc2(18000); await a;
Task<int> b = Calc2(18000); await b;
Task<int> c = Calc2(18000); await c;
stopwatch.Stop();
Console.WriteLine("Time elapsed: {0}", stopwatch.Elapsed);
}
static async Task<int> Calc2(int a)
{
await Task.Delay(1);
int result = 0;
for (int k = 0; k < a; ++k){for (int l = 0; l < a; ++l) { result += l; } }
return result;
}
By using an async/await pattern, you intend for your Calc method to run as a task:
Task<int> a = Calc(18000);
We need to establish that tasks are generally asynchronous in their nature, but not parallel - parallelism is a feature of threads. However, under the hood, some thread will be used to execute your tasks on. Depending on the context your running your code in, multiple tasks may or may not be executed in parallel or sequentially - but they will be (allowed to be) asynchronous.
One good way of internalizing this is picturing a teacher (the thread) answering question (the tasks) in class. A teacher will never be able to answer two different questions simultaneously - i.e. in parallel - but she will be able to answer questions of multiple students, and can also be interrupted with new questions in between.
Specifically, async/await is a cooperative multiprocessing feature (emphasis on "cooperative") where tasks only get to be scheduled onto a thread if that thread is free - and if some task is already running on that thread, it needs to manually give way. (Whether and how many threads are available for execution is, in turn, dependent on the environment your code is executing in.)
When running Task.Delay(1) the code announces that it is intending to sleep, which signals to the scheduler that another task may execute, thereby enabling asynchronicity. The way it's used in your code it is, in essence, a slightly worse version of Task.Yield (you can find more details about that in the comments below).
Generally speaking, whenever you await, the method currently being executed is "returned" from, marking the code after it as a continuation. This continuation will only be executed when the task scheduler selects the current task for execution again. This may happen if no other task is currently executing and waiting to be executed - e.g. because they all yielded or await something "long-running".
In your example, the Calc method yields due to the Task.Delay and execution returns to the Main method. This, in turn enters the next Calc method and the pattern repeats. As was established in other answers, these continuations may or may not execute on different threads, depending on the environment - without a synchronization context (e.g. in a console application), it may happen. To be clear, this is neither a feature of Task.Delay nor async/await, but of the configuration your code executes in. If you require parallelism, use proper threads or ensure that your tasks are started such that they encourage use of multiple threads.
In another note: Whenever you intend to run synchronous code in an asynchronous manner, use Task.Run() to execute it. This will make sure it doesn't get in your way too much by always using a background thread. This SO answer on LongRunning tasks might be insightful.

Thread vs Parallel.For performance

I am struggling to understand the difference between threads and Parallel.For. I created two functions, one used Parallel.For other invoked threads. Invoking 10 threads would appear to be faster, can anyone please explain? Would threads use multiple processors available in the system (to get executed in parallel) or does it just do time slicing in reference to CLR?
public static bool ParallelProcess()
{
Stopwatch sw = new Stopwatch();
sw.Start();
Parallel.For(0, 10, x =>
{
Console.WriteLine(string.Format("Printing {0} thread = {1}", x,
Thread.CurrentThread.ManagedThreadId));
Thread.Sleep(3000);
});
sw.Stop();
Console.WriteLine(string.Format("Time in secs {0}", sw.Elapsed.Seconds));
return true;
}
public static bool ParallelThread()
{
Stopwatch sw = new Stopwatch();
sw.Start();
for (int i = 0; i < 10; i++)
{
Thread t = new Thread(new ThreadStart(Thread1));
t.Start();
if (i == 9)
t.Join();
}
sw.Stop();
Console.WriteLine(string.Format("Time in secs {0}", sw.Elapsed.Seconds));
return true;
}
private static void Thread1()
{
Console.WriteLine(string.Format("Printing {0} thread = {1}", 0,
Thread.CurrentThread.ManagedThreadId));
Thread.Sleep(3000);
}
When called below methods, Parallel.For took twice time then threads.
Algo.ParallelThread(); //took 3 secs
Algo.ParallelProcess(); //took 6 secs
Parallel utilizes however many threads the underlying scheduler provides, which would be the minimum number of threadpool threads to start with.
The number of minimum threadpool threads is by default set to the number of processors. As time goes on and based on many different factors, e.g. all current threads being busy, the scheduler might decide to spawn more threads and go higher than the minimum count.
All of that is managed for you to stop unnecessary resource usage. Your second example circumvents all that by spawning threads manually. If you explicitly set the number of threadpool threads e.g. ThreadPool.SetMinThreads(100, 100), you'll see even the Parallel one takes 3 seconds as it immediately has more threads available to use.
You've got a bunch of things here that are going wrong.
(1) Don't use sw.Elapsed.Seconds this value is an int and (obviously) truncates the fractional part of the time. Worse though, if you have a process that takes 61 seconds to complete this will report 1 as it's like the second hand on a clock. You should instead use sw.Elapsed.TotalSeconds which reports as a double and it shows the total number of seconds regardless how many minutes or hours, etc.
(2) Parallel.For uses the thread-pool. This significantly reduces (or even eliminates) the overhead for creating threads. Each time you call new Thread(() => ...) you are allocating over 1MB of RAM and chewing up valuable resources before any processing can take place.
(3) You're artificially loading up the threads with Thread.Sleep(3000); and this means you are overshadowing the actual time it takes to create threads with a massive sleep.
(4) Parallel.For is, by default, limited by the number of cores on your CPU. So when you run 10 threads the work is being cut in to two steps - meaning that the Thread.Sleep(3000); is being run twice in series, hence the 6 seconds that it's running. The new Thread approach is running all of the threads in one go meaning that it takes just over 3 seconds, but again, the Thread.Sleep(3000); is swamping the thread start up time.
(5) You're also dealing with a CLR JIT issue. The first time you run your code the start-up costs are enormous. Let's change the code to remove the sleeps and to properly join the threads:
public static bool ParallelProcess()
{
Stopwatch sw = new Stopwatch();
sw.Start();
Parallel.For(0, 10, x =>
{
Console.WriteLine(string.Format("Printing {0} thread = {1}", x, Thread.CurrentThread.ManagedThreadId));
});
sw.Stop();
Console.WriteLine(string.Format("Time in secs {0}", sw.Elapsed.TotalMilliseconds));
return true;
}
public static bool ParallelThread()
{
Stopwatch sw = new Stopwatch();
sw.Start();
var threads = Enumerable.Range(0, 10).Select(x => new Thread(new ThreadStart(Thread1))).ToList();
foreach (var thread in threads) thread.Start();
foreach (var thread in threads) thread.Join();
sw.Stop();
Console.WriteLine(string.Format("Time in secs {0}", sw.Elapsed.TotalMilliseconds));
return true;
}
private static void Thread1()
{
Console.WriteLine(string.Format("Printing {0} thread = {1}", 0, Thread.CurrentThread.ManagedThreadId));
}
Now, to get rid of the CLR/JIT start up time, let's run the code like this:
ParallelProcess();
ParallelThread();
ParallelProcess();
ParallelThread();
ParallelProcess();
ParallelThread();
The times we get are like this:
Time in secs 3.8617
Time in secs 4.7719
Time in secs 0.3633
Time in secs 1.6332
Time in secs 0.3551
Time in secs 1.6148
The starting run times are terrible compared to the second and third runs that are far more consistent.
The result is that running Parallel.For is 4 to 5 times faster than calling new Thread.
Your snippets are not equivalent. Here is a version of ParallelThread that would do the same as ParallelProcess but starting new threads:
public static bool ParallelThread()
{
Stopwatch sw = new Stopwatch();
sw.Start();
var threads = new Thread[10];
for (int i = 0; i < 10; i++)
{
int x = i;
threads[i] = new Thread(() => Thread1(x));
threads[i].Start();
}
for (int i = 0; i < 10; i++)
{
threads[i].Join();
}
sw.Stop();
Console.WriteLine(string.Format("Time in secs {0}", sw.Elapsed.Seconds));
return true;
}
private static void Thread1(int x)
{
Console.WriteLine(string.Format("Printing {0} thread = {1}", x,
Thread.CurrentThread.ManagedThreadId));
Thread.Sleep(3000);
}
Here, I am making sure to wait for all the threads. And also, I making sure to match the console output. Things that OP code does not do.
However, the time difference is still there.
Let me tell you what makes the difference, at least in my tests: the order. Run ParallelProcess before ParallelThread and they should both take 3 seconds to complete (ignoring the initial runs, which will take longer because of compilation). I cannot really explain it.
We could modify the above code futher to use the ThreadPool, and that did also result in ParallelProcess completing in 3 seconds (even though I did not modify that version). This is the version of ParallelThread with ThreadPool I came up with:
public static bool ParallelThread()
{
Stopwatch sw = new Stopwatch();
sw.Start();
var events = new ManualResetEvent[10];
for (int i = 0; i < 10; i++)
{
int x = i;
events[x] = new ManualResetEvent(false);
ThreadPool.QueueUserWorkItem
(
_ =>
{
Thread1(x);
events[x].Set();
}
);
}
for (int i = 0; i < 10; i++)
{
events[i].WaitOne();
}
sw.Stop();
Console.WriteLine(string.Format("Time in secs {0}", sw.Elapsed.Seconds));
return true;
}
private static void Thread1(int x)
{
Console.WriteLine(string.Format("Printing {0} thread = {1}", x,
Thread.CurrentThread.ManagedThreadId));
Thread.Sleep(3000);
}
Note: We could use WaitAll on the events, but that would fail on a STAThread.
You have Thread.Sleep(3000) which are the 3 seconds we see. Meaning that we are not really measuring the overhead of any of these methods.
So, I decided I want to study this futher, and to do it, I went up one order of magnitud (from 10 to 100) and removed the Console.WriteLine (which is introducing synchronization anyway).
This is my code listing:
void Main()
{
ParallelThread();
ParallelProcess();
}
public static bool ParallelProcess()
{
Stopwatch sw = new Stopwatch();
sw.Start();
Parallel.For(0, 100, x =>
{
/*Console.WriteLine(string.Format("Printing {0} thread = {1}", x,
Thread.CurrentThread.ManagedThreadId));*/
Thread.Sleep(3000);
});
sw.Stop();
Console.WriteLine(string.Format("Time in secs {0}", sw.Elapsed.Seconds));
return true;
}
public static bool ParallelThread()
{
Stopwatch sw = new Stopwatch();
sw.Start();
var events = new ManualResetEvent[100];
for (int i = 0; i < 100; i++)
{
int x = i;
events[x] = new ManualResetEvent(false);
ThreadPool.QueueUserWorkItem
(
_ =>
{
Thread1(x);
events[x].Set();
}
);
}
for (int i = 0; i < 100; i++)
{
events[i].WaitOne();
}
sw.Stop();
Console.WriteLine(string.Format("Time in secs {0}", sw.Elapsed.Seconds));
return true;
}
private static void Thread1(int x)
{
/*Console.WriteLine(string.Format("Printing {0} thread = {1}", x,
Thread.CurrentThread.ManagedThreadId));*/
Thread.Sleep(3000);
}
I am getting 6 seconds for ParallelThread and 9 seconds for ParallelProcess. This remains true even after reversing the order. Which makes me much more confident that this is a real measure of the overhead.
Adding ThreadPool.SetMinThreads(100, 100); bring the time back down to 3 seconds, for both ParallelThread (remember that this version is using the ThreadPool) and ParallelProcess. Meaning that this overhead comes from the thread pool. Now, I can go back to the version that spawns new threads (modified to spawn 100 and with Console.WriteLine commented):
public static bool ParallelThread()
{
Stopwatch sw = new Stopwatch();
sw.Start();
var threads = new Thread[100];
for (int i = 0; i < 100; i++)
{
int x = i;
threads[i] = new Thread(() => Thread1(x));
threads[i].Start();
}
for (int i = 0; i < 100; i++)
{
threads[i].Join();
}
sw.Stop();
Console.WriteLine(string.Format("Time in secs {0}", sw.Elapsed.Seconds));
return true;
}
private static void Thread1(int x)
{
/*Console.WriteLine(string.Format("Printing {0} thread = {1}", x,
Thread.CurrentThread.ManagedThreadId));*/
Thread.Sleep(3000);
}
I get consistent 3 seconds from this version (meaning the time overhead is negligible, since, as I said earlier, Thread.Sleep(3000) is 3 seconds), however I want to note that it would be leaving more garbage to collect than using the ThreadPool or Parallel.For. On the other hand, using Parallel.For remains tied to the ThreadPool. By the way, if you want to degrade its performance, reducing the minimun number of threads is not enough, you got to degreade the maximun number of threads too (e.g. ThreadPool.SetMaxThreads(1, 1);).
All in all, please notice that Parallel.For is easier to use, and harder to wrong.
Invoking 10 threads would appear to be faster, can anyone please explain?
Spawning threads is fast. Although, it will leade to more garbage. Also, note that your test is not great.
Would threads use multiple processors available in the system (to get executed in parallel) or does it just do time slicing in reference to CLR?
Yes, they would. They map to the underlaying operating system threads, can be preempted by it, and will run in any core according to their affinity (see ProcessThread.ProcessorAffinity). To be clear, they are not fibers nor coroutines.
To put it in the simplest of the simplest terms, using Thread class guarantees to create a thread on the operating system level but using the Parallel.For the CLR thinks twice before spawning the OS-level threads. If it feels that it is a good time to create thread on OS-level, it goes ahead, otherwise it employs the available Thread pool. TPL is written to be optimized with a multi-core environment.

How do you get list of running threads in C#?

I create dynamic threads in C# and I need to get the status of those running threads.
List<string>[] list;
list = dbConnect.Select();
for (int i = 0; i < list[0].Count; i++)
{
Thread th = new Thread(() =>{
sendMessage(list[0]['1']);
//calling callback function
});
th.Name = "SID"+i;
th.Start();
}
for (int i = 0; i < list[0].Count; i++)
{
// here how can i get list of running thread here.
}
How can you get list of running threads?
On Threads
I would avoid explicitly creating threads on your own.
It is much more preferable to use the ThreadPool.QueueUserWorkItem or if you do can use .Net 4.0 you get the much more powerful Task parallel library which also allows you to use a ThreadPool threads in a much more powerful way (Task.Factory.StartNew is worth a look)
What if we choose to go by the approach of explicitly creating threads?
Let's suppose that your list[0].Count returns 1000 items. Let's also assume that you are performing this on a high-end (at the time of this writing) 16core machine. The immediate effect is that we have 1000 threads competing for these limited resources (the 16 cores).
The larger the number of tasks and the longer each of them runs, the more time will be spent in context switching. In addition, creating threads is expensive, this overhead creating each thread explicitly could be avoided if an approach of reusing existing threads is used.
So while the initial intent of multithreading may be to increase speed, as we can see it can have quite the opposite effect.
How do we overcome 'over'-threading?
This is where the ThreadPool comes into play.
A thread pool is a collection of threads that can be used to perform a number of tasks in the background.
How do they work:
Once a thread in the pool completes its task, it is returned to a queue of waiting threads, where it can be reused. This reuse enables applications to avoid the cost of creating a new thread for each task.
Thread pools typically have a maximum number of threads. If all the threads are busy, additional tasks are placed in queue until they can be serviced as threads become available.
So we can see that by using a thread pool threads we are more efficient both
in terms of maximizing the actual work getting done. Since we are not over saturating the processors with threads, less time is spent switching between threads and more time actually executing the code that a thread is supposed to do.
Faster thread startup: Each threadpool thread is readily available as opposed to waiting until a new thread gets constructed.
in terms of minimising memory consumption, the threadpool will limit the number of threads to the threadpool size enqueuing any requests that are beyond the threadpool size limit. (see ThreadPool.GetMaxThreads). The primary reason behind this design choice, is of course so that we don't over-saturate the limited number of cores with too many thread requests keeping context switching to lower levels.
Too much Theory, let's put all this theory to the test!
Right, it's nice to know all this in theory, but let's put it to practice and see what
the numbers tell us, with a simplified crude version of the application that can give us a coarse indication of the difference in orders of magnitude. We will do a comparison between new Thread, ThreadPool and Task Parallel Library (TPL)
new Thread
static void Main(string[] args)
{
int itemCount = 1000;
Stopwatch stopwatch = new Stopwatch();
long initialMemoryFootPrint = GC.GetTotalMemory(true);
stopwatch.Start();
for (int i = 0; i < itemCount; i++)
{
int iCopy = i; // You should not use 'i' directly in the thread start as it creates a closure over a changing value which is not thread safe. You should create a copy that will be used for that specific variable.
Thread thread = new Thread(() =>
{
// lets simulate something that takes a while
int k = 0;
while (true)
{
if (k++ > 100000)
break;
}
if ((iCopy + 1) % 200 == 0) // By the way, what does your sendMessage(list[0]['1']); mean? what is this '1'? if it is i you are not thread safe.
Console.WriteLine(iCopy + " - Time elapsed: (ms)" + stopwatch.ElapsedMilliseconds);
});
thread.Name = "SID" + iCopy; // you can also use i here.
thread.Start();
}
Console.ReadKey();
Console.WriteLine(GC.GetTotalMemory(false) - initialMemoryFootPrint);
Console.ReadKey();
}
Result:
ThreadPool.EnqueueUserWorkItem
static void Main(string[] args)
{
int itemCount = 1000;
Stopwatch stopwatch = new Stopwatch();
long initialMemoryFootPrint = GC.GetTotalMemory(true);
stopwatch.Start();
for (int i = 0; i < itemCount; i++)
{
int iCopy = i; // You should not use 'i' directly in the thread start as it creates a closure over a changing value which is not thread safe. You should create a copy that will be used for that specific variable.
ThreadPool.QueueUserWorkItem((w) =>
{
// lets simulate something that takes a while
int k = 0;
while (true)
{
if (k++ > 100000)
break;
}
if ((iCopy + 1) % 200 == 0)
Console.WriteLine(iCopy + " - Time elapsed: (ms)" + stopwatch.ElapsedMilliseconds);
});
}
Console.ReadKey();
Console.WriteLine("Memory usage: " + (GC.GetTotalMemory(false) - initialMemoryFootPrint));
Console.ReadKey();
}
Result:
Task Parallel Library (TPL)
static void Main(string[] args)
{
int itemCount = 1000;
Stopwatch stopwatch = new Stopwatch();
long initialMemoryFootPrint = GC.GetTotalMemory(true);
stopwatch.Start();
for (int i = 0; i < itemCount; i++)
{
int iCopy = i; // You should not use 'i' directly in the thread start as it creates a closure over a changing value which is not thread safe. You should create a copy that will be used for that specific variable.
Task.Factory.StartNew(() =>
{
// lets simulate something that takes a while
int k = 0;
while (true)
{
if (k++ > 100000)
break;
}
if ((iCopy + 1) % 200 == 0) // By the way, what does your sendMessage(list[0]['1']); mean? what is this '1'? if it is i you are not thread safe.
Console.WriteLine(iCopy + " - Time elapsed: (ms)" + stopwatch.ElapsedMilliseconds);
});
}
Console.ReadKey();
Console.WriteLine("Memory usage: " + (GC.GetTotalMemory(false) - initialMemoryFootPrint));
Console.ReadKey();
}
Result:
So we can see that:
+--------+------------+------------+--------+
| | new Thread | ThreadPool | TPL |
+--------+------------+------------+--------+
| Time | 6749 | 228ms | 222ms |
| Memory | ≈300kb | ≈103kb | ≈123kb |
+--------+------------+------------+--------+
The above falls nicely inline to what we anticipated in theory. High memory for new Thread as well as slower overall performance when compared to ThreadPool. ThreadPool and TPL have equivalent performance with TPL having a slightly higher memory footprint than a pure thread pool but it's probably a price worth paying given the added flexibility Tasks provide (such as cancellation, waiting for completion querying status of task)
At this point, we have proven that using ThreadPool threads is the preferable option in terms of speed and memory.
Still, we have not answered your question. How to track the state of the threads running.
To answer your question
Given the insights we have gathered, this is how I would approach it:
List<string>[] list = listdbConnect.Select()
int itemCount = list[0].Count;
Task[] tasks = new Task[itemCount];
stopwatch.Start();
for (int i = 0; i < itemCount; i++)
{
tasks[i] = Task.Factory.StartNew(() =>
{
// NOTE: Do not use i in here as it is not thread safe to do so!
sendMessage(list[0]['1']);
//calling callback function
});
}
// if required you can wait for all tasks to complete
Task.WaitAll(tasks);
// or for any task you can check its state with properties such as:
tasks[1].IsCanceled
tasks[1].IsCompleted
tasks[1].IsFaulted
tasks[1].Status
As a final note, you can not use the variable i in your Thread.Start, since it would create a closure over a changing variable which would effectively be shared amongst all Threads. To get around this (assuming you need to access i), simply create a copy of the variable and pass the copy in, this would make one closure per thread which would make it thread safe.
Good luck!
Use Process.Threads:
var currentProcess = Process.GetCurrentProcess();
var threads = currentProcess.Threads;
Note: any threads owned by the current process will show up here, including those not explicitly created by you.
If you only want the threads that you created, well, why don't you just keep track of them when you create them?
Create a List<Thread> and store each new thread in your first for loop in it.
List<string>[] list;
List<Thread> threads = new List<Thread>();
list = dbConnect.Select();
for (int i = 0; i < list[0].Count; i++)
{
Thread th = new Thread(() =>{
sendMessage(list[0]['1']);
//calling callback function
});
th.Name = "SID"+i;
th.Start();
threads.add(th)
}
for (int i = 0; i < list[0].Count; i++)
{
threads[i].DoStuff()
}
However if you don't need i you can make the second loop a foreach instead of a for
As a side note, if your sendMessage function does not take very long to execute you should somthing lighter weight then a full Thread, use a ThreadPool.QueueUserWorkItem or if it is available to you, a Task
Process.GetCurrentProcess().Threads
This gives you a list of all threads running in the current process, but beware that there are threads other than those you started yourself.
Use Process.Threads to iterate through your threads.

ThreadPool - WaitAll 64 Handle Limit

I am trying to bypass the the wait64 handle limit that .net 3.5 imposes
I have seen this thread : Workaround for the WaitHandle.WaitAll 64 handle limit?
So I understand the general idea but I am having difficulty because I am not using a delegate but rather
I am basically working of this example :
http://msdn.microsoft.com/en-us/library/3dasc8as%28VS.80%29.aspx
This link http://www.switchonthecode.com/tutorials/csharp-tutorial-using-the-threadpool
is similar but again the int variable keeping track of the tasks is a member variable.
Where in the above example would I pass the threadCount integer?
Do I pass it in the callback method as an object? I think I am having trouble with the callback method and passing by reference.
Thanks Stephen,
That link is not entirely clear to me.
Let me post my code to help myself clarify:
for (int flows = 0; flows < NumFlows; flows++)
{
ResetEvents[flows] = new ManualResetEvent(false);
ICalculator calculator = new NewtonRaphson(Perturbations);
Calculators[flows] = calculator;
ThreadPool.QueueUserWorkItem(calculator.ThreadPoolCallback, flows);
}
resetEvent.WaitOne();
Where would I pass in my threadCount variable. I assume it needs to be decremented in calculator.ThreadPoolCallback?
You should not be using multiple wait handles to wait for the completion of multiple work items in the ThreadPool. Not only is it not scalable you will eventually bump into the 64 handle limit imposed by the WaitHandle.WaitAll method (as you have done already). The correct pattern to use in this situation is a counting wait handle. There is one available in the Reactive Extensions download for .NET 3.5 via the CountdownEvent class.
var finished = new CountdownEvent(1);
for (int flows = 0; flows < NumFlows; flows++)
{
finished.AddCount();
ICalculator calculator = new NewtonRaphson(Perturbations);
Calculators[flows] = calculator;
ThreadPool.QueueUserWorkItem(
(state) =>
{
try
{
calculator.ThreadPoolCallback(state);
}
finally
{
finished.Signal();
}
}, flows);
}
finished.Signal();
finished.Wait();
An anonymous method might be easiest:
int threadCount = 0;
for (int flows = 0; flows < NumFlows; flows++)
{
ICalculator calculator = new NewtonRaphson(Perturbations);
Calculators[flows] = calculator;
// We're about to queue a new piece of work:
// make a note of the fact a new work item is starting
Interlocked.Increment(ref threadCount);
ThreadPool.QueueUserWorkItem(
delegate
{
calculator.ThreadPoolCallback(flows);
// We've finished this piece of work...
if (Interlocked.Decrement(ref threadCount) == 0)
{
// ...and we're the last one.
// Signal back to the main thread.
resetEvent.Set();
}
}, null);
}
resetEvent.WaitOne();

How to use ThreadPool class

namespace ThPool
{
class Program
{
private static long val = 0;
private static string obj = string.Empty;
static void Main(string[] args)
{
Thread objThread1 = new Thread(new ThreadStart(IncrementValue));
objThread1.Start();
Thread objThread2 = new Thread(new ThreadStart(IncrementValue));
objThread2.Start();
objThread1.Join();
objThread2.Join();
Console.WriteLine("val=" + val + " however it should be " + (10000000 + 10000000));
Console.ReadLine();
}
private static void IncrementValue()
{
for (int i = 0; i < 10000000; i++)
{
Monitor.Enter(obj);
val++;
Monitor.Exit(obj);
}
}
}
}
How do I use ThreadPool class in replacement of thread & monitor?
There are a couple of ways to use the thread pool. For your task, you should look at the following.
If you just need a task to run the easiest way is to use QueueUserWorkItem, which simply takes a delegate to run. The disadvantage is that you have little control over the job. The delegate can't return a value, and you don't know when the run is completed.
If you want a little more control, you can use the BeginInvoke/EndInvoke interface of delegates. This schedules the code to run on a thread pool thread. You can query the status via the IAsyncResult handle returned by BeginInvoke, and you can get the result (as well as any exception on the worker thread) via EndInvoke.
To use the Enter/Exit on Monitor properly, you have to make sure that Exit is always called. Therefore you should place your Exit call in a finally block.
However, if you don't need to supply a timeout value for Enter, you would be much better off just using the lock keyword, which compiles into a proper set of Enter and Exit calls.
EventWaitHandle[] waitHandles = new EventWaitHandle[2];
for(int i = 0; i < 2; i++)
{
waitHandles[i] = new AutoResetEvent(false);
ThreadPool.QueueUserWorkItem(state =>
{
EventWaitHandle handle = state as EventWaitHandle;
for(int j = 0; j < 10000000; j++)
{
Interlocked.Increment(ref val); //instead of Monitor
}
handle.Set();
}, waitHandles[i]);
}
WaitHandle.WaitAll(waitHandles);
Console.WriteLine("val=" + val + " however it should be " + (10000000 + 10000000));
You should look into ThreadPool.QueueUserWorkItem(). It takes a delegate that is run on the threadpool thread, passing in a state object.
i.e.
string fullname = "Neil Barnwell";
ThreadPool.QueueUserWorkItem(state =>
{
Console.WriteLine("Hello, " + (string)state);
}, fullname);
Don't be confused by Control.BeginInvoke(). This will marshal the call to the thread that created the control, to prevent the issue where cross-thread calls update Controls. If you want simple multi-threading on a Windows form, then look into the BackgroundWorker.
Do not use the thread pool for anything but the most simple things. In fact it is extremely dangerous to aquire a lock on a thread pool thread. However you can safely use the Interlocked API's.
You can use BeginInvoke, EndInvoke. It uses the threadpool behind the scenes but it is easier to program. See here:
http://msdn.microsoft.com/en-us/library/2e08f6yc.aspx

Categories

Resources