I need to call a method many times (ten million), so I use threads. But once the loop has made about 100 calls to my method, it throws an OutOfMemoryException.
I tried adding SetMaxThreads so that only 50 threads run simultaneously, but it doesn't work (because I don't know how to do it). Thanks in advance.
ThreadPool.SetMaxThreads(50, 50);
for (int i = 0; i < tablePersons.Rows.Count; i++)
{
    Thread t = new Thread(RegisterPerson);
    t.Start(tablePersons.Rows[i]);
}
static void RegisterPerson(object paramObject)
{
    DataRow person = (DataRow)paramObject;
    // Call a service...
}
1) You are confusing thread pool threads with user created threads.
This creates a new thread (not a thread pool thread):
Thread t = new Thread(RegisterPerson);
Setting the thread pool to have a maximum of fifty threads:
ThreadPool.SetMaxThreads(50, 50);
has no effect on your loop, where you attempt to create a user thread for each row.
There are a number of ways to enter the thread pool:
Via the Task Parallel Library (from Framework 4.0)
By calling ThreadPool.QueueUserWorkItem
Via asynchronous delegates
Via BackgroundWorker
2) You should not be creating that many user threads.
I would suggest reading: Joe Albahari's excellent Threading in C#
Rather than creating that many separate threads manually, you should probably use Parallel.ForEach(), and let that handle the thread creation for you.
They won't all run simultaneously, but you won't run into memory issues.
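As a rough illustration of that suggestion, here is a minimal sketch that caps the number of simultaneous calls with ParallelOptions.MaxDegreeOfParallelism. The class name BoundedParallelDemo, the integer ids, and the body of RegisterPerson are assumptions made for the example; the real method would call the service instead of sleeping.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedParallelDemo
{
    // Hypothetical stand-in for RegisterPerson: it only simulates a slow service call.
    static void RegisterPerson(int personId)
    {
        Thread.Sleep(10);
    }

    // Processes every id with at most maxParallel calls in flight.
    // Returns the highest number of calls observed running at the same time.
    public static int ProcessAll(IEnumerable<int> personIds, int maxParallel)
    {
        int running = 0;
        int peak = 0;
        object gate = new object();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(personIds, options, id =>
        {
            lock (gate)
            {
                running++;
                if (running > peak) peak = running;
            }
            try
            {
                RegisterPerson(id);
            }
            finally
            {
                lock (gate) { running--; }
            }
        });

        return peak;
    }

    public static void Main()
    {
        int peak = ProcessAll(Enumerable.Range(0, 200), 8);
        Console.WriteLine("Peak concurrency: " + peak);
    }
}
```

With a DataTable, the rows could be passed as tablePersons.Rows.Cast<DataRow>() and the lambda would receive a DataRow instead of an int.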
Related
I am creating an app that has to process a huge amount of data. I want to use threading in C# to make the processing faster. Please see the example code below.
private static void MyProcess(Object someData)
{
    // Do some data processing
}
static void Main(string[] args)
{
    for (int task = 1; task < 10; task++)
    {
        ThreadPool.QueueUserWorkItem(new WaitCallback(MyProcess), task);
    }
}
Does this mean that a new thread will be created every loop passing the task to the "MyProcess" method (10 threads total)? Also, are the threads going to process concurrently?
The number of threads a threadpool will start depends on multiple factors, see The managed thread pool
Basically you are queuing 10 work items here, which are likely to start threads immediately.
The threads will most likely run concurrently, depending on the machine and the number of processors.
If you start a large number of worker items, they will end up in a queue and start running as soon as a thread becomes available.
The calls will be scheduled on the thread pool. It does not guarantee that 10 threads will be created nor that all 10 tasks will be executed concurrently. The number of threads in the thread pool depends on the hardware and is chosen automatically to provide the best performance.
These articles contain good explanations of how it works:
https://owlcation.com/stem/C-ThreadPool-and-its-Task-Queue-Example
https://learn.microsoft.com/en-us/dotnet/api/system.threading.threadpool?redirectedfrom=MSDN&view=netframework-4.8
https://www.c-sharpcorner.com/article/thread-pool-in-net-core-and-c-sharp/
This Stack Overflow question explains the difference between ThreadPool and Thread:
Thread vs ThreadPool
Your method will be queued 9 times (you start at 1, not 0) for execution, and each call will run when a thread pool thread becomes available.
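To see how the queued items are actually spread over pool threads, a minimal sketch along the following lines can help. It queues the same nine items as the question, waits for them with a CountdownEvent, and records which managed thread handled each one. The class name QueueDemo and the 50 ms sleep are assumptions for illustration; the number of distinct threads you observe will vary by machine and runtime.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

public static class QueueDemo
{
    // Queues one work item per task number in [1, 10) and waits for all of them.
    // Returns a map from task number to the managed thread id that ran it.
    public static ConcurrentDictionary<int, int> Run()
    {
        var threadByTask = new ConcurrentDictionary<int, int>();
        using (var allDone = new CountdownEvent(9))
        {
            for (int task = 1; task < 10; task++)
            {
                ThreadPool.QueueUserWorkItem(state =>
                {
                    threadByTask[(int)state] = Thread.CurrentThread.ManagedThreadId;
                    Thread.Sleep(50); // simulated data processing
                    allDone.Signal();
                }, task);
            }
            allDone.Wait(); // block until all nine items have signalled
        }
        return threadByTask;
    }

    public static void Main()
    {
        var result = Run();
        Console.WriteLine("Work items executed: " + result.Count);
        Console.WriteLine("Distinct pool threads used: " + result.Values.Distinct().Count());
    }
}
```

The count of work items is always nine, while the count of distinct threads is whatever the pool decided to use, which is the point made in the answers above.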
I'm creating a Windows service with 2 separate components:
1 component creates jobs and inserts them to the database (1 thread)
The 2nd component processes these jobs (multiple FIXED # of threads in a thread pool)
These 2 components will always run as long as the service is running.
What I'm stuck on is determining how to implement this thread pool. I've done some research, and there seem to be many ways of doing this, such as creating a class that implements a "ThreadPoolCallback" method and using ThreadPool.QueueUserWorkItem to queue a work item. http://msdn.microsoft.com/en-us/library/3dasc8as.aspx
However in the example given, it doesn't seem to fit my scenario. I want to create a FIXED number of threads in a thread pool initially. Then feed it jobs to process. How do I do this?
// Wrapper method for use with thread pool.
public void ThreadPoolCallback(Object threadContext)
{
    int threadIndex = (int)threadContext;
    Console.WriteLine("thread {0} started...", threadIndex);
    _fibOfN = Calculate(_n);
    Console.WriteLine("thread {0} result calculated...", threadIndex);
    _doneEvent.Set();
}
const int FibonacciCalculations = 10;
Fibonacci[] fibArray = new Fibonacci[FibonacciCalculations];
for (int i = 0; i < FibonacciCalculations; i++)
{
    // f is the Fibonacci instance for this iteration (its construction is omitted in this excerpt)
    ThreadPool.QueueUserWorkItem(f.ThreadPoolCallback, i);
}
Create a BlockingCollection of work items. The thread that creates jobs adds them to this collection.
Create a fixed number of persistent threads that read items from that BlockingCollection and process them. Something like:
BlockingCollection<WorkItem> WorkItems = new BlockingCollection<WorkItem>();
void WorkerThreadProc()
{
    foreach (var item in WorkItems.GetConsumingEnumerable())
    {
        // process item
    }
}
Multiple worker threads can be doing that concurrently. BlockingCollection supports multiple readers and writers, so there are no concurrency problems that you have to deal with.
See my blog post Simple Multithreading, part 2 for an example that uses one consumer and one producer. Adding multiple consumers is a very simple matter of spinning up a new task for each consumer.
Another way to do it is to use a semaphore that controls how many jobs are currently being processed. I show how to do that in this answer. However, I think the shared BlockingCollection is in general a better solution.
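To make the shared-queue approach concrete, here is a minimal sketch with one producer and a fixed number of dedicated consumer threads. The class name FixedWorkerDemo, the use of int as the job type, and the bounded capacity of 100 are assumptions for illustration; a real service would enqueue its own job objects and do real work in the loop.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class FixedWorkerDemo
{
    // Starts workerCount dedicated threads that drain a shared queue,
    // feeds jobCount jobs into it, and returns how many jobs were processed.
    public static int Run(int workerCount, int jobCount)
    {
        int processed = 0;
        using (var jobs = new BlockingCollection<int>(boundedCapacity: 100))
        {
            var workers = new Thread[workerCount];
            for (int w = 0; w < workerCount; w++)
            {
                workers[w] = new Thread(() =>
                {
                    // Blocks while the queue is empty; ends after CompleteAdding
                    // has been called and the queue has been drained.
                    foreach (int job in jobs.GetConsumingEnumerable())
                    {
                        Interlocked.Increment(ref processed); // "process" the job
                    }
                });
                workers[w].IsBackground = true;
                workers[w].Start();
            }

            // Producer side: Add blocks whenever 100 jobs are already waiting.
            for (int j = 0; j < jobCount; j++)
            {
                jobs.Add(j);
            }
            jobs.CompleteAdding();

            foreach (Thread worker in workers)
            {
                worker.Join();
            }
        }
        return processed;
    }

    public static void Main()
    {
        Console.WriteLine("Jobs processed: " + Run(4, 1000));
    }
}
```

In a long-running service the producer would typically never call CompleteAdding until shutdown, so the workers simply block while no jobs are available.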
The .NET thread pool isn't really designed for a fixed number of threads. It's designed to use the resources of the machine in the best way possible to perform multiple relatively small jobs.
Maybe a better solution for you would be to instantiate a fixed number of BackgroundWorkers instead? There are some reasonable BW examples.
I have the following code:
static void Main(string[] args)
{
    Console.Write("Press ENTER to start...");
    Console.ReadLine();
    Console.WriteLine("Scheduling work...");
    for (int i = 0; i < 1000; i++)
    {
        //ThreadPool.QueueUserWorkItem(_ =>
        new Thread(_ =>
        {
            Thread.Sleep(1000);
        }).Start();
    }
    Console.ReadLine();
}
According to the textbook C# 4.0 Unleashed by Bart De Smet (page 1466), using new Thread should mean using many more threads than if you use ThreadPool.QueueUserWorkItem which is commented out in my code.
However I've tried both, and seen in Resource Monitor that with "new Thread", there are about 11 threads allocated, however when I use ThreadPool.QueueUserWorkItem, there are about 50.
Why am I getting the opposite outcome of what is mentioned in this book?
Also why if you increase the sleep time, do you get many more threads allocated when using ThreadPool.QueueUserWorkItem?
Note that new Thread() on its own just creates a Thread object; it is the call to Start() that creates the actual thread you see in Resource Monitor.
Also, if you are looking at the number of threads after the sleep has completed, you won't see any of the new Threads as they have already exited.
On the other hand, the ThreadPool keeps threads around for some time so it can reuse them, so in that case you can still see the threads even after the sleep has completed.
With new Thread(), you might be seeing the number staying around 160 because it took one second to start that many threads, so by the time the 161st thread is started, the first thread is already finished. You should see a higher number of threads if you increase the sleep time.
As for the ThreadPool, it is designed to use as few threads as possible while also keeping the CPU busy. Ideally, the number of busy threads is equal to the number of CPU cores. However, if the pool detects that its threads are currently not using the CPU (sleeping, or waiting for another thread), it starts up more threads (at a rate of 1/second, up to some maximum) to keep the CPU busy.
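The limits the pool works within can be inspected and adjusted through ThreadPool.GetMinThreads, GetMaxThreads, and SetMinThreads. The sketch below is only an illustration (the class name PoolLimitsDemo and the value 16 are assumptions): it raises the worker-thread minimum, which is the number of threads the pool will create on demand before it switches to slowly injecting additional ones.

```csharp
using System;
using System.Threading;

public static class PoolLimitsDemo
{
    // Raises the minimum number of worker threads to at least `desired`
    // (clamped to the current maximum) and returns the resulting minimum.
    public static int RaiseMinWorkers(int desired)
    {
        ThreadPool.GetMinThreads(out int minWorkers, out int minIo);
        ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIo);

        // Never lower the existing minimum and never exceed the maximum.
        int target = Math.Min(Math.Max(desired, minWorkers), maxWorkers);
        ThreadPool.SetMinThreads(target, minIo);

        ThreadPool.GetMinThreads(out minWorkers, out minIo);
        return minWorkers;
    }

    public static void Main()
    {
        Console.WriteLine("Minimum worker threads is now " + RaiseMinWorkers(16));
    }
}
```

Raising the minimum makes the pool respond faster to bursts of blocking work, at the cost of more threads and more memory, so it is usually worth measuring before changing it.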
I have the following code:
for (int i = 1; i <= 500; i++)
{
    BackgroundWorker t = new BackgroundWorker();
    t.DoWork += (sender, e) => { /* SOME DB METHOD THAT TAKES 5 SECONDS */ };
    t.RunWorkerAsync();
}
When I profile this in SQL I notice that the BackgroundWorker appears to be queuing the threads in such a way that only 4 or 5 active connections are open at the same time vs. all 500 connections opening at once. I get no timeouts or blocking from my DB. How can I prevent this queuing and hit the database with all 500 concurrent threads at once?
BackgroundWorker uses the ThreadPool. You can adjust the ThreadPool with ThreadPool.SetMinThreads and ThreadPool.SetMaxThreads. Whether it is actually possible to establish that many connections to your database server is another question (and may cause other problems).
However, starting 500 BackgroundWorker instances is not recommended! A better solution could be the Task class from the Task Parallel Library.
Something like this should help:
Task.Factory.StartNew(
    () => { /* SOME DB METHOD THAT TAKES 5 SECONDS */ },
    TaskCreationOptions.LongRunning
);
From the MSDN documentation:
LongRunning - Specifies that a task will be a long-running,
coarse-grained operation involving fewer, larger components than
fine-grained systems. It provides a hint to the TaskScheduler that
oversubscription may be warranted. Oversubscription lets you create
more threads than the available number of hardware threads.
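A minimal sketch of that hint in use is shown below. It starts several tasks flagged LongRunning, waits for all of them, and counts how many ran on a thread that does not belong to the pool. The class name LongRunningDemo and the 200 ms sleep standing in for the database call are assumptions; whether a long-running task gets its own thread is a decision of the scheduler, so the count is reported rather than guaranteed.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class LongRunningDemo
{
    // Starts `count` tasks flagged LongRunning and waits for all of them.
    // Returns how many of them ran on a thread that is not a pool thread.
    public static int Run(int count)
    {
        var tasks = new Task<bool>[count];
        for (int i = 0; i < count; i++)
        {
            tasks[i] = Task.Factory.StartNew(() =>
            {
                Thread.Sleep(200); // stand-in for the slow database call
                return !Thread.CurrentThread.IsThreadPoolThread;
            }, TaskCreationOptions.LongRunning);
        }

        Task.WaitAll(tasks);
        return tasks.Count(t => t.Result);
    }

    public static void Main()
    {
        Console.WriteLine("Tasks that ran outside the pool: " + Run(20) + " of 20");
    }
}
```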
Or, you could completely bypass the thread pool and use the Thread class directly:
var t = new Thread(() => { /* SOME DB METHOD THAT TAKES 5 SECONDS */ });
t.Start();
"Raw" threads will be harder to work with than tasks, though...
You don't, since your computer can't possibly run 500 threads at the same instant. Most probably your machine has 8 to 16 logical processors, and 4 or 5 is what is left available when you run your code. That seems entirely legitimate.
Our scenario is a network scanner.
It connects to a set of hosts and scans them in parallel for a while using low priority background threads.
I want to be able to schedule lots of work but have only a given number of hosts, say ten, scanned in parallel at any time. Even if I create my own threads, the many callbacks and other asynchronous goodness use the ThreadPool, and I end up running out of resources. I should look at MonoTorrent...
If I use the ThreadPool, can I limit my application to some number of threads that will leave enough for the rest of the application to run smoothly?
Is there a threadpool that I can initialize to n long lived threads?
[Edit]
No one seems to have noticed that I made some comments on some responses so I will add a couple things here.
Threads should be cancellable both gracefully and forcefully.
Threads should have low priority leaving the GUI responsive.
Threads are long running but in Order(minutes) and not Order(days).
Work for a given target host is basically:
For each test
Probe target (work is done mostly on the target end of an SSH connection)
Compare probe result to expected result (work is done on engine machine)
Prepare results for host
Can someone explain why using SmartThreadPool is marked with a negative usefulness?
In .NET 4 you have the integrated Task Parallel Library. When you create a new Task (the new thread abstraction), you can specify that the Task is long running. We have had good experiences with that (long being days rather than minutes or hours).
You can use it in .NET 2 as well, but there it's actually an extension, check here.
In VS2010, debugging of parallel applications based on Tasks (not threads) has been radically improved. It's advised to use Tasks whenever possible rather than raw threads, since they let you handle parallelism in a more object-oriented way.
UPDATE
Tasks that are NOT specified as long running, are queued into the thread pool (or any other scheduler for that matter).
But if a task is specified to be long running, it just creates a standalone Thread, no thread pool is involved.
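Since the question also asks for graceful cancellation, here is a minimal sketch of a long-running task that stops cooperatively when a CancellationToken is signalled. The class name CancellableWorkerDemo, the 10 ms sleep standing in for one probe, and the timings are assumptions for illustration. Forceful cancellation is not covered by this pattern.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class CancellableWorkerDemo
{
    // Starts a long-running loop, requests cancellation after `runFor`,
    // and returns how many iterations completed before the loop stopped.
    public static int RunThenCancel(TimeSpan runFor)
    {
        using (var cts = new CancellationTokenSource())
        {
            CancellationToken token = cts.Token;

            Task<int> worker = Task.Factory.StartNew(() =>
            {
                int iterations = 0;
                // Check the token between units of work and exit cleanly.
                while (!token.IsCancellationRequested)
                {
                    Thread.Sleep(10); // stand-in for one probe of a host
                    iterations++;
                }
                return iterations;
            }, TaskCreationOptions.LongRunning);

            cts.CancelAfter(runFor);
            return worker.Result; // blocks until the loop has noticed the request
        }
    }

    public static void Main()
    {
        int done = RunThenCancel(TimeSpan.FromMilliseconds(300));
        Console.WriteLine("Iterations before cancellation: " + done);
    }
}
```

The worker only stops at the points where it checks the token, so the granularity of those checks determines how quickly a cancellation request takes effect.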
The CLR ThreadPool isn't appropriate for executing long-running tasks: it's for performing short tasks where the cost of creating a thread would be nearly as high as executing the method itself. (Or at least a significant percentage of the time it takes to execute the method.) As you've seen, .NET itself consumes thread pool threads, so you can't reserve a block of them for yourself without risking starving the runtime.
Scheduling, throttling, and cancelling work is a different matter. There's no other built-in .NET worker-queue thread pool, so you'll have to roll your own (managing the threads or BackgroundWorkers yourself) or find a preexisting one (Ami Bar's SmartThreadPool looks promising, though I haven't used it myself).
In your particular case, the best option is neither threads, nor the thread pool, nor BackgroundWorker, but the asynchronous programming model (BeginXXX, EndXXX) provided by the framework.
The advantage of the asynchronous model is that the TCP/IP stack uses callbacks whenever there is data to read, and each callback is automatically run on a thread from the thread pool.
Using the asynchronous model, you can control the number of requests initiated per time interval. If you want, you can also initiate all the requests from a lower-priority thread while processing the requests on a normal-priority thread, which means the packets will stay in the networking stack's internal TCP queue for as short a time as possible.
Asynchronous Client Socket Example - MSDN
P.S. For multiple concurrent, long-running jobs that don't do a lot of computation but mostly wait on I/O (network, disk, etc.), the better option is always to use a callback mechanism and not threads.
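The answer above refers to the BeginXXX/EndXXX pattern. The sketch below swaps in the newer async/await style, which rests on the same idea of not holding a thread while waiting on I/O, and uses a SemaphoreSlim to cap how many hosts are probed at once. The class name ThrottledScanDemo and the ProbeAsync method are hypothetical; the probe only simulates network latency with Task.Delay.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ThrottledScanDemo
{
    // Hypothetical probe: simulates waiting on a remote host without blocking a thread.
    static async Task<string> ProbeAsync(string host)
    {
        await Task.Delay(50);
        return host + ": ok";
    }

    // Scans all hosts with at most maxConcurrent probes in flight.
    // Results are returned in the same order as the input hosts.
    public static async Task<string[]> ScanAsync(IEnumerable<string> hosts, int maxConcurrent)
    {
        using (var gate = new SemaphoreSlim(maxConcurrent))
        {
            List<Task<string>> scans = hosts.Select(async host =>
            {
                await gate.WaitAsync(); // wait for a free slot
                try
                {
                    return await ProbeAsync(host);
                }
                finally
                {
                    gate.Release(); // free the slot for the next host
                }
            }).ToList();

            return await Task.WhenAll(scans);
        }
    }

    public static void Main()
    {
        string[] hosts = Enumerable.Range(1, 25).Select(i => "host" + i).ToArray();
        string[] results = ScanAsync(hosts, 10).GetAwaiter().GetResult();
        Console.WriteLine("Hosts scanned: " + results.Length);
    }
}
```

Because waiting probes do not occupy threads, the limit of ten here controls load on the targets and the network rather than the number of threads in the process.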
I'd create your own thread manager. In the following simple example a Queue is used to hold waiting threads and a Dictionary is used to hold active threads, keyed by ManagedThreadId. When a thread finishes, it removes itself from the active dictionary and launches another thread via a callback.
You can change the maximum number of running threads from your UI, and you can pass extra information to the ThreadDone callback for monitoring performance, etc. If a thread fails because of, say, a network timeout, you can reinsert its work into the queue. Add extra control methods to Supervisor for pausing, stopping, etc.
using System;
using System.Collections.Generic;
using System.Threading;
namespace ConsoleApplication1
{
    public delegate void CallbackDelegate(int idArg);
    class Program
    {
        static void Main(string[] args)
        {
            new Supervisor().Run();
            Console.WriteLine("Done");
            Console.ReadKey();
        }
    }
    class Supervisor
    {
        Queue<System.Threading.Thread> waitingThreads = new Queue<System.Threading.Thread>();
        Dictionary<int, System.Threading.Thread> activeThreads = new Dictionary<int, System.Threading.Thread>();
        int maxRunningThreads = 10;
        object locker = new object();
        volatile bool done;
        public void Run()
        {
            // queue up some threads
            for (int i = 0; i < 50; i++)
            {
                Thread newThread = new Thread(new Worker(ThreadDone).DoWork);
                newThread.IsBackground = true;
                waitingThreads.Enqueue(newThread);
            }
            LaunchWaitingThreads();
            while (!done) Thread.Sleep(200);
        }
        // keep starting waiting threads until we max out
        void LaunchWaitingThreads()
        {
            lock (locker)
            {
                while ((activeThreads.Count < maxRunningThreads) && (waitingThreads.Count > 0))
                {
                    Thread nextThread = waitingThreads.Dequeue();
                    activeThreads.Add(nextThread.ManagedThreadId, nextThread);
                    nextThread.Start();
                    Console.WriteLine("Thread " + nextThread.ManagedThreadId.ToString() + " launched");
                }
                done = (activeThreads.Count == 0) && (waitingThreads.Count == 0);
            }
        }
        // this is called by each thread when it's done
        void ThreadDone(int threadIdArg)
        {
            lock (locker)
            {
                // remove thread from active pool
                activeThreads.Remove(threadIdArg);
            }
            Console.WriteLine("Thread " + threadIdArg.ToString() + " finished");
            LaunchWaitingThreads(); // this could instead be put in the wait loop at the end of Run()
        }
    }
    class Worker
    {
        CallbackDelegate callback;
        public Worker(CallbackDelegate callbackArg)
        {
            callback = callbackArg;
        }
        public void DoWork()
        {
            System.Threading.Thread.Sleep(new Random().Next(100, 1000));
            callback(System.Threading.Thread.CurrentThread.ManagedThreadId);
        }
    }
}
Use the built-in threadpool. It has good capabilities.
Alternatively you can look at the Smart Thread Pool implementation here or at Extended Thread Pool for a limit on the maximum number of working threads.