I have a WinForms application that starts a mini server to serve web pages to the local browser. The problem is that once the server starts, the form obviously stops responding, because the server runs a loop that waits for requests (for every request I create a new thread). Should I move the entire server onto its own thread, or is there another way? The class is in a separate DLL that I created. On its own it works perfectly as expected.
I suggest you take a look at the ThreadPool Class. It is an easy-to-use option for handling multiple threads:
The thread pool enables you to use threads more efficiently by providing your application with a pool of worker threads that are managed by the system.
To queue a method for execution simply use the QueueUserWorkItem Method:
ThreadPool.QueueUserWorkItem(state =>
{
// do some work!
});
If you realize that you need more active concurrent threads to serve your clients, call the SetMaxThreads Method:
ThreadPool.SetMaxThreads(50, 10);
All requests above those numbers for worker threads and I/O threads remain queued until thread pool threads become available.
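A minimal sketch of how this could look for the question's scenario, assuming the mini server is a plain TCP listener: the accept loop runs on one background thread so the form stays responsive, and each accepted client is handed to the thread pool. The class name MiniServer, the loopback address, and the fixed "ok" response are illustrative assumptions, not the asker's actual code.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Sockets;
using System.Text;
using System.Threading;

// Hypothetical helper: keeps the blocking accept loop off the UI thread.
public sealed class MiniServer
{
    // Port 0 asks the OS for any free port (assumption made for the sketch).
    private readonly TcpListener _listener = new TcpListener(IPAddress.Loopback, 0);
    private volatile bool _running;

    public int Port { get; private set; }

    public void Start()
    {
        _listener.Start();
        Port = ((IPEndPoint)_listener.LocalEndpoint).Port;
        _running = true;
        // Background thread: it will not keep the process alive after the form closes.
        var acceptThread = new Thread(AcceptLoop) { IsBackground = true };
        acceptThread.Start();
    }

    public void Stop()
    {
        _running = false;
        _listener.Stop(); // makes a pending AcceptTcpClient call throw
    }

    private void AcceptLoop()
    {
        while (_running)
        {
            TcpClient client;
            try { client = _listener.AcceptTcpClient(); }
            catch (SocketException) { break; }           // listener was stopped
            catch (InvalidOperationException) { break; } // listener not listening any more
            // One pool work item per request instead of one new thread per request.
            ThreadPool.QueueUserWorkItem(_ => Handle(client));
        }
    }

    private static void Handle(TcpClient client)
    {
        using (client)
        {
            try
            {
                // A real handler would read and parse the request first.
                byte[] reply = Encoding.ASCII.GetBytes(
                    "HTTP/1.0 200 OK\r\nContent-Length: 2\r\n\r\nok");
                client.GetStream().Write(reply, 0, reply.Length);
            }
            catch (IOException) { /* client went away; nothing to do in a sketch */ }
        }
    }
}
```

From the form you would call Start() once (for example in the Load handler) and Stop() when the form closes; nothing in the UI thread ever blocks on the accept loop.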
There are two ways here:
Async server. More difficult and more performance. http://robjdavey.wordpress.com/2011/02/12/asynchronous-tcp-server-example/
One thread per client. Easy to write but not applicable if you have many clients. http://tech.pro/tutorial/704/csharp-tutorial-simple-threaded-tcp-server
Don't use a loop that polls for requests.
I would follow @Thomas's suggestion, but add WaitHandles to the ThreadPool work items so you can tell when the callbacks have finished.
ManualResetEvent[] waitHandles = { new ManualResetEvent(false), new ManualResetEvent(false) };
WaitCallback classMethod1 = new WaitCallback(DoClassMethod1);
bool isQueued1 = ThreadPool.QueueUserWorkItem(classMethod1, waitHandles[0]);
WaitCallback classMethod2 = new WaitCallback(DoClassMethod2);
bool isQueued2 = ThreadPool.QueueUserWorkItem(classMethod2, waitHandles[1]);
// Do this if you want to wait for all requests to complete
if (WaitHandle.WaitAll(waitHandles, 5000, false))
{
    // All requests completed; show your result.
}
else
{
    // Timed out; handle the problem.
}
void DoClassMethod1(object state)
{
    // do your work
    ManualResetEvent mre = (ManualResetEvent)state;
    mre.Set();
}
My application needs to perform a number of tasks per tenant on a minute-to-minute basis. These are fire-and-forget operations, so I don't want to use Parallel.ForEach to handle this.
Instead I'm looping through the list of tenants and queuing a ThreadPool.QueueUserWorkItem call to process each tenant's tasks.
foreach (Tenant tenant in tenants)
{
    ThreadPool.QueueUserWorkItem(new WaitCallback(ProcessTenant), tenant);
}
This code works perfectly in production, and can generally process over 100 tenants in under 5 seconds.
However on application startup this causes 100% CPU utilization while things like EF get warmed up during the startup process. To limit this I've implemented a semaphore as follows:
private static Semaphore _threadLimiter = new Semaphore(4, 4);
The idea is to limit this task processing to only be able to use half of the machine's logical processors. Inside the ProcessTenant method I call:
try
{
_threadLimiter.WaitOne();
// Perform all minute-to-minute tasks
}
finally
{
_threadLimiter.Release();
}
In testing, this appears to work exactly as expected. CPU utilization on startup stays at around 50% and does not appear to affect how quickly initial startup takes.
So my question is mainly about what actually happens when WaitOne is called. Does it release the thread to work on other tasks, similar to asynchronous calls? The MSDN documentation states that WaitOne: "Blocks the current thread until the current WaitHandle receives a signal."
So I'm just wary that this won't actually allow my web app to continue to utilize this blocked thread while it's waiting, which would make the whole point of this exercise meaningless.
WaitOne does block the thread, and that thread will stop being scheduled on a CPU core until the semaphore is signaled. However, you're holding a large number of threads from the threadpool for possibly a long time ("long" as in "longer than ~500 ms"). This can be an issue because the threadpool grows very slowly, so you may be preventing other parts of your application from properly using it.
If you plan on waiting for a significant amount of time, you could use your own threads instead:
foreach (Tenant tenant in tenants)
{
    new Thread(ProcessTenant).Start(tenant);
}
However, you're still keeping one thread per item in memory. While they won't eat CPU as they're sleeping on the semaphore, they're still using RAM for nothing (about 1MB per thread). Instead, have a single dedicated thread wait on the semaphore and enqueue new items as needed:
// Run this on a dedicated thread
foreach (Tenant tenant in tenants)
{
    _threadLimiter.WaitOne();
    Tenant current = tenant; // local copy so the closure captures this iteration's tenant
    ThreadPool.QueueUserWorkItem(_ =>
    {
        try
        {
            ProcessTenant(current);
        }
        finally
        {
            _threadLimiter.Release();
        }
    });
}
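To see the pattern above in a form you can run and measure, here is a self-contained sketch. The names ThrottledDispatcher and Run are hypothetical, and unlike the fire-and-forget original it waits for all items at the end (using a CountdownEvent, which requires .NET 4) purely so the observed peak concurrency can be returned and checked.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical wrapper around the "single dispatcher waits on the semaphore" pattern.
public static class ThrottledDispatcher
{
    // Queues work(item) on the thread pool for every item, never allowing more
    // than maxConcurrent items to run at once. Blocks the calling (dedicated)
    // thread until everything has finished and returns the peak concurrency seen.
    public static int Run<T>(IEnumerable<T> items, int maxConcurrent, Action<T> work)
    {
        int running = 0, peak = 0;
        object gate = new object();

        using (var limiter = new Semaphore(maxConcurrent, maxConcurrent))
        using (var allDone = new CountdownEvent(1)) // 1 = the dispatcher itself
        {
            foreach (T item in items)
            {
                limiter.WaitOne();      // only the dispatcher thread ever blocks here
                allDone.AddCount();
                T current = item;       // per-iteration copy for the closure
                ThreadPool.QueueUserWorkItem(_ =>
                {
                    lock (gate) { running++; if (running > peak) peak = running; }
                    try
                    {
                        work(current);
                    }
                    finally
                    {
                        lock (gate) { running--; }
                        limiter.Release();
                        allDone.Signal();
                    }
                });
            }
            allDone.Signal();           // the dispatcher is done queuing
            allDone.Wait();             // sketch-only: wait so we can report the peak
        }
        return peak;
    }
}
```

Called as ThrottledDispatcher.Run(tenants, 4, ProcessTenant) from a dedicated thread, pool threads are only ever occupied by items that are actually running; the waiting happens on the one dispatcher thread.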
I have some work (a job) that is in a queue (so there are several of them) and I want each job to be processed by a thread.
I was looking at Rx but this is not what I wanted, and then I came across the Task Parallel Library.
Since my work will be done in a web application I do not want the client to be waiting for each job to finish, so I have done the following:
public void FromWebClientRequest(int[] ids)
{
    // I will get the objects for the ids from a repository using a container (Unity)
    ThreadPool.QueueUserWorkItem(delegate
    {
        DoSomeWorkInParallel(ids, container);
    });
}
private static void DoSomeWorkInParallel(int[] ids, IUnityContainer container)
{
    Parallel.ForEach(ids, id =>
    {
        // Some work will be done here...
        // var repository = container.Resolve...
    });
    // Here all the work will be done.
    container.Resolve<ILogger>().Log("finished all work");
}
I would call the above code on a web request and then the client will not have to wait.
Is this the correct way to do this?
TIA
From the MSDN docs I see that the Resolve method of Unity's IUnityContainer is not thread safe (or at least thread safety is not documented). This would mean that you need to call it outside the parallel loop. Edit: changed to Task.
public void FromWebClientRequest(int[] ids)
{
IRepoType repoType = container.Resolve<IRepoType>();
ILogger logger = container.Resolve<ILogger>();
// remove LongRunning if your operations are not blocking (Ie. read file or download file long running queries etc)
// prefer fairness is here to try to complete first the requests that came first, so client are more likely to be able to be served "first come, first served" in case of high CPU use with lot of requests
Task.Factory.StartNew(() => DoSomeWorkInParallel(ids, repoType, logger), TaskCreationOptions.LongRunning | TaskCreationOptions.PreferFairness);
}
private static void DoSomeWorkInParallel(int[] ids, IRepoType repository, ILogger logger)
{
// if there are blocking operations inside this loop you ought to convert it to tasks with LongRunning
// why this? to force more threads as usually would be used to run the loop, and try to saturate cpu use, which would be doing nothing most of the time
// beware of doing this if you work on a non clustered database, since you can saturate it and have a bottleneck there, you should try and see how it handles your workload
Parallel.ForEach(ids, id=>{
// Some work will be done here...
// use repository
});
logger.Log("finished all work");
}
Plus as fiver stated, if you have .Net 4 then Tasks is the way to go.
Why go Task (question in comment):
If your method FromWebClientRequest were fired extremely often, you would fill the thread pool, and overall system performance would probably not be as good as with the finer-grained scheduling in .NET 4. This is where Task enters the game. A task is not its own thread; instead, the new .NET 4 thread pool creates enough threads to maximize performance on the system, so you do not need to worry about how many CPUs there are or how many thread context switches would occur.
Some MSDN quotes for ThreadPool:
"When all thread pool threads have been assigned to tasks, the thread pool does not immediately begin creating new idle threads. To avoid unnecessarily allocating stack space for threads, it creates new idle threads at intervals. The interval is currently half a second, although it could change in future versions of the .NET Framework."
"The thread pool has a default size of 250 worker threads per available processor"
"Unnecessarily increasing the number of idle threads can also cause performance problems. Stack space must be allocated for each thread. If too many tasks start at the same time, all of them might appear to be slow. Finding the right balance is a performance-tuning issue."
By using Tasks you discard those issues.
Another good thing is you can fine grain the type of operation to run. This is important if your tasks do run blocking operations. This is a case where more threads are to be allocated concurrently since they would mostly wait. ThreadPool cannot achieve this automagically:
Task.Factory.StartNew(() => DoSomeWork(), TaskCreationOptions.LongRunning);
And of course you are able to make it finish on demand without resorting to ManualResetEvent:
var task = Task.Factory.StartNew(() => DoSomeWork());
task.Wait();
Beside this you don't have to change the Parallel.ForEach if you don't expect exceptions or blocking, since it is part of the .Net 4 Task Parallel Library, and (often) works well and optimized on the .Net 4 pool as Tasks do.
However, if you do switch to Tasks instead of Parallel.ForEach, remove LongRunning from the caller Task, since Parallel.ForEach is a blocking operation and starting tasks (with fiver's loop) is not. But this way you lose the rough first-come-first-served ordering, or you have to apply it to many more Tasks (one spawned per id), which would probably give less correct behaviour. Another option is to wait on all tasks at the end of DoSomeWorkInParallel.
Another way is to use Tasks:
public static void FromWebClientRequest(int[] ids)
{
foreach (var id in ids)
{
Task.Factory.StartNew(i =>
{
Wl(i);
}
, id);
}
}
"I would call the above code on a web request and then the client will not have to wait."
This will work provided the client does not need an answer (like Ok/Fail).
"Is this the correct way to do this?"
Almost. You use Parallel.ForEach (TPL) for the jobs but run it from a 'plain' Threadpool job. Better to use a Task for the outer job as well.
Also, handle all exceptions in that outer Task. And be careful about the thread-safety of the container etc.
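A minimal sketch of those two points, assuming the per-id work and the logger are passed in as delegates: the outer job is a Task, and it catches the AggregateException that Parallel.ForEach throws when any iteration fails. BackgroundJobs, processOne, and log are hypothetical names introduced for illustration only.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: outer Task wrapping a Parallel.ForEach, with all
// exceptions handled inside the outer Task so nothing goes unobserved.
public static class BackgroundJobs
{
    public static Task Start(int[] ids, Action<int> processOne, Action<string> log)
    {
        return Task.Factory.StartNew(() =>
        {
            try
            {
                // Blocks this task (not the web request) until every id is processed.
                Parallel.ForEach(ids, processOne);
                log("finished all work");
            }
            catch (AggregateException ex)
            {
                // Parallel.ForEach collects the failures of individual iterations.
                foreach (Exception inner in ex.Flatten().InnerExceptions)
                {
                    log("job failed: " + inner.Message);
                }
            }
        });
    }
}
```

The web request would call BackgroundJobs.Start(...) and return immediately without waiting on the returned Task; resolving the repository and logger before the call keeps the container out of the parallel loop.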
I have a multithreaded application written in C#. My maximum thread count is 256, and the application reads the performance counters of the computers in an IP range (192.168.1.0 - 192.168.205.255).
It works fine and runs many times a day, because I have to produce reports.
But the problem is that sometimes one machine holds on to a thread and never finishes its work, so my loop doesn't move on...
Is there any way to create threads with a countdown parameter when I start the threads in the foreach?
foreach (Thread t in threads)
{
    t.Start(); // wanted: something like t.Start(countdownParameter)
}
The countdown parameter is the maximum lifetime of each thread. This means that if a thread can't reach a machine, it has to be aborted after, for example, 60 seconds. To clarify: not 256 machines, I meant 256 threads. There are about 5000 IPs and 600 of them are alive, so I am using 256 threads to read their values. The other thing is the loop: once all of the IPs have been processed, it starts again from the beginning.
You can't specify a timeout for thread execution. However, you can try to Join each thread with a timeout, and abort it if it doesn't exit.
foreach(Thread t in threads)
{
t.Start();
}
TimeSpan timeOut = TimeSpan.FromSeconds(10);
foreach(Thread t in threads)
{
if (!t.Join(timeOut))
{
// Still not complete after 10 seconds, abort
t.Abort();
}
}
There are of course more elegant ways to do it, like using WaitHandles with the WaitAll method (note that WaitAll is limited to 64 handles at a time on most implementations, and doesn't work on STA threads, like the UI thread).
You should not terminate the thread from the outside. (Never kill a thread, make it commit suicide). Killing a thread can easily corrupt the state of an appdomain if you're not very careful.
You should rewrite the network code in the threads to either time out once the time-limit has been reached, or use asynchronous network code.
Usually a thread gets stuck on a blocking call (unless of course you have a bug causing an infinite loop). You need to identify which call is blocking and "poke" it to get it to unblock. It could be that your thread is waiting inside one of the .NET BCL waiting calls (WaitHandle.WaitOne, etc.), in which case you could use Thread.Interrupt to unblock it. But in your case it is more likely that the API managing the communication with the remote computers is hung. Sometimes you can simply close the connection from a separate thread and that will unblock the hung method (as is the case with the Socket class). If all else fails then you really might have to fall back on the method of last resort: calling Thread.Abort. Just keep in mind that if you abort a thread it might corrupt the state of the app domain in which the abort originated or even the entire process itself. There were a lot of provisions added in .NET 2.0 that make aborts a lot safer than they were before, but there is still some risk.
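As an illustration of giving the network call its own time limit instead of killing the thread, here is a small sketch that bounds a TCP connect attempt. It assumes a reachability pre-check over a socket is useful in your setup (the question reads performance counters, which go through a different API), and the names NetProbe and TryConnect are hypothetical.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical reachability probe: the connect attempt itself is time-limited,
// so no thread ever has to be aborted from the outside.
public static class NetProbe
{
    // Returns true if a TCP connection to host:port succeeds within the timeout.
    public static bool TryConnect(string host, int port, TimeSpan timeout)
    {
        using (var client = new TcpClient())
        {
            IAsyncResult pending = client.BeginConnect(host, port, null, null);
            if (!pending.AsyncWaitHandle.WaitOne(timeout))
            {
                // Timed out: leaving the using block closes the socket,
                // which abandons the pending connect.
                return false;
            }
            try
            {
                client.EndConnect(pending);
                return true;
            }
            catch (SocketException)
            {
                return false; // refused, unreachable, etc.
            }
        }
    }
}
```

A worker could call NetProbe.TryConnect(ip, port, TimeSpan.FromSeconds(5)) first and skip machines that do not answer, so only responsive hosts reach the slower blocking call.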
You can use something like this:
public static T Exec<T>(Func<T> F, int Timeout, out bool Completed)
{
    T result = default(T);
    Thread thread = new Thread(() => result = F());
    thread.Start();
    Completed = thread.Join(Timeout);
    if (!Completed) thread.Abort();
    return result;
}
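For comparison, here is a sketch of the same helper built on a Task, which gives up waiting after the timeout but does not call Thread.Abort (Thread.Abort is not supported on .NET Core and later, where it throws PlatformNotSupportedException). The trade-off, stated plainly: a hung call keeps its thread until it returns on its own. TimedCall and TryExec are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical variant of Exec<T>: wait with a timeout, never abort.
public static class TimedCall
{
    // Returns true and sets result if f finished in time; false otherwise.
    // If f throws within the timeout, Wait rethrows it as an AggregateException.
    public static bool TryExec<T>(Func<T> f, TimeSpan timeout, out T result)
    {
        // LongRunning hints that this may block, so it gets a dedicated thread.
        Task<T> task = Task.Factory.StartNew<T>(f, TaskCreationOptions.LongRunning);
        if (task.Wait(timeout))
        {
            result = task.Result;
            return true;
        }
        // Abandon the task, but observe a late failure so it is not left unobserved.
        task.ContinueWith(
            t => { AggregateException ignored = t.Exception; },
            TaskContinuationOptions.OnlyOnFaulted);
        result = default(T);
        return false;
    }
}
```

Usage would look like: int value; bool ok = TimedCall.TryExec(() => ReadCounter(ip), TimeSpan.FromSeconds(60), out value); where ReadCounter stands in for whatever blocking call you make per machine.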
I am writing a program to crawl websites. The crawl function is recursive and may take a long time to complete, so I used multithreading to crawl multiple websites.
What I actually need is for the next website (which should be waiting in a queue) to be crawled only after the previous one has finished, instead of crawling multiple websites at the same time.
I am using C# and ASP.NET.
The standard practice for doing this is to use a blocking queue. If you are using .NET 4.0 then you can take advantage of the BlockingCollection class otherwise you can use Stephen Toub's implementation.
What you will do is spin up as many worker threads as you feel necessary and have them go around in an infinite loop dequeueing items as they appear in the queue. Your main thread will be enqueueing the item. A blocking queue is designed to wait/block on the dequeue operation until an item becomes available.
public class Program
{
private static BlockingQueue<string> m_Queue = new BlockingQueue<string>();
public static void Main()
{
var thread1 = new Thread(Process);
var thread2 = new Thread(Process);
thread1.Start();
thread2.Start();
while (true)
{
string url = GetNextUrl();
m_Queue.Enqueue(url);
}
}
public static void Process()
{
while (true)
{
string url = m_Queue.Dequeue();
// Do whatever with the url here.
}
}
}
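If you are on .NET 4, the same structure can be sketched with BlockingCollection<string> instead of a custom BlockingQueue. This version is finite rather than an endless loop so that it can terminate: the producer calls CompleteAdding and the workers drain the queue and exit. CrawlQueue, ProcessAll, and the crawl delegate are hypothetical names; with workerCount set to 1, sites are crawled strictly one after another.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch of the blocking-queue pattern using BlockingCollection<T>.
public static class CrawlQueue
{
    // Enqueues every URL, lets workerCount threads consume them, and returns
    // how many URLs were processed once the queue has been drained.
    public static int ProcessAll(IEnumerable<string> urls, int workerCount, Action<string> crawl)
    {
        int processed = 0;
        using (var queue = new BlockingCollection<string>())
        {
            var workers = new List<Thread>();
            for (int i = 0; i < workerCount; i++)
            {
                var worker = new Thread(() =>
                {
                    // Blocks while the queue is empty; ends after CompleteAdding
                    // once no items remain. An exception thrown by crawl would
                    // end this thread unhandled, so real code should catch it.
                    foreach (string url in queue.GetConsumingEnumerable())
                    {
                        crawl(url);
                        Interlocked.Increment(ref processed);
                    }
                });
                worker.IsBackground = true;
                worker.Start();
                workers.Add(worker);
            }

            foreach (string url in urls)
            {
                queue.Add(url);
            }
            queue.CompleteAdding(); // no more URLs will arrive

            foreach (Thread worker in workers)
            {
                worker.Join();
            }
        }
        return processed;
    }
}
```

Links discovered while crawling could be added to the same queue from inside crawl, as long as CompleteAdding is only called once nothing more can be added.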
I don't usually think positive thoughts when it comes to web crawlers...
You want to use a threadpool.
ThreadPool.QueueUserWorkItem(new WaitCallback(CrawlSite), (object)s);
You simply 'push' your workload into the queue, and let the threadpool manage it.
I have to say - I'm not a threading expert and my C# is quite rusty - but considering the requirements I would suggest something like this:
1. Define a Queue for the websites.
2. Define a Pool with Crawler threads.
3. The main process iterates over the website queue and retrieves the site address.
4. Retrieve an available thread from the pool - assign it the website address and allow it to start running. Set an indicator in the thread object that it should wait for all subsequent threads to finish (so you will not continue to the next site).
5. Once all the threads have ended - the main thread (started in step #4) will end and return to the main loop of the main process to continue to the next website.
The Crawler behavior should be something like this:
1. Investigate the content of the current address
2. Retrieve the hierarchy below the current level
3. For each child of the current node of the site tree - pull a new crawler thread from the pool and start it running in the background with the address of the child node
4. If the pool is empty, wait until a thread becomes available.
5. If the thread is marked to wait - wait for all the other threads to finish
I think there are some challenges here - but as a general flow I believe it can do the job.
Put all your url's in a queue, and pop one off the queue each time you are done with the previous one.
You could also put the recursive links in the queue, to better control how many downloads you are executing at a time.
You could set up X number of worker threads which all get a url off the queue in order to process more at a time. But this way you can throttle it yourself.
You can use ConcurrentQueue<T> in .Net to get a thread safe queue to work with.
Our scenario is a network scanner.
It connects to a set of hosts and scans them in parallel for a while using low priority background threads.
I want to be able to schedule lots of work but only have any given say ten or whatever number of hosts scanned in parallel. Even if I create my own threads, the many callbacks and other asynchronous goodness uses the ThreadPool and I end up running out of resources. I should look at MonoTorrent...
If I use THE ThreadPool, can I limit my application to some number that will leave enough for the rest of the application to Run smoothly?
Is there a threadpool that I can initialize to n long lived threads?
[Edit]
No one seems to have noticed that I made some comments on some responses so I will add a couple things here.
Threads should be cancellable both gracefully and forcefully.
Threads should have low priority leaving the GUI responsive.
Threads are long running but in Order(minutes) and not Order(days).
Work for a given target host is basically:
For each test
Probe target (work is done mostly on the target end of an SSH connection)
Compare probe result to expected result (work is done on engine machine)
Prepare results for host
Can someone explain why using SmartThreadPool is marked with a negative usefulness?
In .NET 4 you have the integrated Task Parallel Library. When you create a new Task (the new thread abstraction) you can specify a Task to be long running. We have had good experiences with that (long being days rather than minutes or hours).
You can use it in .NET 2 as well but there it's actually an extension, check here.
In VS2010 the Debugging Parallel applications based on Tasks (not threads) has been radically improved. It's advised to use Tasks whenever possible rather than raw threads. Since it lets you handle parallelism in a more object oriented friendly way.
UPDATE
Tasks that are NOT specified as long running, are queued into the thread pool (or any other scheduler for that matter).
But if a task is specified to be long running, it just creates a standalone Thread, no thread pool is involved.
The CLR ThreadPool isn't appropriate for executing long-running tasks: it's for performing short tasks where the cost of creating a thread would be nearly as high as executing the method itself. (Or at least a significant percentage of the time it takes to execute the method.) As you've seen, .NET itself consumes thread pool threads, you can't reserve a block of them for yourself lest you risk starving the runtime.
Scheduling, throttling, and cancelling work is a different matter. There's no other built-in .NET worker-queue thread pool, so you'll have to roll your own (managing the threads or BackgroundWorkers yourself) or find a preexisting one (Ami Bar's SmartThreadPool looks promising, though I haven't used it myself).
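As a rough idea of what "rolling your own" could look like on .NET 4, here is a minimal sketch of a pool with a fixed number of long-lived, low-priority background threads draining one queue, with graceful cancellation through a CancellationToken. FixedWorkerPool and its members are hypothetical names; forceful cancellation, pausing, and error reporting are deliberately left out.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical minimal pool: n dedicated threads, independent of the CLR ThreadPool.
public sealed class FixedWorkerPool : IDisposable
{
    private readonly BlockingCollection<Action<CancellationToken>> _work =
        new BlockingCollection<Action<CancellationToken>>();
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();
    private readonly Thread[] _threads;

    public FixedWorkerPool(int threadCount)
    {
        _threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            _threads[i] = new Thread(WorkerLoop)
            {
                IsBackground = true,                  // does not keep the process alive
                Priority = ThreadPriority.BelowNormal // leave the GUI responsive
            };
            _threads[i].Start();
        }
    }

    // Jobs receive the token and are expected to check it cooperatively.
    public void Enqueue(Action<CancellationToken> job)
    {
        _work.Add(job);
    }

    // Graceful cancel: running jobs see the token, queued jobs are never started.
    public void Cancel()
    {
        _cts.Cancel();
    }

    private void WorkerLoop()
    {
        try
        {
            foreach (Action<CancellationToken> job in _work.GetConsumingEnumerable(_cts.Token))
            {
                try { job(_cts.Token); }
                catch (OperationCanceledException) { /* job honoured the token */ }
                catch (Exception) { /* sketch only: real code would log or report this */ }
            }
        }
        catch (OperationCanceledException) { /* pool was cancelled while waiting */ }
    }

    // Stops accepting work, lets the workers drain the queue, then cleans up.
    public void Dispose()
    {
        _work.CompleteAdding();
        foreach (Thread t in _threads) t.Join();
        _work.Dispose();
        _cts.Dispose();
    }
}
```

With new FixedWorkerPool(10), at most ten hosts are scanned at any moment no matter how many are queued, and the callbacks the rest of the application relies on still have the regular ThreadPool to themselves.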
In your particular case, the best option would not be either threads or the thread pool or Background worker, but the async programming model (BeginXXX, EndXXX) provided by the framework.
The advantage of using the asynchronous model is that the TCP/IP stack uses callbacks whenever there is data to read, and the callback is automatically run on a thread from the thread pool.
Using the asynchronous model, you can control the number of requests per time interval initiated and also if you want you can initiate all the requests from a lower priority thread while processing the requests on a normal priority thread which means the packets will stay as little as possible in the internal Tcp Queue of the networking stack.
Asynchronous Client Socket Example - MSDN
P.S. For multiple concurrent and long running jobs that don't do a lot of computation but mostly wait on IO (network, disk, etc.), the better option is always to use a callback mechanism and not threads.
I'd create your own thread manager. In the following simple example a Queue is used to hold waiting threads and a Dictionary is used to hold active threads, keyed by ManagedThreadId. When a thread finishes, it removes itself from the active dictionary and launches another thread via a callback.
You can change the max running thread limit from your UI, and you can pass extra info to the ThreadDone callback for monitoring performance, etc. If a thread fails for say, a network timeout, you can reinsert back into the queue. Add extra control methods to Supervisor for pausing, stopping, etc.
using System;
using System.Collections.Generic;
using System.Threading;
namespace ConsoleApplication1
{
public delegate void CallbackDelegate(int idArg);
class Program
{
static void Main(string[] args)
{
new Supervisor().Run();
Console.WriteLine("Done");
Console.ReadKey();
}
}
class Supervisor
{
Queue<System.Threading.Thread> waitingThreads = new Queue<System.Threading.Thread>();
Dictionary<int, System.Threading.Thread> activeThreads = new Dictionary<int, System.Threading.Thread>();
int maxRunningThreads = 10;
object locker = new object();
volatile bool done;
public void Run()
{
// queue up some threads
for (int i = 0; i < 50; i++)
{
Thread newThread = new Thread(new Worker(ThreadDone).DoWork);
newThread.IsBackground = true;
waitingThreads.Enqueue(newThread);
}
LaunchWaitingThreads();
while (!done) Thread.Sleep(200);
}
// keep starting waiting threads until we max out
void LaunchWaitingThreads()
{
lock (locker)
{
while ((activeThreads.Count < maxRunningThreads) && (waitingThreads.Count > 0))
{
Thread nextThread = waitingThreads.Dequeue();
activeThreads.Add(nextThread.ManagedThreadId, nextThread);
nextThread.Start();
Console.WriteLine("Thread " + nextThread.ManagedThreadId.ToString() + " launched");
}
done = (activeThreads.Count == 0) && (waitingThreads.Count == 0);
}
}
// this is called by each thread when it's done
void ThreadDone(int threadIdArg)
{
lock (locker)
{
// remove thread from active pool
activeThreads.Remove(threadIdArg);
}
Console.WriteLine("Thread " + threadIdArg.ToString() + " finished");
LaunchWaitingThreads(); // this could instead be put in the wait loop at the end of Run()
}
}
class Worker
{
CallbackDelegate callback;
public Worker(CallbackDelegate callbackArg)
{
callback = callbackArg;
}
public void DoWork()
{
System.Threading.Thread.Sleep(new Random().Next(100, 1000));
callback(System.Threading.Thread.CurrentThread.ManagedThreadId);
}
}
}
Use the built-in threadpool. It has good capabilities.
Alternatively you can look at the Smart Thread Pool implementation here or at Extended Thread Pool for a limit on the maximum number of working threads.