C# Threading issue & best practices - c#

This is my first time using threading in a C# application. Basically, it's an application which checks whether each web site in a list is dead or alive.
Here is my first attempt at working with multi-threading:
public void StartThread(string URL,int no)
{
Thread newThread = new Thread(() =>
{
BeginInvoke(new Action(() => richTextBox1.Text += "Thread " + no + " Running" + Environment.NewLine));
bool b = ping(URL);
if (b == true) { BeginInvoke(new Action(() => richTextBox2.Text += "Okay" + Environment.NewLine)); }
else
{ return; }
});
newThread.Start();
}
I'm using the above function to create new threads and each thread is created inside a loop.
foreach (string site in website) {
StartThread(site,i);
i++; // Counter
}
Since I'm a beginner I have a few questions.
1) The code works fine, but I'm not sure if this is the best solution.
2) Sometimes the threads run perfectly, but sometimes a thread does not return any value from the method ping(), which checks the host using WebRequest and returns true if it is online. Is this usual?
3) If I ask the user to specify the number of threads to use, how can I distribute the work equally among those threads?
4) Is there an elegant way to track the status of a thread (dead / alive)? I currently use Process.GetCurrentProcess().Threads.Count;

Spinning up a new thread for each request is inefficient. You will probably want a fixed number of worker threads (so that one can run on each core of the CPU).
Have a look at the ConcurrentQueue<T> class, which gives you a thread-safe first-in-first-out queue: you enqueue your requests and let each worker thread dequeue a request, process it, and so on until the queue is empty.
Be aware that you may not access controls on your GUI from threads other than the main GUI thread. Have a look at the ISynchronizeInvoke interface, which can tell you whether a cross-thread call has to be marshalled to the GUI thread via Invoke.

1) Your solution is OK. The Thread class has been partially superseded by the Task class; if you're writing new code, you can use that. There is also something completely new in .NET 4.5, called await. However, see 4).
2) Your ping method might simply be crashing if the website is dead. You can show us the code of that method.
4) The Thread class is nice because you can easily check the thread state, as per your requirements, using the ThreadState property: just create a List<Thread>, put your threads in it, and then start them one by one.
3) If you want to load the number of threads from input and distribute the work evenly, put the tasks in a queue (you can use the ConcurrentQueue that has already been suggested) and have the threads load the URLs from the queue. Sample code:
you initialize everything (the queue and the thread list are fields, so that work() and the start loop can see them)
ConcurrentQueue<string> queue = new ConcurrentQueue<string>();
List<Thread> threads = new List<Thread>();
void initialize()
{
    foreach (string url in websites)
    {
        queue.Enqueue(url);
    }
    //and the threads
    for (int i = 0; i < threadCountFromTheUser; i++)
    {
        threads.Add(new Thread(work));
    }
}
//work method
void work()
{
    string url;
    // TryDequeue returns false once the queue is empty, so no separate IsEmpty check is needed
    while (queue.TryDequeue(out url))
    {
        ping(url);
    }
}
and then run
foreach (Thread t in threads)
{
    t.Start();
}
Code not tested
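Points 3) and 4) above can be combined in one small sketch. This is only an illustration under assumptions: UrlChecker, CheckAll and CountAlive are hypothetical names, and the check delegate stands in for the asker's ping(url) method, whose code is not shown. A failing check is treated as "dead" so that one exception cannot silently kill a worker.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class UrlChecker
{
    // Counts the threads that are still running (hypothetical helper).
    public static int CountAlive(IEnumerable<Thread> threads)
    {
        int n = 0;
        foreach (Thread t in threads)
        {
            if (t.IsAlive) n++;
        }
        return n;
    }

    // Runs check(url) for every url on workerCount threads and returns
    // how many urls reported true. 'check' stands in for ping(url).
    public static int CheckAll(IEnumerable<string> urls, int workerCount, Func<string, bool> check)
    {
        var queue = new ConcurrentQueue<string>(urls);
        var threads = new List<Thread>();
        int alive = 0;

        for (int i = 0; i < workerCount; i++)
        {
            var worker = new Thread(() =>
            {
                string url;
                // Each worker pulls urls until the shared queue is empty.
                while (queue.TryDequeue(out url))
                {
                    try
                    {
                        if (check(url)) Interlocked.Increment(ref alive);
                    }
                    catch (Exception)
                    {
                        // A throwing check counts as "dead"; the worker keeps going.
                    }
                }
            });
            worker.IsBackground = true;
            threads.Add(worker);
        }

        foreach (Thread t in threads) t.Start();

        // Status tracking: poll the threads we created ourselves.
        while (CountAlive(threads) > 0) Thread.Sleep(20);

        // Join returns immediately here; it also makes 'alive' safe to read.
        foreach (Thread t in threads) t.Join();
        return alive;
    }
}
```

A call such as UrlChecker.CheckAll(website, threadCountFromTheUser, ping) would then distribute the list evenly, because idle workers simply take the next URL.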

You should consider the .NET ThreadPool. However, it's generally unsuitable for work items that take more than about a second to execute.
See also:
Patterns for Parallel Programming: Understanding and Applying Parallel Patterns with the .NET Framework 4

Related

Ensure a Specific Thread runs (acquires a resource) next?

I have a function where I want to execute in a separate thread avoiding two threads to access the same resources. Also I want to make sure that if the thread is currently executing then stop that thread and start executing the new thread. This is what I have:
volatile int threadCount = 0; // use it to know the number of threads being executed
private void DoWork(string text, Action OncallbackDone)
{
threadCount++;
var t = new Thread(new ThreadStart(() =>
{
lock (_lock) // make sure that this code is only accessed by one thread
{
if (threadCount > 1) // if a new thread got in here return and let the last one execute
{
threadCount--;
return;
}
// do some work in here
Thread.Sleep(1000);
OncallbackDone();
threadCount--;
}
}));
t.Start();
}
If I fire that method 5 times, all the threads will wait for the lock until it is released. I want to make sure that the last thread is the one that executes, though. While the threads are waiting to own the lock, how can I determine which one will own it next? I want them to acquire the resource in the order in which I created the threads...
EDIT
I am not creating this application with .NET 4.0. Sorry for not mentioning what I was trying to accomplish. I am creating an autocomplete control where I am filtering a lot of data. I don't want the main window to freeze every time I filter results, and I want to filter results as the user types. If the user types 5 letters at once, I want to stop all threads, since I am only interested in the last one. Because the lock blocks all the threads, sometimes the last thread that I created may own the lock first.
I think you are overcomplicating this. If you are able to use 4.0, then just use the Task Parallel Library. With it, you can just set up a ContinueWith function so that threads that must happen in a certain order are done in the order you dictate. If this is NOT what you are looking for, then I actually would suggest that you not use threading, as this sounds like a synchronous action that you are trying to force into parallelism.
If you are just looking to cancel tasks: then here is a SO question on how to cancel TPL tasks. Why waste the resources if you are just going to dump them all except for the last one.
If you are not using 4.0, then you can accomplish the same thing with a Background Worker. It just takes more boilerplate code to accomplish the same thing :)
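For the .NET 4 route, cancelling everything except the newest request could look roughly like the sketch below. LatestOnlyRunner is a hypothetical name, and the work delegate is assumed to poll the token (for example with ThrowIfCancellationRequested) often enough to stop promptly. Disposal of the old CancellationTokenSource instances is left out for brevity.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public class LatestOnlyRunner
{
    private readonly object _gate = new object();
    private CancellationTokenSource _current;

    // Cancels the previous request (if any) and starts the new one.
    public Task<T> Run<T>(Func<CancellationToken, T> work)
    {
        var cts = new CancellationTokenSource();
        lock (_gate)
        {
            // Only the most recent request keeps an uncancelled token.
            if (_current != null) _current.Cancel();
            _current = cts;
        }

        CancellationToken token = cts.Token;
        // Passing the token to StartNew lets a not-yet-started task be dropped
        // and marks the task as Canceled when the work observes the token.
        return Task.Factory.StartNew<T>(() => work(token), token);
    }
}
```

The caller would typically attach a continuation that updates the UI only when the task ran to completion, so results of superseded requests are ignored.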
I agree with Justin in that you should use the .NET 4 Task Parallel Library. But if you want complete control you should not use the default Task Scheduler, which favors LIFO, but create your own Task Scheduler (http://msdn.microsoft.com/en-us/library/system.threading.tasks.taskscheduler.aspx) and implement the logic that you want to determine which task gets preference.
Using Threads directly is not recommended unless you have deep knowledge of .NET Threading. If you are on .NET 4.0; Tasks and TPL are preferred.
This is what I came up with after reading the links that you guys posted. I guess I needed a Queue therefore I implemented:
volatile int threadCount = 0;
private void GetPredicateAsync(string text, Action<object> DoneCallback)
{
threadCount++;
ThreadPool.QueueUserWorkItem((x) =>
{
lock (_lock)
{
if (threadCount > 1) // disable executing threads at same time
{
threadCount--;
return; // if a new thread is created exit.
// let the newer task do work!
}
// do work in here
Application.Current.Dispatcher.BeginInvoke(new Action(() =>
{
threadCount--;
DoneCallback(Foo);
}));
}
},text);
}
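Since incrementing a volatile int with ++ is not atomic, an alternative that also works before .NET 4 is to give every request a number and let a worker give up as soon as a newer number exists. The following is a minimal sketch of that idea; LatestRequestFilter and FilterWorker are hypothetical names, and ToUpperInvariant is only a placeholder for the real filtering.

```csharp
using System;
using System.Threading;

public class LatestRequestFilter
{
    private int _latest; // id of the most recently started request

    // Registers a new request and returns its id (atomic increment).
    public int Begin()
    {
        return Interlocked.Increment(ref _latest);
    }

    // True while no newer request has been registered.
    public bool IsCurrent(int id)
    {
        // CompareExchange with equal values is used as an atomic read.
        return Interlocked.CompareExchange(ref _latest, 0, 0) == id;
    }
}

public static class FilterWorker
{
    public static void FilterAsync(LatestRequestFilter filter, string text, Action<string> done)
    {
        int id = filter.Begin();
        ThreadPool.QueueUserWorkItem(delegate
        {
            if (!filter.IsCurrent(id)) return;        // superseded before it started
            string result = text.ToUpperInvariant();  // placeholder for the real filtering
            if (filter.IsCurrent(id)) done(result);   // publish only if still the newest
        });
    }
}
```

No lock is held while filtering, so a new keystroke never waits for an old request; long-running filters can call IsCurrent inside their loop to stop early.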

.Net: Background Worker and multiple CPU

I am using the BackgroundWorker to do some heavy stuff in the background so that the UI does not become unresponsive.
But today I noticed that when I run my program, only one of the two CPUs is being used.
Is there any way to use all CPUs with the BackgroundWorker?
Here is my simplified code, just if you are curious!
private System.ComponentModel.BackgroundWorker bwPatchApplier;
this.bwPatchApplier.WorkerReportsProgress = true;
this.bwPatchApplier.DoWork += new System.ComponentModel.DoWorkEventHandler(this.bwPatchApplier_DoWork);
this.bwPatchApplier.ProgressChanged += new System.ComponentModel.ProgressChangedEventHandler(this.bwPatchApplier_ProgressChanged);
this.bwPatchApplier.RunWorkerCompleted += new System.ComponentModel.RunWorkerCompletedEventHandler(this.bwPatchApplier_RunWorkerCompleted);
private void bwPatchApplier_DoWork(object sender, DoWorkEventArgs e)
{
string pc1WorkflowName;
string pc2WorkflowName;
if (!GetWorkflowSettings(out pc1WorkflowName, out pc2WorkflowName)) return;
int progressPercentage = 0;
var weWorkspaces = (List<WEWorkspace>) e.Argument;
foreach (WEWorkspace weWorkspace in weWorkspaces)
{
using (var spSite = new SPSite(weWorkspace.SiteId))
{
foreach (SPWeb web in spSite.AllWebs)
{
using (SPWeb spWeb = spSite.OpenWeb(web.ID))
{
PrintHeader(spWeb.ID, spWeb.Title, spWeb.Url, bwPatchApplier);
try
{
for (int index = 0; index < spWeb.Lists.Count; index++)
{
SPList spList = spWeb.Lists[index];
if (spList.Hidden) continue;
string listName = spList.Title;
if (listName.Equals("PC1") || listName.Equals("PC2"))
{
#region STEP 1
// STEP 1: Remove Workflow
#endregion
#region STEP 2
// STEP 2: Add Events: Adding & Updating
#endregion
}
if ((uint) spList.BaseTemplate == 10135 || (uint) spList.BaseTemplate == 10134)
{
#region STEP 3
// STEP 3: Configure Custom AssignedToEmail Property
#endregion
#region STEP 4
if (enableAssignToEmail)
{
// STEP 4: Install AssignedTo events to Work lists
}
#endregion
}
#region STEP 5
// STEP 5 Install Notification Events
#endregion
#region STEP 6
// STEP 6 Install Report List Events
#endregion
progressPercentage += TotalSteps;
UpdatePercentage(progressPercentage, bwPatchApplier);
}
}
catch (Exception exception)
{
progressPercentage += TotalSteps;
UpdatePercentage(progressPercentage, bwPatchApplier);
}
}
}
}
}
PrintMessage(string.Empty, bwPatchApplier);
PrintMessage("*** Process Completed", bwPatchApplier);
UpdateStatus("Process Completed", bwPatchApplier);
}
Thanks a lot for looking into this :)
The BackgroundWorker does its work within a single background (ThreadPool) thread. As such, if it's computationally heavy, it'll use one CPU heavily. The UI thread is still running on the second, but is probably (like most user interface work) spending almost all of its time idle waiting for input (which is a good thing).
If you want to split your work up to use more than one CPU, you'll need to use some other techniques. This could be multiple BackgroundWorker components, each doing some work, or using the ThreadPool directly. Parallel programming has been simplified in .NET 4 via the TPL, which is likely a very good option. For details, you can see my series on the TPL or MSDN's page on the Task Parallel Library.
Each BackgroundWorker uses only a single thread to do the stuff you tell it to do. To take advantage of multiple cores, you would need multiple threads. That would mean either multiple BackgroundWorkers or spawning multiple threads from within your DoWork method.
The BackgroundWorker, by itself, only provides one additional thread of execution. Its purpose is to get things off the UI thread, and it's very good at that job. If you want more threads, you need to provide them yourself.
It would be tempting here to build a method that accepts an SPWeb argument, and just call Thread.Start() over and over for each object; then finish with Thread.Join() or WaitAll() to wait for them to finish at the end of the BackgroundWorker. However, this would be a bad idea because you'll lose efficiency as the operating system spends time performing context switches among all the threads.
Instead, you want to force your system to run in only a few threads, but at least two (in this case). A good rule of thumb is (2n - 1), where "n" is the number of processor cores you have... but there are all kinds of cases where you want to break this rule. You can implement this by using a ThreadPool, by iterating over your SPWeb objects and adding them to a queue that you keep pulling from, or other means such as the TPL.
The BackgroundWorker is running a new thread on the second CPU core, leaving the UI responsive.
If you're using .NET 4, look into using the Task Parallel Library, which could give you better results and utilize both cores.
The BackgroundWorker itself is only creating a single thread apart from your main UI to do work in - it's not trying to parallelize the operations within that work thread. If you want to spread your work across multiple work threads you should look into using the TPL. Bear in mind that not all tasks translate well to parallel execution, so if freeing the UI is your only goal this may already be the best you can do.
There are potential pitfalls to this, but you might get some mileage out of utilizing Parallel.ForEach:
Instead of
foreach (SPWeb web in spSite.AllWebs)
{
//Your loop code here
}
You could:
Parallel.ForEach(spSite.AllWebs, web =>
{
//Your loop code here
});
This basically turns the loop iterations into tasks (from the Task API in .NET 4.0) and schedules that work on the thread pool, which will give you some of the parallelism you need to take advantage of those cores.
You will have to fix the concurrency problems that arise from this, but it's a good starting point. At the very least you will need to deal with the state that is shared across threads (the progress counter). Here's some guidance on that: http://msdn.microsoft.com/en-us/library/dd997392.aspx
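The shared progress counter mentioned above can be made safe with Interlocked. The sketch below is an assumption-laden illustration: ParallelProgress and ProcessAll are hypothetical names, process stands in for the per-web work, and reportProgress stands in for UpdatePercentage. Whether the SharePoint objects themselves may be used from several threads is a separate question that this sketch does not answer.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelProgress
{
    // Processes all items in parallel and returns the total progress.
    // reportProgress receives the running total after each finished item.
    public static int ProcessAll<T>(IEnumerable<T> items, int stepsPerItem,
                                    Action<T> process, Action<int> reportProgress)
    {
        int progress = 0;
        Parallel.ForEach(items, item =>
        {
            try
            {
                process(item);
            }
            catch (Exception)
            {
                // As in the original loop, a failed item still counts as done.
            }

            // Atomic add: no update is lost when several threads finish together.
            int current = Interlocked.Add(ref progress, stepsPerItem);
            reportProgress(current);
        });
        return progress;
    }
}
```

Note that reportProgress is called from worker threads, possibly out of order, so a UI should display the largest value it has seen.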

How to guarantee a new thread is created when using the Task.StartNew method

From what I can tell I have misleading bits of information. I need to have a separate thread running in the background.
At the moment I do it like this:
var task = Task.Factory.StartNew
(CheckFiles
, cancelCheckFile.Token
, TaskCreationOptions.LongRunning
, TaskScheduler.Default);//Check for files on another thread
private void CheckFiles()
{
while (!cancelCheckFile.Token.IsCancellationRequested)
{
//do stuff
}
}
This always creates a new thread for me. However, after several discussions I understand that even marking it as LongRunning doesn't guarantee that a new thread will be created.
In the past I have done something like this:
thQueueChecker = new Thread(new ThreadStart(CheckQueue));
thQueueChecker.IsBackground = true;
thQueueChecker.Name = "CheckQueues" + DateTime.Now.Ticks.ToString();
thQueueChecker.Start();
private void CheckQueue()
{
while (!ProgramEnding)
{
//do stuff
}
}
Would you recommend that I go back to this approach to guarantee a new thread is used?
The default task scheduler, ThreadPoolTaskScheduler, does indeed always create a new thread for a long-running task. It does not use the thread pool, as you can see below. It is no different from the manual approach of creating the thread yourself. In theory the scheduler of .NET 4.5 could do something different, but in practice it is unlikely to change.
protected internal override void QueueTask(Task task)
{
if ((task.Options & TaskCreationOptions.LongRunning) != TaskCreationOptions.None)
{
new Thread(s_longRunningThreadWork) { IsBackground = true }.Start(task);
}
else
{
bool forceGlobal =
(task.Options & TaskCreationOptions.PreferFairness) != TaskCreationOptions.None;
ThreadPool.UnsafeQueueCustomWorkItem(task, forceGlobal);
}
}
It depends on the scheduler you use. There are two stock implementations, ThreadPoolTaskScheduler and SynchronizationContextTaskScheduler. The latter, which is used by the FromCurrentSynchronizationContext() method, doesn't start a thread at all.
The ThreadPoolTaskScheduler is what you get by default. It does honor the LongRunning option: it will use a regular Thread if the option is set, which is important to avoid starving other thread-pool (TP) threads. Without the option you'll get a TP thread. These are implementation details, subject to change without notice, although I'd consider it unlikely anytime soon.
LongRunning is just a hint to the scheduler - if you absolutely must always have a new Thread, you will have to create one.
You'll have to specify why you "always need a separate thread".
void Main()
{
var task = Task.Factory.StartNew(CheckFiles,
cancelCheckFile.Token,
TaskCreationOptions.LongRunning,
TaskScheduler.Default);
task.Wait();
}
A smart scheduler will use 1 thread here. Why shouldn't it?
But in general the CheckFiles() method will be executed on another (than the calling) thread. The issue is whether that thread is especially created or whether it might even be executed on several threads (in succession).
When you are using Tasks you give up control over the Thread. And that should be a good thing.
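Whether a given task actually ran on a pool thread can be observed rather than guessed. The following probe is a small sketch (ThreadKindProbe is a hypothetical name); it reports what the current runtime did and proves nothing about future versions, since the behaviour is an implementation detail as noted above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadKindProbe
{
    // Starts a task with the given options on the default scheduler and
    // reports whether its body ran on a thread-pool thread.
    public static bool RanOnPoolThread(TaskCreationOptions options)
    {
        bool onPool = false;
        using (var finished = new ManualResetEventSlim(false))
        {
            Task task = Task.Factory.StartNew(() =>
            {
                onPool = Thread.CurrentThread.IsThreadPoolThread;
                finished.Set();
            }, CancellationToken.None, options, TaskScheduler.Default);

            // Wait on the event first: calling task.Wait() on a task that has
            // not started yet may run it inline on the calling thread.
            finished.Wait();
            task.Wait(); // the body has already run; this only waits for it to finish
        }
        return onPool;
    }
}
```

Calling RanOnPoolThread(TaskCreationOptions.None) and RanOnPoolThread(TaskCreationOptions.LongRunning) side by side shows how the scheduler in use treats the hint.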

What is the most efficient method for assigning threads based on the following scenario?

I can have a maximum of 5 threads running simultaneous at any one time which makes use of 5 separate hardware to speedup the computation of some complex calculations and return the result. The API (contains only one method) for each of this hardware is not thread safe and can only run on a single thread at any point in time. Once the computation is completed, the same thread can be re-used to start another computation on either the same or a different hardware depending on availability. Each computation is stand alone and does not depend on the results of the other computation. Hence, up to 5 threads may complete its execution in any order.
What is the most efficient C# (using .Net Framework 2.0) coding solution for keeping track of which hardware is free/available and assigning a thread to the appropriate hardware API for performing the computation? Note that other than the limitation of 5 concurrently running threads, I do not have any control over when or how the threads are fired.
Please correct me if I am wrong but a lock free solution is preferred as I believe it will result in increased efficiency and a more scalable solution.
Also note that this is not homework although it may sound like it...
.NET provides a thread pool that you can use. System.Threading.ThreadPool.QueueUserWorkItem() tells a thread in the pool to do some work for you.
Were I designing this, I'd not focus on mapping threads to your HW resources. Instead I'd expose a lockable object for each HW resource - this can simply be an array or queue of 5 Objects. Then for each bit of computation you have, call QueueUserWorkItem(). Inside the method you pass to QUWI, find the next available lockable object and lock it (aka, dequeue it). Use the HW resource, then re-enqueue the object, exit the QUWI method.
It won't matter how many times you call QUWI; there can be at most 5 locks held, each lock guards access to one instance of your special hardware device.
The doc page for Monitor.Enter() shows how to create a safe (blocking) Queue that can be accessed by multiple workers. In .NET 4.0, you would use the builtin BlockingCollection - it's the same thing.
That's basically what you want. Except don't create the threads yourself with new Thread(...); use the thread pool.
cite: Advantage of using Thread.Start vs QueueUserWorkItem
// assume the SafeQueue class from the cited doc page.
SafeQueue<SpecialHardware> q = new SafeQueue<SpecialHardware>();
// set up the queue with objects protecting the 5 magic stones
private void Setup()
{
for (int i=0; i< 5; i++)
{
q.Enqueue(GetInstanceOfSpecialHardware(i));
}
}
// something like this gets called many times, by QueueUserWorkItem()
public void DoWork(WorkDescription d)
{
d.DoPrepWork();
// gain access to one of the special hardware devices
SpecialHardware shw = q.Dequeue();
try
{
shw.DoTheMagicThing();
}
finally
{
// ensure no matter what happens the HW device is released
q.Enqueue(shw);
// at this point another worker can use it.
}
d.DoFollowupWork();
}
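On .NET 4.0 or later, the same "check out a device, use it, hand it back" pattern can be sketched with BlockingCollection instead of a hand-written SafeQueue. ResourcePool is a hypothetical name, and T stands for whatever object wraps one hardware device.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

public sealed class ResourcePool<T>
{
    // Holds the resources that are currently free.
    private readonly BlockingCollection<T> _free = new BlockingCollection<T>();

    public ResourcePool(IEnumerable<T> resources)
    {
        foreach (T resource in resources) _free.Add(resource);
    }

    // Blocks until a resource is free, runs 'use' with exclusive access
    // to it, and always returns the resource to the pool afterwards.
    public TResult Use<TResult>(Func<T, TResult> use)
    {
        T resource = _free.Take();
        try
        {
            return use(resource);
        }
        finally
        {
            _free.Add(resource);
        }
    }
}
```

Because a device is only ever held by the caller that took it, the non-thread-safe hardware API is never entered by two threads at once, however many work items are queued.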
A lock free solution is only beneficial if the computation time is very small.
I would create a facade for each hardware thread where jobs are enqueued and a callback is invoked each time a job finishes.
Something like:
public class Job
{
public string JobInfo {get;set;}
public Action<Job> Callback {get;set;}
}
public class MyHardwareService
{
    private readonly Queue<Job> _jobs = new Queue<Job>();
    private readonly Thread _hardwareThread;
    private readonly ManualResetEvent _event = new ManualResetEvent(false);

    public MyHardwareService()
    {
        _hardwareThread = new Thread(WorkerFunc);
        _hardwareThread.IsBackground = true;
        _hardwareThread.Start();
    }

    public void Enqueue(Job job)
    {
        lock (_jobs)
        {
            _jobs.Enqueue(job);
            _event.Set(); // signal under the lock so it cannot race with Reset()
        }
    }

    private void WorkerFunc()
    {
        while (true)
        {
            _event.WaitOne();
            Job currentJob;
            lock (_jobs)
            {
                if (_jobs.Count == 0)
                {
                    // nothing left: go back to waiting
                    _event.Reset();
                    continue;
                }
                currentJob = _jobs.Dequeue();
            }
            //invoke hardware here.
            //trigger callback in a Thread Pool thread to be able
            // to continue with the next job ASAP
            ThreadPool.QueueUserWorkItem(_ => currentJob.Callback(currentJob));
        }
    }
}
Sounds like you need a thread pool with 5 threads where each one relinquishes the HW once it's done and adds it back to some queue. Would that work? If so, .Net makes thread pools very easy.
Sounds a lot like the Sleeping barber problem. I believe the standard solution to that is to use semaphores
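A semaphore-based variant that also works on .NET 2.0 might look like the sketch below. DeviceGate is a hypothetical name and devices are represented only by their index. The semaphore counts how many devices are free; the small locked stack records which ones.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class DeviceGate
{
    private readonly Semaphore _slots;                        // number of free devices
    private readonly Stack<int> _freeIds = new Stack<int>();  // which devices are free

    public DeviceGate(int deviceCount)
    {
        _slots = new Semaphore(deviceCount, deviceCount);
        for (int i = 0; i < deviceCount; i++) _freeIds.Push(i);
    }

    // Blocks until a device is free, then runs 'job' with that device's id.
    public void Run(Action<int> job)
    {
        _slots.WaitOne();
        int id;
        lock (_freeIds)
        {
            // The semaphore guarantees the stack is not empty here.
            id = _freeIds.Pop();
        }
        try
        {
            job(id);
        }
        finally
        {
            // Return the id before releasing the slot, so the next caller finds it.
            lock (_freeIds) { _freeIds.Push(id); }
            _slots.Release();
        }
    }
}
```

The lock is held only for a push or pop, so its cost is negligible next to a computation that is long enough to be worth offloading to hardware.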

In .NET is there a thread scheduler for long running threads?

Our scenario is a network scanner.
It connects to a set of hosts and scans them in parallel for a while using low priority background threads.
I want to be able to schedule lots of work but only have any given say ten or whatever number of hosts scanned in parallel. Even if I create my own threads, the many callbacks and other asynchronous goodness uses the ThreadPool and I end up running out of resources. I should look at MonoTorrent...
If I use THE ThreadPool, can I limit my application to some number that will leave enough for the rest of the application to Run smoothly?
Is there a threadpool that I can initialize to n long lived threads?
[Edit]
No one seems to have noticed that I made some comments on some responses so I will add a couple things here.
Threads should be cancellable both gracefully and forcefully.
Threads should have low priority leaving the GUI responsive.
Threads are long running but in Order(minutes) and not Order(days).
Work for a given target host is basically:
For each test
Probe target (work is done mostly on the target end of an SSH connection)
Compare probe result to expected result (work is done on engine machine)
Prepare results for host
Can someone explain why using SmartThreadPool is marked with a negative usefulness?
In .NET 4 you have the integrated Task Parallel Library. When you create a new Task (the new thread abstraction) you can specify that it is long running. We have had good experiences with that (long being days rather than minutes or hours).
You can use it in .NET 2 as well but there it's actually an extension, check here.
In VS2010, debugging parallel applications based on Tasks (not threads) has been radically improved. It's advised to use Tasks whenever possible rather than raw threads, since they let you handle parallelism in a more object-oriented way.
UPDATE
Tasks that are NOT specified as long running, are queued into the thread pool (or any other scheduler for that matter).
But if a task is specified to be long running, it just creates a standalone Thread, no thread pool is involved.
The CLR ThreadPool isn't appropriate for executing long-running tasks: it's for performing short tasks where the cost of creating a thread would be nearly as high as executing the method itself. (Or at least a significant percentage of the time it takes to execute the method.) As you've seen, .NET itself consumes thread pool threads; you can't reserve a block of them for yourself without risking starving the runtime.
Scheduling, throttling, and cancelling work is a different matter. There's no other built-in .NET worker-queue thread pool, so you'll have to roll your own (managing the threads or BackgroundWorkers yourself) or find a preexisting one (Ami Bar's SmartThreadPool looks promising, though I haven't used it myself).
In your particular case, the best option would not be either threads or the thread pool or Background worker, but the async programming model (BeginXXX, EndXXX) provided by the framework.
The advantage of using the asynchronous model is that the TCP/IP stack uses callbacks whenever there is data to read, and the callback is automatically run on a thread from the thread pool.
Using the asynchronous model, you can control the number of requests per time interval initiated and also if you want you can initiate all the requests from a lower priority thread while processing the requests on a normal priority thread which means the packets will stay as little as possible in the internal Tcp Queue of the networking stack.
Asynchronous Client Socket Example - MSDN
P.S. For multiple concurrent and long-running jobs that don't do a lot of computation but mostly wait on IO (network, disk, etc.), the better option is always to use a callback mechanism and not threads.
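The answer above refers to the BeginXXX/EndXXX model; on .NET 4.5 or later the same idea is usually written with async/await. The sketch below swaps in that newer style to show the throttling part only: at most maxParallel probes are in flight, and no thread is blocked while a probe waits on IO. Throttle, ForEachAsync and the probe delegate are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class Throttle
{
    // Runs probe(host) for every host with at most maxParallel probes in flight.
    // Results are returned in the same order as the hosts.
    public static async Task<TResult[]> ForEachAsync<T, TResult>(
        IEnumerable<T> hosts, int maxParallel, Func<T, Task<TResult>> probe)
    {
        using (var slots = new SemaphoreSlim(maxParallel))
        {
            var running = new List<Task<TResult>>();
            foreach (T host in hosts)
            {
                // Asynchronously wait for a free slot before starting the next probe.
                await slots.WaitAsync();
                running.Add(RunOne(host, probe, slots));
            }
            // All probes have released their slot by the time this completes.
            return await Task.WhenAll(running);
        }
    }

    private static async Task<TResult> RunOne<T, TResult>(
        T host, Func<T, Task<TResult>> probe, SemaphoreSlim slots)
    {
        try
        {
            return await probe(host);
        }
        finally
        {
            slots.Release();
        }
    }
}
```

Graceful cancellation could be added by passing a CancellationToken through to WaitAsync and to the probe; that is omitted here to keep the sketch short.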
I'd create your own thread manager. In the following simple example a Queue is used to hold waiting threads and a Dictionary is used to hold active threads, keyed by ManagedThreadId. When a thread finishes, it removes itself from the active dictionary and launches another thread via a callback.
You can change the max running thread limit from your UI, and you can pass extra info to the ThreadDone callback for monitoring performance, etc. If a thread fails for say, a network timeout, you can reinsert back into the queue. Add extra control methods to Supervisor for pausing, stopping, etc.
using System;
using System.Collections.Generic;
using System.Threading;
namespace ConsoleApplication1
{
public delegate void CallbackDelegate(int idArg);
class Program
{
static void Main(string[] args)
{
new Supervisor().Run();
Console.WriteLine("Done");
Console.ReadKey();
}
}
class Supervisor
{
Queue<System.Threading.Thread> waitingThreads = new Queue<System.Threading.Thread>();
Dictionary<int, System.Threading.Thread> activeThreads = new Dictionary<int, System.Threading.Thread>();
int maxRunningThreads = 10;
object locker = new object();
volatile bool done;
public void Run()
{
// queue up some threads
for (int i = 0; i < 50; i++)
{
Thread newThread = new Thread(new Worker(ThreadDone).DoWork);
newThread.IsBackground = true;
waitingThreads.Enqueue(newThread);
}
LaunchWaitingThreads();
while (!done) Thread.Sleep(200);
}
// keep starting waiting threads until we max out
void LaunchWaitingThreads()
{
lock (locker)
{
while ((activeThreads.Count < maxRunningThreads) && (waitingThreads.Count > 0))
{
Thread nextThread = waitingThreads.Dequeue();
activeThreads.Add(nextThread.ManagedThreadId, nextThread);
nextThread.Start();
Console.WriteLine("Thread " + nextThread.ManagedThreadId.ToString() + " launched");
}
done = (activeThreads.Count == 0) && (waitingThreads.Count == 0);
}
}
// this is called by each thread when it's done
void ThreadDone(int threadIdArg)
{
lock (locker)
{
// remove thread from active pool
activeThreads.Remove(threadIdArg);
}
Console.WriteLine("Thread " + threadIdArg.ToString() + " finished");
LaunchWaitingThreads(); // this could instead be put in the wait loop at the end of Run()
}
}
class Worker
{
CallbackDelegate callback;
public Worker(CallbackDelegate callbackArg)
{
callback = callbackArg;
}
public void DoWork()
{
System.Threading.Thread.Sleep(new Random().Next(100, 1000));
callback(System.Threading.Thread.CurrentThread.ManagedThreadId);
}
}
}
Use the built-in threadpool. It has good capabilities.
Alternatively you can look at the Smart Thread Pool implementation here or at Extended Thread Pool for a limit on the maximum number of working threads.
