I have a monte-carlo simulation running across multiple threads with a progress bar to inform the user how it's going. The progress bar management is done in a separate thread using Invoke, but the Form is not updating.
Here is my code:
Thread reportingThread = new Thread(() => UpdateProgress(iSims, ref myBag));
reportingThread.Priority = ThreadPriority.AboveNormal;
reportingThread.Start();
and here is the function being called:
private void UpdateProgress(int iSims, ref ConcurrentBag<simResult> myBag)
{
int iCount;
string sText;
if (myBag == null)
iCount = 0;
else
iCount = myBag.Count;
while (iCount < iSims)
{
if (this.Msg.InvokeRequired)
{
sText = iCount.ToString() + " simulations of " + iSims + " completed.";
this.Msg.BeginInvoke((MethodInvoker) delegate() { this.Msg.Text = sText; this.Refresh(); });
}
Thread.Sleep(1000);
iCount = myBag.Count;
}
}
I have tried both Application.DoEvents() and this.Refresh() to force the form to update, but nothing happens.
UPDATE: Here is the procedure calling the above function
private void ProcessLeases(Boolean bValuePremium)
{
int iSims, iNumMonths, iNumYears, iIndex, iNumCores, iSimRef;
int iNumSimsPerThread, iThread, iAssets, iPriorityLevel;
string sMsg;
DateTime dtStart, dtEnd;
TimeSpan span;
var threads = new List<Thread>();
ConcurrentBag<simResult> myBag = new ConcurrentBag<simResult>();
ConcurrentBag<summaryResult> summBag = new ConcurrentBag<summaryResult>();
this.Msg.Text = "Updating all settings";
Application.DoEvents();
ShowProgressPanel();
iSims = objSettings.getSimulations();
iNumCores = Environment.ProcessorCount;
this.Msg.Text = "Initialising model";
Application.DoEvents();
iNumSimsPerThread = Convert.ToInt16(Math.Round(Convert.ToDouble(iSims) / Convert.ToDouble(iNumCores), 0));
this.Msg.Text = "Spawning " + iNumCores.ToString() + " threads";
for (iThread = 0; iThread < iNumCores; iThread++)
{
int iStart, iEnd;
if (iThread == 0)
{
iStart = (iThread * iNumSimsPerThread) + 1;
iEnd = ((iThread + 1) * iNumSimsPerThread);
}
else
{
if (iThread < (iNumCores - 1))
{
iStart = (iThread * iNumSimsPerThread) + 1;
iEnd = ((iThread + 1) * iNumSimsPerThread);
}
else
{
iStart = (iThread * iNumSimsPerThread) + 1;
iEnd = iSims;
}
}
Thread thread = new Thread(() => ProcessParallelMonteCarloTasks(iStart, iEnd, iNumMonths, iSimRef, iSims, ref objDB, iIndex, ref objSettings, ref myBag, ref summBag));
switch (iPriorityLevel)
{
case 1: thread.Priority = ThreadPriority.Highest; break;
case 2: thread.Priority = ThreadPriority.AboveNormal; break;
default: thread.Priority = ThreadPriority.Normal; break;
}
thread.Start();
threads.Add(thread);
}
// Now start the thread to aggregate the MC results
Thread MCThread = new Thread(() => objPortfolio.MCAggregateThread(ref summBag, (iSims * iAssets), iNumMonths));
MCThread.Priority = ThreadPriority.AboveNormal;
MCThread.Start();
threads.Add(MCThread);
// Here we review the CollectionBag size to report progress to the user
Thread reportingThread = new Thread(() => UpdateProgress(iSims, ref myBag));
reportingThread.Priority = ThreadPriority.AboveNormal;
reportingThread.Start();
// Wait for all threads to complete
//this.Msg.Text = iNumCores.ToString() + " Threads running.";
foreach (var thread in threads)
thread.Join();
reportingThread.Abort();
this.Msg.Text = "Aggregating results";
Application.DoEvents();
this.Msg.Text = "Preparing Results";
Application.DoEvents();
ShowResults();
ShowResultsPanel();
}
As you can see, there are a number of updates to the Form before my Invoked call and they all work fine - in each case, I am using Application.DoEvents() to update.
myBag is a ConcurrentBag into which each Monte Carlo thread dumps its results. By reading its Count property, I can see how many simulations have completed and update the user.
foreach (var thread in threads)
thread.Join();
This is your problem. You are blocking here so nothing will ever update in the UI thread until all your threads complete.
This is a critical point - .DoEvents() happens naturally and all by itself every time a block of code you have attached to a user interface event handler completes executing. One of your primary responsibilities as a developer is to make sure that any code executing in a user interface event handler completes in a timely manner (a few hundred milliseconds, maximum). If you write your code this way then there is never, ever, a need to call DoEvents()... ever.
You should always write your code this way.
Aside from performance benefits, a major plus of using threads is that they are asynchronous by nature - to take advantage of this you have to write your code accordingly. Breaking out of procedural habits is a hard one. What you need to do is to forget the .Join altogether and get out of your ProcessLeases method here - let the UI have control again.
You are dealing with updates in your threads already, so all you need is a completion notification that lets you pick up in a new handler when all of your threads finish their work. You'll need to keep track of your threads: have each of them notify on completion (i.e. invoke some delegate back on the UI thread), and in whatever method handles that notification you would do something like
if (allThreadsAreFinished) // <-- You'll need to implement something here
{
reportingThread.Abort();
this.Msg.Text = "Preparing Results";
ShowResults();
ShowResultsPanel();
}
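One way to fill in the allThreadsAreFinished check is a shared countdown that every worker decrements as it exits. This is only a minimal sketch: CompletionTracker, RunAll and onAllFinished are hypothetical names, not part of the original code, and in a WinForms app the callback would still need to marshal back to the UI thread (for example with BeginInvoke) before touching any control.

```csharp
using System;
using System.Threading;

// Hypothetical helper: runs each job on its own thread and calls
// onAllFinished exactly once, from whichever worker finishes last.
public static class CompletionTracker
{
    public static void RunAll(Action[] jobs, Action onAllFinished)
    {
        if (jobs.Length == 0) { onAllFinished(); return; }

        int remaining = jobs.Length;
        foreach (Action job in jobs)
        {
            Action work = job; // per-iteration copy for the lambda
            var thread = new Thread(() =>
            {
                try { work(); }
                finally
                {
                    // Atomic decrement: only one thread can observe zero.
                    if (Interlocked.Decrement(ref remaining) == 0)
                        onAllFinished();
                }
            });
            thread.IsBackground = true;
            thread.Start();
        }
    }
}
```

RunAll returns immediately, so the caller (the UI thread in the question) is free again as soon as the workers have been started.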
Alternatively, you could also simply call ProcessLeases in a background thread (making sure to correctly invoke all of your calls within it) and then it wouldn't matter that you are blocking that thread with a .Join. You could also then do away with all of the messy calls to .DoEvents().
Additionally, you don't need the call to this.Refresh(); here :
this.Msg.BeginInvoke((MethodInvoker) delegate() {
this.Msg.Text = sText;
this.Refresh();
});
If you aren't blocking the UI thread the control will update just fine without it and you'll only add extra work for nothing. If you are blocking the UI thread then adding the .Refresh() call won't help because the UI thread won't be free to execute it any more than it will be free to execute the previous line. This is programming chaotically - randomly adding code hoping that it will work instead of examining and understanding the reasons why it doesn't.
Chapter 2 : The Workplace Analogy.
Imagine the UI thread is like the manager. The manager can delegate a task in several ways. Using .Join as you've done it is a bit like the manager giving everyone a job to do - Joe gets one job, Lucy gets another, Bill gets a third, and Sara gets a fourth. The manager has follow-up work to do once everyone is done so he comes up with a plan to get it done as soon as possible.
Immediately after giving everyone their task, the manager goes and sits at Joe's desk and stares at him, doing nothing, until Joe is done. When Joe finishes, he moves to Lucy's desk to check if she is done. If she isn't he waits there until Lucy finishes, then moves to Bill's desk and stares at him until he is done... then moves to Sara's desk.
Clearly this isn't productive. Furthermore, each of the four team members has been sending email status updates (Manager.BeginInvoke -> read your email!) to their manager but he hasn't read any of them because he has been spending all of his time sitting at their desks, staring at them, waiting for them to finish their tasks. He hasn't done anything else, for that matter, either. The bosses have been asking what's going on, his phone's been ringing, nobody has updated the weekly financials - nothing. The manager hasn't been able to do anything else because he decided that he needed to sit on his bottom and watch his team work until they finished.
The manager isn't responding... The manager may respond again if you wait. Do you want to fire the manager?
[YES - FIRE HIM] [NO - Keep Waiting]
Better, one would think, if the manager simply set everyone off to work on stuff and then got on with doing other things. All he cares about is when they finish working so all it takes is one more instruction for them to notify him when their work is complete. The UI thread is like your application's manager - its time is precious and you should use as little of it as absolutely necessary. If you have work to do, delegate to a worker thread and don't have the manager sit around waiting for others to finish work - have them notify when things are ready and let the manager go back to work.
Well the code is very partial, but at a glance if
this.Msg.InvokeRequired == false
the following code doesn't get executed. Can that be the issue?
I have a data processing program in C# (.NET 4.6.2; WinForms for the UI). I'm experiencing a strange situation where computer speed seems to be causing Task.Factory.ContinueWhenAll to run earlier than expected or some Tasks are reporting complete before actually running. As you can see below, I have a queue of up to 390 tasks, with no more than 4 in queue at once. When all tasks are complete, the status label is updated to say complete. The ScoreManager involves retrieving information from a database, performing several client-side calculations, and saving to an Excel file.
When running the program from my laptop, everything functions as expected; when running from a substantially more powerful workstation, I experience this issue. Unfortunately, due to organizational limitations, I likely cannot get Visual Studio on the workstation to debug directly. Does anyone have any idea what might be causing this for me to investigate?
private void button1_Click(object sender, EventArgs e)
{
int startingIndex = cbStarting.SelectedIndex;
int endingIndex = cbEnding.SelectedIndex;
lblStatus.Text = "Running";
if (endingIndex < startingIndex)
{
MessageBox.Show("Ending must be further down the list than starting.");
return;
}
List<string> lItems = new List<string>();
for (int i = startingIndex; i <= endingIndex; i++)
{
lItems.Add(cbStarting.Items[i].ToString());
}
System.IO.Directory.CreateDirectory(cbMonth.SelectedItem.ToString());
ThreadPool.SetMaxThreads(4, 4);
List<Task<ScoreResult>> tasks = new List<Task<ScoreResult>>();
for (int i = startingIndex; i <= endingIndex; i++)
{
ScoreManager sm = new ScoreManager(cbStarting.Items[i].ToString(),
cbMonth.SelectedItem.ToString());
Task<ScoreResult> task = Task.Factory.StartNew<ScoreResult>((manager) =>
((ScoreManager)manager).Execute(), sm);
sm = null;
Action<Task<ScoreResult>> itemcomplete = ((_task) =>
{
if (_task.Result.errors.Count > 0)
{
txtLog.Invoke((MethodInvoker)delegate
{
txtLog.AppendText("Item " + _task.Result.itemdetail +
" had errors/warnings:" + Environment.NewLine);
});
foreach (ErrorMessage error in _task.Result.errors)
{
txtLog.Invoke((MethodInvoker)delegate
{
txtLog.AppendText("\t" + error.ErrorText +
Environment.NewLine);
});
}
}
else
{
txtLog.Invoke((MethodInvoker)delegate
{
txtLog.AppendText("Item " + _task.Result.itemdetail +
" succeeded." + Environment.NewLine);
});
}
});
task.ContinueWith(itemcomplete);
tasks.Add(task);
}
Action<Task[]> allComplete = ((_tasks) =>
{
lblStatus.Invoke((MethodInvoker)delegate
{
lblStatus.Text = "Complete";
});
});
Task.Factory.ContinueWhenAll<ScoreResult>(tasks.ToArray(), allComplete);
}
You are creating fire-and-forget tasks that you neither wait for nor observe, here:
task.ContinueWith(itemcomplete);
tasks.Add(task);
Task.Factory.ContinueWhenAll<ScoreResult>(tasks.ToArray(), allComplete);
The ContinueWith method returns a Task. You probably need to attach the allComplete continuation to these tasks, instead of their antecedents:
List<Task> continuations = new List<Task>();
Task continuation = task.ContinueWith(itemcomplete);
continuations.Add(continuation);
Task.Factory.ContinueWhenAll(continuations.ToArray(), allComplete); // non-generic overload: the continuations are plain Tasks
As a side note, you could make your code half in size and significantly more readable if you used async/await instead of the old-school ContinueWith and Invoke((MethodInvoker) technique.
Also: setting an upper limit to the number of ThreadPool threads in order to control the degree of parallelism is extremely inadvisable:
ThreadPool.SetMaxThreads(4, 4); // Don't do this!
You can use the Parallel class instead. It allows controlling the MaxDegreeOfParallelism quite easily.
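A rough sketch of that suggestion follows. BoundedParallel, ScoreAll and the score delegate are made-up names standing in for the ScoreManager work, so treat this as an illustration rather than a drop-in replacement:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BoundedParallel
{
    // Applies score to every item, with at most maxParallel calls in flight at once.
    public static List<string> ScoreAll(IEnumerable<string> items, int maxParallel,
                                        Func<string, string> score)
    {
        var results = new ConcurrentBag<string>(); // thread-safe, unordered
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        // Blocks until every item has been processed.
        Parallel.ForEach(items, options, item => results.Add(score(item)));

        return new List<string>(results);
    }
}
```

Because Parallel.ForEach blocks its caller until all items are done, it should be started from a background task rather than directly inside the button click handler, otherwise the UI thread is tied up for the whole run.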
After discovering state was IsFaulted, I added some code to add some exception information to the log (https://learn.microsoft.com/en-us/dotnet/standard/parallel-programming/exception-handling-task-parallel-library). Seems the problem is an underlying database issue where there are not enough connections left in the connection pool (Timeout expired. The timeout period elapsed prior to obtaining a connection from the pool. This may have occurred because all pooled connections were in use and max pool size was reached.)--the additional speed allows queries to fire more quickly/frequently. Not sure entirely why, as I do have the SqlConnection enclosed in a using clause, but investigating a few things on that front. At any rate, the problem is clearly a little different than what I thought above, so marking this quasi-answered.
Here's a description of what the program should do. The program should create a file and five threads to write in that file...
The first thread should write from 1 to 5 into that file.
The second thread should write from 1 to 10.
The third thread should write from 1 to 15.
The fourth thread should write from 1 to 20.
The fifth thread should write from 1 to 25.
Moreover, an algorithm should be implemented to make each thread print two numbers and then stop. The next thread should then print two numbers and stop, and so on until all the threads have finished printing their numbers.
Here's the code I've developed so far...
using System;
using System.IO;
using System.Threading;
using System.Collections;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
public static class OSAssignment
{
// First Thread Tasks...
static void FirstThreadTasks(StreamWriter WritingBuffer)
{
for (int i = 1; i <= 5; i++)
{
if (i % 2 == 0)
{
Console.WriteLine("[Thread1] " + i);
Thread.Sleep(i);
}
else
{
Console.WriteLine("[Thread1] " + i);
}
}
}
// Second Thread Tasks...
static void SecondThreadTasks(StreamWriter WritingBuffer)
{
for (int i = 1; i <= 10; i++)
{
if (i % 2 == 0)
{
if (i == 10)
Console.WriteLine("[Thread2] " + i);
else
{
Console.WriteLine("[Thread2] " + i);
Thread.Sleep(i);
}
}
else
{
Console.WriteLine("[Thread2] " + i);
}
}
}
// Third Thread Tasks..
static void ThirdThreadTasks(StreamWriter WritingBuffer)
{
for (int i = 1; i <= 15; i++)
{
if (i % 2 == 0)
{
Console.WriteLine("[Thread3] " + i);
Thread.Sleep(i);
}
else
{
Console.WriteLine("[Thread3] " + i);
}
}
}
// Fourth Thread Tasks...
static void FourthThreadTasks(StreamWriter WritingBuffer)
{
for (int i = 1; i <= 20; i++)
{
if (i % 2 == 0)
{
if (i == 20)
Console.WriteLine("[Thread4] " + i);
else
{
Console.WriteLine("[Thread4] " + i);
Thread.Sleep(i);
}
}
else
{
Console.WriteLine("[Thread4] " + i);
}
}
}
// Fifth Thread Tasks...
static void FifthThreadTasks(StreamWriter WritingBuffer)
{
for (int i = 1; i <= 25; i++)
{
if (i % 2 == 0)
{
Console.WriteLine("[Thread5] " + i);
Thread.Sleep(i);
}
else
{
Console.WriteLine("[Thread5] " + i);
}
}
}
// Main Function...
static void Main(string[] args)
{
FileStream File = new FileStream("output.txt", FileMode.Create, FileAccess.Write, FileShare.Write);
StreamWriter Writer = new StreamWriter(File);
Thread T1 = new Thread(() => FirstThreadTasks(Writer));
Thread T2 = new Thread(() => SecondThreadTasks(Writer));
Thread T3 = new Thread(() => ThirdThreadTasks(Writer));
Thread T4 = new Thread(() => FourthThreadTasks(Writer));
Thread T5 = new Thread(() => FifthThreadTasks(Writer));
Console.WriteLine("Initiating Jobs...");
T1.Start();
T2.Start();
T3.Start();
T4.Start();
T5.Start();
Writer.Flush();
Writer.Close();
File.Close();
}
}
}
Here are the problems I'm facing...
1. I cannot figure out how to make the five threads write to the same file at the same time, even with FileShare.Write. So I decided to write to the console for the time being, to develop the algorithm and see how it behaves there first.
2. Each time I run the program, the output is slightly different from the previous run. It always happens that a thread prints only one of its numbers in a given iteration and outputs the second number only after another thread finishes its current iteration.
3. I've got a question that might be somewhat off-track. If I remove the Console.WriteLine("Initiating Jobs..."); from the main method, the algorithm doesn't behave as I described in Point 2. I really can't figure out why.
Your main function finishes and closes the file before the threads have started writing to it, so you can use Thread.Join to wait for each thread to exit. I'd also advise using a using statement for IDisposable objects.
When you have a limited resource that you want to share among threads, you'll need a locking mechanism. Thread scheduling is not deterministic: you've started 5 threads, and at that point it's not guaranteed which one will run first. lock will force a thread to wait for a resource to become free. The order is still not determined, so T3 might run before T2 unless you add additional logic/locking to force the order as well.
I'm not seeing much difference in the behavior but free running threads will produce some very hard to find bugs especially relating to timing issues.
As an extra note I'd avoid using Sleep as a way of synchronizing threads.
To effectively get one thread to write at a time you need to block all the other threads. There are a few mechanisms for doing that, such as lock, Mutex, Monitor, AutoResetEvent, etc. I'd use an AutoResetEvent for this situation. The problem you then face is that each thread needs to know which thread it's waiting for, so that it can wait on the correct event.
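To make the turn-taking idea concrete, here is one possible sketch. Note that it swaps in Monitor.Wait/PulseAll with a shared turn index instead of one AutoResetEvent per thread, purely because that keeps the "whose turn is it" bookkeeping in a single place. TurnTaking, Run and limits are invented names, output goes to an in-memory list rather than the file, and every limit is assumed to be at least 1.

```csharp
using System.Collections.Generic;
using System.Threading;

public static class TurnTaking
{
    // limits[i] is the last number thread i+1 must write (assumed >= 1).
    public static List<string> Run(int[] limits)
    {
        var output = new List<string>();
        var gate = new object();
        var finished = new bool[limits.Length];
        int turn = 0; // index of the thread allowed to write next
        var threads = new Thread[limits.Length];

        for (int t = 0; t < limits.Length; t++)
        {
            int me = t; // per-iteration copy for the lambda
            threads[t] = new Thread(() =>
            {
                int next = 1;
                while (next <= limits[me])
                {
                    lock (gate)
                    {
                        // Re-check the condition after every wake-up.
                        while (turn != me) Monitor.Wait(gate);

                        // Write at most two numbers per turn.
                        for (int k = 0; k < 2 && next <= limits[me]; k++, next++)
                            output.Add("[Thread" + (me + 1) + "] " + next);
                        if (next > limits[me]) finished[me] = true;

                        // Hand the turn to the next thread that still has work.
                        for (int step = 1; step <= limits.Length; step++)
                        {
                            int candidate = (me + step) % limits.Length;
                            if (!finished[candidate]) { turn = candidate; break; }
                        }
                        Monitor.PulseAll(gate);
                    }
                }
            });
            threads[t].Start();
        }

        foreach (Thread thread in threads) thread.Join();
        return output;
    }
}
```

Calling TurnTaking.Run(new[] { 5, 10, 15, 20, 25 }) mirrors the assignment: threads take turns in a fixed order, two numbers at a time, and finished threads are skipped.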
Please see James' answer as well. He points out a critical bug that escaped my notice: you're closing the file before the writer threads have finished. Consider posting a new question to ask how to solve that problem, since this "question" is already three questions rolled into one.
FileShare.Write tells the operating system to allow other attempts to open the file for writing. Typically this is used for systems that have multiple processes writing to the same file. In your case, you have a single process and it only opens the file once, so this flag really makes no difference. It's the wrong tool for the job.
To coordinate writes between multiple threads, you should use locking. Add a new static field to the class:
private static object synchronizer = new object();
Then wrap each write operation on the file with a lock on that object:
lock(synchronizer)
{
Console.WriteLine("[Thread1] " + i);
}
This will make no difference while you're using the Console, but I think it will solve the problem you had with writing to the file.
Speaking of which, switching from file writes to console writes to sidestep the file problem was a clever idea, so kudos for that. However, an even better implementation of that idea would be to replace all of the write calls with a call to a single function, e.g. "WriteOutput(string)", so that you can switch everything from file to console just by changing one line in that function.
And then you could put the lock into that function as well.
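Roughly, that could look like the sketch below. SharedWriterDemo and Run are invented names; the TextWriter parameter is my own addition so the same code can target a StreamWriter, Console.Out or a StringWriter, and the Join loop covers the "file closed too early" bug mentioned in the other answer.

```csharp
using System;
using System.IO;
using System.Threading;

public static class SharedWriterDemo
{
    private static readonly object synchronizer = new object();

    // Single choke point for output: every write goes through the same lock.
    private static void WriteOutput(TextWriter writer, string line)
    {
        lock (synchronizer)
        {
            writer.WriteLine(line);
        }
    }

    // Thread n writes the numbers 1..5n, as in the assignment.
    public static void Run(TextWriter writer, int threadCount)
    {
        var threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++)
        {
            int id = t + 1;       // per-iteration copies for the lambda
            int limit = id * 5;
            threads[t] = new Thread(() =>
            {
                for (int i = 1; i <= limit; i++)
                    WriteOutput(writer, "[Thread" + id + "] " + i);
            });
            threads[t].Start();
        }

        // Wait for every writer before the caller flushes or closes anything.
        foreach (Thread thread in threads) thread.Join();
        writer.Flush();
    }
}
```

With a file, the call would sit inside a using block, e.g. using (var writer = new StreamWriter("output.txt")) SharedWriterDemo.Run(writer, 5); so the file is closed only after all threads have finished.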
Threaded stuff is not deterministic. It's guaranteed that each thread will run, but there are no guarantees about ordering, when threads will be interrupted, which thread will interrupt which, etc. It's a roll of the dice every time. You just have to get used to it, or go out of your way to force things to happen in a certain sequence if that really matters for your application.
I dunno about this one. Seems like that shouldn't matter.
OK, I'm coming to this rather late, but from a theoretical point of view, I/O from multiple threads to a single end-point is inevitably fraught.
In the example above, it would almost certainly be faster and safer to queue the output into an in-memory structure, with each thread taking an exclusive lock before doing so, and then have a separate thread write the output to the device.
I have developed an application in C#. The class structure is as follows.
Form1 => The UI form. Has a BackgroundWorker, a progress bar, and an "ok" button.
SourceReader, TimedWebClient, HttpWorker, ReportWriter // classes that do some work
Controller => Has overall control. On the "ok" button click, an instance of this class called "cntrlr" is created. This cntrlr is a global variable in Form1.cs.
(In the constructor of the Controller I create the SourceReader, TimedWebClient, HttpWorker, and ReportWriter instances.)
Then I call RunWorkerAsync() on the background worker.
Within it code is as follows.
private void backgroundWorker1_DoWork(object sender, DoWorkEventArgs e)
{
int iterator = 1;
for (iterator = 1; iterator <= this.urlList.Count; iterator++)
{
cntrlr.Vmain(iterator-1);
backgroundWorker1.ReportProgress(iterator);
}
}
At the moment ReportProgress updates the progress bar.
The urlList mentioned above has 1000 URLs. cntrlr.Vmain(int i) handles the whole process at the moment. I want to split the task across several threads, each processing 100 URLs. Access to the other instances and their methods need not be restricted, but access to ReportWriter should be limited to one thread at a time. I can't find a way to do this. If anyone has an idea or an answer, please explain.
If you do want to restrict multiple threads using the same method concurrently then I would use the Semaphore class to facilitate the required thread limit; here's how...
A semaphore is like a mean night club bouncer: it has been given a club capacity and is not allowed to exceed this limit. Once the club is full, no one else can enter and a queue builds up outside. Then, as one person leaves, another can enter (analogy thanks to J. Albahari).
A Semaphore with a value of one is equivalent to a Mutex or Lock except that the Semaphore has no owner so that it is thread ignorant. Any thread can call Release on a Semaphore whereas with a Mutex/Lock only the thread that obtained the Mutex/Lock can release it.
Now, for your case we are able to use Semaphores to limit concurrency and prevent too many threads from executing a particular piece of code at once. In the following example five threads try to enter a night club that only allows entry to three...
using System;
using System.Threading;

class BadAssClub
{
    static SemaphoreSlim sem = new SemaphoreSlim(3);

    static void Main()
    {
        for (int i = 1; i <= 5; i++)
            new Thread(Enter).Start(i);
    }

    // Enforce only three threads running the guarded section at once.
    // The parameter is object because Thread.Start(i) passes it that way.
    static void Enter(object id)
    {
        Console.WriteLine(id + " wants to enter.");
        sem.Wait(); // acquire before the try, so finally only releases a slot we hold
        try
        {
            Console.WriteLine(id + " is in!");
            Thread.Sleep(1000 * (int)id);
            Console.WriteLine(id + " is leaving...");
        }
        finally
        {
            sem.Release();
        }
    }
}
Note that SemaphoreSlim is a lighter-weight version of the Semaphore class and incurs about a quarter of the overhead. It is sufficient for what you require.
I hope this helps.
I think I would have used the ThreadPool instead of a background worker, and given each thread 1 URL to process, not 100. The thread pool limits the number of threads it starts at once, so you won't have to worry about issuing 1000 requests at once. Have a look here for a good example:
http://msdn.microsoft.com/en-us/library/3dasc8as.aspx
Feeling a little more adventurous? Consider using TPL DataFlow to download a bunch of urls:
var urls = new[]{
"http://www.google.com",
"http://www.microsoft.com",
"http://www.apple.com",
"http://www.stackoverflow.com"};
var tb = new TransformBlock<string, string>(async url => {
using(var wc = new WebClient())
{
var data = await wc.DownloadStringTaskAsync(url);
Console.WriteLine("Downloaded : {0}", url);
return data;
}
}, new ExecutionDataflowBlockOptions{MaxDegreeOfParallelism = 4});
var ab = new ActionBlock<string>(data => {
//process your data
Console.WriteLine("data length = {0}", data.Length);
}, new ExecutionDataflowBlockOptions{MaxDegreeOfParallelism = 1});
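tb.LinkTo(ab, new DataflowLinkOptions { PropagateCompletion = true }); //join output of producer to consumer block and forward completion
foreach(var u in urls)
{
tb.Post(u);
}
tb.Complete(); //no more input
ab.Completion.Wait(); //block until every downloaded item has been processed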
Note how you can control the parallelism of each block explicitly, so you can gather in parallel but process without going concurrent (for example).
Just grab it with nuget. Easy.
I have an application I have already started working with and it seems I need to rethink things a bit. The application is a winform application at the moment. Anyway, I allow the user to input the number of threads they would like to have running. I also allow the user to allocate the number of records to process per thread. What I have done is loop through the number of threads variable and create the threads accordingly. I am not performing any locking (and not sure I need to or not) on the threads. I am new to threading and am running into possible issue with multiple cores. I need some advice as to how I can make this perform better.
Before a thread is created, some records are pulled from my database to be processed. That list object is sent to the thread and looped through. Once it reaches the end of the loop, the thread calls the data functions to pull some new records, replacing the old ones in the list. This keeps going until there are no more records. Here is my code:
private void CreateThreads()
{
_startTime = DateTime.Now;
var totalThreads = 0;
var totalRecords = 0;
progressThreadsCreated.Maximum = _threadCount;
progressThreadsCreated.Step = 1;
LabelThreadsCreated.Text = "0 / " + _threadCount.ToString();
this.Update();
for(var i = 1; i <= _threadCount; i++)
{
LabelThreadsCreated.Text = i + " / " + _threadCount;
progressThreadsCreated.Value = i;
var adapter = new Dystopia.DataAdapter();
var records = adapter.FindAllWithLocking(_recordsPerThread,_validationId,_validationDateTime);
if(records != null && records.Count > 0)
{
totalThreads += 1;
LabelTotalProcesses.Text = "Total Processes Created: " + totalThreads.ToString();
var paramss = new ArrayList { i, records };
var thread = new Thread(new ParameterizedThreadStart(ThreadWorker));
thread.Start(paramss);
}
this.Update();
}
}
private void ThreadWorker(object paramList)
{
try
{
var parms = (ArrayList) paramList;
var stopThread = false;
var threadCount = (int) parms[0];
var records = (List<Candidates>) parms[1];
var runOnce = false;
var adapter = new Dystopia.DataAdapter();
var lastCount = records.Count;
var runningCount = 0;
while (_stopThreads == false)
{
if (!runOnce)
{
CreateProgressArea(threadCount, records.Count);
}
else
{
ResetProgressBarMethod(threadCount, records.Count);
}
runOnce = true;
var counter = 0;
if (records.Count > 0)
{
foreach (var record in records)
{
counter += 1;
runningCount += 1;
_totalRecords += 1;
var rec = record;
var proc = new ProcRecords();
proc.Validate(ref rec);
adapter.Update(rec);
UpdateProgressBarMethod(threadCount, counter, records.Count, runningCount);
if (_stopThreads)
{
break;
}
}
UpdateProgressBarMethod(threadCount, -1, lastCount, runningCount);
if (!_noRecordsInPool)
{
records = adapter.FindAllWithLocking(_recordsPerThread, _validationId, _validationDateTime);
if (records == null || records.Count <= 0)
{
_noRecordsInPool = true;
break;
}
else
{
lastCount = records.Count;
}
}
}
}
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
}
Something simple you could do to improve performance would be to use the ThreadPool to manage your thread creation. It keeps a group of threads alive and reuses them, so you pay the thread-creation penalty once instead of many times.
If you decide to move to .NET 4.0, Tasks would be another way to go.
"I allow the user to input the number of threads they would like to have running. I also allow the user to allocate the number of records to process per thread."
This isn't something you really want to expose to the user. What are they supposed to put? How can they determine what's best? This is an implementation detail best left to you, or even better, the CLR or another library.
"I am not performing any locking (and not sure I need to or not) on the threads."
The majority of issues you'll have with multithreading will come from shared state. Specifically, in your ThreadWorker method, it looks like you refer to the following shared data: _stopThreads, _totalRecords, _noRecordsInPool, _recordsPerThread, _validationId, and _validationDateTime.
Just because these data are shared, however, doesn't mean you'll have issues. It all depends on who reads and writes them. For example, I think _recordsPerThread is only written once initially, and then read by all threads, which is fine. _totalRecords, however, is both read and written by each thread. You can run into threading issues here since _totalRecords += 1; consists of a non-atomic read-then-write. In other words, you could have two threads read the value of _totalRecords (say they both read the value 5), then increment their copy and then write it back. They'll both write back the value 6, which is now incorrect since it should be 7. This is a classic race condition. For this particular case, you could use Interlocked.Increment to atomically update the field.
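To illustrate the Interlocked.Increment fix, here is a small self-contained sketch. CounterDemo and CountWithInterlocked are invented names, and the local total plays the role of the _totalRecords field:

```csharp
using System.Threading;

public static class CounterDemo
{
    // Several threads bump one shared counter; Interlocked makes each
    // read-increment-write step atomic, so no update is lost.
    public static int CountWithInterlocked(int threadCount, int incrementsPerThread)
    {
        int total = 0; // stands in for _totalRecords
        var threads = new Thread[threadCount];

        for (int t = 0; t < threadCount; t++)
        {
            threads[t] = new Thread(() =>
            {
                for (int i = 0; i < incrementsPerThread; i++)
                    Interlocked.Increment(ref total); // instead of total += 1
            });
            threads[t].Start();
        }

        foreach (Thread thread in threads) thread.Join();
        return total;
    }
}
```

With a plain total += 1 in the loop, the returned value can come out lower than threadCount * incrementsPerThread on some runs, which is exactly the lost-update race described above.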
In general, to do synchronization between threads in C#, you can use the classes in the System.Threading namespace, e.g. Mutex, Semaphore, and probably the most common, Monitor (equivalent to lock) which allows only one thread to execute a specific portion of code at a time. The mechanism you use to synchronize depends entirely on your performance requirements. For example, if you throw a lock around the body of your ThreadWorker, you'll destroy any performance gains you got through multithreading by effectively serializing the work. Safe, but slow :( On the other hand, if you use Interlocked.Increment and judiciously add other synchronization where necessary, you'll maintain your performance and your app will be correct :)
Once you've gotten your worker method to be thread-safe, you should use some other mechanism to manage your threads. ThreadPool was mentioned, and you could also use the Task Parallel Library, which abstracts over the ThreadPool and smartly determines and scales how many threads to use. This way, you take the burden off of the user to determine what magic number of threads they should run.
The obvious answer is to question why you want threads in the first place? Where is the analysis and benchmarks that show that using threads will be an advantage?
How are you ensuring that non-gui threads do not interact with the gui? How are you ensuring that no two threads interact with the same variables or datastructures in an unsafe way? Even if you realise you do need to use locking, how are you ensuring that the locks don't result in each thread processing their workload serially, removing any advantages that multiple threads might have provided?
I am having a hard time understanding Wait(), Pulse(), and PulseAll(). Do they all avoid deadlock? I would appreciate an explanation of how to use them.
Short version:
lock(obj) {...}
is shorthand for Monitor.Enter / Monitor.Exit (with exception handling etc). If nobody else has the lock, you can get it (and run your code); otherwise your thread is blocked until the lock is acquired (after another thread releases it).
Deadlock typically happens when either A: two threads lock things in different orders:
thread 1: lock(objA) { lock (objB) { ... } }
thread 2: lock(objB) { lock (objA) { ... } }
(here, if they each acquire the first lock, neither can ever get the second, since neither thread can exit to release their lock)
This scenario can be minimised by always locking in the same order; and you can recover (to a degree) by using Monitor.TryEnter (instead of Monitor.Enter/lock) and specifying a timeout.
or B: you can block yourself with things like winforms when thread-switching while holding a lock:
lock(obj) { // on worker
this.Invoke((MethodInvoker) delegate { // switch to UI
lock(obj) { // oopsiee!
...
}
});
}
The deadlock appears obvious above, but it isn't so obvious when you have spaghetti code; possible answers: don't thread-switch while holding locks, or use BeginInvoke so that you can at least exit the lock (letting the UI play).
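The Monitor.TryEnter recovery mentioned under scenario A could be wrapped up along these lines. TimedLock and TryRun are hypothetical names; what the caller does when the lock cannot be obtained (retry, log, give up) is left open on purpose.

```csharp
using System;
using System.Threading;

public static class TimedLock
{
    // Runs action under the lock if it can be acquired within timeout.
    // Returns false (without running action) if the wait timed out.
    public static bool TryRun(object gate, TimeSpan timeout, Action action)
    {
        bool taken = false;
        try
        {
            Monitor.TryEnter(gate, timeout, ref taken);
            if (!taken) return false; // someone else held it too long

            action();
            return true;
        }
        finally
        {
            // Only release what we actually acquired.
            if (taken) Monitor.Exit(gate);
        }
    }
}
```

A timeout does not remove the underlying lock-ordering problem; it only turns a permanent hang into a failure the code can react to.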
Wait/Pulse/PulseAll are different; they are for signalling. I use this in this answer to signal so that:
Dequeue: if you try to dequeue data when the queue is empty, it waits for another thread to add data, which wakes up the blocked thread
Enqueue: if you try and enqueue data when the queue is full, it waits for another thread to remove data, which wakes up the blocked thread
Pulse only wakes up one thread - but I'm not brainy enough to prove that the next thread is always the one I want, so I tend to use PulseAll, and simply re-verify the conditions before continuing; as an example:
while (queue.Count >= maxSize)
{
Monitor.Wait(queue);
}
With this approach, I can safely add other meanings of Pulse, without my existing code assuming that "I woke up, therefore there is data" - which is handy when (in the same example) I later needed to add a Close() method.
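To make that concrete, here is a sketch of a bounded queue along those lines. It is not the code from the linked answer; the names (`BoundedQueue`, `TryEnqueue`, `TryDequeue`) and the exact Close() behaviour are assumptions for illustration. Every waiter re-checks its own condition in a loop, so a PulseAll caused by Close() does not get mistaken for "there is data" or "there is room".

```csharp
using System.Collections.Generic;
using System.Threading;

public sealed class BoundedQueue<T>
{
    private readonly Queue<T> queue = new Queue<T>();
    private readonly int maxSize;
    private bool closed;

    public BoundedQueue(int maxSize)
    {
        this.maxSize = maxSize;
    }

    // Blocks while the queue is full; returns false if the queue has been closed.
    public bool TryEnqueue(T item)
    {
        lock (queue)
        {
            while (queue.Count >= maxSize && !closed)
            {
                Monitor.Wait(queue);
            }
            if (closed) return false;
            queue.Enqueue(item);
            Monitor.PulseAll(queue); // wake everyone; each waiter re-verifies its condition
            return true;
        }
    }

    // Blocks while the queue is empty; returns false once it is closed and drained.
    public bool TryDequeue(out T item)
    {
        lock (queue)
        {
            while (queue.Count == 0 && !closed)
            {
                Monitor.Wait(queue);
            }
            if (queue.Count == 0)
            {
                item = default(T);
                return false;
            }
            item = queue.Dequeue();
            Monitor.PulseAll(queue);
            return true;
        }
    }

    // Wakes all blocked threads so they can notice the queue is closed.
    public void Close()
    {
        lock (queue)
        {
            closed = true;
            Monitor.PulseAll(queue);
        }
    }
}
```

A typical use is one thread calling TryEnqueue in a loop and then Close(), while another loops on `while (q.TryDequeue(out item))`.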
Simple recipe for use of Monitor.Wait and Monitor.Pulse. It consists of a worker, a boss, and a phone they use to communicate:
object phone = new object();
A "Worker" thread:
lock(phone) // Sort of "Turn the phone on while at work"
{
while(true)
{
Monitor.Wait(phone); // Wait for a signal from the boss
DoWork();
Monitor.PulseAll(phone); // Signal boss we are done
}
}
A "Boss" thread:
PrepareWork();
lock(phone) // Grab the phone when I have something ready for the worker
{
Monitor.PulseAll(phone); // Signal worker there is work to do
Monitor.Wait(phone); // Wait for the work to be done
}
More complex examples follow...
A "Worker with something else to do":
lock(phone)
{
while(true)
{
if(Monitor.Wait(phone,1000)) // Wait for one second at most
{
DoWork();
Monitor.PulseAll(phone); // Signal boss we are done
}
else
DoSomethingElse();
}
}
An "Impatient Boss":
PrepareWork();
lock(phone)
{
Monitor.PulseAll(phone); // Signal worker there is work to do
if(Monitor.Wait(phone,1000)) // Wait for one second at most
Console.WriteLine("Good work!");
}
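One thing to watch with the recipe as written: if the boss pulses before the worker has reached Monitor.Wait, the signal is lost. Below is a small runnable sketch of the same worker/boss idea with two flags added to guard against that. The flags (`workReady`, `workDone`) and the `PhoneDemo` class are my additions, not part of the recipe above, and the arithmetic simply stands in for DoWork().

```csharp
using System.Threading;

public static class PhoneDemo
{
    private static readonly object phone = new object();
    private static bool workReady; // assumed flag: set by the boss, cleared by the worker
    private static bool workDone;  // assumed flag: set by the worker
    private static int result;

    private static void Worker()
    {
        lock (phone)
        {
            // Looping on the flag means a pulse sent before we got here is not missed.
            while (!workReady)
            {
                Monitor.Wait(phone);
            }
            workReady = false;
            result = 6 * 7;           // stands in for DoWork()
            workDone = true;
            Monitor.PulseAll(phone);  // signal the boss we are done
        }
    }

    // The boss: hands over one piece of work and waits for the answer.
    public static int Run()
    {
        lock (phone)
        {
            workReady = false;
            workDone = false;
        }
        var worker = new Thread(Worker);
        worker.Start();
        lock (phone)
        {
            workReady = true;
            Monitor.PulseAll(phone);  // signal the worker there is work to do
            while (!workDone)
            {
                Monitor.Wait(phone);  // releases the phone while waiting
            }
        }
        worker.Join();
        return result;
    }
}
```

`PhoneDemo.Run()` returns once the worker has finished, regardless of which thread grabs the phone first.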
No, they don't protect you from deadlocks. They are just more flexible tools for thread synchronization. Here is a very good explanation of how to use them, including a very important usage pattern; without this pattern you will break things:
http://www.albahari.com/threading/part4.aspx
Something that totally threw me here is that Pulse just gives a "heads up" to a thread in a Wait. The waiting thread will not continue until the thread that did the Pulse gives up the lock and the waiting thread successfully wins it.
lock(phone) // Grab the phone
{
Monitor.PulseAll(phone); // Signal worker
Monitor.Wait(phone); // ****** The lock on phone has been given up! ******
}
or
lock(phone) // Grab the phone when I have something ready for the worker
{
Monitor.PulseAll(phone); // Signal worker there is work to do
DoMoreWork();
} // ****** The lock on phone has been given up! ******
In both cases it's not until "the lock on phone has been given up" that another thread can get it.
There might be other threads waiting for that lock from Monitor.Wait(phone) or lock(phone). Only the one that wins the lock will get to continue.
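Here is a small sketch that makes the ordering visible. The names (`PulseOrderDemo`, `waiting`, `signaled`) are invented for the demo, and Thread.Sleep stands in for DoMoreWork(). The waiter only records "waiter resumed" after the pulsing thread has left its lock block, even though the pulse itself happened earlier.

```csharp
using System.Collections.Generic;
using System.Threading;

public static class PulseOrderDemo
{
    public static List<string> Run()
    {
        object phone = new object();
        var log = new List<string>(); // only touched while holding the lock
        bool waiting = false;         // set by the waiter just before it waits
        bool signaled = false;        // set by the pulser just before it pulses

        var waiter = new Thread(() =>
        {
            lock (phone)
            {
                waiting = true;
                while (!signaled)
                {
                    Monitor.Wait(phone);   // gives up the lock while blocked
                }
                log.Add("waiter resumed"); // runs only after winning the lock back
            }
        });
        waiter.Start();

        bool pulsed = false;
        while (!pulsed)
        {
            lock (phone)
            {
                // Seeing waiting == true under the lock means the waiter is inside Wait.
                if (waiting)
                {
                    signaled = true;
                    Monitor.PulseAll(phone);
                    log.Add("pulsed");
                    Thread.Sleep(100);     // stands in for DoMoreWork()
                    log.Add("pulser leaving lock");
                    pulsed = true;
                }
            }
            if (!pulsed) Thread.Sleep(1);  // waiter has not started waiting yet; try again
        }
        waiter.Join();
        return log;
    }
}
```

The returned log lists "pulsed", then "pulser leaving lock", then "waiter resumed".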
They are tools for synchronizing and signaling between threads. As such they do nothing to prevent deadlocks, but if used correctly they can be used to synchronize and communicate between threads.
Unfortunately, most of the work needed to write correct multithreaded code is currently the developer's responsibility in C# (and many other languages). Take a look at how F#, Haskell and Clojure handle this for an entirely different approach.
Unfortunately, none of Wait(), Pulse() or PulseAll() have the magical property which you are wishing for - which is that by using this API you will automatically avoid deadlock.
Consider the following code
object incomingMessages = new object(); //signal object
LoopOnMessages()
{
lock(incomingMessages)
{
Monitor.Wait(incomingMessages);
}
if (canGrabMessage()) handleMessage();
// loop
}
ReceiveMessagesAndSignalWaiters()
{
awaitMessagesArrive();
copyMessagesToReadyArea();
lock(incomingMessages) {
Monitor.PulseAll(incomingMessages); //or Monitor.Pulse
}
awaitReadyAreaHasFreeSpace();
}
This code will deadlock! Maybe not today, maybe not tomorrow. Most likely when your code is placed under stress because suddenly it has become popular or important, and you are being called to fix an urgent issue.
Why?
Eventually the following will happen:
All consumer threads are doing some work
Messages arrive, the ready area can't hold any more messages, and PulseAll() is called.
No consumer gets woken up, because none are waiting
All consumer threads call Wait() [DEADLOCK]
This particular example assumes that the producer thread is never going to call PulseAll() again because it has no more space to put messages in. But there are many, many broken variations possible on this code. People will try to make it more robust by changing a line, for example turning Monitor.Wait(incomingMessages); into
if (!canGrabMessage()) Monitor.Wait(incomingMessages);
Unfortunately, that still isn't enough to fix it. To fix it you also need to change the locking scope where Monitor.PulseAll() is called:
LoopOnMessages()
{
lock(incomingMessages)
{
if (!canGrabMessage()) Monitor.Wait(incomingMessages);
}
if (canGrabMessage()) handleMessage();
// loop
}
ReceiveMessagesAndSignalWaiters()
{
awaitMessagesArrive();
lock(incomingMessages)
{
copyMessagesToReadyArea();
Monitor.PulseAll(incomingMessages); //or Monitor.Pulse
}
awaitReadyAreaHasFreeSpace();
}
The key point is that in the fixed code, the locks restrict the possible sequences of events:
A consumer thread does its work and loops
That thread acquires the lock
And thanks to locking it is now true that either:
a. Messages haven't yet arrived in the ready area, and it releases the lock by calling Wait() BEFORE the message receiver thread can acquire the lock and copy more messages into the ready area, or
b. Messages have already arrived in the ready area and it receives the messages INSTEAD OF calling Wait(). (And while it is making this decision it is impossible for the message receiver thread to e.g. acquire the lock and copy more messages into the ready area.)
As a result the problem of the original code now never occurs:
3. When PulseAll() is called, no consumer gets woken up, because none are waiting
Now observe that in this code you have to get the locking scope exactly right. (If, indeed I got it right!)
And also, since you must use the lock (or Monitor.Enter() etc.) in order to use Monitor.PulseAll() or Monitor.Wait() in a deadlock-free fashion, you still have to worry about possibility of other deadlocks which happen because of that locking.
Bottom line: these APIs are also easy to screw up and deadlock with, i.e. they are quite dangerous.
This is a simple example of Monitor use:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
namespace ConsoleApp4
{
class Program
{
public static int[] X = new int[30];
static readonly object _object = new object();
public static int count=0;
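public static void PutNumbers(int numbersS, int numbersE)
{
for (int i = numbersS; i < numbersE; i++)
{
Monitor.Enter(_object);
try
{
// Wait in a loop: waking up only means "something changed", so re-check that there is room.
// (With a plain if/else, the number for this iteration would be skipped after waiting.)
while (count >= X.Length)
{
Monitor.Wait(_object);
}
X[count] = i;
count++;
Console.WriteLine("Put in element " + count + ": " + i);
// PulseAll, because writers and readers wait on the same object.
Monitor.PulseAll(_object);
}
finally
{
Monitor.Exit(_object);
}
}
}
public static void RemoveNumbers(int numbersS)
{
for (int i = 0; i < numbersS; i++)
{
Monitor.Enter(_object);
try
{
// Wait until there is something to remove.
while (count == 0)
{
Monitor.Wait(_object);
}
int x = count;
count--;
// Clear the last occupied slot; indexing X[count] before the decrement is out of range when the array is full.
X[count] = 0;
Console.WriteLine("Removed element " + x);
Monitor.PulseAll(_object);
}
finally
{
Monitor.Exit(_object);
}
}
}
static void Main(string[] args)
{
// The writers put 40 + 9 = 49 numbers, so the readers remove 30 + 19 = 49;
// asking for more would leave a reader waiting forever.
Thread W1 = new Thread(() => PutNumbers(10,50));
Thread W2 = new Thread(() => PutNumbers(1, 10));
Thread R1 = new Thread(() => RemoveNumbers(30));
Thread R2 = new Thread(() => RemoveNumbers(19));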
W1.Start();
R1.Start();
W2.Start();
R2.Start();
W1.Join();
R1.Join();
W2.Join();
R2.Join();
}
}
}