I am exploring the concept of starting a thread within another thread. This is the code I have come up with; it is a watered-down version of another program I am currently developing. However, I found that the second level of threads does not complete successfully.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using System.Diagnostics;
namespace ConsoleApplication4
{
public class SomeClassA
{
public SomeClassA(string display)
{
System.Threading.Thread.Sleep(1000);
Console.WriteLine(display);
}
}
public class MainSomeClassA
{
public List<SomeClassA> SomeClassaAList;
public List<Thread> ThreadList;
public MainSomeClassA()
{
ThreadList = new List<Thread>();
SomeClassaAList = new List<SomeClassA>();
for (int i = 0; i < 10; i++)
{
ThreadList.Add(new Thread(() => StartThread("Hello")));
}
WaitComplete();
}
public void WaitComplete()
{
bool AllThreadsAlive = true;
while (AllThreadsAlive)
{
AllThreadsAlive = false;
foreach (Thread t in ThreadList)
{
if (t.IsAlive)
{
AllThreadsAlive = true;
}
}
}
}
public void StartThread(string display)
{
SomeClassaAList.Add(new SomeClassA(display));
}
}
class Program
{
public static List<MainSomeClassA> MainSomeClassAList = new List<MainSomeClassA>();
static void Main(string[] args)
{
Stopwatch sw = new Stopwatch();
MainSomeClassAList = new List<MainSomeClassA>();
List<Thread> ThreadList = new List<Thread>();
bool threadsAlive = true;
sw.Reset();
sw.Start();
for (int i = 0; i < 10; i++)
{
Thread t = new Thread(AddToMainClassAList);
t.Start();
ThreadList.Add(t);
}
while (threadsAlive)
{
threadsAlive = false;
foreach (Thread t in ThreadList)
{
if (t.IsAlive)
{
threadsAlive = true;
}
}
}
sw.Stop();
Console.WriteLine("Elapsed Time: {0}", sw.ElapsedMilliseconds);
Console.ReadKey();
}
public static void AddToMainClassAList()
{
MainSomeClassAList.Add(new MainSomeClassA());
}
}
}
The above code does not print out "hello" and exits without creating the SomeClassA List.
The problem with your code is that you never start the inner threads. Change your constructor to look like this, and it will work:
public MainSomeClassA()
{
ThreadList = new List<Thread>();
SomeClassaAList = new List<SomeClassA>();
for (int i = 0; i < 10; i++)
{
ThreadList.Add(new Thread(() => StartThread("Hello")));
// Start thread here:
ThreadList[ThreadList.Count - 1].Start();
}
WaitComplete();
}
That said, I should point out that you're lucky the program doesn't crash. You have ten threads concurrently trying to modify the MainSomeClassAList object, some of which will necessarily force a reallocation of the internal buffer. As it is, if you print out the Count of the list at the end, you will find it isn't always 10 as it ought to be.
For the code to be truly correct, you would need to add synchronization around the call to Add() in the AddToMainClassAList() method. Same thing applies to the StartThread() method and the SomeClassaAList object.
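A minimal sketch of that synchronization, assuming a dedicated private lock object. The names SafeListDemo, Gate, Items and AddItem are illustrative and not part of the original program; the same pattern would apply to both MainSomeClassAList and SomeClassaAList.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public static class SafeListDemo
{
    // Illustrative names: Gate and Items are not from the original program.
    private static readonly object Gate = new object();
    private static readonly List<int> Items = new List<int>();

    // List<T>.Add is not thread-safe, so every writer takes the same lock.
    public static void AddItem(int value)
    {
        lock (Gate)
        {
            Items.Add(value);
        }
    }

    // Starts threadCount threads that each add one item, waits for all of
    // them, and returns the final number of items in the list.
    public static int Run(int threadCount)
    {
        lock (Gate) { Items.Clear(); }
        var threads = new List<Thread>();
        for (int i = 0; i < threadCount; i++)
        {
            int captured = i; // copy the loop variable so each lambda sees its own value
            var t = new Thread(() => AddItem(captured));
            t.Start();
            threads.Add(t);
        }
        foreach (Thread t in threads)
        {
            t.Join(); // block until the thread finishes; no polling
        }
        lock (Gate) { return Items.Count; }
    }

    public static void Main()
    {
        Console.WriteLine(Run(10)); // prints 10 on every run
    }
}
```

Locking on a private object rather than on the list itself keeps outside code from accidentally taking the same lock.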
Finally, your method for waiting on the threads is very poor. You should try to avoid polling at all costs. In this case, the Thread.Join() method is a reasonable choice (you should try to avoid blocking a thread at all, but for this example, it's unavoidable). For example, your busy loop can be replaced by this:
foreach (Thread thread in ThreadList)
{
thread.Join();
}
I would like to run a thread and abort it when I need to run it again while it may still be alive, but I noticed that aborting is slow because of how aborting a thread works.
Therefore I wrote the following two simulations, which construct threads and measure how long each simulation takes:
private Thread _t;
public void Main()
{
SimulateAbortThread();
SimulateThreadClass();
}
private void SimulateAbortThread()
{
var sw = new Stopwatch();
sw.Start();
for (var i = 0; i < 100; i++)
{
RunThread();
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
}
private void RunThread()
{
if (_t != null && _t.IsAlive) _t.Abort();
_t = new Thread(() =>
{
//doStuff();
});
_t.Start();
}
private void SimulateThreadClass()
{
var thread = new ThreadClass();
var sw = new Stopwatch();
sw.Start();
for (var i = 0; i < 100; i++)
{
thread = new ThreadClass();
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
}
private class ThreadClass
{
public ThreadClass()
{
new Thread(() =>
{
//doStuff();
}).Start();
}
}
I discovered that creating a new instance of a class that constructs a new thread is faster than aborting the thread itself. In real cases the results of both simulations are identical; because of how aborting works, they differ only in speed, which is why I am currently using the second simulation.
I would like to know why the second simulation runs faster, and how I can avoid assigning a new instance of a class to a variable just to "restart" threads that might still be running.
I wrote this code to test multithreaded and single-threaded speeds. Thanks for all the feedback! I rewrote most of it based on the great comments I received. It now functions properly (it may have a bug here or there), tests multiple threads first, and takes an average to find a more accurate speed: (Scroll to bottom for cont.)
Main method Class
using System;
namespace SingleAndMultiThreading
{
internal class Threads
{
private static void Main(string[] args)
{
long numOfObjCreated;
int numberOfTests;
while (true)
{
try
{
Console.Write("Number of objects to create: ");
numOfObjCreated = Convert.ToInt64(Console.ReadLine());
break;
}
catch (Exception)
{
Console.WriteLine("Invalid input.");
}
}
while (true)
{
try
{
Console.Write("Number of tests to run: ");
numberOfTests = Convert.ToInt32(Console.ReadLine());
break;
}
catch (Exception)
{
Console.WriteLine("Invalid input.");
}
}
CalculateResults(numOfObjCreated, numberOfTests);
Console.ReadKey();
}
private static void CalculateResults(long numOfObjCreated, int numberOfTests)
{
double totalPercentages = 0;
for (var i = 0; i < numberOfTests; i++)
{
totalPercentages += CompleteTests(numOfObjCreated);
}
var accuracy = totalPercentages / numberOfTests;
if ((int)accuracy == 0)
{
Console.WriteLine("\nIn this case, neither single threading or multithreading is faster.\n" +
"They both run equally well under these conditions.\n");
return;
}
if (accuracy < 0)
{
Console.WriteLine("\nIn this case with {0} objects being created, single threading is faster!\n",
string.Format("{0:#,###0}", numOfObjCreated));
return;
}
Console.WriteLine("\nFrom {0} test(s), {1}% was the average percentage of increased speed in multithreading.\n",
string.Format("{0:#,###0}", numberOfTests), string.Format("{0:#,###0}", accuracy));
}
private static double CompleteTests(long numOfObjCreated)
{
Console.WriteLine("Computing...");
var numOfCores = Environment.ProcessorCount;
var timeForMultiThread = MultiThread.Run(numOfObjCreated, numOfCores);
var timeForSingleThread = SingleThread.Run(numOfObjCreated);
var percentFaster = ((timeForSingleThread / timeForMultiThread) * 100) - 100;
//note: .NET does its part in assigning a certain thread to its own core
Console.WriteLine("Using all {0} cores, creating {1} objects is {2}% faster.",
numOfCores, string.Format("{0:#,###0}", numOfObjCreated), string.Format("{0:#,###0}", percentFaster));
return percentFaster;
}
}
}
Single Threading Class
using System;
using System.Diagnostics;
namespace SingleAndMultiThreading
{
internal class SingleThread
{
public static double Run(long numOfObjCreated)
{
var watch = new Stopwatch();
watch.Start();
for (long i = 0; i < numOfObjCreated; i++)
{
new object();
}
watch.Stop();
var totalTime = watch.ElapsedTicks;
Console.WriteLine("The time to create {0} objects with 1 thread is: {1} ticks.",
string.Format("{0:#,###0}", numOfObjCreated), string.Format("{0:#,###0}", totalTime));
return totalTime;
}
}
}
Multi Threading Class
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
namespace SingleAndMultiThreading
{
internal class MultiThread
{
public static double Run(long numOfObjCreated, int numOfCores)
{
var watch = new Stopwatch();
var workerObject = new Worker(numOfObjCreated / numOfCores);
var listOfThreads = new List<Thread>();
for (long k = 0; k < numOfCores; k++)
{
var workerThread = new Thread(workerObject.DoWork);
listOfThreads.Add(workerThread);
}
watch.Start();
foreach (var thread in listOfThreads)
{
thread.Start();
}
// Block until every worker has finished instead of polling IsAlive in a busy loop
// (the original byte counter would also overflow with more than 255 cores).
foreach (var thread in listOfThreads)
{
thread.Join();
}
watch.Stop();
var totalTime = watch.ElapsedTicks;
Console.WriteLine("The time to create {0} objects utilizing all {1} cores is: {2} ticks.",
string.Format("{0:#,###0}", numOfObjCreated), numOfCores, string.Format("{0:#,###0}", totalTime));
return totalTime;
}
}
}
Worker Class
namespace SingleAndMultiThreading
{
public class Worker
{
private readonly long _numOfObjToCreate;
public bool IsDone;
public Worker(long numOfObjToCreate)
{
_numOfObjToCreate = numOfObjToCreate;
}
public void DoWork()
{
for (long i = 0; i < _numOfObjToCreate; i++)
{
new object();
}
IsDone = true;
}
}
}
The output of this code is a bit too long to post (I urge you to copy and paste it into your own IDE; it's really fascinating). As the accepted answer says, I guess the results differ from test to test because of CPU scheduling and other minor issues such as ASLR. More than one thing is happening besides Visual Studio running this program, and those things run at different priorities. Also, thank you for pointing out that running the multithreaded test first helps because the memory allocation has already been done!
Another thing to point out, I found this while running:
The spikes are when the process of multi threading takes place.
Here's what I'm trying to do:
1. Get one HTML page from a URL; the page contains multiple links.
2. Visit each link.
3. Extract some data from each visited link and create an object from it.
So far, all I have done is the simple, slow way:
public List<Link> searchLinks(string name)
{
List<Link> foundLinks = new List<Link>();
// getHtmlDocument() just returns HtmlDocument using input url.
HtmlDocument doc = getHtmlDocument(AU_SEARCH_URL + fixSpaces(name));
var link_list = doc.DocumentNode.SelectNodes(@"/html/body/div[@id='parent-container']/div[@id='main-content']/ol[@id='searchresult']/li/h2/a");
foreach (var link in link_list)
{
// TODO Threads
// getObject() creates object using data gathered
foundLinks.Add(getObject(link.InnerText, link.Attributes["href"].Value, getLatestEpisode(link.Attributes["href"].Value)));
}
return foundLinks;
}
To make it faster and more efficient I need to use threads, but I'm not sure how to approach it. I can't just start threads at random; I need to wait for them to finish. thread.Join() more or less solves the 'wait for threads to finish' problem, but I think it would no longer be fast, because each thread would only be launched after the previous one has finished.
The simplest way to offload the work to multiple threads would be to use Parallel.ForEach() in place of your current loop. Something like this:
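Parallel.ForEach(link_list, link =>
{
// List<T>.Add is not thread-safe: do the slow work outside the lock and guard only the Add.
var item = getObject(link.InnerText, link.Attributes["href"].Value, getLatestEpisode(link.Attributes["href"].Value));
lock (foundLinks) { foundLinks.Add(item); }
});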
I'm not sure if there are other threading concerns in your overall code. (Note, for example, that this would no longer guarantee that the data would be added to foundLinks in the same order.) But as long as there's nothing explicitly preventing concurrent work from taking place then this would take advantage of threading over multiple CPU cores to process the work.
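If the original order matters, one possible alternative is PLINQ with AsOrdered(), which collects the results without any shared mutable list. This is only a sketch: Process is a hypothetical stand-in for the getObject/getLatestEpisode work in the question, and the Thread.Sleep call merely simulates network latency.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class OrderedParallelDemo
{
    // Hypothetical stand-in for the slow per-link work in the question.
    public static string Process(string href)
    {
        Thread.Sleep(50); // simulate network latency
        return href.ToUpperInvariant();
    }

    // PLINQ runs Process concurrently; AsOrdered keeps the results in
    // source order, and ToList gathers them once all items are done.
    public static List<string> ProcessAll(IEnumerable<string> hrefs)
    {
        return hrefs.AsParallel()
                    .AsOrdered()
                    .Select(href => Process(href))
                    .ToList();
    }

    public static void Main()
    {
        List<string> results = ProcessAll(new[] { "/a", "/b", "/c" });
        Console.WriteLine(string.Join(",", results)); // prints /A,/B,/C
    }
}
```

By default PLINQ sizes its parallelism to the number of cores, which suits CPU-bound work; for mostly network-bound work like this, an asynchronous approach may scale better.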
Maybe you should use the thread pool:
Example from MSDN:
using System;
using System.Threading;
public class Fibonacci
{
private int _n;
private int _fibOfN;
private ManualResetEvent _doneEvent;
public int N { get { return _n; } }
public int FibOfN { get { return _fibOfN; } }
// Constructor.
public Fibonacci(int n, ManualResetEvent doneEvent)
{
_n = n;
_doneEvent = doneEvent;
}
// Wrapper method for use with thread pool.
public void ThreadPoolCallback(Object threadContext)
{
int threadIndex = (int)threadContext;
Console.WriteLine("thread {0} started...", threadIndex);
_fibOfN = Calculate(_n);
Console.WriteLine("thread {0} result calculated...", threadIndex);
_doneEvent.Set();
}
// Recursive method that calculates the Nth Fibonacci number.
public int Calculate(int n)
{
if (n <= 1)
{
return n;
}
return Calculate(n - 1) + Calculate(n - 2);
}
}
public class ThreadPoolExample
{
static void Main()
{
const int FibonacciCalculations = 10;
// One event is used for each Fibonacci object.
ManualResetEvent[] doneEvents = new ManualResetEvent[FibonacciCalculations];
Fibonacci[] fibArray = new Fibonacci[FibonacciCalculations];
Random r = new Random();
// Configure and start threads using ThreadPool.
Console.WriteLine("launching {0} tasks...", FibonacciCalculations);
for (int i = 0; i < FibonacciCalculations; i++)
{
doneEvents[i] = new ManualResetEvent(false);
Fibonacci f = new Fibonacci(r.Next(20, 40), doneEvents[i]);
fibArray[i] = f;
ThreadPool.QueueUserWorkItem(f.ThreadPoolCallback, i);
}
// Wait for all threads in pool to calculate.
WaitHandle.WaitAll(doneEvents);
Console.WriteLine("All calculations are complete.");
// Display the results.
for (int i= 0; i<FibonacciCalculations; i++)
{
Fibonacci f = fibArray[i];
Console.WriteLine("Fibonacci({0}) = {1}", f.N, f.FibOfN);
}
}
}
I'm running some experiments based on the .NET thread-safe and non-thread-safe dictionaries, as well as my custom one.
The results for writing 20,000,000 (20 million) ints to each are as follows:
1. Non-thread safe: 909 milliseconds (less than 1 second), Dictionary
2. Thread safe: 11914 milliseconds (more than 11 seconds), ConcurrentDictionary
3. Custom: 909 milliseconds (less than 1 second), 2 dictionaries
4. Thread safe (ConcurrentTryAdd): 12697 milliseconds (more than 12 seconds), no better than #2
These tests were conducted in a single-threaded environment. I'm trying to get the speed of the non-thread-safe dictionary with the safety of the thread-safe one.
The results are promising so far. I'm surprised how poorly the ConcurrentDictionary performed; maybe it's meant for certain scenarios only?
Anyway, below is the code I used to test the dictionaries. Can you tell me if my custom one is thread safe? Do I have to add a lock to if (_list.ContainsKey(threadId))? I don't think so, since it's only a read, and when the dictionary has an element added to it (a write) it's protected by a lock, blocking other threads trying to read it.
There are no locks once the thread has its dictionary, because another thread cannot write to that same dictionary: each thread gets its own dictionary (based on the ManagedThreadId), making it as safe as a single thread.
Main
using System;
using System.Diagnostics;
namespace LockFreeTests
{
class Program
{
static void Main(string[] args)
{
var sw = Stopwatch.StartNew();
int i = 20000000; // 20 million
IWork work = new Custom(); // Replace with: Control(), Concurrent(), or Custom()
work.Start(i);
sw.Stop();
Console.WriteLine("Total time: {0}\r\nPress anykey to continue...", sw.Elapsed.TotalMilliseconds);
Console.ReadKey(true);
}
}
}
Non-thread safe
using System.Collections.Generic;
namespace LockFreeTests
{
class Control : IWork
{
public void Start(int i)
{
var list = new Dictionary<int, int>();
for (int n = 0; n < i; n++)
{
list.Add(n, n);
}
}
}
}
Thread safe
using System.Collections.Concurrent;
namespace LockFreeTests
{
class Concurrent : IWork
{
public void Start(int i)
{
var list = new ConcurrentDictionary<int, int>();
for (int n = 0; n < i; n++)
{
list.AddOrUpdate(n, n, (a, b) => b);
}
}
}
}
Thread Safe (try add)
using System.Collections.Concurrent;
namespace LockFreeTests
{
class ConcurrentTryAdd : IWork
{
public void Start(int i)
{
var list = new ConcurrentDictionary<int, int>();
for (int n = 0; n < i; n++)
{
bool result = list.TryAdd(n, n);
if (!result)
{
n--;
}
}
}
}
}
Custom
using System.Collections.Generic;
using System.Threading;
namespace LockFreeTests
{
class Custom : IWork
{
private static Dictionary<int, Dictionary<int, int>> _list = null;
static Custom()
{
_list = new Dictionary<int, Dictionary<int, int>>();
}
public void Start(int i)
{
int threadId = Thread.CurrentThread.ManagedThreadId;
Dictionary<int, int> threadList = null;
bool firstTime = false;
lock (_list)
{
if (_list.ContainsKey(threadId))
{
threadList = _list[threadId];
}
else
{
threadList = new Dictionary<int, int>();
firstTime = true;
}
}
for (int n = 0; n < i; n++)
{
threadList.Add(n, n);
}
if (firstTime)
{
lock (_list)
{
_list.Add(threadId, threadList);
}
}
}
}
}
IWork
namespace LockFreeTests
{
public interface IWork
{
void Start(int i);
}
}
Multi-threaded Example
using System;
using System.Diagnostics;
using System.Threading.Tasks;
namespace LockFreeTests
{
class Program
{
static void Main(string[] args)
{
var sw = Stopwatch.StartNew();
int totalWork = 20000000; // 20 million
int cores = Environment.ProcessorCount;
int workPerCore = totalWork / cores;
IWork work = new Custom(); // Replace with: Control(), Concurrent(), ConcurrentTryAdd(), or Custom()
var tasks = new Task[cores];
for (int n = 0; n < cores; n++)
{
tasks[n] = Task.Factory.StartNew(() =>
{
work.Start(workPerCore);
});
}
Task.WaitAll(tasks);
sw.Stop();
Console.WriteLine("Total time: {0}\r\nPress anykey to continue...", sw.Elapsed.TotalMilliseconds);
Console.ReadKey(true);
}
}
}
The above code runs in 528 milliseconds, a 40% speed improvement over the single-threaded test.
It's not thread-safe.
Do I have to add a lock to if (_list.ContainsKey(threadId))? I don't think so, since it's only a read, and when the dictionary has an element added to it (a write) it's protected by a lock, blocking other threads trying to read it.
Yes, you do need a lock here to make it thread-safe.
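A rough sketch of what locking the lookup could look like, assuming the per-thread-dictionary design from the question. PerThreadStore, GetForCurrentThread, TotalCount and the demo class are illustrative names, not part of the original code. Both the read and the write of the shared outer dictionary happen under the same lock, and the per-thread dictionary is registered before the thread starts filling it.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class PerThreadStore
{
    private readonly object _gate = new object();
    private readonly Dictionary<int, Dictionary<int, int>> _map =
        new Dictionary<int, Dictionary<int, int>>();

    // The lookup and the insert both run under _gate, so no thread can
    // read the outer dictionary while another thread is modifying it.
    public Dictionary<int, int> GetForCurrentThread()
    {
        int threadId = Thread.CurrentThread.ManagedThreadId;
        lock (_gate)
        {
            Dictionary<int, int> inner;
            if (!_map.TryGetValue(threadId, out inner))
            {
                inner = new Dictionary<int, int>();
                _map.Add(threadId, inner);
            }
            return inner;
        }
    }

    // Only safe to call after all writer threads have finished, because
    // the inner dictionaries themselves are not locked.
    public int TotalCount()
    {
        lock (_gate)
        {
            int total = 0;
            foreach (Dictionary<int, int> inner in _map.Values)
            {
                total += inner.Count;
            }
            return total;
        }
    }
}

public static class PerThreadStoreDemo
{
    public static int Run(int threadCount, int perThread)
    {
        var store = new PerThreadStore();
        var threads = new List<Thread>();
        for (int i = 0; i < threadCount; i++)
        {
            var t = new Thread(() =>
            {
                Dictionary<int, int> mine = store.GetForCurrentThread();
                // No lock needed here: only this thread touches 'mine'.
                for (int n = 0; n < perThread; n++) mine[n] = n;
            });
            t.Start();
            threads.Add(t);
        }
        foreach (Thread t in threads) t.Join();
        return store.TotalCount();
    }

    public static void Main()
    {
        Console.WriteLine(Run(4, 1000));
    }
}
```

The sketch uses dedicated threads and the indexer rather than Add, because a thread-pool thread that is reused for a second work item would get its existing dictionary back and Add would then throw on the duplicate keys.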
I just wrote about my lock-free thread-safe copy-on-write dictionary implementation here:
http://www.singulink.com/CodeIndex/post/fastest-thread-safe-lock-free-dictionary
It is very fast for quick bursts of writes and lookups usually run at 100% standard Dictionary speed without locking. If you write occasionally and read often, this is the fastest option available.
The thread queue example from the book "Accelerated C# 2008" (the CrudeThreadPool class) does not work correctly. If I put a long job in WorkFunction() on a 2-processor machine, the next task does not start until the first one is finished. How can I solve this problem? I want to load the processor to 100 percent.
public class CrudeThreadPool
{
static readonly int MAX_WORK_THREADS = 4;
static readonly int WAIT_TIMEOUT = 2000;
public delegate void WorkDelegate();
public CrudeThreadPool()
{
stop = 0;
workLock = new Object();
workQueue = new Queue();
threads = new Thread[MAX_WORK_THREADS];
for (int i = 0; i < MAX_WORK_THREADS; ++i)
{
threads[i] = new Thread(new ThreadStart(this.ThreadFunc));
threads[i].Start();
}
}
private void ThreadFunc()
{
lock (workLock)
{
int shouldStop = 0;
do
{
shouldStop = Interlocked.Exchange(ref stop, stop);
if (shouldStop == 0)
{
WorkDelegate workItem = null;
if (Monitor.Wait(workLock, WAIT_TIMEOUT))
{
// Process the item on the front of the queue
lock (workQueue)
{
workItem = (WorkDelegate)workQueue.Dequeue();
}
workItem();
}
}
} while (shouldStop == 0);
}
}
public void SubmitWorkItem(WorkDelegate item)
{
lock (workLock)
{
lock (workQueue)
{
workQueue.Enqueue(item);
}
Monitor.Pulse(workLock);
}
}
public void Shutdown()
{
Interlocked.Exchange(ref stop, 1);
}
private Queue workQueue;
private Object workLock;
private Thread[] threads;
private int stop;
}
public class EntryPoint
{
static void WorkFunction()
{
Console.WriteLine("WorkFunction() called on Thread {0}", Thread.CurrentThread.GetHashCode());
//some long job
double s = 0;
for (int i = 0; i < 100000000; i++)
s += Math.Sin(i);
}
static void Main()
{
CrudeThreadPool pool = new CrudeThreadPool();
for (int i = 0; i < 10; ++i)
{
pool.SubmitWorkItem(
new CrudeThreadPool.WorkDelegate(EntryPoint.WorkFunction));
}
pool.Shutdown();
}
}
I can see 2 problems:
1. Inside ThreadFunc() you take a lock(workLock) for the duration of the method, meaning your thread pool is no longer async.
2. In the Main() method, you shut down the thread pool without waiting for it to finish. Oddly enough, that is why it is working now, stopping each ThreadFunc after 1 loop.
It's hard to tell because there's no indentation, but it looks to me like it's executing the work item while still holding workLock - which is basically going to serialize all the work.
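One way to avoid that serialization is to hold the lock only while dequeuing and to run the work item after the lock has been released. The following is a minimal sketch under that assumption, not the book's implementation; SimplePool, Submit and ShutdownAndWait are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class SimplePool
{
    private readonly object _gate = new object();
    private readonly Queue<Action> _queue = new Queue<Action>();
    private readonly Thread[] _threads;
    private bool _stopping;

    public SimplePool(int threadCount)
    {
        _threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            _threads[i] = new Thread(WorkerLoop);
            _threads[i].Start();
        }
    }

    private void WorkerLoop()
    {
        while (true)
        {
            Action item;
            lock (_gate)
            {
                // Sleep until there is work or shutdown has been requested.
                while (_queue.Count == 0 && !_stopping)
                {
                    Monitor.Wait(_gate);
                }
                if (_queue.Count == 0)
                {
                    return; // stopping and nothing left to run
                }
                item = _queue.Dequeue();
            }
            item(); // the lock is released here, so workers run in parallel
        }
    }

    public void Submit(Action item)
    {
        lock (_gate)
        {
            _queue.Enqueue(item);
            Monitor.Pulse(_gate);
        }
    }

    // Lets the queue drain, then waits for every worker to exit.
    public void ShutdownAndWait()
    {
        lock (_gate)
        {
            _stopping = true;
            Monitor.PulseAll(_gate);
        }
        foreach (Thread t in _threads)
        {
            t.Join();
        }
    }
}

public static class SimplePoolDemo
{
    public static void Main()
    {
        int done = 0;
        var pool = new SimplePool(4);
        for (int i = 0; i < 10; i++)
        {
            pool.Submit(() => Interlocked.Increment(ref done));
        }
        pool.ShutdownAndWait();
        Console.WriteLine(done); // prints 10
    }
}
```

Because ShutdownAndWait joins the workers only after the queue is empty, the caller does not exit while submitted items are still pending.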
If at all possible, I suggest you start using the Parallel Extensions framework in .NET 4, which has obviously had rather more time spent on it. Otherwise, there's the existing thread pool in the framework, and there are other implementations around if you're willing to have a look. I have one in MiscUtil although I haven't looked at the code for quite a while - it's pretty primitive.
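As a rough illustration of the .NET 4 route, the sketch below queues the long job from the question as tasks on the built-in thread pool. TaskDemo, LongJob and RunAll are illustrative names, and the iteration count in Main is arbitrary.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskDemo
{
    // The CPU-bound job from the question, returning its result.
    public static double LongJob(int iterations)
    {
        double s = 0;
        for (int i = 0; i < iterations; i++)
        {
            s += Math.Sin(i);
        }
        return s;
    }

    // Queues jobCount copies of the job to the built-in thread pool and
    // collects the results; reading Result blocks until that task is done.
    public static double[] RunAll(int jobCount, int iterations)
    {
        var tasks = new Task<double>[jobCount];
        for (int i = 0; i < jobCount; i++)
        {
            tasks[i] = Task.Factory.StartNew(() => LongJob(iterations));
        }
        var results = new double[jobCount];
        for (int i = 0; i < jobCount; i++)
        {
            results[i] = tasks[i].Result;
        }
        return results;
    }

    public static void Main()
    {
        double[] results = RunAll(10, 1000000);
        Console.WriteLine("Completed {0} jobs.", results.Length); // prints Completed 10 jobs.
    }
}
```

The thread pool decides how many of these tasks run at once, so no hand-written queue, locks or shutdown flag are needed.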