Hello I was wondering how people make loops threaded for example
for(int i = 0; i<10; i++)
{
Console.WriteLine(i);
}
Is it possible to have one thread for every loop?
so,
Thread 1: 0
Thread 2: 1
Thread 3: 2
ect..
and if so, how would I cap the threads ?
What you are looking for is the parallel for
This:
for(int i=0; i <=10; i++)
{
Console.WriteLine(i);
}
Becomes this:
Parallel.For(0, 10, new ParallelOptions { MaxDegreeOfParallelism = 5 }, i =>
{
Console.WriteLine(i);
});
This will spawn one task per iteration of the loop. One of the overloads of parallel takes in the parallel options which lets set the maximum number of tasks running cuncurrently. Official docs: https://msdn.microsoft.com/en-us/library/dd992418(v=vs.110).aspx
Note there is some difference between a thread and a task in C#: What is the difference between task and thread?
If you have a list of items to process I recommend the parallel foreach: https://msdn.microsoft.com/en-us/library/dd460720(v=vs.110).aspx
Related
I am trying to find out why parallel foreach does not give the expected speedup on a machine with 32 physical cores and 64 logical cores with a simple test computation.
...
var parameters = new List<string>();
for (int i = 1; i <= 9; i++) {
parameters.Add(i.ToString());
if (Scenario.UsesParallelForEach)
{
Parallel.ForEach(parameters, parameter => {
FireOnParameterComputed(this, parameter, Thread.CurrentThread.ManagedThreadId, "started");
var lc = new LongComputation();
lc.Compute();
FireOnParameterComputed(this, parameter, Thread.CurrentThread.ManagedThreadId, "stopped");
});
}
else
{
foreach (var parameter in parameters)
{
FireOnParameterComputed(this, parameter, Thread.CurrentThread.ManagedThreadId, "started");
var lc = new LongComputation();
lc.Compute();
FireOnParameterComputed(this, parameter, Thread.CurrentThread.ManagedThreadId, "stopped");
}
}
}
...
class LongComputation
{
public void Compute()
{
var s = "";
for (int i = 0; i <= 40000; i++)
{
s = s + i.ToString() + "\n";
}
}
}
The Compute function takes about 5 seconds to complete. My assumption was, that with the parallel foreach loop each additional iteration creates a parallel thread running on one of the cores and taking as much as it would take to compute the Compute function only once. So, if I run the loop twice, then with the sequential foreach, it would take 10 seconds, with the parallel foreach only 5 seconds (assuming 2 cores are available). The speedup would be 2. If I run the loop three times, then with the sequential foreach, it would take 15 seconds, but again with the parallel foreach only 5 seconds. The speedup would be 3, then 4, 5, 6, 7, 8, and 9. However, what I observe is a constant speedup of 1.3.
Sequential vs parallel foreach. X-axis: number of sequential/parallel execution of the computation. Y-axis: time in seconds
Speedup, time of the sequential foreach divided by parallel foreach
The event fired in FireOnParameterComputed is intended to be used in a GUI progress bar to show the progress. In the progress bar it can be clearly see, that for each iteration, a new thread is created.
My question is, why don't I see the expected speedup or at least close to the expected speedup?
Tasks aren't threads.
Sometimes starting a task will cause a thread to be created, but not always. Creating and managing threads consumes time and system resources. When a task only takes a short amount of time, even though it's counter-intuitive, the single-threaded model is often faster.
The CLR knows this and tries to make its best judgment on how to execute the task based on a number of factors including any hints that you've passed to it.
For Parallel.ForEach, if you're certain that you want multiple threads to be spawned, try passing in ParallelOptions.
Parallel.ForEach(parameters, new ParallelOptions { MaxDegreeOfParallelism = 100 }, parameter => {});
I made a fake test resembling my real computing task. My current code is:
static void Main()
{
List<ulong> list = new List<ulong>();
Action action = () =>
{
Random rng = new Random(Guid.NewGuid().GetHashCode());
ulong i = 0;
do
{
i++;
if (rng.Next(100000000) == 1000)
{
lock (list) list.Add(i);
Console.WriteLine("ThreadId {0}, step {1}: match is found",
Thread.CurrentThread.ManagedThreadId, i);
}
} while (list.Count < 100);
};
int length = Environment.ProcessorCount;
Action[] actions = new Action[length];
for (int i = 0; i < length; i++)
actions[i] = action;
Parallel.Invoke(actions);
Console.WriteLine("The process is completed. {0} matches are found. Press any key...",
list.Count);
Console.ReadKey();
}
Is there any better approach to optimize the number of parallel tasks for one long computing process?
I'm not sure if i understood the question correctly. The code you've shared will run different instances of the action in parallel. But if you like to compute a long running task in parallel for performance, then you should divide the long running task to small work groups Or If the you are iterating over a collection you can use Parallel for or foreach provided by TPL (Task parallel library) which will determine the number of threads depending on metrics like number of cores and load on cpus etc.
So was I just doing some experiments with Task class in c# and the following thing happens.
Here is the method I call
static async Task<List<int>> GenerateList(long size, int numOfTasks)
{
var nums = new List<int>();
Task[] tasks = new Task[numOfTasks];
for (int i = 0; i < numOfTasks; i++)
{
tasks[i] = Task.Run(() => nums.Add(Rand.Nex())); // Rand is a ThreadLocal<Random>
}
for (long i = 0; i < size; i += numOfTasks)
{
await Task.WhenAll(tasks);
}
return nums;
}
I call this method like this
var nums = GenerateList(100000000, 10).Result;
before I used Tasks generation took like 4-5 seconds. after I implemented this method like this if I pass 10-20 number of tasks the time of generation is lowered to 1,8-2,2 seconds but the thing it the List which is return by the method has numOfTask number of Elements in it so in this case List of ten numbers is returned. May be I'm writing something wrong. What can be the problem here. Or may be there is another solution to It. All I want it many task to add numbers in the same list so the generation time would be at least twice faster. Thanks In advance
WhenAll does not run the tasks; it just (asynchronously) waits for them to complete. Your code is only creating 10 tasks, so that's why you're only getting 10 numbers. Also, as #Mauro pointed out, List<T>.Add is not threadsafe.
If you want to do parallel computation, then use Parallel or Parallel LINQ, not async:
static List<int> GenerateList(int size, int numOfTasks)
{
return Enumerable.Range(0, size)
.AsParallel()
.WithDegreeOfParallelism(numOfTasks)
.Select(_ => Rand.Value.Next())
.ToList();
}
As explained by Stephen, you are only creating 10 tasks.
Also, I believe the Add operation on the generic list is not thread safe. You should use a locking mechanism or, if you are targeting framework 4 or newer, use thread-safe collections .
you are adding to the list in the following loop which runs for only 10 times
for (int i = 0; i < numOfTasks; i++)
{
tasks[i] = Task.Run(() => nums.Add(Rand.Nex())); // Rand is a ThreadLocal<Random>
}
you can instead do
for (int i = 0; i < numOfTasks; i++)
{
tasks[i] = new Task(() => nums.Add(Rand.Nex()));
}
I have a for loop running through 500.000ish list. For each of these it is queueing up a SmartThreadPool job.
lines.Length below contains 500.000ish items.
My problem is that i get memory issues when queueing them all at once.. So i though id write a logic to prevent this:
int activeThreads = _smartThreadPool2.ActiveThreads;
if (activeThreads < maxThreads)
{
int iia = 0;
for (int i = 0; i < lines.Length; i++)
{
if (doNotUseAdditive.Checked == true)
{
foreach (string engine in _checkedEngines) // Grab selected engines
{
query = lines[i];
_smartThreadPool2.QueueWorkItem(
new Amib.Threading.Func<string, string, int, int, int>(scrape),
query, engine, iia, useProxies);
iia++;
}
}
}
}
else
{
// Wait
wait.WaitOne();
}
The problem is that i cannot run that if statement inside my for loop, because when i come back to it, it will not remember where it was inside the loop.
I'm using a:
ManualResetEvent wait = new ManualResetEvent(false); //global variable
To "Pause/Resume"
I need to somehow pause the loop after X threads are used and then when threads are available return and continue the loop.
Any ideas?
I don't think that process every item in list in separate thread is a good idea. Even using custom thread pool can be really error-prone (and you examples proves my opinion).
First of all you should determine number of working threads correctly. It seems that you're dealing with computation intensive operations (so called CPU Bound operations) and you should use number of working threads equals to number of logical processors.
Than you can use Parallel LINQ to split all your working set for appropriate amount of chunks and process those chunks in parallel.
Joe Albahari has a great series of posts about this topic: Threading in C#. Part 5. Parallel Programming.
Here is a pseudocode of using PLINQ:
lines
.AsParallel()
.WithDegreeOfParallelism(YourNumberOfProcessors)
.Select(e => ProcessYourData(e));
In My Parallel.ForEach Loop the localFinally delegate does get called on all the threads.
I have found this to happen as my Parallel Loop stalls.
In my Parallel Loop I have about three condition check stages that return before completion of the Loop. And it seems that it is when the Threads are returned from these stages and not the execution of the entire body that it does not execute the localFinally delegate.
The Loop structure is as follows:
var startingThread = Thread.CurrentThread;
Parallel.ForEach(fullList, opt,
()=> new MultipleValues(),
(item, loopState, index, loop) =>
{
if (cond 1)
return loop;
if (cond 2)
{
process(item);
return loop;
}
if (cond 3)
return loop;
Do Work(item);
return loop;
},
partial =>
{
Log State of startingThread and threads
} );
I have run the loop on a small data set and logged in detail and found that while the Parallel.ForEach completes all the iterations and the Log at the last thread of localFinally is --
Calling Thread State is WaitSleepJoin for Thread 6 Loop Indx 16
the Loop still does not complete gracefully and remains stalled... any clues why the stalls ?
Cheers!
Just did a quick test run after seeing the definition of localFinally (executed after each thread finished), which had me suspecting that that could mean there would be far less threads created by parallelism than loops executed. e.g.
var test = new List<List<string>> ();
for (int i = 0; i < 1000; i++)
{
test.Add(null);
}
int finalcount = 0;
int itemcount = 0;
int loopcount = 0;
Parallel.ForEach(test, () => new List<string>(),
(item, loopState, index, loop) =>
{
Interlocked.Increment(ref loopcount);
loop.Add("a");
//Thread.Sleep(100);
return loop;
},
l =>
{
Interlocked.Add(ref itemcount, l.Count);
Interlocked.Increment(ref finalcount);
});
at the end of this loop, itemcount and loopcount were 1000 as expected, and (on my machine) finalcount 1 or 2 depending on the speed of execution. In the situation with the conditions: when returned directly the execution is probably much faster and no extra threads are needed. only when the dowork is executed more threads are needed. However the parameter (l in my case) contains the combined list of all executions.
Could this be the cause of the logging difference?
I think you just misunderstood what localFinally means. It's not called for each item, it's called for each thread that is used by Parallel.ForEach(). And many items can share the same thread.
The reason why it exists is that you can perform some aggregation independently on each thread, and join them together only in the end. This way, you have to deal with synchronization (and have it impact your performance) only in a very small piece of code.
For example, if you want to compute the sum of score for a collection of items, you could do it like this:
int totalSum = 0;
Parallel.ForEach(
collection, item => Interlocked.Add(ref totalSum, ComputeScore(item)));
But here, you call Interlocked.Add() for every item, which can be slow. Using localInit and localFinally, you can rewrite the code like this:
int totalSum = 0;
Parallel.ForEach(
collection,
() => 0,
(item, state, localSum) => localSum + ComputeScore(item),
localSum => Interlocked.Add(ref totalSum, localSum));
Notice that the code uses Interlocked.Add() only in the localFinally and does access the global state in body. This way, the cost of synchronization is paid only a few times, once for each thread used.
Note: I used Interlocked in this example, because it is very simple and quite obviously correct. If the code was more complicated, I would use lock first, and try to use Interlocked only when it was necessary for good performance.