Going from Parallel.ForEach to Multithreading - c#

So I converted a recursive function to an iterative one and then used Parallel.ForEach, but when I ran it through VTune it was only really using 2 logical cores for the majority of its run time.
I decided to attempt to use managed threads instead, and converted this code:
for (int N = 2; N <= length; N <<= 1)
{
int maxThreads = 4;
var workGroup = Enumerable.Range(0, maxThreads);
Parallel.ForEach(workGroup, i =>
{
for (int j = ((i / maxThreads) * length); j < (((i + 1) / maxThreads) * length); j += N)
{
for (int k = 0; k < N / 2; k++)
{
int evenIndex = j + k;
int oddIndex = j + k + (N / 2);
var even = output[evenIndex];
var odd = output[oddIndex];
output[evenIndex] = even + odd * twiddles[k * (length / N)];
output[oddIndex] = even + odd * twiddles[(k + (N / 2)) * (length / N)];
}
}
});
}
Into this:
for (int N = 2; N <= length; N <<= 1)
{
int maxThreads = 4;
Thread one = new Thread(() => calculateChunk(0, maxThreads, length, N, output));
Thread two = new Thread(() => calculateChunk(1, maxThreads, length, N, output));
Thread three = new Thread(() => calculateChunk(2, maxThreads, length, N, output));
Thread four = new Thread(() => calculateChunk(3, maxThreads, length, N, output));
one.Start();
two.Start();
three.Start();
four.Start();
}
public void calculateChunk(int i, int maxThreads, int length, int N, Complex[] output)
{
for (int j = ((i / maxThreads) * length); j < (((i + 1) / maxThreads) * length); j += N)
{
for (int k = 0; k < N / 2; k++)
{
int evenIndex = j + k;
int oddIndex = j + k + (N / 2);
var even = output[evenIndex];
var odd = output[oddIndex];
output[evenIndex] = even + odd * twiddles[k * (length / N)];
output[oddIndex] = even + odd * twiddles[(k + (N / 2)) * (length / N)];
}
}
}
The issue is that in the fourth thread, on the last iteration of the N loop, I get an index out of bounds exception for the output array: the index it attempts to access is equal to length.
I cannot pinpoint the cause by debugging, but I believe it has to do with the threads: when I ran the code without the threads it worked as intended.
If any of the code needs changing let me know, I usually have a few people suggest edits. Thanks for your help, I have tried to sort it myself and am fairly certain the problem is occurring in my threading but I can not see how.
PS: The intended purpose is to parallelize this segment of code.

The observed behaviour is almost certainly due to the use of a captured loop iteration variable N. I can reproduce your situation with a simple test:
ConcurrentBag<int> numbers = new ConcurrentBag<int>();
for (int i = 0; i < 10000; i++)
{
Thread t = new Thread(() => numbers.Add(i));
t.Start();
//t.Join(); // Uncomment this to get expected behaviour.
}
// You'd not expect this assert to be true, but most of the time it will be.
Assert.True(numbers.Contains(10000));
Put simply, your for loop is racing to increment N before the value of N can be read by the delegate that executes the calculateChunk call. As a result calculateChunk sees almost random values of N going up to (and including) length << 1, which is the value N is left with once the loop exits (assuming length is a power of two) - that's what's causing your IndexOutOfRangeException.
The output values you'll get will be rubbish too as you can never rely on the value of N being correct.
If you want to safely rewrite the original code to utilize more cores, move Parallel.ForEach from the inner loop to the outer loop. If the number of outer loop iterations is high, the load balancer will be able to do its job properly (which it can't with your current workGroup count of 4 - that number of elements is simply too low).
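As a minimal sketch of avoiding the captured-variable race (the class and method names here are illustrative, not from the question), each iteration copies the loop variable into a fresh local before creating the thread, and all threads are joined before the results are read:

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;

public static class CapturedVariableDemo
{
    // Starts one thread per iteration; each thread records the loop value it sees.
    public static int[] Collect(int iterations)
    {
        var seen = new ConcurrentBag<int>();
        var threads = new Thread[iterations];
        for (int i = 0; i < iterations; i++)
        {
            // A variable declared inside the loop body is new on every iteration,
            // so each lambda captures its own copy instead of the shared counter.
            int copy = i;
            threads[i] = new Thread(() => seen.Add(copy));
            threads[i].Start();
        }
        // Wait for every thread before reading the results.
        foreach (var t in threads) t.Join();
        return seen.OrderBy(x => x).ToArray();
    }
}
```

In the loop from the question, the same idea would mean copying N into a local inside the loop body and joining the four threads before moving on to the next value of N, since each stage reads what the previous stage wrote to output.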

Related

Problem with Heap's algorithm: not all permutations are generated

I wanted to use the recursive version of Heap's algorithm in order to get all permutations of a sequence of natural numbers from 1 to k inclusive, but ran into certain difficulties.
For k = 3, the program outputs 123, 213, 312, 132, but for some reason it does not take 231 and 321 into account. More specifically, according to the video with the implementation of the JavaScript version of the algorithm (https://www.youtube.com/watch?v=xghJNlMibX4), by the fifth permutation k should become equal to 3 (changing in the loop). I don't understand why in my case it reaches 1, and the loop stops executing.
int i, n, temp;
int[] a;
string str = "";
private void button1_Click(object sender, EventArgs e)
{
k = int.Parse(textBox1.Text);
a = new int[k];
for (i = 1; i <= k; i++)
a[i - 1] = i;
Generate(a, k);
}
private void Generate(int[] a, int k)
{
if (k == 1)
{
foreach (int digit in a)
str += digit.ToString();
listBox1.Items.Add(str);
str = "";
return;
}
Generate(a, k - 1);
for (i = 0; i < k - 1; i++)
{
if (k % 2 == 1) Swap(a, 0, k - 1);
else Swap(a, i, k - 1);
Generate(a, k - 1);
}
}
public void Swap(int[] a, int i, int j)
{
temp = a[i];
a[i] = a[j];
a[j] = temp;
}
I focused on a variant of the algorithm found on the Wiki: https://en.wikipedia.org/wiki/Heap%27s_algorithm. Interestingly, the almost identical one which I took from here: https://www.geeksforgeeks.org/heaps-algorithm-for-generating-permutations/ works correctly.
It looks like I haven't been able to rewrite it correctly from a console application for forms.
I can try that version without recursion, but I still want to find out what my mistake was when building a recursive algorithm.
The problem is that your loop variable i is one global variable. This means that when you make a recursive call inside the loop's body, that recursion will alter the value of that loop variable. When the recursion comes back from where it was initiated, i will no longer have the same value, and the loop will exit prematurely.
So change:
for (i = 0; i < k - 1; i++)
to:
for (int i = 0; i < k - 1; i++)
It is good practice to avoid global variables and to declare variables with the smallest scope possible, right where you need them.
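For reference, here is a self-contained sketch of the recursive algorithm with the loop variable declared locally. It collects the permutations into a list instead of a list box; the class name and the string output format are assumptions made for illustration:

```csharp
using System;
using System.Collections.Generic;

public static class HeapPermutations
{
    // Returns all permutations of 1..n as strings, in the order Heap's algorithm visits them.
    public static List<string> Generate(int n)
    {
        var a = new int[n];
        for (int i = 0; i < n; i++) a[i] = i + 1;
        var result = new List<string>();
        Generate(a, n, result);
        return result;
    }

    private static void Generate(int[] a, int k, List<string> result)
    {
        if (k == 1)
        {
            result.Add(string.Join("", a));
            return;
        }
        Generate(a, k - 1, result);
        // i is local to this call, so recursive calls cannot disturb it.
        for (int i = 0; i < k - 1; i++)
        {
            if (k % 2 == 1) Swap(a, 0, k - 1);
            else Swap(a, i, k - 1);
            Generate(a, k - 1, result);
        }
    }

    private static void Swap(int[] a, int i, int j)
    {
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }
}
```

With n = 3 this yields 123, 213, 312, 132, 231, 321, which includes the two permutations missing from the question's output.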

How to speed up nested loops in C#

This is a piece of my code, which differentiates an image. It works correctly but it takes a long time (because of the height and width).
"Data" is a grey image bitmap.
"Filter" is [3,3] matrix.
"fh" and "fw" maximum values are 3.
I am looking to speed up this code.
I also tried using Parallel.For, but it didn't work correctly (out-of-bounds error).
private float[,] Differentiate(int[,] Data, int[,] Filter)
{
int i, j, k, l, Fh, Fw;
Fw = Filter.GetLength(0);
Fh = Filter.GetLength(1);
float sum = 0;
float[,] Output = new float[Width, Height];
for (i = Fw / 2; i <= (Width - Fw / 2) - 1; i++)
{
for (j = Fh / 2; j <= (Height - Fh / 2) - 1; j++)
{
sum=0;
for(k = -Fw/2; k <= Fw/2; k++)
{
for(l = -Fh/2; l <= Fh/2; l++)
{
sum = sum + Data[i+k, j+l] * Filter[Fw/2+k, Fh/2+l];
}
}
Output[i,j] = sum;
}
}
return Output;
}
For parallel execution you need to drop the C-style variable declarations at the beginning of the method and declare each variable in the scope where it is actually used, so that they are not shared between threads. Making it parallel should provide some performance benefit, but turning every loop into a Parallel.For is not a good idea, as there is a limit to the number of threads that can actually run in parallel. I would try parallelizing the top-level loop only:
private static float[,] Differentiate(int[,] Data, int[,] Filter)
{
var Fw = Filter.GetLength(0);
var Fh = Filter.GetLength(1);
float[,] Output = new float[Width, Height];
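// The upper bound of Parallel.For is exclusive, so Width - Fw / 2 matches the original "i <= (Width - Fw / 2) - 1".
Parallel.For(Fw / 2, Width - Fw / 2, (i, state) =>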
{
for (var j = Fh / 2; j <= (Height - Fh / 2) - 1; j++)
{
var sum = 0;
for (var k = -Fw / 2; k <= Fw / 2; k++)
{
for (var l = -Fh / 2; l <= Fh / 2; l++)
{
sum = sum + Data[i + k, j + l] * Filter[Fw / 2 + k, Fh / 2 + l];
}
}
Output[i, j] = sum;
}
});
return Output;
}
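A self-contained sketch of the same approach, parallelizing only the outermost loop. It assumes odd filter dimensions and takes the image size from the data array rather than from Width and Height fields; the class and method names are hypothetical. Note that the upper bound of Parallel.For is exclusive:

```csharp
using System;
using System.Threading.Tasks;

public static class ParallelFilter
{
    // Applies the filter to every interior pixel; border pixels are left at 0.
    public static float[,] Apply(int[,] data, int[,] filter)
    {
        int width = data.GetLength(0), height = data.GetLength(1);
        int fw = filter.GetLength(0), fh = filter.GetLength(1);
        var output = new float[width, height];

        // Each value of i is handled by exactly one worker and writes only to
        // output[i, *], so no locking is needed.
        Parallel.For(fw / 2, width - fw / 2, i =>
        {
            for (int j = fh / 2; j < height - fh / 2; j++)
            {
                float sum = 0; // declared inside the lambda, so it is not shared between threads
                for (int k = -fw / 2; k <= fw / 2; k++)
                    for (int l = -fh / 2; l <= fh / 2; l++)
                        sum += data[i + k, j + l] * filter[fw / 2 + k, fh / 2 + l];
                output[i, j] = sum;
            }
        });
        return output;
    }
}
```

Whether this is actually faster depends on the image size; for small images the scheduling overhead can outweigh the gain.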
This is a perfect example of a task where using the GPU is better than using the CPU. A GPU is able to perform trillions of floating point operations per second (TFlops), while CPU performance is still measured in GFlops. The catch is that it's only any good if you use SIMD instructions (Single Instruction Multiple Data). The GPU excels at data-parallel tasks. If different data needs different instructions, using the GPU has no advantage.
In your program, the elements of your bitmap go through the same calculations: the same computations just with slightly different data (SIMD!). So using the GPU is a great option. This won't be too complex because with your calculations threads on the GPU would not need to exchange information, nor would they be dependent on results of previous iterations (Each element would be processed by a different thread on the GPU).
You can use, for example, OpenCL to easily access the GPU. More on OpenCL and using the GPU here: https://www.codeproject.com/Articles/502829/GPGPU-image-processing-basics-using-OpenCL-NET

My multithread creates or overwrites an extra/existing thread

I want to spread a test over multiple threads, so first I divide the total numbers I want tested equally over the number of threads I want (including the remainder). Then I assign the testing of that range of numbers to each thread. Every time the test is positive, a counter is incremented. I test it over a range from 0 to 123, as I know what the result should be, but whenever I assign more than 1 thread to the task, I get the wrong result. When debugging, I noticed that after the threads are started, the current line skips back to assigning a new thread. I don't understand why. I have the counter protected by a lock, which as far as I know is working properly. Here is the corresponding code:
for (int n = 0; n < remainder; n++)
{
workers[n] = new Thread(() => CountNumbers(startvalue + n * (tasks + 1), startvalue + (n + 1) * (tasks + 1), modulus));
}
for (int m = remainder; m < nrthreads; m++)
{
workers[m] = new Thread(() => CountNumbers(startvalue + remainder * (tasks + 1) + (m - remainder) * tasks, startvalue + remainder * (tasks + 1) + (m - remainder + 1) * tasks, modulus));
}
for (int k = 0; k < nrthreads; k++)
{
workers[k].Start();
}
for (int k = 0; k < nrthreads; k++)
{
workers[k].Join();
}
The "problem" arises when workers[k].Start() has finished for all k; then for some reason it overwrites the last thread in workers.
I'm relatively new to C#, so I struggle with finding the flaw. It is for school, so hints in the right direction are probably more appropriate than clean answers.
This is a common error in C# parallel programming. When you declare anonymous functions through lambda expressions, they capture any variables (not values) that they reference. In your case, all your threads are capturing the same variable instance for the n or m counter, leading all their executions to see its last value.
A simple way to resolve this is to declare another variable within the scope of your loop, and copy the counter to it. Since the scope is restricted to the loop, the variable will not be shared across your threads.
for (int nOuter = 0; nOuter < remainder; nOuter++)
{
int n = nOuter;
workers[n] = new Thread(() => CountNumbers(startvalue + n * (tasks + 1), startvalue + (n + 1) * (tasks + 1), modulus));
}
for (int mOuter = remainder; mOuter < nrthreads; mOuter++)
{
int m = mOuter;
workers[m] = new Thread(() => CountNumbers(startvalue + remainder * (tasks + 1) + (m - remainder) * tasks, startvalue + remainder * (tasks + 1) + (m - remainder + 1) * tasks, modulus));
}
Edit: You can simplify your code if you switch to using PLINQ or TPL constructs. The following should be equivalent to your entire logic:
Parallel.For(0, nrthreads, k =>
{
if (k < remainder)
CountNumbers(startvalue + k * (tasks + 1), startvalue + (k + 1) * (tasks + 1), modulus);
else
CountNumbers(startvalue + remainder * (tasks + 1) + (k - remainder) * tasks, startvalue + remainder * (tasks + 1) + (k - remainder + 1) * tasks, modulus);
});
You're missing a lock on something, because you are out of sync. I had the same kind of issue once before, only I was iterating a list and I had a valid index value (impossible to be out of range when you use list.Count), yet it was out of range, because I did not have a proper lock.
The issue is probably in your CountNumbers Method.

Counting sort - implementation differences

I heard about Counting Sort and wrote my version of it based on what I understood.
public void my_counting_sort(int[] arr)
{
int range = 100;
int[] count = new int[range];
for (int i = 0; i < arr.Length; i++) count[arr[i]]++;
int index = 0;
for (int i = 0; i < count.Length; i++)
{
while (count[i] != 0)
{
arr[index++] = i;
count[i]--;
}
}
}
The above code works perfectly.
However, the algorithm given in CLRS is different. Below is my implementation
public int[] counting_sort(int[] arr)
{
int k = 100;
int[] count = new int[k + 1];
for (int i = 0; i < arr.Length; i++)
count[arr[i]]++;
for (int i = 1; i <= k; i++)
count[i] = count[i] + count[i - 1];
int[] b = new int[arr.Length];
for (int i = arr.Length - 1; i >= 0; i--)
{
b[count[arr[i]]] = arr[i];
count[arr[i]]--;
}
return b;
}
I've directly translated this from pseudocode to C#. The code doesn't work and I get an IndexOutOfRange Exception.
So my questions are:
What's wrong with the second piece of code ?
What's the difference algorithm wise between my naive implementation and the one given in the book ?
The problem with your version is that it won't work if the elements have satellite data.
CLRS version would work and it's stable.
EDIT:
Here's an implementation of the CLRS version in Python, which sorts pairs (key, value) by key:
def sort(a):
B = 101
count = [0] * B
for (k, v) in a:
count[k] += 1
for i in range(1, B):
count[i] += count[i-1]
b = [None] * len(a)
for i in range(len(a) - 1, -1, -1):
(k, v) = a[i]
count[k] -= 1
b[count[k]] = a[i]
return b
>>> print(sort([(3,'b'),(2,'a'),(3,'l'),(1,'s'),(1,'t'),(3,'e')]))
[(1, 's'), (1, 't'), (2, 'a'), (3, 'b'), (3, 'l'), (3, 'e')]
It should be
b[count[arr[i]]-1] = arr[i];
I'll leave it to you to track down why ;-).
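For comparison, a sketch of the corrected CLRS-style routine in C#. Passing the maximum key k as a parameter is an assumption made here instead of hard-coding it, and the class name is illustrative. After the prefix-sum pass, count[v] holds the number of elements less than or equal to v, which is a 1-based position, so it has to be decremented before it is used as a 0-based index:

```csharp
using System;

public static class CountingSort
{
    // Sorts non-negative integers in the range 0..k; equal keys keep their relative order.
    public static int[] Sort(int[] arr, int k)
    {
        var count = new int[k + 1];
        foreach (int v in arr) count[v]++;

        // count[i] becomes the number of elements <= i.
        for (int i = 1; i <= k; i++) count[i] += count[i - 1];

        var b = new int[arr.Length];
        // Walking backwards keeps the sort stable.
        for (int i = arr.Length - 1; i >= 0; i--)
        {
            count[arr[i]]--;               // convert the 1-based position to a 0-based index
            b[count[arr[i]]] = arr[i];
        }
        return b;
    }
}
```

For example, CountingSort.Sort(new[] { 3, 1, 2, 3, 0, 1 }, 3) returns 0, 1, 1, 2, 3, 3.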
I don't think they perform any differently. The second just pushes the correlation of counts out of the loop so that it's simplified a bit within the final loop. That's not necessary as far as I'm concerned. Your way is just as straightforward and probably more readable. In fact (I don't know about C# since I'm a Java guy) I would expect that you could replace that inner while-loop with a library array fill; something like this:
for (int i = 0; i < count.Length; i++)
{
arrayFill(arr, index, count[i], i);
index += count[i];
}
In Java the method is java.util.Arrays.fill(...).
The problem is that you have hard-coded the length of the count array to 100. The length of the array should be m + 1, where m is the maximum element of the original array. This is the main reason you would consider counting sort in the first place: if you know that all elements of the array are smaller than some constant, it works great.

Help with Creating a Recursive Function C#

I am creating a forecasting application that will run simulations for various "modes" that a production plant is able to run. The plant can run in one mode per day, so I am writing a function that will add up the different modes chosen each day that best maximize the plant’s output and best aligns with the sales forecast numbers provided. This data will be loaded into an array of mode objects that will then be used to calculate the forecast output of the plant.
I have created the functions to do this, however, I need to make them recursive so that I am able to handle any number (within reason) of modes and work days (which varies based on production needs). Listed below is my code using for loops to simulate what I want to do. Can someone point me in the right direction in order to create a recursive function to replace the need for multiple for loops?
Where the method GetNumber4 would be used when there are four modes, and GetNumber5 when there are five modes. The int start would be the number of work days.
private static void GetNumber4(int start)
{
int count = 0;
int count1 = 0;
for (int i = 0; 0 <= start; i++)
{
for (int j = 0; j <= i; j++)
{
for (int k = 0; k <= j; k++)
{
count++;
for (int l = 0; l <= i; l++)
{
count1 = l;
}
Console.WriteLine(start + " " + (count1 - j) + " " + (j - k) + " " + k);
count1 = 0;
}
}
start--;
}
Console.WriteLine(count);
}
private static void GetNumber5(int start)
{
int count = 0;
int count1 = 0;
for (int i = 0; 0 <= start; i++)
{
for (int j = 0; j <= i; j++)
{
for (int k = 0; k <= j; k++)
{
for (int l = 0; l <= k; l++)
{
count++;
for (int m = 0; m <= i; m++)
{
count1 = m;
}
Console.WriteLine(start + " " + (count1 - j) + " " + (j - k) + " " + (k - l) + " " + l);
count1 = 0;
}
}
}
start--;
}
Console.WriteLine(count);
}
EDITED:
I think that it would be more helpful if I gave an example of what I was trying to do. For example, if a plant could run in three modes "A", "B", "C" and there were three work days, then the code will return the following results.
3 0 0
2 1 0
2 0 1
1 2 0
1 1 1
1 0 2
0 3 0
0 2 1
0 1 2
0 0 3
The series of numbers represent the three modes A B C. I will load these results into a Modes object that has the corresponding production rates. Doing it this way allows me to shortcut creating a list of every possible combination; it instead gives me a frequency of occurrence.
Building on one of the solutions already offered, I would like to do something like this.
//Where Modes is a custom class
private static Modes GetNumberRecur(int start, int numberOfModes)
{
if (start < 0)
{
return Modes;
}
//Do work here
GetNumberRecur(start - 1);
}
Thanks to everyone who has already provided input.
Calling GetNumber(5, x) should yield the same result as GetNumber5(x):
static void GetNumber(int num, int max) {
Console.WriteLine(GetNumber(num, max, ""));
}
static int GetNumber(int num, int max, string prefix) {
if (num < 2) {
Console.WriteLine(prefix + max);
return 1;
}
else {
int count = 0;
for (int i = max; i >= 0; i--)
count += GetNumber(num - 1, max - i, prefix + i + " ");
return count;
}
}
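If the results are to be loaded into objects rather than printed, the same recursion can fill a list instead. The following is only a sketch under that assumption; the class and method names are hypothetical, and it assumes at least one mode:

```csharp
using System;
using System.Collections.Generic;

public static class ModeSplits
{
    // Returns every way to distribute `days` work days across `modes` modes.
    public static List<int[]> Enumerate(int modes, int days)
    {
        var results = new List<int[]>();
        Fill(new int[modes], 0, days, results);
        return results;
    }

    private static void Fill(int[] current, int index, int remaining, List<int[]> results)
    {
        if (index == current.Length - 1)
        {
            // The last mode takes whatever days are left, so every row sums to `days`.
            current[index] = remaining;
            results.Add((int[])current.Clone());
            return;
        }
        for (int i = remaining; i >= 0; i--)
        {
            current[index] = i;
            Fill(current, index + 1, remaining - i, results);
        }
    }
}
```

ModeSplits.Enumerate(3, 3) produces 10 rows, starting with 3 0 0, 2 1 0, 2 0 1 and ending with 0 0 3.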
A recursive function just needs a terminating condition. In your case, that seems to be when start is less than 0:
private static void GetNumberRec(int start)
{
if(start < 0)
return;
// Do stuff
// Recurse
GetNumberRec(start-1);
}
I've refactored your example into this:
private static void GetNumber5(int start)
{
var count = 0;
for (var i = 0; i <= start; i++)
{
for (var j = 0; j <= i; j++)
{
for (var k = 0; k <= j; k++)
{
for (var l = 0; l <= k; l++)
{
count++;
Console.WriteLine(
(start - i) + " " +
(i - j) + " " +
(j - k) + " " +
(k - l) + " " +
l);
}
}
}
}
Console.WriteLine(count);
}
Please verify this is correct.
A recursive version should then look like this:
public static void GetNumber(int start, int depth)
{
var count = GetNumber(start, depth, new Stack<int>());
Console.WriteLine(count);
}
private static int GetNumber(int start, int depth, Stack<int> counters)
{
if (depth == 0)
{
Console.WriteLine(FormatCounters(counters));
return 1;
}
else
{
var count = 0;
for (int i = 0; i <= start; i++)
{
counters.Push(i);
count += GetNumber(i, depth - 1, counters);
counters.Pop();
}
return count;
}
}
FormatCounters is left as an exercise to the reader ;)
I previously offered a simple C# recursive function here.
The top-most function ends up having a copy of every permutation, so it should be easily adapted for your needs.
I realize that everyone's beaten me to the punch at this point, but here's a dumb Java algorithm (pretty close to C# syntactically, so you can try it out).
import java.util.ArrayList;
import java.util.List;
/**
* The operational complexity of this is pretty poor and I'm sure you'll be able to optimize
* it, but here's something to get you started at least.
*/
public class Recurse
{
/**
* Base method to set up your recursion and get it started
*
* @param start The total number that digits from all the days will sum up to
* @param days The number of days to split the "start" value across (e.g. 5 days equals
* 5 columns of output)
*/
private static void getNumber(int start,int days)
{
//start recursing
printOrderings(start,days,new ArrayList<Integer>(start));
}
/**
* So this is a pretty dumb recursion. I stole code from a string permutation algorithm that I wrote awhile back. So the
* basic idea to begin with was if you had the string "abc", you wanted to print out all the possible permutations of doing that
* ("abc","acb","bac","bca","cab","cba"). So you could view your problem in a similar fashion...if "start" is equal to "5" and
* days is equal to "4" then that means you're looking for all the possible permutations of (0,1,2,3,4,5) that fit into 4 columns. You have
* the extra restriction that when you find a permutation that works, the digits in the permutation must add up to "start" (so for instance
* [0,0,3,2] is cool, but [0,1,3,3] is not). You can begin to see why this is a dumb algorithm because it currently just considers all
* available permutations and keeps the ones that add up to "start". If you want to optimize it more, you could keep a running "sum" of
* the current contents of the list and break out of your loop when it's greater than "start".
*
* Essentially the way you get all the permutations is to have the recursion choose a new digit at each level until you have a full
* string (or a value for each "day" in your case). It's just like nesting for loops, but the for loop actually only gets written
* once because the nesting is done by each subsequent call to the recursive function.
*
* @param start The total number that digits from all the days will sum up to
* @param days The number of days to split the "start" value across (e.g. 5 days equals
* 5 columns of output)
* @param chosen The current permutation at any point in time, may contain between 0 and "days" numbers.
*/
private static void printOrderings(int start,int days,List<Integer> chosen)
{
if(chosen.size() == days)
{
int sum = 0;
for(Integer i : chosen)
{
sum += i.intValue();
}
if(sum == start)
{
System.out.println(chosen.toString());
}
return;
}
else if(chosen.size() < days)
{
for(int i=0; i <= start; i++) // inclusive, so that a single day may take the whole "start" value
{
if(chosen.size() >= days)
{
break;
}
List<Integer> newChosen = new ArrayList<Integer>(chosen);
newChosen.add(i);
printOrderings(start,days,newChosen);
}
}
}
public static void main(final String[] args)
{
//your equivalent of GetNumber4(5)
getNumber(5,4);
//your equivalent of GetNumber5(5)
getNumber(5,5);
}
}
