Look at these loops (a Version 3, starting from 3, follows the same pattern and appears in the results below):
const int arrayLength = ...
Version 0
public void RunTestFrom0()
{
    int sum = 0;
    for (int i = 0; i < arrayLength; i++)
        for (int j = 0; j < arrayLength; j++)
            for (int k = 0; k < arrayLength; k++)
                for (int l = 0; l < arrayLength; l++)
                    for (int m = 0; m < arrayLength; m++)
                    {
                        sum += myArray[i][j][k][l][m];
                    }
}
Version 1
public void RunTestFrom1()
{
    int sum = 0;
    for (int i = 1; i < arrayLength; i++)
        for (int j = 1; j < arrayLength; j++)
            for (int k = 1; k < arrayLength; k++)
                for (int l = 1; l < arrayLength; l++)
                    for (int m = 1; m < arrayLength; m++)
                    {
                        sum += myArray[i][j][k][l][m];
                    }
}
Version 2
public void RunTestFrom2()
{
    int sum = 0;
    for (int i = 2; i < arrayLength; i++)
        for (int j = 2; j < arrayLength; j++)
            for (int k = 2; k < arrayLength; k++)
                for (int l = 2; l < arrayLength; l++)
                    for (int m = 2; m < arrayLength; m++)
                    {
                        sum += myArray[i][j][k][l][m];
                    }
}
Results for arrayLength = 50 (averages over multiple runs, compiled for x64):
Version 0: 0.998s (Standard error of the mean 0.001s) total loops: 312500000
Version 1: 1.449s (Standard error of the mean 0.000s) total loops: 282475249
Version 2: 0.774s (Standard error of the mean 0.006s) total loops: 254803968
Version 3: 1.183s (Standard error of the mean 0.001s) total loops: 229345007
If we set arrayLength = 45, then:
Version 0: 0.495s (Standard error of the mean 0.003s) total loops: 184528125
Version 1: 0.527s (Standard error of the mean 0.001s) total loops: 164916224
Version 2: 0.752s (Standard error of the mean 0.001s) total loops: 147008443
Version 3: 0.356s (Standard error of the mean 0.000s) total loops: 130691232
Why is the loop starting from 0 faster than the loop starting from 1, even though it runs more iterations?
And why does the loop starting from 2 behave so strangely?
Update:
I ran each test 10 times (that's where the standard error of the mean comes from).
I also switched the order of the version tests a couple of times; it made no big difference.
Each dimension of myArray has length arrayLength. I initialize it at the beginning, and that time is excluded from the measurement. Every element is 1, so sum gives the total number of loop iterations.
The compiled version is in Release mode, and I run it outside Visual Studio (with VS closed).
Update2:
Now I have discarded myArray completely, do sum++ instead, and added a GC.Collect() call.
public void RunTestConstStartConstEnd()
{
    int sum = 0;
    for (int i = constStart; i < constEnd; i++)
        for (int j = constStart; j < constEnd; j++)
            for (int k = constStart; k < constEnd; k++)
                for (int l = constStart; l < constEnd; l++)
                    for (int m = constStart; m < constEnd; m++)
                    {
                        sum++;
                    }
}
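The timing harness isn't shown; here is a minimal sketch of how one run might be driven, with GC.Collect() before the timed region (assuming System.Diagnostics is available; names are illustrative):
// Hypothetical driver for one timed run: collect first so a GC is
// unlikely to land inside the measured region, then time the loops.
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();

var sw = System.Diagnostics.Stopwatch.StartNew();
RunTestConstStartConstEnd();
sw.Stop();
Console.WriteLine("Elapsed: {0:F3} ms", sw.Elapsed.TotalMilliseconds);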
Update
This appears to me to be a result of an unsuccessful attempt at optimization by the jitter, not the compiler. In short, if the jitter can determine the lower bound is a constant it will do something different which turns out to actually be slower. The basis for my conclusions takes some proving, so bear with me. Or go read something else if you're not interested!
I concluded this after testing four different ways to set the lower bound of the loop:
Hard coded in each level, as in colinfang's question
Use a local variable, assigned dynamically through a command line argument
Use a local variable but assign it a constant value
Use a local variable and assign it a constant value, but first pass the value through a goofy sausage-grinding identity function. This confuses the jitter enough to prevent it from applying its constant-value "optimization".
The compiled intermediate language for all four versions of the looping section is almost identical. The only difference is that in version 1 the lower bound is loaded with the command ldc.i4.#, where # is 0, 1, 2, or 3. That stands for load constant. (See ldc.i4 opcode). In all other versions, the lower bound is loaded with ldloc. This is true even in case 3, where the compiler could infer that lowerBound is really a constant.
The resulting performance is not constant. Version 1 (explicit constant) is slower than version 2 (run-time argument) along similar lines as found by the OP. What is very interesting is that version 3 is also slower, with comparable times to version 1. So even though the IL treats the lower bound as a variable, the jitter appears to have figured out that the value never changes, and substitutes a constant as in version 1, with the corresponding performance reduction. In version 4 the jitter can't infer what I know -- that Confuser is actually an identity function -- and so it leaves the variable as a variable. The resulting performance is the same as the command line argument version (2).
My theory on the cause of the performance difference: The jitter is aware and makes use of the fine details of actual processor architecture. When it decides to use a constant other than 0, it has to actually go fetch that literal value from some storage which is not in the L2 cache. When it is fetching a frequently used local variable it instead reads its value from the L2 cache, which is insanely fast. Normally it doesn't make sense to be taking up room in the precious cache with something as dumb as a known literal integer value. In this case we care more about read time than storage, though, so it has an undesired impact on performance.
Here is the full code for the version 2 (command line arg):
class Program {
static void Main(string[] args) {
List<double> testResults = new List<double>();
Stopwatch sw = new Stopwatch();
int upperBound = int.Parse(args[0]) + 1;
int tests = int.Parse(args[1]);
int lowerBound = int.Parse(args[2]); // THIS LINE CHANGES
int sum = 0;
for (int iTest = 0; iTest < tests; iTest++) {
sum = 0;
GC.Collect();
sw.Reset();
sw.Start();
for (int lvl1 = lowerBound; lvl1 < upperBound; lvl1++)
for (int lvl2 = lowerBound; lvl2 < upperBound; lvl2++)
for (int lvl3 = lowerBound; lvl3 < upperBound; lvl3++)
for (int lvl4 = lowerBound; lvl4 < upperBound; lvl4++)
for (int lvl5 = lowerBound; lvl5 < upperBound; lvl5++)
sum++;
sw.Stop();
testResults.Add(sw.Elapsed.TotalMilliseconds);
}
double avg = testResults.Average();
double stdev = testResults.StdDev();
string fmt = "{0,13} {1,13} {2,13} {3,13}"; string bar = new string('-', 13);
Console.WriteLine();
Console.WriteLine(fmt, "Iterations", "Average (ms)", "Std Dev (ms)", "Per It. (ns)");
Console.WriteLine(fmt, bar, bar, bar, bar);
Console.WriteLine(fmt, sum, avg.ToString("F3"), stdev.ToString("F3"),
((avg * 1000000) / (double)sum).ToString("F3"));
}
}
public static class Ext {
public static double StdDev(this IEnumerable<double> vals) {
double result = 0;
int cnt = vals.Count();
if (cnt > 1) {
double avg = vals.Average();
double sum = vals.Sum(d => Math.Pow(d - avg, 2));
result = Math.Sqrt((sum) / (cnt - 1));
}
return result;
}
}
For version 1: same as above except remove lowerBound declaration and replace all lowerBound instances with literal 0, 1, 2, or 3 (compiled and executed separately).
For version 3: same as above except replace lowerBound declaration with
int lowerBound = 0; // or 1, 2, or 3
For version 4: same as above except replace lowerBound declaration with
int lowerBound = Ext.Confuser<int>(0); // or 1, 2, or 3
Where Confuser is:
public static T Confuser<T>(T d) {
decimal d1 = (decimal)Convert.ChangeType(d, typeof(decimal));
List<decimal> L = new List<decimal>() { d1, d1 };
decimal d2 = L.Average();
if (d1 - d2 < 0.1m) {
return (T)Convert.ChangeType(d2, typeof(T));
} else {
// This will never actually happen :)
return (T)Convert.ChangeType(0, typeof(T));
}
}
Results (50 iterations of each test, in 5 batches of 10):
1: Lower bound hard-coded in all loops:
Program Iterations Average (ms) Std Dev (ms) Per It. (ns)
-------- ------------- ------------- ------------- -------------
Looper0 345025251 267.813 1.776 0.776
Looper1 312500000 344.596 0.597 1.103
Looper2 282475249 311.951 0.803 1.104
Looper3 254803968 282.710 2.042 1.109
2: Lower bound supplied at command line:
Program Iterations Average (ms) Std Dev (ms) Per It. (ns)
-------- ------------- ------------- ------------- -------------
Looper 345025251 269.317 0.853 0.781
Looper 312500000 244.946 1.434 0.784
Looper 282475249 222.029 0.919 0.786
Looper 254803968 201.238 1.158 0.790
3: Lower bound hard-coded but copied to local variable:
Program Iterations Average (ms) Std Dev (ms) Per It. (ns)
-------- ------------- ------------- ------------- -------------
LooperX0 345025251 267.496 1.055 0.775
LooperX1 312500000 345.614 1.633 1.106
LooperX2 282475249 311.868 0.441 1.104
LooperX3 254803968 281.983 0.681 1.107
4: Lower bound hard-coded but ground through Confuser:
Program Iterations Average (ms) Std Dev (ms) Per It. (ns)
-------- ------------- ------------- ------------- -------------
LooperZ0 345025251 266.203 0.489 0.772
LooperZ1 312500000 241.689 0.571 0.774
LooperZ2 282475249 219.533 1.205 0.777
LooperZ3 254803968 198.308 0.416 0.778
That is an enormous array. For all practical purposes you are testing how long it takes your operating system to fetch the value of each element from memory, not how long it takes to compare whether j, k, etc. are less than arrayLength, to increment the counters, and to increment your sum. The latency to fetch those values has little to do with the runtime or jitter per se and a lot to do with whatever else happens to be running on your system as a whole and the current compression and organization of the heap.
In addition, because your array is taking up so much room and being accessed frequently it's quite possible that garbage collection is running during some of your test iterations, which would completely inflate the apparent CPU time.
Try doing your test without the array lookup -- just add 1 (sum++) and then see what happens. To be even more thorough, call GC.Collect() just before each test to avoid a collection during the loop.
I think Version 0 is faster because the compiler generates special code without range checking in that case. See http://msdn.microsoft.com/library/ms973858.aspx (section "Range Check Elimination").
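The pattern described in that article is a loop whose upper bound is the array's own Length; here is a rough sketch of the two shapes (names are illustrative, and whether the check is actually removed depends on the JIT version):
int[] data = new int[50];
int total = 0;

// The JIT can prove 0 <= i < data.Length here, so it can elide the
// per-access bounds check on data[i].
for (int i = 0; i < data.Length; i++)
    total += data[i];

// With a non-zero start and a separately held bound, the JIT may not
// recognize the pattern and may keep the check on every access.
int n = data.Length;
for (int i = 2; i < n; i++)
    total += data[i];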
Just an idea:
Maybe there are bit-shifting optimizations in the loops, so it takes longer when the loops don't start at an even value.
I also don't know whether your particular processor might explain the different results.
Off the top of my head, perhaps there's a compiler optimization for 0 -> length. Check your build settings (release vs debug).
Beyond that, unless it's just an issue of your computer doing other work that is influencing the benchmarks, I'm not sure. Perhaps you should alter your benchmark to run each test multiple times and average the results.
Related
I am trying to figure out if this really is the fastest approach. I want this to be as fast as possible, cache-friendly, and have good time complexity.
DEMO: https://dotnetfiddle.net/BUGz8s
private static void InvokeMe()
{
int hz = horizontal.GetLength(0) * horizontal.GetLength(1);
int vr = vertical.GetLength(0) * vertical.GetLength(1);
int hzcol = horizontal.GetLength(1);
int vrcol = vertical.GetLength(1);
//Determine true from Horizontal information:
for (int i = 0; i < hz; i++)
{
if(horizontal[i / hzcol, i % hzcol] == true)
System.Console.WriteLine("True, on position: {0},{1}", i / hzcol, i % hzcol);
}
//Determine true position from vertical information:
for (int i = 0; i < vr; i++)
{
if(vertical[i / vrcol, i % vrcol] == true)
System.Console.WriteLine("True, on position: {0},{1}", i / vrcol, i % vrcol);
}
}
Pages I read:
Is there a "faster" way to iterate through a two-dimensional array than using nested for loops?
Fastest way to loop through a 2d array?
Time Complexity of a nested for loop that parses a matrix
Determining the big-O runtimes of these different loops?
EDIT: The code example is now closer to what I am dealing with. It's about determining a true point (x, y) in an N*N grid. The information available is two 2D arrays: horizontal and vertical.
To avoid confusion: imagine that, over time, some positions in vertical or horizontal get set to true. This currently works perfectly well. All I am asking about is the current approach of using one for loop per 2D array like this, instead of two nested for loops per 2D array.
The time complexity of the single-loop and nested-loop approaches is the same - O(row * col) (which is O(n^2) when row == col, as in your example) - so any difference in execution time comes from the constant cost of the operations, since the traversal order is the same. You can use BenchmarkDotNet to measure that. The following benchmark:
[SimpleJob]
public class Loops
{
int[, ] matrix = new int[10, 10];
[Benchmark]
public void NestedLoops()
{
int row = matrix.GetLength(0);
int col = matrix.GetLength(1);
for (int i = 0; i < row ; i++)
for (int j = 0; j < col ; j++)
{
matrix[i, j] = i * col + j + 1;
}
}
[Benchmark]
public void SingleLoop()
{
int row = matrix.GetLength(0);
int col = matrix.GetLength(1);
var l = row * col;
for (int i = 0; i < l; i++)
{
matrix[i / col, i % col] = i + 1;
}
}
}
Gives on my machine:
Method        Mean       Error      StdDev     Median
-----------   --------   --------   --------   --------
NestedLoops   144.5 ns    2.94 ns    4.58 ns   144.7 ns
SingleLoop    578.2 ns   11.37 ns   25.42 ns   568.6 ns
So the single loop is actually slower.
If you change the loop body to some "dummy" operation - for example incrementing an outer variable or updating a fixed (say, the first) element of the matrix - you will see that the performance of both loops is roughly the same.
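Here is a sketch of what such dummy-body benchmarks might look like if added to the Loops class above (the method names are made up):
// Hypothetical extra benchmarks: same loop shapes as above, but the body
// only bumps a field, so mostly the loop overhead itself is measured.
int counter;

[Benchmark]
public void NestedLoopsDummy()
{
    int row = matrix.GetLength(0);
    int col = matrix.GetLength(1);
    for (int i = 0; i < row; i++)
        for (int j = 0; j < col; j++)
            counter++;
}

[Benchmark]
public void SingleLoopDummy()
{
    int l = matrix.GetLength(0) * matrix.GetLength(1);
    for (int i = 0; i < l; i++)
        counter++;
}
The benchmarks are run the usual BenchmarkDotNet way, e.g. BenchmarkRunner.Run<Loops>(); from a console Main.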
Did you consider
for (int i = 0; i < row; i++)
{
    for (int j = 0; j < col; j++)
    {
        Console.Write(string.Format("{0:00} ", matrix[i, j]));
    }
    Console.Write(Environment.NewLine + Environment.NewLine);
}
It is basically the same loop as yours, but without the / and % that the compiler may or may not optimize away.
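If you do want a single loop without the division and modulo, you can carry the two indices along yourself; a minimal sketch over the same matrix:
// Single pass over the whole matrix, but the row/column indices are
// carried along instead of being recomputed with / and % on every step.
int row = matrix.GetLength(0);
int col = matrix.GetLength(1);
int r = 0, c = 0;
for (int i = 0; i < row * col; i++)
{
    matrix[r, c] = i + 1;
    c++;
    if (c == col) { c = 0; r++; }
}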
I have a problem that I don't understand with this code:
ilProbekUcz= valuesUcz.Count; //valuesUcz is the list of <float[]>
for (int i = 0; i < ilWezlowDanych; i++) nodesValueArrayUcz[i] = new BitArray(ilProbekUcz);
Stopwatch sw = new Stopwatch();
sw.Start();
for (int i = 0; i < ilProbekUcz; i++)
{
int index = 0;
linia = (float[])valuesUcz[i]; // removing this line does not solve the problem
for (int a = 0; a < ileRazem; a++)
for (int b = 0; b < ileRazem; b++)
if (a != b)
{
bool value = linia[a] >= linia[b];
nodesValueArrayUcz[index][i] = value;
nodesValueArrayUcz[ilWezlowDanychP2 + index][i] = !value;
index++;
}
}
sw.Stop();
When I increase the size of valuesUcz 2x, the execution time is 4x longer.
When I increase the size of valuesUcz 4x, the execution time is 8x longer.
etc...
(ileRazem and ilWezlowDanych stay the same.)
I understand that increasing ilProbekUcz increases the size of the BitArrays, but I tested that many times and it is not the problem - the time should grow linearly. In this code:
ilProbekUcz= valuesUcz.Count; //valuesTest is the list of float[]
for (int i = 0; i < ilWezlowDanych; i++) nodesValueArrayUcz[i] = new BitArray(ilProbekUcz);
BitArray test1 = nodesValueArrayUcz[10];
BitArray test2 = nodesValueArrayUcz[20];
Stopwatch sw = new Stopwatch();
sw.Start();
for (int i = 0; i < ilProbekUcz; i++)
{
int index = 0;
linia = (float[])valuesUcz[i]; // removing this line does not solve the problem
for (int a = 0; a < ileRazem; a++)
for (int b = 0; b < ileRazem; b++)
if (a != b)
{
bool value = linia[a] >= linia[b];
test1[i] = value;
test2[i] = !value;
index++;
}
}
the time grows linearly, so the problem is fetching a BitArray from the array...
Is there any way to do this faster? (I want the time to grow linearly.)
You have to understand that when measuring time there are many factors that make the measurements inaccurate. The biggest factor when you have huge arrays, as in your example, is cache misses. Many times the same code, rewritten with the cache in mind, can be 2-5 or more times faster. Very roughly, here is how the cache works: the cache is memory inside the CPU. It is far faster than RAM, so when you fetch a variable from memory you want that variable to be in the cache and not in RAM. If it is in the cache we say we have a hit, otherwise a miss. Sometimes, not so often, a program is so big that some data gets paged out to disk; in that case there is a huge delay when you fetch it. An example of how the cache behaves:
Let's say we have an array of 10 elements in memory (RAM).
When you read the first element, testArray[0], it is not in the cache, so the CPU brings in that value along with a number of adjacent elements (say 3; the number depends on the CPU), i.e. it stores testArray[0], testArray[1], testArray[2], and testArray[3] in the cache.
Now when we read testArray[1] it is in the cache, so we have a hit. The same goes for testArray[2] and testArray[3]. testArray[4] isn't in the cache, so the CPU fetches it along with the next three: testArray[5], testArray[6], testArray[7].
And so on...
Cache misses are very costly. You might expect an array of double the size to take double the time to process, but this is not true: bigger arrays mean more misses, and the time may increase 2, 3, 4 or more times over what you expect. This is normal, and it is what is happening in your example: from 100 million elements (the first array) you go to 400 million (the second one), and the misses do not merely double but grow far more, as you saw. A very useful trick has to do with the order in which you access an array. In your example, ba1[j][i] = (j % 2) == 0; is much worse than ba1[i][j] = (j % 2) == 0;, and the same goes for ba2[j][i] versus ba2[i][j]. You can test it - just swap i and j. It has to do with how the 2D array is laid out in memory: in the second case you get more hits than in the first.
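For example, with a plain rectangular array, the row-by-row traversal walks memory sequentially while the column-by-column one jumps a whole row per step; a sketch (the size here is arbitrary):
// Row-major traversal: consecutive iterations touch adjacent memory,
// so most accesses are cache hits.
var data = new int[4000, 4000];
long total = 0;
for (int i = 0; i < 4000; i++)
    for (int j = 0; j < 4000; j++)
        total += data[i, j];    // fast: walks memory in order

// Column-major traversal of the same array: every access jumps a full
// row ahead, so far more cache misses for the same amount of work.
for (int j = 0; j < 4000; j++)
    for (int i = 0; i < 4000; i++)
        total += data[i, j];    // slower: strided access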
As you can read from the title, I'm trying to store all the numbers between two numbers in an array.
For example, store the numbers between 21 and 43 (22, 23, 24, 25, 26, 27, 28, 29, ...) in an array.
This is the code; I don't know why, but it prints only the higher number minus one.
class Program
{
static void Main(string[] args)
{
int higher = 43;
int lower = 21;
int[] numbers = new int[22]; //the numbers between 21 and 43 are 22
for (int i = lower; i < higher;i++)
{
for (int a = 0; a < 22; a++)
{
numbers[a] = i;
}
}
for (int c = 0; c < 22; c++)
{
Console.WriteLine(numbers[c]);
}
Console.ReadLine();
}
}
This is the code; I don't know why, but it prints only the higher number minus one.
This question will attract answers giving you a half dozen solutions you can cut and paste to do your assignment.
I note you did not ask a question in your question -- next time, please format your question in the form of a question. The right question to ask here is how do I learn how to spot mistakes in code I've written? because that is the vital skill you lack. Answers that give you the code will not answer that question.
I already gave you a link to a recent answer where I explain that in detail, so study that.
In particular, in your case you have to read the program you wrote as though you had not written it. As though you were coming fresh to the program that someone else wrote and trying to figure out what it does.
The first thing I would do is look at the inner loop and say to myself "what does this do, in words?"
for (int a = 0; a < 22; a++)
{
numbers[a] = i;
}
That is "put the value i in every slot of the array. Now look at the outer loop:
for (int i = lower; i < higher;i++)
{
put the value i in every slot of the array
}
Now the technique to use here is to logically "unroll" the loop. A loop just does something multiple times so write that out. It starts with lower, it goes to higher-1, so that loop does this:
put the value lower in every slot of the array
put the value lower+1 in every slot of the array
…
put the value higher-1 in every slot of the array
What does the third loop do?
print every item in the array
And now you know why it prints the highest number minus one multiple times. Because that's what the program does. We just reasoned it out.
Incidentally the answers posted so far are correct, but some are not the best.
You have a technique that you understand for "do something to every member of an array", and that is:
loop an indexer from 0 to the array size minus one
do something to the array slot at the indexer
But the solutions the other answers are proposing are the opposite:
loop an indexer from the lower to the higher value
compute an index
do something to the array slot at that index
It's important to understand that both are correct, but my feeling is that for the beginner you should stick with the pattern you know. How would we
loop an indexer from 0 to the array size minus one
do something to the array slot at the indexer
for your problem? Let's start with giving you a much better technique for looping the indexer:
for (int i = 0; i < numbers.Length; ++i)
That's a better technique because when you change the size of the array, you don't have to change the loop! And also you are guaranteed that every slot in the array is covered. Design your loops so that they are robust to changes and have good invariants.
Now you have to work out what the right loop body is:
{
int number = i + lower;
numbers[i] = number;
}
Now you know that your loop invariant is "when the loop is done, the array is full of consecutive numbers starting at lower".
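Putting those pieces together, here is a minimal sketch of the whole corrected program, using the loop body above and the array size from the question:
static void Main(string[] args)
{
    int lower = 21;
    int[] numbers = new int[22];         // same size as in the question

    for (int i = 0; i < numbers.Length; ++i)
    {
        int number = i + lower;          // consecutive values starting at lower
        numbers[i] = number;
    }

    for (int c = 0; c < numbers.Length; ++c)
    {
        Console.WriteLine(numbers[c]);   // prints 21 through 42
    }
    Console.ReadLine();
}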
Every time you loop through i, you put that number in every slot of the array. The inner loop is what is causing your issue. A better solution would be:
int higher = 43;
int lower = 21;
int[] numbers = new int[21];
int index = 0;
for (int i = lower + 1; i < higher; i++) // if you want to store everything
// between 21 and 43, you need to
// start with 22, thus lower + 1
{
numbers[index] = i;
index++;
}
for (int c = 0; c < 21; c++)
{
Console.WriteLine(numbers[c]);
}
Console.ReadLine();
Replace a with a direct translation of i
for (int i = lower; i < higher;i++)
{
numbers[i-lower] = i;
}
Use the code below:
int higher = 43;
int lower = 21;
int[] numbers = new int[22];
for (int i = lower + 1; i < higher; i++)
{
    numbers[i - lower] = i; // fills indices 1..21 with 22..42
}
for (int c = 1; c < 22; c++)
{
    Console.WriteLine(numbers[c]);
}
Console.ReadLine();
I think higher & lower are variables so following will give you output
for any higher and lower numbers
class Program
{
static void Main(string[] args)
{
int higher = 43;
int lower = 21;
int numDiff = higher - lower - 1; // 21 numbers lie strictly between 21 and 43
int[] numbers = new int[numDiff];
for (int i = 0; i < numbers.Length; i++)
{
numbers[i] = lower + i + 1; // 22, 23, ..., 42
}
for(int b = 0; b<numbers.Length; b++)
{
Console.WriteLine(numbers[b]);
}
Console.ReadLine();
}
}
I have a question regarding this SO post: Understanding How Many Times Nested Loops Will Run
In there, the general formula for 3 nested for loops is n(n+1)(n+2)/3!. I don't really know why the 2nd inner loop runs n+1 times while the outer loop runs n times (wouldn't the inner loop run once more before it exits out of the for loop?). Either way, the general formula is
n(n+1)(n+2)...(n+r-1)
---------------------
r!
Here, r is the number of nested loops.
I am wondering if this formula is always the same for nested loops, or if it changes based on what the comparison inside the for loops is. If it is based on the comparison, then how can I determine the formula if on an exam I am given different for loops? How can I come up with this formula if the comparisons in the for loops are not the same as the comparison in the SO post that produces that formula?
You'll have to train your mind to recognize and follow the patterns in execution, and come up with a formula for specific situations. The general rule of thumb is that if one for loop will run the code inside of it x times, and it has a loop inside of it that will run y times, then the code inside the inner loop will run x*y times.
The most common type of for loop starts at zero and increments by 1 until it reaches a certain number, like so:
for(int i = 0; i < x; i++)
for(int j = 0; j < y; j++)
for(int k = 0; k < z; k++)
// code here runs x * y * z times
To answer your question, if you change any part of any of the for loops, it will change the number of times the inner code is executed. You need to identify how many times that will be by thinking about the logical code execution.
for(int i = 1; i < x; i++)
for(int j = 0; j < y * 2; j++)
for(int k = 0; k < z; k += 2)
// code here runs (x - 1) * (y * 2) * (z / 2) times
In the above example, each for loop is tweaked in a different way. Notice how the overall formula for the number of times run stays pretty much the same, but now each loop is running a different number of times each time it gets hit.
Things become more complicated when the loops' variables affect more than one loop.
for(int i = 0; i < x; i++)
for(int j = i; j < y; j++) // notice how `j` starts as `i`
// Code here runs `y` times the first time through the outer loop,
// then `y - 1` times,
// then `y - 2` times,
// ...
// if x < y, the pattern continues until the xth time,
// when this loop runs `y - x` times.
// if x > y, the pattern will stop when y == x, and
// code here will run 0 times for the remainder of
// the loops.
So in this last example, assuming x < y, the loop will run y + (y-1) + (y-2) ... + (y-x) times.
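If you want to sanity-check a closed form like that, it is easy to just count; a small sketch with arbitrary bounds (x and y here are only illustrative values):
// Count the iterations of the dependent loops above and compare with
// the closed form y + (y-1) + ... + (y-x+1).
int x = 5, y = 12;          // arbitrary values with x < y
long count = 0;
for (int i = 0; i < x; i++)
    for (int j = i; j < y; j++)
        count++;

long formula = 0;
for (int i = 0; i < x; i++)
    formula += y - i;       // y + (y-1) + ... + (y-x+1)

Console.WriteLine("{0} == {1}", count, formula); // both print 50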
It changes based on the inner values.
Example.
for (int i = 0; i < 100; i++)
{
//this loop will run 100 times.
for (int i1 = 0; i1 < 100; i1++)
{
// This will run 100 times for every outer loop iteration.
// This means that for each index i in the outer loop this will run 100 times.
// The outer loop runs 100 times and this runs 10,000 times in total.
}
}
for (int i = 0; i < 100; i++)
{
//this loop will run 100 times.
for (int i1 = 0; i1 < 10; i1++)
{
// This will run 10 times for every outer loop iteration.
// This means that for each index i in the outer loop this will run 10 times.
// The outer loop runs 100 times and this runs 1,000 times in total.
}
}
An easier way to look at it may be this.
for (int i = 0; i < 10; i++)
{
//this loop will run 10 times.
Console.WriteLine("int i = " + i.ToString()");
for (int i1 = 0; i1 < 10; i++)
{
// This will run 10 times every outer loop int.
// This means that for each index i in the outer loop this will run 10 times.
// The outer loop runs 10 time and this runs 100 times.
Console.WriteLine("int i2 = " + i2.ToString()");
}
}
That will output this.
int i = 0
int i2 = 0
int i2 = 1
int i2 = 2
int i2 = 3
int i2 = 4
int i2 = 5
int i2 = 6
int i2 = 7
int i2 = 8
int i2 = 9
int i = 1
int i2 = 0
int i2 = 1
int i2 = 2
int i2 = 3
int i2 = 4
int i2 = 5
int i2 = 6
int i2 = 7
int i2 = 8
int i2 = 9
and so on.
The formula is based on the inner loop bounds (on when each loop ends).
Hi and thanks for looking!
Background
I have a computing task that requires either a lot of time, or parallel computing.
Specifically, I need to loop through a list of about 50 images, Base64 encode them, and then calculate the Levenshtein distance between each newly encoded item and values in an XML file containing about 2000 Base64 string-encoded images in order to find the string in the XML file that has the smallest Lev. Distance from the benchmark string.
A regular foreach loop works, but is too slow so I have chosen to use PLINQ to take advantage of my Core i7 multi-core processor:
Parallel.ForEach(candidates, item => findImage(total,currentWinner,benchmark,item));
The task starts brilliantly, racing along at high speed, but then I get an "Out of Memory" exception.
I am using C#, .NET 4, Forms App.
Question
How do I tweak my PLINQ code so that I don't run out of available memory?
Update/Sample Code
Here is the method that is called to iniate the PLINQ foreach:
private void btnGo_Click(object sender, EventArgs e)
{
XDocument doc = XDocument.Load(@"C:\Foo.xml");
var imagesNode = doc.Element("images").Elements("image"); //Each "image" node contains a Base64 encoded string.
string benchmark = tbData.Text; //A Base64 encoded string.
IEnumerable<XElement> candidates = imagesNode;
currentWinner = 1000000; //Set the "current" low score to a million and bubble lower scores into its place iteratively.
Parallel.ForEach(candidates, i => {
dist = Levenshtein(benchmark, i.Element("score").Value);
if (dist < currentWinner)
{
currentWinner = dist;
path = i.Element("path").Value;
}
});
}
...and here is the Levenshtein distance method:
public static int Levenshtein(string s, string t) {
int n = s.Length;
int m = t.Length;
var d = new int[n + 1, m + 1];
// Step 1
if (n == 0)
{
return m;
}
if (m == 0)
{
return n;
}
// Step 2
for (int i = 0; i <= n; d[i, 0] = i++)
{
}
for (int j = 0; j <= m; d[0, j] = j++)
{
}
// Step 3
for (int i = 1; i <= n; i++)
{
//Step 4
for (int j = 1; j <= m; j++)
{
// Step 5
int cost = (t[j - 1] == s[i - 1]) ? 0 : 1;
// Step 6
d[i, j] = Math.Min(
Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
d[i - 1, j - 1] + cost);
}
}
// Step 7
return d[n, m];
}
Thanks in advance!
Update
Ran into this error again today under different circumstances. I was working on a desktop app with high memory demand. Make sure that you have set the project for 64-bit architecture to access all available memory. My project was set on x86 by default and so I kept getting out of memory exceptions. Of course, this only works if you can count on 64-bit processors for your deployment.
End Update
After struggling a bit with this it appears to be operator error:
I was making calls to the UI thread from the parallel threads in order to update progress labels, but I was not doing it in a thread-safe way.
Additionally, I was running the app without the debugger, so there was an uncaught exception each time the code attempted to update the UI thread from a parallel thread which caused the overflow.
Without being an expert on PLINQ, I am guessing that it handles all of the low-level allocation stuff for you as long as you don't make a goofy smelly code error like this one.
Hope this helps someone else.
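For reference, here is a minimal sketch of a thread-safe way to push a progress update from a worker thread back onto the UI thread (the label name is made up, not from the original code):
// Marshal the label update onto the UI thread instead of touching the
// control directly from a Parallel.ForEach worker.
private void ReportProgress(int done, int total)
{
    if (lblProgress.InvokeRequired)
    {
        lblProgress.BeginInvoke(new Action(() =>
            lblProgress.Text = string.Format("{0} / {1}", done, total)));
    }
    else
    {
        lblProgress.Text = string.Format("{0} / {1}", done, total);
    }
}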