Getting Min, Max, Sum with a single parallel for loop - c#

I am trying to get the minimum, maximum and sum (for the average) of a large array. I would love to replace my regular for loop with Parallel.For.
UInt16 tempMin = (UInt16)(Math.Pow(2,mfvm.cameras[openCamIndex].bitDepth) - 1);
UInt16 tempMax = 0;
UInt64 tempSum = 0;
for (int i = 0; i < acquisition.frameDataShorts.Length; i++)
{
if (acquisition.frameDataShorts[i] < tempMin)
tempMin = acquisition.frameDataShorts[i];
if (acquisition.frameDataShorts[i] > tempMax)
tempMax = acquisition.frameDataShorts[i];
tempSum += acquisition.frameDataShorts[i];
}
I know how to solve this using Tasks by cutting up the array myself. However, I would love to learn how to use Parallel.For for this, since as I understand it, it should be able to do this very elegantly.
I found this tutorial from MSDN for calculating the sum, but I have no idea how to extend it to do all three things (min, max and sum) in a single pass.
Results:
OK, I tried the PLINQ solution and saw some serious improvements.
On my i7 (2x4 cores), three passes (Min, Max, Sum) are 4x faster than the sequential approach. However, I tried the same code on a Xeon (2x8 cores) and the results are completely different: the parallel version (again three passes) is actually twice as slow as the sequential approach (which is itself about 5x faster than on my i7).
In the end I split the array myself with the Task Factory, and I get slightly better results on all computers.

I assume that the main issue here is that three different variables have to be carried along on each iteration. You can use a Tuple for this purpose:
var lockObject = new object();
var arr = Enumerable.Range(0, 1000000).ToArray();
long total = 0;
var min = arr[0];
var max = arr[0];
Parallel.For(0, arr.Length,
() => new Tuple<long, int, int>(0, arr[0], arr[0]),
(i, loop, temp) => new Tuple<long, int, int>(temp.Item1 + arr[i], Math.Min(temp.Item2, arr[i]),
Math.Max(temp.Item3, arr[i])),
x =>
{
lock (lockObject)
{
total += x.Item1;
min = Math.Min(min, x.Item2);
max = Math.Max(max, x.Item3);
}
}
);
I must warn you, though, that this implementation runs about 10x slower (on my machine) than the simple for loop approach you demonstrated in your question, so proceed with caution.
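Much of that slowdown plausibly comes from allocating a new Tuple and invoking a delegate for every single element. A minimal sketch of one way to reduce that, assuming the data is a ushort[] like the question's frameDataShorts: hand each worker a contiguous index range via Partitioner.Create and run a plain loop inside it, so the per-element work is the same as in the sequential version. The class and method names (MinMaxSumSketch, Compute) are hypothetical, and whether this beats the sequential loop on a given machine still has to be measured.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class MinMaxSumSketch
{
    // Hypothetical helper (not from the original posts): one parallel pass
    // over the array that returns minimum, maximum and sum together.
    public static (ushort Min, ushort Max, ulong Sum) Compute(ushort[] data)
    {
        if (data == null || data.Length == 0)
            throw new ArgumentException("Array must be non-empty.", nameof(data));

        ushort min = ushort.MaxValue;
        ushort max = ushort.MinValue;
        ulong sum = 0;
        object gate = new object();

        Parallel.ForEach<Tuple<int, int>, (ushort Min, ushort Max, ulong Sum)>(
            // Split the index range into contiguous chunks.
            Partitioner.Create(0, data.Length),
            // Starting state for each worker.
            () => (ushort.MaxValue, ushort.MinValue, 0UL),
            // Plain sequential loop over one chunk; no locking needed here.
            (range, loopState, local) =>
            {
                for (int i = range.Item1; i < range.Item2; i++)
                {
                    ushort v = data[i];
                    if (v < local.Min) local.Min = v;
                    if (v > local.Max) local.Max = v;
                    local.Sum += v;
                }
                return local;
            },
            // Merge each worker's partial result once, under a lock.
            local =>
            {
                lock (gate)
                {
                    if (local.Min < min) min = local.Min;
                    if (local.Max > max) max = local.Max;
                    sum += local.Sum;
                }
            });

        return (min, max, sum);
    }
}
```

The lock is taken only once per worker rather than once per element, so contention stays low. The average is then Sum divided by the array length.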

I don't think Parallel.For is a good fit here, but try this out:
public class MyArrayHandler {
public async Task GetMinMaxSum() {
var myArray = Enumerable.Range(0, 1000);
var maxTask = Task.Run(() => myArray.Max());
var minTask = Task.Run(() => myArray.Min());
var sumTask = Task.Run(() => myArray.Sum());
var results = await Task.WhenAll(maxTask,
minTask,
sumTask);
var max = results[0];
var min = results[1];
var sum = results[2];
}
}
Edit
Just for fun due to the comments regarding performance I took a couple measurements. Also, found this Fastest way to find sum.
#10,000,000 values
GetMinMax: 218ms
GetMinMaxAsync: 308ms
public class MinMaxSumTests {
[Test]
public async Task GetMinMaxSumAsync() {
var myArray = Enumerable.Range(0, 10000000).Select(x => (long)x).ToArray();
var sw = new Stopwatch();
sw.Start();
var maxTask = Task.Run(() => myArray.Max());
var minTask = Task.Run(() => myArray.Min());
var sumTask = Task.Run(() => myArray.Sum());
var results = await Task.WhenAll(maxTask,
minTask,
sumTask);
var max = results[0];
var min = results[1];
var sum = results[2];
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
}
[Test]
public void GetMinMaxSum() {
var myArray = Enumerable.Range(0, 10000000).Select(x => (long)x).ToArray();
var sw = new Stopwatch();
sw.Start();
long tempMin = long.MaxValue; // start from the extremes so any element can replace them
long tempMax = long.MinValue;
long tempSum = 0;
for (int i = 0; i < myArray.Length; i++) {
if (myArray[i] < tempMin)
tempMin = myArray[i];
if (myArray[i] > tempMax)
tempMax = myArray[i];
tempSum += myArray[i];
}
sw.Stop();
Console.WriteLine(sw.ElapsedMilliseconds);
}
}

Do not reinvent the wheel: Min, Max, Sum and similar operations are aggregations. Since .NET 3.5 you have handy LINQ extension methods that already provide the solution:
using System.Linq;
var sequence = Enumerable.Range(0, 10).Select(s => (uint)s).ToList();
Console.WriteLine(sequence.Sum(s => (double)s));
Console.WriteLine(sequence.Max());
Console.WriteLine(sequence.Min());
Though they are declared as extensions for IEnumerable, they have some internal optimizations for IList and Array types, so you should measure how your code performs on those types and on plain IEnumerables.
In your case this isn't enough, as you clearly do not want to iterate over the array more than once, so here is where the magic comes in: PLINQ (a.k.a. Parallel LINQ). You only need to add one method call to aggregate your array in parallel:
var sequence = Enumerable.Range(0, 10000000).Select(s => (uint)s).AsParallel();
Console.WriteLine(sequence.Sum(s => (double)s));
Console.WriteLine(sequence.Max());
Console.WriteLine(sequence.Min());
This option adds some synchronization overhead, but it does scale well, giving similar times for both small and big enumerations. From MSDN:
PLINQ is usually the recommended approach whenever you need to apply the parallel aggregation pattern to .NET applications. Its declarative nature makes it less prone to error than other approaches, and its performance on multicore computers is competitive with them.
Implementing parallel aggregation with PLINQ doesn't require adding locks in your code. Instead, all the synchronization occurs internally, within PLINQ.
However, if you still want to investigate the performance of different kinds of operations, you can use the Parallel.For and Parallel.ForEach method overloads with an aggregation approach, something like this:
double[] sequence = ...
object lockObject = new object();
double sum = 0.0d;
Parallel.ForEach(
// The values to be aggregated
sequence,
// The local initial partial result
() => 0.0d,
// The loop body
(x, loopState, partialResult) =>
{
return Normalize(x) + partialResult;
},
// The final step of each local context
(localPartialSum) =>
{
// Enforce serial access to single, shared result
lock (lockObject)
{
sum += localPartialSum;
}
}
);
return sum;
If you need additional partition for your data, you can use a Partitioner for the methods:
var rangePartitioner = Partitioner.Create(0, sequence.Length);
Parallel.ForEach(
// The input intervals
rangePartitioner,
// same code here);
The Aggregate method can also be used with PLINQ, given some merge logic to combine the partial results of each partition.
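As a rough sketch of what that merge logic can look like for the original question, the ParallelEnumerable.Aggregate overload that takes a seed factory, an update function, a combine function and a result selector can carry min, max and sum in one accumulator. The class and method names below (PlinqAggregateSketch, MinMaxSum) are made up for illustration, and an int[] input is assumed.

```csharp
using System;
using System.Linq;

public static class PlinqAggregateSketch
{
    // Hypothetical helper: min, max and sum in a single PLINQ aggregation.
    public static (int Min, int Max, long Sum) MinMaxSum(int[] values)
    {
        if (values == null || values.Length == 0)
            throw new ArgumentException("Array must be non-empty.", nameof(values));

        return values.AsParallel()
            .Aggregate<int, (int Min, int Max, long Sum), (int Min, int Max, long Sum)>(
                // seedFactory: a fresh accumulator for each partition.
                () => (int.MaxValue, int.MinValue, 0L),
                // updateAccumulatorFunc: fold one element into a partition's accumulator.
                (acc, x) => (Math.Min(acc.Min, x), Math.Max(acc.Max, x), acc.Sum + x),
                // combineAccumulatorsFunc: the merge step between two partitions.
                (a, b) => (Math.Min(a.Min, b.Min), Math.Max(a.Max, b.Max), a.Sum + b.Sum),
                // resultSelector: return the merged accumulator unchanged.
                acc => acc);
    }
}
```

No explicit lock appears here because the combining is done inside PLINQ. The delegate call per element still costs something, so this should be timed against a plain loop before it is adopted.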
Useful links:
Parallel Aggregation
Enumerable.Min<TSource>(IEnumerable<TSource>) method
Enumerable.Sum method
Enumerable.Max<TSource> (IEnumerable<TSource>) method

Related

Summing every element in a byte array

Now, I'm new to threading and async / sync programming and all that stuff. So, I've been practicing and saw this problem on youtube. The problem was to sum every content of a byte array. It was from the channel called Jamie King. He did this with threads. I've decided to do this with task. I made it asynchronous and it was slower than the synchronous one. The difference between the two was 360 milliseconds! I wonder if any of you could do it faster in an asynchronous way. If so, please post it!
Here's mine:
static Random Random = new Random(999);
static byte[] byteArr = new byte[100_000_000];
static byte TaskCount = (byte)Environment.ProcessorCount;
static int readingLength;
static void Main(string[] args)
{
for (int i = 0; i < byteArr.Length; i++)
{
byteArr[i] = (byte)Random.Next(11);
}
SumAsync(byteArr);
}
static async void SumAsync(byte[] bytes)
{
readingLength = bytes.Length / TaskCount;
int sum = 0;
Console.WriteLine("Running...");
Stopwatch watch = new Stopwatch();
watch.Start();
for (int i = 0; i < TaskCount; i++)
{
Task<int> task = SumPortion(bytes.SubArray(i * readingLength, readingLength));
int result = await task;
sum += result;
}
watch.Stop();
Console.WriteLine("Done! Time took: {0}, Result: {1}", watch.ElapsedMilliseconds, sum);
}
static async Task<int> SumPortion(byte[] bytes)
{
Task<int> task = Task.Run(() =>
{
int sum = 0;
foreach (byte b in bytes)
{
sum += b;
}
return sum;
});
int result = await task;
return result;
}
Note that bytes.SubArray is an extension method. I have one question. Is asynchronous programming slower than synchronous programming?
Please point out my mistakes.
Thanks for your time!
You need to use WhenAll() and return all of the tasks at the end:
static async void SumAsync(byte[] bytes)
{
readingLength = bytes.Length / TaskCount;
int sum = 0;
Console.WriteLine("Running...");
Stopwatch watch = new Stopwatch();
watch.Start();
// Task<int>[] (not Task[]) so that WhenAll returns the int results.
var results = new Task<int>[TaskCount];
for (int i = 0; i < TaskCount; i++)
{
Task<int> task = SumPortion(bytes.SubArray(i * readingLength, readingLength));
results[i] = task;
}
int[] result = await Task.WhenAll(results);
watch.Stop();
Console.WriteLine("Done! Time took: {0}, Result: {1}", watch.ElapsedMilliseconds, result.Sum());
}
When you await each task inside the loop, the next task isn't started until the previous one has finished, so the work runs sequentially. Collecting the tasks and awaiting Task.WhenAll() lets them run in parallel, which saves a lot of unnecessary waiting.
You can read more about it on learn.microsoft.com.
Asynchronous code is not inherently slower; it runs in the background (for example, while waiting for a connection to a website to be established) so that the main thread is not blocked while it waits for something to happen.
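A variation on the same idea, as a sketch only: instead of copying each portion with the SubArray extension, every task can read its own index range of the shared array, and the last task can pick up the leftover bytes when the length isn't evenly divisible by the task count. The names ChunkedSumSketch and SumAsync(bytes, taskCount) are hypothetical, and a long total is used to avoid overflow on large inputs.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ChunkedSumSketch
{
    // Hypothetical helper: sums a byte array with taskCount parallel tasks.
    public static async Task<long> SumAsync(byte[] bytes, int taskCount)
    {
        if (bytes == null) throw new ArgumentNullException(nameof(bytes));
        if (taskCount < 1) throw new ArgumentOutOfRangeException(nameof(taskCount));

        int chunk = bytes.Length / taskCount;
        var tasks = new Task<long>[taskCount];

        for (int t = 0; t < taskCount; t++)
        {
            // Declared inside the loop so each lambda captures its own copy.
            int start = t * chunk;
            // The last task also takes the remainder of the array.
            int end = (t == taskCount - 1) ? bytes.Length : start + chunk;

            tasks[t] = Task.Run(() =>
            {
                long partial = 0;
                for (int i = start; i < end; i++)
                    partial += bytes[i];
                return partial;
            });
        }

        // All tasks are already running; wait for them together.
        long[] partials = await Task.WhenAll(tasks);
        return partials.Sum();
    }
}
```

Reading from a shared array concurrently is safe here because no task writes to it.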
The fastest way to do this is probably going to be to hand-roll a Parallel.ForEach() loop.
Plinq may not even give you a speedup in comparison to a single-threaded approach, and it certainly won't be as fast as Parallel.ForEach().
Here's some sample timing code. When you try this, make sure it's a RELEASE build and that you don't run it under the debugger (which will turn off the JIT optimiser, even if it's a RELEASE build):
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
namespace Demo
{
static class Program
{
static void Main()
{
// Create some random bytes (using a seed to ensure it's the same bytes each time).
var rng = new Random(12345);
byte[] byteArr = new byte[500_000_000];
rng.NextBytes(byteArr);
// Time single-threaded Linq.
var sw = Stopwatch.StartNew();
long sum = byteArr.Sum(x => (long)x);
Console.WriteLine($"Single-threaded Linq took {sw.Elapsed} to calculate sum as {sum}");
// Time single-threaded loop;
sw.Restart();
sum = 0;
foreach (var n in byteArr)
sum += n;
Console.WriteLine($"Single-threaded took {sw.Elapsed} to calculate sum as {sum}");
// Time Plinq
sw.Restart();
sum = byteArr.AsParallel().Sum(x => (long)x);
Console.WriteLine($"Plinq took {sw.Elapsed} to calculate sum as {sum}");
// Time Parallel.ForEach() with partitioner.
sw.Restart();
sum = 0;
Parallel.ForEach
(
Partitioner.Create(0, byteArr.Length),
() => 0L,
(subRange, loopState, threadLocalState) =>
{
for (int i = subRange.Item1; i < subRange.Item2; i++)
threadLocalState += byteArr[i];
return threadLocalState;
},
finalThreadLocalState =>
{
Interlocked.Add(ref sum, finalThreadLocalState);
}
);
Console.WriteLine($"Parallel.ForEach with partioner took {sw.Elapsed} to calculate sum as {sum}");
}
}
}
The results I get with an x64 build on my octo-core PC are:
Single-threaded Linq took 00:00:03.1160235 to calculate sum as 63748717461
Single-threaded took 00:00:00.7596687 to calculate sum as 63748717461
Plinq took 00:00:01.0305913 to calculate sum as 63748717461
Parallel.ForEach with partioner took 00:00:00.0839141 to calculate sum as 63748717461
The results I get with an x86 build are:
Single-threaded Linq took 00:00:02.6964067 to calculate sum as 63748717461
Single-threaded took 00:00:00.8200462 to calculate sum as 63748717461
Plinq took 00:00:01.1251899 to calculate sum as 63748717461
Parallel.ForEach with partioner took 00:00:00.1084805 to calculate sum as 63748717461
As you can see, the Parallel.ForEach() with the x64 build is fastest (probably because it's calculating a long total, rather than because of the larger address space).
The Plinq is around three times faster than the Linq non-threaded solution.
The Parallel.ForEach() with a partitioner is more than 30 times faster.
But notably, the non-linq single-threaded code is faster than the Plinq code. In this case, using Plinq is pointless; it makes things slower!
This tells us that the speedup isn't just from multithreading - it's also related to the overhead of Linq and Plinq in comparison to hand-rolling the loop.
Generally speaking, you should only use Plinq when the processing of each element takes a relatively long time (and adding a byte to a running total takes a very short time).
The advantage of Plinq over Parallel.ForEach() with a partitioner is that it is much simpler to write - however, if it winds up being slower than a simple foreach loop then its utility is questionable. So timing things before choosing a solution is very important!

ParallelEnumerable.Aggregate for several methods

I'm starting to learn multithreading. I have three methods that calculate the sum, average, and product of square roots of an array.
At first, I made three separate blocking calls using PLINQ. Then I thought it would be nice to do it in a single call and return an object with the sum, product, and average at the same time. I read that ParallelEnumerable.Aggregate can help me with this, but I have no idea how to use it.
I would be really grateful for an explanation of how to use this function in my case, and of the good and bad aspects of this approach.
public static double Average(double[] array, string tool)
{
if (array == null) throw new ArgumentNullException(nameof(array));
double sum = Enumerable.Sum(array);
double result = sum / array.Length;
Print(tool, result);
return result;
}
public static double Sum(double[] array, string tool)
{
if (array == null) throw new ArgumentNullException(nameof(array));
double sum = Enumerable.Sum(array);
Print(tool, sum);
return sum;
}
public static void ProductOfSquareRoots(double[] array, string tool)
{
if (array == null) throw new ArgumentNullException(nameof(array));
double result = 1;
foreach (var number in array)
{
result = result * Math.Sqrt(number);
}
Print(tool, result);
}
The three aggregated values (average, sum and product of square roots) that you want to compute can each be computed by performing a single pass over the numbers. Instead of doing this three times (one for each aggregated value) you can do this once and aggregate the three values inside the loop (this should save time).
The average is the sum divided by the count and as you already are computing the sum you only need the count in addition to get the average. If you know the size of the input you don't even have to count the items but here I assume that the size of the input is unknown in advance.
If you want to use LINQ you can use Aggregate:
var aggregate = numbers.Aggregate(
// Starting value for the accumulator.
(Count: 0, Sum: 0D, ProductOfSquareRoots: 1D),
// Update the accumulator with a specific number.
(accumulator, number) =>
{
accumulator.Count += 1;
accumulator.Sum += number;
accumulator.ProductOfSquareRoots *= Math.Sqrt(number);
return accumulator;
});
The variable aggregate is a ValueTuple<int, double, double> with the items Count, Sum and ProductOfSquareRoots. Before C# 7 you would use an anonymous type. However, that would require an allocation for each value in the input sequence slowing down the aggregation. By using a mutable value tuple the aggregation should become faster.
Aggregate works with PLINQ, so if numbers is of type ParallelQuery<T> and not IEnumerable<T> then the aggregation will be performed in parallel. Notice that this requires the aggregation to be both associative (e.g. (a + b) + c = a + (b + c)) and commutative (e.g. a + b = b + a), which in your case is true.
PLINQ has an overhead, so it might not perform better than single-threaded LINQ, depending on the number of elements in your sequence and how complex the calculations are. You will have to measure this yourself to determine whether PLINQ speeds things up. However, you can use the same Aggregate expression in both LINQ and PLINQ, making it easy to switch your code from single-threaded to parallel by inserting AsParallel() in the right place.
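For the parallel case, here is a sketch using the ParallelEnumerable.Aggregate overload that also takes a function for combining the accumulators of two partitions, which is the overload that lets PLINQ merge per-partition results. The class name StatsSketch and the shape of the returned tuple are assumptions made for the example, not part of the question's code.

```csharp
using System;
using System.Linq;

public static class StatsSketch
{
    // Hypothetical helper: sum, average and product of square roots in one pass.
    public static (double Sum, double Average, double ProductOfSquareRoots) Compute(double[] array)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));

        return array.AsParallel()
            .Aggregate<double, (int Count, double Sum, double Product), (double, double, double)>(
                // A fresh accumulator per partition; the product starts at 1.
                () => (0, 0.0, 1.0),
                // Fold one number into a partition's accumulator.
                (acc, x) => (acc.Count + 1, acc.Sum + x, acc.Product * Math.Sqrt(x)),
                // Merge two partitions: counts and sums add, products multiply.
                (a, b) => (a.Count + b.Count, a.Sum + b.Sum, a.Product * b.Product),
                // Derive the average from the merged sum and count.
                acc => (acc.Sum,
                        acc.Count == 0 ? double.NaN : acc.Sum / acc.Count,
                        acc.Product));
    }
}
```

Because floating-point addition and multiplication are not exactly associative, the parallel result can differ from the sequential one in the last few digits.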
Note: you must initialize the result variable with the value 1, because otherwise you will always get 0.
Note 2: instead of Enumerable.Sum(array), just write array.Sum().
No, the Aggregate method won't help you to calculate the three functions at the same time. See Martin Liversage answer.
KISS ;)
if (array == null) throw new ArgumentNullException(nameof(array));
var sum = array.Sum();
var average = array.Average();
var product = array.Aggregate(1.0, (acc, val) => acc * Math.Sqrt(val));
Can be simplified:
var average = sum / array.Length;
This eliminates an extra pass through the array.
Want to parallelize?
var sum = array.AsParallel().Sum();
//var average = array.AsParallel().Average(); // Extra pass!
var average = sum / array.Length; // More fast! Really!
var product = array.AsParallel().Aggregate(1.0, (acc, val) => acc * Math.Sqrt(val));
However, it will probably be slower than the previous method. Such parallelization is justified only for very large collections, with billions of elements.
Each pass through the collection takes time. The fewer passes, the better the performance. We have already eliminated one pass by calculating the average from the sum. Let's get it down to just one.
double sum = 0;
double product = 1;
foreach (var number in array)
{
sum += number;
product = product * Math.Sqrt(number);
}
double average = sum / array.Length;
Three results in one pass! We are the best!
Let's get back to the subject.
The Parallel.Invoke method allows you to execute several functions in parallel, but it does not get the results from them. It is suitable for calculations of the type "fire and forget".
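Parallel.Invoke takes Action delegates, so nothing is returned from them directly, but the lambdas can still assign to captured local variables. A small sketch of that workaround, with the hypothetical names InvokeSketch and Compute:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class InvokeSketch
{
    // Hypothetical helper: runs two aggregations side by side.
    public static (double Sum, double Product) Compute(double[] array)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));

        double sum = 0;
        double product = 1;

        // Each action writes to a different captured variable,
        // so the two actions never touch the same data.
        Parallel.Invoke(
            () => sum = array.Sum(),
            () => product = array.Aggregate(1.0, (acc, val) => acc * Math.Sqrt(val)));

        // Parallel.Invoke returns only after every action has finished,
        // so both variables are safe to read here.
        return (sum, product);
    }
}
```

This still walks the array once per action, so it trades extra passes for concurrency.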
We can parallelize the computation by running multiple tasks, then use Task.WhenAll to wait for them all to complete and collect the results.
var results = await Task.WhenAll(
Task.Run(() => array.Sum()),
Task.Run(() => array.Average()),
Task.Run(() => array.Aggregate(1.0, (acc, val) => acc * Math.Sqrt(val)))
);
var sum = results[0];
var average = results[1];
var product = results[2];
It is also not effective for a small size collection. But it may be more efficient than the AsParallel in some cases.
Another way of writing this approach with tasks. Perhaps it will seem clearer.
var sumTask = Task.Run(() => array.Sum());
var avgTask = Task.Run(() => array.Average());
var prodTask = Task.Run(() => array.Aggregate(1.0, (acc, val) => acc * Math.Sqrt(val)));
Task.WaitAll(sumTask, avgTask, prodTask);
sum = sumTask.Result;
average = avgTask.Result;
product = prodTask.Result;

Using IEnumerable.TakeWhile but it only returns one set of results

First off I want to apologize if my code is bad or if my description is poor. This is one of my first times working with C# threading/tasks. What I'm trying to do in my code is to go through a list of names and for each 50 names in the list, start a new task and pass off those 50 names to another method that will perform calculation heavy methods on the data. My code only works for the first 50 names in the list and it returns 0 results for every other time and I can't seem to figure out why.
public static async void startInitialDownload(string value)
{
IEnumerable<string> names = await Helper.getNames(value, 0);
decimal multiple = names.Count() / 50;
string[] results;
int num1 = 0;
int num2 = 0;
for (int i = 0; i < multiple + 1; i++)
{
num1 = i * 50;
num2 = (50 * (i + 1));
results = names.TakeWhile((name, index) => index >= num1 && index < num2).ToArray();
Task current = Task.Factory.StartNew(() => getCurrentData(results));
await current.ConfigureAwait(false);
}
}
Realise the enumerable into a list, so that it will be calculated once, not each iteration in the loop. You can use the Skip and Take methods to get a range of the list:
public static async void startInitialDownload(string value) {
IEnumerable<string> names = await Helper.getNames(value, 0);
List<string> nameList = names.ToList();
for (int i = 0; i < nameList.Count; i += 50) {
string[] results = nameList.Skip(i).Take(50).ToArray();
Task current = Task.Factory.StartNew(() => getCurrentData(results));
await current.ConfigureAwait(false);
}
}
Or you can add items to a list, and execute it when it has the right size:
public static async void startInitialDownload(string value) {
IEnumerable<string> names = await Helper.getNames(value, 0);
List<string> buffer = new List<string>();
foreach (string s in names) {
buffer.Add(s);
if (buffer.Count == 50) {
Task current = Task.Factory.StartNew(() => getCurrentData(buffer.ToArray()));
await current.ConfigureAwait(false);
buffer = new List<string>();
}
}
if (buffer.Count > 0) {
Task current = Task.Factory.StartNew(() => getCurrentData(buffer.ToArray()));
await current.ConfigureAwait(false);
}
}
The name TakeWhile suggests that it only takes entries while the condition is true. So if it starts off by reading an entry for which the condition is false, it never takes anything.
On the first pass through the loop, num1 is 0, so it reads the entries from num1 up to num2.
On the second pass, num1 is 50. It starts reading from the beginning again, and for the very first entry it hits (index 0) the condition is false, so it stops.
You might try using Where, or calling Skip beforehand.
The tl;dr of it: I don't think your problem has anything to do with parallel tasks; I think it's due to using the wrong LINQ method to pull out the names you want to use.
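To make the difference concrete, here is a small sketch that puts the question's TakeWhile filter next to Skip/Take. BatchSketch and its two methods are illustrative names only.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class BatchSketch
{
    // Same filter as in the question: stops at the first index that fails the test.
    public static string[] WithTakeWhile(IEnumerable<string> names, int batch, int size)
    {
        int from = batch * size;
        int to = from + size;
        return names.TakeWhile((name, index) => index >= from && index < to).ToArray();
    }

    // Skips the earlier batches first, then takes at most one batch.
    public static string[] WithSkipTake(IEnumerable<string> names, int batch, int size)
    {
        return names.Skip(batch * size).Take(size).ToArray();
    }
}
```

For any batch after the first, WithTakeWhile returns an empty array because index 0 already fails the condition, while WithSkipTake returns the expected slice (shorter for the final, partial batch).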
As I understand it from Stephen Cleary's response to a similar (though not identical) question, you don't need to use ConfigureAwait() there.
Here's the link in question: on stack overflow
And here's what I would do instead with the last two lines of your for loop:
Task.Factory.StartNew(() => getCurrentData(results));
That's it. By using the factory, and by not awaiting, you are letting that task run on its own (possibly on a new thread). Provided that your storage is all thread safe (see: System.Collections.Concurrent btw) then you should be all set.
Caveat: if you aren't showing us what lies after the await then your results may vary.
It's not a direct solution, but it might work.
public static IEnumerable<T[]> MakeBuckets<T>(IEnumerable<T> source, int maxSize)
{
List<T> currentBucket = new List<T>(maxSize);
foreach (var s in source)
{
currentBucket.Add(s);
if (currentBucket.Count >= maxSize)
{
yield return currentBucket.ToArray();
currentBucket = new List<T>(maxSize);
}
}
if(currentBucket.Any())
yield return currentBucket.ToArray();
}
Later you can iterate through the result of the MakeBuckets function.

Several Tasks manipulating on same Object

So I was just doing some experiments with the Task class in C# and the following thing happened.
Here is the method I call
static async Task<List<int>> GenerateList(long size, int numOfTasks)
{
var nums = new List<int>();
Task[] tasks = new Task[numOfTasks];
for (int i = 0; i < numOfTasks; i++)
{
tasks[i] = Task.Run(() => nums.Add(Rand.Value.Next())); // Rand is a ThreadLocal<Random>
}
for (long i = 0; i < size; i += numOfTasks)
{
await Task.WhenAll(tasks);
}
return nums;
}
I call this method like this
var nums = GenerateList(100000000, 10).Result;
Before I used Tasks, generation took about 4-5 seconds. After I implemented the method like this, if I pass 10-20 tasks, the generation time drops to 1.8-2.2 seconds, but the List returned by the method has only numOfTasks elements in it, so in this case a List of ten numbers is returned. Maybe I'm writing something wrong. What could be the problem here? Or maybe there is another solution. All I want is for many tasks to add numbers to the same list so that generation is at least twice as fast. Thanks in advance.
WhenAll does not run the tasks; it just (asynchronously) waits for them to complete. Your code is only creating 10 tasks, so that's why you're only getting 10 numbers. Also, as @Mauro pointed out, List<T>.Add is not threadsafe.
If you want to do parallel computation, then use Parallel or Parallel LINQ, not async:
static List<int> GenerateList(int size, int numOfTasks)
{
return Enumerable.Range(0, size)
.AsParallel()
.WithDegreeOfParallelism(numOfTasks)
.Select(_ => Rand.Value.Next())
.ToList();
}
As explained by Stephen, you are only creating 10 tasks.
Also, I believe the Add operation on the generic list is not thread safe. You should use a locking mechanism or, if you are targeting framework 4 or newer, use thread-safe collections.
You are adding to the list in the following loop, which runs only 10 times:
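As one possible illustration of the thread-safe collection route, the sketch below fills a ConcurrentBag<int> from Parallel.For and copies it into a List<int> at the end. SafeListSketch and Generate are hypothetical names, and the ThreadLocal<Random> mirrors the Rand field described in the question.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SafeListSketch
{
    // One Random per thread, since Random itself is not thread safe.
    private static readonly ThreadLocal<Random> Rand =
        new ThreadLocal<Random>(() => new Random());

    // Hypothetical helper: generates `size` random numbers in parallel.
    public static List<int> Generate(int size)
    {
        var bag = new ConcurrentBag<int>();

        // ConcurrentBag<T>.Add may be called from many threads at once.
        Parallel.For(0, size, _ => bag.Add(Rand.Value.Next()));

        // Copy into a regular list once all parallel work has finished.
        return new List<int>(bag);
    }
}
```

The order of the numbers in the resulting list is not defined. For work as cheap as producing one random number, the synchronization cost may outweigh the gain, so timing it against the sequential version is worthwhile.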
for (int i = 0; i < numOfTasks; i++)
{
tasks[i] = Task.Run(() => nums.Add(Rand.Value.Next())); // Rand is a ThreadLocal<Random>
}
you can instead do
for (int i = 0; i < numOfTasks; i++)
{
tasks[i] = new Task(() => nums.Add(Rand.Value.Next()));
}

What do these methods do?

I've found two different methods to get the max value from an array, but I'm not really fond of parallel programming, so I really don't understand them.
I was wondering: do these methods do the same thing, or am I missing something?
I really don't have much information about them. Not even comments...
The first method:
int[] vec = ... (I guess the content doesn't matter)
static int naiveMax()
{
int max = vec[0];
object obj = new object();
Parallel.For(0, vec.Length, i =>
{
lock (obj) {
if (vec[i] > max) max = vec[i];
}
});
return max;
}
And the second one:
static int Max()
{
int max = vec[0];
object obj = new object();
Parallel.For(0, vec.Length, //could be Parallel.For<int>
() => vec[0],
(i, loopState, partial) =>
{
if(vec[i]>partial) partial = vec[i];
return partial;
},
partial => {
lock (obj) {
if( partial > max) max = partial;
}
});
return max;
}
Do these do the same thing or something different, and what? Thanks ;)
Both find the maximum value in an array of integers. In an attempt to find the maximum value faster, they do it "in parallel" using the Parallel.For Method. Both methods fail at this, though.
To see this, we first need a sufficiently large array of integers. For small arrays, parallel processing doesn't give us a speed-up anyway.
int[] values = new int[100000000];
Random random = new Random();
for (int i = 0; i < values.Length; i++)
{
values[i] = random.Next();
}
Now we can run the two methods and see how long they take. Using an appropriate performance measurement setup (Stopwatch, array of 100,000,000 integers, 100 iterations, Release build, no debugger attached, JIT warm-up) I get the following results on my machine:
naiveMax 00:06:03.3737078
Max 00:00:15.2453303
So Max is much much better than naiveMax (6 minutes! cough).
But how does it compare to, say, PLINQ?
static int MaxPlinq(int[] values)
{
return values.AsParallel().Max();
}
MaxPlinq 00:00:11.2335842
Not bad, saved a few seconds. Now, what about a plain, old, sequential for loop for comparison?
static int Simple(int[] values)
{
int result = values[0];
for (int i = 0; i < values.Length; i++)
{
if (result < values[i]) result = values[i];
}
return result;
}
Simple 00:00:05.7837002
I think we have a winner.
Lesson learned: Parallel.For is not pixie dust that you can sprinkle over your code to make it magically run faster. If performance matters, use the right tools and measure, measure, measure, ...
They appear to do the same thing, however they are very inefficient. The point of parallelization is to improve the speed of code that can be executed independently. Due to race conditions, discovering the maximum (as implemented here) requires an atomic semaphore/lock on the actual logic... Which means you're spinning up many threads and related resources simply to do the code sequentially anyway... Defeating the purpose of parallelization entirely.
