In the scenario I present to you, my solution is supposed to represent O(n^2 * log n), and the "pointers" solution, which I assume is the fastest way to resolve the "3SUM" problem, represents O(n^2 * 1); leaving the question of is O(1) faster than O(log n), exampling it with my code.
Could someone explain why this seems to be the case? Please. My logic tells me that O(log n) should be as fast as O(1), if not faster.
I hope my comments on the code of my solution are clear.
Edit: I know that this does not sound very smart... log(n) counts the input (n -> ∞), while 1... is just 1. BUT, in this case, for finding a number, how is it supposed to be faster to do sums and subtractions instead of using binary search (log n)? It just does not enter my head.
LeetCode 3SUM problem description
O(n^2 * log n)
For an input of 3,000 values:
Iterations: 1,722,085 (61% less than the "pointers solution")
Runtime: ~92 ms (270% slower than the typical O(n^2) solution)
public IList<IList<int>> MySolution(int[] nums)
{
IList<IList<int>> triplets = new List<IList<int>>();
Array.Sort(nums);
for (int i = 0; i < nums.Length; i++)
{
// Avoid duplicating results.
if (i > 0 && nums[i] == nums[i - 1])
continue;
for (int j = i+1; j < nums.Length - 1; j++)
{
// Avoid duplicating results.
if (j > (i+1) && nums[j] == nums[j - 1])
continue;
// The solution for this triplet.
int numK = -(nums[i] + nums[j]);
// * This is the problem.
// Search for 'k' index in the array.
int kSearch = Array.BinarySearch(nums, j + 1, nums.Length - (j + 1), numK);
// 'numK' exists in the array.
if (kSearch > 0)
{
triplets.Add(new List<int>() { nums[i], nums[j], numK });
}
// 'numK' is too small, break this loop since its value is just going to increase.
else if (~kSearch == (j + 1))
{
break;
}
}
}
return triplets;
}
O(n^2)
For the same input of 3,000 values:
Iterations: 4.458.579
Runtime: ~34 ms
public IList<IList<int>> PointersSolution(int[] nums)
{
IList<IList<int>> triplets = new List<IList<int>>();
Array.Sort(nums);
for (int i = 0; i < nums.Length; i++)
{
if (i > 0 && nums[i] == nums[i - 1])
continue;
int l = i + 1, r = nums.Length - 1;
while (l < r)
{
int sum = nums[i] + nums[l] + nums[r];
if (sum < 0)
{
l++;
}
else if (sum > 0)
{
r--;
}
else
{
triplets.Add(new List<int>() { nums[i], nums[l], nums[r] });
do
{
l++;
}
while (l < r && nums[l] == nums[l - 1]);
}
}
}
return triplets;
}
It seems that your conceptual misunderstanding comes from the fact that you are missing that Array.BinarySearch does some iterations too (it was indicated by the initial iterations counts in the question which you now have changed).
So while assumption that binary search should be faster than simple iteration trough the collection is pretty valid - you are missing that binary search is basically an extra loop, so you should not compare those two but compare the second for loop + binary search in the first solution against the second loop of the second.
P.S.
To argue about time complexity based on runtimes with at least some degree of certainty you need at least to perform several tests with different increasing number of elements (like 100, 1000, 10000, 100000 ...) and see how the runtime changes. Also different inputs for the same number of elements are recommended cause in theory you can hit some optimal cases for one algorithm which can be the worst case scenarios for another.
Quick interjection--not sure your second solution (pointers) is O(n^2)--It has a third inner loop. (See Stron's response below)
I took a moment to profile you code with a generic .NET profiler and:
That ought to do it, huh? ;)
After checking the implementation, I found that BinarySearch internally uses CompareTo which I imagine isn't ideal (but, being a generic for an unmanaged type, it shouldn't be that bad...)
To "Improve" it, I dragged BinarySearch, kicking and screaming, and replaced the CompareTo with actual comparison operators. I named this benchmark MyImproved Here's the results:
Benchmark.NET results:
Interestingly, Benchmark.NET disregards common sense and puts MyImproved over Pointers. This may be due to some optimization which is turned off by the profiler.
Method
Complexity
Mean
Error
StdDev
Code Size
Pointers
O(n^2)???
76.76 ms
1.465 ms
1.628 ms
1,781 B
My
O(n^2 * log n)
93.08 ms
1.831 ms
3.980 ms
1,999 B
MyImproved
O(n^2 * log n)
62.53 ms
1.234 ms
2.226 ms
1,980 B
TL;DR:
.CompareTo() seemed to be bogging down the implementation of .BinarySearch(). Removing it and using actual integer comparison seemed to help a lot. Either that, or it's some funky interface stuff that I'm not prepared to investigate :)
Two tips:
Use sharplab.io to see your lowered code, it may reveal something (link)
try running these seperate tests through the dotnetBenchmark nuget package, it'll give you more accurate timings, and if the memory usage or allocations is considerably higher in one case, that could be your answer.
Anyway, are you running these tests in debug or release mode? I just had a thought that I haven't tested recently, but I believe that the debugger overhead can significantly affect the performance of a binary search.
Give it a go, and let me know
Related
I have recently come across this line of code and what it does is that it goes through an array and returns the value that is seen most often. For example 1,1,2,1,3 so it will return 1 because it appears more than 2 and 3. What I am trying to do is understand how it works so what I did was I went through it with visual studio step by step but it is not ringing any bells.
Can anyone help me understand what is going on here? It would be a total plus if someone can tell me what does c do and what is the logic behind the arguments in the if statements.
int[] arr = a;
int c = 1, maxcount = 1, maxvalue = 0;
int result = 0;
for (int i = 0; i < arr.Length; i++)
{
maxvalue = arr[i];
for (int j = 0; j <arr.Length; j++)
{
if (maxvalue == arr[j] && j != i)
{
c++;
if (c > maxcount)
{
maxcount = c;
result = arr[i];
}
}
else
{
c=1;
}
}
}
return result;
EDIT: On closer examination, the code snippet has a nested loop and is conventionally counting the maximum seen element by simply keeping track of the maximum seen times and the element that was seen and keeping them in sync.
That looks like an implementation of the Boyer-Moore majority vote counting algorithm. They have a nice illustration here.
The logic is simple, and is to compute the majority in a single pass, taking O(n) time. Note that majority here means that more than 50% of the array must be filled with that element. If there is no majority element, you get an "incorrect" result.
Verifying if the element is actually forming a majority is done in a separate pass usually.
It is not computing the most frequent element - what it is computing is the longest run of elements.
Also, it is not doing it very efficiently, the inner loop only needs to compute upto i-1, not upto arr.Length.
c is keeping track of the current run length. The first "if" is to check if this is a "continouous run". The second "if" (after reaching the last element in the run) will check if this run is longer than any run you have seen so far.
In the above input sample, you are getting 1 as answer because it is the longest run. Try with an input where the element with the longest run is not the same as the most frequent element. (e.g., 2,1,1,1,3,2,3,2,3,2,3,2 - here 2 is the most frequent element, but 1,1,1 is the longest run).
I've always loved reducing number of code lines by using simple but smart math approaches. This situation seems to be one of those that need this approach. So what I basically need is to sum up digits in the odd and even places separately with minimum code. So far this is the best way I have been able to think of:
string number = "123456789";
int sumOfDigitsInOddPlaces=0;
int sumOfDigitsInEvenPlaces=0;
for (int i=0;i<number.length;i++){
if(i%2==0)//Means odd ones
sumOfDigitsInOddPlaces+=number[i];
else
sumOfDigitsInEvenPlaces+=number[i];
{
//The rest is not important
Do you have a better idea? Something without needing to use if else
int* sum[2] = {&sumOfDigitsInOddPlaces,&sumOfDigitsInEvenPlaces};
for (int i=0;i<number.length;i++)
{
*(sum[i&1])+=number[i];
}
You could use two separate loops, one for the odd indexed digits and one for the even indexed digits.
Also your modulus conditional may be wrong, you're placing the even indexed digits (0,2,4...) in the odd accumulator. Could just be that you're considering the number to be 1-based indexing with the number array being 0-based (maybe what you intended), but for algorithms sake I will consider the number to be 0-based.
Here's my proposition
number = 123456789;
sumOfDigitsInOddPlaces=0;
sumOfDigitsInEvenPlaces=0;
//even digits
for (int i = 0; i < number.length; i = i + 2){
sumOfDigitsInEvenPlaces += number[i];
}
//odd digits, note the start at j = 1
for (int j = 1; i < number.length; i = i + 2){
sumOfDigitsInOddPlaces += number[j];
}
On the large scale this doesn't improve efficiency, still an O(N) algorithm, but it eliminates the branching
Since you added C# to the question:
var numString = "123456789";
var odds = numString.Split().Where((v, i) => i % 2 == 1);
var evens = numString.Split().Where((v, i) => i % 2 == 0);
var sumOfOdds = odds.Select(int.Parse).Sum();
var sumOfEvens = evens.Select(int.Parse).Sum();
Do you like Python?
num_string = "123456789"
odds = sum(map(int, num_string[::2]))
evens = sum(map(int, num_string[1::2]))
This Java solution requires no if/else, has no code duplication and is O(N):
number = "123456789";
int[] sums = new int[2]; //sums[0] == sum of even digits, sums[1] == sum of odd
for(int arrayIndex=0; arrayIndex < 2; ++arrayIndex)
{
for (int i=0; i < number.length()-arrayIndex; i += 2)
{
sums[arrayIndex] += Character.getNumericValue(number.charAt(i+arrayIndex));
}
}
Assuming number.length is even, it is quite simple. Then the corner case is to consider the last element if number is uneven.
int i=0;
while(i<number.length-1)
{
sumOfDigitsInEvenPlaces += number[ i++ ];
sumOfDigitsInOddPlaces += number[ i++ ];
}
if( i < number.length )
sumOfDigitsInEvenPlaces += number[ i ];
Because the loop goes over i 2 by 2, if number.length is even, removing 1 does nothing.
If number.length is uneven, it removes the last item.
If number.length is uneven, then the last value of i when exiting the loop is that of the not yet visited last element.
If number.length is uneven, by modulo 2 reasoning, you have to add the last item to sumOfDigitsInEvenPlaces.
This seems slightly more verbose, but also more readable, to me than Anonymous' (accepted) answer. However, benchmarks to come.
Well, the compiler seems to think my code more understandable as well, since he removes it all if I don't print the results (which explains why I kept getting a time of 0 all along...). The other code though is obfuscated enough for even the compiler.
In the end, even with huge arrays, it's pretty hard for clock_t to tell the difference between the two. You get about a third less instructions in the second case, but since everything's in cache (and your running sums even in registers) it doesn't matter much.
For the curious, I've put the disassembly of both versions (compiled from C) here : http://pastebin.com/2fciLEMw
I'm having a problem generating the Terras number sequence.
Here is my unsuccessful attempt:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace Terras
{
class Program
{
public static int Terras(int n)
{
if (n <= 1)
{
int return_value = 1;
Console.WriteLine("Terras generated : " + return_value);
return return_value;
}
else
{
if ((n % 2) == 0)
{
// Even number
int return_value = 1 / 2 * Terras(n - 1);
Console.WriteLine("Terras generated : " + return_value);
return return_value;
}
else
{
// Odd number
int return_value = 1 / 2 * (3 * Terras(n - 1) + 1);
Console.WriteLine("Terras generated : " + return_value);
return return_value;
}
}
}
static void Main(string[] args)
{
Console.WriteLine("TERRAS1");
Terras(1); // should generate 1
Console.WriteLine("TERRAS2");
Terras(2); // should generate 2 1 ... instead of 1 and 0
Console.WriteLine("TERRAS5");
Terras(5); // should generate 5,8,4,2,1 not 1 0 0 0 0
Console.Read();
}
}
}
What am I doing wrong?
I know the basics of recursion, but I don’t understand why this doesn’t work.
I observe that the first number of the sequence is actually the number that you pass in, and subsequent numbers are zero.
Change 1 / 2 * Terros(n - 1); to Terros(n - 1)/2;
Also 1 / 2 * (3 * Terros(n - 1) + 1); to (3 * Terros(n - 1) + 1)/2;
1/2 * ... is simply 0 * ... with int math.
[Edit]
Recursion is wrong and formula is mis-guided. Simple iterate
public static void Terros(int n) {
Console.Write("Terros generated :");
int t = n;
Console.Write(" " + t);
while (t > 1) {
int t_previous = t;
if (t_previous%2 == 0) {
t = t_previous/2;
}
else {
t = (3*t_previous+1)/2;
}
Console.Write(", " + t);
}
Console.WriteLine("");
}
The "n is even" should be "t(subscript n-1) is even" - same for "n is odd".
int return_value = 1 / 2 * Terros(n - 1);
int return_value = 1 / 2 * (3 * Terros(n - 1) + 1);
Unfortunately you've hit a common mistake people make with ints.
(int)1 / (int)2 will always be 0.
Since 1/2 is an integer divison it's always 0; in order to correct the math, just
swap the terms: not 1/2*n but n/2; instead of 1/2* (3 * n + 1) put (3 * n + 1) / 2.
Another issue: do not put computation (Terros) and output (Console.WriteLine) in the
same function
public static String TerrosSequence(int n) {
StringBuilder Sb = new StringBuilder();
// Again: dynamic programming is far better here than recursion
while (n > 1) {
if (Sb.Length > 0)
Sb.Append(",");
Sb.Append(n);
n = (n % 2 == 0) ? n / 2 : (3 * n + 1) / 2;
}
if (Sb.Length > 0)
Sb.Append(",");
Sb.Append(n);
return Sb.ToString();
}
// Output: "Terros generated : 5,8,4,2,1"
Console.WriteLine("Terros generated : " + TerrosSequence(5));
The existing answers guide you in the correct direction, but there is no ultimate one. I thought that summing up and adding detail would help you and future visitors.
The problem name
The original name of this question was “Conjuncture of Terros”. First, it is conjecture, second, the modification to the original Collatz sequence you used comes from Riho Terras* (not Terros!) who proved the Terras Theorem saying that for almost all t₀ holds that ∃n ∈ ℕ: tₙ < t₀. You can read more about it on MathWorld and chux’s question on Math.SE.
* While searching for who is that R. Terras mentioned on MathWorld, I found not only the record on Geni.com, but also probable author of that record, his niece Astrid Terras, and her family’s genealogy. Just for the really curious ones. ☺
The formula
You got the formula wrong in your question. As the table of sequences for different t₀ shows, you should be testing for parity of tₙ₋₁ instead of n.
Formula taken from MathWorld.
Also the second table column heading is wrong, it should read t₀, t₁, t₂, … as t₀ is listed too.
You repeat the mistake with testing n instead of tₙ₋₁ in your code, too. If output of your program is precisely specified (e.g. when checked by an automatic judge), think once more whether you should output t₀ or not.
Integer vs float arithmetic
When making an operation with two integers, you get an integer. If a float is involved, the result is float. In both branches of your condition, you compute an expression of this form:
1 / 2 * …
1 and 2 are integers, therefore the division is integer division. Integer division always rounds down, so the expression is in fact
0 * …
which is (almost*) always zero. Mystery solved. But how to fix it?
Instead of multiplying by one half, you can divide by two. In even branch, division by 2 gives no remainder. In odd branch, tₙ₋₁ is odd, so 3 · tₙ₋₁ is odd too. Odd plus 1 is even, so division by two always produces remainder equal to zero in both branches. Integer division is enough, the result is precise.
Also, you could use float division, just replace 1 with 1.0. But this will probably not give correct results. You see, all members of the sequence are integers and you’re getting float results! So rounding with Math.Round() and casting to integer? Nah… If you can, always evade using floats. There are very few use cases for them, I think, most having something to do with graphics or numerical algorithms. Most of the time you don’t really need them and they just introduce round-off errors.
* Zero times whatever could produce NaN too, but let’s ignore the possibility of “whatever” being from special float values. I’m just pedantic.
Recursive solution
Apart from the problems mentioned above, your whole recursive approach is flawed. Obviously you intended Terras(n) to be tₙ. That’s not utterly bad. But then you forgot that you supply t₀ and search for n instead of the other way round.
To fix your approach, you would need to set up a “global” variable int t0 that would be set to given t₀ and returned from Terras(0). Then Terras(n) would really return tₙ. But you wouldn’t still know the value of n when the sequence stops. You could only repeat for bigger and bigger n, ruining time complexity.
Wait. What about caching the results of intermediate Terras() calls in an ArrayList<int> t? t[i] will contain result for Terras(i) or zero if not initialized. At the top of Terras() you would add if (n < t.Count() && t[n] != 0) return t[n]; for returning the value immediately if cached and not repeating the computation. Otherwise the computation is really made and just before returning, the result is cached:
if (n < t.Count()) {
t[n] = return_value;
} else {
for (int i = t.Count(); i < n; i++) {
t.Add(0);
}
t.Add(return_value);
}
Still not good enough. Time complexity saved, but having the ArrayList increases space complexity. Try tracing (preferably manually, pencil & paper) the computation for t0 = 3; t.Add(t0);. You don’t know the final n beforehand, so you must go from 1 up, till Terras(n) returns 1.
Noticed anything? First, each time you increment n and make a new Terras() call, you add the computed value at the end of cache (t). Second, you’re always looking just one item back. You’re computing the whole sequence from the bottom up and you don’t need that big stupid ArrayList but always just its last item!
Iterative solution
OK, let’s forget that complicated recursive solution trying to follow the top-down definition and move to the bottom-up approach that popped up from gradual improvement of the original solution. Recursion is not needed anymore, it just clutters the whole thing and slows it down.
End of sequence is still found by incrementing n and computing tₙ, halting when tₙ = 1. Variable t stores tₙ, t_previous stores previous tₙ (now tₙ₋₁). The rest should be obvious.
public static void Terras(int t) {
Console.Write("Terras generated:");
Console.Write(" " + t);
while (t > 1) {
int t_previous = t;
if (t_previous % 2 == 0) {
t = t_previous / 2;
} else {
t = (3 * t_previous + 1) / 2;
}
Console.Write(", " + t);
}
Console.WriteLine("");
}
Variable names taken from chux’s answer, just for the sake of comparability.
This can be deemed a primitive instance of dynamic-programming technique. The evolution of this solution is common to the whole class of such problems. Slow recursion, call result caching, dynamic “bottom-up” approach. When you are more experienced with dynamic programming, you’ll start seeing it directly even in more complicated problems, not even thinking about recursion.
I do know that this kind of questions have already been answered many times. Although I've found lots of possible answers, they still don't solve my problem, which is to implement the fastest possible way to convert an integer array into a single string. I have for example:
int[] Result = new int[] { 1753387599, 1353678530, 987001 }
I want it reversed, so I believe it's best to precede the further code with
Array.Reverse(Result);
Although I don’t iterate from the end, it’s equivalent to reversing, because I call elements from the end. So I have already done this. Just to let you know - if you can think of any other solution than mine, I suggest using this Array.Reverse, because the solution must be reversed.
I always care only about the last 9 digits of a number - so like modulo 1 000 000 000. Here is what I'd like to get:
987001|353678530|753387599
Separators just to have it clear now. I wrote my own function that is about 50% faster than using .ToString().
tempint - current element of the int array,
StrArray - a string array. It's not worth using StringBuilder or summing
strings, so at the end I simply join the elements of the AnswerArr to get the result.
IntBase - an array containing 1000 elements, numbers in strings from "000" to "999", indexed 0 to 999.
for (i = 0; i < limit; i++)
{
//Some code here
j = 3 * (limit - i);
//Done always
StrArray[j - 1] = IntBase[tempint % 1000];
if (tempint > 999999)
{
//Done in 99/100 cases
StrArray[j - 2] = IntBase[tempint % 1000000 / 1000];
StrArray[j - 3] = IntBase[tempint % 1000000000 / 1000000];
}
else
{
if (tempint > 999)
{
//Done just once
StrArray[j - 2] = IntBase[tempint % 1000 / 1000];
}
}
}
//Some code here
return string.Join(null, StrArray);
There ale lots of calculations before this part and they're are done very fast. While everything goes in 714 ms, without summing integers, it's just 337 ms.
Thanks in advance for any help.
Best regards,
Randolph
Faster? Most efficent? I am not sure, you should try it. But a simple way to convert
int[] Result = new int[] { 1753387599, 1353678530, 987001 };
var newstr = String.Join("|", Result.Reverse().Select(i => i % 1000000000));
I would suggest L.B's answer for most cases. But if you're running for the top efficiency, here are my suggestions:
You can iterate the array from the end, so there's no need to call Reverse of any kind
IntBase[tempint % 1000000 / 1000] is the same as IntBase[tempint % 1000] because division has higher priority than modulus
I bet the whole IntBase intermediate step is slowing you down tremendously
My suggestion would be something like this - much like L.B's code, but imperative and slightly optimized.
var sb = new StringBuilder();
var ints; // Your int[]
// Initial step because of the delimiters.
sb.Append((ints[ints.Length - 1] % 1000000000).ToString());
// Starting with 2nd last element all the way to the first one.
for(var i = ints.Length - 2; i >= 0; i--)
{
sb.Append("|");
sb.Append((ints[i] % 1000000000).ToString());
}
var result = sb.ToString();
I was hoping to figure out a way to write the below in a functional style with extension functions. Ideally this functional style would perform well compared to the iterative/loop version. I'm guessing that there isn't a way. Probably because of the many additional function calls and stack allocations, etc.
Fundamentally I think the pattern which is making it troublesome is that it both calculates a value to use for the Predicate and then needs that calculated value again as part of the resulting collection.
// This is what is passed to each function.
// Do not assume the array is in order.
var a = (0).To(999999).ToArray().Shuffle();
// Approx times in release mode (on my machine):
// Functional is avg 20ms per call
// Iterative is avg 5ms per call
// Linq is avg 14ms per call
private static List<int> Iterative(int[] a)
{
var squares = new List<int>(a.Length);
for (int i = 0; i < a.Length; i++)
{
var n = a[i];
if (n % 2 == 0)
{
int square = n * n;
if (square < 1000000)
{
squares.Add(square);
}
}
}
return squares;
}
private static List<int> Functional(int[] a)
{
return
a
.Where(x => x % 2 == 0 && x * x < 1000000)
.Select(x => x * x)
.ToList();
}
private static List<int> Linq(int[] a)
{
var squares =
from num in a
where num % 2 == 0 && num * num < 1000000
select num * num;
return squares.ToList();
}
An alternative to Konrad's suggestion. This avoids the double calculation, but also avoids even calculating the square when it doesn't have to:
return a.Where(x => x % 2 == 0)
.Select(x => x * x)
.Where(square => square < 1000000)
.ToList();
Personally, I wouldn't sweat the difference in performance until I'd seen it be significant in a larger context.
(I'm assuming that this is just an example, by the way. Normally you'd possibly compute the square root of 1000000 once and then just compare n with that, to shave off a few milliseconds. It does require two comparisons or an Abs operation though, of course.)
EDIT: Note that a more functional version would avoid using ToList at all. Return IEnumerable<int> instead, and let the caller transform it into a List<T> if they want to. If they don't, they don't take the hit. If they only want the first 5 values, they can call Take(5). That laziness could be a big performance win over the original version, depending on the context.
Just solving your problem of the double calculation:
return (from x in a
let sq = x * x
where x % 2 == 0 && sq < 1000000
select sq).ToList();
That said, I’m not sure that this will lead to much performance improvement. Is the functional variant actually noticeably faster than the iterative one? The code offers quite a lot of potential for automated optimisation.
How about some parallel processing? Or does the solution have to be LINQ (which I consider to be slow).
var squares = new List<int>(a.Length);
Parallel.ForEach(a, n =>
{
if(n < 1000 && n % 2 == 0) squares.Add(n * n);
}
The Linq version would be:
return a.AsParallel()
.Where(n => n < 1000 && n % 2 == 0)
.Select(n => n * n)
.ToList();
I don't think there's a functional solution that will be completely on-par with the iterative solution performance-wise. In my timings (see below) the 'functional' implementation from the OP appears to be around twice as slow as the iterative implementation.
Micro-benchmarks like this one are prone to all manner of issues. A common tactic in dealing with variability problems is to repeatedly call the method being timed and compute an average time per call - like this:
// from main
Time(Functional, "Functional", a);
Time(Linq, "Linq", a);
Time(Iterative, "Iterative", a);
// ...
static int reps = 1000;
private static List<int> Time(Func<int[],List<int>> func, string name, int[] a)
{
var sw = System.Diagnostics.Stopwatch.StartNew();
List<int> ret = null;
for(int i = 0; i < reps; ++i)
{
ret = func(a);
}
sw.Stop();
Console.WriteLine(
"{0} per call timings - {1} ticks, {2} ms",
name,
sw.ElapsedTicks/(double)reps,
sw.ElapsedMilliseconds/(double)reps);
return ret;
}
Here are the timings from one session:
Functional per call timings - 46493.541 ticks, 16.945 ms
Linq per call timings - 46526.734 ticks, 16.958 ms
Iterative per call timings - 21971.274 ticks, 8.008 ms
There are a host of other challenges as well: strobe-effects with the timer use, how and when the just-in-time compiler does its thing, the garbage collector running its collections, the order that competing algorithms are run, the type of cpu, the OS swapping other processes in and out, etc.
I tried my hand at a little optimization. I removed the square from the test (num * num < 1000000) - changing it to (num < 1000) - which seemed safe since there are no negatives in the input - that is, I took the square root of both sides of the inequality. Surprisingly, I got different results as compared to the methods in the OP - there were only 500 items in my optimized output as compared to the 241,849 from the three implementations in the OP implementations. So why the difference? Much of the input when squared overflows 32 bit integers, so those extra 241,349 items came from numbers that when squared overflowed to either negative numbers or numbers under 1 million while still passing our evenness test.
optimized (functional) timing:
Optimized per call timings - 16849.529 ticks, 6.141 ms
This was one of the functional implementations altered as suggested. It output the 500 items passing the criteria as expected. It is deceptively "faster" only because it output fewer items than the iterative solution.
We can make the original implementations blow up with an OverflowException by adding a checked block around their implementations. Here is a checked block added to the "Iterative" method:
private static List<int> Iterative(int[] a)
{
checked
{
var squares = new List<int>(a.Length);
// rest of method omitted for brevity...
return squares;
}
}