Group (previous) elements of array by number of subsequent negative ones - c#

I'm trying to process (in C#) a large data file containing numeric data. Given an array of integers, how can it be split into groups so that, whenever a run of n consecutive negative numbers (n being two or more) appears, the n elements immediately before that run are grouped together? For example, in the array below, each run of two or more consecutive negative numbers is a hint to group the same number of preceding elements:
0
1
4
-99
-99
-99
1
2
7
9
-99
-99
3
6
8
-99
-99
5
Output:
Group 1:
0
1
4
[3 negative numbers were next, so group previous 3]
Group 2:
1
Group 3:
2
Group 4:
7
9
[2 negative numbers were next, so group previous 2, but keep any prior positive numbers (1, 2) as separate groups]
Group 5:
3
Group 6:
6
8
[2 negative numbers were next, so group previous 2, but keep any prior positive numbers (3) as separate groups]
Group 7:
5
What will be the fastest, most efficient way to process such an array of data into a new grouped collection?

I'd probably try to stream the whole thing, maintaining a buffer of "current non-negative numbers" and a count of negative numbers. Here's some code which appears to work... I'd expect it to be at least pretty efficient. I'd start with that, and if it's not efficient enough for you, look into why.
(If you do want to have the whole array in memory, you don't need to build up the numbersToEmit list, as you can just maintain indexes instead. But given that a single list is reused, I'd expect this to be okay.)
using System;
using System.Linq;
using System.Collections.Generic;

class Program
{
    static void Main(string[] args)
    {
        int[] input = { 0, 1, 4, -99, -99, -99, 1,
                        2, 7, 9, -99, -99, 3, 6, 8, -99, -99, 5 };
        foreach (var group in GroupByNegativeCounts(input))
        {
            Console.WriteLine(string.Join(", ", group));
        }
    }

    static IEnumerable<int[]> GroupByNegativeCounts(IEnumerable<int> input)
    {
        List<int> numbersToEmit = new List<int>();
        int negativeCount = 0;
        foreach (var value in input)
        {
            // We never emit anything when we see a negative number
            if (value < 0)
            {
                negativeCount++;
            }
            else
            {
                // We emit numbers if we've previously seen negative
                // numbers, and then we see a non-negative one.
                if (negativeCount > 0)
                {
                    int singles = Math.Max(numbersToEmit.Count - negativeCount, 0);
                    foreach (var single in numbersToEmit.Take(singles))
                    {
                        yield return new[] { single };
                    }
                    if (singles != numbersToEmit.Count)
                    {
                        yield return numbersToEmit.Skip(singles).ToArray();
                    }
                    negativeCount = 0;
                    numbersToEmit.Clear();
                }
                numbersToEmit.Add(value);
            }
        }
        // Emit anything we've got left at the end.
        foreach (var single in numbersToEmit)
        {
            yield return new[] { single };
        }
    }
}
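The index-based variant mentioned in the parenthetical note could be sketched roughly as follows. This is only an illustration under some assumptions: the class name IndexGrouping and the helper Flush are made up for this sketch, the input is assumed to be fully in memory as an int[], and (unlike the streaming code above) a run of negatives at the very end of the input is also used to group the values before it.

```csharp
using System;
using System.Collections.Generic;

static class IndexGrouping
{
    // Hypothetical helper: same grouping rule as the streaming version, but it
    // tracks only the start index and length of the current non-negative run.
    public static IEnumerable<int[]> GroupByNegativeCounts(int[] input)
    {
        int runStart = 0;      // index of the first buffered non-negative value
        int runLength = 0;     // number of buffered non-negative values
        int negativeCount = 0; // negatives seen since the buffered run ended

        for (int i = 0; i < input.Length; i++)
        {
            if (input[i] < 0)
            {
                negativeCount++;
                continue;
            }

            // A non-negative value after negatives closes the previous run.
            if (negativeCount > 0)
            {
                foreach (var group in Flush(input, runStart, runLength, negativeCount))
                    yield return group;
                runLength = 0;
                negativeCount = 0;
            }

            if (runLength == 0) runStart = i;
            runLength++;
        }

        // Assumption of this sketch: trailing negatives (if any) also group.
        foreach (var group in Flush(input, runStart, runLength, negativeCount))
            yield return group;
    }

    // Emits the leading values of a run as singles and the last
    // min(length, negatives) values as one group.
    private static IEnumerable<int[]> Flush(int[] input, int start, int length, int negatives)
    {
        int grouped = Math.Min(length, negatives);
        int singles = length - grouped;
        for (int i = 0; i < singles; i++)
            yield return new[] { input[start + i] };
        if (grouped > 0)
        {
            var group = new int[grouped];
            Array.Copy(input, start + singles, group, 0, grouped);
            yield return group;
        }
    }
}
```

Calling IndexGrouping.GroupByNegativeCounts on the sample array from the question should give the same seven groups as the streaming version, without copying values into a temporary list first.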

Here's my solution, I tried to make it as simple as possible :)
List<int> input = new List<int>() { 0, 1, 4, -99, -99, -99, 1, 2, 7, 9, -99, -99, 3, 6, 8, -99, -99, 5 };
List<int> reverse_input = input.Reverse<int>().ToList(); //reverse list for easier reading
List<List<int>> grouped_input = new List<List<int>>(); //output variable
List<int> leading_positives;
int leading_negative_count;
while (reverse_input.Any())
{
//Get the amount of leading negatives and remove them from the reversed list
leading_negative_count = reverse_input.TakeWhile(num => num < 0).Count();
reverse_input = reverse_input.Skip(leading_negative_count).ToList();
//get and store leading positives and remove them from the reversed list
leading_positives = reverse_input.TakeWhile(num => num >= 0).ToList();
reverse_input = reverse_input.Skip(leading_positives.Count).ToList();
//take an amount of positives equal to the amount of previously found negatives and add them as a separate list to the output
grouped_input.Add(leading_positives.Take(leading_negative_count).Reverse().ToList());
//for each remaining positive add it as an individual into the output
leading_positives.Skip(leading_negative_count).ToList().ForEach(num => grouped_input.Add(new List<int>() { num }));
}
//output display
grouped_input.Reverse<List<int>>().ToList().ForEach(lst => Console.WriteLine(string.Join(",", lst)));

Related

Group elements of the data set if they are next to each other with LINQ

I have a data set (ex. 1, 1, 4, 6, 3, 3, 1, 2, 2, 2, 6, 6, 6, 7) and I want to group items of the same value but only if they are next to each other minimum 3 times.
Is there a way?
I've tried using combinations of Count and GroupBy and Select in every way I know but I can't find a right one.
Or if it can't be done with LINQ then maybe some other way?
I don't think I'd strive for a 100% LINQ solution for this:
var r = new List<List<int>>() { new () { source.First() } };
foreach(var e in source.Skip(1)){
if(e == r.Last().Last()) r.Last().Add(e);
else r.Add(new(){ e });
}
return r.Where(l => l.Count > 2);
The .Last() calls can be replaced with [^1] if you like
This works as follows:
Start with an output that is a list of lists.
Put the first input item into the output, in a list of its own.
From the second input item onward, if the item equals the last int in the last output list, add it to that list.
Otherwise, make a new list containing the item and add it to the end of the output.
Finally, keep only those output lists longer than 2.
The output is like:
[
[2,2,2],
[6,6,6]
]
Aggregate can be pushed into doing the same thing; the loop above is simply an accumulator (r), an iteration (foreach) and a final operation on the result (Where):
var result = source.Skip(1).Aggregate(
new List<List<int>>() { new List<int> { source.First() } },
(r,e) => {
if(e == r.Last().Last()) r.Last().Add(e);
else r.Add(new List<int>(){ e });
return r;
},
r => r.Where(l => l.Count > 2)
);
..but would you want to be the one to explain it to the new dev?
Another LINQy way would be to establish a counter that is incremented by one each time the value in the source array changes compared to the previous element, then group by this integer and return only those groups of 3 or more. I don't like this so much because it's a bit "WTF":
var source = new[]{1, 1, 4, 6, 3, 3, 1, 2, 2, 2, 6, 6, 6, 7};
int ctr = 0;
var result = source.Select(
(e,i) => new[]{ i==0 || e != source[i-1] ? ++ctr : ctr, e}
)
.GroupBy(
arr => arr[0],
arr => arr[1]
)
.Where(g => g.Count() > 2);
You could consider using the GroupAdjacent or the RunLengthEncode operators from the MoreLinq package. The former groups adjacent elements in the sequence that have the same key. The key is retrieved by invoking a keySelector lambda parameter. The latter compares the adjacent elements and emits a single KeyValuePair<T, int> for each series of equal elements. The int value of the KeyValuePair<T, int> represents the number of consecutive equal elements. Example:
var source = new[] { 1, 1, 4, 6, 3, 3, 1, 2, 2, 2, 6, 6, 6, 7 };
IEnumerable<IGrouping<int, int>> grouped = MoreLinq.MoreEnumerable
.GroupAdjacent(source, x => x);
foreach (var group in grouped)
{
Console.WriteLine($"Key: {group.Key}, Elements: {String.Join(", ", group)}");
}
Console.WriteLine();
IEnumerable<KeyValuePair<int, int>> pairs = MoreLinq.MoreEnumerable
.RunLengthEncode(source);
foreach (var pair in pairs)
{
Console.WriteLine($"Key: {pair.Key}, Value: {pair.Value}");
}
Output:
Key: 1, Elements: 1, 1
Key: 4, Elements: 4
Key: 6, Elements: 6
Key: 3, Elements: 3, 3
Key: 1, Elements: 1
Key: 2, Elements: 2, 2, 2
Key: 6, Elements: 6, 6, 6
Key: 7, Elements: 7
Key: 1, Value: 2
Key: 4, Value: 1
Key: 6, Value: 1
Key: 3, Value: 2
Key: 1, Value: 1
Key: 2, Value: 3
Key: 6, Value: 3
Key: 7, Value: 1
In the above example I've used the operators as normal methods, because I am not a fan of adding using MoreLinq; and "polluting" the IntelliSense of the Visual Studio with all the specialized operators of the MoreLinq package. An alternative is to enable each operator selectively like this:
using static MoreLinq.Extensions.GroupAdjacentExtension;
using static MoreLinq.Extensions.RunLengthEncodeExtension;
If you don't like the idea of adding a dependency on a third-party package, you could grab the source code of these operators and embed it directly into your project.
If you're nostalgic and like stuff like the Obfuscated C Code Contest, you could solve it like this (no best-practice claims included):
int[] n = {1, 1, 4, 6, 3, 3, 1, 2, 2, 2, 6, 6, 6, 7};
var t = new int [n.Length][];
for (var i = 0; i < n.Length; i++)
t[i] = new []{n[i], i == 0 ? 0 : n[i] == n[i - 1] ? t[i - 1][1] : t[i - 1][1] + 1};
var r = t.GroupBy(x => x[1], x => x[0])
.Where(g => g.Count() > 2)
.SelectMany(g => g);
Console.WriteLine(string.Join(", ", r));
In the end, LINQ is likely not the best solution here.
A simple for loop with one to three additional loop variables to track the "group index" and the last value likely makes more sense,
even if it means writing two more lines of code.
I wouldn't use LINQ just to use LINQ.
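A plain loop along those lines might look like the sketch below. The names RunGrouping, GetRuns and minRun are invented for this illustration; the only state it keeps is the index where the current run of equal values started.

```csharp
using System.Collections.Generic;

static class RunGrouping
{
    // Hypothetical helper: returns every run of equal adjacent values whose
    // length is at least minRun, using only index bookkeeping.
    public static List<List<int>> GetRuns(IReadOnlyList<int> source, int minRun)
    {
        var result = new List<List<int>>();
        int runStart = 0;

        // i deliberately runs one past the end so the final run is closed too.
        for (int i = 1; i <= source.Count; i++)
        {
            // A run ends at the end of the input or when the value changes.
            if (i == source.Count || source[i] != source[runStart])
            {
                int length = i - runStart;
                if (length >= minRun)
                {
                    var run = new List<int>(length);
                    for (int j = runStart; j < i; j++) run.Add(source[j]);
                    result.Add(run);
                }
                runStart = i;
            }
        }
        return result;
    }
}
```

For the data set in the question, RunGrouping.GetRuns(source, 3) should return the two lists 2, 2, 2 and 6, 6, 6.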
I'd rather suggest using a simple for loop to loop over your input array and populate the output list. To keep track of which number is currently being repeated (if any), I'd use a variable (repeatedNumber) that's initially set to null.
In this approach, a number can only be assigned to repeatedNumber if it fulfills the minimum requirement of repeated items. Hence, for your example input, repeatedNumber would start at null, then eventually be set to 2, then be set to 6, and then be reset to null.
One perhaps good use of Linq here is to check if the minimum requirement of repeated items is fulfilled for a given item in input, by checking the necessary consecutive items in input:
input
.Skip(items up to and including current item)
.Take(minimum requirement of repeated items - 1)
.All(equal to current item)
I'll name this minimum requirement of repeated items repetitionRequirement. (In your question post, repetitionRequirement is 3.)
The logic in the for loop goes as follows:
number = input[i]
If number is equal to repeatedNumber, it means that the previously repeated item continues being repeated
Add number to output
Otherwise, if the minimum requirement of repeated items is fulfilled for number (i.e. if the repetitionRequirement - 1 items directly following number in input are all equal to number), it means that number is the first instance of a new repeated item
Set repeatedNumber equal to number
Add number to output
Otherwise, if repeatedNumber has value, it means that the previously repeated item just ended its repetition
Set repeatedNumber to null
Here is a suggested implementation:
(I'd suggest finding a more descriptive method name)
//using System.Collections.Generic;
//using System.Linq;
public static List<int> GetOutput(int[] input, int repetitionRequirement)
{
var consecutiveCount = repetitionRequirement - 1;
var output = new List<int>();
int? repeatedNumber = null;
for (var i = 0; i < input.Length; i++)
{
var number = input[i];
if (number == repeatedNumber)
{
output.Add(number);
}
else if (i + consecutiveCount < input.Length &&
input.Skip(i + 1).Take(consecutiveCount).All(num => num == number))
{
repeatedNumber = number;
output.Add(number);
}
else if (repeatedNumber.HasValue)
{
repeatedNumber = null;
}
}
return output;
}
By calling it with your example input:
var dataSet = new[] { 1, 1, 4, 6, 3, 3, 1, 2, 2, 2, 6, 6, 6, 7 };
var output = GetOutput(dataSet, 3);
you get the following output:
{ 2, 2, 2, 6, 6, 6 }

Sorting numbers using maximum spread

I want to sort a list of integers in such a way that they end up being spread out as much as possible. Assuming base 8, the order of items between 1 and 7 ought to be: {4, 6, 2, 7, 1, 5, 3}.
There is a fair amount of ambiguity of course, as both 6 and 2 are equally far away from 4, 0 and 8, so the specific ordering of 6 and 2 is irrelevant. What I'm trying to achieve is to first pick the number furthest away from 0 and base, then pick the number furthest away from 0, base and first number, etc. Any multiple of the base will never occur so I don't care how that is handled.
I can manually design the sort order for any given base, but I need this to work for any base >= 2. Is there a clever/fast way to compute this or do I need to lazily build the sorting mapping tables and cache them for future use?
int SortOrder(int radix, int value)
{
int offset = value % radix;
int[] table = {int.MinValue, 4, 2, 6, 0, 5, 1, 3}; // Hand-crafted for base-8
return table[offset];
}
This is specifically not an answer to the question since it doesn't attempt to find the answer quickly. Rather it builds a dictionary of cached sorting values for each radix.
#region sorting logic
/// <summary>
/// Maintains a collection of sorting maps for all used number bases.
/// </summary>
private static readonly Dictionary<int, int[]> _sortingTable = new Dictionary<int, int[]>();
private static readonly object _sortingLock = new object();
/// <summary>
/// Compute the sorting key for a given multiple.
/// </summary>
/// <param name="radix">Radix or base.</param>
/// <param name="multiple">Multiple.</param>
/// <returns>Sorting key.</returns>
public static int ComputeSortingKey(int radix, long multiple)
{
if (radix < 2)
throw new ArgumentException("Radix may not be less than 2.");
if (multiple == 0)
return int.MinValue; // multiple=0 always needs to be sorted first, so pick the smallest possible key.
int[] map;
// Take the lock before reading: Dictionary is not safe for concurrent reads and writes.
lock (_sortingLock)
if (!_sortingTable.TryGetValue(radix, out map))
{
map = new int[radix];
map[0] = -1; // Multiples of the radix are sorted first.
int key = 0;
HashSet<int> occupancy = new HashSet<int> { 0, radix };
HashSet<int> collection = new HashSet<int>(1.ArrayTo(radix)); // (ArrayTo is an extension method in this project)
while (collection.Count > 0)
{
int maxValue = 0;
int maxDistance = 0;
foreach (int value in collection)
{
int distance = int.MaxValue;
foreach (int existingValue in occupancy)
distance = Math.Min(distance, Math.Abs(existingValue - value));
if (distance > maxDistance)
{
maxDistance = distance;
maxValue = value;
}
}
collection.Remove(maxValue);
occupancy.Add(maxValue);
map[maxValue] = key++;
}
_sortingTable[radix] = map; // The indexer adds or overwrites, so no separate Remove is needed.
}
long offset = multiple % radix;
if (offset != 0)
if (multiple < 0)
offset = radix - (Math.Abs(multiple) % radix);
return map[(int)offset];
}
#endregion
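To see what order the greedy selection produces without the caching and locking around it, here is a small self-contained sketch of the same idea. SpreadOrder and Compute are names made up for this illustration, and ties are simply broken in favour of the smaller value, which the question says is acceptable.

```csharp
using System;
using System.Collections.Generic;

static class SpreadOrder
{
    // Hypothetical helper: lists 1..radix-1 in greedy "furthest from everything
    // already placed" order; 0 and radix act as fixed boundaries.
    public static List<int> Compute(int radix)
    {
        if (radix < 2) throw new ArgumentOutOfRangeException(nameof(radix));

        var placed = new List<int> { 0, radix };
        var remaining = new List<int>();
        for (int v = 1; v < radix; v++) remaining.Add(v);

        var order = new List<int>(radix - 1);
        while (remaining.Count > 0)
        {
            int best = remaining[0];
            int bestDistance = -1;
            foreach (int candidate in remaining)
            {
                // Distance to the nearest value that has already been placed.
                int distance = int.MaxValue;
                foreach (int p in placed)
                    distance = Math.Min(distance, Math.Abs(p - candidate));

                // Strict comparison: on a tie the earlier (smaller) candidate wins.
                if (distance > bestDistance)
                {
                    bestDistance = distance;
                    best = candidate;
                }
            }
            remaining.Remove(best);
            placed.Add(best);
            order.Add(best);
        }
        return order;
    }
}
```

For radix 8 this yields 4, 2, 6, 1, 3, 5, 7, which differs from the hand-crafted table only in how ties are ordered. Like the cached version, it is cubic in the radix in the worst case, so it is meant for building a table once rather than for calling per comparison.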
My original answer was to find the maximum delta. To work your way from out to in, use the same comparison but different selects:
List<double> answer = new List<double>();
List<double> doub = new List<double>() { 0, -1, 2, 3, 4, -5, 7 };//SORT this list for sorted results!
List<double> lowerHalf = new List<double>();
List<double> upperHalf = new List<double>();
for (int i = 0; i < doub.Count; i++)
{
if (i <= (int)Math.Floor((double)doub.Count / 2))
lowerHalf.Add(doub[i]);
else
upperHalf.Add(doub[i]);
}
if (upperHalf.Count < lowerHalf.Count)
{
upperHalf.Insert(0,lowerHalf[lowerHalf.Count-1]);
}
//if (upperHalf[0] == lowerHalf[lowerHalf.Count - 1]) { double median = (lowerHalf[lowerHalf.Count - 1] + upperHalf[1]) / 2; lowerHalf[lowerHalf.Count - 1] = median; upperHalf[0] = median; } // use Math.Round or Math.Floor/Ceiling if necessary
for (int i = 0; i < lowerHalf.Count; i++)
{
double deltas = Math.Sqrt(Math.Pow(upperHalf[upperHalf.Count - (i + 1)] - lowerHalf[i], 2));
answer.Add(deltas);
Console.WriteLine("The answer for {1}, {2} is: {0}", deltas, lowerHalf[i], upperHalf[upperHalf.Count - (i+1)]);
}
Console.ReadLine();
This will provide:
The answer for 0, 7 is: 7
The answer for -1, -5 is: 4
The answer for 2, 4 is: 2
The answer for 3, 3 is: 0
Note that when the original range has an odd number of items, this method uses the same item for both the upper and lower lists. The commented-out line above replaces that shared item with the "actual" median, if you prefer that.
To get rid of duplicates, use a HashSet, Union, or Distinct().
Original answer (to find the maximum delta):
You can use math in your LINQ, like:
List<double> doub = new List<double>() { 0, 1, 2, 3, 4, 5, 7 };
double deltas = doub.Select(p => p - doub.First()).OrderBy(p => p).Last();
Console.WriteLine("The answer is: {0}",deltas);
Console.ReadLine();
If your values go negative, you'd need to use squares:
double deltas = Math.Sqrt( doub.Select(p => Math.Pow(p - doub.First(), 2)).OrderBy(p => p).Last());
or Math.Abs or a test to see which is larger - but this should give you an idea on how to get started. If the numbers aren't in order in the original list, you can call an orderby before the select, as well.
Literally:
List<double> doub = new List<double>() { 0, 1, 2, 3, 4, 5, 7 };
double deltas = Math.Sqrt( doub.Select(p => Math.Pow(p - doub.First(), 2)).OrderBy(p => p).Last());
Console.WriteLine("The answer is: {0}",deltas);
Console.ReadLine();
produces
The answer is: 7
Moving forward to sorting the list, use:
List<double> answer = new List<double>();
List<double> doub = new List<double>() { 0, 1, 2, 3, 4, 5, 7 };
//sort doub if necessary
foreach (double num in doub)
{
double deltas = Math.Sqrt(Math.Pow(doub.Select(p => p - num).OrderBy(p => p).Last(), 2));
answer.Add(deltas);
Console.WriteLine("The answer for {1} is: {0}", deltas,num);
}
Console.ReadLine();
(Again, use another orderby if the list is not in order).
Produces:
The answer for 0 is: 7
The answer for 1 is: 6
The answer for 2 is: 5
The answer for 3 is: 4
The answer for 4 is: 3
The answer for 5 is: 2
The answer for 7 is: 0
The squares/square root help us change signs and deal with negatives - so
List<double> doub = new List<double>() { 0, -1, 2, 3, 4, -5, 7 };
Produces:
The answer for 0 is: 7
The answer for -1 is: 8
The answer for 2 is: 5
The answer for 3 is: 4
The answer for 4 is: 3
The answer for -5 is: 12
The answer for 7 is: 0
(which aren't in order because I failed to sort the list on either the inbound or outbound sides).
After running, the list "answer" will contain the results. Sort it (or use answer.Min() and answer.Max()) to get the lowest and highest delta. Equal deltas will exist for different numbers that are the same number of units apart; you could convert the results to a HashSet if you want to eliminate duplicate deltas. If you need help with that, please let me know.
If you need to store the numbers that created the delta as well as the delta itself, they are available to you in the For/Each loop as the Console.WriteLine() indicates.
If you want to work the range-to-median on both sides of the median, it's probably a good idea to split the list and work in pairs. Make 0 to median in one list, and median to endRange in the second list, work out in the first and in in the second. This should get you there, but if you need help getting over that final hump, let me know.
Hope this helps!!

Finding a number that follows a specific number in a sequence

I have the below array:
int [] array = { 9, 8, 3, 2, 3, 2 };
I'd like to write a statement so that when I pick a number from the array, it gives the number that follows it as the result.
For example, if I pick the number 8, the statement should give the number 3 as the result.
One way to do it is to use SkipWhile to reach the location of the search number, skip one, and take the first item after it:
var array = new[] { 9, 8, 3, 2, 3, 2 };
var next = array.SkipWhile(n => n != 8).Skip(1).First(); // next==3
This code assumes two things:
The search number 8 is there, and
The search number is not the last number in the sequence.
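If either of those two assumptions might not hold, one possible variation is to project to a nullable int and use FirstOrDefault, so the call returns null instead of throwing. The class and method names (NextAfter, Find) are invented for this sketch.

```csharp
using System.Collections.Generic;
using System.Linq;

static class NextAfter
{
    // Hypothetical helper: returns the element after the first occurrence of
    // target, or null when target is absent or is the last element.
    public static int? Find(IEnumerable<int> source, int target)
    {
        return source
            .SkipWhile(n => n != target) // drop everything before the target
            .Skip(1)                     // drop the target itself
            .Select(n => (int?)n)        // make "nothing found" representable
            .FirstOrDefault();
    }
}
```

With the array from the question, NextAfter.Find(array, 8) gives 3, while searching for a value that is not present gives null.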
If I understand your question correctly you want to return the next item in the array after the selection?
If that is the case you can do the following:
int index = Array.IndexOf(array, 8);
return array[index + 1];
There are some limitations to this implementation, please see here:
https://msdn.microsoft.com/en-us/library/7eddebat(v=vs.110).aspx
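Two of those limitations are easy to trip over: for an ordinary zero-based array, Array.IndexOf returns -1 when the value is not found (so array[index + 1] would silently return the first element), and if the value is the last element, index + 1 is out of range. A defensive wrapper could look like this sketch; NextByIndex and TryGetNext are hypothetical names.

```csharp
using System;

static class NextByIndex
{
    // Hypothetical helper: Try-pattern wrapper around Array.IndexOf.
    public static bool TryGetNext(int[] array, int value, out int next)
    {
        int index = Array.IndexOf(array, value);

        // -1 means "not found"; index + 1 must also stay inside the array.
        if (index >= 0 && index + 1 < array.Length)
        {
            next = array[index + 1];
            return true;
        }

        next = default;
        return false;
    }
}
```

The caller then decides what to do when there is no following element, for example: if (NextByIndex.TryGetNext(array, 8, out int next)) Console.WriteLine(next);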

How to find intersection of two arrays (Optimal solution)

I am writing an algorithm for the intersection of two arrays A and B, and I want a solution that is optimized in terms of both space complexity and time complexity.
I have written an algorithm and it works fine, but I want to know whether a more optimal solution exists, or if someone could provide one.
What I do is:
(1) Find the smaller of the two arrays.
(2) Allocate the new array with a size equal to that of the smaller array.
(3) For each element of the smaller array, compare it with each element of the bigger array. If a match exists, store it in the third array C and break right there (we only need the intersection, so even if the value repeats 100 times afterwards,
a single occurrence is enough to put it in the third array). At the same time, before comparing an element of the smaller array against the bigger array, check whether it already exists in the third array. Example: A = [0,1,1], B = [0,1,2,3].
We start with A's first element; it is present in B, so we save it in C[0]. Then we go to the second element, and C becomes [0,1]. In the next step we again have 1 to compare, which we have already handled. So in this situation we check whether the element to be compared already exists in C, and if so we skip it.
(4) Store the found elements in C (the third array) and print it.
My full working code for it is :
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
int[] aar1 = { 0, 1, 1, 7, 2, 6, 3, 9, 11, 2, 2,3,3,3,3,3,1 };
int[] aar2 = { 0, 1, 2, 3, 4, 5, 6, 11, 11, 1, 1, 1, 1 };
int[] arr3 = findIntersection(aar1, aar2);
Console.WriteLine("the array is : " + string.Join(", ", arr3));
Console.ReadKey();
}
private static int[] findIntersection(int[] aar1, int[] aar2)
{
int[] arr3 = { 0 };
if (aar1.Count() < aar2.Count())
{
int counter = 0;
arr3 = new int[aar1.Count()];
foreach (int var1 in aar1)
{
if (!checkifInThirdArray(var1, arr3))
{
foreach (int var2 in aar2)
{
if (var1 == var2)
{
arr3[counter++] = var1;
break;
}
}
}
}
}
else
{
int counter = 0;
arr3 = new int[aar2.Count()];
foreach (int var2 in aar2)
{
if (!checkifInThirdArray(var2, arr3))
{
foreach (int var1 in aar1)
{
if (var2 == var1)
{
arr3[counter++] = var2;
break;
}
}
}
}
}
return arr3;
}
private static bool checkifInThirdArray(int var1, int[] arr3)
{
bool flag = false;
if (arr3 != null)
{
foreach (int arr in arr3)
{
if (arr == var1) { flag = true; break; }
}
}
return flag;
}
}
}
One space complexity issue I found is the following (I would really appreciate it if you let me know of any others, with solutions):
(1) When I allocate the third array, I give it the size of the smaller of the two arrays being compared. If the intersection has very few elements, we
have allocated extra memory unnecessarily. How can this be solved?
Please note that I must not use any built-in function like intersection() or similar.
It sounds like your solution is an O(n^2) one in that, for every single element in one array, you may need to process every single element in the other (in the case where the intersection is the empty set). You should be aware that C# actually has facilities for finding the intersection of arrays but, should you wish to implement your own, read on.
You would probably be better off sorting both arrays (in place if allowed, otherwise into a separate collection) and then doing a merge check of the two to construct another. The sort could be O(n log n) and the merge check would be O(n).
If you're wondering what I mean by merge check, it's simply processing both (sorted) arrays side by side.
If the first element in both matches, you have an intersect point and you should store that value and advance both lists until the next value is different.
If they're different, there's no intersect point and you can advance the array with the lowest value until it changes.
By way of example, here's some code in Python (the ideal pseudo-code language) that implements such a solution. Array a contains all the multiples of three between 0 and 18 inclusive (in arbitrary order and including duplicates), while array b has all the even numbers in that range (again, with some duplicates and ordered "randomly").
a = [0,3,15,3,9,6,12,15,18,6]
b = [10,0,2,12,4,6,18,8,16,10,12,6,14,16]
# Copy and sort (sorted() returns a new list and leaves the original untouched).
a2 = sorted(a)
b2 = sorted(b)
# Initial pointers and results for merge check.
ap = 0
bp = 0
c = []
# Continue until either array is exhausted.
while ap < len(a2) and bp < len(b2):
    # Check for intersect or which list has lowest value.
    if a2[ap] == b2[bp]:
        # Intersect, save, advance both lists to next number.
        val = a2[ap]
        c.append(val)
        while ap < len(a2) and a2[ap] == val:
            ap += 1
        while bp < len(b2) and b2[bp] == val:
            bp += 1
    elif a2[ap] < b2[bp]:
        # A has smallest, advance it to next number.
        val = a2[ap]
        while ap < len(a2) and a2[ap] == val:
            ap += 1
    else:
        # B has smallest, advance it to next number.
        val = b2[bp]
        while bp < len(b2) and b2[bp] == val:
            bp += 1
print(c)
If you run that, you'll see the intersect list that's formed between the two arrays:
[0, 6, 12, 18]
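Since the question is about C#, the same merge check could be translated roughly as follows. This is a sketch, not a tuned implementation: SortedIntersection is a made-up name, both inputs are cloned so the caller's arrays stay unsorted, and the result is collected in a List<int> so no oversized array has to be allocated up front.

```csharp
using System;
using System.Collections.Generic;

static class SortedIntersection
{
    // Hypothetical helper: sorts copies of both inputs, then walks them in step.
    public static List<int> Intersect(int[] a, int[] b)
    {
        var a2 = (int[])a.Clone();
        var b2 = (int[])b.Clone();
        Array.Sort(a2);
        Array.Sort(b2);

        var result = new List<int>();
        int ap = 0, bp = 0;
        while (ap < a2.Length && bp < b2.Length)
        {
            if (a2[ap] < b2[bp])
            {
                ap++; // a has the smaller value, advance it
            }
            else if (a2[ap] > b2[bp])
            {
                bp++; // b has the smaller value, advance it
            }
            else
            {
                // Intersect point: save it once, then skip its duplicates in both arrays.
                int val = a2[ap];
                result.Add(val);
                while (ap < a2.Length && a2[ap] == val) ap++;
                while (bp < b2.Length && b2[bp] == val) bp++;
            }
        }
        return result;
    }
}
```

Fed the same a and b as the Python example, this should also produce 0, 6, 12, 18.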
Maybe I am not understanding you right but why don't you use the following;
int[] aar1 = { 0, 1, 1, 7, 2, 6, 3, 9, 11, 2, 2,3,3,3,3,3,1 };
int[] aar2 = { 0, 1, 2, 3, 4, 5, 6, 11, 11, 1, 1, 1, 1 };
int[] aarResult = aar1.Intersect(aar2).ToArray();
This intersects the arrays and results in an array with only the space needed. If you have to fill the result array yourself instead, the intersection can never have more elements than the smaller input, so this is the most you would need to allocate:
int[] aarResult = new int[Math.Min(aar1.Count(), aar2.Count())];
You can use the LINQ Intersect method. It uses hashing and runs in linear O(N+M) time, which is faster than your algorithm:
int[] aar1 = { 0, 1, 1, 7, 2, 6, 3, 9, 11, 2, 2, 3, 3, 3, 3, 3, 1 };
int[] aar2 = { 0, 1, 2, 3, 4, 5, 6, 11, 11, 1, 1, 1, 1 };
int[] result = aar1.Intersect(aar2).ToArray();
It will also solve your unnecessarily allocated items problem, because it will create an array of the exact answer size.
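If Intersect itself is off limits, as the question states, the hashing idea can be written by hand. The sketch below assumes that HashSet<int> and List<int> are still allowed; the name HashIntersection is invented for the example.

```csharp
using System.Collections.Generic;

static class HashIntersection
{
    // Hypothetical helper: hash-based intersection in O(N + M) expected time.
    public static int[] Intersect(int[] first, int[] second)
    {
        // Every distinct value of the second array is a candidate.
        var candidates = new HashSet<int>(second);
        var result = new List<int>();

        foreach (int value in first)
        {
            // Remove returns true only the first time a shared value is seen,
            // so duplicates in the first array are not emitted twice.
            if (candidates.Remove(value))
                result.Add(value);
        }

        // ToArray allocates exactly as many slots as there are results.
        return result.ToArray();
    }
}
```

For aar1 and aar2 above this returns 0, 1, 2, 6, 3, 11, in the order the values first appear in aar1. The price for the linear running time is the extra memory used by the hash set.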

Will the result of a LINQ query always be guaranteed to be in the correct order?

Question: Will the result of a LINQ query always be guaranteed to be in the correct order?
Example:
int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
var lowNums =
from n in numbers
where n < 5
select n;
Now, when we walk through the entries of the query result, will it be in the same order as the input data numbers is ordered?
foreach (var x in lowNums)
{
Console.WriteLine(x);
}
If someone can provide a note on the ordering in the documentation, this would be perfect.
For LINQ to Objects: Yep.
For Parallel LINQ: Nope.
For LINQ to Expression Trees (EF, L2S, etc): Nope.
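The first two cases can be illustrated with a short sketch (OrderingDemo is a made-up name). It relies on the AsOrdered() operator that PLINQ provides for requesting source order; for database-backed providers the usual way to get a defined order is an explicit OrderBy, which is not shown here because it needs a data source.

```csharp
using System.Linq;

static class OrderingDemo
{
    // Sequential LINQ to Objects: Where yields matches in source order.
    public static int[] Sequential(int[] numbers) =>
        numbers.Where(n => n < 5).ToArray();

    // PLINQ: source order is only preserved when AsOrdered() is requested.
    public static int[] ParallelOrdered(int[] numbers) =>
        numbers.AsParallel().AsOrdered().Where(n => n < 5).ToArray();
}
```

Both methods return 4, 1, 3, 2, 0 for the numbers array from the question. Dropping AsOrdered() from the second one leaves the order of the result unspecified.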
I think the order of elements retrieved by LINQ is preserved, at least for LINQ to Objects; for LINQ to SQL or Entity Framework, it may depend on the order of the records in the table. For LINQ to Objects, I'll try to explain why it preserves the order.
When the LINQ query is executed, GetEnumerator() is called on the IEnumerable source, and a while loop then fetches each next element using MoveNext(). This is how a foreach works on the IEnumerable source. We all know that a foreach preserves the order of the elements in a list or collection. Digging more deeply into MoveNext(), I think it just keeps some position holding the current index; MoveNext() increments that position and yields the corresponding element (at the new position). That's why the order should be preserved: nothing reorders the elements unless you explicitly call OrderBy or OrderByDescending.
If you think this
int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
foreach(var i in numbers)
if(i < 5) Console.Write(i + " ");
prints out 4 1 3 2 0, then you should also think that this
int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
IEnumerator ie = numbers.GetEnumerator();
while(ie.MoveNext()){
if((int)ie.Current < 5) Console.Write(ie.Current + " ");
}
also prints out 4 1 3 2 0. Hence this LINQ query
var lowNums = from n in numbers
where n < 5
select n;
foreach (var i in lowNums) {
Console.Write(i + " ");
}
should also print out 4 1 3 2 0.
Conclusion: The order of elements in a LINQ result depends on how MoveNext() of the IEnumerator obtained from the IEnumerable is implemented. For LINQ to Objects, however, you can be sure that the result has the same order in which a foreach loop visits the elements.
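To make that argument concrete, here is what a filtering operator looks like when written by hand with an iterator. This is a simplified sketch, not the actual framework source; MyLinq and MyWhere are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

static class MyLinq
{
    // Hypothetical re-implementation of a Where-style filter. It walks the
    // source with foreach and yields matches as it meets them, so the output
    // order can only ever be the enumeration order of the source.
    public static IEnumerable<T> MyWhere<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
                yield return item;
        }
    }
}
```

Calling numbers.MyWhere(n => n < 5) on the array above enumerates 4, 1, 3, 2, 0, because there is simply no step in the loop that could reorder anything.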
