I'm having issues with a certain task. It's not a homework or anything, it's rather a personal matter now. And I want to know if there's even a solution for this...
The goal is to achieve expected O(n) worst-case time complexity for a function that takes two string arrays as input (let's call the first one A and the second one B) and returns an array of integers, where each element is the index of the corresponding element in array A.
This is what the function should look like:
private static int[] GetExistingStrings(string[] A, string[] B) { ... }
Array A contains all possible names
Array B contains names which should be excluded (i.e. if some of the names stored in B are also in A, their indices should not be included in the output int[] array). B may also contain arbitrary strings that are not present in A at all, or it may even be empty.
For example, if we have these arrays:
string[] A = { "one", "two", "three", "four" }; // 0, 1, 2, 3
string[] B = { "two", "three" }; // Indices of "two" and "three" not taken into account
The function should return:
int[] result = { 0, 3 }; // Indices of "one" and "four"
At first, I tried doing it the obvious and simple way (with nested for-loops):
private static int[] GetExistingStrings(string[] A, string[] B)
{
LinkedList<int> aIndices = new LinkedList<int>();
for (int n = 0; n < A.Length; n++)
{
bool isExcluded = false;
for (int m = 0; m < B.Length; m++)
{
if (A[n].Equals(B[m]))
{
isExcluded = true;
break;
}
}
if (!isExcluded)
{
aIndices.AddLast(n);
}
}
int[] resultArray = new int[aIndices.Count];
aIndices.CopyTo(resultArray, 0);
return resultArray;
}
I used LinkedList because we can't know in advance what the output array's size should be, and also because adding new nodes to this list is a constant O(1) operation. The problem here, of course, is that this function (as I assume) has O(n*m) time complexity. So, we need to find another way...
My second approach was:
private static int[] GetExistingStrings(string[] A, string[] B)
{
int n = A.Length;
int m = B.Length;
if (m == 0)
{
return GetDefaultOutputArray(n);
}
HashSet<string> bSet = new HashSet<string>(B);
LinkedList<int> aIndices = new LinkedList<int>();
for (int i = 0; i < n; i++)
{
if (!bSet.Contains(A[i]))
{
aIndices.AddLast(i);
}
}
// An empty result is valid here: it means every name in A was excluded.
int[] result = new int[aIndices.Count];
aIndices.CopyTo(result, 0);
return result;
}
// Just an utility function that returns a default array
// with length "arrayLength", where first element is 0, next one is 1 and so on...
private static int[] GetDefaultOutputArray(int arrayLength)
{
int[] array = new int[arrayLength];
for (int i = 0; i < arrayLength; i++)
{
array[i] = i;
}
return array;
}
Here the idea was to add all elements of the B array to a HashSet and then use its Contains() method to check for membership in a for-loop. But I can't quite calculate the time complexity of this function... I know for sure that the code in the for-loop will execute n times. But what bugs me the most is the HashSet initialization: should it be taken into account here? How does it affect the time complexity? Is this function O(n)? Or O(n+m) because of the HashSet initialization?
Is there any way to solve this task and achieve O(n)?
If you have n elements in A, m elements in B, and the strings are of length k, the expected time of a hashmap approach is O(k*(m + n)). Unfortunately the worst time is O(k*m*(m + n)) if the hashing algorithm doesn't work. (The odds of which are very low.) I had this wrong before, thanks to @PaulHankin for the correction.
To get O(k*(m + n)) worst time we have to take a very different approach. What you do is build a trie out of B. And now you go through each element of A and look it up in the trie. Unlike a hash, a trie has guaranteed worst case performance (and better yet, allows prefix lookups even though we aren't using that). This approach gives us not just expected average time O(k*(m + n)) but also the same worst time.
You cannot do better than this because just processing the lists requires processing O(k*(m + n)) data.
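The trie idea above can be sketched roughly as follows. This is only an illustration, not a tuned implementation: the TrieExclusion class and its Node type are hypothetical names, and the child map is a Dictionary purely for brevity. Strictly speaking, that brings hashing back at the per-character level; for a small, known alphabet an array indexed by character would keep the worst-case guarantee.

```csharp
using System;
using System.Collections.Generic;

public static class TrieExclusion
{
    // Hypothetical helper type: one node per character along a stored string.
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsEnd; // true if some string of B ends exactly at this node
    }

    public static int[] GetExistingStrings(string[] a, string[] b)
    {
        // Build the trie from B: one step per character, O(k * m) in total.
        var root = new Node();
        foreach (string s in b)
        {
            Node node = root;
            foreach (char c in s)
            {
                if (!node.Children.TryGetValue(c, out Node next))
                {
                    next = new Node();
                    node.Children[c] = next;
                }
                node = next;
            }
            node.IsEnd = true;
        }

        // Look up every element of A: one step per character, O(k * n) in total.
        var indices = new List<int>();
        for (int i = 0; i < a.Length; i++)
        {
            if (!Contains(root, a[i]))
            {
                indices.Add(i);
            }
        }
        return indices.ToArray();
    }

    // Walks the trie along s; s is present only if the walk ends on an end marker.
    private static bool Contains(Node root, string s)
    {
        Node node = root;
        foreach (char c in s)
        {
            if (!node.Children.TryGetValue(c, out node))
            {
                return false;
            }
        }
        return node.IsEnd;
    }
}
```

With the arrays from the question, TrieExclusion.GetExistingStrings(A, B) returns { 0, 3 }.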
Here is how you could rewrite your second approach using LINQ, while also selecting case-insensitive string comparison:
public static int[] GetExistingStrings(string[] first, string[] second)
{
var secondSet = new HashSet<string>(second, StringComparer.OrdinalIgnoreCase);
return first
.Select((e, i) => (Element : e, Index : i))
.Where(p => !secondSet.Contains(p.Element))
.Select(p => p.Index)
.ToArray();
}
The time and space complexity are the same as in your second approach (a single pass over each array). It's just a fancier way to do the same thing.
Split a List into smaller lists of N size
The purpose of this post is to share knowledge and opinions about LINQ, without using "for" loops and "ranges" directly.
Example: I have a list of 100 items and I need to make it into 10 lists.
I use the following script. Does anybody have a better or more performant way?
var subLists = myList.Select((x, i) => new { Index = i, Item = x })
.GroupBy(x => x.Index / maxItemsPerSubList) // maxItemsPerSubList: maximum number of items in each sublist
.Select(x => x.Select(v => v.Item).ToList());
It's a slow operation:
(x, i) => new { Index = i, Item = x }
Here's an extension method that will work with any list
public static IEnumerable<List<T>> splitList<T>(this List<T> items, int size)
{
for (int i=0; i < items.Count; i+= size)
{
yield return items.GetRange(i, Math.Min(size, items.Count - i));
}
}
Or, for better performance:
public static List<List<T>> splitList<T>(this List<T> items, int size)
{
List<List<T>> list = new List<List<T>>();
for (int i = 0; i < items.Count; i += size)
list.Add(items.GetRange(i, Math.Min(size, items.Count - i)));
return list;
}
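On .NET 6 or later, the built-in Enumerable.Chunk method does the same splitting, so a hand-written helper may not be needed. A minimal sketch, assuming a .NET 6+ target (the ChunkExample wrapper is only a hypothetical name for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChunkExample
{
    // Splits the sequence into consecutive arrays of at most 'size' items;
    // only the last array may be shorter.
    public static List<T[]> SplitIntoChunks<T>(IEnumerable<T> items, int size)
    {
        return items.Chunk(size).ToList();
    }
}
```

For a list of 100 items and a size of 10, this yields 10 arrays of 10 items each.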
Let's create a generic answer. One that works for any sequence of any length, where you want to split the sequence into a sequence of sub-sequences, where every sub-sequence has a specified length, except maybe for the last:
For example:
IEnumerable<int> items = new[] {10, 11, 12, 13, 14, 15, 16, 17};
// split into subsequences of length 3:
IEnumerable<IEnumerable<int>> splitSequence = items.Split(3);
// splitSequence is a sequence of 3 subsequences:
// {10, 11, 12},
// {13, 14, 15},
// {16, 17}
We'll do this by creating an extension method. This way, the method Split can be used as any LINQ function. See extension methods demystified. To make it efficient, I'll enumerate only once, and I don't enumerate any more items than requested for.
public static IEnumerable<IEnumerable<TSource>> Split<TSource>(this IEnumerable<TSource> source, int splitSize)
{
// TODO: exception if null source, or non-positive splitSize
// Get the enumerator and enumerate enough elements to return
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
while (enumerator.MoveNext())
{
// there are still items in the source; fill a new sub-sequence
var subSequence = new List<TSource>(splitSize);
do
{ // add the current item to the list:
subSequence.Add(enumerator.Current);
}
// repeat this until the subSequence is full or until source has no more elements:
while (subSequence.Count < splitSize && enumerator.MoveNext());
// return the subSequence
yield return subSequence;
}
}
}
Usage:
// Get all Students that live in New York, split them into groups of 10 Students
// and return groups that have at least one Law Student
var newYorkLawStudentGroups = GetStudents()
.Where(student => student.UniversityLocation == "New York")
.Split(10)
.Where(studentGroup => studentGroup.Any(student => student.Study == "Law"));
This question is not a duplicate. As mentioned, I am asking whether there is a LINQ-based form that would be more performant. I am already aware of "for" loops with "range", for example.
Thank you all for your collaboration, comments and possible solutions !!!
I'm a bit stuck using quick sort algorithm on an integer array, while saving the original indexes of the elements as they're moved around during the sorting process. Using C#/Visual studio
For example
ToSort Array {52,05,08,66,02,10}
Indexes : 0 1 2 3 4 5
AfterSort Array {02,05,08,10,52,66}
Indexes : 4 1 2 5 0 3
I need to save the indexes of the sorted values in another array.
I feel like this is very complex as quick sorting is recursive and any help or pointers would be much appreciated! Thanks!
As @Will said, you can do something like this:
var myArray = new int[] { 52, 05, 08, 66, 02, 10 };
// In each tuple, Item1 is the number and Item2 is its original index
var myIndexedArray = myArray.Select( ( n, index ) => Tuple.Create( n, index ) );
// Or, if you are using C# 7, you can use a tuple literal instead:
// var myIndexedArray = myArray.Select( ( n, index ) => ( n, index ) );
// Call your quick sort method, sorting by Item1 (the number) inside the method,
// or use Enumerable.OrderBy:
myIndexedArray = myIndexedArray.OrderBy(x => x.Item1);
// Then get your values back
int[] numbers = myIndexedArray.Select(x => x.Item1).ToArray();
int[] indexes = myIndexedArray.Select(x => x.Item2).ToArray();
LINQ OrderBy uses QuickSort internally. So instead of implementing QuickSort yourself, use OrderBy, if needed with a custom IComparer<T>.
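If only the permutation of indexes is needed, another option is the Array.Sort overload that takes a keys array and an items array and reorders both together. The sketch below is one way to use it; the IndexedSort class and SortedIndexes method are hypothetical names. Note that Array.Sort is not a stable sort, so equal values may come out in either index order.

```csharp
using System;
using System.Linq;

public static class IndexedSort
{
    // Returns the original indexes in sorted order; 'values' itself is left untouched.
    public static int[] SortedIndexes(int[] values)
    {
        int[] keys = (int[])values.Clone();                          // copy, so the input is not reordered
        int[] indexes = Enumerable.Range(0, values.Length).ToArray(); // 0, 1, 2, ...
        Array.Sort(keys, indexes);                                   // sorts keys and moves indexes in step
        return indexes;
    }
}
```

For the array { 52, 05, 08, 66, 02, 10 } from the question, SortedIndexes returns { 4, 1, 2, 5, 0, 3 }.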
Put the data to be sorted into an anonymous type which remembers the original index, then sort by value. You can retrieve the original index from the index property of the sorted elements.
using System.Linq;
var data = new int[] { 52,05,08,66,02,10 };
var sortingDictionary = data
.Select((value, index) => new { value, index });
var sorted = sortingDictionary
.OrderBy(kvp => kvp.value)
.ToList(); // enumerate before looping over result!
for (var newIndex = 0; newIndex < sorted.Count(); newIndex ++) {
var item = sorted.ElementAt(newIndex);
Console.WriteLine(
$"New index: {newIndex}, old index: {item.index}, value: {item.value}"
);
}
Edit: incorporated improvements suggested by mjwills
Assume we have a jagged array
int[][] a = { new[] { 1, 2, 3, 4 }, new[] { 5, 6, 7, 8 }, new[] { 9, 10, 11, 12 } };
To get the sum of the second row and the sum of the second column, we can write the following two lines, respectively:
int rowSum = a[1].Sum();
int colSum = a.Select(row => row[1]).Sum();
But if we have a definition of a 2-dimensional array
int[,] a = { { 1, 2, 3, 4 }, { 5, 6, 7, 8 }, { 9, 10, 11, 12 } };
the code above will not work, due to compiler errors:
Error 1 Wrong number of indices inside []; expected 2
Error 2 'int[*,*]' does not contain a definition for 'Select' and no extension method 'Select' accepting a first argument of type 'int[*,*]' could be found (are you missing a using directive or an assembly reference?)
So, the question: How can LINQ methods be used with n-dimensional arrays that are not jagged? And is there a method to convert a rectangular array to a jagged one?
P.S. I tried to find the answer in documentation, but without result.
LINQ to Objects is based on the IEnumerable<T> Interface, i.e. a one-dimensional sequence of values. This means it doesn't mix well with n-dimensional data structures like non-jagged arrays, although it's possible.
You can generate a one-dimensional sequence of integers that index into the n-dimensional array:
int rowSum = Enumerable.Range(0, a.GetLength(1)).Sum(i => a[1, i]);
int colSum = Enumerable.Range(0, a.GetLength(0)).Sum(i => a[i, 1]);
About your question "How to use LINQ methods with n-dimensional arrays":
You can't use most LINQ methods with an n-dimensional array, because such an array only implements IEnumerable but not IEnumerable<T>, and most of the LINQ extension methods are extension methods for IEnumerable<T>.
About the other question: See dtb's answer.
To add to dtb's solution, a more general way of iterating over all items of the array would be:
int[,] b = { { 1, 2, 3, 4 }, { 5, 6, 7, 8 }, { 9, 10, 11, 12 } };
var flattenedArray = Enumerable.Range(0, b.GetLength(0))
.SelectMany(i => Enumerable.Range(0, b.GetLength(1))
.Select(j => new { Row = i, Col = j }));
And now:
var rowSum2 = flattenedArray.Where(t => t.Row == 1).Sum(t => b[t.Row, t.Col]);
var colSum2 = flattenedArray.Where(t => t.Col == 1).Sum(t => b[t.Row, t.Col]);
Of course this is ultra-wasteful as we are creating coordinate tuples even for those items that we will end up filtering out with Where, but if you don't know what the selection criteria will be beforehand this is the way to go (or not -- this seems more like an exercise than something you'd want to do in practice).
I can also imagine how this might be extended for arrays of any rank (not just 2D) using a recursive lambda and something like Tuple, but that crosses over into masochism territory.
The 2D array doesn't have any built in way of iterating over a row or column. It's not too difficult to create your own such method though. See this class for an implementation which gets an enumerable for row and column.
public static class LINQTo2DArray
{
public static IEnumerable<T> Row<T>(this T[,] Array, int Row)
{
for (int i = 0; i < Array.GetLength(1); i++)
{
yield return Array[Row, i];
}
}
public static IEnumerable<T> Column<T>(this T[,] Array, int Column)
{
for (int i = 0; i < Array.GetLength(0); i++)
{
yield return Array[i, Column];
}
}
}
You can also flatten the array using a.Cast<int>(), but you would then lose all the information about columns and rows.
A simpler way is the following:
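As a rough illustration of the flattening approach: enumerating a rectangular array visits it row by row, so a row can be recovered from the flattened sequence with Skip and Take. The sketch also includes a small helper that copies a rectangular array into a jagged one, which the question asked about. RectangularArrays, RowSum and ToJagged are hypothetical names, not framework methods.

```csharp
using System;
using System.Linq;

public static class RectangularArrays
{
    // Sum of one row, taken from the flattened (row-by-row) sequence.
    public static int RowSum(int[,] a, int row)
    {
        int cols = a.GetLength(1);
        return a.Cast<int>().Skip(row * cols).Take(cols).Sum();
    }

    // Copies a rectangular array into a jagged array with the same contents.
    public static T[][] ToJagged<T>(T[,] source)
    {
        int rows = source.GetLength(0);
        int cols = source.GetLength(1);
        var result = new T[rows][];
        for (int r = 0; r < rows; r++)
        {
            result[r] = new T[cols];
            for (int c = 0; c < cols; c++)
            {
                result[r][c] = source[r, c];
            }
        }
        return result;
    }
}
```

Once converted with ToJagged, the jagged-array expressions from the question, such as Select(row => row[1]).Sum(), work unchanged.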
var t = new List<Tuple<int, int>>();
int[][] a = t.Select(x => new int[]{ x.Item1, x.Item2}).ToArray();
The simplest LINQ only approach I can see to do these kinds of row and column operations on a two dimensional array is to define the following lookups:
var cols = a
.OfType<int>()
.Select((x, n) => new { x, n, })
.ToLookup(xn => xn.n % a.GetLength(1), xn => xn.x);
var rows = a
.OfType<int>()
.Select((x, n) => new { x, n, })
.ToLookup(xn => xn.n / a.GetLength(1), xn => xn.x);
Now you can simply do this:
var firstColumnSum = cols[0].Sum();
As for n-dimensional, it just gets too painful... Sorry.
This might sound lame, but I have not been able to find a really good explanation of Aggregate.
Good means short, descriptive, comprehensive with a small and clear example.
The easiest-to-understand definition of Aggregate is that it performs an operation on each element of the list taking into account the operations that have gone before. That is to say it performs the action on the first and second element and carries the result forward. Then it operates on the previous result and the third element and carries forward. etc.
Example 1. Summing numbers
var nums = new[]{1,2,3,4};
var sum = nums.Aggregate( (a,b) => a + b);
Console.WriteLine(sum); // output: 10 (1+2+3+4)
This adds 1 and 2 to make 3. Then adds 3 (result of previous) and 3 (next element in sequence) to make 6. Then adds 6 and 4 to make 10.
Example 2. create a csv from an array of strings
var chars = new []{"a","b","c","d"};
var csv = chars.Aggregate( (a,b) => a + ',' + b);
Console.WriteLine(csv); // Output a,b,c,d
This works in much the same way. It concatenates a, a comma, and b to make a,b. Then it concatenates a,b with a comma and c to make a,b,c, and so on.
Example 3. Multiplying numbers using a seed
For completeness, there is an overload of Aggregate which takes a seed value.
var multipliers = new []{10,20,30,40};
var multiplied = multipliers.Aggregate(5, (a,b) => a * b);
Console.WriteLine(multiplied); //Output 1200000 ((((5*10)*20)*30)*40)
Much like the above examples, this starts with a value of 5 and multiplies it by the first element of the sequence, 10, giving a result of 50. This result is carried forward and multiplied by the next number in the sequence, 20, to give a result of 1000. This continues through the remaining 2 elements of the sequence.
Live examples: http://rextester.com/ZXZ64749
Docs: http://msdn.microsoft.com/en-us/library/bb548651.aspx
Addendum
Example 2, above, uses string concatenation to create a list of values separated by a comma. This is a simplistic way to explain the use of Aggregate which was the intention of this answer. However, if using this technique to actually create a large amount of comma separated data, it would be more appropriate to use a StringBuilder, and this is entirely compatible with Aggregate using the seeded overload to initiate the StringBuilder.
var chars = new []{"a","b","c", "d"};
var csv = chars.Aggregate(new StringBuilder(), (a,b) => {
if(a.Length>0)
a.Append(",");
a.Append(b);
return a;
});
Console.WriteLine(csv);
Updated example: http://rextester.com/YZCVXV6464
It partly depends on which overload you're talking about, but the basic idea is:
Start with a seed as the "current value"
Iterate over the sequence. For each value in the sequence:
Apply a user-specified function to transform (currentValue, sequenceValue) into (nextValue)
Set currentValue = nextValue
Return the final currentValue
You may find the Aggregate post in my Edulinq series useful - it includes a more detailed description (including the various overloads) and implementations.
One simple example is using Aggregate as an alternative to Count:
// 0 is the seed, and for each item, we effectively increment the current value.
// In this case we can ignore "item" itself.
int count = sequence.Aggregate(0, (current, item) => current + 1);
Or perhaps summing all the lengths of strings in a sequence of strings:
int total = sequence.Aggregate(0, (current, item) => current + item.Length);
Personally I rarely find Aggregate useful - the "tailored" aggregation methods are usually good enough for me.
Super short
Aggregate works like fold in Haskell/ML/F#.
Slightly longer
.Max(), .Min(), .Sum() and .Average() all iterate over the elements in a sequence and aggregate them using the respective aggregate function. .Aggregate() is a generalized aggregator in that it allows the developer to specify the start state (aka seed) and the aggregate function.
I know you asked for a short explanation, but as others have already given a couple of short answers, I figured you might be interested in a slightly longer one.
Long version with code
One way to illustrate what it does is to show how you would implement Sample Standard Deviation once using foreach and once using .Aggregate. Note: I haven't prioritized performance here, so I iterate over the collection several times unnecessarily.
First a helper function used to create a sum of quadratic distances:
static double SumOfQuadraticDistance (double average, int value, double state)
{
var diff = (value - average);
return state + diff * diff;
}
Then Sample Standard Deviation using ForEach:
static double SampleStandardDeviation_ForEach (
this IEnumerable<int> ints)
{
var length = ints.Count ();
if (length < 2)
{
return 0.0;
}
const double seed = 0.0;
var average = ints.Average ();
var state = seed;
foreach (var value in ints)
{
state = SumOfQuadraticDistance (average, value, state);
}
var sumOfQuadraticDistance = state;
return Math.Sqrt (sumOfQuadraticDistance / (length - 1));
}
Then once using .Aggregate:
static double SampleStandardDeviation_Aggregate (
this IEnumerable<int> ints)
{
var length = ints.Count ();
if (length < 2)
{
return 0.0;
}
const double seed = 0.0;
var average = ints.Average ();
var sumOfQuadraticDistance = ints
.Aggregate (
seed,
(state, value) => SumOfQuadraticDistance (average, value, state)
);
return Math.Sqrt (sumOfQuadraticDistance / (length - 1));
}
Note that these functions are identical except for how sumOfQuadraticDistance is calculated:
var state = seed;
foreach (var value in ints)
{
state = SumOfQuadraticDistance (average, value, state);
}
var sumOfQuadraticDistance = state;
Versus:
var sumOfQuadraticDistance = ints
.Aggregate (
seed,
(state, value) => SumOfQuadraticDistance (average, value, state)
);
So what .Aggregate does is that it encapsulates this aggregator pattern and I expect that the implementation of .Aggregate would look something like this:
public static TAggregate Aggregate<TAggregate, TValue> (
this IEnumerable<TValue> values,
TAggregate seed,
Func<TAggregate, TValue, TAggregate> aggregator
)
{
var state = seed;
foreach (var value in values)
{
state = aggregator (state, value);
}
return state;
}
Using the Standard deviation functions would look something like this:
var ints = new[] {3, 1, 4, 1, 5, 9, 2, 6, 5, 4};
var average = ints.Average ();
var sampleStandardDeviation = ints.SampleStandardDeviation_Aggregate ();
var sampleStandardDeviation2 = ints.SampleStandardDeviation_ForEach ();
Console.WriteLine (average);
Console.WriteLine (sampleStandardDeviation);
Console.WriteLine (sampleStandardDeviation2);
IMHO
So does .Aggregate help readability? In general I love LINQ because I think .Where, .Select, .OrderBy and so on greatly help readability (if you avoid inlined hierarchical .Selects). Aggregate has to be in LINQ for completeness reasons, but personally I am not so convinced that .Aggregate adds readability compared to a well-written foreach.
A picture is worth a thousand words
Reminder:
Func<X, Y, R> is a function with two inputs of type X and Y, that returns a result of type R.
Enumerable.Aggregate has three overloads:
Overload 1:
A Aggregate<A>(IEnumerable<A> a, Func<A, A, A> f)
Example:
new[]{1,2,3,4}.Aggregate((x, y) => x + y); // 10
This overload is simple, but it has the following limitations:
the sequence must contain at least one element,
otherwise the function will throw an InvalidOperationException.
elements and result must be of the same type.
Overload 2:
B Aggregate<A, B>(IEnumerable<A> a, B bIn, Func<B, A, B> f)
Example:
var hayStack = new[] {"straw", "needle", "straw", "straw", "needle"};
var nNeedles = hayStack.Aggregate(0, (n, e) => e == "needle" ? n+1 : n); // 2
This overload is more general:
a seed value must be provided (bIn).
the collection can be empty,
in this case, the function will yield the seed value as result.
elements and result can have different types.
Overload 3:
C Aggregate<A,B,C>(IEnumerable<A> a, B bIn, Func<B,A,B> f, Func<B,C> f2)
The third overload is not very useful IMO.
The same can be written more succinctly by using overload 2 followed by a function that transforms its result.
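To make the differences between the overloads concrete, here is a small sketch. The AggregateOverloads class and its method names are made up for illustration. It computes an average word length once with overload 3 and once with overload 2 followed by a plain expression, and it checks the empty-sequence behaviour of overload 1.

```csharp
using System;
using System.Linq;

public static class AggregateOverloads
{
    // Overload 3: the result selector runs once, on the final accumulator.
    public static double AverageLengthWithSelector(string[] words)
    {
        return words.Aggregate(0, (total, w) => total + w.Length,
                               total => (double)total / words.Length);
    }

    // Overload 2 followed by an ordinary expression gives the same value.
    public static double AverageLengthWithoutSelector(string[] words)
    {
        int total = words.Aggregate(0, (sum, w) => sum + w.Length);
        return (double)total / words.Length;
    }

    // Overload 1 (no seed) throws InvalidOperationException on an empty sequence.
    public static bool SeedlessThrowsOnEmpty()
    {
        try
        {
            new int[0].Aggregate((x, y) => x + y);
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

With a seed, the same empty sequence simply yields the seed: new int[0].Aggregate(7, (x, y) => x + y) evaluates to 7.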
The illustrations are adapted from this excellent blogpost.
Aggregate is basically used to Group or Sum up data.
According to MSDN
"Aggregate Function Applies an accumulator function over a sequence."
Example 1: Add all the numbers in a array.
int[] numbers = new int[] { 1,2,3,4,5 };
int aggregatedValue = numbers.Aggregate((total, nextValue) => total + nextValue);
*Important: by default, the initial aggregate value is the first element of the sequence.
i.e. in this example the initial value of the total variable is 1, the first element of the array.
variable explanation
total: it will hold the sum up value(aggregated value) returned by the func.
nextValue: the next value in the array sequence. This value is then added to the aggregated value, i.e. total.
Example 2: Add all items in an array. Also set the initial accumulator value to start adding with from 10.
int[] numbers = new int[] { 1,2,3,4,5 };
int aggregatedValue = numbers.Aggregate(10, (total, nextValue) => total + nextValue);
arguments explanation:
The first argument is the initial value (the starting value, i.e. the seed), which is added to the first value in the array to start the accumulation.
The second argument is a Func that takes two ints:
1. total: as before, this holds the running sum (the aggregated value) returned by the Func after each calculation.
2. nextValue: the next value in the array sequence. This value is then added to the aggregated value, i.e. total.
Debugging this code will also give you a better understanding of how Aggregate works.
In addition to all the great answers here already, I've also used it to walk an item through a series of transformation steps.
If a transformation is implemented as a Func<T,T>, you can add several transformations to a List<Func<T,T>> and use Aggregate to walk an instance of T through each step.
A more concrete example
You want to take a string value and walk it through a series of text transformations that could be built programmatically.
var transformationPipeLine = new List<Func<string, string>>();
transformationPipeLine.Add((input) => input.Trim());
transformationPipeLine.Add((input) => input.Substring(1));
transformationPipeLine.Add((input) => input.Substring(0, input.Length - 1));
transformationPipeLine.Add((input) => input.ToUpper());
var text = " cat ";
var output = transformationPipeLine.Aggregate(text, (input, transform)=> transform(input));
Console.WriteLine(output);
This will create a chain of transformations: Remove leading and trailing spaces -> remove first character -> remove last character -> convert to upper-case. Steps in this chain can be added, removed, or reordered as needed, to create whatever kind of transformation pipeline is required.
The end result of this specific pipeline, is that " cat " becomes "A".
This can become very powerful once you realize that T can be anything. It could be used for image transformations such as filters, with Bitmap as the type, for example.
Learned a lot from Jamiec's answer.
If the only need is to generate CSV string, you may try this.
var csv3 = string.Join(",",chars);
Here is a test with 1 million strings
0.28 seconds = Aggregate w/ String Builder
0.30 seconds = String.Join
Source code is here
Definition
The Aggregate method is an extension method for generic collections. It applies a function to each item of a collection, and it also passes each result on as the initial value for the next iteration. As a result, we get a single computed value (min, max, avg, or another statistical value) from a collection.
Therefore, the Aggregate method is a form of safe implementation of a recursive function.
It is safe because the iteration visits each item of the collection exactly once, so a wrong exit condition cannot cause an infinite loop. It is recursive in the sense that the current function's result is used as a parameter for the next function call.
Syntax:
collection.Aggregate(seed, func, resultSelector);
seed - initial value by default;
func - our recursive function. It can be a lambda-expression, a Func delegate or a function type T F(T result, T nextValue);
resultSelector - it can be a function like func or an expression to compute, transform, change, convert the final result.
How it works:
var nums = new[]{1, 2};
var result = nums.Aggregate(1, (acc, n) => acc + n); //result = (1 + 1) + 2 = 4
var result2 = nums.Aggregate(0, (acc, n) => acc + n, response => response / 2.0); //result2 = ((0 + 1) + 2) / 2.0 = 3 / 2.0 = 1.5
Practical usage:
Find Factorial from a number n:
int n = 7;
var numbers = Enumerable.Range(1, n);
var factorial = numbers.Aggregate((result, x) => result * x);
which is doing the same thing as this function:
public static int Factorial(int n)
{
if (n < 1) return 1;
return n * Factorial(n - 1);
}
Aggregate() is one of the most powerful LINQ extension methods, like Select() and Where(). We can use it to replace the Sum(), Min(), Max() and Average() functionality, or to change it by adding additional context:
var numbers = new[]{3, 2, 6, 4, 9, 5, 7};
var avg = numbers.Aggregate(0.0, (result, x) => result + x, response => (double)response/(double)numbers.Count());
var min = numbers.Aggregate((result, x) => (result < x)? result: x);
More complex usage of extension methods:
var path = @"c:\path-to-folder";
string[] txtFiles = Directory.GetFiles(path).Where(f => f.EndsWith(".txt")).ToArray();
var output = txtFiles.Select(f => File.ReadAllText(f, Encoding.Default)).Aggregate((result, content) => result + content);
var summaryPath = Path.Combine(path, "summary.txt");
File.WriteAllText(summaryPath, output, Encoding.Default);
Console.WriteLine("Text files merged into: {0}", summaryPath); //or other log info
This is an explanation about using Aggregate on a Fluent API such as Linq Sorting.
var list = new List<Student>();
var sorted = list
.OrderBy(s => s.LastName)
.ThenBy(s => s.FirstName)
.ThenBy(s => s.Age)
.ThenBy(s => s.Grading)
.ThenBy(s => s.TotalCourses);
Now let's say we want to implement a sort function that takes a set of fields. This is very easy using Aggregate instead of a for-loop, like this:
public static IOrderedEnumerable<Student> MySort(
this List<Student> list,
params Func<Student, object>[] fields)
{
var firstField = fields.First();
var otherFields = fields.Skip(1);
var init = list.OrderBy(firstField);
// otherFields already excludes the first field, so it must not be skipped again
return otherFields.Aggregate(init, (resultList, current) => resultList.ThenBy(current));
}
And we can use it like this:
var sorted = list.MySort(
s => s.LastName,
s => s.FirstName,
s => s.Age,
s => s.Grading,
s => s.TotalCourses);
Aggregate can be used to sum the columns of a jagged integer array:
int[][] nonMagicSquare =
{
new int[] { 3, 1, 7, 8 },
new int[] { 2, 4, 16, 5 },
new int[] { 11, 6, 12, 15 },
new int[] { 9, 13, 10, 14 }
};
IEnumerable<int> rowSums = nonMagicSquare
.Select(row => row.Sum());
IEnumerable<int> colSums = nonMagicSquare
.Aggregate(
(priorSums, currentRow) =>
priorSums.Select((priorSum, index) => priorSum + currentRow[index]).ToArray()
);
Select with index is used within the Aggregate func to sum the matching columns and return a new Array; { 3 + 2 = 5, 1 + 4 = 5, 7 + 16 = 23, 8 + 5 = 13 }.
Console.WriteLine("rowSums: " + string.Join(", ", rowSums)); // rowSums: 19, 27, 44, 46
Console.WriteLine("colSums: " + string.Join(", ", colSums)); // colSums: 25, 24, 45, 42
But counting the number of trues in a Boolean array is more difficult since the accumulated type (int) differs from the source type (bool); here a seed is necessary in order to use the second overload.
bool[][] booleanTable =
{
new bool[] { true, true, true, false },
new bool[] { false, false, false, true },
new bool[] { true, false, false, true },
new bool[] { true, true, false, false }
};
IEnumerable<int> rowCounts = booleanTable
.Select(row => row.Select(value => value ? 1 : 0).Sum());
IEnumerable<int> seed = new int[booleanTable.First().Length];
IEnumerable<int> colCounts = booleanTable
.Aggregate(seed,
(priorSums, currentRow) =>
priorSums.Select((priorSum, index) => priorSum + (currentRow[index] ? 1 : 0)).ToArray()
);
Console.WriteLine("rowCounts: " + string.Join(", ", rowCounts)); // rowCounts: 3, 1, 2, 2
Console.WriteLine("colCounts: " + string.Join(", ", colCounts)); // colCounts: 3, 2, 1, 2
Everyone has given their explanation. Here is mine.
Aggregate method applies a function to each item of a collection. For example, let's have collection { 6, 2, 8, 3 } and the function Add (operator +) it does (((6+2)+8)+3) and returns 19
var numbers = new List<int> { 6, 2, 8, 3 };
int sum = numbers.Aggregate(func: (result, item) => result + item);
// sum: (((6+2)+8)+3) = 19
In this example, a named method Add is passed instead of a lambda expression.
var numbers = new List<int> { 6, 2, 8, 3 };
int sum = numbers.Aggregate(func: Add);
// sum: (((6+2)+8)+3) = 19
private static int Add(int x, int y) { return x + y; }
A short and essential definition might be this: the LINQ Aggregate extension method lets you declare a kind of recursive function over the elements of a list. It takes two operands: the elements, one at a time and in the order in which they appear in the list, and the result of the previous iteration (or nothing, if there has been no iteration yet).
In this way you can compute the factorial of numbers, or concatenate strings.