Explain Linq Microsoft Select - Indexed [Example]

Explain Linq Microsoft Select - Indexed [Example] - c#

I'm running throuth Microsoft's 101 LINQ Samples, and I'm stumped on how this query knows how to assign the correct int value to the correct int field:
public void Linq12()
{
int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
var numsInPlace = numbers.Select((num, index) => new { Num = num, InPlace = (num == index) });
Console.WriteLine("Number: In-place?");
foreach (var n in numsInPlace)
{
Console.WriteLine("{0}: {1}", n.Num, n.InPlace);
}
}
I saw in SO #336758 that there have been errors in the examples before, but it is much more likely that I am just missing something.
Could someone explain this and how the compiler knows how to interpret this data correctly?
EDIT:
OK, I think my confusion comes from the LINQ extension that enables the Select feature to work. The Func and two int parameters IEnumerable<TResult> IEnumerable<int>.Select(Func<int,int,TResult> selector) are most likely the key to my lack of understanding.

I'm not really sure what you are asking of but the Select iterates over the list starting at index 0. If the value of the element at the current index is equal to the index it will set the InPlace property in the anonymous object to true. My guess is that the code above prints true for 3, 6 and 7, right?
It would also make it easier to explain if you write what you don't understand.
Jon Skeet has written a series of blog post where he implement linq, read about Select here: Reimplementation of Select
UPDATE: I noticed in one of your comment to one of the other comments and it seems like it is the lambda and not linq itself that is confusing you. If you read Skeet's blog post you see that Select has two overloads:
public static IEnumerable<TResult> Select<TSource, TResult>(
this IEnumerable<TSource> source,
Func<TSource, TResult> selector)
public static IEnumerable<TResult> Select<TSource, TResult>(
this IEnumerable<TSource> source,
Func<TSource, int, TResult> selector)
The Select with index matches the second overload. As you can see it is an extension of IEnumerable<TSource> which in your case is the list of ints and therefor you are calling the Select on an IEnumerable<int> and the signature of Select becomes: Select<int, TResult>(this IEnumerable<int> source, Func<int, int, TResult> selector). As you can see I changed TSource against int, since that is the generic type of your IEnumerable<int>. I still have TResult since you are using anonymous type. So that might explain some parts?

Looks correct to me.
First you have an anonymous type with Num and InPlace being created. Then the LINQ Select is just iterating over the elements with the element and the index of that element. If you were to rewrite it without linq and anonymous classes, it would look like this:
class NumsInPlace
{
public int Num { get; set; }
public bool InPlace { get; set; }
}
int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
List<NumsInPlace> numsInPlace = new List<int>();
for (int index = 0; i < numbers.length; i++)
{
int num = numers[index];
numsInPlace.Add(new NumsInPlace() { Num = num, InPlace = (index == num) });
}
Console.WriteLine("Number: In-place?");
foreach (var n in numsInPlace)
{
Console.WriteLine("{0}: {1}", n.Num, n.InPlace);
}
The MSDN on Enumerable.Select has the details, but the projection function (num, index) always has the item first, then the index second (if supplied).

how does the compiler know that LINQ Select is using the index value as an index to the array?
The compiler doesn't know about index values. The implementation of that overload of Select knows about index values.
//all the compiler sees is a method that accepts 2 int parameters and returns a bool.
Func<int, int, bool> twoIntFunc = (x, y) => (x == y);
//The compiler sees there's an overload of Enumerable.Select which accepts such a method.
IEnumerable<bool> query = numbers.Select(twoIntFunc);
Enumerable.Select's implementation does the rest of the work by calling that method with the appropriate parameters.
The first argument to selector represents the element to process. The second argument to selector represents the zero-based index of that element in the source sequence.
So - Select will call your method with (5, 0) first, then Select calls it with (4, 1).

The lambda expression (num, index) => new { Num = num, InPlace = (num == index) } is executed once for each element in the input sequence and passed the item and it's index as the arguments.
Update
Lambda expressions can be implicitly typed, that is, from the required type, the compiler can figure out (or imply) what types you intend the arguments to be.
(num, index) => new { Num = num, InPlace = (num == index) }
is equivalent to
someAnonymousType MyMethod(int num, int index)
{
return new
{
Num = num,
InPlace = (num == index)
};
}
obviously you can't write the latter because you can't type the name of an anonymous type, but the compiler can ;)
The compiler knows this because the overload of Select that you're using accepts a Func<TSource, Int32, TResult>, this is a Func that takes, two arguments of type TSource (the type of your IEnumberable<T>, in this case int) and an Int32 (which represents the index) and returns an object of TResult, being whatever you choose to return from your function, in this case, an anonymous type.
The lambda can be cast to the required type and therefore it just works.

The second argument in the select is the index, which increments as the compiler traverses the array of numbers. The compiler will see
num index num = index
5 0 false
4 1 false
1 2 false
3 3 true
9 4 false
8 5 false
6 6 true
7 7 true
2 8 false
0 9 false

It provides true for 3,6 and 7. You need to remember it starts with the index at 0.

Related

How to distinguish between a found 0 and default(int) when using methods like FindLast?

I have a list of integers and I need to find the last occurrence that matches a predicate. To use a very simple example:
var myList = new List<int> { 1, 5, 6, 20, 18, 2, 3, 0, 4 };
var lastMatch = myList.FindLast(e => e == 0 || e == 2);
This seems like the perfect usecase for FindLast. Problem is that this method returns default(T) if nothing was found, which in the case of integers, is actually a valid value 0. So the question is, if this method returns 0, how can I know if it found something or not? Is there a better method for this case with ints?

Use FindLastIndex instead. If the index is negative no match was found. If it isn't negative: that's the index you want, so: use the indexer with that index.

As an alternative to #MarcGravell answer:
Instead of the List FindLast method, you could use the Linq Last extension method overload that takes a predicate as an argument. It will throw an exception if no match is found.
See Last documentation

In general case when we have IEnumerable<T> with arbitrary T (we can't play trick with int? now)
we can implement an extension method:
public static partial class EnumerableExtensions {
public static int LastIndex<T>(this IEnumerable<T> source,
Predicate<T> predicate) {
if (source is null)
throw new ArgumentNullException(nameof(source));
if (predicate is null)
throw new ArgumentNullException(nameof(predicate));
int result = -1;
int index = -1;
foreach (T item in source) {
index += 1;
if (predicate(item))
result = index;
}
return result;
}
}
And then
var lastMatch = myList.LastIndex(e => e == 0 || e == 2);

Need to understand the below code using func

I am new to delegates. Today I saw a code on this Link. AS i am new to c# and specially to delegates, i was unable to understand the below code.
public static void Main()
{
Func<String, int, bool> predicate = (str, index) => str.Length == index;
String[] words = { "orange", "apple", "Article", "elephant", "star", "and" };
IEnumerable<String> aWords = words.Where(predicate).Select(str => str);
foreach (String word in aWords)
Console.WriteLine(word);
}
The OutPut of the above code is "star". AS predicate is expecting to parameters but in this case we are not passing any parameters. Your comments will be really appreciated.

So first, there's a function definition:
Func<String, int, bool> predicate = (str, index) => str.Length == index;
which reads as "given a string represented as str and an index represented as index return true if the length of the string is equal to the index otherwise false"
When you come across to the enumerable pipeline:
IEnumerable<String> aWords = words.Where(predicate).Select(str => str);
you pass this function definition above, which is similar to:
words.Where((element, index) => element.Length == index).Select(str => str);
and as you can see only the element "star" meets that criteria, i.e. the "star" has length 4 and its index is also 4.
In regard to your confusion of:
AS predicate is expecting to parameters but in this case we are not
passing any parameters.
Note that when it comes to LINQ, we only specify the "what" and the "how" is an implementation detail. so in the aforementioned code, the Where clause will pass each element and its index to the predicate function.
On another note, the Select is superfluous, just IEnumerable<String> aWords = words.Where(predicate) should shuffice.

Let's start to examine this code from the first line.
Func<String, int, bool> predicate = (str, index) => str.Length == index;
Here we have simply the declaration of a variable named predicate This is a particular kind of variable whose value is a function that receives two parameters of type string and int and it is expected to return a bool. But where is the body of that function? Just after the equals sign. The (str,index) are the parameters and the body is simply a comparison between the length of the parameter str with the value of the parameter index.
=> str.Length == index returns true if the condition matches otherwise returns false
Now that we understand the declaration of the delegate, the next question is where we use it. This is even simpler. We can use it whenever we need a function that matches our delegate.
Now there is an overload of Where IEnumerable extension that expects just that, so we can simply put the predicate variable in that place and call the Where extension

In short, the code just says, "Select all words where the length equals index"
string[] words = { "orange", "apple", "Article", "elephant", "star", "and" };
// select all words where the length equals the index
var aWords = words.Where((str, i) => str.Length == i);
foreach (var word in aWords)
Console.WriteLine(word);
Where<TSource>(IEnumerable<TSource>, Func<TSource,Int32,Boolean>)
Filters a sequence of values based on a predicate. Each element's
index is used in the logic of the predicate function.
Func<T1,T2,TResult> Delegate
Encapsulates a method that has two parameters and returns a value of
the type specified by the TResult parameter.
So the only magic is the Func
Func<String, int, bool> predicate = (str, index) => str.Length == index;
Which (in this case) is just a fancy way of writing
public bool DoSomething(string str, int index)
{
return str.Length == index;
}
In essence its just a delegate, unlike an Action its capable of returning a value
Delegates (C# Programming Guide)
A delegate is a type that represents references to methods with a
particular parameter list and return type. When you instantiate a
delegate, you can associate its instance with any method with a
compatible signature and return type. You can invoke (or call) the
method through the delegate instance.

To simplify it a bit,
var aWords = words.Where(str => str.Length == 4);
is the same as:
Func<string, bool> predicate = str => str.Length == 4;
var aWords = words.Where(predicate);
predicate() executes it, and predicate without () can be used to pass it as parameter to a method.
With local functions (functions that can be declared inside other functions) introduced in C# 7, it can be :
bool predicate(string str) { return str.Length == 4; }
var aWords = words.Where(predicate);

lambda expression foreach loop

I have the following code
int someCount = 0;
for ( int i =0 ; i < intarr.Length;i++ )
{
if ( intarr[i] % 2 == 0 )
{
someCount++;
continue;
}
// Some other logic for those not satisfying the condition
}
Is it possible to use any of the Array.Where or Array.SkiplWhile to achieve the same?
foreach(int i in intarr.where(<<condtion>> + increment for failures) )
{
// Some other logic for those not satisfying the condition
}

Use LINQ:
int someCount = intarr.Count(val => val % 2 == 0);

I definitely prefer #nneonneo's way for short statements (and it uses an explicit lambda), but if you want to build a more elaborate query, you can use the LINQ query syntax:
var count = ( from val in intarr
where val % 2 == 0
select val ).Count();
Obviously this is probably a poor choice when the query can be expressed with a single lambda expression, but I find it useful when composing larger queries.
More examples: http://code.msdn.microsoft.com/101-LINQ-Samples-3fb9811b

Nothing (much) prevents you from rolling your own Where that counts the failures. "Nothing much" because neither lambdas nor methods with yield return statements are allowed to reference out/ref parameters, so the desired extension with the following signature won't work:
// dead-end/bad signature, do not attempt
IEnumerable<T> Where(
this IEnumerable<T> self,
Func<T,bool> predicate,
out int failures)
However, we can declare a local variable for the failure-count and return a Func<int> that can get the failure-count, and a local variable is completely valid to reference from lambdas. Thus, here's a possible (tested) implementation:
public static class EnumerableExtensions
{
public static IEnumerable<T> Where<T>(
this IEnumerable<T> self,
Func<T,bool> predicate,
out Func<int> getFailureCount)
{
if (self == null) throw new ArgumentNullException("self");
if (predicate == null) throw new ArgumentNullException("predicate");
int failures = 0;
getFailureCount = () => failures;
return self.Where(i =>
{
bool res = predicate(i);
if (!res)
{
++failures;
}
return res;
});
}
}
...and here's some test code that exercises it:
Func<int> getFailureCount;
int[] items = { 0, 1, 2, 3, 4 };
foreach(int i in items.Where(i => i % 2 == 0, out getFailureCount))
{
Console.WriteLine(i);
}
Console.WriteLine("Failures = " + getFailureCount());
The above test, when run outputs:
0
2
4
Failures = 2
There are a couple caveats I feel obligated to warn about. Since you could break out of the loop prematurely without having walked the entire IEnumerable<>, the failure-count would only reflect encountered-failures, not the total number of failures as in #nneonneo's solution (which I prefer.) Also, if the implementation of LINQ's Where extension were to change in a way that called the predicate more than once per item, then the failure count would be incorrect. One more point of interest is that, from within your loop body you should be able to make calls to the getFailureCount Func to get the current running failure count so-far.
I presented this solution to show that we are not locked-into the existing prepackaged solutions. The language and framework provides us with lots of opportunities to extend it to suit our needs.

what is "if (predicate(item, index))" operation in LINQ?

I am fairly new to LINQ. I am looking at this code, and not sure if I understand this properly. I realize that it is an extension and generic method, but what is predicate(item, index) performing (lets say i pass in an array of ints when calling this method)?
I know that predicate is a delegate, but maybe I just don't know how delegation works, someone has any good example/explanation they'd like to give. Also, what is yield keyword, is it just used in linq stuff?
private static IEnumerable<TSource> WhereImpl<TSource>(
this IEnumerable<TSource> source,
Func<TSource, int, bool> predicate)
{
int index = 0;
foreach (TSource item in source)
{
if (predicate(item, index))
{
yield return item;
}
index++;
}
}
I am trying to follow Reimplementing LINQ to Objects: Part 2 - "Where" from Skeet's blog.

predicate(item, index)
is defined to be of type
Func<TSource, int, bool>
that means a method that has parameters of TSource and int and returns a bool - a predicate.
An example for TSource = string could be (totally made up):
bool IsLengthLargerThan(string s, int length)
{
return s.Length > length;
}
Also, what is yield keyword, is it
just used in linq stuff?
yield is specific to iterator blocks - this has been around before LINQ. It basically works like a state machine - yield return item; will return item to the caller and suspend execution, but once you request the next item, execution will resume on the next line. It's easiest to see how it works if you step through it with a debugger.

First the predicate(item, index) is a delegate that takes in the item in the enumeration and the index of that item in the enumeration. So if you started with an array of integers, the item would be an integer and the index would be its index in the array. This index is the result of the current instance of the enumeration so if you add a Where clause and the original index of an item was 3 and the where filtered out the first 3 then its new index would be 0.
The yield keyword is a C# keyword for easy output of an IEnumerable.

predicate(..) is a function that takes 2 parameters, a string and an int, and return true or false. yield return is a keyword that essentially is like saying "yea, add this to the IEnumerable that I'll be returning, but let's keep looking for others"
So it executes the predicate function with a string and an int. You see index was initialized to 0 and is auto-incremented every time. That's the second param to predicate. When the function returns true, you say "add it to my return collection"

How is a variable in a lambda expression given its value

How does index in the below example obtain its value? I understand that n is automatically obtained from the source numbers, but, while the meaning is clear, I do not see how index is given its value:
int[] numbers = { 5, 4, 1, 3, 9, 8, 6, 7, 2, 0 };
var firstSmallNumbers = numbers.TakeWhile((n, index) => n >= index);
The signature of TakeWhile is:
public static IEnumerable<TSource> TakeWhile<TSource>(this IEnumerable<TSource> source, Func<TSource, int, bool> predicate);

This version of TakeWhile supplies the index of the source element in the sequence as the second parameter to the predicate. I.e. the predicate is called as predicate(5, 0), then predicate(4, 1), predicate(1, 2), predicate(3, 3) etc. See the MSDN documentation.
There is also a “simpler” version of the function, supplying only the values in the sequence, see MSDN.

The index is generated by the implementation of TakeWhile, which might look a bit like this.

Things become clear as long as you figure out how TakeWhile can be implemented :
public static IEnumerable<TSource> TakeWhile<TSource>(this IEnumerable<TSource> source, Func<TSource, int, bool> predicate)
{
int index = 0;
foreach (TSource item in source)
{
if (predicate(item, index))
{
yield return item;
}
else
{
yield break;
}
index++;
}
}

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Explain Linq Microsoft Select - Indexed [Example] - c#

The second argument in the select is the index, which increments as the compiler traverses the array of numbers. The compiler will see num index num = index 5 0 false 4 1 false 1 2 false 3 3 true 9 4 false 8 5 false 6 6 true 7 7 true 2 8 false 0 9 false

It provides true for 3,6 and 7. You need to remember it starts with the index at 0.

Related

How to distinguish between a found 0 and default(int) when using methods like FindLast?

Need to understand the below code using func

lambda expression foreach loop

what is "if (predicate(item, index))" operation in LINQ?

How is a variable in a lambda expression given its value

Categories

Resources