How does Linq know the name of an element?

How does Linq know the name of an element? - c#

I’ve solved a C# beginner Kata on Codewars, asking to return a string[] with only 4-character names. I used a list that is filled in an if statement and then converted back to a string and returned.
My question is regarding the best practice solution below, which is presented later.
I understand that the same string[] that comes as an argument is refilled with elements and returned. But how does the program know that each element of the array is called “name”, as it’s never mentioned before?
Does linq know that a variable with a singular name is an element of a plural name group, here “names”?
Thanks for helping out!
using System;
using System.Collections.Generic;
using System.Linq;
public static class Kata {
public static IEnumerable<string> FriendOrFoe (string[] names) {
return names.Where(name => name.Length == 4);
}
}

I understand that the same string[] that comes as an argument is refilled with elements and returned
I'll address this briefly, because it's not the question you're asking. This isn't true - no refilling of anything occurs. The original array is unaltered and a Where is a loop that runs over the array and selectively emits items from it when the test presented to it evaluates to true for that item. The way it does this is via a special construct called a yield return which is a way of allowing code to return from a method and then re-enter it and carry on from where it left off before, rather than starting all over again from the beginning of the method. There is only ever one array, and the looping/testing is not performed unless you start reading from the set of strings produced by the Where. If you want to know more about that, drop a comment.
Moving on..
Does linq know that a variable with a singular name is an element of a plural name group, here “names”?
No; the IDE knows the variable name because that's what you chose to call it just after the Where(
Perhaps it would help to link it to something you already know. It would be perfectly acceptable to write this code:
public static class Kata {
public static IEnumerable<string> FriendOrFoe (string[] names) {
return names.Where(IsNameOfLengthFour);
}
static bool IsNameOfLengthFour(string name){
return name.Length == 4;
}
}
Where demands some method be supplied that takes a string and returns a boolean. It demands that the input be a string because it's being called on names which is an array of string. If it were ints in the array, the method passed to Where would have to take an int
In C# there's often a push to make things more compact, so to get rid of all that wordiness above, we have a much more compact form of writing method bodies. Let's reduce our wordy version:
static bool IsNameOfLengthFour(string name){
return name.Length == 4;
}
We can chuck out the return type, because we can guess that from the type being returned. We can get rid of static too, and just assume static if we're calling it from inside another static, or not if we're not. We can ditch the input type too, because we can guess that from the type going in.
IsNameOfLengthFour(name){
return name.Length == 4;
}
If we have a special syntx that is only one line and must be a value that is auto returned, we can get rid of the return, and the {} because it's only one line, so we don't need to fence off multiple statements:
IsNameOfLengthFour(name) => name.Length == 4
Now, we don't actually need a method name any more either if we're going to use this in some place where a name is irrelevant, and we really don't need () for a single argument either:
name => name.Length == 4
And that's enough of an expression for the compiler to be able to form a method out of it, and plumb it into something that expects a method taking a string and returning a boolean. We've thrown away all the fluff of a method that we humans like (names and identifiers) and given the compiler just the raw nuts and bolts it needs - the logic of the method. The compiler will recreate the rest of the fluff when it wires it all together for us; we won't ever be able to call this mini-method from elsewhere in our code but we don't care. We got what we wanted, which is a nice compact way of expressing the logic:
Where(n => n.Length==4);
You did a good job, calling the argument to this mini-method something sensible. I see x used a lot and it gets really confusing when what X is changes.. For example:
names
.Where(name => ...)
.GroupBy(name => ...)
.Select(g => g.First())
.Where(name => ...)
Where works on your array of names so calling the argument to the delegate name or n is a good idea. Where will filter it down but ultimately it still emits a set of strings that are names, so it's still a good idea to call it name on the way into a GroupBy.. But a GroupBy produces a set of IGrouping, not a set of string so the thing coming our of a GroupBy is no longer a name.. In the next Select I call it g to reflect that it's a grouping, not a name, but I then take the first item in the grouping which is, actually, a name.. So in the final Where I go back to calling the input argument name to reflect what it's back to being..
When LINQ statements get much more complicated, it really helps to name these arguments well
Note: in this answer I've used words like "list" or "set" and I mean those in the general English sense that an "array is a list of ..", not a specifically C# List<xxx> or HashSet<xxx> sense. If you see lowercase words that align with C# types, they are not intended to refer to that specific type

I understand that the same string[] that comes as an argument is refilled with elements and returned.
No, absolutely not. A new sequence of strings is returned based on the input array. The input array is not modified or changed in any way.
But how does the program know that each element of the array is called “name”, as it’s never mentioned before?
name is a parameter to an anonymous function. The name parameter is a string based on context. This could be x or ASDASDASD or whatever you want, but here we use name since we have, on each call, one "name" from names.
Thus,
names is an array of strings passed into the function
the .Where returns a new IEnumerable<string> from the current array based on a predicate function (e.g. returns true for a match, false to omit)
The predicate name => name.Length == 4 takes a string and returns true if the string is length 4
The return from the function is the strings from names that are exactly 4 characters in length

Related

Count of System.Collections.Generic.List

ViewBag.EquipmentList = myInventoryEntities.p_Configurable_Equipment_Request_Select(Address_ID, false).Select(c => new { Value = c.Quantity + " " + c.Device_Name + " (s)", ID = c.Device_ID.ToString() }).ToList();
In Razor i want to do the following
#ViewBag.EquipmentList.Count
But Count is always == 1
I know i can iterate in a foreach but would rather a more direct approach.
Perhaps I am conceptually off?

EDIT: Okay, so now it seems like you've got past Count not executing:
But Count is always == 1
Sounds like you're always fetching a list with exactly one entry in. If that's unexpected, you should look at why you're only getting a single entry - start with debugging into the code... this doesn't sound like a Razor problem at all.
If your value were really a List<T> for some T, I believe it should be fine. The expression ViewBag.EquipmentList.Count would evaluate dynamically at every point.
In other words, if you're really, really using the assignment code shown, it should be okay.
If, however, the value is just some implementation of IEnumerable<T> which doesn't expose a Count property, then you'd need to use the Enumerable.Count() extension method - and you can't use normal "extension method syntax" with dynamic typing. One simple fix would be to use:
Enumerable.Count(ViewBag.EquipmentList)
... which will still use dynamic typing for the argument to Enumerable.Count, but it won't have to find the Count() method as an extension method.
Alternatively, make sure that your value really is a List<T>. If it's actually an array, you should be able to change the code to use the Length property instead.

#ViewBag.EquipmentListNI.Count()

Resharper: Possible Multiple Enumeration of IEnumerable

I'm using the new Resharper version 6. In several places in my code it has underlined some text and warned me that there may be a Possible multiple enumeration of IEnumerable.
I understand what this means, and have taken the advice where appropriate, but in some cases I'm not sure it's actually a big deal.
Like in the following code:
var properties = Context.ObjectStateManager.GetObjectStateEntry(this).GetModifiedProperties();
if (properties.Contains("Property1") || properties.Contains("Property2") || properties.Contains("Property3")) {
...
}
It's underlining each mention of properties on the second line, warning that I am enumerating over this IEnumerable multiple times.
If I add .ToList() to the end of line 1 (turning properties from a IEnumerable<string> to a List<string>), the warnings go away.
But surely, if I convert it to a List, then it will enumerate over the entire IEnumerable to build the List in the first place, and then enumerate over the List as required to find the properties (i.e. 1 full enumeration, and 3 partial enumerations). Whereas in my original code, it is only doing the 3 partial enumerations.
Am I wrong? What is the best method here?

I don't know exactly what your properties really is here - but if it's essentially representing an unmaterialized database query, then your if statement will perform three queries.
I suspect it would be better to do:
string[] propertiesToFind = { "Property1", "Property2", "Property3" };
if (properties.Any(x => propertiesToFind.Contains(x))
{
...
}
That will logically only iterate over the sequence once - and if there's a database query involved, it may well be able to just use a SQL "IN" clause to do it all in the database in a single query.

If you invoke Contains() on a IEnumerable, it will invoke the extension method which will just iterate through the items in order to find it. IList has real implementation for Contains() that probably are more efficient than a regular iteration through the values (it might have a search tree with hashes?), hence it doesn't warn with IList.
Since the extension method will only be aware that it's an IEnumerable, it probably can not utilize any built-in methods for Contains() even though it would be possible in theory to identify known types and cast them accordingly in order to utilize them.

Understanding how the C# compiler deals with chaining linq methods

I'm trying to wrap my head around what the C# compiler does when I'm chaining linq methods, particularly when chaining the same method multiple times.
Simple example: Let's say I'm trying to filter a sequence of ints based on two conditions.
The most obvious thing to do is something like this:
IEnumerable<int> Method1(IEnumerable<int> input)
{
return input.Where(i => i % 3 == 0 && i % 5 == 0);
}
But we could also chain the where methods, with a single condition in each:
IEnumerable<int> Method2(IEnumerable<int> input)
{
return input.Where(i => i % 3 == 0).Where(i => i % 5 == 0);
}
I had a look at the IL in Reflector; it is obviously different for the two methods, but analysing it further is beyond my knowledge at the moment :)
I would like to find out:
a) what the compiler does differently in each instance, and why.
b) are there any performance implications (not trying to micro-optimize; just curious!)

The answer to (a) is short, but I'll go into more detail below:
The compiler doesn't actually do the chaining - it happens at runtime, through the normal organization of the objects! There's far less magic here than what might appear at first glance - Jon Skeet recently completed the "Where clause" step in his blog series, Re-implementing LINQ to Objects. I'd recommend reading through that.
In very short terms, what happens is this: each time you call the Where extension method, it returns a new WhereEnumerable object that has two things - a reference to the previous IEnumerable (the one you called Where on), and the lambda you provided.
When you start iterating over this WhereEnumerable (for example, in a foreach later down in your code), internally it simply begins iterating on the IEnumerable that it has referenced.
"This foreach just asked me for the next element in my sequence, so I'm turning around and asking you for the next element in your sequence".
That goes all the way down the chain until we hit the origin, which is actually some kind of array or storage of real elements. As each Enumerable then says "OK, here's my element" passing it back up the chain, it also applies its own custom logic. For a Where, it applies the lambda to see if the element passes the criteria. If so, it allows it to continue on to the next caller. If it fails, it stops at that point, turns back to its referenced Enumerable, and asks for the next element.
This keeps happening until everyone's MoveNext returns false, which means the enumeration is complete and there are no more elements.
To answer (b), there's always a difference, but here it's far too trivial to bother with. Don't worry about it :)

The first will use one iterator, the second will use two. That is, the first sets up a pipeline with one stage, the second will involve two stages.
Two iterators have a slight performance disadvantage to one.

LINQ naming Standard - Lambda Expression [duplicate]

This question already has answers here:
Lambda variable names - to short name, or not to short name? [closed]
(11 answers)
Closed 9 years ago.
We normally follow coding / naming standard for all C# syntax. For Example, if we declare string inside the method, we use Scope-datatype-FieldName format. (lstrPersonName)
List<Person> icolPerson;
private LoadPersonName()
{
string lstrPersonaName;
}
i am kind of thinking how do we follow the naming standard in Lambda Expression. Especially when we defined the arguments for func delegate, we use shorted names like x. For Example
var lobjPerson = icolPerson.Where(x => x.person_id = 100)
if you look at the above line, X does not follow any standard. i also read that one purpose of lambda expression is to reduce the lenghthy code.
i am here to seek help to get the right approach for naming standard.
var lobjPerson = icolPerson.Where(x => x.person_id = 100)
OR
var lobjPerson = icolPerson.Where(lobjPerson => lobjPerson .person_id = 100)

My lambdas use one letter arguments, usually the first letter of what the parameter name would be:
buttons.Select(b => b.Text);
services.Select(s => s.Type);
But sometimes I add a few more letters to make things clearer, or to disambiguate between two parameters.
And when there isn't much meaning attached to the things I use xs and ys:
values.Aggregate((x, y) => x + y);
All said, the standard I use for lambdas is shortness first, expressiveness later, because the context tends to help understand things (in the first example it's obvious that the b stands for a button).

I've done a lot of programming in VBScript (ASP), and there the use of hungarian notation to keep track of the data type was crucial to keep sane. In a type safe language like C# I find no use at all to use hungarian notation that way.
Descriptive variable names does very much for the readability of the code, but it doesn't always have to describe every aspect of a variable. A name like persons indicates that it's a collection of person objects, whether it's a List<Person> or IEnumerable<Person> is usually not that important to understand what the code is doing, and the compiler tells you immediately if you are trying to do something completely wrong.
I frequently use single letter variable names, where the scope of the variable is very limited. For example the i index variable in a small loop, or in lambda expressions.

I usually just use the first letter of the type I am querying so in your case since it is of type Person I would do:
var lobjPerson = icolPerson.Where(p => p.person_id = 100);
Another example if it was say type Car I would do:
var lobjCar = icolCar.Where(c => c.car_id = 100);
I do this for quickness, however, I don't think there is anything wrong using full names in the query i.e. car => car.car_id

I think the naming standard you use (hungarian notation) was the need of the hour when high profile tools like Visual Studio were not present. Today you have technologies like Intellisense which tell you most of the things you are doing wrong with a variable as you type, like assigning an int to a string. So, as of today, it is not an exclusive part of good coding conventions or standards as it used to be. Most people follow pascal casing or camel casing these days. If anybody uses it, I think it is more because of a personal preference.
Having said that, according to the way you code and name your variables, you should follow the second approach instead. Again, this boils down to you personal preference.
PS: This comment does not mean that I am looking down on your usage of hungarian notation. :)

There is no the right approach at all. Naming standard like any other coding standards (like spaces and new lines after/before expressions) is a convention inside group. So just write your code as you like and as you should do it at your company. But do it consistently!
i prefer use underscore prefix in names of private fields. but your style(visibility + type + name) looks for me like an old WInAPI code citation(Hungarian notation) =).

As long as your lambda expression is only a single statement, I'd keep the variable name as short as possible. The overall purpose of naming is to make the code readable, and in these cases, the method name and general context should be enough.
You don't have to use x all the time. For a Person object, I'd tend to use p instead.
If you need to write a more complex expression involving braces and semicolons, I'd say that all the normal naming rules apply.

If your collection is called something-persons, then it's obvious that when you're doing persons.Where(p => p.Whatever) that p is a person. If your lambdas are so complicated it's not immediately obvious whatever the parameters you're using are, then you shouldn't use lambdas at all, or make them drastically simpler.
As for your other conventions, what is the point of using Hungarian for calling out that personName is a string? It's obvious that a name would be a string. And why do you need a prefix to remind you that a variable is a local? Its declaration is necessarily in the same method, so that can't be so long ago that you've already forgotten it. If you have, then your methods are too long or complex.
Just have your identifiers appropriately describe what the meaning of the variable is, then the rest will be obvious.

LINQ is not just for the developers who used to write queries for databases but also for the Functional Programmers.
So do not worry if you are not very comfortable with SQL kind of queries, we have a very nice option called “Lambda Expression”.
Here I am going to demonstrate the scenario for both the options with a small example. This example has an array of integers and I am only retrieving the even numbers using the power of LINQ.
Here we go
using System;
using System.Collections.Generic;
using System.Text;
using System.Query;
using System.Xml.XLinq;
using System.Data.DLinq;
namespace LINQConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
int[] arrInt = {1,2,3,4,5,6,7,8,9,10};
#region Place to change
//Language Integrated Query
var aa = from s in arrInt
where s % 2 == 0
select s;
#endregion
foreach (var item in aa)
{
Console.WriteLine("{0}", item);
}
Console.ReadKey();
}
}
}
If you do not want to use the different approach of query for Language then you are free to use Lambda Expression.
So just replace the #region area with the following code block results will be identical.
#region Place to change
//Lambda Expression
var aa = arrInt.Where(s => s % 2 == 0);
#endregion

Probably BAD coding style ... please comment

I am checking whether the new name already exists or not.
Code 1
if(cmbxExistingGroups.Properties.Items.Cast<string>().ToList().Exists(txt => txt==txtNewGroup.Text.Trim())) {
MessageBox.Show("already exists.", "Add new group");
}
Otherwise I could have written:
Code 2
foreach(var str in cmbxExistingGroups.Properties.Items)
{
if(str==txtNewGroup.Text) {
MessageBox.Show("already exists.", "Add new group");
break;
}
}
I wrote these two and thought I was exploiting language features in code 1.
...and yes: both of them work for me ... I am wondering about the performance :-/

I appreciate the cleverness of the first sample (assuming it works), but the second one is a lot easier for the next person who has to maintain the code to figure out.

Sometimes just a little indentation makes a world of difference:
if (cmbxExistingGroups.Properties.Items
.Cast<string>().ToList()
.Exists
(
txt => txt==txtNewGroup.Text.Trim()
))
{
MessageBox.Show("already exists.", "Add new group");
}
Since your using a List<String>, you might as well just drop the Exists predicate and use Contains...use Exists when comparing complex objects by unique values.

I've quoted it before but I'll do it again:
Write your code as if the person maintaining it is a homicidal maniac
who knows where you live.

would
cmbxExistingGroups.Properties.Items.Contains(text)
not work instead?

There are a few things wrong here:
1) The two bits of code don't do the same thing - the first looks for the trimmed version of txtNewGroup, the second just looks for txtNewGroup
2) There's no point in calling ToList() - that just make things less efficient
3) Using Exists with a predicate is overkill - Contains is all you need here
So, the first could easily come down to:
if (cmbxExistingGroups.Properties.Items.Cast<string>.Contains(txtNewGroup.Text))
{
// Stuff
}
I'd probably create a variable to give "cmbxExistingGroups.Properties.Items.Cast" a meaningful, simple name - but then I'd say it's easier to understand than the explicit foreach loop.

The first code bit is fine, except instead of calling Enumerable.ToList() and List<T>.Exists(), you should just call Enumerable.Any() -- it does a lazy evaluation, so it never allocates the memory for the List<T>, and it will stop enumerating cmbxExistingGroups.Properties.Items and casting them to string. Also, calling the trim from inside that predicate means it happens for every item it looks at. It would be best to move it out to the outer scope:
string match = txtNewGroup.Text.Trim();
if(cmbxExistingGroups.Properties.Items.Cast<string>().Any(txt => txt==match)) {
MessageBox.Show("already exists.", "Add new group");
}

Verbosity in coding is not always bad at all. I prefer the second code snippet a lot over the first one. Just imagine you would have to maintain (or even change the functionality of) the first example... um.

Well, if it were me, it would be a variation on 2. Always prefer readability over one-liners. Additionally, always extract a method to make it clearer.
your calling code becomes
if( cmbxExistingGroups.ContainsKey(txtNewGroup.Text) )
{
MessageBox.Show("Already Exists");
}
If you define an extension method for Combo Boxes
public static class ComboBoxExtensions
{
public static bool ContainsKey(this ComboBox comboBox, string key)
{
foreach (string existing in comboBox.Items)
{
if (string.Equals(key, existing))
{
return true;
}
}
return false;
}
}

First, they're not equivalent. The 1st sample does a check against txtNewSGroup.Text.Trim(), the 2nd omits trim. Also, the 1st casts everything to a string, whereas the second uses whatever comes out of the iterator. I assume that's an object, or you wouldn't have needed the cast in the 1st place.
So, to be fair, the closest equivalent to the 2nd sample in the LINQ style would be:
if (mbxExistingGroups.Properties.Items.Cast<string>().Contains(txtNewGroup.Text)) {
...
}
which isn't too bad. But, since you seem to be working with old style IEnumerable instead of new fangled IEnumerable<T>, why don't we give you another extension method:
public static Contains<T>(this IEnumerable e, T value) {
return e.Cast<T>().Contains(value);
}
And now we have:
if (mbxExistingGroups.Properties.Items.Contains(txtNewGroup.Text)) {
...
}
which is pretty readable IMO.

I would agree, go with the second one because it will be easier to maintain for anybody else who works on it and when you come back to that in 6-12 months, it will be easier to remember what you were doing.

both of them works for me ..i am wonodering about the performance
I see no one read the question :) I think I see what you're doing (I don't use this language). The first tries to generate the list and test it in one shot. The second does an explicit iteration and can "short circuit" itself (exit early) if it finds the duplicate early on. The question is whether the "all at once" is more efficient due to the language implementation.

The second of the two would perform better, and it would perform the same as other people's samples that use Contains.
The reason why the first one uses an extra trim. plus a conversion to list. so it iterates once for conversion, then starts again to check using exists, and does a trim each time, but will exit iteration if found. The second starts iterating once, has no trim, and will exit if found.
So in short the answer to your question is the second performs much better.

From a performance point of view:
txtNewGroup.Text.Trim()
Do your control interaction/string manipulation outside of the loop - one time, instead of n times.

I imagine that on the WTF's per minute scale, the first would be off the chart. Count the dots, any more than two per line is a potential problem

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.