Which is the better method to use LINQ? - c#

I have these two lines that do exactly the same thing but are written differently. Which is the better practice, and why?
firstRecordDate = (DateTime)(from g in context.Datas
select g.Time).Min();
firstRecordDate = (DateTime)context.Datas.Min(x => x.Time);

there is no semantic difference between method syntax and query
syntax. In addition, some queries, such as those that retrieve the
number of elements that match a specified condition, or that retrieve
the element that has the maximum value in a source sequence, can only
be expressed as method calls.
http://msdn.microsoft.com/en-us/library/bb397947.aspx
Also look here: .NET LINQ query syntax vs method chain
It comes down to what you are comfortable with and what you find is more readable.

The second one uses lambda expressions. I like it because it is compact and easier to read (although some will find the former easier to read).
Also, the first is better suited if you have a SQL background.

I'd say go with what is most readable or understandable for your development team. Come back in a year or so and see if you can remember that LINQ... well, this particular LINQ is obviously simple so that's moot :-)
Best practice is also quite opinionated, you aren't going to get one answer here. In this case, I'd go for the second item because it's concise and I can personally read and understand it faster than the first, though only slightly faster.

I personally much prefer using lambda expressions. As far as I know there is no real difference; as you say, you can do exactly the same thing both ways. We agreed to all use lambdas, as they are easy to read and follow, and easy to pick up for people who don't like SQL.

There is absolutely no difference in terms of the results, assuming you do actually write equivalent statements in each format.
Go for the most readable one for any given query. Complex queries with joins and many where clauses etc are often easier to write/read in the linq query syntax, but really simple ones like context.Employees.SingleOrDefault(e => e.Id == empId) are easier using the method-chaining syntax. There's no general "one is better" rule, and two people may have a difference of opinion for any given example.

There is no semantic difference between the two statements. Which you choose is purely a matter of style preference.

Do you need the explicit cast in either of them? Isn't Time already a DateTime?
Personally I prefer the second approach as I find the extension method syntax more familiar than the LINQ syntax, but it is really just personal preference; they perform the same.
Written to mirror the first more exactly, the second one would be context.Datas.Select(x => x.Time).Min(). So the way you wrote it, with Min(x => x.Time), might be slightly more efficient, because you have only one operation instead of two.

The query comprehension syntax is actually compiled down to a series of calls to the extension methods, which means that the two syntaxes are semantically identical. Whichever style you prefer is the one you should use.
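As a rough illustration of the point made in the answers above, the sketch below runs the three equivalent forms against an in-memory list. The `Data` class and `SampleData` method are hypothetical stand-ins for the question's `context.Datas`; this is LINQ to Objects, so it does not show what SQL a database provider would generate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QuerySyntaxDemo
{
    // Hypothetical stand-in for the entity behind the question's context.Datas.
    public sealed class Data
    {
        public DateTime Time { get; set; }
    }

    public static List<Data> SampleData() => new List<Data>
    {
        new Data { Time = new DateTime(2020, 5, 1) },
        new Data { Time = new DateTime(2019, 1, 15) },
        new Data { Time = new DateTime(2021, 3, 9) },
    };

    // Query syntax: the compiler rewrites this to datas.Select(g => g.Time).
    public static DateTime MinWithQuerySyntax(IEnumerable<Data> datas) =>
        (from g in datas select g.Time).Min();

    // Method syntax, written to mirror the query form exactly.
    public static DateTime MinWithSelectThenMin(IEnumerable<Data> datas) =>
        datas.Select(g => g.Time).Min();

    // Method syntax using the Min overload that takes a selector.
    public static DateTime MinWithSelector(IEnumerable<Data> datas) =>
        datas.Min(x => x.Time);
}
```

All three methods return the same value for the same input, so the choice between them is about readability. No cast is needed here because `Time` is a non-nullable `DateTime` in this sketch.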

Related

What are we guaranteed regarding side-effects in LINQ predicates?

I just saw this bit of code that has a count++ side-effect in the .GroupBy predicate. (originally here).
object[,] data; // This contains all the data.
int count = 0;
List<string[]> dataList = data.Cast<string>()
.GroupBy(x => count++ / data.GetLength(1))
.Select(g => g.ToArray())
.ToList();
This terrifies me because I have no idea how many times the implementation will invoke the key selector function. And I also don't know if the function is guaranteed to be applied to each item in order. I realize that, in practice, the implementation may very well just call the function once per item in order, but I never assumed that as being guaranteed, so I'm paranoid about depending on that behaviour -- especially given what may happen on other platforms, other future implementations, or after translation or deferred execution by other LINQ providers.
As it pertains to a side-effect in the predicate, are we offered some kind of written guarantee, in terms of a LINQ specification or something, as to how many times the key selector function will be invoked, and in what order?
Please, before you mark this question as a duplicate, I am looking for a citation of documentation or specification that says one way or the other whether this is undefined behaviour or not.
For what it's worth, I would have written this kind of query the long way, by first performing a select query with a predicate that takes an index, then creating an anonymous object that includes the index and the original data, then grouping by that index, and finally selecting the original data out of the anonymous object. That seems more like a correct way of doing functional programming. And it also seems more like something that could be translated to a server-side query. The side-effect in the predicate just seems wrong to me - and against the principles of both LINQ and functional programming, so I would assume there would be no guarantee specified and that this may very well be undefined behaviour. Is it?
I realize this question may be difficult to answer if the documentation and LINQ specification actually says nothing regarding side effects in predicates. I want to know specifically whether:
Specs say it's permissible and how. (I doubt it)
Specs say it's undefined behaviour (I suspect this is true and am looking for a citation)
Specs say nothing. (Sloppy spec, if you ask me, but it would be nice to know if others have searched for language regarding side-effects and also come up empty. Just because I can't find it doesn't mean it doesn't exist.)
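The "long way" described in the question, which avoids the mutable counter, could be sketched roughly as follows. `RowSplitter` and `ToRows` are hypothetical names, and the sketch assumes every element of the array is a string (or null), as the original `Cast<string>()` does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RowSplitter
{
    // Splits a rectangular array into one string[] per row without
    // mutating any captured variable inside the key selector.
    public static List<string[]> ToRows(object[,] data)
    {
        int columns = data.GetLength(1);

        return data.Cast<string>()
            // The Select overload with an index supplies each element's position.
            .Select((value, index) => new { value, index })
            // Integer division maps positions 0..columns-1 to row 0, and so on.
            .GroupBy(x => x.index / columns, x => x.value)
            .Select(g => g.ToArray())
            .ToList();
    }
}
```

Because the key depends only on the element's own index, the result does not depend on how many times, or in what order, the key selector happens to be invoked.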
According to official C# Language Specification, on page 203, we can read (emphasis mine):
12.17.3.1 The C# language does not specify the execution semantics of query expressions. Rather, query expressions are
translated into invocations of methods that adhere to the
query-expression pattern (§12.17.4). Specifically, query expressions
are translated into invocations of methods named Where, Select,
SelectMany, Join, GroupJoin, OrderBy, OrderByDescending, ThenBy,
ThenByDescending, GroupBy, and Cast. These methods are expected to
have particular signatures and return types, as described in §12.17.4.
These methods may be instance methods of the object being queried or
extension methods that are external to the object. These methods
implement the actual execution of the query.
From looking at the source code of GroupBy in corefx on GitHub, it does seem like the key selector function is indeed called once per element, and it is called in the order that the previous IEnumerable provides them. I would in no way consider this a guarantee though.
In my view, any IEnumerables which cannot be enumerated multiple times safely are a big red flag that you may want to reconsider your design choices. One issue that could arise from this: if you view the contents of this IEnumerable in the Visual Studio debugger, it will probably break your code, since doing so would cause the count variable to go up.
The reason this code hasn't exploded up until now is probably because the IEnumerable is never stored anywhere, since .ToList is called right away. Therefore there is no risk of multiple enumerations (again, with the caveat about viewing it in the debugger and so on).
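To make the multiple-enumeration risk described above concrete, here is a small sketch that keeps the deferred query around and enumerates it twice. `SideEffectDemo` is a hypothetical name, and the keys it produces reflect how the current LINQ to Objects implementation behaves (one selector call per element, in order), which is exactly the behaviour the answer says is not guaranteed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SideEffectDemo
{
    // Returns the group keys seen on the first and second enumeration of the
    // same deferred query whose key selector mutates a captured counter.
    public static (List<int> First, List<int> Second) EnumerateTwice()
    {
        string[] items = { "a", "b", "c", "d" };
        int count = 0;

        // Deferred: nothing runs until the query is enumerated.
        IEnumerable<IGrouping<int, string>> query = items.GroupBy(x => count++ / 2);

        // Each ToList() re-runs GroupBy, and count is never reset in between.
        List<int> first = query.Select(g => g.Key).ToList();
        List<int> second = query.Select(g => g.Key).ToList();

        return (first, second);
    }
}
```

On the implementations I am aware of, the two lists differ because the counter keeps climbing across enumerations, which is why calling `.ToList()` immediately hides the problem in the original code.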

Single Select() statement or multiple for transformation with multiple steps?

In C#, I have a collection of objects that I want to transform to a different type. This conversion, which I would like to do with LINQ Select(), requires multiple operations in sequence. To perform these operations, is it better to chain together multiple Select() queries like
resultSet.Select(stepOneDelegate).Select(stepTwoDelegate).Select(stepThreeDelegate);
or instead to perform these three steps in a single call?
resultSet.Select(item => stepThree(stepTwo(stepOne(item))));
Note: The three steps themselves are not necessarily functions. They are meant to be a concise demonstration of the problem. If that has an effect on the answer please include that information.
Any performance difference would be negligible, but the definitive answer to that is simply "test it". The question would be more around readability, which one is easier to understand and grasp what is going on.
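As a starting point for such a test, the sketch below puts the two shapes side by side. The three steps are hypothetical placeholders (the question does not say what the real steps are), declared as `Func` delegates so they can be passed to `Select` directly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectChainDemo
{
    // Hypothetical steps standing in for the question's three operations.
    public static readonly Func<int, string> StepOne = item => item.ToString();
    public static readonly Func<string, string> StepTwo = text => text.PadLeft(3, '0');
    public static readonly Func<string, string> StepThree = text => "ID-" + text;

    // One Select per step: each stage is visible and easy to extend.
    public static List<string> Chained(IEnumerable<int> resultSet) =>
        resultSet.Select(StepOne).Select(StepTwo).Select(StepThree).ToList();

    // A single Select that composes the three steps in one lambda.
    public static List<string> Composed(IEnumerable<int> resultSet) =>
        resultSet.Select(item => StepThree(StepTwo(StepOne(item)))).ToList();
}
```

Both methods produce the same list for the same input; any timing comparison would need to be measured on realistic data rather than assumed.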
Cases where I have needed to Project on a Projection would include when working with EF Linq expressions where I ultimately need to do something that isn't supported by EF Linq so I need to materialize a projection (usually to an anonymous type) then finish the expression before selecting the final output. In these cases you would need to use the first example.
Personally I'd probably just stick to the first scenario as to me it is easy to understand what is going on, and it easily supports additions for other operations such as filtering with Where or using OrderBy etc. The second scenario only really comes up when I'm building a composite model.
.Select(x => new OuterModel
{
Id = x.Id,
InnerModel = x.Inner.Select(i => new InnerModel
{
// ...
})
}) // ...
In most cases, though, this can be handled through AutoMapper.
I'd be wary of any code that I felt needed chaining a lot of Select expressions as it would smell like trying to do too much in one expression chain. Making something easy to understand, even if it involves a few extra steps and might add a few milliseconds to the execution is far better than having the risk of needing to track down bugs that someone introduced because they misunderstood potentially complex looking code.

LINQ Design Curiosity: Skip/Take vs. SkipWhile/TakeWhile

Is there any particular reason to have separate methods Skip and SkipWhile, rather than simply having overloads of the same method?
What I mean is, instead of Skip(int), SkipWhile(Func<TSource,bool>), and SkipWhile(Func<TSource,int,bool>), why not have Skip(int), Skip(Func<TSource,bool>), and Skip(Func<TSource,int,bool>)? I'm sure there's some reason for it, as the whole LINQ system was designed by people with much more experience than me, but that reasoning is not apparent.
The only possibility that's come to mind has been issues with the parser for the SQL-like syntax, but that already distinguishes between things like Select(Func<TSource,TResult>) and Select(Func<TSource,int,TResult>), so I doubt that's why.
The same question applies to Take and TakeWhile, which are complementary to the above.
Edit: To clarify, I am aware of the functional differences between the variants, I'm merely asking about the design decision on the naming of the methods.
IMO, the only reason would be better readability. Skip sounds like “Skip N number of records”, while SkipWhile sounds like “Skip for as long as a condition holds”. These names are self-explanatory.
The "While" indicates that LINQ will only skip while the lambda expression evaluates to true, and will stop skipping as soon as it is no longer true. This is a very different thing from just skipping a fixed number of items.
The same reasoning holds true for Take, of course.
All is well in the interest of clarity!
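For readers less familiar with the behaviour the names are signalling, a minimal sketch on a small in-memory array follows. `SkipDemo` and its sample numbers are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkipDemo
{
    public static readonly int[] Numbers = { 1, 2, 10, 3, 2, 1 };

    // Skips a fixed count of elements regardless of their values.
    public static List<int> SkipThree() => Numbers.Skip(3).ToList();

    // Skips only the leading run of elements below 5; later small values stay.
    public static List<int> SkipWhileSmall() => Numbers.SkipWhile(n => n < 5).ToList();

    // Takes only the leading run of elements below 5, then stops.
    public static List<int> TakeWhileSmall() => Numbers.TakeWhile(n => n < 5).ToList();

    // Filters every element, which is not what SkipWhile does.
    public static List<int> WhereNotSmall() => Numbers.Where(n => n >= 5).ToList();
}
```

`SkipWhileSmall` keeps the trailing 3, 2, 1 because the predicate stops being consulted once it first returns false, whereas `WhereNotSmall` drops them.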

Help Need with LINQ Syntax

Can someone help me change the following to select the unique Model values from the Product table?
var query = from Product in ObjectContext.Products.Where(p => p.BrandId == BrandId & p.ProdDelOn == null)
orderby Product.Model
select Product;
I'm guessing that you still want to filter based on your existing Where() clause. I think this should take care of it for you (and will include the ordering as well):
var query = ObjectContext.Products
.Where(p => p.BrandId == BrandId && p.ProdDelOn == null)
.Select(p => p.Model)
.Distinct()
.OrderBy(m => m);
But, depending on how you read the post...it also could be taken as you're trying to get a single unique Model out of the results (which is a different query):
var model = ObjectContext.Products
.Where(p => p.BrandId == BrandId && p.ProdDelOn == null)
.Select(p => p.Model)
.First();
Change the & to && and add the following line:
query = query.Distinct();
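The distinct-model query from the answers above can be tried against an in-memory list, as in this rough sketch. The `Product` class here is a hypothetical stand-in with only the three properties the query touches; the real entity and its property types may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctModelDemo
{
    // Hypothetical in-memory stand-in for the Product entity.
    public sealed class Product
    {
        public int BrandId { get; set; }
        public string Model { get; set; }
        public DateTime? ProdDelOn { get; set; }
    }

    // Filters by brand and "not deleted", then returns each model name once, sorted.
    public static List<string> DistinctModels(IEnumerable<Product> products, int brandId) =>
        products
            .Where(p => p.BrandId == brandId && p.ProdDelOn == null)
            .Select(p => p.Model)
            .Distinct()
            .OrderBy(m => m)
            .ToList();
}
```

Projecting to `Model` before calling `Distinct()` matters here: applied to whole `Product` objects, `Distinct()` compares the objects themselves rather than their model names, so two products with the same model would both survive.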
I'm afraid I can't answer the question - but I want to comment on it nonetheless.
IMHO, this is an excellent example of what's wrong with the direction the .NET Framework has been going in the last few years. I cannot stand LINQ, and nor do I feel too warmly about extension methods, anonymous methods, lambda expressions, and so on.
Here's why: I have yet to see a situation where either of these things actually contribute anything to solving real-world programming problems. LINQ is certainly no replacement for SQL, so you (or at least the project) still need to master that. Writing the LINQ statements is not any simpler than writing the SQL, but it does add run-time processing to build an expression tree and "compile" it into an SQL statement. Now, if you could solve complex problems more easily with LINQ than with SQL directly, or if it meant you didn't need to also know SQL, and if you could trust LINQ would produce good-enough SQL all the time, it might still have been worth using. But NONE of these preconditions are met, so I'm left wondering what the benefit is supposed to be.
Of course, in good old-fashioned SQL the statement would be
SELECT DISTINCT [Model]
FROM [Product]
WHERE [BrandID] = @brandID AND [ProdDelOn] IS NULL
ORDER BY [Model]
In many cases the statements can be easily generated with dev tools and encapsulated by stored procedures. This would perform better, but I'll grant that for many things the performance difference between LINQ and the more straightforward stored procs would be totally irrelevant. (On the other hand, performance problems do have a tendency to sneak in, as we devs often work with totally unrealistic amounts of data and on environments that have little in common with those hosting our software in real production systems.) But the advantages of just not using LINQ are HUGE:
1) Fewer skills required (since you must use SQL anyway)
2) All data access can be performed in one way (since you need SQL anyway)
3) Some control over HOW to get data and not just what
4) Less chance of being rightfully accused of writing bloatware (more efficient)
Similar things can be said with respect to many of the new language features introduced since C# 2.0, though I do appreciate and use some of them. The "var" keyword with type inference is great for initializing locals - it's not much use getting the same type information two times on the same line. But let's not pretend this somehow helps one bit if you have a problem to solve. Same for anonymous types - nested private types served the same purpose with hardly any more code, and I've found NO use for this feature since trying it out when it was new and shiny. Extension methods ARE in fact just plain old utility methods, and I have yet to hear any good explanation of why one should use the SAME syntax for instance methods and static methods invoked on another class! This actually means that adding a method to a class and getting no build warnings or errors can break an application. (In case you doubt: If you had an extension method Bar() for your Foo type, Foo.Bar() invokes a completely different implementation which may or may not do something similar to what your extension method Bar() did the day you introduce an instance method with the same signature. It'll build and crash at runtime.)
Sorry to rant like this, and maybe there is a better place to post this than in response to a question. But I really think anyone starting out with LINQ is wasting their time - unless it's in preparation for an MS certification exam, which AFAIU is also something a bit removed from reality.
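The extension-method concern raised in this answer rests on a real language rule: when an instance method and an extension method are both applicable to a call, the compiler binds to the instance method. The sketch below shows the two situations side by side using two hypothetical types, since a single type cannot be shown both with and without the instance method in one compilation.

```csharp
using System;

// Hypothetical type that has no Bar() of its own.
public class FooWithoutInstanceMethod { }

// Hypothetical type that later gained an instance Bar() with the same signature.
public class FooWithInstanceMethod
{
    public string Bar() => "instance";
}

public static class FooExtensions
{
    public static string Bar(this FooWithoutInstanceMethod foo) => "extension";

    // Still compiles, but is never chosen for foo.Bar() calls.
    public static string Bar(this FooWithInstanceMethod foo) => "extension";
}

public static class ExtensionPrecedenceDemo
{
    // Binds to the extension method: there is no instance candidate.
    public static string CallWithoutInstance() => new FooWithoutInstanceMethod().Bar();

    // Binds to the instance method: it takes precedence over the extension.
    public static string CallWithInstance() => new FooWithInstanceMethod().Bar();
}
```

Whether the silent switch causes a failure depends entirely on how the two implementations differ; the sketch only demonstrates which one gets called.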

Methods: What is better List or object?

While I was programming I came up with this question,
What is better, having a method accept a single entity or a List of those entities?
For example I need a List of strings. I can either have:
a method accepting a List and returning a List of strings with the results.
List<string> results = methodwithlist(objects);
or
a method accepting an object and returning a string. Then use this function in a loop to fill a list.
List<string> results = new List<string>();
for (int i = 0; i < objects.Count; i++)
{
results.Add(methodwithsingleobject(objects[i]));
}
** This is just an example. I need to know which one is better, or more used and why.
Thanks!
Well, it's easy to build the first form when you've got the second - but using LINQ, you really don't need to write your own, once you've got the projection. For example, you could write:
List<string> results = objectList.Select(x => MethodWithSingleObject(x)).ToList();
Generally it's easier to write and test a method which only deals with a single value, unless it actually needs to know the rest of the values in the collection (e.g. to find aggregates).
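A minimal sketch of that arrangement, with the list-accepting form built on top of the single-value form, might look like this. `Describe` and `DescribeAll` are hypothetical names, and the string formatting is only a placeholder for whatever the real method does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingleItemDemo
{
    // Hypothetical single-value method: easy to call and test on its own.
    public static string Describe(object item) => "item:" + item;

    // The list-accepting form is a one-line wrapper over the single-value form.
    public static List<string> DescribeAll(IEnumerable<object> items) =>
        items.Select(item => Describe(item)).ToList();
}
```

Callers with one value use `Describe` directly, and callers with a collection use `DescribeAll` or their own `Select`, so nothing is lost by writing the single-value method first.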
I would choose the second because it's easier to use when you have a single string (i.e. it's more general purpose). Also, the responsibility of the method itself is clearer, because the method should not have anything to do with lists if its purpose is just to modify a string.
Also, you can simplify the call with Linq:
result = yourList.Select(p => methodwithsingleobject(p)).ToList();
This question comes up a lot when learning any language. The answer is somewhat moot, since the standard coding practice is to rely upon LINQ to optimize the code for you at runtime, though this presumes you're using a version of the language that supports it. If you do want to do some research on this, there are a few Stack Overflow articles that delve into it and also give external resources to review:
In .NET, which loop runs faster, 'for' or 'foreach'?
C#, For Loops, and speed test... Exact same loop faster second time around?
What I have learned, though, is not to rely too heavily on Count and to use Length on typed Collections as that can be a lot faster.
Hope this is helpful.
