LINQ Where(): are these two LINQ statements the same? [duplicate]

This question already has an answer here:
And difference between FirstOrDefault(func) & Where(func).FirstOrDefault()?
(1 answer)
Closed 9 years ago.
Can someone explain whether these two LINQ statements are the same, or whether they differ in terms of execution? I am guessing the result of their execution is the same, but please correct me if I am wrong.
var k = db.MySet.Where(a => a.Id == id).SingleOrDefault().Username;
var mo = db.MySet.SingleOrDefault(a => a.Id == id).Username;

Yes, both instructions are functionally equivalent and return the same result. The second is just a shortcut.
However, I wouldn't recommend writing it like this, because SingleOrDefault will return null if there is no item with the specified Id. This will result in a NullReferenceException when you access Username. If you don't expect it to return null, use Single, not SingleOrDefault, because it will give you a more useful error message if your expectation is not met. If you're not sure that a user with that Id exists, use SingleOrDefault, but check the result before accessing its members.
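The advice above can be sketched against an in-memory list. This is only an illustration: the `User` class and the `UserLookup` helper are hypothetical stand-ins for the entity behind `db.MySet`, and the sketch runs on LINQ-to-Objects rather than a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity behind db.MySet.
public class User
{
    public int Id { get; set; }
    public string Username { get; set; }
}

public static class UserLookup
{
    // Use when a missing user is a normal case: returns null instead of throwing.
    public static string FindUsername(IEnumerable<User> users, int id)
    {
        User user = users.SingleOrDefault(u => u.Id == id);
        return user == null ? null : user.Username;
    }

    // Use when the user must exist: Single throws InvalidOperationException
    // if no element (or more than one element) matches.
    public static string GetUsername(IEnumerable<User> users, int id)
    {
        return users.Single(u => u.Id == id).Username;
    }

    public static void Main()
    {
        var users = new List<User>
        {
            new User { Id = 1, Username = "ann" },
            new User { Id = 2, Username = "bob" }
        };

        Console.WriteLine(FindUsername(users, 2));          // bob
        Console.WriteLine(FindUsername(users, 99) == null); // True

        try { GetUsername(users, 99); }
        catch (InvalidOperationException) { Console.WriteLine("no such user"); }
    }
}
```

Which of the two helpers fits depends on whether a missing row is an expected situation or a bug in the caller.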

Yes, these two LINQ statements are the same.
But I suggest you write code like this:
var mo = db.MySet.SingleOrDefault(a => a.Id == id);
if (mo != null)
{
    string username = mo.Username;
}

var k = db.MySet.Where(a => a.Id == id).SingleOrDefault().Username;
var mo = db.MySet.SingleOrDefault(a => a.Id == id).Username;
You asked if they are equivalent...
Yes, they will return the same result, both in LINQ-to-Objects and in LINQ-to-SQL/Entity-Framework
No, they aren't equal in performance in LINQ-to-Objects. Someone benchmarked them and found that the first one is a little faster (because .Where() has special optimizations based on the type of db.MySet). Reference: https://stackoverflow.com/a/8664387/613130

They're different in the terms of actual code they will execute, but I can't see a situation in which they will give different results. In fact, if you've got Resharper installed, it will recommend you change the former into the latter.
However, I'd in general question why you ever want to do SingleOrDefault() without immediately following it with a null check.

Instead of checking for null I always check for default(T), as the name of the LINQ function implies. In my opinion it makes the code a bit more maintainable in case the type is later changed from a class into a struct or vice versa.
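One possible way to write such a check generically is shown below. `IsDefault` is a hypothetical helper name, not a framework method. Note the caveat for value types: `default(int)` is 0, so the check cannot tell "not found" apart from a genuine element equal to 0.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DefaultCheck
{
    // True when value equals default(T): null for reference types,
    // zero-initialized for value types.
    public static bool IsDefault<T>(T value)
    {
        return EqualityComparer<T>.Default.Equals(value, default(T));
    }

    public static void Main()
    {
        string[] names = { "ann", "bob" };
        int[] numbers = { 1, 2, 3 };

        Console.WriteLine(IsDefault(names.SingleOrDefault(n => n == "zoe"))); // True
        Console.WriteLine(IsDefault(numbers.SingleOrDefault(n => n == 5)));   // True
        Console.WriteLine(IsDefault(numbers.SingleOrDefault(n => n == 2)));   // False
    }
}
```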

They will both return the same result (or both will throw a NullReferenceException). However, the second one may be more efficient.
The first version will need to enumerate all values that meet the where condition, and then it will check to see if this returned only 1 value. Thus this version might need to enumerate over 100s of values.
The second version will check for only one value that meets the condition at the start. As soon as that version finds 2 values, it will throw an exception, so it does not have the overhead of enumerating (possibly) 100s of values that will never be used.

Related

which of these linq queries are more performant? [duplicate]

What is the difference between these two Linq queries:
var result = ResultLists().Where( c=> c.code == "abc").FirstOrDefault();
// vs.
var result = ResultLists().FirstOrDefault( c => c.code == "abc");
Are the semantics exactly the same?
If they are semantically equal, does the predicate form of FirstOrDefault offer any theoretical or practical performance benefit over Where() plus plain FirstOrDefault()?

Either is fine.
They both run lazily - if the source list has a million items, but the tenth item matches then both will only iterate 10 items from the source.
Performance should be almost identical and any difference would be totally insignificant.
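The laziness claim can be checked in LINQ-to-Objects by counting how often the predicate runs. This is a small sketch; the `LazyCount` class and its method names are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyCount
{
    // Counts predicate calls for Where(...).FirstOrDefault().
    public static int CallsWithWhere(IEnumerable<int> source, int target)
    {
        int calls = 0;
        source.Where(x => { calls++; return x == target; }).FirstOrDefault();
        return calls;
    }

    // Counts predicate calls for FirstOrDefault(predicate).
    public static int CallsWithPredicateOverload(IEnumerable<int> source, int target)
    {
        int calls = 0;
        source.FirstOrDefault(x => { calls++; return x == target; });
        return calls;
    }

    public static void Main()
    {
        IEnumerable<int> source = Enumerable.Range(1, 1000000);

        // The tenth item matches, so both forms stop after ten predicate calls.
        Console.WriteLine(CallsWithWhere(source, 10));             // 10
        Console.WriteLine(CallsWithPredicateOverload(source, 10)); // 10
    }
}
```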

The second one. All other things being equal, the iterator in the second case can stop as soon as it finds a match, where the first one must find all that match, and then pick the first of those.

Nice discussion, all the above answers are correct.
I didn't run any performance tests, but in my experience FirstOrDefault(predicate) is sometimes faster than Where(predicate).FirstOrDefault().
I recently fixed a memory overflow/performance issue (in a neural-network algorithm), and the fix was changing Where(x => ...).FirstOrDefault() to simply FirstOrDefault(x => ...).
Until then I had been ignoring the editor's recommendation to make exactly that change.
So I believe the correct answer to the above question is:
The second option is the best approach in all cases

Where uses deferred execution: the evaluation of the expression is delayed until its value is actually required. This can improve performance by avoiding unnecessary work.
Where looks roughly like this, and returns a new IEnumerable:
foreach (var item in enumerable)
{
    if (condition)
    {
        yield return item;
    }
}
FirstOrDefault() returns a T and does not throw when there is no result; in that case it returns default(T), which is null for reference types.
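To make the deferred behaviour concrete, here is a runnable version of that loop as an extension method. `WhereSketch` is a hypothetical name for this illustration; the real `Enumerable.Where` has additional argument checks and type-specific optimizations.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Simplified filter: nothing runs until the result is enumerated.
    public static IEnumerable<T> WhereSketch<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        foreach (T item in source)
        {
            if (predicate(item))
            {
                yield return item;
            }
        }
    }

    public static void Main()
    {
        int evaluated = 0;
        int[] numbers = { 1, 2, 3, 4 };

        // Building the query does not call the predicate.
        IEnumerable<int> query = numbers.WhereSketch(n => { evaluated++; return n > 2; });
        Console.WriteLine(evaluated);              // 0

        // FirstOrDefault pulls items only until the first match (1, 2, 3).
        Console.WriteLine(query.FirstOrDefault()); // 3
        Console.WriteLine(evaluated);              // 3

        // With no result, FirstOrDefault returns default(T) instead of throwing.
        Console.WriteLine(new int[0].FirstOrDefault());            // 0
        Console.WriteLine(new string[0].FirstOrDefault() == null); // True
    }
}
```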

C# linq possible multiple enumeration best practices

I sometimes use LINQ constructs in my C# source. I use VS 2010 with ReSharper. Now I'm getting "Possible multiple enumeration of IEnumerable" warning from ReSharper.
I would like to refactor it according to the best practices. Here's briefly what it does:
IEnumerable<String> codesMatching = from c in codes where conditions select c;
String theCode = null;
if (codesMatching.Any())
{
theCode = codesMatching.First();
}
if ((theCode == null) || (codesMatching.Count() != 1))
{
throw new Exception("Matching code either not found or is not unique.");
}
// OK - do something with theCode.
A question:
Should I first store the result of the LINQ expression in a List?
(I'm pretty sure it won't return more than a couple of rows - say 10 at the most.)
Any hints appreciated.
Thanks
Pavel

Since you want to verify if your condition is unique, you can try this (and yes, you must store the result):
var matches = (from c in codes where conditions select c).Take(2).ToArray();
if (matches.Length != 1)
{
    throw new Exception("Matching code either not found or is not unique.");
}
var code = matches[0];

Yes, you need to store the result in a List or array and then use that. That way the query won't be enumerated several times.
In your case, if you need to be sure that exactly one item satisfies the condition, you can use Single: it throws an exception if more than one item satisfies the condition, and it also throws if there are no matching items at all.
And your code will be simpler:
string theCode = (from c in codes where conditions select c).Single();
But in that case you can't change the exception text, unless you wrap the call in your own try/catch block and rethrow with custom text or a custom exception.
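A minimal sketch of that wrap-and-rethrow idea follows. `GetUniqueCode` is a hypothetical helper, and rethrowing as `InvalidOperationException` with the original as inner exception is one choice among several.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class UniqueCode
{
    // Returns the single matching code; replaces Single's generic message
    // with a domain-specific one when zero or several codes match.
    public static string GetUniqueCode(IEnumerable<string> codes, Func<string, bool> condition)
    {
        try
        {
            return codes.Single(condition);
        }
        catch (InvalidOperationException ex)
        {
            throw new InvalidOperationException(
                "Matching code either not found or is not unique.", ex);
        }
    }

    public static void Main()
    {
        string[] codes = { "abc", "abd", "xyz" };

        Console.WriteLine(GetUniqueCode(codes, c => c.StartsWith("x"))); // xyz

        try { GetUniqueCode(codes, c => c.StartsWith("a")); }
        catch (InvalidOperationException ex) { Console.WriteLine(ex.Message); }
    }
}
```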

Finalizing enumerable with .ToList()/.ToArray() would get rid of the warning, but to understand if it is better than multiple enumerations or not would depend on codes and conditions implementations. .Any() and .First() are lazy primitives and won't execute past the first element and .Count() might not be hit at all, hence converting to a list might be more wasteful than getting a new enumerator.
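What the ReSharper warning is about can be made visible with a counting iterator. In this sketch, `MatchingCodes` is a hypothetical stand-in for the question's query; it increments a counter every time enumeration starts.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationCount
{
    public static int Enumerations;

    // Stand-in for a potentially expensive query; counts each time it is started.
    public static IEnumerable<string> MatchingCodes()
    {
        Enumerations++;
        yield return "abc";
    }

    public static void Main()
    {
        // Any(), First() and Count() each start a fresh enumeration.
        IEnumerable<string> lazy = MatchingCodes();
        bool any = lazy.Any();
        string first = lazy.First();
        int count = lazy.Count();
        Console.WriteLine(Enumerations); // 3

        // Materializing at most two items enumerates the source once.
        Enumerations = 0;
        string[] stored = MatchingCodes().Take(2).ToArray();
        Console.WriteLine(stored.Length == 1); // True
        Console.WriteLine(Enumerations);       // 1
    }
}
```

Whether three cheap partial enumerations cost more than one materialized array depends on the source, which is the trade-off the answer above describes.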

LINQ and ReSharper

Hi I have the following code:
if (!_jobs.Any(j => j.Id == emailJob.Id))
{
}
This code should find any elements which satisfy the condition. So I would assume that it should return after finding the first element, something like this:
if (_jobs.FirstOrDefault(j => j.Id == emailJob.Id) == null)
{
}
Resharper tries to simplify this LINQ expression to:
if (_jobs.All(j => j.Id != emailJob.Id))
{
}
This seems less efficient to me because it has to check that every single element satisfies the inverse condition.
Sorry if I'm just misunderstanding how LINQ works.
Joe

The two versions are mirror approaches and do exactly the same amount of work.
When you say "if none of the items satisfy this condition" (!Any), then all of the items have to be checked in order to get a definite answer.
ReSharper's suggestion is useful because it guides you towards using the method that more clearly shows what is going to happen: All jobs will have to be examined.

Both Any and All will stop looking immediately upon the condition failing.
If you are looking for more than just taking our word or anecdotal evidence, you can see from the source that this is what it is doing.
There is an extra inversion in the All method after the predicate is applied. This is a relative 0 performance impact, so code readability is the main concern.

If there is any job that does match the emailJob id, then the .Any() approach can abort early. By the same token, the .All() approach can stop working as soon as it finds a condition that's false, which will happen at the same job. The efficiency should be about the same.
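The claim that both forms stop at the same element can be checked by counting predicate calls. This is a sketch over plain integers standing in for job ids; the `AnyAllCount` helper names are invented for the illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class AnyAllCount
{
    // Predicate calls made by !Any(j => j == id).
    public static int CallsForNotAny(IEnumerable<int> jobIds, int id)
    {
        int calls = 0;
        jobIds.Any(j => { calls++; return j == id; });
        return calls;
    }

    // Predicate calls made by All(j => j != id).
    public static int CallsForAll(IEnumerable<int> jobIds, int id)
    {
        int calls = 0;
        jobIds.All(j => { calls++; return j != id; });
        return calls;
    }

    public static void Main()
    {
        int[] jobIds = { 1, 2, 3, 4, 5 };

        // A match at the second element stops both after two calls.
        Console.WriteLine(CallsForNotAny(jobIds, 2)); // 2
        Console.WriteLine(CallsForAll(jobIds, 2));    // 2

        // No match: both have to look at all five elements.
        Console.WriteLine(CallsForNotAny(jobIds, 99)); // 5
        Console.WriteLine(CallsForAll(jobIds, 99));    // 5
    }
}
```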

IEnumerable<T>.Single and casting

I have two classes, A and B. B inherits from A and has some additional properties.
I have an IEnumerable<A> that contains only B objects.
What I want to do is:
list.Single(b => b.PropertyThatOnlyExistOnB == "something")
I would have expected something like this to work:
list.Single((B) b => b.PropertyThatOnlyExistOnB == "something")
But it doesn't compile. For now I'm just doing:
B result = null;
foreach (var b in list)
{
    if (((B)b).PropertyThatOnlyExistOnB == "something")
    {
        result = (B)b;
    }
}
Is there a shorter way?
Thanks

Use the Enumerable.OfType<TResult> extension methods to filter/cast.
list.OfType<B>().Single(b => b.PropertyThatOnlyExistOnB == "something")

Although I like @VirtualBlackFox's answer best, for completeness' sake, here is how to get your idea to work:
list.Single(b => ((B)b).PropertyThatOnlyExistOnB == "something");
You weren't that far off track, except that you got some of the syntax confused. The b => EXPRESSION syntax denotes a lambda expression. You can't start altering the stuff before the =>, unless you want to add (or remove) arguments:
* `x => LAMBDA_WITH_ONE_PARAMETER`
* `(x) => LAMBDA_WITH_ONE_PARAMETER`
* `() => LAMBDA_WITH_NO_PARAMETERS`
* `(x, y, z) => LAMBDA_WITH_THREE_PARAMETERS`

I have IEnumerable<A> that contains only B objects.
I would question this statement about your variable. You've specified that it is an IEnumerable<A>, but it contains only instances of B. What is the purpose of this? If you are explicitly only requiring instances of B in all circumstances, it would be better for this to be an IEnumerable<B>, as it safeguards problems that could be caught at compile time.
Consider the following, I would imagine that you may have some code similar to:
var setOfA = // Get a set of A.
DoSomethingWithA(setOfA);
var instanceOfB = GetInstanceOfB(setOfA);
In this case, I can understand that an IEnumerable<A> is perfectly valid, except when you want to perform the latter operation, GetInstanceOfB. Let's imagine, the definition is:
B GetInstanceOfB(IEnumerable<A> setOfA)
{
return // The answer to your question.
}
Now, the initial problem I hope you see, is that you're putting all your cards on the notion that your list (setOfA in my example), is always only going to contain instances of B. While you may guarantee that from your developer point of view, the compiler can make no such assumption, it can only guarantee that setOfA (list) is an IEnumerable<A>, and therein lies the potential issue.
Looking at the answers provided (all of which are perfectly valid [@VirtualBlackFox being the safest answer] given your notion):
I have IEnumerable<A> that contains only B objects.
What if, in some future change, setOfA, also contains an instance of C (a potential future subclass of A). Given this answer:
list.Single(b => ((B)b).PropertyThatOnlyExistOnB == "something");
What if setOfA is actually: [C B B]. You can see that the explicit cast (B)b will cause an InvalidCastException to be thrown. Because of the nature of the Single operation, it evaluates the predicate (PropertyThatOnlyExistOnB == "something") on element after element until the enumeration completes or an exception is thrown. In this instance, an exception could be thrown that is unexpected and likely unhandled. This answer is similar to:
list.Cast<B>().Single(b => b.PropertyThatOnlyExistOnB == "something");
Given this answer:
list.Single<A>(b => (b as B).PropertyThatOnlyExistOnB == "something")
In the same situation, the exception would arise as a thrown instance of NullReferenceException, because the instance of C cannot be safely type cast to B.
Now, don't get me wrong, I am not picking holes with those answers, as I said they are perfectly valid given the remit of your question. But in circumstances where your code changes, those perfectly valid answers become potential future issues.
Given this answer:
list.OfType<B>().Single(b => b.PropertyThatOnlyExistOnB == "something");
This allows you to safely type cast to a potential subset of A that are in fact B, and the compiler can guarantee that your predicate is only being used on an IEnumerable<B>.
But this would lead me to discovering that the juncture in your code is trying to handle your IEnumerable<A> but perform an operation where you really want your IEnumerable<B>. In which case, shouldn't you refactor this code to possibly have an explicit method:
B GetMatchingInstanceOfB(IEnumerable<B> setOfB)
{
if (setOfB == null) throw new ArgumentNullException("setOfB");
return setOfB.Single(b => b.PropertyThatOnlyExistOnB == "something");
}
The change in the design of the method ensures that it will only explicitly accept a valid set of B, and you don't have to worry about your cast within that method. The method is responsible only for matching a single item of B.
This of course means you need to push your cast out to a different level, but that still is much more explicit:
var b = GetMatchingInstanceOfB(setOfA.OfType<B>());
I'm also assuming that you have sufficient error handling in place in circumstances where the predicate will fail where all instances are B, e.g., more than 1 item satisfies PropertyThatOnlyExistOnB == "something".
This might have been a pointless rant about reviewing your code, but I think it is worth considering unexpected situations that could arise, and how potentially tweaking your variables can save you a potential headache in the future.
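The [C B B] scenario discussed above can be reproduced in a few lines. The classes below are minimal hypothetical versions of A, B and C, and `FindWithOfType` / `FindWithCast` are names made up for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class A { }
public class B : A { public string PropertyThatOnlyExistOnB { get; set; } }
public class C : A { }

public static class CastingDemo
{
    // OfType<B>() silently skips elements that are not B.
    public static B FindWithOfType(IEnumerable<A> list, string value)
    {
        return list.OfType<B>().Single(b => b.PropertyThatOnlyExistOnB == value);
    }

    // Cast<B>() throws InvalidCastException on the first element that is not B.
    public static B FindWithCast(IEnumerable<A> list, string value)
    {
        return list.Cast<B>().Single(b => b.PropertyThatOnlyExistOnB == value);
    }

    public static void Main()
    {
        var setOfA = new List<A>
        {
            new C(),
            new B { PropertyThatOnlyExistOnB = "other" },
            new B { PropertyThatOnlyExistOnB = "something" }
        };

        Console.WriteLine(FindWithOfType(setOfA, "something").PropertyThatOnlyExistOnB); // something

        try { FindWithCast(setOfA, "something"); }
        catch (InvalidCastException) { Console.WriteLine("Cast<B>() failed on the C instance"); }
    }
}
```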

This should work fine:
list.Single<A>(b => (b as B).PropertyThatOnlyExistOnB == "something")
If you don't want to risk exceptions being thrown, you can do this:
list.Single<A>(b => ((b is B)&&((b as B).PropertyThatOnlyExistOnB == "something")))

The || (or) Operator in Linq with C#

I'm using linq to filter a selection of MessageItems. The method I've written accepts a bunch of parameters that might be null. If they are null, the criteria for the file should be ignored. If it is not null, use it to filter the results.
It's my understanding that when doing an || operation in C#, if the first expression is true, the second expression should not be evaluated.
e.g.
if(ExpressionOne() || ExpressionTwo())
{
// only ExpressionOne was evaluated because it was true
}
now, in linq, I'm trying this:
var messages = (from msg in dc.MessageItems
where String.IsNullOrEmpty(fromname) || (!String.IsNullOrEmpty(fromname) && msg.FromName.ToLower().Contains(fromname.ToLower()))
select msg);
I would have thought this would be sound, because String.IsNullOrEmpty(fromname) would equal true and the second part of the || wouldn't get run.
However it does get run, and the second part
msg.FromName.ToLower().Contains(fromname.ToLower())
throws a null reference exception (because fromname is null)!! - I get a classic "Object reference not set to an instance of an object" exception.
Any help?

Have a read of this documentation which explains how linq and c# can experience a disconnect.
Since Linq expressions are expected to be reduced to something other than plain methods you may find that this code breaks if later it is used in some non Linq to Objects context.
That said
String.IsNullOrEmpty(fromname) ||
( !String.IsNullOrEmpty(fromname) &&
msg.FromName.ToLower().Contains(fromname.ToLower())
)
Is badly formed since it should really be
String.IsNullOrEmpty(fromname) ||
msg.FromName.ToLower().Contains(fromname.ToLower())
which makes it nice and clear that you are relying on msg and msg.FromName to both be non null as well.
To make your life easier in c# you could add the following string extension method
public static class ExtensionMethods
{
public static bool Contains(
this string self, string value, StringComparison comparison)
{
return self.IndexOf(value, comparison) >= 0;
}
public static bool ContainsOrNull(
this string self, string value, StringComparison comparison)
{
if (value == null)
return false;
return self.IndexOf(value, comparison) >= 0;
}
}
Then use:
var messages = (from msg in dc.MessageItems
where msg.FromName.ContainsOrNull(
fromname, StringComparison.InvariantCultureIgnoreCase)
select msg);
However this is not the problem. The problem is that the Linq to SQL aspects of the system are trying to use the fromname value to construct the query which is sent to the server.
Since fromname is a variable the translation mechanism goes off and does what is asked of it (producing a lower case representation of fromname even if it is null, which triggers the exception).
In this case you can do what you have already discovered: keep the query as is, but make sure you always produce a non-null value to use in place of fromname, with the desired behaviour, even when fromname itself is null.
Perhaps better would be:
IEnumerable<MessageItem> results;
if (string.IsNullOrEmpty(fromname))
{
results = from msg in dc.MessageItems
select msg;
}
else
{
results = from msg in dc.MessageItems
where msg.FromName.ToLower().Contains(fromname.ToLower())
select msg;
}
This is not so great if the query contains other constraints and thus involves more duplication, but for a simple query it should actually result in more readable and maintainable code. It is a pain if you are relying on anonymous types, but hopefully that is not an issue for you.
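When there are several optional criteria, one way to avoid the duplication is to build the query step by step and add a Where only for the criteria that were supplied. The sketch below assumes a simplified `MessageItem` class and a hypothetical `MessageFilter.Filter` helper. It runs against an in-memory list via AsQueryable(), so it shows the composition pattern only, not how a particular LINQ provider translates it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the question's MessageItem entity.
public class MessageItem
{
    public string FromName { get; set; }
    public string FromEmail { get; set; }
}

public static class MessageFilter
{
    // Adds a Where clause only for criteria that are non-empty,
    // so null values never reach the query itself.
    public static IQueryable<MessageItem> Filter(
        IQueryable<MessageItem> source, string fromname, string fromemail)
    {
        IQueryable<MessageItem> query = source;

        if (!string.IsNullOrEmpty(fromname))
        {
            string loweredName = fromname.ToLower();
            query = query.Where(m => m.FromName.ToLower().Contains(loweredName));
        }

        if (!string.IsNullOrEmpty(fromemail))
        {
            string loweredEmail = fromemail.ToLower();
            query = query.Where(m => m.FromEmail.ToLower().Contains(loweredEmail));
        }

        return query;
    }

    public static void Main()
    {
        IQueryable<MessageItem> items = new List<MessageItem>
        {
            new MessageItem { FromName = "Alice", FromEmail = "alice@example.com" },
            new MessageItem { FromName = "Bob", FromEmail = "bob@example.com" }
        }.AsQueryable();

        Console.WriteLine(Filter(items, null, null).Count());      // 2
        Console.WriteLine(Filter(items, "AL", null).Count());      // 1
        Console.WriteLine(Filter(items, "al", "bob@").Count());    // 0
    }
}
```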

Okay. I found A solution.
I changed the offending line to:
where (String.IsNullOrEmpty(fromemail) || (msg.FromEmail.ToLower().Contains((fromemail ?? String.Empty).ToLower())))
It works, but it feels like a hack. I'm sure if the first expression is true the second should not get evaluated.
Would be great if anyone could confirm or deny this for me...
Or if anyone has a better solution, please let me know!!!

If you are using LINQ to SQL, you cannot expect the same C# short-circuit behavior in SQL Server. See this question about short-circuit WHERE clauses (or lack thereof) in SQL Server.
Also, as I mentioned in a comment, I don't believe you are getting this exception in LINQ to SQL because:
Method String.IsNullOrEmpty(String) has no supported translation to SQL, so you can't use it in LINQ to SQL.
You wouldn't be getting the NullReferenceException. This is a managed exception, it would only happen client-side, not in SQL Server.
Are you sure this is not going through LINQ to Objects somewhere? Are you calling ToList() or ToArray() on your source or referencing it as a IEnumerable<T> before running this query?
Update: After reading your comments I tested this again and realized some things. I was wrong about you not using LINQ to SQL. You were not getting the "String.IsNullOrEmpty(String) has no supported translation to SQL" exception because IsNullOrEmpty() is being called on a local variable, not an SQL column, so it is running client-side, even though you are using LINQ to SQL (not LINQ to Objects). Since it is running client-side, you can get a NullReferenceException on that method call, because it is not translated to SQL, where you cannot get a NullReferenceException.
One way to make your solution seem less hacky is by resolving fromname's "null-ness" outside the query:
string lowerfromname = String.IsNullOrEmpty(fromname) ? fromname : fromname.ToLower();
var messages = from msg in dc.MessageItems
where String.IsNullOrEmpty(lowerfromname) || msg.Name.ToLower().Contains(lowerfromname)
select msg.Name;
Note that this will not always be translated to something like (using your comments as example):
SELECT ... FROM ... WHERE @theValue IS NULL OR @theValue = theValue
Its translation will be decided at runtime depending on whether fromname is null or not. If it is null, it will translate without a WHERE clause. If it is not null, it will translate with a simple "WHERE @theValue = theValue", without null check in T-SQL.
So in the end, the question of whether it will short-circuit in SQL or not is irrelevant in this case because the LINQ to SQL runtime will emit different T-SQL queries if fromname is null or not. In a sense, it is short-circuited client-side before querying the database.

Are you sure it's 'fromname' that's null and not 'msg.FromName' that's null?

Like Brian said, I would check whether msg.FromName is null before doing the ToLower().Contains(fromname.ToLower())

You are correct that the second conditional shouldn't get evaluated, as you are using the short-circuit operators (see What is the best practice concerning C# short-circuit evaluation?). However, I'd suspect that LINQ might try to optimise your query before executing it and in doing so might change the execution order.
Wrapping the whole thing in brackets also, for me, makes for a clearer statement, as the whole 'where' condition is contained within the parentheses.
