Selecting the first item from a query efficiently - c#

I've been reading Microsoft's LINQ docs for a while in search of the correct way to do this. Microsoft's example is the following:
Customer custQuery =
(from custs in db.Customers
where custs.CustomerID == "BONAP"
select custs)
.First();
This obviously works, and it's the obvious way to do it (except for using FirstOrDefault() rather than First()). However, to me, it looks like this runs the query and after it's done it selects the first.
Is there a way to return the first result and not continue the query?

however to me, it looks like this runs the query and after it's done it selects the first
Nope. The query inside the parentheses returns an IQueryable object, which is basically the representation of a query that hasn't been run yet. Only when you call .First() does it actually process the IQueryable object and translate it into a database query, and without looking I guarantee you it only asks the database for the first item.
However, if you were to write .ToList().First() instead of just .First() (and you see beginners making this mistake in less obvious ways), it would indeed load everything into memory and then pull the first object from it.
But the code you've pasted is perfectly efficient.
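The difference between the two forms can be seen in a small in-memory sketch. It uses LINQ to Objects rather than a database provider, so it is only an analogue of what an IQueryable provider does. The Customers() iterator and the pulled counter are hypothetical names, introduced here to count how many items each form reads from the source.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Counts how many items have been pulled from the source so far.
int pulled = 0;

// Hypothetical in-memory stand-in for db.Customers; yields items lazily.
IEnumerable<string> Customers()
{
    foreach (var id in new[] { "ALFKI", "BONAP", "CHOPS", "DRACD" })
    {
        pulled++;
        yield return id;
    }
}

// Building the query pulls nothing from the source.
var query = Customers().Where(id => id == "BONAP");
int pulledAfterBuild = pulled;

// First() stops as soon as a match is found.
string first = query.First();
int pulledByFirst = pulled;

// ToList() drains the whole source before First() sees anything.
pulled = 0;
string viaList = Customers().Where(id => id == "BONAP").ToList().First();
int pulledByToList = pulled;

Console.WriteLine($"after build: {pulledAfterBuild}, First(): {pulledByFirst}, ToList().First(): {pulledByToList}");
```

Both forms return the same customer; only the amount of data read differs. With a real database provider the saving shows up in the generated SQL rather than in an iterator count.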

Related

MongoDB and returning collections efficiently

I am very new to Mongo (this is actually day 1) and using the C# driver that is available for it. One thing that I want to know (as I am not sure how to word it in Google) is how does mongo handle executing queries when I want to grab a part of the collection.
What I mean by this is that I know that with NHibernate and EF Core, the query is first built and only fires when you materialize it (say, going from an IQueryable to an IEnumerable with .ToList(), etc.).
Ex:
//Query is fired when I call .ToList, until that point it is just building it
context.GetLinqQuery<MyObject>().Where(x => x.a == "blah").ToList();
However, with Mongo's examples it appears to me that if I want to grab a filtered result I will first need to get the collection, and then filter it down.
Ex:
var collection = _database.GetCollection<MyObject>("MyObject");
//Empty filter for ease of typing for example purposes
var filter = Builders<MyObject>.Filter.Empty;
var results = collection.Find(filter).ToList();
Am I missing something here? I do not think I saw any overload of the GetCollection method that accepts a filter. Does this mean that it will first load the whole collection into memory, then filter it? Or will it still be building the query and only execute it once I call either .Find or .ToList on it?
I ask this because at work we have had situations where improper positioning of .ToList() resulted in seriously weak performance. Apologies if this is not the right place to ask.
References:
https://docs.mongodb.com/guides/server/read_queries/
The equivalent to your context.GetLinqQuery<MyObject>() would be to use AsQueryable:
collection.AsQueryable().Where(x => x.a == "blah").ToList();
The above query will be executed server side* and is equivalent to:
collection.Find(Builders<MyObject>.Filter.Eq(x => x.a, "blah")).ToEnumerable().ToList();
* The docs state that:
Only LINQ queries that can be translated to an equivalent MongoDB query are supported. If you write a LINQ query that can’t be translated you will get a runtime exception and the error message will indicate which part of the query wasn’t supported.

LINQ to SQL - Am I fetching or manipulating local data?

On some code I have right now we are executing our LINQ to SQL queries like so:
db.Customers.Where(c => c.Name.StartsWith("A"))
.OrderBy(c => c.Name).Select(c => c.Name.ToUpper());
But in a lot of examples I see, the LINQ to SQL code is written like:
var query =
from c in db.Customers
where c.Name.StartsWith ("A")
orderby c.Name
select c.Name.ToUpper();
I am worried that we are fetching the whole table in the current code, and afterwards manipulating it locally, which from my point of view is not efficient compared to having the SQL server doing it.
Are the two examples equivalent or is there a difference?
After finding out that what I was searching for was called "linq vs. method chaining", I found my answer here:
.NET LINQ query syntax vs method chain
The question is whether there's a difference between method chaining and linq query, as I have described in my question.
The answer is that there is none, you are free to use both methods. Comments mention there might be minor differences in the compilation time, but this is not my concern.
This might be superfluous information, but in both code blocks you are not going to get anything from the database. At this point, you will only have told LINQ what query it should run. To actually run the query you're going to need to add a call to .First, .FirstOrDefault, .Single, .SingleOrDefault, .ToList, or their async counterparts.
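As a rough illustration of the equivalence, here is a sketch that runs both forms against an in-memory array instead of a database. The names array is a hypothetical stand-in for db.Customers, so this shows only that the two syntaxes describe the same query, not what SQL a provider would generate.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for db.Customers (names only).
string[] names = { "Bob", "anna", "Alice", "Adam" };

// Method syntax.
var methodQuery = names.Where(n => n.StartsWith("A"))
                       .OrderBy(n => n)
                       .Select(n => n.ToUpper());

// Query syntax; the compiler rewrites it into the same method calls.
var querySyntax =
    from n in names
    where n.StartsWith("A")
    orderby n
    select n.ToUpper();

// Nothing has been evaluated yet; enumeration happens here.
bool same = methodQuery.SequenceEqual(querySyntax);
int matchCount = methodQuery.Count();

Console.WriteLine($"same results: {same}, matches: {matchCount}");
```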

Dynamically Generating a Linq/Lambda Where Clause

I've been searching here and Google, but I'm at a loss. I need to let users search a database for reports using a form. If a field on the form has a value, the app will get any reports with that field set to that value. If a field on a form is left blank, the app will ignore it. How can I do this? Ideally, I'd like to just write Where clauses as Strings and add together those that are not empty.
.Where("Id=1")
I've heard this is supposed to work, but I keep getting an error: "could not be resolved in the current scope of context Make sure all referenced variables are in scope...".
Another approach is to pull all the reports then filter it one where clause at a time. I'm hesitant to do this because 1. that's a huge chunk of data over the network and 2. that's a lot of processing on the user side. I'd like to take advantage of the server's processing capabilities. I've heard that it won't query until it's actually requested. So doing something like this
var qry = ctx.Reports
.Select(r => r);
does not actually run the query until I do:
qry.First()
But if I start doing:
qry = qry.Where(r => r.Id == 1).Select(r => r);
qry = qry.Where(r => r.reportDate == new DateTime(2010, 2, 2)).Select(r => r);
Would that run the query? Since I'm adding a where clause to it. I'd like a simple solution...in the worst case I'd use the Query Builder things...but I'd rather avoid that (seems complex).
Any advice? :)
LINQ delays record fetching until a record must be fetched.
That means stacking Where clauses only adds more AND conditions to the query; it still does not execute it.
The generated query is executed at the precise moment you try to get a record (First, Any, etc.), a list of records (ToList()), or enumerate them (foreach).
.Take(N) is not considered fetching records; it adds a SELECT TOP N / LIMIT N to the query.
No, this will not run the query. You can structure your query this way, and it is actually preferable if it helps readability. You are taking advantage of lazy evaluation in this case.
The query will only run when you enumerate its results (e.g. with foreach), force eager evaluation (e.g. with .ToList()), or evaluate it to a single result (e.g. with First() or Single()).
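A minimal sketch of the "ignore blank fields" search built this way might look like the following. It runs against an in-memory array exposed through AsQueryable() rather than a real context, and the reports data, idFilter, and dateFilter are hypothetical names standing in for ctx.Reports and the form values.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for ctx.Reports.
var reports = new[]
{
    new { Id = 1, ReportDate = new DateTime(2010, 2, 2) },
    new { Id = 2, ReportDate = new DateTime(2010, 2, 2) },
    new { Id = 3, ReportDate = new DateTime(2011, 5, 1) },
}.AsQueryable();

// Hypothetical form values; null means the field was left blank.
int? idFilter = null;
DateTime? dateFilter = new DateTime(2010, 2, 2);

// Start with the unfiltered query and add a Where only per filled-in field.
var qry = reports;
if (idFilter.HasValue)
{
    int id = idFilter.Value;
    qry = qry.Where(r => r.Id == id);
}
if (dateFilter.HasValue)
{
    DateTime date = dateFilter.Value;
    qry = qry.Where(r => r.ReportDate == date);
}

// The composed query is evaluated only here.
var matches = qry.ToList();
Console.WriteLine(matches.Count);
```

Because each Where is applied to the query object and not to fetched data, a database provider would receive a single query containing only the conditions for the filled-in fields.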
Try checking out this dynamic Linq dll that was released a few years back - it still works just fine and looks to be exactly what you are looking for.

LINQ to Entities: Query not working with certain parameter value

I have a very strange problem with a LINQ to Entities query with EF1.
I have a method with a simple query:
public DateTime GetLastSuccessfulRun(string job)
{
var entities = GetEntities();
var query = from jr in entities.JOBRUNS
where jr.JOB_NAME == job && jr.JOB_INFO == "SUCCESS"
orderby jr.JOB_END descending
select jr.JOB_END;
var result = query.ToList().FirstOrDefault();
return result.HasValue ? result.Value : default(DateTime);
}
The method GetEntities returns an instance of a class that is derived from System.Data.Objects.ObjectContext and has automatically been created by the EF designer when I imported the schema of the database.
The query worked just fine for the last 15 or 16 months. And it still runs fine on our test system. In the live system however, there is a strange problem: Depending on the value of the parameter job, it returns the correct results or an empty result set, although there is data it should return.
Anyone ever had a strange case like that? Any ideas what could be the problem?
Some more info:
The database we query against is an Oracle 10g database; we are using an enhanced version of the OracleEFProvider v0.2a.
The SQL statement that is returned by ToTraceString works just fine when executed directly via SQL Developer, even with the same parameter that is causing the problem in the LINQ query.
The following also returns the correct result:
entities.JOBRUNS.ToList().Where(x => x.JOB_NAME == job && x.JOB_INFO == "SUCCESS").Count();
The difference here is the call to ToList on the table before applying the where clause. This means two things:
1. The data is in the database and it is correct.
2. The problem seems to be the query including the where clause when executed by the EF Provider.
What really stuns me is that this is a live system and the problem occurred without any changes to the database or the program. One call to that method returned the correct result and the next call five minutes later returned the wrong result. And since then, it only returns the wrong results.
Any hints, suggestions, ideas etc. are welcome, no matter how far-fetched they seem! Please post them as answers so I can vote on them, if only for reading my lengthy question and bothering to think about this strange problem... ;-)
First of all, remove ObjectContext caching. ObjectContext internally uses the Unit of Work and Identity Map patterns, which can have a big impact on queries.

NHibernate - Equivalent of CountDistinct projection using LINQ

I'm in the midst of trying to replace the Criteria queries I'm using for a multi-field search page with LINQ queries using the new LINQ provider. However, I'm running into a problem getting record counts so that I can implement paging. I'm trying to achieve a result equivalent to that produced by a CountDistinct projection from the Criteria API using LINQ. Is there a way to do this?
The Distinct() method provided by LINQ doesn't seem to behave the way I would expect, and appending ".Distinct().Count()" to the end of a LINQ query grouped by the field I want a distinct count of (an integer ID column) seems to return a non-distinct count of those values.
I can provide the code I'm using if needed, but since there are so many fields, it's pretty long, so I didn't want to crowd the post if it wasn't needed.
Thanks!
I figured out a way to do this, though it may not be optimal in all situations. Just doing a .Distinct() on the LINQ query does, in fact, produce a "distinct" in the resulting SQL query when used without .Count(). If I cause the query to be enumerated by using .Distinct().ToList() and then use the .Count() method on the resulting in-memory collection, I get the result I want.
This is not exactly equivalent to what I was originally doing with the Criteria query, since the counting is actually being done in the application code, and the entire list of IDs must be sent from the DB to the application. In my case, though, given the small number of distinct IDs, I think it will work, and won't be too much of a performance bottleneck.
I do hope, however, that a true CountDistinct() LINQ operation will be implemented in the future.
You could try selecting the column you want a distinct count of first. It would look something like: Select(p => p.id).Distinct().Count(). As it stands, you're distincting the entire object, which will compare the reference of the object and not the actual values.
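The effect of projecting the column first can be sketched in memory as follows. This is LINQ to Objects over hypothetical tuple rows, which compare by value, so it only illustrates why Distinct over whole rows and Distinct over IDs give different counts; it says nothing about the SQL that the NHibernate provider generates.

```csharp
using System;
using System.Linq;

// Hypothetical rows: the same Id can appear with different other columns.
var rows = new[]
{
    (Id: 1, Name: "a"),
    (Id: 1, Name: "b"),
    (Id: 2, Name: "c"),
};

// Distinct over whole rows: every row differs, so nothing is removed.
int distinctRows = rows.Distinct().Count();

// Project the ID first, then Distinct and Count operate on IDs only.
int distinctIds = rows.Select(r => r.Id).Distinct().Count();

Console.WriteLine($"{distinctRows} rows, {distinctIds} ids");
```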
