I currently have this Linq query:
return this.Context.StockTakeFacts
.OrderByDescending(stf => stf.StockTakeId)
.Where(stf => stf.FactKindId == ((int)kind))
.Take(topCount)
.ToList<IStockTakeFact>();
The intent is to return every fact for the topCount of StockTakes but instead I can see that I will only get the topCount number of facts.
How do I Linq-ify this query to achieve my aim?
I could use 2 queries to get the top-topCount StockTakeId and then do a "between" but I wondered what tricks Linq might have.
This is what I'm trying to beat. Note that it's really more about learning that not being able to find a solution. Also concerned about performance not for these queries but in general, I don't want to just to easy stuff and find out it's thrashing behind the scenes. Like what is the penalty of that contains clause in my second query below?
List<long> stids = this.Context.StockTakes
.OrderByDescending(st => st.StockTakeId)
.Take(topCount)
.Select(st => st.StockTakeId)
.ToList<long>();
return this.Context.StockTakeFacts
.Where(stf => (stf.FactKindId == ((int)kind)) && (stids.Contains(stf.StockTakeId)))
.ToList<IStockTakeFact>();
What about this?
return this.Context.StockTakeFacts
.OrderByDescending(stf => stf.StockTakeId)
.Where(stf => stf.FactKindId == ((int)kind))
.Take(topCount)
.Select(stf=>stf.Fact)
.ToList();
If I've understood what you're after correctly, how about:
return this.Context.StockTakes
.OrderByDescending(st => st.StockTakeId)
.Take(topCount)
.Join(
this.Context.StockTakeFacts,
st => st.StockTakeId,
stf => stf.StockTakeId,
(st, stf) => stf)
.OrderByDescending(stf => stf.StockTakeId)
.ToList<IStockTakeFact>();
Here's my attempt using mostly query syntax and using two separate queries:
var stids =
from st in this.Context.StockTakes
orderby st.StockTakeId descending
select st.StockTakeId;
var topFacts =
from stid in stids.Take(topCount)
join stf in this.Context.StockTakeFacts
on stid equals stf.StockTakeId
where stf.FactKindId == (int)kind
select stf;
return topFacts.ToList<IStockTakeFact>();
As others suggested, what you were looking for is a join. Because the join extension has so many parameters they can be a bit confusing - so I prefer query syntax when doing joins - the compiler gives errors if you get the order wrong, for instance. Join is by far preferable to a filter not only because it spells out how the data is joined together, but also for performance reasons because it uses indexes when used in a database and hashes when used in linq to objects.
You should note that I call Take in the second query to limit to the topCount stids used in the second query. Instead of having two queries, I could have used an into (i.e., query continuation) on the select line of the stids query to combine the two queries, but that would have created a mess for limiting it to topCount items. Another option would have been to put the stids query in parentheses and invoked Take on it. Instead, separating it out into two queries seemed the cleanest to me.
I ordinarily avoid specifying generic types whenever I think the compiler can infer the type; however, IStockTakeFact is almost certainly an interface and whatever concrete type implements it is likely contained by this.Context.StockTakeFacts; which creates the need to specify the generic type on the ToList call. Ordinarily I omit the generic type parameter to my ToList calls - that seems to be an element of my personal tastes, yours may differ. If this.Context.StockTakeFacts is already a List<IStockTakeFact> you could safely omit the generic type on the ToList call.
Related
I have the following query:
var vendors = (from pp in this.ProductPricings
join pic in this.ProductItemCompanies
on pp.CompanyId equals pic.CompanyId into left
from pic in left.DefaultIfEmpty()
orderby pp.EffectiveDate descending
group pp by new { pp.Company, SortOrder = (pic != null) ? pic.SortOrder : short.MinValue } into v
select v).OrderBy(z => z.Key.SortOrder);
Does anyone know how the last OrderBy() is applied? Does that become part of the SQL query, or are all the results loaded in to memory and then passed to OrderBy()?
And if it's the second case, is there any way to make it all one query? I only need the first item and it would be very inefficent to return all the results.
Well it will try to apply the OrderBy to the original query since you are still using an IQueryable - meaning it hasn't been converted to an IEnumerable or hydrated to a collection using ToList or an equivalent.
Whether it can or not depends on the complexity of the resulting query. You'd have to try it to find out. My guess is it will turn the main query into a subquery and layer on a "SELECT * FROM (...) ORDER BY SortOrder" outer query.
Given your specific example the order by in this situation most, likely be appliead as part of the expression tree when it getting build, there for it will be applied to sql generated by the LINQ query, if you would convert it to Enumarable like ToList as mentioned in another answer then Order by would be applied as an extension to Enumerable.
Might use readable code, because as you write it is not understandable.
You will have a problem in the future with the linq statement. The problem is that if your statement does not return any value the value will be null and whenever you make cause a exception.
You must be careful.
I recommend you to do everything separately to understand the code friend.
I have a system that uses tags to categorize content similar to the way Stack Overflow does. I am trying to generate a list of the most recently used tags using LINQ to SQL.
(from x in cm.ContentTags
join t in cm.Tags on x.TagID equals t.TagID
orderby x.ContentItem.Date descending select t)
.Distinct(p => p.TagID) // <-- Ideally I'd be able to do this
The Tags table has a many to many relationship with the ContentItems table. ContentTags joins them, each tuple having a reference to the Tag and ContentItem.
I can't just use distinct because it compares on Tablename.* rather than Tablename.PrimaryKey, and I can't implement an IEqualityComparer since that doesn't translate to SQL, and I don't want to pull potential millions of records from the DB with .ToList(). So, what should I do?
You could write your own query provider, that supports such an overloaded distinct operator. It's not cheap, but it would probably be worthwhile, particularly if you could make it's query generation composable. That would enable a lot of customizations.
Otherwise you could create a stored proc or a view.
Use a subselect:
var result = from t in cm.Tags
where(
from x in cm.ContentTags
where x.TagID == t.TagID
).Contains(t.TagID)
select t;
This will mean only distinct records are returned form cm.Tags, only problem is you will need to find some way to order result
Use:
.GroupBy(x => x.TagID)
And then extract the data in:
Select(x => x.First().Example)
I'm trying to create a LINQ provider. I'm using the guide LINQ: Building an IQueryable provider series, and I have added the code up to LINQ: Building an IQueryable Provider - Part IV.
I am getting a feel of how it is working and the idea behind it. Now I'm stuck on a problem, which isn't a code problem but more about the understanding.
I'm firing off this statement:
QueryProvider provider = new DbQueryProvider();
Query<Customer> customers = new Query<Customer>(provider);
int i = 3;
var newLinqCustomer = customers.Select(c => new { c.Id, c.Name}).Where(p => p.Id == 2 | p.Id == i).ToList();
Somehow the code, or expression, knows that the Where comes before the Select. But how and where?
There is no way in the code that sorts the expression, in fact the ToString() in debug mode, shows that the Select comes before the Where.
I was trying to make the code fail. Normal I did the Where first and then the Select.
So how does the expression sort this? I have not done any change to the code in the guide.
The expressions are "interpreted", "translated" or "executed" in the order you write them - so the Where does not come before the Select
If you execute:
var newLinqCustomer = customers.Select(c => new { c.Id, c.Name})
.Where(p => p.Id == 2 | p.Id == i).ToList();
Then the Where is executed on the IEnumerable or IQueryable of the anonymous type.
If you execute:
var newLinqCustomer = customers.Where(p => p.Id == 2 | p.Id == i)
.Select(c => new { c.Id, c.Name}).ToList();
Then the Where is executed on the IEnumerable or IQueryable of the customer type.
The only thing I can think of is that maybe you're seeing some generated SQL where the SELECT and WHERE have been reordered? In which case I'd guess that there's an optimisation step somewhere in the (e.g.) LINQ to SQL provider that takes SELECT Id, Name FROM (SELECT Id, Name FROM Customer WHERE Id=2 || Id=#i) and converts it to SELECT Id, Name FROM Customer WHERE Id=2 || Id=#i - but this must be a provider specific optimisation.
No, in the general case (such as LINQ to Objects) the select will be executed before the where statement. Think of it is a pipeline, your first step is a transformation, the second a filter. Not the other way round, as it would be the case if you wrote Where...Select.
Now, a LINQ Provider has the freedom to walk the expression tree and optimize it as it sees fit. Be aware that you may not change the semantics of the expression though. This means that a smart LINQ to SQL provider would try to pull as many where clauses it can into the SQL query to reduce the amount of data travelling over the network. However, keep the example from Stuart in mind: Not all query providers are clever, partly because ruling out side effects from query reordering is not as easy as it seems.
I'm trying to make a LINQ to SQL statement which filters results where the ID is not in some list of integers. I realise the .contains() method cannot be used in Linq to SQL but for the purposes of explaining what I'd like to do, here's what I'd like to do:
nextInvention = (from inv in iContext.Inventions
where !historyList.Contains(inv.Id)
orderby inv.DateSubmitted ascending
select inv).First<Invention>();
Any idea how I might go about doing this?
Thanks!
Contains can be used in LINQ to SQL... it's the normal way of performing "IN" queries. I have a vague recollection that there are some restrictions, but it definitely can work. The restrictions may be around the types involved... is historyList a List<T>? It probably isn't supported for arbitrary IEnumerable<T>, for example.
Now, I don't know whether inverting the result works to give a "NOT IN" query, but it's at least worth trying. Have you tried the exact query from your question?
One point to note: I think the readability of your query would improve if you tried to keep one clause per line:
var nextInvention = (from inv in iContext.Inventions
where !historyList.Contains(inv.Id)
orderby inv.DateSubmitted ascending
select inv)
.First();
Also note that in this case, the method syntax is arguably simpler:
var nextInvention = iContext.Inventions
.Where(inv => !historyList.Contains(inv.Id))
.OrderBy(inv => inv.DateSubmitted)
.First();
LINQ is one of the greatest improvements to .NET since generics and it saves me tons of time, and lines of code. However, the fluent syntax seems to come much more natural to me than the query expression syntax.
var title = entries.Where(e => e.Approved)
.OrderBy(e => e.Rating).Select(e => e.Title)
.FirstOrDefault();
var query = (from e in entries
where e.Approved
orderby e.Rating
select e.Title).FirstOrDefault();
Is there any difference between the two or is there any particular benefit of one over other?
Neither is better: they serve different needs. Query syntax comes into its own when you want to leverage multiple range variables. This happens in three situations:
When using the let keyword
When you have multiple generators (from clauses)
When doing joins
Here's an example (from the LINQPad samples):
string[] fullNames = { "Anne Williams", "John Fred Smith", "Sue Green" };
var query =
from fullName in fullNames
from name in fullName.Split()
orderby fullName, name
select name + " came from " + fullName;
Now compare this to the same thing in method syntax:
var query = fullNames
.SelectMany (fName => fName.Split().Select (name => new { name, fName } ))
.OrderBy (x => x.fName)
.ThenBy (x => x.name)
.Select (x => x.name + " came from " + x.fName);
Method syntax, on the other hand, exposes the full gamut of query operators and is more concise with simple queries. You can get the best of both worlds by mixing query and method syntax. This is often done in LINQ to SQL queries:
var query =
from c in db.Customers
let totalSpend = c.Purchases.Sum (p => p.Price) // Method syntax here
where totalSpend > 1000
from p in c.Purchases
select new { p.Description, totalSpend, c.Address.State };
I prefer to use the latter (sometimes called "query comprehension syntax") when I can write the whole expression that way.
var titlesQuery = from e in entries
where e.Approved
orderby e.Rating
select e.Titles;
var title = titlesQuery.FirstOrDefault();
As soon as I have to add (parentheses) and .MethodCalls(), I change.
When I use the former, I usually put one clause per line, like this:
var title = entries
.Where (e => e.Approved)
.OrderBy (e => e.Rating)
.Select (e => e.Title)
.FirstOrDefault();
I find that a little easier to read.
Each style has their pros and cons. Query syntax is nicer when it comes to joins and it has the useful let keyword that makes creating temporary variables inside a query easy.
Fluent syntax on the other hand has a lot more methods and operations that aren't exposed through the query syntax. Also since they are just extension methods you can write your own.
I have found that every time I start writing a LINQ statement using the query syntax I end up having to put it in parenthesis and fall back to using fluent LINQ extension methods. Query syntax just doesn't have enough features to use by itself.
In VB.NET i very much prefer query syntax.
I hate to repeat the ugly Function-keyword:
Dim fullNames = { "Anne Williams", "John Fred Smith", "Sue Green" };
Dim query =
fullNames.SelectMany(Function(fName) fName.Split().
Select(Function(Name) New With {Name, fName})).
OrderBy(Function(x) x.fName).
ThenBy(Function(x) x.Name).
Select(Function(x) x.Name & " came from " & x.fName)
This neat query is much more readable and maintainable in my opinion:
query = From fullName In fullNames
From name In fullName.Split()
Order By fullName, name
Select name & " came from " & fullName
VB.NET's query syntax is also more powerful and less verbose than in C#: https://stackoverflow.com/a/6515130/284240
For example this LINQ to DataSet(Objects) query
VB.NET:
Dim first10Rows = From r In dataTable1 Take 10
C#:
var first10Rows = (from r in dataTable1.AsEnumerable()
select r)
.Take(10);
I don't get the query syntax at all. There's just no reason for it in my mind. let can be acheived with .Select and anonymous types. I just think things look much more organized with the "punctuation" in there.
The fluent interface if there's just a where. If I need a select or orderby, I generally use the Query syntax.
Fluent syntax does seem more powerful indeed, it should also work better for organizing code into small reusable methods.
I know this question is tagged with C#, but the Fluent syntax is painfully verbose with VB.NET.
I really like the Fluent syntax and I try to use it where I can, but in certain cases, for example where I use joins, I usually prefer the Query syntax, in those cases I find it easier to read, and I think some people are more familiar to Query (SQL-like) syntax, than lambdas.
While I do understand and like the fluent format , I've stuck to Query for the time being for readability reasons. People just being introduced to LINQ will find Query much more comfortable to read.
I prefer the query syntax as I came from traditional web programming using SQL. It is much easier for me to wrap my head around. However, it think I will start to utilize the .Where(lambda) as it is definitely much shorter.
I've been using Linq for about 6 months now. When I first started using it I preferred the query syntax as it's very similar to T-SQL.
But, I'm gradually coming round to the former now, as it's easy to write reusable chunks of code as extension methods and just chain them together. Although I do find putting each clause on its own line helps a lot with readability.
I have just set up our company's standards and we enforce the use of the Extension methods. I think it's a good idea to choose one over the other and don't mix them up in code. Extension methods read more like the other code.
The comprehension syntax does not have all operators and using parentheses around the query and add extension methods after all just begs me for using extension methods from the start.
But for the most part it is just personal preference with a few exceptions.
From Microsoft's docs:
As a rule when you write LINQ queries, we recommend that you use query syntax whenever possible and method syntax whenever necessary. There is no semantic or performance difference between the two different forms. Query expressions are often more readable than equivalent expressions written in method syntax.
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/#query-expression-overview
They also say:
To get started using LINQ, you do not have to use lambdas extensively. However, certain queries can only be expressed in method syntax and some of those require lambda expressions.
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/linq/query-syntax-and-method-syntax-in-linq