LINQ to Entities not ordering correctly - c#

I have a function that runs a somewhat complex LINQ query, but I've verified that the simplified code below also has the problem. I specifically tell the query to order by RequiredDate, which is a DateTime. This is completely ignored, however--the sorting actually occurs by another property, PONumber. The database is all random test data, so nothing is ordered except the Id column. I'm not sure why the other property is being used instead of the column I'm trying to sort by. I use Kendo UI, so the IEnumerable is converted to a Kendo type in the controller, but the LINQ to Entities query returns the incorrect order. What is causing this problem?
(simplified versions are below)
Class:
public partial class PurchaseOrder : BaseEntity
{
public virtual int PONumber { get; set; }
public virtual DateTime RequiredDate { get; set; }
}
Mapping:
public PurchaseOrderMap()
{
ToTable("PurchaseOrder");
HasKey(c => c.Id);
Property(u => u.PONumber).IsRequired();
Property(u => u.RequiredDate).IsRequired();
}
Service (this fetches the data):
public virtual IEnumerable<PurchaseOrder> GetAllPOs()
{
var query = _poRepository.Table;
query = query.Where(p => p.Shipment == null);
query = query.OrderBy(p => p.RequiredDate);
return query;
}
The function is called in the controller by the code below. DataSourceRequest and DataSourceResult are Kendo UI types.
public ActionResult POList([DataSourceRequest]DataSourceRequest request)
{
var pos = _poService.GetAllPOs();
DataSourceResult result = pos.ToDataSourceResult(request, o => PreparePOModelForList(o));
return Json(result);
}
The actual query against the DB (courtesy of SQL Profiler) is:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[PONumber] AS [PONumber],
[Extent1].[RequiredDate] AS [RequiredDate],
[Extent1].[LastUpdateDate] AS [LastUpdateDate]
FROM [dbo].[PurchaseOrder] AS [Extent1]
ORDER BY [Extent1].[PONumber] ASC
OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY

Based on the OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY, I'm guessing you have additional logic somewhere that applies pagination via the Skip() and Take() methods, and my guess is that some additional sorting happens there that you're missing. I can't prove that from the code you've posted, but try to figure out what is generating your OFFSET ... FETCH NEXT ... and I suspect you'll find your answer.
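For what it's worth, Kendo's ToDataSourceResult applies the grid's requested sorting and paging on top of the IQueryable, which can override the OrderBy done in the service. A hedged sketch (assuming the Kendo.Mvc DataSourceRequest/SortDescriptor types; not your actual code) of supplying a default sort when the grid sends none:
public ActionResult POList([DataSourceRequest] DataSourceRequest request)
{
    // If the grid didn't ask for a sort, fall back to RequiredDate so the
    // OFFSET/FETCH paging is applied over the order we actually want.
    if (request.Sorts == null || request.Sorts.Count == 0)
    {
        request.Sorts = new List<SortDescriptor>
        {
            new SortDescriptor
            {
                Member = "RequiredDate",
                SortDirection = System.ComponentModel.ListSortDirection.Ascending
            }
        };
    }

    var pos = _poService.GetAllPOs();
    DataSourceResult result = pos.ToDataSourceResult(request, o => PreparePOModelForList(o));
    return Json(result);
}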

Related

EF loading all children even with Where of FirstOrDefault

I have this code:
using (var context = new MyDbContext(connectionString))
{
context.Configuration.LazyLoadingEnabled = true;
context.Configuration.ProxyCreationEnabled = true;
context.Database.Log = logValue => File.AppendAllText(logFilePath, logValue);
var testItem1 = context.ParentTable
.FirstOrDefault(parent => parent.Id == 1)
.ChildEntities
.FirstOrDefault(child => child.ChildId == 2000);
}
When executing this code and examining the EF 6 log file (logFilePath), I see that all child entities are loaded for the ParentTable record with Id == 1, even though lazy loading is enabled and a Where condition for the child table is specified (child.ChildId == 2000).
Shouldn't EF load only the relevant children, or is the whole collection read first and FirstOrDefault then executed on the in-memory data?
If a parent has many child entities, this can significantly hurt performance when loading children with a condition.
I guess the workaround would be to load the child entities separately?
This is the complete log file for the above code (some lines excluded for easier reading):
SELECT TOP (1)
....
FROM [dbo].[ParentTable] AS [Extent1]
WHERE 1 = [Extent1].[Id]
SELECT
...
FROM [dbo].[ChildTable] AS [Extent1]
WHERE [Extent1].[ParentId] = @EntityKeyValue1
-- EntityKeyValue1: '1' (Type = Int32, IsNullable = false)
NOTE: Added relevant classes:
public class MyDbContext : DbContext
{
public DbSet<ParentTable> ParentTable { get; set; }
public DbSet<ChildTable> ChildTable { get; set; }
static MyDbContext()
{
Database.SetInitializer<MyDbContext>(null);
}
public MyDbContext(string connStr)
: base(connStr)
{
}
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
modelBuilder.Entity<ParentTable>()
.HasMany(t => t.ChildEntities);
}
}
[Table("ParentTable", Schema = "dbo")]
public class ParentTable
{
public int Id { get; set; }
public virtual ICollection<ChildTable> ChildEntities { get; set; }
}
[Table("ChildTable", Schema = "dbo")]
public class ChildTable
{
public int ChildId { get; set; }
public int ParentId { get; set; }
[ForeignKey("ParentId")]
public virtual ParentTable Parent { get; set; }
}
Use this query instead (it queries from the child side, using the DbSet and navigation property names from your model):
var testItem1 = context.ChildTable
    .Include(c => c.Parent) // requires using System.Data.Entity;
    .Where(c => c.ChildId == 2000)
    .FirstOrDefault();
Your problem has nothing to do with lazy loading. It is because you use FirstOrDefault too early in your sequence of LINQ methods.
I'll first write the proper query, then I'll explain why that one is better.
var result = dbContext.ParentTable
.Where(parent => parent.Id == 1)
.SelectMany(parent => parent.ChildEntities.Where(child => child.ChildId == 2000))
.FirstOrDefault();
If you look closely at the LINQ methods, you'll see there are two groups: those that return IQueryable<...>, and the others. LINQ methods in the first group use lazy execution, also called deferred execution. These statements won't execute the query; they only change the Expression in the IQueryable. The database is not queried yet.
LINQ statements from the second group will, deep inside, call GetEnumerator() and usually call MoveNext() / Current repeatedly. This sends the IQueryable.Expression to the IQueryable.Provider, which tries to translate the Expression into SQL and executes the query to fetch the data from the database (to be precise: the translation doesn't always have to be SQL; that depends on the Provider). The fetched data is presented as an IEnumerator<...>, on which you can call MoveNext() / Current.
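As a small illustration (not part of the original code; it uses the DbSet from your question):
// Building the query only composes the Expression; no SQL is sent yet.
IQueryable<ChildTable> query = context.ChildTable.Where(c => c.ChildId == 2000);

// Enumerating it (FirstOrDefault internally calls GetEnumerator / MoveNext)
// is what makes the Provider translate and execute the SQL.
ChildTable firstMatch = query.FirstOrDefault();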
Your first FirstOrDefault already executes the query. Apart from being executed too early and possibly fetching more data than you want, it can also return null, in which case the subsequent .ChildEntities access throws.
The proper method would be to use Select. Only the last statement should contain a non-IQueryable method like FirstOrDefault.
I used SelectMany instead of Select, because you are only interested in the ChildEntities of the Parent, not in any of the Parent properties.
var result = dbContext.ParentTable
.Where(parent => parent.Id == 1)
.SelectMany(parent => parent.ChildEntities.Where(child => child.ChildId == 2000))
.FirstOrDefault();
Although this solves your problem, this will fetch more data than you actually plan to use. For instance, every Child will have a foreign key to the Parent. You know the Parent has a primary key value equal to 1, so the foreign key of the Child will also have a value of 1. Why transfer it?
In this case, I expect only one Child, so the problem is not too big. But in other cases you might be sending the same value often.
When using Entity Framework, always use Select and select only the properties that you plan to use. Only fetch complete rows or use Include if you plan to update the fetched items.
Another thing that will slow down your process if you don't use Select: when you fetch complete rows, the original fetched data and a copy of it are put in the DbContext.ChangeTracker. This is done to make it possible to detect which values must be saved when you call SaveChanges. If you don't plan to update the fetched data, don't waste processing power putting it in the change tracker.
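A minimal sketch of such a projection (any child property beyond ChildId is only a placeholder, since your class doesn't show more):
var result = dbContext.ParentTable
    .Where(parent => parent.Id == 1)
    .SelectMany(parent => parent.ChildEntities
        .Where(child => child.ChildId == 2000)
        .Select(child => new
        {
            child.ChildId,
            // list only the child properties you actually use here
        }))
    .FirstOrDefault();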

C# OData: getting an OData result is very slow

I have this action:
[EnableQuery]
public IHttpActionResult Get()
{
var ordWeb = orderCtx.ORDER.AsQueryable();
var ordWebDTO = ordWeb.ProjectTo<ORDER>(mapper.ConfigurationProvider);
return Ok(ordWebDTO.ToList());
}
This is an action inside a controller.
ordWebDTO is the result of a mapping, with some fields coming from different tables of a database.
In this case, the OData query coming from the URL should be processed AFTER the "return" call.
When I use an OData query in the URL (e.g. localhost/Controller?%24top=30), Entity Framework loads all the data from the database WITHOUT filtering it (in the example: the last 30 records).
It's very expensive: I have more than 35k records, and it loads all of them and only AFTER that takes the last 30...
How can I resolve this?
UPDATE 09.13.18
I have this kind of mapping, with one value calculated while the mapping runs.
var c = new MapperConfiguration(
cfg => cfg.CreateMap<ORDER, ORDER_WEB>()
.ForMember(....)
.ReverseMap()
);
mapper = c.CreateMapper();
In the ORDER_WEB model I have this:
public class ORDER_WEB
{
...
...
public string ValueFromEntityFrameworkModel {get; set;}
public string Set_ORDER
{
get
{
ORDER_TYPE tipo = new ORDER_TYPE();
return tipo.GetData(ValueFromEntityFrameworkModel);
}
set { }
}
Without ToList() it cannot work...
For this reason OData works on ALL records and only AFTER that assigns the mapped values, including Set_ORDER.
The point is: is it possible to run the OData query (with its attributes/parameters) on just a few records first, and AFTER that assign the mapped values?
I hope this is clear...
There are errors in your code sample, but if it accurately reflects what you are doing in your actual code, then
ordWebDTO.ToList()
will go to the database and retrieve all 35k records AND THEN apply the OData filters you were looking to apply. Compare that to:
[EnableQuery]
public IQueryable<ORDER> Get()
{
var ordWeb = orderCtx.ORDER.AsQueryable();
var ordWebDTOs = ordWeb.ProjectTo<ORDER>(mapper.ConfigurationProvider);
return ordWebDTOs;
}
This will return an IQueryable against which the OData filters will be applied so that when the list is materialized, it is an efficient query to the database.
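If the computed Set_ORDER really does force you to materialize, one possible pattern (only a sketch, assuming the Web API ODataQueryOptions type and the AutoMapper instance from the question; no [EnableQuery] attribute in this variant, since the options are applied manually) is to apply the OData options against the database first and run the mapping only on the reduced page:
public IHttpActionResult Get(ODataQueryOptions<ORDER> options)
{
    // Let OData translate $top / $filter / $orderby into the database query first
    IQueryable reduced = options.ApplyTo(orderCtx.ORDER.AsQueryable());

    // Materialize only the requested records, then run the AutoMapper mapping
    // (Set_ORDER is evaluated here, on the small in-memory list)
    var dtos = reduced.Cast<ORDER>()
        .ToList()
        .Select(o => mapper.Map<ORDER_WEB>(o))
        .ToList();

    return Ok(dtos);
}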

NHibernate Filtered Child Collection Lazy Loaded even with eager fetch specified

I'm trying to find out why a child collection is being returned unfiltered, even though I'm eager loading the collection and the generated SQL is correct.
The fluent mappings for the classes are:
public class OptionIdentifierMap : ClassMap<OptionIdentifier>
{
public OptionIdentifierMap()
: base("OptionIdentifier")
{
//Id Mapping Removed
HasMany<OptionPrice>(x => x.OptionPrices)
.KeyColumn("OptionIdentifier_id")
.Cascade.None();
}
}
public class OptionPriceMap : ClassMap<OptionPrice>
{
public OptionPriceMap()
: base("OptionPrice")
{
//Id Mapping removed
References(x => x.Option)
.Column("OptionIdentifier_id")
.Cascade.None()
.ForeignKey("FK_OptionPrice_OptionIdentifier_id_OptionIdentifier_Id")
.Not.Nullable();
References(x => x.Increment)
.Column("PricingIncrement_id")
.Cascade.None()
.ForeignKey("FK_OptionPrice_PricingIncrement_id_PricingIncrement_Id")
.Not.Nullable();
Map(x => x.Price).Not.Nullable();
}
}
and PricingIncrement mapping
public class PricingIncrementMap : ClassMap<PricingIncrement>
{
public PricingIncrementMap()
: base("PricingIncrement")
{
Map(x => x.IncrementYear);
HasMany<OptionPrice>(x => x.Options)
.KeyColumn("PricingIncrement_id")
.Cascade.None().Inverse();
}
}
And the Entities are:
public class PricingIncrement : Entity
{
public PricingIncrement()
{
Options = new List<OptionPrice>();
}
public virtual int IncrementYear { get; set; }
public virtual IList<OptionPrice> Options { get; set; }
}
public class OptionPrice : Entity
{
public OptionPrice()
{
}
public virtual OptionIdentifier Option { get; set; }
public virtual PricingIncrement Increment { get; set; }
public virtual float Price { get; set; }
}
public class OptionIdentifier : Entity
{
public OptionIdentifier()
{
OptionPrices = new List<OptionPrice>();
}
public virtual IList<OptionPrice> OptionPrices { get; set; }
}
I'm trying to query all the OptionIdentifiers that have an OptionPrice value for a specific PricingIncrement.
The SQL query that NHibernate generates from my criteria is:
SELECT this_.Id as Id37_4_,
.......
FROM OptionIdentifier this_ inner join OptionPrice op2_ on this_.Id = op2_.OptionIdentifier_id
inner join PricingIncrement i3_ on op2_.PricingIncrement_id = i3_.Id
WHERE (this_.IsDeleted = 0)
and this_.Id in (7)
and i3_.IncrementYear = 2015
The criteria I'm using to build this query is:
ICriteria pagedCriteria = this.Session.CreateCriteria<OptionIdentifier>()
.CreateAlias("OptionPrices", "op", JoinType.InnerJoin)
.CreateAlias("op.Increment", "i", JoinType.InnerJoin)
.SetFetchMode("op", FetchMode.Eager)
.SetFetchMode("i", FetchMode.Eager)
.Add(Restrictions.Eq("i.IncrementYear", 2015))
.Add(Expression.In("Id", idList.ToList<int>()))
.SetResultTransformer(CriteriaSpecification.DistinctRootEntity);
When looking at SQL Profiler, the query executes and the result is correct: I get one row for each child in the OptionPrice table that matches the criteria, in my case one out of the 4 rows that match the OptionIdentifier (there are 4 rows in PricingIncrement and 4 in OptionPrice, one for each PricingIncrement for OptionIdentifier_id 7).
But when I try to iterate the collection to get some values, for some reason NHibernate loads the child collection as if lazy loading was specified, fetching the full 4 child rows. Reading the documentation, FetchMode is supposed to fix this by preventing NHibernate from lazy loading child collections, similar to the common N+1 issue.
I checked SQL Profiler to see what's happening: NHibernate generates queries without the original filter to fill the child collection when I try to access it. If I don't access the collection, no query is generated.
Doing some testing, I tried different join types and fetch modes, and so far the only way to iterate the collection without having NHibernate load all the elements is to specify LeftOuterJoin as the join type, but that means something different.
I searched for similar issues, but all of them say that eager loading should work, or mention that I should use filters. So far I haven't found any answer.
Any help is greatly appreciated.
I would like to share my approach, maybe not the answer...
I. avoid fetching one-to-many (collections)
When creating any kind of complex query (ICriteria, QueryOver) we should use (LEFT) JOIN only on the star schema, i.e. on many-to-one relations (References() in fluent). That gives the expected row count from the perspective of paging (there is always only ONE row per root entity).
To avoid the 1 + N issue with collections (and in fact even with many-to-one), we have a powerful NHibernate feature:
19.1.5. Using batch fetching
NHibernate can make efficient use of batch fetching, that is, NHibernate can load several uninitialized proxies (or collections) if one proxy is accessed. Batch fetching is an optimization of the lazy select fetching strategy...
Read more here:
How to Eager Load Associations without duplication in NHibernate?
How to implement batch fetching with Fluent NHibernate when working with Oracle?
So, in our case, we would adjust the OptionIdentifier mapping like this:
public OptionIdentifierMap()
    : base("OptionIdentifier")
{
    //Id Mapping Removed
    HasMany<OptionPrice>(x => x.OptionPrices)
        .KeyColumn("OptionIdentifier_id")
        .Cascade.None()
        .Inverse() // I would use .Inverse() here as well
        // batch fetching
        .BatchSize(100);
}
II. collection filtering
So, we managed to avoid the 1 + N issue, and we also query only the star schema. Now, how can we load just a filtered set of items for our collection? Well, we have a native and, again, very powerful NHibernate feature:
18.1. NHibernate filters.
NHibernate adds the ability to pre-define filter criteria and attach those filters at both a class and a collection level. A filter criteria is the ability to define a restriction clause very similar to the existing "where" attribute available on the class and various collection elements...
Read more about it here:
how to assign data layer-level filters
Limit collection to retrieve only recent entries for readonly entity
So in our case we would define a filter:
public class CollFilter : FilterDefinition
{
public CollFilter()
{
WithName("CollFilter")
.WithCondition("PricingIncrement_id = :pricingIncrementId")
.AddParameter("pricingIncrementId",NHibernate.Int32);
}
}
And we would need to extend our mapping again:
HasMany<OptionPrice>(x => x.OptionPrices)
.KeyColumn("OptionIdentifier_id")
.Cascade.None()
.Inverse()
// batch fetching
.BatchSize(100)
// this filter could be turned on later
.ApplyFilter<CollFilter>();
Now, before our query is executed, we just have to enable that filter and provide the ID of the PricingIncrement for the year 2015:
// the ID of the PricingIncrement with year 2015
var pricingIncrementId = this.Session
    .QueryOver<PricingIncrement>()
    .Where(x => x.IncrementYear == 2015)
    .Take(1)
    .Select(x => x.Id)
    .SingleOrDefault<int?>();
this.Session
.EnableFilter("CollFilter")
.SetParameter("pricingIncrementId", pricingIncrementId);
// ... the star schema query could be executed here
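A hedged sketch of that follow-up query (kept close to the question's own criteria; not part of the original setup): with the filter enabled, the root entities can be loaded without joining the collection, and iterating OptionPrices then issues a batched SELECT restricted by the CollFilter condition:
var options = this.Session.CreateCriteria<OptionIdentifier>()
    .Add(Expression.In("Id", idList.ToList<int>()))
    .List<OptionIdentifier>();

foreach (var optionIdentifier in options)
{
    // loads only the filtered OptionPrice rows, batch-fetched up to 100 collections at a time
    foreach (var optionPrice in optionIdentifier.OptionPrices)
    {
        var price = optionPrice.Price;
    }
}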
III. Sub-query to filter root entity
Finally, we can use a sub-query to restrict the set of root entities returned by our query.
15.8. Detached queries and subqueries
Read more about it here:
Query on HasMany reference
NHibernate Criteria Where any element of list property is true
So, our subquery could be:
// Subquery
var subquery = DetachedCriteria.For<OptionPrice>()
    .CreateAlias("Increment", "i", JoinType.InnerJoin)
    .Add(Restrictions.Eq("i.IncrementYear", 2015))
    .SetProjection(Projections.Property("Option.Id"));
// root query, ready for paging, and still filtered as wanted
ICriteria pagedCriteria = this.Session.CreateCriteria<OptionIdentifier>()
    .Add(Subqueries.PropertyIn("Id", subquery))
    .SetResultTransformer(CriteriaSpecification.DistinctRootEntity);
Summary: We can use a lot of features that ship with NHibernate. They are there for a reason, and together they let us write stable and solid code that is ready for further extension (paging in the first place).
NOTE: maybe I made some typos... but the overall idea should be clear

Lambda expression with .Where clause using Contains

When connecting to CRM 2013, is there a smart way to create a lambda expression that gets the entities whose GUIDs are in a List?
This code breaks on the Where clause and gives the error:
Invalid 'where' condition. An entity member is invoking an invalid property or method.
Code:
private List<UserInformationProxy> GetContactsFromGuidList(List<Guid> contactList)
{
var result = _serviceContext.ContactSet
.Where(x=> contactList.Contains((Guid) x.ContactId)) // this line breaks
.Select(x => new UserInformationProxy()
{
FullName = x.FullName,
Id = x.ContactId
})
.Distinct()
.ToList<UserInformationProxy>();
return result;
}
// return class
public class UserInformationProxy
{
public Guid? Id { get; set; }
public string FullName { get; set; }
public string DomainName { get; set; }
}
Currently I'm solving this by getting all the contacts from the ContactSet and sorting out the ones I want with a loop in my code. This works, but is quite slow, as I need to get all 10000 contacts instead of sending the Guids of the 40 I'm actually interested in to the SQL server.
QueryExpressions support an In operator, so this should work just fine:
private List<UserInformationProxy> GetContactsFromGuidList(List<Guid> contactList)
{
var qe = new QueryExpression(Contact.EntityLogicalName);
qe.ColumnSet = new ColumnSet("fullname", "contactid");
qe.Criteria.AddCondition("contactid", ConditionOperator.In, contactList.Cast<object>().ToArray());
qe.Distinct = true;
var results = service.RetrieveMultiple(qe).Entities
    .Select(e => e.ToEntity<Contact>())
    .Select(x => new UserInformationProxy()
    {
        FullName = x.FullName,
        Id = x.ContactId
    })
    .ToList();
return results;
}
On a side note, every Contact has to have an Id that is not empty, so there is no need to check for it.
EDIT: It is possible to accomplish this with a single query; Daryl posted an answer with the right code.
Other (not so clever) alternatives are:
Retrieve all the records and after check the Guids
Do a single retrieve for each Guid
Because there are only 40 records, I suggest using late binding to retrieve them, in order to choose the minimal ColumnSet.
Useful links related to this issue:
Another question regarding Dynamics CRM LINQ limitations
Performance test Early Bound vs Late Bound
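For completeness, a hedged sketch of the late-bound alternative (one Retrieve per Guid with a minimal ColumnSet; the service variable, an IOrganizationService, mirrors the QueryExpression answer above):
private List<UserInformationProxy> GetContactsFromGuidListLateBound(List<Guid> contactList)
{
    var columns = new ColumnSet("fullname", "contactid");
    return contactList
        .Select(id => service.Retrieve("contact", id, columns))
        .Select(e => new UserInformationProxy
        {
            Id = e.Id,
            FullName = e.GetAttributeValue<string>("fullname")
        })
        .ToList();
}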

Break out part of linq-to-sql expression to a separate function

I have two entity classes:
public class Invoice
{
public int ID { get; set;}
public int Amount { get { return InvoiceLines.Sum(il => il.Amount); }}
public EntitySet<InvoiceLine> InvoiceLines { get; set; }
}
public class InvoiceLine
{
public Invoice Invoice {get;set;}
public int InvoiceID {get;set;}
public int Amount {get;set;}
public string SomeHugeString {get;set;}
}
(The real classes are generated by sqlmetal; I shortened them here to get to the point.)
Querying for all amounts:
var amounts = from i in invoice select i.Amount;
This will cause all invoice lines to be lazily loaded, with one database call per invoice. I can solve it with data load options, but that would cause the entire InvoiceLine objects to be read, including SomeHugeString.
Repeating the amount calculation in the query will get a good SQL translation:
var amounts = from i in invoice select i.InvoiceLines.Sum(il => il.Amount);
I would like to have LINQ to SQL somehow get part of the expression tree from a function/property. Is there a way to rewrite Invoice.Amount so that the first amounts query produces the same SQL translation as the second one?
You can do something similar using AsExpandable() from LINQKit:
Expression<Func<Invoice, int>> getAmount =
i => i.InvoiceLines.Sum(il => il.Amount);
var amounts = from i in invoice.AsExpandable() select getAmount.Invoke(i);
You can create your own functions using the IQueryable interface.
I've used the standard Northwind DB:
public static class LinqExtensions
{
public static IQueryable<int> CalculateAmounts(this IQueryable<Order> order)
{
return from o in order select o.Order_Details.Sum(i => i.Quantity);
}
}
var amounts = (from o in context.Orders select o).CalculateAmounts();
This code generates the following SQL:
SELECT [t2].[value]
FROM [dbo].[Orders] AS [t0]
OUTER APPLY (
SELECT SUM(CONVERT(Int,[t1].[Quantity])) AS [value]
FROM [dbo].[Order Details] AS [t1]
WHERE [t1].[OrderID] = [t0].[OrderID]
) AS [t2]
I'd suggest you set the SomeHugeString property to be delay loaded (lazy loaded). That way you can load InvoiceLine without fetching the huge string, which means you can use DataLoadOptions.LoadWith().
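A rough sketch of that combination (the DataContext name is hypothetical; SomeHugeString is assumed to be marked Delay Loaded = True in the designer, i.e. a Link<string> field in the generated code):
var options = new DataLoadOptions(); // System.Data.Linq
options.LoadWith<Invoice>(i => i.InvoiceLines);

using (var db = new InvoicesDataContext()) // hypothetical DataContext
{
    db.LoadOptions = options;
    // InvoiceLines are fetched together with the invoices, but the delay-loaded
    // SomeHugeString column stays in the database until it is actually read.
    var amounts = (from i in db.Invoices select i.Amount).ToList();
}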
