Is there a version of the DataLoadOptions class that exists in LINQ to SQL for LINQ to Entities? Basically I want to have one place that stores all of the eager loading configuration and not have to add .Include() calls to all of my LINQ to Entities queries. Or if someone has a better solution definitely open to that as well.
TIA,
Benjy
Personally I'm glad that there is no (official) EF equivalent of DataLoadOptions. Why? A few reasons:
It is too easy to use them in a way that exceptions are thrown, see the remarks section of the MSDN page.
I don't like the concept of filtered child collections. When a Customer has Orders, I want the Orders member to represent the orders of that customer (lazy or not). A filter defined somewhere else (by AssociateWith) is easily forgotten. I will filter them when and where needed. This leads to the last objection:
Most importantly: it is stateful and most bugs are caused by unexpected state. DataLoadOptions change the DataContext's state in a way that later queries are influenced. I prefer to define eager loading where and when I need it. Typing is cheap, bugs are expensive.
Nevertheless, for completeness's sake I should mention that Muhammad Mosa did put some effort in an EF version of DataLoadOptions. I never tried it though.
I realize that you probably want to prevent repetitive code. But if you need identically shaped queries in more than one place you are already repeating code, with or without "globally" defined includes. A central eager loading configuration is pseudo DRY-ness. And soon you'll find yourself tripping over your own feet when eager loading is not desired but, darn, it's configured to happen!
Entity Framework does not support eager loading settings for the whole 'ObjectContext'. But you can declare all required 'IQueryable' properties with include options in a partial class. For example:
public IQueryable<Order> Orders {
get {
return OrderSet.Include("OrderDetails");
}
}
Related
I would like to know if the following scenario is possible with Entity Framework:
I want to load several tables with the option AsNoTracking since they are all like static tables that cannot be changed by user.
Those tables also happen to be navigation property of others. Up till now I relied on the AutoMapping feature of the Entity Framework, and don't use the .Include() or LazyLoading functionality.
So instead of:
var result = from x in context.TestTable
.Include("ChildTestTable")
select x;
I am using it like this:
context.ChildTestTable.Load();
context.TestTable.Load();
var result = context.TestTable.Local;
This is working smoothly because the application is so designed that the tables within the Database are very small, there won't be a table that exceeds 600 rows (and that's already pretty high value in my app).
Now my way of loading data, isn't working with .AsNoTracking().
Is there any way to make it working?
So I can write:
context.ChildTestTable.AsNoTracking().List();
var result = context.TestTable.AsNoTracking().List();
Instead of:
var result = from x in context.TestTable.AsNoTracking()
.Include("ChildTestTable")
select x;
So basically, I want to have 1 or more tables loaded with AutoMapping feature on but without loading them into the Object State Manager, is that a possibility?
The simple answer is no. For normal tracking queries, the state manager is used for both identity resolution (finding a previously loaded instance of a given entity and using it instead of creating a new instance) and fixup (connecting navigation properties together). When you use a no-tracking query it means that the entities are not tracked in the state manager. This means that fixup between entities from different queries cannot happen because EF has no way of finding those entities.
If you were to use Include with your no-tracking query then EF would attempt to do some fixup between entities within the query, and this will work a lot of the time. However, some queries can result in referencing the same entity multiple times and in some of those cases EF has no way of knowing that it is the same entity being referenced and hence you may get duplicates.
I guess the thing you don't really say is why you want to use no-tracking. If your tables don't have a lot of data then you're unlikely to see significant perf improvements, although many factors can influence this. (As a digression, using the ObservableCollection returned by .Local could also impact perf and should not be necessary if the data never changes.) Generally speaking you should only use no-tracking if you have an explicit need to do so since otherwise it ends up adding complexity without benefit.
I want to cache a never-changing aggregate which would be accessible by a root object only (all other entities are accessible only by using Reference/HasMany properties on the root object)?
Should I use NHibernate (which we are already using) second-level-cache or is it better to build some sort of singleton that provides access to all entities in the aggregate?
I found a blog post about getting everything with MultiQuery but my database does not support it.
The 'old way' to do this would be to
Do a select * from all aggregate tables
Loop the entities and set the References and the Collections manually
Something like:
foreach (var e in Entities)
{
e.Parent = loadedParentEntities.SingleOrDefault(pe => e.ParentId = pe.Id);
}
But surely there is a way to tell NHibernate to do this for me?
Update
Currently I tried merely fetching everything from the db and hope NHibernate does all the reference setting. It does not however :(
var getRoot = Session.Query<RootObject>().ToList();
var getRoot_hasMany = Session.Query<RootObjectCollection>().ToList();
var getRoot_hasMany_ref = Session.Query<RootObjectCollectionReference>().ToList();
var getRoot_hasMany_hasMany = Session.Query<RootObjectCollectionCollection>().ToList();
Domain:
Root objects are getRoot. These have a collection property 'HasMany'. These HasMany have each a reference back to GetRoot, and a reference to another entity (getRoot_hasMany_ref), and a collection of their own (getRoot_hasMany_hasMany). If this doesn't make sense, I'll create an ERD but the actual structure is not really relevant for the question (I think).
This results in 4 queries being executed. (which is good)
However, when accessing properties like getRoot.First().HasMany.First().Ref or getRoot.First().HasMany.First().HasMany().First() it still results in extra queries being executed even altough everything should already be known to the ISession?
So how do I tell NHibernate to perform those 4 queries and then build the graphs without using any proxy properties, ... so that I have access to everything even after the ISession went out of scope?
I think there are several questions in one.
I stopped trying to trick NHibernate too much. I wouldn't access entities from multiple threads, because they are usually not thread safe. At least when using lazy loading. Caching lazy entities is therefore something evil.
I would avoid too many queries by the use of batch size, which is far the cleanest and easiest solution and in most cases "good enough". It's fully transparent to the business logic, which makes it so cool.
I would:
Consider not caching the entity at all. Use NH first level cache (say: always load it using session.Get()). Make use of lazy loading when only a small part of the data is used in a single transaction.
Is there is a proven need to cache the data, consider to turn off lazy loading at all (by making the entities non-lazy and setting all the collections to non lazy. Load the entity once and cache it. Still consider thread safety when accessing the data while it is still loaded.
Should the entities be lazy, because some instances of the same type are not in the cache, consider using a DTO-like structure as cache. Copy all data in a similar class structure which are not entities. This may sound like a lot of additional work, but at the end it will avoid many strange problems and safe you much time.
Usually, query time is less important as flush time. This time is used by NH to find which entities changed in a session. To avoid this, make entities read only if you can.
if the whole object tree never changes (config settings?) then just load them efficiently with all references/collections initialised
using(var Session = Sessionfactory = OpenSession())
{
var root = Session.Query<RootObject>().FetchMany(x => x.Collection).ToFutureValue();
Session.Query<RootObjectCollection>().Fetch(x => x.Ref).FetchMany(x => x.Collection).ToFuture();
// Do something with root.Value
}
I've got a couple of entities in a parent-child relationship: Family (parent) and Updates (child). I want to read a list of families without the corresponding updates. There are only 17 families but about 60,000 updates so I really don't want the updates.
I used EntitiesToDTOs to generate a DTO from the Family entity and to create an assembler for converting the Family entity to the FamilyDTO. The ToDTO method of the assembler looks like:
public static FamilyDTO ToDTO(this Family entity)
{
if (entity == null) return null;
var dto = new FamilyDTO();
dto.FamilyCode = entity.FamilyCode;
dto.FamilyName = entity.FamilyName;
dto.CreateDatetime = entity.CreateDatetime;
dto.Updates_ID = entity.Updates.Select(p => p.ID).ToList();
entity.OnDTO(dto);
return dto;
}
When I run the assembler I find each resulting FamilyDTO has the Updates_ID list populated, although lazy loading is set to true for the EF model (edmx file). Is it possible to configure EntitiesToDTOs to support lazy loading of child elements or will it always use eager loading? I can't see any option in EntitiesToDTOs that could be set to support lazy loading when generating the assembler.
By the way, I'm part of a larger team that uses EntitiesToDTOs to regenerate the assemblers on an almost daily basis, so I'd prefer not to modify the assembler by hand if possible.
I'm Fabian, creator of EntitiesToDTOs.
First of all, thanks a lot for using it.
What you have detected is in fact what I don't want the Assembler to do, I want the developer to map the navigation properties only if needed using the partial methods OnDTO and OnEntity. Otherwise you run into problems like you have.
Seems like I never run into that problem using the tool, THANKS A LOT.
So right now I'm fixing this. It's now fixed in version 3.1.
Based on the code that you've posted here, and based on how I think someone would implement such a solution (i.e. to convert records to a DTO format) I think that you would have no choice but to do eager loading.
Some key points:
1) Your Updates_ID field is clearly a List, which means that it's hydrating the collection right away (ToList always executes. Only a regular IEnumerable employs deferred execution).
2) If you're sticking any sort of navigation property in a DTO it would automatically be loaded eagerly. That's because once you so much as touch a navigation property that was brought back by Entity Framework, the framework automatically loads it from the database, and doesn't care that all you wanted was to populate a DTO with it.
I have a database with a Customer, Supplier, and Services (this is a gross simplification, I really have about 100 tables)
I am developing a new Entity Framework library for accessing these tables.
A Customer has many Suppliers
A Supplier has many Services
I am trying to decide which approach to follow -
A )
Use mapping to connect the Customer to the Supplier and the Supplier to the Services, then every time I load a customer I get all his suppliers and their services (and other tables loaded)
B )
Have no mapping between entities, but provide methods in the relevant repository; e.g. in the supplier repository I'll have IEnumerable<Supplier> GetSupplierByCustomerID(int customerID)
EDIT Changed above to IEnumerable based on suggestions.
Are these the two main approaches when using EF? Which is considered better, from your perspective.
Is there another approach I'm not considering?
I would personnally expose many simple methods.
Use mapping to connect the Customer to the Supplier and the Supplier
to the Services, then every time I load a customer I get all his
suppliers and their services (and other tables loaded)
If you only need to get the Name of your customer from its ID, then the above solution would require you to load useless and heavy object graph unless you use lazy loading, but as you may have some serialization process (3-tier architecture ?), it's a problem for you as you can't use lazy loading in this case.
So you could expose for example:
Supplier GetSupplierByID(int supplierID)
IEnumerable<Supplier> GetSuppliersByCustomerID(int customerID)
...
I would also recommend not exposing IQueryable. If possible, use IEnumerable instead. See this article for more details about the danger of using IQueryable when all implications are not well known.
In general I feel like putting a repository over EF is always a good idea. You get to abstract your database logic from your client-side logic (or even business logic). And the specific case that you're mentioning you would be able to do one other nice benefit: You would only get the information that you want when you specifically call for it (like the GetSupplierByCustomerID example that you mentioned.
Another approach you might consider is the one that I mentioned in the answer to this question: Bounded Contexts. The more separation of concerns that you have in your application, the better it will be in the long run for you and your fellow programmers (especially when you want to unit test it all).
It is just my opinion, I do not know whether it is proper in your case since it depends on your business requirements, but I generally prefer the third option.
All repositories return IEnumerables, not IQueryables : this enables all database operations to be finished before running any business logic.
All repositories expose methods with optional parameters enabling to declare included navigation properties : this enables to call repository methods with required navigation entities.
Create a base generic repository and inherit from it in each of your repositories.
Implement unit of work pattern to share context and enable transaction.
sample method signiture from base repository (T is the type of entity):
IEnumerable<T> Find(Expression<Func<T, bool>> criteria, params Expression<Func<T, object>>[] navigationList)
When its about to map or not to map, I'd opt for A. There are many advantages to navigation properties (like Customer.Supplier) and there are many ways to control lazy/eager loading.
Advantages of navigation properties is that linq queries are much easier to write. Hardly ever you'll have to write a join:
With join:
from supp in db.Supliers
join serv in db.Services on supp.SupplierId equals serv.SupplierId
select ...
With navigation property
from supp in db.Supliers
from serv in supp.Services
select ...
Or things like this:
from supp in db.Supliers
select new { supp.Name, ServicesCount = supp.Services.Count() }
and EF will figure out how to do the joins in SQL.
Having navigation properties doesn't mean that they always get loaded. For lazy loading to happen, two conditions must be met
The property must be defined as virtual to enable EF to override it in a proxy type with wiring to cary out lazy loading.
The context must be lazy-loading enabled. They are by default, but you can turn it off per instance by setting context.Configuration.LazyLoadingEnabled = false.
So this also shows two ways to control lazy loading: you can enable/disable it structurally or temporarily.
Apart from that you can control the opposite, eager loading, in two ways:
Using the Include statement:
db.Suppliers.Include(s => s.Services)
Including navigation properties in projections:
from supp in db.Supliers
from serv in supp.Services
select new { supp.Name, serv.ServiceName }
(there are more ways, but these are the most important ones)
This would applies to writing linq queries in your services or repositories. As others have said: don't expose IQueryable to the consumers of your service/repository methods.
One last important note: lazy loading is only possible within the scope of a life context. If the context is disposed and a lazy-loading navigation property is addressed, an exception is thrown. At the same time it is recommended to uses context instances with a short life span. So there's the dilemma: expose entity objects or only DTO's or view models or stuff like that. When you expose lazy loading-enabled entity objects a consumer may inadvertently address a navigation property that has not been loaded yet, and the context is gone.
I will start by saying I have already looked thoroughly in stack overflow, nhusers and the documentation for a possible solution to my issue.
I need to be able to query only the base class table in parts of my multi/future query when eagerly loading associations (although from the research I have done I don't think this is possible)
I have started to map an existing schema using fluent nhibernate as a proof of concept. I have mapped an inheritance hierarchy using table per sub class (The mappings all work perfectly fine so I won't paste them all in here). The hierarchy has around 15 sub classes and the base class has some additional associations. E.g.
Base
Dictionary<string, Attribute> Attributes
List<EntityChange> Changes
I need to eagerly load both of the collections as in the given scenario they are required for post processing and lazily loading them causes performance issues. I am eagerly loading them by a multi / future query:
var baseQuery = session.CreateCriteria<Base>("b")
.CreateCriteria("Nested", JoinType.LeftOuterJoin)
.CreateCriteria("Nested2", JoinType.LeftOuterJoin)
.CreateCriteria("Nested2.AdditionalNested", JoinType.LeftOuterJoin);
var logsQuery = session.CreateCriteria<Base>("b").CreateAlias("Changes", "c", JoinType.LeftOuterJoin,
Expression.And(Expression.Ge("c.EntryDate", changesStartDate), Expression.Le("c.EntryDate", changesEndDate)))
.AddOrder(Order.Desc("c.EntryDate"));
var attributesQuery = session.CreateCriteria<Base>("t").SetFetchMode("Attributes", FetchMode.Join);
logsQuery.Future<Base>();
attributesQuery.Future<Base>();
var results = baseQuery.Future<Base>().ToList();
The queries execute and return the correct results. But just to eagerly load the associations in this manner means that the attribute and changes queries have to perform a polymorphic fetch (the addition of about 15 left outer joins per query that aren't required). I know this is required for polymorphic querying but the base query will return the hierarchy that I desire. The other parts of the multi query that are issuing a polymorphic query are redundant.
I haven't yet mapped the whole of the hierarchy so there will be additional unecessary joins being performed and there are also other associations that could be loaded up front. These two combined without the addition of an increase in volume will lead to performance issues. The performance currently of this query is about 6 seconds (which admittedly is better than the 20 it's currently taking) but by messing around a bit with the query and taking out the extra joins I can get it down to about 2 seconds (this is a common query so getting it as low as possible is beneficial not just pleasing to me. It will also be run from multiple distributed machine so I would rather not get into a discussion about caching / 2nd level caching).
I have tried
using the class modifier in the query 'class = base'. I initially done this blindly but believe this is for discriminator values. Even if it is for the case statement in the SQL this will not prevent the extra joins.
Doing everything in a single query. This is slower than splitting it up as above and gives the cartesian product
Using 'Polymorphism.Explicit();' in the fluent mappings. This has no effect as I am using ClassMap with SubclassMaps so it is ignored. I tried changing all the maps to ClassMaps and using Join but this didn't give the desired behaviour.
Tried to trick nhibernate into joining the base class table onto itself for loading associations (basically load a more concrete type to prevent the polymorphic query) - create a derived class 'BaseOnlyLoading' which uses the same table and primary key as the base class. This was obviously a hack but I was just trying to see what's possible. NHibernate doesn't allow the class and sub class to use the same table.
Define the BaseOnlyLoadingMap to be a classmap with the same assocations as the BaseMap with a join back onto the Base. This was hopeful as assocation collections are resolved in the context based on full type name.
Use an interceptor which modifies the SQL that before it's execute. I wouldn't use this in production and just tried it out of interest. I passed an interceptor into a local session. This caused issues and I didn't proceed.
The HQL 'Type' query operator as explained here although I am not sure this has been implemented in the .NET version and might behave similarly to 1.
There is comment on highest rated answer (How to perform a non-polymorphic HQL query in Hibernate?) which suggest overriding the IsExplicitPolymorphism on the persister. I had a quick look and from what I remember the persister was either global per entity or created in the SessionImpl from a static factory which would prevent doing this. Even if this was possible I am not sure what sort of side effects this would have.
I tried using some SQL to load everything but even if I use a stored proc I am not sure how nhibernate will piece the graph back together. Maybe I could specify all the entities and aliases?
Specifying explicit per query would be nice. Any suggestions?
Thanks in advance.