Accessing virtual child object executes sql without WHERE clause - c#

I have tried to research why something that I believe should be a relatively straightforward concept is not working as expected. From what I have read, the following should work in the way I expect, but it doesn't.
I am retrieving data from an SQL database using entity framework, however I cannot get Lazy / Deferred loading to work when accessing child objects.
Here is my class
public class Product
{
[Key]
public int Id { get; set; }
public string Name { get; set; }
public string ShortDescription { get; set; }
public string FullDescription { get; set; }
public virtual IList<ProductEvent> Events { get; set; } = new List<ProductEvent>();
}
I then assign a variable to a Product
var product = ServiceController.ProductService.GetProduct(123456);
and the GetProduct method:
public Product GetProduct(int id, params Expression<Func<Product, object>>[] includedProperties)
{
IQueryable<Product> products = _db.Products;
if (includedProperties != null)
{
foreach (var includeProperty in includedProperties)
{
products = products.Include(includeProperty);
}
}
return products.Single(p => p.Id == id);
}
The generated SQL includes a WHERE clause when I call the method and pass in a Product ID, as expected. So far, so good.
My problem arises when I try to access a sub-set of Events which are related to this Product:
var desiredEvent = product.Events.SingleOrDefault(e => e.StartDateTime == new DateTime(2017, 7, 1, 2, 0, 0));
I would expect the generated SQL to contain a WHERE clause to only return Events with a matching StartDateTime, however the SQL does not contain any WHERE clause so all Product Events are loaded into memory and filtered there. Some of my Products have over 100,000 events so this is causing performance issues.
I cannot understand what is wrong with the code which is causing my filter to be ignored when Events are accessed.
Am I missing something fundamental here with the way EF handles queries? Or is the way I am accessing the Product in the first place causing the problem?
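A note on why the filter never reaches the database: product.Events is declared as IList<ProductEvent>, so SingleOrDefault binds to the LINQ to Objects overload, which takes a compiled delegate and can only run once the whole collection has been lazy loaded. Only an IQueryable source receives the predicate as an expression tree that a provider could translate into a WHERE clause. The following is a minimal, self-contained sketch of that difference; the events list and target value are hypothetical stand-ins, and no database or Entity Framework is involved:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for product.Events after lazy loading: every item is already in memory.
IList<DateTime> events = new List<DateTime>
{
    new DateTime(2017, 6, 30, 2, 0, 0),
    new DateTime(2017, 7, 1, 2, 0, 0),
    new DateTime(2017, 7, 2, 2, 0, 0),
};
var target = new DateTime(2017, 7, 1, 2, 0, 0);

// IList<T> is only IEnumerable<T>: this binds to Enumerable.SingleOrDefault(Func<T, bool>),
// so the predicate is a delegate that is run against each in-memory item.
DateTime inMemory = events.SingleOrDefault(e => e == target);

// IQueryable<T>: this binds to Queryable.SingleOrDefault(Expression<Func<T, bool>>),
// so the predicate is handed to the query provider as an expression tree.
IQueryable<DateTime> queryable = events.AsQueryable();
DateTime viaProvider = queryable.SingleOrDefault(e => e == target);

Console.WriteLine(inMemory == viaProvider);
// The filter is visible in the expression tree, which is what a provider such as EF inspects.
Console.WriteLine(queryable.Where(e => e == target).Expression);
```

With Entity Framework, the usual way to get the WHERE clause is to run the filter against a queryable source, for example the context's event set filtered by both the product key and StartDateTime, rather than against the already-typed navigation collection. Whether such a set and key property exist under those names in this model is an assumption.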

Related

Fastest and efficient way to get record using Entity Framework and C#?

I am a beginner and I am using a function which takes approximately 20 seconds to load a record. Is there another, more efficient and faster way to do this? How can I include the guest name along with the guest ID so that it works faster, rather than using a foreach loop to assign the guest name for each guest ID?
Data.Tables.Booking
[Table("Bookings")]
public class Booking : ITrackable
{
[Key]
[DatabaseGenerated(DatabaseGeneratedOption.Identity)]
public int Id { get; set; }
public int GuestId { get; set; }
public int RoomId { get; set; }
public DateTime CheckInDateTime { get; set; }
public DateTime CheckOutDateTime { get; set; }
public decimal DailyPricePerBed { get; set; }
[StringLength(1000)]
public string Memo { get; set; }
public string PriceType { get; set; }
public bool Paid { get; set; }
public DateTime CreatedAt { get; set; }
public DateTime ChangedAt { get; set; }
public string? InvoiceType { get; set; }
}
Booking model
public class Booking : BaseModel
{
[Required]
public int Id { get; set; }
[Required(ErrorMessage = Constants.ERROR_REQUIRED)]
[Display(Name = "Gast")]
public int GuestId { get; set; }
[Required(ErrorMessage = Constants.ERROR_REQUIRED)]
[Display(Name = "Zimmer")]
public int RoomId { get; set; }
[Required(ErrorMessage = Constants.ERROR_REQUIRED)]
[Display(Name = "Check-In")]
public DateTime CheckInDateTime { get; set; }
[Required(ErrorMessage = Constants.ERROR_REQUIRED)]
[Display(Name = "Check-Out")]
public DateTime CheckOutDateTime { get; set; }
[Required(ErrorMessage = Constants.ERROR_REQUIRED)]
[Display(Name = "Tagespreis pro Bett")]
public decimal DailyPricePerBed { get; set; }
[Required(ErrorMessage = Constants.ERROR_REQUIRED)]
[Display(Name = "Price Type required")]
public string PriceType { get; set; }
[StringLength(1000, ErrorMessage = Constants.ERROR_MAX_LENGHT)]
[Display(Name = "Sonstiges")]
public string Memo { get; set; }
[Display(Name = "Bezahlt")]
public bool Paid { get; set; }
public List<Guest> Guests { get; set; }
public List<Room> Rooms { get; set; }
public string GuestName { get; set; }
public string TitleBooking { get; set; }
public bool Selected { get; set; }
public int RoomNumber { get; set; }
public RoomType RoomType { get; set; }
public string EncryptedRoomId { get; set; }
public string EncryptedPartnerId { get; set; }
public string? InvoiceType { get; set; }
}
}
public List<Booking> Load(int roomId)
{
var result = _context.Bookings
.Where(item => item.RoomId == roomId)
.Select(item => _mapper.Map<Booking>(item))
.ToList();
// Loading each guest name here is what takes the time. If I remove
// this part it is fast, but the calendar then shows guest IDs and
// I want it to show guest names.
foreach (var booking in result)
{
if (_context.Guests.Any(o => o.Id == booking.GuestId)) // update
{
booking.GuestName = _guestRepository.GetName(booking.GuestId);
}
}
return result;
}
Your code will be slow for two main reasons. Firstly, you are selecting a set of Bookings then using Automapper to "Map" these across to copies of the Booking model class. If Bookings contain navigation property references to other entities and you have Lazy Loading enabled, this is quite likely resulting in a LOT of lazy load hits, as Mapper.Map "touches" each navigation property while it iterates through the Bookings to copy the object graph across. Secondly, you are then iterating over each booking to call the GuestRepository to get the guest name, and that could also be of varying efficiency. For instance, does the repository do something like:
return _context.Guests.Where(g => g.GuestId == guestId).Select(g => g.GuestName).Single();
or does it do something like:
var guest = _context.Guests.Single(g => g.GuestId == guestId);
return guest.GuestName;
The first runs an SQL Statement to retrieve one column for one row. The second reads all columns from Guest and builds a Guest entity for one row, just to return one value.
The first thing would be to ensure that you have navigation properties defined for all relationships, such as between Bookings and Guests.
From your example I'm guessing this code is part of a BookingRepository which has a dependency on a GuestRepository to get guest information. If there is one piece of advice I can give, it is to avoid the Generic Repository pattern in EF, or thinking of Repositories as serving individual entities. Instead, design Repositories to serve business needs, like a Controller in MVC. If I have a BookingController set up to serve everything to do with making/reviewing bookings, then I can have a BookingRepository to handle all Domain interactions for that Controller. Not just Booking entities, but everything needed for making/reviewing bookings.
The next thing is handling projection. Returning entities outside of the scope of a DbContext that reads them is generally not a great idea. Entities represent data domain. Views have their own concerns with regards to what data they need, and how they want to present it. They should have a purpose-built representation of the domain they are concerned with, a View Model.
So for instance if we have a Booking entity, it should be associated with a Room and a Guest. Each of these would be entities for their respective tables and linked by FKs within the Booking table. What the view is concerned with isn't everything in the Booking, Room, and Guest, just bits of details which can be flattened down from the respective tables and columns. For instance:
public class BookingViewModel
{
public int BookingId { get; set; }
public int RoomId { get; set; }
public int GuestId { get; set; }
public string RoomNumber { get; set;}
public string GuestName { get; set; }
}
Now when we fetch bookings for a given room, without diving into Repositories yet, just working with a DbContext and its entities:
var bookings = _context.Bookings
.Where(b => b.RoomId == roomId)
.Select(b => new BookingViewModel
{
BookingId = b.BookingId,
RoomId = b.Room.RoomId,
GuestId = b.Guest.GuestId,
RoomNumber = b.Room.RoomNumber,
GuestName = b.Guest.LastName + ", " + b.Guest.FirstName
}).ToList();
With Automapper, we can configure mapping rules for translating a Booking and its related structure into this BookingViewModel, and the above can be simplified to something looking like:
var bookings = _context.Bookings
.Where(b => b.RoomId == roomId)
.ProjectTo<BookingViewModel>(mapperConfig)
.ToList();
Where "mapperConfig" is an instance of a MapperConfiguration set up with the rules to translate Booking -> BookingViewModel. This could be one global Config, or a config built as requested by a factory method.
The benefit of either Select or ProjectTo is that the projection goes directly into the SQL query, so the only data returned is what is needed to populate the view model. There is no risk of lazy loading surprises, and no need to worry about tracked entities bogging down the DbContext.
When starting out I would get the hang of using the DbContext and projection without introducing a Repository pattern. The EF DbContext acts as both a Unit of Work and a Repository in a sense, and trying to abstract that fact from your application can mean introducing significant performance and flexibility penalties.
For introducing repositories I would recommend either a Repository pattern and Unit of Work pattern that leverage IQueryable so that callers can project details as they need, or having repositories that abstract the domain (entities) into the needs of the consumer (view models). IQueryable provides a lot of flexibility and makes Repositories easy to mock for testing, but such repositories are coupled to Entity Framework, as consumers need to know that fact and manage the DbContext's scope to use them effectively. Designing repositories that return ViewModels creates a cleaner boundary to isolate consumers from EF, but requires the Repository to have a larger footprint to accommodate all methods, variants, and concerns for the consumer(s), for instance supporting sorting, pagination, etc. Structuring Repositories to serve individual controllers can certainly help compared to repositories-per-entity that serve many controllers with different concerns.
Edit: If updating the result to a view model represents too big of a change, do keep these details in mind for future work because the approach your code base is using is highly inefficient. You can mitigate the problem to a degree with some smaller changes, including loading the entities detached and using Include to eager load the required relationships.
The first change would be moving the GuestName into a domain concern that the entity can resolve. Inside the Booking entity change:
public string GuestName { get; set; }
to:
private string _guestName = null;
public string GuestName
{
get
{
return _guestName ?? (_guestName = Guests.SingleOrDefault(g => g.GuestId == GuestId)?.Name);
}
set { _guestName = value; }
}
If you are using .NET 6 or 7 (the ??= operator needs C# 8 or later) this can be simplified to:
private string? _guestName = null;
public string? GuestName
{
get => _guestName ??= Guests.SingleOrDefault(item => item.GuestId == GuestId)?.Name;
set => _guestName = value;
}
Rather than going to the repository for every guest record to get the name, let the entity go to its Guests collection (if available). This is written so as not to break existing code, so any code that "Sets" the guest name will still take precedence.
Alternatively you could use the setter like you are and just go to booking.Guests to get the applicable guest's name rather than going to the repository:
foreach (var booking in result)
{
booking.GuestName = booking.Guests.FirstOrDefault(item => item.GuestId == booking.GuestId)?.Name;
}
If the guest name needs to be formatted from a Guest First and Last Name:
foreach (var booking in result)
{
var mainGuest = booking.Guests.FirstOrDefault(item => item.GuestId == booking.GuestId);
if (mainGuest == null) continue;
booking.GuestName = $"{mainGuest.LastName}, {mainGuest.FirstName}";
}
For methods like this to work, whether using the property or getting the guest from the Guests collection and using the Setter, the entity must have the Guests collection eager loaded, which is the next step:
Eager load any required details your view is going to need about the guest, and detach them:
var result = _context.Bookings
.Include(item => item.Guests)
.Where(item => item.RoomId == roomId)
.AsNoTracking()
.ToList();
This will do something similar to your original code, but it will eager load the Guests collection, and it will detach the resulting entities so that they won't be lazy-loadable proxies or have changes tracked by the DbContext. The issue with using Mapper.Map without AsNoTracking is that if your DbContext is set up to use lazy loading, the Mapper.Map call will go through each property and, when it hits a navigation property, trigger a lazy load. This will ensure that all data is mapped, but it is extremely slow and inefficient. The above example eager loads the Guests collection. If there are other navigation properties your view will touch, these will very likely now be null, so you will need to ensure they are eager loaded using Include as well.
Eager loading with Include does come with some performance issues when dealing with one-to-many relationships, especially eager loading several one-to-many relationships in that it produces Cartesian Products where the total volume of data grows by factors with the more relationships you load. This will typically be faster than lazy loading, but still represents a significant resource and performance cost. This is why projecting to a view model is recommended. You will still generate a Cartesian Product, but across far fewer fields as the projection only selects the fields you actually need rather than everything in the associated table. If you are using Eager Loading, eager load only what you know you will need, not everything, or you will soon be facing similar performance and memory bloat problems.
As comments have pointed out, you have not specified which of the parts of that function is consuming the time, nor is it clear what your model looks like.
However, I shall make two assumptions:
the problem is that you are looping through result and then hitting the database for each booking, perhaps twice.
the Guest name property is in the Guests entity / table.
If so, something like the following should speed it up by minimising the database calls to two for the whole function.
// Database Call 1
var result = _context.Bookings
.Where(item => item.RoomId == roomId)
.Select(item => _mapper.Map<Booking>(item))
.ToList();
// Get the Guest Ids used in the booking retrieved above
List<int> guestIds = result.Select(b => b.GuestId).Distinct().ToList();
// Get all Guests that are used - Database Call 2
List<Guest> guests = _context.Guests.Where(g => guestIds.Contains(g.Id)).ToList();
// Now put the name on the bookings
foreach (var booking in result)
{
Guest? guest = guests.SingleOrDefault(g => g.Id == booking.GuestId);
if (guest is not null)
{
booking.GuestName = guest.Name;
}
}
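One small refinement to the loop above, offered as a sketch rather than a required change: SingleOrDefault scans the guests list once per booking, so building a dictionary keyed by guest Id makes each lookup constant time. The tuples below are hypothetical stand-ins for the rows returned by the two database calls:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the results of database call 1 and database call 2.
var bookings = new List<(int Id, int GuestId)> { (1, 10), (2, 11), (3, 10), (4, 99) };
var guests = new List<(int Id, string Name)> { (10, "Ada"), (11, "Grace") };

// Build the lookup once instead of scanning the guest list for every booking.
Dictionary<int, string> nameById = guests.ToDictionary(g => g.Id, g => g.Name);

// Attach the name where a matching guest exists; otherwise leave it null.
var named = bookings
    .Select(b => (b.Id, GuestName: nameById.TryGetValue(b.GuestId, out var name) ? name : null))
    .ToList();

foreach (var booking in named)
{
    Console.WriteLine($"{booking.Id}: {booking.GuestName ?? "(unknown guest)"}");
}
```

The database access stays at two calls; only the in-memory matching changes.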
Alternatively, if you have a navigation property to guest from booking, then you can retrieve the guest along with the booking in a single database call like so:
var result = _context.Bookings
.Include(b => b.Guest)
.Where(b => b.RoomId == roomId)
... etc.

Loading Parent Child data using Entity Framework stored procedure

I have two stored procedures, one to load parent data and the other to load child data. Is there any way to do this in a better way? Please see the code below.
public partial class Product
{
public long Id { get; set; }
public string ProductId { get; set; }
public string Name { get; set; }
public string ProductionNo { get; set; }
public string UniqueIdentificationNo { get; set; }
public virtual ICollection<Process> Processes { get; set; }
}
public List<GetProductsBetween_Result> GetProductsBetween(DateTime? startDate, DateTime? endDate)
{
var products = DbContext.GetProductsBetween(startDate, endDate).ToList();
foreach(var product in products)
{
product.Processes = DbContext.GetWaitingTime(product.Id).ToList();
}
return products;
}
This entire scenario can be rewritten using LINQ (Language-Integrated Query).
You can apply eager loading to fetch the Processes of a product. This is similar
to the SQL joins that LINQ generates, so the code is optimised and the foreach loops
and round trips to the server can be reduced.
var products = from u in dbContext.Product.Include(u => u.Processes)
select new ProductView
{
item1 = XXX,
item2 = yyy
};
// Add conditions (some lambda expression) to filter by date before materialising the results.
var newData = products.Where(/* some lambda expression */).ToList();
I had the same issue with a really complex search that had to return the full tree of an object.
What you need to do is:
Make a stored procedure that returns in only one call multiple result sets for the root object and for all its dependent objects:
i.e. (add your own conditions)
SELECT p.* FROM Products p WHERE p.Name LIKE '%test%';
SELECT DISTINCT r.* FROM Processes r INNER JOIN Products p ON p.id = r.idProduct WHERE p.Name LIKE '%test%';
After the call returns, re-attach each parent object to its children as described in this article:
https://blogs.infosupport.com/ado-net-entity-framework-advanced-scenarios-working-with-stored-procedures-that-return-multiple-resultsets/
You will need to iterate each Product from the first result set and attach its Processes.
This should be much faster since you make only one round trip to the database and run fewer queries.
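The re-attachment step can be sketched without any database access. Assuming the two result sets have already been read into memory (the tuples below are hypothetical stand-ins, with IdProduct matching the join column used in the SQL above), ToLookup groups the child rows once so each parent picks up its children without a nested scan:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows from the first result set (products) and the second (processes).
var products = new List<(long Id, string Name)> { (1, "test A"), (2, "test B") };
var processes = new List<(long Id, long IdProduct, string Step)>
{
    (100, 1, "Cutting"),
    (101, 1, "Painting"),
};

// Group the child rows by their parent key in a single pass.
var processesByProduct = processes.ToLookup(r => r.IdProduct);

// Pair each product with its processes; a key with no rows yields an empty sequence.
var tree = products
    .Select(p => (Product: p, Processes: processesByProduct[p.Id].ToList()))
    .ToList();

foreach (var node in tree)
{
    Console.WriteLine($"{node.Product.Name}: {node.Processes.Count} process(es)");
}
```

With real entities the same lookup would be used to fill each Product's Processes collection instead of building a tuple.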

Linq query to get parent id, grandparent id, etc right up to the root id

I have the following class
public class Category
{
public virtual int ID { get; set; }
public virtual string Name { get; set; }
public virtual int? ParentID { get; set; }
public virtual IList<Category> Children { get; set; }
}
I am mapping a database table with a self-referencing foreign key relationship, with fields id, name and parent id, to this class using NHibernate (though it could be Entity Framework; the ORM doesn't really matter).
Given any category I need a method/query that gives me the category parentid, grandparentid etc. I think one way of doing it is using a method that works recursively extracting successive parentids and stopping when it hits a null parentid. The best I've come up with so far is to do something along the lines of
newcategory = Load<Category>(category.parentId)
add newcategoryid to list
category = newcategory
repeat until category.parentid is null
But I wondered if there could be performance problems with something like this as it could involve numerous trips to the database.
Entity framework would probably have a hard time with this as you mentioned. You could load everything into memory for that table and then use LINQ to recurse up the hierarchy.
var categories = from c in category
select c;
// ToList executes the SQL statement once
var catList = categories.ToList();
// recurse over the in-memory list
Or you could create a SQL stored procedure and just call it with EF.
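The in-memory walk left as a comment above could look something like the following sketch. It assumes the whole table fits comfortably in memory; the rows are hypothetical (ID, ParentID) pairs, and AncestorIds is an illustrative helper name, not part of NHibernate or EF:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows loaded with a single query: (ID, ParentID), where null marks a root.
var rows = new List<(int ID, int? ParentID)> { (1, null), (2, 1), (3, 2), (4, 2) };

// Index each category's parent by its own ID for constant-time lookups.
Dictionary<int, int?> parentOf = rows.ToDictionary(c => c.ID, c => c.ParentID);

// Walks from the given category up to the root, collecting parent, grandparent, and so on.
List<int> AncestorIds(int id)
{
    var result = new List<int>();
    var seen = new HashSet<int> { id }; // guards against accidental cycles in the data
    while (parentOf.TryGetValue(id, out var parentId) && parentId is int p && seen.Add(p))
    {
        result.Add(p);
        id = p;
    }
    return result;
}

Console.WriteLine(string.Join(", ", AncestorIds(3)));
```

The loop stops at the first null ParentID, at an ID that is missing from the table, or if the data loops back on itself.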

How to write a LINQ join for two classes in ASP.NET MVC4

I am having trouble accessing a property from a joined table, using EF Code First. Here is my query in my home controller:
var pcs = from h in db.Hardwares
join hwt in db.HardwareTypes
on h.HardwareTypeId equals hwt.Id
where h.HardwareType.HType == "PC"
select new { Hardware = h, HardwareType = hwt };
ViewBag.Pcs = pcs.ToList();
Here is my HardwareTypes class:
public class HardwareType
{
public int Id { get; set; }
//[Required]
//[StringLength(128)]
public string HType { get; set; }
}
Here is my Hardware class:
public class Hardware
{
public int Id { get; set; }
public int HardwareTypeId { get; set;}
public virtual HardwareType HardwareType { get; set; }
}
How do I change my LINQ query so HType is in my ViewBag? The database seems to be generating correctly, it's just I can't seem to access HType. I get an error 'object' does not contain a definition for 'HType'
Ok, time for a full reset. Based on the chat, I want you to try the following:
1) Keep your original LINQ query, but don't make it an anonymous type. Create another object type that will have a Hardware and a HardwareType property in it. This will make Pcs (as well as ViewBag.Pcs) a List of that particular type.
2) Put the following foreach loop in your View code:
foreach (var item in ViewBag.Pcs as List<YourTypeName>)
{
var htype = item.HardwareType.HType;
}
And if you don't want to do some lazy loading each time through the loop, you can do the following in the controller to just load up the entities right away:
var hardwareTypes = pcs.Select(p => p.HardwareType).ToList();
Let me know if that works.

Linq-to-SQL Load 1:1 Relations in a single query

We have several classes with multiple 1:1 Relationships for quick joins, and while this works fine for anonymous types for tabular display, I'm unsure how to fully populate the type in a single linq query.
We have these properties either because it's an off 1:1, or we don't want to query through a child collection to find a "primary" every display, we instead incur the cost by setting these Primary IDs on save.
A stripped down example for the context of this post:
public class Contact
{
public long Id { get; set; }
public EntitySet<Address> Addresses { get; set; }
public EntityRef<Address> PrimaryAddress { get; set; }
public long? PrimaryAddressId { get; set; }
public EntitySet<Email> Emails { get; set; }
public EntityRef<Email> PrimaryEmail { get; set; }
public long? PrimaryEmailId { get; set; }
public string FirstName { get; set; }
public string LastName { get; set; }
}
public class Address
{
public long Id { get; set; }
public EntitySet<Contact> Contacts { get; set; }
public bool IsPrimary { get; set; }
public string Street1 { get; set; }
public string Street2 { get; set; }
public string City { get; set; }
public string State { get; set; }
public string Country { get; set; }
}
public class Email
{
public long Id { get; set; }
public EntitySet<Contact> Contacts { get; set; }
public bool IsPrimary { get; set; }
public string Address { get; set; }
}
The problem is when displaying a list of contacts, the PrimaryAddress and PrimaryEmail have to be lazy loaded. If we do DataLoadOptions it doesn't give the desired effect either since it's a 1:1, example:
var DB = new DataContext();
var dlo = new DataLoadOptions();
dlo.LoadWith<Contact>(c => c.PrimaryAddress);
dlo.LoadWith<Contact>(c => c.PrimaryEmail);
DB.LoadOptions = dlo;
var result = from c in DB.Contacts select c;
result.ToList();
The above code results in a INNER JOIN since it treats it like a parent relationship, it doesn't respect the nullable FK relationship and left join the 1:1 properties. The desired query would be something like:
Select t1.*, t2.*, t3.*
From Contact t1
Left Join Address t2 On t1.PrimaryAddressId = t2.Id
Left Join Email t3 On t1.PrimaryEmailId = t3.Id
Is there a way to do this and get a IQueryable with these nullable 1:1 properties populated, or even a List? Due to other constraints, we need the type to be Contact, so anonymous types won't work. Pretty open to options, anything would be better than lazy loading n*(number of 1:1s)+1 queries for the number of rows we display.
Update: Finally got around to updating this, the devart guys have fixed the behavior in later versions to work perfectly. There's no need for DataLoadOptions at all, just using fields off the table works, for example:
var DB = new DataContext();
var result = from c in DB.Contacts
select new {
c.Id,
c.FirstName,
c.LastName,
Address = c.PrimaryAddress.Street1 + " " + c.PrimaryAddress.Street2, //...
Email = c.PrimaryEmail.Address
};
This correctly performs a single left outer join to the related Address and Email tables. Now the fix is specific to the situation here of getting this anonymous type...but they also fixed the DataLoadOptions behavior where we do need it, correctly keyed off the foreign key type now. Hope this update helps others on an older version...I highly recommend upgrading, there are lots of new enhancements in versions since 5.35 (many making life much easier).
Original:
What we ended up with was a different approach. This may be specific behavior to the devart: dotConnect for Oracle provider (as of version 5.35.62, if this behavior changes I'll try and update this question).
var DB = new DataContext();
var result = from c in DB.Contacts
select new {
c.Id,
c.FirstName,
c.LastName,
Address = new AddressLite {
Street1 = c.PrimaryAddress.Street1,
Street2 = c.PrimaryAddress.Street2,
City = c.PrimaryAddress.City,
State = c.PrimaryAddress.State,
Country = c.PrimaryAddress.Country },
Email = c.PrimaryEmail.Address
};
result.ToList();
This results in a single query. Referencing a child object in the select, e.g. c.PrimaryAddress, does not cause a join to occur (resulting in a lot of select ... from address where id = n lazy loads, one per row of tabular data we're displaying); calling a property on it, however, e.g. c.PrimaryAddress.Street1, DOES cause a correct left join to the address table in the query. The linq above works only in linq-to-sql, it would fail with a null reference on linq-to-entities, but...in the case we're dealing with that's fine.
The good:
Single query, producing left joins to Address and Email
Lightweight objects for address and down to just a string for email (they both have some back-reference EntitySet in the real project, making them more expensive than necessary for simple tabular display needs)
Fast/clean, the above is a much simpler query than manually joining every child table we were doing, cleaner code.
Performance, the creation of the heavier objects was quite a hit, changing from Email to string, Address to AddressLite and (in the full project) Phone to PhoneLite resulted in pages just displaying tabular data going from 300-500ms down to 50-100ms.
The Bad:
Anonymous type, there are cases where we need a strong type, and having to create those (even as quick as ReSharper makes this task) adds a lot of clutter.
We can't modify and save an anonymous type, or any type we create ourselves, without a good deal of annotation work, and those annotations must be updated whenever the model changes (since those classes aren't generated).
Left join is generated if IsForeignKey is set to false in the association attribute for the EntityRef-typed property.
We came up against much the same problem with the DataLoadOptions, lazy loading and your primary records.
To be honest I'm not totally happy with the solution we came up with as it's not exactly very neat, and the SQL query it produces can be complicated, but essentially we created wrapper classes with copies of the fields we wanted to force load and used sub queries to load in the records. For your example above:
public class ContactWithPrimary
{
public Contact Contact { get; set; }
public Email PrimaryEmail { get; set; }
public Address PrimaryAddress { get; set; }
}
Then an example LINQ query would be:
List<ContactWithPrimary> Contacts = DataContext.Contacts
.Select(con => new ContactWithPrimary
{
Contact = con,
PrimaryEmail = con.PrimaryEmail,
PrimaryAddress = con.PrimaryAddress
}).ToList();
What it does do however is pull it out in a single query.
You might want to take a look at Rob Conery's Lazy List implementation.
http://blog.wekeroad.com/blog/lazy-loading-with-the-lazylist/
It basically hides the entire lazy loading implementation from you and you don't need to specify any loading options.
The only drawback is that it only works for lists. It is however possible to write an implementation for properties as well. Here is my effort.
public class LazyProperty<TEntityType> where TEntityType : class
{
private readonly IQueryable<TEntityType> source;
private bool loaded;
private TEntityType entity;
public LazyProperty()
{
loaded = true;
}
public LazyProperty(IQueryable<TEntityType> source)
{
this.source = source;
}
public TEntityType Entity
{
get
{
if (!loaded)
{
entity = source.SingleOrDefault();
loaded = true;
}
return entity;
}
set
{
entity = value;
loaded = true;
}
}
}
