How to load multiple result sets with one Entity Framework call? - c#

While learning Entity Framework 6, I have hit an obstacle and am unsure how to handle it. When building an API, a user may want a specific endpoint that requires access to multiple tables (a fake entity, since it has no real table association). Below are some fake DbSets and a sample class.
I am looking for a way to load all of these tables' data (with Where clauses) in one query. I was doing three separate calls, but I don't think this is the best way.
var anonObject = new AnonClass()
{
SomeItems = await Context.Table1.Where(t => t.Something == true).ToListAsync(),
SomeItems2 = await Context.Table2.Where(t => t.Something == true).ToListAsync(),
SomeItems3 = await Context.Table3.Where(t => t.Something == true).ToListAsync()
};
DbSet<Table1> Table1;
DbSet<Table2> Table2;
DbSet<Table3> Table3;
public sealed class AnonClass
{
public IEnumerable<Table1> SomeItems;
public IEnumerable<Table2> SomeItems2;
public IEnumerable<Table3> SomeItems3;
}
Each of these is an individual call; I want them all in one.

Related

EF 6 - Performance of GroupBy

I don't have a problem currently, but I want to make sure that the performance is not too shabby for my use case. My search of Microsoft's documentation turned up nothing.
I have an entity named Reservation. I now want to add some statistics to the program, where I can see some metrics about the reservations (reservations per month and favorite spot/seat in particular).
Therefore, my first approach was the following:
public async Task<ICollection<StatisticElement<Seat>>> GetSeatUsage(Company company)
{
var allReservations = await this.reservationService.GetAll(company);
return await this.FetchGroupedSeatData(allReservations, company);
}
public async Task<ICollection<StatisticElement<DateTime>>> GetMonthlyReservations(Company company)
{
var allReservations = await this.reservationService.GetAll(company);
return this.FetchGroupedReservationData(allReservations);
}
private async Task<ICollection<StatisticElement<Seat>>> FetchGroupedSeatData(
IEnumerable<Reservation> reservations,
Company company)
{
var groupedReservations = reservations.GroupBy(r => r.SeatId).ToList();
var companySeats = await this.seatService.GetAll(company);
return (from companySeat in companySeats
let groupedReservation = groupedReservations.FirstOrDefault(s => s.Key == companySeat.Id)
select new StatisticElement<Seat>()
{
Value = companySeat,
StatisticalCount = groupedReservation?.Count() ?? 0,
}).OrderByDescending(s => s.StatisticalCount).ToList();
}
private ICollection<StatisticElement<DateTime>> FetchGroupedReservationData(IEnumerable<Reservation> reservations)
{
var groupedReservations = reservations.GroupBy(r => new { Month = r.Date.Month, Year = r.Date.Year }).ToList();
return groupedReservations.Select(
groupedReservation => new StatisticElement<DateTime>()
{
Value = new DateTime(groupedReservation.Key.Year, groupedReservation.Key.Month, 1),
StatisticalCount = groupedReservation.Count(),
}).
OrderBy(s => s.Value).
ToList();
}
To explain the code a little bit: With GetSeatUsage and GetMonthlyReservations I can get the above mentioned data of a company. Therefore, I fetch ALL reservations at first (with reservationService.GetAll) - this is the point, where I think the performance will be a problem in the future.
Afterwards, I call either FetchGroupedSeatData or FetchGroupedReservationData, which first groups the reservations I previously fetched from the database and then converts them in a, for me, usable format.
As I said, I think the group by after I have read ALL the data from the database MIGHT be a problem, but I cannot find any information regarding performance in the documentation.
My other idea was to create a new method in my ReservationService that returns the already grouped list. But, again, I can't find out whether EF adds the GroupBy to the database query or performs it only after all of the data has been read from the database. This method would look something like this:
return await this.Context.Set<Reservation>().Where(r => r.User.CompanyId == company.Id).GroupBy(r => r.SeatId).ToListAsync();
Is this already the solution? Where can I check that? Am I missing something completely obvious?
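A minimal sketch of the grouped-count query shape, assuming a simplified Reservation type (hypothetical, reduced to two properties) and an in-memory IQueryable as a stand-in for the DbSet. When the grouping is followed by a projection to aggregates such as Count(), Entity Framework 6 can generally translate the whole expression into a SQL GROUP BY, whereas materializing the groups themselves still brings every row back. In EF6, assigning Console.Write to context.Database.Log prints the generated SQL, which is one way to check what is actually sent.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for the Reservation entity.
public class Reservation
{
    public int SeatId { get; set; }
    public DateTime Date { get; set; }
}

// Hypothetical result type: one row per seat.
public class SeatCount
{
    public int SeatId { get; set; }
    public int Count { get; set; }
}

public static class ReservationStatistics
{
    // Groups and counts inside the query, so only one row per seat is produced.
    public static List<SeatCount> CountPerSeat(IQueryable<Reservation> reservations)
    {
        return reservations
            .GroupBy(r => r.SeatId)
            .Select(g => new SeatCount { SeatId = g.Key, Count = g.Count() })
            .OrderByDescending(s => s.Count)
            .ToList();
    }
}
```

With a real context, the same method could be handed the filtered DbSet query instead of the in-memory list. Seats without any reservation do not appear in the result, so they would still need to be filled in with a count of 0 afterwards.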

LINQ querying using an entity object not in the database - Error: LINQ to Entities does not recognize the method

I have an Entity Framework v5 model created from a database. The table Season has a corresponding entity called Season. I need to calculate the Season's minimum start date and maximum end date for each year for a Project_Group. I then need to be able to JOIN those yearly min/max Season values in other LINQ queries. To do so, I have created a SeasonLimits class in my Data Access Layer project. (A SeasonLimits table does not exist in the database.)
public partial class SeasonLimits : EntityObject
{
public int Year { get; set; }
public DateTime Min_Start_Date { get; set; }
public int Min_Start_Date_ID { get; set; }
public DateTime Max_End_Date { get; set; }
public int Max_End_Date_ID { get; set; }
public static IQueryable<SeasonLimits> QuerySeasonLimits(MyEntities context, int project_Group_ID)
{
return context
.Season
.Where(s => s.Locations.Project.Project_Group.Any(pg => pg.Project_Group_ID == project_Group_ID))
.GroupBy(x => x.Year)
.Select(sl => new SeasonLimits
{
Year = sl.Key,
Min_Start_Date = sl.Min(d => d.Start_Date),
Min_Start_Date_ID = sl.Min(d => d.Start_Date_ID),
Max_End_Date = sl.Max(d => d.End_Date),
Max_End_Date_ID = sl.Max(d => d.End_Date_ID)
});
}
}
// MVC Project
var seasonHoursByYear =
from d in context.AuxiliaryDateHours
from sl in SeasonLimits.QuerySeasonLimits(context, pg.Project_Group_ID)
where d.Date_ID >= sl.Min_Start_Date_ID
&& d.Date_ID < sl.Max_End_Date_ID
group d by new
{
d.Year
} into grp4
orderby grp4.Key.Year
select new
{
Year = grp4.Key.Year,
HoursInYear = grp4.Count()
};
In my MVC project, whenever I attempt to use the QuerySeasonLimits method in a LINQ query JOIN, I receive the message,
"LINQ to Entities does not recognize the method
'System.Linq.IQueryable`1[MyDAL.SeasonLimits]
QuerySeasonLimits(MyDAL.MyEntities, MyDAL.Project_Group)' method, and
this method cannot be translated into a store expression."
Is this error being generated because SeasonLimits is not an entity that exists in the database? If this can't be done this way, is there another way to reference the logic so that it can be used in other LINQ queries?
EF is trying to translate your query to SQL and as there is no direct mapping between your method and the generated SQL you're getting the error.
First option would be not to use the method and instead write the contents of the method directly in the original query (I'm not sure at the moment if this would work, as I don't have a VS running). In the case this would work, you'll most likely end up with a very complicated SQL with a poor performance.
So here comes the second option: don't be afraid to use multiple queries to get what you need. Sometimes it also makes sense to send a simpler query to the DB and continue with modifications (aggregation, selection, ...) in the C# code. The query gets translated to SQL and executed every time you enumerate over it, call one of the ToList, ToDictionary, ToArray or ToLookup methods, or use First, FirstOrDefault, Single or SingleOrDefault (see the LINQ documentation for the specifics).
One possible example that could fix your query (but most likely is not the best solution) is to start your query with:
var seasonHoursByYear =
from d in context.AuxiliaryDateHours.ToList()
[...]
and continue with all the rest. This minor change has a fundamental impact:
by calling ToList, the DB will be queried immediately and the whole AuxiliaryDateHours table will be loaded into the application (this will be a performance problem if the table has too many rows);
a second query will be generated when calling your QuerySeasonLimits method (you could/should also include a ToList call for that);
the rest of the seasonHoursByYear query (where, grouping, ...) will happen in memory.
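The difference between building a query and executing it can be seen with plain LINQ to Objects. The sketch below is only an in-memory illustration (the class and method names are made up for this example): it counts how often the filter runs before and after ToList is called. With Entity Framework, the analogous moment is when the SQL is sent to the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionDemo
{
    // Returns how many times the filter ran before and after materialization,
    // plus the number of matching items.
    public static (int Before, int After, int Matches) Run()
    {
        int evaluations = 0;
        var source = new List<int> { 1, 2, 3, 4 };

        // Building the query does not run the predicate yet.
        IEnumerable<int> query = source.Where(n =>
        {
            evaluations++;
            return n % 2 == 0;
        });
        int before = evaluations;

        // ToList enumerates the query, which is when the work happens.
        int matches = query.ToList().Count;
        int after = evaluations;

        return (before, after, matches);
    }
}
```

Calling DeferredExecutionDemo.Run() returns (0, 4, 2): nothing is evaluated while the query is being composed, and all four items are tested once ToList runs.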
There are a couple of other points that might be unrelated at this point.
I haven't really investigated the intent of your code - as this could lead to further optimizations - even total reworks that could bring you more gains in the end...
I eliminated the SeasonLimits object and the QuerySeasonLimits method, and wrote the contents of the method directly in the original query.
// MVC Project
var seasonLimits =
from s in context.Season
.Where(s => s.Locations.Project.Project_Group.Any(pg => pg.Project_Group_ID == Project_Group_ID))
group s by new
{
s.Year
} into grp
select new
{
grp.Key.Year,
Min_Start_Date = grp.Min(x => x.Start_Date),
Min_Start_Date_ID = grp.Min(x => x.Start_Date_ID),
Max_End_Date = grp.Max(x => x.End_Date),
Max_End_Date_ID = grp.Max(x => x.End_Date_ID)
};
var seasonHoursByYear =
from d in context.AuxiliaryDateHours
from sl in seasonLimits
where d.Date_ID >= sl.Min_Start_Date_ID
&& d.Date_ID < sl.Max_End_Date_ID
group d by new
{
d.Year
} into grp4
orderby grp4.Key.Year
select new
{
Year = grp4.Key.Year,
HoursInYear = grp4.Count()
};

NHibernate query extremely slow compared to hard coded SQL query

I'm re-writing some of my old NHibernate code to be more database agnostic and use NHibernate queries rather than hard coded SELECT statements or database views. I'm stuck with one that's incredibly slow after being re-written. The SQL query is as such:
SELECT
r.recipeingredientid AS id,
r.ingredientid,
r.recipeid,
r.qty,
r.unit,
i.conversiontype,
i.unitweight,
f.unittype,
f.formamount,
f.formunit
FROM recipeingredients r
INNER JOIN shoppingingredients i USING (ingredientid)
LEFT JOIN ingredientforms f USING (ingredientformid)
So, it's a pretty basic query with a couple JOINs that selects a few columns from each table. This query happens to return about 400,000 rows and has roughly a 5 second execution time. My first attempt to express it as an NHibernate query was as such:
var timer = new System.Diagnostics.Stopwatch();
timer.Start();
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.Fetch(prop => prop.Ingredient).Eager()
.Fetch(prop => prop.IngredientForm).Eager()
.List();
timer.Stop();
This code works and generates the desired SQL, however it takes 120,264ms to run. After that, I loop through recIngs and populate a List<T> collection, which takes under a second. So, something NHibernate is doing is extremely slow! I have a feeling this is simply the overhead of constructing instances of my model classes for each row. However, in my case, I'm only using a couple properties from each table, so maybe I can optimize this.
The first thing I tried was this:
IngredientForms joinForm = null;
Ingredients joinIng = null;
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.JoinAlias(r => r.IngredientForm, () => joinForm)
.JoinAlias(r => r.Ingredient, () => joinIng)
.Select(r => joinForm.FormDisplayName)
.List<String>();
Here, I just grab a single value from one of my JOIN'ed tables. The SQL code is once again correct and this time it only grabs the FormDisplayName column in the select clause. This call takes 2498ms to run. I think we're on to something!!
However, I of course need to return several different columns, not just one. Here's where things get tricky. My first attempt is an anonymous type:
.Select(r => new { DisplayName = joinForm.FormDisplayName, IngName = joinIng.DisplayName })
Ideally, this should return a collection of anonymous types with both a DisplayName and an IngName property. However, this causes an exception in NHibernate:
Object reference not set to an instance of an object.
Plus, .List() is trying to return a list of RecipeIngredients, not anonymous types. I also tried .List<Object>() to no avail. Hmm. Well, perhaps I can create a new type and return a collection of those:
.Select(r => new TestType(r))
The TestType construction would take a RecipeIngredients object and do whatever. However, when I do this, NHibernate throws the following exception:
An unhandled exception of type 'NHibernate.MappingException' occurred
in NHibernate.dll
Additional information: No persister for: KitchenPC.Modeler.TestType
I guess NHibernate wants to generate a model matching the schema of RecipeIngredients.
How can I do what I'm trying to do? It seems that .Select() can only be used for selecting a list of a single column. Is there a way to use it to select multiple columns?
Perhaps one way would be to create a model with my exact schema, however I think that would end up being just as slow as the original attempt.
Is there any way to return this much data from the server without the massive overhead, without hard coding a SQL string into the program or depending on a VIEW in the database? I'd like to keep my code completely database agnostic. Thanks!
The QueryOver syntax for converting selected columns into an artificial object (DTO) is a bit different. See section 16.6, Projections, of the NHibernate documentation for more details and a nice example.
A draft could look like this. First, the DTO:
public class TestTypeDTO // the DTO
{
public string PropertyStr1 { get; set; }
...
public int PropertyNum1 { get; set; }
...
}
And this is an example of the usage
// DTO marker
TestTypeDTO dto = null;
// the query you need
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.JoinAlias(r => r.IngredientForm, () => joinForm)
.JoinAlias(r => r.Ingredient, () => joinIng)
// place for projections
.SelectList(list => list
// this set is an example of string and int
.Select(x => joinForm.FormDisplayName)
.WithAlias(() => dto.PropertyStr1) // this WithAlias is essential
.Select(x => joinIng.Weight) // it will help the below transformer
.WithAlias(() => dto.PropertyNum1)) // with conversion
...
.TransformUsing(Transformers.AliasToBean<TestTypeDTO>())
.List<TestTypeDTO>();
So, I came up with my own solution that's a bit of a mix between Radim's solution (using the AliasToBean transformer with a DTO) and Jake's solution (selecting raw properties and converting each row to a list of object[] tuples).
My code is as follows:
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.JoinAlias(r => r.IngredientForm, () => joinForm)
.JoinAlias(r => r.Ingredient, () => joinIng)
.Select(
p => joinIng.IngredientId,
p => p.Recipe.RecipeId,
p => p.Qty,
p => p.Unit,
p => joinIng.ConversionType,
p => joinIng.UnitWeight,
p => joinForm.UnitType,
p => joinForm.FormAmount,
p => joinForm.FormUnit)
.TransformUsing(IngredientGraphTransformer.Create())
.List<IngredientBinding>();
I then implemented a new class called IngredientGraphTransformer which can convert that object[] array into a list of IngredientBinding objects, which is what I was ultimately doing with this list anyway. This is exactly how AliasToBeanTransformer is implemented, only it initializes a DTO based on a list of aliases.
public class IngredientGraphTransformer : IResultTransformer
{
public static IngredientGraphTransformer Create()
{
return new IngredientGraphTransformer();
}
IngredientGraphTransformer()
{
}
public IList TransformList(IList collection)
{
return collection;
}
public object TransformTuple(object[] tuple, string[] aliases)
{
Guid ingId = (Guid)tuple[0];
Guid recipeId = (Guid)tuple[1];
Single? qty = (Single?)tuple[2];
Units usageUnit = (Units)tuple[3];
UnitType convType = (UnitType)tuple[4];
Int32 unitWeight = (int)tuple[5];
Units rawUnit = Unit.GetDefaultUnitType(convType);
// Do a bunch of logic based on the data above
return new IngredientBinding
{
RecipeId = recipeId,
IngredientId = ingId,
Qty = qty,
Unit = rawUnit
};
}
}
Note, this is not as fast as doing a raw SQL query and looping through the results with an IDataReader, however it's much faster than joining in all the various models and building the full set of data.
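To make the alias-based idea mentioned above more concrete, here is a rough sketch of mapping one object[] row onto a DTO by matching aliases to property names via reflection. This is not NHibernate's actual implementation; AliasMapper and IngredientRow are hypothetical names used only for illustration, and the code has no NHibernate dependency.

```csharp
using System;
using System.Reflection;

// Hypothetical DTO used only for this illustration.
public class IngredientRow
{
    public string DisplayName { get; set; }
    public int UnitWeight { get; set; }
}

public static class AliasMapper
{
    // Copies each tuple value into the writable property whose name matches its alias.
    public static T Map<T>(object[] tuple, string[] aliases) where T : class, new()
    {
        var result = new T();
        for (int i = 0; i < aliases.Length; i++)
        {
            if (aliases[i] == null) continue; // projections without an alias are skipped

            PropertyInfo property = typeof(T).GetProperty(aliases[i]);
            if (property != null && property.CanWrite)
            {
                property.SetValue(result, tuple[i], null);
            }
        }
        return result;
    }
}
```

A hand-written transformer like IngredientGraphTransformer skips the reflection lookup and reads the tuple by position, which trades flexibility for less per-row overhead.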
IngredientForms joinForm = null;
Ingredients joinIng = null;
var recIngs = session.QueryOver<Models.RecipeIngredients>()
.JoinAlias(r => r.IngredientForm, () => joinForm)
.JoinAlias(r => r.Ingredient, () => joinIng)
.Select(r => r.column1, r => r.column2)
.List<object[]>();
Would this work?

Conditionally Include() in Entity Framework

I'm using EF4.3 with DbContext.
I have an entity that I store in cache, so I need to eager load the necessary data before converting to a list and popping it in cache.
My database is normalised so data is spread over several tables. The base entity is "User", a User may or may not be a "Subscriber" and a Subscriber can be one of 3 types "Contributor", "Member" or "Administrator"
At present the whole fetch is not very elegant due to my lack of knowledge in EF, Linq et al.
public static User Get(Guid userId)
{
Guard.ThrowIfDefault(userId, "userId");
var r = new CrudRepo<User>(Local.Items.Uow.Context);
var u = r.FindBy(x => x.UserId == userId)
.Include("BookmarkedDeals")
.Include("BookmarkedStores")
.SingleOrDefault();
if (u.IsNotNull() && u.IsActive)
{
if (u.IsAdmin)
{
u.GetAdministrator();
}
else if (u.IsContributor)
{
u.GetContributor();
}
else if (u.IsMember)
{
u.GetMember();
}
else
{
string.Format("Case {0} not implemented", u.UserRoleId)
.Throw<NotImplementedException>();
}
}
return u;
}
Each of the 'Get' methods gets a Subscriber entity plus the relevant Include() entities for the role type.
I'm pretty sure it can be done a whole lot more elegantly than this, but I'm struggling with the initial thought process.
Anyone help?
UPDATED with example of one of the Get methods
public static void GetMember(this User user)
{
Guard.ThrowIfNull(user, "user");
var r = new ReadRepo<Subscriber>(Local.Items.Uow.Context);
user.Subscriber = r.FindBy(x => x.UserId == user.UserId)
.Include("Kudos")
.Include("Member.DrawEntries")
.Include("Member.FavouriteCategories")
.Include("Member.FavouriteStores")
.Single();
}
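If your "User" entity is connected to your other entities, you can use the Load method of the related entity reference or collection to load the related entities. For example, if your "User" entity has a property "Subscriber", you could call u.Subscriber.Load() to load the related entity. With the DbContext API, explicit loading is written as context.Entry(u).Reference(x => x.Subscriber).Load(), where context is your DbContext instance. The MSDN article on loading related entities has the details.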
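// Build the query first, add the optional Include, then execute it.
var query = r.FindBy(x => x.UserId == userId)
.Include("BookmarkedDeals")
.Include("BookmarkedStores");
if (someCondition)
{
query = query.Include("something");
}
var u = query.SingleOrDefault();
Don't have a place to test this, but have you tried that? Include has to be applied to the query before SingleOrDefault executes it.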

Programmatically load a LINQ2SQL partial class

I am working on project that allows a user to add Time to a Task. On the Task I have a field for EstimatedDuration, and my thoughts are I can get ActualDuration from the Time added to the Task.
I have a LINQ2SQL class for the Task, as well as an additional Task class (using partials).
I have the following for my query so far:
public IQueryable<Task> GetTasks(TaskCriteria criteria)
{
// set option to eager load child object(s)
var opts = new System.Data.Linq.DataLoadOptions();
opts.LoadWith<Task>(row => row.Project);
opts.LoadWith<Task>(row => row.AssignedToUser);
opts.LoadWith<Task>(row => row.Customer);
opts.LoadWith<Task>(row => row.Stage);
db.LoadOptions = opts;
IQueryable<Task> query = db.Tasks;
if (criteria.ProjectId.HasValue())
query = query.Where(row => row.ProjectId == criteria.ProjectId);
if (criteria.StageId.HasValue())
query = query.Where(row => row.StageId == criteria.StageId);
if (criteria.Status.HasValue)
query = query.Where(row => row.Status == (int)criteria.Status);
var result = query.Select(row => row);
return result;
}
What would be the best way to get at the ActualDuration, which is just a sum of the Units in the TaskTime table?
Add a property to your Task partial class, similar to this:
public int ActualDuration
{
get {
YourDataContext db = new YourDataContext();
return
db.TaskDurations.Where(t => t.task_id == this.id).
Sum (t => t.duration);
}
}
Then you can reference the actual duration as Task.ActualDuration.
Update: You asked about how to do this with a partial class. Of course it hits the database again. The only way to get data from the database that you don't have yet is to hit the database. If you need to avoid this for performance reasons, write a subquery or SQL function that calculates the actual duration, and use a Tasks view that includes the calculated value. The function/query will still have to aggregate the entered durations for every task row in the result set, so it will still be performance-intensive. If you have a very large task table and performance issues, keep a running tally on the task table. I think that even the partial class solution is fine for several hundred thousand tasks. You rarely retrieve large numbers at once, I would assume; grid controls with paging only get one page at a time.
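As a sketch of the aggregate alternative, the durations for many tasks can be summed in one grouped query and then looked up per task, instead of issuing one query per property access. The TaskTime type below is a hypothetical, simplified stand-in for a row of the TaskTime table (the TaskId and Units property names are assumptions), and an in-memory IQueryable stands in for the LINQ to SQL table.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for a row of the TaskTime table.
public class TaskTime
{
    public int TaskId { get; set; }
    public int Units { get; set; }
}

public static class TaskDurations
{
    // Sums Units per task in a single grouped query instead of one query per task.
    public static Dictionary<int, int> ActualDurationByTask(IQueryable<TaskTime> taskTimes)
    {
        return taskTimes
            .GroupBy(t => t.TaskId)
            .Select(g => new { TaskId = g.Key, Total = g.Sum(t => t.Units) })
            .ToDictionary(x => x.TaskId, x => x.Total);
    }
}
```

Tasks without any time entries are absent from the dictionary, so a lookup with TryGetValue and a default of 0 is the safer way to read it.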
