Programmatically load a LINQ2SQL partial class - c#

I am working on a project that allows a user to add Time to a Task. On the Task I have a field for EstimatedDuration, and my thinking is that I can derive ActualDuration from the Time added to the Task.
I have a LINQ2SQL class for the Task, as well as an additional Task class (using partials).
I have the following for my query so far:
public IQueryable<Task> GetTasks(TaskCriteria criteria)
{
    // set option to eager load child object(s)
    var opts = new System.Data.Linq.DataLoadOptions();
    opts.LoadWith<Task>(row => row.Project);
    opts.LoadWith<Task>(row => row.AssignedToUser);
    opts.LoadWith<Task>(row => row.Customer);
    opts.LoadWith<Task>(row => row.Stage);
    db.LoadOptions = opts;
    IQueryable<Task> query = db.Tasks;
    if (criteria.ProjectId.HasValue())
        query = query.Where(row => row.ProjectId == criteria.ProjectId);
    if (criteria.StageId.HasValue())
        query = query.Where(row => row.StageId == criteria.StageId);
    if (criteria.Status.HasValue)
        query = query.Where(row => row.Status == (int)criteria.Status);
    var result = query.Select(row => row);
    return result;
}
What would be the best way to get at the ActualDuration, which is just a sum of the Units in the TaskTime table?

Add a property to your Task partial class, similar to this:
public int ActualDuration
{
    get
    {
        // Dispose the context once the sum has been read.
        using (var db = new YourDataContext())
        {
            return db.TaskDurations
                .Where(t => t.task_id == this.id)
                .Sum(t => t.duration);
        }
    }
}
Then you can reference the actual duration as Task.ActualDuration.
Update: You asked how to do this with a partial class, and yes, of course it hits the database again. The only way to get data from the database that you don't have yet is to query the database. If you need to avoid this for performance reasons, write a subquery or SQL function that calculates the actual duration, and use a Tasks view that includes the calculated value. The function or query will still have to aggregate the entered durations for every task row in the result set, so it will still be performance-intensive. If you have a very large task table and performance issues, keep a running tally on the task table.
I think even the partial class solution is fine for several hundred thousand tasks. I would assume you rarely retrieve large numbers at once; grid controls with paging only fetch one page at a time.
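As a rough sketch of the "aggregate in one query" idea, the sum for every task can be computed with a single group join instead of one query per task. The types below (TaskRow, TaskTimeRow) and their members are hypothetical stand-ins, and the code runs against in-memory collections; against a real DataContext you would project into an anonymous type or a named class instead of a tuple, since tuple literals are not allowed in expression trees.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the Task and TaskTime rows; the real LINQ to SQL
// entities and column names will differ.
public record TaskRow(int Id, string Name, int EstimatedDuration);
public record TaskTimeRow(int TaskId, int Units);

public static class DurationSketch
{
    // Pairs every task with the sum of its TaskTime units in one pass
    // instead of issuing one query per task. Tasks without time rows get 0.
    public static List<(TaskRow Task, int ActualDuration)> WithActualDuration(
        IEnumerable<TaskRow> tasks, IEnumerable<TaskTimeRow> times)
    {
        return tasks
            .GroupJoin(
                times,
                task => task.Id,
                time => time.TaskId,
                (task, taskTimes) =>
                    (Task: task, ActualDuration: taskTimes.Sum(t => t.Units)))
            .ToList();
    }
}
```

Each element of the result carries both the task and its summed units, so a grid can show EstimatedDuration next to ActualDuration without touching the property getter.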

Related

EF 6 - Performance of GroupBy

I don't have a problem currently, but I want to make sure that performance won't become an issue. My search of Microsoft's documentation turned up nothing.
I have an entity named Reservation. I now want to add some statistics to the program, so I can see some metrics about the reservations (reservations per month and the favorite spot/seat in particular).
My first approach was the following:
public async Task<ICollection<StatisticElement<Seat>>> GetSeatUsage(Company company)
{
    var allReservations = await this.reservationService.GetAll(company);
    return await this.FetchGroupedSeatData(allReservations, company);
}

public async Task<ICollection<StatisticElement<DateTime>>> GetMonthlyReservations(Company company)
{
    var allReservations = await this.reservationService.GetAll(company);
    return this.FetchGroupedReservationData(allReservations);
}

private async Task<ICollection<StatisticElement<Seat>>> FetchGroupedSeatData(
    IEnumerable<Reservation> reservations,
    Company company)
{
    var groupedReservations = reservations.GroupBy(r => r.SeatId).ToList();
    var companySeats = await this.seatService.GetAll(company);
    return (from companySeat in companySeats
            let groupedReservation = groupedReservations.FirstOrDefault(s => s.Key == companySeat.Id)
            select new StatisticElement<Seat>()
            {
                Value = companySeat,
                StatisticalCount = groupedReservation?.Count() ?? 0,
            }).OrderByDescending(s => s.StatisticalCount).ToList();
}

private ICollection<StatisticElement<DateTime>> FetchGroupedReservationData(IEnumerable<Reservation> reservations)
{
    var groupedReservations = reservations.GroupBy(r => new { Month = r.Date.Month, Year = r.Date.Year }).ToList();
    return groupedReservations
        .Select(groupedReservation => new StatisticElement<DateTime>()
        {
            Value = new DateTime(groupedReservation.Key.Year, groupedReservation.Key.Month, 1),
            StatisticalCount = groupedReservation.Count(),
        })
        .OrderBy(s => s.Value)
        .ToList();
}
To explain the code a little: with GetSeatUsage and GetMonthlyReservations I can get the above-mentioned data for a company. To do so, I first fetch ALL reservations (with reservationService.GetAll); this is the point where I think performance will become a problem in the future.
Afterwards, I call either FetchGroupedSeatData or FetchGroupedReservationData, which groups the reservations I previously fetched from the database and then converts them into a format I can use.
As I said, I think grouping after I have read ALL the data from the database MIGHT be a problem, but I cannot find any information about performance in the documentation.
My other idea was to create a new method in my ReservationService that returns the already grouped list. But, again, I can't find out whether EF adds the GroupBy to the DB query or performs it only after all of the data has been read from the database. This method would look something like this:
return await this.Context.Set<Reservation>().Where(r => r.User.CompanyId == company.Id).GroupBy(r => r.SeatId).ToListAsync();
Is this already the solution? Where can I check that? Am I missing something completely obvious?

Update an entity's property's values in the database within a repository query but before the query returns

I know my title sounds a little bit confusing but here's the better explanation:
I have an API made with ASP.NET CORE and .NET 5 where I use the repository pattern to perform all my DB queries.
For one of these queries I need to have some kind of calculation made and set the result of that calculation as the value of one of the entity's property in the database before it returns the entire object, but the calculation needs to happen before the query returns because I need the query to also return the new value inside the object.
In other words, here's what I had so far:
public async Task<IReadOnlyList<Place>> GetAllPlacesByCategoryWithRelatedDataAndFilters(Guid id, string city)
{
    // I need the ratio to be calculated before I return "places"
    var places = await context.Places
        .Include(d => d.Category)
        .Where(d => d.CategoryId == id)
        .ToListAsync();
    foreach (var place in places)
    {
        // Ratio is a double
        place.Ratio = CalculateRatioByMultiplying(place.Amount1, place.Amount2);
        // Here's where I don't know how to really update the value inside of the
        // database, if I leave it like this then it will show the new value when
        // the query is performed but the value is still the default value in the
        // database
        // I've also tried this but this time nothing happens,
        // it doesn't even throw an Exception or anything and the value
        // is still the default value in the database
        var ratio = CalculateRatioByMultiplying(place.Amount1, place.Amount2);
        place.Ratio = ratio;
        await placeRepository.UpdateAsync(place);
        // Also the CalculateRatioByMultiplying is a method I've added to the
        // IPlaceRepository and so in this PlaceRepository too that just
        // multiplies arg1 with arg2 and returns a double
    }
    // Re-do the query with the new Ratio values for each of the places
    // (reuses the existing variable; declaring "var places" twice does not compile)
    places = await context.Places
        .Include(d => d.Category)
        .Where(d => d.CategoryId == id)
        .Where(d => d.Ratio != 0
            && d.City == city)
        .OrderByDescending(d => d.Ratio)
        .Take(10)
        .ToListAsync();
    return places;
}
However, I'm also using the mediator pattern to query from the controllers (I'm using clean architecture), and sometimes I get confused about whether the calculation should be made in the "GetAllPlacesByCategoryWithRelatedDataAndFiltersQueryHandler" (where the query is called) or in the repository directly.
Unfortunately I can't share the project or any sample code as it is a private project but I can answer any question you have of course.
Thanks for your help
You're trying to update something (executing a command) while querying the data. Not sure if you're using CQRS or not, but it's still a separation of concerns problem.
May I suggest you entirely remove the update of place.Ratio from this query. And in all the commands that might have effects on place.Ratio, you could publish/dispatch a PlaceRatioRequiredChangedEvent event (it's an INotification for MediatR) then handle this event/notification.
I know it's a fair amount of work, but it'll set up a better business flow for later development.

LINQ querying using an entity object not in the database - Error: LINQ to Entities does not recognize the method

I have an Entity Framework v5 model created from a database. The table Season has a corresponding entity called Season. I need to calculate the Season's minimum start date and maximum end date for each year for a Project_Group. I then need to be able to JOIN those yearly min/max Season values in other LINQ queries. To do so, I have created a SeasonLimits class in my Data Access Layer project. (A SeasonLimits table does not exist in the database.)
public partial class SeasonLimits : EntityObject
{
    public int Year { get; set; }
    public DateTime Min_Start_Date { get; set; }
    public int Min_Start_Date_ID { get; set; }
    public DateTime Max_End_Date { get; set; }
    public int Max_End_Date_ID { get; set; }

    public static IQueryable<SeasonLimits> QuerySeasonLimits(MyEntities context, int project_Group_ID)
    {
        return context
            .Season
            .Where(s => s.Locations.Project.Project_Group.Any(pg => pg.Project_Group_ID == project_Group_ID))
            .GroupBy(x => x.Year)
            .Select(sl => new SeasonLimits
            {
                Year = sl.Key,
                Min_Start_Date = sl.Min(d => d.Start_Date),
                Min_Start_Date_ID = sl.Min(d => d.Start_Date_ID),
                Max_End_Date = sl.Max(d => d.End_Date),
                Max_End_Date_ID = sl.Max(d => d.End_Date_ID)
            });
    }
}

// MVC Project
var seasonHoursByYear =
    from d in context.AuxiliaryDateHours
    from sl in SeasonLimits.QuerySeasonLimits(context, pg.Project_Group_ID)
    where d.Date_ID >= sl.Min_Start_Date_ID
        && d.Date_ID < sl.Max_End_Date_ID
    group d by new
    {
        d.Year
    } into grp4
    orderby grp4.Key.Year
    select new
    {
        Year = grp4.Key.Year,
        HoursInYear = grp4.Count()
    };
In my MVC project, whenever I attempt to use the QuerySeasonLimits method in a LINQ query JOIN, I receive the message,
"LINQ to Entities does not recognize the method
'System.Linq.IQueryable`1[MyDAL.SeasonLimits]
QuerySeasonLimits(MyDAL.MyEntities, MyDAL.Project_Group)' method, and
this method cannot be translated into a store expression."
Is this error being generated because SeasonLimits is not an entity that exists in the database? If this can't be done this way, is there another way to reference the logic so that it can be used in other LINQ queries?
EF tries to translate your query to SQL, and because there is no direct mapping between your method and the generated SQL, you get this error.
The first option would be not to use the method and instead write the contents of the method directly in the original query (I'm not sure at the moment whether this would work, as I don't have VS running). If it does work, you'll most likely end up with very complicated SQL that performs poorly.
So here comes the second option: don't be afraid to use multiple queries to get what you need. Sometimes it also makes sense to send a simpler query to the DB and continue with the modifications (aggregation, selection, ...) in C# code. The query is translated to SQL and executed every time you enumerate it, call one of the ToList, ToDictionary, ToArray or ToLookup methods, or call First, FirstOrDefault, Single or SingleOrDefault (see the LINQ documentation for the specifics).
One possible example that could fix your query (but most likely is not the best solution) is to start your query with:
var seasonHoursByYear =
    from d in context.AuxiliaryDateHours.ToList()
    [...]
and continue with all the rest. This minor change has fundamental impact:
- by calling ToList the DB will be immediately queried and the whole AuxiliaryDateHours table will be loaded into the application (this will be a performance problem if the table has too many rows)
- a second query will be generated when calling your QuerySeasonLimits method (you could/should also include a ToList call for that)
- the rest of the seasonHoursByYear query (where, grouping, ...) will happen in memory
There are a couple of other points that might be unrelated at this point.
I haven't really investigated the intent of your code - as this could lead to further optimizations - even total reworks that could bring you more gains in the end...
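To illustrate the point above about a query only running once it is enumerated, here is a small sketch using LINQ to Objects as a stand-in for EF. The class and method names are made up for the example; EF defers in the same way, except that the deferred work is the SQL round trip rather than an in-memory filter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Reports how many times the filter ran before and after ToList().
    public static (int BeforeToList, int AfterToList) CountFilterCalls()
    {
        int calls = 0;
        var source = new List<int> { 1, 2, 3, 4 };

        // Building the query runs nothing yet; it only describes the work.
        IEnumerable<int> evens = source.Where(n =>
        {
            calls++;
            return n % 2 == 0;
        });
        int before = calls;

        // ToList() enumerates the query, so the filter runs once per element.
        List<int> materialised = evens.ToList();
        int after = calls;

        return (before, after);
    }
}
```

Calling ToList() early, as in the example with AuxiliaryDateHours, moves everything that follows into memory, which is why where it is placed matters so much.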
I eliminated the SeasonLimits object and the QuerySeasonLimits method, and wrote the contents of the method directly in the original query.
// MVC Project
var seasonLimits =
    from s in context.Season
        .Where(s => s.Locations.Project.Project_Group.Any(pg => pg.Project_Group_ID == Project_Group_ID))
    group s by new
    {
        s.Year
    } into grp
    select new
    {
        grp.Key.Year,
        Min_Start_Date = grp.Min(x => x.Start_Date),
        Min_Start_Date_ID = grp.Min(x => x.Start_Date_ID),
        Max_End_Date = grp.Max(x => x.End_Date),
        Max_End_Date_ID = grp.Max(x => x.End_Date_ID)
    };

var seasonHoursByYear =
    from d in context.AuxiliaryDateHours
    from sl in seasonLimits
    where d.Date_ID >= sl.Min_Start_Date_ID
        && d.Date_ID < sl.Max_End_Date_ID
    group d by new
    {
        d.Year
    } into grp4
    orderby grp4.Key.Year
    select new
    {
        Year = grp4.Key.Year,
        HoursInYear = grp4.Count()
    };

Static select lists based on database queries

In the process of refactoring an ASP.NET MVC 5 web project, I see an opportunity to move some select lists to another class where multiple controllers can access them. This would allow me to remove duplicate code.
In this instance, the select lists require a trip to the database. To hand-code the lists, which might change over time, would not be feasible (hence, the database query).
Although I have no compiler errors and the page appears to work as intended, I am not sure if I am creating other problems by taking this approach. Is the code approach shown below a "bad" way to achieve this outcome? Is there a better way to do this?
In summary, this is what I am doing:
- The class and the methods within it are static.
- A private static readonly database context is defined at the top.
- The two functions shown query the database and produce the desired results.
- Because this is a static class, there is no dispose method.
The Class:
public static class ElpLookupLists
{
    private static readonly EllAssessmentContext Db = new EllAssessmentContext();
    // code...

    internal static IEnumerable<SelectListItem> StandardSelectList(string selectedDomain)
    {
        return Db.ElpStandardLists.Where(m => m.Domain == selectedDomain)
            .Select(m => m.Standard).Distinct()
            .Select(z => new SelectListItem { Text = z.ToString(), Value = z.ToString() }).OrderBy(z => z.Value)
            .ToList();
    }

    internal static IEnumerable<SelectListItem> PerformanceIndicatorSelectList(string selectedDomain, int? selectedStandard,
        string selectedSubConcept)
    {
        var query =
            Db.ElpStandardLists.Where(m => m.Domain == selectedDomain).Where(m => m.Standard == selectedStandard);
        if (!string.IsNullOrEmpty(selectedSubConcept)) query = query.Where(m => m.SubConcept == selectedSubConcept);
        var list =
            query.Select(m => m.PerformanceIndicator)
                .Distinct().OrderBy(m => m)
                .Select(z => new SelectListItem { Text = z.ToString(), Value = z.ToString() })
                .OrderBy(z => z.Text).ToList();
        return list;
    }
}
In my opinion a better alternative would be to create a separate controller with methods to get this data, and you could OutputCache this method. You can then call this method in other controllers, and it won't make the database trip every time. The return value will be cached. You can control the cache settings of course.
The advantage of this technique over yours is that in your case, the database trip will always happen when the application starts because the method is static, irrespective of whether or not you are going to use it. Whereas by using a cached method, you make the database trip the first time you call the method.
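For illustration only, the "load on first call, then reuse" behaviour can also be sketched with Lazy<T>. This is a different technique from OutputCache (it has no expiry and no cache settings), and LoadStandards is a hypothetical stand-in for the database query, so treat it as a sketch of the idea rather than a replacement for the suggestion above.

```csharp
using System;
using System.Collections.Generic;

public static class LazyLookupSketch
{
    // Counts how often the (pretend) database query actually runs.
    public static int LoadCount { get; private set; }

    // Hypothetical loader standing in for the real database query.
    private static List<string> LoadStandards()
    {
        LoadCount++;
        return new List<string> { "1", "2", "3" };
    }

    // The loader runs on first access to Value, not when the class is initialised.
    private static readonly Lazy<List<string>> Standards =
        new Lazy<List<string>>(LoadStandards);

    // Every caller after the first gets the already loaded list.
    public static IReadOnlyList<string> GetStandards() => Standards.Value;
}
```

Because the list never expires here, this only suits lookup data that can stay unchanged until the application restarts.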

NHibernate query extremely slow compared to hard coded SQL query

I'm re-writing some of my old NHibernate code to be more database agnostic and use NHibernate queries rather than hard coded SELECT statements or database views. I'm stuck with one that's incredibly slow after being re-written. The SQL query is as such:
SELECT
    r.recipeingredientid AS id,
    r.ingredientid,
    r.recipeid,
    r.qty,
    r.unit,
    i.conversiontype,
    i.unitweight,
    f.unittype,
    f.formamount,
    f.formunit
FROM recipeingredients r
INNER JOIN shoppingingredients i USING (ingredientid)
LEFT JOIN ingredientforms f USING (ingredientformid)
So, it's a pretty basic query with a couple JOINs that selects a few columns from each table. This query happens to return about 400,000 rows and has roughly a 5 second execution time. My first attempt to express it as an NHibernate query was as such:
var timer = new System.Diagnostics.Stopwatch();
timer.Start();
var recIngs = session.QueryOver<Models.RecipeIngredients>()
    .Fetch(prop => prop.Ingredient).Eager()
    .Fetch(prop => prop.IngredientForm).Eager()
    .List();
timer.Stop();
This code works and generates the desired SQL, however it takes 120,264ms to run. After that, I loop through recIngs and populate a List<T> collection, which takes under a second. So, something NHibernate is doing is extremely slow! I have a feeling this is simply the overhead of constructing instances of my model classes for each row. However, in my case, I'm only using a couple properties from each table, so maybe I can optimize this.
The first thing I tried was this:
IngredientForms joinForm = null;
Ingredients joinIng = null;
var recIngs = session.QueryOver<Models.RecipeIngredients>()
    .JoinAlias(r => r.IngredientForm, () => joinForm)
    .JoinAlias(r => r.Ingredient, () => joinIng)
    .Select(r => joinForm.FormDisplayName)
    .List<String>();
Here, I just grab a single value from one of my JOIN'ed tables. The SQL code is once again correct and this time it only grabs the FormDisplayName column in the select clause. This call takes 2498ms to run. I think we're on to something!!
However, I of course need to return several different columns, not just one. Here's where things get tricky. My first attempt is an anonymous type:
.Select(r => new { DisplayName = joinForm.FormDisplayName, IngName = joinIng.DisplayName })
Ideally, this should return a collection of anonymous types with both a DisplayName and an IngName property. However, this causes an exception in NHibernate:
Object reference not set to an instance of an object.
Plus, .List() is trying to return a list of RecipeIngredients, not anonymous types. I also tried .List<Object>() to no avail. Hmm. Well, perhaps I can create a new type and return a collection of those:
.Select(r => new TestType(r))
The TestType construction would take a RecipeIngredients object and do whatever. However, when I do this, NHibernate throws the following exception:
An unhandled exception of type 'NHibernate.MappingException' occurred
in NHibernate.dll
Additional information: No persister for: KitchenPC.Modeler.TestType
I guess NHibernate wants to generate a model matching the schema of RecipeIngredients.
How can I do what I'm trying to do? It seems that .Select() can only be used for selecting a list of a single column. Is there a way to use it to select multiple columns?
Perhaps one way would be to create a model with my exact schema, however I think that would end up being just as slow as the original attempt.
Is there any way to return this much data from the server without the massive overhead, without hard coding a SQL string into the program or depending on a VIEW in the database? I'd like to keep my code completely database agnostic. Thanks!
The QueryOver syntax for converting selected columns into an artificial object (DTO) is a bit different. See 16.6. Projections for more details and a nice example.
A draft of it could look like this. First, the DTO:
public class TestTypeDTO // the DTO
{
    public string PropertyStr1 { get; set; }
    ...
    public int PropertyNum1 { get; set; }
    ...
}
And this is an example of the usage:
// DTO marker
TestTypeDTO dto = null;
// the query you need
var recIngs = session.QueryOver<Models.RecipeIngredients>()
    .JoinAlias(r => r.IngredientForm, () => joinForm)
    .JoinAlias(r => r.Ingredient, () => joinIng)
    // place for projections
    .SelectList(list => list
        // this set is an example of string and int
        .Select(x => joinForm.FormDisplayName)
            .WithAlias(() => dto.PropertyStr1) // this WithAlias is essential
        .Select(x => joinIng.Weight)           // it will help the below transformer
            .WithAlias(() => dto.PropertyNum1)) // with conversion
    ...
    .TransformUsing(Transformers.AliasToBean<TestTypeDTO>())
    .List<TestTypeDTO>();
So, I came up with my own solution that's a bit of a mix between Radim's solution (using the AliasToBean transformer with a DTO) and Jake's solution, which involves selecting raw properties and converting each row to a list of object[] tuples.
My code is as follows:
var recIngs = session.QueryOver<Models.RecipeIngredients>()
    .JoinAlias(r => r.IngredientForm, () => joinForm)
    .JoinAlias(r => r.Ingredient, () => joinIng)
    .Select(
        p => joinIng.IngredientId,
        p => p.Recipe.RecipeId,
        p => p.Qty,
        p => p.Unit,
        p => joinIng.ConversionType,
        p => joinIng.UnitWeight,
        p => joinForm.UnitType,
        p => joinForm.FormAmount,
        p => joinForm.FormUnit)
    .TransformUsing(IngredientGraphTransformer.Create())
    .List<IngredientBinding>();
I then implemented a new class called IngredientGraphTransformer which can convert that object[] array into a list of IngredientBinding objects, which is what I was ultimately doing with this list anyway. This is exactly how AliasToBeanTransformer is implemented, only it initializes a DTO based on a list of aliases.
public class IngredientGraphTransformer : IResultTransformer
{
    public static IngredientGraphTransformer Create()
    {
        return new IngredientGraphTransformer();
    }

    IngredientGraphTransformer()
    {
    }

    public IList TransformList(IList collection)
    {
        return collection;
    }

    public object TransformTuple(object[] tuple, string[] aliases)
    {
        Guid ingId = (Guid)tuple[0];
        Guid recipeId = (Guid)tuple[1];
        Single? qty = (Single?)tuple[2];
        Units usageUnit = (Units)tuple[3];
        UnitType convType = (UnitType)tuple[4];
        Int32 unitWeight = (int)tuple[5];
        Units rawUnit = Unit.GetDefaultUnitType(convType);
        // Do a bunch of logic based on the data above
        return new IngredientBinding
        {
            RecipeId = recipeId,
            IngredientId = ingId,
            Qty = qty,
            Unit = rawUnit
        };
    }
}
Note, this is not as fast as doing a raw SQL query and looping through the results with an IDataReader, however it's much faster than joining in all the various models and building the full set of data.
IngredientForms joinForm = null;
Ingredients joinIng = null;
var recIngs = session.QueryOver<Models.RecipeIngredients>()
    .JoinAlias(r => r.IngredientForm, () => joinForm)
    .JoinAlias(r => r.Ingredient, () => joinIng)
    .Select(r => r.column1, r => r.column2)
    .List<object[]>();
Would this work?
