C# GroupBy Trouble - c#

As of now, I am trying to create a list that groups based on certain criteria and then display that list in the view.
I have two database tables and one is an association table.
First Table
public partial class InitialTraining
{
public InitialTraining()
{
InitialTrainingAssociations = new HashSet<InitialTrainingAssociation>();
}
public int Id { get; set; }
[ForeignKey("MedicInfo")]
public int TfoId { get; set; }
[ForeignKey("InstructorInfo")]
public int? InstructorId { get; set; }
[ForeignKey("PilotInfo")]
public int? PilotId { get; set; }
public DateTime DateTakenInitial { get; set; }
public decimal FlightTime { get; set; }
public bool Active { get; set; }
[StringLength(2000)]
public string Narrative { get; set; }
[Required]
[StringLength(20)]
public string TrainingType { get; set; }
[ForeignKey("CodePhase")]
public int PhaseId { get; set; }
[ForeignKey("PhaseTrainingType")]
public int PhaseTrainingTypeId { get; set; }
public string EnteredBy { get; set; }
public DateTime? EnteredDate { get; set; }
public virtual MedicInfo MedicInfo { get; set; }
public virtual MedicInfo InstructorInfo { get; set; }
public virtual MedicInfo PilotInfo { get; set; }
public virtual Code_Phase CodePhase { get; set; }
public virtual Code_PhaseTrainingType PhaseTrainingType { get; set; }
public virtual ICollection<InitialTrainingAssociation> InitialTrainingAssociations { get; set; }
}
Second Table (Association Table)
public class InitialTrainingAssociation
{
public int Id { get; set; }
[ForeignKey("InitialTraining")]
public int InitialTrainingId { get; set; }
[ForeignKey("CodePerformanceAnchor")]
public int? PerformanceAnchorId { get; set; }
[ForeignKey("GradingSystem")]
public int? GradingSystemId { get; set; }
public virtual AviationMedicTraining.CodePerformanceAnchor CodePerformanceAnchor { get; set; }
public virtual InitialTraining InitialTraining { get; set; }
public virtual GradingSystem GradingSystem { get; set; }
}
Here is my GroupBy in C#.
// get list of initial training record ids for statistics
var lstInitialTrainings = db.InitialTrainings.Where(x => x.TfoId == medicId && x.Active).Select(x => x.Id).ToList();
// get list of initial training performance anchors associated with initial training records
var lstPerformanceAnchors = db.InitialTrainingAssociations
.Where(x => lstInitialTrainings.Contains(x.InitialTrainingId)).GroupBy(t => t.PerformanceAnchorId)
.Select(s => new MedicStatistic()
{
PerformanceAnchorName = db.CodePerformanceAnchor.FirstOrDefault(v => v.Id == s.Key).PerformanceAnchor,
AnchorCount = s.Count()
}).ToList();
My Goal
As my code shows, I want to group by the performance anchor in the association table. However, I also need information from the Initial Training table in my ViewModel, MedicStatistic, and I am having trouble figuring out the best way to get it.
My overall goal is to be able to get the most recent time a performance anchor was completed from the Initial Training table.
Visual
Initial Training Table (not all fields were captured in snippet b/c they're not important for the purpose of this question)
Initial Training Association Table
What I expect
As the pictures above show, the association table contains several rows with a performance anchor id of 1, each with a different InitialTrainingId. This performance anchor has therefore been done multiple times, and I need the most recent date from the Initial Training table. I also need the grade from the Grading System table that corresponds to the anchor on that most recent date.
So, for the performance anchor that equals 1, I would want the grade that corresponds to the InitialTrainingId of 17, because that record was the most recent time the performance anchor of 1 was done.
If you have any questions please let me know.

You want the data grouped by CodePerformanceAnchor, so the most natural way to start the query is at its DbSet which immediately eliminates the necessity of grouping:
from pa in db.CodePerformanceAnchors
let mostRecentInitialTraining
= pa.InitialTrainingAssociations
.Select(ita => ita.InitialTraining)
.OrderByDescending(tr => tr.DateTakenInitial)
.FirstOrDefault()
select new
{
pa.PerformanceAnchor,
mostRecentInitialTraining.DateTakenInitial,
mostRecentInitialTraining. ...
...
AnchorCount = pa.InitialTrainingAssociations.Count()
}
As you see, only navigation properties are used and the query as a whole is pretty straightforward. I assume that the CodePerformanceAnchor class also has an InitialTrainingAssociations collection.
I can't guarantee that EF will be able to execute it entirely server-side though, that's always tricky with more complex LINQ queries.

I'm going to ignore the virtual properties in your InitialTrainingAssociation class, since you didn't mention anything about them and it's not immediately apparent to me whether they actually contain data, or why they are virtual.
It seems like IQueryable.Join is the easiest way to combine the data you want.
In the following example, we will start with the entries from the InitialTrainings table. We will then Join with the InitialTrainingAssociations table, which will result in a collection of paired InitialTraining and InitialTrainingAssociation objects.
var initialTrainingResults =
// Start with the InitialTrainings data.
db.InitialTrainings
// Add association information.
.Join(
// The table we want to join with
db.InitialTrainingAssociations,
// Key selector for the outer type (the type of the collection
// initiating the join, in this case InitialTraining)
it => it.Id,
// Key selector for the inner type (the type of the collection
// being joined with, in this case InitialTrainingAssociation)
ita => ita.InitialTrainingId,
// Result selector. This defines how we store the joined data.
// We store the results in an anonymous type, so that we can
// use the intermediate data without having to declare a new class.
(InitialTraining, InitialTrainingAssociation) =>
new { InitialTraining, InitialTrainingAssociation }
)
From here, we can add data from the PerformanceAnchors and GradingSystems tables, by performing more Joins. Each time we perform a Join, we will add a new entity to our anonymous type. The result will be a collection of anonymous types representing data we retrieved from the database.
// Add performance anchor information.
.Join(
db.PerformanceAnchors,
x => x.InitialTrainingAssociation.PerformanceAnchorId,
// PerformanceAnchorId is int?, so cast the inner key to the same type.
pa => (int?)pa.Id,
(x, PerformanceAnchor) =>
new { x.InitialTrainingAssociation, x.InitialTraining, PerformanceAnchor }
)
// Add grading system information.
.Join(
db.GradingSystems,
x => x.InitialTrainingAssociation.GradingSystemId,
// GradingSystemId is int?, so cast the inner key to the same type.
gs => (int?)gs.Id,
// No need for InitialTrainingAssociation anymore, so we don't
// include it in this final selector.
(x, GradingSystem) =>
new { x.InitialTraining, x.PerformanceAnchor, GradingSystem }
);
(This was a verbose example to show how you can join all the tables together. You can use fewer Joins if you don't need to access all the data at once, and you can filter down the InitialTrainings collection that we start with if you know you only need to access certain pieces of data.)
At this point, initialTrainingResults is an IQueryable containing one entry for each association between the InitialTrainings, PerformanceAnchors, and GradingSystems tables. Essentially, what we've done is take all the InitialTrainingAssociations and expand their Ids into actual objects.
To get the most recent set of data for each performance anchor:
var performanceAnchors = initialTrainingResults
// Group by each unique Performance Anchor. Remember, the query
// we are operating on contains our anonymous type of combined Training,
// Performance Anchor and Grading data.
.GroupBy(x => x.PerformanceAnchor.Id)
// Order each Performance Anchor group by the dates of its training,
// and take the first one from each group. FirstOrDefault is used because
// some EF versions (EF6, for example) reject First() inside a projection.
.Select(g => g.OrderByDescending(x => x.InitialTraining.DateTakenInitial).FirstOrDefault());

In the Select you can order the group result to get the most recent associated InitialTraining by DateTakenInitial, and from there get the desired data:
//...omitted for brevity
// Note: a statement-bodied lambda cannot be converted to an expression tree,
// so this Select has to run in memory (for example, after AsEnumerable()).
.GroupBy(t => t.PerformanceAnchorId)
.Select(g => {
var mostRecent = g.OrderByDescending(_ => _.InitialTraining.DateTakenInitial).First();
// get the corresponding grade with the anchor from the Grading System table
var gradeid = mostRecent.GradingSystemId;
var gradingSystem = mostRecent.GradingSystem;
//get the most recent date from the Initial Training
var mostRecentDate = mostRecent.InitialTraining.DateTakenInitial;
//..get the desired values and assign to view model
var model = new MedicStatistic {
//Already have access to CodePerformanceAnchor
PerformanceAnchorName = mostRecent.CodePerformanceAnchor.PerformanceAnchor,
AnchorCount = g.Count(),
MostRecentlyCompleted = mostRecentDate,
};
return model;
});
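The "order each group by date and take the first row" step can be tried in isolation with LINQ to Objects. The following is only a minimal sketch: the AnchorRow record, its property names, and the sample values are hypothetical stand-ins for rows that have already been joined to their training date and grade, not the asker's actual entities.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened rows: one per association, already joined
// to the training date and the grade (illustrative values only).
var rows = new List<AnchorRow>
{
    new AnchorRow(1, "Anchor A", new DateTime(2019, 1, 5), "Pass"),
    new AnchorRow(1, "Anchor A", new DateTime(2019, 3, 9), "Fail"),
    new AnchorRow(2, "Anchor B", new DateTime(2019, 2, 1), "Pass"),
};

var stats = rows
    .GroupBy(r => r.AnchorId)
    .Select(g =>
    {
        // The newest row in the group supplies the date and the grade.
        var mostRecent = g.OrderByDescending(r => r.DateTaken).First();
        return new
        {
            mostRecent.AnchorName,
            AnchorCount = g.Count(),
            MostRecentlyCompleted = mostRecent.DateTaken,
            MostRecentGrade = mostRecent.Grade
        };
    })
    .OrderBy(s => s.AnchorName)
    .ToList();

foreach (var s in stats)
{
    Console.WriteLine(
        $"{s.AnchorName}: {s.AnchorCount} time(s), last on {s.MostRecentlyCompleted:yyyy-MM-dd}, grade {s.MostRecentGrade}");
}

record AnchorRow(int AnchorId, string AnchorName, DateTime DateTaken, string Grade);
```

Because the count comes from the whole group while the date and grade come from a single row, the two never get mixed up, which is the point of selecting the most recent row first.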

Related

C# Linq order by with new mapped value

I have a list of Orders, each with a property "Status", which is an int. For each status I have translations in different languages. I want to sort my list by the selected translation and not by the numeric status value. What is the best practice here?
public record OrderTranslation
{
public string OrderStatus { get; set; }
public string StatusDescription { get; set; }
public Language Language { get; set; }
}
public record Order
{
public int? Id { get; set; }
public int Status { get; set; }
// I have added a new value to set the translated value and I want to order by this
public string TranslatedStatusValue { get; set;}
}
my function:
public async Task<FilterResult> FilterAsync(FilterRequest filterRequest, List<string> filterProperties, Language selectedLanguage)
{
var orderTranslations = dataContext
.OrderTranslations
.Where(ot => ot.Language == selectedLanguage)
.ToList();
var orders = dataContext.Orders.AsNoTracking();
foreach (var order in orders)
{
var description = orderTranslations
.Single(x => x.OrderStatus == serviceContract.Status)
.StatusDescription;
serviceContract.TranslatedValue = description;
}
// The TranslatedValue is always empty here
// This is not working, but I want to order by the translation. Is there another way to do this without using an extra property?
IQueryable<ServiceContractOrder> query = orders
.OrderBy("TranslatedStatusValue", filterRequest.IsSortAscending)
.WhereMatchesFilter(filterRequest, filterProperties);
result.FilterHits = await query
.Skip(filterRequest.ItemsToSkip())
.Take(filterRequest.ItemsPerPage)
.Cast<object>()
.ToListAsync();
result.TotalCount = await query.CountAsync();
result.ObjectType = typeof(Order).AssemblyQualifiedName;
result.FilteredProperties = filterProperties;
}
It all depends on the size of your data and what you want to achieve.
If you have a small data set without pagination, you can sort it on the client side (in your .NET code).
If you have a large data set, or you need pagination, you will need to apply the sorting in the database. In that case, I would suggest storing the translated values in the same table as an owned entity, or in a separate table, and then sorting in your LINQ query.
You get two benefits:
Sorting is absolute, and the order is maintained across queries.
Performance, as sorting on the client side hurts for large data sets.
What you lose:
Any change to a translation has to be applied to the database, which makes your database more complex.
If the order status is a fixed set of values, like an enum, you can choose to have a dedicated OrderStatus table with translations and join it to your Order table.
Your domain will be somewhat like,
public record OrderStatus
{
public int Id { get; set; }
public ISet<OrderTranslation> Translations { get; set; }
}
public record OrderTranslation
{
public string OrderStatus { get; set; }
public string StatusDescription { get; set; }
public Language Language { get; set; }
}
public record Order
{
public int? Id { get; set; }
public OrderStatus Status { get; set; }
}
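To illustrate ordering by the translated text without an extra property on Order, here is a minimal in-memory sketch. The OrderRow record, the dictionary of translations, and the sample values are hypothetical. With EF, a comparable Join between the two DbSets followed by OrderBy would be expected to run in the database, but that depends on the provider and is an assumption here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical translations for one selected language: status code -> description.
var translations = new Dictionary<int, string>
{
    [1] = "Shipped",
    [2] = "Cancelled",
    [3] = "Open",
};

// Hypothetical orders: (Id, Status).
var orders = new List<OrderRow>
{
    new OrderRow(10, 1),
    new OrderRow(11, 2),
    new OrderRow(12, 3),
};

// Join each order to its translated description and sort on that text,
// so the sort key never has to be stored on the order itself.
var sorted = orders
    .Join(translations,
          o => o.Status,
          t => t.Key,
          (o, t) => new { o.Id, Description = t.Value })
    .OrderBy(x => x.Description, StringComparer.Ordinal)
    .ToList();

foreach (var row in sorted)
{
    Console.WriteLine($"{row.Id}: {row.Description}");
}

record OrderRow(int Id, int Status);
```

Paging (Skip/Take) would then be applied after the OrderBy, so every page reflects the translated order.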

GroupBy to identify items with different scores and the number of times they are assigned different scores

I have the following Model classes: Assessment and AssessmentItem. Each assessment is submitted by a unique user for a specific Submission, associated with a Rubric. Each Assessment may have many AssessmentItems, each composed of a score assigned by the user and the id of the associated rubric item (RubricItemId). Basically, many users can assess a submission using multiple RubricItemId values (as criteria).
public class Assessment
{
[Key]
public int Id { get; set; }
public bool IsCompleted { get; set; }
[ForeignKey("SubmissionId")]
public Submission Submission { get; set; }
public int SubmissionId { get; set; }
[ForeignKey("RubricId")]
public Rubric Rubric { get; set; }
public int RubricId { get; set; }
[ForeignKey("EvaluatorId")]
public ApplicationUser Evaluator { get; set; }
public string EvaluatorId { get; set; }
public IEnumerable<AssessmentItem> AssessmentItems { get; set; }
}
public class AssessmentItem
{
[Key]
public int Id { get; set; }
public int CurrentScore { get; set; }
[ForeignKey("RubricItemId")]
public RubricItem RubricItem { get; set; }
public int RubricItemId { get; set; }
[ForeignKey("AssessmentId")]
public Assessment Assessment { get; set; }
public int AssessmentId { get; set; }
}
I am trying to find the RubricItemId values with different scores per each submission, along with the number of times they were assigned different scores across all submissions. For example, RubricItem #1 was scored differently by users in 10 submissions. However, I do not know where to start. I have the following code to do this without taking into account the submission.
var a = _context.AssessmentItems.GroupBy(ai => ai.AssessmentId)
.Where(da => da.Select(d => d.CurrentScore)
.Distinct()
.Count() == 1
);
This code also does not compute the count of times a RubricItemId is assigned different scores. I wonder how I can move forward from here. Should I use GroupBy? If I want to do this per Submission, I believe there has to be another GroupBy using SubmissionId, right? Any tips and suggestions?
If you could provide some C# format sample data, I could test. This is my attempt - since I am not building against EF you may need an AsEnumerable at some point, which could pull the whole database over.
var ans = _context.Assessments
.GroupBy(a => a.SubmissionId)
.SelectMany(a_sg => a_sg.SelectMany(a => a.AssessmentItems)
.GroupBy(ai => new { ai.RubricItemId, ai.CurrentScore})
.Where(ai_ricsg => ai_ricsg.Count() > 1)
.Select(ai_ricsg => new { ai_ricsg.Key.RubricItemId, DifferentScoreCountPerSubmission = ai_ricsg.Count() })
)
.GroupBy(ric => ric.RubricItemId)
.Select(ric_rig => new { RubricItemId = ric_rig.Key, DifferentScoreCount = ric_rig.Sum(ric => ric.DifferentScoreCountPerSubmission) });
NetMage beat me to it and probably will be closer to what you end up needing. I can confirm his answer should work without needing to materialize the entities:
Since you are wanting details at a submission level, the start of the query should likely be at the Assessment level.
At a high level you will probably be looking to group on the Submission then utilizing SelectMany with a further group-by to drill down to the items you want to count.
var query = _context.Assessments.GroupBy(x => x.SubmissionId)
.SelectMany(x => x.SelectMany(g => g.AssessmentItems
.Select(ai => new { SubmissionId = x.Key, ai.CurrentScore, ai.RubricItemId })))
.GroupBy(ai => ai);
This would just be a start, which will get you a structure that can count the distinct combinations of Submission, Score, and RubricItem. I've verified this with EF6 against a database, so extending it as NetMage has outlined should be possible without materializing it, or at worst, materializing it as something like the above.
The key thing in this case would be to deal with FKs and particular fields wherever possible rather than pulling back entire entities in the Linq queries as a query like this will probably have a pretty big footprint on DB row touches.
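One possible reading of the requirement is "for each rubric item, count the submissions in which evaluators gave it more than one distinct score". The sketch below shows that reading with LINQ to Objects. The flattened ScoreRow record and the sample scores are hypothetical, and this is an alternative formulation rather than either answerer's exact query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened rows: (SubmissionId, RubricItemId, CurrentScore).
var items = new List<ScoreRow>
{
    // Submission 1: item 1 scored 3 and 4 (different), item 2 scored 5 and 5 (same).
    new ScoreRow(1, 1, 3), new ScoreRow(1, 1, 4),
    new ScoreRow(1, 2, 5), new ScoreRow(1, 2, 5),
    // Submission 2: item 1 scored 2 and 5 (different), item 2 scored 1 and 2 (different).
    new ScoreRow(2, 1, 2), new ScoreRow(2, 1, 5),
    new ScoreRow(2, 2, 1), new ScoreRow(2, 2, 2),
};

var disagreements = items
    // One group per rubric item within a submission.
    .GroupBy(i => new { i.SubmissionId, i.RubricItemId })
    // Keep only the groups where evaluators disagreed.
    .Where(g => g.Select(i => i.CurrentScore).Distinct().Count() > 1)
    // Count, per rubric item, how many submissions had a disagreement.
    .GroupBy(g => g.Key.RubricItemId)
    .Select(g => new { RubricItemId = g.Key, SubmissionsWithDifferentScores = g.Count() })
    .OrderBy(x => x.RubricItemId)
    .ToList();

foreach (var d in disagreements)
{
    Console.WriteLine($"RubricItem {d.RubricItemId}: {d.SubmissionsWithDifferentScores} submission(s)");
}

record ScoreRow(int SubmissionId, int RubricItemId, int CurrentScore);
```

Against EF, the flattened rows could come from projecting AssessmentItems together with Assessment.SubmissionId. Whether the nested Distinct().Count() translates to SQL depends on the EF version, so that part is untested here.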

How can I get the count of a list in an Entity Framework model without including/loading the entire collection?

I have a model in Entity Framework Core that goes something like this:
public class Anime
{
public int EpisodeCount { get { return Episodes.Count(); } }
public virtual ICollection<Episode> Episodes { get; set; }
}
I'm having the issue of EpisodeCount being 0. The solution currently is to run a .Include(x => x.Episodes) within my EF query, but that loads the entire collection of episodes where it's not needed. This also increases my HTTP request time, from 100ms to 700ms which is just not good.
I'm not willing to sacrifice time for simple details, so is there a solution where I can have EF only query the COUNT of the episodes, without loading the entire collection in?
It was suggested that I do this:
var animeList = context.Anime.ToPagedList(1, 20);
animeList.ForEach(x => x.EpisodeCount = x.Episodes.Count());
return Json(animeList);
but this also returns 0 in EpisodeCount, so it's not a feasible solution.
You need to project the desired data into a special class (a.k.a. ViewModel, DTO etc.). Unfortunately (or not?), in order to avoid N + 1 queries the projection must not only include the count, but all other fields as well.
For instance:
Model:
public class Anime
{
public int Id { get; set; }
public string Name { get; set; }
// other properties...
public virtual ICollection<Episode> Episodes { get; set; }
}
ViewModel / DTO:
public class AnimeInfo
{
public int Id { get; set; }
public string Name { get; set; }
// other properties...
public int EpisodeCount { get; set; }
}
Then the following code:
var animeList = db.Anime.Select(a => new AnimeInfo
{
Id = a.Id,
Name = a.Name,
EpisodeCount = a.Episodes.Count()
})
.ToList();
produces the following single SQL query:
SELECT [a].[Id], [a].[Name], (
SELECT COUNT(*)
FROM [Episode] AS [e]
WHERE [a].[Id] = [e].[AnimeId]
) AS [EpisodeCount]
FROM [Anime] AS [a]

NullReferenceException Query SQLite database with Where on a concatenated string property

I'm trying to select a record using the following code:
Location item = connection
.Table<Location>()
.Where(l => l.Label.Equals(label))
.FirstOrDefault();
This results in:
System.NullReferenceException: Object reference not set to an instance of an object.
When I try the same on a different property (Postcode), it all works fine, even when no records are found:
Location item = connection
.Table<Location>()
.Where(l => l.Postcode.Equals(label))
.FirstOrDefault();
This is the Location Class:
// These are the Locations where the Stock Take Sessions are done
public class Location : DomainModels, IComparable<Location>
{
[JsonProperty("id"), PrimaryKey]
public int Id { get; set; }
public string Name { get; set; }
public string Street { get; set; }
public int Number { get; set; }
public string Postcode { get; set; }
public string City { get; set; }
public bool Completed { get; set; }
[Ignore] // Removing this does not have an impact on the NullReferenceException
public string Label => $"{Name ?? ""} - ({Postcode ?? ""})";
public int CompareTo(Location other)
{
return Name.CompareTo(other.Name);
}
// Navigation property
// One to many relationship with StockItems
[OneToMany(CascadeOperations = CascadeOperation.All), Ignore]
public List<StockItem> StockItems { get; set; }
// Specify the foreign key to StockTakeSession
[ForeignKey(typeof(StockTakeSession))]
public int StockTakeSessionId { get; set; }
// One to one relationship with StockTakeSession
[OneToOne]
public StockTakeSession StockTakeSession { get; set; }
}
What am I doing wrong?
Thanks for any suggestions!
Your Where clause filters in the data store on Label, but the Label property on your Location class is decorated with IgnoreAttribute. This means the Label property will not be set until after the entity has been materialized in memory, so you can't do anything with it in the data store.
.Where(l => l.Label.Equals(label))
Fixes
There are some options.
You could set this to computed and create a computed column in the store with that same logic. This involves manually changing your table schema either directly in your RDBMS manager or editing your migration scripts. The property gets marked with [DatabaseGenerated(DatabaseGeneratedOption.Computed)] (if using attributes, which your code above is).
You could change the Where to filter on the properties that compose Label and that are found in the store, i.e. .Where(l => l.Postcode.Equals(Postcode) && l.Name.Equals(Name))
You could materialize everything before that particular filter in memory and then apply the filter. This is not recommended if everything up to that point leads to a lot of records. For example, with the code below, if the table is large you would be retrieving every row to find a single record.
Location item = connection
.Table<Location>()
.AsEnumerable()
.Where(l => l.Label.Equals(label))
.FirstOrDefault();
Edit
[Ignore] // Removing this does not have an impact on the NullReferenceException
No, it should not unless you go through and add the column with the same name to your existing schema and populate it with all data. (or create a computed column in your schema with the same name)

LINQ grouping multiple fields and placing non-unique fields into list

I have a list of objects, TargetList populated from the database which I want to group together based on the AnalyteID, MethodID and InstrumentID fields, but the Unit fields will be stored in a list applicable to each grouped object.
Furthermore, it is only possible for one of the available units to have a target assigned to it. Therefore, during the grouping I need a check to see if a target is available and, if so, skip creation of the unit list.
The TargetList object contains the following attributes:
public int id { get; set; }
public int AnalyteID { get; set; }
public string AnalyteName { get; set; }
public int MethodID { get; set; }
public string MethodName { get; set; }
public int InstrumentID { get; set; }
public string InstrumentName { get; set; }
public int UnitID { get; set; }
public string UnitDescription { get; set; }
public decimal TargetMean { get; set; }
public List<Unit> Units { get; set; }
I have a method for multi-grouping using LINQ:
TargetList.GroupBy(x => new { x.AnalyteID, x.MethodID, x.InstrumentID })...
But I am unsure how to check each row for a target before extracting all available units for the current group when no target exists.
I created a solution which groups all rows returned from the database based on the AnalyteID, MethodID and InstrumentID (the 'names' of each of these are included in the grouping as well).
Additionally, all non-unique Unit attributes (UnitID and UnitDescription) are placed into a list only if the TargetMean is 0.
targetViewModel.TargetList
// Group by unique analyte/method/instrument
.GroupBy(x => new { x.AnalyteID, x.AnalyteName, x.MethodID, x.MethodName, x.InstrumentID, x.InstrumentName })
// Select all attributes and collect units together in a list
.Select(g => new TargetView
{
id = g.Max(i => i.id),
AnalyteID = g.Key.AnalyteID,
AnalyteName = g.Key.AnalyteName,
MethodID = g.Key.MethodID,
MethodName = g.Key.MethodName,
InstrumentID = g.Key.InstrumentID,
InstrumentName = g.Key.InstrumentName,
TargetMean = g.Max(i => i.TargetMean),
UnitID = g.Max(i => i.UnitID),
UnitDescription = g.Max(i => i.UnitDescription),
// only extract units when target mean is 0
Units = g.Where(y => y.TargetMean == 0)
.Select(c => new Unit { ID = c.UnitID, Description = c.UnitDescription }).ToList()
}).ToList();
Note: The Max method is used to extract any required non-key attributes, such as the TargetMean/id. This works fine because only one row will ever be returned if a TargetMean exists.
It does feel 'dirty' to use the Max method in order to obtain all other non-key attributes though so if anyone has any other suggestions, please feel free to drop an answer/comment as I am interested to see if there are any cleaner ways of achieving the same result.
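One alternative to calling Max for every non-key attribute is to pick a single representative row per group and read the remaining values from it. The sketch below assumes LINQ to Objects. The TargetRow record and the sample values are hypothetical and only mirror the fields described above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rows as they might come back from the database.
var rows = new List<TargetRow>
{
    new TargetRow(1, 10, 100, 1000, 1, "mg/L", 0m),
    new TargetRow(2, 10, 100, 1000, 2, "g/L", 0m),
    new TargetRow(3, 20, 100, 1000, 1, "mg/L", 4.2m),
};

var grouped = rows
    .GroupBy(r => new { r.AnalyteID, r.MethodID, r.InstrumentID })
    .Select(g =>
    {
        // Prefer the row that carries a target; otherwise any row will do.
        var representative = g.FirstOrDefault(r => r.TargetMean != 0) ?? g.First();
        return new
        {
            g.Key.AnalyteID,
            representative.TargetMean,
            representative.UnitID,
            // Only collect the available units when no target has been assigned.
            Units = representative.TargetMean != 0
                ? new List<string>()
                : g.Select(r => r.UnitDescription).ToList()
        };
    })
    .ToList();

foreach (var t in grouped)
{
    Console.WriteLine($"Analyte {t.AnalyteID}: target {t.TargetMean}, {t.Units.Count} unit(s) listed");
}

record TargetRow(int Id, int AnalyteID, int MethodID, int InstrumentID,
                 int UnitID, string UnitDescription, decimal TargetMean);
```

This keeps the intent explicit (the row with a target wins) instead of relying on Max happening to return the right value for each column.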
