Very slow runtime with Entity Framework nested loop (using nav properties) - c#

Right now, I'm trying to write a method for a survey submission program that utilizes a very normalized schema.
I have a method that is meant to generate a survey for a team of people, linking several different EF models together in the process. However, this method runs EXTREMELY slowly for anything but the smallest team sizes (taking 11.2 seconds to execute for a 4-person team, and a whopping 103.9 seconds for an 8-person team). After some analysis, I found that 75% of the runtime is taken up in the following block of code:
var TeamMembers = db.TeamMembers.Where(m => m.TeamID == TeamID && m.OnTeam).ToList();
foreach (TeamMember TeamMember in TeamMembers)
{
Employee employee = db.Employees.Find(TeamMember.EmployeeID);
SurveyForm form = new SurveyForm();
form.Submitter = employee;
form.State = "Not Submitted";
form.SurveyGroupID = surveygroup.SurveyGroupID;
db.SurveyForms.Add(form);
db.SaveChanges();
foreach (TeamMember peer in TeamMembers)
{
foreach (SurveySectionDetail SectionDetail in sectionDetails)
{
foreach (SurveyAttributeDetail AttributeDetail in attributeDetails.Where(a => a.SectionDetail.SurveySectionDetailID == SectionDetail.SurveySectionDetailID) )
{
SurveyAnswer answer = new SurveyAnswer();
answer.Reviewee = peer;
answer.SurveyFormID = form.SurveyFormID;
answer.Detail = AttributeDetail;
answer.SectionDetail = SectionDetail;
db.SurveyAnswers.Add(answer);
db.SaveChanges();
}
}
}
}
I'm really at a loss as to how I might go about cutting back the runtime. Is this just the price I pay for having this many related entities? I know that joins are expensive operations, and that I've essentially got 3. Or is there some inefficiency that I'm overlooking?
Thanks for your help!
EDIT: As requested by Xiaoy312, here's how sectionDetails and attributeDetails are defined:
SurveyTemplate template = db.SurveyTemplates.Find(SurveyTemplateID);
List<SurveySectionDetail> sectionDetails = new List<SurveySectionDetail>();
List<SurveyAttributeDetail> attributeDetails = new List<SurveyAttributeDetail>();
foreach (SurveyTemplateSection section in template.SurveyTemplateSections)
{
SurveySectionDetail SectionDetail = new SurveySectionDetail();
SectionDetail.SectionName = section.SectionName;
SectionDetail.SectionOrder = section.SectionOrder;
SectionDetail.Description = section.Description;
SectionDetail.SurveyGroupID = surveygroup.SurveyGroupID;
db.SurveySectionDetails.Add(SectionDetail);
sectionDetails.Add(SectionDetail);
db.SaveChanges();
foreach (SurveyTemplateAttribute attribute in section.SurveyTemplateAttributes)
{
SurveyAttributeDetail AttributeDetail = new SurveyAttributeDetail();
AttributeDetail.AttributeName = attribute.AttributeName;
AttributeDetail.AttributeScale = attribute.AttributeScale;
AttributeDetail.AttributeType = attribute.AttributeType;
AttributeDetail.AttributeOrder = attribute.AttributeOrder;
AttributeDetail.SectionDetail = SectionDetail;
db.SurveyAttributeDetails.Add(AttributeDetail);
attributeDetails.Add(AttributeDetail);
db.SaveChanges();
}
}

There are several points that you can improve:
Do not call SaveChanges() on each Add():
foreach (TeamMember TeamMember in TeamMembers)
{
...
// db.SaveChanges();
foreach (TeamMember peer in TeamMembers)
{
foreach (SurveySectionDetail SectionDetail in sectionDetails)
{
foreach (SurveyAttributeDetail AttributeDetail in attributeDetails.Where(a => a.SectionDetail.SurveySectionDetailID == SectionDetail.SurveySectionDetailID) )
{
...
// db.SaveChanges();
}
}
}
db.SaveChanges();
}
Consider reducing the number of round trips to the database. This can be done by either of the following (note that both are memory-intensive):
using Include() to preload your navigation properties; or
caching part or all of a table with ToDictionary() or ToLookup()
Instead of Add(), use AddRange(), or even BulkInsert() from EntityFramework.BulkInsert if that fits your setup:
db.SurveyAnswers.AddRange(
TeamMembers.SelectMany(p =>
sectionDetails.SelectMany(s =>
attributeDetails.Where(a => a.SectionDetail.SurveySectionDetailID == s.SurveySectionDetailID)
.Select(a => new SurveyAnswer()
{
Reviewee = p,
SurveyFormID = form.SurveyFormID,
Detail = a,
SectionDetail = s,
}))));
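To put rough numbers on the first point: in the question's loop, SaveChanges() runs once per form and once per answer, so the number of database round trips grows with the square of the team size. The sketch below only does the arithmetic. The class name `RoundTripEstimate` and the figure of 20 attributes per survey are assumptions for illustration, not values from the question.

```csharp
using System;

public static class RoundTripEstimate
{
    // SaveChanges() calls made by the original loop: one per form,
    // plus one per answer (members x peers x attributes).
    public static long PerRowSaves(int members, int attributes)
    {
        return members + (long)members * members * attributes;
    }

    // SaveChanges() calls when the save is moved to the end of each
    // outer iteration, as in the first snippet of this answer.
    public static long PerMemberSaves(int members)
    {
        return members;
    }
}

// Example with an assumed 20 attributes per survey:
//   RoundTripEstimate.PerRowSaves(4, 20)  -> 324
//   RoundTripEstimate.PerRowSaves(8, 20)  -> 1288
//   RoundTripEstimate.PerMemberSaves(8)   -> 8
```

One caveat when removing the inner saves: if SurveyFormID is generated by the database, it is only assigned when SaveChanges() runs, so answers built before the save cannot rely on form.SurveyFormID. Linking the answer to the form through a navigation property, if the model has one, avoids depending on the generated key.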

Use Include to avoid the SELECT N + 1 issue.
SurveyTemplate template = db.SurveyTemplates.Include("SurveyTemplateSections")
.Include("SurveyTemplateSections.SurveyTemplateAttributes")
.First(x=> x.SurveyTemplateID == SurveyTemplateID);
Generate the whole object graph and then save to DB.
List<SurveySectionDetail> sectionDetails = new List<SurveySectionDetail>();
List<SurveyAttributeDetail> attributeDetails = new List<SurveyAttributeDetail>();
foreach (SurveyTemplateSection section in template.SurveyTemplateSections)
{
SurveySectionDetail SectionDetail = new SurveySectionDetail();
//Some code
sectionDetails.Add(SectionDetail);
foreach (SurveyTemplateAttribute attribute in section.SurveyTemplateAttributes)
{
SurveyAttributeDetail AttributeDetail = new SurveyAttributeDetail();
//some code
attributeDetails.Add(AttributeDetail);
}
}
db.SurveySectionDetails.AddRange(sectionDetails);
db.SurveyAttributeDetails.AddRange(attributeDetails);
db.SaveChanges();
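Load all the employees you need before the loop; this avoids a database query for every team member.
// Note: the question's model links members to employees through EmployeeID; select that key here if the two IDs differ.
var teamMemberIds = db.TeamMembers.Where(m => m.TeamID == TeamID && m.OnTeam)
.Select(x=>x.TeamMemberId).ToList();
// ToList() runs the query once; without it, every employees.First(...) below would query the database again.
var employees = db.Employees.Where(x => teamMemberIds.Contains(x.EmployeeId)).ToList();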
Create a dictionary for attributeDetails keyed by their sectionDetailId, to avoid querying the list on every iteration.
var attributeDetailsGroupBySection = attributeDetails.GroupBy(x => x.SectionDetailId)
.ToDictionary(x => x.Key, x => x);
Move saving of SurveyAnswers and SurveyForms to outside of the loops:
List<SurveyForm> forms = new List<SurveyForm>();
List<SurveyAnswer> answers = new List<SurveyAnswer>();
foreach (int teamMemberId in teamMemberIds)
{
var employee = employees.First(x => x.Id == teamMemberId);
SurveyForm form = new SurveyForm();
//some code
forms.Add(form);
foreach (int peer in teamMemberIds)
{
foreach (SurveySectionDetail SectionDetail in sectionDetails)
{
foreach (SurveyAttributeDetail AttributeDetail in
attributeDetailsGroupBySection[SectionDetail.Id])
{
SurveyAnswer answer = new SurveyAnswer();
//some code
answers.Add(answer);
}
}
}
}
db.SurveyAnswers.AddRange(answers);
db.SurveyForms.AddRange(forms);
db.SaveChanges();
Finally, if you want faster insertions, you can use EntityFramework.BulkInsert. With this extension, you can save the data like this:
db.BulkInsert(answers);
db.BulkInsert(forms);

Related

What do I use with LINQ in order to READ the database instead of SqlDataReader?

I'm trying to use LINQ for the first time and don't understand how to read through the database. I'm sure I am doing something completely stupid and just need a little guidance.
year = year.Where(x => x.statementYear == response[i]).ToList();
SqlDataReader reader;
while (reader.Read())
{
}
Edited question with more info:
This is what I have been trying to get to work. I don't think I need the con.Open() either if I am using the using statement, right?
using (SqlConnection con = new SqlConnection(".."))
{
List<string> paths = new List<string>();
// Open Connection
con.Open();
if (response != null)
{
for (var i = 0; i < response.Length; i++)
{
if (response != null)
{
using (var db = new db())
{
List<ClientStatement_Inventory> years = new List<ClientStatement_Inventory>();
years = years.Where(x => x.statementYear == response[i]).ToList();
foreach (var year in years)
{
paths.Add(year.statementPath);
}
}
}
}
}
}
When using Linq, there's no use or point in using SqlDataReader....
You really didn't show much to go on - but basically, with Linq, you should have a DbContext (that's your "connection" to the database), and that DbContext should contain any number of DbSet - basically representing the tables in your database. You select from these DbSet's and then you get back a List<Entity> which you just iterate over.
Something along the lines of:
// select your customers matching a certain criteria
var customers = NorthwindDbContext.Customers.Where(x => x.statementYear == response[i]).ToList();
// iterate over your customers
foreach (Customer c in customers)
{
// do whatever with your "Customer" here
}
UPDATE:
from your updated question - all you really need is:
List<string> paths = new List<string>();
using (var db = new FMBDBPRDEntities1())
{
List<ClientStatement_Inventory> years = new List<ClientStatement_Inventory>();
years = years.Where(x => x.statementYear == response[i]).ToList();
foreach (var year in years)
{
paths.Add(year.statementPath);
}
}
That opens the context to access the database, reads some data into a list, iterates over the elements in the list - done. Everything else is useless and can be just deleted.
And you could write this a lot simpler, too:
using (var db = new FMBDBPRDEntities1())
{
List<string> paths = years.Where(x => x.statementYear == response[i])
.Select(y => y.statementPath).ToList();
}
UPDATE #2: if your response is a collection of values (I'm assuming those would be int values), and you need to iterate over it - try something like this:
if (response != null)
{
List<string> allPaths = new List<string>();
using (var db = new FMBDBPRDEntities1())
{
foreach (int aYear in response)
{
List<string> paths = years.Where(x => x.statementYear == aYear)
.Select(y => y.statementPath).ToList();
allPaths.AddRange(paths);
}
}
return allPaths; // or do whatever with your total results
}
UPDATE #3: just wondering - it seems you never really access the DbContext at all ..... I'm just guessing here - but something along the lines of:
if (response != null)
{
List<string> allPaths = new List<string>();
using (var db = new FMBDBPRDEntities1())
{
foreach (int aYear in response)
{
// access a DbSet from your DbContext here!
// To actually fetch data from the database.....
List<string> paths = db.ClientStatement_Inventory
.Where(x => x.statementYear == aYear)
.Select(y => y.statementPath).ToList();
allPaths.AddRange(paths);
}
}
return allPaths; // or do whatever with your total results
}
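The loop in UPDATE #3 still runs one query per year. A possible refinement is to filter with Contains() so that all years are matched in a single pass. The sketch below runs over an in-memory list so it can be tried without a database; `StatementRow` and `PathLookup` are hypothetical names, and statementYear is assumed to be an int, as in the answer above. With Entity Framework, the same Where clause applied to db.ClientStatement_Inventory is normally translated into one SQL query with an IN clause.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the ClientStatement_Inventory entity.
public class StatementRow
{
    public int statementYear { get; set; }
    public string statementPath { get; set; }
}

public static class PathLookup
{
    // Returns the paths of every row whose year is in the requested set.
    // Results follow the order of the rows, not the order of the years.
    public static List<string> PathsForYears(IEnumerable<StatementRow> rows, IEnumerable<int> years)
    {
        List<int> wanted = years.ToList();
        return rows
            .Where(r => wanted.Contains(r.statementYear))
            .Select(r => r.statementPath)
            .ToList();
    }
}
```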

How to write LINQ query for fetching the specific records and generating new result at the same time?

I have to update one field in a row of a table after fetching two values from that same row. As the simplest approach, I fetched the two values individually, computed a new value, and then updated that particular property through Entity Framework. I think there is a better way to do the same thing with less code. Any suggestions would be appreciated.
if (objModel.amountpaid==0)
{
using (estatebranchEntities db=new estatebranchEntities())
{
int rentVar = Convert.ToInt32(db.PropertyDetails.Where(m => m.propertyid == objVM.propertyid).Select(m => m.rent).SingleOrDefault());
int balanceVar = Convert.ToInt32(db.PropertyDetails.Where(m => m.propertyid == objVM.propertyid).Select(m => m.balance).SingleOrDefault());
int balanceUpdateVar = (rentVar + balanceVar);
var propInfo = new PropertyDetail() { balance = balanceUpdateVar };
//var result = (from a in db.PropertyDetails
// where a.propertyid == objVM.propertyid
// select new PropertyDetail
// {
// rent = a.rent,
// balance = a.balance
// }).ToList();
db.PropertyDetails.Attach(propInfo);
db.Entry(propInfo).Property(z => z.balance).IsModified = true;
db.SaveChanges();
}
}
Here is what I think you can do.
Fetch the data once and update once.
using (estatebranchEntities db=new estatebranchEntities())
{
var propDetails = db.PropertyDetails.FirstOrDefault(m => m.propertyid == objVM.propertyid);
if (propDetails != null)
{
int rentVar = Convert.ToInt32(propDetails.rent);
int balanceVar = Convert.ToInt32(propDetails.balance);
int balanceUpdateVar = rentVar + balanceVar;
//now do the update
propDetails.balance = balanceUpdateVar;
db.Entry(propDetails).State = EntityState.Modified;
db.SaveChanges();
}
}
If you need to use rentVar, balanceVar or balanceUpdateVar outside of the using statement, then declare them outside it.

Entity Framework M:1 relationship resulting in primary key duplication

I'm somewhat new to EF 6.0, so I'm pretty sure I'm doing something wrong here.
There are two questions related to the problem:
What am I doing wrong here?
What's the best practice to achieve this?
I'm using a code first model, and used the edmx designer to design the model and relationships. The system needs to pull information periodically from a web service and save it to a local database (SQLite) in a desktop application.
I get an order list from the API. When I populate and try to save a Ticket, I get a duplicate key exception when trying to insert TicketSeatType.
How do I insert the ticket into the dbContext so that it doesn't try to re-insert TicketSeatType and TicketPriceType? I have tried setting the child object states to Unchanged, but it still seems to insert them.
Secondly, what would be the best practice to achieve this using EF? It looks very inefficient to load each object into memory and compare whether it exists or not.
Since I need to update the listing periodically, I have to check each object against the database: if it exists, update it; otherwise insert it.
code:
//read session from db
if (logger.IsDebugEnabled) logger.Debug("reading session from db");
dbSession = dbContext.SessionSet.Where(x => x.Id == sessionId).FirstOrDefault();
//populate orders
List<Order> orders = (from e in ordersList
select new Order {
Id = e.OrderId,
CallCentreNotes = e.CallCentreNotes,
DoorEntryCount = e.DoorEntryCount,
DoorEntryTime = e.DoorEntryTime,
OrderDate = e.OrderDate,
SpecialInstructions = e.SpecialInstructions,
TotalValue = e.TotalValue,
//populate parent reference
Session = dbSession
}).ToList();
//check and save order
foreach (var o in orders) {
dbOrder = dbContext.OrderSet.Where(x => x.Id == o.Id).FirstOrDefault();
if (dbOrder != null) {
dbContext.Entry(dbOrder).CurrentValues.SetValues(o);
dbContext.Entry(dbOrder).State = EntityState.Modified;
}
else {
dbContext.OrderSet.Add(o);
dbContext.Entry(o.Session).State = EntityState.Unchanged;
}
}
dbContext.SaveChanges();
//check and add ticket seat type
foreach (var o in ordersList) {
foreach (var t in o.Tickets) {
var ticketSeatType = new TicketSeatType {
Id = t.TicketSeatType.TicketSeatTypeId,
Description = t.TicketSeatType.Description
};
dbTicketSeatType = dbContext.TicketSeatTypeSet.Where(x => x.Id == ticketSeatType.Id).FirstOrDefault();
if (dbTicketSeatType != null) {
dbContext.Entry(dbTicketSeatType).CurrentValues.SetValues(ticketSeatType);
dbContext.Entry(dbTicketSeatType).State = EntityState.Modified;
}
else {
if (!dbContext.ChangeTracker.Entries<TicketSeatType>().Any(x => x.Entity.Id == ticketSeatType.Id)) {
dbContext.TicketSeatTypeSet.Add(ticketSeatType);
}
}
}
}
dbContext.SaveChanges();
//check and add ticket price type
foreach (var o in ordersList) {
foreach (var t in o.Tickets) {
var ticketPriceType = new TicketPriceType {
Id = t.TicketPriceType.TicketPriceTypeId,
SeatCount = t.TicketPriceType.SeatCount,
Description = t.TicketPriceType.Description
};
dbTicketPriceType = dbContext.TicketPriceTypeSet.Where(x => x.Id == ticketPriceType.Id).FirstOrDefault();
if (dbTicketPriceType != null) {
dbContext.Entry(dbTicketPriceType).CurrentValues.SetValues(ticketPriceType);
dbContext.Entry(dbTicketPriceType).State = EntityState.Modified;
}
else {
if (!dbContext.ChangeTracker.Entries<TicketPriceType>().Any(x => x.Entity.Id == ticketPriceType.Id)) {
dbContext.TicketPriceTypeSet.Add(ticketPriceType);
}
}
}
}
dbContext.SaveChanges();
//check and add tickets
foreach (var o in ordersList) {
dbOrder = dbContext.OrderSet.Where(x => x.Id == o.OrderId).FirstOrDefault();
foreach (var t in o.Tickets) {
var ticket = new Ticket {
Id = t.TicketId,
Quantity = t.Quantity,
TicketPrice = t.TicketPrice,
TicketPriceType = new TicketPriceType {
Id = t.TicketPriceType.TicketPriceTypeId,
Description = t.TicketPriceType.Description,
SeatCount = t.TicketPriceType.SeatCount,
},
TicketSeatType = new TicketSeatType {
Id = t.TicketSeatType.TicketSeatTypeId,
Description = t.TicketSeatType.Description
},
Order = dbOrder
};
//check from db
dbTicket = dbContext.TicketSet.Where(x => x.Id == t.TicketId).FirstOrDefault();
dbTicketSeatType = dbContext.TicketSeatTypeSet.Where(x => x.Id == t.TicketSeatType.TicketSeatTypeId).FirstOrDefault();
dbTicketPriceType = dbContext.TicketPriceTypeSet.Where(x => x.Id == t.TicketPriceType.TicketPriceTypeId).FirstOrDefault();
if (dbTicket != null) {
dbContext.Entry(dbTicket).CurrentValues.SetValues(t);
dbContext.Entry(dbTicket).State = EntityState.Modified;
dbContext.Entry(dbTicket.Order).State = EntityState.Unchanged;
dbContext.Entry(dbTicketSeatType).State = EntityState.Unchanged;
dbContext.Entry(dbTicketPriceType).State = EntityState.Unchanged;
}
else {
dbContext.TicketSet.Add(ticket);
dbContext.Entry(ticket.Order).State = EntityState.Unchanged;
dbContext.Entry(ticket.TicketSeatType).State = EntityState.Unchanged;
dbContext.Entry(ticket.TicketPriceType).State = EntityState.Unchanged;
}
}
}
dbContext.SaveChanges();
UPDATE:
Found the answer, it has to do with how EF tracks references to objects, in the above code, I was creating new entity types from the list for TicketPriceType and TicketSeatType:
foreach (var o in ordersList) {
dbOrder = dbContext.OrderSet.Where(x => x.Id == o.OrderId).FirstOrDefault();
foreach (var t in o.Tickets) {
var ticket = new Ticket {
Id = t.TicketId,
Quantity = t.Quantity,
TicketPrice = t.TicketPrice,
TicketPriceType = new TicketPriceType {
Id = t.TicketPriceType.TicketPriceTypeId,
Description = t.TicketPriceType.Description,
SeatCount = t.TicketPriceType.SeatCount,
},
TicketSeatType = new TicketSeatType {
Id = t.TicketSeatType.TicketSeatTypeId,
Description = t.TicketSeatType.Description
},
Order = dbOrder
};
....
In this case EF doesn't know that these objects already exist, so it tries to insert them.
The solution is to read the entities from the database and assign those, so the ticket references the same tracked entities and no new ones are added:
foreach (var t in o.Tickets) {
//check from db
dbTicket = dbContext.TicketSet.Where(x => x.Id == t.TicketId).FirstOrDefault();
dbTicketSeatType = dbContext.TicketSeatTypeSet.Where(x => x.Id == t.TicketSeatType.TicketSeatTypeId).FirstOrDefault();
dbTicketPriceType = dbContext.TicketPriceTypeSet.Where(x => x.Id == t.TicketPriceType.TicketPriceTypeId).FirstOrDefault();
var ticket = new Ticket {
Id = t.TicketId,
Quantity = t.Quantity,
TicketPrice = t.TicketPrice,
TicketPriceType = dbTicketPriceType,
TicketSeatType = dbTicketSeatType,
Order = dbOrder
};
...}
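The fix works because every ticket now points at the same object instance for a given key. The sketch below isolates that idea with a small cache that hands out exactly one instance per Id, which would also avoid querying the database for every ticket. `SeatType` and `SeatTypeCache` are hypothetical names; in real code the factory passed to GetOrAdd would load the row from dbContext, or create and Add it if it does not exist yet.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the TicketSeatType entity.
public class SeatType
{
    public int Id { get; set; }
    public string Description { get; set; }
}

public class SeatTypeCache
{
    private readonly Dictionary<int, SeatType> _byId = new Dictionary<int, SeatType>();

    // Returns the instance already registered for this key, or calls the
    // factory once and remembers its result for later requests.
    public SeatType GetOrAdd(int id, Func<SeatType> create)
    {
        SeatType existing;
        if (!_byId.TryGetValue(id, out existing))
        {
            existing = create();
            _byId[id] = existing;
        }
        return existing;
    }
}
```

Two tickets that ask the cache for seat type 7 receive the same reference, so a context tracking that instance sees one entity instead of two new ones with the same key.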
Don't you think you are writing very similar code to define the state of each entity?
We can handle all of these operations with a single command.
You can easily achieve this with the newly released EntityGraphOperations for Entity Framework Code First. I am the author of this product, and I have published it on GitHub, CodeProject (including a step-by-step demonstration and a sample project ready for download) and NuGet. With the InsertOrUpdateGraph method, it will automatically set your entities as Added or Modified. And with the DeleteMissingEntities method, you can delete those entities that exist in the database but not in the current collection.
// This will set the state of the main entity and all of it's navigational
// properties as `Added` or `Modified`.
context.InsertOrUpdateGraph(ticket);
By the way, I feel the need to mention that this wouldn't be the most efficient way of course. The general idea is to get the desired entity from the database and define the state of the entity. It would be as efficient as possible.

How can I ensure rows are not loaded twice with EF / LINQ

I created code to load definitions from an external API. The code iterates through a list of words, looks up a definition for each and then I thought to use EF to insert these into my SQL Server database.
However if I run this twice it will load the same definitions the second time. Is there a way that I could make it so that EF does not add the row if it already exists?
public IHttpActionResult LoadDefinitions()
{
var words = db.Words
.AsNoTracking()
.ToList();
foreach (var word in words)
{
HttpResponse<string> response = Unirest.get("https://wordsapiv1.p.mashape.com/words/" + word)
.header("X-Mashape-Key", "xxxx")
.header("Accept", "application/json")
.asJson<string>();
RootObject rootObject = JsonConvert.DeserializeObject<RootObject>(response.Body);
var results = rootObject.results;
foreach (var result in results)
{
var definition = new WordDefinition()
{
WordId = word.WordId,
Definition = result.definition
};
db.WordDefinitions.Add(definition);
}
db.SaveChanges();
}
return Ok();
}
Also would appreciate if anyone has any suggestions as to how I could better implement this loading.
foreach (var result in results)
{
if(!(from d in db.WordDefinitions where d.Definition == result.definition select d).Any())
{
var definition = new WordDefinition()
{
WordId = word.WordId,
Definition = result.definition
};
db.WordDefinitions.Add(definition);
}
}
You can search for the Definition value.
var wd = db.WordDefinitions.FirstOrDefault(x => x.Definition == result.definition);
if(wd == null) {
var definition = new WordDefinition() {
WordId = word.WordId,
Definition = result.definition
};
db.WordDefinitions.Add(definition);
}
This way you find any WordDefinition that already has your value.
You can also use WordId in the same way:
var wd = db.WordDefinitions.FirstOrDefault(x => x.WordId == word.WordId);
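Both checks above issue one query per definition, and checking WordId alone would skip every new definition for a word that already has one. A possible alternative is to load the existing (WordId, Definition) pairs once and test membership in memory. The sketch below shows only the in-memory part; `DefinitionFilter` is a hypothetical helper, and filling the set from db.WordDefinitions is left out as an assumption about the model.

```csharp
using System;
using System.Collections.Generic;

public static class DefinitionFilter
{
    // existing: (WordId, Definition) pairs already stored in the database.
    // incoming: definitions returned by the API for one word.
    // Returns only the definitions that still need to be inserted and
    // records them in the set, so repeats within one response are skipped too.
    public static List<string> NewDefinitions(
        HashSet<Tuple<int, string>> existing, int wordId, IEnumerable<string> incoming)
    {
        var toInsert = new List<string>();
        foreach (string text in incoming)
        {
            // HashSet.Add returns false when the pair is already present.
            if (existing.Add(Tuple.Create(wordId, text)))
            {
                toInsert.Add(text);
            }
        }
        return toInsert;
    }
}
```

An in-memory check like this does not protect against two processes loading at the same time; a unique index on the two columns is the usual safeguard for that case.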

Property Mapping with Reflection efficiency

I have an object that I am populating via mapping and lookup with Reflection. Here is the call:
this.SearchResults = (from a in response.postings
select new SearchResponseModel
{
Id = a.id,
TimeStampDate = a.timestampDate,
Body = a.body,
Title = a.heading,
Status = a.status,
State = a.state,
Language = a.language,
Currency = a.currency,
CategoryGroup = a.category_group,
Source = a.source,
ExternalId = a.external_id,
ExternalUrl = a.external_url,
Price = a.price,
Location = PopulateLocation(a.location)
}
).ToList();
And the method that does the mapping.
private static List<LocationLookupModel> PopulateLocation(Location location)
{
List<LocationLookupModel> allLocations = new List<LocationLookupModel>();
if (HttpContext.Current.Session["LocationModel"] == null)
{
HttpContext.Current.Session["LocationModel"] = allLocations = new LocationModel().LocationList;
}
else
{
allLocations = (List<LocationLookupModel>)HttpContext.Current.Session["LocationModel"];
}
List<LocationLookupModel> modelList = new List<LocationLookupModel>();
foreach (PropertyInfo propertyInfo in location.GetType().GetProperties())
{
var value = propertyInfo.GetValue(location);
if (value != null)
{
LocationLookupModel model = (from a in allLocations
where a.Code == propertyInfo.GetValue(location).ToString()
select a).FirstOrDefault();
if (model != null)
{
modelList.Add(model);
}
}
}
return modelList;
}
}
The issue that I run into is that allLocations object has about 70k records (it represents a list of location lookup values for countries, states, zipcodes, etc), and populating about 100 instances of SearchResponseModel takes about 20 seconds. This is far too long for a UI call, and I have not been able to find a way to make it faster. I understand I am basically doing 3 nested loops (calling helper method for each population, looping over reflected properties, and finally the LINQ call over 70k records) so there are some time efficiency issues, but I am a bit lost on what tricks I can use to make this process more efficient.
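One common way to remove the innermost loop is to index the lookup list by Code once, so that each property value costs a dictionary lookup instead of a scan over 70k records. The sketch below is a minimal illustration under assumptions: the `Location` property names are invented, `LocationMapper` is a hypothetical class, and the index is assumed to be built once and cached (for example alongside the list in the session) rather than rebuilt per call.

```csharp
using System.Collections.Generic;
using System.Reflection;

public class LocationLookupModel
{
    public string Code { get; set; }
    public string Name { get; set; }
}

// Property names here are assumptions for illustration.
public class Location
{
    public string country { get; set; }
    public string state { get; set; }
    public string zipcode { get; set; }
}

public static class LocationMapper
{
    // Reflection metadata is read once instead of on every call.
    private static readonly PropertyInfo[] LocationProperties = typeof(Location).GetProperties();

    // Builds the index once; the first entry per code wins, matching FirstOrDefault().
    public static Dictionary<string, LocationLookupModel> BuildIndex(IEnumerable<LocationLookupModel> all)
    {
        var index = new Dictionary<string, LocationLookupModel>();
        foreach (LocationLookupModel model in all)
        {
            if (model.Code != null && !index.ContainsKey(model.Code))
            {
                index.Add(model.Code, model);
            }
        }
        return index;
    }

    public static List<LocationLookupModel> PopulateLocation(
        Location location, IDictionary<string, LocationLookupModel> index)
    {
        var result = new List<LocationLookupModel>();
        foreach (PropertyInfo property in LocationProperties)
        {
            object value = property.GetValue(location, null);
            if (value == null) continue;

            LocationLookupModel model;
            if (index.TryGetValue(value.ToString(), out model))
            {
                result.Add(model);
            }
        }
        return result;
    }
}
```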
