How can I ensure rows are not loaded twice with EF / LINQ

I created code to load definitions from an external API. The code iterates through a list of words, looks up a definition for each and then I thought to use EF to insert these into my SQL Server database.
However if I run this twice it will load the same definitions the second time. Is there a way that I could make it so that EF does not add the row if it already exists?
public IHttpActionResult LoadDefinitions()
{
var words = db.Words
.AsNoTracking()
.ToList();
foreach (var word in words)
{
HttpResponse<string> response = Unirest.get("https://wordsapiv1.p.mashape.com/words/" + word)
.header("X-Mashape-Key", "xxxx")
.header("Accept", "application/json")
.asJson<string>();
RootObject rootObject = JsonConvert.DeserializeObject<RootObject>(response.Body);
var results = rootObject.results;
foreach (var result in results)
{
var definition = new WordDefinition()
{
WordId = word.WordId,
Definition = result.definition
};
db.WordDefinitions.Add(definition);
}
db.SaveChanges();
}
return Ok();
}
Also would appreciate if anyone has any suggestions as to how I could better implement this loading.

foreach (var result in results)
{
if(!(from d in db.WordDefinitions where d.Definition == result.definition select d).Any())
{
var definition = new WordDefinition()
{
WordId = word.WordId,
Definition = result.definition
};
db.WordDefinitions.Add(definition);
}
}

You can search by the Definition value:
var wd = db.WordDefinitions.FirstOrDefault(x => x.Definition == result.definition);
if(wd == null) {
var definition = new WordDefinition() {
WordId = word.WordId,
Definition = result.definition
};
db.WordDefinitions.Add(definition);
}
This way you only add a WordDefinition when no existing row already has that value.
You can also match on WordId in the same way:
var wd = db.WordDefinitions.FirstOrDefault(x => x.WordId == word.WordId);
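Both answers above run one query per definition. A minimal sketch of an alternative, assuming plain in-memory collections stand in for db.WordDefinitions: load the existing (WordId, Definition) pairs once into a HashSet and test each candidate against it. The names DefinitionLoader and FilterNew are hypothetical, and the WordDefinition class here is a simplified stand-in for the entity in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the WordDefinition entity in the question.
public class WordDefinition
{
    public int WordId { get; set; }
    public string Definition { get; set; }
}

// Hypothetical helper, not part of EF.
public static class DefinitionLoader
{
    // Returns only the candidates whose (WordId, Definition) pair is not
    // already in 'existing' and has not appeared earlier in 'candidates'.
    public static List<WordDefinition> FilterNew(
        IEnumerable<WordDefinition> existing,
        IEnumerable<WordDefinition> candidates)
    {
        // Build the set of known pairs once instead of querying per row.
        var seen = new HashSet<(int, string)>(
            existing.Select(d => (d.WordId, d.Definition)));

        var fresh = new List<WordDefinition>();
        foreach (var candidate in candidates)
        {
            // HashSet.Add returns false when the pair is already in the set.
            if (seen.Add((candidate.WordId, candidate.Definition)))
            {
                fresh.Add(candidate);
            }
        }
        return fresh;
    }
}
```

With EF, the existing rows for the current word could be materialized once and the filtered list passed to AddRange before a single SaveChanges. An in-memory check like this does not protect against two loads running at the same time; a unique index on the two columns in the database would be needed for that.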

Related

GROUP BY to List<> with Linq

I'm new to using Linq so I don't understand some things or its syntax. I want to group a list and then loop through it with foreach, like my logic below. Obviously my logic doesn't work.
My code:
var final = finalv.Union(finalc);
final = final.GroupBy(x => x.Clave);
foreach (var articulo in final)
{
Articulo articulo2 = new Articulo();
articulo2.ArtID = articulo.ArtID;
articulo2.Clave = articulo.Clave;
articulo2.ClaveAlterna = articulo.ClaveAlterna;
lista.Add(articulo2);
}

First, this usage matches the GroupBy<TSource,TKey>(IEnumerable<TSource>, Func<TSource,TKey>) overload, which returns an IEnumerable<IGrouping<TKey,TSource>>.
That means final.GroupBy(x => x.Clave) returns a sequence of groups (call it finalWithGrouped). For each group in that sequence, group.Key is the key and group.ToList() is the collection of all items that share that key (here, the same Clave). This is also why the result cannot be assigned back to final, which has a different element type.
And for your code, try this:
var final = finalv.Union(finalc);
var finalWithGrouped = final.GroupBy(x => x.Clave);
foreach (var articulosWithSameClavePair in finalWithGrouped)
{
var clave = articulosWithSameClavePair.Key;
var articulos = articulosWithSameClavePair.ToList();
foreach(var articulo in articulos)
{
Articulo articulo2 = new Articulo();
articulo2.ArtID = articulo.ArtID;
articulo2.Clave = articulo.Clave;
articulo2.ClaveAlterna = articulo.ClaveAlterna;
lista.Add(articulo2);
}
}
I suggest you read some examples of using GroupBy.
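As one such example, here is a small sketch that keeps a single item per key. It assumes the goal is one Articulo per distinct Clave, which the question does not state explicitly; the Articulo class is a simplified stand-in and GroupingDemo.FirstPerClave is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the Articulo type in the question.
public class Articulo
{
    public int ArtID { get; set; }
    public string Clave { get; set; }
    public string ClaveAlterna { get; set; }
}

public static class GroupingDemo
{
    // Groups by Clave and keeps the first item of each group.
    public static List<Articulo> FirstPerClave(IEnumerable<Articulo> source)
    {
        return source
            .GroupBy(a => a.Clave)      // IEnumerable<IGrouping<string, Articulo>>
            .Select(g => g.First())     // each group is itself enumerable
            .ToList();
    }
}
```

Each IGrouping exposes Key and can be enumerated directly, so calling ToList() on a group is only needed when a concrete list is required.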

When you group a list, you get back a key and a grouped list for each key, and you are trying to access a single item's property on that list.
After grouping, you can convert the result to a dictionary. It is not necessary, but I find it clearer. You can try this code:
var final = finalv.Union(finalc);
var grouped = final.GroupBy(x => x.Clave).ToDictionary(s => s.Key, s => s.ToList());
foreach (var articulo in grouped)
{
foreach (var articuloItem in articulo.Value)
{
Articulo articulo2 = new Articulo();
articulo2.ArtID = articuloItem.ArtID;
articulo2.Clave = articuloItem.Clave;
articulo2.ClaveAlterna = articuloItem.ClaveAlterna;
lista.Add(articulo2);
}
}

How to pass a list of strings through webapi and get the results without those strings?

My code already returns the rows whose column does not match a single string. How can I return the rows that do not match any string in a list? I want to get the result of SELECT * FROM table WHERE column NOT IN ('x' ,'y');
public IEnumerable<keyart1> Get(string keyword)
{
List<keyart1> keylist;
using (dbEntities5 entities = new dbEntities5())
{
keylist = entities.keyart1.Where(e => e.keyword != keyword).ToList();
var result = keylist.Distinct(new ItemEqualityComparer());
return result;
}
}

I think I found the answer, in case anybody is interested:
public IEnumerable<keyart1> Get([FromUri] string[] keyword1)
{
List<keyart1> keylist;
List<IEnumerable<keyart1>> ll;
using (dbEntities5 entities = new dbEntities5())
{
ll = new List<IEnumerable<keyart1>>();
foreach (var item in keyword1)
{
keylist = entities.keyart1.Where(e => e.keyword != item).ToList();
var result = keylist.Distinct(new ItemEqualityComparer());
ll.Add(result);
}
var intersection = ll.Aggregate((p, n) => p.Intersect(n).ToList());
return intersection;
}
}
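The loop-and-intersect approach above runs one query per keyword. A shorter sketch of the same NOT IN idea uses a negated Contains. It is shown here over in-memory strings; KeywordFilter and Exclude are hypothetical names. Against EF, a predicate such as e => !keyword1.Contains(e.keyword) on a local array is generally translated to a SQL NOT IN clause, though that is worth confirming for your provider and version.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper illustrating "NOT IN" with LINQ to Objects.
public static class KeywordFilter
{
    // Returns the distinct keywords that are not in the excluded list.
    public static List<string> Exclude(
        IEnumerable<string> keywords,
        IEnumerable<string> excluded)
    {
        // A set gives constant-time membership checks.
        var blocked = new HashSet<string>(excluded);

        return keywords
            .Where(k => !blocked.Contains(k))   // the NOT IN condition
            .Distinct()
            .ToList();
    }
}
```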

C# Calculate field inside LINQ Query

I need some help to calculate a property inside my Linq query.
I know I need to use "let" somewhere, but I can't figure it out!
So, first I have this method to get my list from Database:
public BindingList<Builders> GetListBuilders()
{
BindingList<Builders> builderList = new BindingList<Builders>();
var ctx = new IWMJEntities();
var query = (from l in ctx.tblBuilders
select new Builders
{
ID = l.BuilderID,
Projeto = l.NomeProjeto,
Status = l.Status,
DataPedido = l.DataPedido,
DataPendente = l.DataPendente,
DataEntregue = l.DataEntregue,
DataAnulado = l.DataAnulado
});
foreach (var list in query)
builderList.Add(list);
return builderList;
}
Then, I have a function to calculate the Days between Dates accordingly to Status:
public int GetDays()
{
int Dias = 0;
foreach (var record in GetListBuilders())
{
if (record.Status == "Recebido")
{
Dias = GetBusinessDays(record.DataPedido, DateTime.Now);
}
else if (record.Status == "Pendente")
{
Dias = GetBusinessDays(record.DataPedido, (DateTime)record.DataPendente);
}
else if (record.Status == "Entregue")
{
Dias = GetBusinessDays(record.DataPedido, (DateTime)record.DataEntregue);
}
else if (record.Status == "Anulado")
{
Dias = GetBusinessDays(record.DataPedido, (DateTime)record.DataAnulado);
}
}
return Dias;
}
I need to call the GetDays in a DataGridView to give the days for each record.
My main problem is how to get this value: should I include it in the LINQ query, or call GetDays() (passing the ID of each record to it)?
Any help?
Thanks

I think it would be easier to create an extension method:
public static int GetBusinessDays(this Builders builder) // or type of ctx.tblBuilders if not the same
{
if (builder == null) return 0;
switch (builder.Status)
{
case "Recebido": return GetBusinessDays(builder.DataPedido, DateTime.Now);
case "Pendente": return GetBusinessDays(builder.DataPedido, (DateTime)builder.DataPendente);
case "Entregue": return GetBusinessDays(builder.DataPedido, (DateTime)builder.DataEntregue);
case "Anulado": return GetBusinessDays(builder.DataPedido, (DateTime)builder.DataAnulado);
default: return 0;
}
}
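Then call it like this (note that LINQ to Entities in EF6 cannot translate a custom C# method, so the projection may need to run in memory, for example after AsEnumerable()):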
public BindingList<Builders> GetListBuilders()
{
BindingList<Builders> builderList = new BindingList<Builders>();
var ctx = new IWMJEntities();
var query = (from l in ctx.tblBuilders
select new Builders
{
ID = l.BuilderID,
Projeto = l.NomeProjeto,
Status = l.Status,
DataPedido = l.DataPedido,
DataPendente = l.DataPendente,
DataEntregue = l.DataEntregue,
DataAnulado = l.DataAnulado,
Dias = l.GetBusinessDays()
});
foreach (var list in query)
builderList.Add(list);
return builderList;
}
Better still, to convert one object into another, you should create a mapper.

Why does it need to be a part of the query? You can't execute C# code on the database. If you want the calculation to be done at the DB you could create a view.
Your query is executed as soon as the IQueryable is enumerated in the foreach loop. Why not just perform the calculation on each item as it is enumerated and set the property when you add each item to the list?
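A minimal sketch of that suggestion, computing the value per record after the data is in memory. The Builder class is a simplified stand-in for the question's Builders type. The question does not show GetBusinessDays, so the version here (counting Monday to Friday from the start date up to, but not including, the end date) is an assumption, as is falling back to the current time when a nullable date is missing.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for the Builders class in the question.
public class Builder
{
    public string Status { get; set; }
    public DateTime DataPedido { get; set; }
    public DateTime? DataPendente { get; set; }
    public DateTime? DataEntregue { get; set; }
    public DateTime? DataAnulado { get; set; }
    public int Dias { get; set; }
}

public static class BuilderDays
{
    // Assumed implementation: counts weekdays in [start, end).
    public static int GetBusinessDays(DateTime start, DateTime end)
    {
        int days = 0;
        for (var d = start.Date; d < end.Date; d = d.AddDays(1))
        {
            if (d.DayOfWeek != DayOfWeek.Saturday && d.DayOfWeek != DayOfWeek.Sunday)
            {
                days++;
            }
        }
        return days;
    }

    // Picks the end date that matches the record's status.
    public static int DaysFor(Builder b, DateTime now)
    {
        switch (b.Status)
        {
            case "Recebido": return GetBusinessDays(b.DataPedido, now);
            case "Pendente": return GetBusinessDays(b.DataPedido, b.DataPendente ?? now);
            case "Entregue": return GetBusinessDays(b.DataPedido, b.DataEntregue ?? now);
            case "Anulado": return GetBusinessDays(b.DataPedido, b.DataAnulado ?? now);
            default: return 0;
        }
    }

    // Sets Dias on each record as it is enumerated.
    public static void FillDias(IEnumerable<Builder> builders, DateTime now)
    {
        foreach (var b in builders)
        {
            b.Dias = DaysFor(b, now);
        }
    }
}
```

In GetListBuilders, the same assignment could happen inside the existing foreach just before builderList.Add(list), so the grid can bind to Dias like any other property.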

Very slow runtime with Entity Framework nested loop (using nav properties)

Right now, I'm trying to write a method for a survey submission program that utilizes a very normalized schema.
I have a method that is meant to generate a survey for a team of people, linking several different EF models together in the process. However, this method runs EXTREMELY slowly for anything but the smallest team sizes (taking 11.2 seconds to execute for a 4-person team, and a whopping 103.9 seconds for an 8-person team). After some analysis, I found that 75% of the runtime is taken up in the following block of code:
var TeamMembers = db.TeamMembers.Where(m => m.TeamID == TeamID && m.OnTeam).ToList();
foreach (TeamMember TeamMember in TeamMembers)
{
Employee employee = db.Employees.Find(TeamMember.EmployeeID);
SurveyForm form = new SurveyForm();
form.Submitter = employee;
form.State = "Not Submitted";
form.SurveyGroupID = surveygroup.SurveyGroupID;
db.SurveyForms.Add(form);
db.SaveChanges();
foreach (TeamMember peer in TeamMembers)
{
foreach (SurveySectionDetail SectionDetail in sectionDetails)
{
foreach (SurveyAttributeDetail AttributeDetail in attributeDetails.Where(a => a.SectionDetail.SurveySectionDetailID == SectionDetail.SurveySectionDetailID) )
{
SurveyAnswer answer = new SurveyAnswer();
answer.Reviewee = peer;
answer.SurveyFormID = form.SurveyFormID;
answer.Detail = AttributeDetail;
answer.SectionDetail = SectionDetail;
db.SurveyAnswers.Add(answer);
db.SaveChanges();
}
}
}
}
I'm really at a loss as to how I might go about cutting back the runtime. Is this just the price I pay for having this many related entities? I know that joins are expensive operations, and that I've essentially got 3. Or is there some inefficiency that I'm overlooking?
Thanks for your help!
EDIT: As requested by Xiaoy312, here's how sectionDetails and attributeDetails are defined:
SurveyTemplate template = db.SurveyTemplates.Find(SurveyTemplateID);
List<SurveySectionDetail> sectionDetails = new List<SurveySectionDetail>();
List<SurveyAttributeDetail> attributeDetails = new List<SurveyAttributeDetail>();
foreach (SurveyTemplateSection section in template.SurveyTemplateSections)
{
SurveySectionDetail SectionDetail = new SurveySectionDetail();
SectionDetail.SectionName = section.SectionName;
SectionDetail.SectionOrder = section.SectionOrder;
SectionDetail.Description = section.Description;
SectionDetail.SurveyGroupID = surveygroup.SurveyGroupID;
db.SurveySectionDetails.Add(SectionDetail);
sectionDetails.Add(SectionDetail);
db.SaveChanges();
foreach (SurveyTemplateAttribute attribute in section.SurveyTemplateAttributes)
{
SurveyAttributeDetail AttributeDetail = new SurveyAttributeDetail();
AttributeDetail.AttributeName = attribute.AttributeName;
AttributeDetail.AttributeScale = attribute.AttributeScale;
AttributeDetail.AttributeType = attribute.AttributeType;
AttributeDetail.AttributeOrder = attribute.AttributeOrder;
AttributeDetail.SectionDetail = SectionDetail;
db.SurveyAttributeDetails.Add(AttributeDetail);
attributeDetails.Add(AttributeDetail);
db.SaveChanges();
}
}

There are several points you can improve:
Do not call SaveChanges() on each Add():
foreach (TeamMember TeamMember in TeamMembers)
{
...
// db.SaveChanges();
foreach (TeamMember peer in TeamMembers)
{
foreach (SurveySectionDetail SectionDetail in sectionDetails)
{
foreach (SurveyAttributeDetail AttributeDetail in attributeDetails.Where(a => a.SectionDetail.SurveySectionDetailID == SectionDetail.SurveySectionDetailID) )
{
...
// db.SaveChanges();
}
}
}
db.SaveChanges();
}
Consider reducing the number of round trips to the database. This can be done by:
using Include() to preload your navigation properties; or
caching part or all of the table with ToDictionary() or ToLookup() (note that these are memory-intensive)
Instead of Add(), use AddRange() or even BulkInsert() from EntityFramework.BulkInsert if that fits your setup:
db.SurveyAnswers.AddRange(
TeamMembers.SelectMany(p =>
sectionDetails.SelectMany(s =>
attributeDetails.Where(a => a.SectionDetail.SurveySectionDetailID == s.SurveySectionDetailID)
.Select(a => new SurveyAnswer()
{
Reviewee = p,
SurveyFormID = form.SurveyFormID,
Detail = a,
SectionDetail = s,
}))));
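The ToLookup() and SelectMany() ideas above can be sketched together without a database. The classes below are simplified stand-ins that use plain IDs instead of navigation properties, and AnswerBuilder.Build is a hypothetical name. The lookup replaces the repeated Where(...) scan over attributeDetails.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the entities in the question.
public class AttributeRow
{
    public int SectionId { get; set; }
    public string Name { get; set; }
}

public class AnswerRow
{
    public int RevieweeId { get; set; }
    public int SectionId { get; set; }
    public string AttributeName { get; set; }
}

public static class AnswerBuilder
{
    // Builds one answer per (reviewee, section, attribute of that section).
    public static List<AnswerRow> Build(
        IEnumerable<int> revieweeIds,
        IEnumerable<int> sectionIds,
        IEnumerable<AttributeRow> attributes)
    {
        // Index the attributes by section once; a missing key yields an empty sequence.
        var bySection = attributes.ToLookup(a => a.SectionId);
        var sections = sectionIds.ToList();

        return revieweeIds
            .SelectMany(r => sections.SelectMany(s => bySection[s].Select(a =>
                new AnswerRow { RevieweeId = r, SectionId = s, AttributeName = a.Name })))
            .ToList();
    }
}
```

The resulting list could then be handed to AddRange() in one call, followed by a single SaveChanges().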

Use Include to avoid the SELECT N + 1 issue.
SurveyTemplate template = db.SurveyTemplates.Include("SurveyTemplateSections")
.Include("SurveyTemplateSections.SurveyTemplateAttributes")
.First(x=> x.SurveyTemplateID == SurveyTemplateID);
Generate the whole object graph and then save to DB.
List<SurveySectionDetail> sectionDetails = new List<SurveySectionDetail>();
List<SurveyAttributeDetail> attributeDetails = new List<SurveyAttributeDetail>();
foreach (SurveyTemplateSection section in template.SurveyTemplateSections)
{
SurveySectionDetail SectionDetail = new SurveySectionDetail();
//Some code
sectionDetails.Add(SectionDetail);
foreach (SurveyTemplateAttribute attribute in section.SurveyTemplateAttributes)
{
SurveyAttributeDetail AttributeDetail = new SurveyAttributeDetail();
//some code
attributeDetails.Add(AttributeDetail);
}
}
db.SurveySectionDetails.AddRange(sectionDetails);
db.SurveyAttributeDetails.AddRange(attributeDetails);
db.SaveChanges();
Load all the employees you need before the loop; this avoids a database query for every team member.
var teamMemberIds = db.TeamMembers.Where(m => m.TeamID == TeamID && m.OnTeam)
.Select(x=>x.TeamMemberId).ToList();
// ToList() materializes the employees once, so later lookups run in memory.
var employees = db.Employees.Where(x => teamMemberIds.Contains(x.EmployeeId)).ToList();
Create a dictionary for attributeDetails keyed by their SectionDetailId to avoid querying the list on every iteration.
var attributeDetailsGroupBySection = attributeDetails.GroupBy(x => x.SectionDetailId)
.ToDictionary(x => x.Key, x => x);
Move saving of SurveyAnswers and SurveyForms to outside of the loops:
List<SurveyForm> forms = new List<SurveyForm>();
List<SurveyAnswer> answers = new List<SurveyAnswer>();
foreach (int teamMemberId in teamMemberIds)
{
var employee = employees.First(x => x.Id == teamMemberId);
SurveyForm form = new SurveyForm();
//some code
forms.Add(form);
foreach (int peer in teamMemberIds)
{
foreach (SurveySectionDetail SectionDetail in sectionDetails)
{
foreach (SurveyAttributeDetail AttributeDetail in
attributeDetailsGroupBySection[SectionDetail.Id])
{
SurveyAnswer answer = new SurveyAnswer();
//some code
answers.Add(answer);
}
}
}
}
db.SurveyAnswers.AddRange(answers);
db.SurveyForms.AddRange(forms);
db.SaveChanges();
Finally if you want faster insertions you can use EntityFramework.BulkInsert. With this extension, you can save the data like this:
db.BulkInsert(answers);
db.BulkInsert(forms);

Trying to access variable from outside foreach loop

The application I am building allows a user to upload a .csv file, which will ultimately fill in fields of an existing SQL table where the Ids match. First, I am using LinqToCsv and a foreach loop to import the .csv into a temporary table. Then I have another foreach loop where I am trying to loop the rows from the temporary table into an existing table where the Ids match.
Controller Action to complete this process:
[HttpPost]
public ActionResult UploadValidationTable(HttpPostedFileBase csvFile)
{
var inputFileDescription = new CsvFileDescription
{
SeparatorChar = ',',
FirstLineHasColumnNames = true
};
var cc = new CsvContext();
var filePath = uploadFile(csvFile.InputStream);
var model = cc.Read<Credit>(filePath, inputFileDescription);
try
{
var entity = new TestEntities();
var tc = new TemporaryCsvUpload();
foreach (var item in model)
{
tc.Id = item.Id;
tc.CreditInvoiceAmount = item.CreditInvoiceAmount;
tc.CreditInvoiceDate = item.CreditInvoiceDate;
tc.CreditInvoiceNumber = item.CreditInvoiceNumber;
tc.CreditDeniedDate = item.CreditDeniedDate;
tc.CreditDeniedReasonId = item.CreditDeniedReasonId;
tc.CreditDeniedNotes = item.CreditDeniedNotes;
entity.TemporaryCsvUploads.Add(tc);
}
var idMatches = entity.Authorizations.ToList().Where(x => x.Id == tc.Id);
foreach (var number in idMatches)
{
number.CreditInvoiceDate = tc.CreditInvoiceDate;
number.CreditInvoiceNumber = tc.CreditInvoiceNumber;
number.CreditInvoiceAmount = tc.CreditInvoiceAmount;
number.CreditDeniedDate = tc.CreditDeniedDate;
number.CreditDeniedReasonId = tc.CreditDeniedReasonId;
number.CreditDeniedNotes = tc.CreditDeniedNotes;
}
entity.SaveChanges();
entity.Database.ExecuteSqlCommand("TRUNCATE TABLE TemporaryCsvUpload");
TempData["Success"] = "Updated Successfully";
}
catch (LINQtoCSVException)
{
TempData["Error"] = "Upload Error: Ensure you have the correct header fields and that the file is of .csv format.";
}
return View("Upload");
}
The issue in the above code is that tc is inside the first loop, but the matches are defined after the loop with var idMatches = entity.Authorizations.ToList().Where(x => x.Id == tc.Id);, so I am only getting the last item of the first loop.
So I would need to put var idMatches = entity.Authorizations.ToList().Where(x => x.Id == tc.Id); in the first loop, but then I can't access it in the second. If I nest the second loop then it is way too slow. Is there any way I could put the above statement in the first loop and still access it? Or are there any other ideas to accomplish the same thing? Thanks!

Instead of using multiple loops, keep track of processed IDs as you go and then exclude any duplicates.
[HttpPost]
public ActionResult UploadValidationTable(HttpPostedFileBase csvFile)
{
var inputFileDescription = new CsvFileDescription
{
SeparatorChar = ',',
FirstLineHasColumnNames = true
};
var cc = new CsvContext();
var filePath = uploadFile(csvFile.InputStream);
var model = cc.Read<Credit>(filePath, inputFileDescription);
try
{
var entity = new TestEntities();
var tcIdFound = new HashSet<string>();
foreach (var item in model)
{
// Add returns false when the ID was already seen, so duplicates are skipped.
if (!tcIdFound.Add(item.Id))
{
continue;
}
var tc = new TemporaryCsvUpload();
tc.Id = item.Id;
tc.CreditInvoiceAmount = item.CreditInvoiceAmount;
tc.CreditInvoiceDate = item.CreditInvoiceDate;
tc.CreditInvoiceNumber = item.CreditInvoiceNumber;
tc.CreditDeniedDate = item.CreditDeniedDate;
tc.CreditDeniedReasonId = item.CreditDeniedReasonId;
tc.CreditDeniedNotes = item.CreditDeniedNotes;
entity.TemporaryCsvUploads.Add(tc);
}
entity.SaveChanges();
entity.Database.ExecuteSqlCommand("TRUNCATE TABLE TemporaryCsvUpload");
TempData["Success"] = "Updated Successfully";
}
catch (LINQtoCSVException)
{
TempData["Error"] = "Upload Error: Ensure you have the correct header fields and that the file is of .csv format.";
}
return View("Upload");
}
If you want to make sure you get the last value for any duplicate ids, then store each TemporaryCsvUpload record in a dictionary instead of using only a HashSet. Same basic idea though.
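The dictionary variant can be sketched as follows, with a simplified row type standing in for TemporaryCsvUpload; CsvRow and CsvDedup.LastPerId are hypothetical names. Assigning through the dictionary indexer overwrites any earlier entry with the same key, so the last row for each ID wins.

```csharp
using System;
using System.Collections.Generic;

// Simplified stand-in for a parsed CSV row.
public class CsvRow
{
    public string Id { get; set; }
    public decimal CreditInvoiceAmount { get; set; }
}

public static class CsvDedup
{
    // Keeps only the last row seen for each Id.
    public static Dictionary<string, CsvRow> LastPerId(IEnumerable<CsvRow> rows)
    {
        var latest = new Dictionary<string, CsvRow>();
        foreach (var row in rows)
        {
            // The indexer adds a new key or replaces the existing value.
            latest[row.Id] = row;
        }
        return latest;
    }
}
```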

Declare idMatches before the first loop, either uninitialized or set to null. Then you'll be able to use it inside both loops. After moving the declaration before the first loop, a simple Where will still leave you with only the values from the last iteration, so you'll need to concatenate the existing list with the results for the current iteration.
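Rather than accumulating matches by hand, the matches for every uploaded ID can be collected in one pass with a LINQ Join. This swaps in a different technique from the one described above. The sketch assumes simplified in-memory types; UploadedCredit, AuthorizationRow and CreditMatcher.Apply are hypothetical names, and only two of the credit fields are shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins for the uploaded CSV row and the Authorizations entity.
public class UploadedCredit
{
    public string Id { get; set; }
    public string CreditInvoiceNumber { get; set; }
    public decimal CreditInvoiceAmount { get; set; }
}

public class AuthorizationRow
{
    public string Id { get; set; }
    public string CreditInvoiceNumber { get; set; }
    public decimal CreditInvoiceAmount { get; set; }
}

public static class CreditMatcher
{
    // Copies credit fields onto each authorization whose Id appears in the upload.
    // Returns the number of (authorization, uploaded row) pairs that matched.
    public static int Apply(
        IEnumerable<AuthorizationRow> authorizations,
        IEnumerable<UploadedCredit> uploaded)
    {
        // Join pairs rows by Id in a single pass, without a nested loop.
        var matches = authorizations
            .Join(uploaded, a => a.Id, u => u.Id, (a, u) => new { Auth = a, Row = u })
            .ToList();

        foreach (var m in matches)
        {
            m.Auth.CreditInvoiceNumber = m.Row.CreditInvoiceNumber;
            m.Auth.CreditInvoiceAmount = m.Row.CreditInvoiceAmount;
        }
        return matches.Count;
    }
}
```

In the controller, the uploaded rows would come from cc.Read<Credit>(...) and the authorizations from the context, followed by one SaveChanges().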
