How to optimize these create/update/delete comparisons? - c#

I'm using EF 6.
I have a list of the current items in the db, which I retrieve with:
ae_alignedPartners_olds = ctx.AE_AlignedPartners.AsNoTracking().ToList(); // list of List<AE_AlignedPartners>
Then I retrieve the same objects from JSON, with:
ae_alignedPartners_news = GetJSONListObjects(); // list of List<AE_AlignedPartners>
Then I make some comparisons (to see which I need to update, which to delete, and which to create). This is the current code:
// intersection
var IDSIntersections = (from itemNew in ae_alignedPartners_news
join itemOld in ae_alignedPartners_olds on itemNew.ObjectID equals itemOld.ObjectID
select itemNew).Select(p => p.ObjectID).ToList();
// to update
IList<AE_AlignedPartners> ae_alignedPartners_toUpdate = new List<AE_AlignedPartners>();
foreach (var item in IDSIntersections)
{
var itemOld = ae_alignedPartners_olds.First(p => p.ObjectID == item);
var itemNew = ae_alignedPartners_news.First(p => p.ObjectID == item);
if (itemOld.Field1 != itemNew.Field1 ||
itemOld.Field2 != itemNew.Field2 ||
itemOld.Field3 != itemNew.Field3 ||
itemOld.Field4 != itemNew.Field4 ||
itemOld.Field5 != itemNew.Field5 ||
itemOld.Field6 != itemNew.Field6 ||
itemOld.Field7 != itemNew.Field7 ||
itemOld.Field8 != itemNew.Field8 ||
itemOld.Field9 != itemNew.Field9)
{
itemOld.Field1 = itemNew.Field1;
itemOld.Field2 = itemNew.Field2;
itemOld.Field3 = itemNew.Field3;
itemOld.Field4 = itemNew.Field4;
itemOld.Field5 = itemNew.Field5;
itemOld.Field6 = itemNew.Field6;
itemOld.Field7 = itemNew.Field7;
itemOld.Field8 = itemNew.Field8;
itemOld.Field9 = itemNew.Field9;
ae_alignedPartners_toUpdate.Add(itemOld);
}
}
// to create
IList<AE_AlignedPartners> ae_alignedPartners_toCreate = ae_alignedPartners_news.Where(p => !IDSIntersections.Contains(p.ObjectID)).ToList();
// to delete
IList<AE_AlignedPartners> ae_alignedPartners_toDelete = ae_alignedPartners_olds.Where(p => !IDSIntersections.Contains(p.ObjectID)).ToList();
This is fast enough for ~1000 records. Over 50k, it becomes very slow.
What do you suggest to improve the whole thing?

If you want to find out what's slow I suggest profiling or simply pausing the debugger 10 times to see where it stops most often (you can try that with your existing code). But here I could spot the problem immediately:
var itemOld = ae_alignedPartners_olds.First(p => p.ObjectID == item);
var itemNew = ae_alignedPartners_news.First(p => p.ObjectID == item);
This is scanning the entire list which is O(N). Together with the outer loop this becomes O(N^2).
The best solution would be to restructure your query so that these lookups are not necessary. It seems to me that the join already outputs the objects you need.
But you can also use a hash table to speed up the lookups.
var dict_ae_alignedPartners_olds = ae_alignedPartners_olds.ToDictionary(p => p.ObjectID);
var dict_ae_alignedPartners_news = ae_alignedPartners_news.ToDictionary(p => p.ObjectID);
foreach (var item in IDSIntersections)
{
var itemOld = dict_ae_alignedPartners_olds[item];
var itemNew = dict_ae_alignedPartners_news[item];
//...
}
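The dictionary idea can be extended to the whole create/update/delete split. The following is only a minimal sketch under stated assumptions: `Partner` is a hypothetical stand-in for `AE_AlignedPartners` reduced to a single data field, `ObjectID` is assumed to be an `int`, and it is assumed to be unique within the old list (`ToDictionary` throws on duplicate keys).

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for AE_AlignedPartners, reduced to one data field.
public class Partner
{
    public int ObjectID { get; set; }
    public string Field1 { get; set; }
}

public static class PartnerDiff
{
    // Splits the two lists into create/update/delete sets using hash lookups
    // instead of scanning a list for every ID.
    public static void Compare(
        List<Partner> olds, List<Partner> news,
        out List<Partner> toCreate, out List<Partner> toUpdate, out List<Partner> toDelete)
    {
        // Assumption: ObjectID is unique within olds; ToDictionary throws otherwise.
        var oldById = olds.ToDictionary(p => p.ObjectID);
        var newIds = new HashSet<int>();

        toCreate = new List<Partner>();
        toUpdate = new List<Partner>();

        foreach (var itemNew in news)
        {
            newIds.Add(itemNew.ObjectID);

            Partner itemOld;
            if (!oldById.TryGetValue(itemNew.ObjectID, out itemOld))
            {
                toCreate.Add(itemNew);            // present only in the new list
            }
            else if (itemOld.Field1 != itemNew.Field1)
            {
                itemOld.Field1 = itemNew.Field1;  // copy the changed value
                toUpdate.Add(itemOld);
            }
        }

        // Old items whose ID never appeared in the new list.
        toDelete = olds.Where(p => !newIds.Contains(p.ObjectID)).ToList();
    }
}
```

Note that the `Contains` calls in the original "to create" and "to delete" lines run against a `List`, so they are linear scans as well; the `HashSet` above avoids that.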

Related

Remove a range of items from a list without looping

Hi. Is there a more elegant way of doing this? Must I use the loop, or is there something like a range function with which I could just remove all the items found?
Sorry, I should have shown how my query is being populated.
Btw, these are two different entities that I am removing from; I hope you get the idea.
var qry = db.AssemblyListItems.AsNoTracking().Where(x =>
x.ProductionPlanID == (long)_currentPlan.ProductionPlan ).ToList();
var hasbeenAssembled = db.CompletedPrinteds.AsNoTracking().Where(x =>
x.ProductionPlanId == item.ProductionPlanID).ToList();
var hasbeenFound = db.CompletedPrinteds.AsNoTracking().Where(x =>
x.ProductionPlanId== item.ProductionPlanID).ToList();
foreach (var subitem in hasbeenAssembled )
{
if(item.ProductionPlanID ==subitem.ProductionPlanId && item.DocumentNo == subitem.DocumentNo && item.DocumentNo == subitem.DocumentNo && item.OutstandingToMake ==0)
{
qry.RemoveAll(x => x.ProductionPlanID == subitem.ProductionPlanId && x.DocumentNo == item.DocumentNo && x.ItemCode == subitem.StockCode && item.OutstandingToMake ==0);
}
}
public List<AssemblyListItems> RemoveDespatchedItems(List<AssemblyListItems> AssemblyItems)
{
foreach (AssemblyListItems item in AssemblyItems)
{
using (var db = new LiveEntities())
{
var hasNotBeenDespatched = db.PalletizedItems.Where(w => w.Despatched != "Not Despatched");
foreach (var subitem in hasNotBeenDespatched)
{
AssemblyItems.RemoveAll(x => x.ProductionPlanID == subitem.ProductionPlanID && x.DocumentNo == item.DocumentNo && x.ItemCode == subitem.StockCode);
}
}
}
return AssemblyItems;
}
I just need to remove the items returned by the first query (hasNotBeenDespatched) from the second list. As there could be over 400 items, I want it to be as efficient as possible.
Edit 2
I am a wee bit closer, thanks, but it's still not removing the removeDespatchItems items from AssemblyItems, and I do not know why.
public List<AssemblyListItems> RemoveDespatchedItems(List<AssemblyListItems> AssemblyItems, Int64 ProductionPlanId)
{
using (var db = new LiveEntities())
{
List<PalletizedItems> removeDespatchItems = db.PalletizedItems.Where(w => w.Despatched != "Not Despatched" && w.ProductionPlanID == ProductionPlanId).ToList();
var itemsDocumentNo = db.PalletizedItems.Select(x => x.ProductionPlanItemID).ToList();
foreach (var subitem in removeDespatchItems) {
AssemblyItems.RemoveAll(x => x.ProductionPlanID == subitem.ProductionPlanID && itemsDocumentNo.Contains(x.ProductionPlanItemID) && x.ItemCode == subitem.StockCode && x.LineQuantity==x.AllocatedQuantity);
}
}
return AssemblyItems;
}
I'm not 100% sure I understand exactly how it should work.
However, in general you could use a join, which would result in the work being done in the database. Something like this:
var remainingItems = (from ali in db.FUEL_AssemblyListItems
join completed in db.FuelCompletedPrinteds
on new { ali.ProductionPlanID, ali.DocumentNo, ali.ItemCode } equals new { completed.ProductionPlanID, completed.DocumentNo, ItemCode = completed.StockCode } // property names must match on both sides
join dispatched in db.FUEL_PalletizedItems
on new { ali.ProductionPlanID, ali.DocumentNo, ali.ItemCode } equals new { dispatched.ProductionPlanID, dispatched.DocumentNo, ItemCode = dispatched.StockCode }
where (ali.ProductionPlanID == (long) _currentPlan.ProductionPlan
&& ali.DocumentNo == completed.DocumentNo
&& completed.OutstandingToMake == 0
&& dispatched.Despatched != "Not Despatched")
select ali).ToList();
Depending upon the records in the database, the join might need to be an outer join, which needs a slightly different syntax, but hopefully you've got a starting point.
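If the lists are already in memory, the nested loop can also be replaced by a single `RemoveAll` against a set of keys. This is only a sketch under assumptions: `AssemblyItem` and `PalletItem` are hypothetical, simplified stand-ins for the question's entities, and matching on `ProductionPlanID` plus item code is assumed to be the intended rule.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-ins for AssemblyListItems and PalletizedItems.
public class AssemblyItem
{
    public long ProductionPlanID;
    public string ItemCode;
}

public class PalletItem
{
    public long ProductionPlanID;
    public string StockCode;
    public string Despatched;
}

public static class DespatchFilter
{
    // Removes every assembly item that has a matching despatched pallet item.
    public static List<AssemblyItem> RemoveDespatched(
        List<AssemblyItem> assemblyItems, IEnumerable<PalletItem> palletItems)
    {
        // Build the set of (plan, code) keys once; Tuple compares by value.
        var despatchedKeys = new HashSet<Tuple<long, string>>(
            palletItems
                .Where(p => p.Despatched != "Not Despatched")
                .Select(p => Tuple.Create(p.ProductionPlanID, p.StockCode)));

        // One pass over the list, one hash lookup per element.
        assemblyItems.RemoveAll(a =>
            despatchedKeys.Contains(Tuple.Create(a.ProductionPlanID, a.ItemCode)));

        return assemblyItems;
    }
}
```

The list is traversed once instead of once per pallet item, which matters more as the number of despatched items grows.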

foreach to update values in object overwrites values

Caveat: I know some parts of this code are bad, but it is what it is for now. I just want it to run properly. I fully intend to refactor later. I just need a working app right now.
The initial linq query grabs several fields of data, but more data must be added per item in the resultset. So, we have the foreach below. It grabs data and updates each row.
It overwrites everything to what I'm thinking is probably the last iteration of the foreach. Why? How do I keep it from overwriting?
Keep in mind that the working variable just means a period id. I want to get previous or future periods, and subtracting from or adding to it allows this.
public List<intranetGS.Forecast> getForecast(int branchId) {
//user role protection here
intraDataContext q = new intraDataContext();
//this grabs the initial data
var basefc = (from f in q.fc_items
where f.color_option == false
select new intranetGS.Forecast {
itemId = f.item_id,
item = f.item_number,
itemDesc = f.description,
itemSuffix = f.item_suffix,
itemPrefix = f.item_prefix,
designation = f.designation
});
//now we filter
switch (getDesignation(branchId)) {
case 1:
basefc = basefc.Where(n => n.designation != 3);
basefc = basefc.Where(n => n.designation != 6);
break;
case 2:
basefc = basefc.Where(n => n.designation > 3);
break;
case 3:
basefc = basefc.Where(n => n.designation != 2);
basefc = basefc.Where(n => n.designation != 6);
break;
}
var current = Convert.ToInt32(DateTime.Now.Month);
var working = 0;
var year = Convert.ToInt32(DateTime.Now.Year);
List<intranetGS.Forecast> res = new List<intranetGS.Forecast>();
foreach (var f in basefc) {
working = getPeriod(current + "/" + (year - 1)); //starting with last year;
var ly = (from l in q.fc_forecasts where l.period == working && l.branch == branchId && l.item == f.itemId select l).FirstOrDefault();
if (!object.ReferenceEquals(ly, null)) {
f.lastYearForecast = ly.forecast;
f.lastYearReceipt = ly.receipt;
}
working = getPeriod(current + "/" + year) - 2; //two months ago
var m2 = (from l in q.fc_forecasts where l.period == working && l.branch == branchId && l.item == f.itemId select l).FirstOrDefault();
if (!object.ReferenceEquals(m2, null)) {
f.twoMosForecast = m2.forecast;
f.twoMosReceipts = m2.receipt;
f.twoMosUsage = m2.usage_lb;
}
working = getPeriod(current + "/" + year) - 1; //one month ago
var m1 = (from l in q.fc_forecasts where l.period == working && l.branch == branchId && l.item == f.itemId select l).FirstOrDefault();
if (!object.ReferenceEquals(m1, null)) {
f.oneMosForecast = m1.forecast;
f.oneMosReceipts = m1.receipt;
f.oneMosUsage = m1.usage_lb;
}
working = getPeriod(current + "/" + year); //current month
var m = (from l in q.fc_forecasts where l.period == working && l.branch == branchId && l.item == f.itemId select l).FirstOrDefault();
if (!object.ReferenceEquals(m, null)) {
f.currentMosForecast = m.forecast;
f.currentMosReceipts = m.receipt;
f.currentMosusage = m.usage_lb;
}
working = getPeriod(current + "/" + year) + 1; //one month from now
var mnext1 = (from l in q.fc_forecasts where l.period == working && l.branch == branchId && l.item == f.itemId select l).FirstOrDefault();
if (!object.ReferenceEquals(mnext1, null)) {
f.plusOneForecast = mnext1.forecast;
f.plusOneForecastId = mnext1.forcast_id;
}
working = getPeriod(current + "/" + year) + 2; //two months from now
var mnext2 = (from l in q.fc_forecasts where l.period == working && l.branch == branchId && l.item == f.itemId select l).FirstOrDefault();
if (!object.ReferenceEquals(mnext2, null)) {
f.plusTwoForecast = mnext2.forecast;
f.plusTwoForecastId = mnext2.forcast_id;
}
} //this is insanely and extremely cumbersome; refactor later.
return basefc;
}
UPDATE: It wasn't a list, it needed to be a list to avoid the overwrite.
The issue is deferred execution: as a LINQ query is built, it internally builds an expression tree to which new expressions can still be added. The query only runs when it is enumerated, for example as the target of a foreach loop or via .ToList(), and until the results are captured in a list they remain fluid. Since the code was simply adding more expressions and never filtering the results out into a new list, the query just grew.
The question, when working on existing code, is whether the developer wanted to keep building the expression tree for performance or intended to make the items concrete at each step along the process.
You may be fixing an issue by making the initial list concrete but could be introducing a logic bug going forward. Keep that in mind.
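The effect of deferred execution on updates made inside a foreach can be reproduced in a few lines. This sketch uses LINQ to Objects rather than the LINQ to SQL context from the question, and `Row` and `DeferredDemo` are hypothetical names introduced only for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type used only for this illustration.
public class Row
{
    public int Id;
    public int Value;
}

public static class DeferredDemo
{
    // Mutates the objects produced by a deferred query, then enumerates it again.
    public static int MutateDeferred()
    {
        IEnumerable<Row> query = new[] { 1, 2 }.Select(i => new Row { Id = i });

        foreach (var r in query)
        {
            r.Value = 42;               // changes objects created for this enumeration only
        }

        // The query runs again here and creates fresh Row objects.
        return query.First().Value;
    }

    // Materializes the query first, so the loop and the caller see the same objects.
    public static int MutateMaterialized()
    {
        List<Row> rows = new[] { 1, 2 }.Select(i => new Row { Id = i }).ToList();

        foreach (var r in rows)
        {
            r.Value = 42;
        }

        return rows.First().Value;
    }
}
```

`MutateDeferred()` returns 0 because the second enumeration projects new objects, while `MutateMaterialized()` returns 42.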

modify items in a Generic List

I cannot modify a generic List with:
var x = (PaypalResponse)Session["PaypalResponse"]; // x.Response is my List
x.Response.ToList().Where(i => i.Id== 1).ForEach(s => s.Selected = true);
What am I doing wrong?
Thanks.
You could do this:
x.Response.Where(i => i.Id == 1).ToList().ForEach(s => s.Selected = true);
However, it's a bit of a waste of resources to construct a new list just for this one line of code. I'd recommend this instead:
foreach(var s in x.Response.Where(i => i.Id == 1))
{
s.Selected = true;
}
If you only want to update at most one item, you can do this instead:
var s = x.Response.FirstOrDefault(i => i.Id == 1);
if (s != null)
{
s.Selected = true;
}
And of course, if you know there will be one item to update, it's even easier:
x.Response.First(i => i.Id == 1).Selected = true;

Issue with LINQ group by with count

I'm trying to run the following query but for some reason MemberTransactionCount and NonMemberTransactionCount are coming back as the exact same values. It seems that the .Where() clauses aren't working as we'd expect them to.
Hoping someone can point out where I might be going wrong.
from trans in transactions
orderby trans.TransactionDate.Year , trans.TransactionDate.Month
group trans by new {trans.TransactionDate.Year, trans.TransactionDate.Month}
into grp
select new MemberTransactions
{
Month = string.Format("{0}/{1}", grp.Key.Month, grp.Key.Year),
MemberTransactionCount =
grp.Where(x => x.Account.Id != Guid.Empty || x.CardNumber != null)
.Sum(x => x.AmountSpent),
NonMemberTransactionCount =
grp.Where(x => x.Account.Id == Guid.Empty && x.CardNumber == null)
.Sum(x => x.AmountSpent)
}
EDIT
I've verified in the database that the results are not what they should be. It seems to be adding everything together and not taking into account the Account criteria that we're looking at.
I ended up solving this with two separate queries. It's not exactly what I wanted, but it does the job and seems to be just as quick as I would have hoped.
var memberTrans = from trans in transactions
where trans.Account != null
|| trans.CardNumber != null
orderby trans.TransactionDate.Month
group trans by trans.TransactionDate.Month
into grp
select new
{
Month = grp.Key,
Amount = grp.Sum(x => x.AmountSpent)
};
var nonMemberTrans = (from trans in transactions
where trans.Account == null
&& trans.CardNumber == null
group trans by trans.TransactionDate.Month
into grp
select new
{
Month = grp.Key,
Amount = grp.Sum(x => x.AmountSpent)
}).ToList();
var memberTransactions = new List<MemberTransactions>();
foreach (var trans in memberTrans)
{
var non = (from nt in nonMemberTrans
where nt.Month == trans.Month
select nt).FirstOrDefault();
var date = new DateTime(2012, trans.Month, 1);
memberTransactions.Add(new MemberTransactions
{
Month = date.ToString("MMM"),
MemberTransactionCount = trans.Amount,
NonMemberTransactionCount = non != null ? non.Amount : 0.00m
});
}
I think the main problem here is that you doubt the result, though it might be correct.
Add another property for verification:
TotalAmount = grp.Sum(x => x.AmountSpent)
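A sketch of that verification on in-memory data, assuming a hypothetical flattened `Tx` class (the account id is stored directly on the transaction instead of on a related `Account` entity). Because the two `Where` conditions are exact complements, the member and non-member sums should always add up to the total for each month.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical flattened transaction; the real entity has an Account navigation property.
public class Tx
{
    public DateTime Date;
    public Guid AccountId;
    public string CardNumber;
    public decimal AmountSpent;
}

public class MonthSummary
{
    public string Month;
    public decimal Member;
    public decimal NonMember;
    public decimal Total;
}

public static class MonthlyTotals
{
    // Groups by year and month and computes both conditional sums plus the total.
    public static List<MonthSummary> Summarize(IEnumerable<Tx> transactions)
    {
        return (from t in transactions
                group t by new { t.Date.Year, t.Date.Month } into grp
                orderby grp.Key.Year, grp.Key.Month
                select new MonthSummary
                {
                    Month = string.Format("{0}/{1}", grp.Key.Month, grp.Key.Year),
                    Member = grp.Where(x => x.AccountId != Guid.Empty || x.CardNumber != null)
                                .Sum(x => x.AmountSpent),
                    NonMember = grp.Where(x => x.AccountId == Guid.Empty && x.CardNumber == null)
                                   .Sum(x => x.AmountSpent),
                    Total = grp.Sum(x => x.AmountSpent)  // should equal Member + NonMember
                }).ToList();
    }
}
```

If `Member + NonMember` differs from `Total` for some month when the same query runs against the database, that points at how the provider translates the conditions rather than at the query shape itself.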

Improving performance of linq query

I'm optimizing a method with a number of Linq queries. So far the execution time is around 3 seconds and I'm trying to reduce it. There is quite a lot of operations and calculations happening in the method, but nothing too complex.
I would appreciate any suggestions and ideas on how the performance can be improved and the code optimized.
The whole code of the method (below I'll point out where I have the biggest delay):
public ActionResult DataRead([DataSourceRequest] DataSourceRequest request)
{
CTX.Configuration.AutoDetectChangesEnabled = false;
var repoKomfortaktion = new KomfortaktionRepository();
var komfortaktionen = CTX.Komfortaktionen.ToList();
var result = new List<AqGeplantViewModel>();
var gruppen = new HashSet<Guid?>(komfortaktionen.Select(c => c.KomfortaktionsGruppeId).ToList());
var hochgeladeneKomplettabzuege = CTX.Komplettabzug.Where(c => gruppen.Contains(c.KomfortaktionsGruppeId)).GroupBy(c => new { c.BetriebId, c.KomfortaktionsGruppeId }).Select(x => new { data = x.Key }).ToList();
var teilnehmendeBetriebe = repoKomfortaktion.GetTeilnehmendeBetriebe(CTX, gruppen);
var hochgeladeneSperrlistenPlz = CTX.SperrlistePlz.Where(c => gruppen.Contains(c.KomfortaktionsGruppeId) && c.AktionsKuerzel != null)
.GroupBy(c => new { c.AktionsKuerzel, c.BetriebId, c.KomfortaktionsGruppeId }).Select(x => new { data = x.Key }).ToList();
var hochgeladeneSperrlistenKdnr = CTX.SperrlisteKdnr.Where(c => gruppen.Contains(c.KomfortaktionsGruppeId) && c.AktionsKuerzel != null)
.GroupBy(c => new { c.AktionsKuerzel, c.BetriebId, c.KomfortaktionsGruppeId }).Select(x => new { data = x.Key }).ToList();
var konfigsProAktion = CTX.Order.GroupBy(c => new { c.Vfnr, c.AktionsId }).Select(c => new { count = c.Count(), c.Key.AktionsId, data = c.Key }).ToList();
foreach (var komfortaktion in komfortaktionen)
{
var item = new AqGeplantViewModel();
var zentraleTeilnehmer = teilnehmendeBetriebe.Where(c => c.TeilnahmeStatus.Any(x => x.KomfortaktionId == komfortaktion.Id && x.AktionsTypeId == 1)).ToList();
var lokaleTeilnehmer = teilnehmendeBetriebe.Where(c => c.TeilnahmeStatus.Any(x => x.KomfortaktionId == komfortaktion.Id && x.AktionsTypeId == 2)).ToList();
var hochgeladeneSperrlistenGesamt =
hochgeladeneSperrlistenPlz.Count(c => c.data.AktionsKuerzel == komfortaktion.Kuerzel && c.data.KomfortaktionsGruppeId == komfortaktion.KomfortaktionsGruppeId) +
hochgeladeneSperrlistenKdnr.Count(c => c.data.AktionsKuerzel == komfortaktion.Kuerzel && c.data.KomfortaktionsGruppeId == komfortaktion.KomfortaktionsGruppeId);
item.KomfortaktionId = komfortaktion.KomfortaktionId;
item.KomfortaktionName = komfortaktion.Aktionsname;
item.Start = komfortaktion.KomfortaktionsGruppe.StartAdressQualifizierung.HasValue ? komfortaktion.KomfortaktionsGruppe.StartAdressQualifizierung.Value.ToString("dd.MM.yyyy") : string.Empty;
item.LokalAngemeldet = lokaleTeilnehmer.Count();
item.ZentralAngemeldet = zentraleTeilnehmer.Count();
var anzHochgelandenerKomplettabzuege = hochgeladeneKomplettabzuege.Count(c => zentraleTeilnehmer.Count(x => x.BetriebId == c.data.BetriebId) == 1) +
hochgeladeneKomplettabzuege.Count(c => lokaleTeilnehmer.Count(x => x.BetriebId == c.data.BetriebId) == 1);
item.KomplettabzugOffen = (zentraleTeilnehmer.Count() + lokaleTeilnehmer.Count()) - anzHochgelandenerKomplettabzuege;
item.SperrlisteOffen = (zentraleTeilnehmer.Count() + lokaleTeilnehmer.Count()) - hochgeladeneSperrlistenGesamt;
item.KonfigurationOffen = zentraleTeilnehmer.Count() - konfigsProAktion.Count(c => c.AktionsId == komfortaktion.KomfortaktionId && zentraleTeilnehmer.Any(x => x.Betrieb.Vfnr == c.data.Vfnr));
item.KomfortaktionsGruppeId = komfortaktion.KomfortaktionsGruppeId;
result.Add(item);
}
return Json(result.ToDataSourceResult(request));
}
The first half (before the foreach) takes half a second, which is okay. The biggest delay is inside the foreach statement, in the first iteration, and in particular in these lines: the first execution of zentraleTeilnehmer takes 1.5 seconds.
var zentraleTeilnehmer = teilnehmendeBetriebe.Where(c => c.TeilnahmeStatus.Any(x => x.KomfortaktionId == komfortaktion.Id && x.AktionsTypeId == 1)).ToList();
var lokaleTeilnehmer = teilnehmendeBetriebe.Where(c => c.TeilnahmeStatus.Any(x => x.KomfortaktionId == komfortaktion.Id && x.AktionsTypeId == 2)).ToList();
TeilnehmendeBetriebe has over 800 rows, and the TeilnahmeStatus property normally has around 4 items. So, at most 800*4 iterations, which is not a huge number after all...
Thus, I'm mostly interested in optimizing these lines, hoping to reduce execution time to half a second or so.
What I tried:
Rewriting the Linq as a foreach: didn't help, same time... probably not surprising, but it was worth a try.
foreach (var tb in teilnehmendeBetriebe) //836 items
{
foreach (var ts in tb.TeilnahmeStatus) //3377 items
{
if (ts.KomfortaktionId == komfortaktion.Id && ts.AktionsTypeId == 1)
{
testResult.Add(tb);
break;
}
}
}
Selecting particular columns for teilnehmendeBetriebe with .Select(): didn't help either.
Other small manipulations I tried didn't help either.
What is interesting: while the first iteration of the foreach can take up to 2 seconds, the second and further iterations take just milliseconds, so .NET seems to be capable of optimizing or reusing calculation data.
Any advice on what can be changed in order to improve performance is very welcome!
Edit:
TeilnahmeBetriebKomfortaktion.TeilnahmeStatus is loaded eagerly in the method GetTeilnehmendeBetriebe:
public List<TeilnahmeBetriebKomfortaktion> GetTeilnehmendeBetriebe(Connection ctx, HashSet<Guid?> gruppen)
{
return ctx.TeilnahmeBetriebKomfortaktion.Include(
c => c.TeilnahmeStatus).ToList();
}
Edit2:
The query which is sent when executing GetTeilnehmendeBetriebe:
SELECT
[Extent1].[Id] AS [Id],
[Extent1].[BetriebId] AS [BetriebId],
[Extent1].[MandantenId] AS [MandantenId],
[Extent1].[CreatedUser] AS [CreatedUser],
[Extent1].[UpdatedUser] AS [UpdatedUser],
[Extent1].[CreatedDate] AS [CreatedDate],
[Extent1].[UpdatedDate] AS [UpdatedDate],
[Extent1].[IsDeleted] AS [IsDeleted]
FROM [Semas].[TeilnahmeBetriebKomfortaktion] AS [Extent1]
WHERE [Extent1].[IsDeleted] <> cast(1 as bit)
My assumption is that TeilnahmeBetriebKomfortaktion.TeilnahmeStatus is a lazy loaded collection, resulting in the N + 1 problem. You should eagerly fetch that collection to improve your performance.
The following iterations of the foreach loop are fast because, after the first iteration, those objects are no longer requested from the database server but are served from memory.
