Optimize or ditch LINQ query? - c#

So I have a LINQ (to SQL) query that pulls information from a database into a grid. There is a function that aggregates the grid data based on the current filter parameters, summing the values for each recurring "X" in the grid data.
For instance, let's assume the grid displays customer visits to a grocery store. The original data may show the following:
Date | Name | No. Prod | Total $
--------------------------------------------
01/02/13 | Customer A | 4 products | $23.00
01/02/13 | Customer B | 2 products | $3.26
01/02/13 | Customer C | 7 products | $47.42
01/16/13 | Customer A | 3 products | $26.22
Clicking the summation function for the clients column will display the following grid data:
Cnt| Name | Tot. Prod | Total $
--------------------------------------
2 | Customer A | 7 products | $49.22
1 | Customer B | 2 products | $3.26
1 | Customer C | 7 products | $47.42
My problem is that I am doing the summation logic in a LINQ query. I assumed this would be fast...but it is just the opposite. Here is a sample.
Expression<Func<OrdersView, bool>> filter;
filter = m => m.RecordCreated >= fromDate && m.RecordCreated <= toDate && m.DepartmentID == _depID;
var ClientAggOrders = dataContext.OrdersView
.Where(filter)
.GroupBy(m => m.Name)
.Select(gr => new
{
Name = gr.Key,
Count = gr.Where(s => s.ID != null).Count(),
id = gr.Select(s => s.ID),
S1 = gr.Sum(s => s.Tare < s.Gross ? s.Tare : s.Gross),
S2 = gr.Sum(s => s.Tare < s.Gross ? s.Gross : s.Tare),
NetWeight = gr.Sum(s => s.NetWeight),
Price = gr.Sum(s => s.NetPrice)
}
).ToList();
My question is, why is this such bad practice? LINQ allows for these expressions in the SELECT clause, but the time it takes to execute is beyond absurd to the point where I don't see it being beneficial in any real world scenario.
Am I using LINQ wrong and should I just move my logic outside of the query or can this be optimized and done within LINQ properly? Thanks for any advice!

You can use LINQPad to see the SQL that is generated.
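Because of the way LINQ to SQL works, id = gr.Select(s => s.ID) causes a subquery to be executed for every group. Remove this, and instead get the ID+Name in your GroupBy: .GroupBy(m => new{m.ID, m.Name}). Note that this only produces the same groups if ID identifies the client; if ID is unique per order, every row becomes its own group, so in that case simply drop the id projection and keep grouping by Name.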
You should find that the generated SQL will now be a single statement, instead of the main statement plus a statement for each group.
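As a rough illustration of the aggregate-only projection, here is a minimal in-memory sketch using the sample rows from the question. The types StoreVisit and CustomerSummary and the class VisitSummary are hypothetical names invented for this example, and the code runs against LINQ to Objects, so it shows the shape of the query rather than the SQL that LINQ to SQL would generate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for one line of the question's grid.
public sealed class StoreVisit
{
    public DateTime Date { get; set; }
    public string Name { get; set; }
    public int Products { get; set; }
    public decimal Total { get; set; }
}

// Hypothetical result type: one aggregated row per customer.
public sealed class CustomerSummary
{
    public string Name { get; set; }
    public int Count { get; set; }
    public int Products { get; set; }
    public decimal Total { get; set; }
}

public static class VisitSummary
{
    // Groups by customer name and projects only scalar aggregates;
    // no per-group collection (such as the list of IDs) is selected.
    public static List<CustomerSummary> ByCustomer(IEnumerable<StoreVisit> visits)
    {
        return visits
            .GroupBy(v => v.Name)
            .Select(g => new CustomerSummary
            {
                Name = g.Key,
                Count = g.Count(),
                Products = g.Sum(v => v.Products),
                Total = g.Sum(v => v.Total)
            })
            .OrderBy(s => s.Name, StringComparer.Ordinal)
            .ToList();
    }

    // The four sample rows from the question's first table.
    public static List<StoreVisit> SampleRows()
    {
        return new List<StoreVisit>
        {
            new StoreVisit { Date = new DateTime(2013, 1, 2), Name = "Customer A", Products = 4, Total = 23.00m },
            new StoreVisit { Date = new DateTime(2013, 1, 2), Name = "Customer B", Products = 2, Total = 3.26m },
            new StoreVisit { Date = new DateTime(2013, 1, 2), Name = "Customer C", Products = 7, Total = 47.42m },
            new StoreVisit { Date = new DateTime(2013, 1, 16), Name = "Customer A", Products = 3, Total = 26.22m }
        };
    }
}
```

Calling VisitSummary.ByCustomer(VisitSummary.SampleRows()) yields one row per customer, with Customer A at count 2, 7 products and a total of 49.22, matching the aggregated grid in the question.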

Have you tried performing the grouping in memory only? That might solve your problem:
var ordersView =
dataContext.OrdersView
.Where(m => m.RecordCreated >= fromDate && m.RecordCreated <= toDate && m.DepartmentID == _depID)
.ToList();
var ClientAggOrders = ordersView.GroupBy(m => m.Name).Select(gr => new
{
Name = gr.Key,
Count = gr.Where(s => s.ID != null).Count(),
id = gr.Select(s => s.ID),
S1 = gr.Sum(s => s.Tare < s.Gross ? s.Tare : s.Gross),
S2 = gr.Sum(s => s.Tare < s.Gross ? s.Gross : s.Tare),
NetWeight = gr.Sum(s => s.NetWeight),
Price = gr.Sum(s => s.NetPrice)
}).ToList();

Related

How to convert this SQL query to LINQ or Lambda expression in C#?

I have the following simple table:
table : Inventory
+-------+-----------+-----------+
| Id | ProductId | cost |
+-------+-----------+-----------+
| 1 | 1 | 10 |
| 2 | 2 | 55 |
| 3 | 1 | 42 |
| 4 | 3 | 102 |
| 5 | 2 | 110 |
+-------+-----------+-----------+
I have the following SQL query:
SELECT T.Id
FROM Inventory AS T INNER JOIN
(SELECT ProductId
FROM Inventory
GROUP BY ProductId
HAVING (COUNT(*) > 1)) AS S ON T.ProductId = S.ProductId
This works to give me all of the Ids where a duplicate ProductId exists. Using the above table, this query would return Ids { 1,2,3,5 }, which is exactly what I want.
I tried converting this into a Lambda expression, but it continually fails with the join. Can anyone get me started and point me in the right direction to write this expression?
This is what I have tried:
var q = inventory.Join( inventory.GroupBy( o => o.ProductId ).Where( o => o.Count( ) > 1 ), g => g.ProductId, gb => gb.Key, ( g, gb ) => g.Id ).ToList( );
You need to use something like this:
var result = Inventory
.GroupBy(x => x.ProductId)
.Where(x => x.Count() > 1)
.SelectMany(x => x.ToList())
.Select(x => x.Id);
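To check the grouping approach against the sample table, here is a small in-memory sketch. The class DuplicateProducts and its method names are made up for this example, the rows are modeled as value tuples rather than entities, and a final OrderBy is added only to make the output order deterministic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateProducts
{
    // Returns the Ids of all rows whose ProductId occurs more than once.
    public static List<int> IdsSharingAProduct(
        IEnumerable<(int Id, int ProductId, decimal Cost)> inventory)
    {
        return inventory
            .GroupBy(row => row.ProductId)   // one group per ProductId
            .Where(g => g.Count() > 1)       // the HAVING COUNT(*) > 1 part
            .SelectMany(g => g)              // flatten groups back into rows
            .Select(row => row.Id)
            .OrderBy(id => id)               // added for a stable output order
            .ToList();
    }

    // The five rows of the Inventory table shown in the question.
    public static List<(int Id, int ProductId, decimal Cost)> SampleInventory()
    {
        return new List<(int Id, int ProductId, decimal Cost)>
        {
            (1, 1, 10m),
            (2, 2, 55m),
            (3, 1, 42m),
            (4, 3, 102m),
            (5, 2, 110m)
        };
    }
}
```

With the sample rows, DuplicateProducts.IdsSharingAProduct(DuplicateProducts.SampleInventory()) returns 1, 2, 3, 5, the same set the SQL query produces.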

How do I write this query in Entity Framework?

I have a table (Items) with records in this format.
Name | ProductId | Owner
Product1 | 1 | xx
Product2 | 2 | yy
Product1 | 1 | xx
Product3 | 3 | xx
Product3 | 3 | xx
Product3 | 3 | xx
Product4 | 4 | xx
Product2 | 2 | xx
Product5 | 5 | xx
I want to write an Entity Framework query that will return the top 3 products, with their names and counts. The result should look like this.
Result
Name: [Product3, Product2, Product1],
Count:[3, 2, 2]
I have checked other answers here on SO but still did not find something close to what I want. Note that Name and Count each return a list covering the top 3 products.
You can try this code
var query = context
.products
.GroupBy(p => p.Name)
.OrderByDescending(g => g.Count())
.Take(3);
var result = context.products.Select(p => new
{
Name = query.Select(g => g.Key).ToList(),
Count = query.Select(g => g.Count()).ToList(),
})
.FirstOrDefault();
However, I recommend fetching the top 3 products together with their counts from the database and then splitting them into separate lists, like this:
var products = context.products
.GroupBy(p => p.Name)
.OrderByDescending(g => g.Count())
.Take(3)
.Select(g => new
{
Name = g.Key,
Count = g.Count()
})
.ToList();
List<string> names = products.Select(p => p.Name).ToList();
List<int> counts = products.Select(p => p.Count).ToList();
The following should work:
products.GroupBy(p => p.ProductId)
.OrderByDescending(g => g.Count())
.Take(3);
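One detail the queries above leave open is how ties are ordered: Product1 and Product2 both occur twice in the sample data. The sketch below is an in-memory illustration, with TopProducts and its methods being hypothetical names, that adds an explicit tie-breaker on the name so the result matches the order the question asks for. Whether a tie-breaker is wanted, and which one, is an assumption on my part.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TopProducts
{
    // Returns the three most frequent names and their counts as parallel lists.
    public static (List<string> Names, List<int> Counts) TopThree(
        IEnumerable<string> productNames)
    {
        var top = productNames
            .GroupBy(name => name)
            .Select(g => new { Name = g.Key, Count = g.Count() })
            .OrderByDescending(x => x.Count)
            // Tie-breaker (an assumption): higher name first when counts are equal.
            .ThenByDescending(x => x.Name, StringComparer.Ordinal)
            .Take(3)
            .ToList();

        return (top.Select(x => x.Name).ToList(),
                top.Select(x => x.Count).ToList());
    }

    // The Name column of the Items table from the question, in row order.
    public static List<string> SampleNames()
    {
        return new List<string>
        {
            "Product1", "Product2", "Product1",
            "Product3", "Product3", "Product3",
            "Product4", "Product2", "Product5"
        };
    }
}
```

For the sample rows this gives Names = [Product3, Product2, Product1] and Counts = [3, 2, 2]. Without the ThenByDescending call, the relative order of the two products with count 2 depends on the provider.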

Successive SelectMany in Linq Request

I have three tables built with EF code first.
I try to retrieve some information with SelectMany so that I can flatten the query and get only the fields that I need among those three tables.
My tables are presented as follow:
Tables: ProductOptions *-* ProductOptionValues 1-* LanguageProductOptionValue
|ProductOptionID | OVPriceOffset | LanguagesListID
|PriceOffset | OptionValueCategory | ProductOptionValueName
| | ... |
var queryCabColor = _db.ProductOptions
.Where(c => c.ProductOptionTypeID == 18 && c.ProductId == 1)
.SelectMany(z => z.ProductOptionValues, (productOptions, productOptionValues)
=> new
{
productOptions.ProductOptionID,
productOptions.PriceOffset,
productOptionValues.OVPriceOffset,
productOptionValues.OptionValueCategory,
productOptionValues.ProductOptionValuesID,
productOptionValues.Value,
productOptionValues.LanguageProductOptionValue
})
.SelectMany(d => d.LanguageProductOptionValue, (productOptionValues, productOptionValuesTranslation)
=> new
{
productOptionValuesTranslation.LanguagesListID,
productOptionValuesTranslation.ProductOptionValueName
})
.Where(y => y.LanguagesListID == currentCulture);
So far, when I loop over the query I can only retrieve LanguagesListID and ProductOptionValueName, and I can't find a way to get all of the above-mentioned fields. Any suggestions?
I think in your case the LINQ query syntax is more appropriate than explicit SelectMany calls. Something like this should work:
var queryCabColor =
from productOptions in _db.ProductOptions
where productOptions.ProductOptionTypeID == 18 && productOptions.ProductId == 1
from productOptionValues in productOptions.ProductOptionValues
from productOptionValuesTranslation in productOptionValues.LanguageProductOptionValue
where productOptionValuesTranslation.LanguagesListID == currentCulture
select new
{
productOptions.ProductOptionID,
productOptions.PriceOffset,
productOptionValues.OVPriceOffset,
productOptionValues.OptionValueCategory,
productOptionValues.ProductOptionValuesID,
productOptionValues.Value,
productOptionValuesTranslation.LanguagesListID,
productOptionValuesTranslation.ProductOptionValueName
};
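If you prefer to stay with method syntax, the reason the original query loses the outer fields is that the second SelectMany's result selector only returns the translation columns. A possible fix is to carry the earlier range variables forward in each result selector. The sketch below shows this in memory; the classes Option, OptionValue and Translation and the names Flatten and SampleOptions are simplified, hypothetical stand-ins for the real entities, and the sample values are invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified, hypothetical stand-ins for the three entities.
public sealed class Translation
{
    public int LanguagesListID { get; set; }
    public string ProductOptionValueName { get; set; }
}

public sealed class OptionValue
{
    public decimal OVPriceOffset { get; set; }
    public List<Translation> Translations { get; set; } = new List<Translation>();
}

public sealed class Option
{
    public int ProductOptionID { get; set; }
    public decimal PriceOffset { get; set; }
    public List<OptionValue> Values { get; set; } = new List<OptionValue>();
}

public static class OptionFlattener
{
    public static List<(int ProductOptionID, decimal PriceOffset, decimal OVPriceOffset, string Name)>
        Flatten(IEnumerable<Option> options, int languageId)
    {
        return options
            // Keep the option (o) alongside each of its values (v).
            .SelectMany(o => o.Values, (o, v) => new { o, v })
            // Keep o and v alongside each translation (t) instead of dropping them.
            .SelectMany(x => x.v.Translations, (x, t) => new { x.o, x.v, t })
            .Where(x => x.t.LanguagesListID == languageId)
            .Select(x => (ProductOptionID: x.o.ProductOptionID,
                          PriceOffset: x.o.PriceOffset,
                          OVPriceOffset: x.v.OVPriceOffset,
                          Name: x.t.ProductOptionValueName))
            .ToList();
    }

    // Invented sample data: one option, two values, two languages each.
    public static List<Option> SampleOptions()
    {
        return new List<Option>
        {
            new Option
            {
                ProductOptionID = 1,
                PriceOffset = 5m,
                Values = new List<OptionValue>
                {
                    new OptionValue
                    {
                        OVPriceOffset = 2m,
                        Translations = new List<Translation>
                        {
                            new Translation { LanguagesListID = 1, ProductOptionValueName = "Red" },
                            new Translation { LanguagesListID = 2, ProductOptionValueName = "Rouge" }
                        }
                    },
                    new OptionValue
                    {
                        OVPriceOffset = 3m,
                        Translations = new List<Translation>
                        {
                            new Translation { LanguagesListID = 1, ProductOptionValueName = "Blue" },
                            new Translation { LanguagesListID = 2, ProductOptionValueName = "Bleu" }
                        }
                    }
                }
            }
        };
    }
}
```

The query syntax version compiles down to essentially this chain; the compiler introduces the intermediate anonymous types for you, which is why it reads more easily.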

Entity Framework Group By with Max Date and count

I have the following SQL
SELECT Tag , COUNT(*) , MAX(CreatedDate)
FROM dbo.tblTags
GROUP BY Tag
Which outputs the following:
+-----------------+------------------+-------------------------+
| Tag | (No column name) | (No column name) |
+-----------------+------------------+-------------------------+
| a great tag | 1 | 2015-04-01 18:30:31.623 |
| not a test | 1 | 2015-04-01 17:46:09.360 |
| test | 5 | 2015-04-01 18:13:17.920 |
| test2 | 1 | 2013-03-07 16:53:54.217 |
+-----------------+------------------+-------------------------+
I'm trying to replicate the output of that query using EntityFramework.
I have the following logic which works:
var GroupedTags = Tags.GroupBy(c => c.Tag)
.Select(g => new
{
name = g.Key,
count = g.Count(),
date = g.OrderByDescending(gt => gt.CreatedDate).FirstOrDefault().CreatedDate
})
.OrderBy(c => c.name);
But it takes horribly long to execute compared to the raw SQL query. Any suggestions on how to optimise my approach? It somehow feels wrong.
If you want a max, use the Max() Linq method:
var GroupedTags = Tags.GroupBy(c => c.Tag)
.Select(g => new
{
name = g.Key,
count = g.Count(),
date = g.Max(x => x.CreatedDate)
})
.OrderBy(c => c.name);
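For a quick sanity check of the Max() version outside the database, here is a minimal in-memory sketch. TagStats, Summarize and SampleTags are hypothetical names, the rows are plain tuples, and the sample dates are invented for illustration (the question only shows the aggregated output, not the underlying rows).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagStats
{
    // Per tag: number of rows and the most recent CreatedDate, ordered by tag name.
    public static List<(string Name, int Count, DateTime Latest)> Summarize(
        IEnumerable<(string Tag, DateTime CreatedDate)> tags)
    {
        return tags
            .GroupBy(t => t.Tag)
            .Select(g => (Name: g.Key,
                          Count: g.Count(),
                          Latest: g.Max(t => t.CreatedDate)))  // aggregate, no sorting
            .OrderBy(r => r.Name, StringComparer.Ordinal)
            .ToList();
    }

    // Invented sample rows (not taken from the question).
    public static List<(string Tag, DateTime CreatedDate)> SampleTags()
    {
        return new List<(string Tag, DateTime CreatedDate)>
        {
            ("test",  new DateTime(2015, 3, 1, 9, 0, 0)),
            ("test2", new DateTime(2013, 3, 7, 16, 53, 54)),
            ("test",  new DateTime(2015, 4, 1, 18, 13, 17)),
            ("test",  new DateTime(2014, 12, 24, 8, 30, 0))
        };
    }
}
```

Max maps onto the same kind of aggregate as MAX(CreatedDate) in the raw SQL, whereas ordering each group and taking the first row asks for more work than the result needs.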

Calculating difference between different columns in different rows

I have a table that records what happens to a vehicle during a visit. For each visit, there are multiple rows to denote what has been done. The table looks like this
VisitID | ActionID | StartTime | EndTime
0 | 0 | 1/1/2013 | 1/2/2013
1 | 0 | 1/2/2013 | 1/4/2013
1 | 1 | 1/4/2013 | 1/7/2013
2 | 0 | 1/4/2013 | 1/5/2013
2 | 1 | 1/5/2013 | 1/6/2013
2 | 2 | 1/6/2013 | 1/7/2013
I wish to construct a LINQ query capable of getting the amount of time a visit took. TotalTime is calculated by first finding the first and last (lowest and highest) ActionID, then last.EndTime - first.StartTime. Expected results:
VisitID | TotalTime
0 | 1
1 | 5
2 | 3
I can generate my expected results by doing
var first = db.Visits.Where(v => v.ActionID == 0);
var last = db.Visits.GroupBy(x => x.VisitID).Select(g => g.OrderByDescending(x => x.ActionID).First());
first.Join(last, f => f.VisitID, l => l.VisitID, (f, l) => new { VisitID = f.VisitID, TotalTime = l.EndTime - f.StartTime });
I really don't like the hack I used to get the last ActionID, and I would really like to be able to do this within 1 LINQ statement. What do I need to do to achieve this?
I think this should work...
var result = db.Visits.GroupBy(v => v.VisitID)
.Select(g => new
{
VisitId = g.Key,
TotalTime = g.Max(v => v.EndTime).Subtract(g.Min(v => v.StartTime)).Days
});
Edit: This assumes the ActionID doesn't matter, only the latest end time and the earliest start time. Here is a different solution in which the rows are ordered by ActionID and the first and last rows of each visit are used to calculate the time difference.
var result2 = db.Visits.GroupBy(v => v.VisitID)
.Select(g => new
{
VisitId = g.Key,
TotalTime =
g.OrderBy(v => v.ActionID).Last().EndTime.Subtract(g.OrderBy(v => v.ActionID).First().StartTime).Days
});
db.Visits.GroupBy(v => v.VisitID)
.Select(g => new { VisitId = g.Key,
Days = g.Select(x => x.EndTime.ToOADate()).Sum()
-g.Select(x => x.StartTime.ToOADate()).Sum() });
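I used a little hack: add up all the start days and all the end days for each visit and take the difference. This equals the total visit time only when each action starts exactly when the previous one ends, as in the sample data.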
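To compare the approaches against the expected results, here is a minimal in-memory sketch of the "first and last ActionID" variant. VisitDurations, Compute and SampleRows are hypothetical names and the rows are plain tuples. Because it uses a statement lambda and List indexing, it only works with LINQ to Objects, not as a translated database query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VisitDurations
{
    // For each visit: EndTime of the highest ActionId minus StartTime of the lowest, in days.
    public static List<(int VisitId, int TotalDays)> Compute(
        IEnumerable<(int VisitId, int ActionId, DateTime StartTime, DateTime EndTime)> rows)
    {
        return rows
            .GroupBy(r => r.VisitId)
            .Select(g =>
            {
                // Sort each group once and reuse it for both ends.
                var ordered = g.OrderBy(r => r.ActionId).ToList();
                var first = ordered[0];
                var last = ordered[ordered.Count - 1];
                return (VisitId: g.Key, TotalDays: (last.EndTime - first.StartTime).Days);
            })
            .OrderBy(x => x.VisitId)
            .ToList();
    }

    // The six rows from the question's table (dates read as month/day/year).
    public static List<(int VisitId, int ActionId, DateTime StartTime, DateTime EndTime)> SampleRows()
    {
        return new List<(int VisitId, int ActionId, DateTime StartTime, DateTime EndTime)>
        {
            (0, 0, new DateTime(2013, 1, 1), new DateTime(2013, 1, 2)),
            (1, 0, new DateTime(2013, 1, 2), new DateTime(2013, 1, 4)),
            (1, 1, new DateTime(2013, 1, 4), new DateTime(2013, 1, 7)),
            (2, 0, new DateTime(2013, 1, 4), new DateTime(2013, 1, 5)),
            (2, 1, new DateTime(2013, 1, 5), new DateTime(2013, 1, 6)),
            (2, 2, new DateTime(2013, 1, 6), new DateTime(2013, 1, 7))
        };
    }
}
```

On the sample rows this produces 1, 5 and 3 days for visits 0, 1 and 2, which matches the expected results in the question.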
