How to aggregate and SUM EntityFramework fields with multiple joins - c#

I can produce the result set I want, but I also need to group and sum some of these fields, and I am struggling to understand how to approach this.
In my scenario, what would be the best way to get results that will:
Have a distinct [KeyCode] (right now I get many records with the same KeyCode but different occupation details)
SUM wage and projection fields (in same query)
Here is my LINQ code:
private IQueryable<MyAbstractCustomOccupationInfoClass> GetMyAbstractCustomOccupationInfoClass(string[] regionNumbers)
{
//Get a list of wage data
var wages = _db.ProjectionAndWages
.Join(
_db.HWOLInformation,
w => w.KeyCode, // "w" avoids clashing with the local variable "wages" being declared
hwol => hwol.KeyCode,
(w, hwol) => new { wages = w, hwol }
)
.Where(o => regionNumbers.Contains(o.hwol.LocationID))
.Where(o => o.wages.areaID.Equals("48"))
.Where(o => regionNumbers.Contains(o.wages.RegionNumber.Substring(4))); //regions filter, remove first 4 characters (0000)
//Join OccupationInfo table to wage data, for "full" output results
var occupations = wages.Join(
_db.OccupationInfo,
o => o.wages.KeyCode,
p => p.KeyCode,
(p, o) => new MyAbstractCustomOccupationInfoClass
{
KeyCode = o.KeyCode,
KeyTitle = o.KeyTitle,
CareerField = o.CareerField,
AverageAnnualOpeningsGrowth = p.wages.AverageAnnualOpeningsGrowth,
AverageAnnualOpeningsReplacement = p.wages.AverageAnnualOpeningsReplacement,
AverageAnnualOpeningsTotal = p.wages.AverageAnnualOpeningsTotal,
});
//TO-DO: How to Aggregate and Sum "occupations" list here & make the [KeyCode] Distinct ?
return occupations;
}
I am unsure whether I should group on the second join, use .GroupJoin() instead, or add a third query.

var occupations = _db.OccupationInfo.GroupJoin(
wages,
o => o.KeyCode,
p => p.wages.KeyCode,
(o, pg) => new MyAbstractCustomOccupationInfoClass {
KeyCode = o.KeyCode,
KeyTitle = o.KeyTitle,
CareerField = o.CareerField,
AverageAnnualOpeningsGrowth = pg.Sum(p => p.wages.AverageAnnualOpeningsGrowth),
AverageAnnualOpeningsReplacement = pg.Sum(p => p.wages.AverageAnnualOpeningsReplacement),
AverageAnnualOpeningsTotal = pg.Sum(p => p.wages.AverageAnnualOpeningsTotal),
});
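To see what the GroupJoin-plus-Sum shape produces, here is a minimal in-memory sketch. It runs against LINQ to Objects rather than Entity Framework, and the sample rows and the OpeningsDemo class are hypothetical stand-ins for the OccupationInfo table and the filtered wage rows.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class OpeningsDemo
{
    // Hypothetical in-memory data; the real query would read from _db instead.
    public static List<(string KeyCode, string KeyTitle, int Growth, int Replacement)> Summarize()
    {
        var occupations = new[]
        {
            (KeyCode: "11-1011", KeyTitle: "Chief Executives"),
            (KeyCode: "15-1132", KeyTitle: "Software Developers"),
        };
        var wages = new[]
        {
            (KeyCode: "11-1011", Growth: 5, Replacement: 10),
            (KeyCode: "11-1011", Growth: 3, Replacement: 4),
            (KeyCode: "15-1132", Growth: 20, Replacement: 7),
        };

        // One output row per occupation; ws holds every wage row with the same KeyCode.
        return occupations
            .GroupJoin(
                wages,
                o => o.KeyCode,
                w => w.KeyCode,
                (o, ws) => (
                    KeyCode: o.KeyCode,
                    KeyTitle: o.KeyTitle,
                    Growth: ws.Sum(w => w.Growth),
                    Replacement: ws.Sum(w => w.Replacement)))
            .ToList();
    }

    public static void Main()
    {
        foreach (var row in Summarize())
            Console.WriteLine($"{row.KeyCode} {row.KeyTitle}: growth={row.Growth}, replacement={row.Replacement}");
    }
}
```

Note that GroupJoin keeps occupations that have no matching wage rows; in LINQ to Objects their sums are 0, while a provider that translates the query to SQL may handle empty groups differently.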

Related

C# Linq - Refactoring ForEach Loop with Sub List

I am trying to refactor some data in order to display some charts.
I can't figure out why the following code lists all the values at the top instead of keeping them in sequence like the source data.
var categories = VehicleSales.Select(v => v.name).Distinct().ToList();
var refactoredResults = new List<StackedColumnChart>();
foreach (var category in categories)
{
var subresult = VehicleSales.Where(x => x.vehicleType == category)
.GroupBy(x => x.vehicleType)
.Select(gcs => new StackedColumnChart
{
Category = category,
Values = gcs.Select(x => (int)x.data).DefaultIfEmpty(0).ToList()
}).ToList();
refactoredResults.AddRange(subresult);
}
Source Data:
Then the actual results and expected results:
Thanks in advance!
You can do that without a loop and without selecting distinct values first: just use the GroupBy method and map each group to a StackedColumnChart using Select.
var refactoredResults = VehicleSales
.GroupBy(s => s.Category)
.Select(g => new StackedColumnChart
{
Category = g.Key,
Values = g.Select(s => s.Value).ToList()
})
.ToList();
If the original data is not sorted and you need the values ordered by week number, add an OrderBy before selecting the values:
Values = g.OrderBy(s => s.WeekNumber).Select(s => s.Value).ToList()
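A small runnable sketch of this grouping, assuming each sales row carries a Category, a WeekNumber, and a Value (the property names used in the answer). The sample rows and the ChartDemo class are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class StackedColumnChart
{
    public string Category { get; set; }
    public List<int> Values { get; set; }
}

public static class ChartDemo
{
    public static List<StackedColumnChart> Build()
    {
        // Hypothetical rows, deliberately out of week order.
        var vehicleSales = new[]
        {
            (Category: "Car", WeekNumber: 2, Value: 7),
            (Category: "Truck", WeekNumber: 1, Value: 3),
            (Category: "Car", WeekNumber: 1, Value: 5),
            (Category: "Truck", WeekNumber: 2, Value: 4),
        };

        // One chart series per category, values sorted by week within each group.
        return vehicleSales
            .GroupBy(s => s.Category)
            .Select(g => new StackedColumnChart
            {
                Category = g.Key,
                Values = g.OrderBy(s => s.WeekNumber).Select(s => s.Value).ToList()
            })
            .ToList();
    }

    public static void Main()
    {
        foreach (var series in Build())
            Console.WriteLine($"{series.Category}: {string.Join(", ", series.Values)}");
    }
}
```

In LINQ to Objects, GroupBy yields groups in the order their keys first appear, so "Car" comes before "Truck" here.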

linq group by and select multiple columns not in group by

I'm trying to select multiple columns that are not in the group by clause, using LINQ in C#.
Using LINQ, I'm trying to group by ISNULL(fieldOne,''), ISNULL(fieldTwo,'') and then select field_One, field_Two, field_Three for each group. So for each row that the group by would return, I want to see all of the rows behind it.
So far I have the following, but can't seem to select all the needed columns.
var xy = tableQueryable.Where(cust =>
!string.IsNullOrEmpty(cust.field_One)
|| !string.IsNullOrEmpty(cust.field_Two)
).GroupBy(cust => new { field_One = cust.field_One ?? string.Empty, field_Two = cust.field_Two ?? string.Empty }).Where(g => g.Count() > 1).AsQueryable();
Can somebody help, please?
You are pretty much there - all you are missing is a Select from the group:
var xy = tableQueryable
.Where(cust => !string.IsNullOrEmpty(cust.first_name) || !string.IsNullOrEmpty(cust.last_name))
.GroupBy(cust => new { first_name = cust.first_name ?? string.Empty, last_name = cust.last_name ?? string.Empty })
.Where(g => g.Count() > 1)
.ToList() // Try to work around the cross-apply issue
.SelectMany(g => g.Select(cust => new {
Id = cust.Id
, cust.first_name
, cust.last_name
, cust.RepId
}));
Select from each group does the projection of the fields that you want, while SelectMany dumps all the results into a flat list.
Would this work for you?
var groupsWithDuplicates = tableQueryable
.Where(c => !string.IsNullOrWhiteSpace(c.first_name) || !string.IsNullOrWhiteSpace(c.last_name))
.GroupBy(c => new { FirstName = c.first_name ?? "", LastName = c.last_name ?? "" })
.Where(group => group.Count() > 1) // Only keep groups with more than one item
.ToList();
var duplicates = groupsWithDuplicates
.SelectMany(g => g) // Flatten out groups into a single collection
.Select(c => new { c.first_name, c.last_name, c.customer_rep_id });
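The filter, group, keep-duplicates, flatten sequence can be tried out in memory with a sketch like the one below. The customer rows and the DuplicateDemo class are hypothetical, and a tuple is used as the grouping key instead of an anonymous type purely to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateDemo
{
    public static List<(int Id, string FirstName, string LastName)> FindDuplicates()
    {
        // Hypothetical customers; nulls mimic missing names in the table.
        var customers = new (int Id, string FirstName, string LastName)[]
        {
            (1, "Ann", "Lee"),
            (2, "Ann", "Lee"),
            (3, "Bob", null),
            (4, null, null),   // dropped by the Where: both names are empty
            (5, "Cy", "Oh"),
        };

        return customers
            .Where(c => !string.IsNullOrEmpty(c.FirstName) || !string.IsNullOrEmpty(c.LastName))
            .GroupBy(c => (First: c.FirstName ?? "", Last: c.LastName ?? ""))
            .Where(g => g.Count() > 1)   // keep only names that occur more than once
            .SelectMany(g => g)          // flatten the groups back into individual rows
            .ToList();
    }

    public static void Main()
    {
        foreach (var c in FindDuplicates())
            Console.WriteLine($"{c.Id}: {c.FirstName} {c.LastName}");
    }
}
```

Only the two "Ann Lee" rows survive, each as its own row, which is the "numerous rows per group" result the question asks for.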
I used the following query to filter Customer records and group them by JobFunction. In my case, the problem was resolved by adding .AsEnumerable() after the Where, so that the grouping runs in memory.
var query = _context.Customer
.Where(x => x.JobTitle.ToUpper().Contains(searchText.ToUpper())).AsEnumerable()
.GroupBy(item => item.JobFunction,
(key, group) => new {
JobFunction = key,
CustomerRecords = group.ToList()
})
.ToList();

Entity Framework: use an already selected value, saved in a new variable, later in the select statement

I wrote an Entity Framework select:
var query = context.MyTable
.Select(a => new
{
count = a.OtherTable.Where(b => b.id == id).Sum(c => c.value),
total = a.OtherTable2.Where(d => d.id == id) * count ...
});
Currently I always have to compute total like this:
var query = context.MyTable
.Select(a => new
{
count = a.OtherTable.Where(b => b.id == id).Sum(c => c.value),
total = a.OtherTable2.Where(d => d.id == id) * a.OtherTable.Where(b => b.id == id).Sum(c => c.value)
});
Is it possible to write it as in my first example, since I have already retrieved the value (and if so, how), or do I have to select it again?
One possible approach is to use two successive selects:
var query = context.MyTable
.Select(a => new
{
count = a.OtherTable.Where(b => b.id == id).Sum(c => c.value),
total = a.OtherTable2.Where(d => d.id == id)
})
.Select(x => new
{
count = x.count,
total = x.total * x.count
});
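As a rough in-memory illustration of the two-step projection, the sketch below computes a value once in the first Select and reuses it in the second. The rows, the Price field, and the TwoStepSelectDemo class are assumptions made up for the example; they are not part of the original model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TwoStepSelectDemo
{
    public static List<(int Count, int Total)> Compute()
    {
        // Hypothetical rows: a list of values plus a price to multiply by.
        var rows = new[]
        {
            (Values: new[] { 1, 2, 3 }, Price: 10),
            (Values: new[] { 4 }, Price: 5),
        };

        return rows
            // Step 1: compute the sum once and carry it forward.
            .Select(r => new { Count = r.Values.Sum(), r.Price })
            // Step 2: reuse the already computed value instead of repeating the sum.
            .Select(x => (Count: x.Count, Total: x.Price * x.Count))
            .ToList();
    }

    public static void Main()
    {
        foreach (var row in Compute())
            Console.WriteLine($"count={row.Count}, total={row.Total}");
    }
}
```

In query syntax the same idea is usually written with a let clause, which introduces the intermediate value by name.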
You would simply do:
var listFromDatabase = context.MyTable;
var query1 = listFromDatabase.Select(a => /* do something */);
var query2 = listFromDatabase.Select(a => /* do something */);
Although, to be fair, Select requires you to return something, and you aren't; you're just computing count and total and setting their values. If you want to do that, I would advise:
var listFromDatabase = context.MyTable.ToList();
listFromDatabase.ForEach(x =>
{
count = do_some_counting;
total = do_some_totalling;
});
Note that ToList() stops the query from being an IQueryable and materializes it into an in-memory list; List<T> also provides the ForEach method.
If you're going to do complex work inside the Select, I would always use:
context.MyTable.AsEnumerable()
That way the Select runs in memory instead of being translated into a database query.
So, to recap: first, load the table contents into a variable and call ToList() to get actual results (this is where the work is done). Second, if you are working from a straight query, use AsEnumerable() to allow more complex functions inside the Select.

LINQ: how to get a group of a table ordering with a related table?

I have a question about the IGrouping object that results from a LINQ query that uses a "group by" clause.
I have two tables in the database, Products and Responses, with a one-to-many relationship. The Responses table has a column called FinalRate, which is the rating of the product. A product can have any number of responses (ratings).
I want to get the Products ordered by the sum of FinalRate divided by the number of ratings, that is, ordered by average rating from highest to lowest.
As the code at the end of the question shows, I get the responses first. I use a group to sum all the FinalRate values and divide them by the count.
There are two problems with the code, even though it works:
1. I tried to get the Products in a single query, but could not, because I cannot group by the Products table and then use the Responses table in the "orderby". Also, LINQ only lets you group one range variable; it is impossible to write "group prod, response".
I couldn't express this SQL statement in LINQ:
select prod.ProductID,prod.Commercial_Product_Name,prod.Manufacturer_Name,
prod.ProductImageUrl
from rev_product prod
inner join rev_response res on res.AtProductid=prod.ProductID
group by prod.ProductID,prod.Commercial_Product_Name,prod.Manufacturer_Name
,prod.ProductImageUrl
order by (sum(res.FinalRate)/count(res.AtProductid))
I tried this:
var gruposproductos = (from prod in ctx.Products
join res in ctx.Responses on prod.ProductID equals res.AtProductId
group prod by prod.ProductID into g
orderby (g.Sum(ra =>ra.FinalRate)/g.Count())
descending select g).Take(2);
But as I said, the "orderby (g.Sum..." gives an error, because "into g" groups the Products table, not the Responses table.
That is why my final code does not get the products in the same LINQ statement.
2. Having accepted that, the problem is that I get an IGrouping rather than a list of Responses, so I cannot iterate it without the two nested foreach loops in the code. I wanted a single loop, as you would have with a List object.
It is not an elegant method, but it works. Moreover, I have to make sure that each product is added only once in the second loop.
Is there any better code?
var groupproducts = (from res in ctx.Responses
group res by res.AtProductId into g
orderby (g.Sum(ra =>ra.FinalRate)/g.Count())
descending select g).Take(2).ToList();
List<Product> theproducts = new List<Product>();
foreach (var groupresponse in groupproducts)
{
foreach (var response in groupresponse)
{
var producttemp= (from prod in ctx.Products
where prod.ProductID == response.AtProductId
select prod).First();
theproducts.Add(producttemp);
}
}
FINAL SOLUTION (thx a lot @Daniel)
var productsanonymtype = ctx.Products.Select(x => new
{
Product = x,
AverageRating = x.Responses.Count() == 0 ? 0 : x.Responses.Select(r => (double)r.FinalRate).Sum() / x.Responses.Count()
}).OrderByDescending(x => x.AverageRating);
List<Product> products = new List<Product>();
foreach (var prod in productsanonymtype)
{
products.Add(prod.Product);
}
Try this:
products.Select(x => new
{
Product = x,
AverageRating = x.Responses.Sum(r => r.FinalRate) /
x.Responses.Count()
});
The Sum overload I am using is not implemented in all providers. If that's a problem for you, you can use this alternate version:
products.Select(x => new
{
Product = x,
AverageRating = x.Responses.Select(r => r.FinalRate)
.Sum() /
x.Responses.Count()
});
If there is no navigation property from product to its responses you should first try to fix that. If you can't you can use this version:
products.Join(responses, x => x.Id, x => x.ProductId,
(p, r) => new { Product = p, Response = r })
.GroupBy(x => x.Product)
.Select(g => new { Product = g.Key,
AverageRating = g.Select(x => x.Response.FinalRate)
.Sum() /
g.Count()
});
Assuming FinalRate is an int, both methods will calculate the average rating with an int, i.e. there will be no 4.5 rating. And there will be no rounding, i.e. an actual average rating of 4.9 will result in 4. You can fix that by casting one of the operands of the division to double.
Another problem is the case with no ratings so far. The code above will result in an exception in this case. If that's a problem for you, you can change the calculation to this:
AverageRating = g.Count() == 0
? 0
: g.Select(x => (double)x.Response.FinalRate).Sum() / g.Count()
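To make the integer-division and empty-ratings points concrete, here is a minimal in-memory sketch. The products, their ratings, and the RatingDemo class are invented for the example; a real query would go through the Products and Responses tables.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RatingDemo
{
    public static List<(string Name, double AverageRating)> Rank()
    {
        // Hypothetical products with their integer ratings; "B" has none yet.
        var products = new[]
        {
            (Name: "A", Rates: new[] { 4, 5 }),
            (Name: "B", Rates: new int[0]),
            (Name: "C", Rates: new[] { 5, 5 }),
        };

        return products
            .Select(p => (
                Name: p.Name,
                // Guard the empty case, and sum as double so 9 / 2 gives 4.5 rather than 4.
                AverageRating: p.Rates.Length == 0
                    ? 0.0
                    : p.Rates.Sum(r => (double)r) / p.Rates.Length))
            .OrderByDescending(x => x.AverageRating)
            .ToList();
    }

    public static void Main()
    {
        foreach (var p in Rank())
            Console.WriteLine($"{p.Name}: {p.AverageRating}");
    }
}
```

With these rows the order is C (5), A (4.5), B (0); without the cast, A would be reported as 4.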
ctx.Products.GroupBy(x => new {
ProductId = x.ProductId,
FinalRate = x.Responses.Sum(y => y.FinalRate),
CountProductId = x.Responses.Count
})
.OrderBy(x => x.Key.FinalRate / x.Key.CountProductId);
And here it is with the projection:
ctx.Products.Select(x => new {
ProductID = x.ProductID,
Commercial_Product_Name = x.Commercial_Product_Name,
Manufacturer_Name = x.Manufacturer_Name,
ProductImageUrl = x.ProductImageUrl,
FinalRate = x.Responses.Sum(y => y.FinalRate),
CountProductId = x.Responses.Count
})
.GroupBy(x => new {
ProductId = x.ProductID,
FinalRate = x.FinalRate,
CountProductId = x.CountProductId
})
.OrderBy(x => x.Key.FinalRate / x.Key.CountProductId);

How to join 3 tables with lambda expression?

I have a simple LINQ lambda join query but I want to add a 3rd join with a where clause. How do I go about doing that?
Here's my single join query:
var myList = Companies
.Join(
Sectors,
comp => comp.Sector_code,
sect => sect.Sector_code,
(comp, sect) => new {Company = comp, Sector = sect} )
.Select( c => new {
c.Company.Equity_cusip,
c.Company.Company_name,
c.Company.Primary_exchange,
c.Company.Sector_code,
c.Sector.Description
});
I want to add the following SQL command to the above LINQ query and still maintain the projections:
SELECT
sector_code, industry_code
FROM
distribution_sector_industry
WHERE
service = 'numerical'
The 3rd join would be made with Sector table & Distribution_sector_industry on sector_code.
Thanks in advance.
Just a guess:
var myList = Companies
.Join(
Sectors,
comp => comp.Sector_code,
sect => sect.Sector_code,
(comp, sect) => new { Company = comp, Sector = sect })
.Join(
DistributionSectorIndustry.Where(dsi => dsi.Service == "numerical"),
cs => cs.Sector.Sector_code,
dsi => dsi.Sector_code,
(cs, dsi) => new { cs.Company, cs.Sector, IndustryCode = dsi.Industry_code })
.Select(c => new {
c.Company.Equity_cusip,
c.Company.Company_name,
c.Company.Primary_exchange,
c.Company.Sector_code,
c.Sector.Description,
c.IndustryCode
});
Okay, I can't see why you'd want to select sector_code when you already know it, but I think you want this:
var query = from company in Companies
join sector in Sectors
on company.SectorCode equals sector.SectorCode
join industry in DistributionSectorIndustry
on sector.SectorCode equals industry.SectorCode
where industry.Service == "numerical"
select new {
company.EquityCusip,
company.CompanyName,
company.PrimaryExchange,
company.SectorCode,
sector.Description,
industry.IndustryCode
};
Notes:
I've changed it into a query expression as that's a much more readable way of expressing a query like this.
Although the "where" clause comes after the join, assuming this is a LINQ to SQL or Entity Framework query, it shouldn't make any difference
I've lengthened the range variable names for clarity
I've converted your other names into conventional .NET names; you can do this too in your model
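The three-way query expression can be exercised against in-memory collections with a sketch like this one. The sample companies, sectors, and distribution rows, as well as the JoinDemo class, are hypothetical; only the shape of the joins follows the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinDemo
{
    public static List<(string Company, string Sector, string Industry)> Query()
    {
        // Hypothetical stand-ins for the three tables.
        var companies = new[]
        {
            (Name: "Acme", SectorCode: "S1"),
            (Name: "Globex", SectorCode: "S2"),
        };
        var sectors = new[]
        {
            (SectorCode: "S1", Description: "Tech"),
            (SectorCode: "S2", Description: "Energy"),
        };
        var distribution = new[]
        {
            (SectorCode: "S1", IndustryCode: "I10", Service: "numerical"),
            (SectorCode: "S2", IndustryCode: "I20", Service: "other"),
        };

        // Two inner joins followed by a filter on the third table.
        var query = from company in companies
                    join sector in sectors on company.SectorCode equals sector.SectorCode
                    join industry in distribution on sector.SectorCode equals industry.SectorCode
                    where industry.Service == "numerical"
                    select (Company: company.Name, Sector: sector.Description, Industry: industry.IndustryCode);

        return query.ToList();
    }

    public static void Main()
    {
        foreach (var row in Query())
            Console.WriteLine($"{row.Company} / {row.Sector} / {row.Industry}");
    }
}
```

Only "Acme" is returned, because the "Globex" sector has no distribution row with the "numerical" service and inner joins drop unmatched rows.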
For four tables:
var query = CurrencyDeposits
.Join(Customers, cd => cd.CustomerId, cus => cus.Id, (cd, cus)
=> new { CurrencyDeposit = cd, Customer = cus })
.Join(Currencies, x => x.CurrencyDeposit.CurrencyId, cr => cr.Id, (x, cr)
=> new { x.CurrencyDeposit, x.Customer, Currency = cr })
.Join(Banks, x => x.CurrencyDeposit.BankId, bn => bn.Id, (x, bn)
=> new { x.CurrencyDeposit, x.Customer, x.Currency, Bank = bn})
.Select(s => new {
s.CurrencyDeposit.Id,
s.Customer.NameSurname,
s.Currency.Code,
s.Bank.BankName,
s.CurrencyDeposit.RequesCode
});
Try something like this:
var myList = (from a in Companies
join b in Sectors on a.Sector_code equals b.Sector_code
join c in Distribution on b.distribution_code equals c.distribution_code
select new {...});
