SQL to LINQ duplicated group count - c#

I am converting a SQL result to LINQ.
The SQL is simple:
select NAME, DESC, count(*) total from dbo.TBL_ITEM_BY_PROVIDER p
inner join dbo.TBL_TYPE_PROVIDER tp on tp.id = p.provider_id
group by NAME, DESC, SORT_ORDER
order by SORT_ORDER
The output is simple:
NAME DESC Count(*)
CSD Census and Statistics 5
LandsD Lands Department 52
PlandD Planning Department 29
My LINQ:
from p in data.TBL_ITEM_BY_PROVIDERs
join tp in data.TBL_TYPE_PROVIDERs on p.PROVIDER_ID equals tp.ID
group new { p, tp } by new { tp.NAME, tp.DESC } into provider
orderby (provider.Key.NAME)
select new {
provider.Key.NAME,
provider.Key.DESC,
count = (from pp in provider select pp.tp.NAME.ToList().Count())
};
and the output is a duplicated count array: [5,5,5,5,5]
0:{NAME: "CSD", DESC: "Census and Statistics", count: [5, 5, 5, 5, 5]}
1:{NAME: "LandsD", DESC: "Lands Department", count: [52, 52, 52, 52...]}
2:{NAME: "PlandD", DESC: "Planning Department", count: [29, 29, 29, 29...]}
How do I properly write a group statement like the one in the SQL?

You can write the grouping a bit differently. As you only want the number of items in each group, you can group a constant and call Count() on the group:
var result = from p in data.TBL_ITEM_BY_PROVIDERs
join tp in data.TBL_TYPE_PROVIDERs on p.PROVIDER_ID equals tp.ID
group 1 by new { tp.NAME, tp.DESC } into provider
orderby provider.Key.NAME
select new {
provider.Key.NAME,
provider.Key.DESC,
Count = provider.Count()
};
Notice that the following does not do what you expect:
pp.tp.NAME.ToList().Count()
NAME is a string. Calling ToList() on it returns a List<char>, so Count() on that counts the characters in the string. And because you do this in the select clause of a nested query, you get back a collection of counts instead of a single number.
Last, notice that in your SQL you order by SORT_ORDER, whereas in your LINQ you order by provider.Key.NAME. These are not the same field; it is only by chance that both give the same ordering for this data.
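To try the pattern without a database, here is a minimal in-memory sketch (LINQ to Objects, written as C# top-level statements). The sample arrays and their values are invented for illustration, and placing SORT_ORDER on the provider table is an assumption, since the question does not say which table owns it. The idea is that adding SORT_ORDER to the grouping key makes it available for ordering, as in the SQL.

```csharp
using System;
using System.Linq;

// Hypothetical stand-ins for the two tables (values invented for illustration).
var providers = new[]
{
    new { ID = 1, NAME = "CSD", DESC = "Census and Statistics", SORT_ORDER = 2 },
    new { ID = 2, NAME = "LandsD", DESC = "Lands Department", SORT_ORDER = 1 },
};
var items = new[]
{
    new { PROVIDER_ID = 1 }, new { PROVIDER_ID = 1 },
    new { PROVIDER_ID = 2 }, new { PROVIDER_ID = 2 }, new { PROVIDER_ID = 2 },
};

var result = (from p in items
              join tp in providers on p.PROVIDER_ID equals tp.ID
              // SORT_ORDER is part of the key so it can be used for ordering.
              group 1 by new { tp.NAME, tp.DESC, tp.SORT_ORDER } into g
              orderby g.Key.SORT_ORDER
              select new { g.Key.NAME, g.Key.DESC, Count = g.Count() }).ToList();

foreach (var row in result)
    Console.WriteLine($"{row.NAME} | {row.DESC} | {row.Count}");
```

Each element of result carries a single integer Count rather than a nested collection, because Count() is called on the group itself.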

According to the documentation, the LINQ group clause returns a sequence of IGrouping<TKey,TElement>. Since IGrouping<TKey,TElement> implements IEnumerable<TElement>, you can get the number of items in a group simply by calling Count() on it.
You can also simplify the group clause of your query:
from item in data.TBL_ITEM_BY_PROVIDERs
join provider in data.TBL_TYPE_PROVIDERs
on item.PROVIDER_ID equals provider.ID
group item by provider into itemsByProvider
orderby itemsByProvider.Key.NAME
select new
{
itemsByProvider.Key.NAME,
itemsByProvider.Key.DESC,
count = itemsByProvider.Count()
};

Related

Group by and left join in linq

There are two tables: Customers, which has the fields CustomerID and GroupID, and CustomerGroup, which has the fields GroupID and GroupName. I want to get the number of customers (CustomerID) in each group. Here is the LINQ statement:
var groups = from customerGroups in db.CustomerGroup
join customers in db.Customers on customerGroups.GroupID equals customers.GroupID into gc
where customerGroups.MerchantID == merchantID
from subCustomerGroups in gc.DefaultIfEmpty()
group customerGroups by customerGroups.GroupName into grpCustomerGroups
select new { GroupName = grpCustomerGroups.Key, Quantity = customers.Count()};
The problem is that Quantity = customers.Count() is invalid. How do I correct the statement?
The expected SQL statement is:
exec sp_executesql N'SELECT
1 AS [C1],
[GroupBy1].[K1] AS [GroupName],
[GroupBy1].[A1] AS [C2]
FROM ( SELECT
[Extent1].[GroupName] AS [K1],
COUNT(CustomerID) AS [A1]
FROM [dbo].[CustomerGroup] AS [Extent1]
LEFT OUTER JOIN [dbo].[Customer] AS [Extent2] ON [Extent1].[GroupID] = [Extent2].[GroupID]
WHERE [Extent1].[MerchantID] = @p__linq__0
GROUP BY [Extent1].[GroupName]
) AS [GroupBy1]',N'@p__linq__0 bigint',@p__linq__0=9
Usually, if you find yourself doing a left outer join followed by a GroupBy, it is because you want "items with their sub-items", like "Schools with their Students", "Clients with their Orders", "CustomerGroups with their Customers", etc. If you want this, consider using GroupJoin instead of "Join + DefaultIfEmpty + GroupBy".
I'm more familiar with method syntax, so I'll use that one.
int merchantId = ...
var result = dbContext.CustomerGroups
// keep only the CustomerGroups from merchantId
.Where(customerGroup => customerGroup.MerchantId == merchantId)
.GroupJoin(dbContext.Customers, // GroupJoin with Customers
customerGroup => customerGroup.GroupId, // from every CustomerGroup take the GroupId
customer => customer.GroupId, // from every Customer take the GroupId
// ResultSelector:
(customerGroup, customersInThisGroup) => new // from every CustomerGroup with all its
{ // matching customers make one new object
GroupName = customerGroup.GroupName,
Quantity = customersInThisGroup.Count(), // number of Customers in this CustomerGroup
});
In words:
Take the sequence of CustomerGroups. Keep only those CustomerGroups that have a value for property MerchantId equal to merchantId. From every remaining CustomerGroup, get all its Customers, by comparing the CustomerGroup.GroupId with each Customer.GroupId.
The result is a sequence of CustomerGroups, each with its Customers. From this result (parameter ResultSelector) get the GroupName from the CustomerGroup and the Quantity from the Customers in this group.
Your statement was:
Quantity = customers.CustomerID,
This will not work, and I'm sure it is not what you want. Alas, you forgot to write what you do want. I think it is this:
Quantity = customersInThisGroup.Count(),
But if you want the CustomerId of all Customers in this CustomerGroup:
// ResultSelector:
(customerGroup, customersInThisGroup) => new
{
GroupName = customerGroup.GroupName,
CustomerIds = customersInThisGroup.Select(customer => customer.CustomerId)
.ToList(),
});
If you want you can use the ResultSelector to get "CustomerGroups with their Customers". Most efficient is to select only the properties you actually plan to use:
// ResultSelector:
(customerGroup, customersInThisGroup) => new
{
// select only the CustomerGroup properties that you plan to use:
Id = customerGroup.GroupId,
Name = customerGroup.GroupName,
... // other properties that you plan to use
Customers = customersInThisGroup.Select(customer => new
{
// again, select only the Customer properties that you plan to use
Id = customer.CustomerId,
Name = customer.Name,
...
// not needed, you know the value:
// GroupId = customer.GroupId
})
.ToList(),
});
The reason not to select the foreign key of the Customers, is efficiency. If CustomerGroup [14] has 1000 Customers, then every Customer in this group will have a value for GroupId equal to [14]. It would be a waste to send this value [14] 1001 times.
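As a rough, self-contained illustration of the GroupJoin approach, the sketch below runs against in-memory arrays instead of a DbContext (C# top-level statements). The group names, IDs and the merchant value are invented for the example. It shows that a CustomerGroup without any Customers still appears in the result, with a Quantity of 0, which is the left-outer-join behaviour the question asks for.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory data standing in for the two tables.
var customerGroups = new[]
{
    new { GroupId = 1, GroupName = "Gold", MerchantId = 9L },
    new { GroupId = 2, GroupName = "Silver", MerchantId = 9L },
    new { GroupId = 3, GroupName = "Other", MerchantId = 7L },
};
var customers = new[]
{
    new { CustomerId = 10, GroupId = 1 },
    new { CustomerId = 11, GroupId = 1 },
    new { CustomerId = 12, GroupId = 3 },
};

long merchantId = 9;
var result = customerGroups
    .Where(g => g.MerchantId == merchantId)      // keep only this merchant's groups
    .GroupJoin(customers,
        g => g.GroupId,                          // key from every CustomerGroup
        c => c.GroupId,                          // key from every Customer
        (g, customersInGroup) => new
        {
            g.GroupName,
            Quantity = customersInGroup.Count(), // 0 for a group with no customers
        })
    .ToList();

foreach (var row in result)
    Console.WriteLine($"{row.GroupName}: {row.Quantity}");
```

GroupJoin preserves the order of the outer sequence in LINQ to Objects, so the rows come out in the order of customerGroups.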

Convert sql to LINQ query for selection of multiple max columns in c#

SELECT MAX(sectionid) AS SectionId,MAX(displayorder) AS DisplayOrder,propertyid AS PropertyId,1 AS IsSpecSection FROM (
SELECT mp.SectionId ,mp.DisplayOrder ,mp.PropertyId FROM
ModelProperties mp
INNER JOIN PropertySections PS ON mp.SectionId =
ps.SectionId
WHERE ps.IsSpecSection = 1 )s
GROUP BY propertyid
I want to convert the above query into LINQ. I am able to do it when selecting a single MAX column, but not for multiple.
I haven't tested this code, so you may have to modify it as you need:
using (var dbContext = new YourEntityName())
{
var result = (from mp in dbContext.ModelProperties
join ps in dbContext.PropertySections on mp.SectionId equals ps.SectionId
where ps.IsSpecSection == 1
group mp by mp.PropertyId into g
select new
{
SectionId = g.Max(x => x.SectionId), // MAX(sectionid)
DisplayOrder = g.Max(x => x.DisplayOrder), // MAX(displayorder)
PropertyId = g.Key,
IsSpecSection = 1
}).ToList();
}
Max value in Linq select Within Innerjoin
You can use this code,
var list=(from mp in ModelProperties
join ps in PropertySections on mp.SectionId equals ps.SectionId
where ps.IsSpecSection == 1
group new { mp, ps } by new { mp.PropertyId } into mgrp
from grp in mgrp.DefaultIfEmpty()
select new
{
grp.mp.SectionId,
grp.mp.PropertyId,
grp.mp.DisplayOrder,
grp.ps.IsSpecSection
}).OrderByDescending(x=>x.SectionId).First();
This query retrieves the ModelProperties rows that have a matching SectionId in PropertySections where IsSpecSection is 1. The matching rows are grouped by PropertyId, but the following from clause flattens the groups back into individual rows. OrderByDescending then sorts those rows by SectionId in descending order, and First() returns the single row with the largest SectionId overall. Note that this is one row, not one row per PropertyId, and its DisplayOrder is that row's own value rather than the maximum within its group.
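For the per-PropertyId result that the SQL in the question produces, each maximum can be computed with its own Max call on the group. The sketch below is a minimal in-memory version (LINQ to Objects, C# top-level statements); the sample rows are invented, and treating IsSpecSection as an int flag is an assumption, since in a real model it may well be a bool.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory rows; IsSpecSection is assumed to be an int flag here.
var modelProperties = new[]
{
    new { SectionId = 1, DisplayOrder = 5, PropertyId = 100 },
    new { SectionId = 2, DisplayOrder = 3, PropertyId = 100 },
    new { SectionId = 3, DisplayOrder = 9, PropertyId = 200 },
    new { SectionId = 4, DisplayOrder = 1, PropertyId = 200 },
};
var propertySections = new[]
{
    new { SectionId = 1, IsSpecSection = 1 },
    new { SectionId = 2, IsSpecSection = 1 },
    new { SectionId = 3, IsSpecSection = 1 },
    new { SectionId = 4, IsSpecSection = 0 }, // filtered out by the where clause
};

var result = (from mp in modelProperties
              join ps in propertySections on mp.SectionId equals ps.SectionId
              where ps.IsSpecSection == 1
              group mp by mp.PropertyId into g
              orderby g.Key
              select new
              {
                  SectionId = g.Max(x => x.SectionId),       // MAX(sectionid)
                  DisplayOrder = g.Max(x => x.DisplayOrder), // MAX(displayorder)
                  PropertyId = g.Key,
                  IsSpecSection = 1,
              }).ToList();

foreach (var row in result)
    Console.WriteLine($"{row.PropertyId}: SectionId={row.SectionId}, DisplayOrder={row.DisplayOrder}");
```

The two maxima are independent, just as in the SQL: for PropertyId 100 the largest SectionId and the largest DisplayOrder come from different rows.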

Get top five most repeating records in Entity Framework

I want to get the top five most repeated records from a table using LINQ to Entities (Entity Framework 4.0). How can this be done in a single query that returns a list of five records?
You simply group by the field that identifies a repeated record, order descending by the count of each group, and then Take(5). Grouping examples, amongst others, can be found at 101 LINQ Samples.
Actually, you should group by the fields that define whether a record is repeating or not; in your case it should be something like the member ID. Then you can introduce a new range variable that holds the number of records in each group, and use that variable for ordering the results:
var query = from s in db.Statistics
group s by s.MemberId into g // group by member Id
let loginsCount = g.Count() // get count of entries for each member
orderby loginsCount descending // order by entries count
select new { // create new anonymous object with all data you need
MemberId = g.Key,
LoginsCount = loginsCount
};
Then take first 5:
var top5 = query.Take(5);
That will generate query like
SELECT TOP (5) -- Take(5)
[GroupBy1].[K1] AS [MemberId], -- new { MemberId, LoginsCount }
[GroupBy1].[A1] AS [C1]
FROM ( SELECT
[Extent1].[MemberId] AS [K1],
COUNT(1) AS [A1] -- let loginsCount = g.Count()
FROM [dbo].[Statistics] AS [Extent1]
GROUP BY [Extent1].[MemberId] -- group s by s.MemberId
) AS [GroupBy1]
ORDER BY [GroupBy1].[A1] DESC -- orderby loginsCount descending
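The same group, count, order and take steps can be tried in memory with method syntax. This is only a sketch under stated assumptions: the memberIds array is invented sample data standing in for the Statistics table, and the ThenBy tie-breaker is an addition of mine so that groups with equal counts come out in a predictable order.

```csharp
using System;
using System.Linq;

// Hypothetical login records: one MemberId per row.
int[] memberIds = { 1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 6, 6, 6, 6, 6, 7, 7 };

var top5 = memberIds
    .GroupBy(id => id)                                   // group by member ID
    .Select(g => new { MemberId = g.Key, LoginsCount = g.Count() })
    .OrderByDescending(x => x.LoginsCount)               // most frequent first
    .ThenBy(x => x.MemberId)                             // tie-breaker (an addition, see above)
    .Take(5)
    .ToList();

foreach (var row in top5)
    Console.WriteLine($"Member {row.MemberId}: {row.LoginsCount}");
```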

Select records count from multiple tables in a single query

I have some models (restaurants, shops, products), and I want to select the record count for multiple models in a single LINQ query.
I know how it should look in SQL, but I don't know how to translate it to LINQ:
select
(select count(*) from restaurants) as restaurantsCount,
(select count(*) from shops) as shopsCount,
(select count(*) from products) as productsCount
from
dual
Considering dual is a dummy table with a single row, you can simply build one object from three counts (each Count() call executes as its own query):
var result = new
{
RestaurantsCount = context.Restaurants.Count(),
ShopsCount = context.Shops.Count(),
ProductsCount = context.Products.Count()
};
Single query solution:
var result = from dummyRow in new List<string> { "X" }
join product in context.products on 1 equals 1 into pg
join shop in context.shops on 1 equals 1 into sg
join restaurant in context.restaurants on 1 equals 1 into rg
select new
{
productsCount = pg.Count(),
shopsCount = sg.Count(),
restaurantsCount = rg.Count()
};

LINQ GroupBy confusion

I have
var result = (from rev in Revisions
join usr in Users on rev.UserID equals usr.ID
join clc in ChangedLinesCounts on rev.Revision equals clc.Revision
select new {rev.Revision,
rev.Date, usr.UserName, usr.ID, clc.LinesCount}).Take(6);
I make a couple of joins on different tables, not relevant for this question what keys are, but at the end of this query my result "table" contains
{Revision, Date, UserName, ID, LinesCount}
Now I execute a GroupBy in order to calculate the total lines count per user.
So..
from row in result group row by row.ID into g // {1}
select new {
g.Key,
totalCount = g.Sum(count=>count.LinesCount)
};
So I get a Key=ID, and totalCount=Sum, but
Confusion
I would like to have also other fields in final result.
In my understanding "table" after {1} grouping query consist of
{Revision, Date, UserName, ID, LinesCount, TotalCount}
If my assumption is correct, why I can not do something like this:
from row in result group row by row.ID into g // {1}
select new {
g.Key,
g.Revision, // Revision doesn't exist! Why??
totalCount = g.Sum(count=>count.LinesCount)
};
but
from row in result group row by row.ID into g // {1}
select new {
g.Key,
Revision = g.Select(x=>x.Revision), //Works !
totalCount = g.Sum(count=>count.LinesCount)
};
It works, but IMO it sucks, because I execute another Select.
In fact, looking at the LINQPad SQL output, I get 2 SQL queries.
Question
Is there an elegant and optimal way to do this, or do I always need to run a Select
on the grouped data in order to access fields that exist?
The problem is that you only group by ID. If you did that in SQL, you couldn't access the other fields either.
To have the other fields as well, you have to include them in your group clause:
from row in result group row by new { row.ID, row.Revision } into g
select new {
g.Key.ID,
g.Key.Revision,
totalCount = g.Sum(count=>count.LinesCount)
};
The problem here is your output logically looks something like this:
Key = 1
Id = 1, Revision = 3587, UserName = Bob, LinesCount = 34, TotalCount = 45
Id = 1, Revision = 3588, UserName = Joe, LinesCount = 64, TotalCount = 54
Id = 1, Revision = 3589, UserName = Jim, LinesCount = 37, TotalCount = 26
Key = 2
Id = 2, Revision = 3587, UserName = Bob, LinesCount = 34, TotalCount = 45
Id = 2, Revision = 3588, UserName = Joe, LinesCount = 64, TotalCount = 54
Id = 2, Revision = 3589, UserName = Jim, LinesCount = 37, TotalCount = 26
Much like when you perform an SQL GROUP BY, a value is either part of the key and thus unique per group, or it is in the details and thus repeated multiple times and possibly different for each row.
Now, logically, it might be that Revision and UserName are unique for each Id, but LINQ has no way to know that (the same as SQL has no way to know that).
To solve this you'll need to somehow specify which revision you want. For instance:
Revision = g.Select(x => x.Revision).FirstOrDefault()
To avoid the multiple SQL problem you would need to use an aggregate function that can be translated into SQL, since most SQL dialects do not have a first operator (the result set is considered unordered, so technically no item is "first").
Revision = g.Min(x => x.Revision)
Revision = g.Max(x => x.Revision)
The generic Min/Max overloads in LINQ to Objects also work on strings, but whether a given LINQ provider can translate them to SQL varies, so this may not be available for a column such as UserName.
In this case you can produce an intermediate result set for the Id and totals, then join this back to the original set to get the details, eg:
from d in items
join t in (
from i in items
group i by i.Id into g
select new { Id = g.Key, Total = g.Sum(x => x.LinesCount) }
) on d.Id equals t.Id
select new { Id = d.Id, Revision = d.Revision, Total = t.Total }
Revision doesn't exist in your second example because it's not a member of IGrouping<TKey, TElement>. An IGrouping<TKey, TElement> has a Key property, and it is also an IEnumerable<TElement> over all the rows grouped together. Thus each of those rows has a Revision, but there is no Revision for the grouping itself.
If the Revision will be the same for all rows with the same ID, you could use FirstOrDefault() so that the select nets at most one answer:
from row in result group row by row.ID into g // {1}
select new {
g.Key,
Revision = g.Select(x=>x.Revision).FirstOrDefault(),
totalCount = g.Sum(count=>count.LinesCount)
};
If the Revision is not unique per ID, though, you'd want to use an anonymous type for the grouping, as @Tobias suggests; then you will get a grouping based on both ID and Revision.
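The distinction between key fields and detail fields can be tried out on in-memory data. The sketch below (LINQ to Objects, C# top-level statements) uses invented rows shaped like the joined result in the question, and it assumes that UserName is the same for every row of a given ID so that it can safely join the key. Fields in the key are read from g.Key; anything else has to go through an aggregate such as Max or Sum.

```csharp
using System;
using System.Linq;

// Hypothetical rows shaped like the joined result in the question.
var rows = new[]
{
    new { Revision = 3587, UserName = "Bob", ID = 1, LinesCount = 34 },
    new { Revision = 3588, UserName = "Bob", ID = 1, LinesCount = 64 },
    new { Revision = 3589, UserName = "Joe", ID = 2, LinesCount = 37 },
};

var totals = (from row in rows
              group row by new { row.ID, row.UserName } into g
              orderby g.Key.ID
              select new
              {
                  g.Key.ID,
                  g.Key.UserName,                          // part of the key: one value per group
                  LatestRevision = g.Max(x => x.Revision), // not in the key: must be aggregated
                  TotalCount = g.Sum(x => x.LinesCount),
              }).ToList();

foreach (var t in totals)
    Console.WriteLine($"{t.ID} {t.UserName}: latest={t.LatestRevision}, total={t.TotalCount}");
```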
