Group by percentage in LINQ - C#
I'm trying to recreate the following SQL query in LINQ for use in my ASP.NET MVC project:
SELECT State, (Count(a.State)* 100 / (Select Count(*) From myTable))
FROM myTable a
GROUP BY a.State
What I have so far:
var data = db.myTable.GroupBy(fu => fu.State)
.Select(g => new { Label = g.Key, Value = g.Key * 100 / g.Count() })
.ToList();
The calculation is not correct. How do I get LINQ to produce the same results as the SQL?
Probably this:
Value = g.Count() * 100 / db.myTable.Count()
This seems to be the equivalent of your SQL query.
So your complete query should look like:
var data = db.myTable.GroupBy(fu => fu.State)
.Select(g => new { Label = g.Key, Value = g.Count() * 100 / db.myTable.Count() })
.ToList();
You can try this:
var data = db.myTable.GroupBy(fu => fu.State)
.Select(g => new { Label = g.Key, Value = g.Count() * 100 / db.myTable.Count() })
.ToList();
g.Count() gives you the Count(a.State) value, and db.myTable.Count() gives the total number of rows in the table.
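Note that, like the original SQL, both answers use integer division, and db.myTable.Count() is evaluated once per group. If you want fractional percentages and a single total, a minimal sketch (assuming db.myTable is an Entity Framework IQueryable, as the question implies) could look like this:

var total = db.myTable.Count();              // evaluate the grand total once

var data = db.myTable
    .GroupBy(fu => fu.State)
    .Select(g => new
    {
        Label = g.Key,
        Value = g.Count() * 100.0 / total    // 100.0 forces floating-point division
    })
    .ToList();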
Related
LINQ Query with GroupBy, MAX and Count
What could be the LINQ query for this SQL?

SELECT PartId, BSId, COUNT(PartId), MAX(EffectiveDateUtc)
FROM PartCostConfig (NOLOCK)
GROUP BY PartId, BSId
HAVING COUNT(PartId) > 1

I am grouping by two columns and trying to retrieve the max EffectiveDateUtc for each part. This is what I could write; I'm stuck on pulling the top record based on the date, and I'm also not sure whether this is an optimal approach.

// Get all the parts which have more than ONE active record with the
// pat effective date and for the same BSId
var filters = (from p in configs
                   ?.GroupBy(w => new { w.PartId, w.BSId })
                   ?.Select(g => new { PartId = g.Key.PartId, BSId = g.Key.BSId, Count = g.Count() })
                   ?.Where(y => y.Count > 1)
               select p)
              ?.Distinct()?.ToList();

var filteredData = (from p in configs
                    join f in filters on p.PartId equals f.PartId
                    select new Config
                    {
                        Id = p.Id,
                        PartId = p.PartId,
                        BSId = p.BSId,
                        //EffectiveDateUtc = MAX(??)
                    })
                   .OrderByDescending(x => x.EffectiveDateUtc)
                   .GroupBy(g => new { g.PartId, g.BSId })
                   .ToList();

NOTE: I need the top record (based on date) for each part. I was trying to see if I can avoid a for loop.
The equivalent query would be:

var query = from p in db.PartCostConfig
            group p by new { p.PartId, p.BSId } into g
            let count = g.Count()
            where count > 1
            select new
            {
                g.Key.PartId,
                g.Key.BSId,
                Count = count,
                EffectiveDate = g.Max(x => x.EffectiveDateUtc),
            };
If I understand correctly, you are trying to achieve something like this:

var query = configs.GroupBy(w => new { w.PartId, w.BSId })
                   .Where(g => g.Count() > 1)
                   .Select(g => new
                   {
                       g.Key.PartId,
                       g.Key.BSId,
                       Count = g.Count(),
                       EffectiveDate = g.Max(x => x.EffectiveDateUtc)
                   });
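Since the question also asks for the whole top record (based on date) for each part, not just the maximum date, a minimal in-memory sketch (assuming configs and its properties are as shown in the question) could be:

var latestPerPart = configs
    .GroupBy(w => new { w.PartId, w.BSId })
    .Where(g => g.Count() > 1)
    // take the full record with the latest EffectiveDateUtc in each group
    .Select(g => g.OrderByDescending(x => x.EffectiveDateUtc).First())
    .ToList();

Against Entity Framework rather than in-memory objects, the First() inside the Select may need to be FirstOrDefault(), depending on the provider.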
GroupBy performs slowly
I have the following query; it is super slow for 3,000 records and produces 370 entries. How can I improve its performance?

dealerResults = _results.GroupBy(x => new { x.DealerName, x.DealerId })
    .Select(x => new MarketingReportResults()
    {
        DealerId = x.Key.DealerId,
        DealerName = x.Key.DealerName,
        LinkedTotal = linkedLeadCores.Count(y => y.DealerId == x.Key.DealerId),
        LeadsTotal = x.Count(),
        SalesTotal = x.Count(y => y.IsSold),
        Percent = (decimal)(x.Count() * 100) / count,
        ActiveTotal = x.Count(y => y.IsActive),
    }).ToList();
I think linkedLeadCores.Count() is the bottleneck here, as you loop through the entire linkedLeadCores list each time an entry of _results is processed. This assumption seems to be confirmed by your comments as well. To remove the bottleneck you could create a map (aka dictionary) that holds the count for each dealer before doing anything with _results, like this...

var linkedLeadCoresCountMap = linkedLeadCores
    .GroupBy(y => y.DealerId)
    .ToDictionary(y => y.Key, y => y.Count());

... and then you could write:

LinkedTotal = linkedLeadCoresCountMap.ContainsKey(x.Key.DealerId)
    ? linkedLeadCoresCountMap[x.Key.DealerId]
    : 0,
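Putting those two pieces together, a minimal sketch of the reworked query (assuming _results and linkedLeadCores are already in memory, as the dictionary lookup implies; TryGetValue is just a terser form of the same lookup):

// Pre-compute the per-dealer counts once, instead of scanning linkedLeadCores per group.
var linkedLeadCoresCountMap = linkedLeadCores
    .GroupBy(y => y.DealerId)
    .ToDictionary(g => g.Key, g => g.Count());

dealerResults = _results.GroupBy(x => new { x.DealerName, x.DealerId })
    .Select(x => new MarketingReportResults()
    {
        DealerId = x.Key.DealerId,
        DealerName = x.Key.DealerName,
        LinkedTotal = linkedLeadCoresCountMap.TryGetValue(x.Key.DealerId, out var linked) ? linked : 0,
        LeadsTotal = x.Count(),
        SalesTotal = x.Count(y => y.IsSold),
        Percent = (decimal)(x.Count() * 100) / count,
        ActiveTotal = x.Count(y => y.IsActive),
    }).ToList();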
Doing a group join to linkedLeadCores will use an internal hash table for the lookup and should solve your problem.

var dealerResults = (from r in _results.GroupBy(x => new { x.DealerName, x.DealerId })
                     join llc in linkedLeadCores on r.Key.DealerId equals llc.DealerId into g
                     select new MarketingReportResults()
                     {
                         DealerId = r.Key.DealerId,
                         DealerName = r.Key.DealerName,
                         LinkedTotal = g.Count(),
                         LeadsTotal = r.Count(),
                         SalesTotal = r.Count(y => y.IsSold),
                         Percent = (decimal)(r.Count() * 100) / count,
                         ActiveTotal = r.Count(y => y.IsActive),
                     }).ToList();
Converting SQL to LINQ with group by, sum and count
I would like to do a group by and, on top of that, a sum and a count. I don't seem to be able to create the solution in LINQ. How can I convert my query to LINQ?

SELECT HistoricalBillingProductGroup, COUNT(*), BillingPeriod, SUM(TotalMonthlyChargesOtcAndMrc)
FROM [x].[dbo].[tblReport]
GROUP BY BillingPeriod, HistoricalBillingProductGroup
ORDER BY BillingPeriod

This is what I have so far in LINQ:

var result = context.Reports.GroupBy(x => new { x.BillingPeriod, x.HistoricalBillingProductGroup })
    .Select(x => new StatisticsReportLine
    {
        HistoricalBillingGroup = x.FirstOrDefault().HistoricalBillingProductGroup,
        BillingPeriod = x.FirstOrDefault().BillingPeriod,
        CountOfRows = x.Count(),
        SumOfAmount = x.Sum(p => p.TotalMonthlyChargesOtcAndMrc) ?? 0
    })
    .ToString();

The query this generates is enormous and takes a very long time to load, while in SQL it is a matter of milliseconds. I highly doubt this is the right solution.
I believe the calls to x.FirstOrDefault() are the source of your problem. Each one of these will result in a very costly inner query inside the SELECT clause of the generated SQL. Try using the Key property of the IGrouping<T> instead:

var result = context.Reports
    .GroupBy(x => new { x.BillingPeriod, x.HistoricalBillingProductGroup })
    .OrderBy(x => x.Key.BillingPeriod)
    .Select(x => new StatisticsReportLine
    {
        HistoricalBillingGroup = x.Key.HistoricalBillingProductGroup,
        BillingPeriod = x.Key.BillingPeriod,
        CountOfRows = x.Count(),
        SumOfAmount = x.Sum(p => p.TotalMonthlyChargesOtcAndMrc) ?? 0
    });

Or, if you prefer query syntax:

var result = (from r in context.Reports
              group r by new { r.BillingPeriod, r.HistoricalBillingProductGroup } into g
              orderby g.Key.BillingPeriod
              select new StatisticsReportLine
              {
                  HistoricalBillingGroup = g.Key.HistoricalBillingProductGroup,
                  BillingPeriod = g.Key.BillingPeriod,
                  CountOfRows = g.Count(),
                  SumOfAmount = g.Sum(p => p.TotalMonthlyChargesOtcAndMrc) ?? 0
              });
You could try this one:

var result = context.Reports
    .GroupBy(x => new { x.BillingPeriod, x.HistoricalBillingProductGroup })
    .Select(x => new StatisticsReportLine
    {
        HistoricalBillingGroup = x.Key.HistoricalBillingProductGroup,
        BillingPeriod = x.Key.BillingPeriod,
        CountOfRows = x.Count(),
        SumOfAmount = x.Sum(p => p.TotalMonthlyChargesOtcAndMrc) ?? 0
    }).ToString();

In the above query you group by two properties, BillingPeriod and HistoricalBillingProductGroup. Each group that is created therefore has a key consisting of these two properties.
LINQ query with distinct count
I am trying to construct a LINQ query in C# that will give me a list of distinct values from a column in a dataset, with a count for each value. The results would look like this:

State  Count
AL     55
AK     40
AZ     2

Here is the SQL that does that:

SELECT name, COUNT(*) AS count
FROM architecture arch
GROUP BY name
ORDER BY name

I've figured out the LINQ to get the DISTINCT values, which is:

var query = ds.Tables[0].AsEnumerable()
    .OrderBy(dr1 => dr1.Field<string>("state"))
    .Select(dr1 => new { state = dr1.Field<string>("state") })
    .Distinct().ToList();

But I can't figure out how to get the COUNT(*) for each distinct value to work in LINQ. Any idea how I can add that into the LINQ query?
You need to group your results based on state and then select the count from each group, like:

var query = ds.Tables[0].AsEnumerable()
    .GroupBy(r => r.Field<string>("state"))
    .Select(grp => new
    {
        state = grp.Key,
        Count = grp.Count()
    })
    .OrderBy(o => o.state)
    .ToList();
Group all rows by the value of the state column, then order the groups by the grouping key, and as a last step project each group into an anonymous object with the grouping key (state) and the count of rows in the group:

var query = ds.Tables[0].AsEnumerable()
    .GroupBy(r => r.Field<string>("state"))
    .OrderBy(g => g.Key)
    .Select(g => new { State = g.Key, Count = g.Count() })
    .ToList();

The query syntax will look like this (I'll skip converting to a list, to avoid mixing syntaxes):

var query = from r in ds.Tables[0].AsEnumerable()
            group r by r.Field<string>("state") into g
            orderby g.Key
            select new { State = g.Key, Count = g.Count() };
I think you need GroupBy:

var query = ds.Tables[0].AsEnumerable()
    .GroupBy(dr1 => dr1.Field<string>("state"))
    .Select(g => new { state = g.Key, count = g.Count() })
    .ToList();
Why bother with Distinct, when you can translate your SQL query to LINQ almost word for word? You can do it like this:

var query = ds.Tables[0].AsEnumerable()
    .GroupBy(dr1 => dr1.Field<string>("state"))
    .Select(g => new { State = g.Key, Count = g.Count() })
    .OrderBy(p => p.State)
    .ToList();

This produces a list of {State, Count} pairs. If you prefer a dictionary of state-to-count, you can change your query like this:

var query = ds.Tables[0].AsEnumerable()
    .GroupBy(dr1 => dr1.Field<string>("state"))
    .ToDictionary(g => g.Key, g => g.Count());
var query = ds.Tables[0].AsEnumerable()
    .GroupBy(x => x.Field<string>("state"))
    .Select(g => new { state = g.Key, count = g.Count() });
Guess what, the equivalent of group by is group by :)

var query = from dr1 in ds.Tables[0].AsEnumerable()
            group dr1 by dr1.Field<string>("state") into state
            select new { State = state.Key, Count = state.Count() };
var stat = from row in ds.Tables[0].AsEnumerable()
           group row by new { Col1 = row["Name"] } into TotalCount
           select new
           {
               ActionName = TotalCount.Key.Col1,
               ActionCount = TotalCount.Count(),
           };
Select the right group based on distribution/percentages from a table
I have this CSV table filename,set,countrycode,timeofday,calibration,otherconditions,environment,precipitation,sky 20130913_060749,NULL,AB,day,0.0773734454989712,,city,none,overcast 20130913_060921,NULL,AB,day,0.289865800606369,,city,"none,haze",overcast 20130913_060951,NULL,AB,day,0.288528490852974,,city,haze,overcast 20130913_061021,NULL,AB,day,0.28358887059185,,city,haze,overcast 20130913_061051,NULL,AB,day,0.190207839896071,,city,haze,overcast 20130913_061156,NULL,AB,day,0.264707576215636,,city,haze,overcast 20130913_071226,NULL,AB,day,0.206269488454271,,city,haze,overcast 20130913_071256,NULL,AB,day,0.24887042174634,,"city,suburb",haze,overcast 20130913_071326,NULL,AB,day,0.0938082719333332,,suburb,haze,overcast 20130913_071356,NULL,AB,day,0.00627162842767295,,suburb,haze,overcast 20130913_073510,NULL,AB,day,0.372935257761037,,city,"fog,haze",partlyCloudy 20130913_073541,NULL,AB,day,0.328964273325638,,city,fog,partlyCloudy 20130913_083611,NULL,AB,day,0.289816662996633,,city,fog,partlyCloudy 20130913_083641,NULL,AB,day,0.291602099474152,,city,fog,partlyCloudy 20130913_083711,NULL,AB,day,0.205089179739094,,city,fog,partlyCloudy 20130913_083741,NULL,AB,day,0.345628858397651,,"city,playStreet",fog,partlyCloudy 20130913_083811,NULL,AB,day,0.0755803723447712,,"city,playStreet",fog,partlyCloudy 20130913_083958,NULL,AB,day,0.322821196115,,city,fog,partlyCloudy 20130913_084028,NULL,AB,day,0.191182147849585,,city,fog,partlyCloudy 20130913_084122,NULL,AB,day,0.348537625962529,,city,fog,partlyCloudy 20130913_094152,NULL,AB,day,0.331768750001122,,city,"fog,haze",partlyCloudy 20130913_094222,NULL,AB,day,0.297051596076413,,"city,suburb",haze,partlyCloudy 20130913_094252,NULL,AB,day,0.430273879317406,,suburb,haze,partlyCloudy 20130913_094322,NULL,AB,day,0.294162675805556,,suburb,haze,partlyCloudy 20130913_104352,NULL,AB,day,0,,suburb,haze,partlyCloudy 20130913_104422,NULL,AB,day,0,,suburb,haze,partlyCloudy 20130913_104516,NULL,AB,day,0.518891257745526,,suburb,haze,partlyCloudy 20130913_114546,NULL,AB,day,0.50039745769322,,suburb,haze,partlyCloudy 20130913_114616,NULL,AB,day,0.275762545494949,,suburb,haze,partlyCloudy 20130913_114646,NULL,AB,day,0.483892784993604,,suburb,haze,partlyCloudy 20130913_114716,NULL,AB,day,0.352958331105499,,suburb,haze,partlyCloudy 20130913_115640,NULL,AB,day,0.363986335968361,,suburb,haze,overcast 20130913_115710,NULL,AB,day,0.341002657474747,,suburb,haze,overcast 20130913_115740,NULL,AB,day,0.306984466792517,,suburb,haze,overcast 20130913_125810,NULL,AB,day,0.468337309183052,,suburb,haze,overcast 20130913_125840,NULL,AB,day,0.373136910552722,,suburb,haze,overcast 20130913_125910,NULL,AB,day,0.442251706279878,,suburb,haze,overcast 20130913_125940,NULL,AB,day,0.473174240569349,,"expressway,suburb",haze,overcast 20130913_150029,NULL,AB,night,0.0691816235557669,,suburb,fog,undefined 20130913_150059,NULL,AB,night,0.0434565306952736,,suburb,fog,undefined 20130913_150654,NULL,AB,night,0.277026709611307,,suburb,fog,undefined 20130913_150724,NULL,AB,night,0.240319377418189,,suburb,fog,undefined 20130913_150754,NULL,AB,night,0.209570896148148,,suburb,fog,undefined 20130913_150824,NULL,AB,night,0.153044032648117,,suburb,fog,undefined 20130913_150854,NULL,AB,night,0.276118705622896,,suburb,fog,undefined 20130913_150924,NULL,AB,night,0.0274314977841365,,suburb,fog,undefined 20130913_151713,NULL,AB,night,0.424058516444064,,suburb,fog,undefined 20130913_150824,NULL,AB,night,0.153044032648117,,suburb,fog,undefined 20130913_150854,NULL,AB,night,0.276118705622896,,suburb,fog,undefined 
I read it into a DataTable and grouped the rows like this (with help from a previous question):

foreach (DataRow row in table.Rows)
{
    var oldRow = row.Field<string>("filename(hhmmss)");
    var newRow = oldRow.Remove(oldRow.Length - 4);
    row.SetField("filename", newRow);
}

var fileNameGroups = table.AsEnumerable()
    .GroupBy(r => r.Field<string>("filename(hhmmss)"));

I want to be able to select the groups that fulfil a certain criterion (distribution), for example: select 3 groups that have 30% day, 30% night, 40% dawn.

PS: yes, the above table is not optimal for this, but it shows the approach as a proof of concept. Any similar "selection" examples or problem solutions are welcome. Please also look at the comments below; there is more description there.
Strange requirement, but here it is:

var fileNameGroups = tblCSV.AsEnumerable()
    .GroupBy(r => r.Field<string>("filename(hhmmss)"))
    .Select(g => new
    {
        Group = g,
        FileName = g.Key,
        DawnPercent = 100.0 * g.Count(r => r.Field<string>("timeofday") == "dawn") / g.Count(),
        NightPercent = 100.0 * g.Count(r => r.Field<string>("timeofday") == "night") / g.Count(),
        DayPercent = 100.0 * g.Count(r => r.Field<string>("timeofday") == "day") / g.Count()
    })
    .Where(x => x.DayPercent == 30 && x.NightPercent == 30 && x.DawnPercent == 40)
    .Select(x => x.Group)
    .Take(3);
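One caveat: the Where clause above compares computed doubles for exact equality, so it only matches groups whose sizes divide into those percentages exactly. A hypothetical variant using a tolerance (the threshold value is an assumption, not part of the original answer) could look like this:

const double tolerance = 1.0;   // accept distributions within ±1 percentage point (assumed threshold)

var matching = tblCSV.AsEnumerable()
    .GroupBy(r => r.Field<string>("filename(hhmmss)"))
    .Select(g => new
    {
        Group = g,
        DayPercent = 100.0 * g.Count(r => r.Field<string>("timeofday") == "day") / g.Count(),
        NightPercent = 100.0 * g.Count(r => r.Field<string>("timeofday") == "night") / g.Count(),
        DawnPercent = 100.0 * g.Count(r => r.Field<string>("timeofday") == "dawn") / g.Count()
    })
    // filter with a tolerance instead of exact double equality
    .Where(x => Math.Abs(x.DayPercent - 30) <= tolerance
             && Math.Abs(x.NightPercent - 30) <= tolerance
             && Math.Abs(x.DawnPercent - 40) <= tolerance)
    .Select(x => x.Group)
    .Take(3);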