C# LINQ Lambda multiple group of IEnumerable with Sum() - c#

I have a collection of data = IEnumerable<AnalyticsData> and I'm trying to group by multiple properties and Sum() on an integer column. The end result will be a collection of AnalyticsReportRow<dynamic>() as you can see below, though this isn't highly relevant.
In the final Select() method, I want to pass an object in, ideally from the original set, and would prefer not to recreate one in the middle of my chained queries if possible. Most of the examples seem to create either a new strongly-typed or dynamic object to pass into the next link in the chain.
Here's what I have spent a few hours trying to work with, and this returns the set as it is in the first code block below with all rows (I export to CSV, hence the formatting):
var pageViewsData = analyticsData.GroupBy(data => new { g1 = data.Webproperty, pv = data.PageViews, d = data })
    .GroupBy(data => new { gg1 = data.Key.g1, dd = data.Key.d })
    .Select(data => new AnalyticsReportRow<dynamic>(data.Key.dd, "Page_Views", data.Sum(datas => datas.Key.pv)));
Result is this:
"CustomerA","","","","","Page_Views",0,"A1-810","","",2,"4/10/2015 16:08:33"
"CustomerA","","","","","Page_Views",0,"A1-810","","",2,"4/10/2015 16:08:33"
"CustomerA","","","","","Page_Views",0,"GT-N8013","","",2,"4/10/2015 16:08:33"
"CustomerA","","","","","Page_Views",0,"GT-P3113","","",7,"4/10/2015 16:08:33"
"CustomerA","","","","","Page_Views",0,"GT-P3113","","",2,"4/10/2015 16:08:33"
"CustomerA","","","","","Page_Views",0,"GT-P3113","","",3,"4/10/2015 16:08:33"
"CustomerA","","","","","Page_Views",0,"GT-P3113","","",3,"4/10/2015 16:08:33"
"CustomerA","","","","","Page_Views",0,"GT-P3113","","",2,"4/10/2015 16:08:33"
And would like to end up with a Sum() on the second-last column, grouped by customer and then by device. For example:
"CustomerA","","","","","Page_Views",0,"A1-810","","",4,"4/10/2015 16:08:33"
"CustomerA","","","","","Page_Views",0,"GT-N8013","","",2,"4/10/2015 16:08:33"
"CustomerA","","","","","Page_Views",0,"GT-P3113","","",16,"4/10/2015 16:08:33"
I am having a hard time wrapping my head around the logic and could really use an example of how to group like this, even pseudocode and dynamic types.
Thank you.

After spending several hours on this today, and posting my question, I decided to try a few more things and read some more documentation on GroupBy().
As it turns out, I was missing the fact that you can provide a Key Selector and an Element Selector to the GroupBy method as explained in the MSDN documentation. If I understand correctly, this provides the ability to have a distinct qualifier that tells the query how to group.
In the end, this appears to give me what I need. I would really like some feedback on this to make sure I'm going about it correctly:
var pageViewsData = analyticsData.Where(data => data.PageViews > 0)
    .GroupBy(data => new { g1 = data.Webproperty, g2 = data.DeviceModel }, data => data)
    .Select(data => new AnalyticsReportRow<dynamic>(data.FirstOrDefault(), "Page_Views", data.Sum(d => d.PageViews)));
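For anyone who wants to try the composite-key grouping in isolation, here is a minimal sketch. The Row class and the sample values are made up for illustration and only stand in for AnalyticsData; GroupBy and Sum are the standard LINQ calls.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupSumDemo
{
    // Hypothetical stand-in for AnalyticsData, reduced to the fields used here.
    public class Row
    {
        public string Webproperty;
        public string DeviceModel;
        public int PageViews;
    }

    // Groups by the (Webproperty, DeviceModel) pair and sums PageViews per group.
    public static List<(string Webproperty, string DeviceModel, int Total)> Totals(IEnumerable<Row> rows) =>
        rows.GroupBy(r => new { r.Webproperty, r.DeviceModel })
            .Select(g => (Webproperty: g.Key.Webproperty,
                          DeviceModel: g.Key.DeviceModel,
                          Total: g.Sum(r => r.PageViews)))
            .ToList();

    public static void Main()
    {
        var rows = new[]
        {
            new Row { Webproperty = "CustomerA", DeviceModel = "A1-810",   PageViews = 2 },
            new Row { Webproperty = "CustomerA", DeviceModel = "A1-810",   PageViews = 2 },
            new Row { Webproperty = "CustomerA", DeviceModel = "GT-N8013", PageViews = 2 },
            new Row { Webproperty = "CustomerA", DeviceModel = "GT-P3113", PageViews = 7 },
            new Row { Webproperty = "CustomerA", DeviceModel = "GT-P3113", PageViews = 3 },
        };

        // Prints:
        // CustomerA,A1-810,4
        // CustomerA,GT-N8013,2
        // CustomerA,GT-P3113,10
        foreach (var t in Totals(rows))
            Console.WriteLine($"{t.Webproperty},{t.DeviceModel},{t.Total}");
    }
}
```

Anonymous types compare by value, which is why two rows with the same Webproperty and DeviceModel land in the same group.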

Try something like this:
var query = from d in analyticsData
            group d by new { d.Webproperty, d.DeviceModel }
            into g
            select new
            {
                // The grouped properties live on g.Key, not on g itself.
                g.Key.Webproperty,
                g.Key.DeviceModel,
                Total = g.Sum(it => it.PageViews)
            };
var result = query.ToList();

Related

Find then Sum with IMongoCollection

I am writing an api to sum a column called ViewCount from collection. At the moment, my code is something like this:
var filter = Builders<Post>.Filter.Eq(u => u.IsDelete, query.Param.IsDelete);
IMongoCollection<Post> _posts;
var postViewCount = _posts.Find(filter).ToList().Select(a =>a.ViewCount).Sum();
The result is correct, but performance is very slow even with only 20k records (about 6s for a simple call). If I do it like this, it is much faster (just 200ms), but I cannot apply the filter:
var filter = Builders<Post>.Filter.Eq(u => u.IsDelete, query.Param.IsDelete);
IMongoCollection<Post> _posts;
var postViewCount = _posts.AsQueryable().Sum(x => x.ViewCount);
So my question is how can I handle this case? Thanks guys!
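Use .Where(). After calling .AsQueryable() you get an IQueryable, so all the LINQ functionality is available. :) Note that Where() takes a lambda predicate rather than the FilterDefinition built above, so express the same condition as a lambda:
_posts.AsQueryable().Where(x => x.IsDelete == query.Param.IsDelete).Sum(x => x.ViewCount);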
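If you use the aggregation framework, you will get a faster query, because the sum is computed on the server.
Good luck!
var aggregate = new BsonDocument
{
    // $group requires an _id; null puts every matched document into a single group.
    { "_id", BsonNull.Value },
    // Assumes the property is stored under the field name "ViewCount".
    { "Total", new BsonDocument("$sum", "$ViewCount") }
};
_posts.Aggregate().Match(filter).Group(aggregate);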

Linq challenge: converting this piece of code from method chain to standard Linq

The challenge is about converting from method chain to standard linq a piece of code full of group by.
The context
To fully understand the topic here you can read the original question (with class definitions, sample data and so on): Linq: rebuild hierarchical data from the flattened list
Thanks to @Akash Kava, I've found the solution to my problem.
Chain method formulation
var macroTabs = flattenedList
    .GroupBy(x => x.IDMacroTab)
    .Select((x) => new MacroTab
    {
        IDMacroTab = x.Key,
        Tabs = x.GroupBy(t => t.IDTab)
            .Select(tx => new Tab {
                IDTab = tx.Key,
                Slots = tx.Select(s => new Slot {
                    IDSlot = s.IDSlot
                }).ToList()
            }).ToList()
    }).ToList();
But, for the sake of learning, I tried to convert the method chain to the standard LINQ query syntax, and something is wrong.
What happens is similar to this..
My attempt to convert it to Linq standard syntax
var antiflatten = flattenedList
    .GroupBy(x => x.IDMacroTab)
    .Select(grouping => new MacroTab
    {
        IDMacroTab = grouping.Key,
        Tabs = (from t in grouping
                group grouping by t.IDTab
                into group_tx
                select new Tab
                {
                    IDTab = group_tx.Key,
                    Slots = (from s in group_tx
                             from s1 in s
                             select new Slot
                             {
                                 IDSlot = s1.IDSlot
                             }).ToList()
                }).ToList()
    });
The result in LinqPad
The classes and the sample data on NetFiddle:
https://dotnetfiddle.net/8mF1qI
This challenge helped me understand what exactly a LINQ group by returns (and how verbose the LINQ query syntax gets with group by).
As LinqPad clearly shows, a group by returns a list of groups. A group is a very simple object that exposes just one property: a Key.
As this answer states, from the definition of IGrouping (IGrouping<out TKey, out TElement> : IEnumerable<TElement>, IEnumerable), the only way to access the content of the subgroups is to iterate through the elements (a foreach, another group by, a select, etc.).
Here is shown the Linq syntax formulation of the method chain.
And here is the source code on Fiddle
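As a rough sketch of what a working query-syntax version can look like: the key change from the attempt above is to group the range variable t (not the outer grouping) and then read the slots straight from the inner group. The class definitions below are guesses made for illustration, since the real ones live in the linked fiddle, and FlatRow is a made-up name for the flattened element type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class shapes; the originals are only available in the fiddle.
public class FlatRow { public int IDMacroTab; public int IDTab; public int IDSlot; }
public class Slot { public int IDSlot; }
public class Tab { public int IDTab; public List<Slot> Slots; }
public class MacroTab { public int IDMacroTab; public List<Tab> Tabs; }

public static class AntiFlatten
{
    public static List<MacroTab> Rebuild(IEnumerable<FlatRow> flattenedList) =>
        (from x in flattenedList
         group x by x.IDMacroTab into macroGroup
         select new MacroTab
         {
             IDMacroTab = macroGroup.Key,
             Tabs = (from t in macroGroup
                     group t by t.IDTab into tabGroup // group the element t itself
                     select new Tab
                     {
                         IDTab = tabGroup.Key,
                         // tabGroup is already a sequence of rows, so one "from" is enough
                         Slots = (from s in tabGroup
                                  select new Slot { IDSlot = s.IDSlot }).ToList()
                     }).ToList()
         }).ToList();

    public static void Main()
    {
        var flat = new[]
        {
            new FlatRow { IDMacroTab = 1, IDTab = 10, IDSlot = 100 },
            new FlatRow { IDMacroTab = 1, IDTab = 10, IDSlot = 101 },
            new FlatRow { IDMacroTab = 1, IDTab = 11, IDSlot = 102 },
            new FlatRow { IDMacroTab = 2, IDTab = 20, IDSlot = 200 },
        };

        // Prints:
        // 1/10: 100,101
        // 1/11: 102
        // 2/20: 200
        foreach (var m in Rebuild(flat))
            foreach (var t in m.Tabs)
                Console.WriteLine($"{m.IDMacroTab}/{t.IDTab}: {string.Join(",", t.Slots.Select(s => s.IDSlot))}");
    }
}
```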
But let's go on trying to see another solution:
In SQL, a GROUP BY query can select only the grouped columns plus aggregates of the others. LINQ is different: each group still carries ALL the columns of its elements.
In this example we started with a dataset with 3 'columns' {IDMacroTab, IDTab, IDSlot}. We grouped by the first column, but LINQ still returns the whole dataset unless we explicitly tell it otherwise..

C# and LINQ - arbitrary statement instead of let

Let's say I'm doing a LINQ query like this (this is LINQ to Objects, BTW):
var rows =
from t in totals
let name = Utilities.GetName(t)
orderby name
select t;
So the GetName method just calculates a display name from a Total object and is a decent use of the let keyword. But let's say I have another method, Utilities.Sum() that applies some math on a Total object and sets some properties on it. I can use let to achieve this, like so:
var rows =
from t in totals
let unused = Utilities.Sum(t)
select t;
The thing that is weird here, is that Utilities.Sum() has to return a value, even if I don't use it. Is there a way to use it inside a LINQ statement if it returns void? I obviously can't do something like this:
var rows =
from t in totals
Utilities.Sum(t)
select t;
PS - I know this is probably not good practice to call a method with side effects in a LINQ expression. Just trying to understand LINQ syntax completely.
No, there is no LINQ method that performs an Action on all of the items in the IEnumerable<T>. It was very specifically left out because the designers actively didn't want it to be in there.
Answering the question
No, but you could cheat by creating a Func which just calls the intended method and spits out a random return value, bool for example:
Func<Total, bool> dummy = (total) =>
{
Utilities.Sum(total);
return true;
};
var rows = from t in totals
let unused = dummy(t)
select t;
But this is not a good idea - it's not particularly readable.
The let statement behind the scenes
What the above query will translate to is something similar to this:
var rows = totals.Select(t => new { t, unused = dummy(t) })
.Select(x => x.t);
So another option, if you prefer method syntax over query syntax, is:
var rows = totals.Select(t =>
{
Utilities.Sum(t);
return t;
});
A little better, but still abusing LINQ.
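One concrete reason this counts as abuse: Select is lazy, so the side effect does not run until rows is enumerated, and it runs again on every enumeration. A minimal sketch, with a made-up Total class and a Touch method standing in for Utilities.Sum:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Hypothetical stand-in for the Total class from the question.
    public class Total { public int Calls; }

    // Stand-in for Utilities.Sum: mutates the object it is given.
    static void Touch(Total t) { t.Calls++; }

    // Returns how many times the side effect hit the first item
    // after enumerating the query the given number of times.
    public static int CallsAfter(int enumerations)
    {
        var totals = new List<Total> { new Total(), new Total() };
        var rows = totals.Select(t => { Touch(t); return t; });

        // Nothing has run yet at this point: Select only builds the query.
        for (int i = 0; i < enumerations; i++)
            foreach (var row in rows) { }

        return totals[0].Calls;
    }

    public static void Main()
    {
        Console.WriteLine(CallsAfter(0)); // 0: never enumerated, side effect never ran
        Console.WriteLine(CallsAfter(2)); // 2: ran once per enumeration
    }
}
```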
... but what you should do
But I really see no reason not to just simply loop around totals separately:
foreach (var t in totals)
Utilities.Sum(t);
You should download the "Interactive Extensions" (NuGet Ix-Main) from Microsoft's Reactive Extensions team. It has a load of useful extensions. It'll let you do this:
var rows =
from t in totals.Do(x => Utilities.Sum(x))
select t;
It's there to allow side-effects on a traversed enumerable.
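Please read my comment on the question. The simplest way to achieve this kind of functionality is to use a query like this:
var rows = from t in totals
           group t by t.name into grp
           select new
           {
               // After "into", only grp is in scope, so the key comes from the group.
               Name = grp.Key,
               // Value is a placeholder for whichever numeric property should be totalled.
               Sum = grp.Sum(x => x.Value)
           };
The query above returns an IEnumerable of anonymous objects.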
For further information, please see: 101 LINQ Samples

Selecting several objects based on array of IDs

I have an array of ProgramIDs and would like to create a number of Select statements dynamically depending on how many ProgramIds there are.
For example:
var surveyProgramVar = surveyProgramRepository.Find().Where(x => x.ProgramId == resultsviewmodel.ProgramIds.FirstOrDefault());
This is an example of the select statement working with a single ProgramId.FirstOrDefault(). How do I create a list/array of SurveyProgramVars and select for each ProgramIds in the array?
It won't be necessarily optimal, but you might try:
var surveyProgramVar = surveyProgramRepository.Find()
.Where(x => resultsviewmodel.ProgramIds.Contains(x.ProgramId));
You could try something like:
var surveyProgramVar = surveyProgramRepository.Find().Where(x => resultsviewmodel.ProgramIds.Contains(x.ProgramId));
Tip: If the Find() method hits a database, it would be better to create a specific method that issues an IN statement in the query. If you don't, and Find() materializes its results, the code above pulls every record from the table and filters it in memory (LINQ to Objects), which works but is not very efficient. Your code could be something like:
var surveyProgramVar = surveyProgramRepository.FindByProgramsId(resultsviewmodel.ProgramIds);
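For the in-memory case, a small sketch of the Contains approach follows. The SurveyProgram class and the sample values are invented for illustration; copying the IDs into a HashSet first is an optional tweak that keeps each lookup cheap when the ID list is long.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ContainsDemo
{
    // Hypothetical stand-in for the entity returned by the repository.
    public class SurveyProgram { public int ProgramId; public string Name; }

    // Keeps only the items whose ProgramId appears in programIds.
    public static List<SurveyProgram> ForPrograms(IEnumerable<SurveyProgram> all, IEnumerable<int> programIds)
    {
        var wanted = new HashSet<int>(programIds); // constant-time membership checks
        return all.Where(x => wanted.Contains(x.ProgramId)).ToList();
    }

    public static void Main()
    {
        var all = new[]
        {
            new SurveyProgram { ProgramId = 1, Name = "A" },
            new SurveyProgram { ProgramId = 2, Name = "B" },
            new SurveyProgram { ProgramId = 3, Name = "C" },
        };

        // Prints:
        // A
        // C
        foreach (var p in ForPrograms(all, new[] { 1, 3 }))
            Console.WriteLine(p.Name);
    }
}
```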

Are there any benefits to using List.AddRange over a LINQ.ToList()?

I was working through an MVC4 tutorial today and saw the user implemented a select in a different manner than I'm used to. His code was:
var GenreLst = new List<string>();
var GenreQry = from d in db.Movies
orderby d.Genre
select d.Genre;
GenreLst.AddRange(GenreQry.Distinct());
ViewBag.movieGenre = new SelectList(GenreLst);
I looked at it and rewrote it in my own way as:
var genres = db.Movies
.OrderBy(m => m.Genre)
.Select(m => m.Genre)
.Distinct()
.ToList();
ViewBag.MovieGenre = new SelectList(genres);
His GenreLst variable isn't used elsewhere, so I got rid of it. My main question is about how he uses AddRange. Is AddRange any better than ToList?
Thanks for reading!
e.ToList<T>() is implemented under the hood as:
return new List<T>(e);
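And the List<T>(IEnumerable<T> e) constructor fills the list in essentially the same way AddRange(e) does: it copies the items in one go if the source is an ICollection<T>, and otherwise adds them one at a time.
In other words, the two bits of code end up doing the same thing, in effectively the same way.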
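If you want to convince yourself, a quick check along these lines should do. It runs over an in-memory array instead of db.Movies, and the genre strings are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ToListVsAddRange
{
    // The tutorial's style: create an empty list, then AddRange the query.
    public static List<string> ViaAddRange(IEnumerable<string> genres)
    {
        var list = new List<string>();
        list.AddRange(genres.OrderBy(g => g).Distinct());
        return list;
    }

    // The rewritten style: let ToList build the list.
    public static List<string> ViaToList(IEnumerable<string> genres) =>
        genres.OrderBy(g => g).Distinct().ToList();

    public static void Main()
    {
        var genres = new[] { "Drama", "Comedy", "Drama", "Action" };

        // Prints True: both lists hold the same items in the same order.
        Console.WriteLine(ViaAddRange(genres).SequenceEqual(ViaToList(genres)));
    }
}
```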
