Groupby count for every hour with count value 0 also - c#

I have a list of persons and I want the count of persons registered in each hour. I used the GroupBy clause below and got the correct result.
var persons = lstPerson.GroupBy(x =>(x.CreatedOn.Hour))
.Select(grp => new { total = grp.Count(), key = grp.Key })
.OrderBy(x => x.key)
.ToList();
But I want a row for every hour. The result only contains hours that have at least one registration. If no person registered in the 1st hour, for example, the list has no entry with a count of 0 for it.
So if persons registered only in hours 13, 14 and 15 (that is, at 13:00, 14:00 and 15:00), it shows the counts for those hours but nothing for the other hours.

A couple of options:
Firstly, and with least change to your code, you can just "add one" to each group and then "subtract one" from each count:
var persons = lstPerson
.Select(x => (x.CreatedOn.Hour)) // Get the hours from the people
.Concat(Enumerable.Range(0, 24)) // Add an extra copy of each hour
.GroupBy(h => h) // ↓ subtract the extra hours
.Select(grp => new { total = grp.Count() - 1, key = grp.Key })
.OrderBy(x => x.key)
.ToList();
Secondly, more tidy but involves replacing all of your code, you can join the list of people onto the list of hours:
var persons = Enumerable.Range(0, 24)
.GroupJoin(
lstPerson,
h => h, // Correlate the hours in the range
p => p.CreatedOn.Hour, // with the hours from each person
(h, ps) => new { total = ps.Count(), key = h })
.ToList(); // ↑ This selects one element for each hour in the range
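For reference, here is a minimal self-contained sketch of the GroupJoin approach. The sample timestamps are made up for illustration, and a plain DateTime array stands in for lstPerson and its CreatedOn property.

```csharp
using System;
using System.Linq;

// Hypothetical sample data: registration timestamps only.
var createdOn = new[]
{
    new DateTime(2024, 1, 1, 13, 5, 0),
    new DateTime(2024, 1, 1, 13, 40, 0),
    new DateTime(2024, 1, 1, 14, 10, 0),
    new DateTime(2024, 1, 1, 15, 55, 0),
};

// One row per hour 0..23; hours with no registrations get total = 0.
var perHour = Enumerable.Range(0, 24)
    .GroupJoin(
        createdOn,
        h => h,        // key taken from the hour range
        d => d.Hour,   // key taken from each timestamp
        (h, ds) => new { key = h, total = ds.Count() })
    .ToList();

// Print a small window around the populated hours.
foreach (var row in perHour.Where(r => r.key >= 12 && r.key <= 16))
    Console.WriteLine($"{row.key:00}:00 -> {row.total}");
// 12:00 -> 0
// 13:00 -> 2
// 14:00 -> 1
// 15:00 -> 1
// 16:00 -> 0
```

GroupJoin keeps the order of the outer sequence, so the result is already sorted by hour and needs no OrderBy.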

Related

Group datetime by an interval of minutes

I'm trying to group my list using linq by an interval of 30 minutes.
Let’s say we have this list:
X called at 10:00 AM
Y called at 10:10 AM
Y called at 10:20 AM
Y called at 10:35 AM
X called at 10:40 AM
Y called at 10:45 AM
What I need is to group these items into 30-minute frames and by user, like so:
X called at 10:00 AM
Y called 3 times between 10:10 AM and 10:35 AM
X called at 10:40 AM
Y called at 10:45 AM
Here's what I'm using with Linq:
myList
.GroupBy(i => i.caller, (k, g) => g
.GroupBy(i => (long)new TimeSpan(Convert.ToDateTime(i.date).Ticks - g.Min(e => Convert.ToDateTime(e.date)).Ticks).TotalMinutes / 30)
.Select(g => new
{
count = g.Count(),
obj = g
}));
I need the result in one list, but instead I'm getting the result in nested lists, which need multiple foreach loops to extract.
Any help is much appreciated!
I think you are looking for SelectMany which will unwind one level of grouping:
var ans = myList
.GroupBy(c => c.caller, (caller, cg) => new { Key = caller, MinDateTime = cg.Min(c => c.date), Calls = cg })
.SelectMany(cg => cg.Calls.GroupBy(c => (int)(c.date - cg.MinDateTime).TotalMinutes / 30))
.OrderBy(cg => cg.Min(c => c.date))
.ToList();
Note: The GroupBy return selects the Min as a minor efficiency improvement so you don't constantly re-find the minimum DateTime for each group per call.
Note 2: The (int) conversion creates the buckets - otherwise, .TotalMinutes returns a double and the division by 30 just gives you a (unique) fractional answer and you get no grouping into buckets.
By modifying the initial code (again for minor efficiency), you can reformat the answer to match your textual result:
var ans = myList
.GroupBy(c => c.caller, (caller, cg) => new { Key = caller, MinDateTime = cg.Min(c => c.date), Calls = cg })
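// the inner lambda uses bg so it does not shadow the outer cg (shadowing only compiles on C# 8+); MinBy needs .NET 6 or a helper library
.SelectMany(cg => cg.Calls.GroupBy(c => (int)(c.date - cg.MinDateTime).TotalMinutes / 30, (bucket, bg) => new { FirstCall = bg.MinBy(c => c.date), Calls = bg }))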
.OrderBy(fcc => fcc.FirstCall.date)
.ToList();
var ans2 = ans.Select(fcc => new { Caller = fcc.FirstCall.caller, FirstCallDateTime = fcc.FirstCall.date, LastCallDateTime = fcc.Calls.Max(c => c.date), Count = fcc.Calls.Count() })
.ToList();
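A runnable sketch of the same idea on the sample calls from the question may help to check the bucketing. The tuple shape (caller, date) and the calendar date are assumptions made for illustration. It uses Min and Max rather than MinBy so it does not depend on .NET 6.

```csharp
using System;
using System.Linq;

var day = new DateTime(2024, 1, 1); // arbitrary date, assumed for the example
var calls = new[]
{
    (caller: "X", date: day.AddHours(10)),
    (caller: "Y", date: day.AddHours(10).AddMinutes(10)),
    (caller: "Y", date: day.AddHours(10).AddMinutes(20)),
    (caller: "Y", date: day.AddHours(10).AddMinutes(35)),
    (caller: "X", date: day.AddHours(10).AddMinutes(40)),
    (caller: "Y", date: day.AddHours(10).AddMinutes(45)),
};

var summary = calls
    // one group per caller, remembering that caller's first call
    .GroupBy(c => c.caller, (caller, cg) => new { Caller = caller, Start = cg.Min(c => c.date), Calls = cg })
    // split each caller's calls into 30-minute buckets counted from the first call
    .SelectMany(g => g.Calls
        .GroupBy(c => (int)((c.date - g.Start).TotalMinutes / 30))
        .Select(b => new { g.Caller, First = b.Min(c => c.date), Last = b.Max(c => c.date), Count = b.Count() }))
    .OrderBy(r => r.First)
    .ToList();

foreach (var r in summary)
    Console.WriteLine($"{r.Caller}: {r.Count} call(s), {r.First:HH:mm} to {r.Last:HH:mm}");
// X: 1 call(s), 10:00 to 10:00
// Y: 3 call(s), 10:10 to 10:35
// X: 1 call(s), 10:40 to 10:40
// Y: 1 call(s), 10:45 to 10:45
```

Note that the buckets are anchored at each caller's first call, not at clock boundaries such as 10:00 and 10:30.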
Instead of grouping by a DateTime, try grouping by a key derived from the date.
string GetTimeBucketId(DateTime time) {
// integer division: minutes 0-29 give bucket 0, minutes 30-59 give bucket 1
return $"{time.Year}-{time.Month}-{time.Day}T{time.Hour}-{time.Minute / 30}";
}
myList
.GroupBy(i => GetTimeBucketId(Convert.ToDateTime(i.date)))
.Select(g => new { Count = g.Count(), Key = g.Key });

Linq query with multiple GroupBy

Hey I have a linq query that I cannot make any sense of. It draws from a table to get sales data for a specific company(collection of stores).
The sales table is a collection of individual sale items, multiple sale items can be on one bill. I need to make a query that sorts a day worth of sales by store and then counts the total Bills for that store for that day and gives the average bill for that store for that day.
Here is the start of what I have so far
//get list of all stores bill count/average bill
FactSales.Where(d => d.DateKey >= 20130920).Where(d => d.DateKey <= 20130920)
.Where(c => c.CompanyID == 4).GroupBy(g=>new{g.StoreID, g.BillID}).Select(g=>new{Store = g.Key.StoreID, Amount = g.Sum(a => a.Amount).Value })
This gets me a list of all individual bills but how do I get the average and bill count for each store in a list that would contain : storeID, Avg Bill(Sum of all store bills/billcount) and the Bill Count of each store?
I've been at this a while and have read many posts, and I must be missing something very obvious.
Any help will be greatly appreciated!
FactSales
// filter sale positions by day
.Where(d => d.DateKey >= 20130920 && d.DateKey <= 20130920)
// filter sale positions by company
.Where(c => c.CompanyID == 4)
// group values by store and bill so each group contains all sale items on one bill
.GroupBy(g=>new { g.StoreID, g.BillID })
// calculate total amount for each bill
.Select(g=>new { Store = g.Key.StoreID, Bill = g.Key.BillID, Amount = g.Sum(a => a.Amount).Value })
// group bills by store so each group contains all bills, with their total amounts, for that store
.GroupBy(g => g.Store)
// calculate bills count and avg amount for this store
.Select(g => new { Store = g.Key, Bills = g.Count(), AvgBill = g.Average(x => x.Amount) });
I think that g.Sum(a => a.Amount).Value does not need the Value property, and you can write just g.Sum(a => a.Amount).
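To see the two-level grouping in isolation, here is a small in-memory sketch. The tuple fields mirror the StoreID, BillID and Amount columns from the question, but the rows themselves are invented, and Amount is assumed to be a non-nullable decimal.

```csharp
using System;
using System.Linq;

// Hypothetical sale items: two bills for store 1 (one with two items), one bill for store 2.
var sales = new[]
{
    (StoreID: 1, BillID: 100, Amount: 10m),
    (StoreID: 1, BillID: 100, Amount: 5m),
    (StoreID: 1, BillID: 101, Amount: 25m),
    (StoreID: 2, BillID: 200, Amount: 8m),
};

var perStore = sales
    // first level: one row per bill with its total
    .GroupBy(s => new { s.StoreID, s.BillID })
    .Select(g => new { Store = g.Key.StoreID, Amount = g.Sum(s => s.Amount) })
    // second level: one row per store, counting and averaging its bills
    .GroupBy(b => b.Store)
    .Select(g => new { Store = g.Key, Bills = g.Count(), AvgBill = g.Average(b => b.Amount) })
    .OrderBy(r => r.Store)
    .ToList();

foreach (var r in perStore)
    Console.WriteLine($"Store {r.Store}: {r.Bills} bill(s), average {r.AvgBill}");
```

Store 1 ends up with two bills (15 and 25), so its average bill is 20 even though it has three sale items.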
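I think this will also work, as long as bills are counted rather than individual sale items:
var result = FactSales
.Where(d => d.DateKey >= 20130920 && d.DateKey <= 20130920)
.Where(c => c.CompanyID == 4)
.GroupBy(t => new { t.StoreID })
.Select(X => new
{
Store = X.Key.StoreID,
Sum = X.Sum(t => t.Amount),
// count distinct bills; X.Count() would count sale items instead
Bills = X.Select(t => t.BillID).Distinct().Count(),
average = X.Sum(t => t.Amount) / X.Select(t => t.BillID).Distinct().Count()
});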

Using LINQ to select highest earning employee per department

Not really sure how to word it but basically I have a bunch of data in a list that goes similar to this:
Person 1
Name - Joe Bloggs
Age - 40
Department - IT
Wage - 20,000
Person 2
Name - Jess Jane
Age - 40
Department - Kitchen
Wage - 16,000
...you get the idea.
At the moment, I've just selected all of the people, ordered them by wage and entered them in a listbox very simply by doing this.
var item = (from employee in employeeList.employees
orderby employee.wage descending
select employee);
Now, my question is, how can I change this bit of code so that it filters through the list and show only the highest earning employee in their department? So for example, instead of having hundreds of employees listed, it will only show the highest earning employee in IT, then the highest earning employee in catering, etc.
Is it possible? If not are there any other methods I can use to achieve this?
You could try something like this:
var result = employeeList.employees.GroupBy(emp => emp.Department)
.Select(gr => new
{
Department = gr.Key,
Employee = gr.OrderByDescending(x=>x.wage)
.FirstOrDefault()
});
What we do above is group by department and then pick, for each department, the employee with the highest wage.
Update
In .NET 6, the above could be re-written as below:
var result = employeeList.employees
.GroupBy(employee => employee.Department)
.Select(gr => new
{
Department = gr.Key,
Employee = gr.MaxBy(x => x.wage)
});
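This approach gets the result without sorting, so it can take O(n) time instead of O(n*log(n)), provided the maximum wage is computed only once per group. The first version re-evaluates g.Max inside the predicate for each person it checks; the edit further down avoids that.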
var highestEarningPersonByDepartment = persons
.GroupBy(p => p.Department)
.Select(g => new { Department = g.Key, HighestEarningPerson = g.First(person => person.Wage == g.Max(p => p.Wage)) })
.ToDictionary(dp => dp.Department, dp => dp.HighestEarningPerson);
Edit: A more optimised version would be:
var maxWageByDepartment = persons
.GroupBy(p => p.Department)
.ToDictionary(g => g.Key, g => g.Max(p => p.Wage));
var richestPersonByDepartment = persons
.GroupBy(p => p.Department)
.Select(g => new { Department = g.Key, HighestEarningPerson = g.First(person => person.Wage == maxWageByDepartment[g.Key]) });
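As an alternative that does a single pass per group without MaxBy, Aggregate can carry the best candidate along. This is a sketch of a different technique from the answers above, not a rewrite of them. The tuple fields follow the question, and the extra employee is invented so that one department has more than one member.

```csharp
using System;
using System.Linq;

// Sample data: two rows from the question plus one hypothetical colleague in IT.
var employees = new[]
{
    (Name: "Joe Bloggs", Department: "IT", Wage: 20000m),
    (Name: "Ann Lee", Department: "IT", Wage: 24000m),   // invented for the example
    (Name: "Jess Jane", Department: "Kitchen", Wage: 16000m),
};

var topByDepartment = employees
    .GroupBy(e => e.Department)
    .ToDictionary(
        g => g.Key,
        // keep whichever of the two candidates earns more; on ties the earlier one wins
        g => g.Aggregate((best, next) => next.Wage > best.Wage ? next : best));

foreach (var pair in topByDepartment)
    Console.WriteLine($"{pair.Key}: {pair.Value.Name} ({pair.Value.Wage})");
```

Each group is never empty, so the seedless Aggregate overload is safe here.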
Something along these lines:
var highSalemployeePerDept= employeeList.employees.GroupBy(x => x.Department)
.Select(g => g.OrderByDescending(x => x.Salary).First());
var highestEarnersByDept =
employeeList.employees.GroupBy(employee => employee.Department)
.Select(grouping => grouping.OrderByDescending(employee => employee.Wage).First())
.ToArray();

Averaging with Linq while ignoring 0s cleanly

I have a linq statement that averages the rows in a DataTable and displays them on a chart, grouped by date and time of day.
There is one big problem: many 0 values are returned, because at particular times of day there is simply nothing going on. These are skewing my averages something awful.
Different times of day may have 0s in different columns, so I can't just delete each row with a 0 in the columns (cleanly), as I would end up with no rows left in the datatable, or at least I can't think of a clean way to do it in any case.
This is what I have:
var results = from row2 in fiveDayDataTable.AsEnumerable()
group row2 by ((DateTime)row2["TheDate"]).TimeOfDay
into g
select new
{
Time = g.Key,
AvgItem1 = g.Average(x => (int)x["Item1"]),
AvgItem2 = g.Average(x => (int)x["Item2"]),
AvgItem3 = g.Average(x => (int)x["Item3"]),
AvgItem4 = g.Average(x => (int)x["Item4"]),
AvgItem5 = g.Average(x => (int)x["Item5"]),
};
I don't know if this is possible, so I figured I would ask- is there a way to do the average without the 0s?
Thank you!
Sure you can filter out the zeros:
AvgItem1 = g.Select(x => (int)x["Item1"]).Where(x => x != 0).Average(),
AvgItem2 = g.Select(x => (int)x["Item2"]).Where(x => x != 0).Average(),
AvgItem3 = g.Select(x => (int)x["Item3"]).Where(x => x != 0).Average(),
AvgItem4 = g.Select(x => (int)x["Item4"]).Where(x => x != 0).Average(),
AvgItem5 = g.Select(x => (int)x["Item5"]).Where(x => x != 0).Average(),
If your result set (after the Where) might be empty, you might need to call DefaultIfEmpty.
AvgItem1 = g.Select(x => (int)x["Item1"]).Where(x => x != 0).DefaultIfEmpty(0).Average(),
This will return a non-empty result set so your Average will be able to work with it.
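A quick standalone check of that behaviour, using a plain int sequence instead of DataRow values. The helper name AverageNonZero is made up for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: average of the non-zero values, or 0 when none remain.
static double AverageNonZero(IEnumerable<int> values) =>
    values.Where(v => v != 0).DefaultIfEmpty(0).Average();

Console.WriteLine(AverageNonZero(new[] { 0, 4, 0, 8 })); // 6
Console.WriteLine(AverageNonZero(new[] { 0, 0 }));       // 0
```

Without DefaultIfEmpty, the second call would throw an InvalidOperationException, because Average has nothing to work with once every value is filtered out.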
Since you have a lot of repetition, you could consider refactoring your average logic into a separate method or anonymous function:
Func<IEnumerable<DataRow>, string, double> avg =
(g, name) => g.Select(x => (int)x[name]).Where(x => x != 0).Average();
var results = from row2 in fiveDayDataTable.AsEnumerable()
group row2 by ((DateTime)row2["TheDate"]).TimeOfDay
into g
select new
{
Time = g.Key,
AvgItem1 = avg(g, "Item1"),
AvgItem2 = avg(g, "Item2"),
AvgItem3 = avg(g, "Item3"),
AvgItem4 = avg(g, "Item4"),
AvgItem5 = avg(g, "Item5"),
};
Add a Where to each query just before the Average in which you ensure that the item is not equal to zero.

How do I .OrderBy() and .Take(x) this LINQ query?

The LINQ query below is working fine but I need to tweak it a bit.
I want all the records in the file grouped by recordId (a customer number) and then ordered by, in descending order, the date. I'm getting the grouping and the dates are in descending order. Now, here comes the tweaking.
I want the groups to be sorted, in ascending order, by recordId. Currently, the groups are sorted by the date, or so it seems. I tried adding a .OrderBy after the .GroupBy and couldn't get that to work at all.
Last, I want to .take(x) records where x is dependent on some other factors. Basically, the .take(x) will return the most-recent x records. I tried placing a .take(x) in various places and I wasn't getting the correct results.
var recipients = File.ReadAllLines(path)
.Select (record => record.Split('|'))
.Select (tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
recordId = tokens[13],
date = Convert.ToDateTime(tokens[17])
}
)
.OrderByDescending (m => m.date)
.GroupBy (m => m.recordId)
.Dump();
Edit #1 -
recordId is not unique. There may / will likely be multiple records with the same recordId. recordId is actually a customer number.
The output will be a result set with first name, last name, date, and recordId. Depending on several factors, there may be 1 to 5 records returned for each recordId.
Edit #2 -
The .Take(x) is for the recordId. Each recordId may have multiple rows. For now, let's assume I want the most recent date for each recordId. (select top(1) when sorted by date descending)
Edit #3 -
The following query generates the following results. Note that each recordId produces only 1 row in the output (this is okay), and it appears to be the row with the most recent date. I haven't thoroughly checked this yet.
Now, how do I sort, in ascending order, by recordId?
var recipients = File.ReadAllLines(path)
.Select (record => record.Split('|'))
.Select (tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
recordId = Convert.ToInt32(tokens[13]),
date = Convert.ToDateTime(tokens[17])
}
)
.GroupBy (m => m.recordId)
.OrderByDescending (m => m.Max (x => x.date ) )
.Select (m => m.First () )
.Dump();
FirstName LastName recordId date
X X 2531334 3/11/2011 12:00:00 AM
X X 1443809 10/18/2001 12:00:00 AM
X X 2570897 3/10/2011 12:00:00 AM
X X 1960526 3/10/2011 12:00:00 AM
X X 2475293 3/10/2011 12:00:00 AM
X X 2601783 3/10/2011 12:00:00 AM
X X 2581844 3/6/2011 12:00:00 AM
X X 1773430 3/3/2011 12:00:00 AM
X X 1723271 2/4/2003 12:00:00 AM
X X 1341886 2/28/2011 12:00:00 AM
X X 1427818 11/15/1986 12:00:00 AM
You can't easily order by a field that is not part of the group-by key. You get a list for each group, which means you get a list of dates for each recordId.
You could order by Max(date) or Min(date).
Or you could group by recordId and date, and order by date.
order by most recent date:
.GroupBy (m => m.recordId)
// take the most recent date in the group
.OrderByDescending (m => m.Max(x => x.date))
.SelectMany(x => x.First
The Take part is another question. You could just add Take(x) to the expression, then you get this number of groups.
Edit:
For a kind of select top(1):
.GroupBy (m => m.recordId)
// take the most recent date in the group
.OrderByDescending (m => m.Max(x => x.date))
// take the first of each group, which is the most recent
.Select(x => x.First())
// you got the most recent record of each recordId
// and you can take a certain number of it.
.Take(x);
Snippet I had before in my answer; you won't need it given your question as it stands now:
// create a separate group for each unique date and recordId
.GroupBy (m => new { m.date, m.recordId })
.OrderByDescending (m => m.Key.date)
This seems very similar to your other question - Reading a delimted file using LINQ
I don't believe you want to use Group here at all - I believe instead that you want to use OrderBy and ThenBy - something like:
var recipients = File.ReadAllLines(path)
.Select (record => record.Split('|'))
.Select (tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
recordId = tokens[13],
date = Convert.ToDateTime(tokens[17])
}
)
.OrderBy (m => m.recordId)
.ThenByDescending (m => m.date)
.Dump();
For a simple Take... you can just add this .Take(N) just before the Dump()
However, I'm not sure this is what you are looking for? Can you clarify your question?
just add
.OrderBy( g=> g.Key);
after your grouping. This will order your groupings by RecordId ascending.
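Put together on in-memory data, this could look like the sketch below. Two of the rows reuse recordId and date values from the output in the question; the older rows are invented so that each recordId has more than one record.

```csharp
using System;
using System.Linq;

var records = new[]
{
    (recordId: 2531334, date: new DateTime(2011, 3, 11)),
    (recordId: 1443809, date: new DateTime(2001, 10, 18)),
    (recordId: 2531334, date: new DateTime(2010, 1, 5)),   // hypothetical older row
    (recordId: 1443809, date: new DateTime(1999, 6, 1)),   // hypothetical older row
};

var latestPerId = records
    .GroupBy(r => r.recordId)
    .OrderBy(g => g.Key)                                    // groups in ascending recordId order
    .Select(g => g.OrderByDescending(r => r.date).First())  // most recent row of each group
    .ToList();

foreach (var r in latestPerId)
    Console.WriteLine($"{r.recordId} {r.date:yyyy-MM-dd}");
// 1443809 2001-10-18
// 2531334 2011-03-11
```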
Last, I want to .take(x) records where
x is dependent on some other factors.
Basically, the .take(x) will return
the most-recent x records.
If you mean by "the most recent" by date, why would you want to group by RecordId in the first place - just order by date descending:
..
.OrderByDescending (m => m.date)
.Take(x)
.Dump();
If you just want to get the top x records in the order established by the grouping though you could do the following:
...
.GroupBy (m => m.recordId)
.SelectMany(s => s)
.Take(x)
.Dump();
If you want something like the first 3 for each group, then I think you need to use a nested query like:
var recipients = File.ReadAllLines(path)
.Select(record => record.Split('|'))
.Select(tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
RecordId = tokens[13],
Date = Convert.ToDateTime(tokens[17])
}
)
.GroupBy(m => m.RecordId)
.Select(grouped => new
{
Id = grouped.Key,
First3 = grouped.OrderByDescending(x => x.Date).Take(3)
})
.Dump();
and if you want this flattened into a record list then you can use SelectMany:
var recipients = File.ReadAllLines(path)
.Select(record => record.Split('|'))
.Select(tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
RecordId = tokens[13],
Date = Convert.ToDateTime(tokens[17])
}
)
.GroupBy(m => m.RecordId)
.Select(grouped => grouped.OrderByDescending(x => x.Date).Take(3))
.SelectMany(item => item)
.Dump();
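The flattened top-N-per-group query can be tried without a file or LINQPad's Dump(). In this sketch the rows and the perGroup variable are assumptions for illustration, and the per-group Take is folded directly into SelectMany.

```csharp
using System;
using System.Linq;

// Hypothetical rows: three for record "A", one for record "B".
var rows = new[]
{
    (RecordId: "A", Date: new DateTime(2011, 3, 1)),
    (RecordId: "A", Date: new DateTime(2011, 3, 3)),
    (RecordId: "A", Date: new DateTime(2011, 3, 2)),
    (RecordId: "B", Date: new DateTime(2011, 2, 1)),
};
int perGroup = 2; // stands in for the x that depends on other factors

var flattened = rows
    .GroupBy(r => r.RecordId)
    .OrderBy(g => g.Key)
    // newest first within each group, keep at most perGroup rows, then flatten
    .SelectMany(g => g.OrderByDescending(r => r.Date).Take(perGroup))
    .ToList();

foreach (var r in flattened)
    Console.WriteLine($"{r.RecordId} {r.Date:yyyy-MM-dd}");
// A 2011-03-03
// A 2011-03-02
// B 2011-02-01
```

Take returns fewer rows when a group has fewer than perGroup records, so no extra check is needed for small groups.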
