Linq query with multiple GroupBy - c#

I have a LINQ query that I cannot make sense of. It draws from a table to get sales data for a specific company (a collection of stores).
The sales table is a collection of individual sale items, and multiple sale items can be on one bill. I need a query that groups a day's worth of sales by store, counts the total bills for each store for that day, and gives the average bill for each store for that day.
Here is what I have so far:
//get list of all stores bill count/average bill
FactSales.Where(d => d.DateKey >= 20130920).Where(d => d.DateKey <= 20130920)
.Where(c => c.CompanyID == 4).GroupBy(g=>new{g.StoreID, g.BillID}).Select(g=>new{Store = g.Key.StoreID, Amount = g.Sum(a => a.Amount).Value })
This gets me a list of all individual bills, but how do I get the average and the bill count for each store, in a list that contains the StoreID, the average bill (sum of all store bills / bill count), and the bill count of each store?
I've been at this a while and have read many posts, and I must be missing something very obvious.
Any help will be greatly appreciated!

FactSales
// filter sale positions by day
.Where(d => d.DateKey >= 20130920 && d.DateKey <= 20130920)
// filter sale positions by company
.Where(c => c.CompanyID == 4)
// group values by store and bill so each row contains all sales in one bill
.GroupBy(g=>new { g.StoreID, g.BillID })
// calculate total amount for each bill
.Select(g=>new { Store = g.Key.StoreID, Bill = g.Key.BillID, Amount = g.Sum(a => a.Amount).Value })
// group bills by store so each row contains all bills with their total amounts for this store
.GroupBy(g => g.Store)
// calculate bills count and avg amount for this store
.Select(g => new { Store = g.Key, Bills = g.Count(), AvgBill = g.Average(x => x.Amount) });
I think g.Sum(a => a.Amount).Value does not need the Value property (it only applies when Amount is nullable), so you may be able to write just g.Sum(a => a.Amount).
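To see the two-stage grouping in isolation, here is a minimal sketch that runs against an in-memory array instead of the database. The tuple array factSales and its values are made up for illustration, and the code assumes C# 9 or later (top-level statements); it is not the asker's actual schema.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-in for the FactSales table: one tuple per sale item.
var factSales = new[]
{
    (StoreID: 1, BillID: 100, Amount: 10m),
    (StoreID: 1, BillID: 100, Amount: 5m),
    (StoreID: 1, BillID: 101, Amount: 25m),
    (StoreID: 2, BillID: 200, Amount: 8m),
};

var perStore = factSales
    // first grouping: one group per bill, summed to a bill total
    .GroupBy(s => new { s.StoreID, s.BillID })
    .Select(g => new { Store = g.Key.StoreID, Amount = g.Sum(s => s.Amount) })
    // second grouping: one group per store, over the bill totals
    .GroupBy(b => b.Store)
    .Select(g => new { Store = g.Key, Bills = g.Count(), AvgBill = g.Average(b => b.Amount) })
    .OrderBy(r => r.Store)
    .ToList();

foreach (var r in perStore)
    Console.WriteLine($"Store {r.Store}: {r.Bills} bills, average {r.AvgBill}");
```

Store 1 has two bills (15 and 25), so its count is 2 and its average is 20; averaging the raw sale items instead would divide by 3 and give a different number.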

I think this will also work:
var result = FactSales
.Where(d => d.DateKey >= 20130920 && d.DateKey <= 20130920)
.Where(c => c.CompanyID == 4)
.GroupBy(t => t.StoreID)
.Select(X => new
{
Store = X.Key,
Sum = X.Sum(t => t.Amount),
// count distinct bills; X.Count() would count sale items, not bills
Bills = X.Select(t => t.BillID).Distinct().Count(),
Average = X.Sum(t => t.Amount) / X.Select(t => t.BillID).Distinct().Count()
});

Related

How to display product using group by prices in c# console app

I have a list of products, each with an id, name and price. I want to show them in the console grouped by price range, such as
Price 1 to 100
--list of products
price 101 to 200
--list of products
and so on, up to the highest price.
I need to determine at runtime how many segments I need to create based upon highest price.
If you have a list called products whose items have a property called Price, you could use a basic LINQ expression with OrderBy and Where.
Edit: now that I understand your question better, you could create a new list for each segment of 100's.
so something like this:
products = products.OrderBy(x => x.Price).ToList();
var subProducts = products.Where(x => x.Price > 0 && x.Price < 100).ToList();
// then print each item in the list here.
// then do the next segment
subProducts = products.Where(x => x.Price >= 100 && x.Price < 200).ToList();
and you could put this basic format in a loop and then iterate through by the 100's to get each segment.
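The loop described above might look like the sketch below. The sample products, the step of 100, and the segments list are assumptions made for illustration; the ranges follow the question's "1 to 100, 101 to 200" wording, which only covers every price if prices are whole numbers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data; the real list would come from your application.
var products = new[]
{
    (Name: "Pen", Price: 5m),
    (Name: "Lamp", Price: 99m),
    (Name: "Chair", Price: 150m),
    (Name: "Desk", Price: 420m),
};

const decimal step = 100m;
var highest = products.Max(p => p.Price);
var segments = new List<(decimal Low, decimal High, string[] Names)>();

// Walk the ranges 1-100, 101-200, ... until the highest price is covered.
for (var low = 1m; low <= highest; low += step)
{
    var high = low + step - 1;
    var names = products
        .Where(p => p.Price >= low && p.Price <= high)
        .OrderBy(p => p.Price)
        .Select(p => p.Name)
        .ToArray();
    segments.Add((low, high, names));

    Console.WriteLine($"Price {low} to {high}");
    foreach (var name in names)
        Console.WriteLine($"--{name}");
}
```

The number of segments is decided at runtime by the highest price, so empty ranges in the middle are still printed; skip them with a check on names.Length if that is not wanted.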
There are two problems to solve here. First, you need to figure out, for any given price, how to categorize it into a price range. The information you've given in your post isn't quite specific enough unless all your prices are in whole numbers, but I'll make an educated guess here.
// Model
record ProductPriceRange(decimal Low, decimal High);
// Desired behavior
GetPriceRange(0.01m).Should().Be(new ProductPriceRange(0.01m, 100m));
GetPriceRange(100m).Should().Be(new ProductPriceRange(0.01m, 100m));
GetPriceRange(100.01m).Should().Be(new ProductPriceRange(100.01m, 200m));
// Implementation
ProductPriceRange GetPriceRange(decimal price)
{
var groupBasis = (int)(((price - 0.01m) / 100)) * 100 + 0.01m;
return new ProductPriceRange(groupBasis, groupBasis + 99.99m);
}
Then your grouping code can look something like this:
var productGroups =
from p in products
group p by GetPriceRange(p.Price) into g
select new {
PriceRange = g.Key,
Products = g.ToList()
};
foreach (var g in productGroups)
{
Console.WriteLine($"{g.PriceRange.Low} - {g.PriceRange.High}: {string.Join(", ", g.Products)}");
}
Sample output:
0.01 - 100.00: Product { Price = 0.01 }, Product { Price = 99.99 }, Product { Price = 100 }
100.01 - 200.00: Product { Price = 100.01 }, Product { Price = 199.99 }, Product { Price = 200 }
200.01 - 300.00: Product { Price = 200.01 }
Try the following:
var query = products
.GroupBy(p => (int)(p.Price / 100))
.OrderBy(g => g.Key)
.Select(g => new {
Lower = g.Min(x => x.Price),
Higher = g.Max(x => x.Price),
Items = g.OrderBy(x => x.Price).ToList()
})
.ToList();
It returns a list of groups, each holding the lowest price, the highest price, and the list of items in that group.

C# Linq to aggregate results and compute averages

I have created a sqlite db. Now I need a table where I can store the quantity bought by my clients and the average price they paid.
The code is in C# and the table has the following structure:
ClientID; ItemID; Quantity; Price; Date;
where I store my client ID and the price they paid when they bought a given quantity of an item on a given date.
It happens that the same client can buy the same item on multiple days and pay a different price.
What I need is to aggregate all the quantity each client bought for a given item and what is the average price he paid.
I assume this can be done in Linq to make it efficient but I am not sure how to set up the query for the need above. Any help is much appreciated.
Did you try anything? How about:
var result = data.GroupBy(x => new {x.ClientID, x.ItemID})
.Select(
g =>
new
{
g.Key.ClientID,
g.Key.ItemID,
AvgPrice = g.Average(c => c.Price),
SumQuantity = g.Sum(c => c.Quantity)
});
purchases.GroupBy(g => new { g.ClientId, g.ItemId })
.Select(g => new
{
ClientId = g.Key.ClientId,
ItemId = g.Key.ItemId,
Price = g.Sum(p => p.Price * p.Quantity) / g.Sum(p => p.Quantity)
});
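The two answers above compute different averages: the first takes the plain mean of the Price column, while the second weights each price by the quantity bought. A small sketch of the difference, using made-up in-memory rows (the purchases array and its values are assumptions, not the asker's data):

```csharp
using System;
using System.Linq;

// Hypothetical rows mirroring the ClientID; ItemID; Quantity; Price columns.
var purchases = new[]
{
    (ClientID: 1, ItemID: 7, Quantity: 1, Price: 10m),
    (ClientID: 1, ItemID: 7, Quantity: 3, Price: 20m),
};

var summary = purchases
    .GroupBy(p => new { p.ClientID, p.ItemID })
    .Select(g => new
    {
        g.Key.ClientID,
        g.Key.ItemID,
        TotalQuantity = g.Sum(p => p.Quantity),
        // plain mean of the price column: every purchase counts equally
        SimpleAvgPrice = g.Average(p => p.Price),
        // mean price per unit: each price is weighted by the quantity bought
        WeightedAvgPrice = g.Sum(p => p.Price * p.Quantity) / g.Sum(p => p.Quantity),
    })
    .Single();

Console.WriteLine($"{summary.TotalQuantity} units, simple {summary.SimpleAvgPrice}, weighted {summary.WeightedAvgPrice}");
```

With one unit at 10 and three units at 20, the simple average is 15 while the weighted average is 17.5. Which one is "the average price paid" depends on whether you mean per purchase or per unit.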

Linq Grouping & Distinct

Let's say I have the following records
Timestamp,Hash,Strength,Dev
1493898886.78,6483516d23526eed51504a59554c76b0f4c2f2a05973517ff451ce1abae06038,-68,273783
1493898886.78,6483516d23526eed51504a59554c76b0f4c2f2a05973517ff451ce1abae06038,-66,273783
1493898892.28,6483516d23526eed51504a59554c76b0f4c2f2a05973517ff451ce1abae06038,-59,273783
1493898893.0,f76dcc5bfefe5b0ab9a014149bc68f17bf4e12e60e285f4e66c7fcbbb725324e,-63,273783
1493898894.39,6d3c09d97816c0dda102b6f73205484a41a7e652af89e7fba7acff4f78879d89,-65,273783
1493898894.48,6d3c09d97816c0dda102b6f73205484a41a7e652af89e7fba7acff4f78879d89,-61,273783
1493898896.19,6483516d23526eed51504a59554c76b0f4c2f2a05973517ff451ce1abae06038,-63,273783
1493898900.19,6483516d23526eed51504a59554c76b0f4c2f2a05973517ff451ce1abae06038,-58,273783
If you look at the first two records, the Timestamp and Hash are duplicated. How can I use LINQ to select distinct records, keeping the one with the highest Strength?
var dist = records
.GroupBy(o => new { o.MACHash, o.Timestamp })
.Select(y => y.Max(x => x.Strength))
.ToList();
This gets me only the strength, but I want the reduced list of records.
The following selects one record per Timestamp + Hash group, the one with the highest Strength:
var q = db.TableName
.GroupBy(x => new { x.Timestamp, x.Hash })
.Select(g => g.OrderByDescending(x => x.Strength).First());
Assuming that your table name is A
from a in db.A
group a by new { TimeStamp = a.TimeStamp, Hash = a.Hash } into groupItem
select new { TimeStamp = groupItem.Key.TimeStamp,
Hash = groupItem.Key.Hash, Strength = groupItem.Max(x => x.Strength)}
You will get the max Strength for each TimeStamp and Hash combination.
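The first approach keeps the whole winning record (including Dev), while the second returns only the key columns plus the maximum. A minimal in-memory sketch of the first approach follows; the records array is hypothetical and the hashes are shortened for readability.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory records; hashes shortened for readability.
var records = new[]
{
    (Timestamp: 1493898886.78, Hash: "6483", Strength: -68, Dev: 273783),
    (Timestamp: 1493898886.78, Hash: "6483", Strength: -66, Dev: 273783),
    (Timestamp: 1493898892.28, Hash: "6483", Strength: -59, Dev: 273783),
    (Timestamp: 1493898893.0,  Hash: "f76d", Strength: -63, Dev: 273783),
};

var reduced = records
    .GroupBy(r => new { r.Timestamp, r.Hash })
    // keep the whole record with the highest Strength in each group
    .Select(g => g.OrderByDescending(r => r.Strength).First())
    .ToList();

foreach (var r in reduced)
    Console.WriteLine($"{r.Timestamp},{r.Hash},{r.Strength},{r.Dev}");
```

The two duplicated rows collapse into the one with Strength -66, and the other rows pass through unchanged.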

Averaging with Linq while ignoring 0s cleanly

I have a linq statement that averages the rows in a DataTable and displays them on a chart, grouped by date and time of day.
There is one big problem: many 0 values are returned, because at particular times of day there is simply nothing going on. These are skewing my averages badly.
Different times of day may have 0s in different columns, so I can't just delete each row with a 0 in the columns (cleanly), as I would end up with no rows left in the datatable, or at least I can't think of a clean way to do it in any case.
This is what I have:
var results = from row2 in fiveDayDataTable.AsEnumerable()
group row2 by ((DateTime)row2["TheDate"]).TimeOfDay
into g
select new
{
Time = g.Key,
AvgItem1 = g.Average(x => (int)x["Item1"]),
AvgItem2 = g.Average(x => (int)x["Item2"]),
AvgItem3 = g.Average(x => (int)x["Item3"]),
AvgItem4 = g.Average(x => (int)x["Item4"]),
AvgItem5 = g.Average(x => (int)x["Item5"]),
};
I don't know if this is possible, so I figured I would ask- is there a way to do the average without the 0s?
Thank you!
Sure you can filter out the zeros:
AvgItem1 = g.Select(x => (int)x["Item1"]).Where(x => x != 0).Average(),
AvgItem2 = g.Select(x => (int)x["Item2"]).Where(x => x != 0).Average(),
AvgItem3 = g.Select(x => (int)x["Item3"]).Where(x => x != 0).Average(),
AvgItem4 = g.Select(x => (int)x["Item4"]).Where(x => x != 0).Average(),
AvgItem5 = g.Select(x => (int)x["Item5"]).Where(x => x != 0).Average(),
If your result set (after the Where) might be empty, you might need to call DefaultIfEmpty.
AvgItem1 = g.Select(x => (int)x["Item1"]).Where(x => x != 0).DefaultIfEmpty(0).Average(),
This will return a non-empty result set so your Average will be able to work with it.
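As a small standalone illustration of why DefaultIfEmpty matters here: calling Average() on an empty sequence of int throws an InvalidOperationException, which is what happens when every value in a group is 0 and the Where removes them all. The helper name AverageIgnoringZeros below is made up for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Average of the non-zero values; returns 0 when every value is zero
// (or the sequence is empty) instead of throwing.
double AverageIgnoringZeros(IEnumerable<int> values) =>
    values.Where(v => v != 0).DefaultIfEmpty(0).Average();

Console.WriteLine(AverageIgnoringZeros(new[] { 0, 4, 0, 8 }));
Console.WriteLine(AverageIgnoringZeros(new[] { 0, 0 }));
```

Returning 0 for an all-zero group is a choice; if the chart should show a gap instead, a nullable result would be an alternative.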
Since you have a lot of repetition, you could consider refactoring your average logic into a separate method or anonymous function:
Func<IEnumerable<DataRow>, string, double> avg =
(g, name) => g.Select(x => (int)x[name]).Where(x => x != 0).Average();
var results = from row2 in fiveDayDataTable.AsEnumerable()
group row2 by ((DateTime)row2["TheDate"]).TimeOfDay
into g
select new
{
Time = g.Key,
AvgItem1 = avg(g, "Item1"),
AvgItem2 = avg(g, "Item2"),
AvgItem3 = avg(g, "Item3"),
AvgItem4 = avg(g, "Item4"),
AvgItem5 = avg(g, "Item5"),
};
Add a Where to each query just before the Average in which you ensure that the item is not equal to zero.

How do I .OrderBy() and .Take(x) this LINQ query?

The LINQ query below is working fine but I need to tweak it a bit.
I want all the records in the file grouped by recordId (a customer number) and then ordered by, in descending order, the date. I'm getting the grouping and the dates are in descending order. Now, here comes the tweaking.
I want the groups to be sorted, in ascending order, by recordId. Currently, the groups are sorted by the date, or so it seems. I tried adding a .OrderBy after the .GroupBy and couldn't get that to work at all.
Last, I want to .take(x) records where x is dependent on some other factors. Basically, the .take(x) will return the most-recent x records. I tried placing a .take(x) in various places and I wasn't getting the correct results.
var recipients = File.ReadAllLines(path)
.Select (record => record.Split('|'))
.Select (tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
recordId = tokens[13],
date = Convert.ToDateTime(tokens[17])
}
)
.OrderByDescending (m => m.date)
.GroupBy (m => m.recordId)
.Dump();
Edit #1 -
recordId is not unique. There may / will likely be multiple records with the same recordId. recordId is actually a customer number.
The output will be a result set with first name, last name, date, and recordId. Depending on several factors, there may be 1 to 5 records returned for each recordId.
Edit #2 -
The .Take(x) is for the recordId. Each recordId may have multiple rows. For now, let's assume I want the most recent date for each recordId. (select top(1) when sorted by date descending)
Edit #3 -
The following query generates the following results. Note that each recordId produces only 1 row in the output (this is okay), and it appears to be the most recent date. I haven't thoroughly checked this yet.
Now, how do I sort, in ascending order, by recordId?
var recipients = File.ReadAllLines(path)
.Select (record => record.Split('|'))
.Select (tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
recordId = Convert.ToInt32(tokens[13]),
date = Convert.ToDateTime(tokens[17])
}
)
.GroupBy (m => m.recordId)
.OrderByDescending (m => m.Max (x => x.date ) )
.Select (m => m.First () )
.Dump();
FirstName LastName recordId date
X X 2531334 3/11/2011 12:00:00 AM
X X 1443809 10/18/2001 12:00:00 AM
X X 2570897 3/10/2011 12:00:00 AM
X X 1960526 3/10/2011 12:00:00 AM
X X 2475293 3/10/2011 12:00:00 AM
X X 2601783 3/10/2011 12:00:00 AM
X X 2581844 3/6/2011 12:00:00 AM
X X 1773430 3/3/2011 12:00:00 AM
X X 1723271 2/4/2003 12:00:00 AM
X X 1341886 2/28/2011 12:00:00 AM
X X 1427818 11/15/1986 12:00:00 AM
You can't easily order by a field that is not part of the group-by key. After grouping you get a list for each group, which means you get a list of dates for each recordId.
You could order by Max(date) or Min(date).
Or you could group by recordId and date, and order by date.
Order by most recent date:
.GroupBy (m => m.recordId)
// take the most recent date in the group
.OrderByDescending (m => m.Max(x => x.date))
.Select(x => x.First())
The Take part is another question. You could just add Take(x) to the expression, then you get this number of groups.
Edit:
For a kind of select top(1):
.GroupBy (m => m.recordId)
// take the most recent date in the group
.OrderByDescending (m => m.Max(x => x.date))
// take the first of each group, which is the most recent
.Select(x => x.First())
// you got the most recent record of each recordId
// and you can take a certain number of it.
.Take(x);
Snippet I had before in my answer; you won't need it according to your question as it is now:
// create a separate group for each unique date and recordId
.GroupBy (m => new { m.date, m.recordId })
.OrderByDescending (m => m.Key.date)
This seems very similar to your other question - Reading a delimted file using LINQ
I don't believe you want to use Group here at all - I believe instead that you want to use OrderBy and ThenBy - something like:
var recipients = File.ReadAllLines(path)
.Select (record => record.Split('|'))
.Select (tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
recordId = tokens[13],
date = Convert.ToDateTime(tokens[17])
}
)
.OrderBy (m => m.recordId)
.ThenByDescending (m => m.date)
.Dump();
For a simple Take... you can just add this .Take(N) just before the Dump()
However, I'm not sure this is what you are looking for? Can you clarify your question?
just add
.OrderBy( g=> g.Key);
after your grouping. This will order your groupings by RecordId ascending.
"Last, I want to .take(x) records where x is dependent on some other factors. Basically, the .take(x) will return the most-recent x records."
If you mean by "the most recent" by date, why would you want to group by RecordId in the first place - just order by date descending:
..
.OrderByDescending (m => m.date)
.Take(x)
.Dump();
If you just want to get the top x records in the order established by the grouping though you could do the following:
...
.GroupBy (m => m.recordId)
.SelectMany(s => s)
.Take(x)
.Dump();
If you want something like the first 3 for each group, then I think you need to use a nested query like:
var recipients = File.ReadAllLines(path)
.Select(record => record.Split('|'))
.Select(tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
RecordId = tokens[13],
Date = Convert.ToDateTime(tokens[17])
}
)
.GroupBy(m => m.RecordId)
.Select(grouped => new
{
Id = grouped.Key,
First3 = grouped.OrderByDescending(x => x.Date).Take(3)
})
.Dump();
and if you want this flattened into a record list then you can use SelectMany:
var recipients = File.ReadAllLines(path)
.Select(record => record.Split('|'))
.Select(tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
RecordId = tokens[13],
Date = Convert.ToDateTime(tokens[17])
}
)
.GroupBy(m => m.RecordId)
.Select(grouped => grouped.OrderByDescending(x => x.Date).Take(3))
.SelectMany(item => item)
.Dump();
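Putting the pieces of this thread together, the sketch below sorts the groups by recordId ascending and keeps the most recent rows inside each group. The rows array replaces the file parsing, and perGroup stands in for the question's x; both are assumptions for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical parsed rows; in the question these come from the delimited file.
var rows = new[]
{
    (RecordId: 2531334, Date: new DateTime(2011, 3, 11)),
    (RecordId: 1443809, Date: new DateTime(2001, 10, 18)),
    (RecordId: 2531334, Date: new DateTime(2011, 3, 1)),
    (RecordId: 1443809, Date: new DateTime(2001, 10, 20)),
    (RecordId: 2531334, Date: new DateTime(2011, 2, 1)),
};

int perGroup = 2; // stands in for the question's x

var result = rows
    .GroupBy(r => r.RecordId)
    .OrderBy(g => g.Key)                                  // groups ascending by RecordId
    .SelectMany(g => g.OrderByDescending(r => r.Date)     // newest first inside each group
                      .Take(perGroup))
    .ToList();

foreach (var r in result)
    Console.WriteLine($"{r.RecordId} {r.Date:yyyy-MM-dd}");
```

Ordering the groups by Key only sorts numerically if recordId is parsed as a number, as in the asker's Edit #3; with string ids the order is alphabetical.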
