LINQ - Distinct by value? - c#

Code :
news = (from New myNew in new News()
select myNew).Distinct().ToList();
But this Distinct only removes objects whose values are all the same. I need my list to contain one myNew for each month (one for January, one for February, and so on), so that news ends up with 12 records.
Is something like Distinct(myNew.Month) possible?

You could group by month and take the first or last or whatever (you haven't told us which):
var news = News()
.GroupBy(n => n.Month)
.Select(grp => grp.Last());
Edit: From the comment on Habib's answer I see that you want 12 months even if there is no news for some of them. In that case you need a LINQ left outer join:
var monthlyNews = from m in Enumerable.Range(1, 12) // left outer join every month
join n in News() on m equals n.Month into m_n
from n in m_n.DefaultIfEmpty()
group n by m into MonthGroups
select new {
Month = MonthGroups.Key,
LastNews = MonthGroups.Last()
};
foreach (var m in monthlyNews)
{
int month = m.Month;
var lastNewsInMonth = m.LastNews;
if (lastNewsInMonth != null)
{
// do something...
}
}
Edit: Since you are having trouble implementing the query in your code: you don't need to select the anonymous type that also contains the month. You can select just the news item itself:
var monthlyNews = from m in Enumerable.Range(1, 12) // every month
join n in news on m equals n.Month into m_n
from n in m_n.DefaultIfEmpty()
group n by m into MonthGroups
select MonthGroups.Last();
Note that you now get 12 news items, but some of them may be null when there is no news in that month.
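A minimal, self-contained sketch of the same idea, assuming in-memory data. The anonymous objects and their Month and Title properties are hypothetical stand-ins for the News items, and the query uses a plain group join with LastOrDefault() instead of flattening and regrouping, which should give the same "one entry per month, null when empty" result:

```csharp
using System;
using System.Linq;

// Hypothetical sample data: anonymous objects stand in for the News items.
var news = new[]
{
    new { Month = 1, Title = "January A" },
    new { Month = 1, Title = "January B" },
    new { Month = 3, Title = "March A" },
};

// Group join: every month 1..12 appears once; months without news get null.
var monthly = (from m in Enumerable.Range(1, 12)
               join n in news on m equals n.Month into matches
               select new { Month = m, LastNews = matches.LastOrDefault() })
              .ToList();

foreach (var entry in monthly)
{
    Console.WriteLine($"{entry.Month}: {entry.LastNews?.Title ?? "(no news)"}");
}
```

Because the anonymous type is a reference type, LastOrDefault() returns null for an empty month, so the null check from the loop above still applies.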

Solution 1. Get MoreLinq (also available as a NuGet package) and use
News().DistinctBy(n => n.Property)
Solution 2. Implement an IEqualityComparer and use the Distinct() overload that takes a comparer.
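As a sketch of what "distinct by a key" looks like in practice: on .NET 6 and later, System.Linq ships its own DistinctBy, so MoreLinq is not strictly required there. The sample data below is hypothetical, and the GroupBy variant is shown as a fallback for older frameworks:

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for the News items.
var news = new[]
{
    new { Month = 1, Title = "January A" },
    new { Month = 1, Title = "January B" },
    new { Month = 2, Title = "February A" },
};

// .NET 6+: built-in DistinctBy keeps the first element seen for each key.
var viaDistinctBy = news.DistinctBy(n => n.Month).ToList();

// Older frameworks: GroupBy + First gives the same elements.
var viaGroupBy = news.GroupBy(n => n.Month).Select(g => g.First()).ToList();

Console.WriteLine(string.Join(", ", viaDistinctBy.Select(n => n.Title)));
```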

var result = News()
.GroupBy(p => p.Month)
.Select(g => g.First())
.ToList();

Shorthand solution:
var vNews = News()
.GroupBy(p => p.Month, (key, p) => p.FirstOrDefault())
.ToList();

var vNews = News()
.GroupBy(p => p.Month)
.Select(g => g.First())
.ToList();

Related

C# LINQ Group by

I'm new to C# and trying to answer some LINQ questions. I'm stuck on the first one marked as difficult.
Q: What were the top 10 origin airports with the largest average departure delays, including the values of these delays? (Hint: use group by)
I have a list named "Flights" populated with more than 20000 objects of class "FlightInfo".
Properties of the FlightInfo class are:
string Carrier, string Origin, string Destination, int DepartureDelay, int ArrivalDelay, int Cancelled, int Distance.
I understand that I should group FlightInfo by FlightInfo.Origin, then average each of these groups by FlightInfo.DepartureDelay, and then show the 10 with the highest average delay, but besides the grouping I'm completely stuck on how to proceed.
Thank you in advance for any help!
Here is the example of one of previous questions that I was able to answer:
Q: The weighted arrival delay of a flight is its arrival delay divided by the distance. What was the flight with the largest weighted arrival delay out of Boston, MA?
A:
var weighted = (from FlightInfo in Flights
where FlightInfo.Origin == "Boston MA"
orderby (FlightInfo.ArrivalDelay / FlightInfo.Distance) descending
select FlightInfo).Take(1);
var topTen = flights
.GroupBy(g => g.Origin)
.Select(g => new { Origin = g.Key, AvgDelay = g.Average(d => d.DepartureDelay) })
.OrderByDescending(o => o.AvgDelay)
.Take(10);
var result = flights
.GroupBy(f => f.Origin)
.OrderByDescending(g => g.Average(f => f.DepartureDelay))
.Take(10)
.Select(g => new
{
AirportName = g.Key,
Flights = g.ToList()
});
The last .Select parameter depends on what you want.
You could do this.
var top10 = Flights.GroupBy(g=>g.Origin) // groupby origin
.OrderByDescending(x=> x.Sum(f=> f.ArrivalDelay / f.Distance)) // Get the weighted delay for each flight and use it for ordering.
.Select(x=>x.Key) //Airport or Origin (Modify with what you want)
.Take(10)
.ToList() ;
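For the question as asked (largest average departure delay per origin), a small runnable sketch might look like this. The sample rows are made up, anonymous objects stand in for FlightInfo, and Take(2) is used only because the sample is tiny:

```csharp
using System;
using System.Linq;

// Hypothetical sample data; the real list holds FlightInfo objects.
var flights = new[]
{
    new { Origin = "Boston MA", DepartureDelay = 10 },
    new { Origin = "Boston MA", DepartureDelay = 25 },
    new { Origin = "Denver CO", DepartureDelay = 40 },
    new { Origin = "Austin TX", DepartureDelay = 5 },
};

var topOrigins = flights
    .GroupBy(f => f.Origin)
    // Average over an int selector returns a double, so no cast is needed.
    .Select(g => new { Origin = g.Key, AvgDelay = g.Average(f => f.DepartureDelay) })
    .OrderByDescending(x => x.AvgDelay)
    .Take(2) // use Take(10) for the real question
    .ToList();

foreach (var o in topOrigins)
{
    Console.WriteLine($"{o.Origin}: {o.AvgDelay:F1}");
}
```

Projecting the average before ordering keeps the delay value available in the result, which the question explicitly asks for.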

Linq Query To Group All Other Than Top 3

I have the following code in LINQ, and I was wondering how to make it group everything besides the top 3 into an "others" category and sum their volumes.
var list = (from t in sortedCollection.DataItem
orderby t.volume
select t).Take(3);
You need to use Skip to ignore the top 3 and work with the rest, like:
var list = (from t in sortedCollection.DataItem
orderby t.volume
select t).Skip(3);
From the comments, it seems you only want to get the sum of a particular field after skipping first 3 records.
var sum = (from t in sortedCollection.DataItem
orderby t.volume
select t).Skip(3).Sum(r=> r.volume);
Or with a complete method syntax:
var sum = sortedCollection.DataItem.OrderBy(t => t.volume)
.Skip(3)
.Sum(r=> r.volume);
If you need grouping, then with method syntax it would look something like:
var query = sortedCollection.DataItem.OrderBy(t => t.volume)
.Skip(3)
.GroupBy(t => t.YourGroupingField);
To sum a field per group, you can do something like:
var query = sortedCollection.DataItem.OrderBy(t => t.volume)
.Skip(3)
.GroupBy(t => t.YourGroupingField)
.Select(grp => new
{
Key = grp.Key,
Sum = grp.Sum(r => r.ValueFieldForSum)
});
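Putting Take and Skip together, one possible way to get "top 3 plus a single Others row" is sketched below. The Name and Volume properties and the sample values are assumptions, and the sort is descending on the assumption that "top" means largest volume:

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for sortedCollection.DataItem.
var items = new[]
{
    new { Name = "A", Volume = 50 },
    new { Name = "B", Volume = 40 },
    new { Name = "C", Volume = 30 },
    new { Name = "D", Volume = 20 },
    new { Name = "E", Volume = 10 },
};

// Sort once, largest volume first, and materialize so it is not re-sorted.
var ordered = items.OrderByDescending(t => t.Volume).ToList();

// Top 3 as they are, followed by one "Others" row holding the summed remainder.
var summary = ordered
    .Take(3)
    .Concat(new[] { new { Name = "Others", Volume = ordered.Skip(3).Sum(t => t.Volume) } })
    .ToList();

foreach (var row in summary)
{
    Console.WriteLine($"{row.Name}: {row.Volume}");
}
```

The Concat works here because both sides use an anonymous type with the same property names and types; with a named class you would construct an instance of that class for the Others row instead.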

LINQ: how to get a group of a table ordering with a related table?

I have a question about the IGrouping object that results from a LINQ query in which I use a "group by" clause.
I have two tables in the database, Products and Responses, with a one-to-many relationship. The Responses table has a column called FinalRate, which is the rating of the product. A product can have any number of responses (ratings).
I want to get the Products ordered by the sum of FinalRate divided by the number of ratings, that is, ordered by average rating from highest to lowest.
As can be seen in the code (at the end of the question), I try to get the responses first. To sum all the FinalRate values and divide them by the count I use a group.
There are 2 problems with the code, even though the current code works:
1. I tried to get the Products in a single query, but it seems impossible because I cannot use the Products table in the group and then use the Responses table in the "orderby". Also, LINQ only lets you group one table; it is impossible to write "group prod, response".
I couldn't express this SQL statement in LINQ:
select prod.ProductID,prod.Commercial_Product_Name,prod.Manufacturer_Name,
prod.ProductImageUrl
from rev_product prod
inner join rev_response res on res.AtProductid=prod.ProductID
group by prod.ProductID,prod.Commercial_Product_Name,prod.Manufacturer_Name
,prod.ProductImageUrl
order by (sum(res.FinalRate)/count(res.AtProductid))
I tried this:
var gruposproductos = (from prod in ctx.Products
join res in ctx.Responses on prod.ProductID equals res.AtProductId
group prod by prod.ProductID into g
orderby (g.Sum(ra =>ra.FinalRate)/g.Count())
descending select g).Take(2);
But as I said, the "orderby (g.Sum..." gives an error, because "into g" groups the Product table, not the Response table.
This is why my final code doesn't get the products in the same LINQ statement.
2. Having accepted this, the problem is that I get an IGrouping, but I don't obtain a list of Responses that I can iterate without the two foreach loops in the code. I wanted only one loop, as you would have with a "List" object.
It is not a very elegant method, but it works. Moreover, I have to make sure that each product is added only once in the second loop.
Any better code?
var groupproducts = (from res in ctx.Responses
group res by res.AtProductId into g
orderby (g.Sum(ra =>ra.FinalRate)/g.Count())
descending select g).Take(2).ToList();
List<Product> theproducts = new List<Product>();
foreach (var groupresponse in groupproducts)
{
foreach (var response in groupresponse)
{
var producttemp= (from prod in ctx.Products
where prod.ProductID == response.AtProductId
select prod).First();
theproducts.Add(producttemp);
}
}
FINAL SOLUTION (thanks a lot, @Daniel)
var productsanonymtype = ctx.Products.Select(x => new
{
Product = x,
AverageRating = x.Responses.Count() == 0 ? 0 : x.Responses.Select(r => (double)r.FinalRate).Sum() / x.Responses.Count()
}).OrderByDescending(x => x.AverageRating);
List<Product> products = new List<Product>();
foreach (var prod in productsanonymtype)
{
products.Add(prod.Product);
}
Try this:
products.Select(x => new
{
Product = x,
AverageRating = x.Responses.Sum(r => r.FinalRate) /
x.Responses.Count()
});
The Sum overload I am using is not implemented in all providers. If that's a problem for you, you can use this alternate version:
products.Select(x => new
{
Product = x,
AverageRating = x.Responses.Select(r => r.FinalRate)
.Sum() /
x.Responses.Count()
});
If there is no navigation property from product to its responses you should first try to fix that. If you can't you can use this version:
products.Join(responses, x => x.Id, x => x.ProductId,
(p, r) => new { Product = p, Response = r })
.GroupBy(x => x.Product)
.Select(g => new { Product = g.Key,
AverageRating = g.Select(x => x.Response.FinalRate)
.Sum() /
g.Count()
});
Assuming FinalRate is an int, both methods will calculate the average rating with an int, i.e. there will be no 4.5 rating. And there will be no rounding, i.e. an actual average rating of 4.9 will result in 4. You can fix that by casting one of the operands of the division to double.
Another problem is the case with no ratings so far. The code above will result in an exception in this case. If that's a problem for you, you can change the calculation to this:
AverageRating = g.Count() == 0
? 0
: g.Select(x => (double)x.Response.FinalRate).Sum() / g.Count()
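For in-memory collections, a shorter route to the same result is Average combined with DefaultIfEmpty. The sketch below uses hypothetical sample data, with a Rates array standing in for the FinalRate values of a product's responses; whether a given database provider translates this form is not something the sketch verifies:

```csharp
using System;
using System.Linq;

// Hypothetical sample data: each product carries its ratings directly.
var products = new[]
{
    new { Name = "Widget", Rates = new[] { 4, 5 } },
    new { Name = "Gadget", Rates = new[] { 3 } },
    new { Name = "Unrated", Rates = new int[0] },
};

var ranked = products
    .Select(p => new
    {
        p.Name,
        // Average over ints returns a double, so 4 and 5 give 4.5 rather than 4.
        // DefaultIfEmpty(0) avoids the exception Average throws on an empty sequence.
        AverageRating = p.Rates.DefaultIfEmpty(0).Average()
    })
    .OrderByDescending(x => x.AverageRating)
    .ToList();

foreach (var r in ranked)
{
    Console.WriteLine($"{r.Name}: {r.AverageRating}");
}
```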
ctx.Products.GroupBy(x => new {
ProductId = x.ProductID,
FinalRate = x.Responses.Sum(y => y.FinalRate),
CountProductId = x.Responses.Count
})
.OrderBy(x => x.Key.FinalRate / x.Key.CountProductId);
And here it is with the projection:
ctx.Products.Select(x => new {
ProductID = x.ProductID,
Commercial_Product_Name = x.Commercial_Product_Name,
Manufacturer_Name = x.Manufacturer_Name,
ProductImageUrl = x.ProductImageUrl,
FinalRate = x.Responses.Sum(y => y.FinalRate),
CountProductId = x.Responses.Count
})
.GroupBy(x => new {
ProductId = x.ProductID,
FinalRate = x.FinalRate,
CountProductId = x.CountProductId
})
.OrderBy(x => x.Key.FinalRate / x.Key.CountProductId);

How do I .OrderBy() and .Take(x) this LINQ query?

The LINQ query below is working fine but I need to tweak it a bit.
I want all the records in the file grouped by recordId (a customer number) and then ordered by, in descending order, the date. I'm getting the grouping and the dates are in descending order. Now, here comes the tweaking.
I want the groups to be sorted, in ascending order, by recordId. Currently, the groups are sorted by the date, or so it seems. I tried adding a .OrderBy after the .GroupBy and couldn't get that to work at all.
Last, I want to .Take(x) records, where x depends on some other factors. Basically, .Take(x) will return the most recent x records. I tried placing a .Take(x) in various places and I wasn't getting the correct results.
var recipients = File.ReadAllLines(path)
.Select (record => record.Split('|'))
.Select (tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
recordId = tokens[13],
date = Convert.ToDateTime(tokens[17])
}
)
.OrderByDescending (m => m.date)
.GroupBy (m => m.recordId)
.Dump();
Edit #1 -
recordId is not unique. There may / will likely be multiple records with the same recordId. recordId is actually a customer number.
The output will be a result set with first name, last name, date, and recordId. Depending on several factors, there may be 1 to 5 records returned for each recordId.
Edit #2 -
The .Take(x) is for the recordId. Each recordId may have multiple rows. For now, let's assume I want the most recent date for each recordId. (select top(1) when sorted by date descending)
Edit #3 -
The following query generates the following results. Note that each recordId produces only 1 row in the output (this is okay) and it appears to be the most recent date. I haven't thoroughly checked this yet.
Now, how do I sort, in ascending order, by recordId?
var recipients = File.ReadAllLines(path)
.Select (record => record.Split('|'))
.Select (tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
recordId = Convert.ToInt32(tokens[13]),
date = Convert.ToDateTime(tokens[17])
}
)
.GroupBy (m => m.recordId)
.OrderByDescending (m => m.Max (x => x.date ) )
.Select (m => m.First () )
.Dump();
FirstName LastName recordId date
X X 2531334 3/11/2011 12:00:00 AM
X X 1443809 10/18/2001 12:00:00 AM
X X 2570897 3/10/2011 12:00:00 AM
X X 1960526 3/10/2011 12:00:00 AM
X X 2475293 3/10/2011 12:00:00 AM
X X 2601783 3/10/2011 12:00:00 AM
X X 2581844 3/6/2011 12:00:00 AM
X X 1773430 3/3/2011 12:00:00 AM
X X 1723271 2/4/2003 12:00:00 AM
X X 1341886 2/28/2011 12:00:00 AM
X X 1427818 11/15/1986 12:00:00 AM
You can't easily order by a field that is not part of the group-by fields. You get a list for each group, which means you get a list of dates for each recordId.
You could order by Max(date) or Min(date).
Or you could group by recordId and date, and order by date.
Order by most recent date:
.GroupBy (m => m.recordId)
// take the most recent date in the group
.OrderByDescending (m => m.Max(x => x.date))
.SelectMany(x => x.First
The Take part is another question. You could just add Take(x) to the expression, then you get this number of groups.
Edit:
For a kind of select top(1):
.GroupBy (m => m.recordId)
// take the most recent date in the group
.OrderByDescending (m => m.Max(x => x.date))
// take the first of each group, which is the most recent
.Select(x => x.First())
// you got the most recent record of each recordId
// and you can take a certain number of it.
.Take(x);
Snippet I had before in my answer; you won't need it according to your question as it stands now:
// create a separate group for each unique date and recordId
.GroupBy (m => m.date, m => m.recordId)
.OrderByDescending (m => m.Key)
This seems very similar to your other question - Reading a delimted file using LINQ
I don't believe you want to use Group here at all - I believe instead that you want to use OrderBy and ThenBy - something like:
var recipients = File.ReadAllLines(path)
.Select (record => record.Split('|'))
.Select (tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
recordId = tokens[13],
date = Convert.ToDateTime(tokens[17])
}
)
.OrderBy (m => m.recordId)
.ThenByDescending (m => m.date)
.Dump();
For a simple Take... you can just add this .Take(N) just before the Dump()
However, I'm not sure this is what you are looking for? Can you clarify your question?
Just add
.OrderBy( g=> g.Key);
after your grouping. This will order your groupings by RecordId ascending.
Last, I want to .take(x) records where
x is dependent on some other factors.
Basically, the .take(x) will return
the most-recent x records.
If by "the most recent" you mean by date, why would you want to group by RecordId in the first place? Just order by date descending:
...
.OrderByDescending (m => m.date)
.Take(x)
.Dump();
If you just want to get the top x records in the order established by the grouping though you could do the following:
...
.GroupBy (m => m.recordId)
.SelectMany(s => s)
.Take(x)
.Dump();
If you want something like the first 3 for each group, then I think you need to use a nested query like:
var recipients = File.ReadAllLines(path)
.Select(record => record.Split('|'))
.Select(tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
RecordId = tokens[13],
Date = Convert.ToDateTime(tokens[17])
}
)
.GroupBy(m => m.RecordId)
.Select(grouped => new
{
Id = grouped.Key,
First3 = grouped.OrderByDescending(x => x.Date).Take(3)
})
.Dump();
and if you want this flattened into a record list then you can use SelectMany:
var recipients = File.ReadAllLines(path)
.Select(record => record.Split('|'))
.Select(tokens => new
{
FirstName = tokens[2],
LastName = tokens[4],
RecordId = tokens[13],
Date = Convert.ToDateTime(tokens[17])
}
)
.GroupBy(m => m.RecordId)
.Select(grouped => grouped.OrderByDescending(x => x.Date).Take(3))
.SelectMany(item => item)
.Dump();
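Combining the pieces discussed in this thread (groups sorted by recordId ascending, and the most recent x rows inside each group), a sketch on in-memory data might look like this. The sample rows and the perGroup variable are hypothetical, and the file parsing is left out:

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for the parsed file rows.
var records = new[]
{
    new { RecordId = 20, Date = new DateTime(2011, 3, 10) },
    new { RecordId = 10, Date = new DateTime(2011, 3, 1) },
    new { RecordId = 20, Date = new DateTime(2011, 3, 11) },
    new { RecordId = 10, Date = new DateTime(2011, 3, 6) },
    new { RecordId = 10, Date = new DateTime(2001, 10, 18) },
};

int perGroup = 2; // the "x" from the question; hypothetical value

var latest = records
    .GroupBy(r => r.RecordId)
    .OrderBy(g => g.Key) // groups in ascending recordId order
    // newest rows first inside each group, keep at most perGroup of them
    .SelectMany(g => g.OrderByDescending(r => r.Date).Take(perGroup))
    .ToList();

foreach (var r in latest)
{
    Console.WriteLine($"{r.RecordId} {r.Date:d}");
}
```

Ordering the groups by Key only sorts numerically if recordId has been converted to an integer, as in the question's Edit #3; with string keys the order would be lexicographic.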

LINQ to SQL: GroupBy() and Max() to get the object with latest date

Consider a SQL Server table that's used to store events for auditing.
The need is to get only the latest entry for each CustID. We want to get the entire object/row. I am assuming that a GroupBy() will be needed in the query. Here's the query so far:
var custsLastAccess = db.CustAccesses
.Where(c => c.AccessReason.Length > 0)
.GroupBy(c => c.CustID)
// .Select()
.ToList();
// (?) where to put the c.Max(cu=>cu.AccessDate)
Question:
How can I create the query to select the latest (the maximum AccessDate) record/object for each CustID?
I'm wondering if something like:
var custsLastAccess = db.CustAccesses
.Where(c => c.AccessReason.Length > 0)
.GroupBy(c => c.CustID)
.Select(grp => new {
grp.Key,
LastAccess = grp
.OrderByDescending(x => x.AccessDate)
.Select(x => x.AccessDate)
.FirstOrDefault()
}).ToList();
You could also try OrderBy() and Last().
Using LINQ syntax, which I think looks cleaner:
var custsLastAccess = from c in db.CustAccesses
group c by c.CustID into grp
select grp.OrderByDescending(c => c.AccessDate).FirstOrDefault();
This uses Max rather than OrderByDescending, so it should be more efficient.
var subquery = from c in CustAccesses
group c by c.CustID into g
select new
{
CustID = g.Key,
AccessDate = g.Max(a => a.AccessDate)
};
var query = from c in CustAccesses
join s in subquery
on c.CustID equals s.CustID
where c.AccessDate == s.AccessDate
&& !string.IsNullOrEmpty(c.AccessReason)
select c;
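To see the "latest row per CustID" pattern run end to end, here is a sketch on in-memory data. The sample rows are made up and anonymous objects stand in for the CustAccesses rows; how LINQ to SQL translates the same query is not checked here:

```csharp
using System;
using System.Linq;

// Hypothetical sample data standing in for db.CustAccesses.
var accesses = new[]
{
    new { CustID = 1, AccessDate = new DateTime(2020, 1, 1), AccessReason = "login" },
    new { CustID = 1, AccessDate = new DateTime(2020, 2, 1), AccessReason = "update" },
    new { CustID = 2, AccessDate = new DateTime(2020, 1, 15), AccessReason = "login" },
    new { CustID = 2, AccessDate = new DateTime(2020, 3, 1), AccessReason = "" },
};

var lastAccess = accesses
    .Where(a => a.AccessReason.Length > 0) // drop rows without a reason first
    .GroupBy(a => a.CustID)
    // whole row with the newest AccessDate in each group
    .Select(g => g.OrderByDescending(a => a.AccessDate).First())
    .OrderBy(a => a.CustID)
    .ToList();

foreach (var a in lastAccess)
{
    Console.WriteLine($"{a.CustID}: {a.AccessDate:d} ({a.AccessReason})");
}
```

Note that filtering before grouping means customer 2's latest row is the one from January, because the March row has an empty AccessReason.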
