GroupBy multiple columns in Linq with Take() - c#

I have an entity that looks like this:
public class Entries
{
public Contract Contract { get; set; }
public int Year { get; set; }
}
public class Contract
{
public int Id { get; set; }
public string Name { get; set; }
}
I want to be able to query the database and return an object that would allow me to report on the Contract data like this where the numbers for each year are counts for that year:
Contract ID/Name 2015 2016
1 - ABC 12 4
2 - XYZ 17 76
3 - QRS 414 0
I've started with Linq like this:
var results = context.Entries
.Include(x => x.Contract)
.GroupBy(x => new { x.Contract, x.Year })
.Select(x => new { Contract = x.Key.Contract, Year = x.Key.Year, Count = x.Count() })
.OrderBy(x => x.Contract.Number)
.Take(5).ToList();
I would like this single IQueryable to set me up to push the data into an object that mimics the table above. I'm having trouble setting it up, though, because of the Take(). I only want to show the first 5 results, but when I apply it like I've done above after grouping, using the example data given, there will be one record for ABC 2015 and one record for ABC 2016. I'm not sure the best way to use GroupBy() and Take() together to accomplish my goal.

You need a nested collection:
var results = context.Entries
.Include(x => x.Contract)
.GroupBy(t=>t.Contract)
.Select(t=>new{
Contract=t.Key,
Years=t.GroupBy(s=>s.Year)
.Select(s=>new{
Year=s.Key,
Count=s.Count()
})
})
.Take(5);
You'll have an IEnumerable of Contract/Years pairs, where Years is an IEnumerable of year/count pairs holding the counts per year.
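As a rough sketch of how that nested shape maps onto the report in the question, here is the same idea on in-memory data (LINQ to Objects, not a real context). The `Row` type, the `ContractReport.Build` helper, and the ordering by contract Id are assumptions made for illustration. The answer above does not order before Take(5), and without an ordering "the first 5 contracts" is not well defined.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Contract
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class Entries
{
    public Contract Contract { get; set; }
    public int Year { get; set; }
}

// Hypothetical row type: one line of the report.
public class Row
{
    public int ContractId { get; set; }
    public string Name { get; set; }
    public Dictionary<int, int> CountsByYear { get; set; }

    // Years with no entries report 0, like the QRS row in the question.
    public int CountFor(int year)
    {
        return CountsByYear.TryGetValue(year, out var n) ? n : 0;
    }
}

public static class ContractReport
{
    public static List<Row> Build(IEnumerable<Entries> entries, int take)
    {
        return entries
            // Group on the scalar key so equality does not depend on Contract instances.
            .GroupBy(e => e.Contract.Id)
            .OrderBy(g => g.Key)
            .Take(take) // limits contracts, not contract/year pairs
            .Select(g => new Row
            {
                ContractId = g.Key,
                Name = g.First().Contract.Name,
                CountsByYear = g.GroupBy(e => e.Year)
                                .ToDictionary(y => y.Key, y => y.Count())
            })
            .ToList();
    }
}
```

Because Take is applied to the per-contract groups, each contract occupies exactly one row no matter how many years it has.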

Related

c# linq OrderByDescending with inner LastOrDefault [duplicate]

This question already has answers here:
LINQ To Entities does not recognize the method Last. Really?
(6 answers)
Closed 3 years ago.
Is there a way to order, let's say, all customers by the date of the last purchase?
For example
ctx.Customers.OrderByDescending(e => e.Purchases.LastOrDefault().DateTime);
It would be something like this, however, this doesn't work. It throws the exception
LINQ to Entities does not recognize the method 'Purchase
LastOrDefault[Purchase]
(System.Collections.Generic.IEnumerable`1[Purchase])'
method, and this method cannot be translated into a store expression
edit:
public class Customer
{
public Customer()
{
Purchases = new List<Purchase>();
}
public int Id { get; set; }
public string Name { get; set; }
[JsonIgnore]
public virtual IList<Purchase> Purchases { get; set; }
}
public class Purchase
{
public int Id { get; set; }
public int IdCustomer { get; set; }
public DateTime DateTime { get; set; }
public virtual Customer Customer { get; set; }
}
In Context I do have something like
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
modelBuilder.Entity<Purchase>().HasRequired(s => s.Customer)
.WithMany(p => p.Purchases)
.HasForeignKey(s => s.IdCustomer);
}
ctx.Customers.OrderByDescending(e => e.Purchases.LastOrDefault().DateTime);
looks like a context query (Entity Framework, usually dbContext), so here you have an IQueryable not a List.
Entity Framework will try to convert this to a SQL statement before giving you results, but SQL has no BOTTOM counterpart to TOP, so something like
SELECT BOTTOM(X) * FROM TABLE ORDER BY Purchases DESC
is not valid SQL. More importantly, EF just doesn't know how to translate LastOrDefault.
Instead, you just want to flip the logic to:
SELECT TOP(X) * FROM TABLE ORDER BY Purchases ASC
Or:
ctx.Customers.OrderBy(e => e.Purchases.FirstOrDefault().DateTime);
or you can order by on your subquery:
ctx.Customers.OrderBy(e => e.Purchases.OrderByDescending(x => x.propertyToSortOn)
.FirstOrDefault().DateTime);
Getting the last n records from the bottom of a sorted list is actually the same as getting the top n from a list sorted the other way:
1,2,3,4,5,6 -> top 3 in ascending order = 1,2,3
6,5,4,3,2,1 -> bottom 3 in descending order = 3,2,1
LastOrDefault is not supported in Linq-to-Entities (meaning they have not yet developed a way to translate that to the equivalent SQL code). One option is to use AsEnumerable to do the ordering in memory:
ctx.Customers
.AsEnumerable()
.OrderByDescending(e => e.Purchases.LastOrDefault().DateTime);
However, since the order of Purchases is not deterministic, you may want to specify an order there as well:
ctx.Customers
.AsEnumerable()
.OrderByDescending(e => e.Purchases.OrderBy(p => p.DateTime).LastOrDefault().DateTime);
or just use Max on Purchases:
ctx.Customers
.AsEnumerable()
.OrderByDescending(e => e.Purchases.Max(p => p.DateTime));
If the performance of any of those queries is not acceptable, the last resort would be to write the direct SQL and pass that to ctx.Customers.SqlQuery()
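One thing to watch with the in-memory variants: in LINQ to Objects, Max over an empty sequence of DateTime throws, and LastOrDefault() returns null for a customer with no purchases. A minimal sketch of one way around that, casting to DateTime? so an empty list yields null. The `CustomerOrdering.ByLastPurchase` helper is hypothetical, and the classes are trimmed copies of the ones in the question. Whether your EF version translates the same expression to SQL without AsEnumerable is something to check against your provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Purchase
{
    public int Id { get; set; }
    public DateTime DateTime { get; set; }
}

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
    public IList<Purchase> Purchases { get; set; } = new List<Purchase>();
}

public static class CustomerOrdering
{
    // Hypothetical helper: most recent purchaser first, customers without purchases last.
    public static List<Customer> ByLastPurchase(IEnumerable<Customer> customers)
    {
        return customers
            // The DateTime? cast makes Max return null for an empty list instead of throwing.
            .OrderByDescending(c => c.Purchases.Max(p => (DateTime?)p.DateTime))
            .ToList();
    }
}
```

The default comparer treats null as smaller than any date, so in descending order the customers with no purchases end up at the bottom.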

What's the best way to sort about 2.5 million records in memory in c#?

Consider I have a class
class Employee
{
public string Id { get; set; }
public string Type { get; set; }
public string Identifier { get; set; }
public object Resume { get; set; }
public DateTime StartDate { get; set; }
public DateTime EndDate { get; set; }
}
List<Employee> employees = LoadEmployees(); //Around 2.5 million to 3 millions employees
employees = employees
.Where(x => x.Identifier != null)
.OrderBy(x => x.Identifier)
.ToArray();
I have a requirement where I want to load and sort around 2.5 million employees in memory but the Linq query gets stuck on the OrderBy clause. Any pointers on this? I have created this Employee class just to simplify my problem.
I would keep the .Where(x => x.Identifier != null) clause first, as you already do, since it filters out some data before the OrderBy runs. Given that you have only ~2.5 million records and that they hold only basic types like string and DateTime, you should not have any memory problems in this case.
Edit:
I have just run your code as a sample and it is indeed a matter of seconds (a little over 15 seconds on my machine, which does not have a very powerful CPU, but still, it does not get stuck):
List<Employee> employees = new List<Employee>();
for(int i=0;i<2500000;i++)
{
employees.Add(new Employee
{
Id = Guid.NewGuid().ToString(),
Identifier = Guid.NewGuid().ToString(),
Type = i.ToString(),
StartDate = DateTime.MinValue,
EndDate = DateTime.Now
});
}
var newEmployees = employees
.Where(x => x.Identifier != null)
.OrderBy(x => x.Identifier)
.ToArray();
As a second edit, I have just run some tests, and it seems that an implementation using Parallel LINQ can in some cases be about 1.5 seconds faster than the serial implementation:
var newEmployees1 = employees.AsParallel()
.Where(x => x.Identifier != null)
.OrderBy(x => x.Identifier)
.ToArray();
And these are the best numbers that I got:
7599 //serial implementation
5752 //parallel linq
But the parallel results can vary from one machine to another, so I suggest running some tests yourself; if you still have a problem, edit the question or post another one.
Using the hint that @Igor proposed in the comment below, the parallel implementation with StringComparer.OrdinalIgnoreCase is about three times faster than the simple parallel implementation. The final (fastest) code looks like this:
var newEmployees2 = employees.AsParallel()
.Where(x => x.Identifier != null)
.OrderBy(x => x.Identifier, StringComparer.OrdinalIgnoreCase)
.ToArray();
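A likely reason the comparer matters: without one, OrderBy on a string key uses the default comparer, which is culture-sensitive, while the ordinal comparers work on raw character values. Keep in mind that this also changes the resulting order, not just the speed. Here is a small sketch that takes the comparer as a parameter so the variants can be timed side by side (for example with Stopwatch). `SortDemo.SortByIdentifier` is a hypothetical helper, and Employee is trimmed down to the fields that matter here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public string Id { get; set; }
    public string Identifier { get; set; }
}

public static class SortDemo
{
    // Hypothetical helper: drop null identifiers, then sort in parallel with an explicit comparer.
    public static Employee[] SortByIdentifier(IEnumerable<Employee> employees, IComparer<string> comparer)
    {
        return employees
            .AsParallel()
            .Where(e => e.Identifier != null)
            .OrderBy(e => e.Identifier, comparer)
            .ToArray();
    }
}
```

Calling it with StringComparer.Ordinal, StringComparer.OrdinalIgnoreCase and StringComparer.CurrentCulture on the same list should show how much of the time goes into the comparison itself on your machine.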

Get all records with inner record from last month - MongoDB C# SDK

I'm trying to get all the users that had any kind of activity in the last month from my MongoDB database, using C# SDK.
My User record contains a list of statistical records (as ObjectId) with creation date.
public class UserRecord
{
private string firstName;
public ObjectId Id
{
get;
set;
}
public List<ObjectId> Statistics
{
get;
set;
}
}
And my query builder function looks like this:
static IMongoQuery GenerateGetLiveUsersQuery(DateTime lastStatistic)
{
List<IMongoQuery> queries = new List<IMongoQuery>();
queries.Add((Query<UserRecord>.GTE(U =>
U.Statistics.OrderByDescending(x => x.CreationTime).First().CreationTime
, lastStatistic)));
///.... More "queries.Add"...
return Query.And(queries);
}
Not sure what I'm doing wrong, but I get a System.NotSupportedException error message while trying to build the query (the GTE query).
Unable to determine the serialization information for the expression:
(UserRecord U) =>
Enumerable.First(Enumerable.OrderByDescending(U.Statistics, (ObjectId x) => x.CreationTime)).CreationTime.
The following query should work
var oId = new ObjectId(lastStatistic, 1, 1, 1);
var query = Query<UserRecord>.GTE(e => e.Statistics, oId);
You can create an ObjectId from the lastStatistic date, which is your cut-off. Then you can just query the UserRecord collection for any records that have an item in the Statistics list that is greater than or equal to that ObjectId.
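This works because of the ObjectId layout: the first four bytes are a big-endian timestamp in seconds since the Unix epoch, so ids compare in creation-time order at one-second granularity. It only helps if the statistic ids were generated when the statistics were created. A driver-free sketch of building the smallest possible id for a given instant follows; `ObjectIdBound.LowerBoundHex` is a hypothetical helper, not part of the MongoDB SDK.

```csharp
using System;

public static class ObjectIdBound
{
    // Hypothetical helper: hex form of the smallest ObjectId whose timestamp is the given instant.
    // A DateTime whose Kind is not Utc is converted with ToUniversalTime (treated as local time).
    public static string LowerBoundHex(DateTime cutOff)
    {
        long seconds = new DateTimeOffset(cutOff.ToUniversalTime()).ToUnixTimeSeconds();
        // 4-byte timestamp as 8 hex digits, followed by 8 zero bytes for the rest of the id.
        return ((uint)seconds).ToString("x8") + new string('0', 16);
    }
}
```

Any id generated at or after the cut-off compares greater than or equal to this value, which is what the GTE query above relies on.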

Remove all but the highest value from a List grouped by another property using LINQ

I have a couple of classes that look like this (simplified for SO):
public class Result
{
List<Review> Reviews { get; set; }
}
public class Review
{
public string Name { get; set; }
public string Amount { get; set; }
}
So the net result is we have a "Result" object that has a List<> of Review objects, each of which has a name and an amount. The "Name" property can be duplicated one or more times within this list.
Given a "Reviews" property that has several Review objects with duplicate names, I want to use LINQ to remove the "Review" objects with duplicate names from the list that have the lowest values (in other words, I want to remove all Review objects from the list EXCEPT for the one with the highest value).
So if my list looks like this:
Name Amount
----------------
A 2
B 3
C 1
A 1
B 4
I want to use the LINQ Remove function on my list so that my end result is:
Name Amount
----------------
A 2
C 1
B 4
Any suggestions on how to do this using LINQ? I'm looking for solutions on my own as well, just figured I'd post here to see if it's faster than figuring it out on my own. :)
Here is a way you can do it:
var endResult = reviewList.OrderByDescending(e => e.Amount)
.GroupBy(e => e.Name)
.Select(g => g.First());
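Two caveats with the query above. Amount is declared as a string in the question, so ordering by it directly is alphabetical ("9" sorts ahead of "10"). Also, LINQ does not remove anything from the original list; you get a new sequence and assign it back. A minimal sketch that parses the amount before comparing, assuming every Amount is a valid integer; `ReviewFilter.KeepHighestPerName` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public class Review
{
    public string Name { get; set; }
    public string Amount { get; set; }
}

public static class ReviewFilter
{
    // Hypothetical helper: one review per name, the one with the highest numeric amount.
    public static List<Review> KeepHighestPerName(IEnumerable<Review> reviews)
    {
        return reviews
            .GroupBy(r => r.Name)
            // Assumption: Amount always parses as an integer; int.Parse throws otherwise.
            .Select(g => g.OrderByDescending(r => int.Parse(r.Amount, CultureInfo.InvariantCulture)).First())
            .ToList();
    }
}
```

Usage would be something like result.Reviews = ReviewFilter.KeepHighestPerName(result.Reviews); to replace the list in place of a Remove call.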

How do I construct a LINQ with multiple GroupBys?

I have an entity that looks like this:
public partial class MemberTank
{
public int Id { get; set; }
public int AccountId { get; set; }
public int Tier { get; set; }
public string Class { get; set; }
public string TankName { get; set; }
public int Battles { get; set; }
public int Victories { get; set; }
public System.DateTime LastUpdated { get; set; }
}
A tiny sample of the data:
Id AccountId Tier Class TankName Battles Victories
--- --------- ---- ----- --------- ------- ----------
1 432423 5 Heavy KV 105 58
2 432423 6 Heavy IS 70 39
3 544327 5 Heavy KV 200 102
4 325432 7 Medium KV-13 154 110
5 432423 7 Medium KV-13 191 101
Ultimately I am trying to get a result that is a list of tiers, within the tiers is a list of classes, and within the class is a distinct grouping of the TankName with the sums of Battles and Victories.
Is it possible to do all this in a single LINQ statement? Or is there another way to easily get the result? (I know I can easily loop through the DbSet several times to produce the list I want; I am hoping for a more efficient way of getting the same result with LINQ.)
This should do it:
var output = from mt in MemberTanks
group mt by new { mt.Tier, mt.Class, mt.TankName } into g
select new { g.Key.Tier,
g.Key.Class,
g.Key.TankName,
Fights = g.Sum(mt => mt.Battles),
Wins = g.Sum(mt => mt.Victories)
};
You could also use method syntax. This should give you the same result as @TheEvilGreebo's answer:
var result = memberTanks.GroupBy(x => new {x.Tier, x.Class, x.TankName})
.Select(g => new { g.Key.Tier,
g.Key.Class,
g.Key.TankName,
Fights = g.Sum(mt => mt.Battles),
Wins = g.Sum(mt=> mt.Victories)
});
Which syntax you use comes down to preference.
Remove the .Select to return the IGrouping, which will let you enumerate the groups:
var result = memberTanks.GroupBy(x => new {x.Tier, x.Class, x.TankName});
I kept trying to get useful results out of The Evil Greebo's answer. While the answer does yield results (after fixing the compilation issues mentioned in responses), it doesn't give me what I was really looking for (meaning I didn't explain myself well enough in the question).
Feanz left a comment in my question to check out the MS site with LINQ examples and, even though I thought I had looked there before, this time I found their example of nested group bys and I tried it their way. The following code gives me exactly what I was looking for:
var result = from mt in db.MemberTanks
group mt by mt.Tier into tg
select new
{
Tier = tg.Key,
Classes = from mt in tg
group mt by mt.Class into cg
select new
{
Class = cg.Key,
TankTypes = from mt in cg
group mt by mt.TankName into tng
select new
{
TankName = tng.Key,
Battles = tng.Sum(mt => mt.Battles),
Victories = tng.Sum(mt => mt.Victories),
Count = tng.Count()
}
}
};
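For comparison, here is a sketch of the same tier / class / tank nesting in method syntax, run against in-memory data and materialized into dictionaries so it is easy to look things up. The `TankTotals` type and `TankReport.Build` are assumptions for illustration, and the sketch assumes Class is never null (ToDictionary rejects null keys).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class MemberTank
{
    public int Tier { get; set; }
    public string Class { get; set; }
    public string TankName { get; set; }
    public int Battles { get; set; }
    public int Victories { get; set; }
}

// Hypothetical result type: totals for one tank name within a tier and class.
public class TankTotals
{
    public string TankName { get; set; }
    public int Battles { get; set; }
    public int Victories { get; set; }
}

public static class TankReport
{
    // Tier -> Class -> per-tank totals.
    public static Dictionary<int, Dictionary<string, List<TankTotals>>> Build(IEnumerable<MemberTank> tanks)
    {
        return tanks
            .GroupBy(t => t.Tier)
            .ToDictionary(
                tg => tg.Key,
                tg => tg.GroupBy(t => t.Class)
                        .ToDictionary(
                            cg => cg.Key,
                            cg => cg.GroupBy(t => t.TankName)
                                    .Select(ng => new TankTotals
                                    {
                                        TankName = ng.Key,
                                        Battles = ng.Sum(t => t.Battles),
                                        Victories = ng.Sum(t => t.Victories)
                                    })
                                    .ToList()));
    }
}
```

With the sample rows from the question, report[5]["Heavy"] would hold a single KV entry with the battles and victories of both KV rows summed.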
I'll leave the answer by Mr. Greebo checked as most people will likely get the best results from that.
