More Elegant LINQ Alternative to Foreach Extension - c#

This is purely to improve my skill. My solution works fine for the primary task, but it's not "neat". I'm currently working on a .NET MVC project with Entity Framework. I know only the basic, single LINQ functions, which have sufficed over the years. Now I'd like to learn how to get fancy.
So I have two models
public class Server
{
[Key]
public int Id { get; set; }
public string InstanceCode { get; set; }
public string ServerName { get; set; }
}
public class Users
{
[Key]
public int Id { get; set; }
public string Name { get; set; }
public int ServerId { get; set; } //foreign key relationship
}
In one of my view models I was asked to provide a dropdown list for selecting a server when creating a new user. The dropdown list is populated with a text and an Id value, exposed as an IEnumerable<SelectListItem>.
Here's my original property for dropdown list of servers
public IEnumerable<SelectListItem> ServerItems
{
get { return Servers.ToList().Select(s => new SelectListItem { Value = s.Id.ToString(), Text = $"{s.InstanceCode}#{s.ServerName}" }); }
}
Update on requirements, now I need to display how many users are related to each server selection. Ok no problem. Here's what I wrote off the top of my head.
public IEnumerable<SelectListItem> ServerItems
{
get
{
var items = new List<SelectListItem>();
Servers.ToList().ForEach(x => {
var count = Users.ToList().Where(t => t.ServerId == x.Id).Count();
items.Add(new SelectListItem { Value = x.Id.ToString(), Text = $"{x.InstanceCode}#{x.ServerName} ({count} users on)" });
});
return items;
}
}
This gets my result, let's say "localhost#rvrmt1u (8 users on)", but that's it.
What if I want to sort this dropdown list by user count. All I'm doing is another variable in the string.
TLDR ... I'm sure that someone somewhere can teach me a thing or two about converting this to a LINQ Query and making it look nicer. Also bonus points for knowing how I could sort the list to show servers with the most users on it first.

OK, we have this mess:
var items = new List<SelectListItem>();
Servers.ToList().ForEach(x => {
var count = Users.ToList().Where(t => t.ServerId == x.Id).Count();
items.Add(new SelectListItem { Value = x.Id.ToString(), Text = $"{x.InstanceCode}#{x.ServerName} ({count} users on)" });
});
return items;
Make a series of small, careful, obviously-correct refactorings that gradually improve the code.
Start with: Let's abstract those complicated operations to their own methods.
Note that I've replaced the unhelpful x with the helpful server.
int UserCount(Server server) =>
Users.ToList().Where(t => t.ServerId == server.Id).Count();
Why on earth is there a ToList on Users? That looks wrong.
int UserCount(Server server) =>
Users.Where(t => t.ServerId == server.Id).Count();
We notice that there is a built-in method that does these two operations together:
int UserCount(Server server) =>
Users.Count(t => t.ServerId == server.Id);
And similarly for creating an item:
SelectListItem CreateItem(Server server, int count) =>
new SelectListItem
{
Value = server.Id.ToString(),
Text = $"{server.InstanceCode}#{server.ServerName} ({count} users on)"
};
And now our property body is:
var items = new List<SelectListItem>();
Servers.ToList().ForEach(server =>
{
var count = UserCount(server);
items.Add(CreateItem(server, count));
});
return items;
Already much nicer.
Never use ForEach as a method if you're just going to pass a lambda body! There's already a built-in mechanism in the language that does it better! There is no reason to write items.ForEach(item => {...}); when you could simply write foreach(var item in items) { ... }. It's simpler and easier to understand and debug, and the compiler can optimize it better.
var items = new List<SelectListItem>();
foreach (var server in Servers.ToList())
{
var count = UserCount(server);
items.Add(CreateItem(server, count));
}
return items;
Much nicer.
Why is there a ToList on Servers? Completely unnecessary!
var items = new List<SelectListItem>();
foreach(var server in Servers)
{
var count = UserCount(server);
items.Add(CreateItem(server, count));
}
return items;
Getting better. We can eliminate the unnecessary variable.
var items = new List<SelectListItem>();
foreach(var server in Servers)
items.Add(CreateItem(server, UserCount(server)));
return items;
Hmm. This gives us an insight that CreateItem could be doing the count itself. Let's rewrite it.
SelectListItem CreateItem(Server server) =>
new SelectListItem
{
Value = server.Id.ToString(),
Text = $"{server.InstanceCode}#{server.ServerName} ({UserCount(server)} users on)"
};
Now our prop body is
var items = new List<SelectListItem>();
foreach(var server in Servers)
items.Add(CreateItem(server));
return items;
And this should look familiar. We have re-invented Select and ToList:
var items = Servers.Select(server => CreateItem(server)).ToList();
Now we notice that the lambda can be replaced with the method group:
var items = Servers.Select(CreateItem).ToList();
And we have reduced that whole mess to a single line that clearly and unambiguously looks like what it does. What does it do? It creates an item for every server and puts them in a list. The code should read like what it does, not how it does it.
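To see the end result in one piece, here is a minimal sketch that should compile outside an MVC project. The ServerDropdown class, the User type name and the SelectListItem stand-in are assumptions made only for this illustration; in the real project SelectListItem comes from MVC and Servers and Users come from the view model.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Server
{
    public int Id { get; set; }
    public string InstanceCode { get; set; }
    public string ServerName { get; set; }
}

public class User
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int ServerId { get; set; }
}

// Stand-in for the MVC SelectListItem (assumption: only these two properties are needed).
public class SelectListItem
{
    public string Value { get; set; }
    public string Text { get; set; }
}

// Hypothetical container playing the role of the view model.
public class ServerDropdown
{
    private readonly List<Server> servers;
    private readonly List<User> users;

    public ServerDropdown(List<Server> servers, List<User> users)
    {
        this.servers = servers;
        this.users = users;
    }

    // Counts the users that point at the given server.
    private int UserCount(Server server) =>
        users.Count(u => u.ServerId == server.Id);

    // Builds one dropdown entry, including the user count in the text.
    private SelectListItem CreateItem(Server server) =>
        new SelectListItem
        {
            Value = server.Id.ToString(),
            Text = $"{server.InstanceCode}#{server.ServerName} ({UserCount(server)} users on)"
        };

    // The whole property body is now a single Select plus ToList.
    public List<SelectListItem> ServerItems =>
        servers.Select(CreateItem).ToList();
}
```

With one server "localhost"/"rvrmt1u" and two users pointing at it, ServerItems holds a single entry whose text is "localhost#rvrmt1u (2 users on)".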
Study the techniques I used here carefully.
Extract complex code to helper methods
Replace ForEach with real loops
Eliminate unnecessary ToLists
Revisit earlier decisions when you realize there's an improvement to be made
Recognize when you are re-implementing simple helper methods
Don't stop with one improvement! Each improvement makes it possible to do another.
What if I want to sort this dropdown list by user count?
Then sort it by user count! We abstracted that away into a helper method, so we can use it:
var items = Servers
.OrderBy(UserCount)
.Select(CreateItem)
.ToList();
We now notice that we're calling UserCount twice. Do we care? Maybe. It could be a perf problem to call it twice, or, horrors, it might not be idempotent! If either is a problem then we need to undo a decision we made before. It's easier to deal with this situation in comprehension mode rather than fluent mode, so let's rewrite as a comprehension:
var query = from server in Servers
orderby UserCount(server)
select CreateItem(server);
var items = query.ToList();
Now we go back to our earlier:
SelectListItem CreateItem(Server server, int count) => ...
and now we can say
var query = from server in Servers
let count = UserCount(server)
orderby count
select CreateItem(server, count);
var items = query.ToList();
and we are only calling UserCount once per server.
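A small sketch of the let version over in-memory data, to check the "once per server" claim. Everything here is made up for the illustration: LetClauseDemo, the Calls counter, and the plain int ids standing in for Server and User objects. It sorts descending, which is what the "most users first" bonus question asks for.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LetClauseDemo
{
    // Hypothetical counter, present only to observe how often the helper runs.
    public static int Calls;

    // Stand-in for UserCount: userServerIds holds one server id per user.
    private static int UserCount(int serverId, List<int> userServerIds)
    {
        Calls++;
        return userServerIds.Count(id => id == serverId);
    }

    public static List<string> Build(IEnumerable<int> serverIds, IEnumerable<int> userServerIds)
    {
        var userIds = userServerIds.ToList();

        // "let" stores the count once; orderby and select both reuse it.
        var query = from serverId in serverIds
                    let count = UserCount(serverId, userIds)
                    orderby count descending
                    select $"server {serverId} ({count} users on)";

        return query.ToList();
    }
}
```

Calling Build with servers 1, 2, 3 and users on servers 1, 2, 2, 3, 3, 3 should put server 3 first and leave Calls at 3, one call per server.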
Why go back to comprehension mode? Because to do this in fluent mode makes a mess:
var query = Servers
.Select(server => new { server, count = UserCount(server) })
.OrderBy(pair => pair.count)
.Select(pair => CreateItem(pair.server, pair.count))
.ToList();
And it looks a little ugly. (In C# 7 you could use a tuple instead of an anonymous type, but the idea is the same.)
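For reference, the tuple variant might look roughly like this. It is a sketch over in-memory sequences; TupleProjectionDemo and the (Id, Name) tuple standing in for Server are assumptions. One caveat worth keeping in mind: tuple literals are not allowed inside expression trees, so this form suits IEnumerable sources rather than a query that a provider has to translate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TupleProjectionDemo
{
    public static List<string> Build(
        IEnumerable<(int Id, string Name)> servers,
        IEnumerable<int> userServerIds)
    {
        var userIds = userServerIds.ToList();

        return servers
            // Pair each server with its count using a named tuple instead of an anonymous type.
            .Select(server => (server, count: userIds.Count(id => id == server.Id)))
            .OrderBy(pair => pair.count)
            .Select(pair => $"{pair.server.Name} ({pair.count} users on)")
            .ToList();
    }
}
```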

The trick with LINQ is just to type return and go from there. Don't create a list and add items to it; there is usually a way to select it all in one go.
public IEnumerable<SelectListItem> ServerItems
{
get
{
return Servers.Select
(
server =>
new
{
Server = server,
UserCount = Users.Count( u => u.ServerId == server.Id )
}
)
.Select
(
item =>
new SelectListItem
{
Value = item.Server.Id.ToString(),
Text = string.Format
(
"{0}#{1} ({2} users on)",
item.Server.InstanceCode,
item.Server.ServerName,
item.UserCount
)
}
);
}
}
In this example there are actually two Select statements-- one to extract the data, and one to do the formatting. In an ideal situation the logic for those two tasks would be separated into different layers, but this is an OK compromise.

Related

EF 6 - Performance of GroupBy

I don't currently have a problem, but I want to make sure that the performance is not too shabby for my use case. Searching Microsoft's documentation turned up nothing.
I have an entity named Reservation. I now want to add some statistics to the program, where I can see some metrics about the reservations (reservations per month and the favorite spot/seat in particular).
Therefore, my first approach was the following:
public async Task<ICollection<StatisticElement<Seat>>> GetSeatUsage(Company company)
{
var allReservations = await this.reservationService.GetAll(company);
return await this.FetchGroupedSeatData(allReservations, company);
}
public async Task<ICollection<StatisticElement<DateTime>>> GetMonthlyReservations(Company company)
{
var allReservations = await this.reservationService.GetAll(company);
return this.FetchGroupedReservationData(allReservations);
}
private async Task<ICollection<StatisticElement<Seat>>> FetchGroupedSeatData(
IEnumerable<Reservation> reservations,
Company company)
{
var groupedReservations = reservations.GroupBy(r => r.SeatId).ToList();
var companySeats = await this.seatService.GetAll(company);
return (from companySeat in companySeats
let groupedReservation = groupedReservations.FirstOrDefault(s => s.Key == companySeat.Id)
select new StatisticElement<Seat>()
{
Value = companySeat,
StatisticalCount = groupedReservation?.Count() ?? 0,
}).OrderByDescending(s => s.StatisticalCount).ToList();
}
private ICollection<StatisticElement<DateTime>> FetchGroupedReservationData(IEnumerable<Reservation> reservations)
{
var groupedReservations = reservations.GroupBy(r => new { Month = r.Date.Month, Year = r.Date.Year }).ToList();
return groupedReservations.Select(
groupedReservation => new StatisticElement<DateTime>()
{
Value = new DateTime(groupedReservation.Key.Year, groupedReservation.Key.Month, 1),
StatisticalCount = groupedReservation.Count(),
}).
OrderBy(s => s.Value).
ToList();
}
To explain the code a little: GetSeatUsage and GetMonthlyReservations return the data mentioned above for a company. To do that, I first fetch ALL reservations (with reservationService.GetAll); this is the point where I think performance will become a problem in the future.
Afterwards, I call either FetchGroupedSeatData or FetchGroupedReservationData, which groups the reservations I previously fetched from the database and then converts them into a format I can use.
As I said, I think grouping after I have read ALL the data from the database MIGHT be a problem, but I cannot find any information regarding performance in the documentation.
My other idea was to create a new method in my ReservationService that returns the already grouped list. But, again, I can't find out whether EF adds the GroupBy to the DB query or applies it only after all of the data has been read from the database. This method would look something like this:
return await this.Context.Set<Reservation>().Where(r => r.User.CompanyId == company.Id).GroupBy(r => r.SeatId).ToListAsync();
Is this already the solution? Where can I check that? Am I missing something completely obvious?
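One direction that may be worth trying, sketched here against an in-memory IQueryable because the real context is not shown: keep the source as IQueryable and project each group straight to its key and count. As far as I know, EF6 turns a GroupBy followed by an aggregate projection into a grouped SQL query, whereas grouping a collection that has already been loaded always runs in memory. Setting Database.Log on the context (for example to Console.Write) prints the generated SQL, which is one way to check what actually reaches the database. ReservationStats and CountPerSeat are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Reservation
{
    public int SeatId { get; set; }
    public DateTime Date { get; set; }
}

public static class ReservationStats
{
    // Takes IQueryable so that, with a real EF query as the source, the grouping
    // can stay part of the query instead of running over already-loaded entities.
    public static Dictionary<int, int> CountPerSeat(IQueryable<Reservation> reservations) =>
        reservations
            .GroupBy(r => r.SeatId)
            // Project to key + count only, so no full entities need to be materialized.
            .Select(g => new { SeatId = g.Key, Count = g.Count() })
            .ToDictionary(x => x.SeatId, x => x.Count);
}
```

The resulting dictionary could then replace the FirstOrDefault lookup per seat in FetchGroupedSeatData: seats that have no entry simply get a count of 0.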

Using a list property to filter a dbset with LINQ

Say I have an event that has a list of staff tasks:
public class Event
{
public Guid? StaffId { get; set; }
}
public class StaffTask
{
public Guid StaffId { get; set; }
public Guid TaskId { get; set; }
}
How would I do something like this where I get all the events for a list of staff members?
var staffTasks = new List<StaffTask>()
{
new StaffTask () { StaffId = "guid1", TaskId = "guid2" },
new StaffTask () { StaffId = "guid3", TaskId = "guid4" }
};
queryable = _db.Events.AsQueryable()
.Where(e =>
staffTasks.Any(st => st.StaffId == e.StaffId)
);
I currently get this error when running the above:
The LINQ expression 'DbSet<Event>()
.Where(e => __staffTasks
.Any(or => (Nullable<Guid>)or.StaffId == e.StaffId))' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to 'AsEnumerable', 'AsAsyncEnumerable', 'ToList', or 'ToListAsync'.
The goal would be to have this return only the second and third event here
var events = new List<Event>() {
new Event() { StaffId = null },
new Event() { StaffId = "guid1" },
new Event() { StaffId = "guid2" },
new Event() { StaffId = "guid20" },
new Event() { StaffId = null }
}
I'm still not sure why @Prasad's answer didn't work
EF wants to take the C# you provide and make an SQL out of it. It knows how to do some things, but not everything
When you have a pattern of "collection in c#, search in column for value within the collection" EF will want to make an SQL like WHERE id IN(value1,value2..) but critically it won't go digging and running complex projections to get that list of values
Any will work, but (as far as I know) only on collections that are just the type of the value being searched. This means projecting your StaffTasks to a simple Guid? collection would also work as an Any:
var staff = staffTasks.Select(st => (Guid?)st.StaffId).ToArray();
_db.Events
.Where(e => staff.Any(st => e.StaffId == st));
EF can translate this to an IN like it can Contains, but the reason it's probably just better to remember "don't use Any, use Contains" is because Contains is much better at causing a compile error if you do something EF won't tolerate.
This wouldn't compile (note I've used staffTasks.Contains):
_db.Events
.Where(e => staffTasks.Contains(e.StaffId));
So Contains automatically guides you towards using a list of primitives whereas Any makes you think in LINQ "I'll just pull the prop I want in the lambda" mode and write:
_db.Events
.Where(e => staffTasks.Any(st => e.StaffId == st.SomeProp));
This would compile in c# but won't translate to EF because EF would have to run the projection to get the values it wants to put in the IN. It also tries to get away with doing a nullable<T> == T here (StaffTask and Event have different types for StaffId), which is legal C# in a locally evaluated Any, but another thing that EF doesn't translate
--
So, as it ends up, your answer was translated as COALESCE(event.StaffId, '00000000-0000-0000-0000-000000000000') IN ('guid1', 'guid3') (I guess that 3's a typo because your sample data is guid2), which is fine
If you'd type-matched the list so it was full of Guid? you could have dropped the ??Guid.Empty and it would have translated as event.StaffId IN ('guid1', 'guid3') which is also fine (it would discard the null StaffId on events) and actually possibly faster as the COALESCE could preclude the use of an index
And if you'd used a list of Guid? with Any, it also would have worked.
..but generally if you use Contains you will have these things work out first time more often because the way Contains demands you supply things is more often in line with how EF needs to receive things
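Put together, the Contains form with a type-matched list could look like the sketch below. It runs against an in-memory IQueryable, so it only demonstrates the shape of the query, not the SQL a provider would generate; EventFilter and ForStaff are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Event
{
    public Guid? StaffId { get; set; }
}

public class StaffTask
{
    public Guid StaffId { get; set; }
    public Guid TaskId { get; set; }
}

public static class EventFilter
{
    public static List<Event> ForStaff(IQueryable<Event> events, IEnumerable<StaffTask> staffTasks)
    {
        // Flatten to a simple list whose element type matches Event.StaffId (Guid?),
        // so no projection and no null-coalescing is needed inside the query.
        List<Guid?> staffIds = staffTasks.Select(st => (Guid?)st.StaffId).ToList();

        // Events with a null StaffId never match because null is not in the list.
        return events.Where(e => staffIds.Contains(e.StaffId)).ToList();
    }
}
```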
This seemed to get the job done, though I'm still not sure why @Prasad's answer didn't work
var staffTasks = new List<StaffTask>()
{
new StaffTask () { StaffId = "guid1", TaskId = "guid2" },
new StaffTask () { StaffId = "guid3", TaskId = "guid4" }
};
var staff = staffTasks.Select(st => st.StaffId).ToList();
queryable = _db.Events.AsQueryable()
.Where(e => staff.Contains(e.StaffId ?? Guid.Empty));

How do I copy projected results into another variable in c#?

The code below won't compile: it complains that I am trying to add an anonymous type to a list of Clients. How can I store certain results in another variable after projecting them and thereby losing the original type?
(PS. I have made my example simple but am actually dealing with a more complex case. Not projecting is not an option in my actual case. Edited to provide clarification.)
var clients = Clients.Where(c => c.FirstName.StartsWith("Mark"))
.Select(c => new {
LastName = c.LastName.ToUpper(),
c.DateAdded,
c.FirstName,
})
.ToList();
var certainClients = new List<Clients> { };
foreach (var client in clients)
{
if(client.DateAdded.Date < DateTime.Today) {
certainClients.Add(client);
}
}
certainClients.Dump();
There are two options.
First, instead of using an anonymous type, project to the Clients type, since in effect you are creating Clients objects:
var clients = Clients.Where(c => c.FirstName.StartsWith("Mark"))
.Select(c => new Clients {
LastName = c.LastName.ToUpper(),
DateAdded = c.DateAdded,
FirstName = c.FirstName,
})
.ToList();
Second, create a list of object and add whatever custom or anonymous type you like to it:
var certainClients = new List<object> { };
The best way is to project to a custom business entity class.
This means that we actually define the class ourselves. For example.
public class ClientEntity
{
public string LastName;
public DateTime DateAdded;
// etc for custom fields or properties you want
}
Then we can simply project to our custom made class
var clients = Clients.Where(c => c.FirstName.StartsWith("Mark"))
.Select(c => new ClientEntity{
LastName = c.LastName.ToUpper(),
DateAdded = c.DateAdded,
// etc. for the other members you need
})
.ToList();
This avoids List<object>, which is not type safe, and the custom class doesn't need to mirror the original client class; for example, it could hold the length of the original name instead of the name itself.
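A compact sketch of that approach, including the date filter from the question. The Client and ClientProjection names and the today parameter are assumptions for the example (the parameter just makes the cutoff explicit instead of reading DateTime.Today inside the method).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Client
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime DateAdded { get; set; }
}

// Custom projection target; it only carries what the later code needs.
public class ClientEntity
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public DateTime DateAdded { get; set; }
}

public static class ClientProjection
{
    public static List<ClientEntity> AddedBefore(IEnumerable<Client> clients, DateTime today)
    {
        // Project once to a named type...
        var projected = clients
            .Where(c => c.FirstName.StartsWith("Mark"))
            .Select(c => new ClientEntity
            {
                FirstName = c.FirstName,
                LastName = c.LastName.ToUpper(),
                DateAdded = c.DateAdded
            })
            .ToList();

        // ...so the filtered results fit in a strongly typed list.
        return projected.Where(c => c.DateAdded.Date < today.Date).ToList();
    }
}
```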

Method for finding most often occurring item in a list

Ok, so I have a project about a carpark. Long story short I need a method to find most often occurring model in a data list. Any good ways to approach this (Using VST 2012)
private static int FilterbyModel(string Model, List<Car> cars)
{
int modelCount = 0;
foreach (Car s in cars)
{
// Count every car whose model matches the requested one.
if (s.Modelis == Model)
{
modelCount++;
}
}
return modelCount;
}
Above method would work, but I need to provide a specific model to look for rather than to just find the most common one.
Simplest way to count occurrences is to group by model and then count each group's elements. Here I give an example with LINQ:
var carsByModel = cars.GroupBy(x => x.Model)
.Select(x => new { Model = x.Key, Count = x.Count() });
Now if you want to order them and pick the most common one:
var mostCommonCar = carsByModel.OrderByDescending(x => x.Count).First();
If, for example, you need to print the two most common models:
foreach (var model in carsByModel.OrderByDescending(x => x.Count).Take(2))
Console.WriteLine($"{model.Model}: {model.Count}");
Or the two least common:
foreach (var model in carsByModel.OrderBy(x => x.Count).Take(2))
Console.WriteLine($"{model.Model}: {model.Count}");
If in that count you're interested in one specific model you may do this (note that I do omit StringComparer in this example but I always suggest to use it in production code):
carsByModel.First(x => x.Model.Equals("modelYouWant"));
However note that if you just need to count occurrences of a specific model then this is faster and simpler:
int occurencesOfModelIWant = cars.Count(x => x.Model.Equals("modelYouWant"));
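As a self-contained sketch of the grouping approach, this time with a StringComparer so that "Golf" and "golf" count as the same model. The Car class here is a stand-in with a single Model property, and CarStats is a hypothetical helper; the choice of OrdinalIgnoreCase is an assumption about what "same model" should mean.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Car
{
    public string Model { get; set; }
}

public static class CarStats
{
    // Returns the model that occurs most often, or null when the list is empty.
    public static string MostCommonModel(IEnumerable<Car> cars) =>
        cars.GroupBy(c => c.Model, StringComparer.OrdinalIgnoreCase)
            .OrderByDescending(g => g.Count())
            .Select(g => g.Key)
            .FirstOrDefault();
}
```

If two models are tied, this returns whichever group was formed first, because OrderByDescending is a stable sort.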

Sorting data issue

So I have a little issue in sorting some data I have. In a Telerik Grid, I have a column called Requestor that displays the name of a person or Unit (group of people). The problem is, Requestor has two sources it can get its data from. Here are the two sources.
1.) RequestorId: This is a foreign key to a table called Customer. Here, I store all the data for the user, including their full name. This field can be null btw.
2.) UnitId: This is another foreign key to a table called Units. Here, I store all the data for the Units, particularly their names. This field can be null btw.
Here is the logic:
//Entity class that contains all the data for my grid
var purchaseOrders = _purchaseOrders.GetPurchaseOrders();
//Key is Id of PurchaseOrders, Value is name of requestor
var dictionary = new Dictionary<int, string>();
foreach (var purchaseOrder in purchaseOrders)
{
if (purchaseOrder.RequestorId != null)
dictionary.Add(purchaseOrder.Id, purchaseOrder.Requestor.FullName);
else
dictionary.Add(purchaseOrder.Id, purchaseOrder.Unit.FullName);
}
// Re-create the dictionary with its entries added in name order.
dictionary = dictionary.OrderBy(x => x.Value).ToDictionary(x => x.Key, x => x.Value);
var tempPurchaseOrders = new List<PurchaseOrder>();
foreach (var item in dictionary)
{
tempPurchaseOrders.Add(purchaseOrders.Where(x => x.Id == item.Key).FirstOrDefault());
}
purchaseOrders = tempPurchaseOrders.AsQueryable();
return purchaseOrders;
This logic returns an ordered list based on what I want to do; however, the problem is the amount of time it takes to process. It takes 1 minute to process. That's horrible, obviously. Is there any way to optimize this? I cut down the source after I return it for the grid because there is no logical way to really cut it down beforehand.
Any help would be appreciated. Thanks.
Edit: I found out I no longer am required to use the RequestName field. That limits the data to two areas now. Still a minute to process though.
Did you try something like this:
return _purchaseOrders.GetPurchaseOrders().Select(i => new
{
OrderColumn = i.RequestorId != null ? i.Requestor.FullName : i.Unit.FullName,
// map other columns
})
.OrderBy(i => i.OrderColumn);
A bit like Sławomir Rosiek's solution (but entity framework won't accept that statement):
return _purchaseOrders.GetPurchaseOrders()
.OrderBy(o => o.Unit.FullName).ToList();
(since you don't use RequestName anymore).
Especially when GetPurchaseOrders() is an IQueryable from EF you delegate the sorting to the database engine because the sort expression becomes part of the SQL statement.
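If both sources still matter, the same idea can be written with the conditional directly in the sort key. This is only a sketch over an in-memory IQueryable with stand-in classes (Customer, Unit, PurchaseOrder and PurchaseOrderSorting are assumptions shaped after the question); whether a particular Entity Framework version translates the conditional into SQL would need to be verified against the real context.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public string FullName { get; set; }
}

public class Unit
{
    public string FullName { get; set; }
}

public class PurchaseOrder
{
    public int Id { get; set; }
    public int? RequestorId { get; set; }
    public Customer Requestor { get; set; }
    public int? UnitId { get; set; }
    public Unit Unit { get; set; }
}

public static class PurchaseOrderSorting
{
    // Sorts by the requestor's name when there is one, otherwise by the unit's name.
    // Assumes every order has at least one of the two set.
    public static IOrderedQueryable<PurchaseOrder> ByRequestorName(IQueryable<PurchaseOrder> orders) =>
        orders.OrderBy(po => po.RequestorId != null ? po.Requestor.FullName : po.Unit.FullName);
}
```

Because the method takes and returns a queryable, the sort stays part of the query expression and no intermediate dictionary or second pass over the orders is needed.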
So I came up with my own solution. I first tried what both Sławomir Rosiek and Gert Arnold did. Unfortunately, like Gert mentioned, the first answer would not go through. The second one had similar issues.
In the end, I created a class to store the data from both Requestors and Units. It consisted of the following:
internal class RequestorData
{
public int entityId { get; set; }
public string Name { get; set; }
public bool isRequestorId { get; set; }
}
Next, I did the following.
//Entity class that contains all the data for my grid
var purchaseOrders = _purchaseOrders.GetPurchaseOrders();
var tempPurchaseOrders = new List<PurchaseOrder>();
var requestors = new List<RequestorData>();
var customers = purchaseOrders.Select(po => po.Requestor).Distinct().ToList();
var units = purchaseOrders.Select(po => po.Unit).Distinct().ToList();
foreach (var customer in customers)
{
if (customer != null)
requestors.Add(new RequestorData { entityId = customer.Id, Name = customer.FullName, isRequestorId = true });
}
foreach (var unit in units)
{
if (unit != null)
requestors.Add(new RequestorData { entityId = unit.Id, Name = unit.FullName, isRequestorId = false });
}
requestors = requestors.OrderBy(r => r.Name).ToList();
foreach (var requestor in requestors)
{
var id = requestor.entityId;
if (requestor.isRequestorId)
tempPurchaseOrders.AddRange(purchaseOrders.Where(po => po.RequestorId == id).ToList());
else
tempPurchaseOrders.AddRange(purchaseOrders.Where(po => po.UnitId == id).ToList());
}
purchaseOrders = tempPurchaseOrders.AsQueryable();
return purchaseOrders;
I ran this new rendition and have a 5-6 second time of wait. That's not perfect but much better than before. Thanks for all the help.
