Improving performance of Linq to SQL list intersection


I've got some working code here but I'm really concerned that it's not efficient - but I can't think of a way to improve it. Any thoughts?
We have IQueryable<Users> users, which gets its data from a users table with entity framework mapping each user to a separate table of organizations, which is a one-to-many relationship. We also have a List<string> orgCriteria, which is a list of organizations we want to filter Users by. In essence we want to get a list of users who have a membership in any of the organizations in the criteria.
To compare the names of the user's orgs to the orgs in the filter criteria, we have to use some linq/EF mappings like this: var x = users.Select(x => x.Orgs.Name).ToList(); However, the display name is what we get from the criteria, which means we have to translate the partial name to the display name as well...
These are the tables that get pulled in to all this: User, Orgs, UserOrgs. User has a FK to the Id of Orgs, and UserOrgs has 3 columns: Id, UserId, OrgId where UserId and OrgId are FKs to their respective tables. A user can have 0 or as many as there are Orgs. Each org has a name, but the display name, which we map in the domain model normally, is composed of three columns: name, foo, and bar with bar being the nullable column.
I tried an intersect like this, but it doesn't work: users = users.Where(x => x.Orgs.Select(y => string.Format("{0} - {1}{2}", y.Org.Name, y.Org.foo, y.Org.bar != null ? " - " + y.Org.bar : string.Empty)).Intersect(orgCriteria).Any()); because I get this error:
Local sequence cannot be used in LINQ to SQL implementations of query
operators except the Contains operator.
So I can make it work by combining a foreach and an intersect, but I'm concerned... If we have 500 users who each have 20 orgs, it seems like this could be a very expensive filter.
This way works, but it makes me nervous:
foreach (var user in users)
{
    // Build the display name of every org this user belongs to
    List<string> userOrgNames = user.Orgs.Select(y => string.Format("{0} - {1}{2}", y.Org.Name, y.Org.foo, y.Org.bar != null ? " - " + y.Org.bar : string.Empty)).ToList();
    // Exclude the user when none of those names appear in the criteria
    if (!userOrgNames.Intersect(orgCriteria).Any())
        users = users.Where(x => x.Id != user.Id);
}
Any ideas?
Edit - Here is a rudimentary diagram!

You can try something like the following.
I based this LINQ on your statement: "In essence we want to get a list of users who have a membership in any of the organizations in the criteria."
var filteredUsers = users.Where(user =>
    user.Orgs.Any(org => orgCriteria.Contains($"{org.Name} - {org.foo}{org.bar}")));
If the interpolated string $"{org.Name} - {org.foo}{org.bar}" does not work in your case, use string.Format("{0} - {1}{2}", org.Name, org.foo, org.bar) instead. Either way, make sure the composed string matches the display-name format used in orgCriteria, including the " - " separator before bar when bar is not null.
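As a rough sketch of the same idea, the display name can be composed with plain string concatenation and a conditional, which avoids relying on string.Format or interpolation being translated. The classes Org and User and the helper OrgFilter.ByDisplayName are hypothetical stand-ins for the real entities; the example runs against in-memory data, and whether a given LINQ to SQL or EF provider translates this exact expression is an assumption that should be checked against the generated SQL.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the mapped entities in the question.
public class Org
{
    public string Name { get; set; }
    public string Foo { get; set; }
    public string Bar { get; set; } // nullable column
}

public class User
{
    public int Id { get; set; }
    public List<Org> Orgs { get; set; } = new List<Org>();
}

public static class OrgFilter
{
    // Keeps users that belong to at least one org whose display name
    // ("Name - foo" or "Name - foo - bar") appears in orgCriteria.
    public static IQueryable<User> ByDisplayName(IQueryable<User> users, List<string> orgCriteria)
    {
        return users.Where(u => u.Orgs.Any(o =>
            orgCriteria.Contains(
                o.Name + " - " + o.Foo + (o.Bar != null ? " - " + o.Bar : ""))));
    }
}
```

The only operation applied to the local list is Contains, which is the one the error message in the question says is allowed.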

Related

How to return IQueryable LINQ result from two joined tables into a List<string>?

This is an add-on question to one asked here: Entity Framework Core 5.0 How to convert LINQ for many-to-many join to use Intersection table for ASP.NET Membership
How can I return the RoleName column of the following LINQ IQueryable result, which comes from two joined tables, as a List<string>?
var queryResult = (this.DbContext.aspnet_UsersInRoles
.Where(x => x.UserId == dpass.UserId)
.Join(
this.DbContext.aspnet_Roles,
ur => ur.RoleId,
r => r.RoleId,
(ur, role) => new
{
ur,
role
}
)
.Select(x => new { x.ur.UserId, x.role.RoleName })
);
UPDATE 1
I need the List in the form of an array of values so that I can use the Contains() method. I need to search for specific RoleNames assigned to a UserId. If I use ToList() on the IQueryable, then the array result is in the form of:
{ RoleName = "admin"}
{ RoleName = "user"}
I am unable to use the .Contains() method because I get the following error:
cannot convert from 'string' to <anonymous type: string RoleName>.
It seems to be expecting a class that the query result can be assigned to. But one doesn't exist, because I am doing this on-the-fly.
UPDATE 2
I need the queryResult in a List that is in the form of:
{ "admin"}
{ "user"}
With this output, I can use the .Contains() method to perform multiple checks. This is used for determining Windows Forms field properties. So, if the UserId belongs to the admin role then the form enables certain check boxes and radio buttons whereas if the UserId belongs to the user role then the form enables different check boxes. This is not an exhaustive list of roles available along with the checks that are performed by the form. But, what is important is that there are multiple checks on the List that need to be performed in separate IF statements.
Currently, I am able to use the queryResult to do the following:
Get a list of the RoleNames
Perform separate LINQ queries on the queryResult by checking for the specific RoleName
Perform a .Count() > 0 check to see if the UserId is in a specific role.
This seems like an ugly hack because I have the intermediate step of creating 1 + N variables to retrieve, by LINQ, and store each RoleName and then check whether the .Count() is greater than zero. I think that the List method would be cleaner and more efficient, if that is possible.
var varUser = from d in queryResult
where d.RoleName == "user"
select new { d.RoleName };
var varAdmin = from u in queryResult
where u.RoleName == "admin"
select new { u.RoleName };
//... more declarations and LINQs ...

Short answer:
Select only the RoleName, and use SelectMany instead of Select
Better answer
So you have a table of Roles, and a table of Users (I'm simplifying your long identifiers, not part of the problem and way too much typing).
There seems to be a many to many relation between Roles and Users: Every Role is a role for zero or more Users, every User has zero or more Roles.
This many-to-many relation is implemented using a standard junction table: UsersInRoles. This junction table has two foreign keys: one to the User and one to the Roles.
You have a UserId, and it seems that you want all names of all Roles of the user that has this Id.
How about this:
int userId = ...
// Get the names of all Roles of the User with this Id
var namesOfRolesOfThisUser = dbContext.UsersInRoles
// only the user with this Id:
.Where(userInRole => userInRole.UserId == userId)
// get the names of all Roles for this userInRole
.SelectMany(userInRole => dbContext.Roles.Where(role => role.RoleId == userInRole.RoleId)
.Select(role => role.RoleName));
In words: from the table of UsersInRoles, keep only those UsersInRoles that have a value for property UserId that equals userId.
From every one of the remaining UsersInRoles, select all Roles that have a RoleId that equals the UserInRole.RoleId. From these Roles take the RoleName.
I use SelectMany to make sure that I get one sequence of strings, instead of a sequence of sequences of strings.
If you suspect duplicate RoleNames, consider appending Distinct() at the end.
But I want to Join!
Some people really like to do the joins themselves.
int userId = ...
var namesOfRolesOfThisUser = dbContext.UsersInRoles
.Where(userInRole => userInRole.UserId == userId)
.Join(dbContext.Roles,
userInRole => userInRole.RoleId, // from every UserInRole take the foreign key
role => role.RoleId, // from every Role take the primary key
// when they match, take only the name of the Role
(userInRole, role) => role.RoleName);
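To get from that query to the List<string> the question asks for, it should be enough to materialize it with ToList(). The sketch below runs against in-memory data; the classes UserInRole and Role and the helper RoleLookup.RoleNamesFor are hypothetical simplifications of the aspnet_* entities, and the int key type is an assumption carried over from the snippets above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified versions of the junction and role entities.
public class UserInRole
{
    public int UserId { get; set; }
    public int RoleId { get; set; }
}

public class Role
{
    public int RoleId { get; set; }
    public string RoleName { get; set; }
}

public static class RoleLookup
{
    // Returns the distinct role names of one user as a plain List<string>.
    public static List<string> RoleNamesFor(
        IQueryable<UserInRole> usersInRoles, IQueryable<Role> roles, int userId)
    {
        return usersInRoles
            .Where(ur => ur.UserId == userId)
            .Join(roles,
                  ur => ur.RoleId,        // foreign key in the junction table
                  r => r.RoleId,          // primary key of the role
                  (ur, r) => r.RoleName)  // keep only the name, not an anonymous type
            .Distinct()
            .ToList();
    }
}
```

Because the result is a list of strings rather than of anonymous objects, each check becomes a single call such as roleNames.Contains("admin"), and the list is fetched only once for all the IF statements.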

Try to use GroupBy(). Be careful: depending on the provider, a GroupBy() that returns whole groups may not be translatable to SQL, in which case calling it before .ToList() throws an error.
In your example you could do this: select a list into memory and then work with it:
var queryResult = this.DbContext.aspnet_UsersInRoles
    .Where(x => x.UserId == dpass.UserId)
    .Join(this.DbContext.aspnet_Roles,
        ur => ur.RoleId,
        r => r.RoleId,
        (ur, role) => new { ur, role }
    )
    .Select(x => new { x.ur.UserId, x.role.RoleName })
    .ToList() // MATERIALIZE FIRST
    .GroupBy(x => x.UserId) // ADD THIS (runs in memory)
    .ToList();
// The Where above filters on a single UserId, so there is at most one group
var userGroup = queryResult.FirstOrDefault();
// Contains() does not take a predicate; use Any() to test for a role name
bool hasRole = userGroup != null && userGroup.Any(x => x.RoleName == "ROLE_TO_SEARCH");
var userId = userGroup?.Key;

Entity Framework Count with a filter on field really slow - large count

When I am running a Model Mapping, one company has a lot of members, like 405,000 members.
viewModel.EmployeeCount = company.MembershipUser.Count(x => x.Deleted == false);
When I run the SQL query, it takes a few milliseconds. In ASP.NET MVC, EF6 C# this can take up to 10 minutes for one list view controller hit. Thoughts?
Company is my Domain Model Entity, and MembershipUser is a public virtual navigation property (FK). This is Entity Framework 6, not C# 6.
When I'm in my CompanyController (MVC) and I ask for a company list, I get a list without the company count included. When I do a viewModelMapping to my Model to prep to pass to the view, I need to add the count, and do not have access to the context or DB, etc.
// Redisplay list of companies
var viewModel = CrmViewModelMapping.CompanyListToCompanyViewModel(pagedCompanyList);
CompanyListToCompanyViewModel maps the list of companies to the list of my ViewModel and does the count (MembershipUsers) there.
I also tried adding the count property to the company DomainModel such as:
public int EmployeeCount
{
get
{
// return MembershipUser.Where(x => x.Deleted == false).Count();
return MembershipUser.Count(x => x.Deleted == false);
}
}
But it also takes a long time on companies with a lot of Employees.
It's almost like I want this to be my SQL Query:
SELECT *, (SELECT COUNT(EmployeeID) FROM Employee WHERE Employee.CompanyID = Company.CompanyID) AS employeeCount FROM Company
But early on I just assumed I could let EF lazy loading and subqueries do the work. The overhead on large counts is killing me, though. On small datasets I see no real difference, but once the counts get large my site is unusable.

When you access the navigation property and call Count on it, you are materializing all of that company's MembershipUser rows and doing the filter in C#.
There are three operations in this command: EF goes to the database and executes the query, transforms the query result into a list of C# objects (materialization), and then executes the filter (x => x.Deleted == false) on this list.
To solve this problem you can do the filter on the MembershipUser DbSet (this assumes the foreign key property on MembershipUser is named CompanyId):
Db.MembershipUser.Count(x => x.Deleted == false && x.CompanyId == company.Id);
When the query goes through the DbSet, the filter and the count run in the database without materializing all 405,000 rows.
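When the count is needed for every company in a list, one option is to compute it as part of the projection, so that the list and the counts come from a single query, close to the SQL sketched in the question. The following is a minimal sketch with hypothetical classes (Company, MembershipUser, CompanyRow) and a hypothetical helper; it runs against in-memory data. On an unmaterialized EF IQueryable the Count inside Select should become a correlated COUNT subquery, but that is an assumption to verify against the SQL your setup generates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified domain classes.
public class MembershipUser
{
    public int Id { get; set; }
    public bool Deleted { get; set; }
}

public class Company
{
    public int Id { get; set; }
    public string Name { get; set; }
    public virtual ICollection<MembershipUser> MembershipUser { get; set; } = new List<MembershipUser>();
}

// Flat row used to carry the count to the view model mapping.
public class CompanyRow
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int EmployeeCount { get; set; }
}

public static class CompanyQueries
{
    // Projects each company together with its count of non-deleted members.
    // The projection must be applied before ToList() so that the provider,
    // not C#, does the counting.
    public static List<CompanyRow> WithEmployeeCounts(IQueryable<Company> companies)
    {
        return companies
            .Select(c => new CompanyRow
            {
                Id = c.Id,
                Name = c.Name,
                EmployeeCount = c.MembershipUser.Count(m => !m.Deleted)
            })
            .ToList();
    }
}
```

The mapping code then receives the count as plain data and never touches the lazy-loaded collection.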

Better way to join tables with Entity Framework

Good morning,
I have inherited a database with no foreign key relations and the project is such that i have to ignore this major issue and work around it. Obviously this eliminates some of the cooler features of Entity Framework providing me related entities automatically.
So i have been forced to do something like this:
using (var db = new MyEntities())
{
Entities.Info record = db.Infoes.Where(x => x.UserId == authInfo.User.Id).FirstOrDefault();
//Get all the accounts for the user
List<Entities.AcctSummary> accounts = db.AcctSummaries.Where(x => x.InfoId == record.Id).ToList();
//Loop through each account
foreach (Entities.AcctSummary account in accounts)
{
//pull records for account
List<Entities.Records> records= db.Records.Where(x => x.AcctSummaryId == account.Id).ToList();
}
}
Is there a better way to join the "record" and "accounts" entities, or perhaps a more efficient way of getting "records" in a single query?
TIA

Are you just looking for the .Join() extension method? As an example, joining Infoes and Accounts might look like this:
var accounts = db.Infoes.Join(db.Accounts,
i => i.Id,
a => a.InfoId,
(info, account) => new { info, account });
This would result in accounts being an enumeration of an anonymous type with two properties, one being the Info record and the other being the Account record. The collection would be the full superset of the joined records.
You can of course return something other than new { info, account }; it works just like anything you'd put into a .Select() clause. Whatever you select from these joined tables is what accounts will be an enumeration of. You can join more tables by chaining further .Join() calls, returning whatever you want from each.
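Chaining two joins would let the three lookups from the question collapse into one query. The sketch below uses hypothetical, simplified classes (Info, AcctSummary, Record) and a hypothetical helper, and runs against in-memory data; the property names follow the question's snippet, and everything else is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified entities with no navigation properties,
// matching a database that has no foreign key relations.
public class Info
{
    public int Id { get; set; }
    public int UserId { get; set; }
}

public class AcctSummary
{
    public int Id { get; set; }
    public int InfoId { get; set; }
}

public class Record
{
    public int Id { get; set; }
    public int AcctSummaryId { get; set; }
}

public static class AccountQueries
{
    // Returns every record of every account that belongs to the given user.
    public static List<Record> RecordsForUser(
        IQueryable<Info> infoes,
        IQueryable<AcctSummary> summaries,
        IQueryable<Record> records,
        int userId)
    {
        return infoes
            .Where(i => i.UserId == userId)
            // Info -> AcctSummary on Info.Id == AcctSummary.InfoId
            .Join(summaries, i => i.Id, a => a.InfoId, (i, a) => a)
            // AcctSummary -> Record on AcctSummary.Id == Record.AcctSummaryId
            .Join(records, a => a.Id, r => r.AcctSummaryId, (a, r) => r)
            .ToList();
    }
}
```

Compared with the loop in the question, this issues one query instead of one per account.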

What is the way to join these two list?

I have an IList<User> that contains objects with a pair of values: Name and Surname.
In the database I have a table whose rows have Name and Surname fields. In the code-behind I want to return the rows that match my list, that is, the rows whose Name and Surname both equal those of an item in the list.
My actual code is:
utenti = (from User utente in db.User.AsEnumerable()
join amico in amiciParsed
on new { utente.Nome, utente.Cognome } equals
new { Nome = amico.first_name, Cognome = amico.last_name }
select utente).OrderBy(p => p.Nome)
.OrderBy(p => p.Cognome)
.OrderBy(p => p.Nickname)
.ToList();
but this is not good for two reasons:
It downloads every record from the DB to the client;
I can't match Name and Surname case-insensitively (for example, Marco cordi != Marco Cordi), and the DB contains every kind of upper/lower-case mix.
As suggested on a previous question, it seems that this answer can't help me, since I have to do a join (and also because it does not address the first problem).
What's the way to resolve this problem?

I don't know if this will work in your situation, but you might give it a try.
First, create a new list of strings:
List<string> amici = amiciParsed.Select(x => x.first_name + "|" + x.last_name).ToList();
Then, select the users from the DB based on this list. Note that there is no AsEnumerable() here, so the filter stays part of the database query:
var utenti = db.User.Where(utente =>
    amici.Contains(utente.Nome + "|" + utente.Cognome)).ToList();
It sends the list of strings to the DB as a list of parameters and translates it into a query like
SELECT * FROM User WHERE User.Nome + '|' + User.Cognome IN (@p1, @p2, @p3 ...)
Unfortunately, there is no way to call Contains with something like StringComparison.OrdinalIgnoreCase, so if the comparison turns out to be case-sensitive you might have to change the collation of your columns.

This could be done with PredicateBuilder:
using LinqKit;
var predicate = PredicateBuilder.False<User>();
foreach(var amico in amiciParsed)
{
var a1 = amico; // Prevent modified closure (pre .Net 4.5)
predicate = predicate.Or(user => user.Nome == a1.first_name
&& user.Cognome == a1.last_name);
}
var query = db.User.Where(predicate.Expand())
.OrderBy(p => p.Nome)
...
The advantage is that indexes on Nome and Cognome can be used (which is impossible if you search on a concatenated value). On the other hand, the number of OR clauses can get very large, which may hit certain limits in SQL Server (https://stackoverflow.com/a/1869810/861716). You'll have to stress-test this (although the same goes for IN clauses).
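If adding LinqKit is not an option, roughly the same OR-chain can be assembled by hand with System.Linq.Expressions. This is only a sketch: the classes User and Friend and the helper FriendPredicate.Build are hypothetical, the example runs against in-memory data, and how a particular provider translates the resulting tree (constants versus parameters, limits on the number of OR terms) is not something this snippet verifies.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical, simplified versions of the types in the question.
public class User
{
    public string Nome { get; set; }
    public string Cognome { get; set; }
}

public class Friend
{
    public string first_name { get; set; }
    public string last_name { get; set; }
}

public static class FriendPredicate
{
    // Builds: user => false
    //              || (user.Nome == f1.first_name && user.Cognome == f1.last_name)
    //              || (user.Nome == f2.first_name && user.Cognome == f2.last_name) ...
    public static Expression<Func<User, bool>> Build(IEnumerable<Friend> friends)
    {
        ParameterExpression user = Expression.Parameter(typeof(User), "user");
        Expression body = Expression.Constant(false); // an empty list matches nobody

        foreach (Friend f in friends)
        {
            Expression sameName = Expression.Equal(
                Expression.Property(user, nameof(User.Nome)),
                Expression.Constant(f.first_name, typeof(string)));
            Expression sameSurname = Expression.Equal(
                Expression.Property(user, nameof(User.Cognome)),
                Expression.Constant(f.last_name, typeof(string)));
            body = Expression.OrElse(body, Expression.AndAlso(sameName, sameSurname));
        }

        return Expression.Lambda<Func<User, bool>>(body, user);
    }
}
```

The result can be passed directly to Where on an IQueryable<User>; because every term refers to the same parameter, no Expand() step is needed.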

When asking a question here on SO, you may want to translate it to English; don't expect people to know what "utente", "amico" or "Cognome" are.
One question: Why do you use ..in db.User.AsEnumerable() and not just ..in db.User?
Let everything in your query stay IQueryable (instead of IEnumerable). This lets Linq2Sql generate SQL that is as optimized as possible, instead of downloading all the records and joining them client-side. This may also be the reason your search turns case-sensitive: client-side in-memory string comparison is case-sensitive by default, while string comparison in SQL depends on the database's configuration.
Try ditching the .AsEnumerable() and see if you get better results:
utenti = (from User foo in db.User
join bar in amiciParsed
...

Dynamic where clause using Linq to SQL in a join query in a MVC application

I am looking for a way to query for products in a catalog using filters on properties which have been assigned to the product based on the category to which the product belongs. So I have the following entities involved:
Products
[Id, CategoryId]
Categories
[Id, Name, UrlName]
Properties
[Id, CategoryId, Name, UrlName]
PropertyValues
[Id, PropertyId, Text, UrlText]
ProductPropertyValues
[ProductId, PropertyValueId]
When I add a product to the catalog, multiple ProductPropertyValues will be added based on the category and I would like to be able to filter all products from a category by selecting values for one or more properties. The business logic and SQL indexes and constraints make sure that all UrlNames and texts are unique for values properties and categories.
The solution will be an MVC3 EF code-first application, and the routing is set up as follows:
/products/{categoryUrlName}/{*filters}
The filter routing part has a variable length so multiple filters can be applied. Each filter contains the UrlName of the property and the UrlText of the value separated by an underscore.
An url could look like this /products/websites/framework_mvc3/language_csharp
I will gather all filters, which I will hold in a list, by reading the URL. Now it is time to actually get the products based on multiple properties and I have been trying to find the right strategy.
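Gathering the filters from the catch-all route value could look roughly like this. The helper FilterParser.Parse is hypothetical, and it assumes that each segment is split at its first underscore, which only works if property UrlNames never contain an underscore themselves.

```csharp
using System;
using System.Collections.Generic;

public static class FilterParser
{
    // Turns "framework_mvc3/language_csharp" into
    // [("framework", "mvc3"), ("language", "csharp")].
    // Key = UrlName of the property, Value = UrlText of the value.
    public static List<KeyValuePair<string, string>> Parse(string filters)
    {
        var result = new List<KeyValuePair<string, string>>();
        if (string.IsNullOrEmpty(filters))
            return result;

        foreach (string segment in filters.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries))
        {
            int separator = segment.IndexOf('_');
            // Skip malformed segments: no underscore, empty name, or empty value.
            if (separator <= 0 || separator == segment.Length - 1)
                continue;

            result.Add(new KeyValuePair<string, string>(
                segment.Substring(0, separator),
                segment.Substring(separator + 1)));
        }
        return result;
    }
}
```

Each pair can then be resolved to a PropertyValue Id before the product query is built.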
Maybe there is another way to implement the filters. All larger web shops use category-dependent filters, and I am still looking for the best way to implement the persistence part for this type of functionality. The suggested solutions result in an "or" result set if multiple filters are selected. I can imagine that adding a text column to the product table, in which all property values are stored as a joined string, could work as well. I have no idea what this would cost performance-wise. At least there would be no complex join, and the properties and their values will be received as text anyway.
Maybe the filtering mechanism can be done client side as well.

The tricky part about this is sending the whole list into the database as a filter. Your approach of building up more and more where clauses can work:
productsInCategory = ProductRepository
.Where(p => p.Category.Name == category);
foreach (PropertyFilter pf in filterList)
{
PropertyFilter localVariableCopy = pf;
productsInCategory = from product in productsInCategory
where product.ProductProperties
.Any(pp => pp.PropertyValueId == localVariableCopy.ValueId)
select product;
}
Another way to go is to send the whole list in using the List.Contains method. Note that this matches products that have any of the values, whereas the loop above requires all of them:
List<int> valueIds = filterList.Select(pf => pf.ValueId).ToList();
productsInCategory = ProductRepository
    .Where(p => p.Category.Name == category)
    .Where(p => p.ProductProperties
        .Any(pp => valueIds.Contains(pp.PropertyValueId)));
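The difference between the two approaches is easiest to see side by side. In this sketch the classes Product and ProductProperty and the helpers MatchAll and MatchAny are hypothetical simplifications, and the example runs against in-memory data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified entities.
public class ProductProperty
{
    public int PropertyValueId { get; set; }
}

public class Product
{
    public int Id { get; set; }
    public List<ProductProperty> ProductProperties { get; set; } = new List<ProductProperty>();
}

public static class ProductFilters
{
    // AND semantics: one Where per selected value, so a product must
    // carry every value to survive all the filters.
    public static IQueryable<Product> MatchAll(IQueryable<Product> products, IEnumerable<int> valueIds)
    {
        foreach (int valueId in valueIds)
        {
            int id = valueId; // local copy so each lambda captures its own value
            products = products.Where(p => p.ProductProperties.Any(pp => pp.PropertyValueId == id));
        }
        return products;
    }

    // OR semantics: a single Where with Contains, so one matching value is enough.
    public static IQueryable<Product> MatchAny(IQueryable<Product> products, List<int> valueIds)
    {
        return products.Where(p => p.ProductProperties.Any(pp => valueIds.Contains(pp.PropertyValueId)));
    }
}
```

For a web-shop style filter, where selecting "framework: mvc3" and "language: csharp" should narrow the results, MatchAll is the behaviour the question describes.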

IEnumerable<int> filters = filterList.Select(pf => pf.ValueId);
var products = from pp in ProductPropertyRepository
where filters.Contains(pp.PropertyValueId)
&& pp.Product.Category.Name == category
select pp.Product;
Bear in mind that because Contains is used, the filters are passed to the database as query parameters, one per value. This means you have to be careful not to exceed the parameter limit (about 2,100 per request on SQL Server).

I came up with a solution that even I can understand... by using the 'Contains' method you can chain as many WHERE's as you like. If the WHERE is an empty string, it's ignored (or evaluated as a select all). Here is my example of joining 2 tables in LINQ, applying multiple where clauses and populating a model class to be returned to the view.
public ActionResult Index()
{
string AssetGroupCode = "";
string StatusCode = "";
string SearchString = "";
var mdl = from a in _db.Assets
join t in _db.Tags on a.ASSETID equals t.ASSETID
where a.ASSETGROUPCODE.Contains(AssetGroupCode)
&& a.STATUSCODE.Contains(StatusCode)
&& (
a.PO.Contains(SearchString)
|| a.MODEL.Contains(SearchString)
|| a.USERNAME.Contains(SearchString)
|| a.LOCATION.Contains(SearchString)
|| t.TAGNUMBER.Contains(SearchString)
|| t.SERIALNUMBER.Contains(SearchString)
)
select new AssetListView
{
AssetId = a.ASSETID,
TagId = t.TAGID,
PO = a.PO,
Model = a.MODEL,
UserName = a.USERNAME,
Location = a.LOCATION,
Tag = t.TAGNUMBER,
SerialNum = t.SERIALNUMBER
};
return View(mdl);
}

I know this is an old question, but in case someone sees this, I've built this project:
https://github.com/PoweredSoft/DynamicLinq
Which should be downloadable on nuget as well:
https://www.nuget.org/packages/PoweredSoft.DynamicLinq
You could use this to loop through the filters coming from the query string and do something along the lines of
query = query.Query(q =>
{
q.Compare("AuthorId", ConditionOperators.Equal, 1);
q.And(sq =>
{
sq.Compare("Content", ConditionOperators.Equal, "World");
sq.Or("Title", ConditionOperators.Contains, 3);
});
});
