Advanced filtering in LINQ - C#

I'm here to ask what the code would look like for advanced filtering in C# with LINQ. I have experience with LINQ, but this is beyond my understanding.
Let's say we have a class Item that has the properties (string)Name, (bool)New and (int)Price, and users have to enter their filters to get the results they need.
Let's say we put 5 objects inside a list named list, which is a List of Items.
new Item("Pen",true,12);
new Item("PostIt",false,35);
new Item("Phone",true,140);
new Item("Watch",true,5);
new Item("Lavalamp",false,2);
Now I would like to process this information to get, say, all new items that cost over 10. I know I can do this with
List<Item> Results = list.Where(item => item.Price > 10 && item.New).ToList();
but what if a user wants to get all items that cost over 10 regardless of whether they are new or not? I can't change the query at runtime to fit those needs, and I don't think that writing a query for every possible combination of input parameters is the right way to do this. Can someone give me an example of how this should be done?

You can define a base query:
var result = list.Where(item => item.Price > 10); // DON'T call ToList() here
if (someCondition)
result = result.Where(item => item.New);
// in the end you call
return result.ToList();
As @MikeEason said, you don't want to call ToList() on your first result because that would execute the query. Your goal is to build the complete query and execute it only once, which is why ToList() is called only when you return the result.

If you only have those three conditions then you can build your query in several steps:
IEnumerable<Item> result = list;
int Price = 10;
bool FilterByPrice = true, FilterByNew = false; // set these variables from your environment
if (FilterByPrice)
result = result.Where(item => item.Price > Price);
if (FilterByNew)
result = result.Where(item => item.New);
Your query will be executed when you call the ToList method or when you iterate over the query result, thanks to deferred execution.
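A minimal, self-contained sketch of the step-by-step approach described above. It assumes value tuples as a stand-in for the question's Item class, and the names filterByPrice, filterByNew and minPrice are hypothetical placeholders for whatever the user entered.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Tuples stand in for the question's Item class (Name, New, Price).
var list = new List<(string Name, bool New, int Price)>
{
    ("Pen", true, 12),
    ("PostIt", false, 35),
    ("Phone", true, 140),
    ("Watch", true, 5),
    ("Lavalamp", false, 2),
};

// Hypothetical filter settings; in a real program they come from user input.
bool filterByPrice = true;
bool filterByNew = false;
int minPrice = 10;

// Start from the unfiltered sequence and append only the filters that are switched on.
IEnumerable<(string Name, bool New, int Price)> query = list;
if (filterByPrice)
    query = query.Where(item => item.Price > minPrice);
if (filterByNew)
    query = query.Where(item => item.New);

// Nothing has been evaluated yet; ToList() runs the composed query once.
var results = query.ToList();
Console.WriteLine(string.Join(", ", results.Select(item => item.Name)));
```

Each optional filter is one more Where call, so adding a new criterion does not multiply the number of query variants.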

So let's say your items exist in your database and you want to query them. The user has a checkbox to choose between seeing only new items and seeing all of them. If the box is checked, you set a bool value for it.
// Compose the query
var results = _db.Where(item => item.Price > 10 );
// Still composing
if (onlyNewItems)
{
results = results.Where(item => item.New);
}
// ToList() executes the query, data is returned;
return results.ToList();
This does not run the query twice. Until you materialize the query you are still composing it; if you returned it at this point, it would be of type IQueryable<T>. Only when you call .ToList() is the query actually executed, and you get a List<T> (which is also an IEnumerable<T>) back.

List<Item> Results = list.Where(item => item.Price > 10
&& (condition ? item.New : true)).ToList();
You can extend it this way: when your condition is false the expression just evaluates to true, so that filter has no effect.

Related

IEnumerable takes too long to process when filtering on it

I have a feeling I know that what the reason is for this behavior, but I don't know what the best way of resolving it will be.
I have built a LinqToSQL query:
public IEnumerable<AllConditionByCountry> GenerateConditions(int paramCountryId)
{
var AllConditionsByCountry =
(from cd in db.tblConditionDescriptions...
join...
join...
select new AllConditionByCountry
{
CountryID = cd.CountryID,
ConditionDescription = cd.ConditionDescription,
ConditionID = cd.ConditionID,
...
...
}).OrderBy(x => x.CountryID).AsEnumerable<AllConditionByCountry>();
return AllConditionsByCountry;
}
This query returns about 9500+ rows of data.
I'm calling this from my Controller like so:
svcGenerateConditions generateConditions = new svcGenerateConditions(db);
IEnumerable<AllConditionByCountry> AllConditionsByCountry;
AllConditionsByCountry = generateConditions.GenerateConditions(1);
Which then I'm looping through:
foreach (var record in AllConditionsByCountry)
{
...
...
...
This is where I think the issue is:
var rList = AllConditionsByCountry
.Where(x => x.ConditionID == conditionID)
.Select(x => x)
.AsEnumerable();
I'm doing a nested loop based on the data that I'm gathering from the above query (using the original data I'm getting from AllConditionByCountry). I think this is where my issue lies: when it filters the data, it slows down greatly.
Basically this process writes out a bunch of files (.json, .html)
I've tested this at first using just ADO.Net and to run through all of these records it took about 4 seconds. Using EF (stored procedure or LinqToSql) it takes minutes.
Is there anything I should do with my types of lists that I'm using or is that just the price of using LinqToSql?
I've tried to return List<AllConditionByCountry>, IQueryable, IEnumerable from my GenerateConditions method. List took a very long time (similar to what I'm seeing now). IQueryable I got errors when I tried to do the 2nd filter (Query results cannot be enumerated more than once).
I have run this same Linq statement in LinqPad and it returns in less than a second.
I'm happy to add any additional information.
Please let me know.
Edit:
foreach (var record in AllConditionsByCountry)
{
...
...
...
var rList = AllConditionsByCountry
.Where(x => x.ConditionID == conditionID)
.Select(x => x)
.AsEnumerable();
conditionDescriptionTypeID = item.ConditionDescriptionTypeId;
id = conditionDescriptionTypeID + "_" + count.ToString();
...
...
}
TL;DR: You're making 9895 queries against the database instead of one. You need to rewrite your query such that only one is executed. Look into how IEnumerable works for some hints into doing this.
Ah, yeah, that for loop is your problem.
foreach (var record in AllConditionsByCountry)
{
...
...
...
var rList = AllConditionsByCountry.Where(x => x.ConditionID == conditionID).Select(x => x).AsEnumerable();
conditionDescriptionTypeID = item.ConditionDescriptionTypeId;
id = conditionDescriptionTypeID + "_" + count.ToString();
...
...
}
Linq-to-SQL works similarly to Linq in that it (loosely speaking) appends functions to a chain to be executed when the enumerable is iterated - for example,
Enumerable.Repeat(1, 1).Select<int, int>(x => throw new Exception());
This doesn't actually cause your code to crash because the enumerable is never iterated. Linq-to-SQL operates on a similar principle. So, when you define this:
var AllConditionsByCountry =
(from cd in db.tblConditionDescriptions...
join...
join...
select new AllConditionByCountry
{
CountryID = cd.CountryID,
ConditionDescription = cd.ConditionDescription,
ConditionID = cd.ConditionID,
...
...
}).OrderBy(x => x.CountryID).AsEnumerable<AllConditionByCountry>();
You're not executing anything against a database, you're just instructing C# to build a query that does this when it is iterated. That's why just declaring this query is fast.
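As a small in-memory illustration of that deferred behavior (plain LINQ to Objects, not Linq-to-SQL; the counter variable is only there for the demonstration):

```csharp
using System;
using System.Linq;

int calls = 0;

// Defining the query does not run the lambda.
var query = new[] { 1, 2, 3 }.Select(x => { calls++; return x * 2; });
int callsAfterDefinition = calls;

// Iterating (here via ToList) runs the lambda once per element.
var doubled = query.ToList();
int callsAfterIteration = calls;

Console.WriteLine($"{callsAfterDefinition} calls before iteration, {callsAfterIteration} after");
```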
Your problem comes when you get to your foreach loop. When you hit the loop, you signal that you want to start iterating the AllConditionsByCountry iterator. This causes .NET to go off and execute the initial query, which takes time.
When you call AllConditionsByCountry.Where(x => x.ConditionID == conditionID) inside the loop, you're constructing another iterator that doesn't actually do anything yet. Presumably you then use the result of rList within that loop, so you're essentially constructing N queries to be executed against the database (where N is the size of AllConditionsByCountry).
This leads to a scenario where you are effectively executing approximately 9501 queries against the database: 1 for your initial query and then one for each element it returns. The drastic slowdown compared to ADO.NET is because you're probably making 9500 more queries than you were originally.
Ideally you should change the code so that one and only one query is executed against the database. You have a few options:
1. Rewrite the Linq-to-SQL query so that all of the legwork is done by the SQL database.
2. Materialize the Linq-to-SQL query into a list once and search that list, like this:
var conditions = AllConditionsByCountry.ToList();
foreach (var record in conditions)
{
var rList = conditions.Where(....);
}
Note that in that example I am searching conditions rather than AllConditionsByCountry. .ToList() returns a list that has already been iterated, so you do not create any more database queries. This will still be slow (since you're doing O(N^2) work over 9500 records), but it will be faster than creating 9500 queries since it is all done in memory.
3. Just rewrite the query in ADO.NET if you're more comfortable with raw SQL than Linq-to-SQL. There's nothing wrong with this.
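If the O(N^2) in-memory search is still too slow, one possible refinement (not part of the options above, just a sketch using the standard ToLookup operator) is to group the materialized rows by ConditionID once and index into that grouping inside the loop. The sample rows are made up for illustration.

```csharp
using System;
using System.Linq;

// Sample rows standing in for the materialized query result.
var conditions = new[]
{
    (ConditionID: 1, Description: "a"),
    (ConditionID: 2, Description: "b"),
    (ConditionID: 1, Description: "c"),
};

// Build the lookup once: a single pass over the data.
var byConditionId = conditions.ToLookup(c => c.ConditionID);

foreach (var record in conditions)
{
    // Indexing the lookup replaces a full Where scan per record.
    var rList = byConditionId[record.ConditionID];
    Console.WriteLine($"{record.Description}: {rList.Count()} row(s) share ConditionID {record.ConditionID}");
}
```

A lookup returns an empty sequence for a key it does not contain, so no existence check is needed before indexing.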
I think I should point out what methods cause an IEnumerable to be iterated and what ones don't.
Any method named As* (such as AsEnumerable<T>()) does not cause the enumerable to be iterated. It's essentially a way of casting from one type to another.
Any method named To* (such as ToList<T>()) will cause the enumerable to be iterated. In the case of Linq-to-SQL this will also execute the database query. Any method that results in you getting a value out of the enumerable will also cause iteration. You can use this to your advantage by creating a query, forcing iteration with ToList() and then searching that list. The comparisons are then done in memory, which is what I demo above.
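The difference can be made visible without a database. In this sketch a local iterator function, LoadConditions (a hypothetical stand-in for the real query), counts how often it is started; the rows are invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int sourceRuns = 0;

// Stand-in for a database query: every enumeration starts the "query" again.
IEnumerable<(int ConditionID, string Description)> LoadConditions()
{
    sourceRuns++;
    yield return (1, "a");
    yield return (2, "b");
    yield return (1, "c");
}

// Deferred: the outer loop enumerates once, and every inner Where enumerates again.
var deferred = LoadConditions();
foreach (var record in deferred)
{
    var matches = deferred.Where(x => x.ConditionID == record.ConditionID).ToList();
}
int runsWhenDeferred = sourceRuns;

// Materialized: ToList() enumerates the source once; later filters use the in-memory list.
sourceRuns = 0;
var conditions = LoadConditions().ToList();
foreach (var record in conditions)
{
    var matches = conditions.Where(x => x.ConditionID == record.ConditionID).ToList();
}
int runsWhenMaterialized = sourceRuns;

Console.WriteLine($"deferred: {runsWhenDeferred} runs, materialized: {runsWhenMaterialized} run");
```

With three rows the deferred version starts the source four times (once for the outer loop plus once per row), which is the same N + 1 pattern described above.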
// First: the IEnumerable<> return type should be List<>, because you need to work with the result later
public IEnumerable<AllConditionByCountry> GenerateConditions(int paramCountryId)
{
var AllConditionsByCountry =
(from cd in db.tblConditionDescriptions...
join...
join...
select new AllConditionByCountry
{
CountryID = cd.CountryID,
ConditionDescription = cd.ConditionDescription,
ConditionID = cd.ConditionID,
...
...
})
.OrderBy(x => x.CountryID)
.ToList(); // return a list, so only one query is executed
// .AsEnumerable<AllConditionByCountry>() is unnecessary here anyway.
return AllConditionsByCountry;
}
About this part:
foreach (var record in AllConditionsByCountry) // you can also use AllConditionsByCountry.ForEach(record => { ... });
{
...
// AllConditionsByCountry will not query the db again, because it's a list, no longer a query
var rList = AllConditionsByCountry.Where(x => x.ConditionID == conditionID); // .Select(x => x).AsEnumerable() is not needed; don't use AsXXX unless compilation requires it.
...
}
BTW:
You should page your results. No page needs 100+ results; returning 10K rows is an issue in itself.
GenerateConditions(int paramCountryId, int page = 0, int pagesize = 50)
It's odd that you need a sub-query. Usually that means GenerateConditions did not return the data structure you need; change it to return the right data so that no sub-query is needed.
Use a compiled query for further improvement: https://learn.microsoft.com/en-us/dotnet/framework/data/adonet/ef/language-reference/compiled-queries-linq-to-entities
We can't see your full query, but that is usually the part to improve, especially when you have many conditions to filter, join and group on. A small change there can make all the difference.

C# and LINQ - arbitrary statement instead of let

Let's say I'm doing a LINQ query like this (this is LINQ to Objects, BTW):
var rows =
from t in totals
let name = Utilities.GetName(t)
orderby name
select t;
So the GetName method just calculates a display name from a Total object and is a decent use of the let keyword. But let's say I have another method, Utilities.Sum() that applies some math on a Total object and sets some properties on it. I can use let to achieve this, like so:
var rows =
from t in totals
let unused = Utilities.Sum(t)
select t;
The thing that is weird here, is that Utilities.Sum() has to return a value, even if I don't use it. Is there a way to use it inside a LINQ statement if it returns void? I obviously can't do something like this:
var rows =
from t in totals
Utilities.Sum(t)
select t;
PS - I know this is probably not good practice to call a method with side effects in a LINQ expression. Just trying to understand LINQ syntax completely.
No, there is no LINQ method that performs an Action on all of the items in the IEnumerable<T>. It was very specifically left out because the designers actively didn't want it to be in there.
Answering the question
No, but you could cheat by creating a Func which just calls the intended method and spits out a random return value, bool for example:
Func<Total, bool> dummy = (total) =>
{
Utilities.Sum(total);
return true;
};
var rows = from t in totals
let unused = dummy(t)
select t;
But this is not a good idea - it's not particularly readable.
The let statement behind the scenes
What the above query will translate to is something similar to this:
var rows = totals.Select(t => new { t, unused = dummy(t) })
.Select(x => x.t);
Another option, if you want to use method syntax instead of query syntax, is:
var rows = totals.Select(t =>
{
Utilities.Sum(t);
return t;
});
A little better, but still abusing LINQ.
... but what you should do
But I really see no reason not to just simply loop around totals separately:
foreach (var t in totals)
Utilities.Sum(t);
You should download the "Interactive Extensions" (NuGet Ix-Main) from Microsoft's Reactive Extensions team. It has a load of useful extensions. It'll let you do this:
var rows =
from t in totals.Do(x => Utilities.Sum(x))
select t;
It's there to allow side-effects on a traversed enumerable.
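For readers who want to see the idea without installing the package, here is a rough hand-written equivalent. Tap is a hypothetical helper name chosen for this sketch, not the library's implementation, and the integer list stands in for the question's totals.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: run an action on each element as it streams past,
// then pass the element on unchanged.
IEnumerable<T> Tap<T>(IEnumerable<T> source, Action<T> action)
{
    foreach (var element in source)
    {
        action(element);
        yield return element;
    }
}

var totals = new List<int> { 1, 2, 3 };
var seen = new List<int>();

// Like any LINQ operator, the side effect is deferred until iteration.
var rows = from t in Tap(totals, x => seen.Add(x))
           select t;
int seenBeforeIteration = seen.Count;

var materialized = rows.ToList();
int seenAfterIteration = seen.Count;

Console.WriteLine($"{seenBeforeIteration} before, {seenAfterIteration} after");
```

Because the action only runs during iteration, enumerating rows twice would also run the side effect twice, which is one reason a plain foreach is often the clearer choice.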
Please read my comment on the question. The simplest way to achieve this kind of functionality is to use a query like this:
var rows = from t in totals
group t by t.name into grp
select new
{
Name = grp.Key,
Sum = grp.Sum()
};
The above query returns an IEnumerable object.
For further information, please see: 101 LINQ Samples

How to order Dates in Linq?

I am using C# and LINQ, and I have some dates of type Date.
At the moment I am using this script to order a list by start date, from the earliest to the latest.
With the following code my events are not sorted:
events.OrderBy(x => x.DateTimeStart).ToList();
return events.AsQueryable();
What could be wrong here?
events.OrderBy(x => x.DateTimeStart).ToList() creates a new list, and you don't return it.
You probably want something like
return events.OrderBy(x => x.DateTimeStart).ToList();
events.OrderBy(x => x.DateTimeStart): Declare a query that sorts events by property DateTimeStart. The query is not performed yet.
events.OrderBy(x => x.DateTimeStart).ToList();: Run the previous query. This iterates through all events, reads their DateTimeStart, sorts them and saves the result as a List, and then... discards the result, because you didn't save it anywhere. Compare it with something like this:
int a = 0;
a + 1; // computed, then thrown away
int b = a; // b is 0
return events.AsQueryable();: Here you are returning your original events instead of sorted.
You should write your code as follows:
return events.OrderBy(x => x.DateTimeStart).ToList().AsQueryable();
That version will create a static list of sorted events. If you change the events list afterwards, the result will not reflect your changes.
The second solution is:
return events.OrderBy(x => x.DateTimeStart).AsQueryable();
That version does no work yet. It just declares a way to sort events and returns it as an IQueryable. When you use the returned value later, it will contain all events in sorted order, including any you added before using it.
Store your ordered events in a variable and return this variable with AsQueryable():
var orderedEvents = events.OrderBy(x => x.DateTimeStart).ToList();
return orderedEvents.AsQueryable();
Or, if you don't need that variable, return your ordered events directly:
return events.OrderBy(x => x.DateTimeStart).ToList().AsQueryable();
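A minimal sketch of the point made in the answers above, using a plain list of DateTime values (made up for illustration) in place of the question's event objects:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var events = new List<DateTime>
{
    new DateTime(2024, 3, 1),
    new DateTime(2024, 1, 1),
    new DateTime(2024, 2, 1),
};

// The sorted list is created and immediately thrown away; events is untouched.
events.OrderBy(x => x).ToList();
DateTime firstWithoutAssignment = events[0];

// Keeping the result is what makes the sort visible.
var sorted = events.OrderBy(x => x).ToList();
DateTime firstSorted = sorted[0];

Console.WriteLine($"{firstWithoutAssignment:yyyy-MM-dd} vs {firstSorted:yyyy-MM-dd}");
```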

Trying to emulate a SQL IN statement on a series of SQL LIKE statements using LINQ

I'm trying to emulate a SQL IN statement on a series of SQL LIKE statements using LINQ (C#).
I start off with a IQueryable<user> object named Query which has not yet been 'filtered'.
I'm just starting to post onto Stack Overflow so please be patient with me :)...
ObjectQuery<user> Context = this.Context.users;
IQueryable<user> Query = (IQueryable<user>)Context;
// create a BLANK clone of the FULL list (Query)
var QueryFinal = Query.Where(i => i.ID == 0);
foreach (String Item in userFilter.Name.Contains)
{
// based on the FULL list (Query) return all records that apply to this item & then append results to the final list
QueryFinal = QueryFinal.Concat(Query.Where(i => i.Name.Contains(Item)));
}
return QueryFinal.ToList();
I thought that on each iteration, the result set returned by the Query.Where statement would be appended to the QueryFinal list, which it is, but for some reason, on each subsequent iteration, it appears to overwrite all the previous records which are supposed to be 'stored for safe-keeping' in the final list. I have also tried using .Union, but still did not get the results I was hoping for. All it seems to return is the last result set, not all of the appended result sets together. Can anyone spot what I'm doing wrong?
Because of deferred execution, when you call QueryFinal.ToList();, it will be using the last value of Item - this is a pretty common problem when doing these sorts of queries in a foreach loop.
In your case, something like this might help:
foreach (String Item in userFilter.Name.Contains)
{
string currentItem = Item;
// based on the FULL list (Query) return all records that apply to this item & then append results to the final list
QueryFinal = QueryFinal.Concat(Query.Where(i => i.Name.Contains(currentItem)));
}
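The capture problem can be reproduced in memory. One caveat worth stating: since C# 5 the foreach loop variable is a fresh variable on each iteration, so the question's exact loop only misbehaves on older compilers. To show the effect on any compiler, this sketch deliberately declares the captured variable outside the loop; the sample names and filters are invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var names = new[] { "anna", "bob", "carla" };
var filters = new[] { "a", "b" };

// Shared variable: every lambda captures the same `item` and reads it only
// when the query finally runs, after the loop has finished.
string item = "";
IEnumerable<string> shared = Enumerable.Empty<string>();
foreach (var filter in filters)
{
    item = filter;
    shared = shared.Concat(names.Where(n => n.Contains(item)));
}
var sharedResult = shared.ToList();   // both Where clauses end up using "b"

// Per-iteration copy: each lambda captures its own `currentItem`.
IEnumerable<string> copied = Enumerable.Empty<string>();
foreach (var filter in filters)
{
    string currentItem = filter;
    copied = copied.Concat(names.Where(n => n.Contains(currentItem)));
}
var copiedResult = copied.ToList();

Console.WriteLine(string.Join(", ", sharedResult));
Console.WriteLine(string.Join(", ", copiedResult));
```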
try this:
var query = this.Context.users.Where(u =>
userFilter.Name.Contains.Any(c => u.Name.Contains(c))
);
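The same shape of query can be tried against in-memory data. This sketch only demonstrates the logic in LINQ to Objects; whether a particular database provider can translate the nested Any call into SQL is not something it verifies. The sample names and fragments are invented.

```csharp
using System;
using System.Linq;

var userNames = new[] { "anna", "bob", "carla", "bobby" };
var fragments = new[] { "nn", "ob" };

// Keep a user when their name contains at least one of the fragments.
var matches = userNames
    .Where(u => fragments.Any(f => u.Contains(f)))
    .ToList();

Console.WriteLine(string.Join(", ", matches));
```

Unlike the Concat approach, a name that matches several fragments appears only once, because each user is tested a single time.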

How can I set properties on all items from a linq query with values from another object that is also pulled from a query?

I have a query pulling from a database:
List<myClass> items = new List<myClass>(from i in context
select new myClass
{
A = i.A,
B = "", // i doesn't know this, this comes from elsewhere
C = i.C
});
I also have another query doing a similar thing:
List<myClass2> otherItems = new List<myClass2>(from j in context
select new myClass2
{
A = j.A, // A is the intersection, there will only be 1 A here but many A's in items
B = j.B
});
In reality these classes are much larger and query data that is separated not only by database but by server as well. Is it possible to use a LINQ query to populate the property B for all items where items.A intersect? All of the built in LINQ predicates appear only to do aggregates, selections or bool expressions.
In my brain I had something like this, but this is all off:
items.Where(x => x.B = (otherItems.Where(z => z.A == x.A).Single().B));
Or am I being ridiculous with trying to make this work in LINQ and should just abandon it in favor of a for loop where the actual setting becomes trivial? Because of deadlines I will be resorting to the for loop (and it's probably going to end up being a lot more readable in the long run anyway), but is it possible to do this? Would an extension method be necessary to add a special predicate to allow this?
LINQ is designed for querying. If you're trying to set things, you should definitely use a loop (probably foreach). That doesn't mean you won't be able to use LINQ as part of that loop, but you shouldn't be trying to apply a side-effect within LINQ itself.
Query otherItems first and call ToDictionary() on the result (keyed by A, with B as the value). Then, when querying the database, do this:
var items = from i in context
select new myClass
{ A = i.A,
B = otherItems[i.A],
C = i.C
}
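A self-contained, in-memory sketch of the dictionary approach. Tuples stand in for myClass and myClass2, and the sample values are invented; with a real database provider the dictionary lookup may need to happen after the rows have been loaded.

```csharp
using System;
using System.Linq;

// Stand-ins for the two query results: one B per A, many items per A.
var otherItems = new[] { (A: 1, B: "first"), (A: 2, B: "second") };
var items = new[] { (A: 1, C: 10), (A: 1, C: 20), (A: 2, C: 30) };

// Index the B values by A once.
var bByA = otherItems.ToDictionary(o => o.A, o => o.B);

// Project each item together with the B that belongs to its A.
var result = items
    .Select(i => (i.A, B: bByA[i.A], i.C))
    .ToList();

foreach (var row in result)
    Console.WriteLine($"A={row.A}, B={row.B}, C={row.C}");
```

The indexer throws KeyNotFoundException if an item's A has no counterpart in otherItems; TryGetValue is the usual alternative when that can happen.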
