LINQ Specify multiple conditions - c#

How do I specify more than one condition in the below LINQ?
if (Books.Select(x => x.BookName).Count() !=
Books.Select(x => x.BookName).Distinct().Count())
{
//Todo
}
I want to specify something like this:
if (Books.Select(x => x.BookName && x.price).Count() !=
Books.Select(x => x.BookName && x.price).Distinct().Count())
{
//Todo
}

You don't need a Select to get the first count, and you can use an anonymous type to get the second. This gets the number of distinct book name and price combinations:
if (Books.Count() != Books.Select(x => new { x.BookName, x.price}).Distinct().Count())
{
//Todo
}

So, to answer your question, project each item into an anonymous type.
However, to avoid iterating the source data multiple times (which is particularly problematic if Books represents a database query), and to avoid performing the projection repeatedly, you can use GroupBy instead:
bool areDuplicatedBooks = Books.GroupBy(x => new { x.BookName, x.price })
.Where(group => group.Count() > 1)
.Any();
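Both approaches can be compared side by side in a small, self-contained sketch. The Book class below is hypothetical (the question only shows the BookName and price members), and the data is in-memory rather than a database query. It relies on the fact that anonymous types compare by value, so two projections with the same BookName and price are equal.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Book type; the question only shows BookName and price.
public class Book
{
    public string BookName { get; set; }
    public decimal price { get; set; }
}

public static class DuplicateBooks
{
    // Two Count() calls: simple, but enumerates the source twice.
    public static bool HasDuplicatesByCount(IEnumerable<Book> books)
    {
        return books.Count() !=
               books.Select(x => new { x.BookName, x.price }).Distinct().Count();
    }

    // One enumeration: group on the composite key, then look for a group
    // that holds more than one book.
    public static bool HasDuplicatesByGroup(IEnumerable<Book> books)
    {
        return books.GroupBy(x => new { x.BookName, x.price })
                    .Any(group => group.Count() > 1);
    }

    public static void Main()
    {
        var books = new List<Book>
        {
            new Book { BookName = "A", price = 10m },
            new Book { BookName = "A", price = 12m }, // same name, different price
            new Book { BookName = "B", price = 10m },
        };
        Console.WriteLine(HasDuplicatesByGroup(books)); // False

        books.Add(new Book { BookName = "A", price = 10m }); // exact duplicate
        Console.WriteLine(HasDuplicatesByGroup(books)); // True
    }
}
```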

Related

How to make reusable conditions that can be used in queryables? [duplicate]

This question already has answers here:
How to reuse where clauses in Linq To Sql queries
So in a function I have a large queryable, and I apply a bunch of Where clauses to it based on other conditions.
Like in this example:
query.Where(i =>
_context.FicDernierEvt
.Where(y => y.VteAffaire == null && y.ApvAffaire == null)
.Select(y => y.IdFicheCrm)
.Contains(i.Id)
);
I have this condition _context.FicDernierEvt.Where(y => y.VteAffaire == null && y.ApvAffaire == null).Select(y => y.IdFicheCrm).Contains(i.Id) that is used a lot in my code.
I would like to avoid repeating it all across my code, so I tried to make a function:
private bool isProspect(FicFicheCrm ficheCrm){
return _context.FicDernierEvt
.Where(y => y.VteAffaire == null && y.ApvAffaire == null)
.Select(y => y.IdFicheCrm)
.Contains(ficheCrm.Id);
}
So I could use it this way:
query.Where(i => isProspect(i));
But it didn't work, since it's just not meant to be done that way.
Does anyone have an idea how to make reusable conditions like this that can be used in queryables?
My advice would be to extend LINQ with an extension method that contains your predicate.
If you are not familiar with extension methods, consider reading Extension Methods Demystified.
You forgot to tell us what type of IQueryable<...> is stored in _context.FicDernierEvt, but let's assume it is IQueryable<FicDernierEvt>. In other words, assume that _context.FicDernierEvt is a table of FicDernierEvts.
Requirement: procedure GetIdFics (TODO: invent proper name) takes as input an IQueryable<FicDernierEvt>, keeps only those ficDernierEvts that have a null value for both properties VteAffaire and ApvAffaire, and returns from every remaining ficDernierEvt the value of property IdFicheCrm (the property selected in the question).
I don't know the type of IdFicheCrm, but let's assume it is an int.
public static IQueryable<int> GetIdFics( // TODO: invent proper name
this IQueryable<FicDernierEvt> source)
{
return source.Where(ficDernierEvt => ficDernierEvt.VteAffaire == null
&& ficDernierEvt.ApvAffaire == null)
.Select(ficDernierEvt => ficDernierEvt.IdFicheCrm);
}
That's all!
Usage:
IQueryable<int> myIdFics = _context.FicDernierEvt.GetIdFics();
You say you have this Where/Select in a lot of places, so you can combine it with other operators:
var oldIdFics = _context.FicDernierEvt
.Where(ficDernierEvt => ficDernierEvt.Date.Year < 2010)
.GetIdFics();
var invalidIdFics = _context.FicDernierEvt.GetIdFics()
.Where(idFic => idFic <= 0);
You can even use it in a more complicated LINQ statement:
IQueryable<FicDernierEvt> unpaidFicDernierEvts = this.GetUnpaidFicDernierEvts();
int customerId = this.GetCustomerId(customerName);
var largestUnpaidCustomerIdFic = unpaidFicDernierEvts
.Where(unpaidEvt => unpaidEvt.CustomerId == customerId)
.GetIdFics()
.Max();
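To tie this back to the query.Where(...) from the question, here is a minimal sketch that runs against in-memory lists through AsQueryable(). The entity classes are stand-ins with assumed property types, and ProspectIds and WhereIsProspect are hypothetical names. The point is that the helper only composes IQueryable operators, so the provider receives an ordinary expression tree instead of a call to an opaque C# method. Whether a given provider translates the nested Contains into a subquery should be checked against that provider.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-ins for the entities in the question; property types are assumptions.
public class FicDernierEvt
{
    public int IdFicheCrm { get; set; }
    public int? VteAffaire { get; set; }
    public int? ApvAffaire { get; set; }
}

public class FicFicheCrm
{
    public int Id { get; set; }
}

public static class ProspectQueries
{
    // Ids of the fiches whose last event has neither a VteAffaire nor an ApvAffaire.
    public static IQueryable<int> ProspectIds(this IQueryable<FicDernierEvt> source)
    {
        return source.Where(y => y.VteAffaire == null && y.ApvAffaire == null)
                     .Select(y => y.IdFicheCrm);
    }

    // Hypothetical helper: keeps only the fiches that are prospects.
    public static IQueryable<FicFicheCrm> WhereIsProspect(
        this IQueryable<FicFicheCrm> query, IQueryable<FicDernierEvt> events)
    {
        IQueryable<int> prospectIds = events.ProspectIds();
        return query.Where(i => prospectIds.Contains(i.Id));
    }

    public static void Main()
    {
        IQueryable<FicDernierEvt> events = new List<FicDernierEvt>
        {
            new FicDernierEvt { IdFicheCrm = 1 },
            new FicDernierEvt { IdFicheCrm = 2, VteAffaire = 5 },
            new FicDernierEvt { IdFicheCrm = 3 },
        }.AsQueryable();

        IQueryable<FicFicheCrm> fiches = Enumerable.Range(1, 4)
            .Select(id => new FicFicheCrm { Id = id })
            .AsQueryable();

        var prospects = fiches.WhereIsProspect(events).Select(f => f.Id);
        Console.WriteLine(string.Join(", ", prospects)); // 1, 3
    }
}
```

With this in place the call site becomes query.WhereIsProspect(_context.FicDernierEvt), and the condition lives in one place.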

Collections manipulation, need help optimizing this code from a report generator

I'm creating a report-generating tool that uses custom data types from different sources in our system. The user can create a report schema and, depending on what is asked, the data gets associated based on different index keys, times, time ranges, etc. The project is NOT querying a relational database; it's pure C# code working on in-memory collections.
I'm having a huge performance issue. I've been looking at my code for a few days and am struggling to optimize it.
I stripped the code down to a short example of what the profiler points to as the problematic algorithm; the real version is a bit more complex, with more conditions and date handling.
In short, this function returns the subset of "values" that satisfy the conditions, depending on the keys of the values that were selected from the "index rows".
private List<LoadedDataSource> GetAssociatedValues(IReadOnlyCollection<List<LoadedDataSource>> indexRows, List<LoadedDataSource> values)
{
var checkContainers = ((ValueColumn.LinkKeys & ReportLinkKeys.ContainerId) > 0 &&
values.Any(t => t.ContainerId.HasValue));
var checkEnterpriseId = ((ValueColumn.LinkKeys & ReportLinkKeys.EnterpriseId) > 0 &&
values.Any(t => t.EnterpriseId.HasValue));
var ret = new List<LoadedDataSource>();
foreach (var value in values)
{
var valid = true;
foreach (var index in indexRows)
{
// ContainerId
var indexConservedSource = index.AsEnumerable();
if (checkContainers && index.CheckContainer && value.ContainerId.HasValue)
{
indexConservedSource = indexConservedSource.Where(t => t.ContainerId.HasValue && t.ContainerId.Value == value.ContainerId.Value);
if (!indexConservedSource.Any())
{
valid = false;
break;
}
}
//EnterpriseId
if (checkEnterpriseId && index.CheckEnterpriseId && value.EnterpriseId.HasValue)
{
indexConservedSource = indexConservedSource.Where(t => t.EnterpriseId.HasValue && t.EnterpriseId.Value == value.EnterpriseId.Value);
if (!indexConservedSource.Any())
{
valid = false;
break;
}
}
}
if (valid)
ret.Add(value);
}
return ret;
}
This works for small samples, but as soon as I have thousands of values and 2-3 index rows with a few dozen values each, it can take hours to generate.
As you can see, I try to break as soon as an index condition fails and move on to the next value.
I could probably do everything in a single "values.Where(####).ToList()", but that condition gets complex fast.
I tried generating an IQueryable around indexConservedSource, but it was even worse. I tried using a Parallel.ForEach with a ConcurrentBag for "ret", and it was also slower.
What else can be done?
What you are doing, in principle, is calculating the intersection of two sequences. You use two nested loops, which is slow because the time is O(m*n). You have two other options:
sort both sequences and merge them
convert one sequence into a hash table and test the second against it
The second approach seems better for this scenario. Just convert those index lists into HashSets and test the values against them. I added some code for inspiration:
private List<LoadedDataSource> GetAssociatedValues(IReadOnlyCollection<List<LoadedDataSource>> indexRows, List<LoadedDataSource> values)
{
var ret = values;
if ((ValueColumn.LinkKeys & ReportLinkKeys.ContainerId) > 0 &&
ret.Any(t => t.ContainerId.HasValue))
{
var indexes = indexRows
.Where(i => i.CheckContainer)
.Select(i => new HashSet<int>(i
.Where(h => h.ContainerId.HasValue)
.Select(h => h.ContainerId.Value)))
.ToList();
ret = ret.Where(v => v.ContainerId == null
|| indexes.All(i => i.Contains(v.ContainerId.Value)))
.ToList();
}
if ((ValueColumn.LinkKeys & ReportLinkKeys.EnterpriseId) > 0 &&
ret.Any(t => t.EnterpriseId.HasValue))
{
var indexes = indexRows
.Where(i => i.CheckEnterpriseId)
.Select(i => new HashSet<int>(i
.Where(h => h.EnterpriseId.HasValue)
.Select(h => h.EnterpriseId.Value)))
.ToList();
ret = ret.Where(v => v.EnterpriseId == null
|| indexes.All(i => i.Contains(v.EnterpriseId.Value)))
.ToList();
}
return ret;
}
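Stripped of the report-specific types, the core of the idea can be sketched as below. This is a simplified illustration using plain int keys, not a drop-in replacement for the method above; the class and method names are hypothetical. Each index row is converted to a HashSet once, so checking a value costs an average O(1) lookup per row instead of a scan of the row.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class HashedLookup
{
    // Keeps the values whose key is present in every index row.
    public static List<int> KeepValuesPresentInAllRows(
        IEnumerable<IEnumerable<int>> indexRows, IEnumerable<int> values)
    {
        // Build the lookups once, outside the per-value loop.
        List<HashSet<int>> lookups = indexRows
            .Select(row => new HashSet<int>(row))
            .ToList();

        // Each value is now tested with hash lookups only.
        return values.Where(v => lookups.All(set => set.Contains(v))).ToList();
    }

    public static void Main()
    {
        var indexRows = new List<List<int>>
        {
            new List<int> { 1, 2, 3 },
            new List<int> { 2, 3, 4 },
        };
        var values = new List<int> { 1, 2, 3, 4, 2 };

        List<int> kept = KeepValuesPresentInAllRows(indexRows, values);
        Console.WriteLine(string.Join(", ", kept)); // 2, 3, 2
    }
}
```

For m values and index rows holding n keys in total, this is roughly O(m + n) work rather than O(m*n).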

Sitecore: efficient way to use LINQ to compare against an ID

I have a LINQ query retrieving a list of Person items, such as this:
var results = SearchContext.GetQueryable<Person>()
.Where(i => i.Enabled)
.Where(i => i.TemplateName == "Person")
.Random(6);
Each object of type "Person" has a "Location" field which is also a Glass mapped item, and hence has an ID; I would like to only select items whose Location has a specific ID.
How can I go about doing this in an efficient manner?
EDIT: I should probably clarify that I am unable to perform this comparison, efficiently or not. Because the GUID is an object and I cannot perform ToString in a LINQ query, I am unable to only pick the items whose Location item has a specific ID. Any clues on how this could be achieved?
EDIT 2: Adding the clause
.Where(i => i.Location.Id == this.Id)
Doesn't work, for... some reason, as I'm unable to debug what LINQ "sees". If I convert the other ID I'm comparing it against to string this way:
var theOtherID = this.Id.ToString("N");
Then it works with this LINQ line:
.Where(i => i["Location"].Contains(theOtherID))
I still have no idea why.
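One plausible explanation, stated here as an assumption rather than confirmed Sitecore behavior, is that the search index stores the ID as plain text in a normalized form without braces or hyphens, so only a string in that same form matches. What is certain is how the .NET format specifiers differ, which the sketch below prints for an arbitrary example GUID:

```csharp
using System;

public static class GuidFormats
{
    public static void Main()
    {
        // Arbitrary example value, not an ID from the question.
        Guid id = Guid.Parse("0de95ae4-41ab-4d01-9eb0-67441b7c2450");

        Console.WriteLine(id.ToString("B")); // {0de95ae4-41ab-4d01-9eb0-67441b7c2450}
        Console.WriteLine(id.ToString("D")); // 0de95ae4-41ab-4d01-9eb0-67441b7c2450
        Console.WriteLine(id.ToString("N")); // 0de95ae441ab4d019eb067441b7c2450
    }
}
```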
One approach is to include a separate property on Person that is ignored by Glass mapper, but can be used in searches:
[SitecoreIgnore]
[Sitecore.ContentSearch.IndexField("location")]
public Sitecore.Data.ID LocationID { get; set; }
You can use this in your search as follows:
Sitecore.Data.ID locationId = Sitecore.Data.ID.Parse(stringOrGuid);
var results = SearchContext.GetQueryable<Person>()
.Where(i => i.Enabled)
.Where(i => i.TemplateName == "Person")
.Where(i => i.LocationID == locationId)
.Random(6);
I think the efficiency of using multiple where clauses vs. conditionals is debatable. They will likely result in the same Lucene query being performed. I would prefer readability over optimization in this instance, but that's just me.
I can't think of a more efficient method than using a simple Where statement, as in:
var results = SearchContext.GetQueryable<Person>()
.Where(i => i.Enabled && i.TemplateName == "Person" &&
i.Location != null && i.Location.Id == 1)
.Random(6);
Keep in mind that combining the conditions with && in one Where gives a single predicate instead of a chain of Where calls, although the search provider will likely translate both forms into the same query.
You could also use an Inverse Navigation Property on Location to a virtual ICollection<Person> and then be able to do this:
var results = SearchContext.GetQueryable<Location>()
.Where(i => i.Id == 1 && i.Persons.Where(p => p.Enabled && p.TemplateName == "Person").Any())
.Random(6);
The first option would still be the most efficient, because the second one uses sub-queries. But it is worth knowing you can do your search the other way.

Checking a list with null values for duplicates in C#

In C#, I can use something like:
List<string> myList = new List<string>();
if (myList.Count != myList.Distinct().Count())
{
// there are duplicates
}
to check for duplicate elements in a list. However, when there are null items in the list, this produces a false positive. I can do this using some sluggish code, but is there a concise way to check for duplicates in a list while disregarding null values?
If you're worried about performance, the following code will stop as soon as it finds the first duplicate item - all the other solutions so far require the whole input to be iterated at least once.
var hashset = new HashSet<string>();
if (myList.Where(s => s != null).Any(s => !hashset.Add(s)))
{
// there are duplicates
}
hashset.Add returns false if the item already exists in the set, and Any returns true as soon as the first true value occurs, so this will only search the input as far as the first duplicate.
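The same early-exit idea can be wrapped in a reusable extension method. This is a minimal sketch; the name HasDuplicatesIgnoringNulls is made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateCheck
{
    // Returns true as soon as a non-null item is seen for the second time.
    // Null items are skipped, so several nulls do not count as duplicates.
    public static bool HasDuplicatesIgnoringNulls<T>(this IEnumerable<T> source)
        where T : class
    {
        var seen = new HashSet<T>();
        foreach (T item in source)
        {
            if (item == null)
            {
                continue;
            }
            if (!seen.Add(item)) // Add returns false when the item is already in the set
            {
                return true;
            }
        }
        return false;
    }

    public static void Main()
    {
        var onlyNullsRepeat = new List<string> { "a", null, null, "b" };
        var realDuplicate = new List<string> { "a", null, "a" };

        Console.WriteLine(onlyNullsRepeat.HasDuplicatesIgnoringNulls()); // False
        Console.WriteLine(realDuplicate.HasDuplicatesIgnoringNulls());   // True
    }
}
```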
I'd do this differently:
Because LINQ statements are evaluated lazily, .Any will short-circuit on the first group that has more than one item, so you don't have to count every group. Note that GroupBy itself still reads the whole list before it yields its first group.
var dupes = myList
.Where(item => item != null)
.GroupBy(item => item)
.Any(g => g.Count() > 1);
if(dupes)
{
//there are duplicates
}
EDIT: http://pastebin.com/b9reVaJu Some Linqpad benchmarking that seems to conclude GroupBy with Count() is faster
EDIT 2: Rawling's answer below seems at least 5x faster than this approach!
var nonNulls = myList.Where(x => x != null);
if (nonNulls.Count() != nonNulls.Distinct().Count())
{
// there are duplicates
}
Well, two nulls are duplicates, aren't they?
Anyway, compare the list without nulls:
var denullified = myList.Where(l => l != null);
if(denullified.Count() != denullified.Distinct().Count()) ...
EDIT: my first attempt sucks because it is not deferred. Instead:
var duplicates = myList
.Where(item => item != null)
.GroupBy(item => item)
.Any(g => g.Skip(1).Any());
poorer implementation deleted.

Why does this LINQ-to-SQL query get a NotSupportedException?

The following LINQ statement:
public override List<Item> SearchListWithSearchPhrase(string searchPhrase)
{
List<string> searchTerms = StringHelpers.GetSearchTerms(searchPhrase);
using (var db = Datasource.GetContext())
{
return (from t in db.Tasks
where searchTerms.All(term =>
t.Title.ToUpper().Contains(term.ToUpper()) &&
t.Description.ToUpper().Contains(term.ToUpper()))
select t).Cast<Item>().ToList();
}
}
gives me this error:
System.NotSupportedException: Local sequence cannot be used in LINQ to SQL implementation of query operators except the Contains() operator.
Looking around it seems my only option is to get all my items first into a generic List, then do a LINQ query on that.
Or is there a clever way to rephrase the above LINQ-to-SQL statement to avoid the error?
ANSWER:
Thanks Randy, your idea helped me build the following solution. It is not elegant, but it solves the problem, and since this code will be generated, I can handle up to e.g. 20 search terms without any extra work:
public override List<Item> SearchListWithSearchPhrase(string searchPhrase)
{
List<string> searchTerms = StringHelpers.GetSearchTerms(searchPhrase);
using (var db = Datasource.GetContext())
{
switch (searchTerms.Count())
{
case 1:
return (db.Tasks
.Where(t =>
t.Title.Contains(searchTerms[0])
|| t.Description.Contains(searchTerms[0])
)
.Select(t => t)).Cast<Item>().ToList();
case 2:
return (db.Tasks
.Where(t =>
(t.Title.Contains(searchTerms[0])
|| t.Description.Contains(searchTerms[0]))
&&
(t.Title.Contains(searchTerms[1])
|| t.Description.Contains(searchTerms[1]))
)
.Select(t => t)).Cast<Item>().ToList();
case 3:
return (db.Tasks
.Where(t =>
(t.Title.Contains(searchTerms[0])
|| t.Description.Contains(searchTerms[0]))
&&
(t.Title.Contains(searchTerms[1])
|| t.Description.Contains(searchTerms[1]))
&&
(t.Title.Contains(searchTerms[2])
|| t.Description.Contains(searchTerms[2]))
)
.Select(t => t)).Cast<Item>().ToList();
default:
return null;
}
}
}
Ed, I've run into a similar situation. The code is below. The important line of code is where I set the memberList variable. See if this fits your situation. Sorry if the formatting didn't come out too well.
Randy
// Get all the members that have an ActiveDirectorySecurityId matching one in the list.
IEnumerable<Member> members = database.Members
.Where(member => activeDirectoryIds.Contains(member.ActiveDirectorySecurityId))
.Select(member => member);
// This is necessary to avoid getting a "Queries with local collections are not supported"
//error in the next query.
memberList = members.ToList<Member>();
// Now get all the roles associated with the members retrieved in the first step.
IEnumerable<Role> roles = from i in database.MemberRoles
where memberList.Contains(i.Member)
select i.Role;
Since you cannot join a local sequence with a LINQ table, the only way to translate the above query into SQL would be to create a WHERE clause with as many LIKE conditions as there are elements in the searchTerms list (concatenated with AND operators). Apparently LINQ doesn't do that automatically and throws an exception instead.
But it can be done manually by iterating through the sequence:
public override List<Item> SearchListWithSearchPhrase(string searchPhrase)
{
List<string> searchTerms = StringHelpers.GetSearchTerms(searchPhrase);
using (var db = Datasource.GetContext())
{
IQueryable<Task> taskQuery = db.Tasks.AsQueryable();
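foreach (var term in searchTerms)
{
// Copy to a local so each predicate captures its own term (required before C# 5).
var currentTerm = term.ToUpper();
taskQuery = taskQuery.Where(t => t.Title.ToUpper().Contains(currentTerm) && t.Description.ToUpper().Contains(currentTerm));
}
return taskQuery.Cast<Item>().ToList();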
}
}
Mind that the query is still executed by the DBMS as a SQL statement. The only drawback is that the searchTerms list shouldn't be too long; otherwise the produced SQL statement won't be efficient.
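The loop-based composition can be tried without a database by running it over an in-memory list through AsQueryable(). The sketch below makes several assumptions: TaskItem is a hypothetical stand-in for the Tasks table, WhereMatchesAllTerms is a made-up name, and the predicate matches a term in the title or the description, as in the asker's own switch-based solution. Each pass through the loop adds one more Where, and successive Where calls combine as AND.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a row of the Tasks table.
public class TaskItem
{
    public string Title { get; set; }
    public string Description { get; set; }
}

public static class TermSearch
{
    // Adds one Where per search term; a task must match every term.
    public static IQueryable<TaskItem> WhereMatchesAllTerms(
        this IQueryable<TaskItem> tasks, IEnumerable<string> searchTerms)
    {
        foreach (string term in searchTerms)
        {
            // Local copy so each predicate captures its own value.
            string upperTerm = term.ToUpper();
            tasks = tasks.Where(t => t.Title.ToUpper().Contains(upperTerm)
                                  || t.Description.ToUpper().Contains(upperTerm));
        }
        return tasks;
    }

    public static void Main()
    {
        IQueryable<TaskItem> tasks = new List<TaskItem>
        {
            new TaskItem { Title = "Fix login bug", Description = "Users cannot log in" },
            new TaskItem { Title = "Write docs", Description = "Login page documentation" },
            new TaskItem { Title = "Plan release", Description = "Schedule" },
        }.AsQueryable();

        var matches = tasks.WhereMatchesAllTerms(new[] { "login", "docs" })
                           .Select(t => t.Title);
        Console.WriteLine(string.Join(", ", matches)); // Write docs
    }
}
```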
