Does LINQ know how to optimize "queries"? - c#

Suppose I do something like
var Ordered = MyList.OrderBy(x => x.prop1).ThenBy(x => x.prop2);
Does MyList.OrderBy(x => x.prop1) return the filtered list, and then does it further filter that list by ThenBy(x => x.prop2)? In other words, is it equivalent to
var OrderedByProp1 = MyList.OrderBy(x => x.prop1);
var Ordered = OrderedByProp1.OrderBy(x => x.prop2);
???
Because obviously it's possible to optimize this by running a sorting algorithm with a comparator:
MyList.Sort((x, y) => x.prop1 != y.prop1 ? x.prop1.CompareTo(y.prop1) : x.prop2.CompareTo(y.prop2));
If it does do some sort of optimization and intermediate lists are not returned in the process, then how does it know how to do that? How do you write a class that optimizes chains of methods on itself? Makes no sense.

Does MyList.OrderBy(x => x.prop1) return the filtered list
No. LINQ methods (at least typically) return queries, not the results of executing those queries.
OrderBy just returns an object which, when you ask it for an item, will return the first item in the collection given a particular ordering. But until you actually ask it for a result it's not doing anything.
Note you can also get a decent idea as to what's going on by just looking at what OrderBy returns. It returns IOrderedEnumerable<T>. That interface has a method CreateOrderedEnumerable which:
Performs a subsequent ordering on the elements of an IOrderedEnumerable according to a key.
That method is what ThenBy uses to indicate that there is a subsequent ordering.
This means that you're building up all of the comparers that you want to use, from the OrderBy and every ThenBy call, before you ever need to generate a single item in the result set.
For more specifics on exactly how you can go about creating this behavior, see Jon Skeet's blog series on the subject.
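To make the deferred behavior concrete, here is a minimal sketch. The Item record and the KeySelectorCalls counter are hypothetical names introduced only for illustration; they stand in for the elements of MyList and for a way to observe when the key selector actually runs.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredOrderingDemo
{
    // Hypothetical item type standing in for the elements of MyList.
    public sealed record Item(int Prop1, int Prop2);

    // Counts how often the first key selector runs, to make deferred execution visible.
    public static int KeySelectorCalls;

    public static IOrderedEnumerable<Item> BuildQuery(IEnumerable<Item> source)
    {
        // Building the query only records the two orderings; no sorting happens here.
        return source
            .OrderBy(x => { KeySelectorCalls++; return x.Prop1; })
            .ThenBy(x => x.Prop2);
    }

    public static void Main()
    {
        var myList = new List<Item> { new Item(2, 1), new Item(1, 2), new Item(1, 1) };

        var ordered = BuildQuery(myList);
        Console.WriteLine(KeySelectorCalls); // prints 0: nothing has been enumerated yet

        // Enumerating runs one sort that compares Prop1 first and Prop2 on ties.
        foreach (var item in ordered)
            Console.WriteLine($"{item.Prop1}, {item.Prop2}");
    }
}
```

Chaining a second OrderBy instead of ThenBy would not be equivalent: it would re-sort the whole sequence with prop2 as the primary key.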

Related

Retrieving non-duplicates from 2 Collections using LINQ

Background: I have two Collections of different types of objects with different name properties (both strings). Objects in Collection1 have a field called Name, objects in Collection2 have a field called Field.
I needed to compare these 2 properties, and get items from Collection1 where there is not a match in Collection2 based on that string property (Collection1 will always have a greater or equal number of items. All items should have a matching item by Name/Field in Collection2 when finished).
The question: I've found answers using Lists and they have helped me a little (for what it's worth, I'm using Collections). I did find this answer which appears to be working for me, however I would like to convert what I've done from query syntax (if that's what it's called?) to a LINQ query. See below:
//Query for results. This code is what I'm specifically trying to convert.
var result = (from item in Collection1
where !Collection2.Any(x => x.ColumnName == item.FieldName)
select item).ToList();
//** Remove items in result from Collection1**
//...
I'm really not at all familiar with either syntax (working on it), but I think I generally understand what this is doing. I'm struggling trying to convert this to LINQ syntax though and I'd like to learn both of these options rather than some sort of nested loop.
End goal after I remove the query results from Collection1: Collection1.Count == Collection2.Count and the following is true for each item in the collection: ItemFromCollection1.Name == SomeItemFromCollection2.Field (if that makes sense...)
You can convert this to LINQ methods like this:
var result = Collection1.Where(item => !Collection2.Any(x => x.ColumnName == item.FieldName))
.ToList();
Your query finds the records in Collection1 that don't have an equivalent in Collection2. If you instead want all records in Collection1 where there is an equivalent, drop the negation:
var results=Collection1.Where(c1=>Collection2.Any(c2=>c2.Field==c1.Name));
Please note that this isn't the fastest approach, especially if there is a large number of records in Collection2, because Any rescans Collection2 for every item in Collection1. You can find ways of speeding it up through HashSets or Lookups.
If you want a list of the non-duplicate values to keep, do the following:
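As a rough sketch of the HashSet idea, the Field values can be collected once and then probed for each item of Collection1. The Item1 and Item2 records and the WithoutMatch method are hypothetical names; only the Name and Field properties come from the question.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class NonMatchingItems
{
    // Hypothetical element types; only Name and Field are taken from the question.
    public sealed record Item1(string Name);
    public sealed record Item2(string Field);

    public static List<Item1> WithoutMatch(
        IEnumerable<Item1> collection1, IEnumerable<Item2> collection2)
    {
        // Build the set of Field values once, so each lookup is O(1) on average
        // instead of a full scan of collection2.
        var fields = new HashSet<string>(collection2.Select(c2 => c2.Field));

        // Keep only the items whose Name has no counterpart in collection2.
        return collection1.Where(c1 => !fields.Contains(c1.Name)).ToList();
    }
}
```

This trades a little memory for the set against avoiding a nested scan, which tends to matter only once Collection2 is reasonably large.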
List<string> listNonDup = new List<String>{"6","1","2","4","6","5","1"};
var singles = listNonDup.GroupBy(n => n)
.Where(g => g.Count() == 1)
.Select(g => g.Key).ToList();
Yields: 2, 4, 5
If you want a list of all the duplicate values, you can do the opposite:
var duplicatesxx = listNonDup.GroupBy(s => s)
.SelectMany(g => g.Skip(1)).ToList();

Filter in linq with ID's in a List<int>

I need to filter a query so that it returns only the rows whose ID is included in a list.
if (filter.Sc.Count > 0)
socios.Where(s => filter.Sc.Contains(s.ScID));
I tried it this way, but it does not work. I also tried...
socios.Where( s => filter.Sc.All(f => f == s.ScID));
How can I do a filter like this?
socios.Where(s => filter.Sc.Contains(s.ScID));
returns a filtered query. It does not modify the query. You are ignoring the returned value. You need something like:
socios = socios.Where(s => filter.Sc.Contains(s.ScID));
but depending on the type of socios the exact syntax may be different.
In addition to needing to use the return value of your LINQ .Where(), you have a potential logic error in your second statement. The equivalent logic for a .Contains() is checking if Any of the elements pass the match criteria. In your case, the second statement would be
var filteredSocios = socios.Where( s => filter.Sc.Any(f => f == s.ScID));
Of course if you can compare object-to-object directly, the .Contains() is still adequate as long as you remember to use the return value.
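A minimal sketch of the fix, assuming socios is an IEnumerable of some Socio type. The Socio record and the Apply method are hypothetical; only ScID and the list of IDs come from the question.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SocioFilter
{
    // Hypothetical entity; only the ScID property is taken from the question.
    public sealed record Socio(int ScID, string Name);

    public static IEnumerable<Socio> Apply(IEnumerable<Socio> socios, List<int> sc)
    {
        // Where does not modify its source, so the result must be assigned back.
        if (sc.Count > 0)
            socios = socios.Where(s => sc.Contains(s.ScID));

        return socios;
    }
}
```

With an empty ID list the sequence is returned unfiltered, which mirrors the Count check in the question.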

How do I use Linq with a HashSet of Integers to pull multiple items from a list of Objects?

I have a HashSet of ID numbers, stored as integers:
HashSet<int> IDList; // Assume that this is created with a new statement in the constructor.
I have a SortedList of objects, indexed by the integers found in the HashSet:
SortedList<int,myClass> masterListOfMyClass;
I want to use the HashSet to create a List as a subset of the masterListOfMyclass.
After wasting all day trying to figure out the Linq query, I eventually gave up and wrote the following, which works:
public List<myClass> SubSet {
get {
List<myClass> xList = new List<myClass>();
foreach (int x in IDList) {
if (masterListOfMyClass.ContainsKey(x)) {
xList.Add(masterListOfMyClass[x]);
}
}
return xList;
}
private set { }
}
So, I have two questions here:
What is the appropriate Linq query? I'm finding Linq extremely frustrating to try to figure out. Just when I think I've got it, it turns around and "goes on strike".
Is a Linq query any better -- or worse -- than what I have written here?
var xList = IDList
.Where(masterListOfMyClass.ContainsKey)
.Select(x => masterListOfMyClass[x])
.ToList();
If your lists both have equally large numbers of items, you may wish to consider inverting the query (i.e. iterate through masterListOfMyClass and query IDList) since a HashSet is faster for random queries.
Edit:
It's less neat, but you could save a lookup into masterListOfMyClass with the following query, which would be a bit faster:
var xList = IDList
.Select(x => { myClass y; masterListOfMyClass.TryGetValue(x, out y); return y; })
.Where(x => x != null)
.ToList();
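The single-lookup idea can also be written as an iterator method, which avoids the null check and therefore also works if the values are value types or if null is a legitimate stored value. This is only a sketch; SubsetLookup and ValuesFor are hypothetical names.

```csharp
using System.Collections.Generic;

public static class SubsetLookup
{
    public static IEnumerable<TValue> ValuesFor<TValue>(
        IEnumerable<int> ids, SortedList<int, TValue> master)
    {
        foreach (int id in ids)
        {
            // TryGetValue performs one lookup and reports whether the key exists.
            if (master.TryGetValue(id, out var value))
                yield return value;
        }
    }
}
```

Calling SubsetLookup.ValuesFor(IDList, masterListOfMyClass).ToList() (with System.Linq imported) would then produce the same kind of list as the loop in the question.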
foreach (int x in IDList.Where(x => masterListOfMyClass.ContainsKey(x)))
{
xList.Add(masterListOfMyClass[x]);
}
This is the appropriate LINQ query for your loop.
In my view, though, LINQ will not be any more efficient here.
Here is the Linq expression:
List<myClass> xList = masterListOfMyClass
.Where(x => IDList.Contains(x.Key))
.Select(x => x.Value).ToList();
There is no big difference in performance in such a small example. LINQ is generally slower, since it also iterates under the hood. What you get with LINQ is, in my opinion, clearer code and execution that is deferred until it is needed. Not in my example, though, since I call .ToList().
Another option would be (which is intentionally the same as Sankarann's first answer)
return (
from x in IDList
where masterListOfMyClass.ContainsKey(x)
select masterListOfMyClass[x]
).ToList();
However, are you sure you want a List to be returned? Usually, when working with IEnumerable<> you should chain your calls using IEnumerable<> until the point where you actually need the data. There you can decide to e.g. loop once (use the iterator) or actually pull the data in some sort of cache using the ToList(), ToArray() etc. methods.
Also, exposing a List<> to the public implies that modifying this list has an impact on the calling class. I would leave it to the user of the property to decide to make a local copy or continue using the IEnumerable<>.
Second, as your private setter is empty, setting the 'SubSet' has no impact on the functionality. This again is confusing and I would avoid it.
An alternate (and maybe less confusing) declaration of your property might look like this:
public IEnumerable<myClass> SubSet {
get {
return from x in IDList
where masterListOfMyClass.ContainsKey(x)
select masterListOfMyClass[x];
}
}

Sort based on function

I have a method that, given 2 strings, returns a number (between 0 and 100) which represents how alike they are, with 0 meaning "not similar at all" and 100 meaning "they are the same".
Now the thing is that I have a list of County (string name, GeoRef coordinates, string Mayor) which I would like to sort based on the return value of my function...
I'm looking for something like myList.Sort(f=>MyScoreEvaluator("York",f.Name))
Can anyone tell me how to do so?
Edit1: I don't think that the method "Sort" is quite what I want... Sort compares items inside the list... I want to compare the items of the list against external info and sort the items based on that result.
The OrderBy and OrderByDescending are returning the same item order...
Edit2: Here is the code of the OrderBy I'm using: aux.OrderBy(f => StringComparisonHelper.HowAlike(f.Name, countyNameSearched));
You can use OrderBy, and re-assign your list:
list = list.OrderBy(f => MyScoreEvaluator("York", f.Name)).ToList();
You could just use OrderBy:
var sorted = list.OrderBy(f => MyScoreEvaluator("York", f.Name));
Or implement a custom comparison method:
public static int SortByName(County x, County y)
{
return x.Name.CompareTo(y.Name);
}
Usage:
list.Sort(new Comparison<County>(SortByName));
There is an OrderBy in LINQ:
var sorted = myList.OrderBy(f => MyScoreEvaluator("York", f.Name));
Or to sort in descending order:
var sortedDesc = myList.OrderByDescending(f => MyScoreEvaluator("York", f.Name));
It's very easy to use the LINQ OrderBy extension (see others' answers).
If you want to use Sort, it would be:
myList.Sort((x, y) => MyScoreEvaluator("York", x.Name)
.CompareTo(MyScoreEvaluator("York", y.Name)));
This assumes that myList is a System.Collections.Generic.List<>.
If you want the other sort direction, swap x and y on one side of the lambda arrow =>, of course.
EDIT:
Remember .Sort method on List<> modifies the same instance. The return type of Sort method is void. On the other hand, OrderBy creates a new IEnumerable<> on which you can call .ToList() to get a new list object. The old object is unchanged. You might assign the new object to the variable that held the original list. Other variables that reference the old object won't be affected by that. Example:
myList = myList.OrderBy(f => MyScoreEvaluator("York", f.Name)).ToList();
NEW EDIT:
If performance is an issue, it's not clear which of these two to use. The OrderBy method calls MyScoreEvaluator only once per item in your original list. The Sort method as presented here calls MyScoreEvaluator many more times, because it doesn't "remember" the result of each MyScoreEvaluator call (the Comparison<> delegate instance is a black box to the Sort algorithm). So if it wants to compare "Fork" and "Kork", it calls MyScoreEvaluator twice. Then afterwards, if it wants to compare "Kork" and "Yorc", it evaluates "Kork" again. On the other hand, List<>.Sort sorts in place without building a new sequence, which can make it the faster of the two when the comparison itself is cheap. Note also that List<>.Sort is an unstable sort, while OrderBy is stable.
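The difference in call counts can be observed with a small experiment. This is a sketch only: the Score method below is a hypothetical stand-in for MyScoreEvaluator (it counts shared leading characters), and the Calls counter exists purely to measure how often it runs.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EvaluatorCallCount
{
    public static int Calls;

    // Hypothetical stand-in for MyScoreEvaluator: counts its own invocations
    // and scores a name by the number of leading characters it shares with target.
    public static int Score(string target, string name)
    {
        Calls++;
        int i = 0;
        while (i < target.Length && i < name.Length && target[i] == name[i]) i++;
        return i;
    }

    public static int CallsWithOrderBy(List<string> names)
    {
        Calls = 0;
        // OrderBy computes one key per element before it starts comparing.
        var sorted = names.OrderBy(n => Score("York", n)).ToList();
        return Calls;
    }

    public static int CallsWithSort(List<string> names)
    {
        Calls = 0;
        var copy = new List<string>(names); // leave the caller's list untouched
        // Every comparison evaluates both sides again.
        copy.Sort((x, y) => Score("York", x).CompareTo(Score("York", y)));
        return Calls;
    }
}
```

If the evaluator is expensive and in-place sorting is still wanted, one option is to compute the scores once into a dictionary or a parallel array and have the comparison read from that.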

Sorting a list in C# (with various parameters)

I have a list of objects. Those objects have various fields, e.g. age and name.
Sometimes I'd like to sort the list by name and sometimes by age. Additionally, sometimes in increasing order and sometimes in decreasing order.
Now I understand that I should implement the IComparable interface in my object and implement the CompareTo method.
But how can I do this when I want to support various sorting orders?
Do I have to set the sorting order in my object, or is it somehow possible to pass the sorting order in the sort method call?
The method call can do everything; no need for a comparer:
list.Sort((x,y)=>string.Compare(x.Name,y.Name));
list.Sort((x,y)=>y.Age.CompareTo(x.Age)); // desc
list.Sort((x,y)=>x.Age.CompareTo(y.Age)); // asc
Note the second is descending, by swapping x/y in the compare.
If you're using List<T> and you want to sort the list in place, then the Sort function provides an overload that accepts a Comparison<T>. You can use this to provide different comparisons for a list.
For example, to sort on Age:
list.Sort((x, y) => x.Age.CompareTo(y.Age));
To sort on Name:
list.Sort((x, y) => string.Compare(x.Name, y.Name));
To sort in descending order, simply reverse the parameters.
Alternatively, you could use LINQ to create various queries against your list that provide the results in whatever order you like, but this won't have any effect upon the underlying list (whether that's bad or good is up to you):
var byAge = list.OrderBy(x => x.Age);
var byName = list.OrderBy(x => x.Name);
To sort in descending order, use OrderByDescending in place of OrderBy.
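To pass both the key and the direction at the call site, the Comparison<T> overload can be wrapped in a small helper. This is a sketch under assumptions: SortBy and ListSortExtensions are hypothetical names, and keys are compared with Comparer<TKey>.Default, which for strings means a culture-sensitive comparison.

```csharp
using System;
using System.Collections.Generic;

public static class ListSortExtensions
{
    // Sorts the list in place by the selected key, ascending unless descending is true.
    public static void SortBy<T, TKey>(
        this List<T> list, Func<T, TKey> keySelector, bool descending = false)
    {
        var comparer = Comparer<TKey>.Default;

        list.Sort((x, y) => descending
            ? comparer.Compare(keySelector(y), keySelector(x))  // swapped arguments reverse the order
            : comparer.Compare(keySelector(x), keySelector(y)));
    }
}
```

Usage would then look like list.SortBy(p => p.Age) or list.SortBy(p => p.Name, descending: true), so the objects themselves never need to know about the sort order.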
You can also just use LINQ to handle this:
var sortedByAge = myList.OrderBy(i => i.Age);
var sortedByName = myList.OrderBy(i => i.Name);
If you want to handle sorting in place, you can use List<T>.Sort(Comparison<T>):
// Sort by Age
myList.Sort( (l, r) => l.Age.CompareTo(r.Age) );
// Sort by Name
myList.Sort( (l, r) => l.Name.CompareTo(r.Name) );
You can sort your objects' data with LINQ, with something like this:
var query = from cust in customers
orderby cust.Age ascending
select cust;
You can also use
list.OrderByDescending(a => a.Age);
or
list.OrderBy(a => a.Age);
