OrderBy list by a nested list - c#

I have a list of ObjectA items, each of which contains a list of ObjectB items.
I'm trying to sort each ObjectB list by its (int) Id:
var sorted = objectA.OrderBy(a => a.ObjectB.OrderBy(b => b.Id)).ToList();
Of course, this doesn't work. Does anyone have any advice?

If you want to sort each ObjectB list in place (i.e. modify it), then simply use the List<T>.Sort method. You will need to specify a custom Comparison<T> delegate:
foreach (var a in objectA)
{
a.ObjectB.Sort((x, y) => x.Id - y.Id);
}
If objectA is a List<ObjectA>, then you can use the ForEach method and pass a delegate:
objectA.ForEach(a => a.ObjectB.Sort((x, y) => x.Id - y.Id));
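One caveat on the comparison delegate: subtraction is fine for small non-negative Ids, but it can overflow when values sit near the ends of the int range. Here is a minimal sketch using CompareTo instead. Item and Container are hypothetical stand-ins for ObjectB and ObjectA, which the question does not show.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for ObjectB and ObjectA from the question.
public class Item
{
    public int Id { get; set; }
}

public class Container
{
    public List<Item> Items { get; set; } = new List<Item>();
}

public static class NestedSortDemo
{
    // Sorts every inner list in place, ascending by Id.
    public static void SortInner(IEnumerable<Container> containers)
    {
        foreach (var container in containers)
        {
            // CompareTo cannot overflow, unlike x.Id - y.Id with extreme values.
            container.Items.Sort((x, y) => x.Id.CompareTo(y.Id));
        }
    }

    // Builds a small sample, sorts it, and returns the resulting Ids.
    public static List<int> Run()
    {
        var container = new Container();
        container.Items.Add(new Item { Id = int.MaxValue });
        container.Items.Add(new Item { Id = 3 });
        container.Items.Add(new Item { Id = -1 });

        SortInner(new[] { container });
        return container.Items.Select(i => i.Id).ToList(); // -1, 3, int.MaxValue
    }
}
```

With the subtraction form, comparing int.MaxValue against -1 wraps around in an unchecked context and reports the wrong order.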
If you don't want to modify your original ObjectA instances, then you will have to project each ObjectA instance into a new instance (by cloning it) and then assign sorted ObjectB lists. It would look something like (presuming that all properties have public setters):
var newList = objectA
.Select(x => new ObjectA()
{
Id = x.Id,
SomethingElse = x.SomethingElse,
ObjectB = x.ObjectB.OrderBy(b => b.Id).ToList()
})
.ToList();

You can write an extension method and use it generically. First, some helpers for Expression:
public static void SetProperty<T, B>(
this Expression<Func<T, B>> propertySelector,
T target,
B value)
{
SetObjectProperty(target, propertySelector, value);
}
public static void SetObjectProperty<T, B>(
T target,
Expression<Func<T, B>> propertySelector,
object value)
{
if (target == null)
{
throw new ArgumentNullException("target");
}
if (propertySelector == null)
{
throw new ArgumentNullException("propertySelector");
}
var memberExpression = propertySelector.Body as MemberExpression;
if (memberExpression == null)
{
throw new NotSupportedException("Cannot recognize property.");
}
var propertyInfo = memberExpression.Member as PropertyInfo;
if (propertyInfo == null)
{
throw new NotSupportedException(
"You can select property only."
+ " Currently, selected member is: "
+ memberExpression.Member);
}
propertyInfo.SetValue(target, value);
}
Then write this extension:
public static IEnumerable<TSource> OrderInnerCollection<TSource, TInner, TKey>(
this IEnumerable<TSource> source,
Expression<Func<TSource, IEnumerable<TInner>>> innerSelector,
Func<TInner, TKey> keySelector)
{
var innerSelectorDelegate = innerSelector.Compile();
foreach (var item in source)
{
var collection = innerSelectorDelegate(item);
// Materialize to a List so the value can be assigned to a List<T> property
collection = collection.OrderBy(keySelector).ToList();
innerSelector.SetProperty(item, collection);
yield return item;
}
}
And usage:
var result = objectA.OrderInnerCollection(
aObj => aObj.ObjectB,
objB => objB.Id).ToList();

Related

Expression.Call GroupBy then Select and Count()?

Using expression trees, I need to build a GroupBy in a generic way.
The static method I'm going to use is the following:
public static IQueryable<Result> GroupBySelector<TSource>(this IQueryable<TSource> source, string column)
{
//Code here
}
The Result class has two properties:
public string Value { get; set; }
public int Count { get; set; }
Basically I'd like to build the following Linq query via Expression trees:
query.GroupBy(s => s.Country).Select(p => new
{
Value = p.Key,
Count = p.Count()
}
)
How would you implement it?
Looking at:
query.GroupBy(s => s.Country).Select(p => new
{
Value = p.Key,
Count = p.Count()
}
);
To match the signature of IQueryable<Result> what you actually need here is:
query.GroupBy(s => s.Country).Select(p => new
Result{
Value = p.Key,
Count = p.Count()
}
);
Now, the Select can work with any IQueryable<IGrouping<string, TSource>> as is. It's only the GroupBy that needs us to use expression trees.
Our task here is to start with a type and a string that represents a property (that itself returns string) and create an Expression<Func<TSource, string>> that represents obtaining the value of that property.
So, let's produce the simple bit of the method first:
public static IQueryable<Result> GroupBySelector<TSource>(this IQueryable<TSource> source, string column)
{
Expression<Func<TSource, string>> keySelector = //Build tree here.
return source.GroupBy(keySelector).Select(p => new Result{Value = p.Key, Count = p.Count()});
}
Okay. Now, how do we build the tree?
We're going to need a lambda that has a parameter of type TSource:
var param = Expression.Parameter(typeof(TSource));
We're going to need to obtain the property whose name matches column:
Expression.Property(param, column);
And the only logic needed in the lambda is simply to access that property:
Expression<Func<TSource, string>> keySelector = Expression.Lambda<Func<TSource, string>>
(
Expression.Property(param, column),
param
);
Putting it all together:
public static IQueryable<Result> GroupBySelector<TSource>(this IQueryable<TSource> source, String column)
{
var param = Expression.Parameter(typeof(TSource));
Expression<Func<TSource, string>> keySelector = Expression.Lambda<Func<TSource, string>>
(
Expression.Property(param, column),
param
);
return source.GroupBy(keySelector).Select(p => new Result{Value = p.Key, Count = p.Count()});
}
About the only thing left is the exception-handling, which I normally don't include in an answer, but one part of this is worth paying attention to.
First the obvious null and empty checks:
public static IQueryable<Result> GroupBySelector<TSource>(this IQueryable<TSource> source, String column)
{
if (source == null) throw new ArgumentNullException("source");
if (column == null) throw new ArgumentNullException("column");
if (column.Length == 0) throw new ArgumentException("column");
var param = Expression.Parameter(typeof(TSource));
Expression<Func<TSource, string>> keySelector = Expression.Lambda<Func<TSource, string>>
(
Expression.Property(param, column),
param
);
return source.GroupBy(keySelector).Select(p => new Result{Value = p.Key, Count = p.Count()});
}
Now, let's consider what happens if we pass a string for column that doesn't match a property of TSource. We get an ArgumentException with the message Instance property '[Whatever you asked for]' is not defined for type '[Whatever the type is]'. That's pretty much what we want in this case, so no issue.
If however we passed a string that did identify a property but where that property wasn't of type string we'd get something like "Expression of type 'System.Int32' cannot be used for return type 'System.String'". That's not dreadful, but it's not great either. Let's be more explicit:
public static IQueryable<Result> GroupBySelector<TSource>(this IQueryable<TSource> source, String column)
{
if (source == null) throw new ArgumentNullException("source");
if (column == null) throw new ArgumentNullException("column");
if (column.Length == 0) throw new ArgumentException("column");
var param = Expression.Parameter(typeof(TSource));
var prop = Expression.Property(param, column);
if (prop.Type != typeof(string)) throw new ArgumentException("'" + column + "' identifies a property of type '" + prop.Type + "', not a string property.", "column");
Expression<Func<TSource, string>> keySelector = Expression.Lambda<Func<TSource, string>>
(
prop,
param
);
return source.GroupBy(keySelector).Select(p => new Result{Value = p.Key, Count = p.Count()});
}
If this method was internal the above would perhaps be over-kill, but if it was public the extra info would be well worth it if you came to debug it.
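To see the method in action without a database, here is a rough sketch that runs it over an in-memory list exposed through AsQueryable(). Visitor and its Country property are hypothetical, and the validation checks are left out to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public class Result
{
    public string Value { get; set; }
    public int Count { get; set; }
}

// Hypothetical source type; only the string property Country matters here.
public class Visitor
{
    public string Country { get; set; }
}

public static class GroupByExtensions
{
    public static IQueryable<Result> GroupBySelector<TSource>(
        this IQueryable<TSource> source, string column)
    {
        // Builds s => s.<column> as an expression tree.
        var param = Expression.Parameter(typeof(TSource), "s");
        var keySelector = Expression.Lambda<Func<TSource, string>>(
            Expression.Property(param, column), param);

        return source.GroupBy(keySelector)
            .Select(p => new Result { Value = p.Key, Count = p.Count() });
    }
}

public static class GroupByDemo
{
    public static List<Result> Run()
    {
        var visitors = new List<Visitor>
        {
            new Visitor { Country = "US" },
            new Visitor { Country = "FR" },
            new Visitor { Country = "US" },
        }.AsQueryable();

        // Yields one Result per country: US with Count 2, FR with Count 1.
        return visitors.GroupBySelector("Country").ToList();
    }
}
```

Against a real query provider the same expression would be translated by that provider rather than compiled in memory, so behaviour there may differ.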

Quick way to get the difference between two List<> objects

How do I get itemsToRemove to only contain "bar one", and itemsToAdd to only contain "bar five"?
I'm trying to use "Except", but obviously I'm using it incorrectly.
var oldList = new List<Foo>();
oldList.Add(new Foo(){ Bar = "bar one"});
oldList.Add(new Foo(){ Bar = "bar two"});
oldList.Add(new Foo(){ Bar = "bar three"});
oldList.Add(new Foo(){ Bar = "bar four"});
var newList = new List<Foo>();
newList.Add(new Foo(){ Bar = "bar two"});
newList.Add(new Foo(){ Bar = "bar three"});
newList.Add(new Foo(){ Bar = "bar four"});
newList.Add(new Foo(){ Bar = "bar five"});
var itemsToRemove = oldList.Except(newList); // should only contain "bar one"
var itemsToAdd = newList.Except(oldList); // should only contain "bar five"
foreach(var item in itemsToRemove){
Console.WriteLine(item.Bar + " removed");
// currently says
// bar one removed
// bar two removed
// bar three removed
// bar four removed
}
foreach(var item in itemsToAdd){
Console.WriteLine(item.Bar + " added");
// currently says
// bar two added
// bar three added
// bar four added
// bar five added
}
Except will use the default Equals and GetHashCode method of the objects in question to define "equality" for the objects, unless you provide a custom comparer (you have not). In this case, that will compare the references of the objects, not their Bar value.
One option would be to create an IEqualityComparer<Foo> that compares the Bar property, rather than references to the object itself.
public class FooComparer : IEqualityComparer<Foo>
{
public bool Equals(Foo x, Foo y)
{
if (x == null ^ y == null)
return false;
if (x == null && y == null)
return true;
return x.Bar == y.Bar;
}
public int GetHashCode(Foo obj)
{
if (obj == null)
return 0;
return obj.Bar.GetHashCode();
}
}
Another option is to create an Except method that accepts a selector to compare the values on. We can create such a method and then use that:
public static IEnumerable<TSource> ExceptBy<TSource, TKey>(
this IEnumerable<TSource> first,
IEnumerable<TSource> second,
Func<TSource, TKey> keySelector,
IEqualityComparer<TKey> comparer = null)
{
comparer = comparer ?? EqualityComparer<TKey>.Default;
var set = new HashSet<TKey>(second.Select(keySelector), comparer);
return first.Where(item => set.Add(keySelector(item)));
}
This allows us to write:
var itemsToRemove = oldList.ExceptBy(newList, foo => foo.Bar);
var itemsToAdd = newList.ExceptBy(oldList, foo => foo.Bar);
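For what it's worth, .NET 6 and later ship a built-in Enumerable.ExceptBy. Its shape differs from the hand-written one above: the second argument is a sequence of keys, not of items. A small sketch, assuming a Foo class like the one in the question (MissingFrom is a hypothetical helper name):

```csharp
using System.Collections.Generic;
using System.Linq;

public class Foo
{
    public string Bar { get; set; } = "";
}

public static class ExceptByDemo
{
    // Returns the Bar values present in `first` but absent from `second`.
    public static List<string> MissingFrom(IEnumerable<Foo> first, IEnumerable<Foo> second)
    {
        // The built-in ExceptBy takes the *keys* of the second sequence,
        // not the Foo objects themselves.
        return first
            .ExceptBy(second.Select(f => f.Bar), f => f.Bar)
            .Select(f => f.Bar)
            .ToList();
    }
}
```

Calling MissingFrom(oldList, newList) gives the removed items and MissingFrom(newList, oldList) the added ones. If a custom ExceptBy extension with a clashing signature is also in scope, one of the two would need renaming.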
Your logic is sound, but Except's default behaviour when comparing class instances is to compare references. Since you are effectively creating two lists with 8 different objects (regardless of their content), no two objects will be equal.
You can, however, use the Except overload that takes an IEqualityComparer. For example:
public class FooEqualityComparer : IEqualityComparer<Foo>
{
public bool Equals(Foo left, Foo right)
{
if(left == null && right == null) return true;
return left != null && right != null && left.Bar == right.Bar;
}
public int GetHashCode(Foo item)
{
return item != null ? item.Bar.GetHashCode() : 0;
}
}
// In your code
var comparer = new FooEqualityComparer();
var itemsToRemove = oldList.Except(newList, comparer );
var itemsToAdd = newList.Except(oldList, comparer);
This is mostly a riff on Servy's answer to give a more general approach to this:
public class PropertyEqualityComparer<TItem, TKey> : EqualityComparer<Tuple<TItem, TKey>>
{
readonly Func<TItem, TKey> _getter;
public PropertyEqualityComparer(Func<TItem, TKey> getter)
{
_getter = getter;
}
public Tuple<TItem, TKey> Wrap(TItem item) {
return Tuple.Create(item, _getter(item));
}
public TItem Unwrap(Tuple<TItem, TKey> tuple) {
return tuple.Item1;
}
public override bool Equals(Tuple<TItem, TKey> x, Tuple<TItem, TKey> y)
{
if (x.Item2 == null && y.Item2 == null) return true;
if (x.Item2 == null || y.Item2 == null) return false;
return x.Item2.Equals(y.Item2);
}
public override int GetHashCode(Tuple<TItem, TKey> obj)
{
if (obj.Item2 == null) return 0;
return obj.Item2.GetHashCode();
}
}
public static class ComparerLinqExtensions {
public static IEnumerable<TSource> Except<TSource, TKey>(this IEnumerable<TSource> first, IEnumerable<TSource> second, Func<TSource, TKey> keyGetter)
{
var comparer = new PropertyEqualityComparer<TSource, TKey>(keyGetter);
var firstTuples = first.Select(comparer.Wrap);
var secondTuples = second.Select(comparer.Wrap);
return firstTuples.Except(secondTuples, comparer)
.Select(comparer.Unwrap);
}
}
// ...
var itemsToRemove = oldList.Except(newList, foo => foo.Bar);
var itemsToAdd = newList.Except(oldList, foo => foo.Bar);
This should work fine for any class that doesn't have unusual equality semantics (i.e. where it would be incorrect to call the object.Equals() override instead of IEquatable<T>.Equals()). Notably, this will work fine for anonymous types.
This is because you're comparing objects of type Foo, and not property Bar of type string. Try:
var itemsToRemove = oldList.Select(i => i.Bar).Except(newList.Select(i => i.Bar));
var itemsToAdd = newList.Select(i => i.Bar).Except(oldList.Select(i => i.Bar));
Override Equals and GetHashCode (or implement IEquatable<Foo>) on your data objects; I think you're being bitten by reference comparison. If you change Foo to just string, your code works.
var oldList = new List<string>();
oldList.Add("bar one");
oldList.Add("bar two");
oldList.Add("bar three");
oldList.Add("bar four");
var newList = new List<string>();
newList.Add("bar two");
newList.Add("bar three");
newList.Add("bar four");
newList.Add("bar five");
var itemsToRemove = oldList.Except(newList); // should only contain "bar one"
var itemsToAdd = newList.Except(oldList); // should only contain "bar five"
foreach (var item in itemsToRemove)
{
Console.WriteLine(item + " removed");
}
foreach (var item in itemsToAdd)
{
Console.WriteLine(item + " added");
}
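If Foo has to stay a class, giving it value-based equality makes the plain Except call behave the same way as it does for strings. A minimal sketch, assuming Foo has only the Bar property shown in the question:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Foo : IEquatable<Foo>
{
    public string Bar { get; set; } = "";

    // Two Foo instances are equal when their Bar values match.
    public bool Equals(Foo other) => !(other is null) && Bar == other.Bar;

    public override bool Equals(object obj) => Equals(obj as Foo);

    // Must agree with Equals: equal objects return equal hash codes.
    public override int GetHashCode() => Bar == null ? 0 : Bar.GetHashCode();
}

public static class ValueEqualityDemo
{
    public static List<string> Run()
    {
        var oldList = new List<Foo> { new Foo { Bar = "bar one" }, new Foo { Bar = "bar two" } };
        var newList = new List<Foo> { new Foo { Bar = "bar two" }, new Foo { Bar = "bar five" } };

        // Except now compares by Bar through the default equality comparer.
        return oldList.Except(newList).Select(f => f.Bar).ToList(); // "bar one"
    }
}
```

The trade-off is that this changes equality for Foo everywhere, which may or may not be what the rest of the code expects.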

Add a LINQ or DBContext extension method to get an element if not exist then create with data in predicate (FirstOrCreate)

I'm trying to add a LINQ or DbContext extension method to get an element (FirstOrDefault) but if one does not already exist then create a new instance with data (FirstOrCreate) instead of returning null.
Is this possible?
For example:
public static class LINQExtension
{
public static TSource FirstOrCreate<TSource>(
this IEnumerable<TSource> source,
Func<TSource, bool> predicate)
{
if (source.First(predicate) != null)
{
return source.First(predicate);
}
else
{
return // ???
}
}
}
and a usage could be:
using (var db = new MsBoxContext())
{
var status = db.EntitiesStatus.FirstOrCreate(s => s.Name == "Enabled");
//Here we should get the object if we find one
//and if it doesn't exist create and return a new instance
db.Entities.Add(new Entity()
{
Name = "New Entity",
Status = status
});
}
I hope that you understand my approach.
public static class LINQExtension
{
public static TSource FirstOrCreate<TSource>(
this IQueryable<TSource> source,
Expression<Func<TSource, bool>> predicate,
Func<TSource> defaultValue)
{
return source.FirstOrDefault(predicate) ?? defaultValue();
}
}
usage
var status = db.EntitiesStatus.FirstOrCreate(s => s.Name == "Enabled",
() => new EntityStatus {Name = "Enabled"});
However you must note that this will not work quite like FirstOrDefault().
If you did the following
var listOfStuff = new List<string>() { "Enabled" };
var statuses = from item in listOfStuff
select db.EntitiesStatus.FirstOrCreate(s => s.Name == "Enabled",
() => new EntityStatus {Name = "Enabled"});
You would get O(n) hits to the database.
However I suspect if you did...
var listOfStuff = new List<string>() { "Enabled" };
var statuses = from item in listOfStuff
select db.EntitiesStatus.FirstOrDefault(s => s.Name == "Enabled")
?? new EntityStatus {Name = "Enabled"};
It is plausible it could work...
Conclusion:
Instead of implementing an extension method, the best solution is to use the ?? operator like this:
var status = db.EntitiesStatus.FirstOrDefault(s => s.Name == "Enabled") ?? new EntityStatus(){Name = "Enabled"};
I am a self-taught programmer and I am really bad at typing, so I was looking for the exact same thing. I ended up writing my own. It took a few steps and revisions before it would work with more than one property. Of course there are some limitations and I haven't fully tested it, but so far it seems to work for my purposes of keeping the records distinct in the DB and shortening the code (typing time).
public static class DataExtensions
{
public static TEntity InsertIfNotExists<TEntity>(this ObjectSet<TEntity> objectSet, Expression<Func<TEntity, bool>> predicate) where TEntity : class, new()
{
TEntity entity;
#region Check DB
entity = objectSet.FirstOrDefault(predicate);
if (entity != null)
return entity;
#endregion
//NOT in the database... Check the local context so we do not enter duplicates
#region Check Local Context
entity = objectSet.Local().AsQueryable().FirstOrDefault(predicate);
if (entity != null)
return entity;
#endregion
///********* Does NOT exist create entity *********\\\
entity = new TEntity();
// Parse Expression Tree and set properties
//Call a recursive function to get all the properties and values
var body = (BinaryExpression)((LambdaExpression)predicate).Body;
var dict = body.GetDictionary();
//Set Values on the new entity
foreach (var item in dict)
{
entity.GetType().GetProperty(item.Key).SetValue(entity, item.Value);
}
//Add the new entity to the context so it is tracked and saved later
objectSet.AddObject(entity);
return entity;
}
public static Dictionary<string, object> GetDictionary(this BinaryExpression exp)
{
//Recursive function that creates a dictionary of the properties and values from the lambda expression
var result = new Dictionary<string, object>();
if (exp.NodeType == ExpressionType.AndAlso)
{
result.Merge(GetDictionary((BinaryExpression)exp.Left));
result.Merge(GetDictionary((BinaryExpression)exp.Right));
}
else
{
result[((MemberExpression)exp.Left).Member.Name] = exp.Right.GetExpressionVaule();
}
return result;
}
public static object GetExpressionVaule(this Expression exp)
{
if (exp.NodeType == ExpressionType.Constant)
return ((ConstantExpression)exp).Value;
if (exp.Type.IsValueType)
exp = Expression.Convert(exp, typeof(object));
//Taken From http://stackoverflow.com/questions/238413/lambda-expression-tree-parsing
var accessorExpression = Expression.Lambda<Func<object>>(exp);
Func<object> accessor = accessorExpression.Compile();
return accessor();
}
public static IEnumerable<T> Local<T>(this ObjectSet<T> objectSet) where T : class
{
//Taken From http://blogs.msdn.com/b/dsimmons/archive/2009/02/21/local-queries.aspx?Redirected=true
return from stateEntry in objectSet.Context.ObjectStateManager.GetObjectStateEntries(
EntityState.Added |
EntityState.Modified |
EntityState.Unchanged)
where stateEntry.Entity != null && stateEntry.EntitySet == objectSet.EntitySet
select stateEntry.Entity as T;
}
public static void Merge<TKey, TValue>(this Dictionary<TKey, TValue> me, Dictionary<TKey, TValue> merge)
{
//Taken From http://stackoverflow.com/questions/4015204/c-sharp-merging-2-dictionaries
foreach (var item in merge)
{
me[item.Key] = item.Value;
}
}
}
Usage is as simple as:
var status = db.EntitiesStatus.InsertIfNotExists(s => s.Name == "Enabled");
The extension checks the database first. If the entity is not found, it checks the local context (so you do not add it twice). If it is still not found, it creates the entity, parses the expression tree to get the properties and values from the lambda expression, sets those values on the new entity, adds the entity to the context and returns it.
A few things to be aware of...
This does not handle all possible uses (assumes all the expressions in the lambda are ==)
The project I did this in uses an ObjectContext as opposed to a DbContext (I have not switched yet, so I don't know if this would work with DbContext. I assume it would not be difficult to change)
I am self-taught, so there may be many ways to optimize this. If you have any input please let me know.
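The parsing step can be tried on its own, without Entity Framework. The sketch below is a simplified take on the same idea, not the code above: Status, PredicateParser and Parse are hypothetical names, and it only understands `member == value` terms joined by &&.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical entity used only for this illustration.
public class Status
{
    public string Name { get; set; } = "";
    public int Code { get; set; }
}

public static class PredicateParser
{
    // Walks a predicate made only of `member == value` terms joined by &&.
    public static Dictionary<string, object> Parse<T>(Expression<Func<T, bool>> predicate)
    {
        var result = new Dictionary<string, object>();
        Collect(predicate.Body, result);
        return result;
    }

    private static void Collect(Expression node, Dictionary<string, object> result)
    {
        if (!(node is BinaryExpression bin))
            throw new NotSupportedException("Only == and && are handled: " + node);

        if (bin.NodeType == ExpressionType.AndAlso)
        {
            Collect(bin.Left, result);
            Collect(bin.Right, result);
        }
        else if (bin.NodeType == ExpressionType.Equal && bin.Left is MemberExpression member)
        {
            // Compile the right-hand side so captured variables are evaluated too.
            var boxed = Expression.Convert(bin.Right, typeof(object));
            result[member.Member.Name] = Expression.Lambda<Func<object>>(boxed).Compile()();
        }
        else
        {
            throw new NotSupportedException("Unsupported term: " + bin);
        }
    }

    public static Dictionary<string, object> Run()
    {
        var wanted = "Enabled"; // captured variable, not a constant in the tree
        return Parse<Status>(s => s.Name == wanted && s.Code == 3);
    }
}
```

Run() returns a dictionary with Name mapped to "Enabled" and Code mapped to 3. Comparisons involving enums or nullable properties typically add Convert nodes on the left-hand side, which this sketch rejects rather than unwraps.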
What about this extension, which also adds the newly created entity to the DbSet?
public static class DbSetExtensions
{
public static TEntity FirstOrCreate<TEntity>(
this DbSet<TEntity> dbSet,
Expression<Func<TEntity, bool>> predicate,
Func<TEntity> defaultValue)
where TEntity : class
{
var result = predicate != null
? dbSet.FirstOrDefault(predicate)
: dbSet.FirstOrDefault();
if (result == null)
{
result = defaultValue?.Invoke();
if (result != null)
dbSet.Add(result);
}
return result;
}
public static TEntity FirstOrCreate<TEntity>(
this DbSet<TEntity> dbSet,
Func<TEntity> defaultValue)
where TEntity : class
{
return dbSet.FirstOrCreate(null, defaultValue);
}
}
The usage with predicate:
var adminUser = DbContext.Users.FirstOrCreate(u => u.Name == "Admin", () => new User { Name = "Admin" });
or without predicate:
var adminUser = DbContext.Users.FirstOrCreate(() => new User { Name = "Admin" });

Filtering duplicates out of an IEnumerable

I have this code:
class MyObj {
int Id;
string Name;
string Location;
}
IEnumerable<MyObj> list;
I want to convert list to a dictionary like this:
list.ToDictionary(x => x.Name);
but it tells me I have duplicate keys. How can I keep only the first item for each key?
I suppose the easiest way would be to group by key and take the first element of each group:
list.GroupBy(x => x.Name).Select(g => g.First()).ToDictionary(x => x.Name);
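A slightly shorter variant folds the Select into ToDictionary by using the group key directly. This is only a sketch: it assumes MyObj exposes public properties, which the class in the question does not show, and ByName is a hypothetical helper name.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumed shape: the question's MyObj with public properties.
public class MyObj
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public string Location { get; set; } = "";
}

public static class FirstPerKeyDemo
{
    // Keeps the first item seen for each Name.
    public static Dictionary<string, MyObj> ByName(IEnumerable<MyObj> list)
    {
        return list
            .GroupBy(x => x.Name)
            .ToDictionary(g => g.Key, g => g.First());
    }
}
```

For in-memory sequences, GroupBy yields the elements of each group in source order, so First() is the earliest item with that Name.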
Or you could use Distinct if your objects implement IEquatable to compare between themselves by key:
// I'll just randomly call your object Person for this example.
class Person : IEquatable<Person>
{
public string Name { get; set; }
public bool Equals(Person other)
{
if (other == null)
return false;
return Name == other.Name;
}
public override bool Equals(object obj)
{
return Equals(obj as Person);
}
public override int GetHashCode()
{
return Name.GetHashCode();
}
}
...
list.Distinct().ToDictionary(x => x.Name);
Or if you don't want to do that (maybe because you normally want to compare for equality in a different way, so Equals is already in use) you could make a custom implementation of IEqualityComparer just for this case:
class PersonComparer : IEqualityComparer<Person>
{
public bool Equals(Person x, Person y)
{
if (x == null)
return y == null;
if (y == null)
return false;
return x.Name == y.Name;
}
public int GetHashCode(Person obj)
{
return obj.Name.GetHashCode();
}
}
...
list.Distinct(new PersonComparer()).ToDictionary(x => x.Name);
You could also create your own Distinct extension overload method that accepted a Func<> for choosing the distinct key:
public static class EnumerationExtensions
{
public static IEnumerable<TSource> Distinct<TSource,TKey>(
this IEnumerable<TSource> source, Func<TSource,TKey> keySelector)
{
var comparer = new KeyComparer<TSource,TKey>(keySelector);
return source.Distinct(comparer);
}
private class KeyComparer<TSource,TKey> : IEqualityComparer<TSource>
{
private readonly Func<TSource,TKey> keySelector;
public KeyComparer(Func<TSource,TKey> keySelector)
{
this.keySelector = keySelector;
}
public bool Equals(TSource a, TSource b)
{
if (a == null && b == null) return true;
if (a == null || b == null) return false;
// == is not defined for an unconstrained TKey, so use the default comparer
return EqualityComparer<TKey>.Default.Equals(keySelector(a), keySelector(b));
}
public int GetHashCode(TSource obj)
{
if (obj == null) return 0;
var key = keySelector(obj);
return key == null ? 0 : key.GetHashCode();
}
}
}
Apologies for any bad code formatting, I wanted to reduce the size of the code on the page. Anyway, you can then use ToDictionary:
var dictionary = list.Distinct(x => x.Name).ToDictionary(x => x.Name);
Could make your own perhaps? For example:
public static class Extensions
{
public static IDictionary<TKey, TValue> ToDictionary2<TKey, TValue>(
this IEnumerable<TValue> subjects, Func<TValue, TKey> keySelector)
{
var dictionary = new Dictionary<TKey, TValue>();
foreach(var subject in subjects)
{
var key = keySelector(subject);
if(!dictionary.ContainsKey(key))
dictionary.Add(key, subject);
}
return dictionary;
}
}
var dictionary = list.ToDictionary2(x => x.Name);
Haven't tested it, but should work. (and it should probably have a better name than ToDictionary2 :p)
Alternatively, you can implement a DistinctBy method, for example like this:
public static IEnumerable<TSubject> DistinctBy<TSubject, TValue>(this IEnumerable<TSubject> subjects, Func<TSubject, TValue> valueSelector)
{
var set = new HashSet<TValue>();
foreach(var subject in subjects)
if(set.Add(valueSelector(subject)))
yield return subject;
}
var dictionary = list.DistinctBy(x => x.Name).ToDictionary(x => x.Name);
The problem here is that the ToDictionary extension method does not support multiple values with the same key. One solution is to write a version which does and use that instead.
public static Dictionary<TKey,TValue> ToDictionaryAllowDuplicateKeys<TKey,TValue>(
this IEnumerable<TValue> values,
Func<TValue,TKey> keyFunc) {
var map = new Dictionary<TKey,TValue>();
foreach ( var cur in values ) {
var key = keyFunc(cur);
map[key] = cur;
}
return map;
}
Now converting to a dictionary is straightforward:
var map = list.ToDictionaryAllowDuplicateKeys(x => x.Name);
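One thing to keep in mind: assigning through the indexer overwrites earlier entries, so the last item with a given key is the one that survives, whereas the question asks for the first. The sketch below puts the two policies side by side. DictionaryBuilders, LastWins and FirstWins are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryBuilders
{
    // Indexer assignment: a later item replaces an earlier one with the same key.
    public static Dictionary<TKey, TValue> LastWins<TKey, TValue>(
        IEnumerable<TValue> values, Func<TValue, TKey> keyFunc)
    {
        var map = new Dictionary<TKey, TValue>();
        foreach (var cur in values)
        {
            map[keyFunc(cur)] = cur;
        }
        return map;
    }

    // Guarded Add: the first item seen for a key is kept.
    public static Dictionary<TKey, TValue> FirstWins<TKey, TValue>(
        IEnumerable<TValue> values, Func<TValue, TKey> keyFunc)
    {
        var map = new Dictionary<TKey, TValue>();
        foreach (var cur in values)
        {
            var key = keyFunc(cur);
            if (!map.ContainsKey(key))
            {
                map.Add(key, cur);
            }
        }
        return map;
    }
}
```

For the words "apple", "avocado", "banana" keyed by first letter, LastWins maps 'a' to "avocado" while FirstWins maps it to "apple".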
The following will work if you have different instances of MyObj with the same value for the Name property. It will take the first instance found for each duplicate (sorry for the obj - obj2 notation, it is just sample code):
list.SelectMany(obj => new MyObj[] {list.Where(obj2 => obj2.Name == obj.Name).First()}).Distinct();
EDIT: Joren's solution is better as it does not create unnecessary arrays in the process.

Distinct() with lambda?

Right, so I have an enumerable and wish to get distinct values from it.
Using System.Linq, there's, of course, an extension method called Distinct. In the simple case, it can be used with no parameters, like:
var distinctValues = myStringList.Distinct();
Well and good, but if I have an enumerable of objects for which I need to specify equality, the only available overload is:
var distinctValues = myCustomerList.Distinct(someEqualityComparer);
The equality comparer argument must be an instance of IEqualityComparer<T>. I can do this, of course, but it's somewhat verbose and, well, kludgy.
What I would have expected is an overload that would take a lambda, say a Func<T, T, bool>:
var distinctValues = myCustomerList.Distinct((c1, c2) => c1.CustomerId == c2.CustomerId);
Anyone know if some such extension exists, or some equivalent workaround? Or am I missing something?
Alternatively, is there a way of specifying an IEqualityComparer inline (embarrass me)?
Update
I found a reply by Anders Hejlsberg to a post in an MSDN forum on this subject. He says:
The problem you're going to run into is that when two objects compare
equal they must have the same GetHashCode return value (or else the
hash table used internally by Distinct will not function correctly).
We use IEqualityComparer because it packages compatible
implementations of Equals and GetHashCode into a single interface.
I suppose that makes sense.
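The point in that quote can be reproduced with a small experiment. The sketch below uses a hypothetical Customer type with an extra Serial field, and two comparers that share the same Equals but hash differently. As far as I know, the hash-based set behind Distinct only calls Equals when hash codes match, so the inconsistent comparer never gets the chance to report a duplicate.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type; Serial exists only to give each instance a distinct hash.
public class Customer
{
    public int CustomerId { get; set; }
    public int Serial { get; set; }
}

// Equal by CustomerId but hashed by an unrelated field: breaks the contract.
public sealed class InconsistentComparer : IEqualityComparer<Customer>
{
    public bool Equals(Customer x, Customer y) => x.CustomerId == y.CustomerId;
    public int GetHashCode(Customer c) => c.Serial;
}

// Equal by CustomerId and hashed by CustomerId: honours the contract.
public sealed class ConsistentComparer : IEqualityComparer<Customer>
{
    public bool Equals(Customer x, Customer y) => x.CustomerId == y.CustomerId;
    public int GetHashCode(Customer c) => c.CustomerId;
}

public static class HashContractDemo
{
    // Returns how many "distinct" customers each comparer reports.
    public static (int Inconsistent, int Consistent) CountDistinct()
    {
        var list = new List<Customer>
        {
            new Customer { CustomerId = 7, Serial = 1 },
            new Customer { CustomerId = 7, Serial = 2 },
        };

        return (list.Distinct(new InconsistentComparer()).Count(),
                list.Distinct(new ConsistentComparer()).Count());
    }
}
```

With the inconsistent comparer both customers are kept even though Equals says they match. With the consistent one only a single customer remains.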
IEnumerable<Customer> filteredList = originalList
.GroupBy(customer => customer.CustomerId)
.Select(group => group.First());
It looks to me like you want DistinctBy from MoreLINQ. You can then write:
var distinctValues = myCustomerList.DistinctBy(c => c.CustomerId);
Here's a cut-down version of DistinctBy (no nullity checking and no option to specify your own key comparer):
public static IEnumerable<TSource> DistinctBy<TSource, TKey>
(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
HashSet<TKey> knownKeys = new HashSet<TKey>();
foreach (TSource element in source)
{
if (knownKeys.Add(keySelector(element)))
{
yield return element;
}
}
}
To wrap things up: I think most people who came here, like me, want the simplest possible solution without using any libraries and with the best possible performance.
(I think the accepted GroupBy method is overkill in terms of performance.)
Here is a simple extension method using the IEqualityComparer interface, which also works for null values.
Usage:
var filtered = taskList.DistinctBy(t => t.TaskExternalId).ToArray();
Extension Method Code
public static class LinqExtensions
{
public static IEnumerable<T> DistinctBy<T, TKey>(this IEnumerable<T> items, Func<T, TKey> property)
{
GeneralPropertyComparer<T, TKey> comparer = new GeneralPropertyComparer<T,TKey>(property);
return items.Distinct(comparer);
}
}
public class GeneralPropertyComparer<T,TKey> : IEqualityComparer<T>
{
private Func<T, TKey> expr { get; set; }
public GeneralPropertyComparer (Func<T, TKey> expr)
{
this.expr = expr;
}
public bool Equals(T left, T right)
{
var leftProp = expr.Invoke(left);
var rightProp = expr.Invoke(right);
if (leftProp == null && rightProp == null)
return true;
else if (leftProp == null ^ rightProp == null)
return false;
else
return leftProp.Equals(rightProp);
}
public int GetHashCode(T obj)
{
var prop = expr.Invoke(obj);
return (prop==null)? 0:prop.GetHashCode();
}
}
Shorthand solution
myCustomerList.GroupBy(c => c.CustomerId, (key, c) => c.FirstOrDefault());
No, there is no such extension method overload for this. I've found this frustrating myself in the past, and as such I usually write a helper class to deal with this problem. The goal is to convert a Func<T,T,bool> to an IEqualityComparer<T>.
Example
public class EqualityFactory {
private sealed class Impl<T> : IEqualityComparer<T> {
private readonly Func<T,T,bool> m_del;
public Impl(Func<T,T,bool> del) {
m_del = del;
}
public bool Equals(T left, T right) {
return m_del(left, right);
}
public int GetHashCode(T value) {
// A constant hash stays consistent with any Equals delegate,
// at the cost of Distinct falling back to pairwise comparisons.
return 0;
}
}
public static IEqualityComparer<T> Create<T>(Func<T,T,bool> del) {
return new Impl<T>(del);
}
}
This allows you to write the following
var distinctValues = myCustomerList
.Distinct(EqualityFactory.Create<Customer>((c1, c2) => c1.CustomerId == c2.CustomerId));
Here's a simple extension method that does what I need...
public static class EnumerableExtensions
{
public static IEnumerable<TKey> Distinct<T, TKey>(this IEnumerable<T> source, Func<T, TKey> selector)
{
return source.GroupBy(selector).Select(x => x.Key);
}
}
It's a shame they didn't bake a distinct method like this into the framework, but hey ho.
This will do what you want but I don't know about performance:
var distinctValues =
from cust in myCustomerList
group cust by cust.CustomerId
into gcust
select gcust.First();
At least it's not verbose.
From .NET 6 onwards, there is a new built-in method, Enumerable.DistinctBy, to achieve this.
var distinctValues = myCustomerList.DistinctBy(c => c.CustomerId);
// With IEqualityComparer
var distinctValues = myCustomerList.DistinctBy(c => c.CustomerId, someEqualityComparer);
Something I have used which worked well for me.
/// <summary>
/// A class to wrap the IEqualityComparer interface into matching functions for simple implementation
/// </summary>
/// <typeparam name="T">The type of object to be compared</typeparam>
public class MyIEqualityComparer<T> : IEqualityComparer<T>
{
/// <summary>
/// Create a new comparer based on the given Equals and GetHashCode methods
/// </summary>
/// <param name="equals">The method to compute equals of two T instances</param>
/// <param name="getHashCode">The method to compute a hashcode for a T instance</param>
public MyIEqualityComparer(Func<T, T, bool> equals, Func<T, int> getHashCode)
{
if (equals == null)
throw new ArgumentNullException("equals", "Equals parameter is required for all MyIEqualityComparer instances");
EqualsMethod = equals;
GetHashCodeMethod = getHashCode;
}
/// <summary>
/// Gets the method used to compute equals
/// </summary>
public Func<T, T, bool> EqualsMethod { get; private set; }
/// <summary>
/// Gets the method used to compute a hash code
/// </summary>
public Func<T, int> GetHashCodeMethod { get; private set; }
bool IEqualityComparer<T>.Equals(T x, T y)
{
return EqualsMethod(x, y);
}
int IEqualityComparer<T>.GetHashCode(T obj)
{
if (GetHashCodeMethod == null)
return obj.GetHashCode();
return GetHashCodeMethod(obj);
}
}
All solutions I've seen here rely on selecting an already comparable field. If one needs to compare in a different way, though, this solution here seems to work generally, for something like:
somedoubles.Distinct(new LambdaComparer<double>((x, y) => Math.Abs(x - y) < double.Epsilon)).Count()
Another way:
var distinctValues = myCustomerList.
Select(x => x._myCaustomerProperty).Distinct();
The resulting sequence contains the distinct values of the property '_myCaustomerProperty' (the property values themselves, not the customer objects).
You can use LambdaEqualityComparer:
var distinctValues
= myCustomerList.Distinct(new LambdaEqualityComparer<OurType>((c1, c2) => c1.CustomerId == c2.CustomerId));
public class LambdaEqualityComparer<T> : IEqualityComparer<T>
{
public LambdaEqualityComparer(Func<T, T, bool> equalsFunction)
{
_equalsFunction = equalsFunction;
}
public bool Equals(T x, T y)
{
return _equalsFunction(x, y);
}
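public int GetHashCode(T obj)
{
// Distinct() only calls Equals on elements whose hash codes match, so return a
// constant; obj.GetHashCode() would keep "equal" objects in different buckets.
return 0;
}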
private readonly Func<T, T, bool> _equalsFunction;
}
You can use InlineComparer:
public class InlineComparer<T> : IEqualityComparer<T>
{
//private readonly Func<T, T, bool> equalsMethod;
//private readonly Func<T, int> getHashCodeMethod;
public Func<T, T, bool> EqualsMethod { get; private set; }
public Func<T, int> GetHashCodeMethod { get; private set; }
public InlineComparer(Func<T, T, bool> equals, Func<T, int> hashCode)
{
if (equals == null) throw new ArgumentNullException("equals", "Equals parameter is required for all InlineComparer instances");
EqualsMethod = equals;
GetHashCodeMethod = hashCode;
}
public bool Equals(T x, T y)
{
return EqualsMethod(x, y);
}
public int GetHashCode(T obj)
{
if (GetHashCodeMethod == null) return obj.GetHashCode();
return GetHashCodeMethod(obj);
}
}
Usage sample:
var comparer = new InlineComparer<DetalleLog>((i1, i2) => i1.PeticionEV == i2.PeticionEV && i1.Etiqueta == i2.Etiqueta, i => i.PeticionEV.GetHashCode() + i.Etiqueta.GetHashCode());
var peticionesEV = listaLogs.Distinct(comparer).ToList();
Assert.IsNotNull(peticionesEV);
Assert.AreNotEqual(0, peticionesEV.Count);
Source:
https://stackoverflow.com/a/5969691/206730
Using IEqualityComparer for Union
Can I specify my explicit type comparator inline?
If Distinct() doesn't produce unique results, try this one:
var filteredWC = tblWorkCenter.GroupBy(cc => cc.WCID_I).Select(grp => grp.First()).Select(cc => new Model.WorkCenter { WCID = cc.WCID_I }).OrderBy(cc => cc.WCID);
ObservableCollection<Model.WorkCenter> WorkCenter = new ObservableCollection<Model.WorkCenter>(filteredWC);
A tricky way to do this is to use the Aggregate() extension, with a dictionary as the accumulator and the key-property values as its keys:
var customers = new List<Customer>();
var distincts = customers.Aggregate(new Dictionary<int, Customer>(),
(d, e) => { d[e.CustomerId] = e; return d; },
d => d.Values);
A GroupBy-style alternative uses ToLookup():
var distincts = customers.ToLookup(c => c.CustomerId).Select(g => g.First());
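The two snippets above differ in which duplicate survives: the dictionary indexer overwrites on a repeated key, so the Aggregate() version keeps the last element per key, while ToLookup(...) followed by First() keeps the first. A small sketch of that difference, assuming a minimal stand-in Customer class (the real type is not shown in the answer) and hypothetical helper names:

```csharp
using System.Collections.Generic;
using System.Linq;

// Minimal stand-in for the Customer type used above (an assumption).
public class Customer
{
    public int CustomerId { get; set; }
    public string Name { get; set; }
}

public static class KeepWhichDemo
{
    // The dictionary indexer overwrites on a repeated key: last element wins.
    public static List<Customer> KeepLast(IEnumerable<Customer> customers)
    {
        return customers.Aggregate(
            new Dictionary<int, Customer>(),
            (d, e) => { d[e.CustomerId] = e; return d; },
            d => d.Values.ToList());
    }

    // Each lookup group preserves source order: first element wins.
    public static List<Customer> KeepFirst(IEnumerable<Customer> customers)
    {
        return customers.ToLookup(c => c.CustomerId)
                        .Select(g => g.First())
                        .ToList();
    }
}
```

Which behavior is wanted depends on whether later records should replace earlier ones.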
IEnumerable lambda extension:
public static class ListExtensions
{
public static IEnumerable<T> Distinct<T>(this IEnumerable<T> list, Func<T, int> hashCode)
{
Dictionary<int, T> hashCodeDic = new Dictionary<int, T>();
list.ToList().ForEach(t =>
{
var key = hashCode(t);
if (!hashCodeDic.ContainsKey(key))
hashCodeDic.Add(key, t);
});
return hashCodeDic.Select(kvp => kvp.Value);
}
}
Usage:
class Employee
{
public string Name { get; set; }
public int EmployeeID { get; set; }
}
//Add 5 employees to List
List<Employee> lst = new List<Employee>();
Employee e = new Employee { Name = "Shantanu", EmployeeID = 123456 };
lst.Add(e);
lst.Add(e);
Employee e1 = new Employee { Name = "Adam Warren", EmployeeID = 823456 };
lst.Add(e1);
//Add an extra space in the Name
Employee e2 = new Employee { Name = "Adam  Warren", EmployeeID = 823456 };
lst.Add(e2);
//Name is different case
Employee e3 = new Employee { Name = "adam warren", EmployeeID = 823456 };
lst.Add(e3);
//Distinct (without IEqualityComparer<T>) - Returns 4 employees
var lstDistinct1 = lst.Distinct();
//Lambda Extension - Return 2 employees
var lstDistinct = lst.Distinct(employee => employee.EmployeeID.GetHashCode() ^ employee.Name.ToUpper().Replace(" ", "").GetHashCode());
The Microsoft System.Interactive package has a version of Distinct that takes a key selector lambda. This is effectively the same as Jon Skeet's solution, but it may be helpful for people to know, and to check out the rest of the library.
Here's how you can do it:
public static class Extensions
{
public static IEnumerable<T> MyDistinct<T, V>(this IEnumerable<T> query,
Func<T, V> f,
Func<IGrouping<V,T>,T> h=null)
{
if (h==null) h=(x => x.First());
return query.GroupBy(f).Select(h);
}
}
This method allows you to use it by specifying one parameter like .MyDistinct(d => d.Name), but it also allows you to specify a having condition as a second parameter like so:
var myQuery = (from x in _myObject select x).MyDistinct(d => d.Name,
x => x.FirstOrDefault(y=>y.Name.Contains("1") || y.Name.Contains("2"))
);
N.B. This also lets you pass other selection functions, for example .LastOrDefault(...).
If you want to expose just the condition, you can have it even simpler by implementing it as:
public static IEnumerable<T> MyDistinct2<T, V>(this IEnumerable<T> query,
Func<T, V> f,
Func<T,bool> h=null
)
{
if (h == null) h = (y => true);
return query.GroupBy(f).Select(x=>x.FirstOrDefault(h));
}
In this case, the query would just look like:
var myQuery2 = (from x in _myObject select x).MyDistinct2(d => d.Name,
y => y.Name.Contains("1") || y.Name.Contains("2")
);
N.B. Here, the expression is simpler, but note .MyDistinct2 uses .FirstOrDefault(...) implicitly.
Note: The examples above use the following demo class:
class MyObject
{
public string Name;
public string Code;
}
private MyObject[] _myObject = {
new MyObject() { Name = "Test1", Code = "T"},
new MyObject() { Name = "Test2", Code = "Q"},
new MyObject() { Name = "Test2", Code = "T"},
new MyObject() { Name = "Test5", Code = "Q"}
};
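One thing to watch with the condition-based variants: FirstOrDefault returns null for any group in which no element matches, so the result can contain nulls. With the demo data above, the "Test5" group has no name containing "1" or "2". A hedged sketch of one way around this, using a hypothetical MyDistinctWhere (not part of the answer) that filters before grouping so that keys without a match are skipped:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same shape as the demo class above.
public class MyObject
{
    public string Name;
    public string Code;
}

public static class DistinctWhereExtensions
{
    // Hypothetical variant of MyDistinct2: one element per key, taken from the
    // elements that satisfy `predicate`; keys with no matching element are
    // skipped rather than yielding null.
    public static IEnumerable<T> MyDistinctWhere<T, V>(
        this IEnumerable<T> query,
        Func<T, V> keySelector,
        Func<T, bool> predicate)
    {
        return query.Where(predicate)
                    .GroupBy(keySelector)
                    .Select(g => g.First());
    }
}

public static class DistinctWhereDemo
{
    public static readonly MyObject[] Data =
    {
        new MyObject { Name = "Test1", Code = "T" },
        new MyObject { Name = "Test2", Code = "Q" },
        new MyObject { Name = "Test2", Code = "T" },
        new MyObject { Name = "Test5", Code = "Q" }
    };

    // Names of the distinct objects whose name contains "1" or "2".
    public static List<string> Names()
    {
        return Data
            .MyDistinctWhere(d => d.Name, y => y.Name.Contains("1") || y.Name.Contains("2"))
            .Select(d => d.Name)
            .ToList();
    }
}
```

Here DistinctWhereDemo.Names() yields "Test1" and "Test2", with no null entry for "Test5".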
I'm assuming you have an IEnumerable<T>, and in your example delegate, you would like c1 and c2 to be referring to two elements in this list?
I believe you could achieve this with a self join:
var distinctResults = from c1 in myList
join c2 in myList on <your equality conditions>
I found this to be the easiest solution.
public static IEnumerable<TSource> DistinctBy<TSource, TKey>
(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
return source.GroupBy(keySelector).Select(x => x.FirstOrDefault());
}
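On .NET 6 and later, an equivalent Enumerable.DistinctBy ships in System.Linq, so a hand-written helper like the one above is only needed on older frameworks. A minimal sketch of the built-in method, assuming a hypothetical Person class and helper name:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for illustration.
public class Person
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class BuiltInDistinctByDemo
{
    // Requires .NET 6 or later. Enumerable.DistinctBy yields one element per key;
    // in practice that is the first element seen with each key.
    public static List<string> DistinctNamesById(IEnumerable<Person> people)
    {
        return people.DistinctBy(p => p.Id)
                     .Select(p => p.Name)
                     .ToList();
    }
}
```

If a custom DistinctBy extension with the same signature is also in scope, the call may become ambiguous, so it is usually simplest to keep only one of the two.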
