Which is better: Distinct before or after Select? - c#

filterCollection is a list of ints.
I am trying to create a new list of distinct items from an existing list with LINQ.
In this case, which is better:
filterCollection.Select(filterId => new FilterTable() { FilterId = filterId }).Distinct().ToList();
OR
filterCollection.Distinct().Select(filterId => new FilterTable() { FilterId = filterId }).ToList();
I am not sure which one is correct.

It depends. It depends on what filterCollection is: LINQ to Objects or LINQ to Entities (a database-driven LINQ provider). It also depends on the implementation of FilterTable, because if this class does not override Equals and GetHashCode, Distinct after the Select will not remove anything.
Since you project filterId, I assume that it's an IEnumerable<int> (or string). In that case Distinct before the Select will work and remove the duplicates. After the Select it might not work, because you create new FilterTable instances, and two separate instances are only equal by reference unless the class defines its own equality.
To make it work, implement IEquatable<FilterTable> and override Equals and GetHashCode:
public class FilterTable : IEquatable<FilterTable>
{
public int FilterId { get;set; }
public bool Equals(FilterTable other)
{
return FilterId == other?.FilterId;
}
public override bool Equals(object obj)
{
return obj is FilterTable ft && this.Equals(ft);
}
public override int GetHashCode()
{
return FilterId.GetHashCode();
}
}
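To see the difference in practice, here is a minimal sketch. PlainFilterTable is a hypothetical variant of the class without the overrides above, so its instances compare by reference; the helper names are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for FilterTable, deliberately WITHOUT
// Equals/GetHashCode overrides, so instances compare by reference.
public class PlainFilterTable
{
    public int FilterId { get; set; }
}

public static class DistinctOrderDemo
{
    // Deduplicate the ints first, then project: one object per distinct id.
    public static List<PlainFilterTable> DistinctThenSelect(IEnumerable<int> ids) =>
        ids.Distinct().Select(id => new PlainFilterTable { FilterId = id }).ToList();

    // Project first, then Distinct: every new object is a different reference,
    // so Distinct removes nothing.
    public static List<PlainFilterTable> SelectThenDistinct(IEnumerable<int> ids) =>
        ids.Select(id => new PlainFilterTable { FilterId = id }).Distinct().ToList();

    public static void Main()
    {
        var ids = new List<int> { 1, 2, 2, 3, 3, 3 };
        Console.WriteLine(DistinctThenSelect(ids).Count); // 3
        Console.WriteLine(SelectThenDistinct(ids).Count); // 6
    }
}
```

With LINQ to Objects, calling Distinct on the ints first also means fewer objects are allocated, so it is usually the simpler choice when the key is a primitive.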

Related

C# Linq .Distinct() method not working [duplicate]

This question already has answers here:
LINQ's Distinct() on a particular property
(23 answers)
Closed 20 days ago.
class Program
{
static void Main(string[] args)
{
List<Book> books = new List<Book>
{
new Book
{
Name="C# in Depth",
Authors = new List<Author>
{
new Author
{
FirstName = "Jon", LastName="Skeet"
},
new Author
{
FirstName = "Jon", LastName="Skeet"
},
}
},
new Book
{
Name="LINQ in Action",
Authors = new List<Author>
{
new Author
{
FirstName = "Fabrice", LastName="Marguerie"
},
new Author
{
FirstName = "Steve", LastName="Eichert"
},
new Author
{
FirstName = "Jim", LastName="Wooley"
},
}
},
};
var temp = books.SelectMany(book => book.Authors).Distinct();
foreach (var author in temp)
{
Console.WriteLine(author.FirstName + " " + author.LastName);
}
Console.Read();
}
}
public class Book
{
public string Name { get; set; }
public List<Author> Authors { get; set; }
}
public class Author
{
public string FirstName { get; set; }
public string LastName { get; set; }
public override bool Equals(object obj)
{
return true;
//if (obj.GetType() != typeof(Author)) return false;
//else return ((Author)obj).FirstName == this.FirstName && ((Author)obj).FirstName == this.LastName;
}
}
This is based on an example in "LINQ in Action". Listing 4.16.
This prints Jon Skeet twice. Why? I have even tried overriding Equals method in Author class. Still Distinct does not seem to work. What am I missing?
Edit:
I have added == and != operator overload too. Still no help.
public static bool operator ==(Author a, Author b)
{
return true;
}
public static bool operator !=(Author a, Author b)
{
return false;
}
LINQ Distinct is not that smart when it comes to custom objects.
All it does is look at your list and see that it has two different objects (it doesn't care that they have the same values for the member fields).
One workaround is to implement the IEquatable interface as shown here.
If you modify your Author class like so it should work.
public class Author : IEquatable<Author>
{
public string FirstName { get; set; }
public string LastName { get; set; }
public bool Equals(Author other)
{
// Guard against null so Equals(null) returns false instead of throwing.
if (other is null) return false;
return FirstName == other.FirstName && LastName == other.LastName;
}
public override int GetHashCode()
{
int hashFirstName = FirstName == null ? 0 : FirstName.GetHashCode();
int hashLastName = LastName == null ? 0 : LastName.GetHashCode();
return hashFirstName ^ hashLastName;
}
}
Try it as DotNetFiddle
The Distinct() method uses the default equality comparer. For a reference type that does not override Equals and GetHashCode, this means reference equality: it is looking for literally the same object duplicated, not different objects which contain the same values.
There is an overload which takes an IEqualityComparer, so you can specify different logic for determining whether a given object equals another.
If you want Author to normally behave like a normal object (i.e. only reference equality), but for the purposes of Distinct check equality by name values, use an IEqualityComparer. If you always want Author objects to be compared based on the name values, then override GetHashCode and Equals, or implement IEquatable.
The two members on the IEqualityComparer interface are Equals and GetHashCode. Your logic for determining whether two Author objects are equal appears to be if the First and Last name strings are the same.
public class AuthorEquals : IEqualityComparer<Author>
{
public bool Equals(Author left, Author right)
{
if((object)left == null && (object)right == null)
{
return true;
}
if((object)left == null || (object)right == null)
{
return false;
}
return left.FirstName == right.FirstName && left.LastName == right.LastName;
}
public int GetHashCode(Author author)
{
return (author.FirstName + author.LastName).GetHashCode();
}
}
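The comparer is then passed to the Distinct overload that accepts an IEqualityComparer. The sketch below restates the comparer in compact form under the illustrative name AuthorNameComparer (not from the original answer) and uses a minimal Author type without any equality overrides.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal Author type with no equality overrides of its own.
public class Author
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Illustrative comparer: two authors are equal when both names match.
public class AuthorNameComparer : IEqualityComparer<Author>
{
    public bool Equals(Author left, Author right)
    {
        if (ReferenceEquals(left, right)) return true;
        if (left is null || right is null) return false;
        return left.FirstName == right.FirstName && left.LastName == right.LastName;
    }

    // Combine the hashes of exactly the fields that Equals compares.
    public int GetHashCode(Author author)
    {
        int h1 = author.FirstName?.GetHashCode() ?? 0;
        int h2 = author.LastName?.GetHashCode() ?? 0;
        return unchecked(h1 * 397 ^ h2);
    }
}

public static class ComparerDemo
{
    public static List<Author> DistinctByName(IEnumerable<Author> authors) =>
        authors.Distinct(new AuthorNameComparer()).ToList();

    public static void Main()
    {
        var authors = new List<Author>
        {
            new Author { FirstName = "Jon", LastName = "Skeet" },
            new Author { FirstName = "Jon", LastName = "Skeet" },
        };
        Console.WriteLine(DistinctByName(authors).Count); // 1
    }
}
```

Author itself keeps its default reference equality everywhere else; only this Distinct call compares by name.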
Another solution, without implementing IEquatable, Equals and GetHashCode, is to use LINQ's GroupBy method and select the first item from each IGrouping.
var temp = books.SelectMany(book => book.Authors)
.GroupBy (y => y.FirstName + y.LastName )
.Select (y => y.First ());
foreach (var author in temp){
Console.WriteLine(author.FirstName + " " + author.LastName);
}
There is one more way to get distinct values from a list of a user-defined type:
YourList.GroupBy(i => i.Id).Select(i => i.FirstOrDefault()).ToList();
This returns one item per distinct Id.
Distinct() performs the default equality comparison on objects in the enumerable. If you have not overridden Equals() and GetHashCode(), then it uses the default implementation on object, which compares references.
The simple solution is to add a correct implementation of Equals() and GetHashCode() to all classes which participate in the object graph you are comparing (ie Book and Author).
The IEqualityComparer interface is a convenience that allows you to implement Equals() and GetHashCode() in a separate class when you don't have access to the internals of the classes you need to compare, or if you are using a different method of comparison.
You've overridden Equals(), but make sure you also override GetHashCode().
The above answers are not quite right.
Distinct, as stated on MSDN, uses the default equality comparer, EqualityComparer<T>.Default. As stated there: "The Default property checks whether type T implements the System.IEquatable<T> interface and, if so, returns an EqualityComparer<T> that uses that implementation. Otherwise, it returns an EqualityComparer<T> that uses the overrides of Object.Equals and Object.GetHashCode provided by T."
Which means you do not need IEquatable: overriding Equals is enough, as long as you also override GetHashCode so that equal objects return equal hash codes.
Another reason your code is not working is that the commented-out comparison checks FirstName against LastName.
See https://msdn.microsoft.com/library/bb348436(v=vs.100).aspx and https://msdn.microsoft.com/en-us/library/ms224763(v=vs.100).aspx
You can achieve this in several ways:
1. Implement the IEquatable interface as shown in the Enumerable.Distinct Method documentation, or see @skalb's answer in this post.
2. If your object has no unique key, you can use the GroupBy method to get a distinct list: group by all of the object's properties and then select the first object of each group.
For example, the following works for me:
var distinctList = list.GroupBy(x => new {
Name = x.Name,
Phone = x.Phone,
Email = x.Email,
Country = x.Country
}, y => y)
.Select(x => x.First())
.ToList();
The MyClass class looks like this:
public class MyClass{
public string Name{get;set;}
public string Phone{get;set;}
public string Email{get;set;}
public string Country{get;set;}
}
3. If your object has a unique key, you can group by that key alone.
For example, if the unique key is Id:
var distinctList = list.GroupBy(x => x.Id)
.Select(x => x.First())
.ToList();
You can use an extension method on List that checks uniqueness based on a computed hash.
You can also change the extension method to support IEnumerable.
Example:
public class Employee{
public string Name{get;set;}
public int Age{get;set;}
}
List<Employee> employees = new List<Employee>();
employees.Add(new Employee{Name="XYZ", Age=30});
employees.Add(new Employee{Name="XYZ", Age=30});
employees = employees.Unique(); //Gives list which contains unique objects.
Extension Method:
public static class LinqExtension
{
public static List<T> Unique<T>(this List<T> input)
{
HashSet<string> uniqueHashes = new HashSet<string>();
List<T> uniqueItems = new List<T>();
input.ForEach(x =>
{
string hashCode = ComputeHash(x);
if (uniqueHashes.Contains(hashCode))
{
return;
}
uniqueHashes.Add(hashCode);
uniqueItems.Add(x);
});
return uniqueItems;
}
// Requires: using System.Security.Cryptography; using System.Text; using Newtonsoft.Json;
private static string ComputeHash<T>(T entity)
{
// Serialize the object to JSON and hash the UTF-8 bytes of that string.
string input = JsonConvert.SerializeObject(entity);
byte[] originalBytes = Encoding.UTF8.GetBytes(input);
using (SHA1 sha = SHA1.Create())
{
byte[] encodedBytes = sha.ComputeHash(originalBytes);
return BitConverter.ToString(encodedBytes).Replace("-", "");
}
}
}
The Equals method in the code below is incomplete: it does not override Object.Equals and does not check for null.
Old
public bool Equals(Author other)
{
if (FirstName == other.FirstName && LastName == other.LastName)
return true;
return false;
}
NEW
public override bool Equals(Object obj)
{
var other = obj as Author;
if (other is null)
{
return false;
}
if (FirstName == other.FirstName && LastName == other.LastName)
return true;
return false;
}
Instead of
var temp = books.SelectMany(book => book.Authors).Distinct();
Do
var temp = books.SelectMany(book => book.Authors).DistinctBy(f => f.Property);
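As a hedged illustration of the DistinctBy approach: the sketch below assumes .NET 6 or later, where Enumerable.DistinctBy is part of System.Linq, and uses a minimal Author type with no equality overrides. The key selector and helper name are illustrative choices, not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Minimal Author type with no equality overrides.
public class Author
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class DistinctByDemo
{
    // Keeps the first author seen for each (FirstName, LastName) pair.
    // The tuple key has value equality, so Author needs no overrides.
    public static List<Author> UniqueByFullName(IEnumerable<Author> authors) =>
        authors.DistinctBy(a => (a.FirstName, a.LastName)).ToList();

    public static void Main()
    {
        var authors = new List<Author>
        {
            new Author { FirstName = "Jon", LastName = "Skeet" },
            new Author { FirstName = "Jon", LastName = "Skeet" },
            new Author { FirstName = "Steve", LastName = "Eichert" },
        };
        foreach (var author in UniqueByFullName(authors))
        {
            Console.WriteLine(author.FirstName + " " + author.LastName);
        }
    }
}
```

On older frameworks the GroupBy(...).Select(g => g.First()) pattern gives the same result.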

List contains list

Hi I have the following lists:
var CustomerName = new List<Customer>();
var DummyData = new List<Customer>();
How can I quickly check that DummyData is contained inside of CustomerName? Also performance is key as these lists might contain thousands of values.
Brute Force Method
Use LINQ's All method against the DummyData variable. With N items in CustomerName and K items in DummyData, this is O(N*K).
// If you override Equals and GetHashCode or are comparing by reference
DummyData.All(a=>CustomerName.Contains(a))
//If you compare by property
DummyData.All(a=>
CustomerName.Any(b=>
a.FirstName==b.FirstName &&
a.LastName == b.LastName
//repeat to include checks for all properties
)
);
Using a HashSet
Put CustomerName into a HashSet and use LINQ's All method again, checking whether the HashSet contains each item. It takes N steps to build the HashSet and K steps to check, so the complexity is O(N+K).
var hs = new HashSet<Customer>(CustomerName);
DummyData.All(a=>hs.Contains(a));
You will need to override Equals and GetHashCode
If you haven't overridden these two yet, you'll need to, unless you want to compare properties explicitly. Without them you cannot use the HashSet method either.
public class Customer
{
public string FirstName { get; set; }
public string LastName { get; set; }
public override bool Equals(object obj)
{
var customer = obj as Customer;
return customer != null && Equals(customer);
}
protected bool Equals(Customer other)
{
return string.Equals(FirstName, other.FirstName) && string.Equals(LastName, other.LastName);
}
public override int GetHashCode()
{
unchecked
{
return ((FirstName?.GetHashCode() ?? 0)*397) ^ (LastName?.GetHashCode() ?? 0);
}
}
}
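Putting the HashSet approach together, a minimal self-contained sketch might look like this. The ContainsAll helper and its parameter names are illustrative, and the Customer class is a compact restatement of the one above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Customer
{
    public string FirstName { get; set; }
    public string LastName { get; set; }

    public override bool Equals(object obj) =>
        obj is Customer other
        && FirstName == other.FirstName
        && LastName == other.LastName;

    public override int GetHashCode()
    {
        unchecked
        {
            return ((FirstName?.GetHashCode() ?? 0) * 397) ^ (LastName?.GetHashCode() ?? 0);
        }
    }
}

public static class ContainmentDemo
{
    // True when every candidate has an equal customer in 'all'.
    // Building the set is O(N); each lookup is O(1) on average.
    public static bool ContainsAll(IEnumerable<Customer> all, IEnumerable<Customer> candidates)
    {
        var lookup = new HashSet<Customer>(all);
        return candidates.All(lookup.Contains);
    }

    public static void Main()
    {
        var customerName = new List<Customer>
        {
            new Customer { FirstName = "Ann", LastName = "Lee" },
            new Customer { FirstName = "Bob", LastName = "Ray" },
        };
        var dummyData = new List<Customer>
        {
            new Customer { FirstName = "Bob", LastName = "Ray" },
        };
        Console.WriteLine(ContainsAll(customerName, dummyData)); // True
    }
}
```

HashSet<T> also offers IsSupersetOf, which expresses the same check in one call.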

LINQ Intersect not returning items

I have implemented a comparison class for my custom class, so that I can use Intersect on two lists (StudentList1 and StudentList2). However, when I run the following code, I don't get any results.
Student:
class CompareStudent : IEqualityComparer<Student>
{
public bool Equals(Student x, Student y)
{
if (x.Age == y.Age && x.StudentName.ToLower() == y.StudentName.ToLower())
return true;
return false;
}
public int GetHashCode(Student obj)
{
return obj.GetHashCode();
}
}
class Student
{
public int StudentId{set;get;}
public string StudentName{set;get;}
public int Age{get;set;}
public int StandardId { get; set; }
}
Main:
IList<Student> StudentList1 = new List<Student>{
new Student{StudentId=1,StudentName="faisal",Age=29,StandardId=1},
new Student{StudentId=2,StudentName="qais",Age=19,StandardId=2},
new Student{StudentId=3,StudentName="ali",Age=19}
};
IList<Student> StudentList2 = new List<Student>{
new Student{StudentId=1,StudentName="faisal",Age=29,StandardId=1},
new Student{StudentId=2,StudentName="qais",Age=19,StandardId=2},
};
var NewStdList = StudentList1.Intersect(StudentList2, new CompareStudent());
Console.ReadLine();
The problem is in your GetHashCode() method. Change it to:
public int GetHashCode(Student obj)
{
// Hash only the fields that Equals compares, and lower-case the name as Equals does.
return obj.Age ^ obj.StudentName.ToLower().GetHashCode();
}
In your current code, Equals is never called, because the current GetHashCode() returns different integers for the two objects, so there is no point in calling Equals.
If the hash code of the first object differs from that of the second, the objects are treated as not equal; only if the hash codes are the same is Equals called.
The GetHashCode I've written above is not ultimate, see What is the best algorithm for an overridden System.Object.GetHashCode? for more details on how to implement GetHashCode.
GetHashCode() is not (and cannot be) collision free, which is why the Equals method is required in the first place.
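The hash-first behaviour can be observed with a small experiment. The sketch below uses a hypothetical CountingComparer that counts Equals calls and can switch between a hash that agrees with its case-insensitive Equals and one that does not. The hash here is deliberately simplistic (first character of the name combined with the age) so that the outcome is deterministic; it is for illustration only, not a recommended hash.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Student
{
    public string StudentName { get; set; }
    public int Age { get; set; }
}

// Hypothetical comparer used only for this experiment.
public class CountingComparer : IEqualityComparer<Student>
{
    private readonly bool _consistentHash;
    public int EqualsCalls { get; private set; }

    public CountingComparer(bool consistentHash)
    {
        _consistentHash = consistentHash;
    }

    // Case-insensitive on the name, like the question's comparer.
    public bool Equals(Student x, Student y)
    {
        EqualsCalls++;
        return x.Age == y.Age
            && string.Equals(x.StudentName, y.StudentName, StringComparison.OrdinalIgnoreCase);
    }

    public int GetHashCode(Student s)
    {
        char first = s.StudentName[0];
        // Consistent: ignores case, like Equals. Inconsistent: case-sensitive.
        return _consistentHash ? char.ToLowerInvariant(first) ^ s.Age : first ^ s.Age;
    }
}

public static class HashFirstDemo
{
    // Returns the number of intersecting items for the given comparer.
    public static int CountMatches(CountingComparer comparer)
    {
        var list1 = new List<Student> { new Student { StudentName = "Faisal", Age = 29 } };
        var list2 = new List<Student> { new Student { StudentName = "faisal", Age = 29 } };
        return list1.Intersect(list2, comparer).Count();
    }

    public static void Main()
    {
        var bad = new CountingComparer(consistentHash: false);
        Console.WriteLine(CountMatches(bad) + " match(es), Equals calls: " + bad.EqualsCalls);

        var good = new CountingComparer(consistentHash: true);
        Console.WriteLine(CountMatches(good) + " match(es), Equals calls: " + good.EqualsCalls);
    }
}
```

With the inconsistent hash the two students land on different hash codes, so Equals is never consulted and the intersection is empty; with the consistent hash Equals is called and the match is found.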
You are calling GetHashCode() on the object itself, which uses the base Object implementation and returns a different value for different references. I would implement it like this:
public int GetHashCode(Student obj)
{
unchecked
{
// Lower-case the name so the hash agrees with the case-insensitive Equals.
return obj.StudentName.ToLower().GetHashCode() + obj.Age.GetHashCode();
}
}

IEquatable doesn't call Equals method

Hi, I am facing a problem with IEquatable (C#). As you can see in the following code, I have a class where I've implemented IEquatable, but its Equals method is never reached. My objective is:
I have a datetime column in my database and I would like to get distinct values by date only, ignoring the time part.
For example: 12-01-2014 23:14 would be equal to 12-01-2014 18:00.
namespace MyNamespace
{
public class MyRepository
{
public void MyMethod(int id)
{
var x = (from t in context.MyTable
where t.id == id
select new MyClassDatetime()
{
Dates = t.Date
}).Distinct().ToList();
}
}
public class MyClassDatetime : IEquatable<MyClassDatetime>
{
public DateTime? Dates { get; set; }
public bool Equals(MyClassDatetime other)
{
if (other == null) return false;
return (this.Dates.HasValue ? this.Dates.Value.ToShortDateString().Equals(other.Dates.Value.ToShortDateString()) : false);
}
public override bool Equals(object other)
{
return this.Equals(other as MyClassDatetime );
}
public override int GetHashCode()
{
int hashDate = Dates.GetHashCode();
return hashDate;
}
}
}
Do you know how I can make this work properly, or is there another way to do what I need?
Thank you!
Your implementation of GetHashCode is incorrect for the desired equality semantics. That's because it returns different hash codes for dates that you want to compare equal, which is a bug.
To fix it, change it to
public override int GetHashCode()
{
return Dates.HasValue ? Dates.Value.Date.GetHashCode() : 0;
}
You should also update Equals in the same spirit; it's not a good idea to mess with string representations of dates:
public bool Equals(MyClassDatetime other)
{
if (other == null) return false;
// If either side has no date, they are equal only when both have none.
if (Dates == null || other.Dates == null) return Dates == null && other.Dates == null;
return Dates.Value.Date == other.Dates.Value.Date;
}
Update: As usr very correctly points out, since you are using LINQ on an IQueryable, the projection and the Distinct call will be translated to a store expression and this code will still not run. To get around that, you can use an intermediate AsEnumerable call:
var x = (from t in context.MyTable
where t.id == id
select new MyClassDatetime()
{
Dates = t.Date
}).AsEnumerable().Distinct().ToList();
Thanks for the reply, but it still does not solve my problem.
I finally found a way to do it, but without using IEquatable.
var x = (from t in context.MyTable
where t.Id == id
select EntityFunctions.CreateDateTime(t.Date.Value.Year, t.Date.Value.Month,t.Date.Value.Day, 0, 0, 0)).Distinct();
=)
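For data that is already in memory, a similar effect can be had without a wrapper class, because DateTime has value equality. The sketch below is an assumption-laden illustration for LINQ to Objects only (it does not involve the database provider), and the DistinctDates helper is a made-up name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DateOnlyDistinct
{
    // Drops null entries, truncates each timestamp to midnight via
    // DateTime.Date, then removes duplicate days.
    public static List<DateTime> DistinctDates(IEnumerable<DateTime?> stamps) =>
        stamps.Where(s => s.HasValue)
              .Select(s => s.Value.Date)
              .Distinct()
              .ToList();

    public static void Main()
    {
        var stamps = new List<DateTime?>
        {
            new DateTime(2014, 1, 12, 23, 14, 0),
            new DateTime(2014, 1, 12, 18, 0, 0),
            null,
            new DateTime(2014, 1, 13, 8, 30, 0),
        };
        Console.WriteLine(DistinctDates(stamps).Count); // 2
    }
}
```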

Searching the properties of a list of a custom class

I have a custom class called CustomClass. It contains a variable called "Name" and a list of values (for the sake of simplicity let's make this an int - in reality it is another custom class, but the principle should be the same).
So :
public class CustomClass {
string name;
}
I have a List<CustomClass>.
When I attempt to add a value to this List, the logic I want, is for this List to check if it contains a CustomClass with the same name as the value I want to add.
If it does, then do x, otherwise, do y.
listOfCustomClass.Contains(customClassToAdd.name) will not work in this case, I assume, however this is the functionality I require.
What is best practice here ?
I think you can try something like var x = MyList.Where(c => c.Name == insertedName) and check whether the result contains any items (not tested).
You'll have to create a new class, let's call it CustomList, that implements IList<>, where you can write the Add method yourself, do your check, and then add the item to an inner list. Something like this:
public class CustomList<T> : IList<T> where T : CustomClass
{
private readonly List<T> innerlist = new List<T>();
public void Add(T item)
{
// Add only when no existing item has the same name.
if (!innerlist.Any(a => a.Name == item.Name))
innerlist.Add(item);
}
// ... remaining IList<T> members delegate to innerlist ...
}
You can do it using LINQ as follows, but you have to make the name field public.
List<CustomClass> list = new List<CustomClass>();
CustomClass toCheck = new CustomClass();
if (list.Any(p => p.name == toCheck.name))
{
//do x here
}
else
{
//do y here
}
However, if you don't want to use LINQ, change CustomClass as follows:
public class CustomClass
{
string name;
List<int> intList = new List<int>();
public override bool Equals(object obj)
{
return this.Equals(obj as CustomClass);
}
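public override int GetHashCode()
{
// Hash the same field that Equals compares; a constant is legal but defeats hashing.
return name == null ? 0 : name.GetHashCode();
}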
public bool Equals(CustomClass cc)
{
if (cc == null) return false;
return this.name.Equals(cc.name);
}
}
Then you can do this.
List<CustomClass> list = new List<CustomClass>();
CustomClass toCheck = new CustomClass();
if (list.Contains(toCheck))
{
//do x here
}
else
{
//do y here
}
It seems to me that you want to override the .Add() behavior of your List<CustomClass>. While you could use extension methods, I think a better solution would be to invent a class that extends List in some manner. I'd recommend implementing IList in your collection class if you need to have that level of control over add operations...
public class CustomClassList : IList<CustomClass>
{
public void Add (CustomClass item)
{
if(this.Select(t => t.Name).Contains(item.Name))
// Do whatever here...
else
// Do whatever else here...
}
// ... other IList implementations here ...
}
try this:
IList<CustomClass> list = new List<CustomClass>();
CustomClass customClass = new CustomClass();
customClass.name = "Lucas";
if((list.ToList().Find(x => x.name == customClass.name)) == null)
{
list.Add(customClass);
}
else
{
//do y;
}
You could override the Equals(object o) function in your CustomClass, so that two CustomClasses are considered equal if their names are the same. Then
listOfCustomClass.Contains(customClassToAdd);
should work.
Another way is to override Equals method on your CustomClass and then just call List.Contains()
If the name property uniquely identifies the CustomClass, then you should override Equals and GetHashCode(). List.Contains compares items with the default equality comparer, which calls Equals; without an override, that means reference equality. GetHashCode is not used by List.Contains itself, but it should be overridden together with Equals so that hash-based collections and Distinct stay consistent. Something like this:
public override int GetHashCode()
{
return this.Name.GetHashCode();
}
public override bool Equals(object obj)
{
var other = obj as CustomClass;
if (other != null)
{
if (other.Name == this.Name)
{
return true;
}
}
return false;
}
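With those overrides in place, the "do x, otherwise do y" check from the question reduces to a plain Contains call. A minimal sketch, using a hypothetical NamedItem class as a stand-in for CustomClass and an illustrative AddIfNew helper:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for CustomClass: equality is based on Name only.
public class NamedItem
{
    public string Name { get; set; }

    public override bool Equals(object obj) =>
        obj is NamedItem other && other.Name == Name;

    public override int GetHashCode() => Name?.GetHashCode() ?? 0;
}

public static class ContainsDemo
{
    // Returns true when the item was added, false when an item
    // with the same name was already in the list.
    public static bool AddIfNew(List<NamedItem> list, NamedItem item)
    {
        if (list.Contains(item))
        {
            return false; // "do x": a same-named item exists
        }
        list.Add(item);   // "do y": nothing with that name yet
        return true;
    }

    public static void Main()
    {
        var list = new List<NamedItem>();
        Console.WriteLine(AddIfNew(list, new NamedItem { Name = "Lucas" })); // True
        Console.WriteLine(AddIfNew(list, new NamedItem { Name = "Lucas" })); // False
    }
}
```

For large lists, keeping a HashSet of names alongside the list would avoid the linear scan that List.Contains performs.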
