LINQ - Distinct is ignored? - c#

I have a problem with my LINQ code: I need to select a distinct data set, so I implemented the following IEqualityComparer:
public class ProjectRoleComparer : IEqualityComparer<ProjectUserRoleMap>
{
    public bool Equals(ProjectUserRoleMap x, ProjectUserRoleMap y)
    {
        return x.RoleID.Equals(y.RoleID);
    }
    public int GetHashCode(ProjectUserRoleMap obj)
    {
        return obj.GetHashCode();
    }
}
In this context, I wish to retrieve a bunch of ProjectUserRoleMap objects related to a given Project, identified by its ID. I only want one ProjectUserRoleMap per unique RoleID, but my instruction to perform a distinct select on the RoleID is ignored. I have no idea why this is the case, and I do not understand LINQ well enough to think of a workaround. Here is the calling code:
ProjectRoleComparer prCom = new ProjectRoleComparer();
IEnumerable<ProjectUserRoleMap> roleList = ProjectData.AllProjectUserRoleMap.Where(x => x.ProjectID == id).Distinct(prCom);
This code gives me 6 entries, when the number of entries I know I want is just 4. Am I doing something wrong with my usage of LINQ?
For reference, the ProjectUserRoleMap object has a RoleID of type int.

Your implementation of GetHashCode is wrong. Return obj.RoleID.GetHashCode();
Background:
Code that consumes an IEqualityComparer<T> usually first compares the hash codes of two objects. Equals is called only if those hash codes are the same.
It is implemented like this, because two unequal objects can have the same hash key, but two equal objects never can have different hash keys - if GetHashCode() is implemented correctly.
This knowledge is used to improve the efficiency and performance of the comparison as implementations of GetHashCode are supposed to be fast, cheap operations.
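As a minimal sketch of the point above, the following program hashes the same field that Equals compares, so Distinct can group the duplicates. The ProjectUserRoleMap class here is a stand-in that models only ProjectID and RoleID, and the sample data is invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the question's entity; only the members used here are modeled.
public class ProjectUserRoleMap
{
    public int ProjectID { get; set; }
    public int RoleID { get; set; }
}

public class ProjectRoleComparer : IEqualityComparer<ProjectUserRoleMap>
{
    public bool Equals(ProjectUserRoleMap x, ProjectUserRoleMap y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.RoleID == y.RoleID;
    }

    // Hash the same field that Equals compares, so equal items share a hash code.
    public int GetHashCode(ProjectUserRoleMap obj) => obj.RoleID.GetHashCode();
}

public static class Program
{
    public static void Main()
    {
        // Invented sample data: project 1 has two rows with RoleID 1 and one with RoleID 2.
        var maps = new List<ProjectUserRoleMap>
        {
            new ProjectUserRoleMap { ProjectID = 1, RoleID = 1 },
            new ProjectUserRoleMap { ProjectID = 1, RoleID = 1 },
            new ProjectUserRoleMap { ProjectID = 1, RoleID = 2 },
            new ProjectUserRoleMap { ProjectID = 2, RoleID = 3 },
        };

        var distinctRoles = maps
            .Where(m => m.ProjectID == 1)
            .Distinct(new ProjectRoleComparer())
            .ToList();

        Console.WriteLine(distinctRoles.Count); // prints 2
    }
}
```

With the original obj.GetHashCode() implementation, each instance would typically report a different hash code, so Equals would rarely be consulted and the duplicates would survive.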

Try:
public int GetHashCode(ProjectUserRoleMap obj)
{
    return obj.RoleID.GetHashCode();
}

Related

C# Hashset.Contains with custom EqualityComparer never calls GetHashCode()

I have a very large (hundreds of thousands) hashset of Customer objects in my database. Then I get a newly imported hashset of customer objects and have to check for every new object, if it is contained in the existing hashset. Performance is very important.
I cannot use the default EqualityComparer, as the objects need to be compared based on only three properties. Also, I can't override the Equals and GetHashCode functions of the Customer class for other reasons. So I aimed for a custom EqualityComparer (I tried implementing IEqualityComparer and inheriting from EqualityComparer and overriding as you see below, both with the same end result).
public class CustomerComparer : EqualityComparer<Customer>
{
    public CustomerComparer() { }
    public override bool Equals(Customer x, Customer y)
    {
        return x != null &&
               y != null &&
               x.Name == y.Name &&
               x.Description == y.Description &&
               x.AdditionalInfo == y.AdditionalInfo;
    }
    public override int GetHashCode(Customer obj)
    {
        var hashCode = -1885141022;
        hashCode = hashCode * -1521134295 + EqualityComparer<string>.Default.GetHashCode(obj.Name);
        hashCode = hashCode * -1521134295 + EqualityComparer<string>.Default.GetHashCode(obj.Description);
        hashCode = hashCode * -1521134295 + EqualityComparer<string>.Default.GetHashCode(obj.AdditionalInfo);
        return hashCode;
    }
}
Now to my problem: When I use the default EqualityComparer, generally only the GetHashCode method of Customer is called and the performance for my use case is very good (1-2 seconds). When I use my custom EqualityComparer, the GetHashCode method is never called but always the Equals method. The performance for my use case is horrible (hours). See code below:
public void FilterImportedCustomers(ISet<Customer> dataBase, IEnumerable<Customer> imported)
{
    var equalityComparer = new CustomerComparer();
    foreach (var obj in imported)
    {
        // great performance, always calls Customer.GetHashCode
        if (!dataBase.Contains(obj))
        {
            //...
        }
        // awful performance, only calls CustomerComparer.Equals
        if (!dataBase.Contains(obj, equalityComparer))
        {
            //...
        }
    }
}
Does anyone have an idea, how I can solve this problem? That would be amazing, I'm really stuck trying to solve this huge performance problem.
EDIT :
I solved it by passing my EqualityComparer when initializing the hashset, using the constructor overload that takes an IEqualityComparer: var database = new HashSet<Customer>(new CustomerComparer())
Thank you, guys!
I solved it by passing my EqualityComparer when initializing the hashset! I used the constructor overload that takes an IEqualityComparer, so var database = new HashSet<Customer>(new CustomerComparer())
Thanks to Lee and NetMage who commented under my original post.
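A short sketch of why the fix works: the two-argument Contains(obj, comparer) is the LINQ extension Enumerable.Contains, which walks the whole sequence and calls Equals on each element, whereas the set's own one-argument Contains uses the comparer supplied to the HashSet constructor and looks up by hash code. The Customer class and the sample values below are assumptions made for illustration, and the hash function is a simple hand-rolled combination rather than the one from the question.

```csharp
using System;
using System.Collections.Generic;

// Stand-in for the question's Customer; only the three compared properties are modeled.
public class Customer
{
    public string Name { get; set; }
    public string Description { get; set; }
    public string AdditionalInfo { get; set; }
}

public class CustomerComparer : IEqualityComparer<Customer>
{
    public bool Equals(Customer x, Customer y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.Name == y.Name
            && x.Description == y.Description
            && x.AdditionalInfo == y.AdditionalInfo;
    }

    public int GetHashCode(Customer obj)
    {
        unchecked // let the arithmetic wrap around instead of throwing
        {
            int hash = 17;
            hash = hash * 31 + (obj.Name?.GetHashCode() ?? 0);
            hash = hash * 31 + (obj.Description?.GetHashCode() ?? 0);
            hash = hash * 31 + (obj.AdditionalInfo?.GetHashCode() ?? 0);
            return hash;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // The comparer is given to the set itself, so Add and Contains both hash with it.
        var database = new HashSet<Customer>(new CustomerComparer())
        {
            new Customer { Name = "Ann", Description = "A", AdditionalInfo = "x" },
        };

        // A different instance with the same three property values.
        var imported = new Customer { Name = "Ann", Description = "A", AdditionalInfo = "x" };

        Console.WriteLine(database.Contains(imported)); // prints True
    }
}
```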

xUnit Equal two same collections, same order, same types, returns false

Shouldn't xUnit's Assert.Equal method return true if two collections are equal, meaning they have the same objects in the same order?
Example:
var result = new List<item>()
{
    new item()
    {
        TypeId = typesEnum.Integer,
        Code = "code"
    },
    new item()
    {
        TypeId = typesEnum.Integer,
        Code = "code2"
    }
};
Assert.Equal(expectedResult, result) fails, even though I have exactly the same list in expectedResult. I checked it one by one: every property, every type, everything. When I write my own IEqualityComparer and compare every single property of the item class in it, the result is true, but the default comparer returns false. Is that how it's supposed to work in xUnit? If so, how do I check whether two collections are equivalent?
Custom comparer looks like this:
internal class ItemComparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y)
    {
        return x.Code.Equals(y.Code) && x.TypeId.Equals(y.TypeId);
    }
    public int GetHashCode(Item obj)
    {
        return obj.GetHashCode();
    }
}
Here's a link to a similar question:
CLICK
And the answer is that it should work like I think it should without having to write my own comparer. The question is why doesn't it?
I use xUnit 2.4.1
This is not an xUnit issue or quirk; it is how C# works. Two distinct instances of a class that does not override Equals are equivalent but not equal, even when they hold the same values: the default comparison for reference types checks whether both references point to the same object. You can compare the two using == or .Equals() inside an if statement, and it will return false.
This answer explains the subject very well.
What you could do:
Override Equals and GetHashCode (or overload the comparison operators) on the class, then assert on the comparison with Assert.True();
Use a library that provides comparison by equivalency, like FluentAssertions: object1.Should().BeEquivalentTo(object2);
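The first option can be sketched as follows. This is a hypothetical Item class (the property names come from the question, the enum members are invented) that implements IEquatable<Item>, so the default equality comparer sees two instances with matching values as equal. The example uses Enumerable.SequenceEqual rather than xUnit to stay self-contained; xUnit's default comparison should pick up the same Equals implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum TypesEnum { Integer, Text } // enum members are assumptions for this sketch

public class Item : IEquatable<Item>
{
    public TypesEnum TypeId { get; set; }
    public string Code { get; set; }

    // Value-based equality: two items are equal when both properties match.
    public bool Equals(Item other) =>
        !(other is null) && TypeId == other.TypeId && Code == other.Code;

    public override bool Equals(object obj) => Equals(obj as Item);

    // Must agree with Equals: equal items produce equal hash codes.
    public override int GetHashCode()
    {
        unchecked
        {
            return ((int)TypeId * 397) ^ (Code?.GetHashCode() ?? 0);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var expected = new List<Item>
        {
            new Item { TypeId = TypesEnum.Integer, Code = "code" },
            new Item { TypeId = TypesEnum.Integer, Code = "code2" },
        };
        var actual = new List<Item>
        {
            new Item { TypeId = TypesEnum.Integer, Code = "code" },
            new Item { TypeId = TypesEnum.Integer, Code = "code2" },
        };

        // Same values in the same order, but different object instances.
        Console.WriteLine(expected.SequenceEqual(actual)); // prints True
    }
}
```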

LINQ to Entity Any() with related Object Collection

First, let me say that I've researched this problem and read the following Stack Overflow questions, but none of them really addresses this situation.
How can I use Linq to join between objects and entities?
inner join in linq to entities
Situation
I have two classes
public class Section
{
    public string SchoolId { get; set; }
    public string CourseId { get; set; }
    public string SectionId { get; set; }
}
public class RelatedItem
{
    public string SchoolId { get; set; }
    public string CourseId { get; set; }
    public string SectionId { get; set; }
    //..
}
I have an array of Section objects coming from one source; it is an actual in-memory collection of objects.
RelatedItem I'm getting via a LINQ to Entities call against a DbContext.
My goal is to get all of the RelatedItems based on the Sections I have from the other source.
I'm writing a query like this
Section[] mySections = GetSections(); //Third Party Source
IQueryable<RelatedItem> relatedItems = DbContext.RelatedItems
.Where(r=>
mySections.Any(s=> s.SchoolId == r.SchoolId &&
s.CourseId == r.CourseId &&
s.SectionId == r.SectionId)
);
Problem
At runtime, I receive the following error
Unable to create a constant value of type
'ProjectNamespace.Section'. Only primitive types or
enumeration types are supported in this context.
I found a workaround, shown below, but it doesn't take advantage of any of my table indexes.
var sectionIds = mySections.Select(s=>string.Concat(s.SchoolId, "|",s.CourseId, "|",s.SectionId));
IQueryable<RelatedItem> relatedItems = DbContext.RelatedItems
.Where(r=>
sectionIds.Contains(string.Concat(r.SchoolId, "|",r.CourseId, "|",r.SectionId))
);
This block of code works, and currently is pretty fast (but this is dev, and my record count is small). Aside from converting my related items to a collection in memory, does anyone have any other suggestions?
Try using Contains instead:
Section[] mySections = GetSections(); //Third Party Source
IQueryable<RelatedItem> relatedItems = DbContext.RelatedItems.Where(r=>
mySections.Select(s => s.SchoolId).Contains(r.SchoolId) &&
mySections.Select(s => s.CourseId).Contains(r.CourseId) &&
mySections.Select(s => s.SectionId).Contains(r.SectionId)
);
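Contains should translate to WHERE ... IN clauses in SQL. Note that three independent Contains checks match any combination of the listed IDs, not only the exact (SchoolId, CourseId, SectionId) triples in mySections, so the result can include extra rows.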
This won't work if using .NET 3.5 and LINQ to Entities, as it wasn't implemented in that version.
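To illustrate how three independent Contains checks differ from matching whole triples, here is a small in-memory sketch. It uses plain lists as a stand-in for DbContext.RelatedItems (an assumption made so the example runs without a database), and the sample IDs are invented.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Section
{
    public string SchoolId { get; set; }
    public string CourseId { get; set; }
    public string SectionId { get; set; }
}

public class RelatedItem
{
    public string SchoolId { get; set; }
    public string CourseId { get; set; }
    public string SectionId { get; set; }
}

public static class Program
{
    public static void Main()
    {
        var mySections = new[]
        {
            new Section { SchoolId = "S1", CourseId = "C1", SectionId = "A" },
            new Section { SchoolId = "S2", CourseId = "C2", SectionId = "B" },
        };

        // In-memory stand-in for DbContext.RelatedItems (an assumption for this demo).
        var relatedItems = new List<RelatedItem>
        {
            new RelatedItem { SchoolId = "S1", CourseId = "C1", SectionId = "A" }, // exact triple
            new RelatedItem { SchoolId = "S1", CourseId = "C2", SectionId = "B" }, // mixed triple
            new RelatedItem { SchoolId = "S3", CourseId = "C1", SectionId = "A" }, // unknown school
        };

        // Primitive ID lists, the kind of values a query provider can turn into IN clauses.
        var schoolIds = mySections.Select(s => s.SchoolId).ToList();
        var courseIds = mySections.Select(s => s.CourseId).ToList();
        var sectionIds = mySections.Select(s => s.SectionId).ToList();

        // Three independent checks: also accepts the mixed triple.
        var loose = relatedItems
            .Where(r => schoolIds.Contains(r.SchoolId)
                     && courseIds.Contains(r.CourseId)
                     && sectionIds.Contains(r.SectionId))
            .ToList();

        // Second pass in memory keeps only rows whose triple really exists.
        var exact = loose
            .Where(r => mySections.Any(s => s.SchoolId == r.SchoolId
                                         && s.CourseId == r.CourseId
                                         && s.SectionId == r.SectionId))
            .ToList();

        Console.WriteLine(loose.Count); // prints 2
        Console.WriteLine(exact.Count); // prints 1
    }
}
```

One possible design is therefore to let the database narrow the rows with the three IN clauses and then apply the exact triple match to the much smaller in-memory result.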
The proper way to solve this is to implement IEquatable. Here is an example of how to do it: Does LINQ to Entities support IEquatable in a 'where' clause predicate?
One tip when implementing Equals() and GetHashCode(): do not call any .NET methods (like GetType()); compare only the primitives SchoolId, CourseId and SectionId. It should then get converted to an expression tree and work just fine.

Remove one list from another mvc

I have two lists of the same type and I am trying to subtract the information in one list from the other and then save the result into the model.
I have tried two ways of doing it and so far I can't get either to work:
These are the two lists:
List<ApplicationsDetailsModel> AppList = ctx.Database.SqlQuery<ApplicationsDetailsModel>("exec get_applications_r").ToList();
var AppExceptionList = new List<ApplicationsDetailsModel>();
foreach(var g in AnIrrelevantList)
{
AppExceptionList.Add(new ApplicationsDetailsModel()
{
AppNum = g.AppNum,
AppName = g.AppName
});
}
So they now both have different data in the same format.
model.AppList = AppList.Except(AppExceptionList).ToList();
This doesn't bring up any errors but it also doesn't subtract the second list from the first.
var onlyInFirst = AppList.RemoveAll(a => AppExceptionList.Any(b => AppList == AppExceptionList));
I got this idea from this question.
Anyone know where I am going wrong?
The instances are not the same and are therefore not found to be equal by Except, since it checks for reference equality (which is never going to hold for separately created objects). For your situation you need to write a custom equality comparer... I've taken a stab at it here...
public class ApplicationsDetailsModelEqualityComparer : IEqualityComparer<ApplicationsDetailsModel>
{
    public bool Equals(ApplicationsDetailsModel x, ApplicationsDetailsModel y)
    {
        return x.AppNum == y.AppNum && x.AppName == y.AppName;
    }
    public int GetHashCode(ApplicationsDetailsModel obj)
    {
        int hashCode = (obj.AppName != null ? obj.AppName.GetHashCode() : 0);
        hashCode = (hashCode * 397) ^ obj.AppNum.GetHashCode();
        return hashCode;
    }
}
Usage...
model.AppList = AppList.Except(AppExceptionList, new ApplicationsDetailsModelEqualityComparer()).ToList();
Note that I'm assuming your AppNum and AppName together uniquely identify your objects in your list.
The Except method doesn't know how to compare two objects of type ApplicationsDetailsModel. You need to tell it explicitly, using an IEqualityComparer:
public class ApplicationsDetailsModelComparer : IEqualityComparer<ApplicationsDetailsModel>
{
    public bool Equals(ApplicationsDetailsModel first, ApplicationsDetailsModel second)
    {
        return first.AppNum == second.AppNum;
    }
    public int GetHashCode(ApplicationsDetailsModel applicationsDetailsModel)
    {
        return applicationsDetailsModel.AppNum.GetHashCode();
    }
}
Then, you use it like this :
model.AppList = AppList.Except(AppExceptionList, new ApplicationsDetailsModelComparer ()).ToList();
If AppNum isn't a unique value in your collection (like a primary key), feel free to adapt the comparer class to your needs.
jgauffin's answer on the question that you linked to sums it up:
http://stackoverflow.com/a/13361682/89092
Except requires that Equals and GetHashCode are implemented in the traversed class.
The problem is that the Except method does not know how to compare instances of ApplicationsDetailsModel.
You should implement GetHashCode in ApplicationsDetailsModel so that instances that should be considered equal return the same hash code.
You should implement Equals in ApplicationsDetailsModel, comparing the same fields that GetHashCode uses, to return whether or not the instances should be considered "Equal". It is probably best to do this by implementing the IEquatable interface: http://msdn.microsoft.com/en-us/library/ms131187(v=vs.110).aspx
When you perform these steps, the Except method will work as expected.
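The steps above can be sketched like this. The model below is a simplified stand-in with only AppNum and AppName; the int type for AppNum and the sample values are assumptions, since the original class is not shown.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for the question's model; AppNum is assumed to be an int.
public class ApplicationsDetailsModel : IEquatable<ApplicationsDetailsModel>
{
    public int AppNum { get; set; }
    public string AppName { get; set; }

    public bool Equals(ApplicationsDetailsModel other) =>
        !(other is null) && AppNum == other.AppNum && AppName == other.AppName;

    public override bool Equals(object obj) => Equals(obj as ApplicationsDetailsModel);

    // Built from the same fields as Equals so equal models share a hash code.
    public override int GetHashCode()
    {
        unchecked
        {
            return ((AppName?.GetHashCode() ?? 0) * 397) ^ AppNum;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var appList = new List<ApplicationsDetailsModel>
        {
            new ApplicationsDetailsModel { AppNum = 1, AppName = "A" },
            new ApplicationsDetailsModel { AppNum = 2, AppName = "B" },
            new ApplicationsDetailsModel { AppNum = 3, AppName = "C" },
        };
        var appExceptionList = new List<ApplicationsDetailsModel>
        {
            new ApplicationsDetailsModel { AppNum = 2, AppName = "B" }, // separate instance, same values
        };

        // No comparer argument needed: Except picks up the IEquatable implementation.
        var remaining = appList.Except(appExceptionList).ToList();

        Console.WriteLine(string.Join(",", remaining.Select(a => a.AppNum))); // prints 1,3
    }
}
```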

Retrieved Dictionary Key Not Found

I have a SortedDictionary declared like so:
SortedDictionary<MyObject,IMyInterface> dict = new SortedDictionary<MyObject,IMyInterface>();
When it's populated with values, if I grab any key from the dictionary and then try to reference it immediately after, I get a KeyNotFoundException:
MyObject myObj = dict.Keys.First();
var value = dict[myObj]; // This line throws a KeyNotFoundException
As I hover over the dictionary (after the error) with the debugger, I can clearly see the same key that I tried to reference is in fact contained in the dictionary. I'm populating the dictionary using a ReadOnlyCollection of MyObjects. Perhaps something odd is happening there? I tried overriding the == operator and Equals methods to get the explicit comparison I wanted, but no such luck. That really shouldn't matter since I'm actually getting a key directly from the Dictionary then querying the Dictionary using that same key. I can't figure out what's causing this. Has anyone ever seen this behavior?
EDIT 1
In overriding Equals I also overrode GetHashCode (as MS recommends). Here's the implementation of MyObject for anyone interested:
public class MyObject
{
    public string UserName { get; set; }
    public UInt64 UserID { get; set; }
    public override bool Equals(object obj)
    {
        if (obj == null || GetType() != obj.GetType())
        {
            return false;
        }
        // Return true if the fields match:
        return this.Equals((MyObject)obj);
    }
    public bool Equals(MyObject other)
    {
        // Return true if the fields match
        return this.UserID == other.UserID;
    }
    public override int GetHashCode()
    {
        return (int)this.UserID;
    }
    public static bool operator ==(MyObject a, MyObject b)
    {
        // If both are null, or both are same instance, return true.
        if (System.Object.ReferenceEquals(a, b))
        {
            return true;
        }
        // If one is null, but not both, return false.
        if (((object)a == null) || ((object)b == null))
        {
            return false;
        }
        // Return true if the fields match:
        return a.UserID == b.UserID;
    }
    public static bool operator !=(MyObject a, MyObject b)
    {
        return !(a == b);
    }
}
What I noticed from debugging is that if I add a quick watch (after the KeyNotFoundException is thrown) for the expression:
dict.ElementAt(0).Key == value;
it returns true. How can this be?
EDIT 2
So the problem ended up being that SortedDictionary (and Dictionary as well) is not thread-safe. There was a background thread performing some operations on the dictionary, which seem to have triggered a re-sort of the collection (adding items to the collection would do this). At the same time, when the dictionary iterated through the values to find my key, the collection was being changed and it was not finding my key even though it was there.
Sorry for all of you who asked for code on this one, I'm currently debugging an application that I inherited and I didn't realize this was going on on a timed, background thread. As such, I thought I copied and pasted all the relevant code, but I didn't realize there was another thread running behind everything manipulating the collection.
It appears that the problem ended up being that SortedDictionary is not thread-safe. There was a background thread performing some operations on the dictionary (adding items to the collection), which seems to have triggered a re-sort of the collection. At the same time, when the dictionary was attempting to iterate through the values to find my key, the collection was being changed and re-sorted, rendering the enumerator invalid, and it was not finding my key even though it was there.
I have a suspicion - it's possible that you're changing the UserID of the key after insertion. For example, this would demonstrate the problem:
var key = new MyObject { UserID = 10 };
var dictionary = new Dictionary<MyObject, string>();
dictionary[key] = "foo";
key.UserID = 20; // This will change the hash code
var value = dictionary[key]; // Bang!
You shouldn't change properties involved in equality/hash-code considerations for an object which is being used as the key in a hash-based collection. Ideally, change your code so that this can't be changed: make UserID readonly, initialized on construction.
The above definitely would cause a problem - but it's possible that it's not the same as the problem you're seeing, of course.
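A runnable version of that scenario, as a sketch: the MyObject class here is a trimmed stand-in that keeps only UserID with the same hash code rule as the question, and it uses ContainsKey instead of the indexer so the program prints the outcome rather than throwing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed stand-in for the question's MyObject: equality and hash depend on UserID only.
public class MyObject
{
    public ulong UserID { get; set; }

    public override bool Equals(object obj) =>
        obj is MyObject other && other.UserID == UserID;

    public override int GetHashCode() => (int)UserID;
}

public static class Program
{
    public static void Main()
    {
        var key = new MyObject { UserID = 10 };
        var dictionary = new Dictionary<MyObject, string> { [key] = "foo" };

        Console.WriteLine(dictionary.ContainsKey(key)); // prints True

        // Mutating the key changes its hash code, but the entry stays filed under the old one.
        key.UserID = 20;

        Console.WriteLine(dictionary.ContainsKey(key)); // prints False

        // The very same instance is still stored as a key, which is what the debugger shows.
        Console.WriteLine(ReferenceEquals(dictionary.Keys.First(), key)); // prints True
    }
}
```

Making UserID get-only and setting it in a constructor would rule this failure out at compile time.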
In addition to overloading == and Equals, make sure you override GetHashCode with a suitable hash function. In particular, see this specification from the documentation:
If two objects compare as equal, the GetHashCode method for each object must return the same value. However, if two objects do not compare as equal, the GetHashCode methods for the two objects do not have to return different values.
The GetHashCode method for an object must consistently return the same hash code as long as there is no modification to the object state that determines the return value of the object's Equals method. Note that this is true only for the current execution of an application, and that a different hash code can be returned if the application is run again.
For the best performance, a hash function should generate an even distribution for all input, including input that is heavily clustered. An implication is that small modifications to object state should result in large modifications to the resulting hash code for best hash table performance.
Hash functions should be inexpensive to compute.
The GetHashCode method should not throw exceptions.
I agree with Jon Skeet's suspicion that you're somehow unintentionally modifying UserID property after it's added as a key. But since the only property that's important for testing equality in MyObject is UserID (and therefore that's the only property that the Dictionary cares about), I'd recommend refactoring your code to use a simple Dictionary<ulong, IMyInterface> instead:
Dictionary<ulong, IMyInterface> dict = new Dictionary<ulong, IMyInterface>();
ulong userID = dict.Keys.First();
var value = dict[userID];
