C# LINQ Intersect override for complex object

I have these two objects (dummy code)
var students = new List<Student>();
var girl = new Student() { Name = "Simran", StudentId = 4 };
var sameGirl = new Student() { Name = "Norman", StudentId = 4 };
I wanted to check whether these two objects are the same using the Intersect method. To my understanding, Intersect uses Equals under the hood, so these two objects will not be considered equal. I don't really know how to override the Equals or Intersect methods, but in essence I want to check whether the Ids of the objects are the same. Can the Equals or Intersect method be overridden to evaluate only part of the object, not the whole object?

It depends on your choice. You can override the Equals method (together with GetHashCode) so that it compares only the required properties and returns true or false on that basis.
Alternatively, implement IEqualityComparer<Student> and pass it to Intersect. Only include the StudentId of the source and target in the comparison. Would that fulfill your requirement?

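Thank you all for your help; you pointed me in the right direction. I didn't really understand how to override Equals or how to write the comparer.
public class fooComparer : IEqualityComparer<Student>
{
    // Two students are equal when both are non-null and share the same StudentId.
    public bool Equals(Student? x, Student? y)
    {
        return x != null && y != null && x.StudentId == y.StudentId;
    }
    // Hash only what Equals compares. Including Name here would give equal
    // students different hash codes, and Intersect would miss the match.
    public int GetHashCode(Student obj)
    {
        return obj.StudentId.GetHashCode();
    }
}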
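For readers who want to see the comparer wired into Intersect, here is a minimal sketch. The Student class is reconstructed from the question's dummy code, and StudentIdComparer and IntersectDemo are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Student
{
    public string Name { get; set; } = "";
    public int StudentId { get; set; }
}

// Hypothetical comparer: equality is decided by StudentId alone.
public sealed class StudentIdComparer : IEqualityComparer<Student>
{
    public bool Equals(Student? x, Student? y) =>
        x != null && y != null && x.StudentId == y.StudentId;

    // Must agree with Equals, so only StudentId is hashed.
    public int GetHashCode(Student obj) => obj.StudentId.GetHashCode();
}

public static class IntersectDemo
{
    // Returns the elements of 'first' whose StudentId also occurs in 'second'.
    public static List<Student> Common(IEnumerable<Student> first, IEnumerable<Student> second) =>
        first.Intersect(second, new StudentIdComparer()).ToList();
}
```

Intersect yields elements taken from the first sequence, so calling IntersectDemo.Common with a list holding girl and a list holding sameGirl would return the "Simran" instance.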

Related

How do I make structural equality to work on collection properties in C#?

One of the great advantages of records is supposed to be value-based/structural equality, but how do I get that to work with collection properties?
Concrete simple example:
public record Something(string Id);
public record Sample(List<Something> something);
With the above records I would expect the following test to pass:
[Fact]
public void Test()
{
var x = new Sample(new List<Something>() {
new Something("x1")
});
var y = new Sample(new List<Something>() {
new Something("x1")
});
Assert.Equal(x, y);
}
I understand that it fails because List is a reference type, but is there a collection that implements value-based comparison? Basically I would like to do a "deep" value-based comparison.
Records don't do this automatically, but you can implement the Equals method yourself:
public record Sample(List<Something> something) : IEquatable<Sample>
{
public virtual bool Equals(Sample? other) =>
other != null &&
Enumerable.SequenceEqual(something, other.something);
}
But note that GetHashCode should be overridden to be consistent with Equals. See also "implement GetHashCode() for objects that contain collections".
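As a rough sketch of what a consistent GetHashCode could look like for this record, the version below folds the hash of every list element into one value with System.HashCode. This is one possible approach rather than the only one, and it assumes that element order matters, exactly as SequenceEqual does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public record Something(string Id);

public record Sample(List<Something> something)
{
    // Element-wise comparison instead of comparing the list references.
    public virtual bool Equals(Sample? other) =>
        other is not null && something.SequenceEqual(other.something);

    // Combine the element hashes in order so that equal samples hash equally.
    public override int GetHashCode()
    {
        var hash = new HashCode();
        foreach (var item in something)
        {
            hash.Add(item);
        }
        return hash.ToHashCode();
    }
}
```

With both members in place, two Sample instances holding equal elements in the same order compare equal and can be used together in a HashSet or as dictionary keys.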

How to update the property to true or false based on comparing two list

I have two lists:
class obj1
{
public string country{ get; set; }
public string region{ get; set; }
}
class obj2
{
public string country{ get; set; }
public string region { get; set; }
public string XYZ { get; set; }
public bool ToBeChanged{ get; set; }
}
The first list looks like:
List<obj1> alist = new List<obj1>();
alist.Add(new obj1 { country = "US", region = "NC" });
alist.Add(new obj1 { country = "US", region = "SC" });
alist.Add(new obj1 { country = "US", region = "NY" });
The second list (List<obj2> alist2) may contain 1000s of entries with many combinations of country and region.
I need to set the property "ToBeChanged" to true if the country and region of an entry in the second list (alist2) match an entry in the first list (alist), and to false otherwise.
Please help.
Thanks,
Vaibhav
Two points from the comments, and my thoughts:
Some aren't sure exactly what your matching criteria are. But to me it seems fairly clear that you're matching on 'country' and 'region'. Nevertheless, in the future, state this explicitly.
You got one comment criticizing your choice of variable names. That criticism is fully justified. Code is far easier to maintain when you have little hints as to what it's doing, and variable names are crucial for that.
Now, regarding my particular solution:
In the code below, I've renamed some of your objects to make them clear in their purpose. I'd like to rename 'obj2', but I'll leave that to you because I'm not exactly sure what you're intending to do with it, and I definitely don't know what 'XYZ' is for. Here are the renamed classes, with some added constructors to aid in list construction.
class RegionInfo {
public RegionInfo(string country, string region) {
this.country = country;
this.region = region;
}
public string country{ get; set; }
public string region{ get; set; }
}
class obj2 {
public obj2 (string country, string region, string XYZ) {
this.country = country;
this.region = region;
this.XYZ = XYZ;
}
public string country{ get; set; }
public string region { get; set; }
public string XYZ { get; set; }
public bool ToBeChanged{ get; set; }
}
I'm using a LINQ Join to match the two lists, outputting only the 'obj2' side of the join, and then looping the result to toggle the 'ToBeChanged' value.
var regionInfos = new List<RegionInfo>() {
new RegionInfo("US", "NC"),
new RegionInfo("US", "SC"),
new RegionInfo("US", "NY")
};
var obj2s = new List<obj2> {
new obj2("US", "NC", "What am I for?"),
new obj2("US", "SC", "Like, am I supposed to be the new value?"),
new obj2("CA", "OT", "XYZ doesn't have a stated purpose")
};
var obj2sToChange = obj2s
.Join(
regionInfos,
o2 => new { o2.country, o2.region },
reg => new { reg.country, reg.region },
(o2,reg) => o2
);
foreach (var obj2 in obj2sToChange)
obj2.ToBeChanged = true;
obj2s.Dump(); // using Linqpad, but you do what works to display
This results in:
country | region | XYZ                                      | ToBeChanged
US      | NC     | What am I for?                           | True
US      | SC     | Like, am I supposed to be the new value? | True
CA      | OT     | XYZ doesn't have a stated purpose        | False
First of all, with LINQ you can never change the source. You can only extract data from the source. After that you can use the extracted data to update the source.
I need to update the property "ToBeChanged" to "True" if second (alist2) list properties (country and region) matches to first(alist1) and false in otherwise.
This is not a proper requirement. alist1 is a sequence of obj1 objects. I think that you want the property ToBeChanged of a certain obj2 to be true if any of the obj1 items in alist1 has a [country, region] combination that matches the [country, region] combination of the obj2 concerned.
Requirement: get all obj2 in alist2 that have a [country, region] combination that matches any of the [country, region] combinations of the obj1 objects in alist1.
You probably thought about using Where for this. Something like "Where [country, region] combination in the other list". Whenever you need to find out if an item is in another list, consider using one of the overloads of Enumerable.Contains.
The problem is that the [Country, Region] combination in every obj2 can be converted to an object of class obj1, but if you then check whether they are equal, you get a comparison by reference, while you want a comparison by value.
There are two solutions for this:
create an EqualityComparer that compares obj1 by Value
create [Country, Region] as anonymous type. Anonymous types always compare by value.
The latter is the easiest, so we'll do that one first.
Use anonymous types for comparison
First convert alist into anonymous type containing [Country, Region] combinations:
var eligibleCountryRegionCombinations = alist.Select(obj1 => new
{
Country = obj1.Country,
Region = obj1.Region,
});
Note that I don't use ToList at the end: the enumerable is created, but the sequence has not been enumerated yet. In LINQ terms this is called lazy or deferred execution.
IEnumerable<obj2> obj2sThatNeedToBeChanged = alist2.Select(obj2 => new
{
CountryRegionCombination = new
{
Country = obj2.Country,
Region = obj2.Region,
},
Original = obj2,
})
.Where(item => eligibleCountryRegionCombinations.Contains(
item.CountryRegionCombination))
.Select(item => item.Original);
CountryRegionCombination is an anonymous type of the same type as the anonymous items in eligibleCountryRegionCombinations. Therefore you can use Contains. Because the items are anonymous type, the equality comparison is comparison by value.
The final select will remove the anonymous type, and keep only the Original.
Note that the query is still not enumerated.
foreach (var obj2 in obj2sThatNeedToBeChanged.ToList())
{
obj2.ToBeChanged = true;
}
It can be dangerous to change the source that you are enumerating. In this case it is not a problem, because the field that you change is not used to create the enumeration. Still I think it is safer, because of possible future changes, to do a ToList before you start changing the source.
Create an equality comparer
One of the overloads of Enumerable.Contains has a parameter comparer. This expects an IEqualityComparer<obj1>.
class Obj1Comparer : EqualityComparer<obj1>
{
public static IEqualityComparer<obj1> ByValue {get;} = new Obj1Comparer();
private static IEqualityComparer<string> CountryComparer => StringComparer.OrdinalIgnoreCase;
private static IEqualityComparer<string> RegionComparer => StringComparer.OrdinalIgnoreCase;
public override bool Equals (obj1 x, obj1 y)
{
if (x == null) return y == null; // true if both null, false if x null, but y not null
if (y == null) return false; // because x not null
// optimization:
if (Object.ReferenceEquals(x, y)) return true;
if (x.GetType() != y.GetType()) return false;
return CountryComparer.Equals(x.Country, y.Country)
&& RegionComparer.Equals(x.Region, y.Region);
}
To make it easy to change equality of countries, I created a separate comparer for countries and for regions. So if later you want to compare case sensitive, or if you change Country from string to a foreign key to a table of countries, then changes will be minimal.
You also need to override GetHashCode. If x equals y, then GetHashCode should return the same value for both. The reverse does not hold: if x and y are different, they may still return the same hash code. However, code will be more efficient if different objects mostly produce different hash codes.
public override int GetHashCode (obj1 x)
{
if (x == null) return 87966354; // just a number
return CountryComparer.GetHashCode(x.Country)
^ RegionComparer.GetHashCode(x.Region);
}
} // end of class Obj1Comparer
Which HashCode you return depends on how often this will be called, for instance in dictionaries, comparers like Contains, etc.
How "different" are the Countries and Regions? A different Country will probably also mean a different Region. So maybe it is efficient enough if you only calculate the hash code for the Country. If a Country has many, many Regions, then it will probably be better to calculate the hash code for Regions as well. If a Region name occurs in only one Country (Oberammergau is probably only in Germany), or in only a few Countries (how many regions named "New Amsterdam" will there be?), then you won't have to include the Country in the hash at all.
Because we have an equality comparer, we don't need to convert alist to an anonymous type, we can specify that Contains should compare by value.
IEqualityComparer<obj1> comparer = Obj1Comparer.ByValue;
IEnumerable<obj2> obj2sThatNeedToBeChanged = alist2.Select(obj2 => new
{
    // an obj1 holding the [Country, Region] combination of this obj2
    Obj1 = new obj1
    {
        Country = obj2.Country,
        Region = obj2.Region,
    },
    Original = obj2,
})
.Where(item => alist.Contains(item.Obj1, comparer))
.Select(item => item.Original);
Fast method: Extension method
The quickest method to write, and maybe also the simplest one, is to create an extension method.
private static IEqualityComparer<string> CountryComparer => StringComparer.OrdinalIgnoreCase;
private static IEqualityComparer<string> RegionComparer => StringComparer.OrdinalIgnoreCase;
public static IEnumerable<Obj2> WhereSameLocation(
this IEnumerable<Obj2> source,
IEnumerable<Obj1> obj1Items)
{
// TODO: what to do if source == null?
foreach (Obj2 obj2 in source)
{
// check if there is any obj1 with same [Country, Region]
if (obj1Items
.Where(obj1 => CountryComparer.Equals(obj2.Country, obj1.Country)
&& RegionComparer.Equals(obj2.Region, obj1.Region))
.Any())
{
yield return obj2;
}
}
}
Usage:
IEnumerable<Obj1> alist = ...
IEnumerable<Obj2> alist2 = ...
IEnumerable<Obj2> obj2sThatNeedToBeChanged = alist2.WhereSameLocation(alist);
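Since the original question asks for ToBeChanged to be set to true on a match and to false otherwise, and the second list may be large, a HashSet lookup is worth sketching as an alternative. This is a different technique from the answers above, not a restatement of them. The PascalCase classes Obj1 and Obj2 and the helper LocationMatcher.MarkMatches are hypothetical stand-ins for the question's types, and the comparison here is case-sensitive, unlike the OrdinalIgnoreCase comparers used earlier.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Obj1
{
    public string Country { get; set; } = "";
    public string Region { get; set; } = "";
}

public class Obj2
{
    public string Country { get; set; } = "";
    public string Region { get; set; } = "";
    public bool ToBeChanged { get; set; }
}

public static class LocationMatcher
{
    // Sets ToBeChanged on every target: true when its (Country, Region)
    // pair occurs in 'allowed', false otherwise.
    public static void MarkMatches(IEnumerable<Obj2> targets, IEnumerable<Obj1> allowed)
    {
        // Value tuples compare by value, so they work as set keys.
        var keys = new HashSet<(string, string)>(
            allowed.Select(a => (a.Country, a.Region)));

        foreach (var target in targets)
        {
            target.ToBeChanged = keys.Contains((target.Country, target.Region));
        }
    }
}
```

Building the set once makes each lookup roughly constant time, instead of scanning the first list again for every element of the second.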

How to get the distinct data from a list?

I want to get a distinct list from a list of persons.
List<Person> plst = cl.PersonList;
How to do this through LINQ. I want to store the result in List<Person>
Distinct() will give you distinct values - but unless you've overridden Equals / GetHashCode() you'll just get distinct references. For example, if you want two Person objects to be equal if their names are equal, you need to override Equals/GetHashCode to indicate that. (Ideally, implement IEquatable<Person> as well as just overriding Equals(object).)
You'll then need to call ToList() to get the results back as a List<Person>:
var distinct = plst.Distinct().ToList();
If you want to get distinct people by some specific property but that's not a suitable candidate for "natural" equality, you'll either need to use GroupBy like this:
var people = plst.GroupBy(p => p.Name)
.Select(g => g.First())
.ToList();
or use the DistinctBy method from MoreLINQ (.NET 6 and later also provide Enumerable.DistinctBy in System.Linq):
var people = plst.DistinctBy(p => p.Name).ToList();
Using the Distinct extension method will return an IEnumerable which you can then do a ToList() on:
List<Person> plst = cl.PersonList.Distinct().ToList();
You can use the Distinct method, but you will need to implement IEquatable<Person> and override GetHashCode (and ideally Equals(object) as well).
public class Person : IEquatable<Person>
{
public string Name { get; set; }
public int Code { get; set; }
public bool Equals(Person other)
{
//Check whether the compared object is null.
if (Object.ReferenceEquals(other, null)) return false;
//Check whether the compared object references the same data.
if (Object.ReferenceEquals(this, other)) return true;
//Check whether the person's properties are equal.
//string.Equals is null-safe, matching the null handling in GetHashCode.
return Code.Equals(other.Code) && string.Equals(Name, other.Name);
}
// If Equals() returns true for a pair of objects
// then GetHashCode() must return the same value for these objects.
public override int GetHashCode()
{
//Get hash code for the Name field if it is not null.
int hashPersonName = Name == null ? 0 : Name.GetHashCode();
//Get hash code for the Code field.
int hashPersonCode = Code.GetHashCode();
//Calculate the hash code for the person.
return hashPersonName ^ hashPersonCode;
}
}
var distinctPersons = plst.Distinct().ToList();

How to use IList.Contains() method to find an object

I have a list of Persons inside a Company Class.
public class Company{
IList<Person> persons;
}
Then I have a List of companies,
IList<Company> companies;
Now I have a name (say "Lasantha"). If this name is part of the name of any person in a company, I want to find that company. I tried using the companies.Contains() method. I overrode the object.Equals method inside the Person class like this:
public override bool Equals(object o)
{
var other = o as Person;
return this.Name.ToLower().Contains(other.Name.ToLower());
}
But this is not working. It's not even calling this Equals method. Can someone help me, please?
Thank you.
Overriding the equality comparison in this manner is wrong.
Equality should be symmetric: if
"FooBar".Equals("Foo") == true
then it must also hold that
"Foo".Equals("FooBar") == true
However, in this case you are using Contains, which breaks the symmetry because "FooBar" contains "Foo", but "Foo" does not contain "FooBar". Apart from that, you should not override the Equals method on a class unless each and every last comparison between objects of that class will have the same semantics (which in this case seems highly dubious).
So, given that overriding Equals is not the solution, what should you do?
One convenient way is to use LINQ:
var companiesWithPeopleWithLasanthaInTheirName =
companies.Where(c => c.persons.Any(p => p.Name.Contains("Lasantha")));
However the above comparison is case-sensitive, so if you need it to not be you have to tweak it; there is an answer in this question: Case insensitive 'Contains(string)'
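One common way to make the match case-insensitive is string.IndexOf with a StringComparison argument. The sketch below assumes a Company class with a persons list as in the question; CompanySearch.FindByPersonName is a hypothetical helper name, and the choice of OrdinalIgnoreCase is an assumption that may need to be a culture-aware comparison for some names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Person
{
    public string Name { get; set; } = "";
}

public class Company
{
    public IList<Person> persons = new List<Person>();
}

public static class CompanySearch
{
    // Returns the companies that have at least one person whose name
    // contains 'namePart', ignoring case.
    public static List<Company> FindByPersonName(IEnumerable<Company> companies, string namePart) =>
        companies
            .Where(c => c.persons.Any(p =>
                p.Name.IndexOf(namePart, StringComparison.OrdinalIgnoreCase) >= 0))
            .ToList();
}
```

This keeps the search logic out of Person.Equals, so equality on Person keeps its usual meaning.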
You can use Linq, something like
var temp = companies.Where(p => p.People.Any(q => q.Name.Contains("Lasantha")));
Here is the full example;
public class Example
{
private IList<Company> companies;
public Example()
{
Person p1 = new Person(){Name = "Lasantha"};
Person p2 = new Person(){Name = "Red Kid"};
Company comp = new Company();
comp.People = new List<Person>();
comp.People.Add(p1);
comp.People.Add(p2);
companies = new List<Company>();
companies.Add(comp);
var temp = companies.Where(p => p.People.Any(q => q.Name.Contains("Lasantha")));
}
}
public class Company
{
public IList<Person> People
{
get;
set;
}
}
public class Person
{
public string Name { get; set; }
}
You need to overload Equals so that it takes a Person as parameter (for example by implementing IEquatable<Person>). Otherwise it will default to reference comparison.
public bool Equals(Person p) // an overload, so it cannot be marked override
{
//...
}
Then, as MSDN states, you may need to provide CompareTo (IComparable) as well.
I don't think you need to override Equals for this. You should use Where to filter the companies and then use Any on the Person list to find the ones matching your name criteria:
companies.Where(c => c.persons.Any(p => p.Name.Contains("Value here")));
You're searching for
a list of characters ("Lasantha")
inside a list of Persons
inside a list of Companies
Well, other contributors already described how to do it correctly

Distinct() returns duplicates with a user-defined type

I'm trying to write a Linq query which returns an array of objects, with unique values in their constructors. For integer types, Distinct returns only one copy of each value, but when I try creating my list of objects, things fall apart. I suspect it's a problem with the equality operator for my class, but when I set a breakpoint, it's never hit.
Filtering out the duplicate int in a sub-expression solves the problem, and also saves me from constructing objects that will be immediately discarded, but I'm curious why this version doesn't work.
UPDATE: 11:04 PM Several folks have pointed out that MyType doesn't override GetHashCode(). I'm afraid I oversimplified the example. The original MyType does indeed implement it. I've added it below, modified only to put the hash code in a temp variable before returning it.
Running through the debugger, I see that all five invocations of GetHashCode return a different value. And since MyType only inherits from Object, this is presumably the same behavior Object would exhibit.
Would I be correct then to conclude that the hash should instead be based on the contents of Value? This was my first attempt at overriding operators, and at the time, it didn't appear that GetHashCode needed to be particularly fancy. (This is the first time one of my equality checks didn't seem to work properly.)
class Program
{
static void Main(string[] args)
{
int[] list = { 1, 3, 4, 4, 5 };
int[] list2 =
(from value in list
select value).Distinct().ToArray(); // One copy of each value.
MyType[] distinct =
(from value in list
select new MyType(value)).Distinct().ToArray(); // Two objects created with 4.
Array.ForEach(distinct, value => Console.WriteLine(value));
}
}
class MyType
{
public int Value { get; private set; }
public MyType(int arg)
{
Value = arg;
}
public override int GetHashCode()
{
int retval = base.GetHashCode();
return retval;
}
public override bool Equals(object obj)
{
if (obj == null)
return false;
MyType rhs = obj as MyType;
if ((Object)rhs == null)
return false;
return this == rhs;
}
public static bool operator ==(MyType lhs, MyType rhs)
{
bool result;
if ((Object)lhs != null && (Object)rhs != null)
result = lhs.Value == rhs.Value;
else
result = (Object)lhs == (Object)rhs;
return result;
}
public static bool operator !=(MyType lhs, MyType rhs)
{
return !(lhs == rhs);
}
}
You need to override GetHashCode() in your class. GetHashCode must be implemented in tandem with Equals overloads. It is common for code to check for hashcode equality before calling Equals. That's why your Equals implementation is not getting called.
Your suspicion is correct: it is the equality check, which currently just compares the object references. Even your implementation does not do anything extra, so change it to this:
public override bool Equals(object obj)
{
if (obj == null)
return false;
MyType rhs = obj as MyType;
if ((Object)rhs == null)
return false;
return this.Value == rhs.Value;
}
In your equality method you are still testing for reference equality rather than semantic equality, e.g. on this line:
result = (Object)lhs == (Object)rhs
you are just comparing two object references which, even if they hold exactly the same data, are still not the same object. Instead, your test for equality needs to compare one or more properties of your object. For instance, if your object had an ID property, and objects with the same ID should be considered semantically equivalent, then you could do this:
result = lhs.ID == rhs.ID
Note that overriding Equals() means you should also override GetHashCode(), which is another kettle of fish, and can be quite difficult to do correctly.
You need to implement GetHashCode().
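To make that concrete, here is a minimal sketch of MyType with a hash code derived from Value, which is what the asker suspected in the update. It is a trimmed-down version of the question's class (the == and != operators are omitted for brevity), not the asker's exact code.

```csharp
using System;
using System.Linq;

public class MyType
{
    public int Value { get; private set; }

    public MyType(int arg)
    {
        Value = arg;
    }

    // Objects with equal Value now share a hash code, so Distinct
    // places them in the same bucket and goes on to call Equals.
    public override int GetHashCode()
    {
        return Value.GetHashCode();
    }

    public override bool Equals(object obj)
    {
        MyType rhs = obj as MyType;
        return rhs != null && Value == rhs.Value;
    }
}
```

With this version, projecting the list { 1, 3, 4, 4, 5 } to MyType and calling Distinct() leaves four objects, because the two instances created with 4 are now treated as duplicates.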
It seems that a simple Distinct operation can be implemented more elegantly as follows:
var distinct = items.GroupBy(x => x.ID).Select(x => x.First());
where ID is the property that determines if two objects are semantically equivalent. Judging from the confusion here (including my own), the default implementation of Distinct() seems to be a little convoluted.
I think MyType needs to implement IEquatable for this to work.
The other answers have pretty much covered the fact that you need to implement Equals and GetHashCode correctly, but as a side note you may be interested to know that anonymous types have these values implemented automatically:
var distinct =
(from value in list
select new {Value = value}).Distinct().ToArray();
So without ever having to define this class, you automatically get the Equals and GetHashCode behavior you're looking for. Cool, eh?
