C# SortedSet<T> and equality

I am a bit puzzled about the behaviour of SortedSet, see following example:
public class Blah
{
public double Value { get; private set; }
public Blah(double value)
{
Value = value;
}
}
public class BlahComparer : Comparer<Blah>
{
public override int Compare(Blah x, Blah y)
{
return Comparer<double>.Default.Compare(x.Value, y.Value);
}
}
public static void Main()
{
var blahs = new List<Blah> {new Blah(1), new Blah(2),
new Blah(3), new Blah(2)};
//contains all 4 entries
var set = new HashSet<Blah>(blahs);
//contains only Blah(1), Blah(2), Blah(3)
var sortedset = new SortedSet<Blah>(blahs, new BlahComparer());
}
So SortedSet discards entries if Compare(x,y) returns 0. Can I prevent this, such that my SortedSet behaves like HashSet and discards entries only if Equals() returns true?

Description
SortedSet: You have many elements you need to store, and you want to store them in a sorted order and also eliminate all duplicates from the data structure. The SortedSet type, which is part of the System.Collections.Generic namespace in the C# language and .NET Framework, provides this functionality.
According to MSDN, the Compare method returns:
Less than zero if x is less than y.
Zero if x equals y.
Greater than zero if x is greater than y.
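To make the consequence concrete, here is a minimal sketch (the class and method names are hypothetical, not from the question) showing that SortedSet treats any two items whose comparer result is zero as the same element, regardless of Equals:

```csharp
using System.Collections.Generic;

public static class SortedSetDemo
{
    // Builds a SortedSet that orders strings by length only.
    // Two different strings of the same length compare as 0,
    // so the second one is treated as a duplicate and dropped.
    public static int CountByLength(params string[] words)
    {
        var byLength = Comparer<string>.Create((a, b) => a.Length.CompareTo(b.Length));
        var set = new SortedSet<string>(words, byLength);
        return set.Count;
    }
}
```

Calling SortedSetDemo.CountByLength("a", "bb", "cc", "ddd") gives 3, because "bb" and "cc" have the same length even though they are not equal strings.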
More Information
Dotnetperls - C# SortedSet Examples
MSDN: Compare Method
Update
If your Blah class implements IComparable and you want your list sorted, you can do this:
var blahs = new List<Blah> {new Blah(1), new Blah(2),
new Blah(3), new Blah(2)};
blahs.Sort();
If your Blah class does not implement IComparable and you want your list sorted, you can use LINQ (System.Linq namespace) for that:
blahs = blahs.OrderBy(x => x.Value).ToList();

You can do this if you provide an alternate comparison for when the Values are equal and Compare would otherwise return 0. In most cases this probably just defers the problem instead of solving it. As others have noted, SortedSet discards duplicates, and when you provide a custom comparer it uses that comparer to decide what counts as a duplicate.
class Program
{
static void Main(string[] args)
{
var blahs = new List<Blah>
{
new Blah(1, 0), new Blah(2, 1),
new Blah(3, 2), new Blah(2, 3)
};
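blahs.Add(blahs[0]); // same reference added again, so the list now has 5 items
//contains the 4 distinct instances
var set = new HashSet<Blah>(blahs);
//also contains all 4 instances, because Index breaks the tie between the two 2s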
var sortedset = new SortedSet<Blah>(blahs, new BlahComparer());
}
}
public class Blah
{
public double Value { get; private set; }
public Blah(double value, int index)
{
Value = value;
Index = index;
}
public int Index { get; private set; }
public override string ToString()
{
return Value.ToString();
}
}
public class BlahComparer : Comparer<Blah>
{
public override int Compare(Blah x, Blah y)
{
// needs null checks
var referenceEquals = ReferenceEquals(x, y);
if (referenceEquals)
{
return 0;
}
var compare = Comparer<double>.Default.Compare(x.Value, y.Value);
if (compare == 0)
{
compare = Comparer<int>.Default.Compare(x.Index, y.Index);
}
return compare;
}
}

You can't find the other Blah(2) because you're using a Set.
Set - A collection of well defined and **distinct** objects
MultiSet, for instance, allows duplicate objects.
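.NET has no built-in sorted multiset, so if you need to keep both Blah(2) entries in sorted order you have to build something yourself. Below is a minimal sketch, assuming a hypothetical SortedBag class (the name and its members are made up for this illustration) that groups items by Value in a SortedDictionary:

```csharp
using System.Collections.Generic;

public class Blah
{
    public double Value { get; }
    public Blah(double value) { Value = value; }
}

// Hypothetical helper: keeps every item, iterated in ascending Value order.
public class SortedBag
{
    private readonly SortedDictionary<double, List<Blah>> _buckets =
        new SortedDictionary<double, List<Blah>>();

    public int Count { get; private set; }

    public void Add(Blah item)
    {
        // Items with the same Value share one bucket instead of replacing each other.
        if (!_buckets.TryGetValue(item.Value, out var bucket))
        {
            bucket = new List<Blah>();
            _buckets[item.Value] = bucket;
        }
        bucket.Add(item);
        Count++;
    }

    public IEnumerable<Blah> InOrder()
    {
        // SortedDictionary enumerates its keys in sorted order.
        foreach (var pair in _buckets)
            foreach (var item in pair.Value)
                yield return item;
    }
}
```

Items with equal Value come back in insertion order. Whether that is acceptable depends on your use case.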

It sounds like what you want is property-based sorting, with duplicate checking based on reference equality. To accomplish this (and if you don't mind that the memory consumption of your comparer can increase over time), we can add a fallback to the comparer that computes the compare result from IDs unique to the instances:
public class BlahComparer : Comparer<Blah>
{
private readonly ObjectIDGenerator _idGenerator = new();
public override int Compare(Blah x, Blah y)
{
int compareResult = Comparer<double>.Default.Compare(x.Value, y.Value);
if (compareResult == 0)
{
// Comparing hash codes is optional and is only done in order to potentially avoid using _idGenerator further below which is better for memory consumption.
compareResult =
Comparer<int>.Default.Compare(RuntimeHelpers.GetHashCode(x), RuntimeHelpers.GetHashCode(y));
if (compareResult == 0)
{
// HashCodes are the same but it might actually still be two different objects, so compare unique IDs:
compareResult = Comparer<long>.Default.Compare(_idGenerator.GetId(x, out bool _), _idGenerator.GetId(y, out bool _)); // This increases the memory consumption of the comparer for every newly encountered Blah
}
}
return compareResult;
}
}

Related

How to check if properties of two objects are equal

I have two objects using the ff. class:
public class Test {
public string Name {get; set;}
public List<Input> Inputs {get;set;}
......
//some other properties I don't need to check
}
public class Input {
public int VariableA {get;set;}
public int VariableB {get;set;}
public List<Sancti> Sancts {get;set;}
}
public class Sancti {
public string Symbol {get;set;}
public double Percentage {get;set;}
}
I want to check if two instances of Test have the same Inputs value. I've done this using a loop, but I believe this is not the way to do it.
I've read some links: link1, link2, but they seem like gibberish to me. Are there simpler ways to do this, like a one-liner, something like:
test1.Inputs.IsTheSameAs(test2.Inputs)?
I was really hoping for a more readable method, preferably LINQ.
NOTE: Order of inputs should not matter.
One way is to check the set difference between the two lists. If listA minus listB has no elements, everything in listA exists in listB. If the reverse is also true, then the two lists are equal.
bool equal = testA.Inputs.Except(testB.Inputs).Count() == 0
&& testB.Inputs.Except(testA.Inputs).Count() == 0;
Another is to simply check each element of listA and see if it exists in listB (and vice versa):
bool equal = testA.Inputs.All(x => testB.Inputs.Contains(x))
&& testB.Inputs.All(x => testA.Inputs.Contains(x));
This being said, either of these can throw a false positive if there is one element in a list that would be "equal" to multiple elements in the other. For example, the following two lists would be considered equal using the above approaches:
listA = { 1, 2, 3, 4 };
listB = { 1, 1, 2, 2, 3, 3, 4, 4 };
To prevent that from happening, you need a one-to-one comparison rather than a blanket containment check. There are several ways to do this; one is to first sort both lists and then compare their elements pairwise:
var listASorted = testA.Inputs.OrderBy(x => x);
var listBSorted = testB.Inputs.OrderBy(x => x);
bool equal = testA.Inputs.Count == testB.Inputs.Count
&& listASorted.Zip(listBSorted, (x, y) => object.Equals(x, y)).All(b => b);
(If the lists are already sorted or if you'd prefer to check the lists exactly (with ordering preserved), then you can skip the sorting step of this method.)
One thing to note with this method, however, is that Input needs to implement IComparable in order for them to be properly sorted. How you implement it exactly is up to you, but one possible way would be to sort Input based on the XOR of VariableA and VariableB:
public class Input : IComparable<Input>
{
...
public int CompareTo(Input other)
{
int a = this.VariableA ^ this.VariableB;
int b = other.VariableA ^ other.VariableB;
return a.CompareTo(b);
}
}
(In addition, Input should also override GetHashCode and Equals, as itsme86 describes in his answer.)
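If you would rather not sort at all (and so avoid the IComparable requirement), another option is to count occurrences. This is only a sketch: MultisetCompare and SameItems are hypothetical names, and it still relies on the element type (or the comparer you pass) providing sensible Equals and GetHashCode:

```csharp
using System.Collections.Generic;

public static class MultisetCompare
{
    // True when both sequences contain the same items the same number of times,
    // in any order. Null items are not supported (Dictionary keys cannot be null).
    public static bool SameItems<T>(IEnumerable<T> first, IEnumerable<T> second,
                                    IEqualityComparer<T> comparer = null)
    {
        var counts = new Dictionary<T, int>(comparer ?? EqualityComparer<T>.Default);

        // Count how often each item occurs in the first sequence.
        foreach (var item in first)
        {
            counts.TryGetValue(item, out int n);
            counts[item] = n + 1;
        }

        // Consume those counts with the second sequence.
        foreach (var item in second)
        {
            if (!counts.TryGetValue(item, out int n) || n == 0)
                return false; // item missing, or second has more copies than first
            counts[item] = n - 1;
        }

        // Anything left over means first had extra items.
        foreach (var remaining in counts.Values)
            if (remaining != 0) return false;
        return true;
    }
}
```

With this, { 1, 2, 3, 4 } and { 1, 1, 2, 2, 3, 3, 4, 4 } are reported as different, while { 3, 1, 2 } and { 1, 2, 3 } are reported as the same.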
EDIT:
After being drawn back to this answer, I would now like to offer a much simpler solution:
var listASorted = testA.Inputs.OrderBy(x => x);
var listBSorted = testB.Inputs.OrderBy(x => x);
bool equal = listASorted.SequenceEqual(listBSorted);
(As before, you can skip the sorting step if the lists are already sorted or you want to compare them with their existing ordering intact.)
SequenceEqual uses the default equality comparer for the element type. For a class, that means calling Equals, which is reference equality unless the class overrides it (or implements IEquatable<T>). If you want value-based comparison without changing Input, you can define an IEqualityComparer for Input and pass it to SequenceEqual:
public class InputComparer : IEqualityComparer<Input>
{
public bool Equals(Input a, Input b)
{
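return a.VariableA == b.VariableA
&& a.VariableB == b.VariableB;
// ... and so on for any other members that should take part in equality
}
public int GetHashCode(Input a)
{
// Must agree with Equals: hash the same members that Equals compares.
return unchecked((a.VariableA * 397) ^ a.VariableB);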
}
}
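To see the difference a comparer makes, here is a small sketch using a hypothetical Pair class (not part of the question). Without a comparer, SequenceEqual on a class that does not override Equals compares references; with one, it compares values:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical type used only for this illustration; it does not override Equals.
public class Pair
{
    public int A { get; set; }
    public int B { get; set; }
}

public class PairComparer : IEqualityComparer<Pair>
{
    public bool Equals(Pair x, Pair y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.A == y.A && x.B == y.B;
    }

    // Hashes exactly the members that Equals compares.
    public int GetHashCode(Pair p) => p is null ? 0 : unchecked((p.A * 397) ^ p.B);
}

public static class SequenceEqualDemo
{
    // Uses the default comparer: reference equality for Pair.
    public static bool CompareDefault(List<Pair> left, List<Pair> right)
        => left.SequenceEqual(right);

    // Uses the value-based comparer defined above.
    public static bool CompareByValue(List<Pair> left, List<Pair> right)
        => left.SequenceEqual(right, new PairComparer());
}
```

For two lists that each hold a separate new Pair { A = 1, B = 2 }, CompareDefault returns false and CompareByValue returns true.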
You can change your Input and Sancti class definitions to override Equals and GetHashCode. The following solution considers two Inputs equal when:
VariableA are equal and
VariableB are equal and
The Sancts List are equal, considering that the Sancti elements with the same Symbol must have the same Percentage to be equal
You may need to change this if your specifications are different:
public class Input
{
public int VariableA { get; set; }
public int VariableB { get; set; }
public List<Sancti> Sancts { get; set; }
public override bool Equals(object obj)
{
Input otherInput = obj as Input;
if (ReferenceEquals(otherInput, null))
return false;
if ((this.VariableA == otherInput.VariableA) &&
(this.VariableB == otherInput.VariableB) &&
this.Sancts.OrderBy(x=>x.Symbol).SequenceEqual(otherInput.Sancts.OrderBy(x => x.Symbol)))
return true;
else
{
return false;
}
}
public override int GetHashCode()
{
unchecked // Overflow is fine, just wrap
{
int hash = 17;
// Suitable nullity checks etc, of course :)
hash = hash * 23 + VariableA.GetHashCode();
hash = hash * 23 + VariableB.GetHashCode();
// Sancts is left out on purpose: List<T>.GetHashCode() is reference-based,
// which would give equal Inputs different hash codes.
return hash;
}
}
}
public class Sancti
{
public string Symbol { get; set; }
public double Percentage { get; set; }
public override bool Equals(object obj)
{
Sancti otherInput = obj as Sancti;
if (ReferenceEquals(otherInput, null))
return false;
if ((this.Symbol == otherInput.Symbol) && (this.Percentage == otherInput.Percentage) )
return true;
else
{
return false;
}
}
public override int GetHashCode()
{
unchecked // Overflow is fine, just wrap
{
int hash = 17;
// Suitable nullity checks etc, of course :)
hash = hash * 23 + Symbol.GetHashCode();
hash = hash * 23 + Percentage.GetHashCode();
return hash;
}
}
}
Doing this, you just have to do this to check if Inputs are equal:
test1.Inputs.SequenceEqual(test2.Inputs);

Compare two lists of objects of the same type

This is how the custom object is defined:
public class AccountDomain
{
public string MAILDOMAIN { get; set; }
public string ORG_NAME { get; set; }
}
This is how I am populating the List of objects:
List<AccountDomain> mainDBAccountDomain = mainDB.GetAllAccountsAndDomains();
List<AccountDomain> manageEngineAccountDomain = ManageEngine.GetAllAccountsAndDomains();
This code works fine: if I look at the Locals window I can see a List of objects in both mainDBAccountDomain and manageEngineAccountDomain.
I'm struggling with the next bit. Ideally, I want a new list of type AccountDomain that contains all entries that are in mainDBAccountDomain and not ManageEngineAccountDomain.
Any help greatly appreciated, even if it's just a pointer in the right direction!
I want a new list of type AccountDomain that contains all entries that are in mainDBAccountDomain and not ManageEngineAccountDomain
It's very simple with LINQ to Objects; it's exactly what the Enumerable.Except function does:
var result = mainDBAccountDomain.Except(manageEngineAccountDomain).ToList();
You can pass a comparer to the Except function if you need something different from reference equality, or you could implement Equals and GetHashCode in AccountDomain (and optionally implement IEquatable<AccountDomain> on top of these).
See this explanation if you need more details about comparers.
Here's an example:
public class AccountDomainEqualityComparer : IEqualityComparer<AccountDomain>
{
public static readonly AccountDomainEqualityComparer Instance
= new AccountDomainEqualityComparer();
private AccountDomainEqualityComparer()
{
}
public bool Equals(AccountDomain x, AccountDomain y)
{
if (ReferenceEquals(x, y))
return true;
if (x == null || y == null)
return false;
return x.MAILDOMAIN == y.MAILDOMAIN
&& x.ORG_NAME == y.ORG_NAME;
}
public int GetHashCode(AccountDomain obj)
{
if (obj == null)
return 0;
return (obj.MAILDOMAIN ?? string.Empty).GetHashCode()
^ (397 * (obj.ORG_NAME ?? string.Empty).GetHashCode());
}
}
Then, you use it like this:
var result = mainDBAccountDomain.Except(manageEngineAccountDomain,
AccountDomainEqualityComparer.Instance)
.ToList();
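If you are on .NET 6 or later, Enumerable.ExceptBy lets you do the same thing with a key selector instead of a comparer class. This is a sketch under that assumption; AccountDomainDiff and OnlyInFirst are hypothetical names, and the key is a tuple of the two properties:

```csharp
using System.Collections.Generic;
using System.Linq;

public class AccountDomain
{
    public string MAILDOMAIN { get; set; }
    public string ORG_NAME { get; set; }
}

public static class AccountDomainDiff
{
    // Returns the entries of 'first' whose (MAILDOMAIN, ORG_NAME) pair
    // does not occur in 'second'. Value tuples compare by value,
    // so no custom comparer is needed.
    public static List<AccountDomain> OnlyInFirst(
        IEnumerable<AccountDomain> first,
        IEnumerable<AccountDomain> second)
    {
        return first
            .ExceptBy(second.Select(d => (d.MAILDOMAIN, d.ORG_NAME)),
                      d => (d.MAILDOMAIN, d.ORG_NAME))
            .ToList();
    }
}
```

Like Except, ExceptBy has set semantics: if the first list contains two entries with the same key, only one of them appears in the result.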

C# Custom list duplicate values based on a property

I have a custom list class let say,
public class Fruit
{
public string Name { get; set; }
public string Size { get; set; }
public string Weight{ get; set; }
}
Now I am adding records to it like this,
List<Fruit> Fruits= new List<Fruit>();
//some foreach loop
Fruit fruit = new Fruit();
fruit.Name = ...;
fruit.Size = ...;
fruit.Weight = ...;
Fruits.Add(fruit);
What do I want?
I want to change the public Fruit class so that it checks whether any fruit already in the list has the same weight and, if so, ignores the new one, i.e. doesn't add it to the list.
I would prefer to do it without changing the foreach loop logic.
Use LINQ .Any() - Determines whether any element of a sequence exists or satisfies a condition. (MSDN: http://msdn.microsoft.com/en-us/library/system.linq.enumerable.any.aspx)
if (!Fruits.Any(f => fruit.Weight != null && f.Weight == fruit.Weight))
Fruits.Add(fruit);
If duplicate weights are not allowed, I would use a HashSet<Fruit> with a custom IEqualityComparer:
public class FruitWeightComparer : IEqualityComparer<Fruit>
{
public bool Equals(Fruit x, Fruit y)
{
if (Object.ReferenceEquals(x, y)) return true;
if (x == null || y == null) return false;
return x.Weight == y.Weight;
}
public int GetHashCode(Fruit obj)
{
return obj.Weight == null ? 0 : obj.Weight.GetHashCode();
}
}
Now you can use the HashSet constructor with this comparer:
HashSet<Fruit> Fruits = new HashSet<Fruit>(new FruitWeightComparer());
// ...
bool notInSet = Fruits.Add(fruit);
HashSet.Add returns true if the item could be added.
You can control it at insert time by simply not inserting already existing fruit
if (!myFruits.Any(f => f.Weight == newFruit.Weight))
myFruits.Add(newFruit);
If you can't manipulate the insertion logic you can make a custom list that wraps a normal List<T> and changes the behavior of Add like in the above example:
public class FruitsWithDistinctWeightList : IEnumerable<Fruit>
{
private List<Fruit> internalList;
... // Constructor etc.
public void Add(Fruit fruit)
{
if (!internalList.Any(f => f.Weight == fruit.Weight))
internalList.Add(fruit);
}
... // Further impl of IEnumerable<Fruit> or IList<Fruit>
}
You could also use some existing collection that does not allow duplicate items. For example some hash based collection such as HashSet<Fruit>:
var fruitsWithDistinctWeight = new HashSet<Fruit>(new FruitWeightComparer());
Where you'd use a comparer that says fruits with equal weight are equal:
public class FruitWeightComparer : IEqualityComparer<Fruit>
{
public bool Equals(Fruit one, Fruit two)
{
return one.Weight == two.Weight;
}
public int GetHashCode(Fruit item)
{
return item.Weight.GetHashCode();
}
}
Note that a HashSet<T> is not ordered like a list is. Note that all of the code above for simplicity assumes that the Weight field is set. If you have public setters on your class (i.e. no guarantees of this) you would have to change appropriately.
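If you need both list ordering and a fast duplicate check, one possibility is to keep a List<Fruit> for the items and a HashSet<string> of the weights already seen. This is only a sketch; DistinctWeightFruitList is a hypothetical name, and Fruit is repeated here so the example stands alone:

```csharp
using System.Collections.Generic;

public class Fruit
{
    public string Name { get; set; }
    public string Size { get; set; }
    public string Weight { get; set; }
}

// Hypothetical wrapper: preserves insertion order and rejects repeated weights.
public class DistinctWeightFruitList
{
    private readonly List<Fruit> _items = new List<Fruit>();
    private readonly HashSet<string> _seenWeights = new HashSet<string>();

    public IReadOnlyList<Fruit> Items => _items;

    // Returns true if the fruit was added, false if its weight was already present.
    public bool Add(Fruit fruit)
    {
        // HashSet.Add returns false when the value is already in the set.
        if (!_seenWeights.Add(fruit.Weight))
            return false;
        _items.Add(fruit);
        return true;
    }
}
```

The trade-off is a second collection to maintain: if a fruit's Weight is changed after it has been added, the set of seen weights goes stale.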

Sort a List<T> by enum where enum is out of order

I have a List of messages.
Each message has a type.
public enum MessageType
{
Foo = 0,
Bar = 1,
Boo = 2,
Doo = 3
}
The enum names are arbitrary and cannot be changed.
I need to return the list sorted as: Boo, Bar, Foo, Doo
My current solution is to create a tempList, add the values in the order I want, return the new list.
List<Message> tempList = new List<Message>();
tempList.AddRange(messageList.Where(m => m.MessageType == MessageType.Boo));
tempList.AddRange(messageList.Where(m => m.MessageType == MessageType.Bar));
tempList.AddRange(messageList.Where(m => m.MessageType == MessageType.Foo));
tempList.AddRange(messageList.Where(m => m.MessageType == MessageType.Doo));
messageList = tempList;
How can I do this with an IComparer?
An alternative to using IComparer would be to build an ordering dictionary.
var orderMap = new Dictionary<MessageType, int>() {
{ MessageType.Boo, 0 },
{ MessageType.Bar, 1 },
{ MessageType.Foo, 2 },
{ MessageType.Doo, 3 }
};
var orderedList = messageList.OrderBy(m => orderMap[m.MessageType]);
So, let's write our own comparer:
public class MyMessageComparer : IComparer<MessageType> {
protected IList<MessageType> orderedTypes {get; set;}
public MyMessageComparer() {
// you can reorder these however you want
orderedTypes = new List<MessageType>() {
MessageType.Boo,
MessageType.Bar,
MessageType.Foo,
MessageType.Doo,
};
}
public int Compare(MessageType x, MessageType y) {
var xIndex = orderedTypes.IndexOf(x);
var yIndex = orderedTypes.IndexOf(y);
return xIndex.CompareTo(yIndex);
}
};
How to use:
messages.OrderBy(m => m.MessageType, new MyMessageComparer())
There is an easier way: just create an orderedTypes list and use another overload of OrderBy:
var orderedTypes = new List<MessageType>() {
MessageType.Boo,
MessageType.Bar,
MessageType.Foo,
MessageType.Doo,
};
messages.OrderBy(m => orderedTypes.IndexOf(m.MessageType)).ToList();
Hm... Let's try to take advantage of writing our own IComparer. Idea: write it like our last example, but with different semantics. Like this:
messages.OrderBy(
m => m.MessageType,
new EnumComparer<MessageType>() {
MessageType.Boo,
MessageType.Foo }
);
Or this:
messages.OrderBy(m => m.MessageType, new EnumComparer<MessageType>());
Okay, so here is what we need. Our own comparer:
Must accept enum as generic type (how to solve)
Must be usable with collection initializer syntax (how to)
Must sort by default order, when we have no enum values in our comparer (or some enum values aren't in our comparer)
So, here is the code:
public class EnumComparer<TEnum>: IComparer<TEnum>, IEnumerable<TEnum> where TEnum: struct, IConvertible {
protected static IList<TEnum> TypicalValues { get; set; }
protected IList<TEnum> _reorderedValues;
protected IList<TEnum> ReorderedValues {
get { return _reorderedValues.Any() ? _reorderedValues : TypicalValues; }
set { _reorderedValues = value; }
}
static EnumComparer() {
if (!typeof(TEnum).IsEnum)
{
throw new ArgumentException("T must be an enumerated type");
}
TypicalValues = new List<TEnum>();
foreach (TEnum value in Enum.GetValues(typeof(TEnum))) {
TypicalValues.Add(value);
};
}
public EnumComparer(IList<TEnum> reorderedValues = null) {
if (reorderedValues == null ) {
_reorderedValues = new List<TEnum>();
return;
}
_reorderedValues = reorderedValues;
}
public void Add(TEnum value) {
if (_reorderedValues.Contains(value))
return;
_reorderedValues.Add(value);
}
public int Compare(TEnum x, TEnum y) {
var xIndex = ReorderedValues.IndexOf(x);
var yIndex = ReorderedValues.IndexOf(y);
// Values missing from our order list go at the end
// and are ordered among themselves by their default order.
if (xIndex == -1) {
if (yIndex == -1) {
xIndex = TypicalValues.IndexOf(x);
yIndex = TypicalValues.IndexOf(y);
return xIndex.CompareTo(yIndex);
}
return 1; // only x is unlisted, so x sorts after y
}
if (yIndex == -1) {
return -1; // only y is unlisted, so x sorts before y
}
return xIndex.CompareTo(yIndex);
}
public void Clear() {
_reorderedValues = new List<TEnum>();
}
private IEnumerable<TEnum> GetEnumerable() {
return Enumerable.Concat(
ReorderedValues,
TypicalValues.Where(v => !ReorderedValues.Contains(v))
);
}
public IEnumerator<TEnum> GetEnumerator() {
return GetEnumerable().GetEnumerator();
}
IEnumerator IEnumerable.GetEnumerator() {
return GetEnumerable().GetEnumerator();
}
}
So, well, let's make sorting faster. We need our own overload of OrderBy for our enums:
public static class LinqEnumExtensions
{
public static IEnumerable<TSource> OrderBy<TSource, TEnum>(this IEnumerable<TSource> source, Func<TSource, TEnum> selector, EnumComparer<TEnum> enumComparer) where TEnum : struct, IConvertible
{
foreach (var enumValue in enumComparer)
{
foreach (var sourceElement in source.Where(item => selector(item).Equals(enumValue)))
{
yield return sourceElement;
}
}
}
}
Yeah, that's lazy. You can google how yield works. Well, let's test speed. Simple benchmark: http://pastebin.com/P8qaU20Y. Results for n = 1000000:
Enumerable orderBy, elementAt: 00:00:04.5485845
Own orderBy, elementAt: 00:00:00.0040010
Enumerable orderBy, full sort: 00:00:04.6685977
Own orderBy, full sort: 00:00:00.4540575
We can see that our own OrderBy is lazier than the standard OrderBy (it doesn't need to sort everything), and it is faster even for a full sort.
Problems in this code: it doesn't support ThenBy(). If you need this, you can write your own LINQ extension that returns IOrderedEnumerable. There is a blog post series by Jon Skeet which goes into LINQ to Objects in some depth, providing a complete alternative implementation. The basis of IOrderedEnumerable is covered in part 26a and 26b, with more details and optimization in 26c and 26d.
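One more thing to be aware of: the extension above re-enumerates the source once per enum value. A variation that reads the source only once is to bucket the items with ToLookup first. This is a sketch, not part of the answer above; LookupOrdering and OrderByKeys are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LookupOrdering
{
    // Yields the items of 'source' grouped by key, in the order given by 'keyOrder'.
    // Items whose key is not listed in 'keyOrder' are dropped, so list every key you care about.
    public static IEnumerable<TSource> OrderByKeys<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IEnumerable<TKey> keyOrder)
    {
        // One pass over the source; items keep their relative order within each key.
        var lookup = source.ToLookup(keySelector);
        foreach (var key in keyOrder)
            foreach (var item in lookup[key]) // a missing key yields an empty sequence
                yield return item;
    }
}
```

With the types from the question it could be called as messages.OrderByKeys(m => m.MessageType, new[] { MessageType.Boo, MessageType.Bar, MessageType.Foo, MessageType.Doo }). Unlike the lazy version, this buffers the whole source on the first MoveNext, which is the price for the single pass.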
Instead of using an IComparer, you could also use a SelectMany approach, which should have better performance for large message lists, if you have a fixed number of message types.
var messageTypeOrder = new [] {
MessageType.Boo,
MessageType.Bar,
MessageType.Foo,
MessageType.Doo,
};
List<Message> tempList = messageTypeOrder
.SelectMany(type => messageList.Where(m => m.MessageType == type))
.ToList();
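You can avoid writing a completely new type just to implement IComparer. Use the Comparer<T> class instead:
IComparer<Message> comparer = Comparer<Message>.Create((x, y) =>
{
// Compare x and y here; return a negative number, zero, or a positive number.
return x.MessageType.CompareTo(y.MessageType); // placeholder ordering
});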
tempList.Sort(comparer);
You can build a mapping dictionary dynamically from the Enum values with LINQ like this:
var mappingDictionary = Enum.GetNames(typeof(Hexside))
.OrderBy(label => label)
.Select((label, index) => new { Index = index, Label = label }).ToList();
Now any new values added to the enum in future will automatically get properly mapped.
Also, if someone decides to renumber, refactor, or reorder the enumeration, everything is handled automatically.
Update:
As pointed out below, alphabetical ordering was not called for; rather a semi-alphabetical ordering, so essentially random. Although not an answer to this particular question, this technique might be useful to future visitors, so I will leave it standing.
No need to have the mapping. This should give you the list ordered based on the enum. You don't have to modify anything even when you change the enum's order or add new items...
var result = (from x in tempList
join y in Enum.GetValues(typeof(MessageType)).Cast<MessageType>()
on x.MessageType equals y
orderby y
select x).ToList();
If you need to get this working with Entity Framework (EF), you have to spell out your enum ordering in the OrderBy, like this:
messageList.OrderBy(m =>
m.MessageType == MessageType.Boo ? 0 :
m.MessageType == MessageType.Bar ? 1 :
m.MessageType == MessageType.Foo ? 2 :
m.MessageType == MessageType.Doo ? 3 : 4
);
This creates a sub select with CASE WHEN, then ORDER BY on that temporary column.

Anonymous type and intersection of 2 lists

public class thing
{
public int Id{get;set;}
public decimal shouldMatch1 {get;set;}
public int otherMatch2{get;set;}
public string doesntMatter{get;set;}
public int someotherdoesntMatter{get;set;}
}
List<thing> firstList = new List<thing>();
List<thing> secondList = new List<thing>();
firstList.Add( new thing{ Id=1,shouldMatch1 = 1.11M, otherMatch2=1000,doesntMatter="Some fancy string", someotherdoesntMatter=75868});
firstList.Add( new thing{ Id=2,shouldMatch1 = 2.22M, otherMatch2=2000,doesntMatter="Some fancy string", someotherdoesntMatter=65345});
firstList.Add( new thing{ Id=3,shouldMatch1 = 3.33M, otherMatch2=3000,doesntMatter="Some fancy string", someotherdoesntMatter=75998});
firstList.Add( new thing{ Id=4,shouldMatch1 = 4.44M, otherMatch2=4000,doesntMatter="Some fancy string", someotherdoesntMatter=12345});
secondList.Add( new thing{ Id=100,shouldMatch1 = 1.11M, otherMatch2=1000,doesntMatter="Some fancy string", someotherdoesntMatter=75868});
secondList.Add( new thing{ Id=200,shouldMatch1 = 2.22M, otherMatch2=200,doesntMatter="Some fancy string", someotherdoesntMatter=65345});
secondList.Add( new thing{ Id=300,shouldMatch1 = 3.33M, otherMatch2=300,doesntMatter="Some fancy string", someotherdoesntMatter=75998});
secondList.Add( new thing{ Id=400,shouldMatch1 = 4.44M, otherMatch2=4000,doesntMatter="Some fancy string", someotherdoesntMatter=12345});
//Select new firstList.Id,secondList.Id where firstList.shouldMatch1 ==secondList.shouldMatch1 && firstList.otherMatch2==secondList.otherMatch2
//Should return
//1,100
//4,400
Is there a way to intersect the lists, or must I iterate them?
Pseudocode
firstList.Intersect(secondList).Where(firstList.shouldMatch1 == secondList.shouldMatch1 && firstList.otherMatch2 == secondList.otherMatch2)
Select new {Id1=firstList.Id,Id2=secondList.Id};
Regards
_Eric
You could use an approach other than intersecting and implementing an IEqualityComparer, as follows:
var query = from f in firstList
from s in secondList
where f.shouldMatch1 == s.shouldMatch1 &&
f.otherMatch2 == s.otherMatch2
select new { FirstId = f.Id, SecondId = s.Id };
foreach (var item in query)
Console.WriteLine("{0}, {1}", item.FirstId, item.SecondId);
This is essentially the Enumerable.SelectMany method in query format. A join would likely be quicker than this approach.
Consider using a multi-condition join to join your records. An intersect would cause you to lose ID's either on the left or the right.
Here is an example of a working multi-column join for this particular scenario. The appeal of this query is that it requires no equality comparer, and it allows you to retrieve the ID column while joining on the other specified columns.
var query = from first in firstList
join second in secondList on
new { first.shouldMatch1, first.otherMatch2 }
equals
new { second.shouldMatch1, second.otherMatch2 }
select new
{
FirstId = first.Id,
SecondId = second.Id
};
You need to make your thing type override Equals and GetHashCode to indicate its equality semantics:
public sealed class Thing : IEquatable<Thing>
{
public int Id{get;set;}
public decimal ShouldMatch1 {get;set;}
public int OtherMatch2{get;set;}
public string DoesntMatter{get;set;}
public int SomeOtherDoesntMatter{get;set;}
public override int GetHashCode()
{
int hash = 17;
hash = hash * 31 + ShouldMatch1.GetHashCode() ;
hash = hash * 31 + OtherMatch2.GetHashCode() ;
return hash;
}
public override bool Equals(object other) {
return Equals(other as Thing);
}
public bool Equals(Thing other) {
if (other == null) {
return false;
}
return ShouldMatch1 == other.ShouldMatch1 &&
OtherMatch2 == other.OtherMatch2;
}
}
Note that sealing the class makes the equality test simpler. Also note that if you put one of these in a dictionary as a key but then change Id, ShouldMatch1 or OtherMatch2 you won't be able to find it again...
Now if you're using a real anonymous type, you don't get to do this... and it's tricky to implement an IEqualityComparer<T> to pass to Intersect when it's anonymous. You could write an IntersectBy method, a bit like MoreLINQ's DistinctBy method... that's probably the cleanest approach if you're really using an anonymous type.
You'd use it like this:
var query = first.Intersect(second);
You then end up with an IEnumerable<Thing> which you can get the right bits out of.
Another option is to use a join:
var query = from left in first
join right in second
on new { left.ShouldMatch1, left.OtherMatch2 } equals
new { right.ShouldMatch1, right.OtherMatch2 }
select new { left, right };
(EDIT: I've just noticed others have done a join too... ah well.)
Yet another option if you're only interested in the bits of the match is to project the sequences:
var query = first.Select(x => new { x.ShouldMatch1, x.OtherMatch2 })
.Intersect(second.Select(x => new { x.ShouldMatch1,
x.OtherMatch2 }));
You will need an equality comparer:
public class thingEqualityComparer : IEqualityComparer<thing>
{
#region IEqualityComparer<thing> Members
public bool Equals(thing x, thing y) {
return x.shouldMatch1 == y.shouldMatch1 && x.otherMatch2 == y.otherMatch2;
}
public int GetHashCode(thing obj) {
// Must be consistent with Equals, so hash only the members Equals compares.
return unchecked((obj.shouldMatch1.GetHashCode() * 397) ^ obj.otherMatch2);
}
#endregion
}
Then you can intersect the collections with:
firstList.Intersect(secondList, new thingEqualityComparer());
Alternatively, you can override the Equals method (see John's solution).
Also, please note that thing is not an anonymous class; an anonymous type would be, for example, new { prop = 1 }.
