Using LINQ's Except<> method with an object from an external API - c#

I have a List<> of HTMLAnchor objects (HTMLAnchor is an object from an external API). I want to exclude clicking on some of the links as they are for logging out, etc.
Using LINQ, I can use the Except operator. However, on here (http://msdn.microsoft.com/en-us/vcsharp/aa336761.aspx#except1), the example using the custom type (Product if I remember correctly) does not use the overloaded version of Except.
Furthermore, if I am using a type not defined by me, do the rules change? And should the class I write to implement IEquality have the same name I am trying to exclude in my generic collection (HtmlAnchor)?
Thanks

If you want to compare anchors using the default Equals method, which in this case will probably give you reference equality, you don't need to do anything: just pass the set of anchors to exclude:
anchors.Except(anchorsToExclude);
If the members of the sequence to exclude will not be reference-equal (or whatever HtmlAnchor.Equals deems equal), the interface you want to implement is IEqualityComparer<T>. This exists precisely to allow you to provide a custom equality comparison for a type that you don't define, so the rules don't change -- you just have to use the appropriate overload of Except.
So you would create a class called e.g. HtmlAnchorEqualityComparer which implements IEqualityComparer<HtmlAnchor>, and pass an instance of that to Except:
anchors.Except(anchorsToExclude, new HtmlAnchorEqualityComparer());
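A minimal sketch of such a comparer might look like the following. The HtmlAnchor class here is a hypothetical stand-in for the external API's type, and the choice to compare anchors by a case-insensitive Href property is an assumption made only for illustration; the real type may expose different members.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the external API's HtmlAnchor type.
public sealed class HtmlAnchor
{
    public string Href { get; }
    public HtmlAnchor(string href) { Href = href; }
}

// Treats two anchors as equal when their Href values match, ignoring case (an assumed rule).
public sealed class HtmlAnchorEqualityComparer : IEqualityComparer<HtmlAnchor>
{
    public bool Equals(HtmlAnchor x, HtmlAnchor y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return string.Equals(x.Href, y.Href, StringComparison.OrdinalIgnoreCase);
    }

    // Must agree with Equals: anchors that compare equal need the same hash code.
    public int GetHashCode(HtmlAnchor anchor)
    {
        return anchor?.Href == null
            ? 0
            : StringComparer.OrdinalIgnoreCase.GetHashCode(anchor.Href);
    }
}

public static class AnchorFilter
{
    // Returns the anchors that have no equal counterpart in anchorsToExclude.
    public static List<HtmlAnchor> WithoutExcluded(
        IEnumerable<HtmlAnchor> anchors,
        IEnumerable<HtmlAnchor> anchorsToExclude)
    {
        return anchors.Except(anchorsToExclude, new HtmlAnchorEqualityComparer()).ToList();
    }
}
```

Note that Except is a set operation, so it also drops duplicates (as judged by the comparer) from the first sequence.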

When you don't have control over the type and the default equality operations do not suffice (i.e. Equals is not properly implemented), you should use the overload that takes an IEqualityComparer<T> parameter. This is an interface that you can implement yourself to provide the definition of equality that you need.

Related

Default comparer when using OrderBy extension

Out of curiosity: What comparer is used when sorting a bunch of objects using the following extension method?
OrderBy(x=> x)
Background: I have to check whether two ISet<T> instances contain the same elements and considered using the
bool setsEqual = MySet.SequenceEqual(OtherSet);
method. As the order of the elements contained in the sets is not defined and may differ, SequenceEqual would fail in those cases where the internal order is not the same. So I would have to explicitly define an order. As the ordering algorithm itself is completely irrelevant as long as it's stable, I just used an "identity" lambda expression:
bool setsEqual = MySet.OrderBy(x => x).SequenceEqual(OtherSet.OrderBy(x => x));
But what does "compare the objects themselves" mean to the code? As this OrderBy extension method is a generic one, there must be a default comparison algorithm in place that is able to sort objects without knowing anything more about them, and that would mean the comparison for sorting has to be delegated to the type of the set elements itself. Is there an interface that the elements' type would have to support, or is there a default comparer (maybe comparing internal memory addresses of objects) in place?
To answer the question of sorting: sorting uses IComparable<T>, or IComparable if that isn't implemented. The IComparable interfaces force you to implement an int CompareTo(object) method (or an int CompareTo(T) method if you used the typed version).
The order of your elements is determined by the sign of the int. The value returned is interpreted as follows:
0: the two objects are equivalent in sort order
negative (e.g. -1): this object precedes the compared object (i.e. comes before it)
positive (e.g. 1): this object follows the compared object (i.e. comes after it)
The actual value is ignored; the sign is all that matters. If you implement your own IComparable interface, you have to choose the semantics for sort order.
Many types already implement IComparable, like all your numbers, strings, etc. You'll need to implement it explicitly if you need to sort objects you've created yourself. It's not a bad practice if you intend those objects to be displayed in a list on screen at all.
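As a rough illustration of the CompareTo contract, here is a sketch with a hypothetical Temperature type (the type and its Degrees property are made up for this example). OrderBy(x => x) ends up calling CompareTo through the default comparer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type used only to illustrate IComparable<T>.
public sealed class Temperature : IComparable<Temperature>
{
    public double Degrees { get; }
    public Temperature(double degrees) { Degrees = degrees; }

    // Negative: this instance sorts before other; zero: same position; positive: after.
    public int CompareTo(Temperature other)
    {
        if (other is null) return 1; // by convention, any instance sorts after null
        return Degrees.CompareTo(other.Degrees);
    }
}

public static class TemperatureSorting
{
    // Sorts with the "identity" key selector, which relies on Temperature.CompareTo.
    public static List<double> SortedDegrees(IEnumerable<Temperature> items)
    {
        return items.OrderBy(t => t).Select(t => t.Degrees).ToList();
    }
}
```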
As to your specific case, where you just need to determine whether a set and another IEnumerable are equivalent, you would use the ISet<T>.SetEquals(IEnumerable<T>) method, which is implemented in the standard library set implementations. Sets, by definition, only guarantee that the values are unique, so as long as the number of elements is the same, you only need to detect that all the elements in one IEnumerable can be found in the set.
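A small sketch of the SetEquals approach, using a hypothetical helper name (SameElements) just to wrap the call:

```csharp
using System;
using System.Collections.Generic;

public static class SetComparison
{
    // True when both collections contain the same elements, regardless of order.
    public static bool SameElements<T>(ISet<T> first, IEnumerable<T> second)
    {
        return first.SetEquals(second);
    }
}
```

With a HashSet<int> containing 1, 2, 3, comparing against a set built in the order 3, 2, 1 should report equality, with no sorting involved.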
The method uses the IComparable<T> or the IComparable interface, depending on which of the two is implemented. If neither is implemented, the sort throws an exception once elements are actually compared.
However, you won't need to order your instances before comparing the sets. Simply loop over one set and check whether all of its elements are contained in the other set. Or use this:
var areEqual = firstSet.All(x => secondSet.Contains(x)) && secondSet.All(x => firstSet.Contains(x));
Or even simpler:
var areEqual = !firstSet.Except(secondSet).Any() && !secondSet.Except(firstSet).Any();
Both ways perform much faster than your approach, as the iteration of elements stops as soon as the first element is found that does not fit. Using OrderBy you'd loop over all elements, regardless of whether there was already a mismatch.
Unlike for equality, there's no 'default' comparer for objects in general.
It seems that Comparer<TKey>.Default always returns a comparer, for any type TKey. If no sensible comparison method can be determined, you get an exception, but only once the comparer is used.
The exception message is: At least one object must implement IComparable.
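The lazy failure can be checked with a small probe like the one below. Opaque is a hypothetical type that implements neither comparison interface, and the sketch assumes the direct Compare call surfaces an ArgumentException.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type that implements neither IComparable<T> nor IComparable.
public sealed class Opaque { }

public static class DefaultComparerProbe
{
    // Returns true when comparing two distinct Opaque instances throws.
    public static bool ComparisonThrows()
    {
        // Obtaining the comparer succeeds even though Opaque is not comparable.
        Comparer<Opaque> comparer = Comparer<Opaque>.Default;
        try
        {
            // Two distinct instances, so the comparer cannot shortcut on reference equality.
            comparer.Compare(new Opaque(), new Opaque());
            return false;
        }
        catch (ArgumentException)
        {
            return true;
        }
    }
}
```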

How can I be sure that List.Contains works for a list of DataTables?

If I have this:
List<DataTable> listDataTables = functionToAddSomeDataTables();
and I want to do a comparison like this:
if(listDataTables.Contains(aDataTable))
{
//do something.
}
How can I know if it is comparing the reference or the schema or the content or all of the above?
Do I need to write my own IEquatable.Equals to make sure it works properly or can I trust that the built in .Equals for DataTable works as I would hope?
Is there a general rule or observation for knowing when .Contains, or similar comparisons are by reference or by value?
Thanks in advance :)
List<T>.Contains uses the default equality comparer for T, which calls IEquatable<T>.Equals if the type implements it and falls back to the object's Equals(object) method otherwise. Since DataTable's documentation says that its Equals was inherited from Object.Equals, the default Object.Equals implementation of reference comparison is what will be used. If you want the comparison to use something else, pass an equality comparer to LINQ's Contains method.
(as an example, compare DataTable Methods and Decimal Methods: only Decimal lists Equals on the list on the left, and says "(Overrides ValueType.Equals(Object).)" instead of "(Inherited from Object.)")
You have to write your own Equals method and compare the needed properties there. The built-in (default) Contains() method compares values for value types and for types that override Equals (int, string...), and compares references for other reference types (your class is a reference type).
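To make the difference concrete, here is a sketch of a custom comparer passed to LINQ's Contains. The comparer name and its rule (same TableName and same column names in the same order) are assumptions chosen for illustration; pick whatever notion of equality your case needs.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical comparer: tables are equal when TableName and column names match.
public sealed class DataTableSchemaComparer : IEqualityComparer<DataTable>
{
    public bool Equals(DataTable x, DataTable y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x is null || y is null) return false;
        return x.TableName == y.TableName
            && ColumnNames(x).SequenceEqual(ColumnNames(y));
    }

    // Hashing on TableName alone is consistent with Equals above.
    public int GetHashCode(DataTable table)
    {
        return table?.TableName?.GetHashCode() ?? 0;
    }

    private static IEnumerable<string> ColumnNames(DataTable table)
    {
        return table.Columns.Cast<DataColumn>().Select(c => c.ColumnName);
    }
}

public static class DataTableLookup
{
    // Uses the Enumerable.Contains overload that accepts an IEqualityComparer<T>.
    public static bool ContainsBySchema(List<DataTable> tables, DataTable candidate)
    {
        return tables.Contains(candidate, new DataTableSchemaComparer());
    }
}
```

With two separately constructed tables that share a name and columns, list.Contains(other) reports false (reference comparison), while ContainsBySchema reports true.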

FluentAssertions: ShouldBeEquivalentTo method still invokes Object.Equals()?

I have a class, let's call it Foo, that is a value type and hence overrides the Equals/GetHashCode() methods. In a separate test fixture, I want to assert that all the properties on Foo have been set properly, not just the properties used for equality. For this reason, my test assertion specifically uses the ShouldBeEquivalentTo method, which the documentation suggests will consider two objects to be equivalent if "both object graphs have equally named properties with the same value, irrespective of the type of those objects."
However, it appears that ShouldBeEquivalentTo invokes Foo.Equals method, gets a true result and then proceeds to short-circuit the reflection based property matching that ShouldBeEquivalentTo promises.
Is this expected behavior? If so, in inspecting the FA source, I saw no easy way to alter this behavior (IEquivalencyStep is declared internal). Are there any other ways to get around this?
Edit:
Dennis: Yes, the two objects I'm comparing are of the same type. To summarize, I have overridden Equals on class Foo, but do not want FA to use this notion of equality for my unit tests.
I think you cannot alter the behavior of this function; it assumes that if you override Equals, then you want the comparison done that way. You can try the following instead:
dto.ShouldHave().SharedProperties().EqualTo(foo);
Or you can implement a NativeEquals method in the Foo class that calls base.Equals(), and then use this method explicitly for comparison; it will work well for value types.

"To" vs "As" vs "Get" Method Prefixes

Does anyone know of any naming convention rules/guidelines that dictate when to use a "To" prefix (myVariable.ToList()), an "As" prefix (myVariable.AsEnumerable()), or a "Get" prefix (myVariable.GetHashCode())?
I assume there's no convention, so just use what fits best to what you're doing.
"To" creates something new / converts it
"As" is just a "different view" on the same object, e.g. by using iterators
"Get" is a getter for everything else
My understanding/conventions:
"To" performs a conversion; A new object is created in memory, based on the data inherent in your source.
"As" performs a cast; The same reference passed in is returned behind the "mask" of a different type.
"Get" performs pretty much anything else that takes in a source and whose primary product is a transformed result. Gets can perform a calculation, return a child, retrieve data from a store, instantiate objects from a default state, etc. Not all such methods have to be named "Get", but most methods intended to calculate, instantiate, project, or otherwise transform, and then return the product as their primary purpose are "getters".
When myObj is not related to List, prefix "To" to convert.
When myObj is a subclass of Enumerable, prefix "As" to give it as Enumerable
When myObj is not related to List, but it composes / can compose List use "Get" prefix
If you're using Entity Framework for CRUD operations on a database, then using .ToList() will have your query be executed right there, as opposed to using AsEnumerable() which will use deferred execution until you actually try to access a record.
That's one that I thought of right off the top of my head.
As is a reinterpretation of existing data. AsEnumerable does nothing. It is implemented as return input;.
To implies a conversion.
Get does not imply any of the former.
You will find valid deviations from these rules. They are not set in stone.
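The To/As distinction is easy to observe on a List<int>. The helper names below are hypothetical; the sketch only checks object identity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ToVersusAs
{
    // AsEnumerable hands back the same object, just typed as IEnumerable<int>.
    public static bool AsEnumerableReturnsSameInstance(List<int> source)
    {
        IEnumerable<int> view = source.AsEnumerable();
        return ReferenceEquals(view, source);
    }

    // ToList copies the elements into a brand-new list.
    public static bool ToListReturnsNewInstance(List<int> source)
    {
        List<int> copy = source.ToList();
        return !ReferenceEquals(copy, source);
    }
}
```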
I would say that To vs As has more to do with differences like class vs interface,
i.e. you say AsEnumerable when you really want to return something that implements an interface.
ToList, on the other hand, returns a new object that represents the current state of the current object; e.g. ToDictionary is just another way of representing the same data.
Third, Get methods return some property of the object or something about part of its state, not the full state.

Why Find methods are absent in Enumerable<> while present in List<>?

I think my question says it all. Why are Find methods absent in Enumerable<> while present in List<>? If they were there, it would have reduced the burden of writing large LINQ queries to find something in an Enumerable<>. I know I can change the Enumerable to a List using .ToList(), but that would be a hack.
The Enumerable.FirstOrDefault<TSource> Extension Method does exactly the same as the List<T>.Find Method.
 
Enumerable.FirstOrDefault<TSource> Method
Returns the first element of the sequence that satisfies a condition or a default value if no such element is found.
Return Value: default(TSource) if source is empty or if no element passes the test specified by predicate; otherwise, the first element in source that passes the test specified by predicate.
 
List<T>.Find Method
Searches for an element that matches the conditions defined by the specified predicate, and returns the first occurrence within the entire List<T>.
Return Value: The first element that matches the conditions defined by the specified predicate, if found; otherwise, the default value for type T.
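Side by side, the two calls look like this. The wrapper names are hypothetical and the "first even number" predicate is just an example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FindVersusFirstOrDefault
{
    // List<T>.Find: available only on List<T>.
    public static int FirstEvenViaFind(List<int> numbers)
    {
        return numbers.Find(n => n % 2 == 0);
    }

    // Enumerable.FirstOrDefault: works on any IEnumerable<T>.
    public static int FirstEvenViaLinq(IEnumerable<int> numbers)
    {
        return numbers.FirstOrDefault(n => n % 2 == 0);
    }
}
```

Both return default(int), i.e. 0, when nothing matches, so for value types a "not found" result can be indistinguishable from a genuine 0.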
It's very common for classes to include more "helper functions" than interfaces, for the simple reason that adding a helper function to a class simply entails adding the code for that method to one place (the class in question), while adding a helper function to an interface compels every implementation of that interface to add code for that function.
It would be helpful if the next version of the CLR could provide a means by which interfaces could specify default implementations for their members, especially if implementations of old versions of an interface could be regarded as implementing new versions, using the default implementation for any new members. If such a thing were legal, IEnumerable<T> could add a Count method, which could be overridden by any implementation which was able to determine the number of items without having to iterate through them, but which would otherwise use a default method that would count via iteration. If such a feature existed, adding members like Find to IEnumerable<T> would be useful. Unfortunately, I know of no plan to implement such a feature.
