Why is GetHashCode part of the Object class? Only a small fraction of objects are ever used as keys in hash tables. Wouldn't it be better to have a separate interface that must be implemented when we want objects of a class to serve as keys in a hash table?
There must be a reason the MS team decided to include this method in the Object class and thus make it available "everywhere".
It was a design mistake copied from Java, IMO.
In my perfect world:
ToString would be renamed ToDebugString to set expectations appropriately
Equals and GetHashCode would be gone
There would be a ReferenceEqualityComparer implementation of IEqualityComparer<T>: the equals part of this is easy (object.ReferenceEquals), and RuntimeHelpers.GetHashCode returns the "original" identity-based hash code even if GetHashCode is overridden
Objects wouldn't have monitors associated with them: Monitor would have a constructor, and Enter/Exit etc would be instance methods.
Equality (and thus hashing) cause problems in inheritance hierarchies in general - so long as you can always specify the kind of comparison you want to use (via IEqualityComparer<T>) and objects can implement IEquatable<T> themselves if they want to, I don't see why it should be on Object. EqualityComparer<T>.Default could use the reference implementation if T didn't implement IEquatable<T> and defer to the objects otherwise. Life would be pleasant.
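A minimal sketch of such a reference-equality comparer. It uses RuntimeHelpers.GetHashCode, which returns the identity-based hash code regardless of any GetHashCode override. The names IdentityComparer, AlwaysEqual and IdentityDemo are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical name: compares by reference identity only.
public sealed class IdentityComparer<T> : IEqualityComparer<T> where T : class
{
    public bool Equals(T x, T y) => ReferenceEquals(x, y);

    // Ignores any GetHashCode override on T.
    public int GetHashCode(T obj) => RuntimeHelpers.GetHashCode(obj);
}

// Hypothetical type whose instances all compare equal by value.
public sealed class AlwaysEqual
{
    public override bool Equals(object obj) => obj is AlwaysEqual;
    public override int GetHashCode() => 0;
}

public static class IdentityDemo
{
    public static (int byValue, int byIdentity) CountDistinct()
    {
        var a = new AlwaysEqual();
        var b = new AlwaysEqual();

        // Default comparer defers to the overrides: a and b collapse into one entry.
        var byValue = new HashSet<AlwaysEqual> { a, b };

        // Identity comparer keeps both instances apart.
        var byIdentity = new HashSet<AlwaysEqual>(new IdentityComparer<AlwaysEqual>()) { a, b };

        return (byValue.Count, byIdentity.Count);
    }
}
```

With this, the caller picks the kind of equality at the point of use instead of relying on whatever the type decided.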
Ah well. While I'm at it, array covariance was another platform mistake. If you want language mistakes in C#, I can start another minor rant if you like ;) (It's still by far my favourite language, but there are things I wish had been done differently.)
I've blogged about this elsewhere, btw.
Only small part of the objects of the classes are used as keys in hash tables
I would argue that this is not a true statement. Many classes are often used as keys in hash tables - and object references themselves are very often used. Having the default implementation of GetHashCode exist in System.Object means that ANY object can be used as a key, without restrictions.
This seems much nicer than forcing a custom interface on objects, just to be able to hash them. You never know when you may need to use an object as the key in a hashed collection.
This is especially true when using things like HashSet<T> - In this case, often, an object reference is used for tracking and uniqueness, not necessarily as a "key". Had hashing required a custom interface, many classes would become much less useful.
It allows any object to be used as a key by "identity". This is beneficial in some cases, and harmful in none. So, why not?
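To illustrate keying by identity, here is a small sketch. Connection is a hypothetical class that overrides neither Equals nor GetHashCode, so the defaults inherited from System.Object apply.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type with no Equals/GetHashCode overrides.
public sealed class Connection
{
    public string Host { get; }
    public Connection(string host) { Host = host; }
}

public static class IdentityKeyDemo
{
    public static bool[] Run()
    {
        var first = new Connection("example.org");
        var second = new Connection("example.org"); // same data, different instance

        var seen = new HashSet<Connection> { first };

        // Membership follows object identity, not the Host value.
        return new[] { seen.Contains(first), seen.Contains(second) };
    }
}
```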
So anything can be keyed on. (Sorta)
That way HashTable can take an object vs something that implements IHashable for example.
To drive simple equality comparison.
On objects that don't override it, it defaults to .NET's internal hash code, which I believe is either a unique ID for the object instance or a hash of the memory footprint it takes up. (I cannot remember, and .NET Reflector can't go beyond the .NET part of the class.)
GetHashCode is in object so that you can use anything as a key into a Hashtable, a basic container class. It provides symmetry. I can put anything into an ArrayList, why not a Hashtable?
If you required classes to implement IHashable, then for every sealed class that doesn't implement IHashable you would be writing adapters that add the hashing capability whenever you want to use it as a key. Instead, you get it by default.
Also, hash codes are a good second line for object equality comparison (the first line is pointer equality).
If every class has GetHashCode, you can put every object in a hash table. Imagine you have to use third-party objects (which you can't modify) and want to put them into a hash table. If these objects didn't implement your fictional IHashable, you couldn't do it. This is obviously a bad thing ;)
Just a guess, but the garbage collector may store hashtables of some objects internally (perhaps to keep track of finalizable objects), which means any object needs to have a hash key.
I am doubting my understanding of the System.Collections.Generic.IReadOnlyCollection<T> semantics, and I am unsure how to design with concepts like read-only and immutable. Let me describe the two interpretations I'm torn between, using the documentation, which states
Represents a strongly-typed, read-only collection of elements.
Depending on whether I stress the words 'Represents' or 'read-only' (when pronouncing in my head, or out loud if that's your style) I feel like the sentence changes meaning:
When I stress 'read-only', the documentation defines in my opinion observational immutability (a term used in Eric Lippert's article), meaning that an implementation of the interface may do as it pleases as long as no mutations are visible publicly†.
When I stress 'Represents', the documentation defines (in my opinion, again) an immutable facade (again described in Eric Lippert's article), which is a weaker form, where mutations may be possible, but just cannot be made by the user. For example, a property of type IReadOnlyCollection<T> makes clear to the user (i.e. someone that codes against the declaring type) that he may not modify this collection. However, it is ambiguous in whether the declaring type itself may modify the collection.
For the sake of completeness: the interface doesn't carry any semantics other than those carried by the signatures of its members. In this case, observational or facade immutability is implementation-dependent (and not just dependent on the implementing type, but on the instance).
The first option is actually my preferred interpretation, although this contract can easily be broken, e.g. by constructing a ReadOnlyCollection<T> from an array of T's and then setting a value in the wrapped array.
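The way the contract breaks can be sketched in a few lines. ReadOnlyCollection<T> wraps the list it is given rather than copying it, so a write to the array shows through the wrapper (FacadeDemo is just an illustrative name):

```csharp
using System;
using System.Collections.ObjectModel;

public static class FacadeDemo
{
    public static int[] Run()
    {
        int[] backing = { 1, 2, 3 };
        var view = new ReadOnlyCollection<int>(backing); // wraps, does not copy

        int before = view[0];
        backing[0] = 42;     // mutate through the original reference
        int after = view[0]; // the "read-only" view reports the new value

        return new[] { before, after };
    }
}
```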
The BCL has excellent interfaces for facade immutability, such as IReadOnlyCollection<T>, IReadOnlyList<T> and perhaps even IEnumerable<T>, etc. However, I find observational immutability also useful and as far as I know, there aren't any interfaces in the BCL carrying this meaning (please point them out to me if I'm wrong). It makes sense that these don't exist, because this form of immutability cannot be enforced by an interface declaration, only by implementers (an interface could carry the semantics though, as I'll show below). Aside: I'd love to have this ability in a future C# version!
Example (may be skipped): I frequently have to implement a method that takes as an argument a collection which is used by another thread as well, but the method requires the collection not to be modified during its execution. I therefore declare the parameter to be of type IReadOnlyCollection<T> and give myself a pat on the back, thinking that I've met the requirements. Wrong... To a caller, that signature looks as if the method promises not to change the collection, nothing else, and if the caller takes the second interpretation of the documentation (facade), he might just think mutation is allowed and that the method in question is resistant to it. Although there are other, more conventional solutions for this example, I hope you see that this can be a practical problem, in particular when others are using your code (or future-you, for that matter).
So now to my actual problem (which triggered doubting the existing interfaces' semantics):
I would like to use observational immutability and facade immutability and distinguish between them. Two options I thought of are:
Use the BCL interfaces and document each time whether it is observational or just facade immutability. Disadvantage: users of such code will only consult the documentation when it's already too late, namely when a bug has been found. (I want to lead them into the pit of success; documentation cannot do that.) Also, I find this kind of semantics important enough to be visible in the type system rather than solely in documentation.
Define interfaces that carry the observational immutability semantics explicitly, like IImmutableCollection<T> : IReadOnlyCollection<T> { } and IImmutableList<T> : IReadOnlyList<T> { }. Note that the interfaces don't have any members except for the inherited ones. The purpose of these interfaces would be to solely say "Even the declaring type won't change me!"‡ I specifically said "won't" here as opposed to "can't". Herein lies a disadvantage: an evil (or erroneous, to stay polite) implementer isn't prevented from breaking this contract by the compiler or anything really. The advantage however is that a programmer who chose to implement this interface rather than the one it directly inherits from, is most likely aware of the extra message sent by this interface, because the programmer is aware of the existence of this interface, and is thereby likely to implement it accordingly.
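A rough sketch of what the second option could look like. IImmutableCollection<T> is the marker interface named above; Snapshot<T> and Consumer are hypothetical names, and copying in the constructor is only one possible way for an implementer to honour the promise.

```csharp
using System.Collections;
using System.Collections.Generic;

// Marker interface: no members of its own, only the promise that the
// contents never change after construction.
public interface IImmutableCollection<T> : IReadOnlyCollection<T> { }

// Hypothetical implementation that keeps the promise by copying its input.
public sealed class Snapshot<T> : IImmutableCollection<T>
{
    private readonly T[] _items;

    public Snapshot(IEnumerable<T> source)
    {
        _items = new List<T>(source).ToArray(); // private copy, never exposed
    }

    public int Count => _items.Length;
    public IEnumerator<T> GetEnumerator() => ((IEnumerable<T>)_items).GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class Consumer
{
    // The parameter type states the requirement: the contents must not change
    // while this runs. Only convention enforces it, not the compiler.
    public static int Sum(IImmutableCollection<int> values)
    {
        int total = 0;
        foreach (int v in values) total += v;
        return total;
    }
}
```

A caller holding only a List<int> now has to take a snapshot explicitly, which makes the requirement visible at the call site.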
I'm thinking of going with the second option but am afraid it has design issues comparable to those of delegate types (which were invented to carry semantic information over their semanticless counterparts Func and Action) and somehow that failed, see e.g. here.
I would like to know if you've encountered/discussed this problem as well, or whether I'm just quibbling about semantics too much and should just accept the existing interfaces and whether I'm just unaware of existing solutions in the BCL. Any design issues like those mentioned above would be helpful. But I am particularly interested in other solutions you might (have) come up with to my problem (which is in a nutshell distinguishing observational and facade immutability in both declaration and usage).
Thank you in advance.
† I'm ignoring mutations of the fields etc on the elements of the collection.
‡ This is valid for the example I gave earlier, but the statement is actually broader. For instance, any declaring method won't change it, or a parameter of such a type conveys that the method expects the collection not to change during its execution (which is different from saying that the method cannot change the collection, which is the only statement one can make with existing interfaces), and probably many others.
An interface cannot ensure immutability. A word in the name won't prevent mutability; that's just another hint, like documentation.
If you want an immutable object, require a concrete type that is immutable. In C#, immutability is implementation-dependent and not visible in an interface.
As Chris said, you can find existing implementations of immutable collections.
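As one example of requiring a concrete immutable type, here is a sketch that assumes the System.Collections.Immutable library; the answer above doesn't say which implementation was meant, so treat that choice as an assumption.

```csharp
using System.Collections.Immutable;

public static class ImmutableDemo
{
    public static int[] Run()
    {
        ImmutableList<int> original = ImmutableList.Create(1, 2, 3);

        // Add does not modify the receiver; it returns a new list.
        ImmutableList<int> extended = original.Add(4);

        return new[] { original.Count, extended.Count };
    }
}
```

A method that takes ImmutableList<T> as its parameter type gets the "nobody will change this" guarantee from the type itself rather than from documentation.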
Background:
I have 2 instances of an object of the same type. One object is populated with the configuration of a device I'm connected to, the other object is populated with a version of the configuration that I've stored on my hard drive.
The user can alter either, so I'd like to compare them and present the differences to the user.
Each object contains a number of ViewModel properties, all of which extend ViewModelBase, which are the ones I want to compare.
Question:
Is there a better way to do this than what I'm about to propose?
I'm thinking of using Reflection to inspect each property in my objects, and for each one that extends ViewModelBase, I'll loop through each of its properties. For any that are different, I'll put the name and value into a list and then present that to the user.
Rather than reinventing this wheel, I'm wondering if this is a problem that's been solved before. Is there a better way for it to be done?
Depending on the number of properties to be compared, manual checking would be the more efficient option. However, if you have lots of properties or want the check to be dynamic (i.e. you just add new properties and it automagically works), then I think Reflection is the way to go here.
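A minimal sketch of the Reflection approach, assuming a flat comparison of public instance properties. PropertyDiff and DeviceConfig are hypothetical names, and recursing into nested ViewModelBase properties as described in the question is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class PropertyDiff
{
    // Returns "Name: left -> right" for each public instance property that differs.
    public static List<string> Compare<T>(T left, T right)
    {
        var differences = new List<string>();
        foreach (PropertyInfo p in typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            // Skip write-only properties and indexers.
            if (!p.CanRead || p.GetIndexParameters().Length > 0) continue;

            object a = p.GetValue(left);
            object b = p.GetValue(right);
            if (!object.Equals(a, b))
            {
                differences.Add($"{p.Name}: {a} -> {b}");
            }
        }
        return differences;
    }
}

// Hypothetical configuration type used only for the example.
public sealed class DeviceConfig
{
    public string Name { get; set; }
    public int Port { get; set; }
}
```

New properties are picked up automatically, at the cost of a reflection call per property on every comparison.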
Why not just implement the equals operator for your type?
http://msdn.microsoft.com/en-us/library/ms173147(v=vs.80).aspx
Edit: Having read more carefully I see what you're actually asking is what the most efficient way of doing the actual comparison is.
Doing it via reflection saves on code but is slower. Doing it with lots of manual comparisons is fairly quick but takes more code.
If you are fairly determined and lazy in the good way, you can combine the benefits of both solutions. With the help of a tool like CCI you can emit a method that compares the properties. The beauty of this is that your reflection code is executed at compile time, leaving you with a straightforward method to execute at runtime. This allows you to change the models as you see fit without worrying about the comparison code. The downside is that you have to learn CCI, which is quite challenging.
I want to understand all the advantages of singly rooted class (object) hierarchy in languages like .NET, Java.
I can think of one advantage. Let's say I have a function which I want to accept all data types (or references thereof). Then in that case instead of writing a function for each data type, I can write a single function:
public void MyFun(object obj)
{
    // Some code
}
What other advantages do we get from this type of hierarchy?
I'll quote some lines from a nice book - Thinking in Java by Bruce Eckel:
All objects in a singly rooted hierarchy have an interface in common,
so they are all ultimately the same type. The alternative (provided by
C++) is that you don’t know that everything is the same fundamental
type. From a backward-compatibility standpoint this fits the model of
C better and can be thought of as less restrictive, but when you want
to do full-on object-oriented programming you must then build your own
hierarchy to provide the same convenience that’s built into other OOP
languages. And in any new class library you acquire, some other
incompatible interface will be used. It requires effort (and possibly
multiple inheritance) to work the new interface into your design. Is
the extra “flexibility” of C++ worth it? If you need it—if you have a
large investment in C—it’s quite valuable. If you’re starting from
scratch, other alternatives such as Java can often be more productive.
All objects in a singly rooted hierarchy (such as Java provides) can
be guaranteed to have certain functionality. You know you can perform
certain basic operations on every object in your system. A singly
rooted hierarchy, along with creating all objects on the heap, greatly
simplifies argument passing.
A singly rooted hierarchy makes it much easier to implement a garbage
collector (which is conveniently built into Java). The necessary
support can be installed in the base class, and the garbage collector
can thus send the appropriate messages to every object in the system.
Without a singly rooted hierarchy and a system to manipulate an object
via a reference, it is difficult to implement a garbage collector.
Since run-time type information is guaranteed to be in all objects,
you’ll never end up with an object whose type you cannot determine.
This is especially important with system level operations, such as
exception handling, and to allow greater flexibility in programming.
A single-rooted hierarchy is not about passing your objects to methods but rather about a common interface all your objects implement.
For example, in C# System.Object implements a few members which are inherited down the hierarchy.
This includes ToString(), which is used to get a string representation of your object. You are guaranteed that ToString() can be called on every object. At the language level you can use this feature to get strings from expressions like (4-11).ToString().
Another example is GetType(), which returns the System.Type object representing the type of the object the method is invoked on. Because this member is defined at the top of the hierarchy, reflection is easier and more uniform than in, for example, C++.
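A short sketch of both members in use. Describe is a hypothetical helper; it works for any argument because ToString() and GetType() are declared on System.Object.

```csharp
using System;

public static class RootMembersDemo
{
    // Works for any value: both members come from System.Object.
    public static string Describe(object value)
    {
        return value.GetType().Name + ": " + value.ToString();
    }
}
```

For instance, RootMembersDemo.Describe("hi") yields "String: hi", and passing 4 - 11 reports the type name Int32 followed by the number's string form.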
It provides a base for everything. For example in C# the Object class is the root which has methods such as ToString() and GetType() which are very useful, if you're not sure what specific objects you will be dealing with.
Also - not sure if it would be a good idea, but you could create Extension Methods on the Object class and then every instance of every class would be able to use the method.
For example, you could create an Extension Method called WriteToLogFile(this Object o) and then have it use reflection on the object to write details of its instance members to your log. There are of course better ways to log things, but it is just an example.
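Roughly what that could look like. This sketch writes to a TextWriter instead of a file so it stays self-contained; the method name WriteToLog and its parameter are assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class ObjectLoggingExtensions
{
    // Hypothetical helper: writes the public instance properties of any object.
    public static void WriteToLog(this object o, TextWriter log)
    {
        Type type = o.GetType();
        log.WriteLine(type.Name);
        foreach (PropertyInfo p in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (p.GetIndexParameters().Length > 0) continue; // skip indexers
            log.WriteLine("  " + p.Name + " = " + p.GetValue(o));
        }
    }
}
```

Because the extension targets object, a call such as new Version(1, 2).WriteToLog(Console.Out) compiles for any type. It also shows up in IntelliSense on every object, which is one reason to be careful with this pattern.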
A singly rooted hierarchy lets platform developers assume some minimum knowledge about all objects, which simplifies the development of libraries that can work with any object.
Think about Collections without GetHashCode(), Reflection without GetType() etc.
I still use Wintellect's PowerCollections library, even though it is aging and not maintained because it did a good job covering holes left in the standard MS Collections libraries. But LINQ and C# 4.0 are poised to replace PowerCollections...
I was very happy to discover System.Linq.Lookup because it should replace Wintellect.PowerCollections.MultiDictionary in my toolkit. But Lookup seems to be immutable! Is that true? Can you only create a populated Lookup by calling ToLookup?
Yes, you can only create a Lookup by calling ToLookup. The immutable nature of it means that it's easy to share across threads etc, of course.
If you want a mutable version, you could always use the Edulinq implementation as a starting point. It's internally mutable, but externally immutable - and I wouldn't be surprised if the Microsoft implementation worked in a similar way.
Personally I'm rarely in a situation where I want to mutate the lookup - I would prefer to perform appropriate transformations on the input first. I would encourage you to think in this way too - I find myself wishing for better immutability support from other collections (e.g. Dictionary) more often than I wish that Lookup were mutable :)
That is correct. Lookup is immutable; you can create an instance by using the LINQ ToLookup() extension method. Technically even that fact is an implementation detail, since the method returns an ILookup interface which in the future might be implemented by some other concrete class.
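For reference, a small sketch of building and querying a lookup. LookupDemo and ByLength are hypothetical names; the result is typed as ILookup since that is what ToLookup returns.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LookupDemo
{
    // Groups words by length. ToLookup builds the whole lookup up front;
    // ILookup exposes no Add or Remove members.
    public static ILookup<int, string> ByLength(IEnumerable<string> words)
    {
        return words.ToLookup(w => w.Length);
    }
}
```

Indexing with a key that isn't present returns an empty sequence rather than throwing, which is one practical difference from a Dictionary<TKey, List<TValue>>.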
I was wondering why, on some occasions, I see a class representing a collection of one specific type.
For example:
In Microsoft XNA Framework: TextureCollection, TouchCollection, etc.
Also other classes in the .NET framework itself, ending with Collection.
Why is it designed this way? What are the benefits of doing it this way rather than using a generic collection type, as introduced in C# 2.0?
Thanks
The examples you gave are good ones. TextureCollection is sealed and has no public constructor, only an internal one. TouchCollection implements IList<TouchLocation>, similar to the way List<T> implements IList<T>. Generics at work here btw, the upvoted answer isn't correct.
TextureCollection is intentionally crippled, it makes sure that you can never create an instance of it. Only secret knowledge about textures can fill this collection, a List<> wouldn't suffice since it cannot be initialized with that secret knowledge that makes the indexer work. Nor does the class need to be generic, it only knows about Texture class instances.
The TouchCollection is similarly specialized. The Add() method throws a NotSupportedException. This cannot be done with a regular List<> class, its Add() method isn't virtual so cannot be overridden to throw the exception.
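A sketch of the same idea outside XNA. TextureNameCollection is a hypothetical stand-in, not how XNA implements its collections; it derives from ReadOnlyCollection<T>, whose ICollection<T>.Add implementation throws NotSupportedException.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical stand-in for a framework-owned collection.
public sealed class TextureNameCollection : ReadOnlyCollection<string>
{
    // Internal constructor: only the owning assembly can create and fill it.
    internal TextureNameCollection(IList<string> names) : base(names) { }
}

public static class SpecializedCollectionDemo
{
    public static bool AddIsRejected()
    {
        ICollection<string> names = new TextureNameCollection(new List<string> { "grass" });
        try
        {
            names.Add("stone"); // explicit interface member on ReadOnlyCollection<T>
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```

The dedicated class controls both who may construct the collection and which operations are allowed, neither of which a plain List<T> can restrict.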
This is not unusual.
In the .NET framework itself, many type-safe collections predate 2.0 Generics, and are kept for compatibility.
For several XAML-related contexts, there's either no syntax to specify a generic class, or the syntax is cumbersome. Therefore, where List<T> would be used, a specific TList is written for each need.
It allows you to define your own semantics on the collection (you may not want to have an Add or AddRange method etc...).
Additionally, readability is increased by not having your code littered with List<Touch> and List<Texture> everywhere.
There is also quite a lot of .NET 1.0/1.1 code that still needs to work, so the older collections that predate generics still need to exist.
It's not that easy to use generic classes in XAML for example.
Following on from Oded's answer, your own class type allows for much easier change down the track when you decide you want a stack / queue etc instead of that List. There can be lots of reasons for this, including performance, memory use etc.
In fact, it's usually a good idea to hide that type of implementation detail - users of your class just want to know that it stores Textures, not how.