Advantage of singly rooted class hierarchy - c#

I want to understand all the advantages of a singly rooted class (object) hierarchy on platforms like .NET and Java.
I can think of one advantage. Let's say I have a function which I want to accept all data types (or references thereof). Then in that case instead of writing a function for each data type, I can write a single function:
public void MyFun(object obj)
{
// Some code
}
What other advantages do we get from this type of hierarchy?

I'll quote some lines from a nice book - Thinking in Java by Bruce Eckel:
All objects in a singly rooted hierarchy have an interface in common,
so they are all ultimately the same type. The alternative (provided by
C++) is that you don’t know that everything is the same fundamental
type. From a backward-compatibility standpoint this fits the model of
C better and can be thought of as less restrictive, but when you want
to do full-on object-oriented programming you must then build your own
hierarchy to provide the same convenience that’s built into other OOP
languages. And in any new class library you acquire, some other
incompatible interface will be used. It requires effort (and possibly
multiple inheritance) to work the new interface into your design. Is
the extra “flexibility” of C++ worth it? If you need it—if you have a
large investment in C—it’s quite valuable. If you’re starting from
scratch, other alternatives such as Java can often be more productive.
All objects in a singly rooted hierarchy (such as Java provides) can
be guaranteed to have certain functionality. You know you can perform
certain basic operations on every object in your system. A singly
rooted hierarchy, along with creating all objects on the heap, greatly
simplifies argument passing.
A singly rooted hierarchy makes it much easier to implement a garbage
collector (which is conveniently built into Java). The necessary
support can be installed in the base class, and the garbage collector
can thus send the appropriate messages to every object in the system.
Without a singly rooted hierarchy and a system to manipulate an object
via a reference, it is difficult to implement a garbage collector.
Since run-time type information is guaranteed to be in all objects,
you’ll never end up with an object whose type you cannot determine.
This is especially important with system level operations, such as
exception handling, and to allow greater flexibility in programming.

A single-rooted hierarchy is not about passing your objects to methods but rather about a common interface all your objects implement.
For example, in C# System.Object implements a few members which are inherited down the hierarchy.
This includes ToString(), which is used to get a string representation of your object. You are guaranteed that ToString() is available on every object. At the language level you can use this feature to get strings from expressions like (4-11).ToString().
Another example is GetType(), which returns an object of type System.Type representing the type of the object the method is invoked on. Because this member is defined at the top of the hierarchy, reflection is easier and more uniform than, for example, in C++.
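To make that concrete, here is a minimal sketch of a helper that relies only on members inherited from System.Object. The names RootMembersDemo and Describe are made up for illustration; only GetType() and ToString() come from the framework.

```csharp
using System;

public static class RootMembersDemo
{
    // Hypothetical helper: works for any non-null value because GetType()
    // and ToString() are declared on System.Object, the root of the hierarchy.
    public static string Describe(object value)
    {
        return value.GetType().FullName + ": " + value.ToString();
    }
}
```

Calling RootMembersDemo.Describe(42) boxes the int and still finds both members, which is the uniformity the answer is describing. Passing null would throw, since there is no object to ask.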

It provides a base for everything. For example in C# the Object class is the root which has methods such as ToString() and GetType() which are very useful, if you're not sure what specific objects you will be dealing with.
Also - not sure if it would be a good idea, but you could create Extension Methods on the Object class and then every instance of every class would be able to use the method.
For example, you could create an Extension Method called WriteToLogFile(this Object o) and then have it use reflection on the object to write details of its instance members to your log. There are of course better ways to log things, but it is just an example.
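A rough sketch of that idea follows. It is an assumption-laden variant rather than a recommended logger: it writes to any TextWriter instead of a file, it only looks at public instance properties, and the names WriteToLog and Customer are invented for the example.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class ObjectLoggingExtensions
{
    // Hypothetical extension method: available on every instance because
    // the 'this' parameter is typed as System.Object.
    public static void WriteToLog(this object o, TextWriter log)
    {
        Type type = o.GetType();
        log.WriteLine(type.Name);

        foreach (PropertyInfo p in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            // Skip write-only properties and indexers, which cannot be read without arguments.
            if (!p.CanRead || p.GetIndexParameters().Length > 0) continue;
            log.WriteLine("  " + p.Name + " = " + p.GetValue(o, null));
        }
    }
}

// Sample type used only to exercise the extension method.
public class Customer
{
    public string Name { get; set; }
    public int Age { get; set; }
}
```

With this in scope, new Customer { Name = "Ada", Age = 36 }.WriteToLog(Console.Out) prints the type name followed by one line per property. The order of the properties is not guaranteed by reflection.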

A singly rooted hierarchy gives platform developers some minimum knowledge about all objects, which simplifies the development of libraries that can work with any object.
Think about Collections without GetHashCode(), Reflection without GetType() etc.

Related

Can I define member function inside namespace in c# or outside a class? [duplicate]

C# does not allow non-member functions; every method must be part of a class. I thought this was a restriction in all CLI languages, but I was wrong: I found that C++/CLI supports non-member functions. When the code is compiled, the compiler makes the function a member of some unnamed class.
Here is what C++/CLI standard says,
[Note: Non-member functions are treated by the CLI as members of some unnamed class; however, in C++/CLI source code, such functions cannot be qualified explicitly with that class name. end note]
The encoding of non-member functions in metadata is unspecified. [Note: This does not cause interop problems because such functions cannot have public visibility. end note]
So my question is: why doesn't C# implement something like this? Or do you think there should not be non-member functions and every method should belong to some class?
My opinion is that non-member functions should be supported, since they help to avoid polluting a class's interface.
Any thoughts..?
See this blog posting:
http://blogs.msdn.com/ericlippert/archive/2009/06/22/why-doesn-t-c-implement-top-level-methods.aspx
(...)
I am asked "why doesn't C# implement feature X?" all the time. The answer is always the same: because no one ever designed, specified, implemented, tested, documented and shipped that feature. All six of those things are necessary to make a feature happen. All of them cost huge amounts of time, effort and money. Features are not cheap, and we try very hard to make sure that we are only shipping those features which give the best possible benefits to our users given our constrained time, effort and money budgets.
I understand that such a general answer probably does not address the specific question.
In this particular case, the clear user benefit was in the past not large enough to justify the complications to the language which would ensue. By restricting how different language entities nest inside each other we (1) restrict legal programs to be in a common, easily understood style, and (2) make it possible to define "identifier lookup" rules which are comprehensible, specifiable, implementable, testable and documentable.
By restricting method bodies to always be inside a struct or class, we make it easier to reason about the meaning of an unqualified identifier used in an invocation context; such a thing is always an invocable member of the current type (or a base type).
(...)
and this follow-up posting:
http://blogs.msdn.com/ericlippert/archive/2009/06/24/it-already-is-a-scripting-language.aspx
(...)
Like all design decisions, when we're faced with a number of competing, compelling, valuable and noncompossible ideas, we've got to find a workable compromise. We don't do that except by considering all the possibilities, which is what we're doing in this case.
(emphasis from original text)
C# doesn't allow it because Java didn't allow it.
I can think of several reasons why the designers of Java probably didn't allow it:
Java was designed to be simple. They attempted to make a language without random shortcuts, so that you generally have just one simple way to do everything, even if other approaches would have been cleaner or more concise. They wanted to minimize the learning curve, and learning "a class may contain methods" is simpler than "a class may contain methods, and functions may exist outside classes".
Superficially, it looks less object-oriented. (Anything that isn't part of an object obviously can't be object-oriented? Can it? of course, C++ says yes, but C++ wasn't involved in this decision)
As I already said in comments, I think this is a good question, and there are plenty of cases where non-member functions would've been preferable. (this part is mostly a response to all the other answers saying "you don't need it")
In C++, where non-member functions are allowed, they are often preferred, for several reasons:
It aids encapsulation. The fewer methods have access to the private members of a class, the easier that class will be to refactor or maintain. Encapsulation is an important part of OOP.
Code can be reused much more easily when it is not part of a class. For example, the C++ standard library defines std::find and std::sort as non-member functions, so that they can be reused on any type of sequence, whether it is an array, a set, a linked list or (for std::find, at least) a stream. Code reuse is also an important part of OOP.
It gives us better decoupling. The find function doesn't need to know about the LinkedList class in order to be able to work on it. If it had been defined as a member function, it would be a member of the LinkedList class, basically merging the two concepts into one big blob.
Extensibility. If you accept that the interface of a class is not just "all its public members", but also "all non-member functions that operate on the class", then it becomes possible to extend the interface of a class without having to edit or even recompile the class itself.
The ability to have non-member functions may have originated with C (where you had no other choice), but in modern C++, it is a vital feature in its own right, not just for backward-compatibility purposes, but because of the simpler, cleaner and more reusable code it allows.
In fact, C# seems to have realized much the same things, much later. Why do you think extension methods were added? They are an attempt at achieving the above, while preserving the simple Java-like syntax.
Lambdas are also interesting examples, as they too are essentially small functions defined freely, not as members of any particular class. So yes, the concept of non-member functions is useful, and C#'s designers have realized the same thing. They've just tried to sneak the concept in through the back door.
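As a small illustration of the decoupling and extensibility points, here is a sketch of an extension method that behaves much like a non-member algorithm. IndexOfFirst and SequenceExtensions are made-up names; the method knows nothing about any particular container, only about IEnumerable<T>.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceExtensions
{
    // Hypothetical algorithm: usable on arrays, List<T>, LinkedList<T> and
    // any other IEnumerable<T> without editing or recompiling those types.
    public static int IndexOfFirst<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        int index = 0;
        foreach (T item in source)
        {
            if (predicate(item)) return index;
            index++;
        }
        return -1; // no element matched
    }
}
```

The lambda passed as the predicate is itself a small free-standing function, so a call such as numbers.IndexOfFirst(x => x % 2 == 0) uses both of the features mentioned above.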
http://www.ddj.com/cpp/184401197 and http://www.gotw.ca/publications/mill02.htm are two articles written by C++ experts on the subject.
Non member functions are a good thing because they improve encapsulation and reduce coupling between types. Most modern programming languages such as Haskell and F# support free functions.
What's the benefit of not putting each method in a named class? Why would a non-member function "pollute" the class's interface? If you don't want it as part of the public API of a class, either don't make it public or don't put it in that class. You can always create a different class.
I can't remember ever wanting to write a method floating around with no appropriate scope - other than anonymous functions, of course (which aren't really the same).
In short, I can't see any benefit in non-member functions, but I can see benefits in terms of consistency, naming and documentation in putting all methods in an appropriately named class.
The CLS (Common Language Specification) says that you shouldn't have non-member functions in a library that conforms to the CLS. It's an extra set of restrictions on top of the basic restrictions of the CLI (Common Language Infrastructure).
It is possible that a future version of C# will add the ability to write a using directive that allows the static members of a class to be accessed without the class name qualification:
using System.Linq.Enumerable; // Enumerable is a static class
...
IEnumerable<int> range = Range(1, 10); // finds Enumerable.Range
Then there will be no need to change the CLS and existing libraries.
These blog posts demonstrate a library for functional programming in C#, and they use a class name that is just one letter long, to try and cut down the noise caused by the requirement to qualify static method calls. Examples like that would be made a little nicer if using directives could target classes.
Since Java, most programmers have readily accepted that every method is a member of a class. It doesn't create any considerable obstacles, and it narrows the concept of a method, which makes the language simpler.
However, a class does imply an object, and an object implies state, so the concept of a class containing only static methods looks a little absurd.
Having all code lie within classes allows for a more powerful set of reflection capabilities.
It allows the use of static initializers, which can initialize the data needed by static methods within a class.
It avoids name clashes between methods by explicitly enclosing them within a unit that cannot be added to by another compilation unit.
I think you really need to clarify what you would want to create non-member static methods to achieve.
For instance, some of the things you might want them for could be handled with Extension Methods.
Another typical use (of a class which only contains static methods) is in a library. In this case, there is little harm in creating a class in an assembly which is entirely composed of static methods. It keeps them together, avoids naming collisions. After all, there are static methods in Math which serve the same purpose.
Also, you should not necessarily compare C++'s object model with C#. C++ is largely (but not perfectly) compatible with C, which didn't have a class system at all - so C++ had to support this programming idiom out of the C legacy, not for any particular design imperative.
C# does not have non-member functions because it copied, or was inspired by, Java's philosophy that OOP is the solution to every problem, so it only allows problems to be solved the OO way.
Non-member functions are a very important feature if we really want to do generic programming. They are more reusable than functions placed in a class.
C# had to come up with extension methods because of the absence of non-member functions.
Programming languages are now moving towards the functional paradigm, which seems to be the better way to approach and solve problems and looks like the future. C# should rethink this.
Bear something in mind: C++ is a much more complicated language than C#. And although they may be similar syntactically, they are very different beasts semantically. You wouldn't think it would be terribly difficult to make a change like this, but I could see how it could be. ANTLR has a good wiki page called What makes a language problem hard? that's good to consult for questions like this. In this case:
Context sensitive lexer? You can't decide what vocabulary symbol to match unless you know what kind of sentence you are parsing.
Now instead of just worrying about functions defined in classes, we have to worry about functions defined outside classes. Conceptually, there isn't much difference. But in terms of lexing and parsing the code, now you have the added problem of having to say "if a function is outside a class, it belongs to this unnamed class. However, if it is inside the class, then it belongs to that class."
Also, if the compiler comes across a method like this:
public void Foo()
{
Bar();
}
...it now has to answer the question "is Bar located within this class or is it a global function?"
Forward or external references? I.e., multiple passes needed? Pascal has a "forward" reference to handle intra-file procedure references, but references to procedures in other files via the USES clauses etc... require special handling.
This is another thing that causes problems. Remember that C# doesn't require forward declarations. The compiler will make one pass just to determine what classes are named and what functions those classes contain. Now you have to worry about finding classes and functions where functions can be either inside or outside of a class. This is something a C++ parser doesn't have to worry about as it parses everything in order.
Now don't get me wrong, it could probably be done in C#, and I would probably use such a feature. But is it really worth all the trouble of overcoming these obstacles when you could just type a class name in front of a static method?
Free functions are very useful if you combine them with duck typing. The whole C++ STL is based on it. Hence I am sure that C# will introduce free functions when they manage to add true generics.
Like economics, language design is also about psychology. If you create appetite for true generics via free functions in C# and not deliver, then you would kill C#. Then all C# developers would move to C++ and nobody wants that to happen, not the C# community and most certainly not those invested in C++.
While it's true you need a class (e.g. a static class called FreeFunctions) to hold such functions, you're free to place using static FreeFunctions; at the top of any file that needs the functions from it, without having to litter your code with FreeFunctions. qualifiers.
I'm not sure if there's actually a case where this is demonstrably inferior to not requiring the function definitions to be contained in a class.
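For reference, a minimal sketch of that approach, assuming C# 6 or later (where the using static directive is available). FreeFunctions is the class name used above; Square and Caller are invented for the example.

```csharp
using System;
using static FreeFunctions; // imports the static members of the class below

public static class FreeFunctions
{
    // Hypothetical "free" function; it still lives in a class for the CLR's sake.
    public static int Square(int x)
    {
        return x * x;
    }
}

public static class Caller
{
    public static int SumOfSquares(int a, int b)
    {
        // No "FreeFunctions." qualifier is needed thanks to the using static directive.
        return Square(a) + Square(b);
    }
}
```

At the call site this reads like a non-member function, while the metadata and reflection story stays the same as for any other static method.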
Look, other programming languages have a hard time defining the internal nature of a function instance from the compiler's point of view. In Pascal and C, such instances are basically defined as something that can only be handled through a pointer, especially since reading from or writing to executable code locations is what 7 out of 9 computer science professors are dead set against. For a member of a class, no one needs to care how to treat its manifestation, because the manifestation's type is derived from a class property. It is still possible to create something that is processed exactly like a global function: an anonymous function assigned to a variable:
Func<int,int> myFunc = delegate(int var1)
{
Console.WriteLine("{0}",var1*2);
return var1*3;
};
It can then simply be called like a global function by its variable name.
If so, the difference would be implementing a new object type on the lowest level with same behavior as another one. That is considered bad practice by experienced programmers, and was perhaps scrapped because of this.

Immutability/Read-only semantics (in particular C# IReadOnlyCollection<T>)

I am doubting my understanding of the System.Collections.Generic.IReadOnlyCollection<T> semantics and doubting how to design using concepts like read-only and immutable. Let me describe the two interpretations I'm torn between, using the documentation, which states:
Represents a strongly-typed, read-only collection of elements.
Depending on whether I stress the words 'Represents' or 'read-only' (when pronouncing in my head, or out loud if that's your style) I feel like the sentence changes meaning:
When I stress 'read-only', the documentation defines in my opinion observational immutability (a term used in Eric Lippert's article), meaning that an implementation of the interface may do as it pleases as long as no mutations are visible publicly†.
When I stress 'Represents', the documentation defines (in my opinion, again) an immutable facade (again described in Eric Lippert's article), which is a weaker form, where mutations may be possible, but just cannot be made by the user. For example, a property of type IReadOnlyCollection<T> makes clear to the user (i.e. someone that codes against the declaring type) that he may not modify this collection. However, it is ambiguous in whether the declaring type itself may modify the collection.
For the sake of completeness: The interface doesn't carry any semantics other than those carried by the signatures of its members. In this case the observational or facade immutability is implementation-dependent (not just dependent on the implementation of the interface, but on the instance).
The first option is actually my preferred interpretation, although this contract can easily be broken, e.g. by constructing a ReadOnlyCollection<T> from an array of T's and then setting a value in the wrapped array.
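The way that contract breaks can be shown in a few lines. This is only a sketch; FacadeDemo and Run are made-up names, while ReadOnlyCollection<T> and IReadOnlyCollection<T> are the BCL types discussed above.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

public static class FacadeDemo
{
    // Returns the first element seen through the read-only view,
    // before and after the wrapped array is modified.
    public static int[] Run()
    {
        int[] backing = { 1, 2, 3 };
        IReadOnlyCollection<int> view = new ReadOnlyCollection<int>(backing);

        int before = view.First();
        backing[0] = 99;            // mutation through the wrapped array
        int after = view.First();   // the "read-only" view observes the change

        return new[] { before, after };
    }
}
```

The view cannot be used to change the collection, yet its contents still change underneath the reader, which is facade immutability rather than observational immutability.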
The BCL has excellent interfaces for facade immutability, such as IReadOnlyCollection<T>, IReadOnlyList<T> and perhaps even IEnumerable<T>, etc. However, I find observational immutability also useful and as far as I know, there aren't any interfaces in the BCL carrying this meaning (please point them out to me if I'm wrong). It makes sense that these don't exist, because this form of immutability cannot be enforced by an interface declaration, only by implementers (an interface could carry the semantics though, as I'll show below). Aside: I'd love to have this ability in a future C# version!
Example: (may be skipped) I frequently have to implement a method that takes as an argument a collection which is used by another thread as well, but the method requires the collection not to be modified during its execution. I therefore declare the parameter to be of type IReadOnlyCollection<T> and give myself a pat on the back, thinking that I've met the requirements. Wrong... To a caller that signature looks as if the method promises not to change the collection, nothing else, and if the caller takes the second interpretation of the documentation (facade) he might just think mutation is allowed and that the method in question is resistant to it. Although there are other, more conventional solutions for this example, I hope you see that this can be a practical problem, in particular when others are using your code (or future-you for that matter).
So now to my actual problem (which triggered doubting the existing interfaces semantics):
I would like to use observational immutability and facade immutability and distinguish between them. Two options I thought of are:
Use the BCL interfaces and document each time whether it is observational or just facade immutability. Disadvantage: Users of such code will only consult the documentation when it's already too late, namely when a bug has been found. (I want to lead them into the pit of success; documentation cannot do that.) Also, I find this kind of semantics important enough to be visible in the type system rather than solely in documentation.
Define interfaces that carry the observational immutability semantics explicitly, like IImmutableCollection<T> : IReadOnlyCollection<T> { } and IImmutableList<T> : IReadOnlyList<T> { }. Note that the interfaces don't have any members except for the inherited ones. The purpose of these interfaces would be to solely say "Even the declaring type won't change me!"‡ I specifically said "won't" here as opposed to "can't". Herein lies a disadvantage: an evil (or erroneous, to stay polite) implementer isn't prevented from breaking this contract by the compiler or anything really. The advantage however is that a programmer who chose to implement this interface rather than the one it directly inherits from, is most likely aware of the extra message sent by this interface, because the programmer is aware of the existence of this interface, and is thereby likely to implement it accordingly.
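A sketch of what the second option could look like in practice. The marker interface is the one proposed above; FrozenCollection<T> and Consumer are hypothetical names, and the defensive copy is one possible way for an implementer to honour the contract, not something the compiler checks.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Marker interface: no new members, only the promise "nobody will change me".
public interface IImmutableCollection<T> : IReadOnlyCollection<T> { }

// Hypothetical implementer that keeps the promise by copying its input once.
public sealed class FrozenCollection<T> : IImmutableCollection<T>
{
    private readonly T[] items;

    public FrozenCollection(IEnumerable<T> source)
    {
        items = new List<T>(source).ToArray(); // defensive copy, never exposed
    }

    public int Count { get { return items.Length; } }

    public IEnumerator<T> GetEnumerator()
    {
        return ((IEnumerable<T>)items).GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class Consumer
{
    // The parameter type states the requirement: the collection must not
    // change while this method runs.
    public static int Sum(IImmutableCollection<int> values)
    {
        int total = 0;
        foreach (int v in values) total += v;
        return total;
    }
}
```

A caller holding only a List<int> cannot pass it to Consumer.Sum directly and has to wrap it first, which is exactly the nudge the type-system approach is meant to give.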
I'm thinking of going with the second option but am afraid it has design issues comparable to those of delegate types (which were invented to carry semantic information over their semanticless counterparts Func and Action) and somehow that failed, see e.g. here.
I would like to know if you've encountered/discussed this problem as well, or whether I'm just quibbling about semantics too much and should just accept the existing interfaces and whether I'm just unaware of existing solutions in the BCL. Any design issues like those mentioned above would be helpful. But I am particularly interested in other solutions you might (have) come up with to my problem (which is in a nutshell distinguishing observational and facade immutability in both declaration and usage).
Thank you in advance.
† I'm ignoring mutations of the fields etc on the elements of the collection.
‡ This is valid for the example I gave earlier, but the statement is actually broader. For instance any declaring method won't change it, or a parameter of such a type conveys that the method can expect the collection not to change during its execution (which is different from saying that the method cannot change the collection, which is the only statement one can make with existing interfaces), and probably many others.
An interface cannot ensure immutability. A word in the name won't prevent mutability; that's just another hint, like documentation.
If you want an immutable object, require a concrete type that is immutable. In C#, immutability is implementation-dependent and not visible in an interface.
As Chris said, you can find existing implementations of immutable collections.
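One such existing implementation is System.Collections.Immutable (part of the shared framework on .NET Core and later, a NuGet package on .NET Framework). The sketch below assumes that library is referenced; ImmutableDemo and its methods are made-up names.

```csharp
using System;
using System.Collections.Immutable;

public static class ImmutableDemo
{
    // Requiring the concrete ImmutableList<T> type means no caller can hand
    // over a collection that changes while this method runs.
    public static int Total(ImmutableList<int> values)
    {
        int total = 0;
        foreach (int v in values) total += v;
        return total;
    }

    // "Adding" produces a new list; the original instance is left untouched.
    public static int OriginalCountAfterAdd()
    {
        ImmutableList<int> original = ImmutableList.Create(1, 2, 3);
        ImmutableList<int> extended = original.Add(4);
        return original.Count;
    }
}
```

Here the guarantee comes from the concrete type's implementation, which is the point made above: the type name in the signature is what the caller and the callee can both rely on.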

semantically represent generics

I am trying to understand generics in a semantic way. For instance, abstract classes seemed to snap into place for me when I read people refer to them as structures that can set policy. Interfaces snapped when I read people refer to them as collaboration contracts.
What are some good ways to think about generics that might help me to differentiate them from other OO structures and write more intelligent APIs?
Think of generic classes as stencils to make other classes (similarly, generic functions are stencils for making other functions). Type parameters serve as openings in your stencils: by plugging in a concrete type into them, you make the generic class or the generic function into a real class or function. The type parameters "stick through" the designated holes in the stencil, producing a complete definition.
It seems you want to approach your understanding from a top-down perspective. "What is it" in a qualitative sense and then derive the real meaning from there. Isn't it easier to simply learn what these different constructs do rather than trying to come up with labels? i.e. approach it from a bottom-up perspective and infer your own qualitative descriptions from what you've now already understood firsthand.
Abstract classes require you to implement a property or method and can't be instantiated. What distinguishes them from interfaces? An abstract class requires subclasses to choose it as their only base class. Interfaces face no such restriction but require you to define their entire behavior in the implementation, rather than relying on some of the behavior being defined in the base class.
Similarly, generics allow you to introduce types as variables that can be specified by the caller. The utility of this is analogous to method parameters in general, just taken to a higher level. In other words, method parameters allow you to vary the implementation based on some input specified by the caller. Generic parameters allow you to vary the implementation based on some (other) input (i.e. types) specified by the caller.
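A short sketch of "types as parameters supplied by the caller". GenericDemo and Largest are names invented for the example; the same method body serves every T that can compare itself.

```csharp
using System;
using System.Collections.Generic;

public static class GenericDemo
{
    // T is chosen by the caller, just as 'items' is. The constraint says
    // what the implementation needs from whatever type is plugged in.
    public static T Largest<T>(IEnumerable<T> items) where T : IComparable<T>
    {
        bool found = false;
        T best = default(T);

        foreach (T item in items)
        {
            if (!found || item.CompareTo(best) > 0)
            {
                best = item;
                found = true;
            }
        }

        if (!found) throw new ArgumentException("Sequence is empty.", "items");
        return best;
    }
}
```

GenericDemo.Largest(new[] { 3, 9, 4 }) and GenericDemo.Largest(new[] { "pear", "apple" }) use the same code with T bound to int and string respectively, without casts and without boxing the ints.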
Surely it's clear why List<T> is more useful than ArrayList. I'm not really sure why metaphors are really helpful for understanding why.
You could view them as wrappers around object types. You are creating functions that will do something for whatever type of object it is instantiated for, so it's like a template that will perform the same work for multiple types of objects.
Microsoft's introduction to generics might have some good descriptions as well:
http://msdn.microsoft.com/en-us/library/ms379564(v=vs.80).aspx

Why GetHashCode is in Object class?

Why is GetHashCode part of the Object class? Only small part of the objects of the classes are used as keys in hash tables. Wouldn't it be better to have a separate interface which must be implemented when we want objects of a class to serve as keys in a hash table?
There must be a reason that MS team decided to include this method in Object class and thus make it available "everywhere".
It was a design mistake copied from Java, IMO.
In my perfect world:
ToString would be renamed ToDebugString to set expectations appropriately
Equals and GetHashCode would be gone
There would be a ReferenceEqualityComparer implementation of IEqualityComparer<T>: the equals part of this is easy at the moment, but there's no way of getting an "original" hash code if it's overridden
Objects wouldn't have monitors associated with them: Monitor would have a constructor, and Enter/Exit etc would be instance methods.
Equality (and thus hashing) cause problems in inheritance hierarchies in general - so long as you can always specify the kind of comparison you want to use (via IEqualityComparer<T>) and objects can implement IEquatable<T> themselves if they want to, I don't see why it should be on Object. EqualityComparer<T>.Default could use the reference implementation if T didn't implement IEquatable<T> and defer to the objects otherwise. Life would be pleasant.
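The "choose your comparison via IEqualityComparer<T>" idea can be sketched as below. This version assumes RuntimeHelpers.GetHashCode from System.Runtime.CompilerServices, which returns an identity-based hash code even when a type overrides GetHashCode; IdentityComparer<T> and Point are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical comparer: two references are equal only if they are the same object.
public sealed class IdentityComparer<T> : IEqualityComparer<T> where T : class
{
    public bool Equals(T x, T y)
    {
        return ReferenceEquals(x, y);
    }

    public int GetHashCode(T obj)
    {
        // Ignores any GetHashCode override on the object's own type.
        return RuntimeHelpers.GetHashCode(obj);
    }
}

// Sample type with value-based equality, used to show the difference.
public sealed class Point
{
    public readonly int X;
    public readonly int Y;

    public Point(int x, int y) { X = x; Y = y; }

    public override bool Equals(object obj)
    {
        Point other = obj as Point;
        return other != null && other.X == X && other.Y == Y;
    }

    public override int GetHashCode()
    {
        return X * 31 + Y;
    }
}
```

A HashSet<Point> built with the default comparer treats two Point(1, 2) instances as one element, while one built with new IdentityComparer<Point>() keeps both. The kind of equality becomes a property of the collection rather than of the object.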
Ah well. While I'm at it, array covariance was another platform mistake. If you want language mistakes in C#, I can start another minor rant if you like ;) (It's still by far my favourite language, but there are things I wish had been done differently.)
I've blogged about this elsewhere, btw.
Only small part of the objects of the classes are used as keys in hash tables
I would argue that this is not a true statement. Many classes are often used as keys in hash tables - and object references themselves are very often used. Having the default implementation of GetHashCode exist in System.Object means that ANY object can be used as a key, without restrictions.
This seems much nicer than forcing a custom interface on objects, just to be able to hash them. You never know when you may need to use an object as the key in a hashed collection.
This is especially true when using things like HashSet<T> - In this case, often, an object reference is used for tracking and uniqueness, not necessarily as a "key". Had hashing required a custom interface, many classes would become much less useful.
It allows any object to be used as a key by "identity". This is beneficial in some cases, and harmful in none. So, why not?
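A small sketch of keying by identity with a type that overrides nothing. Session and IdentityKeyDemo are invented names; the dictionary relies only on the Equals and GetHashCode implementations inherited from System.Object.

```csharp
using System;
using System.Collections.Generic;

// No Equals/GetHashCode overrides: instances are compared by reference.
public sealed class Session { }

public static class IdentityKeyDemo
{
    public static string Lookup()
    {
        var first = new Session();
        var second = new Session();

        var owners = new Dictionary<Session, string>();
        owners[first] = "alice";
        owners[second] = "bob";

        // Each instance is its own key, even though the two objects
        // have identical (empty) state.
        return owners[first] + "," + owners[second];
    }
}
```

No interface had to be implemented on Session for this to work, which is the convenience being argued for here.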
So anything can be keyed on. (Sorta)
That way Hashtable can take an object rather than something that implements IHashable, for example.
To drive simple equality comparison.
On objects that don't implement it directly it defaults to .NET's Internal Hash Code which I believe is either a unique ID for the object instance or a hash of the memory footprint it takes up. (I cannot remember and .NET Reflector can't go past the .NET component of the class).
GetHashCode is in object so that you can use anything as a key into a Hashtable, a basic container class. It provides symmetry. I can put anything into an ArrayList, why not a Hashtable?
If you require classes to implement IHashable, then for every sealed class that doesn't implement IHashable, you will be writing adapters that add the hashing capability whenever you want to use it as a key. Instead, you get it by default.
Also Hashcodes are a good second line for object equality comparison (first line is pointer equality).
If every class has GetHashCode you can put every object in a hash table. Imagine you have to use third-party objects (which you can't modify) and want to put them into a hash table. If these objects didn't implement your fictional IHashable, you couldn't do it. This is obviously a bad thing ;)
Just a guess, but the garbage collector may store hashtables of some objects internally (perhaps to keep track of finalizable objects), which means any object needs to have a hash key.

Why C# is not allowing non-member functions like C++

C# will not allow to write non-member functions and every method should be part of a class. I was thinking this as a restriction in all CLI languages. But I was wrong and I found that C++/CLI supports non-member functions. When it is compiled, compiler will make the method as member of some unnamed class.
Here is what C++/CLI standard says,
[Note: Non-member functions are treated by the CLI as members of some unnamed class; however, in C++/CLI source code, such functions cannot be qualified explicitly with that class name. end note]
The encoding of non-member functions in metadata is unspecified. [Note: This does not cause interop problems because such functions cannot have public visibility. end note]
So my question is: why doesn't C# implement something like this? Or do you think there should be no non-member functions and every method should belong to some class?
My opinion is that non-member functions should be supported, since they help avoid polluting a class's interface.
Any thoughts..?
See this blog posting:
http://blogs.msdn.com/ericlippert/archive/2009/06/22/why-doesn-t-c-implement-top-level-methods.aspx
(...)
I am asked "why doesn't C# implement feature X?" all the time. The answer is always the same: because no one ever designed, specified, implemented, tested, documented and shipped that feature. All six of those things are necessary to make a feature happen. All of them cost huge amounts of time, effort and money. Features are not cheap, and we try very hard to make sure that we are only shipping those features which give the best possible benefits to our users given our constrained time, effort and money budgets.
I understand that such a general answer probably does not address the specific question.
In this particular case, the clear user benefit was in the past not large enough to justify the complications to the language which would ensue. By restricting how different language entities nest inside each other we (1) restrict legal programs to be in a common, easily understood style, and (2) make it possible to define "identifier lookup" rules which are comprehensible, specifiable, implementable, testable and documentable.
By restricting method bodies to always be inside a struct or class, we make it easier to reason about the meaning of an unqualified identifier used in an invocation context; such a thing is always an invocable member of the current type (or a base type).
(...)
and this follow-up posting:
http://blogs.msdn.com/ericlippert/archive/2009/06/24/it-already-is-a-scripting-language.aspx
(...)
Like all design decisions, when we're faced with a number of competing, compelling, valuable and noncompossible ideas, we've got to find a workable compromise. We don't do that except by considering all the possibilities, which is what we're doing in this case.
(emphasis from original text)
C# doesn't allow it because Java didn't allow it.
I can think of several reasons why the designers of Java probably didn't allow it
Java was designed to be simple. They attempted to make a language without random shortcuts, so that you generally have just one simple way to do everything, even if other approaches would have been cleaner or more concise. They wanted to minimize the learning curve, and learning "a class may contain methods" is simpler than "a class may contain methods, and functions may exist outside classes".
Superficially, it looks less object-oriented. (Anything that isn't part of an object obviously can't be object-oriented, can it? Of course, C++ says it can, but C++ wasn't involved in this decision.)
As I already said in comments, I think this is a good question, and there are plenty of cases where non-member functions would've been preferable. (this part is mostly a response to all the other answers saying "you don't need it")
In C++, where non-member functions are allowed, they are often preferred, for several reasons:
It aids encapsulation. The fewer methods have access to the private members of a class, the easier that class will be to refactor or maintain. Encapsulation is an important part of OOP.
Code can be reused much more easily when it is not part of a class. For example, the C++ standard library defines std::find and std::sort as non-member functions, so that they can be reused on any type of sequence, whether it is an array, a set, a linked list or (for std::find, at least) a stream. Code reuse is also an important part of OOP.
It gives us better decoupling. The find function doesn't need to know about the LinkedList class in order to be able to work on it. If it had been defined as a member function, it would be a member of the LinkedList class, basically merging the two concepts into one big blob.
Extensibility. If you accept that the interface of a class is not just "all its public members", but also "all non-member functions that operate on the class", then it becomes possible to extend the interface of a class without having to edit or even recompile the class itself.
The ability to have non-member functions may have originated with C (where you had no other choice), but in modern C++ it is a vital feature in its own right, not just for backward-compatibility purposes, but because of the simpler, cleaner and more reusable code it allows.
In fact, C# seems to have realized much the same things, much later. Why do you think extension methods were added? They are an attempt at achieving the above, while preserving the simple Java-like syntax.
Lambdas are also interesting examples, as they too are essentially small functions defined freely, not as members of any particular class. So yes, the concept of non-member functions is useful, and C#'s designers have realized the same thing. They've just tried to sneak the concept in through the back door.
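As a rough illustration of how extension methods and lambdas approximate free functions in C#, here is a small sketch. SequenceExtensions and IndexOfFirst are hypothetical names invented for this example (they are not part of the .NET libraries); the helper is written once against IEnumerable<T>, much as std::find is written once against iterators, and never touches the container classes themselves.

```csharp
using System;
using System.Collections.Generic;

public static class SequenceExtensions
{
    // Hypothetical helper: returns the index of the first matching element, or -1.
    // It only uses the public IEnumerable<T> interface, so it has no access to
    // the private members of any concrete container.
    public static int IndexOfFirst<T>(this IEnumerable<T> source, Func<T, bool> predicate)
    {
        int index = 0;
        foreach (T item in source)
        {
            if (predicate(item)) return index;
            index++;
        }
        return -1;
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        int[] array = { 3, 5, 8 };
        var list = new LinkedList<string>(new[] { "x", "yy", "zzz" });

        // The same function works on an array and on a linked list,
        // and the predicates are lambdas that belong to no named class.
        Console.WriteLine(array.IndexOfFirst(n => n % 2 == 0));   // 2
        Console.WriteLine(list.IndexOfFirst(s => s.Length == 2)); // 1
    }
}
```

The function still has to live in a static class, but callers never mention that class name, which is the "back door" described above.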
http://www.ddj.com/cpp/184401197 and http://www.gotw.ca/publications/mill02.htm are two articles written by C++ experts on the subject.
Non-member functions are a good thing because they improve encapsulation and reduce coupling between types. Many modern programming languages, such as Haskell and F#, support free functions.
What's the benefit of not putting each method in a named class? Why would a non-member function "pollute" the class's interface? If you don't want it as part of the public API of a class, either don't make it public or don't put it in that class. You can always create a different class.
I can't remember ever wanting to write a method floating around with no appropriate scope - other than anonymous functions, of course (which aren't really the same).
In short, I can't see any benefit in non-member functions, but I can see benefits in terms of consistency, naming and documentation in putting all methods in an appropriately named class.
The CLS (Common Language Specification) says that you shouldn't have non-member functions in a library that conforms to the CLS. It's an extra set of restrictions on top of the basic restrictions of the CLI (Common Language Infrastructure).
It is possible that a future version of C# will add the ability to write a using directive that allows the static members of a class to be accessed without the class name qualification:
using System.Linq.Enumerable; // Enumerable is a static class
...
IEnumerable<int> range = Range(1, 10); // finds Enumerable.Range
Then there will be no need to change the CLS and existing libraries.
These blog posts demonstrate a library for functional programming in C#, and they use a class name that is just one letter long, to try and cut down the noise caused by the requirement to qualify static method calls. Examples like that would be made a little nicer if using directives could target classes.
Since Java, most programmers have readily accepted that every method is a member of a class. It doesn't create any considerable obstacles, and it makes the concept of a method narrower, which makes the language easier.
However, a class does imply an object, and an object implies state, so the concept of a class containing only static methods looks a little absurd.
Having all code lie within classes allows for a more powerful set of reflection capabilities.
It allows the use of static initializers, which can initialize the data needed by static methods within a class.
It avoids name clashes between methods by explicitly enclosing them within a unit that cannot be added to by another compilation unit.
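A minimal sketch of the first two points, under the assumption that the "free" functions are grouped in a static class. SmallPrimes, ReflectionDemo and PublicStaticMethodNames are hypothetical names for this illustration; Type.GetMethods and BindingFlags are the real reflection API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

public static class SmallPrimes
{
    private static readonly HashSet<int> Known;

    // Static constructor: runs once, before the first use of the class,
    // and prepares the data the static method below depends on.
    static SmallPrimes()
    {
        Known = new HashSet<int> { 2, 3, 5, 7, 11, 13 };
    }

    public static bool Contains(int n) => Known.Contains(n);
}

public static class ReflectionDemo
{
    // Because every method belongs to a type, reflection can list the
    // functions of a "module" simply by inspecting that type.
    public static string[] PublicStaticMethodNames(Type type) =>
        type.GetMethods(BindingFlags.Public | BindingFlags.Static | BindingFlags.DeclaredOnly)
            .Select(m => m.Name)
            .ToArray();

    public static void Main()
    {
        Console.WriteLine(SmallPrimes.Contains(7));
        Console.WriteLine(string.Join(", ", PublicStaticMethodNames(typeof(SmallPrimes))));
    }
}
```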
I think you really need to clarify what you would want to create non-member static methods to achieve.
For instance, some of the things you might want them for could be handled with extension methods.
Another typical use (of a class which only contains static methods) is in a library. In this case, there is little harm in creating a class in an assembly which is entirely composed of static methods. It keeps them together, avoids naming collisions. After all, there are static methods in Math which serve the same purpose.
Also, you should not necessarily compare C++'s object model with C#. C++ is largely (but not perfectly) compatible with C, which didn't have a class system at all - so C++ had to support this programming idiom out of the C legacy, not for any particular design imperative.
C# does not have non-member functions because it copied, or was inspired by, Java's philosophy that OOP is the solution to every problem, so it only allows problems to be solved in an OO way.
Non-member functions are a very important feature if we really want to do generic programming. They are more reusable than functions placed in a class.
C# had to come up with extension methods because of the absence of non-member functions.
Programming languages are now moving towards the functional programming paradigm, which seems to be a better way to approach and solve problems and looks like the future. C# should rethink this.
Bear something in mind: C++ is a much more complicated language than C#. And although they may be similar syntactically, they are very different beasts semantically. You wouldn't think it would be terribly difficult to make a change like this, but I could see how it could be. ANTLR has a good wiki page called What makes a language problem hard? that's good to consult for questions like this. In this case:
Context sensitive lexer? You can't decide what vocabulary symbol to match unless you know what kind of sentence you are parsing.
Now instead of just worrying about functions defined in classes, we have to worry about functions defined outside classes. Conceptually, there isn't much difference. But in terms of lexing and parsing the code, now you have the added problem of having to say "if a function is outside a class, it belongs to this unnamed class. However, if it is inside the class, then it belongs to that class."
Also, if the compiler comes across a method like this:
public void Foo()
{
    Bar();
}
...it now has to answer the question "is Bar located within this class, or is it a global function?"
Forward or external references? I.e., multiple passes needed? Pascal has a "forward" reference to handle intra-file procedure references, but references to procedures in other files via the USES clauses etc... require special handling.
This is another thing that causes problems. Remember that C# doesn't require forward declarations. The compiler will make one pass just to determine what classes are named and what functions those classes contain. Now you have to worry about finding classes and functions where functions can be either inside or outside of a class. This is something a C++ parser doesn't have to worry about as it parses everything in order.
Now don't get me wrong, it could probably be done in C#, and I would probably use such a feature. But is it really worth all the trouble of overcoming these obstacles when you could just type a class name in front of a static method?
Free functions are very useful if you combine them with duck typing. The whole C++ STL is based on it. Hence I am sure that C# will introduce free functions when they manage to add true generics.
Like economics, language design is also about psychology. If you create an appetite for true generics via free functions in C# and do not deliver, then you would kill C#. Then all C# developers would move to C++ and nobody wants that to happen, not the C# community and most certainly not those invested in C++.
While it's true you need a class (e.g. a static class called FreeFunctions) to hold such functions, you're free to place using static FreeFunctions; at the top of any file that needs the functions from it, without having to litter your code with FreeFunctions. qualifiers.
I'm not sure if there's actually a case where this is demonstrably inferior to not requiring the function definitions to be contained in a class.
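A minimal sketch of that approach, assuming C# 6 or later. FreeFunctions follows the name used above; Square, UsingStaticDemo and SumOfSquares are hypothetical names for illustration.

```csharp
using System;
using static FreeFunctions; // brings the static members of FreeFunctions into scope

public static class FreeFunctions
{
    public static int Square(int x) => x * x;
}

public static class UsingStaticDemo
{
    // Square is called without the FreeFunctions. qualifier.
    public static int SumOfSquares(int a, int b) => Square(a) + Square(b);

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(3, 4)); // 25
    }
}
```

In a real project FreeFunctions would normally live in its own file and namespace, and each consuming file would carry its own using static directive.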
Look, other programming languages have a hard time defining the internal nature of a function instance from the compiler's point of view. In Pascal and C, function instances are basically defined as something that can only be handled through a pointer, especially since reading from or writing to executable code locations is something 7 out of 9 computer science professors are dead set against. When a function is a member of a class, no one needs to care how to treat its manifestation, because the manifestation's type is derived from a class property. It is still possible to create something that is used exactly like a global function: an anonymous function assigned to a variable:
Func<int, int> myFunc = delegate(int var1)
{
    // Print the doubled argument, then return the tripled one.
    Console.WriteLine("{0}", var1 * 2);
    return var1 * 3;
};
It can then be called like a global function through its variable name, e.g. myFunc(5).
If so, the difference would be implementing a new object type on the lowest level with same behavior as another one. That is considered bad practice by experienced programmers, and was perhaps scrapped because of this.
