C# method override resolution weirdness - c#

Consider the following snippet of code:
using System;
class Base
{
public virtual void Foo(int x)
{
Console.WriteLine("Base.Foo(int)");
}
}
class Derived : Base
{
public override void Foo(int x)
{
Console.WriteLine("Derived.Foo(int)");
}
public void Foo(object o)
{
Console.WriteLine("Derived.Foo(object)");
}
}
public class Program
{
public static void Main()
{
Derived d = new Derived();
int i = 10;
d.Foo(i);
}
}
And the surprising output is:
Derived.Foo(object)
I would expect it to select the overridden Foo(int x) method, since it's more specific. However, the C# compiler picks the non-inherited Foo(object o) version, which also causes a boxing operation.
What is the reason for this behaviour?

This is the rule, and you may not like it...
Quote from Eric Lippert
if any method on a more-derived class is an applicable candidate, it
is automatically better than any method on a less-derived class, even
if the less-derived method has a better signature match.
The reason is that the method with the better signature match might have been added to the base class in a later version, thereby introducing a "brittle base class" failure.
Note: This is a fairly complicated, in-depth part of the C# spec, and it jumps all over the place. However, the main parts relevant to the issue you are experiencing are as follows.
Update
And this is why I like Stack Overflow: it is such a great place to learn.
I was quoting the section on the run-time processing of the method call, whereas the question is about compile-time overload resolution. The section I should have quoted is:
7.6.5.1 Method invocations
...
The set of candidate methods is reduced to contain only methods from
the most derived types: For each method C.F in the set, where C is the
type in which the method F is declared, all methods declared in a base
type of C are removed from the set. Furthermore, if C is a class type
other than object, all methods declared in an interface type are
removed from the set. (This latter rule only has affect when the
method group was the result of a member lookup on a type parameter
having an effective base class other than object and a non-empty
effective interface set.)
Please see Eric's answer https://stackoverflow.com/a/52670391/1612975 for full details on what's going on here and the appropriate part of the spec.
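If the int overload is the one you want, a common workaround is to change the compile-time type of the receiver so that Derived.Foo(object) is no longer in the candidate set. The following is a minimal sketch of that idea, not a recommendation from the spec. The classes mirror the question, except that the methods return strings instead of writing to the console so the result is easy to check, and CastDemo is a hypothetical helper.

```csharp
class Base
{
    public virtual string Foo(int x) => "Base.Foo(int)";
}

class Derived : Base
{
    public override string Foo(int x) => "Derived.Foo(int)";
    public string Foo(object o) => "Derived.Foo(object)";
}

// Hypothetical helper, not part of the original question.
static class CastDemo
{
    public static string CallDirectly()
    {
        Derived d = new Derived();
        int i = 10;
        // Foo(object) is declared in Derived and is applicable, so the
        // methods declared in Base are removed from the candidate set.
        return d.Foo(i); // "Derived.Foo(object)"
    }

    public static string CallThroughBase()
    {
        Derived d = new Derived();
        int i = 10;
        // The compile-time type is now Base, so only Foo(int) is a candidate;
        // virtual dispatch then runs the override in Derived.
        return ((Base)d).Foo(i); // "Derived.Foo(int)"
    }
}
```

The cast only changes which overload the compiler binds to; the override in Derived still runs because Foo(int) is virtual.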
Original
C#
Language Specification
Version 5.0
7.5.5 Function member invocation
...
The run-time processing of a function member invocation consists of
the following steps, where M is the function member and, if M is an
instance member, E is the instance expression:
...
If M is an instance function member declared in a reference-type:
E is evaluated. If this evaluation causes an exception, then no further steps are executed.
The argument list is evaluated as described in §7.5.1.
If the type of E is a value-type, a boxing conversion (§4.3.1) is performed to convert E to type object, and E is considered to be of
type object in the following steps. In this case, M could only be a
member of System.Object.
The value of E is checked to be valid. If the value of E is null, a System.NullReferenceException is thrown and no further steps are
executed.
The function member implementation to invoke is determined:
If the binding-time type of E is an interface, the function member to invoke is the implementation of M provided by the run-time
type of the instance referenced by E. This function member is
determined by applying the interface mapping rules (§13.4.4) to
determine the implementation of M provided by the run-time type of the
instance referenced by E.
Otherwise, if M is a virtual function member, the function member to invoke is the implementation of M provided by the run-time type of
the instance referenced by E. This function member is determined by
applying the rules for determining the most derived implementation
(§10.6.3) of M with respect to the run-time type of the instance
referenced by E.
Otherwise, M is a non-virtual function member, and the function member to invoke is M itself.
After reading the spec, what's interesting is that if you call through an interface that declares the method, the compiler binds to the interface's signature, and the call then works as expected:
public interface ITest
{
void Foo(int x);
}
Which can be shown here
As for the interface, this makes sense when you consider that the overload resolution behavior was designed to protect against the brittle base class problem.
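A self-contained sketch of the interface variant, assuming Derived is changed to implement ITest. As before, the methods return strings rather than printing, and InterfaceDemo is a hypothetical helper added for illustration.

```csharp
public interface ITest
{
    string Foo(int x);
}

class Base
{
    public virtual string Foo(int x) => "Base.Foo(int)";
}

class Derived : Base, ITest
{
    public override string Foo(int x) => "Derived.Foo(int)";
    public string Foo(object o) => "Derived.Foo(object)";
}

// Hypothetical helper, not part of the original answer.
static class InterfaceDemo
{
    public static string CallThroughInterface()
    {
        ITest t = new Derived();
        int i = 10;
        // ITest declares only Foo(int), so Foo(object) is never a candidate.
        return t.Foo(i); // "Derived.Foo(int)"
    }
}
```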
Additional Resources
Eric Lippert, Closer is better
The aspect of overload resolution in C# I want to talk about today is
really the fundamental rule by which one potential overload is judged
to be better than another for a given call site: closer is always
better than farther away. There are a number of ways to characterize
“closeness” in C#. Let’s start with the closest and move our way out:
A method first declared in a derived class is closer than a method first declared in a base class.
A method in a nested class is closer than a method in a containing class.
Any method of the receiving type is closer than any extension method.
An extension method found in a class in a nested namespace is closer than an extension method found in a class in an outer namespace.
An extension method found in a class in the current namespace is closer than an extension method found in a class in a namespace
mentioned by a using directive.
An extension method found in a class in a namespace mentioned in a using directive where the directive is in a nested namespace is closer
than an extension method found in a class in a namespace mentioned in
a using directive where the directive is in an outer namespace.

The accepted answer is correct (excepting the fact that it quotes the wrong section of the spec) but it explains things from the perspective of the specification rather than giving a justification for why the specification is good.
Let's suppose we have base class B and derived class D. B has a method M that takes Giraffe. Now, remember, by assumption, the author of D knows everything about B's public and protected members. Put another way: the author of D must know more than the author of B, because D was written after B, and D was written to extend B to a scenario not already handled by B. We should therefore trust that the author of D is doing a better job of implementing all functionality of D than the author of B.
If the author of D makes an overload of M that takes an Animal, they are saying "I know better than the author of B how to deal with Animals, and that includes Giraffes." We should expect overload resolution, when given a call to D.M(Giraffe), to call D.M(Animal), and not B.M(Giraffe).
Let's put this another way: We are given two possible justifications:
A call to D.M(Giraffe) should go to B.M(Giraffe) because Giraffe is more specific than Animal
A call to D.M(Giraffe) should go to D.M(Animal) because D is more specific than B
Both justifications are about specificity, so which justification is better? We're not calling any method on Animal! We're calling the method on D, so that specificity should be the one that wins. The specificity of the receiver is far, far more important than the specificity of any of its parameters. The parameter types are there for tie breaking. The important thing is making sure we choose the most specific receiver because that method was written later by someone with more knowledge of the scenario that D is intended to handle.
Now, you might say, what if the author of D has also overridden B.M(Giraffe)? There are two arguments why a call to D.M(Giraffe) should call D.M(Animal) in this case.
First, the author of D should know that D.M(Animal) can be called with a Giraffe, and it must be written to do the right thing. So it should not matter from the user's perspective whether the call is resolved to D.M(Animal) or B.M(Giraffe), because D has been written correctly to do the right thing.
Second, whether the author of D has overridden a method of B or not is an implementation detail of D, and not part of the public surface area. Put another way: it would be very strange if changing whether or not a method was overridden changes which method is chosen. Imagine if you're calling a method on some base class in one version, and then in the next version the author of the base class makes a minor change to whether a method is overridden or not; you would not expect overload resolution in the derived class to change. C# has been designed carefully to prevent this kind of failure.
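The Giraffe scenario above can be sketched as follows. The type names B, D, M, Animal and Giraffe come from the answer; the string return values and the ReceiverDemo helper are assumptions made so the outcome can be checked.

```csharp
class Animal { }
class Giraffe : Animal { }

class B
{
    public string M(Giraffe g) => "B.M(Giraffe)";
}

class D : B
{
    public string M(Animal a) => "D.M(Animal)";
}

// Hypothetical helper, not part of the original answer.
static class ReceiverDemo
{
    // The receiver is a D, so the applicable method declared in D wins
    // even though B.M(Giraffe) is the better signature match.
    public static string CallOnD() => new D().M(new Giraffe()); // "D.M(Animal)"

    // With a receiver typed as B, only B's method is visible.
    public static string CallOnB() => ((B)new D()).M(new Giraffe()); // "B.M(Giraffe)"
}
```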

Related

C# - Inside instance method, why can we access static members without using the class name?

This seems counter-intuitive to me. If we have a class Dog with static method CountAllDogs(), C# forbids to call it like this: myDog.CountAllDogs(). (myDog is an object of type Dog).
But if we are inside an instance method Bark(), we can call it simply by using CountAllDogs(). Inside the instance method Bark(), the context ("this") is the object myDog, not the class itself, so I wonder why this is allowed?
"Why" questions are frequently vague, and this one is no exception. Rather than answer your vague and confusing question, I'll answer a different question.
What is the fundamental rule for resolving unqualified names in C#?
The fundamental rule of resolving an unqualified name is search from inside to outside. Suppose you have:
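using System;
namespace A {
    namespace B {
        using C; // assumes a namespace C is declared elsewhere
        class D
        {
            public static void E() {}
        }
        class F : D {
            public static void G() {}
            public void H()
            {
                Action i = ()=>{};
                // ... an unqualified name X appears somewhere in here ...
            }
        }
    }
}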
Now suppose somewhere inside H we have an unqualified name, X. We need to figure out what it means. So we go from inside to outside:
Is there any local variable X? If yes, that's the meaning of X. If not...
Is there any member of F called X?
Is there any member of D -- the base class of F -- called X?
Is there any member of object -- the base class of D -- called X?
Is there any member of B called X?
Is there any member of C -- which B is "using" -- called X?
Is there any member of A called X?
Is there any member of System called X?
Is there any member of the global namespace called X?
(This is a sketch that leaves out a few details, such as how aliases are dealt with and so on; read the specification if you want the details.)
An interesting point here is that base classes are considered to be "more inside" than the lexically containing program element. Members of D are considered to be members of F, so they must be checked before we check B.
That's the fundamental rule. There are also a few extra rules added for convenience. For instance, if we had X() then only invocable members are considered when doing the search. There is also the famous "Color Color" rule, which says that if you have a type called Color and a property of type Color called Color -- because what else would you call it? -- then the name lookup rules for Color are smart about figuring out whether you meant the type or the property, even when that means departing from the fundamental rule of unqualified name lookup.
Now that you know the fundamental rule you can apply it to your situation. Why can you call a static member without qualification? Because the fundamental rule of unqualified name lookup is "search from inside to outside", and doing so finds a static element with that name. If we're inside H and we have G() then G is not a local but it is an invocable member of the enclosing class, so it wins. If we're inside H and we have E() then we find it in D, after failing to find it in H or F. And so on.
When you call an unqualified instance member, same thing. The unqualified name is resolved, and if it turns out to be an instance member, then this is used as the receiver of the member.
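Applied to the Dog example from the question, the rule might look like the sketch below. CountAllDogs and Bark are the names used in the question; the counter field, the constructor and the Describe method are assumptions added to make the example complete.

```csharp
class Dog
{
    private static int count;

    public Dog() { count++; }

    public static int CountAllDogs() => count;

    public string Bark()
    {
        // Unqualified name: lookup searches outward from Bark and finds
        // the static member CountAllDogs on the enclosing class Dog.
        int total = CountAllDogs();

        // Unqualified instance member: the name resolves to an instance
        // method, so 'this' is used as the receiver.
        return Describe(total);
    }

    private string Describe(int total) => "Woof, one of " + total;
}
```

Both calls inside Bark go through the same inside-to-outside lookup; only the second one needs a receiver, and the compiler supplies this.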
To call an instance member you need an instance, of course. If you're already within an instance member, you can use the this reference, which points to the current instance. If, on the other hand, you're calling the instance member from outside the class, you'd write something like this:
myInstance.DoSomething();
So using this as a qualifier is redundant; you may simply omit it if you're within the method.
A static member doesn't know any instance, thus there is no this. Adding the class name is again redundant information, just like adding the this keyword to an instance member.
The compiler should be clever enough to determine whether a member is static or not, and thus whether or not it needs an instance.
Personally I agree that adding the context in which a member might be invoked is a good thing, which is even forced in Java, for example. This avoids annoying prefixes on variables such as m_ for instance members and s_ for static members. But again, the compiler already knows; it's just a matter of taste.
So, having said this, there's no actual reason why this should or should not be permitted, as it neither avoids nor produces any mistakes. It's just a matter of convention to be more strict.

Why C# compiler use an invalid method's overload?

I have been confused by the following code
class A
{
public void Abc(int q)
{
Console.Write("A");
}
}
class B : A
{
public void Abc(double p)
{
Console.Write("B");
}
}
...
var b = new B();
b.Abc((int)1);
The result of code execution is "B" written to console.
In fact the B class contains two overloads of the Abc method, the first for an int parameter and the second for a double. Why does the compiler use the double version for an integer argument?
Note that the method Abc(double) doesn't shadow or override the method Abc(int).
Since the compiler can implicitly convert the int to double, it chooses the B.Abc method. This is explained in this post by Jon Skeet (search for "implicit"):
The target of the method call is an expression of type Child, so the
compiler first looks at the Child class. There's only one method
there, and it's applicable (there's an implicit conversion from int to
double) so that's the one that gets picked. The compiler doesn't
consider the Parent method at all.
The reason for this is to reduce the risk of the brittle base class
problem...
More from Eric Lippert
As the standard says, “methods in a base class are not candidates if any method in a derived class is applicable”.
In other words, the overload resolution algorithm starts by searching
the class for an applicable method. If it finds one then all the other
applicable methods in deeper base classes are removed from the
candidate set for overload resolution. Since Delta.Frob(float) is
applicable, Charlie.Frob(int) is never even considered as a candidate.
Only if no applicable candidates are found in the most derived type do
we start looking at its base class.
Things get a little more interesting if we extend the example in your question with this additional class that descends from A:
class C : A {
public void Abc(byte b) {
Console.Write("C");
}
}
If we execute the following code
var b = new B();
var c = new C();
int i = 1;
b.Abc((int)1);
b.Abc(i);
c.Abc((int)1);
c.Abc(i);
the results are BBCA. This is because, in the case of the B class, the compiler knows it can implicitly convert any int to double. In the case of the C class, the compiler knows it can implicitly convert the constant int 1 to a byte (because the value 1 fits in a byte), so C's Abc method gets used. The compiler, however, can't implicitly convert an arbitrary int to a byte, so c.Abc(i) can't use C's Abc method. It must use the parent class's method in that case.
This page on Implicit Numeric Conversions shows a compact table of which numeric types have implicit conversions to other numeric types.
You get the same behavior even when you define B as:
class B : A
{
public void Abc(object p)
{
Console.Write("B");
}
}
Simply put, it's because overload resolution is done by looking at methods defined in the current class. If there are any suitable methods in the current class, it stops looking. Only if there are no suitable matches does it look at base classes.
You can take a look at the Overload resolution spec for a detailed explanation.
Different languages (such as C++, Java, or C#) have vastly different overload resolution rules. In C#, the overload was correctly chosen as per the language spec. If you wanted the other overload to be chosen, you have a choice. Remember this:
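When a derived class intends to declare another overload for an inherited method, so as to treat all available overloads as equal-rights peers, it must also explicitly redeclare all the inherited overloads in the derived class, each forwarding to the base implementation. (In C# the redeclaration has to hide the inherited method with new; an override alone does not help, because for overload resolution an overridden method still counts as declared in the base class, as the first question on this page shows.)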
What is the language design benefit of requiring this exercise?
Imagine that you are using a 3rd party library (say, .NET framework) and deriving from one of its classes. At some point you introduce a private method called Abc (a new, unique name, not an overload of anything). Two years later you upgrade the 3rd party library version without noticing that they also added a method, accessible to you and called, regrettably, Abc, except that it has a different parameter type somewhere (so the upgrade doesn't alert you with a compile time error) and it behaves subtly differently or maybe even has a different purpose altogether. Do you really want one half of your private calls to Abc to be silently redirected to the 3rd party Abc? In Java, this may happen. In C# or C++, this isn't going to happen.
The upside of the C# way is that it's somewhat easier, for a redistributed library, to add functionality while rigorously keeping backward compatibility. In two ways actually:
You won't ever mess with your customers' private method calls inside their own code.
You won't ever break your customers by adding a new uniquely named method, although you must still think twice before adding an overload of YOUR own existing method.
The downside of the C# way is that it cuts a hole in the OOP philosophy that overriding a method only ever changes the implementation, never the API of a class.
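The advice above about treating inherited overloads as peers can be sketched for the A/B example as follows. Note that in C# an override declaration would not be enough here, so this sketch hides the inherited method with new and forwards to the base class. The string return values and the PeerDemo helper are assumptions added for illustration.

```csharp
class A
{
    public string Abc(int q) => "A";
}

class B : A
{
    // Redeclared in B so that it competes with Abc(double) as a peer.
    // It simply forwards to the inherited implementation.
    public new string Abc(int q) => base.Abc(q);

    public string Abc(double p) => "B";
}

// Hypothetical helper, not part of the original answer.
static class PeerDemo
{
    // Both overloads are now declared in B, so the better match wins.
    public static string CallWithInt() => new B().Abc(1);      // "A"
    public static string CallWithDouble() => new B().Abc(1.0); // "B"
}
```

The cost of this pattern is that every inherited overload has to be forwarded by hand, and the forwarding has to be kept up to date when the base class changes.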

Base Class or Derived Class is Runtime Type?

I've been reading documentation about virtual methods :
In a virtual method invocation, the run-time type of the instance for
which that invocation takes place determines the actual method
implementation to invoke. In a non-virtual method invocation, the
compile-time type of the instance is the determining factor. In
precise terms, when a method named N is invoked with an argument list
A on an instance with a compile-time type C and a run-time type R
(where R is either C or a class derived from C), the invocation is
processed as follows... :
http://msdn.microsoft.com/en-us/library/aa645767(v=vs.71).aspx
However, I noticed something which is bold above. Let's say we have code like this:
class Planet{
public string Name;
public float Size;
public virtual void SpinPlanet(){
Console.WriteLine("Hoooraaay!!!");
}
}
class Earth : Planet{
}
And somewhere in my code I do:
Earth world = new Earth();
world.SpinPlanet();
In this case:
N is SpinPlanet()
C is Earth
R is Planet
So how come R can be a class derived from the compile-time type C? Aren't the base class types resolved at run time?
You are mistaken: the compile-time type (C) is Earth and the run-time type (R) is also Earth. The part of the specification that you point out is not really relevant here.
What is relevant is http://msdn.microsoft.com/en-us/library/aa691356(v=vs.71).aspx, specifically:
The set of candidate methods for the method invocation is constructed.
Starting with the set of methods associated with M, which were found
by a previous member lookup (Section 7.3), the set is reduced to those
methods that are applicable with respect to the argument list A.
The only candidate implementation of SpinPlanet just happens to be in the base class of Earth, not in the derived class.
The part of the spec that you refer to would apply if the code were:
Planet world = new Earth();
world.SpinPlanet();
(especially if Earth defined an override for SpinPlanet) because then the compile-time type (the type of the variable) would be Planet, but the run-time type would be Earth.
The correct method to invoke is resolved at run time by picking it from the virtual method table. So if you add an override to Earth
class Earth : Planet{
    public override void SpinPlanet(){
        Console.WriteLine("Hoooraaay Earth!!!");
    }
}
then with code like this
Planet world = new Earth();
world.SpinPlanet(); // even though the declared type is Planet,
                    // the real type remains Earth, so its method is called
Earth's method will be invoked.
In my example the compile-time type is Planet, but the run-time type is Earth.
In your example, the compile-time and run-time types are both Earth.
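To contrast the two halves of the quoted rule, here is a small sketch that puts a virtual and a non-virtual method side by side. SpinPlanet comes from the question; the non-virtual Describe method, the string return values and the DispatchDemo helper are assumptions added for illustration.

```csharp
class Planet
{
    public virtual string SpinPlanet() => "Planet spins";

    // Non-virtual: bound using the compile-time type of the receiver.
    public string Describe() => "Planet";
}

class Earth : Planet
{
    public override string SpinPlanet() => "Earth spins";

    // Hides Planet.Describe; it does not override it.
    public new string Describe() => "Earth";
}

// Hypothetical helper, not part of the original answers.
static class DispatchDemo
{
    public static string Run()
    {
        Planet world = new Earth(); // compile-time type Planet, run-time type Earth
        // Virtual call uses the run-time type; non-virtual call uses the compile-time type.
        return world.SpinPlanet() + " / " + world.Describe(); // "Earth spins / Planet"
    }
}
```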

What is the difference between method hiding and shadowing in C#?

What is the difference between method hiding and shadowing in C#? Are they same or different? Can we call them as polymorphism (compile time or run time)?
What is the difference between method hiding and shadowing in C#?
Shadowing is another commonly used term for hiding. The C# specification only uses "hiding" but either is acceptable.
You call out just "method hiding" but there are forms of hiding other than method hiding. For example:
namespace N
{
    class D {}
    class C
    {
        class N
        {
            class D
            {
                N.D nd; // Which N.D does this refer to?
            }
        }
    }
}
The nested class N hides the namespace N when inside D.
Can we call them as polymorphism (compile time or run time)?
Method hiding can be used for polymorphism, yes. You can even mix method hiding with method overriding; it is legal to introduce a new virtual method by hiding an old virtual method; in that case which virtual method is chosen depends on the compile-time and run-time type of the receiver. Doing that is very confusing and you should avoid it if possible.
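A minimal sketch of the confusing case described above, where a new virtual method hides an old virtual one. The class and method names and the HidingDemo helper are hypothetical.

```csharp
class A
{
    public virtual string M() => "A.M";
}

class B : A
{
    // Hides A.M and starts a new, separate virtual method.
    public new virtual string M() => "B.M";
}

class C : B
{
    // Overrides B.M, not A.M.
    public override string M() => "C.M";
}

// Hypothetical helper, not part of the original answer.
static class HidingDemo
{
    // Compile-time type A selects A's virtual method, which nobody overrode.
    public static string ViaA() { A a = new C(); return a.M(); } // "A.M"

    // Compile-time type B selects B's virtual method, which C overrides.
    public static string ViaB() { B b = new C(); return b.M(); } // "C.M"
}
```

The same object answers differently depending on the compile-time type of the reference, which is why the answer recommends avoiding this pattern.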
The VB.NET compiler calls it shadowing; in C# it is called hiding. Calling it shadowing in C# is a spill-over from VB.
It also produces a compiler warning; essentially it is a name conflict between the base and derived class.
Can we call them as polymorphism (compile time or run time)?
It certainly is not a form of runtime polymorphism. A call to a hiding or a hidden method is resolved at compile time, which is why it is generally not called or considered polymorphism.
The two terms mean the same in C#.
Method hiding == shadowing.
You can use this as a form of polymorphism - when you don't want the base class method to be visible/usable through the inheriting class.
A shadowing method is completely decoupled from the base class - it is a new method. The term hiding is used because it has an identical signature to the one of the base class and is "hiding" it - it breaks the inheritance chain.
Name hiding of C# (new modifier) is called shadowing in VB.NET (keyword Shadows).
This can be thought of as polymorphism only in the sense that overloading is a "polymorphism", i.e. static or compile-time. It is not polymorphism in the classical sense of calling virtual functions.
They are just two different words for the same thing, but differ in the context where you most often use them. Typically, what is called "hiding" is related to polymorphism but what is called "shadowing" is not.
In C# parlance, when you say "hiding" you're usually talking about inheritance, where a more derived method "hides" a base-class method from the normal inherited method call chain.
When you say "shadow" you're usually talking about scope: an identifier in an inner scope is "shadowing" an identifier at a higher scope. In other languages, what is called "hiding" in C# is sometimes called "shadowing" as well.
Both are compile-time concepts; they describe what object a given identifier refers to in a given context when the compiler goes to bind it.
public class A
{
    public int B;
    public int C()
    {
        return this.B;
    }
}
public class D : A
{
    public int X;
    public new decimal C()
    {
        var X = 1.0m; // local X shadows the field D.X
        return X;
    }
}
Method D.C() "hides" method A.C(); normally, a call to D.C() would always call into the base class's A.C() method, since it's not virtual. We don't want that; we want D.C(). Obviously this is something you should try to avoid, because it's confusing, especially if you start up-casting your D's to A's, but it exists if you need it. Also, note that method hiding is automatic: without the new keyword here, D.C() still hides A.C(), but we get a warning because usually that's not what you want. The new keyword just makes it clear that it really is what we want.
Local variable X in D.C() shadows class member D.X within the scope of D.C() only. In this case, there are two things in scope that could legitimately be called X and the compiler needs rules to tell it which one you mean. The "more local" X shadows the "less local" D.X so that's what we get.

What's really happening with new and override under the covers?

I've found loads of practical examples of this, and understand the practical output when overriding or hiding methods, but I'm looking for some under the covers info on why this is and why C# allows it when according to the rules of polymorphism, this shouldn't be allowed - at least, insofar as my understanding of polymorphism goes (which seems to coincide with the standard definitions found on Wikipedia/Webopedia).
class Base
{
    public virtual void PrintName()
    {
        Console.WriteLine("BaseClass");
    }
}
class FirstDerived : Base
{
    public override void PrintName()
    {
        Console.WriteLine("FirstDerived");
    }
}
class SecondDerived : Base
{
    public new void PrintName()
    {
        Console.WriteLine("SecondDerived");
    }
}
Using the following code:
FirstDerived b = new FirstDerived();
Base a = b;
b.PrintName();
a.PrintName();
I get:
FirstDerived
FirstDerived
Okay, I get that, makes sense.
SecondDerived c = new SecondDerived();
Base a = c;
c.PrintName();
a.PrintName();
I get:
SecondDerived
BaseClass
Okay, that makes sense, too, instance a can't see c.PrintName() so it's using its own method to print its own name, however I can cast my instance to its true type using:
((SecondDerived)a).PrintName();
or
(a as SecondDerived).PrintName();
to get the output I would expect:
SecondDerived
So what is going on under the covers and what does this mean in terms of polymorphism? I'm told that this facility "breaks polymorphism" - and I guess according to the definition, it does. Is that right? Would an "object oriented" language like C# really allow you to break one of the core principles of OOP?
(This answers the "why is it allowed" which I think is really the central point of your question. How it works in terms of the IL is less interesting to my mind... let me know if you want me to go into that though. Basically it's just a case of specifying the method to call with a different type token.)
It allows base classes to evolve without breaking derived classes.
Suppose Base didn't originally have the PrintName method. The only way to get at SecondDerived.PrintName would be to have an expression with a static type of SecondDerived, and call it on that. You ship your product, everything is fine.
Now fast forward to Base introducing a PrintName method. This may or may not have the same semantics of SecondDerived.PrintName - it's safest to assume that it doesn't.
Any callers of Base.PrintName know that they're calling the new method - they couldn't have called it before. Any callers which were previously using SecondDerived.PrintName still want to use it though - they don't want to suddenly end up calling Base.PrintName which could do something entirely different.
The difficulty is new callers of SecondDerived.PrintName, who may or may not appreciate that this isn't an override of Base.PrintName. They may be able to notice this from the documentation of course, but it may not be obvious. However, at least we haven't broken existing code.
When SecondDerived is recompiled, though, the authors will be made aware through a warning that there's now a Base.PrintName method. They can either stick to their existing non-virtual scheme by adding the new modifier, or make it override the Base.PrintName method. Until they make that decision, they'll keep getting a warning.
Versioning and compatibility isn't usually mentioned in OO theory in my experience, but C# has been designed to try to avoid compatibility nightmares. It doesn't solve the problem completely, but it does a pretty good job.
I answer "how" it works. Jon has answered the "Why" part.
Calls to virtual methods are resolved a bit differently from calls to non-virtual ones. Basically, a virtual method declaration introduces a "virtual method slot" in the base class. The slot holds a pointer to the actual method definition; in derived classes that override the method, the same slot points to the overriding version, and no new slot is created. When the compiler generates code for a virtual method call, it uses the callvirt IL instruction, specifying the method slot to call, and the runtime dispatches the call to the appropriate method. A non-virtual method, on the other hand, is bound statically to the actual method at compile time, using only the compile-time type of the variable. Conceptually this corresponds to the call IL instruction, although in practice the C# compiler usually emits callvirt for non-virtual instance methods as well, to get the null check on the receiver. The new modifier does nothing in the compiled code. It essentially tells the C# compiler "Dude, shut up! I'm sure I'm doing the right thing" and turns off the compiler warning.
A new method (actually, any method without an override modifier) will introduce a completely separate chain of methods (new method slot). Note that a new method can be virtual itself. The compiler will look at the static type of the variable when it wants to resolve the method chain and the run time will choose the actual method in that specific chain.
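One way to observe the separate method chains from inside a program is reflection: MethodInfo.GetBaseDefinition returns the method that first introduced the slot. The sketch below uses the classes from the question; the SlotDemo helper and its SlotOwner method are hypothetical names added for illustration.

```csharp
using System;
using System.Reflection;

class Base
{
    public virtual void PrintName() => Console.WriteLine("BaseClass");
}

class FirstDerived : Base
{
    public override void PrintName() => Console.WriteLine("FirstDerived");
}

class SecondDerived : Base
{
    public new void PrintName() => Console.WriteLine("SecondDerived");
}

// Hypothetical helper, not part of the original answer.
static class SlotDemo
{
    // Returns the name of the type that introduced the slot used by
    // the PrintName method declared directly on type t.
    public static string SlotOwner(Type t)
    {
        const BindingFlags flags =
            BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly;
        MethodInfo m = t.GetMethod("PrintName", flags);
        return m.GetBaseDefinition().DeclaringType.Name;
    }
}
```

Under these assumptions, SlotDemo.SlotOwner(typeof(FirstDerived)) gives "Base", because the override reuses the slot introduced by Base, while SlotDemo.SlotOwner(typeof(SecondDerived)) gives "SecondDerived", because the new method starts its own chain.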
According to the Wikipedia definition:
Type polymorphism in object-oriented
programming is the ability of one
type, A, to appear as and be used like
another type, B
Later on the same page:
Method overriding is where a subclass
replaces the implementation of one or
more of its parent's methods. Neither
method overloading nor method
overriding are by themselves
implementations of polymorphism.
The fact that SecondDerived does not provide an override for the PrintName does not affect its ability to appear and be used as Base. The new method implementation it provides will not be used anywhere an instance of SecondDerived is treated as an instance of the Base; it will be used only when that instance is explicitly used as an instance of SecondDerived.
Moreover, SecondDerived can actually explicitly implement Base.PrintName in addition to the new hiding implementation, thus providing its own implementation that will be used when it is treated as Base. (Though Base has to be an interface, or has to derive from one, to allow this.)
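A sketch of what that could look like, assuming a hypothetical interface IPrintable that Base implements. The interface, the string return values and the returned text are not from the original question.

```csharp
// Hypothetical interface, introduced only for this sketch.
interface IPrintable
{
    string PrintName();
}

class Base : IPrintable
{
    public virtual string PrintName() => "BaseClass";
}

class SecondDerived : Base, IPrintable
{
    // Hides Base.PrintName; used when the compile-time type is SecondDerived.
    public new string PrintName() => "SecondDerived";

    // Explicit re-implementation; used when called through IPrintable.
    string IPrintable.PrintName() => "SecondDerived (via interface)";
}
```

With these definitions a call through a SecondDerived reference returns "SecondDerived", a call through a Base reference to the same object returns "BaseClass", and a call through IPrintable returns "SecondDerived (via interface)".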
