Why does this polymorphic C# code print what it does? - c#

I was recently given the following piece of code as a sort-of puzzle to help understand Polymorphism and Inheritance in OOP - C#.
// No compiling!
public class A
{
public virtual string GetName()
{
return "A";
}
}
public class B:A
{
public override string GetName()
{
return "B";
}
}
public class C:B
{
public new string GetName()
{
return "C";
}
}
void Main()
{
A instance = new C();
Console.WriteLine(instance.GetName());
}
// No compiling!
Now, after a long, long chat with the other developer who presented the puzzle, I know what the output is, but I won't spoil it for you. The only issue I'm really having is how we get to that output, how the code steps through, what's inheriting what, etc.
I thought C would be returned as that seems to be the class that is defined. Then I went through my head as to whether B would be returned because C inherits B - but B also inherits A (which is where I got confused!).
Question:
Could anyone explain how polymorphism and inheritance play their part in retrieving the output, eventually displayed on screen?

The correct way to think about this is to imagine that every class requires its objects to have a certain number of "slots"; those slots are filled with methods. The question "what method actually gets called?" requires you to figure out two things:
What are the contents of each slot?
Which slot is called?
Let's start by considering the slots. There are two slots. All instances of A are required to have a slot we'll call GetNameSlotA. All instances of C are required to have a slot we'll call GetNameSlotC. That's what the "new" means on the declaration in C -- it means "I want a new slot". Compare that with the "override" on the declaration in B, which means "I do not want a new slot, I want to re-use GetNameSlotA".
Of course, C inherits from A, so C must also have a slot GetNameSlotA. Therefore, instances of C have two slots -- GetNameSlotA, and GetNameSlotC. Instances of A or B which are not C have one slot, GetNameSlotA.
Now, what goes into those two slots when you create a new C? There are three methods, which we'll call GetNameA, GetNameB, and GetNameC.
The declaration of A says "put GetNameA in GetNameSlotA". A is a superclass of C, so A's rule applies to C.
The declaration of B says "put GetNameB in GetNameSlotA". B is a superclass of C, so B's rule applies to instances of C. Now we have a conflict between A and B. B is the more derived type, so it wins -- B's rule overrides A's rule. Hence the word "override" in the declaration.
The declaration of C says "put GetNameC in GetNameSlotC".
Therefore, your new C will have two slots. GetNameSlotA will contain GetNameB and GetNameSlotC will contain GetNameC.
We've now determined what methods are in what slots, so we've answered our first question.
Now we have to answer the second question. What slot is called?
Think about it like you're the compiler. You have a variable. All you know about it is that it is of type A. You're asked to resolve a method call on that variable. You look at the slots available on an A, and the only slot you can find that matches is GetNameSlotA. You don't know about GetNameSlotC, because you only have a variable of type A; why would you look for slots that only apply to C?
Therefore this is a call to whatever is in GetNameSlotA. We've already determined that at runtime, GetNameB will be in that slot. Therefore, this is a call to GetNameB.
The key takeaway here is that in C# overload resolution chooses a slot and generates a call to whatever happens to be in that slot.
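The slot model above can be checked with a minimal sketch that reuses the A, B and C classes from the question. The SlotDemo class and its Describe helper are hypothetical names made up for this illustration; they are not part of the original puzzle.

```csharp
using System;

public class A { public virtual string GetName() { return "A"; } }
public class B : A { public override string GetName() { return "B"; } }
public class C : B { public new string GetName() { return "C"; } }

// Hypothetical helper class, only here to exercise the three references.
public static class SlotDemo
{
    // Calls GetName through references of three different compile-time types
    // that all point at the same C object.
    public static string Describe()
    {
        C asC = new C();
        B asB = asC;
        A asA = asC;
        // asA and asB only know about GetNameSlotA, which holds B's method.
        // asC finds GetNameSlotC, which holds C's method.
        return asA.GetName() + asB.GetName() + asC.GetName();
    }

    public static void Main()
    {
        Console.WriteLine(Describe()); // prints "BBC"
    }
}
```

Only the reference typed as C reaches the method declared with "new"; the other two go through the slot that B filled.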

It should return "B" because B.GetName() is held in the virtual table slot for the A.GetName() function. C.GetName() is not a real override: it is bound at compile time and doesn't replace the virtual table entry, so you can't reach it through a reference of type A.

Easy, you only have to keep the inheritance tree in mind.
In your code, you hold a reference of type 'A' that points to an instance of type 'C'. Now, to resolve the exact method address for the virtual 'GetName()' method, the runtime goes up the inheritance hierarchy from the object's actual type and looks for the most recent override (note that only 'override' is an override; 'new' is something completely different...).
That's in short what happens. The 'new' keyword in type 'C' only plays a role if you call the method through a reference of type 'C'; the compiler then ignores the inherited methods altogether. Strictly speaking, this has nothing to do with polymorphism - you can see that from the fact that it makes no difference whether you hide a virtual or a non-virtual method with the 'new' keyword...
'New' in class 'C' means exactly that: if you call 'GetName()' through a reference of this (exact) type, then forget everything else and use THIS method. 'Virtual', in contrast, means: go up the inheritance tree from the object's actual type until you find the most derived override of this method, no matter what the type of the reference is.

OK, the post is a bit old, but it's an excellent question and an excellent answer, so I just wanted to add my thoughts.
Consider the following example, which is the same as before, except for the main function:
// No compiling!
public class A
{
public virtual string GetName()
{
return "A";
}
}
public class B:A
{
public override string GetName()
{
return "B";
}
}
public class C:B
{
public new string GetName()
{
return "C";
}
}
void Main()
{
Console.Write ( "Type a or c: " );
string input = Console.ReadLine();
A instance = null;
if ( input == "a" ) instance = new A();
else if ( input == "c" ) instance = new C();
Console.WriteLine( instance.GetName() );
}
// No compiling!
Now it's really obvious that the function call cannot be bound to a specific function at compile time. Something must be compiled however, and that information can only depend on the type of the reference. So, it would be impossible to execute the GetName function of class C with any reference other than one of type C.
P.S. Maybe I should've used the term method instead of function, but as Shakespeare said: A function by any other name is still a function :)

Actually, I think it should display C, because new operator just hides all ancestor methods with the same name. So, with methods of A and B hidden, only C remains visible.
http://msdn.microsoft.com/en-us/library/51y09td4%28VS.71%29.aspx#vclrfnew_newmodifier

Related

Why is method overloading not working in this C# program?

namespace test
{
class Program
{
static void Main(string[] args)
{
Derived obj = new Derived();
int i = 10;
obj.Foo(i);
Console.ReadLine();
}
}
class Base
{
public virtual void Foo(int i)
{
Console.WriteLine("Base:Foo()");
}
}
class Derived:Base
{
public override void Foo(int i)
{
Console.WriteLine("Foo(int)");
}
public void Foo(object i)
{
Console.WriteLine("Foo(object)");
}
}
}
According to me, the output of the program should be Foo(int), but the actual output is Foo(object). Please help me understand the difference in output.
Good question, I can reproduce your results. If one takes a look at the C# specifications one will find the following snippets:
7.5.3 Overload resolution
For example, the set of candidates for a method invocation does not
include methods marked override (§7.4), and methods in a base class
are not candidates if any method in a derived class is applicable
(§7.6.5.1).
7.4 Member Lookup
Otherwise, the set consists of all accessible (§3.5) members named N
in T, including inherited members and the accessible members named N
in object. If T is a constructed type, the set of members is obtained
by substituting type arguments as described in §10.3.2. Members that
include an override modifier are excluded from the set.
7.6.5.1 Method invocations
The set of candidate methods is reduced to contain only methods from
the most derived types: For each method C.F in the set, where C is the
type in which the method F is declared, all methods declared in a base
type of C are removed from the set. Furthermore, if C is a class type
other than object, all methods declared in an interface type are
removed from the set.
Sounds a bit complicated? Even the C# designers seem to think so and put in the 'helpful' note:
7.6.5.1 Method invocations
The intuitive effect of the resolution rules described above is as
follows: To locate the particular method invoked by a method
invocation, start with the type indicated by the method invocation and
proceed up the inheritance chain until at least one applicable,
accessible, non-override method declaration is found. Then perform
type inference and overload resolution on the set of applicable,
accessible, non-override methods declared in that type and invoke the
method thus selected. If no method was found, try instead to process
the invocation as an extension method invocation.
If we take a look at your derived class, we see two possible methods for C# to use:
A) public override void Foo(int i)
B) public void Foo(object i)
Let's use that last checklist!
Applicability - Both A and B are applicable (both are void, both are named 'Foo' and both can accept an integer value).
Accessibility - Both A and B are accessible (public).
Non-override - Only B is not marked override.
But wait you might say! A is more specific than B!
Correct, but that consideration is only made after we've disregarded option A. As Eric Lippert (one of the designers) puts it: "Closer is always better than farther away." (Thanks Anthony Pegram)
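A small sketch of the rule, under one assumption: the methods return a string instead of writing to the console (unlike the question's code) so the chosen overload is easy to inspect. The OverloadDemo class is a hypothetical name for this illustration.

```csharp
using System;

public class Base
{
    public virtual string Foo(int i) { return "Base.Foo(int)"; }
}

public class Derived : Base
{
    public override string Foo(int i) { return "Derived.Foo(int)"; }
    public string Foo(object o) { return "Derived.Foo(object)"; }
}

// Hypothetical demo class, not part of the question.
public static class OverloadDemo
{
    public static void Main()
    {
        Derived obj = new Derived();

        // Lookup starts at Derived: the override is skipped, Foo(object) is
        // applicable, so the search never moves up to Base.
        Console.WriteLine(obj.Foo(10));          // Derived.Foo(object)

        // Lookup starts at Base: Foo(int) is chosen, then virtual dispatch
        // runs the override declared in Derived.
        Console.WriteLine(((Base)obj).Foo(10));  // Derived.Foo(int)
    }
}
```

Casting the receiver to Base is one way to reach the int overload without changing either class.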
Addendum
There is always the 'new' keyword:
class Derived : Base
{
public new void Foo(int i)
{
Console.WriteLine("Foo(int)");
}
public void Foo(object i)
{
Console.WriteLine("Foo(object)");
}
}
Though the specifics of that are best left for another question!
The simple datatype int descends from object. You are overriding the function and also overloading the parameter list. Since the function name is the same with a different signature the compiler allows this. For simple objects, I imagine one copy of the parameter signature in the most basic form is stored in the method table.

Is it possible to restrict the values of optional parameters in C#?

C# allows the use of optional parameters: one can specify the value in case the parameter is omitted in a call and the compiler then specifies the value itself.
Example:
public interface IFoo {
void SomeMethod (int para = 0);
}
This idea is useful but a problem is that one can define several "default values" on different levels of the class hierarchy. Example:
public class SubFoo : IFoo {
public void SomeMethod (int para = 1) {
//do something
}
}
If one later calls:
SubFoo sf = new SubFoo ();
sf.SomeMethod ();
IFoo f = sf;
f.SomeMethod ();
The result is that the first call is made with para equal to 1 and the second with para equal to 0 (the interface's default). This makes sense since the compiler inserts the default value at the call site based on the static type of the variable: in the first case the variable is a SubFoo, so the default value is 1.
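To make the effect visible, here is a minimal sketch in which SomeMethod returns the value it received. That return type is an assumption made for the illustration (the question's method returns void), and DefaultsDemo is a hypothetical class name.

```csharp
using System;

public interface IFoo
{
    int SomeMethod(int para = 0);
}

public class SubFoo : IFoo
{
    // Returns the argument so the default that was applied can be observed.
    public int SomeMethod(int para = 1) { return para; }
}

// Hypothetical demo class, not part of the question.
public static class DefaultsDemo
{
    public static void Main()
    {
        SubFoo sf = new SubFoo();
        IFoo f = sf;

        Console.WriteLine(sf.SomeMethod()); // prints 1 (default taken from SubFoo)
        Console.WriteLine(f.SomeMethod());  // prints 0 (default taken from IFoo)
    }
}
```

Both calls run the same method on the same object; only the compile-time type of the variable differs.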
It is of course up to the programmer to maintain consistency, but such inconsistent situations can easily occur when a programmer changes his/her mind in the middle of the process and forgets to modify all default values.
The problem is that the compiler doesn't warn that different default values are in use, even though this could be checked by moving up the class hierarchy. Furthermore, some people might mimic default parameters with:
public class SubFoo2 {
public virtual void SomeMethod () {
SomeMethod(1);
}
public void SomeMethod (int para) {
//do something
}
}
Which allows dynamic binding and thus overriding consistently. It thus requires one to be very careful with how default values are "implemented".
Are there ways to enforce a check (with compiler flags, for instance) that the default values are consistent? If not, it would be nice to have at least a warning that something is not really consistent.
Not necessarily a compile-time solution, but you can write a unit test for that (I suspect that you take unit testing seriously and run the tests frequently if you ask this kind of question). The idea is to create an assertion method like AssertThatDefaultParametersAreEqual(Type forType): find all non-abstract classes that inherit from forType (using reflection), then iterate over all methods that define default parameters:
MethodInfo[] methodInfo = Type.GetType(classType).GetMethods(BindingFlags.Public | BindingFlags.Instance);
Group them by MethodInfo.Name and check that, within each group, the corresponding parameters with default values (obtained via MethodInfo.GetParameters().Where(x => x.IsOptional)) have equal ParameterInfo.DefaultValue properties.
edit: by the way, this might not work in Mono, because compilers aren't obligated to emit, for instance, the Optional flag.
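A rough sketch of such a check for a single interface/implementation pair. It is a variation on the idea above, not the answer's exact approach: instead of grouping by name it pairs methods through Type.GetInterfaceMap. DefaultChecker, DefaultsMatch and GoodFoo are hypothetical names, and the sketch assumes impl really implements iface (GetInterfaceMap throws otherwise).

```csharp
using System;
using System.Reflection;

public interface IFoo { void SomeMethod(int para = 0); }
public class SubFoo : IFoo { public void SomeMethod(int para = 1) { } }   // inconsistent default
public class GoodFoo : IFoo { public void SomeMethod(int para = 0) { } }  // consistent default

// Hypothetical helper for a unit test.
public static class DefaultChecker
{
    // Returns true when every parameter of every interface method has the
    // same optional-ness and default value on the implementing method.
    public static bool DefaultsMatch(Type iface, Type impl)
    {
        InterfaceMapping map = impl.GetInterfaceMap(iface);
        for (int m = 0; m < map.InterfaceMethods.Length; m++)
        {
            ParameterInfo[] declared = map.InterfaceMethods[m].GetParameters();
            ParameterInfo[] actual = map.TargetMethods[m].GetParameters();
            for (int p = 0; p < declared.Length; p++)
            {
                if (declared[p].IsOptional != actual[p].IsOptional) return false;
                if (declared[p].IsOptional &&
                    !object.Equals(declared[p].DefaultValue, actual[p].DefaultValue))
                    return false;
            }
        }
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(DefaultsMatch(typeof(IFoo), typeof(SubFoo)));  // False
        Console.WriteLine(DefaultsMatch(typeof(IFoo), typeof(GoodFoo))); // True
    }
}
```

In a real test suite the pair of types would come from scanning the assembly, and the boolean would feed the test framework's assertion.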
If you want to have a compile-time indication that a method is changing the default value of an optional argument, you're going to need to use some sort of 3rd party code analysis tool, as C# itself doesn't provide any means of imposing such a restriction, or any warnings when it's done.
As a workaround, one option is to avoid using optional parameter values and instead use multiple overloads. Since you have an interface here, that would mean using an extension method so that the implementation of the overload with a default value is still defined in the general case:
public interface IFoo
{
void SomeMethod(int para);
}
public static class FooExtensions
{
public static void SomeMethod(this IFoo foo)
{
foo.SomeMethod(0);
}
}
So while this approach does technically still allow someone to create an extension (or instance) method named SomeMethod and accepting no int argument, it would mean that someone would really need to go out of their way to actively change the "default value". It doesn't require implementations of the interface to supply the default value, which risks them unintentionally providing the wrong default value.
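Define a constant such as const int DefaultPara = 1; in one shared place and then use that instead of hard-coding numerical values (the class name FooDefaults below is only illustrative).
public static class FooDefaults
{
public const int DefaultPara = 1;
}
public interface IFoo
{
void SomeMethod (int para = FooDefaults.DefaultPara);
}
public class SubFoo : IFoo {
public void SomeMethod (int para = FooDefaults.DefaultPara) {
//do something
}
}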

Field initializer accessing `this`: invalid in C#, valid in Java? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
First, an introduction:
This code:
class C
{
int i = 5;
byte[] s = new byte[i];
}
fails to compile with the following error:
A field initializer cannot reference the nonstatic field, method, or property `C.i'
Resharper says something similar: Cannot access non-static field i in static context.
This is in line with what the C# spec says -- that a field initializer can't access the instance currently being created (this) or, by extension, any of the instance fields:
A variable initializer for an instance field cannot reference the
instance being created. Thus, it is a compile-time error to reference
this in a variable initializer, as it is a compile-time error for a
variable initializer to reference any instance member through a
simple-name.
However, this works just fine in Java:
class C {
int i = 5;
byte s[] = new byte[i]; //no errors here
}
Still with me? Ok, here's the question. Err, questions.
In a hypothetical world where this would be valid in C#, I'm wondering: would it even be possible? If so, what would be the pros and cons that it would add to the table?
Also, since it's really supported by Java, do the same pros/cons hold for Java? Or is there a fundamental difference in the way type initializers work in the two languages?
In short, the ability to access the receiver before the constructor body runs is a feature of marginal benefits that makes it easier to write buggy programs. The C# language designers therefore disabled it entirely. If you need to use the receiver then put that logic in the constructor body.
As for why the feature is legal in Java, you'll have to ask a Java designer.
In C#, field initializers are merely convenience semantics for the developer. The compiler moves all field initializers into the body of the constructor ABOVE where the call is made to the base constructor. So fields are initialized going up the ancestor chain, and the class is initialized from the base down.
Static references are ok because they are initialized before anything else.
By no means is this an authoritative answer, but let me make an educated guess.
There is a fundamental difference, and I think the answers to the other questions are related to this difference.
It lies in order-of-type-initialization, especially in the context of inheritance.
So, how does instance initialization work?
In C#:
all instance field initializers run first, "up" the inheritance chain, from most derived to base class.
then the ctors run, "down" the chain, from base to derived.
The possibility of ctors calling each other or (explicitly) calling ctors of base classes doesn't change the situation, so I'll leave it out.
What basically happens is that this runs for each class in the chain, starting with the most derived:
Derived.initialize(){
derivedInstance.field1 = field1Initializer();
[...]
Base.Initialize();
Derived.Ctor();
}
A simple example shows this:
void Main()
{
new C();
}
class C: B {
public int c = GetInt("C.c");
public C(){
WriteLine("C.ctor");
}
}
class B {
public int b = GetInt("B.b");
public static int GetInt(string _var){
WriteLine(_var);
return 6;
}
public B(){
WriteLine("B.ctor");
}
public static void WriteLine(string s){
Console.WriteLine(s);
}
}
Output:
C.c
B.b
B.ctor
C.ctor
That means that if accessing fields in a field initializer was valid, I could do this disaster:
class C: B {
int c = b; //b is a field inherited from the base class, and NOT YET INITIALIZED!
[...]
}
In Java:
Long, interesting article about type initialization here. To summarize:
It's a bit more complicated, because besides the notion of instance field initializers, there's the notion of an (optional) instance initializer, but here's the gist of it:
Everything runs down the inheritance chain.
the instance initializer blocks and the field initializers of the base class run, in the order they appear in the source
the ctor(s) of the base class run
repeat above steps for the next class down the inheritance chain.
repeat previous step until reaching the most derived class.
Here's the proof: (or run it yourself online)
class Main
{
public static void main (String[] args) throws java.lang.Exception
{
new C();
}
}
class C extends B {
{
WriteLine("init C");
}
int c = GetInt("C.c");
public C(){
WriteLine("C.ctor");
}
}
class B {
{
WriteLine("init B");
}
int b = GetInt("B.b");
public static int GetInt(String _var){
WriteLine(_var);
return 6;
}
public B(){
WriteLine("B.ctor");
}
public static void WriteLine(String s){
System.out.println(s);
}
}
Output:
init B
B.b
B.ctor
init C
C.c
C.ctor
What this means is, by the time a field initializer runs, all inherited fields are already initialized (by initializer OR ctor in base class), so it's safe enough to allow this behaviour:
class C extends B {
int c = b; //b is inherited from the base class, and it's already initialized!
[...]
}
In Java, like in C#, the field initializers are run in the order of declaration.
The Java compiler even goes through the effort of checking that the field initializers aren't called out-of-order* :
class C {
int a = b; //compiler error: illegal forward reference
int b = 5;
}
* As an aside, you can access fields out-of-order if the initializer calls an instance method to do so:
class C {
public int a = useB(); //after initializer completes, a == 0
int b = 5;
int useB(){
return b; //use b regardless if it was initialized or not.
}
}
It is because field initialisers are moved into the constructor by the compiler (unless static), so you would need to be explicit in your constructor like this:
class C
{
int i = 5;
byte[] s;
public C()
{
s = new byte[i];
}
}
This is a bit of a non-answer, but I like to think of anything in the body of a class as being sequence-independent. It's not supposed to be sequential code that needs to be evaluated in a particular manner--it's just default state for the class. If you use code like that, you're expecting i to be evaluated before s.
Anyway, you can just make i a const (as it should be).
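A minimal sketch of that suggestion. A const is implicitly static, so the initializer no longer touches the instance being created. Making s public and readonly, and the ConstDemo class, are assumptions added only so the result can be observed.

```csharp
using System;

public class C
{
    const int i = 5;                         // compile-time constant, implicitly static
    public readonly byte[] s = new byte[i];  // allowed: no instance member is referenced
}

// Hypothetical demo class, not part of the question.
public static class ConstDemo
{
    public static void Main()
    {
        Console.WriteLine(new C().s.Length); // prints 5
    }
}
```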

class casting in c#

Here is the c# code
class A {
public int Foo(){ return 5;}
public virtual int Bar(){return 5;}
}
class B : A{
public new int Foo() { return 1;} //shadow
public override int Bar() {return 1;} //override
}
Output of
Console.WriteLine(((A)clB).Foo()); // output 5 <<<--
Console.WriteLine(((A)clB).Bar()); // output 1
How do we get this output? Can anyone explain the class casting process here?
Update:
And how does this show the difference between shadowing and overriding?
I'll assume that
var clB = new B();
The difference between the Foo and Bar methods is that while Bar uses inheritance and polymorphism to decide what implementation to call, the Foo method hides its original implementation.
In a word, A.Foo() and B.Foo() are completely unrelated; they just happen to have the same name. When the compiler sees that a variable of type A invokes Foo, it goes in and executes A.Foo(), since the method is not virtual and so cannot be overridden. Similarly, when it sees a variable of type B invoking Foo, it executes B.Foo(), regardless of the actual type of the instance that is contained in the variable.
On the other hand, the Bar method is defined as virtual, and the inheriting classes can (and are expected to) override its implementation. So whenever a call is made to Bar, regardless of whether it is from a variable that is declared as A or B, the method that is actually called must be found as the "latest" implementation in the hierarchy of the calling object itself, with no impact from the type of variable that was used to refer to the object.
In the class B, you introduce a new method Foo with the same name and signature as the method already there (inherited from A). So B has two methods with the same name. That's not something you would do if you could avoid it.
Which of the two methods Foo that gets called, depends on the compile-time type of the variable or expression (of type A or B) used.
In contrast the method Bar is virtual. There is only one method Bar in B. No matter what the compile-time type of the expression is, it is always the "correct" override that gets called.
Writing
((A)clB).Foo()
is like saying "Treat clB as if it were an A (if you can) and give me the result of Foo()". Since A has a non-virtual Foo method, it executes A.Foo. Since B's Foo method is a "new" method, it is not used in this instance.
Writing
((A)clB).Bar()
is similar - "Treat clB as if it were an A (if you can) and give me the result of Bar()". Now A has a virtual Bar method, meaning it can be overridden in derived classes. Since the object is really a B, which has an override for Bar(), B.Bar() is called instead.
var clB = new B();
//Uses B's Foo method
Console.WriteLine(clB.Foo()); // output 1
//Uses A's Foo method since new was used to hide (not override) the method
Console.WriteLine(((A)clB).Foo()); // output 5
//Uses B's Bar Method
Console.WriteLine(clB.Bar()); // output 1
//Uses B's Bar Method since A's Bar method was virtual
Console.WriteLine(((A)clB).Bar()); // output 1

Pattern for calling method which takes the most derived form of the passed object

I was surprised when I ran into this situation and realized I wasn't sure what the best solution was.
Say I have the following three types:
class A { }
class B : A { }
class C : A { }
And the following three methods:
DoSomething(A a){ }
DoSomething(B b){ }
DoSomething(C c){ }
I have a List<A> which contains objects of type B and C
I would like to do this:
foreach(A a in list) { DoSomething(a) }
and have it call the method which matches most closely to the underlying type,
but of course this will always call DoSomething(A a)
I'd prefer not to have a bunch of type checking to get the right method call, and I don't want to add anything to the classes A, B, or C.
Is it possible?
This is a rather well-known issue with virtual dispatch in statically typed languages: it only handles one parameter (this) "virtually"; for all other parameters, the method call is bound using the static type of the argument. Since your list is a list of A, the code is only ever going to call the A overload.
You would need multiple dispatch to achieve the stated goal. The language does not provide this out of the box unless you switch to dynamic, so you will have to either make the switch or implement it yourself. There are many tradeoffs to consider when making this decision (and also when deciding how to implement multiple dispatch if needed), so don't do this lightly.
You'll pay a cost in performance, but one simple way of accomplishing this is by using the dynamic run-time binder. Simply cast the argument to dynamic:
foreach(A a in list) { DoSomething((dynamic)a); }
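A self-contained sketch comparing the two kinds of binding. The methods return a string here (an assumption, since the question's methods have no bodies) and the Dispatcher class and Run helper are hypothetical names. It also assumes the project can use the dynamic binder, which is the case for typical modern .NET projects.

```csharp
using System;
using System.Collections.Generic;

public class A { }
public class B : A { }
public class C : A { }

// Hypothetical holder for the three overloads from the question.
public static class Dispatcher
{
    public static string DoSomething(A a) { return "A"; }
    public static string DoSomething(B b) { return "B"; }
    public static string DoSomething(C c) { return "C"; }

    // Calls DoSomething for every item and joins the overload names.
    public static string Run(List<A> list, bool useDynamic)
    {
        var names = new List<string>();
        foreach (A a in list)
        {
            string name;
            if (useDynamic)
                name = DoSomething((dynamic)a); // overload chosen at run time
            else
                name = DoSomething(a);          // overload chosen at compile time
            names.Add(name);
        }
        return string.Join(",", names);
    }

    public static void Main()
    {
        var list = new List<A> { new B(), new C(), new A() };
        Console.WriteLine(Run(list, false)); // prints "A,A,A"
        Console.WriteLine(Run(list, true));  // prints "B,C,A"
    }
}
```

Note that a null element would make the dynamic call ambiguous at run time, so a real implementation would need to decide how to handle that case.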
If you're willing to use dynamic keyword, I guess something like
DoSomething((dynamic)a);
will do the job for you.
Otherwise, with static types, you could say
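void DoSomething(A a)
{
var aAsB = a as B;
if (aAsB != null)
{
DoSomething(aAsB);
return; // handled by the B overload, skip the general case
}
var aAsC = a as C;
if (aAsC != null)
{
DoSomething(aAsC);
return; // handled by the C overload, skip the general case
}
// general A case here
}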
but that's maybe what you call a bunch of type checking.
