As discovered in C# 3.5, the following would not be possible due to type erasure:
int foo<T>(T bar)
{
return bar.Length; // will not compile unless I do something like where T : string
}
foo("baz");
I believe the reason this doesn't work in C# and Java is a concept called type erasure; see http://en.wikipedia.org/wiki/Type_erasure.
Having read about the dynamic keyword, I wrote the following:
int foo<T>(T bar)
{
dynamic test = bar;
return test.Length;
}
foo("baz"); // will compile and return 3
So, as far as I understand, dynamic will bypass compile time checking but if the type has been erased, surely it would still be unable to resolve the symbol unless it goes deeper and uses some kind of reflection?
Is using the dynamic keyword in this way bad practice and does this make generics a little more powerful?
Dynamics and generics are two completely different notions. If you want compile-time safety and speed, use strong typing (generics, or just standard OOP techniques such as inheritance or composition). If you do not know the type at compile time you could use dynamics, but they will be slower because they use runtime invocation, and less safe because if the type doesn't implement the method you are attempting to invoke, you will get a runtime error.
The 2 notions are not interchangeable and depending on your specific requirements you could use one or the other.
Of course having the following generic constraint is completely useless because string is a sealed type and cannot be used as a generic constraint:
int foo<T>(T bar) where T : string
{
return bar.Length;
}
You'd rather have this:
int foo(string bar)
{
return bar.Length;
}
"I believe the reason this doesn't work in C# and Java is a concept called type erasure; see http://en.wikipedia.org/wiki/Type_erasure."
No, this isn't because of type erasure. Anyway there is no type erasure in C# (unlike Java): a distinct type is constructed by the runtime for each different set of type arguments, there is no loss of information.
The reason why it doesn't work is that the compiler knows nothing about T, so it can only assume that T inherits from object, so only the members of object are available. You can, however, provide more information to the compiler by adding a constraint on T. For instance, if you have an interface IBar with a Length property, you can add a constraint like this:
int foo<T>(T bar) where T : IBar
{
return bar.Length;
}
But if you want to be able to pass either an array or a string, it won't work, because the Length property isn't declared in any interface implemented by both String and Array...
No, C# does not have type erasure; only Java does.
But if you specify only T, without any constraint, you cannot use obj.Length, because T can be virtually anything.
foo(new Bar());
The above would resolve T to the Bar class, and thus the Length property might not be available.
You can only use methods on T when you ensure that T really has those methods. (This is done with where constraints.)
With dynamics, you lose compile-time checking, and I suggest that you do not use them for hacking around generics.
In this case you would not benefit from dynamics in any way. You just delay the error, as an exception is thrown if the dynamic object does not contain a Length property. For accessing the Length property in a generic method, I can't see any reason not to constrain T to types that definitely have this property.
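To make the "you just delay the error" point concrete, here is a minimal sketch. The class name DynamicLengthDemo is made up for illustration; Foo is the same method as in the question. It assumes a project that references the C# runtime binder (Microsoft.CSharp), which current .NET project templates do by default.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public static class DynamicLengthDemo
{
    // Same shape as foo<T> in the question: the lookup of Length
    // is deferred to run time by going through dynamic.
    public static int Foo<T>(T bar)
    {
        dynamic test = bar;
        return test.Length;
    }

    public static void Main()
    {
        Console.WriteLine(Foo("baz"));          // 3
        Console.WriteLine(Foo(new[] { 1, 2 })); // 2

        try
        {
            Foo(42); // compiles fine, but int has no Length member
        }
        catch (RuntimeBinderException ex)
        {
            // The mistake only surfaces here, at run time.
            Console.WriteLine("Failed at run time: " + ex.Message);
        }
    }
}
```

With a plain string parameter, or a constraint on T, the call Foo(42) would have been rejected by the compiler instead.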
"Dynamics are a powerful new tool that make interop with dynamic languages as well as COM easier, and can be used to replace much turgid reflective code. They can be used to tell the compiler to execute operations on an object, the checking of which is deferred to runtime.
The great danger lies in the use of dynamic objects in inappropriate contexts, such as in statically typed systems, or worse, in place of an interface/base class in a properly typed system."
Quoted from article.
Thought I'd weigh in on this one, because no one clarified how generics work "under the hood". The notion of T being treated as an object is mentioned above and is quite clear. What is not talked about is that when we compile C#, VB, or any other supported language, the output is Intermediate Language (IL), which is more akin to an assembly language or to Java bytecode, and at this level there are no generics! So the new question is: how do you support generics in IL? For each type that accesses the generic, a non-generic version of the code is generated, which substitutes the generic parameter(s), such as the ubiquitous T, with the actual type it was called with. So if you only use one instantiation of a generic, such as List<>, then that's what the IL will contain. But if you use many instantiations of a generic, then many specific implementations are created, and calls to the original code are substituted with calls to the specific non-generic version. To be clear, a MyList<T> used as new MyList<string>() will be substituted in IL with something like MyList_string().
That's my (limited) understanding of what's going on. The point being, the benefit of this approach is that the heavy lifting is done at compile time, and at runtime there's no degradation in performance, which is again why generics are probably so loved and used anywhere and everywhere by .NET developers.
On the downside? If a method or type is used many times, then the output assembly (EXE or DLL) will get larger and larger, depending on the number of different implementations of the same code. Given the average size of DLL output, I doubt you'll ever consider generics to be a problem.
Related
I know that Java implements parametric polymorphism (Generics) with erasure. I understand what erasure is.
I know that C# implements parametric polymorphism with reification. I know that can make you write
public void dosomething(List<String> input) {}
public void dosomething(List<Int> input) {}
or that you can know at runtime what the type parameter of some parameterised type is, but I don't understand what it is.
What is a reified type?
What is a reified value?
What happens when a type/value is reified?
Reification is the process of taking an abstract thing and creating a concrete thing.
The term reification in C# generics refers to the process by which a generic type definition and one or more generic type arguments (the abstract thing) are combined to create a new generic type (the concrete thing).
To phrase it differently, it is the process of taking the definition of List<T> and int and producing a concrete List<int> type.
To understand it further, compare the following approaches:
In Java generics, a generic type definition is transformed to essentially one concrete generic type shared across all allowed type argument combinations. Thus, multiple (source code level) types are mapped to one (binary level) type - but as a result, information about the type arguments of an instance is discarded in that instance (type erasure).
As a side effect of this implementation technique, the only generic type arguments that are natively allowed are those types that can share the binary code of their concrete type; which means those types whose storage locations have interchangeable representations; which means reference types. Using value types as generic type arguments requires boxing them (placing them in a simple reference type wrapper).
No code is duplicated in order to implement generics this way.
Type information that could have been available at runtime (using reflection) is lost. This, in turn, means that specialization of a generic type (the ability to use specialized source code for any particular generic argument combination) is very restricted.
This mechanism doesn't require support from the runtime environment.
There are a few workarounds to retain type information that a Java program or a JVM-based language can use.
In C# generics, the generic type definition is maintained in memory at runtime. Whenever a new concrete type is required, the runtime environment combines the generic type definition and the type arguments and creates the new type (reification). So we get a new type for each combination of the type arguments, at runtime.
This implementation technique allows any kind of type argument combination to be instantiated. Using value types as generic type arguments does not cause boxing, since these types get their own implementation. (Boxing still exists in C#, of course - but it happens in other scenarios, not this one.)
Code duplication could be an issue - but in practice it isn't, because sufficiently smart implementations (this includes Microsoft .NET and Mono) can share code for some instantiations.
Type information is maintained, which allows specialization to an extent, by examining type arguments using reflection. However, the degree of specialization is limited, because a generic type definition is compiled before any reification happens (this is done by compiling the definition against the constraints on the type parameters - thus, the compiler has to be able to "understand" the definition even in the absence of specific type arguments).
This implementation technique depends heavily on runtime support and JIT-compilation (which is why you often hear that C# generics have some limitations on platforms like iOS, where dynamic code generation is restricted).
In the context of C# generics, reification is done for you by the runtime environment. However, if you want to more intuitively understand the difference between a generic type definition and a concrete generic type, you can always perform a reification on your own, using the System.Type class (even if the particular generic type argument combination you're instantiating didn't appear in your source code directly).
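As a rough sketch of performing that reification by hand with System.Type: the class ManualReification and its Reify helper are names invented for this example, while typeof(List<>), Type.MakeGenericType and Activator.CreateInstance are the standard reflection APIs.

```csharp
using System;
using System.Collections.Generic;

public static class ManualReification
{
    // Combines a generic type definition with type arguments,
    // the same step the runtime normally performs for you.
    public static Type Reify(Type definition, params Type[] typeArguments)
    {
        return definition.MakeGenericType(typeArguments);
    }

    public static void Main()
    {
        Type definition = typeof(List<>);               // the "abstract thing"
        Type concrete = Reify(definition, typeof(int)); // the "concrete thing"

        Console.WriteLine(definition.IsGenericTypeDefinition); // True
        Console.WriteLine(concrete == typeof(List<int>));      // True

        // The reified type is a real type that can be instantiated.
        object list = Activator.CreateInstance(concrete);
        Console.WriteLine(list is List<int>);                  // True
    }
}
```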
In C++ templates, the template definition is maintained in memory at compile time. Whenever a new instantiation of a template type is required in the source code, the compiler combines the template definition and the template arguments and creates the new type. So we get a unique type for each combination of the template arguments, at compile time.
This implementation technique allows any kind of type argument combination to be instantiated.
This is known to duplicate binary code but a sufficiently smart tool-chain could still detect this and share code for some instantiations.
The template definition itself is not "compiled" - only its concrete instantiations are actually compiled. This places fewer constraints on the compiler and allows a greater degree of template specialization.
Since template instantiations are performed at compile time, no runtime support is needed here either.
This process is lately referred to as monomorphization, especially in the Rust community. The word is used in contrast to parametric polymorphism, which is the name of the concept that generics come from.
Reification means generally (outside of computer science) "to make something real".
In programming, something is reified if we're able to access information about it in the language itself.
For two completely non-generics-related examples of something C# does and doesn't have reified, let's take methods and memory access.
OO languages generally have methods, (and many that don't have functions that are similar though not bound to a class). As such you can define a method in such a language, call it, perhaps override it, and so on. Not all such languages let you actually deal with the method itself as data to a program. C# (and really, .NET rather than C#) does let you make use of MethodInfo objects representing the methods, so in C# methods are reified. Methods in C# are "first class objects".
All practical languages have some means to access the memory of a computer. In a low-level language like C we can deal directly with the mapping between numeric addresses used by the computer, so the likes of int* ptr = (int*) 0xA000000; *ptr = 42; is reasonable (as long as we've a good reason to suspect that accessing memory address 0xA000000 in this way won't blow something up). In C# this isn't reasonable (we can just about force it in .NET, but with the .NET memory management moving things around it's not very likely to be useful). C# does not have reified memory addresses.
So, as reified means "made real", a "reified type" is a type we can "talk about" in the language in question.
In generics this means two things.
One is that List<string> is a type just as string or int are. We can compare that type, get its name, and enquire about it:
Console.WriteLine(typeof(List<string>).FullName); // System.Collections.Generic.List`1[[System.String, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]]
Console.WriteLine(typeof(List<string>) == (42).GetType()); // False
Console.WriteLine(typeof(List<string>) == Enumerable.Range(0, 1).Select(i => i.ToString()).ToList().GetType()); // True
Console.WriteLine(typeof(List<string>).GenericTypeArguments[0] == typeof(string)); // True
A consequence of this is that we can "talk about" a generic method's (or method of a generic class) parameters' types within the method itself:
public static void DescribeType<T>(T element)
{
Console.WriteLine(typeof(T).FullName);
}
public static void Main()
{
DescribeType(42); // System.Int32
DescribeType(42L); // System.Int64
DescribeType(DateTime.UtcNow); // System.DateTime
}
As a rule, doing this too much is "smelly", but it has many useful cases. For example, look at:
public static TSource Min<TSource>(this IEnumerable<TSource> source)
{
if (source == null) throw Error.ArgumentNull("source");
Comparer<TSource> comparer = Comparer<TSource>.Default;
TSource value = default(TSource);
if (value == null)
{
using (IEnumerator<TSource> e = source.GetEnumerator())
{
do
{
if (!e.MoveNext()) return value;
value = e.Current;
} while (value == null);
while (e.MoveNext())
{
TSource x = e.Current;
if (x != null && comparer.Compare(x, value) < 0) value = x;
}
}
}
else
{
using (IEnumerator<TSource> e = source.GetEnumerator())
{
if (!e.MoveNext()) throw Error.NoElements();
value = e.Current;
while (e.MoveNext())
{
TSource x = e.Current;
if (comparer.Compare(x, value) < 0) value = x;
}
}
}
return value;
}
This doesn't do lots of comparisons between the type of TSource and various types for different behaviours (generally a sign that you shouldn't have used generics at all) but it does split between a code path for types that can be null (should return null if no element found, and must not make comparisons to find the minimum if one of the elements compared is null) and the code path for types that cannot be null (should throw if no element found, and doesn't have to worry about the possibility of null elements).
Because TSource is "real" within the method, this comparison can be made either at runtime or jitting time (generally at jitting time, and certainly the above case would do so at jitting time and not produce machine code for the path not taken) and we have a separate "real" version of the method for each case. (Though as an optimisation, the machine code is shared for different methods for different reference-type type parameters, because it can be without affecting this, and hence we can reduce the amount of machine code jitted.)
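The null test that Min uses to pick its code path can be tried in isolation. This is only a sketch; NullabilityProbe and CanBeNull are hypothetical names, not part of the framework.

```csharp
using System;

public static class NullabilityProbe
{
    // Same test Min<TSource> uses to choose a code path: because T is
    // "real" inside the method, default(T) differs per type argument.
    public static bool CanBeNull<T>()
    {
        return default(T) == null;
    }

    public static void Main()
    {
        Console.WriteLine(CanBeNull<int>());      // False
        Console.WriteLine(CanBeNull<DateTime>()); // False
        Console.WriteLine(CanBeNull<string>());   // True
        Console.WriteLine(CanBeNull<int?>());     // True
    }
}
```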
(It's not common to talk about reification of generic types in C# unless you also deal with Java, because in C# we just take this reification for granted; all types are reified. In Java, non-generic types are referred to as reified because that is a distinction between them and generic types.)
As duffymo already noted, "reification" isn't the key difference.
In Java, generics are basically there to improve compile-time support - it allows you to use strongly typed e.g. collections in your code, and have type safety handled for you. However, this only exists at compile-time - the compiled bytecode no longer has any notion of generics; all the generic types are transformed into "concrete" types (using object if the generic type is unbounded), adding type conversions and type checks as needed.
In .NET, generics are an integral feature of the CLR. When you compile a generic type, it stays generic in the generated IL. It's not just transformed into non-generic code as in Java.
This has several impacts on how generics work in practice. For example:
Java has SomeType<?> to allow you to pass any concrete implementation of a given generic type. C# cannot do this - every specific (reified) generic type is its own type.
Unbounded generic types in Java mean that their value is stored as an object. This can have a performance impact when using value types in such generics. In C#, when you use a value type in a generic type, it stays a value type.
To give a sample, let's suppose you have a List generic type with one generic argument. In Java, List<String> and List<Int> will end up being the exact same type at runtime - the generic types only really exist for compile-time code. All calls to e.g. GetValue will be transformed to (String)GetValue and (Int)GetValue respectively.
In C#, List<string> and List<int> are two different types. They are not interchangeable, and their type-safety is enforced in runtime as well. No matter what you do, new List<int>().Add("SomeString") will never work - the underlying storage in List<int> is really some integer array, while in Java, it is necessarily an object array. In C#, there are no casts involved, no boxing etc.
This should also make it obvious why C# can't do the same thing as Java with SomeType<?>. In Java, all generic types "derived from" SomeType<?> end up being the exact same type. In C#, all the various specific SomeType<T>s are their own separate type. Removing compile-time checks, it's possible to pass SomeType<Int> instead of SomeType<String> (and really, all that SomeType<?> means is "ignore compile-time checks for the given generic type"). In C#, it's not possible, not even for derived types (that is, you can't do List<object> list = (List<object>)new List<string>(); even though string is derived from object).
Both implementations have their pros and cons. There's been a few times when I'd have loved to be able to just allow SomeType<?> as an argument in C# - but it simply doesn't make sense the way C# generics work.
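To illustrate the earlier point that the element type of a List<int> is enforced at run time too, here is a small sketch that sidesteps the compile-time check by going through the non-generic IList interface. RuntimeTypeSafety and TryAddLoosely are made-up names for this example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class RuntimeTypeSafety
{
    // Adds through the non-generic IList view, so the compiler
    // cannot check the item type; the list itself still does.
    public static bool TryAddLoosely(IList list, object item)
    {
        try
        {
            list.Add(item);
            return true;
        }
        catch (ArgumentException)
        {
            // List<T> rejects values that are not of type T.
            return false;
        }
    }

    public static void Main()
    {
        var ints = new List<int>();
        Console.WriteLine(TryAddLoosely(ints, 1));            // True
        Console.WriteLine(TryAddLoosely(ints, "SomeString")); // False
        Console.WriteLine(ints.Count);                        // 1
    }
}
```

In Java, the equivalent trick with a raw List succeeds silently, because the element type is no longer known at that point.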
Reification is an object-oriented modeling concept.
Reify is a verb that means "make something abstract real".
When you do object oriented programming it's common to model real world objects as software components (e.g. Window, Button, Person, Bank, Vehicle, etc.)
It's also common to reify abstract concepts into components as well (e.g. WindowListener, Broker, etc.)
If I have a generic interface with a covariant type parameter, like this:
interface IGeneric<out T>
{
string GetName();
}
And if I define this class hierarchy:
class Base {}
class Derived1 : Base{}
class Derived2 : Base{}
Then I can implement the interface twice on a single class, like this, using explicit interface implementation:
class DoubleDown: IGeneric<Derived1>, IGeneric<Derived2>
{
string IGeneric<Derived1>.GetName()
{
return "Derived1";
}
string IGeneric<Derived2>.GetName()
{
return "Derived2";
}
}
If I use the (non-generic)DoubleDown class and cast it to IGeneric<Derived1> or IGeneric<Derived2> it functions as expected:
var x = new DoubleDown();
IGeneric<Derived1> id1 = x; //cast to IGeneric<Derived1>
Console.WriteLine(id1.GetName()); //Derived1
IGeneric<Derived2> id2 = x; //cast to IGeneric<Derived2>
Console.WriteLine(id2.GetName()); //Derived2
However, casting the x to IGeneric<Base>, gives the following result:
IGeneric<Base> b = x;
Console.WriteLine(b.GetName()); //Derived1
I expected the compiler to issue an error, as the call is ambiguous between the two implementations, but it returned the first declared interface.
Why is this allowed?
(inspired by A class implementing two different IObservables?. I tried to show to a colleague that this will fail, but somehow, it didn't)
If you have tested both of:
class DoubleDown: IGeneric<Derived1>, IGeneric<Derived2> {
string IGeneric<Derived1>.GetName() {
return "Derived1";
}
string IGeneric<Derived2>.GetName() {
return "Derived2";
}
}
class DoubleDown: IGeneric<Derived2>, IGeneric<Derived1> {
string IGeneric<Derived1>.GetName() {
return "Derived1";
}
string IGeneric<Derived2>.GetName() {
return "Derived2";
}
}
You must have realized that the result in practice changes with the order in which you declare the interfaces to implement. But I'd say it is just unspecified.
First off, the specification(§13.4.4 Interface mapping) says:
If more than one member matches, it is unspecified which member is the implementation of I.M.
This situation can only occur if S is a constructed type where the two members as declared in the generic type have different signatures, but the type arguments make their signatures identical.
Here we have two questions to consider:
Q1: Do your generic interfaces have different signatures?
A1: Yes. They are IGeneric<Derived2> and IGeneric<Derived1>.
Q2: Could the statement IGeneric<Base> b=x; make their signatures identical with type arguments?
A2: No. You invoked the method through a generic covariant interface definition.
Thus your call meets the unspecified condition. But how could this happen?
Remember, whatever interface you use to refer to the object of type DoubleDown, it is always a DoubleDown. That is, it always has these two GetName methods. The interface you use to refer to it in fact performs contract selection.
The following is part of a screenshot captured from a real test.
This image shows what GetMembers returns at runtime. Whichever way you refer to the object, as IGeneric<Derived1>, IGeneric<Derived2> or IGeneric<Base>, the result is no different. The following two images show more details:
As the images show, these two generic derived interfaces have neither the same name nor any other signatures or tokens that would make them identical.
The compiler can't throw an error on the line
IGeneric<Base> b = x;
Console.WriteLine(b.GetName()); //Derived1
because there is no ambiguity that the compiler can know about. GetName() is in fact a valid method on interface IGeneric<Base>. The compiler doesn't track the runtime type of b to know that there is a type in there which could cause an ambiguity. So it's left up to the runtime to decide what to do. The runtime could throw an exception, but the designers of the CLR apparently decided against that (which I personally think was a good decision).
To put it another way, let's say that instead you simply had written the method:
public void CallIt(IGeneric<Base> b)
{
string name = b.GetName();
}
and you provide no classes implementing IGeneric<T> in your assembly. You distribute this, and many others implement this interface only once and are able to call your method just fine. However, someone eventually consumes your assembly, creates the DoubleDown class and passes it into your method. At what point should the compiler throw an error? Surely the already compiled and distributed assembly containing the call to GetName() can't produce a compiler error. You could say that the assignment from DoubleDown to IGeneric<Base> produces the ambiguity, but once again we could add another level of indirection into the original assembly:
public void CallItOnDerived1(IGeneric<Derived1> b)
{
CallIt(b); //b will be cast to IGeneric<Base>
}
Once again, many consumers could call either CallIt or CallItOnDerived1 and be just fine. But our consumer passing DoubleDown also is making a perfectly legal call that could not cause a compiler error when they call CallItOnDerived1 as converting from DoubleDown to IGeneric<Derived1> should certainly be OK. Thus, there is no point at which the compiler can throw an error other than possibly on the definition of DoubleDown, but this would eliminate the possibility of doing something potentially useful with no workaround.
I have actually answered this question more in depth elsewhere, and also provided a potential solution if the language could be changed:
No warning or error (or runtime failure) when contravariance leads to ambiguity
Given that the chance of the language changing to support this is virtually zero, I think that the current behavior is alright, except that it should be laid out in the specifications so that all implementations of the CLR would be expected to behave the same way.
Holy goodness, lots of really good answers here to what is quite a tricky question. Summing up:
The language specification does not clearly say what to do here.
This scenario usually arises when someone is attempting to emulate interface covariance or contravariance; now that C# has interface variance we hope that fewer people will use this pattern.
Most of the time "just pick one" is a reasonable behaviour.
How the CLR actually chooses which implementation is used in an ambiguous covariant conversion is implementation-defined. Basically, it scans the metadata tables and picks the first match, and C# happens to emit the tables in source code order. You can't rely on this behaviour though; either can change without notice.
I'd only add one other thing, and that is: the bad news is that interface reimplementation semantics do not exactly match the behaviour specified in the CLI specification in scenarios where these sorts of ambiguities arise. The good news is that the actual behaviour of the CLR when re-implementing an interface with this kind of ambiguity is generally the behaviour that you'd want. Discovering this fact led to a spirited debate between me, Anders and some of the CLI spec maintainers and the end result was no change to either the spec or the implementation. Since most C# users do not even know what interface reimplementation is to begin with, we hope that this will not adversely affect users. (No customer has ever brought it to my attention.)
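If you need a predictable answer rather than "just pick one", one possible way out is to also implement the base instantiation explicitly. This is only a sketch: TripleDown is a made-up class, and it relies on the runtime preferring an exact interface match over a variant one, which is my understanding of the CLI rules and of how the Microsoft CLR behaves. Treat it as an assumption on other implementations.

```csharp
using System;

public interface IGeneric<out T>
{
    string GetName();
}

public class Base { }
public class Derived1 : Base { }
public class Derived2 : Base { }

// Like DoubleDown, but also implements IGeneric<Base> itself, so a
// conversion to IGeneric<Base> no longer needs a variant match.
public class TripleDown : IGeneric<Derived1>, IGeneric<Derived2>, IGeneric<Base>
{
    string IGeneric<Derived1>.GetName() { return "Derived1"; }
    string IGeneric<Derived2>.GetName() { return "Derived2"; }
    string IGeneric<Base>.GetName() { return "Base"; }
}

public static class ExactMatchDemo
{
    public static void Main()
    {
        var x = new TripleDown();
        IGeneric<Derived1> id1 = x;
        IGeneric<Derived2> id2 = x;
        IGeneric<Base> b = x;

        Console.WriteLine(id1.GetName()); // Derived1
        Console.WriteLine(id2.GetName()); // Derived2
        Console.WriteLine(b.GetName());   // Base (exact match assumed to win)
    }
}
```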
The question asked, "Why doesn't this produce a compiler warning?".
In VB, it does (I implemented it).
The type system doesn't carry enough information to provide a warning at time of invocation about variance ambiguity. So the warning has to be emitted earlier ...
In VB, if you declare a class C which implements both IEnumerable(Of Fish) and IEnumerable(Of Dog), then it gives a warning saying that the two will conflict in the common case IEnumerable(Of Animal). This is enough to stamp out variance-ambiguity from code that's written entirely in VB.
However, it doesn't help if the problem class was declared in C#. Also note that it's completely reasonable to declare such a class if no one invokes a problematic member on it.
In VB, if you perform a cast from such a class C into IEnumerable(Of Animal), then it gives a warning on the cast. This is enough to stamp out variance-ambiguity even if you imported the problem class from metadata.
However, it's a poor warning location because it's not actionable: you can't go and change the cast. The only actionable warning to people would be to go back and change the class definition. Also note that it's completely reasonable to perform such a cast if no one invokes a problematic member on it.
Question:
How come VB emits these warnings but C# doesn't?
Answer:
When I put them into VB, I was enthusiastic about formal computer science, and had only been writing compilers for a couple of years, and I had the time and enthusiasm to code them up.
Eric Lippert was doing them in C#. He had the wisdom and maturity to see that coding up such warnings in the compiler would take a lot of time that could be better spent elsewhere, and was sufficiently complex that it carried high risk. Indeed the VB compilers had bugs in these very warnings that were only fixed in VS2012.
Also, to be frank, it was impossible to come up with a warning message useful enough that people would understand it. Incidentally,
Question:
How does the CLR resolve the ambiguity when choosing which one to invoke?
Answer:
It bases it on the lexical ordering of inheritance statements in the original source code, i.e. the lexical order in which you declared that C implements IEnumerable(Of Fish) and IEnumerable(Of Dog).
Trying to delve into the "C# language specifications", it looks like the behaviour is not specified (if I did not get lost along the way).
7.4.4 Function member invocation
The run-time processing of a function member invocation consists of the following steps, where M is the function member and, if M is an instance member, E is the instance expression:
[...]
o The function member implementation to invoke is determined:
• If the compile-time type of E is an interface, the function member to invoke is the implementation of M provided by the run-time type of the instance referenced by E. This function member is determined by applying the interface mapping rules (§13.4.4) to determine the implementation of M provided by the run-time type of the instance referenced by E.
13.4.4 Interface mapping
Interface mapping for a class or struct C locates an implementation for each member of each interface specified in the base class list of C. The implementation of a particular interface member I.M, where I is the interface in which the member M is declared, is determined by examining each class or struct S, starting with C and repeating for each successive base class of C, until a match is located:
• If S contains a declaration of an explicit interface member implementation that matches I and M, then this member is the implementation of I.M.
• Otherwise, if S contains a declaration of a non-static public member that matches M, then this member is the implementation of I.M. If more than one member matches, it is unspecified which member is the implementation of I.M. This situation can only occur if S is a constructed type where the two members as declared in the generic type have different signatures, but the type arguments make their signatures identical.
I've recently had a strange issue with one of my APIs reported. Essentially for some reason when used with VB code the VB compiler does not do implicit casts to Object when trying to invoke the ToString() method.
The following is a minimal code example firstly in C# and secondly in VB:
Graph g = new Graph();
g.LoadFromEmbeddedResource("VDS.RDF.Configuration.configuration.ttl");
foreach (Triple t in g.Triples)
{
Console.WriteLine(t.Subject.ToString());
}
The above compiles and runs fine while the below does not:
Dim g As Graph = New Graph()
g.LoadFromEmbeddedResource("VDS.RDF.Configuration.configuration.ttl")
For Each t As Triple In g.Triples
Console.WriteLine(t.Subject.ToString())
Next
The second VB example gives the following compiler exception:
Overload resolution failed because no accessible 'ToString' accepts this number of arguments.
This appears to be due to the fact that the type of the property t.Subject that I am trying to write to the console has explicitly defined ToString() methods which take parameters. The VB compiler appears to expect one of these to be used and does not seem to implicitly cast to Object and use the standard Object.ToString() method whereas the C# compiler does.
Is there any way around this, e.g. a VB compiler option, or is it best just to ensure that the type of the property (which is an interface in this example) explicitly defines an unparameterized ToString() method to ensure compatibility with VB?
Edit
Here are the additional details requested by Lucian
Graph is an implementation of an interface but that is actually irrelevant since it is the INode interface which is the type that t.Subject returns which is the issue.
INode defines two overloads for ToString() both of which take parameters
Yes it is a compile time error
No I do not use hide-by-name, the API is all written in C# so I couldn't generate that kind of API if I wanted to
Note that I've since added an explicit unparameterized ToString() overload to the interface which has fixed the issue for VB users.
RobV, I'm the VB spec lead, so I should be able to answer your question, but I'll need some clarification please...
What are the overloads defined on "Graph"? It'd help if you could make a self-contained repro. It's hard to explain overloading behavior without knowing the overload candidates :)
You said it failed with a "compiler exception". That doesn't really exist. Do you mean a "compile-time error"? Or a "run-time exception"?
Something to check is whether you're relying on any kind of "hide-by-name" vs "hide-by-sig" behavior. C# compiler only ever emits "hide-by-sig" APIs; VB compiler can emit either depending on whether you use the "Shadows" keyword.
C# overload algorithm is to walk up the inheritance hierarchy level by level until it finds a possible match; VB overload algorithm is to look at all levels of the inheritance hierarchy simultaneously to see which has the best match. This is all a bit theoretical, but with a small self-contained repro of your problem I could explain what it means in practice.
Hans, I don't think your explanation is the right one. Your code gives compile-time error "BC30455: Argument not specified for parameter 'mumble' of ToString". But RobV had experienced "Overload resolution failed because no accessible 'ToString' accepts this number of arguments".
Here's a repro of this behavior. It also shows you the workaround, cast with CObj():
Module Module1
Sub Main()
Dim itf As IFoo = New CFoo()
Console.WriteLine(itf.ToString()) '' Error BC30455
Console.WriteLine(CObj(itf).ToString()) '' Okay
End Sub
End Module
Interface IFoo
Function ToString(ByVal mumble As Integer) As String
End Interface
Class CFoo
Implements IFoo
Function ToString1(ByVal mumble As Integer) As String Implements IFoo.ToString
Return "foo"
End Function
End Class
I think this is annotated in the VB.NET Language Specification, chapter 11.8.1 "Overloaded method resolution":
The justification for this rule is
that if a program is loosely-typed
(that is, most or all variables are
declared as Object), overload
resolution can be difficult because
all conversions from Object are
narrowing. Rather than have the
overload resolution fail in many
situations (requiring strong typing of
the arguments to the method call),
resolution the appropriate overloaded
method to call is deferred until run
time. This allows the loosely-typed
call to succeed without additional
casts.
An unfortunate side-effect of this,
however, is that performing the
late-bound call requires casting the
call target to Object. In the case of
a structure value, this means that the
value must be boxed to a temporary. If
the method eventually called tries to
change a field of the structure, this
change will be lost once the method
returns.
Interfaces are excluded from this
special rule because late binding
always resolves against the members of
the runtime class or structure type,
which may have different names than
the members of the interfaces they
implement.
Not sure. I'd transliterate it as: VB.NET is a loosely typed language where many object references are commonly late bound. This makes method overload resolution perilous.
When should one use the dynamic keyword in C# 4.0? Is there a good example that explains its usage?
Dynamic should be used only when not using it is painful, as with the MS Office libraries. In all other cases it should be avoided, because compile-time checking is beneficial. The following are good situations for using dynamic:
Calling a JavaScript method from Silverlight.
COM interop.
Maybe reading Xml, Json without creating custom classes.
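As a rough illustration of the last point, here is a sketch that holds ad-hoc data without declaring a class for it, using System.Dynamic.ExpandoObject. The MakePerson helper and the Name/Age members are made up for the example, and a real XML or JSON reader would populate the object from parsed input rather than from arguments.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

static class AdHocData
{
    // Hypothetical helper: builds a record-like object with no custom class.
    public static dynamic MakePerson(string name, int age)
    {
        dynamic person = new ExpandoObject();
        person.Name = name;   // members are created on first assignment
        person.Age = age;
        return person;
    }

    static void Main()
    {
        dynamic p = MakePerson("Ada", 36);
        Console.WriteLine(p.Name + " is " + p.Age);   // Ada is 36

        // ExpandoObject also implements IDictionary<string, object>,
        // so the members can be enumerated or checked by name.
        var members = (IDictionary<string, object>)p;
        Console.WriteLine(members.ContainsKey("Age")); // True
    }
}
```

A misspelled member name here is only detected when that line runs, which is the trade-off the answer above is warning about.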
How about this? Something I've been looking for and was wondering why it was so hard to do without 'dynamic'.
interface ISomeData {}
class SomeActualData : ISomeData {}
class SomeOtherData : ISomeData {}
interface ISomeInterface
{
void DoSomething(ISomeData data);
}
class SomeImplementation : ISomeInterface
{
public void DoSomething(ISomeData data)
{
dynamic specificData = data;
HandleThis( specificData );
}
private void HandleThis(SomeActualData data)
{ /* ... */ }
private void HandleThis(SomeOtherData data)
{ /* ... */ }
}
If there is no overload that takes the concrete type, the call fails with a runtime exception, so you may need to catch that and handle it however you want.
The equivalent without dynamic would be:
public void DoSomething(ISomeData data)
{
if(data is SomeActualData)
HandleThis( (SomeActualData) data);
else if(data is SomeOtherData)
HandleThis( (SomeOtherData) data);
...
else
throw new SomeRuntimeException();
}
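To make the runtime failure concrete, here is a sketch of the dynamic version with the exception handled. UnhandledData is a hypothetical extra type added only to trigger the failure, and the methods return strings (instead of void) purely so the outcome is visible.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

interface ISomeData { }
class SomeActualData : ISomeData { }
class SomeOtherData : ISomeData { }
class UnhandledData : ISomeData { }   // hypothetical: no overload accepts this type

class SomeImplementation
{
    public string DoSomething(ISomeData data)
    {
        try
        {
            // The overload is picked at run time from the actual type of data.
            return HandleThis((dynamic)data);
        }
        catch (RuntimeBinderException)
        {
            // Raised when no HandleThis overload fits the runtime type.
            return "unhandled";
        }
    }

    private string HandleThis(SomeActualData data) { return "actual"; }
    private string HandleThis(SomeOtherData data) { return "other"; }
}

static class Program
{
    static void Main()
    {
        var impl = new SomeImplementation();
        Console.WriteLine(impl.DoSomething(new SomeActualData())); // actual
        Console.WriteLine(impl.DoSomething(new SomeOtherData()));  // other
        Console.WriteLine(impl.DoSomething(new UnhandledData()));  // unhandled
    }
}
```

Adding a fallback overload such as HandleThis(ISomeData) should avoid the exception altogether, since the binder would then always find an applicable method while still preferring the more specific overloads.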
As described here, dynamic can make poorly designed external libraries easier to use: Microsoft provides the example of the Microsoft.Office.Interop.Excel assembly.
With dynamic, you can avoid a lot of annoying explicit casting when using this assembly.
Also, contrary to what user2415376 suggests, it is definitely not a way to handle interfaces, since polymorphism has been part of the language from the very beginning!
You can use
ISomeData specificData = data;
instead of
dynamic specificData = data;
Plus it will make sure that you do not pass a wrong type of data object instead.
Check this blog post which talks about dynamic keywords in c#. Here is the gist:
The dynamic keyword is powerful indeed, it is irreplaceable when used with dynamic languages but can also be used for tricky situations while designing code where a statically typed object simply will not do.
Consider the drawbacks:
There is no compile-time type checking. This means that unless you have 100% confidence in your unit tests (cough) you are running a risk.
The dynamic keyword uses more CPU cycles than your old-fashioned statically typed code because of the additional runtime overhead. If performance is important to your project (it normally is), don’t use dynamic.
Common mistakes include returning anonymous types wrapped in the dynamic keyword from public methods. Anonymous types are specific to an assembly, and accessing them from another assembly (via the public methods) will throw an error. Even though simple testing will catch this, you now have a public method which you can use only from specific places, and that’s just bad design.
It’s a slippery slope. Inexperienced developers itching to write something new and trying their best to avoid more classes (this is not necessarily limited to the inexperienced) will start using dynamic more and more if they see it in code. I would usually add a code analysis check for dynamic, or look for it in code review.
Here is a recent case in which using dynamic was a straightforward solution. This is essentially 'duck typing' in a COM interop scenario.
I had ported some code from VB6 into C#. This ported code still needed to call other methods on VB6 objects via COM interop.
The classes needing to be called looked like this:
class A
{
void Foo() {...}
}
class B
{
void Foo() {...}
}
(i.e., this would be the way the VB6 classes looked in C# via COM interop.)
Since A and B are independent of each other, you can't cast one to the other, and they have no common base class (COM doesn't support that AFAIK, and VB6 certainly didn't). They also did not implement a common interface (see below).
The original VB6 code which was ported did this:
' Obj must be either an A or a B
Sub Bar(Obj As Object)
Call Obj.Foo()
End Sub
Now in VB6 you can pass things around as Object and the runtime will figure out if those objects have method Foo() or not. But in C# a literal translation would be:
// Obj must be either an A or a B
void Bar(object Obj)
{
Obj.Foo();
}
Which will NOT work. It won't compile because object does not have a method called "Foo", and C# being typesafe won't allow this.
So the simple "fix" was to use dynamic, like this:
// Obj must be either an A or a B
void Bar(dynamic Obj)
{
Obj.Foo();
}
This defers type safety until runtime, but assuming you've done it right works just fine.
I wouldn't endorse this for new code, but in this situation (which I think is not uncommon judging from other answers here) it was valuable.
Alternatives considered:
Using reflection to call Foo(). Probably would work, but more effort and less readable.
Modifying the VB6 library wasn't on the table here, but maybe there could be an approach to define A and B in terms of a common interface, which VB6 and COM would support. But using dynamic was much easier.
Note: This probably will turn out to be a temporary solution. Eventually if the remaining VB6 code is ported over then a proper class structure can be used.
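For comparison, here is a sketch of the dynamic call next to the reflection alternative mentioned above. It uses plain .NET classes as hypothetical stand-ins for the COM objects, and Foo returns a string (the original was void) only so the result can be printed. With real COM objects the reflection route would probably need something like Type.InvokeMember instead, so treat this as an illustration of the readability difference rather than interop code.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-ins for the two unrelated classes.
public class A { public string Foo() { return "A.Foo"; } }
public class B { public string Foo() { return "B.Foo"; } }

public static class Caller
{
    // Duck typing via dynamic: the member is looked up at run time.
    public static string BarDynamic(dynamic obj)
    {
        return obj.Foo();
    }

    // The same call through reflection: more code, same late binding.
    public static string BarReflection(object obj)
    {
        MethodInfo foo = obj.GetType().GetMethod("Foo", Type.EmptyTypes);
        if (foo == null)
            throw new MissingMethodException(obj.GetType().Name, "Foo");
        return (string)foo.Invoke(obj, null);
    }

    static void Main()
    {
        Console.WriteLine(BarDynamic(new A()));    // A.Foo
        Console.WriteLine(BarDynamic(new B()));    // B.Foo
        Console.WriteLine(BarReflection(new A())); // A.Foo
        Console.WriteLine(BarReflection(new B())); // B.Foo
    }
}
```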
I would like to quote an excerpt from the CodeProject post, which says:
Why use dynamic?
In the statically typed world, dynamic gives developers a lot of rope
to hang themselves with. When dealing with objects whose types can be
known at compile time, you should avoid the dynamic keyword at all
costs. Earlier, I said that my initial reaction was negative, so what
changed my mind? To quote Margret Attwood, context is all. When
statically typing, dynamic doesn't make a stitch of sense. If you are
dealing with an unknown or dynamic type, it is often necessary to
communicate with it through Reflection. Reflective code is not easy to
read, and has all the pitfalls of the dynamic type above. In this
context, dynamic makes a lot of sense.[More]
Some of the characteristics of the dynamic keyword are:
Dynamically typed - This means the type of the declared variable is
determined at run time rather than by the compiler.
No need to initialize at the time of declaration.
e.g.,
dynamic str;
str="I am a string"; //Works fine and compiles
str=2; //Works fine and compiles
Errors are caught at runtime
Intellisense is not available since the type and its related methods and properties can be known at run time only. [https://www.codeproject.com/Tips/460614/Difference-between-var-and-dynamic-in-Csharp]
It is definitely a bad idea to use dynamic in all cases where it can be used. This is because your programs will lose the benefits of compile-time checking and they will also be much slower.
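The characteristics listed above can be seen side by side with var in a small sketch. The class and variable names are made up for the example. A var local gets a fixed compile-time type, while a dynamic local can hold values of different types and only reports a missing member when the line runs.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

static class VarVersusDynamic
{
    public static string Describe()
    {
        var fixedType = "I am a string";   // compile-time type is string
        // fixedType = 2;                  // would not compile: int is not a string

        dynamic anyType = "I am a string";
        string first = anyType.GetType().Name;
        anyType = 2;                       // allowed: the type is checked at run time
        string second = anyType.GetType().Name;

        string outcome;
        try
        {
            int length = anyType.Length;   // int has no Length member
            outcome = "no error";
        }
        catch (RuntimeBinderException)
        {
            outcome = "runtime error";
        }
        return fixedType.Length + " " + first + " " + second + " " + outcome;
    }

    static void Main()
    {
        Console.WriteLine(Describe());   // 13 String Int32 runtime error
    }
}
```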
I am a PHP web programmer who is trying to learn C#.
I would like to know why C# requires me to specify the data type when creating a variable.
Class classInstance = new Class();
Why do we need to know the data type before a class instance?
As others have said, C# is static/strongly-typed. But I take your question more to be "Why would you want C# to be static/strongly-typed like this? What advantages does this have over dynamic languages?"
With that in mind, there are lots of good reasons:
Stability: Certain kinds of errors are now caught automatically by the compiler, before the code ever makes it anywhere close to production.
Readability/Maintainability: You are now providing more information about how the code is supposed to work to future developers who read it. You add information that a specific variable is intended to hold a certain kind of value, and that helps programmers reason about what the purpose of that variable is.
This is probably why, for example, Microsoft's style guidelines recommended that VB6 programmers put a type prefix with variable names, but that VB.Net programmers do not.
Performance: This is the weakest reason, but late-binding/duck typing can be slower. In the end, a variable refers to memory that is structured in some specific way. Without strong types, the program will have to do extra type verification or conversion behind the scenes at runtime as you use memory that is structured one way physically as if it were structured in another way logically.
I hesitate to include this point, because ultimately you often have to do those conversions in a strongly typed language as well. It's just that the strongly typed language leaves the exact timing and extent of the conversion to the programmer, and does no extra work unless it needs to be done. It also allows the programmer to force a more advantageous data type. But these really are attributes of the programmer, rather than the platform.
That would itself be a weak reason to omit the point, except that a good dynamic language will often make better choices than the programmer. This means a dynamic language can help many programmers write faster programs. Still, for good programmers, strongly-typed languages have the potential to be faster.
Better Dev Tools: If your IDE knows what type a variable is expected to be, it can give you additional help about what kinds of things that variable can do. This is much harder for the IDE to do if it has to infer the type for you. And if you get more help with the minutiae of an API from the IDE, then you as a developer will be able to get your head around a larger, richer API, and get there faster.
Or perhaps you were just wondering why you have to specify the class name twice for the same variable on the same line? The answer is two-fold:
Often you don't. In C# 3.0 and later you can use the var keyword instead of the type name in many cases. Variables created this way are still statically typed, but the type is now inferred for you by the compiler.
Thanks to inheritance and interfaces sometimes the type on the left-hand side doesn't match the type on the right hand side.
It's simply how the language was designed. C# is a C-style language and follows in the pattern of having types on the left.
In C# 3.0 and up you can kind of get around this in many cases with local type inference.
var variable = new SomeClass();
But at the same time you could also argue that you are still declaring a type on the LHS. Just that you want the compiler to pick it for you.
EDIT
Please read this in the context of the user's original question
why do we need [class name] before a variable name?
I wanted to comment on several other answers in this thread. A lot of people are giving "C# is statically typed" as an answer. While the statement is true (C# is statically typed), it is almost completely unrelated to the question. Static typing does not necessitate a type name being to the left of the variable name. Sure, it can help, but that is a language designer's choice, not a necessary feature of statically typed languages.
This is easily demonstrated by considering other statically typed languages such as F#. Types in F# appear on the right of a variable name and very often can be omitted altogether. There are also counterexamples in the other direction. PowerShell, for instance, is extremely dynamic and puts its types, when they are included, on the left.
One of the main reasons is that you can specify different types as long as the type on the left-hand side of the assignment is a parent type of the type on the right (or an interface that is implemented by that type).
For example given the following types:
class Foo { }
class Bar : Foo { }
interface IBaz { }
class Baz : IBaz { }
C# allows you to do this:
Foo f = new Bar();
IBaz b = new Baz();
Yes, in most cases the compiler could infer the type of the variable from the assignment (like with the var keyword) but it doesn't for the reason I have shown above.
Edit: As a point of order - while C# is strongly-typed the important distinction (as far as this discussion is concerned) is that it is in fact also a statically-typed language. In other words the C# compiler does static type checking at compilation time.
C# is a statically-typed, strongly-typed language like C or C++. In these languages all variables must be declared to be of a specific type.
Ultimately because Anders Hejlsberg said so...
You need [class name] in front because there are many situations in which the first [class name] is different from the second, like:
IMyCoolInterface obj = new MyInterfaceImplementer();
MyBaseType obj2 = new MySubTypeOfBaseType();
etc. You can also use the word 'var' if you don't want to specify the type explicitly.
Why do we need to know the data type
before a class instance?
You don't! Read from right to left. You create the object and then you store it in a type-safe variable, so you know what type that variable holds for later use.
Consider the following snippet, it would be a nightmare to debug if you didn't receive the errors until runtime.
void FunctionCalledVeryUnfrequently()
{
ClassA a = new ClassA();
ClassB b = new ClassB();
ClassA a2 = new ClassB(); //COMPILER ERROR(thank god)
//100 lines of code
DoStuffWithA(a);
DoStuffWithA(b); //COMPILER ERROR(thank god)
DoStuffWithA(a2);
}
If you think of replacing the new Class() with a number or a string, the syntax makes much more sense. The following example might be a bit verbose, but it might help to understand why it's designed the way it is.
string s = "abc";
string s2 = new string(new char[]{'a', 'b', 'c'});
//Does exactly the same thing
DoStuffWithAString("abc");
DoStuffWithAString(new string(new char[]{'a', 'b', 'c'}));
//Does exactly the same thing
C#, as others have pointed out, is a strongly, statically-typed language.
By stating up front what type you're intending to create, you'll receive compile-time errors when you try to assign an illegal value. By stating up front what type of parameters you accept in methods, you receive those same compile-time errors when you accidentally pass nonsense into a method that isn't expecting it. It removes the overhead of some paranoia on your behalf.
Finally, and rather nicely, C# (and many other languages) doesn't have the same ridiculous, "convert anything to anything, even when it doesn't make sense" mentality that PHP does, which quite frankly can trip you up more times than it helps.
C# is a strongly-typed language, like C++ or Java. Therefore it needs to know the type of the variable. You can fudge it a bit in C# 3.0 via the var keyword, which lets the compiler infer the type.
That's the difference between a strongly typed and a weakly typed language. C# (like C, C++, Java and most of the more powerful languages) is strongly typed, so you must declare the variable type.
When we define variables to hold data we have to specify the type of data that those variables will hold. The compiler then checks that what we are doing with the data makes sense to it, i.e. follows the rules. We can't for example store text in a number - the compiler will not allow it.
int a = "fred"; // Not allowed. Cannot implicitly convert 'string' to 'int'
The variable a is of type int, and assigning it the value "fred", which is a text string, breaks the rules: the compiler is unable to do any kind of conversion of this string.
In C# 3.0, you can use the 'var' keyword. This uses static type inference to work out what the type of the variable is at compile time:
var foo = new ClassName();
variable 'foo' will be of type 'ClassName' from then on.
One thing that hasn't been mentioned is that C# is a CLS (Common Language Specification) compliant language. This is a set of rules that a .NET language has to adhere to in order to be interoperable with other .NET languages.
So really C# is just keeping to these rules. To quote this MSDN article:
The CLS helps enhance and ensure
language interoperability by defining
a set of features that developers can
rely on to be available in a wide
variety of languages. The CLS also
establishes requirements for CLS
compliance; these help you determine
whether your managed code conforms to
the CLS and to what extent a given
tool supports the development of
managed code that uses CLS features.
If your component uses only CLS
features in the API that it exposes to
other code (including derived
classes), the component is guaranteed
to be accessible from any programming
language that supports the CLS.
Components that adhere to the CLS
rules and use only the features
included in the CLS are said to be
CLS-compliant components
The CLS is defined in terms of the CTS, the Common Type System.
If that's not enough acronyms for you, then there's a tonne more in .NET such as CLI, ILasm/MSIL, CLR, BCL and FCL.
Because C# is a strongly typed language
Static typing also allows the compiler to make better optimizations, and skip certain steps. Take overloading for example, where you have multiple methods or operators with the same name differing only by their arguments. With a dynamic language, the runtime would need to grade each version in order to determine which is the best match. With a static language like this, the final code simply points directly to the appropriate overload.
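A small sketch of the difference, with made-up Print overloads: with a statically typed argument the overload is fixed at compile time from the declared type, whereas with a dynamic argument the runtime has to pick the best match from the actual type on each call.

```csharp
using System;

static class OverloadDemo
{
    public static string Print(object value) { return "object"; }
    public static string Print(string value) { return "string"; }

    static void Main()
    {
        object boxed = "hello";

        // Bound at compile time: the declared type is object.
        Console.WriteLine(Print(boxed));          // object

        // Bound at run time: the actual type is string.
        Console.WriteLine(Print((dynamic)boxed)); // string
    }
}
```

The second call does the extra matching work at run time that the paragraph above describes. The first compiles down to a direct call to the chosen overload.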
Static typing also aids in code maintenance and refactoring. My favorite example being the Rename feature of many higher-end IDEs. Thanks to static typing, the IDE can find with certainty every occurrence of the identifier in your code, and leave unrelated identifiers with the same name intact.
I didn't notice whether it was mentioned yet or not, but C# 4.0 introduces dynamic checking via the dynamic keyword. Though I'm sure you'd want to avoid it when it's not necessary.
Why C# requires me to specify the data type when creating a variable.
Why do we need to know the data type before a class instance?
I think one thing that most answers haven't referenced is the fact that C# was originally meant and designed as a "managed", "safe" language among other things, and a lot of those goals are arrived at via static / compile-time verifiability. Knowing the variable datatype explicitly makes this problem MUCH easier to solve. This means that one can make several automated assessments (by the C# compiler, not the JIT) about possible errors / undesirable behavior without ever allowing execution.
That verifiability as a side effect also gives you better readability, dev tools, stability etc. because if an automated algorithm can understand better what the code will do when it actually runs, so can you :)
Statically typed means that the compiler can perform checks at compile time rather than at run time. In a statically typed language every variable has a particular type. C# is definitely strongly typed as well.