Given this code:
class C
{
C()
{
Test<string>(A); // fine
Test((string a) => {}); // fine
Test((Action<string>)A); // fine
Test(A); // type arguments cannot be inferred from usage!
}
static void Test<T>(Action<T> a) { }
void A(string _) { }
}
The compiler complains that Test(A) can't figure out T to be string.
This seems like a pretty easy case to me, and I swear I've relied far more complicated inference in other generic utility and extension functions I've written. What am I missing here?
Update 1: This is in the C# 4.0 compiler. I discovered the issue in VS2010 and the above sample is from a simplest-case repro I made in LINQPad 4.
Update 2: Added some more examples to the list of what works.
Test(A);
This fails because the only applicable method (Test<T>(Action<T>)) requires type inference, and the type inference algorithm requires that each each argument be of some type or be an anonymous function. (This fact is inferred from the specification of the type inference algorithm (§7.5.2)) The method group A is not of any type (even though it is convertable to an appropriate delegate type), and it is not an anonymous function.
Test<string>(A);
This succeeds, the difference being that type inference is not necessary to bind Test, and method group A is convertable to the required delegate parameter type void Action<string>(string).
Test((string a) => {});
This succeeds, the difference being that the type inference algorithm makes provision for anonymous functions in the first phase (§7.5.2.1). The parameter and return types of the anonymous function are known, so an explicit parameter type inference can be made, and a correspondense is thereby made between the types in the anonymous function (void ?(string)) and the type parameter in the delegate type of the Test method’s parameter (void Action<T>(T)). No algorithm is specified for method groups that would correspond to this algorithm for anonymous functions.
Test((Action<string>)A);
This succeeds, the difference being that the untyped method group parameter A is cast to a type, thereby allowing the type inference of Test to proceed normally with an expression of a particular type as the only argument to the method.
I can think of no reason in theory why overload resolution could not be attempted on the method group A. Then—if a single best binding is found—the method group could be given the same treatment as an anonymous function. This is especially true in cases like this where the method group contains exactly one candidate and it has no type parameters. But the reason it does not work in C#4 appears to be the fact that this feature was not designed and implemented. Given the complexity of this feature, the narowness of its application, and the existance of three easy work-arounds, I am not going to be holding my breath for it!
I think it's because it's a two-step inference:
It has to infer that you want to convert A to a generic delegate
It has to infer what the type of the delegate parameter should be
I'm not sure if this is the reason, but my hunch is that a two-step inference isn't necessarily easy for the compiler.
Edit:
Just a hunch, but something is telling me the first step is the problem. The compiler has to figure out to convert to a delegate with a different number of generic parameters, and so it can't infer the types of the parameters.
This looks like a vicious circle to me.
Test method expects a parameter of delegate type constructed from generic type Action<T>. You pass in a method group instead: Test(A). This means compiler has to convert your parameter to a delegate type (method group conversion).
But which delegate type? To know the delegate type we need to know T. We didn't specify it explicitly, so compiler has to infer it to figure out the delegate type.
To infer the type parameters of the method we need to know the types of the method arguments, in this case the delegate type. Compiler doesn't know the argument type and thus fails.
In all other cases either type of argument is apparent:
// delegate is created out of anonymous method,
// no method group conversion needed - compiler knows it's Action<string>
Test((string a) => {});
// type of argument is set explicitly
Test((Action<string>)A);
or type parameter is specified explicitly:
Test<string>(A); // compiler knows what type of delegate to convert A to
P.S. more on type inference
You're passing the name of the Method A. The .Net framework CAN convert it to an Action, but it's implicit and it will not take responsibility for it.
But still, a method name is NOT an explicit Action<> Object. And therefor it won't infer the type as an Action type.
I could be wrong, but I imagine the real reason C# cannot infer the type is due to method overloading and the ambiguity that arises. For example, suppose I have the following methods: void foo (int) and void foo (float). Now if I write var f = foo. Which foo should the compiler pick? Likewise, the same problem happens with your example using Test(foo).
Related
Suppose I have a generic method. What is the difference between calling the method with specifying the type and calling without it.
When does it become mandatory to specify the type?
Write(1);
Write<int>(1);
Write<string>("test");
Write("test");
private static void Write<T>(T param)
{
Console.WriteLine(param);
}
The compiler will attempt to resolve the type argument at compile time. In cases where this is possible, it is done automatically without you have to specify an explicit type. In all other cases, when the compiler is unable to determine the type argument, the compiler will throw an error and you cannot compile your application.
In your example Write(1), the argument 1 is of type int. So the compiler infers that T must be int, so the call is equivalent to Write<int>(1). Similarly with Write("test"), "test" is of type string, so the compiler will infer that T is string.
This generally works on the compile time type of the argument you pass to the function. So if you have a string with a compile time type of object, T would be object:
object arg = "foo";
Write(arg); // arg is object
// this is equivalent to this:
Write<object>(arg);
So although you are passing a string, its compile time type is object so that’s the only information, the compiler is able to use here.
In addition, you might want to specify the type argument explicitly whenever you want the compiler to use a different type than the compile time argument of what you’re passing to it. In the above example, you could use Write<object>(1) which would cause the integer 1 to be casted into object. However, in practice this usually defeats the purpose of generic methods and may have actual consequences even if you don’t need to access the actual parameter’s compile time type. For example, value types would be boxed when passed as object but generic methods allow you to keep them as value types—remember that a generic method definition (same applies to types) is kind of equivalent to specifying an overload with every other valid generic type argument value (so you usually have an infinite copy of methods).
Of course, the compiler’s type inference also works with multiple arguments:
void Write<T1, T2>(T1 arg1, T2 arg2)
{ … }
Write(1, "foo");
// is equivalent to
Write<int, string>(1, "foo");
However, when the compiler is not able to infer even one type argument, you will have to specify all of them in the call.
It's not mandatory in this situation because the parameter param is the generic (the compiler resolves it like that).
That means it's mandatory whenever the type argument is not supplied as a parameter.
It becomes mandatory to specify the type when the compiler cannot infer it, or when you are not happy with what the compiler infers.
There are no difference between
Write<string>("test");
Write("test");
But, there difference, when you specify not exactly type of parameter, but convertible, for example
Write<object>("test");
Write("test");
Just try to print type parameter and will see difference
private static void Write<T>(T param)
{
Console.WriteLine(typeof(T));
Console.WriteLine(param);
}
I have two overloaded generic methods:
T Foo<T>(T t) { Console.WriteLine("T"); return t; }
T Foo<T>(int i) { Console.WriteLine("int"); return default(T); }
When I try to call Foo as follows on my computer:
Foo(5);
I get no compiler errors or warnings, and the first method with the generic argument is called (i.e. the output is T). Will this be the case in all C# incarnations and on all platforms? In that case, why?
On the other hand, if I explicitly specify the type in the generic call:
Foo<int>(5);
the second method with the int argument is called, i.e. the output is now int. Why?
I am using different argument names in my two method overloads, so the output from the following calls are as expected:
Foo<int>(t: 5); // output 'T'
Foo<int>(i: 5); // output 'int'
If I am calling the first method, I can even leave out the type specification:
Foo(t: 5); // output 'T'
But if I try to compile this:
Foo(i: 5);
I get an error The type arguments for method 'Foo(int)' cannot be inferred from the usage. Try specifying the type arguments explicitly. Why cannot the compiler deal with this call?
Note These tests have been performed with LinqPad on a Windows 8 x64 system (in case that is relevant to the results...)
Last question
Since you specified (by parameter name) that it should call the overload that takes an int parameter, the compiler has no idea what to pass for T.
First question
Because of this, Foo(5) only matches one overload (Foo<T>()).
Therefore, it must only call Foo<T>().
Second question
When you explicitly specify a type argument (<int>), both overloads are applicable.
In that case, Foo(int) is better, since its parameter is not of generic type.
As per the C# spec §7.5.3.2:
Otherwise, if MP has more specific parameter types than MQ, then MP is better than MQ. Let {R1, R2, …, RN} and {S1, S2, …, SN} represent the uninstantiated and unexpanded parameter types of MP and MQ. MP’s parameter types are more specific than MQ’s if, for each parameter, RX is not less specific than SX, and, for at least one parameter, RX is more specific than SX:
A type parameter is less specific than a non-type parameter.
(emphasis added)
When I declare a method like this:
void DoWork<T>(T a) { }
void DoWork(int a) { }
And call it with this:
int a = 1;
DoWork(a);
What DoWork method will it call and why? I can't seem to find it in any MSDN documentation.
As Eric Lippert says:
The C# specification says that when you have a choice between calling ReallyDoIt<string>(string) and ReallyDoIt(string) – that is, when the choice is between two methods that have identical signatures, but one gets that signature via generic substitution – then we pick the “natural” signature over the “substituted” signature.
UPDATE:
What we have in C# spec (7.5.3):
When a generic method is called without specifying type arguments, a type inference process attempts to infer type arguments for the call. Through type inference, the type argument int is determined from the argument to the method. Type inference occurs as part of the binding-time processing of a method invocation and takes place before the overload resolution step of the invocation.
When a particular method group is specified in a method invocation, and no type arguments are specified as part of the method invocation, type inference is applied to each generic method in the method group. If type inference succeeds, then the inferred type arguments are used to determine the types of arguments for subsequent overload resolution. If overload resolution chooses a generic method as the one to invoke, then the inferred type arguments are used as the actual type arguments for the invocation. If type inference for a particular method fails, that method does not participate in overload resolution.
So before overload resolution we have two methods in method group. One DoWork(int) and other inferred DoWork<int>(int).
And we go to 7.5.3.2 (Better function member):
In case the parameter type sequences {P1, P2, …, PN} and {Q1, Q2, …, QN} are equivalent (i.e. each Pi has an identity conversion to the corresponding Qi), the following tie-breaking rules are applied, in order, to determine the better function member.
1) If MP is a non-generic method and MQ is a generic method, then MP is better than MQ.
This seems odd to me, but I remember a thread where Eric Lippert commented on the inability (by design, or at least convention, I think) of C# to overload methods based on return type, so perhaps it's in some convoluted way related to that.
Is there any reason this does not work:
public static T Test<T>() where T : new()
{
return new T();
}
// Elsewhere
SomeObject myObj = Test();
But this does:
var myObj = Test<SomeObject>();
From a certain perspective, they're both fine, in that you're not Repeating Yourself (in a very small way), but is this just a different pass of the compiler?
First off, this is "overloading based on return type":
void M(int x){...}
int M(int x){...}
string M(int x){...}
The declarations are not legal; you can't overload a method based on return type because the return type is not part of the signature, and the signature has to be unique.
What you are talking about is method type inference based on the method's return type. We don't support that either.
The reason is because the return type might be what you are trying to figure out.
M(Test());
What's the return type of Test? That depends on which overload of M we choose. What overload of M do we choose? That depends on the return type of Test.
In general, C# is designed so that every subexpression has a type, and the types are worked out from the "inside" towards the "outside", not from the outside to the inside.
The notable exceptions are anonymous functions, method groups, and null:
M(x=>x+1)
What's the type of x=>x+1? It depends on which overload of M is called.
M(N); // N is a method group
what's the type of N? Again, it depends on which overload of M is called.
And so on. In these cases we do reason from "outside" to "inside".
Type inference involving lambdas is extremely complicated and was difficult to implement. We don't want to have that same complication and difficulty throughout the compiler.
Except for typeless expressions (null, method groups, and lambda expressions), the type of an expression must be statically determinable by the expression itself, regardless of context.
In other words, the type of an expression Test() cannot depend on what you're assigning it to.
Check C# Language Specification §7.5.2, the declaring type of a variable is not an attestation for type inference, and obviously it shouldn't be. Consider the following code:
Base b = Test<Derived>();
Derived d = Test<Derived>();
The return type of the method probably differs from the declaring type of the variable, since we have implicit convert in C#.
Type inference by the compiler doesn't use the "expected type" of an assignment as part of the logic.
So, the "scope of consideration" for type inference is not this:
SomeObject myObj = Test();
but this:
Test();
And, there are no clues here as to the expected type.
If you want an example of why the type of an expression needs to be able to be determined by the expression itself, consider the following two cases:
We don't use the return value at all - we're just calling the method for its side-effects.
We pass the return value directly into an overloaded method
Using the "expected type" of the return value when it comes to generic type resolution would introduce a whole whack of additional complexity into the compiler, and all you've gained is that sometimes you need to explicitly specify the type and sometimes you don't, and whether you need to or not can change based on unrelated changes elsewhere in the code.
I wonder why it is not possible a method parameter as var type like
private void myMethod(var myValue) {
// do something
}
You can only use var for variables inside the method body. Also the variable must be assigned at declaration and it must be possible to deduce the type unambiguously from the expression on the right-hand side.
In all other places you must specify a type, even if a type could in theory be deduced.
The reason is due to the way that the compiler is designed. A simplified description is that it first parses everything except method bodies and then makes a full analysis of the static types of every class, member, etc. It then uses this information when parsing the method bodies, and in particular for deducing the type of local variables declared as var. If var were allowed anywhere then it would require a large change to the way the compiler works.
You can read Eric Lippert's article on this subject for more details:
Why no var on fields?
Because the compiler determines the actual type by looking at the right hand side of the assignment. For example, here it is determined to be a string:
var s = "hello";
Here it is determined to be Foo:
var foo = new Foo();
In method arguments, there is no "right hand side of the assignment", so you can't use var.
See the posting by Eric Lippert about why var is not allowed on fields, which also contains the explanation why it doesn't work in method signatures:
Let me give you a quick oversimplification of how the C# compiler works. First we run through every source file and do a "top level only" parse. That is, we identify every namespace, class, struct, enum, interface, and delegate type declaration at all levels of nesting. We parse all field declarations, method declarations, and so on. In fact, we parse everything except method bodies; those, we skip and come back to them later.
[...]
if we have "var" fields then the type of the field cannot be determined until the expression is analyzed, and that happens after we already need to know the type of the field.
Please see Juliet's answer for a better answer to this question.
Because it was too hard to add full type inference to C#.
Other languages such as Haskell and ML can automatically infer the most general type without you having to declare it.
The other answers state that it's "impossible" for the compiler to infer the type of var but actually it is possible in principle. For example:
abstract void anotherMethod(double z, double w);
void myMethod<T>(T arg)
{
anotherMethod(arg, 2.0); // Now a compiler could in principle infer that arg must be of type double (but the actual C# compiler can't)
}
Have "var" method parameters is in principle the same thing as generic methods:
void myMethod<T>(T arg)
{
....
}
It is unfortunate that you can't just use the same syntax for both but this is probably due to the fact that that C#'s type inference was added only later.
In general, subtle changes in the language syntax and semantics can turn a "deterministic" type inference algorithm into an undecidable one.
ML, Haskell, Scala, F#, SML, and other languages can easily figure out the type from equivalent expressions in their own language, mainly because they were designed with type-inference in mind from the very start. C# wasn't, its type-inference was tacked on as a post-hoc solution to the problem of accessing anonymous types.
I speculate that true Hindley-Milner type-inference was never implemented for C# because its complicated to deduce types in a language so dependent on classes and inheritance. Let's say I have the following classes:
class Base { public void Print() { ... } }
class Derived1 : Base { }
class Derived2 : Base { }
And now I have this method:
var create() { return new Derived1(); }
What's the return type here? Is it Derived1, or should it be Base? For that matter, should it be object?
Ok, now lets say I have this method:
void doStuff(var someBase) { someBase.Print(); }
void Main()
{
doStuff(new Derived1());
doStuff(new Derived2()); // <-- type error or not?
}
The first call, doStuff(new Derived1()), presumably forces doStuff to the type doStuff(Derived1 someBase). Let's assume for now that we infer a concrete type instead of a generic type T.
What about the second call, doStuff(new Derived1())? Is it a type error, or do we generalize to doStuff<T>(T somebase) where T : Base instead? What if we made the same call in a separate, unreferenced assembly -- the type inference algorithm would have no idea whether to use the narrow type or the more genenarlized type. So we'd end up with two different type signatures based on whether method calls originate from inside the assembly or a foreign assembly.
You can't generalize wider types based on usage of the function. You basically need to settle on a single concrete type as soon as you know which concrete type is being pass in. So in the example code above, unless you explicitly cast up to the Base type, doStuff is constrained to accept types of Derived1 and the second call is a type error.
Now the trick here is settling on a type. What happens here:
class Whatever
{
void Foo() { DoStuff(new Derived1()); }
void Bar() { DoStuff(new Derived2()); }
void DoStuff(var x) { ... }
}
What's the type of DoStuff? For that matter, we know based on the above that one of the Foo or Bar methods contain a type error, but can you tell from looking which has the error?
Its not possible to resolve the type without changing the semantics of C#. In C#, order of method declaration has no impact on compilation (or at least it shouldn't ;) ). You might say instead that the method declared first (in this case, the Foo method) determines the type, so Bar has an error.
This works, but it also changes the semantics of C#: changes in method order will change the compiled type of the method.
But let's say we went further:
// Whatever.cs
class Whatever
{
public void DoStuff(var x);
}
// Foo.cs
class Foo
{
public Foo() { new Whatever().DoStuff(new Derived1()); }
}
// Bar.cs
class Bar
{
public Bar() { new Whatever().DoStuff(new Derived2()); }
}
Now the methods is being invoked from different files. What's the type? Its not possible to decide without imposing some rules on compilation order: if Foo.cs gets compiled before Bar.cs, the type is determined by Foo.cs.
While we can impose those sorts of rules on C# to make type inference work, it would drastically change the semantics of the language.
By contrast, ML, Haskell, F#, and SML support type inference so well because they have these sorts of restrictions: you can't call methods before they're declared, the first method call to inferred functions determines the type, compilation order has an impact on type inference, etc.
The "var" keyword is used in C# and VB.NET for type inference - you basically tell the C# compiler: "you figure out what the type is".
"var" is still strongly typed - you're just too lazy yourself to write out the type and let the compiler figure it out - based on the data type of the right-hand side of the assignment.
Here, in a method parameter, the compiler has no way of figuring out what you really meant. How? What type did you really mean? There's no way for the compiler to infer the type from the method definition - therefore it's not a valid statement.
Because c# is type safe and strong type language. At any place of your program compiler always knows the type of argument you are using. var keyword was just introduced to have variables of anonymus types.
Check dynamic in C# 4
Type inference is type inference, either in local expressions or global / interprocedural. So it isn't about "not having a right hand side", because in compiler theory, a procedure call is a form of "right hand side".
C# could do this if the compiler did global type inference, but it does not.
You can use "object" if you want a parameter that accepts anything, but then you need to deal with the runtime conversion and potential exceptions yourself.
"var" in C# isn't a runtime type binding, it is a compile time feature that ends up with a very specific type, but C# type inference is limited in scope.