Related
The following call to the overloaded Enumerable.Select method:
var itemOnlyOneTuples = "test".Select<char, Tuple<char>>(Tuple.Create);
fails with an ambiguity error (namespaces removed for clarity):
The call is ambiguous between the following methods or properties:
'Enumerable.Select<char,Tuple<char>>
(IEnumerable<char>,Func<char,Tuple<char>>)'
and
'Enumerable.Select<char,Tuple<char>>
(IEnumerable<char>, Func<char,int,Tuple<char>>)'
I can certainly understand why not specifying the type-arguments explicitly would result in an ambiguity (both the overloads would apply), but I don't see one after doing so.
It appears clear enough to me that the intention is to call the first overload, with the method-group argument resolving to Tuple.Create<char>(char). The second overload should not apply because none of the Tuple.Create overloads can be converted to the expected Func<char,int,Tuple<char>> type. I'm guessing the compiler is confused by Tuple.Create<char, int>(char, int), but its return-type is wrong: it returns a two-tuple, and is hence not convertible to the relevant Func type.
By the way, any of the following makes the compiler happy:
Specifying a type-argument for the method-group argument: Tuple.Create<char> (Perhaps this is actually a type-inference issue?).
Making the argument a lambda-expression instead of a method-group: x => Tuple.Create(x). (Plays well with type-inference on the Select call).
Unsurprisingly, trying to call the other overload of Select in this manner also fails:
var itemIndexTwoTuples = "test".Select<char, Tuple<char, int>>(Tuple.Create);
What's the exact problem here?
First off, I note that this is a duplicate of:
Why is Func<T> ambiguous with Func<IEnumerable<T>>?
What's the exact problem here?
Thomas's guess is essentially correct. Here are the exact details.
Let's go through it a step at a time. We have an invocation:
"test".Select<char, Tuple<char>>(Tuple.Create);
Overload resolution must determine the meaning of the call to Select. There is no method "Select" on string or any base class of string, so this must be an extension method.
There are a number of possible extension methods for the candidate set because string is convertible to IEnumerable<char> and presumably there is a using System.Linq; in there somewhere. There are many extension methods that match the pattern "Select, generic arity two, takes an IEnumerable<char> as the first argument when constructed with the given method type arguments".
In particular, two of the candidates are:
Enumerable.Select<char,Tuple<char>>(IEnumerable<char>,Func<char,Tuple<char>>)
Enumerable.Select<char,Tuple<char>>(IEnumerable<char>,Func<char,int,Tuple<char>>)
Now, the first question we face is are the candidates applicable? That is, is there an implicit conversion from each supplied argument to the corresponding formal parameter type?
An excellent question. Clearly the first argument will be the "receiver", a string, and it will be implicitly convertible to IEnumerable<char>. The question now is whether the second argument, the method group "Tuple.Create", is implicitly convertible to formal parameter types Func<char,Tuple<char>>, and Func<char,int, Tuple<char>>.
When is a method group convertible to a given delegate type? A method group is convertible to a delegate type when overload resolution would have succeeded given arguments of the same types as the delegate's formal parameter types.
That is, M is convertible to Func<A, R> if overload resolution on a call of the form M(someA) would have succeeded, given an expression 'someA' of type 'A'.
Would overload resolution have succeeded on a call to Tuple.Create(someChar)? Yes; overload resolution would have chosen Tuple.Create<char>(char).
Would overload resolution have succeeded on a call to Tuple.Create(someChar, someInt)? Yes, overload resolution would have chosen Tuple.Create<char,int>(char, int).
Since in both cases overload resolution would have succeeded, the method group is convertible to both delegate types. The fact that the return type of one of the methods would not have matched the return type of the delegate is irrelevant; overload resolution does not succeed or fail based on return type analysis.
One might reasonably say that convertibility from method groups to delegate types ought to succeed or fail based on return type analysis, but that's not how the language is specified; the language is specified to use overload resolution as the test for method group conversion, and I think that's a reasonable choice.
Therefore we have two applicable candidates. Is there any way that we can decide which is better than the other? The spec states that the conversion to the more specific type is better; if you have
void M(string s) {}
void M(object o) {}
...
M(null);
then overload resolution chooses the string version because string is more specific than object. Is one of those delegate types more specific than the other? No. Neither is more specific than the other. (This is a simplification of the better-conversion rules; there are actually lots of tiebreakers, but none of them apply here.)
Therefore there is no basis to prefer one over the other.
Again, one could reasonably say that sure, there is a basis, namely, that one of those conversions would produce a delegate return type mismatch error and one of them would not. Again, though, the language is specified to reason about betterness by considering the relationships between the formal parameter types, and not about whether the conversion you've chosen will eventually result in an error.
Since there is no basis upon which to prefer one over the other, this is an ambiguity error.
It is easy to construct similar ambiguity errors. For example:
void M(Func<int, int> f){}
void M(Expression<Func<int, int>> ex) {}
...
M(x=>Q(++x));
That's ambiguous. Even though it is illegal to have a ++ inside an expression tree, the convertibility logic does not consider whether the body of a lambda has something inside it that would be illegal in an expression tree. The conversion logic just makes sure that the types check out, and they do. Given that, there's no reason to prefer one of the M's over the other, so this is an ambiguity.
You note that
"test".Select<char, Tuple<char>>(Tuple.Create<char>);
succeeds. You now know why. Overload resolution must determine if
Tuple.Create<char>(someChar)
or
Tuple.Create<char>(someChar, someInt)
would succeed. Since the first one does and the second one does not, the second candidate is inapplicable and eliminated, and is therefore not around to become ambiguous.
You also note that
"test".Select<char, Tuple<char>>(x=>Tuple.Create(x));
is unambiguous. Lambda conversions do take into account the compatibility of the returned expression's type with the target delegate's return type. It is unfortunate that method groups and lambda expressions use two subtly different algorithms for determining convertibility, but we're stuck with it now. Remember, method group conversions have been in the language a lot longer than lambda conversions; had they been added at the same time, I imagine that their rules would have been made consistent.
I'm guessing the compiler is confused by Tuple.Create<char, int>(char, int), but its return-type is wrong: it returns a two-tuple.
The return type isn't part of the method signature, so it isn't considered during overload resolution; it's only verified after an overload has been picked. So as far as the compiler knows, Tuple.Create<char, int>(char, int) is a valid candidate, and it is neither better nor worse than Tuple.Create<char>(char), so the compiler can't decide.
The following call to the overloaded Enumerable.Select method:
var itemOnlyOneTuples = "test".Select<char, Tuple<char>>(Tuple.Create);
fails with an ambiguity error (namespaces removed for clarity):
The call is ambiguous between the following methods or properties:
'Enumerable.Select<char,Tuple<char>>
(IEnumerable<char>,Func<char,Tuple<char>>)'
and
'Enumerable.Select<char,Tuple<char>>
(IEnumerable<char>, Func<char,int,Tuple<char>>)'
I can certainly understand why not specifying the type-arguments explicitly would result in an ambiguity (both the overloads would apply), but I don't see one after doing so.
It appears clear enough to me that the intention is to call the first overload, with the method-group argument resolving to Tuple.Create<char>(char). The second overload should not apply because none of the Tuple.Create overloads can be converted to the expected Func<char,int,Tuple<char>> type. I'm guessing the compiler is confused by Tuple.Create<char, int>(char, int), but its return-type is wrong: it returns a two-tuple, and is hence not convertible to the relevant Func type.
By the way, any of the following makes the compiler happy:
Specifying a type-argument for the method-group argument: Tuple.Create<char> (Perhaps this is actually a type-inference issue?).
Making the argument a lambda-expression instead of a method-group: x => Tuple.Create(x). (Plays well with type-inference on the Select call).
Unsurprisingly, trying to call the other overload of Select in this manner also fails:
var itemIndexTwoTuples = "test".Select<char, Tuple<char, int>>(Tuple.Create);
What's the exact problem here?
First off, I note that this is a duplicate of:
Why is Func<T> ambiguous with Func<IEnumerable<T>>?
What's the exact problem here?
Thomas's guess is essentially correct. Here are the exact details.
Let's go through it a step at a time. We have an invocation:
"test".Select<char, Tuple<char>>(Tuple.Create);
Overload resolution must determine the meaning of the call to Select. There is no method "Select" on string or any base class of string, so this must be an extension method.
There are a number of possible extension methods for the candidate set because string is convertible to IEnumerable<char> and presumably there is a using System.Linq; in there somewhere. There are many extension methods that match the pattern "Select, generic arity two, takes an IEnumerable<char> as the first argument when constructed with the given method type arguments".
In particular, two of the candidates are:
Enumerable.Select<char,Tuple<char>>(IEnumerable<char>,Func<char,Tuple<char>>)
Enumerable.Select<char,Tuple<char>>(IEnumerable<char>,Func<char,int,Tuple<char>>)
Now, the first question we face is are the candidates applicable? That is, is there an implicit conversion from each supplied argument to the corresponding formal parameter type?
An excellent question. Clearly the first argument will be the "receiver", a string, and it will be implicitly convertible to IEnumerable<char>. The question now is whether the second argument, the method group "Tuple.Create", is implicitly convertible to formal parameter types Func<char,Tuple<char>>, and Func<char,int, Tuple<char>>.
When is a method group convertible to a given delegate type? A method group is convertible to a delegate type when overload resolution would have succeeded given arguments of the same types as the delegate's formal parameter types.
That is, M is convertible to Func<A, R> if overload resolution on a call of the form M(someA) would have succeeded, given an expression 'someA' of type 'A'.
Would overload resolution have succeeded on a call to Tuple.Create(someChar)? Yes; overload resolution would have chosen Tuple.Create<char>(char).
Would overload resolution have succeeded on a call to Tuple.Create(someChar, someInt)? Yes, overload resolution would have chosen Tuple.Create<char,int>(char, int).
Since in both cases overload resolution would have succeeded, the method group is convertible to both delegate types. The fact that the return type of one of the methods would not have matched the return type of the delegate is irrelevant; overload resolution does not succeed or fail based on return type analysis.
One might reasonably say that convertibility from method groups to delegate types ought to succeed or fail based on return type analysis, but that's not how the language is specified; the language is specified to use overload resolution as the test for method group conversion, and I think that's a reasonable choice.
Therefore we have two applicable candidates. Is there any way that we can decide which is better than the other? The spec states that the conversion to the more specific type is better; if you have
void M(string s) {}
void M(object o) {}
...
M(null);
then overload resolution chooses the string version because string is more specific than object. Is one of those delegate types more specific than the other? No. Neither is more specific than the other. (This is a simplification of the better-conversion rules; there are actually lots of tiebreakers, but none of them apply here.)
Therefore there is no basis to prefer one over the other.
Again, one could reasonably say that sure, there is a basis, namely, that one of those conversions would produce a delegate return type mismatch error and one of them would not. Again, though, the language is specified to reason about betterness by considering the relationships between the formal parameter types, and not about whether the conversion you've chosen will eventually result in an error.
Since there is no basis upon which to prefer one over the other, this is an ambiguity error.
It is easy to construct similar ambiguity errors. For example:
void M(Func<int, int> f){}
void M(Expression<Func<int, int>> ex) {}
...
M(x=>Q(++x));
That's ambiguous. Even though it is illegal to have a ++ inside an expression tree, the convertibility logic does not consider whether the body of a lambda has something inside it that would be illegal in an expression tree. The conversion logic just makes sure that the types check out, and they do. Given that, there's no reason to prefer one of the M's over the other, so this is an ambiguity.
You note that
"test".Select<char, Tuple<char>>(Tuple.Create<char>);
succeeds. You now know why. Overload resolution must determine if
Tuple.Create<char>(someChar)
or
Tuple.Create<char>(someChar, someInt)
would succeed. Since the first one does and the second one does not, the second candidate is inapplicable and eliminated, and is therefore not around to become ambiguous.
You also note that
"test".Select<char, Tuple<char>>(x=>Tuple.Create(x));
is unambiguous. Lambda conversions do take into account the compatibility of the returned expression's type with the target delegate's return type. It is unfortunate that method groups and lambda expressions use two subtly different algorithms for determining convertibility, but we're stuck with it now. Remember, method group conversions have been in the language a lot longer than lambda conversions; had they been added at the same time, I imagine that their rules would have been made consistent.
I'm guessing the compiler is confused by Tuple.Create<char, int>(char, int), but its return-type is wrong: it returns a two-tuple.
The return type isn't part of the method signature, so it isn't considered during overload resolution; it's only verified after an overload has been picked. So as far as the compiler knows, Tuple.Create<char, int>(char, int) is a valid candidate, and it is neither better nor worse than Tuple.Create<char>(char), so the compiler can't decide.
Just out of curiosity:
Many LINQ extension methods exist as both generic and non-generic variants, for example Any and Any<>, Where and Where<> etc. Writing my queries I usually use the non-generic variants and it works fine.
What would be the cases when one has to use generic methods?
--- edit ---
P.S.: I am aware of the fact that internally only generic methods are called and the compiler tries to resolve the content of the generic brackets <> during compilation.
My question is rather what are the cases then one has to provide the type explicitly and not to rely on the compiler's intuition?
Always. The C# compiler is smart enough to infer what the type of the method is based on the parameters. This is important when the type is anonymous, and thus has no name.
obj.SomeMethod(123); //these calls are the same
obj.SomeMethod<int>(123);
obj.SomeMethod(new { foo = 123 }); //what type would I write here?!
Edit: To be clear, you are always calling the generic method. It just looks like a non-generic method, since the compiler and Intellisense are smart.
Edit: To your updated question, you would want to be specific if you want to use a type that is not the type of the object you are passing. There are two such cases:
If the parameter implements an interface, and you want to operate on that interface, not the concrete type, then you should specify the interface:
obj.DoSomething<IEnumerable<Foo>>( new List<Foo>() );
If the parameter is implicitly convertible to another type, and you want to use the second type, then you should specify it:
obj.DoSomethingElse<long> ( 123 ); //123 is actually an int, but convertible to long
On the other hand, if you need a cast to do the conversion (or you insert one anyway), then you don't need to specify:
obj.DoYetAnotherThing( (Transformed)new MyThing() ); // calls DoYetAnotherThing<Transformed>
One example I ran into today:
ObjectSet<User> users = context.Users;
var usersThatMatch = criteria.Aggregate(users, (u, c) => u.Where(c));
The above code won't work because the .Where method doesn't return an ObjectSet<User>. You could get around this one of two ways. I could call .AsQueryable() on users, to make sure it's strongly typed as an IQueryable, or I could pass specific type arguments into the Aggregate method:
criteria.Aggregate<Func<User, bool>, IEnumerable<User>>(
PersonSet, (u, c) => u.Where(c));
Another couple of more common examples are the Cast and OfType methods, which have no way to infer what type you want, and in many cases are being called on a non-generic collection in the first place.
In general, the folks that designed the LINQ methods went out of their way to avoid the need to use explicit types in these generic methods, and for the most part you don't need to. I'd say it's best to know it's an option, but avoid doing it unless you find it necessary.
This seems odd to me, but I remember a thread where Eric Lippert commented on the inability (by design, or at least convention, I think) of C# to overload methods based on return type, so perhaps it's in some convoluted way related to that.
Is there any reason this does not work:
public static T Test<T>() where T : new()
{
return new T();
}
// Elsewhere
SomeObject myObj = Test();
But this does:
var myObj = Test<SomeObject>();
From a certain perspective, they're both fine, in that you're not Repeating Yourself (in a very small way), but is this just a different pass of the compiler?
First off, this is "overloading based on return type":
void M(int x){...}
int M(int x){...}
string M(int x){...}
The declarations are not legal; you can't overload a method based on return type because the return type is not part of the signature, and the signature has to be unique.
What you are talking about is method type inference based on the method's return type. We don't support that either.
The reason is because the return type might be what you are trying to figure out.
M(Test());
What's the return type of Test? That depends on which overload of M we choose. What overload of M do we choose? That depends on the return type of Test.
In general, C# is designed so that every subexpression has a type, and the types are worked out from the "inside" towards the "outside", not from the outside to the inside.
The notable exceptions are anonymous functions, method groups, and null:
M(x=>x+1)
What's the type of x=>x+1? It depends on which overload of M is called.
M(N); // N is a method group
what's the type of N? Again, it depends on which overload of M is called.
And so on. In these cases we do reason from "outside" to "inside".
Type inference involving lambdas is extremely complicated and was difficult to implement. We don't want to have that same complication and difficulty throughout the compiler.
Except for typeless expressions (null, method groups, and lambda expressions), the type of an expression must be statically determinable by the expression itself, regardless of context.
In other words, the type of an expression Test() cannot depend on what you're assigning it to.
Check C# Language Specification ยง7.5.2, the declaring type of a variable is not an attestation for type inference, and obviously it shouldn't be. Consider the following code:
Base b = Test<Derived>();
Derived d = Test<Derived>();
The return type of the method probably differs from the declaring type of the variable, since we have implicit convert in C#.
Type inference by the compiler doesn't use the "expected type" of an assignment as part of the logic.
So, the "scope of consideration" for type inference is not this:
SomeObject myObj = Test();
but this:
Test();
And, there are no clues here as to the expected type.
If you want an example of why the type of an expression needs to be able to be determined by the expression itself, consider the following two cases:
We don't use the return value at all - we're just calling the method for its side-effects.
We pass the return value directly into an overloaded method
Using the "expected type" of the return value when it comes to generic type resolution would introduce a whole whack of additional complexity into the compiler, and all you've gained is that sometimes you need to explicitly specify the type and sometimes you don't, and whether you need to or not can change based on unrelated changes elsewhere in the code.
I wonder why it is not possible a method parameter as var type like
private void myMethod(var myValue) {
// do something
}
You can only use var for variables inside the method body. Also the variable must be assigned at declaration and it must be possible to deduce the type unambiguously from the expression on the right-hand side.
In all other places you must specify a type, even if a type could in theory be deduced.
The reason is due to the way that the compiler is designed. A simplified description is that it first parses everything except method bodies and then makes a full analysis of the static types of every class, member, etc. It then uses this information when parsing the method bodies, and in particular for deducing the type of local variables declared as var. If var were allowed anywhere then it would require a large change to the way the compiler works.
You can read Eric Lippert's article on this subject for more details:
Why no var on fields?
Because the compiler determines the actual type by looking at the right hand side of the assignment. For example, here it is determined to be a string:
var s = "hello";
Here it is determined to be Foo:
var foo = new Foo();
In method arguments, there is no "right hand side of the assignment", so you can't use var.
See the posting by Eric Lippert about why var is not allowed on fields, which also contains the explanation why it doesn't work in method signatures:
Let me give you a quick oversimplification of how the C# compiler works. First we run through every source file and do a "top level only" parse. That is, we identify every namespace, class, struct, enum, interface, and delegate type declaration at all levels of nesting. We parse all field declarations, method declarations, and so on. In fact, we parse everything except method bodies; those, we skip and come back to them later.
[...]
if we have "var" fields then the type of the field cannot be determined until the expression is analyzed, and that happens after we already need to know the type of the field.
Please see Juliet's answer for a better answer to this question.
Because it was too hard to add full type inference to C#.
Other languages such as Haskell and ML can automatically infer the most general type without you having to declare it.
The other answers state that it's "impossible" for the compiler to infer the type of var but actually it is possible in principle. For example:
abstract void anotherMethod(double z, double w);
void myMethod<T>(T arg)
{
anotherMethod(arg, 2.0); // Now a compiler could in principle infer that arg must be of type double (but the actual C# compiler can't)
}
Have "var" method parameters is in principle the same thing as generic methods:
void myMethod<T>(T arg)
{
....
}
It is unfortunate that you can't just use the same syntax for both but this is probably due to the fact that that C#'s type inference was added only later.
In general, subtle changes in the language syntax and semantics can turn a "deterministic" type inference algorithm into an undecidable one.
ML, Haskell, Scala, F#, SML, and other languages can easily figure out the type from equivalent expressions in their own language, mainly because they were designed with type-inference in mind from the very start. C# wasn't, its type-inference was tacked on as a post-hoc solution to the problem of accessing anonymous types.
I speculate that true Hindley-Milner type-inference was never implemented for C# because its complicated to deduce types in a language so dependent on classes and inheritance. Let's say I have the following classes:
class Base { public void Print() { ... } }
class Derived1 : Base { }
class Derived2 : Base { }
And now I have this method:
var create() { return new Derived1(); }
What's the return type here? Is it Derived1, or should it be Base? For that matter, should it be object?
Ok, now lets say I have this method:
void doStuff(var someBase) { someBase.Print(); }
void Main()
{
doStuff(new Derived1());
doStuff(new Derived2()); // <-- type error or not?
}
The first call, doStuff(new Derived1()), presumably forces doStuff to the type doStuff(Derived1 someBase). Let's assume for now that we infer a concrete type instead of a generic type T.
What about the second call, doStuff(new Derived1())? Is it a type error, or do we generalize to doStuff<T>(T somebase) where T : Base instead? What if we made the same call in a separate, unreferenced assembly -- the type inference algorithm would have no idea whether to use the narrow type or the more genenarlized type. So we'd end up with two different type signatures based on whether method calls originate from inside the assembly or a foreign assembly.
You can't generalize wider types based on usage of the function. You basically need to settle on a single concrete type as soon as you know which concrete type is being pass in. So in the example code above, unless you explicitly cast up to the Base type, doStuff is constrained to accept types of Derived1 and the second call is a type error.
Now the trick here is settling on a type. What happens here:
class Whatever
{
void Foo() { DoStuff(new Derived1()); }
void Bar() { DoStuff(new Derived2()); }
void DoStuff(var x) { ... }
}
What's the type of DoStuff? For that matter, we know based on the above that one of the Foo or Bar methods contain a type error, but can you tell from looking which has the error?
Its not possible to resolve the type without changing the semantics of C#. In C#, order of method declaration has no impact on compilation (or at least it shouldn't ;) ). You might say instead that the method declared first (in this case, the Foo method) determines the type, so Bar has an error.
This works, but it also changes the semantics of C#: changes in method order will change the compiled type of the method.
But let's say we went further:
// Whatever.cs
class Whatever
{
public void DoStuff(var x);
}
// Foo.cs
class Foo
{
public Foo() { new Whatever().DoStuff(new Derived1()); }
}
// Bar.cs
class Bar
{
public Bar() { new Whatever().DoStuff(new Derived2()); }
}
Now the methods is being invoked from different files. What's the type? Its not possible to decide without imposing some rules on compilation order: if Foo.cs gets compiled before Bar.cs, the type is determined by Foo.cs.
While we can impose those sorts of rules on C# to make type inference work, it would drastically change the semantics of the language.
By contrast, ML, Haskell, F#, and SML support type inference so well because they have these sorts of restrictions: you can't call methods before they're declared, the first method call to inferred functions determines the type, compilation order has an impact on type inference, etc.
The "var" keyword is used in C# and VB.NET for type inference - you basically tell the C# compiler: "you figure out what the type is".
"var" is still strongly typed - you're just too lazy yourself to write out the type and let the compiler figure it out - based on the data type of the right-hand side of the assignment.
Here, in a method parameter, the compiler has no way of figuring out what you really meant. How? What type did you really mean? There's no way for the compiler to infer the type from the method definition - therefore it's not a valid statement.
Because c# is type safe and strong type language. At any place of your program compiler always knows the type of argument you are using. var keyword was just introduced to have variables of anonymus types.
Check dynamic in C# 4
Type inference is type inference, either in local expressions or global / interprocedural. So it isn't about "not having a right hand side", because in compiler theory, a procedure call is a form of "right hand side".
C# could do this if the compiler did global type inference, but it does not.
You can use "object" if you want a parameter that accepts anything, but then you need to deal with the runtime conversion and potential exceptions yourself.
"var" in C# isn't a runtime type binding, it is a compile time feature that ends up with a very specific type, but C# type inference is limited in scope.