How to write an extension method/class using two generics? [duplicate] - c#

Given:
static TDest Gimme<TSource,TDest>(TSource source)
{
return default(TDest);
}
Why can't I do:
string dest = Gimme(5);
without getting the compiler error:
error CS0411: The type arguments for method 'Whatever.Gimme<TSource,TDest>(TSource)' cannot be inferred from the usage. Try specifying the type arguments explicitly.
The 5 can be inferred as int, but there's a restriction where the compiler won't/can't resolve the return type as a string. I've read in several places that this is by design but no real explanation. I read somewhere that this might change in C# 4, but it hasn't.
Anyone know why return types cannot be inferred from generic methods? Is this one of those questions where the answer's so obvious it's staring you in the face? I hope not!

The general principle here is that type information flows only "one way", from the inside to the outside of an expression. The example you give is extremely simple. Suppose we wanted to have type information flow "both ways" when doing type inference on a method R G<A, R>(A a), and consider some of the crazy scenarios that creates:
N(G(5))
Suppose there are ten different overloads of N, each with a different argument type. Should we make ten different inferences for R? If we did, should we somehow pick the "best" one?
double x = b ? G(5) : 123;
What should the return type of G be inferred to be? Int, because the other half of the conditional expression is int? Or double, because ultimately this thing is going to be assigned to double? Now perhaps you begin to see how this goes; if you're going to say that you reason from outside to inside, how far out do you go? There could be many steps along the way. See what happens when we start to combine these:
N(b ? G(5) : 123)
Now what do we do? We have ten overloads of N to choose from. Do we say that R is int? It could be int or any type that int is implicitly convertible to. But of those types, which ones are implicitly convertible to an argument type of N? Do we write ourselves a little prolog program and ask the prolog engine to solve what are all the possible return types that R could be in order to satisfy each of the possible overloads on N, and then somehow pick the best one?
(I'm not kidding; there are languages that essentially do write a little prolog program and then use a logic engine to work out what the types of everything are. F# for example, does way more complex type inference than C# does. Haskell's type system is actually Turing Complete; you can encode arbitrarily complex problems in the type system and ask the compiler to solve them. As we'll see later, the same is true of overload resolution in C# - you cannot encode the Halting Problem in the C# type system like you can in Haskell but you can encode NP-HARD problems into overload resolution problems.) (See below)
This is still a very simple expression. Suppose you had something like
N(N(b ? G(5) * G("hello") : 123));
Now we have to solve this problem multiple times for G, and possibly for N as well, and we have to solve them in combination. We have five overload resolution problems to solve and all of them, to be fair, should be considering both their arguments and their context type. If there are ten possibilities for N then there are potentially a hundred possibilities to consider for N(N(...)) and a thousand for N(N(N(...))) and very quickly you would have us solving problems that easily had billions of possible combinations and made the compiler very slow.
This is why we have the rule that type information only flows one way. It prevents these sorts of chicken and egg problems, where you are trying to both determine the outer type from the inner type, and determine the inner type from the outer type and cause a combinatorial explosion of possibilities.
Notice that type information does flow both ways for lambdas! If you say N(x=>x.Length) then sure enough, we consider all the possible overloads of N that have function or expression types in their arguments and try out all the possible types for x. And sure enough, there are situations in which you can easily make the compiler try out billions of possible combinations to find the unique combination that works. The type inference rules that make it possible to do that for generic methods are exceedingly complex and make even Jon Skeet nervous. This feature makes overload resolution NP-HARD.
Getting type information to flow both ways for lambdas so that generic overload resolution works correctly and efficiently took me about a year. It is such a complex feature that we only wanted to take it on if we absolutely positively would have an amazing return on that investment. Making LINQ work was worth it. But there is no corresponding feature like LINQ that justifies the immense expense of making this work in general.
UPDATE: It turns out that you can encode arbitrarily difficult problems in the C# type system. C# has nominal generic subtyping with generic contravariance, and it has been shown that you can build a Turing Machine out of generic type definitions and force the compiler to execute the machine, possibly going into infinite loops. At the time I wrote this answer the undecidability of such type systems was an open question. See https://stackoverflow.com/a/23968075/88656 for details.

You have to do:
string dest = Gimme<int, string>(5);
You need to specify what your types are in the call to the generic method. How could it know that you wanted a string in the output?
System.String is a bad example because it's a sealed class, but say it wasn't. How could the compiler know that you didn't want one of its subclasses instead if you didn't specify the type in the call?
Take this example:
System.Windows.Forms.Control dest = Gimme(5);
How would the compiler know what control to actually make? You'd need to specify it like so:
System.Windows.Forms.Control dest = Gimme<int, System.Windows.Forms.Button>(5);

Calling Gimme(5) ignoring the return value is a legal statement how would the compiler know which type to return?

I use this technique when I need to do something like that:
static void Gimme<T>(out T myVariable)
{
myVariable = default(T);
}
and use it like this:
Gimme(out int myVariable);
Print(myVariable); //myVariable is already declared and usable.
Note that inline declaration of out variables is available since C# 7.0

This was a design decision I guess. I also find it useful while programming in Java.
Unlike Java, C# seems to evolve towards a functional programming language, and you can get type inference the other way round, so you can have:
var dest = Gimme<int, string>(5);
which will infer the type of dest. I guess mixing this and the java style inference could prove to be fairly difficult to implement.

If a function is supposed to return one of a small number of types, you could have it return a class with defined widening conversions to those types. I don't think it's possible to do that in a generic way, since the widening ctype operator doesn't accept a generic type parameter.

public class ReturnString : IReq<string>
{
}
public class ReturnInt : IReq<int>
{
}
public interface IReq<T>
{
}
public class Handler
{
public T MakeRequest<T>(IReq<T> requestObject)
{
return default(T);
}
}
var handler = new Handler();
string stringResponse = handler.MakeRequest(new ReturnString());
int intResponse = handler.MakeRequest(new ReturnInt());

Related

Infer generic type with two generic type parameters [duplicate]

This question already has answers here:
Partial generic type inference possible in C#?
(6 answers)
Closed 4 years ago.
I have the following method
public bool HasTypeAttribute<TAttribute, TType>(TType obj)
{
return typeof(TType).GetCustomAttribute<TAttribute>() != null;
}
and I want to be able to use it like this:
MyClass instance = new MyClass();
TypeHelper.HasTypeAttribute<SerializableAttribute>(instance);
but I can't get it working because of the
incorrect number of type parameters
so that I need to call
TypeHelper.HasTypeAttribute<SerializableAttribute, MyClass>(instance);
which certainly makes sense, but why can the compiler not infer the type of the passed object? Because if the method looked like this:
public void Demo<T>(T obj)
{
}
the compiler would certainly be able to infer the type, so that I can write
Foo.Demo(new Bar());
instead of
Foo.Demo<Bar>(new Bar());
So, is there a way to make type inference work in this case? Is it a design flaw by me or can I achieve what I want? Reordering the parameters doesn't help too...
You could break the call into multiple steps, which lets type inference kick in wherever it can.
public class TypeHelperFor<TType>
{
public bool HasTypeAttribute<TAttribute>() where TAttribute : Attribute
{
return typeof(TType).GetCustomAttribute<TAttribute>() != null;
}
}
public static class TypeHelper
{
public static TypeHelperFor<T> For<T>(this T obj)
{
return new TypeHelperFor<T>();
}
}
// The ideal, but unsupported
TypeHelper.HasTypeAttribute<SerializableAttribute>(instance);
// Chained
TypeHelper.For(instance).HasTypeAttribute<SerializableAttribute>();
// Straight-forward/non-chained
TypeHelper.HasTypeAttribute<SerializableAttribute, MyClass>(instance);
That should work alright for this case, but I'd warn against using it in cases where the final method returns void, because it's too easy to leave the chain half-formed if you're not doing anything with the return value.
e.g.
// If I forget to complete the chain here...
if (TypeHelper.For(instance)) // Compiler error
// But if I forget the last call on a chain focused on side-effects, like this one:
// DbHelper.For(table).Execute<MyDbOperationType>();
DbHelper.For(table); // Blissfully compiles but does nothing
// Whereas the non-chained version would protect me
DbHelper.Execute<MyTableType, MyDbOperationType>(table);
Because the C# rules don't allow that.
It would be feasible for C# to have a rule where if some of the types related to parameters (and hence could be inferred at least some of the time) and the number of explicitly given types was the same as the number of remaining un-inferrable types, then the two would work in tandem.
This would need someone to propose it, convince other people involved in the decision-making around C# it was a good idea, and then for it to be implemented. This has not happened.
Aside from the fact that features start off having to prove themselves worth the extra complexity they bring (add anything to a language and it is immediately more complex with more work, more chance of compiler bugs etc.) the question then is, is it a good idea?
On the plus, your code in this particular example would be better.
On the negative, everyone's code is now slightly more complicated, because there's more ways something that is wrong can be wrong, resulting either in code that fails at runtime rather than compile-time or less useful error messages.
That people already find some cases around inference to be confusing would point to the idea that adding another complicating case would not be helpful.
Which isn't to say that it is definitely a bad idea, just that there are pros and cons that make it a matter of opinion, rather than an obvious lack.

dynamic and generics in C#

As discovered in C 3.5, the following would not be possible due to type erasure: -
int foo<T>(T bar)
{
return bar.Length; // will not compile unless I do something like where T : string
}
foo("baz");
I believe the reason this doesn't work is in C# and java, is due to a concept called type erasure, see http://en.wikipedia.org/wiki/Type_erasure.
Having read about the dynamic keyword, I wrote the following: -
int foo<T>(T bar)
{
dynamic test = bar;
return test.Length;
}
foo("baz"); // will compile and return 3
So, as far as I understand, dynamic will bypass compile time checking but if the type has been erased, surely it would still be unable to resolve the symbol unless it goes deeper and uses some kind of reflection?
Is using the dynamic keyword in this way bad practice and does this make generics a little more powerful?
dynamics and generics are 2 completely different notions. If you want compile-time safety and speed use strong typing (generics or just standard OOP techniques such as inheritance or composition). If you do not know the type at compile time you could use dynamics but they will be slower because they are using runtime invocation and less safe because if the type doesn't implement the method you are attempting to invoke you will get a runtime error.
The 2 notions are not interchangeable and depending on your specific requirements you could use one or the other.
Of course having the following generic constraint is completely useless because string is a sealed type and cannot be used as a generic constraint:
int foo<T>(T bar) where T : string
{
return bar.Length;
}
you'd rather have this:
int foo(string bar)
{
return bar.Length;
}
I believe the reason this doesn't work is in C# and java, is due to a concept called type erasure, see http://en.wikipedia.org/wiki/Type_erasure.
No, this isn't because of type erasure. Anyway there is no type erasure in C# (unlike Java): a distinct type is constructed by the runtime for each different set of type arguments, there is no loss of information.
The reason why it doesn't work is that the compiler knows nothing about T, so it can only assume that T inherits from object, so only the members of object are available. You can, however, provide more information to the compiler by adding a constraint on T. For instance, if you have an interface IBar with a Length property, you can add a constraint like this:
int foo<T>(T bar) where T : IBar
{
return bar.Length;
}
But if you want to be able to pass either an array or a string, it won't work, because the Length property isn't declared in any interface implemented by both String and Array...
No, C# does not have type erasure - only Java has.
But if you specify only T, without any constraint, you can not use obj.Lenght because T can virtually be anything.
foo(new Bar());
The above would resolve to an Bar-Class and thus the Lenght Property might not be avaiable.
You can only use Methods on T when you ensure that T this methods also really has. (This is done with the where Constraints.)
With the dynamics, you loose compile time checking and I suggest that you do not use them for hacking around generics.
In this case you would not benefit from dynamics in any way. You just delay the error, as an exception is thrown in case the dynamic object does not contain a Length property. In case of accessing the Length property in a generic method I can't see any reason for not constraining it to types who definately have this property.
"Dynamics are a powerful new tool that make interop with dynamic languages as well as COM easier, and can be used to replace much turgid reflective code. They can be used to tell the compiler to execute operations on an object, the checking of which is deferred to runtime.
The great danger lies in the use of dynamic objects in inappropriate contexts, such as in statically typed systems, or worse, in place of an interface/base class in a properly typed system."
Qouted From Article
Thought I'd weigh-in on this one, because no one clarified how generics work "under the hood". That notion of T being an object is mentioned above, and is quite clear. What is not talked about, is that when we compile C# or VB or any other supported language, - at the Intermediate Language (IL) level (what we compile to) which is more akin to an assembly language or equivalent of Java Byte codes, - at this level, there is no generics! So the new question is how do you support generics in IL? For each type that accesses the generic, a non-generic version of the code is generated which substitutes the generic(s) such as the ubiquitous T to the actual type it was called with. So if you only have one type of generic, such as List<>, then that's what the IL will contain. But if you use many implementation of a generic, then many specific implementations are created, and calls to the original code substituted with the calls to the specific non-generic version. To be clear, a MyList used as: new MyList(), will be substituted in IL with something like MyList_string().
That's my (limited) understanding of what's going on. The point being, the benefit of this approach is that the heavy lifting is done at compile-time, and at runtime there's no degradation to performance - which is again, why generic are probably so loved used anywhere, and everywhere by .NET developers.
On the down-side? If a method or type is used many times, then the output assembly (EXE or DLL) will get larger and larger, dependent of the number of different implementation of the same code. Given the average size of DLLs output - I doubt you'll ever consider generics to be a problem.

Does C# support type inference of the return type?

This is just a curiousity about if there is a fundamental thing stopping something like this (or correct me if there's already some way):
public TTo Convert<TTo, TFrom>(TFrom from)
{
...
}
Called like this:
SomeType someType = converter.Convert(someOtherType);
Because what would happen if you did this?
static void M(int x){}
static void M(double x){}
static T N<T>() {}
...
M(N());
Now what is T? int or double?
It's all very easy to solve the problem when you know what the type you're assigning to is, but much of the time the type you're assigning to is the thing you're trying to figure out in the first place.
Reasoning from inside to outside is hard enough. Reasoning from outside to inside is far more difficult, and doing both at the same time is extremely difficult. If it is hard for the compiler to understand what is going on, imagine how hard it is for the human trying to read, understand and debug the code when inferences can be made both from and to the type of the context of an expression. This kind of inference makes programs harder to understand, not easier, and so it would be a bad idea to add it to C#.
Now, that said, C# does support this feature with lambda expressions. When faced with an overload resolution problem in which the lambda can be bound two, three, or a million different ways, we bind it two, three or a million different ways and then evaluate those million different possible bindings to determine which one is "the best". This makes overload resolution at least NP-HARD in C#, and it took me the better part of a year to implement. We were willing to make that investment because (1) lambdas are awesome, and (2) most of the time people write programs that can be analyzed in a reasonable amount of time and can be understood by humans. So it was worth the cost. But in general, that kind of advanced analysis is not worth the cost.
C# expressions always* have a fixed type, regardless of their surroundings.
You're asking for an expression whose type is determined by whatever it's assigned to; that would violate this principle.
*) except for lambda expressions, function groups, and the null literal.
Unlike Java, in C# type reference doesn't base on the return type. And don't ask me why, Eric Lippert had answered these "why can't C# ..." questions:
because no one ever designed, specified, implemented, tested,
documented and shipped that feature

Use of "var" type in variable declaration

Our internal audit suggests us to use explicit variable type declaration instead of using the keyword var. They argue that using of var "may lead to unexpected results in some cases".
I am not aware of any difference between explicit type declaration and using of var once the code is compiled to MSIL.
The auditor is a respected professional so I cannot simply refuse such a suggestion.
How about this...
double GetTheNumber()
{
// get the important number from somewhere
}
And then elsewhere...
var theNumber = GetTheNumber();
DoSomethingImportant(theNumber / 5);
And then, at some point in the future, somebody notices that GetTheNumber only ever returns whole numbers so refactors it to return int rather than double.
Bang! No compiler errors and you start seeing unexpected results, because what was previously floating-point arithmetic has now become integer arithmetic without anybody noticing.
Having said that, this sort of thing should be caught by your unit tests etc, but it's still a potential gotcha.
I tend to follow this scheme:
var myObject = new MyObject(); // OK as the type is clear
var myObject = otherObject.SomeMethod(); // Bad as the return type is not clear
If the return type of SomeMethod ever changes then this code will still compile. In the best case you get compile errors further along, but in the worst case (depending on how myObject is used) you might not. What you will probably get in that case is run-time errors which could be very hard to track down.
Some cases could really lead to unexpected results. I'm a var fan myself, but this could go wrong:
var myDouble = 2;
var myHalf = 1 / myDouble;
Obviously this is a mistake and not an "unexpected result". But it is a gotcha...
var is not a dynamic type, it is simply syntactic sugar. The only exception to this is with Anonymous types. From the Microsoft Docs
In many cases the use of var is optional and is just a syntactic convenience. However, when a variable is initialized with an anonymous type you must declare the variable as var if you need to access the properties of the object at a later point.
There is no difference once compiled to IL unless you have explicitly defined the type as different to the one which would be implied (although I can't think of why you would). The compiler will not let you change the type of a variable declared with var at any point.
From the Microsoft documentation (again)
An implicitly typed local variable is strongly typed just as if you had declared the type yourself, but the compiler determines the type
In some cases var can impeed readability. More Microsoft docs state:
The use of var does have at least the potential to make your code more difficult to understand for other developers. For that reason, the C# documentation generally uses var only when it is required.
In the non-generic world you might get different behavior when using var instead of the type whenever an implicit conversion would occur, e.g. within a foreach loop.
In the example below, an implicit conversion from object to XmlNode takes place (the non-generic IEnumerator interface only returns object). If you simply replace the explicit declaration of the loop variable with the var keyword, this implicit conversion no longer takes place:
using System;
using System.Xml;
class Program
{
static void Foo(object o)
{
Console.WriteLine("object overload");
}
static void Foo(XmlNode node)
{
Console.WriteLine("XmlNode overload");
}
static void Main(string[] args)
{
XmlDocument doc = new XmlDocument();
doc.LoadXml("<root><child/></root>");
foreach (XmlNode node in doc.DocumentElement.ChildNodes)
{
Foo(node);
}
foreach (var node in doc.DocumentElement.ChildNodes)
{
// oops! node is now of type object!
Foo(node);
}
}
}
The result is that this code actually produces different outputs depending on whether you used var or an explicit type. With var the Foo(object) overload will be executed, otherwise the Foo(XmlNode) overload will be. The output of the above program therefore is:
XmlNode overload
object overload
Note that this behavior is perfectly according to the C# language specification. The only problem is that var infers a different type (object) than you would expect and that this inference is not obvious from looking at the code.
I did not add the IL to keep it short. But if you want you can have a look with ildasm to see that the compiler actually generates different IL instructions for the two foreach loops.
It's an odd claim that using var should never be used because it "may lead to unexpected results in some cases", because there are subtleties in the C# language far more complex than the use of var.
One of these is the implementation details of anonymous methods which can lead to the R# warning "Access to modified closure" and behaviour that is very much not what you might expect from looking at the code. Unlike var which can be explained in a couple of sentences, this behaviour takes three long blog posts which include the output of a disassembler to explain fully:
The implementation of anonymous methods in C# and its consequences (part 1)
The implementation of anonymous methods in C# and its consequences (part 2)
The implementation of anonymous methods in C# and its consequences (part 3)
Does this mean that you also shouldn't use anonymous methods (i.e. delegates, lambdas) and the libraries that rely on them such as Linq or ParallelFX just because in certain odd circumstances the behaviour might not be what you expect?
Of course not.
It means that you need to understand the language you're writing in, know its limitations and edge cases, and test that things work as you expect them to. Excluding language features on the basis that they "may lead to unexpected results in some cases" would mean that you were left with very few language features to use.
If they really want to argue the toss, ask them to demonstrate that a number of your bugs can be directly attributed to use of var and that explicit type declaration would have prevented them. I doubt you'll hear back from them soon.
They argue that using of var "may lead
to unexpected results in some cases".to unexpected results in some cases".
If unexpected is, "I don't know how to read the code and figure out what it is doing," then yes, it may lead to unexpected results. The compiler has to know what type to make the variable based on the code written around the variable.
The var keyword is a compile time feature. The compiler will put in the appropriate type for the declaration. This is why you can't do things like:
var my_variable = null
or
var my_variable;
The var keyword is great because, you have to define less information in the code itself. The compiler figures out what it is supposed to do for you. It's almost like always programming to an interface when you use it (where the interface methods and properties are defined by what you use within the declaration space of the variable defined by var). If the type of a variable needs to change(within reason of course), you don't need to worry about changing the variable declaration, the compiler handles this for you. This may sound like a trivial matter, but what happens if you have to change the return value in a function, and that function is used all throughout the program. If you didn't use var, then you have to find and replace every place that variable is called. With the var keyword, you don't need to worry about that.
When coming up with guidelines, as an auditor has to do, it is probably better to err on the side of fool safe, that is white listing good practices / black listing bad practices as opposed to telling people to simply be sensible and do the right thing based on an assessment of the situation at hand.
If you just say "don't use var anywhere in code", you get rid of a lot of ambiguity in the coding guidelines. This should make code look & feel more standardized without having to solve the question of when to do this and when to do that.
I personally love var. I use it for all local variables. All the time. If the resulting type is not clear, then this is not an issue with var, but an issue with the (naming of) methods used to initialize a variable...
I follow a simple principle when it comes to using the var keyword. If you know the type beforehand, don't use var.
In most cases, I use var with linq as I might want to return an anonymous type.
var best using when you have obviously declaration
ArrayList<Entity> en = new ArrayList<Enity>()
complicates readability
var en = new ArrayList<Entity>()
Lazy, clear code, i like it
I use var only where it is clear what type the variable is, or where it is no need to know the type at all (e.g. GetPerson() should return Person, Person_Class, etc.).
I do not use var for primitive types, enum, and string. I also do not use it for value type, because value type will be copied by assignment so the type of variable should be declared explicitly.
About your auditor comments, I would say that adding more lines of code as we have been doing everyday also "lead to unexpected results in some cases". This argument validity has already proven by those bugs we created, therefore I would suggest freezing the code base forever to prevent that.
using var is lazy code if you know what the type is going to be. Its just easier and cleaner to read. When looking at lots and lots of code, easier and cleaner is always better
There is absolutely no difference in the IL output for a variable declaration using var and one explicitly specified (you can prove this using reflector). I generally only use var for long nested generic types, foreach loops and anonymous types, as I like to have everything explicitly specified. Others may have different preferences.
var is just a shorthand notation of using the explicit type declaration.
You can only use var in certain circumstances; You'll have to initialize the variable at declaration time when using var.
You cannot assign a variable that is of another type afterwards to the variable.
It seems to me that many people tend to confuse the 'var' keyword with the 'Variant' datatype in VB6 .
The "only" benefit that i see towards using explicit variable declaration, is with well choosen typenames you state the intent of your piece of code much clearer (which is more important than anything else imo). The var keyword's benefit really is what Pieter said.
I also think that you will run into trouble if you declare your doubles without the D on the end. when you compile the release version, your compiler will likely strip off the double and make them a float to save space since it will not consider your precision.
var will compile to the same thing as the Static Type that could be specified. It just removes the need to be explicit with that Type in your code. It is not a dynamic type and does not/can not change at runtime. I find it very useful to use in foreach loops.
foreach(var item in items)
{
item.name = ______;
}
When working with Enumerations some times a specific type is unknown of time consuming to lookup. The use of var instead of the Static Type will yeald the same result.
I have also found that the use of var lends it self to easier refactoring. When a Enumeration of a different type is used the foreach will not need to be updated.
Use of var might hide logical programming errors, that otherwise you would have got warning from the compiler or the IDE. See this example:
float distX = innerDiagramRect.Size.Width / (numObjInWidth + 1);
Here, all the types in the calculation are int, and you get a warning about possible loss of fraction because you pick up the result in a float variable.
Using var:
var distX = innerDiagramRect.Size.Width / (numObjInWidth + 1);
Here you get no warning because the type of distX is compiled as int. If you intended to use float values, this is a logical error that is hidden to you, and hard to spot in executing unless it triggers a divide by zero exception in a later calculation if the result of this initial calculation is <1.

Generic methods in .NET cannot have their return types inferred. Why?

Given:
static TDest Gimme<TSource,TDest>(TSource source)
{
return default(TDest);
}
Why can't I do:
string dest = Gimme(5);
without getting the compiler error:
error CS0411: The type arguments for method 'Whatever.Gimme<TSource,TDest>(TSource)' cannot be inferred from the usage. Try specifying the type arguments explicitly.
The 5 can be inferred as int, but there's a restriction where the compiler won't/can't resolve the return type as a string. I've read in several places that this is by design but no real explanation. I read somewhere that this might change in C# 4, but it hasn't.
Anyone know why return types cannot be inferred from generic methods? Is this one of those questions where the answer's so obvious it's staring you in the face? I hope not!
The general principle here is that type information flows only "one way", from the inside to the outside of an expression. The example you give is extremely simple. Suppose we wanted to have type information flow "both ways" when doing type inference on a method R G<A, R>(A a), and consider some of the crazy scenarios that creates:
N(G(5))
Suppose there are ten different overloads of N, each with a different argument type. Should we make ten different inferences for R? If we did, should we somehow pick the "best" one?
double x = b ? G(5) : 123;
What should the return type of G be inferred to be? Int, because the other half of the conditional expression is int? Or double, because ultimately this thing is going to be assigned to double? Now perhaps you begin to see how this goes; if you're going to say that you reason from outside to inside, how far out do you go? There could be many steps along the way. See what happens when we start to combine these:
N(b ? G(5) : 123)
Now what do we do? We have ten overloads of N to choose from. Do we say that R is int? It could be int or any type that int is implicitly convertible to. But of those types, which ones are implicitly convertible to an argument type of N? Do we write ourselves a little prolog program and ask the prolog engine to solve what are all the possible return types that R could be in order to satisfy each of the possible overloads on N, and then somehow pick the best one?
(I'm not kidding; there are languages that essentially do write a little prolog program and then use a logic engine to work out what the types of everything are. F# for example, does way more complex type inference than C# does. Haskell's type system is actually Turing Complete; you can encode arbitrarily complex problems in the type system and ask the compiler to solve them. As we'll see later, the same is true of overload resolution in C# - you cannot encode the Halting Problem in the C# type system like you can in Haskell but you can encode NP-HARD problems into overload resolution problems.) (See below)
This is still a very simple expression. Suppose you had something like
N(N(b ? G(5) * G("hello") : 123));
Now we have to solve this problem multiple times for G, and possibly for N as well, and we have to solve them in combination. We have five overload resolution problems to solve and all of them, to be fair, should be considering both their arguments and their context type. If there are ten possibilities for N then there are potentially a hundred possibilities to consider for N(N(...)) and a thousand for N(N(N(...))) and very quickly you would have us solving problems that easily had billions of possible combinations and made the compiler very slow.
This is why we have the rule that type information only flows one way. It prevents these sorts of chicken and egg problems, where you are trying to both determine the outer type from the inner type, and determine the inner type from the outer type and cause a combinatorial explosion of possibilities.
Notice that type information does flow both ways for lambdas! If you say N(x=>x.Length) then sure enough, we consider all the possible overloads of N that have function or expression types in their arguments and try out all the possible types for x. And sure enough, there are situations in which you can easily make the compiler try out billions of possible combinations to find the unique combination that works. The type inference rules that make it possible to do that for generic methods are exceedingly complex and make even Jon Skeet nervous. This feature makes overload resolution NP-HARD.
Getting type information to flow both ways for lambdas so that generic overload resolution works correctly and efficiently took me about a year. It is such a complex feature that we only wanted to take it on if we absolutely positively would have an amazing return on that investment. Making LINQ work was worth it. But there is no corresponding feature like LINQ that justifies the immense expense of making this work in general.
UPDATE: It turns out that you can encode arbitrarily difficult problems in the C# type system. C# has nominal generic subtyping with generic contravariance, and it has been shown that you can build a Turing Machine out of generic type definitions and force the compiler to execute the machine, possibly going into infinite loops. At the time I wrote this answer the undecidability of such type systems was an open question. See https://stackoverflow.com/a/23968075/88656 for details.
You have to do:
string dest = Gimme<int, string>(5);
You need to specify what your types are in the call to the generic method. How could it know that you wanted a string in the output?
System.String is a bad example because it's a sealed class, but say it wasn't. How could the compiler know that you didn't want one of its subclasses instead if you didn't specify the type in the call?
Take this example:
System.Windows.Forms.Control dest = Gimme(5);
How would the compiler know what control to actually make? You'd need to specify it like so:
System.Windows.Forms.Control dest = Gimme<int, System.Windows.Forms.Button>(5);
Calling Gimme(5) ignoring the return value is a legal statement how would the compiler know which type to return?
I use this technique when I need to do something like that:
static void Gimme<T>(out T myVariable)
{
myVariable = default(T);
}
and use it like this:
Gimme(out int myVariable);
Print(myVariable); //myVariable is already declared and usable.
Note that inline declaration of out variables is available since C# 7.0
This was a design decision I guess. I also find it useful while programming in Java.
Unlike Java, C# seems to evolve towards a functional programming language, and you can get type inference the other way round, so you can have:
var dest = Gimme<int, string>(5);
which will infer the type of dest. I guess mixing this and the java style inference could prove to be fairly difficult to implement.
If a function is supposed to return one of a small number of types, you could have it return a class with defined widening conversions to those types. I don't think it's possible to do that in a generic way, since the widening ctype operator doesn't accept a generic type parameter.
public class ReturnString : IReq<string>
{
}
public class ReturnInt : IReq<int>
{
}
public interface IReq<T>
{
}
public class Handler
{
public T MakeRequest<T>(IReq<T> requestObject)
{
return default(T);
}
}
var handler = new Handler();
string stringResponse = handler.MakeRequest(new ReturnString());
int intResponse = handler.MakeRequest(new ReturnInt());

Categories

Resources