I'm working on a typed scripting language backed by C# Expression Trees. I'm stuck on one issue around proper type conversion with binary operators. Here is an example of the behavior I'm trying to mimic (the conversion rules should be the same as the C# compiler's):
var value = "someString" + 10; // yields a string
var value = 5 + "someString"; // also yields a string, I don't know why
var x = 10f + 10; // yields a float
var y = 10 + 10f; // also yields a float, I don't know why
How does the C# compiler know to call ToString() on the integer in the first line and to convert the integers to floats in both directions when adding with a float? Are these conversion rules hard coded?
My compiler basically works like this now for binary operators:
Expression Visit(Type tryToConvertTo, ASTNode node) {
// details don't matter. If tryToConvertTo is not null
// the resulting expression is cast to that type if not already of that type
}
// very simplified but this is the gist of it
Expression VisitBinaryOperator(Operator op) { // 'operator' is a reserved word in C#, so the parameter is named 'op'
Expression lhs = Visit(null, op.lhs);
Expression rhs = Visit(lhs.Type, op.rhs); // wrong, but mostly works unless we hit one of the example cases or something similar
switch(op.opType) {
case OperatorType.Add: {
return Expression.Add(lhs, rhs);
}
// other operators / error handling etc omitted
}
}
I know always accepting the left hand side's type is wrong, but I have no idea what the proper approach to resolving the example expressions might be other than hard coding the rules for primitive types.
If anyone can point me in the right direction I'd be very grateful!
This kind of question can only be answered accurately via the language specification.
+ operator with string and int operands
https://github.com/dotnet/csharpstandard/blob/draft-v7/standard/expressions.md#1195-addition-operator
Here, under String concatenation you will see:
These overloads of the binary + operator perform string concatenation. If an operand of string concatenation is null, an empty string is substituted. Otherwise, any non-string operand is converted to its string representation by invoking the virtual ToString method inherited from type object. If ToString returns null, an empty string is substituted.
+ operator with float and int operands
The int here is implicitly converted to float, specified here:
https://github.com/dotnet/csharpstandard/blob/draft-v7/standard/conversions.md#1023-implicit-numeric-conversions
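For an expression-tree backend, the two spec rules above could be applied roughly as follows. This is only a sketch under stated assumptions: BinaryAdd, Rank, Promote and Eval are hypothetical names, the promotion table covers just int, long, float and double (the real rules also handle decimal, unsigned and small integer types), and string.Concat(object, object) stands in for whichever overload the C# compiler would actually pick.

```csharp
using System;
using System.Linq.Expressions;

static class BinaryAdd
{
    // Hypothetical, deliberately incomplete promotion order: a later entry wins.
    static readonly Type[] Rank = { typeof(int), typeof(long), typeof(float), typeof(double) };

    public static Expression Build(Expression lhs, Expression rhs)
    {
        // String concatenation: if either operand is a string, both are passed
        // to string.Concat(object, object), which calls ToString() on each.
        if (lhs.Type == typeof(string) || rhs.Type == typeof(string))
        {
            var concat = typeof(string).GetMethod("Concat", new[] { typeof(object), typeof(object) });
            return Expression.Call(concat,
                Expression.Convert(lhs, typeof(object)),
                Expression.Convert(rhs, typeof(object)));
        }

        // Numeric promotion: convert both operands to the wider of the two types.
        int l = Array.IndexOf(Rank, lhs.Type);
        int r = Array.IndexOf(Rank, rhs.Type);
        if (l < 0 || r < 0)
            throw new InvalidOperationException($"No + for {lhs.Type} and {rhs.Type}");

        Type target = Rank[Math.Max(l, r)];
        return Expression.Add(Promote(lhs, target), Promote(rhs, target));
    }

    static Expression Promote(Expression e, Type target) =>
        e.Type == target ? e : Expression.Convert(e, target);

    // Small helper for trying things out: compiles and runs a closed expression.
    public static object Eval(Expression e) =>
        Expression.Lambda<Func<object>>(Expression.Convert(e, typeof(object))).Compile()();
}
```

With this sketch, neither operand is privileged: Build(Constant(10), Constant(10f)) and Build(Constant(10f), Constant(10)) both produce a float addition, and a string on either side turns the whole expression into a concatenation.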
I recently saw an example where the following was demonstrated to work:
T Add<T>(dynamic a, dynamic b)
{
return a + b;
}
Add<string>("hello", "world"); // Returns "helloworld"
However, if I were to attempt to use expressions to create a "generic" Add function:
ParameterExpression left = Expression.Parameter(typeof(T), "left");
ParameterExpression right = Expression.Parameter(typeof(T), "right");
var add = Expression.Lambda<Func<T, T, T>>(Expression.Add(left, right), left, right).Compile(); // Fails with System.InvalidOperationException : The binary operator Add is not defined for the types 'System.String' and 'System.String' when T == String.
and then used this function with strings, it fails because the String type does not actually implement the + operator, but is simply syntactic sugar for String.Concat().
How then, does dynamic allow this to work? I figured that at runtime it is past the point where + would be rewritten using String.Concat().
dynamic uses runtime helper functions that replicate C# compiler rules. One of these rules allows + on string objects even when no operator is defined by the framework. The standard numeric types such as int have no custom operator overload either; that too is done by the compiler and needs to be performed at runtime when using dynamic. This is why you need a reference to Microsoft.CSharp.dll: dynamic cannot work without those helper functions.
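Those helper functions can also be reached from an expression tree through Expression.Dynamic. The following is a minimal sketch, assuming a reference to Microsoft.CSharp is available; DynamicAdd is a hypothetical name, and the binder arguments mirror what the compiler is understood to emit for a + b on dynamic operands rather than anything stated in this thread.

```csharp
using System;
using System.Linq.Expressions;
using Microsoft.CSharp.RuntimeBinder;

static class DynamicAdd
{
    public static Func<T, T, T> Build<T>()
    {
        var left = Expression.Parameter(typeof(T), "left");
        var right = Expression.Parameter(typeof(T), "right");

        // Ask the C# runtime binder for a '+' call site; the operator is
        // resolved at run time from the actual operand types.
        var binder = Microsoft.CSharp.RuntimeBinder.Binder.BinaryOperation(
            CSharpBinderFlags.None,
            ExpressionType.Add,
            typeof(DynamicAdd),
            new[]
            {
                CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.None, null),
                CSharpArgumentInfo.Create(CSharpArgumentInfoFlags.None, null),
            });

        // The dynamic operation yields object, so convert the result back to T.
        var body = Expression.Convert(
            Expression.Dynamic(binder, typeof(object), left, right),
            typeof(T));

        return Expression.Lambda<Func<T, T, T>>(body, left, right).Compile();
    }
}
```

Under these assumptions, DynamicAdd.Build<string>() concatenates and DynamicAdd.Build<int>() adds, at the cost of a run-time binding step on the first call for each operand type.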
Based on the documentation, maybe instead of Expression.Add(left, right) you could say Expression.Add(left, right, method) where method is the MethodInfo of the static String.Concat(String, String).
var method = typeof(string).GetMethod("Concat", new[] { typeof(string), typeof(string), });
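Plugged into the generic function from the question, that suggestion might look like the sketch below. GenericAdd is a hypothetical name, and only string is special-cased here; other types without an addition operator would still throw.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

static class GenericAdd
{
    public static Func<T, T, T> Build<T>()
    {
        var left = Expression.Parameter(typeof(T), "left");
        var right = Expression.Parameter(typeof(T), "right");

        // string defines no addition operator, so tell Expression.Add which
        // method implements '+'. A null method means "use the built-in or
        // user-defined operator", which is the normal Expression.Add behavior.
        MethodInfo method = typeof(T) == typeof(string)
            ? typeof(string).GetMethod("Concat", new[] { typeof(string), typeof(string) })
            : null;

        var body = Expression.Add(left, right, method);
        return Expression.Lambda<Func<T, T, T>>(body, left, right).Compile();
    }
}
```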
EDIT: Hmm, my answer sort of misses the point. The interesting question is: What operations does the runtime consider when it tries to resolve a + that the compiler has let through without type-checking? Built-in addition for numeric types? String concatenation? Delegate concatenation? User-defined operator overloads?
In your first example a and b are still strings (try this):
// Define other methods and classes here
T Add<T>(dynamic a, dynamic b)
{
Console.WriteLine(a.GetType());
Console.WriteLine(b.GetType());
return a + b;
}
Maybe this makes more sense?
void Main()
{
var x = Add<string>(new { val = "hello"},new { val = "world"}); // Returns "helloworld"
Console.WriteLine(x);
}
// Define other methods and classes here
T Add<T>(dynamic a, dynamic b)
{
return a.val + b.val;
}
Using the following code, all three indexer calls cause a compilation error (.NET 2):
var headers = new WebHeaderCollection();
var a = headers[0];
var b = headers[(int)0];
const int FIRST_HEADER = 0;
var c = headers[FIRST_HEADER];
All fail with: The call is ambiguous between the following methods or properties: 'System.Net.WebHeaderCollection.this[System.Net.HttpRequestHeader]' and 'System.Net.WebHeaderCollection.this[System.Net.HttpResponseHeader]'.
I can understand to some extent why (a) would fail, as the overloads accept the HttpRequestHeader/HttpResponseHeader enums; but (b) and (c) are implicitly cast to type int.
The following works:
var headers = new WebHeaderCollection();
int index = 0;
var d = headers[index];
I only came across this when writing some tests, and needed the ability to prove that an expected header was added (and in my scenario would always be the only one!)
Why do I have to declare a variable of type int to use this overload?
In all cases, the expression is deemed to be "a constant expression with value zero" - which is implicitly convertible to any enum type.
Your later code works because you're effectively losing the const-ness, so that removes the implicit conversion.
In fact, there's a bug in the C# compiler around this, which means it treats any constant expression with value zero, not just integer values, as convertible to any enum type - so this works too, but shouldn't:
HttpRequestHeader weird = 0.0;
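One step worth spelling out is why the int indexer does not simply win for a constant 0: in C#, once an indexer declared in the derived class is applicable, indexers declared in base classes are dropped from overload resolution. The int indexer is inherited from a base class, so only the two enum indexers compete. The sketch below reproduces the shape with hypothetical stand-in types (RequestHeader, ResponseHeader, BaseCollection, HeaderCollection) rather than the real WebHeaderCollection.

```csharp
using System;

enum RequestHeader { Accept }
enum ResponseHeader { Age }

class BaseCollection
{
    public string this[int index] => "int indexer";
}

class HeaderCollection : BaseCollection
{
    public string this[RequestHeader header] => "request indexer";
    public string this[ResponseHeader header] => "response indexer";
}

static class IndexerDemo
{
    public static string ViaVariable()
    {
        var headers = new HeaderCollection();
        // headers[0] would not compile here: the constant 0 converts to both
        // enum types, so the two derived indexers are ambiguous (CS0121).
        int index = 0;          // not a constant expression, so no enum conversion
        return headers[index];  // only the base-class int indexer is applicable
    }

    public static string ViaBaseType()
    {
        var headers = new HeaderCollection();
        // Going through the base type also sidesteps the enum indexers.
        return ((BaseCollection)headers)[0];
    }
}
```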
When trying to run the following code:
Expression<Func<string, string>> stringExpression = Expression.Lambda<Func<string, string>>(
Expression.Add(
stringParam,
Expression.Constant("A")
),
new List<ParameterExpression>() { stringParam }
);
string AB = stringExpression.Compile()("B");
I get the error referenced in the title: "The binary operator Add is not defined for the types 'System.String' and 'System.String'." Is that really the case? Obviously in C# it works. Is doing string s = "A" + "B" in C# special syntactic sugar that the expression compiler doesn't have access to?
It's absolutely right, yes. There is no such operator - the C# compiler converts string + string into a call to string.Concat. (This is important, because it means that x + y + z can be converted into string.Concat(x, y, z), which avoids creating intermediate strings pointlessly.)
Have a look at the docs for string operators - only == and != are defined by the framework.
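An expression-tree builder can imitate that flattening by emitting one Concat call for a whole chain of operands. This is a rough sketch, not what the C# compiler itself does: ConcatBuilder and ConcatAll are hypothetical names, and it always uses the string.Concat(string[]) overload, whereas the compiler chooses among several overloads.

```csharp
using System;
using System.Linq.Expressions;

static class ConcatBuilder
{
    // Builds a single string.Concat(string[]) call over all operands.
    // Every operand expression is assumed to be of type string.
    public static Expression ConcatAll(params Expression[] parts)
    {
        var concat = typeof(string).GetMethod("Concat", new[] { typeof(string[]) });
        return Expression.Call(concat, Expression.NewArrayInit(typeof(string), parts));
    }
}
```

For example, compiling ConcatAll over the constants "x", "y" and "z" produces one call that returns "xyz", with no intermediate "xy" string.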
This just caught me out too, and as Jon points out in his answer, the C# compiler converts string + string into string.Concat. There is an overload of the Expression.Add method that allows you to specify the "add" method to use.
var concatMethod = typeof(string).GetMethod("Concat", new[] { typeof(string), typeof(string) });
var addExpr = Expression.Add(Expression.Constant("hello "),Expression.Constant("world"), concatMethod);
If your operands are not both strings, you might need a different string.Concat overload, such as Concat(object, object).
Proving this works:
Console.WriteLine(Expression.Lambda<Func<string>>(addExpr).Compile()());
Will output:
hello world
Yeah, it's a surprise, isn't it! The compiler replaces it with a call to String.Concat.
I have the following code:
Func<string, bool> comparer = delegate(string value) {
return value != "0";
};
However, the following does not compile:
var comparer = delegate(string value) {
return value != "0";
};
Why can't the compiler figure out it is a Func<string, bool>? It takes one string parameter, and returns a boolean. Instead, it gives me the error:
Cannot assign anonymous method to an implicitly-typed local variable.
I have one guess and that is if the var version compiled, it would lack consistency if I had the following:
var comparer = delegate(string arg1, string arg2, string arg3, string arg4, string arg5) {
return false;
};
The above wouldn't make sense since Func<> allows only up to 4 arguments (in .NET 3.5, which is what I am using). Perhaps someone could clarify the problem. Thanks.
UPDATE: This answer was written over ten years ago and should be considered to be of historical interest; in C# 10 the compiler will infer some delegate types.
Others have already pointed out that there are infinitely many possible delegate types that you could have meant; what is so special about Func that it deserves to be the default instead of Predicate or Action or any other possibility? And, for lambdas, why is it obvious that the intention is to choose the delegate form, rather than the expression tree form?
But we could say that Func is special, and that the inferred type of a lambda or anonymous method is Func of something. We'd still have all kinds of problems. What types would you like to be inferred for the following cases?
var x1 = (ref int y)=>123;
There is no Func<T> type that takes a ref anything.
var x2 = y=>123;
We don't know the type of the formal parameter, though we do know the return. (Or do we? Is the return int? long? short? byte?)
var x3 = (int y)=>null;
We don't know the return type, but it can't be void. The return type could be any reference type or any nullable value type.
var x4 = (int y)=>{ throw new Exception(); }
Again, we don't know the return type, and this time it can be void.
var x5 = (int y)=> q += y;
Is that intended to be a void-returning statement lambda or something that returns the value that was assigned to q? Both are legal; which should we choose?
Now, you might say, well, just don't support any of those features. Just support "normal" cases where the types can be worked out. That doesn't help. How does that make my life easier? If the feature works sometimes and fails sometimes then I still have to write the code to detect all of those failure situations and give a meaningful error message for each. We still have to specify all that behaviour, document it, write tests for it, and so on. This is a very expensive feature that saves the user maybe half a dozen keystrokes. We have better ways to add value to the language than spending a lot of time writing test cases for a feature that doesn't work half the time and provides hardly any benefit in cases where it does work.
The situation where it is actually useful is:
var xAnon = (int y)=>new { Y = y };
because there is no "speakable" type for that thing. But we have this problem all the time, and we just use method type inference to deduce the type:
Func<A, R> WorkItOut<A, R>(Func<A, R> f) { return f; }
...
var xAnon = WorkItOut((int y)=>new { Y = y });
and now method type inference works out what the func type is.
Only Eric Lippert knows for sure, but I think it's because the signature of the delegate type doesn't uniquely determine the type.
Consider your example:
var comparer = delegate(string value) { return value != "0"; };
Here are two possible inferences for what the var should be:
Predicate<string> comparer = delegate(string value) { return value != "0"; }; // okay
Func<string, bool> comparer = delegate(string value) { return value != "0"; }; // also okay
Which one should the compiler infer? There's no good reason to choose one or the other. And although a Predicate<T> is functionally equivalent to a Func<T, bool>, they are still different types at the level of the .NET type system. The compiler therefore cannot unambiguously resolve the delegate type, and must fail the type inference.
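To make the "different types" point concrete, here is a small sketch (DelegateTypes and Demo are hypothetical names). The commented-out line is the assignment that the type system rejects; the line after it shows that an explicit delegate creation is needed to go from one type to the other.

```csharp
using System;

static class DelegateTypes
{
    public static bool Demo(string value)
    {
        Predicate<string> predicate = delegate(string v) { return v != "0"; };

        // Func<string, bool> func = predicate;  // does not compile: no implicit conversion
        Func<string, bool> func = new Func<string, bool>(predicate); // new delegate that calls predicate

        return func(value);
    }
}
```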
Eric Lippert has an old post about it where he says
And in fact the C# 2.0 specification calls this out. Method group expressions and anonymous method expressions are typeless expressions in C# 2.0, and lambda expressions join them in C# 3.0. Therefore it is illegal for them to appear "naked" on the right hand side of an implicit declaration.
Different delegates are considered different types. e.g., Action is different than MethodInvoker, and an instance of Action can't be assigned to a variable of type MethodInvoker.
So, given an anonymous delegate (or lambda) like () => {}, is it an Action or a MethodInvoker? The compiler can't tell.
Similarly, if I declare a delegate type taking a string argument and returning a bool, how would the compiler know you really wanted a Func<string, bool> instead of my delegate type? It can't infer the delegate type.
The following points are from the MSDN regarding Implicitly Typed Local Variables:
var can only be used when a local variable is declared and initialized in the same statement; the variable cannot be initialized to null, or to a method group or an anonymous function.
The var keyword instructs the compiler to infer the type of the variable from the expression on the right side of the initialization statement.
It is important to understand that the var keyword does not mean "variant" and does not indicate that the variable is loosely typed, or late-bound. It just means that the compiler determines and assigns the most appropriate type.
MSDN Reference: Implicitly Typed Local Variables
Considering the following regarding Anonymous Methods:
Anonymous methods enable you to omit the parameter list.
MSDN Reference: Anonymous Methods
I would suspect that since the anonymous method may actually have different method signatures, the compiler is unable to properly infer what the most appropriate type to assign would be.
My post doesn't answer the actual question, but it does answer the underlying question of:
"How do I avoid having to type out some fugly type like Func<string, string, int, CustomInputType, bool, ReturnType>?" [1]
Being the lazy/hacky programmer that I am, I experimented with using Func<dynamic, object> - which takes a single input parameter and returns an object.
For multiple arguments, you can use it like so:
dynamic myParams = new ExpandoObject();
myParams.arg0 = "whatever";
myParams.arg1 = 3;
Func<dynamic, object> y = (dynObj) =>
{
return dynObj.arg0.ToUpper() + (dynObj.arg1 * 45); //screw type casting, amirite?
};
Console.WriteLine(y(myParams));
Tip: You can use Action<dynamic> if you don't need to return an object.
Yeah I know it probably goes against your programming principles, but this makes sense to me and probably some Python coders.
I'm pretty novice at delegates... just wanted to share what I learned.
[1] This assumes that you aren't calling a method that requires a predefined Func as a parameter, in which case, you'll have to type that fugly string :/
Other answers were correct at the time they were written, but starting from C# 10.0 (from 2021), the compiler can infer a suitable delegate type (like some Func<...>, Action<...> or generated delegate type) in such cases.
See C# 10 Features - Lambda improvements.
var comparer = delegate(string value) {
return value != "0";
}; // OK in C# 10.0, picks 'Func<string, bool>' in this case
Of course the more usual syntax is to use =>, so:
var comparer = (string value) => {
return value != "0";
}; // OK in C# 10.0, picks 'Func<string, bool>' in this case
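A quick way to check what the compiler inferred is to inspect the delegate's runtime type. The sketch below assumes a C# 10 or later compiler; NaturalTypes and its method names are hypothetical. It also shows the explicit lambda return type syntax added in C# 10, which helps when the body alone (for example a bare null) does not determine a return type.

```csharp
using System;

static class NaturalTypes
{
    public static Type InferredComparerType()
    {
        var comparer = (string value) => value != "0";
        return comparer.GetType(); // the natural type chosen by the compiler
    }

    public static Type ExplicitReturnType()
    {
        // '(int y) => null' alone has no natural type, because null has no type;
        // stating the return type up front lets the compiler infer a delegate type.
        var f = object (int y) => null;
        return f.GetType();
    }
}
```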
How about this?
var item = new
{
toolisn = 100,
LangId = "ENG",
toolPath = (Func<int, string, string>) delegate(int toolisn, string LangId)
{
var path = "/Content/Tool_" + toolisn + "_" + LangId + "/story.html";
return File.Exists(Server.MapPath(path)) ? "<a style=\"vertical-align:super\" href=\"" + path + "\" target=\"_blank\">execute example</a> " : "";
}
};
string result = item.toolPath(item.toolisn, item.LangId);