Math operations on arbitrary objects in C#

I'm implementing an interpreter for a toy language in C#, and in order to do math in that language I'd like to implement a function like this:
public static object Add( object a, object b )
{
// return the sum of a and b
// each might be int, double, or one of many other numeric types
}
I can imagine a very silly and bad implementation of this function with tons of branches based on the types of a and b (using the is operator) and a bunch of casts, but my instinct is that there is a better way.
What do you think a good implementation of this function would be?

If:
you just want an easy-to-program solution
your little language has the same arithmetic rules as C#
you can use C# 4
you don't care particularly about performance
then you can simply do this:
public static object Add(dynamic left, dynamic right)
{
return left + right;
}
Done. What will happen is when this method is called, the code will start the C# compiler again and ask the compiler "what would you do if you had to add these two things, but you knew their runtime types at compile time?" (The Dynamic Language Runtime will then cache the result so that the next time someone tries to add two ints, the compiler doesn't start up again, they just re-use the lambda that the compiler handed back to the DLR.)
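For example, a small usage sketch (the results in the comments assume the Add method above):
object a = 1;                        // boxed int
object b = 2.5;                      // boxed double
object sum = Add(a, b);              // bound at runtime as int + double
Console.WriteLine(sum);              // 3.5
Console.WriteLine(sum.GetType());    // System.Double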
If you want to implement your own rules for addition, welcome to my world. There is no magic road that avoids lots of type checks and switches. There are literally hundreds of possible cases for addition of two arbitrary types and you have to check them all.
The way we handle this complexity in C# is we define addition operators on a smaller subset of types: int, uint, long, ulong, decimal, double, float, all enums, all delegates, string, and all the nullable versions of those value types. (Enums are then treated as being their underlying types, which simplifies things further.)
So for example when you're adding a ushort to a short we simplify the problem by saying that ushort and short are both special cases of int, and then solve the problem for adding two ints. That massively cuts down on the amount of code we have to write. But believe me, the binary operator overload resolution algorithm in C# is many thousands of lines of code. It's not an easy problem.
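If you do write the checks by hand, one way to keep the case count manageable is to borrow that idea: promote both operands to a small set of canonical types and do the arithmetic once per canonical type. A rough sketch only; it deliberately does not reproduce C#'s exact rules (for example, a negative long plus a ulong throws at runtime here instead of being rejected up front):
public static object Add(object a, object b)
{
    // Promote both operands to the widest numeric type either of them needs,
    // then do the arithmetic once in that type.
    if (a is double || b is double || a is float || b is float)
        return Convert.ToDouble(a) + Convert.ToDouble(b);
    if (a is decimal || b is decimal)
        return Convert.ToDecimal(a) + Convert.ToDecimal(b);
    if (a is ulong || b is ulong)
        return Convert.ToUInt64(a) + Convert.ToUInt64(b);
    if (a is long || b is long || a is uint || b is uint)
        return Convert.ToInt64(a) + Convert.ToInt64(b);
    // int, short, ushort, sbyte, byte and char all fit in int.
    return Convert.ToInt32(a) + Convert.ToInt32(b);
}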
If your toy language is intended to be a dynamic language with its own rules then you might consider implementing IDynamicMetaObjectProvider and using the DLR mechanisms for implementing arithmetic and other operations like function calling.
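If you do go down that road, one sketch of the shape it might take (using System.Dynamic.DynamicObject, which implements IDynamicMetaObjectProvider for you; the ToyValue name and its rules are invented here for illustration):
using System.Dynamic;
using System.Linq.Expressions;

// Hypothetical wrapper for values in the toy language.
public class ToyValue : DynamicObject
{
    private readonly object _value;
    public ToyValue(object value) { _value = value; }

    public override bool TryBinaryOperation(
        BinaryOperationBinder binder, object arg, out object result)
    {
        if (binder.Operation == ExpressionType.Add)
        {
            object right = arg is ToyValue other ? other._value : arg;
            // Your language's own addition rules would go here; this sketch
            // just defers to the runtime types of the wrapped values.
            result = new ToyValue((object)((dynamic)_value + (dynamic)right));
            return true;
        }
        result = null;
        return false;
    }

    public override string ToString() => _value?.ToString() ?? "null";
}

// dynamic x = new ToyValue(2);
// dynamic y = new ToyValue(3.5);
// Console.WriteLine(x + y);   // routed through TryBinaryOperation: 5.5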

Convert your values to the widest type, for example decimal. All the numeric types like int, double, short etc. implement the IConvertible interface (http://msdn.microsoft.com/en-us/library/system.iconvertible_members.aspx). It exposes a ToDecimal method, which can be used to convert a value to the Decimal type. The Convert class is also very useful:
decimal aop = Convert.ToDecimal(a);
decimal bop = Convert.ToDecimal(b);
decimal sum = aop + bop;
return Convert.ChangeType(sum, a.GetType()); // Convert the sum back from decimal to the type of the first operand.

One thing you can do is write your own object base for the toy language. It can be either a real class or an interface. That way, you can make sure that all your classes have some kind of functionality for all operations (even if it's only to throw a NotSupportedException at runtime). You can do this using interface or abstract methods for the really common things (like ToString or Equals), and message passing or some other mechanism for uncommon operations.
(p.s. I cross-posted with STO, but I like STO's idea for numeric types.)

The easiest way to do it right away would be to just test for the types, as you said.
But since this is for a toy language implementation (good for you!), I would suggest using a better abstraction than object to pass values around in your interpreter. Maybe make a LanguageNameObject base class; then you can add all the helper methods you need to implement this method. Basically, the Object class is a bad abstraction for you to work with... so build a better one!
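For example, a minimal sketch of such a base class (the names ToyObject and ToyNumber are invented for illustration):
using System;

// Hypothetical base type for every value in the toy language.
public abstract class ToyObject
{
    // Default behaviour: most values don't support '+'.
    public virtual ToyObject Add(ToyObject other) =>
        throw new NotSupportedException(
            $"Cannot add {GetType().Name} and {other.GetType().Name}");
}

public sealed class ToyNumber : ToyObject
{
    public double Value { get; }
    public ToyNumber(double value) => Value = value;

    // Numbers know how to add themselves to other numbers.
    public override ToyObject Add(ToyObject other) =>
        other is ToyNumber n ? new ToyNumber(Value + n.Value) : base.Add(other);
}
The interpreter's Add function then reduces to a single virtual call: return a.Add(b);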

Hmm, interesting. I would call GetType() on the incoming objects and test the results against the types you will allow Add operations on. I'd also imagine you'll throw an exception if oddball types are being added together?
Ideally, however, if you're only going to allow adding of int, float, double, etc., I would really just create overloaded versions of the Add method to handle those various cases.
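A minimal sketch of that overload idea (the object fallback at the end is my own addition):
public static int Add(int a, int b) => a + b;
public static double Add(double a, double b) => a + b;
public static double Add(int a, double b) => a + b;
public static double Add(double a, int b) => a + b;

// Fallback for anything else.
public static object Add(object a, object b) =>
    throw new NotSupportedException($"Cannot add {a?.GetType()} and {b?.GetType()}");
Note that overload resolution happens at compile time, so these overloads only help where the static types of the operands are known; two operands statically typed as object always hit the fallback, unless you dispatch through dynamic as in the approach above.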

Related

How to create type-safe numerical types?

Context
Suppose I'm writing an app about cakes.
I need to store the weight of cakes in kg and their FCR (frosting/chocolate ratio; I made that up). I can store these values as float. The problem I see with that is that I can assign a weight-in-kg value to an FCR field.
C#'s type system can prevent errors like this. If I create a class WeightInKg and a class FrostingChocolateRatio, I won't be able to assign one to the other.
Issue
I will then need to implement all numerical operators (+ - * / > < == etc.) again. These are already annoying to implement, because they are merely wrappers over the functionality of float. Moreover, since these structs are both based on float, all those wrapper methods are virtually identical. This will again be the case for every other such float-based type.
What I have tried / thought about
Good old OO inheritance. An abstract class FloatValue could provide all the numerical operations, but those operations would then return the FloatValue type instead of the (type-safe) subclass type. I also feel that a struct should be used for something that is inherently a bare-bones value, and structs don't support subclassing.
Generics. I am currently using this; a struct Quantity<T> with all the numerical operations implemented. For T, I then use empty "marker classes", which only exist to identify a certain type of quantity (e.g. FrostingChocolateRatio). This works, but constantly using Quantity<WhatIReallyWant> is awkward and produces more visual clutter in the code.
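A condensed sketch of that approach, using the names from the question (only a few of the operators shown):
// Marker types: never instantiated, used only as type arguments.
public sealed class WeightInKg { }
public sealed class FrostingChocolateRatio { }

public readonly struct Quantity<T>
{
    public float Value { get; }
    public Quantity(float value) => Value = value;

    public static Quantity<T> operator +(Quantity<T> a, Quantity<T> b) => new Quantity<T>(a.Value + b.Value);
    public static bool operator >(Quantity<T> a, Quantity<T> b) => a.Value > b.Value;
    public static bool operator <(Quantity<T> a, Quantity<T> b) => a.Value < b.Value;
}

// var weight = new Quantity<WeightInKg>(1.2f);
// var fcr = new Quantity<FrostingChocolateRatio>(0.8f);
// weight = fcr;                              // compile-time error: different types
// weight += new Quantity<WeightInKg>(0.3f);  // fine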
Question
Is this as close as I can get to what I want, or are there cleaner ways to have such type-safe values in C#?
Addition
As PaulF mentioned, FrostingChocolateRatio is not an ideal example because the math works differently for ratios. However, I'm out of creativity for today; just assume it behaves exactly like WeightInKg.
The point is that there are several different types of values which behave exactly like float, but it doesn't make sense to add a centimeter to a liter.


C# enums are similar to Java enums [duplicate]

What are some advantages of making enum in Java similar to a class, rather than just a collection of constants as in C/C++?
You get free compile time checking of valid values. Using
public static int OPTION_ONE = 0;
public static int OPTION_TWO = 1;
does not ensure
void selectOption(int option) {
...
}
will only accept 0 or 1 as a parameter value. Using an enum, that is guaranteed. Moreover, this leads to more self documenting code, because you can use code completion to see all enum values.
Type safety is one reason.
Another, that I find more important, is that you can attach metadata to enum values in Java. For example, you could use an enum to define the set of legal operations for a webservice, and then attach metadata for the type of request and data class:
AddItem(HttpMethod.POST, ProductEntry.class),
Java 5 enums originated from a typesafe enum pattern from Joshua Bloch's Effective Java (the first edition) to avoid the pitfalls of enums in C/C++/C# (which are simply thinly-veiled int constants) and the use in Java of final static int constants.
Primarily int constants and int enums aren't typesafe. You can pass in any int value. In C/C++ you can do this:
enum A { one, two, three };
enum B { beef, chicken, pork } b = beef;
void func(A a) { ... }
func((A)b);
Unfortunately the typesafe enum pattern from Effective Java had a lot of boilerplate, not all of it obvious. The most notable is you had to override the private method readResolve to stop Java creating new instances on deserialization, which would break simple reference checking (ie using the == operator instead of equals()).
So Java 5 enums offer these advantages over ints:
Type safety;
Java 5 enums can have behaviour and implement interfaces;
Java 5 enums have some extremely lightweight data structures like EnumSet and EnumMap.
Java 5 enums offer these advantages over just using classes:
Less error-prone boilerplate (private constructor, readResolve() etc);
Semantic correctness. You see something is an enum and you know it's just representing a value. You see a class and you're not sure. Maybe there's a static factory method somewhere, etc. Java 5 enums much more clearly indicate intent.
An enum is already a class in Java.
If you're asking why this is better, I'd say that better type safety and the ability to add other attributes besides a mere ordinal value would come to mind.
In addition to better type safety, you can also define custom behavior in your enums (refer to Effective Java for some good examples).
You can use enums to effectively implement Singletons ^^:
public enum Elvis {
INSTANCE
}
Making enum a reference type that can contain a fixed set of constants has led to efficient Map and Set implementations like EnumMap and EnumSet (JDK classes).
From javadoc of EnumMap :
A specialized Map implementation for use with enum type keys. All of the keys in an enum map must come from a single enum type that is specified, explicitly or implicitly, when the map is created. Enum maps are represented internally as arrays. This representation is extremely compact and efficient.
EnumMap combines the richness and type safety of Map with the speed of an array (Effective Java).
Enums are a type in themselves - you cannot use an enum value that does not exist, or put in some other similar-looking constant. Also, you can enumerate them, so the code can be more concise.
Using static constants could potentially cause maintenance nightmares - especially if they are spread out.
The only real advantage is that it can be used in a switch statement. All the other stuff an enum is capable of can just be done with a plain vanilla class with a private constructor, whose instances in turn are declared as public static final fields of the class in question (the typesafe enum pattern). The other advantage of enum is obviously that it makes the code less verbose than you would get with a plain vanilla class.
But if I'm not mistaken, in C++ (or was it C#?) you can use a String in a switch statement, so that advantage of enums in Java is negligible as opposed to C++. However, the same thing was proposed for Java 7; not sure if it will make it.
Benefits of Using Enumerations:
An object can be created to work in the same manner as an enumeration. In fact, enumerations were not even included in the Java language until version 5.0. However, enumerations make code more readable and provide less room for programmer error.
(OCA Java SE 7 Programmer I Study Guide)

Generic allowing only integers as type argument

I am writing a RationalNumber class in C# and would like to make it generic, but only allow integer types (int, byte, UInt32, my own BigInt class, ...) as type arguments - it doesn't make sense to have a rational number based on floats, or on regular objects like Control.
However, it doesn't seem that I can filter out non-integer types when declaring the class.
Did I overlook something?
No, you can't.
You also have the additional problem that there is no arithmetic constraint, so there is no statically typed way to use the operators of your type argument. You'll need to use dynamic, which is slower (unless they have improved the runtime/jitter since .NET 3.5).
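As a rough sketch of that dynamic approach (illustration only; it doesn't reduce fractions, validate the denominator, or constrain T):
public readonly struct RationalNumber<T>
{
    public T Numerator { get; }
    public T Denominator { get; }

    public RationalNumber(T numerator, T denominator)
    {
        Numerator = numerator;
        Denominator = denominator;
    }

    public static RationalNumber<T> operator +(RationalNumber<T> a, RationalNumber<T> b)
    {
        // No arithmetic constraint exists for T, so defer '+' and '*'
        // to the runtime type via dynamic.
        dynamic an = a.Numerator, ad = a.Denominator;
        dynamic bn = b.Numerator, bd = b.Denominator;
        return new RationalNumber<T>((T)(an * bd + bn * ad), (T)(ad * bd));
    }
}

// var half = new RationalNumber<int>(1, 2);
// var third = new RationalNumber<int>(1, 3);
// var sum = half + third;   // 5/6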
Some projects with similar problems didn't make the class generic at all, and used a code generator to specialize it instead.

Generic methods in .NET cannot have their return types inferred. Why?

Given:
static TDest Gimme<TSource,TDest>(TSource source)
{
return default(TDest);
}
Why can't I do:
string dest = Gimme(5);
without getting the compiler error:
error CS0411: The type arguments for method 'Whatever.Gimme<TSource,TDest>(TSource)' cannot be inferred from the usage. Try specifying the type arguments explicitly.
The 5 can be inferred as int, but there's a restriction where the compiler won't/can't resolve the return type as a string. I've read in several places that this is by design but no real explanation. I read somewhere that this might change in C# 4, but it hasn't.
Anyone know why return types cannot be inferred from generic methods? Is this one of those questions where the answer's so obvious it's staring you in the face? I hope not!
The general principle here is that type information flows only "one way", from the inside to the outside of an expression. The example you give is extremely simple. Suppose we wanted to have type information flow "both ways" when doing type inference on a method R G<A, R>(A a), and consider some of the crazy scenarios that creates:
N(G(5))
Suppose there are ten different overloads of N, each with a different argument type. Should we make ten different inferences for R? If we did, should we somehow pick the "best" one?
double x = b ? G(5) : 123;
What should the return type of G be inferred to be? Int, because the other half of the conditional expression is int? Or double, because ultimately this thing is going to be assigned to double? Now perhaps you begin to see how this goes; if you're going to say that you reason from outside to inside, how far out do you go? There could be many steps along the way. See what happens when we start to combine these:
N(b ? G(5) : 123)
Now what do we do? We have ten overloads of N to choose from. Do we say that R is int? It could be int or any type that int is implicitly convertible to. But of those types, which ones are implicitly convertible to an argument type of N? Do we write ourselves a little prolog program and ask the prolog engine to solve what are all the possible return types that R could be in order to satisfy each of the possible overloads on N, and then somehow pick the best one?
(I'm not kidding; there are languages that essentially do write a little prolog program and then use a logic engine to work out what the types of everything are. F# for example, does way more complex type inference than C# does. Haskell's type system is actually Turing Complete; you can encode arbitrarily complex problems in the type system and ask the compiler to solve them. As we'll see later, the same is true of overload resolution in C# - you cannot encode the Halting Problem in the C# type system like you can in Haskell but you can encode NP-HARD problems into overload resolution problems.) (See below)
This is still a very simple expression. Suppose you had something like
N(N(b ? G(5) * G("hello") : 123));
Now we have to solve this problem multiple times for G, and possibly for N as well, and we have to solve them in combination. We have five overload resolution problems to solve and all of them, to be fair, should be considering both their arguments and their context type. If there are ten possibilities for N then there are potentially a hundred possibilities to consider for N(N(...)) and a thousand for N(N(N(...))) and very quickly you would have us solving problems that easily had billions of possible combinations and made the compiler very slow.
This is why we have the rule that type information only flows one way. It prevents these sorts of chicken and egg problems, where you are trying to both determine the outer type from the inner type, and determine the inner type from the outer type and cause a combinatorial explosion of possibilities.
Notice that type information does flow both ways for lambdas! If you say N(x=>x.Length) then sure enough, we consider all the possible overloads of N that have function or expression types in their arguments and try out all the possible types for x. And sure enough, there are situations in which you can easily make the compiler try out billions of possible combinations to find the unique combination that works. The type inference rules that make it possible to do that for generic methods are exceedingly complex and make even Jon Skeet nervous. This feature makes overload resolution NP-HARD.
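For instance (my own illustration, not part of the original answer): with two overloads that differ only in the lambda's parameter type, the compiler must try each candidate type for x and keep the one whose body type-checks:
using System;

class Demo
{
    static int N(Func<string, int> f) => f("hello");
    static int N(Func<DateTime, int> f) => f(DateTime.Now);

    static void Main()
    {
        // The compiler tries x as string (x.Length exists) and as DateTime
        // (no Length property); only the string candidate survives, so this
        // binds to the first overload and prints 5.
        Console.WriteLine(N(x => x.Length));
    }
}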
Getting type information to flow both ways for lambdas so that generic overload resolution works correctly and efficiently took me about a year. It is such a complex feature that we only wanted to take it on if we absolutely positively would have an amazing return on that investment. Making LINQ work was worth it. But there is no corresponding feature like LINQ that justifies the immense expense of making this work in general.
UPDATE: It turns out that you can encode arbitrarily difficult problems in the C# type system. C# has nominal generic subtyping with generic contravariance, and it has been shown that you can build a Turing Machine out of generic type definitions and force the compiler to execute the machine, possibly going into infinite loops. At the time I wrote this answer the undecidability of such type systems was an open question. See https://stackoverflow.com/a/23968075/88656 for details.
You have to do:
string dest = Gimme<int, string>(5);
You need to specify what your types are in the call to the generic method. How could it know that you wanted a string in the output?
System.String is a bad example because it's a sealed class, but say it wasn't. How could the compiler know that you didn't want one of its subclasses instead if you didn't specify the type in the call?
Take this example:
System.Windows.Forms.Control dest = Gimme(5);
How would the compiler know what control to actually make? You'd need to specify it like so:
System.Windows.Forms.Control dest = Gimme<int, System.Windows.Forms.Button>(5);
Calling Gimme(5) and ignoring the return value is a legal statement; how would the compiler know which type to return?
I use this technique when I need to do something like that:
static void Gimme<T>(out T myVariable)
{
myVariable = default(T);
}
and use it like this:
Gimme(out int myVariable);
Print(myVariable); //myVariable is already declared and usable.
Note that inline declaration of out variables is available since C# 7.0
This was a design decision, I guess. I also find it useful while programming in Java.
Unlike Java, C# seems to be evolving towards a functional programming language, and you can get type inference the other way round, so you can have:
var dest = Gimme<int, string>(5);
which will infer the type of dest. I guess mixing this with the Java-style inference could prove to be fairly difficult to implement.
If a function is supposed to return one of a small number of types, you could have it return a class with defined widening conversions to those types. I don't think it's possible to do that in a generic way, since the widening conversion operator (CType) doesn't accept a generic type parameter.
public class ReturnString : IReq<string>
{
}
public class ReturnInt : IReq<int>
{
}
public interface IReq<T>
{
}
public class Handler
{
public T MakeRequest<T>(IReq<T> requestObject)
{
return default(T);
}
}
var handler = new Handler();
string stringResponse = handler.MakeRequest(new ReturnString());
int intResponse = handler.MakeRequest(new ReturnInt());
