I am reading "The D Programming Language" by Andrei Alexandrescu and one sentence puzzled me. Consider such code (p.138):
T[] find(T)(T[] haystack, T needle) {
    while (haystack.length > 0 && haystack[0] != needle) {
        haystack = haystack[1 .. $];
    }
    return haystack;
}
and call (p.140):
double[] a = [ 1.0, 2.5, 2.0, 3.4 ];
a = find(a, 2); // Error! 'find(double[], int)' undefined
Explanation (paragraph below the code):
If we squint hard enough, we do see that the intent of the caller in this case was to have T = double and benefit from the nice implicit conversion from int to double. However, having the language attempt combinatorially at the same time implicit conversions and type deduction is a dicey proposition in the general case, so D does not attempt to do all that.
I am puzzled because a language such as C# tries to infer the type: if it cannot, the user gets an error, but if it can, it simply works. C# has lived with this for several years, and I have not heard any story of this feature ruining somebody's day.
So my question is this: what dangers are involved in inferring types as in the example above?
I can see only advantages: it is easy to write a generic function and easy to call it. Otherwise you would have to introduce more parameters in the generic class or function and write special constraints expressing the allowed conversions, only for the sake of inferring types.
The first thing to note is that it doesn't say there's a problem with type deduction; it says there's a problem with doing type deduction and implicit conversion at the same time.
So, if you have:
a = find(a, 2.0);
Then it's happy to deduce double as the type.
And if T is explicitly given as double, it's happy to implicitly convert 2 to 2.0.
What it's not going to do is both at the same time.
Now, C# does do that. And I think for the most part I agree with you, it's generally convenient and generally works well enough.
It is also true that it can be confusing, especially in cases where it leads to more than one equally reasonable overload.
"Why type inference and implicit operator is not working in the following situations?" and "Why does Assert.AreEqual on custom struct with implicit conversion operator fail?" are both questions about why a particular combination of implicit conversion and type inference didn't work.
"Unexpected effect of implicit cast on delegate type inference" has other factors, but again, simply refusing to consider the method among the possible matches because the argument type didn't match would have meant the problem never happened.
They'd all be a lot simpler if the answer was always "because you are doing implicit conversion and type inference at the same time, and that's not supported", which is what the answer would be with D.
On the other hand, such problems don't arise that often, so I still favour the C# design decision to allow both. But the fact that there are some problems means that not allowing them is also a reasonable design decision.
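For comparison, here is a minimal C# sketch of the same find function. The names InferenceDemo and Find are made up for illustration and the array copy is just one way to return the tail; the only point is that the C# compiler accepts the call that D rejects, inferring T as double and converting the int argument.

```csharp
using System;
using System.Collections.Generic;

public static class InferenceDemo
{
    // Returns the tail of haystack starting at the first element equal to
    // needle (an empty array if needle is absent), mirroring the D find.
    public static T[] Find<T>(T[] haystack, T needle)
    {
        int start = 0;
        while (start < haystack.Length &&
               !EqualityComparer<T>.Default.Equals(haystack[start], needle))
        {
            start++;
        }

        var tail = new T[haystack.Length - start];
        Array.Copy(haystack, start, tail, 0, tail.Length);
        return tail;
    }

    public static void Main()
    {
        double[] a = { 1.0, 2.5, 2.0, 3.4 };

        // T is inferred as double from the array argument, and the int
        // literal 2 is then implicitly converted to double.
        double[] found = Find(a, 2);

        Console.WriteLine(found.Length); // two elements remain: 2.0 and 3.4
    }
}
```

Writing Find<double>(a, 2) would spell out the type argument, which corresponds to the one-thing-at-a-time style that D insists on.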
I am still in college and only remember hearing about one type of polymorphism when learning Java; however, when I was in a C# class, I remember my professor talking about 4 types of polymorphism.
I am only aware of subclassing and defining specific behavior within more specific classes, and being able to call those specific behaviors through a single method in the base class because of an interface signature.
What are the other types, and are they as important as the one type we were taught? Is that why they are not taught?
Yes, there are 4 kinds of polymorphism:
Overloading (same function name, different parameter types; this includes operator overloading). Compile time.
Parametric polymorphism (these are like templates in C++). Compile time.
Subtype polymorphism (if a function takes a parameter of a supertype, it also accepts any subtype; for example, with Honda a subtype of Car, a function f(Car) will accept a Honda as well). Runtime.
Parameter coercion (this is an implicit type conversion; for example, a function might require a double/real/float but will accept an int, implicitly converting the argument). Compile time.
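The four kinds listed above can be sketched in C# roughly as follows. All of the names here (Car, Honda, PolymorphismDemo and its methods) are hypothetical and exist only to give each kind a one-line illustration.

```csharp
using System;

public class Car
{
    public virtual string Name() { return "car"; }
}

public class Honda : Car
{
    public override string Name() { return "Honda"; }
}

public static class PolymorphismDemo
{
    // 1. Overloading: same name, different parameter types,
    //    chosen by the compiler from the argument's static type.
    public static string Describe(int value) { return "int"; }
    public static string Describe(string value) { return "string"; }

    // 2. Parametric polymorphism: one definition works for any T.
    public static T First<T>(T[] items) { return items[0]; }

    // 3. Subtype polymorphism: accepts any Car, and the override
    //    that runs is selected at runtime.
    public static string NameOf(Car car) { return car.Name(); }

    // 4. Coercion: declared to take a double, but an int argument
    //    is implicitly converted by the compiler.
    public static double Half(double value) { return value / 2; }

    public static void Main()
    {
        Console.WriteLine(Describe(1));                // overload for int
        Console.WriteLine(Describe("x"));              // overload for string
        Console.WriteLine(First(new[] { 7, 8, 9 }));   // T inferred as int
        Console.WriteLine(NameOf(new Honda()));        // runtime dispatch
        Console.WriteLine(Half(3) == 1.5);             // int 3 coerced to 3.0
    }
}
```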
Reference:
"On Understanding Types, Data Abstraction, and Polymorphism" by Cardelli & Wegner.
I noticed you can do this sort of thing in C#:
XNamespace c = "http://s.opencalais.com/1/pred/";
Notice the string value is implicitly converted to a different type. Are there other places this can be done? What are some common patterns and practices around this sort of thing?
This can happen whenever an implicit conversion operator is defined. All in all it is quite rare.
this should help
http://msdn.microsoft.com/en-us/library/z5z9kes2.aspx
Surprisingly, the first time I saw this was in the Wikipedia article about C# conversion operators; I've never actually seen anyone use it before. It seems like it would hurt readability and confuse a lot of developers...
Basically, XNamespace provides an operator that performs the implicit conversion.
I guess most common-sense guidelines apply: only use it where it makes sense, and avoid confusion. The biggest problem is unintended implicit conversion, which could open the door to programming errors. You can avoid this and still provide a conversion with an explicit conversion operator.
An example of a case where you would want an explicit conversion operator instead of an implicit one would be an integer class that allows conversion from a floating-point type; an implicit conversion would hide the truncation/rounding that has to take place and could thus make the user very confused (and probably be the source of bugs).
In my code I've used it a couple of times, for example in a very simple validation result struct which provided implicit conversion to bool (but not from). This allowed me to do if (result) { ... } (the jury is still out about the usefulness of this though :)).
Guess most of its use is for "simple" datatypes, like big integers, decimals and likewise.
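A minimal sketch of the two cases described above, with made-up type names (ValidationResult and WholeNumber are not from any library): an implicit conversion to bool that loses nothing surprising, and an explicit conversion from double that forces the caller to acknowledge the truncation.

```csharp
using System;

// Hypothetical validation result that can be used directly in an if.
public struct ValidationResult
{
    public bool IsValid { get; }
    public string Message { get; }

    public ValidationResult(bool isValid, string message)
    {
        IsValid = isValid;
        Message = message;
    }

    // Implicit: converting to bool is cheap and unsurprising.
    public static implicit operator bool(ValidationResult result)
    {
        return result.IsValid;
    }
}

// Hypothetical integer wrapper; conversion from double drops the fraction.
public struct WholeNumber
{
    public long Value { get; }

    public WholeNumber(long value)
    {
        Value = value;
    }

    // Explicit: the caller must write a cast, so the truncation is visible.
    public static explicit operator WholeNumber(double value)
    {
        return new WholeNumber((long)Math.Truncate(value));
    }
}

public static class OperatorDemo
{
    public static void Main()
    {
        var result = new ValidationResult(true, "looks fine");
        if (result) // uses the implicit conversion to bool
        {
            Console.WriteLine(result.Message);
        }

        WholeNumber n = (WholeNumber)3.9; // cast required; the fraction is dropped
        Console.WriteLine(n.Value);

        // WholeNumber m = 3.9; // would not compile: no implicit conversion
    }
}
```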
Eric Lippert's comments in this question have left me thoroughly confused. What is the difference between casting and conversion in C#?
Casting is a way of telling the compiler "Object X is really Type Y, go ahead and treat it as such."
Conversion is saying "I know Object X isn't Type Y, but there exists a way of creating a new Object from X of Type Y, go ahead and do it."
I believe what Eric is trying to say is:
Casting is a term describing syntax (hence the Syntactic meaning).
Conversion is a term describing what actions are actually taken behind the scenes (and thus the Semantic meaning).
A cast-expression is used to convert explicitly an expression to a given type.
And
A cast-expression of the form (T)E, where T is a type and E is a unary-expression, performs an explicit conversion (§13.2) of the value of E to type T.
Seems to back that up by saying that a cast operator in the syntax performs an explicit conversion.
I am reminded of the anecdote told by Richard Feynman where he is attending a philosophy class and the professor asks him "Feynman, you're a physicist, in your opinion is an electron an 'essential object'?" So Feynman asks the clarifying question "is a brick an essential object?" to the class. Every student has a different answer to that question. They say that the fundamental abstract notion of "brickness" is the essential object. No, one specific, unique brick is the essential object. No, the parts of the brick you can empirically observe are the essential object. And so on.
Which is of course not to answer your question.
I'm not going to go through all these dozen answers and debate with their authors about what I really meant. I'll write a blog article on the subject in a few weeks and we'll see if that throws any light on the matter.
How about an analogy though, a la Feynman. You wish to bake a loaf of banana bread Saturday morning (as I do almost every Saturday morning.) So you consult The Joy of Cooking, and it says "blah blah blah... In another bowl, whisk together the dry ingredients. ..."
Clearly there is a strong relationship between that instruction and your actions tomorrow morning, but equally clearly it would be a mistake to conflate the instruction with the action. The instruction consists of text. It has a location, on a particular page. It has punctuation. Were you to be in the kitchen whisking together flour and baking soda, and someone asked "what's your punctuation right now?", you'd probably think it was an odd question. The action is related to the instruction, but the textual properties of the instruction are not properties of the action.
A cast is not a conversion in the same way that a recipe is not the act of baking a cake. A recipe is text which describes an action, which you can then perform. A cast operator is text which describes an action - a conversion - which the runtime can then perform.
From the C# Spec 14.6.6:
A cast-expression is used to convert explicitly an expression to a given type.
...
A cast-expression of the form (T)E, where T is a type and E is a unary-expression, performs an explicit conversion (§13.2) of the value of E to type T.
So casting is a syntactic construct used to instruct the compiler to invoke explicit conversions.
From the C# Spec §13:
A conversion enables an expression of one type to be treated as another type. Conversions can be implicit or explicit, and this determines whether an explicit cast is required.
[Example: For instance, the conversion from type int to type long is implicit, so expressions of type int can implicitly be treated as type long. The opposite conversion, from type long to type int, is explicit, so an explicit cast is required.
So conversions are where the actual work gets done. You'll note that the cast-expression quote says that it performs explicit conversions but explicit conversions are a superset of implicit conversions, so you can also invoke implicit conversions (even if you don't have to) via cast-expressions.
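The int/long example from the spec can be written out as a small sketch. The class and method names are invented for illustration; unchecked is spelled out only so the narrowing behaves the same regardless of compiler settings.

```csharp
using System;

public static class ConversionDemo
{
    // Implicit conversion: int to long needs no cast.
    public static long Widen(int value)
    {
        return value;
    }

    // A cast-expression may also invoke that same implicit conversion,
    // even though the cast is not required.
    public static long WidenWithCast(int value)
    {
        return (long)value;
    }

    // Explicit conversion: long to int requires a cast because
    // the value may not fit.
    public static int Narrow(long value)
    {
        return unchecked((int)value);
    }

    public static void Main()
    {
        Console.WriteLine(Widen(42));
        Console.WriteLine(WidenWithCast(42));

        // 4294967297 is 2^32 + 1; only the low 32 bits survive, leaving 1.
        Console.WriteLine(Narrow(4294967297L));

        // int bad = 4294967297L; // would not compile: no implicit conversion
    }
}
```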
Just my understanding, probably much too simple:
When casting the essential data remains intact (same internal representation) - "I know this is a dictionary, but you can use it as an ICollection".
When converting, you are changing the internal representation to something else - "I want this int to be a string".
After reading Eric's comments, an attempt in plain English:
Casting means that the two types are actually the same at some level. They may implement the same interface or inherit from the same base class or the target can be "same enough" (a superset?) for the cast to work such as casting from Int16 to Int32.
Converting types then means that the two objects may be similar enough to be converted. Take for example a string representation of a number. It is a string, it cannot simply be cast into a number, it needs to be parsed and converted from one to the other, and, the process may fail. It may fail for casting as well but I imagine that's a much less expensive failure.
And that's the key difference between the two concepts I think. Conversion will entail some sort of parsing, or deeper analysis and conversion of the source data. Casting does not parse. It simply attempts a match at some polymorphic level.
Casting is the creation of a value of one type from another value of another type. Conversion is a type of casting in which the internal representation of the value must also be changed (rather than just its interpretation).
In C#, casting and converting are both done with a cast-expression:
( type ) unary-expression
The distinction is important (and the point is made in the comment) because only conversions may be created by a conversion-operator-declarator. Therefore, only (implicit or explicit) conversions may be created in code.
A non-conversion implicit cast is always available for subtype-to-supertype casts, and a non-conversion explicit cast is always available for supertype-to-subtype casts. No other non-conversion casts are allowed.
In this context, casting means that you are exposing an object of a given type for manipulation as some other type, conversion means that you are actually changing an object of a given type to an object of another type.
This page of the MSDN C# documentation suggests that a cast is a specific instance of conversion: the "explicit conversion." That is, a conversion of the form x = (int)y is a cast.
Automatic data type changes (such as myLong = myInt) are the more generic "conversion."
A cast is an operator on a class/struct. A conversion is a method/process on one or the other of the affected classes/structs, or may be in a completely different class/struct (e.g. Convert.ToInt32()).
Cast operators come in two flavors: implicit and explicit
Implicit cast operators indicate that data of one type (say, Int32) can always be represented as another type (decimal) without loss of data/precision.
int i = 25;
decimal d = i;
Explicit cast operators indicate that data of one type (decimal) can be converted to another type (int), but there may be loss of data/precision. Therefore the compiler requires you to explicitly state that you are aware of this and want to do it anyway, through use of the explicit cast syntax:
decimal d = 25.0001m;
int i = (int)d;
Conversion takes two types that are not necessarily related in any way, and attempts to convert one into the other through some process, such as parsing. If all known conversion algorithms fail, the process may either throw an exception or return a default value:
string s = "200";
int i = Convert.ToInt32(s); // sets i to 200 by parsing s
string t = "two hundred";
int j;
bool ok = int.TryParse(t, out j); // ok is false and j is 0 because the parse fails
// Convert.ToInt32(t) would instead throw a FormatException
Eric's references to syntactic conversion vs. semantic conversion are basically an operator vs. methodology distinction.
A cast is syntactical, and may or may not involve a conversion (depending on the type of cast). As you know, C++ allows specifying the type of cast you want to use.
Casting up/down the hierarchy may or may not be considered conversion, depending on who you ask (and what language they're talking about!)
Eric (C#) is saying that casting to a different type always involves a conversion, though that conversion may not even change the internal representation of the instance.
A C++-guy will disagree, since a static_cast might not result in any extra code (so the "conversion" is not actually real!)
Casting and Conversion are basically the same concept in C#, except that a conversion may be done using any method such as Object.ToString(). Casting is only done with the casting operator (T) E, that is described in other posts, and may make use of conversions or boxing.
Which conversion method does it use? The compiler decides based on the classes and libraries provided to the compiler at compile-time. If an implicit conversion exists, you are not required to use the casting operator. Object o = String.Empty. If only explicit conversions exist, you must use the casting operator. String s = (String) o.
You can create explicit and implicit conversion operators in your own classes. Note: conversions can make the data look very similar or nothing like the original type to you and me, but it's all defined by the conversion methods, and makes it legal to the compiler.
Casting always refers to the use of the casting operator. You can write
Object o = float.NaN;
String s = (String) o;
But this throws an InvalidCastException at runtime, at the cast itself rather than when s is later used (for example in a Console.WriteLine). The assignment to o only boxes the float; the cast operator then performs a runtime type check, which fails because a boxed float is not a String.
INSERTED EDIT#2: Isn't it hilariously inconsistent myopia that since I provided this answer, the question has been marked as a duplicate of a question which asks, "Is casting the same thing as converting?". And the answers of "No" are overwhelmingly upvoted. Yet my answer here, which points out the generative essence of why casts are not the same as conversions, is overwhelmingly downvoted (yet I have one +1 in the comments). I suppose that readers have a difficult time comprehending that casts apply at the denotational syntax/semantics layer and conversions apply at the operational semantics layer. For example, a cast of a reference (or pointer in C/C++), referring to a boxed data type, to another data type doesn't (in all languages and scenarios) generate a conversion of the boxed data. For example, in C float a[1]; int* p = (int*)&a; doesn't ensure that *p refers to int data.
A compiler compiles from denotational semantics to operational semantics. The compilation is not bijective, i.e. it isn't guaranteed to uncompile (e.g. Java, LLVM, asm.js, or C# bytecode) back to any denotational syntax which compiles to that bytecode (e.g. Scala, Python, C#, C via Emscripten, etc). Thus the two layers are not the same.
Thus most obviously a 'cast' and a 'conversion' are not the same thing. My answer here is pointing out that the terms apply to two different layers of semantics. Casts apply to the semantics of what the denotational layer (input syntax of the compiler) knows about. Conversions apply to the semantics of what the operational (runtime or intermediate bytecode) layer knows about. I used the standard term of 'erased' to describe what happens to denotational semantics that aren't explicitly recorded in the operational semantics layer.
For example, reified generics are an example of recording denotational semantics in the operational semantics layer, but they have the disadvantage of making the operational semantics layer incompatible with higher-order denotational semantics, e.g. this is why it was painful to consider implementing Scala's higher-kinded generics on C#'s CLR because C#'s denotational semantics for generics was hard-coded at the operational semantics layer.
Come on guys, stop downvoting someone who knows a lot more than you do. Do your homework first before you vote.
INSERTED EDIT: Casting is an operation that happens at the denotational semantics layer (where types are expressed in their full semantics). A cast may (e.g. explicit conversion) or may not (e.g. upcasting) cause a conversion at the runtime semantic layer. The downvotes on my answer (and the upvoting on Marc Gavin's comment) indicates to me that most people don't understand the differences between denotational semantics and operational (execution) semantics. Sigh.
I will state Eric Lippert's answer more simply and more generally for all languages, including C#.
A cast is syntax so (like all syntax) is erased at compile-time; whereas, a conversion causes some action at runtime.
That is a true statement for every computer language that I am aware of in the entire universe. Note that the above statement does not say that casting and conversions are mutually exclusive.
A cast may cause a conversion at runtime, but there are cases where it may not.
The reason we have two distinct words, i.e. cast and conversion, is we need a way to separately describe what is happening in syntax (the cast operator) and at runtime (conversion, or type check and possible conversion).
It is important that we maintain this separation-of-concepts, because in some programming languages the cast never causes a conversion. Also so that we understand implicit casting (e.g. upcasting) is happening only at compile-time. The reason I wrote this answer is because I want to help readers understand in terms of being multilingual with computer languages. And also to see how that general definition correctly applies in the C# case as well.
Also I wanted to help readers see how I generalize concepts in my mind, which helps me as computer language designer. I am trying to pass along the gift of a very reductionist, abstract way of thinking. But I am also trying to explain this in a very practical way. Please feel free to let me know in the comments if I need to improve the elucidation.
Eric Lippert wrote:
A cast is not a conversion in the same way that a recipe is not the act of baking a cake. A recipe is text which describes an action, which you can then perform. A cast operator is text which describes an action - a conversion - which the runtime can then perform.
The recipe is what is happening in syntax. Syntax is always erased, and replaced with either nothing or some runtime code.
For example, I can write a cast in C# that does nothing and is entirely erased at compile-time when it does not cause a change in the storage requirements or is upcasting. We can clearly see that a cast is just syntax, that makes no change to the runtime code.
int x = 1;
int y = (int)x;
Giraffe g = new Giraffe();
Animal a = (Animal)g;
That can be used for documentation purposes (yet noisy), but it is essential in languages that have type inference, where a cast is sometimes necessary to tell the compiler what type you wish it to infer.
For example, in Scala a None has the type Option[Nothing], where Nothing is the bottom type that is the sub-type of all possible types (not super-type). So sometimes when using None, the type needs to be cast to a specific type, because Scala only does local type inference and thus can't always infer the type you intended.
// (None : Option[Int]) casts None to Option[Int]
println(Some(7) <*> ((None : Option[Int]) <*> (Some(9) > add)))
A cast could know at compile-time that it requires a type conversion, e.g. int x = (int)1.5, or could require a type check and possible type conversion at runtime, e.g. downcasting. The cast (i.e. the syntax) is erased and replaced with the runtime action.
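The three situations just described (a cast that needs no runtime work, a cast whose conversion is known at compile time, and a cast that needs a runtime type check) might look like this in C#. Animal, Giraffe, Turtle and ReferenceCastDemo are illustrative names only.

```csharp
using System;

public class Animal { }
public class Giraffe : Animal { }
public class Turtle : Animal { }

public static class ReferenceCastDemo
{
    // Downcast: the cast syntax becomes a runtime type check that
    // throws InvalidCastException if the object is not a Giraffe.
    public static bool IsGiraffeByCast(Animal animal)
    {
        try
        {
            Giraffe g = (Giraffe)animal;
            return g != null;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    // Conversion known at compile time: the constant 1.5 is truncated to 1.
    public static int ConstantConversion()
    {
        return (int)1.5;
    }

    public static void Main()
    {
        // Upcast: always safe, so no runtime check is needed.
        Animal a = (Animal)new Giraffe();

        Console.WriteLine(IsGiraffeByCast(a));            // the object really is a Giraffe
        Console.WriteLine(IsGiraffeByCast(new Turtle())); // the runtime check fails
        Console.WriteLine(ConstantConversion());
    }
}
```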
Thus we can clearly see that equating all casts with explicit conversion, is an error of implication in the MSDN documentation. That documentation is intending to say that explicit conversion requires a cast operator, but it should not be trying to also imply that all casts are explicit conversions. I am confident that Eric Lippert can clear this up when he writes the blog he promised in his answer.
ADD: From the comments and chat, I can see that there is some confusion about the meaning of the term erased.
The term 'erased' is used to describe information that was known at compile-time, which is not known at runtime. For example, types can be erased in non-reified generics, and it is called type erasure.
Generally speaking all the syntax is erased, because generally CLI is not bijective (invertible, and one-to-one) with C#. You cannot always go backwards from some arbitrary CLI code back to the exact C# source code. This means information has been erased.
Those who say erased is not the right term, are conflating the implementation of a cast with the semantic of the cast. The cast is a higher-level semantic (I think it is actually higher than syntax, it is denotational semantics at least in case of upcasting and downcasting) that says at that level of semantics that we want to cast the type. Now how that gets done at runtime is entirely different level of semantics. In some languages it might always be a NOOP. For example, in Haskell all typing information is erased at compile-time.
I'm implementing an interpreter for a toy language in C#, and in order to do math in that language I'd like to implement a function like this:
public static object Add( object a, object b )
{
// return the sum of a and b
// each might be int, double, or one of many other numeric types
}
I can imagine a very silly and bad implementation of this function with tons of branches based on the types of a and b (using the is operator) and a bunch of casts, but my instinct is that there is a better way.
What do you think a good implementation of this function would be?
If:
you just want an easy-to-program solution
your little language has the same arithmetic rules as C#
you can use C# 4
you don't care particularly about performance
then you can simply do this:
public static object Add(dynamic left, dynamic right)
{
return left + right;
}
Done. What will happen is when this method is called, the code will start the C# compiler again and ask the compiler "what would you do if you had to add these two things, but you knew their runtime types at compile time?" (The Dynamic Language Runtime will then cache the result so that the next time someone tries to add two ints, the compiler doesn't start up again, they just re-use the lambda that the compiler handed back to the DLR.)
If you want to implement your own rules for addition, welcome to my world. There is no magic road that avoids lots of type checks and switches. There are literally hundreds of possible cases for addition of two arbitrary types and you have to check them all.
The way we handle this complexity in C# is we define addition operators on a smaller subset of types: int, uint, long, ulong, decimal, double, float, all enums, all delegates, string, and all the nullable versions of those value types. (Enums are then treated as being their underlying types, which simplifies things further.)
So for example when you're adding a ushort to a short we simplify the problem by saying that ushort and short are both special cases of int, and then solve the problem for adding two ints. That massively cuts down on the amount of code we have to write. But believe me, the binary operator overload resolution algorithm in C# is many thousands of lines of code. It's not an easy problem.
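To make the "treat small types as special cases of a bigger one" idea concrete, here is a deliberately tiny sketch. It is not how the C# compiler is implemented: the ToyArithmetic class, the three-level ranking and the choice to widen float to double are all assumptions made for the example, and types such as ulong and decimal are simply rejected.

```csharp
using System;

public static class ToyArithmetic
{
    // Hypothetical ranking: 0 = add as int, 1 = add as long, 2 = add as double.
    private static int Rank(object value)
    {
        switch (value)
        {
            case sbyte _:
            case byte _:
            case short _:
            case ushort _:
            case int _:
                return 0; // small integers are special cases of int
            case uint _:
            case long _:
                return 1; // uint does not fit in int, so widen to long
            case float _:
            case double _:
                return 2; // simplification: float is widened to double
            default:
                throw new ArgumentException(
                    "Unsupported operand: " +
                    (value == null ? "null" : value.GetType().Name));
        }
    }

    // Promotes both operands to the wider of their ranks, then adds.
    public static object Add(object a, object b)
    {
        switch (Math.Max(Rank(a), Rank(b)))
        {
            case 0:
                return Convert.ToInt32(a) + Convert.ToInt32(b);
            case 1:
                return Convert.ToInt64(a) + Convert.ToInt64(b);
            default:
                return Convert.ToDouble(a) + Convert.ToDouble(b);
        }
    }

    public static void Main()
    {
        Console.WriteLine(Add((short)1, (ushort)2).GetType().Name); // added as int
        Console.WriteLine(Add(1u, 2).GetType().Name);               // added as long
        Console.WriteLine(Add(1, 2.5).GetType().Name);              // added as double
    }
}
```

With ten input types but only three addition cases, the number of branches grows with the number of ranks rather than with the number of type pairs, which is the saving the paragraph above describes.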
If your toy language is intended to be a dynamic language with its own rules then you might consider implementing IDynamicMetaObjectProvider and using the DLR mechanisms for implementing arithmetic and other operations like function calling.
Convert your values to the widest type, for example decimal. Types like int, double, short etc. all implement the IConvertible interface ( http://msdn.microsoft.com/en-us/library/system.iconvertible_members.aspx). It exposes a ToDecimal method, which can be used to convert a value to the Decimal type. The Convert class is also very useful:
decimal aop = Convert.ToDecimal(a);
decimal bop = Convert.ToDecimal(b);
decimal sum = aop + bop;
return Convert.ChangeType(sum, a.GetType()); // Changing type from decimal to the type of the first operand.
One thing you can do is write your own object base for the toy language. It can be either a real class or an interface. That way, you can make sure that all your classes have some kind of functionality for all operations (even if it's only to throw a runtime NotSupported exception). You can do this using interface or abstract functions for the really common things (like ToString or Equals), or using message-passing or some other method for uncommon operations.
(p.s. I cross-posted with STO, but I like STO's idea for numeric types.)
The easiest way to do it right away would be to just test for the types as you said.
But since this is for a toy language implementation (good for you!), I would suggest using a better abstraction than object to pass values around in your interpreter. Maybe make a LanguageNameObject base class, and then you can add all the helper methods you want to help you implement this method. Basically, the Object class is a bad abstraction for you to work with... so build a better one!
Hmm, interesting. I would call GetType() on the objects coming in and test the results against the types you will allow Add operations to be performed on. I'd also imagine you'll throw an exception if oddball types are added together?
Ideally, however, if you're only going to allow adding int, float, double, etc., I would really just create overloaded versions of the Add method to handle those various cases.