Can somebody clarify the C# is keyword please. In particular these 2 questions:
Q1) line 5; Why does this return true?
Q2) line 7; Why no cast exception?
public void Test()
{
object intArray = new int[] { -100, -200 };
if (intArray is uint[]) //why does this return true?
{
uint[] uintArray = (uint[])intArray; //why no class cast exception?
for (int x = 0; x < uintArray.Length; x++)
{
Console.Out.WriteLine(uintArray[x]);
}
}
}
MSDN's description does not clarify the situation. It states that is will return true if either of these conditions are met. (http://msdn.microsoft.com/en-us/library/scekt9xw(VS.71).aspx>MDSN Article)
expression is not null.
expression can be cast to type.
I don't believe that you can do a valid cast of int[] into uint[]. Because:
A) This code does not compile:
int[] signed = new int[] { -100 };
uint[] unsigned = (uint[])signed;
B) Doing the cast in the debugger gives an error:
(uint[])signed
"Cannot convert type 'int[]' to 'uint[]'"
Sure enough, if line 3 was int[] instead of object then it would never compile. Which brings me to a final question related to Q2.
Q3) Why does C# raise a cast/conversion error in the debugger and compiler but not at runtime?
C# and the CLR have somewhat different conversion rules.
You can't directly cast between int[] and uint[] in C# because the language doesn't believe any conversion is available. However, if you go via object the result is up to the CLI. From the CLI spec section 8.7 (I hope - I'm quoting an email exchange I had on this topic with Eric Lippert a while ago):
Signed and unsigned integral primitive
types can be assigned to each other;
e.g., int8 := uint8 is valid. For this
purpose, bool shall be considered
compatible with uint8 and vice versa,
which makes bool := uint8 valid, and
vice versa. This is also true for
arrays of signed and unsigned integral
primitive types of the same size;
e.g., int32[] := uint32[] is valid.
(I haven't checked, but I assume that this sort of reference type conversion being valid is what makes is return true as well.)
It's somewhat unfortunate that there are disconnects between the language and the underlying execution engine, but it's pretty much unavoidable in the long run, I suspect. There are a few other cases like this, but the good news is that they rarely seem to cause significant harm.
EDIT: As Marc deleted his answer, I've linked to the full mail from Eric, as posted to the C# newsgroup.
Now that's interesting. I found this in the ECMA-335 standard. 4.3 castclass. Note that:
Arrays inherit from System.Array.
If Foo can be cast to Bar, then Foo[] can be cast to Bar[].
For the purposes of note 2 above, enums are treated as their underlying type: thus E1[] can be cast to E2[] if E1 and E2 share an underlying type.
You can cast int to uint, but that it behaves like this is very strange. Visual Studio does not recognize any of this, even the watch, when the debugger is attached just shows a question mark '?'.
You might wanna take a look at this, fast forward about 10 minutes in and listen to Anders explain the co-variant array implementation. I think that is the fundamentally underlying issue here.
Suggestion:
Declaring intArray as "int [] intArray" rather then "object intArray" will allow the compiler to pick up the invalid C# cast. Unless you absolutely have to use object, I would take that approach.
Re Q2,Q3:
At runtime have you tried wrapping the cast in a checked block?
From this article at MSDN:
By default, an expression that
contains only constant values causes a
compiler error if the expression
produces a value that is outside the
range of the destination type. If the
expression contains one or more
non-constant values, the compiler does
not detect the overflow.
...
By default, these non-constant
expressions are not checked for
overflow at run time either, and they
do not raise overflow exceptions. The
previous example displays
-2,147,483,639 as the sum of two positive integers.
Overflow checking can be enabled by
compiler options, environment
configuration, or use of the checked
keyword.
As it says, you can enforce overflow checking more globally via a compiler setting or environment config.
In your case this is probably desirable as it will cause a runtime error to be thrown that will ensure the likely invalid unsigned number to signed number overflow will not occur silently.
[Update] After testing this code, I found that using a declaration of type object instead of int [] appears to bypass the standard C# casting sytax, regardless of whether checked is enabled or not.
As JS has said, when you use object, you are bound by CLI rules and these apparently allow this to occur.
Re Q1:
This is related to the above. In short, because the cast involved it does not throw an exception (based on current overflow setting). Whether this is a good idea is another question.
From MSDN:
An "is" expression evaluates to true if the provided expression is non-null, and the
provided object can be cast to the provided type without causing an exception to be
thrown.
I'm guessing backwards compatablility with .NET 1: I'm still a bit fuzzy about the details, but I beleive the CLR type of all arrays is simply System.Array, with extra Type properties to lookup the element type. 'is' probably just didn't account for that in CLR v1, and now must maintain that.
It not working in the (uint[])(new int[]{}) case is probably due to the C# compiler (not the CLR runtime) being able to do stricter typechecking.
Also, arrays are just type unsafe in general:
Animal[] tigers = new Tiger[10];
tigers[3] = new Elephant(); // ArrayTypeMismatchException
OK,
I attempt to take a stab at this.
First off, the documentation says, "Checks if an object is compatible with a given type.", It also says that IF the type on the left is "castable" (you can convert without exception) to the type on the right, and the expression evaluates to non-null, the "is" keyword will evaluate to true.
Look to Jon Skeet for the other answer. He said it more eloquently than I could. He's right, if a conversion is available, it will accept your syntax, you could subsequently write your own, but it would seem overkill in this situation.
Reference: http://msdn.microsoft.com/en-us/library/scekt9xw(VS.80).aspx
Related
Can somebody clarify the C# is keyword please. In particular these 2 questions:
Q1) line 5; Why does this return true?
Q2) line 7; Why no cast exception?
public void Test()
{
object intArray = new int[] { -100, -200 };
if (intArray is uint[]) //why does this return true?
{
uint[] uintArray = (uint[])intArray; //why no class cast exception?
for (int x = 0; x < uintArray.Length; x++)
{
Console.Out.WriteLine(uintArray[x]);
}
}
}
MSDN's description does not clarify the situation. It states that is will return true if either of these conditions are met. (http://msdn.microsoft.com/en-us/library/scekt9xw(VS.71).aspx>MDSN Article)
expression is not null.
expression can be cast to type.
I don't believe that you can do a valid cast of int[] into uint[]. Because:
A) This code does not compile:
int[] signed = new int[] { -100 };
uint[] unsigned = (uint[])signed;
B) Doing the cast in the debugger gives an error:
(uint[])signed
"Cannot convert type 'int[]' to 'uint[]'"
Sure enough, if line 3 was int[] instead of object then it would never compile. Which brings me to a final question related to Q2.
Q3) Why does C# raise a cast/conversion error in the debugger and compiler but not at runtime?
C# and the CLR have somewhat different conversion rules.
You can't directly cast between int[] and uint[] in C# because the language doesn't believe any conversion is available. However, if you go via object the result is up to the CLI. From the CLI spec section 8.7 (I hope - I'm quoting an email exchange I had on this topic with Eric Lippert a while ago):
Signed and unsigned integral primitive
types can be assigned to each other;
e.g., int8 := uint8 is valid. For this
purpose, bool shall be considered
compatible with uint8 and vice versa,
which makes bool := uint8 valid, and
vice versa. This is also true for
arrays of signed and unsigned integral
primitive types of the same size;
e.g., int32[] := uint32[] is valid.
(I haven't checked, but I assume that this sort of reference type conversion being valid is what makes is return true as well.)
It's somewhat unfortunate that there are disconnects between the language and the underlying execution engine, but it's pretty much unavoidable in the long run, I suspect. There are a few other cases like this, but the good news is that they rarely seem to cause significant harm.
EDIT: As Marc deleted his answer, I've linked to the full mail from Eric, as posted to the C# newsgroup.
Now that's interesting. I found this in the ECMA-335 standard. 4.3 castclass. Note that:
Arrays inherit from System.Array.
If Foo can be cast to Bar, then Foo[] can be cast to Bar[].
For the purposes of note 2 above, enums are treated as their underlying type: thus E1[] can be cast to E2[] if E1 and E2 share an underlying type.
You can cast int to uint, but that it behaves like this is very strange. Visual Studio does not recognize any of this, even the watch, when the debugger is attached just shows a question mark '?'.
You might wanna take a look at this, fast forward about 10 minutes in and listen to Anders explain the co-variant array implementation. I think that is the fundamentally underlying issue here.
Suggestion:
Declaring intArray as "int [] intArray" rather then "object intArray" will allow the compiler to pick up the invalid C# cast. Unless you absolutely have to use object, I would take that approach.
Re Q2,Q3:
At runtime have you tried wrapping the cast in a checked block?
From this article at MSDN:
By default, an expression that
contains only constant values causes a
compiler error if the expression
produces a value that is outside the
range of the destination type. If the
expression contains one or more
non-constant values, the compiler does
not detect the overflow.
...
By default, these non-constant
expressions are not checked for
overflow at run time either, and they
do not raise overflow exceptions. The
previous example displays
-2,147,483,639 as the sum of two positive integers.
Overflow checking can be enabled by
compiler options, environment
configuration, or use of the checked
keyword.
As it says, you can enforce overflow checking more globally via a compiler setting or environment config.
In your case this is probably desirable as it will cause a runtime error to be thrown that will ensure the likely invalid unsigned number to signed number overflow will not occur silently.
[Update] After testing this code, I found that using a declaration of type object instead of int [] appears to bypass the standard C# casting sytax, regardless of whether checked is enabled or not.
As JS has said, when you use object, you are bound by CLI rules and these apparently allow this to occur.
Re Q1:
This is related to the above. In short, because the cast involved it does not throw an exception (based on current overflow setting). Whether this is a good idea is another question.
From MSDN:
An "is" expression evaluates to true if the provided expression is non-null, and the
provided object can be cast to the provided type without causing an exception to be
thrown.
I'm guessing backwards compatablility with .NET 1: I'm still a bit fuzzy about the details, but I beleive the CLR type of all arrays is simply System.Array, with extra Type properties to lookup the element type. 'is' probably just didn't account for that in CLR v1, and now must maintain that.
It not working in the (uint[])(new int[]{}) case is probably due to the C# compiler (not the CLR runtime) being able to do stricter typechecking.
Also, arrays are just type unsafe in general:
Animal[] tigers = new Tiger[10];
tigers[3] = new Elephant(); // ArrayTypeMismatchException
OK,
I attempt to take a stab at this.
First off, the documentation says, "Checks if an object is compatible with a given type.", It also says that IF the type on the left is "castable" (you can convert without exception) to the type on the right, and the expression evaluates to non-null, the "is" keyword will evaluate to true.
Look to Jon Skeet for the other answer. He said it more eloquently than I could. He's right, if a conversion is available, it will accept your syntax, you could subsequently write your own, but it would seem overkill in this situation.
Reference: http://msdn.microsoft.com/en-us/library/scekt9xw(VS.80).aspx
Note: This is not a duplicate of "Use of IsAssignableFrom and “is” keyword in C#". That other question asks about typeof(T).IsAssignableFrom(type)), where type is not an object but a Type.
This seems trivial — I can hear you saying, "Just call x.GetType()!" — but due to the COM-related corner case mentioned below, that call causes problems, which is why I'm asking about the rewrite.
… or are there rare special cases where the two might give different results?
I stumbled upon a type check of the form:
typeof(TValue).IsAssignableFrom(value.GetType())
where TValue is a generic type parameter (without any constraints) and value is an object.
I am not entirely sure whether it is safe to rewrite the above simply as:
value is TValue
To my current knowledge, the two tests are equivalent with the exception of COM objects. is should trigger a proper QueryInterface, while IsAssignableFrom might get confused by the __ComObject RCW wrapper type and report a false negative.
Are there any other differences between is and the shown use of IsAssignableFrom?
There are more cases where is and IsAssignableFrom returns different results, not just the one you mentioned for COM objects. For pair of array types with elements of type ElementType1 and ElementType2, where underlaying types of both element types are integer types of the same size but with opposite signedness, then
typeof(ElementType1[]).IsAssignableFrom(typeof(ElementType2[])) returns true but
new ElementType2[0] is ElementType1[] returns false
Specifically this includes arrays with these pairs of their element types:
byte / sbyte, short / ushort, int / uint, long / ulong
IntPtr / UIntPtr
any combination of enum type and either integer type or another enum type as long as underlaying types are of the same size
any combination of IntPtr / UIntPtr / int / uint in 32-bit process
any combination of IntPtr / UIntPtr / long / ulong in 64-bit process
This is due to differences in type system of C# and CLR as explained in
Why does my C# array lose type sign information when cast to object?
Why does “int[] is uint[] == true” in C#
Why is covariance of value-typed arrays inconsistent?
Different results of is and IsAssignableFrom in all cases mentioned above results from the fact that for new ElementType2[0] is ElementType1[] C# compiler simply emits False at compile-time (because it sees no way that for example int[] can be cast to uint[] as these are completly different types from C# perspective), completly omitting any runtime type checks. Fortunately casting array to object ((object)new ElementType2[0]) is ElementType1[] forces compiler to emit isinst IL instruction, which performs runtime type-check, that returns results consistent with IsAssignableFrom. This is also true for cases where target type is generic parameter, because it's type is not known at compile-time and C# compiler must emit isinst. So if you intend to replace IsAssignableFrom by is only in places where target type is generic parameter (as suggested in question title), I believe that these differences does not metter for you.
static void Main(string[] args)
{
int? bob = null;
Test(bob);
}
private static void Test<T>(T bob)
{
Console.WriteLine(bob is T);
Console.WriteLine(typeof(T).IsInstanceOfType(bob));
Console.WriteLine(typeof(T).IsAssignableFrom(bob.GetType()));
Console.ReadLine();
}
Is an example where they act slightly differently (since bob is null).
https://stackoverflow.com/a/15853213/34092 may be of interest.
Other than that (and the other exceptions you mention) they do appear to be equivalent.
In Jesse Liberty's Learning C# book, he says "Objects of one type can be converted into objects of another type. This is called casting."
If you investigate the IL generated from the code below, you can clearly see that the casted assignment isn't doing the same thing as the converted assignment. In the former, you can see the boxing/unboxing occurring; in the latter you can see a call to a convert method.
I know in the end it may be just a silly semantic difference--but is casting just another word for converting. I don't mean to be snarky, but I'm not interested in anyone's gut feeling on this--opinions don't count here! Can anyone point to a definitive reference that confirms or denies if casting and converting are the same thing?
object x;
int y;
x = 4;
y = ( int )x;
y = Convert.ToInt32( x );
Thank you
rp
Note added after Matt's comment about explicit/implicit:
I don't think implicit/explicit is the difference. In the code I posted, the change is explicit in both cases. An implicit conversion is what occurs when you assign a short to an int.
Note to Sklivvz:
I wanted confirmation that my suspicion of the looseness of Jesse Liberty's (otherwise usually lucid and clear) language was correct. I thought that Jesse Liberty was being a little loose with his language. I understand that casting is routed in object hierarchy--i.e., you can't cast from an integer to a string but you could cast from custom exception derived from System.Exception to a System.Exception.
It's interesting, though, that when you do try to cast from an int to a string the compiler tells you that it couldn't "convert" the value. Maybe Jesse is more correct than I thought!
Absolutely not!
Convert tries to get you an Int32 via "any means possible". Cast does nothing of the sort. With cast you are telling the compiler to treat the object as Int, without conversion.
You should always use cast when you know (by design) that the object is an Int32 or another class that has an casting operator to Int32 (like float, for example).
Convert should be used with String, or with other classes.
Try this
static void Main(string[] args)
{
long l = long.MaxValue;
Console.WriteLine(l);
byte b = (byte) l;
Console.WriteLine(b);
b = Convert.ToByte(l);
Console.WriteLine(b);
}
Result:
9223372036854775807
255
Unhandled Exception:
System.OverflowException: Value is
greater than Byte.MaxValue or less
than Byte.MinValue at
System.Convert.ToByte (Int64 value)
[0x00000] at Test.Main
(System.String[] args) [0x00019] in
/home/marco/develop/test/Exceptions.cs:15
The simple answer is: it depends.
For value types, casting will involve genuinely converting it to a different type. For instance:
float f = 1.5f;
int i = (int) f; // Conversion
When the casting expression unboxes, the result (assuming it works) is usually just a copy of what was in the box, with the same type. There are exceptions, however - you can unbox from a boxed int to an enum (with an underlying type of int) and vice versa; likewise you can unbox from a boxed int to a Nullable<int>.
When the casting expression is from one reference type to another and no user-defined conversion is involved, there's no conversion as far as the object itself is concerned - only the type of the reference "changes" - and that's really only the way that the value is regarded, rather than the reference itself (which will be the same bits as before). For example:
object o = "hello";
string x = (string) o; // No data is "converted"; x and o refer to the same object
When user-defined conversions get involved, this usually entails returning a different object/value. For example, you could define a conversion to string for your own type - and
this would certainly not be the same data as your own object. (It might be an existing string referred to from your object already, of course.) In my experience user-defined conversions usually exist between value types rather than reference types, so this is rarely an issue.
All of these count as conversions in terms of the specification - but they don't all count as converting an object into an object of a different type. I suspect this is a case of Jesse Liberty being loose with terminology - I've noticed that in Programming C# 3.0, which I've just been reading.
Does that cover everything?
The best explanation that I've seen can be seen below, followed by a link to the source:
"... The truth is a bit more complex than that. .NET provides
three methods of getting from point A to point B, as it were.
First, there is the implicit cast. This is the cast that doesn't
require you to do anything more than an assignment:
int i = 5;
double d = i;
These are also called "widening conversions" and .NET allows you to
perform them without any cast operator because you could never lose any
information doing it: the possible range of valid values of a double
encompasses the range of valid values for an int and then some, so
you're never going to do this assignment and then discover to your
horror that the runtime dropped a few digits off your int value. For
reference types, the rule behind an implicit cast is that the cast
could never throw an InvalidCastException: it is clear to the compiler
that the cast is always valid.
You can make new implicit cast operators for your own types (which
means that you can make implicit casts that break all of the rules, if
you're stupid about it). The basic rule of thumb is that an implicit
cast can never include the possibility of losing information in the
transition.
Note that the underlying representation did change in this
conversion: a double is represented completely differently from an int.
The second kind of conversion is an explicit cast. An explicit cast is
required wherever there is the possibility of losing information, or
there is a possibility that the cast might not be valid and thus throw
an InvalidCastException:
double d = 1.5;
int i = (int)d;
Here you are obviously going to lose information: i will be 1 after the
cast, so the 0.5 gets lost. This is also known as a "narrowing"
conversion, and the compiler requires that you include an explicit cast
(int) to indicate that yes, you know that information may be lost, but
you don't care.
Similarly, with reference types the compiler requires explicit casts in
situations in which the cast may not be valid at run time, as a signal
that yes, you know there's a risk, but you know what you're doing.
The third kind of conversion is one that involves such a radical change
in representation that the designers didn't provide even an explicit
cast: they make you call a method in order to do the conversion:
string s = "15";
int i = Convert.ToInt32(s);
Note that there is nothing that absolutely requires a method call here.
Implicit and explicit casts are method calls too (that's how you make
your own). The designers could quite easily have created an explicit
cast operator that converted a string to an int. The requirement that
you call a method is a stylistic choice rather than a fundamental
requirement of the language.
The stylistic reasoning goes something like this: String-to-int is a
complicated conversion with lots of opportunity for things going
horribly wrong:
string s = "The quick brown fox";
int i = Convert.ToInt32(s);
As such, the method call gives you documentation to read, and a broad
hint that this is something more than just a quick cast.
When designing your own types (particularly your own value types), you
may decide to create cast operators and conversion functions. The lines
dividing "implicit cast", "explicit cast", and "conversion function"
territory are a bit blurry, so different people may make different
decisions as to what should be what. Just try to keep in mind
information loss, and potential for exceptions and invalid data, and
that should help you decide."
Bruce Wood, November 16th 2005
http://bytes.com/forum/post1068532-4.html
Casting involves References
List<int> myList = new List<int>();
//up-cast
IEnumerable<int> myEnumerable = (IEnumerable<int>) myList;
//down-cast
List<int> myOtherList = (List<int>) myEnumerable;
Notice that operations against myList, such as adding an element, are reflected in myEnumerable and myOtherList. This is because they are all references (of varying types) to the same instance.
Up-casting is safe. Down-casting can generate run-time errors if the programmer has made a mistake in the type. Safe down-casting is beyond the scope of this answer.
Converting involves Instances
List<int> myList = new List<int>();
int[] myArray = myList.ToArray();
myList is used to produce myArray. This is a non-destructive conversion (myList works perfectly fine after this operation). Also notice that operations against myList, such as adding an element, are not reflected in myArray. This is because they are completely seperate instances.
decimal w = 1.1m;
int x = (int)w;
There are operations using the cast syntax in C# that are actually conversions.
Semantics aside, a quick test shows they are NOT equivalent !
They do the task differently (or perhaps, they do different tasks).
x=-2.5 (int)x=-2 Convert.ToInt32(x)=-2
x=-1.5 (int)x=-1 Convert.ToInt32(x)=-2
x=-0.5 (int)x= 0 Convert.ToInt32(x)= 0
x= 0.5 (int)x= 0 Convert.ToInt32(x)= 0
x= 1.5 (int)x= 1 Convert.ToInt32(x)= 2
x= 2.5 (int)x= 2 Convert.ToInt32(x)= 2
Notice the x=-1.5 and x=1.5 cases.
A cast is telling the compiler/interperter that the object in fact is of that type (or has a basetype/interface of that type). It's a pretty fast thing to do compared to a convert where it's no longer the compiler/interperter doing the job but a function actualling parsing a string and doing math to convert to a number.
Casting always means changing the data type of an object. This can be done for instance by converting a float value into an integer value, or by reinterpreting the bits. It is usally a language-supported (read: compiler-supported) operation.
The term "converting" is sometimes used for casting, but it is usually done by some library or your own code and does not necessarily result in the same as casting. For example, if you have an imperial weight value and convert it to metric weight, it may stay the same data type (say, float), but become a different number. Another typical example is converting from degrees to radian.
In a language- / framework-agnostic manner of speaking, converting from one type or class to another is known as casting. This is true for .NET as well, as your first four lines show:
object x;
int y;
x = 4;
y = ( int )x;
C and C-like languages (such as C#) use the (newtype)somevar syntax for casting. In VB.NET, for example, there are explicit built-in functions for this. The last line would be written as:
y = CInt(x)
Or, for more complex types:
y = CType(x, newtype)
Where 'C' obviously is short for 'cast'.
.NET also has the Convert() function, however. This isn't a built-in language feature (unlike the above two), but rather one of the framework. This becomes clearer when you use a language that isn't necessarily used together with .NET: they still very likely have their own means of casting, but it's .NET that adds Convert().
As Matt says, the difference in behavior is that Convert() is more explicit. Rather than merely telling the compiler to treat y as an integer equivalent of x, you are specifically telling it to alter x in such a way that is suitable for the integer class, then assign the result to y.
In your particular case, the casting does what is called 'unboxing', whereas Convert() will actually get the integer value. The result will appear the same, but there are subtle differences better explained by Keith.
According to Table 1-7 titled "Methods for Explicit Conversion" on page 55 in Chapter 1, Lesson 4 of MCTS Self-Paced Training Kit (Exam 70-536): Microsoft® .NET Framework 2.0—Application Development Foundation, there is certainly a difference between them.
System.Convert is language-independent and converts "Between types that implement the System.IConvertible interface."
(type) cast operator is a C#-specific language feature that converts "Between types that define conversion operators."
Furthermore, when implementing custom conversions, advice differs between them.
Per the section titled How to Implement Conversion in Custom Types on pp. 56-57 in the lesson cited above, conversion operators (casting) are meant for simplifying conversions between numeric types, whereas Convert() enables culture-specific conversions.
Which technique you choose depends on the type of conversion you want to perform:
Define conversion operators to simplify narrowing and widening
conversions between numeric types.
Implement System.IConvertible to enable conversion through
System.Convert. Use this technique to enable culture-specific conversions.
...
It should be clearer now that since the cast conversion operator is implemented separately from the IConvertible interface, that Convert() is not necessarily merely another name for casting. (But I can envision where one implementation may refer to the other to ensure consistency).
Casting is essentially just telling the runtime to "pretend" the object is the new type. It doesn't actually convert or change the object in any way.
Convert, however, will perform operations to turn one type into another.
As an example:
char caster = '5';
Console.WriteLine((int)caster);
The output of those statements will be 53, because all the runtime did is look at the bit pattern and treat it as an int. What you end up getting is the ascii value of the character 5, rather than the number 5.
If you use Convert.ToInt32(caster) however, you will get 5 because it actually reads the string and modifies it correctly. (Essentially it knows that ASCII value 53 is really the integer value 5.)
The difference there is whether the conversion is implicit or explicit. The first one up there is a cast, the second one is a more explicit call to a function that converts. They probably go about doing the same thing in different ways.
This question already has answers here:
Is casting the same thing as converting?
(11 answers)
Closed 9 years ago.
Eric Lippert's comments in this question have left me thoroughly confused. What is the difference between casting and conversion in C#?
Casting is a way of telling the compiler "Object X is really Type Y, go ahead and treat it as such."
Conversion is saying "I know Object X isn't Type Y, but there exists a way of creating a new Object from X of Type Y, go ahead and do it."
I believe what Eric is trying to say is:
Casting is a term describing syntax (hence the Syntactic meaning).
Conversion is a term describing what actions are actually taken behind the scenes (and thus the Semantic meaning).
A cast-expression is used to convert
explicitly an expression to a given
type.
And
A cast-expression of the form (T)E,
where T is a type and E is a
unary-expression, performs an explicit
conversion (§13.2) of the value of E
to type T.
Seems to back that up by saying that a cast operator in the syntax performs an explicit conversion.
I am reminded of the anecdote told by Richard Feynman where he is attending a philosophy class and the professor askes him "Feynman, you're a physicist, in your opinion is an electron an 'essential object'?" So Feynman asks the clarifying question "is a brick an essential object?" to the class. Every student has a different answer to that question. They say that the fundamental abstract notion of "brickness" is the essential object. No, one specific, unique brick is the essential object. No, the parts of the brick you can empirically observe is the essential object. And so on.
Which is of course not to answer your question.
I'm not going to go through all these dozen answers and debate with their authors about what I really meant. I'll write a blog article on the subject in a few weeks and we'll see if that throws any light on the matter.
How about an analogy though, a la Feynman. You wish to bake a loaf of banana bread Saturday morning (as I do almost every Saturday morning.) So you consult The Joy of Cooking, and it says "blah blah blah... In another bowl, whisk together the dry ingredients. ..."
Clearly there is a strong relationship between that instruction and your actions tomorrow morning, but equally clearly it would be a mistake to conflate the instruction with the action. The instruction consists of text. It has a location, on a particular page. It has punctuation. Were you to be in the kitchen whisking together flour and baking soda, and someone asked "what's your punctuation right now?", you'd probably think it was an odd question. The action is related to the instruction, but the textual properties of the instruction are not properties of the action.
A cast is not a conversion in the same way that a recipe is not the act of baking a cake. A recipe is text which describes an action, which you can then perform. A cast operator is text which describes an action - a conversion - which the runtime can then perform.
From the C# Spec 14.6.6:
A cast-expression is used to convert
explicitly an expression to a given
type.
...
A cast-expression of the form (T)E,
where T is a type and E is a
unary-expression, performs an explicit
conversion (§13.2) of the value of E
to type T.
So casting is a syntactic construct used to instruct the compiler to invoke explicit conversions.
From the C# Spec §13:
A conversion enables an expression of
one type to be treated as another
type. Conversions can be implicit or
explicit, and this determines whether
an explicit cast is required.
[Example: For instance, the conversion
from type int to type long is
implicit, so expressions of type int
can implicitly be treated as type
long. The opposite conversion, from
type long to type int, is explicit, so
an explicit cast is required.
So conversions are where the actual work gets done. You'll note that the cast-expression quote says that it performs explicit conversions but explicit conversions are a superset of implicit conversions, so you can also invoke implicit conversions (even if you don't have to) via cast-expressions.
Just my understanding, probably much too simple:
When casting the essential data remains intact (same internal representation) - "I know this is a dictionary, but you can use it as a ICollection".
When converting, you are changing the internal representation to something else - "I want this int to be a string".
After reading Eric's comments, an attempt in plain english:
Casting means that the two types are actually the same at some level. They may implement the same interface or inherit from the same base class or the target can be "same enough" (a superset?) for the cast to work such as casting from Int16 to Int32.
Converting types then means that the two objects may be similar enough to be converted. Take for example a string representation of a number. It is a string, it cannot simply be cast into a number, it needs to be parsed and converted from one to the other, and, the process may fail. It may fail for casting as well but I imagine that's a much less expensive failure.
And that's the key difference between the two concepts I think. Conversion will entail some sort of parsing, or deeper analysis and conversion of the source data. Casting does not parse. It simply attempts a match at some polymorphic level.
Casting is the creation of a value of one type from another value of another type. Conversion is a type of casting in which the internal representation of the value must also be changed (rather than just its interpretation).
In C#, casting and converting are both done with a cast-expression:
( type ) unary-expression
The distinction is important (and the point is made in the comment) because only conversions may be created by a conversion-operator-declarator. Therefore, only (implicit or explicit) conversions may be created in code.
A non-conversion implicit cast is always available for subtype-to-supertype casts, and a non-conversion explicit cast is always available for supertype-to-subtype casts. No other non-conversion casts are allowed.
In this context, casting means that you are exposing an object of a given type for manipulation as some other type, conversion means that you are actually changing an object of a given type to an object of another type.
This page of the MSDN C# documentation suggests that a cast is specific instance of conversion: the "explicit conversion." That is, a conversion of the form x = (int)y is a cast.
Automatic data type changes (such as myLong = myInt) are the more generic "conversion."
A cast is an operator on a class/struct. A conversion is a method/process on one or the other of the affected classes/structs, or may be in a complete different class/struct (i.e. Converter.ToInt32()
Cast operators come in two flavors: implicit and explicit
Implicit cast operators indicate that data of one type (say, Int32) can always be represented as another type (decimal) without loss of data/precision.
int i = 25;
decimal d = i;
Explicit cast operators indicate that data of one type (decimal) can always be faithfully represented as another type (int), but there may be loss of data/precision. Therefor the compiler requires you to explicitly state that you are aware of this and want to do it anyway, through use of the explicit cast syntax:
decimal d = 25.0001;
int i = (int)d;
Conversion takes two types that are not necessarily related in any way, and attempts to convert one into the other through some process, such as parsing. If all known conversion algorithms fail, the process may either throw an exception or return a default value:
string s = "200";
int i = Converter.ToInt32(s); // set i to 200 by parsing s
string s = "two hundred";
int i = Converter.ToInt32(s); // sets i to 0 because the parse fails
Eric's references to syntactic conversion vs. symantic conversion are basically an operator vs. methodology distinction.
A cast is syntactical, and may or may not involve a conversion (depending on the type of cast). As you know, C++ allows specifying the type of cast you want to use.
Casting up/down the hierarchy may or may not be considered conversion, depending on who you ask (and what language they're talking about!)
Eric (C#) is saying that casting to a different type always involves a conversion, though that conversion may not even change the internal representation of the instance.
A C++-guy will disagree, since a static_cast might not result in any extra code (so the "conversion" is not actually real!)
Casting and Conversion are basically the same concept in C#, except that a conversion may be done using any method such as Object.ToString(). Casting is only done with the casting operator (T) E, that is described in other posts, and may make use of conversions or boxing.
Which conversion method does it use? The compiler decides based on the classes and libraries provided to the compiler at compile-time. If an implicit conversion exists, you are not required to use the casting operator. Object o = String.Empty. If only explicit conversions exist, you must use the casting operator. String s = (String) o.
You can create explicit and implicit conversion operators in your own classes. Note: conversions can make the data look very similar or nothing like the original type to you and me, but it's all defined by the conversion methods, and makes it legal to the compiler.
Casting always refers to the use of the casting operator. You can write
Object o = float.NaN;
String s = (String) o;
But if you access s, in for example a Console.WriteLine, you will receive a runtime InvalidCastException. So, the cast operator still attempts to use conversion at access time, but will settle for boxing during assignment.
INSERTED EDIT#2: isn't it hilariously inconsistent myopia that since I provided this answer, the question has been marked as duplicate of a question which asks, "Is casting the same thing as converting?". And the answers of "No" are overwhelmingly upvoted. Yet my answer here which points out the generative essence for why casts are not the same as conversion is overwhelmingly downvoted (yet I have one +1 in the comments). I suppose that readers have a difficult time with comprehending that casts apply at the denotational syntax/semantics layer and conversions apply at the operational semantics layer. For example, a cast of a reference (or pointer in C/C++)—referring to a boxed data type—to another data type, doesn't (in all languages and scenarios) generate a conversion of the boxed data. For example, in C float a[1]; int* p = (int*)&a; doesn't insure that *p refers to int data.
A compiler compiles from denotational semantics to operational semantics. The compilation is not bijective, i.e. it isn't guaranteed to uncompile (e.g. Java, LLVM, asm.js, or C# bytecode) back to any denotational syntax which compiles to that bytecode (e.g. Scala, Python, C#, C via Emscripten, etc). Thus the two layers not the same.
Thus most obviously a 'cast' and a 'conversion' are not the same thing. My answer here is pointing out that the terms apply to two different layers of semantics. Casts apply to the semantics of what the denotational layer (input syntax of the compiler) knows about. Conversions apply to the semantics of what the operational (runtime or intermediate bytecode) layer knows about. I used the standard term of 'erased' to describe what happens to denotational semantics that aren't explicitly recorded in the operational semantics layer.
For example, reified generics are an example of recording denotational semantics in the operational semantics layer, but they have the disadvantage of making the operational semantics layer incompatible with higher-order denotational semantics, e.g. this is why it was painful to consider implementing Scala's higher-kinded generics on C#'s CLR because C#'s denotational semantics for generics was hard-coded at the operational semantics layer.
Come on guys, stop downvoting someone who knows a lot more than you do. Do your homework first before you vote.
INSERTED EDIT: Casting is an operation that happens at the denotational semantics layer (where types are expressed in their full semantics). A cast may (e.g. explicit conversion) or may not (e.g. upcasting) cause a conversion at the runtime semantic layer. The downvotes on my answer (and the upvoting on Marc Gavin's comment) indicates to me that most people don't understand the differences between denotational semantics and operational (execution) semantics. Sigh.
I will state Eric Lippert's answer more simply and more generally for all languages, including C#.
A cast is syntax so (like all syntax) is erased at compile-time; whereas, a conversion causes some action at runtime.
That is a true statement for every computer language that I am aware of in the entire universe. Note that the above statement does not say that casting and conversions are mutually exclusive.
A cast may cause a conversion at runtime, but there are cases where it may not.
The reason we have two distinct words, i.e. cast and conversion, is we need a way to separately describe what is happening in syntax (the cast operator) and at runtime (conversion, or type check and possible conversion).
It is important that we maintain this separation-of-concepts, because in some programming languages the cast never causes a conversion. Also so that we understand implicit casting (e.g. upcasting) is happening only at compile-time. The reason I wrote this answer is because I want to help readers understand in terms of being multilingual with computer languages. And also to see how that general definition correctly applies in the C# case as well.
Also I wanted to help readers see how I generalize concepts in my mind, which helps me as computer language designer. I am trying to pass along the gift of a very reductionist, abstract way of thinking. But I am also trying to explain this in a very practical way. Please feel free to let me know in the comments if I need to improve the elucidation.
Eric Lippert wrote:
A cast is not a conversion in the same way that a recipe is not the
act of baking a cake. A recipe is text which describes an action,
which you can then perform. A cast operator is text which describes an
action - a conversion - which the runtime can then perform.
The recipe is what is happening in syntax. Syntax is always erased, and replaced with either nothing or some runtime code.
For example, I can write a cast in C# that does nothing and is entirely erased at compile-time when it is does not cause a change in the storage requirements or is upcasting. We can clearly see that a cast is just syntax, that makes no change to the runtime code.
int x = 1;
int y = (int)x;
Giraffe g = new Giraffe();
Animal a = (Animal)g;
That can be used for documentation purposes (yet noisy), but it is essential in languages that have type inference, where a cast is sometimes necessary to tell the compiler what type you wish it to infer.
For an example, in Scala a None has the type Option[Nothing] where Nothing is the bottom type that is the sub-type of all possible types (not super-type). So sometimes when using None, the type needs to be casted to a specific type, because Scala only does local type inference, thus can't always infer the type you intended.
// (None : Option[Int]) casts None to Option[Int]
println(Some(7) <*> ((None : Option[Int]) <*> (Some(9) > add)))
A cast could know at compile-time that it requires a type conversion, e.g. int x = (int)1.5, or could require a type check and possible type conversion at runtime, e.g. downcasting. The cast (i.e. the syntax) is erased and replaced with the runtime action.
Thus we can clearly see that equating all casts with explicit conversion, is an error of implication in the MSDN documentation. That documentation is intending to say that explicit conversion requires a cast operator, but it should not be trying to also imply that all casts are explicit conversions. I am confident that Eric Lippert can clear this up when he writes the blog he promised in his answer.
ADD: From the comments and chat, I can see that there is some confusion about the meaning of the term erased.
The term 'erased' is used to describe information that was known at compile-time, which is not known at runtime. For example, types can be erased in non-reified generics, and it is called type erasure.
Generally speaking all the syntax is erased, because generally CLI is not bijective (invertible, and one-to-one) with C#. You cannot always go backwards from some arbitrary CLI code back to the exact C# source code. This means information has been erased.
Those who say erased is not the right term, are conflating the implementation of a cast with the semantic of the cast. The cast is a higher-level semantic (I think it is actually higher than syntax, it is denotational semantics at least in case of upcasting and downcasting) that says at that level of semantics that we want to cast the type. Now how that gets done at runtime is entirely different level of semantics. In some languages it might always be a NOOP. For example, in Haskell all typing information is erased at compile-time.
I am curious to know what the difference is between a cast to say an int compared to using Convert.ToInt32(). Is there some sort of performance gain with using one?
Also which situations should each be used for? Currently I'm more inclined to use Convert but I don't have a reason to go either way. In my mind I see them both accomplishing the same goal.
Cast when it's really a type of int, Convert when it's not an int but you want it to become one.
For example int i = (int)o; when you know o is an int
int i = Convert.ToInt32("123") because "123" is not an int, it's a string representation of an int.
See Diff Between Cast and Convert on another forum
Answer
The Convert.ToInt32(String, IFormatProvider) underneath calls the Int32.Parse (read remarks).
So the only difference is that if a null string is passed it returns 0, whereas Int32.Parse throws an ArgumentNullException.
It is really a matter of choice whichever you use.
Personally, I use neither, and tend to use the TryParse functions (e.g. System.Int32.TryParse()).
UPDATE
Link on top is broken, see this answer on StackOverflow.
There is another difference.
"Convert" is always overflow-checked, while "cast" maybe, depending on your Settings and the "checked" or "unchecked" keyword used.
To be more explicit.
Consider the code:
int source = 260;
byte destination = (byte)source;
Then destination will be 4 without a warning.
But:
int source = 260;
byte destination = Convert.ToByte(source);
will give you a exception.
Not all types supports conversion like
int i = 0;
decimal d = (decimal)i;
because it is needed to implement explicit operator. But .NET also provide IConvertible interface, so any type implements that interface may be converted to most framework built-in types. And finally, Convert class helps to operate with types implements IConvertible interface.
A cast just tells the compiler that this object is actually an implementation of a different type and to treat it like the new implementation now. Whereas a convert says that this doesn't inherit from what you're trying to convert to, but there is a set way to so it. For example, say we're turning "16" into an int. "16" is a string, and does not inherit from int in any way. But, it is obvious that "16" could be turned into the int 16.
Throwing in my 2c -- it seems that a conceptual distinction might be useful. Not that I'm an expert.. :
Casting is changing the representative type. So "32" and 32L and 32.0f seem reasonable to cast between each other. c# will not support the "32" automatically, but most dynamic languages will. So I'll use the (long)"32" or (String)32L. When I can. I also have another rule -- Casting should be round trippable.
Converting doesn't have to be round trippable, and can simply create a totally new object.
The Grey area is for example the string "32xx". A case can be made that when you cast it, it becomes 32L (the number was parsed until it couldn't be). Perl used this. But then this violates my round trip requirement. The same goes for 32.5f to 32L. Almost all languages including very statically typed ones allow this, and it too fails the round trippable rule. It is grey in that if you allow "32" to be cast, then at compile time you don't know if it might be "32xxx".
Another distinction that can be made is to just use casting for "IsA" and not for "makeLookLikeA". So if you know a string comes from a database but is actually an int in the unofficial schema, feel free to use a cast (although in this case c# wants you to use Convert anyway). The same would go for a float. But not for when you are just using the cast to truncate the float. This distinction also accounts for DownCasting and UpCasting -- the object was always 'IsA', but the type might have been generalized for a list.
There are a lot of overloads for Convert.ToInt32 that can take for example a string. While trying to cast a string to an int will throw a compile error. The point being is they're for different uses. Convert is especially useful when you're not sure what type the object you're casting from is.
There is one more reason you should use Convert.ToInt32 instead of a cast.
For example:
float a = 1.3f;
float b = 0.02f;
int c = (int)(a / b);
int d = Convert.ToInt32(a / b);`
The result is c = 64 and d = 65
string number = "123abc";
int num;
Int32.TryParse(number, out num); // no exception thrown at this call
Convert.ToInt32(number); // exception IS thrown at this call