Why is the last line not allowed?
IEnumerable<double> doubleenumerable = new List<double> { 1, 2 };
IEnumerable<string> stringenumerable = new List<string> { "a", "b" };
IEnumerable<object> objects1 = stringenumerable; // OK
IEnumerable<object> objects2 = doubleenumerable; // Not allowed
Is this because double is a value type that doesn't derive from object, hence the covariance doesn't work?
Does that mean that there is no way to make this work:
public interface IMyInterface<out T>
{
string Method();
}
public class MyClass<U> : IMyInterface<U>
{
public string Method()
{
return "test";
}
}
public class Test
{
public static object test2()
{
IMyInterface<double> a = new MyClass<double>();
IMyInterface<object> b = a; // Invalid cast!
return b.Method();
}
}
And that I need to write my very own IMyInterface<T>.Cast<U>() to do that?
Why is the last line not allowed?
Because double is a value type and object is a reference type; covariance only works when both types are reference types.
Is this because double is a value type that doesn't derive from object, hence the covariance doesn't work?
No. Double does derive from object. All value types derive from object.
Now the question you should have asked:
Why does covariance not work to convert IEnumerable<double> to IEnumerable<object>?
Because who does the boxing? A conversion from double to object must box the double. Suppose you have a call to IEnumerator<object>.Current that is "really" a call to an implementation of IEnumerator<double>.Current. The caller expects an object to be returned. The callee returns a double. Where is the code that does the boxing instruction that turns the double returned by IEnumerator<double>.Current into a boxed double?
It is nowhere, that's where, and that's why this conversion is illegal. The call to Current is going to put an eight-byte double on the evaluation stack, and the consumer is going to expect a four-byte reference to a boxed double on the evaluation stack, and so the consumer is going to crash and die horribly with a misaligned stack and a reference to invalid memory.
If you want the code that boxes to execute then it has to be written at some point, and you're the person who gets to write it. The easiest way is to use the Cast<T> extension method:
IEnumerable<object> objects2 = doubleenumerable.Cast<object>();
Now you call a helper method that contains the boxing instruction that converts the double from an eight-byte double to a reference.
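The contrast between the two conversions can be sketched as follows. The class and method names (CovarianceDemo, Widen, BoxAll) are made up for illustration; only IEnumerable<T>, Cast<T> and ToList come from the framework.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CovarianceDemo
{
    // Reference types: the covariant conversion is a plain reference
    // conversion, so the very same object comes back.
    public static IEnumerable<object> Widen(IEnumerable<string> source)
    {
        return source;
    }

    // Value types: there is no variance, so the boxing has to be written
    // somewhere. Cast<object>() boxes each double as it is enumerated.
    public static List<object> BoxAll(IEnumerable<double> source)
    {
        // IEnumerable<object> direct = source; // does not compile
        return source.Cast<object>().ToList();
    }
}
```

Widen hands back the original list instance, while BoxAll has to build a new list whose elements are freshly boxed doubles.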
UPDATE: A commenter notes that I have begged the question -- that is, I have answered a question by presupposing the existence of a mechanism which solves a problem every bit as hard as a solution to the original question requires. How does the implementation of Cast<T> manage to solve the problem of knowing whether to box or not?
It works like this sketch. Note that the parameter types are not generic:
public static IEnumerable<T> Cast<T>(this IEnumerable sequence)
{
if (sequence == null) throw ...
if (sequence is IEnumerable<T>)
return sequence as IEnumerable<T>;
return ReallyCast<T>(sequence);
}
private static IEnumerable<T> ReallyCast<T>(IEnumerable sequence)
{
foreach(object item in sequence)
yield return (T)item;
}
The responsibility for determining whether the cast from object to T is an unboxing conversion or a reference conversion is deferred to the runtime. The jitter knows whether T is a reference type or a value type. 99% of the time it will of course be a reference type.
To understand what is allowed and not allowed, and why things behave as they do, it is helpful to understand what's going on under the hood. For every value type, there exists a corresponding type of class object, which--like all objects--will inherit from System.Object. Each class object includes with its data a 32-bit word (x86) or 64-bit longword (x64) which identifies its type. Value-type storage locations, however, do not hold such class objects or references to them, nor do they have a word of type data stored with them. Instead, each primitive-value-type location simply holds the bits necessary to represent a value, and each struct-value-type storage location simply holds the contents of all the public and private fields of that type.
When one copies a variable of type Double to one of type Object, one creates a new instance of the class-object type associated with Double and copies all the bytes from the original to that new class object. Although the boxed-Double class type has the same name as the Double value type, this does not lead to ambiguity because they can generally not be used in the same contexts. Storage locations of value types hold raw bits or combinations of fields, without stored type information; copying one such storage location to another copies all bytes, and consequently copies all public and private fields. By contrast, heap objects of types derived from value types are heap objects, and behave like heap objects. Although C# regards the contents of value-type storage locations as though they are derivatives of Object, under the hood the contents of such storage locations are simply collections of bytes, effectively outside the type system. Since they can only be accessed by code which knows what the bytes represent, there is no need to store such information with the storage location itself. Although the necessity for boxing when calling GetType on a struct is often described in terms of GetType being a non-shadowed, non-virtual function, the real necessity stems from the fact that the contents of a value-type storage location (as distinct from the location itself) don't have type information.
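A small sketch of the copy that boxing makes, assuming nothing beyond the core language (BoxingDemo and BoxThenChange are illustrative names):

```csharp
public static class BoxingDemo
{
    public static (object Boxed, double Original) BoxThenChange()
    {
        double d = 1.5;
        object o = d;   // allocates a heap object and copies the bytes of d into it
        d = 2.5;        // changes only the value-type storage location
        return (o, d);  // the boxed copy still holds 1.5
    }
}
```

The boxed object carries its own type information, so calling GetType() on it reports System.Double even though the variable is typed as object.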
Variance of this type is only supported for reference types. See http://blogs.msdn.com/b/csharpfaq/archive/2010/02/16/covariance-and-contravariance-faq.aspx
To give context to the code imagine a collection of objects inside a cube. The objects are placed randomly and can affect each other. Several series of test events are planned then executed against the cube of objects. Only the best result is kept. This is not the real problem but a simplified version to focus the question.
Sample code
class Loc{
double UpDown
double LeftRight
double FrontBack
}
class Affects{
string affectKey
List<string> impacts //scripts that execute against properties
}
class Item{
Loc startLoc
Loc endLoc
List<string> affectedBy
string resultText // summary of analysis of changes
}
class ItemColl{
List<Item> myItems
}
class main{
ItemColl items
List<string> actions
void ProcessAffects(ItemColl tgt, List<string> acts){
// take actions against the tgt set and return
}
int IsBetter(ItemColl orig, List<Items> altered){
// compares the collection to determine "better one"
// positive better, negative worse, zero for no change
}
void DoThings(){
// original code
ItemColl temp = items
ProcessAffects(temp,actions)
IsBetter(temp,actions)
// the result was always zero - admittedly a duh error
}
}
When I added an alternate constructor that copied the object passed in and did the same to all subordinate objects, as in
class ItemColl{
public ItemColl(){}
public ItemColl (ItemColl clone){
// do a deep copy
}
// partial code from main DoThings
// replaced ItemColl temp = items
// with
ItemColl temp = new ItemColl(items)
it solved the problem that led me to the first question. (Thanks to the people who answered that question kindly.) What I am stuck on is whether there are other options to consider. I am hoping this restatement has a better focus, and if I am not taking advantage of some newer efficiencies I would like to know.
I removed the old question entirely and re-phrased post face-palm.
Before you get into parameters, you need some background:
Background
There are two kinds of objects in .NET-land, Reference types and Value types. The main difference between the two is how assignment works.
Value Types
When you assign a value type instance to a variable, the value is copied to the variable. The basic numeric types (int, float, double, etc) are all value types. As a result, in this code:
decimal dec1 = 5.44m;
decimal dec2 = dec1;
dec1 = 3.1415m;
both decimal variables (dec1 and dec2) are wide enough to hold a decimal-valued number. In each case, the value is copied. At the end, dec1 == 3.1415m and dec2 == 5.44m.
Nearly all value types are declared as a struct (yes, if you get access to the .NET sources, int is a struct). Like all .NET types, they act when boxed as if they are derived from the object base class (their derivation is through System.ValueType). Both object (aka System.Object) and System.ValueType are reference types, even though the unboxed types that derive from System.ValueType are value types (a little magic happens here).
All value types are sealed/final - you can't sub-class them. You also can't create a default constructor for them - they come with a default constructor that initializes them to their default value. You can create additional constructors (which don't hide the built-in default constructor).
All enums are value types as well. They inherit from System.Enum but are value types and behave mostly like other value types.
In general, value types should be designed to be immutable; not all are.
Reference Types
Variables of reference types hold references, not values. That said, it sometimes helps to think of them as holding a value; it's just that the value is a reference to an object on the managed heap.
When you assign to a variable of reference type, you are assigning the reference. For example:
public class MyType {
public int TheValue { get; set; }
// more properties, fields, methods...
}
MyType mt1 = new MyType() {TheValue = 5};
MyType mt2 = mt1;
mt1.TheValue = 42;
Here, the mt1 and mt2 variables both contain references to the same object. When that object is mutated in the final line of code, you end up with two variables both referring to an object whose TheValue property is 42.
All types declared as a class are reference types. In general, other than the numeric types, enums and bools, most (but not all) of the types that you normally encounter will be reference types.
Anything declared to be a delegate or an event is also a reference type under the covers. Someone mentioned interface. There is no such thing as an object typed purely as an interface. Both structs and classes may be declared to implement an interface. That doesn't change their value/reference type nature, but a struct stored as an interface will be boxed.
Difference in Constructor Behavior
One other difference between Reference and Value Types is what the new keyword means when constructing a new object. Consider this class and this struct:
public class CPoint {
public float X { get; set; }
public float Y { get; set; }
public CPoint (float x, float y) {
X = x;
Y = y;
}
}
public struct SPoint {
public float X { get; set; }
public float Y { get; set; }
public SPoint (float x, float y) {
X = x;
Y = y;
}
}
They are basically the same, except that CPoint is a class (a reference type) and SPoint is a struct (a value type).
When you create an instance of SPoint using the two float constructor (remember, it gets a default constructor auto-magically), like this:
var sp = new SPoint (42.0f, 3.14f);
What happens is that the constructor runs and creates a value. That value is then copied into the sp variable (which is of type SPoint and large enough to hold a two-float SPoint).
If I do this:
var cp = new CPoint (42.0f, 3.14f);
Something very different happens. First, memory is allocated on the managed heap large enough to hold a CPoint (i.e., enough to hold two floats plus the overhead of the object being a reference type). Then the two-float constructor runs (and that constructor is the only constructor; there is no default constructor, because the additional, programmer-written constructor hides the compiler-generated default constructor). The constructor initializes that new CPoint in the memory allocated on the managed heap. Finally, a reference to that newly created object is copied to the variable cp.
Parameter Passing
Sorry the preamble took so long.
Unless otherwise specified, all parameters to functions/methods are passed by value. But, don't forget that the value of a variable of reference type is a reference.
So, if I have a function declared as (MyType is the class declared above):
public void MyFunction(decimal decValue, MyType myObject) {
// some code goes here
}
and some code that looks like:
decimal dec1 = 5.44m;
MyType mt1 = new MyType() {TheValue = 5};
MyFunction (dec1, mt1);
What happens is that the value of dec1 is copied to the function parameter (decValue) and is available for use within MyFunction. If someone changes the value of decValue within the function, no side effects occur outside the function.
Similarly, but differently, the value of mt1 is copied to the method parameter myObject. However, that value is a reference to a MyType object residing on the managed heap. If, within the method, some code mutates that object (say: myObject.TheValue=666;), then the object to which both the mt1 and myObject variables refer is mutated, and that results in a side effect visible outside of the function. That said, everything is still being passed by value.
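The three cases can be put side by side in a short sketch. Counter and the helper methods are hypothetical names introduced only for this illustration:

```csharp
public sealed class Counter
{
    public int Value { get; set; }
}

public static class PassByValueDemo
{
    // Mutates the object that both the caller's variable and c refer to.
    static void Mutate(Counter c) { c.Value = 666; }

    // Replaces only the local copy of the reference; the caller is unaffected.
    static void Reassign(Counter c) { c = new Counter { Value = 1 }; }

    // Changes only the local copy of the decimal value.
    static void Increment(decimal d) { d += 1m; }

    public static (int AfterMutate, int AfterReassign, decimal AfterIncrement) Run()
    {
        var counter = new Counter { Value = 5 };
        decimal dec = 5.44m;

        Mutate(counter);
        int afterMutate = counter.Value;    // the mutation is visible here

        Reassign(counter);
        int afterReassign = counter.Value;  // still the mutated original object

        Increment(dec);
        return (afterMutate, afterReassign, dec); // dec is unchanged
    }
}
```

Only Mutate has an effect the caller can observe, because it changes the shared object rather than the parameter variable itself.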
Passing Parameters by Reference
You can pass parameters by reference in two ways, using either the out or ref keywords. An out parameter does not need to be initialized before the function call (while a ref parameter must be). Within the function, an out parameter must be initialized before the function returns - ref parameters may be initialized, but they do not need to be. The idea is that ref parameters expect to pass in and out of the function (by reference). But out parameters are designed simply as a way to pass something out of the function (by reference).
If I declare a function like:
public void MyByRefFunction(out decimal decValue, ref MyType myObject) {
decValue = 25.624m; // decValue must be initialized - it's an out parameter
myObject = new MyType (){TheValue = myObject.TheValue + 2};
}
and then I call it this way
decimal dec1; //note that it's not initialized
MyType mt1 = new MyType() {TheValue = 5};
MyType mt2 = mt1;
MyByRefFunction (out dec1, ref mt1);
After that call, dec1 will contain the value 25.624; that value was passed out of the function by reference.
Passing reference type variables by reference is more interesting. After the function call, mt1 will no longer refer to the object created with TheValue equal to 5; it will refer to the newly created object with TheValue equal to 5 + 2 (the object created within the function). Now mt1 and mt2 will refer to different objects with different TheValue property values.
With reference types, when you pass a variable normally, the object you pass it may mutate (and that mutation is visible after the function returns). If you pass a reference by reference, the reference itself may mutate, and the value of the reference may be different after the function returns.
All custom objects (derived from tobject) are "Reference type".
Nope. See the docs pages for Reference Types and Value Types
The following keywords are used to declare reference types:
class
interface
delegate
C# also provides the following built-in reference types:
dynamic
object
string
A value type can be one of the two following kinds:
a structure type ...
an enumeration type ...
So any time you make a class, it's always a Reference type.
EVERY type inherits from Object - Value Types and Reference Types.
Even if you pass it to a function with a reference parameter, as with the RefChange function both items are changed and both have exactly the same values in the integer list.
The ref keyword just forces your parameter to be passed by reference. Using ref with a Reference Type allows you to reassign the original passed in reference. See What is the use of “ref” for reference-type variables in C#?
Do not confuse the concept of passing by reference with the concept of reference types. The two concepts are not the same. A method parameter can be modified by ref regardless of whether it is a value type or a reference type. There is no boxing of a value type when it is passed by reference.
Source
Of course, the ref keyword is important when you pass in a Value Type, such as a struct.
If you want to pass a copy of an object, create an overloaded constructor to which you pass the original object and inside the constructor manage the duplication of the values that matter.
That's called a Copy Constructor, and is a long-established pattern, if you want to use it. In fact, there is a new C# 9.0 feature all about it: records.
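A copy constructor along those lines might look like the sketch below. The type names follow the question's pseudocode (Loc, Item, ItemColl), but the members are trimmed and the bodies are assumptions about what "the values that matter" are.

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed class Loc
{
    public double UpDown, LeftRight, FrontBack;

    public Loc() { }

    public Loc(Loc other)
    {
        UpDown = other.UpDown;
        LeftRight = other.LeftRight;
        FrontBack = other.FrontBack;
    }
}

public sealed class Item
{
    public Loc StartLoc = new Loc();
    public List<string> AffectedBy = new List<string>();

    public Item() { }

    public Item(Item other)
    {
        StartLoc = new Loc(other.StartLoc);
        // Strings are immutable, so a new list of the same strings is deep enough.
        AffectedBy = new List<string>(other.AffectedBy);
    }
}

public sealed class ItemColl
{
    public List<Item> MyItems = new List<Item>();

    public ItemColl() { }

    // Deep copy: every Item (and its Loc) is duplicated, not shared.
    public ItemColl(ItemColl other)
    {
        MyItems = other.MyItems.Select(i => new Item(i)).ToList();
    }
}
```

Note that the copy a record's with expression makes is memberwise (shallow), so nested mutable objects such as lists would still need explicit copying like this.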
Well, I can't comment since my reputation is too low, but value types are usually built-in types such as int, float ...
Everything else is a reference type. A reference type is always a shallow copy, regardless of the ref keyword.
The ref keyword mainly serves value types, or acts as a safeguard.
If you want to deep copy, ICloneable is very useful.
If I implement an interface for a value type and try to cast it to a List of its interface type, why does this result in an error whereas the reference type converts just fine?
This is the error:
Cannot convert instance argument type
System.Collections.Generic.List<MyValueType> to
System.Collections.Generic.IEnumerable<MyInterfaceType>
I have to explicitly use the Cast<T> method to convert it, why?
Since IEnumerable is a readonly enumeration through a collection, it doesn't make any sense to me that it cannot be cast directly.
Here's example code to demonstrate the issue:
public interface I{}
public class T : I{}
public struct V: I{}
public void test()
{
var listT = new List<T>();
var listV = new List<V>();
var listIT = listT.ToList<I>(); //OK
var listIV = listV.ToList<I>(); //FAILS to compile, why?
var listIV2 = listV.Cast<I>().ToList(); //OK
}
Variance (covariance or contravariance) doesn't work for value types, only reference types:
Variance applies only to reference types; if you specify a value type for a variant type parameter, that type parameter is invariant for the resulting constructed type. (MSDN)
The values contained inside reference type variables are references (for example, addresses) and data addresses have the same size and are interpreted the same way, without any required change in their bit patterns.
In contrast, the values contained inside value type variables do not have the same size or the same semantics. Using them as reference types requires boxing and boxing requires type-specific instructions to be emitted by the compiler. It's not practical or efficient (sometimes maybe not even possible) for the compiler to emit boxing instructions for any possible kind of value type, therefore variance is disallowed altogether.
Basically, variance is practical thanks to the extra layer of indirection (the reference) from the variable to the actual data. Because value types lack that layer of indirection, they lack variance capabilities.
Combine the above with how LINQ operations work:
A Cast operation upcasts/boxes all elements (by accessing them through the non-generic IEnumerable, as you pointed out) and then verifies that all elements in a sequence can be successfully cast/unboxed to the provided type and then does exactly that. The ToList operation enumerates the sequence and returns a list from that enumeration.
Each one has its own job. If (say) ToList did the job of both, it would have the performance overhead of both, which is undesirable for most other cases.
I was looking at this question, and aside from a rather odd way to enumerate something, the op was having trouble because the enumerator is a struct. I understand that returning or passing a struct around uses a copy because it is a value type:
public MyStruct GetThingButActuallyJustCopyOfIt()
{
return this.myStructField;
}
or
public void PretendToDoSomething(MyStruct thingy)
{
thingy.desc = "this doesn't work as expected";
}
So my question is if MyStruct implements IMyInterface (such as IEnumerable), will these types of methods work as expected?
public struct MyStruct : IMyInterface { ... }
//will caller be able to modify the property of the returned IMyInterface?
public IMyInterface ActuallyStruct() { return (IMyInterface)this.myStruct; }
//will the interface you pass in get its value changed?
public void SetInterfaceProp(IMyInterface thingy)
{
thingy.desc = "the implementing type is a struct";
}
Yes, that code will work, but it needs explanation, because there is a whole world of code that will not work, and you're likely to trip into that unless you know this.
Before I forget: Mutable structs are evil. OK, with that out of the way, let's move on.
Let's take a simple example, you can use LINQPad to verify this code:
void Main()
{
var s = new MyStruct();
Test(s);
Debug.WriteLine(s.Description);
}
public void Test(IMyInterface i)
{
i.Description = "Test";
}
public interface IMyInterface
{
string Description { get; set; }
}
public struct MyStruct : IMyInterface
{
public string Description { get; set; }
}
When executing this, what will be printed?
null
OK, so why?
Well, the problem is this line:
Test(s);
This will in fact box that struct and pass the boxed copy to the method. You're successfully modifying that boxed copy, but not the original s variable, whose Description was never assigned anything and is thus still null.
OK, so if we change just one line in the first piece of code:
IMyInterface s = new MyStruct();
Does this change the outcome?
Yes, because now you're boxing that struct here, and always use the boxed copy. In this context it behaves like an object, you're modifying the boxed copy and writing out the contents of the boxed copy.
The problem thus crops up whenever you box or unbox that struct, then you get copies that live separate lives.
Conclusion: Mutable structs are evil.
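Both variants discussed above can be run side by side. This is a minimal sketch; IDescribed, MutableStruct and the method names are stand-ins for the IMyInterface and MyStruct of the example:

```csharp
public interface IDescribed
{
    string Description { get; set; }
}

public struct MutableStruct : IDescribed
{
    public string Description { get; set; }
}

public static class BoxedCopyDemo
{
    static void SetDescription(IDescribed target)
    {
        target.Description = "Test";
    }

    public static string ViaStructVariable()
    {
        var s = new MutableStruct();
        SetDescription(s);      // boxes a copy of s; s itself is untouched
        return s.Description;   // still null
    }

    public static string ViaInterfaceVariable()
    {
        IDescribed s = new MutableStruct(); // boxed once, here
        SetDescription(s);      // passes a reference to that same box
        return s.Description;   // "Test"
    }
}
```

The only difference between the two methods is the declared type of s, which decides whether the box is created once and kept or created at the call and thrown away.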
I see two answers about using ref here now, and this is barking up the wrong tree. Using ref means you've solved the problem before you added ref.
Here's an example.
If we change the Test method above to take a ref parameter:
public void Test(ref IMyInterface i)
Would this change anything?
No, because this code is now invalid:
var s = new MyStruct();
Test(ref s);
You'll get this:
The best overloaded method match for 'UserQuery.Test(ref UserQuery.IMyInterface)' has some invalid arguments
Argument 1: cannot convert from 'ref UserQuery.MyStruct' to 'ref UserQuery.IMyInterface'
And so you change the code to this:
IMyInterface s = new MyStruct();
Test(ref s);
But now you're back to my example, just having added ref, which I showed is not necessary for the change to propagate back.
So using ref is orthogonal, it solves different problems, but not this one.
OK, more comments regarding ref.
Yes, of course passing a struct around using ref will indeed make the changes flow throughout the program.
That is not what this question was about. The question posted some code, asked if it would work, and it would. In this particular variant of code it would work. But it's so easy to trip up. And pay particular note that the question was regarding structs and interfaces. If you leave interfaces out of it, and pass the struct around using ref, then what do you have? A different question.
Adding ref does not change this question, nor the answer.
Within the CLR, every value-type definition actually defines two kinds of things: a structure type, and a heap object type. A widening conversion exists from the structure type to the boxed object type, and a narrowing conversion exists from Object to the structure type. The structure type will behave with value semantics, and the heap object type will behave with mutable reference semantics. Note that the heap object types associated with all non-trivial structure types [i.e. those with any non-default states] are always mutable, and nothing in the structure definition can cause them to be otherwise.
Note that value types may be constrained, cast, or coerced to interface types, and cast or coerced to reference types. Consider:
void DoSomethingWithDisposable<T>(ref T p1,
List<int>.Enumerator p2) where T:IDisposable
{
IDisposable v1a = p1; // Coerced
Object v1b = p1; // Coerced
IDisposable v2a = (IDisposable)p2; // Cast
Object v2b = (Object)p2; // Cast
p1.Dispose(); // Constrained call
}
void blah( List<string>.Enumerator p1, List<int>.Enumerator p2) // These are value types
{
DoSomethingWithDisposable(ref p1,p2); // Constrains p1 to IDisposable
}
Constraining a generic type to an interface type does not affect its behavior as a value type. Casting or coercing a value type to an interface or reference type, however, will create a new instance of the heap object type and return a reference to that. That reference will then behave with reference-type semantics.
The behavior of value types with generic constraints can at times be very useful, and such usefulness can apply even when using mutating interfaces, but unfortunately there's no way to tell the compiler that a value type must remain as a value type, and that the compiler should warn if it would find itself converting it to something else. Consider the following three methods:
bool AdvanceIntEnumerator1(IEnumerator<int> it)
{ return it.MoveNext(); }
bool AdvanceIntEnumerator2<T>(ref T it) where T:IEnumerator<int>
{ return it.MoveNext(); }
bool AdvanceIntEnumeratorTwice<T>(ref T it) where T:IEnumerator<int>
{ return it.MoveNext() && AdvanceIntEnumerator1(it); }
If one passes to the first piece of code a variable of type List<int>.Enumerator, the system will copy its state to a new heap object, call MoveNext on that object, and abandon it. If one passes instead a variable of type IEnumerator<int> which holds a reference to a heap object of type List<int>.Enumerator, it will call MoveNext on that instance, which the calling code will still retain.
If one passes to the second piece of code a variable of type List<int>.Enumerator, the system will call MoveNext on that variable, thus changing its state. If one passes a variable of type IEnumerator<int>, the system will call MoveNext on the object referred to by that variable; the variable won't be modified (it will still point to the same instance), but the instance to which it points will be.
Passing to the third piece of code a variable of type List<int>.Enumerator will cause MoveNext to be called on that variable, thus changing its state. If that returns true, the system will copy the already-modified variable to a new heap object and call MoveNext on that. The object will then be abandoned, so the variable will only be advanced once, but the return value will indicate whether a second MoveNext would have succeeded. Passing the third piece of code a variable of type IEnumerator<int> which holds a reference to a List<int>.Enumerator, however, will cause that instance to be advanced twice.
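The difference between the first two methods can be observed with a short sketch. The helper names (AdvanceBoxed, AdvanceInPlace) and the sample list are assumptions made for the illustration:

```csharp
using System.Collections.Generic;

public static class EnumeratorCopyDemo
{
    // Interface-typed parameter: a struct argument is boxed into a heap copy.
    static bool AdvanceBoxed(IEnumerator<int> it) { return it.MoveNext(); }

    // Constrained generic with ref: MoveNext runs on the caller's variable.
    static bool AdvanceInPlace<T>(ref T it) where T : IEnumerator<int>
    {
        return it.MoveNext();
    }

    public static int AfterBoxedAdvance()
    {
        var list = new List<int> { 10, 20, 30 };
        List<int>.Enumerator e = list.GetEnumerator();
        e.MoveNext();        // e is now positioned on 10
        AdvanceBoxed(e);     // advances a boxed copy, which is then abandoned
        return e.Current;    // e has not moved
    }

    public static int AfterInPlaceAdvance()
    {
        var list = new List<int> { 10, 20, 30 };
        List<int>.Enumerator e = list.GetEnumerator();
        e.MoveNext();            // e is now positioned on 10
        AdvanceInPlace(ref e);   // advances e itself
        return e.Current;        // e has moved to the second element
    }
}
```

AfterBoxedAdvance returns the first element and AfterInPlaceAdvance the second, even though both made the same number of MoveNext calls.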
No. An interface is a contract; to make it work properly you need to use the ref keyword.
public void SetInterfaceProp(ref IMyInterface thingy)
{
thingy.desc = "the implementing type is a struct";
}
What matters here is a real type that stays inside that interface wrap.
To be more clear:
even if code with method SetInterfaceProp defined like
public void SetInterfaceProp(IMyInterface thingy)
{
thingy.desc = "the implementing type is a struct";
}
will work:
IMyInterface inter= default(MyStruct);
SetInterfaceProp(inter);
this one will not :
MyStruct inter = default(MyStruct);
SetInterfaceProp(inter);
You cannot guarantee that the caller of your method will always use IMyInterface, so to guarantee the expected behavior in this case, you can add the ref keyword, which will guarantee that the method runs as expected in both cases.
I've seen both terms be used almost interchangeably in various online explanations, and most text books I've consulted are also not entirely clear about the distinction.
Is there perhaps a clear and simple way of explaining the difference that you guys know of?
Type conversion (also sometimes known as type cast)
To use a value of one type in a context that expects another.
Nonconverting type cast (sometimes known as type pun)
A change that does not alter the underlying bits.
Coercion
Process by which a compiler automatically converts a value of one type into a value of another type when that second type is required by the surrounding context.
Type Conversion:
The word conversion refers to either implicitly or explicitly changing a value from one data type to another, e.g. a 16-bit integer to a 32-bit integer.
The word coercion is used to denote an implicit conversion.
The word cast typically refers to an explicit type conversion (as opposed to an implicit conversion), regardless of whether this is a re-interpretation of a bit-pattern or a real conversion.
So, coercion is implicit, cast is explicit, and conversion is any of them.
A few examples (from the same source):
Coercion (implicit):
double d;
int i;
if (d > i) d = i;
Cast (explicit):
double da = 3.3;
double db = 3.3;
double dc = 3.4;
int result = (int)da + (int)db + (int)dc; //result == 9
Usages vary, as you note.
My personal usages are:
A "cast" is the usage of a cast operator. A cast operator instructs the compiler that either (1) this expression is not known to be of the given type, but I promise you that the value will be of that type at runtime; the compiler is to treat the expression as being of the given type, and the runtime will produce an error if it is not, or (2) the expression is of a different type entirely, but there is a well-known way to associate instances of the expression's type with instances of the cast-to type. The compiler is instructed to generate code that performs the conversion. The attentive reader will note that these are opposites, which I think is a neat trick.
A "conversion" is an operation by which a value of one type is treated as a value of another type -- usually a different type, though an "identity conversion" is still a conversion, technically speaking. The conversion may be "representation changing", like int to double, or it might be "representation preserving" like string to object. Conversions may be "implicit", which do not require a cast, or "explicit", which do require a cast.
A "coercion" is a representation-changing implicit conversion.
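These usages can be illustrated with a few one-line methods. This is only a sketch of the definitions above; CastMeaningsDemo and its method names are invented for the example:

```csharp
using System;

public static class CastMeaningsDemo
{
    // Implicit, representation-preserving conversion: the same reference,
    // merely viewed as object.
    public static bool PreservesIdentity(string s)
    {
        object o = s;
        return ReferenceEquals(o, s);
    }

    // Cast, meaning (1): "trust me, this is a string". The runtime checks
    // the claim and throws InvalidCastException if it is false.
    public static string Downcast(object o) { return (string)o; }

    // Cast, meaning (2): "convert this double to an int". Code is generated
    // that changes the representation and truncates toward zero.
    public static int ConvertDouble(double d) { return (int)d; }

    // Coercion in the sense above: implicit and representation-changing.
    public static double Coerce(int i)
    {
        double d = i;
        return d;
    }
}
```

The same cast syntax appears in Downcast and ConvertDouble, but the first only verifies a type at run time while the second produces a new value.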
Casting is the process by which you treat an object of one type as another type; coercing is converting one object to another.
Note that in the former process there is no conversion involved: you have a type that you would like to treat as another. Say, for example, you have 3 different objects that inherit from a base type, and you have a method that takes that base type. At any point, if you know the specific child type, you can CAST the object to what it is and use all the specific methods and properties of that object, and that will not create a new instance of the object.
On the other hand, coercing implies the creation of a new object in memory of the new type and then the original type would be copied over to the new one, leaving both objects in memory (until the Garbage Collectors takes either away, or both).
As an example consider the following code:
class baseClass {}
class childClass : baseClass {}
class otherClass {}
public void doSomethingWithBase(baseClass item) {}
public void mainMethod()
{
var obj1 = new baseClass();
var obj2 = new childClass();
var obj3 = new otherClass();
doSomethingWithBase(obj1); //not a problem, obj1 is already of type baseClass
doSomethingWithBase(obj2); //not a problem, obj2 is implicitly cast to baseClass
doSomethingWithBase(obj3); //won't compile without additional code
}
obj1 is passed without any casting or coercing (conversion) because it's already of the same type baseClass
obj2 is implicitly cast to base, meaning there's no creation of a new object because obj2 can already be baseClass
obj3 needs to be converted somehow to base, you'll need to provide your own method to convert from otherClass to baseClass, which will involve creating a new object of type baseClass and filling it by copying the data from obj3.
A good example is the Convert C# class where it provides custom code to convert among different types.
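As a rough sketch of what "provide your own method" could look like for obj3: the Label property and the explicit operator below are my own invention for illustration; the original classes are empty.

```csharp
using System;

public class baseClass
{
    public string Label { get; set; }   // hypothetical member, not in the original
}

public class otherClass
{
    public string Label { get; set; }   // hypothetical member, not in the original

    // Hypothetical conversion: builds a brand-new baseClass and copies the data over.
    public static explicit operator baseClass(otherClass source)
    {
        return new baseClass { Label = source.Label };
    }
}

public static class ConversionDemo
{
    public static string doSomethingWithBase(baseClass item)
    {
        return item.Label;
    }

    public static void Main()
    {
        var obj3 = new otherClass { Label = "from otherClass" };
        baseClass converted = (baseClass)obj3;                // a new object is created here
        Console.WriteLine(doSomethingWithBase(converted));
        Console.WriteLine(ReferenceEquals(obj3, converted));  // False: two distinct objects

        // System.Convert plays a similar role for the built-in types.
        Console.WriteLine(Convert.ToInt32("42") + 1);         // 43
    }
}
```

Note that the cast syntax here runs user code and allocates, unlike the obj2 case, where the same object is merely viewed through its base type.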
According to Wikipedia,
In computer science, type conversion, type casting, type coercion, and type juggling are different ways of changing an expression from one data type to another.
The difference between type casting and type coercion is as follows:
TYPE CASTING                            | TYPE COERCION
----------------------------------------|---------------------------------------------
1. Explicit, i.e., done by the user     | 1. Implicit, i.e., done by the compiler
2. Types:                               | 2. Types:
   Static (done at compile time)        |    Widening (conversion to a higher data type)
   Dynamic (done at run time)           |    Narrowing (conversion to a lower data type)
3. Casting never changes the actual     | 3. Coercion can result in a representation
   type of the object nor its           |    change as well as a type change.
   representation.                      |
Note: Casting is not conversion. It is just the process by which we treat an object type as another type. Therefore, the actual type of object, as well as the representation, is not changed during casting.
I agree with @PedroC88's words:
On the other hand, coercing implies the creation of a new object in
memory of the new type and then the original type would be copied over
to the new one, leaving both objects in memory (until the Garbage
Collectors takes either away, or both).
Casting preserves the type of objects. Coercion does not.
Coercion is taking the value of a type that is NOT assignment compatible and converting to a type that is assignment compatible. Here I perform a coercion because Int32 does NOT inherit from Int64...so it's NOT assignment compatible. This is a widening coercion (no data lost). A widening coercion is a.k.a. an implicit conversion. A Coercion performs a conversion.
void Main()
{
System.Int32 a = 100;
System.Int64 b = a;
b.GetType();//The type is System.Int64.
}
Casting allows you to treat a type as if it were of a different type while also preserving the type.
void Main()
{
Derived d = new Derived();
Base bb = d;
//bb.N();//INVALID. Calls to members of Derived are not possible because bb is declared as type Base
bb.GetType();//The type is Derived. bb still refers to a Derived object despite not being able to call members of Derived
}
class Base
{
public void M() {}
}
class Derived: Base
{
public void N() {}
}
Source: The Common Language Infrastructure Annotated Standard by James S. Miller
Now what's odd is that Microsoft's documentation on casting does not align with the ECMA-335 specification's definition of casting.
Explicit conversions (casts): Explicit conversions require a cast
operator. Casting is required when information might be lost in the
conversion, or when the conversion might not succeed for other
reasons. Typical examples include numeric conversion to a type that
has less precision or a smaller range, and conversion of a base-class
instance to a derived class.
...This sounds like Coercions not Casting.
For example,
object o = 1;
int i = (int)o;//Explicit conversions require a cast operator
i.GetType();//The type has been explicitly converted to System.Int32. Object type is not preserved. This meets the definition of Coercion not casting.
Who knows? Maybe Microsoft is checking if anybody reads this stuff.
From the CLI standard:
I.8.3.2 Coercion
Sometimes it is desirable to take a value of a type that is not assignable-to a location, and convert
the value to a type that is assignable-to the type of the location. This is accomplished through
coercion of the value. Coercion takes a value of a particular type and a desired type and attempts
to create a value of the desired type that has equivalent meaning to the original value. Coercion
can result in representation change as well as type change; hence coercion does not necessarily
preserve object identity.
There are two kinds of coercion: widening, which never loses information, and narrowing, in
which information might be lost. An example of a widening coercion would be coercing a value
that is a 32-bit signed integer to a value that is a 64-bit signed integer. An example of a
narrowing coercion is the reverse: coercing a 64-bit signed integer to a 32-bit signed integer.
Programming languages often implement widening coercions as implicit conversions, whereas
narrowing coercions usually require an explicit conversion.
Some coercion is built directly into the VES operations on the built-in types (see §I.12.1). All
other coercion shall be explicitly requested. For the built-in types, the CTS provides operations
to perform widening coercions with no runtime checks and narrowing coercions with runtime
checks or truncation, according to the operation semantics.
I.8.3.3 Casting
Since a value can be of more than one type, a use of the value needs to clearly identify which of
its types is being used. Since values are read from locations that are typed, the type of the value
which is used is the type of the location from which the value was read. If a different type is to
be used, the value is cast to one of its other types. Casting is usually a compile time operation,
but if the compiler cannot statically know that the value is of the target type, a runtime cast check
is done. Unlike coercion, a cast never changes the actual type of an object nor does it change the
representation. Casting preserves the identity of objects.
For example, a runtime check might be needed when casting a value read from a location that is
typed as holding a value of a particular interface. Since an interface is an incomplete description
of the value, casting that value to be of a different interface type will usually result in a runtime
cast check.
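Here is how I read those two sections in C# terms. This is only a sketch; the class and method names are mine, and checked/unchecked are the C# way of choosing between "runtime checks or truncation" for a narrowing coercion.

```csharp
using System;
using System.IO;

public static class CliCoercionDemo
{
    // Widening coercion: never loses information, so C# allows it implicitly.
    public static long Widen(int value)
    {
        return value;
    }

    // Narrowing coercion with truncation: the high-order bits are discarded.
    public static short NarrowTruncating(int value)
    {
        return unchecked((short)value);
    }

    // Narrowing coercion with a runtime check: throws if the value does not fit.
    public static bool NarrowCheckedOverflows(int value)
    {
        try
        {
            short narrowed = checked((short)value);
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    // Casting: the value is viewed through another of its types.
    // The cast back from the interface needs a runtime cast check,
    // but the object itself is untouched.
    public static bool CastPreservesIdentity(Stream stream)
    {
        IDisposable disposable = stream;
        Stream back = (Stream)disposable;
        return ReferenceEquals(stream, back);
    }

    public static void Main()
    {
        Console.WriteLine(Widen(8));
        Console.WriteLine(NarrowTruncating(0x12345678).ToString("X")); // 5678
        Console.WriteLine(NarrowCheckedOverflows(0x12345678));         // True
        Console.WriteLine(CastPreservesIdentity(new MemoryStream()));  // True
    }
}
```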
Below is a posting from the following article:
The difference between coercion and casting is often neglected. I can see why; many languages have the same (or similar) syntax and terminology for both operations. Some languages may even refer to any conversion as “casting,” but the following explanation refers to concepts in the CTS.
If you are trying to assign a value of some type to a location of a different type, you can generate a value of the new type that has a similar meaning to the original. This is coercion. Coercion lets you use the new type by creating a new value that in some way resembles the original. Some coercions may discard data (e.g. converting the int 0x12345678 to the short 0x5678), while others may not (e.g. converting the int 0x00000008 to the short 0x0008, or the long 0x0000000000000008).
Recall that values can have multiple types. If your situation is slightly different, and you only want to select a different one of the value’s types, casting is the tool for the job. Casting simply indicates that you wish to operate on a particular type that a value includes.
The difference at the code level varies from C# to IL. In C#, both casting and coercion look fairly similar:
static void ChangeTypes(int number, System.IO.Stream stream)
{
long longNumber = number;
short shortNumber = (short)number;
IDisposable disposableStream = stream;
System.IO.FileStream fileStream = (System.IO.FileStream)stream;
}
At the IL level they are quite different:
ldarg.0
conv.i8
stloc.0
ldarg.0
conv.i2
stloc.1
ldarg.1
stloc.2
ldarg.1
castclass [mscorlib]System.IO.FileStream
stloc.3
As for the logical level, there are some important differences. What’s most important to remember is that coercion creates a new value, while casting does not. The identity of the original value and the value after casting are the same, while the identity of a coerced value differs from the original value; coercion creates a new, distinct instance, while casting does not. A corollary is that the result of casting and the original will always be equivalent (both in identity and equality), but a coerced value may or may not be equal to the original, and never shares the original identity.
It’s easy to see the implications of coercion in the examples above, as the numeric types are always copied by value. Things get a bit trickier when you’re working with reference types.
class Name : Tuple<string, string>
{
public Name(string first, string last)
: base(first, last)
{
}
public static implicit operator string[](Name name)
{
return new string[] { name.Item1, name.Item2 };
}
}
In the example below, one conversion is a cast, while the other is a coercion.
Tuple<string, string> tuple = name;
string[] strings = name;
After these conversions, tuple and name are equal, but strings is not equal to either of them. You could make the situation slightly better (or slightly more confusing) by implementing Equals() and operator ==() on the Name class to compare a Name and a string[]. These operators would “fix” the comparison issue, but you would still have two separate instances; any modification to strings would not be reflected in name or tuple, while changes to either one of name or tuple would be reflected in name and tuple, but not in strings.
Although the example above was meant to illustrate some differences between casting and coercion, it also serves as a great example of why you should be extremely cautious about using conversion operators with reference types in C#.
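The identity claims about the Name example can be checked with ReferenceEquals. The sketch below repeats the Name class so that it compiles on its own; the NameDemo class, its helper methods, and the sample names are my additions.

```csharp
using System;

public class Name : Tuple<string, string>
{
    public Name(string first, string last)
        : base(first, last)
    {
    }

    public static implicit operator string[](Name name)
    {
        return new string[] { name.Item1, name.Item2 };
    }
}

public static class NameDemo
{
    // Cast: the Tuple variable refers to the very same object.
    public static bool CastSharesIdentity(Name name)
    {
        Tuple<string, string> tuple = name;
        return ReferenceEquals(tuple, name);
    }

    // Coercion: the conversion operator allocates a new array.
    public static bool CoercionSharesIdentity(Name name)
    {
        string[] strings = name;
        return ReferenceEquals(strings, name);
    }

    public static void Main()
    {
        var name = new Name("Ada", "Lovelace");
        Console.WriteLine(CastSharesIdentity(name));     // True
        Console.WriteLine(CoercionSharesIdentity(name)); // False

        string[] strings = name;
        strings[0] = "Grace";                            // modifies only the array
        Console.WriteLine(name.Item1);                   // Ada
    }
}
```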
Why is this a compile time error?
public TCastTo CastMe<TSource, TCastTo>(TSource i)
{
return (TCastTo)i;
}
Error:
Cannot convert type 'TSource' to 'TCastTo'
And why is this a runtime error?
public TCastTo CastMe<TSource, TCastTo>(TSource i)
{
return (TCastTo)(object)i;
}
int a = 4;
long b = CastMe<int, long>(a); // InvalidCastException
// this contrived example works
int aa = 4;
int bb = CastMe<int, int>(aa);
// this also works, the problem is limited to value types
string s = "foo";
object o = CastMe<string, object>(s);
I've searched SO and the internet for an answer to this and found lots of explanations on similar generic related casting issues, but I can't find anything on this particular simple case.
Why is this a compile time error?
The problem is that every possible combination of value types has different rules for what a cast means. Casting a 64 bit double to a 16 bit int is completely different code from casting a decimal to a float, and so on. The number of possibilities is enormous. So think like the compiler. What code is the compiler supposed to generate for your program?
The compiler would have to generate code that starts the compiler again at runtime, does a fresh analysis of the types, and dynamically emits the appropriate code.
That seems like perhaps more work and less performance than you expected to get with generics, so we simply outlaw it. If what you really want is for the compiler to start up again and do an analysis of the types, use "dynamic" in C# 4; that's what it does.
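For reference, a minimal sketch of the "dynamic" route, assuming C# 4 or later with the runtime binder (Microsoft.CSharp) available. The wrapper class name is made up; CastMe keeps the signature from the question.

```csharp
using System;

public static class DynamicCast
{
    // Routing the cast through dynamic defers the choice of conversion
    // to the runtime binder, which analyzes the actual types at run time.
    public static TCastTo CastMe<TSource, TCastTo>(TSource i)
    {
        return (TCastTo)(dynamic)i;
    }

    public static void Main()
    {
        int a = 4;
        long b = CastMe<int, long>(a);   // no InvalidCastException this time
        Console.WriteLine(b);            // 4
        Console.WriteLine(b.GetType());  // System.Int64
    }
}
```

The price is exactly the one described above: the analysis happens at run time, so a conversion that does not exist fails at run time instead of at compile time.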
And why is this a runtime error?
Same reason.
A boxed int may only be unboxed to int (or int?), for the same reason as above; if the CLR tried to do every possible conversion from a boxed value type to every other possible value type then essentially it has to run a compiler again at runtime. That would be unexpectedly slow.
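A small sketch of that unboxing rule; the class and method names are illustrative only.

```csharp
using System;

public static class UnboxDemo
{
    // Unboxing a boxed int directly to long is not allowed at run time.
    public static bool UnboxToLongThrows(object boxed)
    {
        try
        {
            long unused = (long)boxed;
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Unbox to the exact type first, then let the compiler emit the
    // int-to-long numeric conversion, which it knows statically.
    public static long UnboxThenWiden(object boxedInt)
    {
        return (long)(int)boxedInt;
    }

    // Unboxing to the nullable form of the exact type is also fine.
    public static int? UnboxToNullable(object boxedInt)
    {
        return (int?)boxedInt;
    }

    public static void Main()
    {
        object boxed = 4;
        Console.WriteLine(UnboxToLongThrows(boxed)); // True
        Console.WriteLine(UnboxThenWiden(boxed));    // 4
        Console.WriteLine(UnboxToNullable(boxed));   // 4
    }
}
```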
So why is it not an error for reference types?
Because every reference type conversion is the same as every other reference type conversion: you interrogate the object to see if it is derived from or identical to the desired type. If it's not, you throw an exception (if doing a cast) or result in null/false (if using the "as/is" operators). The rules are consistent for reference types in a way that they are not for value types. Remember reference types know their own type. Value types do not; with value types, the variable doing the storage is the only thing that knows the type semantics that apply to those bits. Value types contain their values and no additional information. Reference types contain their values plus lots of extra data.
For more information see my article on the subject:
http://ericlippert.com/2009/03/03/representation-and-identity/
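To illustrate the three outcomes for reference types mentioned above (exception for a cast, null for "as", false for "is"), here is a short sketch with made-up names:

```csharp
using System;

public static class ReferenceCastDemo
{
    // A cast interrogates the object's runtime type and throws on a mismatch.
    public static bool CastToStringThrows(object o)
    {
        try
        {
            string unused = (string)o;
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // "as" performs the same check but yields null on a mismatch.
    public static string AsString(object o)
    {
        return o as string;
    }

    // "is" performs the same check and yields false on a mismatch.
    public static bool IsString(object o)
    {
        return o is string;
    }

    public static void Main()
    {
        object text = "foo";
        object other = new object();

        Console.WriteLine(CastToStringThrows(text));  // False
        Console.WriteLine(CastToStringThrows(other)); // True
        Console.WriteLine(AsString(other) == null);   // True
        Console.WriteLine(IsString(text));            // True
        Console.WriteLine(IsString(other));           // False
    }
}
```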
C# uses one cast syntax for multiple different underlying operations:
upcast
downcast
boxing
unboxing
numeric conversion
user-defined conversion
In generic context, the compiler has no way of knowing which of those is correct, and they all generate different MSIL, so it bails out.
By writing return (TCastTo)(object)i; instead, you force the compiler to do an upcast to object, followed by a downcast to TCastTo. The compiler will generate code, but if that wasn't the right way to convert the types in question, you'll get a runtime error.
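To see the six operations side by side, here is a sketch with non-generic types, where the compiler can tell which one is meant. The Celsius struct is a hypothetical type invented only to show a user-defined conversion, and the IL instructions in the comments are what I would expect the compiler to emit.

```csharp
using System;
using System.IO;

// Hypothetical type, used only to demonstrate a user-defined conversion.
public struct Celsius
{
    public double Degrees;

    public Celsius(double degrees)
    {
        Degrees = degrees;
    }

    public static explicit operator double(Celsius c)
    {
        return c.Degrees;
    }
}

public static class CastSyntaxDemo
{
    public static void Main()
    {
        Stream stream = new MemoryStream();

        object up = (object)stream;                  // upcast: no conversion instruction needed
        MemoryStream down = (MemoryStream)stream;    // downcast: castclass
        object boxed = (object)42;                   // boxing: box
        int unboxed = (int)boxed;                    // unboxing: unbox.any
        long widened = (long)unboxed;                // numeric conversion: conv.i8
        double degrees = (double)new Celsius(21.5);  // user-defined conversion: call op_Explicit

        Console.WriteLine(ReferenceEquals(up, down)); // True: both are the same stream
        Console.WriteLine(widened);                   // 42
        Console.WriteLine(degrees == 21.5);           // True
    }
}
```

In every line the syntax is the same (Type)expression, which is why a generic method with unconstrained type parameters gives the compiler nothing to choose between them.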
Code Sample:
using System;
using System.Linq.Expressions;

public static class DefaultConverter<TInput, TOutput>
{
    private static Converter<TInput, TOutput> cached;

    // Builds and compiles the conversion delegate once per (TInput, TOutput) pair.
    static DefaultConverter()
    {
        ParameterExpression p = Expression.Parameter(typeof(TInput));
        cached = Expression.Lambda<Converter<TInput, TOutput>>(Expression.Convert(p, typeof(TOutput)), p).Compile();
    }

    public static Converter<TInput, TOutput> Instance { get { return cached; } }
}

public static class DefaultConverter<TOutput>
{
    // Uses the cached compiled delegate for the static types TInput and TOutput.
    public static TOutput ConvertBen<TInput>(TInput from) { return DefaultConverter<TInput, TOutput>.Instance.Invoke(from); }

    // Lets the runtime binder pick the conversion on every call.
    public static TOutput ConvertEric(dynamic from) { return from; }
}
Eric's way sure is shorter, but I think mine should be faster.
The compile error occurs because the compiler has no guarantee that any conversion from TSource to TCastTo exists. The two types may share a branch of their inheritance tree, but nothing requires it. If you only want to call this with types that share an ancestor, change the CastMe() signature to use the ancestor type instead of generics.
The runtime error example avoids the compile error from your first example by first casting the TSource i to object, which every type in C# derives from. The compiler doesn't complain (because a conversion from object to something that derives from it could be valid), but a cast using the (Type)variable syntax throws if it turns out to be invalid at run time. This is the same problem the compiler prevented in example 1.
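If you want to keep the generic signature, a variation on the ancestor idea is a type-parameter constraint. This is my own sketch and not exactly the change suggested above: with "where TSource : TCastTo" the compiler knows the conversion exists, so no cast is needed at all.

```csharp
using System;

public static class ConstrainedCast
{
    // The constraint guarantees that TSource is, derives from,
    // or implements TCastTo, so the conversion is implicit.
    public static TCastTo CastMe<TSource, TCastTo>(TSource i)
        where TSource : TCastTo
    {
        return i;
    }

    public static void Main()
    {
        object o = CastMe<string, object>("foo");      // reference conversion
        IComparable c = CastMe<int, IComparable>(4);   // boxing conversion

        Console.WriteLine(o);                 // foo
        Console.WriteLine(c.CompareTo(4));    // 0

        // long b = CastMe<int, long>(4);     // does not compile: int does not derive from long
    }
}
```

This only covers the inheritance and interface cases; numeric conversions such as int to long are still rejected, now at compile time.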
Another solution, which does something similar to what you're looking for...
public static T2 CastTo<T, T2>(T input, Func<T, T2> convert)
{
return convert(input);
}
You'd call it like this.
int a = 314;
long b = CastTo(a, i=>(long)i);
Hopefully this helps.