List, array and IEnumerable covariance - c#

I'll start with several postulates to better explain the context of my question:
Array Covariance
Postulate 1.1
An array of a value type is not covariant. int[] cannot pass for object[].
Postulate 1.2
An array of a reference type is covariant with a valid IEnumerable. string[] can pass for IEnumerable<object>.
Postulate 1.3
An array of a reference type is covariant with a valid covariant array. string[] can pass for object[].
List Covariance
Postulate 2.1 (same as 1.1)
A list of a value type is not covariant. List<int> cannot pass for List<object>.
Postulate 2.2 (same as 1.2)
A list of a reference type is covariant with a valid IEnumerable. List<string> can pass for IEnumerable<object>.
Postulate 2.3 (different from 1.3)
A list of a reference type is not covariant with a valid covariant List. List<string> cannot pass for List<object>.
My question concerns postulates 1.3, 2.2 and 2.3. Specifically:
Why can string[] pass for object[], but List<string> not for List<object>?
Why can List<string> pass for IEnumerable<object> but not for List<object>?

List covariance is unsafe:
List<string> strings = new List<string> { "a", "b", "c" };
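List<object> objects = strings; // does not compile; if it did, the next line would break type safety
objects.Add(1); // would add an int to what is really a List<string>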
Array covariance is also unsafe for the same reason:
string[] strings = new[] { "a", "b", "c" };
object[] objects = strings;
objects[0] = 1; //throws ArrayTypeMismatchException
Array covariance in C# is recognised as a mistake, but it has been in the language since version 1.
Since the collection cannot be modified through the IEnumerable<T> interface, it is safe to type a List<string> as an IEnumerable<object>.
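A minimal sketch of the contrast described above, assuming a plain console program (the class and method names are made up for illustration). It shows that a write through the object[] view is rejected at run time, while the read-only IEnumerable<object> view of a List<string> has no write path to guard:

```csharp
using System;
using System.Collections.Generic;

public static class CovarianceSafetyDemo
{
    // Returns true if storing a non-string through the object[] view throws.
    public static bool ArrayWriteThrows()
    {
        string[] strings = { "a", "b", "c" };
        object[] objects = strings;              // allowed: array covariance
        try
        {
            objects[0] = 1;                      // the runtime checks the real element type
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    // Counts the elements seen through the read-only IEnumerable<object> view.
    public static int CountThroughObjectView()
    {
        List<string> strings = new List<string> { "a", "b", "c" };
        IEnumerable<object> objects = strings;   // allowed: IEnumerable<out T> only reads
        int count = 0;
        foreach (object o in objects) count++;
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(ArrayWriteThrows());        // True
        Console.WriteLine(CountThroughObjectView());  // 3
    }
}
```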

Arrays are covariant, but a System.Int32[] does not hold references to things which are derived from System.Object. Within the .NET runtime, each value-type definition actually defines two kinds of things: a heap object type and a value (storage location) type. The heap object type is derived from System.Object; the storage location type is implicitly convertible to the heap object type (which in turn derives from System.Object) but does not itself actually derive from System.Object nor anything else. Although all arrays, including System.Int32[] are heap-object types, the individual elements of a System.Int32[] are instances of the storage location type.
The reason that a String[] can be passed to code expecting an Object[] is that the former contains "references to heap-object instances of type derived from type String", and the latter likewise for type Object. Since String derives from Object, a reference to a heap-object of a type derived from String will also be a reference to a heap object which derives from Object, and a String[] will contain references to heap objects which derive from Object--exactly what code would expect to read from an Object[]. By contrast, because an int[] [i.e. System.Int32[]] does not contain references to heap-object instances of type Int32, its contents will not conform to the expectations of code which is expecting Object[].
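As a rough illustration of the point above, the run-time type test below asks whether each array can be viewed as an object[]. The class and method names are hypothetical. The arrays are held in a variable of type object first so that the check happens at run time rather than being decided by the compiler:

```csharp
using System;

public static class ArrayElementKindDemo
{
    // A string[] holds references, so it can be viewed as an object[].
    public static bool StringArrayIsObjectArray()
    {
        object array = new string[] { "a", "b" };
        return array is object[];
    }

    // An int[] holds raw 32-bit values, not references, so it cannot.
    public static bool IntArrayIsObjectArray()
    {
        object array = new int[] { 1, 2 };
        return array is object[];
    }

    public static void Main()
    {
        Console.WriteLine(StringArrayIsObjectArray()); // True
        Console.WriteLine(IntArrayIsObjectArray());    // False
    }
}
```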

Related

I can convert List<string> to IEnumerable<object>, but I can't convert List<int> to IEnumerable<object>. Why not? [duplicate]

IEnumerable<T> is covariant, but only for reference types, not for value types. The simple code below compiles successfully:
IEnumerable<string> strList = new List<string>();
IEnumerable<object> objList = strList;
But changing string to int produces a compile error:
IEnumerable<int> intList = new List<int>();
IEnumerable<object> objList = intList;
The reason is explained in MSDN:
Variance applies only to reference types; if you specify a value type for a variant type parameter, that type parameter is invariant for the resulting constructed type.
I have searched and found some questions mentioning that the reason is boxing between value types and reference types. But that still does not make it clear to me why boxing is the reason.
Could someone please give a simple and detailed explanation of why covariance and contravariance do not support value types and how boxing affects this?
Basically, variance applies when the CLR can ensure that it doesn't need to make any representational change to the values. References all look the same - so you can use an IEnumerable<string> as an IEnumerable<object> without any change in representation; the native code itself doesn't need to know what you're doing with the values at all, so long as the infrastructure has guaranteed that it will definitely be valid.
For value types, that doesn't work - to treat an IEnumerable<int> as an IEnumerable<object>, the code using the sequence would have to know whether to perform a boxing conversion or not.
You might want to read Eric Lippert's blog post on representation and identity for more on this topic in general.
EDIT: Having reread Eric's blog post myself, it's at least as much about identity as representation, although the two are linked. In particular:
This is why covariant and contravariant conversions of interface and delegate types require that all varying type arguments be of reference types. To ensure that a variant reference conversion is always identity-preserving, all of the conversions involving type arguments must also be identity-preserving. The easiest way to ensure that all the non-trivial conversions on type arguments are identity-preserving is to restrict them to be reference conversions.
It is perhaps easier to understand if you think about the underlying representation (even though this really is an implementation detail). Here is a collection of strings:
IEnumerable<string> strings = new[] { "A", "B", "C" };
You can think of the strings as having the following representation:
[0] : string reference -> "A"
[1] : string reference -> "B"
[2] : string reference -> "C"
It is a collection of three elements, each being a reference to a string. You can cast this to a collection of objects:
IEnumerable<object> objects = (IEnumerable<object>) strings;
Basically it is the same representation except now the references are object references:
[0] : object reference -> "A"
[1] : object reference -> "B"
[2] : object reference -> "C"
The representation is the same. The references are just treated differently; you can no longer access the string.Length property but you can still call object.GetHashCode(). Compare this to a collection of ints:
IEnumerable<int> ints = new[] { 1, 2, 3 };
[0] : int = 1
[1] : int = 2
[2] : int = 3
To convert this to an IEnumerable<object> the data has to be converted by boxing the ints:
[0] : object reference -> 1
[1] : object reference -> 2
[2] : object reference -> 3
This conversion requires more than a cast.
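A small sketch of what "more than a cast" can look like in practice, assuming LINQ is available. Enumerable.Cast<object>() produces a new sequence in which each int has been boxed, whereas the original sequence itself is not an IEnumerable<object>. The class and method names are illustrative only:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BoxingConversionDemo
{
    // True only when the sequence can be viewed as IEnumerable<object> without conversion.
    public static bool IsObjectSequence(object sequence)
    {
        return sequence is IEnumerable<object>;
    }

    // Builds a new sequence in which every int is boxed into an object.
    public static IEnumerable<object> BoxAll(IEnumerable<int> ints)
    {
        return ints.Cast<object>();
    }

    public static void Main()
    {
        Console.WriteLine(IsObjectSequence(new List<string> { "A" }));  // True
        Console.WriteLine(IsObjectSequence(new List<int> { 1, 2, 3 })); // False

        foreach (object boxed in BoxAll(new[] { 1, 2, 3 }))
        {
            Console.WriteLine(boxed.GetType().Name);                    // Int32 (three times)
        }
    }
}
```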
I think everything starts from the definition of the LSP (Liskov Substitution Principle), which states:
if q(x) is a property provable about objects x of type T then q(y) should be true for objects y of type S where S is a subtype of T.
But a value type, for example int, cannot be substituted for object in C#.
The proof is very simple:
int myInt = new int();
object obj1 = myInt;
object obj2 = myInt;
return ReferenceEquals(obj1, obj2);
This returns false even though both variables were assigned from the same int, because each assignment boxes the value into a separate object.
It does come down to an implementation detail: Value types are implemented differently to reference types.
If you force value types to be treated as reference types (i.e. box them, e.g. by referring to them via an interface) you can get variance.
The easiest way to see the difference is to consider an array: the elements of an array of value types are laid out contiguously (directly) in memory, whereas an array of reference types holds only the references (pointers) contiguously in memory; the objects being pointed to are allocated separately.
The other (related) issue(*) is that (almost) all Reference types have the same representation for variance purposes and much code does not need to know of the difference between types, so co- and contra-variance is possible (and easily implemented -- often just by omission of extra type checking).
(*) It may be seen to be the same issue...
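The remark above about boxing value types via an interface can be sketched as follows. This is only an illustration with made-up names: the ints are boxed when they are stored as IComparable, so the sequence holds references and the usual reference-type variance applies.

```csharp
using System;
using System.Collections.Generic;

public static class BoxedVarianceDemo
{
    public static int CountAsObjects()
    {
        // Each int is boxed on insertion, so the list stores references.
        List<IComparable> boxed = new List<IComparable> { 1, 2, 3 };
        IEnumerable<IComparable> comparables = boxed;

        // IComparable is a reference type, so the covariant conversion is allowed.
        IEnumerable<object> objects = comparables;

        int count = 0;
        foreach (object o in objects) count++;
        return count;
    }

    public static void Main()
    {
        Console.WriteLine(CountAsObjects()); // 3
    }
}
```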

Array with mixed data-types in C#

Given the following code:
void Main()
{
dynamic[] arr = { 5, "test2", "test3"};
foreach (var i in arr)
{
Console.WriteLine(i.GetType().Name);
}
}
it prints the following:
Int32
String
String
I can't understand how an array can have elements of different types. Coming from a C background, I would expect all array elements to have the same type and each element to take the same amount of RAM, because in C something like arr[i] is equivalent to *(arr + i), and the pointer arr moves i * sizeof(arr data type) bytes.
dynamic[] arr = { 5, "test2", "test3"};
results in object[] (you can see if you call arr.GetType()).
The array contains objects of the same type; in this case the type is object.
Boxing and unboxing
The elements in your array are boxed. This passage is from Boxing and Unboxing (C# Programming Guide).
Boxing is the process of converting a value type to the type object or to any interface type implemented by this value type. When the CLR boxes a value type, it wraps the value inside a System.Object instance and stores it on the managed heap.
An object[] array, even for value types, does not contain the objects themselves; it contains references to them (btw. string is a reference type in C#).
Again, this is from Boxing and Unboxing (C# Programming Guide).
dynamic in C#
I think the first sentence from Using type dynamic (C# Programming Guide) could clarify how dynamic works in C#.
C# 4 introduces a new type, dynamic. The type is a static type, but an object of type dynamic bypasses static type checking.
The quote from Built-in reference types (C# reference) may even be better.
The dynamic type indicates that use of the variable and references to its members bypass compile-time type checking. Instead, these operations are resolved at run time. (...)
Type dynamic behaves like type object in most circumstances.
Remember that in C#, all classes inherit from Object.
An Object[] array is actually an array containing pointers to the actual objects, so every slot has the same size regardless of what it points to.
A dynamic[] array is treated as an Object[], and therefore accepts any data type.
Structs are value types and are not stored as references, so the runtime uses a trick called boxing to put the struct inside an object, which allows the struct to be stored in the array.
A dynamic value is stored as an object, but operations on it are resolved at run time rather than at compile time, which costs extra time and memory. Think of dynamic as a fancy object.
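A short sketch that checks the claims above, with hypothetical names. It inspects the run-time type of the array and of its elements; the elements are converted to object before calling GetType() so that no dynamic binding is involved:

```csharp
using System;

public static class DynamicArrayDemo
{
    // The run-time type of a dynamic[] is object[].
    public static string ArrayTypeName()
    {
        dynamic[] arr = { 5, "test2", "test3" };
        return arr.GetType().Name;
    }

    // Each slot holds a reference; the int 5 was boxed when it was stored.
    public static string ElementTypeName(int index)
    {
        dynamic[] arr = { 5, "test2", "test3" };
        object element = arr[index];
        return element.GetType().Name;
    }

    public static void Main()
    {
        Console.WriteLine(ArrayTypeName());     // Object[]
        Console.WriteLine(ElementTypeName(0));  // Int32
        Console.WriteLine(ElementTypeName(1));  // String
    }
}
```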

C#/.NET - why is string[] castable to object[] but int[] is not? [duplicate]

I came across this question: why is it impossible to cast int[] to object[], e.g.
object[] o = new int[] { 0, 1, 2 };
Meanwhile, I can cast it to plain object and back to int[].
I'd be glad to hear an in-depth answer.
Directly from the docs:
Array covariance specifically does not extend to arrays of value-types. For example, no conversion exists that permits an int[] to be treated as an object[].
An array of ints or any other value-type is not an array of objects. Value types have different storage characteristics to those of reference types. An array of (reference type) objects holds a list of object references (with the objects themselves living in the heap), so the slots will always be a constant width. Value types, on the other hand, store their value directly in the array, so the slots might be any width. This makes a conversion between the two meaningless.
It's a little confusing because even though value-types are derived from System.Object, they behave very differently to reference types, and object-like behaviour of value types (e.g. boxing) is only possible through magical handling of them by the compiler and runtime, and it doesn't extend to arrays.
As a side note, casting arrays is a well dodgy practice. I wouldn't do it.
For an instance of type A to be castable to type B, one of the following conditions must be true:
  there is an implicit/explicit conversion from A to B;
  there is a hierarchical relationship. Such a relationship can be achieved in one of two ways:
    deriving A from B (e.g., class A : B {})
    covariance/contravariance. C# allows covariance for:
      arrays of reference types (string[] to object[]) (*)
      generic type arguments in interfaces/delegates (IEnumerable<string> to IEnumerable<object> and Func<string> to Func<object>)
      delegates ( string Method() {} can be assigned to delegate object Del(); )
You cannot cast int[] to object[] because none of the above conditions are true.
(*) - You should avoid this though - array covariance is broken, and it was added simply so that the CLR would support Java-like languages.
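The two delegate-related forms of covariance listed above can be sketched like this. The delegate type ObjectFactory and the method MakeString are invented for the example:

```csharp
using System;

public static class DelegateCovarianceDemo
{
    // A delegate type that returns object.
    public delegate object ObjectFactory();

    // A method that returns the more derived type string.
    public static string MakeString()
    {
        return "made";
    }

    // Generic delegate variance: Func<out TResult> is covariant in TResult.
    public static object ViaGenericDelegate()
    {
        Func<string> makeString = MakeString;
        Func<object> makeObject = makeString;
        return makeObject();
    }

    // Method group conversion: a string-returning method fits a delegate returning object.
    public static object ViaMethodGroup()
    {
        ObjectFactory factory = MakeString;
        return factory();
    }

    public static void Main()
    {
        Console.WriteLine(ViaGenericDelegate()); // made
        Console.WriteLine(ViaMethodGroup());     // made
    }
}
```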
Although type System.Int32 derives from object, and references to System.Int32 object instances can be used as references to System.Object, an array of type System.Int32[] does not hold instances of System.Int32, nor does it hold references to them. Instead, each element of an array will hold just the 32-bit numeric value associated with an Int32, without holding any of the other information associated with an object instance. Although C# will allow code like:
Object[] array = new Object[3];
int five = 5;
array[0] = five;
array[1] = five;
array[2] = array[0];
the code isn't storing five, nor a reference to it, into the array. Instead, the assignment to array[0] will create a new object of type System.Int32 which holds the number 5 and store a reference to that. The assignment to array[1] will then create another new object of type System.Int32, which also holds the value 5, and store a reference to that. The third assignment will store into array[2] a reference to the same object as array[0]. Note that even though all three array slots seem to hold the number 5, they actually hold more information than that. The array also encapsulates the fact that array[0] and array[2] hold references to one object, while array[1] holds a reference to another.
When a reference type is cast to its parent type, the resulting reference is required to identify the same object as the original. Consequently, the object identified by the resulting reference cannot encapsulate any more information than the object identified by the original (it's the same object, after all!). Because an Object[], even one whose elements all identify instances of System.Int32, encapsulates information beyond what can be stored in an int[], it is not possible for an int[] and an Object[] to be one and the same object.
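The identity information described above can be observed directly. The sketch below repeats the three assignments from the answer and compares the slots by reference and by value; the class and method names are illustrative:

```csharp
using System;

public static class BoxIdentityDemo
{
    private static object[] BuildArray()
    {
        object[] array = new object[3];
        int five = 5;
        array[0] = five;       // boxes: first new object
        array[1] = five;       // boxes again: a second, distinct object
        array[2] = array[0];   // copies the reference, no new box
        return array;
    }

    public static bool SlotsZeroAndOneShareObject()
    {
        object[] array = BuildArray();
        return ReferenceEquals(array[0], array[1]);
    }

    public static bool SlotsZeroAndTwoShareObject()
    {
        object[] array = BuildArray();
        return ReferenceEquals(array[0], array[2]);
    }

    public static bool SlotsZeroAndOneEqualByValue()
    {
        object[] array = BuildArray();
        return Equals(array[0], array[1]);
    }

    public static void Main()
    {
        Console.WriteLine(SlotsZeroAndOneShareObject());  // False
        Console.WriteLine(SlotsZeroAndTwoShareObject());  // True
        Console.WriteLine(SlotsZeroAndOneEqualByValue()); // True
    }
}
```

An int[] has nowhere to record which slots share a box, which is the extra information the answer refers to.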

C# FieldInfo.SetValue with an array parameter and arbitrary element type

I am trying to set an array field using reflection like this:
FieldInfo field = ...
A[] someArray = GetElementsInSomeWay();
field.SetValue(this, someArray);
The field has type B[]. B inherits from A and the exact type of B is not known at compile time.
GetElementsInSomeWay() returns A[] but the real elements inside are all B's. GetElementsInSomeWay() is a library method and can't be changed.
What I can do at most is to get the B with System.Type type = field.FieldType.GetElementType().
However I can't cast the array to the required type, e.g.
someArray as type[] because [] requires an exact type before it to declare an array type. Or am I missing something here? Can I declare an array of some type, if the type becomes known in runtime using System.Type variable?
Doing it the direct way produces the following error (here A is UnityEngine.Component and B is AbilityResult which can also be one of a few dozens other classes, all inheriting (possibly thru a long inheritance chain) from UnityEngine.Component):
ArgumentException: Object type UnityEngine.Component[] cannot be converted to target type: AbilityResult[]
Parameter name: val
System.Reflection.MonoField.SetValue (System.Object obj, System.Object val, BindingFlags invokeAttr, System.Reflection.Binder binder, System.Globalization.CultureInfo culture) (at /Applications/buildAgent/work/3df08680c6f85295/mcs/class/corlib/System.Reflection/MonoField.cs:133)
System.Reflection.FieldInfo.SetValue (System.Object obj, System.Object value) (at /Applications/buildAgent/work/3df08680c6f85295/mcs/class/corlib/System.Reflection/FieldInfo.cs:150)
I think you need to understand the variance of arrays better. Arrays like object[] and string[] already existed in .NET 1, long before generics (.NET 2). Otherwise they might have been called Array<object> and Array<string>.
Now, we all know that any string is an object. If this fact implies, for some "construction" Xxx, that any Xxx<string> is an Xxx<object>, then we call this covariance. Since .NET 4, some generic interfaces and generic delegate types can be covariant, and this is marked with an out keyword in their definition, as in
public interface IEnumerable<out T>
{
// ...
}
If the relation is reversed upon "applying" Xxx<>, then that's called contravariance. So "string is object" with contravariance becomes "Xxx<object> is Xxx<string>".
Now, back to arrays. The natural question here is: Are arrays covariant or contravariant or neither ("invariant")? Since you can both read from and write to each "cell" in an array, they can't be fully covariant or contravariant. But try this code:
string[] arr1 = { "these", "are", "strings", };
object[] arr2 = arr1; // works! covariance!
var runtimeType = arr2.GetType(); // System.String[]
So a T[] is covariant in .NET. But didn't I just say I could write (like in not out) to an array? So what if I continue like this:
arr2[0] = new object();
I'm trying to put an object which is not a string into the zeroth slot. The above line has to compile (the compile-time type of arr2 is object[]). But it also has to raise an exception at run time. That's the problem with arrays and covariance.
So what about contravariance? Try this:
object[] arrA = { new object(), DateTime.Now, "hello", };
string[] arrB = arrA; // won't compile, no contravariance
Making an explicit cast from object[] to string[] compiles, but it still fails at run time.
And now, the answer to your question: You are trying to apply contravariance to arrays. AbilityResult is Component, but that does not imply that Component[] is AbilityResult[]. Whoever wrote the GetElementsInSomeWay() method chose to create a new Component[]. Even if all the components he put into it are AbilityResult, there's still no contravariance. The author of GetElementsInSomeWay() could have chosen to make a new AbilityResult[] instead. He could still have kept the return type Component[] because of the covariance of .NET arrays.
Lesson to learn: The real type (run-time type as revealed by .GetType()) of an array will not change just because you cast. .NET does allow covariance (a variable of type Component[] might hold an object whose real type is AbilityResult[]). And finally, .NET does not allow contravariance (a variable of type AbilityResult[] never holds a reference to an object of real type Component[]).
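A compact sketch of the lesson above, using string and object in place of AbilityResult and Component (the class and method names are made up). It checks that the run-time type survives a covariant assignment, and that an array created as object[] cannot be cast down even when every element is a string:

```csharp
using System;

public static class ArrayRuntimeTypeDemo
{
    // The variable is object[], but the array object is still a string[].
    public static bool CovariantViewKeepsRealType()
    {
        object[] view = new string[] { "x", "y" };
        return view.GetType() == typeof(string[]);
    }

    // An array created as object[] stays object[], whatever it contains.
    public static bool DowncastOfRealObjectArrayThrows()
    {
        object[] realObjects = { "x", "y" };
        try
        {
            string[] strings = (string[])realObjects;
            return strings == null;   // not reached: the cast above throws
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CovariantViewKeepsRealType());      // True
        Console.WriteLine(DowncastOfRealObjectArrayThrows()); // True
    }
}
```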
Hope this helps. Otherwise, my answer should give you some terms you can google to find explanations superior to mine.
After some searching I stumbled on this question: How do I create a C# array using Reflection and only type info?
A possible solution to my problem is:
A[] someArray = GetElementsInSomeWay();
System.Type type = field.FieldType.GetElementType();
Array filledArray = Array.CreateInstance(type, someArray.Length);
Array.Copy(someArray, filledArray, someArray.Length);
field.SetValue(this, filledArray);
I just tested, it works.
However, I'd still like to avoid copying the elements. In my case the arrays are pretty small (3-5 elements at most) but nevertheless would be nice to see a cleaner solution if there is one.
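The copy-based approach above could be wrapped in a small helper. This is only a sketch: Retype is a hypothetical name, and string/object stand in for the B and A types from the question. Array.Copy checks each element during the downcast and throws InvalidCastException if one does not fit the target element type.

```csharp
using System;

public static class ArrayRetypeDemo
{
    // Hypothetical helper: copies 'source' into a new array whose element type
    // is only known at run time.
    public static Array Retype(Array source, Type elementType)
    {
        Array result = Array.CreateInstance(elementType, source.Length);
        Array.Copy(source, result, source.Length);
        return result;
    }

    public static void Main()
    {
        object[] loose = { "a", "b", "c" };          // real type object[], elements are strings
        Array retyped = Retype(loose, typeof(string));

        Console.WriteLine(retyped.GetType().Name);   // String[]
        Console.WriteLine(retyped.Length);           // 3
    }
}
```

The result can then be passed to field.SetValue, since its run-time type matches the field's array type. The elements themselves are not cloned; only the references are copied into the new array.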
