Structs, Interfaces and Boxing [duplicate] - c#

Possible Duplicate:
Is it safe for structs to implement interfaces?
Take this code:
interface ISomeInterface
{
    int SomeProperty { get; }
}
struct SomeStruct : ISomeInterface
{
    int someValue;
    public int SomeProperty { get { return someValue; } }
    public SomeStruct(int value)
    {
        someValue = value;
    }
}
and then I do this somewhere:
ISomeInterface someVariable = new SomeStruct(2);
is the SomeStruct boxed in this case?

Jon's point is true, but as a side note there is one slight exception to the rule; generics. If you have where T : ISomeInterface, then this is constrained, and uses a special opcode. This means the interface can be used without boxing. For example:
public static void Foo<T>(T obj) where T : ISomeInterface {
obj.Bar(); // Bar defined on ISomeInterface
}
This does not involve boxing, even for value-type T. However, if (in the same Foo) you do:
ISomeInterface asInterface = obj;
asInterface.Bar();
then that boxes as before. The constrained call only applies when the member is invoked directly on T.
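The difference can be observed with a mutable struct. The sketch below is only an illustration: ICounter, Counter and both helper methods are hypothetical names that do not appear in the question. A call made directly on T changes the caller's variable, while a call made through an interface-typed local changes only a boxed copy.

```csharp
using System;

// Hypothetical types for illustration; not from the original question.
interface ICounter
{
    void Increment();
    int Count { get; }
}

struct Counter : ICounter
{
    private int count;
    public int Count { get { return count; } }
    public void Increment() { count++; }
}

static class ConstrainedCallDemo
{
    // Invoked directly on T: no box, so the caller's struct is mutated in place.
    public static void IncrementGeneric<T>(ref T counter) where T : ICounter
    {
        counter.Increment();
    }

    // Converting T to the interface boxes a copy; the caller's struct is untouched.
    public static void IncrementViaInterface<T>(ref T counter) where T : ICounter
    {
        ICounter asInterface = counter;
        asInterface.Increment();
    }

    static void Main()
    {
        var c = new Counter();
        IncrementGeneric(ref c);
        Console.WriteLine(c.Count); // 1
        IncrementViaInterface(ref c);
        Console.WriteLine(c.Count); // still 1: only the boxed copy changed
    }
}
```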

Yes, it is. Basically whenever you need a reference and you've only got a value type value, the value is boxed.
Here, ISomeInterface is an interface, which is a reference type. Therefore the value of someVariable is always a reference, so the newly created struct value has to be boxed.
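One way to see the box, sketched here with the types from the question (the BoxingDemo class is an assumed wrapper added for the example): every struct-to-interface assignment allocates its own object, whereas an interface-to-interface assignment only copies the reference.

```csharp
using System;

interface ISomeInterface
{
    int SomeProperty { get; }
}

struct SomeStruct : ISomeInterface
{
    private readonly int someValue;
    public int SomeProperty { get { return someValue; } }
    public SomeStruct(int value) { someValue = value; }
}

static class BoxingDemo
{
    static void Main()
    {
        var s = new SomeStruct(2);
        ISomeInterface a = s; // boxes a copy of s
        ISomeInterface b = s; // boxes a second, separate copy
        ISomeInterface c = a; // copies the reference only

        Console.WriteLine(object.ReferenceEquals(a, b)); // False
        Console.WriteLine(object.ReferenceEquals(a, c)); // True
        Console.WriteLine(a.GetType().IsValueType);      // True: the box still reports SomeStruct
    }
}
```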

I'm adding this to hopefully shed a little more light on the answers offered by Jon and Marc.
Consider this non-generic method:
public static void SetToNull(ref ISomeInterface obj) {
obj = null;
}
Hmm... setting a ref parameter to null. That's only possible for a reference type, correct? (Well, or for a Nullable<T>; but let's ignore that case to keep things simple.) So the fact that this method compiles tells us that a variable declared to be of some interface type must be treated as a reference type.
The key phrase here is "declared as": consider this attempt to call the above method:
var x = new SomeStruct();
// This line does not compile:
// "Cannot convert from ref SomeStruct to ref ISomeInterface" --
// since x is declared to be of type SomeStruct, it cannot be passed
// to a method that wants a parameter of type ref ISomeInterface.
SetToNull(ref x);
Granted, the reason you can't pass x in the above code to SetToNull is that x would need to be declared as an ISomeInterface for you to be able to pass ref x -- and not because the compiler magically knows that SetToNull includes the line obj = null. But in a way that just reinforces my point: the obj = null line is legal precisely because it would be illegal to pass a variable not declared as an ISomeInterface to the method.
In other words, if a variable is declared as an ISomeInterface, it can be set to null, pure and simple. And that's because interfaces are reference types -- hence, declaring an object as an interface and assigning it to a value type object boxes that value.
Now, on the other hand, consider this hypothetical generic method:
// This method does not compile:
// "Cannot convert null to type parameter 'T' because it could be
// a non-nullable value type. Consider using 'default(T)' instead." --
// since this method could take a variable declared as, e.g., a SomeStruct,
// the compiler cannot assume a null assignment is legal.
public static void SetToNull<T>(ref T obj) where T : ISomeInterface {
obj = null;
}
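Following the compiler's own suggestion, a version that does compile might look like the sketch below. SetToDefault and ResetDemo are names assumed for the example. What "default" means depends on T: a zeroed struct when T is SomeStruct, null when T is the interface type.

```csharp
using System;

interface ISomeInterface
{
    int SomeProperty { get; }
}

struct SomeStruct : ISomeInterface
{
    private readonly int someValue;
    public int SomeProperty { get { return someValue; } }
    public SomeStruct(int value) { someValue = value; }
}

static class ResetDemo
{
    // default(T) is legal for any T, value type or reference type.
    public static void SetToDefault<T>(ref T obj) where T : ISomeInterface
    {
        obj = default(T);
    }

    static void Main()
    {
        var s = new SomeStruct(5);
        SetToDefault(ref s);               // T is SomeStruct
        Console.WriteLine(s.SomeProperty); // 0

        ISomeInterface i = new SomeStruct(5);
        SetToDefault(ref i);               // T is ISomeInterface
        Console.WriteLine(i == null);      // True
    }
}
```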

The MSDN documentation tells us that structs are value, not reference types. They are boxed when converting to/from a variable of type object. But the central question here is: what about a variable of an interface type? Since the interface can also be implemented by a class, then this must be tantamount to converting from a value to a reference type, as Jon Skeet already said, therefore yes boxing would occur. More discussion on an msdn blog.

Related

Explicit cast explanation in terms of memory for reference type in C#

In MSDN, "For reference types, an explicit cast is required if you need to convert from a base type to a derived type".
In wiki, "In programming language theory, a reference type is a data type that refers to an object in memory. A pointer type on the other hand refers to a memory address. Reference types can be thought of as pointers that are implicitly dereferenced." which is the case in C.
How to explain the memory storing procedure when considering explicit casting for reference type in C#?
For most cases, there's really not much conceivable difference between a reference variable and a pointer variable. Both point to a location in memory. The type of the reference (or pointer) variable tells the compiler which operations can be performed using it.
Instead of C pointers, which are (primarily) used with basic types (such as int or byte), consider C++ object pointers first. It's really almost the same as in C#:
MyBaseClass* a = new MyBaseClass();
a->BaseMethod(); // Call method using -> operator (dereference and call)
MyBaseClass* b = new MyDerivedClass();
b->DerivedMethod(); // Error: MyBaseClass has no such method
// Proper C++-Style casting.
MyDerivedClass* c = dynamic_cast<MyDerivedClass*>(b);
// Shortcut to the above, does not do the type test.
// MyDerivedClass* c = (MyDerivedClass*)b;
c->DerivedMethod(); // Ok
This translates almost 1:1 to C#, so reference types are (from a programmer's point of view) just pointers with a defined type. The only visible difference is that a direct C-style cast in C# always performs the runtime type check, comparable to a dynamic_cast on a reference in C++, which throws on failure rather than returning a null pointer. This ensures that you can never assign a wrong target instance to a reference variable.
So the differences between a reference type and a pointer to an object are (most of these are implied by the fact that C# is a managed language):
A reference variable can never point to invalid memory (except to NULL).
A reference variable can never point to an object that's not of its type.
When assigning a value to a reference variable, the type is always tested.
A cast on a reference variable needs to check that the target object is of the given type.
The reference objects are stored on a heap, where they can be referenced from the code. The object, as it is on the heap, is of a given type.
From the code, you can create references to it, and those references can be cast to some other types.
Now, there are a couple of cases, which are described in the referenced article. I will use the examples from there to make it easier.
1. Implicit conversions
An implicit conversion takes place when you don't ask for it specifically in code. The compiler has to know by itself how to do this.
1.1. Value Types
If every value of the source type fits in the storage of the type you want to convert to, the compiler will let you convert without a cast. This applies mostly to numeric types, so following the examples from your referenced article:
// Implicit conversion. A long can
// hold any value an int can hold, and more!
int num = 2147483647;
long bigNum = num;
So since int is 'smaller' than long, the compiler will let you do this.
1.2. Reference Types
Assuming you have the following class definitions:
class Base {
}
class Derived : Base {
public int IntProperty { get; set; }
public int CalculateSomething ()
{
return IntProperty * 23;
}
}
Then you can safely do conversions like:
Derived d = new Derived();
Base b = d;
This is because object d, which you have created on the heap, is of type Derived, and since it's a derived type from type Base, it is guaranteed to have all members that Base has. So it's safe to convert the reference and use Derived object as Base object. Because Derived IS Base (Derived : Base).
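In memory terms, a reference conversion never touches the object itself. The sketch below uses the Base and Derived classes from above (ReferenceCastDemo is an assumed wrapper name) to show that the upcast yields the same object, and that the reverse direction needs a runtime check that can fail.

```csharp
using System;

class Base { }

class Derived : Base
{
    public int IntProperty { get; set; }
}

static class ReferenceCastDemo
{
    static void Main()
    {
        Derived d = new Derived { IntProperty = 2 };
        Base b = d; // implicit upcast: only the reference is copied
        Console.WriteLine(object.ReferenceEquals(b, d)); // True

        Derived again = (Derived)b; // explicit downcast: checked at run time
        Console.WriteLine(again.IntProperty); // 2

        Base plain = new Base();
        Derived viaAs = plain as Derived; // failed check yields null
        Console.WriteLine(viaAs == null); // True

        try
        {
            Derived bad = (Derived)plain; // failed check throws
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException");
        }
    }
}
```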
2. Explicit conversions
Let's assume we have another class in our project:
class DerivedLike
{
public int IntProp { get; set; }
public int CalculateSomethingElse()
{
return IntProp * 23;
}
}
If we write
DerivedLike dl = new DerivedLike();
Derived d = dl;
the compiler will report that it cannot implicitly convert type DerivedLike to Derived.
This is because the two reference types are totally unrelated, so the compiler cannot allow the assignment. Those types have different properties and methods.
2.1. Implementing explicit conversion
You cannot define a conversion operator between a derived class and its base class yourself, but in most other cases you can write one.
To convert from DerivedLike to Derived, we must implement a conversion operator in the DerivedLike class. It is a static operator that tells the compiler how to convert one type to another. The conversion operator may be either implicit or explicit. An explicit operator requires the developer to cast explicitly, by providing the type name in parentheses.
The recommendation for choosing between implicit and explicit operators is that if conversion may throw exceptions, it should be explicit, so that conversion is done consciously by the developer.
Let's change our code to meet that requirement:
class DerivedLike
{
public static explicit operator Derived(DerivedLike a)
{
return new Derived() { IntProperty = a.IntProp};
}
public int IntProp { get; set; }
public int CalculateSomethingElse()
{
return IntProp * 23;
}
}
So this will compile fine now:
DerivedLike dl = new DerivedLike();
Derived d = (Derived)dl;
Going back to the memory topic, please note that with such a conversion you will now have two objects on the heap.
One created here:
DerivedLike dl = new DerivedLike();
Second one created here:
Derived d = (Derived)dl;
The object on the heap cannot change its type.
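A short sketch of the two-object claim, assuming the DerivedLike operator shown above (Derived is trimmed to one property and ConversionDemo is a name added for the example): the converted object is independent of the original, and the original keeps its runtime type.

```csharp
using System;

class Derived
{
    public int IntProperty { get; set; }
}

class DerivedLike
{
    public int IntProp { get; set; }

    // User-defined conversion: builds a brand new Derived instance.
    public static explicit operator Derived(DerivedLike a)
    {
        return new Derived { IntProperty = a.IntProp };
    }
}

static class ConversionDemo
{
    static void Main()
    {
        var dl = new DerivedLike { IntProp = 4 };
        Derived d = (Derived)dl;

        Console.WriteLine(object.ReferenceEquals(dl, d)); // False: two heap objects
        dl.IntProp = 99;
        Console.WriteLine(d.IntProperty);     // 4: the copy is independent
        Console.WriteLine(dl.GetType().Name); // DerivedLike: the original is unchanged
    }
}
```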
Hope this clarifies.

I can only cast a contravariant delegate with "as"

I'm trying to cast a contravariant delegate but for some reason I can only do it using the "as" operator.
interface MyInterface { }
delegate void MyFuncType<in InType>(InType input);
class MyClass<T> where T : MyInterface
{
public void callDelegate(MyFuncType<MyInterface> func)
{
MyFuncType<T> castFunc1 = (MyFuncType <T>) func; //Error
MyFuncType<T> castFunc2 = func as MyFuncType<T>;
MyFuncType<T> castFunc3 = func is MyFuncType<T> ? (MyFuncType<T>)func : (MyFuncType<T>)null; //Error
}
}
castFunc2 works fine but castFunc1 and castFunc3 cause the error:
Cannot convert type 'delegateCovariance.MyFuncType<myNamespace.MyInterface>' to 'myNamespace.MyFuncType<T>'
The MSDN article on the as operator states that castFunc2 and castFunc3 are "equivalent" so I don't understand how only one of them could cause an error. Another piece of this that is confusing me is that changing MyInterface from an interface to a class gets rid of the error.
Can anyone help me understand what is going on here?
Thanks!
Add a constraint such that T must be a class.
class MyClass<T> where T: class, MyInterface
This gives the compiler enough information to know that T is convertible. You don't need the explicit cast either.
Variance only applies to reference types. Without the constraint, T is allowed to be a value type, which breaks the compiler's ability to prove that T is compatible for contravariance.
The reason the second statement works is because as actually can perform a null conversion. For example:
class SomeClass { }
interface SomeInterface { }
static void Main(string[] args)
{
SomeClass foo = null;
SomeInterface bar = foo as SomeInterface;
}
foo is obviously not directly convertible to SomeInterface, but the conversion still succeeds because a null conversion can still take place. Your MSDN reference may be correct for most scenarios, but the generated IL code is very different, which means the two forms are fundamentally different from a technical perspective.
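Put together, the suggested fix could look like the sketch below. Impl, Narrow and VarianceDemo are hypothetical names added for the example. With the class constraint in place the contravariant conversion is implicit, and because it is a reference conversion, both variables refer to the same delegate object.

```csharp
using System;

interface MyInterface { }

// Hypothetical implementation, used only to close the generic type.
class Impl : MyInterface { }

delegate void MyFuncType<in InType>(InType input);

class MyClass<T> where T : class, MyInterface
{
    public MyFuncType<T> Narrow(MyFuncType<MyInterface> func)
    {
        return func; // implicit contravariant conversion, no cast needed
    }
}

static class VarianceDemo
{
    static void Main()
    {
        int calls = 0;
        MyFuncType<MyInterface> general = x => calls++;
        MyFuncType<Impl> specific = new MyClass<Impl>().Narrow(general);

        specific(new Impl());
        Console.WriteLine(calls); // 1
        Console.WriteLine(object.ReferenceEquals(general, specific)); // True
    }
}
```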
Eric Lippert gave a great explanation of this issue in his recent posts: An "is" operator puzzle, part one, An "is" operator puzzle, part two.
The main rationale behind this behavior is the following: the "is" and "as" operators are not the same as a cast.
The "as" operator can produce a non-null result even if the corresponding cast would be illegal, and this is especially true when we're dealing with type arguments.
Basically, the cast operator in your case means (as Eric said) either "I know that this value is of the given type, even though the compiler does not know that, the compiler should allow it" or "I know that this value is not of the given type; generate special-purpose, type-specific code to convert a value of one type to a value of a different type."
The latter case deals with value conversions like double-to-integer conversion, and we can ignore this meaning in the current context.
And generic type arguments do not fit the first meaning either. If you're dealing with a generic type argument, then why not state this "contract" clearly by using the generic type argument?
I'm not 100% sure what you want to achieve, but you can omit the special type from your method and freely use the generic argument instead:
class MyClass<T> where T : MyInterface
{
public void callDelegate(Action<T> func)
{
}
}
class MyClass2
{
public void callDelegate<T>(Action<T> func)
where T : MyInterface
{
}
}
Otherwise you should use the as operator with a check for null instead of a type check.
Your class says that T implements MyInterface, because MyInterface is not an instance type. Therefore, MyFuncType<T> is not guaranteed to be MyFuncType<MyInterface>. It could be MyFuncType<SomeType> where SomeType : MyInterface, but that wouldn't be the same as SomeOtherType : MyInterface. Make sense?

Is "where T : class" not enforced in any way at compile time or run time?

In the following code, I pass a struct into a constructor that is expecting a class. Why does this compile and run without error (and produce the desired output)?
class Program
{
static void Main()
{
var entity = new Foo { Id = 3 };
var t = new Test<IEntity>(entity); // why doesn't this fail?
Console.WriteLine(t.Entity.Id.ToString());
Console.ReadKey();
}
}
public class Test<TEntity> where TEntity : class
{
public TEntity Entity { get; set; }
public Test(TEntity entity)
{
Entity = entity;
}
public void ClearEntity()
{
Entity = null;
}
}
public struct Foo : IEntity
{
public int Id { get; set; }
}
public interface IEntity
{
int Id { get; set; }
}
If I change my Main() method so that it includes a call to ClearEntity(), as shown below, it still generates no error. Why?
static void Main()
{
var entity = new Foo { Id = 3 };
var t = new Test<IEntity>(entity);
Console.WriteLine(t.Entity.Id.ToString());
t.ClearEntity(); // why doesn't this fail?
Console.ReadKey();
}
where TEntity : class forces TEntity to be a reference type, but an interface such as IEntity is a reference type.
See here:
http://msdn.microsoft.com/en-us/library/d5x73970(v=vs.80).aspx
where T : class | The type argument must be a reference type, including any class, interface, delegate, or array type
Regarding your second question, you might think t.ClearEntity() would fail because it's assigning null to a variable whose type is a value type, but that's not the case. The compile-time type of Entity is the reference type IEntity, and the runtime type (after assignment) is the null type. So you never have a variable of type Foo but value null.
from the C# documentation:
where T : class
The type argument must be a reference type, including any class, interface, delegate, or array type. (See note below.)
Because you're passing the struct via an interface, it's still considered a reference type.
Within the .net runtime, every non-nullable value type has an associated reference type (often referred to as a "boxed value type") which derives from System.ValueType. Saying Object Foo = 5; won't actually store an Int32 into Foo; instead it will create a new instance of the reference type associated with Int32 and store a reference to that instance. A class constraint on a generic type specifies that the type in question must be some sort of a reference type, but does not by itself exclude the possibility that the type may be used to pass a reference to a boxed value-type instance. In most contexts outside generic type constraints, interface types are regarded as class types.
It's important to note that not only are boxed value types stored like reference types; they behave like reference types. For example, List<string>.Enumerator is a value type which implements IEnumerator<string>. If one has two variables of type List<string>.Enumerator, copying one to the other will copy the state of the enumeration, such that there will be two separate and independent enumerators which point to the same list. Copying one of those variables to a variable of type IEnumerator<string> will create a new instance of the boxed value type associated with List<string>.Enumerator and store in the latter variable a reference to that new object (which will be a third independent enumerator). Copying that variable to another of type IEnumerator<string>, however, will simply store a reference to the existing object (since IEnumerator<string> is a reference type).
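The enumerator scenario just described can be sketched as follows (the list contents and the EnumeratorCopyDemo name are assumptions made for the example):

```csharp
using System;
using System.Collections.Generic;

static class EnumeratorCopyDemo
{
    static void Main()
    {
        var list = new List<string> { "a", "b", "c" };

        List<string>.Enumerator e1 = list.GetEnumerator();
        e1.MoveNext();                       // e1 is positioned at "a"

        List<string>.Enumerator e2 = e1;     // value copy: independent state
        IEnumerator<string> boxed1 = e1;     // boxes a third independent copy
        IEnumerator<string> boxed2 = boxed1; // same boxed object as boxed1

        e1.MoveNext();     // advances only e1
        boxed1.MoveNext(); // advances the box, visible through boxed2 as well

        Console.WriteLine(e1.Current);     // b
        Console.WriteLine(e2.Current);     // a
        Console.WriteLine(boxed2.Current); // b
    }
}
```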
The C# language tries to pretend that value types derive from Object, but within the guts of the .net Runtime they really don't. Instead, they're convertible to types which derive from System.ValueType (which in turn derives from Object). The latter types will satisfy a type constraint, even though the former ones will not. Incidentally, despite its name, System.ValueType is actually a class type.
I, likewise, assumed that constraint keyword class meant the same class as the type declaration keyword class, but it doesn't.
As explained in the other answers, the term class here is over-loaded, which seems to me to be a horrible decision for the C# language design. Something like referencetype would have been more helpful.

Most efficient way to check if an object is a value type

WARNING: THIS CODE SUCKS, SEE ANTHONY'S COMMENTS
Which is faster?
1.
public bool IsValueType<T>(T obj){
return obj is ValueType;
}
2.
public bool IsValueType<T>(T obj){
return obj == null ? false : obj.GetType().IsValueType;
}
3.
public bool IsValueType<T>(T obj){
return default(T) != null;
}
4. Something else
You aren't really testing an object - you want to test the type. To call those, the caller must know the type, but... meh. Given a signature <T>(T obj) the only sane answer is:
public bool IsValueType<T>() {
return typeof(T).IsValueType;
}
or if we want to use an example object for type inference purposes:
public bool IsValueType<T>(T obj) {
return typeof(T).IsValueType;
}
This doesn't need boxing (calling GetType() on a value type boxes it), and doesn't have problems with Nullable<T>. A more interesting case is when you are passing object...
public bool IsValueType(object obj);
Here, we already have massive problems with null, since that could be an empty Nullable<T> (a struct) or a class. But a reasonable attempt would be:
public bool IsValueType(object obj) {
return obj != null && obj.GetType().IsValueType;
}
but note that it is incorrect (and unfixable) for empty Nullable<T>s. Here it becomes pointless to worry about boxing as we are already boxed.
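The Nullable<T> edge case is easy to reproduce. In the sketch below, IsValueTypeBoxed is simply the object overload renamed for clarity and ValueTypeCheck is an assumed wrapper class. The generic version answers from the static type, while the object version only ever sees null once an empty nullable has been boxed.

```csharp
using System;

static class ValueTypeCheck
{
    // Answers from the static type argument; never touches the instance.
    public static bool IsValueType<T>(T obj)
    {
        return typeof(T).IsValueType;
    }

    // Answers from the runtime type; cannot tell an empty Nullable<T> from null.
    public static bool IsValueTypeBoxed(object obj)
    {
        return obj != null && obj.GetType().IsValueType;
    }

    static void Main()
    {
        int? empty = null;
        Console.WriteLine(IsValueType(empty));      // True: T is int?
        Console.WriteLine(IsValueTypeBoxed(empty)); // False: boxes to a null reference
        Console.WriteLine(IsValueType("text"));     // False
        Console.WriteLine(IsValueTypeBoxed(42));    // True

        object boxedInt = 42;
        Console.WriteLine(IsValueType(boxedInt));   // False: T is object
    }
}
```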
My first answer would be to write a simple test and find out for yourself.
My second answer (without any testing on my part, of course) would be option 1. It is the simplest check. The second method involves two separate checks while the third involves creating a default instance of a type.
You should also consider readability. The framework already gives you the ability to have the following in your code:
if(someObj is ValueType)
{
// Do some work
}
Why even bother creating a method that would simply turn the above statement into (assuming you made your method static and allowed the compiler to infer the generic type):
if(IsValueType(someObj))
{
// Do some work
}
Defining a struct actually defines two types: a value type, and a class type which derives from System.ValueType. If a request is made to create a variable, parameter, field, or array (collectively, 'storage location') of a type which derives from System.ValueType, the system will instead create a storage location which will store the object's fields rather than storing a reference to an object in which those fields appear. On the other hand, if a request is made to create an instance of a type deriving from System.ValueType, the system will create an object instance of a class which derives from System.ValueType.
This may be demonstrated by creating a struct which implements IValue:
interface IValue { int value { get; set; } }
struct ValueStruct : IValue
{
    public int value { get; set; }
}
with generic test routine and code to wrap it:
static void Test<T>(T it) where T:IValue
{
T duplicate = it;
it.value += 1;
duplicate.value += 10;
Console.WriteLine(it.value.ToString());
}
static void Test()
{
ValueStruct v1 = new ValueStruct();
v1.value = 9;
IValue v2 = v1;
Test<ValueStruct>(v1);
Test<ValueStruct>(v1);
Test<IValue>(v1);
Test<IValue>(v1);
Test<IValue>(v2);
Test<IValue>(v2);
}
Note that in every case, calling GetType on the parameter passed to Test would yield ValueStruct, which will report itself as a value type. Nonetheless, the passed-in item will only be a "real" value type on the first two calls. On the third and fourth calls, it will really be a class type, as demonstrated by the fact that a change to duplicate will affect it. And on the fifth and sixth calls, the change will be propagated back to v2, so the second call will "see" it.
static class Metadata<T>
{
static public readonly Type Type = typeof(T);
static public readonly bool IsValueType = Metadata<T>.Type.IsValueType;
}
//fast test if T is ValueType
if(Metadata<T>.IsValueType) //only read static readonly field!
{
//...
}
There are two rules:
1. All classes are reference types, such as Object and String; this holds for the .NET Framework's own classes.
2. All structures are value types, such as bool and char, even if they contain reference-type members; this holds for the .NET Framework's own structures.
Simply right-click any type and choose Go To Definition: if it is a class, it is a reference type; if it is a struct, it is a value type :)
You can use
obj.GetType().IsValueType
This uses reflection, but it is clear and saves you from worrying about boxing and unboxing.

Generic unboxing of boxed value types

I have a generic function that is constrained to struct. My inputs are boxed ("objects"). Is it possible to unbox the value at runtime to avoid having to check for each possible type and do the casts manually?
See the example below:
public struct MyStruct
{
public int Value;
}
public void Foo<T>(T test)
where T : struct
{
// do stuff
}
public void TestFunc()
{
object o = new MyStruct() { Value = 100 }; // o is always a value type
Foo(o);
}
In the example, I know that o must be a struct (however, it does not need to be MyStruct ...). Is there a way to call Foo without tons of boilerplate code to check for every possible struct type?
Thank you.
.NET generics are implemented in a manner that allows value types as a generic type parameter without incurring any boxing/unboxing overhead. Because you're casting to object before calling Foo, you don't take advantage of that; in fact, you're not even taking advantage of generics at all.
The whole point of using generics in the first place is to replace the "object-idiom". I think you're missing the concept here.
Whatever type T happens to be, it is available at run-time and because you constrained it to struct guaranteed to be a struct type.
Your TestFunc could be written like this without problem:
public void TestFunc()
{
MyStruct o = new MyStruct() { Value = 100 }; // o is always a value type
Foo<MyStruct>(o);
}
Looking at Foo, it would look like this in your example:
public void Foo<T>(T test)
where T : struct
{
T copy = test; // T == MyStruct
}
EDIT:
OK, the OP has clarified that he wants to call the generic method but doesn't know the type of his struct (it's just object). The easiest way to call your generic method with the correct type parameter is to use a little reflection.
public void TestFunc()
{
object o = new DateTime();
MethodInfo method = this.GetType().GetMethod("Foo");
MethodInfo generic = method.MakeGenericMethod(o.GetType());
generic.Invoke(this, new object[] {o});
}
public void Foo<T>(T test)
where T : struct
{
T copy = test; // T == DateTime
}
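A self-contained variant of the same reflection idea, using a static method so the result is easy to inspect. Describe, DescribeBoxed and GenericDispatch are hypothetical names, not part of the question. The runtime type of the boxed value is used to close the generic method.

```csharp
using System;
using System.Reflection;

static class GenericDispatch
{
    // Stand-in for the struct-constrained generic method; reports which T it was closed over.
    public static string Describe<T>(T value) where T : struct
    {
        return typeof(T).Name;
    }

    // Closes Describe<T> over the runtime type of the boxed value and invokes it.
    public static string DescribeBoxed(object boxed)
    {
        MethodInfo open = typeof(GenericDispatch).GetMethod("Describe");
        MethodInfo closed = open.MakeGenericMethod(boxed.GetType());
        return (string)closed.Invoke(null, new object[] { boxed });
    }

    static void Main()
    {
        Console.WriteLine(DescribeBoxed(100));            // Int32
        Console.WriteLine(DescribeBoxed(new DateTime())); // DateTime
    }
}
```

Reflection calls like this are much slower than a direct generic call, so if the set of struct types is small it may be worth caching the closed MethodInfo per type.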
No; you're using object, which is (by definition) not a struct/value type. Why are you intentionally boxing the value in this way?
The whole point of using generics is to avoid situations like this.
When you actually "close" the generic with a type of struct, you eliminate the need for runtime type checking: ie.
Foo<MyStruct>(MyStruct test);
Your implementation of Foo, can safely assume that it's dealing with a struct.
(Marked as CW because you can't pass an instance of ValueType to a generic requiring a struct, but it might be helpful for others who come across this question).
Instead of declaring o as an object, you can use a type of System.ValueType, which can only be assigned struct values; you cannot store an object in a ValueType.
However, I'm honestly not sure if that does anything in terms of (un)boxing. Note that ECMA-334 11.1.1 says:
System.ValueType is not itself a value-type. Rather, it is a class-type from which all value-types are automatically derived.
I don't know exactly what you are trying to achieve, but you could pass a delegate/lambda to unbox the value and select some value in the struct you are interested in:
(Updated this code snippet after slurmomatic's comment)
public void Foo<TValue>(object test, Func<object, TValue> ValueSelector)
where TValue : struct
{
TValue value = ValueSelector(test);
// do stuff with 'value'
}
public void TestFunc()
{
object o = new MyStruct() { Value = 100 };
// Do the unboxing in the lambda.
// Additionally you can also select some
// value, if you need to, like in this example
Foo(o, x => ((MyStruct)x).Value);
}
Update:
Then do this:
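public static void Foo<TUnboxed>(object test)
    where TUnboxed : struct
{
    // Declared outside the try block so it is still in scope afterwards.
    TUnboxed unboxed;
    try
    {
        unboxed = (TUnboxed)test;
    }
    catch (InvalidCastException)
    {
        // handle the exception or re-throw it...
        throw; // plain 'throw' keeps the original stack trace
    }
    // do stuff with 'unboxed'
}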
public void TestFunc()
{
// box an int
object o = 100;
// Now call foo, letting it unbox the int.
// Note that the generic type cannot be inferred
// but has to be explicitly given, and has to match the
// boxed type, or throws an `InvalidCastException`
Foo<int>(o);
}
