Do all C# casts result in boxing/unboxing - c#

I am curious to know if all casts in C# result in boxing, and if not, are all casts a costly operation?
Example taken from Boxing and Unboxing (C# Programming Guide)
int i = 123;
// The following line boxes i.
object o = i;
This line obviously causes boxing (wrapping up the int type as an object).
This is an operation that is considered costly, since it creates garbage that will be collected.
What about casts between two different reference types? What is the cost of that? Can it be properly measured (compared to the previous example)?
For example:
public class A
{
}
public class B : A
{
}
var obj = new B();
var obj2 = (A)obj; // is this an "expensive" operation? this is not boxing

I am curious to know if all conversions in C# result in boxing.
No. Only boxing conversions result in boxing, hence the name "boxing conversions". Boxing conversions are all built-in conversions from value types to reference types -- either to a class that the value type inherits from, or to an interface that it implements. (Or to an interface compatible with an interface it implements, via a covariant or contravariant reference conversion.)
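To make the distinction concrete, here is a minimal sketch (the class and method names are made up for illustration). It assumes only that int implements IComparable, and contrasts two boxing conversions with a plain reference conversion:

```csharp
using System;

public static class BoxingDemo
{
    // Returns true when each boxing conversion produced its own heap copy.
    public static bool BoxesAreSeparateCopies()
    {
        int i = 123;
        object o = i;        // boxing: int -> object
        IComparable c = i;   // boxing: int -> an interface that int implements
        i = 456;             // changing i does not affect the boxed copies
        return (int)o == 123 && (int)c == 123 && !ReferenceEquals(o, c);
    }

    // Returns true because a reference conversion keeps the same object.
    public static bool ReferenceConversionKeepsIdentity()
    {
        string s = "hello";
        object o = s;        // implicit reference conversion, no boxing
        return ReferenceEquals(s, o);
    }

    public static void Main()
    {
        Console.WriteLine(BoxesAreSeparateCopies());           // True
        Console.WriteLine(ReferenceConversionKeepsIdentity()); // True
    }
}
```

Each boxing conversion allocates a new object holding a copy of the value, whereas the reference conversion only changes the static type through which the same object is viewed.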
are all conversions a costly operation?
No. Identity conversions are zero cost because the compiler can elide them entirely.
What are the costs of implicit and explicit reference conversions?
Implicit reference conversions are zero cost. The compiler can elide them entirely. That is, converting from Giraffe to its base type Animal, or Giraffe to its implemented interface type IAmATallMammal, are free.
Explicit reference conversions involve a runtime check to verify that the reference does in fact refer to an object of the desired type.
Whether that runtime check is "costly" or not depends on your budget.
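As a rough illustration of that runtime check, the following sketch uses hypothetical Animal, Giraffe and Turtle classes. The upcast needs no check, while the downcast is verified at run time and fails with InvalidCastException when the object is of the wrong type:

```csharp
using System;

public class Animal { }
public class Giraffe : Animal { }
public class Turtle : Animal { }

public static class DowncastDemo
{
    // Returns true if the explicit reference conversion fails at run time.
    public static bool CastThrows(Animal a)
    {
        try
        {
            Giraffe g = (Giraffe)a;   // the runtime type check happens here
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Animal a1 = new Giraffe();   // implicit reference conversion, no check needed
        Animal a2 = new Turtle();
        Console.WriteLine(CastThrows(a1));           // False
        Console.WriteLine(CastThrows(a2));           // True
        Console.WriteLine((a2 as Giraffe) == null);  // True: "as" yields null instead of throwing
    }
}
```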
can that cost be properly measured?
Sure. Decide what resource is relevant to you -- time, say -- and then carefully measure your consumption of time with a stopwatch.
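A crude way to do that, assuming time is the resource you care about, is System.Diagnostics.Stopwatch. The sketch below is only a starting point: the class names and iteration count are arbitrary, and the JIT compiler is free to optimize such simple loops, so treat the numbers as indicative at best and prefer a dedicated benchmarking harness for serious work.

```csharp
using System;
using System.Diagnostics;

public static class CastTiming
{
    public class A { }
    public class B : A { }

    // Times a loop that boxes an int on every iteration.
    public static long TimeBoxing(int iterations)
    {
        object sink = null;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sink = i;            // boxing conversion: allocates
        }
        watch.Stop();
        GC.KeepAlive(sink);      // keep the last result observable
        return watch.ElapsedTicks;
    }

    // Times a loop that performs an implicit reference conversion.
    public static long TimeUpcast(int iterations)
    {
        B obj = new B();
        A sink = null;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            sink = obj;          // derived-to-base conversion: no allocation
        }
        watch.Stop();
        GC.KeepAlive(sink);
        return watch.ElapsedTicks;
    }

    public static void Main()
    {
        const int iterations = 10000000;
        TimeBoxing(1000);        // warm-up so JIT compilation is not measured
        TimeUpcast(1000);
        Console.WriteLine("boxing ticks: " + TimeBoxing(iterations));
        Console.WriteLine("upcast ticks: " + TimeUpcast(iterations));
    }
}
```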
A question you did not ask but probably should have:
What are the most expensive conversions?
User-defined conversions are nothing more than syntactic sugar for a method call; that method can take arbitrarily long, like any method.
Dynamic conversions start the compiler again at runtime; the compiler may take arbitrarily long to perform a type analysis, depending on how hard an analysis problem you choose to throw at it.

No.
Boxing means putting a value into a new reference type instance.
Standard casts between reference types do not result in any allocations.
(User-defined casts can do anything)

I am curious to know if all casts in C# result in boxing,
No. Boxing is a very special operation that means treating an instance of a value type as an instance of a reference type. For conversions from one reference type to another, the concept plays no role.
are all casts a costly operation?
Short answer: No.
Long answer: Define costly. Still no, though.
What about casts from 2 different types of reference types? what is the cost of that?
Well, what if it's just a derived to base reference conversion? That's BLAZINGLY fast because nothing happens.
Other, user-defined conversions, could be "slow", they could be "fast."
This one is "slow."
class A { public int Foo { get; set; } }
class B {
public int Foo { get; set; }
static Random rg = new Random();
// Conversion operators must be declared public and static.
public static explicit operator A(B b) {
Thread.Sleep(rg.Next());
return new A { Foo = b.Foo };
}
}
This one is "fast."
class A { public int Foo { get; set; } }
class B {
public int Foo { get; set; }
static Random rg = new Random();
// Conversion operators must be declared public and static.
public static explicit operator A(B b) {
return new A { Foo = b.Foo };
}
}
var obj2 = (A)obj; // is this an "expensive" operation? this is not boxing
No, it's "cheap."


Explicit cast explanation in terms of memory for reference type in C#

In MSDN, "For reference types, an explicit cast is required if you need to convert from a base type to a derived type".
In wiki, "In programming language theory, a reference type is a data type that refers to an object in memory. A pointer type on the other hand refers to a memory address. Reference types can be thought of as pointers that are implicitly dereferenced." which is the case in C.
How can explicit casting of reference types in C# be explained in terms of memory?
For most cases, there's really not much conceivable difference between a reference variable and a pointer variable. Both point to a location in memory. The type of the reference (or pointer) variable tells the compiler which operations can be performed using it.
Instead of C pointers, which are (primarily) used with basic types (such as int or byte), consider C++ object pointers first. It's really almost the same as in C#:
MyBaseClass* a = new MyBaseClass();
a->BaseMethod(); // Call method using -> operator (dereference and call)
MyBaseClass* b = new MyDerivedClass();
b->DerivedMethod(); // Error: MyBaseClass has no such method
// Proper C++-Style casting.
MyDerivedClass* c = dynamic_cast<MyDerivedClass*>(b);
// Shortcut to the above, does not do the type test.
// MyDerivedClass* c = (MyDerivedClass*)b;
c->DerivedMethod(); // Ok
This translates almost 1:1 to C#, so reference types are (from a programmer's point of view) just pointers with a defined type. The only visible difference is that a direct C-style cast in C# is always type-checked at run time, much like dynamic_cast in C++, except that a failed cast throws an InvalidCastException instead of yielding a null pointer. This ensures that you can never assign a wrong target instance to a reference variable.
So the differences between a reference type and a pointer to an object are (most of these are implied by the fact that C# is a managed language):
A reference variable can never point to invalid memory (except to NULL).
A reference variable can never point to an object that's not of its type.
When assigning a value to a reference variable, the type is always tested.
A cast on a reference variable needs to check that the target object is of the given type.
The reference objects are stored on a heap, where they can be referenced from the code. The object, as it is on the heap, is of a given type.
From the code, you can create references to it, and those references can be cast to some other types.
Now, there are couple of cases, which are described in the referenced article. I will use the examples from there to make it easier.
1. Implicit conversions
An implicit conversion takes place when you don't ask for it specifically in code. The compiler has to know by itself how to do this.
1.1. Value Types
If every value of the source type fits in the range of the target type, the compiler lets you convert without a cast. This applies mostly to numeric types, so following the examples from the article you referenced:
// Implicit conversion. A long can
// hold any value an int can hold, and more!
int num = 2147483647;
long bigNum = num;
So since int is 'smaller' than long, compiler will let you do this.
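The opposite direction, from long to int, is not guaranteed to fit, so it needs an explicit cast. A small sketch of both directions follows (the class and method names are invented for illustration). It uses checked so that an out-of-range value raises OverflowException rather than being silently truncated:

```csharp
using System;

public static class NumericConversionDemo
{
    // Returns true if narrowing the value to int overflows.
    public static bool NarrowingOverflows(long value)
    {
        try
        {
            int narrowed = checked((int)value);   // explicit numeric conversion
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    public static void Main()
    {
        int num = int.MaxValue;
        long bigNum = num;                                  // implicit: always fits
        Console.WriteLine(bigNum);                          // 2147483647
        Console.WriteLine(NarrowingOverflows(bigNum));      // False
        Console.WriteLine(NarrowingOverflows(bigNum + 1));  // True
    }
}
```

Neither direction involves boxing: both types are value types, and the conversion only changes how the number is represented.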
1.2. Reference Types
Assuming you have following classes definitions:
class Base {
}
class Derived : Base {
public int IntProperty { get; set; }
public int CalculateSomething ()
{
return IntProperty * 23;
}
}
Then you can safely do conversions like:
Derived d = new Derived();
Base b = d;
This is because object d, which you have created on the heap, is of type Derived, and since it's a derived type from type Base, it is guaranteed to have all members that Base has. So it's safe to convert the reference and use Derived object as Base object. Because Derived IS Base (Derived : Base).
2. Explicit conversions
Let's assume we have another class in our project:
class DerivedLike
{
public int IntProp { get; set; }
public int CalculateSomethingElse()
{
return IntProp * 23;
}
}
If we write
DerivedLike dl = new DerivedLike();
Derived d = dl;
the compiler reports that it cannot implicitly convert type DerivedLike to Derived.
This is because the two reference types are unrelated, so the compiler cannot allow the assignment. Those types have different properties and methods.
2.1. Implementing explicit conversion
You cannot define a conversion operator between a class and its own base class (that conversion is already built in), but in most other cases you can write one.
To convert from DerivedLike to Derived, we must implement a conversion operator in the DerivedLike class. It is a static operator that tells the compiler how to convert one type to another. The conversion operator may be either implicit or explicit. An explicit operator requires the developer to request the cast explicitly, by writing the type name in parentheses.
The recommendation for choosing between implicit and explicit operators is that if conversion may throw exceptions, it should be explicit, so that conversion is done consciously by the developer.
Let's change our code to meet that requirement:
class DerivedLike
{
public static explicit operator Derived(DerivedLike a)
{
return new Derived() { IntProperty = a.IntProp};
}
public int IntProp { get; set; }
public int CalculateSomethingElse()
{
return IntProp * 23;
}
}
So this will compile fine now:
DerivedLike dl = new DerivedLike();
Derived d = (Derived)dl;
Going back to memory topic, please note, that with such conversion, you will now have two objects on the heap.
One created here:
DerivedLike dl = new DerivedLike();
Second one created here:
Derived d = (Derived)dl;
The object on the heap cannot change its type.
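The difference between the two kinds of conversion can be observed with ReferenceEquals. This is a self-contained sketch that repeats trimmed-down versions of the classes above; the helper method names are made up for the example:

```csharp
using System;

public class Base { }

public class Derived : Base
{
    public int IntProperty { get; set; }
}

public class DerivedLike
{
    public int IntProp { get; set; }

    public static explicit operator Derived(DerivedLike a)
    {
        return new Derived { IntProperty = a.IntProp };
    }
}

public static class ConversionIdentityDemo
{
    // A reference conversion yields the very same object.
    public static bool UpcastKeepsObject()
    {
        Derived d = new Derived();
        Base b = d;
        return ReferenceEquals(b, d) && b.GetType() == typeof(Derived);
    }

    // A user-defined conversion runs the operator, which allocates a new object.
    public static bool UserConversionCreatesObject()
    {
        DerivedLike dl = new DerivedLike { IntProp = 7 };
        Derived converted = (Derived)dl;
        return !ReferenceEquals(dl, converted) && converted.IntProperty == 7;
    }

    public static void Main()
    {
        Console.WriteLine(UpcastKeepsObject());           // True
        Console.WriteLine(UserConversionCreatesObject()); // True
    }
}
```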
Hope this clarifies.

Why is this test expression an error?

I want to understand why the C# language designers decided to make this test expression an error.
interface IA { }
interface IB { }
class Foo : IA, IB { }
class Program
{
static void testFunction<T>(T obj) where T : IA, IB
{
IA obj2 = obj;
if (obj == obj2) //ERROR
{
}
}
static void Main(string[] args)
{
Foo myFoo = new Foo();
testFunction(myFoo);
Console.ReadLine();
}
}
In testFunction, I can make an object called obj2 and set it to obj implicitly without casting it. But why can't I then check the two objects to see if they are the same, without casting? They obviously implement the same interface, so why is it an error?
You can check to see if they're the same object by using Object.ReferenceEquals or Object.Equals.
However, since your constraints (IA and IB interfaces) don't enforce that the type is necessarily a reference type, there's no guarantee that the equality operator can be used.
Suppose you construct T with a value type X that implements IA.
What does
static void testFunction<T>(T obj) where T : IA
{
IA obj2 = obj;
if (obj == obj2) //ERROR
do when called as testFunction<X>(new X(whatever)) ?
T is X, X implements IA, so the implicit conversion boxes obj to obj2.
The equality operator is now comparing a value type X with a boxed copy of compile-time type IA. That the runtime type is a boxed X the compiler does not care about; that information is ignored.
What comparison semantics should it use?
It cannot use reference comparison semantics because that would mean that obj would also have to be boxed. It won't box to the same reference, so this would always be false, which seems bad.
It cannot use value comparison semantics because the compiler has no basis upon which to decide what kind of value semantics it should use! At compile time it does not know whether the type chosen for T in the future will provide an overloaded == operator or not, and even if it does, that operator is unlikely to take an IA as one of its operands.
There are no equality semantics that the compiler can reasonably choose, and therefore this is illegal.
Now, if you constrain T to be a reference type then the first objection goes away, and the compiler can reasonably choose reference equality. If that's your intention then constrain T to be a reference type.
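The "always false" behaviour of boxed reference comparison is easy to observe. The sketch below uses a hypothetical struct X and class Foo that both implement IA. It also shows EqualityComparer<T>.Default as one possible way to get value comparison in generic code; that is an alternative this answer does not itself propose.

```csharp
using System;
using System.Collections.Generic;

public interface IA { }
public struct X : IA { public int Value; }
public class Foo : IA { }

public static class GenericEqualityDemo
{
    // Converts obj to IA twice and compares the resulting references.
    public static bool SameReferenceAfterConversion<T>(T obj) where T : IA
    {
        IA first = obj;    // boxes when T is a value type
        IA second = obj;   // boxes again into a different object
        return ReferenceEquals(first, second);
    }

    // Value comparison that works for any T, with or without an == operator.
    public static bool ValueEquals<T>(T left, T right)
    {
        return EqualityComparer<T>.Default.Equals(left, right);
    }

    public static void Main()
    {
        X x = new X { Value = 1 };
        Console.WriteLine(SameReferenceAfterConversion(x));         // False
        Console.WriteLine(SameReferenceAfterConversion(new Foo())); // True
        Console.WriteLine(ValueEquals(x, new X { Value = 1 }));     // True
    }
}
```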
To expand a bit on Reed's answer (which is certainly correct):
Note that the following code results in the same error at compile time:
Guid g = Guid.NewGuid(); // a value type
object o = g;
if (o == g) // ERROR
{
}
The C# language specification says (§7.10.6):
The predefined reference type equality operators are:
bool operator ==(object x, object y);
bool operator !=(object x, object y);
[...]
The predefined reference type equality operators require one of the following:
Both operands are a value of a type known to be a reference-type or the literal null.
Furthermore, an explicit reference conversion (§6.2.4) exists from the type of either operand to the type of the other operand.
One operand is a value of type T where T is a type-parameter and the other operand is the literal null. Furthermore T does not have the value type constraint.
[...]
Unless one of these conditions are true, a binding-time error occurs. Notable implications of these rules are:
[...]
The predefined reference type equality operators do not permit value type operands to be compared. Therefore, unless a struct type declares its own equality operators, it is not possible to compare values of that struct type.
The predefined reference type equality operators never cause boxing operations to occur for their operands. It would be meaningless to perform such boxing operations, since references to the newly allocated boxed instances would necessarily differ from all other references.
Now, in your code example you do not constrain T to be a reference type and hence you get the compile-time error. Your sample can be fixed however by declaring that T must be a reference type:
static void testFunction<T>(T obj) where T : class, IA, IB
{
IA obj2 = obj;
if (obj == obj2) // compiles fine
{
}
}
Try
if (obj.Equals(obj2))
IA doesn't implement any == operator.
Ahh thanks Reed Copsey for that note.
I just also found out that you can put "class" in the where clause like this.
static void testFunction<T>(T obj) where T : class, IA, IB
{
IA obj2 = obj;
if (obj == obj2)
{
}
}
Now it's a reference type and it works! :-)

Passing arguments, does unboxing occur

From what I have read, arguments are passed by value by default. In my example, the first function test1 takes a reference type and unboxes it, which will decrease performance if I understand this correctly.
However, I have never read that you should do what test2 does to increase performance.
So what's the best practice?
public void Main() {
string test = "hello";
test1(test); // Does this line perform a boxing? So it's not good for performance?
test2(ref test); // Passing a reference as a reference
}
public string test1(string arg1) {
return arg1;
}
public string test2(ref string arg1) {
return arg1;
}
There's no boxing or unboxing involved at all here. string is a reference type - why would it be boxed? What would that even mean?
Even if you used int instead, there'd be no need for boxing, because there's no conversion of the value into an actual object.
I suspect your understanding of both boxing and parameter passing is flawed.
Boxing occurs when a value type value needs to be converted into an object, usually in order for it to be used as a variable (somewhere) of an interface or object type. So this boxes:
int value = 10;
Foo(value);
...
public void Foo(object x)
{
}
... but it wouldn't occur if Foo were changed such that the type of x were int instead.
The detailed rules on boxing become very complicated to state precisely and accurately, particularly where generics come in, but that's the basics.
There is no boxing here at all; boxing is when a value type is treated as object or an interface (not including generics), for example:
int i = 1;
Foo(i); // the value of i is boxed
Bar(i); // the value of i is boxed
...
private void Foo(object obj) {...}
private void Bar(IConvertible obj) {...}
In your examples, a: there is no type conversion here, so no need to box, and b: string is a reference-type anyway, so there is no meaning of boxing a string.
Your test2 is actually showing "pass by reference", aka ref, which is completely unrelated to boxing - and indeed ref parameters must be an exact match, so there is never any boxing involved in a ref parameter (however, subsequent code could obtain the value from the reference and then box/unbox that)
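What ref actually changes is whether the callee can reassign the caller's variable, which has nothing to do with boxing. A minimal sketch, with method names invented for the example:

```csharp
using System;

public static class RefDemo
{
    // Receives a copy of the reference; reassigning it is invisible to the caller.
    public static void ReassignByValue(string arg)
    {
        arg = "changed";
    }

    // Receives the caller's variable itself; reassigning it is visible to the caller.
    public static void ReassignByRef(ref string arg)
    {
        arg = "changed";
    }

    public static void Main()
    {
        string test = "hello";
        ReassignByValue(test);
        Console.WriteLine(test);   // hello
        ReassignByRef(ref test);
        Console.WriteLine(test);   // changed
    }
}
```

In both calls only a reference (or a reference to the variable) is passed; the string's characters are never copied and nothing is boxed.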

Structs, Interfaces and Boxing [duplicate]

Closed 10 years ago.
Possible Duplicate:
Is it safe for structs to implement interfaces?
Take this code:
interface ISomeInterface
{
int SomeProperty { get; }
}
struct SomeStruct : ISomeInterface
{
int someValue;
public int SomeProperty { get { return someValue; } }
public SomeStruct(int value)
{
someValue = value;
}
}
and then I do this somewhere:
ISomeInterface someVariable = new SomeStruct(2);
is the SomeStruct boxed in this case?
Jon's point is true, but as a side note there is one slight exception to the rule: generics. If you have where T : ISomeInterface, then a call made through T is emitted with the constrained. IL prefix on the callvirt instruction. This means the interface member can be called without boxing. For example:
public static void Foo<T>(T obj) where T : ISomeInterface {
obj.Bar(); // Bar defined on ISomeInterface
}
This does not involve boxing, even for value-type T. However, if (in the same Foo) you do:
ISomeInterface asInterface = obj;
asInterface.Bar();
then that boxes as before. The constrained. prefix only applies to calls made directly on T.
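One way to see the difference is with a mutable struct. ICounter and Counter below are hypothetical types written for this sketch. A call made directly on T mutates the caller's struct in place, while a call through an interface variable mutates a boxed copy:

```csharp
using System;

public interface ICounter
{
    void Increment();
    int Count { get; }
}

public struct Counter : ICounter
{
    private int count;
    public int Count { get { return count; } }
    public void Increment() { count++; }
}

public static class ConstrainedCallDemo
{
    // Constrained call on T: no box, so the caller's struct is mutated in place.
    public static void Bump<T>(ref T obj) where T : ICounter
    {
        obj.Increment();
    }

    public static void Main()
    {
        Counter s = new Counter();
        Bump(ref s);
        Console.WriteLine(s.Count);      // 1

        ICounter boxed = s;              // boxes: copies s onto the heap
        boxed.Increment();               // mutates the boxed copy only
        Console.WriteLine(boxed.Count);  // 2
        Console.WriteLine(s.Count);      // 1
    }
}
```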
Yes, it is. Basically whenever you need a reference and you've only got a value type value, the value is boxed.
Here, ISomeInterface is an interface, which is a reference type. Therefore the value of someVariable is always a reference, so the newly created struct value has to be boxed.
I'm adding this to hopefully shed a little more light on the answers offered by Jon and Marc.
Consider this non-generic method:
public static void SetToNull(ref ISomeInterface obj) {
obj = null;
}
Hmm... setting a ref parameter to null. That's only possible for a reference type, correct? (Well, or for a Nullable<T>; but let's ignore that case to keep things simple.) So the fact that this method compiles tells us that a variable declared to be of some interface type must be treated as a reference type.
The key phrase here is "declared as": consider this attempt to call the above method:
var x = new SomeStruct();
// This line does not compile:
// "Cannot convert from ref SomeStruct to ref ISomeInterface" --
// since x is declared to be of type SomeStruct, it cannot be passed
// to a method that wants a parameter of type ref ISomeInterface.
SetToNull(ref x);
Granted, the reason you can't pass x in the above code to SetToNull is that x would need to be declared as an ISomeInterface for you to be able to pass ref x -- and not because the compiler magically knows that SetToNull includes the line obj = null. But in a way that just reinforces my point: the obj = null line is legal precisely because it would be illegal to pass a variable not declared as an ISomeInterface to the method.
In other words, if a variable is declared as an ISomeInterface, it can be set to null, pure and simple. And that's because interfaces are reference types -- hence, declaring an object as an interface and assigning it to a value type object boxes that value.
Now, on the other hand, consider this hypothetical generic method:
// This method does not compile:
// "Cannot convert null to type parameter 'T' because it could be
// a non-nullable value type. Consider using 'default(T)' instead." --
// since this method could take a variable declared as, e.g., a SomeStruct,
// the compiler cannot assume a null assignment is legal.
public static void SetToNull<T>(ref T obj) where T : ISomeInterface {
obj = null;
}
The MSDN documentation tells us that structs are value types, not reference types. They are boxed when converted to a variable of type object and unboxed when converted back. But the central question here is: what about a variable of an interface type? Since the interface can also be implemented by a class, this must be tantamount to converting from a value type to a reference type, as Jon Skeet already said; therefore yes, boxing would occur. More discussion on an msdn blog.

Question about C# 4.0's generics covariance

Having defined this interface:
public interface IInputBoxService<out T> {
bool ShowDialog();
T Result { get; }
}
Why does the following code work:
public class StringInputBoxService : IInputBoxService<string> {
...
}
...
IInputBoxService<object> service = new StringInputBoxService();
and this doesn't?:
public class IntegerInputBoxService : IInputBoxService<int> {
...
}
...
IInputBoxService<object> service = new IntegerInputBoxService();
Does it have anything to do with int being a value type? If yes, how can I circumvent this situation?
Thanks
Yes, it absolutely has to do with int being a value type. Generic variance in C# 4 only works with reference types. This is primarily because references always have the same representation: a reference is just a reference, so the CLR can use the same bits for something it knows is a string reference as for an object reference. The CLR can make sure that the code will be safe, and use native code which only knows about IInputBoxService<object> when passed an IInputBoxService<string> - the value returned from Result will be representationally compatible (if such a term exists!).
With int => object there would have to be boxing etc, so you don't end up with the same code - that basically messes up variance.
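The same restriction can be seen with the covariant IEnumerable<out T> from the base class library. In this sketch BoxAll is a made-up helper; it shows that the value-type case needs an explicit projection that boxes each element:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VarianceDemo
{
    // Boxes each element explicitly; the result is a new sequence, not the same object.
    public static IEnumerable<object> BoxAll(IEnumerable<int> source)
    {
        return source.Select(i => (object)i);
    }

    public static void Main()
    {
        IEnumerable<string> strings = new List<string> { "a", "b" };
        IEnumerable<object> objects = strings;                 // covariant reference conversion
        Console.WriteLine(ReferenceEquals(strings, objects));  // True

        IEnumerable<int> ints = new List<int> { 1, 2 };
        // IEnumerable<object> bad = ints;                     // does not compile
        IEnumerable<object> boxedInts = BoxAll(ints);
        Console.WriteLine(boxedInts.Count());                  // 2
    }
}
```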
EDIT: The C# 4.0 spec says this in section 13.1.3.2:
The purpose of variance annotations is to provide for more lenient (but still type safe) conversions to interface and delegate types. To this end the definitions of implicit (§6.1) and explicit conversions (§6.2) make use of the notion of variance-convertibility, which is defined as follows:
A type T<A1, …, An> is variance-convertible to a type T<B1, …, Bn> if T is either an interface or a delegate type declared with the variant type parameters T<X1, …, Xn>, and for each variant type parameter Xi one of the following holds:
Xi is covariant and an implicit reference or identity conversion exists from Ai to Bi
Xi is contravariant and an implicit reference or identity conversion exists from Bi to Ai
Xi is invariant and an identity conversion exists from Ai to Bi
This doesn't make it terribly obvious, but basically reference conversions only exist between reference types, which leaves only identity conversions (i.e. from a type to itself).
As for workarounds: I think you'd have to create your own wrapper class, basically. This can be as simple as:
public class Wrapper<T>
{
public T Value { get; private set; }
public Wrapper(T value)
{
Value = value;
}
}
It's pretty nasty though :(
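For what it's worth, here is how the wrapper could be used with the interface from the question. This is only a sketch: the ShowDialog and Result bodies are placeholders, since the question elides the real implementation.

```csharp
using System;

public interface IInputBoxService<out T>
{
    bool ShowDialog();
    T Result { get; }
}

public class Wrapper<T>
{
    public T Value { get; private set; }
    public Wrapper(T value)
    {
        Value = value;
    }
}

// Exposes the int result through a reference type so that variance applies.
public class IntegerInputBoxService : IInputBoxService<Wrapper<int>>
{
    public bool ShowDialog() { return true; }   // placeholder: no real dialog is shown

    public Wrapper<int> Result
    {
        get { return new Wrapper<int>(42); }     // placeholder value
    }
}

public static class WrapperDemo
{
    public static void Main()
    {
        // Compiles because Wrapper<int> is a reference type.
        IInputBoxService<object> service = new IntegerInputBoxService();
        Wrapper<int> wrapped = (Wrapper<int>)service.Result;
        Console.WriteLine(wrapped.Value);        // 42
    }
}
```

The cost is an extra allocation per result plus a cast on the consuming side, which is essentially boxing done by hand.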
