I am trying to understand this code:
double b = 3;
object o = b;
Console.WriteLine(o.Equals(3));//false
Console.WriteLine(o.Equals(b));//true
Console.WriteLine( o == (object)b );//false
1. Does each new boxing create a different reference to an object holding b?
2. If 1. is true, why is o.Equals(b) true?
3. If Equals does not check references, why is o.Equals(3) false?
Thanks.
Yes, each time you box a value type, a new object is created. More on boxing here.
Equals checks for value equality, not reference equality. Both o and b hold the same thing: a double with a value of 3.0.
3 here is an int, not a double, and Equals does not do any conversion to make different types compatible, as the compiler usually does. o.Equals(3.0) will return true.
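The three points above can be checked with a small sketch. The class and method names here are only illustrative; the comparisons themselves are the ones from the question.

```csharp
using System;

static class BoxingEqualityDemo
{
    public static bool EqualsIntLiteral()
    {
        double b = 3;
        object o = b;          // boxes a copy of b as System.Double
        return o.Equals(3);    // 3 is boxed as System.Int32, so the types differ
    }

    public static bool EqualsDoubleLiteral()
    {
        double b = 3;
        object o = b;
        return o.Equals(3.0);  // same type and same value
    }

    public static bool SameBox()
    {
        double b = 3;
        object o = b;
        // (object)b creates a second box, so the two references differ.
        return object.ReferenceEquals(o, (object)b);
    }

    static void Main()
    {
        Console.WriteLine(EqualsIntLiteral());    // False
        Console.WriteLine(EqualsDoubleLiteral()); // True
        Console.WriteLine(SameBox());             // False
    }
}
```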
double b = 3;
creates a new variable on the stack with the value 3.0
object o = b;
creates a new object on the heap that holds a copy of b's value; o references that copy, not b itself. This is boxing.
o.Equals(3)
is false because the literal 3 is an int, so it is boxed as an Int32, and Double.Equals returns false for anything that is not a double
o.Equals(b)
is true because b is boxed into a double with the same value as the one o holds
o == (object)b
is false because == on operands of type object compares references, and each boxing creates a separate object, whereas Equals compares the values themselves
See this
It explains all about equals behavior.
Every time an effort is made to convert a value type into a reference type, it must be boxed to a new object instance. There is no way the system could do anything else without breaking compatibility. Among other things, while one might expect that boxed value types would be immutable(*), none of them are. Every value type, when boxed, yields a mutable object. While C# and vb.net don't provide any convenient way to mutate such objects, trusted and verifiable code written in C++/CLI can do so easily. Even if the system knew of a heap object that holds an Int32 whose value is presently 23, the statement Object foo = 23; would have to generate a new Int32 heap object with a value of 23, since the system would have no way of knowing whether something might be planning to change the value of that existing object to 57.
(*)I would argue that they should be; rather than making all boxed objects mutable, it would be much better to provide a means by which struct types like List<T>.Enumerator could specify customizable boxing behavior. I'm not sure if there's any way to fix that now without totally breaking compatibility with existing code, though.
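As a side note on boxed objects being mutable: one route that C# itself does allow is calling a mutating method through an interface the struct implements. The sketch below assumes a made-up ICounter interface and Counter struct, purely for illustration.

```csharp
using System;

// Hypothetical interface and struct, used only to illustrate the point.
interface ICounter
{
    int Value { get; }
    void Increment();
}

struct Counter : ICounter
{
    private int _value;
    public int Value => _value;
    public void Increment() => _value++;
}

static class BoxedMutationDemo
{
    public static int IncrementThroughBox()
    {
        Counter c = new Counter();
        ICounter boxed = c;   // boxing: copies c into a new heap object
        boxed.Increment();    // mutates the boxed copy in place
        boxed.Increment();
        return boxed.Value;
    }

    public static int OriginalAfterBoxedIncrement()
    {
        Counter c = new Counter();
        ICounter boxed = c;
        boxed.Increment();    // only the box changes
        return c.Value;       // the original variable is untouched
    }

    static void Main()
    {
        Console.WriteLine(IncrementThroughBox());         // 2
        Console.WriteLine(OriginalAfterBoxedIncrement()); // 0
    }
}
```

Since the box can change after it is created, two boxes that hold 23 today cannot safely be merged into one shared object.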
Today I stumbled upon an interesting bug I wrote. I have a set of properties which can be set through a general setter. These properties can be value types or reference types.
public void SetValue( TEnum property, object value )
{
    if ( _properties[ property ] != value )
    {
        // Only come here when the new value is different.
    }
}
When writing a unit test for this method I found out the condition is always true for value types. It didn't take me long to figure out this is due to boxing/unboxing. It didn't take me long either to adjust the code to the following:
public void SetValue( TEnum property, object value )
{
    if ( !_properties[ property ].Equals( value ) )
    {
        // Only come here when the new value is different.
    }
}
The thing is, I'm not entirely satisfied with this solution. I'd like to keep a simple reference comparison unless the value is boxed.
The solution I am currently considering is to call Equals() only for boxed values. Checking whether a value is boxed seems like overkill, though. Isn't there an easier way?
If you need different behaviour when you're dealing with a value-type then you're obviously going to need to perform some kind of test. You don't need an explicit check for boxed value-types, since all value-types will be boxed** due to the parameter being typed as object.
This code should meet your stated criteria: If value is a (boxed) value-type then call the polymorphic Equals method, otherwise use == to test for reference equality.
public void SetValue(TEnum property, object value)
{
    bool equal = ((value != null) && value.GetType().IsValueType)
        ? value.Equals(_properties[property])
        : (value == _properties[property]);
    if (!equal)
    {
        // Only come here when the new value is different.
    }
}
( ** And, yes, I know that Nullable<T> is a value-type with its own special rules relating to boxing and unboxing, but that's pretty much irrelevant here.)
Equals() is generally the preferred approach.
The default implementation of .Equals() does a simple reference comparison for reference types, so in most cases that's what you'll be getting. Equals() might have been overridden to provide some other behavior, but if someone has overridden .Equals() in a class it's because they want to change the equality semantics for that type, and it's better to let that happen if you don't have a compelling reason not to. Bypassing it by using == can lead to confusion when your class sees two things as different when every other class agrees that they're the same.
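If you go the Equals() route, a minimal sketch might look like this. PropertyStore, Setting, and the bool return value are assumptions made up for the example, not the asker's real class. It uses the static object.Equals(a, b), which handles a null on either side before calling the virtual Equals.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical enum and store standing in for the asker's types.
enum Setting { Width, Name }

class PropertyStore<TEnum>
{
    private readonly Dictionary<TEnum, object> _properties =
        new Dictionary<TEnum, object>();

    // Returns true only when the new value differs from the stored one.
    public bool SetValue(TEnum property, object value)
    {
        _properties.TryGetValue(property, out object current);

        // Static object.Equals is null-safe and then defers to the
        // virtual Equals, so boxed value types compare by value.
        if (object.Equals(current, value))
        {
            return false;
        }

        _properties[property] = value;
        return true;
    }
}

static class PropertyStoreDemo
{
    static void Main()
    {
        var store = new PropertyStore<Setting>();
        Console.WriteLine(store.SetValue(Setting.Width, 5)); // True
        Console.WriteLine(store.SetValue(Setting.Width, 5)); // False
        Console.WriteLine(store.SetValue(Setting.Width, 6)); // True
    }
}
```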
Since the input parameter's type is object, you will always get a boxed value inside the method's context.
I think your only chance is to change the method's signature and to write different overloads.
How about this:
if(object.ReferenceEquals(first, second)) { return; }
if(first.Equals(second)) { return; }
// they must differ, right?
Update
I realized this doesn't work as expected for a certain case:
For value types, ReferenceEquals returns false so we fall back to Equals, which behaves as expected.
For reference types where ReferenceEquals returns true, we consider them "same" as expected.
For reference types where ReferenceEquals returns false and Equals returns false, we consider them "different" as expected.
For reference types where ReferenceEquals returns false and Equals returns true, we consider them "same" even though we want "different"
So the lesson is "don't get clever"
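The problematic fourth case is easy to reproduce with strings. This is only a sketch: ConsideredSame is a made-up helper that mirrors the two-step check, with a null guard added.

```csharp
using System;

static class FallbackDemo
{
    // "Same" if either the reference test or the Equals test passes.
    public static bool ConsideredSame(object first, object second)
    {
        if (object.ReferenceEquals(first, second)) { return true; }
        return first != null && first.Equals(second);
    }

    static void Main()
    {
        // Two distinct string instances with identical contents.
        string a = new string('x', 3);
        string b = new string('x', 3);

        Console.WriteLine(object.ReferenceEquals(a, b)); // False
        Console.WriteLine(ConsideredSame(a, b));         // True
        Console.WriteLine(ConsideredSame(5, 5));         // True (two boxes, equal values)
    }
}
```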
I suppose
I'd like to keep a simple reference comparison, unless the value is boxed.
is somewhat equivalent to
If the value is boxed, I'll do a non-"simple reference comparison".
This means the first thing you'll need to do is to check whether the value is boxed or not.
If there exists a method to check whether an object is a boxed value type or not, it should be at least as complex as that "overkill" method you provided the link to unless that is not the simplest way. Nonetheless, there should be a "simplest way" to determine if an object is a boxed value type or not. It's unlikely that this "simplest way" is simpler than simply using the object Equals() method, but I've bookmarked this question to find out just in case.
(not sure if I was logical)
I read What is boxing and unboxing and what are the trade offs? but can't understand one thing. Suppose I have a class:
class MyClass
{
public int Value { get; set; }
}
And I want to get value within my method:
void MyFunc(MyClass cls)
{
int i = cls.Value;
}
Since a class instance is placed on the heap, I guess that Value is placed on the heap too? And is the operation
int i = cls.Value;
is unboxing? Or it's not unboxing?
Stop thinking about stack and heap; that's completely the wrong way to think about it. It is emphatically not the case that "boxed" means "on the heap", and therefore anything "on the heap" must be "boxed".
Stack and heap are irrelevant. Rather, think about references and values. A value of value type is boxed when it must be treated as a reference to an object. If you need to have a reference to a value of a value type, you make a box, put the value in the box, and make a reference to the box. And there, now you have a reference to a value of value type.
Do not confuse that with making a reference to a variable of value type; that is completely different. A variable and a value are two very different things; to make a reference to a variable you use the "ref" keyword.
Boxing and unboxing have nothing to do with storing values on the heap or the stack. You should read the article "Boxing and Unboxing" from the C# Programming Guide. In your example neither of the two occurs, because you're assigning an int to an int.
It's neither unboxing nor boxing.
Since you assign to an int without a cast and, I hope, this code compiles, cls.Value must be of type int (Int32). So you are assigning an int to an int.
What happens here is a value copy.
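To make the difference concrete, here is a small sketch that reuses MyClass from the question; the CopyDemo class and its method names are invented for the example.

```csharp
using System;

class MyClass
{
    public int Value { get; set; }
}

static class CopyDemo
{
    public static int ReadThenChange()
    {
        var cls = new MyClass { Value = 1 };
        int i = cls.Value;   // plain int-to-int copy; no box is involved
        cls.Value = 2;       // does not affect the copy in i
        return i;
    }

    public static bool BoxAndUnbox()
    {
        var cls = new MyClass { Value = 1 };
        object o = cls.Value; // boxing happens here, because the target is object
        int back = (int)o;    // unboxing
        return o.GetType() == typeof(int) && back == 1;
    }

    static void Main()
    {
        Console.WriteLine(ReadThenChange()); // 1
        Console.WriteLine(BoxAndUnbox());    // True
    }
}
```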
int i = 5;
object o = i; // boxing of int i
int j = (int)o; // unboxing of object o
Note that we do not assign i to a field or property of an object, but to the object itself.
It is comparable to the nature of light. Light can be perceived as being made of particles (photons) or as being a wave. An int can be an int object (a reference type) or an int value type. You cannot, however, define an int to be a reference type directly; to make it a reference type you must convert it to an object, e.g. by assigning it to a variable, parameter or property of type object, or by casting it to object.
The MSDN Guidelines for Overloading Equals() and Operator == state:
By default, the operator == tests for reference equality by determining if two references indicate the same object, so reference types do not need to implement operator == in order to gain this functionality. When a type is immutable, meaning the data contained in the instance cannot be changed, overloading operator == to compare value equality instead of reference equality can be useful because, as immutable objects, they can be considered the same as long as they have the same value. Overriding operator == in non-immutable types is not recommended.
Can anyone explain the reasoning behind the bold part (the final sentence)?
EDIT - Also, is this guideline relevant to the == operator only, or is it meant for the Equals method as well ?
My educated guess would be to make things operate like the built in types in .NET do, namely that == should work like reference equality where possible, and that Equals should work like value equality where possible. Consider the actual difference between == and Equals:
//Integer is a hypothetical class that overrides Equals and overloads ==.
object myObj = new Integer(4);
object myObj2 = new Integer(4);
//Note that an overloaded == is only called if the operands are statically
//typed as the type overloading it; here both are typed as object.
bool r1 = myObj == myObj2; //False (???)
bool r2 = myObj.Equals(myObj2); //True (This call is virtual)
//Set the references equal to each other -- note that the operator==
//comparison now works.
myObj2 = myObj;
bool r3 = myObj == myObj2; //True
bool r4 = myObj.Equals(myObj2); //True
This behavior is of course inconsistent and confusing, particularly to new programmers -- but it demonstrates the difference between reference comparisons and value comparisons.
If you follow this MSDN guideline, you are following the guideline taken by important classes such as string. Basically -- if a comparison using == succeeds, the programmer knows that that comparison will always succeed, so long as the references involved don't get assigned to new objects. The programmer need never worry about the contents of the objects being different, because they never will be different:
//Mutable type (Mutable is a hypothetical class)
var mutable1 = new Mutable(1);
var mutable2 = mutable1;
bool before = mutable1 == mutable2; //true
mutable1.MutateToSomethingElse(56);
bool after = mutable1 == mutable2; //still true, even after modification
//This is consistent with the framework. (Because the references are the same,
//reference and value equality are the same.) Consider if == were overloaded,
//and there was a difference between reference and value equality:
var mutable3 = new Mutable(1);
var mutable4 = new Mutable(1);
bool before2 = mutable3 == mutable4; //true
mutable3.MutateToSomethingElse(56);
bool after2 = mutable3 == mutable4; //oops -- not true anymore
//This is inconsistent with, say, "string", because it cannot mutate.
It boils down to this: there's no real technical reason for the guideline; it's just there to remain consistent with the rest of the classes in the framework.
Assume you have a mutable type A and you create a set of objects of type A. Adding an object to the set should fail if an equal object already exists in the set.
Now let's say you add an object to the set, and then you change its properties so that it becomes equal to another object in the set. You've created an illegal state, where there are two objects in the set which are equal.
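Here is a rough sketch of that scenario with HashSet<T>. Item is a made-up mutable class whose equality is based on its Id; nothing here comes from the guideline itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical mutable type with value-based equality.
class Item
{
    public int Id { get; set; }

    public override bool Equals(object obj) => obj is Item other && other.Id == Id;
    public override int GetHashCode() => Id;
}

static class MutableSetDemo
{
    public static bool AddRejectsEqualItem()
    {
        var set = new HashSet<Item> { new Item { Id = 1 } };
        return set.Add(new Item { Id = 1 }); // an equal item is already present
    }

    public static int EqualItemsAfterMutation()
    {
        var first = new Item { Id = 1 };
        var second = new Item { Id = 2 };
        var set = new HashSet<Item> { first, second };

        second.Id = 1; // mutate after insertion: second now equals first

        // Count how many elements of the set are equal to first.
        return set.Count(item => item.Equals(first));
    }

    static void Main()
    {
        Console.WriteLine(AddRejectsEqualItem());     // False
        Console.WriteLine(EqualItemsAfterMutation()); // 2
    }
}
```

The set refuses a duplicate at insertion time, but it has no way to notice a change made to an element afterwards.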
In the following code...
int i=5;
object o = 5;
Console.WriteLine(o); //prints 5
I have three questions:
1) What additional/useful functionality is acquired by the 5 residing in the variable o that the 5 represented by the variable i does not have?
2) If some code is expecting a value type then we can just pass it the int i, but if it's expecting a reference type, it's probably not interested in the 5 boxed in o anyway. So when are boxing conversions explicitly used in code?
3) How come Console.WriteLine(o) prints out a 5 instead of System.Object?
What additional/useful functionality is acquired by the 5 residing in the variable o that the 5 represented by the variable i does not have ?
It's rare that you want to box something, but occasionally it is necessary to do so. In older versions of .NET boxing was often necessary because some methods only worked with object (e.g. ArrayList's methods). This is much less of a problem now that there are generics, so boxing occurs less frequently in newer code.
If some code is expecting a value type then we can just pass it the int i, but if it's expecting a reference type, it's probably not interested in the 5 boxed in o anyway. So when are boxing conversions explicitly used in code?
In practice boxing usually happens automatically for you. You could explicitly box a variable if you want to make it more clear to the reader of your code that boxing is happening. This might be relevant if performance could be an issue.
How come the Console.WriteLine(o) print out a 5 instead of System.Object ??
Because ToString is a virtual method on object which means that the implementation that is called depends on the runtime type, not the static type. Since int overrides ToString with its own implementation, it is int.ToString that is called, not the default implementation provided by object.
object o = 5;
Console.WriteLine(o.GetType()); // outputs System.Int32, not System.Object
1) On its own, there is not much point. But imagine you wish to store something in a generic way, and you don't know whether that thing is a value or an object. With boxing, you can convert the value into an object, and then treat everything as an object. Without it, you would need a special case to be able to hold a value or an object. (This is most useful in containers such as lists, allowing you to mix values like 5 with references to objects like a FileStream.)
2) Boxing conversions usually only happen implicitly, except in example code illustrating boxing.
3) The WriteLine code probably calls the virtual Object.ToString() method. If the class of the object it is called on does not override ToString, the base class (object) implementation is called, but most types override it to provide a more useful, context-specific result. That includes System.Int32: although int is a value type, it still derives from System.Object.
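A quick way to see the virtual dispatch is to call ToString through an object reference. NoOverride, WithOverride and Describe are illustrative names only.

```csharp
using System;

class NoOverride { }

class WithOverride
{
    public override string ToString() => "custom text";
}

static class ToStringDemo
{
    // The static type is object, but the runtime type decides which ToString runs.
    public static string Describe(object o) => o.ToString();

    static void Main()
    {
        Console.WriteLine(Describe(5));                  // 5 (Int32 overrides ToString)
        Console.WriteLine(Describe(new WithOverride())); // custom text
        Console.WriteLine(Describe(new NoOverride()));   // the type's name, from Object.ToString
    }
}
```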
What additional/useful functionality is acquired by the 5 residing in the variable o that the 5 represented by the variable i does not have ?
There is no additional functionality acquired by a boxed value type, apart from the fact that it can be passed as a reference to code that requires one.
So when are boxing conversions explicitly used in code ?
I can't spontaneously think of a scenario when you would need to explicitly box an int to an object, since there is always an implicit conversion in that direction (although I would not be surprised if there are cases when an explicit conversion is required).
How come the Console.WriteLine(o) print out a 5 instead of System.Object ??
It calls ToString on the object passed. In fact, it starts by trying to convert the object to an IFormattable and, if successful (which it will be in the case of an int) then calls the ToString overload that is defined in that interface. This will return "5".
Additional functionality: The object is a full-fledged object. You can call methods on it and use it as you would any other object:
System.Console.WriteLine("type: {0}", o.GetType());
System.Console.WriteLine("hash code: {0}", o.GetHashCode());
The int variable is a value type, not an object.
XXX: This is incorrect; see comments. I would venture instead that the one difference in how you might use the two is that object o = 5 is nullable (you can set o = null), while the value type is not - if int i = 5, then i is always an int.
Explicit boxing: As you said, the boxed version is used by code manipulating objects as objects rather than as integers in particular. This is what enables non-type-safe generic data structures. Now that type-safe generic data structures are available, you are unlikely to be doing much casting and boxing/unboxing.
Why "5": Because the object knows how to print itself using ToString().
I was just having a quick read through this article (specifically the bit about why he chose to use structs / fields instead of classes / properties) and saw this line:
The result of a property is not a true l-value so we cannot do something like Vertex.Normal.dx = 0. The chaining of properties gives very unexpected results.
What sort of unexpected results is he talking about?
I would add to dbemerlin's answer that the key here is Rico's note that properties are not "lvalues", or, as we call them in C#, "variables".
In order to mutate a mutable struct (and ideally, you should not; mutable structs often cause more problems than they solve) you need to mutate a variable. That's what a variable is -- a storage location whose contents change. If you have a field of type vector and you say
Foo.vector.x = 123;
then we have a variable of value type -- the field Foo.vector -- and we can therefore mutate its property x. But if you have a property of value type:
Foo.Vector.x = 123;
the property is not a variable. This is equivalent to
Vector v = Foo.Vector;
v.x = 123;
which mutates the temporary variable v, not whatever storage location is backing the property.
The whole problem goes away if you abandon mutable value types. To change x, make a new vector with the new values and replace the whole thing:
Foo.Vector = new Vector(x, Foo.Vector.y);
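The field versus property difference can be sketched as follows. Vector and Foo here are minimal stand-ins invented for the example. Note that writing foo.Vector.x = 123 directly against the property is rejected by the C# compiler (error CS1612), so the copy is spelled out by hand.

```csharp
using System;

// Hypothetical mutable struct and owner class.
struct Vector
{
    public int x;
    public int y;

    public Vector(int x, int y)
    {
        this.x = x;
        this.y = y;
    }
}

class Foo
{
    public Vector vector;              // field: a variable
    public Vector Vector { get; set; } // property: the getter returns a copy
}

static class StructCopyDemo
{
    public static int MutateField()
    {
        var foo = new Foo();
        foo.vector.x = 123;    // mutates the field's storage in place
        return foo.vector.x;
    }

    public static int MutateCopyOfProperty()
    {
        var foo = new Foo();
        Vector v = foo.Vector; // copy of the property's value
        v.x = 123;             // changes only the local copy
        return foo.Vector.x;   // backing storage is unchanged
    }

    public static int ReplaceWholeValue()
    {
        var foo = new Foo();
        foo.Vector = new Vector(123, foo.Vector.y);
        return foo.Vector.x;
    }

    static void Main()
    {
        Console.WriteLine(MutateField());          // 123
        Console.WriteLine(MutateCopyOfProperty()); // 0
        Console.WriteLine(ReplaceWholeValue());    // 123
    }
}
```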
The only "unexpected" result would be that the assignment wouldn't last, because Vertex.Normal returns a copy and the code assigns 0 to dx of the copy.
I can't test it now, but that is what I would expect (from what I know of .NET's handling of structs).