Could someone please be kind enough to explain why calling ToString() on an empty reference type causes an exception (which in my mind makes perfect sense, you can't invoke a method on nothing!) but calling ToString() on an empty Nullable(Of T) returns String.Empty? This was quite a surprise to me as I assumed the behaviour would be consistent across types.
Nullable<Guid> value = null;
Stock stock = null;
string result = value.ToString(); //Returns empty string
string result1 = stock.ToString(); //Causes a NullReferenceException
Nullable<T> is actually a struct that has some compiler support and implementation support to behave like a null without actually being null.
What you are seeing is a collision between two things: the implementation lets you treat it as null, just as you would any reference type, yet the method call still succeeds because the Nullable<T> itself isn't actually null; it merely holds no value.
Visually it looks like it shouldn't work, this is simply because you cannot see what is done in the background for you.
Other such visual trickery can be seen when you call an extension method on a null reference type... the call works (against visual expectation) because under the hood it is resolved into a static method call passing your null instance as a parameter.
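A minimal sketch of that extension-method case, assuming a hypothetical IsNullOrBlank extension that is not part of the framework and exists only for illustration:

```csharp
using System;

public static class StringExtensions
{
    // Hypothetical extension method, introduced only for this example.
    // The compiler turns s.IsNullOrBlank() into
    // StringExtensions.IsNullOrBlank(s), so s is just an argument and may be null.
    public static bool IsNullOrBlank(this string s)
    {
        return s == null || s.Trim().Length == 0;
    }
}

public static class ExtensionOnNullDemo
{
    // Looks like a method call on a null reference, but it is a static call.
    public static bool CallOnNull()
    {
        string nothing = null;
        return nothing.IsNullOrBlank(); // no NullReferenceException
    }
}
```

Calling ExtensionOnNullDemo.CallOnNull() should return true rather than throw, because no instance method is ever dispatched on the null reference.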
How does a Nullable<T> type work behind the scenes?
Nullable<T> is a value type, and assigning null to it initializes it with HasValue = false and its internal value field set to default(T).
Further, Nullable<T>.ToString() is implemented as follows:
public override string ToString()
{
    if (!this.HasValue)
    {
        return "";
    }
    return this.value.ToString();
}
So what you are seeing is expected.
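As a quick check of the behaviour described above, here is a small sketch (the class and method names are made up for this example):

```csharp
using System;

public static class NullableBasicsDemo
{
    // Assigning null produces an ordinary struct instance with HasValue == false.
    public static bool EmptyHasValue()
    {
        int? empty = null;
        return empty.HasValue;
    }

    // ToString on that instance runs normally and yields an empty string.
    public static string EmptyToString()
    {
        int? empty = null;
        return empty.ToString();
    }

    // With a value present, ToString forwards to the wrapped value.
    public static string FullToString()
    {
        int? full = 42;
        return full.ToString();
    }
}
```

None of these calls touches a null reference, so none of them can raise a NullReferenceException.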
It is a bit tricky with nullable types. When you set one to null it is actually not null, because it is not a reference type (it is a value type). When you initialize such a variable with null, it creates a new structure instance whose HasValue property is false and which holds no value, so calling ToString works fine on that structure instance.
The exception raised by calling default(object).ToString() is called NullReferenceException for a reason, it's calling a method on a null reference. default(int?) on the other hand, is not a null reference, because it's not a reference; it is a value type with a value that is equivalent to null.
The big practical point is that if calls on a null nullable threw instead, then the following would fail:
default(int?).HasValue // should return false, or throw an exception?
It would also screw-up the way we have some ability to mix nullables and non-nullables:
((int?)null).Equals(1) // should return false, or throw an exception?
And the following becomes completely useless:
default(int?).GetValueOrDefault(-1);
We could get rid of HasValue and force comparison with null, but then what if the equality override of the value-type that is made nullable can return true when compared to null in some cases. That may not be a great idea, but it can be done and the language has to cope.
Let's think back to why nullable types are introduced. The possibility that a reference type can be null, is inherent in the concept of reference types unless effort is taken to enforce non-nullability: Reference types are types that refer to something, and that implies the possibility of one not referring to anything, which we call null.
While a nuisance in many cases, we can make use of this in a variety of cases, such as representing "unknown value", "no valid value" and so on (we can use it for what null means in databases, for example).
At this point, we've given null a meaning in a given context, beyond the simple fact that a given reference doesn't refer to any object.
Since this is useful, we could therefore want to set an int or DateTime to null, but we can't because they aren't types that refer to something else, and hence can't be in a state of not referring to anything any more than I as a mammal can lose my feathers.
The nullable types introduced with 2.0 give us a form of value types that can have the semantic null, through a different mechanism than that of reference types. Most of this you could code yourself if it didn't exist, but special boxing and promotion rules allow for more sensible boxing and operator use.
Okay. Now let's consider why NullReferenceExceptions happen in the first place. Two are inevitable, and one was a design decision in C# (and doesn't apply to all of .NET).
You try to call a virtual method or property, or access a field on a null reference. This has to fail, because there's no way to look up what override should be called, and no such field.
You call a non-virtual method or property on a null reference which in turn calls a virtual method or property, or accesses a field. This is obviously a variant on point one, but the design decision we're coming to next has the advantage of guaranteeing this fails at the start, rather than part-way through (which could be confusing and have long-term side-effects).
You call a non-virtual method or property on a null reference which does not call a virtual method or property, or access a field. There's no inherent reason why this should not be allowed, and some languages allow it, but in C# they decided to use callvirt rather than call to force a NullReferenceException for the sake of consistency (can't say I agree, but there you go).
None of these cases apply in any way to a nullable value type. It is impossible to put a nullable value type into a condition in which there is no way to know which field or method override to access. The whole concept of NullReferenceException just doesn't make sense here.
In all, not throwing a NullReferenceException is consistent with the other types: they throw it if and only if a null reference is used.
Note that there is one case where calling a method on a null nullable type throws: GetType(). GetType() is not virtual, and when called on a value type there is always an implied boxing. This is true of other value types, so:
(1).GetType()
is treated as:
((object)1).GetType()
But in the case of nullable types, boxing turns those with a false HasValue into null, and hence:
default(int?).GetType()
being treated as:
((object)default(int?)).GetType()
which results in GetType() being called on a null object, and hence throwing.
This incidentally brings us to why not faking NullReferenceException was the more sensible design decision - people who need that behaviour can always box. If you want it to throw then use ((object)myNullableValue).ToString(), so there's no need for the language to treat it as a special case to force the exception.
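The GetType() and boxing cases can be tried with a sketch like this (the helper names are invented for the example):

```csharp
using System;

public static class GetTypeOnNullableDemo
{
    // GetType() forces boxing; an empty int? boxes to a null reference, so this throws.
    public static bool GetTypeThrowsOnEmpty()
    {
        int? empty = null;
        try { empty.GetType(); return false; }
        catch (NullReferenceException) { return true; }
    }

    // A non-empty int? boxes to a boxed int, so the reported type is Int32.
    public static Type TypeOfFull()
    {
        int? full = 1;
        return full.GetType();
    }

    // Boxing explicitly is how you opt in to the exception for ToString().
    public static bool BoxedToStringThrows()
    {
        int? empty = null;
        try { ((object)empty).ToString(); return false; }
        catch (NullReferenceException) { return true; }
    }
}
```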
EDIT
Oh, I forgot to mention the mechanics behind NullReferenceException.
The test for NullReferenceException is very cheap, because it mostly just ignores the problem, and then catches the exception from the OS if it happens. In other words, there is no test.
See What is the CLR implementation behind raising/generating a null reference exception? and note how none of that would work with nullable value types.
If you inspect the Nullable<> definition, you will find a ToString override. It is overridden to return String.Empty when HasValue is false.
// Summary:
// Returns the text representation of the value of the current System.Nullable<T>
// object.
//
// Returns:
// The text representation of the value of the current System.Nullable<T> object
// if the System.Nullable<T>.HasValue property is true, or an empty string ("")
// if the System.Nullable<T>.HasValue property is false.
public override string ToString();
On the other hand, Stock is a class, and stock is a null reference, so there is no instance on which to call ToString (overridden or not). Thus the call throws a NullReferenceException.
As per MSDN Remarks
The Guid.ToString() method returns a string representation of the
value of this Guid instance, according to the provided format
specifier.
As per MSDN Remarks on Nullable
A type is said to be nullable if it can be assigned a value or can be
assigned null, which means the type has no value whatsoever.
Consequently, a nullable type can express a value, or that no value
exists. For example, a reference type such as String is nullable,
whereas a value type such as Int32 is not. A value type cannot be
nullable because it has enough capacity to express only the values
appropriate for that type; it does not have the additional capacity
required to express a value of null.
I was wondering why Nullable<T> is a value type, if it is designed to mimic the behavior of reference types? I understand things like GC pressure, but I don't feel convinced - if we want to have int acting like reference, we are probably OK with all the consequences of having real reference type. I can see no reason why Nullable<T> is not just boxed version of T struct.
As value type:
it still needs to be boxed and unboxed, and more, boxing must be a bit different than with "normal" structs (to treat null-valued nullable like real null)
it needs to be treated differently when checking for null (done simply in Equals, no real problem)
it is mutable, breaking the rule that structs should be immutable (ok, it is logically immutable)
it needs to have special restriction to disallow recursion like Nullable<Nullable<T>>
Doesn't making Nullable<T> a reference type solve these issues?
rephrased and updated:
I've modified my reason list a bit, but my general question is still open:
How will reference type Nullable<T> be worse than current value type implementation? Is it only GC pressure and "small, immutable" rule? It still feels strange for me...
The reason is that it was not designed to act like a reference type. It was designed to act like a value type, except in just one particular. Let's look at some ways value types and reference types differ.
The main difference between a value and reference type, is that value type is self-contained (the variable containing the actual value), while a reference type refers to another value.
Some other differences are entailed by this. The fact that we can alias reference types directly (which has both good and bad effects) comes from this. So too do differences in what equality means:
A value type has a concept of equality based on the value contained, which can optionally be redefined (there are logical restrictions on how this redefinition can happen*). A reference type has a concept of identity that is meaningless with value types (as they cannot be directly aliased, so two such values cannot be identical) and that cannot be redefined, which also gives the default for its concept of equality. By default, == deals with this value-based equality when it comes to value types†, but with identity when it comes to reference types. Also, even when a reference type is given a value-based concept of equality, and has it used for ==, it never loses the ability to be compared to another reference for identity.
Another difference entailed by this is that reference types can be null - a value that refers to another value allows for a value that doesn't refer to any value, which is what a null reference is.
Also, some of the advantages of keeping value-types small relate to this, since being based on value, they are copied by value when passed to functions.
Some other differences are implied but not entailed by this. That it's often a good idea to make value types immutable is implied but not entailed by the core difference because while there are advantages to be found without considering implementation matters, there are also advantages in doing so with reference types (indeed some relating to safety with aliases apply more immediately to reference types) and reasons why one may break this guideline - so it's not a hard and fast rule (with nested value types the risks involved are so heavily reduced that I would have few qualms in making a nested value type mutable, even though my style leans heavily to making even reference types immutable when at all practical).
Some further differences between value types and reference types are arguably implementation details. That a value type in a local variable has the value stored on the stack has been argued as an implementation detail; probably a pretty obvious one if your implementation has a stack, and certainly an important one in some cases, but not core to the definition. It's also often overstated (for a start, a reference type in a local variable also has the reference itself in the stack, for another there are plenty of times when a value type value is stored in the heap).
Some further advantages in value types being small relate to this.
Now, Nullable<T> is a type that behaves like a value type in all the ways described above, except that it can take a null value. Maybe the matter of local values being stored on the stack isn't all that important (being more an implementation detail than anything else), but the rest is inherent to how it is defined.
Nullable<T> is defined as
struct Nullable<T>
{
    private bool hasValue;
    internal T value;
    /* methods and properties I won't go into here */
}
Most of the implementation from this point is obvious. Some special handling is needed to allow null to be assigned to it - treated as if default(Nullable<T>) had been assigned - and some special handling when boxed, and then the rest follows (including that it can be compared for equality with null).
If Nullable<T> was a reference type, then we'd have to have special handling to allow for all the rest to occur, along with special handling for features in how .NET helps the developer (such as we'd need special handling to make it descend from ValueType). I'm not even sure if it would be possible.
*There are some restrictions on how we are allowed to redefine equality. Combining those rules with those used in the defaults, then generally we can allow for two values to be considered equal that would be considered unequal by default, but it rarely makes sense to consider two values unequal that the default would consider equal. An exception is the case where a struct contains only value-types, but where said value-types redefine equality. This is the result of an optimisation, and generally considered a bug rather than by design.
†An exception is floating-point types. Because of the definition of value-types in the CLI standard, double.NaN.Equals(double.NaN) and float.NaN.Equals(float.NaN) return true. But because of the definition of NaN in ISO 60559, float.NaN == float.NaN and double.NaN == double.NaN both return false.
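The NaN footnote is easy to verify; a tiny sketch (class name invented here):

```csharp
using System;

public static class NaNEqualityDemo
{
    // The == operator follows floating-point rules: NaN is not equal to anything.
    public static bool OperatorSaysEqual()
    {
        double a = double.NaN;
        double b = double.NaN;
        return a == b;
    }

    // Equals() treats two NaN values as equal.
    public static bool EqualsSaysEqual()
    {
        double a = double.NaN;
        double b = double.NaN;
        return a.Equals(b);
    }
}
```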
Edited to address the updated question...
You can box and unbox objects if you want to use a struct as a reference.
However, the Nullable<> type basically lets you enhance any value type with an additional state flag which tells whether the value shall be treated as null or whether the struct is "valid".
So to address your questions:
This is an advantage when used in collections, or because of the different semantics (copying instead of referencing)
No it doesn't. The CLR does respect this when boxing and unboxing, so that you actually never box a Nullable<> instance. Boxing a Nullable<> which "has" no value will return a null reference, and unboxing does the opposite.
Nope.
Again, this isn't the case. In fact generic constraints for a struct do not allow nullable structs to be used. This makes sense due to the special boxing/unboxing behavior. Therefore, if you have a where T: struct to constrain a generic type, nullable types will be disallowed. Since this constraint is defined on the Nullable<T> type as well, you cannot nest them, without any special treatment to prevent this.
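The boxing and unboxing behaviour from the points above can be sketched like this (names are illustrative only):

```csharp
using System;

public static class NullableBoxingDemo
{
    // An empty Nullable<int> boxes to a plain null reference.
    public static bool EmptyBoxesToNull()
    {
        int? empty = null;
        object boxed = empty;
        return boxed == null;
    }

    // A non-empty one boxes as the underlying int; no boxed Nullable<int> ever exists.
    public static Type BoxedTypeOfFull()
    {
        int? full = 5;
        object boxed = full;
        return boxed.GetType();
    }

    // Unboxing a null reference to int? does the opposite and yields an empty value.
    public static bool UnboxedNullHasValue()
    {
        object nothing = null;
        int? unboxed = (int?)nothing;
        return unboxed.HasValue;
    }

    // Note: Nullable<int?> does not compile, because int? does not
    // satisfy the "where T : struct" constraint on Nullable<T>.
}
```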
Why not using references? I already mentioned the important semantic differences. But apart from this, reference types use much more memory space: Each reference, especially in 64-bit environments, uses up not only heap memory for the instance, but also memory for the reference, the instance type information, locking bits etc.. So, apart from the semantics and performance differences (indirection via reference), you end up with using a multiple of the memory used for the entity itself for most common entities. And the GC gets more objects to handle, which will make the total performance compared to structs even worse.
It is not mutable; check again.
The boxing is different too; an empty Nullable<T> boxes to null.
But; it is small (barely bigger than T), immutable, and encapsulates only structs - ideal as a struct. Perhaps more importantly, so long as T is truly a "value", then so is T? a logical "value".
I coded MyNullable as a class.
I can't really understand why it cannot be a class, besides avoiding heap memory pressure.
namespace ClassLibrary1
{
    using NFluent;
    using NUnit.Framework;
    [TestFixture]
    class MyNullableShould
    {
        [Test]
        public void operator_equals_btw_nullable_and_value_works()
        {
            var myNullable = new MyNullable<int>(1);
            Check.That(myNullable == 1).IsEqualTo(true);
            Check.That(myNullable == 2).IsEqualTo(false);
        }
        [Test]
        public void Can_be_comparedi_with_operator_equal_equals()
        {
            var myNullable = new MyNullable<int>(1);
            var myNullable2 = new MyNullable<int>(1);
            Check.That(myNullable == myNullable2).IsTrue();
            Check.That(myNullable == myNullable2).IsTrue();
            var myNullable3 = new MyNullable<int>(2);
            Check.That(myNullable == myNullable3).IsFalse();
        }
    }
}
namespace ClassLibrary1
{
    using System;
    public class MyNullable<T> where T : struct
    {
        internal T value;
        public MyNullable(T value)
        {
            this.value = value;
            this.HasValue = true;
        }
        public bool HasValue { get; }
        public T Value
        {
            get
            {
                if (!this.HasValue) throw new Exception("Cannot grab value when has no value");
                return this.value;
            }
        }
        public static explicit operator T(MyNullable<T> value)
        {
            return value.Value;
        }
        public static implicit operator MyNullable<T>(T value)
        {
            return new MyNullable<T>(value);
        }
        public static bool operator ==(MyNullable<T> n1, MyNullable<T> n2)
        {
            if (!n1.HasValue) return !n2.HasValue;
            if (!n2.HasValue) return false;
            return Equals(n1.value, n2.value);
        }
        public static bool operator !=(MyNullable<T> n1, MyNullable<T> n2)
        {
            return !(n1 == n2);
        }
        public override bool Equals(object other)
        {
            if (!this.HasValue) return other == null;
            if (other == null) return false;
            return this.value.Equals(other);
        }
        public override int GetHashCode()
        {
            return this.HasValue ? this.value.GetHashCode() : 0;
        }
        public T GetValueOrDefault()
        {
            return this.value;
        }
        public T GetValueOrDefault(T defaultValue)
        {
            return this.HasValue ? this.value : defaultValue;
        }
        public override string ToString()
        {
            return this.HasValue ? this.value.ToString() : string.Empty;
        }
    }
}
What is the reason null doesn't evaluate to false in conditionals?
I first thought about assignments to avoid the bug of using = instead of ==, but this could easily be disallowed by the compiler.
if (someClass = someValue) // cannot convert someClass to bool. Ok, nice
if (someClass) // Cannot convert someClass to bool. Why?
if (someClass != null) // More readable?
I think it's fairly reasonable to assume that null means false. There are other languages that use this too, and I've not had a bug because of it.
Edit: And I'm of course referring to reference types.
A good comment by Daniel Earwicker on the assignment bug... This compiles without a warning because it evaluates to bool:
bool bool1 = false, bool2 = true;
if (bool1 = bool2)
{
// oops... false == true, and bool1 became true...
}
It's a specific design feature in the C# language: if statements accept only a bool.
IIRC this is for safety: specifically, so that your first if (someClass = someValue) fails to compile.
Edit: One benefit is that it makes the if (42 == i) convention ("yoda comparisons") unnecessary.
"I think it's fairly reasonable to assume that null means false"
Not in C#. false is a value of the bool struct, a value type. Value types cannot have a null value. If you wanted to do what you're trying to achieve, you'd have to create a custom conversion from your particular type to bool:
public class MyClass
{
    public static implicit operator bool(MyClass instance)
    {
        return instance != null;
    }
}
With the above, I could then do:
if (instance) {
}
etc.
"I think it's fairly reasonable to assume that null means false"
I don't agree. IMHO, more often than not, false means "no". Null means "I don't know"; i.e. completely indeterminate.
One thing that comes to mind: what about a value type, like int? Ints can't be null, so do they always evaluate to true? You could assume that an int equal to 0 is false, but that starts to get really complicated, because 0 is a valid value (where maybe 0 should evaluate to true, because the programmer set it) and not just a default value.
There are a lot of edge cases where null isn't an option, or sometimes it's an option, and other times it's not.
They put in things like this to protect the programmer from making mistakes. It goes along the same line of why you can't do fall through in case statements.
Just use if(Convert.ToBoolean(someClass))
http://msdn.microsoft.com/en-us/library/wh2c31dd.aspx
Parameters
value Type: System.Object An object that implements the
IConvertible interface, or null.
Return Value
Type: System.Boolean true or false,
which reflects the value returned by
invoking the IConvertible.ToBoolean
method for the underlying type of
value. If value is null, the method
returns false
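A caveat worth trying out before relying on this: as the quoted documentation says, the argument must implement IConvertible or be null. A small sketch, using a hypothetical Plain class that does not implement IConvertible:

```csharp
using System;

public static class ConvertToBooleanDemo
{
    // Hypothetical class for illustration; it does not implement IConvertible.
    private sealed class Plain { }

    // A null reference converts to false.
    public static bool NullConverts()
    {
        object nothing = null;
        return Convert.ToBoolean(nothing);
    }

    // An arbitrary non-null object cannot be cast to IConvertible, so this throws.
    public static bool PlainObjectThrows()
    {
        try { Convert.ToBoolean(new Plain()); return false; }
        catch (InvalidCastException) { return true; }
    }
}
```

So for an ordinary class this is not a drop-in replacement for a null check.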
As far as I know, this is a feature that you see in dynamic languages, which C# is not (per the language specification, an if statement only accepts bool or an expression that evaluates to bool).
I don't think it's reasonable to assume that null is false in every case. It makes sense in some cases, but not in others. For example, assume that you have a flag that can have three values: set, unset, and un-initialized. In this case, set would be true, unset would be false and un-initialized would be null. As you can see, in this case the meaning of null is not false.
Because null and false are different things.
A perfect example is bool? foo
If foo's value is true, then its value is true.
If foo's value is false, then its value is false
If foo has nothing assigned to it, its value is null.
These are three logically separate conditions.
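The three states can be told apart explicitly; a minimal sketch (the Describe helper and its labels are made up for this example):

```csharp
using System;

public static class TriStateDemo
{
    // bool? compared with == true / == false never treats null as either one.
    public static string Describe(bool? flag)
    {
        if (flag == true) return "set";
        if (flag == false) return "unset";
        return "uninitialized";
    }
}
```

Describe(null) falls through both comparisons, which is exactly why null cannot silently stand in for false.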
Think of it another way
"How much money do I owe you?"
"Nothing" and "I don't have that information" are two distinctly separate answers.
What is the reason null doesn't
evaluate to false in conditionals?
I first thought about assignments to
avoid the bug of using = instead of
==
That isn't the reason. We know this because if the two variables being compared happen to be of type bool then the code will compile quite happily:
bool a = ...
bool b = ...
if (a = b)
Console.WriteLine("Weird, I expected them to be different");
If b is true, the message is printed (and a is now true, making the subsequent debugging experience consistent with the message, thus confusing you even more...)
The reason null is not convertible to bool is simply that C# avoids implicit conversion unless requested by the designer of a user-defined type. The C++ history book is full of painful stories caused by implicit conversions.
Structurally, most people who "cannot think of any technological reason null should be equal to false" get it wrong.
Code is run by CPUs.
Most (if not all) CPUs have bits, groups of bits and interpretations of groups of bits. That said, something can be 0, 1, a byte, a word, a dword, a qword and so on.
Note that on the x86 platform, bytes are octets (8 bits), and words are usually 16 bits, but this is not a necessity. Older CPUs had words of 4 bits, and even today's low-end embedded controllers often use something like 7 or 12 bits per word.
That said, something is either "equal", "zero", "greater", "less", "greater or equal", or "less or equal" in machine code. There is no such thing as null, false or true.
As a convention, true is 1, false is 0, and a null pointer is either 0x00, 0x0000, 0x00000000, or 0x0000000000000000, depending on address bus width.
C# is one of the exceptions, as its bool is an indirect type, where the two possible values 0 and 1 are not an immediate value, but an index of a structure (think enum in C, or PTR in x86 assembly).
This is by design.
It is important to note, though, that such design decisions are elaborate decisions, while the traditional, straightforward way is to assume that 0, null and false are equal.
C# doesn't make a conversion of the parameter, as C++ does. You need to explicitly convert the value in a boolean, if you want the if statement to accept the value.
It's simply the type system of c# compared to languages like PHP, Perl, etc.
A condition only accepts Boolean values, null does not have the type Boolean so it doesn't work there.
As for the NULL example in C/C++ you mentioned in another comment, it has to be said that C originally had no boolean type (C++ does have bool, but it converts implicitly to and from int, which is another matter), and neither language has null references, only NULL (=> 0) pointers.
Of course the compiler designers could implement an automatic conversion for any nullable type to boolean but that would cause other problems, i.e.:
Assuming that foo is not null:
if (foo)
{
// do stuff
}
Which state of foo is true?
Always if it's not null?
But what if you want your type to be convertible to boolean (i.e. from your tri-state or quantum-logic class)?
That would mean you would have two different conversions to bool, the implicit and the explicit, which would both behave differently.
I don't even dare to imagine what should happen if you do
if (!!foo) // common pattern in C to normalize a value used as boolean,
// in this case might be abused to create a boolean from an object
{
}
I think the forced (foo == null) is good since it also adds clarity to your code, it's easier to understand what you really check for.
hey guys, I've removed some of the complexities of my needs to the core of what I need to know.
I want to send a collection of Values to a method, and inside that method I want to test the Value against, say, a property of an Entity. The property will always be of the same Type as the Value.
I also want to test if the value is null, or the default value, obviously depending on whether the value type is a reference type, or a value type.
Now, if all the values sent to the method are of the same type, then I could do this using generics, quite easily, like this:
public static void testGenerics<TValueType>(List<TValueType> Values) {
    //test null/default
    foreach (TValueType v in Values) {
        if (EqualityComparer<TValueType>.Default.Equals(v, default(TValueType))) {
            //value is null or default for its type
        } else {
            //compare against another value of the same Type
            if (EqualityComparer<TValueType>.Default.Equals(v, SomeOtherValueOfTValueType)) {
                //value equals
            } else {
                //value doesn't equal
            }
        }
    }
}
My question is, how would I carry out the same function if my Collection contained values of different Types?
My main concerns are successfully identifying null or default values, and successfully identifying if each value passed in, equals some other value of the same type.
Can I achieve this by simply passing the type object? I also can't really use the EqualityComparers as I can't use generics, because I'm passing in an unknown number of different Types.
is there a solution?
thanks
UPDATE
ok, searching around, could I use the following code to test for null/default successfully in my scenario (taken from this SO answer):
object defaultValue = type.IsValueType ? Activator.CreateInstance(type) : null;
I reckon this might work.
Now, how can I successfully compare two values of the same Type, without knowing their types successfully and reliably?
There is the static Object.Equals(object left, object right) method; internally it relies on the Equals(object) implementation of one of the provided arguments. Why do you avoid using it?
The rules for implementing equality members are roughly as follows:
Required: Override Equals(object) and GetHashCode() methods
Optional: Implement IEquatable<T> for your type (this is what EqualityComparer<T>.Default relies on)
Optional: Implement == and != operators
So as you see, relying on object.Equals(object left, object right) is the best solution, because it depends only on the strictly required part of the equality implementation pattern.
Moreover, it will be the fastest option, since it relies only on virtual methods. Any alternative will involve some reflection anyway.
// Requires: using System; using System.Collections;
public static void TestGenerics(IList values) {
    foreach (object v in values) {
        if (ReferenceEquals(null, v)) {
            // v is null reference
        }
        else {
            var type = v.GetType();
            if (type.IsValueType && Equals(v, Activator.CreateInstance(type))) {
                // v is default value of its value type
            }
            else {
                // v is a non-null reference or a non-default value of a value type
            }
        }
    }
}
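For the second half of the question (comparing two values without knowing their type), a rough sketch along the same lines, with made-up helper names, could be:

```csharp
using System;

public static class UntypedComparison
{
    // True when v is a null reference or the default value of its value type.
    public static bool IsNullOrDefault(object v)
    {
        if (ReferenceEquals(v, null)) return true;
        Type type = v.GetType();
        return type.IsValueType && object.Equals(v, Activator.CreateInstance(type));
    }

    // object.Equals is null-safe and calls the virtual Equals(object) of the left
    // argument, so boxed values compare by value rather than by reference.
    public static bool AreEqual(object left, object right)
    {
        return object.Equals(left, right);
    }
}
```

For example, AreEqual(1, 1) is true even though the two boxed ints are different objects, and IsNullOrDefault(0) is true while IsNullOrDefault("") is false.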
The short answer is "yes", but the longer answer is that it's possible but will take a non-trivial amount of effort on your part and some assumptions in order to make it work. Your issue really comes when you have values that would be considered "equal" when compared in strongly-typed code, but do not have reference equality. Your biggest offenders will be value types, as a boxed int with a value of 1 won't have referential equality to another boxed int of the same value.
Given that, you have to go down the road of using things like the IComparable interface. If your types will always specifically match, then this is likely sufficient. If either of your values implements IComparable then you can cast to that interface and compare to the other instance to determine equality (==0). If neither implements it then you'll likely have to rely on referential equality. For reference types this will work unless there is custom comparison logic (an overloaded == operator on the type, for example).
Just bear in mind that the types would have to match EXACTLY. In other words, an int and a short won't necessarily compare like this, nor would an int and a double.
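To illustrate the exact-match caveat, a small sketch (class name invented for the example):

```csharp
using System;

public static class ExactTypeDemo
{
    // A boxed int and a boxed short with the same numeric value are not Equal.
    public static bool IntEqualsShort()
    {
        object i = 1;
        object s = (short)1;
        return object.Equals(i, s);
    }

    // IComparable.CompareTo on an int rejects a boxed short outright.
    public static bool CompareToShortThrows()
    {
        IComparable i = 1;
        object s = (short)1;
        try { i.CompareTo(s); return false; }
        catch (ArgumentException) { return true; }
    }

    // With matching types, CompareTo returning 0 signals equality.
    public static int CompareSameType()
    {
        IComparable a = 1;
        object b = 1;
        return a.CompareTo(b);
    }
}
```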
You could also go down the path of using reflection to dynamically invoke the Default property on the generic type determined at runtime by the supplied Type variable, but I wouldn't want to do that if I didn't have to for performance and compile-time safety (or lack thereof) reasons.
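For what it's worth, the reflection route might look roughly like the following. This is only a sketch: RuntimeComparer and AreEqual are names I made up, and it leans on EqualityComparer<T> also implementing the non-generic IEqualityComparer interface.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class RuntimeComparer
{
    // Looks up EqualityComparer<T>.Default for a Type known only at runtime.
    public static bool AreEqual(Type type, object left, object right)
    {
        Type comparerType = typeof(EqualityComparer<>).MakeGenericType(type);
        object comparer = comparerType.GetProperty("Default").GetValue(null);
        // The non-generic interface lets us call Equals without knowing T.
        return ((IEqualityComparer)comparer).Equals(left, right);
    }
}
```

Each call pays for the reflection lookup, so in real code you would probably want to cache the comparer per Type.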
Is the list of types you need to test a pre-determined list? If so, you can use the Visitor Pattern (and maybe even if not since we have Generics). Create a method on your Entities (can be done using partial classes) that takes in an interface. Your class then calls a method on that interface passing itself. The interface method can be generic, or you can create an overload for each type you want to test.
Battery about to die otherwise would give example.
Fifteen seconds after hitting "Save" the machine went into hibernate.
After thinking about it, the Visitor pattern might not solve your specific problem. I thought you were trying to compare entities, but it appears you are testing values (so potentially ints and strings).
But for the sake of completion, and because the visitor pattern is kind of cool once you realize what it does, here's an explanation.
The Visitor pattern allows you to handle multiple types without needing to figure out how to cast to the specific type (you decouple the type from the item using that type). It works by having two interfaces - the visitor and the acceptor:
interface IAcceptor
{
    void Accept(IVisitor visitor);
}
interface IVisitor
{
    void Visit(Type1 type1);
    void Visit(Type2 type2);
    // .. etc ..
}
You can optionally use a generic method there:
interface IVisitor
{
    void Visit<T>(T instance);
}
The basic implementation of the accept method is:
void Accept(IVisitor visitor)
{
    visitor.Visit(this);
}
Because the type implementing Accept() knows what type it is, the correct overload (or generic type) is used. You could achieve the same thing with reflection and a lookup table (or select statement), but this is much cleaner. Also, you don't have to duplicate the lookup among different implementations -- various classes can implement IVisitor to create type-specific functionality.
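Putting the pieces together, a compact end-to-end sketch. Customer, Order and DescribingVisitor are hypothetical types invented here to stand in for Type1, Type2 and a concrete visitor:

```csharp
using System;

public interface IVisitor
{
    void Visit(Customer customer);
    void Visit(Order order);
}

public interface IAcceptor
{
    void Accept(IVisitor visitor);
}

// Hypothetical entity types for illustration only.
public sealed class Customer : IAcceptor
{
    public string Name = "Ann";
    // "this" is statically a Customer here, so Visit(Customer) is chosen.
    public void Accept(IVisitor visitor) { visitor.Visit(this); }
}

public sealed class Order : IAcceptor
{
    public int Id = 7;
    public void Accept(IVisitor visitor) { visitor.Visit(this); }
}

// One visitor implementation; others can add different type-specific behaviour.
public sealed class DescribingVisitor : IVisitor
{
    public string Last = "";
    public void Visit(Customer customer) { Last = "customer " + customer.Name; }
    public void Visit(Order order) { Last = "order " + order.Id; }
}
```

The caller only needs an IAcceptor: after acceptor.Accept(visitor), the visitor has run the overload matching the concrete type, with no casts or type switches.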
The Visitor pattern is one way of doing "Double Dispatch". The answer to this question is another way and you might be able to morph it into something that works for your specific case.
Basically, a long-winded non-answer to your problem, sorry. :) The problem intrigues me, though -- like how do you know what property on the entity you should test against?
Why use one over the other?
== is the identity test. It will return true if the two objects being tested are in fact the same object. Equals() performs an equality test, and will return true if the two objects consider themselves equal.
Identity testing is faster, so you can use it when there's no need for more expensive equality tests. For example, comparing against null or the empty string.
It's possible to overload either of these to provide different behavior (like identity testing for Equals()), but for the sake of anybody reading your code, please don't.
Pointed out below: some types like String or DateTime provide overloads for the == operator that give it equality semantics. So the exact behavior will depend on the types of the objects you are comparing.
See also:
http://blogs.msdn.com/csharpfaq/archive/2004/03/29/102224.aspx
#John Millikin:
Pointed out below: some value types like DateTime provide overloads for the == operator that give it equality semantics. So the exact behavior will depend on the types of the objects you are comparing.
To elaborate:
DateTime is implemented as a struct. All structs are children of System.ValueType.
Since a struct variable holds its value directly rather than a reference to an object on the heap, there is no reference to compare, so structs can only be compared by value.
System.ValueType overrides .Equals() with a value-based check that can fall back to reflection to compare each field's value. It does not define ==; a struct such as DateTime supports == only because it overloads the operator itself.
Because reflection is somewhat slow, if you implement your own struct, it is important to override .Equals() and add your own value checking code, as this will be much faster. Don't just call base.Equals();
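As a rough sketch of that advice, a struct with its own value check might look like this. The Point type and its fields are hypothetical, chosen only to keep the example short, and the hash formula is just one simple way to combine two ints.

```csharp
using System;

// Hypothetical struct, used only for illustration.
struct Point : IEquatable<Point>
{
    public readonly int X;
    public readonly int Y;

    public Point(int x, int y) { X = x; Y = y; }

    // Strongly typed check: no boxing and no reflection.
    public bool Equals(Point other) { return X == other.X && Y == other.Y; }

    // Replaces the ValueType implementation instead of calling base.Equals().
    public override bool Equals(object obj)
    {
        return obj is Point && Equals((Point)obj);
    }

    public override int GetHashCode() { return (X * 397) ^ Y; }

    // Structs get no == operator by default, so define it if callers need it.
    public static bool operator ==(Point left, Point right) { return left.Equals(right); }
    public static bool operator !=(Point left, Point right) { return !left.Equals(right); }
}

static class PointDemo
{
    static void Main()
    {
        Console.WriteLine(new Point(1, 2) == new Point(1, 2));     // True
        Console.WriteLine(new Point(1, 2) != new Point(2, 1));     // True
        Console.WriteLine(new Point(1, 2).Equals((object)"text")); // False
    }
}
```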
Everyone else pretty much has you covered, but I have one more word of advice. Every now and again, you will get someone who swears on his life (and those of his loved ones) that .Equals is more efficient/better/best-practice or some other dogmatic line. I can't speak to efficiency (well, OK, in certain circumstances I can), but I can speak to a big issue which will crop up: .Equals requires an object to exist. (Sounds stupid, but it throws people off.)
You can't do the following:
StringBuilder sb = null;
if (sb.Equals(null))
{
// whatever
}
It seems obvious to me, and perhaps most people, that you will get a NullReferenceException. However, proponents of .Equals forget about that little factoid. Some are even "thrown" off (sorry, couldn't resist) when they see the NullRefs start to pop up.
(And years before the DailyWTF posting, I did actually work with someone who mandated that all equality checks be .Equals instead of ==. Even proving his inaccuracy didn't help. We just made damn sure to break all his other rules so that no reference returned from a method nor property was ever null, and it worked out in the end.)
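If you are stuck with a rule like that, the static object.Equals(a, b) method is one way to stay null-safe, since it checks both arguments for null before calling the instance method. A small sketch follows; the NullSafeDemo class and its helper are made up for the example.

```csharp
using System;

static class NullSafeDemo
{
    // Returns true if calling the instance Equals on "value" throws.
    public static bool InstanceEqualsThrows(object value)
    {
        try
        {
            value.Equals(null); // instance call: needs a real object
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    static void Main()
    {
        object nothing = null;

        Console.WriteLine(nothing == null);                // True
        Console.WriteLine(object.Equals(nothing, null));   // True: static call, null-safe
        Console.WriteLine(object.Equals(nothing, "x"));    // False
        Console.WriteLine(InstanceEqualsThrows(nothing));  // True
        Console.WriteLine(InstanceEqualsThrows("x"));      // False
    }
}
```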
== is generally the "identity" equals meaning "object a is in fact the exact same object in memory as object b".
equals() means that the objects are logically equal (say, from a business point of view). So if you are comparing instances of a user-defined class, you would generally need to define and use equals() if you want things like a Hashtable to work properly.
If you had the proverbial Person class with properties "Name" and "Address" and you wanted to use this Person as a key into a Hashtable containing more information about them, you would need to implement equals() (and hash) so that you could create an instance of a Person and use it as a key into the Hashtable to get the information.
Using == alone, your new instance would not be the same.
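Here is a minimal C# sketch of that Person scenario. It uses Dictionary<Person, string> rather than Hashtable purely for type safety, and the field values and the XOR-based hash are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Person class, used only for illustration.
sealed class Person
{
    public readonly string Name;
    public readonly string Address;

    public Person(string name, string address) { Name = name; Address = address; }

    // Two Person objects are logically equal when both properties match.
    public override bool Equals(object obj)
    {
        Person other = obj as Person;
        return other != null && Name == other.Name && Address == other.Address;
    }

    // Equal objects must produce equal hash codes for hashed lookups to work.
    public override int GetHashCode()
    {
        return (Name ?? "").GetHashCode() ^ (Address ?? "").GetHashCode();
    }
}

static class PersonDemo
{
    public static string Lookup()
    {
        var info = new Dictionary<Person, string>();
        info[new Person("Ann", "1 Main St")] = "likes tea";

        // A brand new instance with the same values finds the entry.
        return info[new Person("Ann", "1 Main St")];
    }

    static void Main()
    {
        var a = new Person("Ann", "1 Main St");
        var b = new Person("Ann", "1 Main St");

        Console.WriteLine(a == b);      // False: Person does not overload ==
        Console.WriteLine(a.Equals(b)); // True
        Console.WriteLine(Lookup());    // likes tea
    }
}
```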
According to MSDN:
In C#, there are two different kinds of equality: reference equality (also known as identity) and value equality. Value equality is the generally understood meaning of equality: it means that two objects contain the same values. For example, two integers with the value of 2 have value equality. Reference equality means that there are not two objects to compare. Instead, there are two object references and both of them refer to the same object.
...
By default, the operator == tests for reference equality by determining whether two references indicate the same object.
Both Equals and == can be overloaded, so the exact results of calling one or the other will vary. Note that == is determined at compile time, so while the actual implementation could change, which == is used is fixed at compile time, unlike Equals which could use a different implementation based on the run time type of the left side.
For instance string performs an equality test for ==.
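A short sketch of the compile-time point: the same two strings compare differently with == depending on the static type of the variables, while Equals dispatches on the run-time type. The class and method names here are invented for the example.

```csharp
using System;

static class OperatorBindingDemo
{
    // Built at run time so the two strings are distinct objects.
    static readonly string s1 = new string('a', 3);
    static readonly string s2 = new string('a', 3);

    // Static type is string: the compiler picks string's overloaded ==.
    public static bool CompareAsStrings() { return s1 == s2; }

    // Static type is object: the compiler picks the reference comparison.
    public static bool CompareAsObjects()
    {
        object o1 = s1;
        object o2 = s2;
        return o1 == o2;
    }

    // Equals is virtual, so String.Equals runs even through object references.
    public static bool CompareWithEquals()
    {
        object o1 = s1;
        object o2 = s2;
        return o1.Equals(o2);
    }

    static void Main()
    {
        Console.WriteLine(CompareAsStrings());  // True
        Console.WriteLine(CompareAsObjects());  // False
        Console.WriteLine(CompareWithEquals()); // True
    }
}
```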
Also note that the semantics of both can be complex.
Best practice is to implement equality like this example. Note that you can simplify or exclude all of this depending on how you plan on using your class, and that structs get most of this already.
class ClassName
{
public bool Equals(ClassName other)
{
//ReferenceEquals avoids calling the overloaded == below, which would recurse forever
if (ReferenceEquals(other, null))
{
return false;
}
else
{
//Do your equality test here.
throw new System.NotImplementedException();
}
}
public override bool Equals(object obj)
{
ClassName other = obj as ClassName; //Null and non-ClassName objects will both become null
if (ReferenceEquals(other, null))
{
return false;
}
else
{
return Equals(other);
}
}
//Operator overloads must be declared static
public static bool operator ==(ClassName left, ClassName right)
{
if (ReferenceEquals(left, null))
{
return ReferenceEquals(right, null);
}
else
{
return left.Equals(right);
}
}
public static bool operator !=(ClassName left, ClassName right)
{
return !(left == right);
}
public override int GetHashCode()
{
//Return something useful here, typically all members shifted or XORed together works
throw new System.NotImplementedException();
}
}
Another thing to take into consideration: the == operator may not be callable or may have different meaning if you access the object from another language. Usually, it's better to have an alternative that can be called by name.
The example is because the DateTime struct implements the IEquatable<T> interface, which defines a "type-specific method for determining equality of instances." according to MSDN.
Use Equals if you want to express that the contents of the objects compared should be equal. Use == for primitive values or if you want to check that the objects being compared are one and the same object. For objects, == checks whether the address pointer of the objects is the same.
I have seen Object.ReferenceEquals() used in cases where one wants to know if two references refer to the same object.
In most cases, they are the same, so you should use == for clarity. According to the Microsoft Framework Design Guidelines:
"DO ensure that Object.Equals and the equality operators have exactly the same semantics and similar performance characteristics."
https://learn.microsoft.com/en-us/dotnet/standard/design-guidelines/equality-operators
But sometimes, someone will override Object.Equals without providing equality operators. In that case, you should use Equals to test for value equality, and Object.ReferenceEquals to test for reference equality.
If you decompile Object (with dotPeek, for example), you will see that
public virtual bool Equals(Object obj)
is described as:
// Returns a boolean indicating if the passed in object obj is
// Equal to this. Equality is defined as object equality for reference
// types and bitwise equality for value types using a loader trick to
// replace Equals with EqualsValue for value types).
//
So, it depends on the type.
For example:
Object o1 = "vvv";
Object o2 = "vvv";
bool b = o1.Equals(o2);
o1 = 555;
o2 = 555;
b = o1.Equals(o2);
o1 = new List<int> { 1, 2, 3 };
o2 = new List<int> { 1, 2, 3 };
b = o1.Equals(o2);
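The first time b is true (String is a reference type, but it overrides Equals to compare contents), the second time b is true (the boxed ints are compared by value), and the third time b is false (List<int> does not override Equals, so the default reference comparison is used).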