Can XPathNavigator.Evaluate(string) return null? - c#

Can this overload of XPathNavigator.Evaluate return null ?
// Can "result" be null ?
object result = xmlDoc.CreateNavigator().Evaluate(xpathString);
If the answer is No, then why Resharper says that result maybe null ?
string str = result.ToString(); // Resharper: Possible NullReferenceException
I found nothing in the documentation about an input that might cause it to return null. I also tried inspecting the Reference Source for this function, but it was unfruitful.
I know that R# uses code annotations, but I still don't trust this warning as I tried different inputs with none of them returns null.

Looking at the code, it does look like it would be highly unlikely to get a null from XPathNavigator.Evaluate. There are a couple of possible code paths that might get you a null, but I suspect they're pathological edge cases (if evaluating a function that should be a number function, but isn't, or if the operand to a query is already null). I doubt these would happen under normal circumstances.
I don't know why ReSharper has the [CanBeNull] annotation on the return value. If I had to guess, I'd say it's because the method is virtual, and therefore there's no way to guarantee that the implementation will always return a value. Or because it calls an abstract method on another class that doesn't have any null-ness guarantees, and there's no check on the return of that value, so again, there's no guarantee that it won't be null.
The annotations are based on static control flow analysis, and that can only get you so far. ReSharper will provide the strongest hints that it can. If it knows it's not null, it will annotate it so, if it doesn't know, it will flag it [CanBeNull], and err on the side of caution.

Related

When Implementing IEqualityComparer Should GetHashCode check for null?

When implementing IEqualityComparer<Product> (Product is a class), ReSharper complains that the null check below is always false:
public int GetHashCode(Product product)
{
// Check whether the object is null.
if (Object.ReferenceEquals(product, null))
return 0;
// ... other stuff ...
}
(Code example from MSDN VS.9 documentation of Enumerable.Except)
ReSharper may be wrong, but when searching for an answer, I came across the official documentation for IEqualityComparer<T> which has an example where null is not checked for:
public int GetHashCode(Box bx)
{
int hCode = bx.Height ^ bx.Length ^ bx.Width;
return hCode.GetHashCode();
}
Additionally, the documentation for GetHashCode() states that ArgumentNullException will be thrown when "The type of obj is a reference type and obj is null."
So, when implementing IEqualityComparer should GetHashCode check for null, and if so, what should it do with null (throw an exception or return a value)?
I'm interested most in .NET framework official documentation that specifies one way or another if null should be checked.
ReSharper is wrong.
Obviously code you write can call that particular GetHashCode method and pass in a null value. All known methods might ensure this will never happen, but obviously ReSharper can only take existing code (patterns) into account.
So in this case, check for null and do the "right thing".
Corollary: If the method in question was private, then ReSharper might analyze (though I'm not sure it does) the public code and verify that there is indeed no way that this particular private method will be called with a null reference, but since it is a public method, and one available through an interface, then
ReSharper is wrong.
The documentation says that null values should never be hashable, and that attempting to do so should always result in an exception.
Of course, you're free to do whatever you want. If you want to create a hash based structure for which null keys are valid, you're free to do so, in this case you should simply ignore this warning.
ReSharper has some special case code here. It will not warn about the ReferenceEquals in this:
if (ReferenceEquals(obj, null)) { throw new ArgumentNullException("obj"); }
It will warn about the ReferenceEquals in this:
if (ReferenceEquals(obj, null)) { return 0; }
Throwing an ArgumentNullException exception is consistent with the contract specified in IEqualityComparer(Of T).GetHashCode
If you go to the definition of IEqualityComparer (F12) you'll also find further documentation:
// Exceptions:
// System.ArgumentNullException:
// The type of obj is a reference type and obj is null.
int GetHashCode(T obj);
So ReSharper is right that there is something wrong, but the error displayed doesn't match the change you should make to the code.
There is some nuance to this question.
The docs state that IEqualityComparer<T>.GetHashCode(T) throws on null input; however EqualityComparer<>.Default - which is almost certainly by far the most used implementation - does not throw.
Clearly, an implementation does not need to throw on null it merely has the option too.
However, I'd argue that no implementation should ever throw on null here, it's just confusing, and a possible source of bugs. Exceptions are a pain in any case, being a non-local control flow mechanism, and that alone argues for using them when necessary only (i.e.: not here). But additionally, for IEqualityComparer specifically, the docs state that whenever Equals(x, y) then GetHashCode(x) should equal GetHashCode(y) - and Equals does allow nulls, and is not documented as throwing any exceptions.
The invariant that equality implies hashcode equality makes implementing things relying on those hashcodes much simpler. Having a gotcha with the null value is a design cost you should avoid paying without need. And here there is no need, ever.
In short:
do not throw from GetHashCode, even though it is allowed
and do check for nulls; Resharper's warning is incorrect.
Doing this results in simpler code with fewer gotchas, and it follows the behavior of EqualityComparer<>.Default which is the most common implementation used.

Changing the name just because of Null?

I work on developing an external API. I added a method to my public interface :
public void AddMode(TypeA mode);
public void AddMode(TypeB mode); // the new method, TypeB and TypeA are not related at all
It looked good, until one test broke that was passing a null . That made the compiler confused with ambiguous call. I fixed the test with casting the null.
However my question is :
Should I change the name just because of this?
Or should let the client do the cast as I did? (if they pass null for whatever reason)
What is the best in this case while designing APIs ?
Edit :
the call was like this AddMode(null) , not like :
TypeA vl = null;
AddMode(v1); // this doesn't cause a problem
An API should be designed so that it's easy to use correctly and hard to use incorrectly.
Your API is easy to use correctly:
AddMode(new TypeA());
does compile.
It's harder to use incorrectly:
AddMode(null);
does not compile. The user ist forced to do something like
AddMode((TypeA)null);
Which should make him think, whether this is expected usage. So I think your API is OK as it is.
I think that depends on how exceptional null as a value for the respective argument is.
Compare, for example, this ArgumentNullException constructor: It is most frequently called when an internal exception has to be set. Otherwise, this constructor, which excepts the name of the illegal argument, is passed. On odd occasions, the former has to be invoked because a custom message has to be supplied, but no internal exception is supplied (I usually do this when I'm throwing the exception for an array/collection argument that contains null, but is not null itself). So, in this case, I need the explicit cast, and I'd say it is acceptable there.
If your methods really do the same, but null is still a usual value, you might want to add a parameterless overload for the null variant (i.e. the explicit cast is still possible, but users can also call the parameterless overload instead).
If your methods do something somewhat different, and yet something else for null, you can think about disallowing null altogether for the methods you've shown and adding a parameterless overload for the null case.
Update: If null is not acceptable anyway (and will result in an exception), then you should leave the method as it is. Other than for testing purposes, there should never be any whatsoever situation in which a literal null would be passed to the method, as this will invariably yield an exception. Therefore, do not change your overload names in this case.
Is null valid input to this method anyway?
Personally I'd leave it as is as long as both overloads of AddMode related, since you'd expect AddMode(X) and AddMode(Y) to be doing something related to each other.
If they are not related in any way, then maybe a method name change is in order
Well, it depends either null value is acceptable value in your API.
If not just do not accept it, by not supporting it. So even if the consumer will try to use it with null the compiler will break on ambiguity problem.
If your API accepts a null as a possible parameter value, then you have to specify it in the documentation and mention it is necessary to cast it, and write some code samples to show how.
However, if you don not want the user to use null values, you might change TypeA and TypeB as a struct instead of a class, if your class design allow it.

Why does Convert.ToString(null) return a different value if you cast null?

Convert.ToString(null)
returns
null
As I expected.
But
Convert.ToString(null as object)
returns
""
Why are these different?
There are 2 overloads of ToString that come into play here
Convert.ToString(object o);
Convert.ToString(string s);
The C# compiler essentially tries to pick the most specific overload which will work with the input. A null value is convertible to any reference type. In this case string is more specific than object and hence it will be picked as the winner.
In the null as object you've solidified the type of the expression as object. This means it's no longer compatible with the string overload and the compiler picks the object overload as it's the only compatible one remaining.
The really hairy details of how this tie breaking works is covered in section 7.4.3 of the C# language spec.
Following on from JaredPar's excellent overload resolution answer - the question remains "why does Convert.ToString(string) return null, but Convert.ToString(object) return string.Empty"?
And the answer to that is...because the docs say so:
Convert.ToString(string) returns "the specified string instance; no actual conversion is performed."
Convert.ToString(object) returns "the string representation of value, or String.Empty if value is null."
EDIT:
As to whether this is a "bug in the spec", "very bad API design", "why was it specified like this", etc. - I'll take a shot at some rationale for why I don't see it as big deal.
System.Convert has methods for converting every base type to itself. This is strange - since no conversion is needed or possible, so the methods end up just returning the parameter. Convert.ToString(string) behaves the same. I presume these are here for code generation scenarios.
Convert.ToString(object) has 3 choices when passed null. Throw, return null, or return string.Empty. Throwing would be bad - doubly so with the assumption these are used for generated code. Returning null requires your caller do a null check - again, not a great choice in generated code. Returning string.Empty seems a reasonable choice. The rest of System.Convert deals with value types - which have a default value.
It's debatable whether returning null is more "correct", but string.Empty is definitely more usable. Changing Convert.ToString(string) means breaking the "no actual conversion" rule. Since System.Convert is a static utility class, each method can be logically treated as its own. There's very few real world scenarios where this behavior should be "surprising", so let usability win over (possible) correctness.

How does ReSharper know "Expression is always true"?

Check out the following code:
private void Foo(object bar)
{
Type type = bar.GetType();
if (type != null) // Expression is always true
{
}
}
ReSharper claims type will never be null. That's obvious to me because there's always going to be a type for bar, but how does ReSharper know that? How can it know that the result of a method will never be null?
Type is not a struct so it can't be that. And if the method were written by me, then the return value could certainly be null (not necessarily GetType, but something else).
Is ReSharper clever enough to know that, for only that particular method, the result will never be null? (Like there's a hard-coded list of known .NET methods which will never return null.)
JetBrains perfectly explains how ReSharper does this in their features list.
Summary from link (this particular question is about NotNullAttribute):
We have analyzed a great share of .NET Framework Class Library, as well as NUnit Framework, and annotated it through external XML files, using a set of custom attributes from the JetBrains.Annotations namespace, specifically:
StringFormatMethodAttribute (for methods that take format strings as parameters)
InvokerParameterNameAttribute (for methods with string literal arguments that should match one of caller parameters)
AssertionMethodAttribute (for assertion methods)
AssertionConditionAttribute (for condition parameters of assertion methods)
TerminatesProgramAttribute (for methods that terminate control flow)
CanBeNullAttribute (for values that can be null)
NotNullAttribute (for values that can not be null)
UsedImplicitlyAttribute (for entities that should not be marked as unused)
MeansImplicitUseAttribute (for extending semantics of any other attribute to mean that the corresponding entity should not be marked as unused)
Yes, it basically has knowledge of some well-known methods. You should find the same for string concatenation too, for example:
string x = null;
string y = null;
string z = x + y;
if (z == null)
{
// ReSharper should warn about this never executing
}
Now the same information is also becoming available via Code Contracts - I don't know whether JetBrains is hooking directly into this information, has its own database, or a mixture of the two.
GetType is not virtual. Your assumption is most likely correct in your last statement.
Edit: to answer your comment question - it can't infer with your methods out of the box.
object.GetType is not virtual, so you cannot yourself implement a version that returns a null value. Therefore, if bar is null, you will get a NullReferenceException and otherwise, type will never by null.

Why ever cast reference types when you can use "as"? [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicate:
Casting: (NewType) vs. Object as NewType
In C#, why ever cast reference types when you can use "as"?
Casting can generate exceptions whereas "as" will evaulate to null if the casting fails.
Wouldn't "as" be easier to use with reference types in all cases?
eg:
MyObject as DataGridView
rather than,
(DataGridView)MyObject
Consider the following alternatives:
Foo(someObj as SomeClass);
and:
Foo((SomeClass)someObj);
Due to someObj being of the wrong type, the first version passes null to Foo. Some time later, this results in a NullReferenceException being thrown. How much later? Depends on what Foo does. It might store the null in a field, and then minutes later it's accessed by some code that expects it to be non-null.
But with the second version, you find the problem immediately.
Why make it harder to fix bugs?
Update
The OP asked in a comment: isn't is easier to use as and then check for null in an if statement?
If the null is unexpected and is evidence of a bug in the caller, you could say:
SomeClass c = someObj as SomeClass;
if (c == null)
{
// hmm...
}
What do you do in that if-block? There are two general solutions. One is to throw an exception, so it is the caller's responsibility to deal with their mistake. In which case it is definitely simpler to write:
SomeClass c = (SomeClass)someObj;
It simply saves you writing the if/throw logic by hand.
There is another alternative though. If you have a "stock" implementation of SomeClass that you are happy to use where nothing better is available (maybe it has methods that do nothing, or return "empty" values, etc.) then you could do this:
SomeClass c = (someObj as SomeClass) ?? _stockImpl;
This will ensure that c is never null. But is that really better? What if the caller has a bug; don't you want to help find bugs? By swapping in a default object, you disguise the bug. That sounds like an attractive idea until you waste a week of your life trying to track down a bug.
(In a way this mimics the behaviour of Objective-C, in which any attempt to use a null reference will never throw; it just silently does nothing.)
operator 'as' work with reference types only.
Sometimes, you want the exception to be thrown. Sometimes, you want to try to convert and nulls are OK. As already stated, as will not work with value types.
One definite reason is that the object is, or could be (when writing a generic method, you may not know at coding-time) being cast to a value type, in which case as isn't allowed.
One more dubious reason is that you already know that the object is of the type in question. Just how dubious depends on how you already know that. In the following case:
if(obj is MyType)
DoMyTypeStuff((MyType)obj);
else
DoMoreGeneralStuff(obj);
It's hard to justify using as here, as the only thing it really does is add a redundant check (maybe it'll be optimised away, maybe it won't). At the other extreme, if you are half-way to a trance state with the amount of information you've got in you're brain's paged-in memory and on the basis of that you are pretty sure that the object must be of the type in question, maybe it's better to add in the check.
Another good reason is that the difference between being of the wrong type and being null gets hidden by as. If it's reasonable to be passing in a string to a given method, including a null string, but it's not reasonable to pass in an int, then val as string has just made the incorrect usage look like a completely different correct usage, and you've just made the bug harder to find and potentially more damaging.
Finally, maybe if you don't know the type of the object, the calling code should. If the calling code has called yours incorrectly, they should receive an exception. To either allow the InvalidCastException to pass back, or to catch it and throw an InvalidArgument exception or similar is a reasonable and clear means of doing so.
If, when you write the code to make the cast, you are sure that the cast should work, you should use (DataGridView)MyObject. This way, if the cast fails in the future, your assumption about the type of MyObject will cause an invalid cast exception at the point where you make the cast, instead of a null reference exception at some point later.
If you do want to handle the case where MyObject is not a DataGridView, then use as, and presumably check for it being null before doing anything with it.
tl;dr If your code assumes something, and that assumption is wrong at run-time, the code should throw an exception.
From MSDN (as (C# reference)):
the as operator only performs reference conversions and boxing conversions. The as operator cannot perform other conversions, such as user-defined conversions, which should instead be performed using cast expressions.
Taking into consideration all of the comments, we came across this just the other day and wondered why you would do a direct cast over using the keyword as. What if you want the cast to fail? This is sometimes the desirable effect you want from a cast if you're casting from a null object. You then push the exception up the call stack.
So, if you want something to fail, use a direct cast, if you're okay with it not failing, use the as keyword.
As is faster and doesn't throw exceptions. Therefore it is generally preferred. Reasons to use casts include:
Using as, you can only assign types that are lower in the inheritance tree to ones that are higher. For example:
object o = "abc" as object;
DataGridView d = "abc" as DataGridView // doesn't do anything
DataGridView could create a custom cast that does allow this. Casts are defined on the target type and therefore allow everything, as long as it's defined.
Another problem with as is that it doesn't always work. Consider this method:
IEnumerable<T> GetList<T>(T item)
{
(from ... select) as IEnumerable<T>
}
This code fails because T could also be a Value Type. You can't use as on those because they can never be null. This means you'll have to put a constraint on T, while it is actually unnecesary. If you don't know whether you're going to have a reference type or not, you can never use as.
Of course, you should always check for null when you use the as keyword. Don't assume no exceptions will be thrown just becase the keyword doesn't throw any. Don't put a Try {} Catch(NullReferenceException){} around it, that't unneccesary and bloat. Just assign the value to a variable and check for null before you use it. Never use it inline in a method call.

Categories

Resources