string.IsNullOrEmpty() vs string.NotNullOrEmpty() - c#

I'm curious if any developers use string.IsNullOrEmpty() more often with a negative than with a positive
e.g.
if (!string.IsNullOrEmpty())
This is how I use the method 99% of the time. What was the design decision for this?

Because "IsNullOrEmpty" is easier to understand than "NotNullOrEmpty". The latter could be interpreted as:
It's not null and it's not empty
It's not null or it is empty

Double negatives are usually discouraged in naming stuff. !string.NotNullOrEmpty(...) would make one.

For those logicians out there, !string.IsNullOrEmpty is not equivalent to string.IsNotNullOrEmpty. @Guffa has it correct. Using De Morgan's law, it would have to be string.IsNotNullAndNotEmpty to be equivalent.
¬(null ∨ empty) ⇔ ¬null ∧ ¬empty
¬(null ∨ empty) ≠ ¬null ∨ empty
The point here, I guess, is that the way it is currently is unambiguous, whereas making the opposite unambiguous would be cumbersome.
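To make the equivalence concrete, here is a small sketch that enumerates the three interesting inputs. The helper names NotNullAndNotEmpty and NotNullOrEmpty are hypothetical (neither exists in the framework); they only model the two readings discussed above.

```csharp
using System;

static class DeMorganDemo
{
    // Hypothetical helper: the De Morgan expansion, "not null AND not empty".
    public static bool NotNullAndNotEmpty(string s)
    {
        return s != null && s != "";
    }

    // Hypothetical helper: the misreading "(not null) OR empty".
    public static bool NotNullOrEmpty(string s)
    {
        return s != null || s == "";
    }

    static void Main()
    {
        foreach (string s in new string[] { null, "", "abc" })
        {
            Console.WriteLine("{0,-6} !IsNullOrEmpty={1} AndForm={2} OrForm={3}",
                s == null ? "null" : "\"" + s + "\"",
                !string.IsNullOrEmpty(s), NotNullAndNotEmpty(s), NotNullOrEmpty(s));
        }
    }
}
```

The "and" form agrees with !string.IsNullOrEmpty on every input, while the "or" reading disagrees on the empty string.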

C# naming conventions dictate that your expressions should be in the positive such as "Is..." and not "IsNot..."
EDIT: Typically, I use it when doing error checking and input validation at the beginning of a method and raise an exception if the parameter is null or empty.
if (string.IsNullOrEmpty(myParameter))
{
throw new ....
}

I always create an extension method for "HasContent()" which generally makes sense, follows the "positive" specifications, and saves on code bloat because I use it much more often than its counterpart:
public static bool HasContent(this string s) {
return !string.IsNullOrEmpty(s);
}

I prefer the extension method:
public static class StringExtensions
{
public static bool IsNullOrEmpty(this string value)
{
return string.IsNullOrEmpty(value);
}
}
I find it reads better to say:
if(myValue.IsNullOrEmpty())
or
if(!myValue.IsNullOrEmpty())

Perhaps because then the name would have to be the lengthy IsNotNullAndNotEmpty to be as specific.

Of course, as of .NET 4.0 you could always use string.IsNullOrWhiteSpace(string) instead of string.IsNullOrEmpty(string).
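The two checks are not interchangeable, so it may be worth seeing where they differ. A minimal sketch; the Describe helper is just an illustrative name:

```csharp
using System;

static class WhiteSpaceDemo
{
    // Illustrative helper: reports what each framework check says about s.
    public static string Describe(string s)
    {
        return string.Format("IsNullOrEmpty={0}, IsNullOrWhiteSpace={1}",
            string.IsNullOrEmpty(s), string.IsNullOrWhiteSpace(s));
    }

    static void Main()
    {
        Console.WriteLine(Describe(null));   // both True
        Console.WriteLine(Describe(""));     // both True
        Console.WriteLine(Describe("   "));  // IsNullOrEmpty=False, IsNullOrWhiteSpace=True
        Console.WriteLine(Describe("abc"));  // both False
    }
}
```

Only the whitespace-only string separates them, which is usually the case you care about when validating user input.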

That is the most common usage I have seen.

"NotNullOrEmpty" is ambiguous, it could mean "(not null) or empty" or it could mean "not (null or empty)". To make it unambiguous you'd have to use "NotNullAndNotEmpty", which is a mouthfull.
Also, the "IsNullOrEmpty" naming encourages use as a guard clause, which I think is useful. E.g.:
if (String.IsNullOrEmpty(someString))
{
// error handling
return;
}
// do stuff
which I think is generally cleaner than:
if (!String.IsNullOrEmpty(someString))
{
// do stuff
}
else
{
// error handling
return;
}

I would actually be inclined to offer a different answer from the "it's ambiguous" explanation provided by several others (though I agree with that answer as well):
Personally, I like to minimize nesting in my code, as (to me) the more curly braces code has, the harder it becomes to follow.
Therefore I'd much prefer this (for example):
public bool DoSomethingWithString(string s) {
if (string.IsNullOrEmpty(s))
return false;
// here's the important code, not nested
}
to this:
public bool DoSomethingWithString(string s) {
if (!string.IsNullOrEmpty(s)) {
// here's the important code, nested
} else {
return false;
}
}
This is a pretty specific scenario (where a null/empty string prompts an immediate exit) and clearly isn't the way a method using IsNullOrEmpty would always be structured; but I think it's actually pretty common.

Personally I prefer to cater for the non negated scenario first. It just makes sense to me to do the true part first and then the false. Comes down to personal style.

I've always thought it seemed the wrong way round as I use the negative much more often than the positive.
I would also like there to be an instance IsEmpty() or IsNotEmpty() for use when the variable is declared within the function. This could not be IsNullOrEmpty() or IsNotNullOrEmpty(), because calling an instance method on a null instance would throw a null reference exception.

I had the same question before I realized all I had to do to flip the question was to put the Not operator in front of the conditional. I think it cleaned up my code some.
// need to check if tBx_PTNum.Text is empty
/*
if (string.IsNullOrWhiteSpace(tBx_PTNum.Text))
{
// no pt number yet
}
else
{
ptNum = Convert.ToInt32(tBx_PTNum.Text);
}
*/
if(!string.IsNullOrEmpty(tBx_PTNum.Text))
{
ptNum = Convert.ToInt32(tBx_PTNum.Text);
}

Related

How does 'out' (parameter) work? [duplicate]

If we want to get a value from a method, we can use either return value, like this:
public int GetValue();
or:
public void GetValue(out int x);
I don't really understand the differences between them, and so don't know which is better. Can you explain this to me?
Thank you.
Return values are almost always the right choice when the method doesn't have anything else to return. (In fact, I can't think of any cases where I'd ever want a void method with an out parameter, if I had the choice. C# 7's Deconstruct methods for language-supported deconstruction act as a very, very rare exception to this rule.)
Aside from anything else, it stops the caller from having to declare the variable separately:
int foo;
GetValue(out foo);
vs
int foo = GetValue();
Out values also prevent method chaining like this:
Console.WriteLine(GetValue().ToString("g"));
(Indeed, that's one of the problems with property setters as well, and it's why the builder pattern uses methods which return the builder, e.g. myStringBuilder.Append(xxx).Append(yyy).)
Additionally, out parameters are slightly harder to use with reflection and usually make testing harder too. (More effort is usually put into making it easy to mock return values than out parameters). Basically there's nothing I can think of that they make easier...
Return values FTW.
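For reference, the Deconstruct exception mentioned above looks roughly like this in C# 7 or later. The Point2D class is a hypothetical example type, not something from the question:

```csharp
using System;

// Hypothetical example type with a C# 7 Deconstruct method.
class Point2D
{
    public int X { get; }
    public int Y { get; }

    public Point2D(int x, int y) { X = x; Y = y; }

    // A void method with out parameters: the shape the compiler looks for
    // when it sees deconstruction syntax.
    public void Deconstruct(out int x, out int y)
    {
        x = X;
        y = Y;
    }
}

static class DeconstructDemo
{
    static void Main()
    {
        var (x, y) = new Point2D(3, 4); // calls Deconstruct behind the scenes
        Console.WriteLine(x + y);       // prints 7
    }
}
```

The caller never writes the out arguments by hand here, which is why this case does not suffer from the usual awkwardness.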
EDIT: In terms of what's going on...
Basically when you pass in an argument for an "out" parameter, you have to pass in a variable. (Array elements are classified as variables too.) The method you call doesn't have a "new" variable on its stack for the parameter - it uses your variable for storage. Any changes in the variable are immediately visible. Here's an example showing the difference:
using System;
class Test
{
static int value;
static void ShowValue(string description)
{
Console.WriteLine(description + value);
}
static void Main()
{
Console.WriteLine("Return value test...");
value = 5;
value = ReturnValue();
ShowValue("Value after ReturnValue(): ");
value = 5;
Console.WriteLine("Out parameter test...");
OutParameter(out value);
ShowValue("Value after OutParameter(): ");
}
static int ReturnValue()
{
ShowValue("ReturnValue (pre): ");
int tmp = 10;
ShowValue("ReturnValue (post): ");
return tmp;
}
static void OutParameter(out int tmp)
{
ShowValue("OutParameter (pre): ");
tmp = 10;
ShowValue("OutParameter (post): ");
}
}
Results:
Return value test...
ReturnValue (pre): 5
ReturnValue (post): 5
Value after ReturnValue(): 10
Out parameter test...
OutParameter (pre): 5
OutParameter (post): 10
Value after OutParameter(): 10
The difference is at the "post" step - i.e. after the local variable or parameter has been changed. In the ReturnValue test, this makes no difference to the static value variable. In the OutParameter test, the value variable is changed by the line tmp = 10;
What's better, depends on your particular situation. One of the reasons out exists is to facilitate returning multiple values from one method call:
public int ReturnMultiple(int input, out int output1, out int output2)
{
output1 = input + 1;
output2 = input + 2;
return input;
}
So one is not by definition better than the other. But usually you'd want to use a simple return, unless you have the above situation for example.
EDIT:
This is a sample demonstrating one of the reasons that the keyword exists. The above is in no way to be considered a best practise.
You should generally prefer a return value over an out param. Out params are a necessary evil if you find yourself writing code that needs to do 2 things. A good example of this is the Try pattern (such as Int32.TryParse).
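As a quick illustration of the Try pattern, here is a sketch built on int.TryParse. The ParseOrDefault wrapper is a hypothetical helper added only for the example:

```csharp
using System;

static class TryPatternDemo
{
    // Hypothetical helper: TryParse does two things at once. The bool return
    // value reports success, and the out parameter carries the parsed number.
    public static int ParseOrDefault(string text, int fallback)
    {
        int parsed;
        return int.TryParse(text, out parsed) ? parsed : fallback;
    }

    static void Main()
    {
        Console.WriteLine(ParseOrDefault("42", -1));  // prints 42
        Console.WriteLine(ParseOrDefault("abc", -1)); // prints -1
    }
}
```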
Let's consider what the caller of your two methods would have to do. For the first example I can write this...
int foo = GetValue();
Notice that I can declare a variable and assign it via your method in one line. For the 2nd example it looks like this...
int foo;
GetValue(out foo);
I'm now forced to declare my variable up front and write my code over two lines.
update
A good place to look when asking these types of question is the .NET Framework Design Guidelines. If you have the book version then you can see the annotations by Anders Hejlsberg and others on this subject (page 184-185) but the online version is here...
http://msdn.microsoft.com/en-us/library/ms182131(VS.80).aspx
If you find yourself needing to return two things from an API then wrapping them up in a struct/class would be better than an out param.
There's one reason to use an out param which has not already been mentioned: the calling method is obliged to receive it. If your method produces a value which the caller should not discard, making it an out forces the caller to specifically accept it:
Method1(); // Return values can be discarded quite easily, even accidentally
int resultCode;
Method2(out resultCode); // Out params are a little harder to ignore
Of course the caller can still ignore the value in an out param, but you've called their attention to it.
This is a rare need; more often, you should use an exception for a genuine problem or return an object with state information for an "FYI", but there could be circumstances where this is important.
It's preference mainly
I prefer returns and if you have multiple returns you can wrap them in a Result DTO
public class Result{
public Person Person {get;set;}
public int Sum {get;set;}
}
You should almost always use a return value. 'out' parameters create a bit of friction to a lot of APIs, compositionality, etc.
The most noteworthy exception that springs to mind is when you want to return multiple values (.Net Framework doesn't have tuples until 4.0), such as with the TryParse pattern.
You can only have one return value whereas you can have multiple out parameters.
You only need to consider out parameters in those cases.
However, if you need to return more than one parameter from your method, you probably want to look at what you're returning from an OO approach and consider if you're better off returning an object or a struct with these parameters. Therefore you're back to a return value again.
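A minimal sketch of that idea, assuming a made-up DivisionResult struct (the type and method names are purely illustrative):

```csharp
using System;

// Hypothetical result type bundling two values into one return value.
struct DivisionResult
{
    public readonly int Quotient;
    public readonly int Remainder;

    public DivisionResult(int quotient, int remainder)
    {
        Quotient = quotient;
        Remainder = remainder;
    }
}

static class ResultObjectDemo
{
    // Returns both values at once instead of using two out parameters.
    public static DivisionResult Divide(int dividend, int divisor)
    {
        return new DivisionResult(dividend / divisor, dividend % divisor);
    }

    static void Main()
    {
        DivisionResult r = Divide(17, 5);
        Console.WriteLine(r.Quotient + " remainder " + r.Remainder); // prints "3 remainder 2"
    }
}
```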
I would prefer the following instead of either of those in this simple example.
public int Value
{
get;
private set;
}
But, they are all very much the same. Usually, one would only use 'out' if they need to pass multiple values back from the method. If you want to send a value in and out of the method, one would choose 'ref'. My method is best, if you are only returning a value, but if you want to pass a parameter and get a value back one would likely choose your first choice.
I think one of the few scenarios where it would be useful would be when working with unmanaged memory, and you want to make it obvious that the "returned" value should be disposed of manually, rather than expecting it to be disposed of on its own.
Additionally, return values are compatible with asynchronous design paradigms.
You cannot designate a function "async" if it uses ref or out parameters.
In summary, Return Values allow method chaining, cleaner syntax (by eliminating the necessity for the caller to declare additional variables), and allow for asynchronous designs without the need for substantial modification in the future.
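To illustrate the async point, here is a rough sketch. The method name and the choice of int? as the result type are assumptions made for the example; a commented-out line shows the shape the compiler rejects.

```csharp
using System;
using System.Threading.Tasks;

static class AsyncDemo
{
    // Not allowed: async methods cannot declare ref or out parameters.
    // static async Task<bool> TryGetValueAsync(out int value) { ... }

    // Hypothetical alternative: fold "success" and "value" into the return value.
    public static async Task<int?> TryGetValueAsync(bool succeed)
    {
        await Task.Delay(1); // stand-in for real asynchronous work
        return succeed ? 42 : (int?)null;
    }

    static void Main()
    {
        int? value = TryGetValueAsync(true).GetAwaiter().GetResult();
        Console.WriteLine(value.HasValue ? value.Value.ToString() : "no value"); // prints 42
    }
}
```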
As others have said: return value, not out param.
May I recommend to you the book "Framework Design Guidelines" (2nd ed)? Pages 184-185 cover the reasons for avoiding out params. The whole book will steer you in the right direction on all sorts of .NET coding issues.
Allied with Framework Design Guidelines is the use of the static analysis tool, FxCop. You'll find this on Microsoft's sites as a free download. Run this on your compiled code and see what it says. If it complains about hundreds and hundreds of things... don't panic! Look calmly and carefully at what it says about each and every case. Don't rush to fix things ASAP. Learn from what it is telling you. You will be put on the road to mastery.
Using the out keyword with a return type of bool, can sometimes reduce code bloat and increase readability. (Primarily when the extra info in the out param is often ignored.) For instance:
var result = DoThing();
if (result.Success)
{
result = DoOtherThing();
if (result.Success)
{
result = DoFinalThing();
if (result.Success)
{
success = true;
}
}
}
vs:
Result result; // hypothetical type name; 'var' cannot be used without an initializer
if (DoThing(out result))
{
if (DoOtherThing(out result))
{
if (DoFinalThing(out result))
{
success = true;
}
}
}
There is no real difference. Out parameters exist in C# to allow a method to return more than one value, that's all.
However, there are some slight differences, but none of them are really important:
Using an out parameter will force you to use two lines, like:
int n;
GetValue(out n);
while using return value will let you do it in one line:
int n = GetValue();
Another difference (correct only for value types and only if C# doesn't inline the function) is that using a return value will necessarily make a copy of the value when the function returns, while using an out parameter will not necessarily do so.
Please avoid using out parameters.
Although, they can make sense in certain situations (for example when implementing the Try-Parse Pattern), they are very hard to grasp.
The chance of introducing bugs or side effects, whether by yourself (unless you are very experienced with the concept) or by other developers (who either use your API or may inherit your code), is very high.
According to Microsoft's quality rule CA1021:
Although return values are commonplace and heavily used, the correct application of out and ref parameters requires intermediate design and coding skills. Library architects who design for a general audience should not expect users to master working with out or ref parameters.
Therefore, if there is not a very good reason, please just don't use out or ref.
See also:
Is using "out" bad practice
https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca1021
Both of them have a different purpose and are not treated the same by the compiler. If your method needs to return a value, then you must use return. Out is used where your method needs to return multiple values.
If you use return, the data is first written to the method's stack and then to the calling method's. With out, it is written directly to the calling method's stack. I'm not sure if there are any more differences.
out is more useful when you are trying to return an object that you declare in the method.
Example
public BookList Find(string key)
{
BookList book; //BookList is a model class
_books.TryGetValue(key, out book); //_books is a concurrent dictionary
//TryGetValue gets an item with matching key and returns it into book.
return book;
}
return value is the normal value which is returned by your method.
An out parameter is different: out and ref are two C# keywords that allow variables to be passed by reference.
The big difference between ref and out is that a ref argument must be initialised before the call, whereas an out argument need not be.
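A short sketch of that difference; the method names Increment and SetToTen are made up for the illustration:

```csharp
using System;

static class RefOutDemo
{
    // ref: the method may read the incoming value, so the caller must assign it first.
    public static void Increment(ref int n)
    {
        n = n + 1;
    }

    // out: the method must assign the parameter before returning,
    // so the caller's variable may start out unassigned.
    public static void SetToTen(out int n)
    {
        n = 10;
    }

    static void Main()
    {
        int a = 1;         // required: a ref argument must be definitely assigned
        Increment(ref a);
        Console.WriteLine(a); // prints 2

        int b;             // fine: no initial value needed for an out argument
        SetToTen(out b);
        Console.WriteLine(b); // prints 10
    }
}
```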
I suspect I'm not going to get a look-in on this question, but I am a very experienced programmer, and I hope some of the more open-minded readers will pay attention.
I believe that it suits object-oriented programming languages better for their value-returning procedures (VRPs) to be deterministic and pure.
'VRP' is the modern academic name for a function that is called as part of an expression, and has a return value that notionally replaces the call during evaluation of the expression. E.g. in a statement such as x = 1 + f(y) the function f is serving as a VRP.
'Deterministic' means that the result of the function depends only on the values of its parameters. If you call it again with the same parameter values, you are certain to get the same result.
'Pure' means no side-effects: calling the function does nothing except computing the result. This can be interpreted to mean no important side-effects, in practice, so if the VRP outputs a debugging message every time it is called, for example, that can probably be ignored.
Thus, if, in C#, your function is not deterministic and pure, I say you should make it a void function (in other words, not a VRP), and any value it needs to return should be returned in either an out or a ref parameter.
For example, if you have a function to delete some rows from a database table, and you want it to return the number of rows it deleted, you should declare it something like this:
public void DeleteBasketItems(BasketItemCategory category, out int count);
If you sometimes want to call this function but not get the count, you could always declare an overloading.
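Such an overload could look something like the sketch below. Everything apart from the DeleteBasketItems signature is an assumption: the Basket class, the enum members, and the in-memory list standing in for the database table are all invented for the example.

```csharp
using System;
using System.Collections.Generic;

enum BasketItemCategory { Food, Books } // hypothetical members

class Basket
{
    // In-memory stand-in for the database table.
    private readonly List<BasketItemCategory> items = new List<BasketItemCategory>();

    public void Add(BasketItemCategory category)
    {
        items.Add(category);
    }

    public void DeleteBasketItems(BasketItemCategory category, out int count)
    {
        // List<T>.RemoveAll returns the number of elements removed.
        count = items.RemoveAll(c => c == category);
    }

    // Overload for callers that do not care about the count.
    public void DeleteBasketItems(BasketItemCategory category)
    {
        int ignored;
        DeleteBasketItems(category, out ignored);
    }
}

static class BasketDemo
{
    static void Main()
    {
        var basket = new Basket();
        basket.Add(BasketItemCategory.Food);
        basket.Add(BasketItemCategory.Food);
        basket.Add(BasketItemCategory.Books);

        int count;
        basket.DeleteBasketItems(BasketItemCategory.Food, out count);
        Console.WriteLine(count); // prints 2

        basket.DeleteBasketItems(BasketItemCategory.Books); // count discarded
    }
}
```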
You might want to know why this style suits object-oriented programming better. Broadly, it fits into a style of programming that could be (a little imprecisely) termed 'procedural programming', and it is a procedural programming style that fits object-oriented programming better.
Why? The classical model of objects is that they have properties (aka attributes), and you interrogate and manipulate the object (mainly) through reading and updating those properties. A procedural programming style tends to make it easier to do this, because you can execute arbitrary code in between operations that get and set properties.
The downside of procedural programming is that, because you can execute arbitrary code all over the place, you can get some very obtuse and bug-vulnerable interactions via global variables and side-effects.
So, quite simply, it is good practice to signal to someone reading your code that a function could have side-effects by making it non-value returning.

c# avoiding variable declaration

Suppose I have some code like this:
public string SomeMethod(int Parameter)
{
string TheString = "";
TheString = SomeOtherMethod(Parameter);
return TheString;
}
Of course, this code is equivalent to this:
public string SomeMethod(int Parameter)
{
return SomeOtherMethod(Parameter);
}
I think the first version is more readable and that's how I'm writing my code, even though I'm using a variable when I know I could avoid it.
My question is this: does the compiler compile the code in the same way (ie same performance) or is the second option really better in terms of performance.
Thanks.
I'd say the first form is less readable and it contains a redundant initializer. Why initialize the variable to "" if you're about to give it a different value? At least change it to:
public string SomeMethod(int parameter)
{
string returnValue = SomeOtherMethod(parameter);
return returnValue;
}
or if you really want to separate declaration from initialization:
public string SomeMethod(int parameter)
{
string returnValue;
returnValue = SomeOtherMethod(parameter);
return returnValue;
}
(Note that I've also adjusted the names to follow .NET naming conventions and to give a more meaningful name to the local variable - "TheString" conveys no useful meaning.)
You really won't see any performance problems from using the local variable, but I'd really encourage you to think about the readability. What is the purpose of the local variable here? You'd presumably describe the method as: "Returns the result of calling SomeOtherMethod with the given parameter" - at which point, the one-line version implements exactly that description.
The compiler will produce very similar code for your two examples. One slight modification though is to avoid initializing to an empty string that you never use.
public string SomeMethod(int Parameter)
{
string result;
result = SomeOtherMethod(Parameter);
return result;
}
I'm not sure rewriting the code in this way makes it more readable, but it does mean that you can add a breakpoint and see the value of result before the method returns. This can be useful when debugging.
Note you can combine the first and second line and still get this benefit:
public string SomeMethod(int Parameter)
{
string result = SomeOtherMethod(Parameter);
return result;
}
I think this last version is both highly readable and easy to debug.
The answer is already posted, but let me give it a different try:
There are 3 things that you are looking for:
Readability, performance, and usefulness (such as debugging, logging, etc.)
1. Readability is somewhat relative. What Eric Lippert or Jon Skeet finds more readable will not necessarily be more readable to me. The more you code, the more your perspective on reading code will change.
Both choices you gave are readable; for me the second is more readable.
2. Performance: In the first choice, as you might be aware, strings are immutable: assigning a new value does not modify the earlier string (which may be interned); a new string is created and the variable is pointed at it.
So from a performance perspective, initializing a variable to a value it never uses is unnecessary work. Again, this is relative and depends on the size/complexity of the application.
For this you should go with the second option. Your second option and Jon's answer will result in the same performance.
3. Debugging perspective: you would want a local variable if this is what you are looking for.

if/else, good design

Is it acceptable/good-style to simplify this function:
bool TryDo(Class1 obj, SomeEnum type)
{
if (obj.CanDo(type))
{
return Do(obj);
}
else
{
return false;
}
}
as:
bool TryDo(Class1 obj, SomeEnum type)
{
return obj.CanDo(type) && Do(obj);
}
The second version is shorter but arguably less intuitive.
What I would code is :
return obj.CanDo(type) ? Do(obj) : false;
Version with brackets:
bool TryDo(Class1 obj, SomeEnum type)
{
if (obj.CanDo(type))
{
return Do(obj);
}
return false;
}
Or version without brackets (this is hotly debated in the answer comments):
bool TryDo(Class1 obj, SomeEnum type)
{
/*
* If you use this brace-less form of
* "if", you do so at your own risk.
* Please don't downvote me for showing
* it: if I remove it from my answer,
* I get downvotes from the many people
* who think braces are wrong.
* See comments for more information.
*/
if (obj.CanDo(type))
return Do(obj);
return false;
}
Your first code example is better, but I think my version is even better.
Your second version is not very readable and makes the code harder to maintain, which is bad.
The else is useless and the &&, however obvious, is not as readable as pure text.
I prefer the following:
bool TryDo(Class1 obj, SomeEnum type)
{
if (obj.CanDo(type))
{
return Do(obj);
}
return false;
}
Yes.
Especially with names similar to your chosen names, i.e. CanDoSomething and DoSomething it is absolutely clear to any competent programmer what the second code does: “if and only if the condition holds, do something and return the result”. “if and only if” is the core meaning of the short-circuited && operator.
The first code is convoluted and unnecessarily long without giving any more information than the second code.
But in general, the two conditions may not form such an intimate relationship (as in CanDo and Do) and it might be better to separate them logically because putting them in the same conditional might not make intuitive sense.
A lot of people here claim that the first version is “much clearer”. I’d really like to hear their arguments. I can’t think of any.
On the other hand, there’s this closely related (although not quite the same) code:
if (condition)
return true;
else
return false;
this should always be transformed to this:
return condition;
No exception. It’s concise and still more readable to someone who is competent in the language.
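For anyone unsure about the short-circuit behaviour this answer relies on, here is a small sketch. The bool parameter and the DoCalls counter are stand-ins invented for the example; they replace the question's Class1 and SomeEnum so the code is self-contained.

```csharp
using System;

static class ShortCircuitDemo
{
    // Counts how often Do actually runs.
    public static int DoCalls;

    static bool CanDo(bool allowed)
    {
        return allowed;
    }

    static bool Do()
    {
        DoCalls++;
        return true;
    }

    // Same shape as the question's second version.
    public static bool TryDo(bool allowed)
    {
        return CanDo(allowed) && Do();
    }

    static void Main()
    {
        Console.WriteLine(TryDo(false)); // prints False
        Console.WriteLine(DoCalls);      // prints 0: Do was never invoked
        Console.WriteLine(TryDo(true));  // prints True
        Console.WriteLine(DoCalls);      // prints 1
    }
}
```

The right-hand operand of && is evaluated only when the left-hand one is true, which is guaranteed by the language rather than being an optional optimization.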
The shortened version hides the fact that Do does something. It looks like you're just doing a comparison and returning the result, but you're actually doing a comparison and performing an action, and it's not obvious that the code has this "side effect".
I think the core of the problem is that you're returning the result of an evaluation and the return code of an action. If you were returning the result of two evaluations in this way, I wouldn't have a problem with it
Another alternative that may be a bit more readable is using the conditional operator:
bool TryDo(Class1 obj, SomeEnum type) {
return obj.CanDo(type) ? Do(obj) : false;
}
The 1st version is much easier to read with less chance of being misunderstood, and I think that is important in real world code.
I do not like this design, and maybe not for the obvious reason. What bothers me is.
return Do(obj);
To me it makes no sense for the Do function to have a bool return type. Is this a substitute for properly pushing errors up?
Most likely this function should be void or returning a complex object. The scenario should simply not come up.
Also, if a bool somehow makes sense now, it can easily stop making sense in the future. With your code change it would require more refactoring to fix.
In this case, I would go with the first option - it is much more readable and the intention of the code is much clearer.
Neither, because when TryDo returns False, you can't determine whether it was because
'Not CanDo' or 'Do returned False'.
I fully understand that you can ignore the result, but the way it's expressed implies that the result has meaning.
If the result is meaningless, the intent would be clearer with
void TryDo(Class1 obj, SomeEnum type)
{
if (obj.CanDo(type))
Do(obj);
return;
}
If the result does have a meaning then there should be a way to differentiate between the two 'false' returns. I.E. What does 'If (!TryDo(x))' mean?
Edit: To put it another way, the OP's code is saying that 'I can't swim' is the same as 'I tried to swim and drowned'
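One possible way to tell the two "false" cases apart is to return an enum instead of a bool. This is only a sketch of the idea: the TryDoResult enum and the simplified signature (a bool and a delegate instead of Class1 and SomeEnum) are assumptions, not part of the OP's code.

```csharp
using System;

// Hypothetical result type that keeps "can't" and "tried but failed" distinct.
enum TryDoResult { CannotDo, Failed, Succeeded }

static class TryDoDemo
{
    public static TryDoResult TryDo(bool canDo, Func<bool> action)
    {
        if (!canDo)
        {
            return TryDoResult.CannotDo; // "I can't swim"
        }
        // "I tried to swim": report whether it worked.
        return action() ? TryDoResult.Succeeded : TryDoResult.Failed;
    }

    static void Main()
    {
        Console.WriteLine(TryDo(false, () => true)); // prints CannotDo
        Console.WriteLine(TryDo(true, () => false)); // prints Failed
        Console.WriteLine(TryDo(true, () => true));  // prints Succeeded
    }
}
```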
I don't like the second version as I'm not really a fan of taking advantage of the order of sub-expression evaluation in a conditional expression. It's placing an expected ordering on sub-expressions which in my mind, should have equal precedence (even though they don't).
At the same time, I find the first version a bit bloated, so I'd opt for the ternary solution which I consider highly readable.
While it is certainly clear to a competent programmer what the second code does, it seems to me clearer and easier to read, in the more general case, to write the code like any other "if precondition satisfied, do action, else fail" style.
This could be achieved either by:
return obj.CanDo(type)? Do(obj) : false;
Or,
if(obj.CanDo(type)) return Do(obj);
return false;
I find this superior because this same style can be replicated no matter the return type. For example,
return (array.Length > 1)? array[0] : null;
Or,
return (curActivity != null)? curActivity.Operate() : 0;
The same style can also be expanded to situations that don't have a return value:
if(curSelection != null)
curSelection.DoSomething();
My two cents.
IMO it's only OK if the second function has no side-effects. But since Do() has side-effects I'd go with the if.
My guideline is that an expression should not have side-effects. When calling functions with a side-effect, use a statement.
This guideline has a problem if a function returns a failure code. In that case I accept assignment of that error code to a variable or directly returning it. But I don't use the return value in a complex expression. So perhaps I should say that only the outermost function call in an expression should have a side-effect.
See Eric Lippert's Blog for a longer explanation.
The side-effect problem that some people are mentioning is bogus. No one should be surprised that a method named "Do" has a side-effect.
The fact is, you are calling two methods. Both of those methods have bool as a return value. Your second option is very clear and concise. (Though I would get rid of the outer parenthesis and you forgot an ending semi-colon.)
Never do that. Keep it simple and intuitive.
I might get hate for this but what about:
if (obj.CanDo(type)) return Do(obj);
return false;
I don't like having braces for one liners.
For me, I prefer the second method. Most of my methods that return bool are shortened in the same manner when simple conditional logic is involved.
I think that the Class1 Type should determine if it can do, given a SomeEnum value.
I would leave the decision on whether or not it can handle the input for it to decide:
bool TryDo(Class1 obj, SomeEnum type)
{
return obj.Do(type);
}
No. The second version
return (obj.CanDo(type) && Do(obj))
relies on the short-circuiting behavior of the && operator which is an optimization, not a method to control program flow. In my mind this is only slightly different than using exceptions for program flow (and almost as bad).
I hate clever code, it's a bitch to understand and debug. The goal of the function is "if we can do this, then do it and return the result else return false." The original code makes that meaning very clear.
There are several good answers already, but I thought I would show one more example of (what I consider) good, readable code.
bool TryDo(Class1 obj, SomeEnum type)
{
bool result = false;
if (obj.CanDo(type))
{
result = Do(obj);
}
return result;
}
Keep or remove the curly brackets back around the body of the if statement according to taste.
I like this approach because it illustrates that the result is false unless something else happens, and I think it more clearly shows that Do() is doing something and returning a boolean, which TryDo() uses as its return value.

Is extending String class with IsNullOrEmpty confusing?

Everyone knows and loves the String.IsNullOrEmpty(yourString) method.
I was wondering if it's going to confuse developers or make code better if we extend String class to have method like this:
yourString.IsNullOrEmpty();
Pro:
More readable.
Less typing.
Cons:
Can be confusing because the yourString variable can be null and it looks like you're executing a method on a null variable.
What do you think?
The same question we can ask about myObject.IsNull() method.
That's how I would write it:
public static class StringExt
{
public static bool IsNullOrEmpty(this string text)
{
return string.IsNullOrEmpty(text);
}
public static bool IsNull(this object obj)
{
return obj == null;
}
}
If I'm not mistaken, every answer here decries the fact that the extension method can be called on a null instance, and because of this they do not believe this is a good idea.
Let me counter their arguments.
I don't believe AT ALL that calling a method on an object that may be null is confusing. The fact is that we only check for nulls in certain locations, and not 100% of the time. That means there is a percentage of time where every method call we make is potentially on a null object. This is understood and acceptable. If it wasn't, we'd be checking null before every single method call.
So, how is it confusing that a particular method call may be happening on a null object? Look at the following code:
var bar = foo.DoSomethingResultingInBar();
Console.WriteLine(bar.ToStringOr("[null]"));
Are you confused? You should be. Because here's the implementation of foo:
public Bar DoSomethingResultingInBar()
{
return null; //LOL SUCKER!
}
See? You read the code sample without being confused at all. You understood that, potentially, foo would return a null from that method call and the ToStringOr call on bar would result in an NRE. Did your head spin? Of course not. It's understood that this can happen. Now, that ToStringOr method is not familiar. What do you do in these situations? You either read the docs on the method or examine the code of the call. Here it is:
public static class BarExtensions
{
public static string ToStringOr(this Bar bar, string whenNull)
{
return bar == null ? whenNull ?? "[null]" : bar.ToString();
}
}
Confusing? Of course not. It's obvious that the developer wanted a shorthand method of checking if bar is null and substituting a non-null string for it. Doing this can slash your code significantly and increase readability and code reuse. Of course you could do this in other ways, but this way would be no more confusing than any other. For example:
var bar = foo.DoSomethingResultingInBar();
Console.WriteLine(ToStringOr(bar, "[null]"));
When you encounter this code, what do you have to do differently than with the original version? You still have to examine the code, and you still have to determine its behavior when bar is null. You still have to deal with this possibility.
Are extension methods confusing? Only if you don't understand them. And, quite frankly, the same can be said for ANY part of the language, from delegates to lambdas.
I think the root of this problem is what Jon Skeet has mentioned in the list of things he hates in his favorite language (C#): C# should not have imported all extension methods in a whole namespace automatically. This process should have been done more explicitly.
My personal opinion about this specific question is (since we can't do anything about the above fact) to use the extension method if you want. I don't say it won't be confusing, but this fact about extension methods (that can be called on null references) is a global thing and doesn't affect only String.IsNullOrEmpty, so C# devs should get familiar with it.
By the way, it's fortunate that Visual Studio clearly identifies extension methods by (extension) in the IntelliSense tooltip.
I'm personally not a fan of doing this. The biggest problem with extension methods right now is discoverability. Unless you flat out know all of the methods which exist on a particular type, it's not possible to look at a method call and know that it's an extension method. As such I find it problematic to do anything with an extension method that would not be possible with a normal method call. Otherwise you will end up confusing developers.
A corollary to this problem exists in C++. In several C++ implementations it's possible to call instance methods on NULL pointers as long as you don't touch any fields or virtual methods on the type. I've worked with several pieces of code that do this intentionally and give methods different behavior when "this==NULL". It's quite maddening to work with.
This is not to say that I don't like extension methods. Quite the contrary, I enjoy them and use them frequently. But I think there are 2 important rules you should follow when writing them.
Treat the actual method implementation as if it's just another static method, because it in fact is. For example, throw ArgumentException instead of NullReferenceException for a null this
Don't let an extension method perform tricks that a normal instance method couldn't do
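A minimal sketch of the first rule, assuming a made-up Truncate helper (the class name, method name and behavior are illustrative only, not part of the framework). The extension validates its this parameter like any other static method argument and throws ArgumentNullException, which derives from ArgumentException, rather than letting a NullReferenceException surface from inside the body:

```csharp
using System;

public static class StringGuardExtensions
{
    // Hypothetical helper: behaves like an ordinary static method and
    // rejects a null "this" argument up front instead of tolerating it.
    public static string Truncate(this string value, int maxLength)
    {
        if (value == null)
            throw new ArgumentNullException(nameof(value));
        if (maxLength < 0)
            throw new ArgumentOutOfRangeException(nameof(maxLength));

        // Return the string unchanged when it already fits.
        return value.Length <= maxLength ? value : value.Substring(0, maxLength);
    }
}
```

Written this way, calling Truncate on a null string fails with an argument exception that names the parameter, much as StringGuardExtensions.Truncate(null, 3) would if it were called as a plain static method.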
EDIT Responding to Mehrdad's comment
The problem with taking advantage of it is that I don't see str.IsNullOrEmpty as having a significant functional advantage over String.IsNullOrEmpty(str). The only advantage I see is that one requires less typing than the other. The same could be said about extension methods in general. But in this case you're additionally altering the way people think about program flow.
If shorter typing is what people really want, wouldn't IsNullOrEmpty(str) be a much better option? It's both unambiguous and the shortest of all. True, C# has no support for such a feature today. But imagine if in C# I could say
using SomeNamespace.SomeStaticClass;
The result of doing this is that all methods on SomeStaticClass were now in the global namespace and available for binding. This seems to be what people want, but they're attaching it to an extension method which I'm not a huge fan of.
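For reference, C# 6 later added a using static directive that comes close to the feature imagined here: it brings the static members of one type into scope for the current file. A small sketch, where the UsingStaticDemo class and Describe method are made up for illustration:

```csharp
using System;
using static System.String;

public static class UsingStaticDemo
{
    public static string Describe(string str)
    {
        // IsNullOrEmpty binds to String.IsNullOrEmpty through the
        // using static directive; no type name or extension method needed.
        return IsNullOrEmpty(str) ? "empty" : "has content";
    }
}
```

The call site reads IsNullOrEmpty(str), and nothing about it suggests an instance method being invoked on a possibly null reference.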
I really like this approach AS LONG AS the method makes it clear that it is checking the object is null. ThrowIfNull, IsNull, IsNullOrEmpty, etc. It is very readable.
Personally, I wouldn't create an extension that does something that already exists in the framework unless it was a significant improvement in usability. In this instance, I don't think that's the case. Also, if I were to create an extension, I would name it in a way as to reduce confusion, not increase it. Again, I think this case fails that test.
Having said all that, I do have a string extension that tests, not only if the string is null or empty, but also if it only contains whitespace. I call it IsNothing. You can find it here.
It doesn't just look like you're calling a method on a null variable. You /are/ calling a method on a null variable, albeit one implemented through a static extension method. I had no idea extension methods (which I was already leery of) supported that. This even allows you to do crazy things like:
public static int PowerLength(this string obj)
{
return obj == null ? 0 : obj.Length;
}
From where I'm standing now, I would classify any use of an extension method on a null reference under considered harmful.
Let's look at the pros and cons...
More readable.
Yes, slightly, but the improvement in readability is outweighed by the fact that it looks like you are calling an instance method on something that doesn't have to be an instance. In effect it's easier to read, but it's harder to understand, so the improved readability is really just an illusion.
Less typing.
Yes, but that is really not a strong argument. If typing is the main part of your programming, you are just not doing something that is remotely challenging enough for you to evolve as a developer.
Can be confusing because yourString variable can be null and it looks like you're executing method on a null variable.
True, (as mentioned above).
Yes, it will confuse.
I think that everyone who knows about IsNullOrEmpty() perceives the use of it as quite natural. So your suggested extension method will not add more readability.
Maybe for someone new to .NET this extension method might be easier to deal with, but there is the danger that she/he doesn't understand all facets of extension methods (like that you need to import the namespace and that it can be invoked on null). She/he might wonder why this extension method does not exist in another project. Anyway: even if someone is new to .NET, the syntax of the IsNullOrEmpty() method might become natural quite fast. Here, too, the benefits of the extension method will not outweigh the confusion caused by it.
Edit: Tried to rephrase what I wanted to say.
I think extending any object with an "IsNull" type call is a little confusing and should be avoided if possible.
It's arguable that the IsEmptyString method might be useful for the String type, and that because you'd usually combine this with a test for null that the IsNullOrEmpty might be useful, but I'd avoid this too due to the fact that the string type already has a static method that does this and I'm not sure you're saving yourself that much typing (5 characters at most).
Calling a method on a variable that is null usually results in a NullReferenceException. The IsNullOrEmpty()-Method deviates from this behaviour in a way that is not predictable from just looking at the code. Therefore I would advise against using it since it creates confusion and the benefit of saving a couple of characters is minimal.
In general, I'm only ok with extension methods being safe to call on null if they have the word 'null' or something like that in their name. That way, I'm clued in to the fact that they may be safe to call with null. Also, they better document that fact in their XML comment header so I get that info when I mouse-over the call.
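A sketch of what that might look like, assuming a hypothetical EmptyIfNull extension (the name is invented for this example). The word "Null" in the name hints that a null receiver is expected, and the XML comment states it outright so it shows up in the tooltip:

```csharp
public static class NullTolerantStringExtensions
{
    /// <summary>
    /// Returns <paramref name="value"/>, or an empty string when it is null.
    /// Safe to call on a null reference; never throws.
    /// </summary>
    public static string EmptyIfNull(this string value)
    {
        return value ?? string.Empty;
    }
}
```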
When comparing a class instance to null, using ".IsNull()" is not even shorter than using " == null".
Things change in generic classes when the generic type argument without constraint can be a value type.
In such a case, comparison with the default value is lengthy, and I use the extension below:
public static bool IsDefault<T>(this T x)
{
return EqualityComparer<T>.Default.Equals(x, default(T));
}
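A short usage sketch, assuming a made-up Slot<T> container with an unconstrained type parameter. The extension is repeated so the example compiles on its own:

```csharp
using System.Collections.Generic;

public static class DefaultExtensions
{
    // Same extension as above, repeated for self-containment.
    public static bool IsDefault<T>(this T x)
    {
        return EqualityComparer<T>.Default.Equals(x, default(T));
    }
}

// Hypothetical container: T may be a value type or a reference type.
public class Slot<T>
{
    private readonly T _value;

    public Slot(T value)
    {
        _value = value;
    }

    // "_value == default(T)" does not compile for an unconstrained T,
    // so the comparison goes through the extension instead.
    public bool IsEmpty
    {
        get { return _value.IsDefault(); }
    }
}
```

With this, new Slot<int>(0) and new Slot<string>(null) both report IsEmpty as true, while new Slot<int>(5) and new Slot<string>("") report false.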

How should I name my class, functions, member variables and static variables?

Some may feel this question is subjective. But, I feel this is among the most important things to be told to a programmer.
Which of these is a good function name to check for null values?
1. checkNull()
2. notNull()
3. isNull()
What if I write
checkIfNull()
I do not know how many people share this feeling, but I have spent more time thinking of good names for my functions than writing them.
How do people think of good names? Can the naming be consistent across languages (mainly C++ and Java)
Update:
Going by the answers so far, most people prefer isNull(). How do you decide that isNull() is the perfect name?
checkNotNull() // throw exception if Null
Is this a good name? Does everyone depend upon their intuition for deciding a name?
The question is about choosing a perfect name!!!
isNull might be a bad example, because:
Object foo = null;
if (foo.isNull()) { } // Causes a NullPointerException in Java.
Otherwise, you've got:
Object foo = null;
if (UtilityClass.isNull(foo)) { }
Which seems harder and less clear than just doing:
Object foo = null;
if (foo == null) { }
Like the others, I prefer isNull() (or IsNull(), depending on your language/coding conventions).
Why? Besides being a widely accepted convention, it sounds nice when you read the code:
if (isNull())
// or
if (foo.isInitialized())
and so on. Almost natural English... :-) Compare to the alternatives!
Like iWerner, I would avoid negative form for making identifiers (variables, methods) names.
Another common convention is to start method/function names with a verb. Now, Sun did not follow this convention in the early days of Java (hence the length() and size() methods, for example) but it even deprecates some of these old names in favor of the verb rule.
If the function throws an exception if it's null, it should be called ThrowIfNull to make it clear that it will throw for you.
IsNull() is a good choice, but additionally it should return a bool,
so that you can check its value in an if statement without getting any NullReferenceException.
Nowadays it is highly recommended to use the javaBeans convention:
isNull() //if the return type is a primitive
getNull() //if the return type is an object (Like Boolean in java)
For non boolean types access members, you should use get.
For static variable members use the camel case style: "myVar".
For class name use camel case style with capitalized first letter: "MyClass".
And for constant members use uppercase letter with underscore as separator: "MY_CONSTANT".
The answer depends on what your method returns.
If it returns a bool indicating whether the object is null, I would name it IsNull(Thing thing), because it is the least ambiguous formulation - what the method does and what it returns is immediately obvious.
If the method is void but throws if the object is null, I would call it GuardAgainstNull(), or something along these lines.
IMO, CheckNull() is somewhat ambiguous - you don't know by looking at the method if it should return a bool or throw, or what the bool indicates exactly.
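A rough sketch of the two shapes side by side, assuming a made-up NullChecks class (the paramName parameter is an addition for the exception message, not something the names above require):

```csharp
using System;

public static class NullChecks
{
    // Returns a bool and never throws: the "Is" prefix reads as a question.
    public static bool IsNull(object thing)
    {
        return thing == null;
    }

    // Returns nothing and throws on null: the name reads as a precondition.
    public static void GuardAgainstNull(object thing, string paramName)
    {
        if (thing == null)
            throw new ArgumentNullException(paramName);
    }
}
```

At the call site, if (NullChecks.IsNull(x)) and NullChecks.GuardAgainstNull(x, "x") make the difference in behavior visible from the name alone, which a CheckNull(x) call would not.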
I prefer IsNull.
To learn good naming style, study the standard libraries (except in PHP). You should follow the style used by the standard libraries in each language.
For C#, study the Framework Design Guidelines.
personally, I would use
IsNull()
I found this article. Felt like sharing with you guys!
If you're doing a lot of null checking in your code, I think having a pair of methods, i.e.:
IsNull()
IsNotNull()
will lead to the most readable code in the long run.
I know !IsNull() is a standard idiom in curly brace languages, but I think it's much less clear than IsNotNull.
It's too easy to overlook that single "!" character, especially if it's buried in a more complex expression.
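To make the comparison concrete, here is a small sketch with both spellings of the same condition. The NullPredicates and ShippingRules classes and their parameters are invented for this example:

```csharp
public static class NullPredicates
{
    public static bool IsNull(object o) { return o == null; }
    public static bool IsNotNull(object o) { return o != null; }
}

// Hypothetical rule: ship when customer and address are set and no hold exists.
public static class ShippingRules
{
    // Version relying on "!": each negation is a single character.
    public static bool CanShipWithBang(object customer, object address, object hold)
    {
        return !NullPredicates.IsNull(customer)
            && !NullPredicates.IsNull(address)
            && NullPredicates.IsNull(hold);
    }

    // Version using the paired method: the negation is part of the name.
    public static bool CanShip(object customer, object address, object hold)
    {
        return NullPredicates.IsNotNull(customer)
            && NullPredicates.IsNotNull(address)
            && NullPredicates.IsNull(hold);
    }
}
```

Both methods return the same result for every input; the only difference is how easily the negations can be spotted when reading.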
It can vary depending on the language you are using - and you tagged a couple to this question. It is important to stay consistent with the standards of the language/library you are coding against. Yes, naming conventions are very important! [There's even a wikipedia entry on it: http://en.wikipedia.org/wiki/Naming_conventions_%28programming%29]
For .Net I found this "cheat sheet" on naming conventions:
http://www.irritatedvowel.com/Programming/Standards.aspx
For your example in C# I'd recommend: IsNull()
If your company does not specify naming conventions in its coding standards I suggest it's time you add them.
Our company's Java coding standards are based on the official Java Coding Standards which, I believe, specify names like isNull().
From your example, the notNull() is bad, because you may end up with statements like if(!notNull()) or the like.
I would use IsNull(); there is a precedent in .NET, which has a static IsNullOrEmpty() method for the String type. "Is" is my preferred prefix for methods that return a bool. I would not have a negative method "notNull", because this too easily results in double negatives. Instead use the negation operation on a positive method, e.g., !IsNull().
However, a method that only checks for a null value may be overly complicating things; what is wrong with
x == null
Which I think is more readable than
IsNull(x)
Most developers seeing IsNull(x) would wonder if there is some fancy null checking in the IsNull method; if there isn't then "x == null" is probably better.
