Is it acceptable/good-style to simplify this function:
bool TryDo(Class1 obj, SomeEnum type)
{
if (obj.CanDo(type))
{
return Do(obj);
}
else
{
return false;
}
}
as:
bool TryDo(Class1 obj, SomeEnum type)
{
return obj.CanDo(type) && Do(obj);
}
The second version is shorter but arguably less intuitive.
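For anyone unsure whether the two forms really behave the same, here is a minimal sketch. The bodies of Class1, SomeEnum and Do are hypothetical stand-ins (only the names come from the question), and the DoCalls counter is added purely to make the side effect observable.

```csharp
using System;

// Hypothetical stand-ins for the question's types; the real ones are not shown in the post.
public enum SomeEnum { Allowed, Forbidden }

public class Class1
{
    public bool CanDo(SomeEnum type) => type == SomeEnum.Allowed;
}

public static class ShortCircuitDemo
{
    // Counts how often Do actually runs, so the side effect can be observed.
    public static int DoCalls;

    public static bool Do(Class1 obj)
    {
        DoCalls++;
        return true;
    }

    // Long form from the question.
    public static bool TryDoLong(Class1 obj, SomeEnum type)
    {
        if (obj.CanDo(type))
        {
            return Do(obj);
        }
        else
        {
            return false;
        }
    }

    // Short form: && evaluates its right operand only when the left operand is true,
    // so Do is never called when CanDo returns false.
    public static bool TryDoShort(Class1 obj, SomeEnum type)
    {
        return obj.CanDo(type) && Do(obj);
    }
}
```

With these stand-ins, calling either method with SomeEnum.Forbidden returns false and leaves DoCalls unchanged, so the choice between them is about readability rather than behaviour.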
What I would write is:
return obj.CanDo(type) ? Do(obj) : false;
Version with brackets:
bool TryDo(Class1 obj, SomeEnum type)
{
if (obj.CanDo(type))
{
return Do(obj);
}
return false;
}
Or the version without brackets (there is a heated debate about it in the answer comments):
bool TryDo(Class1 obj, SomeEnum type)
{
/*
* If you want to use this form of "if",
* you do so at your own risk, and I
* don't want to get downvotes for this
* syntax: if I remove it from my answer,
* I get downvotes as well, because many
* people think the brackets are wrong.
* See the comments for more information.
*/
if (obj.CanDo(type))
return Do(obj);
return false;
}
Your first code example is better, but I think my version is even better.
Your second version is hard to read and makes the code harder to maintain, which is bad.
The else is useless and the &&, however obvious, is not as readable as pure text.
I prefer the following:
bool TryDo(Class1 obj, SomeEnum type)
{
if (obj.CanDo(type))
{
return Do(obj);
}
return false;
}
Yes.
Especially with names similar to your chosen names, i.e. CanDoSomething and DoSomething it is absolutely clear to any competent programmer what the second code does: “if and only if the condition holds, do something and return the result”. “if and only if” is the core meaning of the short-circuited && operator.
The first code is convoluted and unnecessarily long without giving any more information than the second code.
But in general, the two conditions may not form such an intimate relationship (as in CanDo and Do) and it might be better to separate them logically because putting them in the same conditional might not make intuitive sense.
A lot of people here claim that the first version is “much clearer”. I’d really like to hear their arguments. I can’t think of any.
On the other hand, there’s this closely related (although not quite the same) code:
if (condition)
return true;
else
return false;
this should always be transformed to this:
return condition;
No exception. It’s concise and still more readable to someone who is competent in the language.
The shortened version hides the fact that Do does something. It looks like you're just doing a comparison and returning the result, but you're actually doing a comparison and performing an action, and it's not obvious that the code has this "side effect".
I think the core of the problem is that you're returning the result of an evaluation and the return code of an action. If you were returning the result of two evaluations in this way, I wouldn't have a problem with it.
Another alternative that may be a bit more readable is using the conditional operator:
bool TryDo(Class1 obj, SomeEnum type) {
return obj.CanDo(type) ? Do(obj) : false;
}
The 1st version is much easier to read with less chance of being misunderstood, and I think that is important in real world code.
I do not like this design, and maybe not for the obvious reason. What bothers me is:
return Do(obj);
To me it makes no sense for the Do function to have a bool return type. Is this a substitute for properly pushing errors up?
Most likely this function should be void or returning a complex object. The scenario should simply not come up.
Also, if a bool somehow makes sense now, it can easily stop making sense in the future. With your code change it would require more refactoring to fix.
In this case, I would go with the first option - it is much more readable and the intention of the code is much clearer.
Neither, because when TryDo returns False, you can't determine whether it was because
'Not CanDo' or 'Do returned False'.
I fully understand that you can ignore the result, but the way it's expressed implies that the result has meaning.
If the result is meaningless, the intent would be clearer with
void TryDo(Class1 obj, SomeEnum type)
{
if (obj.CanDo(type))
Do(obj);
return;
}
If the result does have a meaning, then there should be a way to differentiate between the two 'false' returns. I.e., what does 'if (!TryDo(x))' mean?
Edit: To put it another way, the OP's code is saying that 'I can't swim' is the same as 'I tried to swim and drowned'
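One way to keep the two kinds of failure apart is to return something richer than a bool. The following is only a sketch: the TryDoOutcome enum, the WillSucceed property and the bodies of CanDo and Do are all hypothetical, invented here for illustration.

```csharp
using System;

// Hypothetical stand-ins; only the names Class1, SomeEnum, CanDo and Do come from the question.
public enum SomeEnum { Allowed, Forbidden }

public class Class1
{
    public bool WillSucceed { get; set; }
    public bool CanDo(SomeEnum type) => type == SomeEnum.Allowed;
}

// Hypothetical three-state result that separates "could not try" from "tried and failed".
public enum TryDoOutcome { NotAttempted, Failed, Succeeded }

public static class OutcomeDemo
{
    static bool Do(Class1 obj) => obj.WillSucceed;

    public static TryDoOutcome TryDo(Class1 obj, SomeEnum type)
    {
        if (!obj.CanDo(type))
        {
            return TryDoOutcome.NotAttempted;   // "I can't swim"
        }

        // "I tried to swim" and either made it or drowned.
        return Do(obj) ? TryDoOutcome.Succeeded : TryDoOutcome.Failed;
    }
}
```

A caller can then switch on the outcome instead of guessing what a bare false meant.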
I don't like the second version as I'm not really a fan of taking advantage of the order of sub-expression evaluation in a conditional expression. It's placing an expected ordering on sub-expressions which in my mind, should have equal precedence (even though they don't).
At the same time, I find the first version a bit bloated, so I'd opt for the ternary solution which I consider highly readable.
While it is certainly clear to a competent programmer what the second code does, it would seem to me clearer and easier to read in the more general case to write the code like any other "if precondition satisfied, do action, else fail" style.
This could be achieved either by:
return obj.CanDo(type)? Do(obj) : false;
Or,
if(obj.CanDo(type)) return Do(obj);
return false;
I find this superior because this same style can be replicated no matter the return type. For example,
return (array.Length > 1)? array[0] : null;
Or,
return (curActivity != null)? curActivity.Operate() : 0;
The same style can also be expanded to situations that don't have a return value:
if(curSelection != null)
curSelection.DoSomething();
My two cents.
IMO it's only OK if the second function has no side-effects. But since Do() has side-effects I'd go with the if.
My guideline is that an expression should not have side-effects. When calling functions with a side-effect, use a statement.
This guideline has a problem if a function returns a failure code. In that case I accept assigning that error code to a variable or returning it directly. But I don't use the return value in a complex expression. So perhaps I should say that only the outermost function call in an expression should have a side-effect.
See Eric Lippert's Blog for a longer explanation.
The side-effect problem that some people are mentioning is bogus. No one should be surprised that a method named "Do" has a side-effect.
The fact is, you are calling two methods. Both of those methods have bool as a return value. Your second option is very clear and concise. (Though I would get rid of the outer parentheses, and you forgot an ending semicolon.)
Never do that. Keep it simple and intuitive.
I might get hate for this but what about:
if (obj.CanDo(type)) return Do(obj);
return false;
I don't like having braces for one liners.
For me, I prefer the second method. Most of my methods that return bool are shortened in the same manner when simple conditional logic is involved.
I think that the Class1 Type should determine if it can do, given a SomeEnum value.
I would leave the decision on whether or not it can handle the input for it to decide:
bool TryDo(Class1 obj, SomeEnum type)
{
return obj.Do(type);
}
No. The second version
return (obj.CanDo(type) && Do(obj))
relies on the short-circuiting behavior of the && operator which is an optimization, not a method to control program flow. In my mind this is only slightly different than using exceptions for program flow (and almost as bad).
I hate clever code, it's a bitch to understand and debug. The goal of the function is "if we can do this, then do it and return the result else return false." The original code makes that meaning very clear.
There are several good answers already, but I thought I would show one more example of (what I consider) good, readable code.
bool TryDo(Class1 obj, SomeEnum type)
{
bool result = false;
if (obj.CanDo(type))
{
result = Do(obj);
}
return result;
}
Keep or remove the curly brackets around the body of the if statement according to taste.
I like this approach because it illustrates that the result is false unless something else happens, and I think it more clearly shows that Do() is doing something and returning a boolean, which TryDo() uses as its return value.
If we want to get a value from a method, we can use either return value, like this:
public int GetValue();
or:
public void GetValue(out int x);
I don't really understand the differences between them, and so don't know which is better. Can you explain this to me?
Thank you.
Return values are almost always the right choice when the method doesn't have anything else to return. (In fact, I can't think of any cases where I'd ever want a void method with an out parameter, if I had the choice. C# 7's Deconstruct methods for language-supported deconstruction act as a very, very rare exception to this rule.)
Aside from anything else, it stops the caller from having to declare the variable separately:
int foo;
GetValue(out foo);
vs
int foo = GetValue();
Out values also prevent method chaining like this:
Console.WriteLine(GetValue().ToString("g"));
(Indeed, that's one of the problems with property setters as well, and it's why the builder pattern uses methods which return the builder, e.g. myStringBuilder.Append(xxx).Append(yyy).)
Additionally, out parameters are slightly harder to use with reflection and usually make testing harder too. (More effort is usually put into making it easy to mock return values than out parameters). Basically there's nothing I can think of that they make easier...
Return values FTW.
EDIT: In terms of what's going on...
Basically when you pass in an argument for an "out" parameter, you have to pass in a variable. (Array elements are classified as variables too.) The method you call doesn't have a "new" variable on its stack for the parameter - it uses your variable for storage. Any changes in the variable are immediately visible. Here's an example showing the difference:
using System;
class Test
{
static int value;
static void ShowValue(string description)
{
Console.WriteLine(description + value);
}
static void Main()
{
Console.WriteLine("Return value test...");
value = 5;
value = ReturnValue();
ShowValue("Value after ReturnValue(): ");
value = 5;
Console.WriteLine("Out parameter test...");
OutParameter(out value);
ShowValue("Value after OutParameter(): ");
}
static int ReturnValue()
{
ShowValue("ReturnValue (pre): ");
int tmp = 10;
ShowValue("ReturnValue (post): ");
return tmp;
}
static void OutParameter(out int tmp)
{
ShowValue("OutParameter (pre): ");
tmp = 10;
ShowValue("OutParameter (post): ");
}
}
Results:
Return value test...
ReturnValue (pre): 5
ReturnValue (post): 5
Value after ReturnValue(): 10
Out parameter test...
OutParameter (pre): 5
OutParameter (post): 10
Value after OutParameter(): 10
The difference is at the "post" step - i.e. after the local variable or parameter has been changed. In the ReturnValue test, this makes no difference to the static value variable. In the OutParameter test, the value variable is changed by the line tmp = 10;
What's better, depends on your particular situation. One of the reasons out exists is to facilitate returning multiple values from one method call:
public int ReturnMultiple(int input, out int output1, out int output2)
{
output1 = input + 1;
output2 = input + 2;
return input;
}
So one is not by definition better than the other. But usually you'd want to use a simple return, unless you have the above situation for example.
EDIT:
This is a sample demonstrating one of the reasons that the keyword exists. The above is in no way to be considered a best practise.
You should generally prefer a return value over an out param. Out params are a necessary evil if you find yourself writing code that needs to do 2 things. A good example of this is the Try pattern (such as Int32.TryParse).
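As a small illustration of the Try pattern mentioned above: Int32.TryParse reports success through its bool return value and hands the parsed number back through an out parameter. The ParseOrDefault wrapper is a hypothetical helper written for this sketch, not part of the framework.

```csharp
using System;

public static class TryPatternDemo
{
    // Hypothetical helper: returns the parsed number, or the fallback
    // when the text is not a valid integer.
    public static int ParseOrDefault(string text, int fallback)
    {
        int parsed;

        // The bool return value says whether parsing worked;
        // the out parameter carries the parsed number.
        if (int.TryParse(text, out parsed))
        {
            return parsed;
        }

        return fallback;
    }
}
```

Here the method genuinely has two things to report (did it work, and what was the value), which is the situation where an out parameter earns its keep.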
Let's consider what the caller of your two methods would have to do. For the first example I can write this...
int foo = GetValue();
Notice that I can declare a variable and assign it via your method in one line. For the 2nd example it looks like this...
int foo;
GetValue(out foo);
I'm now forced to declare my variable up front and write my code over two lines.
update
A good place to look when asking these types of question is the .NET Framework Design Guidelines. If you have the book version then you can see the annotations by Anders Hejlsberg and others on this subject (page 184-185) but the online version is here...
http://msdn.microsoft.com/en-us/library/ms182131(VS.80).aspx
If you find yourself needing to return two things from an API then wrapping them up in a struct/class would be better than an out param.
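A rough sketch of that suggestion, reusing the arithmetic from the ReturnMultiple example earlier in this thread. The MultipleResult type and its member names are made up for illustration.

```csharp
using System;

// Hypothetical result type that bundles the values a method wants to hand back.
public struct MultipleResult
{
    public MultipleResult(int input, int output1, int output2)
    {
        Input = input;
        Output1 = output1;
        Output2 = output2;
    }

    public int Input { get; }
    public int Output1 { get; }
    public int Output2 { get; }
}

public static class ResultTypeDemo
{
    // Same values as the out-parameter version, but returned as one object.
    public static MultipleResult ReturnMultiple(int input)
    {
        return new MultipleResult(input, input + 1, input + 2);
    }
}
```

The caller gets everything from a single expression, e.g. ResultTypeDemo.ReturnMultiple(5).Output2, without declaring variables up front.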
There's one reason to use an out param which has not already been mentioned: the calling method is obliged to receive it. If your method produces a value which the caller should not discard, making it an out forces the caller to specifically accept it:
Method1(); // Return values can be discarded quite easily, even accidentally
int resultCode;
Method2(out resultCode); // Out params are a little harder to ignore
Of course the caller can still ignore the value in an out param, but you've called their attention to it.
This is a rare need; more often, you should use an exception for a genuine problem or return an object with state information for an "FYI", but there could be circumstances where this is important.
It's mainly a matter of preference.
I prefer return values, and if you have multiple values to return you can wrap them in a Result DTO:
public class Result{
public Person Person {get;set;}
public int Sum {get;set;}
}
You should almost always use a return value. 'out' parameters create a bit of friction to a lot of APIs, compositionality, etc.
The most noteworthy exception that springs to mind is when you want to return multiple values (.Net Framework doesn't have tuples until 4.0), such as with the TryParse pattern.
You can only have one return value whereas you can have multiple out parameters.
You only need to consider out parameters in those cases.
However, if you need to return more than one parameter from your method, you probably want to look at what you're returning from an OO approach and consider whether you're better off returning an object or a struct with these parameters. Therefore you're back to a return value again.
I would prefer the following instead of either of those in this simple example.
public int Value
{
get;
private set;
}
But, they are all very much the same. Usually, one would only use 'out' if they need to pass multiple values back from the method. If you want to send a value in and out of the method, one would choose 'ref'. My method is best, if you are only returning a value, but if you want to pass a parameter and get a value back one would likely choose your first choice.
I think one of the few scenarios where it would be useful would be when working with unmanaged memory, and you want to make it obvious that the "returned" value should be disposed of manually, rather than expecting it to be disposed of on its own.
Additionally, return values are compatible with asynchronous design paradigms.
You cannot designate a function "async" if it uses ref or out parameters.
In summary, Return Values allow method chaining, cleaner syntax (by eliminating the necessity for the caller to declare additional variables), and allow for asynchronous designs without the need for substantial modification in the future.
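To make the async point concrete, here is a minimal sketch. The method names and the constant 42 are invented for illustration, and Task.Delay merely stands in for real asynchronous work.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncDemo
{
    // This would not compile, because async methods cannot declare ref or out parameters:
    // public static async Task GetValueAsync(out int value) { ... }

    // The return-value form works naturally with async/await.
    public static async Task<int> GetValueAsync()
    {
        await Task.Delay(10);   // placeholder for real asynchronous work
        return 42;
    }

    // Awaited results compose like ordinary return values.
    public static async Task<int> GetDoubledAsync()
    {
        int value = await GetValueAsync();
        return value * 2;
    }
}
```

A method that started out with a plain int return type can become Task<int> later with little change for callers beyond adding await, whereas an out-based signature would have to be redesigned.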
As others have said: return value, not out param.
May I recommend to you the book "Framework Design Guidelines" (2nd ed)? Pages 184-185 cover the reasons for avoiding out params. The whole book will steer you in the right direction on all sorts of .NET coding issues.
Allied with Framework Design Guidelines is the use of the static analysis tool, FxCop. You'll find this on Microsoft's sites as a free download. Run this on your compiled code and see what it says. If it complains about hundreds and hundreds of things... don't panic! Look calmly and carefully at what it says about each and every case. Don't rush to fix things ASAP. Learn from what it is telling you. You will be put on the road to mastery.
Using the out keyword with a return type of bool, can sometimes reduce code bloat and increase readability. (Primarily when the extra info in the out param is often ignored.) For instance:
var result = DoThing();
if (result.Success)
{
result = DoOtherThing();
if (result.Success)
{
result = DoFinalThing();
if (result.Success)
{
success = true;
}
}
}
vs:
Result result; // "var" needs an initializer, so a type must be named; Result is a placeholder name
if (DoThing(out result))
{
if (DoOtherThing(out result))
{
if (DoFinalThing(out result))
{
success = true;
}
}
}
There is no real difference. Out parameters exist in C# to allow a method to return more than one value, that's all.
However, there are some slight differences, but none of them are really important:
Using an out parameter forces you to use two lines, like:
int n;
GetValue(out n);
while using return value will let you do it in one line:
int n = GetValue();
Another difference (correct only for value types and only if the function is not inlined) is that using a return value will necessarily make a copy of the value when the function returns, while using an out parameter will not necessarily do so.
Please avoid using out parameters.
Although they can make sense in certain situations (for example when implementing the Try-Parse pattern), they are very hard to grasp.
The chance of introducing bugs or side effects, either by yourself (unless you are very experienced with the concept) or by other developers (who use your API or may inherit your code), is very high.
According to Microsoft's quality rule CA1021:
Although return values are commonplace and heavily used, the correct application of out and ref parameters requires intermediate design and coding skills. Library architects who design for a general audience should not expect users to master working with out or ref parameters.
Therefore, if there is not a very good reason, please just don't use out or ref.
See also:
Is using "out" bad practice
https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca1021
They have different purposes and are not treated the same by the compiler. If your method needs to return a value, then you must use return. Out is used where your method needs to return multiple values.
If you use return, then the data is first written to the method's stack and then to the calling method's. In the case of out, it is written directly to the calling method's stack. I'm not sure if there are any more differences.
out is more useful when you are trying to return an object that you declare in the method.
Example
public BookList Find(string key)
{
BookList book; //BookList is a model class
_books.TryGetValue(key, out book); //_books is a concurrent dictionary
//TryGetValue gets an item with matching key and returns it into book.
return book;
}
A return value is the normal value that is returned by your method.
As for out parameters: out and ref are two C# keywords that allow variables to be passed by reference.
The big difference between ref and out is that a ref argument must be initialised before the call, whereas an out argument need not be.
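A short sketch of that difference. The method names SetToTen, AddTen and Run are invented for illustration.

```csharp
using System;

public static class RefOutDemo
{
    // out: the method must assign the parameter before it returns,
    // and it cannot read the caller's value first.
    public static void SetToTen(out int value)
    {
        value = 10;
    }

    // ref: the method may read the caller's value as well as change it.
    public static void AddTen(ref int value)
    {
        value += 10;
    }

    public static int Run()
    {
        int a;              // no initial value is needed for an out argument
        SetToTen(out a);    // a is now 10

        int b = 5;          // a ref argument must be definitely assigned before the call
        AddTen(ref b);      // b is now 15

        return a + b;
    }
}
```

Removing the initializer from b makes the compiler reject the AddTen call, while a is accepted uninitialised because the callee is obliged to assign it.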
I suspect I'm not going to get a look-in on this question, but I am a very experienced programmer, and I hope some of the more open-minded readers will pay attention.
I believe that it suits object-oriented programming languages better for their value-returning procedures (VRPs) to be deterministic and pure.
'VRP' is the modern academic name for a function that is called as part of an expression, and has a return value that notionally replaces the call during evaluation of the expression. E.g. in a statement such as x = 1 + f(y) the function f is serving as a VRP.
'Deterministic' means that the result of the function depends only on the values of its parameters. If you call it again with the same parameter values, you are certain to get the same result.
'Pure' means no side-effects: calling the function does nothing except computing the result. This can be interpreted to mean no important side-effects, in practice, so if the VRP outputs a debugging message every time it is called, for example, that can probably be ignored.
Thus, if, in C#, your function is not deterministic and pure, I say you should make it a void function (in other words, not a VRP), and any value it needs to return should be returned in either an out or a ref parameter.
For example, if you have a function to delete some rows from a database table, and you want it to return the number of rows it deleted, you should declare it something like this:
public void DeleteBasketItems(BasketItemCategory category, out int count);
If you sometimes want to call this function but not get the count, you could always declare an overload.
You might want to know why this style suits object-oriented programming better. Broadly, it fits into a style of programming that could be (a little imprecisely) termed 'procedural programming', and it is a procedural programming style that fits object-oriented programming better.
Why? The classical model of objects is that they have properties (aka attributes), and you interrogate and manipulate the object (mainly) through reading and updating those properties. A procedural programming style tends to make it easier to do this, because you can execute arbitrary code in between operations that get and set properties.
The downside of procedural programming is that, because you can execute arbitrary code all over the place, you can get some very obtuse and bug-vulnerable interactions via global variables and side-effects.
So, quite simply, it is good practice to signal to someone reading your code that a function could have side-effects by making it non-value returning.
Suppose you have to process a sequence of InputType that produces two sequences one of type OutputType and the other of type ErrorType.
A basic implementation could be:
class SeqProcessor {
private IEnumerable<ErrorType> errorTypes;
public SeqProcessor()
{
this.errorTypes = Enumerable.Empty<ErrorType>();
}
public IEnumerable<ErrorType> Errors
{
get { return this.errorTypes; }
}
public IEnumerable<OutputType> ProcessItems(IEnumerable<InputType> inputTypes)
{
yield return new OutputType();
if (err) this.errorTypes = this.errorTypes.Concat(new[] { new ErrorType() });
yield return new OutputType();
yield return new OutputType();
if (err) this.errorTypes = this.errorTypes.Concat(new[] { new ErrorType() });
// ...
yield break;
}
}
I see these two alternatives for example:
Use a common interface (e.g. IProduct) between OutputType and ErrorType and let ProcessItems return IEnumerable<IProduct> (then discriminate using LINQ).
Define a subclass of ErrorType called NoError and let ProcessItems return tuples IEnumerable<Tuple<OutputType, ErrorType>> (if no error, NoError will be used in the tuple).
Edit:
Since ErrorType are semantically different from OutputType, mixing these types could be a violation of Single Responsibility Principle.
Can the use of a delegate be an acceptable alternative design:
class SeqProcessor {
public IEnumerable<OutputType> ProcessItems(
IEnumerable<InputType> inputTypes,
Action<ErrorType> onError)
{
yield return new OutputType();
// ...
onError(new ErrorType());
}
}
Which approach do you use in such cases?
The second approach suggests that a NoError instance is a specialization of an ErrorType; this would rarely be true in practice. More likely the shared functionality between the two is small, making the first approach better.
Depending on what exactly you want to achieve, I see multiple possible solutions here:
Stay with the original implementation (where I would replace private IEnumerable<ErrorType> errorTypes with something that allows you to determine the item the error belongs to). In this context, the errors you are encountering would have the significance of a warning (which is why I would also prefer the name Warning) because they are separated from the actual result.
Using a common interface for both result types (that is, output and error) would only make sense if other functions consuming the resulting list could really make use of the error output. I doubt that this is what you intended, but imho this would be a valid design choice.
As Pieter pointed out, having a sub-class NoError of ErrorType would really be nasty. However, a nicer solution would be using ResultType as a base for the types NoError and Error. That way, you really have specialization of the base class. Still, I wonder what the output will contain in case of an error. The original element? A processed, but invalid element? Null? Depending on what you want to achieve, this could be reasonable, but this is hard to tell from the given information and, to be honest, I doubt that is what you want.
The OnError approach is good practice in many contexts because it allows for great flexibility. However, you will still have to think about what the corresponding entry in the result will be in such a case. Imho, it will probably be the best choice to simply leave it out, in order to avoid having to handle either null or special values.
All in all, the OnError approach seems to be the most promising, even though additional information may drive you towards one of the other mentioned approaches.
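Here is a rough, runnable sketch of the OnError variant with failed items simply left out of the output. To keep it self-contained it uses string, int and string in place of InputType, OutputType and ErrorType, and treats "not parseable as an integer" as the error condition; both are assumptions made only for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SeqProcessorDemo
{
    // Yields one output per valid input and reports invalid inputs through onError.
    // Invalid inputs produce no entry in the output sequence.
    public static IEnumerable<int> ProcessItems(IEnumerable<string> inputs, Action<string> onError)
    {
        foreach (string input in inputs)
        {
            int value;
            if (int.TryParse(input, out value))
            {
                yield return value;
            }
            else
            {
                onError("Cannot parse '" + input + "'");
            }
        }
    }
}
```

One thing to keep in mind with this design: because ProcessItems is an iterator, nothing runs until the result is enumerated, so the callback fires lazily, interleaved with the outputs, and fires again if the sequence is enumerated a second time.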
Suppose I have some code like this:
public string SomeMethod(int Parameter)
{
string TheString = "";
TheString = SomeOtherMethod(Parameter);
return TheString;
}
Of course, this code is equivalent to this:
public string SomeMethod(int Parameter)
{
return SomeOtherMethod(Parameter);
}
I think the first version is more readable and that's how I'm writing my code, even though I'm using a variable when I know I could avoid it.
My question is this: does the compiler compile the code in the same way (i.e. same performance), or is the second option really better in terms of performance?
Thanks.
I'd say the first form is less readable and it contains a redundant initializer. Why initialize the variable to "" if you're about to give it a different value? At least change it to:
public string SomeMethod(int parameter)
{
string returnValue = SomeOtherMethod(parameter);
return returnValue;
}
or if you really want to separate declaration from initialization:
public string SomeMethod(int parameter)
{
string returnValue;
returnValue = SomeOtherMethod(parameter);
return returnValue;
}
(Note that I've also adjusted the names to follow .NET naming conventions and to give a more meaningful name to the local variable - "TheString" conveys no useful meaning.)
You really won't see any performance problems from using the local variable, but I'd really encourage you to think about the readability. What is the purpose of the local variable here? You'd presumably describe the method as: "Returns the result of calling SomeOtherMethod with the given parameter" - at which point, the one-line version implements exactly that description.
The compiler will produce very similar code for your two examples. One slight modification though is to avoid initializing to an empty string that you never use.
public string SomeMethod(int Parameter)
{
string result;
result = SomeOtherMethod(Parameter);
return result;
}
I'm not sure rewriting the code in this way makes it more readable, but it does mean that you can add a breakpoint and see the value of result before the method returns. This can be useful when debugging.
Note you can combine the first and second line and still get this benefit:
public string SomeMethod(int Parameter)
{
string result = SomeOtherMethod(Parameter);
return result;
}
I think this last version is both highly readable and easy to debug.
An answer has already been posted, but let me give a different take:
There are three things that you are looking for:
Readability, performance, and usefulness (for debugging, logging, etc.).
1. Readability is somewhat relative. What Eric Lippert or Jon Skeet finds more readable will not necessarily be more readable to me. The more you code, the more your perspective on reading code will change.
Both choices you gave are readable; for me the second is more readable.
2. Performance: In the first choice, as you might be aware, strings are immutable, so reassigning the variable does not modify the earlier (interned) string; the variable is simply pointed at a different string.
So from a performance perspective, initializing a variable to a value that is never used is unnecessary work. Again, this is relative and depends on the size and complexity of the application.
For this you should go with the second option. Your second option and Jon's answer will result in the same performance.
3. Debugging: you would want to have a local variable if this is what you are looking for.
One of my coworkers and I often argue about "the right way" of writing 'Get' methods.
My opinion is object GetSomeObject(). My colleague thinks void GetSomeObject(object obj) is better. I know the result is one and the same in both cases. I would like to hear other opinions. Oh, I forgot to say which platform we are talking about: the .NET Framework, and the language is C#.
If it is a simple get then it should be a property
public object SomeObject
{
get { return _someObj; }
}
If it is computational, then
object GetSomeObject() { ... }
is far more commonly expected. Besides, the other form would need a ref or out argument, which is discouraged when a return value can achieve the same thing.
void GetSomeObject(object obj) won't actually 'get' anything. If you turned it into an out parameter you could assign a value to it, and it would technically work, but why, when you can use return types exactly as they were intended:
public void GetSomeObject(out object returnObject)
{
returnObject = ...
}
or
public object GetSomeObject()
{
return ...
}
Of course object GetSomeObject() is better. The other one is more a setter than a getter...
It reads easier and is good coding practice to return a value, over modifying a reference parameter (which to me at least, is an old-school way of getting values back)
If you're looking for a deeper meaning:
Function parameters in C# are created as value parameters by default (as opposed to reference parameters).
To change the parameter value and persist that change to the calling code, the parameter needs to be declared reference. You probably know all this.
Here's the difference: a value parameter receives a copy of the argument, whereas a ref parameter receives the address of the caller's variable, so each access goes through an extra level of indirection. So there is a fractional overhead when using reference parameters.
In most cases this is not a problem at all, but some issues might pop up, like:
recursive functions that use up too much stack (and you get a stack overflow)
functions that require the speed, like calculating primes or fractals
There's probably more, and better examples, than what I gave, you get the idea though.
object getSomeObject(); is better.
Generally speaking the first way is more typical and better because object GetSomeObject() allows you to do the following: GetSomeObject().Foo(). And it's somewhat more intuitive.
However, bool GetSomeObject( out object obj ) can be useful as in the case of TryGetValue() in the Dictionary class.
I can pretty much only see a single reason for using void GetSomeObject(out object obj) instead of object GetSomeObject() and that is if you get rid of the void and instead do something like ErrorResult GetSomeObject(out object obj) (and the GetSomeObject-operation is hairy and error-prone) since you can then report status via the return value.
However, that would still be better handled via plain old Exceptions IMHO. Though I know that some coding standards say that exceptions shouldn't be used at all and in those cases you might want to do something like this.
Still, I'd say just go with a property or the object GetSomeObject() unless you have a really good reason not to.
I'm curious if any developers use string.IsNullOrEmpty() more often with a negative than with a positive
e.g.
if (!string.IsNullOrEmpty(value))
This is how I use the method 99% of the time. What was the design decision for this?
Because "IsNullOrEmpty" is easier to understand than "NotNullOrEmpty". The latter could be interpreted as:
It's not null and it's not empty
It's not null or it is empty
Double negatives are usually discouraged in naming stuff. !string.NotNullOrEmpty(...) would make one.
For those logicians out there, !string.IsNullOrEmpty is not equivalent to string.IsNotNullOrEmpty. @Guffa has it correct. By De Morgan's law, it would have to be string.IsNotNullAndNotEmpty to be equivalent.
¬(null ∨ empty) ⇔ ¬null ∧ ¬empty
¬(null ∨ empty) ≠ ¬null ∨ empty
The point here, I guess, is that the current name is unambiguous, whereas making the opposite unambiguous would be cumbersome.
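The two readings can be checked over the three interesting inputs (null, empty, non-empty). This is only a sketch; NamingDemo and its three method names are invented for the comparison and are not framework methods.

```csharp
using System;

public static class NamingDemo
{
    // "not (null or empty)": what !string.IsNullOrEmpty(s) computes.
    public static bool NotNullOrEmpty(string s)
    {
        return !string.IsNullOrEmpty(s);
    }

    // De Morgan form: "(not null) and (not empty)".
    public static bool NotNullAndNotEmpty(string s)
    {
        return s != null && s.Length != 0;
    }

    // The other possible reading of the name: "(not null) or empty".
    public static bool IsNotNullOrIsEmpty(string s)
    {
        return s != null || s == "";
    }

    public static void Main()
    {
        foreach (string s in new string[] { null, "", "abc" })
        {
            Console.WriteLine("{0} {1} {2}",
                NotNullOrEmpty(s), NotNullAndNotEmpty(s), IsNotNullOrIsEmpty(s));
        }
    }
}
```

The first two methods agree on all three inputs; the third gives a different answer for the empty string, which is exactly the ambiguity in the name.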
C# naming conventions dictate that your expressions should be in the positive such as "Is..." and not "IsNot..."
EDIT: Typically, I use it when doing error checking and input validation at the beginning of a method and raise an exception if the parameter is null or empty.
if (string.IsNullOrEmpty(myParameter))
{
throw new ....
}
I always create an extension method for "HasContent()" which generally makes sense, follows the "positive" specifications, and saves on code bloat because I use it much more often than its counterpart:
public static bool HasContent(this string s) {
return !string.IsNullOrEmpty(s);
}
I prefer the extension method:
public static class StringExtensions
{
public static bool IsNullOrEmpty(this string value)
{
return string.IsNullOrEmpty(value);
}
}
I find it reads better to say:
if(myValue.IsNullOrEmpty())
or
if(!myValue.IsNullOrEmpty())
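One thing worth spelling out: because an extension method is compiled as a static call, this works even when the variable is null. A small sketch, where ExtensionDemo is a made-up class name and StringExtensions repeats the class above so the snippet is self-contained:

```csharp
using System;

public static class StringExtensions
{
    public static bool IsNullOrEmpty(this string value)
    {
        return string.IsNullOrEmpty(value);
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        string myValue = null;

        // Compiles to StringExtensions.IsNullOrEmpty(myValue), so no
        // NullReferenceException is thrown here.
        Console.WriteLine(myValue.IsNullOrEmpty()); // True
        Console.WriteLine("abc".IsNullOrEmpty());   // False
    }
}
```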
Perhaps because then the name would have to be the lengthy IsNotNullAndNotEmpty to be as specific.
Of course, from .NET 4.0 on you can use string.IsNullOrWhiteSpace(string) instead of string.IsNullOrEmpty(string).
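The two checks are not interchangeable, though. A quick sketch of where they differ; WhiteSpaceDemo and ChecksDisagree are hypothetical names for this example only.

```csharp
using System;

public static class WhiteSpaceDemo
{
    // True when the two framework checks give different answers for s.
    public static bool ChecksDisagree(string s)
    {
        return string.IsNullOrEmpty(s) != string.IsNullOrWhiteSpace(s);
    }

    public static void Main()
    {
        foreach (string s in new string[] { null, "", "   ", "abc" })
        {
            Console.WriteLine("IsNullOrEmpty={0} IsNullOrWhiteSpace={1}",
                string.IsNullOrEmpty(s), string.IsNullOrWhiteSpace(s));
        }
    }
}
```

Only the whitespace-only string is treated differently: IsNullOrEmpty returns false for it, IsNullOrWhiteSpace returns true.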
That is the most common usage I have seen.
"NotNullOrEmpty" is ambiguous: it could mean "(not null) or empty" or it could mean "not (null or empty)". To make it unambiguous you'd have to use "NotNullAndNotEmpty", which is a mouthful.
Also, the "IsNullOrEmpty" naming encourages use as a guard clause, which I think is useful. E.g.:
if (String.IsNullOrEmpty(someString))
{
// error handling
return;
}
// do stuff
which I think is generally cleaner than:
if (!String.IsNullOrEmpty(someString))
{
// do stuff
}
else
{
// error handling
return;
}
I would actually be inclined to offer a different answer from the "it's ambiguous" explanation provided by several others (though I agree with that answer as well):
Personally, I like to minimize nesting in my code, as (to me) the more curly braces code has, the harder it becomes to follow.
Therefore I'd much prefer this (for example):
public bool DoSomethingWithString(string s) {
if (string.IsNullOrEmpty(s))
return false;
// here's the important code, not nested
}
to this:
public bool DoSomethingWithString(string s) {
if (!string.IsNullOrEmpty(s)) {
// here's the important code, nested
} else {
return false;
}
}
This is a pretty specific scenario (where a null/empty string prompts an immediate exit) and clearly isn't the way a method using IsNullOrEmpty would always be structured; but I think it's actually pretty common.
Personally I prefer to cater for the non negated scenario first. It just makes sense to me to do the true part first and then the false. Comes down to personal style.
I've always thought it seemed the wrong way round as I use the negative much more often than the positive.
I would also like there to be an instance IsEmpty() or IsNotEmpty() for use when the variable is declared within the function. This could not be IsNullOrEmpty() or IsNotNullOrEmpty(), because if the instance were null an instance method would throw a NullReferenceException.
I had the same question before I realized that all I had to do to flip the question was to put the Not operator in front of the conditional. I think it cleaned up my code some.
// need to check if tBx_PTNum.Text is empty
/*
if (string.IsNullOrWhiteSpace(tBx_PTNum.Text))
{
// no pt number yet
}
else
{
ptNum = Convert.ToInt32(tBx_PTNum.Text);
}
*/
if (!string.IsNullOrWhiteSpace(tBx_PTNum.Text))
{
ptNum = Convert.ToInt32(tBx_PTNum.Text);
}