Related
If we want to get a value from a method, we can use either return value, like this:
public int GetValue();
or:
public void GetValue(out int x);
I don't really understand the differences between them, and so, don't know which is better. Can you explain me this?
Thank you.
Return values are almost always the right choice when the method doesn't have anything else to return. (In fact, I can't think of any cases where I'd ever want a void method with an out parameter, if I had the choice. C# 7's Deconstruct methods for language-supported deconstruction acts as a very, very rare exception to this rule.)
Aside from anything else, it stops the caller from having to declare the variable separately:
int foo;
GetValue(out foo);
vs
int foo = GetValue();
Out values also prevent method chaining like this:
Console.WriteLine(GetValue().ToString("g"));
(Indeed, that's one of the problems with property setters as well, and it's why the builder pattern uses methods which return the builder, e.g. myStringBuilder.Append(xxx).Append(yyy).)
Additionally, out parameters are slightly harder to use with reflection and usually make testing harder too. (More effort is usually put into making it easy to mock return values than out parameters). Basically there's nothing I can think of that they make easier...
Return values FTW.
EDIT: In terms of what's going on...
Basically when you pass in an argument for an "out" parameter, you have to pass in a variable. (Array elements are classified as variables too.) The method you call doesn't have a "new" variable on its stack for the parameter - it uses your variable for storage. Any changes in the variable are immediately visible. Here's an example showing the difference:
using System;
class Test
{
static int value;
static void ShowValue(string description)
{
Console.WriteLine(description + value);
}
static void Main()
{
Console.WriteLine("Return value test...");
value = 5;
value = ReturnValue();
ShowValue("Value after ReturnValue(): ");
value = 5;
Console.WriteLine("Out parameter test...");
OutParameter(out value);
ShowValue("Value after OutParameter(): ");
}
static int ReturnValue()
{
ShowValue("ReturnValue (pre): ");
int tmp = 10;
ShowValue("ReturnValue (post): ");
return tmp;
}
static void OutParameter(out int tmp)
{
ShowValue("OutParameter (pre): ");
tmp = 10;
ShowValue("OutParameter (post): ");
}
}
Results:
Return value test...
ReturnValue (pre): 5
ReturnValue (post): 5
Value after ReturnValue(): 10
Out parameter test...
OutParameter (pre): 5
OutParameter (post): 10
Value after OutParameter(): 10
The difference is at the "post" step - i.e. after the local variable or parameter has been changed. In the ReturnValue test, this makes no difference to the static value variable. In the OutParameter test, the value variable is changed by the line tmp = 10;
What's better, depends on your particular situation. One of the reasons out exists is to facilitate returning multiple values from one method call:
public int ReturnMultiple(int input, out int output1, out int output2)
{
output1 = input + 1;
output2 = input + 2;
return input;
}
So one is not by definition better than the other. But usually you'd want to use a simple return, unless you have the above situation for example.
EDIT:
This is a sample demonstrating one of the reasons that the keyword exists. The above is in no way to be considered a best practise.
You should generally prefer a return value over an out param. Out params are a necessary evil if you find yourself writing code that needs to do 2 things. A good example of this is the Try pattern (such as Int32.TryParse).
Let's consider what the caller of your two methods would have to do. For the first example I can write this...
int foo = GetValue();
Notice that I can declare a variable and assign it via your method in one line. FOr the 2nd example it looks like this...
int foo;
GetValue(out foo);
I'm now forced to declare my variable up front and write my code over two lines.
update
A good place to look when asking these types of question is the .NET Framework Design Guidelines. If you have the book version then you can see the annotations by Anders Hejlsberg and others on this subject (page 184-185) but the online version is here...
http://msdn.microsoft.com/en-us/library/ms182131(VS.80).aspx
If you find yourself needing to return two things from an API then wrapping them up in a struct/class would be better than an out param.
There's one reason to use an out param which has not already been mentioned: the calling method is obliged to receive it. If your method produces a value which the caller should not discard, making it an out forces the caller to specifically accept it:
Method1(); // Return values can be discard quite easily, even accidentally
int resultCode;
Method2(out resultCode); // Out params are a little harder to ignore
Of course the caller can still ignore the value in an out param, but you've called their attention to it.
This is a rare need; more often, you should use an exception for a genuine problem or return an object with state information for an "FYI", but there could be circumstances where this is important.
It's preference mainly
I prefer returns and if you have multiple returns you can wrap them in a Result DTO
public class Result{
public Person Person {get;set;}
public int Sum {get;set;}
}
You should almost always use a return value. 'out' parameters create a bit of friction to a lot of APIs, compositionality, etc.
The most noteworthy exception that springs to mind is when you want to return multiple values (.Net Framework doesn't have tuples until 4.0), such as with the TryParse pattern.
You can only have one return value whereas you can have multiple out parameters.
You only need to consider out parameters in those cases.
However, if you need to return more than one parameter from your method, you probably want to look at what you're returning from an OO approach and consider if you're better off return an object or a struct with these parameters. Therefore you're back to a return value again.
I would prefer the following instead of either of those in this simple example.
public int Value
{
get;
private set;
}
But, they are all very much the same. Usually, one would only use 'out' if they need to pass multiple values back from the method. If you want to send a value in and out of the method, one would choose 'ref'. My method is best, if you are only returning a value, but if you want to pass a parameter and get a value back one would likely choose your first choice.
I think one of the few scenarios where it would be useful would be when working with unmanaged memory, and you want to make it obvious that the "returned" value should be disposed of manually, rather than expecting it to be disposed of on its own.
Additionally, return values are compatible with asynchronous design paradigms.
You cannot designate a function "async" if it uses ref or out parameters.
In summary, Return Values allow method chaining, cleaner syntax (by eliminating the necessity for the caller to declare additional variables), and allow for asynchronous designs without the need for substantial modification in the future.
As others have said: return value, not out param.
May I recommend to you the book "Framework Design Guidelines" (2nd ed)? Pages 184-185 cover the reasons for avoiding out params. The whole book will steer you in the right direction on all sorts of .NET coding issues.
Allied with Framework Design Guidelines is the use of the static analysis tool, FxCop. You'll find this on Microsoft's sites as a free download. Run this on your compiled code and see what it says. If it complains about hundreds and hundreds of things... don't panic! Look calmly and carefully at what it says about each and every case. Don't rush to fix things ASAP. Learn from what it is telling you. You will be put on the road to mastery.
Using the out keyword with a return type of bool, can sometimes reduce code bloat and increase readability. (Primarily when the extra info in the out param is often ignored.) For instance:
var result = DoThing();
if (result.Success)
{
result = DoOtherThing()
if (result.Success)
{
result = DoFinalThing()
if (result.Success)
{
success = true;
}
}
}
vs:
var result;
if (DoThing(out result))
{
if (DoOtherThing(out result))
{
if (DoFinalThing(out result))
{
success = true;
}
}
}
There is no real difference. Out parameters are in C# to allow method return more then one value, that's all.
However There are some slight differences , but non of them are really important:
Using out parameter will enforce you to use two lines like:
int n;
GetValue(n);
while using return value will let you do it in one line:
int n = GetValue();
Another difference (correct only for value types and only if C# doesn't inline the function) is that using return value will necessarily make a copy of the value when the function return, while using OUT parameter will not necessarily do so.
Please avoid using out parameters.
Although, they can make sense in certain situations (for example when implementing the Try-Parse Pattern), they are very hard to grasp.
Chances to introduce bugs or side effects by yourself (unless you are very experienced with the concept) and by other developers (who either use your API or may inherit your code) is very high.
According to Microsoft's quality rule CA1021:
Although return values are commonplace and heavily used, the correct application of out and ref parameters requires intermediate design and coding skills. Library architects who design for a general audience should not expect users to master working with out or ref parameters.
Therefore, if there is not a very good reason, please just don't use out or ref.
See also:
Is using "out" bad practice
https://learn.microsoft.com/en-us/dotnet/fundamentals/code-analysis/quality-rules/ca1021
Both of them have a different purpose and are not treated the same by the compiler. If your method needs to return a value, then you must use return. Out is used where your method needs to return multiple values.
If you use return, then the data is first written to the methods stack and then in the calling method's. While in case of out, it is directly written to the calling methods stack. Not sure if there are any more differences.
out is more useful when you are trying to return an object that you declare in the method.
Example
public BookList Find(string key)
{
BookList book; //BookList is a model class
_books.TryGetValue(key, out book) //_books is a concurrent dictionary
//TryGetValue gets an item with matching key and returns it into book.
return book;
}
return value is the normal value which is returned by your method.
Where as out parameter, well out and ref are 2 key words of C# they allow to pass variables as reference.
The big difference between ref and out is, ref should be initialised before and out don't
I suspect I'm not going to get a look-in on this question, but I am a very experienced programmer, and I hope some of the more open-minded readers will pay attention.
I believe that it suits object-oriented programming languages better for their value-returning procedures (VRPs) to be deterministic and pure.
'VRP' is the modern academic name for a function that is called as part of an expression, and has a return value that notionally replaces the call during evaluation of the expression. E.g. in a statement such as x = 1 + f(y) the function f is serving as a VRP.
'Deterministic' means that the result of the function depends only on the values of its parameters. If you call it again with the same parameter values, you are certain to get the same result.
'Pure' means no side-effects: calling the function does nothing except computing the result. This can be interpreted to mean no important side-effects, in practice, so if the VRP outputs a debugging message every time it is called, for example, that can probably be ignored.
Thus, if, in C#, your function is not deterministic and pure, I say you should make it a void function (in other words, not a VRP), and any value it needs to return should be returned in either an out or a ref parameter.
For example, if you have a function to delete some rows from a database table, and you want it to return the number of rows it deleted, you should declare it something like this:
public void DeleteBasketItems(BasketItemCategory category, out int count);
If you sometimes want to call this function but not get the count, you could always declare an overloading.
You might want to know why this style suits object-oriented programming better. Broadly, it fits into a style of programming that could be (a little imprecisely) termed 'procedural programming', and it is a procedural programming style that fits object-oriented programming better.
Why? The classical model of objects is that they have properties (aka attributes), and you interrogate and manipulate the object (mainly) through reading and updating those properties. A procedural programming style tends to make it easier to do this, because you can execute arbitrary code in between operations that get and set properties.
The downside of procedural programming is that, because you can execute arbitrary code all over the place, you can get some very obtuse and bug-vulnerable interactions via global variables and side-effects.
So, quite simply, it is good practice to signal to someone reading your code that a function could have side-effects by making it non-value returning.
I'm trying to formalise the usage of the "out" keyword in c# for a project I'm on, particularly with respect to any public methods. I can't seem to find any best practices out there and would like to know what is good or bad.
Sometimes I'm seeing some methods signatures that look like this:
public decimal CalcSomething(Date start, Date end, out int someOtherNumber){}
At this point, it's just a feeling, this doesn't sit well with me. For some reason, I'd prefer to see:
public Result CalcSomething(Date start, Date end){}
where the result is a type that contains a decimal and the someOtherNumber. I think this makes it easier to read. It allows Result to be extended or have properties added without breaking code. It also means that the caller of this method doesn't have to declare a locally scoped "someOtherNumber" before calling. From usage expectations, not all callers are going to be interested in "someOtherNumber".
As a contrast, the only instances that I can think of right now within the .Net framework where "out" parameters make sense are in methods like TryParse(). These actually make the caller write simpler code, whereby the caller is primarily going to be interested in the out parameter.
int i;
if(int.TryParse("1", i)){
DoSomething(i);
}
I'm thinking that "out" should only be used if the return type is bool and the expected usages are where the "out" parameters will always be of interest to the caller, by design.
Thoughts?
There is a reason that one of the static code analysis (=FxCop) rules points at you when you use out parameters. I'd say: only use out when really needed in interop type scenarios. In all other cases, simply do not use out. But perhaps that's just me?
This is what the .NET Framework Developer's Guide has to say about out parameters:
Avoid using out or reference parameters.
Working with members
that define out or reference
parameters requires that the developer
understand pointers, subtle
differences between value types and
reference types, and initialization
differences between out and reference
parameters.
But if you do use them:
Do place all out parameters after all of the pass-by-value and ref
parameters (excluding parameter
arrays), even if this results in an
inconsistency in parameter ordering
between overloads.
This convention makes the method
signature easier to understand.
Your approach is better than out, because you can "chain" calls that way:
DoSomethingElse(DoThing(a,b).Result);
as opposed to
DoThing(a, out b);
DoSomethingElse(b);
The TryParse methods implemented with "out" was a mistake, IMO. Those would have been very convenient in chains.
There are only very few cases where I would use out. One of them is if your method returns two variables that from an OO point of view do not belong into an object together.
If for example, you want to get the most common word in a text string, and the 42nd word in the text, you could compute both in the same method (having to parse the text only once). But for your application, these informations have no relation to each other: You need the most common word for statistical purposes, but you only need the 42nd word because your customer is a geeky Douglas Adams fan.
Yes, that example is very contrived, but I haven't got a better one...
I just had to add that starting from C# 7, the use of the out keyword makes for very readable code in certain instances, when combined with inline variable declaration. While in general you should rather return a (named) tuple, control flow becomes very concise when a method has a boolean outcome, like:
if (int.TryParse(mightBeCount, out var count)
{
// Successfully parsed count
}
I should also mention, that defining a specific class for those cases where a tuple makes sense, more often than not, is more appropriate. It depends on how many return values there are and what you use them for. I'd say, when more than 3, stick them in a class anyway.
One advantage of out is that the compiler will verify that CalcSomething does in fact assign a value to someOtherNumber. It will not verify that the someOtherNumber field of Result has a value.
Stay away from out. It's there as a low-level convenience. But at a high level, it's an anti-technique.
int? i = Util.TryParseInt32("1");
if(i == null)
return;
DoSomething(i);
If you have even seen and worked with MS
namespace System.Web.Security
MembershipProvider
public abstract MembershipUser CreateUser(string username, string password, string email, string passwordQuestion, string passwordAnswer, bool isApproved, object providerUserKey, out MembershipCreateStatus status);
You will need a bucket. This is an example of a class breaking many design paradigms. Awful!
Just because the language has out parameters doesn't mean they should be used. eg goto
The use of out Looks more like the Dev was either Lazy to create a type or wanted to try a language feature.
Even the completely contrived MostCommonAnd42ndWord example above I would use
List or a new type contrivedresult with 2 properties.
The only good reasons i've seen in the explanations above was in interop scenarios when forced to. Assuming that is valid statement.
You could create a generic tuple class for the purpose of returning multiple values. This seems to be a decent solution but I can't help but feel that you lose a bit of readability by returning such a generic type (Result is no better in that regard).
One important point, though, that james curran also pointed out, is that the compiler enforces an assignment of the value. This is a general pattern I see in C#, that you must state certain things explicitly, for more readable code. Another example of this is the override keyword which you don't have in Java.
If your result is more complex than a single value, you should, if possible, create a result object. The reasons I have to say this?
The entire result is encapsulated. That is, you have a single package that informs the code of the complete result of CalcSomething. Instead of having external code interpret what the decimal return value means, you can name the properties for your previous return value, Your someOtherNumber value, etc.
You can include more complex success indicators. The function call you wrote might throw an exception if end comes before start, but exception throwing is the only way to report errors. Using a result object, you can include a boolean or enumerated "Success" value, with appropriate error reporting.
You can delay the execution of the result until you actually examine the "result" field. That is, the execution of any computing needn't be done until you use the values.
I'm just wondering how other developers tackle this issue of getting 2 or 3 answers from a method.
1) return a object[]
2) return a custom class
3) use an out or ref keyword on multiple variables
4) write or borrow (F#) a simple Tuple<> generic class
http://slideguitarist.blogspot.com/2008/02/whats-f-tuple.html
I'm working on some code now that does data refreshes. From the method that does the refresh I would like to pass back (1) Refresh Start Time and (2) Refresh End Time.
At a later date I may want to pass back a third value.
Thoughts? Any good practices from open source .NET projects on this topic?
It entirely depends on what the results are. If they are related to one another, I'd usually create a custom class.
If they're not really related, I'd either use an out parameter or split the method up. If a method wants to return three unrelated items, it's probably doing too much. The exception to this is when you're talking across a web-service boundary or something else where a "purer" API may be too chatty.
For two, usually 4)
More than that, 2)
Your question points to the possibility that you'll be returning more data in the future, so I would recommend implementing your own class to contain the data.
What this means is that your method signature will remain the same even if the inner representation of the object you're passing around changes to accommodate more data. It's also good practice for readability and encapsulation reasons.
Code Architeture wise i'd always go with a Custom Class when needing somewhat a specific amount of variables changed. Why? Simply because a Class is actually a "blueprint" of an often used data type, creating your own data type, which it in this case is, will help you getting a good structure and helping others programme for your interface.
Personally, I hate out/ref params, so I'd rather not use that approach. Also, most of the time, if you need to return more than one result, you are probably doing something wrong.
If it really is unavoidable, you will probably be happiest in the long run writing a custom class. Returning an array is tempting as it is easy and effective in the short teerm, but using a class gives you the option of changing the return type in the future without having to worry to much about causing problems down stream. Imagine the potential for a debugging nightmare if someone swaps the order of two elements in the array that is returned....
I use out if it's only 1 or 2 additional variables (for example, a function returns a bool that is the actual important result, but also a long as an out parameter to return how long the function ran, for logging purposes).
For anything more complicated, i usually create a custom struct/class.
I think the most common way a C# programmer would do this would be to wrap the items you want to return in a separate class. This would provide you with the most flexibility going forward, IMHO.
It depends. For an internal only API, I'll usually choose the easiest option. Generally that's out.
For a public API, a custom class usually makes more sense - but if it's something fairly primitive, or the natural result of the function is a boolean (like *.TryParse) I'll stick with an out param. You can do a custom class with an implicit cast to bool as well, but that's usually just weird.
For your particular situation, a simple immutable DateRange class seems most appropriate to me. You can easily add that new value without disturbing existing users.
If you're wanting to send back the refresh start and end times, that suggests a possible class or struct, perhaps called DataRefreshResults. If your possible third value is also related to the refresh, then it could be added. Remember, a struct is always passed by value, so it's allocated on the heap does not need to be garbage-collected.
Some people use KeyValuePair for two values. It's not great though because it just labels the two things as Key and Value. Not very descriptive. Also it would seriously benefit from having this added:
public static class KeyValuePair
{
public static KeyValuePair<K, V> Make(K k, V v)
{
return new KeyValuePair<K, V>(k, v);
}
}
Saves you from having to specify the types when you create one. Generic methods can infer types, generic class constructors can't.
For your scenario you may want to define generic Range{T} class (with checks for the range validity).
If method is private, then I usually use tuples from my helper library. Public or protected methods generally always deserve separate.
Return a custom type, but don't use a class, use a struct - no memory allocation/garbage collection overhead means no downsides.
If 2, a Pair.
If more than 2 a class.
Another solution is to return a dictionary of named object references. To me, this is pretty equivalent to using a custom return class, but without the clutter. (And using RTTI and reflection it is just as typesafe as any other solution, albeit dynamically so.)
It depends on the type and meaning of the results, as well as whether the method is private or not.
For private methods, I usually just use a Tuple, from my class library.
For public/protected/internal methods (ie. not private), I use either out parameter or a custom class.
For instance, if I'm implementing the TryXYZ pattern, where you have an XYZ method that throws an exception on failure and a TryXYZ method that returns Boolean, TryXYZ will use an out parameter.
If the results are sequence-oriented (ie. return 3 customers that should be processed) then I will typically return some kind of collection.
Other than that I usually just use a custom class.
If a method outputs two to three related value, I would group them in a type. If the values are unrelated, the method is most likely doing way too much and I would refactor it into a number of simpler methods.
Everyone knows and love String.IsNullOrEmpty(yourString) method.
I was wondering if it's going to confuse developers or make code better if we extend String class to have method like this:
yourString.IsNullOrEmpty();
Pro:
More readable.
Less typing.
Cons:
Can be confusing because yourString
variable can be null and it looks
like you're executing method on a
null variable.
What do you think?
The same question we can ask about myObject.IsNull() method.
That how I would write it:
public static class StringExt
{
public static bool IsNullOrEmpty(this string text)
{
return string.IsNullOrEmpty(text);
}
public static bool IsNull(this object obj)
{
return obj == null;
}
}
If I'm not mistaken, every answer here decries the fact that the extension method can be called on a null instance, and because of this they do not support believe this is a good idea.
Let me counter their arguments.
I don't believe AT ALL that calling a method on an object that may be null is confusing. The fact is that we only check for nulls in certain locations, and not 100% of the time. That means there is a percentage of time where every method call we make is potentially on a null object. This is understood and acceptable. If it wasn't, we'd be checking null before every single method call.
So, how is it confusing that a particular method call may be happening on a null object? Look at the following code:
var bar = foo.DoSomethingResultingInBar();
Console.Writeline(bar.ToStringOr("[null]"));
Are you confused? You should be. Because here's the implementation of foo:
public Bar DoSomethingResultingInBar()
{
return null; //LOL SUCKER!
}
See? You read the code sample without being confused at all. You understood that, potentially, foo would return a null from that method call and the ToStringOr call on bar would result in a NRE. Did your head spin? Of course not. Its understood that this can happen. Now, that ToStringOr method is not familiar. What do you do in these situations? You either read the docs on the method or examine the code of the call. Here it is:
public static class BarExtensions
{
public static string ToStringOr(this bar, string whenNull)
{
return bar == null ? whenNull ?? "[null]" : bar.ToString();
}
}
Confusing? Of course not. Its obvious that the developer wanted a shorthand method of checking if bar is null and substituting a non-null string for it. Doing this can slash your code significantly and increase readability and code reuse. Of course you could do this in other ways, but this way would be no more confusing than any other. For example:
var bar = foo.DoSomethingResultingInBar();
Console.Writeline(ToStringOr(bar, "[null]"));
When you encounter this code, what do you have to differently than the original version? You still have to examine the code, you still have to determine its behavior when bar is null. You still have to deal with this possibility.
Are extension methods confusing? Only if you don't understand them. And, quite frankly, the same can be said for ANY part of the language, from delegates to lambdas.
I think the root of this problem is what Jon Skeet has mentioned in the list of things he hates in his favorite language (C#): C# should not have imported all extension methods in a whole namespace automatically. This process should have been done more explicitly.
My personal opinion about this specific question is (since we can't do anything about the above fact) to use the extension method if you want. I don't say it won't be confusing, but this fact about extension methods (that can be called on null references) is a global thing and doesn't affect only String.IsNullOrEmpty, so C# devs should get familiar with it.
By the way, it's fortunate that Visual Studio clearly identifies extension methods by (extension) in the IntelliSense tooltip.
I'm personally not a fan of doing this. The biggest problem with extension methods right now is discoverability. Unleses you flat out know all of the methods which exist on a particular type, it's not possible to look at a method call and know that it's an extension method. As such I find it problematic to do anything with an extension method that would not be possible with a normal method call. Otherwise you will end up confusing developers.
A corollary to this problem exist in C++. In several C++ implementations it's possible to call instance methods on NULL pointers as long as you don't touch any fields or virtual methods on the type. I've worked with several pieces of code that do this intentionally and give methods differentt behavior when "this==NULL". It's quite maddening to work with.
This is not to say that I don't like extension methods. Quite the contrary, I enjoy them and use them frequently. But I think there are 2 important rules you should follow when writing them.
Treat the actual method implementation as if it's just another static method because it in fact is. For example throw ArgumentException instead of NullReference exception for a null this
Don't let an extension method perform tricks that a normal instance method couldn't do
EDIT Responding to Mehrdad's comment
The problem with taking advantage of it is that I don't see str.IsNullOrEmpty as having a significant functional advantage over String.IsNullOrEmpty(str). The only advantage I see is that one requires less typing than the other. The same could be said about extension methods in general. But in this case you're additionally altering the way people think about program flow.
If shorter typing is what people really want wouldn't IsNullOrEmpty(str) be a much better option? It's both unambiguous and is the shortest of all. True C# has no support for such a feature today. But imagine if in C# I could say
using SomeNamespace.SomeStaticClass;
The result of doing this is that all methods on SomeStaticClass were now in the global namespace and available for binding. This seems to be what people want, but they're attaching it to an extension method which I'm not a huge fan of.
I really like this approach AS LONG AS the method makes it clear that it is checking the object is null. ThrowIfNull, IsNull, IsNullOrEmpty, etc. It is very readable.
Personally, I wouldn't create an extension that does something that already exists in the framework unless it was a significant improvement in usability. In this instance, I don't think that's the case. Also, if I were to create an extension, I would name it in a way as to reduce confusion, not increase it. Again, I think this case fails that test.
Having said all that, I do have a string extension that tests, not only if the string is null or empty, but also if it only contains whitespace. I call it IsNothing. You can find it here.
It doesn't just look like you're calling a method on a null variable. You /are/ calling a method on a null variable, albeit one implemented through a static extension method. I had no idea extension methods (which I was already leery of) supported that. This even allows you to do crazy things like:
public static int PowerLength(this string obj)
{
return obj == null ? 0 : obj.Length;
}
From where I'm standing now, I would classify any use of an extension method on a null reference under considered harmful.
Let's look at the pro:s and con:s...
More readable.
Yes, slightly, but the improvement in readability is outweighed by the fact that it looks like you are calling an instance method on something that doesn't have to be an instance. In effect it's easier to read, but it's harder to understand, so the improved readability is really just an illusion.
Less typing.
Yes, but that is really not a strong argument. If typing is the main part of your programming, you are just not doing something that is remotely challenging enough for you to evolve as a developer.
Can be confusing because yourString variable can be null and it looks like
you're executing method on a null variable.
True, (as mentioned above).
Yes, it will confuse.
I think that everyone who knows about IsNullOrEmpty() perceives the use of it as quite natural. So your suggested extension method will not add more readability.
Maybe for someone new to .NET this extension method might be easier to deal with, but there is the danger possibility that she/he doesn't understand all facets of extension methods (like that you need to import the namespace and that it can be invoked on null). She/he will might wonder why this extension method does not exist in another project. Anyway: Even if someone is new to .NET the syntax of the IsNullOrEmpty() method might become natural quite fast. Also here the benefits of the extension methods will not outweight the confusion caused by it.
Edit: Tried to rephrase what I wanted to say.
I think extending any object with an "IsNull" type call is a little confusing and should be avoided if possible.
It's arguable that the IsEmptyString method might be useful for the String type, and that because you'd usually combine this with a test for null that the IsNullOrEmpty might be useful, but I'd avoid this too due to the fact that the string type already has a static method that does this and I'm not sure you're saving yourself that much typing (5 characters at most).
Calling a method on a variable that is null usually results in a NullReferenceException. The IsNullOrEmpty()-Method deviates from this behaviour in a way that is not predictable from just looking at the code. Therefore I would advise against using it since it creates confusion and the benefit of saving a couple of characters is minimal.
In general, I'm only ok with extension methods being safe to call on null if they have the word 'null' or something like that in their name. That way, I'm clued in to the fact that they may be safe to call with null. Also, they better document that fact in their XML comment header so I get that info when I mouse-over the call.
When comparing class instance to null using ".IsNull()" is not even shorter than using " == null".
Things change in generic classes when the generic type argument without constraint can be a value type.
In such case comparison with default type is lengthy and I use the extension below:
public static bool IsDefault<T>(this T x)
{
return EqualityComparer<T>.Default.Equals(x, default(T));
}
I'm faced with a situation that I think can only be solved by using a ref parameter. However, this will mean changing a method to always accept a ref parameter when I only need the functionality provided by a ref parameter 5% of the time.
This makes me think "whoa, crazy, must find another way". Am I being stupid? What sort of problems can be caused by a ref parameter?
Edit
Further details were requested, I don't think they are entirely relevant to what I was asking but here we go.
I'm wanting to either save a new instance (which will update with the ID which may later be used) or retrieve an existing instance that matches some logic and update that, save it then change the reference of the new instance to point to the existing one.
Code may make it clearer:
protected override void BeforeSave(Log entity)
{
var newLog = entity;
var existingLog = (from log in repository.All()
where log.Stuff == newLog.Stuff
&& log.Id != newLog.Id
select log).SingleOrDefault();
if (existingLog != null)
{
// update the time
existingLog.SomeValue = entity.SomeValue;
// remove the reference to the new entity
entity = existingLog;
}
}
// called from base class which usually does nothing before save
public void Save(TEntity entity)
{
var report = validator.Validate(entity);
if (report.ValidationPassed)
{
BeforeSave(entity);
repository.Save(entity);
}
else
{
throw new ValidationException { Report = report };
}
}
It's the fact that I would be adding it in only for one child (so far) of the base class that prevents me using an overload (due to the fact I would have to duplicate the Save method). I also have the problem whereby I need to force them to use the ref version in this instance otherwise things won't work as expected.
Can you add an overload? Have one signature without the ref parameter, and one with it.
Ref parameters can be useful, and I'm glad they exist in C#, but they shouldn't be used without thought. Often if a method is effectively returning two values, it would be better either to split the method into two parts, or encapsulate both values in a single type. Neither of these covers every case though - there are definitely times when ref is the best option.
Perhaps use an overloaded function for this 5% case and leave the other function as is.
Unnecessary ref parameters can lead to bad design patterns, but if you have a specific need, there's no problem with doing this.
If you take the .NET Framework as a barometer of people's expectations of an API, consider that almost all of the String methods return the modified value, but leave the passed argument unchanged. String.Trim(), for instance, returns the trimmed String - it doesn't trim the String that was passed in as an argument.
Now, obviously, this is only feasible if you're willing to put return-values into your API. Also, if your function already returns a value, you run into the nasty possibility of creating a custom structure that contains your original return value as well as the newly changed object.
Ultimately, it's up to you and how you document your API. I've found in my experience though that my fellow programmers tend to expect my functions to act "like the .NET Framework functions". :)
A ref parameter won't cause problems per se. It's a documented feature of the language.
However it could cause social problems. Specifically, clients of your API might not expect a ref parameter simply because it's rare. You might modify a reference that the client doesn't expect.
Of course you can argue that this is the client's fault for not reading your API spec, and that would be true. But sometimes it's best to reduce surprise. Writing good code isn't just about following the rules and documenting stuff, it's also about making something naturally obvious to a human user.
An overload won't kill your application or its design. As long as the intent is clearly documented, it should be okay.
One thing that might be considered is mitigating your fears about the ref parameter through a different type of parameter. For example, consider this:
public class SaveArgs
{
public SaveArgs(TEntity value) { this.Value = value; }
public TEntity Value { get; private set;}
public int NewId { get; internal set; }
public bool NewIdGenerated { get; internal set; }
}
In your code, you simply pass a SaveArgs rather than the TEntity, so that you can modify its properties with more meaningful information. (Naturally, it'd be a better-designed class than what I have above.) But then you wouldn't have to worry about vague method interfaces, and you could return as much data as you needed to in a verbose class.
Just a thought.
EDIT: Fixed the code. My bad.
There is nothing wrong with using ref parameters that I can think of, in fact they can be very handy sometimes. I think they sometimes get a bad rap due to debugging since the value of the variable can change in the code logic and can sometimes be hard to track. It also makes things difficult when converting to things like WebServices where a "value" only pass will suffice.
The biggest issue I have with ref parameters is they make it difficult to use type inference as you must sometimes explicitly declare the type of the variable being used as the ref parameter.
Most of the time I use a ref parameter it's in a TryGet scenario. In general I've stopped using ref's in that scenario and instead opted for using a more functional style method by way of an option.
For instance. TryGetValue in dictionary switches from
bool TryGetValue(TKey key, out TValue value)
To
Option<Value> TryGetValue(TKey key)
Option available here: http://blogs.msdn.com/jaredpar/archive/2008/10/08/functional-c-providing-an-option-part-2.aspx
The most common use I've seen for ref parameters is as a way of returning multiple values. If that's the case, you should consider creating a class or struct that returns all the values as one object.
If you still want to use a ref but want to make it optional, add a function overload.
ref is just a tool. You should think: What is the best design pattern for what I am building?
Sometimes will be better to use an overloaded method.
Others will be better to return a custom type or a tuple.
Others will be better to use a global variable.
And others ref will be the right decision.
If your method only needs this ref parameter 5% of the time perhaps you need to break this method down. Of course without more details its hard to say but this to me smells like a case of violating single responsability principal. Perhaps overloading it will help.
As for your question there is no issue in my opinion passing a parameter as a reference although it is not a common thing to run into.
This is one of those things that F# or other functional programming languages solve a lot better with returning Tuple values. Its a lot more cleaner and terse syntax. In the book I am reading on F#, it actually points out the C# equivelant of using ref as doing the same thing in C# to return multiple parameters.
I have no idea if it is a bad practice or there is some underlying "booga booga" about ref parameters, to me they just feel as not clean syntax.
A void-returning function with a single reference parameter certainly looks funny to me. If I were reviewing this code, I'd suggest refactoring the BeforeSave() to include the call to Repository.Save() - renaming it, obviously. Why not just have one method that takes your possibly-new entity and guarantees that everything is saved properly? The caller doesn't do anything with the returned entity anyway.