Is extending String class with IsNullOrEmpty confusing? - c#

Everyone knows and love String.IsNullOrEmpty(yourString) method.
I was wondering if it's going to confuse developers or make code better if we extend String class to have method like this:
yourString.IsNullOrEmpty();
Pro:
More readable.
Less typing.
Cons:
Can be confusing because yourString
variable can be null and it looks
like you're executing method on a
null variable.
What do you think?
The same question we can ask about myObject.IsNull() method.
That how I would write it:
public static class StringExt
{
public static bool IsNullOrEmpty(this string text)
{
return string.IsNullOrEmpty(text);
}
public static bool IsNull(this object obj)
{
return obj == null;
}
}

If I'm not mistaken, every answer here decries the fact that the extension method can be called on a null instance, and because of this they do not support believe this is a good idea.
Let me counter their arguments.
I don't believe AT ALL that calling a method on an object that may be null is confusing. The fact is that we only check for nulls in certain locations, and not 100% of the time. That means there is a percentage of time where every method call we make is potentially on a null object. This is understood and acceptable. If it wasn't, we'd be checking null before every single method call.
So, how is it confusing that a particular method call may be happening on a null object? Look at the following code:
var bar = foo.DoSomethingResultingInBar();
Console.Writeline(bar.ToStringOr("[null]"));
Are you confused? You should be. Because here's the implementation of foo:
public Bar DoSomethingResultingInBar()
{
return null; //LOL SUCKER!
}
See? You read the code sample without being confused at all. You understood that, potentially, foo would return a null from that method call and the ToStringOr call on bar would result in a NRE. Did your head spin? Of course not. Its understood that this can happen. Now, that ToStringOr method is not familiar. What do you do in these situations? You either read the docs on the method or examine the code of the call. Here it is:
public static class BarExtensions
{
public static string ToStringOr(this bar, string whenNull)
{
return bar == null ? whenNull ?? "[null]" : bar.ToString();
}
}
Confusing? Of course not. Its obvious that the developer wanted a shorthand method of checking if bar is null and substituting a non-null string for it. Doing this can slash your code significantly and increase readability and code reuse. Of course you could do this in other ways, but this way would be no more confusing than any other. For example:
var bar = foo.DoSomethingResultingInBar();
Console.Writeline(ToStringOr(bar, "[null]"));
When you encounter this code, what do you have to differently than the original version? You still have to examine the code, you still have to determine its behavior when bar is null. You still have to deal with this possibility.
Are extension methods confusing? Only if you don't understand them. And, quite frankly, the same can be said for ANY part of the language, from delegates to lambdas.

I think the root of this problem is what Jon Skeet has mentioned in the list of things he hates in his favorite language (C#): C# should not have imported all extension methods in a whole namespace automatically. This process should have been done more explicitly.
My personal opinion about this specific question is (since we can't do anything about the above fact) to use the extension method if you want. I don't say it won't be confusing, but this fact about extension methods (that can be called on null references) is a global thing and doesn't affect only String.IsNullOrEmpty, so C# devs should get familiar with it.
By the way, it's fortunate that Visual Studio clearly identifies extension methods by (extension) in the IntelliSense tooltip.

I'm personally not a fan of doing this. The biggest problem with extension methods right now is discoverability. Unleses you flat out know all of the methods which exist on a particular type, it's not possible to look at a method call and know that it's an extension method. As such I find it problematic to do anything with an extension method that would not be possible with a normal method call. Otherwise you will end up confusing developers.
A corollary to this problem exist in C++. In several C++ implementations it's possible to call instance methods on NULL pointers as long as you don't touch any fields or virtual methods on the type. I've worked with several pieces of code that do this intentionally and give methods differentt behavior when "this==NULL". It's quite maddening to work with.
This is not to say that I don't like extension methods. Quite the contrary, I enjoy them and use them frequently. But I think there are 2 important rules you should follow when writing them.
Treat the actual method implementation as if it's just another static method because it in fact is. For example throw ArgumentException instead of NullReference exception for a null this
Don't let an extension method perform tricks that a normal instance method couldn't do
EDIT Responding to Mehrdad's comment
The problem with taking advantage of it is that I don't see str.IsNullOrEmpty as having a significant functional advantage over String.IsNullOrEmpty(str). The only advantage I see is that one requires less typing than the other. The same could be said about extension methods in general. But in this case you're additionally altering the way people think about program flow.
If shorter typing is what people really want wouldn't IsNullOrEmpty(str) be a much better option? It's both unambiguous and is the shortest of all. True C# has no support for such a feature today. But imagine if in C# I could say
using SomeNamespace.SomeStaticClass;
The result of doing this is that all methods on SomeStaticClass were now in the global namespace and available for binding. This seems to be what people want, but they're attaching it to an extension method which I'm not a huge fan of.

I really like this approach AS LONG AS the method makes it clear that it is checking the object is null. ThrowIfNull, IsNull, IsNullOrEmpty, etc. It is very readable.

Personally, I wouldn't create an extension that does something that already exists in the framework unless it was a significant improvement in usability. In this instance, I don't think that's the case. Also, if I were to create an extension, I would name it in a way as to reduce confusion, not increase it. Again, I think this case fails that test.
Having said all that, I do have a string extension that tests, not only if the string is null or empty, but also if it only contains whitespace. I call it IsNothing. You can find it here.

It doesn't just look like you're calling a method on a null variable. You /are/ calling a method on a null variable, albeit one implemented through a static extension method. I had no idea extension methods (which I was already leery of) supported that. This even allows you to do crazy things like:
public static int PowerLength(this string obj)
{
return obj == null ? 0 : obj.Length;
}
From where I'm standing now, I would classify any use of an extension method on a null reference under considered harmful.

Let's look at the pro:s and con:s...
More readable.
Yes, slightly, but the improvement in readability is outweighed by the fact that it looks like you are calling an instance method on something that doesn't have to be an instance. In effect it's easier to read, but it's harder to understand, so the improved readability is really just an illusion.
Less typing.
Yes, but that is really not a strong argument. If typing is the main part of your programming, you are just not doing something that is remotely challenging enough for you to evolve as a developer.
Can be confusing because yourString variable can be null and it looks like
you're executing method on a null variable.
True, (as mentioned above).

Yes, it will confuse.
I think that everyone who knows about IsNullOrEmpty() perceives the use of it as quite natural. So your suggested extension method will not add more readability.
Maybe for someone new to .NET this extension method might be easier to deal with, but there is the danger possibility that she/he doesn't understand all facets of extension methods (like that you need to import the namespace and that it can be invoked on null). She/he will might wonder why this extension method does not exist in another project. Anyway: Even if someone is new to .NET the syntax of the IsNullOrEmpty() method might become natural quite fast. Also here the benefits of the extension methods will not outweight the confusion caused by it.
Edit: Tried to rephrase what I wanted to say.

I think extending any object with an "IsNull" type call is a little confusing and should be avoided if possible.
It's arguable that the IsEmptyString method might be useful for the String type, and that because you'd usually combine this with a test for null that the IsNullOrEmpty might be useful, but I'd avoid this too due to the fact that the string type already has a static method that does this and I'm not sure you're saving yourself that much typing (5 characters at most).

Calling a method on a variable that is null usually results in a NullReferenceException. The IsNullOrEmpty()-Method deviates from this behaviour in a way that is not predictable from just looking at the code. Therefore I would advise against using it since it creates confusion and the benefit of saving a couple of characters is minimal.

In general, I'm only ok with extension methods being safe to call on null if they have the word 'null' or something like that in their name. That way, I'm clued in to the fact that they may be safe to call with null. Also, they better document that fact in their XML comment header so I get that info when I mouse-over the call.

When comparing class instance to null using ".IsNull()" is not even shorter than using " == null".
Things change in generic classes when the generic type argument without constraint can be a value type.
In such case comparison with default type is lengthy and I use the extension below:
public static bool IsDefault<T>(this T x)
{
return EqualityComparer<T>.Default.Equals(x, default(T));
}

Related

Difference ways of calling methods within the same class in C# [duplicate]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I was curious about how other people use the this keyword. I tend to use it in constructors, but I may also use it throughout the class in other methods. Some examples:
In a constructor:
public Light(Vector v)
{
this.dir = new Vector(v);
}
Elsewhere
public void SomeMethod()
{
Vector vec = new Vector();
double d = (vec * vec) - (this.radius * this.radius);
}
I don't mean this to sound snarky, but it doesn't matter.
Seriously.
Look at the things that are important: your project, your code, your job, your personal life. None of them are going to have their success rest on whether or not you use the "this" keyword to qualify access to fields. The this keyword will not help you ship on time. It's not going to reduce bugs, it's not going to have any appreciable effect on code quality or maintainability. It's not going to get you a raise, or allow you to spend less time at the office.
It's really just a style issue. If you like "this", then use it. If you don't, then don't. If you need it to get correct semantics then use it. The truth is, every programmer has his own unique programing style. That style reflects that particular programmer's notions of what the "most aesthetically pleasing code" should look like. By definition, any other programmer who reads your code is going to have a different programing style. That means there is always going to be something you did that the other guy doesn't like, or would have done differently. At some point some guy is going to read your code and grumble about something.
I wouldn't fret over it. I would just make sure the code is as aesthetically pleasing as possible according to your own tastes. If you ask 10 programmers how to format code, you are going to get about 15 different opinions. A better thing to focus on is how the code is factored. Are things abstracted right? Did I pick meaningful names for things? Is there a lot of code duplication? Are there ways I can simplify stuff? Getting those things right, I think, will have the greatest positive impact on your project, your code, your job, and your life. Coincidentally, it will probably also cause the other guy to grumble the least. If your code works, is easy to read, and is well factored, the other guy isn't going to be scrutinizing how you initialize fields. He's just going to use your code, marvel at it's greatness, and then move on to something else.
There are several usages of this keyword in C#.
To qualify members hidden by similar name
To have an object pass itself as a parameter to other methods
To have an object return itself from a method
To declare indexers
To declare extension methods
To pass parameters between constructors
To internally reassign value type (struct) value.
To invoke an extension method on the current instance
To cast itself to another type
To chain constructors defined in the same class
You can avoid the first usage by not having member and local variables with the same name in scope, for example by following common naming conventions and using properties (Pascal case) instead of fields (camel case) to avoid colliding with local variables (also camel case). In C# 3.0 fields can be converted to properties easily by using auto-implemented properties.
I only use it when absolutely necessary, ie, when another variable is shadowing another. Such as here:
class Vector3
{
float x;
float y;
float z;
public Vector3(float x, float y, float z)
{
this.x = x;
this.y = y;
this.z = z;
}
}
Or as Ryan Fox points out, when you need to pass this as a parameter. (Local variables have precedence over member variables)
Personally, I try to always use this when referring to member variables. It helps clarify the code and make it more readable. Even if there is no ambiguity, someone reading through my code for the first time doesn't know that, but if they see this used consistently, they will know if they are looking at a member variable or not.
I use it every time I refer to an instance variable, even if I don't need to. I think it makes the code more clear.
I can't believe all of the people that say using it always is a "best practice" and such.
Use "this" when there is ambiguity, as in Corey's example or when you need to pass the object as a parameter, as in Ryan's example. There is no reason to use it otherwise because being able to resolve a variable based on the scope chain should be clear enough that qualifying variables with it should be unnecessary.
EDIT: The C# documentation on "this" indicates one more use, besides the two I mentioned, for the "this" keyword - for declaring indexers
EDIT: #Juan: Huh, I don't see any inconsistency in my statements - there are 3 instances when I would use the "this" keyword (as documented in the C# documentation), and those are times when you actually need it. Sticking "this" in front of variables in a constructor when there is no shadowing going on is simply a waste of keystrokes and a waste of my time when reading it, it provides no benefit.
I use it whenever StyleCop tells me to. StyleCop must be obeyed. Oh yes.
Any time you need a reference to the current object.
One particularly handy scenario is when your object is calling a function and wants to pass itself into it.
Example:
void onChange()
{
screen.draw(this);
}
I tend to use it everywhere as well, just to make sure that it is clear that it is instance members that we are dealing with.
I use it anywhere there might be ambiguity (obviously). Not just compiler ambiguity (it would be required in that case), but also ambiguity for someone looking at the code.
Another somewhat rare use for the this keyword is when you need to invoke an explicit interface implementation from within the implementing class. Here's a contrived example:
class Example : ICloneable
{
private void CallClone()
{
object clone = ((ICloneable)this).Clone();
}
object ICloneable.Clone()
{
throw new NotImplementedException();
}
}
Here's when I use it:
Accessing Private Methods from within the class (to differentiate)
Passing the current object to another method (or as a sender object, in case of an event)
When creating extension methods :D
I don't use this for Private fields because I prefix private field variable names with an underscore (_).
[C++]
I agree with the "use it when you have to" brigade. Decorating code unnecessarily with this isn't a great idea because the compiler won't warn you when you forget to do it. This introduces potential confusion for people expecting this to always be there, i.e. they'll have to think about it.
So, when would you use it? I've just had a look around some random code and found these examples (I'm not passing judgement on whether these are good things to do or otherwise):
Passing "yourself" to a function.
Assigning "yourself" to a pointer or something like that.
Casting, i.e. up/down casting (safe or otherwise), casting away constness, etc.
Compiler enforced disambiguation.
You should always use it, I use it to diferantiate private fields and parameters (because our naming conventions state that we don't use prefixes for member and parameter names (and they are based on information found on the internet, so I consider that a best practice))
I use it when, in a function that accepts a reference to an object of the same type, I want to make it perfectly clear which object I'm referring to, where.
For example
class AABB
{
// ... members
bool intersects( AABB other )
{
return other.left() < this->right() &&
this->left() < other.right() &&
// +y increases going down
other.top() < this->bottom() &&
this->top() < other.bottom() ;
}
} ;
(vs)
class AABB
{
bool intersects( AABB other )
{
return other.left() < right() &&
left() < other.right() &&
// +y increases going down
other.top() < bottom() &&
top() < other.bottom() ;
}
} ;
At a glance which AABB does right() refer to? The this adds a bit of a clarifier.
In Jakub Šturc's answer his #5 about passing data between contructors probably could use a little explanation. This is in overloading constructors and is the one case where use of this is mandatory. In the following example we can call the parameterized constructor from the parameterless constructor with a default parameter.
class MyClass {
private int _x
public MyClass() : this(5) {}
public MyClass(int v) { _x = v;}
}
I've found this to be a particularly useful feature on occasion.
I got in the habit of using it liberally in Visual C++ since doing so would trigger IntelliSense ones I hit the '>' key, and I'm lazy. (and prone to typos)
But I've continued to use it, since I find it handy to see that I'm calling a member function rather than a global function.
I tend to underscore fields with _ so don't really ever need to use this. Also R# tends to refactor them away anyway...
I pretty much only use this when referencing a type property from inside the same type. As another user mentioned, I also underscore local fields so they are noticeable without needing this.
I use it only when required, except for symmetric operations which due to single argument polymorphism have to be put into methods of one side:
boolean sameValue (SomeNum other) {
return this.importantValue == other.importantValue;
}
[C++]
this is used in the assignment operator where most of the time you have to check and prevent strange (unintentional, dangerous, or just a waste of time for the program) things like:
A a;
a = a;
Your assignment operator will be written:
A& A::operator=(const A& a) {
if (this == &a) return *this;
// we know both sides of the = operator are different, do something...
return *this;
}
this on a C++ compiler
The C++ compiler will silently lookup for a symbol if it does not find it immediately. Sometimes, most of the time, it is good:
using the mother class' method if you did not overloaded it in the child class.
promoting a value of a type into another type
But sometimes, You just don't want the compiler to guess. You want the compiler to pick-up the right symbol and not another.
For me, those times are when, within a method, I want to access to a member method or member variable. I just don't want some random symbol picked up just because I wrote printf instead of print. this->printf would not have compiled.
The point is that, with C legacy libraries (§), legacy code written years ago (§§), or whatever could happen in a language where copy/pasting is an obsolete but still active feature, sometimes, telling the compiler to not play wits is a great idea.
These are the reasons I use this.
(§) it's still a kind of mystery to me, but I now wonder if the fact you include the <windows.h> header in your source, is the reason all the legacy C libraries symbols will pollute your global namespace
(§§) realizing that "you need to include a header, but that including this header will break your code because it uses some dumb macro with a generic name" is one of those russian roulette moments of a coder's life
'this.' helps find members on 'this' class with a lot of members (usually due to a deep inheritance chain).
Hitting CTRL+Space doesn't help with this, because it also includes types; where-as 'this.' includes members ONLY.
I usually delete it once I have what I was after: but this is just my style breaking through.
In terms of style, if you are a lone-ranger -- you decide; if you work for a company stick to the company policy (look at the stuff in source control and see what other people are doing). In terms of using it to qualify members, neither is right or wrong. The only wrong thing is inconsistency -- that is the golden rule of style. Leave the nit-picking others. Spend your time pondering real coding problems -- and obviously coding -- instead.
I use it every time I can. I believe it makes the code more readable, and more readable code equals less bugs and more maintainability.
When you are many developers working on the same code base, you need some code guidelines/rules. Where I work we've desided to use 'this' on fields, properties and events.
To me it makes good sense to do it like this, it makes the code easier to read when you differentiate between class-variables and method-variables.
It depends on the coding standard I'm working under. If we are using _ to denote an instance variable then "this" becomes redundant. If we are not using _ then I tend to use this to denote instance variable.
I use it to invoke Intellisense just like JohnMcG, but I'll go back and erase "this->" when I'm done. I follow the Microsoft convention of prefixing member variables with "m_", so leaving it as documentation would just be redundant.
1 - Common Java setter idiom:
public void setFoo(int foo) {
this.foo = foo;
}
2 - When calling a function with this object as a parameter
notifier.addListener(this);
There is one use that has not already been mentioned in C++, and that is not to refer to the own object or disambiguate a member from a received variable.
You can use this to convert a non-dependent name into an argument dependent name inside template classes that inherit from other templates.
template <typename T>
struct base {
void f() {}
};
template <typename T>
struct derived : public base<T>
{
void test() {
//f(); // [1] error
base<T>::f(); // quite verbose if there is more than one argument, but valid
this->f(); // f is now an argument dependent symbol
}
}
Templates are compiled with a two pass mechanism. During the first pass, only non-argument dependent names are resolved and checked, while dependent names are checked only for coherence, without actually substituting the template arguments.
At that step, without actually substituting the type, the compiler has almost no information of what base<T> could be (note that specialization of the base template can turn it into completely different types, even undefined types), so it just assumes that it is a type. At this stage the non-dependent call f that seems just natural to the programmer is a symbol that the compiler must find as a member of derived or in enclosing namespaces --which does not happen in the example-- and it will complain.
The solution is turning the non-dependent name f into a dependent name. This can be done in a couple of ways, by explicitly stating the type where it is implemented (base<T>::f --adding the base<T> makes the symbol dependent on T and the compiler will just assume that it will exist and postpones the actual check for the second pass, after argument substitution.
The second way, much sorter if you inherit from templates that have more than one argument, or long names, is just adding a this-> before the symbol. As the template class you are implementing does depend on an argument (it inherits from base<T>) this-> is argument dependent, and we get the same result: this->f is checked in the second round, after template parameter substitution.
You should not use "this" unless you absolutely must.
There IS a penalty associated with unnecessary verbosity. You should strive for code that is exactly as long as it needs to be, and no longer.

Is using an extension method for casting a bad idea?

I recently started on WPF, and I noticed that you have to do a lot of casting (especially with events). This is an aesthetic issue, but I was wondering how bad it would be if I'd use an extension method to cast, instead of using normal casting.
public static T Cast<T>(this object obj)
{
return (T)obj;
}
This would mean I could prevent a few nested parantheses, and change:
Console.WriteLine(((DataGridCell)e.OriginalSource).ActualHeight);
to:
Console.WriteLine(e.OriginalSource.Cast<DataGridCell>().ActualHeight);
Are there any clear disadvantages that I might be overlooking? How disgusted will people be when they encounter this in code? :)
This is similar in intent to Enumerable.Cast, so I wouldn't necessarily say that people will be disgusted.
Are there any clear disadvantages that I might be overlooking?
The main disadvantage is that this will be an extension method available to every single variable in your code, since you're extending System.Object. I typically avoid extension methods on Object for this reason, as it "pollutes" intellisense.
That being said, there are other disadvantages:
If you used this on an existing IEnumerable, you'd get a name collision with Enumerable.Cast<T>. A file having your namespace included but missing a using System.Linq could easily be misunderstood by other developers, as this would have a very different meaning to the expected "Cast<T>" extension method.
If you use this on a value type, you're introducing boxing (pushing the value type into an object), then an unbox and cast, which can actually cause an exception that wouldn't occur with a cast. Your extension method will raise an exception if you do:
int i = 42;
float f = i.Cast<float>();
This might be unexpected, as float f = (float)i; is perfectly legal. For details, see Eric Lippert's post on Representation and Identity. If you do write this, I would definitely recommend adding a class constraint to your operator.
I, personally, would just use parenthesis. This is a common, language supported feature, and should be understandable to all C# developers. Casting has the advantages of being shorter, understandable, and side effect free (in terms of intellisense, etc).
The other option would be to make this a normal static method, which would allow you to write:
Console.WriteLine(Utilities.Cast<DataGridCell>(e.OriginalSource).ActualHeight);
This eliminates the disadvantage of "polluting" intellisense, and makes it obvious that its a method you wrote, but increases the amount of typing required to use. It also does nothing to prevent the boxing and unbox/cast issue.
The main disadvantage is that casting is well-known for every C# developer, while your Cast<T> method is just another not-invented here wheel. The next step, usually, is a set of extensions like IsTrue, IsFalse, IsNull, etc.
This is a syntax garbage.

"Ternary" operator for different method signatures

I'm looking for an elegant way to choose a method signature (overloaded) and pass an argument based on a conditional. I have an importer that will either produce the most recent file to import, or take an explicit path for the data.
Currently my code looks like this:
if (string.IsNullOrEmpty(arguments.File))
{
importer.StartImport();
}
else
{
importer.StartImport(arguments.File);
}
I would like it to look like this (or conceptually similar):
importer.StartImport(string.IsNullOrEmpty(arguments.File) ? Nothing : arguments.File);
The idea being that a different method signature would be called. A few stipulations:
1) I will not rely on 'null' to indicate an unspecified file (i.e. anything besides null itself).
2) I will not pass the arguments struct to the importer class; that violates the single responsibility principle.
One solution I'm aware of, is having only one StartImport() method that takes a single string, at which point that method resolves the conditional and chooses how to proceed. I'm currently choosing to avoid this solution because it only moves the if-statement from one method to another. I'm asking this question because:
1) I would like to reduce "8 lines" of code to 1.
2) I'm genuinely curious if C# is capable of something like this.
I would like to reduce "8 lines" of code to 1.
I think you're asking the wrong question. It's not how many lines of code you have, it's how clear, maintainable, and debuggable they are. From what you've described, importing from a default location and importing with a known file are semantically different - so I think you're correct in separating them as two different overloads. In fact, you may want to go further and actually name them differently to further clarify the difference.
I'm genuinely curious if C# is capable of something like this.
Sure, we can use all sorts of fancy language tricks to make this more compact ... but I don't think they make the code more clear. For instance:
// build a Action delegate based on the argument...
Action importAction = string.IsNullOrEmpty(arguments.File)
? () => importer.StartImport()
: () => importer.StartImport(arguments.File)
importAction(); // invoke the delegate...
The code above uses a lambda + closure to create a Action delegate of the right type, which is then invoked. But this is hardly more clear ... it's also slightly less efficient, as it requires creating a delegate and then invoking the method through that delegate. In most cases the performance overhead is completely negligible. The real problem here is the use of the closure - it's very easy to misuse code with closures - and it's entirely possible to introduce bugs by using closures incorrectly.
You can't do this using 'strong-typed' C#. Overload resolution is performed at compile time: the compiler will work out the static (compile-time) type of the argument, and resolve based on that. If you want to use one of two different overloads, you have to perform two different calls.
In some cases, you can defer overload resolution until runtime using the dynamic type. However, this has a performance (and clarity!) cost, and won't work in your situation because you need to pass a different number of arguments in the two cases.
Yes, it is possible (using reflection). But I wouldn't recommend it at all because you end up with way more code than you had before. The "8-lines" which you have are quite simple and readable. The "one-liner" using reflection is not. But for completeness sake, here's a one-liner:
importer.GetType().GetMethod("StartImport", string.IsNullOrEmpty(arguments.File) ? new Type[0] : new Type[] { typeof(string) }).Invoke(importer, string.IsNullOrEmpty(arguments.File) ? new object[0] : new object[] { arguments.File) }));
Frankly, I think your code would be a lot more readable and modular if you combined the overloaded methods back into a single method and moved the conditional inside it instead of making the caller pre-validate the input.
importer.StartImport(arguments.File);
void StartImport(string File) {
if (string.isNullOrEmpty(File)) {
...
}
else {
...
}
}
Assuming you call the method from multiple places in your code you don't have the conditional OR ternary expression scattered around violating the DRY principle with this approach.
//one line
if (string.IsNullOrEmpty(arguments.File)) {importer.StartImport();} else{importer.StartImport(arguments.File);}

What is the right way to write 'Get' method?

With one of mine coworkers often argue about "the right way" of writing 'Get' methods.
My opinion is object GetSomeObject(). My colleague thinks it is better to be void GetSomeObject(object obj). I know the result is one and the same in both cases. I want to hear and other opinions. Ohhh i forgot to tell for what platform we are talking about - .NET Framework the language is C#.
If it is a simple get then it should be a property
public object SomeObject
{
get { return _someObj; }
}
if it is computational then
object GetSomeObject() { ... }
Is far more commonly expected. Besides the other would have to have either a ref or out passed in as the argument which is discouraged if the former can be achieved
void GetSomeObject(object obj) won't actually 'get' anything. If you turned it into an out parameter you could assign a value to it, and it would technically work, buy why when you can use return types exactly as they were intended:
public void GetSomeObject(out returnObject)
{
returnObject = ...
}
or
public object GetSomeObject()
{
return ...
}
Of course object GetSomeObject() is better. The other one is more a setter than a getter...
It reads easier and is good coding practice to return a value, over modifying a reference parameter (which to me at least, is an old-school way of getting values back)
If you're looking for a deeper meaning:
Function parameters in C# are created as Value parameters by default (as apposed to reference parameters).
To change the parameter value and persist that change to the calling code, the parameter needs to be declared reference. You probably know all this.
Here's the difference: Both value and ref parameters are stored on the stack (which is highly efficient), but the reference parameter's data is stored on the heap. So there is a fraction overhead using reference values.
In most cases this is not a problem at all, but some issues might pop up, like:
recursive functions that use up too much stack (and you get a stack overflow)
functions that require the speed, like calculating primes or fractals
There's probably more, and better examples, than what I gave, you get the idea though.
object getSomeObject(); is better.
Generally speaking the first way is more typical and better because object GetSomeObject() allows you to do the following: GetSomeObject().Foo(). And it's somewhat more intuitive.
However, bool GetSomeObject( out object obj ) can be useful as in the case of TryGetValue() in the Dictionary class.
I can pretty much only see a single reason for using void GetSomeObject(out object obj) instead of object GetSomeObject() and that is if you get rid of the void and instead do something like ErrorResult GetSomeObject(out object obj) (and the GetSomeObject-operation is hairy and error-prone) since you can then report status via the return value.
However, that would still be better handled via plain old Exceptions IMHO. Though I know that some coding standards say that exceptions shouldn't be used at all and in those cases you might want to do something like this.
Still, I'd say just go with a property or the object GetSomeObject() unless you have a really good reason not to.

Best practice of using the "out" keyword in C#

I'm trying to formalise the usage of the "out" keyword in c# for a project I'm on, particularly with respect to any public methods. I can't seem to find any best practices out there and would like to know what is good or bad.
Sometimes I'm seeing some methods signatures that look like this:
public decimal CalcSomething(Date start, Date end, out int someOtherNumber){}
At this point, it's just a feeling, this doesn't sit well with me. For some reason, I'd prefer to see:
public Result CalcSomething(Date start, Date end){}
where the result is a type that contains a decimal and the someOtherNumber. I think this makes it easier to read. It allows Result to be extended or have properties added without breaking code. It also means that the caller of this method doesn't have to declare a locally scoped "someOtherNumber" before calling. From usage expectations, not all callers are going to be interested in "someOtherNumber".
As a contrast, the only instances that I can think of right now within the .Net framework where "out" parameters make sense are in methods like TryParse(). These actually make the caller write simpler code, whereby the caller is primarily going to be interested in the out parameter.
int i;
if(int.TryParse("1", i)){
DoSomething(i);
}
I'm thinking that "out" should only be used if the return type is bool and the expected usages are where the "out" parameters will always be of interest to the caller, by design.
Thoughts?
There is a reason that one of the static code analysis (=FxCop) rules points at you when you use out parameters. I'd say: only use out when really needed in interop type scenarios. In all other cases, simply do not use out. But perhaps that's just me?
This is what the .NET Framework Developer's Guide has to say about out parameters:
Avoid using out or reference parameters.
Working with members
that define out or reference
parameters requires that the developer
understand pointers, subtle
differences between value types and
reference types, and initialization
differences between out and reference
parameters.
But if you do use them:
Do place all out parameters after all of the pass-by-value and ref
parameters (excluding parameter
arrays), even if this results in an
inconsistency in parameter ordering
between overloads.
This convention makes the method
signature easier to understand.
Your approach is better than out, because you can "chain" calls that way:
DoSomethingElse(DoThing(a,b).Result);
as opposed to
DoThing(a, out b);
DoSomethingElse(b);
The TryParse methods implemented with "out" was a mistake, IMO. Those would have been very convenient in chains.
There are only very few cases where I would use out. One of them is if your method returns two variables that from an OO point of view do not belong into an object together.
If for example, you want to get the most common word in a text string, and the 42nd word in the text, you could compute both in the same method (having to parse the text only once). But for your application, these informations have no relation to each other: You need the most common word for statistical purposes, but you only need the 42nd word because your customer is a geeky Douglas Adams fan.
Yes, that example is very contrived, but I haven't got a better one...
I just had to add that starting from C# 7, the use of the out keyword makes for very readable code in certain instances, when combined with inline variable declaration. While in general you should rather return a (named) tuple, control flow becomes very concise when a method has a boolean outcome, like:
if (int.TryParse(mightBeCount, out var count)
{
// Successfully parsed count
}
I should also mention, that defining a specific class for those cases where a tuple makes sense, more often than not, is more appropriate. It depends on how many return values there are and what you use them for. I'd say, when more than 3, stick them in a class anyway.
One advantage of out is that the compiler will verify that CalcSomething does in fact assign a value to someOtherNumber. It will not verify that the someOtherNumber field of Result has a value.
Stay away from out. It's there as a low-level convenience. But at a high level, it's an anti-technique.
int? i = Util.TryParseInt32("1");
if(i == null)
return;
DoSomething(i);
If you have even seen and worked with MS
namespace System.Web.Security
MembershipProvider
public abstract MembershipUser CreateUser(string username, string password, string email, string passwordQuestion, string passwordAnswer, bool isApproved, object providerUserKey, out MembershipCreateStatus status);
You will need a bucket. This is an example of a class breaking many design paradigms. Awful!
Just because the language has out parameters doesn't mean they should be used. eg goto
The use of out Looks more like the Dev was either Lazy to create a type or wanted to try a language feature.
Even the completely contrived MostCommonAnd42ndWord example above I would use
List or a new type contrivedresult with 2 properties.
The only good reasons i've seen in the explanations above was in interop scenarios when forced to. Assuming that is valid statement.
You could create a generic tuple class for the purpose of returning multiple values. This seems to be a decent solution but I can't help but feel that you lose a bit of readability by returning such a generic type (Result is no better in that regard).
One important point, though, that james curran also pointed out, is that the compiler enforces an assignment of the value. This is a general pattern I see in C#, that you must state certain things explicitly, for more readable code. Another example of this is the override keyword which you don't have in Java.
If your result is more complex than a single value, you should, if possible, create a result object. The reasons I have to say this?
The entire result is encapsulated. That is, you have a single package that informs the code of the complete result of CalcSomething. Instead of having external code interpret what the decimal return value means, you can name the properties for your previous return value, Your someOtherNumber value, etc.
You can include more complex success indicators. The function call you wrote might throw an exception if end comes before start, but exception throwing is the only way to report errors. Using a result object, you can include a boolean or enumerated "Success" value, with appropriate error reporting.
You can delay the execution of the result until you actually examine the "result" field. That is, the execution of any computing needn't be done until you use the values.

Categories

Resources