Some .NET libraries expose object data through methods rather than property getters, e.g. HttpWebResponse.GetResponseStream().
There are also examples of accessing a stream through a property, e.g. HttpResponse.OutputStream.
My question is: when should I use which form of access, and why?
See the FxCop rule: CA1024: Use properties where appropriate.
Good question. Although a property is little more than syntactic sugar for a pair of get/set methods, the two should be used in different situations.
Generally, you should use a property-style getter when:
The value to be returned represents field-like data (generally primitives/value types, but a reference to another domain object is also fine)
The calculation, if any, to produce that value is relatively cheap/side-effect free
Getting the same value twice will produce the same value given the same inputs
Generally, you should use a getter method when:
The returned object is created for the purpose (e.g. factory methods)
Evaluating the returned value requires side effects (e.g. touching a file system, database, or changing other values)
Getting the returned value twice will produce two distinct results (e.g. two Streams, db connections, etc.).
In a sentence, if conceptually speaking the value needed is something the object HAS, use a property. If the value needed is the result of something the object DOES, use a method.
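To make the HAS/DOES distinction concrete, here is a minimal sketch. The Report class and its members are hypothetical names I made up for illustration; they are not from any library.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical type, used only to illustrate the HAS/DOES distinction.
public sealed class Report
{
    private readonly string _body;

    public Report(string title, string body)
    {
        Title = title;
        _body = body;
    }

    // Something the object HAS: cheap, no side effects,
    // and the same value on every read.
    public string Title { get; }

    // Something the object DOES: every call allocates a new stream,
    // so two calls return two distinct objects.
    public Stream OpenBodyStream()
    {
        return new MemoryStream(Encoding.UTF8.GetBytes(_body));
    }
}
```

Reading report.Title twice gives the same string, while calling report.OpenBodyStream() twice gives two separate streams that each need disposing, which is the kind of thing a caller would not expect from a property.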
Good question. This article brings up a few good points. In general, I use methods when the computation is expensive and properties when computation is not expensive (i.e. a stored value is returned).
My opinion which, I'm sure, will get to -10 real fast, is that you should only use properties for serialization. In all other cases explicit method call is preferable because when you look at it, you know that a method with possible side effects is being invoked.
I guess the "correct" (tm) answer is that when all your method would do is return the value, it is ok to use getter/setter, but if there is any work to do, use a method.
Assume that 2 different methods - one static and one non-static - need an instance variable.
The variable is used 3-5 different times within the methods for comparison purposes.
The variable is NOT changed in any manner.
Also, would the type of the variable (String, Collection, etc.) make any difference to how it should be coded?
What is the best/right way of using Instance Variable within a private method (static and non-static)?
Pass as method argument
Store locally by using the method to get the value - this.getClaimPropertyVertices();
Store locally by getting the value - this.claimPropertyVertices;
Use the instance variable directly in the method
When creating a local variable to store the value, will the "final" keyword provide any advantages if the variable will not be changed?
Edit 1: Based on a comment, I am adding additional information
The value cannot be created locally in the method. It has to come from the class or some other method accessed by the class.
My Solution Based on the Answers:
Based on the answers by #EricJ. and #Jodrell, I went with option 1 and also made it a private static method. I also found some details here to support this.
When creating a local variable to store the value, will the "final" keyword provide any advantages if the variable will not be changed?
In Java, final provides an optimization opportunity to the compiler. It states that the variable will not be reassigned (the object it refers to can still be mutated). The keyword readonly plays a similar role for fields in C#.
Whether or not that additional opportunity for optimization is meaningful depends on the specific problem. In many cases, the cost of other portions of the algorithm will be vastly larger than optimizations that the compiler is able to make due to final or readonly.
Use of those keywords has another benefit. They create a contract that the value will not change, which helps future maintainers of the code understand that they should not change the value (indeed, the compiler will not let them).
What is the best/right way of using Instance Variable within a private method (static and non-static)?
Pass as method argument
The value is already stored in the instance. Why pass it? Best case, this is no better than using the instance property/field. Worst case, the JITer will not inline the call and will create a larger stack frame, costing a few CPU cycles. Note: if you are calling a static method, then you must pass the variable, as the static method cannot access the object instance.
Store locally by using the method to get the value - this.getClaimPropertyVertices();
This is what I do in general. Getters/setters are there to provide a meaningful wrapper around fields. In some cases, the getter will initialize the backing field (common pattern in C# when using serializers that do not call the object constructor. Don't get me started on that topic...).
Store locally by getting the value - this.claimPropertyVertices;
No, see above.
Use the instance variable directly in the method
Exactly the same as above. Using this or not using this should generate the exact same code.
UPDATE (based on your edit)
If the value is external to the object instance, and should not meaningfully be stored along with the instance, pass it in as a value to the method call.
If you write your functions with the static keyword whenever you can, there are several obvious benefits.
It's obvious from the signature which inputs affect the function.
You know that the function will have no side effects (unless you are passing by reference). This overlooks non-functional side effects, like changes to the GUI.
The function is not programmatically tied to the class; if you decide that its behaviour logically has a better association with another entity, you can just move it and then adjust any namespace references.
These benefits make the function easy to understand and simpler to reuse. They will also make it simpler to use the function in a multithreaded context, since you don't have to worry about contention on ever-spreading side effects.
I will caveat this answer. You should write potentially reusable functions with the static keyword. Simple or obviously non-reusable functionality should just access the private member or getter, if implemented.
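A rough sketch of that shape, written in C# (the Java version is analogous). Claim, HasVertex and ContainsVertex are names I invented for the example; only the idea of passing the instance value into a private static helper comes from the discussion above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class: the field plays the role of the instance variable
// from the question.
public sealed class Claim
{
    private readonly List<int> _vertices;

    public Claim(IEnumerable<int> vertices)
    {
        _vertices = new List<int>(vertices);
    }

    // Instance method: hands the field to the static helper as an argument.
    public bool HasVertex(int vertex)
    {
        return ContainsVertex(_vertices, vertex);
    }

    // Static helper: everything it depends on is visible in the signature,
    // and it cannot touch any other instance state.
    private static bool ContainsVertex(IReadOnlyList<int> vertices, int vertex)
    {
        for (int i = 0; i < vertices.Count; i++)
        {
            if (vertices[i] == vertex) return true;
        }
        return false;
    }
}
```

Because ContainsVertex only reads its parameters, it could later be moved to another class without dragging Claim along.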
Wrapping a struct or class field in a property forces all accesses to that field to go through "getter" and "setter" methods. This allows for the possibility of adding logic for validation, lazy initialization, etc. Further, in the case of class fields, it allows for the possibility that one might have logic which applies to some instances but not others; if the properties are not virtual, it may be difficult to implement such logic efficiently (e.g. one might have to define a static VerySpecialInstance and have the property getter say if (this == VerySpecialInstance) GetSpecialProperty(); else GetOrdinaryProperty();) but it could be done.
If, however, the semantics of a struct (e.g. System.Drawing.Point) dictate that a particular read-write property may be written with any value which is legal for its type, writing will have no side effect other than to change its value, it will always return the last value written (if any), and if not written it will read as the default value for its type; and if code which uses the type will likely rely upon such assumptions, I'm unclear on what possible benefit would be served by using a read-write property rather than a field to hold the value.
The fact that Microsoft uses properties rather than fields for things like Point.X etc. has historically caused confusion since MyList[3].X = 4; would be translated to MyList[3].Set_X(4), and without looking inside the definition of Set_X it's not possible to tell whether that method would achieve its desired effect without changing any fields of the struct in question; today's C# compiler will guess that it wouldn't work, and will forbid that construct even though there are some struct types where property setters would in fact work just fine. If X had been a field rather than a property, and if Microsoft had said that the two safe ways to mutate a struct are either to access the fields directly or to pass the struct as a ref parameter to a mutating method (which, if it's a static method of the struct type, could access public fields), such guesswork would not be necessary.
Given that using exposed struct fields rather than read-write properties improves both performance and semantic clarity, what reasons exist to make struct fields private and wrap them in properties? Data binding requires properties, but I don't think it works with structure types anyway (if one makes a copy of a struct and then sets some property of the original to one value and the corresponding property of the duplicate to another, what value should be reported to the bound object?) Are there some benefits of struct properties of which I'm unaware?
Personally, I think the 'ideal' struct in many cases would simply be a list of exposed public fields, and a constructor whose parameters are simply the initial values of those fields, in order. Such a struct would offer optimal performance and predictable semantics (behaving identically to all other such structs, aside from the types and names of the fields). Is there any reason to favor read-write properties in cases where they couldn't do anything other than simply read and write an underlying field?
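For reference, a small sketch of the kind of struct described here and of the two mutation routes mentioned (direct field access and a ref parameter). Point2 and Point2Demo are made-up names. Note that with List<T> the indexer returns a copy, so the in-place write is only possible on real variables such as array elements.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical struct: exposed fields plus a constructor taking them in order.
public struct Point2
{
    public int X;
    public int Y;

    public Point2(int x, int y)
    {
        X = x;
        Y = y;
    }
}

public static class Point2Demo
{
    // Mutating through a ref parameter changes the caller's variable.
    public static void MoveRight(ref Point2 p, int dx)
    {
        p.X += dx;
    }

    public static int[] Run()
    {
        var array = new[] { new Point2(1, 2) };
        array[0].X = 4;                  // array elements are variables: written in place

        var list = new List<Point2> { new Point2(1, 2) };
        // list[0].X = 4;                // does not compile (CS1612): the indexer returns a copy
        Point2 copy = list[0];
        copy.X = 4;
        list[0] = copy;                  // write the modified copy back

        MoveRight(ref array[0], 1);      // array[0].X goes from 4 to 5
        return new[] { array[0].X, list[0].X };
    }
}
```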
I don't see any benefit in using read/write properties on an immutable struct, except the points you already mentioned: wrapping logic inside the property's setter and/or getter, and keeping a general guideline across your code base (a benefit from the maintenance and readability point of view).
Personally, when I define a struct I almost always use raw public fields and no properties, for simplicity and easy consumption of my type (and because of the problems with immutable types you already described in the question).
Hope this helps.
Rico Mariani wrote a good MSDN blog article on this very topic.
Reasons to use public fields rather than getters and setters include:
There are no values the field cannot be allowed to have.
The client is expected to edit it.
To be able to write things such as object.X.Y = Z.
To make a strong promise that the value is just a value and there are no side effects associated with it (and won't be in the future either).
Some people find this very controversial. I suspect this is because the cases listed rarely or never come up in the kind of software they write, but they don't realise that in other application areas they come up a great deal.
(This is a copy of an answer I provided here, but I thought the information is useful enough to be repeated here.)
When you have automatic properties, the C# compiler asks you to call the this constructor from any constructor you define, to make sure everything is initialized before you access them.
If you don't use automatic properties, but simply declare the values, you can avoid using the this constructor.
What's the overhead of using this on constructors in structs? Is it the same as double initializing the values?
Would you recommend not using it, if performance was a top concern for this particular type?
I would recommend not using automatic properties at all for structs, as it means they'll be mutable - if only privately.
Use readonly fields, and public properties to provide access to them where appropriate. Mutable structures are almost always a bad idea, and have all kinds of nasty little niggles.
Do you definitely need to create your own value type in the first place though? In my experience it's very rare to find a good reason to create a struct rather than a class. It may be that you've got one, but it's worth checking.
Back to your original question: if you care about performance, measure it. Always. In this case it's really easy - you can write the struct using an automatic property and then reimplement it without. You could use a #if block to keep both options available. Then you can measure typical situations and see whether the difference is significant. Of course, I think the design implications are likely to be more important anyway :)
Yes, the values will be initialized twice and without profiling it is difficult to say whether or not this performance hit would be significant.
The default constructor of a struct initializes all members to their default values. After this happens your constructor will run in which you undoubtedly set the values of those properties again.
I would imagine this would be no different than the CLR's practice of initializing all fields of a reference type upon instantiation.
The reason the C# compiler requires you to chain to the default constructor (i.e. append : this() to your constructor declaration) when auto-implemented properties are used is that all variables need to be assigned before exiting the constructor. Now, auto-implemented properties mess this up a bit in that they don't allow you to directly access the variables that back the properties. The method the compiler uses to get around this is to automatically assign all the variables to their default values, and to ensure this, you must chain to the default constructor. It's not a particularly clever method, but it does the job well enough.
So indeed, this will mean that some variables will end up getting initialised twice. However, I don't think this will be a big performance problem. I would be very surprised if the compiler (or at the very least the JIT) didn't simply remove the first initialisation statement for any variable that is set twice in your constructor. A quick benchmark should confirm this for you, though I'm quite sure you will get the suspected results. (If you by chance don't, and you absolutely need the tiny performance boost that avoidance of duplicate initialisation offers, you can just define your properties the normal way, i.e. with backing variables.)
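A minimal sketch of the two shapes being compared, assuming a compiler that enforces the rule as described above. Size2 and Size2WithFields are hypothetical names; the second struct is the "backing variables" alternative that needs no : this() chaining.

```csharp
using System;

// Auto-implemented properties: the constructor chains to this().
public struct Size2
{
    public int Width { get; private set; }
    public int Height { get; private set; }

    // ": this()" zero-initialises the hidden backing fields first,
    // which satisfies definite assignment before the setters run.
    public Size2(int width, int height) : this()
    {
        Width = width;
        Height = height;
    }
}

// Explicit readonly backing fields: each field is assigned exactly once,
// so no chaining is needed.
public struct Size2WithFields
{
    private readonly int _width;
    private readonly int _height;

    public Size2WithFields(int width, int height)
    {
        _width = width;
        _height = height;
    }

    public int Width { get { return _width; } }
    public int Height { get { return _height; } }
}
```

Both behave the same from the caller's side, so they are easy to swap when measuring whether the extra initialisation matters at all.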
To be honest, my advice would be not even to bother with auto-implemented properties in structures. It's perfectly acceptable just to use public variables in lieu of them, and they offer no less functionality than auto-implemented properties. Classes are a different situation of course, but I really wouldn't hesitate to use public variables in structs. (Any complex properties can be defined normally, if you need them.)
Hope that helps.
Don't use automatic properties with structure types. Simply expose fields directly. If a struct has an exposed public field Foo of type Bar, the fact that Foo is an exposed field of type Bar (information readily available from Intellisense) tells one pretty much everything there is to know about it. By contrast, the fact that a struct Foo has an exposed read-write property of Boz does not say anything about whether writing to Boz will mutate a field in the struct, or whether it will mutate some object to which Boz holds a reference. Exposing fields directly will offer cleaner semantics, and often also result in faster-running code.
I'm trying to formalise the usage of the "out" keyword in C# for a project I'm on, particularly with respect to any public methods. I can't seem to find any best practices out there and would like to know what is good or bad.
Sometimes I'm seeing some methods signatures that look like this:
public decimal CalcSomething(Date start, Date end, out int someOtherNumber){}
At this point, it's just a feeling, this doesn't sit well with me. For some reason, I'd prefer to see:
public Result CalcSomething(Date start, Date end){}
where the result is a type that contains a decimal and the someOtherNumber. I think this makes it easier to read. It allows Result to be extended or have properties added without breaking code. It also means that the caller of this method doesn't have to declare a locally scoped "someOtherNumber" before calling. From usage expectations, not all callers are going to be interested in "someOtherNumber".
As a contrast, the only instances that I can think of right now within the .Net framework where "out" parameters make sense are in methods like TryParse(). These actually make the caller write simpler code, whereby the caller is primarily going to be interested in the out parameter.
int i;
if (int.TryParse("1", out i)) {
    DoSomething(i);
}
I'm thinking that "out" should only be used if the return type is bool and the expected usages are where the "out" parameters will always be of interest to the caller, by design.
Thoughts?
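For comparison, here is a sketch of both signatures side by side. The question's Date type isn't shown, so I'm assuming DateTime, and CalcResult, Total and the arithmetic inside are placeholders of mine, not the asker's real calculation.

```csharp
using System;

// Hypothetical result type holding the decimal and the extra number.
public sealed class CalcResult
{
    public CalcResult(decimal total, int someOtherNumber)
    {
        Total = total;
        SomeOtherNumber = someOtherNumber;
    }

    public decimal Total { get; }
    public int SomeOtherNumber { get; }
}

public static class Calculator
{
    // out style: the caller must declare a variable even if it ignores it.
    public static decimal CalcSomething(DateTime start, DateTime end, out int someOtherNumber)
    {
        someOtherNumber = (end - start).Days;   // placeholder logic
        return someOtherNumber * 1.5m;
    }

    // Result style: new members can be added to CalcResult later
    // without changing this signature.
    public static CalcResult CalcSomething(DateTime start, DateTime end)
    {
        int days = (end - start).Days;          // placeholder logic
        return new CalcResult(days * 1.5m, days);
    }
}
```

A caller that only wants the decimal can write Calculator.CalcSomething(start, end).Total and never mention the other number.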
There is a reason that one of the static code analysis (=FxCop) rules points at you when you use out parameters. I'd say: only use out when really needed in interop type scenarios. In all other cases, simply do not use out. But perhaps that's just me?
This is what the .NET Framework Developer's Guide has to say about out parameters:
Avoid using out or reference parameters.
Working with members that define out or reference parameters requires that the developer understand pointers, subtle differences between value types and reference types, and initialization differences between out and reference parameters.
But if you do use them:
Do place all out parameters after all of the pass-by-value and ref parameters (excluding parameter arrays), even if this results in an inconsistency in parameter ordering between overloads.
This convention makes the method signature easier to understand.
Your approach is better than out, because you can "chain" calls that way:
DoSomethingElse(DoThing(a,b).Result);
as opposed to
DoThing(a, out b);
DoSomethingElse(b);
Implementing the TryParse methods with "out" was a mistake, IMO. Those would have been very convenient in chains.
There are only very few cases where I would use out. One of them is if your method returns two variables that from an OO point of view do not belong into an object together.
If, for example, you want to get the most common word in a text string, and the 42nd word in the text, you could compute both in the same method (having to parse the text only once). But for your application, these pieces of information have no relation to each other: you need the most common word for statistical purposes, but you only need the 42nd word because your customer is a geeky Douglas Adams fan.
Yes, that example is very contrived, but I haven't got a better one...
I just had to add that starting from C# 7, the use of the out keyword makes for very readable code in certain instances, when combined with inline variable declaration. While in general you should rather return a (named) tuple, control flow becomes very concise when a method has a boolean outcome, like:
if (int.TryParse(mightBeCount, out var count))
{
// Successfully parsed count
}
I should also mention, that defining a specific class for those cases where a tuple makes sense, more often than not, is more appropriate. It depends on how many return values there are and what you use them for. I'd say, when more than 3, stick them in a class anyway.
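A quick sketch of the named-tuple alternative (C# 7 or later). CountParser and ParseCount are names I made up; the method just wraps int.TryParse to show the shape of the return type.

```csharp
using System;

public static class CountParser
{
    // Returns both values as a named tuple instead of using an out parameter.
    public static (bool Success, int Count) ParseCount(string text)
    {
        return int.TryParse(text, out var count) ? (true, count) : (false, 0);
    }
}
```

The caller can deconstruct it with var (ok, count) = CountParser.ParseCount(input); or read result.Success and result.Count by name. Once the tuple grows past two or three elements, a small class reads better.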
One advantage of out is that the compiler will verify that CalcSomething does in fact assign a value to someOtherNumber. It will not verify that the someOtherNumber field of Result has a value.
Stay away from out. It's there as a low-level convenience. But at a high level, it's an anti-technique.
int? i = Util.TryParseInt32("1");
if(i == null)
return;
DoSomething(i);
If you have ever seen and worked with the MS
namespace System.Web.Security
MembershipProvider
public abstract MembershipUser CreateUser(string username, string password, string email, string passwordQuestion, string passwordAnswer, bool isApproved, object providerUserKey, out MembershipCreateStatus status);
You will need a bucket. This is an example of a class breaking many design paradigms. Awful!
Just because the language has out parameters doesn't mean they should be used, e.g. goto.
The use of out looks more like the dev was either too lazy to create a type or wanted to try a language feature.
Even for the completely contrived MostCommonAnd42ndWord example above, I would use a List or a new type contrivedresult with 2 properties.
The only good reason I've seen in the explanations above was interop scenarios, when forced to. Assuming that is a valid statement.
You could create a generic tuple class for the purpose of returning multiple values. This seems to be a decent solution but I can't help but feel that you lose a bit of readability by returning such a generic type (Result is no better in that regard).
One important point, though, which James Curran also pointed out, is that the compiler enforces an assignment of the value. This is a general pattern I see in C#: you must state certain things explicitly, for more readable code. Another example of this is the override keyword, which you don't have in Java.
If your result is more complex than a single value, you should, if possible, create a result object. The reasons I have to say this?
The entire result is encapsulated. That is, you have a single package that informs the code of the complete result of CalcSomething. Instead of having external code interpret what the decimal return value means, you can name the properties for your previous return value, your someOtherNumber value, etc.
You can include more complex success indicators. The function call you wrote might throw an exception if end comes before start, but then exception throwing is the only way to report errors. Using a result object, you can include a boolean or enumerated "Success" value, with appropriate error reporting.
You can delay the execution of the result until you actually examine the "result" field. That is, the execution of any computing needn't be done until you use the values.
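Roughly what I have in mind, as a sketch only: CalcOutcome, its members and the day-count calculation are invented for illustration, and I'm assuming DateTime for the question's Date type. Lazy<T> is used to defer the computation until Value is read.

```csharp
using System;

// Hypothetical result object with a success flag and a deferred value.
public sealed class CalcOutcome
{
    private readonly Lazy<decimal> _value;

    private CalcOutcome(bool success, string error, Func<decimal> compute)
    {
        Success = success;
        Error = error;
        _value = new Lazy<decimal>(compute);
    }

    public bool Success { get; }

    // Null when Success is true.
    public string Error { get; }

    // The computation runs the first time this is read, then is cached.
    public decimal Value
    {
        get { return _value.Value; }
    }

    public static CalcOutcome Calculate(DateTime start, DateTime end)
    {
        if (end < start)
        {
            // Report the problem in the result instead of throwing.
            return new CalcOutcome(false, "end comes before start", () => 0m);
        }
        return new CalcOutcome(true, null, () => (decimal)(end - start).TotalDays);
    }
}
```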
I'm just wondering how other developers tackle this issue of getting 2 or 3 answers from a method.
1) return an object[]
2) return a custom class
3) use an out or ref keyword on multiple variables
4) write or borrow (F#) a simple Tuple<> generic class
http://slideguitarist.blogspot.com/2008/02/whats-f-tuple.html
I'm working on some code now that does data refreshes. From the method that does the refresh I would like to pass back (1) Refresh Start Time and (2) Refresh End Time.
At a later date I may want to pass back a third value.
Thoughts? Any good practices from open source .NET projects on this topic?
It entirely depends on what the results are. If they are related to one another, I'd usually create a custom class.
If they're not really related, I'd either use an out parameter or split the method up. If a method wants to return three unrelated items, it's probably doing too much. The exception to this is when you're talking across a web-service boundary or something else where a "purer" API may be too chatty.
For two, usually 4)
More than that, 2)
Your question points to the possibility that you'll be returning more data in the future, so I would recommend implementing your own class to contain the data.
What this means is that your method signature will remain the same even if the inner representation of the object you're passing around changes to accommodate more data. It's also good practice for readability and encapsulation reasons.
Architecture-wise, I'd always go with a custom class when a specific set of values needs to be returned. Why? Simply because a class is a "blueprint" for an often-used data type; creating your own data type, which is what this is, will help you get a good structure and help others program against your interface.
Personally, I hate out/ref params, so I'd rather not use that approach. Also, most of the time, if you need to return more than one result, you are probably doing something wrong.
If it really is unavoidable, you will probably be happiest in the long run writing a custom class. Returning an array is tempting as it is easy and effective in the short term, but using a class gives you the option of changing the return type in the future without having to worry too much about causing problems downstream. Imagine the potential for a debugging nightmare if someone swaps the order of two elements in the array that is returned....
I use out if it's only 1 or 2 additional variables (for example, a function returns a bool that is the actual important result, but also a long as an out parameter to return how long the function ran, for logging purposes).
For anything more complicated, I usually create a custom struct/class.
I think the most common way a C# programmer would do this would be to wrap the items you want to return in a separate class. This would provide you with the most flexibility going forward, IMHO.
It depends. For an internal only API, I'll usually choose the easiest option. Generally that's out.
For a public API, a custom class usually makes more sense - but if it's something fairly primitive, or the natural result of the function is a boolean (like *.TryParse) I'll stick with an out param. You can do a custom class with an implicit cast to bool as well, but that's usually just weird.
For your particular situation, a simple immutable DateRange class seems most appropriate to me. You can easily add that new value without disturbing existing users.
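Something along these lines, as a minimal sketch; the member names and the validity check are my own assumptions about what such a class would contain.

```csharp
using System;

// Hypothetical immutable range: values are set once in the constructor.
public sealed class DateRange
{
    public DateRange(DateTime start, DateTime end)
    {
        if (end < start)
        {
            throw new ArgumentException("end must not precede start", nameof(end));
        }
        Start = start;
        End = end;
    }

    public DateTime Start { get; }
    public DateTime End { get; }

    // Derived value; adding members like this does not break existing callers.
    public TimeSpan Duration
    {
        get { return End - Start; }
    }
}
```

The refresh method would then return new DateRange(refreshStart, refreshEnd), and a third value can be added later as another read-only property.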
If you're wanting to send back the refresh start and end times, that suggests a possible class or struct, perhaps called DataRefreshResults. If your possible third value is also related to the refresh, then it could be added. Remember, a struct is passed by value by default, so when it is used as a local variable or return value it is typically allocated on the stack and does not need to be garbage-collected.
Some people use KeyValuePair for two values. It's not great though because it just labels the two things as Key and Value. Not very descriptive. Also it would seriously benefit from having this added:
public static class KeyValuePair
{
    // Generic method, so K and V can be inferred from the arguments.
    public static KeyValuePair<K, V> Make<K, V>(K k, V v)
    {
        return new KeyValuePair<K, V>(k, v);
    }
}
Saves you from having to specify the types when you create one. Generic methods can infer types, generic class constructors can't.
For your scenario you may want to define generic Range{T} class (with checks for the range validity).
If the method is private, then I usually use tuples from my helper library. Public or protected methods generally deserve a separate type.
Return a custom type, but don't use a class, use a struct - no memory allocation/garbage collection overhead means no downsides.
If 2, a Pair.
If more than 2 a class.
Another solution is to return a dictionary of named object references. To me, this is pretty equivalent to using a custom return class, but without the clutter. (And using RTTI and reflection it is just as typesafe as any other solution, albeit dynamically so.)
It depends on the type and meaning of the results, as well as whether the method is private or not.
For private methods, I usually just use a Tuple, from my class library.
For public/protected/internal methods (i.e. not private), I use either an out parameter or a custom class.
For instance, if I'm implementing the TryXYZ pattern, where you have an XYZ method that throws an exception on failure and a TryXYZ method that returns Boolean, TryXYZ will use an out parameter.
If the results are sequence-oriented (ie. return 3 customers that should be processed) then I will typically return some kind of collection.
Other than that I usually just use a custom class.
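As an illustration of the TryXYZ pattern mentioned above, here is a small sketch. PortNumber and the 0 to 65535 range check are assumptions of mine; the point is only the pairing of a throwing method with a Boolean-returning one that uses an out parameter.

```csharp
using System;

public static class PortNumber
{
    // XYZ: throws on failure.
    public static int Parse(string text)
    {
        int port;
        if (!TryParse(text, out port))
        {
            throw new FormatException("Not a valid port number: " + text);
        }
        return port;
    }

    // TryXYZ: returns false on failure and reports the value through out.
    public static bool TryParse(string text, out int port)
    {
        if (int.TryParse(text, out port) && port >= 0 && port <= 65535)
        {
            return true;
        }
        port = 0;   // out parameters must be assigned on every path
        return false;
    }
}
```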
If a method outputs two or three related values, I would group them in a type. If the values are unrelated, the method is most likely doing way too much and I would refactor it into a number of simpler methods.