I don't mean to troll, but I really don't get it. Why would language designers allow private methods instead of some naming convention (see __ in Python)?
I searched for the answer and usual arguments are:
a) To make the implementation cleaner/avoid long vertical list of methods in IDE autocompletion
b) To announce to the world which methods are public interface and which may change and are just for implementation purpose
c) Readability
Ok so now, all of those could be achieved by naming all private methods with a __ prefix or by a "private" keyword which doesn't have any implications other than being information for the IDE (don't put those in autocompletion) and other programmers (don't use it unless you really must). Hell, one could even require an unsafe-like keyword to access private methods to really discourage this.
I am asking this because I work with some C# code and I keep changing private methods to public for test purposes, as many in-between private methods (like string generators for XML serialization) are very useful for debugging purposes (like writing some part of a string to a log file etc.).
So my question is:
Is there anything which is achieved by access restriction but couldn't be achieved by naming conventions without restricting the access ?
There are a couple of questions/issues that you are raising, so I'll handle each one separately.
How do I test private methods?
Beyond the debate/discussion of whether you should test private methods, there are a few ways you can do this.
Refactor
A broad general answer is that you can refactor the behaviour into a separate testable class which the original class leverages. This is debatable and not always applicable depending on your design or privileges to do so.
InternalsVisibleTo
A common routine is to extract testable logic into a method and mark it as internal. In your assembly properties, you can add an attribute [InternalsVisibleTo("MyUnitTestingProject")]. This will allow you to access the method from your unit testing project while still hiding access to all other assemblies. http://msdn.microsoft.com/en-us/library/system.runtime.compilerservices.internalsvisibletoattribute.aspx
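For illustration, a minimal sketch of the arrangement (XmlBuilder is a made-up class under test; the test project name matches the attribute string):
using System.Runtime.CompilerServices;

[assembly: InternalsVisibleTo("MyUnitTestingProject")]

public class XmlBuilder
{
    // internal rather than private: visible to the named test assembly,
    // hidden from every other assembly
    internal string CreateTempString()
    {
        return "Hello World!";
    }
}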
However, given the comments made by you, you are unable to change the structure of the source code permanently in your workplace; you change the access modifiers to test, run the tests, then change them back before committing. In this case there are two options:
Partial testing classes
Mark the class as partial and create a second partial class file which will contain your tests (or public wrappers to the private members). Then, when it comes time to merge/commit, just remove your partial class files from the project and remove the partial keyword from the main class. In addition, you can wrap the entire testing code file in #if DEBUG (or another directive) so it's only available when unit testing and will not affect production/development code.
http://weblogs.asp.net/ralfw/archive/2006/04/14/442836.aspx
public partial class MyClass
{
    private string CreateTempString()
    {
        return "Hello World!";
    }
}

#if DEBUG
public partial class MyClass // in file "MyClass_Accessor.cs"
{
    public string CreateTempString_Accessor()
    {
        return CreateTempString();
    }
}
#endif
Reflection
You can still access private members via reflection:
using System.Reflection;

public class Test
{
    private string PrivateField = "private";
}

Test t = new Test();
var privateFieldInfo = typeof(Test).GetField("PrivateField", BindingFlags.Instance | BindingFlags.NonPublic);
Console.WriteLine(privateFieldInfo.GetValue(t)); // outputs "private"
Your unit tests could pull out private/hidden data in classes this way. In fact, Microsoft provides two classes that do exactly this for you: PrivateObject and PrivateType
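For example, a rough sketch with PrivateObject (from the MSTest framework), reusing the Test class above:
// Requires a reference to Microsoft.VisualStudio.TestTools.UnitTesting
var accessor = new PrivateObject(new Test());
Console.WriteLine(accessor.GetFieldOrProperty("PrivateField")); // outputs "private"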
Given your in-house development process limitations, this is likely your best bet as you'll be able to manage your own tests outside the main project libraries without having to alter anything in the core code.
Note that Silverlight (and likely other Core-CLR runtimes) strictly enforce non-public access during reflection, so this option is not applicable in those cases.
So, there are a few ways to test private members, and I'm sure there are a few more clever/not-so-clever methods of doing so lurking out there.
Could all of those benefits be achieved by naming all private methods with a __ prefix or by introducing a private-but-accessible access modifier?
The benefits cited by you (citing others) being:
To make the implementation cleaner/avoid long vertical list of methods in IDE autocompletion
To announce to the world which methods are public interface and which may change and are just for implementation purpose
Readability
Now you add that these could all be achieved with __ or by a change to the language specification and/or IDEs that would support a private-but-accessible access modifier, possibly with some unsafe-like keyword that would discourage use. I don't think it will be worthwhile going into a debate about changing the current features/behaviours of the language and IDE (and possibly it wouldn't make sense for StackOverflow), so let's focus on what is available:
1) Cleaner implementation and intellisense
The Visual Studio IDE (I can't speak for MonoDevelop) does support hiding members from intellisense when they're marked with the [EditorBrowsableAttribute]. But this only works if the developer enables the option "Hide Advanced Members" in their Visual Studio options. (Note that it will not suppress members in the intellisense when you're working within the same assembly.)
http://msdn.microsoft.com/en-us/library/system.componentmodel.editorbrowsableattribute.aspx
So marking a public member as such makes it behave (intellisense-wise) as internal-ish (no [InternalsVisibleTo] support). So if you're in the same assembly, or if you do not have the Hide Advanced Members enabled, you'll still see a long list of __ members in the intellisense. Even if you have it hidden from intellisense, it's still fully accessible according to its current access modifier.
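For illustration, a sketch of how the attribute is applied (the class and method are made up):
using System.ComponentModel;

public class StringHelpers
{
    // Hidden from the intellisense list in consuming assemblies,
    // yet still public and fully callable.
    [EditorBrowsable(EditorBrowsableState.Never)]
    public string CreateTempString()
    {
        return "Hello World!";
    }
}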
2) Public usage interface/contract
This assumes that all developers in the C#, and Visual Basic, and F#, and C++.NET, and any other .NET development world will adopt the same __ naming convention and adhere to it as assemblies are compiled and interchanged between developers. Maybe if you're scripting in IronPython you can get away with it, or if your company internally adopts this approach. But generally speaking, it's not going to happen, and .NET developers will likely be hesitant to leverage libraries adopting this convention, as it is not the general .NET culture.
3) Readability
This kind of goes with #2 in that what is "readable" depends on the culture and what developers within that field expect; it is certainly debatable and subjective. I would wager that the majority of C# developers find the strict/enforced encapsulation to significantly improve code readability, and I'm sure a good chunk of them would find frequent use of __ detracting from that. (As an aside, I'm sure it's not uncommon for developers to adopt _ or __ prefixes for private fields and still keep them private.)
However, readability and encapsulation in C# go beyond just public/private accessors. In C#, there are private, public, protected internal, protected, and internal (am I missing one?), and each has its own use and provides different information for developers. Now I'm not sure how you would go about communicating those accessors only via __. Suggesting that a single underscore means protected and a double underscore means private would definitely hamper readability.
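For reference, a rough sketch of what each of those modifiers communicates (the class and fields are made up):
public class Example
{
    public int VisibleToEveryone;              // any assembly
    internal int SameAssemblyOnly;             // only the declaring assembly
    protected int DerivedClassesOnly;          // only this class and subclasses
    protected internal int AssemblyOrDerived;  // the declaring assembly OR subclasses
    private int ThisClassOnly;                 // only this class
}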
Is there anything which is achieved by access restriction that couldn't be achieved by naming conventions without restricting the access?
If you're asking why the C# design team went this route, well, I guess you'd have to ask Mr. Hejlsberg one day. I know they were creating a language gleaning the best parts of C/C++ with a strong focus on object-oriented principles.
As to what is achieved by enforcing access via the access modifiers:
More guaranteed proper access by consumers of the API. Suppose your class utilizes a MakeTempStringForXMLSerialization method which stores the string as a class property for serialization, but for performance reasons forgoes costly checks (because you, as a developer, have done unit testing to ensure that all of the class's fields will be valid via the public API). If a third party then does some lovely garbage-in-garbage-out, they'll blame you and/or the vendor for a shoddy library. Is that fair? Not necessarily; they put the garbage in, but the reality is many will still blame the vendor.
For new developers attempting to understand how your API works, it helps to simplify their experience. Yes, developers should read the documentation, but if the public API is intuitive (as it generally should be) and not exposing a boatload of members that shouldn't be accessed, then it's far less confusing and less likely they'll accidentally feed garbage into the system. It will also lower the overhead to get the developers to consume and leverage your API effectively without hassles. This is especially the case when it comes to any updates you publish of your API in which you wish to change/refactor/improve your internal implementation details.
So from a business perspective, enforced access protects the vendor from liability and bad relations with customers, and makes the library more appealing for developers to purchase and invest in.
Now, as you say, this could all work if everyone followed the convention that __ members should not be accessed outside of the class, or if there were some unsafe marker where you say, "If you do this, and it breaks, it's not my fault!" But that simply is not the reality of C#/.NET development. The access modifiers provided by C# give you what the __ convention promises while ensuring that all developers adhere to it.
One could argue that the restricted access is an illusion, as consumers can work around it via reflection (as demonstrated above), and thus there is actually no programmatic difference between the access modifiers and __ (or other) notation. (On Silverlight/Core-CLR, though, there most definitely is a programmatic difference!) But the work developers would have to go through to access those private fields is the difference between giving consumers an open door with a sign saying "don't go in" (that you hope they can read) and a door with a lock that they have to bash down.
So in the end, what does it actually provide? Standardized, enforced access to members, whereas __ provides non-standardized, non-enforced access to members. In addition, __ lacks the range of description that the variety of available access modifiers supplies.
Update (January 2nd, 2013)
I know it's been half a year, but I've been reading through the C# language specification and came across this little gem from section 2.4.2 Identifiers which states:
Identifiers containing two consecutive underscore characters (U+005F) are reserved for use by the implementation. For example, an implementation might provide extended keywords that begin with two underscores.
I imagine nothing necessarily bad will happen; most likely nothing will break catastrophically if you do. But it's just one more aspect to consider when thinking about using double underscores in your code: the specification suggests that you do not.
The reason private methods exist is to provide encapsulation.
This allows you to provide public contracts by which you want your object to interact, yet have your internal logic and state be encapsulated in a way that refactoring would not affect consumers.
For example, you could provide public named properties, yet decide to store state in a Dictionary, similar to what typed DataSets do.
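A minimal sketch of that idea (the Customer class and its Name property are made up):
using System.Collections.Generic;

public class Customer
{
    // Internal state lives in a dictionary, as typed DataSets do;
    // consumers never see this detail.
    private readonly Dictionary<string, object> _data = new Dictionary<string, object>();

    // The public contract stays stable even if the storage strategy changes.
    public string Name
    {
        get { return (string)_data["Name"]; }
        set { _data["Name"] = value; }
    }
}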
It's not really a "Security" feature (since you always have reflection to override it), but a way to keep public APIs separate from internal implementations.
Nobody should "depend" on your internal and private implementation, they should only depend on your public (or protected) implementation.
Now, regarding unit testing, it is usually undesired to test internal implementations.
One common approach though is to declare it internal and give the test assemblies access, through InternalsVisibleToAttribute, as Chris mentioned.
Also, a clear distinction between public, private, and protected is extremely useful with inheritance, in defining what you expect to be overridable and what shouldn't be.
In general, there is no real point to marking fields or methods private. It provides an artificial sense of security. All the code is running inside the same process, presumably written by people with a friendly relationship to each other. Why do they need access controls to protect the code from each other? That sounds like a cultural issue to fix.
Documentation provides more information than private/protected/public does, and a good system will have documentation anyway. Use that to guide your development. If you mark a method as "private", and a developer calls it anyway, that's a bad developer, and he will ruin your system in some other way eventually. Fix that developer.
Access controls eventually get in the way of development, and you spend time just making the compiler happy.
Of course, if you are talking strictly about C#, then you should stick to marking methods as private, because other C# developers will expect it.
It hides a lot of internal details, especially for library implementers who may actually want to hide those details.
Keep in mind that there are commercial libraries out there, being sold. They expose only a very limited set of options in their interface to their users and they want all the rest to be well hidden.
If you design a language that doesn't give this option, you're making a language that will be used almost exclusively for open-source projects (or mostly for scripting).
I don't know much about Python though; it would be interesting to know if there are commercial closed-source libraries written in Python.
The reasons you mentioned are good enough too.
If you're publishing a library with public members, it's possible for users to use these members. You can use naming conventions to tell them they shouldn't, but some people will.
If your "internal" members are public, it becomes impossible for you to guarantee compatibility with the old version of the library, unless you keep all of the internal structure the same and make only very minor changes. That means that every program using your library must use the version it was compiled against (and will most likely do this by shipping its own version, rather than using a shared system version).
This becomes a problem when your library has a security update. At best, all programs are using different shared versions, and you have to backport the security update to every version that's in the wild. More likely, every program ships its own version and simply will not get the update unless that program's author does it.
If you design your public API carefully, with private members for the things you might change, you can keep breaking changes to a minimum. Then all programs/libraries can use the latest version of your library, even if they were compiled for an earlier one, and they can benefit from the updates.
This can lead to some interesting debate. A main reason for marking methods (and member variables) private is to enforce information hiding. External code shouldn't know or care about how your class works. You should be free to change your implementation without breaking other things. This is well-established software practice.
However in real life, private methods can get in the way of testing and unforeseen changes that need to occur. Python has a belief "we're all consenting adults" and discourages private methods. So, sometimes I'll just use naming conventions and/or comment "not intended for public use". It's another one of those OO-theory vs. reality issues.
Long explanation aside, I have a situation where I need to basically re-implement a .NET framework class in order to extend the behavior in a manner that is not compatible with an inheritance or composition/delegation strategy. The question is not a matter of whether the course of action I am to take is what you would do, or recommend, it is instead a question of naming/coding-style.
Is there a paradigm for naming classes and methods that have the same functionality as an existing class or method, a la the ClassEx/MethodEx convention that exists in C++?
[edit]
I understand that choosing good names for this is important... I haven't written a line of code yet, and am instead taking the time to think through the ramifications of what I am about to undertake, and that includes searching for a clear, descriptive, name while trying to be concise. The issue is that the name I have in mind is not terribly concise.
[/edit]
Here are the ways I've seen in the .NET Framework itself:
Call it something slightly different, but don't use any specific suffix. For example, System.TimeZoneInfo was introduced to supersede System.TimeZone.
Put it in another namespace. For example, the WPF Button is in System.Windows instead of System.Windows.Forms.
Suffix it with a number. For example X509Certificate2 versus X509Certificate. (This practice was common with COM interfaces but has fallen out of favor in .NET.)
Note that the naming of TimeZoneInfo is a publicized case of Microsoft tackling this controversial naming issue head on. See http://blogs.msdn.com/kathykam/archive/2007/03/28/bye-bye-system-timezone2-hello-system-timezoneinfo.aspx and http://blogs.msdn.com/kcwalina/archive/2006/10/06/TimeZone2Naming.aspx for excellent information.
Try to name your classes/methods with real meaning.
For example, if you're extending the Random functionality to create random strings, name the class StringRandom or StringRandomizer and such.
If you're writing a class with general-purpose extension methods that apply to a specific class/interface, for example IList, name it ListExtensions.
If you're writing a random.Next method that returns a random number between minValue and maxValue, including maxValue, name the method NextIncludingMaxValue.
If you're writing a queue.Dequeue method that is thread safe, name it DequeueThreadSafe.
If you're writing a queue.Dequeue method that blocks until another thread enqueues an item, name it DequeueBlocking.
And such...
C#, for the most part, avoids these situations entirely due to the ease with which you can extend a class with new methods without breaking binary compatibility (you can add methods at will to a class, just not to an interface), and through the use of extension methods.
There are few reasons to ever do this in C#, unlike C++. In C++, adding a method breaks compatibility, so "Ex" becomes a much more common scenario.
I give all my methods (and properties) camelCase names: so for example Invalidate is a framework method name, and invalidate is the name of one of my methods.
This (using camelCase names) is unconventional, so some people object to it, but I find it convenient.
No such problem with class names (for which I use the conventional UpperCase), because for class names there are their namespaces to distinguish them from the framework classes.
I have been trying to follow StyleCop's guidelines on a project, to see if the resulting code was better in the end. Most rules are reasonable or a matter of opinion on coding standard, but there is one rule which puzzles me, because I haven't seen anyone else recommend it, and because I don't see a clear benefit to it:
SA1101: The call to {method or property name} must begin with the 'this.' prefix to indicate that the item is a member of the class.
On the downside, the code is clearly more verbose that way, so what are the benefits of following that rule? Does anyone here follow that rule?
I don't really follow this guidance unless I'm in one of the scenarios where you need it:
there is an actual ambiguity - mainly this impacts either constructors (this.name = name;) or things like Equals (return this.id == other.id;)
you want to pass a reference to the current instance
you want to call an extension method on the current instance
Other than that I consider this clutter. So I turn the rule off.
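A quick sketch of those three scenarios (all names here are illustrative):
using System;
using System.Collections.Generic;

public class Person
{
    private string name;

    public Person(string name)
    {
        this.name = name;                  // 1) disambiguate the field from the parameter
    }

    public void Register(List<Person> registry)
    {
        registry.Add(this);                // 2) pass a reference to the current instance
    }

    public void Describe()
    {
        Console.WriteLine(this.Summary()); // 3) extension methods need an explicit "this."
    }
}

public static class PersonExtensions
{
    public static string Summary(this Person person)
    {
        return "a person";
    }
}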
It can make code clearer at a glance. When you use this, it's easier to:
Tell static and instance members apart. (And distinguish instance methods from delegates.)
Distinguish instance members from local variables and parameters (without using a naming convention).
I think this article explains it a little
http://blogs.msdn.microsoft.com/sourceanalysis/archive/2008/05/25/a-difference-of-style.aspx
...a brilliant young developer at Microsoft (ok, it was me) decided to take it upon himself to write a little tool which could detect variances from the C# style used within his team. StyleCop was born. Over the next few years, we gathered up all of the C# style guidelines we could find from the various teams within Microsoft, and picked out all of best practices which were common to these styles. These formed the first set of StyleCop rules. One of the earliest rules that came out of this effort was the use of the this prefix to call out class members, and the removal of any underscore prefixes from field names. C# style had officially grown apart from its old C++ tribe.
this.This
this.Does
this.Not
this.Add
this.Clarity
this.Nor
this.Does
this.This
this.Add
this.Maintainability
this.To
this.Code
The usage of "this.", when used excessively or a forced style requirement, is nothing more then a contrivance used under the guise that there is < 1% of developers that really do not understand code or what they are doing, and makes it painful for 99% who want to write easily readable and maintainable code.
As soon as you start typing, IntelliSense will list the content available in the scope of where you are typing; "this." is not necessary to expose class members, and unless you are completely clueless about what you are coding you should be able to easily find the item you need.
Even if you are completely clueless, use "this." to hint at what is available, but don't leave it in the code. There are also a slew of add-ons like ReSharper that help bring clarity to the scope and expose the contents of objects more efficiently. It is better to learn how to use the tools provided to you than to develop a bad habit that is hated by a large number of your co-workers.
Any developer that does not inherently understand the scope of static, local, class or global content should not rely on "hints" to indicate the scope. "this." is worse than Hungarian notation, as at least Hungarian notation provided an idea about the type the variable is referencing and served some benefit. I would rather see "_" or "m" used to denote class field members than see "this." everywhere.
I have never had an issue, nor seen an issue with a fellow developer that repeatedly fights with code scope or writes code that is always buggy because of not using "this." explicitly. It is an unwarranted fear that "this." prevents future code bugs and is often the argument used where ignorance is valued.
Coders grow with experience; "this." is like asking someone to put training wheels on their bike as an adult because that is what they first had to use to learn how to ride a bike. An adult might fall off a bike 1 in 1,000 times they get on it, but that is no reason to force them to use training wheels.
"this." should be banned from the language definition for C#, unfortunately there is only one reason for using it, and that is to resolve ambiguity, which could also be easily resolved through better code practices.
A few basic reasons for using this (and I coincidentally always prefix class values with the name of the class of which they are a part as well - even within the class itself).
1) Clarity. You know right this instant which variables you declared in the class definition and which you declared as locals, parameters and whatnot. In two years, you won't know that and you'll go on a wondrous voyage of re-discovery that is absolutely pointless and not required if you specifically state the parent up front. Somebody else working on your code has no idea from the get-go and thus benefits instantly.
2) Intellisense. If you type 'this.' you get all instance-specific members and properties in the help. It makes finding things a lot easier, especially if you're maintaining somebody else's code or code you haven't looked at in a couple of years. It also helps you avoid errors caused by misconceptions of what variables and methods are declared where and how. It can help you discover errors that otherwise wouldn't show up until the compiler choked on your code.
3) Granted you can achieve the same effect by using prefixes and other techniques, but this begs the question of why you would invent a mechanism to handle a problem when there is a mechanism to do so built into the language that is actually supported by the IDE? If you touch-type, even in part, it will ultimately reduce your error rate, too, by not forcing you to take your fingers out of the home position to get to the underscore key.
I see lots of young programmers who make a big deal out of the time they will save by not typing a character or two. Most of your time will be spent debugging, not coding. Don't worry so much about your typing speed. Worry more about how quickly you can understand what is going on in the code. If you save a total of five minutes coding and wind up spending an extra ten minutes debugging, you've slowed yourself down, no matter how fast you look like you're going.
Note that the compiler doesn't care whether you prefix references with this or not (unless there's a name collision with a local variable and a field or you want to call an extension method on the current instance.)
It's up to your style. Personally, I remove this. from code, as I think it decreases the signal-to-noise ratio.
Just because Microsoft uses this style internally doesn't mean you have to. StyleCop seems to be a MS-internal tool gone public. I'm all for adhering to the Microsoft conventions around public things, such as:
type names are in PascalCase
parameter names are in camelCase
interfaces should be prefixed with the letter I
use singular names for enums, except for when they're [Flags]
...but what happens in the private realms of your code is, well, private. Do whatever your team agrees upon.
Consistency is also important. It reduces cognitive load when reading code, especially if the code style is as you expect it. But even when dealing with a foreign coding style, if it's consistent then it won't take long to become used to it. Use tools like ReSharper and StyleCop to ensure consistency where you think it's important.
Using .NET Reflector suggests that Microsoft isn't that great at adhering to the StyleCop coding standards in the BCL anyway.
I do follow it, because I think it's really convenient to be able to tell apart access to static and instance members at first glance.
And of course I have to use it in my constructors, because I normally give the constructor parameters the same names as the field their values get assigned to. So I need "this" to access the fields.
In addition, it is possible to duplicate variable names in a method, so using 'this' can make it clearer:
class foo {
    private string aString;

    public void SetString(string aString) {
        // this.aString refers to the class field
        // aString refers to the method parameter
        this.aString = aString;
    }
}
I follow it mainly for intellisense reasons. It is so nice typing this. and getting a concise list of properties, methods, etc.
This is a subjective thing of course, but I don't see anything positive in prefixing interface names with an 'I'. To me, Thing is practically always more readable than IThing.
My question is, why does this convention exist then? Sure, it makes it easier to tell interfaces from other types. But wouldn't that argument extend to retaining the Hungarian notation, which is now widely censured?
What's your argument for that awkward 'I'? Or, more importantly, what could be Microsoft's?
Conventions (and criticism against them) all have a reason behind them, so let's run down some reasons behind conventions
Interfaces are prefixed as I to differentiate interface types from implementations - e.g., as mentioned above there needs to be an easy way to distinguish between Thing and its interface IThing so the convention serves to this end.
Interfaces are prefixed with I to differentiate them from abstract classes - there is ambiguity when you see the following code:
public class Apple: Fruit
Without the convention one wouldn't know if Apple was inheriting from another class named Fruit, or if it were an implementation of an interface named Fruit, whereas IFruit will make this obvious:
public class Apple: IFruit
Principle of least surprise applies.
Not all uses of Hungarian notation are censured - early uses of Hungarian notation signified a prefix which indicated the type of the object, followed by the variable name or sometimes an underscore before the variable name. This was useful for certain programming environments (think Visual Basic 4-6), but as true object-oriented programming grew in popularity it became impractical and redundant to specify the type. This became especially problematic when it came to intellisense.
Today Hungarian notation is acceptable for distinguishing UI elements from the actual data and from similarly associated UI elements, e.g., txtObject for a textbox, lblObject for the label that is associated with that textbox, while the data for the textbox is simply Object.
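For example (WinForms-flavored; the names are hypothetical):
using System.Windows.Forms;

public class CustomerForm : Form
{
    private TextBox txtCustomerName; // the textbox the user types into
    private Label lblCustomerName;   // the label associated with that textbox
    private string customerName;     // the data itself carries no prefix
}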
I also have to point out that the original use of Hungarian notation wasn't for specifying data types (called Systems Hungarian notation) but rather for specifying the semantic use of a variable name (called Apps Hungarian notation). Read more on it in the Wikipedia entry on Hungarian notation.
The reason I do it is simple: because that's the convention. I'd rather just follow it than have all my code look different, making it harder to read and learn.
Thing is more readable name than IThing. I'm from the school of thought that we should program to interfaces rather than specific implementations. So generally speaking, interfaces should have priority over implementations. I prefer to give the more readable name to the interface rather than the implementation (i.e., my interfaces are named without the 'I' prefix).
Well, one obvious consideration would be the (very common) IFoo and Foo pair (when abstracting Foo), but more generally it is often fundamental to know whether something is an interface or a class. Yes, it is partly redundant, but IMO it is different from things like sCustomerName - here, the name itself (customerName) should be enough to understand the variable.
But with CustomerRepository - is that a class, or the abstract interface?
Also: expectation; the fact is, right or wrong, that is what people expect. That is almost reason enough.
I'm sure your question was the topic of many lengthy discussions within the Microsoft team that worked on the .NET Framework and its standards.
I think the most telling example comes from the source itself. Below, I transcribe extracts from Framework Design Guidelines, a book I highly recommend.
From Krzysztof Cwalina, CLR program manager:
The only prefix used is "I" for interfaces (as in ICollection), but that is for historical reasons. In retrospect, I think it would have been better to use regular type names. In a majority of the cases developers don't care that something is an interface and not an abstract class, for example.
From Brad Abrams, CLR and .NET Framework program manager:
On the other hand, the "I" prefix on interfaces is a clear recognition of the influence of COM (and Java) on the .NET Framework. COM popularized, even institutionalized, the notation that interfaces begin with "I." Although we discussed diverging from this historic pattern we decided to carry forward the pattern as so many of our users were already familiar with COM.
From Jeffrey Richter, consultant and author:
Personally, I like the "I" prefix and I wish we had more stuff like this. Little one-character prefixes go a long way toward keeping code terse and yet descriptive. [...] I use prefixes for my private type fields because I find this very useful.
My point is, it WAS on the discussion table. An advantage I see is that it helps avoid name collisions between classes and interfaces, so your names can be both descriptive and compact.
Personally--and perhaps out of habit--I like the I prefix, because it succinctly flags interfaces, allowing me to have one-to-one naming correspondence with implementing types. This shines in cases when you want to provide a base implementation: IThing is the interface, Thing is the base (perhaps abstract) type. Derived types can be SomeThing. I love being able to use such crystal clear shorthand notation.
I think it is better than adding a "Impl" suffix on your concrete class. It is a single letter, and this convention is well established. Of course you are free to use any naming you wish.
In my opinion this 'I' is just visual noise; the IDE should show class and interface names differently. Fortunately, the Java standard library doesn't use this convention.
There is nothing wrong with NOT using I convention for interfaces - just be consistent and make sure it works not just for you but for whole team (if there is one).
Naming an interface should have much deeper meaning than just whether or not you put an "I" at the front of the name.
Neither "Fruit" nor "IFruit" would have a whole lot of meaning for me as an interface. This is because it looks a lot more like a class. Classes define things, whereas interfaces should define functionality.
The "I" naming convention does help differentiate between classes and interfaces so that development is a little bit easier. And while it is not required, it certainly helps avoid common object oriented coding headaches. C#, like Java, only allows for inheritance from a single base class. But you can implement as many interfaces as you want. The caveat is, if you inherit from a class and implement one or more interfaces, the base class has to be named first (i.e. class Trout: Fish, ISwimmer, IDiver ... ).
I really like to name my interfaces both based on what functionality they provide and on what type of interface they are (i.e., animate or inanimate interfaces).
If you focus on the functionality that the interface provides, you can quickly determine a name for the interface. It also helps you to quickly see whether your interface defines unrelated functions.
Interfaces that define inanimate objects (i.e. things that can't act on their own)...
I like to name them with ...able at the end
IPrintable (such as Document, Invoice)
IPlayable (such as Instrument, MediaPlayer)
ISavable (such as Document, Image)
IEdible (such as Fruit, Beef)
IDrivable (such as Car)
IFlyable (such as Plane)
Interfaces that define animate objects (i.e. things that act on their own)...
I like to name them with ...er at the end
ISwimmer (such as Fish, Dog, Duck)
IDiver (such as Fish, Duck)
IFlyer (such as Pilot)
IDriver (such as NascarDriver)
In the end, the "I" naming convention helps differentiate between interfaces and classes. But it may make sense to add additional naming logic besides just the "I" at the beginning.
Because you usually have an IThing and a Thing. So instead of letting people come with their own "conventions" for this recurring situation, a uniform one-size-fits all convention was chosen. Echoing what others say, the de facto standardness is reason enough to use it.
It's just a convention whose intent is to prevent name collisions. C# does not allow me to have a class and an interface both named Client, even if the file names are Client and IClient, respectively. I'm comfortable using the convention; if I had to offer a different convention, I'd suggest using "Contract" as a suffix, e.g. ClientContract.
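For illustration, the suffix alternative would read like this (hypothetical types):
public interface ClientContract
{
    void Connect();
}

public class Client : ClientContract
{
    public void Connect() { /* establish the connection */ }
}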
Do prefix interface names with the letter I to indicate that the type is an interface.
The guideline doesn't explain why you should use the I prefix, but the fact that this is now an established convention should be reason enough.
What do you have to gain by dropping the I prefix?
I don't know exactly why they chose that convention, perhaps partly thinking of ensouling the class with "I" as in "I am Enumerable".
A naming convention that would be more in line with the rest of the framework would be to incorporate the type in the name, as with, for example, the xxxAttribute and xxxException classes, making it xxxInterface. That's a bit lengthy though, and after all, interfaces are something separate, not just another bunch of classes.
I know the Microsoft guidelines recommend using the 'I' to describe it as an interface. But this comes from IBM naming conventions, if I remember correctly: the initiating 'I' for interfaces and the succeeding *Impl for the implementations.
However, in my opinion the Java naming conventions are a better choice than the IBM naming convention (and not only in Java; for C# as well, and any OO programming language). Interfaces describe what an object is able to do if it implements the interface, and the description should be in verb form, e.g. Runnable, Serializable, Invoiceable, etc. IMHO this is a perfect description of what the interface represents.
It looks Hungarianish to me. Hungarian is generally considered a menace in strongly-typed languages.
Since C# is a Microsoft product and Hungarian notation was a Microsoft invention, I can see where C# might be susceptible to its influence.
It's popular for an OS GUI to use different icons for files and folders. They could instead share the same icon, with folder names prefixed with "F". But since humans recognize images faster than they recognize words, we have settled on icons.
Computer programs like IDEs are fully capable of making a file's interface-ness apparent. This would free the namespace of different levels of abstraction happening in the same name. E.g. in "ICare", "I" describes the implementation and "Care" describes the interface's capabilities.
I'm guessing the "I" convention is so popular because we haven't been able to think of anything better, but it's existence is important because it points out a need we have for knowing this kind of information.
To separate interfaces from classes.
Also (this is more of a personal observation than dictated from upon high), interfaces describe what a class does. The 'I' lends itself to this (I'm sure it is a construct in grammar which would be great to whip out right now); an interface that describes classes that validate would be "IValidate". One that describes matching behavior would be "IMatch".
The fact of the matter is that everyone understands it and part of writing better code is making it easy to read and understand.
I don't really like this convention. I understand that it helps out with the case when you have an interface and an implementation that would have the same name, but I just find it ugly. I'd still follow it if it were the convention where I am working, of course. Consistency is the point of conventions, and consistency is a very good thing.
I like to have an interface describe what the interface does in as generic a way as possible, for example, Validator. A specific implementation that validates a particular thing would be a ThingValidator, and an implementation with some abstract functionality shared by Validators would be an AbstractValidator. I would do this even if Thing is the only... well... thing that I'm validating, and Validator would be generic.
In cases where only one concrete class makes sense for an interface, I still try to describe something specific about that particular implementation rather than naming the interface differently to prevent a names collision. After all, I'm going to be typing the name of the interface more often than the name of the implementation.
You may add the word "Interface" as a suffix to your custom interface, for example "SerializerInterface". For an abstract class such as "Fruit", use "FruitAbstract", or you can make it "AbstractFruit"; just be consistent all throughout. That is readable enough, or follow the usual naming convention.
Just my 2 cents:
The reason why I personally append the suffix "Interface" is because it is easier to read than the "I" prefix and because the files in the system are listed "grouped". For example:
Not so good:
Alien.php
Host.php
IAlien.php
IHost.php
IXenomorph.php
Xenomorph.php
Better:
Alien.php
AlienInterface.php
Host.php
HostInterface.php
Xenomorph.php
XenomorphInterface.php
But that's just personal taste. I think as long as one uses a consistent naming convention throughout the entire project, nothing speaks against using your own naming convention.
I've never been a fan of Hungarian notation; I've always found it pretty useless unless you're doing some really low-level programming. But in every C++ project I've worked on, some kind of Hungarian notation policy was enforced, and with it the use of some 'not-really-Hungarian' prefixes like m_ for fields, s_ for statics, g_ for globals and so on.
Soon I realized how useless it was in C# and gradually started to drop all of my old habits... all but the 'm_' thing. I still use the m_ prefix on private fields because I really find it very useful to be able to distinguish between parameters, locals and fields.
The naming conventions for fields page at MSDN says I shouldn't, but it does not say why (the way e.g. Google's conventions generally tend to rationalize their prescriptions).
Are there reasons why I shouldn't, or is it only a matter of style? If it is the latter, are prefixes generally considered bad style, and can I expect negative reactions from other people working on the codebase?
I like the underbar prefix for member fields. Mostly I like it because that way, all of my member fields are shown alphabetically before my methods in the wizard bar at the top of the screen.
When you should:
When your project coding guidelines say you should
When you shouldn't:
When your project coding guidelines say you shouldn't
If you don't have any guidelines yet, you're free to choose whatever you or your team want and feel most comfortable with. Personally when coding C++ I tend to use m_ for members, it does help. When coding in other languages, particularly those without true classes (like Javascript, Lua) I don't.
In short I don't believe there is a "right" and a "wrong" way.
The auto-implemented property feature in C# 3.0 creates less of a need for this convention one way or the other. Instead of writing
string m_name;
public string Name { get { return m_name; } }
or
string _Name;
public string Name { get { return _Name; } }
(or any other convention), you can now write
public string Name { get; private set; }
Since you no longer need the explicit backing store variable, you no longer have to come up with a name for it; thus avoiding this entire discussion.
Obviously, this argument doesn't apply when you really need explicit backing store such as to perform validation.
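For instance, a sketch of a property that still needs an explicit backing field because the setter validates (names are illustrative):
private string m_name;

public string Name
{
    get { return m_name; }
    set
    {
        if (string.IsNullOrEmpty(value))
            throw new ArgumentException("Name must not be empty.");
        m_name = value;
    }
}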
As some have alluded to, the MS guidelines say:
Do not use a prefix for field names. For example, do not use g_ or s_ to distinguish static versus non-static fields.
I happen to agree with this. Prefixes make your code look ugly and waste space with inconsequential characters. Having said that, it is often common to use fields to back properties where both the field and the property would have the same name (with the private field being camel case and the property being pascal case). In VB, this doesn't work, since VB isn't case-sensitive. In this scenario, I recommend the use of a single _ prefix. No more, no less. It just looks cleaner, IMHO.
I have experimented with m_, s_, just _, and no prefix at all. I have settled on using just _ for all static and instance variables. I don't find it important to distinguish static variables from instance variables. In theory it sounds good, in practice it doesn't create a problem.
A coworker once made a convincing argument to eliminate all prefixes; we tried it on one project and it worked better than I expected. I carried it forward to my next project and became annoyed that it "interferes" with Intellisense. When you have the following situation
int foo;
public int Foo
{
    get { return foo; }
}
Starting to type foo will suggest both the instance variable and the property. Prefixing the variable with an underscore eliminates the annoying double suggestion, so I switched back to using just _.
I try to follow the MSDN .NET library guidelines. They include a naming guidelines section.
Obviously, these are secondary to your project guidelines.
I prefer to mark property backing fields (although, as already mentioned, .NET 3.0+ reduces the need thanks to automatic properties) with underscores but not the "m". For one, it puts them at the top of the IntelliSense list when I come to use them.
I will admit that I need to brush-up on the guidelines on MSDN, things can change so quickly these days.
With tools like ReSharper there's really no reason for prefixes. Also, if you write short methods, you should be able to tell really quickly where the var is coming from. Finally, I guess I wouldn't really see the need to tell the difference between static and not, because again ReSharper is going to red-line it if you try to do something you're not able to. Even without ReSharper you're probably saved by the compiler.
I always prefix member variables with m_ and static variables with s_ for the same reasons that you state. Some people prefix member variables with an underscore, but I've always found this a bit odd looking (but that's just a personal preference).
Most people I work with use the m_/s_ prefix. I don't really think it matters too much what you use, as long as you're consistent.
I never use them. It encourages sloppy coding.
The MSDN coding guidelines, that's where it's at.
Here are a few reasons to use _ (and not m_).
(1) Many BCL guys do it despite MS's naming guide. (Check out their blog.) Those guys write the framework, so they have some good habits worth copying. Some of the most helpful example code on MSDN is written by them, and so uses the underscore convention. It's a de-facto industry standard.
(2) A single underscore is a noticeable yet unobtrusive way to disambiguate method and class-level variables by simply reading the source. It helps people understand new (or old) code at-a-glance when reading it. Yes, you can mouse-over to see this in an IDE, but we shouldn't be forced to. You may want to read it in a text editor, or dare I say it, on paper.
(3) Some say you don't need any prefix as methods will be short, and later if needed you can change the field to an auto-implemented property. But in the real world methods are as long as they need to be, and there are important differences between fields and properties (e.g. serialization and initialization).
Footnote: The "m" for member in m_ is redundant in our usage here, but it was lower case because one of the ideas in many of these old naming conventions was that type names started with upper case and instance names started with lower case. That doesn't apply in .NET so it's doubly redundant. Also Hungarian notation was sometimes useful with old C compilers (e.g. integer or pointer casting and arithmetic) but even in C++ its usefulness was diminished when dealing with classes.
As @John Kraft mentions, there is no "correct" answer. MattJ is the closest: you should always follow your company's style guidelines. When in Rome, and all that.
As for my personal opinion, since it's called for here, I vote that you drop m_ entirely.
I believe the best style is one where all members are PascalCased, regardless of visibility (that means even private members), and all arguments are camelCased. I do not break this style.
I can understand the desire to prefix the property backing store field; after all, you must differentiate between the field and the property, right? I agree, you must. But use a postfix.
Instead of m_MyProperty (or even _MyProperty, which I've seen and even promoted once upon a time), use MyPropertyValue. It's easier to read and understand and -- more importantly -- it's close to your original property name in intellisense.
Ultimately, that's the reason I prefer a postfix. If you want to access MyPropertyValue using intellisense, you (typically) type "My <down-arrow> <tab>", since by then you're close enough that only MyProperty and MyPropertyValue are on the list. If you want to access m_MyProperty using intellisense, you'll have to type "m_My <tab>".
It's about keystroke economy, in my opinion.
There is one important difference between C++ and C#: Tool support. When you follow the established guidelines (or common variations), you will get a deep level of tool support that C++ never had. Following the standards allows tools to do deeper refactoring/rename operations than you'd otherwise be capable of. Resharper does this. So stick with one of the established standards.
I never do this and the reason why is that I [try to] keep my methods short. If I can see the whole method on the screen, I can see the params, I can see the locals and so I can tell what is owned by the class and what is a param or a local.
I do typically name my params and locals using a particular notation, but not always. I'm nothing if not inconsistent. I rely on the fact that my methods are short and try to keep them from doing X, Y and Z when they should be only doing X.
Anyhow, that's my two cents.
Unless I'm stuck with vi or Emacs for editing code, my IDE takes care of differential display of members for me, so I rarely use any special conventions. That also goes for prefixing interfaces with I or classes with C.
Someone, please, explain the .NET style of I-prefix on interfaces. :)
What I am used to is that private fields get a small underscore prefix, e.g. string _name. The public one gets "Name". And the input variables in methods get a lower-case first letter: void MyMethod(string name).
Static constants are often written with big letters: static const MYCONST = "hmpf".
I am sure that I will get flamed for this but so be it.
It's called Microsoft's .NET library guidelines but it's really Brad Abrams's views (document here) - there are other views with valid reasons.
People tend to go with the majority view rather than having good solid reasons for a specific style.
The important point is to evaluate why a specific style is used and why it's preferred over another style - in other words, have a reason for choosing a style not just because everyone says it's the thing to do - think for yourself.
The basic reason for not using old style Hungarian was the use of abbreviations which was different for every team and difficult to learn - this is easily solved by not abbreviating.
As the available development tools change the style should change to what makes the most sense - but have a solid reason for each style item.
Below are my style guidelines with my reasons - I am always looking for ways to improve my style to create more reliable and easier to maintain code.
Variable Naming Convention
We all have our views on variable naming conventions. There are many different styles that will help produce easily maintainable, quality code - any style which supports the basic essential information about a variable is okay. The criteria for a specific naming convention should be that it aids in producing code that is reliable and easily maintainable. Criteria that should not be used are:
It's ugly
Microsoft (i.e. Brad Abrams) says don't use that style - Microsoft does not always produce the most reliable code; just look at the bugs in Expression Blend.
It is very important when reading code that a variable name should instantly convey three essential facts about the variable:
its scope
its type
a clear understanding of what it is used for
Scope: Microsoft recommends relying totally on IntelliSense. IntelliSense is awesome; however, one simply does not mouse over every variable to see its scope and type. Assuming a variable is in a scope that it is not in can cause significant errors. For example, if a reference variable is passed in as a parameter and it is altered in local scope, that change will remain after the method returns, which may not be desired. If a field or a static variable is modified in local scope but one thinks that it is a local variable, unexpected behavior could result. Therefore it is extremely important to be able to just look at a variable (not mouse over it) and instantly know its scope.
The following style for indicating scope is suggested; however, any style is perfectly okay as long as it clearly and consistently indicates the variable's scope:
m_ field variable
p_ parameter passed to a method
s_ static variable
local variable
Type: Serious errors can occur if one believes one is working with a specific type when one is actually working with a different type - again, we simply do not mouse over every variable to determine its type; we just assume that we know what its type is, and that is how errors are created.
Abbreviations: Abbreviations are evil because they can mean different things to different developers. One developer may think a leading lower case "s" means string while another may think it means signed integer. Abbreviations are a sign of lazy coding - take a little extra time and type the full name to make it clear to the developer that has to maintain the code. For example, the difference between "str" and "string" is only three characters - it does not take much more effort to make code easy to maintain.
Common and clear abbreviations for built-in data types only are acceptable but must be standardized within the team.
Self Documenting Code: Adding a clear description to a variable name makes it very easy for another developer to read and understand the code - make the name so understandable that the team manager can read and understand the code without being a developer.
Order of Variable Name Parts: The recommended order is scope-type-description because:
IntelliSense will group all similar scopes and within each scope IntelliSense will group all similar types which makes lookups easy - try finding a variable the other way
It makes it very easy to see and understand the scope and to see and understand the type
It's a fairly common style and easy to understand
It will pass FxCop
Examples: Here are a few examples:
m_stringCustomerName
p_stringCustomerDatabaseConnectionString
intNumberOfCustomerRecords or iNumberOfCustomerRecords or integerNumberOfCustomerRecords
These simple rules will significantly improve code reliability and maintainability.
Control Structure Single Line Statements
Single-line statements in all control structures (if, while, for, etc.) should always be wrapped with braces, because it is very easy to add a new statement without realizing that a given statement belongs to a control structure, which will break the code logic without generating any compile-time errors.
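A sketch of the failure mode (isValid, Save and Log are hypothetical):
// Without braces, only the first statement is guarded;
// the second always runs despite the indentation.
if (isValid)
    Save();
    Log("saved");

// With braces, adding a statement cannot silently change the logic.
if (isValid)
{
    Save();
    Log("saved");
}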
Method Exception Wrapping
All methods should be wrapped with an outer try-catch which traps exceptions, provides a place to recover, identifies, locates, and logs them, and makes the decision to rethrow or not. It is the unexpected exception that causes our applications to crash - by wrapping every method and trapping all unhandled exceptions, we guarantee identifying and logging all exceptions, and we prevent our application from ever crashing. It takes a little more work, but the result is well worth the effort.
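A sketch of the pattern (the Order type, the logger, and the method body are placeholders):
public void ProcessOrder(Order order)
{
    try
    {
        // ... the method's actual work ...
    }
    catch (Exception ex)
    {
        Logger.Error("ProcessOrder failed", ex); // identify, locate, log
        throw;                                   // or recover, if that is possible here
    }
}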
Indentation
Indentation is not a major issue; however, four spaces and not using tabs is suggested. If code is printed, the first printer tab usually defaults to 8 spaces. Different developers tend to use different tab sizes. Microsoft's code is usually indented 4 spaces, so if one uses any Microsoft code and uses anything other than 4 spaces, then the code will need to be reformatted. Four spaces makes it easy and consistent.
I never use any hungarian warts whenever I'm given the choice. It's extra typing and doesn't convey any meaningful information. Any good IDE (and I define "good" based on the presence of this feature, among others) will allow you to have different syntax highlighting for static members, instance members, member functions, types, etc. There is no reason to clutter your code with information that can be provided by the IDE. This is a corollary to not cluttering your code with commented-out old code because your versioning system should be responsible for that stuff.
The best way is to agree on a standard with your colleagues, and stick to it. It doesn't absolutely have to be the method that would work best for everyone, just agreeing on one method is more important than which method you actually agree on.
What we chose for our code standard is to use _ as prefix for member variables. One of the reasons was that it makes it easy to find the local variables in the intellisense.
Before we agreed on that standard I used another one. I didn't use any prefix at all, and wrote this.memberVariable in the code to show that I was using a member variable.
With the property shorthand in C# 3, I find that I use a lot less explicit member variables.
The closest thing to official guidelines is StyleCop, a tool from Microsoft which can automatically analyse your source files and detect violations from the recommended coding style, and can be run from within Visual Studio and/or automated builds such as MSBuild.
We use it on our projects and it does help to make code style and layout more consistent between developers, although be warned it does take quite a bit of getting used to!
To answer your question - it doesn't allow any Hungarian notation, nor any prefixes like m_ (in fact, it doesn't allow the use of underscores at all).
I don't use that style any longer. It was developed to help you see quickly how variables were being used. The newer dev environments let you see that information by hovering your mouse over the variable. The need for it has gone away if you use those newer tools.
There might also be some insight to be gleaned from C++ Coding Standards (Sutter, Herb and Alexandrescu, Andrei, 2004). Item #0 is entitled "Don't sweat the small stuff. (Or: Know what not to standardize.)".
They touch on this specific question a little bit by saying "If you can't decide on your own naming convention, try ... private member variables likeThis_ ..." (Remember use of leading underscore is subject to very specific rules in C++).
However, before getting there, they emphasize a certain level of consistency "...the important thing is not to set a rule but just to be consistent with the style already in use within the file..."
The benefit of that notation in C/C++ was to make it easier to see what a symbol's type was without having to go search for the declaration. These styles appeared before the arrival of Intellisense and "Go to Definition" - we often had to go on a goose chase looking for the declaration in who knows how many header files. On a large project this could be a significant annoyance which was bad enough when looking at C source code, but even worse when doing forensics using mixed assembly+source code and a raw call stack.
When faced with these realities, using m_ and all the other hungarian rules starts to make some sense even with the maintenance overhead because of how much time it would save just in looking up a symbol's type when looking at unfamiliar code. Now of course we have Intellisense and "Go to Definition", so the main time saving motivation of that naming convention is no longer there. I don't think there's much point in doing that any more, and I generally try to go with the .NET library guidelines just to be consistent and possibly gain a little bit more tool support.
If you are not coding under a particular guideline, you should keep using your actual m_ notation and change it if the project coding guidelines says so.
Be functional.
Do not use global variables.
Do not use static variables.
Do not use member variables.
If you really have to, but only if you really have to, use one and only one variable to access your application / environment.