Why doesn't C# offer constness akin to C++?

References in C# are quite similar to those in C++, except that the objects they refer to are garbage collected.
Why is it then so difficult for the C# compiler to support the following:
Member functions marked const.
References to data types (other than string) marked const, through which only const member functions can be called?
I believe it would be really useful if C# supported this. For one, it would really help with the seemingly widespread abandon with which C# programmers return naked references to private data (at least that's what I've seen at my workplace).
Or is there already something equivalent in C# which I'm missing? (I know about the readonly and const keywords, but they don't really serve the above purpose)
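To make the "naked references" complaint concrete, here is a hedged sketch (the Order and Items names are made up for illustration): readonly on the field only prevents reassigning the reference, so the best you can do today is hand out a read-only wrapper instead of the private collection itself.

using System.Collections.Generic;
using System.Collections.ObjectModel;

class Order
{
    private readonly List<string> items = new List<string>();

    // Returning the naked reference lets any caller mutate private state:
    // order.Items.Clear() compiles and runs happily.
    public List<string> Items
    {
        get { return items; }
    }

    // The closest C# gets today: hand out a read-only wrapper instead.
    public ReadOnlyCollection<string> ItemsView
    {
        get { return items.AsReadOnly(); }
    }
}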

I suspect there are some practical reasons, and some theoretical reasons:
Should the constness apply to the object or the reference? If it's in the reference, should this be compile-time only, or as a bit within the reference itself? Can something else which has a non-const reference to the same object fiddle with it under the hood?
Would you want to be able to cast it away as you can in C++? That doesn't sound very much like something you'd want on a managed platform... but what about all those times where it makes sense in C++?
Syntax gets tricky (IMO) when you have more than one type involved in a declaration - think arrays, generics etc. It can become hard to work out exactly which bit is const.
If you can't cast it away, everyone has to get it right. In other words, both the .NET framework types and any other 3rd party libraries you use all have to do the right thing, or you're left with nasty situations where your code can't do the right thing because of a subtle problem with constness.
There's a big one in terms of why it can't be supported now though:
Backwards compatibility: there's no way all libraries would be correctly migrated to it, making it pretty much useless :(
I agree it would be useful to have some sort of constness indicator, but I can't see it happening, I'm afraid.
EDIT: There's been an argument about this raging in the Java community for ages. There's rather a lot of commentary on the relevant bug which you may find interesting.

As Jon already covered (of course), const correctness is not as simple as it might appear. C++ does it one way. D does it another (arguably more correct/useful) way. C# flirts with it but doesn't do anything more daring, as you have discovered (and likely never will, as Jon again covered well).
That said, I believe that many of Jon's "theoretical reasons" are resolved in D's model.
In D (2.0), const works much like C++, except that it is fully transitive (so const applied to a pointer would apply to the object pointed to, any members of that object, any pointers that object had, objects they pointed to etc) - but it is explicit that this only applies from the variable that you have declared const (so if you already have a non-const object and you take a const pointer to it, the non-const variable can still mutate the state).
D introduces another keyword - invariant - which applies to the object itself. This means that nothing can ever change the state once initialised.
The beauty of this arrangement is that a const method can accept both const and invariant objects. Since invariant objects are the bread and butter of the functional world, a const method can be marked as "pure" in the functional sense - even though it may be used with mutable objects.
Getting back on track - I think it's the case that we're only now (the latter half of the noughties) understanding how best to use const (and invariant). .NET was originally defined when things were hazier, so it didn't commit to too much - and now it's too late to retrofit.
I'd love to see a port of D run on the .Net VM, though :-)

Mr. Hejlsberg, the designer of the C# language, has already answered this question:
http://www.artima.com/intv/choicesP.html

I wouldn't be surprised if immutable types were added to a future version of C#.
There have already been moves in that direction with C# 3.0.
Anonymous types, for example, are immutable.
I think, as a result of extensions designed to embrace parallelism, you will be likely to see immutability pop up more and more.

The question is, do we need constness in C#?
I'm pretty sure the JITter knows when a given method is not going to affect the object itself and performs the corresponding optimizations automagically (maybe by emitting call instead of callvirt?).
I'm not sure we need constness beyond that: since most of the pros of constness are performance related, you end up back at the point above.
Besides that, C# has the readonly keyword.
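As a hedged illustration of why readonly is not the same thing as C++ const (Widget and Holder are invented types): it only protects the reference, not the object's state.

class Widget
{
    public int Size;
}

class Holder
{
    private readonly Widget widget = new Widget();

    public void Mutate()
    {
        // widget = new Widget();  // readonly forbids reassigning the reference
        widget.Size = 42;          // ...but mutating the object it refers to is fine
    }
}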


Why should casting be avoided? [closed]

I generally avoid casting types as much as possible since I am under the impression that it's poor coding practice and may incur a performance penalty.
But if someone asked me to explain why exactly that is, I would probably look at them like a deer in headlights.
So why/when is casting bad?
Is it general for Java, C#, and C++, or does every runtime environment deal with it on its own terms?
Specifics for any language are welcome; for example, why is it bad in C++?
You've tagged this with three languages, and the answers are really quite different between the three. Discussion of C++ more or less implies discussion of C casts as well, and that gives (more or less) a fourth answer.
Since it's the one you didn't mention explicitly, I'll start with C. C casts have a number of problems. One is that they can do any of a number of different things. In some cases, the cast does nothing more than tell the compiler (in essence): "shut up, I know what I'm doing" -- i.e., it ensures that even when you do a conversion that could cause problems, the compiler won't warn you about those potential problems. Just for example, char a=(char)123456;. The exact result of this is implementation-defined (it depends on the size and signedness of char), and except in rather strange situations, probably isn't useful. C casts also vary in whether they're something that happens only at compile time (i.e., you're just telling the compiler how to interpret/treat some data) or something that happens at run time (e.g., an actual conversion from double to long).
C++ attempts to deal with that to at least some extent by adding a number of "new" cast operators, each of which is restricted to only a subset of the capabilities of a C cast. This makes it more difficult to (for example) accidentally do a conversion you really didn't intend -- if you only intend to cast away constness on an object, you can use const_cast, and be sure that the only thing it can affect is whether an object is const, volatile, or not. Conversely, a static_cast is not allowed to affect whether an object is const or volatile. In short, you have most of the same types of capabilities, but they're categorized so one cast can generally only do one kind of conversion, where a single C-style cast can do two or three conversions in one operation. The primary exception is that you can use a dynamic_cast in place of a static_cast in at least some cases and despite being written as a dynamic_cast, it'll really end up as a static_cast. For example, you can use dynamic_cast to traverse up or down a class hierarchy -- but a cast "up" the hierarchy is always safe, so it can be done statically, while a cast "down" the hierarchy isn't necessarily safe so it's done dynamically.
Java and C# are much more similar to each other. In particular, with both of them casting is (virtually?) always a run-time operation. In terms of the C++ cast operators, it's usually closest to a dynamic_cast in terms of what's really done -- i.e., when you attempt to cast an object to some target type, the compiler inserts a run-time check to see whether that conversion is allowed, and throws an exception if it's not. The exact details (e.g., the name used for the "bad cast" exception) vary, but the basic principle remains mostly similar (though, if memory serves, Java does make casts applied to the few non-object types like int much closer to C casts -- but these types are used rarely enough that 1) I don't remember that for sure, and 2) even if it's true, it doesn't matter much anyway).
Looking at things more generally, the situation's pretty simple (at least IMO): a cast (obviously enough) means you're converting something from one type to another. When/if you do that, it raises the question "Why?" If you really want something to be a particular type, why didn't you define it to be that type to start with? That's not to say there's never a reason to do such a conversion, but anytime it happens, it should prompt the question of whether you could re-design the code so the correct type was used throughout. Even seemingly innocuous conversions (e.g., between integer and floating point) should be examined much more closely than is common. Despite their seeming similarity, integers should really be used for "counted" types of things and floating point for "measured" kinds of things. Ignoring the distinction is what leads to some of the crazy statements like "the average American family has 1.8 children." Even though we can all see how that happens, the fact is that no family has 1.8 children. They might have 1 or they might have 2 or they might have more than that -- but never 1.8.
Lots of good answers here. Here's the way I look at it (from a C# perspective).
Casting usually means one of two things:
I know the runtime type of this expression but the compiler does not know it. Compiler, I am telling you, at runtime the object that corresponds to this expression is really going to be of this type. As of now, you know that this expression is to be treated as being of this type. Generate code that assumes that the object will be of the given type, or, throw an exception if I'm wrong.
Both the compiler and the developer know the runtime type of the expression. There is another value of a different type associated with the value that this expression will have at runtime. Generate code that produces the value of the desired type from the value of the given type; if you cannot do so, then throw an exception.
Notice that those are opposites. There are two kinds of casts! There are casts where you are giving a hint to the compiler about reality - hey, this thing of type object is actually of type Customer - and there are casts where you are telling the compiler to perform a mapping from one type to another - hey, I need the int that corresponds to this double.
Both kinds of casts are red flags. The first kind of cast raises the question "why exactly is it that the developer knows something that the compiler doesn't?" If you are in that situation then the better thing to do is usually to change the program so that the compiler does have a handle on reality. Then you don't need the cast; the analysis is done at compile time.
The second kind of cast raises the question "why isn't the operation being done in the target data type in the first place?" If you need a result in ints then why are you holding a double in the first place? Shouldn't you be holding an int?
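A small hedged sketch of the two kinds in C# (GetSomeObject and Customer are made-up names): the first cast is a runtime assertion about the object's type, the second is a representation-changing conversion.

object o = GetSomeObject();           // placeholder method and Customer type

// Kind 1: a hint about reality - "at runtime this really is a Customer";
// if it isn't, an InvalidCastException is thrown.
Customer c = (Customer)o;

// Kind 2: both sides know the types; produce an int from the double,
// truncating the fraction.
double price = 9.99;
int wholeDollars = (int)price;        // 9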
Some additional thoughts here:
Link
Casting errors are always reported as run-time errors in Java. Using generics or templating turns these errors into compile-time errors, making it much easier to detect when you have made a mistake.
As I said above, this isn't to say that all casting is bad. But if it is possible to avoid it, it's best to do so.
Casting is not inherently bad, it's just that it's often misused as a means to achieve something that really should either not be done at all, or done more elegantly.
If it was universally bad, languages would not support it. Like any other language feature, it has its place.
My advice would be to focus on your primary language, and understand all its casts, and associated best practices. That should inform excursions into other languages.
The relevant C# docs are here.
There is a great summary on C++ options at a previous SO question here.
I'm mostly speaking for C++ here, but most of this probably applies to Java and C# as well:
C++ is a statically typed language. There are some leeways the language allows you in this (virtual functions, implicit conversions), but basically the compiler knows the type of every object at compile-time. The reason to use such a language is that errors can be caught at compile-time. If the compiler knows the types of a and b, then it will catch you at compile-time when you do a=b where a is a complex number and b is a string.
Whenever you do explicit casting you tell the compiler to shut up, because you think you know better. In case you're wrong, you will usually only find out at run-time. And the problem with finding out at run-time is, that this might be at a customer's.
Java, C# and C++ are strongly typed languages. Although strongly typed languages can be seen as inflexible, they have the benefit of doing type checking at compile time and protecting you against runtime errors that are caused by having the wrong type for certain operations.
There are basically two kinds of casts: a cast to a more general type or a cast to another (more specific) type. Casting to a more general type (casting to a parent type) will leave the compile-time checks intact. But casting to other (more specific) types will disable compile-time type checking and will be replaced by the compiler with a runtime check. This means you have less certainty that your compiled code will run correctly. It also has some negligible performance impact, due to the extra runtime type check (the Java API is full of casts!).
Some types of casting are so safe and efficient as to often not even be considered casting at all.
If you cast from a derived type to a base type, this is generally quite cheap (often - depending on language, implementation and other factors - it is zero-cost) and is safe.
If you cast from a simple type like an int to a wider type like a long int, then again it is often quite cheap (generally not much more expensive than assigning the same type as that cast to) and again is safe.
Other types are more fraught and/or more expensive. In most languages casting from a base type to a derived type is either cheap but has a high risk of severe error (in C++ if you static_cast from base to derived it will be cheap, but if the underlying value is not of the derived type the behaviour is undefined and can be very strange) or relatively expensive and with a risk of raising an exception (dynamic_cast in C++, explicit base-to-derived cast in C#, and so on). Boxing in Java and C# is another example of this, and an even greater expense (considering that they are changing more than just how the underlying values are treated).
Other types of cast can lose information (a long integer type to a short integer type).
These cases of risk (whether of exception or a more serious error) and of expense are all reasons to avoid casting.
A more conceptual, but perhaps more important, reason is that each case of casting is a case where your ability to reason about the correctness of your code is stymied: Each case is another place where something can go wrong, and the ways in which it can go wrong add to the complexity of deducing whether the system as a whole will go wrong. Even if the cast is provably safe each time, proving this is an extra part of the reasoning.
Finally, the heavy use of casts can indicate a failure to consider the object model well either in creating it, using it, or both: Casting back and forth between the same few types frequently is almost always a failure to consider the relationships between the types used. Here it's not so much that casts are bad, as they are a sign of something bad.
There is a growing tendency for programmers to cling to dogmatic rules about use of language features ("never use XXX!", "XXX considered harmful", etc), where XXX ranges from gotos to pointers to protected data members to singletons to passing objects by value.
Following such rules, in my experience, ensures two things: you will not be a terrible programmer, nor will you be a great programmer.
A much better approach is to dig down and uncover the kernel of truth behind these blanket prohibitions, and then use the features judiciously, with the understanding that there are many situations for which they're the best tool for the job.
"I generally avoid casting types as much as possible" is a good example of such an overgeneralized rule. Casts are essential in many common situations. Some examples:
When interoperating with third-party code (especially when that code is rife with typedefs). (Example: GLfloat <--> double <--> Real.)
Casting from a derived to base class pointer/reference: This is so common and natural that the compiler will do it implicitly. If making it explicit increases readability, the cast is a step forwards, not backwards!
Casting from a base to derived class pointer/reference: Also common, even in well-designed code. (Example: heterogeneous containers.)
Inside binary serialization/deserialization or other low-level code that needs access to the raw bytes of built-in types.
Any time when it's just plain more natural, convenient, and readable to use a different type. (Example: std::size_type --> int.)
There are certainly many situations where it's not appropriate to use a cast, and it's important to learn these as well; I won't go into too much detail since the answers above have done a good job pointing some of them out.
To elaborate on KDeveloper's answer, casting is not inherently type-safe. With a cast there is no guarantee that what you are casting from and what you are casting to will match, and if they don't, you will get a runtime exception, which is always a bad thing.
With specific regards to C#, because it includes the is and as operators, you have the opportunity to (for the most part) make the determination as to whether or not a cast would succeed. Because of this, you should take the appropriate steps to determine whether or not the operation would succeed and proceed appropriately.
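For instance (Customer and GetSomeObject again being placeholders), a hedged sketch of the is/as pattern described above:

object o = GetSomeObject();

if (o is Customer)                    // test first, then cast
{
    Customer c1 = (Customer)o;
    // ...
}

Customer c2 = o as Customer;          // or cast-and-test in one step
if (c2 != null)
{
    // safe to use c2; no exception was ever at risk
}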
In case of C#, one needs to be more careful while casting because of boxing/unboxing overheads involved while dealing with value types.
Not sure if someone already mentioned this, but in C# casting can be used in a rather safe manner, and is often necessary. Suppose you receive an object which can be of several types. Using the is keyword you can first confirm that the object is indeed of the type you are about to cast it to, and then cast the object to that type directly. (I didn't work with Java much but I'm sure there's a very straightforward way of doing it there as well).
You only cast an object to some type, if 2 conditions are met:
you know it is of that type
the compiler doesn't
This means not all the information you have is well represented in the type structure you use. This is bad, because your implementation should semantically reflect your model, which it clearly doesn't in this case.
Now when you do a cast, then this can have 2 different reasons:
You did a bad job of expressing the type relationships.
The language's type system simply is not expressive enough to phrase them.
In most languages you run into the 2nd situation a lot of times. Generics as in Java help a bit, the C++ template system even more, but it is hard to master and even then some things may be impossible or just not worth the effort.
So you could say that a cast is a dirty hack to work around your inability to express some specific type relationship in some specific language. Dirty hacks should be avoided, but you can never live entirely without them.
Generally templates (or generics) are more type-safe than casts. In that respect, I would say that one issue with casting is type safety. However, there is another, more subtle issue associated especially with downcasting: design. From my perspective at least, downcasting is a code smell, an indication that something might be wrong with my design and I should investigate further. Why is simple: if you "get" the abstractions right, you simply don't need it! Nice question by the way...
Cheers!
To be really concise, a good reason is portability. Different architectures that both accommodate the same language might have, say, different-sized ints. So if I migrate from ArchA to ArchB, which has a narrower int, I might see odd behavior at best, and seg faulting at worst.
(I'm clearly ignoring architecture independent bytecode and IL.)

C# developers learning Java, what are the biggest differences one may overlook? [closed]

For C# developers that are starting out to learn Java, are there any big underlying differences between the two languages that should be pointed out?
Maybe some people may assume things to be the same, but there are some important aspects that shouldn't be overlooked? (or you can really screw up!)
Maybe in terms of OOP constructs, the way GC works, references, deployment related, etc.
A few gotchas off the top of my head:
Java doesn't have custom value types (structs) so don't bother looking for them
Java enums are very different to the "named numbers" approach of C#; they're more OO. They can be used to great effect, if you're careful.
byte is signed in Java (unfortunately)
In C#, instance variable initializers run before the base class constructor does; in Java they run after it does (i.e. just before the constructor body in "this" class) - see the sketch after this list.
In C# methods are sealed by default. In Java they're virtual by default.
The default access modifier in C# is always "the most restrictive access available in the current context"; in Java it's "package" access. (It's worth reading up on the particular access modifiers in Java.)
Nested types in Java and C# work somewhat differently; in particular they have different access restrictions, and unless you declare the nested type to be static it will have an implicit reference to an instance of the containing class.
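To illustrate the initializer-ordering point above with a hedged, C#-only sketch (class names invented): field initializers fire before the base constructor body, whereas the equivalent Java program would print "base ctor" first.

using System;

class BaseThing
{
    public BaseThing() { Console.WriteLine("base ctor"); }
}

class DerivedThing : BaseThing
{
    // In C#, this initializer runs before BaseThing's constructor body.
    private readonly int marker = Print("field initializer");

    public DerivedThing() { Console.WriteLine("derived ctor body"); }

    private static int Print(string s) { Console.WriteLine(s); return 0; }
}

// new DerivedThing() prints:
//   field initializer
//   base ctor
//   derived ctor body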
Here is a very comprehensive comparison of the two languages:
http://www.25hoursaday.com/CsharpVsJava.html
Added: http://en.wikipedia.org/wiki/Comparison_of_Java_and_C_Sharp
I am surprised that no one has mentioned properties, something quite fundamental in C# but absent in Java. C# 3 and above has automatically implemented properties as well. In Java you have to use GetX/SetX type methods.
Another obvious difference is LINQ and lambda expressions in C# 3 absent in Java.
There are a few other simple but useful things missing from Java, like verbatim strings (@""), operator overloading, iterators using yield, and the preprocessor.
One of my personal favourites in C# is that namespace names don't have to follow the physical directory structure. I really like this flexibility.
There are a lot of differences, but these come to mind for me:
Lack of operator overloading in Java. Watch your instance.Equals(instance2) versus instance == instance2 (especially w/strings).
Get used to interfaces NOT being prefixed with an I. Often you see namespaces or classes suffixed with Impl instead.
Double checked locking doesn't work because of the Java memory model.
You can import static methods without prefixing them with the class name, which is very useful in certain cases (DSLs).
Switch statements in Java don't require a default, and you can't use strings as case labels (IIRC).
Java generics will anger you. Java generics don't exist at runtime (at least in 1.5), they're a compiler trick, which causes problems if you want to do reflection on the generic types.
.NET has reified generics; Java has erased generics.
The difference is this: if you have an ArrayList<String> object, in .NET, you can tell (at runtime) that the object has type ArrayList<String>, whereas in Java, at runtime, the object is of type ArrayList; the String part is lost. If you put in non-String objects into the ArrayList, the system can't enforce that, and you'll only know about it after you try to extract the item out, and the cast fails.
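A hedged C# sketch of the reified side of that difference (the Java half can't be shown in C#; the contrast is that an erased ArrayList<String> reports plain ArrayList at runtime):

using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        object list = new List<string>();

        // .NET keeps the type argument at runtime:
        // prints "System.Collections.Generic.List`1[System.String]"
        Console.WriteLine(list.GetType());

        // So the runtime can tell List<string> and List<int> apart.
        Console.WriteLine(list is List<string>);   // True
        Console.WriteLine(list is List<int>);      // False
    }
}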
One thing I miss in C# from Java is the forced handling of checked exceptions. In C# it is far too common that one is unaware of the exceptions a method may throw, and you're at the mercy of the documentation or testing to discover them. Not so in Java with checked exceptions.
Java has autoboxing for primitives rather than value types, so although System.Int32[] is an array of values in C#, Integer[] is an array of references to Integer objects, and as such not suitable for higher performance calculations.
No delegates or events - you have to use interfaces. Fortunately, you can create classes and interface implementations inline, so this isn't such a big deal
The built-in date/calendar functionality in Java is horrible compared to System.DateTime. There is a lot of info about this here: What's wrong with Java Date & Time API?
Some of these can be gotchas for a C# developer:
The Java Date class is mutable which can make returning and passing dates around dangerous.
Most of the java.util.Date constructors are deprecated. Simply instantiating a date is pretty verbose.
I have never gotten the java.util.Date class to interoperate well with web services. In most cases the dates on either side were wildly transformed into some other date & time.
Additionally, Java doesn't have all the same features that the GAC and strongly-named assemblies bring. Jar Hell is the term for what can go wrong when linking/referencing external libraries.
As far as packaging/deployment is concerned:
it can be difficult to package up web applications in an EAR/WAR format that actually install and run in several different application servers (Glassfish, Websphere, etc).
deploying your Java app as a Windows service takes a lot more effort than in C#. Most of the recommendations I got for this involved a non-free 3rd party library
application configuration isn't nearly as easy as including an app.config file in your project. There is a java.util.Properties class, but it isn't as robust and finding the right spot to drop your .properties file can be confusing
There are no delegates in Java. Therefore, aside from all the benefits that delegates bring to the table, events work differently too. Instead of just hooking up a method, you need to implement an interface and attach that instead.
One thing that jumps out (because it's on my interview list) is that there is no "new" keyword analogue in Java for method hiding, and therefore no compiler warning "you should put new here". Accidental method hiding when you meant to override leads to bugs.
(edit for example)
Example: B derives from A (C# syntax below; note that in Java, where instance methods are virtual by default, B's foo would be called instead). Does A's foo get called, or B's foo? (A's gets called, probably surprising the dev who implemented B.)
class A
{
    public void foo() { Console.WriteLine("A.foo"); }
}
class B : A
{
    // Warning CS0108: 'B.foo()' hides inherited member 'A.foo()'; add 'new' if hiding was intended.
    public void foo() { Console.WriteLine("B.foo"); }
}
void SomeMethod()
{
    A a = new B(); // variable's type is declared as A, but it is assigned an object of type B
    a.foo();       // prints "A.foo": A.foo is not virtual, so B.foo merely hides it
}
Java doesn't have LINQ, and the documentation is hell. User interfaces in Java are a pain to develop: you lose all the good things Microsoft gave us (WPF, WCF, etc.) but get hard-to-use, barely documented "APIs".
The most frustrating difference for me when I switched to Java is the string declaration.
In C#: string (most of the time)
In Java: String
It's pretty simple, but trust me, it makes you lose a lot of time when you're in the habit of typing s rather than S!
The one issue I've run into so far when working with Java coming from C# is Exceptions and Errors are different.
For example you cannot catch an out of memory error using catch(Exception e).
See the following for more details:
why-is-java-lang-outofmemoryerror-java-heap-space-not-caught
It's been so long since I've been in Java, but the things I noticed right off the bat in application development were the C# event model, C# drag-and-drop versus using layout managers in Swing (if you're doing app dev), and exception handling: Java makes sure you catch exceptions, while C# doesn't require it.
In response to your very direct question in your title:
"C# developers learning Java, what are the biggest differences one may overlook?"
A: The fact that Java is considerably slower on Windows.

Why does StyleCop recommend prefixing method or property calls with "this"?

I have been trying to follow StyleCop's guidelines on a project, to see if the resulting code was better in the end. Most rules are reasonable or a matter of opinion on coding standard, but there is one rule which puzzles me, because I haven't seen anyone else recommend it, and because I don't see a clear benefit to it:
SA1101: The call to {method or property name} must begin with the 'this.' prefix to indicate that the item is a member of the class.
On the downside, the code is clearly more verbose that way, so what are the benefits of following that rule? Does anyone here follow that rule?
I don't really follow this guidance unless I'm in the scenarios you need it:
there is an actual ambiguity - mainly this impacts either constructors (this.name = name;) or things like Equals (return this.id == other.id;)
you want to pass a reference to the current instance
you want to call an extension method on the current instance
Other than that I consider this clutter. So I turn the rule off.
It can make code clearer at a glance. When you use this, it's easier to:
Tell static and instance members apart. (And distinguish instance methods from delegates.)
Distinguish instance members from local variables and parameters (without using a naming convention).
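A small hedged sketch of the disambiguation cases just listed (the Account type is invented):

class Account
{
    private static int instanceCount;
    private decimal balance;

    public Account(decimal balance)
    {
        this.balance = balance;   // the field, not the parameter of the same name
        instanceCount++;          // no this. possible: clearly a static member
    }

    public void Deposit(decimal amount)
    {
        this.balance += amount;   // at a glance: instance state, not a local
    }
}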
I think this article explains it a little
http://blogs.msdn.microsoft.com/sourceanalysis/archive/2008/05/25/a-difference-of-style.aspx
...a brilliant young developer at Microsoft (ok, it was me) decided to take it upon himself to write a little tool which could detect variances from the C# style used within his team. StyleCop was born. Over the next few years, we gathered up all of the C# style guidelines we could find from the various teams within Microsoft, and picked out all of best practices which were common to these styles. These formed the first set of StyleCop rules. One of the earliest rules that came out of this effort was the use of the this prefix to call out class members, and the removal of any underscore prefixes from field names. C# style had officially grown apart from its old C++ tribe.
this.This
this.Does
this.Not
this.Add
this.Clarity
this.Nor
this.Does
this.This
this.Add
this.Maintainability
this.To
this.Code
The usage of "this.", when used excessively or a forced style requirement, is nothing more then a contrivance used under the guise that there is < 1% of developers that really do not understand code or what they are doing, and makes it painful for 99% who want to write easily readable and maintainable code.
As soon as you start typing, IntelliSense will list the content available in the scope of where you are typing; "this." is not necessary to expose class members, and unless you are completely clueless about what you are coding you should be able to easily find the item you need.
Even if you are completely clueless, use "this." to hint at what is available, but don't leave it in code. There are also a slew of add-ons like ReSharper that help to bring clarity to the scope and expose the contents of objects more efficiently. It is better to learn how to use the tools provided to you than to develop a bad habit that is hated by a large number of your co-workers.
Any developer that does not inherently understand the scope of static, local, class or global content should not rely on "hints" to indicate the scope. "this." is worse than Hungarian notation, as at least Hungarian notation provided an idea about the type the variable is referencing and serves some benefit. I would rather see "_" or "m" used to denote class field members than to see "this." everywhere.
I have never had an issue, nor seen an issue with a fellow developer that repeatedly fights with code scope or writes code that is always buggy because of not using "this." explicitly. It is an unwarranted fear that "this." prevents future code bugs and is often the argument used where ignorance is valued.
Coders grow with experience; "this." is like asking someone to put training wheels on their bike as an adult because it is what they first had to use to learn how to ride a bike. An adult might fall off a bike 1 in 1,000 times they get on it, but that is no reason to force them to use training wheels.
"this." should be banned from the language definition for C#, unfortunately there is only one reason for using it, and that is to resolve ambiguity, which could also be easily resolved through better code practices.
A few basic reasons for using this (and I coincidentally always prefix class values with the name of the class of which they are a part as well - even within the class itself).
1) Clarity. You know right this instant which variables you declared in the class definition and which you declared as locals, parameters and whatnot. In two years, you won't know that and you'll go on a wondrous voyage of re-discovery that is absolutely pointless and not required if you specifically state the parent up front. Somebody else working on your code has no idea from the get-go and thus benefits instantly.
2) Intellisense. If you type 'this.' you get all instance-specific members and properties in the help. It makes finding things a lot easier, especially if you're maintaining somebody else's code or code you haven't looked at in a couple of years. It also helps you avoid errors caused by misconceptions of what variables and methods are declared where and how. It can help you discover errors that otherwise wouldn't show up until the compiler choked on your code.
3) Granted you can achieve the same effect by using prefixes and other techniques, but this begs the question of why you would invent a mechanism to handle a problem when there is a mechanism to do so built into the language that is actually supported by the IDE? If you touch-type, even in part, it will ultimately reduce your error rate, too, by not forcing you to take your fingers out of the home position to get to the underscore key.
I see lots of young programmers who make a big deal out of the time they will save by not typing a character or two. Most of your time will be spent debugging, not coding. Don't worry so much about your typing speed. Worry more about how quickly you can understand what is going on in the code. If you save a total of five minutes coding and wind up spending an extra ten minutes debugging, you've slowed yourself down, no matter how fast you look like you're going.
Note that the compiler doesn't care whether you prefix references with this or not (unless there's a name collision with a local variable and a field or you want to call an extension method on the current instance.)
It's up to your style. Personally I remove this. from code as I think it decreases the signal to noise ratio.
Just because Microsoft uses this style internally doesn't mean you have to. StyleCop seems to be a MS-internal tool gone public. I'm all for adhering to the Microsoft conventions around public things, such as:
type names are in PascalCase
parameter names are in camelCase
interfaces should be prefixed with the letter I
use singular names for enums, except for when they're [Flags]
...but what happens in the private realms of your code is, well, private. Do whatever your team agrees upon.
Consistency is also important. It reduces cognitive load when reading code, especially if the code style is as you expect it. But even when dealing with a foreign coding style, if it's consistent then it won't take long to become used to it. Use tools like ReSharper and StyleCop to ensure consistency where you think it's important.
Using .NET Reflector suggests that Microsoft isn't that great at adhering to the StyleCop coding standards in the BCL anyway.
I do follow it, because I think it's really convenient to be able to tell apart access to static and instance members at first glance.
And of course I have to use it in my constructors, because I normally give the constructor parameters the same names as the field their values get assigned to. So I need "this" to access the fields.
In addition it is possible to duplicate variable names in a function so using 'this' can make it clearer.
class foo
{
    private string aString;

    public void SetString(string aString)
    {
        // this.aString refers to the class field
        // aString refers to the method parameter
        this.aString = aString;
    }
}
I follow it mainly for IntelliSense reasons. It is so nice typing this. and getting a concise list of properties, methods, etc.

How do you deal with new features of C# so that they don't lead to poorly written code?

A number of features were introduced into C# 3.0 which made me uneasy, such as object initializers, extension methods and implicitly typed variables. Now in C# 4.0 with things like the dynamic keyword I'm getting even more concerned.
I know that each of these features CAN be used in appropriate ways BUT in my view they make it easier for developers to make bad coding decisions and therefore write worse code. It seems to me that Microsoft are trying to win market share by making the coding easy and undemanding. Personally I prefer a language that is rigorous and places more demands on my coding standards and forces me to structure things in an OOP way.
Here are a few examples of my concerns for the features mentioned above:
Object constructors can do important logic that is not exposed to the consumer. This is in the control of the object developer. Object initializers take this control away and allow the consumer to make the decisions about which fields to initialize.
EDIT: I had not appreciated that you can mix constructor and initializer (my bad) but this starts to look messy to my mind and so I am still not convinced it is a step forward.
Allowing developers to extend built-in types using extension methods allows all and sundry to start adding their favourite pet methods to the string class, which can end up with a bewildering array of options, or requires more policing of coding standards to weed these out.
Allowing implicitly typed variables allows quick and dirty programming instead of properly OOP approaches, which can quickly become an unmanageable mess of vars all over your application.
Are my worries justified?
Object initializers simply allow the client to set properties immediately after construction, no control is relinquished as the caller must still ensure all of the constructor arguments are satisfied.
Personally I feel they add very little:
Person p1 = new Person("Fred");
p1.Age = 30;
p1.Height = 123;

Person p2 = new Person("Fred")
{
    Age = 30,
    Height = 123
};
I know a lot of people dislike the 'var' keyword. I can understand why, as it openly invites abuse, but I do not mind it provided the type is blindingly obvious:
var p1 = new Person("Fred");
Person p2 = GetPerson();
In the second line above, the type is not obvious, despite the method name. I would use the type in this case.
Extension methods -- I would use very sparingly but they are very useful for extending the .NET types with convenience methods, especially IEnumerable, ICollection and String. String.IsNullOrEmpty() as an extension method is very nice, as it can be called on null references.
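As a hedged sketch of the kind of convenience extension meant here (the IsBlank name and behaviour are made up, not a BCL method):

using System;

public static class StringExtensions
{
    // Like String.IsNullOrEmpty, this can safely be "called on" a null reference,
    // because it is really just a static method in disguise.
    public static bool IsBlank(this string value)
    {
        return value == null || value.Trim().Length == 0;
    }
}

// usage:
//   string s = null;
//   bool blank = s.IsBlank();   // true, and no NullReferenceException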
I do not think you need to worry: good developers will always use their tools with respect, and bad developers will always find ways to misuse their tools; limiting the toolset will not solve this problem.
You could limit the features of C# 3.0 your developers can use by writing the restrictions into your coding standards. Then when code is reviewed prior to check in, any code that breaches these rules should be spotted and the check in refused. One such case could well be extension methods.
Obviously, your developers will want to use the new features - all developers do. However, if you've got good, well documented reasons why they shouldn't be used, good developers will follow them. You should also be open to revising these rules as new information comes to light.
With VS 2008 you can specify which version of .NET you want to target (Right click over the solution and select Properties > Application). If you limit yourself to .NET 2.0 then you won't get any of the new features in .NET 3.5. Obviously this doesn't help if you want to use some of the features.
However, I think your fears over var are unwarranted. C# is still as strongly typed as ever. Declaring something as var simply tells the compiler to pick the best type for this variable; the variable can't change type - it's always an int or Person or whatever. Personally I follow the same rules as Paul Ruane: if the type is clear from the syntax then use var; if not, name the type explicitly.
I have seen your position expressed like this:
Build a development environment that any fool can use and what you get is many fools developing.
This is very true, but the rest of us look good by contrast. I regard this as a good thing. One of the funniest postings I have ever seen remarked that
VB should not be underestimated. It is an extremely powerful tool for keeping idiots out of this [C++] newsgroup.
More seriously, whether or not a tool is dangerous depends on the wisdom of the wielder. The only way you can prevent folly is to prevent action. foreach looks innocuous, but you can't even remove items as you iterate a collection, because removing an item invalidates the iterator. You end up dumping foreach in favour of a classic for loop.
I think the only really justified issue in your bunch is overuse of extension methods. When important functionality is only accessible through extension methods, sometimes it's hard for everyone in a group to find out about and use that functionality.
Worrying about object initializers and the "var" keyword seems very nitpicky. Both are simple and predictable syntax that can be used to make code more readable, and it's not clear to me how you see them being "abused."
I suggest you address concerns like this through written, agreed-upon coding standards. If nobody can come up with good reasons to use new language features, then there's no need to use them, after all.
Object initializers are just fancy syntax. There is nothing the developer can do with them that he couldn't already do before - they do however save you a few lines of code.
If by "extend built in types" you mean extension methods - again, this is nothing new, just better syntax. The methods are static methods that appear as if they were members. The build in classes remain untouched.
Implicit typed variables are needed for Linq. I also use them with generics that have a lot of type parameters. But I'd agree that I wouldn't like to see them used exclusively.
Of course every feature can be abused but I think that these new features actually let you write code that is easier to read.
One big mitigating factor about var is that it can never move between scopes. It can not be a return type or a parameter type. This makes it far safer in my mind, as it is always tightly typed and always implementation detail of one method.
New features were introduced because Microsoft realized that they were absolutely necessary for implementing new language features, like LINQ, for example. So you can use the same strategy:
1) know about those features,
2) do not use until you'd find it absolutely necessary for you.
If you really understand those features, I bet you'd feel it necessary pretty soon. :)
At least with "var" and intializers you're not really able to do anything new, it's just a new way to write things. Although it does look like object initializers compile to slightly different IL.
One angle that really blows my mind about extension methods is that you can put them on an interface. That means a class can inherit concrete code by implementing an interface. And since a class can implement multiple interfaces that means, in a roundabout sort of way, that C# now has something like multiple inheritance. So that's a new feature that should definitely be handled with care.
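A hedged sketch of that interface trick (all types here are invented): every implementer of the interface "inherits" the concrete extension method without defining it.

using System;

interface IShape
{
    double Area();
}

static class ShapeExtensions
{
    // Concrete behaviour attached to the interface itself.
    public static string Describe(this IShape shape)
    {
        return "Shape with area " + shape.Area();
    }
}

class Circle : IShape
{
    public double Radius;
    public double Area() { return Math.PI * Radius * Radius; }
}

// var c = new Circle { Radius = 2 };
// Console.WriteLine(c.Describe());   // Circle never defined Describe itself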
Are my worries justified?
No. This has been another edition of simple answers to simple questions.

Why shouldn't I prefix my fields? [closed]

I've never been a fan of Hungarian notation; I've always found it pretty useless unless you're doing some really low-level programming. But in every C++ project I've worked on, some kind of Hungarian notation policy was enforced, and with it the use of some 'not-really-Hungarian' prefixes such as m_ for fields, s_ for statics, g_ for globals and so on.
Soon I realized how useless it was in C# and gradually started to drop all of my old habits... except the 'm_' thing. I still use the m_ prefix on private fields because I really find it very useful to be able to distinguish between parameters, locals and fields.
The naming conventions for fields page at MSDN says I shouldn't, but it does not say why (the way e.g. Google's conventions generally tend to rationalize their prescriptions).
Are there reasons why I shouldn't, or is it only a matter of style? If it is the latter, are prefixes generally considered bad style, and can I expect negative reactions from other people working on the codebase?
I like the underbar prefix for member fields. Mostly I like it because that way, all of my member fields are shown alphabetically before my methods in the wizard bar at the top of the screen.
When you should:
When your project coding guidelines say you should
When you shouldn't:
When your project coding guidelines say you shouldn't
If you don't have any guidelines yet, you're free to choose whatever you or your team want and feel most comfortable with. Personally when coding C++ I tend to use m_ for members, it does help. When coding in other languages, particularly those without true classes (like Javascript, Lua) I don't.
In short I don't believe there is a "right" and a "wrong" way.
The auto-implemented property feature in C# 3.0 creates less of a need for this convention one way or the other. Instead of writing
string m_name;
public string Name { get { return m_name; } }
or
string _Name;
public string Name { get { return _Name; } }
(or any other convention), you can now write
public string Name { get; private set; }
Since you no longer need the explicit backing store variable, you no longer have to come up with a name for it; thus avoiding this entire discussion.
Obviously, this argument doesn't apply when you really need explicit backing store such as to perform validation.
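For completeness, a hedged sketch of that validation case, where a named backing field is still needed (the leading underscore here is just one naming option):

private string _name;

public string Name
{
    get { return _name; }
    set
    {
        if (string.IsNullOrEmpty(value))
            throw new ArgumentException("Name must not be empty.");
        _name = value;
    }
}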
As some have alluded to, the MS guidelines say:
Do not use a prefix for field names. For example, do not use g_ or s_ to distinguish static versus non-static fields.
I happen to agree with this. Prefixes make your code look ugly and waste space with inconsequential characters. Having said that, it is often common to use fields to back properties where both the field and the property have the same name (with the private field being camel case and the property being pascal case). In VB, this doesn't work, since VB isn't case-sensitive. In this scenario, I recommend the use of a single _ prefix. No more, no less. It just looks cleaner, IMHO.
I have experimented with m_, s_, just _, and no prefix at all. I have settled on using just _ for all static and instance variables. I don't find it important to distinguish static variables from instance variables. In theory it sounds good, in practice it doesn't create a problem.
A coworker once made a convincing argument to eliminate all prefixes; we tried it on one project and it worked better than I expected. I carried it forward to my next project and became annoyed that it "interferes" with IntelliSense. When you have the following situation
int foo;

public int Foo
{
    get { return foo; }
}
Starting to type foo will suggest both the instance variable and the property. Prefixing the variable with an underscore eliminates the annoying double suggestion, so I switched back to using just _.
I try to follow the MSDN .NET library guidelines. They include a naming guidelines section.
Obviously, these are secondary to your project guidelines.
I prefer to mark property backing fields (although, as already mentioned, .NET 3.0+ reduces the need thanks to automatic properties) with underscores but not the "m". For one thing, it puts them at the top of the IntelliSense list when I come to use them.
I will admit that I need to brush-up on the guidelines on MSDN, things can change so quickly these days.
With tools like ReSharper there's really no reason for prefixes. Also, if you write short methods, you should be able to tell really quickly where the var is coming from. Finally, I guess I wouldn't really see the need to tell the difference between a static member or not, because again ReSharper is going to red-line it if you try to do something you're not able to. Even without ReSharper you're probably saved by the compiler.
I always prefix member variables with m_ and static variables with s_ for the same reasons that you state. Some people prefix member variables with an underscore, but I've always found this a bit odd looking (but that's just a personal preference).
Most people I work with use the m_/s_ prefix. I don't really think it matters too much what you use, as long as you're consistent.
I never use them. It encourages sloppy coding.
The MSDN coding guidelines, that's where it's at.
Here are a few reasons to use _ (and not m_).
(1) Many BCL guys do it despite MS's naming guide. (Check out their blog.) Those guys write the framework, so they have some good habits worth copying. Some of the most helpful example code on MSDN is written by them, and so uses the underscore convention. It's a de-facto industry standard.
(2) A single underscore is a noticeable yet unobtrusive way to disambiguate method and class-level variables by simply reading the source. It helps people understand new (or old) code at-a-glance when reading it. Yes, you can mouse-over to see this in an IDE, but we shouldn't be forced to. You may want to read it in a text editor, or dare I say it, on paper.
(3) Some say you don't need any prefix as methods will be short, and later if needed you can change the field to an auto-implemented property. But in the real world methods are as long as they need to be, and there are important differences between fields and properties (e.g. serialization and initialization).
Footnote: The "m" for member in m_ is redundant in our usage here, but it was lower case because one of the ideas in many of these old naming conventions was that type names started with upper case and instance names started with lower case. That doesn't apply in .NET so it's doubly redundant. Also Hungarian notation was sometimes useful with old C compilers (e.g. integer or pointer casting and arithmetic) but even in C++ its usefulness was diminished when dealing with classes.
As John Kraft mentions, there is no "correct" answer. MattJ is the closest: you should always follow your company's style guidelines. When in Rome, and all that.
As for my personal opinion, since it's called for here, I vote that you drop m_ entirely.
I believe the best style is one where all members are PascalCased, regardless of visibility (that means even private members), and all arguments are camelCased. I do not break this style.
I can understand the desire to prefix the property backing store field; after all, you must differentiate between the field and the property, right? I agree, you must. But use a postfix.
Instead of m_MyProperty (or even _MyProperty, which I've seen and even promoted once upon a time), use MyPropertyValue. It's easier to read and understand and -- more importantly -- it's close to your original property name in IntelliSense.
Ultimately, that's the reason I prefer a postfix. If I want to access MyPropertyValue using IntelliSense, I (typically) type "My <down-arrow> <tab>", since by then you're close enough that only MyProperty and MyPropertyValue are on the list. If you want to access m_MyProperty using IntelliSense, you'll have to type "m_My <tab>".
It's about keystroke economy, in my opinion.
There is one important difference between C++ and C#: Tool support. When you follow the established guidelines (or common variations), you will get a deep level of tool support that C++ never had. Following the standards allows tools to do deeper refactoring/rename operations than you'd otherwise be capable of. Resharper does this. So stick with one of the established standards.
I never do this and the reason why is that I [try to] keep my methods short. If I can see the whole method on the screen, I can see the params, I can see the locals and so I can tell what is owned by the class and what is a param or a local.
I do typically name my params and locals using a particular notation, but not always. I'm nothing if not inconsistent. I rely on the fact that my methods are short and try to keep them from doing X, Y and Z when they should be only doing X.
Anyhow, that's my two cents.
Unless I'm stuck with vi or Emacs for editing code, my IDE takes care of differential display of members for me, so I rarely use any special conventions. That also goes for prefixing interfaces with I or classes with C.
Someone, please, explain the .NET style of I-prefix on interfaces. :)
What I am used to is that private fields get a small leading underscore, e.g. string _name; the public property gets "Name"; and input variables in methods get lower case: void MyMethod(string name).
If you have a static constant, it is often written in capital letters: const string MYCONST = "hmpf";
I am sure that I will get flamed for this but so be it.
It's called Microsoft's .NET library guidelines but it's really Brad Abrams's views (document here) - there are other views with valid reasons.
People tend to go with the majority view rather than having good solid reasons for a specific style.
The important point is to evaluate why a specific style is used and why it's preferred over another style - in other words, have a reason for choosing a style not just because everyone says it's the thing to do - think for yourself.
The basic reason for not using old style Hungarian was the use of abbreviations which was different for every team and difficult to learn - this is easily solved by not abbreviating.
As the available development tools change the style should change to what makes the most sense - but have a solid reason for each style item.
Below are my style guidelines with my reasons - I am always looking for ways to improve my style to create more reliable and easier to maintain code.
Variable Naming Convention
We all have our view on variable naming conventions. There are many different styles that will help produce easily maintainable quality code - any style which supports the basic essential information about a variable are okay. The criteria for a specific naming convention should be that it aids in producing code that is reliable and easily maintainable. Criteria that should not be used are:
It's ugly
Microsoft (i.e. Brad Abrams) says don't use that style - Microsoft does not always produce the most reliable code just look at the bugs in Expression Blend.
It is very important when reading code that a variable name should instantly convey three essential facts about the variable:
its scope
its type
a clear understanding of what it is used for
Scope: Microsoft recommends relying totally on IntelliSense. IntelliSense is awesome; however, one simply does not mouse over every variable to see its scope and type. Assuming a variable is in a scope that it is not can cause significant errors. For example, if a reference variable is passed in as a parameter and it is altered in local scope, that change will remain after the method returns, which may not be desired. If a field or a static variable is modified in local scope but one thinks that it is a local variable, unexpected behavior could result. Therefore it is extremely important to be able to just look at a variable (not mouse over) and instantly know its scope.
The following style for indicating scope is suggested (a short sketch follows the list); however, any style is perfectly okay as long as it clearly and consistently indicates the variable's scope:
m_ field variable
p_ parameter passed to a method
s_ static variable
(no prefix) local variable
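A short sketch of just these scope prefixes (the type part of the name is added in the next section; all names here are hypothetical):

    using System;

    public class CustomerRepository
    {
        private static int s_openCount;         // s_ : static variable
        private string m_connectionString;      // m_ : field variable

        public void Open(string p_connectionString)   // p_ : parameter
        {
            string status = "opening";          // no prefix : local variable
            m_connectionString = p_connectionString;
            s_openCount++;
            Console.WriteLine(status + " " + m_connectionString);
        }
    }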
Type: Serious errors can occur if one believes they are working with a specific type when they are actually working with a different type - again, we simply do not mouse over every variable to determine its type, we just assume that we know what its type is, and that is how errors are created.
Abbreviations: Abbreviations are evil because they can mean different things to different developers. One developer may think a leading lower case "s" means string while another may think it means signed integer. Abbreviations are a sign of lazy coding - take a little extra time and type the full name to make it clear to the developer who has to maintain the code. For example, the difference between "str" and "string" is only three characters - it does not take much more effort to make code easy to maintain.
Common and clear abbreviations for built-in data types only are acceptable but must be standardized within the team.
Self Documenting Code: Adding a clear description to a variable name makes it very easy for another developer to read and understand the code - make the name so understandable that the team manager can read and understand the code without being a developer.
Order of Variable Name Parts: The recommended order is scope-type-description because:
IntelliSense will group all similar scopes and within each scope IntelliSense will group all similar types which makes lookups easy - try finding a variable the other way
It makes it very easy to see and understand the scope and to see and understand the type
It's a fairly common style and easy to understand
It will pass FxCop
Examples:
m_stringCustomerName
p_stringCustomerDatabaseConnectionString
intNumberOfCustomerRecords or iNumberOfCustomerRecords or integerNumberOfCustomerRecords
These simple rules will significantly improve code reliability and maintainability.
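Putting scope, type and description together, a hypothetical class using the style above might look like this:

    public class CustomerReport
    {
        // scope (m_) + type (string) + description
        private string m_stringCustomerName = string.Empty;

        // scope (s_) + type (int) + description
        private static int s_intReportCount;

        public string CustomerName
        {
            get { return m_stringCustomerName; }
        }

        // parameter: scope (p_) + type (string) + description
        public int CountRecords(string p_stringCustomerDatabaseConnectionString)
        {
            // local: no scope prefix, just type + description
            int intNumberOfCustomerRecords = 0;

            // ... open p_stringCustomerDatabaseConnectionString and count the records ...

            s_intReportCount++;
            return intNumberOfCustomerRecords;
        }
    }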
Control Structure Single Line Statements
Single-line statements in all control structures (if, while, for, etc.) should always be wrapped with braces, because it is very easy to add a new statement without realizing that the existing statement belongs to a control structure, which will break the code logic without generating any compile-time errors.
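A small example of the failure mode this rule guards against (the Console.WriteLine calls stand in for whatever the real statements would be):

    using System;

    public class BraceExamples
    {
        // Without braces: only the first statement belongs to the if.
        // The return always executes, even though the indentation
        // suggests it only runs when order is null.
        public void Unbraced(object order)
        {
            if (order == null)
                Console.WriteLine("order was null");
                return;   // oops - unconditional, and no compile-time error
        }

        // With braces: adding a statement later cannot silently change the logic.
        public void Braced(object order)
        {
            if (order == null)
            {
                Console.WriteLine("order was null");
                return;
            }
            Console.WriteLine(order);
        }
    }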
Method Exception Wrapping
All methods should be wrapped with an outer try-catch which traps exceptions and provides a place to recover, identify, locate, log, and decide whether or not to rethrow. It is the unexpected exception that causes our applications to crash - by wrapping every method and trapping all unhandled exceptions, we guarantee that every exception is identified and logged, and we prevent our application from ever crashing. It takes a little more work but the result is well worth the effort.
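A sketch of that pattern as I read it; Logger here is a placeholder, not a real library:

    using System;

    public static class Logger   // stand-in for whatever logging the project uses
    {
        public static void Log(string message, Exception ex)
        {
            Console.Error.WriteLine(message + ": " + ex);
        }
    }

    public class CustomerService
    {
        public int CountRecords(string connectionString)
        {
            try
            {
                // ... the actual work of the method ...
                return 0;
            }
            catch (Exception ex)
            {
                // Identify, locate and log: the method name pins down where it happened.
                Logger.Log("CountRecords failed", ex);

                // Decide whether to recover here or rethrow; "throw;" (not "throw ex;")
                // preserves the original stack trace for the caller.
                throw;
            }
        }
    }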
Indentation
Indentation is not a major issue; however, four spaces and not using tabs is suggested. If code is printed, the first printer tab usually defaults to 8 spaces, and different developers tend to use different tab sizes. Microsoft's code is usually indented 4 spaces, so if one uses any Microsoft code with anything other than 4 spaces, the code will need to be reformatted. Four spaces keeps it easy and consistent.
I never use any hungarian warts whenever I'm given the choice. It's extra typing and doesn't convey any meaningful information. Any good IDE (and I define "good" based on the presence of this feature, among others) will allow you to have different syntax highlighting for static members, instance members, member functions, types, etc. There is no reason to clutter your code with information that can be provided by the IDE. This is a corollary to not cluttering your code with commented-out old code because your versioning system should be responsible for that stuff.
The best way is to agree on a standard with your colleagues, and stick to it. It doesn't absolutely have to be the method that would work best for everyone, just agreeing on one method is more important than which method you actually agree on.
What we chose for our code standard is to use _ as a prefix for member variables. One of the reasons was that it makes it easy to find the member variables in IntelliSense.
Before we agreed on that standard I used another one. I didn't use any prefix at all, and wrote this.memberVariable in the code to show that I was using a member variable.
With the property shorthand (auto-implemented properties) in C# 3, I find that I use far fewer explicit member variables.
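The three approaches mentioned above, side by side (Person is just an illustrative class):

    // 1. Underscore prefix on the backing field
    public class PersonA
    {
        private string _name;

        public string Name
        {
            get { return _name; }
            set { _name = value; }
        }
    }

    // 2. No prefix; "this." marks member access inside methods
    public class PersonB
    {
        private string name;

        public void Rename(string name)
        {
            this.name = name;   // "this." distinguishes the field from the parameter
        }
    }

    // 3. C# 3 auto-implemented property: no explicit member variable at all
    public class PersonC
    {
        public string Name { get; set; }
    }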
The closest thing to official guidelines is StyleCop, a tool from Microsoft which can automatically analyse your source files and detect violations from the recommended coding style, and can be run from within Visual Studio and/or automated builds such as MSBuild.
We use it on our projects and it does help to make code style and layout more consistent between developers, although be warned it does take quite a bit of getting used to!
To answer your question - it doesn't allow any Hungarian notation, nor any prefixes like m_ (in fact, it doesn't allow the use of underscores at all).
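Roughly what that means in practice, as far as I understand StyleCop's default naming rules (the exact rule set varies by version, so treat this as a sketch):

    public class Order
    {
        // These would be flagged by the default rules:
        //   private string _customerName;     // leading underscore
        //   private string m_customerName;    // m_ prefix
        //   private string strCustomerName;   // Hungarian prefix

        // This passes: a plain camelCase field, accessed via "this." in methods.
        private string customerName;

        public void Rename(string customerName)
        {
            this.customerName = customerName;
        }
    }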
I don't use that style any longer. It was developed to help you see quickly how variables were being used. The newer dev environments let you see that information by hovering your mouse over the variable. The need for it has gone away if you use those newer tools.
There might also be some insight to be gleaned from C++ Coding Standards (Sutter, Herb and Alexandrescu, Andrei, 2004). Item #0 is entitled "Don't sweat the small stuff. (Or: Know what not to standardize.)".
They touch on this specific question a little bit by saying "If you can't decide on your own naming convention, try ... private member variables likeThis_ ..." (Remember use of leading underscore is subject to very specific rules in C++).
However, before getting there, they emphasize a certain level of consistency "...the important thing is not to set a rule but just to be consistent with the style already in use within the file..."
The benefit of that notation in C/C++ was to make it easier to see what a symbol's type was without having to go search for the declaration. These styles appeared before the arrival of Intellisense and "Go to Definition" - we often had to go on a goose chase looking for the declaration in who knows how many header files. On a large project this could be a significant annoyance which was bad enough when looking at C source code, but even worse when doing forensics using mixed assembly+source code and a raw call stack.
When faced with these realities, using m_ and all the other hungarian rules starts to make some sense even with the maintenance overhead because of how much time it would save just in looking up a symbol's type when looking at unfamiliar code. Now of course we have Intellisense and "Go to Definition", so the main time saving motivation of that naming convention is no longer there. I don't think there's much point in doing that any more, and I generally try to go with the .NET library guidelines just to be consistent and possibly gain a little bit more tool support.
If you are not coding under a particular guideline, you should keep using your current m_ notation and change it if the project coding guidelines say so.
Be functional.
Do not use global variables.
Do not use static variables.
Do not use member variables.
If you really have to, but only if you really have to, use one and only one variable to access your application / environment.
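A rough sketch of that style, with all state passed in explicitly and a single made-up AppEnvironment object as the only escape hatch:

    using System.Collections.Generic;
    using System.IO;

    // All state flows through parameters; no globals, statics or member fields.
    public static class PriceCalculator
    {
        public static decimal Total(IEnumerable<decimal> prices, decimal taxRate)
        {
            decimal sum = 0m;
            foreach (decimal price in prices)
            {
                sum += price;
            }
            return sum * (1 + taxRate);
        }
    }

    // The one and only variable for the application/environment, passed where needed.
    public class AppEnvironment
    {
        public string ConnectionString { get; set; }
        public TextWriter Log { get; set; }
    }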
