I've always wondered how the dependencies are managed from a programming language to its libraries. Take for example C#. When I was beginning to learn about computing, I would assume (wrongly as it turns out) that the language itself is designed independently of the class libraries that would eventually become available for it. That is, the set of language keywords (such as for, class or throw) plus the syntax and semantics are defined first, and libraries that can be used from the language are developed separately. The specific classes in those libraries, I used to think, should not have any impact on the design of the language.
But that doesn't work, or not all the time. Consider throw. The C# compiler makes sure that the expression following throw resolves to an exception type. Exception is a class in a library, and as such it should not be special at all. It would be a class as any other, except that the C# compiler assigns it that special semantics. That is very good, but my conclusion is that the design of the language does depend on the existence and behaviour of specific elements in the class libraries.
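For example, here is a minimal sketch of that check in action (the CS0155 error code is specific to the Microsoft C# compiler):

class ThrowDemo
{
    static void Fail()
    {
        // OK: InvalidOperationException derives from System.Exception.
        throw new System.InvalidOperationException("boom");

        // Rejected by the compiler with error CS0155:
        // throw "boom";
    }
}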
Additionally, I wonder how this dependency is managed. If I were to design a new programming language, what techniques would I use to map the semantics of throw to the very particular class that is Exception?
So I have two questions:
Am I correct in thinking that language design is tightly coupled to that of its base class libraries?
How are these dependencies managed from within the compiler and run-time? What techniques are used?
Thank you.
EDIT. Thanks to those who pointed out that my second question is very vague. I agree. What I am trying to learn is what kind of references the compiler stores about the types it needs. For example, does it find the types by some kind of unique id? What happens when a new version of the compiler or the class libraries is released? I am aware that this is still pretty vague, and I don't expect a precise, single-paragraph answer; rather, pointers to literature or blog posts are most welcome.
What I am trying to learn is what kind of references the compiler stores about the types it needs. For example, does it find the types by some kind of unique id?
Obviously the C# compiler maintains an internal database of all the types available to it in both source code and metadata; this is why a compiler is called a "compiler" -- it compiles a collection of data about the sources and libraries.
When the C# compiler needs to, say, check whether an expression that is thrown is derived from or identical to System.Exception, it effectively does a global namespace lookup on System, then a lookup on Exception within that namespace, finds the class, and then compares the resulting class information to the type that was deduced for the expression.
The compiler team uses this technique because that way it works no matter whether we are compiling your source code and System.Exception is in metadata, or if we are compiling mscorlib itself and System.Exception is in source.
Of course as a performance optimization the compiler actually has a list of "known types" and populates that list early so that it does not have to undergo the expense of doing the lookup every time. As you can imagine, the number of times you'd have to look up the built-in types is extremely large. Once the list is populated then the type information for System.Exception can be just read out of the list without having to do the lookup.
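As an illustrative sketch only (this is not the actual compiler source; all the names here are invented), the technique amounts to a lazily populated cache in front of the normal lookup:

using System.Collections.Generic;

// Hypothetical sketch: look a well-known type up once through the normal
// namespace-lookup path, then answer all later requests from a cache.
class KnownTypeTable
{
    private readonly Dictionary<string, TypeSymbol> cache =
        new Dictionary<string, TypeSymbol>();

    public TypeSymbol Get(string fullName)
    {
        TypeSymbol symbol;
        if (!cache.TryGetValue(fullName, out symbol))
        {
            // Slow path: works whether the type lives in metadata or in
            // the source being compiled (e.g. when compiling mscorlib).
            symbol = GlobalNamespaceLookup(fullName);
            cache[fullName] = symbol;
        }
        return symbol;
    }

    private TypeSymbol GlobalNamespaceLookup(string fullName)
    {
        /* ... resolve via source symbols or metadata ... */
        return new TypeSymbol();
    }
}

class TypeSymbol { } // stand-in for the compiler's type information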
What happens when a new version of the compiler or the class libraries is released?
What happens is: a whole bunch of developers, testers, managers, designers, writers and educators get together and spend a few million man-hours making sure that the compiler and the class libraries all work before they're released.
This question is, again, impossibly vague. What has to happen to make a new compiler release? A lot of work, that's what has to happen.
I am aware that this is still pretty vague, and I don't expect a precise, single-paragraph answer; rather, pointers to literature or blog posts are most welcome.
I write a blog about, among other things, the design of the C# language and its compiler. It's at http://ericlippert.com.
I would assume (perhaps wrongly) that the language itself is designed independently of the class libraries that would eventually become available for it.
Your assumption is, in the case of C#, completely wrong. C# 1.0, the CLR 1.0 and the .NET Framework 1.0 were all designed together. As the language, runtime and framework evolved, the designers of each worked very closely together to ensure that the right resources were allocated so that each could ship new features on time.
I do not understand where your completely false assumption comes from; that sounds like a highly inefficient way to write a high-level language and a great way to miss your deadlines.
I can see writing a language like C, which is basically a more pleasant syntax for assembler, without a library. But how would you possibly write, say, async-await without having the guy designing Task<T> in the room with you? It seems like an exercise in frustration.
Am I correct in thinking that language design is tightly coupled to that of its base class libraries?
In the case of C#, yes, absolutely. There are dozens of types that the C# language assumes are available and as-documented in order to work correctly.
I once spent a very frustrating hour with a developer who was having some completely crazy problem with a foreach loop before I discovered that he had written his own IEnumerable<T> that had slightly different methods than the real IEnumerable<T>. The solution to his problem: don't do that.
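For context, a foreach loop over an IEnumerable<T> is roughly lowered to the pattern below (a simplification; the real expansion also handles non-generic and pattern-based enumerators), which is why a look-alike interface with slightly different members confuses the binding:

using System;
using System.Collections.Generic;

static class ForeachLowering
{
    // Roughly what the compiler emits for: foreach (var item in collection) { ... }
    static void PrintAll(IEnumerable<string> collection)
    {
        using (IEnumerator<string> e = collection.GetEnumerator())
        {
            while (e.MoveNext())
            {
                string item = e.Current;
                Console.WriteLine(item);
            }
        }
    }
}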
How are these dependencies managed from within the compiler and run-time?
I don't know how to even begin to answer this impossibly vague question.
All (practical) programming languages have a minimum number of required functions. For modern "OO" languages, this also includes a minimum number of required types.
If the type is required in the Language Specification, then it is required - regardless of how it is packaged.
Conversely, not all of the BCL is required to have a valid C# implementation. This is because not all of the BCL types are required by the Language Specification. For instance, System.Exception (see #16.2) and NullReferenceException are required, but FileNotFoundException is not required to implement the C# Language.
Note that even though the specification provides minimal definitions for base types (e.g. System.String), it does not define the commonly-accepted methods (e.g. String.Replace). That is, almost all of the BCL is outside the scope of the Language Specification¹.
.. but my conclusion is that the design of the language does depend on the existence and behaviour of specific elements in the class libraries.
I agree entirely and have included examples (and limits of such definitions) above.
.. If I were to design a new programming language, what techniques would I use to map the semantics of "throw" to the very particular class that is "Exception"?
I would not look primarily at the C# specification, but rather at the Common Language Infrastructure specification. For practical reasons, this new language should be designed to interoperate with existing CLI/CLR languages, but it does not necessarily need to "be C#".
¹ The CLI (and associated references) do define the requirements of a minimal BCL. So if it is taken that a valid C# implementation must conform to (or may assume) the CLI then there are many other types to consider that are not mentioned in the C# specification itself.
Unfortunately, I do not have sufficient knowledge of the 2nd (and more interesting) question.
My impression is that, in languages like C# and Ada, application source code is portable across compilers/implementations, but standard library source code is not.
From other questions I can see that locking on types is a bad idea. But since it is possible to do, I was wondering: if it is such a bad thing, why is it allowed? I assume there must be good use cases, so could someone let me know what they are, please?
It's nearly always a bad idea:
Anyone can lock on the types from anywhere in the code, so you have no way to be sure that you won't get a deadlock without looking through all the code.
Locking on a type can even cause deadlocks across AppDomains. See Joe Duffy's article: Don't lock on marshal-by-bleed objects.
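To make the first point concrete, here is a hypothetical sketch of two unrelated pieces of code that silently share one lock:

// Hypothetical: both classes lock the same Type object, so they serialize
// (and can deadlock with) each other without either knowing about the other.
class LibraryA
{
    public static void DoWork()
    {
        lock (typeof(string)) { /* ... */ }
    }
}

class LibraryB
{
    public static void DoOtherWork()
    {
        lock (typeof(string)) { /* ... */ }  // the very same lock as LibraryA
    }
}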
It's allowed because there are almost no restrictions on what you can use as your lock object. In other words, it wasn't specifically allowed - it's just that there isn't any code in the .NET framework that disallows it.
The book "Debugging Microsoft .NET Applications" has source code for an FxCop rule DoNotLockOnTypes that warns you if you try to do this. (thanks to Christian.K)
To understand why it is a bad idea in general have a look at the article Don't lock type objects.
It is allowed because the language/framework designers decided that you can take a lock on anything that derives from System.Object. Nobody can prevent it, because System.Type derives from System.Object (as does every other .NET type).
Take this signature:
void Foo(object o)
How could a compiler enforce that o is not a System.Type? You could of course check it at runtime, but this would have a performance impact.
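A sketch of what such a runtime check would look like (hypothetical Foo; note the cost is paid on every single call):

using System;

static class Guarded
{
    static void Foo(object o)
    {
        // Possible, but taxes every call site:
        if (o is Type)
            throw new ArgumentException("Do not lock on Type objects.", "o");

        lock (o) { /* ... */ }
    }
}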
And of course there might be super-exotic situations where one might need to lock on a type. Maybe the CLR does it internally.
Many bad ideas find their way into programming languages because no language designer can foretell the future. Any language created by humans will have warts.
Some examples:
Hejlsberg wished (Original article: The A-Z of Programming Languages: C# - Computerworld) he had added non-nullable class references to C#. (I wish he had bitten off the const problem as well.)
The C++ committee screwed up with valarray, and export, among numerous other minor and major regrets.
Java's generics were a botch-job (OMG, type erasure!) designed to avoid changing the VM, and by the time they realised the VM had to change anyway, it was too late to do the necessary rework.
Python's scoping rules are a constant irritant, and the numerous attempts to improve them haven't really helped much (a little, but not much).
Is there any way to disable the use of the "dynamic" keyword in .net 4?
I thought the Code Analysis feature of VS2010 might have a rule to fail the build if the dynamic keyword is used, but I couldn't find one.
It's part of the C# 4.0 language, so no, not really.
You can use FxCop to look for it, though, and fail the build if it encounters it.
StyleCop might work instead:
http://code.msdn.microsoft.com/sourceanalysis
Here is a link discussing the same issue and how StyleCop might be the answer. There is also a post about how to get FxCop to look for the dynamic keyword, although it's not perfect.
http://social.msdn.microsoft.com/Forums/en/vstscode/thread/8ce407ba-bdf7-422b-bbcd-ca4701c3a76f
The dynamic keyword is not evil, but using it could be.
It leads to code errors that you can only find during runtime.
This should be avoided at all costs.
Runtime errors are bad. Compile time errors are good.
You could use something like the following to set your own standards.
http://joel.fjorden.se/static.php?page=CodeStyleEnforcer
Target .net 1.0? :-)
Or do code reviews.
(Or, to be less facetious, it should be pretty easy to write a custom FxCop or CA rule to disallow use of dynamic)
Wouldn't you just kill for a C++ macro right now? :-)
Remove the reference to Microsoft.CSharp.dll, and I think maybe all uses of dynamic will fail to compile.
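That works because dynamic call sites compile into calls to the C# runtime binder, which lives in Microsoft.CSharp.dll; without that reference, code like this minimal sketch should fail to compile with a missing-compiler-required-member error:

class DynamicDemo
{
    static void Main()
    {
        dynamic d = "hello";
        int len = d.Length; // bound at runtime via Microsoft.CSharp.RuntimeBinder
        System.Console.WriteLine(len);
    }
}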
I'm not sure I understand what this irrational fear of the dynamic keyword is about. There was the same kind of hysteria over implicitly typed variables and the var keyword for .NET 3.5, except that was just idiotic, since those are legitimate statically typed variables.
The dynamic keyword serves a highly specialized purpose; I don't see why anyone would want to use it without understanding why. However, stopping that from occurring could be solved with one team meeting explaining some of the new features of .NET 4, including the dynamic keyword. I assume you're a senior, or the senior lead, of the team; it should be quite easy to tell your team that if they ever feel they NEED to use the dynamic keyword, they should come see you FIRST.
That is exactly the instruction I gave to my team, as I find it unlikely we will ever use the dynamic keyword, because we don't do COM interop. And beyond that, I will defer any kind of dynamic proxying to an established library like LinFu or Castle, and leave it to them to use or not use the dynamic keyword.
Is there any design reason for that (like the reason they gave up multiple inheritance)?
Or was it just not important enough?
And the same question applies to optional parameters in methods... they were already in the first version of VB.NET, so it was surely not laziness that caused MS not to allow optional parameters; it was probably an architectural decision. And it seems they have had a change of heart about that, because C# 4 is going to include them.
What was the decision and why did they give it up?
Edit:
Maybe readers didn't fully understand me. I'm working lately on a calculation program (supporting numbers of any size, to the last digit), in which some methods are used millions of times per second.
Say I have a method called Add(int num), and this method is used quite a lot with 1 as the parameter (Add(1);). I've found out it is faster to implement a special method just for one. And I don't mean overloading: writing a new method called AddOne, and literally copying the Add method into it, except that instead of using num I write 1. This might seem horribly weird to you, but it's actually faster.
(as ugly as it is)
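A hypothetical reconstruction of what is being described (Accumulator and value are invented for the sketch):

class Accumulator
{
    private long value;

    // General path: everything is computed in terms of 'num'.
    public void Add(int num)
    {
        value += num;
    }

    // Hand-specialized copy: the same body with 'num' replaced by the
    // constant 1, so the compiler/JIT can fold the constant.
    public void AddOne()
    {
        value += 1;
    }
}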
That made me wonder why C# doesn't support manual inlining, which could be amazingly helpful here.
Edit 2:
I asked myself whether or not to add this. I'm very familiar with the weirdness (and disadvantages) of choosing a platform such as .NET for such a project, but I think .NET's optimizations are more important than you think... especially features such as Any CPU, etc.
To answer part of your question, see Eric Gunnerson's blog post: Why doesn't C# have an 'inline' keyword?
A quote from his post:
For C#, inlining happens at the JIT level, and the JIT generally makes a decent decision.
EDIT: I'm not sure of the reason for delayed optional parameters support, however saying they "gave up" on it sounds as though they were expected to implement it based on our expectations of what other languages offered. I imagine it wasn't high on their priority list and they had deadlines to get certain features out the door for each version. It probably didn't rise in importance till now, especially since method overloading was an available alternative. Meanwhile we got generics (2.0), and the features that make LINQ possible etc. (3.0). I'm happy with the progression of the language; the aforementioned features are more important to me than getting support for optional parameters early on.
Manual inlining would be almost useless. The JIT compiler inlines methods during native code compilation where appropriate, and I think in almost all cases the JIT compiler is better at guessing when it is appropriate than the programmer.
As for optional parameters, I don't know why they weren't there in previous versions. That said, I don't like them in C# 4, because I consider them somewhat harmful: the default values get baked into the consuming assembly, and you have to recompile that assembly if you change the defaults in a DLL and want the consumer to use the new ones.
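A sketch of the problem (hypothetical Log class split across two assemblies):

// In Library.dll:
public static class Log
{
    public static void Write(string message, int level = 1) { /* ... */ }
}

// In Consumer.exe, the call
//     Log.Write("hi");
// is compiled as Log.Write("hi", 1): the default 1 is baked into Consumer.exe.
// Changing the default to 2 in Library.dll has no effect on the already
// compiled consumer until it is rebuilt.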
EDIT:
Some additional information about inlining. Although you cannot force the JIT compiler to inline a method call, you can force it to NOT inline a method call. For this, you use the System.Runtime.CompilerServices.MethodImplAttribute, like so:
internal static class MyClass
{
    [System.Runtime.CompilerServices.MethodImpl(
        System.Runtime.CompilerServices.MethodImplOptions.NoInlining)]
    private static void MyMethod()
    {
        // Powerful, magical code
    }

    // Other code
}
My educated guess: the reason earlier versions of C# didn't have optional parameters is because of bad experiences with them in C++. On the surface, they look straight-forward enough, but there are a few bothersome corner cases. I think one of Herb Sutter's books describes this in more detail; in general, it has to do with overriding virtual methods. Maximilian has mentioned one of the .NET corner cases in his answer.
You can also pretty much get by without them by manually writing multiple overloads; that may not be very nice for the author of the class, but clients will hardly notice the difference between overloads and optional parameters.
So after all these years w/o them, why did C# 4.0 add them? 1) improved parity with VB.NET, and 2) easier interop with COM.
I'm working lately on a calculation program (support numbers of any size, to the last digit), in which some methods are used literally millions of times per second.
Then you chose a wrong language. I assume you actually profiled your code (right?) and know that there is nothing apart from micro-optimisations that can help you. Also, you're using a high-performance native bigint library and not writing your own, right?
If that's true, don't use .NET. If you think you can gain speed on partial specialisation, go to Haskell, C, Fortran or any other language that either does it automatically, or can expose inlining to you to do it by hand.
If Add(1) really matters to you, heap allocations will matter too.
However, you should really look at what the profiler can tell you...
C# has added them in 4.0: http://msdn.microsoft.com/en-us/library/dd264739(VS.100).aspx
As to why they weren't there from the beginning, it's most likely because they felt method overloads gave more flexibility. With overloading you can specify multiple 'defaults' based on the other parameters that you're taking. It's also not that much more syntax.
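For illustration, a sketch of 'defaults' via overloads (hypothetical Draw methods):

public static class Renderer
{
    // The overload supplies the 'default', and could even choose it based
    // on the other arguments -- something optional parameters cannot do.
    public static void Draw(string text)
    {
        Draw(text, 12);
    }

    public static void Draw(string text, int fontSize)
    {
        /* ... */
    }
}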
Even in languages like C++, inlining something doesn't guarantee that it'll happen; it's a hint to the compiler. The compiler can either take the hint, or do its own thing.
C# is another step removed from the generated assembly code (via IL + the JIT), so it becomes even harder to guarantee that something will inline. Furthermore, you have issues like the x86 + x64 implementations of the JIT differing in behaviour.
Java doesn't include an inline keyword either. The better Java JITs can inline even virtual methods, and the use of keywords like private or final makes no difference (it used to, but that is now ancient history).
For those of us who have programmed enough, I'm sure we have come across many different flavours of coding standards to use when it comes to programming.
e.g. http://msdn.microsoft.com/en-us/library/ms229042.aspx
You might derive your coding standards from the current company you work for or from the original author of the code you're working on. Coding styles are often specific to a particular programming language, and some styles that suit one language might not be considered appropriate for others. Of course, some coding standards can be applied across many different programming languages.
Thank you for your time.
EDIT: As we know there are many related articles on this subject, but C# Coding standard / Best practices in SO has some very useful links in there which is worth a visit. (Check out the 2 links on .NET/C# guidelines by ESV - Accepted Answer)
Google has a posted style guide for C++ here which I consult sometimes. Just reading through the explanations and reasoning, despite whether you end up agreeing with some of the styles or not, may teach you some things you might not have thought about.
My best advice regarding coding standards: don't let them get in the way when trying to get work done.
A big bureaucracy might actually hinder progress in projects instead of helping to achieve better team work. When people complain about not following coding standards instead of the actual quality of the code, then it is too much regulation.
Other than that, pick one from the many suggestions and try to stick with it for as long as possible to build a code base following a single standard that you are used to.
Coding standards are good, but coding standards written from scratch in which the company reinvents the wheel, or coding standards imposed by a single "prophet", can be worse than having no coding standards at all.
This means:
Coding standards should be discussed and agreed upon.
The coding standards document should include the reasons behind each rule.
Coding standards should be at least partially based on reliable sources.
The sources I know of for the languages in your tags are:
For C++: The book C++ Coding Standards by Sutter/Alexandrescu.
For C#: 4 or 5 PDFs I found googling for C# Coding Standards :)
Adam Cogan has a great set of rules on his web site. There are coding guidelines, but there is much more there also.
Adam Cogan's Rules to Better...
Coding standards are great. We've been using Lance Hunt's C# Coding Standards for .NET almost without modification.
If you are maintaining code, then continue to use the same standard the original code was developed in (there is nothing worse than trying to debug a problem when the code looks all higgledy-piggledy).
Some comments on the post suggesting looking at the Google C++ guidelines. Detailed discussions about some aspects of these guidelines were posted on comp.lang.c++.moderated.
Some weird or controversial points include:
We don't believe that the available alternatives to exceptions, such as error codes and assertions, introduce a significant burden.
As if assertions were a viable alternative... Assertions are usually for programming errors and situations that should never happen, while exceptions can happen (are somewhat anticipated) in the execution flow.
Reference Arguments: All parameters passed by reference must be labeled const. ... In fact it is a very strong convention that input arguments are values or const references while output arguments are pointers.
No comment needed about the weasel phrase "a very strong convention".
Doing Work in Constructors: Do only trivial initialization in a constructor. If at all possible, use an Init() method for non-trivial initialization. ... If your object requires non-trivial initialization, consider having an explicit Init() method and/or adding a member flag that indicates whether the object was successfully initialized.
Yes... two-phase init to make things simpler... What if I have const fields? This rule is probably an effect of their attitude towards exceptions.
Use streams only for logging
Which streams? IOStreams, standard C streams, other?
On one hand they advise using macros only in exceptional situations, while on the other they recommend using DISALLOW_COPY_AND_ASSIGN to prohibit copy/assign. They could have advised the approach with a special base class (like boost::noncopyable).
Do not overload operators except in rare, special circumstances.
What about assignment, or arithmetic operators for numeric calculations, etc?
Default parameters are more difficult to maintain because copy-and-paste from previous code may not reveal all the parameters. Copy-and-pasting of code segments can cause major problems when the default arguments are not appropriate for the new code.
The what? Copy/paste from previous code?
Remember that reading any of the guidelines can introduce a bias to your way of thinking. And sometimes it won't be beneficial for you or your code. I agree with some other posts advising reading good books by good authors beforehand. When you have sufficient amount of knowledge, then you are able to look at the guidelines and find good and weak points easily, without creating a mess in your brain ;)
If you plan to introduce a code-formatting standard to an existing programming team, get input from each member of the team so they'll have "buy in" and be more likely to write code to that standard.
Programming styles are as difficult to change as habits, and you'll have to accept that some people won't make their code 100% compliant 100% of the time. It would be worth your time to find (or write your own) pretty-printer program and periodically run all your code through it to enforce consistency. (I always felt uneasy when manually checking in source code changes that consisted only of formatting corrections for other people's code; I worried that others would label me a nitpicker.)
Sun Java Code Conventions
Python Style Guide
Zend Coding Standard for PHP
Having asked this question, I found that the accepted answer proved to be sufficient for my needs.
However, I realise that this is not a 'one-size-fits-all' scenario, so there is a large quantity of information within the thread that you may find more or less useful. Well worth a read!
For Java and other C-family languages I recommend Software Monkey's coding standards (of course, since they're mine).
In general, keep them simple, and provide examples and justification for every requirement.
What's in the standard doesn't really matter all that much. What matters is that you have one, and that your developers follow it.
It doesn't quite answer the question, but it's worth a mention...
I read Steve McConnell's Code Complete. Whilst it doesn't give you a pre-baked set of coding standards it does set out a lot of good arguments for the various approaches. It'll make you think about things you'd not thought of before.
It changed my little world for the better.
Coding standards themselves are great and all, but what I think is much, much, MUCH more important is keeping with the style of whatever code you're maintaining. I've seen people add a function to some class written one way and forcing their coding standard on just that function. It's inconsistent, it sticks out, and, in my opinion, it makes it harder to enjoy the class "as a whole".
Whenever you're maintaining code, look at the code around it. See what the style is. K&R braces? Capital Camel Case methods? Hungarian? Double-line comment blocks between every function? Whatever it is, you should do it too in that specific area.
Before I leave, one thing I'd like to note that's related: naming files. I'm mainly a C++ guy, so this may not apply elsewhere, but basically it goes Namespace_Class.h or .cpp. So, Foo::Bar would be in Foo_Bar.h. Common things (i.e. a precompiled header) for the Foo namespace would be in Foo_common.h (note the lowercase common). Of course, that's a taste thing, but everybody who has worked with this has come out in favor of it.
I think Code Craft - The Practice of Writing Excellent Code pretty much sums it all up.
The Ellemtel rules for C++ are very popular.
For C# I recommend Framework Design Guidelines: Conventions, Idioms, and Patterns for Reusable .NET Libraries (2nd Edition) (Microsoft .NET Development Series).
Mono Coding Guidelines
The answers here are pretty complete, so I am not pointing to yet another coding-standard document. However, once you have decided to stick to one style, you should use an automated coding-style enforcer throughout your team.
For Java there is Checkstyle, and for .NET there is Microsoft StyleCop.
Here is a similar discussion on Stackoverflow: C# Coding standard / Best practices
Camel and Pascal casing alone solve a lot of coding-standard problems.
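For example (a minimal sketch of the two casings):

public class OrderProcessor            // PascalCase for types and methods
{
    private int retryCount;            // camelCase for private fields

    public void ProcessOrder(int orderId)   // camelCase for parameters
    {
        int attemptsRemaining = retryCount; // camelCase for locals
        /* ... */
    }
}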