The page at http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html says that double-checked locking is flawed in Java. I'm just wondering: does this also apply to other languages (C#, VB, C++, etc.)?
I've read "Double checked locking pattern: Broken or not?", "Is this broken double checked locking?", and "How to solve the 'Double-Checked Locking is Broken' Declaration in Java?". To be truthful, I don't know what the common consensus is: some say yes, it's broken; others say no.
Anyway, my question is: does it also apply to other languages (C#, VB, C++, etc.)?
Double checked locking is safe in Java, PROVIDED THAT:
the instance variable is declared as volatile, AND
the JVM correctly implements the JSR-133 specification; i.e. it is compliant with Java 5 and later.
My source is the JSR-133 (Java Memory Model) FAQ - Jeremy Manson and Brian Goetz, February 2004. This is confirmed by Goetz in a number of other places.
However, as Goetz says, this is an idiom whose time has passed. Uncontended synchronization in Java is now fast, so he recommends that you just declare the getInstance() method as synchronized if you need to do lazy initialization. (And I imagine that this applies to other languages too ...)
Besides, all things being equal, it is a bad idea to write code that works in Java 5 but is unreliable in older JVMs.
OK, so what about the other languages? Well, it depends on how the idiom is implemented, and often on the platform.
C# - according to https://stackoverflow.com/a/1964832/139985, whether the instance variable needs to be volatile is platform dependent. However, Wikipedia says that if you do use volatile or explicit memory barriers, the idiom can be implemented safely (see the sketch after the note below).
VB - according to Wikipedia the idiom can be implemented safely using explicit memory barriers.
C++ - according to Wikipedia the idiom can be implemented safely using volatile in Visual C++ 2005. But other sources say that in general the C++ language specification doesn't provide sufficient guarantees for volatile to be sure. However double-checked locking can be implemented in the context of the C++ 2011 language revision - https://stackoverflow.com/a/6099828/139985.
(Note: I'm just summarizing some sources I found which seem to me to be recent ... and sound. I'm not a C++, C# or VB expert. Please read the linked pages and make your own judgements.)
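As an illustration of the C# point above, here is a minimal sketch of the volatile-based variant (the Singleton class and its members are hypothetical; treat this as a sketch of the idiom, not a recommendation):

using System;

public sealed class Singleton
{
    // volatile prevents the reference from being published before the constructor has finished
    private static volatile Singleton instance;
    private static readonly object syncRoot = new object();

    private Singleton() { }

    public static Singleton Instance
    {
        get
        {
            if (instance == null)                // first, unlocked check (the fast path)
            {
                lock (syncRoot)
                {
                    if (instance == null)        // second check, under the lock
                    {
                        instance = new Singleton();
                    }
                }
            }
            return instance;
        }
    }
}

In most C# code a static initializer or Lazy<T> (mentioned in another answer below) is simpler and sidesteps the subtlety entirely.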
This Wikipedia article covers Java, C++ and .NET (C#/VB): http://en.wikipedia.org/wiki/Double-checked_locking
This is a tricky question, with a mine-field of contradictory information out there.
A part of the problem is that there are a few variants of double-checked locking:
The field checked on the fast path may be volatile or not.
There is a one-field variant and a two-field variant of double-checked locking.
And not only that, different authors have a different definition for what it means that the pattern is "correct".
Definition #1: A widely accepted specification of the programming language (e.g. ECMA for C#) guarantees that the pattern is correct.
Definition #2: The pattern works in practice on a particular architecture (typically x86).
As disagreeable as it might seem, a lot of code out there depends on Definition #2.
Let's take C# as an example. In C#, the double-checked pattern (as typically implemented) is correct according to Definition #1 if and only if the field is volatile. But if we consider Definition #2, pretty much all variants are correct on x86 (i.e., happen to work), even if the field is non-volatile. On Itanium, the one-field variant happens to work if the field is non-volatile, but not the two-field variant.
The unfortunate consequence is that you'll find articles making clearly contradictory statements on the correctness of this pattern.
As others have said, this idiom has had its time. FWIW, for lazy initialization, .NET now provides a built-in class: System.Lazy<T> (MSDN). I don't know whether something similar is available in Java, though.
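A minimal sketch of using Lazy<T> for lazy initialization (assuming a hypothetical Settings class with a parameterless constructor):

using System;

public sealed class Settings
{
    // The default LazyThreadSafetyMode is ExecutionAndPublication, so this is thread-safe
    private static readonly Lazy<Settings> instance =
        new Lazy<Settings>(() => new Settings());

    public static Settings Instance
    {
        get { return instance.Value; }   // the factory runs only on first access
    }

    private Settings() { }
}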
It was flawed in Java, and it was fixed in Java 5. The fact that it was broken was more of an implementation issue, coupled with a misunderstanding, than a technically "bad idea".
I've always wondered how the dependencies are managed from a programming language to its libraries. Take for example C#. When I was beginning to learn about computing, I would assume (wrongly as it turns out) that the language itself is designed independently of the class libraries that would eventually become available for it. That is, the set of language keywords (such as for, class or throw) plus the syntax and semantics are defined first, and libraries that can be used from the language are developed separately. The specific classes in those libraries, I used to think, should not have any impact on the design of the language.
But that doesn't work, or not all the time. Consider throw. The C# compiler makes sure that the expression following throw resolves to an exception type. Exception is a class in a library, and as such it should not be special at all. It would be a class as any other, except that the C# compiler assigns it that special semantics. That is very good, but my conclusion is that the design of the language does depend on the existence and behaviour of specific elements in the class libraries.
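A small example of what I mean (Widget is just a made-up class):

using System;

class Widget { }

class Demo
{
    static void Fail()
    {
        throw new InvalidOperationException("boom");  // fine: the type derives from System.Exception
        // throw new Widget();  // compile-time error: the thrown expression must be of a type derived from System.Exception
        // throw 42;            // rejected for the same reason
    }
}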
Additionally, I wonder how this dependency is managed. If I were to design a new programming language, what techniques would I use to map the semantics of throw to the very particular class that is Exception?
So my questions are two:
Am I correct in thinking that language design is tightly coupled to that of its base class libraries?
How are these dependencies managed from within the compiler and run-time? What techniques are used?
Thank you.
EDIT. Thanks to those who pointed out that my second question is very vague. I agree. What I am trying to learn is what kind of references the compiler stores about the types it needs. For example, does it find the types by some kind of unique id? What happens when a new version of the compiler or the class libraries is released? I am aware that this is still pretty vague, and I don't expect a precise, single-paragraph answer; rather, pointers to literature or blog posts are most welcome.
What I am trying to learn is what kind of references the compiler stores about the types it needs. For example, does it find the types by some kind of unique id?
Obviously the C# compiler maintains an internal database of all the types available to it in both source code and metadata; this is why a compiler is called a "compiler" -- it compiles a collection of data about the sources and libraries.
When the C# compiler needs to, say, check whether an expression that is thrown is derived from or identical to System.Exception, it pretends to do a global namespace lookup on System, and then it does a lookup on Exception, finds the class, and then compares the resulting class information to the type that was deduced for the expression.
The compiler team uses this technique because that way it works no matter whether we are compiling your source code and System.Exception is in metadata, or if we are compiling mscorlib itself and System.Exception is in source.
Of course as a performance optimization the compiler actually has a list of "known types" and populates that list early so that it does not have to undergo the expense of doing the lookup every time. As you can imagine, the number of times you'd have to look up the built-in types is extremely large. Once the list is populated then the type information for System.Exception can be just read out of the list without having to do the lookup.
What happens when a new version of the compiler or the class libraries is released?
What happens is: a whole bunch of developers, testers, managers, designers, writers and educators get together and spend a few million man-hours making sure that the compiler and the class libraries all work before they're released.
This question is, again, impossibly vague. What has to happen to make a new compiler release? A lot of work, that's what has to happen.
I am aware that this is still pretty vague, and I don't expect a precise, single-paragraph answer; rather, pointers to literature or blog posts are most welcome.
I write a blog about, among other things, the design of the C# language and its compiler. It's at http://ericlippert.com.
I would assume (perhaps wrongly) that the language itself is designed independently of the class libraries that would eventually become available for it.
Your assumption is, in the case of C#, completely wrong. C# 1.0, the CLR 1.0 and the .NET Framework 1.0 were all designed together. As the language, runtime and framework evolved, the designers of each worked very closely together to ensure that the right resources were allocated so that each could ship new features on time.
I do not understand where your completely false assumption comes from; that sounds like a highly inefficient way to write a high-level language and a great way to miss your deadlines.
I can see writing a language like C, which is basically a more pleasant syntax for assembler, without a library. But how would you possibly write, say, async-await without having the guy designing Task<T> in the room with you? It seems like an exercise in frustration.
Am I correct in thinking that language design is tightly coupled to that of its base class libraries?
In the case of C#, yes, absolutely. There are dozens of types that the C# language assumes are available and as-documented in order to work correctly.
I once spent a very frustrating hour with a developer who was having some completely crazy problem with a foreach loop before I discovered that he had written his own IEnumerable<T> that had slightly different methods than the real IEnumerable<T>. The solution to his problem: don't do that.
How are these dependencies managed from within the compiler and run-time?
I don't know how to even begin to answer this impossibly vague question.
All (practical) programming languages have a minimum number of required functions. For modern "OO" languages, this also includes a minimum number of required types.
If the type is required in the Language Specification, then it is required - regardless of how it is packaged.
Conversely, not all of the BCL is required to have a valid C# implementation. This is because not all of the BCL types are required by the Language Specification. For instance, System.Exception (see #16.2) and NullReferenceException are required, but FileNotFoundException is not required to implement the C# Language.
Note that even though the specification provides minimal definitions for base types (e.g. System.String), it does not define the commonly-accepted methods (e.g. String.Replace). That is, almost all of the BCL is outside the scope of the Language Specification [1].
.. but my conclusion is that the design of the language does depend on the existence and behaviour of specific elements in the class libraries.
I agree entirely and have included examples (and limits of such definitions) above.
.. If I were to design a new programming language, what techniques would I use to map the semantics of "throw" to the very particular class that is "Exception"?
I would not look primarily at the C# specification, but rather at the Common Language Infrastructure specification. This new language should, for practical reasons, be designed to interoperate with existing CLI/CLR languages, but does not necessarily need to "be C#".
[1] The CLI (and associated references) do define the requirements of a minimal BCL. So if it is taken that a valid C# implementation must conform to (or may assume) the CLI, then there are many other types to consider that are not mentioned in the C# specification itself.
Unfortunately, I do not have sufficient knowledge of the 2nd (and more interesting) question.
My impression is that in languages like C# and Ada, application source code is portable, but standard library source code is not portable across compilers/implementations.
From other questions I can see that locking on types is a bad idea. But it is possible to do, so I was wondering: if it is such a bad thing to do, why is it allowed? I am assuming there must be good use cases for it, so could someone let me know what they are, please?
It's nearly always a bad idea:
Anyone can lock on the types from anywhere in the code, so you have no way to be sure that you won't get a deadlock without looking through all the code.
Locking on a type can even cause deadlocks across AppDomains. See Joe Duffy's article: Don't lock on marshal-by-bleed objects.
It's allowed because there are almost no restrictions on what you can use as your lock object. In other words, it wasn't specifically allowed - it's just that there isn't any code in the .NET framework that disallows it.
The book "Debugging Microsoft .NET Applications" has source code for an FxCop rule DoNotLockOnTypes that warns you if you try to do this. (thanks to Christian.K)
To understand why it is a bad idea in general have a look at the article Don't lock type objects.
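To make the alternative concrete, here is a minimal sketch (the class and member names are made up): lock on a private object that only your own code can reach, rather than on the type.

public class SharedState
{
    // Bad: lock (typeof(SharedState)) - any code in the process can take the same lock and deadlock you.

    // Better: a lock object that is private to this class.
    private static readonly object sync = new object();
    private static int counter;

    public static void Increment()
    {
        lock (sync)
        {
            counter++;
        }
    }
}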
It is allowed because the language/framework designers decided to let you take a lock on anything that derives from System.Object. Nobody can prevent it, because System.Type derives from System.Object (as does every other .NET type).
Take this signature:
void Foo(object o)
How could a compiler enforce that o is not a System.Type? You could of course check it at runtime, but this would have a performance impact.
And of course there might be super-exotic situations where one might need to lock on a type. Maybe the CLR does it internally.
Many bad ideas find their way into programming languages because no language designer can foretell the future. Any language created by humans will have warts.
Some examples:
Hejlsberg wished he had added non-nullable class references to C# (original article: "The A-Z of Programming Languages: C#", Computerworld). (I wish he had bitten off the const problem as well.)
The C++ committee screwed up with valarray, and export, among numerous other minor and major regrets.
Java's generics were a botch-job (OMG, type erasure!) designed to avoid changing the VM, and by the time they realised the VM had to change anyway, it was too late to do the necessary rework.
Python's scoping rules are a constant irritant, and numerous attempts to improve them haven't really helped much (a little, but not much).
In Eric Lippert's blog entry on umpires and the C# compiler and spec, he makes this statement:
(or deliberately; we implement a small number of extensions to the formal C# language)
And that got me wondering, what extensions is he referring to, exactly?
In his comments he gives some answers, (and he has given some in past blog entries)
Handling of the constant 0, typed references (http://www.eggheadcafe.com/articles/20030114.asp), type analysis of conditional expressions...
But what he's trying to say is that it shouldn't really matter to the end user, because they're weird corner cases (like the "(m=> (m=> (m=> ..." case) that are needed to help the compiler, and aren't part of the spec.
Follow the spec and you should be fine.
(below added 1:42 pm)
So, I said "Follow the spec and you should be fine." That's really all the advice I can help you with. Yes, there are some places where the compiler deviates from the spec. But these aren't really documented, partially because they don't know what to do: do they fix the compiler to adhere to the spec, or do they change the spec to match the weird behavior? That's basically the whole point of the handling-of-zero article:
http://blogs.msdn.com/ericlippert/archive/2006/03/29/the-root-of-all-evil-part-two.aspx
That's kinda the point of these extensions. They're (for the most part) undocumented because (for the most part) they're unknown. Who knows, maybe there's some really weird handling of Flags enums that doesn't quite conform to the spec, but we wouldn't really know about it until we do some really weird thing with them. The flags enumerations have been tested and should mostly follow the spec, so when I say "follow the spec and you should be fine" I mean precisely that. You might not be fine, because there are gotchas. But Eric is doing what he can to fix these problems, and make them known in the interim.
Not counting bugs, the most common example is varargs; but much of COM interop relies on extensions, for example interface constructors.
17.5 in the spec basically says "anything in System.Runtime.InteropServices can do what it wants". MSDN documents this here.
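For the curious, here is a minimal sketch of the varargs extension (__arglist), which the C# specification does not document; the method names are made up:

using System;

static class VarargsDemo
{
    // __arglist is an undocumented C# extension exposing the CLR's varargs calling convention
    static void PrintAll(__arglist)
    {
        ArgIterator it = new ArgIterator(__arglist);
        while (it.GetRemainingCount() > 0)
        {
            TypedReference tr = it.GetNextArg();
            Console.WriteLine(TypedReference.ToObject(tr));
        }
    }

    static void Main()
    {
        PrintAll(__arglist(1, "two", 3.0));
    }
}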
Is there any design reason for that (like the reason they gave up multiple inheritance), or was it just not important enough?
And the same question applies to optional parameters in methods... these were already in the first version of VB.NET, so it surely wasn't laziness that caused MS not to allow optional parameters; it was probably an architectural decision. And it seems they had a change of heart about that, because C# 4 is going to include them.
What was the decision and why did they give it up?
Edit:
Maybe readers didn't fully understand me. I've been working lately on a calculation program (supporting numbers of any size, to the last digit), in which some methods are called millions of times per second.
Say I have a method called Add(int num), and this method is called quite a lot with 1 as the parameter (Add(1);). I've found it is faster to implement a special method just for one. And I don't mean overloading - I mean writing a new method called AddOne and literally copying the Add method into it, except that instead of using num I write 1. This might seem horribly weird to you, but it's actually faster.
(as ugly as it is)
That made me wonder why C# doesn't support manual inlining, which could be amazingly helpful here.
Edit 2:
I asked myself whether or not to add this. I'm very familiar with the weirdness (and disadvantages) of choosing a platform such as .NET for such a project, but I think .NET's optimizations are more important than you might think... especially features such as Any CPU, etc.
To answer part of your question, see Eric Gunnerson's blog post: Why doesn't C# have an 'inline' keyword?
A quote from his post:
For C#, inlining happens at the JIT level, and the JIT generally makes a decent decision.
EDIT: I'm not sure of the reason for the delayed optional-parameter support; however, saying they "gave up" on it sounds as though they were expected to implement it based on our expectations of what other languages offered. I imagine it wasn't high on their priority list, and they had deadlines to get certain features out the door for each version. It probably didn't rise in importance till now, especially since method overloading was an available alternative. Meanwhile we got generics (2.0), and the features that make LINQ possible, etc. (3.0). I'm happy with the progression of the language; the aforementioned features are more important to me than getting support for optional parameters early on.
Manual inlining would be almost useless. The JIT compiler inlines methods during native code compilation where appropriate, and I think in almost all cases the JIT compiler is better at guessing when it is appropriate than the programmer.
As for optional parameters, I don't know why they weren't there in previous versions. That said, I don't like having them in C# 4, because I consider them somewhat harmful: the default values get baked into the consuming assembly, and you have to recompile it if you change the defaults in a DLL and want the consuming assembly to use the new ones.
EDIT:
Some additional information about inlining. Although you cannot force the JIT compiler to inline a method call, you can force it to NOT inline a method call. For this, you use the System.Runtime.CompilerServices.MethodImplAttribute, like so:
using System.Runtime.CompilerServices;

internal static class MyClass
{
    [System.Runtime.CompilerServices.MethodImplAttribute(MethodImplOptions.NoInlining)]
    private static void MyMethod()
    {
        // Powerful, magical code
    }

    // Other code
}
My educated guess: the reason earlier versions of C# didn't have optional parameters is bad experiences with them in C++. On the surface, they look straightforward enough, but there are a few bothersome corner cases. I think one of Herb Sutter's books describes this in more detail; in general, it has to do with overriding virtual methods. Maximilian has mentioned one of the .NET corner cases in his answer.
You can also pretty much get by without them by manually writing multiple overloads; that may not be very nice for the author of the class, but clients will hardly notice the difference between overloads and optional parameters.
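A minimal sketch of the two approaches side by side (the Log methods are hypothetical); note that the optional-parameter default is compiled into each call site, which is the versioning concern mentioned in another answer:

using System;

static class LoggerWithOptional
{
    // C# 4 optional parameter: the default value (1) is baked into each call site at compile time
    public static void Log(string message, int level = 1)
    {
        Console.WriteLine(level + ": " + message);
    }
}

static class LoggerWithOverloads
{
    // Pre-C# 4 style: the overload keeps the default inside the defining assembly
    public static void Log(string message)
    {
        Log(message, 1);
    }

    public static void Log(string message, int level)
    {
        Console.WriteLine(level + ": " + message);
    }
}

// Callers look the same either way:
// LoggerWithOptional.Log("hello");
// LoggerWithOverloads.Log("hello");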
So after all these years w/o them, why did C# 4.0 add them? 1) improved parity with VB.NET, and 2) easier interop with COM.
I'm working lately on a calculation program (support numbers of any size, to the last digit), in which some methods are used literally millions of times per second.
Then you chose the wrong language. I assume you actually profiled your code (right?) and know that there is nothing apart from micro-optimisations that can help you. Also, you're using a high-performance native bigint library and not writing your own, right?
If that's true, don't use .NET. If you think you can gain speed from partial specialisation, go to Haskell, C, Fortran or any other language that either does it automatically or exposes inlining so you can do it by hand.
If Add(1) really matters to you, heap allocations will matter too.
However, you should really look at what the profiler can tell you...
C# has added them in 4.0: http://msdn.microsoft.com/en-us/library/dd264739(VS.100).aspx
As to why they weren't done from the beginning, it's most likely because they felt method overloads gave more flexibility. With overloading you can specify multiple 'defaults' based on the other parameters that you're taking. It's also not that much more syntax.
Even in languages like C++, inlining something doesn't guarantee that it'll happen; it's a hint to the compiler. The compiler can either take the hint, or do its own thing.
C# is another step removed from the generated assembly code (via IL + the JIT), so it becomes even harder to guarantee that something will inline. Furthermore, you have issues like the x86 + x64 implementations of the JIT differing in behaviour.
Java doesn't include an inline keyword either. The better Java JITs can inline even virtual methods, and the use of keywords like private or final makes no difference (it used to, but that is now ancient history).
When we talk about the .NET world, the CLR is what everything we do depends on.
What is the minimum knowledge of the CLR a .NET programmer must have to be a good programmer?
Can you give me one/many you think is/are the most important subjects:
GC?, AppDomain?, Threads?, Processes?, Assemblies/Fusion?
I would very much appreciate it if you could post links to articles, blogs, books or other resources on the topic where more information can be found.
Update: I noticed from some of the comments that my question was not clear to some. When I say CLR I don't mean the .NET Framework. It is NOT about memorizing the .NET libraries; it is rather about understanding how the execution environment (in which those libraries live at runtime) works.
My question was directly inspired by John Robbins, the author of the "Debugging Applications for Microsoft® .NET" book (which I recommend) and a colleague of the Jeffrey Richter cited here, at Wintellect. In one of the introductory chapters he says that "...any .NET programmer should know what probing is and how assemblies are loaded into the runtime". Do you think there are other such things?
Last update: After having read the first 5 chapters of "CLR via C#", I must say to anyone reading this: if you haven't already, read this book!
Most of those are way deeper than the kind of thing many developers fall down on. The most misunderstood (and important) aspects, in my experience:
Value types vs reference types
Variables vs objects
Pass by ref vs pass by value
Delegates and events
Distinguishing between language, runtime and framework
Boxing
Garbage collection
On the "variables vs objects" front, here are three statements about the code
string x = "hello";
(Very bad) x is a string with 5 letters
(Slightly better) x is a reference to a string with 5 letters
(Correct) The value of x is a reference to a string with 5 letters
Obviously the first two are okay in "casual" conversation, but only if everyone involved understands the real situation.
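To make the first two items on that list concrete, here is a small sketch (the type names are made up):

struct PointValue { public int X; }   // value type: assignment copies the data
class PointRef { public int X; }      // reference type: assignment copies the reference

class Demo
{
    static void Main()
    {
        PointValue a = new PointValue { X = 1 };
        PointValue b = a;        // b gets its own copy of the data
        b.X = 2;                 // a.X is still 1

        PointRef c = new PointRef { X = 1 };
        PointRef d = c;          // d refers to the same object as c
        d.X = 2;                 // c.X is now 2

        object boxed = a;        // boxing: the value is copied into an object on the heap
    }
}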
A great programmer cannot be measured by the quantity of things he knows about the CLR. Sure, it's a nice beginning, but he must also know OOP/D/A and a lot of other things, like design patterns, best practices, O/RM concepts, etc.
The fact is, I'd say a "great .NET programmer" doesn't necessarily need to know much about the CLR at all, as long as he has great knowledge of general programming theory and concepts...
I would rather hire a "great Java developer" with great general knowledge and experience in Java for a .NET job than a "master" of .NET who has little experience and thinks O/RM is a stock ticker and stored procedures are a great way to "abstract away the database"...
I've seen professional teachers of .NET completely fail at doing really simple things without breaking their backs, due to a lack of "general knowledge", while at the same time they "know everything" there is to know about .NET and the CLR...
Updated: reading the relevant parts of the book CLR via C# by Jeffrey Richter... this book can be a good reference.
You should know about memory management and delegates.
Jon's answer seems pretty complete to me (plus delegates), but I think what fundamentally separates a good programmer from an average one is answering the why questions rather than the how. It's great to know how garbage collection works and how value types and reference types work, but it's a whole other level to understand when to use a value type vs. a reference type. It's the difference between speaking in a language vs. giving a speech in a language (it's all about how we apply the knowledge we have and how we arrive at those decisions).
Jon's answer is good. Those are all fairly basic but important areas that a lot of developers do not have a good understanding of. I think knowing the difference between value and reference types ties in to a basic understanding of how the GC in .NET behaves, but a good understanding of the Dispose pattern is even more important.
The rest of the areas you mention are either very deep knowledge about the CLR itself or more advanced concepts that aren't widely used (yet). [.NET 4.0 will start to change some of that with the introduction of the parallel extensions and MEF.]
One thing that can be really tricky to grasp is deferred execution and the like.
How do you explain how a method that returns an IEnumerable works? What does a delegate really do? Things like that.
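A small sketch of the kind of thing I mean: an iterator method whose body does not run until the sequence is actually enumerated (the names are made up):

using System;
using System.Collections.Generic;

static class DeferredDemo
{
    static IEnumerable<int> Numbers()
    {
        Console.WriteLine("starting");   // runs only when enumeration begins
        yield return 1;
        yield return 2;
    }

    static void Main()
    {
        IEnumerable<int> seq = Numbers();   // nothing printed yet: execution is deferred
        foreach (int n in seq)              // "starting" prints here, then 1 and 2
        {
            Console.WriteLine(n);
        }
    }
}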