What does "late-bound access to the destination object" mean?

What does "late-bound access to the destination object" mean? - c#

The docs for Interlocked.Exchange<T> contain the following remark:
This method overload is preferable to the Exchange(Object, Object) method overload, because the latter requires late-bound access to the destination object.
I am quite bewildered by this note. To me "late binding" refers to runtime method dispatch and doesn't seem to have anything to do with the technical specifics of atomically swapping two memory locations. What is the note talking about? What does "late-bound access" mean in this context?

canton7's answer is correct, and thanks for the shout-out. I'd like to add a few additional points.
This sentence, as is too often the case in the .NET documentation, both chooses to enstructure bizarre word usements, and thoroughly misses the point. For me, the poor word choice that stood out was not "late bound", which merely misses the point. The really awful word choice is using "destination object" to mean variable. A variable is not an object, any more than your sock drawer is a pair of socks. A variable contains a reference to an object, just as a sock drawer contains socks, and those two things should not be confused.
As you note, the reason to prefer the T version has nothing to do with late binding. The reason to prefer the T version is C# does not allow variant conversions on ref arguments. If you have a variable shelly of type Turtle, you cannot pass ref shelly to a method that takes ref object, because that method could write a Tiger into a ref object.
What then are the logical consequences of using the Object-taking overload on shelly? There are only two possibilities:
We copy the value of shelly to a second variable of type Object, do the exchange, and then copy the new value back, and now our operation is no longer atomic, which was the whole point of calling the interlocked exchange.
We change shelly to be of type Object, and now we are in a non-statically-typed and therefore bug-prone world, where we cannot ever be sure that shelly still contains a reference to Turtle.
Since both of those alternatives are terrible, you should use the generic version because it allows the aliased variable to be of the correct type throughout the operation.

The equivalent remark for Interlocked.Exchange(object, object) is:
Beginning with .NET Framework 2.0, the Exchange<T>(T, T) method overload provides a type-safe alternative for reference types. We recommend that you call it instead of this overload.
Although I haven't heard it used in this way before, I think by "late-bound" it simply means "non type-safe", as you need to cast the object to your concrete type (at runtime) before using it.
As well as virtual method dispatch, "Late Binding" also commonly refers to reflection, as the exact method to be called similarly isn't known until runtime.
To quote Eric Lippert:
Basically by "early binding" we mean "the binding analysis is performed by the compiler and baked in to the generated program"; if the binding fails then the program does not run because the compiler did not get to the code generation phase. By "late binding" we mean "some aspect of the binding will be performed by the runtime" and therefore a binding failure will manifest as a runtime failure
(emphasis mine). Under this rather loose definition, casting object to a concrete type and then calling a method on it could be seen as "late bound", as there's an element of the binding which is performed at, and could fail at, runtime.

Related

Array.Initialize - Why does this method exist?

I stumbled upon a method today. I'm talking about: Array.Initialize().
According to the documentation:
This method is designed to help compilers support value-type arrays; most users do not need this method.
How does this method is responsible for making the compiler support value types? As far as I'm concerned this method just:
Initializes every element of the value-type Array by calling the default constructor of the value type.
Also, why is it public? I don't see myself with the need of calling this method, compilers already initialize arrays when created, so manually calling this method will be redundant and useless.
Even when my intention would be resetting the values of an array, I would still not call it, I would create a new one. array = new int[].
So, it seems that this method exist just for the sake of the compiler. Why is this? Can anyone give me some more details?

It's worth noting that the rules of .NET are different to the rules of C#.
There are things we can do in .NET that we can't do in C#, generally either because the code is not verifiable (ref return types for example) or because they could introduce some confusion.
In C# structs cannot have a defined parameterless constructor, and calling new SomeValueType() works by creating a zero-filled portion of memory (all fields therefore being 0 for numeric types, null for reference types, and the result of this same rule again for other value-types).
In .NET you can have a parameterless constructor on a value type.
It's probably a bad idea to do so. For one thing the rules about just when it is called and just when the memory of the value is zero-filled, and what happens upon assignment in different cases aren't entirely simple (e.g. new SomeValueType() will call it but new T() in a generic method where T is SomeValueType will not!). Life is simpler if the result of new SomeValueType() will always be zero-filling. That no doubt influenced the design of C# not allowing this even though .NET does.
For this reason, Array.Initialize() will never make sense on new arrays of any type that was written in C#, because calling the constructor and zero-filling is the same thing.
But by the same token, it's possible for a type to be written in another .NET language (at the very least, you can do it in CIL) that does have a parameterless constructor that actually has an effect. And for that reason its possible that a compiler for such a language would want its equivalent to new SomeValueType[3] to call that constructor on all the types in the array. And therefore it's sensible to have a method in the framework that allows such a fill to be done, so that a compiler for such a language can make use of it.
Also, why is it public?
So it can be called by code produced by such a hypothetical constructor even in a context where security restrictions prevent it from calling private methods of another assembly.

For me myself it looks like the Initialize() method runs through the array and recreates the Value Types within. So with a new array you get a new empty array and so you get with Array.Clear(), but with Array.Initialize() you get an Array full of fresh created Value Types (types and length based on the old array).
And that should be all of the difference.

Based on the CLR source, the method traverses each index of the array and initializes the value type on that index by calling the default constructor, similar to initobj IL instruction (I wonder what happens when the constructor throws an exception, though). The method is public because calling a private method directly from IL would make it a bit unverifiable.
Today's C# compilers do not initialize each element of the array when creating it, simply "set" each index to the default value of the type. C# 6 introduces implementing default constructors for value types (which were already supported by CLR), so this is needed for languages with different array creation semantics.

You can see the expected use in the test code:
https://github.com/dotnet/coreclr/blob/3015ff7afb4936a1c5c5856daa4e3482e6b390a9/tests/src/CoreMangLib/cti/system/array/arrayinitialize.cs
Basically, it sets an array of non-intrinsic value-types back to their default(T) state.
It does not seem like an amazingly useful tool, but I can see how it could be useful for zero'ing out arrays of non-intrinsic value data.

What does Resharper have against the commonly-used getRange() in Excel Interop?

Almost all of the example code online for C# Excel interop has stuff like this:
monthlyChartRange = _xlSheetChart.get_Range("A3", "C4");
Yet, Resharper turns up its nose at it and demands: "Use indexed property" If you accede to its wishes (and I love R#, so I always say, "As you wish"), it changes it to:
monthlyChartRange = _xlSheetChart.Range["A3", "B4"];
Why? How is Range better than get_Range? How is the former any more indexed than the latter?

Resharper is trained to use properties instead of the underlying Getter because it's easier code to read and more OO-like. The property just calls the getter under the covers.
The COM port of Excel automation just exposed the Getter, even though this is not really all that OO-ish.
BTW, I recommend using Aspose Cells instead of Office automation. It's like ten trillion times faster.

Yet, Resharper turns up its nose at it and demands: "Use indexed property"
A bit of an intro - COM goes way back. You must bare in mind that COM is a binary protocol and that some COM-aware languages have no concept of properties.
COM indexed properties are generally defined like so:
[propget, id(DISPID_VALUE), helpstring("blah")]
HRESULT Item([in] long Index, [out, retval] BSTR* Item);
The dispatch ID is DISPID_VALUE which informs clients that this is the default property typically used for indexed properties. Notice how it is defined as a property get but interestingly it takes a parameter index. The notion of a property getter that takes an argument can be confusing to some programming languages as we shall see.
Fully COM-compliant Client Languages
A good COM-aware client language (such as VB6; VBA; .NET) will allow you to interact with the object like so:
string thing = myObject.Item[2];
These languages sees that this particular property is a propget however it takes and in parameter. The fact that it is marked as DISPID_VALUE is rather riveting because that tells COM clients that it is the default property. Putting two and two together, clients will realise that it must be a dandy indexed property so that they can use groovy square brackets.
It's important to realise that this is just sugar syntax. It's just a more convenient way to call the default indexed property. Behind the scenes, it will call the same COM property getter method as per un-compliant languages as shown below.
Not-so-COM-compliant Client Languages
Some programming languages (like Visual Objects 2.6), freak out (then again I freak out when I use VO) when they see a COM property that takes a parameter and so it falls back to the underlying method declaration where it essentially treats it as a method call that takes an index parameter and returns a string:
string thing = myObject.get_Item(2) // Boo!
These clients probably also missed the important fact that the property was marked as DISPID_VALUE.
Now with c#, you are arguably free to use the method-style operations to query a property or do as COM was designed and use index property notation.
After all, c# is OO; understands properties; understands indexers so why wouldn't you use them?
Hence why Resharper is prompting you to use the COM-preferred indexed style. It's no different to how you would use a regular c# class.

Why structs cannot have destructors?

What is best answer on interview on such question you think?
I think I didn't find a copy of this here, if there is one please link it.

Another way of looking at this - rather than just quoting the spec which says that structs can't/don't have destructors - consider what would happen if the spec was changed so that they did - or rather, let's ask the question: can we guess why did the language designers decide to not allow structs to have 'destructors' in the first place?
(Don't get hung up on the word 'destructor' here; we're basically talking about a magic method on structs that gets called automatically when the variable goes out of scope. In other words, a language feature analogous to C++'s destructors.)
The first thing to realize is that we don't care about releasing memory. Whether the object is on the stack or on the heap (eg. a struct in a class), the memory will be taken care of one way or another sooner or later; either by being popped off the stack or by being collected. The real reason for having something that's destructor-like in the first place is for managing external resources - things like file handles, window handles, or other things that need special handling to get them cleaned up that the CLR itself doesn't know about.
Now supposed you allow a struct to have a destructor that can do this cleanup. Fine. Until you realize that when structs are passed as parameters, they get passed by value: they are copied. Now you've got two structs with the same internal fields, and they're both going to attempt to clean up the same object. One will happen first, and so code that is using the other one afterwards will start to fail mysteriously... and then its own cleanup will fail (hopefully! - worst case is it might succeed in cleaning up some other random resource - this can happen in situations where handle values are reused, for example.)
You could conceivably make a special case for structs that are parameters so that their 'destructors' don't run (but be careful - you now need to remember that when calling a function, it's always the outer one that 'owns' the actual resource - so now some structs are subtly different to others...) - but then you still have this problem with regular struct variables, where one can be assigned to another, making a copy.
You could perhaps work around this by adding a special mechanism to assignment operations that somehow allows the new struct to negotiate ownership of the underlying resource with its new copy - perhaps they share it or transfer ownership outright from the old to the new - but now you've essentially headed off into C++-land, where you need copy constructors, assignment operators, and have added a bunch of subtleties waiting to trap the unaware novice programmer. And keep in mind that the entire point of C# is to avoid that type of C++-style complexity as much as possible.
And, just to make things a bit more confusing, as one of the other answers pointed out, structs don't just exist as local objects. With locals, scope is nice and well defined; but structs can also be members of a class object. When should the 'destructor' get called in that case? Sure, you can do it when the container class is finalized; but now you have a mechanism that behaves very differently depending on where the struct lives: if the struct is a local, it gets triggered immediately at end of scope; if the struct is within a class, it gets triggered lazily... So if you really care about ensuring that some resource in one of your structs is cleaned up at a certain time, and if your struct could end up as a member of a class, you'd probably need something explicit like IDisposable/using() anyhow to ensure you've got your bases covered.
So while I can't claim to speak for the language designers, I can make a pretty good guess that one reason they decided not to include such a feature is because it would be a can of worms, and they wanted to keep C# reasonably simple.

From Jon Jagger:
"A struct cannot have a destructor. A destructor is just an override of object.Finalize in disguise, and structs, being value types, are not subject to garbage collection."

Every object other than arrays and strings is stored on the heap in the same way: a header which gives information about the "object-related" properties (its type, whether it's used by any active monitor locks, whether it has a non-suppressed Finalize method, etc.), and its data (meaning the contents of all the type's instance fields (public, private, and protected intermixed, with base-class fields appearing before derived-type fields). Because every heap object has a header, the system can take a reference to any object and know what it is, and what the garbage-collector is supposed to do with it. If the system has a list of all objects which have been created and have a Finalize method, it can examine every object in the list, see if its Finalize method is unsuppressed, and act on it appropriately.
Structs are stored without any header; a struct like Point with two integer fields is simply stored as two integers. While it is possible to have a ref to a struct (such a thing is created when a struct is passed as a ref parameter), the code that uses the ref has to know what type of struct the ref points to, since neither the ref nor the struct itself holds that information. Further, heap objects may only be created by the garbage-collector, which will guarantee that any object which is created will always exist until the next GC cycle. By contrast, user code can create and destroy structs by itself (often on the stack); if code creates a struct along with a ref to it, and passes that ref it to a called routine, there's no way that code can destroy the struct (or do anything at all, for that matter) until the called routine returns, so the struct is guaranteed to exist at least until the called routine exits. On the other hand, once the called routine exits, the ref it was given should be presumed invalid, since the caller would be free to destroy the struct at any time thereafter.

Becuase by definition destructors are used to destruct instances of classes, and structs are value types.
Ref: http://msdn.microsoft.com/en-us/library/66x5fx1b.aspx
By Microsoft's own words: "Destructors are used to destruct instances of classes." so it's a little silly to ask "Why can't you use a destructor on (something that is not a class)?" ^^

VB Compiler does not do implicit casts to Object?

I've recently had a strange issue with one of my APIs reported. Essentially for some reason when used with VB code the VB compiler does not do implicit casts to Object when trying to invoke the ToString() method.
The following is a minimal code example firstly in C# and secondly in VB:
Graph g = new Graph();
g.LoadFromEmbeddedResource("VDS.RDF.Configuration.configuration.ttl");
foreach (Triple t in g.Triples)
{
Console.WriteLine(t.Subject.ToString());
}
The above compiles and runs fine while the below does not:
Dim g As Graph = New Graph()
g.LoadFromEmbeddedResource("VDS.RDF.Configuration.configuration.ttl")
For Each t As Triple In g.Triples
Console.WriteLine(t.Subject.ToString())
Next
The second VB example gives the following compiler exception:
Overload resolution failed because no
accessible 'ToString' accepts this
number of arguments.
This appears to be due to the fact that the type of the property t.Subject that I am trying to write to the console has explicitly defined ToString() methods which take parameters. The VB compiler appears to expect one of these to be used and does not seem to implicitly cast to Object and use the standard Object.ToString() method whereas the C# compiler does.
Is there any way around this e.g. a VB compiler option or is it best just to ensure that the type of the property (which is an interface in this example) explicitly defines an unparameterized ToString() method to ensure compatability with VB?
Edit
Here are the additional details requested by Lucian
Graph is an implementation of an interface but that is actually irrelevant since it is the INode interface which is the type that t.Subject returns which is the issue.
INode defines two overloads for ToString() both of which take parameters
Yes it is a compile time error
No I do not use hide-by-name, the API is all written in C# so I couldn't generate that kind of API if I wanted to
Note that I've since added an explicit unparameterized ToString() overload to the interface which has fixed the issue for VB users.

RobV, I'm the VB spec lead, so I should be able to answer your question, but I'll need some clarification please...
What are the overloads defined on "Graph"? It'd help if you could make a self-contained repro. It's hard to explain overloading behavior without knowing the overload candidates :)
You said it failed with a "compiler exception". That doesn't really exist. Do you mean a "compile-time error"? Or a "run-time exception"?
Something to check is whether you're relying on any kind of "hide-by-name" vs "hide-by-sig" behavior. C# compiler only ever emits "hide-by-sig" APIs; VB compiler can emit either depending on whether you use the "Shadows" keyword.
C# overload algorithm is to walk up the inheritance hierarchy level by level until it finds a possible match; VB overload algorithm is to look at all levels of the inheritance hierarchy simultaneously to see which has the best match. This is all a bit theoretical, but with a small self-contained repro of your problem I could explain what it means in practice.
Hans, I don't think your explanation is the right one. Your code gives compile-time error "BC30455: Argument not specified for parameter 'mumble' of ToString". But RobV had experienced "Overload resolution failed because no accessible 'ToString' accepts this number of arguments".

Here's a repro of this behavior. It also shows you the workaround, cast with CObj():
Module Module1
Sub Main()
Dim itf As IFoo = New CFoo()
Console.WriteLine(itf.ToString()) '' Error BC30455
Console.WriteLine(CObj(itf).ToString()) '' Okay
End Sub
End Module
Interface IFoo
Function ToString(ByVal mumble As Integer) As String
End Interface
Class CFoo
Implements IFoo
Function ToString1(ByVal mumble As Integer) As String Implements IFoo.ToString
Return "foo"
End Function
End Class
I think this is annotated in the VB.NET Language Specification, chapter 11.8.1 "Overloaded method resolution":
The justification for this rule is
that if a program is loosely-typed
(that is, most or all variables are
declared as Object), overload
resolution can be difficult because
all conversions from Object are
narrowing. Rather than have the
overload resolution fail in many
situations (requiring strong typing of
the arguments to the method call),
resolution the appropriate overloaded
method to call is deferred until run
time. This allows the loosely-typed
call to succeed without additional
casts.
An unfortunate side-effect of this,
however, is that performing the
late-bound call requires casting the
call target to Object. In the case of
a structure value, this means that the
value must be boxed to a temporary. If
the method eventually called tries to
change a field of the structure, this
change will be lost once the method
returns.
Interfaces are excluded from this
special rule because late binding
always resolves against the members of
the runtime class or structure type,
which may have different names than
the members of the interfaces they
implement.
Not sure. I'd transliterate it as: VB.NET is a loosely typed language where many object references are commonly late bound. This makes method overload resolution perilous.

What are the deficiencies of the Java/C# type system?

Its often hear that Haskell(which I don't know) has a very interesting type system.. I'm very familiar with Java and a little with C#, and sometimes it happens that I'm fighting the type system so some design accommodates or works better in a certain way.
That led me to wonder...
What are the problems that occur somehow because of deficiencies of Java/C# type system?
How do you deal with them?

Arrays are broken.
Object[] foo = new String[1];
foo[0] = new Integer(4);
Gives you java.lang.ArrayStoreException
You deal with them with caution.
Nullability is another big issue. NullPointerExceptions jump at your face everywhere. You really can't do anything about them except switch language, or use conventions of avoiding them as much as possible (initialize fields properly, etc).
More generally, the Java's/C#'s type systems are not very expressive. The most important thing Haskell can give you is that with its types you can enforce that functions don't have side effects. Having a compile time proof that parts of programs are just expressions that are evaluated makes programs much more reliable, composable, and easier to reason about. (Ignore the fact, that implementations of Haskell give you ways to bypass that).
Compare that to Java, where calling a method can do almost anything!
Also Haskell has pattern matching, which gives you different way of creating programs; you have data on which functions operate, often recursively. In pattern matching you destruct data to see of what kind it is, and behave according to it. e.g. You have a list, which is either empty, or head and tail. If you want to calculate the length, you define a function that says: if list is empty, length = 0, otherwise length = 1 + length(tail).
If you really like to learn more, there's two excellent online sources:
Learn you a Haskell and Real World Haskell

I dislike the fact that there is a differentiation between primitive (native) types (int, boolean, double) and their corresponding class-wrappers (Integer, Boolean, Double) in Java.
This is often quite annoying especially when writing generic code. Native types can't be genericized, you must instantiate a wrapper instead. Generics should make your code more abstract and easier reusable, but in Java they bring restrictions with obviously no reasons.
private static <T> T First(T arg[]) {
return arg[0];
}
public static void main(String[] args) {
int x[] = {1, 2, 3};
Integer y[] = {3, 4, 5};
First(x); // Wrong
First(y); // Fine
}
In .NET there are no such problems even though there are separate value and reference types, because they strictly realized "everything is an object".

this question about generics shows the deficiencies of the java type system's expressiveness
Higher-kinded generics in Java

I don't like the fact that classes are not first-class objects, and you can't do fancy things such as having a static method be part of an interface.

A fundamental weakness in the Java/.net type system is that it has no declarative means of specifying how an object's state relates to the contents of its reference-type fields, nor of specifying what a method is allowed to persist reference-type parameters. Although in some sense it's nice for the runtime to be able to use a field Foo of one type ICollection<integer> to mean many different things, it's not possible for the type system to provide real support for things like immutability, equivalency testing, cloning, or any other such features without knowing whether Foo represents:
A read-only reference to a collection which nothing will ever mutate; the class may freely share such reference with outside code, without affecting its semantics. The reference encapsulates only immutable state, and likely does not encapsulate identity.
A writable reference to a collection whose type is mutable, but which nothing will ever actually mutate; the class may only share such references with code that can be trusted not to mutate it. As above, the reference encapsulates only immutable state, and likely does not encapsulate identity.
The only reference anywhere in the universe to a collection which it mutates. The reference would encapsulate mutable state, but would not encapsulate identity (replacing the collection with another holding the same items would not change the state of the enclosing object).
A reference to a collection which it mutates, and whose contents it considers to be its own, but to which outside code holds references which it expects to be attached to `Foo`'s current state. The reference would encapsulate both identity and mutable state.
A reference to a mutable collection owned by some other object, which it expects to be attached to that other object's state (e.g. if the object holding `Foo` is supposed to display the contents of some other collection). That reference would encapsulate identity, but would not encapsulate mutable state.
Suppose one wants to copy the state of the object that contains Foo to a new, detached, object. If Foo represents #1 or #2, one may store in the new object either a copy of the reference in Foo, or a reference to a new object holding the same data; copying the reference would be faster, but both operations would be correct. If Foo represents #3, a correct detached copy must hold a reference to a new detached object whose state is copied from the original. If Foo represents #5, a correct detached copy must hold a copy of the original reference--it must NOT hold reference to a new detached object. And if Foo represents #4, the state of the object containing it cannot be copied in isolation; it might be possible to copy a bunch of interconnected objects to yield a new bunch whose state is equivalent to the original, but it would not be possible to copy the state of objects individually.
While it won't be possible for a type system to specify declaratively all of the possible relationships that can exist among objects and what should be done about them, it should be possible for a type system and framework to correctly generate code to produce semantically-correct equivalence tests, cloning methods, smoothly inter-operable mutable, immutable, and "readable" types, etc. in most cases, if it knew which fields encapsulate identity, mutable state, both, or neither. Additionally, it should be possible for a framework to minimize defensive copying and wrapping in circumstances where it could ensure that the passed references would not be given to anything that would mutate them.

(Re: C# specifically.)
I would love tagged unions.
Ditto on first-class objects for classes, methods, properties, etc.
Although I've never used them, Python has type classes that basically are the types that represent classes and how they behave.
Non-nullable reference types so null-checks are not needed. It was originally considered for C# but was discarded. (There is a stack overflow question on this.)
Covariance so I can cast a List<string> to a List<object>.

This is minor, but for the current versions of Java and C# declaring objects breaks the DRY principle:
Object foo = new Object;
Int x = new Int;

None of them have meta-programming facilities like say that old darn C++ dog has.
Using "using" duplication and lack of typedef is one example that violates DRY and can even cause user-induced 'aliasing' errors and more. Java 'templates' isn't even worth mentioning..

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.