The docs for Interlocked.Exchange<T> contain the following remark:
This method overload is preferable to the Exchange(Object, Object) method overload, because the latter requires late-bound access to the destination object.
I am quite bewildered by this note. To me "late binding" refers to runtime method dispatch and doesn't seem to have anything to do with the technical specifics of atomically swapping two memory locations. What is the note talking about? What does "late-bound access" mean in this context?
canton7's answer is correct, and thanks for the shout-out. I'd like to add a few additional points.
This sentence, as is too often the case in the .NET documentation, both chooses to enstructure bizarre word usements, and thoroughly misses the point. For me, the poor word choice that stood out was not "late bound", which merely misses the point. The really awful word choice is using "destination object" to mean variable. A variable is not an object, any more than your sock drawer is a pair of socks. A variable contains a reference to an object, just as a sock drawer contains socks, and those two things should not be confused.
As you note, the reason to prefer the T version has nothing to do with late binding. The reason to prefer the T version is that C# does not allow variant conversions on ref arguments. If you have a variable shelly of type Turtle, you cannot pass ref shelly to a method that takes ref object, because that method could write a Tiger into a ref object.
What then are the logical consequences of using the Object-taking overload on shelly? There are only two possibilities:
We copy the value of shelly to a second variable of type Object, do the exchange, and then copy the new value back, and now our operation is no longer atomic, which was the whole point of calling the interlocked exchange.
We change shelly to be of type Object, and now we are in a non-statically-typed and therefore bug-prone world, where we cannot ever be sure that shelly still contains a reference to Turtle.
Since both of those alternatives are terrible, you should use the generic version because it allows the aliased variable to be of the correct type throughout the operation.
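To make the point about ref arguments concrete, here is a minimal sketch. The Animal, Turtle and Tiger classes and the ExchangeDemo holder are made up for illustration; only Interlocked.Exchange itself comes from the framework.

```csharp
using System;
using System.Threading;

class Animal { }
class Turtle : Animal { }
class Tiger : Animal { }

static class ExchangeDemo
{
    // A strongly typed variable; Exchange<T> aliases it directly via ref.
    static Turtle shelly = new Turtle();

    public static Turtle SwapTurtle(Turtle replacement)
    {
        // T is inferred as Turtle, so the swap is a single atomic operation
        // and shelly can only ever hold a Turtle (or null).
        return Interlocked.Exchange(ref shelly, replacement);
    }

    // A method declared as 'static void Overwrite(ref object target)' could
    // not be called as Overwrite(ref shelly): the compiler rejects passing a
    // 'ref Turtle' where a 'ref object' is expected, because the callee
    // could store a Tiger through the alias.

    static void Main()
    {
        Turtle original = shelly;
        Turtle next = new Turtle();
        Turtle previous = SwapTurtle(next);
        Console.WriteLine(ReferenceEquals(previous, original)); // True
        Console.WriteLine(ReferenceEquals(shelly, next));       // True
    }
}
```

The variable keeps its static type for the whole operation, which is exactly what the two Object-based alternatives give up.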
The equivalent remark for Interlocked.Exchange(object, object) is:
Beginning with .NET Framework 2.0, the Exchange<T>(T, T) method overload provides a type-safe alternative for reference types. We recommend that you call it instead of this overload.
Although I haven't heard it used in this way before, I think by "late-bound" it simply means "non type-safe", as you need to cast the object to your concrete type (at runtime) before using it.
As well as virtual method dispatch, "Late Binding" also commonly refers to reflection, as the exact method to be called similarly isn't known until runtime.
To quote Eric Lippert:
Basically by "early binding" we mean "the binding analysis is performed by the compiler and baked in to the generated program"; if the binding fails then the program does not run because the compiler did not get to the code generation phase. By "late binding" we mean "some aspect of the binding will be performed by the runtime" and therefore a binding failure will manifest as a runtime failure
(emphasis mine). Under this rather loose definition, casting object to a concrete type and then calling a method on it could be seen as "late bound", as there's an element of the binding which is performed at, and could fail at, runtime.
At http://msdn.microsoft.com/en-us/library/microsoft.office.tools.excel.worksheet.get_range.aspx the documentation says to use the Range property instead of get_Range(Object Cell1, Object Cell2).
Both do the same thing: they get a Microsoft.Office.Interop.Excel.Range object that represents a cell or a range of cells. So what is the difference, other than one being a method and the other a property? Why does the documentation point to Range[], and what is the reason for it?
Range() is faster than Range[]
In practice we have noticed this to be the case, but a claim like that needs a reason.
The [] shortcut is convenient when you want to refer to an absolute range. However, it is not as flexible as the Range property, because it cannot handle variable input such as strings or object references. So at the end of the day you will still end up referring to ranges the long way, even though the shortcut is more readable. You might as well get it right the first time and not spend extra resources.
Now why is it slow? Because of compilation.
"During run-time Excel always uses conventional notation (or so I've been told), so when the code is being compiled all references in shortcut notation must be converted to conventional range form (or so I've been told). {ie [A150] must be converted to Range("A150") form}. Whatever the truth of what I've been told, Visual Basic has to memorize both its compiled version of the code and whatever notation you used to write your code (i.e. whatever's in the code module), the workbook properties for the file size (the memory used) thus goes up slightly. "
As you can see, my answer is more in line with VBA. However, after some research it appears that the VBA side doesn't slow things down much, so you only need to take care of the C# side. @Hans gives you a better answer from the C# perspective. I hope that by combining both you will get well-performing code :)
Here are some findings on the performance of Range[] vs Range() in Excel
If you use C# version 4 and up then you can use the Range indexer. But you have to use get_Range() on earlier versions.
Do note that there's something special about it: the default property of a COM interface maps to the indexer. But the Range property is not the default property of a Worksheet; it is just a regular property. The trouble is that C# does not permit declaring indexed properties other than the indexer. That works in VB.NET but not in C#, where you had to call the property getter method directly. By popular demand, the C# team dropped this restriction in version 4 (VS2010), but only for COM interfaces; you still cannot declare indexed properties in your own code.
I have used both and both returned the same results. I think Range[] actually uses get_Range() internally.
As a matter of naming convention, I only use Range[] now.
If I have various subclasses of something, and an algorithm which operates on instances of those subclasses, and if the behaviour of the algorithm varies slightly depending on what particular subclass an instance is, then the most usual object-oriented way to do this is using virtual methods.
For example if the subclasses are DOM nodes, and if the algorithm is to insert a child node, that algorithm differs depending on whether the parent node is a DOM element (which can have children) or DOM text (which can't): and so the insertChildren method may be virtual (or abstract) in the DomNode base class, and implemented differently in each of the DomElement and DomText subclasses.
Another possibility is to give the instances a common property, whose value can be read: for example the algorithm might read the nodeType property of the DomNode base class; or for another example, you might have different types (subclasses) of network packet, which share a common packet header, and you can read the packet header to see what type of packet it is.
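The two approaches described so far might be sketched like this. The class names follow the DOM example above, but the members (TryInsertChild, the NodeKind enum, Describe) are invented for illustration and are not taken from any real DOM API.

```csharp
using System;
using System.Collections.Generic;

enum NodeKind { Element, Text }   // hypothetical tag values

abstract class DomNode
{
    // Approach 2: a common property that an algorithm can read.
    public abstract NodeKind NodeType { get; }

    // Approach 1: a virtual method; each subclass supplies its own behaviour.
    public abstract bool TryInsertChild(DomNode child);
}

sealed class DomElement : DomNode
{
    private readonly List<DomNode> children = new List<DomNode>();
    public override NodeKind NodeType => NodeKind.Element;
    public int ChildCount => children.Count;

    public override bool TryInsertChild(DomNode child)
    {
        children.Add(child);
        return true;
    }
}

sealed class DomText : DomNode
{
    public override NodeKind NodeType => NodeKind.Text;

    // Text nodes cannot have children, so the insert is refused.
    public override bool TryInsertChild(DomNode child) => false;
}

static class DomDemo
{
    // An algorithm outside the hierarchy that branches on the property.
    public static string Describe(DomNode node)
    {
        switch (node.NodeType)
        {
            case NodeKind.Element: return "element";
            case NodeKind.Text: return "text";
            default: throw new ArgumentOutOfRangeException(nameof(node));
        }
    }

    static void Main()
    {
        DomNode parent = new DomElement();
        Console.WriteLine(parent.TryInsertChild(new DomText()));        // True
        Console.WriteLine(new DomText().TryInsertChild(new DomText())); // False
        Console.WriteLine(Describe(parent));                            // element
    }
}
```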
I haven't used run-time-type information much, including:
The is and as keywords in C#
Downcasting
The Object.GetType method in dot net
The typeid operator in C++
When I'm adding a new algorithm which depends on the type of subclass, I tend instead to add a new virtual method to the class hierarchy.
My question is, when is it appropriate to use run-time-type information, instead of virtual functions?
When there's no other way around it. Virtual methods are always preferred, but sometimes they just can't be used. There are a couple of reasons why this could happen, but the most common one is that you don't have the source code of the classes you want to work with, or you can't change them. This often happens when you work with a legacy system or with a closed-source commercial library.
In .NET it might also happen that you have to load new assemblies on the fly, like plugins, and you generally have no base classes but have to use something like duck typing.
In C++, among some other obscure cases (which mostly deal with inferior design choices), RTTI is a way to implement so-called multi methods.
These constructs ("is" and "as") are very familiar to Delphi developers, since event handlers usually receive objects typed as a common ancestor and have to downcast them. For example, the OnClick event passes a single argument, Sender: TObject, regardless of the type of the object, whether it is a TButton, a TListBox or anything else. If you want to know something more about this object you have to access it through "as", and to avoid an exception you can check it with "is" first. This downcasting allows design-time binding of objects and methods that would not be possible with strict class type checking. Imagine you want to do the same thing whether the user clicks a Button or a ListBox: if the two controls gave us different function prototypes, it would not be possible to bind them to the same procedure.
More generally, an object can call a function to notify others that it has, for example, changed. In doing so it gives the recipient the possibility, but not the obligation, of getting to know it "personally" (through as and is). It does this by passing Self typed as the most common ancestor of all objects (TObject in Delphi's case).
dynamic_cast<>, if I remember correctly, depends on RTTI. Some obscure external interfaces might also rely on RTTI when an object is passed through a void pointer (for whatever reason that might happen).
That being said, I haven't seen typeid() in the wild in 10 years of pro C++ maintenance work. (Luckily.)
You can refer to More Effective C# for a case where run-time type checking is OK.
Item 3. Specialize Generic Algorithms Using Runtime Type Checking
You can easily reuse generics by simply specifying new type parameters. A new instantiation with new type parameters means a new type having similar functionality.
All this is great, because you write less code. However, sometimes being more generic means not taking advantage of a more specific, but clearly superior, algorithm. The C# language rules take this into account. All it takes is for you to recognize that your algorithm can be more efficient when the type parameters have greater capabilities, and then to write that specific code. Furthermore, creating a second generic type that specifies different constraints doesn't always work. Generic instantiations are based on the compile-time type of an object, and not the runtime type. If you fail to take that into account, you can miss possible efficiencies.
For example, suppose you write a class that provides a reverse-order enumeration on a sequence of items represented through IEnumerable<T>. In order to enumerate it backwards you may iterate it, copy the items into an intermediate collection with indexer access such as List<T>, and then enumerate that collection backwards using the indexer. But if your original IEnumerable<T> is an IList<T>, why not take advantage of that and provide a more performant way (without copying to an intermediate collection) to iterate the items backwards? So basically it is a special case we can take advantage of while still providing the same behavior (iterating the sequence backwards).
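A minimal sketch of that idea, written here as an extension method rather than a class. This is not the book's code, and the names InReverse and ReverseIterator are made up for this example.

```csharp
using System;
using System.Collections.Generic;

static class ReverseExtensions
{
    public static IEnumerable<T> InReverse<T>(this IEnumerable<T> source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        // Runtime type check: if the sequence already supports indexing,
        // use it as is; otherwise copy it once into a List<T>.
        IList<T> list = source as IList<T> ?? new List<T>(source);
        return ReverseIterator(list);
    }

    private static IEnumerable<T> ReverseIterator<T>(IList<T> list)
    {
        // Walk the indexer from the last element to the first.
        for (int i = list.Count - 1; i >= 0; i--)
            yield return list[i];
    }

    // A sequence that is NOT an IList<int>, to exercise the copying path.
    public static IEnumerable<int> CountTo(int n)
    {
        for (int i = 1; i <= n; i++)
            yield return i;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", new[] { 1, 2, 3 }.InReverse())); // 3,2,1 (array is an IList<int>)
        Console.WriteLine(string.Join(",", CountTo(3).InReverse()));        // 3,2,1 (copied first)
    }
}
```

Callers see the same behavior on both paths; the type check only changes how much work is done to get there.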
But in general you should consider run-time type checking carefully and ensure that it doesn't violate the Liskov Substitution Principle.
Isn't it much more elegant and neat to have an IStringable interface?
Who needs this Type.FullName object returned to us?
EDIT: everyone keeps asking why do I think it's more elegant..
Well, it's as if, instead of there being an IComparable interface, object had a CompareTo method that by default throws an exception or returns 0.
There are objects that cannot and should not be described as a string. object could have equally returned string.Empty. Type.FullName is just an arbitrary choice..
And for methods such as Console.Write(object), I think it should be: Write(IStringable).
However, if you are using WriteLine for anything but strings (or something whose ToString is obvious, such as numbers), it seems to me it's for debugging only.
By the way - how should I comment to you all? Is it okay that I post an answer?
There are three virtual methods that IMHO should have never been added to System.Object...
ToString()
GetHashCode()
Equals()
All of these could have been implemented as you suggest with an interface. Had they done so I think we'd be much better off. So why are these a problem? Let's just focus on ToString():
If code that calls ToString() and displays the result expects ToString() to have been implemented, you have an implicit contract that the compiler cannot enforce. You assume that ToString() is overridden, but there is no way to force that to be the case.
With an IStringable you would only need to add that to your generic type constraint, or derive your interface from it, to require its use on implementing objects.
If the benefit you find in overriding ToString() is for the debugger, you should start using [System.Diagnostics.DebuggerDisplayAttribute].
As for needing this implementation for converting objects to strings via String.Format(), and/or Console.WriteLine, they could have deferred to the System.Convert.ToString(object) and checked for something like 'IStringable', failing over to the type's name if not implemented.
As Christopher Estep points out, it's culture specific.
So I guess I stand alone here in saying I hate System.Object and all of its virtual methods. But I do love C# as a whole, and overall I think the designers did a great job.
Note: If you intend to depend upon ToString() being overridden, I would suggest you go ahead and define your IStringable interface. Unfortunately you'll have to pick another name for the method if you really want to require it.
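As a rough sketch of what such an interface and constraint could look like: IStringable is the hypothetical interface from the question, and the method name ToDisplayString, the Temperature class and the Report helper are all invented here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface. The method is not called ToString because every
// type already satisfies that signature through System.Object, so the
// compiler could never report it as missing.
interface IStringable
{
    string ToDisplayString();
}

sealed class Temperature : IStringable
{
    private readonly int celsius;
    public Temperature(int celsius) { this.celsius = celsius; }
    public string ToDisplayString() => celsius + " C";
}

static class Report
{
    // The constraint turns the implicit contract into a compile-time one.
    public static string Join<T>(IEnumerable<T> items) where T : IStringable
    {
        var parts = new List<string>();
        foreach (T item in items)
            parts.Add(item.ToDisplayString());
        return string.Join(", ", parts);
    }

    static void Main()
    {
        Console.WriteLine(Join(new[] { new Temperature(21), new Temperature(25) })); // 21 C, 25 C

        // Join(new[] { new object() }) is rejected by the compiler,
        // because object does not implement IStringable.
    }
}
```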
more
My coworkers and I were just speaking on the topic. I think another big problem with ToString() is answering the question "what is it used for?". Is it Display text? Serialization text? Debugging text? Full type name?
Having Object.ToString makes APIs like Console.WriteLine possible.
From a design perspective the designers of the BCL felt that the ability to provide a string representation of an instance should be common to all objects. True full type name is not always helpful but they felt the ability to have customizable representation at a root level outweighed the minor annoyance of seeing a full type name in output.
True, you could implement Console.WriteLine with no Object.ToString and instead do an interface check and default to the full name of the type if the interface was not present. But then every single API which wanted to capture the string representation of an object instance would have to implement this logic. Given the number of times Object.ToString is used just within the core BCL, this would have led to a lot of duplication.
I imagine it exists because it's a wildly convenient thing to have on all objects and doesn't require add'l cruft to use. Why do you think IStringable would be more elegant?
Not at all.
It doesn't need to be implemented and it returns culture-specific results.
This method returns a human-readable string that is culture-sensitive. For example, for an instance of the Double class whose value is zero, the implementation of Double..::.ToString might return "0.00" or "0,00" depending on the current UI culture.
Further, while it comes with its own implementation, it can be overridden, and often is.
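The culture sensitivity is easy to see in a small sketch. To keep it independent of which cultures are installed on the machine, a hand-built NumberFormatInfo stands in for a real culture here; that substitution, and the names CultureDemo and FormatZero, are assumptions of this example.

```csharp
using System;
using System.Globalization;

static class CultureDemo
{
    public static string FormatZero(string decimalSeparator)
    {
        // A stand-in for a culture whose decimal separator is the given string.
        var format = new NumberFormatInfo { NumberDecimalSeparator = decimalSeparator };
        double zero = 0.0;
        return zero.ToString("F2", format);
    }

    static void Main()
    {
        Console.WriteLine(FormatZero(".")); // 0.00
        Console.WriteLine(FormatZero(",")); // 0,00

        // Without a format provider, the current culture decides,
        // so this line's output depends on the machine's settings.
        double zero = 0.0;
        Console.WriteLine(zero.ToString("F2"));
    }
}
```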
Why make it more complicated? The way it is right now basically establishes that each and every object is capable of printing its value to a string, I can't see anything wrong with that.
A "stringable" representation is useful in so many scenarios, the library designers probably thought ToString() was more straightforward.
With IStringable, you will have to do an extra check/cast to see if you can output an object in string format. It's too much of a hit on perf for such a common operation that should be a good thing to have for 99.99% of all objects anyway.
Mmmm, so it can be overridden in derived classes possibly?
Structs and Objects both have the ToString() member to ease debugging.
The easiest example of this can be seen with Console.WriteLine which receives a whole list of types including object, but also receives params object[] args. As Console is often a layer on-top of TextWriter these statements are also helpful (sometimes) when writing to files and other streams (sockets).
It also illustrates a simple object oriented design that shows you interfaces shouldn't be created just because you can.
My new base class:
class Object : global::System.Object
{
    [Obsolete("Do not use ToString()", true)]
    public sealed override string ToString()
    {
        return base.ToString();
    }
    [Obsolete("Do not use Equals(object)", true)]
    public sealed override bool Equals(object obj)
    {
        // Call the instance method on the base class; the static
        // Equals(object, object) cannot be reached through base.
        return base.Equals(obj);
    }
    [Obsolete("Do not use GetHashCode()", true)]
    public sealed override int GetHashCode()
    {
        return base.GetHashCode();
    }
}
There's indeed little use in having the Type.FullName returned to you, but it would be of even less use if an empty string or null were returned. You ask why it exists. That's not too easy to answer and has been a much-debated issue for years. More than a decade ago, several new languages decided that it would be convenient to implicitly cast an object to a string when needed. Those languages include Perl, PHP and JavaScript, but none of them follows the object-oriented paradigm thoroughly.
Approaches
Designers of object oriented languages had a harder problem. In general, there were three approaches for getting the string representation of an object:
Use multiple inheritance, simply inherit from String as well and you can be cast to a string
Single inheritance: add ToString to the base class as a virtual method
Either: make the cast operator or copy constructor overloadable for strings
Perhaps you'd ask yourself Why would you need a ToString or equiv. in the first place? As some others already noted: the ToString is necessary for introspection (it is called when you hover your mouse over any instance of an object) and the debugger will show it too. As a programmer, you know that on any non-null object you can safely call ToString, always. No cast needed, no conversion needed.
It is considered good programming practice to always implement ToString in your own objects with a meaningful value from your persistable properties. Overloads can help if you need different types of representation of your class.
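A small sketch of that practice, with an override for the default representation and an overload for a more detailed one. The BikeInfo members and the sample values are invented for illustration.

```csharp
using System;

sealed class BikeInfo
{
    public BikeInfo(string make, int gears) { Make = make; Gears = gears; }
    public string Make { get; }
    public int Gears { get; }

    // Override: the default, meaningful representation.
    public override string ToString() => ToString(false);

    // Overload: a more detailed representation on request.
    public string ToString(bool includeGears) =>
        includeGears ? Make + " (" + Gears + " gears)" : Make;
}

static class ToStringDemo
{
    static void Main()
    {
        var bike = new BikeInfo("Gazelle", 7);
        Console.WriteLine("Bike: " + bike);     // concatenation calls ToString(): Bike: Gazelle
        Console.WriteLine(bike.ToString(true)); // Gazelle (7 gears)
    }
}
```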
More history
If you dive a bit deeper into the history, we see Smalltalk taking a wider approach. The base object has many more methods, including printString, printOn etc.
A small decade later, when Bertrand Meyer wrote his landmark book Object-Oriented Software Construction, he suggested using a rather wide base class, GENERAL. It includes methods like print, print_line and tagged_out, the latter showing all properties of the object, but no default ToString. But he suggests that the "second base object ANY to which all user defined object derive, can be expanded", which seems like the prototype approach we now know from JavaScript.
In C++, the only multiple inheritance language still in widespread use, no common ancestor exists for all classes. This could be the best candidate language to employ your own approach, i.e. use IStringable. But C++ has other ways: you can overload the cast operator and the copy constructor to implement stringability. In practice, having to be explicit about a to-string-implementation (as you suggest with IStringable) becomes quite cumbersome. C++ programmers know that.
In Java we find the first appearance of toString for a mainstream language. Unfortunately, Java has two main types: objects and value types. Value types do not have a toString method, instead you need to use Integer.toString or cast to the object counterpart. This has proven very cumbersome throughout the years, but Java programmers (incl. me) learnt to live with it.
Then came C# (I skipped a few languages, don't want to make it too long), which was first intended as a display language for the .NET platform, but proved very popular after initial skepticism. The C# designers (Anders Hejlsberg et al) looked mainly at C++ and Java and tried to take the best of both worlds. The value type remained, but boxing was introduced. This made it possible to have value types derive from Object implicitly. Adding ToString analogous to Java was just a small step and was done to ease the transition from the Java world, but has shown its invaluable merits by now.
Oddity
Though you don't directly ask about it, why would the following have to fail?
object o = null;
Console.WriteLine(o.ToString());
and while you think about it, consider the following, which does not fail:
public static class ObjectExtensions
{
    // Extension methods must live in a static class.
    public static string MakeString(this object o)
    { return o == null ? "null" : o.ToString(); }
}
// elsewhere:
object o = null;
Console.WriteLine(o.MakeString());
which makes me ask the question: if the language designers had thought of extension methods early on, would the ToString method have been an extension method, to prevent unnecessary NullReferenceExceptions? Some consider this bad design, others consider it a timesaver.
Eiffel, at the time, had a special class NIL which represented nothingness, but still had all the base class's methods. Sometimes I wished that C# or Java had abandoned null altogether, just like Bertrand Meyer did.
Conclusion
The wide approach of classical languages like Eiffel and Smalltalk has been replaced by a very narrow approach. Java still has a lot of methods on Object, C# only has a handful. This is of course good for implementations. Keeping ToString in the package keeps programming clean and understandable at the same time, and because it is virtual, you can (and should!) always override it, which will make your code easier to understand.
-- Abel --
EDIT: the asker edited the question and made a comparison to IComparable, same is probably true for ICloneable. Those are very good remarks and it is often considered that IComparable should've been included in Object. In line with Java, C# has Equals and not IComparable, but against Java, C# does not have ICloneable (Java has clone()).
You also state that it is handy for debugging only. Well, consider this everywhere you need to get the string version of something (contrived, no ext. methods, no String.Format, but you get the idea):
CarInfo car = new CarInfo();
BikeInfo bike = new BikeInfo();
string someInfoText = "Car " +
    ((car is IStringable) ? ((IStringable) car).ToString() : "none") +
    ", Bike " +
    ((bike is IStringable) ? ((IStringable) bike).ToString() : "none");
and compare that with this. Whichever you find easier you should choose:
CarInfo car = new CarInfo();
BikeInfo bike = new BikeInfo();
string someInfoText = "Car " + car.ToString() + ", Bike " + bike.ToString();
Remember that languages are about making things clearer and easier. Many parts of the language (LINQ, extension methods, ToString(), the ?? operator) are created as conveniences. None of these are necessities, but sure are we glad that we have them. Only when we know how to use them well, we also find the true value of a feature (or not).
I'd like to add a couple of thoughts on why .NET's System.Object class definition has a ToString() method or member function, in addition to the previous postings on debugging.
Since the .NET Common Language Runtime (CLR) or Execution Runtime supports Reflection, being able to instantiate an object given the string representation of the class type seems to be essential and fundamental. And if I'm not mistaken, all reference values in the CLR are derived from System.Object, having the ToString() method in the class ensures its availability and usage through Reflection. Defining and implementing an interface along the lines of IStringable, is not mandatory or required when defining a class in .NET, and would not ensure the ability to dynamically create a new instance after querying an assembly for its supported class types.
As more advanced .NET functionality available in the 2.0, 3.0 and 3.5 runtimes, such as Generics and LINQ, is based on Reflection and dynamic instantiation, not to mention .NET's Dynamic Language Runtime (DLR) support that allows for .NET implementations of scripting languages such as Ruby and Python, being able to identify and create an instance by a string type seems to be an essential and indispensable function to have in all class definitions.
In short, if we can't identify and name a specific class we want to instantiate, how can we create it? Relying on a ToString() method that has the base class behavior of returning the Class Type as a "human readable" string seems to make sense.
Maybe a review of the articles and books by Jeffrey Richter and Don Box on the .NET Framework design and architecture would provide better insights on this topic as well.
I'm trying to formalise the usage of the "out" keyword in C# for a project I'm on, particularly with respect to any public methods. I can't seem to find any best practices out there and would like to know what is good or bad.
Sometimes I'm seeing some methods signatures that look like this:
public decimal CalcSomething(Date start, Date end, out int someOtherNumber){}
At this point, it's just a feeling, this doesn't sit well with me. For some reason, I'd prefer to see:
public Result CalcSomething(Date start, Date end){}
where the result is a type that contains a decimal and the someOtherNumber. I think this makes it easier to read. It allows Result to be extended or have properties added without breaking code. It also means that the caller of this method doesn't have to declare a locally scoped "someOtherNumber" before calling. From usage expectations, not all callers are going to be interested in "someOtherNumber".
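A minimal sketch of the result-type version follows. DateTime stands in for the question's Date type, the name CalcResult and its Total property are invented, and the arithmetic is a placeholder rather than any real calculation.

```csharp
using System;

// Hypothetical result type carrying both values.
sealed class CalcResult
{
    public CalcResult(decimal total, int someOtherNumber)
    {
        Total = total;
        SomeOtherNumber = someOtherNumber;
    }

    public decimal Total { get; }
    public int SomeOtherNumber { get; }
}

static class Calculator
{
    public static CalcResult CalcSomething(DateTime start, DateTime end)
    {
        // Placeholder logic: a made-up rate of 1.5 per whole day.
        int days = (end - start).Days;
        return new CalcResult(days * 1.5m, days);
    }

    static void Main()
    {
        // A caller that only wants the decimal never declares a local
        // for someOtherNumber.
        decimal total = CalcSomething(new DateTime(2020, 1, 1), new DateTime(2020, 1, 11)).Total;
        Console.WriteLine(total);
    }
}
```

Adding a third property to CalcResult later would not break any of the existing callers.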
By contrast, the only instances I can think of right now within the .NET Framework where "out" parameters make sense are methods like TryParse(). These actually make the caller write simpler code, because the caller is primarily going to be interested in the out parameter.
int i;
if (int.TryParse("1", out i)) {
    DoSomething(i);
}
I'm thinking that "out" should only be used if the return type is bool and the expected usages are where the "out" parameters will always be of interest to the caller, by design.
Thoughts?
There is a reason that one of the static code analysis (=FxCop) rules points at you when you use out parameters. I'd say: only use out when really needed in interop type scenarios. In all other cases, simply do not use out. But perhaps that's just me?
This is what the .NET Framework Developer's Guide has to say about out parameters:
Avoid using out or reference parameters.
Working with members that define out or reference parameters requires that the developer understand pointers, subtle differences between value types and reference types, and initialization differences between out and reference parameters.
But if you do use them:
Do place all out parameters after all of the pass-by-value and ref parameters (excluding parameter arrays), even if this results in an inconsistency in parameter ordering between overloads.
This convention makes the method signature easier to understand.
Your approach is better than out, because you can "chain" calls that way:
DoSomethingElse(DoThing(a,b).Result);
as opposed to
DoThing(a, out b);
DoSomethingElse(b);
The TryParse methods implemented with "out" was a mistake, IMO. Those would have been very convenient in chains.
There are only very few cases where I would use out. One of them is if your method returns two variables that from an OO point of view do not belong into an object together.
If, for example, you want to get the most common word in a text string, and the 42nd word in the text, you could compute both in the same method (having to parse the text only once). But for your application, these pieces of information have no relation to each other: you need the most common word for statistical purposes, but you only need the 42nd word because your customer is a geeky Douglas Adams fan.
Yes, that example is very contrived, but I haven't got a better one...
I just had to add that starting from C# 7, the use of the out keyword makes for very readable code in certain instances, when combined with inline variable declaration. While in general you should rather return a (named) tuple, control flow becomes very concise when a method has a boolean outcome, like:
if (int.TryParse(mightBeCount, out var count))
{
// Successfully parsed count
}
I should also mention, that defining a specific class for those cases where a tuple makes sense, more often than not, is more appropriate. It depends on how many return values there are and what you use them for. I'd say, when more than 3, stick them in a class anyway.
One advantage of out is that the compiler will verify that CalcSomething does in fact assign a value to someOtherNumber. It will not verify that the someOtherNumber field of Result has a value.
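The difference in what the compiler checks can be sketched as follows. The method names, the Result shape and the arithmetic are hypothetical.

```csharp
using System;

sealed class Result
{
    public decimal Value { get; set; }
    public int SomeOtherNumber { get; set; }
}

static class OutAssignmentDemo
{
    public static decimal CalcWithOut(int days, out int someOtherNumber)
    {
        // Deleting this assignment is a compile-time error: an out
        // parameter must be assigned before control leaves the method.
        someOtherNumber = days * 2;
        return days * 1.5m;
    }

    public static Result CalcWithResult(int days)
    {
        // This compiles even though SomeOtherNumber was forgotten;
        // it silently keeps its default value of 0.
        return new Result { Value = days * 1.5m };
    }

    static void Main()
    {
        int other;
        CalcWithOut(10, out other);
        Console.WriteLine(other);                              // 20
        Console.WriteLine(CalcWithResult(10).SomeOtherNumber); // 0
    }
}
```

A result type can close that gap by taking every value through its constructor, at which point the compiler again insists on all of them.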
Stay away from out. It's there as a low-level convenience. But at a high level, it's an anti-technique.
int? i = Util.TryParseInt32("1");
if(i == null)
return;
DoSomething(i);
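Util.TryParseInt32 is not a framework method, so here is one way such a helper could be written. The invariant-culture choice and the DoSomething body are assumptions of this sketch, and note that it still uses an out parameter internally, hidden behind the nullable return value.

```csharp
using System;
using System.Globalization;

static class Util
{
    // Returns the parsed value, or null when the text is not a valid Int32.
    public static int? TryParseInt32(string text)
    {
        int value;
        return int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out value)
            ? value
            : (int?)null;
    }
}

static class NullableParseDemo
{
    // Hypothetical consumer taking a plain int.
    public static int DoSomething(int n) => n + 1;

    static void Main()
    {
        int? i = Util.TryParseInt32("1");
        if (i == null)
            return;
        // A nullable int must be unwrapped before it is passed as an int.
        Console.WriteLine(DoSomething(i.Value)); // 2
    }
}
```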
If you have ever seen and worked with Microsoft's
namespace System.Web.Security
MembershipProvider
public abstract MembershipUser CreateUser(string username, string password, string email, string passwordQuestion, string passwordAnswer, bool isApproved, object providerUserKey, out MembershipCreateStatus status);
You will need a bucket. This is an example of a class breaking many design paradigms. Awful!
Just because the language has out parameters doesn't mean they should be used (e.g. goto).
The use of out looks more like the developer was either too lazy to create a type or wanted to try out a language feature.
Even for the completely contrived MostCommonAnd42ndWord example above, I would use a List or a new type, contrivedresult, with 2 properties.
The only good reason I've seen in the explanations above is interop scenarios where you are forced to, assuming that is a valid statement.
You could create a generic tuple class for the purpose of returning multiple values. This seems to be a decent solution but I can't help but feel that you lose a bit of readability by returning such a generic type (Result is no better in that regard).
One important point, though, that James Curran also pointed out, is that the compiler enforces an assignment of the value. This is a general pattern I see in C#, that you must state certain things explicitly, for more readable code. Another example of this is the override keyword, which you don't have in Java.
If your result is more complex than a single value, you should, if possible, create a result object. My reasons for saying this:
The entire result is encapsulated. That is, you have a single package that informs the code of the complete result of CalcSomething. Instead of having external code interpret what the decimal return value means, you can name the properties for your previous return value, your someOtherNumber value, etc.
You can include more complex success indicators. The function call you wrote might throw an exception if end comes before start, but then exception throwing is the only way to report errors. Using a result object, you can include a boolean or enumerated "Success" value, with appropriate error reporting.
You can delay the execution of the result until you actually examine the "result" field. That is, the execution of any computing needn't be done until you use the values.
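The last two points might be sketched together like this, using Lazy<T> for the deferred computation. The names CalcStatus, LazyCalcResult and LazyCalculator are invented, DateTime stands in for Date, and the arithmetic is a placeholder.

```csharp
using System;

enum CalcStatus { Ok, EndBeforeStart }

sealed class LazyCalcResult
{
    private readonly Lazy<decimal> lazyValue;

    public LazyCalcResult(CalcStatus status, Func<decimal> compute)
    {
        Status = status;
        lazyValue = new Lazy<decimal>(compute);
    }

    // An enumerated success indicator instead of an exception.
    public CalcStatus Status { get; }

    // The computation runs the first time Value is read, and only once.
    public decimal Value => lazyValue.Value;
}

static class LazyCalculator
{
    // Counts how often the (placeholder) computation actually ran.
    public static int ComputeCount;

    public static LazyCalcResult CalcSomething(DateTime start, DateTime end)
    {
        if (end < start)
            return new LazyCalcResult(CalcStatus.EndBeforeStart, () => 0m);

        return new LazyCalcResult(CalcStatus.Ok, () =>
        {
            ComputeCount++;
            return (end - start).Days * 1.5m;
        });
    }

    static void Main()
    {
        var result = CalcSomething(new DateTime(2020, 1, 1), new DateTime(2020, 1, 11));
        Console.WriteLine(ComputeCount);  // 0: nothing has been computed yet
        Console.WriteLine(result.Status); // Ok
        decimal value = result.Value;     // the computation runs here
        Console.WriteLine(ComputeCount);  // 1
    }
}
```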