Why does object.ToString() exist? - c#

Isn't it much more elegant and neat to have an IStringable interface?
Who needs this Type.FullName object returned to us?
EDIT: Everyone keeps asking why I think it's more elegant.
Well, it's as if, instead of IComparable, object had a CompareTo method that by default throws an exception or returns 0.
There are objects that cannot and should not be described as a string. object could have equally returned string.Empty. Type.FullName is just an arbitrary choice..
And for methods such as Console.Write(object), I think it should be: Write(IStringable).
However, if you are calling WriteLine on anything but strings (or something whose ToString is obvious, such as numbers), it seems to me it's for debugging only.
By the way - how should I comment to you all? Is it okay that I post an answer?

There are three virtual methods that IMHO should have never been added to System.Object...
ToString()
GetHashCode()
Equals()
All of these could have been implemented as you suggest with an interface. Had they done so I think we'd be much better off. So why are these a problem? Let's just focus on ToString():
If ToString() is expected to be implemented by someone using ToString() and displaying the results, you have an implicit contract that the compiler cannot enforce. You assume that ToString() is overridden, but there is no way to force that to be the case.
With an IStringable you would only need to add that to your generic type-constraint or derive your interface from it to require its use on implementing objects.
If the benefit you find in overriding ToString() is for the debugger, you should start using [System.Diagnostics.DebuggerDisplayAttribute].
As for needing this implementation for converting objects to strings via String.Format(), and/or Console.WriteLine, they could have deferred to the System.Convert.ToString(object) and checked for something like 'IStringable', failing over to the type's name if not implemented.
As Christopher Estep points out, it's culture specific.
So I guess I stand alone here saying I hate System.Object and all of its virtual methods. But I do love C# as a whole and overall I think the designers did a great job.
Note: If you intend to depend upon ToString() being overridden, I would suggest you go ahead and define your IStringable interface. Unfortunately you'll have to pick another name for the method if you really want to require it.
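A minimal sketch of the interface-plus-constraint idea from this answer. IStringable, ToDisplayString, Money and Printer.Describe are all hypothetical names, not part of .NET. The method is deliberately not called ToString: every type already inherits a public ToString() from object, so an interface member with that name would be satisfied without the implementer writing anything.

```csharp
using System;
using System.Globalization;

// Hypothetical interface: not part of the .NET base class library.
public interface IStringable
{
    string ToDisplayString();
}

public sealed class Money : IStringable
{
    public decimal Amount { get; }
    public string Currency { get; }

    public Money(decimal amount, string currency)
    {
        Amount = amount;
        Currency = currency;
    }

    public string ToDisplayString()
    {
        return Amount.ToString("0.00", CultureInfo.InvariantCulture) + " " + Currency;
    }
}

public static class Printer
{
    // The constraint makes the compiler reject any T that lacks ToDisplayString().
    public static string Describe<T>(T item) where T : IStringable
    {
        return "[" + item.ToDisplayString() + "]";
    }
}
```

Calling Printer.Describe(new Money(12.5m, "EUR")) compiles, while Printer.Describe(new object()) is rejected at compile time, which is the enforcement the answer is asking for.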
more
My coworkers and I were just speaking on the topic. I think another big problem with ToString() is answering the question "what is it used for?". Is it Display text? Serialization text? Debugging text? Full type name?

Having Object.ToString makes APIs like Console.WriteLine possible.
From a design perspective the designers of the BCL felt that the ability to provide a string representation of an instance should be common to all objects. True full type name is not always helpful but they felt the ability to have customizable representation at a root level outweighed the minor annoyance of seeing a full type name in output.
True, you could implement Console.WriteLine with no Object.ToString and instead do an interface check and default to the full name of the type if the interface was not present. But then every single API which wanted to capture the string representation of an object instance would have to implement this logic. Given the number of times Object.ToString is used just within the core BCL, this would have led to a lot of duplication.
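To make the duplication argument concrete, here is a rough sketch of the check each string-consuming API would need in a world without Object.ToString. IStringable, ToDisplayString, StringFallback and Temperature are hypothetical names used only for illustration.

```csharp
using System;

// Hypothetical interface standing in for the question's IStringable.
public interface IStringable
{
    string ToDisplayString();
}

public static class StringFallback
{
    // Interface check first, then fall back to the full type name.
    public static string Stringify(object value)
    {
        if (value == null) return string.Empty;
        IStringable stringable = value as IStringable;
        return stringable != null
            ? stringable.ToDisplayString()
            : value.GetType().FullName;
    }
}

public sealed class Temperature : IStringable
{
    public string ToDisplayString() { return "21 C"; }
}
```

A shared helper like this removes the copy-paste, but every caller still has to know about it and remember to use it, whereas a virtual method on the root type is simply always there.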

I imagine it exists because it's a wildly convenient thing to have on all objects and doesn't require add'l cruft to use. Why do you think IStringable would be more elegant?

Not at all.
It doesn't need to be implemented and it returns culture-specific results.
This method returns a human-readable string that is culture-sensitive. For example, for an instance of the Double class whose value is zero, the implementation of Double.ToString might return "0.00" or "0,00" depending on the current UI culture.
Further, while it comes with its own implementation, it can be overridden, and often is.
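A small sketch of the culture sensitivity described in the quote. CultureDemo and FormatZero are hypothetical names; the only framework call involved is the Double.ToString(string, IFormatProvider) overload.

```csharp
using System;
using System.Globalization;

public static class CultureDemo
{
    // Formats zero with two decimals using whatever format provider the caller supplies.
    public static string FormatZero(IFormatProvider provider)
    {
        double zero = 0.0;
        return zero.ToString("F2", provider);
    }
}
```

Passing CultureInfo.InvariantCulture gives "0.00". A provider whose decimal separator is a comma, such as a NumberFormatInfo with NumberDecimalSeparator set to "," or (typically) the culture returned by CultureInfo.GetCultureInfo("de-DE"), gives "0,00".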

Why make it more complicated? The way it is right now basically establishes that each and every object is capable of printing its value to a string, I can't see anything wrong with that.

A "stringable" representation is useful in so many scenarios, the library designers probably thought ToString() was more straightforward.

With IStringable, you will have to do an extra check/cast to see if you can output an object in string format. It's too much of a hit on perf for such a common operation that should be a good thing to have for 99.99% of all objects anyway.

Mmmm, so it can be overridden in derived classes possibly?

Structs and Objects both have the ToString() member to ease debugging.
The easiest example of this can be seen with Console.WriteLine which receives a whole list of types including object, but also receives params object[] args. As Console is often a layer on-top of TextWriter these statements are also helpful (sometimes) when writing to files and other streams (sockets).
It also illustrates a simple object oriented design that shows you interfaces shouldn't be created just because you can.

My new base class:
class Object : global::System.Object
{
    [Obsolete("Do not use ToString()", true)]
    public sealed override string ToString()
    {
        return base.ToString();
    }
    [Obsolete("Do not use Equals(object)", true)]
    public sealed override bool Equals(object obj)
    {
        // The instance Equals takes one argument; the two-argument form is the static object.Equals.
        return base.Equals(obj);
    }
    [Obsolete("Do not use GetHashCode()", true)]
    public sealed override int GetHashCode()
    {
        return base.GetHashCode();
    }
}

There's indeed little use in having the Type.FullName returned to you, but it would be of even less use if an empty string or null were returned. You ask why it exists. That's not too easy to answer and has been a much debated issue for years. More than a decade ago, several new languages decided that it would be convenient to implicitly cast an object to a string when it was needed. Those languages include Perl, PHP and JavaScript, but none of them follows the object-oriented paradigm thoroughly.
Approaches
Designers of object oriented languages had a harder problem. In general, there were three approaches for getting the string representation of an object:
Use multiple inheritance, simply inherit from String as well and you can be cast to a string
Single inheritance: add ToString to the base class as a virtual method
Either: make the cast operator or copy constructor overloadable for strings
Perhaps you'd ask yourself Why would you need a ToString or equiv. in the first place? As some others already noted: the ToString is necessary for introspection (it is called when you hover your mouse over any instance of an object) and the debugger will show it too. As a programmer, you know that on any non-null object you can safely call ToString, always. No cast needed, no conversion needed.
It is considered good programming practice to always implement ToString in your own objects with a meaningful value from your persistable properties. Overloads can help if you need different types of representation of your class.
More history
If you dive a bit deeper into the history, we see Smalltalk taking a wider approach. The base object has many more methods, including printString, printOn etc.
Roughly a decade later, when Bertrand Meyer wrote his landmark book Object-Oriented Software Construction, he suggested using a rather wide base class, GENERAL. It includes methods like print, print_line and tagged_out, the latter showing all properties of the object, but no default ToString. But he suggests that the "second base object ANY to which all user defined object derive, can be expanded", which seems like the prototype approach we now know from JavaScript.
In C++, the only multiple inheritance language still in widespread use, no common ancestor exists for all classes. This could be the best candidate language to employ your own approach, i.e. use IStringable. But C++ has other ways: you can overload the cast operator and the copy constructor to implement stringability. In practice, having to be explicit about a to-string-implementation (as you suggest with IStringable) becomes quite cumbersome. C++ programmers know that.
In Java we find the first appearance of toString for a mainstream language. Unfortunately, Java has two main types: objects and value types. Value types do not have a toString method, instead you need to use Integer.toString or cast to the object counterpart. This has proven very cumbersome throughout the years, but Java programmers (incl. me) learnt to live with it.
Then came C# (I skipped a few languages, don't want to make it too long), which was first intended as a display language for the .NET platform, but proved very popular after initial skepticism. The C# designers (Anders Hejlsberg et al) looked mainly at C++ and Java and tried to take the best of both worlds. The value type remained, but boxing was introduced. This made it possible to have value types derive from Object implicitly. Adding ToString analogous to Java was just a small step and was done to ease the transition from the Java world, but has shown its invaluable merits by now.
Oddity
Though you don't directly ask about it, why would the following have to fail?
object o = null;
Console.WriteLine(o.ToString());
and while you think about it, consider the following, which does not fail:
public static string MakeString(this object o)
{ return o == null ? "null" : o.ToString(); }
// elsewhere:
object o = null;
Console.WriteLine(o.MakeString());
which makes me ask the question: if the language designers had thought of extension methods early on, would the ToString method have been made an extension method to prevent unnecessary NullReferenceExceptions? Some consider this bad design, others consider it a timesaver.
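For comparison, a short sketch of three null-tolerant ways to get a string in current C#. NullSafeStrings and its method names are hypothetical; MakeString repeats the extension method from above inside the static class that extension methods require, and the ?. operator needs C# 6 or later.

```csharp
using System;

public static class NullSafeStrings
{
    // Extension methods compile to static calls, so a null receiver does not throw by itself.
    public static string MakeString(this object o)
    {
        return o == null ? "null" : o.ToString();
    }

    // Null-conditional plus null-coalescing operators.
    public static string WithOperators(object o)
    {
        return o?.ToString() ?? "null";
    }

    // Convert.ToString(object) returns an empty string for a null argument.
    public static string WithConvert(object o)
    {
        return Convert.ToString(o);
    }
}
```

The three differ only in what they substitute for null: the first two return the text "null", the last an empty string.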
Eiffel, at the time, had a special class NIL which represented nothingness, but still had all the base class's methods. Sometimes I wished that C# or Java had abandoned null altogether, just like Bertrand Meyer did.
Conclusion
The wide approach of classical languages like Eiffel and Smalltalk has been replaced by a very narrow approach. Java still has a lot of methods on Object, C# only has a handful. This is of course good for implementations. Keeping ToString in the package simply keeps programming clean and understandable at the same time, and because it is virtual, you can (and should!) always override it, which will make your code easier to understand.
-- Abel --
EDIT: the asker edited the question and made a comparison to IComparable; the same is probably true for ICloneable. Those are very good remarks, and it is often argued that comparison should have been included in Object. In line with Java, C#'s Object has Equals but no CompareTo; unlike Java, whose Object has clone(), C#'s Object has no public Clone method (cloning is left to ICloneable).
You also state that it is handy for debugging only. Well, consider this everywhere you need to get the string version of something (contrived, no ext. methods, no String.Format, but you get the idea):
CarInfo car = new CarInfo();
BikeInfo bike = new BikeInfo();
string someInfoText = "Car " +
    ((car is IStringable) ? ((IStringable) car).ToString() : "none") +
    ", Bike " +
    ((bike is IStringable) ? ((IStringable) bike).ToString() : "none");
and compare that with this. Whichever you find easier you should choose:
CarInfo car = new CarInfo();
BikeInfo bike = new BikeInfo();
string someInfoText = "Car " + car.ToString() + ", Bike " + bike.ToString();
Remember that languages are about making things clearer and easier. Many parts of the language (LINQ, extension methods, ToString(), the ?? operator) are created as conveniences. None of these are necessities, but sure are we glad that we have them. Only when we know how to use them well, we also find the true value of a feature (or not).

I'd like to add a couple of thoughts on why .NET's System.Object class definition has a ToString() method or member function, in addition to the previous postings on debugging.
Since the .NET Common Language Runtime (CLR) or Execution Runtime supports Reflection, being able to instantiate an object given the string representation of the class type seems to be essential and fundamental. And if I'm not mistaken, all reference values in the CLR are derived from System.Object, having the ToString() method in the class ensures its availability and usage through Reflection. Defining and implementing an interface along the lines of IStringable, is not mandatory or required when defining a class in .NET, and would not ensure the ability to dynamically create a new instance after querying an assembly for its supported class types.
As more advanced .NET functionality available in the 2.0, 3.0 and 3.5 runtimes, such as Generics and LINQ, are based on Reflection and dynamic instantiation, not to mention .NET's Dynamic Language Runtime (DLR) support that allow for .NET implementations of scripting languages, such as Ruby and Python, being able to identify and create an instance by a string type seems to be an essential and indispensable function to have in all class definitions.
In short, if we can't identify and name a specific class we want to instantiate, how can we create it? Relying on a ToString() method that has the base class behavior of returning the Class Type as a "human readable" string seems to make sense.
Maybe a review of the articles and books from Jeffrey Richter and Don Box on the .NET Framework design and architecture may provide better insights on this topic as well.

Related

How to use Console.WriteLine with a class [duplicate]

I'm studying C# and I wonder what the point and benefit of overriding ToString might be, as shown in the example below.
Could this be done in some simpler way, using a common method without the override?
public string GetToStringItemsHeadings
{
    get { return string.Format("{0,-20} {1, -20}", "Office Email", "Private Email"); }
}
public override string ToString()
{
    string strOut = string.Format("{0,-20} {1, -20}", m_work, m_personal);
    return strOut;
}
I'm just going to give you the answer straight from the Framework Design Guidelines from the .NET Development Series.
AVOID throwing exceptions from ToString
CONSIDER returning a unique string associated with the instance.
CONSIDER having the output of ToString be a valid input for any parsing methods on this type.
DO ensure that ToString has no observable side effects.
DO report security-sensitive information through an override of ToString only after demanding an appropriate permission. If the permission demand fails, return a string excluding security-sensitive information.
The Object.ToString method is intended to be used for general display and debugging purposes. The default implementation simply provides the object type name. The default implementation is not very useful, and it is recommended that the method be overridden.
DO override ToString whenever an interesting human-readable string can be returned. The default implementation is not very useful, and a custom implementation can almost always provide more value.
DO prefer a friendly name over a unique but not readable ID.
It is also worth mentioning as Chris Sells also explains in the guidelines that ToString is often dangerous for user interfaces. Generally my rule of thumb is to expose a property that would be used for binding information to the UI, and leave the ToString override for displaying diagnostic information to the developer. You can also decorate your type with DebuggerDisplayAttribute as well.
DO try to keep the string returned from ToString short. The debugger uses ToString to get a textual representation of an object to be shown to the developer. If the string is longer than the debugger can display, the debugging experience is hindered.
DO string formatting based on the current thread culture when returning culture-dependent information.
DO provide overload ToString(string format), or implement IFormattable, if the string return from ToString is culture-sensitive or there are various ways to format the string. For example, DateTime provides the overload and implements IFormattable.
DO NOT return an empty string or null from ToString
I swear by these guidelines, and you should too. I can't tell you how much my code has improved just by following this one guideline for ToString. The same thing goes for things like IEquatable(Of T) and IComparable(Of T). These things make your code very functional, and you won't regret taking the extra time to implement any of it.
Personally, I've never really used ToString much for user interfaces, I have always exposed a property or method of some-sort. The majority of the time you should use ToString for debugging and developer purposes. Use it to display important diagnostic information.
Do you need to override ToString? No.
Can you get a string representation of your object in another way? Yes.
But by using ToString you are using a method that is common to all objects and thus other classes know about this method. For instance, whenever the .NET framework wants to convert an object to a string representation, ToString is a prime candidate (there are others, if you want to provide more elaborate formatting options).
Concretely,
Console.WriteLine(yourObject);
would invoke yourObject.ToString().
Overriding ToString() allows you to give a useful human-readable string representation of a class.
This means that the output can reveal useful information about your class. For example, if you had a Person class you might choose to have the ToString() output the person's id, their firstname, their lastname etc. This is extremely useful when debugging or logging.
With regard to your example - it is difficult to tell if your override is useful without knowing what this class is - but the implementation itself is ok.
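A minimal sketch of the Person example described above. The class shape, property names and output format are assumptions made for illustration.

```csharp
public sealed class Person
{
    public int Id { get; }
    public string FirstName { get; }
    public string LastName { get; }

    public Person(int id, string firstName, string lastName)
    {
        Id = id;
        FirstName = firstName;
        LastName = lastName;
    }

    // One short line that identifies the instance in logs and debugger windows.
    public override string ToString()
    {
        return "Person #" + Id + ": " + FirstName + " " + LastName;
    }
}
```

With this in place, Console.WriteLine(person) or a log statement shows the id and name rather than the type name.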
It's always appropriate but carefully consider the intentions behind what you're displaying
A better question would be to ask:
Why would one override ToString()?
ToString() is the window into an object's state. Emphasis on state as a requirement. Strongly OOP languages like Java/C# abuse the OOP model by encapsulating everything in a class. Imagine you are coding in a language that doesn't follow the strong OOP model; consider whether you'd use a class or a function. If you would use it as a function (ie verb, action) and internal state is only maintained temporarily between input/output, ToString() won't add value.
Like others have mentioned, it's important to consider what you output with ToString() because it could be used by the debugger or other systems.
I like to imagine the ToString method as the --help parameter of an object. It should be short, readable, obvious, and easy to display. It should display what the object is not what it does. With all that in mind let's consider...
Use Case - Parsing a TCP packet:
Not an application-level-only network capture but something with more meat like a pcap capture.
You want to override ToString() for just the TCP layer so you can print data to the console. What would it include? You could go crazy and parse all of the TCP details (ie TCP is complex)...
Which includes the:
Source Port
Destination Port
Sequence Number
Acknowledgment number
Data offset
Flags
Window Size
Checksum
Urgent Pointer
Options (I'm not even going to go there)
But would you want to receive all that junk if you were calling TCP.ToString() on 100 packets? Of course not, it would be information overload. The easy and obvious choice is also the most sensible...
Expose what people would expect to see:
Source Port
Destination Port
I prefer a sensible output that's easy for humans to parse but YMMV.
TCP:[destination:000, source:000]
Nothing complex, the output isn't for machines to parse (ie unless people are abusing your code), the intended purpose is for human readability.
But what about all the rest of that juicy info I talked about before, isn't that useful too? I'll get to that but first...
ToString() is one of the most valuable and underused methods of all time
For two reasons:
People don't understand what ToString() is for
The base 'Object' class is missing another, equally important, string method.
Reason 1 - Don't abuse the usefulness of ToString():
A lot of people use ToString() to pull a simple string representation of an object. The C# manual even states:
ToString is the major formatting method in the .NET Framework. It converts an object to its string representation so that it is suitable for display.
Display, not further processing. That doesn't mean, take my nice string representation of the TCP packet above and pull the source port using a regex ::cringe::.
The right way to do things is, call ToString() directly on the SourcePort property (which BTW is a ushort so ToString() should already be available).
If you need something more robust to package the state of a complex object for machine parsing you'll be better off using a structured serialization strategy.
Fortunately, such strategies are very common:
ISerializable (C#)
Pickle (Python)
JSON (JavaScript or any language that implements it)
SOAP
etc...
Note: Unless you're using PHP because, herp-derp, there's a function for that ::snicker::
Reason 2 - ToString() is not enough:
I have yet to see a language that implements this at the core but I have seen and used variations of this approach in the wild.
Some of which include:
ToVerboseString()
ToString(verbose=true)
Basically, that hairy mess of a TCP Packet's state should be described for human readability. To avoid 'beating a dead horse' talking about TCP I'll 'point a finger' at the #1 case where I think ToString() and ToVerboseString() are underutilized...
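A sketch of the short and verbose forms for the TCP case. TcpHeader and its property names are assumptions, only a few of the header fields listed earlier are included, and ToVerboseString is a hypothetical method, not something System.Object provides.

```csharp
using System.Globalization;

public sealed class TcpHeader
{
    public ushort SourcePort { get; set; }
    public ushort DestinationPort { get; set; }
    public uint SequenceNumber { get; set; }
    public uint AcknowledgmentNumber { get; set; }
    public ushort WindowSize { get; set; }

    // Short form: only what someone scanning 100 packets would want to see.
    public override string ToString()
    {
        return string.Format(CultureInfo.InvariantCulture,
            "TCP:[destination:{0}, source:{1}]", DestinationPort, SourcePort);
    }

    // Hypothetical verbose form: more of the state, still meant for human eyes.
    public string ToVerboseString()
    {
        return string.Format(CultureInfo.InvariantCulture,
            "TCP:[destination:{0}, source:{1}, seq:{2}, ack:{3}, window:{4}]",
            DestinationPort, SourcePort, SequenceNumber, AcknowledgmentNumber, WindowSize);
    }
}
```

Code that needs the source port reads header.SourcePort directly; neither string is meant to be parsed back.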
Use Case - Arrays:
If you primarily use one language, you're probably comfortable with that language's approach. For people like me who jump between different languages, the number of varied approaches can be irritating.
Ie, the number of times this has irritated me is greater than the sum of all of the fingers of every Hindu god combined.
There are various cases where languages use common hacks and a few that get it right. Some require wheel re-inventing, some do a shallow dump, others do a deep dump, none of them work the way I'd like them to...
What I'm asking for is a very simple approach:
print(array.ToString());
Outputs: 'Array[x]' or 'Array[x][y]'
Where x is the number of items in the first dimension and y is the number of items in the second dimension or some value that indicates that the 2nd dimension is jagged (min/max range maybe?).
And:
print(array.ToVerboseString());
Outputs the whole she-bang in pretty-print because I appreciate pretty things.
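In C# the two array methods wished for above can be approximated with extension methods, since Array.ToString() itself cannot be overridden. ArrayStrings, ToShortString and ToVerboseString are hypothetical names. This sketch covers rectangular arrays only: a jagged array is reported by its outer length, and its inner arrays print as type names.

```csharp
using System;
using System.Linq;

public static class ArrayStrings
{
    // Short form: "Array[3]" or "Array[2][3]", built from the length of each dimension.
    public static string ToShortString(this Array array)
    {
        var dims = Enumerable.Range(0, array.Rank)
                             .Select(d => "[" + array.GetLength(d) + "]");
        return "Array" + string.Concat(dims);
    }

    // Verbose form: every element in enumeration order, on a single line.
    public static string ToVerboseString(this Array array)
    {
        return "[" + string.Join(", ", array.Cast<object>()) + "]";
    }
}
```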
Hopefully, this sheds some light on a topic that has irked me for a long time. At the very least I sprinkled a little troll-bait for the PHPers to downvote this answer.
It's about good practise as much as anything, really.
ToString() is used in many places to return a string representation of an object, generally for consumption by a human. Often that same string can be used to rehydrate the object (think of int or DateTime for example), but that's not always a given (a tree for example, might have a useful string representation which simply displays a Count, but clearly you can't use that to rebuild it).
In particular the debugger will use this to display the variable in the watch windows, immediate windows etc, therefore ToString is actually invaluable for debugging.
In general, also, such a type will often have an explicit member that returns a string as well. For example a Lat/Long pair might have ToDecimalDegrees which returns "-10, 50" for example, but it might also have ToDegreesMinutesSeconds, since that is another format for a Lat/Long pair. That same type might then also override ToString with one of those to provide a 'default' for things like debugging, or even potentially for things like rendering web pages (for example, the @ construct in Razor writes the ToString() result of a non-string expression to the output stream).
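A sketch of the Lat/Long idea: one explicit formatting member, with ToString delegating to it as the default. LatLong and its members are hypothetical, and ToDegreesMinutesSeconds is left out to keep the example short.

```csharp
using System.Globalization;

public struct LatLong
{
    public double Latitude { get; }
    public double Longitude { get; }

    public LatLong(double latitude, double longitude)
    {
        Latitude = latitude;
        Longitude = longitude;
    }

    // Explicit, named format.
    public string ToDecimalDegrees()
    {
        return string.Format(CultureInfo.InvariantCulture, "{0}, {1}", Latitude, Longitude);
    }

    // ToString simply picks one of the explicit formats as the default.
    public override string ToString()
    {
        return ToDecimalDegrees();
    }
}
```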
object.ToString() converts an object to its string representation. If you want to change what is returned when a user calls ToString() on a class you have created then you would need to override ToString() in that class.
While I think the most useful information has already been provided, I shall add my two cents:
ToString() is meant to be overridden. Its default implementation returns the type name which, while maybe useful at times (particularly when working with a lot of objects), doesn't suffice in the big majority of times.
Remember that, for debugging purposes, you can rely on DebuggerDisplayAttribute. You can read more about it here.
As a rule, on POCOs you can always override ToString(). POCOs are a structured representation of data, which usually can become a string.
Design ToString to be a textual representation of your object. Maybe its main fields and data, maybe a description of how many items are in the collection, etc.
Always try to fit that string into a single line and have only essential information. If you have a Person class with Name, Address, Number etc. properties, return only the main data (Name, some ID number).
Be careful not to override a good implementation of ToString(). Some framework classes already implement ToString(). Overriding that default implementation is a bad thing: people will expect a certain result from ToString() and get another.
Don't be really afraid of using ToString(). The only thing I'd be careful about is returning sensitive information. Other than that, the risk is minimal. Sure, as some have pointed out, other classes will use your ToString whenever reaching for information. But heck, when would returning the type name be considered better than getting some actual information?
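For the DebuggerDisplayAttribute point above, a minimal sketch. Order and its properties are hypothetical; the expressions in braces are evaluated by the debugger against the instance being inspected.

```csharp
using System.Diagnostics;

// Controls what the debugger shows for an Order without touching ToString().
[DebuggerDisplay("Order {Id} ({ItemCount} items)")]
public sealed class Order
{
    public int Id { get; set; }
    public int ItemCount { get; set; }
}
```

This keeps debugger output separate from ToString(), which still returns the type name here and is free to be used for something else.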
If you don't override ToString then you get your base class's implementation, which for Object is just the fully qualified type name of the class.
If you want some other, more meaningful or useful implementation of ToString then override it.
This can be useful when using a list of your type as the datasource for a ListBox as the ToString will be automatically displayed.
Another situation occurs when you want to pass your type to String.Format, which invokes ToString to get a representation of your type.
Something no one else has mentioned yet: By overriding ToString(), you can also consider implementing IFormattable so you can do the following:
public override string ToString() {
    return string.Format("{0,-20} {1, -20}", m_work, m_personal);
}
public string ToString(string formatter) {
    string formattedEmail = this.ToString();
    switch (formatter.ToLower()) {
        case "w":
            formattedEmail = m_work;
            break;
        case "p":
            formattedEmail = m_personal;
            break;
        case "hw":
            formattedEmail = string.Format("mailto:{0}", m_work);
            break;
    }
    return formattedEmail;
}
Which can be useful.
One benefit of overriding ToString() is ReSharper's tooling support: Alt + Ins -> "Formatting members" and it writes the ToString() for you.
In some cases it makes it easier to read values of custom classes in the debugger watch window. If I know exactly what I want to see in the watch window, when I override ToString with that information, then I see it.
When defining structs (effectively user-primitives) I find it's good practice to have matching ToString, and Parse and TryParse methods, particularly for XML serialization. In this case you will be converting the entire state to a string, so that it can be read from later.
Classes however are more compound structures that will usually be too complex for using ToString and Parse. Their ToString methods, instead of saving the entire state, can be a simple description that helps you identify their state, like a unique identifier like a name or ID, or maybe a quantity for a list.
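A sketch of the matching ToString / Parse / TryParse trio for a small struct. Point2D and its "x,y" text format are assumptions made for illustration; the invariant culture is used so that the text reads back the same way on any machine.

```csharp
using System;
using System.Globalization;

public struct Point2D
{
    public int X { get; }
    public int Y { get; }

    public Point2D(int x, int y)
    {
        X = x;
        Y = y;
    }

    // Writes the entire state, so the result can be read back later.
    public override string ToString()
    {
        return string.Format(CultureInfo.InvariantCulture, "{0},{1}", X, Y);
    }

    public static bool TryParse(string text, out Point2D result)
    {
        result = default(Point2D);
        if (text == null) return false;
        string[] parts = text.Split(',');
        int x, y;
        if (parts.Length != 2
            || !int.TryParse(parts[0], NumberStyles.Integer, CultureInfo.InvariantCulture, out x)
            || !int.TryParse(parts[1], NumberStyles.Integer, CultureInfo.InvariantCulture, out y))
        {
            return false;
        }
        result = new Point2D(x, y);
        return true;
    }

    public static Point2D Parse(string text)
    {
        Point2D result;
        if (!TryParse(text, out result)) throw new FormatException("Expected \"x,y\".");
        return result;
    }
}
```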
Also, as Robbie said, overriding ToString allows you to call ToString on a reference as basic as type object.
You can use that when you have an object without an intuitive string representation, like a person. So if you need, for example, to print this person, you can use this override to prepare its format.
The Object.ToString method should be used for debugging purposes only. The default implementation shows the object type name, which is not very useful. Consider overriding this method to provide better information for diagnostics and debugging. Keep in mind that logging infrastructures often use the ToString method as well, so you will find these text fragments in your log files.
Do not return localized text resources within the Object.ToString method. The reason is that the ToString method should always return something that the developer can understand. The developer might not speak all languages which the application supports.
Implement the IFormattable interface when you want to return a user-friendly localized text. This interface defines a ToString overload with the parameters format and formatProvider. The formatProvider helps you to format the text in a culture aware way.
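A sketch of that split: a plain ToString() for developers and an IFormattable implementation for culture-aware output. Temperature and its format codes "C" and "F" are hypothetical; IFormattable.ToString(string, IFormatProvider) is the real interface member.

```csharp
using System;
using System.Globalization;

public sealed class Temperature : IFormattable
{
    public double Celsius { get; }

    public Temperature(double celsius)
    {
        Celsius = celsius;
    }

    // Developer-facing default: fixed format, never localized.
    public override string ToString()
    {
        return ToString("C", CultureInfo.InvariantCulture);
    }

    // User-facing formatting: the caller chooses the format and the culture.
    public string ToString(string format, IFormatProvider formatProvider)
    {
        if (string.IsNullOrEmpty(format)) format = "C";
        if (formatProvider == null) formatProvider = CultureInfo.CurrentCulture;
        switch (format.ToUpperInvariant())
        {
            case "C":
                return Celsius.ToString("F1", formatProvider) + " C";
            case "F":
                return (Celsius * 9 / 5 + 32).ToString("F1", formatProvider) + " F";
            default:
                throw new FormatException("Unknown format: " + format);
        }
    }
}
```

Because the type implements IFormattable, composite formatting such as string.Format(culture, "{0:F}", temperature) routes through the second method with the given format string and culture.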
See also: Object.ToString and IFormattable
Simpler depends on how your property is going to be used. If you just need to format the string one time then it does not make that much sense overriding it.
However, it appears that you are overriding the ToString method not to return the raw string data for your property, but to apply a standard formatting pattern, since you are using string.Format with padding.
Because you said you are learning, the exercise appears to also hit on core principles in object oriented programming relating to encapsulation and code re-use.
The string.format taking the arguments you have set for padding ensures that the property will be formatted the same way each time for any code that calls it. As well, going forward you only have to change it in one place instead of many.
Great question and also some great answers!
I find it useful to override the ToString method on entity classes, as it helps to quickly identify issues in testing: when an assertion fails, the test console will invoke the ToString method on the object.
But in agreement with what has been said before it's to give a human readable representation of the object in question.
I'm happy with the Framework Guidelines referred to in other answers.
However, I'd like to emphasize the display and debugging purposes.
Be careful about how you use ToString in your code. Your code shouldn't rely on the string representation of an object. If it does, you should absolutely provide the respective Parse methods.
Since ToString can be used everywhere, it can be a maintainability pain point if you want to change the string representation of an object at a later point in time. You cannot just examine the call hierarchy in this case to study whether some code will break.

dynamic and generics in C#

As discovered in C# 3.5, the following would not be possible due to type erasure: -
int foo<T>(T bar)
{
    return bar.Length; // will not compile unless I do something like where T : string
}
foo("baz");
I believe the reason this doesn't work is in C# and java, is due to a concept called type erasure, see http://en.wikipedia.org/wiki/Type_erasure.
Having read about the dynamic keyword, I wrote the following: -
int foo<T>(T bar)
{
    dynamic test = bar;
    return test.Length;
}
foo("baz"); // will compile and return 3
So, as far as I understand, dynamic will bypass compile time checking but if the type has been erased, surely it would still be unable to resolve the symbol unless it goes deeper and uses some kind of reflection?
Is using the dynamic keyword in this way bad practice and does this make generics a little more powerful?
dynamics and generics are 2 completely different notions. If you want compile-time safety and speed use strong typing (generics or just standard OOP techniques such as inheritance or composition). If you do not know the type at compile time you could use dynamics but they will be slower because they are using runtime invocation and less safe because if the type doesn't implement the method you are attempting to invoke you will get a runtime error.
The 2 notions are not interchangeable and depending on your specific requirements you could use one or the other.
Of course, the following generic constraint does not even compile, because string is a sealed type and cannot be used as a generic constraint:
int foo<T>(T bar) where T : string
{
return bar.Length;
}
you'd rather have this:
int foo(string bar)
{
return bar.Length;
}
"I believe the reason this doesn't work in C# and Java is a concept called type erasure; see http://en.wikipedia.org/wiki/Type_erasure."
No, this isn't because of type erasure. Anyway there is no type erasure in C# (unlike Java): a distinct type is constructed by the runtime for each different set of type arguments, there is no loss of information.
The reason why it doesn't work is that the compiler knows nothing about T, so it can only assume that T inherits from object, so only the members of object are available. You can, however, provide more information to the compiler by adding a constraint on T. For instance, if you have an interface IBar with a Length property, you can add a constraint like this:
int foo<T>(T bar) where T : IBar
{
return bar.Length;
}
But if you want to be able to pass either an array or a string, it won't work, because the Length property isn't declared in any interface implemented by both String and Array...
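The claim that the runtime keeps a distinct type per set of type arguments can be checked directly. A small sketch follows; ErasureDemo and its method names are made up for the illustration, while typeof and List<T> are the standard ones.

```csharp
using System;
using System.Collections.Generic;

public static class ErasureDemo
{
    // T is still known at run time, so its name can be read back.
    public static string NameOf<T>() => typeof(T).Name;

    // List<int> and List<string> are two different runtime types.
    public static bool ListsDiffer() =>
        typeof(List<int>) != typeof(List<string>);
}
```

In Java the equivalent comparison of the two list classes would report the same class, which is what erasure means there.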
No, C# does not have type erasure; Java does.
But if you specify only T, without any constraint, you cannot use obj.Length because T can be virtually anything.
foo(new Bar());
The above would resolve T to the Bar class, and thus the Length property might not be available.
You can only use methods on T when you ensure that T really has those methods. (This is done with where constraints.)
With dynamic, you lose compile-time checking, and I suggest that you do not use it for hacking around generics.
In this case you would not benefit from dynamic in any way. You just delay the error, as an exception is thrown if the dynamic object does not contain a Length property. When accessing the Length property in a generic method, I can't see any reason for not constraining it to types which definitely have this property.
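A rough sketch of what "you just delay the error" looks like in practice. Foo mirrors the question's method; FailsAtRuntime is a hypothetical helper added here only to observe the failure, and RuntimeBinderException is the exception type the C# runtime binder throws when a member cannot be found.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public static class DynamicDemo
{
    // Compiles for any T; the Length lookup is deferred to run time.
    public static int Foo<T>(T bar)
    {
        dynamic test = bar;
        return test.Length;
    }

    // Returns true if the call fails at run time for the given argument.
    public static bool FailsAtRuntime<T>(T bar)
    {
        try { Foo(bar); return false; }
        catch (RuntimeBinderException) { return true; }
    }
}
```

Foo("baz") works, but Foo(42) also compiles and only fails once it runs, which is exactly the check a constraint would have done at compile time.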
"Dynamics are a powerful new tool that make interop with dynamic languages as well as COM easier, and can be used to replace much turgid reflective code. They can be used to tell the compiler to execute operations on an object, the checking of which is deferred to runtime.
The great danger lies in the use of dynamic objects in inappropriate contexts, such as in statically typed systems, or worse, in place of an interface/base class in a properly typed system."
Quoted from article
Thought I'd weigh in on this one, because no one clarified how generics work "under the hood". The notion of an unconstrained T being treated as an object is mentioned above and is quite clear. What is not talked about is what happens when we compile C# or VB or any other supported language to Intermediate Language (IL), which is more akin to an assembly language or the equivalent of Java bytecode. Unlike Java bytecode, IL does keep generics: a generic type such as List<T> is compiled once, and its type parameters are recorded in the IL and metadata. So how are the concrete versions produced? At run time, when the JIT compiler first meets a particular instantiation, it substitutes the actual type for T and generates code for it. Each value type argument (int, double, a struct) gets its own specialized native code, while all reference type arguments share one copy. To be clear, a MyList<T> used as new MyList<string>() is not replaced in IL by something like MyList_string(); it stays a reference to the generic MyList instantiated over string.
That's my (limited) understanding of what's going on. The point being, the type checking is done at compile time and the specialization cost is paid once per instantiation, so there is no boxing or casting overhead at run time, which is again why generics are so loved and used everywhere by .NET developers.
On the downside? The output assembly (EXE or DLL) does not grow with the number of instantiations, but each distinct value type instantiation does cost some JIT time and memory in the running process. In practice I doubt you'll ever consider that to be a problem.

When to use run-time type information?

If I have various subclasses of something, and an algorithm which operates on instances of those subclasses, and if the behaviour of the algorithm varies slightly depending on what particular subclass an instance is, then the most usual object-oriented way to do this is using virtual methods.
For example if the subclasses are DOM nodes, and if the algorithm is to insert a child node, that algorithm differs depending on whether the parent node is a DOM element (which can have children) or DOM text (which can't): and so the insertChildren method may be virtual (or abstract) in the DomNode base class, and implemented differently in each of the DomElement and DomText subclasses.
Another possibility is to give the instances a common property, whose value can be read: for example the algorithm might read the nodeType property of the DomNode base class; or for another example, you might have different types (subclasses) of network packet, which share a common packet header, and you can read the packet header to see what type of packet it is.
I haven't used run-time-type information much, including:
The is and as keywords in C#
Downcasting
The Object.GetType method in .NET
The typeid operator in C++
When I'm adding a new algorithm which depends on the type of subclass, I tend instead to add a new virtual method to the class hierarchy.
My question is, when is it appropriate to use run-time-type information, instead of virtual functions?
When there's no other way around it. Virtual methods are always preferred, but sometimes they just can't be used. There are a couple of reasons why this could happen, but the most common one is that you don't have the source code of the classes you want to work with, or you can't change them. This often happens when you work with a legacy system or with a closed-source commercial library.
In .NET it might also happen that you have to load new assemblies on the fly, like plugins, and you generally have no base classes but have to use something like duck typing.
In C++, among some other obscure cases (which mostly deal with inferior design choices), RTTI is a way to implement so-called multi methods.
These constructions ("is" and "as") are very familiar to Delphi developers, since event handlers usually receive objects as a common ancestor type. For example, the OnClick event passes a single argument, Sender: TObject, regardless of the type of the object, whether it is a TButton, a TListBox or any other. If you want to know something more about this object you have to access it through "as", and in order to avoid an exception you can check it with "is" first. This downcasting allows design-time binding of objects and methods that would not be possible with strict class type checking. Imagine you want to do the same thing whether the user clicks a Button or a ListBox: if they provided different function prototypes, it would not be possible to bind them to the same procedure.
In the more general case, an object can call a function that notifies, for example, that the object has changed. In doing so it leaves the receiver the possibility, but not the obligation, to get to know it "personally" (through as and is). It does this by passing self as the most common ancestor of all objects (TObject in Delphi's case).
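The same sender pattern translates to C# roughly as below. This is only a sketch: Control, Button and ListBox here are hypothetical stand-ins declared in the snippet, not the real UI toolkit classes, and OnClick returns a string just so the result is easy to inspect.

```csharp
using System;

// Hypothetical control types standing in for a UI toolkit's classes.
public class Control { public string Name = ""; }
public class Button : Control { public string Caption = ""; }
public class ListBox : Control { public int SelectedIndex = -1; }

public static class Handlers
{
    // One handler bound to several controls; sender arrives as the common ancestor.
    public static string OnClick(object sender)
    {
        // "is" with a pattern tests and casts in one step (C# 7 and later).
        if (sender is Button button)
            return "Button: " + button.Caption;

        // "as" yields null instead of throwing when the cast fails.
        ListBox list = sender as ListBox;
        if (list != null)
            return "ListBox index: " + list.SelectedIndex;

        return "Unknown sender";
    }
}
```

The handler only looks at the concrete type when it needs something beyond what the common ancestor offers, which matches the "possibility, not obligation" point above.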
dynamic_cast<>, if I remember correctly, depends on RTTI. Some obscure outer interfaces might also rely on RTTI when an object is passed through a void pointer (for whatever reason that might happen).
That being said, I haven't seen typeid() in the wild in 10 years of pro C++ maintenance work. (Luckily.)
You can refer to More Effective C# for a case where run-time type checking is OK.
Item 3. Specialize Generic Algorithms Using Runtime Type Checking
You can easily reuse generics by simply specifying new type parameters. A new instantiation with new type parameters means a new type having similar functionality.
All this is great, because you write less code. However, sometimes being more generic means not taking advantage of a more specific, but clearly superior, algorithm. The C# language rules take this into account. All it takes is for you to recognize that your algorithm can be more efficient when the type parameters have greater capabilities, and then to write that specific code. Furthermore, creating a second generic type that specifies different constraints doesn't always work. Generic instantiations are based on the compile-time type of an object, and not the runtime type. If you fail to take that into account, you can miss possible efficiencies.
For example, suppose you write a class that provides a reverse-order enumeration on a sequence of items represented through IEnumerable<T>. In order to enumerate it backwards you may iterate it, copy the items into an intermediate collection with indexer access, like List<T>, and then enumerate that collection backwards using the indexer. But if your original IEnumerable<T> is an IList<T>, why not take advantage of that and provide a more performant way (without copying to an intermediate collection) to iterate the items backwards? So basically it is a special case we can take advantage of while still providing the same behavior (iterating the sequence backwards).
But in general you should carefully consider run-time type checking and ensure that it doesn't violate the Liskov Substitution Principle.
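The reverse-enumeration case described above could be sketched like this. It is not the book's code; ReverseExtensions and InReverse are names chosen for the illustration.

```csharp
using System.Collections.Generic;

public static class ReverseExtensions
{
    // Enumerates a sequence backwards, taking a faster path when the
    // run-time type supports indexed access.
    public static IEnumerable<T> InReverse<T>(this IEnumerable<T> source)
    {
        // Run-time type check: no intermediate copy is needed for IList<T>.
        IList<T> list = source as IList<T>;
        if (list == null)
            list = new List<T>(source); // general path: copy once

        for (int i = list.Count - 1; i >= 0; i--)
            yield return list[i];
    }
}
```

Callers see the same behavior either way, so the type check stays an implementation detail and substitutability is not affected.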

What are the schools of OOP? [closed]

Are there philosophical differences between Smalltalk OOP and Simula OOP ?
This is a question related to Java & C# vs C++ indirectly. As I understand, C++ is based on Simula but Java and C# are more or less from the Smalltalk family.
There are several key differences in 'style' within the broader OOP banner.
In all cases a statement about a static or dynamic type system means predominantly one or the other; the issue is far from clear cut or clearly defined.
Also, many languages choose to blur the line between the choices, so this is not a list of binary choices by any means.
Polymorphic late binding
or "what does foo.Bar(x) mean?"
1. Hierarchy of types is flattened to a specific implementation per instance (often done via a vtable), often allowing explicit reference to the base class's implementation.
Conceptually you look at the most specific type that foo is at the call site. If it has an implementation of Bar for the parameter x, that is called; if not, the parent of foo is chosen and the process repeated.
Examples: C++/Java/C#, "Simula style" is often used.
2. Pure message passing. The code in foo which handles messages 'named' "Bar" is asked to accept the x. Only the name matters, not any assumptions the call site may have had about exactly what Bar was meant to be. Contrast this with the previous style, in which Bar was known to be a method defined somewhere in the type hierarchy known at compile time (though the precise place in the hierarchy is left until runtime).
Examples: Objective-C/Ruby, "Smalltalk style" is often used.
Style 1 is often used in statically typed languages, where the absence of such an implementation is an error caught at compile time. Further, these languages often differentiate between Bar(x) and Bar(y) if x and y are different types. This is method overloading, and the resulting methods with the same name are viewed as entirely different.
Style 2 is often used in dynamic languages (which tend to avoid method overloading). As such it is possible that, at runtime, the type of foo has no 'handler' for the message named 'Bar'; different languages handle this in different ways.
Both can be implemented behind the scenes in the same fashion if desired (often the default for the second, Smalltalk style is to invoke a function but this is not made a defined behaviour in all cases).
Since the former method can frequently be easily implemented as simple pointer offset function calls it can, more easily, be made relatively fast. This does not mean that the other styles cannot also be made fast, but more work may be required to ensure that the greater flexibility is not compromised when doing so.
Inheritance/Reuse
or "Where do babies come from?"
1. Class based
Method implementations are organized into groups called classes. When implementation inheritance is desired a class is defined which extends the parent class. In this way it gains all exposed aspects of the parent (both fields and methods) and can choose to alter certain/all of those aspects but cannot remove any. You can add and update but not delete.
Examples: C++/Java/C# (note both Smalltalk and Simula use this)
2. Prototype based
Any instance of an object is simply a collection of identified methods (normally identified by name) and state in the form of (again named) fields. Whenever a new instance of this 'type' is desired an existing instance can be used to clone a new one. This new instance retains a copy of the state and methods of the previous instance but can then be modified to remove, add or alter existing named fields and methods.
Examples: Self/JavaScript
Again 1 tends to happen in static languages, 2 in dynamic though this is by no means a requirement they simply lend themselves to the style.
Interface or Class based
or "what or how?"
Interfaces list the methods that are required. They are a contract.
Examples: VB6
Classes list methods that are required but may optionally supply their implementation.
Examples: Simula
This is very much not a binary choice. Most class-based languages allow the concept of abstract methods (ones with no implementation yet). If you have a class where all methods are abstract (called pure virtual in C++) then what the class amounts to is pretty much an interface, albeit one that may also have defined some state (fields). A true interface should have no state (since it defines only what is possible, not how it happens).
Only older OOP languages tend to rely solely on one or the other.
VB6 has only interfaces and no implementation inheritance.
Simula lets you declare pure virtual classes, but you can still instantiate them (with runtime errors on use).
Single or Multiple Inheritance
or "Who is the daddy?"
Single
Only one type can be a parent to another. In the Class based form above you can extend (take implementation from) only one type. Typically this form includes the concept of interfaces as first class aspects of the language to make up for this.
advantages include cleaner metadata and introspection, simpler language rules.
complications include making it harder to bring useful methods into scope (things like MixIns and Extension methods seek to mitigate this sort of problem)
Examples: C#/Java
Multiple - you can extend multiple classes
advantages include certain structures are easier to model and design
complications include complex rules for collision resolution, especially when overloaded methods exist which could take either parent type.
Examples: C++/Eiffel
This question provokes considerable debate, especially as it is a key differentiator between C++'s OOP implementation and many of the modern statically typed languages perceived as possible successors, like C# and Java.
Mutability
or "what do you want to do to me?"
Mutable
Objects, once created can have their state changed.
Immutable
Objects, once created, cannot be changed.
Frequently this is not all or nothing; it is simply a default (most commonly used OOP languages are mutable by default). This can have a great deal of effect on how the language is structured. Many primarily functional languages which have included OOP features default the objects to have immutable state.
'Pureness' of their OOP
or "Is everything an Object?"
Absolutely everything in the system is viewed as an object (possibly even down to the methods themselves which are simply another kind of object and can be interacted with in the same way other objects can be).
Examples: Smalltalk
Not everything is an object, you cannot pass messages to everything (though the system might jump through hoops to make it seem like you can)
Examples: C++/C#/Java (see note*)
This is quite complex, since techniques like autoboxing of primitives make it seem like everything is an object, but several boundary cases exist where this 'compiler magic' is discovered and the proverbial Wizard of Oz is found behind the curtain, resulting in problems or errors.
In languages with immutability as a default this is less likely to happen, since the key aspect of objects (that they contain both methods and state) means that things that are similar to objects but not quite have less possibility for complications.
In regard to Java/C#, the autoboxing (or boxing, in C#) system lets you treat any variable, syntactically, as if it were an object, but in actuality this is not the case. This shows in areas such as attempting to lock on a boxed value (rejected by the compiler as it would be an obvious bug).
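One place where the curtain slips in C# can be shown in a few lines. This is a minimal sketch; BoxingDemo and its method names are invented for the illustration.

```csharp
using System;

public static class BoxingDemo
{
    // Each conversion of a value type to object allocates a new box,
    // so two boxes of the same variable are different objects.
    public static bool BoxesAreSameObject()
    {
        int value = 42;
        object first = value;   // boxes a copy
        object second = value;  // boxes another copy
        return object.ReferenceEquals(first, second);
    }

    // Changing the original variable does not affect a box made earlier.
    public static int BoxKeepsOldValue()
    {
        int value = 1;
        object box = value;
        value = 2;
        return (int)box;
    }

    // lock (value) { }  // would not compile for an int: lock needs a reference type
}
```

Because every boxing operation yields a fresh object, a lock taken on a boxed value could never be shared between two callers, which is presumably why the compiler refuses it outright.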
Static or Dynamic
or "Who do you think you are?"
A far more pervasive aspect of language design and not one to get into here but the choices inherent in this decision impact many aspects of OOP as mentioned earlier.
Just aspects of the polymorphic late binding can depend on:
The type of the object to whom the message is being passed (at compile time/run time)
The type of the parameter(s) which are being passed (at compile time/run time)
The more dynamic a language gets the more complex these decisions tend to become but conversely the more input the language user, rather than the language designer has in the decision.
Giving examples here would be somewhat foolhardy, since statically typed languages may be modified to include dynamic aspects (like C# 4.0).
I'd put Java and C# in the Simula camp as well:
Smalltalk, being dynamically typed, is quite apart from the four other languages you cite.
Smalltalk is structurally typed (alias duck typing) while the other four are nominally typed.
(What Java and C# have in common with Smalltalk is being mainly based on a VM, but there is little influence on the programming style).
Java and C# are definitely not from the Smalltalk family. Alan Kay even said that when he created OOP he did not have anything like Java or C++ in mind. Java, C#, and C++ all interpret OOP in pretty much the same way.
Languages like Smalltalk and Ruby have a radically different model that is based on message passing. In C++ classes are essentially namespaces for methods and state. Method invocations are bound at compile time. Smalltalk does not bind a "method call" until runtime. The result of this is that in C++
foo->bar
is compiled to mean "call the bar method on the foo object." If bar is non virtual, I'd imagine that the address of the bar method is specifically referenced.
In Smalltalk
foo bar
means "send the message bar to the foo object." foo can do whatever it wants with this message when it arrives. The default behavior is to call the method named bar, but that is not required. This property is exploited in Ruby for ActiveRecord column accessors. When you have an ActiveRecord object and you send it the name of a column in its database table as a message, if there is no method with that name defined, it checks to see if there is a column by that name on the table and if there is returns the value.
Message passing might seem like a tiny, irrelevant detail, but out of it, the rest of OOP easily flows.
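For readers coming from C#, the closest built-in analogue to "the receiver decides what to do with the message" is probably DynamicObject. The sketch below is only loosely modelled on the ActiveRecord behavior described above; Record, RecordDemo and the "Title" column are hypothetical, while DynamicObject.TryGetMember is the standard hook in System.Dynamic.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Hypothetical record type: answers "messages" named after its columns.
public class Record : DynamicObject
{
    private readonly Dictionary<string, object> columns;

    public Record(Dictionary<string, object> columns) { this.columns = columns; }

    // Called by the runtime binder when no real member matches the name.
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        return columns.TryGetValue(binder.Name, out result);
    }
}

public static class RecordDemo
{
    public static object ReadTitle()
    {
        dynamic row = new Record(new Dictionary<string, object> { ["Title"] = "Dune" });
        return row.Title; // resolved at run time via TryGetMember
    }
}
```

Record declares no Title member anywhere; the name only acquires a meaning when the object receives it, which is the essence of the Smalltalk-style model.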
"OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I'm not aware of them." -- Alan Kay, creator of Smalltalk
Eiffel is a statically typed, compiled, multiple inheritance pure OOP language.
http://dev.eiffel.com
Of the modern (and I use the term lightly) OO programming languages, Objective-C is the most like Smalltalk.
Messages:
In C++, C# and Java: messages are bound at compile time.
You can think of a method call as a message being sent to the object.
In Objective-C and Smalltalk: messages are bound at run time.
I would say statically typed and dynamically typed OOP are two separate disciplines within the same school of OOP.
Java, C#, and C++ all follow a similar OOP strategy. It is based on function calls that are bound at compile time. Depending on the call, either the direct function call or an offset into a vtable is fixed when compilation happens. By contrast Smalltalk's OOP is based on message passing. Conceptually every method call is a message to the receiving object asking whether it has a method called "Foo."
Smalltalk has no concept of interfaces. It only has similar looking methods. In the C++ group of languages, everything is bound to interfaces. One cannot implement AddRef and Release without also implementing QueryInterface (even if it is just a stub) because they are all part of the IUnknown interface. In Smalltalk, there is no IUnknown. There is only a collection of 3 functions, any of which could be implemented or not.
I'd say there is also a pretty big difference, conceptually, between class-based OOP (of which Smalltalk, Simula, C# and Java are all examples) and prototype-based OOP (which started with Self and is most widespread in JavaScript).
Aside from the above points, there is also a conceptual breakdown of Smalltalk vs. Simula.
Conceptually, "Smalltalk-style" typically indicates that the method run when a message is called is determined at run time, aiding polymorphism.
"Simula-style", on the other hand, usually seems to indicate where all method calls are really just a convenient way of writing overloaded function calls--no runtime polymorphism. (Please correct me if I'm wrong.)
In the middle, we have Java: all methods virtual by default, but statically typed and has compile-time type dispatch.
Example:
// C++
#include <iostream>
using namespace std;
class Base {
public:
    void doSomething() { // non-virtual on purpose
        cout << "Base::doSomething() called!\n";
    }
};
class Derived : public Base {
public:
    void doSomething() { // hides Base::doSomething, does not override it
        cout << "Derived::doSomething() called!\n";
    }
};
int main() {
    Base* b = new Base();
    Derived* d = new Derived();
    b->doSomething(); // prints "Base::doSomething() called!"
    d->doSomething(); // prints "Derived::doSomething() called!"
    Base* d2 = d; // OK; Liskov substitution principle.
    d2->doSomething(); // prints "Base::doSomething() called!" (!)
    delete b;
    delete d;
    return 0;
}
VS:
// Objective-C (manual retain/release, i.e. compiled without ARC)
//Base.h
#import <Foundation/Foundation.h>
@interface Base : NSObject
-(void)doSomething;
@end
//Base.m
#import "Base.h"
@implementation Base
-(void) doSomething {
    printf("doSomething sent to Base!\n");
}
@end
//Derived.h
#import "Base.h"
@interface Derived : Base
@end
//Derived.m
#import "Derived.h"
@implementation Derived
-(void) doSomething {
    printf("doSomething sent to Derived!\n");
}
@end
//Main.m
#import "Base.h"
#import "Derived.h"
int main() {
    Base* b = [[Base alloc] init];
    Derived* d = [[Derived alloc] init];
    [b doSomething]; // prints "doSomething sent to Base!"
    [d doSomething]; // prints "doSomething sent to Derived!"
    Base* d2 = d;
    [d2 doSomething]; // prints "doSomething sent to Derived!"
    [b release];
    [d release];
    return 0;
}
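Since the question is indirectly about C#, it may help to see that C# can express both behaviors from the two listings above, depending on whether the method is virtual. This is a small sketch; the Plain, Dispatched and DispatchDemo names are made up for the comparison.

```csharp
using System;

public class Base
{
    // Non-virtual: the call is chosen from the compile-time type.
    public string Plain() => "Base.Plain";

    // Virtual: the call is chosen from the run-time type.
    public virtual string Dispatched() => "Base.Dispatched";
}

public class Derived : Base
{
    public new string Plain() => "Derived.Plain";                // hides, does not override
    public override string Dispatched() => "Derived.Dispatched"; // overrides
}

public static class DispatchDemo
{
    public static string Run()
    {
        Base d2 = new Derived();
        return d2.Plain() + " / " + d2.Dispatched();
    }
}
```

Run() yields "Base.Plain / Derived.Dispatched": the non-virtual call behaves like the C++ listing, the virtual one like the Objective-C listing, although both are still checked against the static type at compile time.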

Best practice of using the "out" keyword in C#

I'm trying to formalise the usage of the "out" keyword in c# for a project I'm on, particularly with respect to any public methods. I can't seem to find any best practices out there and would like to know what is good or bad.
Sometimes I'm seeing some methods signatures that look like this:
public decimal CalcSomething(Date start, Date end, out int someOtherNumber){}
At this point, it's just a feeling, this doesn't sit well with me. For some reason, I'd prefer to see:
public Result CalcSomething(Date start, Date end){}
where the result is a type that contains a decimal and the someOtherNumber. I think this makes it easier to read. It allows Result to be extended or have properties added without breaking code. It also means that the caller of this method doesn't have to declare a locally scoped "someOtherNumber" before calling. From usage expectations, not all callers are going to be interested in "someOtherNumber".
As a contrast, the only instances that I can think of right now within the .NET Framework where "out" parameters make sense are in methods like TryParse(). These actually make the caller write simpler code, whereby the caller is primarily going to be interested in the out parameter.
int i;
if(int.TryParse("1", out i)){
DoSomething(i);
}
I'm thinking that "out" should only be used if the return type is bool and the expected usages are where the "out" parameters will always be of interest to the caller, by design.
Thoughts?
There is a reason that one of the static code analysis (=FxCop) rules points at you when you use out parameters. I'd say: only use out when really needed in interop type scenarios. In all other cases, simply do not use out. But perhaps that's just me?
This is what the .NET Framework Developer's Guide has to say about out parameters:
Avoid using out or reference parameters.
Working with members that define out or reference parameters requires that the developer understand pointers, subtle differences between value types and reference types, and initialization differences between out and reference parameters.
But if you do use them:
Do place all out parameters after all of the pass-by-value and ref parameters (excluding parameter arrays), even if this results in an inconsistency in parameter ordering between overloads.
This convention makes the method signature easier to understand.
Your approach is better than out, because you can "chain" calls that way:
DoSomethingElse(DoThing(a,b).Result);
as opposed to
DoThing(a, out b);
DoSomethingElse(b);
Implementing the TryParse methods with "out" was a mistake, IMO. Those would have been very convenient in chains.
There are only very few cases where I would use out. One of them is if your method returns two variables that from an OO point of view do not belong into an object together.
If, for example, you want to get the most common word in a text string, and the 42nd word in the text, you could compute both in the same method (having to parse the text only once). But for your application, these pieces of information have no relation to each other: you need the most common word for statistical purposes, but you only need the 42nd word because your customer is a geeky Douglas Adams fan.
Yes, that example is very contrived, but I haven't got a better one...
I just had to add that starting from C# 7, the use of the out keyword makes for very readable code in certain instances, when combined with inline variable declaration. While in general you should rather return a (named) tuple, control flow becomes very concise when a method has a boolean outcome, like:
if (int.TryParse(mightBeCount, out var count))
{
// Successfully parsed count
}
I should also mention, that defining a specific class for those cases where a tuple makes sense, more often than not, is more appropriate. It depends on how many return values there are and what you use them for. I'd say, when more than 3, stick them in a class anyway.
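To put the two shapes side by side, here is a rough sketch. The body of CalcSomething is a placeholder invented for the example (the question never says what it computes), DateTime stands in for the question's Date type, and Describe is a hypothetical helper.

```csharp
using System;

public static class Calc
{
    // Hypothetical stand-in for CalcSomething: returns both values as a named tuple.
    public static (decimal Total, int SomeOtherNumber) CalcSomething(DateTime start, DateTime end)
    {
        int days = (end - start).Days;   // placeholder computation
        return (days * 1.5m, days);
    }

    // Boolean outcome plus a value: the shape where "out var" reads well.
    public static string Describe(string mightBeCount)
    {
        if (int.TryParse(mightBeCount, out var count))
            return "count = " + count;
        return "not a number";
    }
}
```

A caller that only wants the decimal can write Calc.CalcSomething(a, b).Total and ignore the rest, which is the chaining benefit mentioned earlier in the thread.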
One advantage of out is that the compiler will verify that CalcSomething does in fact assign a value to someOtherNumber. It will not verify that the someOtherNumber field of Result has a value.
Stay away from out. It's there as a low-level convenience. But at a high level, it's an anti-technique.
int? i = Util.TryParseInt32("1");
if(i == null)
return;
DoSomething(i.Value);
If you have ever seen and worked with MS
namespace System.Web.Security
MembershipProvider
public abstract MembershipUser CreateUser(string username, string password, string email, string passwordQuestion, string passwordAnswer, bool isApproved, object providerUserKey, out MembershipCreateStatus status);
You will need a bucket. This is an example of a class breaking many design paradigms. Awful!
Just because the language has out parameters doesn't mean they should be used (e.g. goto).
The use of out looks more like the developer was either too lazy to create a type or wanted to try a language feature.
Even for the completely contrived MostCommonAnd42ndWord example above I would use a List or a new type ContrivedResult with 2 properties.
The only good reason I've seen in the explanations above was interop scenarios, when forced to. Assuming that is a valid statement.
You could create a generic tuple class for the purpose of returning multiple values. This seems to be a decent solution but I can't help but feel that you lose a bit of readability by returning such a generic type (Result is no better in that regard).
One important point, though, that James Curran also pointed out, is that the compiler enforces an assignment of the value. This is a general pattern I see in C#, that you must state certain things explicitly, for more readable code. Another example of this is the override keyword, which you don't have in Java.
If your result is more complex than a single value, you should, if possible, create a result object. The reasons I have to say this?
The entire result is encapsulated. That is, you have a single package that informs the code of the complete result of CalcSomething. Instead of having external code interpret what the decimal return value means, you can name the properties for your previous return value, your someOtherNumber value, etc.
You can include more complex success indicators. The function call you wrote might throw an exception if end comes before start, and with that signature throwing an exception is the only way to report errors. Using a result object, you can include a boolean or enumerated "Success" value, with appropriate error reporting.
You can delay the execution of the result until you actually examine the "result" field. That is, the execution of any computing needn't be done until you use the values.
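The three points above might come together in something like the following. It is a sketch under assumptions: CalcResult, its members and the 1.5 multiplier are all invented for illustration, and Lazy<T> is just one way to defer the computation.

```csharp
using System;

// Hypothetical result type for CalcSomething; all names are illustrative.
public sealed class CalcResult
{
    private readonly Lazy<decimal> amount;

    public bool Success { get; }
    public string Error { get; }
    public int SomeOtherNumber { get; }

    // Computed on first access, not when the result is created.
    public decimal Amount => amount.Value;

    private CalcResult(bool success, string error, int other, Func<decimal> compute)
    {
        Success = success;
        Error = error;
        SomeOtherNumber = other;
        amount = new Lazy<decimal>(compute);
    }

    public static CalcResult CalcSomething(DateTime start, DateTime end)
    {
        // Report the bad range through the result instead of throwing.
        if (end < start)
            return new CalcResult(false, "end precedes start", 0, () => 0m);

        int days = (end - start).Days;
        return new CalcResult(true, "", days, () => days * 1.5m); // placeholder formula
    }
}
```

The trade-off is one extra type to maintain, in exchange for named values, a non-exception error path, and work that is skipped when nobody reads Amount.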
