When planning out my programs, I often start with a chain of thought like so:
A football team is just a list of football players. Therefore, I should represent it with:
var football_team = new List<FootballPlayer>();
The ordering of this list represents the order in which the players are listed in the roster.
But I realize later that teams also have other properties, besides the mere list of players, that must be recorded. For example, the running total of scores this season, the current budget, the uniform colors, a string representing the name of the team, etc.
So then I think:
Okay, a football team is just like a list of players, but additionally, it has a name (a string) and a running total of scores (an int). .NET does not provide a class for storing football teams, so I will make my own class. The most similar and relevant existing structure is List<FootballPlayer>, so I will inherit from it:
class FootballTeam : List<FootballPlayer>
{
    public string TeamName;
    public int RunningTotal;
}
But it turns out that a guideline says you shouldn't inherit from List<T>. I'm thoroughly confused by this guideline in two respects.
Why not?
Apparently List is somehow optimized for performance. How so? What performance problems will I cause if I extend List? What exactly will break?
Another reason I've seen is that List is provided by Microsoft, and I have no control over it, so I cannot change it later, after exposing a "public API". But I struggle to understand this. What is a public API and why should I care? If my current project does not and is not likely to ever have this public API, can I safely ignore this guideline? If I do inherit from List and it turns out I need a public API, what difficulties will I have?
Why does it even matter? A list is a list. What could possibly change? What could I possibly want to change?
And lastly, if Microsoft did not want me to inherit from List, why didn't they make the class sealed?
What else am I supposed to use?
Apparently, for custom collections, Microsoft has provided a Collection class which should be extended instead of List. But this class is very bare, and does not have many useful things, such as AddRange, for instance. jvitor83's answer provides a performance rationale for that particular method, but how is a slow AddRange not better than no AddRange?
Inheriting from Collection is way more work than inheriting from List, and I see no benefit. Surely Microsoft wouldn't tell me to do extra work for no reason, so I can't help feeling like I am somehow misunderstanding something, and inheriting Collection is actually not the right solution for my problem.
I've seen suggestions such as implementing IList. Just no. This is dozens of lines of boilerplate code which gains me nothing.
Lastly, some suggest wrapping the List in something:
class FootballTeam
{
public List<FootballPlayer> Players;
}
There are two problems with this:
It makes my code needlessly verbose. I must now call my_team.Players.Count instead of just my_team.Count. Thankfully, with C# I can define indexers to make indexing transparent, and forward all the methods of the internal List... But that's a lot of code! What do I get for all that work?
It just plain doesn't make any sense. A football team doesn't "have" a list of players. It is the list of players. You don't say "John McFootballer has joined SomeTeam's players". You say "John has joined SomeTeam". You don't add a letter to "a string's characters", you add a letter to a string. You don't add a book to a library's books, you add a book to a library.
I realize that what happens "under the hood" can be said to be "adding X to Y's internal list", but this seems like a very counter-intuitive way of thinking about the world.
My question (summarized)
What is the correct C# way of representing a data structure, which, "logically" (that is to say, "to the human mind") is just a list of things with a few bells and whistles?
Is inheriting from List<T> always unacceptable? When is it acceptable? Why/why not? What must a programmer consider, when deciding whether to inherit from List<T> or not?
There are some good answers here. I would add to them the following points.
What is the correct C# way of representing a data structure, which, "logically" (that is to say, "to the human mind") is just a list of things with a few bells and whistles?
Ask any ten non-computer-programmer people who are familiar with the existence of football to fill in the blank:
A football team is a particular kind of _____
Did anyone say "list of football players with a few bells and whistles", or did they all say "sports team" or "club" or "organization"? Your notion that a football team is a particular kind of list of players is in your human mind and your human mind alone.
List<T> is a mechanism. Football team is a business object -- that is, an object that represents some concept that is in the business domain of the program. Don't mix those! A football team is a kind of team; it has a roster, a roster is a list of players. A roster is not a particular kind of list of players. A roster is a list of players. So make a property called Roster that is a List<Player>. And make it IReadOnlyList<Player> while you're at it, unless you believe that everyone who knows about a football team gets to delete players from the roster.
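A minimal sketch of that advice, assuming a hypothetical Player class and a hypothetical Sign method (neither appears in the question): the List<T> mechanism stays private, and the roster is exposed through IReadOnlyList<T>.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical player type, used only for illustration.
public sealed class Player
{
    public Player(string name) { Name = name; }
    public string Name { get; }
}

public class FootballTeam
{
    // The mechanism (List<T>) stays private to the business object.
    private readonly List<Player> _roster = new List<Player>();

    public FootballTeam(string name) { Name = name; }

    public string Name { get; }

    // Callers can enumerate, index and count the roster, but
    // IReadOnlyList<T> offers no Add or Remove members.
    public IReadOnlyList<Player> Roster => _roster;

    // Hypothetical method: roster changes go through the team itself.
    public void Sign(Player player)
    {
        if (player == null) throw new ArgumentNullException(nameof(player));
        _roster.Add(player);
    }
}

public static class Program
{
    public static void Main()
    {
        var team = new FootballTeam("Seahawks");
        team.Sign(new Player("John McFootballer"));
        Console.WriteLine($"{team.Name}: {team.Roster.Count} player(s)");
    }
}
```

Note that a determined caller could still cast Roster back to List<Player>. If that matters, returning _roster.AsReadOnly() hands out a ReadOnlyCollection<Player> wrapper instead of the list itself.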
Is inheriting from List<T> always unacceptable?
Unacceptable to whom? Me? No.
When is it acceptable?
When you're building a mechanism that extends the List<T> mechanism.
What must a programmer consider, when deciding whether to inherit from List<T> or not?
Am I building a mechanism or a business object?
But that's a lot of code! What do I get for all that work?
You spent more time typing up your question than it would have taken you to write forwarding methods for the relevant members of List<T> fifty times over. You're clearly not afraid of verbosity, and we are talking about a very small amount of code here; this is a few minutes' work.
UPDATE
I gave it some more thought and there is another reason to not model a football team as a list of players. In fact it might be a bad idea to model a football team as having a list of players too. The problem with a team as/having a list of players is that what you've got is a snapshot of the team at a moment in time. I don't know what your business case is for this class, but if I had a class that represented a football team I would want to ask it questions like "how many Seahawks players missed games due to injury between 2003 and 2013?" or "What Denver player who previously played for another team had the largest year-over-year increase in yards ran?" or "Did the Piggers go all the way this year?"
That is, a football team seems to me to be well modeled as a collection of historical facts such as when a player was recruited, injured, retired, etc. Obviously the current player roster is an important fact that should probably be front-and-center, but there may be other interesting things you want to do with this object that require a more historical perspective.
Wow, your post has an entire slew of questions and points. Most of the reasoning you get from Microsoft is exactly on point. Let's start with everything about List<T>:
List<T> is highly optimized. Its main usage is to be used as a private member of an object.
Microsoft did not seal it because sometimes you might want to create a class that has a friendlier name: class MyList<T, TX> : List<CustomObject<T, Something<TX>>> { ... }. Now it's as easy as doing var list = new MyList<int, string>();.
CA1002: Do not expose generic lists: Basically, even if you plan to use this app as the sole developer, it's worthwhile to develop with good coding practices, so they become instilled into you and second nature. You are still allowed to expose the list as an IList<T> if you need any consumer to have an indexed list. This lets you change the implementation within a class later on.
Microsoft made Collection<T> very generic because it is a generic concept... the name says it all; it is just a collection. There are more precise versions such as KeyedCollection<TKey, TItem>, ObservableCollection<T>, ReadOnlyCollection<T>, etc., each of which implements IList<T> but does not derive from List<T>.
Collection<T> allows the behavior of its members (i.e. Add, Remove, etc.) to be customized, because they route through protected virtual methods (InsertItem, RemoveItem, SetItem and ClearItems) that you can override. List<T> does not.
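To illustrate the last point, here is a rough sketch of a Collection<T> subclass that enforces a rule on insertion. The FootballPlayer shape, the CappedPlayerCollection name and the salary-cap rule are assumptions made up for the example.

```csharp
using System;
using System.Collections.ObjectModel;
using System.Linq;

// Hypothetical player type with a salary, used only for illustration.
public sealed class FootballPlayer
{
    public FootballPlayer(string name, decimal salary)
    {
        Name = name;
        Salary = salary;
    }

    public string Name { get; }
    public decimal Salary { get; }
}

// Hypothetical mechanism: a player collection that enforces a salary cap.
public class CappedPlayerCollection : Collection<FootballPlayer>
{
    private readonly decimal _salaryCap;

    public CappedPlayerCollection(decimal salaryCap) { _salaryCap = salaryCap; }

    // Add and Insert both funnel through InsertItem,
    // so this one override covers both entry points.
    protected override void InsertItem(int index, FootballPlayer item)
    {
        if (this.Sum(p => p.Salary) + item.Salary > _salaryCap)
        {
            throw new InvalidOperationException("Would exceed salary cap!");
        }
        base.InsertItem(index, item);
    }

    // Not covered in this sketch: replacing an element through the
    // indexer goes through SetItem, which would need its own check.
}
```

With List<T> as the base class there is no equivalent hook, because its Add and Insert methods are not virtual.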
The last part of your question is spot on. A Football team is more than just a list of players, so it should be a class that contains that list of players. Think Composition vs Inheritance. A Football team has a list of players (a roster), it isn't a list of players.
If I were writing this code, the class would probably look something like so:
using System.Collections.Generic;
using System.Linq; // needed for the Count() extension method below

public class FootballTeam<T> // generic class
{
    // Football team rosters are generally 53 total players.
    private readonly List<T> _roster = new List<T>(53);

    public IList<T> Roster
    {
        get { return _roster; }
    }

    // Yes. I used LINQ here. This is so I don't have to worry about
    // _roster.Length vs _roster.Count vs anything else.
    public int PlayerCount
    {
        get { return _roster.Count(); }
    }

    // Any additional members you want to expose/wrap.
}
class FootballTeam : List<FootballPlayer>
{
public string TeamName;
public int RunningTotal;
}
The previous code means: a bunch of guys from the street playing football, and they happen to have a name.
Anyway, this code (from m-y's answer)
public class FootballTeam
{
    // A team's name
    public string TeamName;

    // Football team rosters are generally 53 total players.
    private readonly List<FootballPlayer> _roster = new List<FootballPlayer>(53);

    public IList<FootballPlayer> Roster
    {
        get { return _roster; }
    }

    public int PlayerCount
    {
        get { return _roster.Count(); }
    }

    // Any additional members you want to expose/wrap.
}
This means: a football team which has management, players, admins, etc.
This is how your logic looks when presented in pictures…
This is a classic example of composition vs inheritance.
In this specific case:
Is the team a list of players with added behavior
or
Is the team an object of its own that happens to contain a list of players.
By extending List you are limiting yourself in a number of ways:
You cannot restrict access (for example, stopping people changing the roster). You get all the List methods whether you need/want them all or not.
What happens if you want to have lists of other things as well. For example, teams have coaches, managers, fans, equipment, etc. Some of those might well be lists in their own right.
You limit your options for inheritance. For example you might want to create a generic Team object, and then have BaseballTeam, FootballTeam, etc. that inherit from that. To inherit from List you need to do the inheritance from Team, but that then means that all the various types of team are forced to have the same implementation of that roster.
Composition - including an object giving the behavior you want inside your object.
Inheritance - your object becomes an instance of the object that has the behavior you want.
Both have their uses, but this is a clear case where composition is preferable.
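As a rough illustration of the points above, the following sketch keeps the lists as private members, which leaves the single base-class slot free for a domain type. The Team, Person, AddPlayer and AddCoach names are hypothetical and not taken from the question.

```csharp
using System.Collections.Generic;

// Hypothetical person type, used only for illustration.
public class Person
{
    public Person(string name) { Name = name; }
    public string Name { get; }
}

// Because the lists are members rather than the base class, a team can
// hold several of them and still inherit from something meaningful.
public abstract class Team
{
    private readonly List<Person> _players = new List<Person>();
    private readonly List<Person> _coaches = new List<Person>();

    protected Team(string name) { Name = name; }

    public string Name { get; }

    // Read-only views: callers cannot change either list directly.
    public IReadOnlyList<Person> Players => _players;
    public IReadOnlyList<Person> Coaches => _coaches;

    public void AddPlayer(Person player) => _players.Add(player);
    public void AddCoach(Person coach) => _coaches.Add(coach);
}

public class FootballTeam : Team
{
    public FootballTeam(string name) : base(name) { }
}

public class BaseballTeam : Team
{
    public BaseballTeam(string name) : base(name) { }
}
```

This is only one possible arrangement. A subclass that needs a different roster implementation could instead keep its own private collection, which is exactly the freedom that inheriting from List<T> takes away.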
As everyone has pointed out, a team of players is not a list of players. This mistake is made by many people everywhere, perhaps at various levels of expertise. Often the problem is subtle and occasionally very gross, as in this case. Such designs are bad because they violate the Liskov Substitution Principle. The internet has many good articles explaining this concept, e.g. http://en.wikipedia.org/wiki/Liskov_substitution_principle
In summary, there are two rules to be preserved in a Parent/Child relationship among classes:
a Child should require no characteristic less than what completely defines the Parent.
a Parent should require no characteristic in addition to what completely defines the Child.
In other words, a Parent is a necessary definition of a child, and a child is a sufficient definition of a Parent.
Here is a way to think through one's solution and apply the above principle, which should help one avoid such a mistake. One should test one's hypothesis by verifying that all the operations of a parent class are valid for the derived class, both structurally and semantically.
Is a football team a list of football players? (Do all properties of a list apply to a team with the same meaning?)
Is a team a collection of homogeneous entities? Yes, a team is a collection of Players.
Is the order of inclusion of players descriptive of the state of the team, and does the team ensure that the sequence is preserved unless explicitly changed? No, and No.
Are players expected to be included/dropped based on their sequential position in the team? No.
As you see, only the first characteristic of a list is applicable to a team. Hence a team is not a list. A list would be an implementation detail of how you manage your team, so it should only be used to store the player objects and be manipulated with methods of the Team class.
At this point I'd like to remark that a Team class should, in my opinion, not even be implemented using a List; it should be implemented using a Set data structure (HashSet, for example) in most cases.
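A small sketch of what a set-backed team could look like, under the assumption of a hypothetical Player class that relies on default reference equality (so the same object cannot be added twice):

```csharp
using System.Collections.Generic;

// Hypothetical player type. It does not override Equals or GetHashCode,
// so the set below compares players by reference.
public sealed class Player
{
    public Player(string name) { Name = name; }
    public string Name { get; }
}

public class Team
{
    // A set has no ordering and no duplicates, which matches the
    // "No" answers in the checklist above.
    private readonly HashSet<Player> _players = new HashSet<Player>();

    public IEnumerable<Player> Players => _players;

    public int PlayerCount => _players.Count;

    // Returns false if the player is already on the team.
    public bool AddPlayer(Player player) => _players.Add(player);

    // Returns false if the player was not on the team.
    public bool RemovePlayer(Player player) => _players.Remove(player);
}
```

Because the set is private, swapping it for a list or a dictionary later would not change anything callers can see.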
What if the FootballTeam has a reserves team along with the main team?
class FootballTeam
{
List<FootballPlayer> Players { get; set; }
List<FootballPlayer> ReservePlayers { get; set; }
}
How would you model that with:
class FootballTeam : List<FootballPlayer>
{
    public string TeamName;
    public int RunningTotal;
}
The relationship is clearly has a and not is a.
or RetiredPlayers?
class FootballTeam
{
List<FootballPlayer> Players { get; set; }
List<FootballPlayer> ReservePlayers { get; set; }
List<FootballPlayer> RetiredPlayers { get; set; }
}
As a rule of thumb, if you ever want to inherit from a collection, name the class SomethingCollection.
Does your SomethingCollection semantically make sense? Only do this if your type is a collection of Something.
In the case of FootballTeam it doesn't sound right. A Team is more than a Collection. A Team can have coaches, trainers, etc as the other answers have pointed out.
FootballCollection sounds like a collection of footballs or maybe a collection of football paraphernalia. TeamCollection, a collection of teams.
FootballPlayerCollection sounds like a collection of players which would be a valid name for a class that inherits from List<FootballPlayer> if you really wanted to do that.
Really List<FootballPlayer> is a perfectly good type to deal with. Maybe IList<FootballPlayer> if you are returning it from a method.
In summary
Ask yourself
Is X a Y? or Has X a Y?
Do my class names mean what they are?
Design > Implementation
What methods and properties you expose is a design decision. What base class you inherit from is an implementation detail. I feel it's worth taking a step back to the former.
An object is a collection of data and behaviour.
So your first questions should be:
What data does this object comprise in the model I'm creating?
What behaviour does this object exhibit in that model?
How might this change in future?
Bear in mind that inheritance implies an "isa" (is a) relationship, whereas composition implies a "has a" (hasa) relationship. Choose the right one for your situation in your view, bearing in mind where things might go as your application evolves.
Consider thinking in interfaces before you think in concrete types, as some people find it easier to put their brain in "design mode" that way.
This isn't something everyone does consciously at this level in day to day coding. But if you're mulling this sort of topic, you're treading in design waters. Being aware of it can be liberating.
Consider Design Specifics
Take a look at List<T> and IList<T> on MSDN or Visual Studio. See what methods and properties they expose. Do these methods all look like something someone would want to do to a FootballTeam in your view?
Does footballTeam.Reverse() make sense to you? Does footballTeam.ConvertAll<TOutput>() look like something you want?
This isn't a trick question; the answer might genuinely be "yes". If you implement/inherit List<Player> or IList<Player>, you're stuck with them; if that's ideal for your model, do it.
If you decide yes, that makes sense, and you want your object to be treatable as a collection/list of players (behaviour), and you therefore want to implement ICollection<Player> or IList<Player>, by all means do so. Notionally:
class FootballTeam : ... ICollection<Player>
{
...
}
If you want your object to contain a collection/list of players (data), and you therefore want the collection or list to be a property or member, by all means do so. Notionally:
class FootballTeam ...
{
public ICollection<Player> Players { get { ... } }
}
You might feel that you want people to be able to only enumerate the set of players, rather than count them, add to them or remove them. IEnumerable<Player> is a perfectly valid option to consider.
You might feel that none of these interfaces are useful in your model at all. This is less likely (IEnumerable<T> is useful in many situations) but it's still possible.
Anyone who attempts to tell you that one of these is categorically and definitively wrong in every case is misguided. Anyone who attempts to tell you it is categorically and definitively right in every case is misguided.
Move on to Implementation
Once you've decided on data and behaviour, you can make a decision about implementation. This includes which concrete classes you depend on via inheritance or composition.
This may not be a big step, and people often conflate design and implementation since it's quite possible to run through it all in your head in a second or two and start typing away.
A Thought Experiment
An artificial example: as others have mentioned, a team is not always "just" a collection of players. Do you maintain a collection of match scores for the team? Is the team interchangeable with the club, in your model? If so, and if your team isa collection of players, perhaps it also isa collection of staff and/or a collection of scores. Then you end up with:
class FootballTeam : ... ICollection<Player>,
ICollection<StaffMember>,
ICollection<Score>
{
....
}
Design notwithstanding, at this point in C# you won't be able to implement all of these by inheriting from List<T> anyway, since C# "only" supports single inheritance. (If you've tried this malarkey in C++, you may consider this a Good Thing.) Implementing one collection via inheritance and one via composition is likely to feel dirty. And properties such as Count become confusing to users unless you implement ICollection<Player>.Count and ICollection<StaffMember>.Count etc. explicitly, and then they're just painful rather than confusing. You can see where this is going; gut feeling whilst thinking down this avenue may well tell you it feels wrong to head in this direction (and rightly or wrongly, your colleagues might also if you implemented it this way!)
The Short Answer (Too Late)
The guideline about not inheriting from collection classes isn't C# specific; you'll find it in many programming languages. It is received wisdom, not a law. One reason is that in practice composition is considered to often win out over inheritance in terms of comprehensibility, implementability and maintainability. It's more common with real world / domain objects to find useful and consistent "hasa" relationships than useful and consistent "isa" relationships unless you're deep in the abstract, most especially as time passes and the precise data and behaviour of objects in code changes. This shouldn't cause you to always rule out inheriting from collection classes; but it may be suggestive.
First of all, it has to do with usability. If you use inheritance, the Team class will expose behavior (methods) that are designed purely for object manipulation. For example, AsReadOnly() or CopyTo(obj) methods make no sense for the team object. Instead of the AddRange(items) method you would probably want a more descriptive AddPlayers(players) method.
If you want to use LINQ, implementing a generic interface such as ICollection<T> or IEnumerable<T> would make more sense.
As mentioned, composition is the right way to go about it. Just implement a list of players as a private variable.
Let me rewrite your question, so you might see the subject from a different perspective.
When I need to represent a football team, I understand that it is basically a name. Like: "The Eagles"
string team = "The Eagles";
Then later I realized teams also have players.
Why can't I just extend the string type so that it also holds a list of players?
Your point of entry into the problem is arbitrary. Try to think about what a team has (properties), not what it is.
After you do that, you could see if it shares properties with other classes. And think about inheritance.
It depends on the context
When you consider your team as a list of players, you are projecting the "idea" of a football team down to one aspect: you reduce the "team" to the people you see on the field. This projection is only correct in a certain context. In a different context, this might be completely wrong. Imagine you want to become a sponsor of the team. So you have to talk to the managers of the team. In this context the team is projected to the list of its managers. And these two lists usually don't overlap very much. Other contexts are the current versus the former players, etc.
Unclear semantics
So the problem with considering a team as a list of its players is that its semantics depend on the context and that it cannot be extended when the context changes. Additionally, it is hard to express which context you are using.
Classes are extensible
When you use a class with only one member (e.g. IList activePlayers), you can use the name of the member (and additionally its comment) to make the context clear. When there are additional contexts, you just add an additional member.
Classes are more complex
In some cases it might be overkill to create an extra class. Each class definition must be loaded by the runtime and is kept in memory, which costs you runtime performance and memory. When you have a very specific context it might be OK to consider a football team as a list of players. But in this case, you should really just use an IList<T>, not a class derived from it.
Conclusion / Considerations
When you have a very specific context, it is OK to consider a team as a list of players. For example inside a method it is completely OK to write:
IList<Player> footballTeam = ...
When using F#, it can even be OK to create a type abbreviation:
type FootballTeam = IList<Player>
But when the context is broader or even unclear, you should not do this. This is especially the case when you create a new class whose context in which it may be used in the future is not clear. A warning sign is when you start to add additional attributes to your class (name of the team, coach, etc.). This is a clear sign that the context where the class will be used is not fixed and will change in the future. In this case you cannot consider the team as a list of players, but you should model the list of the (currently active, not injured, etc.) players as an attribute of the team.
A football team is not a list of football players. A football team is composed of a list of football players!
This is logically wrong:
class FootballTeam : List<FootballPlayer>
{
    public string TeamName;
    public int RunningTotal;
}
and this is correct:
class FootballTeam
{
    public List<FootballPlayer> players;
    public string TeamName;
    public int RunningTotal;
}
I think the other answers pretty much go off on a tangent about whether a football team "is-a" List<FootballPlayer> or "has-a" List<FootballPlayer>, which really doesn't answer this question as written.
The OP chiefly asks for clarification on guidelines for inheriting from List<T>:
A guideline says that you shouldn't inherit from List<T>. Why not?
Because List<T> has no virtual methods. This is less of a problem in your own code, since you can usually switch out the implementation with relatively little pain - but can be a much bigger deal in a public API.
What is a public API and why should I care?
A public API is an interface you expose to 3rd party programmers. Think framework code. And recall that the guidelines being referenced are the ".NET Framework Design Guidelines" and not the ".NET Application Design Guidelines". There is a difference, and - generally speaking - public API design is a lot more strict.
If my current project does not and is not likely to ever have this public API, can I safely ignore this guideline? If I do inherit from List and it turns out I need a public API, what difficulties will I have?
Pretty much, yeah. You may want to consider the rationale behind it to see if it applies to your situation anyway, but if you're not building a public API then you don't particularly need to worry about API concerns like versioning (of which, this is a subset).
If you add a public API in the future, you will either need to abstract out your API from your implementation (by not exposing your List<T> directly) or violate the guidelines with the possible future pain that entails.
Why does it even matter? A list is a list. What could possibly change? What could I possibly want to change?
Depends on the context, but since we're using FootballTeam as an example - imagine that you can't add a FootballPlayer if it would cause the team to go over the salary cap. A possible way of adding that would be something like:
class FootballTeam : List<FootballPlayer> {
    // Does not compile: List<T>.Add is not virtual, so there is nothing to override.
    public override void Add(FootballPlayer player) {
        if (this.Sum(p => p.Salary) + player.Salary > SALARY_CAP) {
            throw new InvalidOperationException("Would exceed salary cap!");
        }
        base.Add(player);
    }
}
Ah...but you can't override Add because it's not virtual (for performance reasons).
If you're in an application (which, basically, means that you and all of your callers are compiled together) then you can now change to using IList<T> and fix up any compile errors:
class FootballTeam : IList<FootballPlayer> {
    private List<FootballPlayer> Players { get; } = new List<FootballPlayer>();
    // Interface members are implemented, not overridden.
    public void Add(FootballPlayer player) {
        if (this.Players.Sum(p => p.Salary) + player.Salary > SALARY_CAP) {
            throw new InvalidOperationException("Would exceed salary cap!");
        }
        this.Players.Add(player);
    }
    /* boiler plate for rest of IList */
}
but, if you've publicly exposed this to a 3rd party, you just made a breaking change that will cause compile and/or runtime errors.
TL;DR - the guidelines are for public APIs. For private APIs, do what you want.
There are a lot of excellent answers here, but I want to touch on something I didn't see mentioned: Object oriented design is about empowering objects.
You want to encapsulate all your rules, additional work and internal details inside an appropriate object. In this way other objects interacting with this one don't have to worry about it all. In fact, you want to go a step further and actively prevent other objects from bypassing these internals.
When you inherit from List, all other objects can see you as a List. They have direct access to the methods for adding and removing players. And you'll have lost your control; for example:
Suppose you want to differentiate when a player leaves by knowing whether they retired, resigned or were fired. You could implement a RemovePlayer method that takes an appropriate input enum. However, by inheriting from List, you would be unable to prevent direct access to Remove, RemoveAll and even Clear. As a result, you've actually disempowered your FootballTeam class.
Additional thoughts on encapsulation... You raised the following concern:
It makes my code needlessly verbose. I must now call my_team.Players.Count instead of just my_team.Count.
You're correct, that would be needlessly verbose for all clients to use your team. However, that problem is very small in comparison to the fact that you've exposed List Players to all and sundry so they can fiddle with your team without your consent.
You go on to say:
It just plain doesn't make any sense. A football team doesn't "have" a list of players. It is the list of players. You don't say "John McFootballer has joined SomeTeam's players". You say "John has joined SomeTeam".
You're wrong about the first bit: Drop the word 'list', and it's actually obvious that a team does have players.
However, you hit the nail on the head with the second. You don't want clients calling ateam.Players.Add(...). You do want them calling ateam.AddPlayer(...). And your implementation would (possibly amongst other things) call Players.Add(...) internally.
Hopefully you can see how important encapsulation is to the objective of empowering your objects. You want to allow each class to do its job well without fear of interference from other objects.
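To make the RemovePlayer idea concrete, here is a hedged sketch. The DepartureReason enum, the Departures log and the Player class are all assumptions invented for the example. The point is only that the team, not the caller, decides what a removal means.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical enum describing why a player left.
public enum DepartureReason { Retired, Resigned, Fired }

// Hypothetical player type, used only for illustration.
public sealed class Player
{
    public Player(string name) { Name = name; }
    public string Name { get; }
}

public class FootballTeam
{
    // Neither list is reachable from outside, so no caller can use
    // Remove, RemoveAll or Clear to bypass the bookkeeping below.
    private readonly List<Player> _players = new List<Player>();
    private readonly List<(Player Player, DepartureReason Reason)> _departures =
        new List<(Player Player, DepartureReason Reason)>();

    public int PlayerCount => _players.Count;

    public IReadOnlyList<(Player Player, DepartureReason Reason)> Departures => _departures;

    public void AddPlayer(Player player)
    {
        if (player == null) throw new ArgumentNullException(nameof(player));
        _players.Add(player);
    }

    // Every removal must state a reason, and the team records it.
    public bool RemovePlayer(Player player, DepartureReason reason)
    {
        if (!_players.Remove(player)) return false;
        _departures.Add((player, reason));
        return true;
    }
}
```

Had FootballTeam inherited from List<Player>, a caller could simply call Clear() and no departure would ever be recorded.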
Does allowing people to say
myTeam.GetRange(3, 2);
make any sense at all? If not then it shouldn't be a List.
It depends on the behaviour of your "team" object. If it behaves just like a collection, it might be OK to represent it first with a plain List. Then you might start to notice that you keep duplicating code that iterates on the list; at this point you have the option of creating a FootballTeam object that wraps the list of players. The FootballTeam class becomes the home for all the code that iterates on the list of players.
It makes my code needlessly verbose. I must now call my_team.Players.Count instead of just my_team.Count. Thankfully, with C# I can define indexers to make indexing transparent, and forward all the methods of the internal List... But that's a lot of code! What do I get for all that work?
Encapsulation. Your clients need not know what goes on inside of FootballTeam. For all your clients know, it might be implemented by looking the list of players up in a database. They don't need to know, and this improves your design.
It just plain doesn't make any sense. A football team doesn't "have" a list of players. It is the list of players. You don't say "John McFootballer has joined SomeTeam's players". You say "John has joined SomeTeam". You don't add a letter to "a string's characters", you add a letter to a string. You don't add a book to a library's books, you add a book to a library.
Exactly :) you will say footballTeam.Add(john), not footballTeam.List.Add(john). The internal list will not be visible.
What is the correct C# way of representing a data structure...
Remember, "All models are wrong, but some are useful." -George E. P. Box
There is no "correct way", only a useful one.
Choose one that is useful to you and/or your users. That's it. Develop economically; don't over-engineer. The less code you write, the less code you will need to debug. (Read the following edits.)
-- Edited
My best answer would be... it depends. Inheriting from a List would expose the clients of this class to methods that maybe should not be exposed, primarily because FootballTeam looks like a business entity.
-- Edition 2
I sincerely don't remember what I was referring to with the “don't over-engineer” comment. While I believe the KISS mindset is a good guide, I want to emphasize that inheriting a business class from List would create more problems than it resolves, due to abstraction leakage.
On the other hand, I believe there are a limited number of cases where simply inheriting from List is useful. As I wrote in the previous edit, it depends. The answer in each case is heavily influenced by knowledge, experience and personal preferences.
Thanks to #kai for helping me to think more precisely about the answer.
This reminds me of the "Is a" versus "has a" tradeoff. Sometimes it is easier and makes more sense to inherit directly from a super class. Other times it makes more sense to create a standalone class and include the class you would have inherited from as a member variable. You can still access the functionality of the class but are not bound to the interface or any other constraints that might come from inheriting from the class.
Which do you do? As with a lot of things... it depends on the context. The guide I would use is that in order to inherit from another class there truly should be an "is a" relationship. So if you are writing a class called BMW, it could inherit from Car because a BMW truly is a car. A Horse class can inherit from the Mammal class because a horse actually is a mammal in real life and any Mammal functionality should be relevant to Horse. But can you say that a team is a list? From what I can tell, it does not seem like a Team really "is a" List. So in this case, I would have a List as a member variable.
Problems with serializing
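One aspect is missing. Classes that inherit from List can't be serialized correctly using XmlSerializer: the list items are written, but any additional properties declared on the derived class are dropped. In that case DataContractSerializer must be used instead, or a custom serialization implementation is needed.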
public class DemoList : List<Demo>
{
    // With XmlSerializer, this property won't be serialized.
    // There is no error; the data is simply not there.
    public string AnyPropertyInDerivedFromList { get; set; }
}
public class Demo
{
    // This property will be serialized (XmlSerializer only looks at public members).
    public string AnyPropetyInDemo { get; set; }
}
Further reading: When a class is inherited from List<>, XmlSerializer doesn't serialize other attributes
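To make the serialization problem concrete, here is a minimal sketch. The types TeamAsList and Player and the helper XmlDemo.Serialize are made up for this illustration; only XmlSerializer and StringWriter come from the framework. The serialized XML should contain the player but not the team name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Serialization;

public class Player
{
    public string Name { get; set; }
}

// Hypothetical type for illustration: a team that inherits from List<Player>.
public class TeamAsList : List<Player>
{
    public string TeamName { get; set; }
}

public static class XmlDemo
{
    // Serializes the team to an XML string using XmlSerializer.
    public static string Serialize(TeamAsList team)
    {
        var serializer = new XmlSerializer(typeof(TeamAsList));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, team);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        var team = new TeamAsList();
        team.TeamName = "Eagles";
        team.Add(new Player { Name = "Nicolas" });

        string xml = Serialize(team);
        Console.WriteLine(xml);

        // XmlSerializer treats the type as a plain collection:
        // the items are written, the extra TeamName property is not.
        Console.WriteLine("Contains player: " + xml.Contains("Nicolas"));
        Console.WriteLine("Contains team name: " + xml.Contains("Eagles"));
    }
}
```

No exception is thrown, which is what makes this easy to miss until the data is read back.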
Use IList instead
Personally I wouldn't inherit from List but implement IList. Visual Studio will do the job for you and create a fully working implementation. Look here: How to get a full working implementation of IList
What the guidelines say is that the public API should not reveal the internal design decision of whether you are using a list, a set, a dictionary, a tree or whatever. A "team" is not necessarily a list. You may implement it as a list, but users of your public API should use your class on a need-to-know basis. This allows you to change your decision and use a different data structure without affecting the public interface.
When they say List<T> is "optimized", I think they mean that it doesn't have features like virtual methods, which are a bit more expensive. So the problem is that once you expose List<T> in your public API, you lose the ability to enforce business rules or customize its functionality later. But if you are using this inherited class internally within your project (as opposed to exposing it to potentially thousands of your customers/partners/other teams as an API), then it may be OK if it saves you time and it is the functionality you want to duplicate. The advantage of inheriting from List<T> is that you eliminate a lot of dumb wrapper code that is never going to be customized in the foreseeable future. Also, if you want your class to explicitly have the exact same semantics as List<T> for the life of your APIs, then it may also be OK.
I often see lots of people doing tons of extra work just because an FxCop rule says so or someone's blog says it's a "bad" practice. Many times, this turns code into design-pattern-palooza weirdness. As with a lot of guidelines, treat it as a guideline that can have exceptions.
My dirty secret: I don't care what people say, and I do it. The .NET Framework is full of "XxxxCollection" classes (UIElementCollection is an example off the top of my head).
So what stops me saying:
team.Players.ByName("Nicolas")
When I find it better than
team.ByName("Nicolas")
Moreover, my PlayerCollection might be used by another class, like "Club", without any code duplication.
club.Players.ByName("Nicolas")
The best practices of yesterday might not be the ones of tomorrow. There is no reason behind most best practices; most are only wide agreement among the community. Instead of asking the community whether it will blame you when you do that, ask yourself: what is more readable and maintainable?
team.Players.ByName("Nicolas")
or
team.ByName("Nicolas")
Really. Do you have any doubt? Now maybe you need to deal with other technical constraints that prevent you from using List<T> in your real use case. But don't add a constraint that should not exist. If Microsoft did not document the why, then it is surely a "best practice" coming from nowhere.
While I don't have a complex comparison as most of these answers do, I would like to share my method for handling this situation. By implementing IEnumerable<T>, you can allow your Team class to support LINQ query extensions without publicly exposing all the methods and properties of List<T>.
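using System.Collections;
using System.Collections.Generic;

class Team : IEnumerable<Player>
{
    private readonly List<Player> playerList;
    public Team()
    {
        playerList = new List<Player>();
    }
    // Generic enumerator; this is what foreach and the LINQ extension methods use.
    public IEnumerator<Player> GetEnumerator()
    {
        return playerList.GetEnumerator();
    }
    // IEnumerable<T> also requires the non-generic IEnumerable member.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
    ...
}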
class Player
{
...
}
I just wanted to add that Bertrand Meyer, the inventor of Eiffel and design by contract, would have Team inherit from List<Player> without so much as batting an eyelid.
In his book, Object-Oriented Software Construction, he discusses the implementation of a GUI system where rectangular windows can have child windows. He simply has Window inherit from both Rectangle and Tree<Window> to reuse the implementation.
However, C# is not Eiffel. The latter supports multiple inheritance and renaming of features. In C#, when you subclass, you inherit both the interface and the implementation. You can override the implementation, but the calling conventions are copied directly from the superclass. In Eiffel, however, you can modify the names of the public methods, so you can rename Add and Remove to Hire and Fire in your Team. If an instance of Team is upcast back to List<Player>, the caller will use Add and Remove to modify it, but your virtual methods Hire and Fire will be called.
If your class users need all the methods and properties List has, you should derive your class from it. If they don't need them, enclose the List and make wrappers for the methods your class users actually need.
This is a strict rule, if you write a public API, or any other code that will be used by many people. You may ignore this rule if you have a tiny app and no more than 2 developers. This will save you some time.
For tiny apps, you may also consider choosing another, less strict language. Ruby, JavaScript - anything that allows you to write less code.
I think I don't agree with your generalization. A team isn't just a collection of players. A team has so much more information about it: name, emblem, a collection of management/admin staff, a collection of coaching crew, and then a collection of players. So, properly, your FootballTeam class should have 3 collections and not itself be a collection, if it is to properly model the real world.
You could consider a PlayerCollection class which, like the specialized StringCollection, offers some other facilities, such as validation and checks before objects are added to or removed from the internal store.
Perhaps the notion of a PlayerCollection better suits your preferred approach?
public class PlayerCollection : Collection<Player>
{
}
And then the FootballTeam can look like this:
public class FootballTeam
{
public string Name { get; set; }
public string Location { get; set; }
public ManagementCollection Management { get; protected set; } = new ManagementCollection();
public CoachingCollection CoachingCrew { get; protected set; } = new CoachingCollection();
public PlayerCollection Players { get; protected set; } = new PlayerCollection();
}
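The validation idea mentioned for PlayerCollection could be sketched as follows. This is only an illustration, not the code from the answer: the rules (no nulls, no duplicates) and the Player type are assumptions. It relies on the protected virtual InsertItem and SetItem methods of Collection<T>, which Add, Insert and the indexer setter all go through.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public sealed class Player
{
    public Player(string name) { Name = name; }
    public string Name { get; }
}

public class PlayerCollection : Collection<Player>
{
    // Add and Insert both route through InsertItem.
    protected override void InsertItem(int index, Player item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        if (Contains(item)) throw new InvalidOperationException("Player is already in the collection.");
        base.InsertItem(index, item);
    }

    // The indexer setter routes through SetItem.
    protected override void SetItem(int index, Player item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        base.SetItem(index, item);
    }
}

public static class PlayerCollectionDemo
{
    public static void Main()
    {
        // Even through the IList<Player> interface the checks still run.
        IList<Player> players = new PlayerCollection();
        players.Add(new Player("Nicolas"));
        try
        {
            players.Add(null);
        }
        catch (ArgumentNullException)
        {
            Console.WriteLine("null player rejected");
        }
        Console.WriteLine(players.Count); // 1
    }
}
```

Because the checks live in overridden methods rather than in hidden ones, callers cannot bypass them by holding the collection through a base type or interface.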
Prefer Interfaces over Classes
Classes should avoid deriving from classes and instead implement the minimal interfaces necessary.
Inheritance breaks Encapsulation
Deriving from classes breaks encapsulation:
exposes internal details about how your collection is implemented
declares an interface (set of public functions and properties) that may not be appropriate
Among other things this makes it harder to refactor your code.
Classes are an Implementation Detail
Classes are an implementation detail that should be hidden from other parts of your code.
In short, List<T> (in System.Collections.Generic) is a specific implementation of an abstract data type that may or may not be appropriate now and in the future.
Conceptually, the fact that the List<T> data type is called "list" is a bit of a red herring. A List<T> is a mutable ordered collection that supports amortized O(1) appends to the end, O(n) insertion and removal at arbitrary positions, and O(1) operations for retrieving the number of elements or getting and setting an element by index.
The Smaller the Interface the more Flexible the Code
When designing a data structure, the simpler the interface is, the more flexible the code is. Just look at how powerful LINQ is for a demonstration of this.
How to Choose Interfaces
When you think "list" you should start by saying to yourself, "I need to represent a collection of baseball players". So let's say you decide to model this with a class. What you should do first is decide on the minimal set of interfaces that this class will need to expose.
Some questions that can help guide this process:
Do I need to have the count? If not, consider implementing IEnumerable<T>.
Is this collection going to change after it has been initialized? If not, consider IReadOnlyList<T>.
Is it important that I can access items by index? If not, consider ICollection<T>.
Is the order in which I add items to the collection important? If not, maybe it is an ISet<T>.
If you indeed want all of these things, then go ahead and implement IList<T>.
This way you will not be coupling other parts of the code to implementation details of your baseball players collection and will be free to change how it is implemented as long as you respect the interface.
By taking this approach you will find that code becomes easier to read, refactor, and reuse.
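As a rough sketch of what choosing a small interface can look like, suppose the answers to the questions above were "I need the count and index access, but the collection never changes after construction". The Roster and Player types below are hypothetical; the class exposes only IReadOnlyList<T> and forwards to a private List<T>.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public sealed class Player
{
    public Player(string name) { Name = name; }
    public string Name { get; }
}

// Hypothetical Roster type: exposes only counting, indexing and enumeration.
public sealed class Roster : IReadOnlyList<Player>
{
    // Implementation detail; could later become an array or something else.
    private readonly List<Player> _players;

    public Roster(IEnumerable<Player> players)
    {
        _players = new List<Player>(players);
    }

    public int Count => _players.Count;
    public Player this[int index] => _players[index];
    public IEnumerator<Player> GetEnumerator() => _players.GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class RosterDemo
{
    public static void Main()
    {
        var roster = new Roster(new[] { new Player("Ann"), new Player("Bob") });
        Console.WriteLine(roster.Count);                      // 2
        Console.WriteLine(roster[1].Name);                    // Bob
        Console.WriteLine(roster.Any(p => p.Name == "Ann"));  // True (LINQ works via IEnumerable<T>)
    }
}
```

The forwarding code here is four one-line members, which gives a feel for how much boilerplate a minimal interface actually costs.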
Notes about Avoiding Boilerplate
Implementing interfaces in a modern IDE should be easy. Right click and choose "Implement Interface". Then forward all of the implementations to a member class if you need to.
That said, if you find you are writing lots of boilerplate, it is potentially because you are exposing more functions than you should be. It is the same reason you shouldn't inherit from a class.
You can also design smaller interfaces that make sense for your application, and maybe just a couple of helper extension functions to map those interfaces to any others that you need. This is the approach I took in my own IArray interface for the LinqArray library.
When is it acceptable?
To quote Eric Lippert:
When you're building a mechanism that extends the List<T> mechanism.
For example, you are tired of the absence of the AddRange method in IList<T>:
public interface IMoreConvenientListInterface<T> : IList<T>
{
void AddRange(IEnumerable<T> collection);
}
public class MoreConvenientList<T> : List<T>, IMoreConvenientListInterface<T> { }
There are some good answers here. I would add to them the following points.
What is the correct C# way of representing a data structure, which, "logically" (that is to say, "to the human mind") is just a list of things with a few bells and whistles?
Ask any ten non-computer-programmer people who are familiar with the existence of football to fill in the blank:
A football team is a particular kind of _____
Did anyone say "list of football players with a few bells and whistles", or did they all say "sports team" or "club" or "organization"? Your notion that a football team is a particular kind of list of players is in your human mind and your human mind alone.
List<T> is a mechanism. A football team is a business object -- that is, an object that represents some concept that is in the business domain of the program. Don't mix those! A football team is a kind of team; it has a roster, and a roster is a list of players. A roster is not a particular kind of list of players. A roster is a list of players. So make a property called Roster that is a List<Player>. And make it an IReadOnlyList<Player> while you're at it, unless you believe that everyone who knows about a football team gets to delete players from the roster.
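One way the read-only Roster property could look is sketched below. This is an assumption about the shape, not the answer's own code: the Sign method, its duplicate check and the Player type are invented for the example. The roster is exposed through IReadOnlyList<Player>, and List<T>.AsReadOnly() is used so callers cannot cast the property back to a mutable list.

```csharp
using System;
using System.Collections.Generic;

public sealed class Player
{
    public Player(string name) { Name = name; }
    public string Name { get; }
}

public class FootballTeam
{
    private readonly List<Player> _roster = new List<Player>();

    public FootballTeam(string name) { Name = name; }

    public string Name { get; }

    // Callers can read the roster but cannot add or remove through this property.
    public IReadOnlyList<Player> Roster => _roster.AsReadOnly();

    // Hypothetical domain operation; business rules live here, not in the list.
    public void Sign(Player player)
    {
        if (player == null) throw new ArgumentNullException(nameof(player));
        if (_roster.Contains(player)) throw new InvalidOperationException("Player is already on the roster.");
        _roster.Add(player);
    }
}

public static class FootballTeamDemo
{
    public static void Main()
    {
        var team = new FootballTeam("Seahawks");
        team.Sign(new Player("John McFootballer"));
        Console.WriteLine(team.Roster.Count);             // 1
        Console.WriteLine(team.Roster is List<Player>);   // False: the wrapper is not a List<T>
    }
}
```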
Is inheriting from List<T> always unacceptable?
Unacceptable to whom? Me? No.
When is it acceptable?
When you're building a mechanism that extends the List<T> mechanism.
What must a programmer consider, when deciding whether to inherit from List<T> or not?
Am I building a mechanism or a business object?
But that's a lot of code! What do I get for all that work?
You spent more time typing up your question than it would have taken you to write forwarding methods for the relevant members of List<T> fifty times over. You're clearly not afraid of verbosity, and we are talking about a very small amount of code here; this is a few minutes' work.
UPDATE
I gave it some more thought and there is another reason to not model a football team as a list of players. In fact it might be a bad idea to model a football team as having a list of players too. The problem with a team as/having a list of players is that what you've got is a snapshot of the team at a moment in time. I don't know what your business case is for this class, but if I had a class that represented a football team I would want to ask it questions like "how many Seahawks players missed games due to injury between 2003 and 2013?" or "What Denver player who previously played for another team had the largest year-over-year increase in yards ran?" or "Did the Piggers go all the way this year?"
That is, a football team seems to me to be well modeled as a collection of historical facts such as when a player was recruited, injured, retired, etc. Obviously the current player roster is an important fact that should probably be front-and-center, but there may be other interesting things you want to do with this object that require a more historical perspective.
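The "collection of historical facts" idea might be sketched like this. Everything here is an assumption made for illustration: the RosterEvent type, the three event kinds, identifying players by name, and the simplification that a retired player never rejoins. The current roster becomes something derived from the history rather than the thing that is stored.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum RosterEventKind { Recruited, Injured, Retired }

public sealed class RosterEvent
{
    public RosterEvent(string player, RosterEventKind kind, DateTime date)
    {
        Player = player;
        Kind = kind;
        Date = date;
    }

    public string Player { get; }
    public RosterEventKind Kind { get; }
    public DateTime Date { get; }
}

public class FootballTeam
{
    private readonly List<RosterEvent> _history = new List<RosterEvent>();

    public void Record(RosterEvent e) => _history.Add(e);

    // Roster at a point in time: recruited on or before asOf and not retired by then.
    public IReadOnlyList<string> RosterAsOf(DateTime asOf)
    {
        var past = _history.Where(e => e.Date <= asOf).ToList();
        var retired = new HashSet<string>(
            past.Where(e => e.Kind == RosterEventKind.Retired).Select(e => e.Player));
        return past.Where(e => e.Kind == RosterEventKind.Recruited && !retired.Contains(e.Player))
                   .Select(e => e.Player)
                   .Distinct()
                   .ToList();
    }

    // Number of distinct players with an injury recorded in the given year range (inclusive).
    public int InjuredPlayersBetween(int fromYear, int toYear) =>
        _history.Where(e => e.Kind == RosterEventKind.Injured
                            && e.Date.Year >= fromYear && e.Date.Year <= toYear)
                .Select(e => e.Player)
                .Distinct()
                .Count();
}

public static class HistoryDemo
{
    public static void Main()
    {
        var team = new FootballTeam();
        team.Record(new RosterEvent("Ann", RosterEventKind.Recruited, new DateTime(2001, 6, 1)));
        team.Record(new RosterEvent("Bob", RosterEventKind.Recruited, new DateTime(2005, 6, 1)));
        team.Record(new RosterEvent("Ann", RosterEventKind.Injured, new DateTime(2004, 10, 1)));
        team.Record(new RosterEvent("Ann", RosterEventKind.Retired, new DateTime(2010, 1, 1)));

        Console.WriteLine(team.RosterAsOf(new DateTime(2006, 1, 1)).Count); // 2
        Console.WriteLine(team.RosterAsOf(new DateTime(2011, 1, 1)).Count); // 1
        Console.WriteLine(team.InjuredPlayersBetween(2003, 2013));          // 1
    }
}
```

Neither inheriting from List<Player> nor exposing a single List<Player> property would answer the second and third questions, since both only hold the current snapshot.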
Wow, your post has an entire slew of questions and points. Most of the reasoning you get from Microsoft is exactly on point. Let's start with everything about List<T>
List<T> is highly optimized. Its main usage is to be used as a private member of an object.
Microsoft did not seal it because sometimes you might want to create a class that has a friendlier name: class MyList<T, TX> : List<CustomObject<T, Something<TX>>> { ... }. Now it's as easy as doing var list = new MyList<int, string>();.
CA1002: Do not expose generic lists: Basically, even if you plan to use this app as the sole developer, it's worthwhile to develop with good coding practices, so they become instilled into you and second nature. You are still allowed to expose the list as an IList<T> if you need any consumer to have an indexed list. This lets you change the implementation within a class later on.
Microsoft made Collection<T> very generic because it is a generic concept... the name says it all; it is just a collection. There are more precise versions such as ObservableCollection<T>, ReadOnlyCollection<T>, etc., each of which implements IList<T> but does not derive from List<T>.
Collection<T> allows for members (i.e. Add, Remove, etc.) to be overridden because they are virtual. List<T> does not.
The last part of your question is spot on. A Football team is more than just a list of players, so it should be a class that contains that list of players. Think Composition vs Inheritance. A Football team has a list of players (a roster), it isn't a list of players.
If I were writing this code, the class would probably look something like so:
public class FootballTeam<T>//generic class
{
// Football team rosters are generally 53 total players.
private readonly List<T> _roster = new List<T>(53);
public IList<T> Roster
{
get { return _roster; }
}
// Yes. I used LINQ here. This is so I don't have to worry about
// _roster.Length vs _roster.Count vs anything else.
public int PlayerCount
{
get { return _roster.Count(); }
}
// Any additional members you want to expose/wrap.
}
class FootballTeam : List<FootballPlayer>
{
public string TeamName;
public int RunningTotal;
}
Previous code means: a bunch of guys from the street playing football, and they happen to have a name. Something like:
Anyway, this code (from m-y's answer)
public class FootballTeam
{
    // A team's name
    public string TeamName;
    // Football team rosters are generally 53 total players.
    // (This class is not generic, so the element type is FootballPlayer rather than T.)
    private readonly List<FootballPlayer> _roster = new List<FootballPlayer>(53);
    public IList<FootballPlayer> Roster
    {
        get { return _roster; }
    }
    public int PlayerCount
    {
        get { return _roster.Count(); }
    }
    // Any additional members you want to expose/wrap.
}
Means: this is a football team which has management, players, admins, etc. Something like:
This is how your logic is presented in pictures…
This is a classic example of composition vs inheritance.
In this specific case:
Is the team a list of players with added behavior
or
Is the team an object of its own that happens to contain a list of players.
By extending List you are limiting yourself in a number of ways:
You cannot restrict access (for example, stopping people changing the roster). You get all the List methods whether you need/want them all or not.
What happens if you want to have lists of other things as well. For example, teams have coaches, managers, fans, equipment, etc. Some of those might well be lists in their own right.
You limit your options for inheritance. For example you might want to create a generic Team object, and then have BaseballTeam, FootballTeam, etc. that inherit from that. To inherit from List you need to do the inheritance from Team, but that then means that all the various types of team are forced to have the same implementation of that roster.
Composition - including an object giving the behavior you want inside your object.
Inheritance - your object becomes an instance of the object that has the behavior you want.
Both have their uses, but this is a clear case where composition is preferable.
As everyone has pointed out, a team of players is not a list of players. This mistake is made by many people everywhere, perhaps at various levels of expertise. Often the problem is subtle, and occasionally it is very gross, as in this case. Such designs are bad because they violate the Liskov Substitution Principle. The internet has many good articles explaining this concept, e.g., http://en.wikipedia.org/wiki/Liskov_substitution_principle
In summary, there are two rules to be preserved in a Parent/Child relationship among classes:
a Child should require no characteristic less than what completely defines the Parent.
a Parent should require no characteristic in addition to what completely defines the Child.
In other words, a Parent is a necessary definition of a child, and a child is a sufficient definition of a Parent.
Here is a way to think through one's solution and apply the above principle that should help one avoid such a mistake. One should test one's hypothesis by verifying whether all the operations of a parent class are valid for the derived class, both structurally and semantically.
Is a football team a list of football players? (Do all properties of a list apply to a team with the same meaning?)
Is a team a collection of homogeneous entities? Yes, a team is a collection of Players.
Is the order of inclusion of players descriptive of the state of the team, and does the team ensure that the sequence is preserved unless explicitly changed? No, and no.
Are players expected to be included/dropped based on their sequential position in the team? No.
As you see, only the first characteristic of a list is applicable to a team. Hence a team is not a list. A list would be an implementation detail of how you manage your team, so it should only be used to store the player objects and be manipulated with methods of the Team class.
At this point I'd like to remark that a Team class should, in my opinion, not even be implemented using a List; it should be implemented using a Set data structure (HashSet, for example) in most cases.
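A set-backed Team could look roughly like the sketch below. The member names and the choice to identify players by a case-insensitive name are assumptions made to keep the example short. HashSet<T>.Add returns false for a duplicate, so "a player is on the team at most once" comes for free, and no ordering is promised to callers.

```csharp
using System;
using System.Collections.Generic;

public class Team
{
    // Players are identified by name here purely for brevity (an assumption of this sketch).
    private readonly HashSet<string> _players =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    // Returns false if the player was already on the team.
    public bool AddPlayer(string name) => _players.Add(name);

    // Returns false if the player was not on the team.
    public bool RemovePlayer(string name) => _players.Remove(name);

    public bool HasPlayer(string name) => _players.Contains(name);

    public int PlayerCount => _players.Count;
}

public static class TeamDemo
{
    public static void Main()
    {
        var team = new Team();
        Console.WriteLine(team.AddPlayer("Ann"));   // True
        Console.WriteLine(team.AddPlayer("ann"));   // False: same player, different casing
        Console.WriteLine(team.PlayerCount);        // 1
    }
}
```

Since the set is private, swapping it for a list later (for example if roster order turns out to matter) would not change the public surface of Team.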
What if the FootballTeam has a reserves team along with the main team?
class FootballTeam
{
List<FootballPlayer> Players { get; set; }
List<FootballPlayer> ReservePlayers { get; set; }
}
How would you model that with:
class FootballTeam : List<FootballPlayer>
{
public string TeamName;
public int RunningTotal;
}
The relationship is clearly has a and not is a.
or RetiredPlayers?
class FootballTeam
{
List<FootballPlayer> Players { get; set; }
List<FootballPlayer> ReservePlayers { get; set; }
List<FootballPlayer> RetiredPlayers { get; set; }
}
As a rule of thumb, if you ever want to inherit from a collection, name the class SomethingCollection.
Does your SomethingCollection semantically make sense? Only do this if your type is a collection of Something.
In the case of FootballTeam it doesn't sound right. A Team is more than a Collection. A Team can have coaches, trainers, etc as the other answers have pointed out.
FootballCollection sounds like a collection of footballs or maybe a collection of football paraphernalia. TeamCollection, a collection of teams.
FootballPlayerCollection sounds like a collection of players which would be a valid name for a class that inherits from List<FootballPlayer> if you really wanted to do that.
Really List<FootballPlayer> is a perfectly good type to deal with. Maybe IList<FootballPlayer> if you are returning it from a method.
In summary
Ask yourself
Is X a Y? or Has X a Y?
Do my class names mean what they are?
Design > Implementation
What methods and properties you expose is a design decision. What base class you inherit from is an implementation detail. I feel it's worth taking a step back to the former.
An object is a collection of data and behaviour.
So your first questions should be:
What data does this object comprise in the model I'm creating?
What behaviour does this object exhibit in that model?
How might this change in future?
Bear in mind that inheritance implies an "isa" (is a) relationship, whereas composition implies a "has a" (hasa) relationship. Choose the right one for your situation in your view, bearing in mind where things might go as your application evolves.
Consider thinking in interfaces before you think in concrete types, as some people find it easier to put their brain in "design mode" that way.
This isn't something everyone does consciously at this level in day to day coding. But if you're mulling this sort of topic, you're treading in design waters. Being aware of it can be liberating.
Consider Design Specifics
Take a look at List<T> and IList<T> on MSDN or Visual Studio. See what methods and properties they expose. Do these methods all look like something someone would want to do to a FootballTeam in your view?
Does footballTeam.Reverse() make sense to you? Does footballTeam.ConvertAll<TOutput>() look like something you want?
This isn't a trick question; the answer might genuinely be "yes". If you implement/inherit List<Player> or IList<Player>, you're stuck with them; if that's ideal for your model, do it.
If you decide yes, that makes sense, and you want your object to be treatable as a collection/list of players (behaviour), and you therefore want to implement ICollection<Player> or IList<Player>, by all means do so. Notionally:
class FootballTeam : ... ICollection<Player>
{
...
}
If you want your object to contain a collection/list of players (data), and you therefore want the collection or list to be a property or member, by all means do so. Notionally:
class FootballTeam ...
{
public ICollection<Player> Players { get { ... } }
}
You might feel that you want people to be able to only enumerate the set of players, rather than count them, add to them or remove them. IEnumerable<Player> is a perfectly valid option to consider.
You might feel that none of these interfaces are useful in your model at all. This is less likely (IEnumerable<T> is useful in many situations) but it's still possible.
Anyone who attempts to tell you that one of these is categorically and definitively wrong in every case is misguided. Anyone who attempts to tell you it is categorically and definitively right in every case is misguided.
Move on to Implementation
Once you've decided on data and behaviour, you can make a decision about implementation. This includes which concrete classes you depend on via inheritance or composition.
This may not be a big step, and people often conflate design and implementation since it's quite possible to run through it all in your head in a second or two and start typing away.
A Thought Experiment
An artificial example: as others have mentioned, a team is not always "just" a collection of players. Do you maintain a collection of match scores for the team? Is the team interchangeable with the club, in your model? If so, and if your team isa collection of players, perhaps it also isa collection of staff and/or a collection of scores. Then you end up with:
class FootballTeam : ... ICollection<Player>,
ICollection<StaffMember>,
ICollection<Score>
{
....
}
Design notwithstanding, at this point in C# you won't be able to implement all of these by inheriting from List<T> anyway, since C# "only" supports single inheritance. (If you've tried this malarkey in C++, you may consider this a Good Thing.) Implementing one collection via inheritance and one via composition is likely to feel dirty. And properties such as Count become confusing to users unless you implement ICollection<Player>.Count and ICollection<StaffMember>.Count etc. explicitly, and then they're just painful rather than confusing. You can see where this is going; gut feeling whilst thinking down this avenue may well tell you it feels wrong to head in this direction (and rightly or wrongly, your colleagues might also if you implemented it this way!)
The Short Answer (Too Late)
The guideline about not inheriting from collection classes isn't C# specific, you'll find it in many programming languages. It is received wisdom not a law. One reason is that in practice composition is considered to often win out over inheritance in terms of comprehensibility, implementability and maintainability. It's more common with real world / domain objects to find useful and consistent "hasa" relationships than useful and consistent "isa" relationships unless you're deep in the abstract, most especially as time passes and the precise data and behaviour of objects in code changes. This shouldn't cause you to always rule out inheriting from collection classes; but it may be suggestive.
First of all, it has to do with usability. If you use inheritance, the Team class will expose behavior (methods) that are designed purely for object manipulation. For example, AsReadOnly() or CopyTo(obj) methods make no sense for the team object. Instead of the AddRange(items) method you would probably want a more descriptive AddPlayers(players) method.
If you want to use LINQ, implementing a generic interface such as ICollection<T> or IEnumerable<T> would make more sense.
As mentioned, composition is the right way to go about it. Just implement a list of players as a private variable.
Let me rewrite your question so you might see the subject from a different perspective.
When I need to represent a football team, I understand that it is basically a name. Like: "The Eagles"
string team = "The Eagles";
Then later I realized teams also have players.
Why can't I just extend the string type so that it also holds a list of players?
Your point of entry into the problem is arbitrary. Try to think about what a team has (properties), not what it is.
After you do that, you could see if it shares properties with other classes. And think about inheritance.
It depends on the context
When you consider your team as a list of players, you are projecting the "idea" of a football team down to one aspect: you reduce the "team" to the people you see on the field. This projection is only correct in a certain context. In a different context, it might be completely wrong. Imagine you want to become a sponsor of the team, so you have to talk to the managers of the team. In this context the team is projected to the list of its managers. And these two lists usually don't overlap very much. Other contexts are the current versus the former players, etc.
Unclear semantics
So the problem with considering a team as a list of its players is that its semantics depend on the context and that it cannot be extended when the context changes. Additionally, it is hard to express which context you are using.
Classes are extensible
When you are using a class with only one member (e.g. IList<Player> activePlayers), you can use the name of the member (and additionally its comment) to make the context clear. When there are additional contexts, you just add an additional member.
Classes are more complex
In some cases it might be overkill to create an extra class. Each class definition must be loaded by the runtime and kept in memory, which has a small cost in runtime performance and memory. When you have a very specific context it might be OK to consider a football team as a list of players. But in this case, you should really just use an IList<Player>, not a class derived from it.
Conclusion / Considerations
When you have a very specific context, it is OK to consider a team as a list of players. For example inside a method it is completely OK to write:
IList<Player> footballTeam = ...
When using F#, it can even be OK to create a type abbreviation:
type FootballTeam = IList<Player>
But when the context is broader or even unclear, you should not do this. This is especially the case when you create a new class and it is not clear in which contexts it will be used in the future. A warning sign is when you start to add additional attributes to your class (name of the team, coach, etc.). This is a clear sign that the context where the class will be used is not fixed and will change in the future. In this case you cannot consider the team as a list of players; you should model the list of the (currently active, not injured, etc.) players as an attribute of the team.
A football team is not a list of football players. A football team is composed of a list of football players!
This is logically wrong:
class FootballTeam : List<FootballPlayer>
{
    public string TeamName;
    public int RunningTotal;
}
and this is correct:
class FootballTeam
{
    public List<FootballPlayer> players;
    public string TeamName;
    public int RunningTotal;
}
I think the other answers pretty much go off on a tangent about whether a football team "is-a" List<FootballPlayer> or "has-a" List<FootballPlayer>, which really doesn't answer this question as written.
The OP chiefly asks for clarification on guidelines for inheriting from List<T>:
A guideline says that you shouldn't inherit from List<T>. Why not?
Because List<T> has no virtual methods. This is less of a problem in your own code, since you can usually switch out the implementation with relatively little pain - but can be a much bigger deal in a public API.
What is a public API and why should I care?
A public API is an interface you expose to 3rd party programmers. Think framework code. And recall that the guidelines being referenced are the ".NET Framework Design Guidelines" and not the ".NET Application Design Guidelines". There is a difference, and - generally speaking - public API design is a lot more strict.
If my current project does not and is not likely to ever have this public API, can I safely ignore this guideline? If I do inherit from List and it turns out I need a public API, what difficulties will I have?
Pretty much, yeah. You may want to consider the rationale behind it to see if it applies to your situation anyway, but if you're not building a public API then you don't particularly need to worry about API concerns like versioning (of which, this is a subset).
If you add a public API in the future, you will either need to abstract out your API from your implementation (by not exposing your List<T> directly) or violate the guidelines with the possible future pain that entails.
Why does it even matter? A list is a list. What could possibly change? What could I possibly want to change?
Depends on the context, but since we're using FootballTeam as an example - imagine that you can't add a FootballPlayer if it would cause the team to go over the salary cap. A possible way of adding that would be something like:
class FootballTeam : List<FootballPlayer> {
    // Does not compile: List<T>.Add is not virtual, so it cannot be overridden.
    public override void Add(FootballPlayer player) {
        if (this.Sum(p => p.Salary) + player.Salary > SALARY_CAP) {
            throw new InvalidOperationException("Would exceed salary cap!");
        }
        base.Add(player);
    }
}
Ah...but you can't override Add because it's not virtual (for performance reasons).
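To make the consequence concrete, here is a minimal sketch of what happens if you try to work around the problem by hiding Add with the `new` modifier instead. The FootballPlayer shape, the SalaryCap value and the HidingDemo helper are assumptions made up for this illustration, not part of the original question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical player type, used only for this illustration.
public class FootballPlayer
{
    public string Name { get; set; } = "";
    public decimal Salary { get; set; }
}

public class FootballTeam : List<FootballPlayer>
{
    public const decimal SalaryCap = 100m; // assumed value

    // 'new' hides List<T>.Add; it does not override it.
    public new void Add(FootballPlayer player)
    {
        if (this.Sum(p => p.Salary) + player.Salary > SalaryCap)
        {
            throw new InvalidOperationException("Would exceed salary cap!");
        }
        base.Add(player);
    }
}

public static class HidingDemo
{
    // Adds one player through the checked method and one through a
    // List<FootballPlayer> reference, then returns the resulting count.
    public static int BypassCap()
    {
        var team = new FootballTeam();
        team.Add(new FootballPlayer { Name = "A", Salary = 90m });   // checked

        List<FootballPlayer> asList = team;                          // upcast
        asList.Add(new FootballPlayer { Name = "B", Salary = 90m }); // List<T>.Add, unchecked

        return team.Count;
    }
}
```

Because the call is bound at compile time to the static type of the reference, any caller that holds the team as a List<FootballPlayer> (or IList<FootballPlayer>) skips the salary check entirely, so both players end up on the team.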
If you're in an application (which, basically, means that you and all of your callers are compiled together) then you can now change to using IList<T> and fix up any compile errors:
class FootballTeam : IList<FootballPlayer> {
    private List<FootballPlayer> Players { get; } = new List<FootballPlayer>();
    // Implements IList<T>.Add, so no 'override' keyword is needed.
    public void Add(FootballPlayer player) {
        if (this.Players.Sum(p => p.Salary) + player.Salary > SALARY_CAP) {
            throw new InvalidOperationException("Would exceed salary cap!");
        }
        this.Players.Add(player);
    }
    /* boiler plate for rest of IList */
}
but, if you've publicly exposed this to a 3rd party, you just made a breaking change that will cause compile and/or runtime errors.
TL;DR - the guidelines are for public APIs. For private APIs, do what you want.
There are a lot of excellent answers here, but I want to touch on something I didn't see mentioned: Object oriented design is about empowering objects.
You want to encapsulate all your rules, additional work and internal details inside an appropriate object. In this way other objects interacting with this one don't have to worry about it all. In fact, you want to go a step further and actively prevent other objects from bypassing these internals.
When you inherit from List, all other objects can see you as a List. They have direct access to the methods for adding and removing players. And you'll have lost your control; for example:
Suppose you want to differentiate when a player leaves by knowing whether they retired, resigned or were fired. You could implement a RemovePlayer method that takes an appropriate input enum. However, by inheriting from List, you would be unable to prevent direct access to Remove, RemoveAll and even Clear. As a result, you've actually disempowered your FootballTeam class.
Additional thoughts on encapsulation... You raised the following concern:
It makes my code needlessly verbose. I must now call my_team.Players.Count instead of just my_team.Count.
You're correct, that would be needlessly verbose for all clients to use your team. However, that problem is very small in comparison to the fact that you've exposed List Players to all and sundry so they can fiddle with your team without your consent.
You go on to say:
It just plain doesn't make any sense. A football team doesn't "have" a list of players. It is the list of players. You don't say "John McFootballer has joined SomeTeam's players". You say "John has joined SomeTeam".
You're wrong about the first bit: Drop the word 'list', and it's actually obvious that a team does have players.
However, you hit the nail on the head with the second. You don't want clients calling ateam.Players.Add(...). You do want them calling ateam.AddPlayer(...). And your implementation would (possibly amongst other things) call Players.Add(...) internally.
Hopefully you can see how important encapsulation is to the objective of empowering your objects. You want to allow each class to do its job well without fear of interference from other objects.
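The points above can be sketched roughly as follows. This is only an illustration under assumptions: the DepartureReason enum, the Departures log and the member names are hypothetical, chosen to mirror the AddPlayer / RemovePlayer idea described in this answer.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reasons for leaving, as suggested in the text.
public enum DepartureReason { Retired, Resigned, Fired }

public class FootballPlayer
{
    public string Name { get; }
    public FootballPlayer(string name) => Name = name;
}

public class FootballTeam
{
    // The list is private, so callers cannot reach Remove, RemoveAll or Clear.
    private readonly List<FootballPlayer> _players = new List<FootballPlayer>();
    private readonly List<(FootballPlayer Player, DepartureReason Reason)> _departures =
        new List<(FootballPlayer Player, DepartureReason Reason)>();

    // Read-only views: callers can inspect the roster but not mutate it.
    public IReadOnlyList<FootballPlayer> Players => _players.AsReadOnly();
    public IReadOnlyList<(FootballPlayer Player, DepartureReason Reason)> Departures =>
        _departures.AsReadOnly();

    public int Count => _players.Count;

    public void AddPlayer(FootballPlayer player)
    {
        if (player == null) throw new ArgumentNullException(nameof(player));
        _players.Add(player);
    }

    // The only way to remove a player; the reason is always recorded.
    public bool RemovePlayer(FootballPlayer player, DepartureReason reason)
    {
        if (!_players.Remove(player)) return false;
        _departures.Add((player, reason));
        return true;
    }
}
```

With this shape, clients write team.AddPlayer(john) and team.Count, while every removal is forced through RemovePlayer and therefore always carries a reason.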
Does allowing people to say
myTeam.subList(3, 5);
make any sense at all? If not then it shouldn't be a List.
It depends on the behaviour of your "team" object. If it behaves just like a collection, it might be OK to represent it first with a plain List. Then you might start to notice that you keep duplicating code that iterates on the list; at this point you have the option of creating a FootballTeam object that wraps the list of players. The FootballTeam class becomes the home for all the code that iterates on the list of players.
It makes my code needlessly verbose. I must now call my_team.Players.Count instead of just my_team.Count. Thankfully, with C# I can define indexers to make indexing transparent, and forward all the methods of the internal List... But that's a lot of code! What do I get for all that work?
Encapsulation. Your clients need not know what goes on inside of FootballTeam. For all your clients know, it might be implemented by looking the list of players up in a database. They don't need to know, and this improves your design.
It just plain doesn't make any sense. A football team doesn't "have" a list of players. It is the list of players. You don't say "John McFootballer has joined SomeTeam's players". You say "John has joined SomeTeam". You don't add a letter to "a string's characters", you add a letter to a string. You don't add a book to a library's books, you add a book to a library.
Exactly :) you will say footballTeam.Add(john), not footballTeam.List.Add(john). The internal list will not be visible.
What is the correct C# way of representing a data structure...
Remember, "All models are wrong, but some are useful." -George E. P. Box
There is no "correct way", only a useful one.
Choose one that is useful to you and/or your users. That's it. Develop economically, don't over-engineer. The less code you write, the less code you will need to debug. (Read the following edits.)
-- Edited
My best answer would be... it depends. Inheriting from a List would expose the clients of this class to methods that maybe should not be exposed, primarily because FootballTeam looks like a business entity.
-- Edition 2
I sincerely don't remember what I was referring to with the “don't over-engineer” comment. While I believe the KISS mindset is a good guide, I want to emphasize that inheriting a business class from List would create more problems than it resolves, due to abstraction leakage.
On the other hand, I believe there are a limited number of cases where simply inheriting from List is useful. As I wrote in the previous edit, it depends. The answer in each case is heavily influenced by knowledge, experience and personal preferences.
Thanks to #kai for helping me to think more precisely about the answer.
This reminds me of the "is a" versus "has a" tradeoff. Sometimes it is easier and makes more sense to inherit directly from a super class. Other times it makes more sense to create a standalone class and include the class you would have inherited from as a member variable. You can still access the functionality of the class but are not bound to the interface or any other constraints that might come from inheriting from the class.
Which do you do? As with a lot of things...it depends on the context. The guide I would use is that in order to inherit from another class there truly should be an "is a" relationship. So if you are writing a class called BMW, it could inherit from Car because a BMW truly is a car. A Horse class can inherit from the Mammal class because a horse actually is a mammal in real life and any Mammal functionality should be relevant to Horse. But can you say that a team is a list? From what I can tell, it does not seem like a Team really "is a" List. So in this case, I would have a List as a member variable.
Problems with serializing
One aspect is missing. Classes that inherit from List can't be serialized correctly using XmlSerializer. In that case DataContractSerializer must be used instead, or a custom serialization implementation is needed.
public class DemoList : List<Demo>
{
    // With XmlSerializer this property won't be serialized.
    // There is no error; the data is simply not there.
    public string AnyPropertyInDerivedFromList { get; set; }
}
public class Demo
{
    // This property will be serialized.
    public string AnyPropertyInDemo { get; set; }
}
Further reading: When a class is inherited from List<>, XmlSerializer doesn't serialize other attributes
Use IList instead
Personally I wouldn't inherit from List but implement IList. Visual Studio will do the job for you and create a fully working implementation. Look here: How to get a full working implementation of IList
What the guidelines say is that the public API should not reveal the internal design decision of whether you are using a list, a set, a dictionary, a tree or whatever. A "team" is not necessarily a list. You may implement it as a list but users of your public API should use your class on a need-to-know basis. This allows you to change your decision and use a different data structure without affecting the public interface.
When they say List<T> is "optimized" I think they mean that it doesn't have features like virtual methods, which are a bit more expensive. So the problem is that once you expose List<T> in your public API, you lose the ability to enforce business rules or customize its functionality later. But if you are using this inherited class internally within your project (as opposed to potentially exposing it to thousands of your customers/partners/other teams as an API) then it may be OK if it saves you time and it is the functionality you want to duplicate. The advantage of inheriting from List<T> is that you eliminate a lot of dumb wrapper code that is never going to be customized in the foreseeable future. Also, if you want your class to explicitly have exactly the same semantics as List<T> for the life of your APIs, then it may be OK too.
I often see a lot of people doing tons of extra work just because an FxCop rule says so or someone's blog says it's a "bad" practice. Many times, this turns code into design pattern palooza weirdness. As with a lot of guidelines, treat it as a guideline that can have exceptions.
My dirty secret: I don't care what people say, and I do it. The .NET Framework is full of "XxxxCollection" classes (UIElementCollection is one example off the top of my head).
So what stops me saying:
team.Players.ByName("Nicolas")
When I find it better than
team.ByName("Nicolas")
Moreover, my PlayerCollection might be used by other class, like "Club" without any code duplication.
club.Players.ByName("Nicolas")
The best practices of yesterday might not be the ones of tomorrow. There is no reason behind most best practices; most are only wide agreement among the community. Instead of asking the community whether it will blame you when you do that, ask yourself: what is more readable and maintainable?
team.Players.ByName("Nicolas")
or
team.ByName("Nicolas")
Really. Do you have any doubt? Now maybe you need to play with other technical constraints that prevent you from using List<T> in your real use case. But don't add a constraint that should not exist. If Microsoft did not document the why, then it is surely a "best practice" coming from nowhere.
While I don't have a complex comparison as most of these answers do, I would like to share my method for handling this situation. By implementing IEnumerable<T>, you can allow your Team class to support LINQ query extensions, without publicly exposing all the methods and properties of List<T>.
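using System.Collections;
using System.Collections.Generic;

class Team : IEnumerable<Player>
{
    private readonly List<Player> playerList;
    public Team()
    {
        playerList = new List<Player>();
    }
    // Generic enumerator required by IEnumerable<Player>.
    public IEnumerator<Player> GetEnumerator()
    {
        return playerList.GetEnumerator();
    }
    // Non-generic enumerator required by IEnumerable.
    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
    ...
}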
class Player
{
...
}
I just wanted to add that Bertrand Meyer, the inventor of Eiffel and design by contract, would have Team inherit from List<Player> without so much as batting an eyelid.
In his book, Object-Oriented Software Construction, he discusses the implementation of a GUI system where rectangular windows can have child windows. He simply has Window inherit from both Rectangle and Tree<Window> to reuse the implementation.
However, C# is not Eiffel. The latter supports multiple inheritance and renaming of features. In C#, when you subclass, you inherit both the interface and the implementation. You can override the implementation, but the calling conventions are copied directly from the superclass. In Eiffel, however, you can modify the names of the public methods, so you can rename Add and Remove to Hire and Fire in your Team. If an instance of Team is upcast back to List<Player>, the caller will use Add and Remove to modify it, but your virtual methods Hire and Fire will be called.
If your class users need all the methods and properties List has, you should derive your class from it. If they don't need them, enclose the List and make wrappers for the methods your class users actually need.
This is a strict rule, if you write a public API, or any other code that will be used by many people. You may ignore this rule if you have a tiny app and no more than 2 developers. This will save you some time.
For tiny apps, you may also consider choosing another, less strict language. Ruby, JavaScript - anything that allows you to write less code.
I think I don't agree with your generalization. A team isn't just a collection of players. A team has so much more information about it - name, emblem, collection of management/admin staff, collection of coaching crew, then collection of players. So properly, your FootballTeam class should have 3 collections and not itself be a collection; if it is to properly model the real world.
You could consider a PlayerCollection class which, like the specialized StringCollection, offers some other facilities, such as validation and checks before objects are added to or removed from the internal store.
Perhaps the notion of a PlayerCollection better suits your preferred approach?
public class PlayerCollection : Collection<Player>
{
}
And then the FootballTeam can look like this:
public class FootballTeam
{
public string Name { get; set; }
public string Location { get; set; }
public ManagementCollection Management { get; protected set; } = new ManagementCollection();
public CoachingCollection CoachingCrew { get; protected set; } = new CoachingCollection();
public PlayerCollection Players { get; protected set; } = new PlayerCollection();
}
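As a rough sketch of the "validation and checks" idea: Collection<T> exposes protected virtual hooks (InsertItem, SetItem, RemoveItem, ClearItems) that every mutation goes through, which is what List<T> lacks. The MaxPlayers limit and the Player shape below are assumptions invented for the example.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical player type for this illustration.
public class Player
{
    public string Name { get; set; } = "";
}

public class PlayerCollection : Collection<Player>
{
    public const int MaxPlayers = 3; // assumed roster limit

    // Called by Add and Insert, including calls made through IList<Player>.
    protected override void InsertItem(int index, Player item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        if (Count >= MaxPlayers) throw new InvalidOperationException("Roster is full.");
        base.InsertItem(index, item);
    }

    // Called by the indexer setter.
    protected override void SetItem(int index, Player item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        base.SetItem(index, item);
    }
}
```

Unlike a `new` Add on a List<T> subclass, these overrides still run when the collection is used through an IList<Player> or ICollection<Player> reference.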
Prefer Interfaces over Classes
Classes should avoid deriving from classes and instead implement the minimal interfaces necessary.
Inheritance breaks Encapsulation
Deriving from classes breaks encapsulation:
exposes internal details about how your collection is implemented
declares an interface (set of public functions and properties) that may not be appropriate
Among other things this makes it harder to refactor your code.
Classes are an Implementation Detail
Classes are an implementation detail that should be hidden from other parts of your code.
In short, a List<T> is a specific implementation of an abstract data type that may or may not be appropriate now and in the future.
Conceptually, the fact that the data type is called "list" is a bit of a red herring. A List<T> is a mutable ordered collection that supports amortized O(1) appends to the end, O(n) insertion and removal at arbitrary positions, and O(1) operations for retrieving the number of elements or getting and setting an element by index.
The Smaller the Interface the more Flexible the Code
When designing a data structure, the simpler the interface is, the more flexible the code is. Just look at how powerful LINQ is for a demonstration of this.
How to Choose Interfaces
When you think "list" you should start by saying to yourself, "I need to represent a collection of baseball players". So let's say you decide to model this with a class. What you should do first is decide on the minimal set of interfaces that this class will need to expose.
Some questions that can help guide this process:
Do I need to have the count? If not, consider implementing IEnumerable<T>.
Is this collection going to change after it has been initialized? If not, consider IReadOnlyList<T>.
Is it important that I can access items by index? If not, consider ICollection<T>.
Is the order in which I add items to the collection important? If not, maybe it is an ISet<T>.
If you do indeed want all of these things, then go ahead and implement IList<T>.
This way you will not be coupling other parts of the code to implementation details of your baseball players collection and will be free to change how it is implemented as long as you respect the interface.
By taking this approach you will find that code becomes easier to read, refactor, and reuse.
Notes about Avoiding Boilerplate
Implementing interfaces in a modern IDE should be easy. Right click and choose "Implement Interface". Then forward all of the implementations to a member class if you need to.
That said, if you find you are writing lots of boilerplate, it is potentially because you are exposing more functions than you should be. It is the same reason you shouldn't inherit from a class.
You can also design smaller interfaces that make sense for your application, and maybe just a couple of helper extension functions to map those interfaces to any others that you need. This is the approach I took in my own IArray interface for the LinqArray library.
When is it acceptable?
To quote Eric Lippert:
When you're building a mechanism that extends the List<T> mechanism.
For example, you are tired of the absence of the AddRange method in IList<T>:
public interface IMoreConvenientListInterface<T> : IList<T>
{
void AddRange(IEnumerable<T> collection);
}
public class MoreConvenientList<T> : List<T>, IMoreConvenientListInterface<T> { }
I'm playing around with writing an item crafting system that I might want to put into a game someday. There are Recipes which specify the ingredients they require and what they produce.
I wanted the recipes to be flexible, such that they only required a broad category of ingredients, not an exact one. For example, a recipe for a weapon blade might just say it requires a metal, not specifically steel. The recipes have to verify that the ingredients given are within the acceptable category. Some materials might belong to multiple categories.
Then I had a possibly brilliant, possibly insane idea. The .net type system already implements that! So for each material, I add a property of type Type, and use IsAssignableFrom to verify the ingredients' compatibility.
I have a file that looks like this:
public interface ItemType { }
public interface Material : ItemType { }
public interface Metal : Material { }
public interface Gold : Metal { }
public interface Silver : Metal { }
public interface Iron : Metal { }
public interface Steel : Metal { }
public interface Wood : Material { }
public interface Coal : Material { }
And so on. None of those are ever implemented. I'm just borrowing the built in type checking for my own purposes.
Is there anything necessarily wrong with this?
edit: actual question
If I've been clear enough to explain what I'm trying to accomplish here, then what would you suggest is a good way to go about it, ignoring this whole type system abuse thing? Would you have also used this solution, or something else?
Second question, are there any pitfalls to watch out for in what I've done here?
Is there anything necessarily wrong with this?
Yes, everything.
Classes and interfaces are meant to express behavior. There is no behavior in your code, so your code is miscommunicating its intentions. Usually, when you see an interface, you expect it to have some methods and expect those methods to be called. That is not the case here.
It will become impossible to define the materials and recipes in some kind of configuration/resource file, like most normal games do. So you have to recompile every time you want to change the materials or recipe a little.
It will become problematic to create items/materials that are somehow related. For example, let's say there are multiple tools and each tool can be made from different materials. In your case, you have to write down every combination. In the ideal case, you can just run a few nested for loops which create each combination.
You cannot parametrize the materials in any way without creating classes for them. For example, you might want different colors of wool. How would you do it? Create an interface for each color? Or use some kind of enum as a parameter? But then you have to create a class for that.
A better way would be a simple Item class that has a collection of tags. Even simple strings should be enough.
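A minimal sketch of the tag idea, under assumptions: the Item and Recipe shapes, the tag names and the slot-by-slot matching rule are all invented here for illustration, and in a real game the tags would probably be loaded from a data file rather than written in code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Item
{
    private readonly HashSet<string> _tags;

    public string Name { get; }

    public Item(string name, params string[] tags)
    {
        Name = name;
        // Tags are plain strings, compared case-insensitively.
        _tags = new HashSet<string>(tags, StringComparer.OrdinalIgnoreCase);
    }

    public bool HasTag(string tag) => _tags.Contains(tag);
}

public class Recipe
{
    private readonly List<string> _requiredTags;

    // One required tag per ingredient slot, e.g. ("metal", "wood").
    public Recipe(params string[] requiredTags) => _requiredTags = requiredTags.ToList();

    // Slot i is satisfied when ingredient i carries required tag i.
    public bool Accepts(IReadOnlyList<Item> ingredients) =>
        ingredients.Count == _requiredTags.Count &&
        _requiredTags.Select((tag, i) => ingredients[i].HasTag(tag)).All(ok => ok);
}
```

For example, with `var steel = new Item("Steel", "material", "metal");` and `var oak = new Item("Oak", "material", "wood");`, a `new Recipe("metal", "wood")` accepts `new[] { steel, oak }` but rejects `new[] { oak, steel }`. An item can belong to several categories simply by carrying several tags.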
tl;dr
In a good design, should accessing the database be handled in a separate business logic layer (in an ASP.NET MVC model), or is it OK to pass IQueryables or DbContext objects to a controller?
Why? What are the pros and cons of each?
I'm building an ASP.NET MVC application in C#. It uses EntityFramework as an ORM.
Let's simplify this scenario a bit.
I have a database table with cute fluffy kittens. Each kitten has a kitten image link, kitten fluffiness index, kitten name and kitten id. These map to an EF generated POCO called Kitten. I might use this class in other projects and not just the asp.net MVC project.
I have a KittenController which should fetch the latest fluffy kittens at /Kittens. It may contain some logic selecting the kitten, but not too much logic. I've been arguing with a friend about how to implement this, I won't disclose sides :)
Option 1: db in the controller:
public ActionResult Kittens() // some parameters might be here
{
using(var db = new KittenEntities()){ // db can also be injected,
var result = db.Kittens // this explicit query is here
.Where(kitten=>kitten.fluffiness > 10)
.Select(kitten=>new {
Name=kitten.name,
Url=kitten.imageUrl
}).Take(10);
return Json(result,JsonRequestBehavior.AllowGet);
}
}
Option 2: Separate model
public class Kitten{
    public string Name {get; set; }
    public string Url {get; set; }
    private Kitten(string name, string url){
        Name = name;
        Url = url;
    }
    public static IEnumerable<Kitten> GetLatestKittens(int fluffinessIndex=10){
        using(var db = new KittenEntities()){ //connection can also be injected
            return db.Kittens.Where(kitten=>kitten.fluffiness > fluffinessIndex)
                .Select(entity=>new Kitten(entity.name,entity.imageUrl))
                .Take(10).ToList();
        }
    } // it's static for simplicity here, in fact it's probably also an object method
    // Also, in practice it might be a service in a services directory creating the
    // Objects and fetching them from the DB, and just the kitten MVC _type_ here
}
//----Then the controller:
public ActionResult Kittens() // some parameters might be here
{
    return Json(Kitten.GetLatestKittens(10),JsonRequestBehavior.AllowGet);
}
Notes: GetLatestKittens is unlikely to be used elsewhere in the code but it might. It's possible to use the constructor of Kitten instead of a static building method and changing the class for Kittens. Basically it's supposed to be a layer above the database entities so the controller does not have to be aware of the actual database, the mapper, or entity framework.
What are some pros and cons for each design?
Is there a clear winner? Why?
Note: Of course, alternative approaches are very valued as answers too.
Clarification 1: This is not a trivial application in practice. This is an application with tens of controllers and thousands of lines of code, and the entities are not only used here but in tens of other C# projects. The example here is a reduced test case.
The second approach is superior. Let's try a lame analogy:
You enter a pizza shop and walk over to the counter. "Welcome to McPizza Maestro Double Deluxe, may I take your order?" the pimpled cashier asks you, the void in his eyes threatening to lure you in. "Yeah I'll have one large pizza with olives". "Okay", the cashier replies and his voice croaks in the middle of the "o" sound. He yells towards the kitchen "One Jimmy Carter!"
And then, after waiting for a bit, you get a large pizza with olives. Did you notice anything peculiar? The cashier didn't say "Take some dough, spin it round like it's Christmas time, pour some cheese and tomato sauce, sprinkle olives and put in an oven for about 8 minutes!" Come to think of it, that's not peculiar at all. The cashier is simply a gateway between two worlds: The customer who wants the pizza, and the cook who makes the pizza. For all the cashier knows, the cook gets his pizza from aliens or slices them from Jimmy Carter (he's a dwindling resource, people).
That is your situation. Your cashier isn't dumb. He knows how to make pizza. That doesn't mean he should be making pizza, or telling someone how to make pizza. That's the cook's job. As other answers (notably Florian Margaine's and Madara Uchiha's) illustrated, there is a separation of responsibilities. The model might not do much, it might be just one function call, it might be even one line - but that doesn't matter, because the controller doesn't care.
Now, let's say the owners decide that pizzas are just a fad (blasphemy!) and you switch over to something more contemporary, a fancy burger joint. Let's review what happens:
You enter a fancy burger joint and walk over to the counter. "Welcome to Le Burger Maestro Double Deluxe, may I take your order?" "yeah, I'll have one large hamburger with olives". "Okay", and he turns to the kitchen, "One Jimmy Carter!"
And then, you get a large hamburger with olives (ew).
Options 1 and 2 are a bit extreme, like the choice between the devil and the deep blue sea, but if I had to choose between the two I would prefer option 1.
First of all, option 2 will throw a runtime exception because Entity Framework does not support projecting into an entity (Select(e => new Kitten(...))) and does not allow a constructor with parameters in a projection. Now, this note seems a bit pedantic in this context, but by projecting into the entity and returning a Kitten (or an enumeration of Kittens) you are hiding the real problem with that approach.
Obviously, your method returns two properties of the entity that you want to use in your view - the kitten's name and imageUrl. Because these are only a selection of all Kitten properties returning a (half-filled) Kitten entity would not be appropriate. So, what type to actually return from this method?
You could return object (or IEnumerable<object>) (that's how I understand your comment about the "object method") which is fine if you pass the result into Json(...) to be processed in Javascript later. But you would lose all compile time type information and I doubt that an object result type is useful for anything else.
You could return some named type that just contains the two properties - maybe called "KittensListDto".
Now, this is only one method for one view - the view to list kittens. Then you have a details view to display a single kitten, then an edit view and then a delete confirm view maybe. Four views for an existing Kitten entity, each of which needs possibly different properties and each of which would need a separate method and projection and a different DTO type. The same for the Dog entity and for 100 entities more in the project and you get perhaps 400 methods and 400 return types.
And most likely not a single one will ever be reused anywhere other than in this specific view. Why would you want to Take 10 kittens with just name and imageUrl anywhere a second time? Do you have a second kittens list view? If so, it will exist for a reason, and the queries are identical only by accident and only for now: if one changes, the other one does not necessarily change with it. Otherwise the list view is not properly "reused" and should not exist twice. Or is the same list used by an Excel export maybe? But perhaps the Excel users want to have 1000 kittens tomorrow, while the view should still display only 10. Or the view should display the kitten's Age tomorrow, but the Excel users don't want to have that because their Excel macros would not run correctly anymore with that change. Just because two pieces of code are identical, they don't have to be factored out into a common reusable component if they are in a different context or have different semantics. You had better keep them as GetLatestKittensForListView and GetLatestKittensForExcelExport. Or better still, don't have such methods in your service layer at all.
In the light of these considerations, here is an excursion to a pizza shop as an analogy for why the first approach is superior :)
"Welcome to BigPizza, the custom Pizza shop, may I take your order?" "Well, I'd like to have a Pizza with olives, but tomato sauce on top and cheese at the bottom and bake it in the oven for 90 minutes until it's black and hard like a flat rock of granite." "OK, Sir, custom Pizzas are our profession, we'll make it."
The cashier goes to the kitchen. "There is a psycho at the counter, he wants to have a Pizza with... it's a rock of granite with ... wait ... we need to have a name first", he tells the cook.
"No!", the cook screams, "not again! You know we tried that already." He takes a stack of paper with 400 pages, "here we have rock of granite from 2005, but... it didn't have olives, but paprika instead... or here is top tomato ... but the customer wanted it baked only half a minute." "Maybe we should call it TopTomatoGraniteRockSpecial?" "But it doesn't take the cheese at the bottom into account..." The cashier: "That's what Special is supposed to express." "But having the Pizza rock formed like a pyramid would be special as well", the cook replies. "Hmmm ... it is difficult...", the desperate cashier says.
"IS MY PIZZA ALREADY IN THE OVEN?", suddenly it shouts through the kitchen door. "Let's stop this discussion, just tell me how to make this Pizza, we are not going to have such a Pizza a second time", the cook decides. "OK, it's a Pizza with olives, but tomato sauce on top and cheese at the bottom and bake it in the oven for 90 minutes until it's black and hard like a flat rock of granite."
If option 1 violates a separation of concerns principle by using a database context in the view layer, then option 2 violates the same principle by having presentation-centric query logic in the service or business layer. From a technical viewpoint it does not, but it will end up with a service layer that is anything but "reusable" outside of the presentation layer. And it has much higher development and maintenance costs, because for every required piece of data in a controller action you have to create services, methods and return types.
Now, there actually might be queries or query parts that are reused often, and that's why I think that option 1 is almost as extreme as option 2: for example a Where clause by the key (which will probably be used in the details, edit and delete confirmation views), filtering out "soft deleted" entities, filtering by a tenant in a multi-tenant architecture, disabling change tracking, etc. For such really repetitive query logic I could imagine that extracting it into a service or repository layer (but maybe only into reusable extension methods) might make sense, like
public IQueryable<Kitten> GetKittens()
{
return context.Kittens.AsNoTracking().Where(k => !k.IsDeleted);
}
Anything else that follows after - like projecting properties - is view specific and I would not like to have it in this layer. In order to make this approach possible IQueryable<T> must be exposed from the service/repository. It does not mean that the select must be directly in the controller action. Especially fat and complex projections (that maybe join other entities by navigation properties, perform groupings, etc.) could be moved into extension methods of IQueryable<T> that are collected in other files, directories or even another project, but still a project that is an appendix to the presentation layer and much closer to it than to the service layer. An action could then look like this:
public ActionResult Kittens()
{
var result = kittenService.GetKittens()
.Where(kitten => kitten.fluffiness > 10)
.OrderBy(kitten => kitten.name)
.Select(kitten => new {
Name=kitten.name,
Url=kitten.imageUrl
})
.Take(10);
return Json(result,JsonRequestBehavior.AllowGet);
}
Or like this:
public ActionResult Kittens()
{
var result = kittenService.GetKittens()
.ToKittenListViewModel(10, 10);
return Json(result,JsonRequestBehavior.AllowGet);
}
With ToKittenListViewModel() being:
public static IEnumerable<object> ToKittenListViewModel(
this IQueryable<Kitten> kittens, int minFluffiness, int pageItems)
{
return kittens
.Where(kitten => kitten.fluffiness > minFluffiness)
.OrderBy(kitten => kitten.name)
.Select(kitten => new {
Name = kitten.name,
Url = kitten.imageUrl
})
.Take(pageItems)
.AsEnumerable()
.Cast<object>();
}
That's just a basic idea and a sketch that another solution could be in the middle between option 1 and 2.
Well, it all depends on the overall architecture and requirements, and everything I wrote above might be useless and wrong. Do you have to consider that the ORM or data access technology could be changed in future? Could there be a physical boundary between controller and database? Is the controller disconnected from the context, and will the data need to be fetched via a web service, for example, in future? This would require a very different approach which would lean more towards option 2.
Such an architecture is so different that, in my opinion, you simply can't say "maybe" or "not now, but possibly it could be a requirement in future, or possibly it won't". This is something that the project's stakeholders have to define before you can proceed with architectural decisions, as it will increase development costs dramatically and it will be wasted money in development and maintenance if the "maybe" never becomes reality.
I was talking only about queries or GET requests in a web app which have rarely something that I would call "business logic" at all. POST requests and modifying data are a whole different story. If it is forbidden that an order can be changed after it is invoiced for example this is a general "business rule" that normally applies no matter which view or web service or background process or whatever tries to change an order. I would definitely put such a check for the order status into a business service or any common component and never into a controller.
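To illustrate the invoiced-order rule just mentioned, here is a minimal sketch of such a check living in a business service rather than in a controller. The names Order, OrderStatus, OrderService and ChangeTotal are hypothetical and only serve the illustration; a real service would also load and save the order.

```csharp
using System;

public enum OrderStatus { Open, Invoiced }

// Hypothetical domain object, reduced to what the rule needs.
public class Order
{
    public int Id { get; set; }
    public OrderStatus Status { get; set; }
    public decimal Total { get; set; }
}

// Hypothetical business service: the single place that enforces the rule,
// no matter which view, web service or background process calls it.
public class OrderService
{
    public void ChangeTotal(Order order, decimal newTotal)
    {
        if (order == null) throw new ArgumentNullException(nameof(order));

        // The general business rule: invoiced orders are immutable.
        if (order.Status == OrderStatus.Invoiced)
        {
            throw new InvalidOperationException(
                "Order " + order.Id + " is already invoiced and cannot be changed.");
        }

        order.Total = newTotal;
    }
}
```

A controller action would then only call ChangeTotal and translate the exception into an error response, so the rule cannot be bypassed by adding another entry point.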
There might be an argument against using IQueryable<T> in a controller action because it is coupled to LINQ-to-Entities and it will make unit tests difficult. But what is a unit test going to test in a controller action that doesn't contain any business logic, that gets parameters passed in that usually come from a view via model binding or routing - not covered by the unit test - that uses a mocked repository/service returning IEnumerable<T> - database query and access is not tested - and that returns a View - correct rendering of the view is not tested?
This is the key phrase there:
I might use this class in other projects and not just the asp.net MVC project.
A controller is HTTP-centric. It is only there to handle HTTP requests. If you want to use your model in any other project, i.e. your business logic, you can't have any logic in the controllers. You must be able to take off your model, put it somewhere else, and all your business logic still works.
So, no, don't access your database from your controller. It kills any possible reuse you might ever get.
Do you really want to rewrite all your db/linq requests in all your projects when you can have simple methods that you reuse?
Another thing: your function in option 1 has two responsibilities: it fetches the result from a mapper object and it displays it. That's too many responsibilities. There is an "and" in the list of responsibilities. Your option 2 only has one responsibility: being the link between the model and the view.
I'm not sure about how ASP.NET or C# does things. But I do know MVC.
In MVC, you separate your application into two major layers: The Presentational layer (which contains the Controller and View), and the Model layer (which contains... the Model).
The point is to separate the 3 major responsibilities in the application:
The application logic, handling request, user input, etc. That's the Controller.
The presentation logic, handling templating, display, formats. That's the View.
The business logic or "heavy logic", handling basically everything else. That's your actual application basically, where everything your application is supposed to do gets done. This part handles domain objects that represents the information structures of the application, it handles the mapping of those objects into permanent storage (be it session, database or files).
As you can see, database handling is found on the Model, and it has several advantages:
The controller is less tied to the model. Because "the work" gets done in the Model, should you want to change your controller, you'll be able to do so more easily if your database handling is in the Model.
You gain more flexibility. In the case where you want to change your mapping scheme (I want to switch to Postgres from MySQL), I only need to change it once (in the base Mapper definition).
For more information, see the excellent answer here: How should a model be structured in MVC?
I prefer the second approach. It at least separates the controller from the business logic. It is still a little bit hard to unit test (maybe I'm not good at mocking).
I personally prefer the following approach. The main reason is that it makes unit testing easy for each layer: presentation, business logic and data access. Besides, you can see that approach in a lot of open source projects.
namespace MyProject.Web.Controllers
{
public class MyController : Controller
{
private readonly IKittenService _kittenService ;
public MyController(IKittenService kittenService)
{
_kittenService = kittenService;
}
public ActionResult Kittens()
{
// Delegate the query to the injected service and return the result.
var result = _kittenService.GetLatestKittens(10);
return Json(result, JsonRequestBehavior.AllowGet);
}
}
}
namespace MyProject.Domain.Kittens
{
public class Kitten
{
public string Name {get; set; }
public string Url {get; set; }
}
}
namespace MyProject.Services.KittenService
{
public interface IKittenService
{
IEnumerable<Kitten> GetLatestKittens(int fluffinessIndex=10);
}
}
namespace MyProject.Services.KittenService
{
public class KittenService : IKittenService
{
public IEnumerable<Kitten> GetLatestKittens(int fluffinessIndex=10)
{
using(var db = new KittenEntities())
{
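// Project onto the domain Kitten (an anonymous type cannot be returned
// as IEnumerable<Kitten>) and materialize before the context is disposed.
return db.Kittens // this explicit query is here
.Where(kitten=>kitten.fluffiness > fluffinessIndex)
.Select(kitten=>new Kitten {
Name=kitten.name,
Url=kitten.imageUrl
}).Take(10).ToList();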
}
}
}
}
@Win has the idea I'd more or less follow.
Have the presentation layer just present.
The Controller simply acts as a bridge, it does nothing really, it is the middle man. Should be easy to test.
The DAL is the hardest part. Some like to separate it out on a web service, I have done so for a project once. That way you can also have the DAL act as an API for others (internally or externally) to consume - so WCF or WebAPI comes to mind.
That way your DAL is completely independent of your web server. If someone hacks your server, the DAL is probably still secure.
It's up to you I guess.
Single Responsibility Principle. Each of your classes should have one and only one reason to change. @Zirak gives a good example of how each person has a single responsibility in the chain of events.
Let's look at the hypothetical test case you have provided.
public ActionResult Kittens() // some parameters might be here
{
using(var db = new KittenEntities()){ // db can also be injected,
var result = db.Kittens // this explicit query is here
.Where(kitten=>kitten.fluffiness > 10)
.Select(kitten=>new {
Name=kitten.name,
Url=kitten.imageUrl
}).Take(10);
return Json(result,JsonRequestBehavior.AllowGet);
}
}
With a service layer in between, it might look something like this.
public ActionResult Kittens() // some parameters might be here
{
using(var service = new KittenService())
{
var result = service.GetFluffyKittens();
return Json(result,JsonRequestBehavior.AllowGet);
}
}
public class KittenService : IDisposable
{
public IEnumerable<Kitten> GetFluffyKittens()
{
using(var db = new KittenEntities()){ // db can also be injected,
return db.Kittens // this explicit query is here
.Where(kitten=>kitten.fluffiness > 10)
.Select(kitten=>new {
Name=kitten.name,
Url=kitten.imageUrl
}).Take(10);
}
}
}
With a few more imaginary controller classes, you can see how this would be much easier to reuse. That's great! We have code reuse, but there's even more benefit. Let's say, for example, our Kitten website is taking off like crazy, everyone wants to look at fluffy kittens, so we need to partition our database (shard). The constructor for all our db calls needs to be injected with the connection to the proper database. With our controller-based EF code, we would have to change the controllers because of a DATABASE issue.
Clearly that means that our controllers are now dependent upon database concerns. They now have too many reasons to change, which can potentially lead to accidental bugs in the code and to retesting code that is unrelated to that change.
With a service, we could do the following, while the controllers are protected from that change.
public class KittenService : IDisposable
{
public IEnumerable<Kitten> GetFluffyKittens()
{
using(var db = GetDbContextForFluffyKittens()){ // db can also be injected,
return db.Kittens // this explicit query is here
.Where(kitten=>kitten.fluffiness > 10)
.Select(kitten=>new {
Name=kitten.name,
Url=kitten.imageUrl
}).Take(10);
}
}
protected KittenEntities GetDbContextForFluffyKittens(){
// ... code to determine the least used shard and get connection string ...
var connectionString = GetShardThatIsntBusy();
return new KittenEntities(connectionString);
}
}
The key here is to isolate changes from reaching other parts of your code. You should be testing anything that is affected by a change in code, so you want to isolate changes from one another. This has the side effect of keeping your code DRY, so you end up with more flexible and reusable classes and services.
Separating the classes also allows you to centralize behavior that would have either been difficult or repetitive before. Think about logging errors in your data access. In the first method, you would need logging everywhere. With a layer in between you can easily insert some logging logic.
public class KittenService : IDisposable
{
public IEnumerable<Kitten> GetFluffyKittens()
{
Func<IEnumerable<Kitten>> func = () => {
using(var db = GetDbContextForFluffyKittens()){ // db can also be injected,
return db.Kittens // this explicit query is here
.Where(kitten=>kitten.fluffiness > 10)
.Select(kitten=>new {
Name=kitten.name,
Url=kitten.imageUrl
}).Take(10);
}
};
return this.Execute(func);
}
protected KittenEntities GetDbContextForFluffyKittens(){
// ... code to determine the least used shard and get connection string ...
var connectionString = GetShardThatIsntBusy();
return new KittenEntities(connectionString);
}
// Generic wrapper: runs any data access delegate and logs failures in one place.
protected T Execute<T>(Func<T> func){
try
{
return func();
}
catch(Exception ex){
Logging.Log(ex);
throw; // rethrow without resetting the stack trace
}
}
}
Either way is not so good for testing. Use dependency injection to get the DI container to create the db context and inject it into the controller constructor.
EDIT: a little more on testing
If you can test, you can see whether your application works per spec before you publish.
If you can't test easily you won't write your test.
from that chat room:
Okay, so on a trivial application you write it and it doesn't change very much,
but on a non trivial application you get these nasty things called dependencies, which when you change one breaks a lot of shit, so you use Dependency injection to inject a repo that you can fake, and then you can write unit tests in order to make sure your code doesn't
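To make the dependency injection point above a bit more concrete, here is a minimal, framework-free sketch. IKittenRepository, KittenDto, FakeKittenRepository and this stripped-down KittensController are hypothetical names, not the question's code; in a real MVC project the controller would derive from Controller and a DI container would supply the repository.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of the data the view needs.
public class KittenDto
{
    public string Name { get; set; }
    public string Url { get; set; }
}

// Hypothetical abstraction over the data access. The production
// implementation would wrap the db context; tests supply a fake.
public interface IKittenRepository
{
    IEnumerable<KittenDto> GetFluffyKittens(int count);
}

// Stand-in for the MVC controller: the dependency arrives through the
// constructor instead of being created inside the action.
public class KittensController
{
    private readonly IKittenRepository _repository;

    public KittensController(IKittenRepository repository)
    {
        _repository = repository;
    }

    public IList<KittenDto> Kittens()
    {
        return _repository.GetFluffyKittens(10).ToList();
    }
}

// In-memory fake used only by tests; no database is involved.
public class FakeKittenRepository : IKittenRepository
{
    private readonly List<KittenDto> _kittens;

    public FakeKittenRepository(IEnumerable<KittenDto> kittens)
    {
        _kittens = kittens.ToList();
    }

    public IEnumerable<KittenDto> GetFluffyKittens(int count)
    {
        return _kittens.Take(count);
    }
}
```

A unit test can now construct the controller with the fake and check what the action returns, without touching a database.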
If I had (note: really had) to choose between the 2 given options, I'd say 1 for simplicity, but I don't recommend using it since it's hard to maintain and causes a lot of duplicate code.
A controller should contain as little business logic as possible. It should only delegate data access, map the result to a ViewModel and pass it to the View.
If you want to abstract data access away from your controller (which is a good thing), you might want to create a service layer containing a method like GetLatestKittens(int fluffinessIndex).
I don't recommend placing data access logic in your POCOs either; this doesn't allow you to switch to another ORM (NHibernate for example) and reuse the same POCOs.
How does one go about creating an API that is fluent in nature?
Is this using extension methods primarily?
This article explains it much better than I ever could.
EDIT, can't squeeze this in a comment...
There are two sides to interfaces: the implementation and the usage. There's more work to be done on the creation side, I agree with that; however, the main benefits are found on the usage side of things. Indeed, for me the main advantage of fluent interfaces is a more natural, easier to remember and use and, why not, more aesthetically pleasing API. And just maybe, the effort of having to squeeze an API into a fluent form leads to a better thought-out API?
As Martin Fowler says in the original article about fluent interfaces:
Probably the most important thing to notice about this style is that the intent is to do something along the lines of an internal DomainSpecificLanguage. Indeed this is why we chose the term 'fluent' to describe it, in many ways the two terms are synonyms. The API is primarily designed to be readable and to flow. The price of this fluency is more effort, both in thinking and in the API construction itself. The simple API of constructor, setter, and addition methods is much easier to write. Coming up with a nice fluent API requires a good bit of thought.
As in most cases APIs are created once and used over and over again, the extra effort may be worth it.
And verbose? I'm all for verbosity if it serves the readability of a program.
MrBlah,
Though you can write extension methods to write a fluent interface, a better approach is using the builder pattern. I'm in the same boat as you and I'm trying to figure out a few advanced features of fluent interfaces.
Below you'll see some sample code that I created in another thread
public class Coffee
{
private bool _cream;
private int _ounces;
public static Coffee Make { get { return new Coffee(); } }
public Coffee WithCream()
{
_cream = true;
return this;
}
public Coffee WithOuncesToServe(int ounces)
{
_ounces = ounces;
return this;
}
}
var myMorningCoffee = Coffee.Make.WithCream().WithOuncesToServe(16);
While many people cite Martin Fowler as being a prominent exponent in the fluent API discussion, his early design claims actually revolve around a fluent builder pattern or method chaining. Fluent APIs can be further evolved into actual internal domain-specific languages. An article that explains how a BNF notation of a grammar can be manually transformed into a "fluent API" can be seen here:
http://blog.jooq.org/2012/01/05/the-java-fluent-api-designer-crash-course/
It transforms this grammar:
Into this Java API:
// Initial interface, entry point of the DSL
interface Start {
End singleWord();
End parameterisedWord(String parameter);
Intermediate1 word1();
Intermediate2 word2();
Intermediate3 word3();
}
// Terminating interface, might also contain methods like execute();
interface End {
void end();
}
// Intermediate DSL "step" extending the interface that is returned
// by optionalWord(), to make that method "optional"
interface Intermediate1 extends End {
End optionalWord();
}
// Intermediate DSL "step" providing several choices (similar to Start)
interface Intermediate2 {
End wordChoiceA();
End wordChoiceB();
}
// Intermediate interface returning itself on word3(), in order to allow
// for repetitions. Repetitions can be ended any time because this
// interface extends End
interface Intermediate3 extends End {
Intermediate3 word3();
}
Java and C# being somewhat similar, the example certainly translates to your use-case as well. The above technique has been heavily used in jOOQ, a fluent API / internal domain-specific language modelling the SQL language in Java.
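As a rough idea of how the Java interfaces above might carry over to C#, here is a sketch. The I-prefixed interface names follow the usual C# convention, and the Sentence class, which simply records the words that were called, is a hypothetical implementation added for illustration; it is not taken from the linked article or from jOOQ.

```csharp
using System.Collections.Generic;

// Entry point of the DSL.
public interface IStart
{
    IEnd SingleWord();
    IEnd ParameterisedWord(string parameter);
    IIntermediate1 Word1();
    IIntermediate2 Word2();
    IIntermediate3 Word3();
}

// Terminating step.
public interface IEnd
{
    void End();
}

// OptionalWord() is optional because this step already is an IEnd.
public interface IIntermediate1 : IEnd
{
    IEnd OptionalWord();
}

// A mandatory choice between two words.
public interface IIntermediate2
{
    IEnd WordChoiceA();
    IEnd WordChoiceB();
}

// Word3() returns the same step again, which allows repetition.
public interface IIntermediate3 : IEnd
{
    IIntermediate3 Word3();
}

// Hypothetical implementation: one class implements every step and the
// interfaces restrict what the caller can do next.
public class Sentence : IStart, IIntermediate1, IIntermediate2, IIntermediate3
{
    private readonly List<string> _words = new List<string>();

    public IReadOnlyList<string> Words { get { return _words; } }
    public bool Ended { get; private set; }

    private Sentence Add(string word)
    {
        _words.Add(word);
        return this;
    }

    public IEnd SingleWord() { return Add("singleWord"); }
    public IEnd ParameterisedWord(string parameter) { return Add("parameterisedWord(" + parameter + ")"); }
    public IIntermediate1 Word1() { return Add("word1"); }
    public IIntermediate2 Word2() { return Add("word2"); }
    // Satisfies both IStart.Word3 and IIntermediate3.Word3.
    public IIntermediate3 Word3() { return Add("word3"); }
    public IEnd OptionalWord() { return Add("optionalWord"); }
    public IEnd WordChoiceA() { return Add("wordChoiceA"); }
    public IEnd WordChoiceB() { return Add("wordChoiceB"); }
    public void End() { Ended = true; }
}
```

Starting from an IStart reference, a chain such as Word1().OptionalWord().End() compiles, while Word2().End() does not, because IIntermediate2 forces one of the two choices first.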
This is a very old question, and this answer should probably be a comment rather than an answer, but I think it's a topic worth continuing to talk about, and this response is too long to be a comment.
The original thinking concerning "fluency" seems to have been basically about adding power and flexibility (method chaining, etc) to objects while making code a bit more self-explanatory.
For example
Company a = new Company("Calamaz Holding Corp");
Person p = new Person("Clapper", 113, 24, "Frank");
Company c = new Company(a, "Floridex", p, 1973);
is less "fluent" than
Company c = new Company().Set
.Name("Floridex")
.Manager(
new Person().Set.FirstName("Frank").LastName("Clapper").Awards(24)
)
.YearFounded(1973)
.ParentCompany(
new Company().Set.Name("Calamaz Holding Corp")
)
;
But to me, the latter is not really any more powerful or flexible or self-explanatory than
Company c = new Company(){
Name = "Floridex",
Manager = new Person(){ FirstName="Frank", LastName="Clapper", Awards=24 },
YearFounded = 1973,
ParentCompany = new Company(){ Name="Calamaz Holding Corp." }
};
..in fact I would call this last version easier to create, read and maintain than the previous, and I would say that it requires significantly less baggage behind the scenes, as well. Which to me is important, for (at least) two reasons:
1 - The cost associated with creating and maintaining layers of objects (no matter who does it) is just as real, relevant and important as the cost associated with creating and maintaining the code that consumes them.
2 - Code bloat embedded in layers of objects creates just as many (if not more) problems as code bloat in the code that consumes those objects.
Using the last version means you can add a (potentially useful) property to the Company class simply by adding one, very simple line of code.
That's not to say that I feel there's no place for method chaining. I really like being able to do things like (in JavaScript)
var _this = this;
Ajax.Call({
url: '/service/getproduct',
parameters: {productId: productId}
})
.Done(
function(product){
_this.showProduct(product);
}
)
.Fail(
function(error){
_this.presentError(error);
}
);
..where (in the hypothetical case I'm imagining) Done and Fail were additions to the original Ajax object, and were able to be added without changing any of the original Ajax object code or any of the existing code that made use of the original Ajax object, and without creating one-off things that were exceptions to the general organization of the code.
So I have definitely found value in making a subset of an object's functions return the 'this' object. In fact whenever I have a function that would otherwise return void, I consider having it return this.
But I haven't yet really found significant value in adding "fluent interfaces" (e.g. "Set") to an object, although theoretically it seems like there could be a sort of namespace-like code organization that could arise out of the practice of doing that, which might be worthwhile. ("Set" might not be particularly valuable, but "Command", "Query" and "Transfer" might, if it helped organize things and facilitate and minimize the impact of additions and changes.) One of the potential benefits of such a practice, depending on how it was done, might be improvement in a coder's typical level of care and attention to protection levels, the lack of which has certainly caused great volumes of grief.
KISS: Keep it simple stupid.
Fluent design is about one aesthetic design principle used throughout the API. Though the methodology you use in your API can change slightly, it is generally better to stay consistent.
You may think 'everyone can use this API, because it uses all different types of methodologies'. The truth is that users would start feeling lost because you're constantly changing the structure of the API to a new design principle or naming convention.
If you wish to change halfway through to a different design principle, e.g. converting from error codes to exception handling because some higher power commands it, it would be folly and would normally entail lots of pain. It is better to stay the course and add functionality that your customers can use and sell than to make them rewrite everything and rediscover all their problems again.
Following from the above, you can see that there is more at work in writing a fluent API than meets the eye. There are psychological and aesthetic choices to make before beginning to write one, and even then the feeling, need and desire to conform to customers' demands while staying consistent is the hardest of all.
What is a fluent API
Wikipedia defines them here http://en.wikipedia.org/wiki/Fluent_interface
Why Not to use a fluent interface
I would suggest not implementing a traditional fluent interface, as it increases the amount of code you need to write, complicates your code and is just adding unnecessary boilerplate.
Another option, do nothing!
Don't implement anything. Don't provide "easy" constructors for setting properties and don't provide a clever interface to help your client. Allow the client to set the properties however they normally would. In C# or VB.NET this could be as simple as using object initializers.
Car myCar = new Car { Name = "Chevrolet Corvette", Color = Color.Yellow };
So you don't need to create any clever interface in your code, and this is very readable.
If you have very complex sets of properties which must be set, or set in a certain order, then use a separate configuration object and pass it to the class via a separate property.
CarConfig conf = new CarConfig { Color = Color.Yellow, Fabric = Fabric.Leather };
Car myCar = new Car { Config = conf };
No and yes. The basics are a good interface or interfaces for the types that you want to behave fluently. Libraries with extension methods can extend this behavior and return the interface. Extension methods give others the possibility to extend your fluent API with more methods.
A good fluent design can be hard and takes a rather long trial and error period to fine-tune the basic building blocks. A fluent API just for configuration or setup is not that hard.
One learns to build a fluent API by looking at existing APIs. Compare FluentNHibernate with the fluent .NET APIs or the ICriteria fluent interfaces. Many configuration APIs are also designed "fluently".
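As a small illustration of the "no and yes" above, the sketch below keeps the core of a fluent API in an interface and lets extension methods, which could live in a separate library, add further chainable steps that return the same interface. ICarBuilder, CarBuilder, Named and WithColor are hypothetical names invented for this example.

```csharp
using System.Collections.Generic;

// Hypothetical core interface of a tiny fluent API.
public interface ICarBuilder
{
    ICarBuilder Set(string key, string value);
    IDictionary<string, string> Build();
}

// Hypothetical core implementation: every step returns the interface.
public class CarBuilder : ICarBuilder
{
    private readonly Dictionary<string, string> _values = new Dictionary<string, string>();

    public ICarBuilder Set(string key, string value)
    {
        _values[key] = value;
        return this;
    }

    public IDictionary<string, string> Build()
    {
        // Hand out a copy so later builder calls do not change earlier results.
        return new Dictionary<string, string>(_values);
    }
}

// Extension methods can be added by anyone without touching the
// interface or its implementation; they return the interface again,
// so they chain like the built-in steps.
public static class CarBuilderExtensions
{
    public static ICarBuilder Named(this ICarBuilder builder, string name)
    {
        return builder.Set("Name", name);
    }

    public static ICarBuilder WithColor(this ICarBuilder builder, string color)
    {
        return builder.Set("Color", color);
    }
}
```

With this in place, new CarBuilder().Named("Aston Martin").WithColor("Blue").Build() reads fluently even though Named and WithColor are not members of the interface.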
With a fluent API:
myCar.SetColor(Color.Blue).SetName("Aston Martin");
Check out this video http://www.viddler.com/explore/dcazzulino/videos/8/
Writing a fluent API is complicated; that's why I've written Diezel, a fluent API generator for Java. It generates the API with interfaces (of course) to:
control the calling flow
catch generic types (like Guice does)
It also generates implementations.
It's a maven plugin.
I think the answer depends on the behaviour you want to achieve for your fluent API. For a stepwise initialization the easiest way is, in my opinion, to create a builder class that implements different interfaces used for the different steps. E.g. if you have a class Student with the properties Name, DateOfBirth and Semester the implementation of the builder could look like so:
public class CreateStudent : CreateStudent.IBornOn, CreateStudent.IInSemester
{
private readonly Student student;
private CreateStudent()
{
student = new Student();
}
public static IBornOn WithName(string name)
{
CreateStudent createStudent = new CreateStudent();
createStudent.student.Name = name;
return createStudent;
}
public IInSemester BornOn(DateOnly dateOfBirth)
{
student.DateOfBirth = dateOfBirth;
return this;
}
public Student InSemester(int semester)
{
student.Semester = semester;
return student;
}
public interface IBornOn
{
IInSemester BornOn(DateOnly dateOfBirth);
}
public interface IInSemester
{
Student InSemester(int semester);
}
}
The builder can then be used as follows:
Student student = CreateStudent.WithName("Robert")
.BornOn(new DateOnly(2002, 8, 3)).InSemester(2);
Admittedly, writing an API for more than three properties becomes tedious. For this reason I have implemented a source generator that can do this work for you: M31.FluentAPI.