Prior to C# generics, everyone would code collections for their business objects by creating a collection base that implemented IEnumerable,
i.e.:
public class CollectionBase : IEnumerable
and then would derive their Business Object collections from that.
public class BusinessObjectCollection : CollectionBase
Now with the generic list class, does anyone just use that instead? I've found that I use a compromise of the two techniques:
public class BusinessObjectCollection : List<BusinessObject>
I do this because I like to have strongly typed names instead of just passing Lists around.
What is your approach?
I am generally in the camp of just using a List directly, unless for some reason I need to encapsulate the data structure and provide a limited subset of its functionality. This is mainly because if I don't have a specific need for encapsulation then doing it is just a waste of time.
However, with the aggregate initializer feature in C# 3.0, there are some new situations where I would advocate using customized collection classes.
Basically, C# 3.0 allows any class that implements IEnumerable and has an Add method to use the new aggregate initializer syntax. For example, because Dictionary defines a method Add(K key, V value) it is possible to initialize a dictionary using this syntax:
var d = new Dictionary<string, int>
{
{"hello", 0},
{"the answer to life the universe and everything is:", 42}
};
The great thing about the feature is that it works for add methods with any number of arguments. For example, given this collection:
class c1 : IEnumerable
{
public void Add(int x1, int x2, int x3) // must be accessible for the initializer syntax to compile
{
//...
}
//...
}
it would be possible to initialize it like so:
var x = new c1
{
{1,2,3},
{4,5,6}
};
This can be really useful if you need to create static tables of complex objects. For example, if you were just using List<Customer> and you wanted to create a static list of customer objects you would have to create it like so:
var x = new List<Customer>
{
new Customer("Scott Wisniewski", "555-555-5555", "Seattle", "WA"),
new Customer("John Doe", "555-555-1234", "Los Angeles", "CA"),
new Customer("Michael Scott", "555-555-8769", "Scranton", "PA"),
new Customer("Ali G", "", "Staines", "UK")
};
However, if you use a customized collection, like this one:
class CustomerList : List<Customer>
{
public void Add(string name, string phoneNumber, string city, string stateOrCountry)
{
Add(new Customer(name, phoneNumber, city, stateOrCountry));
}
}
You could then initialize the collection using this syntax:
var customers = new CustomerList
{
{"Scott Wisniewski", "555-555-5555", "Seattle", "WA"},
{"John Doe", "555-555-1234", "Los Angeles", "CA"},
{"Michael Scott", "555-555-8769", "Scranton", "PA"},
{"Ali G", "", "Staines", "UK"}
};
This has the advantage of being both easier to type and easier to read because there is no need to retype the element type name for each element. The advantage can be particularly strong if the element type name is long or complex.
That being said, this is only useful if you need static collections of data defined in your app. Some types of apps, like compilers, use them all the time. Others, like typical database apps don't because they load all their data from a database.
My advice would be that if you either need to define a static collection of objects, or need to encapsulate away the collection interface, then create a custom collection class. Otherwise I would just use List<T> directly.
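For reference, here is a minimal self-contained sketch of the pattern described above. The Customer class is a hypothetical stand-in (the answer does not show its definition), and CustomerTable is an illustrative name, not something from the original.

```csharp
using System.Collections.Generic;

// Hypothetical element type; the original answer does not show Customer's definition.
public class Customer
{
    public string Name { get; }
    public string PhoneNumber { get; }
    public string City { get; }
    public string StateOrCountry { get; }

    public Customer(string name, string phoneNumber, string city, string stateOrCountry)
    {
        Name = name;
        PhoneNumber = phoneNumber;
        City = city;
        StateOrCountry = stateOrCountry;
    }
}

public class CustomerList : List<Customer>
{
    // This extra Add overload is what lets the initializer accept {a, b, c, d} rows.
    public void Add(string name, string phoneNumber, string city, string stateOrCountry)
    {
        Add(new Customer(name, phoneNumber, city, stateOrCountry));
    }
}

public static class CustomerTable
{
    // Each braced row compiles to a call to the four-argument Add above.
    public static readonly CustomerList Known = new CustomerList
    {
        {"John Doe", "555-555-1234", "Los Angeles", "CA"},
        {"Ali G", "", "Staines", "UK"}
    };
}
```

The inherited Add(Customer) is still available, so rows written as new Customer(...) can be mixed with the shorter form in the same initializer.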
It's recommended that public APIs not expose List<T>, but use Collection<T> instead.
If you are inheriting from it, though, you should be fine, afaik.
I prefer just to use List<BusinessObject>. Typedefing it just adds unnecessary boilerplate to the code. List<BusinessObject> is a specific type, it's not just any List object, so it's still strongly typed.
More importantly, declaring something List<BusinessObject> makes it easier for everyone reading the code to tell what types they are dealing with, they don't have to search through to figure out what a BusinessObjectCollection is and then remember that it's just a list. By typedefing, you'll have to require a consistent (re)naming convention that everyone has to follow in order for it to make sense.
Use the type List<BusinessObject> where you have to declare a list of them. However,
where you return a list of BusinessObject, consider returning IEnumerable<T>, IList<T> or ReadOnlyCollection<T> - i.e. return the weakest possible contract that satisfies the client.
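As a rough sketch of "return the weakest contract", assuming a hypothetical BusinessObjectRepository class (none of these names come from the original):

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical business object used only for illustration.
public class BusinessObject
{
    public string Name { get; set; }
}

public class BusinessObjectRepository
{
    // Internally, List<T> is used freely.
    private readonly List<BusinessObject> _items = new List<BusinessObject>();

    public void Register(BusinessObject item)
    {
        _items.Add(item);
    }

    // Callers that only iterate get IEnumerable<T>.
    public IEnumerable<BusinessObject> GetAll()
    {
        return _items;
    }

    // Callers that need Count and indexing, but must not modify the list,
    // get a read-only wrapper around the same underlying list.
    public ReadOnlyCollection<BusinessObject> GetReadOnlyView()
    {
        return _items.AsReadOnly();
    }
}
```

Note that GetAll hands back the list object itself typed as IEnumerable<T>, so a determined caller could still downcast it. AsReadOnly avoids that, and the wrapper it returns reflects later changes to the underlying list rather than taking a copy.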
Where you want to "add custom code" to a list, code extension methods on the list type. Again, attach these methods to the weakest possible contract, e.g.
public static int SomeCount(this IEnumerable<BusinessObject> someList)
Of course, you can't and shouldn't add state with extension methods, so if you need to add a new property and a field behind it, use a subclass or better, a wrapper class to store this.
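To make the extension method idea concrete, here is a small sketch. The IsActive property and the CountActive method are hypothetical, invented only to give the method something to compute.

```csharp
using System.Collections.Generic;

// Hypothetical business object used only for illustration.
public class BusinessObject
{
    public bool IsActive { get; set; }
}

public static class BusinessObjectExtensions
{
    // Attached to IEnumerable<BusinessObject>, the weakest contract that works,
    // so it is available on arrays, List<T>, LINQ queries and so on.
    public static int CountActive(this IEnumerable<BusinessObject> source)
    {
        int count = 0;
        foreach (var item in source)
        {
            if (item.IsActive)
            {
                count++;
            }
        }
        return count;
    }
}
```

Because the method targets the interface, both someArray.CountActive() and someList.CountActive() compile without a dedicated collection class.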
I've been going back and forth on 2 options:
public class BusinessObjectCollection : List<BusinessObject> {}
or methods that just do the following:
public IEnumerable<BusinessObject> GetBusinessObjects();
The benefit of the first approach is that you can change the underlying data store without having to mess with method signatures. Unfortunately if you inherit from a collection type that removes a method from the previous implementation, then you'll have to deal with those situations throughout your code.
You should probably avoid creating your own collection for that purpose. It's pretty common to want to change the type of data structure a few times during refactorings or when adding new features. With your approach, you would wind up with a separate class for BusinessObjectList, BusinessObjectDictionary, BusinessObjectTree, etc.
I don't really see any value in creating this class just because the classname is more readable. Yeah, the angle bracket syntax is kind of ugly, but it's standard in C++, C# and Java, so even if you don't write code that uses it you're going to run into it all the time.
I generally only derive my own collection classes if I need to "add value". Like, if the collection itself needed to have some "metadata" properties tagging along with it.
I do the exact same thing as you Jonathan... just inherit from List<T>. You get the best of both worlds. But I generally only do it when there is some value to add, like adding a LoadAll() method or whatever.
You can use both. For laziness - I mean productivity - List is a very useful class, it's also "comprehensive" and frankly full of YAGNI members. Coupled with the sensible argument / recommendation put forward by the MSDN article already linked about exposing List as a public member, I prefer the "third" way:
Personally I use the decorator pattern to expose only what I need from List i.e:
using System.Collections;
using System.Collections.Generic;
public class OrderItemCollection : IEnumerable<OrderItem>
{
private readonly List<OrderItem> _orderItems = new List<OrderItem>();
public void Add(OrderItem item)
{
_orderItems.Add(item);
}
//implement only the list members, which are required from your domain.
//ie. sum items, calculate weight etc...
public IEnumerator<OrderItem> GetEnumerator()
{
return _orderItems.GetEnumerator();
}
//IEnumerable<T> also requires the non-generic GetEnumerator
IEnumerator IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
Further still, I'd probably abstract OrderItemCollection into IOrderItemCollection so I can swap out my implementation of IOrderItemCollection in the future (I may prefer to use a different inner enumerable object such as Collection, or more likely, for perf, a key/value pair collection or Set).
I use generic lists for almost all scenarios. The only time that I would consider using a derived collection anymore is if I add collection specific members. However, the advent of LINQ has lessened the need for even that.
6 of 1, half dozen of another
Either way it's the same thing. I only do it when I have reason to add custom code into the BusinessObjectCollection.
Without it, having load methods return a list allows me to write more code in a common generic class and have it just work, such as a Load method.
As someone else pointed out, it is recommended not to expose List publicly, and FxCop will whinge if you do so. This includes inheriting from List as in:
public class MyTypeCollection : List<MyType>
In most cases public APIs will expose IList (or ICollection or IEnumerable) as appropriate.
In cases where you want your own custom collection, you can keep FxCop quiet by inheriting from Collection instead of List.
If you choose to create your own collection class you should check out the types in System.Collections.ObjectModel Namespace.
The namespace defines base classes that are meant to make it easier for implementers to create custom collections.
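As an illustration of what those base classes offer, here is a sketch built on Collection<T>, which exposes protected virtual methods (InsertItem, RemoveItem, SetItem, ClearItems) as hooks. The BusinessObject class and the two counters are hypothetical, added only to make the hooks observable.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical business object used only for illustration.
public class BusinessObject
{
    public string Name { get; set; }
}

public class BusinessObjectCollection : Collection<BusinessObject>
{
    public int AddedCount { get; private set; }
    public int RemovedCount { get; private set; }

    // Collection<T> routes both Add and Insert through InsertItem.
    protected override void InsertItem(int index, BusinessObject item)
    {
        if (item == null)
        {
            throw new ArgumentNullException(nameof(item));
        }
        AddedCount++;
        base.InsertItem(index, item);
    }

    // Collection<T> routes both Remove and RemoveAt through RemoveItem.
    protected override void RemoveItem(int index)
    {
        RemovedCount++;
        base.RemoveItem(index);
    }
}
```

Since the hooks are virtual overrides, they still run when the collection is used through an IList<BusinessObject> or ICollection<BusinessObject> reference.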
I tend to do it with my own collection if I want to shield access to the actual list. When you are writing business objects, chances are that you need a hook to know when your object is being added or removed; in that sense I think BOCollection is the better idea. Of course, if that is not required, List is more lightweight. Also, you might want to consider using IList to provide an additional abstraction interface if you need some kind of proxying (e.g. a fake collection that triggers a lazy load from the database).
But... why not consider Castle ActiveRecord or any other mature ORM framework? :)
Most of the time I simply go the List way, as it gives me all the functionality I need 90% of the time, and when something 'extra' is needed, I inherit from it and code that extra bit.
I would do this:
using BusinessObjectCollection = System.Collections.Generic.List<BusinessObject>;
This just creates an alias rather than a completely new type. Note that the right-hand side of a using alias has to be fully qualified, and the alias only applies within the file that declares it. I prefer it to using List<BusinessObject> directly because it leaves me free to change the underlying structure of the collection at some point in the future without changing code that uses it (as long as I provide the same properties and methods).
try out this:
System.Collections.ObjectModel.Collection<BusinessObject>
it makes it unnecessary to implement the basic methods yourself, just as CollectionBase does
this is the way:
return arrays, accept IEnumerable<T>
=)
Related
I've seen similar questions but I'm still lost in regards to the return type of Public, Internal and Private methods. When should my return type be Collection and when should it be IEnumerable?
The MSDN guide says
X DO NOT use weakly typed collections in public APIs.
The type of all return values and parameters representing collection items should be the exact item type, not any of its base types (this applies only to public members of the collection).
This has totally confused me and implies that everything (public) should be Collection<T> or ReadOnlyCollection<T>. My confusion comes in when can we use List<T>, for example
public Collection<string> DoThisThing(int i)
{
Collection<string> col = new Collection<string>();
List<string> myList = null;
if (i > 0)
myList = GetListOfString();
//col.AddRange(...); Can't AddRange as no method
//return myList; Can't do this as it expects return type of Collection<string>
//So, to ensure my return is Collection, I have to convert
foreach(var item in myList)
{
col.Add(item);
}
return col;
}
In the above example, it's perfectly valid that List is a better choice, so, if I want to use AddRange, I could either
Hope to have control of API so I can change the return type of GetListOfString to Collection (not always possible)
Add an extension method to Collection to support AddRange
Type it out every time, as I do above
However, if I do option 1, then I don't see when I'd ever actually use ToList() (other than when I have to if an external API returns List) as all my return types would be either Collection<T>, ReadOnlyCollection<T>
Using option 2 seems more sensible but, now I'm lost as it feels as if I'm forcing it to do something it isn't intended to do.
Option 3 is repetitive
Whilst I appreciate it will always depend on the situation, as a general rule, is the following accurate?
Public - Should be very precise, for example IEnumerable<string> (note, not T), but not Collection<T>/Collection<string>
Internal and private - Should be more agnostic, so Collection<T> is better (and live with not really using List, or if I do have to use it, I will have to cast it into Collection<T>)
EDIT: added code at the end
You are confusing several different layers of the guide.
In your internal code you are free to do what you like, although there are guidelines for how to write code that is clean, readable and easy to maintain. Still, those are implementation details and they are tucked away behind your public methods, so you can do what you want (you may shoot yourself in the foot).
Artifacts you write to be used by other developers are one of these:
libraries;
frameworks;
modules of "some kind".
Now, in these cases you must decide:
the boundaries of your artifact;
how those developers use your artifact.
The boundaries are well defined by an assembly, if you choose to have one.
Sometimes namespaces may be considered "internal boundaries" and the consumer is yourself, in the future. Organizing your code into internal boundaries is a good practice but consider that classes already define boundaries so don't overdo it. Group 10-20 classes in a namespace by cohesion and keep those groups apart without interdependencies by low coupling. You cannot really understand what I mean here when I say bridge... but come back in a year or so and read this answer again :)
How to use the assembly is defined in these places:
manifest (generated automatically by the VS packaging);
documentation (which should accompany your assembly);
source code (if you expose it);
public constructs (enums, interfaces, classes, namespaces, package dependencies);
public and protected methods.
Now with all those tools you have a way to design that thing called an API.
There is plenty of room to maneuver and make mistakes. Because of this, the good people at M$ decided to define a uniform and consistent way of designing the default APIs in the default assemblies and namespaces (such as System.*).
What you have touched in your question is a guideline to API designers (which you might or might not be). All they say is:
if you are designing an API such as a non-internal method of a public class in an accessible assembly, then do us the favor and return a tightly defined value, as concrete as possible.
So, if it is a List<T> don't return any of the ancestors such as IEnumerable<T> or whatever. This allows the consumers (other developers) of your assembly to leverage concrete properties of your return value.
On the other hand you should accept as general values as possible. If you just use a method of IEnumerable<T> in your method you should require just that:
public void MyMethod<T>(IEnumerable<T> e);
This signature of your method allows consumers to pass in many different concrete collections. The only thing you require is that it is (derives from) an IEnumerable<T>.
These design decisions are also bound to many important OO principles of SOLID. They postulate that:
you should depend on abstractions => so expose interfaces if possible and keep your API abstract;
you should hide implementation details so that changes in your assembly don't propagate to consumer code;
you should avoid changing the API by allowing extension mechanism.
To be more precise I take your code and try to be as concrete as possible:
public List<string> DoThisThing(int i) // use list to return a more concrete type
{
List<string> list = new List<string>(); // you really want a *LIST* so why be abstract in your own code innards?
// List<string> myList = null; <-- superfluous
if (i > 0)
{
list = GetListOfString();
} // ALWAYS (!) use code blocks, see the book "Clean Code"
// this problem is solved: in your internal code don't be abstract if you don't need it
//col.AddRange(...); Can't AddRange as no method
return list; // this works now because the declared return type is List<string>
// you're done...
// So, to ensure my return is Collection, I have to convert
// foreach(var item in myList)
// {
// col.Add(item);
// }
// return col;
}
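If the return type has to stay Collection<string>, as in the question, the two workarounds it mentions can be sketched roughly as follows. The class and method names here are hypothetical. The only framework behaviour relied on is that Collection<T> has no AddRange of its own and that its Collection<T>(IList<T>) constructor wraps the given list instead of copying it.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class CollectionAddRangeExtensions
{
    // Hypothetical helper: adds AddRange to anything that implements ICollection<T>,
    // which includes Collection<T>.
    public static void AddRange<T>(this ICollection<T> target, IEnumerable<T> items)
    {
        foreach (var item in items)
        {
            target.Add(item);
        }
    }
}

public static class CollectionExamples
{
    // Copies the items into a new Collection<string>.
    public static Collection<string> CopyToCollection(List<string> source)
    {
        var col = new Collection<string>();
        if (source != null)
        {
            col.AddRange(source);
        }
        return col;
    }

    // Wraps the list without copying; later changes to source show through the wrapper.
    public static Collection<string> WrapInCollection(List<string> source)
    {
        return new Collection<string>(source);
    }
}
```

Which of the two fits depends on whether the caller should see later changes to the original list. The copy is independent, while the wrapper shares storage with it.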
Some help about the theory behind all this: The SOLID principles.
What you really want to understand is how and why you need abstractions. When you really grasp that, you will also understand when you DO NOT need them and so decide in a reasoned manner when to use IEnumerable<T> or List<string> without negative repercussions on the rest of your code. The most important principles of SOLID are in this case OCP and DIP.
Your question seems to be thinking into the "wrong" direction, starting from the initial MSDN quotation:
X DO NOT use weakly typed collections in public APIs.
The type of all return values and parameters representing collection items should be the exact item type, not any of its base types (this applies only to public members of the collection).
This refers to the items of collections, not the collection types.
This guideline does not refer to the type of the collection itself at all. Whether you return an enumerable, a collection, a list, or any other collection type, entirely depends on how you intend your collection to behave and what guarantees you want to make about the set of items you store.
The guideline you cited, on the other hand, indicates that you should strongly type the items stored in your collection. That is, if you have a list of string values, do not declare the collection item type as object (e.g. by typing the collection to IEnumerable<object> or IList<object>), but do use string as the item type (e.g. IEnumerable<string>, ICollection<string>, IList<string>).
Often you have to implement a collection because it is not present among those of the .NET Framework. In the examples that I find online, often the new collection is built based on another collection (for example, List<T>): in this way it is possible to avoid the management of the resizing of the collection.
public class CustomCollection<T>
{
private List<T> _baseArray;
...
public CustomCollection(...)
{
this._baseArray = new List<T>(...);
}
}
What are the disadvantages of following this approach? Only lower performance because of the method calls to the base collection? Or does the compiler perform some optimization?
Moreover, in some cases the field relating to the base collection (for example the above _baseArray) is declared as readonly. Why?
The main disadvantage is the fact that if you want to play nice you'll have to implement a lot of interfaces by hand (ICollection, IEnumerable, possibly IList... both generic and non-generic), and that's quite a bit of code. Not complex code, since you're just relaying the calls, but still code. The extra call to the inner list shouldn't make too big of a difference in most cases.
It's to enforce the fact that once the inner list is set, it cannot be changed into another list.
Usually it's best to inherit from one of the many built-in collection classes to make your own collection, instead of doing it the hard way. Collection<T> is a good starting point, and nobody is stopping you from inheriting List<T> itself.
For #2: if the private member is only assigned to in the constructor or when declared, it can be readonly. This is usually true if you only have one underlying collection and don't ever need to recreate it.
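A short sketch of what readonly does and does not restrict, reusing the CustomCollection<T> shape from the question. The Count, Add and Reset members are hypothetical additions for illustration.

```csharp
using System.Collections.Generic;

public class CustomCollection<T>
{
    // readonly: the field can only be assigned in its declaration or in a constructor.
    private readonly List<T> _baseArray;

    public CustomCollection()
    {
        _baseArray = new List<T>();
    }

    public int Count
    {
        get { return _baseArray.Count; }
    }

    public void Add(T item)
    {
        // Allowed: this mutates the list object; it does not reassign the field.
        _baseArray.Add(item);
    }

    public void Reset()
    {
        // _baseArray = new List<T>();  // would not compile: readonly field assigned outside a constructor
        _baseArray.Clear();
    }
}
```

So readonly protects the reference, not the contents: the inner list can still grow, shrink and be cleared.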
I'd say a pretty large disadvantage of this approach is that you can't use LINQ on your custom collection unless you implement IEnumerable. A better approach might be to subclass and force new implementation on methods as necessary, e.g.:
public class FooList<T> : List<T>
{
public new void Add(T item)
{
// any FooList-specific logic regarding adding items
base.Add(item);
}
}
As for the readonly keyword, it means that you can only set the variable in its declaration or in the constructor.
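One caveat worth checking before relying on the new modifier: it hides List<T>.Add rather than overriding it (List<T>.Add is not virtual), so the custom logic only runs when the call goes through a FooList<T>-typed reference. The sketch below adds a hypothetical HookCalls counter and HidingDemo class to make that visible.

```csharp
using System.Collections.Generic;

public class FooList<T> : List<T>
{
    // Hypothetical counter standing in for "FooList-specific logic".
    public int HookCalls { get; private set; }

    public new void Add(T item)
    {
        HookCalls++;
        base.Add(item);
    }
}

public static class HidingDemo
{
    // Returns { item count, number of times the FooList hook ran }.
    public static int[] Run()
    {
        var foo = new FooList<int>();
        foo.Add(1);                          // FooList<T>.Add: hook runs

        List<int> asList = foo;
        asList.Add(2);                       // List<T>.Add: hook is bypassed

        ICollection<int> asCollection = foo;
        asCollection.Add(3);                 // interface call also reaches List<T>.Add

        return new[] { foo.Count, foo.HookCalls };
    }
}
```

Here Run() yields a count of 3 items but only 1 hook call. If the logic must run on every add, deriving from Collection<T> and overriding InsertItem is the usual alternative.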
I know there have been a lot of posts on this, but it still confuses me: why should you pass in an interface like IList and return an interface like IList instead of the concrete List?
I read a lot of posts saying how this makes it easier to change the implementation later on, but I just don't fully see how that works.
Say if I have this method
public class SomeClass
{
public bool IsChecked { get; set; }
}
public void LogAllChecked(IList<SomeClass> someClasses)
{
foreach (var s in someClasses)
{
if (s.IsChecked)
{
// log
}
}
}
I am not sure how using IList will help me out in the future.
How about if I am already in the method? Should I still be using IList?
public void LogAllChecked(IList<SomeClass> someClasses)
{
//why not List<string> myStrings = new List<string>()
IList<string> myStrings = new List<string>();
foreach (var s in someClasses)
{
if (s.IsChecked)
{
myStrings.Add(s.IsChecked.ToString());
}
}
}
What do I get for using IList now?
public IList<int> onlySomeInts(IList<int> myInts)
{
IList<int> store = new List<int>();
foreach (var i in myInts)
{
if (i % 2 == 0)
{
store.Add(i);
}
}
return store;
}
How about now? Is there some new implementation of a list of int's that I will need to change out?
Basically, I need to see some actual code examples of how using IList would have solved some problem over just taking List into everything.
From my reading I think I could have used IEnumerable instead of IList since I am just looping through stuff.
Edit
So I have been playing around with some of my methods on how to do this. I am still not sure about the return type(if I should make it more concrete or an interface).
public class CardFrmVm
{
public IList<TravelFeaturesVm> TravelFeaturesVm { get; set; }
public IList<WarrantyFeaturesVm> WarrantyFeaturesVm { get; set; }
public CardFrmVm()
{
WarrantyFeaturesVm = new List<WarrantyFeaturesVm>();
TravelFeaturesVm = new List<TravelFeaturesVm>();
}
}
public class WarrantyFeaturesVm : AvailableFeatureVm
{
}
public class TravelFeaturesVm : AvailableFeatureVm
{
}
public class AvailableFeatureVm
{
public Guid FeatureId { get; set; }
public bool HasFeature { get; set; }
public string Name { get; set; }
}
private IList<AvailableFeature> FillAvailableFeatures(IEnumerable<AvailableFeatureVm> avaliableFeaturesVm)
{
List<AvailableFeature> availableFeatures = new List<AvailableFeature>();
foreach (var f in avaliableFeaturesVm)
{
if (f.HasFeature)
{
// nhibernate call to Load<>()
AvailableFeature availableFeature = featureService.LoadAvaliableFeatureById(f.FeatureId);
availableFeatures.Add(availableFeature);
}
}
return availableFeatures;
}
Now I am returning IList for the simple fact that I will then add this to my domain model, which has a property like this:
public virtual IList<AvailableFeature> AvailableFeatures { get; set; }
The above is an IList itself, as this seems to be the standard to use with NHibernate. Otherwise I might have returned IEnumerable, but I'm not sure. Still, I can't figure out what the user would 100% need (that's where returning a concrete type has an advantage).
Edit 2
I was also thinking: what happens if I want to pass the list by reference into my method and fill it there?
private void FillAvailableFeatures(IEnumerable<AvailableFeatureVm> avaliableFeaturesVm, IList<AvailableFeature> toFill)
{
foreach (var f in avaliableFeaturesVm)
{
if (f.HasFeature)
{
// nhibernate call to Load<>()
AvailableFeature availableFeature = featureService.LoadAvaliableFeatureById(f.FeatureId);
toFill.Add(availableFeature);
}
}
}
Would I run into problems with this? Couldn't they pass in an array (which has a fixed size)? Would it maybe be better to use a concrete List?
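The array concern can be checked directly. In this sketch, Fill is a simplified, hypothetical stand-in for FillAvailableFeatures that works on int so it needs no NHibernate types. An array converts implicitly to IList<int>, but calling Add on it throws NotSupportedException because arrays are fixed size.

```csharp
using System;
using System.Collections.Generic;

public static class FillDemo
{
    // Simplified stand-in for FillAvailableFeatures: appends to whatever IList<int> it is given.
    public static void Fill(IEnumerable<int> source, IList<int> toFill)
    {
        foreach (var i in source)
        {
            toFill.Add(i);
        }
    }

    // A List<int> target works as expected.
    public static int FillList()
    {
        var target = new List<int>();
        Fill(new[] { 1, 2 }, target);
        return target.Count;
    }

    // An array target compiles, but Add fails at run time.
    public static bool ArrayTargetThrows()
    {
        try
        {
            Fill(new[] { 1, 2 }, new int[2]);
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```

So the problem is real: the compiler accepts the array and the failure only appears at run time. Taking ICollection<T> or IList<T> therefore carries an implicit "must support Add" requirement that the type system does not enforce.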
There are three questions here: what type should I use for a formal parameter? What should I use for a local variable? and what should I use for a return type?
Formal parameters:
The principle here is do not ask for more than you need. IEnumerable<T> communicates "I need to get the elements of this sequence from beginning to end". IList<T> communicates "I need to get and set the elements of this sequence in arbitrary order". List<T> communicates "I need to get and set the elements of this sequence in arbitrary order and I only accept lists; I do not accept arrays."
By asking for more than you need, you (1) make the caller do unnecessary work to satisfy your unnecessary demands, and (2) communicate falsehoods to the reader. Ask only for what you're going to use. That way if the caller has a sequence, they don't need to call ToList on it to satisfy your demand.
Local variables:
Use whatever you want. It's your method. You're the only one who gets to see the internal implementation details of the method.
Return type:
Same principle as before, reversed. Offer the bare minimum that your caller requires. If the caller only requires the ability to enumerate the sequence, only give them an IEnumerable<T>.
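Applied to the onlySomeInts method from the question, the three rules above might look like this sketch. The EvenFilter class name and the renamed method are illustrative choices, not part of the original.

```csharp
using System.Collections.Generic;

public static class EvenFilter
{
    // Parameter: only enumeration is needed, so ask for IEnumerable<int>.
    // Return type: assuming callers only need to enumerate, offer IEnumerable<int>.
    public static IEnumerable<int> OnlyEvenInts(IEnumerable<int> myInts)
    {
        // Local variable: an implementation detail, so the concrete type is fine.
        var store = new List<int>();
        foreach (var i in myInts)
        {
            if (i % 2 == 0)
            {
                store.Add(i);
            }
        }
        return store;
    }
}
```

With this signature the method accepts an array, a List<int>, a HashSet<int> or a LINQ query without the caller having to call ToList first.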
The most practical reason I've ever seen was given by Jeffrey Richter in CLR via C#.
The pattern is to take the basest class or interface possible for your arguments and return the most specific class or interface possible for your return types. This gives your callers the most flexibility in passing in types to your methods and the most opportunities to cast/reuse the return values.
For example, the following method
public void PrintTypes(IEnumerable items)
{
foreach(var item in items)
Console.WriteLine(item.GetType().FullName);
}
allows the method to be called passing in any type that can be cast to an enumerable. If you were more specific
public void PrintTypes(List items)
then, say, if you had an array and wished to print its elements' type names to the console, you would first have to create a new List and fill it with your objects. And if you used a generic implementation, a method that could work for any object would be usable only with objects of one specific type.
When talking about return types, the more specific you are, the more flexible callers can be with it.
public List<string> GetNames()
you can use this return type to iterate the names
foreach(var name in GetNames())
or you can index directly into the collection
Console.WriteLine(GetNames()[0])
Whereas, if you were getting back a less specific type
public IEnumerable GetNames()
you would have to massage the return type to get the first value
Console.WriteLine(GetNames().OfType<string>().First());
IEnumerable<T> allows you to iterate through a collection. ICollection<T> builds on this and also allows for adding and removing items. IList<T> also allows for accessing and modifying them at a specific index. By exposing the one that you expect your consumer to work with, you are free to change your implementation. List<T> happens to implement all three of those interfaces.
If you expose your property as a List<T> or even an IList<T> when all you want your consumer to have is the ability to iterate through the collection, then they could come to depend on the fact that they can modify the list. If you later decide to convert the actual data store from a List<T> to a Dictionary<T,U> and expose the dictionary keys as the actual value for the property (I have had to do exactly this before), then consumers who have come to expect that their changes will be reflected inside of your class will no longer have that capability. That's a big problem! If you expose the List<T> as an IEnumerable<T>, you can comfortably predict that your collection is not being modified externally. That is one of the powers of exposing List<T> as any of the above interfaces.
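A sketch of that List-to-Dictionary switch, using a hypothetical Catalog class: the property stays IEnumerable<string> in both versions, so code that only enumerates it is unaffected when the backing store changes.

```csharp
using System.Collections.Generic;

public class Catalog
{
    // Version 1 (hypothetical) kept the names in a list:
    //   private readonly List<string> _names = new List<string>();
    // Version 2 keys them in a dictionary; the public property below is unchanged.
    private readonly Dictionary<string, decimal> _prices = new Dictionary<string, decimal>();

    public void AddItem(string name, decimal price)
    {
        _prices[name] = price;
    }

    // Exposed as IEnumerable<string> in both versions, so callers that only
    // enumerate keep compiling and behaving the same after the switch.
    public IEnumerable<string> Names
    {
        get { return _prices.Keys; }
    }
}
```

Had Names been declared as List<string>, version 2 would have had to either change the property type or copy the keys into a new list on every access.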
This level of abstraction goes the other direction when it applies to method parameters. When you pass your list to a method that accepts IEnumerable<T>, you can be reasonably sure that your list is not going to be modified. When you are the person implementing the method and you say you accept an IEnumerable<T> because all you need to do is iterate through that list, then the person calling the method is free to call it with any data type that is enumerable. This allows your code to be used in unexpected, but perfectly valid ways.
From this it follows that your method implementation can represent its local variables however you wish. The implementation details are not exposed, leaving you free to change your code to something better without affecting the people calling your code.
You cannot predict the future. Assuming that a property's type will always be beneficial as a List<T> immediately limits your ability to adapt to unforeseen expectations of your code. Yes, you may never change that data type from a List<T>, but you can be sure that if you have to, your code is ready for it.
Short Answer:
You pass the interface so that no matter what concrete implementation of that interface you use, your code will support it.
If you use a concrete implementation of list, another implementation of the same list will not be supported by your code.
Read a bit on inheritance and polymorphism.
Here's an example: I had a project once where our lists got very large, and resulting fragmentation of the large object heap was hurting performance. We replaced List with LinkedList. LinkedList does not contain an array, so all of a sudden, we had almost no use of the large object heap.
Mostly, we used the lists as IEnumerable<T>, anyway, so there was no further change needed. (And yes, I would recommend declaring references as IEnumerable if all you're doing is enumerating them.) In a couple of places, we needed the list indexer, so we wrote an inefficient IList<T> wrapper around the linked lists. We needed the list indexer infrequently, so the inefficiency was not a problem. If it had been, we could have provided some other implementation of IList, perhaps as a collection of small-enough arrays, that would have been more efficiently indexable while also avoiding large objects.
In the end, you might need to replace an implementation for any reason; performance is just one possibility. Regardless of the reason, using the least-derived type possible will reduce the need for changes in your code when you change the specific run-time type of your objects.
Inside the method, you should use var, instead of IList or List. When your data source changes to come from a method instead, your onlySomeInts method will survive.
The reason to use IList instead of List as parameters, is because many things implement IList (List and [], as two examples), but only one thing implements List. It's more flexible to code to the interface.
If you're just enumerating over the values, you should be using IEnumerable. Every data type that can hold more than one value implements IEnumerable (or should), which makes your method hugely flexible.
Using IList instead of List makes writing unit tests significantly easier. It allows you to use a 'Mocking' library to pass and return data.
The other general reason for using interfaces is to expose the minimum amount of knowledge necessary to the user of an object.
Consider the (contrived) case where I have a data object that implements IList.
public class MyDataObject : IList<int>
{
public void Method1()
{
...
}
// etc
}
Your functions above only care about being able to iterate over a list. Ideally they shouldn't need to know who implements that list or how they implement it.
In your example, IEnumerable is a better choice as you thought.
It is always a good idea to reduce the dependencies between your code as much as possible.
Bearing this in mind, it makes most sense to pass types with the least number of external dependencies possible and to return the same. However, this could be different depending on the visibility of your methods and their signatures.
If your methods form part of an interface, the methods will need to be defined using types available to that interface. Concrete types will probably not be available to interfaces, so they would have to return non-concrete types. You would want to do this if you were creating a framework, for example.
However, if you are not writing a framework, it may be advantageous to pass parameters with the weakest possible types (i.e. base classes, interfaces, or even delegates) and return concrete types. That gives the caller the ability to do as much as possible with the returned object, even if it is cast as an interface. However, this makes the method more fragile, as any change to the returned object type may break the calling code. In practice though, that generally isn't a major problem.
You accept an interface as a parameter for a method because that allows the caller to submit different concrete types as arguments. Given your example method LogAllChecked, the parameter someClasses could be of various types, and for the person writing the method, all might be equivalent (i.e. you'd write the exact same code regardless of the type of the parameter). But for the person calling the method, it can make a huge difference -- if they have an array and you're asking for a list, they have to convert the array to a list (or vice versa) whenever they call the method, a total waste of time from both a programmer and a performance point of view.
Whether you return an interface or a concrete type depends upon what you want to let your callers do with the object you created -- this is an API design decision, and there's no hard and fast rule. You have to weigh their ability to make full use of the object against their ability to easily use a portion of the object's functionality (and of course whether you WANT them to be making full use of the object). For instance, if you return an IEnumerable, then you are limiting them to iterating -- they can't add or remove items from your object, they can only act against the objects. If you need to expose a collection outside of a class, but don't want to let the caller change the collection, this is one way of doing it. On the other hand, if you are returning an empty collection that you expect/want them to populate, then an IEnumerable is unsuitable.
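A small sketch of the "expose it, but don't let callers change it" case described above. The Roster class and its members are hypothetical names chosen for illustration:

```csharp
using System;
using System.Collections.Generic;

public class Roster
{
    private readonly List<string> _names = new List<string>();

    // The only supported way to change the collection.
    public void Register(string name)
    {
        _names.Add(name);
    }

    // Callers can iterate, but the declared type offers no Add or Remove.
    public IEnumerable<string> Names
    {
        get { return _names; }
    }
}

public static class RosterDemo
{
    public static void Main()
    {
        var roster = new Roster();
        roster.Register("Ada");
        roster.Register("Grace");
        foreach (string name in roster.Names)
        {
            Console.WriteLine(name);
        }
    }
}
```

Note that this restricts callers only at the type level. A caller who casts Names back to List<string> could still modify it, so returning _names.AsReadOnly() may be the safer choice when that matters.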
Here's my answer in this .NET 4.5+ world.
Use IList<T> and IReadOnlyList<T> instead of List<T>, because ReadOnlyList<T> doesn't exist. IList<T> looks so consistent with IReadOnlyList<T>.
Use IEnumerable<T> for minimum exposure (property) or requirement (parameter) if foreach is the only way to use it.
Use IReadOnlyList<T> if you also need to expose/use Count and the [] indexer.
Use IList<T> if you also allow callers to add/update/delete elements.
Because List<T> implements IReadOnlyList<T>, it doesn't need any explicit casting.
An example class:
public class NumberStore // hypothetical class name, added for illustration
{
// manipulate the list within the class
private readonly List<int> _numbers = new List<int>();
// callers can add/update/remove elements, but cannot reassign a new list to this property
public IList<int> Numbers { get { return _numbers; } }
// callers can use: .Count and .ReadonlyNumbers[idx], but cannot add/update/remove elements
public IReadOnlyList<int> ReadonlyNumbers { get { return _numbers; } }
}
What is the difference between:
List<MyType> myList;
and
myList : List<MyType>
It's obvious that the first one is a list and the second one is a class. But my question is: what's the advantage of the second over the first?
The latter gives you the ability to have a function which takes a myList, instead of just a List. This also means that if the type of myList changes (perhaps to a sorted list) you don't have to change your code anywhere. So instead of declaring List<MyType> everywhere, and then having to change them all, if you had MyList objects everywhere, you're golden.
It's also a syntactic difference. Does myList have a list, or is it a list?
I would lean towards having a MyList : List<MyType> if it is used commonly throughout your program.
List<MyType> myList is an instance of the generic type List that can contain items that are of MyType (or of any types derived from MyType)
var myTypeInstance = new MyType();
var myList = new List<MyType>();
myList.Add(myTypeInstance);
myList : List<MyType> is a new type that inherits from List from which you can then make multiple instances:
var myTypeInstance = new MyType();
var myCollectionVariable = new myList();
myCollectionVariable.Add(myTypeInstance);
The main advantage of the latter over the former is if you wanted to have some methods that act on a List you can put them on your class, rather than storing them in a "helper" or "utility" library, for example:
class myList : List<MyType>
{
public void DoSomethingToAllMyTypesInList()
{
...
...
}
}
Object composition vs. class inheritance
The latter is a new class that inherits from the base, but is distinct. The most obvious distinction is that it doesn't have the same constructors, but you'll also run into problems streaming it.
Those are the disadvantages. The advantage is that you could add some of your own methods. Even then, I'd consider using containment, with a has-a relationship instead of is-a.
I would prefer not to inherit implementation where possible. It has its uses, but if it's not entirely necessary, then it's not worth it.
The major answer to your question is that by inheriting List<T>, you make all its methods public by default. Usually when writing a new class, you want encapsulation. You don't want to let the internals leak out. For example, suppose you wanted to make a thread-safe container. If you inherit from a thread-ignorant container, your clients will be able to use the public interface of the base class to bypass any locking you try to put in.
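To make the thread-safety point concrete, here is a rough sketch of the containment approach. SynchronizedBag<T> is a hypothetical name, and a real implementation would likely need more members; the point is only that the inner List<T> is never reachable without going through the lock:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: it contains a List<T> rather than inheriting from it,
// so every access has to go through the locked members below.
public class SynchronizedBag<T>
{
    private readonly List<T> _items = new List<T>();
    private readonly object _gate = new object();

    public void Add(T item)
    {
        lock (_gate)
        {
            _items.Add(item);
        }
    }

    public int Count
    {
        get
        {
            lock (_gate)
            {
                return _items.Count;
            }
        }
    }

    // Returns a copy so callers never enumerate the live list outside the lock.
    public List<T> Snapshot()
    {
        lock (_gate)
        {
            return new List<T>(_items);
        }
    }
}
```

If the class had been declared as SynchronizedBag<T> : List<T> instead, any caller could invoke the inherited Add, Remove, or indexer directly and skip the lock entirely.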
Another popular mistake arises when you find yourself using a particular container type a lot: it's tempting to use inheritance to make a short name for it:
class ShortName : Dictionary<string, List<string>> { };
But that's not what you've done - you've created a completely new type. This means that if you have some other library that can produce the right data structure, it won't be directly usable by your code; you'll have to copy it into a ShortName first. An example is LINQ, which can easily build a Dictionary<string, List<string>> from a very readable, functional expression, ending with ToDictionary.
So instead, do this:
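// the type names must be fully qualified here: an alias does not see the file's other using directives
using ShortName = System.Collections.Generic.Dictionary<string, System.Collections.Generic.List<string>>;
Now you have a short snappy alias for the unwieldy typename, but you're actually still using the same type.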
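A sketch of why the alias plays nicely with LINQ, assuming the alias is declared at the top of the file (where the type names have to be fully qualified). AliasDemo and GroupByFirstLetter are hypothetical names for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using ShortName = System.Collections.Generic.Dictionary<string, System.Collections.Generic.List<string>>;

public static class AliasDemo
{
    // Assumes every word is non-empty, since Substring(0, 1) needs one character.
    public static ShortName GroupByFirstLetter(IEnumerable<string> words)
    {
        // ToDictionary returns a plain Dictionary<string, List<string>>;
        // ShortName is only another name for that type, so no copy is needed.
        return words
            .GroupBy(w => w.Substring(0, 1))
            .ToDictionary(g => g.Key, g => g.ToList());
    }

    public static void Main()
    {
        ShortName grouped = GroupByFirstLetter(new[] { "apple", "avocado", "banana" });
        Console.WriteLine(grouped["a"].Count); // two words start with "a"
    }
}
```

With the inheritance version (class ShortName : Dictionary<string, List<string>>), the return statement would not compile, and the entries would have to be copied into a new ShortName instance one by one.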
The Microsoft design guidelines (FxCop and VS Code Analysis) don't recommend inheriting publicly-visible classes from List<T>. Instead you can inherit from Collection<T> as described in this blog post.
These guidelines aren't necessarily relevant for private assemblies or internal classes though.
A couple of reasons why you might want to inherit from Collection<T> or List<T> are:
So you can add custom application-specific members to your collection class.
So you can create a ComVisible collection class (you can't expose a generic List directly to COM, but you can expose a derived class).
By the way the naming guidelines would also recommend you name your derived class with the "Collection" suffix, i.e.
MyTypeCollection : List<MyType> // or : Collection<MyType>, IList<MyType>
rather than
MyList : List<MyType> // or : Collection<MyType>, IList<MyType>
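As a rough sketch of the Collection<T> route mentioned above: Collection<T> exposes protected virtual hooks (InsertItem, SetItem, RemoveItem, ClearItems) that a derived class can override, which List<T> does not offer. The null check is just an example rule, and the MyType shape here is assumed for illustration:

```csharp
using System;
using System.Collections.ObjectModel;

public class MyType
{
    public string Name { get; set; }
}

public class MyTypeCollection : Collection<MyType>
{
    // Called by Add and Insert, so the rule below covers both.
    protected override void InsertItem(int index, MyType item)
    {
        if (item == null)
        {
            throw new ArgumentNullException("item");
        }
        base.InsertItem(index, item);
    }

    // Called when an element is replaced through the indexer.
    protected override void SetItem(int index, MyType item)
    {
        if (item == null)
        {
            throw new ArgumentNullException("item");
        }
        base.SetItem(index, item);
    }
}
```

Because every mutation funnels through these overrides, the collection can enforce application-specific rules that a class derived from List<T> could not, since List<T>.Add is not virtual.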
Some people find benefits in abstracting data structures away from their application logic. If you decide that generic list is no longer the best data structure to represent MyList you can change your MyList implementation, and as long as your interface is the same, you don't have to update any other code.
This is overkill in many situations, however.
There are also semantic benefits to working with an abstracted data type rather than the original, though the list type blurs the line. It is more obvious when working with a dictionary data structure. If you wrap the dictionary in a custom collection type and expose keys and values as properties, you can write code that reads more like the business logic you are implementing.
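A minimal sketch of that idea, using an invented PriceList type (all names here are hypothetical). The dictionary's keys and values surface under names from the business domain:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical business type: a price list that happens to be stored in a dictionary.
public class PriceList
{
    private readonly Dictionary<string, decimal> _pricesByProduct =
        new Dictionary<string, decimal>();

    // The dictionary's keys, exposed under a business name.
    public IEnumerable<string> Products
    {
        get { return _pricesByProduct.Keys; }
    }

    // The dictionary's values, exposed under a business name.
    public IEnumerable<decimal> Prices
    {
        get { return _pricesByProduct.Values; }
    }

    public void SetPrice(string product, decimal price)
    {
        _pricesByProduct[product] = price;
    }

    public decimal PriceOf(string product)
    {
        return _pricesByProduct[product];
    }
}
```

Calling code then reads as priceList.PriceOf("tea") rather than dictionary["tea"], and the backing structure can change later without touching callers.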
I know that IList is the interface and List is the concrete type but I still don't know when to use each one. What I'm doing now is if I don't need the Sort or FindAll methods I use the interface. Am I right? Is there a better way to decide when to use the interface or the concrete type?
There are two rules I follow:
Accept the most basic type that will work
Return the richest type your user will need
So when writing a function or method that takes a collection, write it not to take a List, but an IList<T>, an ICollection<T>, or IEnumerable<T>. The generic interfaces will still work even for heterogeneous lists because System.Object can be a T too. Doing this will save you headaches if you decide to use a Stack or some other data structure further down the road. If all you need to do in the function is foreach through it, IEnumerable<T> is really all you should be asking for.
On the other hand, when returning an object out of a function, you want to give the user the richest possible set of operations without them having to cast around. So in that case, if it's a List<T> internally, return a copy as a List<T>.
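The two rules can be sketched in one method. Filters.EvensOf is a made-up example: it asks only for something enumerable and hands back the List<int> it actually built:

```csharp
using System;
using System.Collections.Generic;

public static class Filters
{
    // Accepts the most basic type that works (anything enumerable)
    // and returns the richest type it actually builds (a List<int>).
    public static List<int> EvensOf(IEnumerable<int> source)
    {
        var result = new List<int>();
        foreach (int n in source)
        {
            if (n % 2 == 0)
            {
                result.Add(n);
            }
        }
        return result;
    }

    public static void Main()
    {
        // A Stack<int> is accepted because it is enumerable.
        List<int> evens = EvensOf(new Stack<int>(new[] { 1, 2, 3, 4 }));
        evens.Sort(); // available without a cast because the return type is List<int>
        Console.WriteLine(string.Join(", ", evens));
    }
}
```

Since the returned list is freshly created, handing it out as a List<T> does not expose any internal state of the method's owner.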
Microsoft guidelines as checked by FxCop discourage use of List<T> in public APIs - prefer IList<T>.
Incidentally, I now almost always declare one-dimensional arrays as IList<T>, which means I can consistently use the IList<T>.Count property rather than Array.Length. For example:
public interface IMyApi
{
IList<int> GetReadOnlyValues();
}
public class MyApiImplementation : IMyApi
{
public IList<int> GetReadOnlyValues()
{
List<int> myList = new List<int>();
// ... populate list
return myList.AsReadOnly();
}
}
public class MyMockApiImplementationForUnitTests : IMyApi
{
public IList<int> GetReadOnlyValues()
{
IList<int> testValues = new int[] { 1, 2, 3 };
return testValues;
}
}
IEnumerable
You should try and use the least specific type that suits your purpose.
IEnumerable is less specific than IList.
You use IEnumerable when you want to loop through the items in a collection.
IList
IList extends IEnumerable.
You should use IList when you need access by index to your collection, add and delete elements, etc...
List
List implements IList.
There's an important thing that people always seem to overlook:
You can pass a plain array to something which accepts an IList<T> parameter, and then you can call IList.Add() and will receive a runtime exception:
Unhandled Exception: System.NotSupportedException: Collection was of a fixed size.
For example, consider the following code:
private void test(IList<int> list)
{
list.Add(1);
}
If you call that as follows, you will get a runtime exception:
int[] array = new int[0];
test(array);
This happens because using plain arrays with IList<T> violates the Liskov substitution principle.
For this reason, if you are calling IList<T>.Add() you may want to consider requiring a List<T> instead of an IList<T>.
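If you would rather keep the IList<T> parameter, one possible guard is to check IsReadOnly before adding. This sketch relies on arrays reporting IsReadOnly as true when viewed through the generic interface; TryAppend and AppendDemo are hypothetical names:

```csharp
using System;
using System.Collections.Generic;

public static class AppendDemo
{
    // Returns true if the item was added, false if the collection refuses additions.
    public static bool TryAppend(IList<int> list, int item)
    {
        // Arrays and read-only wrappers report IsReadOnly == true here.
        if (list.IsReadOnly)
        {
            return false;
        }
        list.Add(item);
        return true;
    }

    public static void Main()
    {
        Console.WriteLine(TryAppend(new List<int>(), 1)); // the list accepts the item
        Console.WriteLine(TryAppend(new int[0], 1));      // the array is rejected up front
    }
}
```

This trades the runtime exception for an explicit failure path, at the cost of every caller having to check the result.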
I would agree with Lee's advice for taking parameters, but not returning.
If you specify your methods to return an interface that means you are free to change the exact implementation later on without the consuming method ever knowing. I thought I'd never need to change from a List<T> but had to later change to use a custom list library for the extra functionality it provided. Because I'd only returned an IList<T> none of the people that used the library had to change their code.
Of course that only need apply to methods that are externally visible (i.e. public methods). I personally use interfaces even in internal code, but as you are able to change all the code yourself if you make breaking changes it's not strictly necessary.
It's always best to use the lowest base type possible. This gives the implementer of your interface, or consumer of your method, the opportunity to use whatever they like behind the scenes.
For collections you should aim to use IEnumerable where possible. This gives the most flexibility but is not always suitable.
If you're working within a single method (or even in a single class or assembly in some cases) and no one outside is going to see what you're doing, use the fullness of a List. But if you're interacting with outside code, like when you're returning a list from a method, then you only want to declare the interface without necessarily tying yourself to a specific implementation, especially if you have no control over who compiles against your code afterward. If you started with a concrete type and you decided to change to another one, even if it uses the same interface, you're going to break someone else's code unless you started off with an interface or abstract base type.
You are most often better off using the most general usable type, in this case the IList or even better the IEnumerable interface, so that you can switch the implementation conveniently at a later time.
However, in .NET 2.0, there is an annoying thing - IList does not have a Sort() method. You can use a supplied adapter instead:
ArrayList.Adapter(list).Sort()
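A short sketch of the adapter in use. Note that ArrayList.Adapter takes the non-generic IList, so a variable typed as IList<int> needs a cast, which works here only because the underlying object is a List<int>. SortDemo and SortInPlace are hypothetical names:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class SortDemo
{
    // Sorts the wrapped list in place using the default comparer.
    public static void SortInPlace(IList list)
    {
        ArrayList.Adapter(list).Sort();
    }

    public static void Main()
    {
        IList<int> numbers = new List<int> { 3, 1, 2 };
        // IList<int> does not itself extend IList, hence the explicit cast.
        SortInPlace((IList)numbers);
        Console.WriteLine(string.Join(", ", numbers));
    }
}
```

The adapter writes the sorted elements back into the original list, so numbers itself ends up ordered.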
I don't think there are hard and fast rules for this type of thing, but I usually go by the guideline of using the lightest possible way until absolutely necessary.
For example, let's say you have a Person class and a Group class. A Group instance has many people, so a List here would make sense. When I declare the list object in Group I will use an IList<Person> and instantiate it as a List.
public class Group {
private IList<Person> people;
public Group() {
this.people = new List<Person>();
}
}
And, if you don't even need everything in IList you can always use IEnumerable too. With modern compilers and processors, I don't think there is really any speed difference, so this is more just a matter of style.
You should use the interface only if you need it, e.g., if your list is cast to an IList implementation other than List. This is true when, for example, you use NHibernate, which casts ILists into an NHibernate bag object when retrieving data.
If List is the only implementation that you will ever use for a certain collection, feel free to declare it as a concrete List implementation.
In situations I usually come across, I rarely use IList directly.
Usually I just use it as an argument to a method
void ProcessArrayData(IList almostAnyTypeOfArray)
{
// Do some stuff with the IList array
}
This will allow me to do generic processing on almost any array in the .NET framework, unless it uses IEnumerable and not IList, which happens sometimes.
It really comes down to the kind of functionality you need. I'd suggest using the List class in most cases. IList is best for when you need to make a custom array that could have some very specific rules that you'd like to encapsulate within a collection so you don't repeat yourself, but still want .NET to recognize it as a list.
A List object allows you to create a list, add things to it, remove them, update them, index into it, and so on. List is used whenever you just want a generic list where you specify the object type in it and that's it.
IList on the other hand is an interface. Basically, if you want to create your own custom list, say a list class called BookList, then you can use the interface to give basic methods and structure to your new class. IList is for when you want to create your own special class that implements IList.
Another difference is:
IList is an Interface and cannot be instantiated. List is a class and can be instantiated. It means:
IList<string> list1 = new IList<string>(); // this is wrong, and won't compile
IList<string> list2 = new List<string>(); // this will compile
List<string> list3 = new List<string>(); // this will compile