I know that IList is the interface and List is the concrete type but I still don't know when to use each one. What I'm doing now is if I don't need the Sort or FindAll methods I use the interface. Am I right? Is there a better way to decide when to use the interface or the concrete type?
There are two rules I follow:
Accept the most basic type that will work
Return the richest type your user will need
So when writing a function or method that takes a collection, write it not to take a List, but an IList<T>, an ICollection<T>, or IEnumerable<T>. The generic interfaces will still work even for heterogeneous lists because System.Object can be a T too. Doing this will save you headaches if you decide to use a Stack or some other data structure further down the road. If all you need to do in the function is foreach through it, IEnumerable<T> is really all you should be asking for.
On the other hand, when returning an object out of a function, you want to give the user the richest possible set of operations without them having to cast around. So in that case, if it's a List<T> internally, return a copy as a List<T>.
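As a minimal sketch of both rules together (the class and method names here are made up for illustration, not taken from any library): the first method asks only for IEnumerable<int> because it only loops, and the second returns a List<int> so callers can sort or index without casting.

```csharp
using System;
using System.Collections.Generic;

public static class CollectionRules
{
    // Accepts the most basic type that works: the method only enumerates.
    public static int SumAll(IEnumerable<int> values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }

    // Returns a rich type: callers get Count, indexing, and Sort without casting.
    public static List<int> EvensOf(IEnumerable<int> values)
    {
        var evens = new List<int>();
        foreach (int value in values)
        {
            if (value % 2 == 0)
            {
                evens.Add(value);
            }
        }
        return evens;
    }

    public static void Main()
    {
        int[] array = { 1, 2, 3, 4 };
        var list = new List<int> { 1, 2, 3, 4 };
        var stack = new Stack<int>(array);

        // All three callers work without any conversion.
        Console.WriteLine(SumAll(array));
        Console.WriteLine(SumAll(list));
        Console.WriteLine(SumAll(stack));

        // The returned List<int> can be sorted in place directly.
        List<int> evens = EvensOf(stack);
        evens.Sort();
        Console.WriteLine(string.Join(", ", evens));
    }
}
```

If SumAll had asked for a List<int>, the array and the Stack<int> callers would each have needed a ToList() call first.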
Microsoft guidelines as checked by FxCop discourage use of List<T> in public APIs - prefer IList<T>.
Incidentally, I now almost always declare one-dimensional arrays as IList<T>, which means I can consistently use the IList<T>.Count property rather than Array.Length. For example:
public interface IMyApi
{
    IList<int> GetReadOnlyValues();
}

public class MyApiImplementation : IMyApi
{
    public IList<int> GetReadOnlyValues()
    {
        List<int> myList = new List<int>();
        // ... populate list
        return myList.AsReadOnly();
    }
}

public class MyMockApiImplementationForUnitTests : IMyApi
{
    public IList<int> GetReadOnlyValues()
    {
        // An array can be returned directly because int[] implements IList<int>.
        IList<int> testValues = new int[] { 1, 2, 3 };
        return testValues;
    }
}
IEnumerable
You should try and use the least specific type that suits your purpose.
IEnumerable is less specific than IList.
You use IEnumerable when you want to loop through the items in a collection.
IList
IList extends IEnumerable (through ICollection).
You should use IList when you need access by index to your collection, add and delete elements, etc...
List
List implements IList.
There's an important thing that people always seem to overlook:
You can pass a plain array to something which accepts an IList<T> parameter, and then you can call IList.Add() and will receive a runtime exception:
Unhandled Exception: System.NotSupportedException: Collection was of a fixed size.
For example, consider the following code:
private void test(IList<int> list)
{
    list.Add(1);
}
If you call that as follows, you will get a runtime exception:
int[] array = new int[0];
test(array);
This happens because arrays implement IList<T> but have a fixed size, so they cannot honor Add. In that sense, using plain arrays through IList<T> violates the Liskov substitution principle.
For this reason, if you are calling IList<T>.Add() you may want to consider requiring a List<T> instead of an IList<T>.
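If you would rather keep the IList<T> parameter, one alternative is to check before adding. The sketch below uses a hypothetical TryAdd helper (not a framework method) and relies on the fact that arrays report IsReadOnly as true when viewed through the generic ICollection<T> interface.

```csharp
using System;
using System.Collections.Generic;

public static class FixedSizeDemo
{
    // Hypothetical helper: returns false instead of throwing when the
    // collection cannot accept new elements.
    public static bool TryAdd(IList<int> list, int value)
    {
        // Arrays report IsReadOnly == true through ICollection<T>.
        if (list.IsReadOnly)
        {
            return false;
        }
        list.Add(value);
        return true;
    }

    public static void Main()
    {
        IList<int> array = new int[0];
        IList<int> list = new List<int>();

        try
        {
            array.Add(1);
        }
        catch (NotSupportedException ex)
        {
            Console.WriteLine("Add on an array failed: " + ex.Message);
        }

        Console.WriteLine(TryAdd(array, 1)); // the array is rejected up front
        Console.WriteLine(TryAdd(list, 1));  // the List<int> accepts the value
    }
}
```

Whether a silent false or a List<T> parameter is the better contract depends on the caller; the check only moves the failure from an exception to a return value.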
I would agree with Lee's advice for taking parameters, but not returning.
If you specify your methods to return an interface that means you are free to change the exact implementation later on without the consuming method ever knowing. I thought I'd never need to change from a List<T> but had to later change to use a custom list library for the extra functionality it provided. Because I'd only returned an IList<T> none of the people that used the library had to change their code.
Of course that only need apply to methods that are externally visible (i.e. public methods). I personally use interfaces even in internal code, but as you are able to change all the code yourself if you make breaking changes it's not strictly necessary.
It's always best to use the lowest base type possible. This gives the implementer of your interface, or consumer of your method, the opportunity to use whatever they like behind the scenes.
For collections you should aim to use IEnumerable where possible. This gives the most flexibility but is not always suited.
If you're working within a single method (or even in a single class or assembly in some cases) and no one outside is going to see what you're doing, use the fullness of a List. But if you're interacting with outside code, like when you're returning a list from a method, then you only want to declare the interface without necessarily tying yourself to a specific implementation, especially if you have no control over who compiles against your code afterward. If you started with a concrete type and you decided to change to another one, even if it uses the same interface, you're going to break someone else's code unless you started off with an interface or abstract base type.
You are most often better off using the most general usable type, in this case the IList or even better the IEnumerable interface, so that you can switch the implementation conveniently at a later time.
However, in .NET 2.0, there is an annoying thing - IList does not have a Sort() method. You can use a supplied adapter instead:
ArrayList.Adapter(list).Sort()
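A small sketch of the adapter approach for a generic list follows. ArrayList.Adapter takes the non-generic IList, so the example casts first; that cast is an assumption that holds for List<T> and arrays but may fail for other IList<T> implementations. SortInPlace is a made-up helper name.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class AdapterSortDemo
{
    // Sorts the list in place through the non-generic IList it also implements.
    // Assumption: the runtime object implements System.Collections.IList
    // (true for List<T> and arrays); otherwise the cast throws.
    public static void SortInPlace<T>(IList<T> list)
    {
        ArrayList.Adapter((IList)list).Sort();
    }

    public static void Main()
    {
        IList<int> numbers = new List<int> { 3, 1, 2 };
        SortInPlace(numbers);
        Console.WriteLine(string.Join(", ", numbers)); // 1, 2, 3
    }
}
```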
I don't think there are hard and fast rules for this type of thing, but I usually go by the guideline of using the lightest possible way until absolutely necessary.
For example, let's say you have a Person class and a Group class. A Group instance has many people, so a List here would make sense. When I declare the list object in Group I will use an IList<Person> and instantiate it as a List.
public class Group {
    private IList<Person> people;

    public Group() {
        this.people = new List<Person>();
    }
}
And, if you don't even need everything in IList you can always use IEnumerable too. With modern compilers and processors, I don't think there is really any speed difference, so this is more just a matter of style.
You should use the interface only if you need it, e.g., if your list is cast to an IList implementation other than List. This is true when, for example, you use NHibernate, which converts ILists into an NHibernate bag object when retrieving data.
If List is the only implementation that you will ever use for a certain collection, feel free to declare it as a concrete List implementation.
In situations I usually come across, I rarely use IList directly.
Usually I just use it as an argument to a method
void ProcessArrayData(IList almostAnyTypeOfArray)
{
    // Do some stuff with the IList array
}
This will allow me to do generic processing on almost any array in the .NET framework, unless it uses IEnumerable and not IList, which happens sometimes.
It really comes down to the kind of functionality you need. I'd suggest using the List class in most cases. IList is best for when you need to make a custom array that could have some very specific rules that you'd like to encapsulate within a collection so you don't repeat yourself, but still want .NET to recognize it as a list.
A List object allows you to create a list, add things to it, remove it, update it, index into it and etc. List is used whenever you just want a generic list where you specify object type in it and that's it.
IList on the other hand is an interface. Basically, if you want to create your own custom list, say a list class called BookList, then you can use the interface to give basic methods and structure to your new class. IList is for when you want to create your own special class that implements IList.
Another difference is:
IList is an Interface and cannot be instantiated. List is a class and can be instantiated. It means:
IList<string> list1 = new IList<string>(); // this is wrong, and won't compile
IList<string> list2 = new List<string>(); // this will compile
List<string> list3 = new List<string>(); // this will compile
Related
I have spent quite a few hours pondering the subject of exposing list members. In a similar question to mine, Jon Skeet gave an excellent answer. Please feel free to have a look.
ReadOnlyCollection or IEnumerable for exposing member collections?
I am usually quite paranoid about exposing lists, especially when developing an API.
I have always used IEnumerable for exposing lists, as it is quite safe, and it gives that much flexibility. Let me use an example here:
public class Activity
{
    private readonly IList<WorkItem> workItems = new List<WorkItem>();

    public string Name { get; set; }

    public IEnumerable<WorkItem> WorkItems
    {
        get
        {
            return this.workItems;
        }
    }

    public void AddWorkItem(WorkItem workItem)
    {
        this.workItems.Add(workItem);
    }
}
Anyone who codes against an IEnumerable is quite safe here. If I later decide to use an ordered list or something, none of their code breaks and it is still nice. The downside of this is IEnumerable can be cast back to a list outside of this class.
For this reason, a lot of developers use ReadOnlyCollection for exposing a member. This is quite safe since it can never be cast back to a List<T>, and its mutating members throw. For me I prefer IEnumerable since it provides more flexibility, should I ever want to implement something different than a list.
I have come up with a new idea I like better. Using IReadOnlyCollection:
public class Activity
{
    private readonly IList<WorkItem> workItems = new List<WorkItem>();

    public string Name { get; set; }

    public IReadOnlyCollection<WorkItem> WorkItems
    {
        get
        {
            return new ReadOnlyCollection<WorkItem>(this.workItems);
        }
    }

    public void AddWorkItem(WorkItem workItem)
    {
        this.workItems.Add(workItem);
    }
}
I feel this retains some of the flexibility of IEnumerable and is encapsulated quite nicely.
I posted this question to get some input on my idea. Do you prefer this solution to IEnumerable? Do you think it is better to use a concrete return value of ReadOnlyCollection? This is quite a debate and I want to try and see what are the advantages/disadvantages that we all can come up with.
EDIT
First of all thank you all for contributing so much to the discussion here. I have certainly learned a ton from each and every one and would like to thank you sincerely.
I am adding some extra scenarios and info.
There are some common pitfalls with IReadOnlyCollection and IEnumerable.
Consider the example below:
public IReadOnlyCollection<WorkItem> WorkItems
{
    get
    {
        return this.workItems;
    }
}
The above example can be cast back to a list and mutated, even though the interface is read-only. The interface, despite its name, does not guarantee immutability. It is up to you to provide an immutable solution, so you should return a new ReadOnlyCollection. Note that ReadOnlyCollection is a wrapper around the list rather than a copy: callers cannot mutate through it, so the state of your object is safe, but they will still see later changes to the underlying list.
Richiban says it best in his comment: an interface only guarantees what something can do, not what it cannot do.
See below for an example:
public IEnumerable<WorkItem> WorkItems
{
    get
    {
        return new List<WorkItem>(this.workItems);
    }
}
The above can be cast and mutated, but the state of your object is unaffected because the caller only receives a copy.
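To compare the three ways of exposing a list side by side, here is a rough sketch. The Inventory class and its property names are invented for this illustration and stand in for the Activity class.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class Inventory
{
    private readonly List<string> items = new List<string> { "a", "b" };

    // Exposes the backing list itself: a cast recovers the mutable List<string>.
    public IReadOnlyCollection<string> Direct { get { return items; } }

    // Read-only wrapper: a live view of the list, not a copy; mutators throw.
    public IReadOnlyCollection<string> Wrapped
    {
        get { return new ReadOnlyCollection<string>(items); }
    }

    // Defensive copy: callers may mutate their copy without affecting this object.
    public IEnumerable<string> Copy { get { return new List<string>(items); } }

    public int Count { get { return items.Count; } }
}

public static class CastBackDemo
{
    public static void Main()
    {
        var inventory = new Inventory();

        ((List<string>)inventory.Direct).Add("c");   // succeeds, internal state changes
        Console.WriteLine(inventory.Count);          // 3

        ((List<string>)inventory.Copy).Add("d");     // only the caller's copy changes
        Console.WriteLine(inventory.Count);          // still 3

        try
        {
            ((IList<string>)inventory.Wrapped).Add("e");
        }
        catch (NotSupportedException)
        {
            Console.WriteLine("The wrapper rejected the Add.");
        }
    }
}
```

The copy costs an allocation on every property access, while the wrapper is cheap but keeps reflecting later changes to the underlying list.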
Another, less conventional option is a custom collection class. Consider the following:
public class Bar : IEnumerable<string>
{
    private List<string> foo;

    public Bar()
    {
        this.foo = new List<string> { "123", "456" };
    }

    public IEnumerator<string> GetEnumerator()
    {
        return this.foo.GetEnumerator();
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.GetEnumerator();
    }
}
The class above can have methods for mutating foo the way you want it to be, but your object can never be cast to a list of any sort and mutated.
Carsten Führmann makes a fantastic point about yield return statements in IEnumerables.
One important aspect seems to be missing from the answers so far:
When an IEnumerable<T> is returned to the caller, they must consider the possibility that the returned object is a "lazy stream", e.g. a collection built with "yield return". That is, the performance penalty for producing the elements of the IEnumerable<T> may have to be paid by the caller, for each use of the IEnumerable. (The productivity tool "Resharper" actually points this out as a code smell.)
By contrast, an IReadOnlyCollection<T> signals to the caller that there will be no lazy evaluation. (The Count property, as opposed to the Count extension method of IEnumerable<T> (which is inherited by IReadOnlyCollection<T> so it has the method as well), signals non-laziness. And so does the fact that there seem to be no lazy implementations of IReadOnlyCollection.)
This is also valid for input parameters, as requesting an IReadOnlyCollection<T> instead of IEnumerable<T> signals that the method needs to iterate several times over the collection. Sure the method could create its own list from the IEnumerable<T> and iterate over that, but as the caller may already have a loaded collection at hand it would make sense to take advantage of it whenever possible. If the caller only has an IEnumerable<T> at hand, he only needs to add .ToArray() or .ToList() to the parameter.
What IReadOnlyCollection does not do is prevent the caller from casting to some other collection type. For such protection, one would have to use the class ReadOnlyCollection<T>.
In summary, the only thing IReadOnlyCollection<T> does relative to IEnumerable<T> is add a Count property and thus signal that no laziness is involved.
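The "lazy stream" cost can be made visible with a counter. In this sketch (all names are illustrative), the iterator body runs again for every enumeration of the IEnumerable<int>, while the materialized IReadOnlyCollection<int> produces its elements only once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Counts how many elements the iterator has produced so far.
    public static int ProducedCount;

    // Lazy stream: the body runs again on every enumeration.
    public static IEnumerable<int> LazyNumbers()
    {
        for (int i = 1; i <= 3; i++)
        {
            ProducedCount++;
            yield return i;
        }
    }

    // Eager collection: the elements are produced once, inside this call.
    public static IReadOnlyCollection<int> EagerNumbers()
    {
        return LazyNumbers().ToList();
    }

    public static void Main()
    {
        IEnumerable<int> lazy = LazyNumbers();
        int first = lazy.Sum();
        int second = lazy.Sum();
        Console.WriteLine(ProducedCount); // 6: three elements produced twice

        ProducedCount = 0;
        IReadOnlyCollection<int> eager = EagerNumbers();
        int third = eager.Sum();
        int fourth = eager.Sum();
        Console.WriteLine(ProducedCount); // 3: produced once, then reused
    }
}
```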
Talking about class libraries, I think IReadOnly* is really useful, and I think you're doing it right :)
It's all about immutable collections. Before, there were only fixed-size arrays, and enlarging one was a huge task, so .NET decided to include mutable collections in the framework that handle the ugly stuff for you. IMHO, though, it never gave proper direction on immutable collections, which are extremely useful, especially in high-concurrency scenarios where sharing mutable state is always a PITA.
If you look at other current languages, such as Objective-C, you will see that the rules are completely inverted. Classes almost always exchange immutable collections with each other. In other words, the interface exposes only immutable collections, mutable collections are used internally (yes, they have them of course), and a class exposes dedicated methods if it wants to let outsiders change the collection (if the class is stateful).
So the little experience I have with other languages leads me to think that .NET lists are very powerful, but immutable collections exist for a reason :)
In this case it is not a matter of helping the caller of an interface avoid changing all their code when you change the internal implementation, as it is with IList vs List. With IReadOnly* you are protecting yourself and your class from being used improperly, and avoiding useless protective code that you sometimes could not even write (in the past, in some piece of code, I had to return a clone of the complete list to avoid this problem).
My take on concerns of casting and IReadOnly* contracts, and 'proper' usages of such.
If some code is being “clever” enough to perform an explicit cast and break the interface contract, then it is also “clever” enough to use reflection or otherwise do nefarious things such as access the underlying List of a ReadOnlyCollection wrapper object. I don’t program against such “clever” programmers.
The only thing that I guarantee is that after said IReadOnly*-interface objects are exposed, then my code will not violate that contract and will not modify the returned collection object.
This means that I write code that returns List-as-IReadOnly*, e.g., and rarely opt for an actual read-only concrete type or wrapper. Using IEnumerable.ToList is sufficient to return an IReadOnly[List|Collection] - calling List.AsReadOnly adds little value against “clever” programmers who can still access the underlying list that the ReadOnlyCollection wraps.
In all cases, I guarantee that the concrete types of IReadOnly* return values are eager. If I ever write a method that returns an IEnumerable, it is specifically because the contract of the method is that which “supports streaming” fsvo.
As far as IReadOnlyList and IReadOnlyCollection, I use the former when there is 'an' implied stable ordering established that is meaningful to index, regardless of purposeful sorting. For example, arrays and Lists can be returned as an IReadOnlyList while a HashSet would better be returned as an IReadOnlyCollection. The caller can always assign the I[ReadOnly]List to an I[ReadOnly]Collection as desired: this choice is about the contract exposed and not what a programmer, “clever” or otherwise, will do.
It seems that you can just return an appropriate interface:
...
private readonly List<WorkItem> workItems = new List<WorkItem>();

// Usually, there's no need for the property to be virtual
public virtual IReadOnlyList<WorkItem> WorkItems {
    get {
        return workItems;
    }
}
...
Since the workItems field is in fact a List<T>, the natural idea IMHO is to expose the widest read-only interface it implements, which is IReadOnlyList<T> in this case.
!! IEnumerable vs IReadOnlyList !!
IEnumerable has been with us from the beginning of time. For many years, it was a de facto standard way to represent a read-only collection. Since .NET 4.5, however, there is another way to do that: IReadOnlyList.
Both collection interfaces are useful.
I know there has been a lot of posts on this but it still confuses me why should you pass in an interface like IList and return an interface like IList back instead of the concrete list.
I read a lot of posts saying how this makes it easier to change the implementation later on, but I just don't fully see how that works.
Say if I have this method
public class SomeClass
{
    public bool IsChecked { get; set; }
}

public void LogAllChecked(IList<SomeClass> someClasses)
{
    foreach (var s in someClasses)
    {
        if (s.IsChecked)
        {
            // log
        }
    }
}
I am not sure how using IList will help me out in the future.
How about if I am already in the method? Should I still be using IList?
public void LogAllChecked(IList<SomeClass> someClasses)
{
    //why not List<string> myStrings = new List<string>()
    IList<string> myStrings = new List<string>();
    foreach (var s in someClasses)
    {
        if (s.IsChecked)
        {
            myStrings.Add(s.IsChecked.ToString());
        }
    }
}
What do I get for using IList now?
public IList<int> onlySomeInts(IList<int> myInts)
{
    IList<int> store = new List<int>();
    foreach (var i in myInts)
    {
        if (i % 2 == 0)
        {
            store.Add(i);
        }
    }
    return store;
}
How about now? Is there some new implementation of a list of int's that I will need to change out?
Basically, I need to see some actual code examples of how using IList would have solved some problem over just taking List into everything.
From my reading I think I could have used IEnumerable instead of IList since I am just looping through stuff.
Edit
So I have been playing around with some of my methods on how to do this. I am still not sure about the return type(if I should make it more concrete or an interface).
public class CardFrmVm
{
    public IList<TravelFeaturesVm> TravelFeaturesVm { get; set; }
    public IList<WarrantyFeaturesVm> WarrantyFeaturesVm { get; set; }

    public CardFrmVm()
    {
        WarrantyFeaturesVm = new List<WarrantyFeaturesVm>();
        TravelFeaturesVm = new List<TravelFeaturesVm>();
    }
}

public class WarrantyFeaturesVm : AvailableFeatureVm
{
}

public class TravelFeaturesVm : AvailableFeatureVm
{
}

public class AvailableFeatureVm
{
    public Guid FeatureId { get; set; }
    public bool HasFeature { get; set; }
    public string Name { get; set; }
}
private IList<AvailableFeature> FillAvailableFeatures(IEnumerable<AvailableFeatureVm> avaliableFeaturesVm)
{
    List<AvailableFeature> availableFeatures = new List<AvailableFeature>();
    foreach (var f in avaliableFeaturesVm)
    {
        if (f.HasFeature)
        {
            // nhibernate call to Load<>()
            AvailableFeature availableFeature = featureService.LoadAvaliableFeatureById(f.FeatureId);
            availableFeatures.Add(availableFeature);
        }
    }
    return availableFeatures;
}
Now I am returning IList for the simple reason that I will then add this to my domain model, which has a property like this:
public virtual IList<AvailableFeature> AvailableFeatures { get; set; }
The above is an IList itself as this is what seems to be the standard to use with NHibernate. Otherwise I might have returned IEnumerable back, but I'm not sure. Still, I can't figure out what the user would 100% need (that's where returning a concrete type has an advantage).
Edit 2
I was also thinking what happens if I want to do pass by reference in my method?
private void FillAvailableFeatures(IEnumerable<AvailableFeatureVm> avaliableFeaturesVm, IList<AvailableFeature> toFill)
{
    foreach (var f in avaliableFeaturesVm)
    {
        if (f.HasFeature)
        {
            // nhibernate call to Load<>()
            AvailableFeature availableFeature = featureService.LoadAvaliableFeatureById(f.FeatureId);
            toFill.Add(availableFeature);
        }
    }
}
Would I run into problems with this? Couldn't callers pass in an array (which has a fixed size)? Would a concrete List maybe be better?
There are three questions here: what type should I use for a formal parameter? What should I use for a local variable? and what should I use for a return type?
Formal parameters:
The principle here is do not ask for more than you need. IEnumerable<T> communicates "I need to get the elements of this sequence from beginning to end". IList<T> communicates "I need to get and set the elements of this sequence in arbitrary order". List<T> communicates "I need to get and set the elements of this sequence in arbitrary order and I only accept lists; I do not accept arrays."
By asking for more than you need, you (1) make the caller do unnecessary work to satisfy your unnecessary demands, and (2) communicate falsehoods to the reader. Ask only for what you're going to use. That way if the caller has a sequence, they don't need to call ToList on it to satisfy your demand.
Local variables:
Use whatever you want. It's your method. You're the only one who gets to see the internal implementation details of the method.
Return type:
Same principle as before, reversed. Offer the bare minimum that your caller requires. If the caller only requires the ability to enumerate the sequence, only give them an IEnumerable<T>.
The most practical reason I've ever seen was given by Jeffrey Richter in CLR via C#.
The pattern is to take the basest class or interface possible for your arguments and return the most specific class or interface possible for your return types. This gives your callers the most flexibility in passing in types to your methods and the most opportunities to cast/reuse the return values.
For example, the following method
public void PrintTypes(IEnumerable items)
{
    foreach(var item in items)
        Console.WriteLine(item.GetType().FullName);
}
allows the method to be called passing in any type that can be cast to an enumerable. If you were more specific
public void PrintTypes(List items)
then, say, if you had an array and wished to print their type names to the console, you would first have to create a new List and fill it with your types. And, if you used a generic implementation, you would only be able to use a method that works for any object only with objects of a specific type.
When talking about return types, the more specific you are, the more flexible callers can be with it.
public List<string> GetNames()
you can use this return type to iterate the names
foreach(var name in GetNames())
or you can index directly into the collection
Console.WriteLine(GetNames()[0])
Whereas, if you were getting back a less specific type
public IEnumerable GetNames()
you would have to massage the return type to get the first value
Console.WriteLine(GetNames().OfType<string>().First());
IEnumerable<T> allows you to iterate through a collection. ICollection<T> builds on this and also allows for adding and removing items. IList<T> also allows for accessing and modifying them at a specific index. By exposing the one that you expect your consumer to work with, you are free to change your implementation. List<T> happens to implement all three of those interfaces.
If you expose your property as a List<T> or even an IList<T> when all you want your consumer to have is the ability to iterate through the collection, then they could come to depend on the fact that they can modify the list. If you later decide to convert the actual data store from a List<T> to a Dictionary<T,U> and expose the dictionary keys as the actual value for the property (I have had to do exactly this before), then consumers who have come to expect that their changes will be reflected inside of your class will no longer have that capability. That's a big problem! If you expose the List<T> as an IEnumerable<T> you can comfortably predict that your collection is not being modified externally. That is one of the powers of exposing List<T> as any of the above interfaces.
This level of abstraction goes the other direction when it comes to method parameters. When you pass your list to a method that accepts IEnumerable<T> you can reasonably expect that your list is not going to be modified. When you are the person implementing the method and you say you accept an IEnumerable<T> because all you need to do is iterate through that list, then the person calling the method is free to call it with any data type that is enumerable. This allows your code to be used in unexpected, but perfectly valid ways.
From this it follows that your method implementation can represent its local variables however you wish. The implementation details are not exposed, leaving you free to change your code to something better without affecting the people calling your code.
You cannot predict the future. Assuming that a property's type will always be beneficial as a List<T> immediately limits your ability to adapt to unforeseen expectations of your code. Yes, you may never change that data type from a List<T>, but you can be sure that if you have to, your code is ready for it.
Short Answer:
You pass the interface so that no matter what concrete implementation of that interface you use, your code will support it.
If you use a concrete implementation of list, another implementation of the same list will not be supported by your code.
Read a bit on inheritance and polymorphism.
Here's an example: I had a project once where our lists got very large, and resulting fragmentation of the large object heap was hurting performance. We replaced List with LinkedList. LinkedList does not contain an array, so all of a sudden, we had almost no use of the large object heap.
Mostly, we used the lists as IEnumerable<T>, anyway, so there was no further change needed. (And yes, I would recommend declaring references as IEnumerable if all you're doing is enumerating them.) In a couple of places, we needed the list indexer, so we wrote an inefficient IList<T> wrapper around the linked lists. We needed the list indexer infrequently, so the inefficiency was not a problem. If it had been, we could have provided some other implementation of IList, perhaps as a collection of small-enough arrays, that would have been more efficiently indexable while also avoiding large objects.
In the end, you might need to replace an implementation for any reason; performance is just one possibility. Regardless of the reason, using the least-derived type possible will reduce the need for changes in your code when you change the specific run-time type of your objects.
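A small sketch of why that swap was cheap (the method and class names are invented for illustration): a method that asks only for IEnumerable<int> accepts both the original List<int> and the replacement LinkedList<int> unchanged.

```csharp
using System;
using System.Collections.Generic;

public static class SwapDemo
{
    // Depends only on enumeration, so it does not care which collection it gets.
    public static int CountPositive(IEnumerable<int> values)
    {
        int count = 0;
        foreach (int value in values)
        {
            if (value > 0)
            {
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        var before = new List<int> { -1, 2, 3 };
        var after = new LinkedList<int>(before);

        Console.WriteLine(CountPositive(before));
        Console.WriteLine(CountPositive(after));
        // Had the signature been CountPositive(List<int> values),
        // the second call would not compile.
    }
}
```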
Inside the method, you should use var, instead of IList or List. When your data source changes to come from a method instead, your onlySomeInts method will survive.
The reason to use IList instead of List as parameters, is because many things implement IList (List and [], as two examples), but only one thing implements List. It's more flexible to code to the interface.
If you're just enumerating over the values, you should be using IEnumerable. Every type of datatype that can hold more than one value implements IEnumerable (or should) and makes your method hugely flexible.
Using IList instead of List makes writing unit tests significantly easier. It allows you to use a 'Mocking' library to pass and return data.
The other general reason for using interfaces is to expose the minimum amount of knowledge necessary to the user of an object.
Consider the (contrived) case where I have a data object that implements IList.
public class MyDataObject : IList<int>
{
    public void Method1()
    {
        ...
    }
    // etc
}
Your functions above only care about being able to iterate over a list. Ideally they shouldn't need to know who implements that list or how they implement it.
In your example, IEnumerable is a better choice as you thought.
It is always a good idea to reduce the dependencies between your code as much as possible.
Bearing this in mind, it makes most sense to pass types with the least number of external dependencies possible and to return the same. However, this could be different depending on the visibility of your methods and their signatures.
If your methods form part of an interface, the methods will need to be defined using types available to that interface. Concrete types will probably not be available to interfaces, so they would have to return non-concrete types. You would want to do this if you were creating a framework, for example.
However, if you are not writing a framework, it may be advantageous to pass parameter with the weakest possible types (i.e. base classes, interfaces, or even delegates) and return concrete types. That gives the caller the ability to do as much as possible with the returned object, even if it is cast as an interface. However, this makes the method more fragile, as any change to the returned object type may break the calling code. In practice though, that generally isn't a major problem.
You accept an interface as a parameter for a method because that allows the caller to submit different concrete types as arguments. Given your example method LogAllChecked, the parameter someClasses could be of various types, and for the person writing the method, all might be equivalent (i.e. you'd write the exact same code regardless of the type of the parameter). But for the person calling the method, it can make a huge difference -- if they have an array and you're asking for a list, they have to change the array to a list or vice versa whenever calling the method, a total waste of time from both a programmer and performance POV.
Whether you return an interface or a concrete type depends upon what you want to let your callers do with the object you created -- this is an API design decision, and there's no hard and fast rule. You have to weigh their ability to make full use of the object against their ability to easily use a portion of the object's functionality (and of course whether you WANT them to be making full use of the object). For instance, if you return an IEnumerable, then you are limiting them to iterating -- they can't add or remove items from your object, they can only act against the objects. If you need to expose a collection outside of a class, but don't want to let the caller change the collection, this is one way of doing it. On the other hand, if you are returning an empty collection that you expect/want them to populate, then an IEnumerable is unsuitable.
Here's my answer in this .NET 4.5+ world.
Use IList<T> and IReadOnlyList<T>,
instead of List<T>, because ReadOnlyList<T> doesn't exist.
IList<T> looks so consistent with IReadOnlyList<T>
Use IEnumerable<T> for minimum exposure (property) or requirement (parameter) if foreach is the only way to use it.
Use IReadOnlyList<T> if you also need to expose/use Count and [] indexer.
Use IList<T> if you also allow callers to add/update/delete elements
because List<T> implements IReadOnlyList<T>, it doesn't need any explicit casting.
An example class:
// manipulate the list within the class
private List<int> _numbers;
// callers can add/update/remove elements, but cannot reassign a new list to this property
public IList<int> Numbers { get { return _numbers; } }
// callers can use: .Count and .ReadonlyNumbers[idx], but cannot add/update/remove elements
public IReadOnlyList<int> ReadonlyNumbers { get { return _numbers; } }
I have been hearing that it is important to use the lowest class possible when passing parameters to methods. Why is this? Also, where can I find more information on what the class hierarchy is? I would like to know what IEnumerable inherits from and so forth.
If you use IEnumerable<T> as a parameter type, then you can pass in any type that implements that interface. That includes List<T>, Stack<T>, Queue<T>, etc.
It also includes the various compiler-generated sequence types that might be the result of a LINQ query, and also the very important IQueryable<T>.
By using "low-level" arguments, you give your method the ability to work on a larger variety of objects. It encourages writing generic re-usable methods.
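To make that concrete, here is a small sketch. The Totals class and SumAll method are hypothetical names; the point is only that one IEnumerable<int> parameter accepts every collection type mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Totals
{
    // The method only needs foreach, so IEnumerable<int> is all it asks for.
    public static int SumAll(IEnumerable<int> values)
    {
        int total = 0;
        foreach (int value in values)
        {
            total += value;
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumAll(new List<int> { 1, 2, 3 }));            // 6
        Console.WriteLine(SumAll(new[] { 1, 2, 3 }));                    // 6
        Console.WriteLine(SumAll(new Stack<int>(new[] { 1, 2, 3 })));    // 6
        Console.WriteLine(SumAll(new Queue<int>(new[] { 1, 2, 3 })));    // 6
        // A lazily evaluated LINQ query works too.
        Console.WriteLine(SumAll(Enumerable.Range(1, 3).Where(n => n > 1))); // 5
    }
}
```

Had the parameter been List<int>, every caller except the first would have needed a ToList() copy first.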
MSDN can tell you what different things inherit from (in the case of the non-generic IEnumerable, it inherits from nothing, because it represents pretty much the most primitive idea of a "list"; the generic IEnumerable<T> inherits only from IEnumerable).
IEnumerable is a read-only sequence, while a List can be appended to.
If you design your public API so that it exposes IList<T> all over the place and then realize that you want to return a read only list, you have to either break your code by changing to IEnumerable<T> or use the horrible ReadOnlyCollection. I call it horrible because it throws exceptions on .Add/.Remove etc.
So if you only need to read, return IEnumerable, if your callers need to add/append, return IList.
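A short sketch of the runtime behaviour being complained about. ReadOnlyDemo and AddThrows are hypothetical names; List<T>.AsReadOnly() and the NotSupportedException are standard framework behaviour.

```csharp
using System;
using System.Collections.Generic;

public static class ReadOnlyDemo
{
    // Returns true when the list rejects Add at runtime.
    public static bool AddThrows(IList<int> list)
    {
        try
        {
            list.Add(42);
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }

    public static void Main()
    {
        IList<int> writable = new List<int> { 1, 2 };
        IList<int> readOnly = new List<int> { 1, 2 }.AsReadOnly();

        Console.WriteLine(AddThrows(writable));   // False
        Console.WriteLine(AddThrows(readOnly));   // True
        Console.WriteLine(readOnly.IsReadOnly);   // True
    }
}
```

Both variables have the same static type, so the compiler cannot warn the caller; the only way to find out in advance is to check IsReadOnly.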
On another note: Never return a List<T>, always an IList<T>. The reason is that List is a concrete class that can't be overridden in any sensible way, while IList is an interface that allows you to change the actual implementation without breaking the public contract.
The quickest thing I can think of is this: what happens if you no longer want your implementation to be of type List<T>?
Let's say you one day decide to refactor your application to use a LinkedList<T> or a SortedList<T>. All you have to change is that one type, instead of all of the types in all of the methods you might be passing your collection around to.
You can improve the maintainability of your code by using this technique.
The idea is to maximize the flexibility of your function. If you require a List<T>, then callers must have one to pass in. If they don't have one handy, they'll have to create one, and this is expensive. If you require IEnumerable<T>, on the other hand, then they can pass in any collection.
The best place to find information on the class hierarchy in .NET is MSDN.
Which is the best type to use for returning collections?
Should I use IList<T>, IEnumerable<T>, IQueryable<T>, something else? Which is best and why?
I'm trying to decide which I should use typically, both in the interface and the implementation of a few classes that I'm writing.
edit Let me nail this down a little further, I am using LINQ to SQL to return data over a WCF Service. It feels like that may make a change in the best type to use?
The Framework Design Guidelines state:
Use Collection<T> or a subclass of Collection<T> for properties or return values representing read/write collections.
public Collection<Session> Sessions { get; }
Use ReadOnlyCollection<T>, a subclass of ReadOnlyCollection<T>, or in rare cases IEnumerable<T> for properties or return values representing read-only collections.
public ReadOnlyCollection<Session> Sessions { get; }
In general, prefer ReadOnlyCollection<T>.
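As a rough illustration of those two guidelines side by side: the Conference class, the Archive method and the Title property below are invented for the sketch; only the Session type name and the two property shapes come from the quoted text.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class Session
{
    public string Title { get; set; }   // hypothetical property
}

// Hypothetical owner type exposing one read/write and one read-only collection.
public class Conference
{
    private readonly List<Session> _archived = new List<Session>();

    public Conference()
    {
        Sessions = new Collection<Session>();
        // The wrapper is a live view: later changes to _archived show through it.
        ArchivedSessions = new ReadOnlyCollection<Session>(_archived);
    }

    // Read/write collection: callers may add and remove sessions.
    public Collection<Session> Sessions { get; private set; }

    // Read-only collection: callers can enumerate and index, but not modify.
    public ReadOnlyCollection<Session> ArchivedSessions { get; private set; }

    public void Archive(Session session)
    {
        if (Sessions.Remove(session))
        {
            _archived.Add(session);
        }
    }
}

public static class ConferenceDemo
{
    public static void Main()
    {
        var conference = new Conference();
        var talk = new Session { Title = "Generics" };
        conference.Sessions.Add(talk);
        conference.Archive(talk);
        Console.WriteLine(conference.Sessions.Count);          // 0
        Console.WriteLine(conference.ArchivedSessions.Count);  // 1
    }
}
```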
Regarding LINQ, the guidelines, which were created for .NET 3.5, are clear but not (imo) entirely convincing in the justification:
The review body made an explicit decision that LINQ should not change this guideline ["Do not return IEnumerator<T>, except as the return type of a GetEnumerator method"]. Your callers can end up with a clumsy object model if they choose not to use LINQ or a language that does not support it.
Use the least general type that all possible return types will conform to, i.e., if the method you are looking at might return a List<int> or an int[], then I'd type it as IEnumerable<int>. If it could return a List<int>, a List<Employee> or an int[], I'd type it as IEnumerable. If it always returns either a Collection<Employee> or a Collection<SalariedEmployee>, then return Collection<Employee>.
If the method will always generate the same type, use that type...
In a consuming method or interface, on the other hand, where the returned object is being used, you should apply the opposite philosophy: type the incoming method parameter as the most general type that still provides what the code inside the consuming method needs, i.e., if all the method does with the collection object is enumerate through it using foreach, then the incoming parameter type should be IEnumerable<>.
If the collection is unordered or doesn't need random access, IEnumerable is correct. If it's a list and you want to expose it as one, then declare the method or property to return IList, but you may well need to return a ReadOnlyCollection wrapper over that collection (either directly or using syntax such as List.AsReadOnly()). I would return IQueryable only if I had some useful overrides.
I default to IEnumerable. I'm shooting for the minimal interface to expose. Both IList<T> and IQueryable<T> implement IEnumerable<T>. So unless you have other specific requirements for the methods I'd go for minimalism and use the least derived type. If you have other requirements in your calling code, such as performance of indexed lookups or getting the number of items in the collection then you might want to choose another type such as ICollection<T>.
When writing applications, I don't see any problem with returning a specific generic type, e.g.:
List<myType> MyMethod()
{
...
}
In my experience, this is easy for the original developer, and easy for other developers to understand what the original developer intended.
But if you're developing some kind of framework that will be used by other developers, you might want to be more sophisticated - returning an interface, for example.
It ultimately depends on what you want to do with the data being returned. Remember that IEnumerable forces you to access the data sequentially. You can't add to it or alter it, and you can't access an item at a specific position in the collection.
IList doesn't have this problem, but you have to provide additional functionality to implement it. If you inherit from a .NET type, you might not have to worry about it, but it really depends on how you are creating the object.
Each has its trade-offs, and there is no single one to always default to.
Prior to C# generics, everyone would code collections for their business objects by creating a collection base that implemented IEnumerable
i.e.:
public class CollectionBase : IEnumerable
and then would derive their Business Object collections from that.
public class BusinessObjectCollection : CollectionBase
Now with the generic list class, does anyone just use that instead? I've found that I use a compromise of the two techniques:
public class BusinessObjectCollection : List<BusinessObject>
I do this because I like to have strongly typed names instead of just passing Lists around.
What is your approach?
I am generally in the camp of just using a List directly, unless for some reason I need to encapsulate the data structure and provide a limited subset of its functionality. This is mainly because if I don't have a specific need for encapsulation then doing it is just a waste of time.
However, with the collection initializer feature in C# 3.0, there are some new situations where I would advocate using customized collection classes.
Basically, C# 3.0 allows any class that implements IEnumerable and has an accessible Add method to use the new collection initializer syntax. For example, because Dictionary defines a method Add(K key, V value) it is possible to initialize a dictionary using this syntax:
var d = new Dictionary<string, int>
{
{"hello", 0},
{"the answer to life the universe and everything is:", 42}
};
The great thing about the feature is that it works for add methods with any number of arguments. For example, given this collection:
class c1 : IEnumerable
{
public void Add(int x1, int x2, int x3)
{
//...
}
//...
}
it would be possible to initialize it like so:
var x = new c1
{
{1,2,3},
{4,5,6}
};
This can be really useful if you need to create static tables of complex objects. For example, if you were just using List<Customer> and you wanted to create a static list of customer objects you would have to create it like so:
var x = new List<Customer>
{
new Customer("Scott Wisniewski", "555-555-5555", "Seattle", "WA"),
new Customer("John Doe", "555-555-1234", "Los Angeles", "CA"),
new Customer("Michael Scott", "555-555-8769", "Scranton", "PA"),
new Customer("Ali G", "", "Staines", "UK")
};
However, if you use a customized collection, like this one:
class CustomerList : List<Customer>
{
public void Add(string name, string phoneNumber, string city, string stateOrCountry)
{
Add(new Customer(name, phoneNumber, city, stateOrCountry));
}
}
You could then initialize the collection using this syntax:
var customers = new CustomerList
{
{"Scott Wisniewski", "555-555-5555", "Seattle", "WA"},
{"John Doe", "555-555-1234", "Los Angeles", "CA"},
{"Michael Scott", "555-555-8769", "Scranton", "PA"},
{"Ali G", "", "Staines", "UK"}
};
This has the advantage of being both easier to type and easier to read because there is no need to retype the element type name for each element. The advantage can be particularly strong if the element type is long or complex.
That being said, this is only useful if you need static collections of data defined in your app. Some types of apps, like compilers, use them all the time. Others, like typical database apps don't because they load all their data from a database.
My advice would be that if you either need to define a static collection of objects, or need to encapsulate away the collection interface, then create a custom collection class. Otherwise I would just use List<T> directly.
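For anyone who wants to try the custom-collection idea, here is a self-contained sketch. The Customer class body is an assumption (the answer never shows it), and CustomerListDemo.Build is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of Customer: four strings passed to the constructor.
public class Customer
{
    public Customer(string name, string phoneNumber, string city, string stateOrCountry)
    {
        Name = name;
        PhoneNumber = phoneNumber;
        City = city;
        StateOrCountry = stateOrCountry;
    }

    public string Name { get; private set; }
    public string PhoneNumber { get; private set; }
    public string City { get; private set; }
    public string StateOrCountry { get; private set; }
}

public class CustomerList : List<Customer>
{
    // Extra Add overload picked up by the collection initializer.
    public void Add(string name, string phoneNumber, string city, string stateOrCountry)
    {
        Add(new Customer(name, phoneNumber, city, stateOrCountry));
    }
}

public static class CustomerListDemo
{
    public static CustomerList Build()
    {
        return new CustomerList
        {
            { "John Doe", "555-555-1234", "Los Angeles", "CA" },  // calls the four-argument Add
            new Customer("Ali G", "", "Staines", "UK")            // the inherited Add(Customer) still works
        };
    }

    public static void Main()
    {
        CustomerList customers = Build();
        Console.WriteLine(customers.Count);     // 2
        Console.WriteLine(customers[0].City);   // Los Angeles
    }
}
```

Each braced element is compiled into a call to whichever Add overload matches its arguments, which is why the two element styles can be mixed in one initializer.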
It's recommended not to use List<T> in public APIs, but to use Collection<T> instead.
If you are inheriting from it though, you should be fine, afaik.
I prefer just to use List<BusinessObject>. Typedefing it just adds unnecessary boilerplate to the code. List<BusinessObject> is a specific type, it's not just any List object, so it's still strongly typed.
More importantly, declaring something List<BusinessObject> makes it easier for everyone reading the code to tell what types they are dealing with, they don't have to search through to figure out what a BusinessObjectCollection is and then remember that it's just a list. By typedefing, you'll have to require a consistent (re)naming convention that everyone has to follow in order for it to make sense.
Use the type List<BusinessObject> where you have to declare a list of them. However, where you return a list of BusinessObject, consider returning IEnumerable<T>, IList<T> or ReadOnlyCollection<T> - i.e. return the weakest possible contract that satisfies the client.
Where you want to "add custom code" to a list, code extension methods on the list type. Again, attach these methods to the weakest possible contract, e.g.
public static int SomeCount(this IEnumerable<BusinessObject> someList)
Of course, you can't and shouldn't add state with extension methods, so if you need to add a new property and a field behind it, use a subclass or better, a wrapper class to store this.
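A minimal sketch of that extension-method approach. The IsActive property and the body of SomeCount are assumptions made up for the example; only the method signature comes from the line above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class BusinessObject
{
    public bool IsActive { get; set; }   // hypothetical property
}

public static class BusinessObjectExtensions
{
    // Attached to IEnumerable<BusinessObject>, the weakest contract that works,
    // so it is available on lists, arrays and query results alike.
    public static int SomeCount(this IEnumerable<BusinessObject> someList)
    {
        return someList.Count(item => item.IsActive);
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        var list = new List<BusinessObject>
        {
            new BusinessObject { IsActive = true },
            new BusinessObject { IsActive = false }
        };
        BusinessObject[] array = list.ToArray();

        Console.WriteLine(list.SomeCount());    // 1
        Console.WriteLine(array.SomeCount());   // 1
    }
}
```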
I've been going back and forth on 2 options:
public class BusinessObjectCollection : List<BusinessObject> {}
or methods that just do the following:
public IEnumerable<BusinessObject> GetBusinessObjects();
The benefit of the first approach is that you can change the underlying data store without having to mess with method signatures. Unfortunately, if you later inherit from a collection type that lacks a method the previous one had, then you'll have to deal with those situations throughout your code.
You should probably avoid creating your own collection for that purpose. It's pretty common to want to change the type of data structure a few times during refactorings or when adding new features. With your approach, you would wind up with a separate class for BusinessObjectList, BusinessObjectDictionary, BusinessObjectTree, etc.
I don't really see any value in creating this class just because the classname is more readable. Yeah, the angle bracket syntax is kind of ugly, but it's standard in C++, C# and Java, so even if you don't write code that uses it you're going to run into it all the time.
I generally only derive my own collection classes if I need to "add value". Like, if the collection itself needed to have some "metadata" properties tagging along with it.
I do the exact same thing as you Jonathan... just inherit from List<T>. You get the best of both worlds. But I generally only do it when there is some value to add, like adding a LoadAll() method or whatever.
You can use both. For laziness - I mean productivity - List is a very useful class, it's also "comprehensive" and frankly full of YAGNI members. Coupled with the sensible argument / recommendation put forward by the MSDN article already linked about exposing List as a public member, I prefer the "third" way:
Personally I use the decorator pattern to expose only what I need from List, i.e.:
public class OrderItemCollection : IEnumerable<OrderItem>
{
private readonly List<OrderItem> _orderItems = new List<OrderItem>();
public void Add(OrderItem item)
{
_orderItems.Add(item);
}
//implement only the list members, which are required from your domain.
//ie. sum items, calculate weight etc...
public IEnumerator<OrderItem> GetEnumerator()
{
return _orderItems.GetEnumerator();
}
// IEnumerable<T> also requires the non-generic enumerator.
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
Further still, I'd probably abstract OrderItemCollection into IOrderItemCollection so I can swap my implementation of IOrderItemCollection in the future (I may prefer to use a different inner enumerable object such as Collection or, more likely for perf, a key-value pair collection or Set).
I use generic lists for almost all scenarios. The only time that I would consider using a derived collection anymore is if I add collection specific members. However, the advent of LINQ has lessened the need for even that.
6 of 1, half dozen of another
Either way it's the same thing. I only do it when I have reason to add custom code into the BusinessObjectCollection.
Without it, having the load methods return a list allows me to write more code in a common generic class and have it just work, such as a Load method.
As someone else pointed out, it is recommended not to expose List publicly, and FxCop will whinge if you do so. This includes inheriting from List as in:
public class MyTypeCollection : List<MyType>
In most cases public APIs will expose IList (or ICollection or IEnumerable) as appropriate.
In cases where you want your own custom collection, you can keep FxCop quiet by inheriting from Collection instead of List.
If you choose to create your own collection class you should check out the types in System.Collections.ObjectModel Namespace.
The namespace defines base classes that are meant to make it easier for implementers to create custom collections.
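As a sketch of what those base classes buy you: Collection<T> routes every insertion and removal through a few protected virtual methods, so a subclass can hook them. The AddedCount and RemovedCount counters and the Name property are hypothetical; InsertItem and RemoveItem are the real override points.

```csharp
using System;
using System.Collections.ObjectModel;

public class BusinessObject
{
    public string Name { get; set; }   // hypothetical property
}

public class BusinessObjectCollection : Collection<BusinessObject>
{
    // Hypothetical counters standing in for whatever the hooks should do.
    public int AddedCount { get; private set; }
    public int RemovedCount { get; private set; }

    // Called by Add and Insert.
    protected override void InsertItem(int index, BusinessObject item)
    {
        if (item == null) throw new ArgumentNullException("item");
        base.InsertItem(index, item);
        AddedCount++;
    }

    // Called by Remove and RemoveAt.
    protected override void RemoveItem(int index)
    {
        base.RemoveItem(index);
        RemovedCount++;
    }
}

public static class CollectionHookDemo
{
    public static void Main()
    {
        var items = new BusinessObjectCollection();
        var first = new BusinessObject { Name = "first" };
        items.Add(first);
        items.Add(new BusinessObject { Name = "second" });
        items.Remove(first);

        Console.WriteLine(items.Count);         // 1
        Console.WriteLine(items.AddedCount);    // 2
        Console.WriteLine(items.RemovedCount);  // 1
    }
}
```

List<T> offers no equivalent, because its Add and Remove are not virtual.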
I tend to do it with my own collection if I want to shield access to the actual list. When you are writing business objects, chances are that you need a hook to know when your object is being added or removed; in that sense I think BOCollection is the better idea. Of course, if that is not required, List is more lightweight. Also, you might want to consider using IList to provide an additional abstraction if you need some kind of proxying (e.g. a fake collection that triggers a lazy load from the database).
But... why not consider Castle ActiveRecord or any other mature ORM framework? :)
Most of the time I simply go the List way, as it gives me all the functionality I need 90% of the time, and when something 'extra' is needed, I inherit from it and code that extra bit.
I would do this:
using BusinessObjectCollection = System.Collections.Generic.List<BusinessObject>;
This just creates an alias rather than a completely new type. I prefer it to using List<BusinessObject> directly because it leaves me free to change the underlying structure of the collection at some point in the future without changing code that uses it (as long as I provide the same properties and methods).
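A small sketch of the alias in use. Two things worth knowing: the right-hand side of a using alias has to be a fully qualified type name, and the alias is only visible inside the file (or namespace block) that declares it. The empty BusinessObject class and AliasDemo are placeholders for the example.

```csharp
using System;
using BusinessObjectCollection = System.Collections.Generic.List<BusinessObject>;

public class BusinessObject { }   // placeholder type

public static class AliasDemo
{
    public static BusinessObjectCollection Create()
    {
        return new BusinessObjectCollection { new BusinessObject() };
    }

    public static void Main()
    {
        BusinessObjectCollection items = Create();
        Console.WriteLine(items.Count);   // 1
        // The alias is not a new type; at runtime it is just List<BusinessObject>.
        Console.WriteLine(items.GetType() == typeof(System.Collections.Generic.List<BusinessObject>)); // True
    }
}
```

Because the alias is per file, code in other files still sees List<BusinessObject> in your public signatures, so each file that wants the short name has to repeat the directive.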
Try this:
System.Collections.ObjectModel.Collection<BusinessObject>
It makes it unnecessary to implement the basic methods, as you have to with CollectionBase.
this is the way:
return arrays, accept IEnumerable<T>
=)