When to use Collection<T> or IEnumerable<T> - c#

I've seen similar questions but I'm still lost with regard to the return type of public, internal and private methods. When should my return type be Collection<T> and when should it be IEnumerable<T>?
The MSDN guide says
X DO NOT use weakly typed collections in public APIs.
The type of all return values and parameters representing collection items should be the exact item type, not any of its base types (this applies only to public members of the collection).
This has totally confused me and implies that everything (public) should be Collection<T> or ReadOnlyCollection<T>. My confusion comes in when can we use List<T>, for example
public Collection<string> DoThisThing(int i)
{
    Collection<string> col = new Collection<string>();
    List<string> myList = null;
    if (i > 0)
        myList = GetListOfString();
    //col.AddRange(...); Can't AddRange as no method
    //return myList; Can't do this as it expects return type of Collection<string>
    //So, to ensure my return is Collection, I have to convert
    foreach(var item in myList)
    {
        col.Add(item);
    }
    return col;
}
In the above example, it's perfectly valid to say that List is a better choice, so, if I want to use AddRange, I could either:
1. Hope to have control of the API so I can change the return type of GetListOfString to Collection (not always possible)
2. Add an extension method to Collection<T> to support AddRange
3. Type it out every time, as I do above
However, if I do option 1, then I don't see when I'd ever actually use ToList() (other than when I have to if an external API returns List) as all my return types would be either Collection<T>, ReadOnlyCollection<T>
Using option 2 seems more sensible but, now I'm lost as it feels as if I'm forcing it to do something it isn't intended to do.
Option 3 is repetitive
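For illustration, option 2 could look roughly like the sketch below. The extension class and the Build helper are hypothetical names, not part of the framework; the only framework members used are ICollection<T>.Add and the Collection<T> constructors.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class CollectionExtensions
{
    // Hypothetical helper: appends every element of 'items' to 'target'.
    // List<T> keeps using its own instance AddRange; this only fills the gap
    // for types such as Collection<T> that do not have one.
    public static void AddRange<T>(this ICollection<T> target, IEnumerable<T> items)
    {
        if (target == null) throw new ArgumentNullException(nameof(target));
        if (items == null) throw new ArgumentNullException(nameof(items));
        foreach (var item in items)
        {
            target.Add(item);
        }
    }
}

public static class AddRangeDemo
{
    // Stand-in for the question's DoThisThing: copies a list into a Collection<string>.
    public static Collection<string> Build(List<string> source)
    {
        var col = new Collection<string>();
        col.AddRange(source ?? new List<string>()); // tolerate a missing list
        return col;
    }

    public static void Main()
    {
        Collection<string> col = Build(new List<string> { "a", "b" });
        Console.WriteLine(col.Count); // prints 2
    }
}
```

If copying is not required, Collection<T> also has a constructor that takes an IList<T> and wraps it, so new Collection<string>(myList) may be enough; note that the wrapper shares the underlying list rather than copying it.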
Whilst I appreciate it will always depend on the situation, as a general rule, is the following accurate?
Public - Should be very precise, for example IEnumerable<string> (note, not T), but not Collection<T>/Collection<string>
Internal and private - Should be more agnostic, so Collection<T> is better (and live with not really using List or, if I do have to use it, I will have to convert it into Collection<T>)

EDIT: added code at the end
You are confusing several different layers of the guide.
In your internal code you are free to do what you like, although there are guidelines for how to write code that is clean, readable and easy to maintain. Still, those are implementation details and they are tucked away behind your public methods, so you can do what you want (you may shoot yourself in the foot).
Artifacts you write to be used by other developers are one of these:
libraries;
frameworks;
modules of "some kind".
Now, in these cases you must decide:
the boundaries of your artifact;
how those developers use your artifact.
The boundaries are well defined by an assembly, if you choose to have one.
Sometimes namespaces may be considered "internal boundaries" and the consumer is yourself, in the future. Organizing your code into internal boundaries is a good practice but consider that classes already define boundaries so don't overdo it. Group 10-20 classes in a namespace by cohesion and keep those groups apart without interdependencies by low coupling. You cannot really understand what I mean here when I say bridge... but come back in a year or so and read this answer again :)
How to use the assembly is defined in these places:
manifest (generated automatically by the VS packaging);
documentation (which should accompany your assembly);
source code (if you expose it);
public constructs (enums, interfaces, classes, namespaces, package dependencies);
public and protected methods.
Now with all those tools you have a way to design that thing called an API.
There is plenty of room to maneuver and make mistakes. Because of this, the good people at M$ decided to define a uniform and consistent way of designing the default APIs in the default assemblies and namespaces (such as System.*).
What you have touched in your question is a guideline to API designers (which you might or might not be). All they say is:
if you are designing an API such as a non-internal method of a public class in an accessible assembly, then do us the favor and return a tightly defined value, as concrete as possible.
So, if it is a List<T> don't return any of the ancestors such as IEnumerable<T> or whatever. This allows the consumers (other developers) of your assembly to leverage concrete properties of your return value.
On the other hand you should accept as general values as possible. If you just use a method of IEnumerable<T> in your method you should require just that:
public void MyMethod<T>(IEnumerable<T> e);
This signature of your method allows consumers to pass in many different concrete collections. The only thing you require is that it is (derives from) an IEnumerable<T>.
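A minimal sketch of what that flexibility buys the caller; Total and ParameterDemo are made-up names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ParameterDemo
{
    // Only enumeration is needed, so only IEnumerable<int> is required.
    public static int Total(IEnumerable<int> values)
    {
        int sum = 0;
        foreach (int v in values)
        {
            sum += v;
        }
        return sum;
    }

    public static void Main()
    {
        // The same method accepts an array, a list, a set and a LINQ sequence.
        Console.WriteLine(Total(new[] { 1, 2, 3 }));             // prints 6
        Console.WriteLine(Total(new List<int> { 1, 2, 3 }));     // prints 6
        Console.WriteLine(Total(new HashSet<int> { 1, 2, 3 }));  // prints 6
        Console.WriteLine(Total(Enumerable.Range(1, 3)));        // prints 6
    }
}
```

Had the parameter been declared as List<int>, three of the four calls would need an extra ToList() first.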
These design decisions are also bound to many important OO principles of SOLID. They postulate that:
you should depend on abstractions => so expose interfaces if possible and keep your API abstract;
you should hide implementation details so that changes in your assembly don't propagate to consumer code;
you should avoid changing the API by allowing extension mechanism.
To be more precise I take your code and try to be as concrete as possible:
public List<string> DoThisThing(int i) // use List to return a more concrete type
{
    List<string> list = new List<string>(); // you really want a *LIST* so why be abstract in your own code innards?
    // List<string> myList = null; <-- superfluous
    if (i > 0)
    {
        list = GetListOfString();
    } // ALWAYS (!) use code blocks, see the book "Clean Code"
    // this problem is solved: in your internal code don't be abstract if you don't need it
    //col.AddRange(...); Can't AddRange as no method
    return list; // this works now, because the return type is List<string> rather than Collection<string>
    // you're done...
    // So, to ensure my return is Collection, I have to convert
    // foreach(var item in myList)
    // {
    // col.Add(item);
    // }
    // return col;
}
Some help about the theory behind all this: The SOLID principles.
What you really want to understand is how and why you need abstractions. When you really grasp that, you will also understand when you DO NOT need them and so decide in a reasoned manner when to use IEnumerable<T> or List<string> without negative repercussions on the rest of your code. The most important principles of SOLID are in this case OCP and DIP.

Your question seems to be heading in the "wrong" direction, starting from the initial MSDN quotation:
X DO NOT use weakly typed collections in public APIs.
The type of all return values and parameters representing collection items should be the exact item type, not any of its base types (this applies only to public members of the collection).
This refers to the items of collections, not the collection types.
This guideline does not refer to the type of the collection itself at all. Whether you return an enumerable, a collection, a list, or any other collection type, entirely depends on how you intend your collection to behave and what guarantees you want to make about the set of items you store.
The guideline you cited, on the other hand, indicates that you should strongly type the items stored in your collection. That is, if you have a list of string values, do not declare the collection item type as object (e.g. by typing the collection to IEnumerable<object> or IList<object>), but do use string as the item type (e.g. IEnumerable<string>, ICollection<string>, IList<string>).
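To make the distinction concrete, here is a small sketch with hypothetical method names. Both methods return the same kind of collection; only the item type differs, and that is what the guideline is about.

```csharp
using System;
using System.Collections.Generic;

public static class ItemTypeDemo
{
    // Weakly typed items: the guideline advises against this in public APIs.
    public static IEnumerable<object> GetNamesWeak()
    {
        return new List<object> { "Ann", "Bob" };
    }

    // Strongly typed items: the collection type is the same, the item type is exact.
    public static IEnumerable<string> GetNamesStrong()
    {
        return new List<string> { "Ann", "Bob" };
    }

    public static void Main()
    {
        foreach (object o in GetNamesWeak())
        {
            string s = (string)o;          // every consumer has to cast
            Console.WriteLine(s.Length);
        }
        foreach (string s in GetNamesStrong())
        {
            Console.WriteLine(s.Length);   // no cast needed
        }
    }
}
```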

Related

Is returning IList<T> worse than returning T[] or List<T>?

The answers to questions like this: List<T> or IList<T> always seem to agree that returning an interface is better than returning a concrete implementation of a collection. But I'm struggling with this. Instantiating an interface is impossible, so if your method is returning an interface, it's actually still returning a specific implementation. I was experimenting a bit with this by writing 2 small methods:
public static IList<int> ExposeArrayIList()
{
    return new[] { 1, 2, 3 };
}
public static IList<int> ExposeListIList()
{
    return new List<int> { 1, 2, 3 };
}
And use them in my test program:
static void Main(string[] args)
{
    IList<int> arrayIList = ExposeArrayIList();
    IList<int> listIList = ExposeListIList();
    //Will give a runtime error
    arrayIList.Add(10);
    //Runs perfectly
    listIList.Add(10);
}
In both cases when I try to add a new value, my compiler gives me no errors, but obviously the method which exposes my array as an IList<T> gives a runtime error when I try to add something to it.
So people who don't know what's happening in my method, and have to add values to it, are forced to first copy my IList to a List to be able to add values without risking errors. Of course they can do a typecheck to see if they're dealing with a List or an Array, but if they don't do that, and they want to add items to the collection, they have no other choice but to copy the IList to a List, even if it already is a List. Should an array never be exposed as IList?
Another concern of mine is based upon the accepted answer of the linked question (emphasis mine):
If you are exposing your class through a library that others will use, you generally want to expose it via interfaces rather than concrete implementations. This will help if you decide to change the implementation of your class later to use a different concrete class. In that case the users of your library won't need to update their code since the interface doesn't change.
If you are just using it internally, you may not care so much, and using List may be ok.
Imagine someone actually used the IList<T> they got from my ExposeListIList() method just like that to add/remove values. Everything works fine. But now, as the answer suggests, because returning an interface is more flexible, I return an array instead of a List (no problem on my side!), and then they're in for a treat...
TLDR:
1) Exposing an interface causes unnecessary casts? Does that not matter?
2) Sometimes if users of the library don't use a cast, their code can break when you change your method, even though the method remains perfectly fine.
I am probably overthinking this, but I don't get the general consensus that returning an interface is to be preferred over returning an implementation.
Maybe this is not directly answering your question, but in .NET 4.5+, I prefer to follow these rules when designing public or protected APIs:
do return IEnumerable<T>, if only enumeration is available;
do return IReadOnlyCollection<T> if both enumeration and items count are available;
do return IReadOnlyList<T>, if enumeration, items count and indexed access are available;
do return ICollection<T> if enumeration, items count and modification are available;
do return IList<T>, if enumeration, items count, indexed access and modification are available.
The last two options assume that the method must not return an array as the IList<T> implementation.
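As a rough sketch of how those rules might look on a single class (Inventory and its member names are invented for this example), each method promises exactly the capabilities listed above and nothing more:

```csharp
using System;
using System.Collections.Generic;

public class Inventory
{
    private readonly List<string> items = new List<string> { "bolt", "nut", "washer" };

    // Enumeration only (here produced lazily with an iterator).
    public IEnumerable<string> Enumerate()
    {
        foreach (string item in items)
        {
            yield return item;
        }
    }

    // Enumeration plus Count. List<T> implements IReadOnlyCollection<T> since .NET 4.5.
    public IReadOnlyCollection<string> AsCounted()
    {
        return items;
    }

    // Enumeration, Count and indexed access.
    public IReadOnlyList<string> AsIndexed()
    {
        return items;
    }

    // Full modification rights, handed out on a copy that is a real List<T>, never an array.
    public IList<string> AsEditable()
    {
        return new List<string>(items);
    }
}

public static class InventoryDemo
{
    public static void Main()
    {
        var inventory = new Inventory();
        Console.WriteLine(inventory.AsCounted().Count);  // prints 3
        Console.WriteLine(inventory.AsIndexed()[0]);     // prints bolt

        IList<string> editable = inventory.AsEditable();
        editable.Add("screw");                           // safe: never a fixed-size array
        Console.WriteLine(editable.Count);               // prints 4
        Console.WriteLine(inventory.AsCounted().Count);  // still prints 3
    }
}
```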
No, because the consumer should know what exactly IList is:
IList is a descendant of the ICollection interface and is the base
interface of all non-generic lists. IList implementations fall into
three categories: read-only, fixed-size, and variable-size. A
read-only IList cannot be modified. A fixed-size IList does not allow
the addition or removal of elements, but it allows the modification of
existing elements. A variable-size IList allows the addition, removal,
and modification of elements.
You can check for IList.IsFixedSize and IList.IsReadOnly and do what you want with it.
I think IList is an example of a fat interface and it should have been split into multiple smaller interfaces and it also violates Liskov substitution principle when you return an array as an IList.
Read more if you want to make decision about returning interface
UPDATE
Digging further, I found that IList<T> does not inherit from IList. IsReadOnly is accessible through the base interface ICollection<T>, but there is no IsFixedSize for IList<T>. Read more about why generic IList<> does not inherit non-generic IList?
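A short sketch of such a check; TryAdd is a hypothetical helper, not a framework method. It relies on the fact that an array viewed through the generic interfaces reports IsReadOnly as true, while the non-generic IList view reports IsFixedSize as true.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class ReadOnlyCheckDemo
{
    // Adds the item only when the collection reports that it is writable.
    public static bool TryAdd<T>(ICollection<T> target, T item)
    {
        if (target.IsReadOnly)
        {
            return false;
        }
        target.Add(item);
        return true;
    }

    public static void Main()
    {
        IList<int> fromArray = new[] { 1, 2, 3 };
        IList<int> fromList = new List<int> { 1, 2, 3 };

        Console.WriteLine(TryAdd(fromArray, 10)); // prints False: nothing is added, no exception
        Console.WriteLine(TryAdd(fromList, 10));  // prints True

        // The fixed-size flag only exists on the non-generic interface.
        IList nonGeneric = new[] { 1, 2, 3 };
        Console.WriteLine(nonGeneric.IsFixedSize); // prints True
    }
}
```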
As with all "interface versus implementation" questions, you'll have to realise what exposing a public member means: it defines the public API of this class.
If you expose a List<T> as a member (field, property, method, ...), you tell the consumer of that member: the type obtained by accessing this member is a List<T>, or something derived from that.
Now if you expose an interface, you hide the "implementation detail" of your class using a concrete type. Of course you can't instantiate IList<T>, but you can use a Collection<T>, List<T>, derivations thereof or your own type implementing IList<T>.
The actual question is "Why does Array implement IList<T>", or "Why has the IList<T> interface so many members".
It also depends on what you want the consumers of that member to do. If you actually return an internal member through your Expose... member, you'll want to return a new List<T>(internalMember) anyway, as otherwise the consumer can try and cast them to IList<T> and modify your internal member through that.
If you just expect consumers to iterate the results, expose IEnumerable<T> or IReadOnlyCollection<T> instead.
Be careful with blanket quotes that are taken out of context.
Returning an interface is better than returning a concrete implementation
This quote only makes sense if it's used in the context of the SOLID principles. There are 5 principles but for the purposes of this discussion we'll just talk about the last 3.
Dependency inversion principle
one should “Depend upon Abstractions. Do not depend upon concretions.”
In my opinion, this principle is the most difficult to understand. But if you look at the quote carefully it looks a lot like your original quote.
Depend on interfaces (abstractions). Do not depend on concrete implementations (concretions).
This is still a little confusing but if we start applying the other principles together it starts to make a lot more sense.
Liskov substitution principle
“objects in a program should be replaceable with instances of their subtypes without altering the correctness of that program.”
As you pointed out, returning an Array is clearly different behavior to returning a List<T> even though they both implement IList<T>. This is most certainly a violation of LSP.
The important thing to realize is that interfaces are about the consumer. If you're returning an interface, you've created a contract that any methods or properties on that interface can be used without changing the behavior of the program.
Interface segregation principle
“many client-specific interfaces are better than one general-purpose interface.”
If you're returning an interface, you should return the most client specific interface your implementation supports. In other words, if you're not expecting the client to call the Add method you shouldn't return an interface with an Add method on it.
Unfortunately, the interfaces in the .NET framework (particularly the early versions) are not always ideal client-specific interfaces. Although, as @Dennis pointed out in his answer, there are a lot more choices in .NET 4.5+.
Returning an interface is not necessarily better than returning a concrete implementation of a collection. You should always have a good reason to use an interface instead of a concrete type. In your example it seems pointless to do so.
Valid reasons to use an interface could be:
1. You do not know what the implementation of the methods returning the interface will look like and there may be many, developed over time. It may be other people writing them, from other companies. So you just want to agree on the bare necessities and leave it up to them how to implement the functionality.
2. You want to expose some common functionality independent from your class hierarchy in a type-safe way. Objects of different base types that should offer the same methods would implement your interface.
One could argue that 1 and 2 are basically the same reason. They are two different scenarios that ultimately lead to the same need.
"It's a contract". If the contract is with yourself and your application is closed in both functionality and time, there is often no point in using an interface.

IEnumerable vs IReadonlyCollection vs ReadonlyCollection for exposing a list member

I have spent quite a few hours pondering the subject of exposing list members. In a similar question to mine, Jon Skeet gave an excellent answer. Please feel free to have a look.
ReadOnlyCollection or IEnumerable for exposing member collections?
I am usually quite paranoid about exposing lists, especially when developing an API.
I have always used IEnumerable for exposing lists, as it is quite safe, and it gives that much flexibility. Let me use an example here:
public class Activity
{
    private readonly IList<WorkItem> workItems = new List<WorkItem>();
    public string Name { get; set; }
    public IEnumerable<WorkItem> WorkItems
    {
        get
        {
            return this.workItems;
        }
    }
    public void AddWorkItem(WorkItem workItem)
    {
        this.workItems.Add(workItem);
    }
}
Anyone who codes against an IEnumerable is quite safe here. If I later decide to use an ordered list or something, none of their code breaks and it is still nice. The downside of this is IEnumerable can be cast back to a list outside of this class.
For this reason, a lot of developers use ReadOnlyCollection for exposing a member. This is quite safe since it can never be cast back to a list. For me I prefer IEnumerable since it provides more flexibility, should I ever want to implement something different than a list.
I have come up with a new idea I like better. Using IReadOnlyCollection:
public class Activity
{
    private readonly IList<WorkItem> workItems = new List<WorkItem>();
    public string Name { get; set; }
    public IReadOnlyCollection<WorkItem> WorkItems
    {
        get
        {
            return new ReadOnlyCollection<WorkItem>(this.workItems);
        }
    }
    public void AddWorkItem(WorkItem workItem)
    {
        this.workItems.Add(workItem);
    }
}
I feel this retains some of the flexibility of IEnumerable and is encapsulated quite nicely.
I posted this question to get some input on my idea. Do you prefer this solution to IEnumerable? Do you think it is better to use a concrete return value of ReadOnlyCollection? This is quite a debate and I want to try and see what are the advantages/disadvantages that we all can come up with.
EDIT
First of all thank you all for contributing so much to the discussion here. I have certainly learned a ton from each and every one and would like to thank you sincerely.
I am adding some extra scenarios and info.
There are some common pitfalls with IReadOnlyCollection and IEnumerable.
Consider the example below:
public IReadOnlyCollection<WorkItem> WorkItems
{
    get
    {
        return this.workItems;
    }
}
The above example can be cast back to a list and mutated, even though the interface is read-only. The interface, despite its name, does not guarantee immutability. It is up to you to provide that guarantee, for example by returning a new ReadOnlyCollection (a read-only wrapper) or a new list (essentially a copy), so that the state of your object stays safe and sound.
Richiban says it best in his comment: an interface only guarantees what something can do, not what it cannot do.
See below for an example:
public IEnumerable<WorkItem> WorkItems
{
    get
    {
        return new List<WorkItem>(this.workItems);
    }
}
The above can be cast and mutated, but the state of your object is unaffected because the caller only ever receives a copy.
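The three ways of exposing the list discussed so far can be compared side by side. This is only a sketch: Backlog and its property names are hypothetical, and string items stand in for WorkItem.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class Backlog
{
    private readonly List<string> items = new List<string> { "first" };

    // 1. The internal list itself, merely typed as read-only.
    public IReadOnlyCollection<string> Raw { get { return items; } }

    // 2. A read-only wrapper around the internal list (no copy is made).
    public IReadOnlyCollection<string> Wrapped { get { return new ReadOnlyCollection<string>(items); } }

    // 3. A detached copy.
    public IReadOnlyCollection<string> Copied { get { return new List<string>(items); } }

    public int Count { get { return items.Count; } }

    public void Add(string item) { items.Add(item); }
}

public static class ExposureDemo
{
    public static void Main()
    {
        var backlog = new Backlog();

        // Raw: the cast succeeds and the mutation reaches the internal list.
        ((List<string>)backlog.Raw).Add("sneaky");
        Console.WriteLine(backlog.Count);                   // prints 2

        // Wrapped: the object is not a List<string>, so the cast route is closed.
        Console.WriteLine(backlog.Wrapped is List<string>); // prints False

        // Wrapped is a live view: later changes made by the owner show through.
        IReadOnlyCollection<string> view = backlog.Wrapped;
        backlog.Add("later");
        Console.WriteLine(view.Count);                      // prints 3

        // Copied: the cast succeeds, but only the copy changes.
        ((List<string>)backlog.Copied).Add("ignored");
        Console.WriteLine(backlog.Count);                   // prints 3
    }
}
```

The wrapper is cheap because it does not copy, at the price of being a view onto state that can still change; the copy gives the caller a snapshot at the price of an allocation per call.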
Another, less conventional, approach is a custom collection class. Consider the following:
public class Bar : IEnumerable<string>
{
    private List<string> foo;
    public Bar()
    {
        this.foo = new List<string> { "123", "456" };
    }
    public IEnumerator<string> GetEnumerator()
    {
        return this.foo.GetEnumerator();
    }
    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.GetEnumerator();
    }
}
The class above can have methods for mutating foo the way you want it to be, but your object can never be cast to a list of any sort and mutated.
Carsten Führmann makes a fantastic point about yield return statements in IEnumerables.
One important aspect seems to be missing from the answers so far:
When an IEnumerable<T> is returned to the caller, they must consider the possibility that the returned object is a "lazy stream", e.g. a collection built with "yield return". That is, the performance penalty for producing the elements of the IEnumerable<T> may have to be paid by the caller, for each use of the IEnumerable. (The productivity tool "Resharper" actually points this out as a code smell.)
By contrast, an IReadOnlyCollection<T> signals to the caller that there will be no lazy evaluation. (The Count property, as opposed to the Count extension method of IEnumerable<T> (which is inherited by IReadOnlyCollection<T> so it has the method as well), signals non-laziness. And so does the fact that there seem to be no lazy implementations of IReadOnlyCollection.)
This is also valid for input parameters, as requesting an IReadOnlyCollection<T> instead of IEnumerable<T> signals that the method needs to iterate several times over the collection. Sure, the method could create its own list from the IEnumerable<T> and iterate over that, but as the caller may already have a loaded collection at hand it would make sense to take advantage of it whenever possible. If the caller only has an IEnumerable<T> at hand, they only need to add .ToArray() or .ToList() to the parameter.
What IReadOnlyCollection does not do is prevent the caller from casting to some other collection type. For such protection, one would have to use the class ReadOnlyCollection<T>.
In summary, the only thing IReadOnlyCollection<T> does relative to IEnumerable<T> is add a Count property and thus signal that no laziness is involved.
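The cost difference can be made visible with a counter. In this sketch (all names are invented for the example) every element produced by the iterator increments Evaluations, so enumerating the lazy sequence twice pays twice, while the eagerly materialized collection pays once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Counts how many elements the iterator has produced so far.
    public static int Evaluations;

    // Lazy: the loop body runs again on every enumeration.
    public static IEnumerable<int> LazySquares()
    {
        for (int i = 1; i <= 3; i++)
        {
            Evaluations++;
            yield return i * i;
        }
    }

    // Eager: the sequence is materialized once, before it is returned.
    public static IReadOnlyCollection<int> EagerSquares()
    {
        return LazySquares().ToList();
    }

    public static void Main()
    {
        Evaluations = 0;
        IEnumerable<int> lazy = LazySquares();
        int lazySum = lazy.Sum();       // first pass over the iterator
        int lazyCount = lazy.Count();   // second pass: Count() has to enumerate again
        Console.WriteLine(Evaluations); // prints 6

        Evaluations = 0;
        IReadOnlyCollection<int> eager = EagerSquares();
        int eagerSum = eager.Sum();     // enumerates the stored list, not the iterator
        int eagerCount = eager.Count;   // property read, no enumeration at all
        Console.WriteLine(Evaluations); // prints 3
    }
}
```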
Talking about class libraries, I think IReadOnly* is really useful, and I think you're doing it right :)
It's all about immutable collections... Before, there were just immutable collections, and enlarging an array was a huge task, so .NET decided to include something different in the framework: mutable collections that implement the ugly stuff for you. IMHO, though, it never gave proper guidance on immutable collections, which are extremely useful, especially in high-concurrency scenarios where sharing mutable state is always a PITA.
If you look at other current languages, such as Objective-C, you will see that the rules are in fact completely inverted! Classes almost always exchange immutable collections with each other. In other words, the interface exposes only immutable collections and mutable ones are used internally (yes, they have them, of course); a stateful class exposes dedicated methods if it wants to let outsiders change the collection.
So the little experience I have with other languages leads me to think that .NET lists are very powerful, but immutable collections exist for a reason :)
In this case it is not a matter of helping the callers of an interface so that they don't have to change all their code when you change the internal implementation, as it is with IList vs List. With IReadOnly* you are protecting yourself and your class from being used improperly, and you avoid useless protective code that sometimes you could not even write (in the past, in one piece of code, I had to return a clone of the complete list to avoid this problem).
My take on concerns of casting and IReadOnly* contracts, and 'proper' usages of such.
If some code is being “clever” enough to perform an explicit cast and break the interface contract, then it is also “clever” enough to use reflection or otherwise do nefarious things such as access the underlying List of a ReadOnlyCollection wrapper object. I don’t program against such “clever” programmers.
The only thing that I guarantee is that after said IReadOnly*-interface objects are exposed, my code will not violate that contract and will not modify the returned collection object.
This means that I write code that returns List-as-IReadOnly*, eg., and rarely opt for an actual read-only concrete type or wrapper. Using IEnumerable.ToList is sufficient to return an IReadOnly[List|Collection] - calling List.AsReadOnly adds little value against “clever” programmers who can still access the underlying list that the ReadOnlyCollection wraps.
In all cases, I guarantee that the concrete types of IReadOnly* return values are eager. If I ever write a method that returns an IEnumerable, it is specifically because the contract of the method is that which “supports streaming” fsvo.
As far as IReadOnlyList and IReadOnlyCollection, I use the former when there is 'an' implied stable ordering established that is meaningful to index, regardless of purposeful sorting. For example, arrays and Lists can be returned as an IReadOnlyList while a HashSet would better be returned as an IReadOnlyCollection. The caller can always assign the I[ReadOnly]List to an I[ReadOnly]Collection as desired: this choice is about the contract exposed and not what a programmer, “clever” or otherwise, will do.
It seems that you can just return an appropriate interface:
...
    private readonly List<WorkItem> workItems = new List<WorkItem>();
    // Usually, there's no need for the property to be virtual
    public virtual IReadOnlyList<WorkItem> WorkItems {
        get {
            return workItems;
        }
    }
...
Since the workItems field is in fact a List<T>, the natural idea IMHO is to expose the widest interface, which in this case is IReadOnlyList<T>.
!! IEnumerable vs IReadOnlyList !!
IEnumerable has been with us from the beginning of time. For many years, it was a de facto standard way to represent a read-only collection. Since .NET 4.5, however, there is another way to do that: IReadOnlyList.
Both collection interfaces are useful.

Why use IList or List?

I know there have been a lot of posts on this, but it still confuses me why you should pass in an interface like IList and return an interface like IList instead of the concrete List.
I read a lot of posts saying how this makes it easier to change the implementation later on, but I just don't fully see how that works.
Say if I have this method
public class SomeClass
{
    public bool IsChecked { get; set; }
}
public void LogAllChecked(IList<SomeClass> someClasses)
{
    foreach (var s in someClasses)
    {
        if (s.IsChecked)
        {
            // log
        }
    }
}
I am not sure how using IList will help me out in the future.
How about if I am already in the method? Should I still be using IList?
public void LogAllChecked(IList<SomeClass> someClasses)
{
    //why not List<string> myStrings = new List<string>()
    IList<string> myStrings = new List<string>();
    foreach (var s in someClasses)
    {
        if (s.IsChecked)
        {
            myStrings.Add(s.IsChecked.ToString());
        }
    }
}
What do I get for using IList now?
public IList<int> onlySomeInts(IList<int> myInts)
{
    IList<int> store = new List<int>();
    foreach (var i in myInts)
    {
        if (i % 2 == 0)
        {
            store.Add(i);
        }
    }
    return store;
}
How about now? Is there some new implementation of a list of int's that I will need to change out?
Basically, I need to see some actual code examples of how using IList would have solved some problem over just taking List into everything.
From my reading I think I could have used IEnumerable instead of IList since I am just looping through stuff.
Edit
So I have been playing around with some of my methods on how to do this. I am still not sure about the return type (whether I should make it more concrete or an interface).
public class CardFrmVm
{
    public IList<TravelFeaturesVm> TravelFeaturesVm { get; set; }
    public IList<WarrantyFeaturesVm> WarrantyFeaturesVm { get; set; }
    public CardFrmVm()
    {
        WarrantyFeaturesVm = new List<WarrantyFeaturesVm>();
        TravelFeaturesVm = new List<TravelFeaturesVm>();
    }
}
public class WarrantyFeaturesVm : AvailableFeatureVm
{
}
public class TravelFeaturesVm : AvailableFeatureVm
{
}
public class AvailableFeatureVm
{
    public Guid FeatureId { get; set; }
    public bool HasFeature { get; set; }
    public string Name { get; set; }
}
private IList<AvailableFeature> FillAvailableFeatures(IEnumerable<AvailableFeatureVm> avaliableFeaturesVm)
{
    List<AvailableFeature> availableFeatures = new List<AvailableFeature>();
    foreach (var f in avaliableFeaturesVm)
    {
        if (f.HasFeature)
        {
            // nhibernate call to Load<>()
            AvailableFeature availableFeature = featureService.LoadAvaliableFeatureById(f.FeatureId);
            availableFeatures.Add(availableFeature);
        }
    }
    return availableFeatures;
}
Now I am returning IList for the simple fact that I will then add this to my domain model, which has a property like this:
public virtual IList<AvailableFeature> AvailableFeatures { get; set; }
The above is an IList itself, as this seems to be the standard to use with NHibernate. Otherwise I might have returned IEnumerable, but I'm not sure. Still, I can't figure out what the user would 100% need (that's where returning a concrete type has an advantage).
Edit 2
I was also wondering what happens if I want to pass the list to be filled into my method:
private void FillAvailableFeatures(IEnumerable<AvailableFeatureVm> avaliableFeaturesVm, IList<AvailableFeature> toFill)
{
    foreach (var f in avaliableFeaturesVm)
    {
        if (f.HasFeature)
        {
            // nhibernate call to Load<>()
            AvailableFeature availableFeature = featureService.LoadAvaliableFeatureById(f.FeatureId);
            toFill.Add(availableFeature);
        }
    }
}
Would I run into problems with this? Couldn't callers pass in an array (which has a fixed size)? Would a concrete List be better?
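The concern can be reproduced in isolation. The sketch below uses a simplified, hypothetical Fill method with int items in place of the NHibernate types: a List<int> works, while an array satisfies the IList<int> parameter at compile time and fails at run time with NotSupportedException.

```csharp
using System;
using System.Collections.Generic;

public static class FillDemo
{
    // Simplified stand-in for FillAvailableFeatures: copies the even numbers into 'toFill'.
    public static void Fill(IEnumerable<int> source, IList<int> toFill)
    {
        foreach (int value in source)
        {
            if (value % 2 == 0)
            {
                toFill.Add(value);
            }
        }
    }

    public static void Main()
    {
        var list = new List<int>();
        Fill(new[] { 1, 2, 3, 4 }, list);
        Console.WriteLine(list.Count); // prints 2

        try
        {
            // Compiles, because int[] implements IList<int>.
            Fill(new[] { 1, 2, 3, 4 }, new int[2]);
        }
        catch (NotSupportedException)
        {
            Console.WriteLine("arrays are fixed-size"); // this line is printed
        }
    }
}
```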
There are three questions here: what type should I use for a formal parameter? What should I use for a local variable? and what should I use for a return type?
Formal parameters:
The principle here is do not ask for more than you need. IEnumerable<T> communicates "I need to get the elements of this sequence from beginning to end". IList<T> communicates "I need to get and set the elements of this sequence in arbitrary order". List<T> communicates "I need to get and set the elements of this sequence in arbitrary order and I only accept lists; I do not accept arrays."
By asking for more than you need, you (1) make the caller do unnecessary work to satisfy your unnecessary demands, and (2) communicate falsehoods to the reader. Ask only for what you're going to use. That way if the caller has a sequence, they don't need to call ToList on it to satisfy your demand.
Local variables:
Use whatever you want. It's your method. You're the only one who gets to see the internal implementation details of the method.
Return type:
Same principle as before, reversed. Offer the bare minimum that your caller requires. If the caller only requires the ability to enumerate the sequence, only give them an IEnumerable<T>.
The most practical reason I've ever seen was given by Jeffrey Richter in CLR via C#.
The pattern is to take the basest class or interface possible for your arguments and return the most specific class or interface possible for your return types. This gives your callers the most flexibility in passing in types to your methods and the most opportunities to cast/reuse the return values.
For example, the following method
public void PrintTypes(IEnumerable items)
{
    foreach(var item in items)
        Console.WriteLine(item.GetType().FullName);
}
allows the method to be called passing in any type that can be cast to an enumerable. If you were more specific
public void PrintTypes(List<object> items)
then, say, if you had an array and wished to print their type names to the console, you would first have to create a new List and fill it with your types. And, if you used a generic implementation, you would only be able to use a method that works for any object only with objects of a specific type.
When talking about return types, the more specific you are, the more flexible callers can be with it.
public List<string> GetNames()
you can use this return type to iterate the names
foreach(var name in GetNames())
or you can index directly into the collection
Console.WriteLine(GetNames()[0])
Whereas, if you were getting back a less specific type
public IEnumerable GetNames()
you would have to massage the return type to get the first value
Console.WriteLine(GetNames().OfType<string>().First());
IEnumerable<T> allows you to iterate through a collection. ICollection<T> builds on this and also allows for adding and removing items. IList<T> also allows for accessing and modifying them at a specific index. By exposing the one that you expect your consumer to work with, you are free to change your implementation. List<T> happens to implement all three of those interfaces.
If you expose your property as a List<T> or even an IList<T> when all you want your consumer to have is the ability to iterate through the collection, they could come to depend on the fact that they can modify the list. If you later decide to convert the actual data store from a List<T> to a Dictionary<T,U> and expose the dictionary keys as the actual value for the property (I have had to do exactly this before), then consumers who have come to expect that their changes will be reflected inside of your class will no longer have that capability. That's a big problem! If you expose the List<T> as an IEnumerable<T> you can comfortably predict that your collection is not being modified externally. That is one of the powers of exposing List<T> as any of the above interfaces.
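A rough sketch of that List-to-Dictionary change follows. The class names TagStoreV1 and TagStoreV2 and the tag data are invented for the example. Because the property was declared as IEnumerable<string> from the start, its signature does not change when the backing store does:

```csharp
using System.Collections.Generic;

// Version 1: backed by a List<string>.
public class TagStoreV1
{
    private readonly List<string> _tags = new List<string> { "a", "b" };

    public IEnumerable<string> Tags { get { return _tags; } }
}

// Version 2: the backing store is now a Dictionary; the property signature is unchanged.
public class TagStoreV2
{
    private readonly Dictionary<string, int> _tagCounts =
        new Dictionary<string, int> { { "a", 1 }, { "b", 5 } };

    // Dictionary<TKey, TValue>.Keys is itself an IEnumerable<TKey>.
    public IEnumerable<string> Tags { get { return _tagCounts.Keys; } }
}
```

Code that only does foreach over Tags compiles against either version. Note that in version 1 a determined caller could still cast Tags back to List<string>, so the interface documents intent rather than enforcing it.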
This level of abstraction goes the other direction when it applies to method parameters. When you pass your list to a method that accepts IEnumerable<T> you can be sure that your list is not going to be modified. When you are the person implementing the method and you say you accept an IEnumerable<T> because all you need to do is iterate through that list, the person calling the method is free to call it with any data type that is enumerable. This allows your code to be used in unexpected, but perfectly valid ways.
From this it follows that your method implementation can represent its local variables however you wish. The implementation details are not exposed, leaving you free to change your code to something better without affecting the people calling your code.
You cannot predict the future. Assuming that a property's type will always be beneficial as a List<T> immediately limits your ability to adapt to unforeseen expectations of your code. Yes, you may never change that data type from a List<T>, but you can be sure that if you have to, your code is ready for it.
Short Answer:
You pass the interface so that no matter what concrete implementation of that interface you use, your code will support it.
If you use a concrete implementation of list, another implementation of the same list will not be supported by your code.
Read a bit on inheritance and polymorphism.
Here's an example: I had a project once where our lists got very large, and resulting fragmentation of the large object heap was hurting performance. We replaced List with LinkedList. LinkedList does not contain an array, so all of a sudden, we had almost no use of the large object heap.
Mostly, we used the lists as IEnumerable<T>, anyway, so there was no further change needed. (And yes, I would recommend declaring references as IEnumerable if all you're doing is enumerating them.) In a couple of places, we needed the list indexer, so we wrote an inefficient IList<T> wrapper around the linked lists. We needed the list indexer infrequently, so the inefficiency was not a problem. If it had been, we could have provided some other implementation of IList, perhaps as a collection of small-enough arrays, that would have been more efficiently indexable while also avoiding large objects.
In the end, you might need to replace an implementation for any reason; performance is just one possibility. Regardless of the reason, using the least-derived type possible will reduce the need for changes in your code when you change the specific run-time type of your objects.
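As a small illustration of that List-to-LinkedList swap (the Readings and ReadingStats names are invented here, and this is not the original project's code), only the field declaration changes when consumers are written against IEnumerable<T>:

```csharp
using System.Collections.Generic;

public class Readings
{
    // Originally: private readonly List<double> _values = new List<double>();
    // LinkedList<T> stores individual nodes instead of one large backing array.
    private readonly LinkedList<double> _values = new LinkedList<double>();

    public void Record(double value) { _values.AddLast(value); }

    // Declared as the least-derived type the callers need.
    public IEnumerable<double> Values { get { return _values; } }
}

public static class ReadingStats
{
    // Depends only on IEnumerable<double>, so it needs no change after the swap.
    public static double Average(IEnumerable<double> values)
    {
        double sum = 0;
        int count = 0;
        foreach (double v in values) { sum += v; count++; }
        return count == 0 ? 0 : sum / count;
    }
}
```

Any consumer that had asked for List<double> instead would have needed a rewrite or a copy at this point.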
Inside the method, you should use var, instead of IList or List. When your data source changes to come from a method instead, your onlySomeInts method will survive.
The reason to use IList instead of List as parameters, is because many things implement IList (List and [], as two examples), but only one thing implements List. It's more flexible to code to the interface.
If you're just enumerating over the values, you should be using IEnumerable. Every data type that can hold more than one value implements IEnumerable (or should), which makes your method hugely flexible.
Using IList instead of List makes writing unit tests significantly easier. It allows you to use a 'Mocking' library to pass and return data.
The other general reason for using interfaces is to expose the minimum amount of knowledge necessary to the user of an object.
Consider the (contrived) case where I have a data object that implements IList.
public class MyDataObject : IList<int>
{
public void Method1()
{
...
}
// etc
}
Your functions above only care about being able to iterate over a list. Ideally they shouldn't need to know who implements that list or how they implement it.
In your example, IEnumerable is a better choice as you thought.
It is always a good idea to reduce the dependencies between your code as much as possible.
Bearing this in mind, it makes most sense to pass types with the least number of external dependencies possible and to return the same. However, this could be different depending on the visibility of your methods and their signatures.
If your methods form part of an interface, the methods will need to be defined using types available to that interface. Concrete types will probably not be available to interfaces, so they would have to return non-concrete types. You would want to do this if you were creating a framework, for example.
However, if you are not writing a framework, it may be advantageous to pass parameters with the weakest possible types (i.e. base classes, interfaces, or even delegates) and return concrete types. That gives the caller the ability to do as much as possible with the returned object, even if it is cast as an interface. However, this makes the method more fragile, as any change to the returned object type may break the calling code. In practice though, that generally isn't a major problem.
You accept an Interface as a parameter for a method because that allows the caller to submit different concrete types as arguments. Given your example method LogAllChecked, the parameter someClasses could be of various types, and for the person writing the method, all might be equivalent (i.e. you'd write the exact same code regardless of the type of the parameter). But for the person calling the method, it can make a huge difference -- if they have an array and you're asking for a list, they have to change the array to a list or v.v. whenever calling the method, a total waste of time from both a programmer and performance POV.
Whether you return an Interface or a concrete type depends upon what you want to let your callers do with the object you created -- this is an API design decision, and there's no hard and fast rule. You have to weigh their ability to make full use of the object against their ability to easily use a portion of the object's functionality (and of course whether you WANT them to be making full use of the object). For instance, if you return an IEnumerable, then you are limiting them to iterating -- they can't add or remove items from your object, they can only act against the objects. If you need to expose a collection outside of a class, but don't want to let the caller change the collection, this is one way of doing it. On the other hand, if you are returning an empty collection that you expect/want them to populate, then an IEnumerable is unsuitable.
Here's my answer in this .NET 4.5+ world.
Use IList<T> and IReadOnlyList<T>,
instead of List<T>, because ReadOnlyList<T> doesn't exist.
IList<T> looks so consistent with IReadOnlyList<T>
Use IEnumerable<T> for minimum exposure (property) or requirement (parameter) if foreach is the only way to use it.
Use IReadOnlyList<T> if you also need to expose/use Count and [] indexer.
Use IList<T> if you also allow callers to add/update/delete elements
because List<T> implements IReadOnlyList<T>, it doesn't need any explicit casting.
An example class:
// manipulate the list within the class
private List<int> _numbers;
// callers can add/update/remove elements, but cannot reassign a new list to this property
public IList<int> Numbers { get { return _numbers; } }
// callers can use: .Count and .ReadonlyNumbers[idx], but cannot add/update/remove elements
public IReadOnlyList<int> ReadonlyNumbers { get { return _numbers; } }

Returning a ReadOnlyCollection from a method with an IList return type

Here I have the following bit of code:
private IList<IState> _states = new List<IState>();
private ReadOnlyCollection<IState> _statesViewer;
public IList<IState> States { get { return _statesViewer; } }
I believe that generally it is preferable to return interfaces rather than the concrete classes themselves, but in this case, shouldn't I set as the return type of the States property a ReadOnlyCollection?
Any user of my library will think it is possible to do anything you can do with an IList if I set it as so, and that means adding elements. That is not true and I'm definitely breaking the contract by exposing it as an IList.
Am I right with this view or is there something else I am missing here?
Do whatever makes the API clearest. If the caller only needs to enumerate it you could expose as IEnumerable<T>, but I wouldn't be concerned about exposing it as ReadOnlyCollection. You could declare a custom type (interface or class) with just an indexer and enumerator, of course.
If it was me, I would simply expose it as
IEnumerable<IState>
IEnumerable<T> is a good choice for any public property that represents a sequence.
It's the smallest contract possible that is still a sequence, which helps you stay decoupled.
It enables a rich collection of operations in LINQ to Objects, so you're offering your consumer a lot of power.
For some cases I use IEnumerable<IState> if people are only allowed to run over the list. But if I need some more built-in functionality for the outer world like the index operator, Count, Contains, IndexOf, etc., I'll give them a ReadOnlyCollection as IList and write in the documentation that it is read only.
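To show what the question is worried about, here is a small sketch. It uses string in place of the asker's IState so that it stands alone, and the Machine and MachineDemo names are made up. A ReadOnlyCollection<T> handed out through an IList<T> property compiles fine, but the mutating members throw NotSupportedException at run time:

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class Machine
{
    private readonly List<string> _states = new List<string> { "Idle", "Running" };
    private readonly ReadOnlyCollection<string> _statesViewer;

    public Machine()
    {
        // ReadOnlyCollection<T> is a live, read-only wrapper around the list.
        _statesViewer = new ReadOnlyCollection<string>(_states);
    }

    // Compiles, but the IList<T> signature advertises Add/Remove that will throw.
    public IList<string> StatesAsList { get { return _statesViewer; } }

    // The signature matches what callers can actually do.
    public ReadOnlyCollection<string> States { get { return _statesViewer; } }
}

public static class MachineDemo
{
    // Returns true if adding through the IList<T> view is rejected at run time.
    public static bool AddIsRejected(Machine m)
    {
        try
        {
            m.StatesAsList.Add("Stopped");
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```

Callers holding the IList<T> view can check IsReadOnly before mutating, but few do, which is the argument for putting the restriction in the declared type (ReadOnlyCollection<T>, IReadOnlyList<T> or IEnumerable<T>) rather than only in the documentation.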

List<BusinessObject> or BusinessObjectCollection?

Prior to C# generics, everyone would code collections for their business objects by creating a collection base that implemented IEnumerable
IE:
public class CollectionBase : IEnumerable
and then would derive their Business Object collections from that.
public class BusinessObjectCollection : CollectionBase
Now with the generic list class, does anyone just use that instead? I've found that I use a compromise of the two techniques:
public class BusinessObjectCollection : List<BusinessObject>
I do this because I like to have strongly typed names instead of just passing Lists around.
What is your approach?
I am generally in the camp of just using a List directly, unless for some reason I need to encapsulate the data structure and provide a limited subset of its functionality. This is mainly because if I don't have a specific need for encapsulation then doing it is just a waste of time.
However, with the collection initializer feature in C# 3.0 (referred to here as aggregate initializers), there are some new situations where I would advocate using customized collection classes.
Basically, C# 3.0 allows any class that implements IEnumerable and has an accessible Add method to use the new aggregate initializer syntax. For example, because Dictionary defines a method Add(K key, V value) it is possible to initialize a dictionary using this syntax:
var d = new Dictionary<string, int>
{
{"hello", 0},
{"the answer to life the universe and everything is:", 42}
};
The great thing about the feature is that it works for add methods with any number of arguments. For example, given this collection:
class c1 : IEnumerable
{
// Add must be accessible to the code that uses the initializer
public void Add(int x1, int x2, int x3)
{
//...
}
//...
}
it would be possible to initialize it like so:
var x = new c1
{
{1,2,3},
{4,5,6}
};
This can be really useful if you need to create static tables of complex objects. For example, if you were just using List<Customer> and you wanted to create a static list of customer objects you would have to create it like so:
var x = new List<Customer>
{
new Customer("Scott Wisniewski", "555-555-5555", "Seattle", "WA"),
new Customer("John Doe", "555-555-1234", "Los Angeles", "CA"),
new Customer("Michael Scott", "555-555-8769", "Scranton", "PA"),
new Customer("Ali G", "", "Staines", "UK")
};
However, if you use a customized collection, like this one:
class CustomerList : List<Customer>
{
public void Add(string name, string phoneNumber, string city, string stateOrCountry)
{
Add(new Customer(name, phoneNumber, city, stateOrCountry));
}
}
You could then initialize the collection using this syntax:
var customers = new CustomerList
{
{"Scott Wisniewski", "555-555-5555", "Seattle", "WA"},
{"John Doe", "555-555-1234", "Los Angeles", "CA"},
{"Michael Scott", "555-555-8769", "Scranton", "PA"},
{"Ali G", "", "Staines", "UK"}
};
This has the advantage of being both easier to type and easier to read because there is no need to retype the element type name for each element. The advantage can be particularly strong if the element type is long or complex.
That being said, this is only useful if you need static collections of data defined in your app. Some types of apps, like compilers, use them all the time. Others, like typical database apps don't because they load all their data from a database.
My advice would be that if you either need to define a static collection of objects, or need to encapsulate away the collection interface, then create a custom collection class. Otherwise I would just use List<T> directly.
It's recommended not to use List<T> in public APIs, but to use Collection<T> instead.
If you are inheriting from it though, you should be fine, afaik.
I prefer just to use List<BusinessObject>. Typedefing it just adds unnecessary boilerplate to the code. List<BusinessObject> is a specific type, it's not just any List object, so it's still strongly typed.
More importantly, declaring something List<BusinessObject> makes it easier for everyone reading the code to tell what types they are dealing with, they don't have to search through to figure out what a BusinessObjectCollection is and then remember that it's just a list. By typedefing, you'll have to require a consistent (re)naming convention that everyone has to follow in order for it to make sense.
Use the type List<BusinessObject> where you have to declare a list of them. However,
where you return a list of BusinessObject, consider returning IEnumerable<T>, IList<T> or ReadOnlyCollection<T> - i.e. return the weakest possible contract that satisfies the client.
Where you want to "add custom code" to a list, code extension methods on the list type. Again, attach these methods to the weakest possible contract, e.g.
public static int SomeCount(this IEnumerable<BusinessObject> someList)
Of course, you can't and shouldn't add state with extension methods, so if you need to add a new property and a field behind it, use a subclass or better, a wrapper class to store this.
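A minimal sketch of such an extension method follows. The IsActive property and the ActiveCount name are assumptions made up for the example, since the answer only gives the signature of SomeCount:

```csharp
using System.Collections.Generic;

// Hypothetical business object used only for this sketch.
public class BusinessObject
{
    public bool IsActive { get; set; }
}

public static class BusinessObjectExtensions
{
    // Attached to IEnumerable<BusinessObject>, so it works on lists, arrays and query results.
    public static int ActiveCount(this IEnumerable<BusinessObject> items)
    {
        int count = 0;
        foreach (BusinessObject item in items)
        {
            if (item.IsActive) count++;
        }
        return count;
    }
}
```

With this in scope, both myList.ActiveCount() and myArray.ActiveCount() compile, and no BusinessObjectCollection subclass is needed just to host the method.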
I've been going back and forth on 2 options:
public class BusinessObjectCollection : List<BusinessObject> {}
or methods that just do the following:
public IEnumerable<BusinessObject> GetBusinessObjects();
The benefit of the first approach is that you can change the underlying data store without having to mess with method signatures. Unfortunately if you inherit from a collection type that removes a method from the previous implementation, then you'll have to deal with those situations throughout your code.
You should probably avoid creating your own collection for that purpose. It's pretty common to want to change the type of data structure a few times during refactorings or when adding new features. With your approach, you would wind up with a separate class for BusinessObjectList, BusinessObjectDictionary, BusinessObjectTree, etc.
I don't really see any value in creating this class just because the classname is more readable. Yeah, the angle bracket syntax is kind of ugly, but it's standard in C++, C# and Java, so even if you don't write code that uses it you're going to run into it all the time.
I generally only derive my own collection classes if I need to "add value". Like, if the collection itself needed to have some "metadata" properties tagging along with it.
I do the exact same thing as you Jonathan... just inherit from List<T>. You get the best of both worlds. But I generally only do it when there is some value to add, like adding a LoadAll() method or whatever.
You can use both. For laziness - I mean productivity - List is a very useful class, it's also "comprehensive" and frankly full of YAGNI members. Coupled with the sensible argument / recommendation put forward by the MSDN article already linked about exposing List as a public member, I prefer the "third" way:
Personally I use the decorator pattern to expose only what I need from List i.e:
public class OrderItemCollection : IEnumerable<OrderItem>
{
private readonly List<OrderItem> _orderItems = new List<OrderItem>();
public void Add(OrderItem item)
{
_orderItems.Add(item);
}
//implement only the list members, which are required from your domain.
//ie. sum items, calculate weight etc...
public IEnumerator<OrderItem> GetEnumerator()
{
return _orderItems.GetEnumerator();
}
//IEnumerable<T> also requires the non-generic GetEnumerator, implemented explicitly
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return GetEnumerator();
}
}
Further still I'd probably abstract OrderItemCollection into IOrderItemCollection so I can swap my implementation of IOrderItemCollection over in the future (I may prefer to use a different inner enumerable object such as Collection or, more likely for perf, a key-value pair collection or Set).
I use generic lists for almost all scenarios. The only time that I would consider using a derived collection anymore is if I add collection specific members. However, the advent of LINQ has lessened the need for even that.
6 of 1, half dozen of another
Either way it's the same thing. I only do it when I have reason to add custom code into the BusinessObjectCollection.
Without it, having load methods return a List allows me to write more code in a common generic class and have it just work, such as a Load method.
As someone else pointed out, it is recommended not to expose List publicly, and FxCop will whinge if you do so. This includes inheriting from List as in:
public class MyTypeCollection : List<MyType>
In most cases public APIs will expose IList (or ICollection or IEnumerable) as appropriate.
In cases where you want your own custom collection, you can keep FxCop quiet by inheriting from Collection instead of List.
If you choose to create your own collection class you should check out the types in System.Collections.ObjectModel Namespace.
The namespace defines base classes that are meant to make it easier for implementers to create custom collections.
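As a rough sketch of what those base classes offer: Collection<T> exposes protected virtual methods (InsertItem, RemoveItem, SetItem, ClearItems) that a derived class can override to hook changes. The Order and OrderCollection types and the Log property below are invented for illustration:

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical business object used only for this sketch.
public class Order
{
    public string Id { get; set; }
}

public class OrderCollection : Collection<Order>
{
    // Hypothetical audit trail, present only to make the hooks observable.
    public List<string> Log { get; } = new List<string>();

    // Called by both Add and Insert.
    protected override void InsertItem(int index, Order item)
    {
        Log.Add("added " + item.Id);
        base.InsertItem(index, item);
    }

    // Called by both Remove and RemoveAt.
    protected override void RemoveItem(int index)
    {
        Log.Add("removed " + this[index].Id);
        base.RemoveItem(index);
    }
}
```

List<T> has no equivalent virtual members, which is one reason Collection<T> is usually suggested as the base for custom collections.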
I tend to do it with my own collection if I want to shield the access to the actual list. When you are writing business objects, chances are that you need a hook to know if your object is being added/removed; in that sense I think BOCollection is the better idea. Of course, if that is not required, List is more lightweight. Also you might want to consider using IList to provide an additional abstraction interface if you need some kind of proxying (e.g. a fake collection triggers lazy load from database)
But... why not consider Castle ActiveRecord or any other mature ORM framework? :)
Most of the time I simply go with the List way, as it gives me all the functionality I need 90% of the time, and when something 'extra' is needed, I inherit from it and code that extra bit.
I would do this:
using BusinessObjectCollection = System.Collections.Generic.List<BusinessObject>;
This just creates an alias rather than a completely new type. I prefer it to using List<BusinessObject> directly because it leaves me free to change the underlying structure of the collection at some point in the future without changing code that uses it (as long as I provide the same properties and methods).
try out this:
System.Collections.ObjectModel.Collection<BusinessObject>
it makes it unnecessary to implement the basic methods yourself, as you have to with CollectionBase
this is the way:
return arrays, accept IEnumerable<T>
=)
