Implementing a collection using another collection

Often you have to implement a collection because it is not present among those of the .NET Framework. In the examples that I find online, often the new collection is built based on another collection (for example, List<T>): in this way it is possible to avoid the management of the resizing of the collection.
public class CustomCollection<T>
{
    private List<T> _baseArray;
    ...
    public CustomCollection(...)
    {
        this._baseArray = new List<T>(...);
    }
}
What are the disadvantages of following this approach? Is it only lower performance because of the method calls to the base collection? Or does the compiler perform some optimization?
Moreover, in some cases the field relating to the base collection (for example the above _baseArray) is declared as readonly. Why?

The main disadvantage is the fact that if you want to play nice you'll have to implement a lot of interfaces by hand (ICollection, IEnumerable, possibly IList... both generic and non-generic), and that's quite a bit of code. Not complex code, since you're just relaying the calls, but still code. The extra call to the inner list shouldn't make too big of a difference in most cases.
As for readonly: it enforces the fact that once the inner list is set, the field cannot be reassigned to another list.
Usually it's best to inherit from one of the many built-in collection classes to make your own collection, instead of doing it the hard way. Collection<T> is a good starting point, and nobody is stopping you from inheriting List<T> itself.

For #2: if the private member is only assigned to in the constructor or when declared, it can be readonly. This is usually true if you only have one underlying collection and don't ever need to recreate it.

I'd say a pretty large disadvantage of this approach is that you can't use LINQ on your custom collection unless you implement IEnumerable<T>. A better approach might be to subclass and hide methods with new implementations as necessary, e.g.:
public class FooList<T> : List<T>
{
    public new void Add(T item)
    {
        // any FooList-specific logic regarding adding items
        base.Add(item);
    }
}
As for the readonly keyword, it means that you can only set the field in its declaration or in a constructor.
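One caveat with hiding Add via new is worth a quick sketch: List<T>.Add is not virtual, so the custom logic runs only when the call goes through a reference typed as the subclass. Collection<T>, mentioned in the earlier answer, exposes virtual hooks such as InsertItem instead. The class names HidingList and OverridingCollection, and the counters, are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hides List<T>.Add with "new": the extra logic runs only through a HidingList<T> reference.
public class HidingList<T> : List<T>
{
    public int AddCalls { get; private set; }

    public new void Add(T item)
    {
        AddCalls++;
        base.Add(item);
    }
}

// Overrides the virtual InsertItem hook of Collection<T>: the extra logic runs for every insertion.
public class OverridingCollection<T> : Collection<T>
{
    public int InsertCalls { get; private set; }

    protected override void InsertItem(int index, T item)
    {
        InsertCalls++;
        base.InsertItem(index, item);
    }
}

public static class HidingDemo
{
    public static void Main()
    {
        var hiding = new HidingList<int>();
        hiding.Add(1);                  // calls HidingList<T>.Add
        ((List<int>)hiding).Add(2);     // calls List<T>.Add, bypassing the custom logic
        Console.WriteLine(hiding.Count + " items, " + hiding.AddCalls + " counted");

        var overriding = new OverridingCollection<int>();
        overriding.Add(1);
        ((ICollection<int>)overriding).Add(2);  // still routed through InsertItem
        Console.WriteLine(overriding.Count + " items, " + overriding.InsertCalls + " counted");
    }
}
```

If the extra logic must never be skipped, deriving from Collection<T> is the safer of the two.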

Related

IEnumerable vs IReadonlyCollection vs ReadonlyCollection for exposing a list member

I have spent quite a few hours pondering the subject of exposing list members. In a similar question to mine, Jon Skeet gave an excellent answer. Please feel free to have a look.
ReadOnlyCollection or IEnumerable for exposing member collections?
I am usually quite paranoid about exposing lists, especially when developing an API.
I have always used IEnumerable for exposing lists, as it is quite safe, and it gives that much flexibility. Let me use an example here:
public class Activity
{
    private readonly IList<WorkItem> workItems = new List<WorkItem>();
    public string Name { get; set; }
    public IEnumerable<WorkItem> WorkItems
    {
        get
        {
            return this.workItems;
        }
    }
    public void AddWorkItem(WorkItem workItem)
    {
        this.workItems.Add(workItem);
    }
}
Anyone who codes against an IEnumerable is quite safe here. If I later decide to use an ordered list or something, none of their code breaks and it is still nice. The downside of this is IEnumerable can be cast back to a list outside of this class.
For this reason, a lot of developers use ReadOnlyCollection for exposing a member. This is quite safe, since it can never be cast back to the underlying List<T>. Personally, I prefer IEnumerable since it provides more flexibility, should I ever want to implement something different than a list.
I have come up with a new idea I like better. Using IReadOnlyCollection:
public class Activity
{
    private readonly IList<WorkItem> workItems = new List<WorkItem>();
    public string Name { get; set; }
    public IReadOnlyCollection<WorkItem> WorkItems
    {
        get
        {
            return new ReadOnlyCollection<WorkItem>(this.workItems);
        }
    }
    public void AddWorkItem(WorkItem workItem)
    {
        this.workItems.Add(workItem);
    }
}
I feel this retains some of the flexibility of IEnumerable and is encapsulated quite nicely.
I posted this question to get some input on my idea. Do you prefer this solution to IEnumerable? Do you think it is better to use a concrete return value of ReadOnlyCollection? This is quite a debate and I want to try and see what are the advantages/disadvantages that we all can come up with.
EDIT
First of all thank you all for contributing so much to the discussion here. I have certainly learned a ton from each and every one and would like to thank you sincerely.
I am adding some extra scenarios and info.
There are some common pitfalls with IReadOnlyCollection and IEnumerable.
Consider the example below:
public IReadOnlyCollection<WorkItem> WorkItems
{
    get
    {
        return this.workItems;
    }
}
The above example can be cast back to a list and mutated, even though the interface is read-only. The interface, despite its name, does not guarantee immutability. It is up to you to provide an immutable solution, so you should return a new ReadOnlyCollection. By returning a read-only wrapper (or a copy of the list), you keep the state of your object safe and sound.
Richiban says it best in his comment: an interface only guarantees what something can do, not what it cannot do.
See below for an example:
public IEnumerable<WorkItem> WorkItems
{
    get
    {
        return new List<WorkItem>(this.workItems);
    }
}
The above can be cast and mutated, but the state of your object is unaffected, because the caller only mutates a copy.
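To make the casting pitfall concrete, here is a minimal sketch. It assumes the backing field is declared as List<WorkItem> (so that it converts to IReadOnlyCollection<WorkItem>), and the property names Exposed and Wrapped are invented for the comparison.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class WorkItem { }

public class Activity
{
    private readonly List<WorkItem> workItems = new List<WorkItem>();

    // Exposes the list itself behind a read-only interface.
    public IReadOnlyCollection<WorkItem> Exposed
    {
        get { return workItems; }
    }

    // Exposes a read-only wrapper around the list.
    public IReadOnlyCollection<WorkItem> Wrapped
    {
        get { return new ReadOnlyCollection<WorkItem>(workItems); }
    }
}

public static class CastDemo
{
    public static void Main()
    {
        var activity = new Activity();

        // The downcast succeeds, so the caller can mutate the private list.
        ((List<WorkItem>)activity.Exposed).Add(new WorkItem());
        Console.WriteLine(activity.Exposed.Count);

        // The wrapper is not a List<T>, so this cast yields null.
        Console.WriteLine((activity.Wrapped as List<WorkItem>) == null);

        // It can still be cast to IList<T>, but mutation throws NotSupportedException.
        try
        {
            ((IList<WorkItem>)activity.Wrapped).Add(new WorkItem());
        }
        catch (NotSupportedException)
        {
            Console.WriteLine("wrapper rejected the mutation");
        }
    }
}
```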
Another, less conventional option is a dedicated collection class. Consider the following:
public class Bar : IEnumerable<string>
{
    private List<string> foo;
    public Bar()
    {
        this.foo = new List<string> { "123", "456" };
    }
    public IEnumerator<string> GetEnumerator()
    {
        return this.foo.GetEnumerator();
    }
    IEnumerator IEnumerable.GetEnumerator()
    {
        return this.GetEnumerator();
    }
}
The class above can have methods for mutating foo the way you want it to be, but your object can never be cast to a list of any sort and mutated.
Carsten Führmann makes a fantastic point about yield return statements in IEnumerables.

One important aspect seems to be missing from the answers so far:
When an IEnumerable<T> is returned to the caller, they must consider the possibility that the returned object is a "lazy stream", e.g. a collection built with "yield return". That is, the performance penalty for producing the elements of the IEnumerable<T> may have to be paid by the caller, for each use of the IEnumerable. (The productivity tool "ReSharper" actually points this out as a code smell.)
By contrast, an IReadOnlyCollection<T> signals to the caller that there will be no lazy evaluation. (The Count property, as opposed to the Count extension method of IEnumerable<T> (which is inherited by IReadOnlyCollection<T>, so it has the method as well), signals non-laziness. And so does the fact that there seem to be no lazy implementations of IReadOnlyCollection.)
This is also valid for input parameters, as requesting an IReadOnlyCollection<T> instead of IEnumerable<T> signals that the method needs to iterate several times over the collection. Sure the method could create its own list from the IEnumerable<T> and iterate over that, but as the caller may already have a loaded collection at hand it would make sense to take advantage of it whenever possible. If the caller only has an IEnumerable<T> at hand, he only needs to add .ToArray() or .ToList() to the parameter.
What IReadOnlyCollection does not do is prevent the caller from casting to some other collection type. For such protection, one would have to use the class ReadOnlyCollection<T>.
In summary, the only thing IReadOnlyCollection<T> does relative to IEnumerable<T> is add a Count property and thus signal that no laziness is involved.
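The cost of laziness described in this answer can be sketched as follows. The Produced counter and the method names are illustrative only; the point is that every pass over the lazy sequence re-runs the iterator, while the materialized collection pays once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazinessDemo
{
    // Counts how many elements the iterator body has produced so far.
    public static int Produced;

    // A lazy sequence: the body runs again on every enumeration.
    public static IEnumerable<int> LazyNumbers()
    {
        for (int i = 0; i < 3; i++)
        {
            Produced++;
            yield return i;
        }
    }

    // An eager result: the elements are produced once, before the method returns.
    public static IReadOnlyCollection<int> EagerNumbers()
    {
        return LazyNumbers().ToList();
    }

    public static void Main()
    {
        Produced = 0;
        IEnumerable<int> lazy = LazyNumbers();
        int lazySum = lazy.Sum();       // first enumeration
        int lazyCount = lazy.Count();   // second enumeration, the work is repeated
        Console.WriteLine("lazy: produced " + Produced + " elements");

        Produced = 0;
        IReadOnlyCollection<int> eager = EagerNumbers();
        int eagerSum = eager.Sum();     // iterates the stored list
        int eagerCount = eager.Count;   // property lookup, no enumeration
        Console.WriteLine("eager: produced " + Produced + " elements");
    }
}
```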

Talking about class libraries, I think IReadOnly* is really useful, and I think you're doing it right :)
It's all about immutable collections. Before, there were just immutable structures, and enlarging an array was a huge task, so .NET decided to include something different in the framework: mutable collections that implement the ugly stuff for you. But IMHO it never gave proper guidance on immutable collections, which are extremely useful, especially in high-concurrency scenarios where sharing mutable state is always a PITA.
If you look at other current languages, such as Objective-C, you will see that the rules are completely inverted. Classes almost always exchange immutable collections: the interface exposes only immutable collections, while mutable ones (yes, they exist, of course) are used internally. If a stateful class wants to let outsiders change the collection, it exposes dedicated methods for that.
So my limited experience with other languages leads me to think that .NET lists are very powerful, but immutable collections exist for a reason :)
In this case it is not a matter of helping the caller of an interface avoid changing all their code when you change the internal implementation, as it is with IList vs List. With IReadOnly* you are protecting yourself and your class from being used improperly, and you avoid useless protective code that sometimes you could not even write (in the past, in some code, I had to return a clone of the complete list to avoid this problem).

My take on concerns of casting and IReadOnly* contracts, and 'proper' usages of such.
If some code is being “clever” enough to perform an explicit cast and break the interface contract, then it is also “clever” enough to use reflection or otherwise do nefarious things such as access the underlying List of a ReadOnlyCollection wrapper object. I don’t program against such “clever” programmers.
The only thing that I guarantee is that after said IReadOnly*-interface objects are exposed, my code will not violate that contract and will not modify the returned collection object.
This means that I write code that returns List-as-IReadOnly*, e.g., and rarely opt for an actual read-only concrete type or wrapper. Using IEnumerable.ToList is sufficient to return an IReadOnly[List|Collection] - calling List.AsReadOnly adds little value against “clever” programmers who can still access the underlying list that the ReadOnlyCollection wraps.
In all cases, I guarantee that the concrete types of IReadOnly* return values are eager. If I ever write a method that returns an IEnumerable, it is specifically because the contract of the method is one that “supports streaming”, for some value of “streaming”.
As far as IReadOnlyList and IReadOnlyCollection, I use the former when there is 'an' implied stable ordering established that is meaningful to index, regardless of purposeful sorting. For example, arrays and Lists can be returned as an IReadOnlyList while a HashSet would better be returned as an IReadOnlyCollection. The caller can always assign the I[ReadOnly]List to an I[ReadOnly]Collection as desired: this choice is about the contract exposed and not what a programmer, “clever” or otherwise, will do.

It seems that you can just return an appropriate interface:
...
private readonly List<WorkItem> workItems = new List<WorkItem>();
// Usually, there's no need for the property to be virtual
public virtual IReadOnlyList<WorkItem> WorkItems {
    get {
        return workItems;
    }
}
...
Since the workItems field is in fact a List<T>, the natural idea IMHO is to expose the widest read-only interface it supports, which in this case is IReadOnlyList<T>.

IEnumerable vs IReadOnlyList
IEnumerable has been with us from the beginning of time. For many years, it was a de facto standard way to represent a read-only collection. Since .NET 4.5, however, there is another way to do that: IReadOnlyList.
Both collection interfaces are useful.

Persistent collections and standard collection interfaces

I'm implementing a persistent collection - for the sake of argument, let's say it's a singly-linked list, of the style common in functional languages.
class MyList<T>
{
    public T Head { get; }
    public MyList<T> Tail { get; }
    // other various stuff
    // . . .
}
It seems natural to have this class implement ICollection<T>, since it can implement all the normal behavior one would expect of an ICollection<T>, at least in broad strokes. But there is a lot of mismatch between this class's behavior and ICollection<T>. For example, the signature of the Add() method
void Add(T item); // ICollection<T> version
assumes that the addition will be performed as a side-effect that mutates the collection. But this is a persistent data structure, so Add() should instead create a new list and return it.
MyList<T> Add(T item); // what we really want
It seems the best way to resolve this is to just create the version we want, and also generate a non-functional explicit implementation of the version defined in the interface.
void ICollection<T>.Add(T item) { throw new NotSupportedException(); }
public MyList<T> Add(T item) { return new MyList<T>(item, this); }
But I have a few concerns about that option:
1. Will this be confusing to users? I envision scenarios where someone is working with this class and finds that calling Add() on it sometimes raises an exception, and sometimes runs but doesn't modify the list as would normally be expected for an ICollection, depending on the type information associated with the reference being used.
2. Following on (1), the implementation of ICollection<T>'s IsReadOnly should presumably return true. But that would seem to conflict with what is implied in other spots where Add() is being used with instances of the class.
3. Is (2) resolved in a non-confusing way by following the explicit implementation pattern again, with the new version returning false and the explicit implementation returning true? Or does this just make it even worse by falsely implying that MyList<T>'s Add() method is a mutator?
4. Or would it be better to forget trying to use the existing interface and just create a separate IPersistentCollection<T> interface that derives from IEnumerable<T> instead?
Edit: I changed the name of the class and switched over to using ICollection. I wanted to focus on the object's behavior and how it relates to the interface; I just went with the cons list as a simple example. I appreciate the advice that if I were to implement a cons list I should try to come up with a less confusing name, and should avoid implementing IList because that interface is intended for fast random access, but those are somewhat tangential issues.
What I intended to ask about is what others think about the tension between the semantics of read-only (or immutable) collections that are baked into the Framework, and persistent collections which implement equivalent behavior to what is described by the interface, only functionally rather than through mutating side effects.

Will implementing IList<T> be confusing?
Yes. Though there are situations in which an implementation of IList<T> throws -- say, when you are attempting to resize the list but its implementation is an array -- I would find it quite confusing to have an IList<T> that could be mutated in no way and did not have fast random access.
Should I implement a new IPersistentList<T>?
That depends on whether anyone will use it. Are consumers of your class likely to have a half-dozen different implementations of IPL<T> to choose from? I see no point in making an interface that is implemented by only one class; just use the class.
WPF's ItemsControl can get better performance if its ItemsSource is an IList<T> instead of an IEnumerable<T>.
But your persistent linked list will not have fast random access anyway.

It would make more sense to me to make a new IPersistentList<T> interface (or IImmutableList<T>, since "persistent" sounds to me like the data is saved off somewhere), since, really, it's different behavior than what is expected of an IList<T>. Classes that implement IList<T> should be mutable IMHO.
Oh, and of course, I'd avoid using the class name List<T> since it's already part of the framework.
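As a rough sketch of the alternative discussed in this thread, the cons list below skips ICollection<T> entirely and implements only IEnumerable<T>, so that Add can return a new list. The name PersistentList, the Empty sentinel, and IsEmpty are assumptions made for this illustration, not part of the question's code.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical persistent cons list: adding never mutates an existing instance.
public sealed class PersistentList<T> : IEnumerable<T>
{
    // Shared empty list; its Tail is null, which marks the end of every list.
    public static readonly PersistentList<T> Empty = new PersistentList<T>();

    public T Head { get; private set; }
    public PersistentList<T> Tail { get; private set; }
    public bool IsEmpty { get { return Tail == null; } }

    private PersistentList() { }

    private PersistentList(T head, PersistentList<T> tail)
    {
        Head = head;
        Tail = tail;
    }

    // Returns a new list with the item in front; this list is left unchanged.
    public PersistentList<T> Add(T item)
    {
        return new PersistentList<T>(item, this);
    }

    // Walks from the head to the empty sentinel.
    public IEnumerator<T> GetEnumerator()
    {
        for (var node = this; !node.IsEmpty; node = node.Tail)
        {
            yield return node.Head;
        }
    }

    IEnumerator IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

public static class PersistentListDemo
{
    public static void Main()
    {
        var one = PersistentList<int>.Empty.Add(1);
        var two = one.Add(2);   // shares "one" as its tail
        Console.WriteLine(string.Join(",", one));
        Console.WriteLine(string.Join(",", two));
    }
}
```

Because the type never claims to be an ICollection<T>, there is no IsReadOnly value to explain and no Add overload that throws.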

If you aren't supposed to return collections to callers, how should you return a collection of data to a caller?

I am writing a method that's intended to return a dictionary filled with configuration keys and values. The method that's building up this dictionary is doing so dynamically, so I need to return this set of keys and values as a collection (probably IDictionary<string, string>). In my various readings (sources escape me at the moment), the general consensus on returning collection types from method calls is not to.
I understand the reasons for this policy, and I tend to agree, but in cases like this I see no other alternative. This is my question: is there a way I can return this data to the caller, while following this principle?
Edit: The reasons I've heard for not allowing this behavior are that a collection or dictionary type that is meant to be consumed (but not modified) by the client exposes too much behavior, giving the illusion that the caller can modify the type. Dictionary, for example, has Add and Remove methods, as well as a mutable indexer. If the values in the dictionary are meant to be read-only, these methods are superfluous at best. Further damage can be done if the internal collection is exposed, and the 'owner' of the collection is not anticipating changes to the collection from outside sources.
There are other reasons I've heard, but I can't recall them off-hand - these are the most pertinent in my situation.
Edit: More clarification: The problem I'm having is that I'm building an API, so I have no control over the client calling this function. Cloning the dictionary isn't a problem, but I'm trying to keep my API as clean as possible. Returning a dictionary with methods such as Add and Remove implies that the collection can or should be modified, which isn't the case. Modifications here are meaningless, and so I don't want to expose the promise of that functionality through the returned type's interface.
Resolution: To come to terms with my desire for a clean API, I'm going to write a custom Dictionary class that does not expose the mutating methods Add and Remove, or the set indexer. This type will not implement IDictionary, but I will write a method ToDictionary that will return the data within an IDictionary. It will implement IEnumerable<KeyValuePair<TKey, TValue>> in order to have access to the standard LINQ operations over enumerables. Now all I need is a name for my custom dictionary type... =) Thanks everyone.

The general consensus on returning collection types from method calls is not to.
First time I've heard this, and it seems a stupid restriction to me.
I understand the reasons for this policy
Which are they then?
Edit:
The reasons you cite against returning collections are specific potential problems, which can be addressed specifically (by returning a read-only wrapper), without a blanket restriction on returning collections. But as I understand your situation, the collection is actually built by the method. In that case, changes made by the caller will not affect anything else and thus aren't something you really have to worry about, nor should you be overly restrictive in what the caller is supposed to be able to do with the object created specifically for him.

The main reason for this restriction is that it breaks polymorphism, constness and access control, if the class returns a member collection. If you are building up a collection to return, and the class does not retain it as a member, then this is not a problem.
That said, you may wish to think harder about why you wish to return this collection. What do you want the calling class to be able to do with the data? Can you implement this functionality by adding methods to your class, instead of returning a collection (e.g. myobj.getvalueFromKey(s) instead of myobj.getdictionary()[s])? Might it be more appropriate to return an object that only exposes the information you want it to, rather than simply return the collection (e.g. MyLookupTable MyClass::getLookupTable() rather than IDictionary MyClass::getLookupTable()).
If you have no control over the caller, and you must return a collection of a given type, then it should either be a copy of a member collection, or a new collection entirely, that the callee doesn't store.

In my opinion, returning collections is only a problem if changing the returned collection can have side effects, e.g. when several functions work with the same collection.
If you are only creating the collection, and not making data from a class public by returning it, I think it is okay to simply return the dictionary.
If the collection is used elsewhere in your code, and the code you return it to should not be able to change it, you have to clone it.

I've never heard that advice. There might be issues with thread safety if you do it poorly, but you can work around that if you need to.

Check out ReadOnlyCollection<T> for this. Change your return type and your last statement to
return new ReadOnlyCollection<T>(whateverYouWereReturningBefore);

Perhaps the confusion is with readonly collections (i.e. non-mutable collections)? If so, there's an excellent series of posts by Eric Lippert that goes into good detail on how to build these.
To quote: It is much easier to reason about a data structure if you know that it will never change. Since they cannot be modified, they are automatically threadsafe. Since they cannot be modified, you can maintain a stack of past “snapshots” of the structure, and suddenly undo-redo implementations become trivial.

How about returning an IEnumerable<T>? The caller can then easily filter the results any way they like via LINQ without mutating the original structure.
Obviously, for a Dictionary this will be IEnumerable<KeyValuePair<T,U>>.
Edit: For lookup you presumably want the ToLookup() extension and ILookup.

I usually return an array of data rather than a collection type. In C#, for example, a lot of the collections implement a ToArray() method, and for those that don't, an array can be retrieved using lambdas.
Edit
Saw your comment to "No Refunds No Returns" answer. If you're returning a dictionary, an array may not work for you. In this case, I would recommend returning an interface rather than a concrete implementation.
In C# (for example):
public IDictionary<string, object> MyMethod()
{
    Dictionary<string, object> myDictionary = new Dictionary<string, object>();
    // do stuff here
    return myDictionary;
}
Edit 2
You may need to implement your own read-only dictionary class and throw an exception in the necessary methods to prevent adding, etc.
In C# (Again, for example) (Not a complete solution):
public class ReadOnlyDictionary<TKey, TValue> : IDictionary<TKey, TValue>
{
    private IDictionary<TKey, TValue> _innerDictionary;

    public ReadOnlyDictionary(IDictionary<TKey, TValue> innerDictionary)
    {
        this._innerDictionary = innerDictionary;
    }

    // Mutating members throw NotSupportedException, the exception the
    // framework's own read-only collections use for rejected mutations.
    public void Add(TKey key, TValue value)
    {
        throw new NotSupportedException();
    }

    public bool Remove(TKey key)
    {
        throw new NotSupportedException();
    }

    public void Add(KeyValuePair<TKey, TValue> item)
    {
        throw new NotSupportedException();
    }

    public void Clear()
    {
        throw new NotSupportedException();
    }

    public bool IsReadOnly
    {
        get { return true; }
    }

    public bool Remove(KeyValuePair<TKey, TValue> item)
    {
        throw new NotSupportedException();
    }

    // Read-only members simply forward to the wrapped dictionary.
    public IEnumerator<KeyValuePair<TKey, TValue>> GetEnumerator()
    {
        return _innerDictionary.GetEnumerator();
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return _innerDictionary.GetEnumerator();
    }
}
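For readers on .NET 4.5 or later, the framework ships a wrapper along these lines, System.Collections.ObjectModel.ReadOnlyDictionary<TKey, TValue>, together with the IReadOnlyDictionary<TKey, TValue> interface. The sketch below shows how the configuration scenario from the question might use them; ConfigReader, LoadSettings, and the sample keys are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class ConfigReader
{
    // Hypothetical method: builds the settings, then returns them behind a read-only type.
    public static IReadOnlyDictionary<string, string> LoadSettings()
    {
        var settings = new Dictionary<string, string>
        {
            { "host", "localhost" },
            { "port", "8080" }
        };
        return new ReadOnlyDictionary<string, string>(settings);
    }

    public static void Main()
    {
        IReadOnlyDictionary<string, string> settings = LoadSettings();
        Console.WriteLine(settings["host"]);

        // The declared type has no Add or Remove, so settings.Add(...) would not compile.
        // A caller who casts to IDictionary gets a runtime failure instead.
        try
        {
            ((IDictionary<string, string>)settings).Add("user", "admin");
        }
        catch (NotSupportedException)
        {
            Console.WriteLine("wrapper rejected the mutation");
        }
    }
}
```

The declared return type carries the "do not modify" message at compile time, which is what the question was after.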

The reasons I've heard for not allowing this behavior is that a collection or dictionary type that is meant to be consumed (but not modified) by the client exposes too much behavior, giving the illusion that the caller can modify the type.
But it's not an illusion. The caller can modify the type (well, the instance of the type). Why on earth is this a problem?
By the same logic, DataTable.Select() shouldn't return a DataRow[], since not only can the caller manipulate the membership of that array, it can change the underlying data!
And the idea of returning an immutable dictionary-like class that has a ToDictionary() method: what possible benefit accrues from doing that?
It's true that returning immutable objects makes it possible for you to implement interning without changing your API. But that's about the only advantage that I can think of.

A major problem with using mutable class objects to pass around data is that every mutable object encompasses two major kinds of state:
1. The contents of all its fields, and the objects referred to thereby.
2. The set of all references that exist to it, and things that might be done with those references.
If a method accepts a mutable object (be it a collection or something else) as a parameter, and its contract specifies that it will mutate it somehow (e.g. add items to a collection) but not keep any reference to it, then the caller will know that the set of references that exist to that object after the method call will be the same as it was before. If the caller never exposes the object to the outside world except pass it to such methods, tracking what references exist will be easy.
On the other hand, if a method returns a mutable object to the caller, keeping track of what references may exist to the objects that are passed in and out may be difficult or impossible unless every caller receives a different mutable object. Having the called function create a new mutable object each time it's called, and populate that object with data as appropriate, is certainly a workable approach, but it's often better to let the caller create the new object. That way the caller may be able to recycle objects as appropriate (improving performance) and it will be clearer what's going on. For example, if Customer is a mutable class and one does:
Customer myCustomer = Database.GetCustomer("Fred Smith");
it's unclear whether making changes to myCustomer will have any effect on the database. By contrast, if the code were instead written as:
Customer myCustomer = new Customer();
Database.LoadCustomer(myCustomer, "Fred Smith");
it would be clearer that the data within myCustomer is not attached to the database (or anything else).

Best practice: How to expose a read-only ICollection

I have an ICollection<T> called foos in my class which I want to expose as read-only (see this question). I see that the interface defines a property .IsReadOnly, which seems appropriate... My question is this: how do I make it obvious to the consumer of the class that foos is read-only?
I don't want to rely on them remembering to query .IsReadOnly before trying a not-implemented method such as .Add(). Ideally, I would like to expose foos as a ReadOnlyCollection<T>, but it does not implement IList<T>. Should I expose foo via a method called, for example, GetReadOnlyFooCollection rather than via a property? If so, would this not confuse someone who then expects a ReadOnlyCollection<T>?
This is C# 2.0, so extension methods like ToList() are not available...

You can make "foos" a ReadOnlyCollection like this:
ReadOnlyCollection<T> readOnlyCollection = foos.ToList<T>().AsReadOnly();
Then you can expose it as a property of your class.
EDIT:
class FooContainer
{
    private ICollection<Foo> foos;
    public ReadOnlyCollection<Foo> ReadOnlyFoos { get { return foos.ToList<Foo>().AsReadOnly(); } }
}
Note: Remember that once you get the ReadOnlyFoos collection, it is no longer "synchronized" with your foos ICollection. See the thread you referenced.
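The "no longer synchronized" point can be sketched by comparing a copied snapshot with a live wrapper. This version assumes the backing field is a List<Foo> and uses the List<T> copy constructor, which also works without the ToList() extension method; the property names Snapshot and LiveView are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public class Foo { }

public class FooContainer
{
    private readonly List<Foo> foos = new List<Foo>();

    public void Add(Foo foo)
    {
        foos.Add(foo);
    }

    // Snapshot: copies the items first, so later changes to foos are not visible.
    public ReadOnlyCollection<Foo> Snapshot
    {
        get { return new List<Foo>(foos).AsReadOnly(); }
    }

    // Live view: wraps foos directly, so later changes show through.
    public ReadOnlyCollection<Foo> LiveView
    {
        get { return foos.AsReadOnly(); }
    }
}

public static class SnapshotDemo
{
    public static void Main()
    {
        var container = new FooContainer();
        ReadOnlyCollection<Foo> snapshot = container.Snapshot;
        ReadOnlyCollection<Foo> live = container.LiveView;

        container.Add(new Foo());

        Console.WriteLine("snapshot sees " + snapshot.Count + " item(s)");
        Console.WriteLine("live view sees " + live.Count + " item(s)");
    }
}
```

Which behavior is right depends on whether consumers expect a stable result or an up-to-date one, so it is worth documenting on the property.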

Since the question was written, .NET 4.5 has added an IReadOnlyCollection<T> interface; it would probably be good to use that as the declared return type.
That does, however, leave open the question of what type of instance to return. One approach would be to clone all the items in the original collection. Another would be to always return a read-only wrapper. A third would be to return the original collection if it implements IReadOnlyCollection<T>. Each approach will be the best one in certain contexts, but will be less than ideal (or perhaps downright dreadful) in others. Unfortunately, Microsoft provides no standard means by which a collection can be asked two very important questions:
Do you promise to always and forevermore contain the same items as you do right now?
Can you safely be exposed directly to code which is not supposed to modify your contents?
Which style of wrapping is appropriate would depend upon what the client code is expecting to do with the thing it receives. Some scenarios to be avoided:
An object was supplied of a type that the client would recognize as immutable, but rather than being returned directly it is duplicated, using a type that the client doesn't recognize as immutable. Consequently, the client is compelled to duplicate the collection again.
An object was supplied of a type that the client would recognize as immutable, but before being returned it is wrapped in such a fashion that the client can't tell whether the collection is immutable or not, and thus is compelled to duplicate.
An object of mutable type which is not supposed to be mutated is supplied by a client (cast to a read-only interface). It is then exposed directly to another client which determines that it is a mutable type and proceeds to modify it.
A reference to a mutable collection is received and is encapsulated in a read-only wrapper before being returned to a client that needs an immutable object. The method that returned the collection promised that it is immutable, and thus the client declined to make its own defensive copy. The client is then ill-prepared for the possibility that the collection might change.
There isn't really any particularly "safe" course of action an object can take with collections that it receives from some clients and needs to expose to others. Always duplicating everything is in many circumstances the safest course of action, but it can easily result in situations where a collection which shouldn't need to be duplicated at all ends up getting duplicated hundreds or thousands of times. Returning references as received can often be the most efficient approach, but it can also be semantically dangerous.
I wish Microsoft would add a standard means by which collections could be asked the above questions or, better yet, be asked to produce an immutable snapshot of their current state. An immutable collection could return an immutable snapshot of its current state very cheaply (just return itself), and even some mutable collection types could return an immutable snapshot at a cost far below the cost of a full enumeration. For example, a List<T> might be backed by two T[][] arrays, one of which holds references to sharable immutable arrays of 256 items, and the other of which holds references to unsharable mutable arrays of 256 items. Making a snapshot of a list would require cloning only the inner arrays containing items that have been modified since the last snapshot, which is potentially much cheaper than cloning the whole list. Unfortunately, there's no standard "make an immutable snapshot" interface. (Note that ICloneable doesn't count, since a clone of a mutable list would be mutable; while one could make an immutable snapshot by encapsulating a mutable clone in a read-only wrapper, that would only work for things which are cloneable, and even types which aren't cloneable should still support a "mutable snapshot" function.)

My recommendation is to use a ReadOnlyCollection<T> directly for this scenario. This makes the usage explicit to the calling user.
Normally I would suggest using the appropriate interface. But given that the .NET Framework does not currently have a suitable IReadOnlyCollection, you must go with the ReadOnlyCollection type.
Also, be careful when using ReadOnlyCollection, because it is not actually read-only: see "Immutability and ReadOnlyCollection".

I seem to have settled on returning IEnumerable with the objects cloned.
public IEnumerable<Foos> GetFoosList()
{
    foreach (var foos in Collection)
    {
        // Hand out a copy so callers cannot modify the stored object.
        yield return foos.Clone();
    }
}
This requires a Clone method on Foos that returns a Foos.
This allows no changes to the collection. Remember that ReadOnlyCollection is "leaky", since the objects inside it can be changed, as mentioned in a link in another post.

Sometimes you may want to use an interface, perhaps because you want to mock the collection during unit testing. Please see my blog entry for adding your own interface to ReadOnlyCollection by using an adapter.

I typically return an IEnumerable<T>.
Once you make a collection readonly (so methods like Add, Remove and Clear no longer work) there's not much left that a collection supports that an enumerable doesn't - just Count and Contains, I believe.
If consumers of your class really need to treat elements like they're in a collection, it's easy enough to pass an IEnumerable to List<T>'s constructor.
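As a rough sketch of that approach (the Team class and its data are invented for the example), the owner exposes only an IEnumerable<T>, and a consumer that wants list behaviour copies the sequence into its own List<T>:

```csharp
using System.Collections.Generic;

// Hypothetical owner type, used only for this illustration.
public class Team
{
    private readonly List<string> _members = new List<string> { "Ann", "Bob" };

    // An iterator block returns a compiler-generated sequence object,
    // so callers never receive the List<string> instance itself.
    public IEnumerable<string> Members
    {
        get
        {
            foreach (var member in _members)
            {
                yield return member;
            }
        }
    }
}
```

A consumer can then write new List<string>(team.Members) and add to or sort that copy freely; the team's own list is unaffected.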

Return a T[]:
private ICollection<T> items;

public T[] Items
{
    get { return new List<T>(items).ToArray(); }
}

List<BusinessObject> or BusinessObjectCollection?

Prior to C# generics, everyone would code collections for their business objects by creating a collection base that implemented IEnumerable, i.e.:
public class CollectionBase : IEnumerable
and then would derive their Business Object collections from that.
public class BusinessObjectCollection : CollectionBase
Now with the generic list class, does anyone just use that instead? I've found that I use a compromise of the two techniques:
public class BusinessObjectCollection : List<BusinessObject>
I do this because I like to have strongly typed names instead of just passing Lists around.
What is your approach?

I am generally in the camp of just using a List directly, unless for some reason I need to encapsulate the data structure and provide a limited subset of its functionality. This is mainly because if I don't have a specific need for encapsulation then doing it is just a waste of time.
However, with the collection initializer feature in C# 3.0, there are some new situations where I would advocate using customized collection classes.
Basically, C# 3.0 allows any class that implements IEnumerable and has an accessible Add method to use the new collection initializer syntax. For example, because Dictionary defines a method Add(K key, V value) it is possible to initialize a dictionary using this syntax:
var d = new Dictionary<string, int>
{
{"hello", 0},
{"the answer to life the universe and everything is:", 42}
};
The great thing about the feature is that it works for add methods with any number of arguments. For example, given this collection:
class c1 : IEnumerable
{
    // Add must be accessible to the code that uses the initializer.
    public void Add(int x1, int x2, int x3)
    {
        //...
    }
    //...
}
it would be possible to initialize it like so:
var x = new c1
{
    {1, 2, 3},
    {4, 5, 6}
};
This can be really useful if you need to create static tables of complex objects. For example, if you were just using List<Customer> and you wanted to create a static list of customer objects you would have to create it like so:
var x = new List<Customer>
{
    new Customer("Scott Wisniewski", "555-555-5555", "Seattle", "WA"),
    new Customer("John Doe", "555-555-1234", "Los Angeles", "CA"),
    new Customer("Michael Scott", "555-555-8769", "Scranton", "PA"),
    new Customer("Ali G", "", "Staines", "UK")
};
However, if you use a customized collection, like this one:
class CustomerList : List<Customer>
{
    public void Add(string name, string phoneNumber, string city, string stateOrCountry)
    {
        Add(new Customer(name, phoneNumber, city, stateOrCountry));
    }
}
You could then initialize the collection using this syntax:
var customers = new CustomerList
{
    {"Scott Wisniewski", "555-555-5555", "Seattle", "WA"},
    {"John Doe", "555-555-1234", "Los Angeles", "CA"},
    {"Michael Scott", "555-555-8769", "Scranton", "PA"},
    {"Ali G", "", "Staines", "UK"}
};
This has the advantage of being both easier to type and easier to read because there is no need to retype the element type name for each element. The advantage can be particularly strong if the element type is long or complex.
That being said, this is only useful if you need static collections of data defined in your app. Some types of apps, like compilers, use them all the time. Others, like typical database apps don't because they load all their data from a database.
My advice would be that if you either need to define a static collection of objects, or need to encapsulate away the collection interface, then create a custom collection class. Otherwise I would just use List<T> directly.

It's recommended not to use List<T> in public APIs, but to use Collection<T> instead.
If you are inheriting from it though, you should be fine, afaik.

I prefer just to use List<BusinessObject>. Typedefing it just adds unnecessary boilerplate to the code. List<BusinessObject> is a specific type, it's not just any List object, so it's still strongly typed.
More importantly, declaring something List<BusinessObject> makes it easier for everyone reading the code to tell what types they are dealing with. They don't have to search through the code to figure out what a BusinessObjectCollection is and then remember that it's just a list. By typedefing, you'll have to require a consistent (re)naming convention that everyone has to follow in order for it to make sense.

Use the type List<BusinessObject> where you have to declare a list of them. However, where you return a list of BusinessObject, consider returning IEnumerable<T>, IList<T> or ReadOnlyCollection<T>, i.e. return the weakest possible contract that satisfies the client.
Where you want to "add custom code" to a list, code extension methods on the list type. Again, attach these methods to the weakest possible contract, e.g.
public static int SomeCount(this IEnumerable<BusinessObject> someList)
Of course, you can't and shouldn't add state with extension methods, so if you need to add a new property and a field behind it, use a subclass or better, a wrapper class to store this.
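A minimal sketch of such an extension method follows. The IsActive property and the body of SomeCount are assumptions made up for the example; only the signature comes from the answer above. Because the method is attached to IEnumerable<BusinessObject>, it works on lists, arrays and any other sequence of BusinessObject.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical business type; IsActive is invented for the example.
public class BusinessObject
{
    public bool IsActive { get; set; }
}

public static class BusinessObjectExtensions
{
    // Attached to the weakest contract that can do the job: any
    // sequence of BusinessObject, not specifically a List<T>.
    public static int SomeCount(this IEnumerable<BusinessObject> someList)
    {
        return someList.Count(item => item.IsActive);
    }
}
```

With this in scope, both a List<BusinessObject> and a BusinessObject[] can call SomeCount() directly, and no subclass is needed.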

I've been going back and forth on 2 options:
public class BusinessObjectCollection : List<BusinessObject> {}
or methods that just do the following:
public IEnumerable<BusinessObject> GetBusinessObjects();
The benefit of the first approach is that you can change the underlying data store without having to mess with method signatures. Unfortunately, if you later inherit from a collection type that lacks a method the previous base type had, you'll have to deal with those situations throughout your code.

You should probably avoid creating your own collection for that purpose. It's pretty common to want to change the type of data structure a few times during refactorings or when adding new features. With your approach, you would wind up with a separate class for BusinessObjectList, BusinessObjectDictionary, BusinessObjectTree, etc.
I don't really see any value in creating this class just because the classname is more readable. Yeah, the angle bracket syntax is kind of ugly, but it's standard in C++, C# and Java, so even if you don't write code that uses it you're going to run into it all the time.

I generally only derive my own collection classes if I need to "add value". Like, if the collection itself needed to have some "metadata" properties tagging along with it.

I do the exact same thing as you Jonathan... just inherit from List<T>. You get the best of both worlds. But I generally only do it when there is some value to add, like adding a LoadAll() method or whatever.

You can use both. For laziness - I mean productivity - List is a very useful class; it's also "comprehensive" and frankly full of YAGNI members. Coupled with the sensible argument / recommendation put forward by the MSDN article already linked about exposing List as a public member, I prefer the "third" way:
Personally I use the decorator pattern to expose only what I need from List i.e:
public class OrderItemCollection : IEnumerable<OrderItem>
{
    private readonly List<OrderItem> _orderItems = new List<OrderItem>();

    public void Add(OrderItem item)
    {
        _orderItems.Add(item);
    }

    // Implement only the list members that your domain requires,
    // e.g. sum items, calculate weight, etc.

    public IEnumerator<OrderItem> GetEnumerator()
    {
        return _orderItems.GetEnumerator();
    }

    // IEnumerable<T> also requires the non-generic enumerator.
    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}
Further still, I'd probably abstract OrderItemCollection into IOrderItemCollection so I can swap my implementation of IOrderItemCollection in the future (I may prefer to use a different inner enumerable object such as Collection or, more likely for perf, a key-value pair collection or Set).

I use generic lists for almost all scenarios. The only time that I would consider using a derived collection anymore is if I add collection specific members. However, the advent of LINQ has lessened the need for even that.

6 of 1, half dozen of another
Either way it's the same thing. I only do it when I have reason to add custom code into the BusinessObjectCollection.
Without it, having load methods return a list allows me to write more code in a common generic class and have it just work, such as a Load method.

As someone else pointed out, it is recommended not to expose List publicly, and FxCop will whinge if you do so. This includes inheriting from List as in:
public class MyTypeCollection : List<MyType>
In most cases public APIs will expose IList (or ICollection or IEnumerable) as appropriate.
In cases where you want your own custom collection, you can keep FxCop quiet by inheriting from Collection instead of List.

If you choose to create your own collection class, you should check out the types in the System.Collections.ObjectModel namespace.
The namespace defines base classes that are meant to make it easier for implementers to create custom collections.

I tend to do it with my own collection if I want to shield access to the actual list. When you are writing business objects, chances are that you need a hook to know when your object is being added or removed; in that sense I think BOCollection is the better idea. Of course, if that is not required, List is more lightweight. You might also want to look at using IList to provide an additional abstraction if you need some kind of proxying (e.g. a fake collection that triggers a lazy load from the database).
But... why not consider Castle ActiveRecord or any other mature ORM framework? :)
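For the "hook" case mentioned above, one possible sketch uses Collection<T> from System.Collections.ObjectModel, which exposes protected virtual methods for this purpose. The BusinessObject type and the two counters are made up for the example; a real collection would more likely raise events or update the item's parent reference.

```csharp
using System;
using System.Collections.ObjectModel;

// Hypothetical business type, used only for this illustration.
public class BusinessObject
{
    public string Name { get; set; }
}

public class BusinessObjectCollection : Collection<BusinessObject>
{
    // Illustrative counters standing in for real add/remove logic.
    public int AddedCount { get; private set; }
    public int RemovedCount { get; private set; }

    // Called by Add and Insert.
    protected override void InsertItem(int index, BusinessObject item)
    {
        if (item == null) throw new ArgumentNullException("item");
        base.InsertItem(index, item);
        AddedCount++;
    }

    // Called by Remove and RemoveAt. Clear goes through ClearItems and
    // the indexer setter through SetItem, so override those as well
    // if they matter for your hook.
    protected override void RemoveItem(int index)
    {
        base.RemoveItem(index);
        RemovedCount++;
    }
}
```

List<T> has no equivalent virtual methods, which is one reason deriving from Collection<T> is usually suggested when you need this kind of interception.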

Most of the time I simply go the List way, as it gives me all the functionality I need 90% of the time, and when something 'extra' is needed, I inherit from it and code that extra bit.

I would do this:
using BusinessObjectCollection = System.Collections.Generic.List<BusinessObject>;
This just creates an alias rather than a completely new type. I prefer it to using List<BusinessObject> directly because it leaves me free to change the underlying structure of the collection at some point in the future without changing code that uses it (as long as I provide the same properties and methods).

Try this:
System.Collections.ObjectModel.Collection<BusinessObject>
It makes it unnecessary to implement the basic methods yourself, as you would have to with CollectionBase.

this is the way:
return arrays, accept IEnumerable<T>
=)
