Why is it considered bad to expose List<T>? [duplicate]

According to FxCop, List<T> should not be exposed in an API object model. Why is this considered bad practice?

I agree with moose-in-the-jungle here: List<T> is an unconstrained, bloated object that has a lot of "baggage" in it.
Fortunately the solution is simple: expose IList<T> instead.
It exposes a bare-bones interface that has almost all of List<T>'s methods (with the exception of things like AddRange()), and it doesn't constrain you to the specific List<T> type, which allows your API consumers to use their own custom implementations of IList<T>.
For even more flexibility, consider exposing some collections as IEnumerable<T>, when appropriate.

There are two main reasons:
List<T> is a rather bloated type with many members that are not relevant in many scenarios (it is too “busy” for public object models).
The class is unsealed, but not specifically designed to be extended (you cannot override any members).

It's only considered bad practice if you are writing an API that will be used by thousands or millions of developers.
The .NET framework design guidelines are meant for Microsoft's public APIs.
If you have an API that's not being used by a lot of people, you should ignore the warning.

I think you don't want your consumers adding new elements to what you return. An API should be clear and complete, and if it returns an array, it should return that exact data structure. I don't think it has anything to do with T per se, but rather with returning a List<> instead of an array [] directly.

One reason is that List<T> isn't something you can simulate. Even in less popular libraries, I've seen versions that exposed a List<T> object as an IList<T> because of this recommendation, and later versions that decided not to store the data in a list at all (perhaps in a database instead). Because it was an IList<T>, changing the implementation underneath the clients wasn't a breaking change, and everyone kept working.

One of the reasons is that the user will be able to change the list without the owner of the list knowing about it, while in some cases the owner must do some work after items are added to or removed from the list. Even if that isn't required now, it can become a requirement in the future. So it is better to add AddXXX / RemoveXXX methods to the class that owns the list and expose the list as an IEnumerable, or (which is better in my opinion) expose it as an IList and use ObservableCollection from WindowsBase.
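A minimal sketch of the first option in that answer, with the concrete list kept private and changes routed through the owner. The names Order, AddItem, RemoveItem, Items, and ChangeCount are hypothetical and chosen only for illustration; the counter stands in for whatever work the owner needs to do after a change.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical owner type; all names here are illustrative assumptions.
public class Order
{
    // The concrete List<T> stays private, so it can be swapped out later.
    private readonly List<string> _items = new List<string>();

    // Stand-in for "some stuff" the owner must do after each change.
    public int ChangeCount { get; private set; }

    // Callers get a read-only sequence. The iterator hides the underlying
    // list, so it cannot be reached by casting the result back to List<string>.
    public IEnumerable<string> Items
    {
        get
        {
            foreach (string item in _items)
            {
                yield return item;
            }
        }
    }

    public void AddItem(string item)
    {
        if (item == null) throw new ArgumentNullException(nameof(item));
        _items.Add(item);
        ChangeCount++; // the owner always knows about the change
    }

    public bool RemoveItem(string item)
    {
        bool removed = _items.Remove(item);
        if (removed) ChangeCount++;
        return removed;
    }
}
```

Because every mutation goes through AddItem or RemoveItem, adding validation or notifications later would not change the public surface.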

Related

Best-practice for data-object properties: IEnumerable vs Array [closed]

Short question: is it OK to declare data-object properties as IEnumerable, or should they be arrays instead?
Background:
I've just found a bug in our project that caused a performance issue. The reason was that an IEnumerable was being iterated multiple times. But it is simple only at first sight; I think there is a design flaw that allowed it to happen.
Deeper investigation showed that a method, GetAllUsers, returned a UsersResponse object, one of whose properties was IEnumerable<T> UsersList. When caching was implemented, the entire UsersResponse object was cached, and it worked fine at the time because GetAllUsers assigned an array to IEnumerable<T> UsersList. Later the implementation of GetAllUsers changed, and for some reason a developer decided that the ToArray() call was redundant. So I think the problem was that the UsersResponse object was not well designed and allowed too much freedom to its factory method. On the other hand, caching an object that contains IEnumerable properties is also useless in principle.
So we return to my question about designing data objects in general: when you declare one, not knowing whether it will be cached some time in the future or how it will be used beyond your current need, is it OK to declare its properties as IEnumerable, placing the responsibility for careful usage on other developers, or must they be arrays from the start?
What I've searched:
The only suggestion I've found is Jon Wagner's blog post, where he recommends “sealing” LINQ chains as soon as they have been built. But this relates more to building the IEnumerable than to storing it in an entity property. In conjunction with the principle of returning as specific a type as possible, though, it can imply declaring the property as an array.
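The behaviour described in this question can be reproduced in a few lines. The sketch below is not the project's actual code: LoadUsers is a hypothetical stand-in for the expensive source, and SourceReads is a counter added only to make the repeated work visible.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationDemo
{
    // Counts how often the (hypothetical) expensive source is actually read.
    public static int SourceReads;

    // Stand-in for an expensive lookup such as a database call.
    // The body of an iterator runs when enumeration starts, not when it is called.
    private static IEnumerable<string> LoadUsers()
    {
        SourceReads++;
        yield return "alice";
        yield return "bob";
    }

    // Deferred: the query is re-executed on every enumeration of the result.
    public static IEnumerable<string> GetUsersDeferred()
    {
        return LoadUsers().Select(name => name.ToUpperInvariant());
    }

    // "Sealed": ToArray() reads the source once, here, so the result is safe to cache.
    public static string[] GetUsersMaterialized()
    {
        return LoadUsers().Select(name => name.ToUpperInvariant()).ToArray();
    }
}
```

Iterating the result of GetUsersDeferred twice reads the source twice, while iterating the array from GetUsersMaterialized any number of times reads it only once. Both values fit a property typed IEnumerable<string>, which is why removing the ToArray() call compiled without complaint.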

When I'm thinking about API design I am always trying to be "nice" to the consumer. This means accepting as much as possible for parameters (when possible), and providing as much as possible for return values. If you too buy into this, it means that you should strive for IEnumerable parameters while providing Array (or similar) return values. The result is maximum value for the consumer of your API (even if it is yourself in the end).
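As a rough illustration of that guideline, the hypothetical TopScores method below accepts any IEnumerable<int> and hands back a fully materialized array. The class and method names are assumptions made up for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Scores
{
    // Accepts the most general input type and returns a concrete, finished result.
    public static int[] TopScores(IEnumerable<int> scores, int count)
    {
        if (scores == null) throw new ArgumentNullException(nameof(scores));
        return scores.OrderByDescending(s => s).Take(count).ToArray();
    }
}
```

A caller can pass a List<int>, an array, or a LINQ query, and always receives an array that can be indexed and iterated repeatedly without re-running anything.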

I generally favour the array approach unless I am forced into collections, especially when there are a lot of numerical computations.
Advantages:
Compact storage (which opens the possibility of hacky techniques).
The [ ] accessor (not applicable for C#).
Readable code for multi-dimensional arrays.
Last but not least, stick to one and never use both. Converting back and forth between arrays and collections is terrible and does no good either.

The most abstract type that fulfills the requirements should be used. If you need to make it impossible to store an "open" query in a DTO, then use ICollection<T> or (if you need indexed access) IList<T> for the property.
Using an array will nail you down to a concrete implementation. This may be a non-issue or it may turn out to be a pain at some later point. The latter rules out array IMHO.
BTW: It is the caller's responsibility to only iterate once over an IEnumerable<T>.

What is the benefit of using type-safe collection classes?

I was wondering why, on some occasions, I see a class representing a collection of some specific type.
For example:
In Microsoft XNA Framework: TextureCollection, TouchCollection, etc.
Also other classes in the .NET framework itself, ending with Collection.
Why is it designed this way? What are the benefits of doing it this way rather than using a generic collection type, like those introduced in C# 2.0?
Thanks

The examples you gave are good ones. TextureCollection is sealed and has no public constructor, only an internal one. TouchCollection implements IList<TouchLocation>, similar to the way List<T> implements IList<T>. Generics at work here btw, the upvoted answer isn't correct.
TextureCollection is intentionally crippled, it makes sure that you can never create an instance of it. Only secret knowledge about textures can fill this collection, a List<> wouldn't suffice since it cannot be initialized with that secret knowledge that makes the indexer work. Nor does the class need to be generic, it only knows about Texture class instances.
The TouchCollection is similarly specialized. The Add() method throws a NotSupportedException. This cannot be done with a regular List<> class, its Add() method isn't virtual so cannot be overridden to throw the exception.
This is not unusual.
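To illustrate the point about Add(): List<T>.Add is not virtual, but Collection<T> in System.Collections.ObjectModel routes its mutating methods through protected virtual members that a subclass can override. The FrozenCollection<T> type below is a hypothetical sketch of that technique, not how XNA actually implements TouchCollection.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical specialized collection: filled once at construction, then fixed.
public class FrozenCollection<T> : Collection<T>
{
    // The items are copied into a private list that the base class wraps.
    public FrozenCollection(IEnumerable<T> items) : base(new List<T>(items))
    {
    }

    // Collection<T>.Add and Insert both end up here.
    protected override void InsertItem(int index, T item)
    {
        throw new NotSupportedException("Items cannot be added.");
    }

    // The indexer setter ends up here.
    protected override void SetItem(int index, T item)
    {
        throw new NotSupportedException("Items cannot be replaced.");
    }

    // Remove and RemoveAt end up here.
    protected override void RemoveItem(int index)
    {
        throw new NotSupportedException("Items cannot be removed.");
    }

    // Clear ends up here.
    protected override void ClearItems()
    {
        throw new NotSupportedException("The collection cannot be cleared.");
    }
}
```

Reading through the indexer, Count, and foreach keeps working, while every mutation throws NotSupportedException.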

In the .NET framework itself, many type-safe collections predate 2.0 Generics, and are kept for compatibility.
For several XAML-related contexts, there's either no syntax to specify a generic class, or the syntax is cumbersome. Therefore, where List<T> would otherwise be used, a specific TList is written for each need.

It allows you to define your own semantics on the collection (you may not want to have an Add or AddRange method etc...).
Additionally, readability is increased by not having your code littered with List<Touch> and List<Texture> everywhere.
There is also quite a lot of .NET 1.0/1.1 code that still needs to work, so the older collections that predate generics still need to exist.

It's not that easy to use generic classes in XAML for example.

Following on from Oded's answer, your own class type allows for much easier change down the track when you decide you want a stack / queue etc instead of that List. There can be lots of reasons for this, including performance, memory use etc.
In fact, it's usually a good idea to hide that type of implementation detail - users of your class just want to know that it stores Textures, not how.

C#-Generics -ISerializable,IEnumerable,IList -efficient application

I need a simple example of using ISerializable, IEnumerable, and IList with generics efficiently.
I would also like to know which other interfaces can be used along with generics.
Update:
The task I need to perform using these interfaces is:
Serialize the custom types
Collect them in a generic object
Iterate over them to find a match

This question is very broad.
Note that the interfaces you've listed are not all about the same thing.
ISerializable is not generic, and deals with serialization of objects to streams or similar.
IEnumerable is about being able to enumerate over a collection or something that produces a stream of elements.
IList is an interface that is typically implemented by such a collection.
It would help us help you if you could narrow down your question somewhat. As your question stands now, it's more like "I need to know everything there is to know about cars".
As for "all other interfaces that can be used with generics", have you looked at the MSDN Documentation for the .NET framework classes?

I have a feeling that this question is a homework question...but I'll bite with a little information.
Generics != Interfaces. Basically, you can use any interface you want with generics, and this is one of the more powerful parts of generics: by using interfaces that you create, you can define generic methods that process multiple concrete implementations by constraining the generic type to objects that implement a specific interface.
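A small sketch of what that might look like, also touching the "iterate to find a match" part of the question. INamed, User, Product, and Finder are hypothetical types invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface that the generic method is constrained to.
public interface INamed
{
    string Name { get; }
}

public class User : INamed
{
    public string Name { get; set; }
}

public class Product : INamed
{
    public string Name { get; set; }
    public decimal Price { get; set; }
}

public static class Finder
{
    // Works for any reference type that implements INamed.
    // Returns the first match, or null when nothing matches.
    public static T FindByName<T>(IEnumerable<T> items, string name)
        where T : class, INamed
    {
        foreach (T item in items)
        {
            if (item.Name == name)
            {
                return item;
            }
        }
        return null;
    }
}
```

The same FindByName method serves a List<User> and a Product[] alike, and the result keeps its concrete type, so Price is still accessible on a returned Product.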

.NET Reflection Create Class Properties

I am fairly new to reflection and I would like to know, if possible, how to create an instance of a class, then add properties to the class, set those properties, and read them later. I don't have any code as I don't even know how to start going about this. C# or VB is fine.
Thank You
EDIT: (to elaborate)
My system has a dynamic form creator. One of my associates requires that the form data be accessible via a web service. My idea was to create a class (based on the dynamic form), add properties to the class (based on the form's fields), set those properties (based on the values input for those fields), then return the class from the web service.
Additionally, the web service will be able to set the properties on the class and eventually commit those changes to the DB.

If you mean dynamically create a class, then the two options are:
Reflection.Emit - Difficult, Fast to create the class
CodeDom - Less Difficult, Slower to create the class
If you mean create an instance of an existing class, then start with Activator.CreateInstance to create an instance of the object, and then look at the methods on Type such as GetProperty which will return a PropertyInfo that you can call GetValue and SetValue on.
Update: For the scenario you describe, returning dynamic data from a web service, then I'd recommend against this approach as it's hard for you to code, and hard for statically-typed languages to consume. Instead, as suggested in the comments and one of the other answers, some sort of dictionary would likely be a better option.
(Note that when I say return some sort of dictionary, I am speaking figuratively rather than literally, i.e. return something which is conceptually the same as a dictionary such as a list of key-value pairs. I wouldn't recommend directly returning one (even if you're using WCF which does support this) because it's typically better to have full control over the XML you return.)
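For the "instance of an existing class" case, the calls named above fit together roughly as follows. FormData and the helper names are hypothetical; the sketch assumes the target type has a public parameterless constructor and public settable properties.

```csharp
using System;
using System.Reflection;

// Hypothetical existing class whose properties are set by name at run time.
public class FormData
{
    public string Title { get; set; }
    public int Priority { get; set; }
}

public static class ReflectionDemo
{
    // Creates an instance of the given type and sets one property by name.
    public static object CreateAndSet(Type type, string propertyName, object value)
    {
        object instance = Activator.CreateInstance(type);
        PropertyInfo property = type.GetProperty(propertyName);
        if (property == null || !property.CanWrite)
        {
            throw new ArgumentException("No writable property named " + propertyName);
        }
        property.SetValue(instance, value, null);
        return instance;
    }

    // Reads a property by name from any object.
    public static object Read(object instance, string propertyName)
    {
        PropertyInfo property = instance.GetType().GetProperty(propertyName);
        if (property == null || !property.CanRead)
        {
            throw new ArgumentException("No readable property named " + propertyName);
        }
        return property.GetValue(instance, null);
    }
}
```

This only covers reading and writing properties that already exist on the type; adding new properties at run time still requires Reflection.Emit or CodeDom as described above.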

I know this is overly simplified, but why not just KISS: generate the required XML to return through the web service, and then parse the returned XML to populate the database?
My reasoning is that, for the expanded reason you give for doing this, I can't see the value of or reason for wanting a dynamic class.

The Execution-Time Code Generation chapter of Eric Gunnerson's book (A Programmer's Introduction to C#) has some great information on this topic. See page 14 and onwards in particular. He outlines the two main methods of accomplishing dynamic class/code generation (CodeDOM and the Reflection.Emit namespace). It also discusses the difficulty and performance of the two approaches. Have a read through that, and you ought to find everything you might need.

The real question is, what do you need to use those properties for?
What are gonna be the use cases? Do you need to bind those properties to the UI somehow? Using what kind of technology? (WPF, Windows Forms?)
Is it just that you need to gather a set of key/value pairs at runtime? Then maybe a simple dictionary would do the trick.
Please elaborate if you can on what it is you need, and I'm sure people here can come up with plenty of ways to help you, but it's difficult to give a good answer without more context.

Why doesn't ReadOnlyCollection<> include methods like FindAll(), FindFirst(),

Following the suggestions of FxCop and my personal inclination, I've been encouraging the team I'm coaching to use ReadOnlyCollections as much as possible, if only so that recipients of the lists can't modify their content. In theory this is bread and butter. The problem is that the List<> interface is much richer, exposing all sorts of useful methods. Why did they make that choice?
Do you just give up and return writable collections? Do you return readonly collections and then wrap them in the writable variety? Ahhhhh.
Update:
Thanks. I'm familiar with the Framework Design Guidelines, and that's why the team is using FxCop to enforce them. However, this team is living with VS 2005 (I know, I know), so telling them that LINQ/extension methods would solve their problems just makes them sad.
They've learned that List.FindAll() and .FindFirst() provide greater clarity than writing a foreach loop. Now that I'm pushing them to use ReadOnlyCollections, they lose that clarity.
Maybe there is a deeper design problem that I'm not spotting.
-- Sorry, the original post should have mentioned the VS2005 restriction. I've lived with it for so long that I just don't notice.

Section 8.3.2 of the .NET Framework Design Guidelines Second Edition:
DO use ReadOnlyCollection<T>, a subclass of ReadOnlyCollection<T>, or in rare cases IEnumerable<T> for properties or return values representing read-only collections.
We go with ReadOnlyCollections to express our intent of the collection returned.
The List<T> methods you speak of were added in .NET 2.0 for convenience. In C# 3.0 / .NET 3.5, you can get all those methods back on ReadOnlyCollection<T> (or any IEnumerable<T>) using extension methods (and use LINQ operators as well), so I don't think there's any motivation for adding them natively to other types. That they exist on List<T> at all is a historical artifact: extension methods are available now but weren't in 2.0.
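As a sketch of that claim, assuming .NET 3.5 or later: Where and FirstOrDefault from System.Linq cover the filtering and first-match cases on a ReadOnlyCollection<T>. Team and TeamQueries are hypothetical names used only here.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

// Hypothetical type that exposes its members as a read-only collection.
public class Team
{
    private readonly List<string> _members = new List<string> { "Ann", "Bob", "Alice" };

    // AsReadOnly() wraps the list; callers cannot add or remove through the wrapper.
    public ReadOnlyCollection<string> Members
    {
        get { return _members.AsReadOnly(); }
    }
}

public static class TeamQueries
{
    // Plays the role of List<T>.FindAll.
    public static List<string> StartingWith(ReadOnlyCollection<string> names, string prefix)
    {
        return names.Where(n => n.StartsWith(prefix, StringComparison.Ordinal)).ToList();
    }

    // Plays the role of List<T>.Find: first match, or null when there is none.
    public static string FirstStartingWith(ReadOnlyCollection<string> names, string prefix)
    {
        return names.FirstOrDefault(n => n.StartsWith(prefix, StringComparison.Ordinal));
    }
}
```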

First off, ReadOnlyCollection<T> does implement IEnumerable<T> and IList<T>. With all of the extension methods in .NET 3.5 and LINQ, you have access to nearly all of the functionality from the original List<T> class in terms of querying, which is all you should do with a ReadOnlyCollection<T> anyways.
That being said, your initial question leads me to make some suggestions...
Returning List<T> is bad design, so it shouldn't be a point of comparison. List<T> should be used for implementation, but for the interface, IList<T> should be returned. The Framework Design Guidelines specifically state:
"DO NOT use ArrayList or List<T> in public APIs." (Page 251)
If you take that into consideration, there is absolutely no disadvantage to ReadOnlyCollection<T> when compared to List<T>. Both of these classes implement IEnumerable<T> and IList<T>, which are the interfaces that should be returned anyways.

I don't have any insight as to why they weren't originally added. But now that we have LINQ I certainly see no reason to add them in future versions of the language. The methods you mentioned can easily be written in a LINQ query today. These days I just use the LINQ queries for pretty much everything. I actually more often get annoyed with List<T> having those methods because it conflicts with extension methods I write against IEnumerable<T>.

I think Jeff's answer kinda contains the answer you need; instead of ReadOnlyCollection<T>, return a subclass of it... one that you implement yourself to include the methods that you'd like to use without upgrading to VS2008/LINQ.
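One possible shape for such a subclass, written without LINQ or extension methods so the class itself uses only features available to a VS 2005 / C# 2.0 team. SearchableReadOnlyCollection<T> is a hypothetical name, and its Find and FindAll are modelled on the List<T> methods of the same names.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical read-only collection that adds back two List<T>-style helpers.
public class SearchableReadOnlyCollection<T> : ReadOnlyCollection<T>
{
    public SearchableReadOnlyCollection(IList<T> list) : base(list)
    {
    }

    // Returns every element that satisfies the predicate, in order.
    public List<T> FindAll(Predicate<T> match)
    {
        if (match == null) throw new ArgumentNullException("match");
        List<T> results = new List<T>();
        foreach (T item in this)
        {
            if (match(item))
            {
                results.Add(item);
            }
        }
        return results;
    }

    // Returns the first matching element, or default(T) when there is none.
    public T Find(Predicate<T> match)
    {
        if (match == null) throw new ArgumentNullException("match");
        foreach (T item in this)
        {
            if (match(item))
            {
                return item;
            }
        }
        return default(T);
    }
}
```

In C# 2.0 the predicate would be passed as an anonymous method, for example delegate(int n) { return n > 10; }, and the collection still satisfies the guideline because it is a ReadOnlyCollection<T>.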
