I saw this reply from Jon on Initialize generic object with unknown type:
If you want a single collection to contain multiple unrelated types of values, however, you will have to use List<object>
I'm not comparing ArrayList vs List<>, but ArrayList vs List<object>, as both will be exposing elements of type object. What would be the benefit of using either one in this case?
EDIT: Type safety is not a concern here, since both classes expose object as their item type. One still needs to cast from object to the desired type. I'm more interested in anything other than type safety.
EDIT: Thanks Marc Gravell and Sean for the answer. Sorry, I can only pick 1 as answer, so I'll up vote both.
You'll be able to use the LINQ extension methods directly with List<object>, but not with ArrayList, unless you inject a Cast<object>() / OfType<object>() (thanks to IEnumerable<object> vs IEnumerable). That's worth quite a bit, even if you don't need type safety etc.
The speed will be about the same; structs will still be boxed, etc - so there isn't much else to tell them apart. Except that I tend to see ArrayList as "oops, somebody is writing legacy code again..." ;-p
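As a rough illustration of the LINQ point above, here is a minimal sketch. The class and method names (LinqOnUntypedCollections, CountStrings) are made up for this example; Where, Count, and Cast<object>() are the standard System.Linq extension methods.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class LinqOnUntypedCollections
{
    // List<object> implements IEnumerable<object>, so LINQ applies directly.
    public static int CountStrings(List<object> items)
    {
        return items.Where(x => x is string).Count();
    }

    // ArrayList only implements the non-generic IEnumerable, so it needs
    // Cast<object>() (or OfType<T>()) before other LINQ operators apply.
    public static int CountStrings(ArrayList items)
    {
        return items.Cast<object>().Where(x => x is string).Count();
    }

    public static void Main()
    {
        var list = new List<object> { 1, "two", 3.0, "four" };
        var arrayList = new ArrayList { 1, "two", 3.0, "four" };

        Console.WriteLine(CountStrings(list));      // 2
        Console.WriteLine(CountStrings(arrayList)); // 2
    }
}
```

Both calls give the same answer; the only difference is the extra Cast<object>() hop that ArrayList requires.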
One big benefit to using List<object> is that these days most code is written to use the generic classes/interfaces. I suspect that these days most people would write a method that takes an IList<object> instead of an IList. Since ArrayList doesn't implement IList<object>, you wouldn't be able to use an ArrayList in these scenarios.
I tend to think of the non-generic classes/interfaces as legacy code and avoid them whenever possible.
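To make the interface argument concrete, here is a small sketch. Describe is a hypothetical method standing in for any API written against IList<object>; it is not taken from the original answer.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

public static class GenericInterfaceDemo
{
    // A hypothetical API written against the generic interface.
    public static string Describe(IList<object> items)
    {
        return items.Count + " item(s), first = " + items[0];
    }

    public static void Main()
    {
        var list = new List<object> { "a", 2 };
        var arrayList = new ArrayList { "a", 2 };

        Console.WriteLine(Describe(list)); // 2 item(s), first = a

        // Describe(arrayList); // does not compile: ArrayList is not an IList<object>

        // The ArrayList has to be copied into a generic list first.
        Console.WriteLine(Describe(arrayList.Cast<object>().ToList())); // 2 item(s), first = a
    }
}
```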
In this case (ArrayList vs. List<Object>), you won't notice any difference in speed. There might be some differences in the actual methods available on each of these, particularly in .NET 3.5 when counting extension methods, but that has more to do with ArrayList being somewhat deprecated than anything else.
Yes, besides being typesafe, generic collections might be actually faster.
From the MSDN (http://msdn.microsoft.com/en-us/library/system.collections.generic.aspx)
The System.Collections.Generic namespace contains interfaces and classes that define generic collections, which allow users to create strongly typed collections that provide better type safety and performance than non-generic strongly typed collections.
Do some benchmarking and you will know what performs best. I guesstimate that the difference is very small.
List<> is a typesafe version of ArrayList. It will guarantee that you will get the same object type in the collection.
Related
I'm trying to understand the design decision behind this part of the language. I admit I'm very new to it all, but this is something which caught me out initially, and I was wondering if I'm missing an obvious reason. Consider the following code:
List<int> MyList = new List<int>() { 5, 4, 3, 2, 1 };
int[] MyArray = {5,4,3,2,1};
//Sort the list
MyList.Sort();
//This was an instance method
//Sort the Array
Array.Sort(MyArray);
//This was a static method
Why are they not both implemented in the same way - intuitively to me it would make more sense if they were both instance methods?
The question is interesting because it reveals details of the .NET type system. Like value types, string and delegate types, array types get special treatment in .NET. The most notable oddish behavior is that you never explicitly declare an array type. The compiler takes care of it for you with ample helpings of the jitter. System.Array is an abstract type, you'll get dedicated array types in the process of writing code. Either by explicitly creating a type[] or by using generic classes that have an array in their base implementation.
In a largish program, having hundreds of array types is not unusual. Which is okay, but there's overhead involved for each type. It is storage required for just the type, not the objects of it. The biggest chunk of it is the so-called 'method table'. In a nutshell, it is a list of pointers to each instance method of the type. Both the class loader and the jitter work together to fill this table. This is commonly known as the 'v-table' but isn't quite a match, the table contains pointers to methods that are both non-virtual and virtual.
You can perhaps see where this leads: the designers were worried about having lots of types with big method tables, so they looked for ways to cut down on the overhead.
Array.Sort() was an obvious target.
The same issue is not relevant for generic types. A big nicety of generics, one of many, is that one method table can handle the method pointers for any type parameter of a reference type.
You are comparing two different types of 'object containers':
MyList is a generic collection of type List<int>. List<T> is a wrapper class that represents a strongly typed list of objects, and the List class itself provides methods to search, sort, and manipulate its contained objects.
MyArray is a basic data structure of type Array. The Array does not provide the same rich set of methods as the List. Arrays can at the same time be single-dimensional, multidimensional or jagged, whilst Lists out of the box only are single-dimensional.
Take a look at this question, it provides a richer discussion about these data types: Array versus List<T>: When to use which?
Without asking someone who was involved in the design of the original platform it's hard to know. But, here's my guess.
In older languages, like C, arrays are dumb data structures - they have no code of their own. Instead, they're manipulated by outside methods. As you move into an object-oriented framework, the closest equivalent is a dumb object (with minimal methods) manipulated by static methods.
So, my guess is that the implementation of .NET Arrays is more a symptom of C style thinking in the early days of development than anything else.
This likely has to do with inheritance. The Array class cannot be manually derived from. But oddly, you can declare an array of anything at all and get an instance of System.Array that is strongly typed, even before generics allowed you to have strongly typed collections. Array seems to be one of those magic parts of the framework.
Also notice that none of the instance methods provided on an array massively modify the array. SetValue() seems to be the only one that changes anything. The Array class itself provides many static methods that can change the content of the array, like Reverse() and Sort(). Not sure if that's significant - maybe someone here can give some background as to why that's the case.
In contrast, List<T> (which wasn't around in the 1.0 framework days) and classes like ArrayList (which was around back then) are just run-of-the-mill classes with no special meaning within the framework. They provide a common .Sort() instance method so that when you inherited from these classes, you'd get that functionality or could override it.
However, these kinds of sort methods have gone out of vogue anyway as extension methods like Linq's .OrderBy() style sorting have become the next evolution. You can query and sort arrays and Lists and any other enumerable object with the same mechanism now, which is really, really nice.
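A short sketch of that last point, reusing the list and array from the question. SortedCopy is a hypothetical helper introduced only for this example; OrderBy and ToArray are the standard LINQ methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SortingDemo
{
    // Works for arrays, lists, and any other IEnumerable<int>.
    public static int[] SortedCopy(IEnumerable<int> source)
    {
        return source.OrderBy(x => x).ToArray();
    }

    public static void Main()
    {
        var myList = new List<int> { 5, 4, 3, 2, 1 };
        int[] myArray = { 5, 4, 3, 2, 1 };

        // LINQ: the same call for both; the originals are left untouched.
        Console.WriteLine(string.Join(",", SortedCopy(myList)));  // 1,2,3,4,5
        Console.WriteLine(string.Join(",", SortedCopy(myArray))); // 1,2,3,4,5

        // The in-place sorts from the question, for comparison.
        myList.Sort();       // instance method
        Array.Sort(myArray); // static method
        Console.WriteLine(myList[0] + " " + myArray[0]); // 1 1
    }
}
```

Note that OrderBy returns a new sorted sequence, whereas List<T>.Sort and Array.Sort reorder the existing collection.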
-- EDIT --
The other, more cynical answer may just be - that's how Java did it so Microsoft did it the same way in the 1.0 version of the framework since at that time they were busy playing catch-up.
One reason might be because Array.Sort was designed in .NET 1.0, which had no generics.
I'm not sure, but I'm thinking maybe just so that arrays are as close to Primitives as they can be.
Recently I asked a question on SO that mentioned the possible use of a C# ArrayList for a solution. A comment was made that using an ArrayList is bad. I would like to know more about this. I have never heard this statement before about ArrayLists.
Could somebody bring me up to speed on the possible performance problems with using ArrayLists?
The main problem with ArrayList is that it uses object - it means you have to cast to and from whatever you are encapsulating. It is a remnant of the days before generics and is probably around for backwards compatibility only.
You do not have the type safety with ArrayList that you have with a generic list. The performance issue is in the need to cast objects back to the original (or have implicit boxing happen).
Implicit boxing will happen whenever you use a value type - it will be boxed when put into the ArrayList and unboxed when referenced.
The issue is not just that of performance, but also of readability and correctness. Since generics came in, this object has become obsolete and would only be needed in .NET 1.0/1.1 code.
If you're storing a value type (int, float, double, etc - or any struct), ArrayList will cause boxing on every storage and unboxing on every element access. This can be a significant hit to performance.
In addition, there is a complete lack of type safety with ArrayList. Since everything is stored as "object", there's an extra burden on you, as a developer, to keep it safe.
In addition, if you want the behavior of storing objects, you can always use List<object>. There is no disadvantage to this over ArrayList, and it has one large (IMO) advantage: It makes your intent (storing an untyped object) clear from the start.
ArrayList really only exists, and should only be used, for .NET 1.1 code. There really is no reason to use it in .NET 2+.
ArrayList is not a generic type so it must store all items you place in it as objects. This is bad for two reasons. First, when putting value types in the ArrayList you force the compiler to box the value type into a reference type which can be costly. Second, you now have to cast everything you pull out of the array list. This is bad since you now need to make sure you know what objects are in there.
List avoids these issues since it is constructed with the proper type.
For example:
List<int> ints = new List<int>();
ints.Add(5); //no boxing
int num = ints[0]; // no casting
The generic List<T> is preferred since it is generic, which provides additional type information and removes the need to box/unbox value types added to it.
In addition to the performance issues, it's a matter of moving errors from runtime to compile time. Casting objects retrieved from ArrayLists must happen at runtime, and any type errors will happen during execution. Using a generic List<> all types are checked during compile time.
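The runtime-versus-compile-time difference can be sketched as follows. TrySum is a hypothetical helper written for this illustration; it assumes the ArrayList was meant to hold only ints.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class TypeSafetyDemo
{
    // Sums an ArrayList that is assumed to hold only ints. Returns null
    // if a non-int element makes the unboxing cast fail at runtime.
    public static int? TrySum(ArrayList items)
    {
        try
        {
            int total = 0;
            foreach (object item in items)
            {
                total += (int)item; // unboxing cast, checked only at runtime
            }
            return total;
        }
        catch (InvalidCastException)
        {
            return null;
        }
    }

    public static void Main()
    {
        var untyped = new ArrayList { 1, 2, "three" }; // compiles; the ints are boxed
        Console.WriteLine(TrySum(untyped).HasValue);   // False

        var typed = new List<int> { 1, 2 };
        // typed.Add("three"); // rejected by the compiler
        int sum = 0;
        foreach (int n in typed) sum += n; // no cast, no unboxing
        Console.WriteLine(sum); // 3
    }
}
```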
All the boxing and unboxing can be expensive and fragile. Microsoft made some nice improvements in terms of typing and performance with .NET 2.0 generics.
Here are some good reads:
Boxing and Unboxing of Value Types : What You Need to Know? at http://www.c-sharpcorner.com/uploadfile/stuart_fujitani/boxnunbox11192005055746am/boxnunbox.aspx
Performance: ArrayList vs List<> at http://allantech.blogspot.com/2007/03/performance-arraylist-vs-list.html
I am running through some tests about using ArrayLists and List.
Speed is very important in my app.
I have tested creating 10000 records in each, finding an item by index and then updating that object for example:
List[i] = newX;
Using the arraylist seems much faster. Is that correct?
UPDATE:
For my List<T> approach, I am using LINQ to find the index and then the List[i] assignment, e.g.:
....
int index = base.FindIndex(x => x.AlpaNumericString == "PopItem");
base[index] = UpdatedItem;
It is definitely slower than
int index = ArrayList.IndexOf("PopItem");
base[index] = UpdatedItem;
A generic List (List<T>) should always be quicker than an ArrayList.
Firstly, an ArrayList is not strongly-typed and accepts types of object, so if you're storing value types in the ArrayList, they are going to be boxed and unboxed every time they are added or accessed.
A Generic List can be defined to accept only (say) int's so therefore no boxing or unboxing needs to occur when adding/accessing elements of the list.
If you're dealing with reference types, you're probably still better off with a Generic List over an ArrayList, since although there's no boxing/unboxing going on, your Generic List is type-safe, and there will be no implicit (or explicit) casts required when retrieving your strongly-typed object from the ArrayList's "collection" of object types.
There may be some edge-cases where an ArrayList is faster performing than a Generic List, however, I (personally) have not yet come across one. Even the MSDN documentation states:
Performance Considerations
In deciding whether to use the List<T> or ArrayList class, both of which have similar functionality, remember that the List<T> class performs better in most cases and is type safe. If a reference type is used for type T of the List<T> class, the behavior of the two classes is identical. However, if a value type is used for type T, you need to consider implementation and boxing issues.
If a value type is used for type T, the compiler generates an implementation of the List<T> class specifically for that value type. That means a list element of a List<T> object does not have to be boxed before the element can be used, and after about 500 list elements are created the memory saved not boxing list elements is greater than the memory used to generate the class implementation.
Make certain the value type used for type T implements the IEquatable<T> generic interface. If not, methods such as Contains must call the Object.Equals(Object) method, which boxes the affected list element. If the value type implements the IComparable interface and you own the source code, also implement the IComparable<T> generic interface to prevent the BinarySearch and Sort methods from boxing list elements. If you do not own the source code, pass an IComparer<T> object to the BinarySearch and Sort methods.
Moreover, I particularly like the very last section of that paragraph, which states:
It is to your advantage to use the type-specific implementation of the List<T> class instead of using the ArrayList class or writing a strongly typed wrapper collection yourself. The reason is your implementation must do what the .NET Framework does for you already, and the common language runtime can share Microsoft intermediate language code and metadata, which your implementation cannot.
Touché! :)
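As a hedged sketch of the IEquatable<T> advice in the quoted documentation: GridPoint below is a made-up value type, not something from the question. With the generic interface implemented, List<GridPoint>.Contains can compare elements through the typed Equals method instead of Object.Equals(Object).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value type used only for this illustration.
public struct GridPoint : IEquatable<GridPoint>
{
    public readonly int X;
    public readonly int Y;

    public GridPoint(int x, int y) { X = x; Y = y; }

    // Typed comparison: no boxing of either operand.
    public bool Equals(GridPoint other) { return X == other.X && Y == other.Y; }

    // Keep the object-based overloads consistent with the typed one.
    public override bool Equals(object obj) { return obj is GridPoint && Equals((GridPoint)obj); }
    public override int GetHashCode() { return (X * 397) ^ Y; }
}

public static class EquatableDemo
{
    public static bool HasOrigin(List<GridPoint> points)
    {
        // Contains uses the default equality comparer for GridPoint,
        // which picks up the IEquatable<GridPoint> implementation.
        return points.Contains(new GridPoint(0, 0));
    }

    public static void Main()
    {
        var points = new List<GridPoint> { new GridPoint(1, 2), new GridPoint(0, 0) };
        Console.WriteLine(HasOrigin(points)); // True
    }
}
```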
Based on your recent edit it seems as though you're not performing a 1:1 comparison here. In the List you have a class object and you're looking for the index based on a property, whereas in the ArrayList you just store the values of that property. If so, this is a severely flawed comparison.
To make it a 1:1 comparison you would add the values to the list only, not the class. Or, you would add the class items to the ArrayList. The former would allow you to use IndexOf on both collections. The latter would entail looping through your entire ArrayList and comparing each item till a match was found (and you could do the same for the List), or overriding object.Equals since ArrayList uses that for comparison.
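A like-for-like comparison along those lines might look like the sketch below. The Record class is a hypothetical stand-in for the asker's type; only the property name AlpaNumericString comes from the question. Both collections hold the same objects and both searches compare the same property.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical stand-in for the asker's class.
public sealed class Record
{
    public string AlpaNumericString { get; private set; }
    public Record(string value) { AlpaNumericString = value; }
}

public static class FairComparisonDemo
{
    // List<T>: search by predicate.
    public static int FindInList(List<Record> items, string key)
    {
        return items.FindIndex(x => x.AlpaNumericString == key);
    }

    // ArrayList holding the same objects: loop, cast, compare.
    public static int FindInArrayList(ArrayList items, string key)
    {
        for (int i = 0; i < items.Count; i++)
        {
            if (((Record)items[i]).AlpaNumericString == key) return i;
        }
        return -1;
    }

    public static void Main()
    {
        var list = new List<Record> { new Record("A"), new Record("PopItem") };
        var arrayList = new ArrayList(list); // same objects, untyped container

        Console.WriteLine(FindInList(list, "PopItem"));           // 1
        Console.WriteLine(FindInArrayList(arrayList, "PopItem")); // 1
    }
}
```

Timing these two methods against each other would measure the containers rather than the difference between a property lookup and a plain string search.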
For an interesting read, I suggest taking a look at Rico Mariani's post: Performance Quiz #7 -- Generics Improvements and Costs -- Solution. Even in that post Rico also emphasizes the need to benchmark different scenarios. No blanket statement is issued about ArrayLists, although the general consensus is to use generic lists for performance, type safety, and having a strongly typed collection.
Another related article is: Why should I use List and not ArrayList.
ArrayList seems faster? According to the documentation ( http://msdn.microsoft.com/en-us/library/6sh2ey19.aspx ) List should be faster when using a value type, and the same speed when using a reference type. ArrayList is slower with value types because it needs to box/unbox the values when you're accessing them.
I would expect them to be about the same if they are reference types. There is an extra cast/type-check for ArrayList, but nothing huge. Of course, List<T> should be preferred. If speed is the primary concern (which it almost always isn't, at least not in this way), then you might also want to profile an array (T[]) - harder (=more expensive) to add/remove, of course - but if you are just querying/assigning by index, it should be the fastest. I have had to resort to arrays for some very localised performance critical work, but 99.95% of the time this is overkill and should be avoided.
For example, for any of the 3 approaches (List<T>/ArrayList/T[]) I would expect the assignment cost to be insignificant to the cost of newing up the new instance to put into the storage.
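If you want to measure this yourself, a crude timing harness could look like the following. This is only a sketch: the record count of 10000 comes from the question, the repeat count is an arbitrary assumption, and a Stopwatch loop like this is no substitute for a proper profiler or benchmarking tool. No expected numbers are given because they depend on the machine and runtime.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

public static class IndexAssignmentTiming
{
    // Runs the action once and returns how long it took.
    public static TimeSpan Time(Action action)
    {
        var sw = Stopwatch.StartNew();
        action();
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        const int n = 10000;     // record count from the question
        const int rounds = 1000; // arbitrary repeat count (assumption)

        var list = new List<string>(n);
        var arrayList = new ArrayList(n);
        var array = new string[n];
        for (int i = 0; i < n; i++)
        {
            list.Add("x");
            arrayList.Add("x");
            array[i] = "x";
        }

        // Same work for each container: assign every slot by index.
        Console.WriteLine("List<T>:   " + Time(() =>
        {
            for (int r = 0; r < rounds; r++)
                for (int i = 0; i < n; i++) list[i] = "y";
        }));
        Console.WriteLine("ArrayList: " + Time(() =>
        {
            for (int r = 0; r < rounds; r++)
                for (int i = 0; i < n; i++) arrayList[i] = "y";
        }));
        Console.WriteLine("T[]:       " + Time(() =>
        {
            for (int r = 0; r < rounds; r++)
                for (int i = 0; i < n; i++) array[i] = "y";
        }));
    }
}
```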
Marc Gravell touched on this in his answer - I think it needs to be stressed.
It is usually a waste of time to prematurely optimize your code!
A better approach is to do a simple, well designed first implementation, and test it with anticipated real world data loads.
Often, you will find that it's "fast enough". (It helps to start out with a clear definition of "fast enough" - e.g. "Must be able to find a single CD in a 10,000 CD collection in 3 seconds or less")
If it's not, put a profiler on it. Almost invariably, the bottleneck will NOT be where you expect.
(I learned this the hard way when I brought a whole app to its knees with a single badly chosen string concatenation)
What's the preferred container type when returning multiple objects of the same type from a function?
Is it against good practice to return a simple array (like MyType[]), or should you wrap it in some generic container (like ICollection<MyType>)?
Thanks!
Eric Lippert has a good article on this. In case you can't be bothered to read the entire article, the answer is: return the interface.
Return an IEnumerable<T> using a yield return.
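A minimal sketch of what that looks like; Evens is a hypothetical method chosen only to show the shape of an iterator.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class YieldDemo
{
    // Lazily yields the even numbers from the input. Nothing runs until
    // the caller starts enumerating the result.
    public static IEnumerable<int> Evens(IEnumerable<int> source)
    {
        foreach (int n in source)
        {
            if (n % 2 == 0) yield return n;
        }
    }

    public static void Main()
    {
        IEnumerable<int> evens = Evens(new[] { 1, 2, 3, 4, 5, 6 });

        // A caller that only needs to iterate can do so directly.
        foreach (int n in evens) Console.WriteLine(n); // 2, 4, 6 on separate lines

        // A caller that needs a list can materialize one.
        List<int> asList = evens.ToList();
        Console.WriteLine(asList.Count); // 3
    }
}
```

The caller decides whether to pay for a list; the method itself stays free to change how the sequence is produced.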
I would return an IList<T> as that gives the consumer of your function the greatest flexibility. That way if the consumer of your function only needed to enumerate the sequence they can do so, but if they want to use the sequence as a list they can do that as well.
My general rule of thumb is to accept the least restrictive type as a parameter and return the richest type I can. This is, of course, a balancing act as you don't want to lock yourself into any particular interface or implementation (but always, always try to use an interface).
This is the least presumptuous approach that you, the API developer, can take. It is not up to you to decide how a consumer of your function will use what they send you - that is why you would return an IList<T> in this case as to give them the greatest flexibility. Also for this same reason you would never presume to know what type of parameter a consumer will send you. If you only need to iterate a sequence sent to you as a parameter then make the parameter an IEnumerable<T> rather than a List<T>.
EDIT (monoxide): Since it doesn't look like the question is going to be closed, I just want to add a link from the other question about this: Why arrays are harmful
Why not List<T>?
From the Eric Lippert post mentioned by others, I thought I would highlight this:
If I need a sequence I’ll use IEnumerable<T>, if I need a mapping from contiguous numbers to data I’ll use a List<T>, if I need a mapping across arbitrary data I’ll use a Dictionary<K,V>, if I need a set I’ll use a HashSet<T>. I simply don’t need arrays for anything, so I almost never use them. They don’t solve a problem I have better than the other tools at my disposal.
A good piece of advice that I've oft heard quoted is this:
Be liberal in what you accept, precise in what you provide.
In terms of designing your API, I'd suggest you should be returning an Interface, not a concrete type.
Taking your example method, I'd rewrite it as follows:
public IList<object> Foo()
{
    List<object> retList = new List<object>();
    // Blah, blah, [snip]
    return retList;
}
The key is that your internal implementation choice - to use a List - isn't revealed to the caller, but you're returning an appropriate interface.
Microsoft's own guidelines on framework development recommend against returning specific types, favoring interfaces. (Sorry, I couldn't find a link for this)
Similarly, your parameters should be as general as possible - instead of accepting an array, accept an IEnumerable of the appropriate type. This is compatible with arrays as well as lists and other useful types.
Taking your example method again:
public IList<object> Foo(IEnumerable<object> bar)
{
    List<object> retList = new List<object>();
    // Blah, blah, [snip]
    return retList;
}
If the collection that is being returned is read-only, meaning you never want the elements in the collection to be changed, then use IEnumerable<T>. This is the most basic representation of a read-only sequence of immutable (at least from the perspective of the enumeration itself) elements.
If you want it to be a self-contained collection that can be changed, then use ICollection<T> or IList<T>.
For example, if you wanted to return the results of searching for a particular set of files, then return IEnumerable<FileInfo>.
However, if you wanted to expose the files in a directory, you would expose IList<FileInfo> or ICollection<FileInfo>, as it makes sense that you would want to possibly change the contents of the collection.
return ICollection<type>
The advantage to generic return types, is that you can change the underlying implementation without changing the code that uses it. The advantage to returning the specific type, is you can use more type specific methods.
Always return an interface type that presents the greatest amount of functionality to the caller. So in your case ICollection<YourType> ought to be used.
Something interesting to note is that the BCL developers actually got this wrong in some places in the .NET framework - see this Eric Lippert blog post for that story.
Why not IList<MyType>?
It supports direct indexing, which is the hallmark of an array, without removing the possibility of returning a List<MyType> some day. If you want to suppress this feature, you probably want to return IEnumerable<MyType>.
It depends on what you plan to do with the collection you're returning. If you're just iterating, or if you only want the user to iterate, then I agree with @Daniel, return IEnumerable<T>. If you actually want to allow list-based operations, however, I'd return IList<T>.
Use generics. It's easier to interoperate with other collections classes and the type system is more able to help you with potential errors.
The old style of returning an array was a crutch before generics.
Whatever makes your code more readable, maintainable and easier for YOU.
I would have used the simple array, simpler==better most of the time.
Although I really have to see the context to give the right answer.
There are big advantages to favouring IEnumerable over anything else, as this gives you the greatest implementation flexibility and allows you to use yield return or Linq operators for lazy implementation.
If the caller wants a List<T> instead they can simply call ToList() on whatever you returned, and the overall performance will be roughly the same as if you had created and returned a new List<T> from your method.
Array is harmful, but ICollection<T> is also harmful.
ICollection<T> cannot guarantee the object will be immutable.
My recommendation is to wrap the returning object with ReadOnlyCollection<T>
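A sketch of that recommendation, using a hypothetical Playlist class. List<T>.AsReadOnly() returns a ReadOnlyCollection<T> wrapper over the private list, and attempts to mutate it through ICollection<T> throw NotSupportedException.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical class used only for this illustration.
public sealed class Playlist
{
    private readonly List<string> _tracks = new List<string> { "a", "b" };

    // Callers get a read-only view over the private list.
    public ReadOnlyCollection<string> Tracks
    {
        get { return _tracks.AsReadOnly(); }
    }

    public void Add(string track) { _tracks.Add(track); }
}

public static class ReadOnlyDemo
{
    // Returns true if mutating through the ICollection<T> interface is rejected.
    public static bool MutationIsRejected(ICollection<string> view)
    {
        try
        {
            view.Add("x");
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var playlist = new Playlist();
        ReadOnlyCollection<string> view = playlist.Tracks;

        Console.WriteLine(MutationIsRejected(view)); // True

        // The wrapper is a live view, not a copy: changes made by the
        // owner show up through it.
        playlist.Add("c");
        Console.WriteLine(view.Count); // 3
    }
}
```

The wrapper stops callers from changing the collection, but it does not freeze it; the owning class can still modify the underlying list.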
As the title says, when should I use List and when should I use ArrayList?
Thanks
The main time to use ArrayList is in .NET 1.1
Other than that, List<T> all the way (for your local T)...
For those (rare) cases where you don't know the type up-front (and can't use generics), even List<object> is more helpful than ArrayList (IMO).
You should always use List<TypeOfChoice> (introduced in .NET 2.0 with generics) since it is type-safe and faster than ArrayList (no unnecessary boxing/unboxing).
Only case I could think of where an ArrayList could be handy is if you need to interface with old stuff (.NET 1.1) or you need an array of objects of different type and you load up everything as object - but you could do the latter with List<Object> which is generally better.
Since List is a generic class, I would tend to always use List.
ArrayList is a .NET 1.x class (still available & valid though), but it is not 'typed'/generic, so you'll need to cast items from 'object' back to the desired type; whereas when using List, you don't have to do that.
Use List wherever possible. I can't see any use for ArrayList when the high-performing List exists.
ArrayList is an older .NET data structure. If you are using .NET 2.0 or above, always use List when the collection needs to hold items of the same type. Using List instead of ArrayList improves both performance and usability.
As others said, you should use the generic List almost always when you know the type (C# is a strongly typed language), and look at other options only when you are dealing with polymorphic/inherited classes or similar cases.
If you don't want to use LINQ queries, then you don't need to use List. If you do want to use them, then you should prefer List.
Generics were introduced in .NET 2.0. If you are using an earlier version of .NET, you can use ArrayList; otherwise, go with the generic List. ArrayList is effectively deprecated: it does not provide type safety and it causes boxing and unboxing overhead, whereas the generic List does not.