Efficiency: Creating an array of doubles incrementally? - c#

Consider the following code:
List<double> l = new List<double>();
//add unknown number of values to the list
l.Add(0.1); //assume we don't have these values ahead of time.
l.Add(0.11);
l.Add(0.1);
l.ToArray(); //ultimately we want an array of doubles
Anything wrong with this approach? Is there a more appropriate way to build an array, without knowing the size, or elements ahead of time?

There's nothing wrong with your approach. You are using the correct data type for the purpose.

After some observations you can get a better idea of the total elements in that list. Then you can create a new list with an initial capacity in the constructor:
List<double> l = new List<double>(capacity);
Other than this, it's the proper technique and data structure.
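A minimal sketch of the capacity hint described above. The helper name BuildSamples, the estimate of 1,000 and the generated values are assumptions made purely for illustration:

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Hypothetical helper: collects 'count' values into a list that was
    // pre-sized with 'estimatedCount', then copies them into an array.
    public static double[] BuildSamples(int count, int estimatedCount)
    {
        // The capacity is only a hint; the list still grows if it is exceeded.
        var values = new List<double>(estimatedCount);
        for (int i = 0; i < count; i++)
        {
            values.Add(i * 0.1);
        }
        return values.ToArray(); // array length equals Count, not Capacity
    }

    public static void Main()
    {
        double[] result = BuildSamples(1500, 1000);
        Console.WriteLine(result.Length); // 1500
    }
}
```

If the estimate is too low the code still works; the list simply falls back to its normal growth behaviour.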
UPDATE:
If you:
Need only the Add and ToArray functions of the List<T> structure,
And you can't really predict the total capacity
And you end up with more than 1K elements
And better performance is really really (really!) your goal
Then you might want to write your own interface:
public interface IArrayBuilder<T>
{
    void Add(T item);
    T[] ToArray();
}
And then write your own implementation, which might be better than List<T>. Why is that? Because List<T> holds a single array internally and increases its size when needed. Growing the inner array has a performance cost, since it allocates a new, larger array and copies the elements from the old array into it. However, if all of the conditions described above are true and all you need is to build an array, you don't really need all of the data to be stored in a single array internally.
I know it's a long shot, but I think it's better sharing such thoughts...
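One way such an implementation could look is sketched below. The class name ChunkedArrayBuilder and the default chunk size of 1024 are assumptions, not anything from the framework, and whether this actually beats List<T> would have to be measured:

```csharp
using System;
using System.Collections.Generic;

public interface IArrayBuilder<T>
{
    void Add(T item);
    T[] ToArray();
}

// Hypothetical implementation: stores items in fixed-size chunks, so growing
// never copies the stored elements (only the small list of chunk references
// may be reallocated). The elements are copied once, in ToArray.
public sealed class ChunkedArrayBuilder<T> : IArrayBuilder<T>
{
    private readonly int _chunkSize;
    private readonly List<T[]> _fullChunks = new List<T[]>();
    private T[] _current;
    private int _countInCurrent;

    public ChunkedArrayBuilder(int chunkSize = 1024)
    {
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        _chunkSize = chunkSize;
        _current = new T[chunkSize];
    }

    public void Add(T item)
    {
        if (_countInCurrent == _chunkSize)
        {
            // Current chunk is full: keep it and start a fresh one.
            _fullChunks.Add(_current);
            _current = new T[_chunkSize];
            _countInCurrent = 0;
        }
        _current[_countInCurrent++] = item;
    }

    public T[] ToArray()
    {
        var result = new T[_fullChunks.Count * _chunkSize + _countInCurrent];
        int offset = 0;
        foreach (T[] chunk in _fullChunks)
        {
            Array.Copy(chunk, 0, result, offset, _chunkSize);
            offset += _chunkSize;
        }
        Array.Copy(_current, 0, result, offset, _countInCurrent);
        return result;
    }
}

public static class ChunkedDemo
{
    public static void Main()
    {
        IArrayBuilder<double> builder = new ChunkedArrayBuilder<double>(4);
        for (int i = 0; i < 10; i++) builder.Add(i * 0.5);
        Console.WriteLine(builder.ToArray().Length); // 10
    }
}
```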

As others have already pointed out: This is the correct approach. I'll just add that if you can somehow avoid the array and use List<T> directly or perhaps IEnumerable<T>, you'll avoid copying the array as ToArray actually copies the internal array of the list instance.
Eric Lippert has a great post about arrays, that you may find relevant.

A dynamic data structure like a List is the correct way to implement this. Note that List<T> is backed by an array, so indexed access is O(1) for both; the only real advantage arrays have over a List is slightly lower per-access overhead and no unused spare capacity. The flexibility more than makes up for this small performance difference, imho.

Related

Arrays/Double Arrays vs Lists/Dictionaries [duplicate]

MyClass[] array;
List<MyClass> list;
What are the scenarios when one is preferable over the other? And why?
It is rare, in reality, that you would want to use an array. Definitely use a List<T> any time you want to add/remove data, since resizing arrays is expensive. If you know the data is fixed length, and you want to micro-optimise for some very specific reason (after benchmarking), then an array may be useful.
List<T> offers a lot more functionality than an array (although LINQ evens it up a bit), and is almost always the right choice. Except for params arguments, of course. ;-p
As a counter: List<T> is one-dimensional, whereas you can have rectangular (etc.) arrays like int[,] or string[,,]; but there are other ways of modelling such data (if you need) in an object model.
See also:
How/When to abandon the use of Arrays in c#.net?
Arrays, What's the point?
That said, I make a lot of use of arrays in my protobuf-net project; entirely for performance:
it does a lot of bit-shifting, so a byte[] is pretty much essential for encoding;
I use a local rolling byte[] buffer which I fill before sending down to the underlying stream (and v.v.); quicker than BufferedStream etc;
it internally uses an array-based model of objects (Foo[] rather than List<Foo>), since the size is fixed once built, and needs to be very fast.
But this is definitely an exception; for general line-of-business processing, a List<T> wins every time.
Really just answering to add a link which I'm surprised hasn't been mentioned yet: Eric Lippert's blog entry on "Arrays considered somewhat harmful."
You can judge from the title that it's suggesting using collections wherever practical - but as Marc rightly points out, there are plenty of places where an array really is the only practical solution.
Notwithstanding the other answers recommending List<T>, you'll want to use arrays when handling:
image bitmap data
other low-level data structures (e.g. network protocols)
Unless you are really concerned with performance, and by that I mean, "Why are you using .Net instead of C++?", you should stick with List<>. It's easier to maintain and does all the dirty work of resizing an array behind the scenes for you. (List<> is pretty smart about choosing array sizes, so it usually doesn't need to resize often.)
Arrays should be used in preference to List when the immutability of the collection itself is part of the contract between the client & provider code (not necessarily immutability of the items within the collection) AND when IEnumerable is not suitable.
For example,
var str = "This is a string";
var strChars = str.ToCharArray(); // returns array
It is clear that modification of "strChars" will not mutate the original "str" object, irrespective of implementation-level knowledge of "str"'s underlying type.
But suppose that
var str = "This is a string";
var strChars = str.ToCharList(); // returns List<char>
strChars.Insert(0, 'X');
In this case, it's not clear from that code snippet alone whether the Insert method will mutate the original "str" object. It requires implementation-level knowledge of String to make that determination, which breaks the Design by Contract approach. In the case of String, it's not a big deal, but it can be a big deal in almost every other case. Setting the List to read-only does help, but misuse then results in run-time errors, not compile-time errors.
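The run-time (rather than compile-time) failure of a read-only list can be seen in a short sketch. The class and method names here are illustrative only:

```csharp
using System;
using System.Collections.Generic;

public static class ReadOnlyDemo
{
    // Returns true if adding through the read-only wrapper fails at run time.
    public static bool AddFailsAtRunTime()
    {
        var chars = new List<char>("This is a string");
        IList<char> readOnly = chars.AsReadOnly();
        try
        {
            readOnly.Add('X'); // compiles, because IList<char> declares Add
            return false;
        }
        catch (NotSupportedException)
        {
            return true; // the wrapper rejects the change only when executed
        }
    }

    public static void Main()
    {
        Console.WriteLine(AddFailsAtRunTime()); // True
    }
}
```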
If I know exactly how many elements I'm going to need, say I need 5 elements and only ever 5 elements then I use an array. Otherwise I just use a List<T>.
Arrays vs. Lists is a classic maintainability vs. performance problem. The rule of thumb that nearly all developers follow is that you should shoot for both, but when they come into conflict, choose maintainability over performance. The exception to that rule is when performance has already proven to be an issue. If you carry this principle into Arrays vs. Lists, then what you get is this:
Use strongly typed lists until you hit performance problems. If you hit a performance problem, make a decision as to whether dropping out to arrays will benefit your solution with performance more than it will be a detriment to your solution in terms of maintenance.
Most of the time, using a List will suffice. A List uses an internal array to hold its data and automatically resizes that array when you add more elements than its current capacity allows, which makes it easier to use than an array, where you need to know the capacity beforehand.
See http://msdn.microsoft.com/en-us/library/ms379570(v=vs.80).aspx#datastructures20_1_topic5 for more information about Lists in C# or just decompile System.Collections.Generic.List<T>.
If you need multidimensional data (for example using a matrix or in graphics programming), you would probably go with an array instead.
As always, if memory or performance is an issue, measure it! Otherwise you could be making false assumptions about the code.
Another situation not yet mentioned is when one will have a large number of items, each of which consists of a fixed bunch of related-but-independent variables stuck together (e.g. the coordinates of a point, or the vertices of a 3d triangle). An array of exposed-field structures will allow its elements to be efficiently modified "in place", something which is not possible with any other collection type. Because an array of structures holds its elements consecutively in RAM, sequential accesses to array elements can be very fast. In situations where code will need to make many sequential passes through an array, an array of structures may outperform an array or other collection of class object references by a factor of 2:1; further, the ability to update elements in place may allow an array of structures to outperform any other kind of collection of structures.
Although arrays are not resizable, it is not difficult to have code store an array reference along with the number of elements that are in use, and replace the array with a larger one as required. Alternatively, one could easily write code for a type which behaved much like a List<T> but exposed its backing store, thus allowing one to say either MyPoints.Add(nextPoint); or MyPoints.Items[23].X += 5;. Note that the latter would not necessarily throw an exception if code tried to access beyond the end of the list, but usage would otherwise be conceptually quite similar to List<T>.
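The in-place update point can be illustrated with a small sketch. The Point3 structure and the method names are hypothetical, chosen only for this example:

```csharp
using System;
using System.Collections.Generic;

// Exposed-field structure, as described above (the name Point3 is illustrative).
public struct Point3
{
    public double X, Y, Z;
}

public static class InPlaceDemo
{
    public static double ShiftInArray()
    {
        var points = new Point3[100];
        points[23].X += 5; // modifies the element where it lives in the array
        return points[23].X;
    }

    public static double ShiftInList()
    {
        var points = new List<Point3>(new Point3[100]);
        // points[23].X += 5; // does not compile: the indexer returns a copy
        Point3 p = points[23]; // copy out
        p.X += 5;              // modify the copy
        points[23] = p;        // copy back
        return points[23].X;
    }

    public static void Main()
    {
        Console.WriteLine(ShiftInArray()); // 5
        Console.WriteLine(ShiftInList());  // 5
    }
}
```

Both methods end with the same value, but the list version has to copy the whole structure out and back again.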
Rather than going through a comparison of the features of each data type, I think the most pragmatic answer is "the differences probably aren't that important for what you need to accomplish, especially since they both implement IEnumerable, so follow popular convention and use a List until you have a reason not to, at which point you probably will have your reason for using an array over a List."
Most of the time in managed code you're going to want to favor collections being as easy to work with as possible over worrying about micro-optimizations.
Lists in .NET are wrappers over arrays, and use an array internally. The time complexity of operations on lists is the same as would be with arrays, however there is a little more overhead with all the added functionality / ease of use of lists (such as automatic resizing and the methods that come with the list class). Pretty much, I would recommend using lists in all cases unless there is a compelling reason not to do so, such as if you need to write extremely optimized code, or are working with other code that is built around arrays.
Since no one has mentioned it: in C#, an array is a list. MyClass[] and List<MyClass> both implement IList<MyClass>. (E.g. void Foo(IList<int> foo) can be called like Foo(new[] { 1, 2, 3 }) or Foo(new List<int> { 1, 2, 3 }).)
So, if you are writing a method that accepts a List<MyClass> as an argument but uses only a subset of its features, you may want to declare the parameter as IList<MyClass> instead, for callers' convenience.
Details:
Why array implements IList?
How do arrays in C# partially implement IList<T>?
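A short sketch of a method declared against IList<int>, callable with either an array or a List<int>. The method names are illustrative. It also shows the "partial" part: an array is fixed-size, so Add through the interface fails at run time:

```csharp
using System;
using System.Collections.Generic;

public static class IListDemo
{
    // Uses only Count and the indexer, so arrays and lists both work.
    public static int Sum(IList<int> items)
    {
        int total = 0;
        for (int i = 0; i < items.Count; i++)
        {
            total += items[i];
        }
        return total;
    }

    // Returns false when the underlying collection rejects Add.
    public static bool TryAdd(IList<int> items, int value)
    {
        try
        {
            items.Add(value);
            return true;
        }
        catch (NotSupportedException)
        {
            return false; // arrays are fixed-size
        }
    }

    public static void Main()
    {
        Console.WriteLine(Sum(new[] { 1, 2, 3 }));            // 6
        Console.WriteLine(Sum(new List<int> { 1, 2, 3 }));    // 6
        Console.WriteLine(TryAdd(new[] { 1, 2, 3 }, 4));      // False
        Console.WriteLine(TryAdd(new List<int> { 1, 2 }, 3)); // True
    }
}
```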
They may be unpopular, but I am a fan of Arrays in game projects.
- Iteration speed can be important in some cases, foreach on an Array has significantly less overhead if you are not doing much per element
- Adding and removing is not that hard with helper functions
- It's slower, but in cases where you only build it once it may not matter
- In most cases, less extra memory is wasted (only really significant with Arrays of structs)
- Slightly less garbage and pointers and pointer chasing
That being said, I use List far more often than Arrays in practice, but they each have their place.
It would be nice if List were a built-in type, so that the wrapper and enumeration overhead could be optimized out.
Populating a list is easier than populating an array. For an array you need to know the exact length of the data up front, whereas a list can hold any number of items. You can also convert a list into an array.
List<URLDTO> urls = new List<URLDTO>();
urls.Add(new URLDTO()
{
    key = "wiki",
    url = "https://...",
});
urls.Add(new URLDTO()
{
    key = "url",
    url = "http://...",
});
urls.Add(new URLDTO()
{
    key = "dir",
    url = "https://...",
});
// convert a list into an array: URLDTO[]
return urls.ToArray();
Keep in mind that with a List it is not possible to do this:
List<string> arr = new List<string>();
arr.Add("string a");
arr.Add("string b");
arr.Add("string c");
arr.Add("string d");
arr[10] = "new string";
The last line throws an ArgumentOutOfRangeException, because index 10 is beyond the list's Count.
With arrays, on the other hand, this works:
string[] strArr = new string[20];
strArr[0] = "string a";
strArr[1] = "string b";
strArr[2] = "string c";
strArr[3] = "string d";
strArr[10] = "new string";
But arrays do not resize automatically. You have to manage that manually or with the Array.Resize method.
A trick could be to initialize a List with an empty array.
List<string> arr = new List<string>(new string[100]);
arr[10] = "new string";
But in this case, if you add a new element using the Add method, it will be appended at the end of the List (after the 100 pre-filled slots).
List<string> arr = new List<string>(new string[100]);
arr[10] = "new string";
arr.Add("bla bla bla"); // this will be in the end of List
It completely depends on the context in which the data structure is needed. For example, if you are creating items to be used by other functions or services, using a List is the perfect way to accomplish it.
Now, if you have a list of items and you just want to display them, say on a web page, an array is a suitable container.

List<T> vs HashSet<T> - dynamic collection choice is efficient or not?

var usedIds = list.Count > 20 ? new HashSet<int>() as ICollection<int> : new List<int>();
Assuming that List is more performant with 20 or fewer items and HashSet is more performant with a greater number of items (from this post), is it an efficient approach to use different collection types dynamically, based on the predicted item count?
All of the actions for each of the collection types will be the same.
PS: I have also found the HybridCollection class, which seems to do the same thing automatically, but I've never used it so I have no info on its performance either.
EDIT: My collection is mostly used as a buffer with many inserts and gets.
In theory, it could be, depending on how many and what type of operations you are performing on the collections. In practice, it would be a pretty rare case where such micro-optimization would justify the added complexity.
Also consider what type of data you are working with. If you are using int as the collection item as the first line of your question suggests, then the threshold is going to be quite a bit less than 20 where List is no longer faster than HashSet for many operations.
In any case, if you are going to do that, I would create a new collection class to handle it, something along the lines of the HybridDictionary, and expose it to your user code with some generic interface like IDictionary.
And make sure you profile it to be sure that your use case actually benefits from it.
There may even be a better option than either of those collections, depending on what exactly it is you are doing. E.g. if you are doing a lot of "before or after" inserts and traversals, then LinkedList might work better for you.
Hashtables like HashSet<T> and Dictionary<K,T> are faster at searching and inserting items in any order.
Arrays T[] are best used if you always have a fixed size and a lot of indexing operations. Storing reference-type items into an array can carry a small extra cost, because array covariance in C# requires a run-time type check on each write.
List<T> is best used for dynamically sized collections with indexing operations.
I don't think it is a good idea to write something like the hybrid collection; it is better to use a collection that fits your requirements. If you have a buffer with a lot of index-based operations I would not suggest a hashtable; as somebody already noted, a hashtable by design uses more memory.
HashSet is better for fast lookups, while List is better for inserts. If you don't plan on adding new items, use HashSet; otherwise use List.
If your collection is very small then the performance is virtually always going to be a non-issue. If you know that n is always less than 20, O(n) is, by definition, O(1). Everything is fast for small n.
Use the data structure that most appropriately represents how you are conceptually treating the data, the type of operations that you need to perform, and the type of operations that should be most efficient.
is it efficient approach to use different collection types dynamicaly based on the predictable items count?
It can be, depending on what you mean by "efficiency" (MS offers the HybridDictionary class for that, though unfortunately it is non-generic). But irrespective of that, it's mostly a bad choice. I will explain both.
From an efficiency standpoint:
Addition will generally be faster in a List<T>, since a HashSet<T> has to compute the hash code and check for an existing entry before storing the item. Even though removal and lookup will be faster with a HashSet<T> as the size grows, addition to the end is where List<T> wins. You will have to decide which is more important to you.
HashSet<T> comes with a memory overhead compared to List<T>. See this for some illustration.
However, from a usability standpoint it may not make sense. A HashSet<T> is a set, unlike List<T>, which is a bag. They are very different, and their uses are very different. For example:
HashSet<T> cannot have duplicates.
HashSet<T> will not care about any order.
So when you return a hybrid ICollection<T>, your requirement goes like this: "It doesn't matter whether duplicates can be added or not. Sometimes let it be added, sometimes not. Of course iteration order is not important anyway" - very rarely useful.
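The semantic difference is easy to see when the same calls are made through ICollection<int>. This is a minimal sketch; the names are illustrative:

```csharp
using System;
using System.Collections.Generic;

public static class SetVsBagDemo
{
    // Performs the same three Add calls on whatever collection it is given.
    public static int CountAfterAdds(ICollection<int> ids)
    {
        ids.Add(7);
        ids.Add(7); // duplicate: kept by a List, ignored by a HashSet
        ids.Add(3);
        return ids.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountAfterAdds(new List<int>()));    // 3
        Console.WriteLine(CountAfterAdds(new HashSet<int>())); // 2
    }
}
```

Code that receives the hybrid collection therefore cannot rely on either behaviour.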
Good q, and +1.
HashSet gives you faster access to elements once the collection grows, but note that it will generally use more space than a List, not less.

What is the difference between LinkedList and ArrayList, and when to use which one?

What is the difference between LinkedList and ArrayList? How do I know when to use which one?
The difference is the internal data structure used to store the objects.
An ArrayList will use a system array (like Object[]) and resize it when needed. On the other hand, a LinkedList will use an object that contains the data and a pointer to the next and previous objects in the list.
Different operations will have different algorithmic complexity due to this difference in the internal representation.
Don't use either. Use System.Collections.Generic.List<T>.
That really is my recommendation. Probably independently of what your application is, but here's a little more color just in case you're doing something that needs a finely tuned choice here.
ArrayList and LinkedList are different implementations of the storage mechanism for a List. ArrayList uses an array that it must resize if your collection outgrows its current storage size. LinkedList, on the other hand, uses the linked list data structure from CS 201. LinkedList is better for some head- or tail-insert heavy workloads, but ArrayList is better for random access workloads.
ArrayList has a good replacement which is List<T>.
In general, List<T> is a wrapper for an array: it allows indexing and accessing items in O(1), but every time you exceed the capacity, an O(n) copy into a larger array must be paid for.
LinkedList<T> won't let you access items by index, but you can count on an insert always costing O(1). In addition, you can insert items at the beginning of the list and between existing items in O(1), provided you already hold a reference to the neighbouring node.
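A minimal sketch of those constant-time inserts using LinkedList<T>; the values and names are illustrative:

```csharp
using System;
using System.Collections.Generic;

public static class LinkedListDemo
{
    public static string Build()
    {
        var list = new LinkedList<int>();
        list.AddLast(1);
        list.AddLast(3);

        // Insert between existing items: O(1) because we already hold the node.
        LinkedListNode<int> first = list.First;
        list.AddAfter(first, 2);

        // Insert at the beginning: also O(1), no elements are shifted.
        list.AddFirst(0);

        return string.Join(",", list);
    }

    public static void Main()
    {
        Console.WriteLine(Build()); // 0,1,2,3
    }
}
```

Finding the node in the first place is still a linear walk if you do not already have a reference to it.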
I think that in most cases List<T> is the default choice. Many of the common scenarios don't require special order and have no strict complexity constraints, therefore List<T> is preferred due to its usage simplicity.
The main difference between ArrayList and List<T>, LinkedList<T>, and other similar generics is that ArrayList holds Objects, while the others hold a type that you specify (i.e. List<Point> holds only Points).
Because of this, you need to cast any object you take out of an ArrayList to its actual type. This can take a lot of screen space if you have long class names.
In general it's much better to use List<T> and other typed Generics unless you really need to have a list with multiple different types of objects in it.
The difference lies in the semantics of how the List interface* is implemented:
http://en.wikipedia.org/wiki/Arraylist and http://en.wikipedia.org/wiki/LinkedList
*Meaning the basic list operations
As @sblom has stated, use the generic counterparts of LinkedList and ArrayList. There's really no reason not to, and plenty of reasons to do so.
The List<T> implementation is effectively wrapping an array. Should the user attempt to insert elements beyond the bounds of the backing array, it will be copied to a larger array (at considerable expense, but transparently to users of the List<T>).
A LinkedList<T> has a completely different implementation in which data is held in LinkedListNode<T> instances, which carry reference to two other LinkedListNode<T> instances (or only one in the case of the head or tail of the list). No external reference to mid-list items is created. This means that iterating the list is fast, but random-access is slow, because one must iterate the nodes from one end or the other. The best reason to use a LinkedList is to allow for fast inserts, that involve simply changing the references held by the nodes, rather than rewriting the entire list to insert an item (as is the case with List<T>)
They have different performance on "inserts" (adding new elements) and lookups. For inserts, an ArrayList keeps an array internally (with a small initial capacity) and when you reach the max capacity it doubles the size of the array. A LinkedList starts empty and adds an item (node) when needed.
I think also that with ArrayList you are able to index the items, while with the LinkedList you have to "visit" the item from the head (or the LinkedList does this automatically for you).

A very basic auto-expanding list/array

I have a method which returns an array of fixed type objects (let's say MyObject).
The method creates a new empty Stack<MyObject>. Then, it does some work and pushes some number of MyObjects to the end of the Stack. Finally, it returns the Stack.ToArray().
It does not change already added items or their properties, nor remove them. The number of elements to add will cost performance. There is no need to sort/order the elements.
Is Stack the best thing to use? Or should I switch to Collection or List to ensure better performance and/or lower memory cost?
Stack<T> will not be any faster than List<T>.
For optimal performance, you should use a List<T> and set the Capacity to a number larger than or equal to the number of items you plan to add.
If the ordering doesn't matter and your method doesn't need to add/remove/edit items that have already been processed then why not return IEnumerable<MyObject> and just yield each item as you go?
Then your calling code can either use the IEnumerable<MyObject> sequence directly, or call ToArray, ToList etc as required.
For example...
// use the sequence directly
foreach (MyObject item in GetObjects())
{
    Console.WriteLine(item.ToString());
}
// ...
// convert to an array
MyObject[] myArray = GetObjects().ToArray();
// ...
// convert to a list
List<MyObject> myList = GetObjects().ToList();
// ...
public IEnumerable<MyObject> GetObjects()
{
    foreach (MyObject foo in GetObjectsFromSomewhereElse())
    {
        MyObject bar = DoSomeProcessing(foo);
        yield return bar;
    }
}
Stack<T> is not any faster than List<T> in this case, so I would probably use List, unless something about what you are doing is "stack-like". List<T> is the more standard data structure to use when what you want is basically a growable array, whereas stacks are usually used when you need LIFO behavior for the collection.
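One practical consequence of the LIFO behaviour mentioned above is that Stack<T>.ToArray returns the items in pop order, that is, the reverse of the order in which they were pushed, whereas List<T>.ToArray keeps insertion order. A small sketch with illustrative names:

```csharp
using System;
using System.Collections.Generic;

public static class OrderDemo
{
    public static int[] ViaStack()
    {
        var stack = new Stack<int>();
        stack.Push(1);
        stack.Push(2);
        stack.Push(3);
        return stack.ToArray(); // last pushed item comes first
    }

    public static int[] ViaList()
    {
        var list = new List<int>();
        list.Add(1);
        list.Add(2);
        list.Add(3);
        return list.ToArray(); // insertion order is preserved
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", ViaStack())); // 3,2,1
        Console.WriteLine(string.Join(",", ViaList()));  // 1,2,3
    }
}
```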
For this purpose, there aren't any other collections in the framework that will perform considerably better than a Stack<T>.
However, both Stack<T> and List<T> auto-grow their internal array of items when the initial capacity is exceeded. This involves creating a new, larger array and copying all items, which costs some performance.
If you know the number of items beforehand, initialize your collection to that capacity to avoid auto-growth. If you don't know exactly, choose a capacity that is unlikely to be insufficient.
Most of the built in collections take the initial capacity as a constructor argument:
var stack = new Stack<T>(200); // Initial capacity of 200 items.
Use a LinkedList maybe?
Though LinkedLists are only useful with sequential data.
You don't need Stack<> if all you're going to do is append. You can use List<>.Add (http://msdn.microsoft.com/en-us/library/d9hw1as6.aspx) and then ToArray.
(You'll also want to set initial capacity, as others have pointed out.)
If you need the semantics of a stack (last-in first-out), then the answer is, without any doubt, yes, a stack is your best solution. If you know from the start how many elements it will end up with, you can avoid the cost of automatic resizing by calling the constructor that receives a capacity.
If you're worried about the memory cost of copying the stack into an array, and you only need sequential access to the result, then, you can return the Stack<T> as an IEnumerable<T> instead of an array and iterate it with foreach.
All that said, unless this code proves it is problematic in terms of performance (i.e., by looking at data from a profiler), I wouldn't bother much and go with the semantics call.
