C# arrays with unknown size

I need class with array inside:
class FirstClass
{
int x = 0;
int y = 0;
}
class SecondClass
{
FirstClass[] firstclass = ???????? // size of array is unknown
OtherClass[][] otherclass = ???????? // i need also multidimensional array or array of arrays
}
Normally I use List<> for purposes like this, but today I need the fastest possible solution (there is a lot of data to process), and I think an array should be faster than a List (I've read somewhere that List is a lot slower than an array).
The question is: what do I put in place of "????????" in my code?
Edit:
My class is a neural network object. The biggest array will contain the learning data (100000+ data rows with 50+ fields, float or double datatype).
If an array is even 1% faster than a List, I need the array.

You cannot make an array until you know its exact size: unlike lists, arrays cannot grow*; once you give them a size, they keep that size forever. If you must have an array for processing, make a List<T> first, populate it, and then convert to an array with ToArray() method:
var firstClassList = new List<FirstClass>();
firstClassList.Add(new FirstClass(123)); // assumes FirstClass has a constructor taking an int
firstClassList.Add(new FirstClass(456));
FirstClass[] firstClass = firstClassList.ToArray();
var otherClassList = new List<List<OtherClass>>();
otherClassList.Add(new List<OtherClass>());
otherClassList[0].Add(new OtherClass(10));
otherClassList[0].Add(new OtherClass(11));
otherClassList.Add(new List<OtherClass>());
otherClassList[1].Add(new OtherClass(20));
otherClassList[1].Add(new OtherClass(21));
otherClassList[1].Add(new OtherClass(22));
// Select requires "using System.Linq;"
OtherClass[][] otherClass = otherClassList.Select(r => r.ToArray()).ToArray();
Note, however, that this is premature optimization: lists are very nearly as fast as arrays, and they offer a lot more flexibility. Replacing lists with arrays does not make sense, at least not before you profile.
* Calls of Array.Resize to change the size of the array do not count as "growing" the array, because you end up with a different array instance, and the one you were trying to "grow" gets thrown away immediately after being copied.
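The footnote about Array.Resize can be checked with a small sketch. The class and method names here (ResizeDemo, ResizeCreatesNewInstance) are made up for illustration; only Array.Resize and ReferenceEquals come from the framework.

```csharp
using System;

public static class ResizeDemo
{
    // Returns true when Array.Resize left the original instance untouched
    // and handed back a different, larger array.
    public static bool ResizeCreatesNewInstance()
    {
        int[] original = { 1, 2, 3 };
        int[] resized = original;       // second reference to the same array

        Array.Resize(ref resized, 6);   // allocates a new array and copies the 3 elements

        return !ReferenceEquals(original, resized)
            && original.Length == 3
            && resized.Length == 6
            && resized[2] == 3;
    }

    public static void Main()
    {
        Console.WriteLine(ResizeCreatesNewInstance());
    }
}
```

Any other code that still holds the old reference keeps seeing the old, shorter array, which is why this does not count as growing it.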

If you are unsure of the size of the Arrays, you may want to consider using a different type of collection, such as a List<FirstClass> as you mentioned or an ArrayList etc.
Performance-wise, until you have actually tested the data, you may be fine using a List and if you still have issues or bottlenecks, consider using an Array or some more efficient structure. (The run times shouldn't differ too greatly unless you are working with very large sets of data.)
(There is always the option to convert your lists to arrays as well.)

Why would Array be faster than a List? See this question for a comparison of the running times.
The only case I can think of where an Array would beat a list is if you needed to search for an element, and the array was sorted so you could do a binary search. However, since you can use [] to index List<T>s in C#, that's a moot point.

var list = new List<FirstClass>();
list.Add(someObject); // someObject is an existing FirstClass instance
FirstClass[] firstClass = list.ToArray();
var listOfLists = new List<List<SecondClass>>();
var item1 = new List<SecondClass>();
item1.Add(anotherObject); // anotherObject is an existing SecondClass instance
listOfLists.Add(item1);
// Select requires "using System.Linq;"
SecondClass[][] secondClass = listOfLists.Select(i => i.ToArray()).ToArray();

You can instantiate them like this, but only with an explicit size (new FirstClass[] without a size does not compile):
FirstClass[] firstclass = new FirstClass[100];
OtherClass[][] otherclass = new OtherClass[100][];
Arrays have no Add method, so assign elements by index instead:
firstclass[0] = new FirstClass();

Related

Arrays/Double Arrays vs Lists/Dictionaries [duplicate]

MyClass[] array;
List<MyClass> list;
What are the scenarios when one is preferable over the other? And why?

It is rare, in reality, that you would want to use an array. Definitely use a List<T> any time you want to add/remove data, since resizing arrays is expensive. If you know the data is fixed length, and you want to micro-optimise for some very specific reason (after benchmarking), then an array may be useful.
List<T> offers a lot more functionality than an array (although LINQ evens it up a bit), and is almost always the right choice. Except for params arguments, of course. ;-p
As a counter - List<T> is one-dimensional; whereas you can have rectangular (etc) arrays like int[,] or string[,,] - but there are other ways of modelling such data (if you need) in an object model.
See also:
How/When to abandon the use of Arrays in c#.net?
Arrays, What's the point?
That said, I make a lot of use of arrays in my protobuf-net project; entirely for performance:
it does a lot of bit-shifting, so a byte[] is pretty much essential for encoding;
I use a local rolling byte[] buffer which I fill before sending down to the underlying stream (and v.v.); quicker than BufferedStream etc;
it internally uses an array-based model of objects (Foo[] rather than List<Foo>), since the size is fixed once built, and needs to be very fast.
But this is definitely an exception; for general line-of-business processing, a List<T> wins every time.

Really just answering to add a link which I'm surprised hasn't been mentioned yet: Eric Lippert's blog entry on "Arrays considered somewhat harmful."
You can judge from the title that it's suggesting using collections wherever practical - but as Marc rightly points out, there are plenty of places where an array really is the only practical solution.

Notwithstanding the other answers recommending List<T>, you'll want to use arrays when handling:
image bitmap data
other low-level data structures (e.g. network protocols)

Unless you are really concerned with performance, and by that I mean, "Why are you using .Net instead of C++?" you should stick with List<>. It's easier to maintain and does all the dirty work of resizing an array behind the scenes for you. (If necessary, List<> is pretty smart about choosing array sizes so it doesn't need to usually.)

Arrays should be used in preference to List when the immutability of the collection itself is part of the contract between the client & provider code (not necessarily immutability of the items within the collection) AND when IEnumerable is not suitable.
For example,
var str = "This is a string";
var strChars = str.ToCharArray(); // returns array
It is clear that modification of "strChars" will not mutate the original "str" object, irrespective of implementation-level knowledge of "str"'s underlying type.
But suppose that
var str = "This is a string";
var strChars = str.ToCharList(); // returns List<char>
strChars.Insert(0, 'X');
In this case, it's not clear from that code-snippet alone if the insert method will or will not mutate the original "str" object. It requires implementation level knowledge of String to make that determination, which breaks Design by Contract approach. In the case of String, it's not a big deal, but it can be a big deal in almost every other case. Setting the List to read-only does help but results in run-time errors, not compile-time.

If I know exactly how many elements I'm going to need, say I need 5 elements and only ever 5 elements then I use an array. Otherwise I just use a List<T>.

Arrays Vs. Lists is a classic maintainability vs. performance problem. The rule of thumb that nearly all developers follow is that you should shoot for both, but when they come in to conflict, choose maintainability over performance. The exception to that rule is when performance has already proven to be an issue. If you carry this principle in to Arrays Vs. Lists, then what you get is this:
Use strongly typed lists until you hit performance problems. If you hit a performance problem, make a decision as to whether dropping out to arrays will benefit your solution with performance more than it will be a detriment to your solution in terms of maintenance.

Most of the times, using a List would suffice. A List uses an internal array to handle its data, and automatically resizes the array when adding more elements to the List than its current capacity, which makes it more easy to use than an array, where you need to know the capacity beforehand.
See http://msdn.microsoft.com/en-us/library/ms379570(v=vs.80).aspx#datastructures20_1_topic5 for more information about Lists in C# or just decompile System.Collections.Generic.List<T>.
If you need multidimensional data (for example using a matrix or in graphics programming), you would probably go with an array instead.
As always, if memory or performance is an issue, measure it! Otherwise you could be making false assumptions about the code.
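For the multidimensional case, here is a rough sketch of the two array shapes C# offers: a rectangular array (double[,]) and a jagged array of arrays (double[][]). MatrixDemo and its method names are invented for this example.

```csharp
using System;

public static class MatrixDemo
{
    // Rectangular array: a single block of rows * cols elements.
    public static double[,] MakeRectangular(int rows, int cols)
    {
        return new double[rows, cols];
    }

    // Jagged array: an outer array whose rows are allocated one by one,
    // so each row could in principle have a different length.
    public static double[][] MakeJagged(int rows, int cols)
    {
        var result = new double[rows][];
        for (int r = 0; r < rows; r++)
        {
            result[r] = new double[cols];
        }
        return result;
    }

    public static void Main()
    {
        double[,] rect = MakeRectangular(3, 4);
        double[][] jagged = MakeJagged(3, 4);

        rect[1, 2] = 0.5;     // comma-separated indices
        jagged[1][2] = 0.5;   // one indexer per level

        Console.WriteLine(rect.GetLength(0) + " x " + rect.GetLength(1));
        Console.WriteLine(jagged.Length + " x " + jagged[0].Length);
    }
}
```

Which of the two is faster depends on the access pattern and the runtime, so measure with your own data as suggested above.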

Another situation not yet mentioned is when one will have a large number of items, each of which consists of a fixed bunch of related-but-independent variables stuck together (e.g. the coordinates of a point, or the vertices of a 3d triangle). An array of exposed-field structures will allow its elements to be efficiently modified "in place"--something which is not possible with any other collection type. Because an array of structures holds its elements consecutively in RAM, sequential accesses to array elements can be very fast. In situations where code will need to make many sequential passes through an array, an array of structures may outperform an array or other collection of class object references by a factor of 2:1; further, the ability to update elements in place may allow an array of structures to outperform any other kind of collection of structures.
Although arrays are not resizable, it is not difficult to have code store an array reference along with the number of elements that are in use, and replace the array with a larger one as required. Alternatively, one could easily write code for a type which behaved much like a List<T> but exposed its backing store, thus allowing one to say either MyPoints.Add(nextPoint); or MyPoints.Items[23].X += 5;. Note that the latter would not necessarily throw an exception if code tried to access beyond the end of the list, but usage would otherwise be conceptually quite similar to List<T>.
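To make the "in place" point concrete, here is a minimal sketch comparing an array of structs with a List of the same struct. The Point struct and the StructArrayDemo class are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

// Exposed-field struct, as described above.
public struct Point
{
    public int X;
    public int Y;
}

public static class StructArrayDemo
{
    public static int ShiftInArray()
    {
        var points = new Point[3];
        points[1].X += 5;   // modifies the element stored inside the array itself
        return points[1].X;
    }

    public static int ShiftInList()
    {
        var points = new List<Point> { new Point(), new Point(), new Point() };
        // points[1].X += 5; does not compile (CS1612): the List indexer returns a copy.
        Point p = points[1];   // copy out
        p.X += 5;              // modify the copy
        points[1] = p;         // copy back
        return points[1].X;
    }

    public static void Main()
    {
        Console.WriteLine(ShiftInArray());
        Console.WriteLine(ShiftInList());
    }
}
```

Both methods end up with X equal to 5, but the List version has to copy the struct out and back for every update.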

Rather than going through a comparison of the features of each data type, I think the most pragmatic answer is "the differences probably aren't that important for what you need to accomplish, especially since they both implement IEnumerable, so follow popular convention and use a List until you have a reason not to, at which point you probably will have your reason for using an array over a List."
Most of the time in managed code you're going to want to favor collections being as easy to work with as possible over worrying about micro-optimizations.

Lists in .NET are wrappers over arrays, and use an array internally. The time complexity of operations on lists is the same as would be with arrays, however there is a little more overhead with all the added functionality / ease of use of lists (such as automatic resizing and the methods that come with the list class). Pretty much, I would recommend using lists in all cases unless there is a compelling reason not to do so, such as if you need to write extremely optimized code, or are working with other code that is built around arrays.

Since no one has mentioned it: in C#, an array is a list. MyClass[] and List<MyClass> both implement IList<MyClass>. (e.g. void Foo(IList<int> foo) can be called like Foo(new[] { 1, 2, 3 }) or Foo(new List<int> { 1, 2, 3 }) )
So, if you are writing a method that accepts a List<MyClass> as an argument, but uses only subset of features, you may want to declare as IList<MyClass> instead for callers' convenience.
Details:
Why array implements IList?
How do arrays in C# partially implement IList<T>?
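A small sketch of what that looks like in practice. IListDemo, Sum and ArrayRejectsAdd are names invented here; the point is that one method signature accepts both an array and a List<int>, while the "partial" implementation shows up when Add is called on an array.

```csharp
using System;
using System.Collections.Generic;

public static class IListDemo
{
    // Accepts int[] as well as List<int>, since both implement IList<int>.
    public static int Sum(IList<int> values)
    {
        int total = 0;
        for (int i = 0; i < values.Count; i++)
        {
            total += values[i];
        }
        return total;
    }

    // Arrays are fixed-size, so Add through the interface throws NotSupportedException.
    public static bool ArrayRejectsAdd()
    {
        IList<int> asList = new[] { 1, 2, 3 };
        try
        {
            asList.Add(4);
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(Sum(new[] { 1, 2, 3 }));
        Console.WriteLine(Sum(new List<int> { 1, 2, 3 }));
        Console.WriteLine(ArrayRejectsAdd());
    }
}
```

If a method only reads, IReadOnlyList<T> (also implemented by both) may describe the contract more precisely than IList<T>.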

They may be unpopular, but I am a fan of Arrays in game projects.
- Iteration speed can be important in some cases, foreach on an Array has significantly less overhead if you are not doing much per element
- Adding and removing is not that hard with helper functions
- It's slower, but in cases where you only build it once it may not matter
- In most cases, less extra memory is wasted (only really significant with Arrays of structs)
- Slightly less garbage and pointers and pointer chasing
That being said, I use List far more often than Arrays in practice, but they each have their place.
It would be nice if List were a built-in type so that they could optimize out the wrapper and enumeration overhead.

Populating a list is easier than an array. For arrays, you need to know the exact length of data, but for lists, data size can be any. And, you can convert a list into an array.
List<URLDTO> urls = new List<URLDTO>();
urls.Add(new URLDTO() {
key = "wiki",
url = "https://...",
});
urls.Add(new URLDTO()
{
key = "url",
url = "http://...",
});
urls.Add(new URLDTO()
{
key = "dir",
url = "https://...",
});
// convert a list into an array: URLDTO[]
return urls.ToArray();

Keep in mind that with a List it is not possible to do this:
List<string> arr = new List<string>();
arr.Add("string a");
arr.Add("string b");
arr.Add("string c");
arr.Add("string d");
arr[10] = "new string";
It throws an ArgumentOutOfRangeException, because index 10 is beyond the list's current Count.
Instead with arrays:
string[] strArr = new string[20];
strArr[0] = "string a";
strArr[1] = "string b";
strArr[2] = "string c";
strArr[3] = "string d";
strArr[10] = "new string";
Arrays, however, do not resize automatically. You have to manage that manually or with the Array.Resize method.
One trick is to initialize a List from an empty array:
List<string> arr = new List<string>(new string[100]);
arr[10] = "new string";
But in this case, if you add a new element using the Add method, it is appended to the end of the List, after the 100 pre-filled slots.
List<string> arr = new List<string>(new string[100]);
arr[10] = "new string";
arr.Add("bla bla bla"); // this will be in the end of List

It depends entirely on the context in which the data structure is needed. For example, if you are creating items to be used by other functions or services, a List is the natural way to do it.
If you have a list of items and you just want to display them, say on a web page, an array is the container to use.

Push Item to the end of an array

No, I can't use generic Collections. What I am trying to do is pretty simple actually. In php I would do something like this
$foo = [];
$foo[] = 1;
What I have in C# is this
var foo = new int [10];
// yeah that's pretty much it
Now I can do something like foo[foo.Length - 1] = 1, but that obviously won't work. Another option is foo[foo.Count(x => x.HasValue)] = 1 along with a nullable int during declaration. But there has to be a simpler way around this trivial task.
This is homework and I don't want to explain to my teacher (and possibly the entire class) what foo[foo.Count(x => x.HasValue)] = 1 is and why it works etc.

The simplest way is to create a new class that holds the index of the inserted item:
public class PushPopIntArray
{
private int[] _vals = new int[10];
private int _nextIndex = 0;
public void Push(int val)
{
if (_nextIndex >= _vals.Length)
throw new InvalidOperationException("No more values left to push");
_vals[_nextIndex] = val;
_nextIndex++;
}
public int Pop()
{
if (_nextIndex <= 0)
throw new InvalidOperationException("No more values left to pop");
_nextIndex--;
return _vals[_nextIndex];
}
}
You could add overloads to get the entire array, or to index directly into it if you wanted. You could also add overloads or constructors to create different sized arrays, etc.
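As one possible take on those suggestions, here is a sketch of a variant that accepts an initial size, grows when full, and can hand back its contents. It is not the class above, just an illustration under the same "no generic collections" constraint; GrowableIntStack and GrowableIntStackDemo are invented names.

```csharp
using System;

public class GrowableIntStack
{
    private int[] _vals;
    private int _count;   // number of slots in use; also the next free index

    public GrowableIntStack(int initialCapacity = 4)
    {
        if (initialCapacity < 1)
            throw new ArgumentOutOfRangeException(nameof(initialCapacity));
        _vals = new int[initialCapacity];
    }

    public int Count { get { return _count; } }

    public void Push(int val)
    {
        // When full, replace the backing array with one twice as large.
        if (_count == _vals.Length)
            Array.Resize(ref _vals, _vals.Length * 2);
        _vals[_count++] = val;
    }

    public int Pop()
    {
        if (_count == 0)
            throw new InvalidOperationException("No more values left to pop");
        return _vals[--_count];
    }

    // Returns a copy that contains only the slots in use.
    public int[] ToArray()
    {
        var copy = new int[_count];
        Array.Copy(_vals, copy, _count);
        return copy;
    }
}

public static class GrowableIntStackDemo
{
    public static void Main()
    {
        var stack = new GrowableIntStack(2);
        for (int i = 1; i <= 5; i++) stack.Push(i);
        Console.WriteLine(stack.Pop());
        Console.WriteLine(string.Join(",", stack.ToArray()));
    }
}
```

Doubling on growth keeps the number of reallocations small; this is roughly the bookkeeping that List<T> does for you.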

In C#, arrays cannot be resized dynamically. You can use Array.Resize (but this will probably be bad for performance) or use the ArrayList type instead.

But there has to be a simpler way around this trivial task.
Nope. Not all languages do everything as easy as each other, this is why Collections were invented. C# <> python <> php <> java. Pick whichever suits you better, but equivalent effort isn't always the case when moving from one language to another.

foo[foo.Length] won't work because foo.Length index is outside the array.
Last item is at index foo.Length - 1
Beyond that, an array is a fixed-size structure; if you expect it to work the same way as in PHP, you are simply mistaken.

Originally I wrote this as a comment, but I think it contains enough important points to warrant writing it as an answer.
You seem to be under the impression that C# is an awkward language because you stubbornly insist on using an array while having the requirement that you should "push items onto the end", as evidenced by this comment:
Isn't pushing items into the array kind of the entire purpose of the data structure?
To answer that: no, the purpose of the array data structure is to have a contiguous block of pre-allocated memory to mimic the original array structure in C(++) that you can easily index and perform pointer arithmetic on.
If you want a data structure that supports certain operations, such as pushing elements onto the end, consider a System.Collections.Generic.List<T>, or, if you insist on avoiding generics, a System.Collections.ArrayList. In general the whole point of the C# library is that you don't want to concern yourself with storage details: the List<T> class has certain guarantees on its operations (e.g. insertion at an arbitrary position is O(n), appending is amortized O(1), and retrieval by index is O(1), just like an array), and the fact that it is backed by an internal array that it reallocates as needed is an implementation detail you rarely have to think about.
Don't try to compare PHP and C# by comparing PHP arrays with C# arrays - they have different programming paradigms and the way to solve a problem in one does not necessarily carry over to the other.
To answer the question as written, I see two options then:
Use arrays the awkward way. Either create an array of Nullable<int>s and accept some boxing / unboxing and unpleasant LINQ statements for insertion; or keep an additional counter (preferably wrapped up in a class together with the array) to keep track of the last assigned element.
Use a proper data structure with appropriate guarantees on the operations that matter, such as List<T> which is effectively the (much better, optimised) built-in version of the second option above.
I understand that the latter option is not feasible for you because of the constraints imposed by your teacher, but then do not be surprised that things are harder than the canonical way in another language, if you are not allowed to use the canonical way in this language.
Afterthought:
A hybrid alternative that just came to mind, is using a List for storage and then just calling .ToArray on it. In your insert method, just Add to the list and return the new array.

Is there a reason / benefit to converting a generic list to array?

I have a function that returns a list of strings.
I can loop through that list just like it was an array.
Is there a reason I should convert it into an array before doing so?
EDIT:
The reason I'm asking is because I see code online (and on SO) that manipulate/create a collection of string using List (or other types) but then call the ToArray when returning it, why?

One reason is to make sure that you're not exposing the original list to the client code. If you did expose it, callers could add or remove elements on the returned List<T>, which would modify the source list, because List<T> is a reference type. This may not be the intended behavior.
If you use ToArray and return the array instead, you don't need to worry about modification in the array, as it is just a copy of the data.
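A minimal sketch of the difference, using a made-up NameRegistry class: one accessor hands out the internal list, the other hands out a copy via ToArray.

```csharp
using System;
using System.Collections.Generic;

public class NameRegistry
{
    private readonly List<string> _names = new List<string> { "Ann", "Bob" };

    // Exposes the internal list: callers can change the registry's state.
    public List<string> GetNamesDirect() { return _names; }

    // Returns a copy: changes to the array do not reach the registry.
    public string[] GetNamesCopy() { return _names.ToArray(); }

    public int Count { get { return _names.Count; } }
    public string First { get { return _names[0]; } }
}

public static class NameRegistryDemo
{
    public static void Main()
    {
        var registry = new NameRegistry();

        string[] copy = registry.GetNamesCopy();
        copy[0] = "Zed";                       // only the copy changes
        Console.WriteLine(registry.First);

        registry.GetNamesDirect().Add("Eve");  // the registry itself changes
        Console.WriteLine(registry.Count);
    }
}
```

Note that the copy is shallow: with mutable reference-type elements, the array and the list would still share the same element objects.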
If you have some specific question with some sample code, you may get a specific answer.

Not unless you have a specific use case where it's required. There are more options for manipulating/searching Lists than there are with Arrays out-of-the-box with .NET.
If you ever are working on a memory-sensitive application, then Arrays probably have an ever-so-slightly smaller memory footprint, but I wouldn't worry about that unless memory's a very sensitive issue for your app.

It could be that the sample code you've seen does some sort of lazy evaluation (like for example most Linq statements do).
In that case it would make sense to convert the result to an array to execute the query immediately instead of evaluate it during the first iteration.
Example:
// persons is only lazily evaluated, which means
// the query only executes when "persons" is iterated.
var persons = Persons.Where(x => x.Name == "John Doe");
// here "ToArray" forces immediate execution of the query
var personsArray = Persons.Where(x => x.Name == "John Doe").ToArray();
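The effect of deferred execution is easiest to see when the source changes after the query has been defined. The following sketch uses plain strings instead of person objects; DeferredDemo and Counts are names made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Returns { matches seen by the lazy query, matches in the ToArray snapshot }.
    public static int[] Counts()
    {
        var names = new List<string> { "John Doe", "Jane Roe" };

        // Deferred: nothing is evaluated yet.
        IEnumerable<string> lazy = names.Where(n => n == "John Doe");

        // Immediate: the query runs now and the result is stored.
        string[] snapshot = names.Where(n => n == "John Doe").ToArray();

        // Change the source after both queries were defined.
        names.Add("John Doe");

        // The lazy query runs here and sees the added element; the snapshot does not.
        return new[] { lazy.Count(), snapshot.Length };
    }

    public static void Main()
    {
        int[] counts = Counts();
        Console.WriteLine(counts[0] + " vs " + counts[1]);
    }
}
```

The lazy query reports two matches and the snapshot one, which is one reason code that builds a result often calls ToArray (or ToList) before returning it.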

One reason is that you have a class acting as a cache. For example, a class has a field holding all clients and you want to return a subset, as below:
List<Clients> _listOfClientCache;
private Clients[] GetClientsByCity(string city)
{
return _listOfClientCache.Where(x => x.Client.City == city).ToArray();
}
Another reason I see is memory footprint: if the caller is not going to add or remove elements, an array saves memory.
List is implemented using an array, and when you add more elements than the underlying array can hold, it doubles the size of that array and copies the elements over. So there is a good chance that the underlying array is larger than the number of elements actually in the List.
Converting to an array produces a copy sized exactly to the element count, so the unused capacity can be reclaimed once the list is no longer referenced.
Below is the test
[TestMethod]
public void ListAndArrayTest()
{
int[] array = GetArray();
Debug.WriteLine("IsFixedSize {0}, Elements {1}", array.IsFixedSize, array.Length);
// This will print IsFixedSize True, Elements 35
}
private int[] GetArray()
{
List<int> localList = new List<int>();
for (int i = 0; i < 35; i++)
{
localList.Add(i);
}
Debug.WriteLine("Capacity {0}, Element Count {1}", localList.Capacity, localList.Count);
// This will print Capacity **64**, Element Count **35**
return localList.ToArray();
}
Output : Capacity 64, Element Count 35
IsFixedSize True, Elements 35
From the output it is clear that the caller has a plain array, not a reference to the original list. The list was using an internal array with a capacity of 64 elements: once it held 32 elements and needed room for more, it doubled the array to 64 and copied the elements over.
Once converted to an array, the caller has an array of exactly 35 elements.
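If the element count is known before filling the list, the doubling can be avoided altogether by passing the capacity to the List<T> constructor. A short sketch (CapacityDemo and CapacityAfterFill are illustrative names):

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Fills a pre-sized list and reports the capacity it ends up with.
    public static int CapacityAfterFill(int count)
    {
        var list = new List<int>(count);   // backing array allocated once, with room for count items
        for (int i = 0; i < count; i++)
        {
            list.Add(i);                   // never exceeds the capacity, so no reallocation
        }
        return list.Capacity;
    }

    public static void Main()
    {
        Console.WriteLine(CapacityAfterFill(35));
    }
}
```

With 35 elements the capacity stays at 35 instead of growing to 64 as in the test above.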

Arrays are more light-weight structures than lists and provide no actual methods for manipulation (by manipulation I mean: adding and removing items)
To add an item to an array or to delete an item from an array, you need to allocate a new array.
Lists let you use methods such as .Add and .Remove, which makes them more suitable for manipulating collections.
If your requirement is just finding an item in a collection, stick to arrays, but if you need to manipulate the collection, go for the list.
Think of lists as if they were special interface to manipulate arrays.

Good Practice? Redefining Arrays to Change their Size

I'm working in Unity in C#, but this is a more general programming practice question.
dataType[] array = new dataType[x];
//Further down
dataType[] newArray = new dataType[array.Length + 1];
for (int i=0;i<array.Length;i++){
newArray[i] = array[i];
}
Is this considered bad practice? Is it more/less efficient than just using an ArrayList?
I'm only performing this operation at the beginning of a level and then never again, so is it worth just working with ArrayLists?

You should use dynamic data structures when you do not know the exact size of an array. List<T> is a better option than ArrayList: ArrayList works only on object, whereas List<T> makes use of generics (it works on a specific type, or on object as well). This will help you create high-performing and maintainable code.

Yes, this is bad practice. If you expect your array size to change then use a List instead of an Array. List works just like an array in terms of indexing. To add new items to the List use Add(). You can even call ToArray() on the list to get an Array from it.
http://msdn.microsoft.com/en-us/library/6sh2ey19(v=vs.110).aspx
If you come across a situation where you can't use a List and need to grow your Array, use Array.Resize() as shown below.
int[] array = new int[5];
Console.WriteLine(array.Length);
Array.Resize<int>(ref array, 20);
Console.WriteLine(array.Length);

C# Growing List and Pointers to Elements

I need to have a growing array, or list (the built in ones are sufficient). Furthermore I need to be able to manipulate elements in the array with pointers to that specific element for example the following code
List<int> l1=new List<int>();
List<bool> l2=new List<bool>();
l1.Add(8);
l2.Add(true);
l1.Add(234);
l2.Add(true);
Console.WriteLine(l1[0]); //output=8
int* pointer = (int *) l1[0];
Console.WriteLine(*pointer); //Needs to output 8
Console.WriteLine(l2[0]); //output=true
bool* pointer2 = (bool *) l2[0];
Console.WriteLine(*pointer2); //Needs to output true
Thanks in advance for any help

I'm trying to use an array to store packet data and pass it off to threads; these threads need to be able to modify the data without trashing the array
In this case, I would just pass your List<T> into the threaded routine, as well as a starting and ending index that thread should use.
Provided you always work by index, and stay within the bounds, there shouldn't be any problem with "trashing the array."

First off: you are applying a C++ approach to a problem that is solved differently in C#. In C# you generally don't want to do things involving explicit pointers, because they make life difficult, especially as it pertains to garbage collection.
That said, if you must do it this way, what you'd want to do is pass the entire list (and maybe the index) as a parameter, along with an offset, to the other thread. You would also want to be sure to lock the list appropriately in all accessing threads, to avoid dirty reads/writes.
The right solution is just to pass the item that you actually want to process. Reference types are passed byval, but that just means a new pointer is created to the same heap variable. It isn't actually creating a new value on the heap.
So for example:
var myList = new List<MyClass> { someInstanceofMyClass1, someInstanceofMyClass2 };
var t = new Thread(()=> SomeMethod(myList[0])); // Assuming MyClass is a reference type, the value passed here is the same instance as the one in myList
t.Start();
...

It already works the way you want for reference types. Therefore one potential solution is to create a class to box these values as reference types. If the items are related by index (and I suspect they are) it's a good idea to keep one list to hold a type that groups both values rather than two lists anyway. Going with that:
public class MyClass
{
public int IntValue {get;set;}
public bool BoolValue {get;set;}
public MyClass(int intValue, bool boolValue)
{
IntValue = intValue;
BoolValue = boolValue; // assign the property, not the parameter to itself
}
}
}
List<MyClass> l1 = new List<MyClass>();
l1.Add(new MyClass(8, true));
MyClass pointer = l1[0];
Console.WriteLine(pointer.IntValue); //writes 8
Console.WriteLine(pointer.BoolValue); //writes True
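To show that such a variable really behaves like a pointer to the list element, here is a self-contained sketch with a made-up Packet class: a change made through the second reference is visible through the list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical element type; any class (reference type) behaves the same way.
public class Packet
{
    public int IntValue { get; set; }
}

public static class ReferenceDemo
{
    public static int ModifyThroughAlias()
    {
        var list = new List<Packet> { new Packet { IntValue = 8 } };

        Packet alias = list[0];   // copies the reference, not the object
        alias.IntValue = 9;       // changes the one object both references point to

        return list[0].IntValue;
    }

    public static void Main()
    {
        Console.WriteLine(ModifyThroughAlias());
    }
}
```

The list grows and reallocates its internal array independently of this; the reference stays valid because it points at the object, not at a slot in the list.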

If you are using .NET 4, you might want to look into the classes in System.Collections.Concurrent namespace. They provide thread-safe data structures that might help you achieve your goal with less code.

For what you are trying to accomplish (network packets), it sounds like you would benefit from a System.IO.BinaryWriter instead of a List. (You could even pass a NetworkStream to a BinaryWriter)
It supports seeking so you can go back and re-write, and you can't trash the array since it grows automatically.
Performance-wise, I would assume BinaryWriter is faster than a List since it writes to an underlying Stream and MemoryStream is faster than a List<byte>
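A rough sketch of the seek-and-rewrite idea, using a MemoryStream as the backing store. The packet layout (a 4-byte length prefix followed by the payload) is an assumption made for the example, not something from the question, and PacketWriterDemo is an invented name.

```csharp
using System;
using System.IO;

public static class PacketWriterDemo
{
    public static byte[] BuildPacket()
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write((int)0);       // 4-byte placeholder for the payload length
            writer.Write((byte)0xAB);   // payload byte 1
            writer.Write((byte)0xCD);   // payload byte 2

            int payloadLength = (int)stream.Length - 4;

            writer.Seek(0, SeekOrigin.Begin);   // go back to the placeholder
            writer.Write(payloadLength);        // overwrite it with the real length
            writer.Flush();

            return stream.ToArray();            // the stream grew as needed; copy out its bytes
        }
    }

    public static void Main()
    {
        Console.WriteLine(BitConverter.ToString(BuildPacket()));
    }
}
```

BinaryWriter writes integers in little-endian order, so the first four bytes of the result hold the value 2.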
