IEnumerable<T> ToArray usage - Is it a copy or a pointer?

I am parsing an arbitrary length byte array that is going to be passed around to a few different layers of parsing. Each parser creates a Header and a Packet payload just like any ordinary encapsulation.
My problem lies in how the encapsulation holds its packet byte array payload. Say I have a 100 byte array with three levels of encapsulation. Three packet objects will be created and I want to set the payload of these packets to the corresponding position in the byte array of the packet.
For example, let's say the payload size is 20 for all levels, then imagine it has a public byte[] Payload on each object. However, the problem is that this byte[] Payload is a copy of the original 100 bytes, so I'm going to end up with 160 bytes in memory instead of 100.
If it were in C++, I could just easily use a pointer - however, I'm writing this in C#.
So I created the following class:
public class PayloadSegment<T> : IEnumerable<T>
{
public readonly T[] Array;
public readonly int Offset;
public readonly int Count;
public PayloadSegment(T[] array, int offset, int count)
{
this.Array = array;
this.Offset = offset;
this.Count = count;
}
public T this[int index]
{
get
{
if (index < 0 || index >= this.Count)
throw new IndexOutOfRangeException();
else
return Array[Offset + index];
}
set
{
if (index < 0 || index >= this.Count)
throw new IndexOutOfRangeException();
else
Array[Offset + index] = value;
}
}
public IEnumerator<T> GetEnumerator()
{
for (int i = Offset; i < Offset + Count; i++)
yield return Array[i];
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
IEnumerator<T> enumerator = this.GetEnumerator();
while (enumerator.MoveNext())
{
yield return enumerator.Current;
}
}
}
This way I can simply reference a position inside the original byte array but use positional indexing. However, if I do something like:
PayloadSegment<byte> something = new PayloadSegment<byte>(someArray, 5, 10);
byte[] somethingArray = something.ToArray();
Will the somethingArray be a copy of the bytes, or a reference to the original PayloadSegment (which in turn is a reference to the original byte array)?
EDIT: Actually after rethinking this, can't I simply use a new MemoryStream(array, offset, length)?

The documentation for the Enumerable.ToArray extension method doesn't specifically mention what it does when it's passed a sequence that happens to already be an array. But a simple check with .NET Reflector reveals that it does indeed create a copy of the array.
It is worth noting, however, that when given a sequence that implements ICollection<T> (which Array does), the copy can be done much faster: the number of elements is known up front, so it does not have to resize the buffer dynamically the way List<T> does.
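The copy semantics are easy to check directly. The following is a minimal sketch, not taken from the answer above: the class and method names (ToArrayCopyDemo, SequenceCopyIsDetached, ArrayCopyIsDetached) are made up for illustration, and Skip/Take stands in for a segment-style view such as the asker's PayloadSegment<byte>.

```csharp
using System;
using System.Linq;

public static class ToArrayCopyDemo
{
    // A lazily evaluated view over part of an array, materialized with ToArray().
    public static bool SequenceCopyIsDetached()
    {
        byte[] original = { 1, 2, 3, 4, 5 };
        byte[] copy = original.Skip(1).Take(3).ToArray(); // elements 2, 3, 4
        copy[0] = 99;                                     // mutate the copy only
        return original[1] == 2;                          // the source is untouched
    }

    // Even when the source already is an array, ToArray() hands back a new one.
    public static bool ArrayCopyIsDetached()
    {
        byte[] original = { 1, 2, 3 };
        byte[] copy = original.ToArray();
        copy[0] = 99;
        return !ReferenceEquals(original, copy) && original[0] == 1;
    }

    public static void Main()
    {
        Console.WriteLine(SequenceCopyIsDetached()); // True
        Console.WriteLine(ArrayCopyIsDetached());    // True
    }
}
```

If the goal is to avoid the extra bytes, the segment has to be passed around as a view (offset plus count over the shared array) and ToArray() avoided on the hot path.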

There is a very strong practice which suggests that calling "ToArray" on an object should return a new array which is detached from anything else. Nothing that is done to the original object should affect the array, and nothing which is done to the array should affect the original object. My personal preference would have been to call the routine "ToNewArray", to make explicit that each call will return a different new array.
A few of my classes have an "AsReadableArray", which returns an array which may or may not be attached to anything else. The array won't change in response to manipulations to the original object, but it's possible that multiple reads yielding the same data (which they often will) will return the same array. I really wish .net had an ImmutableArray type, supporting the same sorts of operations as String [a String, in essence, being an ImmutableArray(Of Char)], and a ReadableArray abstract type (from which both Array and ImmutableArray would inherit). I doubt such a thing could be squeezed into .Net 5.0, but it would allow a lot of things to be done much more cleanly.

It is a copy. When you call a To<Type> method, it creates a copy of the source element with the target Type

Because byte is a value type, the array will hold copies of the values, not pointers to them.
If you need the same behavior as a reference type, it is best to create a class that holds the byte as a property and that may group other data and functionality.

It's a copy. It would be very unintuitive if I passed something.ToArray() to some method, and the method changed the value of something by changing the array!

Related

C# Time complexity of Array[T].Contains(T item) vs HashSet<T>.Contains(T item)

HashSet(T).Contains(T) (inherited from ICollection<T>.Contains(T)) has a time complexity of O(1).
So, I'm wondering what the complexity of a class member array containing integers would be as I strive to achieve O(1) and don't need the existence checks of HashSet(T).Add(T).
Since built-in types are not shown in the .NET reference source, I have no chance of finding the array implementation of IList(T).Contains(T).
Any (further) reading material or reference would be very much appreciated.

You can see source code of Array with any reflector (maybe online too, didn't check). IList.Contains is just:
Array.IndexOf(this,value) >= this.GetLowerBound(0);
And Array.IndexOf calls Array.IndexOf<T>, which, after a bunch of consistency checks, redirects to
EqualityComparer<T>.Default.IndexOf(array, value, startIndex, count)
And that one finally does:
int num = startIndex + count;
for (int index = startIndex; index < num; ++index)
{
if (this.Equals(array[index], value))
return index;
}
return -1;
So it just loops over the array, with average complexity O(N). Of course that was obvious from the beginning, but this provides some more evidence.

Array source code for the .Net Framework (up to v4.8) is available in reference source, and can be de-compiled using ILSpy.
In reference source, you find at line 2753 then 2809:
// -----------------------------------------------------------
// ------- Implement ICollection<T> interface methods --------
// -----------------------------------------------------------
...
[SecuritySafeCritical]
bool Contains<T>(T value) {
//! Warning: "this" is an array, not an SZArrayHelper. See comments above
//! or you may introduce a security hole!
T[] _this = JitHelpers.UnsafeCast<T[]>(this);
return Array.IndexOf(_this, value) != -1;
}
And IndexOf ends up in this IndexOf, which is an O(n) algorithm.
internal virtual int IndexOf(T[] array, T value, int startIndex, int count)
{
int endIndex = startIndex + count;
for (int i = startIndex; i < endIndex; i++) {
if (Equals(array[i], value)) return i;
}
return -1;
}
Those methods are on a special class, SZArrayHelper, in the same source file, and as explained at line 2721, this is the implementation you are looking for.
// This class is needed to allow an SZ array of type T[] to expose IList<T>,
// IList<T.BaseType>, etc., etc. all the way up to IList<Object>. When the following call is
// made:
//
// ((IList<T>) (new U[n])).SomeIListMethod()
//
// the interface stub dispatcher treats this as a special case, loads up SZArrayHelper,
// finds the corresponding generic method (matched simply by method name), instantiates
// it for type <T> and executes it.
About achieving O(1) complexity, you should convert it to a HashSet:
var lookupHashSet = new HashSet<T>(yourArray);
...
var hasValue = lookupHashSet.Contains(testValue);
Of course, this conversion is itself an O(n) operation. If you do not have many lookups to do, the conversion is not worth it.
Note from documentation on this constructor:
If collection contains duplicates, the set will contain one of each unique element. No exception will be thrown. Therefore, the size of the resulting set is not identical to the size of collection.
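To make the trade-off concrete, here is a small sketch of both lookups side by side. The names (ContainsDemo, ArrayContains, BuildLookup) and the sample values are assumptions chosen for illustration; only the HashSet<T> constructor and the Contains calls come from the discussion above.

```csharp
using System;
using System.Collections.Generic;

public static class ContainsDemo
{
    // Linear scan: goes through the array's IList<T>.Contains, O(n) per lookup.
    public static bool ArrayContains(int[] values, int item)
    {
        return ((IList<int>)values).Contains(item);
    }

    // One O(n) conversion; afterwards each Contains is O(1) on average.
    public static HashSet<int> BuildLookup(int[] values)
    {
        return new HashSet<int>(values);
    }

    public static void Main()
    {
        int[] values = { 3, 1, 4, 1, 5, 9, 2, 6 };
        HashSet<int> lookup = BuildLookup(values);

        Console.WriteLine(ArrayContains(values, 9)); // True
        Console.WriteLine(lookup.Contains(9));       // True
        Console.WriteLine(lookup.Contains(7));       // False
        Console.WriteLine(values.Length);            // 8
        Console.WriteLine(lookup.Count);             // 7, the duplicate 1 is stored once
    }
}
```

The set only pays off when it is built once and queried many times; for a handful of lookups the linear scan over the array is usually fine.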

You actually can see the source for List<T>, but you need to look it up online. Here's one source.
Any pure list/array bool Contains(T item) check is O(N) complexity, because each element needs to be checked. .NET is no exception. (If you designed a data structure that manifested as a list but also contained a bloom filter helper data structure, that would be another story.)

struct array vs object array c#

I understand that mutable structs are evil. However, I'd still like to compare the performance of an array of structs vs an array of objects. This is what I have so far
public struct HelloStruct
{
public int[] hello1;
public int[] hello2;
public int hello3;
public int hello4;
public byte[] hello5;
public byte[] hello6;
public string hello7;
public string hello8;
public string hello9;
public SomeOtherStruct[] hello10;
}
public struct SomeOtherStruct
{
public int yoyo;
public int yiggityyo;
}
public class HelloClass
{
public int[] hello1;
public int[] hello2;
public int hello3;
public int hello4;
public byte[] hello5;
public byte[] hello6;
public string hello7;
public string hello8;
public string hello9;
public SomeOtherClass[] hello10;
}
public class SomeOtherClass
{
public int yoyo;
public int yiggityyo;
}
static void compareTimesClassVsStruct()
{
HelloStruct[] a = new HelloStruct[50];
for (int i = 0; i < a.Length; i++)
{
a[i] = default(HelloStruct);
}
HelloClass[] b = new HelloClass[50];
for (int i = 0; i < b.Length; i++)
{
b[i] = new HelloClass();
}
Console.WriteLine("Starting now");
var s1 = Stopwatch.StartNew();
for (int i = 0; i < _max; i++)
{
a[i % 50].hello1 = new int[] { 1, 2, 3, 4, i % 50 };
a[i % 50].hello3 = i;
a[i % 50].hello7 = (i % 100).ToString();
}
s1.Stop();
var s2 = Stopwatch.StartNew();
for (int j = 0; j < _max; j++)
{
b[j % 50].hello1 = new int[] { 1, 2, 3, 4, j % 50 };
b[j % 50].hello3 = j;
b[j % 50].hello7 = (j % 100).ToString();
}
s2.Stop();
Console.WriteLine(((double)(s1.Elapsed.TotalSeconds)));
Console.WriteLine(((double)(s2.Elapsed.TotalSeconds)));
Console.Read();
}
There's a couple of things happening here that I'd like to understand.
Firstly, since the array stores structs, when I try to access a struct from the array using the index operation, should I get a copy of the struct or a reference to the original struct? In this case when I inspect the array after running the code, I get the mutated struct values. Why is this so?
Secondly, when I compare the timings inside compareTimesClassVsStruct() I get approximately the same time. What is the reason behind that? Is there any case under which using an array of structs or an array of objects would outperform the other?
Thanks

When you access the properties of an element of an array of structs, you are NOT operating on a copy of the struct - you are operating on the struct itself. (This is NOT true of a List<SomeStruct> where you will be operating on copies, and the code in your example wouldn't even compile.)
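The difference between the array and the list case can be seen in a few lines. This is a minimal sketch with made-up names (Point, StructElementDemo); it is not the asker's benchmark, only an illustration of the element-access semantics described above.

```csharp
using System;
using System.Collections.Generic;

public struct Point
{
    public int X;
}

public static class StructElementDemo
{
    public static int MutateArrayElement()
    {
        var points = new Point[1];
        points[0].X = 5;      // the array indexer yields the element itself
        return points[0].X;   // 5
    }

    public static int MutateListElement()
    {
        var points = new List<Point> { new Point() };
        // points[0].X = 5;   // does not compile (CS1612): the list indexer returns a copy
        Point copy = points[0];
        copy.X = 5;           // changes the local copy only
        return points[0].X;   // still 0
    }

    public static void Main()
    {
        Console.WriteLine(MutateArrayElement()); // 5
        Console.WriteLine(MutateListElement());  // 0
    }
}
```

To change a struct stored in a List<T>, the whole element has to be written back, for example points[0] = copy.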
The reason you are seeing similar times is because the times are being distorted by the (j % 100).ToString() and new int[] { 1, 2, 3, 4, j % 50 }; within the loops. The amount of time taken by those two statements is dwarfing the times taken by the array element access.
I changed the test app a little, and I get times of 9.3s for accessing the struct array and 10s for the class array (for 1,000,000,000 loops), so the struct array is measurably faster, but not significantly so.
One thing which can make struct arrays faster to iterate over is locality of reference. When iterating over a struct array, adjacent elements are adjacent in memory, which reduces the number of processor cache misses.
The elements of class arrays are not adjacent (although the references to the elements in the array are, of course), which can result in many more processor cache misses while you iterate over the array.
Another thing to be aware of is that the number of contiguous bytes in a struct array is effectively (number of elements) * (sizeof(element)), whereas the number of contiguous bytes in a class array is (number of elements) * (sizeof(reference)) where the size of a reference is 32 bits or 64 bits, depending on memory model.
This can be a problem with large arrays of large structs where the total size of the array would exceed 2^31 bytes.
Another difference you might see in speed is when passing large structs as parameters - obviously it will be much quicker to pass by value a copy of the reference to a reference type on the stack than to pass by value a copy of a large struct.
Finally, note that your sample struct is not very representative. It contains a lot of reference types, all of which will be stored somewhere on the heap, not in the array itself.
As a rule of thumb, structs should not be more than 32 bytes or so in size (the exact limit is a matter of debate), they should contain only primitive (blittable) types, and they should be immutable. And, usually, you shouldn't worry about making things structs anyway, unless you have a provable performance need for them.

Firstly, since the array stores structs, when I try to access a struct from the array using the index operation, should I get a copy of the struct or a reference to the original struct?
Let me tell you what is actually happening rather than answering your confusingly worded either-or question.
Arrays are a collection of variables.
The index operation when applied to an array produces a variable.
Mutating a field of a mutable struct successfully requires that you have in hand the variable that contains the struct you wish to mutate.
So now to your question: Should you get a reference to the struct?
Yes, in the sense that a variable refers to storage.
No, in the sense that the variable does not contain a reference to an object; the struct is not boxed.
No, in the sense that the variable is not a ref variable.
However, if you had called an instance method on the result of the indexer, then a ref variable would have been produced for you; that ref variable is called "this", and it would have been passed to your instance method.
You see how confusing this gets. Better to not think about references at all. Think about variables and values. Indexing an array produces a variable.
Now deduce what would have happened had you used a list rather than an array, knowing that the getter indexer of a list produces a value, not a variable.
In this case when I inspect the array after running the code, I get the mutated struct values. Why is this so?
You mutated a variable.
I get approximately the same time. What is the reason behind that?
The difference is so tiny that it is being swamped by all the memory allocations and memory copies you are doing in both cases. That is the real takeaway here. Are operations on mutable value types stored in arrays slightly faster? Possibly. (They save on collection pressure as well, which is often the more relevant performance metric.) But though the relative savings might be significant, the savings as a percentage of total work is often tiny. If you have a performance problem then you want to attack the most expensive thing, not something that is already cheap.

How to truncate an array in place in C#

I mean is it really possible? MSDN says that arrays are fixed-size and the only way to resize is "copy-to-new-place". But maybe it is possible with unsafe/some magic with internal CLR structures, they all are written in C++ where we have a full memory control and can call realloc and so on.
I have no code provided for this question, because I don't even know if it can exist.
I'm not talking about Array.Resize and similar methods, because they obviously do not have the needed behaviour.
Assume that we have a standard x86 process with 2GB ram, and I have 1.9GB filled by single array. Then I want to release half of it. So I want to write something like:
MagicClass.ResizeArray(ref arr, n)
And do not get OutOfMemoryException. Array.Resize will try to allocate another gigabyte of RAM and will fail with 1.9+1 > 2GB OutOfMemory.

You can try Array.Resize():
int[] myArray = new int[] { 1, 2, 3, 4 };
int myNewSize = 1;
Array.Resize(ref myArray, myNewSize);
// Test: 1
Console.Write(myArray.Length);

realloc will attempt to do the inplace resize - but it reserves the right to copy the whole thing elsewhere and return a pointer that's completely different.
Pretty much the same outward behaviour is exposed by .NET's List<T> class - which you should be using anyway if you find yourself changing array sizes often. It hides the actual array reference from you so that the change is propagated throughout all of the references to the same list. As you remove items from the end, only the length of the list changes while the inner array stays the same - avoiding the copying.
It doesn't release the memory (you can always do that explicitly with Capacity = XXX, but that makes a new copy of the array), but then again, unless you're working with large arrays, neither does realloc - and if you're working with large arrays, yada, yada - we've been there :)
realloc doesn't really make sense in the kind of memory model .NET has anyway - the heap is continuously collected and compacted over time. So if you're trying to use it to avoid the copies when just trimming an array, while also keeping memory usage low... don't bother. At the next heap compaction, the whole memory above your array is going to be moved to fill in the blanks. Even if it were possible to do the realloc, the only benefit you have over simply copying the array is that you would keep your array in the old-living heap - and that isn't necessarily what you want anyway.
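The List<T> behaviour described in this answer can be observed through Count and Capacity. The sketch below uses assumed names (TrimDemo, CapacityAfterRemove, CapacityAfterTrim) and an arbitrary size of 1000; it only illustrates that removing from the end leaves the backing array alone, while TrimExcess allocates a smaller one and copies.

```csharp
using System;
using System.Collections.Generic;

public static class TrimDemo
{
    private static List<int> Filled(int n)
    {
        var list = new List<int>(n); // presized, so no growth while filling
        for (int i = 0; i < n; i++) list.Add(i);
        return list;
    }

    // Removing the tail changes Count only; the inner array keeps its length.
    public static int CapacityAfterRemove()
    {
        List<int> list = Filled(1000);
        list.RemoveRange(10, list.Count - 10);
        return list.Capacity;
    }

    // TrimExcess releases the slack, but it does so by copying into a new array.
    public static int CapacityAfterTrim()
    {
        List<int> list = Filled(1000);
        list.RemoveRange(10, list.Count - 10);
        list.TrimExcess();
        return list.Capacity;
    }

    public static void Main()
    {
        Console.WriteLine(CapacityAfterRemove()); // 1000
        Console.WriteLine(CapacityAfterTrim());   // 10
    }
}
```

Neither call shrinks the original allocation in place, so this does not help in the scenario where there is no room left for a second array.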

No array type in the BCL supports what you want. That said, you can implement your own type that does. It can be backed by a standard array but expose its own Length property and indexer, which 'hide' a portion of the array from you.
public class MyTruncatableArray<T>
{
    private T[] _array;
    private int _length;
    public MyTruncatableArray(int size)
    {
        _array = new T[size];
        _length = size;
    }
    public T this[int index]
    {
        get
        {
            CheckIndex(index);
            return _array[index];
        }
        set
        {
            CheckIndex(index);
            _array[index] = value;
        }
    }
    public int Length
    {
        get { return _length; }
        set
        {
            // The logical length may shrink or grow, but never past the backing array.
            if (value < 0 || value > _array.Length)
            {
                throw new ArgumentOutOfRangeException("value", "New array length must be non-negative and lower or equal to original size");
            }
            _length = value;
        }
    }
    // Valid element indexes are 0 .. _length - 1.
    private void CheckIndex(int index)
    {
        if (index < 0 || index >= _length)
        {
            throw new IndexOutOfRangeException();
        }
    }
}
It really depends on what exactly you need. (E.g. do you need to truncate just so that the array is easier to use from your code? Or is perf/GC/memory consumption a concern? If the latter is the case, did you perform any measurements that prove the standard Array.Resize method unusable for your case?)

how does object of List<string> add the supplied string

Could somebody please shed some light on how the Add method is implemented for List in C#?
listobject.Add();
where List<User> listobject = new List<User>() is the declaration of the object.
I know that with List we can perform many operations on the fly, and with type safety, but what I wonder is how the Add method is implemented so that it takes care of all that at run time.
I hope it does not copy the object and readjust on each add, but I will keep my fingers crossed and wait for your reply :)

Using Reflector you can see exactly how it's implemented.
public void Add(T item)
{
if (this._size == this._items.Length)
{
this.EnsureCapacity(this._size + 1);
}
this._items[this._size++] = item;
this._version++;
}
Following 'EnsureCapacity' ...
private void EnsureCapacity(int min)
{
if (this._items.Length < min)
{
int num = (this._items.Length == 0) ? 4 : (this._items.Length * 2);
if (num < min)
{
num = min;
}
this.Capacity = num;
}
}
And finally the setter for 'Capacity'
public int Capacity
{
get
{
return this._items.Length;
}
set
{
if (value != this._items.Length)
{
if (value < this._size)
{
ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.value, ExceptionResource.ArgumentOutOfRange_SmallCapacity);
}
if (value > 0)
{
T[] destinationArray = new T[value];
if (this._size > 0)
{
Array.Copy(this._items, 0, destinationArray, 0, this._size);
}
this._items = destinationArray;
}
else
{
this._items = List<T>._emptyArray;
}
}
}
}

Internally the List<T> holds the items in an array. The actual implementation (List<string>) is created at runtime (thanks @Jason for the correction), so internally there will be a string array that holds the items.
For reference types the list will hold a reference to the same object instance that you added. This is true for strings as well. However, note that the string class is immutable, so any time you modify a string, it actually results in a new instance.
string a = "a";
List<string> list = new List<string>();
list.Add(a); // now the item in the list and a refer to the same string instance
a = "b"; // a is now a completely new instance, the list
// is still referring the old one
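Because strings cannot be mutated, the shared reference is easier to observe with a mutable class. The following sketch assumes a hypothetical User class with a single Name field (the question only mentions the type name); ListReferenceDemo and its methods are likewise made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public class User
{
    public string Name;
}

public static class ListReferenceDemo
{
    // Add stores the reference, not a clone of the object.
    public static bool StoresSameInstance()
    {
        var user = new User { Name = "alice" };
        var users = new List<User>();
        users.Add(user);
        return ReferenceEquals(user, users[0]);
    }

    // Rebinding the variable afterwards does not touch what the list holds.
    public static string NameInListAfterRebinding()
    {
        var user = new User { Name = "alice" };
        var users = new List<User> { user };
        users[0].Name = "bob";              // visible through `user` too: same object
        user = new User { Name = "carol" }; // `user` now points at a different object
        return users[0].Name;
    }

    public static void Main()
    {
        Console.WriteLine(StoresSameInstance());       // True
        Console.WriteLine(NameInListAfterRebinding()); // bob
    }
}
```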

No, it will work on the string reference; there wouldn't be much point in using it if it always cloned your objects.
You can actually check this sort of thing within Visual Studio by using the memory windows: compare the address of the added item with that of the original and you'll see that they point to the same memory location.

As Frederik notes, List uses an array internally. It creates the array with an initial size and as more items are added, if the array is filled to capacity, it is copied into a larger array. That is why if you know ahead of time that a List will contain many strings, it can help to specify its initial capacity in the constructor.
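The effect of presizing can be measured by watching Capacity while items are added. This is a rough sketch under stated assumptions: CapacityGrowthDemo and CountReallocations are hypothetical names, 1000 is an arbitrary item count, and the exact growth steps printed for the default-constructed list are an implementation detail of the runtime in use.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityGrowthDemo
{
    // Adds `count` items and returns how many times the backing array was replaced.
    public static int CountReallocations(List<string> list, int count)
    {
        int reallocations = 0;
        int lastCapacity = list.Capacity;
        for (int i = 0; i < count; i++)
        {
            list.Add("item " + i);
            if (list.Capacity != lastCapacity)
            {
                reallocations++;
                lastCapacity = list.Capacity;
                Console.WriteLine("Count " + list.Count + " -> Capacity " + list.Capacity);
            }
        }
        return reallocations;
    }

    public static void Main()
    {
        // Default constructor: the array is replaced several times on the way to 1000 items.
        Console.WriteLine(CountReallocations(new List<string>(), 1000));

        // Initial capacity given up front: the array is never replaced.
        Console.WriteLine(CountReallocations(new List<string>(1000), 1000)); // 0
    }
}
```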
When you remove items from the list, or insert in the middle, it must shift all the elements in the internal array so a List is not particularly optimized for adding/removing many items. You may be better off using a LinkedList which is much more efficient at add/remove operations but gives up the ability to efficiently access elements in the list by position.
Different collections have different implementations that are best suited for certain scenarios. For a good example of how various collections are implemented, I suggest checking out PowerCollections library that was released by Wintellect. Many of the collections are no longer relevant in .NET 3.5/4.0 but they provide some great insight on how to go about implementing collections.

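For a List<T>, the Add method takes a parameter of the generic type T, in this case a string. The list stores a reference to the same object the original variable refers to, not a copy. For a mutable reference type, a change made through the list element is therefore visible through the original variable and vice versa; a string is immutable, so it can only be replaced, never modified in place.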

How to initialize a List<T> to a given size (as opposed to capacity)?

.NET offers a generic list container whose performance is almost identical to that of arrays (see the Performance of Arrays vs. Lists question). However, the two are quite different in initialization.
Arrays are very easy to initialize with a default value, and by definition they already have certain size:
string[] Ar = new string[10];
Which allows one to safely assign random items, say:
Ar[5]="hello";
With lists, things are trickier. I can see two ways of doing the same initialization, neither of which is what you would call elegant:
List<string> L = new List<string>(10);
for (int i=0;i<10;i++) L.Add(null);
or
string[] Ar = new string[10];
List<string> L = new List<string>(Ar);
What would be a cleaner way?
EDIT: The answers so far refer to capacity, which is something else than pre-populating a list. For example, on a list just created with a capacity of 10, one cannot do L[2]="somevalue"
EDIT 2: People wonder why I want to use lists this way, as it is not the way they are intended to be used. I can see two reasons:
One could quite convincingly argue that lists are the "next generation" arrays, adding flexibility with almost no penalty. Therefore one should use them by default. I'm pointing out they might not be as easy to initialize.
What I'm currently writing is a base class offering default functionality as part of a bigger framework. In the default functionality I offer, the size of the List is known in advance and therefore I could have used an array. However, I want to offer any base class the chance to dynamically extend it and therefore I opt for a list.

List<string> L = new List<string> ( new string[10] );

I can't say I need this very often - could you give more details as to why you want this? I'd probably put it as a static method in a helper class:
public static class Lists
{
public static List<T> RepeatedDefault<T>(int count)
{
return Repeated(default(T), count);
}
public static List<T> Repeated<T>(T value, int count)
{
List<T> ret = new List<T>(count);
ret.AddRange(Enumerable.Repeat(value, count));
return ret;
}
}
You could use Enumerable.Repeat(default(T), count).ToList() but that would be inefficient due to buffer resizing.
Note that if T is a reference type, it will store count copies of the reference passed for the value parameter - so they will all refer to the same object. That may or may not be what you want, depending on your use case.
EDIT: As noted in comments, you could make Repeated use a loop to populate the list if you wanted to. That would be slightly faster too. Personally I find the code using Repeat more descriptive, and suspect that in the real world the performance difference would be irrelevant, but your mileage may vary.

Use the constructor which takes an int ("capacity") as an argument:
List<string> list = new List<string>(10);
EDIT: I should add that I agree with Frederik. You are using the List in a way that goes against the entire reasoning behind using it in the first place.
EDIT2:
EDIT 2: What I'm currently writing is a base class offering default functionality as part of a bigger framework. In the default functionality I offer, the size of the List is known in advance and therefore I could have used an array. However, I want to offer any base class the chance to dynamically extend it and therefore I opt for a list.
Why would anyone need to know the size of a List with all null values? If there are no real values in the list, I would expect the length to be 0. Anyhow, the fact that this is kludgy demonstrates that it is going against the intended use of the class.

Create an array with the number of items you want first and then convert the array in to a List.
int[] fakeArray = new int[10];
List<int> list = fakeArray.ToList();

If you want to initialize the list with N elements of some fixed value:
public List<T> InitList<T>(int count, T initValue)
{
return Enumerable.Repeat(initValue, count).ToList();
}

Why are you using a List if you want to initialize it with a fixed value ?
I can understand that -for the sake of performance- you want to give it an initial capacity, but isn't one of the advantages of a list over a regular array that it can grow when needed ?
When you do this:
List<int> list = new List<int>(100);
You create a list whose capacity is 100 integers. This means that your List won't need to 'grow' until you add the 101st item.
The underlying array of the list will be initialized with a length of 100.
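The distinction the question draws between capacity and size shows up as soon as the list is indexed. A minimal sketch, with hypothetical names (CapacityVsCountDemo, IndexerThrowsOnEmptyPresizedList):

```csharp
using System;
using System.Collections.Generic;

public static class CapacityVsCountDemo
{
    // A presized list has room for 100 items but still contains none.
    public static bool IndexerThrowsOnEmptyPresizedList()
    {
        var list = new List<int>(100);
        try
        {
            list[2] = 7; // index 2 is beyond Count, even though it is within Capacity
            return false;
        }
        catch (ArgumentOutOfRangeException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var list = new List<int>(100);
        Console.WriteLine(list.Capacity); // 100
        Console.WriteLine(list.Count);    // 0
        Console.WriteLine(IndexerThrowsOnEmptyPresizedList()); // True
    }
}
```

So the capacity constructor avoids reallocation during later Add calls, but it does not make positions 0 to 99 assignable.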

This is an old question, but I have two solutions. One is fast and dirty reflection; the other is a solution that actually answers the question (set the size not the capacity) while still being performant, which none of the answers here do.
Reflection
This is quick and dirty, and should be pretty obvious what the code does. If you want to speed it up, cache the result of GetField, or create a DynamicMethod to do it:
public static void SetSize<T>(this List<T> l, int newSize) =>
l.GetType().GetField("_size", BindingFlags.NonPublic | BindingFlags.Instance).SetValue(l, newSize);
Obviously a lot of people will be hesitant to put such code into production.
ICollection<T>
This solution is based around the fact that the constructor List(IEnumerable<T> collection) optimizes for ICollection<T> and immediately adjusts the size to the correct amount, without iterating it. It then calls the collection's CopyTo to do the copy.
The code for the List<T> constructor is as follows:
public List(IEnumerable<T> collection) {
    ....
    if (collection is ICollection<T> c)
    {
        int count = c.Count;
        if (count == 0)
        {
            _items = s_emptyArray;
        }
        else {
            _items = new T[count];
            c.CopyTo(_items, 0);
            _size = count;
        }
    }
    ....
}
So we can completely optimally pre-initialize the List to the correct size, without any extra copying.
How so? By creating an ICollection<T> object that does nothing other than return a Count. Specifically, we will not implement anything in CopyTo which is the only other function called.
private struct SizeCollection<T> : ICollection<T>
{
public SizeCollection(int size) =>
Count = size;
public void Add(T i){}
public void Clear(){}
public bool Contains(T i)=>true;
public void CopyTo(T[]a, int i){}
public bool Remove(T i)=>true;
public int Count {get;}
public bool IsReadOnly=>true;
public IEnumerator<T> GetEnumerator()=>null;
IEnumerator IEnumerable.GetEnumerator()=>null;
}
public List<T> InitializedList<T>(int size) =>
new List<T>(new SizeCollection<T>(size));
We could in theory do the same thing with AddRange/InsertRange on an existing list, which also special-cases ICollection<T>, but the code there creates a new array for the supposed items, then copies them in. In that case, it would be faster to just call Add in a loop:
public static void SetSize<T>(this List<T> l, int size)
{
    if (size < l.Count)
        l.RemoveRange(size, l.Count - size);
    else
        for (size -= l.Count; size > 0; size--)
            l.Add(default(T));
}

Initializing the contents of a list like that isn't really what lists are for. Lists are designed to hold objects. If you want to map particular numbers to particular objects, consider using a key-value pair structure like a hash table or dictionary instead of a list.

You seem to be emphasizing the need for a positional association with your data, so wouldn't an associative array be more fitting?
Dictionary<int, string> foo = new Dictionary<int, string>();
foo[2] = "string";

The accepted answer (the one with the green check mark) has an issue.
The problem:
var result = Lists.Repeated(new MyType(), sizeOfList);
// each item in the list references the same MyType() object
// if you edit item 1 in the list, you are also editing item 2 in the list
I recommend changing the line above to perform a copy of the object. There are many different articles about that:
String.MemberwiseClone() method called through reflection doesn't work, why?
https://code.msdn.microsoft.com/windowsdesktop/CSDeepCloneObject-8a53311e
If you want to initialize every item in your list with the default constructor, rather than NULL, then add the following method:
public static List<T> RepeatedDefaultInstance<T>(int count)
{
List<T> ret = new List<T>(count);
for (var i = 0; i < count; i++)
{
ret.Add((T)Activator.CreateInstance(typeof(T)));
}
return ret;
}

You can use Linq to cleverly initialize your list with a default value. (Similar to David B's answer.)
var defaultStrings = (new int[10]).Select(x => "my value").ToList();
Go one step further and initialize each string with a distinct value, "string 1", "string 2", "string 3", etc.:
int next = 1;
// The lambda parameter is unused; the counter lives in the captured local.
var numberedStrings = (new int[10]).Select(unused => "string " + next++).ToList();

string [] temp = new string[] {"1","2","3"};
List<string> temp2 = temp.ToList();

After thinking again, I had found the non-reflection answer to the OP question, but Charlieface beat me to it. So I believe that the correct and complete answer is https://stackoverflow.com/a/65766955/4572240
My old answer:
If I understand correctly, you want the List<T> version of new T[size], without the overhead of adding values to it.
If you are not afraid the implementation of List<T> will change dramatically in the future (and in this case I believe the probability is close to 0), you can use reflection:
public static List<T> NewOfSize<T>(int size) {
var list = new List<T>(size);
var sizeField = list.GetType().GetField("_size",BindingFlags.Instance|BindingFlags.NonPublic);
sizeField.SetValue(list, size);
return list;
}
Note that this takes into account the default functionality of the underlying array to prefill with the default value of the item type. All int arrays will have values of 0 and all reference type arrays will have values of null. Also note that for a list of reference types, only the space for the pointer to each item is created.
If you, for some reason, decide on not using reflection, I would have liked to offer an option of AddRange with a generator method, but underneath List<T> just calls Insert a zillion times, which doesn't serve.
I would also like to point out that the Array class has a static method called Resize, if you want to go the other way around and start from an array.
To end, I really hate when I ask a question and everybody points out that it's the wrong question. Maybe it is, and thanks for the info, but I would still like an answer, because you have no idea why I am asking it. That being said, if you want to create a framework that has an optimal use of resources, List<T> is a pretty inefficient class for anything than holding and adding stuff to the end of a collection.

A notice about IList:
MSDN IList Remarks:
"IList implementations fall into three categories: read-only, fixed-size, and variable-size. (...). For the generic version of this interface, see
System.Collections.Generic.IList<T>."
IList<T> does NOT inherit from IList (although List<T> implements both IList<T> and IList), but is always variable-size.
Since .NET 4.5 we also have IReadOnlyList<T>, but AFAIK there is no fixed-size generic list, which is what you are looking for.

This is a sample I used for my unit test. I created a list of class objects, then used a for loop to add the 'X' number of objects that I am expecting from the service.
This way you can add/initialize a List of any given size.
public void TestMethod1()
{
var expected = new List<DotaViewer.Interface.DotaHero>();
for (int i = 0; i < 22; i++)//You add empty initialization here
{
var temp = new DotaViewer.Interface.DotaHero();
expected.Add(temp);
}
var nw = new DotaHeroCsvService();
var items = nw.GetHero();
CollectionAssert.AreEqual(expected,items);
}
Hope I was of help to you guys.

A bit late, but the first solution you proposed seems far cleaner to me: you don't allocate memory twice.
Even the List constructor needs to loop through the array in order to copy it; it doesn't know in advance that there are only null elements inside.
1.
- allocate N
- loop N
Cost: 1 * allocate(N) + N * loop_iteration
2.
- allocate N
- allocate N + loop ()
Cost: 2 * allocate(N) + N * loop_iteration
However, List's allocation and loops might be faster since List is a built-in class, but C# is JIT-compiled, so...
