Out of curiosity, I was trying to test the performance of List<T> using both value and reference types.
The results were not as I expected, leading me to believe my understanding of how these objects are laid out in memory might not be correct.
This was my experiment:
Create a basic class containing just two members, an int and a bool
Create 2 List<T> objects to hold my test classes (List1 and List2)
Randomly generate test objects and add them to List1 and List2 alternately
Time how long it takes to iterate through List1 (doing some arbitrary work such as incrementing a counter and then accessing the element)
I then repeated with a struct in place of a class
My assumptions were that when using a class, the references held in the List<T> would be contiguous, but because of how I created them (switching between adding to List1 and List2), the objects they point to probably wouldn't be.
I thought that when using a struct, because it is a value type, the objects themselves would be held contiguously in memory (since the List<T> holds the actual items rather than a collection of references).
Because of this, I expected the struct version to perform better (due to hardware prefetching, etc.).
In actual fact, both were very similar.
What's going on here?
Edit - Added code to actually access the element in the iterator, code sample included
Test class (or struct)
public class/struct TestClass
{
public int TestInt;
public bool TestBool;
}
Creating random Lists:
var list1 = new List<TestClass>();
var list2 = new List<TestClass>();
var toggle = false;
for (var i=0; i < 4000000; i++)
{
// Random object generation removed for simplicity
if (toggle)
list1.Add(randomObject);
else
list2.Add(randomObject);
toggle = !toggle;
}
Testing:
var stopwatch = new Stopwatch();
var counter = 0;
var testBool = false;
stopwatch.Start();
foreach(var item in list1)
{
// Access the element
testBool = item.TestBool;
counter++;
}
stopwatch.Stop();
Repeat with TestClass as both a class and a struct.
I realise there isn't much difference, but I expected struct to perform significantly better than class
// Access the element
testBool = item.TestBool;
That has no effect; the optimizer will remove the statement since it has no useful side-effects. You are not actually measuring the difference between a struct and a class, since you never actually access the element.
counter++;
Same story; it is pretty likely to be optimized away unless you actually use the counter value after the loop completes. Having the optimizer remove too much code and make the test meaningless is a common micro-benchmark hazard. A workaround would be:
foreach(var item in list1)
{
// Access the element
counter += item.TestInt;
}
Console.WriteLine(counter);
Benchmark guidelines are:
Only profile code produced by the Release configuration. The Debug build produces too much extra code and suppresses optimization
Tools + Options, Debugging, General, untick the "Suppress JIT optimization on module load". This ensures that you get optimized code even if you run with the debugger
Debug + Windows + Disassembly is a very important debugger window to show you what code really runs. Having some understanding of machine code is required to interpret that window correctly
Very important: put an outer loop around the test code so that you run the test at least 10 times. This removes cold-start effects, like the processor having to fill the L1 instruction cache and the jitter having to load the IL from the assembly and compile it the first time it executes. It also removes random outliers you'll get from having to compete with other processes that run on the machine and also compete for the processor.
Differences of 15% are not statistically significant.
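Putting those guidelines together, a minimal harness might look like the sketch below (the list contents, pass count and output format are illustrative, not taken from the original test):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

class Benchmark
{
    static void Main()
    {
        var list = new List<int>();
        for (int i = 0; i < 4000000; i++)
            list.Add(i & 1); // alternating 0/1 values

        // Outer loop: the first pass pays for JIT compilation and cache
        // warm-up; later passes give more representative timings.
        for (int pass = 0; pass < 10; pass++)
        {
            var sw = Stopwatch.StartNew();
            long sum = 0;
            foreach (var item in list)
                sum += item;
            sw.Stop();
            // Printing sum makes the work a useful side-effect, so the
            // loop body cannot be optimized away.
            Console.WriteLine($"pass {pass}: {sw.ElapsedMilliseconds} ms, sum = {sum}");
        }
    }
}
```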
If you aren't actually accessing members of the class objects stored in your list, then the following two types should provide equivalent performance for iteration.
List<IntPtr>
List<object>
Even though the reference type instances aren't filling a contiguous section of memory, the references themselves are.
The exception to the above case would be if the CLR compresses pointers when executing 64-bit applications with less than 32GiB of memory. This strategy is documented as Compressed OOPs (ordinary object pointers) in the JVM. However, the x86-64 instruction set includes instructions that allow this compression/decompression to be performed extremely efficiently, so even in this case you should see performance similar to List<int>.
Things get interesting when your value types exceed the size of a pointer (IntPtr.Size). After that point, the performance of a List<T> containing references should quickly surpass the performance of a List<T> of value types. This is due to the fact that regardless of how big your reference type instance is, the reference to that instance is at most IntPtr.Size.
Related
After reading so much about how to do it, I'm quite confused.
So here is what I want to do:
I have a data structure/object that holds all kinds of information. I treat the data structure as if it were immutable. Whenever I need to update information, I make a deep copy and apply the changes to it. Then I swap the old and the newly created object.
Now I don't know how to do everything right.
Let's look at it from the side of the reader/consumer threads.
MyObj temp = dataSource;
var a = temp.a;
... // many instructions
var b = temp.b;
....
As I understand it, reading references is atomic. So I don't need any volatile or locking to assign the current reference of dataSource to temp. But what about garbage collection? As I understand it, the GC has some kind of reference counter to know when to free memory. So when another thread updates dataSource at exactly the moment dataSource is assigned to temp, does the GC increase the counter on the right memory block?
The other thing is compiler/CLR optimization. I assign dataSource to temp and use temp to access data members. What does the CLR do? Does it really make a copy of dataSource, or does the optimizer just use dataSource to access .a and .b? Let's assume that between temp.a and temp.b there are lots of instructions, so that the reference to temp/dataSource cannot be held in a CPU register. So is temp.b really temp.b, or is it optimized to dataSource.b because the copy to temp can be optimized away? This is especially important if another thread updates dataSource to point to a new object.
Do I really need volatile, lock, ReadWriterLockSlim, Thread.MemoryBarrier or something else?
The important thing to me is that I want to make sure that temp.a and temp.b access the old data structure even when another thread updates dataSource to another newly created data structure. I never change data inside an existing structure. Updates are always done by creating a copy, updating the data and then updating the reference to point to the new copy of the data structure.
Maybe one more question. If I don't use volatile, how long does it take until all cores on all CPUs see the updated reference?
When it comes to volatile please have a look here: When should the volatile keyword be used in C#?
I have written a little test program:
namespace test1 {
public partial class Form1 : Form {
public Form1() { InitializeComponent(); }
Something sharedObj = new Something();
private void button1_Click(object sender, EventArgs e) {
Thread t = new Thread(Do); // Kick off a new thread
t.Start(); // running Do()
for (int i = 0; i < 1000; i++) {
Something reference = sharedObj;
int x = reference.x; // sharedObj.x;
System.Threading.Thread.Sleep(1);
int y = reference.y; // sharedObj.y;
if (x != y) {
button1.Text = x.ToString() + "/" + y.ToString();
Update();
}
}
}
private void Do() {
for (int i = 0; i < 1000000; i++) {
Something someNewObject = sharedObj.Clone(); // clone from immutable
someNewObject.Do();
sharedObj = someNewObject; // atomic
}
}
}
class Something {
public Something Clone() { return (Something)MemberwiseClone(); }
public void Do() { x++; System.Threading.Thread.Sleep(0); y++; }
public int x = 0;
public int y = 0;
}
}
In button1_Click there is a for-loop, and inside the for-loop I access a data structure/object once using "sharedObj" directly and once using a temporarily created "reference". Using the reference is enough to make sure that "x" and "y" are initialized with values from the same object.
The only thing I don't understand is, why is "Something reference = sharedObj;" not optimized away and "int x = reference.x;" not replaced by "int x = sharedObj.x;"?
How do the compiler and jitter know not to optimize this? Or are temporary locals never optimized away in C#?
But most important: Is my example running as intended because it is correct or is it running as intended only by accident?
As I understand it, reading references is atomic.
Correct. This is a very limited property though. It means reading a reference will work; you'll never get the bits of half an old reference mixed with the bits of half a new reference, resulting in a reference that doesn't work. If there's a concurrent change, it promises nothing about whether you get the old or the new reference (what would such a promise even mean?).
So I don't need any volatile or locking to assign the current reference of dataSource to temp.
Maybe, though there are cases where this can have problems.
But what about garbage collection? As I understand it, the GC has some kind of reference counter to know when to free memory.
Incorrect. There is no reference counting in .NET garbage collection.
If there is a static reference to an object, then it is not eligible for reclamation.
If there is an active local reference to an object, then it is not eligible for reclamation.
If there is a reference to an object in a field of an object that is not eligible for reclamation, then it too is not eligible for reclamation, recursively.
There's no counting here. Either there is an active strong reference prohibiting reclamation, or there isn't.
This has a great many very important implications. Of relevance here is that there can never be any incorrect reference counting, since there is no reference counting. Strong references will not disappear under your feet.
The other thing is compiler/CLR optimization. I assign dataSource to temp and use temp to access data members. What does the CLR do? Does it really make a copy of dataSource or does the optimizer just use dataSource to access .a and .b?
That depends on what dataSource and temp are as far as whether they are local or not, and how they are used.
If dataSource and temp are both local, then it is perfectly possible that either the compiler or the jitter would optimise the assignment away. If they are both local though, they are both local to the same thread, so this isn't going to impact multi-threaded use.
If dataSource is a field (static or instance), then since temp is definitely a local in the code shown (because it's initialised in the code fragment shown) the assignment cannot be optimised away. For one thing, grabbing a local copy of a field is in itself a possible optimisation, it being faster to do several operations on a local reference than to continually access a field or static. There's not much point having a compiler or jitter "optimisation" that just makes things slower.
Consider what actually happens if you were to not use temp:
var a = dataSource.a;
... // many instructions
var b = dataSource.b;
To access dataSource.a the code must first obtain a reference to dataSource and then access a. Afterwards it obtains a reference to dataSource and then accesses b.
Optimising by not using a local makes no sense, since there's going to be an implicit local anyway.
And there is the simple fact that the scenario you fear is something already accounted for: after temp = dataSource there is no assumption that temp == dataSource, because other threads could be changing dataSource, so it's not valid to make optimisations predicated on temp == dataSource.*
Really the optimisations you are concerned about are either not relevant or not valid and hence not going to happen.
There is a case that could cause you problems though. It is just about possible for a thread running on one core not to see a change to dataSource made by a thread running on another core. As such, if you have:
/* Thread A */
dataSource = null;
/* Some time has passed */
/* Thread B */
var isNull = dataSource == null;
Then there's no guarantee that just because Thread A had finished setting dataSource to null, that Thread B would see this. Ever.
The memory models in use in .NET itself and in the processors .NET generally runs on (x86 and x86-64) prevent that happening today, but it could happen under possible future optimisations. You need memory barriers to ensure that Thread A's publishing definitely becomes visible to Thread B's reading. lock and volatile are both ways to ensure that.
*One doesn't even need to be multi-threaded for this to not follow, though it is possible to prove in particular cases that there are no single-thread changes that would break that assumption. That doesn't really matter though, because the multi-threaded case still applies.
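As a sketch of the publish/consume pattern under discussion (the names here are illustrative, not from the question), marking the shared field volatile is one way to guarantee visibility:

```csharp
using System;

class Publisher
{
    // volatile guarantees that a write by one thread becomes visible to
    // reads by other threads; the reference write itself is atomic.
    private volatile int[] _data = Array.Empty<int>();

    // Writers swap in a fresh, fully-built array; they never mutate
    // the array that is currently published.
    public void Update(int[] newData) { _data = newData; }

    // Readers grab the reference once and work with that snapshot,
    // exactly like the temp = dataSource idiom in the question.
    public int[] Read() { return _data; }
}

class Program
{
    static void Main()
    {
        var p = new Publisher();
        p.Update(new[] { 1, 2, 3 });
        var snapshot = p.Read();        // local copy of the reference
        Console.WriteLine(snapshot.Length);
    }
}
```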
Wonder if anyone could explain what is going on with this weird little optimisation in our draw code. We replaced the first little bit of code with the second and got a huge speed increase (4400 ticks -> 15 ticks, using the Stopwatch class).
// Add all the visible sprites to the list
m_renderOrder.Clear();
foreach (CSpriteInternalData sprite in m_InternalData)
{
if (!sprite.m_bRender) continue;
m_renderOrder.Add(sprite);
}
Replaced with...
// Add all the visible sprites to the list
m_renderOrder.Clear();
renderOrderCount = 0;
for (int i = 0; i < m_numSprites; i++ )
{
if (m_InternalData[i].m_bRender)
m_renderOrder[renderOrderCount++] = m_InternalData[i];
}
It looks to be the simplest little change, for such a huge increase in speed. Can anyone help?
If CSpriteInternalData is a struct, i.e. a value type, each time when you assign a value of that type to a variable, a copy is done.
MyStruct a = new MyStruct(50);
MyStruct b = a; //a is copied to b;
a.Value = 10;
Console.WriteLine(b.Value); // still 50; b has a separate copy of the value
If structs are small and cheap to copy, that is not much of a problem, but if the structs are large, the copies get slow. foreach creates a variable that is repeatedly assigned a value from the collection, so if CSpriteInternalData is a struct, each element is in turn copied to the sprite variable, and that can take time.
Also, the line where you Add the item to the m_renderOrder collection makes another copy of the structure, but I guess only a few of them have the m_bRender flag set, so that one does not take too much time.
If that is the cause of the slowdown / speedup I would wholeheartedly recommend that you change CSpriteInternalData to a class, that would use reference behavior, and just copy references around, instead of whole copies.
foreach always creates an instance of the enumerator returned by the GetEnumerator method, and that enumerator keeps state throughout the life cycle of the foreach loop.
It then repeatedly calls MoveNext() and Current on the enumerator and runs your code for each object it returns.
You can write your own enumerator if you need to, but it is better to use a for loop instead when execution time matters.
The foreach loop has a slightly different purpose. It is meant for iterating through some collection that implements IEnumerable. Its performance can be slower.
For reference, see:
http://www.codeproject.com/Articles/6759/FOREACH-Vs-FOR-C
I quite often write code that copies member variables to a local stack variable in the belief that it will improve performance by removing the pointer dereference that has to take place whenever accessing member variables.
Is this valid?
For example
public class Manager {
private readonly Constraint[] mConstraints;
public void DoSomethingPossiblyFaster()
{
var constraints = mConstraints;
for (var i = 0; i < constraints.Length; i++)
{
var constraint = constraints[i];
// Do something with it
}
}
public void DoSomethingPossiblySlower()
{
for (var i = 0; i < mConstraints.Length; i++)
{
var constraint = mConstraints[i];
// Do something with it
}
}
}
My thinking is that DoSomethingPossiblyFaster is actually faster than DoSomethingPossiblySlower.
I know this is pretty much a micro optimisation, but it would be useful to have a definitive answer.
Edit
Just to add a little bit of background around this. Our application has to process a lot of data coming from telecom networks, and this method is likely to be called about 1 billion times a day for some of our servers. My view is that every little helps, and sometimes all I am trying to do is give the compiler a few hints.
Which is more readable? That should usually be your primary motivating factor. Do you even need to use a for loop instead of foreach?
As mConstraints is readonly I'd potentially expect the JIT compiler to do this for you - but really, what are you doing in the loop? The chances of this being significant are pretty small. I'd almost always pick the second approach simply for readability - and I'd prefer foreach where possible. Whether the JIT compiler optimizes this case will very much depend on the JIT itself - which may vary between versions, architectures, and even how large the method is or other factors. There can be no "definitive" answer here, as it's always possible that an alternative JIT will optimize differently.
If you think you're in a corner case where this really matters, you should benchmark it - thoroughly, with as realistic data as possible. Only then should you change your code away from the most readable form. If you're "quite often" writing code like this, it seems unlikely that you're doing yourself any favours.
Even if the readability difference is relatively small, I'd say it's still present and significant - whereas I'd certainly expect the performance difference to be negligible.
If the compiler/JIT isn't already doing this or a similar optimization for you (this is a big if), then DoSomethingPossiblyFaster should be faster than DoSomethingPossiblySlower. The best way to explain why is to look at a rough translation of the C# code to straight C.
When a non-static member function is called, a hidden pointer to this is passed into the function. You'd have roughly the following, ignoring virtual function dispatch since it's irrelevant to the question (or equivalently making Manager sealed for simplicity):
struct Manager {
Constraint* mConstraints;
int mLength;
};
void DoSomethingPossiblyFaster(Manager* this) {
Constraint* constraints = this->mConstraints;
int length = this->mLength;
for (int i = 0; i < length; i++)
{
Constraint constraint = constraints[i];
// Do something with it
}
}
void DoSomethingPossiblySlower(Manager* this)
{
for (int i = 0; i < this->mLength; i++)
{
Constraint constraint = (this->mConstraints)[i];
// Do something with it
}
}
The difference is that in DoSomethingPossiblyFaster, mConstraints lives on the stack and access only requires one layer of pointer indirection, since it's at a fixed offset from the stack pointer. In DoSomethingPossiblySlower, if the compiler misses the optimization opportunity, there's an extra pointer indirection. The compiler has to read a fixed offset from the stack pointer to access this and then read a fixed offset from this to get mConstraints.
There are two possible optimizations that could negate this hit:
The compiler could do exactly what you did manually and cache mConstraints on the stack.
The compiler could store this in a register so that it doesn't need to fetch it from the stack on every loop iteration before dereferencing it. This means that fetching mConstraints from this or from the stack is basically the same operation: A single dereference of a fixed offset from a pointer that's already in a register.
You know the response you will get, right? "Time it."
There is probably not a definitive answer. First, the compiler might do the optimization for you. Second, even if it doesn't, indirect addressing at the assembly level may not be significantly slower. Third, it depends on the cost of making the local copy, compared to the number of loop iterations. Then there are caching effects to consider.
I love to optimize, but this is one place I would definitely say wait until you have a problem, then experiment. This is a possible optimization that can be added when needed, not one of those optimizations that needs to be planned up front to avoid a massive ripple effect later.
Edit: (towards a definitive answer)
Compiling both functions in release mode and examining the IL with ILDasm shows that in both places where the "PossiblyFaster" function uses the local variable, it needs one less instruction:
ldloc.0 vs
ldarg.0; ldfld class Constraint[] Manager::mConstraints
Of course, this is still one level removed from the machine code - you don't know what the JIT compiler will do for you. But it is likely that "PossiblyFaster" is marginally faster.
However, I still don't recommend adding the extra variable until you are sure this function is the most expensive thing in your system.
I've profiled this and came up with a bunch of interesting results that are probably only valid for my specific example, but I thought would be worth while noting here.
The fastest is X86 release mode. That runs one iteration of my test in 7.1 seconds, whereas the equivalent X64 code takes 8.6 seconds. This was running 5 iterations, each iteration processing the loop 19.2 million times.
The fastest approach for the loop was:
foreach (var constraint in mConstraints)
{
... do stuff ...
}
The second fastest approach, which massively surprised me was the following
for (var i = 0; i < mConstraints.Length; i++)
{
var constraint = mConstraints[i];
... do stuff ...
}
I guess this was because mConstraints was stored in a register for the loop.
This slowed down when I removed the readonly option for mConstraints.
So, my summary from this is that the readable version also gives good performance in this situation.
List<int> list = ...
for(int i = 0; i < list.Count; ++i)
{
...
}
So does the compiler know the list.Count does not have to be called each iteration?
Are you sure about that?
List<int> list = new List<int> { 0 };
for (int i = 0; i < list.Count; ++i)
{
if (i < 100)
{
list.Add(i + 1);
}
}
If the compiler cached the Count property above, the contents of list would be 0 and 1. If it did not, the contents would be the integers from 0 to 100.
Now, that might seem like a contrived example to you; but what about this one?
List<int> list = new List<int>();
int i = 0;
while (list.Count <= 100)
{
list.Add(i++);
}
It may seem as if these two code snippets are completely different, but that's only because of the way we tend to think about for loops versus while loops. In either case, the value of a variable is checked on every iteration. And in either case, that value very well could change.
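The equivalence is easier to see if the for loop from the first snippet is rewritten mechanically as a while loop; the condition reads list.Count on every iteration either way:

```csharp
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        var list = new List<int> { 0 };

        // The for loop from the example, written as a while loop.
        int i = 0;
        while (i < list.Count)   // Count is re-read on every iteration
        {
            if (i < 100)
                list.Add(i + 1);
            ++i;
        }

        // Because Count is not cached, the loop keeps going as the list
        // grows, producing the integers from 0 to 100.
        Console.WriteLine(list.Count);
    }
}
```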
Typically it's not safe to assume the compiler optimizes something when the behavior between "optimized" and "non-optimized" versions of the same code is actually different.
The C# compiler does not do any optimizations like this. The JIT compiler, however, optimizes this for arrays (which are not resizable), I believe, but not for lists.
A List's count property can change within the loop structure, so it would be an incorrect optimization.
It's worth noting, as nobody else has mentioned it, that there is no knowing from looking at a loop like this what the "Count" property will actually do, or what side effects it may have.
Consider the following cases:
A third party implementation of a property called "Count" could execute any code it wished to, e.g. return a random number for all we know. With List we can be a bit more confident about how it will operate, but how is the JIT to tell these implementations apart?
Any method call within the loop could potentially alter the return value of Count (not just a straight "Add" directly on the collection, but a user method that is called in the loop might also party on the collection)
Any other thread that happens to be executing concurrently could also change the Count value.
The JIT just can't "know" that Count is constant.
However, the JIT compiler can make the code run much more efficiently by inlining the implementation of the Count property (as long as it is a trivial implementation). In your example it may well be inlined down to a simple test of a variable value, avoiding the overhead of a function call on each iteration, and thus making the final code nice and fast. (Note: I don't know if the JIT will do this, just that it could. I don't really care - see the last sentence of my answer to find out why)
But even with inlining, the value may still be changed between iterations of the loop, so it would still need to be read from RAM for each comparison. If you were to copy Count into a local variable and the JIT could determine by looking at the code in the loop that the local variable will remain constant for the loop's lifetime, then it may be able to further optimise it (e.g. by holding the constant value in a register rather than having to read it from RAM on each iteration). So if you (as a programmer) know that Count will be constant for the lifetime of the loop, you may be able to help the JIT by caching Count in a local variable. This gives the JIT the best chance of optimising the loop. (But there are no guarantees that the JIT will actually apply this optimisation, so it may make no difference to the execution times to manually "optimise" this way. You also risk things going wrong if your assumption (that Count is constant) is incorrect. Or your code may break if another programmer edits the contents of the loop so that Count is no longer constant, and he doesn't spot your cleverness)
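For example, the manual hoisting described above might look like the sketch below; as stated, it is safe only under the assumption that the loop body never changes the count:

```csharp
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        var list = new List<int> { 1, 2, 3, 4 };

        // Read Count once; valid only because this loop body provably
        // never adds to or removes from the list.
        int count = list.Count;
        int sum = 0;
        for (int i = 0; i < count; i++)
            sum += list[i];

        Console.WriteLine(sum);
    }
}
```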
So the moral of the story is: the JIT can make a pretty good stab at optimising this case by inlining. Even if it doesn't do this now, it may do it with the next C# version. You might not gain any advantage by manually "optimising" the code, and you risk changing its behaviour and thus breaking it, or at least making future maintenance of your code more risky, or possibly losing out on future JIT enhancements. So the best approach is to just write it the way you have, and optimise it when your profiler tells you that the loop is your performance bottleneck.
Hence, IMHO it's interesting to consider/understand cases like this, but ultimately you don't actually need to know. A little bit of knowledge can be a dangerous thing. Just let the JIT do its thing, and then profile the result to see if it needs improving.
If you take a look at the IL generated for Dan Tao's example you'll see a line like this at the condition of the loop:
callvirt instance int32 [mscorlib]System.Collections.Generic.List`1<int32>::get_Count()
This is undeniable proof that Count (i.e. get_Count()) is called for every iteration of the loop.
For all the other commenters who say that the 'Count' property could change in a loop body: JIT optimizations let you take advantage of the actual code that's running, not the worst-case of what might happen. In general, the Count could change. But it doesn't in all code.
So in the poster's example (which might not have any Count-changing), is it unreasonable for the JIT to detect that the code in the loop doesn't change whatever internal variable List uses to hold its length? If it detects that list.Count is constant, wouldn't it lift that variable access out of the loop body?
I don't know if the JIT does this or not. But I am not so quick to brush this problem off as trivially "never."
No, it doesn't, because the condition is evaluated on each step. It can be more complex than just a comparison with Count; any boolean expression is allowed:
for(int i = 0; new Random().NextDouble() < .5d; i++)
Console.WriteLine(i);
http://msdn.microsoft.com/en-us/library/aa664753(VS.71).aspx
It depends on the particular implementation of Count; I've never noticed any performance issues with using the Count property on a List so I assume it's ok.
In this case you can save yourself some typing with a foreach.
List<int> list = new List<int> { 0 };
foreach (int item in list)
{
// ...
}
According to [MSDN: Array usage guidelines](http://msdn.microsoft.com/en-us/library/k2604h5s(VS.71).aspx):
Array Valued Properties
You should use collections to avoid code inefficiencies. In the following code example, each call to the myObj property creates a copy of the array. As a result, 2n+1 copies of the array will be created in the following loop.
[Visual Basic]
Dim i As Integer
For i = 0 To obj.myObj.Count - 1
DoSomething(obj.myObj(i))
Next i
[C#]
for (int i = 0; i < obj.myObj.Count; i++)
DoSomething(obj.myObj[i]);
Other than the change from myObj[] to ICollection myObj, what else would you recommend? Just realized that my current app is leaking memory :(
Thanks;
EDIT: Would forcing C# to pass references w/ ref (safety aside) improve performance and/or memory usage?
No, it isn't leaking memory - it is just making the garbage collector work harder than it might. Actually, the MSDN article is slightly misleading: if the property created a new collection every time it was called, it would be just as bad (memory wise) as with an array. Perhaps worse, due to the usual over-sizing of most collection implementations.
If you know a method/property does work, you can always minimise the number of calls:
var arr = obj.myObj; // var since I don't know the type!
for (int i = 0; i < arr.Length; i++) {
DoSomething(arr[i]);
}
or even easier, use foreach:
foreach(var value in obj.myObj) {
DoSomething(value);
}
Both approaches only call the property once. The second is clearer IMO.
Other thoughts; name it a method! i.e. obj.SomeMethod() - this sets expectation that it does work, and avoids the undesirable obj.Foo != obj.Foo (which would be the case for arrays).
Finally, Eric Lippert has a good article on this subject.
Just as a hint for those who haven't used the ReadOnlyCollection mentioned in some of the answers:
[C#]
class XY
{
private X[] array;
public ReadOnlyCollection<X> myObj
{
get
{
return Array.AsReadOnly(array);
}
}
}
Hope this might help.
Whenever I have properties that are costly (like recreating a collection on call) I either document the property, stating that each call incurs a cost, or I cache the value as a private field. Property getters that are costly, should be written as methods.
Generally, I try to expose collections as IEnumerable rather than arrays, forcing the consumer to use foreach (or an enumerator).
It will not make copies of the array unless you make it do so. However, simply passing the reference to an array privately owned by an object has some nasty side-effects. Whoever receives the reference is basically free to do whatever he likes with the array, including altering the contents in ways that cannot be controlled by its owner.
One way of preventing unauthorized meddling with the array is to return a copy of the contents. Another (slightly better) is to return a read-only collection.
Still, before doing any of these things you should ask yourself if you are about to give away too much information. In some cases (actually, quite often) it is even better to keep the array private and instead provide methods that operate on the object owning it.
myObj will not create a new item unless you explicitly create one. So for better memory usage, I recommend using a private collection (List or similar) and exposing an indexer that returns the specified value from the private collection.