Managing Arrays in C# (Memory management)


I have experience in C and C++ (languages without GC) but have only recently started using C# a lot. I have a question to which I think I know the answer, but I would like someone to confirm it, or correct me if I am wrong.
Say I have the following
int[,] g = new int[nx, ny];
Very easy. This just allocates memory for a 2D array of ints. After that I can use it, provided that I stay within nx and ny as the bounds of the array.
Now, suppose I want to do this several times but with different values of nx and ny every time (how these new values are calculated is of no importance),
so I would do
int[,] g;
for (int k = 0; k < numberOfTimes; k++)
{
    // re-calculate nx and ny
    g = new int[nx, ny];
    // work with g
}
Normally I would think that every time I allocate memory for g, I am leaking the previous block, which can never be reached again. In C or C++ I would obviously have to "delete" it.
But since C# has garbage collection, can I do the above with impunity?
In other words, is my code above safe enough?
Any suggestions to improve it?

Your code is safe enough.
The GC collects objects on the heap that are no longer reachable from any root (local variables on the stack, static fields, and so on). In your scenario, each time the variable in the loop is assigned a new array, the previous array loses its only reference and thus becomes eligible for collection at some later point.
Also, there is no need to declare the variable in the outer scope; declaring it inside the loop makes no difference once the code is compiled and optimized.
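As a minimal sketch of the loop with the variable declared inside it: the formulas for nx and ny below are made up purely for illustration (the question leaves them unspecified), and LoopDemo and Run are hypothetical names.

```csharp
using System;

public static class LoopDemo
{
    // Allocates a fresh 2D array on every iteration and returns the total
    // number of cells allocated over all iterations.
    public static long Run(int numberOfTimes)
    {
        long totalCells = 0;
        for (int k = 0; k < numberOfTimes; k++)
        {
            // Hypothetical recalculation of the dimensions.
            int nx = k + 1;
            int ny = 2 * (k + 1);

            // The array from the previous iteration is no longer referenced
            // once this iteration starts, so the GC may reclaim it.
            int[,] g = new int[nx, ny];
            g[nx - 1, ny - 1] = k;   // work with g
            totalCells += g.Length;
        }
        return totalCells;
    }

    public static void Main()
    {
        Console.WriteLine(Run(3)); // prints 28 (2 + 8 + 18 cells)
    }
}
```

No explicit cleanup call appears anywhere; when the old arrays are actually reclaimed is up to the runtime.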

Related

Details of write barriers in the .Net Garbage Collector

I have a large T[] (call it array) in generation 2 on the Large Object Heap. T is a reference type. I make the following assignment:
array[0] = new T(..);
Which object(s) are marked as dirty for the next Gen0/Gen1 mark phases of GC? The entire array instance, or just the new instance of T? Will the next Gen0/Gen1 GC mark phase have to go through every item of the array? (That would seem unnecessary and very inefficient.)
Are arrays special in this regard? Would it change the answer if the collection were e.g. a SortedList<K, T> and I added a new, maximal item?
I've read through many questions and articles, including the ones below, but I still don't think I've found a clear answer.
I'm aware that an entire range of memory is marked as dirty, not individual objects, but is the new array entry or the array itself the basis of this?
card table and write barriers in .net GC
Garbage Collector Basics and Performance Hints

Which object(s) are marked as dirty for the next Gen0/Gen1 mark phases of GC? The entire array instance, or just the new instance of T?
The 128B block containing the start of the array will be marked as dirty. The newly created instance (new T()) is a new object, so it will first be checked in a Gen 0 collection without involving the card table.
For simplicity, presuming that the start of the array is aligned on a 128B boundary, this means the first 128B will be invalidated. Presuming T is a reference type and you're on a 64-bit system (8-byte references), that's the first 16 items to check during the next collection.
Will the next Gen0/Gen1 GC mark phase have to go through every item of the array? (That would seem unnecessary and very inefficient.)
Just these 16 to 32 items, depending on the pointer size in this architecture.
Are arrays special in this regard? Would it change the answer if the collection were e.g. a SortedList and I added a new, maximal item?
Arrays are not special. A SortedList<K,T> maintains two arrays internally, so more blocks will end up dirty in the average case.
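The "16 to 32 items" figure is just the block size divided by the reference size. A small sketch of that arithmetic, taking the 128-byte block size from the answer above as an assumption rather than a documented constant (CardMath and ReferencesPerCard are hypothetical names):

```csharp
using System;

public static class CardMath
{
    // Block size as stated in the answer above; treated here as an assumption.
    public const int AssumedCardSizeBytes = 128;

    // Number of reference-sized slots that fit in one dirty block.
    public static int ReferencesPerCard(int pointerSizeBytes)
    {
        return AssumedCardSizeBytes / pointerSizeBytes;
    }

    public static void Main()
    {
        Console.WriteLine(ReferencesPerCard(8)); // 64-bit process: prints 16
        Console.WriteLine(ReferencesPerCard(4)); // 32-bit process: prints 32
        // Value for the process this code is actually running in.
        Console.WriteLine(ReferencesPerCard(IntPtr.Size));
    }
}
```

This ignores the array header and alignment, in the same way the answer does "for simplicity".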

Pretty sure it's tracking array slots, not the root that holds the reference to the array object itself.
By the way, if a particular card is set dirty, it has to scan 4K of memory. I've read somewhere that it now uses Windows' own mechanism, which lets you get notifications when a memory range of interest is written to.

Deallocate memory on unused objects forcibly

Let's consider the following scenario in a singly linked list:
I have been given the target node, which is the one to be deleted.
Let's assume the following data, where I receive the object that holds "3", which is the one I am going to delete:
1 -> 2 -> 3 -> 4 -> 5 -> 6
And the class structure is:
class DataHolder
{
    public int data;
    public DataHolder nxtPrt;
}

void Delete(DataHolder currentData)
{
    currentData.data = currentData.nxtPrt.data; // Now 3 is overwritten by 4
    /* (x) */ currentData.nxtPrt = /* (y) */ currentData.nxtPrt.nxtPrt;
    // Now the object that holds 4 (previously it held 3)
    // points to 5.
}
So now the original object 4 has become useless,
and I just want to free the space allotted to that original copy of 4.
But I can no longer track it, since I have altered the previous object to point to 5.
So at this point I have lost the original object 4.
Is there any way to forcefully ask the object to release the memory it occupies, as one does in C by deallocating it,
or do I have to depend on the GC to collect the unused space whenever it chooses?
Thanks in advance.

You're always relying on the GC, there's no way around it. And yes, it will clean up your other objects, as long as there's no reference to them. You can allocate unmanaged memory and deal with it as you see fit but, in that case, why are you using C#? Just use C(++).
But the simplest answer is don't write your own linked list. Just use LinkedList<YourStruct>. Learn your environment - the language(s), the libraries and the runtime. If you're just going to write C code in C#, you're going to hurt, nobody's going to understand your code and you gain hardly any benefit from working in C#. Again, if you don't want to use C#/.NET... don't. There's nothing inherently wrong with C or C++, or with unmanaged languages. Use the best tool for the job.
Don't think in C terms at all. It simply doesn't work in a GC'd/managed environment. Where does memory come from when you allocate it in C? Usually the stack or the heap, with a few bits in registers. In .NET, this is kind of abstracted away, but in practice, you still only have those three locations. However, they work differently. You can't allocate classes or arrays on the stack (there's limited support using unsafe code, but that's it). There are multiple heaps, and apart from the large object heap, they always allocate from the top, similar to a stack. So deallocating a single object has no value whatsoever: if you don't compact the heap to eliminate the free spots, you don't get lower memory usage, and you don't get any extra space for new objects.
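For illustration, here is a small sketch of the same deletion done with the built-in LinkedList<T> recommended above, using the data from the question. LinkedListDemo and RemoveValue are hypothetical names chosen for this example.

```csharp
using System;
using System.Collections.Generic;

public static class LinkedListDemo
{
    // Builds the list 1..6 from the question, removes the node holding
    // `target`, and returns the remaining items as text.
    public static string RemoveValue(int target)
    {
        var list = new LinkedList<int>(new[] { 1, 2, 3, 4, 5, 6 });

        // Find returns the first node with this value, or null if none exists.
        LinkedListNode<int> node = list.Find(target);
        if (node != null)
        {
            // Unlinks the node. Once nothing references it any more,
            // the GC is free to reclaim it; there is no explicit free step.
            list.Remove(node);
        }
        return string.Join(" -> ", list);
    }

    public static void Main()
    {
        Console.WriteLine(RemoveValue(3)); // prints 1 -> 2 -> 4 -> 5 -> 6
    }
}
```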

Return object to pool when no references point to it

OK, I want to do the following. To me it seems like a good idea, so if there's no way to do what I'm asking, I'm sure there's a reasonable alternative.
Anyways, I have a sparse matrix. It's pretty big and mostly empty. I have a class called MatrixNode that's basically a wrapper around each of the cells in the matrix. Through it you can get and set the value of that cell. It also has Up, Down, Left and Right properties that return a new MatrixNode that points to the corresponding cell.
Now, since the matrix is mostly empty, having a live node for each cell, including the empty ones, is an unacceptable memory overhead. The other solution is to make new instances of MatrixNode every time a node is requested. This will make sure that only the needed nodes are kept in the memory and the rest will be collected. What I don't like about it is that a new object has to be created every time. I'm scared about it being too slow.
So here's what I've come up with. Have a dictionary of weak references to nodes. When a node is requested, if it doesn't exist, the dictionary creates it and stores it as a weak reference. If the node does already exist (probably referenced somewhere), it just returns it.
Then, if the node doesn't have any live references left, instead of it being collected, I want to store it in a pool. Later, when a new node is needed, I want to first check the pool and only make a new node if there isn't one already available that can just have its data swapped out.
Can this be done?
A better question would be, does .NET already do this for me? Am I right in worrying about the performance of creating single use objects in large numbers?

Instead of guessing, you should make a performance test to see if there are any issues at all. You may be surprised to know that managed memory allocation can often outperform explicit allocation because your code doesn't have to pay for deallocation when your data goes out of scope.
Performance may become an issue only when you are allocating new objects so frequently that the garbage collector has no chance to collect them.
That said, there are sparse matrix implementations for C# already, like Math.NET and Meta.Numerics. These libraries are already optimized for performance and will probably avoid performance issues you would run into if you started your implementation from scratch.
An SO search for c# and sparse-matrix will return many related questions, including answers pointing to commercial libraries like ILNumerics (which has a community edition), NMath and Extreme Optimization's libraries.
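The performance test suggested at the start of this answer could begin as something like the rough sketch below. Node is a hypothetical stand-in for the asker's MatrixNode, the iteration count is arbitrary, and the numbers it prints depend on the machine, runtime and build configuration, so none are shown here. Some runtimes may also optimize away short-lived allocations, so treat the result as a first indication only.

```csharp
using System;
using System.Diagnostics;

// Hypothetical stand-in for a small wrapper object such as MatrixNode.
public sealed class Node
{
    public int Row;
    public int Col;
    public Node(int row, int col) { Row = row; Col = col; }
}

public static class AllocationBenchmark
{
    // Creates `count` short-lived nodes and returns a checksum so the
    // work is not trivially dead code.
    public static long AllocateMany(int count)
    {
        long checksum = 0;
        for (int i = 0; i < count; i++)
        {
            var node = new Node(i, i + 1);
            checksum += node.Row + node.Col;
        }
        return checksum;
    }

    public static void Main()
    {
        const int count = 10000000;
        int gen0Before = GC.CollectionCount(0);
        Stopwatch watch = Stopwatch.StartNew();
        long checksum = AllocateMany(count);
        watch.Stop();

        Console.WriteLine(count + " allocations in " + watch.ElapsedMilliseconds + " ms");
        Console.WriteLine("Gen 0 collections: " + (GC.CollectionCount(0) - gen0Before));
        Console.WriteLine("Checksum: " + checksum);
    }
}
```

Run it in a release build and compare against whatever pooling scheme is being considered before deciding that allocation is the bottleneck.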

Most sparse matrix implementations use one of a few well-known schemes for their data; I generally recommend CSR (compressed sparse row) or CSC (compressed sparse column), as those are efficient for common operations.
If that seems too complex, you can start with COO (coordinate list). What this means in your code is that you store nothing for empty cells, but you have one item for every non-empty one. A simple implementation might be:
public struct SparseMatrixItem
{
    public int Row;
    public int Col;
    public double Value;
}
And your matrix would generally be a simple container:
public interface SparseMatrix
{
    IList<SparseMatrixItem> Items { get; }
}
You should make sure that the Items list stays sorted according to the row and col indices, because then you can use binary search to quickly find out if an item exists for a specific (i,j).
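One way the sorted-list lookup described above might look, as a sketch only: CooMatrix and RowColComparer are hypothetical names, the backing store is a plain List<SparseMatrixItem>, and List<T>.BinarySearch does both the lookup and the search for the insertion point.

```csharp
using System;
using System.Collections.Generic;

public struct SparseMatrixItem
{
    public int Row;
    public int Col;
    public double Value;
}

// Orders items by row first, then by column; Value is ignored.
public sealed class RowColComparer : IComparer<SparseMatrixItem>
{
    public int Compare(SparseMatrixItem a, SparseMatrixItem b)
    {
        int byRow = a.Row.CompareTo(b.Row);
        return byRow != 0 ? byRow : a.Col.CompareTo(b.Col);
    }
}

public sealed class CooMatrix
{
    private static readonly RowColComparer Order = new RowColComparer();
    private readonly List<SparseMatrixItem> items = new List<SparseMatrixItem>();

    public int StoredCount { get { return items.Count; } }

    // Returns the stored value, or 0.0 when no item exists for (row, col).
    public double Get(int row, int col)
    {
        var key = new SparseMatrixItem { Row = row, Col = col };
        int index = items.BinarySearch(key, Order);
        return index >= 0 ? items[index].Value : 0.0;
    }

    // Overwrites an existing item or inserts a new one at its sorted position.
    // Note: this sketch stores explicit zeros instead of removing the item.
    public void Set(int row, int col, double value)
    {
        var item = new SparseMatrixItem { Row = row, Col = col, Value = value };
        int index = items.BinarySearch(item, Order);
        if (index >= 0)
            items[index] = item;
        else
            items.Insert(~index, item); // ~index is where the item belongs
    }
}
```

Lookups are O(log n), but each insert shifts the tail of the list, which is one reason CSR or CSC tends to be preferred once the matrix is built.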

The idea of having a pool of objects that people use and then return to the pool is used for really expensive objects. Objects representing a network connection, a new thread, etc. It sounds like your object is very small and easy to create. Given that, you're almost certainly going to harm performance pooling it; the overhead of managing the pool will be greater than the cost of just creating a new one each time.
Having lots of short lived very small objects is the exact case that the GC is designed to handle quickly. Creating a new object is dirt cheap; it's just moving a pointer up and clearing out the bits for that object. The real overhead for objects comes in when a new garbage collection happens; for that it needs to find all "alive" objects and move them around, leaving all "dead" objects in their place. If your small object doesn't live through a single collection it has added almost no overhead. Keeping the objects around for a long time (like, say, by pooling them so you can reuse them) means copying them through several collections, consuming a fair bit of resources.

Does array resizing invoke the GC?

I looked into the implementation of Array.Resize() and noticed that a new array is created and returned. I'm aiming for zero memory allocation during gameplay and so I need to avoid creating any new reference types. Does resizing an array trigger the Garbage Collector on the previous array? I'm creating my own 2D array resizer, but it essentially functions in the same way as the .NET Resize() method.
If the new array is smaller than the previous one, but excess objects have already been placed back into a generic object pool, will this invoke the GC?
Arrays will constantly be created in my game loop, so I need to try and make it as efficient as possible. I'm trying to create an array pool as such, so that there's no need to keep creating them ingame. However, if the resize method does the same thing, then it makes little sense to not just instantiate a new array instead of having the pool.
Thanks for the help

Array.Resize doesn't actually change the original array at all - anyone who still has a reference to it will be able to use it as before. Therefore there's no optimization possible. Frankly it's a badly named method, IMO :(
From the docs:
This method allocates a new array with the specified size, copies elements from the old array to the new one, and then replaces the old array with the new one.
So no, it's not going to reuse the original memory or anything like that. It's just creating a shallow copy with a different size.
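A short sketch that makes this visible: a second variable keeps pointing at the original array after the call. ResizeDemo and ResizedIsSameInstance are hypothetical names.

```csharp
using System;

public static class ResizeDemo
{
    // Returns true if Array.Resize handed back the same array instance.
    public static bool ResizedIsSameInstance(int oldSize, int newSize)
    {
        int[] data = new int[oldSize];
        int[] alias = data;              // second reference to the original
        Array.Resize(ref data, newSize); // data may now point at a new array
        return ReferenceEquals(data, alias);
    }

    public static void Main()
    {
        int[] original = { 1, 2, 3 };
        int[] alias = original;

        Array.Resize(ref original, 5);

        Console.WriteLine(ReferenceEquals(original, alias)); // prints False
        Console.WriteLine(alias.Length);                     // prints 3
        Console.WriteLine(original.Length);                  // prints 5
    }
}
```

The old three-element array stays alive for as long as alias refers to it; after that it is ordinary garbage.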

Yes, using Array.Resize causes a new array to be allocated and the old one to eventually be collected (unless there are still references to it somewhere).
A more low-level array resizer could possibly do some minor optimization in some cases (for example when the array is being made smaller or there happens to be memory available right after the array), but .NET's implementation doesn't do that.

Implicitly yes.
Explicitly no.

Any allocation will eventually be cleaned up by the GC when no more references exist, so yes.
If you want to avoid resizing your arrays, the best thing you could do would be to preallocate with a large enough size to avoid having to reallocate at all. In that case, you might as well just use a collection class with an initial capacity specified in the constructor, such as List<T>.
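A minimal sketch of that preallocation idea, assuming the largest size needed is known up front. The capacity of 1024, the frame loop and the names PreallocatedBuffer and CapacityAfterFrames are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class PreallocatedBuffer
{
    // Fills and clears one list repeatedly and returns its final capacity.
    public static int CapacityAfterFrames(int capacity, int frames, int itemsPerFrame)
    {
        // Allocate the backing array once, before the game loop starts.
        var buffer = new List<int>(capacity);

        for (int frame = 0; frame < frames; frame++)
        {
            buffer.Clear(); // resets Count but keeps the backing array
            for (int i = 0; i < itemsPerFrame; i++)
                buffer.Add(i); // no reallocation while Count <= Capacity
        }
        return buffer.Capacity;
    }

    public static void Main()
    {
        Console.WriteLine(CapacityAfterFrames(1024, 100, 1000)); // prints 1024
    }
}
```

As long as itemsPerFrame never exceeds the initial capacity, the list does not reallocate its backing array inside the loop.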

Moving objects inside arrays

I'm trying to make a Tetris-like game in XNA, and currently I'm thinking of what way would be the best to handle it.
This is what I have so far:
I have a class called Block, which has for example texture and color tint.
Then I was planning on having everything in a double array, like:
Block[,] blocks = new Block[10,20];
which would then be the full grid.
And then when the blocks move downwards, I was thinking of doing like this:
blocks[x,y+1] = blocks[x,y];
blocks[x,y] = null;
At first I thought this was a good idea, but now that I've thought about it more I'm not so sure. How does it work with memory and such? Does it create a new object every time I do that? Could someone please explain how it actually works when I move an object inside an array?
I'm not really looking for a Tetris-specific answer, I'm just interested in how it actually works.
Thanks.

No, you're just moving pointers around. When you say:
blocks[x,y+1] = blocks[x,y];
what you're essentially doing is swapping the pointer. The object will stay exactly where it is, but now instead of it being at index x,y it'll be at index of x , y+1. When you say
blocks[x,y] = null;
there you're removing the reference to the object at x,y, and if nothing else is holding a reference, the garbage collector will clean it up.

The first answer above is almost correct, but the assignment is not swapping the pointer, it is duplicating it. After the first line of code there are two references to the object originally referenced at blocks[x,y]. The null assignment removes the original reference, but you still have the new reference living at blocks[x,y+1]. Null that one and the heap object will be fair game for the GC.
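The two steps can be checked with a small sketch. Block is reduced here to a single field for illustration, and the indices are arbitrary; MoveDemo and MoveKeepsSameObject are hypothetical names.

```csharp
using System;

// Simplified stand-in for the asker's Block class.
public class Block
{
    public string Name;
}

public static class MoveDemo
{
    // Moves a block one cell down and reports whether the destination cell
    // refers to the very same object and the source cell is now empty.
    public static bool MoveKeepsSameObject()
    {
        var blocks = new Block[10, 20];
        var piece = new Block { Name = "piece" };
        blocks[3, 4] = piece;

        blocks[3, 5] = blocks[3, 4]; // duplicates the reference, no new Block
        bool bothCellsShareObject = ReferenceEquals(blocks[3, 4], blocks[3, 5]);

        blocks[3, 4] = null;         // drops only the original reference
        return bothCellsShareObject
            && ReferenceEquals(blocks[3, 5], piece)
            && blocks[3, 4] == null;
    }

    public static void Main()
    {
        Console.WriteLine(MoveKeepsSameObject()); // prints True
    }
}
```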

If you were storing value types (such as int or a struct) inside your array, you would indeed be creating a copy of the data each time you copied a value over, because assigning a value type copies the value itself. Note that string is a reference type, not a value type. Since you're storing a class (which is a reference type) in your array, your code is really just making a copy of the pointer, not the whole object.
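The difference can be seen by changing the copy and then looking at the original. PointValue and PointRef are hypothetical types made up for this sketch.

```csharp
using System;

public struct PointValue { public int X; } // value type
public class PointRef { public int X; }    // reference type

public static class CopyDemo
{
    // Copies a struct element, changes the copy, returns the original's X.
    public static int StructOriginalAfterCopy()
    {
        var values = new PointValue[2];
        values[0].X = 1;
        values[1] = values[0]; // copies the data itself
        values[1].X = 99;      // changes only the copy
        return values[0].X;
    }

    // Copies a class element, changes it via the copy, returns the original's X.
    public static int ClassOriginalAfterCopy()
    {
        var refs = new PointRef[2];
        refs[0] = new PointRef { X = 1 };
        refs[1] = refs[0];     // copies only the reference
        refs[1].X = 99;        // same object, seen through both slots
        return refs[0].X;
    }

    public static void Main()
    {
        Console.WriteLine(StructOriginalAfterCopy()); // prints 1
        Console.WriteLine(ClassOriginalAfterCopy());  // prints 99
    }
}
```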
