Array Allocation in .NET

Array Allocation in .NET - c#

The question is regarding the allocation of arrays in .net. i have a sample program below in which maximum array i can get is to length. I increase length to +1 it gives outofMemory exception. But If I keep the length and remove the comments I am able to allocation 2 different big arrays. Both arrays are less the .net permissible object size of 2 gb and total memory also less than virtual Memory. Can someone put any thoughts?
class Program
{
static int length = 203423225;
static double[] d = new double[length];
//static int[] i = new int[15000000];
static void Main(string[] args)
{
Console.WriteLine((sizeof(double)*(double)length)/(1024*1024));
Console.WriteLine(d.Length);
//Console.WriteLine(i.Length);
Console.WriteLine(Process.GetCurrentProcess().VirtualMemorySize64.ToString());
}
}

A 32-bit process must allocate virtual memory for the array from the address space it has available. by default 2 gigabytes. Which contains a mix of both code and data. Allocations are made from the holes between the existing allocations.
Such allocations always fail not because there is no more virtual memory left, they fail because the available holes are not big enough. And you asking for a big hole, getting 1.6 jiggabytes is very rare and will only work on very simple programs that don't load any additional DLLs. A poorly based DLL is a good way to cut a large hole in two, drastically reducing the odds of such an allocation succeeding. The more typical works-on-the-first-try allocation is around 650 megabytes. The second allocation didn't fail because there was another hole available. The odds go down considerably after a program has been running for a while and the address space got fragmented. A 90 MB allocation can fail.
You can get insight in how the virtual memory address space is carved up for a program with the SysInternals' VMMap utility.
A simple workaround is to set the EXE project's Platform target setting to AnyCPU and run the program on a 64-bit operating system. It will have gobs of addressable virtual memory space available, you'll only be limited by the maximum allowed size of the paging file and the .NET 2 gigabyte object size limit. A limitation that's addressed in .NET 4.5 with the new <gcAllowVeryLargeObjects> config element. Even a 32-bit program can take advantage of the available 4 gigabyte 32-bit address space on a 64-bit operating system with the /LARGEADDRESSAWARE option of editbin.exe, you'll have to run it in a post-build event.

This will be because when allocating memory for arrays the memory must be contiguous (i.e. the array must be allocated as one large block of memory). Even if there is enough space in total to allocate the array, if the free address space is split up then the memory allocation will still fail unless the largest of those free spaces is large enough for the entire array.

Related

Large object heap waste

I noticed that my application runs out of memory quicker than it should. It creates many byte arrays of several megabytes each. However when I looked at memory usage with vmmap, it appears .NET allocates much more than needed for each buffer. To be precise, when allocating a buffer of 9 megabytes, .NET creates a heap of 16 megabytes. The remaining 7 megabytes cannot be used to create another buffer of 9 megabytes, so .NET creates another 16 megabytes. So each 9MB buffer wastes 7MB of address space!
Here's a sample program that throws an OutOfMemoryException after allocating 106 buffers in 32-bit .NET 4:
using System.Collections.Generic;
namespace CSharpMemoryAllocationTest
{
class Program
{
static void Main(string[] args)
{
var buffers = new List<byte[]>();
for (int i = 0; i < 130; ++i)
{
buffers.Add(new byte[9 * 1000 * 1024]);
}
}
}
}
Note that you can increase the size of the array to 16 * 1000 * 1024 and still allocate the same amount of buffers before running out of memory.
VMMap shows this:
Also note how there's an almost 100% difference between the total Size of the Managed Heap and the total Commited size. (1737MB vs 946MB).
Is there a reliable way around this problem on .NET, i.e. can I coerce the runtime into allocating no more than what I actually need, or maybe much larger Managed Heaps that can be used for several contiguous buffers?

Internally the CLR allocates memory in segments. From your description it sounds like the 16 MB allocations are segments and your arrays are allocated within these. The remaining space is reserved and not really wasted under normal circumstances, as it will be used for other allocations. If you don't have any allocation that will fit within the remaining chunks these are essentially overhead.
As your arrays are allocated using contiguous memory you can only fit a single of those within a segment and hence the overhead in this case.
The default segment size is 16 MB, but if you allocation is larger than that the CLR will allocate segments that are larger. I'm not aware of the details, but e.g. if I allocate 20 MB Wmmap shows me 24 MB segments.
One way to reduce the overhead is to make allocations that fit with the segment sizes if possible. But keep in mind that these are implementation details and could change with any update of the CLR.

The CLR reserving a 16MB chunk in one go from the OS, but only actively occupying 9MB.
I believe you are expecting the 9MB and 9MB to be in one heap. The difficulty is that the variable is now split over 2 heaps.
Heap 1 = 9MB + 7MB
Heap 2 = 2MB
The problem we have now, is if the original 9MB is deleted, we now have 2 heaps we can't tidy up, as the contents are shared across heaps.
To improve performance, the approach is to put them in single heaps.
If you are worried about memory usage, don't be. Memory usage is not a bad thing with .NET, as it if no-one is using it, what's the problem? The GC will at some point kick in, and memory will be tidied up. GC will only kick in either
When the CLR deems it necessary
When the OS tells the CLR to give back memory
When forced to by the code
But memory usage, especially in this example shouldn't be a concern. The memory usage stops CPU cycles occurring. Otherwise if it tidied up memory constantly, your CPU would be high, and your process (and all others on your machine) would run much slower.

Age old symptom of the buddy system heap management algorithm, where powers of 2 are used to split each block recursively, in a binary tree, so for 9M the next size is 16M. If you dropped your array size down to 8mb, you will see your usage drop by half. Not a new problem, native programmers deal with it too.
The small object pool (less than 85,000 bytes) is managed differently, but at 9MB your arrays are in the large object pool. As of .NET 4.5, the large object heap doesn't participate in compaction, large objects are immediately promoted to generation 2.
You can't coerce the algorithm, but you can certainly coerce your user code by figuring out what sizes to use that will most efficiently fill the binary segments.
If you need to fill your process space with 9 MB arrays, either:
Figure out how to save 1MB to reduce the arrays to 8MB segments
Write or use a segmented array class that abstracts an array of 1 or 2MB array segments, using an indexer property. Same way you build an unlimited bitfield or a growable ArrayList. Actually, I thought one of the containers already did this.
Move to 64-bit
Reclaiming fragmented portion of a buddy system heap is an optimization with logarithmic returns (ie. you are approximately running out of memory anyway). At some point you'll have to move to 64-bit whether its convenient or not, unless your data size is fixed.

C# calling COM fails to allocate memory

I've got a problem with a C# application and a COM component allocating memory:
C# program calls a function in a COM DLL written in C++ which does matrix processing. The function allocates a lot of memory (around 800MB in eight 100MB chunks). This fails (malloc returns "bad allocation" when calling the function from C#.
If I run the same function from a C program, allocating the same amount of memory, then there's no problem allocating memory.
I've got 8GB RAM, Win7 x64 and there are plenty of free memory.
How to fix that it works to allocate memory when calling from the C# application?
I tried to google it, but didn't really know what to search for. Searched for setting heap size etc, but that didn't give anything.
Feel a bit lost! All help are appreciated!

Amount of physical memory (8 GB) is not the constraint that limits memory consumption of your application. Supposedly, you built 32-bit application which has a fundamental limit of 4 GB of directly addressable bytes. For historical reasons, the application not doing any magic has only half of this - 2 GB. This is where you allocate from, and this space is used for other needs. 100 MB chucks are large enough to reduce the effectively usable space because of memory/address fragmentation (you want not just 100 chunks, you request continuous ones).
The easiest solution here is to build 64-bit applications. The limits there are distant.
If you still want 32-bit code:
enable /LARGEADDRESSWARE on the hosting application binary to extend limit from 2 to 4 GB
use file mappings, which you can keep in physical memory with your data and map into metered address space on demand
allocate smaller chunks

What is the Max size of Two Dimensional character array?

I have one two dimensional array:
char[,] DataFile;
When I create a object:
DataFile=new char[45000,6000]
It throws an out of memory exception.
What is the Max Size of object in .Net 3.5? What Is the Max Length of char array?

Single objects still limited to 2 GB in size in CLR 4.0? already has quite a nice explanation of the limits in various circumstances.

Well, it depends.
Obviously it'll matter how much physical memory (RAM) you've got installed and/or how large you set up virtual memory (swap).
In any case, in 32bit Windows maximum object size is 2GB. But there's another limit: The process image must have a contiguous block of memory of the required size.
Your array is about 514MB large. You should check for yourself if you have sufficient resources available.

There is no actual limit, it just depends on how much RAM your computer has, and how much contiguous memory the runtime can allocate.

Impose a Memory Limit on the CLR

I'm trying to test the error handling of my code under limited memory situations.
I'm also keen to see how the performance of my code is affected in low memory situations where perhaps the GC has to run more often.
Is there a way of running a .Net applicataion (or NUnit test-suite) with limited memory? I know with Java you can limit the amount of memory the JVM has access to - is there something similar in .Net?

This is not an option in the CLR. Memory is managed very differently, there are at least 10 distinct heaps in a .NET process. A .NET program can use the entire virtual memory space available in a Windows process without restrictions.
The simplest approach is to just allocate the memory when your program starts. You have to be a bit careful, you cannot swallow too much in one gulp, the address space is fragmented due to it containing a mix of code and data at different addresses. Memory is allocated from the holes in between. To put a serious dent into the available address space, you have to allocate at least a gigabyte and that's not possible with a single allocation.
So just use a loop to allocate smaller chunks, say one megabyte at a time:
private static List<byte[]> Gobble = new List<byte[]>();
static void Main(string[] args) {
for (int megabyte = 0; megabyte < 1024; megabyte++)
Gobble.Add(new byte[1024 * 1024]);
// etc..
}
Note that this is very fast, the allocated address space is just reserved and doesn't occupy any RAM.

You can enlist your process into a Windows Job Object. You can set memory (and other) limits for the job. This is the cleanest and the only sane way to limit the amount of memory that your process can use.

Why Would an Out of Memory Exception be Thrown if Memory is Available?

I have a fairly simple C# application that has builds a large hashtable. The keys of this hashtable are strings, and the values are ints.
The program runs fine until around 10.3 million items are added to the hashtable, when an out of memory error is thrown on the line that adds an item to the hasbtable.
According to the task manager, my program is only using 797mb of memory, and there's still over 2gb available. It's a 32-bit machine, so I know only a total of 2gb can be used by one process, but that still leaves about 1.2gb that the hashtable should be able to expand into.
Why would an out of memory error be thrown?

In theory you get 2GB for the process, but the reality is that it's 2GB of contiguous memory, so if your process' memory is fragmented you get less than that.
Additionally I suspect the hash table like most data structures by default will double in size when it needs to grow thus causing a huge growth when the tipping point item is added.
If you know the size that it needs to be ahead of time (or have a reasonable over-estimate) it may help to specify the capacity in the constructor.
Alternatively if it's not crucial that it's in memory some sort of database solution may be better and give you more flexibility if it does reach the point that it can't fit in memory.

Probably it is due to memory fragmentation: you have still free memory but not contiguous. Memory is divided in pages, usually 4KB in size, so if you allocate 4 MB, you will need 1024 contiguous memory pages in your process addressing space (they have not be physically contiguous as the memory is virtualized per-process).
However memory for the hashtable hasn't do be contiguous (unless it is very badly implemented), so maybe it is some limit of the memory manager...

Use Process Explorer (www.sysinternals.com) and look at the Virtual Address Space of your process. In contrast with "Private Bytes" (which is the amount of memory taken by the process), the Virtual Address Space shows the highest memory address in use. If fragmentation is high, it will be much higher than the "Private Bytes".
If your application really needs that much memory:
Consider going to 64-bit
Enable the /LARGEADDRESSAWARE flag, which will give your 32-bit process 4GB of RAM under a 64-bit operating system, and 3GB if a 32-bit Windows is booted with the /3GB flag.

You're simply looking at the wrong column. Have a look at the "Commit Size" column, this one should be around 2GB.
http://windows.microsoft.com/en-us/windows-vista/What-do-the-Task-Manager-memory-columns-mean

The program you are running has limited resources thanks to the Visual Studio debugger trying to keep track of everything that you're doing in your application (breakpoints, references, the stack, etc.).
In addition to that, you might have more stuff that is still indisposed than you think-- the garbage collector is tiered, and collects large objects very slowly.
+-------+
| large | collected less often (~1/10+ cycles)
+-+-------+-+ |
| medium | |
+-+-----------+-+ V
| small | collected more often (~1/3 cycles)
+---------------+
NOTE: The numbers are from memory, so take it with a grain of salt.

You should use something like hibrid between array and linked list. Because the linked list takes more memory per item than array, but the array need continuous memory space.

I solved my version of this problem by unchecking the 'Prefer 32-bit' checkbox in the exe's Project Properties page under the Build tab

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.