What are exaclty Large Objects in C#? - c#

Can I call an array of 1 million integers as large object? Or one instance of that object should be > 85 KB to be considered as large object?
If I make an array like int[1000000]. This whole object with each member is treated as one object with size > 85 KB. right?
If I have a class X {int i; string j} .. then List having count > 100000. Will this be saved to LOH?
Basically what I mean is if the size of an object like Class X is say 8.6KB and I make a datastructure like List myList. Then if the list count is 9 then it is not LO but if it has count 10 then it is?
I want a last answer(almost go all the answers):
Now I know that array is a collection of pointers of 8 bytes. So an array to be Large object it should have 85000/8 number of elements or more. Is that correct?

Any object larger than 85,000 bytes is considered to a large object and is treated differently during garbage collection. An array can itself be over 85,000 bytes large if all its references (aka pointers) make up that amount.
In case of arrays the actual count is not made up of the size of the objects but their references. Let's say you have an array of Customer, and let's assume that a Customer has 3 integers, each 8 bytes in size, and each reference is also 8 bytes in size. Then an array of 10 Customers actually takes up 80 bytes, not 240. The 24 bytes of each Customer are separate objects.
See this article for further information, and refer also to the Large Object Heap Improvements in .NET 4.5.

If you are asking in context of garbage collection then yes 85 KB would be a big object since big objects are immediately marked as a generation 2 objects.

Related

C# Large object in medium size collection

I'm pretty new to the memory problem. Hope you don't think this is a stupid question to ask.
I know that memory larger than 85,000 Bytes would be put into LOH in C#
i.e.
Byte[] hugeByteCollection = new Byte[85000];
I'm wondering if a collection with size 10000 - 20000 with an object that contains 10 member variables (byte type) will be put into LOH or SOH ?
The size of an array of objects is the number of objects times the pointer size. This is because only value types is stored in the array itself, reference types (objects) will be stored somewhere else and will not count towards the size of the array. So 85000/4=21250 objects, and 85000/8=10625 objects can be stored in an array on the SOH in 32bit and 64bit mode, respectively.
Edit:
Thanks to Hans Passant for pointing out that this assumes that the collection type used is an array and not a list. Lists resize themselves to be bigger than the content to avoid too many allocations. See this link for details

GC.GetTotalMemory use and its return value

I have a binary file save on disk of size 15KB but why its memory size is always 4 bytes only
long mem1=GC.GetTotalMemory(false);
Object[] array= new Object[1000000];
array[1]=obj; // obj is the object content of the file before it is saved on disk
long mem2=GC.GetTotalMemory(false);
long sizeOfOneElementInArray=(mem2-mem1)/1000000;
I am wrong about something somewhere. I think it is incorrect because 4 bytes is not enough to store even a hello world string, but why is it incorrect.
Thanks for any help.
Is the assumption that by assigning obj to the index [1] in the array o that it would take a substantial number of bytes? All you are doing is assigning a reference. Not only that, but all that new Object[1000000] did was create an array (space to associate 1,000,000 Object's and the memory required by Object[].), not allocate 1,000,000 Object's. I am sure someone can elaborate even more about the internal data structures being used and why 4 bytes shows up.
The key thing to realize is that assigned obj to o[1] is not allocating additional memory for obj. If you are trying to determine an approximation call GC.GetTotalMemory before obj is allocated, then after. In your test obj is already allocated before you call the first GC.GetTotalMemory
In general, when MSDN documentation says things like A number that is the best available approximation of the number of bytes currently allocated in managed memory it is a bad idea to rely on it for accurate values :-)
All kidding aside, if your object 'o' is not used in the function after line #3 in your example, it is possible that it is being collected between lines 3 and 4. Just a guess.
First, if I use the code as written, x becomes 0 for me, because o is not used after the assignment, so the GC can collect it. The following assumes o is not collected (e.g. by using GC.KeepAlive(o) at the end of the method).
Let's look carefully what each of the two important lines of your code do (assuming 32-bit architecture):
Object[] o = new Object[1000000];
This line allocates 1000000 * 4 + 16 bytes. Each element in the array takes 4 bytes, because it's a reference (pointer) to an object. 16 bytes is the overhead of an array.
o[1] = obj;
This line changes one of the references in o to reference to obj. This line allocates exactly 0 bytes.
I'm not sure why are you confused about the result. It has to be 4, unless there are some unreferenced objects from earlier part of the code. In that case, it could be less than 4 (even a negative number). Of course, if this were a multi-threaded application, the result could be pretty much anything, depending on what other threads do.
And this all assumes that GetTotalMemory() is precise, which it doesn't have to be.
In my experience, marshal.sizeof() is generally a better method of getting an object's true size than asking the garbage collector.
http://msdn.microsoft.com/en-us/library/y3ybkfb3.aspx

Wasted bytes in protocol buffer arrays?

I have a protocol buffer setup like this:
[ProtoContract]
Foo
{
[ProtoMember(1)]
Bar[] Bars;
}
A single Bar gets encoded to a 67 byte protocol buffer. This sounds about right because I know that a Bar is pretty much just a 64 byte array, and then there are 3 bytes overhead for length prefixing.
However, when I encode a Foo with an array of 20 Bars it takes 1362 bytes. 20 * 67 is 1340, so there are 22 bytes of overhead just for encoding an array!
Why does this take up so much space? And is there anything I can do to reduce it?
This overhead is quite simply the information it needs to know where each of the 20 objects starts and ends. There is nothing I can do different here without breaking the format (i.e. doing something contrary to the spec).
If you really want the gory details:
An array or list is (if we exclude "packed", which doesn't apply here) simply a repeated block of sub-messages. There are two layouts available for sub-messages; strings and groups. With a string, the layout is:
[header][length][data]
where header is the varint-encoded mash of the wire-type and field-number (hex 08 in this case with field 1), length is the varint-encoded size of data, and data is the sub-object itself. For small objects (data less than 128 bytes) this often means 2 bytes overhead per object, depending on a: the field number (fields above 15 take more space), and b: the size of the data.
With a group, the layout is:
[header][data][footer]
where header is the varint-encoded mash of the wire-type and field-number (hex 0B in this case with field 1), data is the sub-object, and footer is another varint mash to indicate the end of the object (hex 0C in this case with field 1).
Groups are less favored generally, but they have the advantage that they don't incur any overhead as data grows in size. For small field-numbers (less than 16) again the overhead is 2 bytes per object. Of course, you pay double for large field-numbers, instead.
By default, arrays aren't actually passed as arrays, but as repeated members, which have a little more overhead.
So I'd guess you actually have 1 byte of overhead for each repeated array element, plus 2 extra bytes overhead on top.
You can lose the overhead by using a "packed" array. protobuf-net supports this: http://code.google.com/p/protobuf-net/
The documentation for the binary format is here: http://code.google.com/apis/protocolbuffers/docs/encoding.html

Estimate/Calculate Session Memory Usage

I would like to estimate the amount of memory used by each session on my server in my ASP.NET web application. A few key questions:
How much memory is allocated just to have each Session instance?
Is the memory usage of each variable equal to its given address space (e.g. 32 bits for an Int32)?
What about variables with variable address space (e.g. String, Array[]s)?
What about custom object instances (e.g. MyCustomObject which holds various other things)?
Is anything added for each variable (e.g. address of Int32 variable to tie it to the session instance) adding to overhead-per-variable?
Would appreciate some help in figuring out on how I can exactly predict how much memory each session will consume. Thank you!
The HttpSessionStateContainer class has ten local varaibles, so that is roughly 40 bytes, plus 8 bytes for the object overhead. It has a session id string and an items collection, so that is something like 50 more bytes when the items collection is empty. It has some more references, but I believe those are references to objects shared by all session objects. So, all in all that would make about 100 bytes per session object.
If you put a value type like Int32 in the items collection of a session, it has to be boxed. With the object overhead of 8 bytes it comes to 12 bytes, but due to limitations in the memory manager it can't allocate less than 16 bytes for the object. With the four bytes for the reference to the object, an Int32 needs 20 bytes.
If you put a reference type in the items collection, you only store the reference, so that is just four bytes. If it's a literal string, it's already created so it won't use any more memory. A created string will use (8 + 8 + 2 * Length) bytes.
An array of value types will use (Length * sizeof(type)) plus a few more bytes. An array of reference types will use (Length * 4) plus a few more bytes for the references, and each object is allocated separately.
A custom object uses roughly the sum of the size of it's members, plus some extra padding in some cases, plus the 8 bytes of object overhead. An object containing an Int32 and a Boolean (= 5 bytes) will be padded to 8 bytes, plus the 8 bytes overhead.
So, if you put a string with 20 characters and three integers in a session object, that will use about (100 + (8 + 8 + 20 *2) + 3 * (20)) = 216 bytes. (However, the session items collection will probably allocate a capacity of 16 items, so that's 64 bytes of which you are using 16, so the size would be 264 bytes.)
(All the sizes are on a 32 bit system. On a 64 bit system each reference is 8 bytes instead of 4 bytes.)
.NET Memory Profiler is your friend:
http://memprofiler.com/
You can download the trial version for free and run it. Although these things sometimes get complicated to install and run, I found it surprisingly simple to connect to a running web server and inspect all the objects it was holding in memory.
You can probably get some of these using Performance Counters and Custom Performance Counters. I never tested these with ASP.NET, but otherwise they'r quite nice to measure performances.
This old, old article from Microsoft containing performance tips for ASP (not ASP.NET) states that each session has an overhead of about 10 kilobytes. I have no idea if this is also applicable to ASP.NET, but it sure is a lot more than then 100 bytes overhead the Guffa mentions.
Planning a large scale application there are some other things you probably need to consider besides rough memory usage.
It depends on a Session State Provider you choose, and default in-process sessions are probably not the case at all.
In case of Out-Of-Process session storage (which may be the preferred for scalable applications) the picture will be completely different, and depends on how session objects are serialized and stored.
With SQL session storage, there will not be linear RAM consumption.
I would recommend integration testing with an out-of-process flavour of session state Provider from the very beginning for a large-scale application.

sizeof() equivalent for reference types?

I'm looking for a way to get the size of an instance of a reference type. sizeof is only for value types. Is this possible?
You need Marshal.SizeOf
Edit: This is for unsafe code, but then, so is sizeof().
If you don't mind it being a little less accurate than perfect, and for comparative purposes, you could serialize the object/s and measure that (in bytes for example)
EDIT (I kept thinking after posting): Because it's a little more complicated than sizeof for valuetypes, for example: reference types can have references to other objects and so on... there's not an exact and easy way to do it that I know of...
I had a similar question recently and wanted to know the size of Object and LinkedListNode in C#. To solve the problem, I developed a program that would:
Measure the program's "Working Set"
Allocate a lot of objects.
Measure the "Working Set" again.
Divide the difference by the number of allocated objects.
On my computer (64-bit), I got the following data:
Measuring Object:
iter working set size estimate
-1 11190272
1000000 85995520 74.805248
2000000 159186944 73.998336
3000000 231473152 73.4276266666667
4000000 306401280 73.802752
5000000 379092992 73.580544
6000000 451387392 73.3661866666667
7000000 524378112 73.3125485714286
8000000 600096768 73.613312
9000000 676405248 73.9127751111111
Average size: 73.7577032239859
Measuring LinkedListNode<Object>:
iter working set size estimate
-1 34168832
1000000 147959808 113.790976
2000000 268963840 117.397504
3000000 387796992 117.876053333333
4000000 507973632 118.4512
5000000 628379648 118.8421632
6000000 748834816 119.110997333333
7000000 869265408 119.299510857143
8000000 993509376 119.917568
9000000 1114038272 119.985493333333
Average size: 118.296829561905
Estimated Object size: 29.218576886067
Estimated LinkedListNode<reference type> size: 44.5391263379189
Based on the data, the average size of allocating millions of Objects is approximately 29.2 bytes. A LinkedListNode object is approximately 44.5 bytes. This data illustrates two things:
It's very unlikely that the system is allocating a partial byte. The fractional measure of bytes indicates the overhead the CLR requires to allocate and track millions of reference types.
If we simply round-down the number of bytes, we're still unlikely to have the proper byte count for reference types. This is clear from the measure of Objects. If we round down, we assume the size is 29 bytes which, while theoretically possible, is unlikely because of padding. In order to improve performance, object allocations are usually padded for alignment purposes. I would guess that CLR objects will be 4 byte aligned.
Assuming CLR overhead and 4-byte alignment, I'd estimate an Object in C# is 28 bytes and a LinkedListNode is 44 bytes.
BTW Jon Skeet had the idea for the method above before I did and stated it in this answer to a similar question.
Beware that Marshal.SizeOf is for unsafe code...
I don't think it's possible for managed code though, maybe you can explain your problem, there may be another way to solve it
If you can - Serialize it!
Dim myObjectSize As Long
Dim ms As New IO.MemoryStream
Dim bf As New Runtime.Serialization.Formatters.Binary.BinaryFormatter()
bf.Serialize(ms, myObject)
myObjectSize = ms.Position
Please refer my answer in the below link.
It is possible via .sos.dll debugger extension
Find out the size of a .net object

Categories

Resources