From C# 5.0 in a Nutshell: The Definitive Reference, page 22:
Reference types require separate allocations of memory for the
reference and object. The object consumes as many bytes as its fields,
plus additional administrative overhead. The precise overhead is
intrinsically private to the implementation of the .NET runtime, but
at minimum the overhead is eight bytes, used to store a key to the
object’s type, as well as temporary information such as its lock state
for multithreading and a flag to indicate whether it has been fixed
from movement by the garbage collector. Each reference to an object
requires an extra four or eight bytes, depending on whether the .NET
runtime is running on a 32- or 64-bit platform.
I'm not quite sure I understand the bolded part (the last sentence of the quote) completely. It says that on 32-bit platforms a reference requires four bytes, and on 64-bit platforms it requires eight bytes.
So, let's say we have
string s = "Soner";
How can I check how many bytes this s reference requires?
You can use Environment.Is64BitProcess. If it returns true, every reference will be 8 bytes; if not, every reference will be 4 bytes. The type of the reference, and the contents of the object it refers to, are irrelevant.
EDIT: As noted in a now-deleted answer, IntPtr.Size is even simpler.
EDIT: As noted in comments, although currently all references in a CLR are the same size, it's just possible that at some point it will go down a similar path to Hotspot, which uses "compressed oops" in many cases to store references as 32-bit values even in a 64-bit process (without limiting the memory available).
If you really want to calculate the size of a reference at runtime, the following Reference.Size helper should work:
using System;
using System.Reflection.Emit;

public static class Reference
{
    public static readonly int Size = new Func<int>(delegate()
    {
        // Emit a tiny method whose body is "sizeof(object)"; for a reference
        // type, the sizeof opcode returns the size of the reference itself.
        var method = new DynamicMethod(string.Empty, typeof(int), null);
        var gen = method.GetILGenerator();
        gen.Emit(OpCodes.Sizeof, typeof(object));
        gen.Emit(OpCodes.Conv_I4);
        gen.Emit(OpCodes.Ret);
        return ((Func<int>)method.CreateDelegate(typeof(Func<int>)))();
    })();
}
But going with the other answers is probably a better idea.
To expand on Jon Skeet's answer, to get the number of bytes a reference occupies you can do this:
int bytesInRef = Environment.Is64BitProcess ? 8 : 4;
However, this is an implementation detail. Not only should you not worry about this, you should ignore it. Here's a good blog post on (another) implementation detail; it's still applicable because it talks about implementation details and why you shouldn't trust or depend on them: The Stack Is An Implementation Detail
Related
In a .NET Core 3.0 project, I have an interface that returns a Span<byte>. This works for a large set of classes, except for one particular implementation which generates its data on the fly (because it does not cache it).
The implementation looks like:
public Span<byte> Data => CompileBytes();
where CompileBytes would be something like this (abstract code, but pretty close to the real use case):
public byte[] CompileBytes()
{
    using (MemoryStream stream = new MemoryStream())
    {
        foreach (IDataSource data in DataSources)
            stream.Write(data.ByteArray);

        return stream.ToArray();
    }
}
I have been looking around online to see if there's a guarantee that this is safe to do but haven't found any.
My worry is that Span is a very thin layer over the data, one the GC might ignore: the GC could assume we will not let the span outlive the underlying buffer, and the temporary byte array that is created would eventually get collected. That would mean I could have a ticking time bomb on my hands if a GC happens while some other code is using the span. Is this the case? Or can I return a Span<> for a temporary object and be perfectly okay (under the assumption it is properly used by staying in the bounds of the span)?
The definition seems to be implementation-dependent, so with my limited knowledge I can't figure out whether it holds onto the reference or not... because if it does, then I am safe and my question is answered.
The MSDN documentation says 'memory safe', but I am unsure of the exact specifics by which they define memory safe and whether it covers my case. If it does, then this question is answered.
I am not using any unsafe code.
Having a reference to a managed array in a Span<T> is safe, even if it is the only reference to it.
As described in the article All About Span: Exploring a New .NET Mainstay, Span<T> uses a special way of storing these references, a ByReference<T>, which is implemented as a JIT intrinsic.
Quoting the linked article (section How Is Span<T> Implemented?):
Span<T> is actually written to use a special internal type in the runtime that’s treated as a just-in-time (JIT) intrinsic, with the JIT generating for it the equivalent of a ref T field
And section What Is Memory<T> and Why Do You Need It?
Span<T> is a ref-like type as it contains a ref field, and ref fields can refer not only to the beginning of objects like arrays, but also to the middle of them [...] These references are called interior pointers, and tracking them is a relatively expensive operation for the .NET runtime’s garbage collector.
The last part of that quote clarifies that the reference stored in Span<T> is indeed tracked by the GC, so it will not clean up memory that is still being referenced.
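If you want to see this for yourself, here is a minimal sketch (the class and method names are made up for the demo) that returns a span over an array with no other live reference and then forces a collection; the data stays reachable through the span:

using System;

class SpanLifetimeDemo
{
    static Span<byte> MakeSpan()
    {
        // After this method returns, the span is the only thing referencing the array.
        byte[] buffer = { 1, 2, 3, 4 };
        return buffer; // implicit conversion byte[] -> Span<byte>
    }

    static void Main()
    {
        Span<byte> s = MakeSpan();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();
        Console.WriteLine(s[0]); // still prints 1: the span's tracked reference keeps the array alive
    }
}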
C# 6.0 in a Nutshell by Joseph Albahari and Ben Albahari (O'Reilly; copyright 2016, ISBN 978-1-491-92706-9) introduces, on page 312, BitArray as one of the collection types .NET provides:
BitArray
A BitArray is a dynamically sized collection of compacted bool values.
It is more memory-efficient than both a simple array of bool and a
generic List of bool, because it uses only one bit for each value,
whereas the bool type otherwise occupies one byte for each value.
It's nice to have the possibility of declaring a collection of bits instead of working with bytes when you are interested in binary values only, but what about declaring a single bit-sized field?
like:
public class X
{
    public [bit-type] MyBit { get; set; }
}
Does .NET not support this?
The existing posts that touch on this topic talk about setting individual bits within, ultimately, a byte variable. I am asking whether .NET, having gone to the trouble of supporting bit values in a collection, also supports declaring a single such variable outside a collection.
So your question is whether .NET supports this or not. The answer is no.
Why? It's fundamentally possible to have such a feature. But the demand is really low. It's better to invest the developer time elsewhere.
If you want to make use of memory below byte granularity, you will need to build this yourself. BitArray is not intrinsic to the runtime; it manipulates the bits of some bigger type (I think it's int-based). You can do the same thing.
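For illustration, building it yourself boils down to packing flags into a single int and addressing individual bits with shifts and masks; a rough sketch (bit index 5 is arbitrary):

int bits = 0;

bits |= 1 << 5;                       // set bit 5
bits &= ~(1 << 5);                    // clear bit 5
bool isSet = (bits & (1 << 5)) != 0;  // test bit 5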
BitVector32 is a built-in struct that you can use to individually address 32 bits.
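A short usage sketch of BitVector32 (it lives in System.Collections.Specialized; the variable names are mine):

using System.Collections.Specialized;

var vector = new BitVector32(0);
int bit0 = BitVector32.CreateMask();      // mask for the first bit
int bit1 = BitVector32.CreateMask(bit0);  // mask for the next bit
vector[bit1] = true;                      // set the second bit
bool first = vector[bit0];                // false: the first bit is still clear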
As you can see in the .NET reference source, BitArray internally stores the values within an array of int:
public BitArray(int length, bool defaultValue) {
    ...
    m_array = new int[GetArrayLength(length, BitsPerInt32)];
    m_length = length;

    int fillValue = defaultValue ? unchecked(((int)0xffffffff)) : 0;
    for (int i = 0; i < m_array.Length; i++) {
        m_array[i] = fillValue;
    }

    _version = 0;
}
So the smallest thing a BitArray allocates is already an entire int, and even more once you store data in it. This also makes sense, since the memory used for addressing anything comes in data words, which are - depending on the architecture - at least 4 bytes long.
You can of course define your own type for storing a single bit, but it will also take at least a byte - if not a complete word plus overhead, should it be a reference type. Memory is allocated to a program by the OS in terms of memory addresses, which usually address bytes, so anything smaller than a byte cannot be addressed directly.
It takes a lot of binary values to make up for the space already lost by using the type in the first place, so the only really useful application of this technique of storing bits is when you have lots of them, so that you can profit from the 8:1 memory ratio.
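For completeness, a small BitArray sketch (the numbers are only illustrative): a million flags occupy roughly 125 KB of int storage here, versus about 1 MB for a bool[1000000].

using System.Collections;

var flags = new BitArray(1000000);  // one bit per flag, backed by an int[]
flags[42] = true;
bool answer = flags[42];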
Background
I am writing a managed x64 assembler (which is also a library), so it has multiple classes which define an unsigned 64-bit integer property for use as addresses and offsets. Some are file offsets, others are absolute addresses (relative to main memory), and still others are relative virtual addresses.
Problem
I use the ulong datatype for the properties in the mentioned scenarios, and this works fine. However, such properties are not CLS-compliant. I can mark them as [CLSCompliant(false)], but then I need to provide a CLS-compliant alternative for users of the library.
Options and questions
Some suggest providing an alternative property with a bigger data type, but this is not an option because there is no bigger signed integer primitive which could hold all values from 0 to UInt64.MaxValue.
I would rather not mark my entire assembly as non-CLS-compliant, because in most usage scenarios not all the possible values up to UInt64.MaxValue are used. So, for e.g. Address, I could provide an alternative long property AddressAlternative, which only accepts positive values. However, what should happen when Address somehow contains a value above Int64.MaxValue? Should AddressAlternative throw some exception?
And what would be an appropriate name for AddressAlternative?
Providing an alternative for every usage of ulong would result in many 'double' properties. Is there a better way to do this? Note that not all usages of ulong properties have the same semantics, so a single struct would not cut it.
And finally, I have the same CLS compliance problem in constructor parameters. So should I provide an alternative overload accepting long for such a parameter?
I do not mind restricting the use of (some functionality) of the library when it is used from a CLS-only context, as long as it can be used in most scenarios.
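To make the trade-off concrete, here is one possible shape for such a property pair; the class name is hypothetical, and the name AddressAlternative and the checked conversion are just the options discussed above, not a recommendation:

using System;

public class AddressedItem
{
    // Full-range property; hidden from CLS-compliant consumers.
    [CLSCompliant(false)]
    public ulong Address { get; set; }

    // CLS-compliant mirror; throws if the value does not fit in a long.
    public long AddressAlternative
    {
        get { return checked((long)Address); }  // OverflowException above Int64.MaxValue
        set
        {
            if (value < 0)
                throw new ArgumentOutOfRangeException("value");
            Address = (ulong)value;
        }
    }
}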
but when it represents an unsigned address above Int64.MaxValue
You are using the wrong type, addresses must be stored in IntPtr or UIntPtr. There is just no way your problem is realistic. If you can't afford to lose the single bit in UInt64 then you are way too close to overflow. If this number represents an index then a plain Int32 will be fine, .NET memory blobs are limited to 2 gigabyte, even on a 64-bit machine.
If it is an address then IntPtr will be fine for a very, very long time. Currently available hardware is 4.5 orders of magnitude away from reaching that limit. A very drastic hardware redesign will be needed to get close, and you'll have much bigger problems to worry about when that day ever comes. Nine exabytes of virtual memory is enough for everybody until I retire.
Microsoft defines a 64-bit address as Int64, not UInt64, so you can still be CLS compliant.
Please refer to http://msdn.microsoft.com/en-us/library/837ksy6h.aspx.
Which basically says:
IntPtr Constructor (Int64)
Initializes a new instance of IntPtr using the specified 64-bit pointer.
Parameters
value
Type: System.Int64
A pointer or handle contained in a 64-bit signed integer.
And yes, I just did a quick test and the following worked fine in a project targeting either x64 or Any CPU. I placed a breakpoint in the code and examined x. However, when targeting only x86, it will throw an exception.
using System;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            IntPtr x = new IntPtr(long.MaxValue);
        }
    }
}
But if it turns out that you really need that extra bit, you could provide two libraries: one that is CLS-compliant and one that is not - the user's choice. This can be accomplished by using #if statements and conditional compilation symbols. That way, you can define the same member name with different definitions. http://msdn.microsoft.com/en-us/library/4y6tbswk.aspx
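A sketch of that conditional-compilation idea; CLS_BUILD is a hypothetical symbol you would define in one build configuration and leave undefined in the other:

using System;

public class Instruction
{
#if CLS_BUILD
    // CLS-compliant build: signed 64-bit address.
    public long Address { get; set; }
#else
    // Full-range build: unsigned 64-bit address, marked non-compliant.
    [CLSCompliant(false)]
    public ulong Address { get; set; }
#endif
}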
In Delphi, the ZeroMemory procedure takes two parameters.
CODE EXAMPLE
procedure ZeroMemory(Destination: Pointer; Length: DWORD);
begin
FillChar(Destination^, Length, 0);
end;
I want to do this, or something similar, in C#... so what's the equivalent?
Thanks in advance!
.NET framework objects are always initialized to a known state
.NET framework value types are automatically 'zeroed' -- which means that the framework guarantees that it is initialized into its natural default value before it returns it to you for use. Things that are made up of value types (e.g. arrays, structs, objects) have their fields similarly initialized.
In general, in .NET all managed objects are initialized to default, and there is never a case when the contents of an object is unpredictable (because it contains data that just happens to be in that particular memory location) as in other unmanaged environments.
Answer: you don't need to do this, as .NET will automatically "zero" the object for you. However, you should know what the default value for each value type is. For example, the default of a bool is false, and the default of an int is zero.
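A quick illustration of that automatic zeroing:

int[] numbers = new int[3];        // already { 0, 0, 0 }
bool[] flags = new bool[3];        // already { false, false, false }
Console.WriteLine(default(int));   // 0
Console.WriteLine(default(bool));  // False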
Unmanaged objects
"Zero-ing" a region of memory is usually only necessary when interoperating with external, unmanaged libraries.
If you have a pinned pointer to a region of memory containing data that you intend to pass to an outside unmanaged library (written in C, say), and you want to zero that section of memory, then your pointer most likely points to a byte array, and you can use a simple for-loop to zero it.
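For example, a sketch of that interop scenario (the native call itself is omitted; pinning with GCHandle is just one way to do it):

using System;
using System.Runtime.InteropServices;

byte[] buffer = new byte[256];
GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
try
{
    for (int i = 0; i < buffer.Length; i++)
        buffer[i] = 0;                          // "ZeroMemory" by hand
    IntPtr p = handle.AddrOfPinnedObject();     // pass p to the unmanaged library here
}
finally
{
    handle.Free();
}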
Off-topic note
On the flip side, if a large object is allocated in .NET, try to reuse it instead of throwing it away and allocating a new one. That's because any new object is automatically "zeroed" by the .NET framework, and for large objects this clearing will cause a hidden performance hit.
You very rarely need unsafe code in C#. Usually only when interacting with native libraries.
The Marshal class has some low-level helper functions, but I'm not aware of any that zeroes out memory.
Firstly, in .NET (including C#), value types are zeroed by default - so this takes away one of the common uses of ZeroMemory.
Secondly, if you want to zero a list of type T then try a method like:
void ZeroMemory<T>(IList<T> destination)
{
    for (var i = 0; i < destination.Count; i++)
    {
        destination[i] = default(T);
    }
}
If a list isn't available... then I think I'd need to see more of the calling code.
Technically there is Array.Clear, but it only works on managed arrays. What do you want to do?
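For example:

byte[] buffer = new byte[1024];
// ... use buffer ...
Array.Clear(buffer, 0, buffer.Length);  // zeroes every element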
In C++ it is fairly simple to display the actual value of a pointer to an object. For example:
void* p = new CSomething();
cout << p;
Is there a way to do something like this in .NET?
The value of doing this would/could only be educational, e.g. for purposes of demonstration as in displaying a value for students to see rather than just comparing for reference equality or null (nothing) to prove shallow copies, immutability etc.
You can use GCHandle to get the address of a pinned object. The GC can move objects around so the only sensible address to get is one of a pinned object.
GCHandle handle = GCHandle.Alloc(obj, GCHandleType.Pinned);
Console.WriteLine(handle.AddrOfPinnedObject().ToInt32());
handle.Free();
Remember though that GCHandle will only pin objects that are primitive or blittable types. Some objects are blittable (and you can set it up for demo purposes so it works) but any reference type will not be blittable.
You'll need to add an explicit blittable description using [StructLayout(LayoutKind.Sequential)], or use the debugger to directly inspect the addresses of objects that do not meet these criteria.
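For example, a hypothetical blittable struct (demo names only) that GCHandle will happily pin:

using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
struct Point3
{
    public int X, Y, Z;
}

class Demo
{
    static void Main()
    {
        var p = new Point3 { X = 1, Y = 2, Z = 3 };
        GCHandle handle = GCHandle.Alloc(p, GCHandleType.Pinned);  // boxes and pins the struct
        Console.WriteLine(handle.AddrOfPinnedObject());            // prints the pinned address
        handle.Free();
    }
}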
If this is for education purposes, I suggest you use a debugger instead. If you load SOS.dll (which is part of the .NET framework) into WinDbg or even Visual Studio, you can examine the actual objects in memory.
E.g. to list the heap, use the !dumpheap -stat command. The !do command dumps a managed object at the specified memory address, and so forth. SOS has numerous commands that will let you examine internal .NET structures, so it is a really useful tool for learning more about the runtime.
By using the debugger for this, you're not restricted to looking at demo applications. You can peek into the details of real applications. Also, you'll pick up some really useful debugging skills.
There are several excellent introductions to debugging using WinDbg + SOS. Check Tess' blog for lots of tutorials.
RuntimeHelpers.GetHashCode will give you an identity-based hash code. In practice, this is probably based on address. As explained:
"RuntimeHelpers.GetHashCode is useful in scenarios where you care about object identity. Two strings with identical contents will return different values for RuntimeHelpers.GetHashCode, because they are different string objects, although their contents are the same."
Interning of string literals is the main possible exception. This is actually the same in C++.
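A small demonstration of the difference (the new string(...) calls are there to dodge interning of literals):

using System;
using System.Runtime.CompilerServices;

string a = new string("hello".ToCharArray());
string b = new string("hello".ToCharArray());

Console.WriteLine(a.GetHashCode() == b.GetHashCode());  // True: content-based
Console.WriteLine(RuntimeHelpers.GetHashCode(a) ==
                  RuntimeHelpers.GetHashCode(b));       // almost certainly False: identity-based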
I understand that if you provide the compiler the /unsafe option, you will be allowed to write 'unsafe' code, and with it have access to pointers.
I haven't tested this but found it in this article.
Edit:
Seems the main thing to remember is that you would have to mark any code using unsafe code with the unsafe keyword:
unsafe public static void Main()
In .NET you don't work with pointers at all. Instead you create objects and hold references to them, and you can always see the values of those objects.
When comparing reference objects, the references are compared, not the actual values! (Except when comparing strings, where '==' is overloaded.)
Maybe a .NET example of what you want to demonstrate would clarify things...
You can retrieve the address of an object in .NET, such as with unsafe code, but the address you get back will only be temporary -- it'll be a snapshot as of the point where you take the address.
The next time a garbage collection happens, the address of your object is likely to change:
If the object is no longer referenced from anywhere, it will be collected, and some other object will occupy that address
If the object is still being referenced, it will probably be promoted to a higher generation (and therefore moved to a different GC heap). Alternatively, if it's already in generation 2, it will probably be moved in memory as the heap is compacted.
The existence of the garbage collector is the reason why the int* pointer in #Jesper's answer exists only within the scope of a { } block. The pointer is valid only while the object is fixed within that block; once execution leaves the block, the object is again entitled to be collected and/or moved.
unsafe
{
    int[] data = new int[1];
    fixed (int* ptr = data)           // pin the array so the GC cannot move it
    {
        Console.WriteLine((long)ptr); // write the address
    }                                 // once the block ends, the array may move again
}
You need to compile this with the /unsafe switch