Does new always allocate on the heap in C++ / C# / Java - c#

My understanding has always been, regardless of C++ or C# or Java, that when we use the new keyword to create an object it allocates memory on the heap. I thought that new is only needed for reference types (classes), and that primitive types (int, bool, float, etc.) never use new and always go on the stack (except when they're a member variable of a class that gets instantiated with new). However, I have been reading information that makes me doubt this long standing assumption, at least for Java and C#.
For example, I just noticed that in C# the new operator can be used to initialize a value type, see here. Is this an exception to the rule, a helper feature of the language, and if so, what other exceptions would there be?
Can someone please clarify this?

I thought that new is only needed for reference types (classes), and that primitive types (int, bool, float, etc.) never use new
In C++, you can allocate primitive types on the heap if you want to:
int* p = new int(42);
This is useful if you want a shared counter, for example in the implementation of shared_ptr<T>.
Also, you are not forced to use new with classes in C++:
void function()
{
    MyClass myObject(1, 2, 3);
}
This will allocate myObject on the stack. Note that new is rarely used in modern C++.
Furthermore, you can overload operator new (either globally or class-specific) in C++, so even if you say new MyClass, the object does not necessarily get allocated on the heap.
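To make the point about overloading concrete, here is a minimal sketch of a class-specific operator new that hands out static storage instead of heap memory. The names PoolObject, g_buffer, g_in_use and demo_new_without_heap are hypothetical, and the one-slot "pool" is deliberately simplistic; it only illustrates that the new syntax does not by itself imply a heap allocation.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Static storage that the hypothetical class below hands out instead of heap memory.
alignas(std::max_align_t) static unsigned char g_buffer[64];
static bool g_in_use = false;

class PoolObject {
public:
    explicit PoolObject(int v) : value(v) {}
    int value;

    // Class-specific allocation function: `new PoolObject(...)` calls this
    // instead of the global operator new.
    static void* operator new(std::size_t size) {
        if (g_in_use || size > sizeof(g_buffer)) throw std::bad_alloc();
        g_in_use = true;
        return g_buffer;
    }

    // Matching deallocation function: marks the single slot as free again.
    static void operator delete(void*) noexcept { g_in_use = false; }
};

// Returns true if the object created with `new` lives in the static buffer.
bool demo_new_without_heap() {
    PoolObject* p = new PoolObject(7);  // looks like a heap allocation...
    bool in_buffer = static_cast<void*>(p) == static_cast<void*>(g_buffer);
    assert(p->value == 7);
    delete p;                           // ...but the storage is static
    return in_buffer;
}
```

Calling demo_new_without_heap() from main should return true, since the pointer produced by the new expression is the address of the static buffer.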

I don't know precisely about Java (and it seems quite difficult to find documentation about it).
In C#, new invokes the constructor and returns a fresh object. If it is of value type, it is allocated on the stack (e.g. local variable) or on the heap (e.g. boxed object, member of a reference type object). If it is of reference type, it always goes on the heap and is managed by the garbage collector. See http://msdn.microsoft.com/en-us/library/fa0ab757(v=vs.80).aspx for more details.
In C++, a "new expression" returns a pointer to an object with dynamic storage duration (i.e. one that you must destroy yourself). There is no mention of a heap (with this meaning) in the C++ standard, and the mechanism through which such an object is obtained is implementation-defined.
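As a rough illustration of storage duration in C++, the sketch below counts live instances of a hypothetical Tracked type. It contrasts an automatic object, an object created with new and destroyed with delete, and one owned by std::make_unique (C++14), which is the usual modern alternative to a raw new. The type and function names are assumptions made for this example.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that counts how many instances are currently alive.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Returns the number of live instances after all three cases have run.
int demo_storage_durations() {
    {
        Tracked automatic;               // automatic storage: destroyed at end of block
        assert(Tracked::live == 1);
    }
    assert(Tracked::live == 0);

    Tracked* dynamic = new Tracked;      // dynamic storage: lives until deleted
    assert(Tracked::live == 1);
    delete dynamic;                      // we must destroy it ourselves
    assert(Tracked::live == 0);

    {
        auto owned = std::make_unique<Tracked>();  // dynamic storage, automatic cleanup
        assert(Tracked::live == 1);
    }
    return Tracked::live;
}
```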

My understanding has always been, regardless of C++ or C# or Java, that when we use the new keyword to create an object it allocates memory on the heap.
Your understanding has been incorrect:
new may work differently in different programming languages, even when these languages are superficially alike. Don't let the similar syntax of C#, C++, and Java mislead you!
The terms "heap" and "stack" (as they are understood in the context of internal memory management) are simply not relevant to all programming languages. Arguably, these two concepts are more often implementation details than part of a programming language's official specification.
(IIRC, this is true for at least C# and C++. I don't know about Java.)
The fact that they are such widespread implementation details doesn't imply that you should rely on that distinction, nor that you should even know about it! (However, I admit that I usually find it beneficial to know "how things work" internally.)
I would suggest that you stop worrying too much about these concepts. The important thing that you need to get right is to understand a language's semantics; e.g., for C# or any other .NET language, the difference in reference and value type semantics.
Example: What the C# specification says about operator new:
Note how the following part of the C# specification published by ECMA (4th edition) does not mention any "stack" or "heap":
14.5.10 The new operator
The new operator is used to create new instances of types. […]
The new operator implies creation of an instance of a type, but does not necessarily imply dynamic allocation of memory. In particular, instances of value types require no additional memory beyond the variables in which they reside, and no dynamic allocations occur when new is used to create instances of value types.
Instead, it talks of "dynamic allocation of memory", but that is not the same thing: You could dynamically allocate memory on a stack, on the heap, or anywhere else (e.g. on a hard disk drive) for that matter.
What it does say, however, is that instances of value types are stored in-place, which is exactly what value type semantics are all about: Value type instances get copied during an assignment, while reference type instances are referenced / "aliased". That is the important thing to understand, not the "heap" or the "stack"!

In C#, an instance of a class always lives on the heap. A struct can be either on the heap or the stack:
variables (except captures and iterator blocks), and fields on a struct that is itself on the stack live on the stack
captures, iterator blocks, fields of something that is on the heap, and values in an array live on the heap, as do "boxed" values

Java 7 does escape analysis to determine if an object can be allocated on the stack, according to http://download.oracle.com/javase/7/docs/technotes/guides/vm/performance-enhancements-7.html.
However, you cannot instruct the runtime to allocate an object on heap or on stack. It's done automatically.

Regarding C#, read The Truth About Value Types.
You will see that value types can go on the heap as well.
And this question suggests that reference types could go on the stack (but that does not happen at the moment).

(Referring to Java) What you said is correct: primitives are allocated on the stack (there are exceptions, e.g. closures). However, you might be referring to objects such as:
Integer n = new Integer(2);
This refers to an Integer object, and not a primitive int. Perhaps this was your source of confusion? In this case, the object that n refers to will be allocated on the heap. Perhaps your confusion was due to autoboxing rules? Also see this question for more details on autoboxing. Check out the comments on this answer for exceptions to the rule, where primitives are allocated on the heap.

In Java and C#, we don't need to allocate primitive types on the heap. They can be allocated on the stack (not that they are restricted to the stack). In C++, by contrast, both primitive and user-defined types can be allocated on either the stack or the heap.

In C++, there's an additional way to use the new operator, and that's via 'placement new'. The memory you point it to could exist anywhere.
See What uses are there for "placement new"?
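A minimal sketch of placement new, assuming a hypothetical Point type and a local byte array as the target memory. It shows that the object ends up exactly where the caller says, here inside an automatic (stack) buffer, and that the destructor must be called explicitly because there is no matching delete.

```cpp
#include <cassert>
#include <new>

// Hypothetical type used only for this illustration.
struct Point {
    int x;
    int y;
    Point(int x_, int y_) : x(x_), y(y_) {}
};

// Constructs a Point inside a local buffer and returns x + y.
int demo_placement_new() {
    // Raw, suitably aligned storage with automatic storage duration.
    alignas(Point) unsigned char storage[sizeof(Point)];

    // Placement new: construct the object in memory we provide.
    Point* p = new (storage) Point(3, 4);
    assert(static_cast<void*>(p) == static_cast<void*>(storage));

    int sum = p->x + p->y;

    // No delete here: the memory was not obtained from operator new,
    // so we only run the destructor explicitly.
    p->~Point();
    return sum;
}
```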

Related

How do you store an int or other "C# value types" on the heap (with C#)?

I'm engaged in educating myself about C# via Troelsen's Pro C# book.
I'm familiar with the stack and heap and how C# stores these sorts of things. In C++, whenever we use new we receive a pointer to something on the heap. However, in C# the behavior of new seems different to me:
when used with value types like an int, using new seems to merely call the int default constructor yet the value of such an int would still be stored on the stack
I understand that all objects/structs and such are stored on the heap, regardless of whether or not new is used.
So my question is: how can I instantiate an int on the heap? (And does this have something to do with 'boxing'?)
You can box any value type to System.Object type so it will be stored on the managed heap:
int number = 1;
object locatedOnTheHeap = number;
Another question is why you would need this.
This is a classic example from the must-know MSDN paper: Boxing and Unboxing (C# Programming Guide)
When the CLR boxes a value type, it wraps the value inside a
System.Object and stores it on the managed heap.
Boxing is used to store value types in the garbage-collected heap.
Boxing is an implicit conversion of a value type to the type object or
to any interface type implemented by this value type. Boxing a value
type allocates an object instance on the heap and copies the value
into the new object.
I understand that all objects/structs and such are stored on the heap
BTW, IIRC sometimes the JIT optimizes code so that values of value types such as int are stored in CPU registers rather than on the stack.
I do not know why you would want to do this; however, in theory you could indeed box your value. You would do this by boxing the int into an object (which is a reference type and will be placed on the heap):
object IAmARefSoIWillBeOnHeap = (object)1;
*As sll stated, you do not need the (object) as it will be an implicit cast. This is here merely for academic reasons, to show what is happening.
Here is a good article about reference versus value types, which is the difference of the heap versus the stack
A value type is "allocated" wherever it is declared:
As a local variable, typically on the stack (but to paraphrase Eric Lippert, the stack is an implementation detail, I suggest you read his excellent piece on his blog: The Truth About Value Types.)
As a field in a class, it expands the size of the instance with the size of the value type, and takes up space inside the instance
As such, this code:
var x = new SomeValueType();
does not allocate something on the heap by itself for that value type. If you close over it with an anonymous method or similar, the local variable will be transformed into the field of a class, and an instance of that class will be allocated on the heap, but in this case, the value type will be embedded into that class as a field.
The heap is for instances of reference types.
However, you've touched upon something regarding boxing. You can box a value type value to make a copy of it and place that copy on the heap, wrapped in an object.
So this:
object x = new SomeValueType();
would first allocate the value type, then box it into an object, and store the reference to that object in x.
yet the value of such an int would still be stored on the stack
This is not necessarily true. Even when it is true, it's purely an implementation detail, and not part of the specification of the language. The main issue is that the type system does not necessarily correlate to the storage mechanism used by the runtime.
There are many cases where calling new on a struct will still result in an object that isn't on the stack. Boxing is a good example - when you box an object, you're basically pushing it into an object (effectively "copying" it to the heap), and referencing the object. Also, any time you're closing over a value type with a lambda, you'll end up "allocating on the heap."
That being said, I wouldn't focus on this at all - the issue really shouldn't be about stack vs. heap in allocations, but rather about value type vs. reference type semantics and usage. As such, I'd strongly recommend reading Eric Lippert's The Truth About Value Types and Jon Skeet's References and Values. Both of these articles focus on the important aspects of struct vs. class semantics instead of necessarily looking at the storage.
As for ways to force the storage of an int on the heap, here are a couple of simple ones:
object one = 1; // Boxing
int two = 2; // Gets closed over, so ends up "on the heap"
Action closeOverTwo = () => { Console.WriteLine(two); };
// Do stuff with two here...
var three = new { Three = 3 }; // Wrap in an anonymous type (a reference type)...
If you want an int on the heap, you can do this:
object o = 4;
But basically, you shouldn't want that. C# is designed for you not to think about such things. Here's a good place to start on that: http://blogs.msdn.com/b/ericlippert/archive/2009/04/27/the-stack-is-an-implementation-detail.aspx
So my question is: how can I instantiate an int on the heap? (And does
this have something to do with 'boxing'?)
Your understanding about objects and structs is correct. When you initialize either an object or a structure it goes on the heap.

How will an object with a value type and reference type be stored in .NET?

In .NET, integer data type is a value type(stack) and String is a reference type(heap).
So If a class A has an integer, and a string type object in it, and a class B creates an object of class A, then how will this object of class A be stored in memory? In stack, or in a heap?
This was asked in my Microsoft interview. Need to understand how I fared.
Eric Lippert just wrote about this:
It is simply false that the choice of whether to use the stack or the heap has anything fundamentally to do with the type of the thing being stored.
The true story is:
"in the Microsoft implementation of C# on the desktop CLR, value types are stored on the stack when the value is a local variable or temporary that is not a closed-over local variable of a lambda or anonymous method, and the method body is not an iterator block, and the jitter chooses to not enregister the value."
Most importantly, he stresses that you simply should not care where a type lives. You should care where things of a certain lifetime live.
In general, only value types that are local variables end up on the stack. The rest, including fields of classes, is stored on the heap.
In fact, the situation is more complex; see the link to Eric Lippert's Blog provided in Rex M's answer.
If I recall correctly, Objects are always reference types, regardless of their member types.
So, any object of Class A will be stored on the heap.
It was just a tricky question. I think the question was asked to check your knowledge about classes in .NET. Classes are reference types, so in simple words it will go on the heap.

Why do we need struct? (C#)

To use a struct, we need to instantiate the struct and use it just like a class. Then why don't we just create a class in the first place?
A struct is a value type so if you create a copy, it will actually physically copy the data, whereas with a class it will only copy the reference to the data
A major difference between the semantics of class and struct is that structs have value semantics. What this means is that if you have two variables of the same type, they each have their own copy of the data. Thus if a variable of a given value type is set equal to another (of the same type), operations on one will not affect the other (that is, assignment of value types creates a copy). This is in sharp contrast to reference types.
There are other differences:
Value types are implicitly sealed (it is not possible to derive from a value type).
Value types can not be null.
Value types are given a default constructor that initializes the value type to its default value.
A variable of a value type is always a value of that type. Contrast this with classes, where a variable of type A could refer to an instance of type B if B derives from A.
Because of the difference in semantics, it is inappropriate to refer to structs as "lightweight classes."
All of the reasons I see in other answers are interesting and can be useful, but if you want to read about why they are required (at least by the VM) and why it was a mistake for the JVM to not support them (user-defined value types), read Demystifying Magic: High-level Low-level Programming. As it stands, C# shines in talking about the potential to bring safe, managed code to systems programming. This is also one of the reasons I think the CLI is a superior platform [than the JVM] for mobile computing. A few other reasons are listed in the linked paper.
It's important to note that you'll very rarely, if ever, see an observable performance improvement from using a struct. The garbage collector is extremely fast, and in many cases will actually outperform the structs. When you add in the nuances of them, they're certainly not a first-choice tool. However, when you do need them and have profiler results or system-level constructs to prove it, they get the job done.
Edit: If you wanted an answer of why we need them as opposed to what they do, ^^^
In C#, a struct is a value type, unlike classes which are reference types. This leads to a huge difference in how they are handled, or how they are expected to be used.
You should probably read up on structs from a book. Structs in C# aren't close cousins of class like in C++ or Java.
It is a myth that structs are always created on the stack.
It is right that a struct is a value type and a class is a reference type. But remember that
1. A Reference Type always goes on the Heap.
2. Value Types go where they were declared.
What the second line means I will explain with the example below.
Consider the following method:
public void DoCalculation()
{
    int num;
    num = 2;
}
Here num is a local variable, so it will be created on the stack.
Now consider the example below:
public class TestClass
{
    public int num;
}
public void DoCalculation()
{
    TestClass myTestClass = new TestClass();
    myTestClass.num = 2;
}
This time num is created on the heap. In some cases value types perform better than reference types, as they don't require garbage collection.
Also remember:
The value of a value type is always a value of that type.
The value of a reference type is always a reference.
You also have to consider that if you expect a lot of instantiation, you will use more heap space and create more work for the garbage collector. In that case you can choose structs.
Structs have many different semantics to classes. The differences are many but the primary reasons for their existence are:
They can be explicitly laid out in memory
this allows certain interop scenarios
They may be allocated on the stack
Making some sorts of high performance code possible in a much simpler fashion
the difference is that a struct is a value-type
I've found them useful in 2 situations
1) Interop - you can specify the memory layout of a struct, so you can guarantee that when you invoke an unmanaged call.
2) Performance - in some (very limited) cases, structs can be faster than classes. In general, this requires structs to be small (I've heard 16 bytes or less) and not changed often.
One of the main reasons is that, when used as local variables during a method call, structs are allocated on the stack.
Stack allocation is cheap, but the big difference is that de-allocation is also very cheap. In this situation, the garbage collector doesn't have to track structs -- they're removed when returning from the method that allocated them when the stack frame is popped.
edit - clarified my post re: Jon Skeet's comment.
A struct is a value type (like Int32), whereas a class is a reference type. Structs get created on the stack rather than the heap. Also, when a struct is passed to a method, a copy of the struct is passed, but when a class instance is passed, a reference is passed.
If you need to create your own datatype, say, then a struct is often a better choice than a class, as you can use it just like the built-in value types in the .NET framework. There are some good struct examples you can read here.

Why are structs stored on the stack while classes get stored on the heap(.NET)?

I know that one of the differences between classes and structs is that struct instances get stored on stack and class instances(objects) are stored on the heap.
Since classes and structs are very similar, does anybody know the reason for this particular distinction?
(edited to cover points in comments)
To emphasise: there are differences and similarities between value-types and reference-types, but those differences have nothing to do with stack vs heap, and everything to do with copy-semantics vs reference-semantics. In particular, if we do:
Foo first = new Foo { Bar = 123 };
Foo second = first;
Then are "first" and "second" talking about the same copy of Foo? or different copies? It just so happens that the stack is a convenient and efficient way of handling value-types as variables. But that is an implementation detail.
(end edit)
Re the whole "value types go on the stack" thing... - value types don't always go on the stack;
if they are fields on a class
if they are boxed
if they are "captured variables"
if they are in an iterator block
then they go on the heap (the last two are actually just exotic examples of the first)
i.e.
class Foo {
    int i; // on the heap
}
static void Foo() {
    int i = 0; // on the heap due to capture
    // ...
    Action act = delegate {Console.WriteLine(i);};
}
static IEnumerable<int> Foo() {
    int i = 0; // on the heap due to iterator block
    //
    yield return i;
}
Additionally, Eric Lippert (as already noted) has an excellent blog entry on this subject
It's useful in practice to be able to allocate memory on the stack for some purposes, since those allocations are very fast.
However, it's worth noting that there's no fundamental guarantee that all structs will be placed on the stack. Eric Lippert recently wrote an interesting blog entry on this topic.
Every process has a data block that consists of two different allocatable memory segments: the stack and the heap. The stack mostly serves as the program flow manager and saves local variables, parameters and return addresses (for returning from the currently executing function).
Classes are very complex and mostly very large types compared to value types like structs (or basic types: ints, chars, etc.). Since stack allocation should be specialized for the efficiency of program flow, it does not provide an optimal environment for keeping large objects.
Therefore, to meet both expectations, this separated architecture came along.
How the compiler and run-time environment handle memory management has grown up over a long period of time. The stack memory vs. heap memory allocation decision had a lot to do with what could be known at compile time and what could be known at run time. This was before managed run times.
In general, the compiler has very good control of what's on the stack, it gets to decide what is cleaned up and when based on calling conventions. The heap on the other hand, was more like the wild west. The compiler did not have good control of when things came and went. By placing function arguments on the stack, the compiler is able to make a scope -- that scope can be controlled over the lifetime of the call. This is a natural place to put value types, because they are easy to control as opposed to reference types that can hand out memory locations (pointers) to just about anyone they want.
Modern memory management changes a lot of this. The .NET runtime can take control of reference types and the managed heap through complex garbage collection and memory management algorithms. This is also a very, very deep subject.
I recommend you check out some texts on compilers -- I grew up on Aho, so I recommend that. You can also learn a lot about the subject by reading Gosling.
In some languages, like C++, objects are also value types.
To find an example for the opposite is harder, but under classic Pascal union structs could only be instantiated on the heap. (normal structs could be static)
In short: this situation is a choice, not a hard law. Since C# (and Java before it) lack procedural underpinnings, one can ask themselves why it needs structures at all.
The reason it is there, is probably a combination of needing it for external interfaces and to have a performant and tight complex (container-) type. One that is faster than class. And then it is better to make it a value type.
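The remark that C++ objects are value types can be seen in a few lines. The following sketch uses a hypothetical Foo struct: plain assignment makes an independent copy (value semantics), while a pointer gives the aliasing behaviour that C# class references have. Neither behaviour depends on where the object is stored.

```cpp
#include <cassert>

// Hypothetical type used only for this illustration.
struct Foo {
    int bar;
};

// Returns first.bar after it has been modified through an alias.
int demo_value_vs_reference() {
    Foo first{123};

    Foo second = first;    // value semantics: an independent copy
    second.bar = 456;
    assert(first.bar == 123);  // the original is unaffected

    Foo* alias = &first;   // reference-like semantics: the same object
    alias->bar = 789;
    return first.bar;      // the change is visible through `first`
}
```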
Marc Gravell already explained wonderfully the difference regarding how value and reference types are copied which is the main differentiation between them.
As to why value types are usually created on the stack, that's because the way they are copied allows it. The stack has some definite advantages over the heap in terms of performance, particularly because the compiler can calculate the exact position of a variable created in a certain block of code, which makes access faster.
When you create a reference type you receive a reference to the actual object which exists in the heap. There is a small level of indirection whenever you interact with the object itself. These reference types cannot be created on the stack because the lifetime of values in the stack is determined, in great part, by the structure of your code. The function frame of a method call will be popped off the stack when the function returns, for example.
With value types, however, their copy semantics allows the compiler, depending on where it was created, to place it in the stack. If you create a local variable that holds an instance of a struct in a method and then return it, a copy of it will be created, as Marc explained above. This means that the value can be safely placed in the stack, since the lifetime of the actual instance is tied to the method's function frame. Anytime you send it somewhere outside the current function a copy of it will be created, so it doesn't matter if you tie the existence of the original instance to the scope of the function. Along these lines, you can also see why value types that are captured by closures need to go in the heap: They outlive their scope because they must also be accessible from within the closure, which can be passed around freely.
If it were a reference type, then you wouldn't be returning a copy of the object, but rather a reference, which means the actual value must be stored somewhere else, otherwise, if you returned the reference and the object's lifetime was tied to the scope in which it was created, it would end up pointing to an empty space in memory.
The distinction isn't really that "Value types go on the stack, reference types on the heap". The real point is that it's usually more efficient to access objects that live in the stack, so the compiler will try and place those values it can there. It simply turns out that value types, because of their copy semantics, fit the bill better than reference types.
I believe that whether or not to use stack or heap space is the main distinction between the two, perhaps this article will shed some light on your question: Csharp classes vs structs
The main difference is that the heap may hold objects that live forever, while something on the stack is temporary in that it will disappear when the enclosing call site is exited. This is because when one enters a method, the stack grows to hold that method's local variables as well as information about the calling method. When the method exits (normally, e.g. via return, or abnormally because of an exception), each frame must be popped off the stack. Eventually the frame of interest is popped and everything on it is lost.
The whole point about using the stack is that it automatically implements and honours scope. A variable stored on the stack exists until the function that created it exits and that function's stack frame is popped. Things that have local scope are natural for stack storage; things that have bigger scope are more difficult to manage on the stack. Objects on the heap can have lifetimes that are controlled in more complex ways.
Compilers always use the stack for variables - value or reference, it makes little difference. A reference variable doesn't have to have the object it refers to stored on the stack - it can be anywhere, and the heap is more efficient if the object referenced is big and if there are multiple references to it. The point is that the scope of a reference variable isn't the same as the lifetime of the object it references, i.e. a variable may be destroyed by being popped off the stack but the object (on the heap) it references might live on.
If a value type is small enough you might as well store it on the stack in place of a reference to it on the heap - its lifetime is tied to the scope of the variable. If the value type is part of a larger reference type then it too could have multiple references to it and hence it is more natural to store it on the heap and dissociate its lifetime from any single reference variable.
Stack and heap are about lifetimes and the value v reference semantics is almost a by product.
Have a look at Value and Reference
Value types go on the stack, reference types go on the heap. A struct is a value type.
There is no guarantee about this in the specification though, so it might change in future releases :)

In C#, why is String a reference type that behaves like a value type?

A String is a reference type even though it has most of the characteristics of a value type such as being immutable and having == overloaded to compare the text rather than making sure they reference the same object.
Why isn't string just a value type then?
Strings aren't value types since they can be huge, and need to be stored on the heap. Value types are (in all implementations of the CLR as of yet) stored on the stack. Stack allocating strings would break all sorts of things: the stack is only 1MB for 32-bit and 4MB for 64-bit, you'd have to box each string, incurring a copy penalty, you couldn't intern strings, and memory usage would balloon, etc...
(Edit: Added clarification about value type storage being an implementation detail, which leads to this situation where we have a type with value semantics not inheriting from System.ValueType. Thanks Ben.)
It is not a value type because performance (space and time!) would be terrible if it were a value type and its value had to be copied every time it were passed to and returned from methods, etc.
It has value semantics to keep the world sane. Can you imagine how difficult it would be to code if
string s = "hello";
string t = "hello";
bool b = (s == t);
set b to be false? Imagine how difficult coding just about any application would be.
A string is a reference type with value semantics. This design is a tradeoff which allows certain performance optimizations.
The distinction between reference types and value types is basically a performance tradeoff in the design of the language. Reference types have some overhead on construction and destruction and garbage collection, because they are created on the heap. Value types on the other hand have overhead on assignments and method calls (if the data size is larger than a pointer), because the whole object is copied in memory rather than just a pointer. Because strings can be (and typically are) much larger than the size of a pointer, they are designed as reference types. Furthermore the size of a value type must be known at compile time, which is not always the case for strings.
But strings have value semantics which means they are immutable and compared by value (i.e. character by character for a string), not by comparing references. This allows certain optimizations:
Interning means that if multiple strings are known to be equal, the compiler can just use a single string, thereby saving memory. This optimization only works if strings are immutable, otherwise changing one string would have unpredictable results on other strings.
String literals (which are known at compile time) can be interned and stored in a special static area of memory by the compiler. This saves time at runtime since they don't need to be allocated and garbage collected.
Immutable strings do increase the cost of certain operations. For example, you can't replace a single character in-place; you have to allocate a new string for any change. But this is a small cost compared to the benefit of the optimizations.
Value semantics effectively hides the distinction between reference type and value types for the user. If a type has value semantics, it doesn't matter for the user if the type is a value type or reference type - it can be considered an implementation detail.
This is a late answer to an old question, but all other answers are missing the point, which is that .NET did not have generics until .NET 2.0 in 2005.
String is a reference type instead of a value type because it was of crucial importance for Microsoft to ensure that strings could be stored in the most efficient way in non-generic collections, such as System.Collections.ArrayList.
Storing a value-type in a non-generic collection requires a special conversion to the type object which is called boxing. When the CLR boxes a value type, it wraps the value inside a System.Object and stores it on the managed heap.
Reading the value from the collection requires the inverse operation which is called unboxing.
Both boxing and unboxing have non-negligible cost: boxing requires an additional allocation, unboxing requires type checking.
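For anyone who hasn't run into boxing before, a rough sketch of where it happens. BoxingDemo and its method names are invented for this example; ArrayList and List<int> are the real collection types.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BoxingDemo
{
    // Non-generic collection: every int is boxed on Add and unboxed on read.
    public static int SumNonGeneric()
    {
        ArrayList list = new ArrayList();
        list.Add(1); // boxing: allocates an object on the managed heap
        list.Add(2);
        int sum = 0;
        foreach (object item in list)
        {
            sum += (int)item; // unboxing: checks the runtime type, then copies
        }
        return sum;
    }

    // Generic collection: ints are stored directly, no boxing involved.
    public static int SumGeneric()
    {
        List<int> list = new List<int> { 1, 2 };
        int sum = 0;
        foreach (int item in list)
        {
            sum += item;
        }
        return sum;
    }

    // The type check is real: unboxing a boxed int as long throws.
    public static bool WrongUnboxThrows()
    {
        object boxed = 42;
        try
        {
            _ = (long)boxed;
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(SumNonGeneric());
        Console.WriteLine(SumGeneric());
        Console.WriteLine(WrongUnboxThrows());
    }
}
```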
Some answers claim incorrectly that string could never have been implemented as a value type because its size is variable. Actually it is easy to implement string as a fixed-length data structure containing two fields: an integer for the length of the string, and a pointer to a char array. You can also use a Small String Optimization strategy on top of that.
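A sketch of the fixed-size layout described above, written as a struct. ValueString is a hypothetical type invented here, not anything in the framework, and it leaves out the Small String Optimization. The struct itself always holds just an int and a reference, however long the text is.

```csharp
using System;

// Hypothetical value-type string: fixed-size header, characters on the heap.
public readonly struct ValueString : IEquatable<ValueString>
{
    private readonly int _length;    // number of characters
    private readonly char[] _chars;  // reference to the character array

    public ValueString(string source)
    {
        _chars = source.ToCharArray();
        _length = _chars.Length;
    }

    public int Length => _length;

    // Equality by content, character by character.
    public bool Equals(ValueString other)
    {
        if (_length != other._length) return false;
        for (int i = 0; i < _length; i++)
        {
            if (_chars[i] != other._chars[i]) return false;
        }
        return true;
    }

    public override bool Equals(object obj) => obj is ValueString other && Equals(other);

    public override int GetHashCode() => ToString().GetHashCode();

    // default(ValueString) has a null array, so guard on the length.
    public override string ToString() =>
        _length == 0 ? string.Empty : new string(_chars, 0, _length);
}

public static class ValueStringDemo
{
    public static void Main()
    {
        var a = new ValueString("abc");
        var b = new ValueString("abc");
        Console.WriteLine(a.Length);    // 3
        Console.WriteLine(a.Equals(b)); // True
    }
}
```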
If generics had existed from day one I guess having string as a value type would probably have been a better solution, with simpler semantics, better memory usage and better cache locality. A List<string> containing only small strings could have been a single contiguous block of memory.
Strings are not the only immutable reference types.
Multicast delegates are immutable too.
That is why it is safe to write:
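protected void OnMyEventHandler()
{
    // Assumes MyEventHandler is declared as an EventHandler event.
    // Copy the reference first: delegate instances are immutable, so the
    // copy cannot change even if subscribers are added or removed meanwhile.
    EventHandler handler = this.MyEventHandler;
    if (null != handler)
    {
        handler(this, new EventArgs());
    }
}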
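Here is a small sketch of why copying the delegate reference is enough. DelegateDemo and the local names are made up for the example. The point it tries to show is that += builds a new delegate object rather than changing the existing one.

```csharp
using System;

public static class DelegateDemo
{
    // Returns how much the snapshot added to the counter when invoked.
    public static int InvokeSnapshotAfterSubscribe()
    {
        int calls = 0;
        EventHandler source = (s, e) => calls++;
        EventHandler snapshot = source;      // copies the reference only

        // += creates a new combined delegate and stores it in 'source';
        // the object that 'snapshot' points to is not modified.
        source += (s, e) => calls += 10;

        snapshot(new object(), EventArgs.Empty);
        return calls;
    }

    // Combining delegates yields a different instance from the original.
    public static bool CombineCreatesNewInstance()
    {
        EventHandler original = (s, e) => { };
        EventHandler combined = original;
        combined += (s, e) => { };
        return !object.ReferenceEquals(original, combined);
    }

    public static void Main()
    {
        Console.WriteLine(InvokeSnapshotAfterSubscribe()); // 1
        Console.WriteLine(CombineCreatesNewInstance());    // True
    }
}
```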
I suppose that strings are immutable because this is the safest way to work with them and to allocate memory.
Why are they not value types? Previous authors are right about stack size etc. I would also add that making strings reference types allows you to save on assembly size when you use the same constant string more than once in the program. If you define
string s1 = "my string";
//some code here
string s2 = "my string";
Chances are that the constant "my string" will be stored in your assembly only once, with both variables referring to it.
If you would like to manage strings like an ordinary mutable reference type, wrap the string in a StringBuilder (new StringBuilder(string s)), or use a MemoryStream.
If you are creating a library where you expect huge strings to be passed to your functions, define the parameter as either a StringBuilder or a Stream.
In very simple words, any value which has a definite size can be treated as a value type.
Also consider the way strings are implemented (different for each platform) and what happens when you start stitching them together, for example with a StringBuilder. It allocates a buffer for you to copy into; once you reach the end, it allocates even more memory for you, in the hope that performance won't be hindered if you do a large concatenation.
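A quick sketch of the buffer growth mentioned above. BuilderDemo is a hypothetical name, and the starting capacity of 4 is picked only to force growth; the exact growth strategy is an implementation detail, so the example only checks that the buffer ends up large enough.

```csharp
using System;
using System.Text;

public static class BuilderDemo
{
    // Appends more text than the starting buffer holds and reports capacities.
    public static (int before, int after, string text) Grow()
    {
        var sb = new StringBuilder(4);   // deliberately small starting buffer
        int before = sb.Capacity;
        sb.Append("hello, world");       // 12 characters, more than requested
        return (before, sb.Capacity, sb.ToString());
    }

    public static void Main()
    {
        var (before, after, text) = Grow();
        Console.WriteLine($"capacity before: {before}, after: {after}, text: {text}");
    }
}
```

The caller never manages the buffer; the builder reallocates behind the scenes and only produces an immutable string when ToString is called.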
Maybe Jon Skeet can help us out here?
It is mainly a performance issue.
Having strings behave LIKE a value type helps when writing code, but having them BE a value type would cause a huge performance hit.
For an in-depth look, take a peek at a nice article on strings in the .net framework.
How can you tell string is a reference type? I'm not sure that it matters how it is implemented. Strings in C# are immutable precisely so that you don't have to worry about this issue.
Actually, strings have very few resemblances to value types. For starters, not all value types are immutable: you can change the value of an Int32 all you want and it would still be at the same address on the stack.
Strings are immutable for a very good reason. It has nothing to do with string being a reference type, but has a lot to do with memory management. It's just more efficient to create a new object when the string size changes than to shift things around on the managed heap. I think you're mixing up two separate concepts: value/reference types and immutable objects.
As for "==": as you said, "==" is an operator overload, and again it was implemented for a very good reason, namely to make the framework more useful when working with strings.
The reason many mention the stack and memory with respect to value types and primitive types is that they must fit into a register in the microprocessor. You cannot push or pop something to/from the stack if it takes more bits than a register has. The instructions are, for example, "pop eax", because eax is 32 bits wide on a 32-bit system.
Floating-point primitive types are handled by the FPU, which is 80 bits wide.
This was all decided long before there was an OOP language to obfuscate the definition of primitive type and I assume that value type is a term that has been created specifically for OOP languages.
Isn't it just as simple as this: strings are made up of character arrays? I look at strings as character arrays. Therefore they are on the heap, because the reference is stored on the stack and points to the beginning of the array's memory location on the heap. The string size is not known before it is allocated, which is perfect for the heap.
That is why a string is really immutable: when you change it, even if the result is the same size, the compiler doesn't know that and has to allocate a new array and assign characters to the positions in the array. It makes sense if you think of strings as a way for languages to protect you from having to allocate memory on the fly (read: C-like programming).
