What is the most efficient way to reassign a struct? - c#

I have in my program a struct type called Square which is used to represent the location (int Rank, int File) of a square on a chess board.
If I create a Square with Square sq = new Square(); say, and then I want to reassign it, is it better to do so by
sq = new Square(rank, file);
or by writing an internal Set method and calling Set thus
sq.Set(rank, file);
What I am asking is when you use new on a struct, does the runtime reallocate new memory and call the constructor or does it reuse the existing memory? If it does the former then it would be better to write a Set method to avoid overheads would it not? Cheers.

The conventional thinking these days is that value types should be immutable, so you would not want a Set method unless it returns a new Square and does not mutate the original. As such,
sq = new Square(rank, file);
And
sq = sq.GenerateSquare(rank, file); // renamed Set method from original question to appease comments
Should ultimately perform the same operation.
But given this approach, GenerateSquare would also possibly be better as a static method of Square rather than something depending upon any given instance. (An instance method would be more useful if something about the existing instance was used in the creation of a new instance.)

Structures are value types, so a simple assignment will do the job:
Square sq = new Square(rank, file);
Square anotherSq = sq;

Worrying about the weight of garbage collection or memory use is something you should not be concerned with until you have profiled your application and know it will be an issue. A simple structure like this is not going to take up much space and is unlikely to be the cause of problems if your program does hit a bottleneck.

For structs... space for new structs is created on the stack (see NOTE), not the heap, and is not subject to garbage collection. If the variable being assigned to already holds a copy of the struct, that copy is overwritten. No additional memory is used.
NOTE: If you create a new struct and assign it to a field or property of a reference type, then yes, the reference type is on the heap, but the struct is copied into the memory slot that already exists inside that object; no new heap memory is allocated. And the struct is not independently subject to garbage collection.
But others' comments about your design are correct: structs should generally only be used for immutable domain objects, things that are simple and easy to create (small footprint) and have no identity (i.e., one telephone number object set to (802) 123-4567 is equivalent to, and can be used anywhere else you need, a telephone number object set to (802) 123-4567).
So in general, these objects should not have constructors or property setters; they should have static factory methods that create instances of them.
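As a rough illustration of the advice above, here is a minimal sketch of an immutable Square with a static factory method. The names At, SquareDemo and Reassign are hypothetical and not from the question; only Square, Rank and File come from it.

```csharp
using System;

// Hypothetical immutable Square, modelled on the question's description.
public readonly struct Square
{
    public int Rank { get; }
    public int File { get; }

    private Square(int rank, int file)
    {
        Rank = rank;
        File = file;
    }

    // Static factory method (the name "At" is an assumption).
    public static Square At(int rank, int file) => new Square(rank, file);
}

public static class SquareDemo
{
    public static Square Reassign()
    {
        Square sq = Square.At(1, 1);
        // Assignment copies the new value over the existing variable;
        // no heap object is created for the struct itself.
        sq = Square.At(4, 5);
        return sq;
    }
}
```

Calling SquareDemo.Reassign() returns a Square whose Rank is 4 and File is 5. Whether this is measurably faster or slower than a mutating Set method is something only profiling can tell you.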

How does memory management work for a static generic list in c#?

My understanding of static is that whenever a static variable is declared, its memory gets allocated in RAM. Suppose we have static int i = 5; then 4 bytes of memory will be occupied somewhere in the computer. And the same will happen if I have a static class or any reference type.
But my question is: if I declare a generic list like List<string> in C# and make it static, how much memory will be allocated for this list? I assume that if I add items to this list, it will require more memory.
So it breaks my concept of static: that a static field has a fixed memory allocation at the time of declaration, which cannot change during the application's lifetime.
Can someone genius in c# help me out here?
There's no difference in how a static member is allocated compared to a non-static one. "Static" just means that the member belongs to the class itself rather than to any one instance, so there is a single copy shared by all instances.
For the List<>: all objects you instantiate with the "new" keyword are created in a part of memory called the heap. So is the static list you are asking about.
A List<> in .NET is backed by a single array of a certain capacity. Whenever that array gets filled by adding items to the list, a new, larger array is allocated (typically double the size), the existing items are copied into it, and the old array is left for the garbage collector. In this way the list can grow.
You're making a few assumptions about how .NET does memory management.
Under the hood (and I'd recommend looking) List<T> uses an array to hold its data. Unless you specify a capacity, current implementations start with an empty array and grow to a capacity of 4 on the first Add, so you'll have a reference to the array plus its length multiplied by the size of T. The amount of memory used initially depends on the capacity you give the List when you instantiate it.
E.g. if you have List<int> then you have a reference to the List instance, a reference to the array, and the capacity you set in the constructor multiplied by the amount of memory required for the data type T. All of this is initially allocated in generation 0 (Gen0) of the managed heap, and memory is allocated, freed, or promoted to Gen1 and Gen2 as you populate, depopulate, and use the List.
Given all of the above, there is no definitive answer unless the question is refined, e.g. "How much memory is allocated when I instantiate List<int>(5)?"
As for static, that's pretty much moot as the same amount of memory has to be allocated for the instance.
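To see the growth behaviour described above for yourself, a small sketch like the following records each distinct Capacity a list reports while items are added. ListGrowthDemo and ObserveCapacities are hypothetical names, and the exact growth policy is an implementation detail of the runtime, not a documented guarantee.

```csharp
using System;
using System.Collections.Generic;

public static class ListGrowthDemo
{
    // Returns each distinct Capacity observed while adding `count` items.
    public static List<int> ObserveCapacities(int count)
    {
        var items = new List<string>();
        var seen = new List<int> { items.Capacity };
        for (int i = 0; i < count; i++)
        {
            items.Add("x");
            if (items.Capacity != seen[seen.Count - 1])
            {
                // Capacity changed, so the backing array was replaced.
                seen.Add(items.Capacity);
            }
        }
        return seen;
    }
}
```

On current runtimes, ObserveCapacities(10) typically reports 0, 4, 8, 16. Whether the list is held in a static field or an instance field makes no difference to these numbers.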
From a different angle, maybe the way to help would be to explain just what 'static' is in .net.
Here's a simple class:
public class MyClass
{
    public string Zeus;
    public static string Hades;
}
Okay, so what does that 'static' mean for our Hades string? Static basically means: it only exists in one place - it doesn't matter how many instances of the class you make, there's only going to be one Hades string.
MyClass first = new MyClass();
MyClass second = new MyClass();
MyClass third = new MyClass();
... there are now three Zeus strings. One for each of those MyClasses:
first.Zeus = "first";
second.Zeus = "second";
third.Zeus = "third";
... but there's only one Hades:
MyClass.Hades = "only version";
Notice how I didn't put 'first.Hades', or 'second.Hades'? That's because, since there's only one version, I don't have to put an instance to get to it. In fact, VisualStudio will flat out tell you, "I can't do this - you're trying to get to a static variable, but you're trying to get to it through an actual instance of your class."
Instead, you just use: MyClass.Hades.
So, getting back to your memory question?
public class MyClass
{
    public List<string> Zeus;
    public static List<string> Hades;
}
The way that those Lists are stored really isn't any different. The only difference is, you'll always have exactly one List for your static Hades variable... and you'll have one Zeus List for every MyClass instance you create (that hasn't been garbage collected).
Make sense? It's kinda important to get this concept down, because it'll come into play a lot for stuff like caching or having a Singleton global object.
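The point about one static list versus one list per instance can be checked with a short sketch. Registry, Own, Shared and StaticDemo are hypothetical names used only for this illustration.

```csharp
using System.Collections.Generic;

public class Registry
{
    public List<string> Own = new List<string>();            // one per instance
    public static List<string> Shared = new List<string>();  // one for the type
}

public static class StaticDemo
{
    public static (int ownA, int ownB, int shared) Run()
    {
        Registry.Shared.Clear();   // static state outlives earlier calls

        var a = new Registry();
        var b = new Registry();

        a.Own.Add("only in a");
        Registry.Shared.Add("added while working with a");
        Registry.Shared.Add("added while working with b");

        return (a.Own.Count, b.Own.Count, Registry.Shared.Count);
    }
}
```

Run() returns (1, 0, 2): the two instances have separate Own lists, while both additions landed in the single Shared list. All three lists live on the heap and grow in the same way.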

New Reference When Concatenating A String

A couple of weeks ago, I was asked a C# question in a job interview. The question was exactly this:
string a = "Hello, ";
for(int i = 0; i < 99999999; i++)
{
a += "world!";
}
I was asked exactly, "why this is a bad method for concatenated string?". My response was some sort of "readability, append should be chosen" etc.
But apparently, this is not the case according to the guy that was interviewing me. So, according to him, every time we concatenate a string, because of the structure of CLR, a new reference is created in memory. So, in the end of the following code, we would have 99999999 of string variable "a" in memory.
I thought, the objects are created just once in the stack as soon as a value is assigned to them (I'm not talking about heap). The way I knew was the memory allocation is done once in the stack for each primitive data types, their values are modified as needed and disposed when the execution of a scope is finished. Is that wrong? Or, are new references of variable "a" actually created in the stack every single time it is concatenated?
Can someone please explain how it works for stack? Many thanks.
First remember these two facts:
string is an immutable type (existing instances are never modified)
string is a reference type (the "value" of a string expression is a reference to the location where the instance is)
Therefore, a statement like:
a += "world!";
will work similar to a = a + "world!";. It will first follow the reference to the "old" a and concat that old string with the string "world!". This involves copying the contents of both old strings into a new memory location. That is the "+" part. It will then move the reference of a from pointing to the old location into pointing to the new location (the newly concatenated string). That is the "=" assignment part of the statement.
Now it follows that the old string instance is left with no references to it. So at some point, the garbage collector will remove it (and possibly move memory around to avoid "holes").
So I guess your job interviewer was absolutely right there. The loop of your question will create a bunch of (mostly very long!) strings in memory (in the heap since you want to be technical).
A simpler approach could be:
string a = "Hello, "
+ string.Concat(Enumerable.Repeat("world!", 999...));
Here we use string.Concat. That method knows it will need to concatenate a bunch of strings into one long string, and it can use some sort of expandable buffer (such as a StringBuilder or even a pointer type char*) internally to make sure it does not create a myriad of "dead" object instances in memory.
(Do not use ToArray() or similar, as in string.Concat(Enumerable.Repeat("world!", 999...).ToArray()), of course!)
.NET distinguishes between ref and value types. string is a ref type. It is allocated on the heap without exception. Its lifetime is controlled by the GC.
So, in the end of the following code, we would have 99999999 of string variable "a" in memory.
99999999 have been allocated. Of course, some of them might be GC'ed already.
their values are modified as needed and disposed when the execution of a scope is finished
String is not a primitive or a value type. Those are allocated "inline" inside of something else such as the stack, an array or inside heap objects. They also can be boxed and become true heap objects. None of that applies here.
The problem with this code is not the allocation but the quadratic runtime complexity. I don't think this loop would ever finish in practice.
Reference types (i.e. classes & strings) are always created in the heap. Value types (such as structs) are created in the stack and are lost when a function ends execution.
However, stating that after the loop you will have N objects in memory is not entirely true. In each evaluation of the
a += "world!";
statement you do create a new string. What happens to the previously created string is more complicated. The garbage collector now owns it, since there is no other reference to it in your code, and will release it at some point; you don't know exactly when that will happen.
Finally, the ultimate problem with this code is that you believe you are modifying an object, but strings are immutable, meaning you cannot really change their value once created. You can only create new ones and this is what the += operator is doing. This would be far more efficient with a StringBuilder which was made to be mutable.
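A minimal sketch of the StringBuilder alternative mentioned above. ConcatDemo and Build are hypothetical names; StringBuilder, Append and ToString are the standard System.Text API.

```csharp
using System.Text;

public static class ConcatDemo
{
    public static string Build(int repeats)
    {
        var sb = new StringBuilder("Hello, ");
        for (int i = 0; i < repeats; i++)
        {
            sb.Append("world!");   // appends into the builder's internal buffer
        }
        return sb.ToString();      // the final string is allocated here
    }
}
```

Build(3) returns "Hello, world!world!world!". Unlike the loop in the question, each iteration does not copy the whole accumulated text into a fresh string.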
EDIT
As requested, here's stack / heap related clarification. Value types are not always in the stack. They are in the stack when you declare them inside a function body:
void method()
{
int a = 1; // goes in the stack
}
But go into the heap when they are part of other objects, like when an integer is a property of a class (since the whole class instance is in the heap).

Should I use Class or Struct in the following case (data structure with many fields)? [duplicate]

I'm about to create 100,000 objects in code. They are small ones, only with 2 or 3 properties. I'll put them in a generic list and when they are, I'll loop them and check value a and maybe update value b.
Is it faster/better to create these objects as class or as struct?
EDIT
a. The properties are value types (except the string i think?)
b. They might (we're not sure yet) have a validate method
EDIT 2
I was wondering: are objects on the heap and the stack processed equally by the garbage collector, or does that work different?
Is it faster to create these objects as class or as struct?
You are the only person who can determine the answer to that question. Try it both ways, measure a meaningful, user-focused, relevant performance metric, and then you'll know whether the change has a meaningful effect on real users in relevant scenarios.
Structs consume less heap memory (because they are smaller and more easily compacted, not because they are "on the stack"). But they take longer to copy than a reference copy. I don't know what your performance metrics are for memory usage or speed; there's a tradeoff here and you're the person who knows what it is.
Is it better to create these objects as class or as struct?
Maybe class, maybe struct. As a rule of thumb:
If the object is :
1. Small
2. Logically an immutable value
3. There's a lot of them
Then I'd consider making it a struct. Otherwise I'd stick with a reference type.
If you need to mutate some field of a struct it is usually better to build a constructor that returns an entire new struct with the field set correctly. That's perhaps slightly slower (measure it!) but logically much easier to reason about.
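One way to read the suggestion above is a "with"-style method that returns a whole new struct value. The sketch below assumes a hypothetical Item struct with two fields A and B, loosely matching "check value a and maybe update value b" from the question; none of these names come from the original.

```csharp
public readonly struct Item
{
    public int A { get; }
    public int B { get; }

    public Item(int a, int b)
    {
        A = a;
        B = b;
    }

    // Returns a new value with B replaced; the original is untouched.
    public Item WithB(int b) => new Item(A, b);
}

public static class ItemDemo
{
    public static Item[] UpdateAll(Item[] items)
    {
        for (int i = 0; i < items.Length; i++)
        {
            if (items[i].A > 0)
            {
                // Replace the array slot with the modified copy.
                items[i] = items[i].WithB(items[i].A * 2);
            }
        }
        return items;
    }
}
```

The array slot is overwritten in place, so no heap allocation happens per item. Whether the extra copy matters is, as the answer says, something to measure.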
Are objects on the heap and the stack processed equally by the garbage collector?
No, they are not the same because objects on the stack are the roots of the collection. The garbage collector does not need to ever ask "is this thing on the stack alive?" because the answer to that question is always "Yes, it's on the stack". (Now, you can't rely on that to keep an object alive because the stack is an implementation detail. The jitter is allowed to introduce optimizations that, say, enregister what would normally be a stack value, and then it's never on the stack so the GC doesn't know that it is still alive. An enregistered object can have its descendents collected aggressively, as soon as the register holding onto it is not going to be read again.)
But the garbage collector does have to treat objects on the stack as alive, the same way that it treats any object known to be alive as alive. The object on the stack can refer to heap-allocated objects that need to be kept alive, so the GC has to treat stack objects like living heap-allocated objects for the purposes of determining the live set. But obviously they are not treated as "live objects" for the purposes of compacting the heap, because they're not on the heap in the first place.
Is that clear?
Sometimes with a struct you don't need to call the new() constructor, and can assign the fields directly, which can be much faster than usual.
Example:
Value[] list = new Value[N];
for (int i = 0; i < N; i++)
{
list[i].id = i;
list[i].isValid = true;
}
is about 2 to 3 times faster than
Value[] list = new Value[N];
for (int i = 0; i < N; i++)
{
list[i] = new Value(i, true);
}
where Value is a struct with two fields (id and isValid).
struct Value
{
    // Public so that the loop above can assign the fields directly.
    public int id;
    public bool isValid;

    public Value(int i, bool isValid)
    {
        this.id = i;
        this.isValid = isValid;
    }
}
On the other hand, if the items need to be moved or selected, all that copying of value types is going to slow you down. To get the exact answer I suspect you have to profile your code and test it out.
Arrays of structs are represented on the heap in a contiguous block of memory, whereas an array of objects is represented as a contiguous block of references with the actual objects themselves elsewhere on the heap, thus requiring memory for both the objects and for their array references.
In this case, as you are placing them in a List<> (and a List<> is backed onto an array) it would be more efficient, memory-wise to use structs.
(Beware, though, that large arrays will find their way onto the Large Object Heap where, if their lifetime is long, they may have an adverse effect on your process's memory management. Remember, also, that memory is not the only consideration.)
Structs may seem similar to classes, but there are important differences that you should be aware of. First of all, classes are reference types and structs are value types. By using structs, you can create objects that behave like the built-in types and enjoy their benefits as well.
When you call the new operator on a class, the instance is allocated on the heap. However, when you instantiate a struct as a local variable, it is typically created on the stack, which can yield performance gains. Also, you will not be dealing with references to an instance of a struct as you would with classes. You will be working directly with the struct instance. Because of this, when passing a struct to a method, it's passed by value instead of as a reference.
More here:
http://msdn.microsoft.com/en-us/library/aa288471(VS.71).aspx
If they have value semantics, then you should probably use a struct. If they have reference semantics, then you should probably use a class. There are exceptions, which mostly lean towards creating a class even when there are value semantics, but start from there.
As for your second edit, the GC only deals with the heap, but there is a lot more heap space than stack space, so putting things on the stack isn't always a win. Besides which, a list of struct-types and a list of class-types will be on the heap either way, so this is irrelevant in this case.
Edit:
I'm beginning to consider the term evil to be harmful. After all, making a class mutable is a bad idea if it's not actively needed, and I would not rule out ever using a mutable struct. It is a poor idea so often as to almost always be a bad idea though, but mostly it just doesn't coincide with value semantics so it just doesn't make sense to use a struct in the given case.
There can be reasonable exceptions with private nested structs, where all uses of that struct are hence restricted to a very limited scope. This doesn't apply here though.
Really, I think "it mutates so it's a bad struct" is not much better than going on about the heap and the stack (which at least does have some performance impact, even if a frequently misrepresented one). "It mutates, so it quite likely doesn't make sense to consider it as having value semantics, so it's a bad struct" is only slightly different, but importantly so I think.
The best solution is to measure, measure again, then measure some more. There may be details of what you're doing that may make a simplified, easy answer like "use structs" or "use classes" difficult.
A struct is, at its heart, nothing more nor less than an aggregation of fields. In .NET it's possible for a structure to "pretend" to be an object, and for each structure type .NET implicitly defines a heap object type with the same fields and methods which--being a heap object--will behave like an object. A variable which holds a reference to such a heap object ("boxed" structure) will exhibit reference semantics, but one which holds a struct directly is simply an aggregation of variables.
I think much of the struct-versus-class confusion stems from the fact that structures have two very different usage cases, which should have very different design guidelines, but the MS guidelines don't distinguish between them. Sometimes there is a need for something which behaves like an object; in that case, the MS guidelines are pretty reasonable, though the "16 byte limit" should probably be more like 24-32. Sometimes, however, what's needed is an aggregation of variables. A struct used for that purpose should simply consist of a bunch of public fields, and possibly an Equals override, ToString override, and IEquatable(itsType).Equals implementation. Structures which are used as aggregations of fields are not objects, and shouldn't pretend to be. From the structure's point of view, the meaning of field should be nothing more or less than "the last thing written to this field". Any additional meaning should be determined by the client code.
For example, if a variable-aggregating struct has members Minimum and Maximum, the struct itself should make no promise that Minimum <= Maximum. Code which receives such a structure as a parameter should behave as though it were passed separate Minimum and Maximum values. A requirement that Minimum be no greater than Maximum should be regarded like a requirement that a Minimum parameter be no greater than a separately-passed Maximum one.
A useful pattern to consider sometimes is to have an ExposedHolder<T> class defined something like:
class ExposedHolder<T>
{
    public T Value;
    public ExposedHolder() { }
    public ExposedHolder(T val) { Value = val; }
}
If one has a List<ExposedHolder<someStruct>>, where someStruct is a variable-aggregating struct, one may do things like myList[3].Value.someField += 7;, but giving myList[3].Value to other code will give it the contents of Value rather than giving it a means of altering it. By contrast, if one used a List<someStruct>, it would be necessary to use var temp=myList[3]; temp.someField += 7; myList[3] = temp;. If one used a mutable class type, exposing the contents of myList[3] to outside code would require copying all the fields to some other object. If one used an immutable class type, or an "object-style" struct, it would be necessary to construct a new instance which was like myList[3] except for someField which was different, and then store that new instance into the list.
One additional note: If you are storing a large number of similar things, it may be good to store them in possibly-nested arrays of structures, preferably trying to keep the size of each array between 1K and 64K or so. Arrays of structures are special, in that indexing one will yield a direct reference to a structure within, so one can say "a[12].x = 5;". Although one can define array-like objects, C# does not allow for them to share such syntax with arrays.
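The difference between indexing an array of structs and indexing a List of structs can be sketched as follows. Point and InPlaceDemo are hypothetical names chosen for this example.

```csharp
using System.Collections.Generic;

public struct Point
{
    public int X;
    public int Y;
}

public static class InPlaceDemo
{
    public static int ArrayUpdate()
    {
        var a = new Point[16];
        a[12].X = 5;             // writes directly into the array's storage
        return a[12].X;
    }

    public static int ListUpdate()
    {
        var list = new List<Point> { new Point() };
        // list[0].X = 5;        // does not compile (CS1612): the indexer returns a copy
        Point temp = list[0];    // copy out
        temp.X = 5;              // modify the copy
        list[0] = temp;          // copy back
        return list[0].X;
    }
}
```

Both methods return 5, but only the array version updates the element without a round trip through a temporary.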
Use classes.
On a general note. Why not update value b as you create them?
From a C++ perspective I agree that it will be slower modifying a struct's properties compared to a class. But I do think that they will be faster to read from due to the struct being allocated on the stack instead of the heap. Reading data from the heap requires more checks than from the stack.
Well, if you go with a struct after all, then get rid of the string and use a fixed-size char or byte buffer.
That's re: performance.

Why .NET String is immutable? [duplicate]

As we all know, String is immutable. What are the reasons for String being immutable and the introduction of StringBuilder class as mutable?
Instances of immutable types are inherently thread-safe: since no thread can modify an instance, the risk of one thread modifying it in a way that interferes with another is removed (the reference itself is a different matter).
Similarly, the fact that aliasing can't produce changes (if x and y both refer to the same object a change to x entails a change to y) allows for considerable compiler optimisations.
Memory-saving optimisations are also possible. Interning and atomising being the most obvious examples, though we can do other versions of the same principle. I once produced a memory saving of about half a GB by comparing immutable objects and replacing references to duplicates so that they all pointed to the same instance (time-consuming, but a minute's extra start-up to save a massive amount of memory was a performance win in the case in question). With mutable objects that can't be done.
No side-effects can come from passing an immutable type as a parameter to a method unless it is out or ref (since that changes the reference, not the object). A programmer therefore knows that if string x = "abc" at the start of a method, and that doesn't change in the body of the method, then x == "abc" at the end of the method.
Conceptually, the semantics are more like value types; in particular equality is based on state rather than identity. This means that "abc" == "ab" + "c". While this doesn't require immutability, the fact that a reference to such a string will always equal "abc" throughout its lifetime (which does require immutability) makes uses as keys where maintaining equality to previous values is vital, much easier to ensure correctness of (strings are indeed commonly used as keys).
Conceptually, it can make more sense to be immutable. If we add a month onto Christmas, we haven't changed Christmas, we have produced a new date in late January. It makes sense therefore that Christmas.AddMonths(1) produces a new DateTime rather than changing a mutable one. (Another example: if I, as a mutable object, change my name, what has changed is which name I am using; "Jon" remains immutable and other Jons will be unaffected.)
Copying is fast and simple, to create a clone just return this. Since the copy can't be changed anyway, pretending something is its own copy is safe.
[Edit, I'd forgotten this one]. Internal state can be safely shared between objects. For example, if you were implementing list which was backed by an array, a start index and a count, then the most expensive part of creating a sub-range would be copying the objects. However, if it was immutable then the sub-range object could reference the same array, with only the start index and count having to change, with a very considerable change to construction time.
In all, for objects which don't have undergoing change as part of their purpose, there can be many advantages in being immutable. The main disadvantage is in requiring extra constructions, though even here it's often overstated (remember, you have to do several appends before StringBuilder becomes more efficient than the equivalent series of concatenations, with their inherent construction).
It would be a disadvantage if mutability was part of the purpose of an object (who'd want to be modeled by an Employee object whose salary could never ever change) though sometimes even then it can be useful (in many web and other stateless applications, code doing read operations is separate from that doing updates, and using different objects may be natural - I wouldn't make an object immutable and then force that pattern, but if I already had that pattern I might make my "read" objects immutable for the performance and correctness-guarantee gain).
Copy-on-write is a middle ground. Here the "real" class holds a reference to a "state" class. State classes are shared on copy operations, but if you change the state, a new copy of the state class is created. This is more often used with C++ than C#, which is why its std::string enjoys some, but not all, of the advantages of immutable types, while remaining mutable.
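The interning optimisation mentioned earlier in this answer can be observed directly with string.Intern. In the sketch below, InternDemo and Run are hypothetical names; string.Intern and object.ReferenceEquals are standard .NET APIs.

```csharp
using System;

public static class InternDemo
{
    public static (bool before, bool after) Run()
    {
        // Built at run time so the compiler cannot merge them into one literal.
        string a = new string(new[] { 'a', 'b', 'c' });
        string b = new string(new[] { 'a', 'b', 'c' });

        // Equal contents, but two separate heap objects.
        bool before = object.ReferenceEquals(a, b);

        // Intern returns one canonical instance per distinct value.
        bool after = object.ReferenceEquals(string.Intern(a), string.Intern(b));

        return (before, after);
    }
}
```

Run() returns (false, true). Sharing one instance like this is only safe because nobody holding the shared reference can change its contents.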
Making strings immutable has many advantages. It provides automatic thread safety, and makes strings behave like an intrinsic type in a simple, effective manner. It also allows for extra efficiencies at runtime (such as allowing effective string interning to reduce resource usage), and has huge security advantages, since it's impossible for a third-party API call to change your strings.
StringBuilder was added in order to address the one major disadvantage of immutable strings - runtime construction of immutable types causes a lot of GC pressure and is inherently slow. By making an explicit, mutable class to handle this, this issue is addressed without adding unneeded complication to the string class.
Strings are not really immutable. They are just publicly immutable.
It means you cannot modify them through their public interface. But on the inside they are actually mutable.
If you don't believe me look at the String.Concat definition using reflector.
The last lines are...
int length = str0.Length;
string dest = FastAllocateString(length + str1.Length);
FillStringChecked(dest, 0, str0);
FillStringChecked(dest, length, str1);
return dest;
As you can see the FastAllocateString returns an empty but allocated string and then it is modified by FillStringChecked
Actually the FastAllocateString is an extern method and the FillStringChecked is unsafe so it uses pointers to copy the bytes.
Maybe there are better examples but this is the one I have found so far.
String management is an expensive process. Keeping strings immutable allows repeated strings to be reused, rather than re-created.
Why are string types immutable in C#
String is a reference type, so it is never copied, but passed by reference.
Compare this to the C++ std::string object (which is not immutable), which is passed by value. This means that if you want to use a String as a key in a Hashtable, you're fine in C++, because C++ will copy the string to store the key in the hashtable (actually std::hash_map, but still) for later comparison. So even if you later modify the std::string instance, you're fine. But in .Net, when you use a String in a Hashtable, it will store a reference to that instance. Now assume for a moment that strings aren't immutable, and see what happens:
1. Somebody inserts a value x with key "hello" into a Hashtable.
2. The Hashtable computes the hash value for the String, and places a reference to the string and the value x in the appropriate bucket.
3. The user modifies the String instance to be "bye".
4. Now somebody wants the value in the hashtable associated with "hello". It ends up looking in the correct bucket, but when comparing the strings it says "bye"!="hello", so no value is returned.
5. Maybe somebody wants the value "bye"? "bye" probably has a different hash, so the hashtable would look in a different bucket. No "bye" keys in that bucket, so our entry still isn't found.
Making strings immutable means that step 3 is impossible. If somebody modifies the string he's creating a new string object, leaving the old one alone. Which means the key in the hashtable is still "hello", and thus still correct.
So, probably among other things, immutable strings are a way to enable strings that are passed by reference to be used as keys in a hashtable or similar dictionary object.
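Since real strings cannot be mutated, the five steps above can only be reproduced with a stand-in. The sketch below uses a hypothetical MutableKey class in a Dictionary; its hash is deliberately just the text length so that the outcome is deterministic. None of these names come from the quoted article.

```csharp
using System.Collections.Generic;

// Hypothetical mutable key type, standing in for a mutable string.
public sealed class MutableKey
{
    public string Text;

    public MutableKey(string text) { Text = text; }

    public override bool Equals(object obj) =>
        obj is MutableKey other && other.Text == Text;

    // Deliberately simple hash so the result does not depend on the runtime.
    public override int GetHashCode() => Text.Length;
}

public static class LostKeyDemo
{
    public static (bool foundHello, bool foundBye, int count) Run()
    {
        var table = new Dictionary<MutableKey, int>();
        var key = new MutableKey("hello");
        table[key] = 42;          // steps 1 and 2: stored under the hash of "hello"

        key.Text = "bye";         // step 3: the key is mutated in place

        bool foundHello = table.ContainsKey(new MutableKey("hello")); // step 4
        bool foundBye = table.ContainsKey(new MutableKey("bye"));     // step 5
        return (foundHello, foundBye, table.Count);
    }
}
```

Run() returns (false, false, 1): the entry is still in the table, but neither the old nor the new key value can reach it.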
Just to throw this in, an often forgotten view is of security, picture this scenario if strings were mutable:
string dir = @"C:\SomePlainFolder";
//Kick off another thread
GetDirectoryContents(dir);
// Return type assumed for illustration; HasAccess and Contents are not defined here.
string[] GetDirectoryContents(string directory)
{
    if (HasAccess(directory))
    {
        //Here the other thread changed the string to "C:\AllYourPasswords\"
        return Contents(directory);
    }
    return null;
}
You see how it could be very, very bad if you were allowed to mutate strings once they were passed.
You never have to defensively copy immutable data. Despite the fact that you need to copy it to mutate it, often the ability to freely alias and never have to worry about unintended consequences of this aliasing can lead to better performance because of the lack of defensive copying.
Strings are passed as reference types in .NET.
Reference types place a pointer on the stack, to the actual instance that resides on the managed heap. This is different from value types, which, as local variables, hold their entire instance on the stack.
When a value type is passed as a parameter, the runtime creates a copy of the value on the stack and passes that value into a method. This is why integers must be passed with a 'ref' keyword to return an updated value.
When a reference type is passed, the runtime creates a copy of the pointer on the stack. That copied pointer still points to the original instance of the reference type.
Assigning a string copies only the reference (C# does not allow the = operator to be overloaded), but because the instance can never change, this is indistinguishable from copying the value, which makes string behave more like a value type. If strings were mutable, a second string operation could accidentally overwrite the value of a private member of another class, causing some pretty nasty results.
As other posts have mentioned, the StringBuilder class allows strings to be built up without the GC overhead of allocating a new string for every intermediate step.
Strings and other concrete objects are typically expressed as immutable objects to improve readability and runtime efficiency. Security is another reason: a process can't change your string and inject code into it.
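As a rough illustration (method names are hypothetical), here is the same text built with repeated concatenation and with `System.Text.StringBuilder`. Both produce the same result; the difference is in how many intermediate strings are allocated along the way:

```csharp
using System;
using System.Text;

static class BuilderDemo
{
    // Each += allocates a new string holding everything built so far.
    public static string JoinWithConcat(int count)
    {
        string s = "";
        for (int i = 0; i < count; i++)
        {
            s += i + ",";
        }
        return s;
    }

    // The builder appends into an internal buffer and produces
    // a single string at the end.
    public static string JoinWithBuilder(int count)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            sb.Append(i).Append(',');
        }
        return sb.ToString();
    }

    static void Main()
    {
        Console.WriteLine(JoinWithConcat(3));  // prints 0,1,2,
        Console.WriteLine(JoinWithBuilder(3)); // prints 0,1,2,
    }
}
```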
Imagine you pass a mutable string to a function but don't expect it to be changed. Then what if the function changes that string? In C++, for instance, you could simply do call-by-value (difference between std::string and std::string& parameter), but in C# it's all about references so if you passed mutable strings around every function could change it and trigger unexpected side effects.
This is just one of various reasons. Performance is another one (interned strings, for example).
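Interning only works because strings cannot change: one shared instance can safely stand in for every string with the same contents. A minimal sketch, with hypothetical names, using `string.Intern`:

```csharp
using System;

static class InternDemo
{
    // Builds a fresh string object at run time.
    public static string Build()
    {
        return new string(new[] { 'h', 'i' });
    }

    static void Main()
    {
        string first = Build();
        string second = Build();

        // Same characters, but two separate objects.
        Console.WriteLine(first == second);                       // prints True
        Console.WriteLine(object.ReferenceEquals(first, second)); // prints False

        // Interning maps equal contents to one shared instance.
        Console.WriteLine(object.ReferenceEquals(
            string.Intern(first), string.Intern(second)));        // prints True
    }
}
```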
There are five common ways by which a class can store data that cannot be modified outside the storing class's control:
1. As value-type primitives
2. By holding a freely-shareable reference to a class object whose properties of interest are all immutable
3. By holding a reference to a mutable class object that will never be exposed to anything that might mutate any properties of interest
4. As a struct, whether "mutable" or "immutable", all of whose fields are of types #1-#4 (not #5).
5. By holding the only extant copy of a reference to an object whose properties can only be mutated via that reference.
Because strings are of variable length, they cannot be value-type primitives, nor can their character data be stored in a struct. Among the remaining choices, the only one which wouldn't require that strings' character data be stored in some kind of immutable object would be #5. While it would be possible to design a framework around option #5, that choice would require that any code which wanted a copy of a string that couldn't be changed outside its control would have to make a private copy for itself. While it would hardly be impossible to do that, the amount of extra code required, and the amount of extra run-time processing necessary to make defensive copies of everything, would far outweigh the slight benefits that could come from having string be mutable, especially given that there is a mutable string type (System.Text.StringBuilder) which accomplishes 99% of what could be accomplished with a mutable string.
Immutable Strings also prevent concurrency-related issues.
Imagine being an OS working with a string that some other thread was
modifying behind your back. How could you validate anything without
making a copy?

What happens when value types are created?

I'm developing a game using XNA and C# and was attempting to avoid calling new struct() type code each frame as I thought it would freak the GC out. "But wait," I said to myself, "struct is a value type. The GC shouldn't get called then, right?" Well, that's why I'm asking here.
I only have a very vague idea of what happens to value types. If I create a new struct within a function call, is the struct being created on the stack? Will it simply get pushed and popped and performance not take a hit? Further, would there be some memory limit or performance implications if, say, I need to create many instances in a single call?
Take, for instance, this code:
spriteBatch.Draw(tex, new Rectangle(x, y, width, height), Color.White);
Rectangle in this case is a struct. What happens when that new Rectangle is created? What are the implications of having to repeat that line many times (say, thousands of times)? Is this Rectangle created, a copy sent to the Draw method, and then discarded (meaning no memory getting eaten up the more Draw is called in that manner in the same function)?
P.S. I know this may be premature optimization, but I'm mostly curious and wish to have a better understanding of what is happening.
When a new struct is created, its contents are put straight into the location you specify: if it's a method variable, it goes on the stack; if it's being assigned to a field of a class, it goes inside the class instance being pointed to (on the heap).
When a struct variable is copied (or, in your case, passed to a function), the bytes making up the struct are all copied to the correct place on the stack or inside the class (if you're setting a field or property on an instance of a reference type).
Even though there may be copying of bytes, the JIT compiler will likely optimize the unnecessary copies away so that it executes as fast as possible. Generally, it's not something you need to worry about - this is very much a micro-optimization :)
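The copy-on-assignment and copy-on-call behaviour can be seen in a small sketch. `Box` is a hypothetical struct standing in for something like XNA's Rectangle, and `Shift` is an illustrative method:

```csharp
using System;

// Hypothetical struct, used only for illustration.
struct Box
{
    public int X;
    public int Y;

    public Box(int x, int y)
    {
        X = x;
        Y = y;
    }
}

static class StructCopyDemo
{
    // The parameter is a copy of the caller's struct, so this
    // change is lost when the method returns.
    public static void Shift(Box b)
    {
        b.X += 100;
    }

    static void Main()
    {
        Box a = new Box(1, 2);
        Box b = a;  // copies both ints into b
        b.X = 10;   // changes b only

        Console.WriteLine(a.X); // prints 1
        Console.WriteLine(b.X); // prints 10

        Shift(a);
        Console.WriteLine(a.X); // prints 1
    }
}
```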
Does this answer your question?
While value types go on the stack, there are still performance implications to allocating and deallocating all that memory every frame -- especially on the Xbox 360. On a PC you'll likely not notice the difference, but on the 360 you probably will.
Value types are created on the stack if declared locally, or on the heap if they are part of an object instance (stored inside that instance). In either case, struct instances are not collected individually by the GC; they go away when their container does (the method returns, or the containing object is collected).
The MSDN struct (C#) article has some more information about this.
This is just to add to thecoop's answer. For reference types the new operator allocates a new instance of the type on the heap and calls the specified constructor.
For a struct, the new operator initializes the fields according to the specified constructor. It is however possible to instantiate a struct without using new. In that case all the fields in the struct are uninitialized and cannot be used until they have been explicitly initialized.
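A minimal sketch of declaring a struct without new, assuming a hypothetical `Cell` struct with two public fields. The compiler's definite-assignment rules require every field to be assigned before the struct is used as a whole:

```csharp
using System;

// Hypothetical struct, used only for illustration.
struct Cell
{
    public int Rank;
    public int File;
}

static class DefiniteAssignmentDemo
{
    public static int Sum()
    {
        Cell c;              // declared without new: fields are unassigned
        // int r = c.Rank;   // would not compile here: use of possibly unassigned field
        c.Rank = 3;
        c.File = 4;          // every field is now assigned, so c is usable
        return c.Rank + c.File;
    }

    static void Main()
    {
        Console.WriteLine(Sum()); // prints 7

        // With new and no arguments, all fields start at their defaults.
        Cell d = new Cell();
        Console.WriteLine(d.Rank); // prints 0
    }
}
```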
For more info see the description on MSDN.
