Why do we need struct? (C#) - c#

To use a struct, we need to instantiate the struct and use it just like a class. Then why don't we just create a class in the first place?

A struct is a value type, so if you create a copy, it physically copies the data, whereas with a class it copies only the reference to the data.

A major difference between the semantics of class and struct is that structs have value semantics. What this means is that if you have two variables of the same type, they each have their own copy of the data. Thus if a variable of a given value type is set equal to another (of the same type), operations on one will not affect the other (that is, assignment of value types creates a copy). This is in sharp contrast to reference types.
There are other differences:
Value types are implicitly sealed (it is not possible to derive from a value type).
Value types cannot be null.
Value types are given a default constructor that initializes the value type to its default value.
A variable of a value type is always a value of that type. Contrast this with classes, where a variable of type A could refer to an instance of type B if B derives from A.
Because of the difference in semantics, it is inappropriate to refer to structs as "lightweight classes."
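As a minimal sketch of the copy behavior described above (the type names PointStruct and PointClass are made up for illustration), the same assignment behaves differently for a struct and for a class:

```csharp
using System;

struct PointStruct { public int X; }   // value type (hypothetical example type)
class PointClass { public int X; }     // reference type (hypothetical example type)

static class CopySemanticsDemo
{
    // Returns the X seen through the original variable after its copy is modified.
    public static int MutateStructCopy()
    {
        var a = new PointStruct { X = 1 };
        var b = a;      // copies the data
        b.X = 99;       // changes only b
        return a.X;     // a is unaffected
    }

    public static int MutateClassCopy()
    {
        var a = new PointClass { X = 1 };
        var b = a;      // copies the reference only
        b.X = 99;       // a and b refer to the same object
        return a.X;     // the change is visible through a
    }

    static void Main()
    {
        Console.WriteLine(MutateStructCopy());      // 1
        Console.WriteLine(MutateClassCopy());       // 99
        Console.WriteLine(default(PointStruct).X);  // 0: a struct always has a default value
    }
}
```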

All of the reasons I see in other answers are interesting and can be useful, but if you want to read about why they are required (at least by the VM) and why it was a mistake for the JVM to not support them (user-defined value types), read Demystifying Magic: High-level Low-level Programming. As it stands, C# shines in talking about the potential to bring safe, managed code to systems programming. This is also one of the reasons I think the CLI is a superior platform [than the JVM] for mobile computing. A few other reasons are listed in the linked paper.
It's important to note that you'll very rarely, if ever, see an observable performance improvement from using a struct. The garbage collector is extremely fast, and in many cases will actually outperform the structs. When you add in the nuances of them, they're certainly not a first-choice tool. However, when you do need them and have profiler results or system-level constructs to prove it, they get the job done.
Edit: If you wanted an answer of why we need them as opposed to what they do, ^^^

In C#, a struct is a value type, unlike classes which are reference types. This leads to a huge difference in how they are handled, or how they are expected to be used.
You should probably read up on structs from a book. Structs in C# aren't close cousins of classes the way they are in C++.

It is a myth that structs are always created on the stack.
It is true that a struct is a value type and a class is a reference type. But remember that:
1. A reference type always goes on the heap.
2. Value types go where they were declared.
The following example shows what the second rule means.
Consider the following method:
public void DoCalculation()
{
    int num;
    num = 2;
}
Here num is a local variable, so it will be created on the stack.
Now consider the example below:
public class TestClass
{
    public int num;
}
public void DoCalculation()
{
    TestClass myTestClass = new TestClass();
    myTestClass.num = 2;
}
This time num is created on the heap, because it is a field of an object that lives on the heap. In some cases value types do perform better than reference types, as they don't require garbage collection.
Also remember:
The value of a value type is always a value of that type.
The value of a reference type is always a reference.
You also have to consider that if you expect a lot of instantiations, using a class means more heap space to deal with and more work for the garbage collector. In that case you can choose structs.

Structs have very different semantics from classes. The differences are many, but the primary reasons for their existence are:
They can be explicitly laid out in memory
This allows certain interop scenarios
They may be allocated on the stack
This makes some sorts of high-performance code possible in a much simpler fashion

The difference is that a struct is a value type.
I've found them useful in 2 situations:
1) Interop - you can specify the memory layout of a struct, so you can guarantee that layout when you invoke an unmanaged call.
2) Performance - in some (very limited) cases, structs can be faster than classes. In general, this requires structs to be small (I've heard 16 bytes or less) and not be changed often.
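To illustrate the interop point, here is a small sketch of controlling a struct's memory layout with StructLayout and FieldOffset from System.Runtime.InteropServices. The types PackedHeader and IntOrFloat are hypothetical; a real interop struct would mirror whatever the unmanaged API declares.

```csharp
using System;
using System.Runtime.InteropServices;

// Sequential layout with 1-byte packing: fields in declaration order, no padding.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
struct PackedHeader
{
    public byte Kind;
    public int Length;
}

// Explicit layout: both fields start at offset 0, much like a C union.
[StructLayout(LayoutKind.Explicit)]
struct IntOrFloat
{
    [FieldOffset(0)] public int AsInt;
    [FieldOffset(0)] public float AsFloat;
}

static class LayoutDemo
{
    // Marshaled size of the packed struct: 1 byte + 4 bytes.
    public static int PackedSize() => Marshal.SizeOf<PackedHeader>();

    // Reinterprets the bits of a float as an int through the overlapping fields.
    public static int FloatBits(float f) => new IntOrFloat { AsFloat = f }.AsInt;

    static void Main()
    {
        Console.WriteLine(PackedSize());                    // 5
        Console.WriteLine(FloatBits(1.0f).ToString("X8"));  // 3F800000
    }
}
```

A class can also carry a StructLayout attribute for marshaling, but a struct with a fixed layout is the usual way to describe an unmanaged record.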

One of the main reasons is that, when used as local variables during a method call, structs are allocated on the stack.
Stack allocation is cheap, but the big difference is that de-allocation is also very cheap. In this situation, the garbage collector doesn't have to track structs -- they're removed when returning from the method that allocated them when the stack frame is popped.
edit - clarified my post re: Jon Skeet's comment.

A struct is a value type (like Int32), whereas a class is a reference type. Structs get created on the stack rather than the heap. Also, when a struct is passed to a method, a copy of the struct is passed, but when a class instance is passed, a reference is passed.
If you need to create your own datatype, say, then a struct is often a better choice than a class as you can use it just like the built-in value types in the .NET framework. There are some good struct examples you can read here.

Related

Should I use Class or Struct in the following case (data structure with many fields)? [duplicate]

I'm about to create 100,000 objects in code. They are small ones, only with 2 or 3 properties. I'll put them in a generic list and when they are, I'll loop them and check value a and maybe update value b.
Is it faster/better to create these objects as class or as struct?
EDIT
a. The properties are value types (except the string i think?)
b. They might (we're not sure yet) have a validate method
EDIT 2
I was wondering: are objects on the heap and the stack processed equally by the garbage collector, or does that work different?
Is it faster to create these objects as class or as struct?
You are the only person who can determine the answer to that question. Try it both ways, measure a meaningful, user-focused, relevant performance metric, and then you'll know whether the change has a meaningful effect on real users in relevant scenarios.
Structs consume less heap memory (because they are smaller and more easily compacted, not because they are "on the stack"). But they take longer to copy than a reference copy. I don't know what your performance metrics are for memory usage or speed; there's a tradeoff here and you're the person who knows what it is.
Is it better to create these objects as class or as struct?
Maybe class, maybe struct. As a rule of thumb:
If the object is :
1. Small
2. Logically an immutable value
3. There's a lot of them
Then I'd consider making it a struct. Otherwise I'd stick with a reference type.
If you need to mutate some field of a struct it is usually better to build a constructor that returns an entire new struct with the field set correctly. That's perhaps slightly slower (measure it!) but logically much easier to reason about.
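One way to read the advice above is an immutable struct with a method that returns a modified copy. The sketch below assumes a hypothetical Item type with two values A and B, loosely matching the "check value a, maybe update value b" scenario from the question; readonly struct requires C# 7.2 or later.

```csharp
using System;

// Hypothetical immutable value type with two properties.
readonly struct Item
{
    public int A { get; }
    public int B { get; }

    public Item(int a, int b) { A = a; B = b; }

    // Instead of mutating B in place, build a whole new value.
    public Item WithB(int newB) => new Item(A, newB);
}

static class ImmutableStructDemo
{
    // Checks A and, where it is positive, replaces the element with an updated copy.
    public static Item[] UpdateAll(Item[] items)
    {
        for (int i = 0; i < items.Length; i++)
        {
            if (items[i].A > 0)
                items[i] = items[i].WithB(items[i].B + 1);
        }
        return items;
    }

    static void Main()
    {
        var items = UpdateAll(new[] { new Item(1, 10), new Item(0, 20) });
        Console.WriteLine(items[0].B);  // 11
        Console.WriteLine(items[1].B);  // 20
    }
}
```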
Are objects on the heap and the stack processed equally by the garbage collector?
No, they are not the same because objects on the stack are the roots of the collection. The garbage collector does not need to ever ask "is this thing on the stack alive?" because the answer to that question is always "Yes, it's on the stack". (Now, you can't rely on that to keep an object alive because the stack is an implementation detail. The jitter is allowed to introduce optimizations that, say, enregister what would normally be a stack value, and then it's never on the stack so the GC doesn't know that it is still alive. An enregistered object can have its descendents collected aggressively, as soon as the register holding onto it is not going to be read again.)
But the garbage collector does have to treat objects on the stack as alive, the same way that it treats any object known to be alive as alive. The object on the stack can refer to heap-allocated objects that need to be kept alive, so the GC has to treat stack objects like living heap-allocated objects for the purposes of determining the live set. But obviously they are not treated as "live objects" for the purposes of compacting the heap, because they're not on the heap in the first place.
Is that clear?
Sometimes with a struct you don't need to call the new() constructor and can assign the fields directly, making it much faster than usual.
Example:
Value[] list = new Value[N];
for (int i = 0; i < N; i++)
{
    list[i].id = i;
    list[i].isValid = true;
}
is about 2 to 3 times faster than
Value[] list = new Value[N];
for (int i = 0; i < N; i++)
{
    list[i] = new Value(i, true);
}
where Value is a struct with two fields (id and isValid).
struct Value
{
    // Public so the fields can be assigned directly, as in the first loop.
    public int id;
    public bool isValid;
    public Value(int i, bool isValid)
    {
        this.id = i;
        this.isValid = isValid;
    }
}
On the other hand, if the items need to be moved or selected, all that copying of value types is going to slow you down. To get the exact answer, I suspect you have to profile your code and test it out.
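Since the answers keep coming back to "measure it", here is a rough timing sketch for the two fill strategies using System.Diagnostics.Stopwatch. The FillBenchmark class and the array size are assumptions, and a single Stopwatch run is only a crude indicator; results depend on the runtime, JIT and build configuration, so no particular ratio should be expected.

```csharp
using System;
using System.Diagnostics;

struct Value
{
    public int id;
    public bool isValid;
    public Value(int id, bool isValid) { this.id = id; this.isValid = isValid; }
}

static class FillBenchmark
{
    // Assigns the fields of each array element in place.
    public static Value[] FillByFields(int n)
    {
        var list = new Value[n];
        for (int i = 0; i < n; i++)
        {
            list[i].id = i;
            list[i].isValid = true;
        }
        return list;
    }

    // Builds a new struct value and copies it into each element.
    public static Value[] FillByConstructor(int n)
    {
        var list = new Value[n];
        for (int i = 0; i < n; i++)
        {
            list[i] = new Value(i, true);
        }
        return list;
    }

    static long TimeMs(Func<int, Value[]> fill, int n)
    {
        var sw = Stopwatch.StartNew();
        fill(n);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        const int N = 10000000;   // arbitrary size chosen for illustration
        FillByFields(1000);       // warm-up so JIT time is not measured
        FillByConstructor(1000);
        Console.WriteLine("fields:      " + TimeMs(FillByFields, N) + " ms");
        Console.WriteLine("constructor: " + TimeMs(FillByConstructor, N) + " ms");
    }
}
```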
Arrays of structs are represented on the heap in a contiguous block of memory, whereas an array of objects is represented as a contiguous block of references with the actual objects themselves elsewhere on the heap, thus requiring memory for both the objects and for their array references.
In this case, as you are placing them in a List<> (and a List<> is backed onto an array) it would be more efficient, memory-wise to use structs.
(Beware, though, that large arrays will find their way onto the Large Object Heap where, if their lifetime is long, they may have an adverse effect on your process's memory management. Remember, also, that memory is not the only consideration.)
Structs may seem similar to classes, but there are important differences that you should be aware of. First of all, classes are reference types and structs are value types. By using structs, you can create objects that behave like the built-in types and enjoy their benefits as well.
When you call the new operator on a class, it will be allocated on the heap. However, when you instantiate a struct, it gets created on the stack. This will yield performance gains. Also, you will not be dealing with references to an instance of a struct as you would with classes. You will be working directly with the struct instance. Because of this, when passing a struct to a method, it's passed by value instead of as a reference.
More here:
http://msdn.microsoft.com/en-us/library/aa288471(VS.71).aspx
If they have value semantics, then you should probably use a struct. If they have reference semantics, then you should probably use a class. There are exceptions, which mostly lean towards creating a class even when there are value semantics, but start from there.
As for your second edit, the GC only deals with the heap, but there is a lot more heap space than stack space, so putting things on the stack isn't always a win. Besides which, a list of struct-types and a list of class-types will be on the heap either way, so this is irrelevant in this case.
Edit:
I'm beginning to consider the term evil to be harmful. After all, making a class mutable is a bad idea if it's not actively needed, and I would not rule out ever using a mutable struct. It is a poor idea so often as to almost always be a bad idea though, but mostly it just doesn't coincide with value semantics so it just doesn't make sense to use a struct in the given case.
There can be reasonable exceptions with private nested structs, where all uses of that struct are hence restricted to a very limited scope. This doesn't apply here though.
Really, I think "it mutates so it's a bad struct" is not much better than going on about the heap and the stack (which at least does have some performance impact, even if a frequently misrepresented one). "It mutates, so it quite likely doesn't make sense to consider it as having value semantics, so it's a bad struct" is only slightly different, but importantly so I think.
The best solution is to measure, measure again, then measure some more. There may be details of what you're doing that may make a simplified, easy answer like "use structs" or "use classes" difficult.
A struct is, at its heart, nothing more nor less than an aggregation of fields. In .NET it's possible for a structure to "pretend" to be an object, and for each structure type .NET implicitly defines a heap object type with the same fields and methods which--being a heap object--will behave like an object. A variable which holds a reference to such a heap object ("boxed" structure) will exhibit reference semantics, but one which holds a struct directly is simply an aggregation of variables.
I think much of the struct-versus-class confusion stems from the fact that structures have two very different usage cases, which should have very different design guidelines, but the MS guidelines don't distinguish between them. Sometimes there is a need for something which behaves like an object; in that case, the MS guidelines are pretty reasonable, though the "16 byte limit" should probably be more like 24-32. Sometimes, however, what's needed is an aggregation of variables. A struct used for that purpose should simply consist of a bunch of public fields, and possibly an Equals override, ToString override, and IEquatable(itsType).Equals implementation. Structures which are used as aggregations of fields are not objects, and shouldn't pretend to be. From the structure's point of view, the meaning of field should be nothing more or less than "the last thing written to this field". Any additional meaning should be determined by the client code.
For example, if a variable-aggregating struct has members Minimum and Maximum, the struct itself should make no promise that Minimum <= Maximum. Code which receives such a structure as a parameter should behave as though it were passed separate Minimum and Maximum values. A requirement that Minimum be no greater than Maximum should be regarded like a requirement that a Minimum parameter be no greater than a separately-passed Maximum one.
A useful pattern to consider sometimes is to have an ExposedHolder<T> class defined something like:
class ExposedHolder<T>
{
    public T Value;
    public ExposedHolder() { }
    public ExposedHolder(T val) { Value = val; }
}
If one has a List<ExposedHolder<someStruct>>, where someStruct is a variable-aggregating struct, one may do things like myList[3].Value.someField += 7;, but giving myList[3].Value to other code will give it the contents of Value rather than giving it a means of altering it. By contrast, if one used a List<someStruct>, it would be necessary to use var temp=myList[3]; temp.someField += 7; myList[3] = temp;. If one used a mutable class type, exposing the contents of myList[3] to outside code would require copying all the fields to some other object. If one used an immutable class type, or an "object-style" struct, it would be necessary to construct a new instance which was like myList[3] except for someField which was different, and then store that new instance into the list.
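The two update styles described above can be sketched as follows. Stats is a hypothetical exposed-field struct, and ExposedHolder<T> is repeated here (with public constructors) only so the example is self-contained.

```csharp
using System;
using System.Collections.Generic;

struct Stats { public int Count; }   // hypothetical variable-aggregating struct

class ExposedHolder<T>
{
    public T Value;
    public ExposedHolder() { }
    public ExposedHolder(T val) { Value = val; }
}

static class HolderDemo
{
    // With List<Stats>, the indexer returns a copy, so "list[index].Count += 7;"
    // does not compile (error CS1612). The element must be copied out and back.
    public static int UpdateViaTemp(List<Stats> list, int index)
    {
        var temp = list[index];
        temp.Count += 7;
        list[index] = temp;
        return list[index].Count;
    }

    // With a holder, the struct lives in a field of a heap object and can be
    // modified in place, while handing out Value still hands out a copy.
    public static int UpdateViaHolder(List<ExposedHolder<Stats>> list, int index)
    {
        list[index].Value.Count += 7;
        Stats snapshot = list[index].Value;   // a copy of the contents
        snapshot.Count = -1;                  // does not affect the list
        return list[index].Value.Count;
    }

    static void Main()
    {
        var plain = new List<Stats> { new Stats { Count = 1 } };
        var held = new List<ExposedHolder<Stats>> { new ExposedHolder<Stats>(new Stats { Count = 1 }) };
        Console.WriteLine(UpdateViaTemp(plain, 0));    // 8
        Console.WriteLine(UpdateViaHolder(held, 0));   // 8
    }
}
```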
One additional note: If you are storing a large number of similar things, it may be good to store them in possibly-nested arrays of structures, preferably trying to keep the size of each array between 1K and 64K or so. Arrays of structures are special, in that indexing one will yield a direct reference to a structure within, so one can say "a[12].x = 5;". Although one can define array-like objects, C# does not allow for them to share such syntax with arrays.
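A short sketch of the array behavior just mentioned, using a hypothetical Particle struct. The ref local on the second update is a C# 7.0 feature and is shown only as another way to alias the same element.

```csharp
using System;

struct Particle { public double X; }   // hypothetical example type

static class ArrayDemo
{
    public static double Run()
    {
        var a = new Particle[16];
        a[12].X = 5;                    // writes straight into the array element
        ref Particle p = ref a[12];     // alias to the element, not a copy
        p.X += 1;
        return a[12].X;
    }

    static void Main()
    {
        Console.WriteLine(Run());   // 6
    }
}
```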
Use classes.
On a general note. Why not update value b as you create them?
From a C++ perspective I agree that it will be slower modifying a struct's properties compared to a class. But I do think that they will be faster to read from due to the struct being allocated on the stack instead of the heap. Reading data from the heap requires more checks than from the stack.
Well, if you go with a struct after all, then get rid of the string and use a fixed-size char or byte buffer.
That's re: performance.

Immutable class vs struct

The following are the only ways classes are different from structs in C# (please correct me if I'm wrong):
Class variables are references, while struct variables are values, therefore the entire value of struct is copied in assignments and parameter passes
Class variables are pointers stored on the stack that point to memory on the heap, while struct variables are stored on the stack as values
Suppose I have an immutable struct, that is struct with fields that cannot be modified once initialized. Each time I pass this struct as a parameter or use in assignments, the value would be copied and stored on stack.
Then suppose I make this immutable struct to be an immutable class. The single instance of this class would be created once, and only the reference to the class would be copied in assignments and parameter passes.
If the object was mutable, the behavior in these two cases would be different: when one would change the object, in the first case the copy of the struct would be modified, while in the second case the original object would be changed. However, in both cases the object is immutable, therefore there is no difference whether this is actually a class or a struct for the user of this object.
Since copying reference is cheaper than copying struct, why would one use an immutable struct?
Also, since mutable structs are evil, it looks like there is no reason to use structs at all.
Where am I wrong?
Since copying reference is cheaper than copying struct, why would one use an immutable struct?
This isn't always true. Copying a reference is going to be 8 bytes on a 64bit OS, which is potentially larger than many structs.
Also note that creation of the class is likely more expensive. Creating a struct is often done completely on the stack (though there are many exceptions), which is very fast. Creating a class requires creating the object handle (for the garbage collector), creating the reference on the stack, and tracking the object's lifetime. This can add GC pressure, which also has a real cost.
That being said, creating a large immutable struct is likely not a good idea, which is part of why the Guidelines for choosing between Classes and Structures recommend always using a class if your struct will be more than 16 bytes, if it will be boxed, and other issues that make the difference smaller.
That being said, I often base my decision more on the intended usage and meaning of the type in question. Value types should be used to refer to a single value (again, refer to guidelines), and often have a semantic meaning and expected usage different than classes. This is often just as important as the performance characteristics when making the choice between class or struct.
Reed's answer is quite good but just to add a few extra points:
please correct me if I'm wrong
You are basically on the right track here. You've made the common error of confusing variables with values. Variables are storage locations; values are stored in variables. And you are flirting with the commonly-stated myth that "value types go on the stack"; rather, variables go on either short-term or long-term storage, because variables are storage locations. Whether a variable goes on short or long term storage depends on its known lifetime, not its type.
But all of that is not particularly relevant to your question, which boils down to asking for a refutation of this syllogism:
Mutable structs are evil.
Reference copying is cheaper than struct copying, so immutable structs are always worse.
Therefore, there is never any use for structs.
We can refute the syllogism in several ways.
First, yes, mutable structs are evil. However, they are sometimes very useful because in some limited scenarios, you can get a performance advantage. I do not recommend this approach unless other reasonable avenues have been exhausted and there is a real performance problem.
Second, reference copying is not necessarily cheaper than struct copying. References are typically implemented as 4 or 8 byte managed pointers (though that is an implementation detail; they could be implemented as opaque handles). Copying a reference-sized struct is neither cheaper nor more expensive than copying a reference-sized reference.
Third, even if reference copying is cheaper than struct copying, references must be dereferenced in order to get at their fields. Dereferencing is not zero cost! Not only does it take machine cycles to dereference a reference, doing so might mess up the processor cache, and that can make future dereferences far more expensive!
Fourth, even if reference copying is cheaper than struct copying, who cares? If that is not the bottleneck that is producing an unacceptable performance cost then which one is faster is completely irrelevant.
Fifth, references are far, far more expensive in memory space than structs are.
Sixth, references add expense because the network of references must be periodically traced by the garbage collector; "blittable" structs may be ignored by the garbage collector entirely. Garbage collection is a large expense.
Seventh, immutable value types cannot be null, unlike reference types. You know that every value is a good value. And as Reed pointed out, in order to get a good value of a reference type you have to run both an allocator and a constructor. That's not cheap.
Eighth, value types represent values, and programs are often about the manipulation of values. It makes sense to "bake in" the metaphors of both "value" and "reference" in a language, regardless of which is "cheaper".
From MSDN:
Classes are reference types and structures are value types. Reference types are allocated on the heap, and memory management is handled by the garbage collector. Value types are allocated on the stack or inline and are deallocated when they go out of scope. In general, value types are cheaper to allocate and deallocate. However, if they are used in scenarios that require a significant amount of boxing and unboxing, they perform poorly as compared to reference types.
Do not define a structure unless the type has all of the following characteristics:
It logically represents a single value, similar to primitive types (integer, double, and so on).
It has an instance size smaller than 16 bytes.
It is immutable.
It will not have to be boxed frequently.
So, you should always use a class instead of struct, if your struct will be more than 16 bytes. Also read from http://www.dotnetperls.com/struct
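The boxing mentioned in the MSDN guidance can be seen in a few lines. This is a minimal sketch with a hypothetical Counter struct: converting a struct to object copies it into a new heap object each time.

```csharp
using System;

struct Counter { public int N; }   // hypothetical example type

static class BoxingDemo
{
    public static int BoxedValueAfterMutation()
    {
        var c = new Counter { N = 1 };
        object boxed = c;              // boxing: copies c into a new heap object
        c.N = 2;                       // changes the original variable only
        return ((Counter)boxed).N;     // unboxing copies the boxed value back out
    }

    public static bool TwoBoxesAreSameObject()
    {
        var c = new Counter { N = 1 };
        object b1 = c;                 // first box
        object b2 = c;                 // second, separate box
        return ReferenceEquals(b1, b2);
    }

    static void Main()
    {
        Console.WriteLine(BoxedValueAfterMutation());   // 1
        Console.WriteLine(TwoBoxesAreSameObject());     // False
    }
}
```

Each box is an allocation the garbage collector has to track, which is why frequent boxing erodes the benefits of using a struct.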
There are two usage cases for structures. Opaque structures are useful for things which could be implemented using immutable classes, but are sufficiently small that even in the best of circumstances there wouldn't be much--if any--benefit to using a class, especially if the frequency with which they are created and discarded is a significant fraction of the frequency with which they will be simply copied. For example, Decimal is a 16-byte struct, so holding a million Decimal values would take 16 megabytes. If it were a class, each reference to a Decimal instance would take 4 or 8 bytes, but each distinct instance would probably take another 20-32 bytes. If one had many large arrays whose elements were copied from a small number of distinct Decimal instances, the class could win out, but in most scenarios one would be more likely to have an array with a million references to a million distinct instances of Decimal, which would mean the struct would win out.
Using structures in this way is generally only good if the guidelines quoted from MSDN apply (though the immutability guideline is mainly a consequence of the fact that there isn't yet any way via which struct methods can indicate that they modify the underlying struct). If any of the last three guidelines don't apply, one is likely better off using an immutable class than a struct. If the first guideline does not apply, however, that means one shouldn't use an opaque struct, but not that one should use a class instead.
In some situations, the purpose of a data type is simply to fasten a group of variables together with duct tape so that their values can be passed around as a unit, but they still remain semantically as distinct variables. For example, a lot of methods may need to pass around groups of three floating-point numbers representing 3d coordinates. If one wants to draw a triangle, it's a lot more convenient to pass three Point3d parameters than nine floating-point numbers. In many cases, the purpose of such types is not to impart any domain-specific behavior, but rather to simply provide a means of passing things around conveniently. In such cases, structures can offer major performance advantages over classes, if one uses them properly. A struct which is supposed to represent three variables of type double fastened together with duct tape should simply have three public fields of type double. Such a struct will allow two common operations to be performed efficiently:
Given an instance, take a snapshot of its state so the instance can be modified without disturbing the snapshot
Given an instance which is no longer needed, somehow come up with an instance which is slightly different
Immutable class types allow the first to be performed at fixed cost regardless of the amount of data held by the class, but they are inefficient at the second. The greater the amount of data the variable is supposed to represent, the greater the advantage of immutable class types versus structs when performing the first operation, and the greater the advantage of exposed-field structs when performing the second.
Mutable class types can be efficient in scenarios where the second operation dominates, and the first is needed seldom if ever, but it can be difficult for an object to expose the present values in a mutable class object without exposing the object itself to outside modification.
Note that depending upon usage patterns, large exposed-field structures may be much more efficient than either opaque structures or class types. Structures larger than 17 bytes are often less efficient than smaller ones, but they can still be vastly more efficient than classes. Further, the cost of passing a structure as a ref parameter does not depend upon its size. Large structs are inefficient if one accesses them via properties rather than fields, passes them by value needlessly, etc. but if one is careful to avoid redundant "copy" operations, there are usage patterns where there is no break-even point for classes versus structs--structs will simply perform better.
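As a sketch of the ref-parameter point, the hypothetical 64-byte Big struct below is passed three ways: by value (the whole struct is copied), by ref (no copy, callee may modify), and by in (no copy, read-only; C# 7.2 or later).

```csharp
using System;

// Hypothetical large exposed-field struct: eight doubles, 64 bytes.
struct Big { public double A, B, C, D, E, F, G, H; }

static class RefDemo
{
    static void BumpByValue(Big b) { b.A += 1; }       // modifies a local copy only
    static void BumpByRef(ref Big b) { b.A += 1; }     // modifies the caller's variable
    static double SumReadOnly(in Big b) => b.A + b.B;  // reads through a reference

    public static double Run()
    {
        var big = new Big { A = 1 };
        BumpByValue(big);          // big.A is still 1
        BumpByRef(ref big);        // big.A is now 2
        return SumReadOnly(in big);
    }

    static void Main()
    {
        Console.WriteLine(Run());   // 2
    }
}
```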
Some people may recoil in horror at the idea of a type having exposed fields, but I would suggest that a struct such as I describe shouldn't be thought of so much as an entity unto itself, but rather an extension of the things that read or write it. For example:
public struct SlopeAndIntercept
{
public double Slope,Intercept;
}
public SlopeAndIntercept FindLeastSquaresFit() ...
Code which is going to perform a least-squares-fit of a bunch of points will have to do a significant amount of work to find either the slope or Y intercept of the resulting line; finding both would not cost much more. Code which calls the FindLeastSquaresFit method is likely going to want to have the slope in one variable and the intercept in another. If such code does:
var resultLine = FindLeastSquaresFit();
the result will be to effectively create two variables resultLine.Slope and resultLine.Intercept which the method can manipulate as it sees fit. The fields of resultLine don't really belong to SlopeAndIntercept, nor to FindLeastSquaresFit; they belong to the code that declares resultLine. The situation is little different from if the method were used as:
double Slope, Intercept;
FindLeastSquaresFit(out Slope, out Intercept);
In that context, it would be clear that immediately following the function call, the two variables have the meaning assigned by the method, but that their meaning at any other time will depend upon what else the method does with them. Likewise for the fields of the aforementioned structure.
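For concreteness, here is one possible body for such a method. The answer above elides the implementation, so everything below is an assumption: the xs/ys array parameters, the LineFit class name, and the use of the ordinary least-squares formulas for a line y = slope * x + intercept.

```csharp
using System;

public struct SlopeAndIntercept
{
    public double Slope, Intercept;
}

static class LineFit
{
    // Hypothetical signature: the original shows no parameters or body.
    public static SlopeAndIntercept FindLeastSquaresFit(double[] xs, double[] ys)
    {
        if (xs.Length != ys.Length || xs.Length < 2)
            throw new ArgumentException("Need at least two points with matching coordinates.");

        int n = xs.Length;
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0;
        for (int i = 0; i < n; i++)
        {
            sumX += xs[i];
            sumY += ys[i];
            sumXY += xs[i] * ys[i];
            sumXX += xs[i] * xs[i];
        }

        double denom = n * sumXX - sumX * sumX;
        if (denom == 0)
            throw new ArgumentException("All x values are identical; the slope is undefined.");

        double slope = (n * sumXY - sumX * sumY) / denom;
        double intercept = (sumY - slope * sumX) / n;
        return new SlopeAndIntercept { Slope = slope, Intercept = intercept };
    }

    static void Main()
    {
        // Points on the line y = 2x + 1.
        var resultLine = FindLeastSquaresFit(new double[] { 0, 1, 2 }, new double[] { 1, 3, 5 });
        resultLine.Intercept += 10;   // the caller owns these fields and may change them freely
        Console.WriteLine(resultLine.Slope);       // 2
        Console.WriteLine(resultLine.Intercept);   // 11
    }
}
```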
There are some situations where it may be better to return data using an immutable class rather than a transparent structure. Among other things, using a class will make it easier for future versions of a function that returns a Foo to return something which includes additional information. On the other hand, there are many situations where code is going to expect to deal with a specific set of discrete things, and changing that set of things would fundamentally change what clients have to do with it. For example, if one has a bunch of code that deals with (x,y) points, adding a "z" coordinate is going to require that code to be rewritten, and there's nothing the "point" type can do to mitigate that.

When exactly do I use a struct (Dont tell me when I want things to be allocated on a stack) [closed]

Closed 12 years ago.
A lot of times the answer is merely when I want things to be allocated on a stack instead of the heap.. assuming I dont know what the stack and the heap are (and please dont try to explain it here), when exactly should I be using structs instead of classes?
Here is the answer I've been giving out, but please tell me if I'm wrong or falling short of a better answer:
I create structs usually when I have enums that I want to add more data to. For instance, I might start with a simple enum:
public enum Colors { Blue, Green, Red }
Then if I need to store more aspects of this data I go to structs:
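public struct Color
{
    public string Name;
    public int HexValue;
    public string Description;
    public Color(string name, int hexValue, string description)
    {
        Name = name;
        HexValue = hexValue;
        Description = description;
    }
}
public class Colors
{
    public static Color Blue;
    public static Color Red;
    public static Color Green;
    static Colors()
    {
        Blue = new Color("Blue", 1234, "A light blue");
    }
}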
The point is... similar to enums, I use structs just when I want to declare a bunch of types.
struct vs class in .NET
The real time to use a struct is when you want value type like properties. Value types behave very differently from reference types, and the difference can be shocking and bug inducing if you aren't aware. For example a struct is copied into (and out of) method calls.
The heap vs stack isn't a compelling argument to me. In a typical .NET app, how often do you care about where the object lives?
I very rarely use structs in .NET apps. The only place I've truly used them was in a game where I wanted value type like properties for objects like vectors and such.
struct vs class in C++
This is a much simpler question to answer. structs and classes in C++ are identical to each other with only one minor difference. Everything in a C++ struct is public by default, whereas everything in a C++ class is private by default. That is the only difference.
Simplistically, structs are for data, classes are for data and the manipulations of those data.
well in C++ land, you can allocate a struct (or class) on the stack or the heap... I generally use a struct when the default public access to everything is useful and class when I want to encapsulate... I also use structs for functors that need state.
You have this labelled with C#, C++ and "language-agnostic". The fact is, though, that the differences between structs and classes in C# are completely different from those in C++, so this is not a language-agnostic question.
In C++ class is syntactic sugar for struct with different default visibility. The reason is that C had a struct and only had "public" visibility (indeed, it's not even a fully meaningful statement in C, there's no such thing as OO-style information hiding in C).
C++ wanted to be compatible with C so it had to keep struct default to everything visible. C++ also wanted to follow good OO rules, in which things are private by default, so it introduced class to do exactly the same, but with different default visibility.
Generally you use struct when you are closer to C-style use; no or relatively simple member functions (methods), simple construction, ability to change all fields from the outside ("Plain Old Data"). class is generally used for anything else, and hence more common.
In C# a struct has value semantics while a class has reference semantics (in C++ both classes and structs have value semantics, but you can also use types that access them with reference semantics). A struct, as a value-type is self-contained (the variable contains the actual value(s) directly) while a class, as a reference type refers to another value.
Some other differences are entailed by this. The fact that we can alias reference types directly (which has both good and bad effects) comes from this. So too do differences in what equality means:
A value type has a concept of equality based on the value contained, which can optionally be redefined (there are logical restrictions on how this redefinition can happen). A reference type has a concept of identity, which is meaningless with value types (as they cannot be directly aliased, two such values cannot be identical) and cannot be redefined; identity also gives the default for its concept of equality. By default, == deals with value-based equality when it comes to value types, but with identity when it comes to reference types. Also, even when a reference type is given a value-based concept of equality, and has it used for ==, it never loses the ability to be compared to another reference for identity.
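A small sketch of the equality and identity distinction. The types Meters and Person are hypothetical. A user-defined struct gets no == operator unless it declares one, so the sketch calls Equals for the value type:

```csharp
using System;

public struct Meters            // value type: compared by value
{
    public double Value;
}

public class Person             // reference type: has identity
{
    public string Name = "";
}

public static class EqualityDemo
{
    public static bool StructsWithSameFieldsAreEqual()
    {
        var a = new Meters { Value = 2.5 };
        var b = new Meters { Value = 2.5 };
        return a.Equals(b);     // inherited ValueType.Equals compares the fields
    }

    public static bool DistinctObjectsAreEqual()
    {
        var a = new Person { Name = "Ann" };
        var b = new Person { Name = "Ann" };
        return a == b;          // default == on a class compares references
    }

    public static bool AliasIsIdentical()
    {
        var a = new Person { Name = "Ann" };
        var alias = a;          // second reference to the same object
        return ReferenceEquals(a, alias);
    }
}
```

The first and third methods return true; the second returns false because the two Person objects are distinct even though their contents match.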
Another difference entailed by this is that reference types can be null - a value that refers to another value allows for a value that doesn't refer to any value, which is what a null reference is.
Also, some of the advantages of keeping value-types small relate to this, since being based on value, they are copied by value when passed to functions.
Some other differences are implied but not entailed by this. That it's often a good idea to make value types immutable is one of them. It is not entailed by the core difference: while there are advantages to be found without considering implementation matters, there are also advantages in doing so with reference types (indeed, some relating to safety with aliases apply more immediately to reference types), and there are reasons why one may break this guideline, so it's not a hard and fast rule. With nested value types, for example, the risks involved are so heavily reduced that I would have few qualms about making one mutable, even though my style leans heavily toward making even reference types immutable when at all practical.
Some further differences between value types and reference types are arguably implementation details. That a value type in a local variable has the value stored on the stack has been argued as an implementation detail; probably a pretty obvious one if your implementation has a stack, and certainly an important one in some cases, but not core to the definition. It's also often overstated (for a start, a reference type in a local variable also has the reference itself in the stack, for another there are plenty of times when a value type value is stored in the heap).
Some further advantages in value types being small relate to this.
Therefore, in C#, use a struct when you are solely concerned with value semantics and will not want to alias (string is an example of a case where value semantics are very important, but you would want to alias, so it is a class to make it a reference type). It's also a very good idea for such types to be immutable and an extremely good idea for such types to have fields that total less than 16 bytes. For a larger struct or a struct that needs to be mutable, it may well be wise to use a class instead, even if the value semantics make struct your first choice.
In C++, technically it doesn't matter. You could use struct for polymorphic objects and class for PODs. The language wouldn't care, though your coworkers may plot a bloody revenge. Aside from default access, there's no difference between class and struct.
Ultimately, the most important consideration is that you pick a coding style and apply it consistently. Maybe that means everything is classes, or maybe PODs are structs. You need to decide for yourself, taking into consideration any coding practices applied by whomever you work for.
As for myself, I only use structs if the object has only public members and no virtuals. They might have data only or data and methods. Typically I use structs for buckets of data that may or may not have simple operations associated with them, usually to convert from one type to another. Hence they may also have constructors.
When to use struct...
When you want the object to behave as a value type
When the required size of the object is roughly 16 bytes or less
In C++, a struct is exactly the same thing as a class, with one difference: in a class, everything is private by default, while in a struct, everything is public by default!
#include <string>

struct Color
{
    std::string Name;   // public: struct members default to public
private:
    int HexValue;
};
would be the same as
class Color
{
    int HexValue;       // private: class members default to private
public:
    std::string Name;
};
I would say, use a struct when your data is small in size and you won't have a few thousand of these objects. Large numbers can hurt you a lot because, as already mentioned, value types are copied by nature, so passing around a few thousand objects that are copied by value is not a good idea.
The second point I would like to add is that when you mostly want plain data, and that data is mostly numeric, you can use a struct.

What (if any) are the implications of having an object or a nullable type as a field in a struct

For performance reasons I use structs in several use cases.
If I have an object or a nullable type (another struct, but nullable) as a member in the struct, is there an adverse effect on performance? Do I lose the very benefit I am trying to gain?
Edit
I am aware of the size limitations and proper use of structs. Please no more lectures. In performance tests the structs perform faster.
I do not mean to sound abrasive or ungrateful, but how do I make my question any more simple?
Does having an object as a member of a struct impact performance or negate the benefit?
Well, C# is a strange beast when it comes to the performance part of struct vs classes.
Check this link: http://msdn.microsoft.com/en-us/library/y23b5415(VS.71).aspx
According to Microsoft, you should use a struct only when the instance size is under 16 bytes. Andrew is right. If you do not pass a struct around, you might see a performance benefit. Value-type semantics carry a heavy performance penalty (and at times a memory penalty, depending on what you are doing) when you pass them around.
As far as collections are concerned, if you are using a non-generic collection, the boxing and unboxing of a value-type (struct in this case) will have a higher performance overhead than a reference type (i.e. class). That said, it is also true that structs get allocated faster than classes.
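To make the boxing point concrete, here is a rough sketch. Sample and BoxingDemo are hypothetical names, and the sketch only shows where boxing happens; it does not measure its cost:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public struct Sample
{
    public int Value;
}

public static class BoxingDemo
{
    // ArrayList stores object, so every struct added is boxed into a new heap object.
    public static bool NonGenericBoxesEachAdd()
    {
        var s = new Sample { Value = 7 };
        var list = new ArrayList();
        list.Add(s);
        list.Add(s);
        // Two separate boxes that hold equal values.
        return !object.ReferenceEquals(list[0], list[1])
            && object.Equals(list[0], list[1]);
    }

    // List<Sample> stores the structs directly; no boxing or unboxing cast is involved.
    public static int GenericStoresDirectly()
    {
        var list = new List<Sample> { new Sample { Value = 7 } };
        return list[0].Value;
    }
}
```

With the generic collection the element type is known, so this particular overhead disappears.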
Although struct and class have similar syntax, their behavior is vastly different. This can lead you into many errors that might be difficult to trace. For example, a static constructor in a struct is not called when you invoke its implicit (hidden) public parameterless constructor, and the as operator cannot be used with structs.
Nullable types are themselves implemented as structs, but they do carry a penalty: every operation on a Nullable type emits more IL.
Well, in my opinion, structs are best left for types such as DateTime or Guid. If you need an immutable type, use a struct; otherwise, don't. The performance benefits are not that huge. Similarly, the overhead is not that huge either. So at the end of the day, it depends on the data you are storing in the struct and on how you are using it.
No, you won't lose the benefit necessarily. One area in which you see a performance benefit from using a struct is when you are creating many objects quickly in a loop and do not need to pass these objects to any other methods. In this case you should be fine but without seeing some code it is impossible to tell.
Personally, I'd be more worried about simply using structs inappropriately; what you have described sounds like an object (class) to me.
In particular, I'd worry about your struct being too big; when you pass a struct around (between variables, between methods, etc) it gets copied. If it is a big fat beast with lots of fields (some of which are themselves beasts) then this copy will take more space on the stack, and more CPU time. Contrast to passing a reference to an object, which takes a constant size / time (width per your x86/x64 architecture).
If we talk about basic nullable types, such as classic "values"; Nullable<T> of course has an overhead; the real questions are:
is it too much?
is it more expensive than the check I'd still have to do for a "magic number" etc.?
In particular, all casts and operators on Nullable<T> get extra code - for example:
int? a = ..., b = ...;
int? c = a + b;
is really more similar to:
int? c = (a.HasValue && b.HasValue) ?
    new Nullable<int>(a.GetValueOrDefault() + b.GetValueOrDefault())
    : new Nullable<int>();
The only way to see if this is too much is going to be with your own local tests, with your own data. The fact that the data is on a struct in this case is largely moot; the numbers should broadly compare no matter where they are.
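As a starting point for such local tests, the expansion described above can be written out by hand and compared with the built-in operator. This is only a sketch; NullableDemo and its method names are made up, and the hand-written version is an approximation of what the compiler emits, not its exact output:

```csharp
using System;

public static class NullableDemo
{
    // Hand-written approximation of the code generated for `a + b` on int? operands.
    public static int? AddExpanded(int? a, int? b)
    {
        return (a.HasValue && b.HasValue)
            ? new Nullable<int>(a.GetValueOrDefault() + b.GetValueOrDefault())
            : new Nullable<int>();
    }

    // The built-in lifted operator, for comparison.
    public static int? AddLifted(int? a, int? b) => a + b;
}
```

Both methods return 5 for (2, 3) and null when either operand is null, so they can be timed against each other with your own data.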
Nullable<T> is essentially a tuple of T and bool flag indicating whether it's null or not. Its performance effect is therefore exactly the same: in terms of size, you get that extra bool (plus whatever padding it deems required).
For references to reference types, there are no special implications. It's just whatever the size of an object reference is (which is usually sizeof(IntPtr), though I don't think there's a definite guarantee on that). Of course, GC would also have to trace through those references every now and then, but a reference inside a struct is not in any way special in that regard.
Neither nullable types nor immutable class types will pose a problem within a struct. When using mutable class types, however, one should generally try to stick to one of two approaches:
The state represented by a mutable class field or property should be the *identity*, rather than the *mutable characteristics*, of the mutable object referenced thereby. For example, suppose a struct has a field of type `Car`, which holds a reference to a red car, vehicle ID #24601. Suppose further that someone copies the struct and then paints the vehicle referred to blue. An object reference would be appropriate if, under such circumstances, one would want the structure to hold a reference to a blue car with ID #24601. It would be inappropriate if one would want the structure to still hold a reference to a red car (which would have to have some other ID, since car ID #24601 is blue).
Code within the struct creates a mutable class instance, performs all mutations that will ever be performed to that instance (possibly copying data from a passed-in instance), and stores it in a private field after all mutations are complete. Once a reference to the instance is stored in a field, the struct must never again mutate that instance, nor expose it to any code which could mutate it.
Note that the two approaches offer very different semantics; one should never have a hard time deciding between them, since in any circumstance where one is appropriate the other would be completely inappropriate. In some circumstances there may be other approaches which would work somewhat better, but in general one should identify whether a struct's state includes the identity or mutable characteristics of any nested mutable classes, and use one of the patterns above as appropriate.
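The two approaches might be sketched as follows. Car comes from the example above; ParkingSlot, NameList and MutableFieldDemo are hypothetical names introduced only for illustration:

```csharp
using System;
using System.Collections.Generic;

public class Car                       // mutable class from the example above
{
    public int Id;
    public string Color = "";
}

// Approach 1: the field stands for the car's identity.
// Every copy of the struct sees later changes to that same car.
public struct ParkingSlot
{
    public Car Occupant;
}

// Approach 2: the struct builds its own instance, finishes mutating it,
// then stores it privately and never mutates or exposes it again.
public struct NameList
{
    private readonly List<string> _names;

    public NameList(IEnumerable<string> source)
    {
        var copy = new List<string>(source); // copy data from the passed-in instance
        copy.Sort();                          // all mutation happens before storing
        _names = copy;
    }

    public int Count => _names == null ? 0 : _names.Count;
    public string this[int index] => _names[index];
}

public static class MutableFieldDemo
{
    public static string PaintThroughCopy()
    {
        var car = new Car { Id = 24601, Color = "red" };
        var slot = new ParkingSlot { Occupant = car };
        var copy = slot;               // copies the reference, not the car
        copy.Occupant.Color = "blue";
        return slot.Occupant.Color;    // "blue": both structs identify car 24601
    }

    public static string SnapshotIsStable()
    {
        var source = new List<string> { "b", "a" };
        var names = new NameList(source);
        source.Add("0");               // later changes to the source are not seen
        return names[0] + names.Count; // "a2"
    }
}
```

In the first case the struct's state is "which car"; in the second it is "which names", frozen at construction time.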

Blindly converting structs to classes to hide the default constructor?

I read all the questions related to this topic, and they all give reasons why a default constructor on a struct is not available in C#, but I have not yet found anyone who suggests a general course of action when confronted with this situation.
The obvious solution is to simply convert the struct to a class and deal with the consequences.
Are there other options to keep it as a struct?
I ran into this situation with one of our internal commerce API objects. The designer converted it from a class to a struct, and now the default constructor (which was private before) leaves the object in an invalid state.
I thought that if we're going to keep the object as a struct, a mechanism for checking the validity of the state should be introduced (something like an IsValid property). I was met with much resistance, and an explanation of "whoever uses the API should not use the default constructor," a comment which certainly raised my eyebrows. (Note: the object in question is constructed "properly" through static factory methods, and all other constructors are internal.)
Is everyone simply converting their structs to classes in this situation without a second thought?
Edit: I would like to see some suggestions about how to keep this type of object as a struct -- the object in question above is much better suited as a struct than as a class.
For a struct, you design the type so the default constructed instance (fields all zero) is a valid state. You don't [shall not] arbitrarily use struct instead of class without a good reason - there's nothing wrong with using an immutable reference type.
My suggestions:
Make sure the reason for using a struct is valid (a [real] profiler revealed significant performance problems resulting from heavy allocation of a very lightweight object).
Design the type so the default constructed instance is valid.
If the type's design is dictated by native/COM interop constraints, wrap the functionality and don't expose the struct outside the wrapper (private nested type). That way you can easily document and verify proper use of the constrained type requirements.
The reason for this is that a struct (an instance of System.ValueType) is treated specially by the CLR: it is initialized with all the fields being 0 (or default). You don't really even need to create one - just declare it. This is why default constructors are required.
You can get around this in two ways:
Create a property like IsValid to indicate whether it is a valid struct, as you suggest, and
in .NET 2.0 or later, consider using Nullable<T> to allow an uninitialized (null) struct.
Changing the struct to a class can have some very subtle consequences (in terms of memory usage and object identity, which come up more in a multithreaded environment), and not-so-subtle but hard-to-debug NullReferenceExceptions for uninitialized objects.
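The IsValid and Nullable<T> workarounds listed in this answer might look roughly like this. OrderId, its Create factory and ValidityDemo are hypothetical stand-ins for the API object in the question, not its real design:

```csharp
using System;

// Hypothetical API type: meant to be created only through the factory method.
public struct OrderId
{
    private readonly string _value;   // null in a default (zeroed) instance

    private OrderId(string value) { _value = value; }

    public static OrderId Create(string value)
    {
        if (string.IsNullOrEmpty(value))
            throw new ArgumentException("value must be non-empty", nameof(value));
        return new OrderId(value);
    }

    // False for default(OrderId) and for elements of a freshly allocated array.
    public bool IsValid => _value != null;

    public override string ToString() => IsValid ? _value : "(invalid)";
}

public static class ValidityDemo
{
    // Nullable<T> alternative: "no value yet" is expressed as null
    // instead of handing out a zeroed struct.
    public static OrderId? Find(string raw) =>
        string.IsNullOrEmpty(raw) ? (OrderId?)null : OrderId.Create(raw);
}
```

Callers can then check IsValid (or HasValue on the nullable form) before using an instance that may have come from default construction.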
The reason why there is no possibility to define a default constructor is illustrated by the following expression:
new MyStruct[1000];
You've got 3 options here
calling the default constructor 1000 times, or
creating corrupt data (note that a struct can contain references; if you don't initialize or blank out the reference, you could potentially access arbitrary memory), or
blank the allocated memory out with zeroes (at the byte level).
.NET does the same for both structs and classes: fields and array elements are blanked out with zeroes. This also gets more consistent behavior between structs and classes and no unsafe code. It also allows the .NET framework not to specialize something like new byte[1000].
And that's the default constructor for structs .NET demands and takes care of itself: zero out all bytes.
Now, to handle this, you've got a couple of options:
Add an Am-I-Initialized property to the struct (like HasValue on Nullable).
Allow the zeroed out struct to be a valid value (like 0 is a valid value for a decimal).
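The last option, designing the type so that the zeroed state is meaningful, could be sketched like this. Percentage and ZeroStateDemo are hypothetical names chosen for the example:

```csharp
using System;

// Hypothetical type designed so that the all-zero state is meaningful: 0 percent.
public readonly struct Percentage
{
    private readonly int _basisPoints;     // 0 in a zeroed instance

    public Percentage(int basisPoints) { _basisPoints = basisPoints; }

    public decimal Value => _basisPoints / 100m;
}

public static class ZeroStateDemo
{
    public static decimal SumOfFreshArray()
    {
        var items = new Percentage[1000];  // no constructor runs; memory is zeroed
        decimal total = 0m;
        foreach (var p in items) total += p.Value;
        return total;                      // 0: every element is a valid 0 percent
    }
}
```

Because the zeroed representation is itself a legal value, arrays and default(Percentage) need no validity check at all.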
