Why is Uri a class and not a struct?

Why is Uri a class and not a struct? - c#

System.Uri seems to be a perfect candidate for being a struct but the good folks at Microsoft decided to make it a class. It is clearly a value object -> there is no sense in having two different instances of "https://stackoverflow.com" and so it has struct-y equality rules. I can't see any setters on the public API either, so it is immutable.
Is there some implementation detail which dictates it must be a class?

A struct would always have a default constructor (where all fields would be initialized to their default values). This could cause issues, for example, with some of the internal fields.
One example is that the existing constructor ensures that m_String can't be null - https://referencesource.microsoft.com/#system/net/system/UriExt.cs,39 . If you made this a struct, you can't easily achieve that in C#. So, everywhere you read from m_String you'd need to add a null check (which is not required if it is a class).
Additionally, as others have pointed out, the docs have other guidance on choosing when to use struct:
✓ CONSIDER defining a struct instead of a class if instances of the
type are small and commonly short-lived or are commonly embedded in
other objects.
X AVOID defining a struct unless the type has all of the following
characteristics:
It logically represents a single value, similar to primitive types
(int, double, etc.).
It has an instance size under 16 bytes.
It is immutable.
It will not have to be boxed frequently.
The 16 byte requirement, in particular, is likely to be problematic for Uri.

Related

How to determine if .NET (BCL) type is immutable

From this Answer, I came to know that KeyValuePair are immutables.
I browsed through the docs, but could not find any information regarding immutable behavior.
I was wondering how to determine if a type is immutable or not?

I don't think there's a standard way to do this, since there is no official concept of immutability in C#. The only way I can think of is looking at certain things, indicating a higher probability:
1) All properties of the type have a private set
2) All fields are const/readonly or private
3) There are no methods with obvious/known side effects
4) Also, being a struct generally is a good indication (if it is BCL type or by someone with guidelines for this)
Something like an ImmutabeAttribute would be nice. There are some thoughts here (somewhere down in the comments), but I haven't seen one in "real life" yet.

The first indication would be that the documentation for the property in the overview says "Gets the key in the key/value pair."
The second more definite indication would be in the description of the property itself:
"This property is read/only."

I don't think you can find "proof" of immutability by just looking at the docs, but there are several strong indicators:
It's a struct (why does this matter?)
It has no settable public properties (both are read-only)
It has no obvious mutator methods
For definitive proof I recommend downloading the BCL's reference source from Microsoft or using an IL decompiler to show you how a type would look like in code.

A KeyValuePair<T1,T2> is a struct which, absent Reflection, can only be mutated outside its constructor by copying the contents of another KeyValuePair<T1,T2> which holds the desired values. Note that the statement:
MyKeyValuePair = new KeyValuePair(1,2);
like all similar constructor invocations on structures, actually works by creating a new temporary instance of KeyValuePair<int,int> (happens before the constructor itself executes), setting the field values of that instance (done by the constructor), copying all public and private fields of that new temporary instance to MyKeyValuePair, and then discarding the temporary instance.
Consider the following code:
static KeyValuePair MyKeyValuePair; // Field in some class
// Thread1
MyKeyValuePair = new KeyValuePair(1,1);
// ***
MyKeyValuePair = new KeyValuePair(2,2);
// Thread2
st = MyKeyValuePair.ToString();
Because MyKeyValuePair is precisely four bytes in length, the second statement in Thread1 will update both fields simultaneously. Despite that, if the second statement in Thread1 executes between Thread2's evaluation of MyKeyValuePair.Key.ToString() and MyKeyValuePair.Value.ToString(), the second ToString() will act upon the new mutated value of the structure, even though the first already-completed ToString()operated upon the value before the mutation.
All non-trivial structs, regardless of how they are declared, have the same immutability rules for their fields: code which can change a struct can change its fields; code which cannot change a struct cannot change its fields. Some structs may force one to go through hoops to change one of their fields, but designing struct types to be "immutable" is neither necessary nor sufficient to ensure the immutability of instances. There are a few reasonable uses of "immutable" struct types, but such use cases if anything require more care than is necessary for structs with exposed public fields.

When exactly do I use a struct (Dont tell me when I want things to be allocated on a stack) [closed]

It's difficult to tell what is being asked here. This question is ambiguous, vague, incomplete, overly broad, or rhetorical and cannot be reasonably answered in its current form. For help clarifying this question so that it can be reopened, visit the help center.
Closed 12 years ago.
A lot of times the answer is merely when I want things to be allocated on a stack instead of the heap.. assuming I dont know what the stack and the heap are (and please dont try to explain it here), when exactly should I be using structs instead of classes?
Here is the answer I've been giving out, but please tell me if I'm wrong or falling short of a better answer:
I create structs usually when I have enums that I want to add more data to. For instance, I might start with a simple enum:
public enum Colors { Blue, Green, Red }
Then if I need to store more aspects of this data I go to structs:
public struct Color
{
string Name;
int HexValue;
string Description;
}
public class Colors
{
public static Color Blue;
public static Color Red;
public static Color Green;
static Colors()
{
Blue = new Color("Blue", 1234, "A light blue"
}
}
The point is... similar to enums, I use structs just when I want to declare a bunch of types.

struct vs class in .NET
The real time to use a struct is when you want value type like properties. Value types behave very differently from reference types, and the difference can be shocking and bug inducing if you aren't aware. For example a struct is copied into (and out of) method calls.
The heap vs stack isn't a compelling argument to me. In a typical .NET app, how often do you care about where the object lives?
I very rarely use structs in .NET apps. The only place I've truly used them was in a game where I wanted value type like properties for objects like vectors and such.
struct vs class in C++
This is a much simpler question to answer. structs and classes in C++ are identical to each other with only one minor difference. Everything in a C++ struct is public by default, where as everything in a C++ class is private by default. That is the only difference.

Simplistically, structs are for data, classes are for data and the manipulations of those data.

well in C++ land, you can allocate a struct (or class) on the stack or the heap... I generally use a struct when the default public access to everything is useful and class when I want to encapsulate... I also use structs for functors that need state.

You have this labelled with C#, C++ and "language-agnostic". The fact is though, the difference between structs and classes in C# and C++ are completely different, so this is not a language-agnostic question.
In C++ class is syntactic sugar for struct with different default visibility. The reason is that C had a struct and only had "public" visibility (indeed, it's not even a fully meaningful statement in C, there's no such thing as OO-style information hiding in C).
C++ wanted to be compatible with C so it had to keep struct default to everything visible. C++ also wanted to follow good OO rules, in which things are private by default, so it introduced class to do exactly the same, but with different default visibility.
Generally you use struct when you are closer to C-style use; no or relatively simple member functions (methods), simple construction, ability to change all fields from the outside ("Plain Old Data"). class is generally used for anything else, and hence more common.
In C# a struct has value semantics while a class has reference semantics (in C++ both classes and structs have value semantics, but you can also use types that access them with reference semantics). A struct, as a value-type is self-contained (the variable contains the actual value(s) directly) while a class, as a reference type refers to another value.
Some other differences are entailed by this. The fact that we can alias reference types directly (which has both good and bad effects) comes from this. So too do differences in what equality means:
A value type has a concept of equality based on the value contained, which can optionally be redefined (there are logical restrictions on how this redefinition can happen*). A reference type has a concept of identity that is meaningless with value types (as they cannot be directly aliased, so two such values cannot be identical) that can not be redefined, which is also gives the default for its concept of equality. By default, == deals with this value-based equality when it comes to value types†, but with identity when it comes to reference types. Also, even when a reference type is given a value-based concept of equality, and has it used for == it never loses the ability to be compared to another reference for identity.
Another difference entailed by this is that reference types can be null - a value that refers to another value allows for a value that doesn't refer to any value, which is what a null reference is.
Also, some of the advantages of keeping value-types small relate to this, since being based on value, they are copied by value when passed to functions.
Some other differences are implied but not entailed by this. That it's often a good idea to make value types immutable is implied but not entailed by the core difference because while there are advantages to be found without considering implementation matters, there are also advantages in doing so with reference types (indeed some relating to safety with aliases apply more immediately to reference types) and reasons why one may break this guideline - so it's not a hard and fast rule (with nested value types the risks involved are so heavily reduced that I would have few qualms in making a nested value type mutable, even though my style leans heavily to making even reference types immutable when at all practical).
Some further differences between value types and reference types are arguably implementation details. That a value type in a local variable has the value stored on the stack has been argued as an implementation detail; probably a pretty obvious one if your implementation has a stack, and certainly an important one in some cases, but not core to the definition. It's also often overstated (for a start, a reference type in a local variable also has the reference itself in the stack, for another there are plenty of times when a value type value is stored in the heap).
Some further advantages in value types being small relate to this.
Therefore in C# a struct when you are solely concerned with value-semantics and will not want to alias (string is an example of a case where value-semantics are very important, but you would want to alias, so it is a class to make it a reference-type). It's also a very good idea for such types to be immutable and an extremely good idea for such types to have fields that total to less than 16bytes - for a larger struct or a struct that needs to be mutable it may well be wise to use a class instead, even if the value-semantics make struct your first choice.

In C++, technically it doesn't matter. You could use struct for polymorphic object and class for PODs. The language wouldn't care, though your coworkers may plot a bloody revenge. Aside from default access, there's no difference between class and struct.
Ultimately, the most important consideration is that you pick a coding style, and apply it consitantly. Maybe that means everything is classes, or maybe PODs are structs. You need to decide for yourself, taking in to consideration any coding practices applied by whomever you work for.
As for myself, I only use structs if the object has only public members and no virtuals. They might have data only or data and methods. Typically I use structs for buckets of data that may or may not have simple operations associated with them, usually to convert from one type to another. Hence they may also have constructors.

When to use struct...
When you want object to behave as a value type
When the required size of object is <=16 bytes roughly.

A struct is actually exactly the same thing as a class - with one difference: in a class, everything is private by default while in a struct, everything is public by default!
struct Color
{
string Name;
private:
int HexValue;
};
would be the same as
class Color
{
int HexValue;
public:
string Name;
};

I would say , use stack when your data is smaller in size and you don't want a few thoushands of this object because it can hurt ou back a lot because as already mentioned that value types are copied by nature so pasing few thoushands of objects which is copied by value is not a good idea also.
Second point , i would like to include is when you only want data for most of the time and data is numeric most of the time , you can use stack.

What (if any) are the implications of having an object or a nullable type as a field in a struct

For performance reasons I use structs in several use cases.
If I have an object or a nullable type (another struct but nullable) as a member in the struct, is there an adverse effect on performance. Do I lose the very benefit I am trying to gain?
Edit
I am aware of the size limitations and proper use of structs. Please no more lectures. In performance tests the structs perform faster.
I do not mean to sound abrasive or ungrateful, but how do I make my question any more simple?
Does having a object as a member of a struct impact performance or negate the benefit?

Well, C# is a strange beast when it comes to the performance part of struct vs classes.
Check this link: http://msdn.microsoft.com/en-us/library/y23b5415(VS.71).aspx
According to Microsoft you should use a struct only when the instance size is under 16 bytes. Andrew is right. If you do not pass around a struct, you might see a performance benefit. Value type semantics have a heavy performance (and at time memory, depending on what you are doing) penalty while passing them around.
As far as collections are concerned, if you are using a non-generic collection, the boxing and unboxing of a value-type (struct in this case) will have a higher performance overhead than a reference type (i.e. class). That said, it is also true that structs get allocated faster than classes.
Although struct and class have same syntax, the behavior is vastly different. This can force you to make many errors that might be difficult to trace. For example, like static constructors in a struct would not be called when you call it's public (hidden constructor) or as operator will fail with structs.
Nullable types are themselves are implemented with structs. But they do have a penalty. Even every operation of a Nullable type emit more IL.
Well, in my opinion, struct are well left to be used in types such as DateTime or Guids. If you need an immutable type, use struct otherwise, don't. The performance benefits are not that huge. Similarly even the overhead is not that huge. So at the end of day, it depends on your data you are storing in the struct and also how you are using it.

No, you won't lose the benefit necessarily. One area in which you see a performance benefit from using a struct is when you are creating many objects quickly in a loop and do not need to pass these objects to any other methods. In this case you should be fine but without seeing some code it is impossible to tell.

Personally, I'd be more worried about simply using structs inappropriately; what you have described sounds like an object (class) to me.
In particular, I'd worry about your struct being too big; when you pass a struct around (between variables, between methods, etc) it gets copied. If it is a big fat beast with lots of fields (some of which are themselves beasts) then this copy will take more space on the stack, and more CPU time. Contrast to passing a reference to an object, which takes a constant size / time (width per your x86/x64 architecture).
If we talk about basic nullable types, such as classic "values"; Nullable<T> of course has an overhead; the real questions are:
is it too much
is it more expensive than the check I'd still have to do for a "magic number" etc
In particular, all casts and operators on Nullable<T> get extra code - for example:
int? a = ..., b = ...;
int? c = a + b;
is really more similar to:
int? c = (a.HasValue && b.HasValue) ?
new Nullable<int>(a.GetValueOrDefault() + b.GetValueOrDefault())
: new Nullable<int>();
The only way to see if this is too much is going to be with your own local tests, with your own data. The fact that the data is on a struct in this case is largely moot; the numbers should broadly compare no matter where they are.

Nullable<T> is essentially a tuple of T and bool flag indicating whether it's null or not. Its performance effect is therefore exactly the same: in terms of size, you get that extra bool (plus whatever padding it deems required).
For references to reference types, there are no special implications. It's just whatever the size of an object reference is (which is usually sizeof(IntPtr), though I don't think there's a definite guarantee on that). Of course, GC would also have to trace through those references every now and then, but a reference inside a struct is not in any way special in that regard.

Neither nullable types nor immutable class types will pose a problem within a struct. When using mutable class types, however, one should generally try to stick to one of two approaches:
The state represented by mutable class field or property should be the *identity*, rather than the *mutable charactersitics*, of the mutable object referenced thereby. For example, suppose a struct has a field of type `Car`, which holds a reference to a red car, vehicle ID #24601. Suppose further that someone copies the struct and then paints the vehicle referred to blue. An object reference would be appropriate if, under such circumstances, one would want the structure to hold a reference to a blue car with ID #24601. It would be inappropriate if one would want the structure to still hold a refernce to a red car (which would have to have some other ID, since car ID #24601 is blue).
Code within the struct creates a mutable class instance, performs all mutations that will ever be performed to that instance (possibly copying data from a passed-in instance), and stores it in a private field after all mutations are complete. Once a reference to the instance is stored in a field, the struct must never again mutate that instance, nor expose it to any code which could mutate it.
Note that the two approaches offer very different semantics; one should never have a hard time deciding between them, since in any circumstance where one is appropriate the other would be completely inappropriate. In some circumstances there may be other approaches which would work somewhat better, but in general one should identify whether a struct's state includes the identity or mutable characteristics of any nested mutable classes, and use one of the patterns above as appropriate.

Blindly converting structs to classes to hide the default constructor?

I read all the questions related to this topic, and they all give reasons why a default constructor on a struct is not available in C#, but I have not yet found anyone who suggests a general course of action when confronted with this situation.
The obvious solution is to simply convert the struct to a class and deal with the consequences.
Are there other options to keep it as a struct?
I ran into this situation with one of our internal commerce API objects. The designer converted it from a class to a struct, and now the default constructor (which was private before) leaves the object in an invalid state.
I thought that if we're going to keep the object as a struct, a mechanism for checking the validity of the state should be introduced (something like an IsValid property). I was met with much resistance, and an explanation of "whoever uses the API should not use the default constructor," a comment which certainly raised my eyebrows. (Note: the object in question is constructed "properly" through static factory methods, and all other constructors are internal.)
Is everyone simply converting their structs to classes in this situation without a second thought?
Edit: I would like to see some suggestions about how to keep this type of object as a struct -- the object in question above is much better suited as a struct than as a class.

For a struct, you design the type so the default constructed instance (fields all zero) is a valid state. You don't [shall not] arbitrarily use struct instead of class without a good reason - there's nothing wrong with using an immutable reference type.
My suggestions:
Make sure the reason for using a struct is valid (a [real] profiler revealed significant performance problems resulting from heavy allocation of a very lightweight object).
Design the type so the default constructed instance is valid.
If the type's design is dictated by native/COM interop constraints, wrap the functionality and don't expose the struct outside the wrapper (private nested type). That way you can easily document and verify proper use of the constrained type requirements.

The reason for this is that a struct (an instance of System.ValueType) is treated specially by the CLR: it is initialized with all the fields being 0 (or default). You don't really even need to create one - just declare it. This is why default constructors are required.
You can get around this in two ways:
Create a property like IsValid to indicate if it is a valid struct, like you indicate and
in .Net 2.0 consider using Nullable<T> to allow an uninitialized (null) struct.
Changing the struct to a class can have some very subtle consequences (in terms of memory usage and object identity which come up more in a multithreaded environment), and non-so-subtle but hard to debug NullReferenceExceptions for uninitialized objects.

The reason why there is no possibility to define a default constructor is illustrated by the following expression:
new MyStruct[1000];
You've got 3 options here
calling the default constructor 1000 times, or
creating corrupt data (note that a struct can contain references; if you don't initialize or blank out the reference, you could potentially access arbitrary memory), or
blank the allocated memory out with zeroes (at the byte level).
.NET does the same for both structs and classes: fields and array elements are blanked out with zeroes. This also gets more consistent behavior between structs and classes and no unsafe code. It also allows the .NET framework not to specialize something like new byte[1000].
And that's the default constructor for structs .NET demands and takes care of itself: zero out all bytes.
Now, to handle this, you've got a couple of options:
Add an Am-I-Initialized property to the struct (like HasValue on Nullable).
Allow the zeroed out struct to be a valid value (like 0 is a valid value for a decimal).

Why do we need struct? (C#)

To use a struct, we need to instantiate the struct and use it just like a class. Then why don't we just create a class in the first place?

A struct is a value type so if you create a copy, it will actually physically copy the data, whereas with a class it will only copy the reference to the data

A major difference between the semantics of class and struct is that structs have value semantics. What is this means is that if you have two variables of the same type, they each have their own copy of the data. Thus if a variable of a given value type is set equal to another (of the same type), operations on one will not affect the other (that is, assignment of value types creates a copy). This is in sharp contrast to reference types.
There are other differences:
Value types are implicitly sealed (it is not possible to derive from a value type).
Value types can not be null.
Value types are given a default constructor that initialzes the value type to its default value.
A variable of a value type is always a value of that type. Contrast this with classes where a variable of type A could refer to a instance of type B if B derives from A.
Because of the difference in semantics, it is inappropriate to refer to structs as "lightweight classes."

All of the reasons I see in other answers are interesting and can be useful, but if you want to read about why they are required (at least by the VM) and why it was a mistake for the JVM to not support them (user-defined value types), read Demystifying Magic: High-level Low-level Programming. As it stands, C# shines in talking about the potential to bring safe, managed code to systems programming. This is also one of the reasons I think the CLI is a superior platform [than the JVM] for mobile computing. A few other reasons are listed in the linked paper.
It's important to note that you'll very rarely, if ever, see an observable performance improvement from using a struct. The garbage collector is extremely fast, and in many cases will actually outperform the structs. When you add in the nuances of them, they're certainly not a first-choice tool. However, when you do need them and have profiler results or system-level constructs to prove it, they get the job done.
Edit: If you wanted an answer of why we need them as opposed to what they do, ^^^

In C#, a struct is a value type, unlike classes which are reference types. This leads to a huge difference in how they are handled, or how they are expected to be used.
You should probably read up on structs from a book. Structs in C# aren't close cousins of class like in C++ or Java.

This is a myth that struct are always created on heap.
Ok it is right that struct is value type and class is reference type. But remember that
1. A Reference Type always goes on the Heap.
2. Value Types go where they were declared.
Now what that second line means is I will explain with below example
Consider the following method
public void DoCalulation()
{
int num;
num=2;
}
Here num is a local variable so it will be created on stack.
Now consider the below example
public class TestClass
{
public int num;
}
public void DoCalulation()
{
TestClass myTestClass = new TestClass ();
myTestClass.num=2;
}
This time num is the num is created on heap.Ya in some cases value types perform more than reference types as they don't require garbage collection.
Also remeber:
The value of a value type is always a value of that type.
The value of a reference type is always a reference.
And you have to think over the issue that if you expect that there will lot be instantiation then that means more heap space yow will deal with ,and more is the work of garbage collector.For that case you can choose structs.

Structs have many different semantics to classes. The differences are many but the primary reasons for their existence are:
They can be explicitly layed out in memmory
this allows certain interop scenarios
They may be allocated on the stack
Making some sorts of high performance code possible in a much simpler fashion

the difference is that a struct is a value-type
I've found them useful in 2 situations
1) Interop - you can specify the memory layout of a struct, so you can guarantee that when you invoke an unmanaged call.
2) Performance - in some (very limited) cases, structs can be faster than classes, In general, this requires structs to be small (I've heard 16 bytes or less) , and not be changed often.

One of the main reasons is that, when used as local variables during a method call, structs are allocated on the stack.
Stack allocation is cheap, but the big difference is that de-allocation is also very cheap. In this situation, the garbage collector doesn't have to track structs -- they're removed when returning from the method that allocated them when the stack frame is popped.
edit - clarified my post re: Jon Skeet's comment.

A struct is a value type (like Int32), whereas a class is a reference type. Structs get created on the stack rather than the heap. Also, when a struct is passed to a method, a copy of the struct is passed, but when a class instance is passed, a reference is passed.
If you need to create your own datatype, say, then a struct is often a better choice than a class as you can use it just like the built-in value types in the .NET framework. There some good struct examples you can read here.

Develop Reference

C# (C-Sharp) is a programming language developed by Microsoft that runs on the .NET Framework.

Why is Uri a class and not a struct? - c#

Related

How to determine if .NET (BCL) type is immutable

When exactly do I use a struct (Dont tell me when I want things to be allocated on a stack) [closed]

What (if any) are the implications of having an object or a nullable type as a field in a struct

Blindly converting structs to classes to hide the default constructor?

Why do we need struct? (C#)

Categories

Resources