C# class code loaded in RAM? - c#

I would like to know whether the actual code of a C# class gets loaded in RAM when you instantiate the class?
So, for example, say I have two classes, A and B. Class A has 10,000 lines of code but just one field, an int. Class B has 10 lines of code and also one field, an int. If I instantiate class A, will it take more RAM than class B because of its lines of code?
A supplementary question: if the lines of code are loaded into memory together with the class, will they be loaded for every instance of the class, or just once for all instances?
Thanks in advance.

In the desktop framework, I believe methods are JITted on a method-by-method basis. I don't know whether the IL for a class is completely loaded into RAM when the class is first loaded, or whether they're just memory mapped to the assembly file.
Either way, you get a single copy for all instances - at least for non-generic types. For generic types (and methods) it gets slightly more complicated - there's one JIT representation for all reference type type arguments, and one for each value type type argument. So List<string> and List<Stream> share native code, but List<int> and List<Guid> don't. (Extrapolate as appropriate for types with more than one generic type parameter.) You still only get one copy for all instances of the same constructed type though - objects don't come with their own copy of the native code.

An instance of type A will take exactly the same amount of memory as an instance of type B. The amount of memory used by an instance of a type is a direct result of the fields it contains, so if both types contain the same fields, instances of both types will occupy the same amount of memory. Of course, if you have variable-length fields, such as strings, arrays, collections etc., then you have to take this into account, but if the fields are set to the same values for each type then the same amount of memory will be used.
Within an app domain, the code containing the instructions for the methods of each type will only be loaded once, rather than for each instance of the type. As Jon says, it is important to remember that each closed generic type (with all type parameters stated) is a separate runtime type.
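The point that each closed generic type is its own runtime type can be observed from C# itself. The sketch below is only an illustration: Counter<T> and ClosedGenericDemo are hypothetical names, and the example shows type identity and per-type static storage, not where the JIT places native code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: a static field belongs to the (closed) type, never to an instance.
static class Counter<T>
{
    public static int Count;
}

static class ClosedGenericDemo
{
    // List<int> and List<string> are distinct runtime types.
    public static bool AreSameType() => typeof(List<int>) == typeof(List<string>);

    // Each closed type Counter<int>, Counter<string> gets its own static field.
    public static int[] Counts()
    {
        Counter<int>.Count = 0;
        Counter<string>.Count = 0;
        Counter<int>.Count++;
        Counter<int>.Count++;
        Counter<string>.Count++;
        return new[] { Counter<int>.Count, Counter<string>.Count };
    }

    static void Main()
    {
        Console.WriteLine(AreSameType());               // False
        Console.WriteLine(string.Join(",", Counts()));  // 2,1
    }
}
```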
Incidentally, what matters is not how many lines of source code your type contains, but how much IL that source code compiles to. However, if one type has 10 lines of source code and another has 10,000, then it is highly likely that the IL for the latter class will be much larger. You can examine the IL by using a tool such as .NET Reflector.

This is a good article which describes how and where the IL code is JITted and loaded:
http://msdn.microsoft.com/en-us/magazine/cc163791.aspx

Related

What is the downside of using a structure vs object in a list in C#?

As I understand, using structure value types will always give better performance than using reference types in an array or list. Is there any downside involved in using struct instead of class type in a generic list?
PS : I am aware that MSDN recommends that struct should be maximum 16 bytes, but I have been using 100+ byte structure without problems so far. Also, when I get the maximum stack memory error exceeded for using a struct, I also run out of heap space if I use a class instead.
There is a lot of misinformation out there about struct vs. reference types in .Net. Anything which makes blanket statements like "structs will always perform better in ..." is almost certainly wrong. It's almost impossible to make blanket statements about performance.
Here are several items related to value types in a generic collection which will / can affect performance.
Using value types in a generic instantiation can cause extra copies of methods to be JIT'd at runtime. For reference types only one instance will be generated.
Using value types will affect the size of the allocated array, which will be count * size of the specific value type, vs. reference types, which all have the same size.
Adding / accessing values in the collection will incur copy overhead. The performance of this changes based on the size of the item. For references, again, it's the same no matter the type, and for value types it will vary based on the size.
As others have pointed out, there are many downsides to using large structures in a list. Some ramifications of what others have said:
Say you're sorting a list whose members are 100+ byte structures. Every time items have to be swapped, the following occurs:
var temp = list[i];
list[i] = list[j];
list[j] = temp;
The amount of data copied is 3*sizeof(your_struct). If you're sorting a list that's made up of reference types, the amount of data copied is 3*sizeof(IntPtr): 12 bytes in the 32-bit runtime, or 24 bytes in the 64-bit runtime. I can tell you from experience that copying large structures is far more expensive than the indirection inherent in using reference types.
Using structures also reduces the maximum number of items you can have in a list. In .NET, the maximum size of any single data structure is 2 gigabytes (minus a little bit). A list of structures has a maximum capacity of 2^31/sizeof(your_struct). So if your structure is 100 bytes in size, you can have at most about 21.5 million of them in a list. But if you use reference types, your maximum is about 536 million in the 32-bit runtime (although you'll run out of memory before you reach that limit), or 268 million in the 64-bit runtime. And, yes, some of us really do work with that many things in memory.
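The capacity figures above follow from simple division. As a rough sketch, assuming the 2^31-byte ceiling quoted in this answer (ListCapacityEstimate and MaxElements are hypothetical names, and real limits are slightly lower because of array overhead):

```csharp
using System;

static class ListCapacityEstimate
{
    // Upper bound on the element count if one backing array may not exceed 2^31 bytes.
    public static long MaxElements(int elementSizeInBytes) => (1L << 31) / elementSizeInBytes;

    static void Main()
    {
        Console.WriteLine(MaxElements(100)); // 21474836  (100-byte struct)
        Console.WriteLine(MaxElements(4));   // 536870912 (32-bit references)
        Console.WriteLine(MaxElements(8));   // 268435456 (64-bit references)
    }
}
```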
using structure value types will always give better performance than using reference types in an array or list
There is nothing true in that statement.
Take a look at this question and answer.
With structs, you cannot have code reuse in the form of class inheritance. A struct can only implement interfaces but cannot inherit from a class or another struct whereas a class can inherit from another class and of course implement interfaces.
When storing data in a List<T> or other collection (as opposed to keeping a list of controls or other active objects) and one wishes to allow the data to change, one should generally follow one of four patterns:
Store immutable objects in the list, and allow the list itself to change
Store mutable objects in the list, but only allow objects created by the owner of the list to be stored therein. Allow outsiders to access the mutable objects themselves.
Only store mutable objects to which no outside references exist, and don't expose to the outside world any references to objects within the list; if information from the list is requested, copy it from the objects in the list.
Store value types in the list.
Approach #1 is the simplest, if the objects one wants to store are immutable. Of course, the requirement that objects be immutable can be somewhat limiting.
Approach #2 can be convenient in some cases, and it permits convenient updating of data in the list (e.g. MyList[index].SomeProperty += 5;) but the exact semantics of how returned properties are, or remain, attached to items in the list may sometimes be unclear. Further, there's no clear way to load all the properties of an item in the list from an 'example' object.
Approach #3 has simple-to-understand semantics (changing an object after giving it to the list will have no effect, objects retrieved from the list will not be affected by subsequent changes to the list, and changes to objects retrieved from a list will not affect the list itself unless the objects are explicitly written back), but requires defensive copying on every list access, which can be rather bothersome.
Approach #4 offers essentially the same semantics as approach #3, but copying a struct is cheaper than making a defensive copy of a class object. Note that if the struct is mutable, the semantics of:
var temp = MyList[index];
temp.SomeField += 5;
MyList[index] = temp;
are clearer than anything that can be achieved with so-called "immutable" (i.e. mutation-only-by-assignment) structs. To know what the above does, all one needs to know about the struct is that SomeField is a public field of some particular type. By contrast, even something like:
var temp = MyList[index];
temp = temp.WithSomeField(temp.SomeField + 5);
MyList[index] = temp;
which is about the best one could hope for with such a struct, would be much harder to read than the easily-mutable-struct version. Further, to be sure of what the above actually does, one would have to examine the definition of the struct's WithSomeField method and any constructors or methods employed thereby, as well as all of the struct's fields, to determine whether it had any side-effects other than modifying SomeField.
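To make the copy semantics of approach #4 concrete, here is a minimal sketch. Item and StructInListDemo are hypothetical names introduced for illustration; the behavior shown (the indexer of List<T> returns a copy of a struct) is standard C#.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical mutable struct with a public field.
struct Item
{
    public int SomeField;
}

static class StructInListDemo
{
    // Modifying the retrieved copy does not touch the list.
    public static int UpdateWithoutWriteBack()
    {
        var list = new List<Item> { new Item { SomeField = 1 } };
        var temp = list[0];      // copies the struct out of the list
        temp.SomeField += 5;     // changes only the local copy
        return list[0].SomeField;
    }

    // Writing the copy back is what actually updates the list.
    public static int UpdateWithWriteBack()
    {
        var list = new List<Item> { new Item { SomeField = 1 } };
        var temp = list[0];
        temp.SomeField += 5;
        list[0] = temp;          // explicit write-back
        return list[0].SomeField;
    }

    static void Main()
    {
        Console.WriteLine(UpdateWithoutWriteBack()); // 1
        Console.WriteLine(UpdateWithWriteBack());    // 6
    }
}
```

Note that list[0].SomeField += 5; does not compile for a List<T> of structs, which is why the three-line form is needed.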

Boxing and Unboxing [duplicate]

Ok I understand the basic concept of what happens when you box and unbox.
Box throws the value type (stack object) into a System.Object and stores it on the heap
Unbox unpackages that object on the heap holding that value type and throws it back on the stack so it can be used.
Here is what I don't understand:
Why would this need to be done...specific real-world examples
Why are generics so efficient? They say it's because generics don't need to box or unbox, OK, but I don't get why. What's behind that in generics?
Why are generics better than, let's say, other types? For example, other collections?
So all in all, I don't understand how this applies in the real world in terms of code, and going further, how it makes generics better: why they don't have to do any of this in the first place.
Boxing needs to be done whenever you want to hold an int in an object variable.
A generic collection of ints contains an int[] instead of an object[].
Putting an int into the object[] behind a non-generic collection requires you to box the int.
Putting an int into the int[] behind a generic collection does not involve any boxing.
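A small sketch of the difference, using the standard ArrayList and List<int> collections (BoxingDemo and its method names are hypothetical):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

static class BoxingDemo
{
    // Boxing copies the value into a new heap object, so later changes to i are not seen.
    public static int BoxedCopyIsIndependent()
    {
        int i = 1;
        object boxed = i;    // boxing
        i = 2;
        return (int)boxed;   // unboxing
    }

    // Non-generic collection: every Add boxes, every read needs an unboxing cast.
    public static int SumNonGeneric()
    {
        var list = new ArrayList();
        for (int n = 1; n <= 3; n++) list.Add(n);
        int sum = 0;
        foreach (object o in list) sum += (int)o;
        return sum;
    }

    // Generic collection: the ints are stored directly, with no boxing and no casts.
    public static int SumGeneric()
    {
        var list = new List<int> { 1, 2, 3 };
        int sum = 0;
        foreach (int n in list) sum += n;
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(BoxedCopyIsIndependent()); // 1
        Console.WriteLine(SumNonGeneric());          // 6
        Console.WriteLine(SumGeneric());             // 6
    }
}
```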
Firstly, the stack and heap are implementation details. A value type isn't defined by being on the stack, and nothing says that the concepts of stack and heap will be used by every system able to host the CLR:
Link
That aside:
When a value type is boxed, the data in that value type is read, an object is created, and the data is copied to the new object.
If you are boxing all the items in a collection, this is a lot of overhead.
If you have a collection of boxed value types and are iterating over them, each item then has to be unboxed (the reverse of the process) just to read a value!
Generic collections are strongly typed to the type being stored in them, and therefore no boxing or unboxing needs to occur.
Here is a response around the unboxing/boxing portion.
I'm not sure how it is implemented in mono, but generic interfaces will help because the compiler creates a new function of the specific type for each different type used (internally, there are a few cases where it can utilize the same generated function). If a function of the specific type is generated, there is no need to box/unbox the type.
This is why the Collections.Generic library was a big hit at .NET 2.0 because collections no longer required boxing and became significantly more efficient.
As for why generics are better than other collections outside the boxing/unboxing scope: they also enforce type. No longer can you readily toss around a collection which can hold any type. This can catch bugs at compile time, versus seeing them at run time.
MSDN has a nice article: Boxing and Unboxing (C# Programming Guide)
In relation to simple assignments, boxing and unboxing are computationally expensive processes. When a value type is boxed, a new object must be allocated and constructed. To a lesser degree, the cast required for unboxing is also expensive computationally.
Boxing is used to store value types in the garbage-collected heap. Boxing is an implicit conversion of a value type to the type object or to any interface type implemented by this value type. Boxing a value type allocates an object instance on the heap and copies the value into the new object.
Unboxing is an explicit conversion from the type object to a value type or from an interface type to a value type that implements the interface. An unboxing operation consists of:
Checking the object instance to make sure that it is a boxed value of the given value type.
Copying the value from the instance into the value-type variable.
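The type check in the first step is strict: a boxed int can only be unboxed as an int, not as a long, even though an int converts to a long implicitly. A minimal sketch (UnboxingDemo and its method names are hypothetical):

```csharp
using System;

static class UnboxingDemo
{
    // Unboxing to a different value type fails the runtime check.
    public static bool UnboxToWrongTypeThrows()
    {
        object boxed = 42;                 // boxed int
        try
        {
            long wrong = (long)boxed;      // not an unbox-then-convert; throws
            return wrong == 0 && false;    // never reached
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Unbox to the exact type first, then convert.
    public static long UnboxThenConvert()
    {
        object boxed = 42;
        return (long)(int)boxed;
    }

    static void Main()
    {
        Console.WriteLine(UnboxToWrongTypeThrows()); // True
        Console.WriteLine(UnboxThenConvert());       // 42
    }
}
```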
Check also: Exploring C# Boxing
And read Jeffrey Richter's Type fundamentals. Here are two sample chapters plus the full TOC from Jeffrey Richter's "CLR via C#" (Microsoft Press, 2010), which he published some time ago.
Also some notes from Jeffrey Richter's book CLR via C#:
It’s possible to convert a value type to a reference type by using a mechanism called boxing.
Internally, here’s what happens when an instance of a value type is boxed:
Memory is allocated from the managed heap. The amount of memory allocated is the size required by the value type’s fields plus the two additional overhead members (the type object pointer and the sync block index) required by all objects on the managed heap.
The value type’s fields are copied to the newly allocated heap memory.
The address of the object is returned. This address is now a reference to an object; the value type is now a reference type. The C# compiler automatically produces the IL code necessary to box a value type instance, but you still need to understand what’s going on internally so that you’re aware of code size and performance issues.
Note. It should be noted that the FCL now includes a new set of generic collection classes that make the non-generic collection classes obsolete. For example, you should use the System.Collections.Generic.List<T> class instead of the System.Collections.ArrayList class. The generic collection classes offer many improvements over the non-generic equivalents. For example, the API has been cleaned up and improved, and the performance of the collection classes has been greatly improved as well. But one of the biggest improvements is that the generic collection classes allow you to work with collections of value types without requiring that items in the collection be boxed/unboxed. This in itself greatly improves performance because far fewer objects will be created on the managed heap thereby reducing the number of garbage collections required by your application. Furthermore, you will get compile-time type safety, and your source code will be cleaner due to fewer casts. This will all be explained in further detail in Chapter 12, “Generics.”
I don't want to quote the full chapter here. Read his book and you will get the details of the process and some answers. And BTW, there are quite a few answers to your question here on SO, around the Web and in many books. It is fundamental knowledge you certainly have to understand.
Here is an interesting read from Eric Lippert (The truth about value types):
Link
regarding your statement:
Box throws the value type (stack object) into a System.Object and stores it on the heap Unbox unpackages that object on the heap holding that value type and throws it back on the stack so it can be used.
This needs to be done because at the IL level there are different instructions for value types than for reference types (ldfld vs ldflda; check out the disassembly for a method that calls someValueType.ToString() vs someReferenceType.ToString() and you'll see that the instructions are different).
These instructions are not compatible, so when you need to pass a value type to a method as an object, that value needs to be wrapped in a reference type (boxing). This is inefficient because the runtime needs to create a new boxing object and copy the value into it in order to pass one value.
Generics are faster because value types can be stored as values and not references, so no boxing is needed. Take ArrayList vs List<int>. If you want to put 1 into an ArrayList, the CLR needs to box the int so that it can be stored in an object[]. List<T>, however, uses a T[] to store the list contents, so List<int> uses an int[], which means that 1 doesn't need to be boxed in order to put it in the array.
To put it simply, boxing and unboxing take a lot of time. Why? Because it's faster to use a known type from the start than to leave this for the runtime to handle.
A collection of objects can contain different items (string, int, double, etc.), and you must check every time that your operation on the variable is correct.
Converting from one type to another takes time.
Generics are much faster and you are encouraged to use them; the old collections exist for backward compatibility.
Suppose I want to store a bunch of variables of type Long in a List, but the system supported neither value-type generics nor boxing. The way to go about storing such values would be to define a new class "BoxedLong", which held a single field "Value" of type Long. Then to add a value to the list, one would create a new instance of a BoxedLong, set its Value field to the desired value, and store that in the list. To retrieve a value from the list, one would retrieve a BoxedLong object from the list, and take the value from its Value field.
When a value type is passed to something that expects an Object, the above is essentially what happens under the hood, except without the new identifier names.
When using generics with value types, the system doesn't use a value-holder class and pass it to routines which expect to work with objects. Instead, the system creates a new version of the routine that will work with the value type in question. If five different value types are passed to a generic routine, five different versions of the routine will be generated. In general, this will yield more code than would the use of a value-holder class, but the code will have to do less work every time a value is passed in or retrieved. Since most routines will have many values of each type passed in or out, the cost of generating different versions of the routine will be more than recouped by the elimination of boxing/unboxing operations.

What happens when you create an instance of an object containing no state in C#?

I am I think ok at algorithmic programming, if that is the right term? I used to play with turbo pascal and 8086 assembly language back in the 1980s as a hobby. But only very small projects and I haven't really done any programming in the 20ish years since then. So I am struggling for understanding like a drowning swimmer.
So maybe this is a very naive question or I'm just making no sense at all, but say I have an object kind of like this:
class Something : IDoer
{
    // public, so that it can implement the interface member
    public void Do(ISomethingElse x)
    {
        x.DoWhatEverYouWant(42);
    }
}
And then I do
var Thing1 = new Something();
var Thing2 = new Something();
Thing1.Do(blah);
Thing2.Do(blah);
Does Thing1 = Thing2? Does "new Something()" create anything? Or is it not much different from having a static class, except I can pass it around and swap it out etc.?
Is the "Do" procedure in the same location in memory for both the Thing1.Do(blah) and Thing2.Do(blah) calls? I mean, when executing it, are there two Something.Do procedures or just one?
They are two separate objects; they just don't have state.
Consider this code:
var obj1 = new object();
var obj2 = new object();
Console.WriteLine(object.ReferenceEquals(obj1, obj2));
It will output False.
Just because an object has no state doesn't mean it doesn't get allocated just like any other object. It just takes very little space (just like an object).
In response to the last part of your question: there is only one Do method. Methods are not stored per instance but rather per class. If you think about it, it would be extremely wasteful to store them per instance. Every method call to Do on a Something object is really the same set of instructions; all that differs between calls from different objects is the state of the underlying object (if the Something class had any state to begin with, that is).
What this means is that instance methods on class objects are really behaviorally the same as static methods.
You might think of it as if all instance-level methods were secretly translated as follows (I'm not saying this is strictly true, just that you could think of it this way and it does kind of make sense):
// appears to be instance-specific, so you might think
// it would be stored for every instance
public void Do() {
Do(this);
}
// is clearly static, so it is much clearer it only needs
// to be stored in one place
private static void Do(Something instance) {
// do whatever Do does
}
Interesting side note: the above hypothetical "translation" explains pretty much exactly how extension methods work: they are static methods, but by qualifying their first parameter with the this keyword, they suddenly look like instance methods.
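As a brief illustration of that side note, here is a sketch of an extension method. The Describe method and the class names are hypothetical; the point is only that the instance-style call and the explicit static call run the same static method.

```csharp
using System;

class Something { }

static class SomethingExtensions
{
    // 'this' on the first parameter lets callers use instance-call syntax.
    public static string Describe(this Something instance)
    {
        return "a " + instance.GetType().Name;
    }
}

static class ExtensionDemo
{
    // Both call forms bind to the same static method.
    public static bool BothFormsAgree()
    {
        var thing = new Something();
        return thing.Describe() == SomethingExtensions.Describe(thing);
    }

    static void Main()
    {
        Console.WriteLine(new Something().Describe()); // a Something
        Console.WriteLine(BothFormsAgree());           // True
    }
}
```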
There are most definitely two different objects in memory. Each object will consume 8 bytes on the heap (at least on 32-bit systems); 4 for the syncblock and 4 for the type handle (which includes the method table). Other than the system-defined state data there is no other user-defined state data in your case.
There is a single instance of the code for the Something.Do method. The type handle pointer that each object holds is how the CLR locates the different methods for the class. So even though there are two different objects in memory they both execute the same code. Since Something.Do was declared as an instance method it will have a this pointer passed to it internally so that the code can modify the correct instance members depending on which object was invoking the method. In your case the Something class has no instance members (and thus no user-defined state) and so this is quite irrelevant, but still happens nevertheless.
No they are not the same. They are two separate instances of the class Something. They happen to be identically instantiated, that is all.
You would create 2 "empty" objects, there would be a small allocation on the heap for each object.
But the "Do" method is always in the same place, that has nothing to do with the absence of state. Code is not stored 'in' a class/object. There is only 1 piece of code corresponding to Do() and it has a 'hidden' parameter this that points to the instance of Something it was called on.
Conceptually, Thing1 and Thing2 are different objects, but there is only one Something.Do procedure.
The .Net runtime allocates a little bit of memory to each of the objects you create - one chunk to Thing1 and another to Thing2. The purpose of this chunk of memory is to store (1) the state of the object and (2) the address of any procedures that belong to the object. I know you don't have any state, but the runtime doesn't care - it still keeps two separate references to two separate chunks of memory.
Now, your "Do" method is the same for both Thing1 and Thing2, so the runtime only keeps one version of the procedure in memory.
The memory allocated to Thing1 includes the address of the Do method. When you invoke the Do method on Thing1, the runtime looks up the address of the Do method for Thing1 and runs the method. The same thing happens with the other object, Thing2. Although the objects are different, the same Do method is called for both Thing1 and Thing2.
What this boils down to is that Thing1 and Thing2 are different, in that the names "Thing1" and "Thing2" refer to different areas of memory. The contents of this memory are the same in both cases - a single address that points to the "Do" method.
Well, that's the theory, anyway. Under the hood, there might be some kind of optimisation going on (See http://www.wrox.com/WileyCDA/Section/CLR-Method-Call-Internals.id-291453.html if you're interested), but for most practical purposes, what I have said is the way things work.
Thing1 != Thing2
These are two different objects in memory.
The Do method code is in the same place for both objects. There is no need to store two different copies of the method.
Each reference type (Thing1, Thing2) is pointing to a different physical address in main memory, as they have been instantiated separately. The thing pointed to in memory is the bytes used by the object, whether it has a state or not (it always has a state, but whether it has a declared/initialised state).
If you assigned a reference type to another reference type (Thing2 = Thing1;) then it would be the same portion of memory used by two different reference types, and no new instantiation would take place.
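A short sketch of the two cases, using object.ReferenceEquals to compare identities (the Something class here is an empty stand-in, and ReferenceAssignmentDemo is a hypothetical name):

```csharp
using System;

class Something { }

static class ReferenceAssignmentDemo
{
    // Two 'new' expressions produce two distinct objects.
    public static bool SeparateInstancesAreSame()
    {
        var thing1 = new Something();
        var thing2 = new Something();
        return object.ReferenceEquals(thing1, thing2);
    }

    // Assignment copies the reference; no new object is created.
    public static bool AssignedReferenceIsSame()
    {
        var thing1 = new Something();
        var thing2 = thing1;
        return object.ReferenceEquals(thing1, thing2);
    }

    static void Main()
    {
        Console.WriteLine(SeparateInstancesAreSame()); // False
        Console.WriteLine(AssignedReferenceIsSame());  // True
    }
}
```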
A good way to think of new Something() is that you are really just calling a method inside your class whose sole responsibility is to produce a new instance of an object that is cookie-cut from your class.
So now you can have multiple instances of the same class running around at runtime handling all sorts of situations :D
As far as the CLR is concerned, you are in fact getting 2 separate instances in memory, each with its own pointer to it. It is very similar to any other OOP language, but we do not have to interact with the pointers directly; in code they are used just like non-reference types, so we don't have to worry about them!
(there are pointers in C# if you wish to whip out your [unsafe] keyword!)

Why do we need struct? (C#)

To use a struct, we need to instantiate the struct and use it just like a class. Then why don't we just create a class in the first place?
A struct is a value type so if you create a copy, it will actually physically copy the data, whereas with a class it will only copy the reference to the data
A major difference between the semantics of class and struct is that structs have value semantics. What this means is that if you have two variables of the same type, they each have their own copy of the data. Thus if a variable of a given value type is set equal to another (of the same type), operations on one will not affect the other (that is, assignment of value types creates a copy). This is in sharp contrast to reference types.
There are other differences:
Value types are implicitly sealed (it is not possible to derive from a value type).
Value types cannot be null.
Value types are given a default constructor that initializes the value type to its default value.
A variable of a value type is always a value of that type. Contrast this with classes, where a variable of type A could refer to an instance of type B if B derives from A.
Because of the difference in semantics, it is inappropriate to refer to structs as "lightweight classes."
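The value-versus-reference semantics described in this answer can be seen in a few lines. PointStruct, PointClass and ValueSemanticsDemo are hypothetical names used only for this sketch:

```csharp
using System;

struct PointStruct { public int X; }
class PointClass { public int X; }

static class ValueSemanticsDemo
{
    // Assigning a struct copies the data; the original is unaffected.
    public static int CopyStruct()
    {
        var a = new PointStruct { X = 1 };
        var b = a;
        b.X = 2;
        return a.X;
    }

    // Assigning a class instance copies the reference; both names see the change.
    public static int CopyClass()
    {
        var a = new PointClass { X = 1 };
        var b = a;
        b.X = 2;
        return a.X;
    }

    static void Main()
    {
        Console.WriteLine(CopyStruct());              // 1
        Console.WriteLine(CopyClass());               // 2
        Console.WriteLine(default(PointStruct).X);    // 0 (default value, never null)
    }
}
```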
All of the reasons I see in other answers are interesting and can be useful, but if you want to read about why they are required (at least by the VM) and why it was a mistake for the JVM to not support them (user-defined value types), read Demystifying Magic: High-level Low-level Programming. As it stands, C# shines in talking about the potential to bring safe, managed code to systems programming. This is also one of the reasons I think the CLI is a superior platform [than the JVM] for mobile computing. A few other reasons are listed in the linked paper.
It's important to note that you'll very rarely, if ever, see an observable performance improvement from using a struct. The garbage collector is extremely fast, and in many cases will actually outperform the structs. When you add in the nuances of them, they're certainly not a first-choice tool. However, when you do need them and have profiler results or system-level constructs to prove it, they get the job done.
Edit: If you wanted an answer of why we need them as opposed to what they do, ^^^
In C#, a struct is a value type, unlike classes which are reference types. This leads to a huge difference in how they are handled, or how they are expected to be used.
You should probably read up on structs from a book. Structs in C# aren't close cousins of classes the way they are in C++.
It is a myth that structs are always created on the stack.
Ok it is right that struct is value type and class is reference type. But remember that
1. A Reference Type always goes on the Heap.
2. Value Types go where they were declared.
I will explain what that second line means with the examples below.
Consider the following method
public void DoCalulation()
{
int num;
num=2;
}
Here num is a local variable so it will be created on stack.
Now consider the below example
public class TestClass
{
public int num;
}
public void DoCalulation()
{
TestClass myTestClass = new TestClass ();
myTestClass.num=2;
}
This time num is created on the heap, because it is a field of a reference-type object. And yes, in some cases value types perform better than reference types, as they don't require garbage collection.
Also remember:
The value of a value type is always a value of that type.
The value of a reference type is always a reference.
You should also consider that if you expect a lot of instantiation, a class means more heap space to deal with and more work for the garbage collector. In that case you can choose structs.
Structs have many different semantics to classes. The differences are many but the primary reasons for their existence are:
They can be explicitly laid out in memory
this allows certain interop scenarios
They may be allocated on the stack
Making some sorts of high performance code possible in a much simpler fashion
the difference is that a struct is a value-type
I've found them useful in 2 situations
1) Interop - you can specify the memory layout of a struct, so you can guarantee that it matches what the unmanaged code expects when you invoke an unmanaged call.
2) Performance - in some (very limited) cases, structs can be faster than classes. In general, this requires structs to be small (I've heard 16 bytes or less), and not be changed often.
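For the interop point, explicit layout is requested with the StructLayout and FieldOffset attributes from System.Runtime.InteropServices. The sketch below is only an illustration with a hypothetical IntOrFloat type: it overlays an int and a float at the same offset, much like a C union.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical struct: both fields start at byte 0, so they share storage.
[StructLayout(LayoutKind.Explicit)]
struct IntOrFloat
{
    [FieldOffset(0)] public int AsInt;
    [FieldOffset(0)] public float AsFloat;
}

static class LayoutDemo
{
    // Reads the raw IEEE 754 bit pattern of 1.0f through the overlapping int field.
    public static int BitsOfOne()
    {
        var u = new IntOrFloat();
        u.AsFloat = 1.0f;
        return u.AsInt;
    }

    // Marshaled size of the struct: two overlapping 4-byte fields.
    public static int SizeInBytes() => Marshal.SizeOf(typeof(IntOrFloat));

    static void Main()
    {
        Console.WriteLine(BitsOfOne().ToString("X8")); // 3F800000
        Console.WriteLine(SizeInBytes());              // 4
    }
}
```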
One of the main reasons is that, when used as local variables during a method call, structs are allocated on the stack.
Stack allocation is cheap, but the big difference is that de-allocation is also very cheap. In this situation, the garbage collector doesn't have to track structs -- they're removed when returning from the method that allocated them when the stack frame is popped.
edit - clarified my post re: Jon Skeet's comment.
A struct is a value type (like Int32), whereas a class is a reference type. Structs get created on the stack rather than the heap. Also, when a struct is passed to a method, a copy of the struct is passed, but when a class instance is passed, a reference is passed.
If you need to create your own datatype, say, then a struct is often a better choice than a class as you can use it just like the built-in value types in the .NET framework. There are some good struct examples you can read here.

What are the deficiencies of the Java/C# type system?

I often hear that Haskell (which I don't know) has a very interesting type system. I'm very familiar with Java and a little with C#, and sometimes I find myself fighting the type system so that some design fits or works better in a certain way.
That led me to wonder...
What are the problems that occur somehow because of deficiencies of Java/C# type system?
How do you deal with them?
Arrays are broken.
Object[] foo = new String[1];
foo[0] = new Integer(4);
Gives you java.lang.ArrayStoreException
You deal with them with caution.
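C# arrays of reference types are covariant in the same way, so the equivalent code compiles there too and fails at run time, with ArrayTypeMismatchException instead of ArrayStoreException. A minimal sketch (ArrayCovarianceDemo is a hypothetical name):

```csharp
using System;

static class ArrayCovarianceDemo
{
    // The assignment compiles because string[] converts to object[],
    // but the runtime checks every store against the real element type.
    public static bool CovariantStoreThrows()
    {
        object[] foo = new string[1];
        try
        {
            foo[0] = 4;   // a boxed int is not a string
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    static void Main()
    {
        Console.WriteLine(CovariantStoreThrows()); // True
    }
}
```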
Nullability is another big issue. NullPointerExceptions jump at your face everywhere. You really can't do anything about them except switch language, or use conventions of avoiding them as much as possible (initialize fields properly, etc).
More generally, the Java and C# type systems are not very expressive. The most important thing Haskell can give you is that with its types you can enforce that functions don't have side effects. Having a compile-time proof that parts of programs are just expressions that are evaluated makes programs much more reliable, composable, and easier to reason about. (Ignore the fact that implementations of Haskell give you ways to bypass that.)
Compare that to Java, where calling a method can do almost anything!
Haskell also has pattern matching, which gives you a different way of writing programs: you have data on which functions operate, often recursively. In pattern matching you deconstruct data to see what kind it is and behave accordingly. For example, a list is either empty or a head and a tail. To calculate its length, you define a function that says: if the list is empty, length = 0; otherwise length = 1 + length(tail).
If you'd like to learn more, there are two excellent online sources:
Learn You a Haskell and Real World Haskell
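For comparison, here is a rough Java emulation of that length function, using instanceof checks where Haskell would pattern match. The IntList, Empty, and Cons types are made up for this sketch; newer Java releases add sealed types and pattern matching for switch, which bring this closer to the Haskell form:

```java
final class ListLengthDemo {

    // A list is either Empty, or a Cons cell holding a head and a tail.
    interface IntList {}

    static final class Empty implements IntList {}

    static final class Cons implements IntList {
        final int head;
        final IntList tail;

        Cons(int head, IntList tail) {
            this.head = head;
            this.tail = tail;
        }
    }

    // length(empty) = 0; length(head, tail) = 1 + length(tail)
    static int length(IntList list) {
        if (list instanceof Empty) {
            return 0;
        }
        Cons cons = (Cons) list; // the only other case in this sketch
        return 1 + length(cons.tail);
    }

    public static void main(String[] args) {
        IntList xs = new Cons(1, new Cons(2, new Cons(3, new Empty())));
        System.out.println(length(xs)); // prints 3
    }
}
```

Unlike in Haskell, nothing here stops a third implementation of IntList from being added, in which case the cast would fail at run time.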
I dislike the fact that there is a differentiation between primitive (native) types (int, boolean, double) and their corresponding class-wrappers (Integer, Boolean, Double) in Java.
This is often quite annoying, especially when writing generic code. Native types can't be used as generic type arguments; you must use a wrapper instead. Generics should make your code more abstract and more easily reusable, but in Java they bring restrictions for no obvious reason.
private static <T> T First(T arg[]) {
    return arg[0];
}
public static void main(String[] args) {
    int x[] = {1, 2, 3};
    Integer y[] = {3, 4, 5};
    First(x); // Wrong
    First(y); // Fine
}
In .NET there are no such problems, even though it also has separate value and reference types, because it strictly realizes "everything is an object".
This question about generics shows the limits of the Java type system's expressiveness:
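In Java, the usual workarounds are a separate overload for each primitive array type, or explicit boxing. A minimal sketch of both follows; FirstElementDemo and its first methods are illustrative names, not part of the original example:

```java
import java.util.Arrays;

final class FirstElementDemo {

    // Generic version: works for any array of reference types.
    static <T> T first(T[] items) {
        return items[0];
    }

    // int[] is not a T[] for any T, so primitives need their own overload.
    static int first(int[] items) {
        return items[0];
    }

    public static void main(String[] args) {
        int[] x = {1, 2, 3};
        Integer[] y = {3, 4, 5};

        Integer fromWrapper = first(y); // resolves to the generic method
        int fromPrimitive = first(x);   // resolves to the int[] overload

        // Alternatively, box the primitive array and reuse the generic method.
        Integer[] boxed = Arrays.stream(x).boxed().toArray(Integer[]::new);
        Integer fromBoxed = first(boxed);

        System.out.println(fromWrapper + " " + fromPrimitive + " " + fromBoxed);
    }
}
```

Either route costs something: the overloads duplicate code for each primitive type, and boxing allocates a new array.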
Higher-kinded generics in Java
I don't like the fact that classes are not first-class objects, and you can't do fancy things such as having a static method be part of an interface.
A fundamental weakness in the Java/.NET type system is that it has no declarative means of specifying how an object's state relates to the contents of its reference-type fields, nor of specifying whether a method is allowed to persist its reference-type parameters. Although in some sense it's nice for the runtime to be able to use a field Foo of a single type such as ICollection<integer> to mean many different things, the type system cannot provide real support for things like immutability, equivalence testing, cloning, or other such features without knowing whether Foo represents:
1. A read-only reference to a collection which nothing will ever mutate; the class may freely share such reference with outside code, without affecting its semantics. The reference encapsulates only immutable state, and likely does not encapsulate identity.
2. A writable reference to a collection whose type is mutable, but which nothing will ever actually mutate; the class may only share such references with code that can be trusted not to mutate it. As above, the reference encapsulates only immutable state, and likely does not encapsulate identity.
3. The only reference anywhere in the universe to a collection which it mutates. The reference would encapsulate mutable state, but would not encapsulate identity (replacing the collection with another holding the same items would not change the state of the enclosing object).
4. A reference to a collection which it mutates, and whose contents it considers to be its own, but to which outside code holds references which it expects to be attached to Foo's current state. The reference would encapsulate both identity and mutable state.
5. A reference to a mutable collection owned by some other object, which it expects to be attached to that other object's state (e.g. if the object holding Foo is supposed to display the contents of some other collection). That reference would encapsulate identity, but would not encapsulate mutable state.
Suppose one wants to copy the state of the object that contains Foo to a new, detached object. If Foo represents #1 or #2, one may store in the new object either a copy of the reference in Foo, or a reference to a new object holding the same data; copying the reference would be faster, but both operations would be correct. If Foo represents #3, a correct detached copy must hold a reference to a new detached object whose state is copied from the original. If Foo represents #5, a correct detached copy must hold a copy of the original reference; it must NOT hold a reference to a new detached object. And if Foo represents #4, the state of the object containing it cannot be copied in isolation; it might be possible to copy a bunch of interconnected objects to yield a new bunch whose state is equivalent to the original, but it would not be possible to copy the state of objects individually.
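To make the contrast between cases #3 and #5 concrete, here is a small Java sketch. Owner, Viewer, and detachedCopy are hypothetical names; note that both classes declare a field of the same type, List<Integer>, and nothing in the type system says which copying strategy is the correct one:

```java
import java.util.ArrayList;
import java.util.List;

final class CopySemanticsDemo {

    // Case #3: the list is exclusively owned mutable state.
    static final class Owner {
        private final List<Integer> items = new ArrayList<>();

        void add(int value) { items.add(value); }
        int size() { return items.size(); }

        // A correct copy needs its own list holding the same items.
        Owner detachedCopy() {
            Owner copy = new Owner();
            copy.items.addAll(items);
            return copy;
        }
    }

    // Case #5: the list belongs to someone else; this object only watches it.
    static final class Viewer {
        private final List<Integer> watched;

        Viewer(List<Integer> watched) { this.watched = watched; }
        int size() { return watched.size(); }

        // A correct copy must keep watching the very same list.
        Viewer detachedCopy() {
            return new Viewer(watched);
        }
    }

    public static void main(String[] args) {
        Owner owner = new Owner();
        owner.add(1);
        Owner ownerCopy = owner.detachedCopy();
        owner.add(2); // must not leak into the copy
        System.out.println(owner.size() + " " + ownerCopy.size()); // prints 2 1

        List<Integer> shared = new ArrayList<>();
        Viewer viewer = new Viewer(shared);
        Viewer viewerCopy = viewer.detachedCopy();
        shared.add(42); // must be visible through both viewers
        System.out.println(viewer.size() + " " + viewerCopy.size()); // prints 1 1
    }
}
```

Swapping the two strategies would still compile, but it would silently produce an Owner copy that shares state with the original, or a Viewer copy that no longer follows the watched list.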
While it won't be possible for a type system to specify declaratively all of the possible relationships that can exist among objects and what should be done about them, it should be possible for a type system and framework to correctly generate code to produce semantically-correct equivalence tests, cloning methods, smoothly inter-operable mutable, immutable, and "readable" types, etc. in most cases, if it knew which fields encapsulate identity, mutable state, both, or neither. Additionally, it should be possible for a framework to minimize defensive copying and wrapping in circumstances where it could ensure that the passed references would not be given to anything that would mutate them.
(Re: C# specifically.)
I would love tagged unions.
Ditto on first-class objects for classes, methods, properties, etc.
Although I've never used them, Python has metaclasses, which are basically the types that represent classes and how they behave.
Non-nullable reference types, so null checks are not needed. They were originally considered for C# but were discarded. (There is a Stack Overflow question on this.)
Covariance so I can cast a List<string> to a List<object>.
This is minor, but in the current versions of Java and C#, declaring objects breaks the DRY principle:
Object foo = new Object();
Int x = new Int();
Neither has metaprogramming facilities like, say, that darn old C++ dog has.
Duplicating "using" aliases for lack of a typedef is one example that violates DRY and can even cause user-induced 'aliasing' errors and more. Java 'templates' aren't even worth mentioning.
