In C#, what is the key difference (in terms of features or use cases) between these two containers? There doesn't appear to be any information comparing these on Google.
System.Collections.ObjectModel.ReadOnlyDictionary
System.Collections.Immutable.ImmutableDictionary
I understand that an ImmutableDictionary is thread-safe. Is the same true of a ReadOnlyDictionary?
This is not a duplicate of How to properly use IReadOnlyDictionary?. That question is about how to use IReadOnlyDictionary. This question is about the difference between the two (which, as someone commented on that thread back in 2015, would be a different question - ie. this one)
A ReadOnlyDictionary can be initialized once via its constructor; after that you can't add or remove items through it (the mutating IDictionary members throw NotSupportedException). It's useful if you want to ensure that it won't be modified while it's sent across multiple layers of your application.
An ImmutableDictionary has methods to modify it like Add or Remove, but they create and return a new dictionary; the original one remains unchanged.
Note that:
You initialize the ReadOnlyDictionary by passing another dictionary instance to the constructor. That explains why a ReadOnlyDictionary is mutable (if the underlying dictionary is modified). It's just a wrapper that is protected from direct changes.
You can't use a constructor for ImmutableDictionary: How can I create a new instance of ImmutableDictionary?
That also explains why the ReadOnlyDictionary is not thread-safe (better: it's as thread-safe as the underlying dictionary). The ImmutableDictionary is thread-safe because you can't modify the original instance (neither directly nor indirectly). All methods that "modify" it actually return a new instance.
But if you need a thread-safe dictionary and it's not necessary that it's immutable, use a ConcurrentDictionary instead.
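To make the wrapper-versus-immutable distinction concrete, here is a minimal sketch. The variable names are just illustrative, and it assumes the System.Collections.Immutable package is referenced and that top-level statements (C# 9 or later) are available:

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Collections.ObjectModel;

// A ReadOnlyDictionary is a live view over the dictionary it wraps.
var backing = new Dictionary<string, int> { ["a"] = 1 };
var readOnly = new ReadOnlyDictionary<string, int>(backing);

backing["b"] = 2;                    // change made through the underlying dictionary
Console.WriteLine(readOnly.Count);   // 2: the wrapper reflects the change

// An ImmutableDictionary never changes; "modifying" it yields a new instance.
var original = ImmutableDictionary<string, int>.Empty.Add("a", 1);
var extended = original.Add("b", 2);

Console.WriteLine(original.Count);   // 1: the original is untouched
Console.WriteLine(extended.Count);   // 2
```

Anyone holding only readOnly cannot add the "b" entry themselves, but they will still see it appear.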
A ReadOnlyDictionary<TKey,TValue> is a wrapper around another existing IDictionary<TKey,TValue> implementing object.
Importantly, whilst "you" (the code with access to the ReadOnlyDictionary) cannot make any changes to the dictionary via the wrapper, this does not mean that other code is unable to modify the underlying dictionary.
So unlike what other answers may suggest, you cannot assume that the ReadOnlyDictionary isn't subject to modification - just that "you" aren't allowed to. So for example, you cannot be sure that two attempts to access a particular key will produce the same result.
In addition to the current answers, I would add that ImmutableDictionary is slower and usually will use more memory.
Why slower? Behind the scenes, the ImmutableDictionary isn't a hash table. It uses an AVL tree, which is a self-balancing tree, and therefore its access complexity is O(log n). On the other hand, the other dictionaries use a hash table behind the scenes, and the access complexity for them is O(1).
Why more memory allocation? Every time the dictionary is changed, a new dictionary is created, because it is immutable.
Instead of describing what these two classes do, it would be better to describe what it actually means for something to be read-only or immutable, as there is a key distinction that leaves those two implementations little choice in how they behave.
Read-only is part of an "interface" of a class, its set of public methods and properties. Being read-only means that there is no possible sequence of actions an external consumer of the class could do in order to affect its visible state. Compare with a read-only file for example; no application can write to such a file using the same API that made it possible to make it read-only in the first place.
Does read-only imply thread-safe? Not necessarily – a read-only class could still employ things like caching or optimization of its internal data structures and those may be (poorly) implemented in a way that breaks when invoked concurrently.
Does read-only imply never-changing? Also no; look at the system clock, for example. You cannot really affect it (with the default permissions), you can only read it (making it read-only by definition), but its value changes based on the time.
Never-changing means immutable. It is a much stronger concept, and, like thread-safety, is a part of the contract of the whole class. The class must actively ensure that no part of its instance ever changes during its lifetime, with respect to what can be observed externally.
Strings are immutable in .NET: as long as the integrity of the runtime is not compromised (by memory hacking), a particular instance of a string will never be different from its initially observed value. Read-only files, on the other hand, are not really immutable, as one could always turn read-only off and change the file.
Immutable also does not imply thread-safe, as such an object could still employ techniques that modify its internal state and are not thread-safe (but it's generally easier to ensure).
The question whether immutable implies read-only depends on how you look at it. You can usually "mutate" an immutable object in a way that doesn't affect external code that may be using it, thus exposing an immutable object is at least as strong as exposing a read-only one. Taking a substring of a string is like deleting a part of it, but in a safe manner.
This brings us back to the original question about the two classes. All ReadOnlyDictionary has to do is to be read-only. You still have to provide the data in some way, with an internally wrapped dictionary, and you and only you can still write to it through the internal dictionary. The wrapper provides "strong" read-only access (compared to a "weak" read-only access that you get just by casting to IReadOnlyDictionary). It is also thread-safe, but only when the underlying dictionary is thread-safe as well.
ImmutableDictionary can do much more with the strong guarantee that the data it holds cannot be changed. Essentially you can "patch" parts of it with new data and obtain a modified "copy" of the structure but without actually copying the complete object. It is also thread-safe by the virtue of its implementation. Similarly to a StringBuilder, you use a builder to make changes to an instance and then bake them to make the final instance of an immutable dictionary.
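The builder-then-bake pattern and the "patching" mentioned above could look roughly like this. It's only a sketch with made-up keys and values, again assuming the System.Collections.Immutable package:

```csharp
using System;
using System.Collections.Immutable;

// The builder is mutable, much like a StringBuilder.
var builder = ImmutableDictionary.CreateBuilder<string, int>();
builder.Add("one", 1);
builder.Add("two", 2);
builder["two"] = 22;                        // overwriting is fine while building

// "Bake" the builder into an immutable instance.
ImmutableDictionary<string, int> frozen = builder.ToImmutable();

// "Patching" returns a modified copy and leaves the source alone.
ImmutableDictionary<string, int> patched = frozen.SetItem("one", 100);

Console.WriteLine(frozen["one"]);    // 1
Console.WriteLine(patched["one"]);   // 100
Console.WriteLine(patched["two"]);   // 22
```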
ReadOnlyDictionary: is read-only; you cannot add or remove items.
ImmutableDictionary: you can add or remove items, but it is immutable like string, so each addition or removal produces a new object.
From this Answer, I came to know that KeyValuePair instances are immutable.
I browsed through the docs, but could not find any information regarding immutable behavior.
I was wondering how to determine if a type is immutable or not?
I don't think there's a standard way to do this, since there is no official concept of immutability in C#. The only way I can think of is looking at certain things, indicating a higher probability:
1) All properties of the type have a private set
2) All fields are const/readonly or private
3) There are no methods with obvious/known side effects
4) Also, being a struct generally is a good indication (if it is BCL type or by someone with guidelines for this)
Something like an ImmutableAttribute would be nice. There are some thoughts here (somewhere down in the comments), but I haven't seen one in "real life" yet.
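If you want to automate the first two indicators, a rough reflection-based check might look like the sketch below. The name looksImmutable is made up for illustration, and a True result is only a hint, not proof: it ignores inherited private fields, treats init-only setters as setters, and says nothing about whether the field types themselves are immutable.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Heuristic only: true when every instance field declared on the type is
// readonly and no public instance property has a public setter.
Func<Type, bool> looksImmutable = type =>
    type.GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
        .All(field => field.IsInitOnly)
    && type.GetProperties(BindingFlags.Instance | BindingFlags.Public)
        .All(property => property.GetSetMethod() == null);

// ValueTuple exposes public, writable fields, so the check rejects it.
Console.WriteLine(looksImmutable(typeof(ValueTuple<int, int>)));   // False
```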
The first indication would be that the documentation for the property in the overview says "Gets the key in the key/value pair."
The second more definite indication would be in the description of the property itself:
"This property is read/only."
I don't think you can find "proof" of immutability by just looking at the docs, but there are several strong indicators:
It's a struct (why does this matter?)
It has no settable public properties (both are read-only)
It has no obvious mutator methods
For definitive proof I recommend downloading the BCL's reference source from Microsoft or using an IL decompiler to show you how a type would look like in code.
A KeyValuePair<T1,T2> is a struct which, absent Reflection, can only be mutated outside its constructor by copying the contents of another KeyValuePair<T1,T2> which holds the desired values. Note that the statement:
MyKeyValuePair = new KeyValuePair<int,int>(1,2);
like all similar constructor invocations on structures, actually works by creating a new temporary instance of KeyValuePair<int,int> (happens before the constructor itself executes), setting the field values of that instance (done by the constructor), copying all public and private fields of that new temporary instance to MyKeyValuePair, and then discarding the temporary instance.
Consider the following code:
static KeyValuePair<int,int> MyKeyValuePair; // Field in some class
// Thread1
MyKeyValuePair = new KeyValuePair<int,int>(1,1);
// ***
MyKeyValuePair = new KeyValuePair<int,int>(2,2);
// Thread2
string st = MyKeyValuePair.ToString();
Because MyKeyValuePair is precisely eight bytes in length (two 4-byte int fields), the second statement in Thread1 will update both fields simultaneously. Despite that, if the second statement in Thread1 executes between Thread2's evaluation of MyKeyValuePair.Key.ToString() and MyKeyValuePair.Value.ToString(), the second ToString() will act upon the new mutated value of the structure, even though the first already-completed ToString() operated upon the value before the mutation.
All non-trivial structs, regardless of how they are declared, have the same immutability rules for their fields: code which can change a struct can change its fields; code which cannot change a struct cannot change its fields. Some structs may force one to go through hoops to change one of their fields, but designing struct types to be "immutable" is neither necessary nor sufficient to ensure the immutability of instances. There are a few reasonable uses of "immutable" struct types, but such use cases if anything require more care than is necessary for structs with exposed public fields.
StringBuilder exists purely for the reason that strings in .NET are immutable, that is that traditional string concatenation can use lots of resources (due to lots of String objects being created).
So, since an Int32 is also immutable why don't classes exist for multiple addition for example?
There is. There's UriBuilder for building Uri objects.
What would an Int32Builder do? What meaningful operation on a single integer is going to be more convenient and/or more performant through use of such a class?
For an XXXBuilder class to make sense, the following have to hold:
The class or struct is immutable.
Changing the value by replacing it with one based on the previous (e.g. someString += "abc" or someDate = someDate.AddDays(1)) has to be relatively expensive (true in the former example more than the latter) and/or relatively convoluted to code.
The requirement for such a XXXBuilder class is common enough that it makes sense to provide it rather than just letting those who do need it code their own.
None of the above applies to int. They do apply to string and Uri. I don't think reference vs value type is particularly relevant except that cases where point 2 fits are also going to be cases where a class is almost certainly a better design choice than a value type.
Indeed, the combination of point 1 and point 2 is relatively uncommon in .NET. Some would argue less common than it should be (those who favour heavy use of immutable types). And if we can avoid point 2, then we would, wouldn't we? Nobody will think "I'll code this to be expensive and clumsy and provide a builder class". Rather they may on occasion think "The downside to my well thought-out immutability is that while it gives me many advantages it makes some operations expensive and clumsy, so I'll provide a builder class as well".
A concatenated string gets longer, which requires heap memory allocations and memory copies.
These get more expensive the longer the string gets, ergo we have a helper class (i.e. StringBuilder) to minimise the amount of copying that goes on when strings are concatenated.
Ints aren't concatenated. When you multiply ints you don't need more memory to hold the result; you just need another int (or the same int if it's *=).
You'd only need a helper class if you need to concatenate ints into some form of list . . . oh wait, List<int>!
Int32 is a value type.
String is a reference type. StringBuilder exists because String is an immutable reference type. String is also a collection of Char - so many allocations happen when you concatenate strings - StringBuilder makes these allocations beforehand, making creation of concatenated strings much more efficient. This is not an issue with value types.
Because Int32 is a value type, usually allocated on the stack (or within the body of a heap object). The compiler will automatically reuse the memory location when adding many value types in a loop for example.
The answer is basically "because of an implementation detail which means it is unnecessary".
The fact that string concatenation is slow, leading to a requirement for StringBuilder, is itself an implementation detail.
Value types can have their lifetime tracked because they have value type semantics. Whether this occurs is an implementation detail. In practice it does, and that is the reason why there is no need for an IntBuilder class.
As we all know, String is immutable. What are the reasons for String being immutable and the introduction of StringBuilder class as mutable?
Instances of immutable types are inherently thread-safe: since no thread can modify them, the risk of one thread modifying an instance in a way that interferes with another is removed (the reference itself is a different matter).
Similarly, the fact that aliasing can't produce changes (if x and y both refer to the same object a change to x entails a change to y) allows for considerable compiler optimisations.
Memory-saving optimisations are also possible. Interning and atomising being the most obvious examples, though we can do other versions of the same principle. I once produced a memory saving of about half a GB by comparing immutable objects and replacing references to duplicates so that they all pointed to the same instance (time-consuming, but a minute's extra start-up to save a massive amount of memory was a performance win in the case in question). With mutable objects that can't be done.
No side-effects can come from passing an immutable type as a parameter to a method unless it is out or ref (since that changes the reference, not the object). A programmer therefore knows that if string x = "abc" at the start of a method, and that doesn't change in the body of the method, then x == "abc" at the end of the method.
Conceptually, the semantics are more like value types; in particular equality is based on state rather than identity. This means that "abc" == "ab" + "c". While this doesn't require immutability, the fact that a reference to such a string will always equal "abc" throughout its lifetime (which does require immutability) makes uses as keys where maintaining equality to previous values is vital, much easier to ensure correctness of (strings are indeed commonly used as keys).
Conceptually, it can make more sense to be immutable. If we add a month onto Christmas, we haven't changed Christmas, we have produced a new date in late January. It makes sense therefore that Christmas.AddMonths(1) produces a new DateTime rather than changing a mutable one. (Another example: if I, as a mutable object, change my name, what has changed is which name I am using; "Jon" remains immutable and other Jons will be unaffected.)
Copying is fast and simple, to create a clone just return this. Since the copy can't be changed anyway, pretending something is its own copy is safe.
[Edit, I'd forgotten this one]. Internal state can be safely shared between objects. For example, if you were implementing list which was backed by an array, a start index and a count, then the most expensive part of creating a sub-range would be copying the objects. However, if it was immutable then the sub-range object could reference the same array, with only the start index and count having to change, with a very considerable change to construction time.
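.NET strings offer a ready-made illustration of this kind of sharing: because the characters can never change, a slice can safely point into the original string instead of copying it. A small sketch (the sample text is arbitrary; AsMemory and MemoryMarshal.TryGetString require .NET Core 2.1 or later, or the System.Memory package):

```csharp
using System;
using System.Runtime.InteropServices;

string text = "immutable strings";

// Slice(start, length) creates a view over the same characters; nothing is copied.
ReadOnlyMemory<char> word = text.AsMemory().Slice(10, 7);
Console.WriteLine(word.ToString());                 // strings

// MemoryMarshal can reveal which string backs the view, and at which offset.
bool backedByString = MemoryMarshal.TryGetString(word, out var source, out int start, out int length);
Console.WriteLine(backedByString);                  // True
Console.WriteLine(ReferenceEquals(source, text));   // True: same instance, only start and length differ
```

Only word.ToString() allocates a new string; the slice itself is just a reference plus a start index and a length, which is exactly the cheap sub-range described above.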
In all, for objects which don't have undergoing change as part of their purpose, there can be many advantages in being immutable. The main disadvantage is in requiring extra constructions, though even here it's often overstated (remember, you have to do several appends before StringBuilder becomes more efficient than the equivalent series of concatenations, with their inherent construction).
It would be a disadvantage if mutability was part of the purpose of an object (who'd want to be modeled by an Employee object whose salary could never ever change) though sometimes even then it can be useful (in many web and other stateless applications, code doing read operations is separate from that doing updates, and using different objects may be natural - I wouldn't make an object immutable and then force that pattern, but if I already had that pattern I might make my "read" objects immutable for the performance and correctness-guarantee gain).
Copy-on-write is a middle ground. Here the "real" class holds a reference to a "state" class. State classes are shared on copy operations, but if you change the state, a new copy of the state class is created. This is more often used with C++ than C#, which is why its std::string enjoys some, but not all, of the advantages of immutable types, while remaining mutable.
Making strings immutable has many advantages. It provides automatic thread safety, and makes strings behave like an intrinsic type in a simple, effective manner. It also allows for extra efficiencies at runtime (such as allowing effective string interning to reduce resource usage), and has huge security advantages, since it's impossible for a third-party API call to change your strings.
StringBuilder was added in order to address the one major disadvantage of immutable strings - runtime construction of immutable types causes a lot of GC pressure and is inherently slow. By making an explicit, mutable class to handle this, this issue is addressed without adding unneeded complication to the string class.
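As a quick illustration of the difference (just a sketch with arbitrary sample data):

```csharp
using System;
using System.Text;

string[] parts = { "alpha", "beta", "gamma", "delta" };

// Each += builds a brand-new string and leaves the previous one for the GC.
string joined = "";
foreach (string part in parts)
{
    joined += part;
}

// StringBuilder appends into an internal mutable buffer and only
// produces the final immutable string in ToString().
var builder = new StringBuilder();
foreach (string part in parts)
{
    builder.Append(part);
}
string built = builder.ToString();

Console.WriteLine(joined == built);   // True: same text, fewer intermediate strings
```

For a handful of short pieces the difference is negligible; it only starts to matter when many appends are involved.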
Strings are not really immutable. They are just publicly immutable.
It means you cannot modify them from their public interface. But in the inside the are actually mutable.
If you don't believe me look at the String.Concat definition using reflector.
The last lines are...
int length = str0.Length;
string dest = FastAllocateString(length + str1.Length);
FillStringChecked(dest, 0, str0);
FillStringChecked(dest, length, str1);
return dest;
As you can see the FastAllocateString returns an empty but allocated string and then it is modified by FillStringChecked
Actually the FastAllocateString is an extern method and the FillStringChecked is unsafe so it uses pointers to copy the bytes.
Maybe there are better examples but this is the one I have found so far.
string management is an expensive process. keeping strings immutable allows repeated strings to be reused, rather than re-created.
Why are string types immutable in C#
String is a reference type, so it is never copied, but passed by reference.
Compare this to the C++ std::string object (which is not immutable), which is passed by value. This means that if you want to use a String as a key in a Hashtable, you're fine in C++, because C++ will copy the string to store the key in the hashtable (actually std::hash_map, but still) for later comparison. So even if you later modify the std::string instance, you're fine. But in .Net, when you use a String in a Hashtable, it will store a reference to that instance. Now assume for a moment that strings aren't immutable, and see what happens:
1. Somebody inserts a value x with key "hello" into a Hashtable.
2. The Hashtable computes the hash value for the String, and places a reference to the string and the value x in the appropriate bucket.
3. The user modifies the String instance to be "bye".
4. Now somebody wants the value in the hashtable associated with "hello". It ends up looking in the correct bucket, but when comparing the strings it says "bye"!="hello", so no value is returned.
5. Maybe somebody wants the value "bye"? "bye" probably has a different hash, so the hashtable would look in a different bucket. No "bye" keys in that bucket, so our entry still isn't found.
Making strings immutable means that step 3 is impossible. If somebody modifies the string he's creating a new string object, leaving the old one alone. Which means the key in the hashtable is still "hello", and thus still correct.
So, probably among other things, immutable strings are a way to enable strings that are passed by reference to be used as keys in a hashtable or similar dictionary object.
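You can reproduce the five steps today with any mutable key type. In the sketch below, MutableKey is a hypothetical stand-in for a mutable string, and Dictionary is used in place of Hashtable:

```csharp
using System;
using System.Collections.Generic;

var key = new MutableKey { Text = "hello" };
var table = new Dictionary<MutableKey, int> { [key] = 42 };   // steps 1 and 2

key.Text = "bye";                                             // step 3: mutate the stored key

// Step 4: the hash matches the stored entry, but the equality check fails.
Console.WriteLine(table.ContainsKey(new MutableKey { Text = "hello" }));   // False

// Step 5: "bye" hashes differently from the stored "hello" hash
// (barring an unlucky collision), so this lookup misses as well.
Console.WriteLine(table.ContainsKey(new MutableKey { Text = "bye" }));

Console.WriteLine(table.Count);   // 1: the entry is still there, just unreachable

// Hypothetical mutable key type, for illustration only.
sealed class MutableKey
{
    public string Text { get; set; } = "";
    public override bool Equals(object obj) => obj is MutableKey other && other.Text == Text;
    public override int GetHashCode() => Text.GetHashCode();
}
```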
Just to throw this in, an often forgotten view is of security, picture this scenario if strings were mutable:
string dir = @"C:\SomePlainFolder";
// Kick off another thread
GetDirectoryContents(dir);

// HasAccess and Contents are hypothetical helpers; the string[] return type is assumed
string[] GetDirectoryContents(string directory)
{
    if (HasAccess(directory))
    {
        // Here the other thread changed the string to "C:\AllYourPasswords\"
        return Contents(directory);
    }
    return null;
}
You see how it could be very, very bad if you were allowed to mutate strings once they were passed.
You never have to defensively copy immutable data. Despite the fact that you need to copy it to mutate it, often the ability to freely alias and never have to worry about unintended consequences of this aliasing can lead to better performance because of the lack of defensive copying.
Strings are passed as reference types in .NET.
A variable of a reference type holds a pointer to the actual instance, which resides on the managed heap. This is different from value types, whose variables hold the entire value directly (on the stack, in the case of local variables).
When a value type is passed as a parameter, the runtime creates a copy of the value on the stack and passes that value into a method. This is why integers must be passed with a 'ref' keyword to return an updated value.
When a reference type is passed, the runtime creates a copy of the pointer on the stack. That copied pointer still points to the original instance of the reference type.
C# does not allow the = operator to be overloaded, so assigning a string copies only the pointer. What makes string behave more like a value type is that it is immutable and overloads == to compare contents. If strings were mutable, a second string operation through a copied pointer could accidentally overwrite the value of a private member of another class, causing some pretty nasty results.
As other posts have mentioned, the StringBuilder class allows for the creation of strings without the GC overhead.
Strings and other concrete objects are typically expressed as immutable objects to improve readability and runtime efficiency. Security is another reason: a process can't change your string and inject code into it.
Imagine you pass a mutable string to a function but don't expect it to be changed. Then what if the function changes that string? In C++, for instance, you could simply do call-by-value (difference between std::string and std::string& parameter), but in C# it's all about references so if you passed mutable strings around every function could change it and trigger unexpected side effects.
This is just one of various reasons. Performance is another one (interned strings, for example).
There are five common ways by which a class can store data that cannot be modified outside the storing class' control:
1. As value-type primitives
2. By holding a freely-shareable reference to a class object whose properties of interest are all immutable
3. By holding a reference to a mutable class object that will never be exposed to anything that might mutate any properties of interest
4. As a struct, whether "mutable" or "immutable", all of whose fields are of types #1-#4 (not #5).
5. By holding the only extant copy of a reference to an object whose properties can only be mutated via that reference.
Because strings are of variable length, they cannot be value-type primitives, nor can their character data be stored in a struct. Among the remaining choices, the only one which wouldn't require that strings' character data be stored in some kind of immutable object would be #5. While it would be possible to design a framework around option #5, that choice would require that any code which wanted a copy of a string that couldn't be changed outside its control would have to make a private copy for itself. While it would hardly be impossible to do that, the amount of extra code required to do that, and the amount of extra run-time processing necessary to make defensive copies of everything, would far outweigh the slight benefits that could come from having string be mutable, especially given that there is a mutable string type (System.Text.StringBuilder) which accomplishes 99% of what could be accomplished with a mutable string.
Immutable Strings also prevent concurrency-related issues.
Imagine being an OS working with a string that some other thread was
modifying behind your back. How could you validate anything without
making a copy?
I am wondering how immutability is defined? If the values aren't exposed as public, so can't be modified, then it's enough?
Can the values be modified inside the type, not by the customer of the type?
Or can one only set them inside a constructor? If so, in the cases of double initialization (using the this keyword on structs, etc) is still ok for immutable types?
How can I guarantee that the type is 100% immutable?
If the values aren't exposed as public, so can't be modified, then it's enough?
No, because you need read access.
Can the values be modified inside the type, not by the customer of the type?
No, because that's still mutation.
Or can one only set them inside a constructor?
Ding ding ding! With the additional point that immutable types often have methods that construct and return new instances, and also often have extra constructors marked internal specifically for use by those methods.
How can I guarantee that the type is 100% immutable?
In .Net it's tricky to get a guarantee like this, because you can use reflection to modify (mutate) private members.
The previous posters have already stated that you should assign values to your fields in the constructor and then keep your hands off them. But that is sometimes easier said than done. Let's say that your immutable object exposes a property of the type List<string>. Is that list allowed to change? And if not, how will you control it?
Eric Lippert has written a series of posts in his blog about immutability in C# that you might find interesting: you find the first part here.
One thing that I think might be missed in all these answers is that I think that an object can be considered immutable even if its internal state changes - as long as those internal changes are not visible to the 'client' code.
For example, the System.String class is immutable, but I think it would be permitted to cache the hash code for an instance so the hash is only calculated on the first call to GetHashCode(). Note that as far as I know, the System.String class does not do this, but I think it could and still be considered immutable. Of course any of these changes would have to be handled in a thread-safe manner (in keeping with the non-observable aspect of the changes).
To be honest though, I can't think of many reasons one might want or need this type of 'invisible mutability'.
Here is the definition of immutability from Wikipedia (link)
"In object-oriented and functional programming, an immutable object is an object whose state cannot be modified after it is created."
Essentially, once the object is created, none of its properties can be changed. An example is the String class. Once a String object is created it cannot be changed. Any operation done to it actually creates a new String object.
Lots of questions there. I'll try to answer each of them individually:
"I am wondering how immutability is defined?" - Straight from the Wikipedia page (and a perfectly accurate/concise definition)
An immutable object is an object whose state cannot be modified after it is created
"If the values aren't exposed as public, so can't be modified, then it's enough?" - Not quite. It can't be modified in any way whatsoever, so you've got to ensure that methods/functions don't change the state of the object, and if performing operations, always return a new instance.
"Can the values be modified inside the type, not by the customer of the type?" - Technically, it can't be modified either inside or by a consumer of the type. In practice, types such as System.String (a reference type, for that matter) exist that can be considered immutable for almost all practical purposes, though not in theory.
"Or can one only set them inside a constructor?" - Yes, in theory that's the only place where state (variables) can be set.
"If so, in the cases of double initialization (using the this keyword on structs, etc) is still ok for immutable types?" - Yes, that's still perfectly fine, because it's all part of the initialisation (creation) process, and the instance isn't returned until it has finished.
"How can I guarantee that the type is 100% immutable?" - The following conditions should ensure that. (Someone please point out if I'm missing one.)
Don't expose any variables. They should all be kept private (not even protected is acceptable, since derived classes can then modify state).
Don't allow any instance methods to modify state (variables). This should only be done in the constructor, while methods should create new instances using a particular constructor if they require to return a "modified" object.
All members that are exposed (as read-only) or objects returned by methods must themselves be immutable.
Note: you can't ensure the immutability of derived types, since they can define new variables. This is a reason for marking any type you want to be sure is immutable as sealed, so that no derived class can be considered to be of your base immutable type anywhere in code.
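Putting those conditions together, a type following them might look like the sketch below. Money is a made-up example type, not anything from the framework:

```csharp
using System;

var start = new Money(10m, "EUR");
var more = start.Add(5m);           // returns a new instance

Console.WriteLine(start.Amount);    // 10: the original is unchanged
Console.WriteLine(more.Amount);     // 15

// Hypothetical example type: sealed, private readonly state set only in the
// constructor, and every exposed member is itself immutable (decimal, string).
public sealed class Money
{
    private readonly decimal amount;
    private readonly string currency;

    public Money(decimal amount, string currency)
    {
        this.amount = amount;
        this.currency = currency;
    }

    public decimal Amount => amount;
    public string Currency => currency;

    // "Modification" goes through the constructor and yields a new object.
    public Money Add(decimal delta) => new Money(amount + delta, currency);
}
```

Reflection can still get at the private fields, so this is a guarantee against ordinary code, not against someone determined to break it.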
Hope that helps.
I've learned that immutability is when you set everything in the constructor and cannot modify it later on during the lifetime of the object.
The definition of immutability can be located on Google .
Example:
immutable - literally, not able to change.
www.filosofia.net/materiales/rec/glosaen.htm
In terms of immutable data structures, the typical definition is write-once-read-many, in other words, as you say, once created, it cannot be changed.
There are some cases which are slightly in the gray area. For instance, .NET strings are considered immutable, because they can't change, however, StringBuilder internally modifies a String object.
An immutable is essentially a class that forces itself to be final from within its own code. Once it is there, nothing can be changed. In my knowledge, things are set in the constructor and then that's it. I don't see how something could be immutable otherwise.
There's unfortunately no immutable keyword in C#/VB.NET, though it has been debated. But if there are no auto-properties, all fields are declared with the readonly modifier (readonly fields can only be assigned in the constructor), and all fields are declared with an immutable type, you will have assured yourself of immutability.
An immutable object is one whose observable state can never be changed by any plausible sequence of code execution. An immutable type is one which guarantees that any instances exposed to the outside world will be immutable (this requirement is often stated as requiring that the object's state may only be set in its constructor; this isn't strictly necessary in the case of objects with private constructors, nor is it sufficient in the case of objects which call outside methods on themselves during construction).
A point which other answers have neglected, however, is a definition of an object's state. If Foo is a class, the state of a List<Foo> consists of the sequence of object identities contained therein. If the only reference to a particular List<Foo> instance is held by code which will neither cause that sequence to be changed, nor expose it to code that might do so, then that instance will be immutable, regardless of whether the Foo objects referred to therein are mutable or immutable.
To use an analogy, if one has a list of automobile VINs (Vehicle Identification Numbers) printed on tamper-evident paper, the list itself would be immutable even though cars aren't. Even if the list contains ten red cars today, it might contain ten blue cars tomorrow; they would still, however, be the same ten cars.