Should I use struct instead of class to put items into a dictionary - c#

I have a custom Entity class (see code below). Objects of this class are stored in a dictionary of type Dictionary<string, Dictionary<uint, Entity>> named dt. Someone asked me to use a struct instead of a class, since value types won't get copied onto the heap. I think a class is the better choice here, since the dictionary will then contain only references to the Entity objects. An Entity represents a row in a CSV file, so there will be one Entity object per row. The Dictionary encapsulated within Entity contains key/value pairs representing column/value for a row in the CSV file.
But I want to make sure I didn't miss anything obvious, so I thought it better to ask.
public class Entity
{
Dictionary<string, string> dtValues = new Dictionary<string,string>(); //contains values from CSV file.
public Dictionary<string, string> Values
{
get { return dtValues; }
set { dtValues = value; }
}
}

Someone asked me to use a struct instead of a class, since value types won't get copied onto the heap.
That person is dead wrong. Simply put: the dictionary itself is on the heap, so its contents will also be on the heap.
If you're not sure whether you should use a reference type or a value type, then you should research the pros and cons of both (I listed a few here). But you should probably use a reference type.
Remember: premature optimization is the root of all evil.

Using a Dictionary of structs is generally a bad idea, because whenever you enumerate it or do anything with its values you copy the data over and over. Using classes is the right way; then only references are moved.
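The copying behaviour mentioned in these answers can be seen in a small sketch. RowStruct and RowClass are hypothetical stand-ins for Entity, not types from the question; the point is only that the dictionary indexer hands back a copy of a struct value but a reference to a class instance.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical types for illustration; neither appears in the question.
public struct RowStruct { public int Count; }
public class RowClass { public int Count; }

public static class CopyDemo
{
    // Edits a struct read from the dictionary, then reports what the dictionary holds.
    public static int EditStructValue()
    {
        var rows = new Dictionary<string, RowStruct> { ["a"] = new RowStruct { Count = 1 } };
        RowStruct copy = rows["a"];   // the indexer returns a copy of the struct
        copy.Count++;                 // changes only the local copy
        return rows["a"].Count;       // the stored value is unchanged
    }

    // Same steps with a class: the edit is visible through the dictionary.
    public static int EditClassValue()
    {
        var rows = new Dictionary<string, RowClass> { ["a"] = new RowClass { Count = 1 } };
        RowClass same = rows["a"];    // the indexer returns a reference to the stored object
        same.Count++;
        return rows["a"].Count;
    }

    public static void Main()
    {
        Console.WriteLine(EditStructValue()); // prints 1
        Console.WriteLine(EditClassValue());  // prints 2
    }
}
```

Whether the extra copies matter depends on the size of the struct and how often values are read, so this shows the semantics rather than a performance verdict.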

Types which hold references to mutable objects for the purpose of encapsulating state should generally avoid exposing those objects to outside code. I don't quite understand what the purpose of your Entity type would be, since the only apparent content is a reference to a Dictionary [which is, of course, a mutable type].
In general, if one wishes to store things in a collection and is trying to decide between using a class or a struct for the things to be stored, I would suggest the following:
If you use a mutable class type for list items or dictionary values [not keys], you may edit the information stored in an item without having to involve the collection itself in the process. Making information contained in an item available to outside code, however, will require either cloning the data item or copying data from it into something else.
If you use an immutable class type, or any structure type, for things stored in a collection, editing the information contained in an item will require reading out the item, producing a changed version, and storing it back. Making information contained in the item available to outside code, however, will be much easier, since one can return it as its own data type.
A pattern which is sometimes useful is to define a mutable structure type but then define a wrapper type something like the following:
public class ExposedFieldHolder<T>
{
    public T Value;
    // The constructor must be public, or outside code cannot create holders.
    public ExposedFieldHolder(T value) { Value = value; }
}
If Value is a field, rather than a property, and one creates e.g. a Dictionary<string, ExposedFieldHolder<someStruct>>, then it will be possible to edit data while it's stored within the dictionary, e.g. myDict["George"].Value.Age++; but it will also be possible to give the data associated with an entry to outside code [e.g. return myDict[name].Value;].
Having a simple exposed-field structure encapsulated within an ExposedFieldHolder<T> can be safer and more convenient than simply using a mutable class (since one can return a holder's entire Value without giving outside code access to the holder itself), and also more convenient than using an immutable class or structure (since one can modify list items or dictionary values without having to use a readout-modify-writeback sequence). It is less space-efficient than simply storing an exposed-field structure directly in the collection. If one stored an exposed-field struct directly as a dictionary value without using an ExposedFieldHolder, the above update to George's age would require the three-step process:
var temp = myDict["George"];
temp.Age++;
myDict["George"] = temp;
but that may still be better than what an immutable class or so-called "immutable" structure would require, e.g.
var temp = myDict["George"];
temp = new PersonDat(temp.Name, temp.Age+1, temp.FavoriteColor, temp.AstrologicalSign);
myDict["George"] = temp;
Note that if one consistently follows the former pattern when using exposed-field structures, the code will be correct regardless of what fields or properties the structure contains or the order in which they appear. By contrast, when using the latter pattern one must be very careful to ensure any fields or properties whose value should not be changed get passed to the constructor in the proper sequence. Some people may like that kind of code; I think it's horrible.
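As a rough, self-contained sketch of the holder pattern described in this answer: PersonDat here has only two fields (a subset assumed from the constructor call shown earlier), and HolderDemo is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape: a simple exposed-field struct with a subset of the fields mentioned above.
public struct PersonDat
{
    public string Name;
    public int Age;
}

public class ExposedFieldHolder<T>
{
    public T Value;   // a field, not a property, so it can be modified in place
    public ExposedFieldHolder(T value) { Value = value; }
}

public static class HolderDemo
{
    public static Dictionary<string, ExposedFieldHolder<PersonDat>> Build()
    {
        return new Dictionary<string, ExposedFieldHolder<PersonDat>>
        {
            ["George"] = new ExposedFieldHolder<PersonDat>(new PersonDat { Name = "George", Age = 30 })
        };
    }

    public static void Main()
    {
        var myDict = Build();
        myDict["George"].Value.Age++;                   // in-place edit through the holder
        PersonDat snapshot = myDict["George"].Value;    // independent copy for outside code
        snapshot.Age = 99;                              // does not affect the dictionary
        Console.WriteLine(myDict["George"].Value.Age);  // prints 31
    }
}
```

The in-place edit works because the indexer returns a reference to the holder object and Value is a field, so Value.Age names a real storage location rather than a temporary copy.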

Related

Is there a List<T> like dynamic array that allows access to the internal array data in .NET?

Looking over the source of List<T>, it seems that there's no good way to access the private _items array of items.
What I need is basically a dynamic list of structs, which I can then modify in place. From my understanding, because C# 6 doesn't yet support ref return types, you can't have a List<T> return a reference to an element, which requires copying of the whole item, for example:
struct A {
    public int X;
}
void Foo() {
    var list = new List<A> { new A { X = 3 } };
    list[0].X++; // this fails to compile, because the indexer returns a copy
    // a proper way to do this would be
    var copy = list[0];
    copy.X++;
    list[0] = copy;
    var array = new A[] { new A { X = 3 } };
    array[0].X++; // this works just fine
}
Looking at this, it's both clunky from syntax point of view, and possibly much slower than modifying the data in place (Unless the JIT can do some magic optimizations for this specific case? But I doubt they could be relied on in the general case, unless it's a special standardized optimization?)
Now if List<T>._items was protected, one could at least subclass List<T> and create a data structure with specific modify operations available. Is there another data structure in .NET that allows this, or do I have to implement my own dynamic array?
EDIT: I do not want any form of boxing or introducing any form of reference semantics. This code is intended for very high performance, and the reason I'm using an array of structs is to have them tightly packed in memory (and not scattered around the heap, resulting in cache misses).
I want to modify the structs in place because it's part of a performance-critical algorithm that stores some of its data in those structs.
Is there another data structure in .NET that allows this, or do I have to implement my own dynamic array?
Neither.
There isn't, and can't be, a data structure in .NET that avoids the structure copy, because deep integration with the C# language is needed to get around the "indexed getter makes a copy" issue. So you're right to think in terms of directly accessing the array.
But you don't have to build your own dynamic array from scratch. Many List<T>-like operations such as Resize and bulk movement of items are provided for you as static methods on type System.Array. They come in generic flavors, so no boxing is involved.
The unfortunate thing is that the high-performance Buffer.BlockCopy, which should work on any blittable type, actually contains a hard-coded check for primitive types and refuses to work on any structure.
So just go with T[] (plus int Count -- array length isn't good enough because trying to keep capacity equal to count is very inefficient) and use System.Array static methods when you would otherwise use methods of List<T>. If you wrap this as a PublicList<T> class, you can get reusability and both the convenience of methods for Add, Insert, Sort as well as direct element access by indexing directly on the array. Just exercise some restraint and never store the handle to the internal array, because it will become out-of-date the next time the list needs to grow its capacity. Immediate direct access is perfectly fine though.
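A minimal sketch of the PublicList<T> idea from this answer might look like the following. The member names (Items, Count, Add) and the initial capacity are assumptions; only growth via System.Array.Resize and direct array indexing are taken from the text.

```csharp
using System;

// Sketch only: a growable array that exposes its backing store.
public class PublicList<T>
{
    public T[] Items = new T[4];   // exposed backing array; a stored copy of this reference goes stale after growth
    public int Count;              // number of used slots, separate from Items.Length (the capacity)

    public void Add(T item)
    {
        if (Count == Items.Length)
            Array.Resize(ref Items, Items.Length * 2);   // generic method, so no boxing
        Items[Count++] = item;
    }
}

public struct A { public int X; }

public static class PublicListDemo
{
    public static void Main()
    {
        var list = new PublicList<A>();
        list.Add(new A { X = 3 });
        list.Items[0].X++;   // array element access yields a variable, so this edits in place
        Console.WriteLine(list.Items[0].X); // prints 4
    }
}
```

On newer runtimes there are further options the answer predates: C# 7 added ref returns, and .NET 5 added CollectionsMarshal.AsSpan, which exposes a List<T>'s backing storage as a Span<T>. Both carry the same caveat about not holding on to the storage across a resize.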

Give names to Key and Value in C# Dictionary to improve code readability

In a C# struct, we can clearly know the purpose of a variable by its name. For example,
public struct Book
{
public string title;
public string author;
}
Then I know b.title is of type string and that it refers to the title.
However in C# dictionary, we can only specify the type
Dictionary<string,string> d
How can I make the code more readable, so that it is clear the key of the dictionary is a string referring to the title, and the value is a string referring to the author? That way, other people reading the code can easily tell that d["J.R.R. Tolkien"] is a wrong use of the dictionary.
EDIT
@mike z suggested using a variable name such as titleToAuthor to help readability. But my real issue is that the code contains nested dictionaries, e.g.
Dictionary<string, Dictionary<string, string>>,
or even 3 levels
Dictionary<string, Dictionary<string , Dictionary< string , string[] >>>.
We want to keep the convenience of using Dictionary without creating our own class, but at the same time we need some way to improve the readability.
As suggested by @ScottDorman, you could define a new type TitleAuthorDictionary that derives from Dictionary<string, string>, like so:
public class TitleAuthorDictionary : Dictionary<string, string>
{
public new void Add(string title, string author)
{
base.Add(title, author);
}
public new string this[string title]
{
get { return base[title]; }
set { base[title] = value; }
}
}
You could then use the more readable Dictionary Collection, like this:
TitleAuthorDictionary dictionary = new TitleAuthorDictionary();
dictionary.Add("Title1", "Author1");
dictionary.Add(title: "Title2", author: "Author2");
dictionary["Title2"] = "Author3";
With .NET 6 and C# 10, you can now use global using directive to give type aliases in your project.
Give built-in types their aliases in one single place, e.g. GlobalUsings.cs.
global using Title = System.String;
global using Author = System.String;
Then use aliases for better readability in your dictionaries.
Dictionary<string, Dictionary<Title, Author>>
What you can't solve without fighting the language, I'd suggest solving with documentation. Identifiers are included in that category as a form of self-documentation.
So to add self-documentation to a type of this sort:
using TitleToAuthor = System.Collections.Generic.Dictionary<
string, // title
string // author
>;
To add self-documentation to instances of that type:
TitleToAuthor title_to_author = new TitleToAuthor();
Unfortunately you cannot nest using directives as generic type parameters, so using a solution like this will make your using directives at the top of the file butt-ugly, but at least that butt-ugly code, written once, will then create a very readable alias to the left of it (showing exactly what it's for) that you can refer to throughout the rest of the code without actually creating new data types.
Another way is to simply create new data types that inherit from Dictionary. I'd suggest this route if you have more you can do with the new type than simply getting a more readable type name, like adding methods that are specifically useful for that collection, or if that collection is used in many source files (since then you'd have to repeat the same using directives a lot).
In such a case, composition might be better than inheritance (storing dictionary as a member) since then you could create a smaller, subset interface tailored to your needs (and perhaps with fewer ways to misuse it by only providing higher-level functions that make total sense for the specific container type) instead of just a full-blown dictionary's interface + more methods. In such a case, you'd be turning that somewhat hard-to-read, generic, general-purpose dictionary into a hidden implementation detail of something that not only reads better with respect to its type name, but also provides a smaller, tailored (less general) interface that more specifically handles your needs. For a simple example, it might be an error to allow empty strings to be specified for the key or the value. A dictionary doesn't impose such assertions, but an interface of your own design that uses dictionary as a private implementation detail can.
If you're tripping up over the readability of the key/value parameters of a dictionary, perhaps the problem is not really in the readability of the dictionary, but in the amount of public exposure that specific dictionary has. If you have a dictionary instance or even type with very public visibility that gets referred to all over the place, then you might be concerned with not only the readability of such code but also the flexibility that allows those accessing it to do anything they want that's allowed in a full-blown dictionary (including things you might not want to allow to happen at a broader scope). After all, even a type like float tells you very little about what it's supposed to do, but we tend to write code in a way where floats are either implementation details of a class/function or just function parameters/return types that are rather obvious in terms of what they do. So perhaps it would be better to seek to make such dictionaries less visible and more into private implementation details, since the clarity and readability of implementation details generally doesn't matter nearly as much as the more publicly-visible parts of an interface that are going to be accessed throughout your codebase.
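The composition approach described in this answer could be sketched as below. AuthorsByTitle and its method names are hypothetical; the empty-string check is the example restriction mentioned above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper: the dictionary is a private detail, and the API names the roles.
public class AuthorsByTitle
{
    private readonly Dictionary<string, string> authors = new Dictionary<string, string>();

    public void SetAuthor(string title, string author)
    {
        // A plain dictionary would accept empty strings; this narrower interface does not.
        if (string.IsNullOrEmpty(title)) throw new ArgumentException("Title is required.", nameof(title));
        if (string.IsNullOrEmpty(author)) throw new ArgumentException("Author is required.", nameof(author));
        authors[title] = author;
    }

    public bool TryGetAuthor(string title, out string author)
    {
        return authors.TryGetValue(title, out author);
    }
}

public static class AuthorsDemo
{
    public static void Main()
    {
        var books = new AuthorsByTitle();
        books.SetAuthor(title: "The Hobbit", author: "J.R.R. Tolkien");
        string author;
        Console.WriteLine(books.TryGetAuthor("The Hobbit", out author) ? author : "(unknown)");
        // prints J.R.R. Tolkien
    }
}
```

With this shape, a call such as TryGetAuthor("J.R.R. Tolkien", ...) still compiles, but the parameter name at the call site makes the misuse easier to spot.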
By design, a dictionary is a key-value pair, and the exposed collections are called as such. If you want something more explicit, you can derive your own custom dictionary or implement the appropriate dictionary interfaces on your own class. You could also look at implementing this as a keyed collection, where you provide a lambda expression to derive the key from your data.

Writing access to private objects via public readOnly property

I am currently struggling to understand something I just saw somewhere.
Let's say I have two classes:
class MyFirstClass{
public int membVar1;
private int membVar2;
public string membVar3;
private string membVar4;
public MyFirstClass(){
}
}
and :
class MySecondClass{
private MyFirstClass firstClassObject = new MyFirstClass();
public MyFirstClass FirstClassObject{
get{
return firstClassObject;
}
}
}
If I do something like this:
var secondClassObject = new MySecondClass(){
FirstClassObject = {membVar1 = 42, membVar3 = "foo"}
};
secondClassObject is an instance of MySecondClass, and has one private member variable of type MyFirstClass which is exposed through a read-only property. However, I am able to change the state of membVar1 and membVar3. Isn't there an encapsulation problem?
Best regards,
Al_th
The fact that the FirstClassObject property on MySecondClass has no setter does not mean that the object returned from the getter becomes immutable. Since it has public fields, these fields are mutable. Therefore it is perfectly legal to say secondClassObject.FirstClassObject.membVar1 = 42. The absence of the setter only means that you cannot replace the object reference stored in the firstClassObject field with a reference to a different object.
Please note: You are not changing the value of MySecondClass.FirstClassObject. You are simply changing the values inside that property.
Compare the following two snippets. The first is legal, the second is not as it tries to assign a new value to the FirstClassObject property:
// legal:
var secondClassObject = new MySecondClass() {
    FirstClassObject = { membVar1 = 42, membVar3 = "foo" }
};
// won't compile:
// Property or indexer 'FirstClassObject' cannot be assigned to -- it is read only
var secondClassObject = new MySecondClass() {
    FirstClassObject = new MyFirstClass { membVar1 = 42, membVar3 = "foo" }
};
Basically, your code is just a very fancy way of writing this:
var secondClassObject = new MySecondClass();
secondClassObject.FirstClassObject.membVar1 = 42;
secondClassObject.FirstClassObject.membVar3 = "foo";
And that's how I would write it. It is explicit and understandable.
Neither a storage location of type MyFirstClass, nor the value returned by a property of type MyFirstClass, contains fields membVar1, membVar2, etc. The storage location or property instead contains information sufficient to either identify an instance of MyFirstClass or indicate that it is "null". In some languages or frameworks, there exist reference types which identify an object but only allow certain operations to be performed on it, but Java and .NET both use Promiscuous Object References: if an object allows outside code that holds a reference to do something with it, any outside code that gets a reference will be able to do that.
If a class is using a mutable object to encapsulate its own state, and wishes to allow the outside world to see that state but not allow the outside world to tamper with it, it must not return the object directly to the outside code but instead give the outside code something else. Possibilities include:
Expose all the aspects of state encompassed by the object individually (e.g. have a membVar1 property which returns the value of the encapsulated object's membVar1). This can avoid confusion, but provides a caller with no way to handle the properties as a group.
Return a new instance of a read-only wrapper which holds a reference to the private object, and has members that forward read requests (but not write requests) to those members. The returned object will serve as a read-only "view", but outside code will have no nice way to identify whether two such objects are views of the same underlying object.
Have a field of a read-only-wrapper type which is initialized in the constructor, and have a property return that. If each object will only have one read-only wrapper associated with it, two wrapper references will view the same wrapped object only if they identify the same wrapper.
Create an immutable copy of the underlying data, perhaps by creating a new mutable copy and returning a new read-only wrapper to it. This will give the caller a "snapshot" of the data, rather than a live "view".
Create a new mutable copy of the underlying data, and return that. This has the disadvantage that a caller who tries to change the underlying data by changing the copy will be allowed to change the copy without any warnings, but the operation won't work. All of the arguments for why mutable structs are "evil" apply doubly here: code which receives an exposed-field structure should expect that changes to the received structure won't affect the source from which it came, but code which receives a mutable class object has no way of knowing that. Properties should not behave this way; such behavior is generally only appropriate for methods which make clear their intention (e.g. FirstClassObjectAsNewMyFirstClass()).
Require that the caller pass in a mutable object of a type that can accept the underlying data, and copy the data into that. This gives the caller the data in a mutable form (which in some cases may be easier to work with) but at the same time avoids any confusion about who "owns" the object. As an added bonus, if the caller will be making many queries, the caller may reuse the same mutable object for all of them, thus avoiding unnecessary object allocations.
Encapsulate the data within a structure, and have a property return the structure. Some people may balk at such usage, but it's a useful convention in cases where a caller may want to piecewise-modify the data. This approach only really works if the data in question is limited to a fixed set of discrete values (such as the coordinates and dimensions of a rectangle), but has the advantage that if the caller understands what a .NET structure is (as all .NET programmers should) the semantics are inherently obvious.
Of these choices, only the last two make clear via the type system what semantics the caller should expect. Accepting a mutable object from the caller offers clear semantics, but makes usage awkward. Returning an exposed-field structure offers clear semantics but only if the data consists of a fixed set of discrete values. Returning a mutable copy of the data is sometimes useful, but is only appropriate if the method name makes clear what it is doing. The other choices generally leave ambiguous the question of whether the data represents a snapshot or a live "view".
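One of the options listed above, a single read-only wrapper created in the constructor, might be sketched as follows. MyFirstClassView and SetValues are hypothetical names, and MyFirstClass is reduced to its two public fields.

```csharp
using System;

public class MyFirstClass
{
    public int membVar1;
    public string membVar3;
}

// Hypothetical read-only view: forwards reads and offers no way to write.
public sealed class MyFirstClassView
{
    private readonly MyFirstClass target;
    public MyFirstClassView(MyFirstClass target) { this.target = target; }
    public int MembVar1 { get { return target.membVar1; } }
    public string MembVar3 { get { return target.membVar3; } }
}

public class MySecondClass
{
    private readonly MyFirstClass firstClassObject = new MyFirstClass();
    private readonly MyFirstClassView view;

    public MySecondClass() { view = new MyFirstClassView(firstClassObject); }

    // Always returns the same wrapper, so two references can be compared for identity.
    public MyFirstClassView FirstClassObject { get { return view; } }

    // Only the owner can change the encapsulated state.
    public void SetValues(int v1, string v3)
    {
        firstClassObject.membVar1 = v1;
        firstClassObject.membVar3 = v3;
    }
}

public static class ViewDemo
{
    public static void Main()
    {
        var second = new MySecondClass();
        MyFirstClassView v = second.FirstClassObject;
        second.SetValues(42, "foo");
        Console.WriteLine(v.MembVar1); // prints 42: the view is live, not a snapshot
        // v.MembVar1 = 1;             // would not compile: the property has no setter
    }
}
```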

is it possible to create immutable object without passing everything in constructor?

I want to make my class immutable. The obvious way would be to declare all properties as get; private set; and to initialize them all in the constructor, so clients must provide everything in the constructor. The problem is that when there are ~10 or more fields, passing them to the constructor becomes very unreadable, because there are no labels for each field.
For example this is pretty readable:
info = new StockInfo
{
Name = data[0] as string,
Status = s,
LotSize = (int)data[1],
ISIN = data[2] as string,
MinStep = (decimal)data[3]
};
compare to this:
new StockInfo(data[0] as string, s, (int) data[1], data[2] as string, (decimal) data[3])
And now imagine that I have 10 or more parameters.
So how can I make the class immutable while keeping readability?
I can only suggest using the same formatting when calling the constructor:
info = new StockInfo(
data[0] as string, // Name
s, // Status
(int)data[1], // LotSize
data[2] as string, // ISIN
(decimal)data[3] // MinStep
);
Can you suggest something better?
Here are some options. You will have to decide what's best for you:
Use a classic immutable object (with a massive constructor) with named arguments for readability. (Drawbacks: Some frown on having many constructor arguments. May be inconvenient to use from other .NET languages without support for named arguments.)
info = new StockInfo
(
name: data[0] as string,
status: s,
...
)
Expose the mutable object through an immutable interface. (Drawbacks: The object could still be mutated with casting. Extra type to write.)
public interface IStockInfo
{
string Name { get; }
string Status { get; }
}
IStockInfo info = new StockInfo
{
Name = data[0] as string,
Status = s,
...
}
Expose a read-only view of the mutable object - see ReadOnlyCollection<T> for example. (Drawbacks: Extra type to implement. Extra object created. Extra indirections.)
var readOnlyInfo = new ReadOnlyStockInfoDecorator(info);
Expose an immutable clone of the mutable object. (Drawbacks: Extra type to implement. Extra object created. Copying required.)
var immutableInfo = new ImmutableStockInfo(info);
Use freezable objects. (Drawback: Post-freeze mutation-attempts won't be caught until execution-time.)
info.Freeze();
info.Name = "Test"; // Make this throw an exception.
Use fluent-style builders or similar (Drawbacks: Some may be unfamiliar with the pattern. Lots of extra code to write. Lots of copies created. Intermediate states may possibly be illegal)
info = StockInfo.FromName(data[0] as string)
.WithStatus(s) // Make this create a modified copy
.WithXXX() ;
Here is how you could do it, using C#'s named parameters:
var info = new StockInfo
(
Name: data[0] as string,
Status: s,
LotSize: (int)data[1],
ISIN: data[2] as string,
MinStep: (decimal)data[3]
);
No, it is not possible. Either you have an immutable object, or you want the ability to modify the object.
You can use named parameters.
You may consider passing other objects (and group parameters), so that one object will contain only parameters that are somehow very similar.
Looking at your code I may also suggest that you extract parameters first, so instead of passing something like data[0] as string you use string stockName = data[0] as string; and then use stockName. That should make your code more readable.
If you're passing so many parameters to the constructor of your object it may be a good idea to revise your design. You may be violating Single Responsibility principle.
So how can I make class immutable saving readability?
You can use named parameters:
info = new StockInfo(
name: data[0] as string,
status: s,
lotSize: (int)data[1],
isin: data[2] as string,
minStep: (decimal)data[3]
);
Note that the goal of using object initializers is not readability - and they should not be considered a substitute for constructors. It is a very good idea to always include every parameter in a constructor which is required to properly initialize a type. Immutable types must pass in all of their arguments during construction, either via a constructor or a factory method.
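Object initializers do not work with immutable types whose properties have no accessible setter, as they work by setting values after the constructor has run. (C# 9 later added init-only setters, which relax this for properties declared with init.)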
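The limitation described here applies to properties with no accessible setter. As a sketch of a newer alternative that these answers do not cover, C# 9 (with .NET 5 or later) added init-only setters, which can be assigned in an object initializer but not afterwards. The sample values below are made up for illustration.

```csharp
using System;

public class StockInfo
{
    // init accessors can be set only during construction or in an object initializer.
    public string Name { get; init; }
    public string Status { get; init; }
    public int LotSize { get; init; }
    public string ISIN { get; init; }
    public decimal MinStep { get; init; }
}

public static class InitDemo
{
    public static StockInfo Create(object[] data, string s)
    {
        return new StockInfo
        {
            Name = data[0] as string,
            Status = s,
            LotSize = (int)data[1],
            ISIN = data[2] as string,
            MinStep = (decimal)data[3]
        };
    }

    public static void Main()
    {
        // Placeholder data, not from the question.
        object[] data = { "ACME", 10, "XX0000000000", 0.01m };
        StockInfo info = Create(data, "Active");
        Console.WriteLine(info.Name + " " + info.LotSize); // prints ACME 10
        // info.Name = "Test";  // compile-time error: init-only property
    }
}
```

Note that nothing forces a caller to set every property this way, so a constructor is still the safer choice for values that are required.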
Some more possible solutions:
Immutability by convention
The object is mutable, you just behave well and never change it after it's been set-up. This is completely inappropriate for most uses (any where the object is public for a start), but can work well for internal "worker" objects with a limited number of places where they are used (and hence a limited number of places where you can mess up and change them).
Deeper Hierarchy
Assuming your real class has more than 5 fields (not that hard to read, especially if you've an IDE with tooltips), some may be composable. E.g if you had different parts of a name, and address and a latitude and longitude in the same class, you could break that into name, address and coördinate classes.
A bonus that happens in some such cases, is that if you've many (and I mean many for this to be worthwhile, anything less than a few thousand and it's a waste of time) such objects and there are some such fields identical between them, you can sometimes build them in such a way that those shared values have the same object in each case, rather than different identical objects - all the things that can go wrong with aliasing can't happen, since they are immutable after all.
Builder Classes
Examples would be StringBuilder and UriBuilder. Here you've got precisely the issue you have: you want the benefits of immutability, but there are at least some times when you want to be able to construct the object in more than one step.
So you create a different mutable class that has equivalent properties, but with setters as well as getters, along with other mutating methods (whether something like Append() makes sense depends on the class of course), and a method that constructs an instance of your immutable class.
I've made use of this with classes whose constructor has as many as 30 parameters, because there really were 30 different pieces of information that were part of the same concern. In this case, about the only place I'd call the constructor was in the corresponding builder class.
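A builder along the lines described above might look like this sketch. StockInfoBuilder is a hypothetical name, and StockInfo is cut down to three properties to keep the example short.

```csharp
using System;

// Immutable type: get-only properties assigned once in the constructor.
public sealed class StockInfo
{
    public string Name { get; }
    public int LotSize { get; }
    public decimal MinStep { get; }

    public StockInfo(string name, int lotSize, decimal minStep)
    {
        Name = name;
        LotSize = lotSize;
        MinStep = minStep;
    }
}

// Mutable counterpart with the same properties plus a method that builds the immutable object.
public sealed class StockInfoBuilder
{
    public string Name { get; set; }
    public int LotSize { get; set; }
    public decimal MinStep { get; set; }

    // The only place the long constructor is called.
    public StockInfo Build()
    {
        return new StockInfo(Name, LotSize, MinStep);
    }
}

public static class BuilderDemo
{
    public static void Main()
    {
        var builder = new StockInfoBuilder { Name = "ACME", LotSize = 10 };
        builder.MinStep = 0.01m;               // construction can span several steps
        StockInfo info = builder.Build();
        builder.Name = "changed later";        // does not affect the object already built
        Console.WriteLine(info.Name);          // prints ACME
    }
}
```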
I would suggest that it may be helpful to have an interface for "maybe-mutable" objects which includes AsMutable, AsNewMutable, and AsImmutable methods. An immutable object could implement AsImmutable by simply returning itself. A mutable object should implement AsImmutable by returning either a new immutable object created using the mutable one as a constructor parameter, or else an immutable object which is known to be equivalent. The constructor for the immutable object should load the new object with the contents of the original, but calling AsImmutable on all maybe-mutable fields.
A mutable object should simply return itself in response to AsMutable, while an immutable object should construct a new "shallowly" mutable object. If code wishes to mutate an object referred to by a "maybe-mutable" property, it should set the property to its AsMutable equivalent.
Calling the AsNewMutable method on an immutable object should behave just like AsMutable. Calling it on a mutable object could either behave equivalently to AsImmutable.AsMutable, or could create mutable clones of any nested mutable objects (there are times when either approach might be better, depending upon which nested objects will end up being mutated).
If you use this pattern, you should be able to reap many of the benefits of both immutable objects (most notably, the ability to take a snapshot of a "deep" object without needing to make a "deep" copy), and mutable ones (being able to produce an object by performing many steps on the same instance). Performance may be enhanced by having each mutable object keep a reference to an immutable object whose state was at some time identical to its own; after constructing an immutable instance, the object could check whether it matches the other and, if so, discard the new instance and return the old one. While this would seem to represent extra work, it could in fact improve performance considerably in the scenario where a mutable object has AsImmutable called upon it more than once between mutations. If one calls AsImmutable twice on a deep tree structure when most of the tree has in fact not been mutated, having the non-mutated parts of the tree return the same object instances both times would facilitate future comparisons.
Note: If one uses this pattern, one should override GetHashCode and Equals for the deeply-immutable type, but not the mutable one. Immutable objects which hold identical values should be considered interchangeable and thus equivalent, but mutable objects should not be equivalent to anything but themselves regardless of their values. Note also that some care may be needed if objects hold anything of type double, float, or Decimal, since those types override Object.Equals to mean something other than equivalence.

What is the downside of using a structure vs object in a list in C#?

As I understand, using structure value types will always give better performance than using reference types in an array or list. Is there any downside involved in using struct instead of class type in a generic list?
PS : I am aware that MSDN recommends that struct should be maximum 16 bytes, but I have been using 100+ byte structure without problems so far. Also, when I get the maximum stack memory error exceeded for using a struct, I also run out of heap space if I use a class instead.
There is a lot of misinformation out there about struct vs. reference types in .Net. Anything which makes blanket statements like "structs will always perform better in ..." is almost certainly wrong. It's almost impossible to make blanket statements about performance.
Here are several items related to value types in a generic collection which will / can affect performance.
Using a value type in a generic instantiation can cause extra copies of methods to be JIT'd at runtime. For reference types only one instance will be generated.
Using value types will affect the size of the allocated array: it will be count * size of the specific value type, whereas reference types all have the same size.
Adding / accessing values in the collection will incur copy overhead. The performance of this changes based on the size of the item. For references again it's the same no matter the type and for value types it will vary based on the size
As others have pointed out, there are many downsides to using large structures in a list. Some ramifications of what others have said:
Say you're sorting a list whose members are 100+ byte structures. Every time items have to be swapped, the following occurs:
var temp = list[i];
list[i] = list[j];
list[j] = temp;
The amount of data copied is 3*sizeof(your_struct). If you're sorting a list that's made up of reference types, the amount of data copied is 3*sizeof(IntPtr): 12 bytes in the 32-bit runtime, or 24 bytes in the 64-bit runtime. I can tell you from experience that copying large structures is far more expensive than the indirection inherent in using reference types.
Using structures also reduces the maximum number of items you can have in a list. In .NET, the maximum size of any single data structure is 2 gigabytes (minus a little bit). A list of structures has a maximum capacity of 2^31/sizeof(your_struct). So if your structure is 100 bytes in size, you can have at most about 21.5 million of them in a list. But if you use reference types, your maximum is about 536 million in the 32-bit runtime (although you'll run out of memory before you reach that limit), or 268 million in the 64-bit runtime. And, yes, some of us really do work with that many things in memory.
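The capacity figures in this answer follow from simple division. The sketch below reproduces them, treating the limit as exactly 2^31 bytes (the answer says "minus a little bit", so real limits are slightly lower); the names are hypothetical.

```csharp
using System;

public static class CapacityDemo
{
    // Approximation of the 2 GB single-object limit described above.
    const long MaxObjectBytes = 1L << 31;

    public static long MaxItems(int bytesPerItem)
    {
        return MaxObjectBytes / bytesPerItem;
    }

    public static void Main()
    {
        Console.WriteLine(MaxItems(100)); // prints 21474836  (100-byte structs)
        Console.WriteLine(MaxItems(4));   // prints 536870912 (4-byte references, 32-bit)
        Console.WriteLine(MaxItems(8));   // prints 268435456 (8-byte references, 64-bit)
    }
}
```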
using structure value types will always give better performance than using reference types in an array or list
There is nothing true in that statement.
Take a look at this question and answer.
With structs, you cannot have code reuse in the form of class inheritance. A struct can only implement interfaces but cannot inherit from a class or another struct whereas a class can inherit from another class and of course implement interfaces.
When storing data in a List<T> or other collection (as opposed to keeping a list of controls or other active objects) and one wishes to allow the data to change, one should generally follow one of four patterns:
Store immutable objects in the list, and allow the list itself to change
Store mutable objects in the list, but only allow objects created by the owner of the list to be stored therein. Allow outsiders to access the mutable objects themselves.
Only store mutable objects to which no outside references exist, and don't expose to the outside world any references to objects within the list; if information from the list is requested, copy it from the objects in the list.
Store value types in the list.
Approach #1 is the simplest, if the objects one wants to store are immutable. Of course, the requirement that objects be immutable can be somewhat limiting.
Approach #2 can be convenient in some cases, and it permits convenient updating of data in the list (e.g. MyList[index].SomeProperty += 5;) but the exact semantics of how returned properties are, or remain, attached to items in the list may sometimes be unclear. Further, there's no clear way to load all the properties of an item in the list from an 'example' object.
Approach #3 has simple-to-understand semantics (changing an object after giving it to the list will have no effect, objects retrieved from the list will not be affected by subsequent changes to the list, and changes to objects retrieved from a list will not affect the list itself unless the objects are explicitly written back), but requires defensive copying on every list access, which can be rather bothersome.
Approach #4 offers essentially the same semantics as approach #3, but copying a struct is cheaper than making a defensive copy of a class object. Note that if the struct is mutable, the semantics of:
var temp = MyList[index];
temp.SomeField += 5;
MyList[index] = temp;
are clearer than anything that can be achieved with so-called "immutable" (i.e. mutation-only-by-assignment) structs. To know what the above does, all one needs to know about the struct is that SomeField is a public field of some particular type. By contrast, even something like:
var temp = MyList[index];
temp = temp.WithSomeField(temp.SomeField + 5);
MyList[index] = temp;
which is about the best one could hope for with such a struct, would be much harder to read than the easily-mutable-struct version. Further, to be sure of what the above actually does, one would have to examine the definition of the struct's WithSomeField method and any constructors or methods employed thereby, as well as all of the struct's fields, to determine whether it had any side-effects other than modifying SomeField.
