This example is in C# but the question really applies to any OO language. I'd like to create a generic, immutable class which implements IReadOnlyList. Additionally, this class should have an underlying generic IList which is unable to be modified. Initially, the class was written as follows:
public class Datum<T> : IReadOnlyList<T>
{
private IList<T> objects;
public int Count
{
get;
private set;
}
public T this[int i]
{
get
{
return objects[i];
}
private set
{
this.objects[i] = value;
}
}
public Datum(IList<T> obj)
{
this.objects = obj;
this.Count = obj.Count;
}
IEnumerator IEnumerable.GetEnumerator()
{
return this.GetEnumerator();
}
public IEnumerator<T> GetEnumerator()
{
return this.objects.GetEnumerator();
}
}
However, this isn't immutable. As you can likely tell, changing the initial IList 'obj' changes Datum's 'objects'.
static void Main(string[] args)
{
List<object> list = new List<object>();
list.Add("one");
Datum<object> datum = new Datum<object>(list);
list[0] = "two";
Console.WriteLine(datum[0]);
}
This writes "two" to the console. As the point of Datum is immutability, that's not okay. In order to resolve this, I've rewritten the constructor of Datum:
public Datum(IList<T> obj)
{
this.objects = new List<T>();
foreach(T t in obj)
{
this.objects.Add(t);
}
this.Count = obj.Count;
}
Given the same test as before, "one" appears on the console. Great. But what if Datum contains a collection of mutable collections and one of those inner collections is modified?
static void Main(string[] args)
{
List<object> list = new List<object>();
List<List<object>> containingList = new List<List<object>>();
list.Add("one");
containingList.Add(list);
Datum<List<object>> d = new Datum<List<object>>(containingList);
list[0] = "two";
Console.WriteLine(d[0][0]);
}
And, as expected, "two" is printed out on the console. So, my question is, how do I make this class truly immutable?
You can't. Or rather, you don't want to, because the ways of doing it are so bad. Here are a few:
1. struct-only
Add where T : struct to your Datum<T> class. Structs are usually immutable, but a struct that contains mutable class instances can still be modified (thanks Servy). The major downside is that all classes are out, even immutable ones like string and any immutable class you make.
var e = new ExtraEvilStruct();
e.Mutable = new Mutable { MyVal = 1 };
Datum<ExtraEvilStruct> datum = new Datum<ExtraEvilStruct>(new[] { e });
e.Mutable.MyVal = 2;
Console.WriteLine(datum[0].Mutable.MyVal); // 2
2. Create an interface
Create a marker interface and implement it on any immutable types you create. The major downside is that all built-in types are out. And you don't really know if classes implementing this are truly immutable.
public interface IImmutable
{
// this space intentionally left blank, except for this comment
}
public class Datum<T> : IReadOnlyList<T> where T : IImmutable
3. Serialize!
If you serialize and deserialize the objects that you are passed (e.g. with Json.NET), you can create completely-separate copies of them. Upside: works with many built-in and custom types you might want to put here. Downside: requires extra time and memory to create the read-only list, and requires that your objects are serializable without losing anything important. Expect any links to objects outside of your list to be destroyed.
public Datum(IList<T> obj)
{
this.objects =
JsonConvert.DeserializeObject<IList<T>>(JsonConvert.SerializeObject(obj));
this.Count = obj.Count;
}
I would suggest that you simply document Datum<T> to say that the class should only be used to store immutable types. This sort of unenforced implicit requirement exists in other types (e.g. Dictionary expects that TKey implements GetHashCode and Equals in the expected way, including immutability), because it's too difficult for it to not be that way.
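To illustrate the Dictionary comparison, here is a minimal sketch of what goes wrong when a key is mutated after insertion. MutableKey and DictionaryKeyDemo are hypothetical names made up for this example; the point is only that Dictionary relies on the same kind of undocumented-in-the-type-system contract.

```csharp
using System.Collections.Generic;

// Hypothetical key type whose hash code depends on a mutable field.
public sealed class MutableKey
{
    public int Id;

    public override bool Equals(object obj) => obj is MutableKey other && other.Id == Id;

    public override int GetHashCode() => Id;
}

public static class DictionaryKeyDemo
{
    // Returns whether the dictionary can still find the key after it was mutated.
    public static bool LookupAfterMutation()
    {
        var key = new MutableKey { Id = 1 };
        var map = new Dictionary<MutableKey, string> { [key] = "value" };

        // The entry was stored under hash code 1; the key now reports hash code 2.
        key.Id = 2;

        return map.ContainsKey(key);
    }
}
```

The lookup fails even though the very same key instance is still in the dictionary, which is why "keys must not change while stored" is a documented requirement rather than something the compiler enforces.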
Kind of hacky, and definitely more confusing than it's worth in my opinion, but if your T is guaranteed to be serializable, you can store string representations of the objects in your collection rather than storing the objects themselves. Then even if someone pulls an item from your collection and modifies it, your collection would still be intact.
It would be slow and you'd get a different object every time you pulled it from the list. So I'm not recommending this.
Something like:
public class Datum<T> : IReadOnlyList<T>
{
private IList<string> objects;
public T this[int i] {
get { return JsonConvert.DeserializeObject<T>(objects[i]); }
private set { this.objects[i] = JsonConvert.SerializeObject(value); }
}
public Datum(IList<T> obj) {
this.objects = new List<string>();
foreach (T t in obj) {
this.objects.Add(JsonConvert.SerializeObject(t));
}
this.Count = obj.Count;
}
public IEnumerator<T> GetEnumerator() {
return this.objects.Select(JsonConvert.DeserializeObject<T>).GetEnumerator();
}
}
It's impossible. There's no way to constrain the generic type to be immutable. The best you can do is write a collection that does not allow its own structure to be modified. There is no way to prevent the collection from being used as a collection of some mutable type.
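A minimal sketch of such a structurally immutable collection, assuming a defensive copy into a private array is acceptable. It reuses the Datum<T> name from the question, but the shape (a T[] backing field, no setters) is just one possible way to write it.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Structurally immutable: the set of references held never changes after construction.
// The referenced objects themselves can still be mutated if T is a mutable type.
public sealed class Datum<T> : IReadOnlyList<T>
{
    private readonly T[] items;

    public Datum(IEnumerable<T> source)
    {
        // ToArray() takes a defensive copy, so later changes to 'source' are not seen here.
        items = source.ToArray();
    }

    public int Count => items.Length;

    public T this[int index] => items[index];

    public IEnumerator<T> GetEnumerator() => ((IEnumerable<T>)items).GetEnumerator();

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}
```

This fixes the first test from the question (replacing an element of the source list), but not the nested-list one: a Datum<List<object>> still hands out references to lists that anyone can modify.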
I think such collections don't fit OOP well, because this design creates a specific coupling between otherwise independent classes: the collection and its items. How can one class change the behavior of another when neither knows about the other?
So the serialization suggestions and the like let you do it in a hacky way, but it's better to first decide whether a collection of immutable items is really required. Who would try to change them other than your own code? It may be better simply "to not mutate" the items rather than to try to "make them immutable".
I faced the same problem when implementing an object (say CachedData<T>) which handles a cached copy of a property of another object (say T SourceData). When calling the constructor of CachedData, you pass a delegate which returns a SourceData. When calling CachedData<T>.Value, you get a copy of SourceData, which is updated every now and then.
It would make no sense to try caching an object, as .Value would only cache the reference to the data, not the data itself. It would only make sense to cache data types, strings, and perhaps structures.
So I ended up:
Thoroughly documenting CachedData<T>, and
Throwing an error in the constructor if T is not a ValueType, a Structure, or a String. Something like (forgive my VB): If GetType(T) <> GetType(String) AndAlso GetType(T).IsClass Then Throw New ArgumentException("Explain")
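For readers following along in C#, the same constructor check might look roughly like this. It is only a sketch: the CachedData<T> shape, the Func<T> parameter, and the Value property are assumptions based on the description above, and the refresh logic is left out entirely.

```csharp
using System;

// Hypothetical C# rendering of the VB check: reject classes other than string.
public sealed class CachedData<T>
{
    private readonly Func<T> source;

    public CachedData(Func<T> source)
    {
        Type t = typeof(T);

        // Type.IsClass is true for classes and delegates, false for structs, enums and interfaces.
        if (t != typeof(string) && t.IsClass)
        {
            throw new ArgumentException(
                "CachedData<T> only supports value types and string, but T is " + t.Name + ".");
        }

        this.source = source ?? throw new ArgumentNullException(nameof(source));
    }

    // Simplified: re-reads the source on every access instead of caching with a refresh policy.
    public T Value => source();
}
```

The check runs once per constructed instance, so misuse shows up at runtime as soon as someone writes new CachedData<SomeClass>(...), not at compile time.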
I have a base class CodeElement like
public class CodeElement
{
public string Name;
public CodeElement(string name)
{
Name = name;
}
// ...
}
and several derived classes (Class, Window, Constant, etc.) like
public class Class : CodeElement
{
public Class(string name) : base(name)
{}
// ...
}
Note that the constructor is always like this (except the name, obviously). I also have a class CodeElementComparer implementing IComparer<CodeElement> which simply compares by Name.
My problem is the following: I have one somewhat large list (<10,000 elements) of each derived class on which I need to run a very large number of searches, by name (several million each). The Lists are filled before any searches are run. As such, I sort each List after they are complete (using the CodeElementComparer) and then use List<T>.BinarySearch like this
private Class FindClass(List<Class> classes, string name)
{
Class dummy = new Class(name);
int index = classes.BinarySearch(dummy, codeElementComparer);
if (index >= 0)
{
return classes[index];
}
else
{
return null;
}
}
Runtime is just fine, the problem is that new derived classes are regularly added, forcing me to copy paste the above method every time. What I am looking for is something like
private T FindElement<T>(List<T> elements, string name) where T : CodeElement
{
CodeElement dummy = new CodeElement(name);
int index = elements.BinarySearch(dummy, codeElementComparer);
if (index >= 0)
{
return elements[index];
}
else
{
return null;
}
}
However, this does not compile, since List<T>.BinarySearch requires dummy to be of type T (even though I am only using an IComparer<CodeElement>). Here's what I considered; unfortunately, I am stuck on each approach:
Cast the List<T> to List<CodeElement>. I know this does not work because Lists are writable and I could theoretically add a Window to a List<Class> that way. From what I gathered from other questions on here, casting it to IEnumerable<CodeElement> works, but IEnumerable does not support binary search (since that requires O(1) access to make sense).
Create a dummy of type T, using the constructor. While I know that there will always be a constructor which takes only the name parameter, the compiler does not. If I had a way to specify that all derived classes must have such a constructor, I could make dummy of type T.
Change the type of the elements parameter to List<CodeElement>, then cast the return to T. This does compile, but is super unsafe.
Do you have any concise solution to this?
EDIT: Although the names are not unique, handling that once when creating a dictionary is still better than dealing with binary search, as @canton7 pointed out. I am still interested in how to handle this with Lists though.
To answer the question (regardless of discussions on whether a better collection type would be better), one way is to make use of Span<T>.BinarySearch, which takes an IComparable<T> rather than just a T.
For this, you need to get a Span<T> from your List<T>. This can be done with CollectionsMarshal.AsSpan, but note that this gives you a reference to the underlying array which can change if the list is resized, so use the resulting span with caution!
The final code looks a bit like this:
private T FindElement<T>(List<T> elements, string name) where T : CodeElement
{
CodeElement dummy = new CodeElement(name);
var span = CollectionsMarshal.AsSpan(elements);
int index = span.BinarySearch(dummy);
if (index >= 0)
{
return span[index];
}
else
{
return null;
}
}
class CodeElement : IComparable<CodeElement>
{
public string Name { get; }
public CodeElement(string name) => Name = name;
public int CompareTo(CodeElement other) => Comparer<string>.Default.Compare(Name, other?.Name);
}
Note that you don't have to use CodeElement as your dummy -- anything which implements IComparable<CodeElement> will do it.
That said, note that a binary search is not particularly hard to implement. Here's Span<T>.BinarySearch and here's Array.BinarySearch, and another random one. You can avoid the whole dummy thing by copying and tweaking one of these implementations.
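As a rough idea of what such a tweaked implementation could look like, here is a sketch that searches by name directly, so no dummy element is needed. CodeElementSearch and FindByName are made-up names, CodeElement and Class are cut-down copies of the question's types, and the code assumes the list was sorted by Name with ordinal comparison (the search must use the same ordering as the sort).

```csharp
using System;
using System.Collections.Generic;

public class CodeElement
{
    public string Name { get; }
    public CodeElement(string name) { Name = name; }
}

public class Class : CodeElement
{
    public Class(string name) : base(name) { }
}

public static class CodeElementSearch
{
    // Assumes 'elements' is already sorted by Name using ordinal comparison.
    public static T FindByName<T>(IReadOnlyList<T> elements, string name) where T : CodeElement
    {
        int lo = 0;
        int hi = elements.Count - 1;

        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2; // avoids overflow of (lo + hi)
            int cmp = string.CompareOrdinal(elements[mid].Name, name);

            if (cmp == 0) return elements[mid];
            if (cmp < 0) lo = mid + 1;
            else hi = mid - 1;
        }

        return null; // not found
    }
}
```

Because List<T> implements IReadOnlyList<T>, a List<Class> or List<Window> can be passed as-is and the return type stays the derived type.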
You can create the dummy instance with type T using reflection. Here is a sample that I just tested :
private T FindElement<T>(List<T> elements, string name) where T : CodeElement {
//CodeElement dummy = new CodeElement(name);
//using System;
T dummy = (T)Activator.CreateInstance(typeof(T), name);
int index = elements.BinarySearch(dummy, codeElementComparer);
if (index >= 0) {
return elements[index];
} else {
return null;
}
}
I'm on a quest to write a TypedBinaryReader that would be able to read any type that BinaryReader normally supports, and a type that implements a specific interface. I have come really close, but I'm not quite there yet.
For the value types, I mapped the types to functors that call the appropriate functions.
For the reference types, as long as they inherit the interface I specified and can be constructed, the function below works.
However, I want to create a universal generic method, ReadUniversal<T>(), that would work for both value types and the reference types specified above.
This is attempt number one. It works, but it's not generic enough: I still have two cases.
public class TypedBinaryReader : BinaryReader {
private readonly Dictionary<Type, object> functorBindings;
public TypedBinaryReader(Stream input) : this(input, Encoding.UTF8, false) { }
public TypedBinaryReader(Stream input, Encoding encoding) : this(input, encoding, false) { }
public TypedBinaryReader(Stream input, Encoding encoding, bool leaveOpen) : base(input, encoding, leaveOpen) {
functorBindings = new Dictionary<Type, object>() {
{typeof(byte), new Func<byte>(ReadByte)},
{typeof(int), new Func<int>(ReadInt32)},
{typeof(short), new Func<short>(ReadInt16)},
{typeof(long), new Func<long>(ReadInt64)},
{typeof(sbyte), new Func<sbyte>(ReadSByte)},
{typeof(uint), new Func<uint>(ReadUInt32)},
{typeof(ushort), new Func<ushort>(ReadUInt16)},
{typeof(ulong), new Func<ulong>(ReadUInt64)},
{typeof(bool), new Func<bool>(ReadBoolean)},
{typeof(float), new Func<float>(ReadSingle)}
};
}
public T ReadValueType<T>() {
return ((Func<T>)functorBindings[typeof(T)])();
}
public T ReadReferenceType<T>() where T : MyReadableInterface, new() {
T item = new T();
item.Read(this);
return item;
}
public List<T> ReadMultipleValuesList<T, R>() {
dynamic size = ReadValueType<R>();
List<T> list = new List<T>(size);
for (dynamic i = 0; i < size; ++i) {
list.Add(ReadValueType<T>());
}
return list;
}
public List<T> ReadMultipleObjecsList<T, R>() where T : MyReadableInterface, new() {
dynamic size = ReadValueType<R>();
List<T> list = new List<T>(size);
for (dynamic i = 0; i < size; ++i) {
list.Add(ReadReferenceType<T>());
}
return list;
}
}
An idea that I came up with, that I don't really like, is to write a generic class that boxes in the value types, like this one:
public class Value<T> : MyReadableInterface {
private T value;
public Value(T value) {
this.value = value;
}
internal Value(TypedBinaryReader reader) {
Read(reader);
}
public T Get() {
return value;
}
public void Set(T value) {
if (!this.value.Equals(value)) {
this.value = value;
}
}
public override string ToString() {
return value.ToString();
}
public void Read(TypedBinaryReader reader) {
value = reader.ReadValueType<T>();
}
}
This way, I can use ReadReferenceType<T>() even on value types, as long as I pass the type parameter as Value<int> instead of just int.
But this is still ugly, since I again have to remember what I'm reading; instead of having to remember the function signature, I have to remember to box in the value types.
The ideal solution would be to add the following method to the TypedBinaryReader class:
public T ReadUniversal<T>() {
if (typeof(MyReadableInterface).IsAssignableFrom(typeof(T))) {
return ReadReferenceType<T>();
} else if (functorBindings.ContainsKey(typeof(T))) {
return ReadValueType<T>();
} else {
throw new SomeException();
}
}
However, due to different constraints on the generic argument T, this won't work. Any ideas on how to make it work?
The ultimate goal is to read any type that BinaryReader normally can, or any type that implements the interface, using only a single method.
If you need a method to handle reference types and a method to handle value types, that's a perfectly valid reason to have two methods.
What may help is to view this from the perspective of code that will call the methods in this class. From their perspective, do they benefit if they can call just one method regardless of the type instead of having to call one method for value types and another for reference types? Probably not.
What happens (and I've done this lots and lots of times) is that we get caught up in how we want a certain class to look or behave for reasons that aren't related to the actual software that we're trying to write. In my experience this happens a lot when we're trying to write generic classes. Generic classes help us when we see unnecessary code duplication in cases where the types we're working with don't matter (like if we had one class for a list of ints, another for a list of doubles, etc.).
Then when we get around to actually using the classes we've created we may find that our needs are not quite what we thought, and the time we spent polishing that generic class goes to waste.
If the types we're working with do require entirely different code then forcing the handling of multiple unrelated types into a single generic method is going to make your code more complicated. (Whenever we feel forced to use dynamic it's a good sign that something may have become overcomplicated.)
My suggestion is just to write the code that you need and not worry if you need to call different methods. See if it actually creates a problem. It probably won't. Don't try to solve the problem until it appears.
If I try to cast an object of type EntityCollection<MyNamespace.Models.MyEntityClass> to ICollection<Object> I get an InvalidCastException.
Okay, according to the docs, EntityCollection<TEntity> implements ICollection<T>, and MyNamespace.Models.MyEntityClass must descend from Object, right? So why on earth does this fail?
FWIW, I'm trying to do this in a method that generally can add and remove items from what might be an EntityCollection or some other IList or ISet. I need to preserve the change tracking behavior of EntityCollection, because the object is to eventually be able to commit the changes if it's an EC.
Edit
Okay, here's some more specifics of what I'm doing. I'm deserializing JSON, and the target object can have properties that are collections--maybe they're EntityCollections, maybe not. For the sake of simplicity, let's say the members of the collection are always subclasses of EntityObject, whether it's an EntityCollection or not (if I understand the responses so far, I'd have no better luck casting to ICollection<EntityObject> than to ICollection<Object>…right?). This is the part where I run into trouble…
foreach (PropertyInfo prop in hasManys)
{
// This is where I get the InvalidCastException...
ICollection<Object> oldHms = (ICollection<Object>)prop.GetValue(parentObj, null);
JEnumerable<JToken> hmIds = links[FormatPropName(prop.Name)].Children();
if (hmIds.Count() == 0)
{
// No members! Clear it out!
oldHms.Clear();
continue; // breaking early!
}
relType = prop.PropertyType.GetGenericArguments()[0];
// Get back the actual entities we'll need to put into the relationship...
List<EntityObject> newHms = new List<EntityObject>();
foreach (JToken jt in hmIds)
{
// ...populate newHms with existing EntityObjects from the context...
}
// first, delete any missing...
/* Got to use ToList() to make a copy, because otherwise missings is
* still connected to the oldHms collection (It's an ObjectQuery)
* and you can't modify oldHms while enumerating missings.
*/
// This cast will fail too, right? Though it's more easily fixable:
IEnumerable<EntityObject> missings = ((ICollection<EntityObject>)oldHms).Except(newHms).ToList();
foreach (EntityObject missing in missings)
{
oldHms.Remove(missing); // One of my mutable collection operations
}
// add new ones
foreach (EntityObject child in newHms)
{
if (!oldHms.Contains(child)) // Skip if already in there
{
oldHms.Add(child); // another mutable collection operation
}
}
}
}
That's a bit simplified, I have special cases for Arrays (implement ICollection, but aren't generics) and other stuff that I took out. Point is, I need to operate Clear, Add, and Remove on the EntityCollection itself--if that's what it is. Maybe there's another way to do this type of synchronization that I'm missing?
Read-write collections cannot be variant.
Take this example:
List<MyClass> list1 = new List<MyClass>();
// assume this would work
ICollection<object> list2 = list1;
list2.Add(new object()); // ooops. We added an object to List<MyClass>!
In principle this kind of casting is only possible for "read-only" interfaces (allowing covariance) or for "write-only" interfaces (allowing contravariance).
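To make the read-only/write-only distinction concrete, here is a small sketch using two variant types from the framework, IEnumerable<out T> and Action<in T>. Animal, Dog and VarianceDemo are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Animal { }
public class Dog : Animal { }

public static class VarianceDemo
{
    // IEnumerable<out T> only produces T, so it is covariant:
    // a sequence of Dog can be read as a sequence of Animal.
    public static IEnumerable<Animal> AsAnimals(List<Dog> dogs) => dogs;

    // Action<in T> only consumes T, so it is contravariant:
    // a handler that accepts any Animal can be used where a Dog handler is expected.
    public static Action<Dog> AsDogHandler(Action<Animal> handler) => handler;

    // ICollection<T> both reads and writes T, so it is invariant.
    // The following line would not compile:
    // public static ICollection<Animal> Broken(List<Dog> dogs) => dogs;
}
```

Both conversions are plain reference conversions with no copying, which is exactly what cannot be offered safely for an interface that has an Add method.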
One "solution" would involve a wrapper class like this:
public class Wrapper<T> : ICollection<object>
{
private readonly ICollection<T> collection;
public Wrapper(ICollection<T> collection)
{
this.collection = collection;
}
public void Add(object item)
{
// maybe check if T is of the desired type
collection.Add((T)item);
}
public void Clear()
{
collection.Clear();
}
public bool Contains(object item)
{
// maybe check if T is of the desired type
return collection.Contains((T)item);
}
public void CopyTo(object[] array, int arrayIndex)
{
// copy element by element into the caller's array
foreach (T item in collection)
{
array[arrayIndex++] = item;
}
}
public int Count
{
get { return collection.Count; }
}
public bool IsReadOnly
{
get { return collection.IsReadOnly; }
}
public bool Remove(object item)
{
// maybe check if T is of the desired type
return collection.Remove((T)item);
}
public IEnumerator<object> GetEnumerator()
{
// yield each element, not the wrapped collection itself
foreach (T item in collection)
{
yield return item;
}
}
System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
{
return collection.GetEnumerator();
}
}
Instead of
EntityCollection<MyNamespace.Models.MyEntityClass> collection = ...;
ICollection<Object> generic = collection ;
you would have to write:
EntityCollection<MyNamespace.Models.MyEntityClass> collection = ...;
ICollection<Object> generic = new Wrapper<MyNamespace.Models.MyEntityClass>(collection);
You can then adjust the wrapper class at the points marked by comments to decide how to deal with type problems.
Since ICollection<T> is not variant, ICollection<MyEntityClass> and ICollection<object> are different types, unrelated to each other.
I'm trying to do this in a method that generally can add and remove
items from what might be an EntityCollection or some other IList or
ISet
So why don't you work with IList? It looks like you don't care about the real type of the items in this method.
I am using Protobuf-net to serialize a custom nested list. I understand that native lists cannot be nested directly, which is why I have used a container object for the inner list. However, I would also like to make my container objects IEnumerable but this means Protobuf-net throws it out with the error:
Nested or jagged lists and arrays are not supported
Here is an example of my list structure which causes the error:
[ProtoContract]
public class MyOuterList<T>
{
[ProtoMember(1)]
readonly List<MyInnerList<T>> nestedData = new List<MyInnerList<T>>();
}
[ProtoContract]
public class MyInnerList<T> : IEnumerable<T>
{
[ProtoMember(1)]
private readonly List<T> data = new List<T>();
}
The fix is to remove IEnumerable from MyInnerList but obviously that prevents it being directly iterable. Is there a sneaky attribute like [ProtobufCustomObjectSoPleaseIgnoreIEnumerable] that could be used?
The best alternative I have come up with so far is to use an Enumerable property as shown below but I fear that the property could still be cast back to a list again. I would prefer to be using GetEnumerator/yield in some way but I can't see how.
[ProtoContract]
public class MyInnerList<T>
{
[ProtoMember(1)]
private readonly List<T> data = new List<T>();
public IEnumerable<T> Data
{
get { return this.data; }
}
}
Is there a sneaky attribute like [ProtobufCustomObjectSoPleaseIgnoreIEnumerable] that could be used?
yup:
[ProtoContract(IgnoreListHandling=true)]
public class MyInnerList<T> : IEnumerable<T>
{
[ProtoMember(1)]
private readonly List<T> data = new List<T>();
}
sneaky is sneaky. IgnoreListHandling has the intellisense documentation:
If specified, do NOT treat this type as a list, even if it looks like one.
Also, due to multiple requests like this one, I plan on looking at implementing support for jagged arrays / lists shortly. The plan is to basically get the runtime to spoof the wrapper with a member (field 1) in the serializer's imagination, so you can use List<List<T>> and it'll work just like your model above (it will even be wire-compatible, since you sensibly chose field 1).
I noticed something strange and there is a possibility I am wrong.
I have an interface IA and class A:
interface IA { .... }
class A : IA { .... }
In another class I have this:
private IList<A> AList;
public IList<IA> TheList {
get { return AList; }
}
But I get a compilation error.
But if I change it to:
public IList<IA> TheList {
get { return AList.ToArray(); }
}
Everything is fine.
Why is that?
Why this doesn't work
private IList<A> AList;
public IList<IA> TheList { get { return AList; } }
Exposing the property as IList<IA> would allow you to try to add class B : IA to the list, but the underlying list is really IList<A>, B is not A, so this would blow up in your face. Thus, it is not allowed.
Why this works:
public IList<IA> TheList { get { return AList.ToArray(); } }
Array variance is broken. You can return the list as an array, and that is legal at compile time, but it will still blow up in your face at runtime if you try an Add operation (or try to replace an object at a given index with something other than an object of type A). A different example of this variance at play:
string[] array = new string[10];
object[] objs = array; // legal
objs[0] = new Foo(); // will bite you at runtime
From comments:
So what you suggest to use? How can I make the property return valid
object? How can I make the return value read only?
If consumers only need to iterate over the sequence and not have random, indexed access to it, you can expose the property as an IEnumerable<IA>.
public IEnumerable<IA> TheList
{
get { return AList.Select(a => a); }
}
(The Select is actually not technically needed, but using this will prevent consumers from being able to cast the result to its true underlying List<> type.) If the consumers decide they want a list or an array, they are free to call ToList() or ToArray() on it, and whatever they do with it (in terms of adding, removing, replacing items) will not affect your list. (Changes to the items' properties would be visible.) Similarly, you could also expose the collection as an IList<IA> yourself in a safe way:
public IList<IA> TheList
{
get { return AList.ToList<IA>(); }
}
Again, this would return a copy of the list, so any changes to it would not affect your underlying list.
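If consumers need indexed access but no copy, another option is the covariant IReadOnlyList<out T> interface (available from .NET 4.5). The sketch below assumes the backing field can be declared as List<A> instead of IList<A>, since IList<T> itself does not extend IReadOnlyList<T>; Holder is a hypothetical class name.

```csharp
using System.Collections.Generic;

public interface IA { }
public class A : IA { }

public class Holder
{
    // Declared as List<A> (not IList<A>) because List<T> implements IReadOnlyList<T>.
    private readonly List<A> aList = new List<A> { new A(), new A() };

    // IReadOnlyList<out T> is covariant, so no copy and no explicit cast is needed.
    public IReadOnlyList<IA> TheList => aList;
}
```

Note that a consumer can still cast the result back to List<A> and modify it; returning aList.AsReadOnly() instead hands out a read-only wrapper that closes that hole while remaining a live view of the list.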
Because native arrays are broken. This code is bad, you shouldn't do it, and the C# designers wish desperately they could undo it.
Arrays are covariant but lists are not.