Why are generic interfaces not co/contravariant by default? - c#

For example IEnumerable<T> interface:
public interface IEnumerable<out T> : IEnumerable
{
IEnumerator<T> GetEnumerator();
}
In this interface the generic type is used only as the return type of an interface method and never as the type of a method argument, so it can be covariant. Given this, can't the compiler theoretically infer the variance from the interface? If it can, why does C# require us to specify the co/contravariance keywords explicitly?
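To make the two annotations concrete, here is a minimal sketch. The interfaces IProducer<out T> and IConsumer<in T> and their implementations are hypothetical names invented for this illustration; only the in/out keywords themselves come from the language.

```csharp
using System;

// Hypothetical illustration types; none of these exist in the .NET Framework.
public interface IProducer<out T>   // T appears only in output positions
{
    T Produce();
}

public interface IConsumer<in T>    // T appears only in input positions
{
    void Consume(T item);
}

public sealed class StringProducer : IProducer<string>
{
    public string Produce() => "hello";
}

public sealed class ObjectConsumer : IConsumer<object>
{
    public object Last { get; private set; } = new object();
    public void Consume(object item) => Last = item;
}

public static class VarianceDemo
{
    public static object ReadCovariantly()
    {
        IProducer<string> strings = new StringProducer();
        IProducer<object> objects = strings; // legal only because of 'out'
        return objects.Produce();
    }

    public static object WriteContravariantly()
    {
        var sink = new ObjectConsumer();
        IConsumer<string> stringSink = sink; // legal only because of 'in'
        stringSink.Consume("hello");
        return sink.Last;
    }
}
```

Removing either keyword turns the corresponding assignment into a compile-time error, which is the behaviour the question asks the compiler to work out on its own.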
Update: As Jon Skeet mentioned, this question can be split into sub-questions:
Can the compiler infer a generic type's co/contravariance from how it is used inside the current generic type and all its base types?
For example, how many generic interface parameters in .NET Framework 4.0 could be marked co/contravariant automatically without any ambiguity? About 70%, 80%, 90% or 100%?
If it can, should it apply co/contravariance to generic types by default, at least to those types for which it is capable of analyzing and inferring co/contravariance from the type usage?

Well, there are two questions here. Firstly, could the compiler always do so? Secondly, should it (if it can)?
For the first question, I'll defer to Eric Lippert, who made this comment when I brought exactly this issue up in the 2nd edition of C# in Depth:
It's not clear to me that we reasonably could even if we wanted to. We can easily come up
with situations that require expensive global analysis of all the interfaces in a program
to work out the variances, and we can easily come up with situations where either it's
<in T, out U> or <out T, in U> and no way to decide between them. With both bad
performance and ambiguous cases it's an unlikely feature.
(I hope Eric doesn't mind me quoting this verbatim; he's previously been very good about sharing such insights, so I'm going by past form :)
On the other hand, I suspect there are still cases where it can be inferred with no ambiguity, so the second point is still relevant...
I don't think it should be automatic even where the compiler can unambiguously know that it's valid in just one way. While expanding an interface is always a breaking change to some extent, it's generally not if you're the only one implementing it. However, if people are relying on your interface to be variant, you may not be able to add methods to it without breaking clients... even if they're just callers, not implementers. The methods you add may change a previously-covariant interface to become invariant, at which point you break any callers who are trying to use it covariantly.
Basically, I think it's fine to require this to be explicit - it's a design decision you should be making consciously, rather than just accidentally ending up with covariance/contravariance without having thought about it.
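A small sketch of the versioning hazard described above. The shelter interfaces and classes are hypothetical; IShelterV1 and IShelterV2 stand for two releases of the same interface, given different names only so that both can live in one file.

```csharp
using System;

public class Animal { }
public class Cat : Animal { }

// "Version 1": T is used only as a return type, so 'out' is legal.
public interface IShelterV1<out T>
{
    T Adopt();
}

// "Version 2": Surrender(T) puts T in an input position, so 'out' has to go.
public interface IShelterV2<T>
{
    T Adopt();
    void Surrender(T animal);
}

public sealed class CatShelter : IShelterV1<Cat>, IShelterV2<Cat>
{
    public Cat Adopt() => new Cat();
    public void Surrender(Cat animal) { }
}

public static class BreakingChangeDemo
{
    public static bool V1WorksCovariantly()
    {
        IShelterV1<Animal> shelter = new CatShelter(); // compiles thanks to 'out'
        return shelter.Adopt() is Cat;
    }

    public static bool V2WorksCovariantly()
    {
        // IShelterV2<Animal> shelter = new CatShelter(); // does not compile
        object shelter = new CatShelter();
        return shelter is IShelterV2<Animal>; // false: the interface is invariant
    }
}
```

A caller written against the first version in the style of V1WorksCovariantly would stop compiling after the upgrade, even though it never implements the interface and never calls the new method.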

This article explains that there are situations that the compiler cannot infer and so it provides you with the explicit syntax:
interface IReadWriteBase<T>
{
IReadWrite<T> ReadWrite();
}
interface IReadWrite<T> : IReadWriteBase<T>
{
}
What do you infer here: in or out? Both work.
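As a sketch of why neither answer is forced, the same pair of interfaces is annotated both ways here. The container classes OutReading and InReading and the Impl<T> classes are hypothetical; they exist only so that both readings can coexist in one file.

```csharp
using System;

// Reading 1: both interfaces covariant.
public static class OutReading
{
    public interface IReadWriteBase<out T> { IReadWrite<T> ReadWrite(); }
    public interface IReadWrite<out T> : IReadWriteBase<T> { }

    public sealed class Impl<T> : IReadWrite<T>
    {
        public IReadWrite<T> ReadWrite() => this;
    }
}

// Reading 2: both interfaces contravariant.
public static class InReading
{
    public interface IReadWriteBase<in T> { IReadWrite<T> ReadWrite(); }
    public interface IReadWrite<in T> : IReadWriteBase<T> { }

    public sealed class Impl<T> : IReadWrite<T>
    {
        public IReadWrite<T> ReadWrite() => this;
    }
}

public static class AmbiguityDemo
{
    public static bool CovariantReadingWorks()
    {
        // string -> object is allowed under the 'out' reading.
        OutReading.IReadWrite<object> rw = new OutReading.Impl<string>();
        return rw.ReadWrite() != null;
    }

    public static bool ContravariantReadingWorks()
    {
        // object -> string is allowed under the 'in' reading.
        InReading.IReadWrite<string> rw = new InReading.Impl<object>();
        return rw.ReadWrite() != null;
    }
}
```

Since both annotations are accepted, a compiler that inferred variance would have to pick one arbitrarily or refuse to pick at all.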

Related

Interface conflict resolution in C#

This is a spin-off question based on Eric Lippert's answer on this question.
I would like to know why the C# language is designed not being able to detect the correct interface member in the following specific case. I am not looking on feedback whether designing a class this way is considered best practice.
class Turtle { }
class Giraffe { }
class Ark : IEnumerable<Turtle>, IEnumerable<Giraffe>
{
public IEnumerator<Turtle> GetEnumerator()
{
yield break;
}
// explicit interface member 'IEnumerable.GetEnumerator'
IEnumerator IEnumerable.GetEnumerator()
{
yield break;
}
// explicit interface member 'IEnumerable<Giraffe>.GetEnumerator'
IEnumerator<Giraffe> IEnumerable<Giraffe>.GetEnumerator()
{
yield break;
}
}
In the code above, Ark has 3 conflicting implementations of GetEnumerator(). This conflict is resolved by treating IEnumerable<Turtle>'s implementation as the default, and requiring specific casts for both others.
Retrieving the enumerators works like a charm:
var ark = new Ark();
var e1 = ((IEnumerable<Turtle>)ark).GetEnumerator(); // turtle
var e2 = ((IEnumerable<Giraffe>)ark).GetEnumerator(); // giraffe
var e3 = ((IEnumerable)ark).GetEnumerator(); // object
// since IEnumerable<Turtle> is the default implementation, we don't need
// a specific cast to be able to get its enumerator
var e4 = ark.GetEnumerator(); // turtle
Why isn't there a similar resolution for LINQ's Select extension method? Is there a proper design decision to allow the inconsistency between resolving the former, but not the latter?
// This is not allowed, but I don't see any reason why ..
// ark.Select(x => x); // turtle expected
// these are allowed
ark.Select<Turtle, Turtle>(x => x);
ark.Select<Giraffe, Giraffe>(x => x);
It's important to first understand what mechanism is being used to resolve the call to the extension method Select. C# uses a generic type inference algorithm which is fairly complex; see the C# specification for the details. (I really should write a blog article explaining it all; I recorded a video about it in 2006 but unfortunately it has disappeared.)
But basically, the idea of generic type inference on Select is: we have:
public static IEnumerable<R> Select<A, R>(
this IEnumerable<A> items,
Func<A, R> projection)
From the call
ark.Select(x => x)
we must deduce what A and R were intended to be.
Since R depends on A, and in fact is equal to A, the problem reduces to finding A. The only information we have is the type of ark. We know that ark:
Is Ark
Extends object
Implements IEnumerable<Giraffe>
Implements IEnumerable<Turtle>
IEnumerable<T> extends IEnumerable and is covariant.
Turtle and Giraffe extend Animal which extends object.
Now, if those are the only things you know, and you know that we're looking for IEnumerable<A>, what conclusions can you reach about A?
There are a number of possibilities:
Choose Animal, or object.
Choose Turtle or Giraffe by some tiebreaker.
Decide that the situation is ambiguous, and give an error.
We can reject the first option. A design principle of C# is: when faced with a choice between options, always choose one of the options or produce an error. C# never says "you gave me a choice between Apple and Cake so I choose Food". It always chooses from the choices you gave it, or it says that it has no basis on which to make a choice.
Moreover, if we chose Animal, that just makes the situation worse. See the exercise at the end of this post.
You propose the second option, and your proposed tiebreaker is "an implicitly implemented interface gets priority over an explicitly implemented interface".
This proposed tiebreaker has some problems, starting with there is no such thing as an implicitly implemented interface. Let's make your situation slightly more complicated:
interface I<T>
{
void M();
void N();
}
class C : I<Turtle>, I<Giraffe>
{
void I<Turtle>.M() {}
public void M() {} // Used for I<Giraffe>.M
void I<Giraffe>.N() {}
public void N() {} // Used for I<Turtle>.N
public static void DoIt<T>(I<T> i) {i.M(); i.N();}
}
When we call C.DoIt(new C()) what happens? Neither interface is "explicitly implemented". Neither interface is "implicitly implemented". Interface members are implicitly or explicitly implemented, not interfaces.
Now we could say "an interface that has all of its members implicitly implemented is an implicitly implemented interface". Does that help? Nope. Because in your example, IEnumerable<Turtle> has one member implicitly implemented and one member explicitly implemented: the overload of GetEnumerator that returns IEnumerator is a member of IEnumerable<Turtle> and you've explicitly implemented it.
(ASIDE: A commenter notes that the above is inelegantly worded; it is not entirely clear from the specification whether members "inherited" from "base" interfaces are "members" of the "derived" interface, or whether it is simply the case that a "derivation" relationship between interfaces is simply the statement of a requirement that any implementor of the "derived" interface must also implement the "base". The specification has historically been unclear on this point and it is possible to make arguments either way. Regardless, my point is that the derived interface requires you to implement a certain set of members, and some of those members can be implicitly implemented and some can be explicitly implemented, and we can count how many there are of each should we choose to.)
So now maybe the proposed tiebreaker is "count the members, and the interface that has the least members explicitly implemented is the winner".
So let's take a step back here and ask the question: how on earth would you document this feature? How would you explain it? Suppose a customer comes to you and says "why are turtles being chosen over giraffes here?" How would you explain it?
Now suppose the customer asks "how can I make a prediction about what the compiler will do when I write the code?" Remember, that customer might not have the source code to Ark; it might be a type in a third-party library. Your proposal makes the invisible-to-users implementation decisions of third parties into relevant factors that control whether other people's code is correct or not. Developers generally are opposed to features that make it impossible for them to understand what their code does, unless there is a corresponding boost in power.
(For example: virtual methods make it impossible to know what your code does, but they are very useful; no one has made the argument that this proposed feature has a similar usefulness bonus.)
Suppose that third party changes a library so that a different number of members are explicitly implemented in a type you depend on. Now what happens? A third party changing whether or not a member is explicitly implemented can cause compilation errors in other people's code.
Even worse, it can not cause a compilation error; imagine a situation in which someone makes a change just in the number of methods that are implicitly implemented, and those methods are not even methods that you call, but that change silently causes a sequence of turtles to become a sequence of giraffes.
Those scenarios are really, really bad. C# was carefully designed to prevent this kind of "brittle base class" failure.
Oh, but it gets worse. Suppose we did like this tiebreaker; could we even implement it reliably?
How can we even tell if a member is explicitly implemented? The metadata in the assembly has a table that lists what class members are explicitly mapped to what interface members, but is that a reliable reflection of what is in the C# source code?
No, it is not! There are situations in which the C# compiler must secretly generate explicitly implemented interfaces on your behalf in order to satisfy the verifier (describing them would be quite off topic). So you cannot actually tell very easily how many interface members the type's implementor decided to implement explicitly.
It gets worse still: suppose the class is not even implemented in C#? Some languages always fill in the explicit interface table, and in fact I think Visual Basic might be one of those languages. So your proposal is to make the type inference rules possibly different for classes authored in VB than an equivalent type authored in C#.
Try explaining that to someone who just ported a class from VB to C# to have an identical public interface, and now their tests stop compiling.
Or, consider it from the perspective of the person implementing class Ark. Suppose that person wishes to express the intention "this type can be used as both a sequence of turtles and giraffes, but if there is an ambiguity, choose turtles". Do you believe that any developer who wished to express that intention would naturally and easily come to the conclusion that the way to do that is to make one of the interfaces more implicitly implemented than the other?
If that were the sort of thing that developers needed to be able to disambiguate, then there should be a well-designed, clear, discoverable feature with those semantics. Something like:
class Ark : default IEnumerable<Turtle>, IEnumerable<Giraffe> ...
for example. That is, the feature should be obvious and searchable, rather than emerging by accident from an unrelated decision about what the public surface area of the type should be.
In short: The number of interface members that are explicitly implemented is not a part of the .NET type system. It's a private implementation strategy decision, not a public surface that the compiler should use to make decisions.
Finally, I've left the most important reason for last. You said:
I am not looking on feedback whether designing a class this way is considered best practice.
But that is an extremely important factor! The rules of C# are not designed to make good decisions about crappy code; they're designed to make crappy code into broken code that does not compile, and that has happened. The system works!
Making a class that implements two different versions of the same generic interface is a terrible idea and you should not do it. Because you should not do it, there is no incentive for the C# compiler team to spend even a minute figuring out how to help you do it better. This code gives you an error message. That is good. It should! That error message is telling you you're doing it wrong, so stop doing it wrong and start doing it right. If it hurts when you do that, stop doing that!
(One can certainly point out that the error message does a poor job of diagnosing the problem; this leads to another whole bunch of subtle design decisions. It was my intention to improve that error message for these scenarios, but the scenarios were too rare to make them a high priority and I did not get to it before I left Microsoft in 2012. Apparently no one else has made it a priority in the years that followed either.)
UPDATE: You ask why a call to ark.GetEnumerator can do the right thing automatically. That is a much easier question. The principle here is a simple one:
Overload resolution chooses the best member that is both accessible and applicable.
"Accessible" means that the caller has access to the member because it is "public enough", and "applicable" means "all the arguments match their formal parameter types".
When you call ark.GetEnumerator() the question is not "which implementation of IEnumerable<T> should I choose"? That's not the question at all. The question is "which GetEnumerator() is both accessible and applicable?"
There is only one, because explicitly implemented interface members are not accessible members of Ark. There is only one accessible member, and it happens to be applicable. One of the sensible rules of C# overload resolution is if there is only one accessible applicable member, choose it!
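The accessibility rule can be seen in isolation with a much smaller type. This is only a sketch; INamed and Ship are hypothetical names, not part of the question's code.

```csharp
using System;

public interface INamed
{
    string Name();
}

public sealed class Ship : INamed
{
    // Public member: accessible through the class type.
    public string Name() => "public";

    // Explicit interface member: not an accessible member of Ship,
    // reachable only through a reference of type INamed.
    string INamed.Name() => "explicit";
}

public static class AccessDemo
{
    // Only one Name() is accessible on Ship, so overload resolution picks it.
    public static string ViaClass() => new Ship().Name();

    // Through the interface, the explicit implementation is the one mapped.
    public static string ViaInterface() => ((INamed)new Ship()).Name();
}
```

ViaClass never has to choose between two candidates, which is exactly why ark.GetEnumerator() compiles without any tiebreaker.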
Exercise: What happens when you cast ark to IEnumerable<Animal>? Make a prediction:
I will get a sequence of turtles
I will get a sequence of giraffes
I will get a sequence of giraffes and turtles
I will get a compile error
I will get something else -- what?
Now try out your prediction and see what really happens. Draw conclusions as to whether it is a good or bad idea to write types that have multiple constructions of the same generic interface.

Adding Class<derived> to List<Class<Interface>> [duplicate]

What is the real reason for that limitation? Is it just work that had to be done? Is it conceptually hard? Is it impossible?
Sure, one couldn't use the type parameters in fields, because they are always read-write. But that can't be the answer, can it?
The reason for this question is that I'm writing an article on variance support in C# 4, and I feel that I should explain why it is restricted to delegates and interfaces. Just to reverse the onus of proof.
Update:
Eric asked about an example.
What about this (don't know if that makes sense, yet :-))
public class Lookup<out T> where T : Animal {
public T Find(string name) {
Animal a = _cache.FindAnimalByName(name);
return a as T;
}
}
var findReptiles = new Lookup<Reptile>();
Lookup<Animal> findAnimals = findReptiles;
The reason for having that in one class could be the cache that is held in the class itself. And please don't name your different type pets the same!
BTW, this brings me to optional type parameters in C# 5.0 :-)
Update 2: I'm not claiming the CLR and C# should allow this. I'm just trying to understand why they don't.
First off, as Tomas says, it is not supported in the CLR.
Second, how would that work? Suppose you have
class C<out T>
{ ... how are you planning on using T in here? ... }
T can only be used in output positions. As you note, the class cannot have any field of type T because the field could be written to. The class cannot have any methods that take a T, because those are logically writes. Suppose you had this feature -- how would you take advantage of it?
This would be useful for immutable classes if we could, say, make it legal to have a readonly field of type T; that way we'd massively cut down on the likelihood that it be improperly written to. But it's quite difficult to come up with other scenarios that permit variance in a typesafe manner.
If you have such a scenario, I'd love to see it. That would be points towards someday getting this implemented in the CLR.
UPDATE: See
Why isn't there generic variance for classes in C# 4.0?
for more on this question.
As far as I know, this feature isn't supported by CLR, so adding this would require significant work on the CLR side as well. I believe that co- and contra-variance for interfaces and delegates was actually supported on CLR before the version 4.0, so this was a relatively straightforward extension to implement.
(Supporting this feature for classes would be definitely useful, though!)
If they were permitted, useful 100% type-safe (no internal typecasts) classes or structures could be defined which were covariant with regard to their type T, if their constructor accepted one or more T's or T supplier's. Useful, 100%-type-safe classes or structures could be defined which were contravariant with respect to T if their constructors accepted one or more T consumers. I'm not sure there's much advantage of a class over an interface, beyond the ability to use "new" rather than using a static factory method (most likely from a class whose name is similar to that of the interface), but I can certainly see usage cases for having immutable structures support covariance.
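For comparison, this is roughly what the interface-plus-static-factory approach mentioned above looks like in the language as it stands. It is a sketch only: IBox<out T>, Box.Of and ImmutableBox<T> are hypothetical names, and Animal/Reptile mirror the classes from the question.

```csharp
using System;

public class Animal { }
public class Reptile : Animal { }

// The variance annotation has to live on an interface.
public interface IBox<out T>
{
    T Value { get; }
}

// Static factory named after the interface, standing in for 'new'.
public static class Box
{
    public static IBox<T> Of<T>(T value) => new ImmutableBox<T>(value);

    // The constructor accepts a T, but no member can write one afterwards,
    // so exposing the object covariantly is type-safe without casts.
    private sealed class ImmutableBox<T> : IBox<T>
    {
        public ImmutableBox(T value) { Value = value; }
        public T Value { get; }
    }
}

public static class BoxDemo
{
    public static bool ReptileBoxIsAnimalBox()
    {
        IBox<Reptile> reptiles = Box.Of(new Reptile());
        IBox<Animal> animals = reptiles; // covariant conversion, same object
        return object.ReferenceEquals(animals.Value, reptiles.Value);
    }
}
```

The cost compared with a hypothetical variant class is the extra interface and factory, which is the trade-off the paragraph above describes.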

Return type IList<T> vs. List<T> [duplicate]

Can anyone explain to me why I would want to use IList over List in C#?
Related question: Why is it considered bad to expose List<T>
If you are exposing your class through a library that others will use, you generally want to expose it via interfaces rather than concrete implementations. This will help if you decide to change the implementation of your class later to use a different concrete class. In that case the users of your library won't need to update their code since the interface doesn't change.
If you are just using it internally, you may not care so much, and using List<T> may be ok.
The less popular answer is that programmers like to pretend their software is going to be re-used the world over, when in fact the majority of projects will be maintained by a small number of people, and however nice interface-related soundbites are, you're deluding yourself.
Architecture Astronauts. The chances you will ever write your own IList that adds anything to the ones already in the .NET framework are so remote that it's theoretical jelly tots reserved for "best practices".
Obviously if you are being asked which you use in an interview, you say IList, smile, and both look pleased at yourselves for being so clever. Or for a public facing API, IList. Hopefully you get my point.
Interface is a promise (or a contract).
As always with promises: the smaller, the better.
Some people say "always use IList<T> instead of List<T>".
They want you to change your method signatures from void Foo(List<T> input) to void Foo(IList<T> input).
These people are wrong.
It's more nuanced than that. If you are returning an IList<T> as part of the public interface to your library, you leave yourself interesting options to perhaps make a custom list in the future. You may not ever need that option, but it's an argument. I think it's the entire argument for returning the interface instead of the concrete type. It's worth mentioning, but in this case it has a serious flaw.
As a minor counterargument, you may find every single caller needs a List<T> anyway, and the calling code is littered with .ToList()
But far more importantly, if you are accepting an IList as a parameter you'd better be careful, because IList<T> and List<T> do not behave the same way. Despite the similarity in name, and despite sharing an interface they do not expose the same contract.
Suppose you have this method:
public void Foo(List<int> a)
{
a.Add(someNumber);
}
A helpful colleague "refactors" the method to accept IList<int>.
Your code is now broken, because int[] implements IList<int>, but is of fixed size. The contract for ICollection<T> (the base of IList<T>) requires the code that uses it to check the IsReadOnly flag before attempting to add or remove items from the collection. The contract for List<T> does not.
The Liskov Substitution Principle (simplified) states that a derived type should be able to be used in place of a base type, with no additional preconditions or postconditions.
This feels like it breaks the Liskov substitution principle.
int[] array = new[] {1, 2, 3};
IList<int> ilist = array;
ilist.Add(4); // throws System.NotSupportedException
ilist.Insert(0, 0); // throws System.NotSupportedException
ilist.Remove(3); // throws System.NotSupportedException
ilist.RemoveAt(0); // throws System.NotSupportedException
But it doesn't. The answer to this is that the example used IList<T>/ICollection<T> wrong. If you use an ICollection<T> you need to check the IsReadOnly flag.
if (!ilist.IsReadOnly)
{
ilist.Add(4);
ilist.Insert(0, 0);
ilist.Remove(3);
ilist.RemoveAt(0);
}
else
{
// what were you planning to do if you were given a read only list anyway?
}
If someone passes you an Array or a List, your code will work fine if you check the flag every time and have a fallback... But really; who does that? Don't you know in advance if your method needs a list that can take additional members; don't you specify that in the method signature? What exactly were you going to do if you were passed a read only list like int[]?
You can substitute a List<T> into code that uses IList<T>/ICollection<T> correctly. You cannot guarantee that you can substitute an IList<T>/ICollection<T> into code that uses List<T>.
There's an appeal to the Single Responsibility Principle / Interface Segregation Principle in a lot of the arguments to use abstractions instead of concrete types - depend on the narrowest possible interface. In most cases, if you are using a List<T> and you think you could use a narrower interface instead - why not IEnumerable<T>? This is often a better fit if you don't need to add items. If you need to add to the collection, use the concrete type, List<T>.
For me IList<T> (and ICollection<T>) is the worst part of the .NET framework. IsReadOnly violates the principle of least surprise. A class, such as Array, which never allows adding, inserting or removing items should not implement an interface with Add, Insert and Remove methods. (see also https://softwareengineering.stackexchange.com/questions/306105/implementing-an-interface-when-you-dont-need-one-of-the-properties)
Is IList<T> a good fit for your organisation? If a colleague asks you to change a method signature to use IList<T> instead of List<T>, ask them how they'd add an element to an IList<T>. If they don't know about IsReadOnly (and most people don't), then don't use IList<T>. Ever.
Note that the IsReadOnly flag comes from ICollection<T>, and indicates whether items can be added or removed from the collection; but just to really confuse things, it does not indicate whether they can be replaced, which in the case of arrays (which return IsReadOnly == true) they can be.
For more on IsReadOnly, see msdn definition of ICollection<T>.IsReadOnly
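The points about IsReadOnly can be checked with a short sketch. TryAdd is a hypothetical helper, not a framework method; the rest uses only ICollection<T> and IList<T> members.

```csharp
using System;
using System.Collections.Generic;

public static class ReadOnlyFlagDemo
{
    // Hypothetical helper: adds only when the collection says it supports it.
    public static bool TryAdd<T>(ICollection<T> target, T item)
    {
        if (target.IsReadOnly) return false;
        target.Add(item);
        return true;
    }

    // An array viewed through ICollection<int> reports itself as read-only.
    public static bool ArrayReportsReadOnly()
    {
        ICollection<int> asCollection = new[] { 1, 2, 3 };
        return asCollection.IsReadOnly;
    }

    // Replacing an element through the indexer still works on that array.
    public static int ReplaceFirstElement()
    {
        int[] array = { 1, 2, 3 };
        IList<int> asList = array;
        asList[0] = 42; // allowed despite IsReadOnly == true
        return array[0];
    }
}
```

With a List<int> the helper adds the item and returns true; with an int[] it returns false instead of throwing NotSupportedException, and the caller still has to decide what to do next.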
List<T> is a specific implementation of IList<T>, which is a container that can be addressed the same way as a linear array T[] using an integer index. When you specify IList<T> as the type of the method's argument, you only specify that you need certain capabilities of the container.
For example, the interface specification does not enforce a specific data structure to be used. The implementation of List<T> happens to have the same performance for accessing, deleting and adding elements as a linear array. However, you could imagine an implementation that is backed by a linked list instead, for which adding elements to the end is cheaper (constant-time) but random-access much more expensive. (Note that the .NET LinkedList<T> does not implement IList<T>.)
This example also tells you that there may be situations when you need to specify the implementation, not the interface, in the argument list: In this example, whenever you require a particular access performance characteristic. This is usually guaranteed for a specific implementation of a container (List<T> documentation: "It implements the IList<T> generic interface using an array whose size is dynamically increased as required.").
Additionally, you might want to consider exposing the least functionality you need. For example, if you don't need to change the content of the list, you should probably consider using IEnumerable<T>, which IList<T> extends.
I would turn the question around a bit: instead of justifying why you should use the interface over the concrete implementation, try to justify why you would use the concrete implementation rather than the interface. If you can't justify it, use the interface.
IList<T> is an interface, so you can inherit from another class and still implement IList<T>, while inheriting from List<T> prevents you from doing so.
For example, if there is a class A and your class B inherits from it, then B can't also inherit from List<T>, but it can implement IList<T>:
class B : A, IList<T> { ... }
public void Foo(IList<Bar> list)
{
// Do Something with the list here.
}
In this case you could pass in any class which implements the IList<Bar> interface. If you used List<Bar> instead, only a List<Bar> instance could be passed in.
The IList<Bar> way is more loosely coupled than the List<Bar> way.
A principle of TDD and OOP generally is programming to an interface not an implementation.
In this specific case since you're essentially talking about a language construct, not a custom one it generally won't matter, but say for example that you found List didn't support something you needed. If you had used IList in the rest of the app you could extend List with your own custom class and still be able to pass that around without refactoring.
The cost to do this is minimal, why not save yourself the headache later? It's what the interface principle is all about.
The most important case for using interfaces over implementations is in the parameters to your API. If your API takes a List parameter, then anyone who uses it has to use List. If the parameter type is IList, then the caller has much more freedom, and can use classes you never heard about, which may not even have existed when your code was written.
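A minimal sketch of that freedom on the caller's side. SumFirstAndLast is a hypothetical method; it only reads through the indexer and Count, so it never runs into the IsReadOnly issue discussed in other answers.

```csharp
using System;
using System.Collections.Generic;

public static class ParameterDemo
{
    // Accepts anything that implements IList<int>, not just List<int>.
    public static int SumFirstAndLast(IList<int> items)
    {
        if (items.Count == 0) return 0;
        return items[0] + items[items.Count - 1];
    }
}
```

The same method can be called with an int[], a List<int>, or a System.Collections.ObjectModel.Collection<int>; had the parameter been List<int>, the array and the Collection<int> would first have to be copied into a new list.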
Surprising that none of these List vs IList questions (or answers) mentions the signature difference. (Which is why I searched for this question on SO!)
So here are the members of List that are not found in IList, at least as of .NET 4.5 (circa 2015)
AddRange
AsReadOnly
BinarySearch
Capacity
ConvertAll
Exists
Find
FindAll
FindIndex
FindLast
FindLastIndex
ForEach
GetRange
InsertRange
LastIndexOf
RemoveAll
RemoveRange
Reverse
Sort
ToArray
TrimExcess
TrueForAll
What if .NET 5.0 replaces System.Collections.Generic.List<T> with System.Collections.Generic.LinearList<T>? .NET always owns the name List<T>, but they guarantee that IList<T> is a contract. So IMHO we (at least I) are not supposed to use someone's name (though it is .NET in this case) and get into trouble later.
When using IList<T>, the caller is always guaranteed that things will work, and the implementer is free to change the underlying collection to any alternative concrete implementation of IList.
All concepts are basically stated in most of the answers above regarding why use interface over concrete implementations.
IList<T> defines those methods (not including extension methods)
IList<T> MSDN link
Add
Clear
Contains
CopyTo
GetEnumerator
IndexOf
Insert
Remove
RemoveAt
List<T> implements those nine methods (not including extension methods), on top of that it has about 41 public methods, which weighs in your consideration of which one to use in your application.
List<T> MSDN link
You would because defining an IList or an ICollection would open up for other implementations of your interfaces.
You might want to have an IOrderRepository that defines a collection of orders in either a IList or ICollection. You could then have different kinds of implementations to provide a list of orders as long as they conform to "rules" defined by your IList or ICollection.
IList<> is almost always preferable as per the other poster's advice, however note there is a bug in .NET 3.5 sp 1 when running an IList<> through more than one cycle of serialization / deserialization with the WCF DataContractSerializer.
There is now a SP to fix this bug : KB 971030
The interface ensures that you at least get the methods you are expecting, because you know the definition of the interface, i.e. all the abstract methods that any class implementing the interface must provide. So if someone writes a huge class of their own with several methods besides the ones from the interface for some additional functionality, and those are of no use to you, it's better to hold a reference of a supertype (in this case the interface) and assign the concrete class object to it.
An additional advantage is that your code is safe from any changes to the concrete class, as you are subscribing to only a few of its methods, and those are the ones that are going to be there as long as the concrete class implements the interface you are using. So it is safety for you, and freedom for the coder who is writing the concrete implementation to change or add more functionality to the concrete class.
You can look at this argument from several angles including the one of a purely OO approach which says to program against an Interface not an implementation. With this thought, using IList follows the same principle as passing around and using Interfaces that you define from scratch. I also believe in the scalability and flexibility factors provided by an Interface in general. If a class implementing IList<T> needs to be extended or changed, the consuming code does not have to change; it knows what the IList Interface contract adheres to. However using a concrete implementation and List<T> on a class that changes, could cause the calling code to need to be changed as well. This is because a class adhering to IList<T> guarantees a certain behavior that is not guaranteed by a concrete type using List<T>.
Also having the power to do something like modify the default implementation of List<T> on a class Implementing IList<T> for say the .Add, .Remove or any other IList method gives the developer a lot of flexibility and power, otherwise predefined by List<T>
Typically, a good approach is to use IList in your public facing API (when appropriate, and list semantics are needed), and then List internally to implement the API.  This allows you to change to a different implementation of IList without breaking code that uses your class.
The List class may change in a future version of the .NET Framework, but the interface is not going to change, because an interface is a contract.
Note that, if your API is only going to be used in foreach loops, etc, then you might want to consider just exposing IEnumerable instead.
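As a rough sketch of that advice (the Playlist class and its members are made up for illustration and do not come from any of the answers), the public surface can be typed as IList<T> or IEnumerable<T> while a List<T> stays private:

```csharp
using System.Collections.Generic;

// Hypothetical class, for illustration only.
public class Playlist
{
    // List<T> is an implementation detail that can be swapped later.
    private readonly List<string> _tracks = new List<string>();

    // Callers that need list semantics (indexing, Add, Remove) see only the IList<string> contract.
    public IList<string> Tracks => _tracks;

    // Callers that only iterate can be given the narrower IEnumerable<string>.
    public IEnumerable<string> TracksInOrder => _tracks;
}
```

Replacing the private field with another IList<string> implementation would not change the signatures that calling code compiles against.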

Should I make C# private methods generic?

I have an API which includes some methods that take in a type parameter. I am now converting them to generic methods as part of improving the API (more type-safe), while keeping the non-generic version around for backward compatibility.
Current - to be obsoleted:
public object MyMethod(object value, Type expectedType)
New:
public T MyMethod<T>(object value)
However, MyMethod calls a private helper method which also takes in a type parameter:
private object HelperMethod(object value, Type expectedType)
Question: should I also make this private helper method generic?
I have my own answer below, but I would like to know if I'm missing something. I appreciate your insights very much.
My answer is no, I should not make this private method generic.
Reason 1: this helper method is private, so even if I made it generic, it doesn't improve the API.
Reason 2: If I make it generic, then the non-generic public methods will have to use reflection to pass the type parameter to this generic method, which means more overhead.
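A minimal sketch of this layout, assuming a helper that simply converts the value (the question does not show the helper's body, so Convert.ChangeType is only a stand-in, and the ValueReader class name is hypothetical):

```csharp
using System;

// Hypothetical containing class, for illustration only.
public class ValueReader
{
    // Old API, kept for backward compatibility.
    [Obsolete("Use MyMethod<T>(object) instead.")]
    public object MyMethod(object value, Type expectedType)
        => HelperMethod(value, expectedType);

    // New generic API: passes typeof(T) down and casts the result back to T.
    public T MyMethod<T>(object value)
        => (T)HelperMethod(value, typeof(T));

    // Non-generic private helper shared by both public methods.
    // Assumed body: the real helper is not shown in the question.
    private object HelperMethod(object value, Type expectedType)
        => Convert.ChangeType(value, expectedType);
}
```

With this shape neither public method needs reflection; the generic one pays only for a typeof(T) and a cast.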
Making your private helper method generic does improve the API by carrying the generic type-specificity all the way through your implementation. Your implementation isn't fully realizing the type safety benefits of generics if you throttle the type down to a core that juggles typeless System.Objects around.
For example, why is the parameter to MyMethod still a System.Object? If that parameter originates in source, chances are good that it should be a type parameter too. If the parameter originates in data, you're probably better off with System.Object. Generics are most useful when they are used with source code, because source code expressions implicitly provide the type in most contexts.
The hidden costs of generics depend on the mix of types that will be used with the API. If most of the types will be value types (built-ins and structs), then switching the API to generics could raise your memory consumption because the generic code must be JIT'd separately for each value type. If the majority of types used with your generic API are reference types (classes and interfaces), code/memory explosion isn't a concern because all reference types share the same JIT'd code.
Whether the cost of having the old API call the new generic API is a) measurable and b) acceptable depends entirely upon what you're doing in your private helper method - in particular, what the private helper method does with the Type parameter.
If the helper method implementation is fairly lightweight and you determine (by performance measurements) that the cost of adapting the old API to call the new generic helper is unacceptable, I would consider duplicating the helper method implementation, side by side, one in the old style and one in the new generic style. This eliminates crossover conversion costs between the API styles and eliminates the risk of introducing new bugs into the old API, at the cost of slightly increased internal code maintenance efforts.
I think it depends on how long you plan to support the non-generic method, how often it will be called, what the impact of the reflection will be, and so on.
I would think you would want the generic functionality as much as you can to take advantage of the type-safety you want. Then retro-fit the non-generic version to use the generic version if necessary, and eventually deprecate it as soon as you can.
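Retrofitting the old method onto a generic helper might look roughly like the sketch below. MethodInfo.MakeGenericMethod is the standard reflection call for closing a generic method over a Type known only at run time; the GenericFirstReader class name and the helper body are assumptions made for illustration.

```csharp
using System;
using System.Reflection;

// Hypothetical containing class, for illustration only.
public class GenericFirstReader
{
    // New generic API calls the generic helper directly.
    public T MyMethod<T>(object value) => HelperMethod<T>(value);

    // Old API: the type arrives as a Type object, so the generic helper
    // has to be closed over it through reflection on every call.
    public object MyMethod(object value, Type expectedType)
    {
        MethodInfo open = typeof(GenericFirstReader).GetMethod(
            nameof(HelperMethod), BindingFlags.Instance | BindingFlags.NonPublic);
        MethodInfo closed = open.MakeGenericMethod(expectedType);
        return closed.Invoke(this, new[] { value });
    }

    // Assumed body: the real helper is not shown in the question.
    private T HelperMethod<T>(object value)
        => (T)Convert.ChangeType(value, typeof(T));
}
```

The reflection lookup and Invoke are the overhead the question worries about; whether it matters is something to measure, and the closed MethodInfo could be cached per type if it does.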

Reflexive type parameter constraints: X<T> where T : X<T> ‒ any simpler alternatives?

Every so often I am making a simple interface more complicated by adding a self-referencing ("reflexive") type parameter constraint to it. For example, I might turn this:
interface ICloneable
{
    ICloneable Clone();
}
class Sheep : ICloneable
{
    public ICloneable Clone() { … }
}        //^^^^^^^^^^
Sheep dolly = new Sheep().Clone() as Sheep;
                                //^^^^^^^^
into:
interface ICloneable<TImpl> where TImpl : ICloneable<TImpl>
{
    TImpl Clone();
}
class Sheep : ICloneable<Sheep>
{
    public Sheep Clone() { … }
}        //^^^^^
Sheep dolly = new Sheep().Clone();
Main advantage: An implementing type (such as Sheep) can now refer to itself instead of its base type, reducing the need for type-casting (as demonstrated by the last line of code).
While this is very nice, I've also noticed that these type parameter constraints are not intuitive and have the tendency to become really difficult to comprehend in more complex scenarios.*)
Question: Does anyone know of another C# code pattern that achieves the same effect or something similar, but in an easier-to-grasp fashion?
*) This code pattern can be unintuitive and hard to understand e.g. in these ways:
The declaration X<T> where T : X<T> appears to be recursive, and one might wonder why the compiler doesn't get stuck in an infinite loop, reasoning, "If T is an X<T>, then X<T> is really an X<X<…<T>…>>." (But constraints obviously don't get resolved like that.)
For implementers, it might not be obvious what type should be specified in place of TImpl. (The constraint will eventually take care of that.)
Once you add more type parameters and subtyping relationships between various generic interfaces to the mix, things get unmanageable fairly quickly.
Main advantage: An implementing type can now refer to itself instead of its base type, reducing the need for type-casting
Though it might seem like by the type constraint referring to itself it forces the implementing type to do the same, that's actually not what it does. People use this pattern to try to express patterns of the form "an override of this method must return the type of the overriding class", but that's not actually the constraint expressed or enforced by the type system. I give an example here:
https://ericlippert.com/2011/02/02/curiouser-and-curiouser/
While this is very nice, I've also noticed that these type parameter constraints are not intuitive and have the tendency to become really difficult to comprehend in more complex scenarios
Yep. I try to avoid this pattern. It's hard to reason about.
Does anyone know of another C# code pattern that achieves the same effect or something similar, but in an easier-to-grasp fashion?
Not in C#, no. You might consider looking at the Haskell type system if this sort of thing interests you; Haskell's "higher types" can represent those sorts of type patterns.
The declaration X<T> where T : X<T> appears to be recursive, and one might wonder why the compiler doesn't get stuck in an infinite loop, reasoning, "If T is an X<T>, then X<T> is really an X<X<…<T>…>>."
The compiler does not ever get into infinite loops when reasoning about such simple relationships. However, nominal subtyping of generic types with contravariance is in general undecidable. There are ways to force the compiler into infinite regresses, and the C# compiler does not detect these and prevent them before embarking on the infinite journey. (Yet. I am hoping to add detection for this in the Roslyn compiler but we'll see.)
See my article on the subject if this interests you. You'll want to read the linked-to paper as well.
https://ericlippert.com/2008/05/07/covariance-and-contravariance-part-11-to-infinity-but-not-beyond/
Unfortunately, there isn't a way to fully prevent this, and a generic ICloneable<T> with no type constraints is enough. Your constraint only limits possible parameters to classes which themselves implement it, which doesn't mean they are the ones currently being implemented.
In other words, if a Cow implements ICloneable<Cow>, you can still easily make Sheep implement ICloneable<Cow>.
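A short sketch of that loophole, reusing the interface from the question (the Clone bodies are placeholders chosen only so the example compiles):

```csharp
public interface ICloneable<TImpl> where TImpl : ICloneable<TImpl>
{
    TImpl Clone();
}

public class Cow : ICloneable<Cow>
{
    public Cow Clone() => new Cow();
}

// Accepted by the compiler: Cow satisfies the constraint on TImpl,
// and nothing requires TImpl to be the implementing class itself.
public class Sheep : ICloneable<Cow>
{
    public Cow Clone() => new Cow();
}
```

So cloning a Sheep here hands back a Cow, and the type system has no objection.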
I would simply use ICloneable<T> without constraints for two reasons:
I seriously doubt you will ever make the mistake of using the wrong type parameter.
Interfaces are meant to be contracts for other parts of the code, not a way to code on autopilot. If a part of the code expects ICloneable<Cow> and you pass a Sheep which can do that, it seems perfectly valid from that point of view.
