System.Collections.Immutable containers, why sealed?

Why are the container classes in System.Collections.Immutable, e.g. ImmutableList<T>, sealed?
I would like to inherit from them, but instead have to go through an ugly and error-prone composition-plus-proxy workaround.
I'm just trying to understand the reason here.
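For illustration, here is a minimal sketch of that composition-plus-proxy workaround (the TaggedList name and members are hypothetical, and only a few of ImmutableList<T>'s many members are forwarded):

using System.Collections;
using System.Collections.Generic;
using System.Collections.Immutable;

// Since ImmutableList<T> is sealed, adding state or behaviour means
// wrapping it and forwarding every member by hand (a few shown here).
public class TaggedList<T> : IEnumerable<T>
{
    private readonly ImmutableList<T> _inner;

    public TaggedList(ImmutableList<T> inner, string tag)
    {
        _inner = inner;
        Tag = tag;
    }

    public string Tag { get; }
    public int Count => _inner.Count;
    public T this[int index] => _inner[index];

    // Every "mutating" operation must re-wrap the new immutable instance.
    public TaggedList<T> Add(T item) => new TaggedList<T>(_inner.Add(item), Tag);

    public IEnumerator<T> GetEnumerator() => _inner.GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}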

All types should be sealed unless they are specifically and carefully designed for extension. Designing for extension is difficult and expensive and easy to do wrong.
Moreover: there are security and correctness implications when you use a type that allows extension. By making the type sealed, the authors of the type are telling the consumers of that type "if you receive an instance of this type, you can rely on the fact that you are actually getting the type that was written by Microsoft, tested by Microsoft, and had the source code published by Microsoft". You can write tests and have confidence that the runtime behaviour will match the test time behaviour because no one else is capable of making their own crazy extension that has a bug.
The question is backwards. You should never ask for a reason for a type to be sealed; sealed should have been the default in the language. Rather, we need a reason to unseal a type: because it was designed for extension, because it was implemented by professionals who carefully understood all the implications of extension, and because consumers of the type were willing to take on the risks of not knowing what the code they're calling actually does.

Related

Is object casting an inevitability of reality when there is a need to design modular architecture?

It is commonly said that object casting is a bad practice and should be avoided; for instance, the question Why should casting be avoided? has received some answers with great arguments:
By Jerry Coffin:
Looking at things more generally, the situation's pretty simple (at least IMO): a cast (obviously enough) means you're converting something from one type to another. When/if you do that, it raises the question "Why?" If you really want something to be a particular type, why didn't you define it to be that type to start with? That's not to say there's never a reason to do such a conversion, but anytime it happens, it should prompt the question of whether you could re-design the code so the correct type was used throughout.
By Eric Lippert:
Both kinds of casts are red flags. The first kind of cast raises the question "why exactly is it that the developer knows something that the compiler doesn't?" If you are in that situation then the better thing to do is usually to change the program so that the compiler does have a handle on reality. Then you don't need the cast; the analysis is done at compile time.
The second kind of cast raises the question "why isn't the operation being done in the target data type in the first place?" If you need a result in ints then why are you holding a double in the first place? Shouldn't you be holding an int?
Moving on to my question: recently I have started to look into the source code of the well-known open source project AutoFixture, originally developed by Mark Seemann, which I really appreciate.
One of the main components of the library is the ISpecimenBuilder interface, which defines a rather abstract method:
object Create(object request, ISpecimenContext context);
As you can see, the request parameter is typed as object, so it accepts completely different types. Different implementations of the interface handle different requests based on their runtime type, checking whether a request is something they are capable of dealing with and otherwise returning some kind of no-response representation.
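To make the described pattern concrete, here is a minimal sketch of such a builder (the supporting types are redeclared inline so the sketch is self-contained; in AutoFixture they come from the library, with NoSpecimen as its no-response representation):

using System;

// Simplified stand-ins for the AutoFixture contracts described above.
public interface ISpecimenContext { object Resolve(object request); }
public interface ISpecimenBuilder { object Create(object request, ISpecimenContext context); }
public class NoSpecimen { }

// A builder that only handles requests for System.Guid: it inspects
// the runtime type of 'request' and declines everything else.
public class GuidGenerator : ISpecimenBuilder
{
    public object Create(object request, ISpecimenContext context)
    {
        if (request is Type type && type == typeof(Guid))
            return Guid.NewGuid();

        // Not something this builder is capable of dealing with.
        return new NoSpecimen();
    }
}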
It seems that the design of the interface does not adhere to the "good practice" that object casting should be used sparingly.
I have been wondering whether there is a better way to design this contract that avoids all the casting, but I couldn't find one.
Obviously the object parameter could be replaced with some marker interface, but that would not save us from the casting problem. I have also considered using some variation of the visitor pattern, as described here, but it does not seem very scalable: the visitor would need dozens of different methods, since there are so many implementations of the interface, each capable of dealing with different types of requests.
Although I basically agree with the arguments against using casting as part of a good design, in this specific scenario it seems to be not only the best option but the only realistic one.
To sum up: are object casting and very general contracts an inevitability when there is a need to design a modular and extensible architecture?
I don't think that I can answer this question generally, for any type of application or framework, but I can offer an answer that specifically talks about AutoFixture, as well as offer some speculation about other usage scenarios.
If I had to write AutoFixture from scratch today, there are certainly things I'd do differently. Particularly, I wouldn't design the day-to-day API around something like ISpecimenBuilder. Rather, I'd design the data manipulation API around the concept of functors and monads, as outlined here.
This design is based entirely on generics, but it does require statically typed building blocks (also described in the article) known at compile time.
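As a rough, hypothetical sketch of that idea in C# (the names are illustrative, and this is far simpler than the design in the article): a generator is a statically typed wrapper around a function, and mapping over it never requires object or a cast.

using System;

public sealed class Generator<T>
{
    private readonly Func<Random, T> _generate;

    public Generator(Func<Random, T> generate) => _generate = generate;

    public T Generate(Random random) => _generate(random);

    // Functor map: derive a Generator<TResult> from a Generator<T>.
    public Generator<TResult> Select<TResult>(Func<T, TResult> selector) =>
        new Generator<TResult>(r => selector(_generate(r)));
}

public static class Generators
{
    public static readonly Generator<int> Int =
        new Generator<int>(r => r.Next());

    // Everything is known at compile time; no casts anywhere.
    public static readonly Generator<string> Digits =
        Int.Select(i => i.ToString());
}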
This is closely related to how something like QuickCheck works. When you write QuickCheck-based tests, you must supply generators for all of your own custom types. Haskell doesn't support run-time casting of values, but instead relies exclusively on generics and some compile-time automation. Granted, Haskell's generics are more powerful than C#'s, so you can't necessarily transfer the knowledge gained from Haskell to C#. It does suggest, however, that it's possible to write code entirely without relying on run-time casting.
AutoFixture does, however, support user-defined types without the need for the user to write custom generators. It does this via .NET Reflection. In .NET, the Reflection API is untyped; all the methods for generating objects and invoking members take object as input and return object as output.
Any application, library, or framework based on Reflection will have to perform some run-time casting. I don't see how to get around that.
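A minimal sketch (not AutoFixture's actual code) of what Reflection-based generation looks like; note that every Reflection API involved traffics in object, so the casts are forced by the platform:

using System;
using System.Linq;

public static class ReflectionSpecimens
{
    // Recursively builds an instance of 'type' via its greediest
    // public constructor. ConstructorInfo.Invoke takes object[]
    // and returns object -- there is no typed alternative.
    public static object CreateAnonymous(Type type)
    {
        var ctor = type.GetConstructors()
            .OrderByDescending(c => c.GetParameters().Length)
            .First();

        object[] args = ctor.GetParameters()
            .Select(p => p.ParameterType == typeof(int) ? (object)42
                       : p.ParameterType == typeof(string) ? "anonymous"
                       : CreateAnonymous(p.ParameterType))
            .ToArray();

        return ctor.Invoke(args);
    }
}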
Would it be possible to write data generators without Reflection? I haven't tried the following, but perhaps one could adopt a strategy where one writes 'the code' for a data generator directly in IL and uses Reflection.Emit to dynamically compile an in-memory assembly that contains the generators.
This is a bit like how the Hiro container works, IIRC. I suppose that one could design other types of general-purpose frameworks around this concept, but I rarely see it done in .NET.
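I haven't benchmarked this, but as a hypothetical illustration of that strategy: compile a tiny generator once with Reflection.Emit, then invoke it as an ordinary delegate with no per-call reflection cost (this sketch handles only reference types with a parameterless constructor):

using System;
using System.Reflection.Emit;

public static class EmittedGenerators
{
    public static Func<object> CompileGenerator(Type type)
    {
        var ctor = type.GetConstructor(Type.EmptyTypes)
            ?? throw new InvalidOperationException("Needs a parameterless constructor.");

        var method = new DynamicMethod("Create_" + type.Name, typeof(object), Type.EmptyTypes);
        var il = method.GetILGenerator();
        il.Emit(OpCodes.Newobj, ctor); // push a new instance (a reference, so already valid as object)
        il.Emit(OpCodes.Ret);          // return it

        return (Func<object>)method.CreateDelegate(typeof(Func<object>));
    }
}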

If attributes are only constructed when they are reflected into, why are attribute constructors so limited?

As shown here, attribute constructors are not called until you reflect to get the attribute values. However, as you may also know, you can only pass compile-time constant values to attribute constructors. Why is this? I think many people would much prefer to do something like this:
[MyAttribute(new MyClass(foo, bar, baz, jQuery))]
than passing a string (which also causes stringly typed code!) with those values turned into strings, and then relying on a Regex to recover the values instead of just using the actual values - and relying on exceptions thrown somewhere that has nothing to do with the class (except that a method it called uses some attributes that were typed wrong) instead of compile-time warnings and errors.
What limitation caused this?
Attributes are part of metadata. You need to be able to reflect on metadata in an assembly without running code in that assembly.
Imagine for example that you are writing a compiler that needs to read attributes from an assembly in order to compile some source code. Do you really want the code in the referenced assembly to be loaded and executed? Do you want to put a requirement on compiler writers that they write compilers that can run arbitrary code in referenced assemblies during the compilation? Code that might crash, or go into infinite loops, or contact databases that the developer doesn't have permission to talk to? The number of awful scenarios is huge and we eliminate all of them by requiring that attributes be dead simple.
The issue is with the constructor arguments. They need to come from somewhere, they are not supplied by code that consumes the attribute. They must be supplied by the Reflection plumbing when it creates the attribute object by calling its constructor. For which it needs the constructor argument values.
This starts at compile time with the compiler parsing the attribute and recording the constructor arguments. It stores those argument values in the assembly metadata in a binary format. The issue then is that the runtime needs a highly standardized way to deserialize those values, one that preferably doesn't depend on any of the .NET classes that you'd normally use to de/serialize data, because there's no guarantee that such classes are actually available at runtime; they won't be in a heavily trimmed version of .NET like the Micro Framework.
Even something as common as binary serialization with the BinaryFormatter class is troublesome, note how it requires the [Serializable] attribute on the class to allow it to do its job. Versioning would also be an enormous problem, clearly such a serializer class could never change for the risk of breaking attributes in old assemblies.
This is a rock and a hard place, solved by the CLS designers by heavily restricting the allowed types for an attribute constructor. They didn't leave much: just the simple value types, string, a simple one-dimensional array of them, and Type. Never a problem deserializing them, since their binary representation is simple and unambiguous. Quite a restriction, but attributes can still be pretty expressive. The ultimate fallback is to use a string and decode it in the constructor at runtime. Creating an object of MyClass isn't an issue; you can do so in the attribute constructor. You'll have to encode the arguments that this constructor needs as properties of the attribute, however.
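As a hypothetical sketch of that fallback (MyAttribute and MyClass here are illustrative): the attribute constructor receives only constants, and the richer object is built on demand once the attribute itself has been instantiated through Reflection.

using System;

public class MyClass
{
    public MyClass(string foo, int bar) { Foo = foo; Bar = bar; }
    public string Foo { get; }
    public int Bar { get; }
}

[AttributeUsage(AttributeTargets.Class)]
public sealed class MyAttribute : Attribute
{
    // Only compile-time constants are allowed at the usage site...
    public MyAttribute(string foo, int bar) { Foo = foo; Bar = bar; }

    public string Foo { get; }
    public int Bar { get; }

    // ...but nothing stops the attribute from building the real object
    // once it has itself been constructed by the Reflection plumbing.
    public MyClass CreateValue() => new MyClass(Foo, Bar);
}

[My("hello", 42)]
public class Annotated { }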
Probably the most correct answer as to why you can only use constants for attributes is that the C#/BCL design team did not judge supporting anything else important enough to be added (i.e. not worth the effort).
When you build, the C# compiler serializes the attributes you have placed in your code (more precisely, their constructor arguments) into the generated assembly. It was probably more important to ensure that attributes can be retrieved quickly and reliably than to support more complex scenarios.
Also, code that fails because some attribute property value is wrong is much easier to debug than some framework-internal deserialization error. Consider what would happen if the class definition for MyClass was defined in an external assembly - you compile and embed one version, then update the class definition for MyClass and run your application: boom!
On the other hand, it's seriously frustrating that DateTime instances are not constants.
What limitation caused this?
The reason it isn't possible to do what you describe is probably not any technical limitation; it's purely a language design decision. Basically, when designing the language they said "this should be possible, but not that". If they had really wanted this to be possible, the "limitations" would have been dealt with and it would be possible. I don't know the specific reasoning behind this decision, though.
/.../ passing a string (causing stringly typed code too!) with those values, turned into strings, and then relying on Regex to try and get the value instead of just using the actual value /.../
I have been in similar situations. I sometimes wanted to use attributes with lambda expressions to implement something in a functional way. But after all, C# is not a functional language, and when I wrote the code in a non-functional way I no longer needed such attributes.
In short, I think of it like this: if I want to develop in a functional way, I should use a functional language like F#. Since I'm using C# and doing things in a non-functional way, I don't need such attributes.
Perhaps you should simply reconsider your design and not use the attributes like you currently do.
UPDATE 1:
I claimed C# is not a functional language, but that is a subjective view and there is no rigorous definition of "functional language". I agree with Adam Wright's "/.../ As such, I wouldn't class C# as functional in general discussion - it is, at best, multi-paradigm, with some functional flavour." at Why is C# a functional programming language?
UPDATE 2:
I found this post by Jon Skeet: https://stackoverflow.com/a/294259/1105687 It regards not allowing generic attribute types, but the reasoning could be similar in this case:
Answer from Eric Lippert (paraphrased): no particular reason, except
to avoid complexity in both the language and compiler for a use case
which doesn't add much value.

Why cannot C# generics derive from one of the generic type parameters like they can in C++ templates? [duplicate]

This question already has answers here:
Inheritance on a constrained generic type parameter
(3 answers)
Closed 9 years ago.
Why cannot C# generics derive from one of the generic type parameters like they can in C++ templates? I know it is impossible because the CLR does not support it, but why?
I am aware of the profound differences between C++ templates and C# generics - the former are compile time entities and must be resolved during the compilation, while the latter are first class run-time entities.
Still, I fail to see why the CLR designers did not come up with a scheme that would ultimately enable a CLR generic type to derive from one of its generic type parameters. After all, this would be a tremendously useful feature; I personally miss it greatly.
EDIT:
I would like to know of a hard-core issue whose resolution carries such a high implementation price that it justifies the feature not being implemented yet. For instance, examine this fictional declaration:
class C<T> : T
{
}
As Eric Lippert has noted, "What if T is a struct? What if T is a sealed class type? What if T is an interface type? What if T is C?! What if T is a class derived from C? What if T is an abstract type with an abstract method? What if T has less accessibility than C? What if T is System.ValueType? (Can you have a non-struct which inherits from System.ValueType?) What about System.Delegate, System.Enum, and so on?"
As Eric continues, "Those are the easy, obvious ones". Indeed, he is right. I am interested in a concrete example of an issue that is neither easy nor obvious and is hard to resolve.
Well, start by asking yourself what could possibly go wrong with class C<T> : T { }. A huge number of things come immediately to mind:
What if T is a struct? What if T is a sealed class type? What if T is an interface type? What if T is C<T>?! What if T is a class derived from C<T>? What if T is an abstract type with an abstract method? What if T has less accessibility than C? What if T is System.ValueType? (Can you have a non-struct which inherits from System.ValueType?) What about System.Delegate, System.Enum, and so on?
Those are the easy, obvious ones. The proposed feature opens up literally hundreds, if not thousands of more subtle questions about the interaction between the type and its base type, all of which would have to be carefully specified, implemented and tested. We'd undoubtedly miss some, and thereby cause breaking changes in the future, or saddle the runtime with implementation-defined behaviour.
The costs would be enormous, so the benefit had better be enormous. I'm not seeing an enormous benefit here.
OK, if you didn't like my previous answer, then let's take a different tack.
Your question presupposes a falsehood: that we need a reason to not implement a feature. On the contrary, we need a very, very good reason to implement any feature. Features are enormously expensive in their up-front costs, in their maintenance costs, and in the opportunity costs. (That is, the time you spend on feature X is time you cannot spend on doing feature Y, and which might prevent you from ever doing feature Z.) In order to responsibly deliver value to our customers and stakeholders, we cannot implement every feature that someone happens to like.
It's not up to the runtime designers to justify why they did not implement a feature that you find particularly nice. Features are prioritized based on their costs vs the benefit to users, and users have not exactly been hammering down my door demanding this kind of inheritance. This particular feature would massively change how analysis of the type system works in the runtime, have far-reaching effects on every language that consumes generics, and seems to me to provide very little benefit.
We use this sort of inheritance in the compiler -- written in C++ -- and the resulting code is difficult to follow, hard to maintain, and confusing to debug. I've been doing my best to gradually eliminate code like this. I'm opposed to enabling the same sort of bad patterns in C# unless there is an enormously compelling benefit to doing so.
The task of describing that enormous benefit in a compelling way is laid upon the people who want the feature, not upon the people who would have to implement it. So what's the compelling benefit?
Example of code, where this could help:
public class SpecialDataRow<T> : T where T : DataRow
{
    public int SpecialFactor { get; set; }
}
This would enable making 'special' rows from DataRow and also from any derived DataRow (such as the ones generated for typed datasets).
I do not see any other way to write such a class (a composition-based workaround is sketched below).
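For comparison, a sketch of the usual composition workaround (names are illustrative): it compiles today, but the wrapper is not itself a DataRow, so it cannot be handed to any API that expects one - which is exactly what makes the proposed feature attractive.

using System.Data;

// Composition instead of 'class SpecialDataRow<T> : T': the special
// state travels alongside the row rather than on a derived type.
public class SpecialDataRow<T> where T : DataRow
{
    public SpecialDataRow(T row) => Row = row;

    public T Row { get; }
    public int SpecialFactor { get; set; }
}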
What would be so useful about this?
Remember that despite the name, generics were never intended to support generic programming.
To support a feature like this, they'd have to make some pretty dramatic changes to the CLR.
You'd need to define a class that derives from a type that doesn't even exist at compile-time.
Why should they jump through such hoops and pretty fundamentally compromise their type system just to add this feature? Is it worth it?
If you think so, tell them why. Write feedback on connect.microsoft.com telling them why this feature is so fundamental that it must be added.
C++ templates cannot be compared to C# generics. C++ templates are pre-processed like macros, while generics in .NET are handled by the runtime.
But there are other people who know a lot more about that than me...

Dependency injection in .Net without virtual method calls?

I've been thinking about whether it's possible to apply the DI pattern without incurring the cost of virtual method calls (which according to experiments I did can be up to 4 times slower than non-virtual calls). The first idea I had was to do dependency injection via generics:
sealed class ComponentA<TComponentB, TComponentC> : IComponentA
    where TComponentB : IComponentB
    where TComponentC : IComponentC
{ ... }
Unfortunately the CLR still does method calls via the interfaces even when concrete implementations of TComponentB and TComponentC are specified as generic type parameters and all of the classes are declared as sealed. The only way to get the CLR to do non-virtual calls is by changing all of the classes to structs (which implement the interfaces). Using struct doesn't really make sense for DI though and makes the issue below even more unsolvable.
The second issue with the above solution is that it can't handle circular references. I can't think of any way, either in C# code or by constructing expression trees, to handle circular references, because that would entail infinitely recursing generic types. (.NET does support a generic type referencing itself, but that doesn't seem to generalize to this case.) Since only structs can cause the CLR to bypass the interfaces, I don't think this problem is solvable at all, because circular references between structs would cause a paradox.
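To make the struct observation above concrete, here is a minimal sketch (with illustrative names) of the variant that does get devirtualized: the JIT compiles a separate specialization of the generic class for each value-type argument, so the constrained interface call resolves statically.

using System;

public interface IComponentB { int Compute(int x); }

// A struct implementation; no boxing occurs under the struct constraint.
public struct FastB : IComponentB
{
    public int Compute(int x) => x * 2;
}

public sealed class ComponentA<TComponentB>
    where TComponentB : struct, IComponentB
{
    private TComponentB _b;

    public ComponentA(TComponentB b) => _b = b;

    // In ComponentA<FastB> this compiles to a direct (often inlined)
    // call rather than a virtual dispatch through the interface.
    public int Run(int x) => _b.Compute(x);
}

// Usage: new ComponentA<FastB>(new FastB()).Run(21) returns 42.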
There's only one other solution I can think of and it's guaranteed to work - emit all of the classes from scratch at runtime, maybe basing them on compiled classes as templates. Not really an ideal solution though.
Anyone have better ideas?
Edit: In regards to most of the comments, I guess I should say that this is filed under "pure intellectual curiosity". I debated whether to ask this, because I realize that I don't have any concrete case in which it's necessary. I was just thinking about it for fun and wondering whether anyone else had come across this before.
Typical example of trying to completely over-engineer something, in my opinion. Just don't compromise your design because you can save a few tens of milliseconds - if it is even that.
Are you seriously suggesting that because of the callvirt instructions, your app ends up being so significantly slower that users (those people you write the app for) will notice any difference - at all? I doubt that very much.
This blog post explains why you can't optimize the virtual call.
While a callvirt instruction does take longer, it is usually emitted because it provides a cheap null check for the CLR prior to making the call to the method. A callvirt shouldn't take significantly longer than a call instruction, especially considering the null check.
Have you found that you could significantly improve the performance of your application by creating types (either structs or classes with static methods) that allow you to guarantee that the C# compiler will emit call instructions rather than callvirt instructions?
The reason I ask is that I am wondering if you are going to create an unmaintainable code base that is brittle and hard to use simply to solve a problem that may or may not exist.

Implementing safe duck-typing in C#

After looking at how Go handles interfaces and liking it, I started thinking about how you could achieve similar duck-typing in C# like this:
var mallard = new Mallard(); // doesn't implement IDuck but has the right methods
IDuck duck = DuckTyper.Adapt<Mallard,IDuck>(mallard);
The DuckTyper.Adapt method would use System.Reflection.Emit to build an adapter on the fly. Maybe somebody has already written something like this. I guess it's not too different from what mocking frameworks already do.
However, this would throw exceptions at run-time if Mallard doesn't actually have the right IDuck methods. To get the error earlier at compile time, I'd have to write a MallardToDuckAdapter which is exactly what I'm trying to avoid.
Is there a better way?
edit: apparently the proper term for what I call "safe duck-typing" is structural typing.
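For reference, here is a run-time (not compile-time-safe) sketch of such an adapter built on System.Reflection.DispatchProxy instead of raw Reflection.Emit; it forwards each interface call via reflection, so it is slower than an emitted adapter, and all the names are illustrative:

using System;
using System.Reflection;

public interface IDuck { string Quack(); }
public class Mallard { public string Quack() => "quack"; } // no IDuck in sight

public class DuckProxy<TInterface> : DispatchProxy
{
    private object _target;

    public static TInterface Adapt(object target)
    {
        TInterface proxy = Create<TInterface, DuckProxy<TInterface>>();
        ((DuckProxy<TInterface>)(object)proxy)._target = target;
        return proxy;
    }

    protected override object Invoke(MethodInfo targetMethod, object[] args)
    {
        // Find a structurally matching method on the wrapped object;
        // a missing method surfaces here -- at run time, not compile time.
        Type[] parameterTypes = Array.ConvertAll(
            targetMethod.GetParameters(), p => p.ParameterType);
        MethodInfo method = _target.GetType().GetMethod(targetMethod.Name, parameterTypes)
            ?? throw new MissingMethodException(_target.GetType().Name, targetMethod.Name);
        return method.Invoke(_target, args);
    }
}

// Usage: IDuck duck = DuckProxy<IDuck>.Adapt(new Mallard());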
How can you know if a cow walks like a duck and quacks like a duck if you don't have a living, breathing cow in front of you?
Duck-typing is a concept used at run-time. A similar concept at compile-time is structural typing which is AFAIK not supported by the CLR. (The CLR is centred around nominative typing.)
[A structural type system] contrasts with nominative systems, where comparisons are based on explicit declarations or the names of the types, and duck typing, in which only the part of the structure accessed at runtime is checked for compatibility.
The usual way to ensure that duck-typing throws no exceptions at run time is unit tests.
DuckTyping for C#
Reflection.Emit is used to emit IL that directly calls the original object
I don't think this library will give you compile-time errors though; I am not sure that would be entirely feasible. Use unit tests to help compensate for that.
I don't think there's another way in which you would get a compile time error.
However, this is something that Unit Testing is great for. You would write a unit test to verify that
DuckTyper.Adapt<Mallard, IDuck>(mallard);
successfully maps.
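A hypothetical sketch of such a test (xUnit syntax, assuming the Mallard, IDuck, and DuckTyper types from the question):

using Xunit;

public class DuckTyperTests
{
    [Fact]
    public void Mallard_can_be_adapted_to_IDuck()
    {
        var mallard = new Mallard();

        // Fails at test time if Mallard stops structurally matching IDuck,
        // rather than at some arbitrary point at run time.
        IDuck duck = DuckTyper.Adapt<Mallard, IDuck>(mallard);

        Assert.NotNull(duck);
    }
}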
I know that implicit interfaces (which is what Go interfaces are) were planned for VB 10 (no idea about C#). Unfortunately, they were scrapped before release (I think they didn’t even make it into beta …). It would be nice to see whether they will make an appearance in a future version of .NET.
Of course, the new dynamic types can be used to achieve much the same but this is still not the same – implicit interfaces still allow strong typing, which I find important.
