I am modeling a system to evaluate expressions. The operands in these expressions can be one of several types, including some primitive .NET types. When defining my Expression class, I want some degree of type safety and therefore don't want to use 'object' as the operand type, so I am considering defining an empty abstract Operand base class and creating a subclass for each type of operand. What do you think of this?
Also, only some types of operands make sense with others. And finally, only some operators make sense with particular operands. I can't really think of a way to implement these rules at compile-time so I'm thinking I'll have to do these checks at runtime.
Any ideas on how I might be able to do this better?
I'm not sure if C based languages have this, however Java has several packages that would really make sense for this.
JavaCC (the Java Compiler Compiler) allows you to define a language (your expressions, for example) and then build the corresponding Java classes. A somewhat more user-friendly, if more experimental and academic, package is DemeterJ. It lets you specify the expression language very easily and comes with a library for defining visitors and strategies that operate over the generated class structure. If you could afford to switch to Java I might try that. Otherwise I'd look for a C# clone of one of these technologies.
Another thing to consider if you go down this route is that once you've generated your class structure to some reasonable approximation of the end result, you can subclass all of the generated classes and build all of your application-specific logic into the subclasses. That way, if you really need to regenerate a new model for the expression language, your logic will be relatively independent of your class hierarchy.
Update: Actually it looks as though some of this stuff has been ported to .NET, though I haven't used it so I'm not sure what shape it may be in:
http://www.ccs.neu.edu/home/lieber/inside-impl.html
good luck!
How about the Expression class (System.Linq.Expressions) in .NET 3.5? I recently wrote an expression parser/compiler using it.
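For what it's worth, here is a minimal sketch of what that can look like, assuming "Expression" refers to the expression tree API in System.Linq.Expressions. The class and method names (ExpressionDemo, BuildMultiplyAddOne, AddIsRejected) are made up for illustration. It also shows that the factory methods check operand types when the tree is built, which covers part of the "only some operators make sense with particular operands" concern, although still at runtime.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper class; the names are illustrative only.
public static class ExpressionDemo
{
    // Builds (x, y) => x * y + 1 as an expression tree and compiles it to a delegate.
    public static Func<int, int, int> BuildMultiplyAddOne()
    {
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        ParameterExpression y = Expression.Parameter(typeof(int), "y");
        Expression body = Expression.Add(Expression.Multiply(x, y), Expression.Constant(1));
        return Expression.Lambda<Func<int, int, int>>(body, x, y).Compile();
    }

    // Expression.Add validates its operands while the tree is being built:
    // there is no + operator for int and string, so the call throws.
    public static bool AddIsRejected()
    {
        try
        {
            Expression.Add(Expression.Constant(1), Expression.Constant("a"));
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

Calling ExpressionDemo.BuildMultiplyAddOne()(3, 4) evaluates 3 * 4 + 1.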
I've recently built a dynamic expression evaluator. What I found to be effective was to create, as you suggested, a BaseOperand with meaningful derived classes (NumericOperand, StringOperand, DateOperand, etc.). Depending on your implementation, generics may make sense as well (Operand<T>).
Through the implementation of the Visitor pattern, you can perform any kind of validation you like.
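To make that concrete, here is a rough sketch of the idea, not the code from my evaluator. All of the names (Operand, IOperandVisitor, SubtractionValidator) and the sample rule "subtraction only applies to numbers" are assumptions made for the example.

```csharp
using System;

// Hypothetical visitor contract: one overload per operand type.
public interface IOperandVisitor<TResult>
{
    TResult Visit(NumericOperand operand);
    TResult Visit(StringOperand operand);
}

public abstract class Operand
{
    // Double dispatch: each subclass calls the overload matching its own type.
    public abstract TResult Accept<TResult>(IOperandVisitor<TResult> visitor);
}

public sealed class NumericOperand : Operand
{
    public double Value { get; }
    public NumericOperand(double value) { Value = value; }
    public override TResult Accept<TResult>(IOperandVisitor<TResult> visitor) => visitor.Visit(this);
}

public sealed class StringOperand : Operand
{
    public string Value { get; }
    public StringOperand(string value) { Value = value; }
    public override TResult Accept<TResult>(IOperandVisitor<TResult> visitor) => visitor.Visit(this);
}

// Sample runtime rule (an assumption): subtraction is only valid for numeric operands.
public sealed class SubtractionValidator : IOperandVisitor<bool>
{
    public bool Visit(NumericOperand operand) => true;
    public bool Visit(StringOperand operand) => false;

    public bool CanSubtract(Operand left, Operand right) =>
        left.Accept(this) && right.Accept(this);
}
```

Adding a new operand type means adding one Visit overload, and the compiler then flags every validator that has not been updated, which recovers a little compile-time help even though the rules themselves are checked at runtime.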
I had a very specific need to roll my own solution, but there are many options already available for processing expressions. You may want to take a look at some of these for inspiration or to avoid reinventing the wheel.
I found a good approach to handling the types of objects in the EXPRESSIONOASIS framework. It uses a custom data structure to carry the types of the objects. After parsing an operand with regular expressions against the given expression, it decides the type and stores it as a property of a generic class, which can be queried at any time to get the type.
http://code.google.com/p/expressionoasis/
Preamble:
I'm working on implementing a system where a developer can specify an object of type T which has N properties.
What I'm trying to accomplish:
I'd like to be able to provide more concrete control over how these properties are set and retrieved. This is difficult because of the limitless number of configurations possible.
Solutions I'm pursuing:
Is it possible to set getters dynamically?
Either through attributes, or by generating the methods after the fact during construction, like:
foreach (var property in typeof(T).GetProperties())
{
    // dynamically generate a getter method which returns this property.
}
Possibly wrap type T in a container object which has the type as one of its properties. Then set up some type of special getter for that.
Why I want to do this:
Currently all the types for T get converted based on Typecode. Allowing people using this library to easily parse in values from various sources (databases, text files, app configs, etc.) as properties of type T. The point being to deliver these properties as type safe values rather than magic strings.
This is the impetus for wanting to use reflection for this but I could see numerous other applications for this I would imagine.
I could just say "Hey make all of your types nullable." so that it would be easy to determine which properties have been set in type T. I'd like to abstract away even that though.
The other option for this would be "Make sure you understand that certain types have default values. Be certain you're ready to accept those values, or set default values of your own (including making it nullable and setting it to null if so desired)." Essentially trusting this to the developer rather than abstracting it. <==== This is what I'm currently doing.
Being able to generate methods dynamically, especially getters and setters, via reflection or a combination of reflection and C# Actions would be incredibly valuable. Any insight or ideas would be greatly welcome: either ways of accomplishing the solutions I'm pursuing, or another idea which achieves the same ends.
I don't believe you can set accessor methods on properties of static types. Another challenge that I think you will have to deal with will be your goal to provide type safety - you see, if your types are built in run-time, compile-time checks will not work, which means you'll have to rely on dynamic a lot - this is slow. But not all is lost. You have at least two options:
The quick and dirty way
You can generate proxy classes that inherit from your concrete types. These will give you the ability to intercept method calls to base members and do pretty much anything you please. See Castle DynamicProxy.
The hard but proper way (actually first option is using this behind the scenes)
You're looking at the System.Reflection.Emit namespace (ILGenerator and friends). This is a step above LINQ expression trees in the sense that you can generate your own assembly and types, all programmatically. This, however, is incredibly complex and error prone. I'd suggest you try option one first and only generate your own IL if you absolutely must.
UPD I know you didn't want magic strings, and I guess this is a somewhat less conventional solution, but also check out CSharpCodeProvider in the CodeDOM namespace.
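If all you need is fast, generated getters rather than new types, expression trees may already be enough. The following is only a sketch under that assumption: GetterFactory and SampleSettings are hypothetical names, and it does not intercept or change the properties themselves, it just builds one compiled delegate per property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical helper; not part of any library.
public static class GetterFactory
{
    // Builds one compiled getter per public instance property of T that has a public getter.
    public static Dictionary<string, Func<T, object>> BuildGetters<T>()
    {
        var getters = new Dictionary<string, Func<T, object>>();
        foreach (PropertyInfo property in typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            // Skip write-only properties and indexers.
            if (property.GetGetMethod() == null || property.GetIndexParameters().Length > 0) continue;

            ParameterExpression instance = Expression.Parameter(typeof(T), "instance");
            // Convert to object (boxing value types) so every getter shares one signature.
            Expression body = Expression.Convert(Expression.Property(instance, property), typeof(object));
            getters[property.Name] = Expression.Lambda<Func<T, object>>(body, instance).Compile();
        }
        return getters;
    }
}

// Hypothetical settings type used only for the usage example.
public sealed class SampleSettings
{
    public int Port { get; set; }
    public string Host { get; set; }
}
```

For example, GetterFactory.BuildGetters<SampleSettings>()["Port"] returns a delegate that reads Port from any SampleSettings instance. The reflection cost is paid once when the dictionary is built, not on every read.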
It is common to read that object casting is a bad practice and should be avoided. For instance, the question "Why should casting be avoided?" has received some answers with great arguments:
By Jerry Coffin:
Looking at things more generally, the situation's pretty simple (at
least IMO): a cast (obviously enough) means you're converting
something from one type to another. When/if you do that, it raises the
question "Why?" If you really want something to be a particular type,
why didn't you define it to be that type to start with? That's not to
say there's never a reason to do such a conversion, but anytime it
happens, it should prompt the question of whether you could re-design
the code so the correct type was used throughout.
By Eric Lippert:
Both kinds of casts are red flags. The first kind of cast
raises the question "why exactly is it that the developer knows
something that the compiler doesn't?" If you are in that situation
then the better thing to do is usually to change the program so that
the compiler does have a handle on reality. Then you don't need the
cast; the analysis is done at compile time.
The second kind of cast raises the question "why isn't the operation
being done in the target data type in the first place?" If you need
a result in ints then why are you holding a double in the first
place? Shouldn't you be holding an int?
Moving on to my question: recently I have started to look into the source code of the well-known open source project AutoFixture, originally developed by Mark Seemann, which I really appreciate.
One of the main components of the library is the interface ISpecimenBuilder, which defines a rather abstract method:
object Create(object request, ISpecimenContext context);
As you can see, the request parameter's type is object, so it accepts completely different types. Different implementations of the interface treat requests according to their runtime type, checking whether it is something they are capable of dealing with and otherwise returning some kind of "no response" representation.
It seems that the design of the interface does not adhere to the "good practice" that object casting should be used sparingly.
I wondered whether there is a better way to design this contract that avoids all the casting, but I couldn't find any solution.
Obviously the object parameter could be replaced with some marker interface, but that would not save us from the casting problem. I also considered some variation of the visitor pattern as described here, but it does not seem very scalable: the visitor would need dozens of different methods, since there are so many implementations of the interface that are capable of dealing with different types of requests.
Although I basically agree with the arguments against using casting as part of a good design, in this specific scenario it seems to be not only the best option but also the only realistic one.
To sum up, are object casting and very general contracts an inevitable reality when there is a need to design a modular and extensible architecture?
I don't think that I can answer this question generally, for any type of application or framework, but I can offer an answer that specifically talks about AutoFixture, as well as offer some speculation about other usage scenarios.
If I had to write AutoFixture from scratch today, there's certainly things I'd do differently. Particularly, I wouldn't design the day-to-day API around something like ISpecimenBuilder. Rather, I'd design the data manipulation API around the concept of functors and monads, as outlined here.
This design is based entirely on generics, but it does require statically typed building blocks (also described in the article) known at compile time.
This is closely related to how something like QuickCheck works. When you write QuickCheck-based tests, you must supply generators for all of your own custom types. Haskell doesn't support run-time casting of values, but instead relies exclusively on generics and some compile-time automation. Granted, Haskell's generics are more powerful than C#'s, so you can't necessarily transfer the knowledge gained from Haskell to C#. It does suggest, however, that it's possible to write code entirely without relying on run-time casting.
AutoFixture does, however, support user-defined types without the need for the user to write custom generators. It does this via .NET Reflection. In .NET, the Reflection API is untyped; all the methods for generating objects and invoking members take object as input and return object as output.
Any application, library, or framework based on Reflection will have to perform some run-time casting. I don't see how to get around that.
Would it be possible to write data generators without Reflection? I haven't tried the following, but perhaps one could adopt a strategy where one would write 'the code' for a data generator directly in IL and use Reflection emit to dynamically compile an in-memory assembly that contains the generators.
This is a bit like how the Hiro container works, IIRC. I suppose that one could design other types of general-purpose frameworks around this concept, but I rarely see it done in .NET.
As shown here, attribute constructors are not called until you reflect to get the attribute values. However, as you may also know, you can only pass compile-time constant values to attribute constructors. Why is this? I think many people would much prefer to do something like this:
[MyAttribute(new MyClass(foo, bar, baz, jQuery))]
than passing a string (causing stringly typed code too!) with those values turned into strings, and then relying on Regex to recover each value instead of just using the actual value. That approach also trades compile-time warnings and errors for exceptions that might be thrown somewhere that has nothing to do with the class, except that a method it called uses some attributes that were typed wrong.
What limitation caused this?
Attributes are part of metadata. You need to be able to reflect on metadata in an assembly without running code in that assembly.
Imagine for example that you are writing a compiler that needs to read attributes from an assembly in order to compile some source code. Do you really want the code in the referenced assembly to be loaded and executed? Do you want to put a requirement on compiler writers that they write compilers that can run arbitrary code in referenced assemblies during the compilation? Code that might crash, or go into infinite loops, or contact databases that the developer doesn't have permission to talk to? The number of awful scenarios is huge and we eliminate all of them by requiring that attributes be dead simple.
The issue is with the constructor arguments. They need to come from somewhere, they are not supplied by code that consumes the attribute. They must be supplied by the Reflection plumbing when it creates the attribute object by calling its constructor. For which it needs the constructor argument values.
This starts at compile time with the compiler parsing the attribute and recording the constructor arguments. It stores those argument values in the assembly metadata in a binary format. At issue then is that the runtime needs a highly standardized way to deserialize those values, one that preferably doesn't depend on any of the .NET classes that you'd normally use to de/serialize data, because there's no guarantee that such classes are actually available at runtime. They won't be in a very trimmed version of .NET like the Micro Framework.
Even something as common as binary serialization with the BinaryFormatter class is troublesome, note how it requires the [Serializable] attribute on the class to allow it to do its job. Versioning would also be an enormous problem, clearly such a serializer class could never change for the risk of breaking attributes in old assemblies.
This is a rock and a hard place, solved by the CLS designers by heavily restricting the allowed types for an attribute constructor. They didn't leave much: just the simple value types, string, a simple one-dimensional array of them, and Type. There is never a problem deserializing them since their binary representation is simple and unambiguous. Quite a restriction, but attributes can still be pretty expressive. The ultimate fallback is to use a string and decode that string in the constructor at runtime. Creating an object of MyClass isn't an issue; you can do so in the attribute constructor. You'll have to encode the arguments that this constructor needs, however, as properties of the attribute.
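As a rough illustration of that last point, the sketch below passes only a string and an int through the attribute and builds the MyClass object inside the attribute constructor. The shape of MyClass, its two parameters, and the Annotated and AttributeDemo types are assumptions made up for the example; for brevity the values are passed as constructor arguments, and settable attribute properties would work the same way.

```csharp
using System;

// Hypothetical payload type standing in for MyClass from the question.
public sealed class MyClass
{
    public string Name { get; }
    public int Count { get; }
    public MyClass(string name, int count) { Name = name; Count = count; }
}

[AttributeUsage(AttributeTargets.Class)]
public sealed class MyAttribute : Attribute
{
    public MyClass Value { get; }

    // Only simple values (string, int) are stored in the assembly metadata.
    // The complex object is built here, when reflection instantiates the attribute.
    public MyAttribute(string name, int count)
    {
        Value = new MyClass(name, count);
    }
}

[My("widget", 3)]
public sealed class Annotated { }

public static class AttributeDemo
{
    // The attribute constructor runs during this reflection call, not before.
    public static MyClass Read()
    {
        var attribute = (MyAttribute)Attribute.GetCustomAttribute(typeof(Annotated), typeof(MyAttribute));
        return attribute.Value;
    }
}
```

AttributeDemo.Read() returns a MyClass built from the two recorded values, so the consuming code gets a real object even though only constants were written at the usage site.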
Probably the most correct answer as to why you can only use constants for attributes is that the C#/BCL design team did not judge supporting anything else important enough to be added (i.e. not worth the effort).
When you build, the C# compiler serializes the arguments of the attributes you have placed in your code so that they can be stored in the generated assembly. It was probably more important to ensure that attributes can be retrieved quickly and reliably than it was to support more complex scenarios.
Also, code that fails because some attribute property value is wrong is much easier to debug than some framework-internal deserialization error. Consider what would happen if the class definition for MyClass was defined in an external assembly - you compile and embed one version, then update the class definition for MyClass and run your application: boom!
On the other hand, it's seriously frustrating that DateTime instances are not constants.
What limitation caused this?
The reason it isn't possible to do what you describe is probably not caused by any limitation, but it's purely a language design decision. Basically, when designing the language they said "this should be possible but not this". If they really wanted this to be possible, the "limitations" would have been dealt with and this would be possible. I don't know the specific reasoning behind this decision though.
/.../ passing a string (causing stringly typed code too!) with those values, turned into strings, and then relying on Regex to try and get the value instead of just using the actual value /.../
I have been in similar situations. I sometimes wanted to use attributes with lambda expressions to implement something in a functional way. But after all, C# is not a functional language, and when I wrote the code in a non-functional way I no longer needed such attributes.
In short, I think like this: If I want to develop this in a functional way, I should use a functional language like f#. Now I use c# and I do it in a non-functional way, and then I don't need such attributes.
Perhaps you should simply reconsider your design and not use the attributes like you currently do.
UPDATE 1:
I claimed C# is not a functional language, but that is a subjective view and there is no rigorous definition of "functional language". I agree with Adam Wright, "/.../ As such, I wouldn't class C# as functional in general discussion - it is, at best, multi-paradigm, with some functional flavour." at Why is C# a functional programming language?
UPDATE 2:
I found this post by Jon Skeet: https://stackoverflow.com/a/294259/1105687 It regards not allowing generic attribute types, but the reasoning could be similar in this case:
Answer from Eric Lippert (paraphrased): no particular reason, except
to avoid complexity in both the language and compiler for a use case
which doesn't add much value.
After looking at how Go handles interfaces and liking it, I started thinking about how you could achieve similar duck-typing in C# like this:
var mallard = new Mallard(); // doesn't implement IDuck but has the right methods
IDuck duck = DuckTyper.Adapt<Mallard,IDuck>(mallard);
The DuckTyper.Adapt method would use System.Reflection.Emit to build an adapter on the fly. Maybe somebody has already written something like this. I guess it's not too different from what mocking frameworks already do.
However, this would throw exceptions at run-time if Mallard doesn't actually have the right IDuck methods. To get the error earlier at compile time, I'd have to write a MallardToDuckAdapter which is exactly what I'm trying to avoid.
Is there a better way?
edit: apparently the proper term for what I call "safe duck-typing" is structural typing.
How can you know if a cow walks like a duck and quacks like a duck if you don't have a living, breathing cow in front of you?
Duck-typing is a concept used at run-time. A similar concept at compile-time is structural typing which is AFAIK not supported by the CLR. (The CLR is centred around nominative typing.)
[A structural type system] contrasts with nominative systems, where comparisons are based on explicit declarations or the names of the types, and duck typing, in which only the part of the structure accessed at runtime is checked for compatibility.
The usual way to ensure that duck-typing throws no exception at run-time is unit tests.
DuckTyping for C#
Reflection.Emit is used to emit IL that directly calls the original object
I don't think this library will give you compile-time errors though; I am not sure that would be entirely feasible. Use unit tests to help compensate for that.
I don't think there's another way in which you would get a compile time error.
However, this is something that Unit Testing is great for. You would write a unit test to verify that
DuckTyper.Adapt<Mallard, IDuck>(mallard);
successfully maps.
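Since DuckTyper is hypothetical, here is a sketch of the kind of check such a unit test could assert on, using plain reflection instead. The method shapes on IDuck and Mallard, the Cow type, and StructuralCheck.Satisfies are all assumptions made up for the example. It only looks at methods declared directly on the interface and at public instance methods on the source type.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical interface; the member shapes are assumed for the example.
public interface IDuck
{
    void Quack();
    int Walk(int steps);
}

// Has the right methods but does not declare IDuck.
public sealed class Mallard
{
    public void Quack() { }
    public int Walk(int steps) => steps;
}

// Has none of the IDuck methods.
public sealed class Cow
{
    public void Moo() { }
}

public static class StructuralCheck
{
    // True when TSource has a public instance method for every method of TInterface,
    // looked up by name and parameter types and compared by return type.
    public static bool Satisfies<TSource, TInterface>()
    {
        return typeof(TInterface).GetMethods().All(required =>
        {
            Type[] parameterTypes = required.GetParameters().Select(p => p.ParameterType).ToArray();
            MethodInfo candidate = typeof(TSource).GetMethod(
                required.Name, BindingFlags.Public | BindingFlags.Instance, null, parameterTypes, null);
            return candidate != null && candidate.ReturnType == required.ReturnType;
        });
    }
}
```

A test asserting StructuralCheck.Satisfies<Mallard, IDuck>() then fails in the build pipeline as soon as Mallard drifts away from IDuck, which is about as early as you can get without real structural typing.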
I know that implicit interfaces (which is what Go interfaces are) were planned for VB 10 (no idea about C#). Unfortunately, they were scrapped before release (I think they didn’t even make it into beta …). It would be nice to see whether they will make an appearance in a future version of .NET.
Of course, the new dynamic types can be used to achieve much the same but this is still not the same – implicit interfaces still allow strong typing, which I find important.
I'm just checking out anonymous methods (in c#)--part of me likes the flexibility and short-hand, but I'm also concerned that it may make the code harder to read.
It also occurred to me that this construct seems to go against some of the o/o paradigm. Do you consider anonymous methods to be in-line with object oriented principles?
Lambdas (anonymous methods) come from the functional paradigm. That doesn't mean they are good or bad! If they fit the problem then use them; if they don't, don't. OOP is not a goal, good code is the goal. I hate it when people try to force a single paradigm down your throat, as in Java for example. C# is going in the right direction (IMHO), so it is becoming a multiparadigm language.
If you'd like to think of them with respect to Object Oriented design, they're merely syntactic sugar for some anonymous class which contains a method which gets invoked. In fact, Java does it with the longer winded final class. C# chose the shorter method. Both are valid and well within the bounds of Object Oriented design.
Lambda expressions are also no less Object Oriented than delegates. IMHO, lambda expressions fall into an almost entirely orthogonal study of programming from OOP: functional versus procedural.
So, use the right tool for the job be it lambdas, delegates, anonymous classes, objects, monads, etc ad nauseam. Your goal should be to have the right code to solve the right problem.
It doesn't make any sense to me to speak of anonymous functions being "object-oriented" or not "object-oriented." Are variables object-oriented? How about loops? Are exceptions object-oriented?
The label is not a useful thing to apply in this case.
If you think that in some particular case, using an anonymous function to accomplish something makes it harder to read, then don't use one.
Interestingly, the C# implementation of anonymous methods sometimes requires the creation of objects due to "closures". Read about it here: http://blogs.msdn.com/oldnewthing/archive/2006/08/02/686456.aspx
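A small sketch of what that means in practice (ClosureDemo and MakeCounter are made-up names): the local variable below has to outlive the method that declared it, so the compiler moves it into a generated object that the returned delegate holds on to.

```csharp
using System;

// Hypothetical demo class; the names are illustrative only.
public static class ClosureDemo
{
    // 'count' is captured by the lambda, so the compiler hoists it into a
    // generated class instance; each call to MakeCounter gets its own instance.
    public static Func<int> MakeCounter()
    {
        int count = 0;
        return () => ++count;
    }
}
```

Two counters created by separate MakeCounter calls keep independent state, which is exactly the behaviour you would get from writing a small class with a private field and an Increment method by hand.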