DSL: from DSL rules into C# expressions


The question may be composite, so let me expand it:
Is there a designer (stub/framework/meta-designer) for creating AND/OR-based rules from the public bool properties of .NET objects? The output could be saved as any DSL/Boo/... format.
Is it possible to compile the DSL output into C# expressions?
Our main problem is the gap between the documentation and the code. Our product is based on hundreds of user-defined rules, and we want to speed up change requests.
If we can give the users a simple designer and grab its output, then after translating/compiling it into C#/IL code we would have a fast change-request cycle.
I know that our problem is too specific, but any "bricks in the wall" are welcome!
Example:
A C# class, subject of :
public class TestA
{
public bool B {...}
public bool C {...}
}
In the designer, we should be able to create
any type of graphical designer (e.g. a dropdown to select public properties)
Output in DSL:
If TestA.B AND TestA.C Then Return True;
Output in C#:
if (testA.B && testA.C) { return true; }
Update #1
I would be happy with a DSL that supports statically typed .NET classes. I mean, if the users can check the code themselves ("Output in DSL" in the example), we don't need the designer.
Update #2
Based on the tip, I started with expression trees. After a few days I ran into DLinq. I was never a big fan of DLinq, but in this case it fits the problem domain very well.
Easy to parse (A > 2 AND B < 4) OR C = 5 into expression trees
Easy to create expressions like that
Very easy to serialize/deserialize
GUI based on FlowLayoutPanel works fine as "expression builder"

You could build something like this yourself.
You can get a list of all the public members of a class using Type.GetMembers(), or just its public properties using Type.GetProperties().
However, instead of generating C# code, I would use expression trees.
That way you don't need to involve the C# compiler when the users change rules. Instead, you can store the rules in a database, load them at runtime, and then use the Expression.Compile() method to create a delegate you can invoke to run the code.
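As a rough illustration of that idea, here is a minimal sketch that builds the rule "B AND C" from the question as an expression tree and compiles it into a delegate. It assumes TestA uses plain auto-properties (the question elides the property bodies), and RuleBuilder and BuildAndRule are hypothetical names used only for this example. In a real system the property names would come from the stored rule rather than from literals.

```csharp
using System;
using System.Linq.Expressions;

// Assumed shape of the class from the question (property bodies are elided there)
public class TestA
{
    public bool B { get; set; }
    public bool C { get; set; }
}

// Hypothetical helper, not part of any library
public static class RuleBuilder
{
    // Builds the rule "a.<first> AND a.<second>" from two bool property names
    public static Func<TestA, bool> BuildAndRule(string first, string second)
    {
        ParameterExpression a = Expression.Parameter(typeof(TestA), "a");
        Expression body = Expression.AndAlso(
            Expression.Property(a, first),
            Expression.Property(a, second));
        // Compile() turns the tree into an ordinary delegate at runtime
        return Expression.Lambda<Func<TestA, bool>>(body, a).Compile();
    }
}
```

Calling RuleBuilder.BuildAndRule("B", "C") yields a Func<TestA, bool> that can be cached and invoked like any other delegate, so the C# compiler is not needed when a rule changes.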
Update:
In the comments someone asked "What is the difference between Expression Trees and domain specific languages?"
Here's the answer:
Expression trees and domain specific languages are orthogonal things.
Expression trees are just an API for representing C# expressions that can conveniently be converted into a delegate dynamically at runtime.
A DSL, or domain specific language, is a programming language designed to solve a narrow class of problems.
They are, essentially, completely different things.
You can use expression trees as part of a DSL implementation if you like. LINQ uses them for that purpose.
In your case, however, you don't need a DSL. What you need is a user interface that generates rules (similar to the way Outlook works), and then a way of executing those rules.
Creating the UI is just normal UI development.
Expression trees are what you can use to implement the rules.

It's a little-known fact that the designer for Windows Workflow Foundation and its Rules Engine in particular can be hosted in a Windows Forms application separate from Visual Studio. The rules authored in this way can similarly be evaluated independent of an actual workflow.
See WF Scenarios Guidance: Workflow Designer Re-Hosting and Tutorial: Hosting the WF Designer.

Related

Where does the LINQ query syntax come from?

I'm new to C# and have just started to delve into using classes that essentially mirror a database. What I'm confused about is how I'm able to use lines like
var queryLondonCustomers = from cust in customers
where cust.City == "London"
select cust;
in my program. From what I understand, that syntax isn't "normal" C#, so the above line wouldn't have any meaning if I hadn't included System.Linq. What seems to be happening is that we've added to the C# language in the context of this file.
Maybe I'm completely wrong. Could someone clear this up for me? I come from a C++ background, so maybe I'd understand if someone could show me the C++ equivalent of this concept.
And if I'm right, why is this way of doing things preferable to having a C# class that talks to the database by using strings that are database queries, like with PHP and MySQL? I thought this MVC way of talking to the database was supposed to provide an abstraction for me to use a C# class for database operations, but really this is just taking database language and adding it to the C# language in the context of a particular C# file. I can't see any point of that. I'm not trolling, just trying to understand the spirit of this whole ASP.NET MVC thing that is the most confusing thing I've learned so far.

From what I understand, that syntax isn't "normal" C#
Yes it is, as of C# 3.
so the above line wouldn't have any meaning if I hadn't included System.Linq
Yes it would. It would still have effectively been transformed by the compiler into:
var queryLondonCustomers = customers.Where(cust => cust.City == "London");
(The lack of a Select call is because you're selecting the range variable directly, rather than some projection of it.)
If that code would have compiled (e.g. because of a Where member in customers, or due to another extension method on its type) then so would the query expression.
Query expressions are specified in section 7.16 of the C# language specification.
As for the question of why you'd want to do this, well:
Using an ORM instead of just manual SQL is hardly new - but LINQ integrates it into the language, with a somewhat leaky abstraction
LINQ doesn't just work for databases; I primarily use it in "regular" collections such as lists etc.

in my program. From what I understand, that syntax isn't "normal" C#, so the above line wouldn't have any meaning if I hadn't included System.Linq
Yes and no at the same time :-)
The LINQ syntax is standard C# syntax (from C# 3), but it is resolved at compile time as a semi-textual substitution...
Your code is changed to:
var queryLondonCustomers = customers.Where(cust => cust.City == "London");
and then the various .Where and .Select methods are resolved (it is called Duck Typing... see for example Does LINQ "Query Syntax" Support Duck Typing?)
So at this point you need the using System.Linq that gives you access to the System.Linq.Enumerable and System.Linq.Queryable that are the two classes that implement all the various .Where and .Select methods as extension methods.
Note that you could create and implement a static class of yours, public static class MyLinqMethods, and by creating methods with the "right" signature in that class, you could use the LINQ syntax against your MyLinqMethods class.
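To make that concrete, here is a minimal sketch in which query syntax compiles without using System.Linq at all. Customer, CustomerCollection and Demo are hypothetical types invented for this example; the only thing the compiler needs is some accessible Where method with a suitable signature.

```csharp
using System;
using System.Collections.Generic;

public class Customer
{
    public string City { get; set; }
}

// Hypothetical collection type; it deliberately does not implement IEnumerable<T>
public class CustomerCollection
{
    private readonly List<Customer> items = new List<Customer>();
    public void Add(Customer c) { items.Add(c); }
    public int Count { get { return items.Count; } }
    public IEnumerable<Customer> Items { get { return items; } }
}

public static class MyLinqMethods
{
    // An extension method with the "right" signature for the where clause
    public static CustomerCollection Where(this CustomerCollection source, Func<Customer, bool> predicate)
    {
        var result = new CustomerCollection();
        foreach (Customer c in source.Items)
        {
            if (predicate(c)) result.Add(c);
        }
        return result;
    }
}

public static class Demo
{
    public static CustomerCollection LondonCustomers(CustomerCollection customers)
    {
        // No "using System.Linq" here: the compiler rewrites this query to
        // customers.Where(cust => cust.City == "London") and binds it to MyLinqMethods.Where
        return from cust in customers
               where cust.City == "London"
               select cust;
    }
}
```

No Select method is needed in this sketch because the query selects the range variable directly after a where clause, so the compiler drops the trivial projection.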
And if I'm right, why is this way of doing things preferable to having a C# class that talks to the database by using strings that are database queries
There is some safety in using LINQ... If you created somewhere some classes that mapped the database tables, then the C# compiler could check against these classes that you are using the right names for the fields. If you wrote
var queryLondonCustomers = from cust in customers
where cust.CityERROR == "London"
select cust;
the compiler would give you an error, because CityERROR isn't a field/property of Customer. Clearly you could have an error in the "mapping" files, but at least you have a single place that can have these errors.

From what I understand, that syntax isn't "normal" C#,
Yes it is.
so the above line wouldn't have any meaning if I hadn't included System.Linq
It would. It would always mean the same thing as:
var queryLondonCustomers = customers.Where(cust => cust.City == "London");
C# doesn't care how customers.Where(Func<Customer, bool>) is defined, just as long as it is. System.Linq has extension methods that define Where for IEnumerable<T> and IQueryable<T>, which covers 99.9% of the cases where you want this, but it doesn't have to come from there.
In particular if customers was an instance of a class that had its own Where(Func<Customer, bool>) method then it would be the overload used (instance methods always beat extension methods in overload resolution). Likewise if another static class defined an extension method for ... Where(this CustomerCollection, Func<Customer, bool>) or similar it would be called.
And if I'm right, why is this way of doing things preferable to having a C# class that talks to the database by using strings that are database queries
Querying collection-like objects is a very common use case, of which database access is only one. Providing a common interface to a common use case is a classic reason for any interface-based programming.

ASP.NET MVC is just a way of creating web applications, like you would create Windows Forms or WPF projects to create desktop applications. It doesn't have any special capabilities regarding database interaction.
LINQ, on the other hand, is something quite unique. It provides a convenient way of working with collections. The data in these collections can come from databases, but it doesn't have to. How you write your queries depends on your preference. I like the lambda syntax; it is short and easy to read.
The advantage of LINQ is that you CAN use its syntax to interact with the database, but for that you'll need to use APIs that are designed to do this, such as Entity Framework. This way, you can tell Entity Framework what to do with your LINQ queries, such as retrieving records with a certain where clause.

Porting a very Pythonesque library over to .NET

I'm investigating the possibility of porting the Python library Beautiful Soup over to .NET, mainly because I really love the parser and there are simply no good HTML parsers on the .NET framework (Html Agility Pack is outdated, buggy, undocumented and doesn't work well unless the exact schema is known).
One of my primary goals is to get the basic DOM selection functionality to really parallel the beauty and simplicity of BeautifulSoup, allowing developers to easily craft expressions to find elements they're looking for.
BeautifulSoup takes advantage of loose-binding and named parameters to make this happen. For example, to find all a tags with an id of test and a title that contains the word foo, I could do:
soup.find_all('a', id='test', title=re.compile('foo'))
However, C# doesn't have a concept of an arbitrary number of named elements. The .NET4 Runtime has named parameters, however they have to match an existing method prototype.
My Question: What is the C# design pattern that most parallels this Pythonic construct?
Some Ideas:
I'd like to go after this based on how I, as a developer, would like to code. Implementing this is out of the scope of this post. One idea I had would be to use anonymous types. Something like:
soup.FindAll("a", new { Id = "Test", Title = new Regex("foo") });
Though this syntax loosely matches the Python implementation, it still has some disadvantages.
The FindAll implementation would have to use reflection to parse the anonymous type, and handle any arbitrary metadata in a reasonable manner.
The FindAll prototype would need to take an Object, which makes it fairly unclear how to use the method unless you're well familiar with the documented behavior. I don't believe there's a way to declare a method that must take an anonymous type.
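For what it's worth, the reflection step is fairly small. The sketch below shows one way a FindAll implementation might turn the anonymous object into attribute matchers. AttributeFilter and FromObject are hypothetical names, and the choice to treat Regex values as pattern matches and everything else as an exact string comparison is an assumption made for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of any existing library
public static class AttributeFilter
{
    // Turns e.g. new { Id = "Test", Title = new Regex("foo") }
    // into a map from attribute name to a matcher for that attribute's value
    public static Dictionary<string, Func<string, bool>> FromObject(object filter)
    {
        var matchers = new Dictionary<string, Func<string, bool>>(StringComparer.OrdinalIgnoreCase);
        foreach (PropertyInfo prop in filter.GetType().GetProperties())
        {
            object value = prop.GetValue(filter, null);
            Regex regex = value as Regex;
            if (regex != null)
            {
                // Regex values match when the pattern is found in the attribute
                matchers[prop.Name] = s => s != null && regex.IsMatch(s);
            }
            else
            {
                // Anything else is compared as an exact string
                string expected = Convert.ToString(value);
                matchers[prop.Name] = s => s == expected;
            }
        }
        return matchers;
    }
}
```

The dictionary is case-insensitive so that the C#-style property name Id lines up with the HTML attribute id.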
Another idea I had is perhaps a more .NET way of handling this but strays further away from the library's Python roots. That would be to use a fluent pattern. Something like:
soup.FindAll("a")
.Attr("id", "Test")
.Attr("title", new Regex("foo"));
This would require building an expression tree and locating the appropriate nodes in the DOM.
The third and last idea I have would be to use LINQ. Something like:
var nodes = (from n in soup
where n.Tag == "a" &&
n["id"] == "Test" &&
Regex.Match(n["title"], "foo").Success
select n);
I'd appreciate any insight from anyone with experience porting Python code to C#, or just overall recommendations on the best way to handle this situation.

Have you tried running your code inside the IronPython engine? As far as I know, it performs really well and you don't have to touch your Python code.

Is there any off the shelf component which can be used to evaluate expressions on an object?

We would like to parse expressions of the type:
Func<T1, bool>, Func<T1, T2, bool>, Func<T1, T2, T3, bool>, etc.
I understand that it is relatively easy to build an expression tree and evaluate it, but I would like to get around the overhead of doing a Compile on the expression tree.
Is there any off the shelf component which can do this?
Is there any component which can parse C# expressions from a string and evaluate them? (Expression services for C# , I think there is something like this available for VB which is used by WF4)
Edit:
We have specific models on which we need to evaluate expressions that are entered by IT administrators.
public class SiteModel
{
public int NumberOfUsers {get;set;}
public int AvailableLicenses {get;set;}
}
We would like for them to enter an expression like:
Site.NumberOfUsers > 100 && Site.AvailableLicenses < Site.NumberOfUsers
We would then like to generate a Func which can be evaluated by passing a SiteModel object.
Func<SiteModel, bool> rule = Site => Site.NumberOfUsers > 100 && Site.AvailableLicenses < Site.NumberOfUsers;
Also, the performance should not be miserable (but around 80-100 calls per second on a normal PC should be fine).

Mono.CSharp can evaluate expressions from strings, and is very simple to use. The required references come with the mono compiler and runtime. (In the tools directory iirc).
You need to reference Mono.CSharp.dll and the Mono C# compiler executable (mcs.exe).
Next set up the evaluator to know about your code if necessary.
using Mono.CSharp;
...
Evaluator.ReferenceAssembly (Assembly.GetExecutingAssembly ());
Evaluator.Run ("using Foo.Bar;");
Then evaluating expressions is as simple as calling Evaluate.
var x = (bool) Evaluator.Evaluate ("0 == 1");

Maybe ILCalc (on codeplex) does what you are looking for. It comes as a .NET and a Silverlight version and is open sourced.
We have been using it successfully for quite a while. It even allows you to reference variables in your expression.

The "component" you are talking about:
Needs to understand C# syntax (for parsing your input string)
Needs to understand C# semantics (where to perform implicit int->double conversions, etc.)
Needs to generate IL code
Such a "component" is called a C# compiler.
The current Microsoft C# compiler is a poor option, as it runs in a separate process (thus increasing compilation time, as all the metadata needs to be loaded into that process) and can only compile full assemblies (and .NET assemblies cannot be unloaded without unloading the whole AppDomain, thus leaking memory). However, if you can live with those restrictions, it's an easy solution - see sgorozco's answer.
The future Microsoft C# compiler (Roslyn project) will be able to do what you want, but that is still some time in the future - my guess is that it will be released with the next VS after VS11, i.e. with C# 6.0.
Mono C# compiler (see Mark H's answer) can do what you want, but I don't know if that supports code unloading or will also leak a bit of memory.
Roll your own. You know which subset of C# you need to support, and there are separate components available for the various "needs" above. For example, NRefactory 5 can parse C# code and analyze semantics. Expression Trees greatly simplify IL code generation. You could write a converter from NRefactory ResolveResults to Expression Trees, that would likely solve your problem in less than 300 lines of code. However, NRefactory reuses large parts of the Mono C# compiler in its parser - and if you're taking that big dependency, you might as well go with option 3.

Perhaps this technique is useful to you - especially regarding the dependency reviews as you are depending solely on framework components.
EDIT: as pointed out by @Asti, this technique creates dynamic assemblies that unfortunately, due to limitations of the .NET Framework design, cannot be unloaded, so careful consideration is needed before using it. This means that if a script is updated, the old assembly containing the previous version of the script can't be unloaded from memory and will linger until the application or service hosting it is restarted.
In a scenario where scripts change infrequently, and where compiled scripts are cached and reused rather than recompiled on every use, this memory leak can IMO be safely tolerated (this has been the case for all our uses of this technique). Fortunately, in my experience, the memory footprint of the generated assemblies for typical scripts tends to be quite small.
If this is not acceptable, the scripts can be compiled in a separate AppDomain that can be removed from memory, although this would require call marshaling between domains (e.g. a named pipe WCF service), or perhaps an IIS-hosted service, where unloading occurs automatically after an inactivity period or when a memory footprint threshold is exceeded.
End EDIT
First, you need to add to your project a reference to Microsoft.CSharp, and add the following using statements
using System.CodeDom.Compiler; // this is included in System.Dll assembly
using Microsoft.CSharp;
Then, I'm adding the following method:
private void TestDynCompile() {
    // the code you want to dynamically compile, as a string
    string code = @"
using System;
namespace DynCode {
    public class TestClass {
        public string MyMsg(string name) {
            //---- this would be code your users provide
            return string.Format(""Hello {0}!"", name);
            //-----
        }
    }
}";
    // obtain a reference to a C# compiler
    var provider = CodeDomProvider.CreateProvider("CSharp");
    // Create instance for compilation parameters
    var cp = new CompilerParameters();
    // Add assembly dependencies
    cp.ReferencedAssemblies.Add("System.dll");
    // hold compiled assembly in memory, don't produce an output file
    cp.GenerateInMemory = true;
    cp.GenerateExecutable = false;
    // don't produce debugging information
    cp.IncludeDebugInformation = false;
    // Compile source code
    var rslts = provider.CompileAssemblyFromSource(cp, code);
    if (rslts.Errors.Count == 0) {
        // No errors in compilation, obtain type for DynCode.TestClass
        var type = rslts.CompiledAssembly.GetType("DynCode.TestClass");
        // Create an instance of the dynamically compiled class
        dynamic instance = Activator.CreateInstance(type);
        // Invoke dynamic code (MessageBox comes from System.Windows.Forms)
        MessageBox.Show(instance.MyMsg("Gerardo")); // Hello Gerardo! is displayed =)
    }
}
As you can see, you need to add boilerplate code (a wrapper class definition, injected assembly dependencies, etc.), but this is a really powerful technique that adds scripting capabilities with full C# syntax and executes almost as fast as static code. (Invocation will be a little bit slower.)
Assembly dependencies can refer to your own project dependencies, so classes and types defined in your project can be referenced and used inside the dynamic code.
Hope this helps!

Not sure about the performance part but this seems like a good match for dynamic linq...

Generate xsd out of SiteModel class, then through web/whatever-UI let the administrator input the expression, transform the input via xsl where you modify the expression as a functor literal, then generate and execute it via CodeDom on the fly.

Maybe you can use Lua scripts as input. The user enters a Lua expression and you parse and execute it with the Lua engine. If needed, you can wrap the input in some other Lua code before you interpret it. I'm not sure about the performance, but 100 calls/s is not that much.
Evaluating expressions is always a security issue. So take care of that, too.
You can use Lua in C#.
Another way would be to compile some C# code that contains the input expression in a class. But here you will end up with one assembly per request. As far as I know, a loaded assembly can only be unloaded together with its whole AppDomain, so this solution might not scale well. A workaround would be a separate process that is restarted every X requests.

Thanks for your answers.
Introducing a dependency on Mono in a product like ours (which has more than 100K installations and has a long release cycle of 1-1.5 years) may not be a good option for us. This might also be an overkill since we only need to support simple expressions (with little or no nested expressions) and not an entire language.
After using the code dom compiler, we noticed that it causes the application to leak memory. Although we could load it in a separate app domain to work around this, this again might be an overkill.
The dynamic LINQ expression tree sample provided as part of the VS Samples has a lot of bugs and no support for type conversions when doing comparisons (changing a string to an int, a double to an int, a long to an int, etc). The parsing for indexers also seems to be broken. Although not usable off the shelf, it shows promise for our use cases.
We have decided to go with expression trees as of now.
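For readers following along, a hand-built expression tree for the example rule from the question might look roughly like the sketch below. SiteRules and Build are hypothetical names, and the tree is constructed directly rather than parsed from the administrator's string, which is the harder part and is not shown here.

```csharp
using System;
using System.Linq.Expressions;

public class SiteModel
{
    public int NumberOfUsers { get; set; }
    public int AvailableLicenses { get; set; }
}

// Hypothetical helper, for illustration only
public static class SiteRules
{
    // Builds: Site => Site.NumberOfUsers > 100 && Site.AvailableLicenses < Site.NumberOfUsers
    public static Func<SiteModel, bool> Build()
    {
        ParameterExpression site = Expression.Parameter(typeof(SiteModel), "Site");
        Expression users = Expression.Property(site, "NumberOfUsers");
        Expression licenses = Expression.Property(site, "AvailableLicenses");
        Expression body = Expression.AndAlso(
            Expression.GreaterThan(users, Expression.Constant(100)),
            Expression.LessThan(licenses, users));
        return Expression.Lambda<Func<SiteModel, bool>>(body, site).Compile();
    }
}
```

If the compiled delegate is cached per rule, the cost of Compile() is paid once rather than on every evaluation, which should help with the concern about compile overhead raised in the question.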

c# executing a string as code...is it worth the effort?

Here's the story so far:
I'm doing a C# winforms application to facilitate specifying equipment for hire quotations.
In it, I have a List<T> of ~1500 stock items.
These items have a property called AutospecQty that has a get accessor that needs to execute some code that is specific to each item. This code will refer to various other items in the list.
So, for example, one item (let's call it Item0001) has this get accessor that may need to execute some code that may look something like this:
[some code to get the following items from the list here]
if (Item0002.Value + Item0003.Value > Item0004.Value)
{ return Item0002.Value; }
else
{ return Item0004.Value; }
Which is all well and good, but these bits of code are likely to change on a weekly basis, so I'm trying to avoid redeploying that often. Also, each item could (will) have wildly different code. Some will be querying the list, some will be doing some long-ass math functions, some will be simple addition as above...some will depend on variables not contained in the list.
What I'd like to do is to store the code for each item in a table in my database, then when the app starts just pull the relevant code out and bung it in a list, ready to be executed when the time comes.
Most of the examples I've seen on the internot regarding executing a string as code seem quite long-winded, convoluted, and/or not particularly novice-coder friendly (I'm a complete amateur), and don't seem to take into account being passed variables.
So the questions are:
1) Is there an easier/simpler way of achieving what I'm trying to do?
2) If 1=false (I'm guessing that's the case), is it worth the effort of all the potential problems of this approach, or would my time be better spent writing an automatic update feature into the application and just keeping it all inside the main app (so the user would just have to let the app update itself once a week)?
3) Another (probably bad) idea I had was shifting all the autospec code out to a separate DLL, and either just redeploying that when necessary, or is it even possible to reference a single DLL on a shared network drive?
4) I guess this is some pretty dangerous territory whichever way I go. Can someone tell me if I'm opening a can of worms best left well and truly shut?
5) Is there a better way of going about this whole thing? I have a habit of overcomplicating things that I'm trying to kick :P
Just as additional info, the autospec code will not be user-input. It'll be me updating it every week (no-one else has access to it), so hopefully that will mitigate some security concerns at least.
Apologies if I've explained this badly.
Thanks in advance

Some options to consider:
1) If you had a good continuous integration system with automatic build and deployment, would deploying every week be such an issue?
2) Have you considered MEF or similar which would allow you to substitute just a single DLL containing the new rules?
3) If the formula can be expressed simply (without needing to eval some code, e.g. A+B+C+D > E+F+G+H => J or K) you might be able to use reflection to gather the parameter values and then apply them.
4) You could use Expressions in .NET 4 and build an expression tree from the database and then evaluate it.

Looks like you may be well served by implementing the specification pattern.
As wikipedia describes it:
whereby business logic can be recombined by chaining the business logic together using boolean logic.
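A minimal sketch of the pattern in C# might look like this. The interface and class names (ISpecification, PredicateSpecification, SpecificationExtensions) are hypothetical and chosen for this example; they are not taken from a particular library.

```csharp
using System;

public interface ISpecification<T>
{
    bool IsSatisfiedBy(T candidate);
}

// Wraps a plain predicate so it can take part in And/Or/Not chains
public class PredicateSpecification<T> : ISpecification<T>
{
    private readonly Func<T, bool> predicate;

    public PredicateSpecification(Func<T, bool> predicate)
    {
        this.predicate = predicate;
    }

    public bool IsSatisfiedBy(T candidate)
    {
        return predicate(candidate);
    }
}

public static class SpecificationExtensions
{
    public static ISpecification<T> And<T>(this ISpecification<T> left, ISpecification<T> right)
    {
        return new PredicateSpecification<T>(c => left.IsSatisfiedBy(c) && right.IsSatisfiedBy(c));
    }

    public static ISpecification<T> Or<T>(this ISpecification<T> left, ISpecification<T> right)
    {
        return new PredicateSpecification<T>(c => left.IsSatisfiedBy(c) || right.IsSatisfiedBy(c));
    }

    public static ISpecification<T> Not<T>(this ISpecification<T> spec)
    {
        return new PredicateSpecification<T>(c => !spec.IsSatisfiedBy(c));
    }
}
```

Individual rules can then be combined by chaining, for example positive.And(even.Not()) for two specifications over int, without touching the rule classes themselves.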

Have you considered something like MEF? Then you could have lots of small DLLs implementing various versions of your calculations and simply look up which one to load from the database.
That is assuming you can wrap them all in a single (or small number of) interfaces.

I would attack this problem by creating a domain specific language which the program could interpret to execute the rules. Then put snippets of the DSL code in the database.
As you can see, I also like to overcomplicate things. :-) But it works as long as the long-term use is simplified.

You could have your program compile up your rules at runtime into a class that acts like a plugin using the CSharpCodeProvider.
See Compiling code during runtime for a sample of how to do this.

Expression evaluation design questions

I am modeling a system to evaluate expressions. The operands in these expressions can be of one of several types, including some primitive .NET types. When defining my Expression class, I want some degree of type-safety and therefore don't want to use 'object' for the operand type, so I am considering defining an abstract Operand base class with nothing in it and creating a subclass for each type of operand. What do you think of this?
Also, only some types of operands make sense with others. And finally, only some operators make sense with particular operands. I can't really think of a way to implement these rules at compile-time so I'm thinking I'll have to do these checks at runtime.
Any ideas on how I might be able to do this better?

I'm not sure if C based languages have this, however Java has several packages that would really make sense for this.
JavaCC, the Java Compiler Compiler, allows you to define a language (your expressions, for example) and then build the corresponding Java classes. A somewhat more user-friendly, if more experimental and academic, package is DemeterJ. It allows you to specify the expression language very easily and comes with a library for defining visitors and strategies that operate over the generated class structure. If you could afford to switch to Java I might try that. Otherwise I'd look for a C# clone of one of these technologies.
Another thing to consider if you go down this route is that once you've generated your class structure within some reasonable approximation of the end result, you can subclass all of the generated classes and build all of your application-specific logic into the subclasses. That way, if you really need to regenerate a new model for the expression language, your logic will be relatively independent of your class hierarchy.
Update: Actually it looks as though some of this stuff has been ported to .NET, though I haven't used it so I'm not sure what shape it may be in:
http://www.ccs.neu.edu/home/lieber/inside-impl.html
good luck!

How about Expression in 3.5? I recently wrote an expression parser/compiler using this.

I've recently built a dynamic expression evaluator. What I found to be effective was to create, as you suggested, a BaseOperand with meaningful derived classes (NumericOperand, StringOperand, DateOperand, etc). Depending on your implementation, generics may make sense as well (Operand<T>).
Through the implementation of the Visitor pattern, you can perform any kind of validation you like.
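As a sketch of how that might fit together, the following shows a BaseOperand hierarchy with a visitor used for one kind of runtime validation. Only the class names BaseOperand, NumericOperand and StringOperand come from the answer; IOperandVisitor, ArithmeticValidator and the rule that arithmetic accepts only numeric operands are assumptions made for this example.

```csharp
using System;

// Hypothetical visitor interface: one Visit overload per operand type
public interface IOperandVisitor<TResult>
{
    TResult Visit(NumericOperand operand);
    TResult Visit(StringOperand operand);
}

public abstract class BaseOperand
{
    public abstract TResult Accept<TResult>(IOperandVisitor<TResult> visitor);
}

public sealed class NumericOperand : BaseOperand
{
    public double Value { get; private set; }
    public NumericOperand(double value) { Value = value; }

    public override TResult Accept<TResult>(IOperandVisitor<TResult> visitor)
    {
        return visitor.Visit(this);
    }
}

public sealed class StringOperand : BaseOperand
{
    public string Value { get; private set; }
    public StringOperand(string value) { Value = value; }

    public override TResult Accept<TResult>(IOperandVisitor<TResult> visitor)
    {
        return visitor.Visit(this);
    }
}

// Example validation rule (an assumption): arithmetic operators accept only numeric operands
public sealed class ArithmeticValidator : IOperandVisitor<bool>
{
    public bool Visit(NumericOperand operand) { return true; }
    public bool Visit(StringOperand operand) { return false; }
}
```

Adding a new validation rule then means writing another visitor class, while adding a new operand type forces every visitor to handle it, which the compiler checks.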
I had a very specific need to roll my own solution, but there are many options already available for processing expressions. You may want to take a look at some of these for inspiration or to avoid reinventing the wheel.

I found a good approach to handling the types of objects in the EXPRESSIONOASIS framework. It uses a custom data structure to carry the types of the objects. After parsing an operand with regular expressions and the given expressions, it decides the type and stores it as a property of a generic class, which can be used at any time to get the type.
http://code.google.com/p/expressionoasis/
