C# how to create functions that are interpreted at runtime - c#

I'm making a Genetic Program, but I'm hitting a limitation with C# where I want to present new functions to the algorithm but I can't do it without recompiling the program. In essence I want the user of the program to provide the allowed functions and the GP will automatically use them. It would be great if the user is required to know as little about programming as possible.
I want to plug in the new functions without compiling them into the program. In Python this is easy, since it's all interpreted, but I have no clue how to do it with C#. Does anybody know how to achieve this in C#? Are there any libraries, techniques, etc?

It depends on how you want the user of the program to "provide the allowed functions."
If the user is choosing functions that you've already implemented, you can pass these around as delegates or expression trees.
If the user is going to write their own methods in C# or another .NET language, and compile them into an assembly, you can load them using Reflection.
If you want the user to be able to type C# source code into your program, you can compile that using CodeDom, then call the resulting assembly using Reflection.
If you want to provide a custom expression language for the user, e.g. a simple mathematical language, then (assuming you can parse the language) you can use Reflection.Emit to generate a dynamic assembly and call that using -- you guessed it -- Reflection. Or you can construct an expression tree from the user code and compile that using LINQ -- depends on how much flexibility you need. (And if you can afford to wait, expression trees in .NET 4.0 remove many of the limitations that were in 3.5, so you may be able to avoid Reflection.Emit altogether.)
If you are happy for the user to enter expressions using Python, Ruby or another DLR language, you can host the Dynamic Language Runtime, which will interpret the user's code for you.
Hosting the DLR (and IronPython or IronRuby) could be a good choice here because you get a well tested environment and all the optimisations the DLR provides. Here's a how-to using IronPython.
Added in response to your performance question: The DLR is reasonably smart about optimisation. It doesn't blindly re-interpret the source code every time: once it has transformed the source code (or, specifically, a given function or class) to MSIL, it will keep reusing that compiled representation until the source code changes (e.g. the function is redefined). So if the user keeps using the same function but over different data sets, then as long as you can keep the same ScriptScope around, you should get decent perf; ditto if your concern is just that you're going to run the same function zillions of times during the genetic algorithm. Hosting the DLR is pretty easy to do, so it shouldn't be hard to do a proof of concept and measure to see if it's up to your needs.

You can try to create and manipulate Expression Trees. Use Linq to evaluate expression trees.

You can also use CodeDom to compile and run a function.
For sure you can google to see some examples that might fit your needs.
It seems that this article "How to dynamically compile C# code" and this article "Dynamically executing code in .Net" could help you.

You have access to the Compiler from within code, you can then create instances of the compiled code and use them without restarting the application. There are examples of it around
Here
and
Here
The second one is a javascript evaluator but could be adapted easily enough.

You can take a look at System.Reflection.Emit to generate code at the IL level.
Or generate C#, compile into a library and load that dynamically. Not nearly as flexible.

It is in fact very easy to generate IL. See this tutorial: http://www.meta-alternative.net/calc.pdf

Related

how to break up the code in a method body into statement by statement symbols/"tokens"

I'm writing something that will examine a function and rewrite that function in another language so basically if inside my function F1, i have this line of code var x=a.b(1) how do i break up the function body into symbols or "tokens"?
I've searched around and thought that stuff in System.Reflection.MethodInfo.GetMethodBody would do the trick however that class doesn't seem to be able to have the capabilities to do what i want..
what other solutions do we have?
Edit:
Is there anyway we can get the "method body" of a method using reflection? (like as a string or something)
Edit 2:
basically what I'm trying to do is to write a program in c#/vb and when i hit F5 a serializer function will (use reflection and) take the entire program (all the classes in that program) and serialize it into a single javascript file. of course javascript doesn't have the .net library so basically the C#/VB program will limit its use of classes to the .js library (which is a library written in c#/vb emulating the framework of javascript objects).
The advantage is that i have type safety while coding my javascript classes and many other benefits like using overloading and having classes/etc. since javascript doesn't have classes/overloading features natively, it rely on hacks to get it done. so basically the serializer function will write the javascript based on the C#/VB program input for me (along with all the hacks and possible optimizations).
I'm trying to code this serializer function
It sounds like you want a parse tree, which Reflection won't give you. Have a look at NRefactory, which is a VB and C# parser.
If you want to do this, the best way would be to parse the C#/VB code with a parser/lexer, such as the Gardens Point Parser Generator, flex/bison or ANTLR. then at the token level, reassemble it with proper javascript grammar. There are a few out there for C# and Java.
See this answer on analyzing and transforming source code
and this one on translating between programming languages.
These assume that you use conventional compiler methods for breaking your text into tokens ("lexing") and grouping related tokens into program structures ("parsing"). If you analysis is anything other than trivial, you'll need all the machinery, or it won't be reliable.
Reflection can only give you what the language designers decided to give you. They invariably don't give you detail inside functions.
If you want to go from IL to other language it may be easier than parsing source language first. If you want to go this route consider reading on Microsoft's "Volta" project (IL->JavaScript), while project is no longer available there are still old blogs discussing issues around it.
Note that reflection alone is not enough - reflection gives you byte array for the body of any particular method (MethodInfo.GetMethodBody.GetILAsByteArray - http://msdn.microsoft.com/en-us/library/system.reflection.methodbody.aspx) and you have to read it. There are several publically available "IL reader" libraries.

C# dynamic code evaluation, Eval, REPL

Does anyone know if there is a way to evaluate c# code at runtime.
eg. I would like to allow a user to enter DateTime.Now.AddDays(1), or something similar, as a string and then evaluate the string to get the result.
I woder if it is possible to access the emmediate windows functionality, since it seems that is evaluates every line entered dynamically.
I have found that VB has an undocumented EbExecuteLine() API function from the VBA*.dll and wonder if there is something equivalent for c#.
I have also found a custom tool https://github.com/DavidWynne/CSharpEval (it used to be at kamimucode.com but the author has moved it to GitHub) that seems to do it, but I would prefer something that comes as part of .NET
Thanks
Mono has the interactive command line (csharp.exe)
You can look at it's source code to see exactly how it does it's magic:
https://github.com/mono/mono/raw/master/mcs/tools/csharp/repl.cs
As you've probably already seen, there is no built-in method for evaluating C# code at runtime. This is the primary reason that the custom tool you mentioned exists.
I also have a C# eval program that allows for evaluating C# code. It provides for evaluating C# code at runtime and supports many C# statements. In fact, this code is usable within any .NET project, however, it is limited to using C# syntax. Have a look at my website, http://csharp-eval.com, for additional details.
Microsoft's C# compiler don't have Compiler-as-a-Service yet (Should come with C# 5.0).
You can either use Mono's REPL, or write your own service using CodeDOM
Its not fast but you can compile the code on the fly, see my previous question,
Once you have the assembly and you know the type name you can construct an instance of your compiled class using reflection and execute your method..
The O2 Platform's C# REPL Script Environment use the Fluent# APIs which have a real powerful reflection API that allows you do execute code snippets.
For example:
"return DateTime.Now.ToLongTimeString();".executeCodeSnippet();
will return
5:01:22 AM
note that the "...".executeCodeSnippet(); can actually execute any valid C# code snippet (so it is quite powerful).
If you want to control what your users can execute, I could use AST trees to limite the C# features that they have access to.
Also take a look at the Microsoft's Roslyn, which is VERY powerful as you can see on Multiple Roslyn based tools (all running Stand-Alone outside VisualStudio)

c# compile source code from database

I would like to build an application framework that is mainly interpreted.
Say that the source code would be stored in the database that could be edited by the users and always the latest version would be executed.
Can anyone give me some ideas how does one implement sth like this !
cheers,
gabor
In .Net, you can use reflection and CodeDOM to compile code on the fly. But neither approach is really very simple or practical. Mono has some ability to interpret c# on the fly as well, but I haven't looked closely at it yet.
Another alternative is to go with an interpreted .Net language like Boo or IronPython as the language for your database code.
Either way, make sure you think long and hard about the security of your platform. Allowing users to execute arbitrary code is always an exercise fraught with peril. It's often too tempting to look for a simple eval() method, and even if one exists, that is not good enough for this kind of scenario.
Try Mono ( http://www.monoproject.org ). It supports many scripting languages including JavaScript.
If you don't want to use any scripting you can use CodeDOM or Reflection (see Reflection.Emit).
Here are really useful links on the topic :
Dynamically executing code in .Net (Here you can find a tool which can be very helpul)
Late Binding and On-the-Fly Code
Generation Using Reflection in C#
Dynamic Source Code Generation and
Compilation
Usually the Program uses a scripting language for the scriptable parts, i.e. Lua or Javascript.
To answer your technical question: You don't want to write your own language and interpreter. That's too much work for you to do. So pick some other language, say Python or Lua, and look for the documentation that lets your C program hand it blocks of code to execute. Of course, the script needs to be able to do something, so you'll need to find how to expose your program's objects to the script. Also, what will happen if a client is running the program when you update its source code in the database? Should the client restart? Are you going to store the entire program as a single row in this database, or did you want to store individual functions? That affects how you structure your updates.
To address other issues with your question: Why do you want to do this? Making "interpreted language" part of your design spec for a system is not often a good sign. Is the real requirement something like this: "I update the program often and I want users to always have the latest copy?" If so, there are other, better ways to go about this (just give us your actual scenario and requirements).

Dynamic code generation

I am currently developing an application where you can create "programs" with it without writing source code, just click&play if you like.
Now the question is how do I generate an executable program from my data model. There are many possibilities but I am not sure which one is the best for me. I need to generate assemblies with classes and namespace and everything which can be part of the application.
CodeDOM class: I heard of lots of limitations and bugs of this class. I need to create attributes on method parameters and return values. Is this supported?
Create C# source code programmatically and then call CompileAssemblyFromFile on it: This would work since I can generate any code I want and C# supports most CLR features. But wouldn't this be slow?
Use the reflection ILGenerator class: I think with this I can generate every possible .NET code. But I think this is much more complicated and error prone than the other approaches?
Are there other possible solutions?
EDIT:
The tool is general for developing applications, it is not restricted to a specific domain. I don't know if it can be considered a visual programming language. The user can create classes, methods, method calls, all kinds of expressions. It won't be very limitating because you should be able to do most things which are allowed in real programming languages.
At the moment lots of things must still be written by the user as text, but the goal at the end is, that nearly everything can be clicked together.
You my find it is rewarding to look at the Dynamic Language Runtime which is more or less designed for creating high-level languages based on .NET.
It's perhaps also worth looking at some of the previous Stack Overflow threads on Domain Specific Languages which contain some useful links to tools for working with DSLs, which sounds a little like what you are planning although I'm still not absolutely clear from the question what exactly your aim is.
Most things "click and play" should be simple enough just to stick some pre-defined building-block objects together (probably using interfaces on the boundaries). Meaning: you might not need to do dynamic code generation - just "fake it". For example, using property-bag objects (like DataTable etc, although that isn't my first choice) for values, etc.
Another option for dynamic evaluation is the Expression class; especially in .NET 4.0, this is hugely versatile, and allows compilation to a delegate.
Do the C# source generation and don't care about speed until it matters. The C# compiler is quite quick.
When I wrote a dynamic code generator, I relied heavily on System.Reflection.Emit.
Basically, you programatically create dynamic assemblies and add new types to them. These types are constructed using the Emit constructs (properties, events, fields, etc..). When it comes to implementing methods, you'll have to use an ILGenerator to pump out MSIL op-codes into your method. That sounds super scary, but you can use a couple of tools to help:
A pre-built sample implementation
ILDasm to inspect the op-codes of the sample implementation.
It depends on your requirements, CodeDOM would certainly be the best fit for a "program" stored it in a "data model".
However its unlikely that using option 2 will be in any way measurably slower in comparision with any other approach.
I would echo others in that 1) the compiler is quick, and 2) "Click and Play" things should be simple enough so that no single widget added to a pile of widgets can make it an illegal pile.
Good luck. I'm skeptical that you can achieve point (2) for anything but really toy-level programs.

Using reflection for code gen?

I'm writing a console tool to generate some C# code for objects in a class library. The best/easiest way I can actual generate the code is to use reflection after the library has been built. It works great, but this seems like a haphazard approch at best. Since the generated code will be compiled with the library, after making a change I'll need to build the solution twice to get the final result, etc. Some of these issues could be mitigated with a build script, but it still feels like a bit too much of a hack to me.
My question is, are there any high-level best practices for this sort of thing?
Its pretty unclear what you are doing, but what does seem clear is that you have some base line code, and based on some its properties, you want to generate more code.
So the key issue here are, given the base line code, how do you extract interesting properties, and how do you generate code from those properties?
Reflection is a way to extract properties of code running (well, at least loaded) into the same execution enviroment as the reflection user code. The problem with reflection is it only provides a very limited set of properties, typically lists of classes, methods, or perhaps names of arguments. IF all the code generation you want to do can be done with just that, well, then reflection seems just fine. But if you want more detailed properties about the code, reflection won't cut it.
In fact, the only artifact from which truly arbitrary code properties can be extracted is the the source code as a character string (how else could you answer, is the number of characters between the add operator and T in middle of the variable name is a prime number?). As a practical matter, properties you can get from character strings are generally not very helpful (see the example I just gave :).
The compiler guys have spent the last 60 years figuring out how to extract interesting program properties and you'd be a complete idiot to ignore what they've learned in that half century.
They have settled on a number of relatively standard "compiler data structures": abstract syntax trees (ASTs), symbol tables (STs), control flow graphs (CFGs), data flow facts (DFFs), program triples, ponter analyses, etc.
If you want to analyze or generate code, your best bet is to process it first into such standard compiler data structures and then do the job. If you have ASTs, you can answer all kinds of question about what operators and operands are used. If you have STs, you can answer questions about where-defined, where-visible and what-type. If you have CFGs, you can answer questions about "this-before-that", "what conditions does statement X depend upon". If you have DFFs, you can determine which assignments affect the actions at a point in the code. Reflection will never provide this IMHO, because it will always be limited to what the runtime system developers are willing to keep around when running a program. (Maybe someday they'll keep all the compiler data structures around, but then it won't be reflection; it will just finally be compiler support).
Now, after you have determined the properties of interest, what do you do for code generation? Here the compiler guys have been so focused on generation of machine code that they don't offer standard answers. The guys that do are the program transformation community (http://en.wikipedia.org/wiki/Program_transformation). Here the idea is to keep at least one representation of your program as ASTs, and to provide special support for matching source code syntax (by constructing pattern-match ASTs from the code fragments of interest), and provide "rewrite" rules that say in effect, "when you see this pattern, then replace it by that pattern under this condition".
By connecting the condition to various property-extracting mechanisms from the compiler guys, you get relatively easy way to say what you want backed up by that 50 years of experience. Such program transformation systems have the ability to read in source code,
carry out analysis and transformations, and generally to regenerate code after transformation.
For your code generation task, you'd read in the base line code into ASTs, apply analyses to determine properties of interesting, use transformations to generate new ASTs, and then spit out the answer.
For such a system to be useful, it also has to be able to parse and prettyprint a wide variety of source code langauges, so that folks other than C# lovers can also have the benefits of code analysis and generation.
These ideas are all reified in the
DMS Software Reengineering Toolkit. DMS handles C, C++, C#, Java, COBOL, JavaScript, PHP, Verilog, ... and a lot of other langauges.
(I'm the architect of DMS, so I have a rather biased view. YMMV).
Have you considered using T4 templates for performing the code generation? It looks like it's getting much more publicity and attention now and more support in VS2010.
This tutorial seems database centric but it may give you some pointers: http://www.olegsych.com/2008/09/t4-tutorial-creatating-your-first-code-generator/ in addition there was a recent Hanselminutes on T4 here: http://www.hanselminutes.com/default.aspx?showID=170.
Edit: Another great place is the T4 tag here on StackOverflow: https://stackoverflow.com/questions/tagged/t4
EDIT: (By asker, new developments)
As of VS2012, T4 now supports reflection over an active project in a single step. This means you can make a change to your code, and the compiled output of the T4 template will reflect the newest version, without requiring you to perform a second reflect/build step. With this capability, I'm marking this as the accepted answer.
You may wish to use CodeDom, so that you only have to build once.
First, I would read this CodeProject article to make sure there are not language-specific features you'd be unable to support without using Reflection.
From what I understand, you could use something like Common Compiler Infrastructure (http://ccimetadata.codeplex.com/) to programatically analyze your existing c# source.
This looks pretty involved to me though, and CCI apparently only has full support for C# language spec 2. A better strategy may be to streamline your existing method instead.
I'm not sure of the best way to do this, but you could do this
As a post-build step on your base dll, run the code generator
As another post-build step, run csc or msbuild to build the generated dll
Other things which depend on the generated dll will also need to depend on the base dll, so the build order remains correct

Categories

Resources