I got .net dll (originally, written in C#) which is being updated / released from time to time. There's a small part of code I need to modify inside this dll which suits my usage needs.
I'm able to do these changes every time using dnSpy but I don't like doing it manually every time.
Is there possibility to automate the process of code change inside dll and how can it be done?
Is it easier to convert dll to IL and change IL instructions and then compile it back or should I do full decompile to C# and then recompile it back using Roslyn?
The code I change is always the same and is changed to the same result code.
A possible solution for what you want to achieve is Mono.Cecil.
With Cecil, you can load existing managed assemblies, browse all the
contained types, modify them on the fly and save back to the disk the
modified assembly.
This library is being used by tools like coverlet which change the assembly on the fly in order to be able to compute code coverage.
That being said, I highly agree with Marc Gravell comment that at the first glance this seems to be the wrong approach and a change in design would be more appropriate.
I need to parse a simple statement (essentially a chain of function calls on some object) represented as a string variable into a CodeDom object (probably a subclass of CodeStatement). I would also like to provide some default imports of namespaces to be able to use less verbose statements.
I have looked around SO and the Internet to find some suggestions but I'm quite confused about what is and isn't possible and what is the simplest way to do it. For example this question seems to be almost what I want, unfortunately I can't use the solution as the CodeSnippetStatement seems not to be supported by the execution engine that I use (the WF rules engine).
Any suggestions that could help me / point me into the right direction ?
There is no library or function to parse C# code into CodeDOM objects as part of the standard .NET libraries. The CodeDOM libraries have some methods that seem to be designed for this, but none of them are actually implemented. As far as I know, there is some implementation available in Visual Studio (used e.g. by designers), but that is only internal.
CodeSnippetStatement is a CodeDOM node that allows you to place any string into the generated code. If you want to create CodeDOM tree just to generate C# source code, than this is usually fine (the source code generator just prints the string to the output). If the WF engine needs to understand the code in your string (and not just generate source code and compile it), than CodeSnippetStatement won't work.
However, there are 3rd party tools that can be used for parsing C# source code. In one project I worked on, we used NRefactory library (which is used in SharpDevelop) and it worked quite well. It gives you some tree (AST) representing the parsed code and I'm afraid you'll need to convert this to the corresponding CodeDOM tree yourself.
I have found a library implementation here that seems to cover pretty much everything I need for my purposes. I don't know if it's robust enough to be used in business scenarios, but for my unit tests it's pretty much all I can ask for.
Before start let me tell my experience: I am experienced with C#.NET, web services, XML part and few more. Reflection is something new to me, though I have read extensively on it and tried out some experimental code, but haven't made anything great using reflection
I checked out many examples of how we can create Type at runtime and then which can be saved in an assembly (.dll) files. Of all the examples I have seen is about saving the created types in the .dll files instead of code file. Isn't there any way to create the code file out of reflection?
I need to create code file since I want to distribute code instead of compiled assemblies. What I want to do is something like xsd.exe does, either spit out a .dll or the code file(in any language).
Isn't there any way to create a code file, since most of the place I can find is
AssemblyBuilder ab = System.AppDomain.CurrentDomain.DefineDynamicAssembly(an, AssemblyBuilderAccess.Save);
and then lastly
ab.Save("QuoteOfTheDay.dll");
There is no disassembler built into the .NET framework. You could use Reflector to convert the IL you emitted to one of the source code languages it supports. It isn't terribly likely that this will work without problems, Reflector was designed to convert code that was generated by a compiler back to source code. Compilers tend to be conservative about the IL they generate, unlike the anything-goes approach that System.Reflection.Emit offers.
If you want source code then you're best off by generating it in the first place. Use System.CodeDom. That's what System.Xml.Serialization uses.
Generally either the string classes or CodeDOM (http://msdn.microsoft.com/en-us/library/y2k85ax6.aspx) are used to generate C# source.
I've put off using generated code as part of the build process for fear of the complexity it introduces into the build process.
Is there a simple way to integrate build-time generated code into an app?
The kind of code I'm thinking of is similar to the resource and settings file code generation that Visual studio performs:
Having intellisense here is valuable
There are a lot of properties and links between properties that are trivial to describe, but impossible to implement tersely in C#.
The underlying resource is be modifiable and the code is automatically regenerated without needing any user interaction and without any need to understand the internals of the generator.
For (a non-real-world) example consider a precompiler that generated accessor to the named capture groups of a Regex via similarly named C# properties (or methods). This is typical of the kinds of things I'd like to generate: long snippets of boilerplate wrappers whose primary function is to enable compile time checking for errors (in the above; accessing non-existant capturing groups or writing and invalid regex) and no less importantly, intellisense for these properties. Finally, this setup should be trivially usable by others on the team with only the bare minimum of learning curve. I.e., it's absolutely not acceptable to require manual intervention to regenerate the code, nor acceptable to commit the generated code into source control. At worst, everyone should just need to install some extension; ideally the extension should be installable into the source-tree so that anyone that checks out the tree can build the project without any introduction.
For that to work well, it's critical that the IDE integration be excellent: Updating the underlying "resource" definition file should trigger a regeneration of the code without any user interaction, and ideally the generator itself would be easy to maintain for other developers later on (i.e. some amount of generator debug-ability is a plus).
Finally, an XSLT-like approach where the same template can be applied to various input resources is ideal; both because this means that you don't even need to look at the actual generator code if all you want to do is is update the resource, and because it makes template reuse trivial.
I've looked at T4, but from what I've seen this has a less handy ASP-like approach where template and resource aren't cleanly split (i.e, the generator is responsible for finding the resource - which makes template reuse less easy).
Is there a better (cleaner) solution or some way of running T4 such that the same template is can be trivially reused and (much like .NET settings files) that any update of the resource automatically triggers a regeneration of the implemented code?
Summary:
I'm looking for a code-gen approach that can
Regenerate code automatically without dev intervention when the underlying resource (not the template!) changes.
Be somewhat simple to maintain
Be able to share the same generator template between several resources (which, with point #1 probably implies the resource should refer to the generator and not vice-versa).
You can use T4ScriptFileGenerator from T4 Toolbox. Change "Custom Tool" property for your "resource" file to T4ScriptFileGenerator and save changes. The custom tool will generate a new, empty T4 script (.tt file). Place your code generation logic in this .tt file. Any time you modify (and save) the resource file, the T4ScriptFileGenerator will use the .tt file to generate the output code. For an example of how this works, see "LINQ to SQL Model" generator in the T4 Toolbox, which uses a .dbml file as the "resource". In the .tt file created by this generator, you will see that all of the code generation logic resides in separate .tt files and is reused with the help of include directives.
You may want to keep an eye on ABSE (http://www.abse.info). ABSE is a code-generation and model-driven software development methodology that is completely agnostic in terms of platform and language, so you wouldn't have any trouble creating your own generators for C# and anything else you wish. The big plus is that you can generate code exactly the way you want. The downside is that you may have more work to do at first to build your templates.
ABSE allows you to capture your domain knowledge into "Atoms", which are basically fragments of larger models you can build. ABSE is both declarative and executable. The model is able to generate code by your specification and incorporate custom code at the model level.
Unfortunately, ABSE is still work in progress and an Integrated Development Environment (named AtomWeaver) is still in the making. Anyway a CTP release of the generator is scheduled for January 2010, so we're already close to it.
I'm writing a console tool to generate some C# code for objects in a class library. The best/easiest way I can actual generate the code is to use reflection after the library has been built. It works great, but this seems like a haphazard approch at best. Since the generated code will be compiled with the library, after making a change I'll need to build the solution twice to get the final result, etc. Some of these issues could be mitigated with a build script, but it still feels like a bit too much of a hack to me.
My question is, are there any high-level best practices for this sort of thing?
Its pretty unclear what you are doing, but what does seem clear is that you have some base line code, and based on some its properties, you want to generate more code.
So the key issue here are, given the base line code, how do you extract interesting properties, and how do you generate code from those properties?
Reflection is a way to extract properties of code running (well, at least loaded) into the same execution enviroment as the reflection user code. The problem with reflection is it only provides a very limited set of properties, typically lists of classes, methods, or perhaps names of arguments. IF all the code generation you want to do can be done with just that, well, then reflection seems just fine. But if you want more detailed properties about the code, reflection won't cut it.
In fact, the only artifact from which truly arbitrary code properties can be extracted is the the source code as a character string (how else could you answer, is the number of characters between the add operator and T in middle of the variable name is a prime number?). As a practical matter, properties you can get from character strings are generally not very helpful (see the example I just gave :).
The compiler guys have spent the last 60 years figuring out how to extract interesting program properties and you'd be a complete idiot to ignore what they've learned in that half century.
They have settled on a number of relatively standard "compiler data structures": abstract syntax trees (ASTs), symbol tables (STs), control flow graphs (CFGs), data flow facts (DFFs), program triples, ponter analyses, etc.
If you want to analyze or generate code, your best bet is to process it first into such standard compiler data structures and then do the job. If you have ASTs, you can answer all kinds of question about what operators and operands are used. If you have STs, you can answer questions about where-defined, where-visible and what-type. If you have CFGs, you can answer questions about "this-before-that", "what conditions does statement X depend upon". If you have DFFs, you can determine which assignments affect the actions at a point in the code. Reflection will never provide this IMHO, because it will always be limited to what the runtime system developers are willing to keep around when running a program. (Maybe someday they'll keep all the compiler data structures around, but then it won't be reflection; it will just finally be compiler support).
Now, after you have determined the properties of interest, what do you do for code generation? Here the compiler guys have been so focused on generation of machine code that they don't offer standard answers. The guys that do are the program transformation community (http://en.wikipedia.org/wiki/Program_transformation). Here the idea is to keep at least one representation of your program as ASTs, and to provide special support for matching source code syntax (by constructing pattern-match ASTs from the code fragments of interest), and provide "rewrite" rules that say in effect, "when you see this pattern, then replace it by that pattern under this condition".
By connecting the condition to various property-extracting mechanisms from the compiler guys, you get relatively easy way to say what you want backed up by that 50 years of experience. Such program transformation systems have the ability to read in source code,
carry out analysis and transformations, and generally to regenerate code after transformation.
For your code generation task, you'd read in the base line code into ASTs, apply analyses to determine properties of interesting, use transformations to generate new ASTs, and then spit out the answer.
For such a system to be useful, it also has to be able to parse and prettyprint a wide variety of source code langauges, so that folks other than C# lovers can also have the benefits of code analysis and generation.
These ideas are all reified in the
DMS Software Reengineering Toolkit. DMS handles C, C++, C#, Java, COBOL, JavaScript, PHP, Verilog, ... and a lot of other langauges.
(I'm the architect of DMS, so I have a rather biased view. YMMV).
Have you considered using T4 templates for performing the code generation? It looks like it's getting much more publicity and attention now and more support in VS2010.
This tutorial seems database centric but it may give you some pointers: http://www.olegsych.com/2008/09/t4-tutorial-creatating-your-first-code-generator/ in addition there was a recent Hanselminutes on T4 here: http://www.hanselminutes.com/default.aspx?showID=170.
Edit: Another great place is the T4 tag here on StackOverflow: https://stackoverflow.com/questions/tagged/t4
EDIT: (By asker, new developments)
As of VS2012, T4 now supports reflection over an active project in a single step. This means you can make a change to your code, and the compiled output of the T4 template will reflect the newest version, without requiring you to perform a second reflect/build step. With this capability, I'm marking this as the accepted answer.
You may wish to use CodeDom, so that you only have to build once.
First, I would read this CodeProject article to make sure there are not language-specific features you'd be unable to support without using Reflection.
From what I understand, you could use something like Common Compiler Infrastructure (http://ccimetadata.codeplex.com/) to programatically analyze your existing c# source.
This looks pretty involved to me though, and CCI apparently only has full support for C# language spec 2. A better strategy may be to streamline your existing method instead.
I'm not sure of the best way to do this, but you could do this
As a post-build step on your base dll, run the code generator
As another post-build step, run csc or msbuild to build the generated dll
Other things which depend on the generated dll will also need to depend on the base dll, so the build order remains correct