Get error Diagnostics only for the delta of a CSharpCompilation - c#

I'm using Roslyn to perform code manipulation on C# methods. To test the validity of the rewritten code, I examine compilation errors by calling Compilation.GetDiagnostics(). This is done only in test stages.
This works fine, but it is too slow, especially if the rewritten methods are part of a big project, which apparently gets fully compiled again each time.
I already apply only the delta, replacing the changed syntax tree with Compilation.ReplaceSyntaxTree(oldSyntaxTree, newSyntaxTree), but it is still too slow.
Is there a way to validate only the changed parts? Such as how Visual Studio determines syntax errors while we write code?
The rewritten code consists of changes to a method implementation (one method at a time). I don't create, remove, or change anything in the method's signature or its containing type.
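One workaround that avoids re-analysing the whole project is to ask for diagnostics per syntax tree rather than for the entire compilation: ReplaceSyntaxTree keeps the rest of the compilation intact, and GetSemanticModel(newTree).GetDiagnostics() only analyses that one tree. A minimal sketch, assuming the old and rewritten trees are at hand:
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
static class DeltaValidator
{
    // Re-checks only the rewritten tree; `compilation` still carries the rest of the project.
    public static CSharpCompilation Validate(CSharpCompilation compilation, SyntaxTree oldTree, SyntaxTree newTree)
    {
        var updated = compilation.ReplaceSyntaxTree(oldTree, newTree);
        // The semantic model's diagnostics are scoped to this one tree
        // (optionally pass the rewritten method's span to narrow the check further).
        var diagnostics = updated.GetSemanticModel(newTree).GetDiagnostics();
        foreach (var diagnostic in diagnostics)
            System.Console.WriteLine(diagnostic);
        return updated;
    }
}
Since only method bodies change and no signatures or types are touched, other trees should not gain new errors from the edit, so restricting the check to the rewritten tree is reasonable in this scenario.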

Related

Is it possible to instruct C# compiler NOT to inline constants?

This question is related to How to detect static code dependencies in C# code in the presence of constants?
If type X depends on a constant defined in type Y, this dependency is not captured in the binary code, because the constant is inlined. Yet the dependency is there - try compiling X without Y and the compilation fails. So it is a compile-time dependency, but not a runtime one.
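For example (a hypothetical pair of types), the compiler copies the literal into X's IL, so the binary contains no reference back to Y:
public static class Y
{
    public const int MaxUsers = 100;   // the constant that gets inlined into consumers
}
public static class X
{
    // Compiles to the equivalent of "return 100;": the IL contains the literal
    // (ldc.i4.s 100) and no reference to Y, so binary inspection misses the dependency.
    public static int Limit() { return Y.MaxUsers; }
}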
I need to be able to discover such dependencies and scanning all the source code is prohibitively expensive. However, I have full control over the build and if there is a way to instruct the C# compiler not to inline constants - that is good enough for me.
Is there a way to compile C# code without inlining the constants?
EDIT 1
I would like to respond to all the comments so far:
I cannot modify the source code. This is not a toy project. I am analysing a big code base - millions of lines of C# code.
I am already using the Roslyn API to examine the source code. However, I only do it when the binary code inspection (I use Mono.Cecil) of a method indicates the use of dynamic types. Analysing methods using dynamic with Roslyn is useful, because not all dynamic usages are as bad as reflection. However, there is absolutely no way to figure out from the binary code that a method uses a constant. Using a Roslyn analyser for that takes a really long time because of the code base size - hence my "prohibitively expensive" statement (a sketch of such a source-level scan follows after these points).
I have an NDepend license and I used it at first. However, it only processes binary code. It does NOT see any dependencies introduced through constants. My analysis is better, because I drill down to dynamic users and employ Roslyn API to harvest as much as I can from such methods. NDepend does nothing of the kind. Moreover, it has bugs. For example, the latest version does not inspect generic method constraints and thus does not recognise any dependencies introduced by them.
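For reference, the kind of source-level scan mentioned above looks roughly like this with the Roslyn API (names here are illustrative). It works, but it has to bind every identifier in every file, which is exactly what makes it prohibitively expensive at this scale:
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;
static class ConstDependencyScanner
{
    // Yields the const fields that the code in `tree` reads; the containing type of each
    // returned field is the hidden dependency target.
    public static IEnumerable<IFieldSymbol> FindConstReads(Compilation compilation, SyntaxTree tree)
    {
        var model = compilation.GetSemanticModel(tree);
        foreach (var name in tree.GetRoot().DescendantNodes().OfType<IdentifierNameSyntax>())
        {
            if (model.GetSymbolInfo(name).Symbol is IFieldSymbol field && field.IsConst)
                yield return field;
        }
    }
}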

Is it possible to instantiate and call a method of the class my Analyzer is analyzing right now?

My analyzer will match methods with certain signatures. I would like from inside my analyzer to create an instance of the class I'm analyzing and call the method that caused the analyzer to kick in.
Assuming the source code is in a compilable state, is it possible?
Getting the class name and method name is pretty easy, but Type.GetType(...) will always return null.
The purpose of this is that I would like for my analyzer to kick in when I'm on a test method and run it, failing if the test fails.
If the code is not ready for compilation, it would be fine to just return.
It seems possible, but you'd need to check the efficiency of these solutions. Also, you can't guarantee that the code is compilable.
You can grab the Compilation object (from, let's say, context.SemanticModel.Compilation), call Emit on it to produce the assembly, then use Assembly.Load to load it; from there it's simple reflection to instantiate the class, whose name you already know, and call the method on it with appropriate arguments.
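A minimal sketch of that emit-and-load route (emitting to a MemoryStream rather than to disk; the type and method names are placeholders for whatever the analyzer has matched):
using System;
using System.IO;
using System.Reflection;
using Microsoft.CodeAnalysis;
static class AnalyzerTestRunner
{
    public static object TryRun(Compilation compilation, string typeName, string methodName)
    {
        using (var stream = new MemoryStream())
        {
            var emitResult = compilation.Emit(stream);   // compile to an in-memory assembly
            if (!emitResult.Success)
                return null;                             // code isn't compilable, just return
            var assembly = Assembly.Load(stream.ToArray());   // note: this assembly can never be unloaded
            var type = assembly.GetType(typeName);
            var instance = Activator.CreateInstance(type);
            return type.GetMethod(methodName).Invoke(instance, null);
        }
    }
}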
Another approach would be to use the Compilation in a scripting session as a reference assembly, and use the Roslyn Scripting API to invoke the method. There is a ToMetadataReference method on the Compilation, so you could get a MetadataReference, which could then be passed to ScriptOptions.Default.AddReferences. You'd then pass the resulting options instance to CSharpScript.EvaluateAsync().
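A hedged sketch of that scripting route (the Microsoft.CodeAnalysis.CSharp.Scripting package is assumed, and the invoked type/method names are made up):
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Scripting;
using Microsoft.CodeAnalysis.Scripting;
static class ScriptedTestRunner
{
    // The invocation expression is hypothetical; substitute whatever the analyzer found.
    public static Task<object> RunAsync(Compilation compilation)
    {
        var options = ScriptOptions.Default.AddReferences(compilation.ToMetadataReference());
        return CSharpScript.EvaluateAsync("new MyNamespace.MyTests().MyTestMethod()", options);
    }
}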
There's another fundamental reason you can't run code from the user's compilation, even if it did actually compile -- it might be the wrong environment. Consider a scenario where you're targeting Windows Phone, or Xamarin Android/iOS, .NET Core on Linux, or whatever. In any of these cases the compiler has reference assemblies that you can compile against but obviously you can't actually run that code because it's targeting a different platform. People often ask why you can't convert an ITypeSymbol to a reflection System.Type and back, and this is one of the reasons why -- the compiler can compile code on platform A for platform B, when it can't actually run (or fully load) B's assemblies in the first place.

Does the compiler discard empty methods?

Would C# compiler optimize empty void methods away?
Something like
private void DoNothing()
{
}
Since essentially no code is run aside from adding DoNothing to the call stack and removing it again, wouldn't it be better to optimize this call away?
Would C# compiler optimize empty void methods away?
No. They could still be accessed via reflection, so it's important that the method itself stays.
Any call sites are likely to include the call as well - but the JIT may optimize them away. It's in a much better position to do so. It's basically a special case of inlining, where the inlined code is empty.
Note that if you call it on another object:
foo.DoNothing();
that's not a no-op, because it will check that foo is non-null.
If you want, you could hook the post-build event for every project and run an IL-inspecting tool that reflects over your generated DLL, inspects every MethodInfo in your types, requests its IL, looks for empty IL patterns (such as nothing but nop and ret instructions), and removes the unwanted methods.
For example:
var ilBytes = SomeMethodInfo.GetMethodBody().GetILAsByteArray();
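A rough sketch of the detection half (removal would then need something like Mono.Cecil). The assumption is that an empty method body consists only of nop (0x00) and ret (0x2A) opcodes, which is what an empty void method compiles to (debug builds add the nops):
using System;
using System.Linq;
using System.Reflection;
static class EmptyMethodFinder
{
    static bool HasEmptyBody(MethodInfo method)
    {
        var il = method.GetMethodBody()?.GetILAsByteArray();
        return il != null && il.All(b => b == 0x00 || b == 0x2A);
    }
    static void Main(string[] args)
    {
        var assembly = Assembly.LoadFrom(args[0]);   // the freshly built dll from the post-build event
        var flags = BindingFlags.Instance | BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic;
        foreach (var type in assembly.GetTypes())
            foreach (var method in type.GetMethods(flags))
                if (HasEmptyBody(method))
                    Console.WriteLine($"{type.FullName}.{method.Name}");
    }
}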
A good obfuscation tool will "prune" methods in this way: preemptive.com/products/dotfuscator/features#pruning – weston
You could also run such a tool outside Visual Studio to find empty methods and remove them from the files where they are defined or used.
Never. The compiler doesn't care whether a method is empty or not: what you write is what you get in your MSIL. You can verify this yourself with ILDASM.

How to get an attribute to act as [Conditional("DEBUG")]?

I have a C# program where some parts of code are generated using D-style mixins (i.e., the body of the method is compiled, executed, and results inserted into a class). The method is marked with [MixinAttribute] and, naturally, I don't want it to be compiled into the program. Is there some cheap way of preventing the method decorated with this attribute from being included in a build?
The only way is with compiler conditionals:
#if DEBUG
[MixinAttribute]
// method you don't want included
#endif
The problem with this approach is that you then create a member which will be unavailable in builds where DEBUG is not defined. You then have to mark all usages with the conditional, and I don't think this is what you want. It's not quite clear, but I think what you are really asking is how to make call sites disappear at build time (which is what ConditionalAttribute controls: the compiler omits calls to the marked method, but the method itself is still compiled into the assembly). If that is the case, you can't really do this easily in C# without some kind of dynamic dispatch override (using a proxying library) or a post-processing tool like PostSharp that manipulates the compiler output.
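A small illustration of that behaviour (names are illustrative only): the call site is dropped in release builds, but the method body still ships in the assembly, which is exactly what the question wants to avoid:
using System;
using System.Diagnostics;
class MixinDemo
{
    [Conditional("DEBUG")]       // callers are stripped when DEBUG is not defined...
    public void GenerateMixin()
    {
        Console.WriteLine("only runs in DEBUG builds");
    }
    public void Build()
    {
        GenerateMixin();         // ...so this call compiles to nothing in a release build,
                                 // but the body of GenerateMixin is still compiled into the IL.
    }
}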

Is there any off the shelf component which can be used to evaluate expressions on an object?

We would like to parse expressions of the type:
Func<T1, bool>, Func<T1, T2, bool>, Func<T1, T2, T3, bool>, etc.
I understand that it is relatively easy to build an expression tree and evaluate it, but I would like to get around the overhead of doing a Compile on the expression tree.
Is there any off the shelf component which can do this?
Is there any component which can parse C# expressions from a string and evaluate them? (Something like "expression services" for C#; I think something like this is available for VB and is used by WF4.)
Edit:
We have specific models on which we need to evaluate expressions entered by IT administrators.
public class SiteModel
{
    public int NumberOfUsers { get; set; }
    public int AvailableLicenses { get; set; }
}
We would like for them to enter an expression like:
Site.NumberOfUsers > 100 && Site.AvailableLicenses < Site.NumberOfUsers
We would then like to generate a Func which can be evaluated by passing a SiteModel object.
Func<SiteModel, bool> (Site) => Site.NumberOfUsers > 100 && Site.AvailableLicenses < Site.NumberOfUsers
Also, the performance should not be miserable (but around 80-100 calls per second on a normal PC should be fine).
Mono.CSharp can evaluate expressions from strings and is very simple to use. The required references come with the Mono compiler and runtime (in the tools directory, IIRC).
You need to reference Mono.CSharp.dll and the Mono C# compiler executable (mcs.exe).
Next set up the evaluator to know about your code if necessary.
using Mono.CSharp;
...
Evaluator.ReferenceAssembly (Assembly.GetExecutingAssembly ());
Evaluator.Run ("using Foo.Bar;");
Then evaluating expressions is as simple as calling Evaluate.
var x = (bool) Evaluator.Evaluate ("0 == 1");
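Applied to the question's example, you could ask the evaluator for a delegate once and reuse it for every call. This sketch assumes the same static Evaluator API as above (newer Mono.CSharp versions use an Evaluator instance instead) and that SiteModel has been made visible via the ReferenceAssembly/Run calls:
// Cast the evaluated lambda to a reusable delegate.
var predicate = (Func<SiteModel, bool>) Evaluator.Evaluate(
    "new Func<SiteModel, bool>(Site => Site.NumberOfUsers > 100 && Site.AvailableLicenses < Site.NumberOfUsers)");
bool ok = predicate(new SiteModel { NumberOfUsers = 150, AvailableLicenses = 120 });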
Maybe ILCalc (on codeplex) does what you are looking for. It comes as a .NET and a Silverlight version and is open sourced.
We have been using it successfully for quite a while. It even allows you to reference variables in your expression.
The "component" you are talking about:
Needs to understand C# syntax (for parsing your input string)
Needs to understand C# semantics (where to perform implicit int->double conversions, etc.)
Needs to generate IL code
Such a "component" is called a C# compiler.
The current Microsoft C# compiler is a poor option, as it runs in a separate process (thus increasing compilation time, since all the metadata needs to be loaded into that process) and can only compile full assemblies (and .NET assemblies cannot be unloaded without unloading the whole AppDomain, thus leaking memory). However, if you can live with those restrictions, it's an easy solution - see sgorozco's answer.
The future Microsoft C# compiler (Roslyn project) will be able to do what you want, but that is still some time in the future - my guess is that it will be released with the next VS after VS11, i.e. with C# 6.0.
Mono C# compiler (see Mark H's answer) can do what you want, but I don't know if that supports code unloading or will also leak a bit of memory.
Roll your own. You know which subset of C# you need to support, and there are separate components available for the various "needs" above. For example, NRefactory 5 can parse C# code and analyze semantics. Expression Trees greatly simplify IL code generation. You could write a converter from NRefactory ResolveResults to Expression Trees, which would likely solve your problem in less than 300 lines of code. However, NRefactory reuses large parts of the Mono C# compiler in its parser - and if you're taking that big a dependency, you might as well go with the Mono compiler option above.
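To give a feel for the "roll your own" route, this is roughly what the final step looks like once a parser has produced the pieces: building the question's Func directly with System.Linq.Expressions (the parser in front of this is the part you would take from NRefactory or write yourself):
using System;
using System.Linq.Expressions;
static class SitePredicateBuilder
{
    // Hand-built equivalent of: Site => Site.NumberOfUsers > 100 && Site.AvailableLicenses < Site.NumberOfUsers
    public static Func<SiteModel, bool> Build()
    {
        var site = Expression.Parameter(typeof(SiteModel), "Site");
        var numberOfUsers = Expression.Property(site, "NumberOfUsers");
        var availableLicenses = Expression.Property(site, "AvailableLicenses");
        var body = Expression.AndAlso(
            Expression.GreaterThan(numberOfUsers, Expression.Constant(100)),
            Expression.LessThan(availableLicenses, numberOfUsers));
        // Compile once and cache the delegate; 80-100 evaluations per second is then trivial.
        return Expression.Lambda<Func<SiteModel, bool>>(body, site).Compile();
    }
}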
Perhaps this technique is useful to you - especially regarding dependency concerns, as it relies solely on framework components.
EDIT: as pointed out by @Asti, this technique creates dynamic assemblies that unfortunately, due to limitations of the .NET Framework design, cannot be unloaded, so careful consideration should be given before using it. This means that if a script is updated, the old assembly containing the previous version of the script can't be unloaded from memory and will linger until the application or service hosting it is restarted.
In a scenario where scripts change infrequently, and where compiled scripts are cached and reused rather than recompiled on every use, this memory leak can IMO be safely tolerated (this has been the case for all our uses of the technique). Fortunately, in my experience, the memory footprint of the generated assemblies for typical scripts tends to be quite small.
If this is not acceptable, the scripts can be compiled in a separate AppDomain that can be unloaded from memory, although this requires marshaling calls between domains (e.g. through a named-pipe WCF service, or perhaps an IIS-hosted service, where unloading occurs automatically after an inactivity period or when a memory footprint threshold is exceeded).
End EDIT
First, add a reference to Microsoft.CSharp to your project, then add the following using statements:
using System.CodeDom.Compiler; // included in the System.dll assembly
using Microsoft.CSharp;
Then add the following method:
private void TestDynCompile()
{
    // the code you want to dynamically compile, as a string
    string code = @"
        using System;
        namespace DynCode
        {
            public class TestClass
            {
                public string MyMsg(string name)
                {
                    //---- this would be code your users provide
                    return string.Format(""Hello {0}!"", name);
                    //----
                }
            }
        }";
    // obtain a reference to a C# compiler
    var provider = CodeDomProvider.CreateProvider("CSharp");
    // create an instance of the compilation parameters
    var cp = new CompilerParameters();
    // add assembly dependencies
    cp.ReferencedAssemblies.Add("System.dll");
    // hold the compiled assembly in memory, don't produce an output file
    cp.GenerateInMemory = true;
    cp.GenerateExecutable = false;
    // don't produce debugging information
    cp.IncludeDebugInformation = false;
    // compile the source code
    var rslts = provider.CompileAssemblyFromSource(cp, code);
    if (rslts.Errors.Count == 0)
    {
        // no compilation errors, obtain the type for DynCode.TestClass
        var type = rslts.CompiledAssembly.GetType("DynCode.TestClass");
        // create an instance of the dynamically compiled class
        dynamic instance = Activator.CreateInstance(type);
        // invoke the dynamic code
        MessageBox.Show(instance.MyMsg("Gerardo")); // "Hello Gerardo!" is displayed =)
    }
}
As you can see, you need to add some boilerplate (a wrapper class definition, assembly dependency injection, etc.), but this is a really powerful technique that adds scripting capabilities with full C# syntax and executes almost as fast as static code (invocation will be a little bit slower).
Assembly dependencies can refer to your own project dependencies, so classes and types defined in your project can be referenced and used inside the dynamic code.
Hope this helps!
Not sure about the performance part but this seems like a good match for dynamic linq...
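With the Dynamic LINQ sample that ships with the VS samples (System.Linq.Dynamic, the Dynamic.cs file, or a NuGet port of it), that looks roughly like the sketch below; note the type-conversion caveats mentioned at the end of this thread:
using System;
using System.Linq.Expressions;
using System.Linq.Dynamic;   // the Dynamic.cs sample from the VS samples
class DynamicLinqExample
{
    static void Main()
    {
        // Parse the administrator's text into a typed lambda, compile it once, and cache the delegate.
        var lambda = (Expression<Func<SiteModel, bool>>) DynamicExpression.ParseLambda(
            typeof(SiteModel), typeof(bool),
            "NumberOfUsers > 100 && AvailableLicenses < NumberOfUsers");
        Func<SiteModel, bool> predicate = lambda.Compile();
        Console.WriteLine(predicate(new SiteModel { NumberOfUsers = 150, AvailableLicenses = 120 }));
    }
}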
Generate an XSD from the SiteModel class, then let the administrator input the expression through a web (or other) UI, transform the input via XSL into a functor literal, and then generate and execute it via CodeDom on the fly.
Maybe you can use Lua scripts as input. The user enters a Lua expression and you can parse and execute it with the Lua engine. If needed, you can wrap the input with some other Lua code before interpreting it. I'm not sure about the performance, but 100 calls/s is not that much.
Evaluating expressions is always a security issue. So take care of that, too.
You can use Lua in C#.
Another way would be to compile some C# code that contains the input expression in a class, but then you end up with one assembly per request, and regular .NET assemblies can't be unloaded without unloading the whole AppDomain, so this solution might not scale well. A workaround could be a separate process that is restarted every X requests.
Thanks for your answers.
Introducing a dependency on Mono in a product like ours (which has more than 100K installations and a long release cycle of 1-1.5 years) may not be a good option for us. It might also be overkill, since we only need to support simple expressions (with little or no nesting) and not an entire language.
After trying the CodeDOM compiler, we noticed that it causes the application to leak memory. Although we could load it into a separate AppDomain to work around this, that again might be overkill.
The dynamic LINQ expression tree sample provided as part of the VS Samples has a lot of bugs and no support for type conversions when doing comparisons (changing a string to an int, a double to an int, a long to an int, etc.). The parsing for indexers also seems to be broken. Although not usable off the shelf, it shows promise for our use cases.
We have decided to go with expression trees as of now.
