I've been puzzling about this for a while and I've looked around a bit, unable to find any discussion about the subject.
Let's assume I wanted to implement a trivial example, like a new looping construct: do..until
Written very similarly to do..while
do {
//Things happen here
} until (i == 15)
This could be transformed into valid C# like so:
do {
//Things happen here
} while (!(i == 15))
This is obviously a simple example, but is there any way to add something of this nature? Ideally as a Visual Studio extension to enable syntax highlighting etc.
Microsoft provides the Roslyn API, an implementation of the C# compiler with a public API. It contains individual APIs for each of the compiler pipeline stages: syntax analysis, symbol creation, binding, and MSIL emission. You can provide your own implementation of the syntax parser, or extend the existing one, to get a C# compiler with any features you would like.
Roslyn CTP
Let's extend the C# language using Roslyn! In my example I'm replacing a do-until statement with the corresponding do-while:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Roslyn.Compilers.CSharp;
namespace RoslynTest
{
class Program
{
static void Main(string[] args)
{
var code = @"
using System;
class Program {
public void My() {
var i = 5;
do {
Console.WriteLine(""hello world"");
i++;
}
until (i > 10);
}
}
";
//Parsing input code into a SyntaxTree object.
var syntaxTree = SyntaxTree.ParseCompilationUnit(code);
var syntaxRoot = syntaxTree.GetRoot();
//Here we will keep all nodes to replace
var replaceDictionary = new Dictionary<DoStatementSyntax, DoStatementSyntax>();
//Looking for do-until statements in all descendant nodes
foreach (var doStatement in syntaxRoot.DescendantNodes().OfType<DoStatementSyntax>())
{
//Until token is treated as an identifier by C# compiler. It doesn't know that in our case it is a keyword.
var untilNode = doStatement.Condition.ChildNodes().OfType<IdentifierNameSyntax>().FirstOrDefault((_node =>
{
return _node.Identifier.ValueText == "until";
}));
//Condition is treated as an argument list
var conditionNode = doStatement.Condition.ChildNodes().OfType<ArgumentListSyntax>().FirstOrDefault();
if (untilNode != null && conditionNode != null)
{
//Let's replace identifier w/ correct while keyword and condition
var whileNode = Syntax.ParseToken("while");
var condition = Syntax.ParseExpression("(!" + conditionNode.GetFullText() + ")");
var newDoStatement = doStatement.WithWhileKeyword(whileNode).WithCondition(condition);
//Accumulating all replacements
replaceDictionary.Add(doStatement, newDoStatement);
}
}
syntaxRoot = syntaxRoot.ReplaceNodes(replaceDictionary.Keys, (node1, node2) => replaceDictionary[node1]);
//Output preprocessed code
Console.WriteLine(syntaxRoot.GetFullText());
}
}
}
///////////
//OUTPUT://
///////////
// using System;
// class Program {
// public void My() {
// var i = 5;
// do {
// Console.WriteLine("hello world");
// i++;
// }
//while(!(i > 10));
// }
// }
Now we can compile updated syntax tree using Roslyn API or save syntaxRoot.GetFullText() to text file and pass it to csc.exe.
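The "compile the updated tree" step could look roughly like the sketch below. Note the assumptions: the answer above uses the old Roslyn CTP namespaces (Roslyn.Compilers.CSharp), whereas this sketch assumes the current Microsoft.CodeAnalysis.CSharp NuGet package. PreprocessedCompiler and TryCompile are hypothetical names, and the single metadata reference shown is only enough for very simple code; real projects need their full reference list.

```csharp
using System;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper: compiles already-preprocessed C# text in memory.
public static class PreprocessedCompiler
{
    public static bool TryCompile(string source, out string[] errors)
    {
        // Parse the transformed (now valid) C# text.
        var tree = CSharpSyntaxTree.ParseText(source);

        // Assumption: only the core library is referenced.
        var references = new[]
        {
            MetadataReference.CreateFromFile(typeof(object).Assembly.Location)
        };

        var compilation = CSharpCompilation.Create(
            "Preprocessed",
            new[] { tree },
            references,
            new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        using (var stream = new MemoryStream())
        {
            // Emit IL into the stream and collect any compile errors.
            var result = compilation.Emit(stream);
            errors = result.Diagnostics
                .Where(d => d.Severity == DiagnosticSeverity.Error)
                .Select(d => d.ToString())
                .ToArray();
            return result.Success;
        }
    }
}
```

You would pass it the text produced by the rewriting step, and inspect errors if it returns false.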
The big missing piece is hooking into the pipeline; otherwise you're not much further along than what .Emit provided. Don't misunderstand, Roslyn brings a lot of great things, but for those of us who want to implement preprocessors and metaprogramming, it seems that was not on the plate for now. You can implement "code suggestions", or what they call "issues"/"actions", as an extension, but this is basically a one-off transformation of code that acts as a suggested inline replacement, and it is not the way you would implement a new language feature. This is something you could always do with extensions, but Roslyn makes the code analysis and transformation tremendously easier.
From what I've read of comments from Roslyn developers on the codeplex forums, providing hooks into the pipeline has not been an initial goal. All of the new C# language features they've provided in C# 6 preview involved modifying Roslyn itself. So you'd essentially need to fork Roslyn. They have documentation on how to build Roslyn and test it with Visual Studio. This would be a heavy handed way to fork Roslyn and have Visual Studio use it. I say heavy-handed because now anyone who wants to use your new language features must replace the default compiler with yours. You could see where this would begin to get messy.
Building Roslyn and replacing Visual Studio 2015 Preview's compiler with your own build
Another approach would be to build a compiler that acts as a proxy to Roslyn. There are standard APIs for building compilers that VS can leverage. It's not a trivial task though. You'd read in the code files, call upon the Roslyn APIs to transform the syntax trees and emit the results.
The other challenge with the proxy approach is going to be getting intellisense to play nicely with any new language features you implement. You'd probably have to have your "new" variant of C#, use a different file extension, and implement all the APIs that Visual Studio requires for intellisense to work.
Lastly, consider the C# ecosystem, and what an extensible compiler would mean. Let's say Roslyn did support these hooks, and it was as easy as providing a NuGet package or a VS extension to support a new language feature. All of your C# leveraging the new do-until feature is essentially invalid C#, and will not compile without your custom extension. If you go far enough down this road, with enough people implementing new features, you will very quickly find incompatible language features. Maybe someone implements a preprocessor macro syntax, but it can't be used alongside someone else's new syntax because they happened to use similar syntax to delineate the beginning of the macro. If you leverage a lot of open source projects and find yourself digging into their code, you would encounter a lot of strange syntax that would require you to sidetrack and research the particular language extensions that project is using. It could be madness. I don't mean to sound like a naysayer, as I have a lot of ideas for language features and am very interested in this, but one should consider the implications and how maintainable it would be. Imagine you got hired somewhere that had implemented all kinds of new syntax you had to learn. Without those features having been vetted the way C#'s features have, you can bet some of them would not be well designed or implemented.
You can check www.metaprogramming.ninja (I am the developer), it provides an easy way to accomplish language extensions (I provide examples for constructors, properties, even js-style functions) as well as full-blown grammar based DSLs.
The project is open source as well. You can find documentation, examples, etc. on GitHub.
Hope it helps.
You can't create your own syntactic abstractions in C#, so the best you can do is to create your own higher-order function. You could create an Action extension method:
public static void DoUntil(this Action act, Func<bool> condition)
{
do
{
act();
} while (!condition());
}
Which you can use as:
int i = 1;
new Action(() => { Console.WriteLine(i); i++; }).DoUntil(() => i == 15);
although it's questionable whether this is preferable to using a do..while directly.
I found the easiest way to extend the C# language is to use the T4 text processor to preprocess my source. The T4 Script would read my C# and then call a Roslyn based parser, which would generate a new source with custom generated code.
During build time, all my T4 scripts would be executed, thus effectively working as an extended preprocessor.
In your case, the non-compliant C# code could be entered as follows:
#if ExtendedCSharp
do
#endif
{
Console.WriteLine("hello world");
i++;
}
#if ExtendedCSharp
until (i > 10);
#endif
This would allow syntax checking the rest of your (C# compliant) code during development of your program.
No, there is no way to achieve what you're talking about.
What you're asking for is defining a new language construct, which means new lexical analysis, a new language parser, semantic analyzer, compilation and optimization of the generated IL.
What you can do in such cases is use some helper functions as a substitute for macros.
public bool Until(int val, int check)
{
return !(val == check);
}
and use it like
do {
//Things happen here
} while (Until(i, 15))
Related
I want to use Roslyn to clean the code of some of the older preprocessor directives.
For example, from this code
#define TEST_1_0
#define TEST_1_1
namespace ConsoleApplication1
{
class TypeName
{
public static void Main(string[] args)
{
#if TEST_1_0
int TEST_1_0 = 1;
#if TEST_1_1
int TEST_1_1 = 1;
#else//TEST_1_1
int TEST_1_1 = 0;
#endif//TEST_1_1
#else//TEST_1_0
int TEST_1_0 = 0;
#endif//TEST_1_0
}
}
}
I'd like to remove #else//TEST_1_0, but keep #else//TEST_1_1. I cannot count on the comments, so I need to relate each #if to its corresponding #else, if there is one.
Finding the #if is easy, but finding the corresponding #else is less easy.
I tried two strategies:
1. Look up #else//TEST_1_0 in the analyzer and create a code fix for that location.
2. Create a code fix only for #if TEST_1_0 in the analyzer, and try to reach the corresponding #else from the CodeFixProvider.
Both get complicated quickly. The problem seems to be that directives are trivia, spread out over the leading trivia of different SyntaxTokens. Changes in the code move the directives around quite a bit, so it looks like a lot of work to program all the cases.
Am I missing something? Is there an easier way to do this without programming all the different cases by hand?
Would you go for strategy 1 or 2?
I agree with Arjan: Roslyn is not usable for this task. To solve a similar task I made my own simple C# preprocessor tool, undefine, based on regular expressions and the Python sympy library. I believe it would be helpful for you. For the task you described, try the following command:
>> python undefine apply -d TEST_1_0 YourFile.cs
I concluded that Roslyn is not the way to go here.
Roslyn models preprocessor directives as trivia in the syntax tree, and the location of trivia varies greatly depending on the structure of the actual code.
Working on the syntax tree therefore introduces lookup complexities that are not an issue when working on the text, and more complexity means more RISK. Binaries should be the same before and after processing!
So I chose to abandon Roslyn and simply parse the code/directive mix as text, using regex to parse and the good-old stack to handle the directive logic.
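A minimal sketch of that regex-plus-stack idea is shown below. It is not the author's actual tool: DirectiveBlock and DirectiveMatcher are hypothetical names, and the sketch assumes directives sit on their own lines and ignores #elif, as well as '#' characters inside verbatim strings or comments.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical record of one #if ... [#else] ... #endif group (0-based line indexes).
public sealed class DirectiveBlock
{
    public int IfLine;
    public string Condition;
    public int ElseLine = -1;   // -1 when the block has no #else
    public int EndIfLine = -1;
}

public static class DirectiveMatcher
{
    private static readonly Regex Directive =
        new Regex(@"^\s*#\s*(if|elif|else|endif)\b(.*)$", RegexOptions.Compiled);

    // Pairs every #if with its own #else/#endif; blocks are returned in the order they close.
    public static List<DirectiveBlock> Match(string[] lines)
    {
        var closed = new List<DirectiveBlock>();
        var open = new Stack<DirectiveBlock>();

        for (int i = 0; i < lines.Length; i++)
        {
            var m = Directive.Match(lines[i]);
            if (!m.Success) continue;

            switch (m.Groups[1].Value)
            {
                case "if":
                    open.Push(new DirectiveBlock { IfLine = i, Condition = StripComment(m.Groups[2].Value) });
                    break;
                case "else":
                    if (open.Count == 0) throw new FormatException("#else without #if at line " + i);
                    open.Peek().ElseLine = i;   // belongs to the innermost open #if
                    break;
                case "endif":
                    if (open.Count == 0) throw new FormatException("#endif without #if at line " + i);
                    var block = open.Pop();
                    block.EndIfLine = i;
                    closed.Add(block);
                    break;
                // "elif" is deliberately ignored in this sketch.
            }
        }

        if (open.Count > 0) throw new FormatException("Unterminated #if at line " + open.Peek().IfLine);
        return closed;
    }

    private static string StripComment(string text)
    {
        int comment = text.IndexOf("//", StringComparison.Ordinal);
        return (comment >= 0 ? text.Substring(0, comment) : text).Trim();
    }
}
```

Because the #else is attached to whatever #if is on top of the stack, the trailing comments such as //TEST_1_0 are never consulted, which addresses the "I cannot count on the comments" constraint from the question.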
Now it's much easier, even a piece of cake ..
Still need to handle some encoding issues, then I'm done! :)
Happy parsing!
Does anybody know of a way to prevent developers from using a particular type of expression in a Visual Studio C# code base? I have seen that analysers might be the way to go here.
I would like Visual Studio to warn developers when they use DateTime.Now instead of DateTime.UtcNow in our code base.
As far as my analysis goes, I don't think we will ever need to use DateTime.Now in our API code base, so I'm also pondering the idea of throwing errors at compilation if DateTime.Now is used, in case unit testing doesn't cover this case.
The reason I want to do this is to prevent any non UTC time values from making their way into the database. Senior developers are solid on this, however I have seen time and time again DateTime.Now being used by less senior developers, and I would like to automate the management of this.
I know that setting server time zones to UTC could negate this, however I'd like to solve the issue closer to the cause.
Are there any existing tools out there I could leverage, that anyone knows of, or should I write my own analyser?
Any tips would be much appreciated.
Here's a simple Roslyn example:
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp; //NuGet package
using Microsoft.CodeAnalysis.CSharp.Syntax;
using System;
namespace StackOverflow_RosalynExample
{
class Program
{
static void Main(string[] args)
{
var tree = CSharpSyntaxTree.ParseText(@"
using System;
namespace SomeNamespace
{
class SomeClass
{
public void SomeMethod()
{
DateTime example = DateTime.Now;
}
}
}");
var rewriter = new DateTimeUtcEnsurer();
var result = rewriter.Visit(tree.GetRoot());
Console.WriteLine(result.ToFullString());
Console.ReadKey(); //DateTime.Now -> DateTime.UtcNow
}
}
public class DateTimeUtcEnsurer : CSharpSyntaxRewriter
{
public override SyntaxNode VisitMemberAccessExpression(MemberAccessExpressionSyntax node)
{
var dateTimeNow = SyntaxFactory.ParseExpression("DateTime.Now") as MemberAccessExpressionSyntax;
if (SyntaxFactory.AreEquivalent(node, dateTimeNow))
{
var dateTimeUtcNow = SyntaxFactory.ParseExpression("DateTime.UtcNow") as MemberAccessExpressionSyntax;
dateTimeUtcNow = dateTimeUtcNow.WithTrailingTrivia(SyntaxFactory.ParseTrailingTrivia(" /*Silly junior dev ;)*/"));
return dateTimeUtcNow;
}
return base.VisitMemberAccessExpression(node);
}
}
}
This is just to give you an idea of how to use Roslyn, a C# compiler tool. Here, I use it to convert a string to a C# Abstract Syntax Tree, modify the AST directly, and then toString the AST.
The big advantage of manipulating the AST over, say directly manipulating text via regex, is safety. (Admittedly less important in this example).
For your needs, you could adjust this example to accept a file directory as a command line argument and run the rewriter against all files ending in cs. (I did something similar last year; ~10,000 files in ~20 seconds). Then, you could run the enforcement tool before each check-in to source control.
Alternatively, you could make an active code-analyzer, a la intellisense. (Roslyn is probably more geared for that anyways.)
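For the analyzer route, a diagnostic analyzer along the following lines might be a starting point. This is a sketch under assumptions, not a finished rule: the class name DateTimeNowAnalyzer and the diagnostic ID UTC0001 are made up for illustration, and it assumes the Microsoft.CodeAnalysis.CSharp package in an analyzer project.

```csharp
using System.Collections.Immutable;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Diagnostics;

// Hypothetical analyzer: warns on every use of System.DateTime.Now.
[DiagnosticAnalyzer(LanguageNames.CSharp)]
public sealed class DateTimeNowAnalyzer : DiagnosticAnalyzer
{
    // "UTC0001" is an invented ID for this sketch.
    private static readonly DiagnosticDescriptor Rule = new DiagnosticDescriptor(
        "UTC0001",
        "Avoid DateTime.Now",
        "Use DateTime.UtcNow instead of DateTime.Now",
        "Usage",
        DiagnosticSeverity.Warning,
        isEnabledByDefault: true);

    public override ImmutableArray<DiagnosticDescriptor> SupportedDiagnostics
    {
        get { return ImmutableArray.Create(Rule); }
    }

    public override void Initialize(AnalysisContext context)
    {
        context.ConfigureGeneratedCodeAnalysis(GeneratedCodeAnalysisFlags.None);
        context.EnableConcurrentExecution();
        context.RegisterSyntaxNodeAction(Analyze, SyntaxKind.SimpleMemberAccessExpression);
    }

    private static void Analyze(SyntaxNodeAnalysisContext context)
    {
        var access = (MemberAccessExpressionSyntax)context.Node;

        // Cheap syntactic filter before asking the semantic model.
        if (access.Name.Identifier.ValueText != "Now") return;

        // Resolve the symbol so that unrelated members named "Now" are not flagged.
        var property = context.SemanticModel
            .GetSymbolInfo(access, context.CancellationToken).Symbol as IPropertySymbol;
        if (property == null) return;

        if (property.ContainingType.SpecialType == SpecialType.System_DateTime)
        {
            context.ReportDiagnostic(Diagnostic.Create(Rule, access.GetLocation()));
        }
    }
}
```

Unlike the textual comparison in the rewriter above, this checks the resolved symbol, so it should also catch System.DateTime.Now and aliased usages. To fail the build rather than warn, the severity could be raised to DiagnosticSeverity.Error or overridden per project.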
Good luck!
I am trying to write a custom rule (code analysis) where, if the body of a method contains empty statements, an error is raised.
However, there is one problem. I can not seem to figure out how to get the body of a method (the text that is in the method).
How can I get the text inside a method, and assign it to a string?
Thanks in advance.
For reference; I use c# in visual studio, with FxCop to make the rule.
Edit: Some code added for reference, this does NOT work.
using Microsoft.FxCop.Sdk;
using Microsoft.VisualStudio.CodeAnalysis.Extensibility;
public override ProblemCollection Check(Member member)
{
Method method = member as Method;
if (method == null)
{
return null;
}
if (method.Name.Name.Contains("{}"))
{
var resolution = GetResolution(member.Name.Name);
var problem = new Problem(resolution, method)
{
Certainty = 100,
FixCategory = FixCategories.Breaking,
MessageLevel = MessageLevel.Warning
};
Problems.Add(problem);
}
return Problems;
}
FxCop doesn't analyse source code, it works on .Net assemblies built from any language.
You may be able to find out whether the method contains a statement or not using FxCop. I advise you to read the documentation and check the implementation of existing rules to understand it.
An empty statement in the middle of other code might be removed by the compiler and you may not find it using FxCop. If you want to analyze source code you should take a look at StyleCop.
However, there is one problem. I can not seem to figure out how to get the body of a method
(the text that is in the method).
You cannot. FxCop does not work on the source; it analyzes the compiled bytecode.
What you can do is find the source - which is not totally trivial - but you have to do so without the FxCop API. A start point may be analysing the pdb files (not sure where to find documentation) as they can point you to the file that contains the method.
The question in short is: How do you reference a second script containing reusable script code, under the constraints that you need to be able to unload and reload the scripts when either of them changes without restarting the host application?
I'm trying to compile a script class using the CS-Script "compiler as service" (CSScript.Evaluator), while referencing an assembly that has just been compiled from a second "library" script. The purpose is that the library script should contain code that can be reused for different scripts.
Here is a sample code that illustrates the idea but also causes a CompilerException at runtime.
using CSScriptLibrary;
using NUnit.Framework;
[TestFixture]
public class ScriptReferencingTests
{
private const string LibraryScriptCode = @"
public class Helper
{
public static int AddOne(int x)
{
return x + 1;
}
}
";
private const string ScriptCode = @"
using System;
public class Script
{
public int SumAndAddOne(int a, int b)
{
return Helper.AddOne(a+b);
}
}
";
[Test]
public void CSScriptEvaluator_CanReferenceCompiledAssembly()
{
var libraryEvaluator = CSScript.Evaluator.CompileCode(LibraryScriptCode);
var libraryAssembly = libraryEvaluator.GetCompiledAssembly();
var evaluatorWithReference = CSScript.Evaluator.ReferenceAssembly(libraryAssembly);
dynamic scriptInstance = evaluatorWithReference.LoadCode(ScriptCode);
var result = scriptInstance.SumAndAddOne(1, 2);
Assert.That(result, Is.EqualTo(4));
}
}
To run the code you need NuGet packages NUnit and cs-script.
This line causes a CompilerException at runtime:
dynamic scriptInstance = evaluatorWithReference.LoadCode(ScriptCode);
{interactive}(7,23): error CS0584: Internal compiler error: The invoked member is not supported in a dynamic assembly.
{interactive}(7,9): error CS0029: Cannot implicitly convert type '<fake$type>' to 'int'
Again, the reason for using CSScript.Evaluator.LoadCode instead of CSScript.LoadCode is so that the script can be reloaded at any time without restarting the host application when either of the scripts changes. (CSScript.LoadCode already supports including other scripts according to http://www.csscript.net/help/Importing_scripts.html)
Here is the documentation on the CS-Script Evaluator: http://www.csscript.net/help/evaluator.html
The lack of google results for this is discouraging, but I hope I'm missing something simple. Any help would be greatly appreciated.
(This question should be filed under the tag cs-script which does not exist.)
There is some slight confusion here. Evaluator is not the only way to achieve reloadable script behavior. CSScript.LoadCode allows reloading as well.
I do indeed advise considering CSScript.Evaluator.LoadCode as the first candidate for the hosting model, as it offers less overhead and an arguably more convenient reloading model. However, it comes at a cost: you have very little control over reloading and the inclusion of dependencies (assemblies, scripts). Memory leaks are not 100% avoidable. It also makes script debugging completely impossible (Mono bug).
In your case I would really advise you to move to the more conventional hosting model: CodeDOM.
Have a look at the "[cs-script]\Samples\Hosting\CodeDOM\Modifying script without restart" sample.
And "[cs-script]\Samples\Hosting\CodeDOM\InterfaceAlignment" will also give you an idea how to use interfaces with reloading.
CodeDOM was for years the default CS-Script hosting mode, and it is in fact very robust, intuitive and manageable. The only real drawback is that every object you pass to (or get from) the script needs to be serializable or inherit from MarshalByRefObject. This is a side effect of the script being executed in an "automatic" separate domain, so one has to deal with all the "pleasures" of Remoting.
BTW this is the only reason why I implemented Mono-based evaluator.
The CodeDOM model will also automatically manage the dependencies and recompile them when needed. But it looks like you are aware of this anyway.
CodeDOM also allows you to define precisely the mechanism of checking dependencies for changes:
//the default algorithm "recompile if script or dependency is changed"
CSScript.IsOutOfDateAlgorithm = CSScript.CachProbing.Advanced;
or
//custom algorithm "never recompile script"
CSScript.IsOutOfDateAlgorithm = (s, a) => false;
The quick solution to the CompilerException appears to be to not use Evaluator to compile the library assembly, but to use CSScript.CompileCode instead, like so:
var compiledAssemblyName = CSScript.CompileCode(LibraryScriptCode);
var evaluatorWithReference = CSScript.Evaluator.ReferenceAssembly(compiledAssemblyName);
dynamic scriptInstance = evaluatorWithReference.LoadCode(ScriptCode);
However, as stated in the previous answer, this limits the possibilities for dependency control that the CodeDOM model offers (like css_include). Also, changes to LibraryScriptCode are not picked up, which again limits the usefulness of the Evaluator method.
The solution I chose is the AsmHelper.CreateObject and AsmHelper.AlignToInterface<T> methods. This lets you use the regular css_include in your scripts, while at the same time allowing you at any time to reload the scripts by disposing the AsmHelper and starting over. My solution looks something like this:
AsmHelper asmHelper = new AsmHelper(CSScript.Compile(filePath), null, false);
object obj = asmHelper.CreateObject("*");
IMyInterface instance = asmHelper.TryAlignToInterface<IMyInterface>(obj);
// Any other interfaces you want to instantiate...
...
if (instance != null)
instance.MyScriptMethod();
Once a change is detected (I use FileSystemWatcher), you just call asmHelper.Dispose and run the above code again.
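The watch-and-reload loop described here could be sketched as follows. ReloadingScriptHost is a hypothetical wrapper, not part of CS-Script; it only assumes that whatever the load delegate returns is disposable, which matches the AsmHelper usage above.

```csharp
using System;
using System.IO;

// Hypothetical wrapper: reloads a disposable "script handle" whenever the file changes.
public sealed class ReloadingScriptHost : IDisposable
{
    private readonly object _gate = new object();
    private readonly string _scriptPath;
    private readonly Func<string, IDisposable> _load;
    private readonly FileSystemWatcher _watcher;
    private IDisposable _current;

    public ReloadingScriptHost(string scriptPath, Func<string, IDisposable> load)
    {
        _scriptPath = Path.GetFullPath(scriptPath);
        _load = load;
        Reload(); // initial load

        _watcher = new FileSystemWatcher(
            Path.GetDirectoryName(_scriptPath), Path.GetFileName(_scriptPath));
        _watcher.NotifyFilter = NotifyFilters.LastWrite;
        _watcher.Changed += (sender, e) =>
        {
            // The script may be saved in a non-compiling state; remember the error instead of crashing.
            try { Reload(); }
            catch (Exception ex) { LastError = ex; }
        };
        _watcher.EnableRaisingEvents = true;
    }

    public int LoadCount { get; private set; }
    public Exception LastError { get; private set; }

    // Disposes the previous handle (if any) and loads the script again.
    public void Reload()
    {
        lock (_gate)
        {
            if (_current != null) _current.Dispose();
            _current = null;
            _current = _load(_scriptPath);
            LoadCount++;
            LastError = null;
        }
    }

    public void Dispose()
    {
        _watcher.Dispose();
        lock (_gate)
        {
            if (_current != null) _current.Dispose();
            _current = null;
        }
    }
}
```

With the code from this answer, the load delegate would be something like path => new AsmHelper(CSScript.Compile(path), null, false). Be aware that FileSystemWatcher can raise Changed more than once for a single save, so a real host may want to debounce the reloads.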
This method requires the script class to be marked with the Serializable attribute, or simply inherit from MarshalByRefObject.
Note that your script class does not need to inherit any interface. The AlignToInterface works both with and without it. You could use dynamic here, but I prefer having a strongly typed interface to avoid errors down the line.
I couldn't get the built in try-methods to work, so I made this extension method for less clutter when it is not known whether or not the interface is implemented:
public static class InterfaceExtensions
{
public static T TryAlignToInterface<T>(this AsmHelper helper, object obj) where T : class
{
try
{
return helper.AlignToInterface<T>(obj);
}
catch
{
return null;
}
}
}
Most of this is explained in the hosting guidelines http://www.csscript.net/help/script_hosting_guideline_.html, and there are helpful samples mentioned in previous post.
I feel I might have missed something regarding script change detection, but this method works solidly.
In my VSPackage I need to replace reference to a property in code with its actual value. For example
public static void Main(string[] args) {
Console.WriteLine(Resource.HelloWorld);
}
What I want is to replace "Resource.HelloWorld" with its actual value - that is, find class Resource and get value of its static property HelloWorld. Does Visual Studio expose any API to handle code model of the project? It definitely has one, because this is very similar to common task of renaming variables. I don't want to use reflection on output assembly, because it's slow and it locks the file for a while.
There is no straightforward way to do this that I know of. Reliably getting an AST out of Visual Studio (and changes to it) has always been a big problem. Part of the goal of the Roslyn project is to create a unified way of doing this, because many tool windows had their own way of doing this sort of thing.
There are four ways to do this:
Symbols
FileCodeModel + CodeDOM
Roslyn AST
Unexplored Method
Symbols
I believe most tool windows such as the CodeView and things like Code Element Search use the symbols created from a compiled build. This is not ideal as it is a little more heavy weight and hard to keep in sync. You'd have to cache symbols to make this not slow. Using reflector, you can see how CodeView implements this.
This approach uses private assemblies. The code for getting the symbols would look something like this:
var compilerHost = new IDECompilerHost();
var typeEnumerator = (from compiler in compilerHost.Compilers.Cast<IDECompiler>()
from type in compiler.GetCompilation().MainAssembly.Types
select new Tuple<IDECompiler, CSharpType>(compiler, type));
foreach (var typeTuple in typeEnumerator)
{
Trace.WriteLine(typeTuple.Item2.Name);
var csType = typeTuple.Item2;
foreach (var loc in csType.SourceLocations)
{
var file = loc.FileName.Value;
var line = loc.Position.Line;
var charPos = loc.Position.Character;
}
}
FileCodeModel + CodeDOM
You could try using the EnvDTE service to get the FileCodeModel associated with a Code Document. This will let you get classes and methods, but it does not support getting the method body. You're also dealing with buggy COM. This is ugly because a COM object reference to a CodeFunction or CodeClass can be invalidated without you knowing it, meaning you'd have to keep your own mirror.
Roslyn AST
This provides the same capabilities as both FileCodeModel and Symbols. I've been playing with it and it's actually not too bad.
Unexplored Method
You could try getting the underlying LanguageServiceProvider that is associated with the Code Document. But this is really difficult to pull off, and leaves you with many issues.