How to access project code meta data?

In my VSPackage I need to replace reference to a property in code with its actual value. For example
public static void Main(string[] args) {
Console.WriteLine(Resource.HelloWorld);
}
What I want is to replace "Resource.HelloWorld" with its actual value, that is, find the class Resource and get the value of its static property HelloWorld. Does Visual Studio expose any API to handle the code model of the project? It definitely has one, because this is very similar to the common task of renaming variables. I don't want to use reflection on the output assembly, because it's slow and it locks the file for a while.

There is no straightforward way to do this that I know of. Reliably getting an AST out of Visual Studio (and tracking changes to it) has always been a big problem. Part of the goal of the Roslyn project is to create a unified way of doing this, because many tool windows had their own way of doing this sort of thing.
There are four ways to do this:
Symbols
FileCodeModel + CodeDOM
Roslyn AST
Unexplored Method
Symbols
I believe most tool windows, such as the CodeView and things like Code Element Search, use the symbols created from a compiled build. This is not ideal, as it is a little more heavyweight and hard to keep in sync. You'd have to cache symbols to keep this from being slow. Using Reflector, you can see how CodeView implements this.
This approach uses private assemblies. The code for getting the symbols would look something like this:
var compilerHost = new IDECompilerHost();
var typeEnumerator = (from compiler in compilerHost.Compilers.Cast<IDECompiler>()
from type in compiler.GetCompilation().MainAssembly.Types
select new Tuple<IDECompiler, CSharpType>(compiler, type));
foreach (var typeTuple in typeEnumerator)
{
Trace.WriteLine(typeTuple.Item2.Name);
var csType = typeTuple.Item2;
foreach (var loc in csType.SourceLocations)
{
var file = loc.FileName.Value;
var line = loc.Position.Line;
var charPos = loc.Position.Character;
}
}
FileCodeModel + CodeDOM
You could try using the EnvDTE service to get the FileCodeModel associated with a code document. This will let you get classes and methods, but it does not support getting the method body. You're also dealing with buggy COM. This is ugly because a COM object reference to a CodeFunction or CodeClass can be invalidated without you knowing it, meaning you'd have to keep your own mirror.
Roslyn AST
This provides the same capabilities as both FileCodeModel and Symbols. I've been playing with it and it's actually not too bad.
Unexplored Method
You could try getting the underlying LanguageServiceProvider that is associated with the Code Document. But this is really difficult to pull off, and leaves you with many issues.
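As a rough illustration of the Roslyn approach, the sketch below parses source text, builds a compilation, and asks the semantic model what "Resource.HelloWorld" binds to. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced; the class and method names (RoslynLookupSketch, ResolveHelloWorld) are made up for this example, and inside a VSPackage you would normally get the compilation from the Visual Studio workspace instead of building one by hand.

```csharp
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class RoslynLookupSketch
{
    // Hypothetical helper: returns the display name of the member that
    // "Resource.HelloWorld" binds to, or null if it is absent or unresolved.
    public static string ResolveHelloWorld(string code)
    {
        var tree = CSharpSyntaxTree.ParseText(code);
        var compilation = CSharpCompilation.Create(
            "Sketch",
            syntaxTrees: new[] { tree },
            references: new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });
        var model = compilation.GetSemanticModel(tree);

        // Find the first member access whose text is exactly "Resource.HelloWorld".
        var access = tree.GetRoot().DescendantNodes()
            .OfType<MemberAccessExpressionSyntax>()
            .FirstOrDefault(m => m.ToString() == "Resource.HelloWorld");
        if (access == null) return null;

        // The semantic model maps the syntax node to the declared symbol.
        ISymbol symbol = model.GetSymbolInfo(access).Symbol;
        return symbol?.ToDisplayString();
    }
}
```

From the resolved symbol you can reach its declaration (for example through DeclaringSyntaxReferences) and read the returned expression, which avoids loading the output assembly.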

Related

How do I get the containing namespace of a called method using Roslyn when that method has no symbol info?

I have an application that allows users to write c-sharp code that gets saved as a class library for being called later.
A new requirement has been established that some namespaces (and the methods they contain and any variables or methods with their return types) are not allowed anymore. So I need to analyze the code and alert the user to any forbidden namespaces in their code so they can remove them.
Using Roslyn, I can access the InvocationExpressionSyntax nodes for the method calls. From that I then get the symbol info by calling var mySymbol = mySemanticModel.GetSymbolInfo(myInvocationExpressionSyntaxNode).Symbol.
Then calling mySymbol.ContainingType.ToDisplayString() returns the namespace type of the call.
However, it seems not all called methods have symbol information in Roslyn. For example, System.Math.Sqrt() has symbol information, so from that I can get the containing namespace of System.Math. On the other hand System.Net.WebRequest.Create() or System.Diagnostics.Process.Start() do not. How do I get System.Net.WebRequest or System.Diagnostics.Process from those nodes? I can clearly see them using QuickWatch.
For example, the System.Diagnostics.Process.Start() node itself shows the following value in QuickWatch:
InvocationExpressionSyntax InvocationExpression System.Diagnostics.Process.Start("CMD.exe","")
And the node's expression has this value:
MemberAccessExpressionSyntax SimpleMemberAccessExpression System.Diagnostics.Process.Start
So obviously the namespace is there in the value itself. But the Symbol from the SymbolInfo and the Type from TypeInfo are both null.
Edit
In regards to my compilation, the C# Roslyn tools are set up as follows (we are supposed to support VB as well, hence the properties are interfaced):
private class CSharpRoslynTools : IRoslynTools
{
public CompilationUnitSyntax SyntaxTreeRoot { get; }
public SemanticModel SemanticModel { get; }
public CSharpRoslynTools(string code)
{
var mscorlib = MetadataReference.CreateFromFile(typeof(object).Assembly.Location);
var syntaxTree = CSharpSyntaxTree.ParseText(code);
var compilation = CSharpCompilation.Create(
"MyCompilation",
syntaxTrees: new[] { syntaxTree },
references: new[]
{
mscorlib
});
this.SemanticModel = compilation.GetSemanticModel(syntaxTree);
this.SyntaxTreeRoot = (CompilationUnitSyntax)syntaxTree.GetRoot();
}
}
One thing I did come to realize is that System.Diagnostics isn't part of mscorlib. Could that be why the symbol information is missing?
Honestly, I kind of view this as a bit of a waste of my time because the scripts were designed to be run in a WinForms desktop application that's probably 15 years old at this point. But then they decided this desktop application needed to move to Citrix Cloud for certain customers. And as a result, we have to lock out anything that can access the filesystem if it's not an admin logged into the application. So we have this giant potential security hole with these scripts. The chance of someone getting access to the application and exploiting any of this is slim, though.
I pushed for a blacklist, which would be easy enough to do with a simple string search of the code. They want a whitelist which requires full out parsing of the symbols.

However, it seems not all called methods have symbol information in Roslyn.
This probably indicates that something went wrong with how you got your Compilation, and you should attempt to investigate that directly. Don't attempt to deal with it downstream. (Software: garbage in, garbage out!)
On the other hand System.Net.WebRequest.Create() or System.Diagnostics.Process.Start() do not. How do I get System.Net.WebRequest or System.Diagnostics.Process from those nodes? I can clearly see them using QuickWatch.
Keep in mind that from the perspective of syntax only, System.Net.WebRequest.Create() could be:
A Create method on the WebRequest type that's in System.Net
A Create method on the WebRequest type, which is a nested class of the Net type, in the System namespace
A Create method on the WebRequest type that's in MyApp.System.Net.WebRequest, because we of course don't require fully qualified names, and if you decide to make a System namespace inside your MyApp, that could potentially work!
One thing I did come to realize is that System.Diagnostics isn't part of mscorlib. Could that be why the symbol information is missing?
Yep; we're only going to reference the assemblies you give us. It's up to you to know your context and if other references are included in what that code can reference, then you should include them in your production of the Compilation.
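A minimal sketch of that advice, assuming the Microsoft.CodeAnalysis.CSharp package: build the reference list from the assemblies the scripts may really use, and print the compilation's error diagnostics so a missing reference shows up at the source rather than as a null symbol later. InvocationInspector, ContainingTypesOfCalls and BuildReferences are hypothetical names. The "TRUSTED_PLATFORM_ASSEMBLIES" lookup is an assumption about hosts running on .NET Core or .NET 5+; on .NET Framework it yields null and the sketch falls back to the assemblies that contain object and Process (System.dll there, which also holds WebRequest).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

public static class InvocationInspector
{
    // Hypothetical helper: containing type of every call that binds to a symbol.
    public static List<string> ContainingTypesOfCalls(string code)
    {
        var tree = CSharpSyntaxTree.ParseText(code);
        var compilation = CSharpCompilation.Create(
            "MyCompilation",
            syntaxTrees: new[] { tree },
            references: BuildReferences(),
            options: new CSharpCompilationOptions(OutputKind.DynamicallyLinkedLibrary));

        // Missing references surface here as compile errors; check these first.
        foreach (var d in compilation.GetDiagnostics()
                     .Where(d => d.Severity == DiagnosticSeverity.Error))
            Console.Error.WriteLine(d);

        var model = compilation.GetSemanticModel(tree);
        return tree.GetRoot().DescendantNodes()
            .OfType<InvocationExpressionSyntax>()
            .Select(call => model.GetSymbolInfo(call).Symbol)
            .Where(symbol => symbol != null)
            .Select(symbol => symbol.ContainingType.ToDisplayString())
            .ToList();
    }

    private static IEnumerable<MetadataReference> BuildReferences()
    {
        // Assumption: on .NET Core / .NET 5+ the runtime lists its assemblies here.
        var tpa = AppContext.GetData("TRUSTED_PLATFORM_ASSEMBLIES") as string;
        var paths = string.IsNullOrEmpty(tpa)
            ? new[] { typeof(object), typeof(System.Diagnostics.Process) }
                .Select(t => t.Assembly.Location)
            : tpa.Split(Path.PathSeparator);
        return paths.Distinct().Select(p => MetadataReference.CreateFromFile(p)).ToList();
    }
}
```

For a whitelist you would compare each symbol's ContainingNamespace (or ContainingType) against the allowed set, and treat any call that still has no symbol as forbidden rather than guessing from the syntax.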
I pushed for a blacklist, which would be easy enough to do with a simple string search of the code. They want a whitelist which requires full out parsing of the symbols.
From a security perspective they may be right -- it's very difficult to block from a string search. See some thoughts at https://stackoverflow.com/a/66555319/972216 for how difficult that can be.

Library requires reference to System.Windows.Forms

I'm having trouble finding information on this topic, possibly because I'm not sure how to phrase the question. Hopefully the braintrust here can help or at least advise. This situation might just be me being retentive but it's bugging me so I thought I'd ask for help on how to get around it.
I have a C# library filled with utility classes used in other assemblies. All of my extensions reside in this library and it comes in quite handy. Any of my other libraries or executables that need to use those classes must naturally reference the library.
But one of the extensions I have in there is an extension on the Control class to handle cross thread control updates in a less convoluted fashion. As a consequence the utility library must reference System.Windows.Forms.
The problem is that any library or executable that references the utilities library must now have a reference to System.Windows.Forms as well, or I get a build error for the missing reference. While this is not a big deal, it seems sort of stupid to have assemblies that have nothing to do with controls or forms having to reference System.Windows.Forms just because the utilities library does, especially since most of them aren't actually using the InvokeAsRequired() extension I wrote.
I thought about moving the InvokeAsRequired() extension into its own library, which would eliminate the System.Windows.Forms problem, as the only assemblies that needed to use the InvokeAsRequired() extension would already have a reference to SWF... but then I'd have a library with only one thing in it, which will probably bother me more.
Is there a way around this requirement beyond separating out the 'offending' method and creating a nearly empty library? Maybe a compile setting or something?
It should be noted that the 'offending method' is actually used across multiple projects that have UI. A lot of the UI updates I do are as a result of events coming in and trying to update windows form controls from another thread causes various UI thread problems. Hence the method handling the Invoke when needed. (Though personally I think that whole InvokeRequired pattern should be wrapped up into the control itself rather than having something external do the thread alignment in the first place).

If it's just one function, then package it up as a source code file into a NuGet package and then add the NuGet package to your projects. Then, this code will be easily deployable to new projects (as well as easily updateable), but you don't need to create a separate assembly. Just compile it into your application.
You would then store your NuGet package in a local NuGet repository, or get a myget account, or even just store it somewhere on your network. Worst case, you can check it into your version control, but I would just check in the "project" that you build the nuget package from, so you can rebuild the package if need be.
Who knows, at some point, you may add more utility functions that require windows forms, and at that point you could justify a separate assembly.

It's easy: you have to move the offending code out. For now it might be a minor concern, but in the end you may be glad you did it now rather than at the moment you are forced to.
Even if it is just one method (for now), just move the method into another assembly. I didn't say a new one: it can go in the assembly that uses it, if there is only one, or in an assembly that all the others needing the method already reference.

You can solve your problem by switching the utility library code from the early-binding pattern to the late-binding pattern when it comes to types declared in the System.Windows.Forms namespace.
This article shows how to do it the short way: Stack Overflow: C#.NET - Type.GetType(“System.Windows.Forms.Form”) returns null
And this code snippet shows how the monoresgen tool from the Mono Project (open source ECMA CLI, C# and .NET implementation) solves the System.Windows.Forms dependency problem.
public const string AssemblySystem_Windows_Forms = "System.Windows.Forms, Version=" + FxVersion + ", Culture=neutral, PublicKeyToken=b77a5c561934e089";
// ...
static Assembly swf;
static Type resxr;
static Type resxw;
/*
* We load the ResX format stuff on demand, since the classes are in
* System.Windows.Forms (!!!) and we can't depend on that assembly in mono, yet.
*/
static void LoadResX () {
if (swf != null)
return;
try {
swf = Assembly.Load(Consts.AssemblySystem_Windows_Forms);
resxr = swf.GetType("System.Resources.ResXResourceReader");
resxw = swf.GetType("System.Resources.ResXResourceWriter");
} catch (Exception e) {
throw new Exception ("Cannot load support for ResX format: " + e.Message);
}
}
// ...
static IResourceReader GetReader (Stream stream, string name, bool useSourcePath) {
string format = Path.GetExtension (name);
switch (format.ToLower (System.Globalization.CultureInfo.InvariantCulture)) {
// ...
case ".resx":
LoadResX ();
IResourceReader reader = (IResourceReader) Activator.CreateInstance (
resxr, new object[] {stream});
if (useSourcePath) { // only possible on 2.0 profile, or higher
PropertyInfo p = reader.GetType ().GetProperty ("BasePath",
BindingFlags.Public | BindingFlags.Instance);
if (p != null && p.CanWrite) {
p.SetValue (reader, Path.GetDirectoryName (name), null);
}
}
return reader;
// ...
}
}
Snippet source: https://github.com/mono/mono/blob/mono-3.10.0/mcs/tools/resgen/monoresgen.cs#L30

Best way to run a string as c# code

Let's say I have:
@{
var str= "DateTime.Now";
}
I want to process this string as C# code
@Html.Raw(App.ProcessAsCode(str));
The output should be the current date time.

Final Edit:
Based on further information - if the goal here is to simply have a formatting engine there are lots of options out there. One such option is based around the .liquid syntax from Shopify (see here). You can find a .NET port of this on GitHub here: https://github.com/formosatek/dotliquid/. The main purpose of this is to turn something like:
<h2>{{product.name}}</h2>
Into something like:
<h2>Beef Jerky</h2>
I would strongly recommend reading more about the liquid engine and syntax and I believe this will lead you in the right direction. Best of luck!
Initial Answer
This is definitely possible - although as others have said you will want to be careful in what you do. Using C# the key to compiling and running code generically is the "CSharpCodeProvider" class. Here is a brief example of how that looks:
// Requires: using System.CodeDom.Compiler; using Microsoft.CSharp;
var compilerParams = new CompilerParameters();
string[] references = { "System.dll" };
compilerParams.ReferencedAssemblies.AddRange(references);
var provider = new CSharpCodeProvider();
CompilerResults compile = provider.CompileAssemblyFromSource(compilerParams, formattedCode);
In this example, "formattedCode" is a string with the C# code. Any references must be manually added. For the full example see this stack question (How to get a Type from a C# type name string?).
NOTE -- If all you are looking to do here is a format string or something simple like that you might have the user pass in a .NET format string (eg "MM/dd/yyyy"), then use that in a call to the "ToString" method. That would provide the user some configurability, while still making sure your system stays secure. In general running code on a server that hasn't been properly checked/escaped is really dangerous!
Reference - For your reference, the current msdn page for CSharpCodeProvider also has some examples.
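The safer alternative from the note above, where the user supplies only a .NET format string, could look roughly like this. SafeDateFormatter is a made-up name, and falling back to the round-trip "o" format on a bad pattern is just one possible choice.

```csharp
using System;
using System.Globalization;

public static class SafeDateFormatter
{
    // Applies a user-supplied format string; no user code is ever compiled or run.
    public static string Format(DateTime value, string userFormat)
    {
        try
        {
            return value.ToString(userFormat, CultureInfo.InvariantCulture);
        }
        catch (FormatException)
        {
            // Invalid pattern: fall back to the ISO 8601 round-trip format.
            return value.ToString("o", CultureInfo.InvariantCulture);
        }
    }
}
```

In the view this would be called with something like SafeDateFormatter.Format(DateTime.Now, "MM/dd/yyyy"), with the format string coming from configuration or user input.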

Another option would be using a dynamic language such as IronRuby or IronPython.

Get method body c#

I am trying to write a custom rule (code analysis) where, if the body of a method contains empty statements, an error is raised.
However, there is one problem. I can not seem to figure out how to get the body of a method (the text that is in the method).
How can I get the text inside a method, and assign it to a string?
Thanks in advance.
For reference; I use c# in visual studio, with FxCop to make the rule.
Edit: Some code added for reference, this does NOT work.
using Microsoft.FxCop.Sdk;
using Microsoft.VisualStudio.CodeAnalysis.Extensibility;
public override ProblemCollection Check(Member member)
{
Method method = member as Method;
if (method == null)
{
return null;
}
if (method.Name.Name.Contains("{}"))
{
var resolution = GetResolution(member.Name.Name);
var problem = new Problem(resolution, method)
{
Certainty = 100,
FixCategory = FixCategories.Breaking,
MessageLevel = MessageLevel.Warning
};
Problems.Add(problem);
}
return Problems;
}

FxCop doesn't analyse source code, it works on .Net assemblies built from any language.
You may be able to find out whether the method contains a statement or not using FxCop. I advise you to read the documentation and check the implementation of existing rules to understand it.
An empty statement in the middle of other code might be removed by the compiler and you may not find it using FxCop. If you want to analyze source code you should take a look at StyleCop.

However, there is one problem. I can not seem to figure out how to get the body of a method
(the text that is in the method).
You cannot. FxCop does not work on the source; it analyzes the compiled bytecode (IL).
What you can do is find the source, which is not totally trivial, but you have to do so without the FxCop API. A starting point may be analyzing the pdb files (not sure where to find documentation), as they can point you to the file that contains the method.
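To see why the method text is out of reach at this level, the following sketch uses plain reflection (not the FxCop SDK) to dump the IL of a method whose source contains only empty statements. IlPeek and its members are names invented for the example.

```csharp
using System;
using System.Reflection;

public static class IlPeek
{
    // The source contains two empty statements, but no text survives compilation.
    public static void Empty() { ; ; }

    // Returns the raw IL bytes of a public static method declared on this class.
    public static byte[] GetIl(string methodName)
    {
        MethodInfo method = typeof(IlPeek).GetMethod(
            methodName, BindingFlags.Public | BindingFlags.Static);
        return method.GetMethodBody().GetILAsByteArray();
    }
}
```

Printing BitConverter.ToString(IlPeek.GetIl("Empty")) typically shows little more than a ret opcode (2A), possibly preceded by nop bytes in a debug build, so a rule working on the assembly has no braces or semicolons left to inspect.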

CS-Script Evaluator LoadCode: How to compile and reference a second script (reusable library)

The question in short is: How do you reference a second script containing reusable script code, under the constraints that you need to be able to unload and reload the scripts when either of them changes without restarting the host application?
I'm trying to compile a script class using the CS-Script "compiler as service" (CSScript.Evaluator), while referencing an assembly that has just been compiled from a second "library" script. The purpose is that the library script should contain code that can be reused for different scripts.
Here is a sample code that illustrates the idea but also causes a CompilerException at runtime.
using CSScriptLibrary;
using NUnit.Framework;
[TestFixture]
public class ScriptReferencingTests
{
private const string LibraryScriptCode = @"
public class Helper
{
public static int AddOne(int x)
{
return x + 1;
}
}
";
private const string ScriptCode = @"
using System;
public class Script
{
public int SumAndAddOne(int a, int b)
{
return Helper.AddOne(a+b);
}
}
";
[Test]
public void CSScriptEvaluator_CanReferenceCompiledAssembly()
{
var libraryEvaluator = CSScript.Evaluator.CompileCode(LibraryScriptCode);
var libraryAssembly = libraryEvaluator.GetCompiledAssembly();
var evaluatorWithReference = CSScript.Evaluator.ReferenceAssembly(libraryAssembly);
dynamic scriptInstance = evaluatorWithReference.LoadCode(ScriptCode);
var result = scriptInstance.SumAndAddOne(1, 2);
Assert.That(result, Is.EqualTo(4));
}
}
To run the code you need NuGet packages NUnit and cs-script.
This line causes a CompilerException at runtime:
dynamic scriptInstance = evaluatorWithReference.LoadCode(ScriptCode);
{interactive}(7,23): error CS0584: Internal compiler error: The invoked member is not supported in a dynamic assembly.
{interactive}(7,9): error CS0029: Cannot implicitly convert type '<fake$type>' to 'int'
Again, the reason for using CSScript.Evaluator.LoadCode instead of CSScript.LoadCode is so that the script can be reloaded at any time without restarting the host application when either of the scripts changes. (CSScript.LoadCode already supports including other scripts according to http://www.csscript.net/help/Importing_scripts.html)
Here is the documentation on the CS-Script Evaluator: http://www.csscript.net/help/evaluator.html
The lack of google results for this is discouraging, but I hope I'm missing something simple. Any help would be greatly appreciated.
(This question should be filed under the tag cs-script which does not exist.)

There is some slight confusion here. Evaluator is not the only way to achieve reloadable script behavior. CSScript.LoadCode allows reloading as well.
I do indeed advise considering CSScript.Evaluator.LoadCode as a first candidate for the hosting model, as it offers less overhead and an arguably more convenient reloading model. However, it comes at a cost. You have very little control over reloading and the inclusion of dependencies (assemblies, scripts). Memory leaks are not 100% avoidable. And it also makes script debugging completely impossible (Mono bug).
In your case I would really advise you to move to the more conventional hosting model: CodeDOM.
Have a look at the "[cs-script]\Samples\Hosting\CodeDOM\Modifying script without restart" sample.
And "[cs-script]\Samples\Hosting\CodeDOM\InterfaceAlignment" will also give you an idea of how to use interfaces with reloading.
CodeDOM was for years the default CS-Script hosting mode and it is in fact very robust, intuitive and manageable. The only real drawback is that all objects you pass to (or get from) the script need to be serializable or inherit from MarshalByRefObject. This is a side effect of the script being executed in the "automatic" separate domain. Thus one has to deal with all the "pleasures" of Remoting.
BTW this is the only reason why I implemented Mono-based evaluator.
The CodeDOM model will also automatically manage the dependencies and recompile them when needed. But it looks like you are aware of this anyway.
CodeDOM also allows you to define precisely the mechanism of checking dependencies for changes:
//the default algorithm "recompile if script or dependency is changed"
CSScript.IsOutOfDateAlgorithm = CSScript.CachProbing.Advanced;
or
//custom algorithm "never recompile script"
CSScript.IsOutOfDateAlgorithm = (s, a) => false;

The quick solution to the CompilerException appears to be to not use Evaluator to compile the library assembly, and to use CSScript.CompileCode instead, like so:
var compiledAssemblyName = CSScript.CompileCode(LibraryScriptCode);
var evaluatorWithReference = CSScript.Evaluator.ReferenceAssembly(compiledAssemblyName);
dynamic scriptInstance = evaluatorWithReference.LoadCode(ScriptCode);
However, as stated in the previous answer, this limits the possibilities for dependency control that the CodeDOM model offers (like css_include). Also, any changes to LibraryScriptCode are not picked up, which again limits the usefulness of the Evaluator method.
The solution I chose is the AsmHelper.CreateObject and AsmHelper.AlignToInterface<T> methods. This lets you use the regular css_include in your scripts, while at the same time allowing you at any time to reload the scripts by disposing the AsmHelper and starting over. My solution looks something like this:
AsmHelper asmHelper = new AsmHelper(CSScript.Compile(filePath), null, false);
object obj = asmHelper.CreateObject("*");
IMyInterface instance = asmHelper.TryAlignToInterface<IMyInterface>(obj);
// Any other interfaces you want to instantiate...
...
if (instance != null)
instance.MyScriptMethod();
Once a change is detected (I use FileSystemWatcher), you just call asmHelper.Dispose and run the above code again.
This method requires the script class to be marked with the Serializable attribute, or simply inherit from MarshalByRefObject.
Note that your script class does not need to inherit any interface. The AlignToInterface works both with and without it. You could use dynamic here, but I prefer having a strongly typed interface to avoid errors down the line.
I couldn't get the built in try-methods to work, so I made this extension method for less clutter when it is not known whether or not the interface is implemented:
public static class InterfaceExtensions
{
public static T TryAlignToInterface<T>(this AsmHelper helper, object obj) where T : class
{
try
{
return helper.AlignToInterface<T>(obj);
}
catch
{
return null;
}
}
}
Most of this is explained in the hosting guidelines http://www.csscript.net/help/script_hosting_guideline_.html, and there are helpful samples mentioned in previous post.
I feel I might have missed something regarding script change detection, but this method works solidly.
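The change detection mentioned above (a FileSystemWatcher that triggers a dispose-and-recompile cycle) might be wired up roughly as follows. This is only a sketch using the BCL: ScriptReloader is a hypothetical wrapper, and the reload callback is where you would dispose the old AsmHelper and run the creation code again.

```csharp
using System;
using System.IO;

public sealed class ScriptReloader : IDisposable
{
    private readonly FileSystemWatcher _watcher;

    // reload receives the full path of the changed script file.
    public ScriptReloader(string scriptPath, Action<string> reload)
    {
        var fullPath = Path.GetFullPath(scriptPath);
        _watcher = new FileSystemWatcher(
            Path.GetDirectoryName(fullPath), Path.GetFileName(fullPath))
        {
            NotifyFilter = NotifyFilters.LastWrite | NotifyFilters.Size
        };
        // Changed can fire more than once per save, so reload should be safe to repeat.
        _watcher.Changed += (sender, e) => reload(e.FullPath);
        _watcher.EnableRaisingEvents = true;
    }

    public void Dispose() => _watcher.Dispose();
}
```

Because editors often write a file in several steps, a real host would probably debounce the callback (for example, wait a short time after the last event) before recompiling.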
