Find all references in Visual Studio solution using Roslyn - c#

TLDR;
How do I find all the const string parameters of the references to the index property Microsoft.Extensions.Localization.IStringLocalizer.Item[String] in my Visual Studio solution? All source code is written in C#. The solution must also support MVC razor views.
Additional info
I believe that Roslyn is the answer to the question. I, however, haven't yet found my way through the API to achieve this. I'm also uncertain about whether to use syntax tree, compilation or semantic model. The following is an attempt based on other Q&A here on stackoverflow. Any help to make it work is highly appreciated :-) If you are curious you can read about the reason for this need here.
namespace AspNetCoreLocalizationKeysExtractor
{
using System;
using System.Linq;
using Microsoft.CodeAnalysis.FindSymbols;
using Microsoft.CodeAnalysis.MSBuild;
class Program
{
static void Main(string[] args)
{
string solutionPath = @"..\source\MySolution.sln";
var msWorkspace = MSBuildWorkspace.Create();
var solution = msWorkspace.OpenSolutionAsync(solutionPath).Result;
foreach (var project in solution.Projects.Where(p => p.AssemblyName.StartsWith("MyCompanyNamespace.")))
{
var compilation = project.GetCompilationAsync().Result;
var interfaceType = compilation.GetTypeByMetadataName("Microsoft.Extensions.Localization.IStringLocalizer");
// TODO: Find the indexer based on the name ("Item"/"this"?) and/or on the parameter and return type
var indexer = interfaceType.GetMembers().First();
var indexReferences = SymbolFinder.FindReferencesAsync(indexer, solution).Result.ToList();
foreach (var symbol in indexReferences)
{
// TODO: How to get output comprised by "a location" like e.g. a namespace qualified name and the parameter of the index call. E.g:
//
// MyCompanyNamespace.MyLib.SomeClass: "Please try again"
// MyCompanyNamespace.MyWebApp.Views.Shared._Layout: "Welcome to our cool website"
Console.WriteLine(symbol.Definition.ToDisplayString());
}
}
}
}
}
Update: Workaround
Despite the great help from @Oxoron I've chosen to resort to a simple workaround. Currently Roslyn doesn't find any references using SymbolFinder.FindReferencesAsync. This appears to be caused by "silent" MSBuild failures. These errors can be surfaced like this:
msWorkspace.WorkspaceFailed += (sender, eventArgs) =>
{
Console.Error.WriteLine($"{eventArgs.Diagnostic.Kind}: {eventArgs.Diagnostic.Message}");
Console.Error.WriteLine();
};
and
var compilation = project.GetCompilationAsync().Result;
foreach (var diagnostic in compilation.GetDiagnostics())
Console.Error.WriteLine(diagnostic);
My workaround is roughly like this:
public void ParseSource()
{
var sourceFiles = from f in Directory.GetFiles(SourceDir, "*.cs*", SearchOption.AllDirectories)
where f.EndsWith(".cs") || f.EndsWith(".cshtml")
where !f.Contains(@"\obj\") && !f.Contains(@"\packages\")
select f;
// _["Hello, World!"]
// _[@"Hello, World!"]
// _localizer["Hello, World!"]
// The optional @ lets the pattern match the verbatim-string form listed above as well.
var regex = new Regex(@"_(localizer)?\[@?""(.*?)""\]");
foreach (var sourceFile in sourceFiles)
{
foreach (var line in File.ReadLines(sourceFile))
{
var matches = regex.Matches(line);
foreach (Match match in matches)
{
var resourceKey = GetResourceKeyFromFileName(sourceFile);
var key = match.Groups[2].Value;
Console.WriteLine($"{resourceKey}: {key}");
}
}
}
}
Of course the solution isn't bulletproof: it relies on naming conventions and doesn't handle multiline verbatim strings. But it'll probably do the job for us :-)
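For illustration, here is a minimal, self-contained sketch of the same regex idea run against in-memory lines instead of files. The class name LocalizerKeyScanner and the sample lines are made up for this example, and the pattern includes an optional @ so that verbatim strings are picked up too; it shares the limitations mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original workaround.
static class LocalizerKeyScanner
{
    // Matches _["..."], _[@"..."], _localizer["..."] and _localizer[@"..."].
    // Group 2 captures the text between the quotes.
    private static readonly Regex KeyPattern =
        new Regex(@"_(localizer)?\[@?""(.*?)""\]");

    public static List<string> ExtractKeys(IEnumerable<string> lines)
    {
        var keys = new List<string>();
        foreach (var line in lines)
        {
            foreach (Match match in KeyPattern.Matches(line))
            {
                keys.Add(match.Groups[2].Value);
            }
        }
        return keys;
    }

    static void Main()
    {
        var sample = new[]
        {
            "var a = _[\"Hello, World!\"];",
            "var b = _localizer[@\"Please try again\"];",
            "var c = items[0];" // not a localizer call, so it is ignored
        };
        foreach (var key in ExtractKeys(sample))
        {
            Console.WriteLine(key);
        }
    }
}
```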

Take a look at this and this question; they will help with indexers.
Determining namespaces is a bit more difficult.
You can determine it using code like
// Locations is an enumerable of ReferenceLocation, so take the first entry
var location = symbol.Locations.First();
int spanStart = location.Location.SourceSpan.Start;
Document doc = location.Document;
var indexerInvokation = doc.GetSyntaxRootAsync().Result.DescendantNodes()
.FirstOrDefault(node => node.GetLocation().SourceSpan.Start == spanStart );
After that, just walk up the parent nodes of indexerInvokation until you reach a MethodDeclarationSyntax, ClassDeclarationSyntax, etc.
Upd1.
Test project code:
namespace TestApp
{
class Program
{
static void Main(string[] args)
{
int test0 = new A().GetInt();
int test1 = new IndexedUno()[2];
int test2 = new IndexedDo()[2];
}
}
public interface IIndexed
{
int this[int i] { get; }
}
public class IndexedUno : IIndexed
{
public int this[int i] => i;
}
public class IndexedDo : IIndexed
{
public int this[int i] => i;
}
public class A
{
public int GetInt() { return new IndexedUno()[1]; }
}
public class B
{
public int GetInt() { return new IndexedDo()[4]; }
}
}
Search code:
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.FindSymbols;
using Microsoft.CodeAnalysis.MSBuild;
namespace AnalyzeIndexers
{
class Program
{
static void Main(string[] args)
{
string solutionPath = @"PathToSolution.sln";
var msWorkspace = MSBuildWorkspace.Create();
var solution = msWorkspace.OpenSolutionAsync(solutionPath).Result;
foreach (var project in solution.Projects.Where(p => p.AssemblyName.StartsWith("TestApp")))
{
var compilation = project.GetCompilationAsync().Result;
var interfaceType = compilation.GetTypeByMetadataName("TestApp.IIndexed");
var indexer = interfaceType
.GetMembers()
.OfType<IPropertySymbol>()
.First(member => member.IsIndexer);
var indexReferences = SymbolFinder.FindReferencesAsync(indexer, solution).Result.ToList();
foreach (var indexReference in indexReferences)
{
foreach (ReferenceLocation indexReferenceLocation in indexReference.Locations)
{
int spanStart = indexReferenceLocation.Location.SourceSpan.Start;
var doc = indexReferenceLocation.Document;
var indexerInvokation = doc.GetSyntaxRootAsync().Result
.DescendantNodes()
.FirstOrDefault(node => node.GetLocation().SourceSpan.Start == spanStart);
var className = indexerInvokation.Ancestors()
.OfType<ClassDeclarationSyntax>()
.FirstOrDefault()
?.Identifier.Text ?? String.Empty;
var @namespace = indexerInvokation.Ancestors()
.OfType<NamespaceDeclarationSyntax>()
.FirstOrDefault()
?.Name.ToString() ?? String.Empty;
Console.WriteLine($"{@namespace}.{className} : {indexerInvokation.GetText()}");
}
}
}
Console.WriteLine();
Console.ReadKey();
}
}
}
Take a look at the var indexer = ... code: it extracts the indexer from a type. You may need to work with the getter/setter instead.
Another point of interest is the indexerInvokation computation. We get the syntax root too often, so you may need some kind of cache.
Next: the class and namespace search. I didn't look up the enclosing method, and I recommend not relying on it: your indexers can be used inside properties, other indexers and anonymous methods. If you don't really care about this, just find ancestors of type MethodDeclarationSyntax.
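The question also asks for the constant string passed to the indexer. One way to get it, sketched here under assumptions, is to look at the ElementAccessExpressionSyntax nodes and ask the semantic model for the constant value of the first argument. To stay self-contained, the sketch compiles a small in-memory source in which the made-up FakeLocalizer class stands in for IStringLocalizer. In the real solution the semantic model would come from the Document of each reference location (for example via Document.GetSemanticModelAsync) rather than from a hand-built compilation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

static class IndexerArgumentSketch
{
    // Hypothetical sample source; FakeLocalizer is a stand-in for IStringLocalizer.
    public const string Source = @"
class FakeLocalizer { public string this[string key] => key; }
class SomeClass
{
    const string Retry = ""Please try again"";
    void M(FakeLocalizer localizer, int[] numbers)
    {
        var a = localizer[""Welcome""];
        var b = localizer[Retry];
        var c = numbers[0];
    }
}";

    public static List<string> ExtractConstantKeys(string source)
    {
        var tree = CSharpSyntaxTree.ParseText(source);
        // Assumption: referencing the core library is enough for this tiny sample.
        var compilation = CSharpCompilation.Create(
            "Sketch",
            new[] { tree },
            new[] { MetadataReference.CreateFromFile(typeof(object).Assembly.Location) });
        var model = compilation.GetSemanticModel(tree);

        var keys = new List<string>();
        foreach (var access in tree.GetRoot().DescendantNodes().OfType<ElementAccessExpressionSyntax>())
        {
            // Keep indexer calls only; array element access does not bind to a property.
            if (!(model.GetSymbolInfo(access).Symbol is IPropertySymbol property) || !property.IsIndexer)
                continue;

            var argument = access.ArgumentList.Arguments.FirstOrDefault();
            if (argument == null)
                continue;

            // Resolves string literals as well as references to const fields or locals.
            var constant = model.GetConstantValue(argument.Expression);
            if (constant.HasValue && constant.Value is string key)
                keys.Add(key);
        }
        return keys;
    }

    static void Main()
    {
        foreach (var key in ExtractConstantKeys(Source))
            Console.WriteLine(key);
    }
}
```

Arguments that are not compile-time constants (for example interpolated or concatenated runtime strings) are skipped by this sketch, which matches the "const string parameters" requirement in the question.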

Related

Adding static constructor with Mono.Cecil causes TypeInitializationException

I am trying to add a static constructor using Mono Cecil to a program like the following:
namespace SimpleTarget
{
class C
{
public void M()
{
Console.WriteLine("Hello, World!");
}
}
}
The following code adds the static constructor:
namespace AddStaticConstructor
{
class Program
{
static void Main(string[] args)
{
var assemblyPath = args[0];
var module = ModuleDefinition.ReadModule(assemblyPath);
var corlib = ModuleDefinition.ReadModule(typeof(object).Module.FullyQualifiedName);
var method = corlib.Types.First(t => t.Name.Equals("Console")).Methods.First(m => m.Name.Contains("WriteLine"));
var methodToCall = module.Import(method);
foreach (var type in module.Types)
{
if (!type.Name.Contains("C")) continue;
var staticConstructorAttributes =
Mono.Cecil.MethodAttributes.Private |
Mono.Cecil.MethodAttributes.HideBySig |
Mono.Cecil.MethodAttributes.Static |
Mono.Cecil.MethodAttributes.SpecialName |
Mono.Cecil.MethodAttributes.RTSpecialName;
MethodDefinition staticConstructor = new MethodDefinition(".cctor", staticConstructorAttributes, module.TypeSystem.Void);
type.Methods.Add(staticConstructor);
type.IsBeforeFieldInit = false;
var il = staticConstructor.Body.GetILProcessor();
il.Append(Instruction.Create(OpCodes.Ret));
Instruction ldMethodName = il.Create(OpCodes.Ldstr, type.FullName);
Instruction callOurMethod = il.Create(OpCodes.Call, methodToCall);
Instruction firstInstruction = staticConstructor.Body.Instructions[0];
// Inserts the callOurMethod instruction before the first instruction
il.InsertBefore(firstInstruction, ldMethodName);
il.InsertAfter(ldMethodName, callOurMethod);
}
module.Write(assemblyPath);
}
}
}
Looking at the decompiled binary in dotPeek, it appears as if everything is set up correctly. When trying to use the modified C type, I get a TypeInitializationException with the inner exception "System.InvalidProgramException: JIT Compiler encountered an internal limitation".
Is there anything else I need to set correctly before using a static constructor?
Thanks!
The problem is that you are getting the wrong overload of System.Console.WriteLine here:
var corlib = ModuleDefinition.ReadModule(typeof(object).Module.FullyQualifiedName);
var method = corlib.Types.First(t => t.Name.Equals("Console")).Methods.First(m => m.Name.Contains("WriteLine"));
var methodToCall = module.Import(method);
Use this simple code to get the overload you want to use:
var wlMethod = typeof (Console).GetMethod(nameof(Console.WriteLine), new[] {typeof (string)});
var methodToCall = module.ImportReference(wlMethod);
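To see why selecting by name alone is fragile, the following reflection-only sketch lists every Console.WriteLine overload and then picks the one that takes a single string. The class name WriteLineOverloads is made up for this example. Which overload First(...) returns is not something the calling code controls; if it is not the one taking a single string, the injected ldstr/call pair no longer matches the method's signature.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical helper for illustration only.
static class WriteLineOverloads
{
    // Every public overload named WriteLine on System.Console.
    public static MethodInfo[] AllOverloads() =>
        typeof(Console).GetMethods()
            .Where(m => m.Name == nameof(Console.WriteLine))
            .ToArray();

    // The overload that takes exactly one string argument.
    public static MethodInfo StringOverload() =>
        typeof(Console).GetMethod(nameof(Console.WriteLine), new[] { typeof(string) });

    static void Main()
    {
        // Many candidates share the name, so "the first one" is an arbitrary choice.
        foreach (var overload in AllOverloads())
            Console.WriteLine(overload);

        // This is the signature that matches a single ldstr before the call.
        Console.WriteLine("Selected: " + StringOverload());
    }
}
```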

OpenXml Excel: throw error in any word after mail address

I read Excel files using OpenXml. Everything works fine, but if the spreadsheet contains a cell that has an email address followed by a space and another word, such as:
abc@abc.com abc
It throws an exception immediately at the opening of the spreadsheet:
var _doc = SpreadsheetDocument.Open(_filePath, false);
exception:
DocumentFormat.OpenXml.Packaging.OpenXmlPackageException
Additional information:
Invalid Hyperlink: Malformed URI is embedded as a
hyperlink in the document.
There is an open issue on the OpenXml forum related to this problem: Malformed Hyperlink causes exception
In the post they talk about encountering this issue with a malformed "mailto:" hyperlink within a Word document.
They propose a work-around here: Workaround for malformed hyperlink exception
The workaround is essentially a small console application which locates the invalid URL and replaces it with a hard-coded value; here is the code snippet from their sample that does the replacement; you could augment this code to attempt to correct the passed brokenUri:
private static Uri FixUri(string brokenUri)
{
return new Uri("http://broken-link/");
}
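As one possible way to augment FixUri, the sketch below tries to salvage the broken value before falling back to the placeholder. The repair rule (keep only the text before the first whitespace and accept it if it parses as an absolute URI with a scheme) is an assumption made for this example, not part of the original workaround, and the class name UriRepairSketch is made up.

```csharp
using System;

// Hypothetical variant of FixUri; the repair heuristic is an assumption.
static class UriRepairSketch
{
    private static readonly Uri Placeholder = new Uri("http://broken-link/");

    public static Uri FixUri(string brokenUri)
    {
        if (!string.IsNullOrWhiteSpace(brokenUri))
        {
            // Keep only the text before the first whitespace,
            // e.g. "mailto:abc@abc.com abc" becomes "mailto:abc@abc.com".
            var firstToken = brokenUri.Trim().Split(new[] { ' ', '\t' }, 2)[0];

            // Require an explicit scheme so plain words or paths are not accepted.
            if (firstToken.Contains(":") &&
                Uri.TryCreate(firstToken, UriKind.Absolute, out var repaired))
            {
                return repaired;
            }
        }

        // Nothing usable was found: fall back to the hard-coded value.
        return Placeholder;
    }

    static void Main()
    {
        Console.WriteLine(FixUri("mailto:abc@abc.com abc"));
        Console.WriteLine(FixUri("test@gmail,com"));
    }
}
```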
The problem I had was actually with an Excel document (like you) and it had to do with a malformed http URL; I was pleasantly surprised to find that their code worked just fine with my Excel file.
Here is the entire work-around source code, just in case one of these links goes away in the future:
class Program
{
static void Main(string[] args)
{
var fileName = @"C:\temp\corrupt.xlsx";
var newFileName = @"c:\temp\Fixed.xlsx";
var newFileInfo = new FileInfo(newFileName);
if (newFileInfo.Exists)
newFileInfo.Delete();
File.Copy(fileName, newFileName);
WordprocessingDocument wDoc;
try
{
using (wDoc = WordprocessingDocument.Open(newFileName, true))
{
ProcessDocument(wDoc);
}
}
catch (OpenXmlPackageException e)
{
e.Dump();
if (e.ToString().Contains("The specified package is not valid."))
{
using (FileStream fs = new FileStream(newFileName, FileMode.OpenOrCreate, FileAccess.ReadWrite))
{
UriFixer.FixInvalidUri(fs, brokenUri => FixUri(brokenUri));
}
}
}
}
private static Uri FixUri(string brokenUri)
{
brokenUri.Dump();
return new Uri("http://broken-link/");
}
private static void ProcessDocument(WordprocessingDocument wDoc)
{
var elementCount = wDoc.MainDocumentPart.Document.Descendants().Count();
Console.WriteLine(elementCount);
}
}
public static class UriFixer
{
public static void FixInvalidUri(Stream fs, Func<string, Uri> invalidUriHandler)
{
XNamespace relNs = "http://schemas.openxmlformats.org/package/2006/relationships";
using (ZipArchive za = new ZipArchive(fs, ZipArchiveMode.Update))
{
foreach (var entry in za.Entries.ToList())
{
if (!entry.Name.EndsWith(".rels"))
continue;
bool replaceEntry = false;
XDocument entryXDoc = null;
using (var entryStream = entry.Open())
{
try
{
entryXDoc = XDocument.Load(entryStream);
if (entryXDoc.Root != null && entryXDoc.Root.Name.Namespace == relNs)
{
var urisToCheck = entryXDoc
.Descendants(relNs + "Relationship")
.Where(r => r.Attribute("TargetMode") != null && (string)r.Attribute("TargetMode") == "External");
foreach (var rel in urisToCheck)
{
var target = (string)rel.Attribute("Target");
if (target != null)
{
try
{
Uri uri = new Uri(target);
}
catch (UriFormatException)
{
Uri newUri = invalidUriHandler(target);
rel.Attribute("Target").Value = newUri.ToString();
replaceEntry = true;
}
}
}
}
}
catch (XmlException)
{
continue;
}
}
if (replaceEntry)
{
var fullName = entry.FullName;
entry.Delete();
var newEntry = za.CreateEntry(fullName);
using (StreamWriter writer = new StreamWriter(newEntry.Open()))
using (XmlWriter xmlWriter = XmlWriter.Create(writer))
{
entryXDoc.WriteTo(xmlWriter);
}
}
}
}
}
}
The fix by @RMD works great. I've been using it for years. But there is a new fix.
You can see the fix here in the changelog for issue #793
Upgrade OpenXML to 2.12.0.
Right click solution and select Manage NuGet Packages.
Implement the fix
It is helpful to have a unit test. Create an Excel file with a bad email address like test@gmail,com. (Note the comma instead of the dot.)
Make sure the stream you open and the call to SpreadsheetDocument.Open both allow Read AND Write.
You need to implement a RelationshipErrorHandlerFactory and use it in the options when you open. Here is the code I used:
public class UriRelationshipErrorHandler : RelationshipErrorHandler
{
public override string Rewrite(Uri partUri, string id, string uri)
{
return "https://broken-link";
}
}
Then you need to use it when you open the document like this:
var openSettings = new OpenSettings
{
RelationshipErrorHandlerFactory = package =>
{
return new UriRelationshipErrorHandler();
}
};
using var document = SpreadsheetDocument.Open(stream, true, openSettings);
One of the nice things about this solution is that it does not require you to create a temporary "fixed" version of your file and it is far less code.
Unfortunately, the solution where you open the file as a zip archive and replace the broken hyperlink did not help me.
I was wondering how it is possible that everything works fine when the target framework is 4.0, even if the only installed .NET Framework version is 4.7.2.
I found out that there is a private static field inside System.UriParser that selects the version of the URI RFC specification. So it is possible to set it to V2, as it is set for .NET 4.0 and lower versions of the .NET Framework. The only problem is that it is private static readonly.
Maybe someone will want to set it globally for the whole application. I instead wrote UriQuirksVersionPatcher, which updates this version and restores it in the Dispose method. It is obviously not thread-safe, but that is acceptable for my purpose.
using System;
using System.Diagnostics;
using System.Reflection;
namespace BarCap.RiskServices.RateSubmissions.Utility
{
#if (NET20 || NET35 || NET40)
public class UriQuirksVersionPatcher : IDisposable
{
public void Dispose()
{
}
}
#else
public class UriQuirksVersionPatcher : IDisposable
{
private const string _quirksVersionFieldName = "s_QuirksVersion"; //See Source\ndp\fx\src\net\System\_UriSyntax.cs in NexFX sources
private const string _uriQuirksVersionEnumName = "UriQuirksVersion";
/// <code>
/// private enum UriQuirksVersion
/// {
/// V1 = 1, // RFC 1738 - Not supported
/// V2 = 2, // RFC 2396
/// V3 = 3, // RFC 3986, 3987
/// }
/// </code>
private const string _oldQuirksVersion = "V2";
private static readonly Lazy<FieldInfo> _targetFieldInfo;
private static readonly Lazy<int?> _patchValue;
private readonly int _oldValue;
private readonly bool _isEnabled;
static UriQuirksVersionPatcher()
{
var targetType = typeof(UriParser);
_targetFieldInfo = new Lazy<FieldInfo>(() => targetType.GetField(_quirksVersionFieldName, BindingFlags.Static | BindingFlags.NonPublic));
_patchValue = new Lazy<int?>(() => GetUriQuirksVersion(targetType));
}
public UriQuirksVersionPatcher()
{
int? patchValue = _patchValue.Value;
_isEnabled = patchValue.HasValue;
if (!_isEnabled) //Disabled if it failed to get enum value
{
return;
}
int originalValue = QuirksVersion;
_isEnabled = originalValue != patchValue;
if (!_isEnabled) //Disabled if value is proper
{
return;
}
_oldValue = originalValue;
QuirksVersion = patchValue.Value;
}
private int QuirksVersion
{
get
{
return (int)_targetFieldInfo.Value.GetValue(null);
}
set
{
_targetFieldInfo.Value.SetValue(null, value);
}
}
private static int? GetUriQuirksVersion(Type targetType)
{
int? result = null;
try
{
result = (int)targetType.GetNestedType(_uriQuirksVersionEnumName, BindingFlags.Static | BindingFlags.NonPublic)
.GetField(_oldQuirksVersion, BindingFlags.Static | BindingFlags.Public)
.GetValue(null);
}
catch
{
#if DEBUG
Debug.WriteLine("ERROR: Failed to find UriQuirksVersion.V2 enum member.");
throw;
#endif
}
return result;
}
public void Dispose()
{
if (_isEnabled)
{
QuirksVersion = _oldValue;
}
}
}
#endif
}
Usage:
using(new UriQuirksVersionPatcher())
{
using(var document = SpreadsheetDocument.Open(fullPath, false))
{
//.....
}
}
P.S. Later I found that someone had already implemented this patcher: https://github.com/google/google-api-dotnet-client/blob/master/Src/Support/Google.Apis.Core/Util/UriPatcher.cs
I haven't used OpenXml, but if there's no specific reason for using it, I highly recommend LinqToExcel. An example of the code is here:
var sheet = new ExcelQueryFactory("filePath");
var allRows = from r in sheet.Worksheet() select r;
foreach (var r in allRows) {
var cella = r["Header"].ToString();
}

Determine if a Database is "Equal" to a DacPackage

Is there a way to use the SQL Server 2012 Microsoft.SqlServer.Dac Namespace to determine if a database has an identical schema to that described by a DacPackage object? I've looked at the API docs for DacPackage as well as DacServices, but not having any luck; am I missing something?
Yes there is, I have been using the following technique since 2012 without issue.
Calculate a fingerprint of the dacpac.
Store that fingerprint in the target database.
The .dacpac is just a zip file containing goodies like metadata and model information.
Here's a screen-grab of what you will find in the .dacpac:
The file model.xml has XML structured like the following
<DataSchemaModel>
<Header>
... developer specific stuff is in here
</Header>
<Model>
.. database model definition is in here
</Model>
</DataSchemaModel>
What we need to do is extract the contents from <Model>...</Model> and treat this as the fingerprint of the schema.
"But wait!" you say. "Origin.xml has the following nodes:"
<Checksums>
<Checksum Uri="/model.xml">EB1B87793DB57B3BB5D4D9826D5566B42FA956EDF711BB96F713D06BA3D309DE</Checksum>
</Checksums>
In my experience, this <Checksum> node changes regardless of a schema change in the model.
So let's get to it.
Calculate the fingerprint of the dacpac.
using System;
using System.IO;
using System.IO.Packaging;
using System.Security.Cryptography;
using System.Text;
using System.Xml;
static string DacPacFingerprint(byte[] dacPacBytes)
{
using (var ms = new MemoryStream(dacPacBytes))
using (var package = ZipPackage.Open(ms))
{
var modelFile = package.GetPart(new Uri("/model.xml", UriKind.Relative));
using (var streamReader = new System.IO.StreamReader(modelFile.GetStream()))
{
var xmlDoc = new XmlDocument() { InnerXml = streamReader.ReadToEnd() };
foreach (XmlNode childNode in xmlDoc.DocumentElement.ChildNodes)
{
if (childNode.Name == "Header")
{
// skip the Header node as described
xmlDoc.DocumentElement.RemoveChild(childNode);
break;
}
}
using (var crypto = new SHA512CryptoServiceProvider())
{
byte[] retVal = crypto.ComputeHash(Encoding.UTF8.GetBytes(xmlDoc.InnerXml));
return BitConverter.ToString(retVal).Replace("-", "");// hex string
}
}
}
}
With this fingerprint now available, pseudo code for applying a dacpac can be:
void main()
{
var dacpacBytes = File.ReadAllBytes("<path-to-dacpac>");
var dacpacFingerPrint = DacPacFingerprint(dacpacBytes);// see above
var databaseFingerPrint = Database.GetFingerprint();//however you choose to do this
if(databaseFingerPrint != dacpacFingerPrint)
{
DeployDacpac(...);//however you choose to do this
Database.SetFingerprint(dacpacFingerPrint);//however you choose to do this
}
}
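To make the "ignore the Header" idea concrete, here is a small sketch that works on an in-memory XML string rather than a real .dacpac. The sample documents and the class name ModelFingerprintSketch are made up for this example, and it uses XDocument instead of XmlDocument purely for brevity. It shows that two models differing only in their Header produce the same fingerprint, while a change inside Model produces a different one.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;
using System.Text;
using System.Xml.Linq;

// Hypothetical helper; operates on the text of model.xml only.
static class ModelFingerprintSketch
{
    public static string Fingerprint(string modelXml)
    {
        var doc = XDocument.Parse(modelXml);

        // Drop <Header>, matched by local name so an XML namespace does not matter.
        doc.Root.Elements().Where(e => e.Name.LocalName == "Header").Remove();

        using (var sha = SHA512.Create())
        {
            var text = doc.Root.ToString(SaveOptions.DisableFormatting);
            var hash = sha.ComputeHash(Encoding.UTF8.GetBytes(text));
            return BitConverter.ToString(hash).Replace("-", ""); // hex string
        }
    }

    // Made-up stand-in for the contents of a model.xml file.
    public static string Sample(string machine, string table) =>
        "<DataSchemaModel><Header><Meta Name=\"built-on\" Value=\"" + machine + "\"/></Header>" +
        "<Model><Element Type=\"SqlTable\" Name=\"" + table + "\"/></Model></DataSchemaModel>";

    static void Main()
    {
        var a = Fingerprint(Sample("machine-a", "[dbo].[T]"));
        var b = Fingerprint(Sample("machine-b", "[dbo].[T]"));
        var c = Fingerprint(Sample("machine-a", "[dbo].[Other]"));

        Console.WriteLine(a == b); // header-only difference
        Console.WriteLine(a == c); // model difference
    }
}
```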
Here's what I've come up with, but I'm not really crazy about it. If anyone can point out any bugs, edge cases, or better approaches, I'd be much obliged.
...
DacServices dacSvc = new DacServices(connectionString);
string deployScript = dacSvc.GenerateDeployScript(myDacpac, @"aDb", deployOptions);
if (DatabaseEqualsDacPackage(deployScript))
{
Console.WriteLine("The database and the DacPackage are equal");
}
...
bool DatabaseEqualsDacPackage(string deployScript)
{
string equalStr = string.Format("GO{0}USE [$(DatabaseName)];{0}{0}{0}GO{0}PRINT N'Update complete.'{0}GO", Environment.NewLine);
return deployScript.Contains(equalStr);
}
...
What I really don't like about this approach is that it's entirely dependent upon the format of the generated deployment script, and therefore extremely brittle. Questions, comments and suggestions very welcome.
@Aaron Hudon's answer does not account for post-deployment script changes. Sometimes you just add a new entry to a type table without changing the model. In our case we want this to count as a new dacpac. Here is my modification of his code to account for that:
private static string DacPacFingerprint(string path)
{
using (var stream = File.OpenRead(path))
using (var package = Package.Open(stream))
{
var extractors = new IDacPacDataExtractor [] {new ModelExtractor(), new PostScriptExtractor()};
string content = string.Join("_", extractors.Select(e =>
{
var modelFile = package.GetPart(new Uri($"/{e.Filename}", UriKind.Relative));
using (var streamReader = new StreamReader(modelFile.GetStream()))
{
return e.ExtractData(streamReader);
}
}));
using (var crypto = new MD5CryptoServiceProvider())
{
byte[] retVal = crypto.ComputeHash(Encoding.UTF8.GetBytes(content));
return BitConverter.ToString(retVal).Replace("-", "");// hex string
}
}
}
private class ModelExtractor : IDacPacDataExtractor
{
public string Filename { get; } = "model.xml";
public string ExtractData(StreamReader streamReader)
{
var xmlDoc = new XmlDocument() { InnerXml = streamReader.ReadToEnd() };
foreach (XmlNode childNode in xmlDoc.DocumentElement.ChildNodes)
{
if (childNode.Name == "Header")
{
// skip the Header node as described
xmlDoc.DocumentElement.RemoveChild(childNode);
break;
}
}
return xmlDoc.InnerXml;
}
}
private class PostScriptExtractor : IDacPacDataExtractor
{
public string Filename { get; } = "postdeploy.sql";
public string ExtractData(StreamReader stream)
{
return stream.ReadToEnd();
}
}
private interface IDacPacDataExtractor
{
string Filename { get; }
string ExtractData(StreamReader stream);
}

How to access invocations through extension methods, methods in static classes and methods with ref/out parameters with Roslyn

I'm working on creating an open source project for creating .NET UML Sequence Diagrams that leverages a JavaScript library called js-sequence-diagrams. I am not sure Roslyn is the right tool for the job, but I thought I would give it a shot, so I have put together some proof of concept code which attempts to get all methods and their invocations and then outputs these invocations in a form that can be interpreted by js-sequence-diagrams.
The code generates some output, but it does not capture everything. I cannot seem to capture invocations via extension methods or invocations of static methods in static classes.
I do see invocations of methods with out parameters, but not in any form that extends the BaseMethodDeclarationSyntax.
Here is the code (keep in mind this is proof of concept code and so I did not entirely follow best-practices, but I am not requesting a code review here ... also, I am used to using Tasks so I am messing around with await, but am not entirely sure I am using it properly yet)
https://gist.github.com/SoundLogic/11193841
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Reflection.Emit;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Formatting;
using Microsoft.CodeAnalysis.MSBuild;
using Microsoft.CodeAnalysis.FindSymbols;
using System.Collections.Immutable;
namespace Diagrams
{
class Program
{
static void Main(string[] args)
{
string solutionName = "Diagrams";
string solutionExtension = ".sln";
string solutionFileName = solutionName + solutionExtension;
string rootPath = @"C:\Workspace\";
string solutionPath = rootPath + solutionName + @"\" + solutionFileName;
MSBuildWorkspace workspace = MSBuildWorkspace.Create();
DiagramGenerator diagramGenerator = new DiagramGenerator( solutionPath, workspace );
diagramGenerator.ProcessSolution();
#region reference
//TODO: would ReferencedSymbol.Locations be a better way of accessing MethodDeclarationSyntaxes?
//INamedTypeSymbol programClass = compilation.GetTypeByMetadataName("DotNetDiagrams.Program");
//IMethodSymbol barMethod = programClass.GetMembers("Bar").First(s => s.Kind == SymbolKind.Method) as IMethodSymbol;
//IMethodSymbol fooMethod = programClass.GetMembers("Foo").First(s => s.Kind == SymbolKind.Method) as IMethodSymbol;
//ITypeSymbol fooSymbol = fooMethod.ContainingType;
//ITypeSymbol barSymbol = barMethod.ContainingType;
//Debug.Assert(barMethod != null);
//Debug.Assert(fooMethod != null);
//List<ReferencedSymbol> barReferencedSymbols = SymbolFinder.FindReferencesAsync(barMethod, solution).Result.ToList();
//List<ReferencedSymbol> fooReferencedSymbols = SymbolFinder.FindReferencesAsync(fooMethod, solution).Result.ToList();
//Debug.Assert(barReferencedSymbols.First().Locations.Count() == 1);
//Debug.Assert(fooReferencedSymbols.First().Locations.Count() == 0);
#endregion
Console.ReadKey();
}
}
class DiagramGenerator
{
private Solution _solution;
public DiagramGenerator( string solutionPath, MSBuildWorkspace workspace )
{
_solution = workspace.OpenSolutionAsync(solutionPath).Result;
}
public async void ProcessSolution()
{
foreach (Project project in _solution.Projects)
{
Compilation compilation = await project.GetCompilationAsync();
ProcessCompilation(compilation);
}
}
private async void ProcessCompilation(Compilation compilation)
{
var trees = compilation.SyntaxTrees;
foreach (var tree in trees)
{
var root = await tree.GetRootAsync();
var classes = root.DescendantNodes().OfType<ClassDeclarationSyntax>();
foreach (var @class in classes)
{
ProcessClass( @class, compilation, tree, root );
}
}
}
private void ProcessClass(
ClassDeclarationSyntax @class
, Compilation compilation
, SyntaxTree tree
, SyntaxNode root)
{
var methods = @class.DescendantNodes().OfType<MethodDeclarationSyntax>();
foreach (var method in methods)
{
var model = compilation.GetSemanticModel(tree);
// Get MethodSymbol corresponding to method
var methodSymbol = model.GetDeclaredSymbol(method);
// Get all InvocationExpressionSyntax in the above code.
var allInvocations = root.DescendantNodes().OfType<InvocationExpressionSyntax>();
// Use GetSymbolInfo() to find invocations of target method
var matchingInvocations =
allInvocations.Where(i => model.GetSymbolInfo(i).Symbol.Equals(methodSymbol));
ProcessMethod( matchingInvocations, method, @class);
}
var delegates = @class.DescendantNodes().OfType<DelegateDeclarationSyntax>();
foreach (var @delegate in delegates)
{
var model = compilation.GetSemanticModel(tree);
// Get MethodSymbol corresponding to method
var methodSymbol = model.GetDeclaredSymbol(@delegate);
// Get all InvocationExpressionSyntax in the above code.
var allInvocations = tree.GetRoot().DescendantNodes().OfType<InvocationExpressionSyntax>();
// Use GetSymbolInfo() to find invocations of target method
var matchingInvocations =
allInvocations.Where(i => model.GetSymbolInfo(i).Symbol.Equals(methodSymbol));
ProcessDelegates(matchingInvocations, @delegate, @class);
}
}
private void ProcessMethod(
IEnumerable<InvocationExpressionSyntax> matchingInvocations
, MethodDeclarationSyntax methodDeclarationSyntax
, ClassDeclarationSyntax classDeclarationSyntax )
{
foreach (var invocation in matchingInvocations)
{
MethodDeclarationSyntax actingMethodDeclarationSyntax = null;
if (SyntaxNodeHelper.TryGetParentSyntax(invocation, out actingMethodDeclarationSyntax))
{
var r = methodDeclarationSyntax;
var m = actingMethodDeclarationSyntax;
PrintCallerInfo(
invocation
, classDeclarationSyntax
, m.Identifier.ToFullString()
, r.ReturnType.ToFullString()
, r.Identifier.ToFullString()
, r.ParameterList.ToFullString()
, r.TypeParameterList != null ? r.TypeParameterList.ToFullString() : String.Empty
);
}
}
}
private void ProcessDelegates(
IEnumerable<InvocationExpressionSyntax> matchingInvocations
, DelegateDeclarationSyntax delegateDeclarationSyntax
, ClassDeclarationSyntax classDeclarationSyntax )
{
foreach (var invocation in matchingInvocations)
{
DelegateDeclarationSyntax actingMethodDeclarationSyntax = null;
if (SyntaxNodeHelper.TryGetParentSyntax(invocation, out actingMethodDeclarationSyntax))
{
var r = delegateDeclarationSyntax;
var m = actingMethodDeclarationSyntax;
PrintCallerInfo(
invocation
, classDeclarationSyntax
, m.Identifier.ToFullString()
, r.ReturnType.ToFullString()
, r.Identifier.ToFullString()
, r.ParameterList.ToFullString()
, r.TypeParameterList != null ? r.TypeParameterList.ToFullString() : String.Empty
);
}
}
}
private void PrintCallerInfo(
InvocationExpressionSyntax invocation
, ClassDeclarationSyntax classBeingCalled
, string callingMethodName
, string returnType
, string calledMethodName
, string calledMethodArguments
, string calledMethodTypeParameters = null )
{
ClassDeclarationSyntax parentClassDeclarationSyntax = null;
if (!SyntaxNodeHelper.TryGetParentSyntax(invocation, out parentClassDeclarationSyntax))
{
throw new Exception();
}
calledMethodTypeParameters = calledMethodTypeParameters ?? String.Empty;
var actedUpon = classBeingCalled.Identifier.ValueText;
var actor = parentClassDeclarationSyntax.Identifier.ValueText;
var callInfo = callingMethodName + "=>" + calledMethodName + calledMethodTypeParameters + calledMethodArguments;
var returnCallInfo = returnType;
string info = BuildCallInfo(
actor
, actedUpon
, callInfo
, returnCallInfo);
Console.Write(info);
}
private string BuildCallInfo(string actor, string actedUpon, string callInfo, string returnInfo)
{
const string calls = "->";
const string returns = "-->";
const string descriptionSeparator = ": ";
string callingInfo = actor + calls + actedUpon + descriptionSeparator + callInfo;
string returningInfo = actedUpon + returns + actor + descriptionSeparator + "returns " + returnInfo;
callingInfo = callingInfo.RemoveNewLines(true);
returningInfo = returningInfo.RemoveNewLines(true);
string result = callingInfo + Environment.NewLine;
result += returningInfo + Environment.NewLine;
return result;
}
}
static class SyntaxNodeHelper
{
public static bool TryGetParentSyntax<T>(SyntaxNode syntaxNode, out T result)
where T : SyntaxNode
{
// set defaults
result = null;
if (syntaxNode == null)
{
return false;
}
try
{
syntaxNode = syntaxNode.Parent;
if (syntaxNode == null)
{
return false;
}
if (syntaxNode.GetType() == typeof (T))
{
result = syntaxNode as T;
return true;
}
return TryGetParentSyntax<T>(syntaxNode, out result);
}
catch
{
return false;
}
}
}
public static class StringEx
{
public static string RemoveNewLines(this string stringWithNewLines, bool cleanWhitespace = false)
{
string stringWithoutNewLines = null;
List<char> splitElementList = Environment.NewLine.ToCharArray().ToList();
if (cleanWhitespace)
{
splitElementList.AddRange(" ".ToCharArray().ToList());
}
char[] splitElements = splitElementList.ToArray();
var stringElements = stringWithNewLines.Split(splitElements, StringSplitOptions.RemoveEmptyEntries);
if (stringElements.Any())
{
stringWithoutNewLines = string.Join(" ", stringElements);
}
return stringWithoutNewLines ?? stringWithNewLines;
}
}
}
Any guidance here would be much appreciated!
Using the methodSymbol in the ProcessClass method I took Andy's suggestion and came up with the below (although I imagine there may be an easier way to go about this):
private async Task<List<MethodDeclarationSyntax>> GetMethodSymbolReferences(IMethodSymbol methodSymbol)
{
    var references = new List<MethodDeclarationSyntax>();
    // Find every symbol in the solution that calls the given method.
    var referencingSymbols = await SymbolFinder.FindCallersAsync(methodSymbol, _solution);
    var referencingSymbolsList = referencingSymbols as IList<SymbolCallerInfo> ?? referencingSymbols.ToList();
    if (!referencingSymbolsList.Any(s => s.Locations.Any()))
    {
        return references;
    }
    foreach (var referenceSymbol in referencingSymbolsList)
    {
        foreach (var location in referenceSymbol.Locations)
        {
            // Walk up from the call site to the method declaration that contains it.
            var position = location.SourceSpan.Start;
            var root = await location.SourceTree.GetRootAsync();
            var nodes = root.FindToken(position).Parent.AncestorsAndSelf().OfType<MethodDeclarationSyntax>();
            references.AddRange(nodes);
        }
    }
    return references;
}
and the resulting image generated by plugging the output text into js-sequence-diagrams (I have updated the GitHub gist with the full source in case anyone finds it useful. I excluded method parameters so the diagram is easy to digest, but these can optionally be turned back on):
Edit:
I've updated the code (see the GitHub gist) so that calls are now shown in the order they were made, based on the span start location of each called method within the calling method, taken from the results of FindCallersAsync:
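The gist itself is not reproduced here. As a minimal sketch of the ordering idea only, the following uses a hypothetical `CallSite` type as a stand-in for a Roslyn location (its `SpanStart` plays the role of `location.SourceSpan.Start`) and sorts the calls found inside one calling method by that offset. The type and method names are assumptions made for illustration, not part of Roslyn or of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one call found inside a calling method.
public sealed class CallSite
{
    public CallSite(string calledMethod, int spanStart)
    {
        CalledMethod = calledMethod;
        SpanStart = spanStart;
    }

    // Name of the method being called.
    public string CalledMethod { get; }

    // Character offset of the call within the source file.
    public int SpanStart { get; }
}

public static class CallOrdering
{
    // Returns the called method names in the order the calls appear in source.
    public static IList<string> InSourceOrder(IEnumerable<CallSite> calls)
    {
        return calls
            .OrderBy(c => c.SpanStart)
            .Select(c => c.CalledMethod)
            .ToList();
    }
}
```

Sorting by span start only makes sense for calls that live in the same source file, so in practice the calls would be grouped per calling method first.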

Search with custom analyzer/filter returns no results

I have a simple custom analyzer that appears to correctly generate phonetic hashes in an index from SQL Server. However, most attempts to query indexes generated with my custom analyzer return no results. I haven't been able to find similar cases, so I must be doing something wrong.
Custom filter:
internal class SoundexFilter : TokenFilter
{
    private readonly ITermAttribute _termAttr;
    private Queue<Token> soundexTokenQueue
        = new Queue<Token>();
    public SoundexFilter(TokenStream input)
        : base(input)
    {
        _termAttr = AddAttribute<ITermAttribute>();
    }
    public override bool IncrementToken()
    {
        if (input.IncrementToken())
        {
            string currentTerm = _termAttr.Term;
            var hash = Soundex.For(currentTerm);
            Console.WriteLine("Original: {0}, Hash: {1}", currentTerm, hash);
            soundexTokenQueue.Enqueue(new Token(hash, 0, hash.Length));
            return true;
        }
        else if (soundexTokenQueue.Count > 0)
        {
            var token = soundexTokenQueue.Dequeue();
            _termAttr.SetTermBuffer(token.Term);
            _termAttr.SetTermLength(token.TermLength());
            return true;
        }
        return false;
    }
}
Custom analyzer:
public class SoundexAnalyzer : Analyzer
{
    public override TokenStream TokenStream(string fieldName, TextReader reader)
    {
        // Create the tokenizer
        TokenStream result = new StandardTokenizer(Version.LUCENE_30, reader);
        // Add in filters
        result = new StandardFilter(result);
        // Add soundex filter
        result = new SoundexFilter(result);
        return result;
    }
}
Simple test program:
public class Program
{
    private const string NAME = "John Smith";
    private const string SEARCH_NAME = "John Smith";
    private Analyzer _analyzer = new SoundexAnalyzer();
    private Directory _directory = new RAMDirectory();
    internal void Run(string[] args)
    {
        using (var writer = new IndexWriter(_directory, _analyzer, IndexWriter.MaxFieldLength.UNLIMITED))
        {
            var field = new Field("Name", NAME, Field.Store.YES, Field.Index.ANALYZED);
            var document = new Document();
            document.Add(field);
            writer.AddDocument(document);
            // Unnecessary but helps imply intent
            writer.Commit();
        }
        using (var searcher = new IndexSearcher(_directory))
        {
            var parser = new QueryParser(Version.LUCENE_30, "Name", _analyzer);
            var query = parser.Parse(SEARCH_NAME);
            var docs = searcher.Search(query, 10);
            Console.WriteLine("\nReturned Docs:");
            foreach (var scoreDoc in docs.ScoreDocs)
            {
                var doc = searcher.Doc(scoreDoc.Doc);
                Console.WriteLine(doc.Get("Name"));
            }
        }
    }
    private static void Main(string[] args)
    {
        new Program().Run(args);
    }
}
The only search that succeeds using this code is an exact match like NAME = "John" and SEARCH_NAME = "John".
Strangely, searching the index in Luke for the phonetic hashes with the standard analyzers works fine, so the write side must be working as expected (or at least as I expect).
I've done a fair amount of research around this and have found little help. Any idea what I'm missing?
I figured out what solves the problem, but I haven't yet worked out exactly why the original code fails.
Basically, the TokenFilter implementation in the question tries to do too much and doesn't appear to match what Lucene expects from a filter.
Limiting the IncrementToken implementation to computing the phonetic hash and replacing the ITermAttribute.Term value with that hash works quite well.
TokenFilter implementation:
public class SoundexFilter : TokenFilter
{
    private readonly ITermAttribute _termAttr;
    public SoundexFilter(TokenStream input)
        : base(input)
    {
        _termAttr = AddAttribute<ITermAttribute>();
    }
    public override bool IncrementToken()
    {
        if (input.IncrementToken())
        {
            string currentTerm = _termAttr.Term;
            // Any phonetic hash calculation will work here.
            var hash = Soundex.For(currentTerm);
            _termAttr.SetTermBuffer(hash);
            return true;
        }
        return false;
    }
}
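To make the difference between the two filters concrete, here is a small model in plain C# with no Lucene dependency. It reproduces only the order in which terms come out of each `IncrementToken` implementation. `TokenOrderModel`, `QueueingFilter` and `ReplacingFilter` are hypothetical names introduced for illustration, and `hash` is any stand-in for `Soundex.For`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TokenOrderModel
{
    // Models the filter from the question: each original term passes through
    // unchanged, and the hashes are emitted only after the input is exhausted.
    public static IEnumerable<string> QueueingFilter(
        IEnumerable<string> terms, Func<string, string> hash)
    {
        var pending = new Queue<string>();
        foreach (var term in terms)
        {
            pending.Enqueue(hash(term));
            yield return term;
        }
        while (pending.Count > 0)
        {
            yield return pending.Dequeue();
        }
    }

    // Models the filter from the answer: each term is replaced in place by its hash.
    public static IEnumerable<string> ReplacingFilter(
        IEnumerable<string> terms, Func<string, string> hash)
    {
        return terms.Select(hash);
    }
}
```

With the terms `John` and `Smith` and a stand-in hash that upper-cases and prefixes `#`, the queueing model yields `John`, `Smith`, `#JOHN`, `#SMITH`, while the replacing model yields `#JOHN`, `#SMITH`. In the first stream the hashes trail the original terms instead of sitting at the positions of the words they came from. If the query side then treats a word and its trailing hash as consecutive terms, that would explain why only single-word exact matches succeed, but that reading is an assumption rather than something confirmed here.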
The same filter must be applied at both index and query time, but with that in place it works extremely well.
As a side note, the performance of this filter doesn't match my expectations, so I'll be profiling the solution to identify possible improvements. I'd recommend that anyone looking to use this solution do the same if they expect sub-second response times for an index with more than 2 million documents.
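The `Soundex.For` helper used by the filter is not shown above. As a rough sketch of what such a helper might look like, the following implements the classic American Soundex rules (first letter plus three digits, with H and W not separating duplicate codes). The class name `SoundexSketch` is hypothetical, and this is not necessarily the implementation the filter was written against.

```csharp
using System;
using System.Text;

public static class SoundexSketch
{
    // Digit codes for the letters A..Z; '0' marks letters that are not coded
    // (the vowels plus H, W and Y).
    private const string Codes = "01230120022455012623010202";

    public static string For(string word)
    {
        if (string.IsNullOrEmpty(word))
        {
            return string.Empty;
        }

        var result = new StringBuilder(4);
        char lastCode = '0';
        foreach (char raw in word)
        {
            char c = char.ToUpperInvariant(raw);
            if (c < 'A' || c > 'Z')
            {
                continue; // ignore anything that is not an ASCII letter
            }

            char code = Codes[c - 'A'];
            if (result.Length == 0)
            {
                // The first letter is kept as-is; remember its code so an
                // immediately following letter with the same code is skipped.
                result.Append(c);
                lastCode = code;
            }
            else
            {
                if (code != '0' && code != lastCode)
                {
                    result.Append(code);
                }
                // H and W do not separate duplicate codes; vowels do.
                if (c != 'H' && c != 'W')
                {
                    lastCode = code;
                }
            }

            if (result.Length == 4)
            {
                break;
            }
        }

        // Pad short codes with zeros; input without letters gives an empty string.
        return result.Length == 0 ? string.Empty : result.ToString().PadRight(4, '0');
    }
}
```

Under these rules `John` hashes to `J500` and `Smith` to `S530`. The encoder allocates one small `StringBuilder` per term, which may be worth measuring when profiling a large index.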
