Looking for a C# code parser [closed] - c#

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
I'm looking for a set of classes (preferably in the .NET Framework) that will parse C# code and return a list of functions with parameters, classes with their methods, properties, etc. Ideally it would provide all that's needed to build my own IntelliSense.
I have a feeling something like this should be in the .net framework, given all the reflection stuff they offer, but if not then an open source alternative is good enough.
What I'm trying to build is basically something like Snippet Compiler, but with a twist. I'm trying to figure out how to get the code dom first.
I tried googling for this but I'm not sure what the correct term for this is so I came up empty.
Edit: Since I'm looking to use this for intellisense-like processing, actually compiling the code won't work since it will most likely be incomplete. Sorry I should have mentioned that first.

While .NET's CodeDom namespace defines the basic API for language parsers, the parsers themselves are not implemented. Visual Studio does its parsing through its own language services, which are not available in the redistributable framework.
You could either...
Compile the code then use reflection on the resulting assembly
Look at something like the Mono C# compiler which creates these syntax trees. It won't be a high-level API like CodeDom but maybe you can work with it.
There may be something on CodePlex or a similar site.
See this related post. Parser for C#

If you need it to work on incomplete code, or code with errors in it, then I believe you're pretty much on your own (that is, you won't be able to use the CSharpCodeCompiler class or anything like that).
There are tools like ReSharper that do their own parsing, but they're proprietary. You might be able to start with the Mono compiler, but in my experience, writing a parser that works on incomplete code is a whole different ballgame from writing one that's just supposed to spit out errors on incomplete code.
If you just need the names of classes and methods (metadata, basically) then you might be able to do the parsing "by hand", but I guess it depends on how accurate you need the results to be.
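To give a rough idea of what parsing "by hand" could look like, here is a minimal sketch that pulls type and method names out of source text with regular expressions. The class name `NaiveOutline`, the two patterns, and the keyword list are all made up for illustration; this is not a real C# grammar, and it will be fooled by comments, string literals, generics with spaces, and so on. Its one virtue is that it does not care whether the code is complete.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NaiveOutline
{
    // Hypothetical pattern: "class Foo", "struct Foo" or "interface IFoo".
    static readonly Regex TypePattern =
        new Regex(@"\b(?:class|struct|interface)\s+([A-Za-z_]\w*)");

    // Hypothetical pattern: "<return type> <name>(" with no attempt at a full grammar.
    static readonly Regex MethodPattern =
        new Regex(@"\b([A-Za-z_][\w<>\[\].]*)\s+([A-Za-z_]\w*)\s*\(");

    // Words that can precede "(" but never form a method declaration.
    static readonly HashSet<string> Keywords = new HashSet<string>
    {
        "new", "return", "throw", "await", "else", "if", "while", "for",
        "foreach", "switch", "using", "lock", "catch", "typeof", "in", "is", "as"
    };

    public static List<string> TypeNames(string source)
    {
        var names = new List<string>();
        foreach (Match m in TypePattern.Matches(source))
            names.Add(m.Groups[1].Value);
        return names;
    }

    public static List<string> MethodNames(string source)
    {
        var names = new List<string>();
        foreach (Match m in MethodPattern.Matches(source))
        {
            string returnType = m.Groups[1].Value;
            string name = m.Groups[2].Value;
            // Drop matches such as "return Foo(" or "else if (".
            if (!Keywords.Contains(returnType) && !Keywords.Contains(name))
                names.Add(name);
        }
        return names;
    }

    public static void Main()
    {
        // Deliberately incomplete: the last method and the class are never closed.
        string source = @"
public class Greeter
{
    public string Greet(string name) { return ""Hello "" + name; }
    private int Count(
";
        Console.WriteLine(string.Join(", ", TypeNames(source)));   // Greeter
        Console.WriteLine(string.Join(", ", MethodNames(source))); // Greet, Count
    }
}
```

Whether this is accurate enough depends entirely on what you feed it; anything that needs scopes, overloads or types will want a real parser.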

The Mono project's GMCS compiler contains a pretty reusable parser for C# 4.0. It is also relatively easy to write your own parser that suits your specific needs. For example, you can reuse this: http://antlrcsharp.codeplex.com/

Have a look at CSharpCodeCompiler in Microsoft.CSharp namespace. You can compile using CSharpCodeCompiler and access the result assembly using CompilerResults.CompiledAssembly. Off that assembly you will be able to get the types and off the type you can get all property and method information using reflection.
The performance will be pretty average, as you will need to compile all the source code whenever something changes. I am not aware of any methods that will let you incrementally compile snippets of code.
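The reflection half of this approach can be sketched as follows. To keep the example self-contained it inspects a type defined in the same file; `SampleShape`, `AssemblyOutline` and `Describe` are hypothetical names used only for illustration. With a compiled assembly in hand you would iterate over `assembly.GetTypes()` instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Stand-in for a type that would normally come from the compiled assembly.
public class SampleShape
{
    public string Name { get; set; }
    public double Scale(double factor, int times) { return factor * times; }
}

public static class AssemblyOutline
{
    // Only members declared on the type itself, so inherited ones such as ToString are skipped.
    const BindingFlags DeclaredPublic =
        BindingFlags.Public | BindingFlags.Instance | BindingFlags.Static | BindingFlags.DeclaredOnly;

    public static List<string> Describe(Type type)
    {
        var lines = new List<string> { "class " + type.Name };

        foreach (PropertyInfo p in type.GetProperties(DeclaredPublic))
            lines.Add("  property " + p.PropertyType.Name + " " + p.Name);

        foreach (MethodInfo m in type.GetMethods(DeclaredPublic))
        {
            if (m.IsSpecialName) continue; // skip property accessors such as get_Name
            string parameters = string.Join(", ",
                m.GetParameters().Select(p => p.ParameterType.Name + " " + p.Name));
            lines.Add("  method " + m.ReturnType.Name + " " + m.Name + "(" + parameters + ")");
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in Describe(typeof(SampleShape)))
            Console.WriteLine(line);
    }
}
```

Note that this only ever sees code that compiled successfully, which is exactly the limitation raised in the question's edit.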

Have you tried using the Microsoft.CSharp.CSharpCodeProvider class? This is a full C# code provider that supports CodeDom. You would simply need to call .Parse() on a text stream, and you get a CodeCompileUnit back.
var codeStream = new StringReader(code);
var codeProvider = new CSharpCodeProvider();
var compileUnit = codeProvider.Parse(codeStream);
// compileUnit contains your code dom
Well, seeing as the above does not work (I just tested it), the following article might be of interest. I bookmarked it a good long time ago, so I believe it only supports C# 2.0, but it might still be worth it:
Generate Code-DOMs directly from C# or VB.NET

It might be a bit late for Blindy, but I recently released a C# parser that would be perfect for this sort of thing, as it's designed to handle code fragments and retains comments:
C# Parser and CodeDOM
It handles C# 4.0 and also the new 'async' feature. It's commercial, but is a small fraction of the cost of other commercial compilers.
I really think few people realize just how difficult parsing C# has become, especially if you need to resolve symbolic references properly (which is usually required, unless maybe you're just doing formatting). Just try to read and fully understand the Type Inference section of the 500+ page language specification. Then, meditate on the fact that the spec is not actually fully correct (as mentioned by Eric Lippert himself).


Convert VB.NET To C# [duplicate]

Possible Duplicate:
Convert VB.NET code to C#
I'm looking for a powerful tool that can convert C# code to VB.NET or vice versa.
I've tried some websites but they are not very good.
Any ideas?
I have used .Net Reflector. Just load the DLL and select the language C#, VB etc.
See http://reflector.red-gate.com/download.aspx
The two most popular ones are developerFusion's and Telerik's:
Sometimes you will have to hand convert certain parts of the converted code that the converters have trouble understanding but on the whole they do a large part of the heavy lifting for you.
One example is that VB.NET uses () for array indexes whereas C# uses []. For some reason I have seen the developerFusion converter get confused and leave these as () in the C# code, which confuses the compiler.
I mostly use this for quick translations of code when I am answering forum questions and the op has specifically requested a vb.net answer. I just find it easier to code it in c# and then convert it.
If you have a very large project then you might find it tedious to convert each of the individual pages. In this case you might (I haven't actually tested this) find a way to convert the code more quickly by using the #develop IDE. I only say this as the developerFusion online converter is actually powered by code from this tool.
Another thing to remember, as has been mentioned elsewhere in this thread, is that it's entirely possible to mix and match languages within a single project. The only restriction to my knowledge is that you can only have one language per folder. This is to do with the way that the .NET code is compiled. By default each folder is compiled into its own assembly.
If you really need to mix and match on the same page, I think you could make a user control to contain the code for one language and put it onto a page written in the other language.
You have both online and offline solutions at your disposal.
A very popular online converter: Developer Fusion's Converter
Reflector has long been the most popular offline source inspector. It allows you to view any assembly in the language of your choice. However, its licensing changed recently and it is no longer free.

Format C# Source Code with Hyperlinks to Reference Library Documentation

I'm wondering if anyone has done this already.
I want to format C# source code in HTML. But with a twist! I want to turn the names of all types and methods that appear in the code into hyperlinks to the MSDN Library documentation of the types and methods.
To do a good job, the data types of variables and expressions need to be known, just as the C# compiler knows them. So it's a tall order. If something like this is not available, please point me to any free libraries that can generate a parse tree of the C# source code in sufficient detail to do this task. (In fact, I'd like to know about such a standalone parser library even if the full solution I am asking for already exists.)
This kind of utility might benefit blogs and forums -- maybe even Stack Overflow!
Have you checked out Docu? It's an open source library that converts .net documentation into HTML documents.
I'd suggest using the Visual Studio SDK.

Using reflection for code gen?

I'm writing a console tool to generate some C# code for objects in a class library. The best/easiest way I can actually generate the code is to use reflection after the library has been built. It works great, but this seems like a haphazard approach at best. Since the generated code will be compiled with the library, after making a change I'll need to build the solution twice to get the final result, etc. Some of these issues could be mitigated with a build script, but it still feels like a bit too much of a hack to me.
My question is, are there any high-level best practices for this sort of thing?
It's pretty unclear what you are doing, but what does seem clear is that you have some base line code, and based on some of its properties, you want to generate more code.
So the key issues here are: given the base line code, how do you extract interesting properties, and how do you generate code from those properties?
Reflection is a way to extract properties of code running (well, at least loaded) in the same execution environment as the reflection user code. The problem with reflection is that it only provides a very limited set of properties, typically lists of classes, methods, or perhaps names of arguments. If all the code generation you want to do can be done with just that, well, then reflection seems just fine. But if you want more detailed properties about the code, reflection won't cut it.
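As an illustration of the kind of generation that reflection alone can support, here is a minimal sketch that emits a read-only interface from a type's public properties. `Customer`, `InterfaceGenerator` and `Generate` are hypothetical names invented for this example, not anything from the asker's tool, and the output uses raw `FullName` values, so generic or nested property types would need extra handling.

```csharp
using System;
using System.Reflection;
using System.Text;

// Stand-in for a class from the already-built library.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class InterfaceGenerator
{
    // Builds C# source text for an interface exposing the type's public instance properties.
    public static string Generate(Type type)
    {
        var sb = new StringBuilder();
        sb.Append("public interface I" + type.Name + "\n{\n");
        foreach (PropertyInfo p in type.GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            sb.Append("    " + p.PropertyType.FullName + " " + p.Name + " { get; }\n");
        }
        sb.Append("}\n");
        return sb.ToString();
    }

    public static void Main()
    {
        Console.Write(Generate(typeof(Customer)));
    }
}
```

Everything the generator knows here is a name and a type. Anything that depends on what the method bodies actually do is out of reach, which is the limitation described above.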
In fact, the only artifact from which truly arbitrary code properties can be extracted is the source code as a character string (how else could you answer whether the number of characters between the add operator and the T in the middle of the variable name is a prime number?). As a practical matter, properties you can get from character strings are generally not very helpful (see the example I just gave :).
The compiler guys have spent the last 60 years figuring out how to extract interesting program properties and you'd be a complete idiot to ignore what they've learned in that half century.
They have settled on a number of relatively standard "compiler data structures": abstract syntax trees (ASTs), symbol tables (STs), control flow graphs (CFGs), data flow facts (DFFs), program triples, pointer analyses, etc.
If you want to analyze or generate code, your best bet is to process it first into such standard compiler data structures and then do the job. If you have ASTs, you can answer all kinds of questions about what operators and operands are used. If you have STs, you can answer questions about where-defined, where-visible and what-type. If you have CFGs, you can answer questions about "this-before-that" and "what conditions does statement X depend upon". If you have DFFs, you can determine which assignments affect the actions at a point in the code. Reflection will never provide this IMHO, because it will always be limited to what the runtime system developers are willing to keep around when running a program. (Maybe someday they'll keep all the compiler data structures around, but then it won't be reflection; it will just finally be compiler support.)
Now, after you have determined the properties of interest, what do you do for code generation? Here the compiler guys have been so focused on generation of machine code that they don't offer standard answers. The guys that do are the program transformation community (http://en.wikipedia.org/wiki/Program_transformation). Here the idea is to keep at least one representation of your program as ASTs, and to provide special support for matching source code syntax (by constructing pattern-match ASTs from the code fragments of interest), and provide "rewrite" rules that say in effect, "when you see this pattern, then replace it by that pattern under this condition".
By connecting the condition to various property-extracting mechanisms from the compiler guys, you get a relatively easy way to say what you want, backed up by that 50 years of experience. Such program transformation systems have the ability to read in source code, carry out analysis and transformations, and generally to regenerate code after transformation.
For your code generation task, you'd read the base line code into ASTs, apply analyses to determine properties of interest, use transformations to generate new ASTs, and then spit out the answer.
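To make the "when you see this pattern, replace it by that pattern under this condition" idea concrete, here is a toy sketch over a hand-built expression AST. It is in no way how a real program transformation system is implemented; the node types, `Rewriter`, `Simplify` and `Print` are all invented for illustration, and the code assumes C# 9 or later for records and property patterns.

```csharp
using System;

// A tiny, hypothetical AST for arithmetic expressions.
public abstract record Expr;
public sealed record Num(int Value) : Expr;
public sealed record Var(string Name) : Expr;
public sealed record Add(Expr Left, Expr Right) : Expr;
public sealed record Mul(Expr Left, Expr Right) : Expr;

public static class Rewriter
{
    // Applies two rewrite rules bottom-up:
    //   x + 0  =>  x   (condition: one operand is the literal 0)
    //   x * 1  =>  x   (condition: one operand is the literal 1)
    public static Expr Simplify(Expr e)
    {
        switch (e)
        {
            case Add a:
            {
                Expr l = Simplify(a.Left), r = Simplify(a.Right);
                if (r is Num { Value: 0 }) return l;
                if (l is Num { Value: 0 }) return r;
                return new Add(l, r);
            }
            case Mul m:
            {
                Expr l = Simplify(m.Left), r = Simplify(m.Right);
                if (r is Num { Value: 1 }) return l;
                if (l is Num { Value: 1 }) return r;
                return new Mul(l, r);
            }
            default:
                return e; // leaves (Num, Var) are returned unchanged
        }
    }

    // Regenerates source text from the tree (the "prettyprint" step).
    public static string Print(Expr e) => e switch
    {
        Num n => n.Value.ToString(),
        Var v => v.Name,
        Add a => "(" + Print(a.Left) + " + " + Print(a.Right) + ")",
        Mul m => "(" + Print(m.Left) + " * " + Print(m.Right) + ")",
        _ => throw new ArgumentException("unknown node type")
    };

    public static void Main()
    {
        Expr tree = new Add(new Mul(new Var("x"), new Num(1)), new Num(0));
        Console.WriteLine(Print(tree));           // ((x * 1) + 0)
        Console.WriteLine(Print(Simplify(tree))); // x
    }
}
```

The missing piece compared with the pipeline described above is the parser: a real system builds the tree from source text and expresses the rules in source syntax rather than in hand-written pattern matches.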
For such a system to be useful, it also has to be able to parse and prettyprint a wide variety of source code languages, so that folks other than C# lovers can also have the benefits of code analysis and generation.
These ideas are all reified in the DMS Software Reengineering Toolkit. DMS handles C, C++, C#, Java, COBOL, JavaScript, PHP, Verilog, ... and a lot of other languages.
(I'm the architect of DMS, so I have a rather biased view. YMMV).
Have you considered using T4 templates for performing the code generation? It looks like it's getting much more publicity and attention now and more support in VS2010.
This tutorial seems database centric but it may give you some pointers: http://www.olegsych.com/2008/09/t4-tutorial-creatating-your-first-code-generator/ in addition there was a recent Hanselminutes on T4 here: http://www.hanselminutes.com/default.aspx?showID=170.
Edit: Another great place is the T4 tag here on StackOverflow: https://stackoverflow.com/questions/tagged/t4
EDIT: (By asker, new developments)
As of VS2012, T4 now supports reflection over an active project in a single step. This means you can make a change to your code, and the compiled output of the T4 template will reflect the newest version, without requiring you to perform a second reflect/build step. With this capability, I'm marking this as the accepted answer.
You may wish to use CodeDom, so that you only have to build once.
First, I would read this CodeProject article to make sure there are not language-specific features you'd be unable to support without using Reflection.
From what I understand, you could use something like the Common Compiler Infrastructure (http://ccimetadata.codeplex.com/) to programmatically analyze your existing C# source.
This looks pretty involved to me though, and CCI apparently only has full support for C# language spec 2. A better strategy may be to streamline your existing method instead.
I'm not sure of the best way to do this, but you could do the following:
As a post-build step on your base dll, run the code generator
As another post-build step, run csc or msbuild to build the generated dll
Other things which depend on the generated dll will also need to depend on the base dll, so the build order remains correct

Which parsers are available for parsing C# code? [closed]

Closed 2 years ago.
Which parsers are available for parsing C# code?
I'm looking for a C# parser that can be used from C# and gives me access to line and file information about each artefact of the analysed code.
Works on source code:
From C# 1.0 to 2.0, open-source
Metaspec C# Parser:
From C# 1.0 to 3.0, commercial product (about 5000$)
From C# 1.0 to 3.0, commercial product (about 900€) (answer by SharpRecognize)
SharpDevelop Parser (answer by Akselsson)
From C# 1.0 to 4.0 (+async), open-source, parser used in SharpDevelop. Includes semantic analysis.
C# Parser and CodeDOM:
A complete C# 4.0 parser that already supports the C# 5.0 async feature. Commercial product (49$ to 299$) (answer by Ken Beckett)
Microsoft Roslyn CTP:
Compiler as a service.
Works on assembly:
Microsoft Common Compiler Infrastructure:
From C# 1.0 to 3.0, Microsoft Public License. Used by FxCop and Spec#
From C# 1.0 to 3.0, open-source
The problem with assembly "parsing" is that we have less information about lines and files (the information is based on the .pdb file, and the PDB contains line information only for methods).
I personally recommend Mono.Cecil and NRefactory.
Mono (open source) includes C# compiler (and of course parser)
If you are going to compile C# to .NET assemblies with the v3.5 compiler:
var cp = new Microsoft.CSharp.CSharpCodeProvider(new Dictionary<string, string>() { { "CompilerVersion", "v3.5" } });
If you're familiar with ANTLR, you can use Antlr C# grammar.
I've implemented just what you are asking (AST Parsing of C# code) at the OWASP O2 Platform project using SharpDevelop AST APIs.
To make it easier to consume, I wrote a quick API that exposes a number of key source code elements (using statements, types, methods, properties, fields, comments) and is able to rewrite the original C# code into C# and into VB.NET.
You can see this API in action on this O2 XRule script file: ascx_View_SourceCode_AST.cs.o2 .
For example this is how you process a C# source code text and populate a number of TreeViews & TextBoxes:
public void updateView(string sourceCode)
{
    var ast = new Ast_CSharp(sourceCode);
    types_TreeView.show_List(ast.astDetails.Types, "Text");
    rewritenCSharpCode_SourceCodeEditor.setDocumentContents(ast.astDetails.CSharpCode, ".cs");
    rewritenVBNet_SourceCodeEditor.setDocumentContents(ast.astDetails.VBNetCode, ".vb");
}
The example on ascx_View_SourceCode_AST.cs.o2 also shows how you can then use the information gathered from the AST to select on the source code a type, method, comment, etc..
For reference, here is the API code that I wrote (note that this is my first pass at using SharpDevelop's C# AST parser, and I am still getting my head around how it works):
We have recently released a C# parser that handles all C# 4.0 features plus the new async feature: C# Parser and CodeDOM
This library generates a semantic object model which retains comments and formatting information and can be modified and saved. It also supports the use of LINQ queries to analyze source code.
You should definitely check out Roslyn since MS just opened (or will soon open) the code with an Apache 2 license here. You can also check out a way to parse this info with this code from GitHub.
SharpDevelop, an open source IDE, comes with a visitor-based code parser which works really well. It can be used independently of the IDE.
Consider using reflection on a built binary instead of parsing the C# code directly. The reflection API is really easy to use and perhaps you can get all the information you need?
Have a look at Gold Parser. It has a very intuitive UI that lets you interactively test your grammar and generate C# code. There are plenty of examples available with it and it is completely free.
Maybe you could try with Irony on irony.codeplex.com.
It's very fast and a c# grammar already exists.
The grammar itself is written directly in C# in a BNF-like way (achieved with some operator overloads).
The best thing with it is that the "grammar" produces the AST directly.
Something that is gaining momentum and is very appropriate for the job is Nemerle.
You can see how it could solve this in these videos from NDC:
Igor Tkachev - Metaprogramming with Nemerle
Igor Tkachev - Nemerle Programming Language
Not in C#, but a full C# 2/3/4 parser that builds full ASTs is available with our DMS Software Reengineering Toolkit.
DMS provides a vast infrastructure for parsing, tree building, construction of symbol tables and flow analyses, source-to-source transformation, and regeneration of source code from the (modified) ASTs. (It also handles many other languages than just C#.)
EDIT (September) 2013: This answer hasn't been updated recently. DMS has long handled C# 5.0
GPPG might be of use, if you are willing to write your own parser (which is fun).

What is the best C# to VB.net converter? [closed]

Closed 7 years ago.
While searching the interweb for a solution to my VB.net problems I often find helpful articles on a specific topic, but the code is C#. That is no big problem, but it costs some time to convert it to VB manually.
There are some sites that offer code converters from C# to VB and vice versa, but fixing all the flaws after the conversion is nearly as time-consuming as doing it myself in the first place.
Until now I have been using http://labs.developerfusion.co.uk/convert/csharp-to-vb.aspx
Do you know something better?
Telerik has a good converter that is based on SharpDevelop that has worked pretty well over the years, though it has not been updated in years (due to it being based on SharpDevelop).
I've recently come across a Roslyn-based converter as well. I don't know how well it works or how well maintained it is, but as it's open source you can always fork it and update it as needed.
If you cannot find a good converter, you could always compile the C# code and use the disassembler in Reflector to see Visual Basic code. Some of the variable names will change.
I currently use these two most often:
But have also had some success with these others:
SharpDevelop has a built-in translator between C# and VB.NET. It is not perfect, though (e.g. optional values in VB.NET don't have an equivalent in C#, so the signature of the converted method must be edited), but you can save some time, as you do everything inside an IDE rather than on a web page (copy C# code, paste, hit button, copy VB.NET code, paste into IDE :P )
I think the best thing to do is learn enough of the other language so that you can rewrite by hand; there are some quite difficult differences in certain aspects that I'm not sure a converter would handle very well. For example, compare my translation from C# to VB of the following:
public class FileSystemEventSubscription : EventSubscription
{
    private FileSystemWatcher fileSystemWatcher;

    public FileSystemEventSubscription(IComparable queueName,
        Guid workflowInstanceId, FileSystemWatcher fileSystemWatcher) : base(queueName, workflowInstanceId)
    {
        this.fileSystemWatcher = fileSystemWatcher;
    }
}

Public Class FileSystemEventSubscription
    Inherits EventSubscription

    Private myFileSystemWatcher As FileSystemWatcher

    Public Sub New(ByVal QueueName As IComparable, ByVal WorkflowInstanceID As Guid, ByVal Watcher As FileSystemWatcher)
        MyBase.New(QueueName, WorkflowInstanceID)
        Me.myFileSystemWatcher = Watcher
    End Sub
End Class
The C# is from the Custom Activity Framework sample, and I'm afraid I've lost the link to it. But it contains some nasty looking inheritance (from a VB point of view).
I am using a free Visual Studio 2012 plug-in named Language Convert
It works perfectly on 2010/2012; unfortunately it doesn't work in VS 2013 yet.
The conversion is not 100% accurate, but it is definitely very helpful. Launching it for the first time is a bit tricky; check the image below first:
Last I checked, SharpDevelop has one and it is open source too.
You can load your DLL or EXE into Redgate's (formerly Lutz Roeder's) .Net Reflector, select your method and then the desired language from the language combo. The code of the selected method will be displayed in the selected language.
I hope this helps.
You can try this one converter. There is functionality for C# to VB and VB to C#.
Hope this helps.
Carlos Aguilar Mares has had an online converter for about 40 forevers - Code Translator but I would agree that Reflector is the better answer.
While not answering your question, I will say that I have been in a similar position.
I realised that code samples in C# were awkward when I was really starting out in .NET, but a few weeks into my first project (after I had grown more familiar with the .NET framework and VB.NET itself), I found that it was interesting and sometimes beneficial to have to reverse-engineer the C# code. Not just in terms of syntax, but also learning about the subtle differences in approach - it's useful to be open-minded in this respect.
I'm sticking with VB.NET as I learn more and more about the framework, but before long I'll dip my toe into C# with the intention of becoming 'multi-lingual'.
Currently I use a plugin for VS2005 that I found on CodeProject (http://www.codeproject.com/KB/cs/Code_convert_add-in.aspx); it uses an external service (http://www.carlosag.net/Tools/CodeTranslator/) to perform the translation.
Occasionally, when I'm offline, I use a converter tool (http://www.kamalpatel.net/ConvertCSharp2VB.aspx).
The one at http://www.developerfusion.com/tools/convert/csharp-to-vb/ (new url) now supports .NET 3.5 syntax (thanks to the #develop guys once again), and will automatically copy the results to your clipboard :)

