I am working on a larger C# project in Visual Studio that handles financial math, so naturally the code implements many special math formulas, and they need to be properly documented. I am looking for a good way to produce documentation from the code. Many objects already have XML doc comments with descriptions set up, and I am looking for ways to include math formulas written in LaTeX in that documentation.
What options are there and how easy are they to set up?
Or maybe more generally, are there better ways to produce such code documentation?
For me a few things are important:
The documentation must have a way to include math formulas.
LaTeX is our preferred syntax for writing formulas.
The ability to use cref-like links in the documentation.
Refactoring (like renaming a class) shouldn't break the links between documentation and objects.
It should work with Visual Studio IntelliSense tooltips, at least showing the summary documentation of methods and classes.
I tried using Doxygen 1.9.6 (we also have one C++ project) and I managed to make it partially work. It does render LaTeX formulas from the summary tag, but it seems to have issues with certain C# constructs; for example, I cannot make it generate any documentation for (public) implementations of methods from generic interfaces, regardless of how I set up the configuration (I need to do more research into what exactly the problem is).
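For reference, this is roughly the kind of comment we write; the formula uses Doxygen's \f$ ... \f$ inline-math syntax inside an ordinary XML doc comment (the method and the YieldCurve cref target here are made up for illustration):

```csharp
/// <summary>
/// Computes the discount factor \f$ d(t) = e^{-rt} \f$ for a
/// continuously compounded rate.
/// </summary>
/// <param name="r">Continuously compounded interest rate per year.</param>
/// <param name="t">Time to maturity in years.</param>
/// <seealso cref="YieldCurve"/>
public static double DiscountFactor(double r, double t) => Math.Exp(-r * t);
```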
I am adding this as a separate answer because it is a completely different approach. I have found another existing answer that may be helpful.
There are two Visual Studio extensions that support LaTeX formulas in comments.
https://marketplace.visualstudio.com/items?itemName=vs-publisher-1305558.VsTeXCommentsExtension (for VS 2017, 2019)
https://marketplace.visualstudio.com/items?itemName=pierreMertz.TeXcomments (works with VS 2010)
For VS 2022 there is a new version of the first extension:
https://marketplace.visualstudio.com/items?itemName=vs-publisher-1305558.VsTeXCommentsExtension2022
Maybe Literate Programming is what you are looking for. Literate programming is a programming paradigm in which documentation and source code are written together as a single narrative, making the code easier to read, understand, and maintain. The source code is interspersed with explanations and descriptions that provide context and clarify the purpose and function of the code. This approach aims to make the code more accessible to a wider audience and to improve its overall quality.
The general idea was introduced by Donald E. Knuth, whose original WEB system targeted Pascal; the later CWEB variant targets C (see http://www.literateprogramming.com).
Tommi Johtela proposed LiterateCS to implement it for C#. It uses Markdown with LaTeX syntax for math formulas.
General introduction: https://johtela.github.io/LiterateCS/Introduction.html
Example of math formula in the generated documentation: https://johtela.github.io/LiterateCS/TipsAndTricks.html
Background
Part of the project I'm working on requires me to analyze Q# source code and perform specific actions when certain syntax elements are encountered. For example, say I'd like to count how many different gate types are used throughout the program. Now, this could be implemented by walking the Abstract Syntax Tree of the program and performing actions based on the current syntax node.
What I've tried
I've started by analyzing the qsharp-compiler repository; however, the inner workings of the compiler lack online documentation, and browsing all the C# and F# sources can be really tedious.
Of course, I could write my own parser for the language, but that would probably be overkill for the task at hand. There has to be a way to extract the AST from inside the compiler.
The question
Is there a way to compile Q# source code using the Q# compiler programmatically (from C# or F#), and extract the internal AST?
Yes, it is perfectly possible to compile Q# source code programmatically. This is particularly useful if you want to repeatedly update a compilation: you can add, remove, or edit (parts of) the sources and references in memory, and query all kinds of useful information about the current state of the compilation that an IDE cares about (such as which symbols are defined at a particular location in a certain file).
However, if you just want to process the AST for a Q# compilation, then there is a much easier way! The Q# compiler has an extensibility mechanism that I believe fits your need perfectly.
This blog post gives a brief overview of the feature.
There is also an example of an extension in the compiler repo. This readme (and possibly this one) may also come in handy. I believe this answers half of your question, namely how to easily get access to the built AST.
The other half of the question, as I interpret it, is how to conveniently analyze or transform the AST. For that there is also a mechanism provided: the syntax tree transformation framework. That framework consists of a couple of classes that define the walk/transformation for different kinds of nodes, as well as a wrapping class that plugs it all together.
Rather than starting by looking at the definition of the transformations, it is probably more intuitive to just look at some examples that use them. An example that is pretty close to what you want to do can be found here. The implemented transformation adds a comment to each callable listing all identifiers used within the callable. It is invoked as part of a compilation step (see here) that is defined in the example I already linked above.
There are a couple of other good examples of simple transformations that are a bit farther from what you want to do but should give you an idea of how the whole setup works, if you are interested: this one allows attributes to be attached to callables, and this one is used to inline conjugations (patterns of the form U*VU, where * denotes the adjoint).
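To give a rough idea of the shape of such an extension, here is a skeleton of a rewrite step that only analyzes the AST without changing it. Take it as a sketch: the member names follow the IRewriteStep interface as used in the examples linked above and may need adjusting to the compiler version you build against.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.Quantum.QsCompiler;
using Microsoft.Quantum.QsCompiler.SyntaxTree;

// Sketch of a compiler extension that receives the built AST.
public class GateCounter : IRewriteStep
{
    public string Name => "GateCounter";
    public int Priority => 0; // steps with higher priority are executed first

    public IDictionary<string, string> AssemblyConstants { get; } =
        new Dictionary<string, string>();
    public IEnumerable<IRewriteStep.Diagnostic> GeneratedDiagnostics =>
        Enumerable.Empty<IRewriteStep.Diagnostic>();

    public bool ImplementsPreconditionVerification => false;
    public bool ImplementsTransformation => true;
    public bool ImplementsPostconditionVerification => false;

    public bool PreconditionVerification(QsCompilation compilation) => true;
    public bool PostconditionVerification(QsCompilation compilation) => true;

    public bool Transformation(QsCompilation compilation, out QsCompilation transformed)
    {
        // Pure analysis: walk compilation.Namespaces here, e.g. with a
        // transformation class from the syntax tree transformation
        // framework, and count the gate applications you care about.
        transformed = compilation; // leave the tree unchanged
        return true;
    }
}
```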
Last but not least, the Gitter channel for the Q# community can also be a good resource to engage with as you work.
A little question with which I hope you can help me (to make my life simpler).
One of the most respected numerical solvers for differential equations is LSODA; however, it is written in Fortran (http://www.netlib.org/odepack/index.html).
There does not seem to be a decent solver for C#, and writing my own in C# would be too time-consuming, especially as I have very stiff equations that need to be solved.
The NAG libraries for .NET do not contain an ODE solver (they lack the D02 routines). In terms of "university side" libraries, that's it.
However, NAG support suggested calling their DLL, which is fine for simple variables, but its external functions and dummy parameters left me rather perplexed and made me give up.
This still leaves LSODA, which is Fortran, but a lot simpler in its calling sequence. So I wonder: how can ODEPACK (the solver collection that includes the LSODA routine) be turned into a DLL with little work, so that it may be called from C#?
(Which still leaves me worried about the Jacobian, being a matrix, i.e. a 2D array.)
Specifically, I would like a situation similar to that with the Fortran NAG library, but instead offering me access to LSODA: http://www.nag.co.uk/numeric/csharpinfo.asp
Please keep in mind that I am a mathematician, so if your responses lose me, please be patient with me. And why am I so focused on C#? Well, it is simple, especially when one has Visual Studio 2010.
Many thanks for any responses in advance.
SmartMathLibrary looks dead, but it claims to have ODEPACK bindings. You could also check out Wikipedia's List of .NET Numerical Packages.
If you're open to other languages, Python's SciPy library contains a binding to LSODA (scipy.integrate.odeint wraps it). It's available on Windows, easy to use, free, and widely embraced by the scientific community.
It's not a full solution, but f2c (the Fortran-to-C converter) should be able to give you working C code from the Fortran source. That might at least be easier to get working from C#.
Disclaimer: I've never used f2c to convert a routine myself; I've only used some routines that someone else converted.
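If you go that route, calling the converted routine from C# would be done via P/Invoke. The sketch below is hypothetical: it assumes the f2c output has been compiled into a native lsoda.dll, and the parameter list follows LSODA's documented calling sequence, so check the generated C code for the exact types before relying on it. Note that Fortran (and f2c output) passes every argument by reference, and 2D arrays such as the Jacobian are stored column-major in one flat array.

```csharp
using System;
using System.Runtime.InteropServices;

static class Lsoda
{
    // Right-hand side callback: compute ydot = f(t, y). Raw pointers are
    // used because the array length is unknown to the marshaler; copy the
    // data across with Marshal.Copy inside the callback.
    [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    public delegate void OdeRhs(ref int neq, ref double t, IntPtr y, IntPtr ydot);

    // Hypothetical import of the compiled LSODA routine. With jt = 2,
    // LSODA generates the full Jacobian internally, so jac may be IntPtr.Zero.
    [DllImport("lsoda.dll", CallingConvention = CallingConvention.Cdecl)]
    public static extern void lsoda_(
        OdeRhs f, ref int neq, double[] y, ref double t, ref double tout,
        ref int itol, ref double rtol, ref double atol, ref int itask,
        ref int istate, ref int iopt, double[] rwork, ref int lrw,
        int[] iwork, ref int liw, IntPtr jac, ref int jt);
}
```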
I'm writing a console tool to generate some C# code for objects in a class library. The best/easiest way I can actually generate the code is to use reflection after the library has been built. It works great, but this seems like a haphazard approach at best. Since the generated code will be compiled with the library, after making a change I'll need to build the solution twice to get the final result, etc. Some of these issues could be mitigated with a build script, but it still feels like a bit too much of a hack to me.
My question is, are there any high-level best practices for this sort of thing?
It's pretty unclear what you are doing, but what does seem clear is that you have some baseline code and, based on some of its properties, you want to generate more code.
So the key issues here are: given the baseline code, how do you extract interesting properties, and how do you generate code from those properties?
Reflection is a way to extract properties of code running in (well, at least loaded into) the same execution environment as the code using reflection. The problem with reflection is that it only provides a very limited set of properties: typically lists of classes, methods, or perhaps names of arguments. If all the code generation you want to do can be done with just that, then reflection seems just fine. But if you want more detailed properties of the code, reflection won't cut it.
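As a quick illustration of that limit, here is a minimal sketch (the assembly name is hypothetical): reflection readily enumerates declarations, but it tells you essentially nothing about the statements inside the method bodies.

```csharp
using System;
using System.Reflection;

class ListDeclarations
{
    static void Main()
    {
        // Load the baseline assembly and list what reflection can see:
        // types, methods, parameters - declarations only, never bodies.
        Assembly asm = Assembly.LoadFrom("BaseLibrary.dll"); // hypothetical name
        foreach (Type type in asm.GetTypes())
            foreach (MethodInfo method in type.GetMethods(
                BindingFlags.Public | BindingFlags.Instance | BindingFlags.DeclaredOnly))
                Console.WriteLine("{0}.{1} ({2} parameters)",
                    type.FullName, method.Name, method.GetParameters().Length);
    }
}
```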
In fact, the only artifact from which truly arbitrary code properties can be extracted is the source code itself, as a character string (how else could you answer a question like: is the number of characters between the add operator and the T in the middle of the variable name a prime number?). As a practical matter, properties you can get from raw character strings are generally not very helpful (see the example I just gave :).
The compiler guys have spent the last 60 years figuring out how to extract interesting program properties, and you'd be a complete idiot to ignore what they've learned in that time.
They have settled on a number of relatively standard "compiler data structures": abstract syntax trees (ASTs), symbol tables (STs), control flow graphs (CFGs), data flow facts (DFFs), program triples, pointer analyses, etc.
If you want to analyze or generate code, your best bet is to process it first into such standard compiler data structures and then do the job. If you have ASTs, you can answer all kinds of question about what operators and operands are used. If you have STs, you can answer questions about where-defined, where-visible and what-type. If you have CFGs, you can answer questions about "this-before-that", "what conditions does statement X depend upon". If you have DFFs, you can determine which assignments affect the actions at a point in the code. Reflection will never provide this IMHO, because it will always be limited to what the runtime system developers are willing to keep around when running a program. (Maybe someday they'll keep all the compiler data structures around, but then it won't be reflection; it will just finally be compiler support).
Now, after you have determined the properties of interest, what do you do for code generation? Here the compiler guys have been so focused on generation of machine code that they don't offer standard answers. The guys that do are the program transformation community (http://en.wikipedia.org/wiki/Program_transformation). Here the idea is to keep at least one representation of your program as ASTs, and to provide special support for matching source code syntax (by constructing pattern-match ASTs from the code fragments of interest), and provide "rewrite" rules that say in effect, "when you see this pattern, then replace it by that pattern under this condition".
By connecting the conditions to the various property-extracting mechanisms from the compiler guys, you get a relatively easy way to say what you want, backed by those decades of experience. Such program transformation systems have the ability to read in source code, carry out analyses and transformations, and regenerate code after transformation.
For your code generation task, you'd read the baseline code into ASTs, apply analyses to determine the properties of interest, use transformations to generate new ASTs, and then spit out the answer.
For such a system to be useful, it also has to be able to parse and prettyprint a wide variety of source languages, so that folks other than C# lovers can also enjoy the benefits of code analysis and generation.
These ideas are all reified in the DMS Software Reengineering Toolkit. DMS handles C, C++, C#, Java, COBOL, JavaScript, PHP, Verilog, and a lot of other languages.
(I'm the architect of DMS, so I have a rather biased view. YMMV).
Have you considered using T4 templates for the code generation? T4 is getting much more publicity and attention now, and more support in VS2010.
This tutorial seems database-centric, but it may give you some pointers: http://www.olegsych.com/2008/09/t4-tutorial-creatating-your-first-code-generator/ In addition, there was a recent Hanselminutes episode on T4: http://www.hanselminutes.com/default.aspx?showID=170.
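To give a flavor of what a template looks like, here is a minimal, made-up example (the namespace and class names are placeholders; a real generator would reflect over your assembly or read a model instead of using a hard-coded list):

```
<#@ template language="C#" #>
<#@ output extension=".generated.cs" #>
// Hypothetical template: emits a stub partial class for each name below.
namespace MyLibrary.Generated
{
<# foreach (var name in new[] { "Account", "Portfolio" }) { #>
    public partial class <#= name #>Builder
    {
    }
<# } #>
}
```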
Edit: Another great place is the T4 tag here on StackOverflow: https://stackoverflow.com/questions/tagged/t4
EDIT: (By asker, new developments)
As of VS2012, T4 now supports reflection over an active project in a single step. This means you can make a change to your code, and the compiled output of the T4 template will reflect the newest version, without requiring you to perform a second reflect/build step. With this capability, I'm marking this as the accepted answer.
You may wish to use CodeDom, so that you only have to build once.
First, I would read this CodeProject article to make sure there are no language-specific features you'd be unable to support without using reflection.
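A minimal sketch of generating source with CodeDom (the names are placeholders); the generated text can then be written to a .cs file that is compiled as part of the normal build:

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;

class CodeDomSketch
{
    static void Main()
    {
        // Build a namespace containing one empty class in memory.
        var unit = new CodeCompileUnit();
        var ns = new CodeNamespace("Generated");
        ns.Types.Add(new CodeTypeDeclaration("GeneratedClass") { IsClass = true });
        unit.Namespaces.Add(ns);

        // Emit it as C# source text.
        using (var provider = CodeDomProvider.CreateProvider("CSharp"))
        using (var writer = new StringWriter())
        {
            provider.GenerateCodeFromCompileUnit(unit, writer, new CodeGeneratorOptions());
            Console.WriteLine(writer.ToString());
        }
    }
}
```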
From what I understand, you could use something like the Common Compiler Infrastructure (http://ccimetadata.codeplex.com/) to programmatically analyze your existing C# source.
This looks pretty involved to me, though, and CCI apparently only has full support for the C# 2 language spec. A better strategy may be to streamline your existing method instead.
I'm not sure of the best way to do this, but you could do the following:
As a post-build step on your base DLL, run the code generator
As another post-build step, run csc or msbuild to build the generated DLL
Other things that depend on the generated DLL will also need to depend on the base DLL, so the build order remains correct
Is there any software that visualizes algorithms from code? As a flow chart or something similar. Not dependencies, inheritance, and that kind of thing, but the code inside a function, or a series of functions.
I don't know about inside a function, but VS2010 has sequence diagram generation from code - see here or here
I think you may be looking for Code Rocket.
It provides flowchart and pseudocode visualizations of code methods and algorithms, embedded directly inside Visual Studio and Eclipse; there is also a separate Designer application for working outside the IDEs.
Aivosto has a set of tools to visualize source code from many languages; the relevant one is Visustin. It has been on the market for a long time. I tried it a very long time ago and was not 100% convinced. Maybe you want to give it a try and evaluate whether it's worth the money for you.
In my case, to understand a complex algorithm I had to experiment with it anyway; having a graph helps a bit, but as a coder you can probably visualize loops and decision trees well enough without software. I don't want to discourage you from using such tools; just try them out before investing money. Having graphs is nice, but they should also be useful.
I’m starting a project where I need to implement a lightweight interpreter.
The interpreter is used to execute simple scientific algorithms.
The programming language that this interpreter will use should be simple, since it is targeting non-software developers (for example, mathematicians).
The interpreter should support basic programming languages features:
Real numbers, variables, multi-dimensional arrays
Arithmetic (+, -, *, /, %) and comparison (==, !=, <, >, <=, >=) operators
Loops (for, while), Conditional expressions (if)
Functions
MathWorks MATLAB is a good example of where I’m heading, just much simpler.
The interpreter will be used as an environment to demonstrate algorithms; simple algorithms such as finding the average of a dataset/array, or slightly more complicated algorithms such as Gaussian elimination or RSA.
The best/most practical resource I found on the subject is Ron Ayoub’s entry on Code Project (Parsing Algebraic Expressions Using the Interpreter Pattern), a perfect example of a miniature version of my problem.
The Purple Dragon Book seems to be too much; is there anything more practical?
The interpreter will be implemented as a .NET library, using C#. However, resources for any platform are welcome, since the design-architecture part of this problem is the most challenging.
Any practical resources?
(please avoid “this is not trivial” or “why re-invent the wheel” responses)
I would write it in ANTLR. Write the grammar and let ANTLR generate a C# parser. You can ask ANTLR for a parse tree, and possibly the interpreter can operate directly on the parse tree. Perhaps you'll have to convert the parse tree to some more abstract internal representation (although ANTLR already allows irrelevant punctuation to be left out when generating the tree).
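As a sketch of what the C# side looks like with the ANTLR4 runtime, assuming ANTLR has generated CalcLexer/CalcParser from a grammar named Calc with a start rule called program (all hypothetical names):

```csharp
using System;
using Antlr4.Runtime;
using Antlr4.Runtime.Tree;

class ParseDemo
{
    static void Main()
    {
        // Feed a script to the generated lexer/parser pair.
        var input = new AntlrInputStream("x = 3 + 4 * 2");
        var lexer = new CalcLexer(input);          // generated by ANTLR
        var tokens = new CommonTokenStream(lexer);
        var parser = new CalcParser(tokens);       // generated by ANTLR
        IParseTree tree = parser.program();        // assumed start rule
        Console.WriteLine(tree.ToStringTree(parser)); // LISP-style dump of the tree
    }
}
```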
It might sound odd, but Game Scripting Mastery is a great resource for learning about parsing, compiling and interpreting code.
You should really check it out:
http://www.amazon.com/Scripting-Mastery-Premier-Press-Development/dp/1931841578
One way to do it is to examine the source code of an existing interpreter. I've written a JavaScript interpreter in the D programming language; you can download the source code from http://ftp.digitalmars.com/dmdscript.zip
Walter Bright, Digital Mars
I'd recommend leveraging the DLR to do this, as this is exactly what it is designed for.
Create Your Own Language on top of the DLR
Lua was designed as an extensible interpreter for use by non-programmers. (The first users were Brazilian petroleum geologists although the user base has broadened considerably since then.) You can take Lua and easily add your scientific algorithms, visualizations, what have you. It's superbly well engineered and you can get on with the task at hand.
Of course, if what you really want is the fun of building your own, then the other advice is reasonable.
Have you considered using IronPython? It's easy to use from .NET and it seems to meet all your requirements. I understand that python is fairly popular for scientific programming, so it's possible your users will already be familiar with it.
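A minimal hosting sketch (requires the IronPython packages; the shared variable names are just for illustration):

```csharp
using System;
using IronPython.Hosting;
using Microsoft.Scripting.Hosting;

class PythonHostDemo
{
    static void Main()
    {
        // Spin up an embedded Python engine and share data with it.
        ScriptEngine engine = Python.CreateEngine();
        ScriptScope scope = engine.CreateScope();
        scope.SetVariable("data", new double[] { 1.0, 2.0, 3.0 });

        // Users write plain Python; results flow back into C#.
        engine.Execute("average = sum(data) / len(data)", scope);
        Console.WriteLine(scope.GetVariable<double>("average"));
    }
}
```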
The Silk library has just been published to GitHub. It seems to do most of what you are asking. It is very easy to use. Just register the functions you want to make available to the script, compile the script to bytecode and execute it.
The programming language that this interpreter will use should be simple, since it is targeting non-software developers.
I'm going to chime in on this part of your question. A simple language is not what you really want to hand to non-software developers. Stripped-down languages require more effort from the programmer. What you really want is a well-designed and well-implemented Domain-Specific Language (DSL).
In this sense I second what Norman Ramsey recommends with Lua. It has an excellent reputation as a base for high-quality DSLs. A well-documented and useful DSL takes time and effort, but it will save everyone time in the long run, when domain experts can be brought up to speed quickly and require minimal support.
I am surprised no one has mentioned Xtext yet. It is available as an Eclipse plugin and an IntelliJ plugin. It provides not just the parser, as ANTLR does, but the whole pipeline (parser, linker, typechecker, and compiler) needed for a DSL. You can check its source code on GitHub to understand how an interpreter/compiler works.