Pretty-printing C# from Python

Suppose I wrote a compiler in Python or Ruby that translates a language into a C# AST.
How do I pretty-print this AST from Python or Ruby to get nicely indented C# code?
Thanks, Joel

In Python, the pprint module is available.
Depending on how your data is structured, it may not return the result you're looking for.

Once you have an AST, this should be very easy. When you walk your AST, all you have to do is keep track of what your current indent level is -- you could use a global for this. The code that's walking the tree simply needs to increment the indent level every time you enter a block, and decrement it when you exit a block. Then, whenever you print a line of code, you call it like this:
print "\t"*indentlevel + code
You should end up with nicely formatted code. However, I'm a bit confused that you're asking this question -- if you have the skills to parse C# into an AST, I can't imagine you wouldn't be able to write a pretty-printing output function. :-)
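As a rough illustration of the indent-level idea, here is a minimal sketch in Python 3. The tuple-based tree shape (("block", header, children) and ("stmt", text)) and the emit function are made up for this example; a real C# AST would have many more node types. Instead of a global, the level is passed down as a parameter.

```python
def emit(node, out, level=0, indent="    "):
    """Append the C# text for a toy AST node to the list `out`.

    Hypothetical node shapes (assumptions for this sketch):
      ("block", header, [children])  -> header followed by { ... }
      ("stmt", text)                 -> a single line of code
    """
    prefix = indent * level
    if node[0] == "block":
        _, header, children = node
        out.append(prefix + header)
        out.append(prefix + "{")
        for child in children:
            # Entering a block: children are printed one level deeper.
            emit(child, out, level + 1, indent)
        out.append(prefix + "}")
    else:
        out.append(prefix + node[1])


tree = ("block", "class Greeter", [
    ("block", "public void Hello()", [
        ("stmt", 'System.Console.WriteLine("hi");'),
    ]),
])

lines = []
emit(tree, lines)
result = "\n".join(lines)
print(result)
```

Passing the level as an argument means the increment and decrement happen automatically as the recursion enters and leaves a block, so there is no counter to forget to restore.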

One way would be to just print it and then invoke a code formatter.

Related

c# - beginner wants to add code on the run from a string containing a math function

I learned basic algorithms on visual C# in highschool, and I made a simple code that numerically integrates a math function within given limits.
I want to be able to change the function the code integrates without actually editing the code, so I googled it for a while and found a lot of articles about how to do it. I tried to understand them, but the problem is I can't understand any of what's written there because it's too far above my level.
I need a code that can add code on the run from a string containing a math function, that can accept a variable, log, ln, powers, sin, cos, tan and maybe pi and e, that is ready in a friendly "copy-paste" format, followed by instructions on where to paste it, and how to connect it to my code. To clarify:
I want to take something like this:
string s = "Sqrt(ln(1 + x ^ 2))";
and make it like this:
double x = 0;
double y = Math.Sqrt(Math.Log(1 + Math.Pow(x,2)));
I know it's a pretty annoying request and if it's not the right place to ask such a thing I apologize in advance.

This is actually fairly difficult to do in a language like C#, as it's statically compiled.
A good alternative would be to use an expression parsing library, such as NCalc. This library would allow you to create the expression (your string), parse it, and extract the result.
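To give a feel for what an expression-parsing library does internally, here is a minimal sketch in Python 3 (not C#, and not NCalc's API). It parses the string into a tree with Python's own ast module and only evaluates a whitelist of operations. The evaluate function, the name tables, and the choice to treat ^ as a power operator are all assumptions of this example.

```python
import ast
import math
import operator

# Whitelisted names (assumptions for this sketch; extend as needed).
FUNCTIONS = {"sqrt": math.sqrt, "ln": math.log, "log": math.log10,
             "sin": math.sin, "cos": math.cos, "tan": math.tan}
CONSTANTS = {"pi": math.pi, "e": math.e}
BINARY = {ast.Add: operator.add, ast.Sub: operator.sub,
          ast.Mult: operator.mul, ast.Div: operator.truediv,
          ast.Pow: operator.pow}


def evaluate(formula, **variables):
    """Evaluate a math formula string, allowing only whitelisted syntax."""
    # Treat ^ as exponentiation, as in the question's example.
    tree = ast.parse(formula.replace("^", "**"), mode="eval")

    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            if node.id in variables:
                return variables[node.id]
            if node.id.lower() in CONSTANTS:
                return CONSTANTS[node.id.lower()]
            raise ValueError("unknown name: " + node.id)
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY:
            return BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id.lower() in FUNCTIONS and not node.keywords):
            args = [walk(arg) for arg in node.args]
            return FUNCTIONS[node.func.id.lower()](*args)
        raise ValueError("unsupported syntax in formula")

    return walk(tree)


y = evaluate("Sqrt(ln(1 + x ^ 2))", x=2.0)
print(y)
```

Anything outside the whitelist (attribute access, imports, unknown names) raises ValueError instead of being executed, which is the main advantage over handing the string to a general-purpose eval.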

Is Ironpython suitable for value calculation?

I want to allow the users to define a formula based on two variables. Say I have a Quantity and Value and I want to allow the users of my app to write something like
Quantity*20+Value*0,005
in a textbox and pass the result back to the C# app. I thought of embedding the IronPython interpreter in my app, but I'm not sure it's worth the effort. Is this the way to go or should I consider another option?
Update
To clarify, users may also write more complex formulae like:
if Value > 10000:
    return Value*0,05
elif Value > 1000:
    return Value*0,02
else:
    return 0

I think I'll go this way. Embedding the Python runtime is easy enough and the syntax is really simple for users to write simple scripts.

If that is all that you want to do, to evaluate a simple expression you could just dynamically compile C# code:
http://msdn.microsoft.com/en-us/magazine/cc188948.aspx
It's something like 20 lines of code.

How do you do dynamic script evaluation in C#?

What is the state of dynamic code evaluation in C#? For a very advanced feature of an app I'm working on, I'd like the users to be able to enter a line of C# code that should evaluate to a boolean.
Something like:
DateTime.Now.Hour > 12 && DateTime.Now.Hour < 14
I want to dynamically eval this string and capture the result as a boolean.
I tried Microsoft.JScript.Eval.JScriptEvaluate, and this worked, but it's technically deprecated and it only works with JavaScript (not ideal, but workable). Additionally, I'd like to be able to push objects into the script engine so that they can be used in the evaluation.
Some resources I find mentioned dynamically compiling assemblies, but this is more overhead than I think I want to deal with.
So, what is the state of dynamic script evaluation in C#? Is it possible, or am I out of luck?

You can use the DLR's ScriptEngine. Here is an example:
http://www.codeproject.com/KB/codegen/ScriptEngine.aspx

The most informative link is this one:
Execute a string in C# 4.0
Expression Trees might also be of interest:
http://community.bartdesmet.net/blogs/bart/archive/2009/08/10/expression-trees-take-two-introducing-system-linq-expressions-v4-0.aspx
Alternatively these links:
http://www.codeproject.com/KB/dotnet/Expr.aspx
How can I evaluate a C# expression dynamically?
How can I evaluate C# code dynamically?

hand coding a parser

For all you compiler gurus, I wanna write a recursive descent parser and I wanna do it with just code. No generating lexers and parsers from some other grammar, and don't tell me to read the dragon book; I'll come around to that eventually.
I wanna get into the gritty details about implementing a lexer and parser for a reasonably simple language, say CSS. And I wanna do this right.
This will probably end up being a series of questions but right now I'm starting with a lexer. Tokenization rules for CSS can be found here.
I find myself writing code like this (hopefully you can infer the rest from this snippet):
public CssToken ReadNext()
{
    int val;
    while ((val = _reader.Read()) != -1)
    {
        var c = (char)val;
        switch (_stack.Top)
        {
            case ParserState.Init:
                if (c == ' ')
                {
                    continue; // ignore
                }
                else if (c == '.')
                {
                    _stack.Transition(ParserState.SubIdent, ParserState.Init);
                }
                break;
            case ParserState.SubIdent:
                if (c == '-')
                {
                    _token.Append(c);
                }
                _stack.Transition(ParserState.SubNMBegin);
                break;
What is this called? And how far off am I from something reasonably well understood? I'm trying to balance something which is fair in terms of efficiency and easy to work with. Using a stack to implement some kind of state machine is working quite well, but I'm unsure how to continue like this.
What I have is an input stream, from which I can read one character at a time. I don't do any lookahead right now; I just read the character and then, depending on the current state, try to do something with it.
I'd really like to get into the mindset of writing reusable snippets of code. This Transition method is currently my means of doing that: it pops the current state off the stack and then pushes the arguments in reverse order. That way, when I write Transition(ParserState.SubIdent, ParserState.Init) it will "call" a subroutine SubIdent which will, when complete, return to the Init state.
The parser will be implemented in much the same way. Currently, having everything in a single big method like this allows me to easily return a token when I find one, but it also forces me to keep everything in one single big method. Is there a nice way to split these tokenization rules into separate methods?

What you're writing is called a pushdown automaton. This is usually more power than you need to write a lexer, it's certainly excessive if you're writing a lexer for a modern language like CSS. A recursive descent parser is close in power to a pushdown automaton, but recursive descent parsers are much easier to write and to understand. Most parser generators generate pushdown automatons.
Lexers are almost always written as finite state machines, i.e., like your code except without the "stack" object. Finite state machines are closely related to regular expressions (actually, they're provably equivalent to one another). When designing such a lexer, one usually starts with the regular expressions and uses them to create a deterministic finite automaton, with some extra code in the transitions to record the beginning and end of each token.
There are tools to do this. The lex tool and its descendants are well known and have been translated into many languages. The ANTLR toolchain also has a lexer component. My preferred tool is ragel on platforms that support it. There is little benefit to writing a lexer by hand most of the time, and the code generated by these tools will probably be faster and more reliable.
If you do want to write your own lexer by hand, good ones often look something like this:
function readToken() // note: returns only one token each time
    while !eof
        c = peekChar()
        if c in A-Za-z
            return readIdentifier()
        else if c in 0-9
            return readInteger()
        else if c in ' \n\r\t\v\f'
            nextChar()
        ...
    return EOF
function readIdentifier()
    ident = ""
    while !eof
        c = peekChar() // look without consuming, so the delimiter is left for readToken
        if c in A-Za-z0-9
            ident.append(nextChar())
        else
            break
    return Token(Identifier, ident)
    // or maybe...
    // return Identifier(ident)
Then you can write your parser as a recursive descent parser. Don't try to combine lexer / parser stages into one, it leads to a total mess of code. (According to the Parsec author, it's slower, too).
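For something you can actually run, here is the same shape as a minimal sketch in Python 3. The token kinds (IDENT, INT, PUNCT, EOF) and the rule that any other character becomes a one-character token are assumptions of this example, not CSS tokenization rules.

```python
def is_letter(ch):
    return ch.isascii() and ch.isalpha()


def is_digit(ch):
    return ch.isascii() and ch.isdigit()


def tokenize(text):
    """Return a list of (kind, value) tokens for identifiers and integers."""
    tokens = []
    pos = 0

    def read_while(predicate):
        # Consume characters while predicate holds; the first character
        # that does not match is left in place for the main loop.
        nonlocal pos
        start = pos
        while pos < len(text) and predicate(text[pos]):
            pos += 1
        return text[start:pos]

    while pos < len(text):
        c = text[pos]  # peek at the next character without consuming it
        if is_letter(c):
            word = read_while(lambda ch: is_letter(ch) or is_digit(ch))
            tokens.append(("IDENT", word))
        elif is_digit(c):
            tokens.append(("INT", int(read_while(is_digit))))
        elif c in " \n\r\t\v\f":
            pos += 1  # skip whitespace
        else:
            # Assumption: anything else is a one-character punctuation token.
            tokens.append(("PUNCT", c))
            pos += 1
    tokens.append(("EOF", None))
    return tokens


tokens = tokenize("width2 = 42;")
print(tokens)
```

Each kind of token gets its own small reader, and none of them needs a state stack: the only state is the current position in the input.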

You need to write your own Recursive Descent Parser from your BNF/EBNF. I had to write my own recently and this page was a lot of help. I'm not sure what you mean by "with just code". Do you mean you want to know how to write your own recursive parser?
If you want to do that, you need to have your grammar in place first. Once you have your EBNF/BNF in place, the parser can be written quite easily from it.
The first thing I did when I wrote my parser, was to read everything in and then tokenize the text. So I essentially ended up with an array of tokens that I treated as a stack. To reduce the verbosity/overhead of pulling a value off a stack and then pushing it back on if you don't require it, you can have a peek method that simply returns the top value on the stack without popping it.
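To make the "tokenize first, then parse with a peek method" flow concrete, here is a minimal sketch in Python 3. It is not the parser I mention in this answer; the arithmetic grammar (expr, term, factor) and the Parser class are assumptions chosen to keep the example short.

```python
import re


def tokenize(text):
    # Integers, or any single non-space character (operators, parentheses).
    return re.findall(r"\d+|\S", text)


class Parser:
    """Recursive descent over a token list, one method per grammar rule.

    Grammar assumed for this sketch:
      expr   := term (("+" | "-") term)*
      term   := factor (("*" | "/") factor)*
      factor := INTEGER | "(" expr ")"
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        # Look at the next token without consuming it.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return token

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.next()
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.next()
            right = self.factor()
            value = value * right if op == "*" else value / right
        return value

    def factor(self):
        token = self.next()
        if token == "(":
            value = self.expr()
            if self.next() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token.isdigit():
            return int(token)
        raise SyntaxError("unexpected token %r" % token)


def parse(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("unexpected token %r" % parser.peek())
    return value


result = parse("2 * (3 + 4) - 5")
print(result)
```

Each method corresponds to one rule of the grammar, which is why having the EBNF written down first makes the parser almost mechanical to write.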
UPDATE
Based on your comment, I had to write a recursive-descent parser in JavaScript from scratch. You can take a look at the parser here. Just search for the constraints function. I wrote my own tokenize function to tokenize the input as well. I also wrote another convenience function (peek, that I mentioned before). The parser parses according to the EBNF here.
This took me a little while to figure out because it's been years since I wrote a parser (last time I wrote one was in school!), but trust me, once you get it, you get it. I hope my example gets you further along on your way.
ANOTHER UPDATE
I also realized that my example may not be what you want because you might be going towards using a shift-reduce parser. You mentioned that right now you are trying to write a tokenizer. In my case, I did write my own tokenizer in JavaScript. It's probably not robust, but it was sufficient for my needs.
function tokenize(options) {
    var str = options.str;
    var delimiters = options.delimiters.split("");
    var returnDelimiters = options.returnDelimiters || false;
    var returnEmptyTokens = options.returnEmptyTokens || false;
    var tokens = new Array();
    var lastTokenIndex = 0;
    for(var i = 0; i < str.length; i++) {
        if(exists(delimiters, str[i])) {
            var token = str.substring(lastTokenIndex, i);
            if(token.length == 0) {
                if(returnEmptyTokens) {
                    tokens.push(token);
                }
            }
            else {
                tokens.push(token);
            }
            if(returnDelimiters) {
                tokens.push(str[i]);
            }
            lastTokenIndex = i + 1;
        }
    }
    if(lastTokenIndex < str.length) {
        var token = str.substring(lastTokenIndex, str.length);
        token = token.replace(/^\s+/, "").replace(/\s+$/, "");
        if(token.length == 0) {
            if(returnEmptyTokens) {
                tokens.push(token);
            }
        }
        else {
            tokens.push(token);
        }
    }
    return tokens;
}
Based on your code, it looks like you are reading, tokenizing, and parsing at the same time - I'm assuming that's what a shift-reduce parser does? The flow for what I have is tokenize first to build the stack of tokens, and then send the tokens through the recursive-descent parser.

If you are going to hand code everything from scratch I would definitely consider going with a recursive descent parser. In your post you are not really saying what you will be doing with the token stream once you have parsed the source.
Some things I would recommend getting a handle on:
1. Good design for your scanner/lexer; this is what will be tokenizing your source code for your parser.
2. The next thing is the parser; if you have a good EBNF for the source language, the parser can usually translate quite nicely into a recursive descent parser.
3. Another data structure you will really need to get your head around is the symbol table. This can be as simple as a hashtable or as complex as a tree structure that can represent complex record structures etc. I think for CSS you might be somewhere between the two.
4. And finally you want to deal with code generation. You have many options here. For an interpreter, you might simply interpret on the fly as you parse the code. A better approach might be to generate a form of i-code that you can then write an interpreter for, and later even a compiler. Of course, for the .NET platform you could directly generate IL (probably not applicable for CSS :))
For references, I gather you are not heavy into the deep theory and I do not blame you. A really good starting point for getting the basics without complex code, if you do not mind the Pascal that is, is Jack Crenshaw's 'Let's Build a Compiler':
http://compilers.iecc.com/crenshaw/
Good luck I am sure you are going to enjoy this project.

It looks like you want to implement a "shift-reduce" parser, where you explicitly build a token stack. The usual alternative is a "recursive descent" parser, in which depth of procedure calls build the same token stack with their own local variables, on the actual hardware stack.
In shift-reduce, the term "reduce" refers to the operation performed on the explicitly-maintained token stack. For example, if the top of the stack has become Term, Operator, Term then a reduction rule can be applied resulting in Expression as a replacement for the pattern. The reduction rules are explicitly encoded in a data structure used by the shift-reduce parser; as a result, all reduction rules can be found in the same spot of the source code.
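A toy version of that idea, as a minimal sketch in Python 3: the RULES table, the classify helper, and the tiny grammar are assumptions for illustration only. To let chains such as 1 + 2 - 3 reduce, the three-symbol rule here starts with Expression rather than Term, and a lone Term is first reduced to Expression. A real shift-reduce parser would use generated tables and lookahead rather than this greedy matching.

```python
# All reduction rules live in one table: (pattern on top of the stack, result).
# Longer patterns are listed first so they are tried before shorter ones.
RULES = [
    (("Expression", "Operator", "Term"), "Expression"),
    (("Term",), "Expression"),
]


def classify(token):
    # Assumption: the only operators are + and -; everything else is a Term.
    return "Operator" if token in ("+", "-") else "Term"


def parse(tokens):
    """Return (accepted, trace) where trace lists the reductions applied."""
    stack = []
    trace = []
    for token in tokens:
        stack.append(classify(token))  # shift
        reduced = True
        while reduced:  # keep reducing while some rule matches the stack top
            reduced = False
            for pattern, result in RULES:
                if tuple(stack[-len(pattern):]) == pattern:
                    del stack[-len(pattern):]
                    stack.append(result)
                    trace.append(" ".join(pattern) + " -> " + result)
                    reduced = True
                    break
    # The input is accepted when everything reduced to a single Expression.
    return stack == ["Expression"], trace


accepted, trace = parse(["1", "+", "2", "-", "3"])
print(accepted)
for step in trace:
    print(step)
```

Because the grammar is data, adding a rule means adding a row to RULES, and whatever is left on the stack when no rule applies is available for building an error message.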
The shift-reduce approach brings a few benefits compared to recursive-descent. On a subjective level, my opinion is that shift-reduce is easier to read and maintain than recursive-descent. More objectively, shift-reduce allows for more informative error messages from the parser when an unexpected token occurs.
Specifically, because the shift-reduce parser has an explicit encoding of rules for making "reductions," the parser is easily extended to articulate what sorts of tokens could legally have followed. (e.g., "; expected"). A recursive descent implementation cannot easily be extended to do the same thing.
A great book on both kinds of parser, and the trade-offs in implementing different kinds of shift-reduce is "Introduction to Compiler Construction", by Thomas W. Parsons.
Shift-reduce is sometimes called "bottom-up" parsing and recursive-descent is sometimes called "top-down" parsing. In the analogy used, nodes composed with highest precedence (e.g., "factors" in multiplication expression) are considered to be "at the bottom" of the parsing. This is in accord with the same analogy used in "descent" of "recursive descent".

If you want to use the parser to also handle not-well-formed expressions, you really want a recursive descent parser. Much easier to get the error handling and reporting usable.
For literature, I'd recommend some of the old work of Niklaus Wirth. He knows how to write. Algorithms + Data Structures = Programs is what I used, but you can find his Compiler Construction online.

Advice for C# programmer writing Python [closed]

I've mainly been doing C# development for the past few years but recently started to do a bit of Python (not Iron Python). But I'm not sure if I've made the mental leap to Python...I kind of feel I'm trying to do things as I would in C#.
Any advice on how I can fully take advantage of Python?
Or any tips\tricks, things to learn more about, things to watch out for?

First, check tgray's and Lundström's advice.
Then, some things you may want to know:
Python is dynamically typed, so unlike C#, you will not
check type, but behavior. You may want to google about duck
typing. It implies you do not have to deal with boxing and
unboxing.
Python is fully object oriented, but the syntax does not
enforce this paradigm. You can write Python without using
the word "class".
The GUI library featured with Python can't compare with
C#'s. Check PyQt, GTK or wxPython libraries.
Python has a lot of concepts you may not be familiar with:
list comprehensions, generators ("yield" does exist in C#,
but it is not used much), decorators, metaclasses, etc. Don't
be afraid; you can program in Python without them. They
are just smart tools, not mandatory.
Like in C#, the Python standard library is huge. Always
look at it when you encounter any problem. It is most
likely that someone solved it already.
Python uses LATE binding and variable labels. It's far too
early for somebody starting with the language to worry
about it, but remember that one day you will encounter a
behavior with variables that SEEMS illogical, and you'll
have to check that. For the moment:
Just remember to never do the following:
def myfunc(my_list=[]) :
    # bla
Instead:
def myfunc(my_list=()) :
    my_list = list(my_list)
And you'll be good. There is a good reason for that, but
that's not the point :-)
Python is cross platform, enjoy writing on Mac, and
run on Linux, if you wish.
Python is not provided with a complex IDE (you got IDLE :-)).
If you are a Visual Studio addict, check Glade. This is
not as advanced as Visual Studio, but it's still a good RAD.
If you want to develop some web application in Python,
remember that Python is not .NET. You must add a web
framework to it if you want to compare. I like Django.
Python does not need a huge IDE to work with. SciTE,
Notepad++, IDLE, Kate, gedit...
Lightweight editors are really sufficient.
Python enforces indentation using spaces and line break,
you can't change that. You should avoid using tabs for
indenting and choose spaces instead. The equivalent of
empty braces {} is the keyword "pass".
Python does not enforce private variables. You can define a
private var using "__" (two underscores) at the beginning of
the variable name, but it's still bypassable in some tricky
ways. Python usually assumes programmers are grown adults
that know what they do and communicate.
Python uses iteration. A lot. A lot of a lot. And so the
itertools module is your best friend.
Python has no built in delegates. The delegate module is
not what you think. For event-driven programming, use a
GUI lib (or code the pattern yourself, it's not that
difficult).
Python has an interpreter: you can test almost anything,
live. It should always be running next to your text
editor. Python basic interpreter is not much, try IPython
for something tasty.
Python is autodocumented: use docstrings in your own code
and consult others' using "help()" in the Python interpreter.
Module basics:
sys: manipulate system features
os: set credential, manipulate file paths, rename, recursive file walk, etc
shutil: batch file processing (such as recursive delete)
re: regexp
urllib and urllib2: HTTP scripting like downloading, POST / GET requests, etc.
datetime: manipulate date, time AND DURATION
thread: you guessed it
zlib: compression
pickle: serialization
xml: parsing / Writing XML with SAX or DOM
There are hundreds of modules. Enjoy.
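Going back to the def myfunc(my_list=[]) warning earlier in this list, here is a minimal sketch of why it bites, written in Python 3 syntax (print is a function there). The function names append_shared and append_fresh are made up for this example.

```python
def append_shared(item, bucket=[]):
    # The default list is created once, when the function is defined,
    # so every call without an explicit bucket reuses the same object.
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=()):
    # Copy the immutable default into a new list on every call.
    bucket = list(bucket)
    bucket.append(item)
    return bucket


first = append_shared("a")
second = append_shared("b")
print(second)           # holds both items, not just "b"
print(first is second)  # both names point at the same list

fresh_a = append_fresh("a")
fresh_b = append_fresh("b")
print(fresh_a, fresh_b)  # two independent lists
```

Another common spelling of the fix is a default of None followed by "if bucket is None: bucket = []"; both avoid sharing one list between calls.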
Some typical ways to do things in Python:
Loops:
Python coders use massively the equivalent of the foreach C#
loop, and prefer it to any others:
Basic iterations:
for item in collection:
    print str(item)
"collection" can be a string, a list, a tuple... Any
iterable: any object defining the .__iter__() method. There are
a lot of iterables in Python. E.g: a typical Python idiom
to read files:
for line in open("/path/to/file") :
    print line
A shortcut to the for loop is called "list comprehension".
It's a way to create a new iterable in one line:
Creating a filtered list with list comprehension:
my_list = [item for item in collection if condition]
Creating a new list with a list comprehension:
my_list = [int(item) * 3 for item in collection]
Creating a new generator with a generator expression:
my_list = (int(item) * 3 for item in collection)
Same as above, but the values will be generated on the fly
at the first iteration then lost. More information about it here.
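A small sketch of the difference, in Python 3 syntax; the variable names are only for illustration:

```python
collection = ["1", "2", "3"]

# Square brackets: the whole list is built immediately and can be reused.
as_list = [int(item) * 3 for item in collection]

# Parentheses: a generator that produces each value only when asked.
as_gen = (int(item) * 3 for item in collection)

first_pass = list(as_gen)   # consumes the generator
second_pass = list(as_gen)  # nothing is left to produce

print(as_list)
print(first_pass)
print(second_pass)
```

The generator form is handy when the collection is large or when you only need to walk through the values once.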
Ordinary for loop
If you want to express a usual for loop, you can use the
xrange() function. for (int i = 0; i < 5; i++) becomes:
for i in xrange(0,5) :
do while equivalent
There is no "Do While" in Python. I never missed it, but if
you have to use this logic, do the following:
while True : # Yes, this is an infinite loop. Crazy, huh?
    # Do your stuff
    if condition :
        break
Unpacking
Swapping variables:
a, b = b, a
Multiple assignations:
The above is just a result of what we call "unpacking" (here
applied to a tuple). A simple way to explain it is that you
can assign each value of any sequence directly to an equal
number of variables, in one line:
animal1, animal2, animal3, animal4 = ["cow", "dog", "bird", "fish"]
This has a lot of implications. While iterating on a
multidimensional array, you normally get each sub sequence
one by one then use it, for example:
agenda = [("steve", "jobs"), ("linus", "torvalds"), ("bill", "gates"),("jon", "skeet")]
for person in agenda:
    print person[0], person[1]
But with unpacking, you can assign the values directly to
variables as well:
agenda = [("steve", "jobs"), ("linus", "torvalds"), ("bill", "gates"),("jon", "skeet")]
for name, lastname in agenda:
    print name, lastname
And that's why if you want to get an index while iterating,
Python coders use the following idiom (enumerate() is a
standard function):
for index, value in enumerate(sequence) :
    print index, value
Unpacking in functions calls
This is advanced use, and you can skip it if it bothers you.
You can unpack values using the sign "*" to use a sequence
directly in a function call. E.g:
>>> def foo(var1, var2, var3) :
...     print var1, var2
...     print var3
...
>>> seq = (3.14, 42, "yeah")
>>> foo(*seq)
3.14 42
yeah
There is even more than that. You can unpack a dictionary as
named variables, and write function prototypes with *,
** to accept an arbitrary number of arguments. But it is not
used enough to deserve to make this post even longer :-).
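For the curious, a minimal sketch of both directions in Python 3 syntax. The describe function and its parameter names are made up for this example.

```python
def describe(name, lastname, *nicknames, **details):
    # *nicknames collects extra positional arguments into a tuple,
    # **details collects extra keyword arguments into a dict.
    parts = [name, lastname]
    parts.extend(nicknames)
    parts.extend("%s=%s" % (key, details[key]) for key in sorted(details))
    return " ".join(parts)


person = ("linus", "torvalds")
extra = {"os": "linux", "vcs": "git"}

# * unpacks a sequence into positional arguments,
# ** unpacks a dictionary into named arguments.
line = describe(*person, **extra)
print(line)

print(describe("jon", "skeet", "tony"))
```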
String formatting:
print "This is a %s on %s about %s" % ("post", "stackoverflow", "python")
print "This is a %(subject)s on %(place)s about %(about)s" % {"subject" : "post", "place" : "stackoverflow", "about" : "python"}
Slicing an iterable:
You can get any part of an iterable using a very concise syntax:
print "blebla"[2:4] # Print "eb"
last = string[-1] # Getting last element
even = (0,1,2,3,4,5,6,7,8,9)[::2] # Getting evens only (third argument is a step)
reversed = string[::-1] # Reversing a string
Logical checks:
You can check the way you do in C#, but there are "Pythonic"
ways (shorter, clearer :-)):
if 1 in (1, 2, 3, 4) : # Check an element is in a sequence
if var : # Check if var is true. var is false if it's False, 0, "", (), [], {} or None
if not var : # Contrary of above
if thing is var: # Check if "thing" and "var" label the same object.
if thing is None : # We use that one because None means nothing in Python (almost null)
Combo (print on one line all the words containing an "o" in uppercase ):
sentence = "It's a good day to write some code"
print " ".join([word.upper() for word in sentence.split() if "o" in word])
Output: "GOOD TO SOME CODE"
Easier to ask for forgiveness than permission
Python coders usually don't check if something is possible. They are a bit like Chuck Norris. They do it. Then catch the exception. Typically, you don't check if a file exists, you try to open it, and roll back if it fails:
try :
    f = open(file)
except IOError :
    print "no file here !"
Of course Chuck Norris never uses excepts since he never fails.
The else clause
"Else" is a word of many uses in Python. You will find
"else" after "if", but after "except" and "for" as well.
for stuff in bunch :
    # Do things
else :
    # This always happens unless you hit "break" in the loop
This works for "while" loop too, even if we do not use this
loop as much.
try :
    # A crazy stuff
except ToCrazyError :
    # This happens if the crazy stuff raises a ToCrazyError Exception
else :
    # This happens if there is no error, so you can keep the "try" clause down to the one line that may fail
finally :
    # The same as in C#
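Both uses of "else" in one runnable sketch, in Python 3 syntax. The functions find_first_negative and parse_int and the calls list are made up for this example.

```python
def find_first_negative(numbers):
    for n in numbers:
        if n < 0:
            result = n
            break
    else:
        # Runs only when the loop finished without hitting "break".
        result = None
    return result


calls = []


def parse_int(text):
    try:
        value = int(text)  # the only line that may fail
    except ValueError:
        outcome = "not a number"
    else:
        # Runs only when the try block raised nothing.
        outcome = "parsed %d" % value
    finally:
        # Runs in every case, error or not.
        calls.append(text)
    return outcome


print(find_first_negative([3, -1, 2]))
print(find_first_negative([1, 2]))
print(parse_int("12"))
print(parse_int("twelve"))
```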
If you are curious, here is a bunch of advanced quick and
dirty (but nice) Python snippets.

Refrain from using classes. Use dictionaries, sets, list and tuples.
Setters and getters are forbidden.
Don't have exception handlers unless you really need to - let it crash in style.
Pylint can be your friend for more pythonish coding style.
When you're ready - check out list comprehensions, generators and lambda functions.

If you are not new to programming, I would recommend the book "Dive into Python" by Mark Pilgrim. It explains Python in a way that makes it easy to understand how Python techniques and idioms can be applied to build practical applications.

Start by reading The Zen of Python
You can read it at the link above, or just type import this at the Python prompt. =)
Take advantage of Python features not offered* by C#
Such as duck-typing, metaclasses, list comprehension, etc.*
Write simple programs just to test these features. You'll get used (if not addicted) to them in no time.
Look at the Python Standard Library
So you don't reinvent the wheel. Don't try to read the whole thing, even a quick look at the TOC could save you a lot of time.
* I know C# already has some of these features, but from what I can see they're either pretty new or not commonly used by C# developers. Please correct me if I'm wrong.

In case you haven't heard about it yet, Dive Into Python is a great place to start for anyone learning Python. It also has a bunch of Tips & Tricks.

If you are someone who is better learning a new language by taking small incremental steps then I would recommend using IronPython. Otherwise use regular CPython and don't do any more C# coding until you feel like you have a grasp of Python.

I would suggest getting a good editor so that you don't get bitten by whitespace. For simplicity, I just use ActivePython's packages, which include an editor and all of the win32api libraries. They are pretty fun to get into if you have been using C#. The win32api in Python can be a little bit simpler. You don't need to do the whole DllImport thing. Download ActivePython (which comes with CPython), open it up, and start entering some stuff at the console. You will pick it up fairly easily after using C#. For some more interesting Python tidbits, try ActiveState code, which has all sorts of recipes that let you very simply see different things that you can do with Python.

I'm pretty much in your shoes too, still using C# for most of my work, but using Python more and more for other projects.
@e-satis probably knows Python inside-out and all his advice is top-notch. From my point of view what made the biggest difference to me was the following:
Get back into functional. Not necessarily spaghetti code, but learning that not everything has to be in an object, nor should it be.
The interpreter. It's like the immediate window except 10^10 better. Because of how Python works you don't need all the baggage and crap C# makes you put in before you can run things; you can just whack in a few lines and see how things work.
I've normally got an IDLE instance up where I just throw around snippets as I'm working out how the various bits in the language work while I'm editing my files... e.g. busy working out how to do a map call on a list, but I'm not 100% on the lambda I should use... whack a few lines into IDLE, see how it works and what it does.
And finally, loving into the verbosity of Python, and I don't mean that in the long winded meaning of verbosity, but as e-satis pointed out, using verbs like "in", "is", "for", etc.
If you did a lot of reflection work in C# you'll feel like crying when you see how simple the same stuff is in Python.
Good luck with it.

If you have programming experience and don't feel like spending money I'd recommend How to Think Like a Computer Scientist in Python.

And then something you can benefit from:
IPython shell: Auto completion in the shell. It does batch operations, adds a ton of features, logging and such. >>> Play with the shell - always!
easy_install / pip: So nice and an easy way to install a 3rd party Python application.
