ANTLR: What is the fastest way to get grammar tree? - c#

What is the fastest (least code) way to get a grammar tree?
I am trying to get a grammar tree. I've generated C# code from my simple grammar:
grammar MyPascal;
options
{
language=CSharp3;
output=AST;
}
operator: (block | ID);
block : BEGIN operator* END;
BEGIN :'begin';
END :'end';
ID :('a'..'z')+;
WS :( ' '
| '\t'
| '\r'
| '\n'
) {$channel=HIDDEN;};
When I use ANTLRWorks on a simple input text like:
input.txt:
begin
abs
qwe
begin
begin
end
end
end
I get a nice picture of the grammar tree.
Now I wonder whether there is a simple way to get the tree structure of my "program" from C# without writing thousands of lines of code.
Here is my attempt to get the grammar tree:
class Program
{
    static void Main(string[] args)
    {
        MyPascalLexer lex = new MyPascalLexer(new ANTLRFileStream(@"M:\input.txt"));
        CommonTokenStream tokens = new CommonTokenStream(lex);
        MyPascalParser g = new MyPascalParser(tokens);
        MyPascalParser.myprogram_return X = g.myprogram();
        Console.WriteLine(X.Tree); // Writes: nil
        Console.WriteLine(X.Start); // Writes: [@0,0:4='begin',<4>,1:0]
        Console.WriteLine(X.Stop); // Writes: [@35,57:57='end',<19>,12:2]
    }
}

You'll have to "tell" ANTLR to build an AST, as opposed to just a flat stream of tokens (simple parse tree).
See this SO Q&A that shows how to do this in C#.
Also, you should not use:
ID : ('a'..'z')*;
i.e. let a lexer rule match an empty string. This will get you in trouble, because such a rule always matches. You'll want to let it match at least one character:
ID : ('a'..'z')+;
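Why the empty match is a problem can be illustrated outside ANTLR. The sketch below uses java.util.regex as a stand-in for the lexer rule (an assumption made for illustration; it is not how ANTLR matches tokens internally). The class and method names are hypothetical. The star form "succeeds" on input that contains no letters at all while consuming nothing, whereas the plus form simply does not match there.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmptyMatchDemo {
    /** Length of the match of `rule` at the start of `input`, or -1 if it does not match there. */
    static int matchLength(String rule, String input) {
        Matcher m = Pattern.compile(rule).matcher(input);
        return m.lookingAt() ? m.end() : -1;
    }

    public static void main(String[] args) {
        // Star form: matches even though no letter is present, and consumes nothing.
        System.out.println(matchLength("[a-z]*", "123"));
        // Plus form: no match at this position, so another rule gets a chance.
        System.out.println(matchLength("[a-z]+", "123"));
        // Plus form on real identifier text: consumes the three letters.
        System.out.println(matchLength("[a-z]+", "abs 1"));
    }
}
```

A tokenizer that accepts a zero-length token never advances past the offending character, which is the trouble the answer warns about.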

Related

Antlr Lexer Parse error

Antlr version: antlr-dotnet-tool-3.5.0.2
TestGrammar.g
lexer grammar TestGrammar;
options
{
language=CSharp3;
backtrack=true;
}
DOT
: '.'
;
NUMBER
: ( '0'..'9' )+ ('.' ( '0'..'9' )+)?
;
WS
: (' '|'\r'|'\t'|'\u000C'|'\n') {$channel=Hidden;}
;
C# code:
var lexer = new TestGrammar(new ANTLRStringStream("2..3"));
while (true)
{
var token = lexer.NextToken();
Console.WriteLine(token);
if (token.Type == -1)
break;
}
Result:
[@-1,3:3='3',<5>,1:3]
[@-1,4:4='<EOF>',<-1>,1:4]
So, I test this grammar with input
2..3
I expect that the result will be the following:
NUMBER["2"] DOT["."] DOT["."] NUMBER["3"]
So, what am I doing wrong? Thank you!
Testing with the Java target:
TestGrammar lexer = new TestGrammar(new ANTLRInputStream("2..3"));
for (Token t : lexer.getAllTokens()) {
System.out.printf("%s -> %s\n", t.getText(), TestGrammar.VOCABULARY.getSymbolicName(t.getType()));
}
produces the following output:
2 -> NUMBER
. -> DOT
. -> DOT
3 -> NUMBER
So, the grammar is correct. I doubt that the C# runtime would produce anything other than what I posted (with the C# equivalent test class). If it still doesn't work, please edit your question and add some code that demonstrates how to reproduce the error(s) you get.
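For reference, the token sequence the question expects can be reproduced with a small hand-written lexer. This is only a sketch: the class and method names are hypothetical, and it is not the code ANTLR generates. The point it shows is that the fraction part of NUMBER should only be taken when a digit actually follows the dot, which needs a peek one character past the '.'. Whether the missing lookahead is what trips up the ANTLR 3 C# lexer in the question is an assumption, not something the thread confirms.

```java
import java.util.ArrayList;
import java.util.List;

class DotNumberLexer {
    /** Returns tokens as "TYPE[text]" strings; whitespace is skipped. */
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (isDigit(c)) {
                int start = i;
                while (i < input.length() && isDigit(input.charAt(i))) i++;
                // Take the fraction only if a digit follows the dot.
                if (i + 1 < input.length() && input.charAt(i) == '.' && isDigit(input.charAt(i + 1))) {
                    i++; // the dot
                    while (i < input.length() && isDigit(input.charAt(i))) i++;
                }
                tokens.add("NUMBER[" + input.substring(start, i) + "]");
            } else if (c == '.') {
                tokens.add("DOT[.]");
                i++;
            } else if (c == ' ' || c == '\r' || c == '\t' || c == '\f' || c == '\n') {
                i++; // same characters as the WS rule, dropped here
            } else {
                throw new IllegalArgumentException("unexpected character '" + c + "' at index " + i);
            }
        }
        return tokens;
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        System.out.println(tokenize("2..3")); // [NUMBER[2], DOT[.], DOT[.], NUMBER[3]]
    }
}
```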

c# migrating to ANTLR 4 from ANTLR 3 with AST

I have inherited some C# code based on ANTLR 3.
We have some grammar files that use the AST (abstract syntax tree) option, and we use those grammars to parse text files written in a very odd "language" into objects. We use the AST as intermediate objects and then convert them to the real objects that we need (with some more processing).
I have no knowledge of ANTLR, but the ANTLR processing of the files is currently a bottleneck for the application's performance.
Since we are using ANTLR 3, we thought we might get a performance boost by migrating to ANTLR 4 (and using the latest version is good practice anyway).
I have read that ASTs no longer exist in ANTLR 4. What is the best (and simplest) way to replace them, and what will that mean for my current code?
What is the best approach to upgrading, and will it really give us a performance boost?
An example of one of the grammar files (there are six, and this is the simplest one):
grammar Rules;
options
{
language=CSharp2;
output=AST;
ASTLabelType=CommonTree;
superClass = OOPLParserBase;
}
tokens
{
OOPL_MODEL;
}
@lexer::namespace { TestParser.Common.RulesParser }
@parser::namespace { TestParser.Common.RulesParser }
@header
{
    using System.Collections.Generic;
    using TestParser.OOPLModel;
}
@members
{
    public RulesParser() : base(null)
    {
    }
    protected override CommonTree GetAst()
    {
        return root().Tree as CommonTree;
    }
    protected override Lexer GetLexer()
    {
        return new RulesLexer();
    }
}
//semantic analysis
root : header (rule_line COMMENT?)+ -> ^(header rule_line+);
header : header_comment+ -> ^(OOPL_MODEL<OOPLModel>[new CommonToken(OOPL_MODEL), "1.0"] header_comment+);
header_comment : COMMENT -> ^(COMMENT<OOPLComment>[$COMMENT, $COMMENT.Text]);
rule_line : parameter RULE_TYPE COMMA PARAMETER_NAME COLON condition -> ^(RULE_TYPE<OOPLBlock>[$RULE_TYPE, $RULE_TYPE.Text] parameter PARAMETER_NAME<OOPLValue>[$PARAMETER_NAME, $PARAMETER_NAME.Text] condition);
parameter : PARAMETER_NAME EQUALS (integer_value = INTEGER | real_value = REAL |string_value = STRING) COMMA -> ^(PARAMETER_NAME<OOPLKeyedValue>[$PARAMETER_NAME, $PARAMETER_NAME.Text, SingleWhereNotNull<IToken>($integer_value, $string_value, $real_value).Text]);
condition : condition_value COMMA condition_value COMMA condition_value COMMA condition_value COMMA condition_value COMMA condition_value COMMA condition_value COMMA condition_value COMMA condition_value COMMA condition_value COMMA condition_value COMMA condition_value COMMA condition_value;
condition_value : (asterisk| parameter_name | positive_integer);
asterisk : ASTERISK -> ^(ASTERISK<OOPLValue>[$ASTERISK, $ASTERISK.Text]);
parameter_name : PARAMETER_NAME -> ^(PARAMETER_NAME<OOPLValue>[$PARAMETER_NAME, $PARAMETER_NAME.Text]);
positive_integer : INTEGER -> ^(INTEGER<OOPLValue>[$INTEGER, $INTEGER.Text]);
//lexical analysis
EQUALS : '=';
NEW_LINE_R : '\r' { $channel = HIDDEN; };
NEW_LINE_N : '\n' { $channel = HIDDEN; };
RULE_TYPE : ('Time'|'TIME'|'Lol'|'LOL'|'World'|'WORLD'|'Template'|'TEMPLATE');
DOUBLE_COLON : COLON COLON;
INTEGER : MINUS? DIGIT+;
REAL : INTEGER '.' INTEGER;
PARAMETER_NAME : ASTERISK? (LETTER|DIGIT|UNDERSCORE|FORWARDSLASH|DOUBLE_COLON|MINUS)+ ASTERISK?;
WS : ( ' '
| '\t'
| NEW_LINE_R
| NEW_LINE_N
) { $channel = HIDDEN; } ;
COMMENT : '#' ( options {greedy=false;} : . )* NEW_LINE_R? NEW_LINE_N;
STRING : '"'~('"')* '"';
fragment
MINUS : '-';
COMMA : ',';
COLON : ':';
fragment
DOT : '.';
ASTERISK : '*';
fragment
FORWARDSLASH : '/';
fragment
UNDERSCORE : '_';
fragment
DIGIT : '0'..'9';
fragment
LETTER : 'A'..'Z' | 'a'..'z';
I'd do the transformation solely in C# code after the parse.
In this case I'd even skip the intermediate AST form and transform the parse tree (provided by ANTLR4) directly into the target representation.
Some prefer ParseTreeListener/ParseTreeWalkers, which aid you in walking the parse tree. Check these out if you want some pre-built code. Be sure to use the typed listener that ANTLR generates for your grammar (for a grammar named Rules, the C# target generates IRulesListener and RulesBaseListener); inherit from it and adjust it to your needs.
link: https://theantlrguy.atlassian.net/wiki/display/ANTLR4/Parse+Tree+Listeners
I'd not recommend ParseTreeVisitors here. Like listeners, they run over the finished parse tree (after the parse, not during it), but a visitor has to drive the traversal itself and pass results back through return values, which suits simple operations best. If the requirements evolve later on, you're way more flexible with custom processing or listeners/walkers.
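The listener/walker idea can be sketched without the ANTLR runtime. Everything in the block below (Node, Listener, walk, collectParameterNames, demo) is a hypothetical stand-in written for illustration; in a real ANTLR 4 project the generated listener type and ANTLR's own ParseTreeWalker play these roles. The sketch builds target values directly from the tree in a single walk, with no intermediate AST.

```java
import java.util.ArrayList;
import java.util.List;

class ListenerSketch {
    /** Hypothetical stand-in for a parse-tree node: rule name, matched text, children. */
    static final class Node {
        final String rule;
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String rule, String text) {
            this.rule = rule;
            this.text = text;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Callbacks fired when the walker enters and leaves a node. */
    interface Listener {
        void enter(Node node);
        void exit(Node node);
    }

    /** Depth-first walk that drives the listener; the listener never recurses itself. */
    static void walk(Node node, Listener listener) {
        listener.enter(node);
        for (Node child : node.children) {
            walk(child, listener);
        }
        listener.exit(node);
    }

    /** Builds the target values (here: plain strings) straight from the tree. */
    static List<String> collectParameterNames(Node root) {
        List<String> names = new ArrayList<>();
        walk(root, new Listener() {
            @Override public void enter(Node node) {
                if (node.rule.equals("parameter_name")) names.add(node.text);
            }
            @Override public void exit(Node node) { }
        });
        return names;
    }

    /** A tiny tree shaped like the `condition` rule of the grammar in the question. */
    static List<String> demo() {
        Node root = new Node("condition", null)
                .add(new Node("condition_value", null).add(new Node("parameter_name", "foo")))
                .add(new Node("condition_value", null).add(new Node("asterisk", "*")))
                .add(new Node("condition_value", null).add(new Node("parameter_name", "bar")));
        return collectParameterNames(root);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [foo, bar]
    }
}
```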

How can my ANTLR parser (not lexer) trigger a lexical "include" (not AST splice)?

The ANTLR website describes two approaches to implementing "include" directives. The first approach is to recognize the directive in the lexer and include the file lexically (by pushing the CharStream onto a stack and replacing it with one that reads the new file); the second is to recognize the directive in the parser, launch a sub-parser to parse the new file, and splice in the AST generated by the sub-parser. Neither of these is quite what I need.
In the language I'm parsing, recognizing the directive in the lexer is impractical for a few reasons:
There is no self-contained character pattern that always means "this is an include directive". For example, Include "foo"; at top level is an include directive, but in Array bar --> Include "foo"; or Constant Include "foo"; the word Include is an identifier.
The name of the file to include may be given as a string or as a constant identifier, and such constants can be defined with arbitrarily complex expressions.
So I want to trigger the inclusion from the parser. But to perform the inclusion, I can't launch a sub-parser and splice the AST together; I have to splice the tokens. It's legal for a block to begin with { in the main file and be terminated by } in the included file. A file included inside a function can even close the function definition and start a new one.
It seems like I'll need something like the first approach but at the level of TokenStreams instead of CharStreams. Is that a viable approach? How much state would I need to keep on the stack, and how would I make the parser switch back to the original token stream instead of terminating when it hits EOF? Or is there a better way to handle this?
==========
Here's an example of the language, demonstrating that blocks opened in the main file can be closed in the included file (and vice versa). Note that the # before Include is required when the directive is inside a function, but optional outside.
main.inf:
[ Main;
print "This is Main!";
if (0) {
#include "other.h";
print "This is OtherFunction!";
];
other.h:
} ! end if
]; ! end Main
[ OtherFunction;
One possibility is to let your parser, for each Include statement, create a new instance of your lexer and insert the tokens that lexer produces at the index the parser is currently at (see the insertTokens(...) method in the parser's @members block).
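Stripped of ANTLR, the splice itself is just a list insertion at the cursor. The following sketch uses plain strings in place of tokens and a map in place of the file system; the class name, method name and the naive detection of `Include <name> ;` are assumptions made for illustration (a real parser decides from context whether Include is a directive).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class TokenSpliceSketch {
    /**
     * Walks the token list with a cursor, as a parser would. Once the cursor has
     * passed `Include <name> ;`, the tokens of that "file" are inserted at the
     * cursor, so they are the next tokens to be read.
     */
    static List<String> parseWithIncludes(List<String> mainTokens, Map<String, List<String>> files) {
        List<String> stream = new ArrayList<>(mainTokens);
        int index = 0;
        while (index < stream.size()) {
            boolean isInclude = stream.get(index).equals("Include")
                    && index + 2 < stream.size()
                    && stream.get(index + 2).equals(";");
            if (isInclude) {
                String fileName = stream.get(index + 1);
                List<String> extra = files.get(fileName);
                if (extra == null) {
                    throw new IllegalArgumentException("unknown file: " + fileName);
                }
                index += 3;                   // the directive itself has been consumed
                stream.addAll(index, extra);  // splice at the current position
            } else {
                index++;
            }
        }
        return stream;
    }

    public static void main(String[] args) {
        List<String> main = List.of("[", "Main", ";", "if", "(", "0", ")", "{",
                "Include", "other.h", ";", "print", "\"This is OtherFunction!\"", ";", "]", ";");
        Map<String, List<String>> files =
                Map.of("other.h", List.of("}", "]", ";", "[", "OtherFunction", ";"));
        System.out.println(parseWithIncludes(main, files));
    }
}
```

Because the spliced tokens are scanned like any others, nested includes work, but a file that includes itself would never terminate; a real implementation would need a guard against that.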
Here's a quick demo:
Inform6.g
grammar Inform6;
options {
output=AST;
}
tokens {
STATS;
F_DECL;
F_CALL;
EXPRS;
}
@parser::header {
  import java.util.Map;
  import java.util.HashMap;
}
@parser::members {
  private Map<String, String> memory = new HashMap<String, String>();
  private void putInMemory(String key, String str) {
    String value;
    if(str.startsWith("\"")) {
      value = str.substring(1, str.length() - 1);
    }
    else {
      value = memory.get(str);
    }
    memory.put(key, value);
  }
  private void insertTokens(String fileName) {
    // possibly strip quotes from `fileName` in case it's a Str-token
    try {
      CommonTokenStream thatStream = new CommonTokenStream(new Inform6Lexer(new ANTLRFileStream(fileName)));
      thatStream.fill();
      List extraTokens = thatStream.getTokens();
      extraTokens.remove(extraTokens.size() - 1); // remove EOF
      CommonTokenStream thisStream = (CommonTokenStream)this.getTokenStream();
      thisStream.getTokens().addAll(thisStream.index(), extraTokens);
    } catch(Exception e) {
      e.printStackTrace();
    }
  }
}
parse
: stats EOF -> stats
;
stats
: stat* -> ^(STATS stat*)
;
stat
: function_decl
| function_call
| include
| constant
| if_stat
;
if_stat
: If '(' expr ')' '{' stats '}' -> ^(If expr stats)
;
function_decl
: '[' id ';' stats ']' ';' -> ^(F_DECL id stats)
;
function_call
: Id exprs ';' -> ^(F_CALL Id exprs)
;
include
: Include Str ';' {insertTokens($Str.text);} -> /* omit statement from AST */
| Include id ';' {insertTokens(memory.get($id.text));} -> /* omit statement from AST */
;
constant
: Constant id expr ';' {putInMemory($id.text, $expr.text);} -> ^(Constant id expr)
;
exprs
: expr (',' expr)* -> ^(EXPRS expr+)
;
expr
: add_expr
;
add_expr
: mult_expr (('+' | '-')^ mult_expr)*
;
mult_expr
: atom (('*' | '/')^ atom)*
;
atom
: id
| Num
| Str
| '(' expr ')' -> expr
;
id
: Id
| Include
;
Comment : '!' ~('\r' | '\n')* {skip();};
Space : (' ' | '\t' | '\r' | '\n')+ {skip();};
If : 'if';
Include : 'Include';
Constant : 'Constant';
Id : ('a'..'z' | 'A'..'Z') ('a'..'z' | 'A'..'Z' | '0'..'9')+;
Str : '"' ~'"'* '"';
Num : '0'..'9'+ ('.' '0'..'9'+)?;
main.inf
Constant IMPORT "other.h";
[ Main;
print "This is Main!";
if (0) {
Include IMPORT;
print "This is OtherFunction!";
];
other.h
} ! end if
]; ! end Main
[ OtherFunction;
Main.java
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;
public class Main {
public static void main(String[] args) throws Exception {
// create lexer & parser
Inform6Lexer lexer = new Inform6Lexer(new ANTLRFileStream("main.inf"));
Inform6Parser parser = new Inform6Parser(new CommonTokenStream(lexer));
// print the AST
DOTTreeGenerator gen = new DOTTreeGenerator();
StringTemplate st = gen.toDOT((CommonTree)parser.parse().getTree());
System.out.println(st);
}
}
To run the demo, do the following on the command line:
java -cp antlr-3.3.jar org.antlr.Tool Inform6.g
javac -cp antlr-3.3.jar *.java
java -cp .:antlr-3.3.jar Main
The output you'll see corresponds to the following AST:

Regex for checking for compass directions

I'm looking to match the 8 main directions as might appear in a street or location prefix or suffix, such as:
N Main
south I-22
124 Grover Ave SE
This is easy to code using a brute force list of matches and cycle through every match possibility for every street address, matching once with a start-of-string anchor and once with a end-of-string anchor. My blunt starting point is shown farther down, if you want to see it.
My question is if anyone has some clever ideas for compact, fast-executing patterns to accomplish the same thing. You can assume:
Compound directions always start with the north / south component. So I need to match South East but not EastSouth
The pattern should not match [direction]-ern words, like "Northern" or "Southwestern"
The match will always be at the very beginning or very end of the string.
I'm using C#, but I'm just looking for a pattern, so I'm not emphasizing the language. /s(outh)?/ is just as good as @"s(outh)?" for me or future readers.
SO emphasizes real problems, so FYI this is one. I'm parsing a few hundred thousand nasty, unvalidated user-typed address strings. I want to check if the start or end of the "street" field (which is free-form jumble of PO boxes, streets, apartments, and straight up invalid junk) begins or ends with a compass direction. I'm trying to deconstruct these free form strings to find similar addresses which may be accidental or intentional variations and obfuscations.
My blunt attempt
Core pattern: /n(orth)?|e(ast)?|s(outh)?|w(est)?|n(orth\s*east|e|orth\s*west|w)|s(outh\s*east|e|outh\s*west|w)/
In a function:
public static Tuple<Match, Match> MatchDirection(String value) {
    string patternBase = @"n(orth)?|e(ast)?|s(outh)?|w(est)?|n(orth\s*east|e|orth\s*west|w)|s(outh\s*east|e|outh\s*west|w)";
    Match[] matches = new Match[2];
    string[] compassPatterns = new[] { @"^(" + patternBase + @")\b", @"\b(" + patternBase + @")$" };
    for (int i = 0; i < 2; i++) { matches[i] = Regex.Match(value, compassPatterns[i], RegexOptions.IgnoreCase); }
    return new Tuple<Match, Match>(matches[0], matches[1]);
}
In use, where sourceDt is a table with all the addresses:
var parseQuery = sourceDt.AsEnumerable()
.Select((DataRow row) => {
string addr = ((string)row["ADDR_STREET"]).Trim();
Tuple<Match, Match> dirMatches = AddressParser.MatchDirection(addr);
return new string[] { addr, dirMatches.Item1.Value, dirMatches.Item2.Value };
})
Edit: this is probably the wrong answer. I'm keeping it so that people don't suggest the same thing: working out the tokenization for "South East" is a task in itself. I also still doubt that a regular expression will be very usable.
Original answer:
Don't. Your initial regular expression attempt is already unreadable.
A dictionary lookup for each word of the tokenized string (the "brute force approach") already gives you time linear in the length of the string and constant time per word. It is also very easy to extend with new words.
(^[nesw][^n\s]*)|([nesw][^n\s]*$)
So this will match a line that:
begins or ends with a word that:
Begins with a cardinal direction
Doesn't have an n otherwise in it (to get rid of the "-ern"s)
Perl/PCRE compatible expression:
(?xi)
(^)?
\b
(?:
n(?:orth)?
(?:\s* (?: e(?:ast)? | w(?:est)? ))?
|
s(?:outh)?
(?:\s* (?: e(?:ast)? | w(?:est)? ))?
|
e(?:ast)?
|
w(?:est)?
)
\b
(?(1)|$)
I think C# supports all the features used here.
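For readers on engines without conditional groups, the same idea can be written as two anchored patterns. java.util.regex, for example, does not support the (?(1)|$) construct, so the Java sketch below compiles one prefix pattern and one suffix pattern from a shared direction fragment. The class and method names are hypothetical, and the sketch assumes the input has already been trimmed.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CompassMatcher {
    // north/south, optionally followed by east/west; or east/west alone.
    private static final String DIR =
            "(?:n(?:orth)?|s(?:outh)?)(?:\\s*(?:e(?:ast)?|w(?:est)?))?|e(?:ast)?|w(?:est)?";

    // The trailing \b rejects "-ern" words: "North" inside "Northern" has no boundary after it.
    private static final Pattern PREFIX =
            Pattern.compile("^(?:" + DIR + ")\\b", Pattern.CASE_INSENSITIVE);
    private static final Pattern SUFFIX =
            Pattern.compile("\\b(?:" + DIR + ")$", Pattern.CASE_INSENSITIVE);

    /** Direction at the very start of the string, or "" if there is none. */
    static String prefix(String street) {
        Matcher m = PREFIX.matcher(street);
        return m.find() ? m.group() : "";
    }

    /** Direction at the very end of the string, or "" if there is none. */
    static String suffix(String street) {
        Matcher m = SUFFIX.matcher(street);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(prefix("N Main"));            // N
        System.out.println(prefix("south I-22"));        // south
        System.out.println(suffix("124 Grover Ave SE")); // SE
        System.out.println(prefix("Northern Blvd"));     // (empty line)
    }
}
```

Keeping the north/south alternative first is what enforces the "compound directions start with north or south" rule: "EastSouth" matches neither pattern.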

Using ANTLR 3.3?

I'm trying to get started with ANTLR and C# but I'm finding it extraordinarily difficult due to the lack of documentation/tutorials. I've found a couple half-hearted tutorials for older versions, but it seems there have been some major changes to the API since.
Can anyone give me a simple example of how to create a grammar and use it in a short program?
I've finally managed to get my grammar file compiling into a lexer and parser, and I can get those compiled and running in Visual Studio (after having to recompile the ANTLR source because the C# binaries seem to be out of date too! -- not to mention the source doesn't compile without some fixes), but I still have no idea what to do with my parser/lexer classes. Supposedly it can produce an AST given some input...and then I should be able to do something fancy with that.
Let's say you want to parse simple expressions consisting of the following tokens:
- subtraction (also unary);
+ addition;
* multiplication;
/ division;
(...) grouping (sub) expressions;
integer and decimal numbers.
An ANTLR grammar could look like this:
grammar Expression;
options {
language=CSharp2;
}
parse
: exp EOF
;
exp
: addExp
;
addExp
: mulExp (('+' | '-') mulExp)*
;
mulExp
: unaryExp (('*' | '/') unaryExp)*
;
unaryExp
: '-' atom
| atom
;
atom
: Number
| '(' exp ')'
;
Number
: ('0'..'9')+ ('.' ('0'..'9')+)?
;
Now to create a proper AST, you add output=AST; in your options { ... } section, and you mix some "tree operators" in your grammar defining which tokens should be the root of a tree. There are two ways to do this:
add ^ and ! after your tokens. The ^ causes the token to become a root and the ! excludes the token from the ast;
by using "rewrite rules": ... -> ^(Root Child Child ...).
Take the rule foo for example:
foo
: TokenA TokenB TokenC TokenD
;
and let's say you want TokenB to become the root and TokenA and TokenC to become its children, and you want to exclude TokenD from the tree. Here's how to do that using option 1:
foo
: TokenA TokenB^ TokenC TokenD!
;
and here's how to do that using option 2:
foo
: TokenA TokenB TokenC TokenD -> ^(TokenB TokenA TokenC)
;
So, here's the grammar with the tree operators in it:
grammar Expression;
options {
language=CSharp2;
output=AST;
}
tokens {
ROOT;
UNARY_MIN;
}
@parser::namespace { Demo.Antlr }
@lexer::namespace { Demo.Antlr }
parse
: exp EOF -> ^(ROOT exp)
;
exp
: addExp
;
addExp
: mulExp (('+' | '-')^ mulExp)*
;
mulExp
: unaryExp (('*' | '/')^ unaryExp)*
;
unaryExp
: '-' atom -> ^(UNARY_MIN atom)
| atom
;
atom
: Number
| '(' exp ')' -> exp
;
Number
: ('0'..'9')+ ('.' ('0'..'9')+)?
;
Space
: (' ' | '\t' | '\r' | '\n'){Skip();}
;
I also added a Space rule to ignore any white space in the source file and added some extra tokens and namespaces for the lexer and parser. Note that the order is important (options { ... } first, then tokens { ... } and finally the @... {}-namespace declarations).
That's it.
Now generate a lexer and parser from your grammar file:
java -cp antlr-3.2.jar org.antlr.Tool Expression.g
and put the .cs files in your project together with the C# runtime DLL's.
You can test it using the following class:
using System;
using Antlr.Runtime;
using Antlr.Runtime.Tree;
using Antlr.StringTemplate;
namespace Demo.Antlr
{
    class MainClass
    {
        // Prints the tree in preorder, one node per line, indented by depth.
        public static void Preorder(ITree Tree, int Depth)
        {
            if(Tree == null)
            {
                return;
            }
            for (int i = 0; i < Depth; i++)
            {
                Console.Write(" ");
            }
            Console.WriteLine(Tree);
            // Visit every child, not just the first two, so nodes with
            // one child or with more than two children are handled as well.
            for (int i = 0; i < Tree.ChildCount; i++)
            {
                Preorder(Tree.GetChild(i), Depth + 1);
            }
        }
        public static void Main (string[] args)
        {
            ANTLRStringStream Input = new ANTLRStringStream("(12.5 + 56 / -7) * 0.5");
            ExpressionLexer Lexer = new ExpressionLexer(Input);
            CommonTokenStream Tokens = new CommonTokenStream(Lexer);
            ExpressionParser Parser = new ExpressionParser(Tokens);
            ExpressionParser.parse_return ParseReturn = Parser.parse();
            CommonTree Tree = (CommonTree)ParseReturn.Tree;
            Preorder(Tree, 0);
        }
    }
}
which produces the following output:
ROOT
 *
  +
   12.5
   /
    56
    UNARY_MIN
     7
  0.5
which corresponds to the following AST:
(diagram created using graph.gafol.net)
Note that ANTLR 3.3 has just been released and the CSharp target is "in beta". That's why I used ANTLR 3.2 in my example.
In case of rather simple languages (like my example above), you could also evaluate the result on the fly without creating an AST. You can do that by embedding plain C# code inside your grammar file, and letting your parser rules return a specific value.
Here's an example:
grammar Expression;
options {
language=CSharp2;
}
@parser::namespace { Demo.Antlr }
@lexer::namespace { Demo.Antlr }
parse returns [double value]
: exp EOF {$value = $exp.value;}
;
exp returns [double value]
: addExp {$value = $addExp.value;}
;
addExp returns [double value]
: a=mulExp {$value = $a.value;}
( '+' b=mulExp {$value += $b.value;}
| '-' b=mulExp {$value -= $b.value;}
)*
;
mulExp returns [double value]
: a=unaryExp {$value = $a.value;}
( '*' b=unaryExp {$value *= $b.value;}
| '/' b=unaryExp {$value /= $b.value;}
)*
;
unaryExp returns [double value]
: '-' atom {$value = -1.0 * $atom.value;}
| atom {$value = $atom.value;}
;
atom returns [double value]
: Number {$value = Double.Parse($Number.Text, CultureInfo.InvariantCulture);}
| '(' exp ')' {$value = $exp.value;}
;
Number
: ('0'..'9')+ ('.' ('0'..'9')+)?
;
Space
: (' ' | '\t' | '\r' | '\n'){Skip();}
;
which can be tested with the class:
using System;
using Antlr.Runtime;
using Antlr.Runtime.Tree;
using Antlr.StringTemplate;
namespace Demo.Antlr
{
class MainClass
{
public static void Main (string[] args)
{
string expression = "(12.5 + 56 / -7) * 0.5";
ANTLRStringStream Input = new ANTLRStringStream(expression);
ExpressionLexer Lexer = new ExpressionLexer(Input);
CommonTokenStream Tokens = new CommonTokenStream(Lexer);
ExpressionParser Parser = new ExpressionParser(Tokens);
Console.WriteLine(expression + " = " + Parser.parse());
}
}
}
and produces the following output:
(12.5 + 56 / -7) * 0.5 = 2.25
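To make the "evaluate on the fly" idea concrete without the ANTLR tooling, here is a hand-written recursive-descent sketch in Java that mirrors the grammar rules one method per rule. It is not the code ANTLR generates; the class name and the error handling are assumptions made for illustration. Each method returns the value of the sub-expression it recognised, just as the `returns [double value]` rules do.

```java
class ExpressionEvaluator {
    private final String src;
    private int pos = 0;

    private ExpressionEvaluator(String src) {
        this.src = src;
    }

    /** parse : exp EOF */
    static double evaluate(String expression) {
        ExpressionEvaluator p = new ExpressionEvaluator(expression);
        double value = p.addExp();
        p.skipSpaces();
        if (p.pos != p.src.length()) {
            throw new IllegalArgumentException("unexpected input at index " + p.pos);
        }
        return value;
    }

    /** addExp : mulExp (('+' | '-') mulExp)* */
    private double addExp() {
        double value = mulExp();
        while (true) {
            if (accept('+')) value += mulExp();
            else if (accept('-')) value -= mulExp();
            else return value;
        }
    }

    /** mulExp : unaryExp (('*' | '/') unaryExp)* */
    private double mulExp() {
        double value = unaryExp();
        while (true) {
            if (accept('*')) value *= unaryExp();
            else if (accept('/')) value /= unaryExp();
            else return value;
        }
    }

    /** unaryExp : '-' atom | atom */
    private double unaryExp() {
        return accept('-') ? -atom() : atom();
    }

    /** atom : Number | '(' exp ')' */
    private double atom() {
        if (accept('(')) {
            double value = addExp();
            if (!accept(')')) throw new IllegalArgumentException("expected ')' at index " + pos);
            return value;
        }
        skipSpaces();
        int start = pos;
        while (pos < src.length() && isDigit(src.charAt(pos))) pos++;
        if (start == pos) throw new IllegalArgumentException("expected a number at index " + pos);
        // Optional fraction: only taken when a digit follows the dot.
        if (pos + 1 < src.length() && src.charAt(pos) == '.' && isDigit(src.charAt(pos + 1))) {
            pos++;
            while (pos < src.length() && isDigit(src.charAt(pos))) pos++;
        }
        return Double.parseDouble(src.substring(start, pos));
    }

    /** Skips white space, then consumes `c` if it is the next character. */
    private boolean accept(char c) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    private void skipSpaces() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    public static void main(String[] args) {
        String expression = "(12.5 + 56 / -7) * 0.5";
        System.out.println(expression + " = " + evaluate(expression)); // (12.5 + 56 / -7) * 0.5 = 2.25
    }
}
```

Operator precedence falls out of the call structure: addExp calls mulExp, which calls unaryExp, so '*' and '/' bind tighter than '+' and '-', exactly as in the grammar.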
EDIT
In the comments, Ralph wrote:
Tip for those using Visual Studio: you can put something like java -cp "$(ProjectDir)antlr-3.2.jar" org.antlr.Tool "$(ProjectDir)Expression.g" in the pre-build events, then you can just modify your grammar and run the project without having to worry about rebuilding the lexer/parser.
Have you looked at Irony.net? It's aimed at .Net and therefore works really well, has proper tooling, proper examples and just works. The only problem is that it is still a bit 'alpha-ish' so documentation and versions seem to change a bit, but if you just stick with a version, you can do nifty things.
p.s. sorry for the bad answer where you ask a problem about X and someone suggests something different using Y ;^)
My personal experience is that before learning ANTLR on C#/.NET, you should spare enough time to learn ANTLR on Java. That gives you knowledge on all the building blocks and later you can apply on C#/.NET.
I wrote a few blog posts recently,
http://www.lextm.com/index.php/2012/07/how-to-use-antlr-on-net-part-i/
http://www.lextm.com/index.php/2012/07/how-to-use-antlr-on-net-part-ii/
http://www.lextm.com/index.php/2012/07/how-to-use-antlr-on-net-part-iii/
http://www.lextm.com/index.php/2012/07/how-to-use-antlr-on-net-part-iv/
http://www.lextm.com/index.php/2012/07/how-to-use-antlr-on-net-part-v/
The assumption is that you are familiar with ANTLR on Java and are ready to migrate your grammar file to C#/.NET.
There is a great article on how to use antlr and C# together here:
http://www.codeproject.com/KB/recipes/sota_expression_evaluator.aspx
it's a "how it was done" article by the creator of NCalc which is a mathematical expression evaluator for C# - http://ncalc.codeplex.com
You can also download the grammar for NCalc here:
http://ncalc.codeplex.com/SourceControl/changeset/view/914d819f2865#Grammar%2fNCalc.g
example of how NCalc works:
Expression e = new Expression("Round(Pow(Pi, 2) + Pow([Pi2], 2) + X, 2)");
e.Parameters["Pi2"] = new Expression("Pi * Pi");
e.Parameters["X"] = 10;
e.EvaluateParameter += delegate(string name, ParameterArgs args)
{
if (name == "Pi")
args.Result = 3.14;
};
Debug.Assert(117.07 == e.Evaluate());
Hope it's helpful.
