ANTLR4: How to exit a grammar rule?


So I'm experimenting with ANTLR v4, and I'm poking at an unusual grammar to get a sense of how it works. Here's my current test case:

I'd like a grammar that consists of the letters a, b, c, and d, in that order. The letters may be repeated. I also want to group the a's and b's together, and the c's and d's as well, just to make the grammar more interesting. So strings like these are acceptable to the grammar:

aaa

abcd

acccdd

But it's not going well. I think what is happening is that ANTLR needs a better way to exit a grammar rule. It doesn't seem to recognize that, after collecting the a's and b's, the presence of a c means it should go on to the next rule. It's sort of working, but I get error messages, and the resulting parse tree seems to have null elements in it, as if an element had been inserted wherever an error message was issued.

Here's an example error message:

line 1:2 extraneous input 'c' expecting {'b', 'a'} 

This happens with the input 'abcd'. Something weird is going on when ANTLR sees the c there. Here's the output of the parse tree:

'abcd': (prog (aorb (a a) (aorb (b b) aorb)) (cord (c c) (cord (d d) cord)) <EOF>)

As you can see, it has an empty aorb element there at the end of the first set of elements.

Any idea what is going on? What is ANTLR "thinking" here when it issues the error and adds the empty element? And how might I fix this?

OK, here are the gory details.

My grammar:

    grammar ABCD;

    prog : aorb cord EOF;
    aorb : ( a | b ) aorb ;
    a : 'a'+ ;
    b : 'b'+ ;
    cord : ( c | d ) cord ;
    c : 'c'+ ;
    d : 'd'+ ;

My test program in Java:

    package antlrtests;

    import antlrtests.grammars.*;
    import org.antlr.v4.runtime.*;
    import org.antlr.v4.runtime.tree.*;

    class ABCDTest {

        private final String[] testVectors = {
            "a", "aabb", "b", "abcd", "c", "d",
        };

        public void runTests() {
            for( String test : testVectors )
                simpleTest( test );
        }

        // Lex and parse one input string, then print its parse tree.
        private void simpleTest( String test ) {
            ANTLRInputStream ains = new ANTLRInputStream( test );
            ABCDLexer wpl = new ABCDLexer( ains );
            CommonTokenStream tokens = new CommonTokenStream( wpl );
            ABCDParser wikiParser = new ABCDParser( tokens );
            ParseTree parseTree = wikiParser.prog();
            System.out.println( "'" + test + "': " + parseTree.toStringTree( wikiParser ) );
        }
    }

And here is the output of the test program. Note that the error messages are jumbled with the regular output because ANTLR prints them on standard error.

    run:
    line 1:1 no viable alternative at input '<EOF>'
    'a': (prog (aorb (a a) aorb) cord <EOF>)
    line 1:4 no viable alternative at input '<EOF>'
    'aabb': (prog (aorb (a a) (aorb (b b b) aorb)) cord <EOF>)
    'b': (prog (aorb (b b) aorb) cord <EOF>)
    line 1:1 no viable alternative at input '<EOF>'
    line 1:2 extraneous input 'c' expecting {'b', 'a'}
    line 1:4 no viable alternative at input '<EOF>'
    'abcd': (prog (aorb (a a) (aorb (b b) aorb)) (cord (c c) (cord (d d) cord)) <EOF>)
    line 1:0 no viable alternative at input 'c'
    line 1:1 no viable alternative at input '<EOF>'
    line 1:0 no viable alternative at input 'd'
    'c': (prog aorb (cord (c c) cord) <EOF>)
    line 1:1 no viable alternative at input '<EOF>'
    'd': (prog aorb (cord (d d) cord) <EOF>)
    BUILD SUCCESSFUL (total time: 0 seconds)

Any help would be appreciated.

Is this not what you're after?

    prog : 'a'* 'b'* 'c'* 'd'* EOF;
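To see which strings that flat rule is meant to accept, here is a small Java sketch that uses a regular expression as a stand-in for the rule. It is not ANTLR code; the class name `FlatAbcdCheck` and the two rejected samples ("ba", "dc") are made up for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of ANTLR: a regex stand-in for
//   prog : 'a'* 'b'* 'c'* 'd'* EOF;
class FlatAbcdCheck {
    private static final Pattern FLAT = Pattern.compile("a*b*c*d*");

    static boolean accepts(String input) {
        // matches() must consume the whole string, which plays the role of EOF
        return FLAT.matcher(input).matches();
    }

    public static void main(String[] args) {
        // The first three samples come from the question; the last two are
        // made-up out-of-order strings that should be rejected.
        String[] samples = { "aaa", "abcd", "acccdd", "ba", "dc" };
        for (String s : samples) {
            System.out.println("'" + s + "': " + accepts(s));
        }
    }
}
```

This only answers yes or no; it does not build the aorb/cord groupings the question asks for.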

The following rule of your grammar matches an infinitely long series of a and b tokens, because the tail-recursive aorb reference is not optional. The grammar will either overflow the stack (a StackOverflowError in Java) if the input starts with sufficiently many a and/or b characters, or encounter a syntax error if it does not.

    aorb : ( a | b ) aorb ;
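Why that rule can never exit may be easier to see in a hand-written recursive-descent sketch. This is not what ANTLR generates; the class `AorbExitSketch` and its methods are hypothetical, and the second method assumes a variant of the rule with an empty alternative (`aorb : ( a | b ) aorb | ;`), which is my illustration rather than something from the question.

```java
// Hand-written illustration only; not ANTLR-generated code.
class AorbExitSketch {

    // Skips a run of one letter, mimicking a : 'a'+ ; or b : 'b'+ ;
    static int run(String s, int pos, char ch) {
        while (pos < s.length() && s.charAt(pos) == ch) {
            pos++;
        }
        return pos;
    }

    static boolean atAorB(String s, int pos) {
        return pos < s.length() && (s.charAt(pos) == 'a' || s.charAt(pos) == 'b');
    }

    // aorb : ( a | b ) aorb ;
    // Every path demands one more aorb, so any finite input ends in failure.
    // Returns -1 to signal a syntax error.
    static int aorbMandatory(String s, int pos) {
        if (!atAorB(s, pos)) {
            return -1; // needed an a or b, found something else or end of input
        }
        int next = run(s, pos, s.charAt(pos));
        return aorbMandatory(s, next);
    }

    // Assumed variant: aorb : ( a | b ) aorb | ;
    // The empty alternative gives the rule a way out.
    // Returns the position where the rule stopped.
    static int aorbOptional(String s, int pos) {
        if (!atAorB(s, pos)) {
            return pos; // empty alternative: leave the rule without consuming
        }
        int next = run(s, pos, s.charAt(pos));
        return aorbOptional(s, next);
    }

    public static void main(String[] args) {
        System.out.println("mandatory on 'abcd': " + aorbMandatory("abcd", 0));
        System.out.println("optional on 'abcd':  " + aorbOptional("abcd", 0));
    }
}
```

The mandatory version fails at the c because it still owes one more aorb, which matches the errors and the empty aorb node in the question's output.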

If you want to maintain the groupings, use this grammar instead. I only made changes to the aorb and cord rules. Since the a rule matches a whole sequence of 'a' tokens, the aorb rule uses a? instead of a* (only one instance of a will ever appear, and the entire series of 'a' tokens will be its children).

    grammar ABCD;

    prog : aorb cord EOF;
    aorb : a? b?;
    a : 'a'+ ;
    b : 'b'+ ;
    cord : c? d?;
    c : 'c'+ ;
    d : 'd'+ ;
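To make the grouping concrete, here is a rough hand-written recognizer in Java that mirrors the shape of the grammar above and prints a nested tree. It is only a sketch: `GroupedAbcdSketch` is a hypothetical name, the bracketed output format is my own and is not what ANTLR's toStringTree produces, and it works on characters rather than on lexer tokens.

```java
// Hand-written illustration only; not ANTLR-generated code.
class GroupedAbcdSketch {
    private final String input;
    private int pos = 0;

    private GroupedAbcdSketch(String input) {
        this.input = input;
    }

    // Mirrors a : 'a'+ ; (and b, c, d): consumes a run of ch and returns a
    // node such as "(a a a)", or "" when the run is absent.
    private String run(char ch) {
        StringBuilder children = new StringBuilder();
        while (pos < input.length() && input.charAt(pos) == ch) {
            children.append(' ').append(ch);
            pos++;
        }
        return children.length() == 0 ? "" : "(" + ch + children + ")";
    }

    // Mirrors aorb : a? b? ; and cord : c? d? ; (each child is optional).
    private String pair(String name, char first, char second) {
        String x = run(first);
        String y = run(second);
        StringBuilder node = new StringBuilder("(" + name);
        if (!x.isEmpty()) node.append(' ').append(x);
        if (!y.isEmpty()) node.append(' ').append(y);
        return node.append(')').toString();
    }

    // Mirrors prog : aorb cord EOF ;
    static String parse(String text) {
        GroupedAbcdSketch p = new GroupedAbcdSketch(text);
        String tree = "(prog " + p.pair("aorb", 'a', 'b')
                + " " + p.pair("cord", 'c', 'd') + ")";
        if (p.pos != text.length()) { // stands in for the EOF check
            throw new IllegalArgumentException("unexpected input at index " + p.pos);
        }
        return tree;
    }

    public static void main(String[] args) {
        for (String s : new String[] { "aaa", "abcd", "acccdd" }) {
            System.out.println("'" + s + "': " + parse(s));
        }
    }
}
```

Each rule returns as soon as its run of letters ends, so there is exactly one a node holding the whole series of a's.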

Here is another grammar that matches the same language (but produces a different parse tree), showing other options for the *, +, and ? quantifiers. I wouldn't recommend using this grammar, but you should look it over to understand what each choice is doing, and why it matches the same input as the grammar I gave above.

    grammar ABCD;

    prog : aorb cord? EOF;
    aorb : a* b;
    a : 'a' ;
    b : 'b'* ;
    cord : (c d* | d+);
    c : 'c'+ ;
    d : 'd' ;
