Parsing with flex and bison fails for spaces and braces


I am trying to parse a file like this (too simple for my actual purpose, but OK for a beginning):

@book{key2,  author="some2value" ,  title="value2"  } 

The lexer is:

[A-Za-z"][^\\\"  \n\(\),=\{\}#~_]*      { yylval.sval = strdup(yytext); return key; }
@[A-Za-z][A-Za-z]+                     { yylval.sval = strdup(yytext + 1); return entrytype; }
[ \t\n]                                ; /* ignore whitespace */
[{}=,]                                 { return *yytext; }
.                                      { fprintf(stderr, "unrecognized character %c in input\n", *yytext); }

and I am parsing with:

%union
{
    char    *sval;
};

%token <sval> entrytype
%type <sval> value
%token <sval> key

%start input

%%

input: entry
      | input entry ;  /* input is 0 or more entries */

entry:
      entrytype '{' key ',' {
          b_entry.type = $1;
          b_entry.id = $3;
          b_entry.table = g_hash_table_new_full(g_str_hash, g_str_equal, free, free);}
      keyvals '}' {
          parse_entry(&b_entry);
          g_hash_table_destroy(b_entry.table);
          free(b_entry.type); free(b_entry.id);
          b_entry.table = NULL;
          b_entry.type = b_entry.id = NULL;}
      ;

keyvals:
       /* empty */
       | keyvals keyval ; /* 0 or more keyvals */

value:
      /*empty*/
      | key
      | value key
      ;

keyval:
      /*empty*/
      key '=' value ',' { g_hash_table_replace(b_entry.table, $1, $3); }
      | key '=' value  { g_hash_table_replace(b_entry.table, $1, $3); }
      | error '\n' {yyerrok;}
      ;

There are a few problems, and I need to generalize both the lexer and the parser:

1) It cannot read a sentence, i.e. if the RHS is author="some value", it only shows "some. The space is not handled, and I don't know how to do it.
2) If I enclose the RHS in {} rather than "", it gives a syntax error.

I am looking for help with these 2 situations.

The main issue is that the tokens are not appropriate. You should try to recognize the tokens of your example as follows:

@book        entrytype
{            '{'
key2         key
,            ','
author       key
=            '='
"some2value" value
,            ','
title        key
=            '='
"value2"     value
}            '}'
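To make that token split concrete, here is a minimal hand-written sketch in plain C that produces the same sequence for the example line. It is not what flex generates; the names `next_token`, `copy_text` and the `TOK_*` codes are made up for illustration, and it assumes a quoted value has no escaped quotes inside.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes; single-character punctuation is returned as
   itself, the same way the lexer rule [{}=,] returns *yytext. */
enum { TOK_END = 0, TOK_ENTRYTYPE = 256, TOK_KEY, TOK_VALUE, TOK_ERROR };

/* Copy the text in [s, e) into buf, truncating to cap - 1 characters. */
static void copy_text(char *buf, size_t cap, const char *s, const char *e)
{
    size_t n = (size_t)(e - s);
    if (n >= cap) n = cap - 1;
    memcpy(buf, s, n);
    buf[n] = '\0';
}

/* Return the next token starting at *p, store its text in buf, advance *p. */
static int next_token(const char **p, char *buf, size_t cap)
{
    const char *s = *p, *e;
    int kind;

    assert(cap >= 2);
    while (isspace((unsigned char)*s)) s++;       /* whitespace only separates tokens */
    e = s;
    if (*s == '\0') {
        kind = TOK_END;
    } else if (strchr("{}=,", *s) != NULL) {
        e = s + 1;
        kind = (unsigned char)*s;
    } else if (*s == '@') {
        s++;                                      /* drop the '@', like yytext + 1 */
        for (e = s; isalpha((unsigned char)*e); e++) ;
        kind = (e > s) ? TOK_ENTRYTYPE : TOK_ERROR;
    } else if (*s == '"') {
        /* Everything up to the closing quote, spaces included, is one value. */
        for (e = s + 1; *e != '\0' && *e != '"'; e++) ;
        if (*e == '"') { e++; kind = TOK_VALUE; } /* keep both quotes in the text */
        else kind = TOK_ERROR;                    /* unterminated string */
    } else if (isalpha((unsigned char)*s)) {
        for (e = s; isalnum((unsigned char)*e); e++) ;
        kind = TOK_KEY;
    } else {
        e = s + 1;                                /* unrecognized character */
        kind = TOK_ERROR;
    }
    copy_text(buf, cap, s, e);
    *p = e;
    return kind;
}
```

Because the quoted string is consumed as a single token, author="some value" comes out as one value, so the grammar no longer needs the `value: key | value key` workaround.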

The value token could, for example, be defined as follows:

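%x IN_VALUE
%%
"\""              { BEGIN(IN_VALUE); }
<IN_VALUE>"\""    { BEGIN(INITIAL); return value; }
<IN_VALUE>"\\\""  { /* escaped " */ }
<IN_VALUE>[^"]    { /* non-escaped char */ }

(The start condition is called IN_VALUE here so that its name does not collide with the value token.)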
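Note that the rules above only recognize the string; the two inner actions are empty, so the text of the value is not kept anywhere. As a rough sketch of the same two-state idea in plain C, this time collecting the characters, something like the following could work. The function name `scan_quoted` and the `ST_*` states are hypothetical, and it still assumes that \" is the escape for a quote.

```c
#include <assert.h>
#include <stddef.h>

/* Mirrors the two start conditions: outside a value and inside one. */
enum state { ST_INITIAL, ST_VALUE };

/* Scan one "..." value that starts at s[0]. The unquoted text is stored in
   out (truncated to cap - 1 characters). Returns the number of input
   characters consumed, both quotes included, or 0 if s does not start with
   a properly terminated quoted string. */
static size_t scan_quoted(const char *s, char *out, size_t cap)
{
    enum state st = ST_INITIAL;
    size_t i, n = 0;

    assert(s != NULL && out != NULL && cap > 0);
    for (i = 0; s[i] != '\0'; i++) {
        if (st == ST_INITIAL) {
            if (s[i] != '"') break;               /* a value must open with a quote */
            st = ST_VALUE;
        } else if (s[i] == '\\' && s[i + 1] == '"') {
            if (n + 1 < cap) out[n++] = '"';      /* escaped quote: keep the quote */
            i++;                                  /* and skip the backslash */
        } else if (s[i] == '"') {
            out[n] = '\0';                        /* closing quote: value complete */
            return i + 1;
        } else {
            if (n + 1 < cap) out[n++] = s[i];     /* ordinary character, spaces too */
        }
    }
    out[0] = '\0';                                /* no complete value found */
    return 0;
}
```

In a flex lexer the equivalent would be to append to a buffer in the two inner actions and hand that buffer to the parser through yylval.sval when the closing quote is seen.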

or, in a single expression, as

"\""([^"]|("\\\""))*"\"" 

This assumes that a " needs to be escaped with \. I'm not sure how BibTeX defines how to escape a ", if that is possible at all.
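For the second problem in the question (a value enclosed in {} rather than ""), a single regular expression is not enough if braces may be nested, because a regular expression cannot count nesting depth. A minimal sketch in plain C, assuming that a braced value may contain balanced nested braces and ends at the matching closing brace; the function name `scan_braced` is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Scan one {...} value that starts at s[0], tracking the nesting depth.
   The text between the outer braces is stored in out (truncated to
   cap - 1 characters). Returns the number of input characters consumed,
   both outer braces included, or 0 if s does not start with '{' or the
   braces are unbalanced. */
static size_t scan_braced(const char *s, char *out, size_t cap)
{
    size_t i, n = 0;
    int depth = 0;

    assert(s != NULL && out != NULL && cap > 0);
    if (s[0] == '{') {
        for (i = 0; s[i] != '\0'; i++) {
            if (s[i] == '{') {
                depth++;
                if (depth == 1) continue;         /* outer '{' is not part of the text */
            } else if (s[i] == '}') {
                depth--;
                if (depth == 0) {                 /* matching outer '}' reached */
                    out[n] = '\0';
                    return i + 1;
                }
            }
            if (n + 1 < cap) out[n++] = s[i];     /* inner braces are kept as text */
        }
    }
    out[0] = '\0';                                /* not a complete braced value */
    return 0;
}
```

In flex the same idea would be an exclusive start condition plus a depth counter that is incremented on { and decremented on }. One caveat: the lexer in the question also returns { and } as punctuation for the entry itself, so the brace that opens a value has to be told apart from the brace that opens the entry, for example by only entering the value state after a '='.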

