Parsing with flex and bison fails for spaces and braces
I am trying to parse a file like this (too simple for the actual purpose, but OK for a beginning):
@book{key2, author="some2value" , title="value2" }
The lexer is:

    [A-Za-z"][^\\\" \n\(\),=\{\}#~_]*  { yylval.sval = strdup(yytext); return KEY; }
    @[A-Za-z][A-Za-z]+                 { yylval.sval = strdup(yytext + 1); return ENTRYTYPE; }
    [ \t\n]                            ; /* ignore whitespace */
    [{}=,]                             { return *yytext; }
    .                                  { fprintf(stderr, "unrecognized character %c in input\n", *yytext); }

and I am parsing with:

    %union
    {
        char *sval;
    };

    %token <sval> ENTRYTYPE
    %type  <sval> value
    %token <sval> KEY

    %start input

    %%

    input: entry
         | input entry ;   /* input is 0 or more entries */

    entry:
         ENTRYTYPE '{' KEY ','{
             b_entry.type = $1;
             b_entry.id = $3;
             b_entry.table = g_hash_table_new_full(g_str_hash, g_str_equal, free, free);}
         keyvals '}' {
             parse_entry(&b_entry);
             g_hash_table_destroy(b_entry.table);
             free(b_entry.type); free(b_entry.id);
             b_entry.table = NULL;
             b_entry.type = b_entry.id = NULL;}
         ;

    keyvals:
           /* empty */
           | keyvals keyval ;   /* 0 or more keyvals */

    value:
         /*empty*/
         | KEY
         | value KEY
         ;

    keyval:
          /*empty*/
          KEY '=' value ',' { g_hash_table_replace(b_entry.table, $1, $3); }
          | KEY '=' value   { g_hash_table_replace(b_entry.table, $1, $3); }
          | error '\n' {yyerrok;}
          ;
There are a few problems, and I need to generalize both the lexer and the parser:
1) It cannot read a sentence: if the RHS is author="some value", it shows only "some, i.e. the space is not handled. I don't know how to do it.
2) If I enclose the RHS in {} rather than "", it gives a syntax error.
I am looking for help with these two situations.
The main issue is that the tokens are not appropriate. You should try to recognize the tokens of your example as follows:
    @book          ENTRYTYPE
    {              '{'
    key2           KEY
    ,              ','
    author         KEY
    =              '='
    "some2value"   VALUE
    ,              ','
    title          KEY
    =              '='
    "value2"       VALUE
    }              '}'
The VALUE token, for example, could be defined as follows:
    /* start condition; named IN_VALUE so it does not clash with the VALUE token */
    %x IN_VALUE
    %%
    "\""               { BEGIN(IN_VALUE); }
    <IN_VALUE>"\""     { BEGIN(INITIAL); return VALUE; }
    <IN_VALUE>"\\\""   { /* escaped " */ }
    <IN_VALUE>[^"]     { /* non-escaped char */ }
or in a single expression as
"\""([^"]|("\\\""))*"\""
This is assuming that " needs to be escaped by \. I'm not sure how BibTeX defines how to escape ", if it is possible at all.
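The rules above recognize a quoted value but throw its text away, and they do not cover the {} form from the question. As a rough sketch of the scanning logic in plain C, here is one way it might look. `scan_value` is a made-up helper for illustration, not part of flex, bison, or the question's code. It assumes \" is the escape inside quotes (the same assumption as above) and that a braced value simply runs to the matching closing brace.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not from flex/bison or the question's code).
 * *p must point at an opening '"' or '{'. On success, returns a malloc'ed
 * copy of the raw text between the delimiters and moves *p just past the
 * closing delimiter. Returns NULL if the value is unterminated or does not
 * start with '"' or '{'. Escape sequences are copied as-is, not decoded.
 */
char *scan_value(const char **p)
{
    const char *s, *start;
    size_t len;
    char *out;

    assert(p != NULL && *p != NULL);
    s = *p;

    if (*s == '"') {
        start = ++s;
        while (*s != '\0' && *s != '"') {
            /* Assumption: \" is an escaped quote and does not end the value. */
            if (s[0] == '\\' && s[1] == '"')
                s++;            /* step over the backslash */
            s++;                /* step over the current character */
        }
    } else if (*s == '{') {
        int depth = 1;          /* number of braces currently open */
        start = ++s;
        while (*s != '\0') {
            if (*s == '{')
                depth++;
            else if (*s == '}' && --depth == 0)
                break;          /* this brace closes the value */
            s++;
        }
    } else {
        return NULL;            /* not a quoted or braced value */
    }

    if (*s == '\0')
        return NULL;            /* ran off the end: unterminated value */

    len = (size_t)(s - start);
    out = malloc(len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, start, len);
    out[len] = '\0';
    *p = s + 1;                 /* continue after the closing delimiter */
    return out;
}
```

With `p` pointing at `"some value" , title`, `scan_value(&p)` returns `some value` (space included) and leaves `p` at the space before the comma; `{some {nested} value}` comes back as `some {nested} value`. In a flex scanner the same idea would probably live in start conditions, with a depth counter for the braces and a buffer that collects the characters instead of discarding them.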