I'm trying to create a grammar to parse a single line (a document number, actually). Here is my grammar:
%header%
GRAMMARTYPE = "LL"
%tokens%
dash = '-'
period = '.'
slash = '/'
rev1 = "REVISION"
rev2 = "REV"
part1 = "PART"
part2 = "PT"
chapter1 = "CHAPTER"
chapter2 = "CHAP"
volume1 = "VOLUME"
volume2 = "VOL"
validNotice1 = "VALID NOTICE"
validNotice2 = "VAL NOTICE"
interimChg1 = "INTERIM CHANGE"
interimChg2 = "INT CHG"
supplement1 = "SUPPLEMENT"
supplement2 = "SUPP"
aLetter = <<[A-Z]>>
aDigit = <<[0-9]>>
mil = "MIL"
dod = "DOD"
jan = "JAN"
prf = "PRF"
dtl = "DTL"
oo = "00"
oh = "OH"
h = "H"
WHITESPACE = <<[ \t]+>> %ignore%
%productions%
reference = milspec [suffix] ;
suffix = ( rev1 | rev2 | part1 | part2 | chapter1 | chapter2
| volume1 | volume2 | validNotice1 | validNotice2
| interimChg1 | interimChg2 | supplement1 | supplement2 ) [singleNumber] ;
milspec = milPrefix baseNum [slashNum] ;
milPrefix = ( mil | dod | jan ) dash [middleAlpha] [delimiter] [indicator] baseNum [slash slashNum];
middleAlpha = prf | dtl | alpha ;
baseNum = singleNumber [impliedRev] ;
slashNum = singleNumber [impliedRev] ;
alpha = aLetter+ ;
singleNumber = aDigit+ ;
delimiter = dash | period | slash ;
indicator = oo | oh | h ;
impliedRev = (dash singleNumber [alpha]) | alpha ;
The problem shows up when I try to parse this line:
MIL-DTL-0053133/47B SUPPLEMENT 1
I get:
bandikoot$ java -jar lib/grammatica-1.4.jar struct_key.grammar.g --parse ./inpfile
Parse tree from ./inpfile:
reference(2001)
  milspec(2003)
    milPrefix(2004)
      mil(1020): "MIL", line: 1, col: 1
      dash(1001): "-", line: 1, col: 4
      middleAlpha(2005)
        dtl(1024): "DTL", line: 1, col: 5
      delimiter(2010)
        dash(1001): "-", line: 1, col: 8
      indicator(2011)
        oo(1025): "00", line: 1, col: 9
      baseNum(2006)
        singleNumber(2009)
          aDigit(1019): "5", line: 1, col: 11
          aDigit(1019): "3", line: 1, col: 12
          aDigit(1019): "1", line: 1, col: 13
          aDigit(1019): "3", line: 1, col: 14
          aDigit(1019): "3", line: 1, col: 15
      slash(1003): "/", line: 1, col: 16
      slashNum(2007)
        singleNumber(2009)
          aDigit(1019): "4", line: 1, col: 17
          aDigit(1019): "7", line: 1, col: 18
        impliedRev(2012)
          alpha(2008)
            aLetter(1018): "B", line: 1, col: 19
Error: in ./inpfile: line 1:
unexpected token "SUPPLEMENT", expected <aDigit>
MIL-DTL-0053133/47B SUPPLEMENT 1
                    ^
I'm not sure why it's expecting an <aDigit> token. If I move [suffix] to the end of the slashNum production, i.e.
slashNum = singleNumber [impliedRev] [suffix] ;
and remove [suffix] from the reference production, the suffix is parsed and I get this output:
bandikoot$ java -jar lib/grammatica-1.4.jar struct_key.grammar.g --parse ./inpfile
Parse tree from ./inpfile:
reference(2001)
  milspec(2003)
    milPrefix(2004)
      mil(1020): "MIL", line: 1, col: 1
      dash(1001): "-", line: 1, col: 4
      middleAlpha(2005)
        dtl(1024): "DTL", line: 1, col: 5
      delimiter(2010)
        dash(1001): "-", line: 1, col: 8
      indicator(2011)
        oo(1025): "00", line: 1, col: 9
      baseNum(2006)
        singleNumber(2009)
          aDigit(1019): "5", line: 1, col: 11
          aDigit(1019): "3", line: 1, col: 12
          aDigit(1019): "1", line: 1, col: 13
          aDigit(1019): "3", line: 1, col: 14
          aDigit(1019): "3", line: 1, col: 15
      slash(1003): "/", line: 1, col: 16
      slashNum(2007)
        singleNumber(2009)
          aDigit(1019): "4", line: 1, col: 17
          aDigit(1019): "7", line: 1, col: 18
        impliedRev(2012)
          alpha(2008)
            aLetter(1018): "B", line: 1, col: 19
        suffix(2002)
          supplement1(1016): "SUPPLEMENT", line: 1, col: 21
          singleNumber(2009)
            aDigit(1019): "1", line: 1, col: 32
Error: in ./inpfile: line 1:
unexpected character '
'
MIL-DTL-0053133/47B SUPPLEMENT 1
                                ^
Not quite perfect (it now fails on the newline at the end of the line), but at least it parses the [suffix] part correctly.
Any thoughts why the first one doesn't work?
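My only guess so far, which I have not tested: milPrefix already ends with baseNum [slash slashNum], and milspec then asks for another baseNum right after it. That would make the parser look for more digits after "47B", which may be where the <aDigit> expectation comes from. Separately, WHITESPACE only covers spaces and tabs, which may explain the complaint about the newline character. A rough sketch of the productions I would try, assuming the base number belongs in milspec rather than in milPrefix and that line breaks can simply be ignored (everything not shown stays as above):

```
%tokens%
// Assumption: line breaks should be skipped like other whitespace.
WHITESPACE = <<[ \t\n\r]+>> %ignore%

%productions%
reference = milspec [suffix] ;
// baseNum and the optional "/ slashNum" now appear only once, here.
milspec   = milPrefix baseNum [slash slashNum] ;
milPrefix = ( mil | dod | jan ) dash [middleAlpha] [delimiter] [indicator] ;
```

If that guess is right, baseNum and slashNum would show up directly under milspec instead of under milPrefix in the parse tree, and suffix would stay under reference.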
Thanks in advance
Anant