Re: [Grammatica-users] Having problems with my grammar

From:	Anant Mistry
Subject:	Re: [Grammatica-users] Having problems with my grammar
Date:	Wed, 16 Mar 2005 11:48:32 -0700

Thanks a lot, Per. I have been acting as the "middle man" for this posting, but now that I am using your tool myself, I can say Grammatica is really slick. I tried to get ANTLR to separate grammar from code (and even posted the question on their forum). I got a lot of responses, but none of them were as nice as what you provide "out of the box".

Thanks for a great tool

Anant

On Wed, 2005-03-16 at 19:30 +0100, Per Cederberg wrote:

Ok, the problem has already been solved. But I noted a few other
things you might consider fixing while you're at it:

  aLetter = <<[A-Z]>>
  ...
  h       = "H"

As the <h> token is defined below <aLetter> and both can match the
same character sequence, the <aLetter> token will always be chosen.
I suggest you move the <aLetter> (and <aDigit>) tokens down in the
list, below <h>.

  WHITESPACE = <<[ \t]+>> %ignore%

Your second issue was probably caused by your <WHITESPACE> token
not including linefeeds. Add \n and \r to the token, or trim()
the input text before calling the parser. Note that if you add
linefeeds to the <WHITESPACE> token, your parser will also accept
linefeeds inside document numbers, which might not be what you
want.
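
The whitespace trade-off can be sketched in Python as well. The two
patterns below mirror the <WHITESPACE> token as posted and with \n and
\r added; leftover_whitespace is a hypothetical helper, and strip() is
used as a stand-in for trim() on the assumption that the parser is
called from Java. It only models which characters the ignored token
would swallow, not the parser itself.

```python
import re

line = "MIL-DTL-0053133/47B SUPPLEMENT 1\n"   # input file ending in a linefeed

narrow = re.compile(r"[ \t]+")        # WHITESPACE as posted
wide = re.compile(r"[ \t\n\r]+")      # WHITESPACE with \n and \r added

def leftover_whitespace(pattern, text):
    """Return the whitespace characters that the ignored pattern does not cover."""
    return [c for c in pattern.sub("", text) if c.isspace()]

print(leftover_whitespace(narrow, line))          # ['\n']
print(leftover_whitespace(wide, line))            # []
print(leftover_whitespace(narrow, line.strip()))  # []

# Side effect of the wide pattern: a linefeed inside a number is skipped too.
print(wide.sub("", "0053133/47\nB"))              # 0053133/47B
```

Trimming the input keeps the token strict about linefeeds inside the
document number, while widening the token is simpler if multi-line
input is acceptable.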

Cheers,

/Per

On Wed, 2005-03-16 at 11:07 -0700, Anant Mistry wrote:
> 
> Please ignore this posting. I figured it out ..... duh!!! I was being
> really stupid ..... cut'n'paste is not always a good thing!!!
> 
> Thanks
> 
> Anant
> 
> On Wed, 2005-03-16 at 07:48 -0700, Anant Mistry wrote:
> > 
> > I'm trying to create a grammar to parse a single line (a doc number
> > actually). Here is my grammar
> > 
> > %header%
> > 
> > GRAMMARTYPE = "LL"
> > 
> > %tokens%
> > 
> > dash    = '-'
> > period  = '.'
> > slash   = '/'
> > rev1    = "REVISION"
> > rev2    = "REV"
> > part1   = "PART"
> > part2   = "PT"
> > chapter1 = "CHAPTER"
> > chapter2 = "CHAP"
> > volume1 = "VOLUME"
> > volume2 = "VOL"
> > validNotice1 = "VALID NOTICE"
> > validNotice2 = "VAL NOTICE"
> > interimChg1 = "INTERIM CHANGE"
> > interimChg2 = "INT CHG"
> > supplement1 = "SUPPLEMENT"
> > supplement2 = "SUPP"
> > aLetter = <<[A-Z]>>
> > aDigit  = <<[0-9]>>
> > mil     = "MIL"
> > dod     = "DOD"
> > jan     = "JAN"
> > prf     = "PRF"
> > dtl     = "DTL"
> > oo      = "00"
> > oh      = "OH"
> > h       = "H"
> > WHITESPACE = <<[ \t]+>> %ignore%
> > 
> > %productions%
> > 
> > reference       = milspec [suffix] ;
> > 
> > suffix          =  ( rev1 | rev2 | part1 | part2 | chapter1 | chapter2
> >                         | volume1 | volume2 | validNotice1 | validNotice2
> >                         | interimChg1 | interimChg2 | supplement1 | supplement2 ) [singleNumber] ;
> > 
> > milspec         = milPrefix baseNum [slashNum] ;
> > 
> > milPrefix       = ( mil | dod | jan ) dash [middleAlpha] [delimiter] [indicator] baseNum [slash slashNum];
> > middleAlpha     = prf | dtl | alpha ;
> > 
> > baseNum         = singleNumber [impliedRev] ;
> > slashNum        = singleNumber [impliedRev] ;
> > 
> > alpha           = aLetter+ ;
> > singleNumber    = aDigit+ ;
> > 
> > delimiter       = dash | period | slash ;
> > 
> > indicator       = oo | oh | h ;
> > 
> > impliedRev      = (dash singleNumber [alpha]) | alpha ;
> > 
> > The problem I'm having is when I try to parse the line 
> > 
> > MIL-DTL-0053133/47B SUPPLEMENT 1
> > 
> > I get
> > 
> > bandikoot$ java -jar lib/grammatica-1.4.jar struct_key.grammar.g --parse ./inpfile
> > Parse tree from ./inpfile:
> > reference(2001)
> >   milspec(2003)
> >     milPrefix(2004)
> >       mil(1020): "MIL", line: 1, col: 1
> >       dash(1001): "-", line: 1, col: 4
> >       middleAlpha(2005)
> >         dtl(1024): "DTL", line: 1, col: 5
> >       delimiter(2010)
> >         dash(1001): "-", line: 1, col: 8
> >       indicator(2011)
> >         oo(1025): "00", line: 1, col: 9
> >       baseNum(2006)
> >         singleNumber(2009)
> >           aDigit(1019): "5", line: 1, col: 11
> >           aDigit(1019): "3", line: 1, col: 12
> >           aDigit(1019): "1", line: 1, col: 13
> >           aDigit(1019): "3", line: 1, col: 14
> >           aDigit(1019): "3", line: 1, col: 15
> >       slash(1003): "/", line: 1, col: 16
> >       slashNum(2007)
> >         singleNumber(2009)
> >           aDigit(1019): "4", line: 1, col: 17
> >           aDigit(1019): "7", line: 1, col: 18
> >         impliedRev(2012)
> >           alpha(2008)
> >             aLetter(1018): "B", line: 1, col: 19
> > Error: in ./inpfile: line 1:
> >     unexpected token "SUPPLEMENT", expected <aDigit>
> > 
> > MIL-DTL-0053133/47B SUPPLEMENT 1
> >                     ^
> > 
> > I'm not sure why it's expecting an <aDigit> token. If I move the
> > [suffix] to the end of slashNum, it works OK, i.e.
> > 
> > slashNum        = singleNumber [impliedRev] [suffix] ;
> > 
> > and removing [suffix] from the reference production gives me this output:
> > 
> > bandikoot$ java -jar lib/grammatica-1.4.jar struct_key.grammar.g --parse ./inpfile
> > Parse tree from ./inpfile:
> > reference(2001)
> >   milspec(2003)
> >     milPrefix(2004)
> >       mil(1020): "MIL", line: 1, col: 1
> >       dash(1001): "-", line: 1, col: 4
> >       middleAlpha(2005)
> >         dtl(1024): "DTL", line: 1, col: 5
> >       delimiter(2010)
> >         dash(1001): "-", line: 1, col: 8
> >       indicator(2011)
> >         oo(1025): "00", line: 1, col: 9
> >       baseNum(2006)
> >         singleNumber(2009)
> >           aDigit(1019): "5", line: 1, col: 11
> >           aDigit(1019): "3", line: 1, col: 12
> >           aDigit(1019): "1", line: 1, col: 13
> >           aDigit(1019): "3", line: 1, col: 14
> >           aDigit(1019): "3", line: 1, col: 15
> >       slash(1003): "/", line: 1, col: 16
> >       slashNum(2007)
> >         singleNumber(2009)
> >           aDigit(1019): "4", line: 1, col: 17
> >           aDigit(1019): "7", line: 1, col: 18
> >         impliedRev(2012)
> >           alpha(2008)
> >             aLetter(1018): "B", line: 1, col: 19
> >         suffix(2002)
> >           supplement1(1016): "SUPPLEMENT", line: 1, col: 21
> >           singleNumber(2009)
> >             aDigit(1019): "1", line: 1, col: 32
> > Error: in ./inpfile: line 1:
> >     unexpected character '
> > '
> > 
> > MIL-DTL-0053133/47B SUPPLEMENT 1
> >                                 ^
> > 
> > Not quite perfect but at least it gets the [suffix] part correctly.
> > 
> > Any thoughts why the first one doesn't work?
> > 
> > Thanks in advance
> > 
> > Anant
> > 
> > 
> > 
> > 
> > _______________________________________________
> > Grammatica-users mailing list
> > address@hidden
> > http://lists.nongnu.org/mailman/listinfo/grammatica-users
