Re: [Grammatica-users] Terminate the Subrule
From: Per Cederberg
Subject: Re: [Grammatica-users] Terminate the Subrule
Date: Sat, 25 Jun 2005 11:01:56 +0200
Hi,
Actually, you've run into the only known problem in the look-ahead
calculation. In this case Grammatica should have generated a
parser that checks the next two tokens for a match, but it
only checks the first one and picks the production "morestuff"
based on that.
The work-around for this case is to rewrite the grammar a bit:
    morestuff = alphaNum | delimiter | ws ;
This will cause the "morestuff" production to match any
alphanumeric token, delimiter or whitespace. When the parser
then encounters the SUPP token it will terminate the production
"alphaNumWDelim" (which ends with morestuff*) and continue in
the grammar.
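To illustrate the difference, here is a small Python sketch of the
"morestuff*" loop. It is not Grammatica's generated code; the function
names and the hand-written token list are made up for this example,
and only the token and production names come from your grammar. It
compares a one-token decision (the behaviour you hit), a two-token
decision (what should have been generated) and the work-around.

```python
# Hypothetical token stream for "1003.1-X 49 SUPP" (token names from the grammar).
TOKENS = [
    ("number", "1003"), ("period", "."), ("number", "1"), ("dash", "-"),
    ("alpha", "X"), ("ws", " "), ("number", "49"), ("ws", " "),
    ("supplement2", "SUPP"),
]

ALPHANUM = {"number", "string", "alpha"}
DELIMITER = {"dash", "slash", "period", "lparen", "rparen", "delim"}


def _expect_alphanum(tokens):
    """alphaNumWDelim starts with one mandatory alphaNum token."""
    if not tokens or tokens[0][0] not in ALPHANUM:
        raise SyntaxError("expected <alphaNum>")
    return 1


def parse_one_token(tokens):
    """morestuff = alphaNum | delimiter | ws number, decided on ONE token."""
    pos = _expect_alphanum(tokens)
    while pos < len(tokens):
        kind = tokens[pos][0]
        if kind in ALPHANUM or kind in DELIMITER:
            pos += 1
        elif kind == "ws":
            # Already committed to "ws number" after seeing only the ws.
            if pos + 1 >= len(tokens) or tokens[pos + 1][0] != "number":
                found = tokens[pos + 1][1] if pos + 1 < len(tokens) else "<EOF>"
                raise SyntaxError(f'unexpected token "{found}", expected <number>')
            pos += 2
        else:
            break
    return pos  # index of the first token after alphaNumWDelim


def parse_two_tokens(tokens):
    """Same grammar, but "ws number" is only chosen if BOTH tokens match."""
    pos = _expect_alphanum(tokens)
    while pos < len(tokens):
        kind = tokens[pos][0]
        if kind in ALPHANUM or kind in DELIMITER:
            pos += 1
        elif (kind == "ws" and pos + 1 < len(tokens)
              and tokens[pos + 1][0] == "number"):
            pos += 2
        else:
            break
    return pos


def parse_workaround(tokens):
    """Work-around: morestuff = alphaNum | delimiter | ws, one token suffices."""
    pos = _expect_alphanum(tokens)
    while pos < len(tokens) and tokens[pos][0] in ALPHANUM | DELIMITER | {"ws"}:
        pos += 1
    return pos
```

With this token list, parse_one_token() fails at SUPP with the same
"expected <number>" complaint, while parse_workaround() stops with
SUPP as the next token. Note that in this sketch the work-around also
swallows the space just before SUPP into "alphaNumWDelim"; since the
[ws] in your "suffix" production is optional, that should be harmless.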
Hope this helps!
/Per
On Fri, 2005-06-24 at 11:26 -0600, address@hidden wrote:
>
> Hi,
>
> I need a way to terminate a subrule and allow the higher rule to start
> a new node. The subrule allows the alternative "space number" but it
> appears to be stuck on space and expecting it to be followed by a
> number. Grammatica issued an error for unexpected token:
>
> reference(2001)
> SDO(2007)
> ieee(1019): "IEEE", line: 1, col: 1
> ws(1043): " ", line: 1, col: 5
> alphaNumWDelim(2022)
> alphaNum(2023)
> number(1048): "1003", line: 1, col: 6
> morestuff(2024)
> delimiter(2019)
> period(1003): ".", line: 1, col: 10
> morestuff(2024)
> alphaNum(2023)
> number(1048): "1", line: 1, col: 11
> morestuff(2024)
> delimiter(2019)
> dash(1001): "-", line: 1, col: 12
> morestuff(2024)
> alphaNum(2023)
> alpha(1050): "X", line: 1, col: 13
> morestuff(2024)
> ws(1043): " ", line: 1, col: 14
> number(1048): "49", line: 1, col: 15
> morestuff(2024)
> ws(1043): " ", line: 1, col: 17
> Error: in C:\brianm\parseInput2.txt: line 1:
> unexpected token "SUPP", expected <number>
>
> IEEE 1003.1-X 49 SUPP 1A VOL 2
> ^
>
> Note that "49" needs to be included with the base number "1003.1-X"
> but the keyword "SUPP" (abbreviation of SUPPLEMENT) needs to be in the
> suffix node. I realize that you prefer to %ignore% the space
> character but it is needed to enable references "MIL-STD-123-7 SUPP
> 1A" and "MIL-STD-123 SUPP 1A" (the impliedRev subrule handles the "-7"
> implied revision). The grammar follows:
>
> %header%
>
> GRAMMARTYPE = "LL"
>
> DESCRIPTION = "A grammar for translating document number reference
> into structured keys: family, revision, suffix, etc."
>
> %tokens%
>
> dash = "-"
> slash = "/"
> period = "."
> lparen = "("
> rparen = ")"
> mil = "MIL"
> dod = "DOD"
> jan = "JAN"
> std = "STD"
> hdbk = "HDBK"
> prf = "PRF"
> dtl = "DTL"
> qml = "QML"
> qpl = "QPL"
> cfr = "CFR"
> usc = "USC"
> sae = "SAE"
> astm = "ASTM"
> ieee = "IEEE"
> rev1 = "REVISION"
> rev2 = "REV"
> part1 = "PART"
> part2 = "PT"
> chapter1 = "CHAPTER"
> chapter2 = "CHAP"
> volume1 = "VOLUME"
> volume2 = "VOL"
> validNotice1 = "VALID NOTICE"
> validNotice2 = "VAL NOTICE"
> cancNotice = "CANC NOTICE"
> canc = "CANC"
> interimChg1 = "INTERIM CHANGE"
> interimChg2 = "INT CHG"
> amd1 = "AMEND"
> amd2 = "AMD"
> interimAmd1 = "INTERIM AMD"
> interimAmd2 = "INT AMD"
> supplement1 = "SUPPLEMENT"
> supplement2 = "SUPP"
> oo = "-00" // preceding dash required for in-lieu-of (unable to isolate leading zeroes)
> oh = "-0H" // both in-lieu-of and handbook
> sparen = " (" // get around problem with ws lparen
> ws = " " // plain old space character
>
> delim = <<[+=",':;!#$%&*<>address@hidden>>
> fluff = <<[\t\n\r\f\a\e]+>> %ignore% // do not explicitly handle unprintable
> number = <<[0-9]+>> // one or more digits
> string = <<[A-Z][A-Z]+>>
> alpha = <<[A-Z]>>
>
>
> %productions%
>
> reference = ( milspec | fedspec | QPLdoc | regulation | SDO )
> suffix* ;
>
> milspec = milPrefix basic ;
> fedspec = fedPrefix basic ;
> QPLdoc = ( qpl | qml ) dash ( fedspec | basic |
> SDO ) ; // Qualified Products List
> regulation = [string] regPattern alphaNumWDelim ; // optional string represents govt agency
> SDO = ( sae | astm | ieee ) [ delimiter | ws ]
> alphaNumWDelim ;
>
> suffix = impliedAmend | ( [ws] keyword [ws]
> value? ) ;
> keyword = rev1 | rev2 | part1 | part2 | chapter1 |
> chapter2
> | volume1 | volume2 | validNotice1 | validNotice2
> | cancNotice | canc | amd1 | amd2 | interimAmd1 |
> interimAmd2
> | interimChg1 | interimChg2 | supplement1 |
> supplement2
> | string | alpha ;
> value = [delimiter] alphaNumWDelim+ ;
>
> milPrefix = ( mil | dod | jan ) dash [middleAlpha] ;
> basic = baseNum [slashNum] ;
> baseNum = number [impliedRev] ;
> slashNum = ( slash | period ) number [impliedRev] ;
> fedPrefix = ( string | alpha ) dash alpha indicators ;
> regPattern = number ( cfr | usc ) [ string | alpha ] ;
>
> middleAlpha = ( prf | dtl | hdbk | std | alpha )
> [indicators] ;
> indicators = oo | oh | (dash alpha) | dash ; // in-lieu-of or both or hdbk (single H)
> delimiter = dash | slash | period | lparen | rparen |delim ;
> impliedAmend = sparen alphaNum+ rparen ; // implied amendment
> impliedRev = ( dash number [ string | alpha ] ) | ( string
> | alpha ) ;
> alphaNumWDelim = alphaNum morestuff* ;
> alphaNum = number | string | alpha ;
> morestuff = alphaNum | delimiter | ws number ;
>
>
> Thanks in advance for your help, Brian.
> By the way, I am an associate of Anant Mistry.