Re: [Grammatica-users] Terminate the Subrule


From: Per Cederberg
Subject: Re: [Grammatica-users] Terminate the Subrule
Date: Sat, 25 Jun 2005 11:01:56 +0200

Hi,

Actually you've run into the only known problem in the look-ahead
calculation. In this case Grammatica should have generated a
parser that checks the two upcoming tokens for a match, but it
only checks the first one and picks the production "morestuff"
based on that.

The work-around for this case is to rewrite the grammar a bit:

morestuff        = alphaNum | delimiter | ws ;

This will cause the "morestuff" production to match any alphanumeric
token, delimiter or whitespace. When the parser then encounters the
SUPP token it will terminate the production "alphaNumWDelim" (which
ends with morestuff*) and continue in the grammar.
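
To make the difference concrete, here is a rough Python sketch of the
decision the parser takes at each whitespace token. It is only a
hypothetical model and not the code Grammatica generates: the function
name, the "ws_needs_number" flag and the "lookahead" parameter are
made up for illustration. Only the token names and the error text come
from the grammar and the output quoted below.

```python
# Hypothetical model of the look-ahead decision; not Grammatica's generated code.
ALPHANUM = {"number", "string", "alpha"}
DELIMITER = {"dash", "slash", "period", "lparen", "rparen", "delim"}

# Tokens of "1003.1-X 49 SUPP 1A" as (kind, text) pairs.
TOKENS = [
    ("number", "1003"), ("period", "."), ("number", "1"), ("dash", "-"),
    ("alpha", "X"), ("ws", " "), ("number", "49"), ("ws", " "),
    ("supplement2", "SUPP"), ("ws", " "), ("number", "1"), ("alpha", "A"),
]


def parse_alpha_num_w_delim(tokens, pos, ws_needs_number, lookahead):
    """Match alphaNum morestuff* from tokens[pos]; return the index after it.

    ws_needs_number=True models the original 'ws number' alternative,
    False models the rewritten 'ws' alternative. lookahead is the number
    of upcoming tokens checked before entering the 'ws number' alternative.
    """
    if pos >= len(tokens) or tokens[pos][0] not in ALPHANUM:
        raise SyntaxError("expected an alphaNum token")
    pos += 1
    while pos < len(tokens):
        kind = tokens[pos][0]
        if kind in ALPHANUM or kind in DELIMITER:
            pos += 1
        elif kind == "ws" and not ws_needs_number:
            pos += 1  # rewritten grammar: whitespace alone is a morestuff
        elif kind == "ws":
            following = tokens[pos + 1] if pos + 1 < len(tokens) else ("EOF", "")
            if following[0] == "number":
                pos += 2  # 'ws number' matched
            elif lookahead >= 2:
                break  # two-token check: leave the ws for the outer rule
            else:
                # one-token check: already committed to 'ws number'
                raise SyntaxError(
                    f'unexpected token "{following[1]}", expected <number>'
                )
        else:
            break  # e.g. SUPP: not part of morestuff, so the repetition ends
    return pos


# Rewritten grammar, one token of look-ahead: stops at the SUPP token.
print(parse_alpha_num_w_delim(TOKENS, 0, ws_needs_number=False, lookahead=1))  # 8
# Original grammar with the intended two-token check: stops before the ws.
print(parse_alpha_num_w_delim(TOKENS, 0, ws_needs_number=True, lookahead=2))   # 7
```

Calling it with ws_needs_number=True and lookahead=1 raises the same
kind of "unexpected token" error as in your output. Note that with the
rewritten production the space before SUPP ends up inside
"alphaNumWDelim", which should be fine since the leading ws in
"suffix" is optional.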

Hope this helps!

/Per

On Fri, 2005-06-24 at 11:26 -0600, address@hidden wrote:
> 
> Hi, 
> 
> I need a way to terminate a subrule and allow the higher rule to start
> a new node.  The subrule allows the alternative "space number" but it
> appears to be stuck on space and expecting it to be followed by a
> number.  Grammatica issued an error for unexpected token: 
> 
> reference(2001) 
>   SDO(2007) 
>     ieee(1019): "IEEE", line: 1, col: 1 
>     ws(1043): " ", line: 1, col: 5 
>     alphaNumWDelim(2022) 
>       alphaNum(2023) 
>         number(1048): "1003", line: 1, col: 6 
>       morestuff(2024) 
>         delimiter(2019) 
>           period(1003): ".", line: 1, col: 10 
>       morestuff(2024) 
>         alphaNum(2023) 
>           number(1048): "1", line: 1, col: 11 
>       morestuff(2024) 
>         delimiter(2019) 
>           dash(1001): "-", line: 1, col: 12 
>       morestuff(2024) 
>         alphaNum(2023) 
>           alpha(1050): "X", line: 1, col: 13 
>       morestuff(2024) 
>         ws(1043): " ", line: 1, col: 14 
>         number(1048): "49", line: 1, col: 15 
>       morestuff(2024) 
>         ws(1043): " ", line: 1, col: 17 
> Error: in C:\brianm\parseInput2.txt: line 1: 
>     unexpected token "SUPP", expected <number> 
> 
> IEEE 1003.1-X 49 SUPP 1A VOL 2 
>                  ^ 
> 
> Note that "49" needs to be included with the base number "1003.1-X"
> but the keyword "SUPP" (abbreviation of SUPPLEMENT) needs to be in the
> suffix node.  I realize that you prefer to %ignore% the space
> character but it is needed to enable references "MIL-STD-123-7 SUPP
> 1A" and "MIL-STD-123 SUPP 1A" (the impliedRev subrule handles the "-7"
> implied revision).  The grammar follows: 
> 
> %header% 
> 
> GRAMMARTYPE = "LL" 
> 
> DESCRIPTION = "A grammar for translating document number reference  
>                 into structured keys: family, revision, suffix, etc." 
> 
> %tokens% 
> 
> dash        = "-" 
> slash        = "/" 
> period        = "." 
> lparen        = "(" 
> rparen        = ")" 
> mil        = "MIL" 
> dod        = "DOD" 
> jan        = "JAN" 
> std        = "STD" 
> hdbk        = "HDBK" 
> prf        = "PRF" 
> dtl        = "DTL" 
> qml        = "QML" 
> qpl        = "QPL" 
> cfr        = "CFR" 
> usc        = "USC" 
> sae        = "SAE" 
> astm        = "ASTM" 
> ieee        = "IEEE" 
> rev1        = "REVISION" 
> rev2        = "REV" 
> part1        = "PART" 
> part2        = "PT" 
> chapter1 = "CHAPTER" 
> chapter2 = "CHAP" 
> volume1        = "VOLUME" 
> volume2        = "VOL" 
> validNotice1 = "VALID NOTICE" 
> validNotice2 = "VAL NOTICE" 
> cancNotice = "CANC NOTICE" 
> canc        = "CANC" 
> interimChg1 = "INTERIM CHANGE" 
> interimChg2 = "INT CHG" 
> amd1        = "AMEND" 
> amd2        = "AMD" 
> interimAmd1 = "INTERIM AMD" 
> interimAmd2 = "INT AMD" 
> supplement1 = "SUPPLEMENT" 
> supplement2 = "SUPP" 
> oo        = "-00" // preceding dash required for in-lieu-of (unable to isolate leading zeroes)
> oh        = "-0H" // both in-lieu-of and handbook 
> sparen        = " (" // get around problem with ws lparen 
> ws        = " " // plain old space character 
> 
> delim        = <<[+=",':;!#$%&*<>address@hidden>> 
> fluff        = <<[\t\n\r\f\a\e]+>> %ignore% // do not explicitly handle unprintable
> number        = <<[0-9]+>> // one or more digits 
> string        = <<[A-Z][A-Z]+>> 
> alpha        = <<[A-Z]>> 
> 
> 
> %productions% 
> 
> reference        = ( milspec | fedspec | QPLdoc | regulation | SDO )
> suffix* ; 
> 
> milspec                = milPrefix  basic ; 
> fedspec                = fedPrefix  basic ; 
> QPLdoc                = ( qpl | qml ) dash ( fedspec | basic |
> SDO ) ; // Qualified Products List 
> regulation        = [string] regPattern alphaNumWDelim ; // optional string represents govt agency
> SDO                = ( sae | astm | ieee ) [ delimiter | ws ]
> alphaNumWDelim ; 
> 
> suffix                 = impliedAmend | ( [ws] keyword [ws]
> value? ) ; 
> keyword                = rev1 | rev2 | part1 | part2 | chapter1 |
> chapter2  
>                 | volume1 | volume2 | validNotice1 | validNotice2 
>                 | cancNotice | canc | amd1 | amd2 | interimAmd1 |
> interimAmd2  
>                 | interimChg1 | interimChg2 | supplement1 |
> supplement2  
>                 | string | alpha ; 
> value                = [delimiter] alphaNumWDelim+ ; 
> 
> milPrefix        = ( mil | dod | jan ) dash [middleAlpha] ; 
> basic                = baseNum  [slashNum] ; 
> baseNum                = number  [impliedRev]  ; 
> slashNum        = ( slash | period )  number  [impliedRev] ; 
> fedPrefix        = ( string | alpha )  dash alpha indicators ; 
> regPattern        = number  ( cfr | usc )  [ string | alpha ] ; 
> 
> middleAlpha        = ( prf | dtl | hdbk | std | alpha )
> [indicators] ; 
> indicators        = oo | oh | (dash alpha) | dash ; // in-lieu-of or both or hdbk (single H)
> delimiter        = dash | slash | period | lparen | rparen |delim ; 
> impliedAmend        = sparen alphaNum+ rparen ; // implied amendment 
> impliedRev        = ( dash  number  [ string | alpha ] )  |  ( string
> | alpha ) ; 
> alphaNumWDelim        = alphaNum morestuff* ; 
> alphaNum        = number | string | alpha ; 
> morestuff        = alphaNum | delimiter | ws number ; 
> 
> 
> Thanks in advance for your help, Brian. 
> By the way, I am an associate of Anant Mistry.




