Re: $$ = $1

bison-patches

From:	Paul Eggert
Subject:	Re: $$ = $1
Date:	Wed, 31 Jul 2002 00:55:46 -0700 (PDT)

> Cc: Paul Eggert <address@hidden>
> From: Akim Demaille <address@hidden>
> Date: 30 Jul 2002 19:48:02 +0200
> User-Agent: Gnus/5.0808 (Gnus v5.8.8) XEmacs/21.4 (Honest Recruiter)
> 
> 
> Paul, the following comment and code do not agree:
> 
> 
>    /*-----------------------------.
>    | yyreduce -- Do a reduction.  |
>    `-----------------------------*/
>    yyreduce:
>      /* yyn is the number of a rule to reduce with.  */
>      yylen = yyr2[yyn];
>    
>      /* If YYLEN is nonzero, implement the default value of the action:
>         `$$ = $1'.
>    
>         Otherwise, the following line sets YYVAL to the semantic value of
>         the lookahead token.  This behavior is undocumented and Bison
>         users should not rely upon it.  Assigning to YYVAL
>         unconditionally makes the parser a bit smaller, and it avoids a
>         GCC warning that YYVAL may be used uninitialized.  */
>      yyval = yyvsp[1-yylen];
> 
> 
> The comment refers to rules without any lhs, i.e., yylen = 0,

The first part of the comment ("If YYLEN is nonzero...") refers to
rules with nonempty right-hand sides.

The second part of the comment ("Otherwise, ...") refers to rules with
empty right-hand sides.  I think this is what you're talking about
when you say "without any lhs".

> i.e., we point to yyvsp[1], in other words one past the end of the
> stack: a no-bit-land.

It shouldn't be a no-bit-land, since yyvsp[1] should contain the
semantic value of the lookahead token that was used to decide whether
to make this reduction.

If memory serves, Bison won't reduce an empty rule without a lookahead
token.  I very vaguely recall checking for this back in 1999 when I
proposed the above-quoted code (which was installed on 2000-10-02).
Perhaps things have changed since then?  Or quite possibly I didn't
check correctly.  FYI, before the patch, the last line of the above
code was `if (yylen > 0) yyval = yyvsp[1-yylen];', which meant that
yyval could be used "uninitialized".
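
To make the index arithmetic concrete, here is a minimal C sketch of the unconditional assignment. The array, the helper names `default_value` and `demo_default_value`, and the value 99 placed in the spare slot are assumptions for illustration only. This is not Bison's skeleton code, and it makes no claim about what a real parser leaves in that slot.

```c
#include <assert.h>

/* Mirrors the quoted line `yyval = yyvsp[1-yylen];`.
   vsp points at the topmost stack value; len is the length of the
   rule's right-hand side.  */
static int default_value(const int *vsp, int len)
{
    assert(len >= 0);
    return vsp[1 - len];
}

/* Returns 1 if the index arithmetic behaves as described.  */
static int demo_default_value(void)
{
    /* Hypothetical stack: slot 0 unused, three values, then one spare
       slot that this sketch fills on purpose so the read is defined.  */
    int stack[5] = { 0, 10, 20, 30, 99 };
    const int *vsp = &stack[3];          /* top of stack holds 30 */

    return default_value(vsp, 3) == 10   /* $1 of a three-symbol rule */
        && default_value(vsp, 1) == 30   /* $1 of a one-symbol rule */
        && default_value(vsp, 0) == 99;  /* empty rule: slot above the top */
}
```

With `len == 0` the expression reads `vsp[1]`, one slot above the top, which is why the question of what that slot holds matters.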

> I face the same problem for the default location, which when there is
> a non empty lhs, amounts to
> 
>         @$ = range from @1 to @n
> 
> What does make sense in that case, is to
> have a 0-width location starting where the last symbol ended (@0), and
> ends at the same point.

More useful, I think, would be to have a nonpositive-width location,
starting where the next symbol starts, and ending where the previous
symbol ends.  This would let the user identify exactly where the empty
rule was reduced, and it should be a natural generalization of the
existing code, which already says that @$ starts with @1's start and
ends with @n's end (in this case, n is 0, so @$ ends where @0 ends).

If you don't like negative widths, I suppose you could arbitrarily
set those widths to zero.

>         exp: one empty one;
>         one: '1';
>         empty: ;
> 
> on the input `11', one should have in the first rule
> 
>         exp: one empty one { @1 = 0-1; @2 = 1-1; @3 = 1-2; }

Yes, this agrees with the nonpositive-width solution suggested above:
in this case the width of @2 is zero, which is nonpositive.
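
The rule suggested above can be sketched in C as follows. The `loc` type, the function names, and the convention that `rhs[0]` holds @0 are assumptions made for this example; they are not taken from Bison's sources.

```c
#include <assert.h>

/* Hypothetical location type: a character range, as in the `11` example.  */
typedef struct { int start; int end; } loc;

/* rhs[0] is @0 (the symbol just before the rule), rhs[1..n] are @1..@n.
   next is the location of the symbol that follows the rule.  */
static loc default_location(const loc *rhs, int n, loc next)
{
    loc result;
    assert(n >= 0);
    result.start = n > 0 ? rhs[1].start : next.start;
    result.end = rhs[n].end;             /* for n == 0 this is @0's end */
    return result;
}

static int loc_width(loc l) { return l.end - l.start; }

/* Width of @$ for `empty' reduced after a `one' at 0-1, when the next
   `one' starts `gap' characters after the previous one ends.  */
static int demo_empty_width(int gap)
{
    loc rhs[1] = { { 0, 1 } };           /* only @0 */
    loc next = { 1 + gap, 2 + gap };
    return loc_width(default_location(rhs, 0, next));
}

/* @$ for `exp: one empty one' on the input `11'.  */
static loc demo_exp_location(void)
{
    loc rhs[4] = { { 0, 0 }, { 0, 1 }, { 1, 1 }, { 1, 2 } };
    loc next = { 2, 2 };
    return default_location(rhs, 3, next);
}
```

On the input `11` the gap is zero, so the empty rule gets 1-1. If something separated the two symbols, the start would fall after the end and the width would be negative, which is the case the "arbitrarily set those widths to zero" fallback would cover.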

> So when reducing empty, it
> should be given @0, the location of the first one :(

Yes, that's right (if I understand you correctly).

> Anyway, back to our $$ = $1.  I fail to see what real problem the code
> could introduce.  Sure, we read uninitialized bytes, but tools such as
> Valgrind are fine with this *except* if you *use* this value.  But
> then, the user is wrong anyway, since she didn't specify a value for
> `empty'.

Can't Bison check for this case when it is processing the rule?
Then we don't need to worry about the user being wrong, since Bison
will reject such grammars.


> The other problem might be looking past the actually
> allocated stack, but I couldn't make a test case that exhibits this
> failure (I'll try again tomorrow, I must go now, but maybe the limits
> we have on the stack always ensure we can use its top + 1).

Yes, I think the code is designed that way (at least it was in 1999).
Even better, that stack slot should always be initialized.

> What fix would you suggest?

How about if we have Bison reject grammars where the implicit action
"$$ = $1" is applied to a rule with an empty right-hand side?  Perhaps
that's too draconian; on the other hand, such a grammar is in
trouble anyway.
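
As a rough illustration of the proposed check, here is a C sketch over a made-up rule table. The `rule` struct and every function name are hypothetical; Bison's internal grammar representation is different, and this only shows the shape of the test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one grammar rule.  */
typedef struct {
    const char *lhs;   /* name of the left-hand-side symbol */
    int rhs_len;       /* number of right-hand-side symbols */
    int has_action;    /* nonzero if the user wrote an explicit action */
} rule;

/* True when the implicit `$$ = $1' would apply to an empty right-hand side.  */
static int implicit_action_on_empty_rhs(const rule *r)
{
    return r->rhs_len == 0 && !r->has_action;
}

/* Number of rules the proposed check would reject.  */
static size_t count_rejected(const rule *rules, size_t n)
{
    size_t i, rejected = 0;
    for (i = 0; i < n; i++)
        if (implicit_action_on_empty_rhs(&rules[i]))
            rejected++;
    return rejected;
}

/* The example grammar from this thread; the flag says whether
   `empty' was given an explicit action.  */
static size_t demo_count(int empty_has_action)
{
    rule rules[3] = {
        { "exp",   3, 0 },
        { "one",   1, 0 },
        { "empty", 0, 0 }
    };
    rules[2].has_action = empty_has_action;
    return count_rejected(rules, 3);
}
```

Under this sketch the grammar above is rejected as written, and accepted once `empty` has an explicit action.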
