bison-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Bison GLR buffer-overflow and invalid pointer fixes


From: Paul Eggert
Subject: Re: Bison GLR buffer-overflow and invalid pointer fixes
Date: Tue, 05 Jul 2005 21:30:10 -0700
User-agent: Gnus/5.1007 (Gnus v5.10.7) Emacs/21.4 (gnu/linux)

Thanks for looking at the patch.  At the end of this message is the
further patch that I installed in response to your comments.

Paul Hilfinger <address@hidden> writes:

> I write
>     foo bar;
>     bar = baz;
> rather than
>     foo bar = baz;
> in cases where bar is a local variable that is going to be modified,
> and I avoid ++ and -- as much as possible)

OK, I adjusted to that style in code that is unique to glr.c.  Some
code is shared between yacc.c, though; I thought it better to leave
that alone (please see below).

> yyFail is called with a known set of arguments, so no resizing is
> really needed.  I suppose it can't hurt to protect against rogue calls
> from user code.

That's a a good point about the known set of arguments: I hadn't
noticed that the printf-like functionality is needed at all.  This
simplifies things greatly (and also avoids relying on vsnprintf, which
is a bit newfangled).  However, we shouldn't need to worry about rogue
calls: after all, user code can do rogue calls of any sort, and there
is no real defense against that.

>  >    (yyreportSyntaxError): Check for size-calculation overflow.
>  >    This code is taken from yacc.c.
>
> This one I am less happy about.  The original code should have tested
> for memory overflow, true, but that requires just one small check. 

Actually, I think there are several places where checks are required.
In general we need to check whenever two size_t values are added.
Some of these checks can be omitted, but it's not clear to me that
all can be omitted.

Part of the extra complexity in yacc.c is needed to deal with the
possibility that the strings become arbitrarily long after they are
translated into the user's natural language.  (A user could, for
example, substitute a message catalog with super-long strings.)  This
complexity is not needed in glr.c yet, since it doesn't do translation
yet, but my guess is that it will support translation soon once the
kinks are ironed out elsewhere.  All other things being equal I'd
prefer glr.c to resemble yacc.c (by improving either or both modules)
in areas like this where they're doing the same thing.

I agree that the code is too complicated, but I don't offhand see how
to simplify it.  (Perhaps I'm too familiar with it....)

> This replacement seems even more complex than the original and
> introduces what seems to be a gratuitous limit on the number of
> "expected" tokens reported.  

The gratuitous limit of 5 tokens was already in glr.c, albeit in an
unobvious spot.  Here it is (starting with line 1711 of revision 1.94
dated 2005-06-07):

                if (yycount == 5)
                  {
                    yysize = 0;
                    break;
                  }

This same limit of 5 is also in yacc.c.  The limit could be removed,
but I think the limit is there because users typically don't want to
see more than 5 possible tokens.

Anyway, here's the further patch I just installed:

2005-07-05  Paul Eggert  <address@hidden>

        * data/glr.c (yyFail): Drastically simplify; since the format argument
        never had any % directives, we can simply pass it to yyerror.
        (yyparse): Use "t a; a=b;" rather than "t a = b;" when a will
        be modified later, as that is the usual style in glr.c.
        Problems reported by Paul Hilfinger.

--- glr.c       5 Jul 2005 21:58:37 -0000       1.95
+++ glr.c       6 Jul 2005 04:12:23 -0000       1.96
@@ -678,37 +678,11 @@ struct yyGLRStack {
 static void yyexpandGLRStack (yyGLRStack* yystack]b4_pure_formals[);
 
 static void
-yyFail (yyGLRStack* yystack]b4_pure_formals[, const char* yyformat, ...)
+yyFail (yyGLRStack* yystack]b4_pure_formals[, const char* yymsg)
 {
   yystack->yyerrflag = 1;
-  if (yyformat != NULL)
-    {
-      char yysmallbuf[1024];
-      char const *yymsg = yysmallbuf;
-      char *yybigbuf = NULL;
-      int yymsglen;
-      va_list yyap;
-
-      va_start (yyap, yyformat);
-      yymsglen = vsnprintf (yysmallbuf, sizeof yysmallbuf, yyformat, yyap);
-      va_end (yyap);
-
-      if (yymsglen < 0)
-       yymsg = "message is too long to be printed";
-      else if (sizeof yysmallbuf <= yymsglen && yymsglen < YYSIZEMAX)
-       {
-         size_t yybigsize = yymsglen;
-         yybigsize++;
-         yymsg = yybigbuf = YYMALLOC (yybigsize);
-
-         va_start (yyap, yyformat);
-         vsnprintf (yybigbuf, yybigsize, yyformat, yyap);
-         va_end (yyap);
-       }
-
-      yyerror (]b4_yyerror_args[yymsg);
-      YYFREE (yybigbuf);
-    }
+  if (yymsg != NULL)
+    yyerror (]b4_yyerror_args[yymsg);
   longjmp (yystack->yyexception_buffer, 1);
 }
 
@@ -1973,7 +1947,7 @@ yyrecoverSyntaxError (yyGLRStack* yystac
 
 ]b4_c_ansi_function_def([yyparse], [int], b4_parse_param)[
 {
-  yySymbol yytoken = YYEMPTY;
+  yySymbol yytoken;
   yyGLRStack yystack;
   size_t yyposn;
 ]b4_pure_if(
@@ -1989,6 +1963,8 @@ yyrecoverSyntaxError (yyGLRStack* yystac
 
   YYDPRINTF ((stderr, "Starting parse\n"));
 
+  yytoken = YYEMPTY;
+
   if (setjmp (yystack.yyexception_buffer) != 0)
     goto yyDone;
 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]