Re: [bug-gawk] 4.0 beta1, character lists broken


From: Aharon Robbins
Subject: Re: [bug-gawk] 4.0 beta1, character lists broken
Date: Sun, 29 May 2011 22:46:05 +0300
User-agent: Heirloom mailx 12.4 7/29/08

Hi.  Thanks for the bug report.

> Date: Fri, 27 May 2011 16:45:33 +0200
> From: Juergen Daubert <address@hidden>
> To: address@hidden
>
> Hello,
>
> first a big thanks for the new gawk with lots of nice new features. 
>
> I've done some first tests, it looks like the handling of character
> lists is partially broken with the new 4.0 beta:
>
> $:~> echo 'a' | awk '/[\001-\177]/'
> awk: cmd. line:1: fatal: add_char: *bufp: can't allocate -1879052298 bytes of memory (Cannot allocate memory)

This is a bug. See the patch below, which will shortly be in the
git repo.
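
One part of the patch below changes the range loop counter from `char` to `int`. As a minimal sketch of why that matters for a range ending at \177, the following standalone C compares the two counter types. The function names and the iteration cap are hypothetical and exist only for illustration; none of this is gawk code, and it assumes the common case where converting an out-of-range value to `signed char` wraps around.

```c
#include <assert.h>

/*
 * Count the iterations of a loop shaped like
 *     for (i = start + 1; i <= end; i++)
 * when the counter is a signed char, giving up after `cap` iterations.
 * Hypothetical helper for illustration; not part of gawk.
 */
long iterations_with_signed_char(int start, int end, long cap)
{
    signed char i;
    long n = 0;

    assert(cap > 0);
    /* With end == 127 the test i <= end can never fail for a signed char. */
    for (i = (signed char)(start + 1); i <= end && n < cap; i++)
        n++;
    return n;
}

/* The same loop with an int counter, which terminates normally. */
long iterations_with_int(int start, int end, long cap)
{
    int i;
    long n = 0;

    assert(cap > 0);
    for (i = start + 1; i <= end && n < cap; i++)
        n++;
    return n;
}
```

With `start` 1 and `end` 127, the `int` version stops on its own, while the `signed char` version only stops because of the cap. A loop that never ends while appending to a buffer is one plausible way to arrive at an absurd allocation size.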

> $:~> echo 'a' | awk '/[\134]/'
> awk: cmd. line:1: error: Unmatched [ or [^: /[\]/

This is not a bug:

$ echo 'a' | gawk-3.1.8 '/[\134]/'
gawk-3.1.8: fatal: Unmatched [ or [^: /[\]/

Octal 134 is a backslash, and thus the diagnostic is correct; there is
no closing ] character.
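
To make the arithmetic concrete, here is a small sketch that decodes an octal escape the way the text above describes. `octal_escape_value` is a hypothetical helper written for this illustration, not gawk's escape-processing code, and it assumes an ASCII character set.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Decode up to three octal digits that follow a backslash, as in "\134".
 * Returns the character code, or -1 if `digits` does not begin with an
 * octal digit. Hypothetical helper; gawk's own handling is not shown here.
 */
int octal_escape_value(const char *digits)
{
    int value = 0;
    size_t n;

    assert(digits != NULL);
    if (digits[0] < '0' || digits[0] > '7')
        return -1;
    for (n = 0; n < 3 && digits[n] >= '0' && digits[n] <= '7'; n++)
        value = value * 8 + (digits[n] - '0');
    return value;
}
```

Calling `octal_escape_value("134")` yields 92, which is the ASCII code for a backslash, so the pattern effectively becomes /[\]/ as shown in the diagnostic.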

> $:~> echo '\' | awk '/[\001-\176]/'
> $:~> 

Fixed, now.

> All of the above works with gawk 3.1.8 as expected.

Well, except for the case above with \134.

> This is on an almost up-to-date Linux system with glibc 2.12.2 and
> gcc 4.5.3.
>
> thanks
> Juergen

Thanks for the report. Here is a patch; it also fixes an additional
problem reported by John Haque.

diff --git a/re.c b/re.c
index 691955f..b317b09 100644
--- a/re.c
+++ b/re.c
@@ -643,6 +643,7 @@ add_char(char **bufp, size_t *lenp, char ch, char **ptr)
        erealloc(*bufp, char *, newlen + 2, "add_char");
        *ptr = *bufp + offset;
        **ptr = ch;
+       *lenp = newlen + 2;
        (*ptr)++;
 }
 
@@ -714,7 +715,7 @@ again:
                        /* inside [...] but not inside [[:...:]] */
                        if (*sp == '-') {
                                int start, end;
-                               char i;
+                               int i;
 
                                if (sp[1] == ']') {     /* also literal */
                                        copy();
@@ -728,8 +729,18 @@ again:
                                        len--;
                                }
                                end = sp[1];
-                               for (i = start + 1; i <= end; i++)
+                               if (end < start)
+                                       fatal(_("Invalid range end: /%.*s/"),
+                                                               *lenp, s);
+                               for (i = start + 1; i < end; i++) {
+                                       /*
+                                        * Will the special cases never end?
+                                        */
+                                       if (i == '\\' || i == ']') {
+                                               copych('\\');
+                                       }
                                        copych(i);
+                               }
                                sp++;
                                len--;
                                continue;
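

The patch above does three things: it records the new buffer size after growing, it uses an `int` loop counter, and it escapes `\` and `]` while expanding a range. The following is a self-contained sketch of those ideas under stated assumptions, not a copy of re.c: `struct charbuf`, `charbuf_add`, and `expand_range` are hypothetical names, plain `realloc` stands in for gawk's `erealloc` macro, and the range endpoints are assumed to be copied by the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Growable buffer; a hypothetical stand-in for re.c's (bufp, lenp, ptr). */
struct charbuf {
    char *data;
    size_t used;
    size_t cap;
};

/* Append one character. Returns 0 on success, -1 if allocation fails. */
int charbuf_add(struct charbuf *b, char ch)
{
    assert(b != NULL);
    if (b->used + 1 >= b->cap) {        /* keep room for a trailing NUL */
        size_t newcap = b->cap ? b->cap * 2 : 16;
        char *p = realloc(b->data, newcap);

        if (p == NULL)
            return -1;
        b->data = p;
        b->cap = newcap;                /* record the new size after growing */
    }
    b->data[b->used++] = ch;
    b->data[b->used] = '\0';
    return 0;
}

/*
 * Append the characters strictly between start and end, escaping
 * backslash and ']' so they stay literal inside a bracket expression.
 * Returns 0 on success, -1 on a reversed range or allocation failure.
 */
int expand_range(struct charbuf *b, int start, int end)
{
    int i;      /* int, as in the patch; a char counter could wrap */

    if (end < start)
        return -1;
    for (i = start + 1; i < end; i++) {
        if ((i == '\\' || i == ']') && charbuf_add(b, '\\') != 0)
            return -1;
        if (charbuf_add(b, (char)i) != 0)
            return -1;
    }
    return 0;
}
```

For example, expanding the range from `Z` to `_` appends `[`, an escaped backslash, an escaped `]`, and `^`, while a reversed range such as `b` to `a` is rejected, mirroring the "Invalid range end" check in the patch.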


