grep branch, master, updated. v2.9-60-g7d20c09


From: Jim Meyering
Subject: grep branch, master, updated. v2.9-60-g7d20c09
Date: Tue, 04 Oct 2011 19:46:15 +0000

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "grep".

The branch, master has been updated
       via  7d20c09e3e7cf3af9060f395e884fca285ce3598 (commit)
      from  49684e05ed0362928b9fd2d14ecc3153300b702f (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
http://git.savannah.gnu.org/cgit/grep.git/commit/?id=7d20c09e3e7cf3af9060f395e884fca285ce3598


commit 7d20c09e3e7cf3af9060f395e884fca285ce3598
Author: Eli Zaretskii <address@hidden>
Date:   Sun Oct 2 21:33:53 2011 +0200

    dfa: don't mishandle high-bit bytes in a regexp with signed-char
    
    This appears to arise only on systems for which "char" is signed.
    * src/dfa.c (FETCH_WC, FETCH): Produce an unsigned value, rather
    than a sign-extended one.  Fixes a bug on MS-Windows with compiling
    patterns that include characters with the 8-th bit set.
    (to_uchar): Define.  From coreutils.
    Reported by David Millis <address@hidden>.
    See http://thread.gmane.org/gmane.comp.gnu.grep.bugs/3893
    * NEWS (Bug fixes): Mention it.

diff --git a/NEWS b/NEWS
index 8578e82..2b06af4 100644
--- a/NEWS
+++ b/NEWS
@@ -2,6 +2,11 @@ GNU grep NEWS                                    -*- outline -*-
 
 * Noteworthy changes in release ?.? (????-??-??) [?]
 
+** Bug fixes
+
+  grep no longer mishandles high-bit-set pattern bytes on systems
+  where "char" is a signed type. [bug appears to affect only MS-Windows]
+
   grep now rejects a command like "grep -r pattern . > out",
   in which the output file is also one of the inputs,
   because it can result in an "infinite" disk-filling loop.
diff --git a/src/dfa.c b/src/dfa.c
index 8611435..dc87915 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -86,6 +86,11 @@
 /* Sets of unsigned characters are stored as bit vectors in arrays of ints. */
 typedef int charclass[CHARCLASS_INTS];
 
+/* Convert a possibly-signed character to an unsigned character.  This is
+   a bit safer than casting to unsigned char, since it catches some type
+   errors that the cast doesn't.  */
+static inline unsigned char to_uchar (char ch) { return ch; }
+
 /* Sometimes characters can only be matched depending on the surrounding
    context.  Such context decisions depend on what the previous character
    was, and the value of the current (lookahead) character.  Context
@@ -686,7 +691,7 @@ static unsigned char const *buf_end;        /* reference to end in dfaexec().  */
           {                                    \
             cur_mb_len = 1;                    \
             --lexleft;                         \
-            (wc) = (c) = (unsigned char) *lexptr++; \
+            (wc) = (c) = to_uchar (*lexptr++);  \
           }                                    \
         else                                   \
           {                                    \
@@ -715,7 +720,7 @@ static unsigned char const *buf_end;        /* reference to end in dfaexec().  */
         else                         \
           return lasttok = END;              \
       }                                      \
-    (c) = (unsigned char) *lexptr++;  \
+    (c) = to_uchar (*lexptr++);       \
     --lexleft;                       \
   } while(0)
 

-----------------------------------------------------------------------

Summary of changes:
 NEWS      |    5 +++++
 src/dfa.c |    9 +++++++--
 2 files changed, 12 insertions(+), 2 deletions(-)


hooks/post-receive
-- 
grep