From: Paul Eggert
Subject: bug#17305: [PATCH] dfa: fix bug that caused NUL to be mishandled in patterns
Date: Sun, 20 Apr 2014 23:18:56 -0700

This bug was introduced in the early-2012 patches that fixed some
context-handling bugs.  Bisecting found commit
d8951d3f4e1bbd564809aa8e713d8333bda2f802 (2012-02-05 18:00:43 +0100),
but it appears the underlying problem was introduced in commit
8b47c4cf6556933f59226c234b0fe984f6c77dc7 (2012-01-03 11:22:09 +0100).
* NEWS: Mention bug fix.
* src/dfa.c (char_context): Consider NUL to be a newline only if -z.
* tests/Makefile.am (TESTS): Add null-byte.
* tests/null-byte: New file.
---
 NEWS              |  3 +++
 src/dfa.c         |  2 +-
 tests/Makefile.am |  1 +
 tests/null-byte   | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 57 insertions(+), 1 deletion(-)
 create mode 100755 tests/null-byte

diff --git a/NEWS b/NEWS
index 92ce95e..fbb782b 100644
--- a/NEWS
+++ b/NEWS
@@ -11,6 +11,9 @@ GNU grep NEWS                                    -*- outline -*-
   grep no longer mishandles an empty pattern at the end of a pattern list.
   [bug introduced in grep-2.5]
 
+  grep -f no longer mishandles patterns containing NUL bytes.
+  [bug introduced in grep-2.11]
+
   grep -P now works with -w and -x and backreferences. Before,
   echo aa|grep -Pw '(.)\1' would fail to match, yet
   echo aa|grep -Pw '(.)\2' would match.
diff --git a/src/dfa.c b/src/dfa.c
index 90cf4a9..c93f451 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -694,7 +694,7 @@ static charclass newline;
 static int
 char_context (unsigned char c)
 {
-  if (c == eolbyte || c == 0)
+  if (c == eolbyte)
     return CTX_NEWLINE;
   if (IS_WORD_CONSTITUENT (c))
     return CTX_LETTER;
diff --git a/tests/Makefile.am b/tests/Makefile.am
index cc79903..91775bd 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -76,6 +76,7 @@ TESTS =                                               \
   max-count-vs-context                         \
   mb-non-UTF8-performance                      \
   multibyte-white-space                                \
+  null-byte                                    \
   empty-line-mb                                        \
   unibyte-bracket-expr                         \
   unibyte-negated-circumflex                   \
diff --git a/tests/null-byte b/tests/null-byte
new file mode 100755
index 0000000..c967dbc
--- /dev/null
+++ b/tests/null-byte
@@ -0,0 +1,52 @@
+#!/bin/sh
+# Test NUL bytes in patterns and data.
+
+# Copyright 2014 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+
+# Add "." to PATH for the use of get-mb-cur-max.
+path_prepend_ .
+
+locales=C
+for locale in en_US.iso885915 en_US.UTF-8; do
+  get-mb-cur-max $locale >/dev/null 2>&1 && locales="$locales $locale"
+done
+
+fail=0
+
+for left in '' a '#' '\0'; do
+  for right in '' b '#' '\0'; do
+    data="$left\\0$right"
+    printf "$data\\n" >in || framework_failure_
+    for hat in '' '^'; do
+      for dollar in '' '$'; do
+        for force_regex in '' '\\(\\)\\1'; do
+          pat="$hat$force_regex$data$dollar"
+          printf "$pat\\n" >pat || framework_failure_
+          for locale in $locales; do
+            LC_ALL=$locale grep -f pat in ||
+              fail_ "'$pat' does not match '$data'"
+            LC_ALL=$locale grep -a -f pat in | cmp -s - in ||
+              fail_ "-a '$pat' does not match '$data'"
+          done
+        done
+      done
+    done
+  done
+done
+
+Exit $fail
-- 
1.9.0





