bug-grep

[bug #36703] Change of behaviour


From: anonymous
Subject: [bug #36703] Change of behaviour
Date: Thu, 21 Jun 2012 22:42:19 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.162 Safari/535.19

URL:
  <http://savannah.gnu.org/bugs/?36703>

                 Summary: Change of behaviour
                 Project: grep
            Submitted by: None
            Submitted on: Thu 21 Jun 2012 22:42:19 UTC
                Category: None
                Severity: 3 - Normal
              Item Group: None
                  Status: None
                 Privacy: Public
             Assigned to: None
             Open/Closed: Open
         Discussion Lock: Any

    _______________________________________________________

Details:

Before 7a0ad00f634237b753572378289d76fa8f1c5942:

$ grep 'foo.*\<bar' /tmp/t
fooxxxxx_bar
$

7a0ad00f634237b753572378289d76fa8f1c5942 and after:

$ src/grep 'foo.*\<bar' /tmp/t
$

(/tmp/t contains only that one line.)

The grep that I built to bisect this links against Ubuntu 10.10's libpcre3,
which "apt-cache show libpcre3" reports as:

Architecture: amd64
Source: pcre3
Version: 8.02-1

I'm not sure how relevant pcre is, as I'm not using -P.

I don't feel qualified to say which of the behaviours is correct. I hunted
down this change in behaviour because it surprised me that the underscore
did not form part of the "word".

It also turns out that before 7a0ad00 the output depends on whether I set
LC_ALL=C or LC_ALL=en_ZA.utf8; from 7a0ad00 onward, the output is empty
regardless of which of those two locales I set.
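
For anyone wanting to repeat the comparison, here is a minimal sketch that
runs the same pattern under both locales. The GREP variable, the
count_matches helper and the mktemp scratch file are my own stand-ins for
illustration (the report itself used /tmp/t and src/grep directly).

```shell
#!/bin/sh
# Sketch: rerun the pattern from the report under both locales mentioned.
# GREP is a hypothetical override, e.g. GREP=src/grep for a bisect build.
GREP=${GREP:-grep}

# Scratch file standing in for the report's /tmp/t (one line only).
t=$(mktemp)
trap 'rm -f "$t"' EXIT
printf 'fooxxxxx_bar\n' > "$t"

# Print the number of matching lines under the locale given as $1.
# grep exits with status 1 when nothing matches, hence the "|| true".
count_matches() {
    LC_ALL=$1 "$GREP" -c 'foo.*\<bar' "$t" || true
}

for loc in C en_ZA.utf8; do
    printf '%s: %s\n' "$loc" "$(count_matches "$loc")"
done
```

Differing counts between the two lines would correspond to the
locale-dependent behaviour described above. If en_ZA.utf8 is not installed,
grep will most likely fall back to the C locale without complaint, so it is
worth checking "locale -a" first.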

In any case, this change of behaviour seems unrelated to the commit message:

commit 7a0ad00f634237b753572378289d76fa8f1c5942
Author: Paolo Bonzini <address@hidden>
Date:   Mon Apr 19 14:50:23 2010 +0200

    dfa: optimize UTF-8 period
    
    * NEWS: Document improvement.
    * src/dfa.c (struct dfa): Add utf8_anychar_classes.
    (add_utf8_anychar): New.
    (atom): Simplify if/else nesting.  Call add_utf8_anychar for ANYCHAR
    in UTF-8 locales.
    (dfaoptimize): Abort on ANYCHAR.

Submitter: Bernd Jendrissek (not logged in)




    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?36703>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



