bug-grep

bug#20657: Traditional range expression not accepted in regex/dfa


From: Paul Eggert
Subject: bug#20657: Traditional range expression not accepted in regex/dfa
Date: Mon, 25 May 2015 23:53:31 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0

address@hidden wrote:

> The bugaboo here is the "---"; it's
> a range expression consisting of minus through minus, and apparently long
> ago was how one got a minus into a bracket expression.

Actually, long ago expressions like '[^0-9-]' worked just as they do now, and it wasn't ever necessary to use trailing "---". That being said, it is true that in 7th Edition Unix '[^0-9---]' meant the same thing as '[^0-9-]', so in that sense we have an incompatibility with 7th Edition Unix here.

>         $ ./src/grep '[^0-9---]' /dev/null
>         ./src/grep: Invalid range end
>
> The underlying regex and, I believe, dfa routines don't accept this.

Yes, that's correct. It's not a bug, though, as the regexp is ambiguous and does not conform to POSIX, which says the following about RE bracket expressions:

"To use a <hyphen> as the starting range point, it shall either come first in the bracket expression or be specified as a collating symbol; for example, "[][.-.]-0]", which matches either a <right-square-bracket> or any character or collating element that collates between <hyphen> and 0, inclusive."
<http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_03_05>

In your correspondent's example, the hyphen is a starting range point but is neither first in the bracket expression nor specified as a collating symbol, so the regexp doesn't conform to POSIX.
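For illustration, here is a minimal sketch of two portable spellings that put a literal hyphen into the bracket expression without making it a range point: hyphen last, and hyphen first (directly after the "^"). The helper names sample, hyphen_last and hyphen_first are made up for this example. The collating-symbol form from the POSIX text is not exercised here, since not every regex implementation supports collating symbols.

```shell
# Hypothetical sample input: a letter, a hyphen, and a digit, one per line.
sample() { printf 'a\n-\n5\n'; }

# Hyphen last in the bracket expression: treated as a literal hyphen.
hyphen_last() { grep '[^0-9-]'; }

# Hyphen first (right after the ^ of a non-matching list): also a literal.
hyphen_first() { grep '[^-0-9]'; }

# Both select only lines containing a character that is neither a digit
# nor a hyphen, so only the "a" line is printed each time.
sample | hyphen_last
sample | hyphen_first
```

Either spelling expresses what '[^0-9---]' meant in 7th Edition Unix, without relying on a range whose start and end are both the hyphen.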

Even though it's not a bug, I suppose it wouldn't hurt to make the GNU matchers compatible with 7th Edition Unix here, if someone really wants to take that task on; it's not urgent, though.




