bug#60690: -P '\d' in GNU and git grep
From: Carlo Arenas
Subject: bug#60690: -P '\d' in GNU and git grep
Date: Wed, 5 Apr 2023 14:20:51 -0700
On Wed, Apr 5, 2023 at 12:40 PM Jim Meyering <jim@meyering.net> wrote:
>
> Changing grep -P's \d to match multibyte digits by default would break
> an important contract.
While I tend to agree[1] (and indeed that is why PCRE2_EXTRA_ASCII_BSD
was invented), it is also important to note that this goes against
the Unicode recommendation[2] and that the contract already does not
hold[3] in Python, .NET or Rust (which means ripgrep behaves like GNU
grep -P 3.9).
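To make the Python case concrete, here is a minimal sketch, assuming CPython's standard `re` module; the sample string is made up for illustration. It compares the default `\d`, `\d` under `re.ASCII`, and an explicit `[0-9]`:

```python
import re
import unicodedata

# Hypothetical sample: ASCII digits plus DEVANAGARI DIGITs ONE..THREE.
sample = "abc 123 \u0967\u0968\u0969"

# Default for str patterns: \d is Unicode-aware.
unicode_digits = re.findall(r"\d+", sample)
# With re.ASCII, \d is restricted to [0-9].
ascii_digits = re.findall(r"\d+", sample, flags=re.ASCII)
# An explicit range is ASCII-only regardless of flags.
explicit_range = re.findall(r"[0-9]+", sample)

# General category of the characters matched only in Unicode mode.
extra = set("".join(unicode_digits)) - set("".join(ascii_digits))
extra_categories = {unicodedata.category(ch) for ch in extra}

print(unicode_digits)
print(ascii_digits)
print(explicit_range)
print(extra_categories)
```

The characters that only the default mode picks up are all in general category Nd, which is the set `\p{Nd}` names in engines that support Unicode properties (Python's `re` does not).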
FWIW I also agree that (at least) `git grep -P` should use
PCRE2_EXTRA_ASCII_BSD by default, as that is what makes more sense in
the context of matching source code, and using `\p{Nd}` instead if you
really want all Unicode digits is not much of a burden. I am not sure
that makes sense in other contexts, though, especially considering
that I am obviously biased: the languages I mostly interact with
ONLY use Arabic numerals, and therefore `\d` meaning `[0-9]` seems
"normal".
Carlo
CC: changed to the real email address for PCRE2 development; for full
context on this thread see [4].
[1] https://github.com/PCRE2Project/pcre2/pull/186
[2] https://unicode.org/reports/tr18/
[3] https://regex101.com/r/S5RW4c/1
[4] https://lore.kernel.org/git/230109.86v8lf297g.gmgdl@evledraar.gmail.com/T/