bug#18455: grep 2.20 perl-regexp: invalid UTF-8 byte sequence in input
From: Mario Grgic
Subject: bug#18455: grep 2.20 perl-regexp: invalid UTF-8 byte sequence in input
Date: Thu, 11 Sep 2014 21:27:22 -0400
This happens with GNU grep version 2.20 and PCRE 8.35 on Mac OS X. The
following command reproduces the problem:
$ printf 'j\x82\nj\n' | grep -P j
invalid UTF-8 byte sequence in input
But I usually encounter this when recursively searching through files and
hitting a binary file that contains an invalid UTF-8 sequence. If a binary
file with an invalid UTF-8 sequence is encountered first (without any other
matches), grep aborts the entire recursive search and does not even mention
which file caused the error. This is somewhat confusing when you first
encounter it.
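Since grep does not name the offending file, one way to locate it is sketched
below. This is only an illustration, not something grep provides: it assumes
iconv is installed, and the scratch directory and the file names bad.bin and
good.txt are made up for the example. It relies on iconv exiting with a
non-zero status when its input cannot be decoded as UTF-8.

```shell
# Hypothetical scratch directory; the file names are illustrative only.
dir=$(mktemp -d "${TMPDIR:-/tmp}/grep18455.XXXXXX")
printf 'j\202\nj\n' > "$dir/bad.bin"   # \202 is octal for the 0x82 byte used above
printf 'j\n' > "$dir/good.txt"

# Print every file whose contents are not valid UTF-8. iconv exits
# non-zero when it cannot decode its input as UTF-8.
invalid=$(find "$dir" -type f | while IFS= read -r f; do
    iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1 || printf '%s\n' "$f"
done)
printf '%s\n' "$invalid"
```

To check a real tree, point find at the directory being searched instead of
the scratch directory.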
By the way, this works in GNU grep 2.18 without any errors (you get messages
like "Binary file x matches"), with either PCRE 8.33 or 8.35 (I have not tried
any other combinations).