bug#18888: new snapshot available: grep-2.20.72-d512
From: Jim Meyering
Subject: bug#18888: new snapshot available: grep-2.20.72-d512
Date: Wed, 29 Oct 2014 11:35:48 -0700

FYI, just prior to making that snapshot, I pushed a change
that updated the gnulib submodule to the latest as of some time
yesterday, and also pulled in some small improvements to the
bootstrap script:
http://git.sv.gnu.org/cgit/grep.git/commit/?id=d512007830d2c

On Wed, Oct 29, 2014 at 11:29 AM, Jim Meyering <address@hidden> wrote:
> Thanks to many fixes and improvements by Paul Eggert and Norihiro Tanaka,
> here is a pre-release snapshot:
>
> grep snapshot:
> http://meyering.net/grep/grep-ss.tar.xz 1.2 MB
> http://meyering.net/grep/grep-ss.tar.xz.sig
> http://meyering.net/grep/grep-2.20.72-d512.tar.xz
>
> Here is the NEWS so far:
>
> ** Improvements
>
> Performance has been greatly improved for searching files containing
> holes, on platforms where lseek's SEEK_DATA flag works efficiently.
>
> Performance has improved for rejecting data that cannot match even
> the first part of a nontrivial pattern.
>
> Performance has improved for very long strings in patterns.
>
> If a file contains data improperly encoded for the current locale,
> and this is discovered before any of the file's contents are output,
> grep now treats the file as binary.
>
> grep -P no longer reports an error and exits when given invalid UTF-8 data.
> Instead, it considers the data to be non-matching.
>
> ** Bug fixes
>
> grep no longer mishandles patterns that contain \w or \W in multibyte
> locales.
>
> grep would fail to count newlines internally when operating in non-UTF8
> multibyte locales, leading it to print potentially many lines that did
> not match. E.g., the command, "seq 10 | env LC_ALL=zh_CN src/grep -n .."
> would print this:
> 1:1
> 2
> 3
> 4
> 5
> 6
> 7
> 8
> 9
> 10
> implying that the match, "10", was on line 1.
> [bug introduced in grep-2.19]
>
> grep in a non-UTF8 multibyte locale could mistakenly match in the middle
> of a multibyte character when using a '^'-anchored alternate in a pattern,
> leading it to print non-matching lines. [bug present since "the beginning"]
>
> grep -E rejected unmatched ')', instead of treating it like '\)'.
> [bug present since "the beginning"]
>
> ** Changes in behavior
>
> The GREP_OPTIONS environment variable is now obsolescent, and grep
> now warns if it is used. Please use an alias or script instead.
>
> In locales with multibyte character encodings other than UTF-8,
> grep -P now reports an error and exits instead of misbehaving.
>
> When searching binary data, grep now may treat non-text bytes as
> line terminators. This can boost performance significantly.
>
> grep -z no longer automatically treats the byte '\200' as binary data.
> ====================================================
>
> Changes in grep since v2.20:
>
> Jim Meyering (13):
> maint: post-release administrivia
> build: don't redirect directly to $@
> build: improve rule to generate egrep+fgrep scripts
> maint: generate distributed THANKS from VC'd THANKS.in
> doc: update HACKING
> maint: split long lines, and enforce the 80-column limit
> maint: avoid distcheck failure
> tests: add expect-to-fail test for a glibc regexp bug
> doc: move NEWS note about GREP_OPTIONS into proper section
> maint: suppress a false-positive -Wcast-align warning
> grep: avoid stack buffer read-underrun and overrun
> tests: make new test script executable
> gnulib: update to latest; bootstrap, too
>
> Norihiro Tanaka (13):
> dfa: speed-up at initial state
> dfa: separate dfaexec function to help optimization by compiler
> grep: fix subscript error when testing whether empty lines match
> dfa: check end of input buffer after transition in non-UTF8 multibyte locale
> dfa: factor out a new nontrivial block of duplicated code
> dfa: test for just-fixed bug
> dfa: fix a theoretical bug
> grep: initialize validation_boundary properly before use
> dfa: process all MBCSET constructs via glibc's matcher
> dfa: remove two erroneous clauses from a now-unused function
> tests: add test for grep -P fix
> dfa: avoid false match in a non-UTF8 multibyte locale
> dfa: make \w and \W work in multibyte locales
>
> Paul Eggert (46):
> build: update gnulib submodule to latest
> grep: use system strstr if available and fast
> grep: undo part of previous change
> doc: use gnulib fdl module
> maint: remove grep.spec
> build: don't make output files read-only
> build: avoid -Wstack-protector
> grep: with -E, unmatched ')' matches itself
> doc: Document -r vs --exclude more carefully.
> doc: prefer @env to @code
> doc: document LANGUAGE
> grep: fix integer-width bugs in undossify_input etc.
> grep: -P now treats invalid UTF-8 input as non-matching
> grep: port recent fix to older pcre version
> grep: fix false matches with -P '...$' and invalid UTF-8
> grep: fix false matches with -P '...$' and invalid UTF-8
> doc: bug tracker has moved to debbugs.gnu.org
> grep: make GREP_OPTIONS obsolescent
> grep: diagnose -P in non-UTF-8 multibyte locale
> grep: remove/refactor unnecessary code about line splitting
> grep: speed up -P on files containing many multibyte errors
> grep: use bool for boolean in grep.c
> grep: treat a file as binary if its prefix contains encoding errors
> grep: improve performance for older glibc
> grep: use mbclen cache more effectively
> grep: avoid false alarms for mb_clen and to_uchar
> grep: use mbclen cache in one more place
> grep: port -P speedup to hosts lacking PCRE_STUDY_JIT_COMPILE
> grep: fix -P speedup bug with empty match
> grep: refactor binary-vs-unknown-vs-text flags for clarity
> grep: -z no longer considers '\200' to be binary data
> grep: non-text bytes in binary data may be treated as line ends
> grep: minor -P speedup with jit_stack
> grep: improve -P performance in typical cases
> grep: skip past holes efficiently
> grep: port to platforms lacking SEEK_DATA
> grep: speed up processing of holes before EOF on Solaris
> grep: scan for valid multibyte strings more quickly
> grep: don't check extensively for invalid prefix bytes unless -P
> maint: generalize the -Wcast-align fix
> dfa: minor tweaks, mostly to remove __attribute__ ((noinline))
> doc: clarify exit status
> doc: modernize and simplify man page
> grep: fix off-by-one bug in -P optimization
> grep: fix grep -P crash
> tests: work around older libpcre bugs when testing -P and UTF-8
>
>
> Changes in gnulib since v2.20:
>
> * gnulib 98ca2c0...8415b67 (95):
> > socketlib, sockets, sys_socket: Use AC_REQUIRE to pacify autoconf.
> > iconv: avoid false detection of non-working iconv
> > bootstrap: print more diagnostics for missing programs
> > bootstrap: only update the gnulib submodule
> > symlinkat: port to AIX 7.1
> > readlinkat: port to AIX 7.1
> > remove spurious {
> > modules/fcntl: fix error reporting by dupfd
> > basename, dirname: Improve documentation.
> > exclude: declare exclude_patopts static
> > autoupdate
> > dirname: support compilation with C++
> > qsort_r: include <config.h>
> > avltree-list: avoid compiler warnings
> > qsort_r: new module, for GNU-style qsort_r
> > strerror_r-posix: support compilation with C++
> > fcntl-h: fix compilation with Intel C++ compiler
> > autoupdate
> > mountlist: use /proc/self/mountinfo when available
> > users.txt: add cmogstored
> > gnulib-tool: Sync with build-aux/bootstrap options
> > gnulib-tool: Fallback to wget when rsync fails
> > maintainer-makefile: add syntax check for useless ';;'
> > pthread, pthread_sigmask, threadlib: port to Ubuntu 14.04
> > error: drop spurious semicolon
> > gnulib-common.m4: port to GCC 4.2.1 and Sun Studio 12 C++
> > manywarnings: add GCC 4.9 warnings
> > vasnprintf: fix bugs in width computation
> > vasnprintf: Avoid signed/unsigned comparison warning.
> > parse-datetime: Avoid signed/unsigned comparison warning
> > qsort_r: new module, for GNU-style qsort_r
> > vla: new module
> > localename: make gl_locale_name_thread really thread-safe on Windows
> > getpass: don't assume struct termios
> > getdtablesize: fall back on sysconf (_SC_OPEN_MAX)
> > vararrays: modernize AC_C_VARARRAYS for C11
> > relocatable-prog-wrapper: port gettext to OS X 10.8 + GCC 4.8.1
> > sys_select: fix FD_ZERO problem on Solaris 10
> > accept: document Solaris 10 type glitch
> > extern-inline: port to FreeBSD, DragonFly
> > autoupdate
> > Use consistent style to check DEBUG macro in regex_internal.c
> > openat-die: use _Noreturn markup
> > test-open: port to cygwin, which lacks Fortify
> > localename: Enforce declarations before statements.
> > test-userspec: don't look up numeric user names
> > localcharset, localename: MS-Windows support for non-default locales
> > announce-gen: avoid failure when Digest::SHA is installed
> > gettext: revert "update macros to version 0.19"
> > regex: don't deref NULL upon heap allocation failure
> > maint.mk: give projects more flexibilty in set_prog_name arguments
> > regex: fix memory leak in compiler
> > announce-gen: avoid perl warnings
> > localename: avoid -Wsuggest-attribute={const,pure} warnings
> > nl_langinfo: Fix last change.
> > Define macros for glibc
> > Sync up error.c with glibc
> > nl_langinfo: fix build under mingw
> > mountlist: do not classify a bind-mounted dir entry as "dummy"
> > maint.mk: less syntax-check noise when SIGPIPE is ignored
> > nl_langinfo: CODESET on MS-Windows and more items from localeconv
> > Bruno Haible has stepped down as maintainer.
> > mktime: merge #if/#ifdef usage from glibc
> > git-version-gen: improve option descriptions
> > regex: fix memory leak in compiler
> > regex: merge patch from libc
> > acl: port to gcc -Wredundant-decls
> > parse-duration: eliminate 68-year duration limit
> > pthread: don't assume AC_CANONICAL_HOST, port better to Solaris, etc.
> > pthread: define thread-safe macros on some platforms
> > regex: don't be multithreaded if USE_UNLOCKED_IO.
> > gettext: update macros to version 0.19
> > select,poll: fix console handle check on windows 8
> > select: fix waiting on anonymous pipes on MS-Windows
> > times: fix to return non constant value on MS-Windows
> > isatty: fix to work on windows 8
> > maint: fix typo in fdl.texi
> > mountlist: avoid hasmntopt const type warning on solaris
> > maintainer-makefile: delete obsolete code
> > maintainer-makefile: avoid spurious error messages
> > rename: avoid unused-but-set-variable compiler warning
> > maint: add ChangeLog entry missing in previous commit
> > rename: mark a label as potentially unused
> > gnulib-common.m4: Fix typo in _GL_UNUSED_LABEL.
> > acl: apply pure attribute to two functions
> > gnulib-common.m4: add _GL_UNUSED_LABEL
> > dup2, fcntl, fcntl-h: port to AIX 7.1
> > printf, config.rpath: Port to FreeBSD 10.
> > ftoastr: work around compiler bug in IBM xlc 12.1
> > valgrind-tests: fixed misleading help message
> > isfinite, isinf, isnan tests: fix for little-endian PowerPC
> > exclude-tests: port to AIX 7.1
> > pthread_sigmask, timer-time: use gl_THREADLIB only if needed
> > gnulib-tool: wget translations using --no-verbose rather than --quiet
> > gnulib-tool: adjust translation wget to avoid a https redirection
>
>
>
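
As an illustration of the GREP_OPTIONS note in the NEWS above, here is a minimal sketch of one possible replacement. It assumes a POSIX-style shell, and -i is only a stand-in for whatever options were previously set in GREP_OPTIONS. NEWS suggests an alias or script; a shell function is used here instead because bash does not expand aliases in non-interactive shells by default.

```shell
# Hypothetical replacement for:  export GREP_OPTIONS='-i'
# The option -i is an assumption chosen for illustration only.
grep() {
    # "command" bypasses this function and runs the real grep,
    # so the wrapper does not call itself recursively.
    command grep -i "$@"
}

# With the wrapper defined, a plain "grep" call picks up the default option.
printf 'Alpha\nbeta\n' | grep alpha
```

To bypass the wrapper for a single invocation, run "command grep ..." directly.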