bug#18888: new snapshot available: grep-2.20.72-d512


From: Jim Meyering
Subject: bug#18888: new snapshot available: grep-2.20.72-d512
Date: Wed, 29 Oct 2014 11:35:48 -0700

FYI, just prior to making that snapshot, I pushed a change
that updated the gnulib submodule to the latest as of some time
yesterday, and also pulled in some small improvements to the
bootstrap script:

  http://git.sv.gnu.org/cgit/grep.git/commit/?id=d512007830d2c

On Wed, Oct 29, 2014 at 11:29 AM, Jim Meyering <address@hidden> wrote:
> Thanks to many fixes and improvements by Paul Eggert and Norihiro Tanaka,
> here is a pre-release snapshot:
>
> grep snapshot:
>   http://meyering.net/grep/grep-ss.tar.xz      1.2 MB
>   http://meyering.net/grep/grep-ss.tar.xz.sig
>   http://meyering.net/grep/grep-2.20.72-d512.tar.xz
>
> Here is the NEWS so far:
>
> ** Improvements
>
>   Performance has been greatly improved for searching files containing
>   holes, on platforms where lseek's SEEK_DATA flag works efficiently.
>
>   Performance has improved for rejecting data that cannot match even
>   the first part of a nontrivial pattern.
>
>   Performance has improved for very long strings in patterns.
>
>   If a file contains data improperly encoded for the current locale,
>   and this is discovered before any of the file's contents are output,
>   grep now treats the file as binary.
>
>   grep -P no longer reports an error and exits when given invalid UTF-8 data.
>   Instead, it considers the data to be non-matching.
>
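To try the hole-handling improvement listed first under Improvements, one can build a file that is mostly a hole and search it. This is only a sketch: the 16 MiB size, the temporary file, and the word "needle" are arbitrary choices for illustration, and whether the skipped range is really stored as a hole (and whether grep can skip it via SEEK_DATA) depends on the file system and platform.

```shell
# Build a file that is mostly one large gap followed by a short text line.
f=$(mktemp)

# Write a single zero byte at offset 16 MiB - 1; the range before it was
# never written, so the file system may store it as a hole.
dd if=/dev/zero of="$f" bs=1 count=1 seek=16777215 2>/dev/null
printf 'needle\n' >> "$f"

# -c prints the number of matching lines instead of the lines themselves.
count=$(grep -c needle "$f")
echo "$count"

rm -f "$f"
```

Timing the grep command on a much larger file, with this snapshot and with grep-2.20, is one way to see whether the platform benefits.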
> ** Bug fixes
>
>   grep no longer mishandles patterns that contain \w or \W in multibyte
>   locales.
>
>   grep would fail to count newlines internally when operating in non-UTF8
>   multibyte locales, leading it to print potentially many lines that did
>   not match.  E.g., the command, "seq 10 | env LC_ALL=zh_CN src/grep -n .."
>   would print this:
>   1:1
>   2
>   3
>   4
>   5
>   6
>   7
>   8
>   9
>   10
>   implying that the match, "10" was on line 1.
>   [bug introduced in grep-2.19]
>
>   grep in a non-UTF8 multibyte locale could mistakenly match in the middle
>   of a multibyte character when using a '^'-anchored alternate in a pattern,
>   leading it to print non-matching lines.  [bug present since "the beginning"]
>
>   grep -E rejected unmatched ')', instead of treating it like '\)'.
>   [bug present since "the beginning"]
>
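The last bug fix above (unmatched ')' with -E) can be checked with a small script. This is a sketch with made-up sample data; it reports what the installed grep does rather than assuming the fix is present, since an older or non-GNU grep may still reject the pattern.

```shell
sample=$(mktemp)
printf 'f(x)\nplain\n' > "$sample"

# Escaped form: '\)' is a literal ')' in extended regular expressions.
escaped=$(grep -E 'x\)' "$sample")
echo "$escaped"

# Unescaped, unmatched ')': per the NEWS entry this snapshot treats it
# like '\)'. Other greps may diagnose an error, so handle both outcomes.
if unescaped=$(grep -E 'x)' "$sample" 2>/dev/null); then
  echo "$unescaped"
else
  echo "this grep does not accept an unmatched ')'"
fi

rm -f "$sample"
```

Scripts that must run with older greps are safer using the escaped form.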
> ** Changes in behavior
>
>   The GREP_OPTIONS environment variable is now obsolescent, and grep
>   now warns if it is used.  Please use an alias or script instead.
>
>   In locales with multibyte character encodings other than UTF-8,
>   grep -P now reports an error and exits instead of misbehaving.
>
>   When searching binary data, grep now may treat non-text bytes as
>   line terminators.  This can boost performance significantly.
>
>   grep -z no longer automatically treats the byte '\200' as binary data.
> ====================================================
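For the GREP_OPTIONS change, the NEWS suggests an alias or script. The sketch below uses a shell function instead, because aliases are not expanded in non-interactive bash scripts; in an interactive startup file, "alias grep='grep -i'" would serve the same purpose. The -i option is only a stand-in for whatever was previously set in GREP_OPTIONS.

```shell
# Old style, which now triggers a warning:
#   export GREP_OPTIONS='-i'
unset GREP_OPTIONS

# Wrapper function. "command" bypasses the function itself, so the real
# grep is run and the wrapper does not recurse.
grep() {
  command grep -i "$@"
}

result=$(printf 'Hello\nworld\n' | grep hello)
echo "$result"
```

Unlike GREP_OPTIONS, a wrapper like this affects only the shell that defines it, not every program that happens to run grep.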
>
> Changes in grep since v2.20:
>
> Jim Meyering (13):
>       maint: post-release administrivia
>       build: don't redirect directly to $@
>       build: improve rule to generate egrep+fgrep scripts
>       maint: generate distributed THANKS from VC'd THANKS.in
>       doc: update HACKING
>       maint: split long lines, and enforce the 80-column limit
>       maint: avoid distcheck failure
>       tests: add expect-to-fail test for a glibc regexp bug
>       doc: move NEWS note about GREP_OPTIONS into proper section
>       maint: suppress a false-positive -Wcast-align warning
>       grep: avoid stack buffer read-underrun and overrun
>       tests: make new test script executable
>       gnulib: update to latest; bootstrap, too
>
> Norihiro Tanaka (13):
>       dfa: speed-up at initial state
>       dfa: separate dfaexec function to help optimization by compiler
>       grep: fix subscript error when testing whether empty lines match
>       dfa: check end of input buffer after transition in non-UTF8 multibyte locale
>       dfa: factor out a new nontrivial block of duplicated code
>       dfa: test for just-fixed bug
>       dfa: fix a theoretical bug
>       grep: initialize validation_boundary properly before use
>       dfa: process all MBCSET constructs via glibc's matcher
>       dfa: remove two erroneous clauses from a now-unused function
>       tests: add test for grep -P fix
>       dfa: avoid false match in a non-UTF8 multibyte locale
>       dfa: make \w and \W work in multibyte locales
>
> Paul Eggert (46):
>       build: update gnulib submodule to latest
>       grep: use system strstr if available and fast
>       grep: undo part of previous change
>       doc: use gnulib fdl module
>       maint: remove grep.spec
>       build: don't make output files read-only
>       build: avoid -Wstack-protector
>       grep: with -E, unmatched ')' matches itself
>       doc: Document -r vs --exclude more carefully.
>       doc: prefer @env to @code
>       doc: document LANGUAGE
>       grep: fix integer-width bugs in undossify_input etc.
>       grep: -P now treats invalid UTF-8 input as non-matching
>       grep: port recent fix to older pcre version
>       grep: fix false matches with -P '...$' and invalid UTF-8
>       grep: fix false matches with -P '...$' and invalid UTF-8
>       doc: bug tracker has moved to debbugs.gnu.org
>       grep: make GREP_OPTIONS obsolescent
>       grep: diagnose -P in non-UTF-8 multibyte locale
>       grep: remove/refactor unnecessary code about line splitting
>       grep: speed up -P on files containing many multibyte errors
>       grep: use bool for boolean in grep.c
>       grep: treat a file as binary if its prefix contains encoding errors
>       grep: improve performance for older glibc
>       grep: use mbclen cache more effectively
>       grep: avoid false alarms for mb_clen and to_uchar
>       grep: use mbclen cache in one more place
>       grep: port -P speedup to hosts lacking PCRE_STUDY_JIT_COMPILE
>       grep: fix -P speedup bug with empty match
>       grep: refactor binary-vs-unknown-vs-text flags for clarity
>       grep: -z no longer considers '\200' to be binary data
>       grep: non-text bytes in binary data may be treated as line ends
>       grep: minor -P speedup with jit_stack
>       grep: improve -P performance in typical cases
>       grep: skip past holes efficiently
>       grep: port to platforms lacking SEEK_DATA
>       grep: speed up processing of holes before EOF on Solaris
>       grep: scan for valid multibyte strings more quickly
>       grep: don't check extensively for invalid prefix bytes unless -P
>       maint: generalize the -Wcast-align fix
>       dfa: minor tweaks, mostly to remove __attribute__ ((noinline))
>       doc: clarify exit status
>       doc: modernize and simplify man page
>       grep: fix off-by-one bug in -P optimization
>       grep: fix grep -P crash
>       tests: work around older libpcre bugs when testing -P and UTF-8
>
>
> Changes in gnulib since v2.20:
>
> * gnulib 98ca2c0...8415b67 (95):
>   > socketlib, sockets, sys_socket: Use AC_REQUIRE to pacify autoconf.
>   > iconv: avoid false detection of non-working iconv
>   > bootstrap: print more diagnostics for missing programs
>   > bootstrap: only update the gnulib submodule
>   > symlinkat: port to AIX 7.1
>   > readlinkat: port to AIX 7.1
>   > remove spurious {
>   > modules/fcntl: fix error reporting by dupfd
>   > basename, dirname: Improve documentation.
>   > exclude: declare exclude_patopts static
>   > autoupdate
>   > dirname: support compilation with C++
>   > qsort_r: include <config.h>
>   > avltree-list: avoid compiler warnings
>   > qsort_r: new module, for GNU-style qsort_r
>   > strerror_r-posix: support compilation with C++
>   > fcntl-h: fix compilation with Intel C++ compiler
>   > autoupdate
>   > mountlist: use /proc/self/mountinfo when available
>   > users.txt: add cmogstored
>   > gnulib-tool: Sync with build-aux/bootstrap options
>   > gnulib-tool: Fallback to wget when rsync fails
>   > maintainer-makefile: add syntax check for useless ';;'
>   > pthread, pthread_sigmask, threadlib: port to Ubuntu 14.04
>   > error: drop spurious semicolon
>   > gnulib-common.m4: port to GCC 4.2.1 and Sun Studio 12 C++
>   > manywarnings: add GCC 4.9 warnings
>   > vasnprintf: fix bugs in width computation
>   > vasnprintf: Avoid signed/unsigned comparison warning.
>   > parse-datetime: Avoid signed/unsigned comparison warning
>   > qsort_r: new module, for GNU-style qsort_r
>   > vla: new module
>   > localename: make gl_locale_name_thread really thread-safe on Windows
>   > getpass: don't assume struct termios
>   > getdtablesize: fall back on sysconf (_SC_OPEN_MAX)
>   > vararrays: modernize AC_C_VARARRAYS for C11
>   > relocatable-prog-wrapper: port gettext to OS X 10.8 + GCC 4.8.1
>   > sys_select: fix FD_ZERO problem on Solaris 10
>   > accept: document Solaris 10 type glitch
>   > extern-inline: port to FreeBSD, DragonFly
>   > autoupdate
>   > Use consistent style to check DEBUG macro in regex_internal.c
>   > openat-die: use _Noreturn markup
>   > test-open: port to cygwin, which lacks Fortify
>   > localename: Enforce declarations before statements.
>   > test-userspec: don't look up numeric user names
>   > localcharset, localename: MS-Windows support for non-default locales
>   > announce-gen: avoid failure when Digest::SHA is installed
>   > gettext: revert "update macros to version 0.19"
>   > regex: don't deref NULL upon heap allocation failure
>   > maint.mk: give projects more flexibilty in set_prog_name arguments
>   > regex: fix memory leak in compiler
>   > announce-gen: avoid perl warnings
>   > localename: avoid -Wsuggest-attribute={const,pure} warnings
>   > nl_langinfo: Fix last change.
>   > Define macros for glibc
>   > Sync up error.c with glibc
>   > nl_langinfo: fix build under mingw
>   > mountlist: do not classify a bind-mounted dir entry as "dummy"
>   > maint.mk: less syntax-check noise when SIGPIPE is ignored
>   > nl_langinfo: CODESET on MS-Windows and more items from localeconv
>   > Bruno Haible has stepped down as maintainer.
>   > mktime: merge #if/#ifdef usage from glibc
>   > git-version-gen: improve option descriptions
>   > regex: fix memory leak in compiler
>   > regex: merge patch from libc
>   > acl: port to gcc -Wredundant-decls
>   > parse-duration: eliminate 68-year duration limit
>   > pthread: don't assume AC_CANONICAL_HOST, port better to Solaris, etc.
>   > pthread: define thread-safe macros on some platforms
>   > regex: don't be multithreaded if USE_UNLOCKED_IO.
>   > gettext: update macros to version 0.19
>   > select,poll: fix console handle check on windows 8
>   > select: fix waiting on anonymous pipes on MS-Windows
>   > times: fix to return non constant value on MS-Windows
>   > isatty: fix to work on windows 8
>   > maint: fix typo in fdl.texi
>   > mountlist: avoid hasmntopt const type warning on solaris
>   > maintainer-makefile: delete obsolete code
>   > maintainer-makefile: avoid spurious error messages
>   > rename: avoid unused-but-set-variable compiler warning
>   > maint: add ChangeLog entry missing in previous commit
>   > rename: mark a label as potentially unused
>   > gnulib-common.m4: Fix typo in _GL_UNUSED_LABEL.
>   > acl: apply pure attribute to two functions
>   > gnulib-common.m4: add _GL_UNUSED_LABEL
>   > dup2, fcntl, fcntl-h: port to AIX 7.1
>   > printf, config.rpath: Port to FreeBSD 10.
>   > ftoastr: work around compiler bug in IBM xlc 12.1
>   > valgrind-tests: fixed misleading help message
>   > isfinite, isinf, isnan tests: fix for little-endian PowerPC
>   > exclude-tests: port to AIX 7.1
>   > pthread_sigmask, timer-time: use gl_THREADLIB only if needed
>   > gnulib-tool: wget translations using --no-verbose rather than --quiet
>   > gnulib-tool: adjust translation wget to avoid a https redirection
>
>
>




