Plain text matching with regex: RE_PLAIN
From: Reuben Thomas
Subject: Plain text matching with regex: RE_PLAIN
Date: Wed, 15 Sep 2010 01:27:17 +0100
With my "maintainer of GNU Zile" hat on, I was improving the searching
code recently, and a thought struck me which has often struck me
before: the code would be much simpler if I could do non-regex
searches using the regex APIs. In particular, I had written (simple)
text searching routines, and had both to maintain them and to decide,
for each search, whether to use them or regex.
Since I have recently been looking at the regex code quite a lot, I
thought this time I would see how easy it might be to implement.
The answer is: very easy indeed.
The attached patch adds a new syntax flag, RE_PLAIN. Although the
patch looks quite long at first glance, it is mostly reindentation.
The code consists of a #define of RE_PLAIN (thus reducing the number
of spare flag bits on a 32-bit machine from 6 to 5, but I think this
is a justifiable use of a flag), and two changes to the function
peek_token in regcomp.c, one to add a test of RE_PLAIN to an existing
if, and another to wrap an entire block in such a test. The idea is
very simple: when RE_PLAIN is used, the parser is prevented from
assigning any type other than CHARACTER to a token, and no parsing
routine beyond peek_token is ever called.
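To illustrate how a caller might use the flag, here is a minimal sketch in C. It assumes a regex.h that carries the proposed patch; RE_PLAIN is not part of a stock regex.h, so the sketch falls back to strstr when the macro is missing. The helper name plain_search is hypothetical and not part of the patch.

```c
#define _GNU_SOURCE             /* expose the GNU regex API in <regex.h> */
#include <assert.h>
#include <regex.h>
#include <string.h>

/* Hypothetical helper: return the offset of the first occurrence of
   PATTERN in TEXT, treating every character of PATTERN literally,
   or -1 if there is no match.  */
static long
plain_search (const char *pattern, const char *text)
{
  assert (pattern != NULL && text != NULL);
#ifdef RE_PLAIN
  /* Assumption: regex.h includes the proposed RE_PLAIN syntax flag.  */
  struct re_pattern_buffer buf;
  memset (&buf, 0, sizeof buf);

  /* The syntax is a global setting, so restore it after compiling.  */
  reg_syntax_t old_syntax = re_set_syntax (RE_PLAIN);
  const char *err = re_compile_pattern (pattern, strlen (pattern), &buf);
  re_set_syntax (old_syntax);
  if (err != NULL)
    return -1;

  long pos = re_search (&buf, text, strlen (text), 0, strlen (text), NULL);
  regfree (&buf);
  return pos < 0 ? -1 : pos;
#else
  /* No RE_PLAIN available: use an ordinary substring search instead.  */
  const char *hit = strstr (text, pattern);
  return hit != NULL ? (long) (hit - text) : -1;
#endif
}
```

With either branch, plain_search ("a.c", "abc a.c") should return 4 rather than 0, because the "." is matched as a literal character and not as a wildcard. The point of the flag is that the #else branch, and the decision about when to take it, would no longer be needed in the caller.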
I saved about 50 lines of C in Zile for the cost of these 3, which
seems like a good start...
If this feature is approved, I would of course write a documentation
patch (for regex.texi; my patch already includes documentation in
regex.h) to go with my code patch.
There is one other potential advantage to adopting this patch, even if
it is a rather odd one: currently, gnulib uses a small set of tests to
determine whether or not to use the system regex. With the addition of
this new feature, this set of tests could be replaced by a simple test
for RE_PLAIN. (Of course, as and when further bugs are fixed, it would
be desirable to add more tests, but this new feature provides a nice
epoch.)
I have written some autoconf code to test for this feature which can
already be used, based on the existing test, thus:
dnl If system lacks RE_PLAIN, force --with-included-regex.
dnl This is a compile-only check, so it also works when cross-compiling;
dnl AC_COMPILE_IFELSE takes no separate cross-compiling argument.
AC_MSG_CHECKING([whether system regex.h has RE_PLAIN])
AC_COMPILE_IFELSE(
  [AC_LANG_PROGRAM(
    [AC_INCLUDES_DEFAULT[
#include <regex.h>
    ]],
    [[reg_syntax_t syn = RE_PLAIN;]])],
  [AC_MSG_RESULT([yes])],
  [AC_MSG_RESULT([no])
   with_included_regex=yes])
In GNU Zile's configure.ac, I place this code directly before gl_INIT,
as gl_INIT runs the code that decides whether to use the system regex
or gnulib's copy.
--
http://rrt.sc3d.org
Attachment: 0010-Add-RE_PLAIN-flag-to-match-plain-text-patterns.patch