bug-gnulib

[PATCH] maint.mk: prohibit doubled words


From: Jim Meyering
Subject: [PATCH] maint.mk: prohibit doubled words
Date: Mon, 11 Apr 2011 10:35:44 +0200

I have no illusions that this is complete,
but it should be good enough for most packages.

From e30db8463f31e860c02b1bd95cb1d0e8bd8b3263 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Sun, 10 Apr 2011 10:26:46 +0200
Subject: [PATCH] maint.mk: prohibit doubled words

Detect them also when they're separated by a newline.
There are 3 ways to customize it:
  - disable the test on a per file basis, as usual with rules using
    $(VC_LIST_EXCEPT)
  - replace the default doubled-word-selecting regexp (affects all files)
  - ignore a particular file-vs-doubled-word match
I nearly used that last one to ignore the "is is" match in
coreutils' NEWS file, since the text was "ls -is is ..."
To do that, I would have added this line to cfg.mk:
  ignore_doubled_word_match_RE_ = ^NEWS:[0-9]+:is[ ]is$
but it would have ignored any "is is" match in NEWS.
Low probability, but still...
Instead, I changed the text, slightly:
  -  ls -is is now consistent with ls -lis in ignoring values returned
  +  "ls -is" is now consistent with ls -lis in ignoring values returned
* top/maint.mk (prohibit_doubled_word_RE_): Provide default.
(prohibit_doubled_word_): Define.
(sc_prohibit_doubled_word): New rule.
(sc_prohibit_the_the): Remove.  Subsumed by the above.
---
 ChangeLog    |   23 +++++++++++++++++++++++
 top/maint.mk |   23 +++++++++++++++++++----
 2 files changed, 42 insertions(+), 4 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 110cee5..365dd2f 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,28 @@
 2011-04-10  Jim Meyering  <address@hidden>

+       maint.mk: prohibit doubled words
+       Detect them also when they're separated by a newline.
+       There are 3 ways to customize it:
+         - disable the test on a per file basis, as usual with rules using
+           $(VC_LIST_EXCEPT)
+         - replace the default doubled-word-selecting regexp (affects all files)
+         - ignore a particular file-vs-doubled-word match
+       I nearly used that last one to ignore the "is is" match in
+       coreutils' NEWS file, since the text was "ls -is is ..."
+       To do that, I would have added this line to cfg.mk:
+         ignore_doubled_word_match_RE_ = ^NEWS:[0-9]+:is[ ]is$
+       but it would have ignored any "is is" match in NEWS.
+       Low probability, but still...
+       Instead, I changed the text, slightly:
+         -  ls -is is now consistent with ls -lis in ignoring values returned
+         +  "ls -is" is now consistent with ls -lis in ignoring values returned
+       * top/maint.mk (prohibit_doubled_word_RE_): Provide default.
+       (prohibit_doubled_word_): Define.
+       (sc_prohibit_doubled_word): New rule.
+       (sc_prohibit_the_the): Remove.  Subsumed by the above.
+
+2011-04-10  Jim Meyering  <address@hidden>
+
        maint: fix doubled-word typo in comment
        * m4/gethostname.m4: s/is is/it is/
        * m4/getdomainname.m4: Likewise.
diff --git a/top/maint.mk b/top/maint.mk
index ada00be..07a7773 100644
--- a/top/maint.mk
+++ b/top/maint.mk
@@ -841,10 +841,25 @@ sc_prohibit_S_IS_definition:
        halt='do not define S_IS* macros; include <sys/stat.h>'         \
          $(_sc_search_regexp)

-sc_prohibit_the_the:
-       @prohibit='\<the[ ]the\>'                                       \
-       halt='avoid double "the"'                                       \
-         $(_sc_search_regexp)
+prohibit_doubled_word_RE_ ?= \
+  /\b(then?|[iao]n|i[fst]|but|f?or|at|and|[dt]o)\s+\1\b/gims
+prohibit_doubled_word_ =                                               \
+    -e 'while ($(prohibit_doubled_word_RE_))'                          \
+    -e '  {'                                                           \
+    -e '    $$n = ($$` =~ tr/\n/\n/ + 1);'                             \
+    -e '    ($$v = $$&) =~ s/\n/\\n/g;'                                \
+    -e '    print "$$ARGV:$$n:$$v\n";'                                 \
+    -e '  }'
+
+# Define this to a regular expression that matches
+# any filename:dd:match lines you want to ignore.
+# The default is to ignore no matches.
+ignore_doubled_word_match_RE_ ?= ^$$
+
+sc_prohibit_doubled_word:
+       @perl -n -0777 $(prohibit_doubled_word_) $$($(VC_LIST_EXCEPT))  \
+         | grep -vE '$(ignore_doubled_word_match_RE_)'                 \
+         | grep . && { echo '$(ME): doubled words' 1>&2; exit 1; } || :

 sc_prohibit_can_not:
        @prohibit='\<can[ ]not\>'                                       \
--
1.7.5.rc1.228.g86d60b