Rational Range Interpretation patches, 2/3
From: Aharon Robbins
Subject: Rational Range Interpretation patches, 2/3
Date: Mon, 16 Jan 2012 22:25:41 +0200
User-agent: Heirloom mailx 12.4 7/29/08
>From 366cc2f4170f8dfbaa2137602e4ccc35e854766a Mon Sep 17 00:00:00 2001
From: Arnold D. Robbins <address@hidden>
Date: Mon, 16 Jan 2012 22:07:40 +0200
Subject: [PATCH 2/2] Document Rational Range Interpretation.
---
doc/grep.texi | 21 ++++++++++++++++-----
1 files changed, 16 insertions(+), 5 deletions(-)
diff --git a/doc/grep.texi b/doc/grep.texi
index de73d7f..dc27e52 100644
--- a/doc/grep.texi
+++ b/doc/grep.texi
@@ -939,9 +939,7 @@ They are omitted (i.e., false) by default and become true when specified.
@cindex character type
@cindex national language support
@cindex NLS
-These variables specify the locale for the @code{LC_COLLATE} category,
-which determines the collating sequence
-used to interpret range expressions like @samp{[a-z]}.
+These variables specify the locale for the @code{LC_COLLATE} category.
@item LC_ALL
@itemx LC_CTYPE
@@ -1202,7 +1200,12 @@ For example, the regular expression
Within a bracket expression, a @dfn{range expression} consists of two
characters separated by a hyphen.
It matches any single character that
-sorts between the two characters, inclusive, using the locale's
+sorts between the two characters, inclusive,
+using the machine's character set.
+
+Up to and including version 2.10 of @command{grep},
+range expressions would match any single character that sorted between
+the two characters, inclusive, using the current locale's
collating sequence and character set.
For example, in the default C
locale, @samp{[a-d]} is equivalent to @samp{[abcd]}.
@@ -1211,9 +1214,17 @@ characters in dictionary order, and in these locales @samp{[a-d]} is
typically not equivalent to @samp{[abcd]};
it might be equivalent to @samp{[aBbCcDd]}, for example.
To obtain the traditional interpretation
-of bracket expressions, you can use the @samp{C} locale by setting the
+of bracket expressions, it was necessary to use the @samp{C} locale
+by setting the
@env{LC_ALL} environment variable to the value @samp{C}.
+Since the current POSIX standard makes the behavior of range expressions
+implementation-defined, instead of requiring the locale's
+collating order, @command{grep} has reverted to the traditional Unix
+behavior of defining ranges based on the machine character set.@footnote{This
+is known as ``Rational Range Interpretation,'' a lovely phrase
+coined by Karl Berry.}
+
Finally, certain named classes of characters are predefined within
bracket expressions, as follows.
Their interpretation depends on the @code{LC_CTYPE} locale;
--
1.7.1
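
For readers who want to see the difference the patch documents, here is a minimal Python sketch. It is not part of the patch and is not how grep implements ranges. The function names and the `HYPOTHETICAL_COLLATION` string are made up for illustration; the collating order is chosen only so that it reproduces the `[aBbCcDd]` example from the manual text, and real locales may order characters differently.

```python
# Sketch only: contrasts code-point ranges with a made-up collating order.

def in_range_by_codepoint(ch, lo, hi):
    """Rational Range Interpretation: compare character code values."""
    return ord(lo) <= ord(ch) <= ord(hi)

# Hypothetical dictionary-style collating sequence (an assumption, not
# taken from any real locale definition).
HYPOTHETICAL_COLLATION = "AaBbCcDd"

def in_range_by_collation(ch, lo, hi, order=HYPOTHETICAL_COLLATION):
    """Older behavior: compare positions in the locale's collating order."""
    if ch not in order:
        return False
    return order.index(lo) <= order.index(ch) <= order.index(hi)

candidates = "AaBbCcDd"

# Which candidates would [a-d] match under each interpretation?
codepoint_matches = "".join(
    c for c in candidates if in_range_by_codepoint(c, "a", "d")
)
collation_matches = "".join(
    c for c in candidates if in_range_by_collation(c, "a", "d")
)

print(codepoint_matches)  # abcd
print(collation_matches)  # aBbCcDd
```

Under the code-point rule, `[a-d]` behaves like `[abcd]` regardless of locale. Under the assumed collating order it picks up the uppercase letters that sort between `a` and `d`.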