Re: Portability problems of "Usual Tools" not described in manual
From: Eric Blake
Subject: Re: Portability problems of "Usual Tools" not described in manual
Date: Tue, 17 Mar 2009 19:34:44 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.19) Gecko/20081209 Thunderbird/2.0.0.19 Mnenhy/0.7.6.666
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
According to Eric Blake on 3/17/2009 8:43 AM:
> Thanks for the report. I would like to verify these claims before
> applying any patches, but agree that they are probably worth mentioning.
>
>> 1. sed behaves entirely unpredictably on lines that are not
>> newline-terminated.
>
> Confirmed. POSIX states that sed is only required to operate on text
> files, and also that a file without a trailing newline is not a text file.
This has actually already been done:
http://git.savannah.gnu.org/gitweb/?p=autoconf.git;a=commitdiff;h=b2bde72
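For scripts that cannot guarantee newline-terminated input, one possible workaround is to append the missing newline before sed ever sees the data. This is a minimal sketch and not taken from the commit above: the helper name ensure_trailing_newline is hypothetical, and it assumes a tail that supports -c, which some older systems may lack.

```shell
# Hypothetical helper: print the contents of file $1, adding a final
# newline only when the last byte is not already one, so that sed is
# handed a valid text file.
ensure_trailing_newline () {
  cat "$1"
  # Command substitution strips a trailing newline, so the result is
  # empty when the file already ends in one (or is empty).
  if test -n "$(tail -c 1 "$1")"; then
    echo
  fi
}
```

Usage would look like `ensure_trailing_newline input.txt | sed 's/foo/bar/'`, where input.txt is a placeholder file name.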
>> 2. On HP-UX 11.23, regexp matching with expr does not allow multiple sub-
>> expressions:
>
>> bash-3.1$ expr 'Xfoo' : 'X\(f\(oo\)*\)$'
>> expr: More than one '\(' was used.
>
> Ouch. I don't have access to HP-UX to verify, but this means we need to
> audit autoconf source to make sure we don't violate this restriction.
>
>> 3. On GNU/Linux the regexp "$", when used with older versions of expr,
>> matches newlines embedded in the match string:
>
>> bash-3.1$ baz='foo
>> > bar'
>> bash-3.1$ expr "X$baz" : 'X\(foo\)$' || echo baz
>> foo
>
> I'm assuming this was from an older version of coreutils? Can someone
> determine 'expr --version' in the broken case, to see when it was fixed?
Both expr issues are mentioned in this patch, now applied:
- --
Don't work too hard, make some time for fun as well!
Eric Blake address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAknAT7QACgkQ84KuGfSFAYCb8ACfdcGsMmGLQ/QKlXoBbclnhVAr
X6oAoMP2yNoZyxlxLP7eqWzbZqWch0K1
=16+4
-----END PGP SIGNATURE-----
From abee382683d1b977f2ab4a91121b4277045e6d5a Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Tue, 17 Mar 2009 19:33:08 -0600
Subject: [PATCH] Manual: mention more expr pitfalls.
* doc/autoconf.texi (Limitations of Usual Tools) <expr (:)>:
Mention HP-UX limitation, and $ ambiguity.
* THANKS: Update.
Reported by Jens Schmidt, in http://bugs.debian.org/466990.
Signed-off-by: Eric Blake <address@hidden>
---
ChangeLog | 8 ++++++++
THANKS | 1 +
doc/autoconf.texi | 23 +++++++++++++++++++++++
3 files changed, 32 insertions(+), 0 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index 259004e..3829924 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,11 @@
+2009-03-18 Eric Blake <address@hidden>
+
+ Manual: mention more expr pitfalls.
+ * doc/autoconf.texi (Limitations of Usual Tools) <expr (:)>:
+ Mention HP-UX limitation, and $ ambiguity.
+ * THANKS: Update.
+ Reported by Jens Schmidt, in http://bugs.debian.org/466990.
+
2009-03-17 Jim Meyering <address@hidden>
Manual: fix a typo.
diff --git a/THANKS b/THANKS
index c3e19a2..8d8bb37 100644
--- a/THANKS
+++ b/THANKS
diff --git a/doc/autoconf.texi b/doc/autoconf.texi
index a0a19b8..a4cb0d1 100644
--- a/doc/autoconf.texi
+++ b/doc/autoconf.texi
@@ -16642,6 +16642,21 @@ Limitations of Usual Tools
@samp{^}. Patterns are automatically anchored so leading @samp{^} is
not needed anyway.
+On the other hand, the behavior of the @samp{$} anchor is not portable
+on multi-line strings. Posix is ambiguous whether the anchor applies to
+each line, as was done in older versions of @acronym{GNU} Coreutils, or
+whether it applies only to the end of the overall string, as in
+Coreutils 6.0 and most other implementations.
+
+@example
+$ @kbd{baz='foo}
+> @kbd{bar'}
+$ @kbd{expr "X$baz" : 'X\(foo\)$'}
+
+$ @kbd{expr-5.97 "X$baz" : 'X\(foo\)$'}
+foo
+@end example
+
The Posix standard is ambiguous as to whether
@samp{expr 'a' : '\(b\)'} outputs @samp{0} or the empty string.
In practice, it outputs the empty string on most platforms, but portable
@@ -16718,6 +16733,14 @@ Limitations of Usual Tools
1
@end example
+On @acronym{HP-UX} 11, @command{expr} only supports a single
+sub-expression.
+
+@example
+$ @kbd{expr 'Xfoo' : 'X\(f\(oo\)*\)$'}
+expr: More than one '\(' was used.
+@end example
+
@item @command{fgrep}
@c ------------------
--
1.6.1.2
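
The $ ambiguity documented in the patch above can be sidestepped by rejecting multi-line strings before expr is called, so the result no longer depends on how a given expr treats $ at an embedded newline. This is a minimal sketch and not part of the patch; match_foo is a hypothetical helper name, and the pattern is the one from the example.

```shell
# A literal newline, for use in the case pattern below.
nl='
'

# Hypothetical helper: print "foo" and succeed only when $1 is exactly
# "foo".  Strings containing a newline are rejected up front, before
# expr gets a chance to interpret "$" per line.
match_foo () {
  case $1 in
    *"$nl"*) return 1 ;;
    # Single-line input: expr itself reports success or failure.
    *) expr "X$1" : 'X\(foo\)$' ;;
  esac
}
```

With baz set to the two-line string from the example, `match_foo "$baz" || echo baz` takes the failure branch without ever invoking expr.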