Re: [lmi] dash print vs. printf


From: Vadim Zeitlin
Subject: Re: [lmi] dash print vs. printf
Date: Thu, 18 Apr 2019 00:36:12 +0200

On Wed, 17 Apr 2019 22:10:05 +0000 Greg Chicares <address@hidden> wrote:

GC> Vadim--Somehow I thought 'print' was a shell builtin like 'printf',
GC> which might be preferable if no particular formatting is wanted.
GC> That's true with zsh...

 Yes, but I think zsh is really just an outlier here. Neither dash nor bash
has a print built-in. Moreover, according to the table in 1(b) at

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01_01

"print" is actually in a select group of commands for which the results of
execution are unspecified (!). I have no idea why this should be the case,
but it definitely doesn't fill me with confidence about it.
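
 To see how a particular shell treats these names, a quick check along the
following lines may help. It is only a sketch: "resolves" is a hypothetical
helper name, and "command -v" merely reports whether the shell found some
definition (built-in, function, alias or file in PATH), not which kind.

```shell
# Hypothetical helper: report whether the current shell can resolve a name.
resolves() {
    if command -v "$1" >/dev/null 2>&1; then
        printf '%s: found\n' "$1"
    else
        printf '%s: not found\n' "$1"
    fi
}

resolves printf
resolves print
```

 The answer for "print" varies with the shell and with whatever happens to
be installed in PATH, which is exactly the problem.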

GC> Let me ask a few questions about this.
GC> 
GC> (1) I tried in vain to find out whether POSIX requires 'print'
GC> to be a built-in, but there seem to be three categories:
GC>  - "special built-ins", like 'exec' or 'export', which are
GC>    prescribed to be built in;

 And also are really special as described in the beginning of 2.14 at

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_14

GC>  - utilities that by their nature must be built in, like 'fg'
GC>    or 'cd'; and
GC>  - other "regular built-ins", for which AFAICT POSIX doesn't
GC>    prescribe anything: they can be a large set, or an empty set.
GC> Have I understood that correctly?

 I don't think there is any distinction between these 2 classes, e.g. the
table in 1.6 of

https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html#tagtcjh_18

includes both "fg" and "cd", even though I have a lot of trouble seeing how
"cd" could be implemented as anything other than a built-in. Yet the text at
the link above appears to say that it could well be. But I'm neither a shell
nor a POSIX lawyer, so I could well be missing something here.

 BTW, something that I've just discovered while writing this answer is that
POSIX _requires_ having external versions of all non-special built-ins as
they are supposed to be executable via exec(). And in fact a
POSIX-compliant shell is supposed to deactivate its support for its
built-in (such as printf) if an external command with the same name is not
found. However none of the common shells works like this and both dash and
zsh explicitly document this deviation from POSIX.
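
 One way to reach the external version explicitly is to go through "env",
which runs its operand as a program looked up in PATH. A minimal sketch,
assuming that env is itself an external utility (not a shell built-in) and
that an external printf is installed; "external_printf" is a hypothetical
name used only for illustration:

```shell
# Hypothetical wrapper: run the printf found in PATH rather than the shell's
# built-in. Assumes 'env' is an external program, so the built-in is bypassed.
external_printf() {
    env printf "$@"
}

external_printf '%s\n' 'printed by the external printf'
```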

GC> (2) What practical rule should be followed?

 I believe the advice remains to use printf for maximum portability.

GC> I used to write
GC> 'echo' everywhere. Then I changed most occurrences of 'echo' in
GC> scripts and makefile recipes to 'printf' or 'print'. Is the best
GC> practice to use 'printf' as the only replacement for 'echo', and
GC> never 'print'?

 In short, yes, I think this is exactly right. In fact, I really don't know
where the idea of using print came from.
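
 As an illustration of printf as the only replacement for echo, here is a
minimal sketch. "say" is a hypothetical helper name, not something from the
lmi scripts. Passing the text as an argument to a fixed '%s\n' format means
that neither backslashes nor a leading dash in the text get interpreted.

```shell
# Hypothetical helper: print all arguments separated by spaces, followed by
# a newline. The text is data for %s, never part of the format string.
say() {
    printf '%s\n' "$*"
}

say 'a\tb'      # the backslash sequence is printed literally
say -n          # prints "-n" rather than treating it as an option
say one two
```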

GC> If so, then...how can I be sure 'printf' won't
GC> surprise me someday as 'print' has today? Is the answer that
GC> POSIX prescribes a 'printf' command:
GC>   https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html
GC> so that's safe; but it simply doesn't mention any 'print'
GC> command, so 'print' is allowed to surprise me in a POSIX shell?

 Again, yes, I think so.
 
GC> (3) I'll just accept that '/usr/bin/print' is some mailcap thing
GC> that enables email attachments to do things that I probably don't
GC> want them to do anyway. But there's that distressing series of
GC> locale diagnostics, which I often see when I run 'apt-get upgrade'
GC> as well. Overriding only the ones that are missing lets the
GC> command complete, at least:
GC> 
GC> $ LANGUAGE="$LANG" LC_ALL="$LANG" /usr/bin/print --help
GC> perl: warning: Setting locale failed.
GC> perl: warning: Please check that your locale settings:
GC>         LANGUAGE = "en_US.UTF-8",
GC>         LC_ALL = "en_US.UTF-8",
GC>         LC_TIME = "en_DK.UTF-8",
GC>         LC_COLLATE = "C.UTF-8",
GC>         LANG = "en_US.UTF-8"
GC>     are supported and installed on your system.
GC> perl: warning: Falling back to the standard locale ("C").
GC> Use: /usr/bin/print <--action=VAL> [--debug] [MIME-TYPE:[ENCODING:]]FILE 
[...]
GC> 
GC> ...and, better, if I instead follow your advice here:
GC> 
GC>   https://lists.nongnu.org/archive/html/lmi/2018-07/msg00018.html
GC> | Presumably, if you changed it to C.UTF-8, this warning would disappear.
GC> 
GC> ...then the unwelcome diagnostics disappear:
GC> 
GC> $ LANGUAGE=C.UTF-8 LC_ALL=C.UTF-8 /usr/bin/print --help
GC> Use: /usr/bin/print <--action=VAL> [--debug] [MIME-TYPE:[ENCODING:]]FILE 
[...]
GC> 
GC> I'll try that sort of override next time I 'apt-get upgrade'.
GC> Or...assuming that this is a perl-specific problem

 No, it isn't. Perl just calls setlocale() and checks for an error, which
is the right thing to do (rather than silently ignoring it).

GC> there a perl-specific solution, like
GC>   export PERL_USE_DEFAULT_LOCALE=1
GC> ?

 No, I'm not aware of anything like this. The only solutions I can think of
are:

1. Just use C.UTF-8 locale.
2. Ensure that your system supports en_US.UTF-8 and en_DK.UTF-8 locales.

I guess you are reluctant to just do (1) as otherwise you would have done
it a long time ago, but I'm not sure what prevents you from doing (2). Or
do you mean that your system already does support these locales (i.e. they
both appear in "locale -a" output), yet setting them from Perl still fails?
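
 Checking (2) can be sketched as follows. Both function names are
hypothetical, and the normalization is an assumption about how systems spell
the codeset: "locale -a" may list "en_US.utf8" where the environment says
"en_US.UTF-8", so names are compared in lower case with hyphens removed.

```shell
# Hypothetical helper: lower-case a locale name and drop hyphens, so that
# "en_US.UTF-8" and "en_US.utf8" compare equal.
normalize_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g'
}

# Hypothetical helper: succeed if the locale "$1" appears in 'locale -a'.
locale_available() {
    wanted=$(normalize_locale "$1")
    locale -a 2>/dev/null \
        | tr '[:upper:]' '[:lower:]' \
        | sed 's/-//g' \
        | grep -Fqx -- "$wanted"
}

for loc in en_US.UTF-8 en_DK.UTF-8 C.UTF-8; do
    if locale_available "$loc"; then
        printf '%s: available\n' "$loc"
    else
        printf '%s: missing\n' "$loc"
    fi
done
```

 Any locale reported as missing here would be expected to make setlocale()
fail in the same way, whichever program calls it.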

 Sorry if I'm forgetting something here, I know we've discussed this in the
past, but I don't remember the details and I couldn't find anything
relevant in my mailbox (which really needs a better search...).

VZ

