bug#65659: RFC: changing printf(1) behavior on %b
From: Eric Blake
Subject: bug#65659: RFC: changing printf(1) behavior on %b
Date: Thu, 31 Aug 2023 10:35:59 -0500
User-agent: NeoMutt/20230517
In today's Austin Group call, we discussed the fact that printf(1) has
mandated behavior for %b (escape sequence processing similar to XSI
echo) that will eventually conflict with C2x's desire to introduce %b
to printf(3) (to produce 0b000... binary literals).
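For readers less familiar with the current behavior, here is a minimal sketch of what %b does in printf(1) today, compared with %s. The variable names are only illustrative.

```shell
# %s prints its argument verbatim; %b expands backslash escapes in it.
arg='one\ttwo\n'                 # single quotes keep the backslashes literal

plain=$(printf '%s' "$arg")      # still contains the two-character sequences \t and \n
expanded=$(printf '%b' "$arg")   # \t becomes a tab; $(...) strips the trailing newline

printf 'with %%s: [%s]\n' "$plain"
printf 'with %%b: [%s]\n' "$expanded"
```

Under the C2x meaning, %b in printf(3) would instead format an integer argument in binary, which is where the two definitions collide.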
For POSIX Issue 8, we plan to mark the current semantics of %b in
printf(1) as obsolescent (it would continue to work, because Issue 8
targets C17 where there is no conflict with C2x), but with a Future
Directions note that for Issue 9, we could remove %b entirely, or
(more likely) make %b output binary literals just like C. But that
raises the question of whether the escape-sequence processing
semantics of %b should still remain available under the standard,
under some other spelling, since relying on XSI echo is still not
portable.
One of the observations made in the meeting was that the POSIX spec
for printf(1) as seen at [1] (by way of the File Format Notation at
[2]), and the POSIX and C standards (including the upcoming C2x
standard) for printf(3) as seen at [3], currently leave both the ' and
# flag modifiers undefined when applied to %s.
[1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html
"The format operand shall be used as the format string described in
XBD File Format Notation[2] with the following exceptions:..."
[2] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap05.html#tag_05
"The flag characters and their meanings are: ...
# The value shall be converted to an alternative form. For c, d, i, u,
and s conversion specifiers, the behavior is undefined.
[and no mention of ']"
[3] https://pubs.opengroup.org/onlinepubs/9699919799/functions/printf.html
"The flag characters and their meanings are:
' [CX] [Option Start] (The <apostrophe>.) The integer portion of the
result of a decimal conversion ( %i, %d, %u, %f, %F, %g, or %G )
shall be formatted with thousands' grouping characters. For other
conversions the behavior is undefined. The non-monetary grouping
character is used. [Option End]
...
# Specifies that the value is to be converted to an alternative
form. For o conversion, it shall increase the precision, if and only
if necessary, to force the first digit of the result to be a zero
(if the value and precision are both 0, a single 0 is printed). For
x or X conversion specifiers, a non-zero result shall have 0x (or
0X) prefixed to it. For a, A, e, E, f, F, g, and G conversion
specifiers, the result shall always contain a radix character, even
if no digits follow the radix character. Without this flag, a radix
character appears in the result of these conversions only if a digit
follows it. For g and G conversion specifiers, trailing zeros shall
not be removed from the result as they normally are. For other
conversion specifiers, the behavior is undefined."
Thus, it appears that both %#s and %'s are available for future
standardization. For typing, %#s as a synonym for %b is probably
going to be easier (less shell escaping needed). Is there
any interest in a patch to coreutils or bash that would add such a
synonym, to make it easier to leave that functionality in place for
POSIX Issue 9 even when %b is repurposed to align with C2x?
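Until any such synonym exists, a script that depends on the escape processing could probe for it at run time. This is only a sketch under that assumption: the function name printf_b_expands_escapes is hypothetical, and it tests nothing beyond the escape-processing behavior that POSIX currently specifies for %b.

```shell
# Hypothetical helper (the name is an assumption, not an existing utility):
# succeeds if this shell's printf still expands backslash escapes for %b.
printf_b_expands_escapes() {
    expected=$(printf 'a\tb')                   # escape handled in the format string
    actual=$(printf '%b' 'a\tb' 2>/dev/null)    # escape handled in the %b argument
    [ "$actual" = "$expected" ]
}

if printf_b_expands_escapes; then
    echo "%b performs escape processing"
else
    echo "%b does something else here"
fi
```

A probe like this avoids hard-coding either meaning of %b, at the cost of one extra printf call at startup.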
--
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization: qemu.org | libguestfs.org