bug-bash

Re: [PATCH] printf: add %#s alias to %b


From: Eric Blake
Subject: Re: [PATCH] printf: add %#s alias to %b
Date: Thu, 7 Sep 2023 13:59:25 -0500
User-agent: NeoMutt/20230517

On Thu, Sep 07, 2023 at 02:42:16PM +0700, Robert Elz wrote:
>     Date:        Wed, 6 Sep 2023 11:32:32 -0500
>     From:        Eric Blake <eblake@redhat.com>
>     Message-ID:  <z7svroub44xygdps6apuikftl4tjtpc67zkq5ivb6cmktsfnfg@5vc4pkemtzog>
> 
>   | You (anyone reading this, not just kre) are welcome to join tomorrow's
>   | Austin Group meeting
> 
> Thanks, but I don't expect its time of day will coincide with
> mine this week, at best I would be a half asleep zombie.
> 
>   |  it is a Zoom call
> 
> As best I understand it, zoom does not support NetBSD - which
> is the only platform I use, which has been true for decades now
> (previously I also used SunOS (not Solaris) and Ultrix).
> 
> While it probably works on Android (i.e. a phone), using it for a
> meeting that way would not be convenient for anyone - certainly not for me
> staring at it all the time, and assuming that it works with
> video enabled, not for anyone else with an image moving around
> randomly... (my phone has no stand, I haven't been able to
> find one which fits it).

The meeting is now over, but for clarification, the Austin Group holds
audio-only meetings.  Some weeks we use Zoom, some we use Webex
(depending on who is available to run the meeting), but no one is
on-screen, so a POTS dial-in always works, at no disadvantage to someone
unable or unwilling to run Zoom software (whether because no port is
available yet, or because Zoom does not release its software under a
license to your liking).  Speak up if you think the Austin Group is ever
unfairly crippling someone's right to participate by putting
participation behind a paywall.

> 
>   | Or you can add comments to the bug directly.
> 
> I have done that already, and probably will add one more.
> 
>   | Of course, the gamble is easier to win if we have multiple independent
>   | implementations that have all coordinated to do it the same way, so we
>   | can push back on WG14 to tell them they would be foolish to commandeer
>   | %#s for anything other than what existing practice has.
> 
> Which worked how well with %b ?

As Geoff commented on 1771, if someone had raised the issue of %b
conflicting 6 months sooner, and pointed out the ksh extension of
%..<base>d as an alternative, we might have had time to push back.
https://austingroupbugs.net/view.php?id=1771#c6453

But because the Austin Group learned about the conflict so late in the
game, we were already too late to push back on C2x at the time,
putting us instead into the camp of seeing what consensus we could get
from shell developers.  This thread (and others like it) has been
helpful - we DID get consensus (namely, that printf(1) and printf(3)
have always diverged, so diverging further on %b is okay), and in today's
Austin Group meeting we updated what will go into Issue 8 based on
that feedback.

I consider that to be a successful outcome, even if you may have felt
heartburn through the intermediate stages of it all.
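
For readers who have not met the divergence mentioned above: in
printf(1), %b prints its argument with backslash escapes expanded,
whereas C23 uses %b in printf(3) for binary integer output.  A minimal
sketch of the shell side only, with an illustrative variable name:

```shell
arg='x\ty'            # four characters: x, backslash, t, y

printf '%s\n' "$arg"  # %s copies the argument unchanged
printf '%b\n' "$arg"  # %b expands \t to a tab before output
```

The %#s alias proposed in this thread would behave like the second
call; no such alias is assumed to exist in the sketch.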

> 
> Further, upon reflection, I think a better use of %#s in printf(1)
> (no point in printf(3)) would be to explicitly output a string of
> bytes (what %s used to do, before it was reinterpreted to output
> characters instead).   While the two might seem to be mostly the
> same, that depends upon the implementation - if an implementation
> treats strings of characters as arrays of wchar_t, and converts
> from byte encoding to wchar_t on input, there's no guarantee that
> the output (converted back from wchar_t to byte encoding) will be
> identical to the input string.   Sometimes that might not be
> desirable and a method to simply copy the input string to the
> output, as uninterpreted bytes might be useful to have.  To me
> that is a better use of %#s than as a %b clone - particularly
> as %b needs the same kind of variant (%#b).   This also deals
> with the precision issue, %.1s is 1 character from the arg
> string, %#.1s is one byte instead.

That is indeed a cool idea, but one for the libc folks to take up.  At
any rate, I agree that burning %#s to be a synonym for %b precludes
this useful idea (and it may be even more important in shell contexts,
now that Issue 8 has taken efforts to make it clear that sometimes the
shell deals with characters, and sometimes with bytes; in particular,
environment variables can hold bytes that need not always form
characters in the current locale).
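
The byte-versus-character question can be probed with a rough sketch
like the following.  The helper name is hypothetical, and whether a
given printf counts bytes or characters for a precision in a multibyte
locale depends on the implementation, so only the C-locale case is
asserted here:

```shell
# Two-byte UTF-8 encoding of U+00E9, built from octal escapes so the
# script does not depend on the encoding of this file.
s=$(printf '\303\251')

# Hypothetical helper: report how many bytes '%.1s' emits for $s
# under the locale named in $1.
bytes_for_precision_one() {
    ( LC_ALL=$1; export LC_ALL
      printf '%.1s' "$s" | wc -c | tr -d ' ' )
}

bytes_for_precision_one C   # prints 1: each byte is a character in the C locale
```

Calling the helper with the name of an installed UTF-8 locale may
report 2 on implementations that count characters; a byte-oriented
%#.1s as suggested above would be expected to report 1 regardless.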

> 
> If there were to be anything worthy of announcing as deprecated
> from posix printf(1) it would be %c - then we could make %c be
> compat with its printf(3) meaning, where it takes a codepoint
> as an int (just 8 bits in printf(3) but we don't need to retain
> that restriction) and outputs the associated character, rather
> than just being an (almost) alias for %.1s -- where the almost
> is because given '' as the arg string, %c is permitted to output
> \0 or nothing, where %.1s is required to output nothing.  Because
> it is unspecified which happens with %c, portable applications
> cannot rely upon either behaviour, so %.1s is a much safer and
> more portable format to use for the purpose.   If %c were
> (eventually) altered to take an int (codepoint) as its arg,
> rather than a string, we could also stop needing to tell people
> they have to use the bizarre printf \\$(printf %o val) nonsense
> method to do such a simple operation, which only works for
> 8 bit codepoints.
> 
> kre
> 
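
The octal-escape idiom referred to above can be sketched as follows.
The function name is hypothetical, and as noted in the quoted text the
approach is only meaningful for values that fit in a single byte:

```shell
# Hypothetical helper: emit the byte whose decimal value is $1.
byte_from_code() {
    # The inner printf renders the value in octal; the outer printf
    # then interprets "\ooo" in its format string as a single byte.
    printf "\\$(printf '%o' "$1")"
}

byte_from_code 65   # prints A in ASCII-compatible locales
printf '\n'
```

A %c that accepted a codepoint, as proposed above, would make the
nested call unnecessary.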

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



