coreutils

Re: numfmt: locale/grouping input issue


From: Pádraig Brady
Subject: Re: numfmt: locale/grouping input issue
Date: Tue, 11 Dec 2012 19:03:35 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120615 Thunderbird/13.0.1

On 12/11/2012 06:36 PM, Assaf Gordon wrote:
Hello,

(Continuing a previously discussed issue - accepting input values with locale 
grouping separators)

Pádraig Brady wrote, On 12/07/2012 01:09 PM:
On 12/07/2012 03:07 PM, Assaf Gordon wrote:
Another thing I thought of there, was it would be
good to be able to parse number formats that it can generate:

Sounds like two separate (but related) issues:

$ echo '1,234' | src/numfmt --from=auto
src/numfmt: invalid suffix in input '1,234': ',234'

1. Is there already a gnulib function that can accept locale-grouped values? Can the 
"xstrtoXXX" functions handle that?

I was thinking you would just strip out
localeconv()->thousands_sep before parsing.
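To make the idea concrete, here is a minimal sketch in Python rather than numfmt's C. The helper name `strip_grouping` is hypothetical, and the separator is passed in explicitly so the example does not depend on the installed locales; in a real program it would presumably come from `locale.localeconv()["thousands_sep"]`, Python's wrapper around the C `localeconv()`.

```python
def strip_grouping(text, thousands_sep):
    """Remove every occurrence of the locale's grouping separator.

    Hypothetical helper for illustration only; it does not check
    where the separators appear.
    """
    if not thousands_sep:
        # Some locales (e.g. "C") define no grouping separator at all.
        return text
    return text.replace(thousands_sep, "")


print(strip_grouping("1,234", ","))      # en_US-style grouping
print(strip_grouping("1.234.567", "."))  # de_DE-style grouping
```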

I couldn't find an example of a coreutils program that readily accepts locale'd 
input.
While dots and commas (US/DE locales) are relatively easy to handle, in the 
French locale the separator is a space, causing a conflict when the 
default field separator is also white space.

True. You could only support that when --delimiter was not ' ',
or when LC_NUMERIC was set to one with a non space grouping char.
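That condition could be checked up front, roughly as in the Python sketch below. The function name is hypothetical, `delimiter=None` stands in for the default of splitting fields on white space, and treating any separator equal to an explicit delimiter as a conflict is my own assumption, not something settled in this thread.

```python
def grouping_conflicts_with_delimiter(thousands_sep, delimiter=None):
    """Return True if the grouping separator is ambiguous with the field delimiter.

    Hypothetical helper; delimiter=None models the default behaviour
    of splitting fields on white space.
    """
    if not thousands_sep:
        # No grouping separator in this locale, so nothing can conflict.
        return False
    if delimiter is None:
        # Default white-space splitting clashes with a space-like separator.
        return thousands_sep.isspace()
    # Assumption: an explicit delimiter only clashes if it is the same string.
    return thousands_sep == delimiter


print(grouping_conflicts_with_delimiter(" "))       # fr_FR-style, default delimiter
print(grouping_conflicts_with_delimiter(" ", ":"))  # fr_FR-style, explicit ':' delimiter
print(grouping_conflicts_with_delimiter(","))       # en_US-style, default delimiter
```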

Another complication is that just stripping out the 'thousands_sep' character 
would treat text such as "1,3,4,5,6" as the valid number "13456".

Good point. You'd need to count as well as strip.
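One possible shape for "count as well as strip", again as a Python sketch rather than numfmt code. The name `parse_grouped` is hypothetical, it only handles non-negative integers, and it assumes fixed groups of three digits; a real implementation would presumably consult the locale's `grouping` field, since some locales group differently.

```python
def parse_grouped(text, thousands_sep=",", group_size=3):
    """Parse a non-negative integer, accepting only well-formed digit grouping.

    Hypothetical helper. Assumes every group after the first has exactly
    group_size digits and the first group has between 1 and group_size.
    """
    groups = text.split(thousands_sep) if thousands_sep else [text]
    # Every group must be non-empty and consist of ASCII digits only.
    if not all(g.isascii() and g.isdigit() for g in groups):
        raise ValueError(f"invalid number: {text!r}")
    if len(groups) > 1:
        # Once separators are present, the group lengths must add up.
        if len(groups[0]) > group_size:
            raise ValueError(f"invalid grouping: {text!r}")
        if any(len(g) != group_size for g in groups[1:]):
            raise ValueError(f"invalid grouping: {text!r}")
    return int("".join(groups))


print(parse_grouped("1,234"))
try:
    parse_grouped("1,3,4,5,6")
except ValueError as err:
    print("rejected:", err)
```

Ungrouped input such as "1234" still parses, because the length checks only apply once a separator has been seen.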

I would suggest at first not to accept locale'd input, or only offer partial 
support.

    # Input is not valid
    $ LC_ALL=fr_FR.utf8 ./src/printf "%'d\n" "1 000"
    ./src/printf: 1 000 : valeur non complètement convertie
    1

    # Sort can't handle locale'd input, treats the white-space as separator,
    #  not as "thousand separator".
    $ printf "1 123\n1 000\n" | LC_ALL=fr_FR.utf8 sort --debug -k1,1
    sort: utilse les règles de tri « fr_FR.utf8 »
    sort: leading blanks are significant in key 1; consider also specifying 'b'
    1 000
    _
    _____
    1 123
    _
    _____

So the above don't support localized number formats directly,
which is fair enough. That shows that the functionality is
useful within numfmt as it would enable the above to
use such numbers after being filtered through numfmt.
Implementation should not be too onerous, given the above caveats.

thanks,
Pádraig.


