Re: question about behavior of sort -n -t,
From: Gabriel Gaster
Subject: Re: question about behavior of sort -n -t,
Date: Wed, 9 Oct 2013 16:06:22 -0500
On Tuesday, October 8, 2013 at 8:48 PM, Eric Blake wrote:
>
> > the question in my mind remains: if a user specifies a
> > field-separator shouldn't that override the locale?
> >
>
> No, because POSIX requires that -n parse as many characters as
> possible regardless of locale, unless you explicitly ask to limit
> the sort to a specific key.
That's interesting. Could you perhaps point me to that section (if you
know it off the top of your head)? The POSIX requirement that -n parse
as many characters as possible regardless of locale seems to directly
contradict the other requirement you mentioned earlier (which at least
made sense to me): that -n parse characters until it sees a
non-numeric one (which is locale dependent).
> Perhaps less likely to be used in real life, but still apropos to
> the example:
> $ printf '1202\n2011\n' | LC_ALL=C sort --debug -t0 -s -n -k1,1
> sort: using simple byte comparison
> 2011
> _
> 1202
> __
> $ printf '1202\n2011\n' | LC_ALL=C sort --debug -t0 -s -n
> sort: using simple byte comparison
> 1202
> ____
> 2011
> ____
> And you'll get the same behavior on Solaris or BSD sort (at least,
> assuming they don't have blatant POSIX compliance bugs). Once you
> understand WHY the above example has two different sorts, based on
> whether -k is used, you'll understand why we can't stop parsing -n
> at a comma even for -t, in a non-C locale.
>
I understand why the above examples give two different sorts right
now. I just think that, in your example, -t0 should mean that 0 is no longer
a numeric character but a field separator (regardless of locale), and
therefore that sort should stop parsing the first line after 12. In other
words, sort -t0 -n should output '2011\n1202', since 2 is smaller than
12. It seems that the current rationale is to have the locale
override user-specified field separators, and then to have some
other POSIX requirement (that sort -n take as much as possible, regardless
of locale and yet depending on locale) override the locale sometimes.
>
> > It seems that the locale overrides specific arguments to sort (in
> > this case, field-separator=, ).
> >
>
> Rather, the lack of -k determines how far -n will parse, regardless
> of locale; it's just that some locales let -n parse farther than
> others.
> -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt
> virtualization library http://libvirt.org
Don't you actually mean here that "the lack of -k determines how far -n
will parse, depending on locale"?
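
For reference, here is a minimal sketch of the same distinction using a comma
separator, which is the case the thread started from. It assumes a POSIX sort
run in the C locale (where the comma is not a thousands separator); the three
input lines and the variable names are made up for illustration and do not
come from the earlier messages.

```shell
# Hypothetical sample data: three lines whose second field is 5, 10 and 2.
input='1,5\n1,10\n1,2\n'

# With -k2,2 the numeric comparison is confined to field 2, so the
# lines are ordered by 2 < 5 < 10.
with_key=$(printf "$input" | LC_ALL=C sort -t, -n -k2,2)

# Without -k the key is the whole line. In the C locale -n stops at the
# comma, so every line compares as 1; the ties are then broken by the
# last-resort byte comparison of the whole line.
without_key=$(printf "$input" | LC_ALL=C sort -t, -n)

printf 'with -k2,2:\n%s\n\nwithout -k:\n%s\n' "$with_key" "$without_key"
```

With the key, this should print 1,2 then 1,5 then 1,10; without it, 1,10 then
1,2 then 1,5. In a locale where the comma is the thousands separator, the
second command may parse past the comma, so its result is not shown here.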
- question about behavior of sort -n -t,, Gabriel Gaster, 2013/10/08
- Re: question about behavior of sort -n -t,, Eric Blake, 2013/10/08
- Re: question about behavior of sort -n -t,, Pádraig Brady, 2013/10/08
- Re: question about behavior of sort -n -t,, Eric Blake, 2013/10/08
- Re: question about behavior of sort -n -t,, Gabriel Gaster, 2013/10/09
- Re: question about behavior of sort -n -t,,
Gabriel Gaster <=
- Re: question about behavior of sort -n -t,, Eric Blake, 2013/10/09
- Re: question about behavior of sort -n -t,, Gabriel Gaster, 2013/10/09