bug-texinfo

Re: library for unicode collation in C for texi2any?


From: Eli Zaretskii
Subject: Re: library for unicode collation in C for texi2any?
Date: Sat, 14 Oct 2023 13:24:38 +0300

> Date: Sat, 14 Oct 2023 11:57:02 +0200
> From: Patrice Dumas <pertusus@free.fr>
> Cc: bug-texinfo@gnu.org
> 
> On Thu, Oct 12, 2023 at 06:13:34PM +0300, Eli Zaretskii wrote:
> > What you say is not detailed enough, but using my crystal ball I think
> > you can have this with glibc-based systems, and also on Windows (but
> > that requires using a special API for comparing strings).  Not sure
> > about the equivalent features on other systems, like *BSD and macOS.
> > You can see that in action in how GNU 'ls' sorts file names.
> 
> Looks like ls ultimately uses strcoll.  The problem is that it selects
> the current locale, we never want to use the current locale in Texinfo.
> We either want to use a 'generic' locale (which does not really exist
> as far as I can tell) or the @documentlanguage locale.

Yes, I know.  However, if the current locale's codeset is UTF-8, AFAIK
glibc uses the full Unicode CLDR, which is what I wanted to point out.

> There seem to be variants of strcoll and strxfrm, strcoll_l and
> strxfrm_l, that allow specifying a locale, but they are not well
> documented (these functions seem to be in glibc but are not
> documented, whereas strcoll and strxfrm are), there are no gnulib
> modules for them, and I am not sure whether with the "C" locale these
> functions really use the specified locale.
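
For reference, the strcoll_l variant mentioned in the quoted text can be tried roughly as follows. This is only a sketch, assuming a POSIX.1-2008 system such as glibc (other systems may need a different header or lack these calls entirely); the helper name collate_in_locale is made up for illustration and is not part of Texinfo or gnulib.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: POSIX.1-2008 system (e.g. glibc) */
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: compare A and B with the collation rules of the
   named locale, without changing the process-wide locale.
   Returns <0, 0 or >0 like strcoll.  Sets *OK to 0 if the locale
   object could not be created (for example, if it is not installed),
   and to 1 otherwise.  */
static int
collate_in_locale (const char *locale_name, const char *a, const char *b,
                   int *ok)
{
  assert (locale_name != NULL && a != NULL && b != NULL && ok != NULL);

  /* Only the collation category comes from LOCALE_NAME; the other
     categories of the new locale object default to the POSIX locale.  */
  locale_t loc = newlocale (LC_COLLATE_MASK, locale_name, (locale_t) 0);
  if (loc == (locale_t) 0)
    {
      *ok = 0;
      return 0;
    }

  int result = strcoll_l (a, b, loc);
  freelocale (loc);
  *ok = 1;
  return result;
}
```

Calling collate_in_locale ("C", "B", "a", &ok) should order "B" before "a", since the "C" locale collates like strcmp. Passing a language locale name instead only works if that locale is installed on the system, which is itself a dependency one may not want.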

I don't think we want to depend on the locale in Texinfo.  The problem
is how to find or write an implementation that, on the one hand, doesn't
use the locale-dependent collation rules and, on the other hand, ignores
punctuation and other "unimportant" characters.
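
To make the two requirements concrete, here is a minimal sketch of a comparison that never consults the locale and skips ASCII punctuation and whitespace. It is not an implementation of the Unicode Collation Algorithm: the remaining bytes are compared by value, which for UTF-8 gives code point order rather than linguistic order, and non-ASCII punctuation is not recognized. The function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* ASCII punctuation and whitespace, tested without <ctype.h> so the
   result cannot depend on setlocale().  Assumes an ASCII-compatible
   execution character set.  Bytes >= 0x80 (UTF-8 multibyte sequences)
   are always treated as significant.  */
static int
is_ignorable (unsigned char c)
{
  return c == ' ' || c == '\t' || c == '\n'
         || (c >= '!' && c <= '/') || (c >= ':' && c <= '@')
         || (c >= '[' && c <= '`') || (c >= '{' && c <= '~');
}

/* Advance P past any run of ignorable bytes.  */
static const unsigned char *
skip_ignorable (const unsigned char *p)
{
  while (*p != '\0' && is_ignorable (*p))
    p++;
  return p;
}

/* Hypothetical comparison: returns -1, 0 or 1, ordering the two
   strings byte by byte after dropping ignorable characters.  */
static int
compare_ignoring_punctuation (const char *a, const char *b)
{
  assert (a != NULL && b != NULL);
  const unsigned char *p = (const unsigned char *) a;
  const unsigned char *q = (const unsigned char *) b;

  for (;;)
    {
      p = skip_ignorable (p);
      q = skip_ignorable (q);
      if (*p != *q || *p == '\0')
        return (*p > *q) - (*p < *q);
      p++;
      q++;
    }
}
```

With this sketch, "foo-bar" and "foobar" compare equal, and "a.b" sorts before "ac".  A real solution would replace the byte comparison with collation keys built from a fixed, locale-independent table.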


