bug-texinfo

Re: library for unicode collation in C for texi2any?


From: Patrice Dumas
Subject: Re: library for unicode collation in C for texi2any?
Date: Sat, 14 Oct 2023 17:31:48 +0200

On Sat, Oct 14, 2023 at 01:24:38PM +0300, Eli Zaretskii wrote:
> 
> > There seems to be variants of strcoll and of strxfrm, strcoll_l and 
> > strxfrm_l that allow to specify a locale, but it is not very well
> > documented (these functions seem to be in the glibc, but are not
> > documented, strcoll and strxfrm are), there are no gnulib modules, and I
> > am not sure whether with "C" locale these functions really use the
> > specified locale.
> 
> I don't think we want to depend on the locale in Texinfo.  The problem
> is how to find or write an implementation that on the one side doesn't
> use the locale-dependent collation rules, and OTOH ignores punctuation
> and other "unimportant" characters.

Exactly.  For most needs (column width, classification of characters,
...) libunistring does exactly that.  But some cases remain problematic
in C, without an obvious solution:
* sorting of indices
* translation of strings (if the locale is "C", I could not find a way
  to have gettext switch the locale based on the document locale)
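
To make the index sorting requirement concrete, here is a minimal sketch
of a comparison that does not consult the locale and ignores punctuation.
It is an illustration under stated assumptions, not what texi2any does:
the names index_entry_cmp and sort_index_entries are hypothetical, only
ASCII punctuation, whitespace and case are handled, and non-ASCII bytes
are compared by raw byte value, which is far from a real Unicode
collation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ASCII-only classification, written by hand so that the result never
   depends on setlocale(). */
static int is_ascii_alnum(unsigned char c)
{
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z')
           || (c >= 'A' && c <= 'Z');
}

/* Assumption of this sketch: every ASCII byte that is not a letter or a
   digit (punctuation, spaces, control characters) is ignored. */
static int is_ignorable(unsigned char c)
{
    return c < 0x80 && !is_ascii_alnum(c);
}

static unsigned char fold_ascii_case(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c - 'A' + 'a') : c;
}

/* Hypothetical comparison for index entries: compare the significant
   bytes first, then fall back to strcmp as a tie-breaker. */
int index_entry_cmp(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    const unsigned char *pa = (const unsigned char *)a;
    const unsigned char *pb = (const unsigned char *)b;

    for (;;) {
        while (*pa != '\0' && is_ignorable(*pa)) pa++;
        while (*pb != '\0' && is_ignorable(*pb)) pb++;
        if (*pa == '\0' || *pb == '\0')
            break;
        unsigned char ca = fold_ascii_case(*pa);
        unsigned char cb = fold_ascii_case(*pb);
        if (ca != cb)
            return ca < cb ? -1 : 1;
        pa++;
        pb++;
    }
    if (*pa == '\0' && *pb != '\0') return -1;  /* a is a prefix of b */
    if (*pa != '\0' && *pb == '\0') return 1;   /* b is a prefix of a */
    return strcmp(a, b);  /* same significant bytes: deterministic tie-break */
}

static int cmp_for_qsort(const void *x, const void *y)
{
    return index_entry_cmp(*(const char *const *)x, *(const char *const *)y);
}

/* Sort an array of n entry strings in place. */
void sort_index_entries(const char **entries, size_t n)
{
    qsort(entries, n, sizeof *entries, cmp_for_qsort);
}
```

With this comparison "--help" sorts between "Alpha" and "@item", since
only "help", "alpha" and "item" are compared.  Accented letters and
non-Latin scripts would need real collation data, which is the hard part
discussed in this thread.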

A last issue is transliteration.  There seems to be //TRANSLIT in iconv,
but again I fear some portability issues.
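
For reference, a minimal sketch of how //TRANSLIT is typically requested
through iconv_open.  The helper name translit_to_ascii is hypothetical.
The //TRANSLIT suffix is an extension of GNU implementations (glibc, GNU
libiconv) and is not specified by POSIX, so the sketch treats a failing
iconv_open as "not supported".  Even where it works, the replacement
chosen for a given character may differ between implementations and, on
glibc, may depend on the current locale, so no particular output is
assumed here.

```c
#include <assert.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: transliterate the UTF-8 string utf8 to ASCII
   into out (at most out_size bytes, NUL included).
   Returns 0 on success, -1 if the conversion is unavailable or fails. */
int translit_to_ascii(const char *utf8, char *out, size_t out_size)
{
    assert(utf8 != NULL && out != NULL && out_size > 0);

    iconv_t cd = iconv_open("ASCII//TRANSLIT", "UTF-8");
    if (cd == (iconv_t)-1)
        return -1;  /* this iconv does not accept the request */

    char *in = (char *)utf8;  /* iconv takes char **, input is not modified */
    size_t in_left = strlen(utf8);
    char *dst = out;
    size_t out_left = out_size - 1;  /* keep room for the final NUL */

    size_t rc = iconv(cd, &in, &in_left, &dst, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1)
        return -1;  /* invalid input, or output buffer too small */

    *dst = '\0';
    return 0;
}
```

A caller has to be ready for the -1 case and provide a fallback, which
is where the portability concern comes from.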

In the Perl implementation those issues are solved (for gettext, with
some workarounds), but in C it seems to me to be more challenging.

-- 
Pat


