bug-texinfo

c32width gives incorrect return values in C locale


From: Gavin Smith
Subject: c32width gives incorrect return values in C locale
Date: Sat, 11 Nov 2023 18:53:21 +0000

On Fri, Nov 10, 2023 at 07:39:43PM +0000, Gavin Smith wrote:
> Is the expected output
> 
>    å å (å) Å Å (Å) æ æ (æ) œ œ (œ) Æ Æ (Æ) Œ Œ (Œ) ø ø (ø) Ø Ø (Ø) ß ß (ß)
> 
> (width 74) or
> 
>    @aa å (å) @AA Å (Å) @ae æ (æ) @oe œ (œ) @AE Æ (Æ) @OE Œ (Œ) @o ø (ø) @O Ø (Ø) @ss ß (ß)
> 
> (width 90)?
> 
> I guess you will need to look at the Unicode characters that you pass to
> c32width, and whether you get return values < 1 for some of them.

It is locale-dependent!

It looks like c32width is simply being redirected to wcwidth which then
doesn't work properly with LC_ALL=C.  This is from the gnulib module
c32width.

I don't know if there is an easy way to make a self-contained example
to show the difference, because it needs all the gnulib Makefile machinery,
but the difference shows up for any non-ASCII character.  If I add a line
like

 fprintf (stderr, "width of [%4lx] is %d (remaining %s)\n",
                    (unsigned long) wc, width, q);

in the right place in the code, where width is the result of c32width,
then the output looks like

width of [  40] is 1 (remaining @)
width of [  4f] is 1 (remaining OE )
width of [  45] is 1 (remaining E )
width of [ 152] is -1 (remaining Œ)
width of [  28] is 1 (remaining (Œ)

for LC_ALL=C, but

width of [  40] is 1 (remaining @)
width of [  4f] is 1 (remaining OE )
width of [  45] is 1 (remaining E )
width of [ 152] is 1 (remaining Œ)
width of [  28] is 1 (remaining (Œ)

otherwise (LC_ALL=en_GB.UTF-8).
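
The locale dependence can probably be checked without the gnulib machinery
by calling wcwidth directly.  What follows is only a minimal sketch, assuming
that c32width really does end up in the C library's wcwidth as suspected
above; it does not call c32width itself.  The names width_in_locale,
report_widths and LOCALE_UNAVAILABLE are made up for this example.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* wcwidth is an XSI function */
#endif
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

/* Matching redeclaration, harmless if <wchar.h> already declared it.  */
int wcwidth (wchar_t wc);

/* Hypothetical sentinel: the requested locale could not be selected.  */
#define LOCALE_UNAVAILABLE (-2)

/* Hypothetical helper: switch LC_ALL to LOCALE_NAME and return the
   C library's wcwidth for WC, or LOCALE_UNAVAILABLE if setlocale fails.  */
int
width_in_locale (const char *locale_name, wchar_t wc)
{
  if (!setlocale (LC_ALL, locale_name))
    return LOCALE_UNAVAILABLE;
  return wcwidth (wc);
}

/* Hypothetical helper: print the widths of a few code points in one
   locale, roughly in the format of the debugging output above.
   0x152 is U+0152 (OE ligature), 0xe5 is U+00E5 (a with ring).  */
void
report_widths (const char *locale_name)
{
  static const wchar_t samples[] = { 0x40, 0x4f, 0x152, 0xe5 };
  size_t i;

  for (i = 0; i < sizeof samples / sizeof samples[0]; i++)
    fprintf (stderr, "%s: width of [%4lx] is %d\n", locale_name,
             (unsigned long) samples[i],
             width_in_locale (locale_name, samples[i]));
}
```

Calling report_widths ("C") and then report_widths ("en_GB.UTF-8") from a
main function should show whether U+0152 gets a different width in the two
locales.  The result depends on the C library and on which locales are
installed, so it may not match the output above on every system.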

Should this be reported as a bug to bug-gnulib or bug-libunistring?

For context, here are the contents of a simplified test file "test.texi",
based on the input from the test:

@@aa @aa{} (å)
@@AA @AA{} (Å)
@@ae @ae{} (æ)
@@oe @oe{} (œ)
@@AE @AE{} (Æ)
@@OE @OE{} (Œ)
@@o @o{} (ø)
@@O @O{} (Ø)
@@ss @ss{} (ß)
@@l @l{} (ł)
@@L @L{} (Ł)
@@DH @DH{} (Ð)
@@TH @TH{} (Þ)
@@dh @dh{} (ð)
@@th @th{} (þ)


Then, in a UTF-8 locale:

$ ../tp/texi2any.pl test.texi && cat test.info
test.texi: warning: document without nodes
This is test.info, produced by texi2any version 7.1dev+dev from
test.texi.

@aa å (å) @AA Å (Å) @ae æ (æ) @oe œ (œ) @AE Æ (Æ) @OE Œ (Œ) @o ø (ø) @O
Ø (Ø) @ss ß (ß) @l ł (ł) @L Ł (Ł) @DH Ð (Ð) @TH Þ (Þ) @dh ð (ð) @th þ
(þ)



Tag Table:

End Tag Table


Local Variables:
coding: utf-8
End:

However:

$ LC_ALL=C ../tp/texi2any.pl test.texi && cat test.info
test.texi: warning: document without nodes
This is test.info, produced by texi2any version 7.1dev+dev from
test.texi.

@aa å (å) @AA Å (Å) @ae æ (æ) @oe œ (œ) @AE Æ (Æ) @OE Œ (Œ) @o ø (ø) @O Ø (Ø) @ss ß (ß) @l
ł(ł) @L Ł (Ł) @DH Ð (Ð) @TH Þ (Þ) @dh ð (ð) @th þ (þ)



Tag Table:

End Tag Table


Local Variables:
coding: utf-8
End:

In the latter case, the first line of the output is much longer.
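
One possible explanation, which has not been checked against the actual
line-filling code, is that the -1 return values are added to the running
column count as they are, so every non-ASCII character makes the line look
shorter than it is.  The sketch below only illustrates that arithmetic;
naive_columns and guarded_columns are hypothetical names, and counting a
negative width as one column is an assumption made for the example, not
what texi2any or gnulib do.

```c
#include <stddef.h>

/* Hypothetical: add up per-character widths exactly as returned,
   so a -1 for an unknown character reduces the column count.  */
int
naive_columns (const int *widths, size_t n)
{
  int total = 0;
  size_t i;

  for (i = 0; i < n; i++)
    total += widths[i];
  return total;
}

/* Hypothetical: count any negative width as a single column.  */
int
guarded_columns (const int *widths, size_t n)
{
  int total = 0;
  size_t i;

  for (i = 0; i < n; i++)
    total += widths[i] < 0 ? 1 : widths[i];
  return total;
}
```

For the five characters of "Œ (Œ)" with the LC_ALL=C widths reported
above, { -1, 1, 1, -1, 1 }, naive_columns gives 1 whereas guarded_columns
gives 5.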



