Re: ignoring control characters in character width
From: Patrice Dumas
Subject: Re: ignoring control characters in character width
Date: Tue, 5 Sep 2023 22:08:08 +0200

On Tue, Sep 05, 2023 at 10:46:25PM +0300, Eli Zaretskii wrote:
> > Date: Tue, 5 Sep 2023 21:16:47 +0200
> > From: Patrice Dumas <pertusus@free.fr>
> > Cc: bug-texinfo@gnu.org
> >
> > I think I understand what you don't understand, actually this is not
> > about displaying the characters, which is not really done by texi2any,
> > it is about situations where we need to count the width of characters
> > in texi2any. For instance, this is to determine when to put end of
> > lines when formatting Info to compare with line width, or to format
> > multitable cells, or to determine the length of underlining * for a
> > heading string as in
> >
> > Some heading
> > ************
> >
> > Hope that it is clearer. Also we need to make this choice without
> > knowing precisely how the characters will be displayed. In general
> > the display is done by info readers for Info, but it could also be in a
> > pager, a text editor for the diverse possibilities of plain text output.
>
> OK, but in any case the width of control characters is not zero,
> except for some of them, like newline.
>
> Perhaps you should describe the problem you are trying to solve in
> more detail?

When I tried to use u8_strwidth from libunistring in the C code that is
equivalent to the Perl code determining the width of characters, I
found differences, attributable to the fact that u8_strwidth ignores
control characters in strings.  That made me think about what is
currently done and what would be best.  I realized that most control
characters should have zero width, including Line Feed, Form Feed and
Carriage Return, and that others have an ambiguous width, such as
horizontal and vertical tab.  All in all, considering that their width
is 0 is no worse than considering that it is 1, so the simple rule of
all control characters having zero width would be both better than what
we have now and simpler.  An added bonus is that it would be compatible
with u8_strwidth behaviour, but that is not an important argument: if
something else needs to be implemented, it is no problem to do it.
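
To illustrate the proposed rule, here is a minimal sketch in Python.  It
is neither the texi2any Perl or C code nor libunistring's implementation.
The names string_width and underline are hypothetical, and the handling
of wide and combining characters through the unicodedata module is an
assumption made only so that the example is complete.

```python
import unicodedata


def string_width(text):
    """Approximate column count, with every control character counted as 0.

    Hypothetical helper for illustration; not the texi2any implementation.
    """
    width = 0
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Cc":
            # Control characters (LF, FF, CR, tabs, ...): zero width.
            continue
        if category in ("Mn", "Me"):
            # Non-spacing and enclosing marks do not advance the column.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            # Wide and fullwidth characters take two columns.
            width += 2
        else:
            width += 1
    return width


def underline(heading, char="*"):
    """Return the heading followed by an underline of matching width."""
    return heading + "\n" + char * string_width(heading)


if __name__ == "__main__":
    # A trailing form feed does not lengthen the underline.
    print(underline("Some heading\f"))
```

With this rule a stray tab or form feed in a heading does not change the
number of * characters, whatever the reader later does to display it.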
--
Pat