Re: Alignment bug in ls with UTF-8 filenames under Mac OS X
From: Bruno Haible
Subject: Re: Alignment bug in ls with UTF-8 filenames under Mac OS X
Date: Thu, 18 Jan 2007 15:37:23 +0100
User-agent: KMail/1.9.1
Vincent Lefevre wrote:
> Hmm... I forgot that ls was an alias (the same one on all my accounts).
> So, back on Mac OS X:
>
> prunille:~/blah> \ls -C --color=always | hexdump -C
> 00000000 1b 5b 30 30 6d 1b 5b 30 6d 45 cc 81 1b 5b 30 30 |.[00m.[0mE...[00|
> 00000010 6d 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |m               |
> 00000020 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |                |
> 00000030 1b 5b 30 6d 79 31 32 33 34 35 36 37 38 39 30 31 |.[0my12345678901|
> 00000040 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37 |2345678901234567|
> 00000050 38 39 30 1b 5b 30 30 6d 0a 1b 5b 30 6d 78 31 32 |890.[00m..[0mx12|
> 00000060 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37 38 |3456789012345678|
> 00000070 39 30 31 32 33 34 35 36 37 38 39 30 1b 5b 30 30 |901234567890.[00|
> 00000080 6d 20 20 1b 5b 30 6d 7a 31 32 33 34 35 36 37 38 |m  .[0mz12345678|
> 00000090 39 30 31 32 33 34 35 36 37 38 39 30 31 32 33 34 |9012345678901234|
> 000000a0 35 36 37 38 39 30 1b 5b 30 30 6d 0a 1b 5b 6d |567890.[00m..[m|
> 000000af
That makes - except for the escape sequences - an E, a combining accent and
31 spaces. So it's the same bug as in "ls -C -T0".
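The count can be checked with a little arithmetic. The sketch below assumes the column is 33 cells wide (the 31-character name starting with "x" plus the 2 separating spaces visible in the dump); padding_for is a made-up helper for this illustration, not a function from ls.c.

```c
#include <assert.h>
#include <stddef.h>

/* Number of spaces needed to pad a name that occupies NAME_WIDTH cells
   out to a column of COLUMN_WIDTH cells.  Hypothetical helper, not part
   of ls.c; the column width of 33 used in the note below is read off
   the hexdump and is an assumption.  */
size_t
padding_for (size_t column_width, size_t name_width)
{
  assert (name_width <= column_width);
  return column_width - name_width;
}
```

If the accent is counted as one cell, the name is 2 cells wide and padding_for (33, 2) gives the 31 spaces seen in the dump. If it is counted as zero cells, padding_for (33, 1) gives 32, which is what the terminal would need for the columns to line up.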
> > I see that the first call to wcwidth() gives: wcwidth(0x0301) = 1.
> > U+0301 is COMBINING ACUTE ACCENT. So here is the problem: MacOS'
> > wcwidth is buggy for combining characters like accents.
>
> OK. Can't autoconf detect that and use another implementation?
Yes. We can do that in gnulib. I'll work on this issue in the next few weeks.
Please remind us (on the bug-gnulib mailing list) in 1 or 2 months.
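To illustrate what "another implementation" could mean, here is a minimal sketch of a wrapper around the system function. It is not the gnulib replacement; the name fixed_wcwidth and the choice to special-case only the Combining Diacritical Marks block (U+0300..U+036F) are assumptions made for this example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <wchar.h>

/* Hypothetical wrapper, not the gnulib code: report width 0 for the
   Combining Diacritical Marks block (U+0300..U+036F) and defer to the
   system wcwidth() for every other character.  */
int
fixed_wcwidth (wchar_t wc)
{
  if (wc >= 0x0300 && wc <= 0x036F)
    return 0;
  return wcwidth (wc);
}

/* Usage example: U+0301 COMBINING ACUTE ACCENT must not take a cell.  */
void
check_combining_acute (void)
{
  assert (fixed_wcwidth (0x0301) == 0);
}
```

A real replacement would have to cover all nonspacing and enclosing marks, not just this one block.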
And, as we have seen, the other issue is that Apple Terminal has problems
estimating the width of tabs when there are non-ASCII characters. Since
you can start a telnet/ssh session from MacOS X to any platform (Linux,
Solaris, etc.), the fix needs to be platform independent. Here is such a fix:
2007-01-18  Bruno Haible  <address@hidden>

        Avoid problems with tabs after non-ASCII characters in some terminals.
        * src/ls.c (nonascii_in_this_line): New variable.
        (quote_name): Update nonascii_in_this_line.
        (print_many_per_line, print_horizontal): Set nonascii_in_this_line to
        false at the beginning of each line.
        (indent): Use spaces for indentation when nonascii_in_this_line.

diff -c -3 -r1.447 ls.c
*** src/ls.c 2 Jan 2007 06:29:12 -0000 1.447
--- src/ls.c 18 Jan 2007 14:38:14 -0000
***************
*** 851,856 ****
--- 851,859 ----
     for the separating white space. */
  #define MIN_COLUMN_WIDTH 3

+ /* True if some non-ASCII character has been output on this line. */
+ static bool nonascii_in_this_line;
+

  /* This zero-based index is used solely with the --dired option.
     When that option is in effect, this counter is incremented for each
***************
*** 3704,3710 ****
      }

    if (out != NULL)
!     fwrite (buf, 1, len, out);
    if (width != NULL)
      *width = displayed_width;
    return len;
--- 3702,3722 ----
      }

    if (out != NULL)
!     {
!       /* Update nonascii_in_this_line indicator. */
!       char const *p = buf;
!       char const *plimit = buf + len;
!
!       for (; p < plimit; p++)
!         if (!isascii (to_uchar (*p)))
!           {
!             nonascii_in_this_line = true;
!             break;
!           }
!
!       /* Actually output the quoted representation. */
!       fwrite (buf, 1, len, out);
!     }
    if (width != NULL)
      *width = displayed_width;
    return len;
***************
*** 3957,3962 ****
--- 3969,3975 ----
        size_t pos = 0;

        /* Print the next row. */
+       nonascii_in_this_line = false;
        while (1)
          {
            size_t name_length = length_of_file_name_and_frills (files + filesno);
***************
*** 3984,3989 ****
--- 3997,4004 ----
    size_t name_length = length_of_file_name_and_frills (files);
    size_t max_name_length = line_fmt->col_arr[0];

+   nonascii_in_this_line = false;
+
    /* Print first entry. */
    print_file_name_and_frills (files);
***************
*** 3996,4001 ****
--- 4011,4017 ----
          {
            putchar ('\n');
            pos = 0;
+           nonascii_in_this_line = false;
          }
        else
          {
***************
*** 4047,4060 ****
  }

  /* Assuming cursor is at position FROM, indent up to position TO.
!    Use a TAB character instead of two or more spaces whenever possible. */

  static void
  indent (size_t from, size_t to)
  {
    while (from < to)
      {
!       if (tabsize != 0 && to / tabsize > (from + 1) / tabsize)
          {
            putchar ('\t');
            from += tabsize - from % tabsize;
--- 4063,4085 ----
  }

  /* Assuming cursor is at position FROM, indent up to position TO.
!    Use a TAB character instead of two or more spaces whenever possible.
!    Depends on the TABSIZE option and on the current value of
!    NONASCII_IN_THIS_LINE. */

  static void
  indent (size_t from, size_t to)
  {
    while (from < to)
      {
!       /* Setting TABSIZE to 0 inhibits the use of tabs. Also, since some
!          terminal emulators (like Apple Terminal from MacOS X 10.3) don't
!          handle tabs after non-ASCII combining accents on the same line
!          well, avoid tabs where there are non-ASCII characters so far on
!          the current line. */
!       if (tabsize != 0
!           && !nonascii_in_this_line
!           && to / tabsize > (from + 1) / tabsize)
          {
            putchar ('\t');
            from += tabsize - from % tabsize;
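
For readers who want to try the logic outside of ls, the following is a standalone sketch of the same idea. It writes the indentation into a buffer instead of calling putchar, and takes the tab size and the non-ASCII flag as parameters rather than reading globals. The names has_nonascii and indent_string are invented for this example, and the one-space fallback branch is an assumption, since the hunk above stops before the else branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Return true if any of the LEN bytes at BUF lies outside the ASCII
   range.  Hypothetical helper mirroring the scan added to quote_name.  */
bool
has_nonascii (char const *buf, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if ((unsigned char) buf[i] > 0x7F)
      return true;
  return false;
}

/* Store in OUT (capacity CAP bytes, including the terminating NUL) the
   characters that move the cursor from column FROM to column TO, and
   return how many were stored.  Tabs are used only when TABSIZE is
   nonzero and NONASCII is false.  */
size_t
indent_string (size_t from, size_t to, size_t tabsize, bool nonascii,
               char *out, size_t cap)
{
  size_t n = 0;

  assert (cap > 0);
  while (from < to && n + 1 < cap)
    {
      if (tabsize != 0 && !nonascii
          && to / tabsize > (from + 1) / tabsize)
        {
          out[n++] = '\t';
          from += tabsize - from % tabsize;
        }
      else
        {
          /* Assumed fallback: advance one column with a space.  */
          out[n++] = ' ';
          from++;
        }
    }
  out[n] = '\0';
  return n;
}
```

With a tab size of 8, moving from column 0 to column 16 takes two tabs on a pure-ASCII line and sixteen spaces once a non-ASCII byte (such as the 0xcc 0x81 of the combining accent) has been seen on the line.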