Re: Alignment bug in ls with UTF-8 filenames under Mac OS X
From: Vincent Lefevre
Subject: Re: Alignment bug in ls with UTF-8 filenames under Mac OS X
Date: Thu, 18 Jan 2007 03:50:26 +0100
User-agent: Mutt/1.5.13-vl-r14963 (2007-01-09)
On 2007-01-18 03:14:37 +0100, Bruno Haible wrote:
> Conclusion: What you see is not an ls bug, but an Apple Terminal bug
> with tabs.
I never use the Apple Terminal. As I said in my bug report, I'm using
uxterm here. More precisely:
prunille:~> uxterm -version
XFree86 4.3.99.903(184)
With the same uxterm, after an ssh to a Linux machine:
vin:~tmp/blah> LC_ALL=en_US.UTF-8 \ls -C | hd
00000000 45 cc 81 09 09 09 09 20 79 31 32 33 34 35 36 37 |E...... y1234567|
00000010 38 39 30 31 32 33 34 35 36 37 38 39 30 31 32 33 |8901234567890123|
00000020 34 35 36 37 38 39 30 0a 78 31 32 33 34 35 36 37 |4567890.x1234567|
00000030 38 39 30 31 32 33 34 35 36 37 38 39 30 31 32 33 |8901234567890123|
00000040 34 35 36 37 38 39 30 20 20 7a 31 32 33 34 35 36 |4567890  z123456|
00000050 37 38 39 30 31 32 33 34 35 36 37 38 39 30 31 32 |7890123456789012|
00000060 33 34 35 36 37 38 39 30 0a |34567890.|
00000069
vin:~tmp/blah> LC_ALL=en_US.UTF-8 \ls -C
É                                y123456789012345678901234567890
x123456789012345678901234567890  z123456789012345678901234567890
No problem.
Hmm... I forgot that ls was an alias (the same one on all my accounts).
So, back on Mac OS X:
prunille:~/blah> \ls
É                                y123456789012345678901234567890
x123456789012345678901234567890  z123456789012345678901234567890
prunille:~/blah> \ls --color=always
É                               y123456789012345678901234567890
x123456789012345678901234567890  z123456789012345678901234567890
prunille:~/blah> \ls -C | hexdump -C
00000000 45 cc 81 09 09 09 09 20 79 31 32 33 34 35 36 37 |E...... y1234567|
00000010 38 39 30 31 32 33 34 35 36 37 38 39 30 31 32 33 |8901234567890123|
00000020 34 35 36 37 38 39 30 0a 78 31 32 33 34 35 36 37 |4567890.x1234567|
00000030 38 39 30 31 32 33 34 35 36 37 38 39 30 31 32 33 |8901234567890123|
00000040 34 35 36 37 38 39 30 20 20 7a 31 32 33 34 35 36 |4567890  z123456|
00000050 37 38 39 30 31 32 33 34 35 36 37 38 39 30 31 32 |7890123456789012|
00000060 33 34 35 36 37 38 39 30 0a |34567890.|
00000069
prunille:~/blah> \ls -C --color=always | hexdump -C
00000000 1b 5b 30 30 6d 1b 5b 30 6d 45 cc 81 1b 5b 30 30 |.[00m.[0mE...[00|
00000010 6d 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |m               |
00000020 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 |                |
00000030 1b 5b 30 6d 79 31 32 33 34 35 36 37 38 39 30 31 |.[0my12345678901|
00000040 32 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37 |2345678901234567|
00000050 38 39 30 1b 5b 30 30 6d 0a 1b 5b 30 6d 78 31 32 |890.[00m..[0mx12|
00000060 33 34 35 36 37 38 39 30 31 32 33 34 35 36 37 38 |3456789012345678|
00000070 39 30 31 32 33 34 35 36 37 38 39 30 1b 5b 30 30 |901234567890.[00|
00000080 6d 20 20 1b 5b 30 6d 7a 31 32 33 34 35 36 37 38 |m  .[0mz12345678|
00000090 39 30 31 32 33 34 35 36 37 38 39 30 31 32 33 34 |9012345678901234|
000000a0 35 36 37 38 39 30 1b 5b 30 30 6d 0a 1b 5b 6d |567890.[00m..[m|
000000af
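The colored dump above pads the first column with spaces instead of tabs. As a rough check, and only a sketch, the first output line can be rebuilt from the hex bytes, the SGR color sequences stripped, and the padding counted. The names `raw`, `SGR` and `strip_sgr` are made up for this illustration; the byte string is transcribed by hand from the dump.

```python
import re

# First line of the colored listing, transcribed from the hexdump above
# (offsets 0x00 to 0x58): color codes, "E" + U+0301, padding, second name.
raw = (
    b"\x1b[00m\x1b[0mE\xcc\x81\x1b[00m"
    + b" " * 31
    + b"\x1b[0my123456789012345678901234567890\x1b[00m\n"
)

# SGR sequences have the form ESC [ <digits and semicolons> m.
SGR = re.compile(rb"\x1b\[[0-9;]*m")

def strip_sgr(data: bytes) -> str:
    """Remove color escape sequences and decode the rest as UTF-8."""
    return SGR.sub(b"", data).decode("utf-8")

text = strip_sgr(raw)
rest = text[2:]                      # skip "E" and the combining accent
padding = len(rest) - len(rest.lstrip(" "))
print(padding)                       # number of spaces before "y..."
```

If the accent is counted as occupying no column, one column of name plus this padding puts the second name one column to the left of where the `z` entry starts on the next line (31 characters plus two spaces).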
> But there is an ls bug:
>
> $ ls -C -T0
> É                               y123456789012345678901234567890
> x123456789012345678901234567890  z123456789012345678901234567890
> $ ls -C -T0 | hd
> 000000 45 CC 81 20 20 20 20 20 20 20 20 20 20 20 20 20  E..
> 000010 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20
> 000020 20 20 79 31 32 33 34 35 36 37 38 39 30 31 32 33    y1234567890123
[...]
OK, so I think I was seeing this bug.
> What 'ls' here outputs is: an E, a combining accent and 31 spaces - text
> that moves to column 32, not 33. When I set a breakpoint in wcwidth,
> I see that the first call to wcwidth() gives: wcwidth(0x0301) = 1.
> U+0301 is COMBINING ACUTE ACCENT. So here is the problem: MacOS'
> wcwidth is buggy for combining characters like accents.
OK. Can't autoconf detect that and use another implementation?
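To make the arithmetic in the quoted analysis concrete, here is a minimal sketch of a column-width computation that treats combining marks as zero-width. It is not the code used by ls or by any C library; `display_width` and `pad_to` are hypothetical helpers, and the width rule is simplified (it ignores control characters and other zero-width code points).

```python
import unicodedata

def display_width(s: str) -> int:
    """Approximate number of terminal columns occupied by s."""
    width = 0
    for ch in s:
        if unicodedata.combining(ch):
            continue  # combining marks take no column of their own
        # Wide and fullwidth East Asian characters take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width

def pad_to(name: str, column: int) -> str:
    """Append spaces so that the next entry starts at the given column."""
    return name + " " * (column - display_width(name))

name = "E\u0301"                     # E followed by COMBINING ACUTE ACCENT
good = pad_to(name, 33).count(" ")   # accent counted as width 0
bad = 33 - len(name)                 # accent counted as width 1
print(good, bad)
```

Counting the accent as one column yields one space fewer than needed, which matches the 31 spaces reported in the quoted dump.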
> (*) 'hd' is a shell script:
> #!/bin/sh
> hexdump -e '"%06.6_ax " 16/1 "%02X "' -e '" " 16/1 "%_p" "\n"' "$@"
It's a bit like (or identical to) "hexdump -C", then.
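For comparison, the layout produced by the quoted format string can be approximated in a few lines. This is only a sketch: `hd_lines` is a hypothetical name, and the padding of a short final line is an assumption, not something verified against hexdump. Judging from the dumps in this message, `hexdump -C` differs in the details: an 8-digit offset, lowercase hex, and the text column enclosed in `|`.

```python
def hd_lines(data: bytes) -> list[str]:
    """Format data roughly like the quoted 'hd' script: a 6-digit hex
    offset, 16 uppercase hex bytes, then the printable characters."""
    lines = []
    for offset in range(0, len(data), 16):
        chunk = data[offset:offset + 16]
        hex_part = "".join(f"{b:02X} " for b in chunk)
        hex_part = hex_part.ljust(16 * 3)   # assumed padding for short lines
        # Non-printable bytes are shown as '.', as in the dumps above.
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:06x} {hex_part} {text}")
    return lines

# The bytes from the quoted "ls -C -T0" dump: name, 31 spaces, next name.
data = b"E\xcc\x81" + b" " * 31 + b"y1234567890123"
for line in hd_lines(data):
    print(line)
```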
Regards,
--
Vincent Lefèvre <address@hidden> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)