Re: [vile] problem with 'wide characters' (utf-8) under macosx
From: j. van den hoff
Subject: Re: [vile] problem with 'wide characters' (utf-8) under macosx
Date: Sat, 06 Dec 2014 21:45:04 +0100
User-agent: Opera Mail/12.12 (MacIntel)
On Sat, 06 Dec 2014 18:46:06 +0100, Thomas Dickey <address@hidden> wrote:
On Sat, Dec 06, 2014 at 11:15:36AM +0100, j. van den hoff wrote:
forgot to Cc the list. sorry for the noise, brendan ....
On Sat, 06 Dec 2014 07:49:29 +0100, Brendan O'Dea <address@hidden>
wrote:
>On 6 December 2014 at 09:39, j. van den hoff
><address@hidden> wrote:
>>[...] I want to use it in the native `Terminal.app' coming
>>with macos. here's the problem: despite `Terminal.app' being set up for
>>utf-8 character encoding, vile displays
>>non-ascii characters by their hexcode such as \u00E4. [...]
>
>Hi Joerg,
hi brendan,
>
>Could you paste the contents of the buffer produced by :show-variables
>when you are in a file which has such a problem?
sure. I've saved this list for _both_ cases, editing from within urxvt
(where everything is fine) and from within `Terminal.app' (where it is
displaying the utf-8 hexcodes). I list only the differences here:
urxvt: | Terminal.app:
====== | =============
$curcol = 1 | $curcol = 6
$encoding = | $encoding = UTF-8
$lcols = 9 | $lcols = 14
$locale = de_DE | $locale = UTF-8
$pagelen = 50 | $pagelen = 56
$pagewid = 141 | $pagewid = 181
$pid = 33249 | $pid = 33243
$term-cols = 141 | $term-cols = 181
$term-encoding = utf-8 | $term-encoding = locale
$term-lines = 50 | $term-lines = 56
$wlines = 48 | $wlines = 54
Testing the port (which seems to be old - "9.8" - "9.8o" is current),
I don't see any encoding differences.
most of these differences are obviously irrelevant but the encoding
related values differ, too...
I think I will start to read up on what exactly they mean in `vile --help'
...
>
>The output of the locale command from the shell and the value of $TERM
>may also be useful.
the problem might lie in this area. in urxvt I get
LANG=
LC_COLLATE="C"
LC_CTYPE="de_DE.UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
I have something comparable in uxterm (started in OSX):
LANG=
LC_COLLATE="C"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
However - see below.
where I explicitly set LC_CTYPE to that value in (the equivalent of)
.xinitrc so that it is defined when the x11 window manager starts up
(but is ignored, of course, by Terminal.app...). in Terminal.app I get
instead:
LANG=
LC_COLLATE="C"
LC_CTYPE="UTF-8"
LC_MESSAGES="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_ALL=
I see - I have this in Terminal.app:
LANG="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_CTYPE="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_ALL=
In its preferences ("Advanced" tab), I have
Character encoding: Unicode (UTF-8)
Set locale environment variables on startup
I have exactly the same there but end up with the strange `locale' settings
including LC_CTYPE=UTF-8. this definitely is no longer a vile related
question, but do you have any idea from where Terminal.app derives its
information about _what_ locale environment vars to set? (even in your case
they are not the same -- with the lucky exception of LC_CTYPE -- as in
uxterm.)
Generally I don't set locale variables in my shell startup scripts
(for special cases, I set those in scripts around certain programs).
which conforms to what I can select under "character encoding" in the
`preferences' settings of that program. so it's not exactly the same
locale but my (limited) understanding of these things is that "UTF-8"
alone should suffice and the country specific qualifier (de_DE for me)
has not much of an influence? (and both terminals identify as xterm-color).
Not exactly. One might suppose that the names are well-standardized, but
they are not. By itself, for instance, "UTF-8" as a locale setting
likely
refers to an alias. The names that I'm accustomed to using are those
found
using "locale -a".
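The difference between a bare "UTF-8" and a name such as "de_DE.UTF-8" can be illustrated with a minimal Python sketch. It assumes the common `language_TERRITORY.codeset` naming convention; `looks_fully_qualified` is a hypothetical helper, not part of vile or of the system's locale tools, and it checks only the shape of the name, not whether the locale is actually installed (for that, "locale -a" remains the reference):

```python
import re

# Shape of a conventional locale name: language, optional territory,
# optional codeset, optional modifier (e.g. de_DE.UTF-8 or de_DE@euro).
# This pattern is an assumption for illustration, not a standard parser.
_LOCALE_RE = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:_(?P<territory>[A-Z]{2}))?"
    r"(?:\.(?P<codeset>[A-Za-z0-9-]+))?"
    r"(?:@(?P<modifier>[A-Za-z0-9]+))?$"
)


def looks_fully_qualified(name):
    """Return True if name has language, territory and codeset parts."""
    match = _LOCALE_RE.match(name)
    if match is None:
        return False
    return bool(match.group("territory") and match.group("codeset"))


if __name__ == "__main__":
    # Values taken from the locale output quoted in this thread.
    for candidate in ("de_DE.UTF-8", "en_US.UTF-8", "de_DE", "UTF-8", "C"):
        print(candidate, looks_fully_qualified(candidate))
```

With this shape check, the Terminal.app value "UTF-8" is rejected because it carries no language or territory part, which is the information vile would otherwise use.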
understood -- but I have no idea whatsoever _how_ that `locale' setting
in Terminal.app comes about ...
vile's different from the other editors because it will (if available)
use the "de_DE" to infer a useful value for the "8bit" encoding.
(vile has built-in locale tables in case "de_DE" itself is not supplied
on your machine, so that can do this - about 70kb).
I see.
I experimented a little, and see that your locale settings are confusing
vile. You can see this best by ":show-printable" and looking at the
bottom
of the page (codes are showing as hexadecimal).
Using "de_DE.UTF-8" throughout (actually LC_CTYPE is the important one),
I don't see the hexadecimal characters in "9.8" or the current version.
I see something similar but not quite: in Terminal.app and
with the `UTF-8' value for LC_CTYPE I get hexcodes for positions 128-159
(\x80 - \x9F)
and a verbatim `?' for positions 160-255. If I then manually set
LC_CTYPE=de_DE.UTF-8 in that Terminal.app window and restart vile I
1) still get the hexcodes for pos. 128-159 (but the same happens in urxvt)
2) get regular chars for 160-255
3) most important: the `??' problem when entering diacritical characters
such as ü vanishes
only problem: I don't see any way to convince Terminal.app to use a valid
(fully qualified) value for LC_CTYPE
in the first place...
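One possible workaround, sketched in Python under stated assumptions: build a corrected copy of the environment before starting the editor. `with_qualified_ctype` is a hypothetical helper, and the fallback "de_DE.UTF-8" is simply the value that worked in the manual test above; it is not something Terminal.app or vile provides.

```python
import os


def with_qualified_ctype(env, fallback="de_DE.UTF-8"):
    """Return a copy of env where a bare 'UTF-8' LC_CTYPE is replaced.

    Any other LC_CTYPE value (or a missing one) is left untouched.
    """
    fixed = dict(env)
    if fixed.get("LC_CTYPE") == "UTF-8":
        fixed["LC_CTYPE"] = fallback
    return fixed


if __name__ == "__main__":
    env = with_qualified_ctype(os.environ)
    print("LC_CTYPE =", env.get("LC_CTYPE"))
    # To start the editor with this environment one could use, for example:
    #   subprocess.run(["vile", "somefile.txt"], env=env)
```

The same effect can be had by exporting LC_CTYPE in a shell startup file; the sketch only makes the "replace the bare value, leave everything else alone" rule explicit.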
This might be related to the "??" problem - I'm not sure.
bingo ;-), see above (thanks!). the whole thing remains confusing for me,
though. for one, I don't understand in which way the LC_CTYPE=UTF-8
setting is confusing vile, since, as explained, at least after a redraw
the entered `ü' (and similar) are rendered correctly in the buffer (while
not being displayed in the show-printable output). but obviously there'll
be some hidden reason for this. the other thing which remains unclear for
me is how I manage to end up with LC_CTYPE=UTF-8 in Terminal.app in the
first place. but that's probably not a problem for this list...
("xterm-color" is problematic as well - a different topic).
is it? what do you recommend here?
the strange thing is that several other editors
recognize these settings in a way that utf-8 is displayed correctly in
both terminal emulators.
--
Using Opera's revolutionary email client: http://www.opera.com/mail/