Re: [vile] problem with 'wide characters' (utf-8) under macosx
From: Thomas Dickey
Subject: Re: [vile] problem with 'wide characters' (utf-8) under macosx
Date: Sat, 6 Dec 2014 16:53:46 -0500
User-agent: Mutt/1.5.20 (2009-06-14)
On Sat, Dec 06, 2014 at 09:45:04PM +0100, j. van den hoff wrote:
> >In its preferences ("Advanced" tab), I have
> > Character encoding: Unicode (UTF-8)
> > Set locale environment variables on startup
>
> I have exactly the same there but end up with the strange `locale' settings
> including LC_CTYPE=UTF-8. this definitely is no longer a vile related
> question but do you have any idea from where Terminal.app derives its
> information _what_ locale environment vars to set (even in your case they
> are not the same -- with the lucky exception of LC_CTYPE -- as in uxterm)?
hmm - no, I don't... When I set up my Macs (both macmini servers), I didn't
delve into their locale settings. In the system preferences, I see the
language/region part, which is English/United States - which is probably
where Terminal.app gets its information from. I do recall that initially
OSX wasn't making useful locale settings that I could pass via ssh -- I used
to just override them on the remote end. I'm running last year's release,
Mavericks, on both machines (am still seeing X as buggy in Yosemite).
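
For reference, the C library consults LC_ALL first, then LC_CTYPE, then LANG
when it picks the character-type locale, so one stray LC_CTYPE=UTF-8 wins over
a sensible LANG. A minimal Python sketch of that lookup follows. The function
names are made up for illustration, and the "fully qualified" check is only a
rough heuristic, not what any particular program does:

```python
def effective_ctype(env):
    """Return the locale name that governs LC_CTYPE for the mapping `env`."""
    # POSIX precedence: LC_ALL overrides LC_CTYPE, which overrides LANG.
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var, "")
        if value:  # an empty value counts the same as an unset one
            return value
    return "C"


def looks_fully_qualified(name):
    """Heuristic (an assumption): language_TERRITORY before any .codeset."""
    base = name.split(".", 1)[0]
    lang, sep, territory = base.partition("_")
    return (bool(sep) and lang.isalpha() and lang.islower()
            and territory.isalpha() and territory.isupper())
```

Passing os.environ as `env` checks the current shell; a bare "UTF-8" fails the
heuristic while "de_DE.UTF-8" passes.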
> >Generally I don't set locale variables in my shell startup scripts
> >(for special cases, I set those in scripts around certain programs).
> >
> >>which conforms to what I can select under "character encoding" in the
> >>`preferences' settings of that program. so it's not exactly the same
> >>locale but my (limited) understanding of these things is that "UTF-8"
> >>alone should suffice and the country specific qualifier (de_DE for me)
> >>has not much of an influence? (and both terminals identify as xterm-color).
> >
> >Not exactly. One might suppose that the names are well-standardized, but
> >they are not. By itself, for instance, "UTF-8" as a locale setting likely
> >refers to an alias. The names that I'm accustomed to using are those found
> >using "locale -a".
>
> understood -- but I have no idea whatsoever _how_ that `locale' setting
> in Terminal.app comes about ...
>
> >
> >vile's different from the other editors because it will (if available)
> >use the "de_DE" to infer a useful value for the "8bit" encoding.
> >(vile has built-in locale tables in case "de_DE" itself is not supplied
> >on your machine, so that can do this - about 70kb).
>
> I see.
right. If it cannot find the useful value, then it falls back to POSIX
or Latin-1, depending. The ":show-printable" I saw today looked like POSIX.
(I probably should revisit this and attempt to improve it - but the port
is old, too ...)
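
To illustrate the idea (this is not vile's code, just a sketch under stated
assumptions): split the locale name into its parts, look up the
language_TERRITORY pair in a table, and fall back when the name cannot be
parsed. The table contents, the function names, and the single "POSIX"
fallback are all hypothetical simplifications:

```python
import re

# language[_TERRITORY][.codeset][@modifier], e.g. de_DE.UTF-8
_LOCALE_RE = re.compile(
    r"^(?P<lang>[a-z]{2,3})(?:_(?P<territory>[A-Z]{2}))?"
    r"(?:\.(?P<codeset>[^@]+))?(?:@(?P<modifier>.+))?$"
)

# Tiny illustrative table (hypothetical); a real one would be far larger.
EXAMPLE_TABLE = {"de_DE": "ISO-8859-1", "en_US": "ISO-8859-1"}


def split_locale(name):
    """Return the parts of a locale name as a dict, or None if it does not parse."""
    match = _LOCALE_RE.match(name)
    return match.groupdict() if match else None


def guess_8bit_encoding(lc_ctype, table=EXAMPLE_TABLE, fallback="POSIX"):
    """Guess an 8-bit encoding from the language/territory part of lc_ctype."""
    parts = split_locale(lc_ctype)
    if parts is None or parts["territory"] is None:
        return fallback  # e.g. a bare "UTF-8" carries no language information
    return table.get(parts["lang"] + "_" + parts["territory"], fallback)
```

With this sketch, "de_DE.UTF-8" yields a Latin-1 guess, while a bare "UTF-8"
has nothing to look up and ends in the fallback.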
> >I experimented a little, and see that your locale settings are confusing
> >vile. You can see this best by ":show-printable" and looking at the bottom
> >of the page (codes are showing as hexadecimal).
> >
> >Using "de_DE.UTF-8" throughout (actually LC_CTYPE is the important one),
> >I don't see the hexadecimal characters in "9.8" or the current version.
>
> I see something similar but not quite: in Terminal.app and with the
> `UTF-8' value for LC_CTYPE I see hexcodes for positions 128-159
> (\x80 - \x9F) and a verbatim `?' for positions 160-255. If I then manually
> set LC_CTYPE=de_DE.UTF-8 in that Terminal.app window and restart vile I
>
> 1) still get the hexcodes for pos. 128-159 (but the same happens in urxvt)
> 2) get regular chars for 160-255
> 3) most important: the `??' problem when entering diacritical
> characters such as ü vanishes
>
> only problem: I don't see any way to convince Terminal.app to use a
> valid (fully qualified) value for LC_CTYPE
> in the first place...
>
> >
> >This might be related to the "??" problem - I'm not sure.
>
> bingo ;-), see above (thanks!). the whole thing remains confusing for me,
> though. for one, I don't understand in which way the LC_CTYPE=UTF-8 setting
> is confusing vile, since as explained the entered `ü' (and similar) are, at
> least after a redraw, rendered correctly in the buffer (while not being
> displayed in the show-printable output). but obviously there'll be some
> hidden reason for this. the other thing which remains unclear for me is how
> I manage to end up with LC_CTYPE=UTF-8 in Terminal.app in the first place.
> but that's probably not a problem for this list...
>
> >
> >("xterm-color" is problematic as well - a different topic).
>
> is it? what do you recommend here?
The closest for Terminal.app would be from ncurses:
	nsterm-256color
which I added in 2012.
But for Macs that's hard:
a) the terminal database in /usr/share/terminfo is very old.
b) Terminal.app only has certain settings (actually in Mavericks,
   none are "xterm-color"). It does have "nsterm", which was
   added to ncurses in 2001. A quick check with that entry shows
   some problem with line-drawing.
c) Unlike Linux and the *BSDs, the port for ncurses is still the 5.9
   release (2011). That includes the terminal database in
   /opt/local/share/terminfo
As xterm-color, the function keys are half-right.
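
One way to see which of those entries a given machine actually has is to look
for the compiled files directly. A rough Python sketch, assuming the usual
on-disk layouts (a one-letter subdirectory such as n/nsterm, or a two-digit
hex subdirectory such as 6e/nsterm as used on OS X); the function names are
made up for illustration:

```python
import os


def find_terminfo_entry(term, directories):
    """Return the path of the compiled entry for `term`, or None if absent."""
    first = term[0]
    # Try both layouts: letter subdirectory and hex-code subdirectory.
    subdirs = (first, format(ord(first), "02x"))
    for directory in directories:
        for sub in subdirs:
            candidate = os.path.join(directory, sub, term)
            if os.path.isfile(candidate):
                return candidate
    return None


def pick_term(preferred, directories):
    """Return the first name in `preferred` that has an entry, else None."""
    for term in preferred:
        if find_terminfo_entry(term, directories) is not None:
            return term
    return None
```

For example, pick_term(["nsterm-256color", "nsterm", "xterm-color"],
["/opt/local/share/terminfo", "/usr/share/terminfo"]) would report the best
candidate present in the two databases mentioned above.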
--
Thomas E. Dickey <address@hidden>
http://invisible-island.net
ftp://invisible-island.net