Re: [vile] Non-us keyboard layout mishandling.


From: Thomas Dickey
Subject: Re: [vile] Non-us keyboard layout mishandling.
Date: Thu, 16 Dec 2010 06:53:08 -0500 (EST)

On Wed, 15 Dec 2010, Ramil Farkhshatov wrote:

Thomas Dickey <address@hidden> wrote:

On Wed, 15 Dec 2010, Ramil Farkhshatov wrote:

Thomas Dickey <address@hidden> wrote:

On Sun, 12 Dec 2010, Ramil Farkhshatov wrote:

Thomas Dickey <address@hidden> wrote:

On Thu, Nov 18, 2010 at 03:33:14AM +0300, Ramil Farkhshatov
wrote:
Thomas Dickey <address@hidden> wrote:

On Thu, 18 Nov 2010, Ramil Farkhshatov wrote:

Thomas Dickey <address@hidden> wrote:

On Thu, 18 Nov 2010, Ramil Farkhshatov wrote:

Hello.  Vile in normal mode interprets keypresses made in a
non-US layout (e.g.  UTF-8 Cyrillic) as a sequence of US
(Latin) keypresses and performs random actions or sets 'arg:  '
to some huge value.  This behaviour causes confusion when
editing.  I think that vile should ignore such keypresses.

vile should...

This sounds like a special case which I thought I'd fixed:
if your machine doesn't have the corresponding ru_RU locale
data installed, then vile may have incomplete information on
the locale.

Is that the case?

The machine has the ru_RU.UTF-8 locale.  At least it is generated by
glibc's (v2.12.1) 'localedef' and it is shown in the 'locale -a'
output.  And I didn't have any locale issues before,
except for vile's behaviour.

From the reference to glibc, I assume we're talking about Linux.

Yes.  It is Arch Linux, to be specific.  Sorry, I forgot to mention
the OS in my first message.

I meant something like this (from my machine's "locale -a"):

ru_RU ru_RU.iso88595 ru_RU.koi8r ru_RU.utf8

Some distributions, such as Ubuntu, don't deliver 8-bit locales.
I've used those locales to construct a lookup table relating them
to the UTF-8 encoding, and that table is used in various ways.

Output of locale -a:  C en_GB.utf8 en_US en_US.iso88591
en_US.utf8 POSIX ru_RU.utf8

From this, I'm assuming that the problem is as I stated above.
In 9.8c, I added a built-in table which supplies the "ru_RU" part
as a "builtin" locale using ISO-8859-5 encoding.  vile uses that
for cases when the file-encoding is "8bit", as well as to
simplify editing UTF-8 files in 8-bit locales.
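
As a rough illustration of the kind of lookup table described above, the
following Python sketch derives a byte-to-character mapping for ISO-8859-5
from Python's own "iso8859_5" codec.  It is not vile's built-in table, and
the function name build_8bit_table is hypothetical.

```python
def build_8bit_table(encoding="iso8859_5"):
    """Map each high byte (0x80-0xFF) of an 8-bit encoding to its character.

    Hypothetical helper: it only illustrates relating an 8-bit encoding
    to Unicode, it is not how vile builds its table.
    """
    to_unicode = {}
    for byte in range(0x80, 0x100):
        try:
            to_unicode[byte] = bytes([byte]).decode(encoding)
        except UnicodeDecodeError:
            # Some 8-bit encodings leave byte values undefined; skip them.
            continue
    # Reverse mapping: character -> single byte in the 8-bit encoding.
    from_unicode = {char: byte for byte, char in to_unicode.items()}
    return to_unicode, from_unicode


to_unicode, from_unicode = build_8bit_table()
char = to_unicode[0xE4]  # one byte in ISO-8859-5
print(repr(char), char.encode("utf-8").hex())  # the same character as UTF-8 bytes
```

The same character occupies one byte in the 8-bit encoding and two bytes in
UTF-8, which is why a table like this is needed to move between the two.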

The help-file summarizes this (noting that "locale" is the
default, which means that vile should assume that your files are
in UTF-8 encoding, based on the "ru_RU.utf8", and the nl_langinfo
function):

file-encoding (fk) This is the character encoding of the buffer
contents, which is not necessarily the same as the display's
character encoding.  It must be one of the following values:

"8bit" "ascii" "auto" "locale" (default) "utf-16" "utf-32"
"utf-8"

The "auto" setting tells vile to determine the encoding by
inspecting the buffer contents.  The "locale" setting tells vile
to assume that the buffer contents are in the current locale's
encoding.  The "8bit" setting corresponds to the 8-bit locale
support used since 9.3i (20021223).  (B)

Yes, I've scanned the whole help file in case I missed something,
but didn't find anything that would help.  The "file-encoding"
value is always "locale"; at least I don't set it anywhere and
":set" shows it as "locale".  And I can edit UTF-8 encoded files
without problems in the ru_RU.utf8 locale.  When I run vile in the C,
POSIX, or en_GB locales I see Unicode symbols as \xZZ\xZZ…, but
input from the Cyrillic keyboard layout behaves nicely, that is, it
does not trigger any commands in normal mode.  It breaks only in
Unicode locales (I tried ru_RU.utf8 and en_GB.utf8).

I see - but if your locale is unicode, vile is expecting the
keyboard input to be in UTF-8.  It can be told to use 8-bit
encoding for the ":" (minibuffer) line using $cmd-encoding.  It
also has $term-encoding to denote both input/output operations
(actually used only in a macro for uxvile).

Now I'm totally confused.  I believe that my keyboard input is in
UTF-8.  I've tried to read keyboard input both by getchar() with c_lflag
&= ~(ICANON | ECHO) and by ncurses' getch(), and got UTF-8 sequences in
both cases, under xterm and in the Linux console.

Is that the wide-character version of ncurses?  (libncursesw).
ncursesw provides get_wch(), which decodes the UTF-8.  For instance, I
use the ncurses test-programs to check this area.

I linked my test-program against ncursesw.  get_wch() just hangs and
does not return for a long time when I enter something Cyrillic.
getchar() and getch() return multiple values for each keypress, which I
decoded to confirm that the encoding is UTF-8.  I don't know whether this
information is relevant/useful, since vile does not use get_wch().
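
The check described above (several byte values per keypress, decoded to see
whether they form UTF-8) can be sketched as follows.  This is only an
illustration in Python, not the test-program from this thread; the function
name decode_key_bytes is hypothetical, and the byte values are supplied as a
list rather than read from a terminal.

```python
import codecs


def decode_key_bytes(byte_values):
    """Group a stream of byte values into (character, raw bytes) pairs.

    Raises UnicodeDecodeError if the stream is not valid UTF-8.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    result = []
    pending = []
    for value in byte_values:
        pending.append(value)
        # The incremental decoder returns "" until a sequence is complete.
        text = decoder.decode(bytes([value]))
        if text:
            result.append((text, bytes(pending)))
            pending = []
    # Flush: a truncated multi-byte sequence is reported as an error here.
    decoder.decode(b"", final=True)
    return result


# Two byte values from one Cyrillic keypress, then one from a Latin key.
for char, raw in decode_key_bytes([0xD1, 0x84, 0x61]):
    print(repr(char), raw.hex())
```

If the bytes group cleanly into characters like this, the terminal is
delivering UTF-8; a decoding error would point to some other encoding.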

It's sort of useful, since it seems to show that ncursesw and vile
are seeing the same sort of inconsistency (but I don't know the cause).
Ultimately they're using the same encoding data in the C runtime library.

Both of those can be built with debug-traces enabled - for instance,
to show the individual bytes read.  (ncurses/ncursesw is simpler to
use, with the NCURSES_TRACE environment variable).


It sounds as if I could address this case that you're reporting by making
a new variable, e.g., $keyboard-encoding which would tell vile if it has
to translate from an 8-bit encoding into UTF-8.
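
A minimal sketch of what such a translation step might look like, assuming
the keyboard encoding is named by a Python codec name.  The variable
$keyboard-encoding is only a proposal at this point, and the function
translate_keyboard_input below is hypothetical, not vile code.

```python
def translate_keyboard_input(raw, keyboard_encoding):
    """Return keyboard bytes re-encoded as UTF-8.

    raw is the byte string read from the keyboard; keyboard_encoding is a
    Python codec name standing in for the proposed $keyboard-encoding.
    """
    if keyboard_encoding.lower() in ("utf-8", "utf8"):
        # Input is already UTF-8: pass it through untouched.
        return raw
    # Decode from the 8-bit encoding, then encode as UTF-8.
    return raw.decode(keyboard_encoding).encode("utf-8")


# One ISO-8859-5 byte becomes a two-byte UTF-8 sequence.
print(translate_keyboard_input(b"\xe4", "iso8859_5").hex())
```

The pass-through branch matters for the case in this thread: if the keyboard
already sends UTF-8, translating it again would corrupt the input.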

Is it possible that on my system vile is already (mis)translating 8-bit
encoding into UTF-8?

I suppose so - but I'm not sure where to look.  Except for the provision
for 8-bit encodings, vile and ncurses both use equivalent locale support.
For your locale, in vile "show-variables" shows these values for
"encoding":

$buf-encoding = UTF-8
$cmd-encoding = auto
$encoding = ISO-8859-5
$term-encoding = utf-8
$title-encoding = 8bit

Yes.  Exactly the same.

--
Thomas E. Dickey
http://invisible-island.net
ftp://invisible-island.net
