nano-devel

Re: [Nano-devel] Wide character support.


From: David Benbennick
Subject: Re: [Nano-devel] Wide character support.
Date: Thu, 6 Mar 2003 19:41:30 -0500
User-agent: Mutt/1.2.5.1i

Here's a patch that makes Nano know about multi-byte characters.  It is
attached, and available at
    http://www.math.cornell.edu/~dbenbenn/nano/utf8.patch
It must be applied on top of my big patch from
    http://www.math.cornell.edu/~dbenbenn/nano/nano.patch

Despite the name, this patch doesn't know anything about UTF-8 in
particular.  It just uses the system-provided conversion routines, such
as mbtowc() and wcwidth().

It doesn't (yet) use Fribidi.  Also, there is a bug where key input gets
frozen sometimes until you type a function or arrow key.  (There are
probably other bugs I don't know about.)


To make it work:

1) Get the newest Ncurses, configure it with --enable-widec, and install.

2) Apply nano.patch and utf8.patch.

3) In the Makefile, replace
        -lncurses
   with
        -lncursesw
   Also, you might have to use the -L flag if you didn't install
   libncursesw.so in a standard place.

4) Set the locale to something reflecting the type of file you are
opening.  This is how Nano knows the multi-byte character set.  I use
    export LANG=en_US.utf8
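
For anyone curious how a program picks up that setting, here is a minimal
sketch.  It is not code from the patch; use_locale() and
max_bytes_per_char() are names made up for this illustration.  Passing ""
to setlocale() asks the C library to use whatever LANG (or LC_ALL, etc.)
says in the environment.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: switch to the named locale.  Pass "" to take the
 * locale from the environment (LANG, LC_ALL, ...).  Returns the name of
 * the character encoding now in effect, or NULL if the locale is not
 * installed on this system. */
const char *use_locale(const char *name)
{
    assert(name != NULL);
    if (setlocale(LC_ALL, name) == NULL)
        return NULL;
    return nl_langinfo(CODESET);
}

/* Hypothetical helper: the largest number of bytes one character can
 * occupy in the current locale. */
size_t max_bytes_per_char(void)
{
    return (size_t)MB_CUR_MAX;
}
```

An editor would call use_locale("") once at startup, before reading any
file.  If it returns NULL, the locale named in LANG is probably not
generated on the machine.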


Details:

This patch leaves the file as multi-byte in memory, instead of converting
everything to wide characters (which typically take 4 bytes each on
Linux).  A line of text gets expanded to wchar_t only when it is
displayed.  do_char, do_delete, do_left, and do_right had to be changed
to work with multi-byte characters.
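
To illustrate what "working with multi-byte characters" means for cursor
movement, here is a rough sketch.  It is not the code from utf8.patch;
step_right() and step_left() are hypothetical names, and the sketch
assumes a stateless encoding such as UTF-8.  The idea is that the cursor
moves by one character, which may be several bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: given a byte offset `pos` at the start of a
 * character in `line`, return the byte offset of the next character.
 * An invalid or truncated sequence is stepped over one byte at a time. */
size_t step_right(const char *line, size_t pos)
{
    mbstate_t st;
    size_t len, n;

    assert(line != NULL);
    len = strlen(line);
    if (pos >= len)
        return len;

    memset(&st, 0, sizeof st);          /* start in the initial state */
    n = mbrlen(line + pos, len - pos, &st);
    if (n == (size_t)-1 || n == (size_t)-2 || n == 0)
        n = 1;                          /* bad byte: skip just that byte */
    return pos + n;
}

/* Hypothetical helper: return the byte offset of the character before
 * `pos`.  Multi-byte text cannot be decoded backwards in general, so
 * this rescans from the start of the line. */
size_t step_left(const char *line, size_t pos)
{
    size_t prev = 0, cur = 0, len;

    assert(line != NULL);
    len = strlen(line);
    if (pos > len)
        pos = len;
    while (cur < pos) {
        prev = cur;
        cur = step_right(line, cur);
    }
    return prev;
}
```

Rescanning from the start of the line makes moving left linear in the
line length, which seems acceptable for lines of ordinary size.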

In nano.patch, there is a function display_string() that expands out Tabs
and control characters, and returns a string that can be displayed.  Each
character in that string is only 1 column wide, since for example ^C gets
expanded to '^' followed by 'C'.  Lots of places in winio.c relied on
this fact.  Of course, with Unicode, you can't expand every multi-column
character into single-column ones; the Chinese character 童, for example,
occupies two columns.  So all those bits of code had to be fixed.
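
As an illustration of the kind of calculation that is needed once
characters can be wider than one column, here is a sketch.  It is not
taken from the patch; display_columns() is a made-up name, and it assumes
that tabs have already been expanded and that a control character is
shown in two columns, as in the ^C example.

```c
#define _XOPEN_SOURCE 700   /* for wcwidth() on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: number of screen columns needed to show `line`,
 * a multi-byte string in the current locale's encoding. */
size_t display_columns(const char *line)
{
    mbstate_t st;
    size_t cols = 0, left;

    assert(line != NULL);
    memset(&st, 0, sizeof st);
    left = strlen(line);

    while (left > 0) {
        wchar_t wc;
        size_t n = mbrtowc(&wc, line, left, &st);
        int w;

        if (n == (size_t)-1 || n == (size_t)-2) {
            /* Invalid or truncated sequence: count the byte as one
             * column and resynchronize at the next byte. */
            memset(&st, 0, sizeof st);
            cols += 1;
            line += 1;
            left -= 1;
            continue;
        }

        w = wcwidth(wc);
        /* wcwidth() returns -1 for non-printable characters; assume
         * those are drawn in caret notation, i.e. two columns. */
        cols += (w >= 0) ? (size_t)w : 2;
        line += n;
        left -= n;
    }
    return cols;
}
```

In a UTF-8 locale, a wide character such as 童 would be expected to
contribute 2 to the total, while a combining character contributes 0.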

Attachment: utf8.patch
Description: Text document

Attachment: pgpd9RIR6cFzi.pgp
Description: PGP signature

