From: Henry Nelson
Subject: Re: lynx-dev chartrans to CJK-like display (was: stopping when viewing a site)
Date: Fri, 27 Aug 1999 11:36:57 +0900 (JST)
> > do NOT want CJK translation tables to become a part of Lynx.
>
> Why not? :)
Yeah, right. Quadruple the size of the distribution overnight.
> There would be advantages if Lynx could really translate between arbitrary
> charsets including CJK, say "Japanese" (whatever charset) <--> Unicode.
> For example, with the advent of utf-8 capable xterm, users of that should
> be able to see Japanese text mixed with all kinds of other text, without
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
That is the eventual goal, isn't it: all languages of the world mixed
together in any way. Speech is making TREMENDOUS strides. With the
speed of relatively inexpensive CPUs these days, they are already
doing "unicode" speech translation. I saw a demonstration earlier this
year that just boggled my mind. Talk in Japanese, and Japanese text
shows up on the screen, or, nearly as fast, English speech comes out
of a neighboring terminal. It was done in two steps: Japanese to a
"universal language", then the "universal language" to English, or
theoretically to any language. Up until now it has always been direct
translation.
> switching, even on the same screen. Given the right fonts of course -
> I understand they exist. (I have not tried any of this.)
Yes, those fonts exist; the smallest are about 300 KB, and the exotic,
large-point ones are at least 2 MB. Of course this is all jabber, but
my suspicion is that sjis was perpetrated because the chip makers
had to make and sell a few billion ROMs before they could make a decent
profit (before the days of GB-size HDDs).
> But don't worry, I think there is no imminent danger of someone adding full
> CJK-charset <--> unicode translation...
In Lynx, anyway.
> that. I think Leonid's point was that that FTP directory might help you
> find out the CPNNN name for what your system uses. (If there is one.
^^^^^^^^^^^^^^^
I appreciated the pointer. I wasn't sure of his intent though because of
the comment on the size. AFAIK, there isn't (couldn't be?).
> I would prefer to use a different term here instead of translation, I would
> rather call it re-encoding. Just to separate it from the other,
> table-driven translation.
You're right. This is what I meant in my comment to Heather (the one
with the sick pun).
> I know what you mean, although the terminology you use is a bit too limiting:
> There is no reason to restrict this a priori to "single-byte characters".
> Rather it should apply to everything that is recognized as a "Unicode
> character" (i.e. translatable by lynx to a U+nnnn value), whether it
> originally came in as iso-8859-1, koi8-r, utf-8, windows-1252 bytes,
Very true. I didn't mean to be limiting so much as practical. Anyway,
with your fix, I think this thread has come to a close (except for the
fun part, the dream part :).
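
To make the "translatable to a U+nnnn value" point concrete, here is a
minimal Python sketch. It is not Lynx code: it assumes Python's built-in
codecs stand in for the chartrans tables, and the helper name
to_code_points is made up for illustration. It shows bytes arriving in
the charsets named above all ending up as Unicode code points, so that
whatever comes next need not care about the original encoding.

```python
# Illustrative only: Python's built-in codecs stand in for Lynx's chartrans tables.
def to_code_points(raw, charset):
    """Decode raw bytes in the given charset and list each character as U+nnnn."""
    return ["U+%04X" % ord(ch) for ch in raw.decode(charset)]

# (bytes, charset) pairs; the first three encode the same character
# (e with acute accent), the last is Cyrillic small letter a in koi8-r.
samples = [
    (b"\xe9", "iso-8859-1"),
    (b"\xe9", "windows-1252"),
    (b"\xc3\xa9", "utf-8"),
    (b"\xc1", "koi8-r"),
]

for raw, charset in samples:
    print(charset, to_code_points(raw, charset))
```

Once everything is a code point, the single-byte versus multi-byte
distinction in the input no longer matters.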
> Getting this right for *7-bit* characters that need translation may
> be not worth it (I am thinking about the '\' that is not a backslash).
^
Neat, isn't it? You see a '\', and I see a '=' on top of a 'Y'.
That reflects my opinion, and that is why I am perfectly content to
continue to copy my hoge-hoge.tbl (that's Japanese) over def7_uni.tbl.
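
The '\' case can be sketched the same way. The following is a rough
Python illustration, not the format of def7_uni.tbl or any Lynx table:
the names JIS_ROMAN_OVERRIDES and decode_7bit are hypothetical. It
assumes a 7-bit input in which a small override table replaces the
default ASCII reading of a few byte values, as JIS X 0201 Roman does for
0x5C (yen sign) and 0x7E (overline).

```python
# Hypothetical override table: byte values whose meaning differs from ASCII
# in JIS X 0201 Roman. Not taken from any Lynx .tbl file.
JIS_ROMAN_OVERRIDES = {
    0x5C: "\u00a5",  # YEN SIGN where ASCII has a backslash
    0x7E: "\u203e",  # OVERLINE where ASCII has a tilde
}

def decode_7bit(raw, overrides=None):
    """Map each 7-bit byte to a character, applying overrides before ASCII."""
    overrides = overrides or {}
    out = []
    for b in raw:
        if b > 0x7F:
            raise ValueError("not a 7-bit byte: 0x%02X" % b)
        out.append(overrides.get(b, chr(b)))
    return "".join(out)

path = b"C:\\tmp"
print(decode_7bit(path))                       # default ASCII reading
print(decode_7bit(path, JIS_ROMAN_OVERRIDES))  # same bytes, Japanese reading
```

The bytes are identical in both calls; only the table decides whether
0x5C is shown as a backslash or as a yen sign.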
> Did you ever try any of this *before* nearly all of the separate
> translation tables ('old-style') in LYCharSets.c were eliminated?
Not sure what you mean by "any of this", but I used to patch the 7-bit
approximations before I switched over to just copying over with a
replacement. That was so long ago that I've pretty much forgotten;
I think it was before, or maybe soon after, Leonid started to be the
de facto maintainer of chrtrans. It was never a big deal before it
stopped working.
__Henry