Re: lynx-dev tech. question: translating strings to different charsets
From: Klaus Weide
Subject: Re: lynx-dev tech. question: translating strings to different charsets
Date: Mon, 6 Sep 1999 07:14:08 -0500 (CDT)
On Sun, 5 Sep 1999, Vlad Harchev wrote:
> On Fri, 3 Sep 1999, Klaus Weide wrote:
>
> As for translation, here are my thoughts:
> * to avoid performance decrease due to LYUCFullyTranslateString_1, the
> following thing can be used:
> the translation of each character used in hydict chset (aka "human
> letter") to d.c.s. can be precalculated (since translation of even
> unicode "characters" is zero-state machine) - so seems flexibility is
What does that remark in parentheses mean? I don't understand it
at all.
> regained - user will have to specify either in hydict (as comment) or in
> lynx.cfg the chset used in hydict to make such translation.
It seems you are talking about translating the patterns at runtime (at
program start, and/or each time the display character set or something
else is changed?). That will only be general enough if the pattern
input, before translation, is in a form that is general enough for the
languages to be covered. In general that means you have to provide
the patterns as UCS (in whatever encoding, e.g. UTF-8) for some languages
(or possibly for combinations of languages, if that is supposed to be
covered too).
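To make the runtime translation described above concrete, here is a
minimal sketch. It assumes the patterns are stored as UTF-8 and uses the
external iconv tool purely as a stand-in for whatever translation Lynx
would do internally. The function name patterns_to_dcs and the sample
pattern are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: convert hyphenation patterns stored as UTF-8
# (read from stdin) into a given display character set.
# iconv is only a stand-in for Lynx's own translation tables.
patterns_to_dcs() {
    # $1 = target charset name as understood by iconv, e.g. KOI8-R
    iconv -f UTF-8 -t "$1"
}

# Example: one made-up pattern containing CYRILLIC SMALL LETTER A
# (U+0430, bytes D0 B0 in UTF-8), dumped as hex after conversion.
printf '1\320\260\n' | patterns_to_dcs KOI8-R | od -An -tx1
```

The same UTF-8 input can be fed through the function again whenever the
display character set changes, which is why the stored form has to be
general enough for every language that is to be covered.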
> As for
> Unicode, IMO even at the present state (without modification) libhnj is
> suitable for this - simply there will be extra (that can be avoided with
> cleverer approach - of using 'int' instead of 'char') states used by UTF
> prefixes.
Again I don't understand. Are you talking about a specific encoding,
UTF-8, when you write "Unicode"? I don't know what kind of "states"
you mean.
> * IMO we can turn lynx is a powerfull charset translator with a very cheap
> hack ( I mean adding something like 'lynx -recode utf-8 koi8-r < in >out')
> IMO this worth this.
Lynx already is a "powerfull charset translator" that one could use
in place of packages like "recode" etc., although one should expect
those specific packages to be better (more correct / more general /
more flexible / more efficient) at the job they were written for.
Lynx just doesn't have a convenient syntax for invoking it as a filter
for this (maybe to encourage using "the right tool for the right
job").
But try the appended script. It will only work right if there is
no ~/.lynxrc. (It would probably be better to temporarily mess with
~/.lynxrc instead of messing with lynx.cfg, and just use -cfg=/dev/null
for speed.) Yes, it requires bash; it won't work with just any
Bourne-like shell.
Klaus
--------------- lynx-recode.sh -----------------------------------
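#! /bin/bash
# Recode text from charset cs_in to cs_out by running it through lynx -dump.
if [ $# -ne 3 ] && [ $# -ne 2 ]; then
    echo "Usage: $0 cs_in cs_out [file]" >&2
    exit 1
fi
LYNX="${LYNX:-lynx}"
LYNX_CFG="${LYNX_CFG:-/usr/local/lib/lynx.cfg}"
file="${3:-/dev/stdin}"
if [ $# = 3 ] && [ "$file" != "/dev/stdin" ]; then
    # A named file is fed back into this script on stdin.
    "$0" "$1" "$2" < "$file"
else
    # The <(...) process substitution is the bash-only part: lynx reads a
    # copy of lynx.cfg whose CHARACTER_SET line is set to the output charset.
    $LYNX -assume_charset="$1" -assume_local_charset="$1" \
        -cfg <(sed -e "s/^#\?CHARACTER_SET:.*/CHARACTER_SET:$2/" "$LYNX_CFG") \
        -dump "$file"
fi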
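
The sed step in the script above can be tried on its own, without lynx.
This is only a sketch: the function name set_display_charset and the
two-line config fragment are made up for illustration, and the pattern
is written with the POSIX interval \{0,1\} instead of the GNU-style \?
used in the script, so that it does not depend on GNU sed.

```shell
#!/bin/sh
# Rewrite the CHARACTER_SET line of a lynx.cfg-style file read from
# stdin, whether or not that line is commented out with a leading '#'.
set_display_charset() {
    # $1 = display character set to put into the config
    sed -e "s/^#\{0,1\}CHARACTER_SET:.*/CHARACTER_SET:$1/"
}

# Made-up config fragment: only the CHARACTER_SET line should change.
printf '%s\n' '#CHARACTER_SET:iso-8859-1' 'ASSUME_CHARSET:utf-8' |
    set_display_charset koi8-r
```

The ASSUME_CHARSET line passes through untouched because the pattern is
anchored at the start of the line and allows at most one leading '#'.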