Re: "Unidecode" functionality in Emacs
From: Eli Zaretskii
Subject: Re: "Unidecode" functionality in Emacs
Date: Tue, 20 Mar 2018 08:20:22 +0200

> From: John Mastro <john.b.mastro@gmail.com>
> Date: Mon, 19 Mar 2018 15:04:29 -0700
>
> There are "Unidecode" packages for Perl[1], Python[2], and Emacs[3]
> (derived from one another in that order). They each transliterate
> Unicode text to ASCII, e.g.:
>
> (unidecode "Déjà vu")
> ;=> "Deja vu"
> (unidecode "北亰")
> ;=> "Bei Jing "
>
> Does Emacs have equivalent functionality built-in?

It's possible to remove accents (the first example) using the
functionality in ucs-normalize.el. Some transliteration is possible
for scripts for which there exists a "transliteration" input method,
using the code by Michael Welsh Duggan posted here:

  http://lists.gnu.org/archive/html/emacs-devel/2018-02/msg00387.html

For example, you can transliterate Cyrillic text using the
cyrillic-translit input method that comes with Emacs. But there are
no general-purpose transliteration capabilities in Emacs, AFAIK.
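
To illustrate the accent-removal case, here is a minimal sketch using ucs-normalize.el. The function name my-strip-accents is made up for this example and is not part of Emacs; the idea is to decompose the string to NFD and then drop the characters whose Unicode general category is Mn (nonspacing mark). This only handles characters that decompose into a base letter plus combining marks, so it is not a general transliterator.

```elisp
(require 'ucs-normalize)
(require 'seq)

;; Hypothetical helper, not part of Emacs: strip combining marks
;; after canonical decomposition (NFD).
(defun my-strip-accents (string)
  "Return STRING with nonspacing combining marks removed."
  (let ((decomposed (ucs-normalize-NFD-string string)))
    (concat
     (seq-remove
      (lambda (ch)
        ;; Mn is the Unicode general category "Mark, nonspacing".
        (eq (get-char-code-property ch 'general-category) 'Mn))
      decomposed))))

(my-strip-accents "Déjà vu")
;=> "Deja vu"
```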

However, it looks like the Perl package is just a huge database of
precomputed transliterations, in which case doing the same in Emacs
Lisp should be almost trivial.
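
As a rough sketch of what such a table-driven approach might look like (an assumption about the design, not a port of the Perl package): the names my-translit-table and my-transliterate are hypothetical, and the table below holds only three toy entries where a real one would be generated from the full database.

```elisp
;; Toy lookup table; a real one would be generated from the
;; precomputed transliteration data.  All names are hypothetical.
(defvar my-translit-table
  (let ((table (make-hash-table :test 'eql)))
    (puthash ?北 "Bei " table)
    (puthash ?亰 "Jing " table)
    (puthash ?é "e" table)
    table)
  "Map from non-ASCII characters to ASCII replacement strings.")

(defun my-transliterate (string)
  "Return STRING with each character replaced by its ASCII transliteration.
ASCII characters pass through unchanged; characters missing from
`my-translit-table' become \"?\"."
  (mapconcat
   (lambda (ch)
     (cond ((< ch 128) (string ch))            ; already ASCII
           ((gethash ch my-translit-table))    ; table hit
           (t "?")))                           ; unknown character
   string ""))

(my-transliterate "北亰")
;=> "Bei Jing "
```

With this layout, nearly all of the work goes into producing the table; the lookup function itself stays a few lines long.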