Re: multiple destinations in charset mapping files
From: Kenichi Handa
Subject: Re: multiple destinations in charset mapping files
Date: Thu, 25 Jun 2009 09:36:36 +0900
In article <address@hidden>, YAMAMOTO Mitsuharu <address@hidden> writes:
> I noticed that some charset mapping files such as
> etc/charsets/symbol.map contain entries where the same source is
> mapped to multiple destinations, and the latter one is preferred in
> decoding in such cases.
> 0x20 0x0020
> 0x20 0x00A0
> (decode-char 'symbol #x20) -> 160
> But at least for symbol.map, it seems to be more natural to prefer the
> former entry (e.g., SPACE vs. NO-BREAK SPACE, GREEK CAPITAL LETTER
> DELTA vs. INCREMENT). WDYT?
I agree. I ran this script in etc/charsets:
% for f in *.map; do awk '{print $1}' < $f | sort | uniq -c | grep '^ *[2-9] 0' \
    && echo $f; done
I confirmed that only symbol.map and stdenc.map contain such
duplications, so I regenerated those maps (simply by running
sort -r) and committed them to EMACS_23_1_RC and trunk.
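
To illustrate why a reverse sort flips the preference, here is a
minimal sketch. It assumes, as reported earlier in this thread, that
the later of two entries for the same source wins when decoding. The
helper last_wins is hypothetical and only imitates that rule with awk;
it is not the code Emacs uses to load a map.

```shell
# Hypothetical helper: print the destination that "wins" for SOURCE when a
# later entry for the same source overrides an earlier one (the behavior
# described in this thread). Usage: last_wins SOURCE < mapfile
last_wins() {
    awk -v src="$1" '$1 == src { dest = $2 } END { if (dest != "") print dest }'
}

# The two duplicate entries quoted from symbol.map.
sample='0x20 0x0020
0x20 0x00A0'

# Original order: the second entry (NO-BREAK SPACE) is the last one seen.
printf '%s\n' "$sample" | last_wins 0x20

# After a reverse sort the SPACE entry comes last, so it is preferred.
# LC_ALL=C is an assumption made here to keep the sort order deterministic.
printf '%s\n' "$sample" | LC_ALL=C sort -r | last_wins 0x20
```

With the two sample lines, the first command prints 0x00A0 and the
second prints 0x0020.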
Do you find any other maps that have duplications?
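
For anyone who wants to re-check, a slightly more defensive variant of
the check might look like the sketch below. The grep pattern
'^ *[2-9] 0' only matches single-digit counts, so a source that
appeared ten or more times would be missed; counting in awk avoids
that. The names dup_sources and check_maps are hypothetical, and
skipping lines that start with "#" assumes such lines are comments.

```shell
# Print each source code that appears on more than one mapping line.
# Assumption: lines starting with "#" are comments; blank lines are skipped.
# The first field is treated as an opaque key. Usage: dup_sources < mapfile
dup_sources() {
    awk '/^#/ || NF == 0 { next } seen[$1]++ == 1 { print $1 }'
}

# Print the name of every *.map file in the current directory that has
# at least one duplicated source.
check_maps() {
    for f in *.map; do
        [ -e "$f" ] || continue          # no *.map files: glob stays literal
        if [ -n "$(dup_sources < "$f")" ]; then
            echo "$f"
        fi
    done
}
```

Running check_maps in etc/charsets lists the affected files, and
dup_sources < symbol.map shows which sources are duplicated.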
---
Kenichi Handa
address@hidden