emacs-devel

Re: [Emacs-diffs] master db828f6: Don't rely on defaults in decoding UTF


From: Eli Zaretskii
Subject: Re: [Emacs-diffs] master db828f6: Don't rely on defaults in decoding UTF-8 encoded Lisp files
Date: Mon, 28 Sep 2015 00:04:36 +0300

> Cc: address@hidden
> From: Paul Eggert <address@hidden>
> Date: Sun, 27 Sep 2015 13:21:51 -0700
> 
> Eli Zaretskii wrote:
> > This is unrelated: it specifies which character sequences should be
> > composed and displayed as a single grapheme cluster.
> 
> Yes.  It might be reasonable to replace some of those \u instances for 
> readability, e.g.:
> 
> -        ("V" . "[\u0904-\u0914\u0960-\u0961\u0972]") ; independent vowel
> +        ("V" . "[ऄ-औॠ-ॡॲ]") ; independent vowel

I'm not so sure this is a good idea: since most of us don't read Indic
scripts, leaving the codepoints there makes it easier to compare these
patterns with various relevant publications and standards on the
Internet.  If we make them characters instead, most of us will have to
use "C-x =" to see the codepoints anyway.
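For what it's worth, the two spellings read as the same string, and the
codepoints can be recovered from the literal form programmatically as
well as with "C-x =".  A rough sketch, assuming the file is decoded as
UTF-8; the helper name `my-codepoints' is made up for illustration and
is not part of Emacs:

```elisp
;; Hypothetical helper: list the codepoints of STRING as "U+XXXX" strings.
(defun my-codepoints (string)
  (mapcar (lambda (ch) (format "U+%04X" ch)) string))

;; The escaped and the literal spellings should compare equal.
(string= "[\u0904-\u0914\u0960-\u0961\u0972]" "[ऄ-औॠ-ॡॲ]")

;; Recover the codepoints from the literal spelling of the first range.
(my-codepoints "ऄ-औ")
```

This only helps someone who already has the file open in Emacs; anyone
reading the source elsewhere still sees only the glyphs.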

> But replacements would not be such a good idea for some of this code, e.g.:
> 
> -        ("H" . "\u094D")             ; HALANT
> +        ("H" . "्")          ; HALANT
> 
> as standalone combining characters are problematic on display, and here:
> 
> -        ("J" . "\u200D")             ; ZWJ
> +        ("J" . "‍")          ; ZWJ
> 
> where one can't easily see a zero width joiner when editing the
> source file.

Indeed.
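
Both of those cases can at least be detected mechanically.  As a rough
sketch, `get-char-code-property' reports the Unicode general category,
which identifies combining marks (Mn, Mc, Me) and format controls such
as ZWJ (Cf).  The predicate name below is invented for illustration:

```elisp
;; Hypothetical predicate, not an existing Emacs function.
(defun my-hard-to-see-char-p (ch)
  "Return non-nil if CH is a combining mark or a format control."
  (memq (get-char-code-property ch 'general-category)
        '(Mn Mc Me Cf)))

(my-hard-to-see-char-p #x094D)  ; HALANT, a nonspacing mark
(my-hard-to-see-char-p #x200D)  ; ZWJ, a format control
(my-hard-to-see-char-p ?V)      ; ordinary letter
```

A check along these lines could decide which escapes are safe to
replace with literal characters and which should stay as \u escapes.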



