bug-gnu-emacs

bug#15984: 24.3; Problem with combining characters in attachment filename


From: Stefan Monnier
Subject: bug#15984: 24.3; Problem with combining characters in attachment filename
Date: Fri, 29 Nov 2013 10:04:04 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.3.50 (gnu/linux)

> What I think is the right thing, is to allow a sequence of unicode
> values, e.g., "A" + combining character, or "A" + any random sequence of
> combining characters, intern this string, and treat this as a single
> "character".

For the Lisp-level notion of "character", I think this would require too
many deep changes.

> The idea is that this character object should correspond to what the
> user thinks of as a single character. E.g, one glyph per character, and
> treated as a unit by forward-char, and regexp matching with "." and
> character sets.

For forward-char, we do try to fake that behavior (e.g. a `forward-char'
command will skip over the whole A+ring combo) but not faithfully
(e.g. `C-u 2 forward-char' will also just skip that combo, and not the
subsequent char).  It's not perfect, but it seems "close enough" that it
hasn't proved problematic.
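
To make the difference concrete, here is a rough sketch of what "faithful"
movement by user-perceived characters would look like. It is written in
Python rather than Emacs Lisp, it is not how Emacs implements forward-char,
and the names `is_mark' and `forward_clusters' are made up for the
illustration. It also assumes a simplified notion of a combo: one base
character followed by any number of combining marks (Unicode category M).

```python
import unicodedata


def is_mark(ch):
    """True for combining marks (Unicode general categories Mn, Mc, Me)."""
    return unicodedata.category(ch).startswith("M")


def forward_clusters(text, pos, n=1):
    """Return the index reached after moving forward over n combos from pos.

    Hypothetical helper: a combo is one base character plus the
    combining marks that immediately follow it.
    """
    for _ in range(n):
        if pos >= len(text):
            break
        pos += 1  # step over the base character
        while pos < len(text) and is_mark(text[pos]):
            pos += 1  # absorb the trailing combining marks
    return pos


text = "A\u030abc"  # A + COMBINING RING ABOVE, then "b", then "c"
after_one = forward_clusters(text, 0, 1)  # lands just after the A+ring combo
after_two = forward_clusters(text, 0, 2)  # also skips the subsequent "b"
```

With a count of 2 this moves past the combo and the following character,
which is the behavior the count-prefixed command above does not give.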

Adjusting . in regexps would indeed help solve some
unexpected behaviors.  We would probably want to keep the ability to match
a single "code point", so we'd need to introduce a new regexp operator.
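
As an illustration of the two matching behaviors, here is a small Python
sketch (Python only because its `re' module also matches `.' against a
single code point; this says nothing about how Emacs regexps would be
changed). The function `match_cluster' is a hypothetical stand-in for the
new operator, and it assumes the simplified rule "base character plus
following combining marks".

```python
import re
import unicodedata


def match_cluster(text, pos=0):
    """Hypothetical cluster-aware counterpart of '.'.

    Returns the base character at pos together with any combining
    marks (Unicode category M) that follow it, or None at end of text.
    """
    if pos >= len(text):
        return None
    end = pos + 1
    while end < len(text) and unicodedata.category(text[end]).startswith("M"):
        end += 1
    return text[pos:end]


text = "A\u030ab"  # A + COMBINING RING ABOVE, then "b"

# The ordinary "." stops after one code point, splitting the combo.
code_point = re.match(".", text).group()

# The cluster-aware version takes the whole A+ring combo.
cluster = match_cluster(text)
```

Here `code_point' is just "A" while `cluster' is the two-code-point
A+ring sequence, so both operators remain useful.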

Maybe we could follow the lead of the POSIX collation thingy, IIRC,
where [ß] in case-folding mode wants to be able to match SS in
a German locale.  So maybe [[:any:]] could match A+ring.
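
For what it's worth, the "one character matches two" effect can be seen
with Unicode full case folding, sketched here in Python. This is only an
analogy: it uses Python's `str.casefold', not POSIX collation elements or
locale data, and the variable names are illustrative.

```python
sharp_s = "\u00df"  # LATIN SMALL LETTER SHARP S

# Under full case folding, the single character and the two-letter
# sequence fold to the same string, so a case-insensitive comparison
# treats them as equal.
folds_to_same = sharp_s.casefold() == "SS".casefold()

# The lengths still differ: one code point on one side, two on the other.
lengths = (len(sharp_s), len("SS"))
```

So a matcher working in that mode has to accept that one bracket item may
consume more than one code point of the text, which is the same property
a combo-matching item would need.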

> E.g, there could be a mode which makes each and every unicode value a
> single character, which will then be displayed as separate glyphs,
> separate characters for regexp matching, etc.

I think we wouldn't want to use different modes (too coarse) but
different commands instead.

In any case, a first step would be to find a name for that notion of "multi
character character".  "Grapheme cluster" doesn't sound too good if we
want to expose the concept to the end user.


        Stefan




