bug-gnu-emacs

bug#24206: 25.1; Curly quotes generate invalid strings, leading to a segfault


From: Paul Eggert
Subject: bug#24206: 25.1; Curly quotes generate invalid strings, leading to a segfault
Date: Sun, 14 Aug 2016 19:04:42 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.2.0

Eli Zaretskii wrote:
> Its multibyteness is entirely in Emacs's imagination.

Sure, but Emacs should not substitute "\342\200\230" for "`". The point of text-quoting-style is to substitute quotes, not byte string encodings of quotes.
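
To make the distinction concrete, here is a minimal sketch (not taken from the Emacs sources) that compares the octal-escape literal with the quote character itself. The function name bug24206-quote-vs-bytes is hypothetical and exists only for this illustration; the sketch assumes an Emacs new enough to read "\u2018" escapes in string literals.

```elisp
;; Hypothetical helper, for illustration only.
(defun bug24206-quote-vs-bytes ()
  "Compare the UTF-8 bytes of U+2018 with the character itself."
  (let ((bytes "\342\200\230")  ; octal escapes only: read as a unibyte string
        (quote-char "\u2018"))  ; LEFT SINGLE QUOTATION MARK: multibyte string
    (list (multibyte-string-p bytes)
          (length bytes)
          (multibyte-string-p quote-char)
          (length quote-char)
          ;; Encoding the character yields exactly those three bytes ...
          (equal (encode-coding-string quote-char 'utf-8) bytes)
          ;; ... but the byte string is not `equal' to the character string.
          (equal bytes quote-char))))

(bug24206-quote-vs-bytes)  ; expected: (nil 3 t 1 t nil)
```

The three-byte unibyte string and the one-character multibyte string are different Lisp objects, so substituting one where the other is expected changes both the length and the multibyteness of the result.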

>> More generally, Fsubstitute_command_keys is quite confused about unibyte
>> versus multibyte issues. It merges together a number of strings, and
>> assumes that they are all multibyte iff the original string is
>> multibyte, which is obviously not true in general.
> Could you please point out the specific places where this is done?

OK, here's a contrived example. Run this code in emacs-25:

(progn
  (setq km (make-keymap))
  (define-key km "≠" 'global-set-key)
  (substitute-command-keys "\200\\<km>\\[global-set-key]"))

This should return a 2-character string equal to "\200≠". But in Emacs 25 it dumps core, at least on my platform (Fedora 23 x86-64). And in Emacs 24 on my platform it returns a malformed string that prints as "\242\1340" but has length 2. I suppose we could make Emacs 24 dump core too, though I haven't tried hard to do that.

The problem is that the older Emacs code incorrectly assumes that the output of substitution must be properly encoded if the substitution changes something. This assumption can fail if the input is unibyte and contains bytes that are not properly encoded for UTF-8. (There are other ways the assumption can fail.)
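
As a rough illustration of what a well-formed result looks like at the Lisp level, the sketch below joins the same two pieces by hand. It is only an assumption about how one might demonstrate the expected value, not a description of the fix in Fsubstitute_command_keys; the function name bug24206-expected-string is hypothetical.

```elisp
;; Hypothetical helper, for illustration only.
(defun bug24206-expected-string ()
  "Build the 2-character string that the example should return."
  (let* ((raw "\200")    ; unibyte; the lone byte 0x80 is not valid UTF-8
         (key "\u2260")  ; the key "≠" as a multibyte string
         ;; Convert the raw byte to Emacs's internal raw-byte character
         ;; explicitly before joining it with multibyte text.
         (joined (concat (string-to-multibyte raw) key)))
    (list (multibyte-string-p raw)
          (multibyte-string-p joined)
          (length joined)
          ;; The first character still stands for the original byte 0x80.
          (= (multibyte-char-to-unibyte (aref joined 0)) #x80))))

(bug24206-expected-string)  ; expected: (nil t 2 t)
```

Here the raw byte is converted explicitly instead of being reinterpreted as part of a UTF-8 sequence, so the result has the expected length of 2.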




