bug-gnu-emacs

bug#12054: 24.1; regression? font-lock no-break-space with nil nobreak-c


From: Drew Adams
Subject: bug#12054: 24.1; regression? font-lock no-break-space with nil nobreak-char-display
Date: Sat, 3 Nov 2012 09:25:35 -0700

> > With point before the no-break-space, C-u C-x =.  That 
> > shows that the character is indeed a no-break-space,
> > and there is no face on it.
> 
> "[\240]+" doesn't do what you want.  Octal 240 is a unibyte character,
> so that string constant specifies a unibyte string.  When this unibyte
> string is converted to multibyte, the raw byte becomes codepoint
> #x3fffa0.
> 
> You should use either of these instead:
> (font-lock-add-keywords nil '(("[\u00a0]+" (0 'foo t))) 'APPEND)
> (font-lock-add-keywords nil '(("[ ]+" (0 'foo t))) 'APPEND)
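
For illustration, the distinction drawn in the quoted reply can be checked
directly.  This is only a minimal sketch, assuming Emacs 23 or later; the
function name `my-nbsp-string-facts' is hypothetical and does not come from
the thread.

```elisp
;; Minimal sketch, assuming Emacs 23 or later.
;; `my-nbsp-string-facts' is a hypothetical name used only for illustration.
(defun my-nbsp-string-facts ()
  "Return facts about the two spellings of the regexp."
  (let ((octal   "[\240]+")       ; octal escape: read as a unibyte string
        (unicode "[\u00a0]+")     ; \u escape: read as a multibyte string
        (text    (string #xa0)))  ; multibyte string holding U+00A0
    (list
     ;; Is each string constant multibyte?
     (multibyte-string-p octal)
     (multibyte-string-p unicode)
     ;; Character code of the raw byte after conversion to multibyte.
     (string-to-char (string-to-multibyte "\240"))
     ;; Position where the \u regexp matches TEXT, or nil.
     (string-match unicode text))))
```

The third element is the raw-byte character that the unibyte constant turns
into, which is not the same character as U+00A0.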

I still have some questions.

`C-q 240' and `C-x 8 RET no-break space' insert the same char.
C-u C-x = says this about it: (codepoint 160, #o240, #xa0)
And with your font-lock sexp that char is indeed highlighted
as expected (yellow bg).  Emacs says the char is octal 240.

Just why is it that the regexp "[\240]+" does not match this char?  Why should a
character-alternative expression care whether the representation is unibyte or
multibyte?  Isn't that a bug?

How can octal syntax be used to match that char?  The Elisp manual says clearly that
"The most general read syntax for a character represents the character code in
either octal or hex."  MOST GENERAL, not most limited and partial.

Are you saying that for regexps octal and hex are no longer "the most general
syntax", and that to represent (at least some) unicode chars in a regexp we must
use the \u... syntax?  Is there no way for the `font-lock-add-keywords' sexp to
use either octal or hex here?

With the current state of affairs, which you say is not a bug, how can an Emacs
version < 23 (i.e., without \u... syntax) be used to highlight the char?
Shouldn't it be possible in Emacs 22 to pick up a file that has Unicode chars
and highlight them using font-lock, even if you cannot use Emacs 22 to insert
such chars?

And for Emacs 20 there is not even hex syntax - shouldn't we be able to do
everything using just octal syntax, since it is supposedly "the most general
syntax"?

I haven't seen your doc clarification yet, but given the questions above I would
imagine that things need to be clarified in several places of the manual.

But isn't treating this as a doc bug a bit of a cop-out?  Shouldn't it be
possible to use octal syntax to match Unicode chars?
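
One possible way to keep an octal spelling, sketched here under the
assumption of Emacs 23 or later, is to build the regexp from the character
code written as an octal integer instead of using an octal escape inside the
string constant.  The helper name `my-nbsp-regexp' is hypothetical; the face
`foo' is the one from the quoted sexp.  This does not address Emacs 22 or
earlier.

```elisp
;; Hypothetical helper, not from the thread: build the regexp from the
;; character code, written as an octal integer (#o240 = 160 = #xa0).
(defun my-nbsp-regexp ()
  "Return a multibyte regexp matching one or more no-break spaces."
  ;; `string' turns the character code into a multibyte string, so the
  ;; concatenated regexp contains U+00A0 and not a raw byte.
  (concat "[" (string #o240) "]+"))

;; Possible use with the sexp from the quoted reply:
;; (font-lock-add-keywords nil `((,(my-nbsp-regexp) (0 'foo t))) 'APPEND)
```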





