
bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space U-200B


From: martin rudalics
Subject: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space U-200B
Date: Sun, 03 Feb 2013 19:57:31 +0100

To recap the initial problem and your proposal:

>> With emacs -Q evaluate
>>
>> (with-current-buffer (get-buffer-create "*foo*")
>>    (dotimes (i 1000)
>>      (insert "1234​")) ; U-200B
>>    (setq word-wrap t)
>>    (display-buffer "*foo*"))
>>
>> where the character after 1234 is a zero-width space character with
>> unicode code point U-200B.  As can be seen in the window showing *foo*,
>> lines are not regularly wrapped at that character.
>
> You mean, not wrapped at all.  Witness the continuation bitmaps in the
> fringes, which shouldn't appear when a line is wrapped.
>
>> Doing
>>
>> (with-current-buffer (get-buffer-create "*foo*")
>>    (dotimes (i 1000)
>>      (insert "1234 "))
>>    (setq word-wrap t)
>>    (display-buffer "*foo*"))
>>
>> instead wraps lines as expected.
>
> If anything, this is a missing feature, since word-wrap is explicitly
> coded to break lines only on SPC and TAB characters.  See the
> IT_DISPLAYING_WHITESPACE macro in xdisp.c.
>
> If we want to add more characters to the set, we should probably
> arrange a special char-table for this, and have it exposed to Lisp, so
> it could be customized.  Patches are welcome.

I have now rewritten IT_DISPLAYING_WHITESPACE as

#define IT_DISPLAYING_WHITESPACE(it)                                    \
  ((it->what == IT_CHARACTER                                            \
    && !NILP (CHAR_TABLE_REF (Vword_wrap_chars, it->c)))                \
   || ((STRINGP (it->string)                                            \
        && !NILP (CHAR_TABLE_REF                                        \
                  (Vword_wrap_chars,                                    \
                   SREF (it->string, IT_STRING_BYTEPOS (*it)))))        \
       || (it->s && !NILP (CHAR_TABLE_REF                               \
                           (Vword_wrap_chars,                           \
                            it->s[IT_BYTEPOS (*it)])))                  \
       || (IT_BYTEPOS (*it) < ZV_BYTE                                   \
           && !NILP (CHAR_TABLE_REF                                     \
                     (Vword_wrap_chars,                                 \
                      (*BYTE_POS_ADDR (IT_BYTEPOS (*it))))))))


and defined a char-table called `word-wrap-chars' such that
(aref word-wrap-chars ?\u200B) returns t.  Even so, lines are still
not wrapped at a U-200B character.  Is there some additional wrinkle,
such as a hardcoded space/tab check elsewhere in the word-wrap code,
that I have to observe?  Or is my code wrong?
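
One thing I have not ruled out: SREF, it->s[...] and *BYTE_POS_ADDR
each fetch a single byte, whereas U-200B takes three bytes (E2 80 8B)
in a UTF-8 style multibyte encoding.  The standalone sketch below is
not Emacs code; wrap_char_p, decode_utf8 and the two lookup helpers are
hypothetical names standing in for the char-table lookup.  It only
shows how a byte-wise lookup and a character-wise lookup can disagree
at the same position.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the word-wrap char-table:
   true for SPC, TAB and U+200B (zero-width space).  */
static bool
wrap_char_p (unsigned int c)
{
  return c == ' ' || c == '\t' || c == 0x200B;
}

/* Decode one UTF-8 sequence of 1 to 3 bytes starting at P.
   Longer sequences and malformed input are not handled; that is
   enough for this sketch.  */
static unsigned int
decode_utf8 (const unsigned char *p)
{
  if (p[0] < 0x80)
    return p[0];
  if ((p[0] & 0xE0) == 0xC0)
    return ((p[0] & 0x1Fu) << 6) | (p[1] & 0x3Fu);
  return ((p[0] & 0x0Fu) << 12) | ((p[1] & 0x3Fu) << 6) | (p[2] & 0x3Fu);
}

/* Byte-wise lookup: uses only the byte at BYTEPOS as the key.  */
static bool
byte_lookup_wraps (const unsigned char *text, size_t bytepos)
{
  return wrap_char_p (text[bytepos]);
}

/* Character-wise lookup: decodes the whole sequence at BYTEPOS first.  */
static bool
char_lookup_wraps (const unsigned char *text, size_t bytepos)
{
  return wrap_char_p (decode_utf8 (text + bytepos));
}
```

For the bytes of "1234" followed by E2 80 8B, the byte-wise helper at
byte position 4 looks up 0xE2 and finds nothing, while the
character-wise helper decodes 0x200B and matches.  Whether this is
what actually happens in the display code I cannot say yet.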

Thanks, martin





