bug-gnu-emacs

bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space U-200B


From: Eli Zaretskii
Subject: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space U-200B
Date: Tue, 12 Dec 2017 19:13:33 +0200

> From: Adam Tack <adam.tack.513@gmail.com>
> Date: Sat, 9 Dec 2017 03:50:05 +0000
> Cc: 13399@debbugs.gnu.org
> 
> > I think this is okay, but maybe the macro could be converted into an
> > inline function, and then fetching the character from the various
> > objects separated from looking up the char-table for that character?
> 
> I've made the conversion — it's now slightly less messy.  Regarding
> the separation, I think that the most that can be done is to have the
> look-up in a separate function.  Regrettably, trying to first obtain
> the character, for example via a set of if-else clauses, and then
> looking it up, which would be cleaner, can't really work since the
> cases (in particular the first and fourth) are not disjoint.

Hmm... not sure why you arrived at this conclusion.  E.g., what's
wrong with the implementation at the bottom of this message?

> > We could also look at LineBreak.txt in the Unicode database for
> > inspiration and ideas.
> 
> The three main customisation options that I'm considering are:
> 
> i) Unicode whitespace (U+2000 - U+200B),

Yes.

> ii) vim's breakat characters (default " ^I!@*-+;:,./?"), since
> presumably they had given it some thought,

Maybe.  I'm not sure in what modes this would be TRT.

> iii) The characters in LineBreak.txt (parsing the file shouldn't be
> hard, if there aren't copyright issues).

We already import several UCD files, see admin/unidata, where you will
also find copyright.html from the Unicode Consortium.
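
As a rough illustration of why parsing LineBreak.txt "shouldn't be hard":
each data line is a code point or a `first..last` range, a semicolon, and a
line-break class, followed by a `#` comment. The sketch below is a
standalone C model, not how admin/unidata processes UCD files; the names
`lb_entry` and `parse_linebreak_line` are hypothetical, and it assumes the
class name fits in 7 characters.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one data line of LineBreak.txt.  */
struct lb_entry
{
  unsigned first;   /* first code point of the range */
  unsigned last;    /* last code point (== first for a single code point) */
  char klass[8];    /* line-break class, e.g. "BA" or "ZW" */
};

/* Parse one line.  Return false for comments, blank lines, and
   anything else that is not a data line.  Whitespace around the
   semicolon is optional.  */
static bool
parse_linebreak_line (const char *line, struct lb_entry *out)
{
  unsigned first, last;
  char klass[8];

  /* Try the range form first, then fall back to a single code point.  */
  if (sscanf (line, "%x..%x ; %7[A-Za-z0-9]", &first, &last, klass) != 3)
    {
      if (sscanf (line, "%x ; %7[A-Za-z0-9]", &first, klass) != 2)
        return false;
      last = first;
    }

  out->first = first;
  out->last = last;
  strcpy (out->klass, klass);
  return true;
}
```

A caller could feed each line of the file to this function and mark the
resulting ranges in a table for whichever classes are chosen as break
opportunities; which classes those should be is left open here.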

> > And also a couple of tests (the ones you used would be a good start).
> 
> These would presumably have to be in tests/manual since the position of
> the word-wrap depends on too many variables (width of window, font
> type, font size)?

test/manual is okay.

> diff --git a/lisp/word-wrap.el b/lisp/word-wrap.el
> new file mode 100644
> index 0000000..6d59a83
> --- /dev/null
> +++ b/lisp/word-wrap.el
> @@ -0,0 +1,21 @@
> +(define-minor-mode word-wrap-char-table-mode
> +  "Toggle wrapping using a look-up to word-wrap-chars, globally.
> +
> +Currently, this allows word wrapping on the characters U+2000 to
> +U+200B in addition to the default of space and tab, when
> +`word-wrap' is set to t.
> +
> +(Provisional and unstable.)
> +"
> +  :global t
> +  :lighter "uws "
> +  (if word-wrap-char-table-mode
> +      (progn (setq word-wrap-chars (make-char-table nil nil))
> +             (set-char-table-range word-wrap-chars 9 t)
> +             (set-char-table-range word-wrap-chars 32 t)
> +             (set-char-table-range word-wrap-chars
> +                                   '(8192 . 8203) t))
> +    (setq word-wrap-chars nil)))

This should probably go into simple.el.

Thanks.

Here's the implementation of IT_DISPLAYING_WHITESPACE I had in mind:

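static inline bool
IT_DISPLAYING_WHITESPACE (struct it *it)
{
  bool char_table_p = CHAR_TABLE_P (Vword_wrap_chars);
  int c = -1;	/* Stays negative if no character can be fetched.  */

  /* Step 1: fetch the character at the iterator's position.  */
  if (it->what == IT_CHARACTER)
    c = it->c;
  else if (!char_table_p)
    {
      /* Only ASCII space and TAB matter, so a raw byte is enough.  */
      if (STRINGP (it->string))
        c = SREF (it->string, IT_STRING_BYTEPOS (*it));
      else if (it->s)
        c = it->s[IT_BYTEPOS (*it)];
      else if (IT_BYTEPOS (*it) < ZV_BYTE)
        c = *BYTE_POS_ADDR (IT_BYTEPOS (*it));
    }
  else
    {
      /* Decode a full character for the char-table look-up.  */
      if (STRINGP (it->string))
        c = STRING_CHAR (SDATA (it->string) + IT_STRING_BYTEPOS (*it));
      else if (it->s)
        c = STRING_CHAR (it->s + IT_BYTEPOS (*it));
      else if (IT_BYTEPOS (*it) < ZV_BYTE)
        c = FETCH_CHAR_AS_MULTIBYTE (IT_BYTEPOS (*it));
    }

  /* Nothing to examine, e.g. at or beyond ZV_BYTE.  */
  if (c < 0)
    return false;

  /* Step 2: decide whether that character allows a line break.  */
  return
    char_table_p
    ? !NILP (CHAR_TABLE_REF (Vword_wrap_chars, c))
    : (c == ' ' || c == '\t');
}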
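
For readers without an Emacs tree at hand, the same two-step shape (fetch
the character first, look it up second) can be modelled in standalone C.
This is only a sketch under stated assumptions: a plain UTF-8 byte buffer
stands in for `struct it`, the hypothetical `wrap_table_lookup` hard-codes
the entries that the patch's minor mode puts into word-wrap-chars (TAB,
space, U+2000 to U+200B), and the decoder assumes well-formed input.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the word-wrap-chars char-table.  */
static bool
wrap_table_lookup (int c)
{
  return c == 0x09 || c == 0x20 || (c >= 0x2000 && c <= 0x200B);
}

/* Step 1: fetch the character at byte offset POS.  With MULTIBYTE
   false, return the raw byte; otherwise decode one UTF-8 sequence
   (assumed valid).  Return -1 if nothing can be fetched.  */
static int
fetch_char (const unsigned char *s, size_t len, size_t pos, bool multibyte)
{
  if (pos >= len)
    return -1;

  unsigned char b = s[pos];
  if (!multibyte || b < 0x80)
    return b;

  /* Number of continuation bytes implied by the lead byte.  */
  int extra = b >= 0xF0 ? 3 : b >= 0xE0 ? 2 : 1;
  if (pos + extra >= len)
    return -1;

  int c = b & (0x3F >> extra);
  for (int i = 1; i <= extra; i++)
    c = (c << 6) | (s[pos + i] & 0x3F);
  return c;
}

/* Step 2: decide whether the fetched character allows a break.  */
static bool
displaying_whitespace (const unsigned char *s, size_t len, size_t pos,
                       bool use_table)
{
  int c = fetch_char (s, len, pos, use_table);
  if (c < 0)
    return false;
  return use_table ? wrap_table_lookup (c) : (c == ' ' || c == '\t');
}
```

With the table enabled, the three bytes of U+200B decode to a single
character that the look-up accepts; with it disabled, only the raw byte is
compared against space and TAB, so the same position is not a break
opportunity.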




