bug-gnu-emacs

bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space U-200B


From: Eli Zaretskii
Subject: bug#13399: 24.3.50; Word-wrap can't wrap at zero-width space U-200B
Date: Fri, 08 Dec 2017 17:38:29 +0200

> From: Adam Tack <adam.tack.513@gmail.com>
> Date: Fri, 8 Dec 2017 01:02:08 +0000
> 
> I have a patch for the original issue of word-wrap not wrapping at a
> zero-width space.  The implementation uses a character table, and is
> closely based on that written by Martin Rudalics
> (https://debbugs.gnu.org/cgi/bugreport.cgi?bug=13399#113), with Eli
> Zaretskii's suggestions regarding Unicode.

Thanks for working on this!

> However, this is my first foray into modifying a serious C codebase,
> so I am not sure if I have done the right thing.  In particular, I
> have serious doubts about the second and third cases from
> IT_DISPLAYING_WHITESPACE, especially since I don't really know when
> they would be applicable.
> 
>    || ((STRINGP (it->string)                        \
>     && !NILP (CHAR_TABLE_REF                    \
>           (Vword_wrap_chars, STRING_CHAR            \
>            (SDATA (it->string) + IT_STRING_BYTEPOS (*it)))))    \
>        || (it->s && !NILP (CHAR_TABLE_REF                \
>                (Vword_wrap_chars,                \
>                 STRING_CHAR(it->s + IT_BYTEPOS (*it)))))    \

I think this is okay, but maybe the macro could be converted into an
inline function, and then fetching the character from the various
objects separated from looking up the char-table for that character?
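
To illustrate the split suggested above, here is a minimal, self-contained
sketch.  It is not Emacs code: `struct mock_it`, `decode_utf8`,
`mock_it_fetch_char`, `mock_wrap_char_p` and `mock_it_displaying_wrap_char`
are all hypothetical names, the iterator is reduced to two text sources, and
a fixed predicate stands in for the char-table lookup.  The contents of that
predicate are arbitrary, chosen only so the example has something to test.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the display iterator.  At most one of
   STRING or S is non-NULL, loosely mirroring it->string and it->s.  */
struct mock_it
{
  const unsigned char *string;  /* text from a (mock) Lisp string */
  const unsigned char *s;       /* text from a C string */
  size_t bytepos;               /* byte offset of the current character */
};

/* Decode one UTF-8 sequence starting at P.  Assumes valid input.  */
static uint32_t
decode_utf8 (const unsigned char *p)
{
  if (p[0] < 0x80)
    return p[0];
  if ((p[0] & 0xE0) == 0xC0)
    return ((uint32_t) (p[0] & 0x1F) << 6) | (p[1] & 0x3F);
  if ((p[0] & 0xF0) == 0xE0)
    return ((uint32_t) (p[0] & 0x0F) << 12)
           | ((uint32_t) (p[1] & 0x3F) << 6) | (p[2] & 0x3F);
  return ((uint32_t) (p[0] & 0x07) << 18)
         | ((uint32_t) (p[1] & 0x3F) << 12)
         | ((uint32_t) (p[2] & 0x3F) << 6) | (p[3] & 0x3F);
}

/* Step 1: fetch the current character from whichever object the
   iterator is walking.  Return false if there is no text source.  */
static bool
mock_it_fetch_char (const struct mock_it *it, uint32_t *c)
{
  const unsigned char *base = it->string ? it->string : it->s;
  if (!base)
    return false;
  *c = decode_utf8 (base + it->bytepos);
  return true;
}

/* Step 2: the lookup, kept separate from fetching.  A real
   implementation would consult a char-table here; this fixed
   predicate is only a placeholder.  */
static bool
mock_wrap_char_p (uint32_t c)
{
  return c == ' ' || c == '\t' || c == 0x200B;
}

/* The inline function that would replace the macro: fetch, then look up.  */
static inline bool
mock_it_displaying_wrap_char (const struct mock_it *it)
{
  uint32_t c;
  return mock_it_fetch_char (it, &c) && mock_wrap_char_p (c);
}
```

With this shape, each source of text adds one branch to the fetch step,
and the lookup is written only once.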

> Additionally, I'm not certain whether syms_of_character in character.c
> is the right location for the definition of the char-table and whether
> the range of characters U+2000 to U+200B should be in the chartable,
> or if it should just be space and tab, by default.

Well, since it's a char-table, users will probably want to control
which characters cause word-wrap.  One idea would be a minor mode or
similar that lets users include or exclude whole groups of related
whitespace characters.  This could be done in follow-up patches,
though.

We could also look at LineBreak.txt in the Unicode database for
inspiration and ideas.

But I do think that the default should be only TAB and SPC, as Emacs
always did, and the rest should be optional, and probably in Lisp, not
C.
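
The intended behavior (TAB and SPC by default, other groups opt-in) can be
modeled with a small sketch.  This is only an illustration of the policy,
written in C to match the quoted patch; as said above, the real
customization would more likely be done in Lisp on the char-table.  All
names here (`struct wrap_set`, `wrap_set_add`, `wrap_set_init_default`,
`wrap_set_enable_unicode_spaces`, `wrap_set_contains`) are hypothetical,
and the range U+2000..U+200B is simply the one mentioned in the quoted
message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* An inclusive range of code points that allow a line break.  */
struct wrap_range
{
  uint32_t first, last;
};

#define WRAP_MAX_RANGES 16

/* Hypothetical model of the set of wrap characters.  */
struct wrap_set
{
  struct wrap_range ranges[WRAP_MAX_RANGES];
  size_t count;
};

/* Add FIRST..LAST to the set.  Return false if the set is full or
   the range is empty.  */
static bool
wrap_set_add (struct wrap_set *s, uint32_t first, uint32_t last)
{
  if (s->count == WRAP_MAX_RANGES || first > last)
    return false;
  s->ranges[s->count++] = (struct wrap_range) { first, last };
  return true;
}

/* Default: only TAB and SPC, matching the traditional behavior.  */
static void
wrap_set_init_default (struct wrap_set *s)
{
  s->count = 0;
  wrap_set_add (s, '\t', '\t');
  wrap_set_add (s, ' ', ' ');
}

/* Opt-in group: U+2000..U+200B, which includes ZERO WIDTH SPACE.  */
static bool
wrap_set_enable_unicode_spaces (struct wrap_set *s)
{
  return wrap_set_add (s, 0x2000, 0x200B);
}

/* Return true if C is in any enabled range.  */
static bool
wrap_set_contains (const struct wrap_set *s, uint32_t c)
{
  for (size_t i = 0; i < s->count; i++)
    if (s->ranges[i].first <= c && c <= s->ranges[i].last)
      return true;
  return false;
}
```

Enabling a group as a whole, rather than character by character, is the
part that corresponds to the minor-mode idea.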

> I am aware that if this were to be accepted, I would also need to make
> a change to etc/NEWS, probably the docstring of `word-wrap' and
> somewhere in the Texinfo manual.

And also a couple of tests (the ones you used would be a good start).

> I have not yet filled out a copyright assignment form, though I will
> do so if this patch (modulo changes) is considered acceptable.

I will send the forms off-list, thanks.




