bug-gnu-emacs

bug#27526: 25.1; Nonconformance to Unicode bidirectionality algorithm du


From: Itai Berli
Subject: bug#27526: 25.1; Nonconformance to Unicode bidirectionality algorithm due to paragraph separator
Date: Thu, 29 Jun 2017 21:36:39 +0300

> The UBA allows applications to employ "higher-level protocols" when
> deciding on base paragraph direction.  See section 4.3 in UAX#9 and 
> specifically clause HL1 there.

> This is what Emacs does: it applies its own heuristics for this
> decision.  The reason for that is that Emacs's implementation of the
> UBA must work reasonably well in plain-text buffers, where typically
> long paragraphs are broken into lines by newline characters (which are
> paragraph separators according to the UBA), and many times the
> partition into lines is done by auto-fill or similar features, thus
> making the first character of the next line fairly arbitrary.  Using
> the UBA paragraph-direction determination would then produce
> unacceptable results, whereby the direction of a part of a paragraph
> could change in unpredictable ways when text is refilled.
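
To make the quoted concern concrete, here is a minimal sketch of the
first-strong-character rule (UAX#9 rules P2 and P3), simplified to
ignore directional isolates. The function names and the sample text
are illustrative assumptions and are not taken from Emacs; the sketch
only contrasts treating every newline as a paragraph separator with
treating only blank lines as separators.

```python
import unicodedata


def base_direction(text):
    """Base direction from the first strong character (simplified P2/P3)."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):
            return "RTL"
        if bidi_class == "L":
            return "LTR"
    return "LTR"  # no strong character found: fall back to LTR


def directions_per_newline(text):
    """Every newline ends a paragraph."""
    return [base_direction(line) for line in text.split("\n") if line.strip()]


def directions_per_blank_line(text):
    """Only a blank line ends a paragraph (assumed heuristic for comparison)."""
    return [base_direction(par) for par in text.split("\n\n") if par.strip()]


# Hypothetical sample: a Hebrew line followed by a line that happens to
# start with a Latin word, as could occur after refilling.
sample = "\u05e9\u05dc\u05d5\u05dd Emacs\nAUCTeX \u05e2\u05d5\u05dc\u05dd"

print(directions_per_newline(sample))
print(directions_per_blank_line(sample))
```

With per-newline separation the second line gets its own direction
from whichever strong character happens to come first; with blank-line
separation the whole block takes the direction of its first line.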

As I understand it, the "higher-level protocols" provision is intended
to allow for such things as table cells, elements of structured markup
languages, and word processors that use an idiosyncratic
implementation of a paragraph separator *under the hood*. It is not
intended for plain running text; for that, the standard explicitly
specifies what the paragraph separators for every operating system
are.

> typically long paragraphs are broken into lines by newline characters

I see no evidence of the validity of this statement on my system (Emacs
25.1.1). But even if this were so, it would still not merit
*hard-coding* the paragraph separator as a blank line, as there are
situations (such as the one I presented in my bug report) that require
a different configuration.

> You can alleviate this to some extent by ...(in your case) starting
> the paragraph with an RLM control character before \noindent,
> optionally followed by an LRM or enclosing \noindent in LRE..PDF (so
> that the backslash displays to the left of "noindent").  This is
> admittedly a bit awkward, but I think the results are still acceptable.
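
As a rough illustration of why the quoted workaround has an effect,
the sketch below checks which strong character comes first in a line,
assuming the line begins with \noindent followed by Hebrew text. The
sample line and the helper name are hypothetical, and this shows only
the first-strong rule of UAX#9, not Emacs's own heuristics.

```python
import unicodedata

RLM = "\u200f"  # RIGHT-TO-LEFT MARK, bidi class R
LRM = "\u200e"  # LEFT-TO-RIGHT MARK, bidi class L


def first_strong(text):
    """Return the bidi class of the first strong character, or None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("L", "R", "AL"):
            return bidi_class
    return None


# Hypothetical line: a TeX control word followed by a Hebrew word.
line = "\\noindent \u05e9\u05dc\u05d5\u05dd"

print(first_strong(line))              # the "n" of \noindent is found first
print(first_strong(RLM + line))        # the RLM is found first
print(first_strong(RLM + LRM + line))  # still the RLM; the LRM comes second
```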

As you mentioned, the solution is cumbersome. It might have been
acceptable if this were the sole issue, but this example illustrates
just one of several problems that arise from the current paragraph
separator convention.

In conclusion, and on a personal note, I implore you to change this
behavior, and to do so as soon as possible, and not only for specialized
markup documents, but for every document.

I am currently working on my thesis. Emacs is useless to me as a text
editor of Hebrew texts without this feature. This is no
exaggeration.

The original reason I chose Emacs over other editors was the
combination of AUCTeX and the promise of full Unicode
compatibility. AUCTeX has delivered on its promise, but in the area of
Unicode, as far as my needs are concerned, it is as if there were no
Unicode support at all, and I will sadly be forced to look for a
different editor.




