[PATCH] org-export: Remove zero-width space escapes during export
From: Ihor Radchenko
Subject: [PATCH] org-export: Remove zero-width space escapes during export
Date: Tue, 26 Jul 2022 20:59:18 +0800
K K <k_foreign@outlook.com> writes:
> My use case is to emphasize Chinese characters without inserting any
> spaces, not even zero-width ones. For example, "中文*测*试" should be enough
> to emphasize "测".
>
> I am using zero-width spaces right now, and they work fine in Org mode
> buffers, but when exported to PDF via LaTeX, the U+200B ZERO WIDTH SPACE
> character is not zero-width in certain fonts. So I would rather not use
> that character.
This is a bug. Escape symbols do not affect export in most common
scenarios, but your report shows yet another case where a zero-width
space actually alters the export result.
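As a stopgap on the user side, an export filter can drop the character
before it reaches LaTeX. The sketch below is not part of any patch: the
function name `my/org-latex-drop-zwsp' is hypothetical, and it assumes
the LaTeX back-end passes U+200B through unchanged in plain text. Note
that it removes every zero-width space, not only the ones used as
escapes.

```elisp
;;; -*- lexical-binding: t; -*-
(require 'ox)
(require 'ox-latex)

;; Hypothetical helper: strip U+200B from transcoded plain text, but
;; only for LaTeX and back-ends derived from it.
(defun my/org-latex-drop-zwsp (text backend _info)
  "Remove every U+200B from TEXT when BACKEND derives from `latex'.
Return nil for other back-ends so that Org keeps TEXT unchanged."
  (when (org-export-derived-backend-p backend 'latex)
    (replace-regexp-in-string "\u200b" "" text)))

;; Register the filter; Org calls it with (TEXT BACKEND INFO).
(add-to-list 'org-export-filter-plain-text-functions
             #'my/org-latex-drop-zwsp)
```

A filter that returns nil is ignored, so other back-ends are unaffected.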
I am attaching a tentative patch that makes Org export remove
zero-width spaces when they sit directly at object boundaries, that is,
at the very beginning or end of a plain-text string in the parse tree.
Any objections?
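To illustrate the idea, here is a minimal sketch of the per-string
operation, assuming the escape character is U+200B written explicitly
as "\u200b". The helper name `my/strip-boundary-zwsp' is hypothetical
and only mirrors what the patch does to each plain-text string; it is
not the patch itself.

```elisp
;;; -*- lexical-binding: t; -*-

;; Hypothetical helper: remove a zero-width space only when it is the
;; first or the last character of STRING.  In a parse tree, plain-text
;; strings are split at objects, so these positions are where an
;; escape character touches a neighbouring object.
(defun my/strip-boundary-zwsp (string)
  "Return STRING without a leading or trailing U+200B."
  (replace-regexp-in-string "\\`\u200b\\|\u200b\\'" "" string))

(my/strip-boundary-zwsp "\u200bfoo\u200b") ; => "foo"
(my/strip-boundary-zwsp "foo\u200bbar")    ; => unchanged
```

The anchors \` and \' match only at the very start and end of the
string, so a zero-width space in the middle of the text is left alone.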
> On Tue, 26 Jul 2022 09:26:42 +0800, Ihor Radchenko wrote:
>> Another idea we have discussed is using something similar to Markdown
>> format: **bold**, //italics//, __underline__, etc. It is less verbose
>> compared to the special blocks, which should be valuable for
>> Japanese/Chinese/other languages with no spaces between words.
>
> By the way, it seems that my use case has already been implemented by
> markdown-mode. In a markdown-mode buffer "中文**测**试" will certainly make "测"
> bold.
The idea was indeed inspired by Markdown.
However, Markdown is different: **bold** is the official syntax for
bold markup there. In reality things are more complex, and Markdown has
its own edge cases; see https://www.markdownguide.org/basic-syntax/
Best,
Ihor
From 5764b41b858bff3d56dcb24741cf550a7e245d36 Mon Sep 17 00:00:00 2001
Message-Id: <5764b41b858bff3d56dcb24741cf550a7e245d36.1658840330.git.yantar92@gmail.com>
From: Ihor Radchenko <yantar92@gmail.com>
Date: Tue, 26 Jul 2022 20:50:47 +0800
Subject: [PATCH] org-export: Remove zero-width space escapes during export
* lisp/ox.el (org-export--remove-escaped): New function removing
zero-width spaces when they separate object boundaries.
(org-export-as): Call `org-export--remove-escaped'.
* testing/lisp/test-ox.el (test-org-export/remove-escaped): New test.
---
lisp/ox.el | 22 ++++++++++++++++++++++
testing/lisp/test-ox.el | 13 +++++++++++++
2 files changed, 35 insertions(+)
diff --git a/lisp/ox.el b/lisp/ox.el
index 40ad7ae4e..de034fd22 100644
--- a/lisp/ox.el
+++ b/lisp/ox.el
@@ -2916,6 +2916,25 @@ (defun org-export--remove-uninterpreted-data (data info)
;; Return modified parse tree.
data)
+(defun org-export--remove-escaped (data info)
+ "Remove escape symbols from plain-text in DATA.
+DATA is a parse tree or a secondary string. INFO is a plist
+containing export options. DATA is modified by side effect and
+returned by the function."
+ (org-element-map data '(plain-text)
+ (lambda (string)
+ (let (processed-string)
+ (setq processed-string
+ (replace-regexp-in-string "\\`\u200b" "" string))
+ (setq processed-string
+ (replace-regexp-in-string "\u200b\\'" "" processed-string))
+ (unless (equal string processed-string)
+ (org-element-insert-before processed-string string)
+ (org-element-extract-element string))))
+ info nil nil t)
+ ;; Return modified parse tree.
+ data)
+
;;;###autoload
(defun org-export-as
(backend &optional subtreep visible-only body-only ext-plist)
@@ -3046,6 +3065,9 @@ (defun org-export-as
;; communication channel.
(org-export--prune-tree tree info)
(org-export--remove-uninterpreted-data tree info)
+ ;; Remove zero-width spaces that escape Org syntax
+ ;; elements.
+ (org-export--remove-escaped tree info)
;; Call parse tree filters.
(setq tree
(org-export-filter-apply-functions
diff --git a/testing/lisp/test-ox.el b/testing/lisp/test-ox.el
index 7c71b6e24..ea4fce363 100644
--- a/testing/lisp/test-ox.el
+++ b/testing/lisp/test-ox.el
@@ -982,6 +982,19 @@ (ert-deftest test-org-export/uninterpreted ()
(section . (lambda (s c i) c))))
nil nil nil '(:with-sub-superscript {}))))))
+(ert-deftest test-org-export/remove-escaped ()
+ "Test removing escape symbols."
+ ;; Remove zero-width space around markup.
+ (should
+ (equal "This*is*test.\n"
+ (org-test-with-temp-text "This\u200b*is*\u200btest.\n"
+ (org-export-as (org-test-default-backend)))))
+ ;; Do not remove zero-width space in other places.
+ (should
+ (equal "Thisistest.\n"
+ (org-test-with-temp-text "Thisistest.\n"
+ (org-export-as (org-test-default-backend))))))
+
(ert-deftest test-org-export/export-scope ()
"Test all export scopes."
;; Subtree.
--
2.35.1