Re: Form feed characters break odt export
From: Ihor Radchenko
Subject: Re: Form feed characters break odt export
Date: Fri, 27 Dec 2024 10:21:02 +0000
Joseph Turner <joseph@breatheoutbreathe.in> writes:
> Thanks, Ihor! Tested working on my machine.
>
> Here's another potential solution to consider, which adds a defcustom to
> let the user decide how to handle forbidden characters:
>
> https://github.com/kjambunathan/org-mode-ox-odt/commit/07fde1e9b7cdda3e3ef8136f5b1d478499dfd780
Good idea!
I went even further and used a proper export setting.
See the attached 2nd version of the fix.
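As a rough illustration of what "a proper export setting" allows, here is a minimal sketch of how the option might be set once the patch below is applied. The variable `org-odt-with-forbidden-chars` and the property `:odt-with-forbidden-chars` are taken from the patch; the chosen values are arbitrary examples, and the per-call override assumes the usual behaviour of the EXT-PLIST argument.

```elisp
;; Sketch only: assumes the v2 patch below is applied.

;; Global default: replace each forbidden character with a space.
(setq org-odt-with-forbidden-chars " ")

;; One-off export of the current buffer in "strict" mode.  The fourth
;; argument of `org-odt-export-to-odt' is EXT-PLIST; a nil value for
;; the property should make the filter signal an error on the first
;; forbidden character instead of replacing it.
(org-odt-export-to-odt nil nil nil '(:odt-with-forbidden-chars nil))
```

The patch defines no in-buffer keyword or `#+OPTIONS` item for the property, so the variable and an external plist (for example, a publishing project property) appear to be the only ways to set it.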
From de015e4a3b98bc975c2dcd1dfce7adcf77eb537c Mon Sep 17 00:00:00 2001
Message-ID: <de015e4a3b98bc975c2dcd1dfce7adcf77eb537c.1735294805.git.yantar92@posteo.net>
From: Ihor Radchenko <yantar92@posteo.net>
Date: Tue, 24 Dec 2024 15:11:22 +0100
Subject: [PATCH v2] ox-odt: Avoid putting forbidden characters into ODT xml
* lisp/ox-odt.el (org-odt-with-forbidden-chars): New export option to
control how to handle forbidden XML characters.
(org-odt--remove-forbidden): New filter removing/replacing forbidden
characters.
Reported-by: Joseph Turner <joseph@breatheoutbreathe.in>
Link: https://orgmode.org/list/87o711l4u4.fsf@christianmoe.com
---
lisp/ox-odt.el | 43 ++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 42 insertions(+), 1 deletion(-)
diff --git a/lisp/ox-odt.el b/lisp/ox-odt.el
index ec81637ef0..635bf38971 100644
--- a/lisp/ox-odt.el
+++ b/lisp/ox-odt.el
@@ -94,7 +94,8 @@ (org-export-define-backend 'odt
. (org-odt--translate-latex-fragments
org-odt--translate-description-lists
org-odt--translate-list-tables
- org-odt--translate-image-links)))
+ org-odt--translate-image-links))
+ (:filter-final-output . org-odt--remove-forbidden))
:menu-entry
'(?o "Export to ODT"
((?o "As ODT file" org-odt-export-to-odt)
@@ -108,6 +109,7 @@ (org-export-define-backend 'odt
(:keywords "KEYWORDS" nil nil space)
(:subtitle "SUBTITLE" nil nil parse)
;; Other variables.
+ (:odt-with-forbidden-chars nil nil org-odt-with-forbidden-chars)
(:odt-content-template-file nil nil org-odt-content-template-file)
(:odt-display-outline-level nil nil org-odt-display-outline-level)
(:odt-fontify-srcblocks nil nil org-odt-fontify-srcblocks)
@@ -170,6 +172,14 @@ (defconst org-odt-special-string-regexps
("\\.\\.\\." . "…")) ; hellip
"Regular expressions for special string conversion.")
+(defconst org-odt-forbidden-char-re
+  (rx (not (in ?\N{U+9} ?\N{U+A} ?\N{U+D}
+               (?\N{U+20} . ?\N{U+D7FF})
+               (?\N{U+E000} . ?\N{U+FFFD})
+               (?\N{U+10000} . ?\N{U+10FFFF}))))
+  "Regexp matching forbidden XML 1.0 characters.
+https://www.w3.org/TR/REC-xml/#charsets")
+
(defconst org-odt-schema-dir-list
(list (expand-file-name "./schema/" org-odt-data-dir))
"List of directories to search for OpenDocument schema files.
@@ -364,6 +374,19 @@ (defgroup org-export-odt nil
:tag "Org Export ODT"
:group 'org-export)
+(defcustom org-odt-with-forbidden-chars ""
+  "String to replace forbidden XML characters.
+When set to t, forbidden characters are retained.
+When set to nil, an error is thrown.
+See `org-odt-forbidden-char-re' for the list of forbidden characters
+that cannot occur inside ODT documents.
+
+You may also consider export filters to perform more fine-grained
+replacements. See info node `(org)Advanced Export Configuration'."
+  :package-version '(Org . "9.8")
+  :type '(choice (const :tag "Keep forbidden characters" t)
+                 (const :tag "Err when forbidden characters encountered" nil)
+                 (string :tag "Replacement string")))
;;;; Debugging
@@ -2892,6 +2915,24 @@ (defun org-odt--encode-tabs-and-spaces (line)
(format " <text:s text:c=\"%d\"/>" (1- (length s)))))
line))
+(defun org-odt--remove-forbidden (text _backend info)
+  "Remove forbidden and discouraged characters from TEXT.
+INFO is the communication plist."
+  (pcase (plist-get info :odt-with-forbidden-chars)
+    ((and (pred stringp) rep)
+     (when (string-match org-odt-forbidden-char-re text)
+       (display-warning
+        '(ox-odt ox-odt-with-forbidden-chars)
+        (format "Replacing forbidden character '%s' with '%s'"
+                (match-string 0 text) rep)))
+     (replace-regexp-in-string org-odt-forbidden-char-re rep text))
+    (`nil
+     (if (string-match org-odt-forbidden-char-re text)
+         (error "Forbidden character '%s' found. See `org-odt-with-forbidden-chars'"
+                (match-string 0 text))
+       text))
+    (_ text)))
+
(defun org-odt--encode-plain-text (text &optional no-whitespace-filling)
(dolist (pair '(("&" . "&amp;") ("<" . "&lt;") (">" . "&gt;")))
(setq text (replace-regexp-in-string (car pair) (cdr pair) text t t)))
--
2.47.1
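
To make the three modes of the setting concrete, the following is a standalone sketch that can be evaluated without Org loaded. It reuses the character class from the patch; the `my/` names are hypothetical and not part of Org. Unlike the patch, it passes the replacement literally, so backslashes in the replacement string are not interpreted.

```elisp
;; Standalone sketch; the `my/' names are hypothetical, not part of Org.
(require 'rx)

(defconst my/forbidden-char-re
  (rx (not (in ?\N{U+9} ?\N{U+A} ?\N{U+D}
               (?\N{U+20} . ?\N{U+D7FF})
               (?\N{U+E000} . ?\N{U+FFFD})
               (?\N{U+10000} . ?\N{U+10FFFF}))))
  "Same character class as `org-odt-forbidden-char-re' in the patch.")

(defun my/handle-forbidden (text setting)
  "Return TEXT processed according to SETTING.
A string SETTING replaces each forbidden character, nil signals an
error on the first forbidden character, and any other value returns
TEXT unchanged."
  (cond
   ((stringp setting)
    ;; FIXEDCASE and LITERAL are both t: insert SETTING verbatim.
    (replace-regexp-in-string my/forbidden-char-re setting text t t))
   ((null setting)
    (if (string-match my/forbidden-char-re text)
        (error "Forbidden character %S found" (match-string 0 text))
      text))
   (t text)))

;; Form feed (U+C) is outside the allowed ranges, so it is replaced.
(my/handle-forbidden "page one\fpage two" "")   ; => "page onepage two"
(my/handle-forbidden "page one\fpage two" " ")  ; => "page one page two"
;; Tab and newline are allowed, so strict mode returns the text as-is.
(my/handle-forbidden "tab\tand newline\n" nil)  ; => "tab\tand newline\n"
```

With a non-nil, non-string setting such as t, the text passes through untouched, which matches the "retained" case described in the defcustom docstring.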
--
Ihor Radchenko // yantar92,
Org mode maintainer,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>