emacs-orgmode

From: Kaushal Modi
Subject: [O] Canonical way to strip off all markup from an element in Org exporter backend?
Date: Wed, 20 Dec 2017 18:30:20 +0000

Hello,

What's the canonical way to strip off all markup from an element in an Org exporter backend?

I do it in this roundabout way in ox-hugo. It works but feels convoluted. The trick is to remove all markup characters from an element while retaining the *, /, `, etc. characters that are *not* used for markup.

I export Org subtrees to individual posts, where the subtree headline will become the post title. So I need to sanitize that headline of any markup.

Step 1: Get the HTMLized version of the title.

(org-export-data-with-backend (plist-get info :title) 'html info)

By getting the HTMLized version of the title, it becomes easy to strip off the HTML tags, which are inserted basically for formatting (bold, italics, etc.).

Step 2: Strip off the HTML tags.

(while (string-match "<\\(?1:[a-z]+\\)[^>]*>\\(?2:[^<]+\\)</\\1>" title)
  (setq title (replace-match "\\2" nil nil title)))
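
For illustration, Step 2 can be wrapped in a small standalone helper that takes an already HTMLized string. This is only a sketch: the name my/strip-simple-html-tags is hypothetical (it is not an ox-hugo function), and it passes t as FIXEDCASE to replace-match so the kept text is never re-cased, which differs slightly from the snippet above.

```emacs-lisp
;; Hypothetical helper; the name is an assumption, not part of ox-hugo.
(defun my/strip-simple-html-tags (html)
  "Return HTML with simple paired tags such as <b>..</b> removed.
Each loop iteration unwraps one innermost tag pair, so nested
markup like <i><b>x</b></i> is reduced from the inside out."
  (let ((title html))
    (while (string-match "<\\(?1:[a-z]+\\)[^>]*>\\(?2:[^<]+\\)</\\1>" title)
      ;; Keep only the text between the tags (group 2).
      ;; FIXEDCASE is t so the replacement is inserted verbatim.
      (setq title (replace-match "\\2" t nil title)))
    title))

;; Usage: literal * and / characters survive, only the tags go away.
(my/strip-simple-html-tags "<b>Bold</b> and <code>a * b</code>")
```

Note that this only removes tags. Any HTML entities in the string (for example &amp;) are left as they are and would need separate handling.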

If I use any other exporter, like md, I lose the ability to distinguish a literal * in the title from a * meant for bold/italics markup in Markdown. Even ascii is not good, because then I'd need to do some intensive parsing to figure out whether a ` is a literal ` or part of `code'.

So the question: is this the best way, or is there a canonical way to export an element without any markup characters?

Full actual code[1].

[1]: https://github.com/kaushalmodi/ox-hugo/blob/dffb7e970f33959a0b97fb8df267a54d01a98a2a/ox-hugo.el#L1769-L1802
--

Kaushal Modi

