[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[PATCH 3/3] Add CommonMark reader.
|
From: |
Ludovic Courtès |
|
Subject: |
[PATCH 3/3] Add CommonMark reader. |
|
Date: |
Sat, 17 Feb 2024 11:27:11 +0100 |
* src/guile/skribilo/reader/commonmark.scm,
tests/readers/commonmark.test: New files.
* configure.ac: Check for (commonmark) ans define ‘BUILD_COMMONMARK_READER’.
* src/guile/Makefile.am (readers, EXTRA_DIST): Add ‘commonmark.scm’.
* doc/user/syntax.skb (The CommonMark Syntax): New section.
(The Outline Syntax): Adjust initial sentence.
* .guix/skribilo-package.scm (skribilo)[propagated-inputs]: Add
GUILE-COMMONMARK.
(skribilo/guile-2.0): Remove GUILE-COMMONMARK from ‘propagated-inputs’.
---
.guix/skribilo-package.scm | 16 +-
configure.ac | 10 +
doc/user/syntax.skb | 62 ++++++-
src/guile/Makefile.am | 13 +-
src/guile/skribilo/reader/commonmark.scm | 221 +++++++++++++++++++++++
tests/readers/commonmark.test | 147 +++++++++++++++
6 files changed, 459 insertions(+), 10 deletions(-)
create mode 100644 src/guile/skribilo/reader/commonmark.scm
create mode 100644 tests/readers/commonmark.test
diff --git a/.guix/skribilo-package.scm b/.guix/skribilo-package.scm
index 12bb2d1..c7d47ab 100644
--- a/.guix/skribilo-package.scm
+++ b/.guix/skribilo-package.scm
@@ -1,6 +1,6 @@
;;; guix.scm -- Build recipe for GNU Guix.
;;;
-;;; Copyright © 2020, 2023 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2020, 2023-2024 Ludovic Courtès <ludo@gnu.org>
;;; Copyright © 2023 Arun Isaac <arunisaac@systemreboot.net>
;;;
;;; This file is part of Skribilo.
@@ -24,7 +24,7 @@
#:use-module (gnu packages ghostscript)
#:use-module (gnu packages guile)
#:use-module ((gnu packages guile-xyz)
- #:select (guile-lib guile-reader))
+ #:select (guile-commonmark guile-lib guile-reader))
#:use-module (gnu packages imagemagick)
#:use-module (gnu packages lout)
#:use-module (gnu packages pkg-config)
@@ -53,7 +53,9 @@
;; The 'skribilo' command needs them, and for people using Skribilo as a
;; library, these inputs are needed as well.
- (propagated-inputs (list guile-reader guile-lib))))
+ (propagated-inputs (list guile-reader
+ guile-lib
+ guile-commonmark))))
(define with-guile-2.0
(package-input-rewriting/spec `(("guile" . ,(const guile-2.0)))
@@ -76,7 +78,13 @@
(define-public skribilo/guile-2.0
(package
- (inherit (with-guile-2.0 skribilo))
+ (inherit (with-guile-2.0
+ (package
+ (inherit skribilo)
+ (propagated-inputs
+ (modify-inputs (package-propagated-inputs skribilo)
+ ;; XXX: Guile-CommonMark has test failures on Guile 2.0.
+ (delete "guile-commonmark"))))))
(name "guile2.0-skribilo")
(inputs
(modify-inputs (package-inputs skribilo)
diff --git a/configure.ac b/configure.ac
index 5ad964a..af1cb75 100644
--- a/configure.ac
+++ b/configure.ac
@@ -75,6 +75,16 @@ fi
AM_CONDITIONAL([BUILD_GEMTEXT_READER],
[test "x$have_srfi_171" == "xyes"])
+# Check for Guile-CommonMark, needed for the `commonmark' reader.
+GUILE_MODULE_AVAILABLE([have_commonmark], [(commonmark)])
+if test "x$have_commonmark" != "xyes"; then
+ AC_MSG_WARN([Guile-CommonMark needed by the `commonmark' reader is missing.])
+fi
+
+AM_CONDITIONAL([BUILD_COMMONMARK_READER],
+ [test "x$have_commonmark" == "xyes"])
+
+
# Look for `convert', from ImageMagick.
AC_PATH_PROG([CONVERT], [convert])
if test "x$CONVERT" == "x"; then
diff --git a/doc/user/syntax.skb b/doc/user/syntax.skb
index 2de7cbd..ce8ed6d 100644
--- a/doc/user/syntax.skb
+++ b/doc/user/syntax.skb
@@ -1,7 +1,7 @@
;;; syntax.skb -- The Skribilo syntaxes.
-;;; -*- coding: iso-8859-1 -*-
+;;; -*- coding: utf-8 -*-
;;;
-;;; Copyright 2008, 2009 Ludovic Court�s <ludo@gnu.org>
+;;; Copyright 2008, 2009, 2024 Ludovic Courtès <ludo@gnu.org>
;;; Copyright 2001, 2002, 2003, 2004 Manuel Serrano
;;;
;;;
@@ -132,12 +132,66 @@ specifying a width:])
in the output document (such as HTML column ,(code "\"0*\"")
specification). Note that this way of specifying width
is strictly unportable.])))))
-
+
+ (section :title [The CommonMark Syntax]
+ :ident "commonmark-syntax"
+
+ (p [While the Skribe syntax is super expressive, you may instead prefer
+something less expressive but more lightweight. The ,(tt [commonmark]) reader
+addresses this needs: it reads inputs that follows the ,(ref :url
+"https://commonmark.org/" :text [CommonMark syntax])—essentially a subset of
+what you may know as Markdown—and produces a document from there. The simplest
+CommonMark input looks like this:]
+
+ (example :legend [Example input using the CommonMark syntax.]
+ (tt (pre "\
+# Title of the Document
+
+Here is an introduction.
+
+## First Chapter
+
+This is the first chapter, and here’s a hyperlink to the [CommonMark
+spec](https://commonmark.org).
+
+## Second Chapter
+
+Well, you get the idea."))))
+
+ (p [Alternatively, the input can start with metadata headers in the
+style of ,(ref :url "https://dthompson.us/projects/haunt.html" :text [Haunt]),
+which is useful for instance if you’d like to specify the document author(s):]
+
+ (example :legend [CommonMark input with metadata headers.]
+ (tt (pre "\
+title: Title of the Document
+author: Charlie
+---
+
+Here goes the introduction.
+
+# First Chapter
+
+This is the first chapter.
+
+## Sub-section
+
+… and this is a subsection."))))
+
+ (p [Once you have a CommonMark document, pass ,(tt
[--reader=commonmark])
+or ,(tt [-R commonmark]) to ,(tt [skribilo]). For example, here is how you
would render
+a CommonMark document as PDF ,(it [via]) LaTeX:]
+
+ (disp :verb #t [
+skribilo -R commonmark -t latex document.md \\
+ | pdflatex])
+
+ [The resulting file is ,(tt [texput.pdf]) by default.]))
(section :title [The Outline Syntax]
:ident "outline-syntax"
- (p [Alternatively, Skribilo allows documents to be written in a
+ (p [You may also write documents in
plain text format, with almost no markup. Instead, conventions borrowed
from ,(ref :text [Emacs' Outline Mode] :url
"http://www.gnu.org/software/emacs/manual/html_node/emacs/Outline-Mode.html")
diff --git a/src/guile/Makefile.am b/src/guile/Makefile.am
index 09bb7da..80c03be 100644
--- a/src/guile/Makefile.am
+++ b/src/guile/Makefile.am
@@ -1,5 +1,4 @@
-# Copyright 2005, 2006, 2007, 2008, 2009, 2010, 2012,
-# 2015, 2018, 2020 Ludovic Courtès <ludo@gnu.org>
+# Copyright 2005-2010, 2012, 2015, 2018, 2020, 2024 Ludovic Courtès
<ludo@gnu.org>
# Copyright 2022 Arun Isaac <arunisaac@systemreboot.net>
#
# This file is part of Skribilo.
@@ -70,6 +69,16 @@ EXTRA_DIST += skribilo/reader/gemtext.scm
endif !BUILD_GEMTEXT_READER
+if BUILD_COMMONMARK_READER
+
+readers += skribilo/reader/commonmark.scm
+
+else !BUILD_COMMONMARK_READER
+
+EXTRA_DIST += skribilo/reader/commonmark.scm
+
+endif !BUILD_COMMONMARK_READER
+
engines = \
skribilo/engine/base.scm skribilo/engine/context.scm \
skribilo/engine/html.scm skribilo/engine/html4.scm \
diff --git a/src/guile/skribilo/reader/commonmark.scm
b/src/guile/skribilo/reader/commonmark.scm
new file mode 100644
index 0000000..266dfa8
--- /dev/null
+++ b/src/guile/skribilo/reader/commonmark.scm
@@ -0,0 +1,221 @@
+;;; commonmark.scm -- Reader for the CommonMark syntax.
+;;;
+;;; Copyright © 2024 Ludovic Courtès <ludo@gnu.org>
+;;; Copyright © 2015 David Thompson <davet@gnu.org>
+;;;
+;;;
+;;; This file is part of Skribilo.
+;;;
+;;; Skribilo is free software: you can redistribute it and/or modify
+;;; it under the terms of the GNU General Public License as published by
+;;; the Free Software Foundation, either version 3 of the License, or
+;;; (at your option) any later version.
+;;;
+;;; Skribilo is distributed in the hope that it will be useful,
+;;; but WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+;;; GNU General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU General Public License
+;;; along with Skribilo. If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (skribilo reader commonmark)
+ #:autoload (commonmark) (commonmark->sxml)
+ #:autoload (sxml match) (sxml-match)
+ #:use-module (skribilo reader)
+ #:use-module (skribilo utils syntax)
+ #:use-module (srfi srfi-1)
+ #:use-module (srfi srfi-11)
+ #:use-module (srfi srfi-34)
+ #:use-module (srfi srfi-35)
+ #:use-module (ice-9 match)
+ #:use-module (ice-9 rdelim)
+ #:export (reader-specification
+ make-commonmark-reader))
+
+;;; Commentary:
+;;;
+;;; This is a reader for the CommonMark syntax, a subset of what's usually
+;;; referred to as "Markdown".
+;;;
+;;; Code:
+
+(define (sxml->skribilo sxml)
+ (define sectioning
+ `((h1 . chapter)
+ (h2 . section)
+ (h3 . subsection)
+ (h4 . subsubsection)))
+
+ (define (heading? tag)
+ (assq tag sectioning))
+
+ (define (tagged? tag)
+ (match-lambda
+ ((head . _) (eq? head tag))
+ (_ #f)))
+
+ (let loop ((sxml sxml)
+ (result '()))
+ (match sxml
+ (()
+ (reverse result))
+ ((((? heading? tag) . title) . rest)
+ (let ((section (assoc-ref sectioning tag)))
+ (let-values (((body rest)
+ (break (tagged? tag) rest)))
+ (loop rest
+ (cons `(,section #:title (list ,@title)
+ ,@(sxml->skribilo body))
+ result)))))
+ ((('p . body) . rest)
+ (loop rest
+ (cons `(paragraph ,@(sxml->skribilo body))
+ result)))
+ ((('pre ('code ('@ _ ...) . body)) . rest)
+ (loop rest
+ (cons `(pre (prog ,@body)) result)))
+ ((('em . body) . rest)
+ (loop rest
+ (cons `(emph ,@(sxml->skribilo body))
+ result)))
+ ((('ul ('li . items) ...) . rest)
+ (loop rest
+ (cons `(itemize ,@(map (lambda (body)
+ `(item ,@(sxml->skribilo body)))
+ items))
+ result)))
+ ((('ol ('li . items) ...) . rest)
+ (loop rest
+ (cons `(enumerate ,@(map (lambda (body)
+ `(item ,@(sxml->skribilo body)))
+ items))
+ result)))
+ ((('a ('@ ('href url)) . body) . rest)
+ (loop rest
+ (cons `(ref #:url ,url
+ #:text (list ,@(sxml->skribilo body)))
+ result)))
+ ((('code . body) . rest)
+ (loop rest
+ (cons `(tt ,@(sxml->skribilo body))
+ result)))
+ ((('blockquote . body) . rest)
+ (loop rest
+ (cons `(blockquote ,@(sxml->skribilo body))
+ result)))
+ ((('strong . body) . rest)
+ (loop rest
+ (cons `(bold ,@(sxml->skribilo body))
+ result)))
+ ((('img ('@ ('src url) ('alt text) . _) . _) . rest)
+ (loop rest
+ (cons `(image #:url ,url #:text ,text)
+ result)))
+ ((('hr) . rest)
+ (loop rest result))
+ (((? string? str) . rest)
+ (loop rest (cons str result)))
+ ((lst ...)
+ `(list ,@(fold loop result lst))))))
+
+(define (string-split-at str char-pred)
+ "Split STR at the first character that matches CHAR-PRED and return
+a list of one or two strings. Two strings are returned if the string
+was able to be split, with the character matching CHAR-PRED removed.
+A list containing only STR is returned if CHAR-PRED does not match any
+charcter."
+ (let ((i (string-index str char-pred)))
+ (if i
+ (list (string-take str i)
+ (string-drop str (1+ i)))
+ (list str))))
+
+(define (read-headers port)
+ "Read Haunt-style headers from PORT. Return a list of key/value pairs."
+ (let loop ((headers '()))
+ (match (read-line port)
+ ((? eof-object?) ;premature end-of-file
+ (reverse headers))
+ ("---"
+ (reverse headers))
+ (line
+ (match (map string-trim-both (string-split-at line #\:))
+ (((= string->symbol key) value)
+ (loop (alist-cons key value headers)))
+ (_
+ (raise (condition
+ (&message
+ (message (G_ "invalid CommonMark header")))))))))))
+
+(define (starts-with-headers? port)
+ "Return true if PORT starts with Haunt-style headers. Do not actually
+consume those bytes."
+ (let ((line (read-line port 'concat)))
+ (and (not (eof-object? line))
+ (begin
+ (unread-string line port)
+ (string-prefix? "title:" (string-trim line))))))
+
+(define (raise-sectioning body)
+ "Raise sectioning in BODY: sections become chapters, subsections become
+section, and so on."
+ (define mapping
+ '((subsubsection . subsection)
+ (subsection . section)
+ (section . chapter)
+ (chapter . chapter)))
+
+ (define (section? tag)
+ (assq tag mapping))
+
+ (map (lambda (item)
+ (match item
+ (((? section? tag) #:title title . body)
+ (let ((tag (assq-ref mapping tag)))
+ `(,tag #:title ,title
+ ,@(raise-sectioning body))))
+ (_
+ item)))
+ body))
+
+(define (read-commonmark-document port)
+ "Read a CommonMark document from PORT. Return Scheme code as an
s-expression."
+ (set-port-encoding! port "UTF-8")
+ (match (peek-char port)
+ ((? eof-object? eof)
+ eof)
+ (_
+ (let* ((headers (if (starts-with-headers? port)
+ (read-headers port)
+ '()))
+ (sexp (sxml->skribilo (commonmark->sxml port)))
+ (title (or (assoc-ref headers 'title)
+ (match sexp
+ ((('chapter #:title title _ ...) _ ...)
+ title)
+ (_ #f))))
+ (body (if (assoc-ref headers 'title)
+ sexp
+ (match sexp
+ ((('chapter #:title _ body ...))
+ body)
+ (_
+ sexp)))))
+ `(document
+ #:title ,title
+ #:author (list ,@(map (lambda (name)
+ `(author #:name ,name))
+ (if (assoc-ref headers 'author)
+ (string-split (assoc-ref headers 'author)
+ #\,)
+ '())))
+
+ ,@(if (assoc-ref headers 'title)
+ body
+ (raise-sectioning body)))))))
+
+(define make-commonmark-reader
+ (const read-commonmark-document))
+
+(define-reader commonmark "0.1" make-commonmark-reader)
diff --git a/tests/readers/commonmark.test b/tests/readers/commonmark.test
new file mode 100644
index 0000000..8047d1f
--- /dev/null
+++ b/tests/readers/commonmark.test
@@ -0,0 +1,147 @@
+;;; Exercise CommonMark reader. -*- Scheme -*-
+;;;
+;;; Copyright © 2024 Ludovic Courtès <ludo@gnu.org>
+;;;
+;;; This file is part of Skribilo.
+;;;
+;;; Skribilo is free software; you can redistribute it and/or modify it
+;;; under the terms of the GNU Lesser General Public License as published by
+;;; the Free Software Foundation; either version 3 of the License, or (at
+;;; your option) any later version.
+;;;
+;;; Skribilo is distributed in the hope that it will be useful, but
+;;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser
+;;; General Public License for more details.
+;;;
+;;; You should have received a copy of the GNU Lesser General Public License
+;;; along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+(define-module (tests commonmark)
+ #:use-module (ice-9 match)
+ #:use-module (skribilo reader)
+ #:use-module (srfi srfi-64))
+
+(unless (false-if-exception (resolve-interface '(commonmark)))
+ (exit 77))
+
+(define make-commonmark-reader
+ (reader:make (lookup-reader 'commonmark)))
+
+
+(test-begin "commonmark")
+
+(test-equal "document with headers"
+ '(document #:title "CommonMark is cool"
+ #:author (list (author #:name "Charlie Doe"))
+ (chapter #:title (list "First")
+ (paragraph "Hi there" "!"))
+ (chapter #:title (list "Second")
+ (paragraph "Ooh.")
+ (section #:title (list "Sub One")
+ (paragraph "Something here."))
+ (section #:title (list "Sub Two")
+ (paragraph "Not much to add."))))
+ (call-with-input-string "\
+title: CommonMark is cool
+author: Charlie Doe
+---
+
+# First
+
+Hi there!
+
+# Second
+
+Ooh.
+
+## Sub One
+Something here.
+
+## Sub Two
+Not much to add."
+ (make-commonmark-reader)))
+
+(test-equal "document without headers"
+ '(document #:title (list "Title coming from first chapter")
+ #:author (list)
+ (paragraph "Hi there" "!")
+ (chapter #:title (list "Section becomes a chapter")
+ (section #:title (list "Subsection")
+ (paragraph "Something here.")))
+ (chapter #:title (list "Conclusion")
+ (paragraph "Done.")))
+ (call-with-input-string "
+# Title coming from first chapter
+
+Hi there!
+
+## Section becomes a chapter
+
+### Subsection
+Something here.
+
+## Conclusion
+Done."
+ (make-commonmark-reader)))
+
+(test-equal "ornaments"
+ '(document #:title (list "The Title")
+ #:author (list)
+
+ (paragraph "Here's a " (bold "bullet") " list:")
+ (itemize
+ (item "one")
+ (item "two"))
+ (paragraph "Here's a " (emph "numbered") " list:")
+ (enumerate
+ (item "one")
+ (item "two")))
+ (call-with-input-string "
+# The Title
+
+Here's a **bullet** list:
+
+ - one
+ - two
+
+Here's a *numbered* list:
+
+ 1. one
+ 2. two"
+ (make-commonmark-reader)))
+
+(test-equal "references and images"
+ '(document #:title (list "The Title")
+ #:author (list)
+
+ (paragraph "The " (ref #:url "https://guix.gnu.org"
+ #:text (list (emph "awesome") " Guix"))
+ " project.")
+ (paragraph (image #:url "https://guix.gnu.org/favicon.ico"
+ #:text "The favicon.")))
+ (call-with-input-string "
+# The Title
+
+The [_awesome_ Guix](https://guix.gnu.org) project.
+
+"
+ (make-commonmark-reader)))
+
+(test-equal "code block"
+ '(document #:title (list "The Title")
+ #:author (list)
+ (pre (prog "int main () {
+ return 0;
+}")))
+ (call-with-input-string "
+# The Title
+
+```c
+int main () {
+ return 0;
+}
+```"
+ (make-commonmark-reader)))
+
+(test-end)
--
2.41.0