[elpa] master bf49fb6 38/60: Upate README
From: Junpeng Qiu
Subject: [elpa] master bf49fb6 38/60: Upate README
Date: Tue, 25 Oct 2016 17:45:15 +0000 (UTC)
branch: master
commit bf49fb6f5d9067ea41f502782df53770957f28e5
Author: Junpeng Qiu <address@hidden>
Commit: Junpeng Qiu <address@hidden>
Upate README
---
README.org | 258 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 258 insertions(+)
diff --git a/README.org b/README.org
index b4f8e2c..b439c25 100644
--- a/README.org
+++ b/README.org
@@ -1,3 +1,261 @@
#+TITLE: parsec.el
A parser combinator library for Emacs Lisp similar to Haskell's Parsec library.
+
+* Overview
+
+This work is based on [[https://github.com/jwiegley/][John Wiegley]]'s [[https://github.com/jwiegley/emacs-pl][emacs-pl]]. The original [[https://github.com/jwiegley/emacs-pl][emacs-pl]] is awesome,
+but I found the following problems when I tried to use it:
+
+- It only contains a very limited set of combinators
+- Some of its functions (combinators) have different behaviors than their
+ Haskell counterparts
+- It can't show error messages when parsing fails
+
+So I decided to write a new library on top of it. Unlike emacs-pl, this library contains
+most of the parser combinators in =Text.Parsec.Combinator=, which should be
+enough for most use cases. Of course, more combinators can be added if necessary!
+Most of the parser combinators have the same behavior as their Haskell
+counterparts. =parsec.el= also comes with a simple error handling mechanism so
+that it can display an error message showing how the parser fails.
+
+So we can
+
+- use these parser combinators to write parsers from scratch in Emacs
+ Lisp, just as we can in Haskell
+- port an existing Haskell program that uses Parsec to an equivalent Emacs Lisp
+ program easily
+
+* Parsing Functions & Parser Combinators
+
+ We compare the functions and macros defined in this library with their Haskell
+ counterparts, assuming you're already familiar with Haskell's Parsec. If you
+ don't have any experience with parser combinators, look at the docstrings of
+ these functions and macros and try them to see the results! They are really
+ easy to learn and use!
+
+** Basic Parsing Functions
+ These parsing functions are used as the basic building blocks for a parser. By
+ default, their return value is a string.
+
+ | parsec.el              | Haskell's Parsec | Usage                                                 |
+ |------------------------+------------------+-------------------------------------------------------|
+ | parsec-ch              | char             | parse a character                                     |
+ | parsec-any-ch          | anyChar          | parse an arbitrary character                          |
+ | parsec-satisfy         | satisfy          | parse a character satisfying a predicate              |
+ | parsec-newline         | newline          | parse '\n'                                            |
+ | parsec-crlf            | crlf             | parse '\r\n'                                          |
+ | parsec-eol             | endOfLine        | parse newline or CRLF                                 |
+ | parsec-eof, parsec-eob | eof              | parse end of file                                     |
+ | parsec-eol-or-eof      | *N/A*            | parse EOL or EOF                                      |
+ | parsec-re              | *N/A*            | parse using a regular expression                      |
+ | parsec-one-of          | oneOf            | parse one of the characters                           |
+ | parsec-none-of         | noneOf           | parse any character other than the supplied ones      |
+ | parsec-str             | *N/A*            | parse a string but consume input only when successful |
+ | parsec-string          | string           | parse a string and consume input for partial matches  |
+ | parsec-num             | *N/A*            | parse a number                                        |
+ | parsec-letter          | letter           | parse a letter                                        |
+ | parsec-digit           | digit            | parse a digit                                         |
+
+ Note:
+ - =parsec-str= and =parsec-string= are different. =parsec-string= behaves the
+ same as =string= in Haskell, and =parsec-str= is more like combining
+ =string= and =try= in Haskell.
+ - Use the power of regular expressions provided by =parsec-re= to simplify the parser!
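+
+ A small sketch of the two notes above, assuming =parsec.el= is installed and
+ on the =load-path= (the input strings are made up for illustration):
+ #+BEGIN_SRC elisp
+ (require 'parsec)
+
+ ;; A failed `parsec-str' consumes no input, so `parsec-or' can fall
+ ;; through to the next alternative.
+ (parsec-with-input "abd"
+   (parsec-or (parsec-str "abc")
+              (parsec-str "abd")))
+
+ ;; A single regular expression stands in for a loop over digit characters.
+ (parsec-with-input "2016-10-25"
+   (parsec-re "[0-9]+"))
+ #+END_SRC
+ The first form is expected to return the string "abd" and the second the
+ leading run of digits. With =parsec-string= in place of =parsec-str=, the
+ first alternative would consume "ab" before failing.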
+
+** Parser Combinators
+ These combinators can be used to combine different parsers.
+
+ | parsec.el                 | Haskell's Parsec | Usage                                                        |
+ |---------------------------+------------------+--------------------------------------------------------------|
+ | parsec-or                 | choice           | try the parsers until one succeeds                           |
+ | parsec-try                | try              | try parser and consume no input when an error occurs         |
+ | parsec-with-error-message | <?> (similar)    | use the new error message when an error occurs               |
+ | parsec-many               | many             | apply the parser zero or more times                          |
+ | parsec-many1              | many1            | apply the parser one or more times                           |
+ | parsec-many-till          | manyTill         | apply parser zero or more times until end succeeds           |
+ | parsec-until              | *N/A*            | parse until end succeeds                                     |
+ | parsec-not-followed-by    | notFollowedBy    | succeed when the parser fails                                |
+ | parsec-endby              | endBy            | apply parser zero or more times, separated and ended by end  |
+ | parsec-sepby              | sepBy            | apply parser zero or more times, separated by sep            |
+ | parsec-between            | between          | apply parser between open and close                          |
+ | parsec-count              | count            | apply parser n times                                         |
+ | parsec-option             | option           | apply parser, if it fails, return opt                        |
+ | parsec-optional           | *N/A*            | apply parser zero or one time and return the result          |
+ | parsec-optional*          | optional         | apply parser zero or one time and discard the result         |
+ | parsec-optional-maybe     | optionMaybe      | apply parser zero or one time and return the result in Maybe |
+
+ Note:
+ - =parsec-or= can also be used to replace =<|>=.
+ - =parsec-with-error-message= is slightly different from =<?>=. It will
+ replace the error message even when the input is consumed.
+ - By default, =parsec-many-till= behaves as Haskell's =manyTill=. However,
+ =parsec-many-till= and =parsec-until= can accept an optional argument to
+ specify which part(s) to be returned. You can use =:both= or =:end= as the
+ optional argument to change the default behavior. See the docstrings for
+ more information.
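+
+ As a hedged sketch of how the combinators in the table compose: the bracketed
+ input format is only an example, and the code assumes that =parsec-between=
+ takes its arguments in the same order as Haskell's =between= (open, close,
+ parser) and that =parsec-sepby= takes the parser followed by the separator.
+ #+BEGIN_SRC elisp
+ (require 'parsec)
+
+ ;; Parse a bracketed, comma-separated list of unsigned integers.
+ ;; Argument order is an assumption; check the docstrings.
+ (parsec-with-input "[1,2,3]"
+   (parsec-between
+    (parsec-ch ?\[) (parsec-ch ?\])
+    (parsec-sepby (parsec-many1-as-string (parsec-digit))
+                  (parsec-ch ?,))))
+ #+END_SRC
+ Each number comes back as a string, since the basic parsing functions return
+ strings, and =parsec-sepby= gathers them into a list.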
+
+** Parser Utilities
+ These utilities can be used together with parser combinators to build a
+ parser and ease the translation process if you're trying to port an existing
+ Haskell program.
+
+ | parsec.el                        | Haskell's Parsec | Usage                                                   |
+ |----------------------------------+------------------+---------------------------------------------------------|
+ | parsec-and                       | do block         | try all parsers and return the last result              |
+ | parsec-return                    | do block         | try all parsers and return the first result             |
+ | parsec-ensure                    | *N/A*            | quit the parsing when an error occurs                   |
+ | parsec-ensure-with-error-message | *N/A*            | quit the parsing when an error occurs with new message  |
+ | parsec-collect                   | sequence         | try all parsers and collect the results into a list     |
+ | parsec-collect*                  | *N/A*            | try all parsers and collect non-nil results into a list |
+ | parsec-start                     | parse            | entry point                                             |
+ | parsec-parse                     | parse            | entry point (same as parsec-start)                      |
+ | parsec-with-input                | parse            | perform parsers on input                                |
+ | parsec-from-maybe                | fromMaybe        | retrieve value from Maybe                               |
+ | parsec-maybe-p                   | *N/A*            | is a Maybe value or not                                 |
+ | parsec-query                     | *N/A*            | change the parser's return value                        |
+
+** Variants that Return a String
+
+ By default, the macros/functions that return multiple values will put the
+ values into a list. These macros/functions are:
+ - =parsec-many=
+ - =parsec-many1=
+ - =parsec-many-till=
+ - =parsec-until=
+ - =parsec-count=
+ - =parsec-collect= and =parsec-collect*=
+
+ They all have a variant that returns a string by concatenating the results in
+ the list:
+ - =parsec-many-as-string=
+ - =parsec-many1-as-string=
+ - =parsec-many-till-as-string=
+ - =parsec-until-as-string=
+ - =parsec-collect-as-string=
+
+ These variants accept the same arguments. The only difference is the return
+ value. In most cases I found myself using these variants instead of the
+ original versions that return a list.
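+
+ For instance (a minimal sketch, assuming =parsec.el= is loaded; the input is
+ made up for illustration):
+ #+BEGIN_SRC elisp
+ (require 'parsec)
+
+ ;; List version: one string per parsed character.
+ (parsec-with-input "aaab"
+   (parsec-many (parsec-ch ?a)))
+
+ ;; String version: the same results, concatenated into one string.
+ (parsec-with-input "aaab"
+   (parsec-many-as-string (parsec-ch ?a)))
+ #+END_SRC
+ The first form should produce a list of three one-character strings, the
+ second the single string "aaa".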
+
+* Code Examples
+ Some very simple examples are given here. You can see many code examples in
+ the test files in this GitHub repo.
+
+ The following code extracts the "hello" from the comment:
+ #+BEGIN_SRC elisp
+ (parsec-with-input "/* hello */"
+   (parsec-string "/*")
+   (parsec-many-till-as-string (parsec-any-ch)
+                               (parsec-try
+                                (parsec-string "*/"))))
+ #+END_SRC
+
+ The equivalent Haskell program:
+ #+BEGIN_SRC haskell
+ import Text.Parsec
+
+ main :: IO ()
+ main = print $ parse p "" "/* hello */"
+   where
+     p = do string "/*"
+            manyTill anyChar (try (string "*/"))
+ #+END_SRC
+
+ The following code returns the "aeiou" before "end":
+ #+BEGIN_SRC elisp
+ (parsec-with-input "if aeiou end"
+   (parsec-str "if ")
+   (parsec-return
+       (parsec-many-as-string (parsec-one-of ?a ?e ?i ?o ?u))
+     (parsec-str " end")))
+ #+END_SRC
+
+* Parser Examples
+ I translated some Haskell Parsec examples into Emacs Lisp using =parsec.el=.
+ These examples show that it is very easy to write parsers using
+ =parsec.el=. If you know Haskell, you can see that I basically
+ translated the Haskell code into Emacs Lisp line by line, because most of the
+ combinators are the same!
+
+ You can find five examples under the =examples/= directory.
+
+ Three of the examples are taken from the chapter [[http://book.realworldhaskell.org/read/using-parsec.html][Using Parsec]] in the book
+ [[http://book.realworldhaskell.org/read/][Real World Haskell]]:
+ - =simple-csv-parser.el=: a simple CSV parser with no support for quoted cells
+ - =full-csv-parser.el=: a full CSV parser
+ - =url-str-parser.el=: a parser for the parameters in a URL
+
+ =pjson.el= is a translation of Haskell's [[https://hackage.haskell.org/package/json-0.9.1/docs/src/Text-JSON-Parsec.html][json library using Parsec]].
+
+ =scheme.el= is a much simplified Scheme parser based on [[https://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/][Write Yourself a Scheme in 48 Hours]].
+
+ They're really simple but you can see how this library works!
+
+* Change the Return Values using =parsec-query=
+ Parsing has side effects such as advancing the current point. In the original
+ [[https://github.com/jwiegley/emacs-pl][emacs-pl]], you can pass optional arguments to some parsing functions
+ (=pl-ch=, =pl-re= etc.) to change the return values. In =parsec.el=, these
+ functions don't have such a behavior. Instead, we provide a unified interface
+ =parsec-query=, which accepts any parser, and changes the return value of the
+ parser.
+
+ You can specify the following arguments:
+ #+BEGIN_EXAMPLE
+ :beg --> return the point before applying the PARSER
+ :end --> return the point after applying the PARSER
+ :nil --> return nil
+ :groups N --> return Nth group for `parsec-re'
+ #+END_EXAMPLE
+
+ So instead of returning "b" as the result, the following code returns 2:
+ #+BEGIN_SRC elisp
+ (parsec-with-input "ab"
+   (parsec-ch ?a)
+   (parsec-query (parsec-ch ?b) :beg))
+ #+END_SRC
+
+ Returning a point means that you can also combine =parsec.el= with Emacs
+ Lisp functions that operate on points or regions, such as =goto-char= and
+ =kill-region=.
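+
+ A minimal sketch of that idea, assuming the =:end= argument behaves as listed
+ above (the input string is made up for illustration):
+ #+BEGIN_SRC elisp
+ (require 'parsec)
+
+ ;; Record the points on either side of the value, then hand them to an
+ ;; ordinary buffer function.
+ (parsec-with-input "key=value"
+   (let* ((beg (parsec-query (parsec-str "key=") :end))
+          (end (parsec-query (parsec-re "[a-z]+") :end)))
+     (buffer-substring beg end)))
+ #+END_SRC
+ Here =buffer-substring= is only a stand-in; any function that takes a region
+ could be used in its place.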
+
+* Error Messages
+
+ =parsec.el= implements a simple error handling mechanism. When an error
+ happens, it will show how the parser fails.
+
+ For example, the following code fails:
+ #+BEGIN_SRC elisp
+ (parsec-with-input "aac"
+   (parsec-count 2 (parsec-ch ?a))
+   (parsec-ch ?b))
+ #+END_SRC
+
+ The return value is:
+ #+BEGIN_SRC elisp
+ (parsec-error . "Found \"c\" -> Expected \"b\"")
+ #+END_SRC
+
+ This also works when parser combinators fail:
+ #+BEGIN_SRC elisp
+ (parsec-with-input "a"
+   (parsec-or (parsec-ch ?b)
+              (parsec-ch ?c)))
+ #+END_SRC
+
+ The return value is:
+ #+BEGIN_SRC elisp
+ (parsec-error . "None of the parsers succeeds:
+ Found \"a\" -> Expected \"c\"
+ Found \"a\" -> Expected \"b\"")
+ #+END_SRC
+
+ If an error occurs, the return value is a cons cell that contains the error
+ message in its =cdr=. Compared to Haskell's Parsec, it's really simple, but at
+ least the error message gives us some information. Not perfect, but
+ usable.
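+
+ Since a failure is a cons cell whose =car= is =parsec-error=, as in the return
+ values shown above, calling code can branch on it. A sketch (the helper name
+ =my-parse-aab= is hypothetical and not part of =parsec.el=):
+ #+BEGIN_SRC elisp
+ (require 'parsec)
+
+ ;; Hypothetical helper: not part of parsec.el.
+ (defun my-parse-aab (input)
+   "Parse \"aab\" from INPUT; return the result or signal a `user-error'."
+   (let ((result (parsec-with-input input
+                   (parsec-count 2 (parsec-ch ?a))
+                   (parsec-ch ?b))))
+     (if (eq (car-safe result) 'parsec-error)
+         (user-error "Parse failed: %s" (cdr result))
+       result)))
+ #+END_SRC
+ Calling =(my-parse-aab "aac")= would then report the same message as in the
+ first example of this section instead of silently returning the cons cell.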
+
+* Acknowledgement
+ - Daan Leijen for Haskell's Parsec
+ - [[https://github.com/jwiegley/][John Wiegley]] for [[https://github.com/jwiegley/emacs-pl][emacs-pl]]