[elpa] master ffd42de 45/60: Use simple-csv-parser.el as a demo
From: Junpeng Qiu
Subject: [elpa] master ffd42de 45/60: Use simple-csv-parser.el as a demo
Date: Tue, 25 Oct 2016 17:45:16 +0000 (UTC)
branch: master
commit ffd42de77fc504f17e84d618892fc05e2ba81843
Author: Junpeng Qiu <address@hidden>
Commit: Junpeng Qiu <address@hidden>
Use simple-csv-parser.el as a demo
---
README.org | 94 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 91 insertions(+), 3 deletions(-)
diff --git a/README.org b/README.org
index 97e9214..eb31c02 100644
--- a/README.org
+++ b/README.org
@@ -36,7 +36,7 @@ So we can
** Basic Parsing Functions
These parsing functions are used as the basic building block for a parser. By
- default, their return value is a string.
+ default, their return value is a *string*.
| parsec.el | Haskell's Parsec | Usage |
|------------------------+------------------+-------------------------------------------------------|
@@ -172,7 +172,94 @@ So we can
(parsec-str " end")))
#+END_SRC
-* Parser Examples
+* Write a Parser: a Simple CSV Parser
+ You can find the code in =examples/simple-csv-parser.el=. The code is based
+ on the Haskell code in [[http://book.realworldhaskell.org/read/using-parsec.html][Using Parsec]].
+
+ An end-of-line should be the string =\n=. We use =(parsec-str "\n")= to parse it
+ (Note that since =\n= is also one character, =(parsec-ch ?\n)= also works).
+ Some files may not contain a newline at the end, but we can view end-of-file
+ as the end-of-line for the last line, and use =parsec-eof= (or =parsec-eob=)
+ to parse the end-of-file. We use =parsec-or= to combine these two
+ combinators:
+ #+BEGIN_SRC elisp
+ (defun s-csv-eol ()
+ (parsec-or (parsec-str "\n")
+ (parsec-eof)))
+ #+END_SRC
+
+ A CSV file contains many lines and ends with an end-of-file. Use
+ =parsec-return= to return the result of the first parser as the result.
+ #+BEGIN_SRC elisp
+ (defun s-csv-file ()
+ (parsec-return (parsec-many (s-csv-line))
+ (parsec-eof)))
+ #+END_SRC
+
+ A CSV line contains many CSV cells and ends with an end-of-line, and we
+ should return the cells as the results:
+ #+BEGIN_SRC elisp
+ (defun s-csv-line ()
+ (parsec-return (s-csv-cells)
+ (s-csv-eol)))
+ #+END_SRC
+
+ The CSV cells form a list, containing the first cell and the remaining cells:
+ #+BEGIN_SRC elisp
+ (defun s-csv-cells ()
+ (cons (s-csv-cell-content) (s-csv-remaining-cells)))
+ #+END_SRC
+
+ A CSV cell consists of any characters that are not =,= or =\n=, and we use the
+ =parsec-many-as-string= variant to return the whole content as a string
+ instead of a list of single-character strings:
+ #+BEGIN_SRC elisp
+ (defun s-csv-cell-content ()
+ (parsec-many-as-string (parsec-none-of ?, ?\n)))
+ #+END_SRC
+
+ For the remaining cells: if followed by a comma =,=, we try to parse more CSV
+ cells. Otherwise, we should return =nil=:
+ #+BEGIN_SRC elisp
+ (defun s-csv-remaining-cells ()
+ (parsec-or (parsec-and (parsec-ch ?,) (s-csv-cells)) nil))
+ #+END_SRC
+
+ OK. Our parser is almost done. To begin parsing the content in buffer =foo=,
+ you need to wrap the parser inside =parsec-start= (or =parsec-parse=):
+ #+BEGIN_SRC elisp
+ (with-current-buffer "foo"
+ (goto-char (point-min))
+ (parsec-parse
+ (s-csv-file)))
+ #+END_SRC
+
+ If you want to parse a string instead, we provide a simple wrapper macro
+ =parsec-with-input=: you feed a string as the input and put arbitrary
+ parsers inside the macro body. =parsec-start= or =parsec-parse= is not needed.
+ #+BEGIN_SRC elisp
+ (parsec-with-input "a1,b1,c1\na2,b2,c2"
+ (s-csv-file))
+ #+END_SRC
+
+ The above code returns:
+ #+BEGIN_SRC elisp
+ (("a1" "b1" "c1") ("a2" "b2" "c2"))
+ #+END_SRC
+
+ Note that if we replace =parsec-many-as-string= with =parsec-many= in
+ =s-csv-cell-content=:
+ #+BEGIN_SRC elisp
+ (defun s-csv-cell-content ()
+ (parsec-many (parsec-none-of ?, ?\n)))
+ #+END_SRC
+
+ The result would be:
+ #+BEGIN_SRC elisp
+ ((("a" "1") ("b" "1") ("c" "1")) (("a" "2") ("b" "2") ("c" "2")))
+ #+END_SRC
+
+* More Parser Examples
I translate some Haskell Parsec examples into Emacs Lisp using =parsec.el=.
You can see from these examples that it is very easy to write parsers using
=parsec.el=, and if you know haskell, you can see that basically I just
@@ -183,7 +270,8 @@ So we can
Three of the examples are taken from the chapter [[http://book.realworldhaskell.org/read/using-parsec.html][Using Parsec]] in the book of [[http://book.realworldhaskell.org/read/][Real World Haskell]]:
- - =simple-csv-parser.el=: a simple csv parser with no support for quoted cells
+ - =simple-csv-parser.el=: a simple csv parser with no support for quoted
+ cells, as explained in the previous section.
- =full-csv-parser.el=: a full csv parser
- =url-str-parser.el=: parser parameters in URL
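
An aside to the patch above, not part of the commit: the two result shapes the README shows for =parsec-many= and =parsec-many-as-string= can be pictured with plain Emacs Lisp, without loading =parsec.el=. The names =demo-join-chars= and =demo-row= are hypothetical and only illustrate the shape of the data. This is a sketch of the relationship between the two results, not a description of how =parsec.el= implements =parsec-many-as-string=.

#+BEGIN_SRC elisp
;; Hypothetical helper, for illustration only: join a list of
;; single-character strings into one string.
(defun demo-join-chars (chars)
  "Concatenate CHARS, a list of strings, into a single string."
  (apply #'concat chars))

;; One row in the shape the README reports for `parsec-many':
;; each cell is a list of single-character strings.
(defvar demo-row '(("a" "1") ("b" "1") ("c" "1")))

;; Joining each cell gives the shape the README reports for
;; `parsec-many-as-string': each cell is a single string.
(mapcar #'demo-join-chars demo-row)
;; => ("a1" "b1" "c1")
#+END_SRC

Collecting each cell as one string keeps the parsed rows directly usable as lists of strings, which is presumably why the demo uses the =-as-string= variant.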