chicken-users

[Chicken-users] Parsing Simple Markup


From: Yves Cloutier
Subject: [Chicken-users] Parsing Simple Markup
Date: Sat, 20 Sep 2014 11:19:08 -0400

Hello,

I am a new user of Scheme in general and of Chicken in particular. Nice to meet you all.

I came to Scheme looking for an alternative to Perl for a personal project, which involves parsing an input file, identifying HTML-like commands and converting those to Groff code.

I was doing well up to a certain point, but things started getting messy and I thought perhaps there is a language out there better suited for this, which led me to Scheme.

Scheme is a totally different paradigm from what I'm used to, so while I wait for my books to arrive I will need some hand-holding... I hope that's ok.

1) Is the Chicken Scheme manual available for purchase?  Online docs are great but I like to have a hardcopy so that I can read offline.

2) The best way to learn is to get your hands dirty, so I was looking at doing everything from scratch, but then I saw input-parse (http://wiki.call-cc.org/eggref/4/input-parse), which seems pretty much like what I need.  But I can't seem to find it among the Eggs.  It says that page does not exist yet.

For the most part, what I want to do is "search and replace", except for special cases where additional processing would be required to extract command:value pairs.

A few examples of what I am trying to parse:

1. Tags that identify structural elements of a document:
[chapter] "Chapter Title"
[heading1] "Heading Title"
[list]
...
[end]

[quote]
...
[end]

2. Tags that identify formatting of text:
<bold<text>  ;single formatting command with no value
<indent 5<text> ; formatting command with a value
<dropcap<O>nce upon a time
<bold, smallcap, size +2<text> ;a command group which has multiple formatting commands enclosed within <...<.

A command group can be singular:

<...<

or have multiple commands separated by commas:

<...,...,...,<

The closing > signals the end of the command group.

The idea is to make typesetting with Groff very simple and intuitive for any user, not just programmers and hackers.  The markup we are working on is called Typesetting Markup Language (TML).  The converter would take a document written with these HTML-like commands and generate a Groff document from it.

Right now I am trying to build a prototype which generates Groff in the backend, but the idea is to have a general-purpose markup that could also be used to generate LaTeX/ConTeXt, HTML, XML, etc.

In Perl I am able to do most of this with regular expressions, but I'm hitting my head against the wall when it comes to multiple formatting commands within a group <...,...,...<
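
A minimal sketch of how a command group such as <bold, smallcap, size +2<text> could be taken apart without regular expressions, using only standard Scheme procedures. All procedure names here (split-on, trim, parse-command, parse-command-group, index-of, parse-span) are made up for illustration. The sketch assumes that spans are not nested and that a command's value is separated from its name by whitespace.

```scheme
;; Split str on the character ch, returning a list of substrings.
(define (split-on str ch)
  (let loop ((chars (string->list str)) (current '()) (parts '()))
    (cond ((null? chars)
           (reverse (cons (list->string (reverse current)) parts)))
          ((char=? (car chars) ch)
           (loop (cdr chars) '() (cons (list->string (reverse current)) parts)))
          (else
           (loop (cdr chars) (cons (car chars) current) parts)))))

;; Remove leading and trailing whitespace from str.
(define (trim str)
  (let loop ((start 0) (end (string-length str)))
    (cond ((and (< start end) (char-whitespace? (string-ref str start)))
           (loop (+ start 1) end))
          ((and (< start end) (char-whitespace? (string-ref str (- end 1))))
           (loop start (- end 1)))
          (else (substring str start end)))))

;; "size +2" => ("size" . "+2"), "bold" => ("bold" . #f)
(define (parse-command str)
  (let* ((s (trim str))
         (len (string-length s)))
    (let loop ((i 0))
      (cond ((= i len) (cons s #f))
            ((char-whitespace? (string-ref s i))
             (cons (substring s 0 i) (trim (substring s i len))))
            (else (loop (+ i 1)))))))

;; Turn the text between the two "<" characters into a list of
;; (command . value) pairs, skipping empty entries such as the one
;; left by a trailing comma.
(define (parse-command-group str)
  (let loop ((parts (split-on str #\,)) (acc '()))
    (cond ((null? parts) (reverse acc))
          ((string=? (trim (car parts)) "") (loop (cdr parts) acc))
          (else (loop (cdr parts) (cons (parse-command (car parts)) acc))))))

;; Index of the first occurrence of ch in str at or after start, or #f.
(define (index-of str ch start)
  (let loop ((i start))
    (cond ((>= i (string-length str)) #f)
          ((char=? (string-ref str i) ch) i)
          (else (loop (+ i 1))))))

;; Assumes str begins with "<". Returns (commands text), or #f if the
;; second "<" or the closing ">" is missing.
(define (parse-span str)
  (let* ((group-end (index-of str #\< 1))
         (text-end (and group-end (index-of str #\> group-end))))
    (and text-end
         (list (parse-command-group (substring str 1 group-end))
               (substring str (+ group-end 1) text-end)))))

;; (parse-command-group "bold, smallcap, size +2")
;;   => (("bold" . #f) ("smallcap" . #f) ("size" . "+2"))
;; (parse-span "<indent 5<text>")
;;   => ((("indent" . "5")) "text")
```

Each command ends up as a pair, so a command without a value carries #f and the generator can decide later what to emit for it.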

Also to note... I am NOT a programmer or developer. I am a hobbyist and doing this for fun!

So there is my introduction!  If any of you have any words of wisdom on where to begin I would love to hear from you.

I literally started playing with Scheme last night while I wait for my book order (come on amazon...send me my books!)

My idea was that I could read one line of text from a file at a time.  My understanding is that the input would be read into an "s-expression" (which I understand to basically be a list).  Then I could "car" the first item of the list and match it against my "tags" or "formatting commands" (which would be defined as something like below)

(define chapter "[chapter]")
(define list:digit "[list:digit]")
(define list:alpha "[list:alpha]")
(define end-list "[end]")
(define close-command-group ">")
(define command-group-begin "<")
(define command-group-end "<")
(define bold "bold")
(define smallcap "smallcap")
(define dropcap "dropcap")

And then do something based on which token is encountered.
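
A rough sketch of that read-then-dispatch idea, again in plain standard Scheme. One thing to note: a line read from a file arrives as a string, not a list, so the first token has to be cut out of the string before it can be compared against a tag. The handler procedures (emit-chapter and friends), the table name tag-handlers and the output strings are placeholders invented for this example. They are not real Groff code.

```scheme
;; Read one line from port as a string; returns the eof object at end of file.
(define (read-text-line port)
  (let loop ((chars '()))
    (let ((c (read-char port)))
      (cond ((eof-object? c)
             (if (null? chars) c (list->string (reverse chars))))
            ((char=? c #\newline) (list->string (reverse chars)))
            (else (loop (cons c chars)))))))

;; Read a whole file into a list of line strings.
(define (read-all-lines filename)
  (call-with-input-file filename
    (lambda (port)
      (let loop ((lines '()))
        (let ((line (read-text-line port)))
          (if (eof-object? line)
              (reverse lines)
              (loop (cons line lines))))))))

;; Index of the first whitespace character in str, or its length.
(define (token-end str)
  (let loop ((i 0))
    (if (or (= i (string-length str))
            (char-whitespace? (string-ref str i)))
        i
        (loop (+ i 1)))))

(define (drop-leading-whitespace str)
  (let loop ((i 0))
    (if (and (< i (string-length str))
             (char-whitespace? (string-ref str i)))
        (loop (+ i 1))
        (substring str i (string-length str)))))

;; Placeholder handlers: each receives the rest of the line after the tag.
(define (emit-chapter rest) (string-append "CHAPTER " rest))
(define (emit-heading1 rest) (string-append "HEADING1 " rest))
(define (emit-list-start rest) "LIST-START")
(define (emit-end rest) "END")

;; Association list from tag string to handler procedure.
(define tag-handlers
  (list (cons "[chapter]" emit-chapter)
        (cons "[heading1]" emit-heading1)
        (cons "[list]" emit-list-start)
        (cons "[end]" emit-end)))

;; Look up the first token of the line; unknown lines pass through unchanged.
(define (convert-line line)
  (let* ((n (token-end line))
         (tag (substring line 0 n))
         (rest (drop-leading-whitespace
                (substring line n (string-length line))))
         (entry (assoc tag tag-handlers)))
    (if entry
        ((cdr entry) rest)
        line)))

;; (convert-line "[chapter] \"Chapter Title\"")
;;   => "CHAPTER \"Chapter Title\""
```

With a hypothetical input file named chapter1.tml, the whole file could then be converted with (map convert-line (read-all-lines "chapter1.tml")). Keeping the tags in a table means a new tag only needs a new entry, not a new branch in the code.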

This is my first attempt at functional programming so I realize I may not be approaching this in the best way.

Regards, and looking forward to playing with Scheme!

yves
