Re: [Chicken-users] Parsing Simple Markup
From: Richard
Subject: Re: [Chicken-users] Parsing Simple Markup
Date: Sat, 20 Sep 2014 19:04:58 +0000
On Sat, 20 Sep 2014 11:19:08 -0400
Yves Cloutier <address@hidden> wrote:
> Hello,
>
> I am a new user to Scheme in general and to Chicken in particular,
> nice to meet you all.
>
> I came to scheme looking for an alternative to Perl for doing a
> personal project which involves parsing an input file, identifying
> html-like commands and converting those to Groff code.
>
> I was doing well up to a certain point but things started getting
> messy and thought perhaps there is a language out there better suited
> for this - which led me to scheme.
>
> Scheme is a totally different paradigm than I'm used to, so while I
> wait for my books to arrive I will need some hand-holding...hope
> that's ok.
>
> 1) Is the Chicken Scheme manual available for purchase? Online docs
> are great but I like to have a hardcopy so that I can read offline.
>
> 2) The best way to learn is to get your hands dirty so I was looking
> at doing everything from scratch, but then I saw input-parse (
> http://wiki.call-cc.org/eggref/4/input-parse) which seems pretty much
> like what I need. But I can't seem to find this in the Eggs. It
> says that page does not exist yet.
>
> For the most part, a lot of what I want to do is "search and replace",
> except for special cases where additional processing would be
> required to extract command:value pairs.
>
> A few examples of what I am trying to parse:
>
> 1. Tags that identify structural elements of a document:
> [chapter] "Chapter Title"
> [heading1] "Heading Title"
> [list]
> ...
> [end]
>
> [quote]
> ...
> [end]
>
> 2. Tags that identify formatting of text:
> <bold<text> ;single formatting command with no value
> <indent 5<text> ; formatting command with a value
> <dropcap<O>nce upon a time
> <bold, smallcap, size +2<text> ;a command group which has multiple
> formatting commands enclosed within <...<.
>
> A command group can be singular:
>
> <...<
>
> or have multiple commands separated by commas:
>
> <...,...,...,<
>
> the closing > signalling the end of the command group.
>
> The idea is to make typesetting with Groff very simple and intuitive
> for any user - not just programmers and hackers. The markup we are
> working on is called Typesetting Markup Language (TML). So it would
> convert html-like commands and generate a Groff document from it.
>
> Right now I am trying to do a prototype which generates Groff in the
> backend, but the idea is to have a general purpose markup that could
> also be used to generate LaTeX/ConTeXt, HTML, XML etc.
>
> In Perl I am able to do most of this with regular expressions, but I'm
> hitting my head against the wall when it comes to multiple formatting
> commands within a group <...,...,...<
>
> Also to note....I am NOT a programmer or developer - I am a hobbyist
> and doing this for fun!
>
> So there is my introduction! If any of you have any words of wisdom
> on where to begin I would love to hear from you.
>
> I literally started playing with Scheme last night while I wait for
> my book order (come on amazon...send me my books!)
>
> My idea was that I could read a line of text from a file at a time.
> My understanding is that the input would be read into an
> "s-expression" (which I understand to basically be a list). Then
> could "car" the first item of the list and match it against my "tags"
> or "formatting commands" (which would be defined as something like
> below)
>
> (define chapter "[chapter]")
> (define list:digit "[list:digit]")
> (define list:alpha "[list:alpha]")
> (define end-list "[end]")
> (define close-command-group ">")
> (define command-group-begin "<")
> (define command-group-end "<")
> (define bold "bold")
> (define smallcap "smallcap")
> (define dropcap "dropcap")
>
> And then do something based on what token that is encountered.
>
> This is my first attempt at functional programming so I realize I may
> not be approaching this in the best way.
>
> Regards, and looking forward to playing with Scheme!
>
> yves
Hello Yves,
Welcome to Chicken,
I can give you a more in-depth answer tomorrow when I have more time. In
the meantime: input-parse does work. I am not sure what you mean by not
being able to find it in the Eggs. You install it by typing at the
command line:
chicken-install -s input-parse
(the -s flag is only needed if installing requires root privileges).
Then, to use it, put (use input-parse) at the top of your source
code.
You say you have been using regexps before but got stuck. May I point
you to: http://wiki.call-cc.org/man/4/Unit%20irregex
IMO the extended SRE syntax is a lot saner than Perl's regexp syntax.
Maybe this is of some help.
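To give an idea of what SRE syntax looks like for the <...,...,...<text>
command groups, here is a minimal sketch. It assumes CHICKEN 4 (where
irregex and data-structures are core units) and that command names and
values never contain "<", ">" or ",". The names command-group-rx and
parse-command-group are made up for illustration; they are not part of
any egg.

```scheme
;; Sketch only: assumes CHICKEN 4 and the simple TML syntax quoted above.
(use irregex data-structures)

;; "<" commands "<" text ">"
;; submatch 1 is the comma-separated command list, submatch 2 is the text.
(define command-group-rx
  (irregex '(: "<" (submatch (* (~ #\<)))
               "<" (submatch (* (~ #\>)))
               ">")))

;; Returns (commands . text), or #f if STR holds no command group.
;; Each command is a list of strings: the name, then an optional value.
(define (parse-command-group str)
  (let ((m (irregex-search command-group-rx str)))
    (and m
         (cons (map string-split                  ; "size +2" -> ("size" "+2")
                    (irregex-split "," (irregex-match-substring m 1)))
               (irregex-match-substring m 2)))))

;; Example call, using a line from the original message:
(parse-command-group "<bold, smallcap, size +2<text>")
```

Splitting on "," first and on whitespace second keeps the regexp itself
small, which is usually where the Perl version starts to get messy.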
The following is an intro written for Python programmers, but you might
find it useful nonetheless:
http://wiki.call-cc.org/chicken-for-python-programmers
Good luck,
Richard