help-gnu-emacs
Re: how to use parsing expressing grammar


From: Xah Lee
Subject: Re: how to use parsing expressing grammar
Date: Sat, 20 Dec 2008 13:41:59 -0800 (PST)
User-agent: G2/1.0

Xah Lee wrote:
> let's say i want to change tags of the form
> “<img src="archimedesspiral.png">”
> into
> “<img src="★">”
> I tried the following:
> (defun doMyReplace ()
> (interactive)
>   (peg-parse
>    (start imgTag)
>   (imgTag "<img" whitespace "src=" "\"" (replace filePath "★") "\"" ">")
>   (whitespace [" "])
>   (filePath [a-z "."])
>    )
> )

Helmut Eller wrote:
> The filePath rule only matches the first character.  You probably
> want to write (+ [a-z "."]).  Same issue for whitespace.

Thanks a lot! It worked out great!

I have another question, hopefully this one is not a dumb one.

In summary, if i have

   (imgTag "<img" whitespace (+ attributes whitespace) ">")

how do i tell PEG that when an attribute is the last item, the
whitespace following it is optional?

For example, the above will match
<A B C >
but won't match
<A B C>

Here's my code:

(defun doMyReplace ()
  (interactive)
  (peg-parse
   (imgTag "<img" _ (+ attributes _) ">")
   (attributes (or src alt width height))
   (src "src" _* "=" _* "\"" filePath "\"")
   (filePath (+ [A-Z a-z "./_-"]))
   (alt "alt" _* "=" _* "\"" altStr "\"")
   (altStr (* [A-Z a-z "./ '_"]))
   (width "width" _* "=" _* "\"" digits "\"")
   (height "height" _* "=" _* "\"" digits "\"")
   (_* (* ["\n \t"])) ; 0 or more whitespace characters
   (_ (+ ["\n \t"]))  ; 1 or more whitespace characters
   (digits (+ [0-9]))))

here's a sample text to be matched:
<img src="archimedes_spiral_k.png" alt="archimedean spiral"
width="288" height="115">

if i add a space before the closing “>”, it matches.
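
One restructuring that might address this, as an untested sketch:
treat whitespace as a separator between attributes rather than a
terminator, then allow optional whitespace before the closing “>”.
The function name my-match-img-tag is made up for illustration. The
sketch assumes peg.el is loaded, and that a failed iteration of
(* ...) restores point, as PEG repetition normally does.

```elisp
(require 'peg) ; assumption: peg.el is installed and on the load-path

(defun my-match-img-tag () ; hypothetical name, not from the thread
  "Sketch: match an <img ...> tag at point.
Whitespace before the closing bracket is optional."
  (interactive)
  (peg-parse
   ;; One attribute, then zero or more "whitespace attribute" pairs,
   ;; then optional whitespace before the closing bracket.
   (imgTag "<img" _ attributes (* _ attributes) _* ">")
   ;; The remaining rules are unchanged from the code above.
   (attributes (or src alt width height))
   (src "src" _* "=" _* "\"" filePath "\"")
   (filePath (+ [A-Z a-z "./_-"]))
   (alt "alt" _* "=" _* "\"" altStr "\"")
   (altStr (* [A-Z a-z "./ '_"]))
   (width "width" _* "=" _* "\"" digits "\"")
   (height "height" _* "=" _* "\"" digits "\"")
   (_* (* ["\n \t"])) ; 0 or more whitespace characters
   (_ (+ ["\n \t"]))  ; 1 or more whitespace characters
   (digits (+ [0-9]))))
```

The idea is that after the last attribute, (* _ attributes) stops
without consuming the trailing whitespace, and _* then accepts
whatever is left, including nothing. To try it, put point just before
“<img” in the sample text and call the function.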

Thanks again.

> > Btw, would you be interested in starting a mailing list on PEG in
> > emacs? e.g. yasnippet has one thru google, nxml has one in yahoo
> > group, ljupdate has one in livejournal. I think it'd be helpful.
>
> So far only 2 people asked questions.  If there are some more we can set
> up a mailing list.

I'm pretty sure if you create it, more and more people will join it.
I'm very interested in PEG and think it is of critical importance. If,
say, emacs 24 had it built in as C code, with regex functions such as
search-forward-regexp and query-replace-regexp each having a PEG
version, it would make emacs a killer app.

From Wikipedia, it appears that people have already written PEG libs
for most major langs. There is already a C lib for PEG. The problem
with them is that most come from a background of computer language
parsing, as opposed to practical text processing the way regex is
used. (note: regex itself came from a computer science background, as
a way to recognize languages with a “regular” grammar, but today it is
far removed from theoretical parsing. That transmutation took, i
think, 10 years, and it took another 10 or so years until it became a
widely popular tool in langs, starting with Perl in the 1990s.) I
don't foresee that in the next 10 years practicing programmers will
all know the computer science of parsing, or that major langs will all
have a formal grammar spec. I'm pretty certain people are already
seeing the potential of PEG as a regex replacement and are working
towards that practical goal.

  Xah
∑ http://xahlee.org/
