[Chicken-users] regex and named subpatterns
From: Hans Nowak
Subject: [Chicken-users] regex and named subpatterns
Date: Wed, 05 Mar 2008 20:42:23 -0500
User-agent: Thunderbird 2.0.0.0 (Macintosh/20070326)
Hi,
In Python (and Perl, etc.) it's possible to name subpatterns (or "groups") in a
regular expression. For example:
In [1]: import re
In [2]: s = "the quick brown fox"
In [3]: rx = re.compile("(?P<foo>quick)")
In [4]: m = rx.search(s)
In [6]: m.groups()
Out[6]: ('quick',)
In [7]: m.groupdict()
Out[7]: {'foo': 'quick'}
I can access the group named 'foo' using the groupdict() method. Is there a way
to do this in Chicken as well? Apparently Chicken's regexen support the
(?P<name>...) syntax, but I don't know if or how the groups can be accessed by name:
#;1> (use regex)
; loading library regex ...
#;2> (define s "the quick brown fox")
#;3> (define rx (regexp "(?P<foo>quick)"))
#;4> (string-search rx s)
("quick" "quick")
#;5> (string-search-positions rx s)
((4 9) (4 9))
;; ...now what? :-)
Any tips welcome...
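For comparison, here is a minimal Python sketch of what "access by name" amounts to on the Python side. It says nothing about what Chicken's regex unit offers. It only shows that a compiled Python pattern keeps a name-to-number table (groupindex), so a named group is an ordinary numbered group plus a lookup. The variable names name_to_index, by_name, by_number and span are made up for illustration.

```python
import re

s = "the quick brown fox"
rx = re.compile(r"(?P<foo>quick)")
m = rx.search(s)

# groupindex maps each group name to its numeric position in the pattern.
name_to_index = dict(rx.groupindex)

# A named group can be read by name or by the number it maps to.
by_name = m.group("foo")
by_number = m.group(name_to_index["foo"])

# Start and end offsets of the named group within s.
span = m.span("foo")
```

If a similar name-to-number table could be obtained for a Chicken regexp, the positional results of string-search and string-search-positions would be enough to look groups up by name.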
In other "news", I am preparing a small module that provides a Python-like
"match object". It has a small number of functions (regex-search,
regex-match, regex-find-all) that return these match objects (or #f if there
is no match), which are records with accessor functions. In some cases,
this might be clearer than dealing with lists of lists of numbers and having
to extract the corresponding substrings by hand, at least to my Python-addled
brain. :-) For example, you can do:
#;1> (define s "Bob ate 5 hamburgers")
#;3> (define rx "^(\\S+).*?(\\d+)\\s*(\\S+)s")
#;4> (define m (regex-search rx s))
#;5> m
#<rxmatch>
#;6> (rxmatch-groups m)
(("Bob" 0 3) ("5" 8 9) ("hamburger" 10 19))
#;9> (rxmatch-group m 1)
("Bob" 0 3)
#;10> (rxmatch-group-string m 3)
"hamburger"
...etc. I don't know if this will be useful to anyone else, but if it is, I
will make it available (after a lot of polishing and debugging, no doubt).
Suggestions welcome.
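For readers who know the Python side better, the (string start end) triples above correspond roughly to the following sketch built on Python's own match objects. The helper name group_triples is hypothetical and only mirrors the shape of the rxmatch-groups output. It is not part of the module described here.

```python
import re

def group_triples(m):
    """Return [(text, start, end), ...] for each numbered group of a match."""
    # m.re.groups is the number of capturing groups in the compiled pattern.
    return [(m.group(i), m.start(i), m.end(i))
            for i in range(1, m.re.groups + 1)]

s = "Bob ate 5 hamburgers"
m = re.search(r"^(\S+).*?(\d+)\s*(\S+)s", s)
triples = group_triples(m)
```

Keeping the text and its offsets together in one record is the design choice being illustrated: the caller never has to slice the original string by hand.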
--Hans