Re: [Chicken-users] irregex and callbacks

From: Alex Shinn
Subject: Re: [Chicken-users] irregex and callbacks
Date: Thu, 2 Oct 2014 11:18:55 +0900

On Thu, Oct 2, 2014 at 8:22 AM, Andy Bennett <address@hidden> wrote:

I am trying to use the browscap database to identify HTTP User Agents.

This database consists of a (large) number of regexes and data about the
browser should the user agent string match that regex.

What I want to do is compile all the regexes together and be able to add
annotations such that I can match a UA string against this regex and get
back an idea of which pattern matched so that I can look up the
appropriate data.

i.e. I have a data structure keyed by "pattern" and I want my input
to be something that matches that pattern rather than the pattern itself.

It seems that for this I need "Callbacks" but I don't really need full
callback support: I don't necessarily need to call an actual procedure
and I don't need to replace anything: I'm not doing a search/replace,
just a match. "All" I really need is to be able to annotate the FSM node
that matched with a little bit of data that I can get back.

You could use submatch info and check which submatch matched.
This would keep the matching as a single regexp, but you'd then
need a linear scan to see which submatch succeeded.

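;; Merge a vector of SRE patterns into a single regexp, wrapping each
;; alternative in a submatch named `alt'.
(define (irregex-merge-vector vec)
  (irregex `(or ,@(map (lambda (x) `(=> alt ,x)) (vector->list vec)))))

(define ua-vec ...)
(define all-ua-rx (irregex-merge-vector ua-vec))

;; Return the element of ua-vec whose pattern matched, or #f.
(define (maybe-match-ua ua)
  (cond
   ((irregex-match all-ua-rx ua)
    => (lambda (m)
         ;; Submatch 0 is the whole match, so alternative i is submatch
         ;; i+1 (assuming the patterns have no submatches of their own).
         (vector-ref ua-vec
                     (- (irregex-match-numeric-index 'match-ua m '(alt)) 1))))
   (else #f)))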

although I believe irregex-match-numeric-index is not exported.
It's worth having a utility for this idiom.
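
Until such a utility exists, the linear scan can probably be written with exported procedures only. The following is a rough sketch, not tested against the browscap data: merge-patterns, first-matched-submatch, match-pattern-index and the example patterns are made-up names for illustration, and it assumes the patterns are SREs that contain no submatches of their own.

```scheme
(import (chicken irregex)) ; CHICKEN 5; on CHICKEN 4 use (use irregex) instead

;; Hypothetical helper: wrap each SRE in its own numbered submatch.
(define (merge-patterns vec)
  (irregex `(or ,@(map (lambda (x) `(submatch ,x)) (vector->list vec)))))

;; Linear scan: number of the first submatch that took part in the
;; match, or #f if none did.  Submatch 0 is the whole match, so start at 1.
(define (first-matched-submatch m)
  (let ((n (irregex-match-num-submatches m)))
    (let loop ((i 1))
      (cond ((> i n) #f)
            ((irregex-match-substring m i) i)
            (else (loop (+ i 1)))))))

;; 0-based index into the pattern vector, or #f when nothing matches.
(define (match-pattern-index rx str)
  (cond ((irregex-match rx str)
         => (lambda (m)
              (let ((i (first-matched-submatch m)))
                (and i (- i 1)))))
        (else #f)))

;; Illustrative patterns only, not taken from browscap.
(define example-patterns
  '#((: "curl/" (* any))
     (: "Wget/" (* any))
     (: "Mozilla/" (* any))))
(define example-rx (merge-patterns example-patterns))
```

Calling (match-pattern-index example-rx ua) then gives an index that can be passed to vector-ref on whatever vector holds the per-pattern data. If a pattern does contain its own submatches, the numbering shifts and the named-submatch approach above is the safer one.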


Is this something that would be easy to add to irregex or can anyone
suggest any other alternative implementations that I might consider?

The PHP library that uses this browscap database (apparently) just does
a linear search by trying to match each regex in turn but I'd rather
keep that approach as a last resort.

Thanks for your help and any tips you can offer.


