Re: [Chicken-users] Using irregex safely & responsibly


From: Peter Bex
Subject: Re: [Chicken-users] Using irregex safely & responsibly
Date: Mon, 11 Oct 2010 09:51:15 +0200
User-agent: Mutt/1.4.2.3i

On Mon, Oct 11, 2010 at 01:17:49PM +0900, Alex Shinn wrote:
> > The valid-index? predicate does not return a boolean #t value:
> >
> > #;9> (irregex-match-valid-index? m 3)
> > 0
> 
> It returns #t for this in the upstream irregex.

I'll look into that. It's probably a bug introduced by a
Chicken-specific optimization.
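
For what it's worth, the 0 is harmless in conditionals, since every
value other than #f counts as true in Scheme. It only bites code that
compares the result against #t. A minimal sketch, with a literal 0
standing in for the result shown above (no irregex call is made, and
the variable names are made up for illustration):

```scheme
;; `result` stands in for the 0 returned in the transcript above.
(define result 0)

;; Any non-#f value is true in a conditional:
(define as-test (if result 'valid 'invalid))   ; => valid

;; An explicit comparison against #t fails:
(define is-true-object (eq? result #t))        ; => #f

;; Normalizing to a strict boolean:
(define normalized (and result #t))            ; => #t
```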

> *-valid-index? just states whether the submatch _may_ exist.
> 
> We could add a utility irregex-match-matched-index? to test
> if a specific index was successfully matched.

That's a horrible name.  I think we wouldn't need this if
the procedures just returned #f when the submatch didn't match.

> An index which could never be a valid submatch should
> arguably always throw an error.

Agreed.

> An index which is valid, but failed to match, could either
> throw an error or return #f.  The -index and -substring
> operations are inconsistent in this respect, so we should
> fix that.

IMHO they should all behave like -substring: return #f if
the submatch didn't match.

> It may be good to provide both sets, with a /default version
> analogous to SRFI-69 hash-table-ref and
> hash-table-ref/default:
> 
>   (irregex-match-substring <m> <invalid-i>)    => error
>   (irregex-match-substring <m> <unmatched-i>)  => error
> 
>   (irregex-match-substring/default <m> <invalid-i> #f)    => error
>   (irregex-match-substring/default <m> <unmatched-i> #f)  => #f
> 
> Thoughts?

I think this is pointless.  The hash table has a way to specify a
default value because it's possible to have #f as a value in your
hash table, which makes returning #f ambiguous.  That's why there's
a way to specify the default.
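
To illustrate the ambiguity, here is a small sketch using SRFI-69.
It assumes the srfi-69 library is available (on newer CHICKEN
releases it is a separate egg and is loaded with import, as below;
older releases load it differently).  The table name and keys are
made up for the example:

```scheme
(import srfi-69)

(define flags (make-hash-table))
(hash-table-set! flags 'verbose #f)    ; #f stored as a real value

;; With #f as the default, a stored #f and a missing key look the same:
(define stored  (hash-table-ref/default flags 'verbose #f))   ; => #f
(define missing (hash-table-ref/default flags 'missing #f))   ; => #f

;; A caller-chosen default tells the two cases apart:
(define stored*  (hash-table-ref/default flags 'verbose 'absent)) ; => #f
(define missing* (hash-table-ref/default flags 'missing 'absent)) ; => absent
```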

However, in the case of the substring and index operations, a
successful result is always a string or an integer, respectively.
Returning #f is completely unambiguous in those cases, so I don't
see the need to add yet another procedure.

It would be preferable to have this behaviour:

 (irregex-match-substring <m> <invalid-i>)    => error
 (irregex-match-substring <m> <unmatched-i>)  => #f

 (irregex-match-start-index <m> <invalid-i>)    => error
 (irregex-match-start-index <m> <unmatched-i>)  => #f
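
A rough sketch of that contract, using a toy stand-in for match data
rather than irregex's real representation.  Everything named toy-*
is hypothetical: a "match" here is just a vector whose slot i holds
(start . end) for submatch i, or #f when group i exists in the
pattern but did not take part in the match:

```scheme
;; Look up the span for submatch i.  An index outside the pattern's
;; groups can never match, so it is an error; a valid but unmatched
;; group simply yields #f.
(define (toy-match-span m i)
  (if (and (integer? i) (exact? i) (< -1 i (vector-length m)))
      (vector-ref m i)
      (error "toy-match: not a valid index" i)))

(define (toy-match-start-index m i)
  (let ((span (toy-match-span m i)))
    (and span (car span))))

(define (toy-match-substring str m i)
  (let ((span (toy-match-span m i)))
    (and span (substring str (car span) (cdr span)))))

;; Models a pattern like a(b)?c matched against "ac":
;; group 0 spans the whole string, group 1 is valid but unmatched.
(define m (vector (cons 0 2) #f))

(toy-match-substring "ac" m 0)     ; => "ac"
(toy-match-substring "ac" m 1)     ; => #f
(toy-match-start-index m 1)        ; => #f
;; (toy-match-substring "ac" m 2)  ; would signal "not a valid index"
```

Callers can then write (or (toy-match-substring str m 1) "") or
similar, without needing a separate /default procedure.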

Cheers,
Peter
-- 
http://sjamaan.ath.cx
--
"The process of preparing programs for a digital computer
 is especially attractive, not only because it can be economically
 and scientifically rewarding, but also because it can be an aesthetic
 experience much like composing poetry or music."
                                                        -- Donald Knuth


