Re: [Chicken-users] Regex fail?


From: Peter Bex
Subject: Re: [Chicken-users] Regex fail?
Date: Fri, 30 Oct 2015 08:45:03 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Thu, Oct 29, 2015 at 09:12:44PM -0700, Matt Welland wrote:
> (string-match "^([^\n]*)(\n.*|).*$" "This\nis \n")
> => #f
> 
> Using Ruby as comparison:
> 
> irb(main):001:0> "This\nis \n".match(/^([^\n]*)(\n.*|)$/)
> => #<MatchData "This\nis " 1:"This" 2:"\nis ">

Interesting!  This seems to be a problem in the way string->sre works:

#;10> (string->sre  "^([^\n]*)(\n.*|).*$")
(seq bos (submatch (* (/ #\xe000 #\x10ffff #\vtab #\xd7ff #\null #\tab))) 
(submatch (or (seq "\n" (* nonl)) epsilon)) (* nonl) eos)

Note the nonl.  The manual states that it is equivalent to ".", but of
course nonl means "no newline", so neither ".*" in the pattern can
consume the final "\n" of the input and the match fails.

You can work around this by using the SRE directly:

#;12> (irregex-match
       '(seq bos
             (submatch (* (~ "\n")))
             (submatch (or (seq "\n" (* any)) epsilon))
             (* any)
             eos)
       "This\nis \n")
#<regexp-match (2 submatches)>
#;13> (irregex-match-substring #12 1)
"This"
#;14> (irregex-match-substring #12 2)
"\nis \n"

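If you would rather keep the string syntax, a possible alternative is
irregex's documented 's (single-line) option, which lets "." match a
newline as well.  This is only a sketch: it assumes CHICKEN 4 with the
irregex unit loaded via (use irregex), and the names rx and m are made up
for the example.  It goes through irregex directly rather than through
string-match.

```scheme
;; Sketch, assuming CHICKEN 4; on CHICKEN 5 the equivalent
;; would be (import (chicken irregex)).
(use irregex)

;; Same pattern string as in the original report, compiled with the
;; 's (single-line) option so that "." may also match "\n".
;; rx and m are hypothetical names used only in this example.
(define rx (irregex "^([^\n]*)(\n.*|).*$" 's))

;; irregex-match must match the whole input string.
(define m (irregex-match rx "This\nis \n"))

;; Print the two submatches, as in the SRE session above.
(write (irregex-match-substring m 1)) (newline)
(write (irregex-match-substring m 2)) (newline)
```

With the option set, the pattern should translate to much the same SRE as
the hand-written one above, with any in place of nonl.
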
Fixing this in irregex would be trivial, but I guess there's a *reason*
why "." is considered the same as 'nonl.

Maybe Alex can give us some info about why this is the case?  I think this
may have something to do with the multi-line / single-line distinction
(which, to be honest, I never really understood).

Cheers,
Peter
