Re: [Chicken-users] Regex fail?
From: Peter Bex
Subject: Re: [Chicken-users] Regex fail?
Date: Fri, 30 Oct 2015 08:45:03 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

On Thu, Oct 29, 2015 at 09:12:44PM -0700, Matt Welland wrote:
> (string-match "^([^\n]*)(\n.*|).*$" "This\nis \n")
> => #f
>
> Using Ruby as comparison:
>
> irb(main):001:0> "This\nis \n".match(/^([^\n]*)(\n.*|)$/)
> => #<MatchData "This\nis " 1:"This" 2:"\nis ">
Interesting! This seems to be a problem in the way string->sre works:
#;10> (string->sre "^([^\n]*)(\n.*|).*$")
(seq bos (submatch (* (/ #\xe000 #\x10ffff #\vtab #\xd7ff #\null #\tab)))
(submatch (or (seq "\n" (* nonl)) epsilon)) (* nonl) eos)
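Note the nonl, which the manual states is equivalent to ".", but of
course nonl means "no newline". Since neither (* nonl) can consume the
final "\n" of the input, eos is never reached and the match fails.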
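
To see the difference in isolation, here is a minimal sketch comparing the
two SRE forms on a string containing a newline. It assumes CHICKEN 4 with
the bundled irregex unit; the names text, nonl-match and any-match are only
for illustration.

```scheme
;; CHICKEN 4; on CHICKEN 5 this would be (import (chicken irregex)) instead.
(use irregex)

(define text "a\nb")

;; irregex-match must match the whole string.
;; nonl excludes the newline, so the match fails.
(define nonl-match (irregex-match '(* nonl) text))

;; any includes the newline, so the whole string matches.
(define any-match (irregex-match '(* any) text))

(write nonl-match)                              ; #f
(newline)
(write (irregex-match-substring any-match 0))   ; "a\nb"
(newline)
```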
You can work around this by using the SRE directly:
#;12> (irregex-match '(seq bos (submatch (* (~ "\n")))
                           (submatch (or (seq "\n" (* any)) epsilon))
                           (* any) eos)
                     "This\nis \n")
#<regexp-match (2 submatches)>
#;13> (irregex-match-substring #12 1)
"This"
#;14> (irregex-match-substring #12 2)
"\nis \n"
Fixing this in irregex would be trivial, but I guess there's a *reason*
why "." is considered the same as 'nonl.
Maybe Alex can give us some info about why this is the case? I think this
may have something to do with the multi-line / single-line distinction
(which, to be honest, I never really understood).
Cheers,
Peter