From: | Charles Hixson |
Subject: | Re: [Chicken-users] Neophyte in scheme: string-split not quite what I want |
Date: | Fri, 20 Jul 2012 11:19:22 -0700 |
User-agent: | Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.16) Gecko/20120613 Icedove/3.0.11 |
On 07/20/2012 04:05 AM, Дмитрий wrote:
> Hello. Does IrRegex support Unicode character classes? E.g., will IrRegex consider accented letters (á) or Cyrillic letters (я) as "alpha"? Will IrRegex consider the Chinese wide space ( ) as "space"? Will IrRegex consider Chinese brackets (「」【】) as "punct"? If it doesn't, the regexp is going to be EXTREMELY messy [in fact, I believe it may be better to build such a regexp automatically then]. I'm on Windows, so I can't check it (when I use a UTF-8 console via chcp 65001, for some reason Chicken seems to fail on every operation with a non-ASCII string, even a simple (display "Привет")).
> --
> Yours sincerely,
> Dmitry Kushnariov

As I said, I'm a neophyte. My "character classes" were based around [a-zA-z] etc., so you can readily see why the pattern would quickly have become unreasonably complex. I didn't find any definition of other character classes (well, not one that meant anything to me), and given the discussion, I think they wouldn't have worked even if I'd gotten to the point of testing them.
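For illustration only, here is a minimal sketch of the two issues raised above, written in Python rather than Chicken, so it says nothing about what IrRegex itself supports. It shows that a range written as [a-zA-z] accepts more than letters, and what classifying characters by Unicode property (rather than by hand-written ranges) looks like. The helper name classify and the sample characters are made up for this example.

```python
import re
import unicodedata

# The range A-z also covers the six ASCII characters that sit between
# 'Z' and 'a' ([ \ ] ^ _ `), so this class accepts more than letters.
typo_class = re.compile(r"[a-zA-z]")
ascii_class = re.compile(r"[a-zA-Z]")


def classify(ch):
    """Return 'alpha', 'space', 'punct', or 'other' from Unicode properties.

    Hypothetical helper for this example; it is not part of any regex library.
    """
    if ch.isalpha():
        return "alpha"
    if ch.isspace():
        return "space"
    # All Unicode punctuation general categories start with 'P' (Ps, Pe, Pc, ...).
    if unicodedata.category(ch).startswith("P"):
        return "punct"
    return "other"


# Accented Latin, Cyrillic, ideographic space (U+3000), CJK brackets, underscore.
samples = ["á", "я", "\u3000", "「", "】", "_"]
results = {ch: classify(ch) for ch in samples}
```

With these definitions, typo_class matches "_" while ascii_class does not, and neither ASCII class matches "я". The property-based classify reports the accented and Cyrillic letters as "alpha", U+3000 as "space", and the CJK brackets as "punct", with no per-script ranges to maintain.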
I was planning on using Chicken to learn Scheme, since R7RS is supposed to be based more on R5RS than on R6RS, but maybe it's better to learn using Racket. I *trust* the conversion won't be too difficult. (I *do* need to use UTF-8 in lots of places, and an incomplete implementation while I was learning would be ... unpleasant, particularly if the user documentation presumed that it *was* complete.)
-- Charles Hixson