From: Juergen Sauermann
Subject: Re: [Bug-apl] Regex support
Date: Thu, 21 Sep 2017 13:39:21 +0200
User-agent: Mozilla/5.0 (X11; Linux i686; rv:45.0) Gecko/20100101 Thunderbird/45.2.0

Hi Elias,

the UTF8_constructors look OK, but it can be tricky to properly interpret indices (the elements of sub in your code) of UTF8-encoded strings (i.e. whether they mean code points or byte offsets).

My feeling is that you should avoid UTF8_strings completely and go for the UTF32 option of the library (assuming that UTF32 strings are code points encoded as 32-bit integers). APL character strings are almost UTF32 strings (except for gaps between the code points) and they avoid all the bit shifting needed for UTF8 strings.

Best Regards,
/// Jürgen

On 09/21/2017 12:09 PM, Elias Mårtenson wrote:
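
For illustration only, here is a minimal Python sketch of the index ambiguity described above. It does not use the regex library from the thread or any GNU APL code; the helper name `byte_offset_to_codepoint_index` and the sample string are assumptions made up for this example. It shows that the same match position differs between a byte offset into UTF-8 data and a code point index, while in UTF-32 the byte offset is simply four times the code point index.

```python
def byte_offset_to_codepoint_index(utf8: bytes, byte_offset: int) -> int:
    """Convert a byte offset into UTF-8 data to a code point index.

    Hypothetical helper for this example. It decodes the prefix and counts
    its code points, so it raises UnicodeDecodeError if the offset falls
    in the middle of a multi-byte sequence.
    """
    return len(utf8[:byte_offset].decode("utf-8"))


text = "Mårtenson"                 # 'å' needs two bytes in UTF-8
utf8 = text.encode("utf-8")
utf32 = text.encode("utf-32-le")   # fixed 4 bytes per code point, no BOM

# Position of 't' measured in UTF-8 bytes versus in code points.
byte_off = utf8.find("t".encode("utf-8"))
cp_index = text.find("t")

# Position of 't' in the UTF-32 data, measured in bytes.
utf32_off = 4 * cp_index

print(byte_off, cp_index, byte_offset_to_codepoint_index(utf8, byte_off))
```

With UTF-8, every index coming back from a matcher has to be converted like this before it can be used to index an array of characters; with UTF-32 the conversion is a division by four.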