Re: [PATCH 1/3] Make string-length documentation more correct
From: tomas
Subject: Re: [PATCH 1/3] Make string-length documentation more correct
Date: Wed, 26 Jun 2024 14:07:49 +0200

On Wed, Jun 26, 2024 at 01:46:28PM +0200, Maxime Devos wrote:
>
> >> >-Returns the number of characters in the given @var{string}.
> >> +Returns the number of bytes in the given @var{string}.
> >>
> >> This is false. For example, (string-length "š") is 1, whereas in all
> >> encodings I know of it is more than one byte. Also, R5RS says: [...]
> >
> >Maybe `the number of codepoints` will work here.
> >
> >(string-length "šØāš") ;; => 3
> >(string-length "é") ;; => 2
> >
> >The number of characters here is 1 in both cases.
>
> No, in Unicode (and Guile equates character=Unicode character) all characters
> correspond to a single codepoint.

It's more subtle than that: Unicode knows about "combining characters",
so it's quite possible that Andrew's "é" consists of two code points
(FWIW, it arrives to me as just one, but perhaps there was some
canonicalization [1] step in between).
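
To make the difference concrete, here is a minimal sketch for Guile (2.0 or later is assumed). The names precomposed, decomposed and utf8-byte-count are made up for this illustration, and both strings are built from code point numbers so that no re-encoding along the way can change what is being measured:

```scheme
(use-modules (rnrs bytevectors))   ; string->utf8, bytevector-length

;; The same visible "e with acute accent", written two ways.
(define precomposed (string (integer->char #xE9)))        ; U+00E9
(define decomposed  (string #\e (integer->char #x301)))   ; U+0065 + U+0301

;; Hypothetical helper: size of the string once encoded as UTF-8.
(define (utf8-byte-count s)
  (bytevector-length (string->utf8 s)))

;; string-length counts code points, not bytes and not "visible" characters.
(display (string-length precomposed)) (newline)   ; prints 1
(display (string-length decomposed))  (newline)   ; prints 2

;; The byte counts differ from both of the above.
(display (utf8-byte-count precomposed)) (newline) ; prints 2
(display (utf8-byte-count decomposed))  (newline) ; prints 3

;; Canonical composition (NFC) folds the pair into one code point.
(define recomposed (string-normalize-nfc decomposed))
(display (string-length recomposed)) (newline)        ; prints 1
(display (string=? precomposed recomposed)) (newline) ; prints #t
```

So a normalization step somewhere between sender and receiver would be enough to turn a length of 2 into a length of 1 without the text looking any different.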

ISTR that "Unicode character" is actually synonymous with "Unicode
code point" -- but the common meaning of "character" is fuzzier. Perhaps
it's wise to avoid that word when trying to be precise.

Cheers
[1] https://en.wikipedia.org/wiki/Unicode_normalization
--
t