Re: [PATCH 1/3] Make string-length documentation more correct

guile-devel


From:	tomas
Subject:	Re: [PATCH 1/3] Make string-length documentation more correct
Date:	Wed, 26 Jun 2024 14:07:49 +0200

On Wed, Jun 26, 2024 at 01:46:28PM +0200, Maxime Devos wrote:
> 
> >>  >-Returns the number of characters in the given @var{string}.
> >> +Returns the number of bytes in the given @var{string}.
> >>  
> >> This is false. For example, (string-length "😀") is 1, whereas in all
> >> encodings I know of it is more than one byte. Also, R5RS says: [...]
> >
> >Maybe `the number of codepoints` will work here.
> >
> >(string-length "👨‍🏭") ;; => 3
> >(string-length "é") ;; => 2
> >
> >The number of characters here is 1 in both cases.
> 
> No, in Unicode (and Guile equates character=Unicode character) all characters 
> correspond to a single codepoint.

It's more subtle than that: Unicode knows about "combining characters",
so it's quite possible that Andrew's "é" consists of two code points
(FWIW, it arrived here as just one, but perhaps there was some
canonicalization [1] step in between).
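
To make that concrete, here is a small sketch one could paste into a Guile
REPL. It assumes a Guile that provides string-normalize-nfc and
string-normalize-nfd (I believe that is the case since 2.0), and it builds
the strings from code point numbers so that mail or editor re-encoding
cannot interfere. The names "decomposed" and "precomposed" are just mine.

```scheme
;; "é" spelled two ways, built from code point numbers rather than literals.
(define decomposed                       ; e + U+0301 COMBINING ACUTE ACCENT
  (string #\e (integer->char #x301)))
(define precomposed                      ; U+00E9 LATIN SMALL LETTER E WITH ACUTE
  (string (integer->char #xE9)))

;; string-length counts code points, so the two spellings differ.
(display (string-length decomposed)) (newline)    ; prints 2
(display (string-length precomposed)) (newline)   ; prints 1

;; Normalization converts between the two forms.
(display (string-length (string-normalize-nfc decomposed))) (newline)   ; prints 1
(display (string-length (string-normalize-nfd precomposed))) (newline)  ; prints 2

;; Plain comparison sees different strings until both are normalized.
(display (string=? decomposed precomposed)) (newline)                         ; prints #f
(display (string=? (string-normalize-nfc decomposed) precomposed)) (newline)  ; prints #t
```

So the same user-visible "é" can have length 1 or 2 depending on which
form happens to reach string-length.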

ISTR that "Unicode character" is actually synonymous with "Unicode
code point" -- but the common meaning of "character" is fuzzier. Perhaps
it's wise to avoid that word when trying to be precise.

Cheers

[1] https://en.wikipedia.org/wiki/Unicode_normalization

-- 
t



