Re: From wchar_t to char32_t
From: Paul Eggert
Subject: Re: From wchar_t to char32_t
Date: Thu, 6 Jul 2023 17:36:16 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0

On 2023-07-06 11:34, Bruno Haible wrote:
> Indeed, this is the solution that makes no assumptions. Find a patch that
> does it.

Thanks, I think. I installed that, and in reviewing it found a minor
glitch or two and some opportunities for simplification. I installed the
attached further patch which I hope fixes glitches without breaking
anything else.

> I had expected that the replacement mbrtowc -> mbrtoc32 would be purely
> mechanical; I'm surprised that it requires application-specific considerations
> here.

Yes, the (size_t) -3 change can be quite a pain.
In my review I found another only-theoretical problem involving
locking-shift sequences. The patch adds a brief comment about one of
them. However, I'm not planning to lose any sleep over this, as you and
I have already spent more time on theoretical platforms than they're
likely worth.
One other thing I discovered in my review. POSIX says that 'diff' need
not support locking-shift sequences[1], and this business of mbrtoc32
returning (size_t) -3 is in a murky area as it would appear to fall into
the locking-shift sequence category (at any rate, it doesn't appear to
be a single-shift encoding which is POSIX's only other option for
state-dependent encodings). Or maybe the next version of POSIX will have
to change in this area? Either way, I wouldn't be surprised if support
for (size_t) -3 turned out not to be a POSIX conformance issue for GNU
'diff', even on theoretical platforms where (size_t) -3 is possible.
[1]:
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap06.html#tag_06_02
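
For anyone following along, here is a minimal sketch of why the
mbrtowc -> mbrtoc32 switch is not purely mechanical. It is not taken from
the attached patch; the function name count_c32 and its error policy
(treat each undecodable byte as one unit) are assumptions made only for
illustration. The point is the extra (size_t) -3 case, where mbrtoc32
stores a character but consumes no input:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/* Hypothetical helper, for illustration only: count the char32_t
   values produced by decoding the N bytes at S in the current locale.
   Each byte that cannot be decoded counts as one unit.  */
size_t
count_c32 (char const *s, size_t n)
{
  mbstate_t mbs;
  memset (&mbs, 0, sizeof mbs);   /* start in the initial shift state */

  char const *p = s;
  char const *lim = s + n;
  size_t count = 0;

  while (p < lim)
    {
      char32_t c;
      size_t r = mbrtoc32 (&c, p, lim - p, &mbs);

      if (r == (size_t) -3)
        {
          /* A further char32_t resulting from earlier input was
             stored in C; no bytes were consumed, so P stays put.
             mbrtowc never returns this value.  */
          count++;
          continue;
        }

      if (r == (size_t) -1 || r == (size_t) -2)
        {
          /* Encoding error or incomplete character: assumed policy
             is to reset the state and skip a single byte.  */
          memset (&mbs, 0, sizeof mbs);
          p++;
          count++;
          continue;
        }

      /* R == 0 means a null character was decoded; it still
         occupies one byte of input.  */
      assert (r <= (size_t) (lim - p));
      p += r ? r : 1;
      count++;
    }

  /* Limitation of this sketch: if the last byte leaves characters
     pending (a later call would return (size_t) -3), they are not
     drained here.  */
  return count;
}
```

On platforms where mbrtoc32 never returns (size_t) -3 the first branch is
dead code, which is why this is only a theoretical concern.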

Attachment: 0001-diff-simplify-recent-mbrtoc32-improvement.patch
Description: Text Data