diffutils-devel


Re: From wchar_t to char32_t


From: Paul Eggert
Subject: Re: From wchar_t to char32_t
Date: Thu, 6 Jul 2023 17:36:16 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0

On 2023-07-06 11:34, Bruno Haible wrote:

> Indeed, this is the solution that makes no assumptions. Find a patch that
> does it.

Thanks, I think. I installed that, and in reviewing it found a minor glitch or two and some opportunities for simplification. I installed the attached further patch which I hope fixes glitches without breaking anything else.


> I had expected that the replacement mbrtowc -> mbrtoc32 would be purely
> mechanical; I'm surprised that it requires application-specific considerations
> here.

Yes, the (size_t) -3 change can be quite a pain.
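
For readers following along, here is a rough sketch of why the extra return value stops the conversion from being mechanical. It is not the code from the patch: count_chars is a hypothetical helper, and its policy of counting each undecodable byte as one character is only an assumption made for illustration. The point is that mbrtoc32 has a (size_t) -3 case that mbrtowc lacks, in which a character is produced without consuming any input bytes.

```c
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/* Hypothetical helper, not taken from diffutils: count the char32_t
   values that mbrtoc32 produces for BUF[0..LEN-1].  Each undecodable
   byte is counted as one character (an assumption of this sketch).  */
size_t
count_chars (char const *buf, size_t len)
{
  mbstate_t state;
  memset (&state, 0, sizeof state);
  size_t count = 0;
  size_t i = 0;

  while (i < len)
    {
      char32_t c;
      size_t n = mbrtoc32 (&c, buf + i, len - i, &state);

      if (n == (size_t) -3)
        {
          /* Another char32_t from bytes already consumed; do not
             advance.  This case does not exist with mbrtowc.  */
          count++;
        }
      else if (n == (size_t) -2)
        {
          /* Incomplete character at end of buffer; count the
             remaining bytes as one character and stop.  */
          count++;
          break;
        }
      else if (n == (size_t) -1)
        {
          /* Invalid byte; reset the state and skip one byte.  */
          memset (&state, 0, sizeof state);
          count++;
          i++;
        }
      else
        {
          /* A return of 0 means a null character, which still
             occupies one byte of input.  */
          count++;
          i += n ? n : 1;
        }
    }

  /* Any char32_t still pending after the last byte is not counted;
     a real caller would have to decide how to handle that.  */
  return count;
}
```

On platforms where (size_t) -3 never occurs, the first branch is dead code and the loop behaves like its mbrtowc counterpart.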

In my review I found another only-theoretical problem involving locking-shift sequences. The patch adds a brief comment about one of them. However, I'm not planning to lose any sleep over this, as you and I have already spent more time on theoretical platforms than they're likely worth.

One other thing I discovered in my review: POSIX says that 'diff' need not support locking-shift sequences[1], and this business of mbrtoc32 returning (size_t) -3 is in a murky area, as it would appear to fall into the locking-shift sequence category (at any rate, it doesn't appear to be a single-shift encoding, which is POSIX's only other option for state-dependent encodings). Or maybe the next version of POSIX will have to change in this area? Either way, I wouldn't be surprised if support for (size_t) -3 turned out not to be a POSIX conformance requirement for GNU 'diff', even on theoretical platforms where (size_t) -3 is possible.

[1]: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap06.html#tag_06_02

Attachment: 0001-diff-simplify-recent-mbrtoc32-improvement.patch
Description: Text Data

