Re: From wchar_t to char32_t, new module mbszero
From: Bruno Haible
Subject: Re: From wchar_t to char32_t, new module mbszero
Date: Wed, 19 Jul 2023 22:05:46 +0200
Paul Eggert wrote:
> > On NetBSD, I apparently did not locate the right source code of the mbsinit
> > function, due to the complexity of the citrus code. And did not want to
> > debug it, because debugging in libc code without debugging information is
> > often a waste of time.
>
> I looked into it. I think it's due to the funky business with struct
> _RuneStatePriv. There's a pointer's worth of data before the data that
> the converters see, and until recently there was another character that
> ended up perhaps needing another pointer's worth of padding. Two
> pointers (8 bytes each), plus the 12 bytes we already saw, would explain
> the 28 bytes you observed.
In a 64-bit world, yes. But it doesn't explain why 28 bytes is also the minimum
needed on 32-bit NetBSD 8: there, with the ZERO_SIZE set to 24, I get an
'assertion "mbsinit (&iter->state)" failed'. Anyway, this is not a
high-priority platform.
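One way to measure such a minimum empirically is sketched below. This is only an illustration, not necessarily how the 28 bytes above were determined: the function name probe_min_zero_size is hypothetical, the 0xFF fill is just one possible garbage pattern (so the result is an estimate for that pattern only), and calling mbsinit on a state that is not fully valid goes beyond what the C standard specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical probe: return the smallest n such that zeroing only the
   first n bytes of an mbstate_t otherwise filled with 0xFF bytes makes
   mbsinit() report an initial state.  */
static size_t
probe_min_zero_size (void)
{
  for (size_t n = 0; n <= sizeof (mbstate_t); n++)
    {
      mbstate_t state;
      memset (&state, 0xFF, sizeof state);  /* garbage in every byte */
      memset (&state, 0, n);                /* zero only the prefix */
      if (mbsinit (&state))
        return n;
    }
  /* Not reached on a conforming libc: a fully zeroed mbstate_t is an
     initial conversion state.  */
  assert (0);
  return sizeof (mbstate_t);
}
```

The result depends on the platform and possibly on the current locale, so it is best treated as a hint to confirm against the libc sources.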
> I fixed mbcel.h for that by installing the attached patch into
> diffutils. Thanks for pointing it out.
> + /* These platforms have 128-byte mbstate_t. What were they thinking?
They were probably afraid of having to break binary compatibility when
needing to increase mbstate_t's size.
> >> This macro is not used anywhere. How about adding a comment explaining
> >> why it's defined but not used? Or if it's not needed we can remove it.
> >
> > It's needed, namely as lower bound for _GL_MBSTATE_ZERO_SIZE:
> > 0 < _GL_MBSTATE_INIT_SIZE <= _GL_MBSTATE_ZERO_SIZE <= sizeof (mbstate_t).
>
> That's merely documentation, right? That is, only comments use the lower
> bound.
Right.
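If one wanted the compiler rather than a comment to enforce this chain of inequalities, a sketch along the following lines could work. The macro names and values here are placeholders chosen for illustration (they are not the values gnulib uses on any particular platform), and example_bounds_hold is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Placeholder values, assumed for this sketch only.  */
#define EXAMPLE_INIT_SIZE 4
#define EXAMPLE_ZERO_SIZE sizeof (mbstate_t)

/* Compile-time form of
   0 < INIT_SIZE <= ZERO_SIZE <= sizeof (mbstate_t)  (needs C11).  */
_Static_assert (0 < EXAMPLE_INIT_SIZE, "INIT_SIZE must be positive");
_Static_assert (EXAMPLE_INIT_SIZE <= EXAMPLE_ZERO_SIZE,
                "INIT_SIZE must not exceed ZERO_SIZE");
_Static_assert (EXAMPLE_ZERO_SIZE <= sizeof (mbstate_t),
                "ZERO_SIZE must not exceed sizeof (mbstate_t)");

/* Run-time form of the same check, for arbitrary candidate sizes.  */
static int
example_bounds_hold (size_t init_size, size_t zero_size)
{
  return 0 < init_size
         && init_size <= zero_size
         && zero_size <= sizeof (mbstate_t);
}
```

With such assertions the lower bound would be used by code and not only by comments, at the cost of requiring a C11 compiler or an equivalent static-assert macro.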
> One thing I noticed on NetBSD 9.3 x86-64 (it has GCC 7), is that it
> didn't optimize memset calls away.
Strange. For me, it does. I tried gcc 7.5.0 both on a Linux+glibc system
and on NetBSD 9.3 x86_64, and the 'nm' output of the executable does not
reference 'memset'.
If you can reproduce it, feel free to change
  memset (ps, 0, _GL_MBSTATE_ZERO_SIZE);
to
  # if defined __GNUC__ && __GNUC__ >= 3
    __builtin_memset (ps, 0, _GL_MBSTATE_ZERO_SIZE);
  # else
    memset (ps, 0, _GL_MBSTATE_ZERO_SIZE);
  # endif
(__builtin_memset was already documented for GCC 3.0.4:
https://gcc.gnu.org/onlinedocs/gcc-3.0.4/gcc_5.html#SEC114 )
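For trying the suggested change in isolation, the conditional can be folded into a macro as in the sketch below. The names EXAMPLE_MEMSET, EXAMPLE_ZERO_SIZE and example_mbszero are hypothetical, and EXAMPLE_ZERO_SIZE is simply set to sizeof (mbstate_t) here, which is always safe but is not the platform-specific value under discussion.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Placeholder for this sketch; the real value is platform-specific.  */
#define EXAMPLE_ZERO_SIZE sizeof (mbstate_t)

/* Prefer the GCC builtin where available, so that the compiler can
   expand the call inline even if the library function would otherwise
   be called.  */
#if defined __GNUC__ && __GNUC__ >= 3
# define EXAMPLE_MEMSET __builtin_memset
#else
# define EXAMPLE_MEMSET memset
#endif

/* Put *ps into the initial conversion state.  */
static inline void
example_mbszero (mbstate_t *ps)
{
  EXAMPLE_MEMSET (ps, 0, EXAMPLE_ZERO_SIZE);
}
```

Whether the call is actually expanded inline can then be checked the same way as above, by looking for 'memset' in the 'nm' output of the resulting executable.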
Bruno