From: Bruno Haible
Subject: Re: From wchar_t to char32_t, new module mbszero
Date: Sun, 16 Jul 2023 10:43:34 +0200

Paul Eggert wrote:
> > By reading the source code of FreeBSD, NetBSD, OpenBSD, macOS, Solaris,
> > and so on, I can easily determine
> > - which parts of the mbstate_t mbsinit() tests,
> > - which parts of the mbstate_t the various functions use.
> > But in order to understand what interdependencies there are, between
> > the various mbstate_t fields, and what are the assumed invariants,
> > I would need to carefully read each of the mentioned files (one per
> > OS and per locale type).
>
> Yes, and I did that for mbcel - that is, I looked at the source code for
> every coding system used by mbrtoc32 on NetBSD, OpenBSD, FreeBSD,
> Darwin, and DragonFly. The analysis was not as hard as one might think,
> as mbrtoc32 quickly decides whether the state is initial, and mbrtoc32
> is all that matters for mbcel.
>
> I doubt whether other primitives like mbrlen would differ, though I did
> not check this. Also, it's possible I made a mistake in analyzing
> mbrtoc32, though I hope that's unlikely.

I repeated that analysis, more carefully than before, and found
that for macOS, FreeBSD, NetBSD, OpenBSD, and Solaris, zeroing the first
12 bytes of the mbstate_t should be sufficient. (Like you said.)
However, after implementing mbszero with this data and enabling its use
in many places, I got test failures on NetBSD and Solaris.
- On NetBSD, the minimum we need to clear is 28 bytes.
- On Solaris OmniOS and OpenIndiana, the minimum we need to clear is 16 bytes.
- On proprietary Solaris, the minimum we need to clear is 20 or 28 bytes
  (depending on 32-bit or 64-bit mode).
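
To make the idea concrete, here is a minimal sketch of what "clearing only a prefix of the mbstate_t" can look like. It is not the code from the attached patches: the names SKETCH_MBSTATE_ZERO_SIZE and sketch_mbszero are hypothetical, the only platform-specific value used is the NetBSD figure quoted above, and every other platform falls back to clearing the whole object.

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical macro: number of leading bytes of an mbstate_t to clear.
   28 is the NetBSD figure reported in this message; everywhere else this
   sketch conservatively clears the whole object.  */
#if defined __NetBSD__
# define SKETCH_MBSTATE_ZERO_SIZE 28
#else
# define SKETCH_MBSTATE_ZERO_SIZE sizeof (mbstate_t)
#endif

/* Hypothetical helper: put *PS into the initial conversion state by
   zeroing only the bytes that the platform is believed to look at.  */
static inline void
sketch_mbszero (mbstate_t *ps)
{
  memset (ps, 0, SKETCH_MBSTATE_ZERO_SIZE);
}
```

A caller would write `mbstate_t state; sketch_mbszero (&state);` where it previously wrote `memset (&state, 0, sizeof state);`. The saving only matters on platforms where mbstate_t is much larger than the part the libc actually inspects.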
So, clearly this is fragile stuff. I'm committing it nevertheless, since it
seems that we have good enough test coverage to detect future changes.
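
As an illustration of the kind of check that can catch a too-small prefix, the following sketch poisons an mbstate_t, zeroes only its first n bytes, and then asks whether the state behaves like an initial one. The function name prefix_zero_looks_initial is hypothetical and this is not one of the existing unit tests. Note also that handing a partly garbage state to mbrtowc is outside what the C standard defines, so this is only a probe, not a proof.

```c
#include <stdbool.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical probe: return true if an mbstate_t whose first N bytes
   are zero and whose remaining bytes are garbage still behaves like an
   initial state for a single ASCII character.  */
static bool
prefix_zero_looks_initial (size_t n)
{
  mbstate_t state;
  wchar_t wc;

  if (n > sizeof state)
    return false;

  memset (&state, 0x55, sizeof state);  /* simulate leftover garbage */
  memset (&state, 0, n);                /* clear only the prefix */

  if (!mbsinit (&state))
    return false;

  /* From an initial state, the byte 'a' must convert to one wide
     character.  */
  return mbrtowc (&wc, "a", 1, &state) == 1 && wc == L'a';
}
```

A passing result for some n does not show that n bytes are enough in every locale and for every function; as the numbers above show, reading the libc sources suggested 12 bytes while running the tests required 16 to 28.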
2023-07-16 Bruno Haible <bruno@clisp.org>
dfa: Optimize clearing an mbstate_t.
* lib/dfa.c (mbszero) [GAWK]: Add fallback definition.
(mbs_to_wchar, lex, addtok_wc, dfaexec_main): Use mbszero.
* modules/dfa (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
uchar-c23: Optimize clearing an mbstate_t.
* lib/lc-charset-unicode.c (locale_encoding_to_unicode,
unicode_to_locale_encoding): Use mbszero.
* modules/uchar-c23 (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
quotearg: Optimize clearing an mbstate_t.
* lib/quotearg.c: Include <wchar.h>.
(quotearg_buffer_restyled): Use mbszero.
* modules/quotearg (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
vasnprintf, vasnwprintf: Optimize clearing an mbstate_t.
* lib/vasnprintf.c (VASNPRINTF): Use mbszero.
* modules/vasnprintf (Depends-on): Add mbszero.
* modules/vasnwprintf (Depends-on): Likewise.
* modules/c-vasnprintf (Depends-on): Likewise.
* modules/unistdio/u8-vasnprintf (Depends-on): Likewise.
* modules/unistdio/u8-u8-vasnprintf (Depends-on): Likewise.
* modules/unistdio/u16-vasnprintf (Depends-on): Likewise.
* modules/unistdio/u16-u16-vasnprintf (Depends-on): Likewise.
* modules/unistdio/u32-vasnprintf (Depends-on): Likewise.
* modules/unistdio/u32-u32-vasnprintf (Depends-on): Likewise.
* modules/unistdio/ulc-vasnprintf (Depends-on): Likewise.
2023-07-16 Bruno Haible <bruno@clisp.org>
mbmemcasecoll: Optimize clearing an mbstate_t.
* lib/mbmemcasecoll.c (apply_c32tolower): Use mbszero.
* modules/mbmemcasecoll (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
mbswidth: Optimize clearing an mbstate_t.
* lib/mbswidth.c (mbsnwidth): Use mbszero.
* modules/mbswidth (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
mbfile: Optimize clearing an mbstate_t.
* lib/mbfile.h (mbfile_multi_getc, mbf_init): Use mbszero.
* modules/mbfile (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
mbuiter: Optimize clearing an mbstate_t.
* lib/mbuiter.h: Include <wchar.h>.
(mbuiter_multi_next, mbuiter_multi_copy, mbui_init): Use mbszero.
* modules/mbuiter (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
mbiter: Optimize clearing an mbstate_t.
* lib/mbiter.h: Include <wchar.h>.
(mbiter_multi_next, mbiter_multi_copy, mbi_init): Use mbszero.
* modules/mbiter (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
c32stombs: Optimize clearing an mbstate_t.
* lib/c32stombs.c (c32stombs): Use mbszero.
* lib/uchar.in.h (c32stombs): Likewise.
* modules/c32stombs (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
mbstoc32s: Optimize clearing an mbstate_t.
* lib/mbstoc32s.c (mbstoc32s): Use mbszero.
* lib/uchar.in.h (mbstoc32s): Likewise.
* modules/mbstoc32s (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
mbstowcs: Optimize clearing an mbstate_t.
* lib/mbstowcs.c (mbstowcs): Use mbszero.
* modules/mbstowcs (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
c32tob: Optimize clearing an mbstate_t.
* lib/c32tob.c (c32tob): Use mbszero.
* modules/c32tob (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
wctomb: Optimize clearing an mbstate_t.
* lib/wctomb-impl.h (wctomb): Use mbszero.
* modules/wctomb (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
btoc32: Optimize clearing an mbstate_t.
* lib/btoc32.c: Include <wchar.h>.
(btoc32): Use mbszero.
* modules/btoc32 (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
btowc: Optimize clearing an mbstate_t.
* lib/btowc.c (btowc): Use mbszero.
* modules/btowc (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
mbrtoc32: Optimize clearing an mbstate_t.
* lib/mbrtoc32.c (mbrtoc32): Use mbszero.
* modules/mbrtoc32 (Depends-on): Add mbsinit, mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
mbtowc: Optimize clearing an mbstate_t.
* lib/mbtowc-impl.h (mbtowc): Use mbszero.
* modules/mbtowc (Depends-on): Add mbszero.
2023-07-16 Bruno Haible <bruno@clisp.org>
mbszero: New module.
* lib/wchar.in.h: Include <string.h>.
(_GL_MBSTATE_INIT_SIZE, _GL_MBSTATE_ZERO_SIZE): New macros.
(mbszero): New declaration.
* lib/mbrtoc16.c: Update comments.
* lib/mbszero.c: New file.
* m4/wchar_h.m4 (gl_WCHAR_H_REQUIRE_DEFAULTS): Initialize
GNULIB_MBSZERO.
* modules/wchar (Depends-on): Add extern-inline.
(Makefile.am): Substitute GNULIB_MBSZERO.
* modules/mbszero: New file.
0001-mbszero-New-module.patch
0002-mbtowc-Optimize-clearing-an-mbstate_t.patch
0003-mbrtoc32-Optimize-clearing-an-mbstate_t.patch
0004-btowc-Optimize-clearing-an-mbstate_t.patch
0005-btoc32-Optimize-clearing-an-mbstate_t.patch
0006-wctomb-Optimize-clearing-an-mbstate_t.patch
0007-c32tob-Optimize-clearing-an-mbstate_t.patch
0008-mbstowcs-Optimize-clearing-an-mbstate_t.patch
0009-mbstoc32s-Optimize-clearing-an-mbstate_t.patch
0010-c32stombs-Optimize-clearing-an-mbstate_t.patch
0011-mbiter-Optimize-clearing-an-mbstate_t.patch
0012-mbuiter-Optimize-clearing-an-mbstate_t.patch
0013-mbfile-Optimize-clearing-an-mbstate_t.patch
0014-mbswidth-Optimize-clearing-an-mbstate_t.patch
0015-mbmemcasecoll-Optimize-clearing-an-mbstate_t.patch
0016-vasnprintf-vasnwprintf-Optimize-clearing-an-mbstate_.patch
0017-quotearg-Optimize-clearing-an-mbstate_t.patch
0018-uchar-c23-Optimize-clearing-an-mbstate_t.patch
0019-dfa-Optimize-clearing-an-mbstate_t.patch