bug-gnulib

new module 'mbuiterf'


From: Bruno Haible
Subject: new module 'mbuiterf'
Date: Tue, 18 Jul 2023 13:55:38 +0200

This set of patches adds a new module 'mbuiterf', similar to 'mbuiter',
just faster. The 'f' stands for "faster" or "functional style".

Kudos to Paul Eggert for the intuition that a function that returns values
is more efficient than an equivalent function that modifies state and
returns void.
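
To illustrate that intuition, here is a minimal sketch of the two styles.
It is not the gnulib code: the types and functions (struct ch,
struct iter_state, state_fetch, func_next, count_stateful,
count_functional) are hypothetical, and every byte is assumed to be a
one-byte character, so only the ASCII fast path is modelled. In the
stateful variant the iterator is updated through a pointer, which may force
the compiler to keep its fields in memory; in the functional variant the
character is returned by value and the position is a plain local variable
that can stay in a register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified character descriptor (not gnulib's mbchar_t).  */
struct ch { const char *ptr; size_t bytes; };

/* Stateful style: the iterator is a struct that is modified in place.  */
struct iter_state { const char *cur; bool have_next; struct ch next; };

/* Fills in it->next if it is not already cached.  Returns void and
   communicates only by modifying *it.  */
static void
state_fetch (struct iter_state *it)
{
  if (!it->have_next)
    {
      it->next.ptr = it->cur;
      it->next.bytes = 1;       /* assumption: one byte per character */
      it->have_next = true;
    }
}

size_t
count_stateful (const char *s)
{
  assert (s != NULL);
  struct iter_state it = { s, false, { NULL, 0 } };
  size_t n = 0;
  for (;;)
    {
      state_fetch (&it);
      if (*it.next.ptr == '\0')
        break;
      n++;
      it.cur += it.next.bytes;  /* advance past the current character */
      it.have_next = false;     /* invalidate the cached character */
    }
  return n;
}

/* Functional style: the next character is returned by value; the caller
   advances its own pointer.  */
static struct ch
func_next (const char *p)
{
  struct ch c = { p, 1 };       /* assumption: one byte per character */
  return c;
}

size_t
count_functional (const char *s)
{
  assert (s != NULL);
  size_t n = 0;
  for (const char *p = s; ; )
    {
      struct ch c = func_next (p);
      if (*c.ptr == '\0')
        break;
      n++;
      p += c.bytes;
    }
  return n;
}
```

Both functions return the same count for the same input; whether the
functional variant actually compiles to tighter code depends on the
compiler and the optimization level.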

The benchmarks clearly show the speedup, especially for the case of ASCII
text (CPU time, measured on x86_64, on an AMD Ryzen 7 CPU; the gcc columns
were measured with gcc 13):

          gcc              clang
    mbuiter mbuiterf  mbuiter mbuiterf
a    0.990   0.205     1.225   1.038
b    1.032   0.207     1.232   1.054
c    2.323   1.212     2.607   2.345
d    2.036   0.905     2.358   2.076
e    2.120   0.953     2.358   2.083
f   15.335  15.036    15.307  15.496
g   10.402   9.726    10.636  10.382
h   11.082  10.223    11.324  10.899
i    4.846   4.713     4.882   4.922
j    5.151   4.919     5.097   5.137

The speedup of the ASCII case with gcc is understandable when one looks
at the generated code: The inner loop in bench-mbuiter's do_test is

.L8:
        addq    %rax, %rbx
        movq    120(%rsp), %rax
        addq    %rax, 112(%rsp)
        movb    $0, 104(%rsp)
.L6:
        movq    112(%rsp), %r14
        cmpb    $0, (%r14)
        js      .L9
        movq    $1, 120(%rsp)
        movsbl  (%r14), %eax
        movb    $1, 128(%rsp)
        movl    %eax, 132(%rsp)
        movb    $1, 104(%rsp)
.L7:
        testl   %eax, %eax
        jne     .L8

whereas the inner loop in bench-mbuiterf's do_test is

.L15:
        testb   %al, %al
        js      .L6
        addq    %rax, %rbx
        movl    $1, %eax
        addq    %rax, %r14
        movsbq  (%r14), %rax
        testb   %al, %al
        jne     .L15

That's nearly optimal.

The module 'mbuiter' is still recommended for more complicated code
that is not performance critical, because it has a simpler idiom.
For this reason, I'm not optimizing the modules 'mbsstr', 'mbscasestr',
'regex-quote', 'propername', 'exclude': Their code is more readable
with the 'mbuiter' macros.


2023-07-18  Bruno Haible  <bruno@clisp.org>

        mbsspn: Optimize.
        * lib/mbsspn.c: Include mbuiterf.h instead of mbuiter.h.
        (mbsspn): Use mbuif_* macros instead of mbui_* macros.
        * modules/mbsspn (Depends-on): Add mbuiterf. Remove mbuiter.

        mbscspn: Optimize.
        * lib/mbscspn.c: Include mbuiterf.h instead of mbuiter.h.
        (mbscspn): Use mbuif_* macros instead of mbui_* macros.
        * modules/mbscspn (Depends-on): Add mbuiterf. Remove mbuiter.

        mbspbrk: Optimize.
        * lib/mbspbrk.c: Include mbuiterf.h instead of mbuiter.h.
        (mbspbrk): Use mbuif_* macros instead of mbui_* macros.
        * modules/mbspbrk (Depends-on): Add mbuiterf. Remove mbuiter.

        mbspcasecmp: Optimize.
        * lib/mbspcasecmp.c: Include mbuiterf.h instead of mbuiter.h.
        (mbspcasecmp): Use mbuif_* macros instead of mbui_* macros.
        * modules/mbspcasecmp (Depends-on): Add mbuiterf. Remove mbuiter.

        mbsncasecmp: Optimize.
        * lib/mbsncasecmp.c: Include mbuiterf.h instead of mbuiter.h.
        (mbsncasecmp): Use mbuif_* macros instead of mbui_* macros.
        * modules/mbsncasecmp (Depends-on): Add mbuiterf. Remove mbuiter.

        mbscasecmp: Optimize.
        * lib/mbscasecmp.c: Include mbuiterf.h instead of mbuiter.h.
        (mbscasecmp): Use mbuif_* macros instead of mbui_* macros.
        * modules/mbscasecmp (Depends-on): Add mbuiterf. Remove mbuiter.

        mbssep: Optimize.
        * lib/mbssep.c: Include mbuiterf.h instead of mbuiter.h.
        (mbssep): Use mbuif_* macros instead of mbui_* macros.
        * modules/mbssep (Depends-on): Add mbuiterf. Remove mbuiter.

        mbsrchr: Optimize.
        * lib/mbsrchr.c: Include mbuiterf.h instead of mbuiter.h.
        (mbsrchr): Use mbuif_* macros instead of mbui_* macros.
        * modules/mbsrchr (Depends-on): Add mbuiterf. Remove mbuiter.

        mbschr: Optimize.
        * lib/mbschr.c: Include mbuiterf.h instead of mbuiter.h.
        (mbschr): Use mbuif_* macros instead of mbui_* macros.
        * modules/mbschr (Depends-on): Add mbuiterf. Remove mbuiter.

        mbslen: Optimize.
        * lib/mbslen.c: Include mbuiterf.h instead of mbuiter.h.
        (mbslen): Use mbuif_* macros instead of mbui_* macros.
        * modules/mbslen (Depends-on): Add mbuiterf. Remove mbuiter.

        mbuiterf: Add a benchmark.
        * tests/bench-mbuiterf.c: New file, based on tests/bench-mbuiter.c.
        * modules/mbuiterf-bench-tests: New file, based on
        modules/mbuiter-bench-tests.

        mbuiterf: New module.
        * lib/mbuiterf.h: New file, based on lib/mbuiter.h.
        * lib/mbuiterf.c: New file, based on lib/mbuiter.c.
        * modules/mbuiterf: New file, based on modules/mbuiter.

Attachments (text):
  0001-mbuiterf-New-module.patch
  0002-mbuiterf-Add-a-benchmark.patch
  0003-mbslen-Optimize.patch
  0004-mbschr-Optimize.patch
  0005-mbsrchr-Optimize.patch
  0006-mbssep-Optimize.patch
  0007-mbscasecmp-Optimize.patch
  0008-mbsncasecmp-Optimize.patch
  0009-mbspcasecmp-Optimize.patch
  0010-mbspbrk-Optimize.patch
  0011-mbscspn-Optimize.patch
  0012-mbsspn-Optimize.patch
