[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[bug#68242] [core-updates] Compress man pages using zstd
|
From: |
Maxim Cournoyer |
|
Subject: |
[bug#68242] [core-updates] Compress man pages using zstd |
|
Date: |
Mon, 08 Jan 2024 20:17:51 -0500 |
|
User-agent: |
Gnus/5.13 (Gnus v5.13) |
Hi Ludovic!
Ludovic Courtès <ludo@gnu.org> writes:
> Maxim Cournoyer <maxim.cournoyer@gmail.com> skribis:
>
>> The aim is to improve the efficiency of computing the man pages database,
>> which must decompress the man pages. Zstd is faster than gzip, especially
>> for
>> decompression, and has a similar compression ratio.
>>
>> * gnu/packages/commencement.scm (%final-inputs): Add zstd.
>> * guix/build/gnu-build-system.scm
>> (compress-documentation) Update doc.
>> <info-compressor, info-compressor-flags, man-compressor,
>> man-compressor-flags>
>> <man-compressor-file-extension>: New arguments.
>> <compressed-documentation-extension>: Rename argument to...
>> <info-compressor-file-extension>: ... this. Add an 'extension' argument to
>> the retarget-symlink nested procedure. Use new arguments in nested
>> 'maybe-compress' procedure.
>>
>> Change-Id: Ibaad4658f8e5151633714d263d9198f56d255020
>
> That’s a great idea, LGTM!
Thank you for the review!
> Do you have figures on the space savings of a package with many man
> pages such as gnutls:doc or openssl:doc?
Surprisingly, all of these I've checked used the weighed the same.
Here's gnutls:doc from my local (master) Guix:
--8<---------------cut here---------------start------------->8---
$ du -sh /gnu/store/8i3bas6lhziqi2n5wg6qzzhlddkb502c-gnutls-3.7.7-doc
4,9M /gnu/store/8i3bas6lhziqi2n5wg6qzzhlddkb502c-gnutls-3.7.7-doc
--8<---------------cut here---------------end--------------->8---
Compared to core-updates with these changes:
--8<---------------cut here---------------start------------->8---
$ du -sh /gnu/store/h3lbj1g64lkn9rd9xp86dphqnblxqkl6-gnutls-3.8.1-doc
4.9M /gnu/store/h3lbj1g64lkn9rd9xp86dphqnblxqkl6-gnutls-3.8.1-doc
--8<---------------cut here---------------end--------------->8---
That's because all the compressed man pages appear to fit in the minimal
4 KiB size of a single file, whether they are compressed with gzip or
zstd compressed.
Both man-pages packages weigh 11 MiB, but we can get an idea of the
compression ratio using:
With my local Guix:
--8<---------------cut here---------------start------------->8---
$ find $(guix build man-pages) -name '*.gz' | xargs -n1 du | sort -rn | head
-n20
64
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man5/proc.5.gz
44
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/bpf-helpers.7.gz
32
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/perf_event_open.2.gz
28
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/ptrace.2.gz
20
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/tcp.7.gz
20
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/cgroups.7.gz
20
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/seccomp_unotify.2.gz
20
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/prctl.2.gz
20
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/open.2.gz
20
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/futex.2.gz
20
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/fcntl.2.gz
16
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/user_namespaces.7.gz
16
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/socket.7.gz
16
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/man-pages.7.gz
16
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/ip.7.gz
16
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/cpuset.7.gz
16
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man7/capabilities.7.gz
16
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man5/elf.5.gz
16
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/seccomp.2.gz
16
/gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02/share/man/man2/keyctl.2.gz
--8<---------------cut here---------------end--------------->8---
On core-updates with these changes:
--8<---------------cut here---------------start------------->8---
$ find /gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02 -name '*.zst'
| xargs -n1 du | sort -rn | head -n20
56
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man5/proc.5.zst
36
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/bpf-helpers.7.zst
28
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/perf_event_open.2.zst
24
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/ptrace.2.zst
20
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/tcp.7.zst
20
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/seccomp_unotify.2.zst
20
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/prctl.2.zst
20
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/futex.2.zst
20
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/fcntl.2.zst
16
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/user_namespaces.7.zst
16
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/man-pages.7.zst
16
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/ip.7.zst
16
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/cpuset.7.zst
16
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/cgroups.7.zst
16
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man7/capabilities.7.zst
16
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man5/elf.5.zst
16
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/seccomp.2.zst
16
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/open.2.zst
16
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/keyctl.2.zst
16
/gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02/share/man/man2/clone.2.zst
--8<---------------cut here---------------end--------------->8---
So for larger man pages, it seems we're talking about a 10% improvement.
That's not much, but the decompression is more efficient:
Compare gzipped man-pages decompression:
--8<---------------cut here---------------start------------->8---
$ find /gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02 -name '*.gz'
| sh -c 'time xargs gunzip -ck > /dev/null'
real 0m0.137s
user 0m0.106s
sys 0m0.032s
$ find /gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02 -name '*.gz'
| sh -c 'time xargs gunzip -ck > /dev/null'
real 0m0.137s
user 0m0.104s
sys 0m0.035s
$ find /gnu/store/93fjc9hv5canvs2lpya0qsbcm44hq7hh-man-pages-6.02 -name '*.gz'
| sh -c 'time xargs gunzip -ck > /dev/null'
real 0m0.138s
user 0m0.103s
sys 0m0.036s
--8<---------------cut here---------------end--------------->8---
With zstd' man-pages decompression:
--8<---------------cut here---------------start------------->8---
$ find /gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02 -name '*.zst'
| sh -c 'time xargs zstd -dkc > /dev/null'
real 0m0.091s
user 0m0.033s
sys 0m0.059s
$ find /gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02 -name '*.zst'
| sh -c 'time xargs zstd -dkc > /dev/null'
real 0m0.091s
user 0m0.035s
sys 0m0.058s
$ find /gnu/store/nqp5mmi1kb4xp7nkqsybrp5i18lygsl2-man-pages-6.02 -name '*.zst'
| sh -c 'time xargs zstd -dkc > /dev/null'
real 0m0.090s
user 0m0.027s
sys 0m0.063s
--8<---------------cut here---------------end--------------->8---
Assuming guile-zstd fares as well as zstd itself, we're looking at 1.5x
faster decompression.
Past measurements though had suggested the decompression was not the
limiting thing in making man-pages faster; rather it had to do with
building the database with Guile (sorry, I can't find a reference to it
anymore).
--
Thanks,
Maxim