coreutils

Re: sort dynamic linking overhead


From: Yann Collet
Subject: Re: sort dynamic linking overhead
Date: Mon, 26 Feb 2024 06:44:46 +0000

> xxhash128 is not a cryptographic hash function, so it doesn't attempt to
> be random.

Just a correction: xxh128 does try to be random, and it tries quite hard: a 
significant amount of development effort goes into ensuring this property.
It’s even tested with PractRand, and it could be used as a good random number 
generator.

Being non-cryptographic means that it doesn’t try to prevent someone from 
intentionally forging a hash collision between 2 different files (other than 
by brute force, which is impractical).
But that’s a different property, and I wouldn’t call it “randomness”, even 
though randomness is a prerequisite for (but not sufficient for) collision 
resistance.
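
To make the distinction concrete, here is a minimal Python sketch of a 
hash that disperses bits well but is trivial to collide on purpose. 
Everything in it is an assumption made for illustration: `toy_hash` and 
`mix64` are hypothetical names, the mixer is the well-known splitmix64 
finalizer, and none of this reflects how xxh128 is actually built.

```python
import struct

MASK64 = (1 << 64) - 1


def mix64(x: int) -> int:
    """splitmix64 finalizer: a bijective bit mixer on 64-bit integers."""
    x &= MASK64
    x = ((x ^ (x >> 30)) * 0xBF58476D1CE4E5B9) & MASK64
    x = ((x ^ (x >> 27)) * 0x94D049BB133111EB) & MASK64
    return x ^ (x >> 31)


def toy_hash(data: bytes) -> int:
    """Hypothetical toy hash (NOT xxh128): XOR-fold 8-byte blocks, then mix.

    The final mix spreads input differences over all 64 output bits, but
    XOR is commutative, so reordering blocks never changes the result.
    """
    data += b"\x00" * (-len(data) % 8)  # zero-pad to a whole number of blocks
    acc = 0
    for (block,) in struct.iter_unpack("<Q", data):
        acc ^= block
    return mix64(acc)


if __name__ == "__main__":
    a, b = b"AAAAAAAA", b"BBBBBBBB"
    # A forged collision: two different inputs, same hash, no brute force.
    print(toy_hash(a + b) == toy_hash(b + a))
```

Swapping two blocks is enough to forge a collision here, even though a 
one-bit change in the input still flips many output bits. A real 
non-cryptographic hash is far harder to collide than this toy; the sketch 
only shows that the two properties are separate.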


From: Paul Eggert <eggert@cs.ucla.edu>
Date: Sunday, February 25, 2024 at 10:25 PM
To: Pádraig Brady <P@draigBrady.com>, Bruno Haible <bruno@clisp.org>, 
bug-gnulib@gnu.org <bug-gnulib@gnu.org>, Coreutils <coreutils@gnu.org>
Cc: Yann Collet <cyan@meta.com>
Subject: Re: sort dynamic linking overhead
On 2023-10-09 06:48, Pádraig Brady wrote:

> An incremental patch attached to use xxhash128 (0.8.2)
> shows a good improvement (note avx2 being used on this cpu):

xxhash128 is not a cryptographic hash function, so it doesn't attempt to
be random. Of course most people won't care - it's random "enough" - but
it would be a functionality change.

blake2 is cryptographic and would be random, but would bloat the 'sort'
executable with code that's hardly ever used.

To attack the problem in a more conservative way, I installed the
attached patch into coreutils. With it, 'sort -R' continues to use MD5
but on GNUish platforms 'sort' links libcrypto dynamically only if -R is
used (Bruno's suggestion). This doesn't significantly affect 'sort -R'
performance, and reduces the startup overhead of plain 'sort' to be what
it was before we started passing -lcrypto to gcc by default (in
coreutils 8.32).
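
As a rough analogy only (the actual patch is C and relies on the 
platform's dynamic loader, whose details are not shown in this message), 
the "pay for the hash library only when -R is used" idea can be sketched 
in Python. The names `sort_lines`, `md5_loaded` and the `salt` parameter 
are hypothetical, and hashing each line with a salt and sorting by the 
digest is an assumption about the general approach, not a copy of 
coreutils code.

```python
import os

_md5 = None  # resolved on first use, so plain sorting never loads hashlib


def md5_loaded() -> bool:
    """Report whether the MD5 implementation has been loaded yet."""
    return _md5 is not None


def _get_md5():
    """Load the MD5 constructor lazily and cache it."""
    global _md5
    if _md5 is None:
        import hashlib  # deferred import: only the random path pays for it
        _md5 = hashlib.md5
    return _md5


def sort_lines(lines, random_sort=False, salt=None):
    """Sort lines; with random_sort, order them by a salted MD5 digest.

    Identical lines get identical digests, so they stay grouped together.
    """
    if not random_sort:
        return sorted(lines)
    md5 = _get_md5()
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per run
    return sorted(lines, key=lambda s: md5(salt + s.encode()).digest())


if __name__ == "__main__":
    print(sort_lines(["pear", "apple", "fig"]))
    print(sort_lines(["pear", "apple", "fig", "apple"], random_sort=True))
```

The plain path never touches the hashing code, which mirrors the goal of 
keeping the startup cost of plain 'sort' unchanged.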

I also toyed with changing MD5 to SHA512, but that hurt performance. For
what it's worth, although I tested with an Intel Xeon W-1350, which
supports SHA-NI as well as various AVX-512 options, I didn't see where
libcrypto (at least on Ubuntu 23.10, which has OpenSSL 3.0.10) takes
advantage of these special-purpose instructions.

