Re: Processing large amounts of files
From: Ricardo Wurmus
Subject: Re: Processing large amounts of files
Date: Thu, 21 Mar 2024 16:03:37 +0100
User-agent: mu4e 1.10.8; emacs 29.1
Liliana Marie Prikler <liliana.prikler@ist.tugraz.at> writes:
> For comparison:
> time cat /tmp/meow/{0..7769}
> […]
>
> real 0m0,144s
> user 0m0,049s
> sys 0m0,094s
>
> It takes GWL 6 times longer to compute the workflow than to create the
> inputs in Guile, and 600 times longer than to actually execute the
> shell command. I think there is room for improvement :)
GWL checks that all input files exist before running the command. Part
of the difference you see here (about 2 seconds on my laptop) is GWL
calling FILE-EXISTS? on each of the 7769 files. This happens in
prepare-inputs, whose stated purpose is:
"Ensure that all files in the INPUTS-MAP alist exist and are linked to
the expected locations. Pick unspecified inputs from the environment.
Return either the INPUTS-MAP alist with any additionally used input
file names added, or raise a condition containing the list of missing
files."
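
To make the cost of that check concrete, here is a rough sketch of the same idea. It is written in Python rather than GWL's Guile, it is not the prepare-inputs code, and the names check_inputs and MissingInputsError are made up for illustration. It only covers the existence check, not the linking or the lookup of unspecified inputs.

```python
import os


class MissingInputsError(Exception):
    """Raised when declared input files are absent (hypothetical name)."""

    def __init__(self, missing):
        super().__init__(f"{len(missing)} missing input file(s)")
        self.missing = missing


def check_inputs(paths):
    """Return PATHS as a list if every file exists, else raise with the missing ones."""
    # One existence check (one stat call) per file, so the cost grows
    # linearly with the number of inputs.
    missing = [p for p in paths if not os.path.exists(p)]
    if missing:
        raise MissingInputsError(missing)
    return list(paths)
```

With several thousand inputs this is several thousand system calls before the command even starts, which is the kind of overhead described above.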
Another significant delay is introduced by the cache mechanism, which
computes a unique prefix based on the contents of all input files. It's
not unexpected that this will take a little while, but it's not great
either.
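
For illustration, a content-based prefix can be sketched like this. This is Python, not the GWL cache code; the function name cache_prefix, the choice of SHA-256, and the sorting of paths are all my assumptions.

```python
import hashlib


def cache_prefix(paths, chunk_size=1 << 20):
    """Derive one hex digest from the contents of all files in PATHS."""
    combined = hashlib.sha256()
    # Sort so the result does not depend on the order the inputs are listed in.
    for path in sorted(paths):
        per_file = hashlib.sha256()
        with open(path, "rb") as handle:
            # Read in chunks so a large input is never held in memory at once.
            while chunk := handle.read(chunk_size):
                per_file.update(chunk)
        # Feed the fixed-length per-file digest into the combined hash.
        combined.update(per_file.digest())
    return combined.hexdigest()
```

Every byte of every input has to be read to compute this, so the time is bounded below by the total input size, no matter how cheap the command itself is.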
The rest of the time is lost in inferior package lookups and in using
Guix to build a script that likely already exists. The latter is
something that we could cache (given identical output of "guix describe"
we could skip the computation of the process scripts).
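
A minimal sketch of that caching idea, again in Python and not existing GWL code: ScriptCache, script_for and the build callback are hypothetical, and the "guix describe" output is passed in as a string instead of being obtained by running the command.

```python
import hashlib


class ScriptCache:
    """Remember built scripts per (Guix revision, process) pair (hypothetical)."""

    def __init__(self, build):
        self._build = build  # expensive function: process name -> script path
        self._scripts = {}   # in-memory only; a real cache would persist to disk

    def script_for(self, describe_output, process_name):
        # Identical "guix describe" output is assumed to mean identical
        # channels, so an earlier build result can be reused.
        revision = hashlib.sha256(describe_output.encode("utf-8")).hexdigest()
        key = (revision, process_name)
        if key not in self._scripts:
            self._scripts[key] = self._build(process_name)
        return self._scripts[key]
```

The sketch assumes the process definition itself is unchanged between runs; a real key would also have to cover the process, not just its name.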
--
Ricardo