Re: Performance issues with using local-file/interned-file/add-to-store
From: Ludovic Courtès
Subject: Re: Performance issues with using local-file/interned-file/add-to-store
Date: Sun, 17 Sep 2017 22:22:32 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/25.2 (gnu/linux)
Hello!
Christopher Baines <address@hidden> writes:
> So I've been playing around with managing some data files with Guix. On
> the whole, this is working quite nicely so far, but I'm having some
> performance issues with using the local-file gexp.
>
> Using it for large files (~1 to ~4 GB in my case) causes the
> guix-daemon to use a large amount of CPU and memory. This isn't a
> problem, but as the data files I'm working with don't change, once the
> file has been added to the store, it doesn't need to be added again.
The memory issue in guix-daemon is a known problem; see
<https://bugs.gnu.org/23666>.
I guess it wasn’t a pressing issue until now because almost all our use
cases were dealing with small files.
> The slow performance means that even if nothing needs adding to
> the store or building, it can take a while to work that out, as all the
> big files that you are using have to be added to the store again anyway.
>
> It would be good for me to have the option to cache the result of
> running add-to-store through the local-file gexp, perhaps using the
> full filename, or the hash as the cache key?
Internally, (guix store) has a cache (see ‘add-to-store-cache’) to make
sure that, during a session, the ‘add-to-store’ RPC for a given file is
made only once.
If you have large files, though, that single RPC can already be expensive.
In the “RPC pipelining” thread¹, I proposed a patch that allows us to
defer the ‘add-to-store’ RPC. The patch changes ‘add-to-store’ to
compute the resulting store file name locally, without making the RPC
right away, and makes that RPC at a later time.
This approach is not helpful performance-wise for small files and when
talking to a local daemon. However, it could serve as a trick for large
files, where we could do something like:
  1. Compute the store file name for the large file on the client side;
  2. Call ‘add-temp-root’ for that store item; if that works, the item is
     already in the store; otherwise we need to do ‘add-to-store’.
The downside is that #1 requires traversing the whole file, so maybe it
doesn’t help.
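To make the two steps concrete, here is a rough sketch in Python rather than Guile. Everything in it is hypothetical: `FakeStore` is an in-memory stand-in for the daemon, `has_item` plays the role of the ‘add-temp-root’ probe, `add` plays the role of the ‘add-to-store’ RPC, and the key is a plain SHA-256 of the file contents, not the real store file name computation.

```python
import hashlib
import os


def content_key(path, chunk_size=1 << 20):
    """Hash the file in chunks. This is the full traversal that step 1 needs."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


class FakeStore:
    """Hypothetical in-memory stand-in for the daemon (not the real RPC API)."""

    def __init__(self):
        self.items = {}
        self.uploads = 0

    def has_item(self, key):
        # Stand-in for the cheap probe (the 'add-temp-root' call in step 2).
        return key in self.items

    def add(self, key, path):
        # Stand-in for the expensive 'add-to-store' RPC; only the size is kept.
        self.items[key] = os.path.getsize(path)
        self.uploads += 1


def ensure_in_store(store, path):
    """Step 1: compute the key locally. Step 2: upload only if the probe fails."""
    key = content_key(path)
    if not store.has_item(key):
        store.add(key, path)
    return key
```

Calling `ensure_in_store` twice on an unchanged file uploads it once, but `content_key` still reads the whole file on every call, which is exactly the downside mentioned above.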
Hmm…
Of course we could also have a cache in ~/.cache/guix for all this, as a
last resort.
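Such a cache might look roughly like the sketch below, again in Python and purely illustrative. The cache file location, the JSON format, and the choice of key (absolute file name plus size plus modification time, so that the file need not be read) are all assumptions of mine, and `add_fn` is a hypothetical callback standing in for the real ‘add-to-store’ call.

```python
import json
import os


def file_fingerprint(path):
    """Cheap cache key from metadata only; the file contents are not read."""
    st = os.stat(path)
    return f"{os.path.abspath(path)}:{st.st_size}:{st.st_mtime_ns}"


def cached_add(path, cache_file, add_fn):
    """Return the cached result for PATH, calling ADD_FN only on a cache miss."""
    try:
        with open(cache_file) as f:
            cache = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        cache = {}  # missing or corrupt cache: start over

    key = file_fingerprint(path)
    if key in cache:
        return cache[key]

    result = add_fn(path)  # the expensive operation
    cache[key] = result
    os.makedirs(os.path.dirname(cache_file) or ".", exist_ok=True)
    with open(cache_file, "w") as f:
        json.dump(cache, f)
    return result
```

A cache like this can go stale, for instance if the store item has been garbage-collected since the entry was written, so a hit would presumably still need a cheap check against the store before being trusted.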
Thoughts?
Ludo’.
¹ https://lists.gnu.org/archive/html/guix-devel/2017-07/msg00135.html