guix-devel

Re: Random idea about speeding up guix pull


From: Hartmut Goebel
Subject: Re: Random idea about speeding up guix pull
Date: Tue, 5 Sep 2017 14:23:15 +0200

On 04.09.2017 at 23:56, Ludovic Courtès wrote:
> What it does do is maintain a cached checkout in ~/.cache/guix/pull,
> which makes subsequent pulls much faster.

Summary (TL;DR):

  * "guix pull" should use "git fetch master"
  * for "guix download" we can keep the current behaviour

I did a series of tests:

  * "fetch" without any argument fetches *all* data from *all*
    branches.
  * "fetch master" only fetches data living on "master"; other
    branches are ignored.
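
The difference between the two fetch variants can be reproduced on a small scale. The following is only a sketch, not the attached script: it builds a throwaway upstream repository with two branches (the names "upstream", "topic", "only-master" and "all" are made up for illustration) and counts the objects each variant brings in.

```shell
#!/bin/sh
# Sketch: compare "git fetch <remote> master" with a plain "git fetch".
# All repository and branch names below are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway upstream with one commit on master and one more on topic.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.email "test@example.org"
git config user.name "Test"
echo one > file
git add file
git commit -q -m "on master"
git checkout -q -b topic
echo two > other
git add other
git commit -q -m "on topic"
git checkout -q master
cd ..

# Client 1: fetch only the master branch.
git init -q only-master
git -C only-master fetch -q "$work/upstream" master

# Client 2: fetch everything the remote offers.
git init -q all
git -C all remote add origin "$work/upstream"
git -C all fetch -q origin

# Count loose plus packed objects in a repository.
count_objects() {
    git -C "$1" count-objects -v |
        awk '/^count:|^in-pack:/ { n += $2 } END { print n }'
}

master_only=$(count_objects only-master)
everything=$(count_objects all)
echo "objects after 'fetch master': $master_only"
echo "objects after 'fetch':        $everything"
```

The first client ends up with fewer objects because the commit that exists only on "topic" is never requested.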

I compared the amount of data fetched into a repo that was at commit
6bd1c41e8 (yesterday 05:29):

  * "fetch" fetches 1000K
  * "fetch master" fetches 755K
  * "fetch --depth=1 master" fetches 588K (but see below)

I did some more tests (see results below and the attached script) and
gained the following insights:

  * if FETCH_HEAD is not checked out after a fetch, the next fetch
    downloads all the data again (compare "fetch by ref" with "fetch by
    ref + checkout")
  * --depth=1 downloads the *whole* state (at the given ref), no matter
    how much of the data is already present (compare "fetch by ref +
    checkout" with "fetch --depth=1 by ref + checkout")
  * I was not able to create a test case where "fetch --depth=1 master"
    would fetch only part of the data, which contradicts the results
    when updating from 6bd1c41e8.
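
The "fetch by ref + checkout" pattern from the first point might look like the sketch below. It is not the attached script; the helper name fetch_and_checkout and the repository names are invented for illustration. A plausible explanation for the observed behaviour (an assumption, not something measured here) is that a fetch by URL leaves no ref pointing at the new commits, so checking out FETCH_HEAD is what lets the next fetch tell the server what is already present.

```shell
#!/bin/sh
# Sketch: fetch a single ref and check out FETCH_HEAD afterwards.
# fetch_and_checkout and all repository names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway upstream with a first commit on master.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/master
git -C upstream config user.email "test@example.org"
git -C upstream config user.name "Test"
(cd upstream && echo one > file && git add file && git commit -q -m first)

# Usage: fetch_and_checkout DIR URL REF
fetch_and_checkout() {
    git -C "$1" fetch -q "$2" "$3"
    # Detach HEAD at the fetched commit so that HEAD references it.
    git -C "$1" checkout -q FETCH_HEAD
}

git init -q cache
fetch_and_checkout cache "$work/upstream" master

# Upstream moves on; the second fetch starts from the checked-out state.
(cd upstream && echo two > file && git commit -q -a -m second)
fetch_and_checkout cache "$work/upstream" master

cached=$(git -C cache rev-parse HEAD)
upstream_head=$(git -C upstream rev-parse master)
echo "cache is at $cached"
```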

I suggest making "guix pull" fetch only from "master", since this
already reduces the size of the downloaded data.

For "guix download" we don't (need to) cache former downloads, thus
"--depth=1 <commit>" would suffice. Unfortunately this only works for
branches and tags, not for commit IDs (see "man git-fetch-pack" for
exceptions). But most current package definitions are based on
commit IDs. Thus it is not worth trying "--depth=1 <commit>" first.
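
For the case that does work, a shallow fetch of a tag, a minimal sketch could look like this. The repository names and the tag "v1" are made up for illustration; the file:// URL is used so that the fetch goes through the normal transport.

```shell
#!/bin/sh
# Sketch: shallow fetch of a tag with --depth=1.
# Repository names and the tag v1 are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway upstream with three commits; the last one is tagged v1.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.email "test@example.org"
git config user.name "Test"
for n in 1 2 3; do
    echo "$n" > file
    git add file
    git commit -q -m "commit $n"
done
git tag v1
cd ..

# Fetch only the tagged commit, without its history.
git init -q shallow
git -C shallow fetch -q --depth=1 "file://$work/upstream" v1

depth=$(git -C shallow rev-list --count FETCH_HEAD)
echo "commits reachable after shallow fetch: $depth"
```

Only the tagged commit itself is reachable in the shallow repository; its two ancestors are not transferred.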

cloned repo ---------------
size 45M

fetch all ------------------
size 45M

fetch by ref ------------------
size v0.11.0    26M
size v0.12.0    32M
size v0.13.0    40M
size marker-1   45M
size marker-2   45M
size marker-3   45M
size marker-4   45M
size marker-5   45M
size master     45M

fetch by ref + checkout ------------------
size v0.11.0    26M
size v0.12.0    11M
size v0.13.0    12M
size marker-1   8,9M
size marker-2   1,1M
size marker-3   856K
size marker-4   856K
size marker-5   1,1M
size master     1,1M

fetch --depth=1 by ref ------------------
size v0.11.0    9,8M
size v0.12.0    11M
size v0.13.0    13M
size marker-1   13M
size marker-2   13M
size marker-3   13M
size marker-4   13M
size marker-5   13M
size master     13M

fetch --depth=1 by ref + checkout ------------------
size v0.11.0    9,8M
size v0.12.0    3,8M
size v0.13.0    5,6M
size marker-1   4,1M
size marker-2   4,1M
size marker-3   4,1M
size marker-4   4,1M
size marker-5   4,1M
size master     4,1M

fetch older all and master with --depth=1 by ref + checkout ------------------
size master     45M
size master     45M

-- 
Regards
Hartmut Goebel

| Hartmut Goebel          | address@hidden               |
| www.crazy-compilers.com | compilers which you thought are impossible |

Attachment: test-fetch.sh
Description: application/shellscript

