emacs-devel


Re: Recommend these .gitconfig settings for git integrity.


From: Paul Eggert
Subject: Re: Recommend these .gitconfig settings for git integrity.
Date: Tue, 2 Feb 2016 23:31:00 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.5.1

On 02/02/2016 10:48 AM, Óscar Fuentes wrote:
Emacs is no small project.

Emacs is considerably smaller than other projects that Git regularly deals with. The Emacs master branch has 125,000 commits; the Linux kernel has 574,000 commits in its master branch (this omits history before 2005). I've seen reports of git repositories at Facebook where the .git subdirectory contains 50 GB. By comparison, my repository for Emacs master takes 0.25 GB and my repository for the Linux kernel takes 1.7 GB. These numbers can vary quite a bit depending on packing and so forth; still, the point remains that Emacs is not nearly as large as other projects that use git.

If the slowdown affects large transfers ("large" meaning either many objects or big objects), what happens if an Emacs hacker pauses his activity for several months and then pulls? (After monitoring emacs-diffs for several years, I can attest that this scenario is quite frequent.)

I doubt whether it will slow down such pulls significantly for typical Emacs development. I just now did a simple benchmark that cloned Emacs master from savannah to my desktop at UCLA, and the overhead from the transfer.fsckObjects setting was swamped by noise due to the network being slower or faster. Here are a few details:

    average times (seconds)
    real      user+system   fsckObjects value
    212.4       202.6           true
    217.1       195.7           false

    The command used was:
    git clone --config transfer.fsckObjects=VALUE git://git.sv.gnu.org/emacs.git

    I warmed up with a clone that I discarded. I then tried three of
    each command, interleaved, and took the averages.

This is just one benchmark of course, but it's suggestive.
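To make the comparison in the table above concrete, here is a minimal sketch that turns the reported averages into percentages. The helper name overhead_pct is hypothetical (it is not part of git or of the Emacs tree); the only inputs are the four averages quoted in the table.

```shell
# overhead_pct is a hypothetical helper for illustration only.
# It prints the percentage change of the fsckObjects=true time ($1)
# relative to the fsckObjects=false baseline ($2).
overhead_pct() {
    awk -v on="$1" -v off="$2" 'BEGIN { printf "%.1f\n", (on - off) / off * 100 }'
}

overhead_pct 202.6 195.7   # user+system seconds, true vs. false
overhead_pct 212.4 217.1   # real (wall-clock) seconds, true vs. false
```

On these numbers the CPU time is about 3.5% higher with the check enabled, while the wall-clock time is about 2.2% lower, which is consistent with the overhead being lost in network noise. Three runs per setting is a small sample, so the percentages are only indicative.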

If we wish to avoid tainted objects created by whatever cause, the check can be enabled on Savannah's repo, hence limiting the problem to the "infected" user.

The problem would not be limited to the infected user, if that user pushes commits. Git on the client could remove the taint without actually fixing the problem. Although the pushed commits would have checksums that match their data, the pushed data would be corrupt.

A possible source of problems in this area is an attack on the integrity of the Emacs repository by a determined outsider. Any such attacks cannot be discounted by appealing to estimates based purely on counts of random hardware or software or configuration errors, as the attacks would not be random.

The problem seems to be so rare that a single Emacs hacker experiencing it every decade or so

Perhaps you're right, but perhaps not. Really, we do not know how common this particular problem is. In my experience strange problems with git occur more often than once per decade.

doesn't warrant the risks and inconveniences associated with using the setting (see my previous messages).

Although I recall concerns based on hypothetical scenarios, I don't recall descriptions of real inconveniences. Perhaps I missed an email or two.

Are you a representative sample of the overwhelming majority of the Emacs devel population? (No, you aren't).

True. I expect I use git more than the average Emacs developer does. Stefan too. So we will probably be more affected than usual by this change.

Thanks, although it is now a bit late. It is already installed on
several repos, for sure. And anybody that stumbles on the revisions that
contained your change (by bisecting some bug, for example) will have his
.git/config file modified. I'm pretty sure that you didn't think of these
issues when you made the change (neither did I at first.)

No, I anticipated these issues. There's no evidence that they are significant problems. git bisect will still work.

I appreciate your concern but that has an easy solution: enable it on the server (and on your own machine, to be extra sure.) See, problem solved.

That would not solve the problem. True, it would help against some random errors, but it won't work in general even there, much less against a determined attack.

I take your point that there's no rush in making this change, and it will be helpful to gain experience on it among developers who prefer a more bleeding-edge environment, so I have reverted the change on emacs-25 and installed a considerably more conservative approach on master to help us get started. The new approach does not alter git configuration if a developer invokes plain './autogen.sh'. Instead a developer can invoke the script with a newly-introduced extension: either './autogen.sh git' to configure just git, or './autogen.sh all' to configure both autoconf and git. I hope this helps address the concerns raised in this thread.
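For anyone who prefers to turn on the check by hand instead of going through autogen.sh, the following is a minimal sketch of a per-user .gitconfig fragment. It covers only the transfer.fsckObjects setting benchmarked earlier in this message; it is an assumption that this is all a reader wants, and it does not claim to reproduce what './autogen.sh git' writes.

```
# ~/.gitconfig (or .git/config for a single repository)
# Sketch only: checks the integrity of objects as they are
# fetched or received.
[transfer]
	fsckObjects = true
```

The same value can be set for a single repository by placing the fragment in that repository's .git/config instead, which keeps the experiment local and easy to undo.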


