Re: A project-files implementation for Git projects
From: Dmitry Gutov
Subject: Re: A project-files implementation for Git projects
Date: Thu, 3 Oct 2019 16:19:04 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0
On 03.10.2019 11:33, Tassilo Horn wrote:
+(cl-defmethod project-files ((project (head vc)) &optional dirs)
+  (cl-mapcan
+   (lambda (dir)
+     (let (backend)
+       (if (and (file-equal-p dir (cdr project))
+                (setq backend (vc-responsible-backend dir))
+                nil
                 ^^^
So this disables the VC operation. I've removed it, and the speed
improvement is good here. This is my test case (the Emacs repository):
Yes, sorry. Used this for comparative testing and forgot to take it out.
The Emacs repository is the one I've mostly tested on as well.
--8<---------------cut here---------------start------------->8---
(let* ((dir "~/Repos/el/emacs")
       (p (project-current nil dir))
       f1 f2)
  (let ((t1 (benchmark-run 10
              (setq f1 (project-files p))))
        (t2 (benchmark-run 10
              (setq f2 (project--files-in-directory
                        dir (project--dir-ignores p dir))))))
    (message "Files: %d (VC) vs. %d (find)" (length f1) (length f2))
    (message "VC) Elapsed time: %fs (%fs in %d GCs)"
             (car t1) (nth 2 t1) (nth 1 t1))
    (message "Find) Elapsed time: %fs (%fs in %d GCs)"
             (car t2) (nth 2 t2) (nth 1 t2)))
  (let ((d1 (cl-set-difference f1 f2 :test #'string=))
        (d2 (cl-set-difference f2 f1 :test #'string=)))
    (message "Files found by VC but not by find:")
    (dolist (f d1)
      (message " %s" f))
    (message "Files found by find but not by VC:")
    (dolist (f d2)
      (message " %s" f))))
--8<---------------cut here---------------end--------------->8---
Here is the output:
--8<---------------cut here---------------start------------->8---
VC) Elapsed time: 1.379560s (0.308720s in 6 GCs)
Find) Elapsed time: 4.397054s (0.200695s in 4 GCs)
Files found by VC but not by find:
/home/horn/Repos/el/emacs/doc/lispintro/cons-1.pdf
/home/horn/Repos/el/emacs/doc/lispintro/cons-2.pdf
/home/horn/Repos/el/emacs/doc/lispintro/cons-2a.pdf
/home/horn/Repos/el/emacs/doc/lispintro/cons-3.pdf
/home/horn/Repos/el/emacs/doc/lispintro/cons-4.pdf
/home/horn/Repos/el/emacs/doc/lispintro/cons-5.pdf
/home/horn/Repos/el/emacs/doc/lispintro/drawers.pdf
/home/horn/Repos/el/emacs/doc/lispintro/lambda-1.pdf
/home/horn/Repos/el/emacs/doc/lispintro/lambda-2.pdf
/home/horn/Repos/el/emacs/doc/lispintro/lambda-3.pdf
/home/horn/Repos/el/emacs/etc/refcards/Makefile
/home/horn/Repos/el/emacs/etc/refcards/gnus-logo.pdf
/home/horn/Repos/el/emacs/lib/_Noreturn.h
/home/horn/Repos/el/emacs/lib/stdalign.in.h
/home/horn/Repos/el/emacs/lib/stddef.in.h
/home/horn/Repos/el/emacs/lib/stdint.in.h
/home/horn/Repos/el/emacs/lib/stdio-impl.h
/home/horn/Repos/el/emacs/lib/stdio.in.h
/home/horn/Repos/el/emacs/lib/stdlib.in.h
/home/horn/Repos/el/emacs/m4/__inline.m4
/home/horn/Repos/el/emacs/test/data/xdg/mimeinfo.cache
/home/horn/Repos/el/emacs/test/lisp/progmodes/flymake-resources/Makefile
/home/horn/Repos/el/emacs/test/manual/etags/Makefile
/home/horn/Repos/el/emacs/test/manual/etags/make-src/Makefile
/home/horn/Repos/el/emacs/test/manual/indent/Makefile
The difference is that the 'find' based method does not support
whitelist entries yet.
When it does, that might make its performance slightly worse, but
probably not in gtk or gnulib repos.
Files found by find but not by VC:
/home/horn/Repos/el/emacs/aclocal.m4
/home/horn/Repos/el/emacs/config.status
/home/horn/Repos/el/emacs/configure
/home/horn/Repos/el/emacs/info/dir
--8<---------------cut here---------------end--------------->8---
Then I did it on a clean checkout of the gtk repository and got this
result:
--8<---------------cut here---------------start------------->8---
Files: 4774 (VC) vs. 4774 (find)
VC) Elapsed time: 1.721054s (0.461112s in 9 GCs)
Find) Elapsed time: 0.634624s (0.152549s in 3 GCs)
Files found by VC but not by find:
Files found by find but not by VC:
nil
--8<---------------cut here---------------end--------------->8---
So here, Git has been much slower than find!
Interesting! I haven't seen that result before, but it sounds plausible.
IME it's ignore rules that make 'find' work slower. Git optimizes that
logic somehow. So on projects that have few ignore rules 'find' could be
faster.
I've also tried the gtk repo, and the performance ratio over here is the
same, although in my case 'git ls-files' there is faster than 'git
ls-files' in Emacs's repo (and 'find' is twice as fast still).
And again with gnulib:
--8<---------------cut here---------------start------------->8---
Files: 9936 (VC) vs. 9936 (find)
VC) Elapsed time: 3.444869s (0.902124s in 16 GCs)
Find) Elapsed time: 1.380269s (0.285082s in 5 GCs)
Files found by VC but not by find:
Files found by find but not by VC:
--8<---------------cut here---------------end--------------->8---
Again Git was slower. What my gtk and gnulib repositories have in
common is that they are clean, i.e., no build artifacts which would be
matched by the exclude args passed to find...
gtk has only one .gitignore entry, gnulib has 8, but fairly simple ones.
So, what should we do here? Maybe:
1. Implement whitelist rules support for 'find'.
2. Add a defcustom project-vc-list-files-method? With a value 'auto'
which would check the backend and Git version. Maybe the presence of
'find' as well. Other possible values would be 'find' and 'vc'.
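
To make option 2 concrete, here is a rough sketch of what such a
defcustom could look like. The variable name is the one proposed above;
the helper `my-project-vc--effective-list-method' is hypothetical, and
so is the fallback order for `auto'. The Git version check is left out
because no minimum version has been settled on.

```elisp
;; Sketch only: nothing here exists in project.el.

(defcustom project-vc-list-files-method 'auto
  "How to list the files of a VC project.
`auto' picks a method based on the backend and the available programs,
`find' always uses the find program, and `vc' always asks the backend."
  :type '(choice (const :tag "Decide automatically" auto)
                 (const :tag "Use find" find)
                 (const :tag "Use the VC backend" vc))
  :group 'tools)

(defun my-project-vc--effective-list-method (backend)
  "Return `vc' or `find' for BACKEND.
The result follows `project-vc-list-files-method'.  This helper is
hypothetical and only illustrates the dispatch."
  (if (not (eq project-vc-list-files-method 'auto))
      ;; An explicit user choice always wins.
      project-vc-list-files-method
    (cond
     ;; Assumption: for Git with a usable git binary, prefer the backend.
     ;; A real implementation would also check the Git version here.
     ((and (eq backend 'Git) (executable-find "git")) 'vc)
     ;; Otherwise use find when it is installed.
     ((executable-find "find") 'find)
     ;; Assumption: with neither available, fall back to the backend.
     (t 'vc))))
```

With the option set to `find' or `vc' the helper just returns that
value, so only the `auto' case depends on the machine it runs on.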
If you have time, could you compare the performance of 'find' and 'git
ls-files' on the command line? Because when simply redirecting to a file
I'm seeing a different result:
$ bash -c "time git ls-files >test"
real 0m0,011s
user 0m0,005s
sys 0m0,006s
$ bash -c "time find . >test2"
real 0m0,026s
user 0m0,008s
sys 0m0,018s
That could indicate some inefficiency in processing the output in Emacs.
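
One way to check that from inside Emacs would be to time the process
call and the conversion of its output into a list separately. This is
only a sketch: `my-time-git-ls-files' is a hypothetical helper, it calls
'git ls-files -z' directly rather than going through the VC code, and a
single run is too noisy to draw conclusions from.

```elisp
(require 'benchmark)

(defun my-time-git-ls-files (dir)
  "Return (PROCESS-TIME . SPLIT-TIME) in seconds for DIR.
PROCESS-TIME covers running `git ls-files -z' and collecting its
output in a buffer; SPLIT-TIME covers turning that output into a
list of file names.  Hypothetical helper, for measurement only."
  (let ((default-directory (file-name-as-directory (expand-file-name dir)))
        output process-time split-time)
    (with-temp-buffer
      ;; Time only the subprocess and the insertion of its output.
      (setq process-time
            (car (benchmark-run 1
                   (process-file "git" nil t nil "ls-files" "-z"))))
      (setq output (buffer-string)))
    ;; Time only the string splitting done on the Lisp side.
    (setq split-time
          (car (benchmark-run 1
                 (split-string output "\0" t))))
    (cons process-time split-time)))
```

Calling (my-time-git-ls-files "~/Repos/el/emacs") should then show
whether most of the time goes to the process itself or to the Lisp-side
processing.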