Re: Skipping unexec via a big .elc file
From: Ken Raeburn
Subject: Re: Skipping unexec via a big .elc file
Date: Sun, 11 Dec 2016 08:34:01 -0500
I’ve pushed an update to the scratch/raeburn-startup branch. It includes
several updates:
* Stefan’s Oct 31 patch instead of his earlier one. This does more
reinitializing of charsets, coding systems, etc., which I believe was absent
from the previous version.
* More patches to the recursive object substitution pass done during reading.
The big costs on Mac OS X seem to differ from my Linux/GNU/X11 build — there’s
a much larger dumped.elc file, and an entirely different compiler — but I’ve
managed to trim the run time there a bit.
* Changed gc-cons-threshold to be much larger. By itself, this isn’t a good
change. But we’d exceed the old value many times over just reading the big
“progn” form; this way my Linux/GNU/X11 run doesn’t trigger GC during startup,
though I think the Mac version still does. I think a better strategy might be
to defer or discourage GC during startup, and do it instead when we have idle
cycles while the user isn’t trying to get something done. But revamping the GC
strategy is a different discussion.
* Larger obarray. After startup, my Linux/GNU/X11 build has over 15k symbols,
and my Mac build has over 21k. The old obarray size of 1511 meant average
chain lengths of over 10 and 14. Shorter chains mean less time spent in
oblookup. And extra slots are cheap.
* Open-code reading ASCII symbol characters from a file in read1(). The hot
path involved examining readcharfun to determine its type, comparing it against
some known symbols, selecting a function to call, having that function check
whether we’re doing pushback instead of actually reading, blocking input, doing
the actual getc() call, and unblocking input, all for each character. The new
version duplicates a bunch of code, but once it sees we’re reading from a file,
it skips most of that for the common path through the inner loop. This cut
maybe 10% off of some of my run times.
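The obarray figures in the list above can be checked with a little arithmetic:
the average chain length is the symbol count divided by the number of buckets.
The sketch below is only an illustration. avg_chain is a made-up helper name,
the symbol counts are the round lower bounds quoted above, and 16384 is a
hypothetical larger table size, not the value used on the branch.

```shell
# avg_chain SYMBOLS BUCKETS
# Print the average hash-chain length (symbols per bucket), one decimal place.
avg_chain() {
    awk -v n="$1" -v b="$2" 'BEGIN { printf "%.1f\n", n / b }'
}

avg_chain 15000 1511    # Linux build, old obarray size
avg_chain 21000 1511    # Mac build, old obarray size
avg_chain 21000 16384   # Mac build, hypothetical larger obarray (assumption)
```

With exactly 15000 and 21000 symbols the old size gives 9.9 and 13.9, so
averages “over 10 and 14” correspond to more than 15110 and 21154 symbols
respectively.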
With all these changes — Stefan’s new patch with additional initialization, and
my updates to shave a little more time off — I’m still hitting just under 0.2s
for:
time ./temacs --batch --eval '(progn (message "hi") (kill-emacs))'
on Linux/GNU/X11 (Intel Core i5-2320, 3GHz, gcc 4.9); my Mac (Intel Core 2 Duo,
2.8GHz) takes over half a second (including at least one GC invocation).
The branch can be tested by running “temacs” after building it. The lisp load path
will be set based on the source tree, not the installation prefix. If “-nl”
and “-l” arguments are not given, it’ll load “../src/dumped.elc”, but that’s
interpreted relative to the lisp *source* directory. If you build in a
directory other than the checked-out tree (i.e., $srcdir is not “.”) as I do,
you’ll need to copy dumped.elc from the src directory of the build tree where
it’s generated to the src directory of the source tree where it’s sought.
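For an out-of-tree build, the copy step described above might look like the
following. This is a rough sketch, not part of the branch: copy_dumped is a
hypothetical helper, and the two directory arguments stand for wherever your
build tree and checked-out source tree actually live.

```shell
# copy_dumped BUILDDIR SRCDIR
# Copy the generated dumped.elc from the build tree's src directory to the
# source tree's src directory, where temacs looks for it.
copy_dumped() {
    builddir=$1
    srcdir=$2
    # In-tree build: the file is already where it is sought, nothing to do.
    if [ "$(cd "$builddir" && pwd -P)" = "$(cd "$srcdir" && pwd -P)" ]; then
        return 0
    fi
    cp "$builddir/src/dumped.elc" "$srcdir/src/dumped.elc"
}

# Example call (paths are placeholders, adjust to your layout):
#   copy_dumped "$HOME/build/emacs" "$HOME/src/emacs"
```

The copy has to be repeated whenever dumped.elc is regenerated in the build
tree, since temacs only ever reads the copy in the source tree.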
If dumped.elc isn’t found, temacs will exit with status 42. Under Stefan’s
version, an X11 run would spit out a message saying the file wasn’t found and
exit, but a tty run would get into a loop complaining about
internal-echo-keystrokes-prefix and would need to be killed from another
terminal. This way, it only kind of sucks, equally in both cases. :-)
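Since a missing dumped.elc shows up as exit status 42, a wrapper script can
tell that case apart from other failures. A minimal sketch, assuming temacs is
run from the build tree’s src directory; describe_status is a hypothetical
helper, and nothing here rules out some other failure also producing 42.

```shell
# describe_status STATUS
# Turn a temacs exit status into a short human-readable message.
describe_status() {
    case $1 in
        0)  echo "temacs exited normally" ;;
        42) echo "dumped.elc not found (status 42)" ;;
        *)  echo "temacs failed with status $1" ;;
    esac
}

# Example use (commented out so the sketch does not depend on a built temacs):
#   ./temacs --batch --eval '(kill-emacs)'
#   describe_status $?
```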
The remaining time still seems to be about 2/3 reading and parsing bytes,
allocating objects, and updating (mostly scanning) the obarray. There should
be a bit more time that can be squeezed out.
Ken