emacs-devel

Re: Time to drop the pre-dump phase in the build?


From: Daniel Colascione
Subject: Re: Time to drop the pre-dump phase in the build?
Date: Fri, 10 Jan 2014 21:30:13 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0

On 01/10/2014 09:13 PM, Stefan Monnier wrote:
>> Another possibility is to just allocate enough space in the emacs image
>> itself in BSS, then replace that mapping with a view of the dump file.
>
> Indeed, that should work, assuming you can mmap into existing space.

On POSIX-y systems, you can just mmap on top of the existing section. On Windows, you have to unmap first, but I think it could be made to work.
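
To make the POSIX case concrete, here is a minimal sketch of mapping a file over a zero-initialized static region with MAP_FIXED. The names dump_region, make_fake_dump and map_dump_over_region are made up for illustration and are not anything in the Emacs sources; the 64 KiB size and alignment are assumptions (the alignment attribute is a GCC/Clang extension), and the "dump file" is just a temporary file filled with one byte value.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical reserved region.  It is zero-initialized, so it lives in
   BSS.  The 64 KiB alignment is assumed to be a multiple of the page
   size on the target system.  */
#define DUMP_REGION_SIZE (1u << 16)
static char dump_region[DUMP_REGION_SIZE] __attribute__ ((aligned (65536)));

/* Create an unlinked temporary file holding SIZE bytes of VALUE.
   Return its descriptor, or -1 on failure.  */
static int
make_fake_dump (unsigned char value, size_t size)
{
  char path[] = "/tmp/fake-dump-XXXXXX";
  int fd = mkstemp (path);
  if (fd < 0)
    return -1;
  unlink (path);

  char *buf = malloc (size);
  if (!buf)
    {
      close (fd);
      return -1;
    }
  memset (buf, value, size);
  ssize_t n = write (fd, buf, size);
  free (buf);
  if (n != (ssize_t) size)
    {
      close (fd);
      return -1;
    }
  return fd;
}

/* Replace the BSS region with a private (copy-on-write) view of FD.
   MAP_FIXED atomically discards whatever was mapped there before.
   Return 0 on success, -1 on failure.  */
static int
map_dump_over_region (int fd)
{
  /* MAP_FIXED requires a page-aligned address.  */
  assert ((uintptr_t) dump_region % (uintptr_t) sysconf (_SC_PAGESIZE) == 0);

  void *p = mmap (dump_region, DUMP_REGION_SIZE, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_FIXED, fd, 0);
  return p == (void *) dump_region ? 0 : -1;
}
```

After a successful map_dump_over_region (fd), reads from dump_region see the file contents, and writes go to private copies of the touched pages rather than back to the file. On Windows the same idea would need an unmap step first, which this sketch does not attempt.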

> But not nearly as bad: the main dump problem we have is with generating
> the `emacs' executable, whereas here we'd only need to generate the
> "swap file" which is later loaded into the same executable.
> Should still be a lot more portable.

Do you mean building emacs with a large blob of zeros in .data, using it as a heap, and then replacing the contents of that section (without modifying the executable image structure) to actually "dump" emacs?

>> By the way: is it me, or are we dirtying far too much of the current emacs
>> image? On my Emacs, we're dirtying (and COWing) 8MB; if I make
>> Fgarbage_collect a no-op, that drops to 4MB.
>
> For sure, GC will dirty up pretty much all pages that hold Lisp objects
> (except for those in the purespace), because of the need to set/reset
> the `mark' bit.

I was thinking about this problem. What if we were to just treat all image-backed objects as already marked if they're in pages that are unmodified? (We can perform this test very cheaply, at least on */Linux and Windows.) Then we wouldn't mark them during GC, and we additionally wouldn't demand-page objects just for GC.

The problem we create is that we might have modified image-backed objects reachable only from unmodified image-backed objects, and these modified objects might point to heap-allocated objects that we really should mark. So what if we walk the per-type allocation lists during the *mark* phase and treat all in-image objects on modified pages as individual roots? This way, we eventually mark all heap-allocated objects. (Let's assume that no image-backed unmodified object can directly point to a heap-allocated object.)

This way, we can avoid touching most dumped data structures during GC. We might modify them for other reasons, though, like setting symbol value cells --- but if my quick and dirty GC test worked correctly, we should still save quite a bit on commit charge without worrying about these cases.
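
Here is a rough sketch of the mark phase described above, as a toy model rather than anything resembling the real alloc.c. Everything in it is hypothetical: struct obj, the two-slot ref array, and the page_dirty array, which stands in for whatever platform-specific test tells us that an image page has been modified. It also relies on the stated assumption that an unmodified image-backed object never points directly at a heap object.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy object: a mark bit plus up to two outgoing references.  */
struct obj
{
  bool in_image;        /* object lives in the dumped image */
  bool marked;          /* written only for heap or dirty-page objects */
  size_t page;          /* image page index; unused for heap objects */
  struct obj *ref[2];   /* outgoing references, NULL if unused */
};

/* An image object on an unmodified page counts as marked without
   anyone touching it.  */
static bool
implicitly_marked (const struct obj *o, const bool *page_dirty)
{
  return o->in_image && !page_dirty[o->page];
}

/* Ordinary recursive marking, except that it neither writes to nor
   follows references out of implicitly marked objects.  */
static void
mark (struct obj *o, const bool *page_dirty)
{
  if (!o || implicitly_marked (o, page_dirty) || o->marked)
    return;
  o->marked = true;
  for (size_t i = 0; i < 2; i++)
    mark (o->ref[i], page_dirty);
}

/* Mark from the usual roots, then treat every image object on a
   modified page as an extra root, so that heap objects reachable only
   through modified image objects still get marked.  */
static void
mark_phase (struct obj **roots, size_t nroots,
            struct obj *image, size_t nimage, const bool *page_dirty)
{
  for (size_t i = 0; i < nroots; i++)
    mark (roots[i], page_dirty);

  for (size_t i = 0; i < nimage; i++)
    {
      assert (image[i].in_image);
      if (page_dirty[image[i].page])
        mark (&image[i], page_dirty);
    }
}
```

The cost is conservatism: an object on a modified image page is kept alive, along with everything it references, even if nothing reaches it any more. In exchange, GC never writes to (or even reads the contents of) objects on unmodified image pages.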


