Re: thinking out loud: wip-rtl, ELF, pages, and mmap
From: Nala Ginrut
Subject: Re: thinking out loud: wip-rtl, ELF, pages, and mmap
Date: Mon, 29 Apr 2013 13:47:33 +0800
On Wed, 2013-04-24 at 22:23 +0200, Andy Wingo wrote:
> Hi,
>
> I've been working on wip-rtl recently. The goal is to implement good
> debugging. I'll give a bit of background and then get to my problem.
>
> In master, ".go" files are written in the ELF format. ELF is nice
> because it embodies common wisdom on how to structure object files, and
> this wisdom applies to Guile fairly directly. To simplify, ELF files
> are cheap to load and useful to introspect. The former is achieved with
> "segments", which basically correspond to mmap'd blocks of memory. The
> latter is achieved by "sections", which describe parts of the file. The
> table of segments is usually written at the beginning of the file, to
> make loading easier, and the table of sections is usually at the end, as
> it's not usually needed at runtime. There are usually fewer segments
> than sections. You can have segments in the file that are marked as not
> being loaded into memory at runtime. Usually this is the case for
> debugging information.
>
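To make the segment/section split concrete, here is a rough Python sketch (not Guile code) that reads the locations of both tables out of an ELF file header. It assumes a little-endian ELF64 file; the names read_table_locations and make_header are made up for illustration.

```python
import struct

# Little-endian ELF64 file header: e_ident, e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx (64 bytes in total).
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def read_table_locations(data):
    """Return (offset, entry size, count) for the segment and section tables."""
    if len(data) < ELF64_HEADER.size:
        raise ValueError("too short to hold an ELF64 header")
    (ident, _type, _machine, _version, _entry, phoff, shoff, _flags,
     _ehsize, phentsize, phnum, shentsize, shnum, _shstrndx) = \
        ELF64_HEADER.unpack_from(data, 0)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    return {
        "segments": (phoff, phentsize, phnum),  # program header table
        "sections": (shoff, shentsize, shnum),  # section header table
    }


def make_header(phoff, phnum, shoff, shnum):
    """Build a synthetic header for experiments (hypothetical helper)."""
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    return ELF64_HEADER.pack(ident, 0, 0, 1, 0, phoff, shoff, 0,
                             64, 56, phnum, 64, shnum, 0)
```

With a header built as make_header(64, 2, 4096, 5), the segment table sits right after the header and the section table sits further into the file, which matches the usual layout described above.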
> OK, so that's ELF. The conventional debugging format to use with ELF is
> DWARF, and it's pretty well thought out. In Guile we'll probably use
> DWARF, along with some more basic metadata in .symtab sections.
>
I'm very glad to see that ;-)
And then it would be possible to debug .go files with GDB.
> I should mention that in master, the ELF files are simple wrappers over
> 2.0-style objcode. The wip-rtl branch takes more advantage of ELF --
> for example, to allocate some constants in read-only shareable memory,
> and to statically allocate any constants that need initialization or
> relocation at runtime. ELF also has advantages when we start to do
> native compilation: native code can go in another section, for example.
>
It seems that compiling with RTL is faster, at least for boot-9.scm,
though I haven't actually measured it.
It would also become possible to have external AOT compilers besides
the official built-in one, though maybe that's unnecessary.
> * * *
>
> OK, so that's the thing. I recently added support for writing .symtab
> sections, and have been looking on how to load that up at runtime, for
> example when disassembling functions. To be complete, there are a few
> other common operations that would require loading debug information:
>
> * Procedure names.
> * Line/column information, for example in backtraces.
> * Arity information and argument names.
> * Local variable names and live ranges (the ,locals REPL command).
> * Generic procedure metadata.
>
I also hope the first and last source line numbers of each procedure
get recorded. That is easy to do at compile time. Otherwise I would have
to parse the source file to find them before I could print the source
code in the REPL/debugger.
> Anyway! How do you avoid loading this information at runtime?
>
IMO, we should add a strip command to guild, or conversely a --debug
option to the compile command, and let users decide whether to keep
the debug info.
> The original solution I had in mind was to put them in ELF segments that
> don't get loaded. Then at runtime you would somehow map from an IP to
> an ELF object, and at that point you would lazily load the unloaded ELF
> sections.
>
> But that has a few disadvantages. One is that it's difficult to ensure
> that the lazily-loaded object is the same as the one that you originally
> loaded. We don't keep .go file descriptors open currently, and
> debugging would be a bad reason to do so.
>
> Another more serious is that this is a lot of work, actually. There's a
> constant overhead of the data about what is loaded and how to load what
> isn't, and the cross-references from the debug info to the loaded info
> is tricky.
>
> Then I realized: why am I doing all of this if the kernel has a virtual
> memory system already that does all this for me?
>
> So I have a new plan, I think. I'll change the linker to always emit
> sections and segments that correspond exactly in their on-disk layout
> and in their in-memory layout. (In ELF terms: segments are contiguous,
> with p_memsz == p_filesz.) I'll put commonly needed things at the
> beginning, and debugging info and the section table at the end. Then
> I'll just map the whole thing with PROT_READ, and set PROT_WRITE on
> those page-aligned segments that need it. (Obviously in the future,
> PROT_EXEC as well.)
>
Yeah, when we have AOT ;-P
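Just to check my understanding of the plan, here is a small Python sketch (again, not Guile code) of mapping a whole file read-only and of rounding a segment out to page boundaries, which is the range one would later make writable. The names map_read_only and page_span are hypothetical, and the actual protection change is left out of the sketch.

```python
import mmap


def map_read_only(path):
    """Map the whole file read-only; pages are faulted in lazily by the kernel."""
    with open(path, "rb") as f:
        # The mapping stays valid after the file object is closed, so no
        # descriptor needs to be held open by the caller.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def page_span(offset, size, page=mmap.PAGESIZE):
    """Round [offset, offset + size) outwards to page boundaries."""
    start = offset - offset % page
    end = -(-(offset + size) // page) * page  # ceiling division
    return start, end
```

Because the file layout and the memory layout are identical in this scheme, an offset from the section table can be used directly as an index into the mapping.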
> Then I'll just record a list of ELF objects that have been loaded.
> Simple bisection will map IP -> ELF, and from there we have the section
> table in memory (lazily paged in by the virtual memory system) and can
> find the symtab and other debug info.
>
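The IP -> ELF lookup could look roughly like the following Python sketch. It assumes the loaded objects never overlap in memory; the class name LoadedObjects and its methods are made up for illustration and are not part of Guile.

```python
import bisect


class LoadedObjects:
    """Sorted registry of loaded images, searched by bisection."""

    def __init__(self):
        self._starts = []   # start addresses, kept sorted
        self._objects = []  # (start, end, name), same order as _starts

    def add(self, start, size, name):
        """Record an image occupying [start, start + size)."""
        i = bisect.bisect_left(self._starts, start)
        self._starts.insert(i, start)
        self._objects.insert(i, (start, start + size, name))

    def find(self, ip):
        """Return the name of the image containing ip, or None."""
        i = bisect.bisect_right(self._starts, ip) - 1
        if i < 0:
            return None
        start, end, name = self._objects[i]
        return name if ip < end else None
```

For example, after add(0x1000, 0x1000, "a.go") and add(0x4000, 0x2000, "b.go"), find(0x4500) gives "b.go" and find(0x2000) gives None, since that address falls in the gap between the two images.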
> So that's the plan. It's a significant change, and I wondered if folks
> had some experience or reactions.
>
> Note that we have a read()-based fallback if mmap is not available.
> This strategy also makes the read-based fallback easier.
>
> Thoughts?
>
> Andy