emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Proposal: stack traces with line numbers


From: John Williams
Subject: Re: Proposal: stack traces with line numbers
Date: Mon, 16 Oct 2017 15:43:38 -0700

On Sun, Oct 15, 2017 at 3:01 AM, Helmut Eller <address@hidden> wrote:
> On Sat, Oct 14 2017, John Williams wrote:
>> The information is "attached"
>> using cons cells as keys in a weak-key hash table.  [...]
>
> Unless you care about interpreted code, a non-weak hash-table should be
> enough.  I think this hash table should work similar to
> read-symbol-positions-list.

What I had in mind was a single global hashtable, because that way
it's easy to make it look as if the source refs are physically part of
the annotated cons cells, and users of the API don't need to be aware
that a supplementary data structure even exists. But of course using a
global hashtable with strong keys would create a huge space leak in
the reader.

Is there any particular disadvantage to using weak keys?

>> - I'm storing the information in vectors because it seems like a
>> reasonably efficient use of memory. [...]
>
> It's debatable whether a [file line column] vector is an efficent
> representation.  E.g. all lists in a source form come from the same file
> (or buffer or string) so storing the same filename many times seems
> redundant.  It might also be reasonable to use different representations
> in the debug info than for the data-structures used by the reader or
> compiler.

The file name would be a single string object shared by every ref in a
given file (or nil when there is no file), so we'd only be saving a
few words per source ref (one for the string itself, plus one or two
saved by using a cons cell instead of a two-element vector.)

>> - I'm saving line and column numbers rather that just byte/character
>> offsets [...]
>
> Line/column pairs have the (minor) advantage that line numbers have a
> higher porbability to stay the same after small edits to the source.
> But other than that, it seems to me that character offsets encode the
> same information more compactly.

It seems like we may be talking about different things. I'm speaking
strictly about the in-memory representation produced by the reader,
which will be quickly garbage-collected in most cases (assuming most
elisp code is compiled, either as part of the Emacs build processes,
or by the package manager). I haven't even thought about how to
represent the same information in bytecode, but I assume it will be
quite different, and more focused on compactness.

>> - I'm only attaching information to the head of each list purely as a
>> memory-saving measure. I can't think of scenario where you'd need a
>> source reference for a list without having its head available, except
>> maybe in the expansion of a macro that disassembles its arguments and
>> puts them back together in a new list.  If it's an issue in practice,
>
> In Lisp almost everything is a macro, so I bet that this is an issue.

Maybe. From what I can tell, most function calls in macro arguments
are copied directly into the expansion, so no important information
would be lost in the expansion process, except for a few outliers like
iter-defun, which appears to completely re-assemble the code it's
given. Attaching the same information to every cons cell wouldn't be
difficult, though. Every cell in a given list could share the same
source ref, so the main overhead would be the extra hash table
entries. My guess is that doing so would roughly double or even triple
the average memory footprint of a cons cell produced by the reader,
but I don't think that would be a problem unless you're trying to run
Emacs on an embedded platform, and it's a feature that could be easily
compiled out or disabled at runtime if necessary.

>> I think a better solution would be for the macro expander to propagate
>> source refs to every cons cell in a macro argument at the point where
>> macro expansion takes place.
>
> It's clearly desirable that source positions are propagated
> automatically as often as possible.  That job will be easier if the
> reader records more information.

It would definitely be simpler. I'll defer to others' opinions
regarding the relative merits of each approach.

> So, I think the reader should, at least optionally, also record
> positions of every cons cell not just the first in a list.  Also, in
> addition to the start position the reader could/should also record the
> end position.

That would not be difficult to implement.



reply via email to

[Prev in Thread] Current Thread [Next in Thread]