emacs-devel

Re: Fix to long-standing crashes in GC


From: Eli Zaretskii
Subject: Re: Fix to long-standing crashes in GC
Date: Tue, 25 May 2004 09:07:56 +0200

> Date: Mon, 24 May 2004 22:03:34 -0500 (CDT)
> From: Luc Teirlinck <address@hidden>
> 
>     Once you discover the corrupted Lisp object or data structure, it is
>     useful to look at it in a fresh Emacs session and compare its contents
>     with a session that you are debugging.
> 
> Except that to notice that a Lisp object is corrupted you have to
> _already_ know how its contents look in a fresh Emacs session.

No, that's not what DEBUG wants to say.

A corrupted object is _always_ the one that caused the crash.  That's
why we call `abort' at those places: we've discovered something that
cannot happen with valid Lisp objects.

``Discovering the corrupted Lisp object or data structure'' in the
fragment above means that one needs to find the _enclosing_ data
structure of which the corrupted object is a part.  For example, if
the object that was the immediate cause of the crash is a cdr of some
cons cell, one needs to find out which cons cell that was; if it's a
member of a plist, one needs to find out whose property list it was;
etc.  That is when you make use of the last_marked[] array and walk
the marking code backwards guided by its contents.
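
To make the backwards walk concrete, here is a toy model in Python (not
Emacs code).  A recursive marker logs every object it visits in a small
ring buffer, in the spirit of last_marked[], and a helper then scans the
log from newest to oldest for the cons cell that holds the bad object.
The names Cons, Corrupt, LAST_MARKED_SIZE, recent_first and
find_enclosing are invented for this sketch; the real marking code
handles many more object types than cons cells.

```python
# Toy model only: none of this is Emacs's actual marking code.
LAST_MARKED_SIZE = 8          # hypothetical ring-buffer size


class Cons:
    """A cons cell with a mark flag, as a stand-in for a Lisp object."""
    def __init__(self, name, car=None, cdr=None):
        self.name, self.car, self.cdr = name, car, cdr
        self.marked = False


class Corrupt:
    """Stands in for an object whose type tag is invalid."""
    def __init__(self, name):
        self.name = name


last_marked = [None] * LAST_MARKED_SIZE
last_marked_index = 0


def mark_object(obj):
    """Log OBJ in the ring buffer, then mark it and its children."""
    global last_marked_index
    if obj is None:
        return
    last_marked[last_marked_index % LAST_MARKED_SIZE] = obj
    last_marked_index += 1
    if isinstance(obj, Corrupt):
        # The toy equivalent of calling abort on an impossible object.
        raise RuntimeError("abort: invalid object " + obj.name)
    if obj.marked:
        return
    obj.marked = True
    mark_object(obj.car)
    mark_object(obj.cdr)


def recent_first():
    """Return the logged objects, most recently marked first."""
    n = min(last_marked_index, LAST_MARKED_SIZE)
    return [last_marked[(last_marked_index - 1 - i) % LAST_MARKED_SIZE]
            for i in range(n)]


def find_enclosing(bad):
    """Walk the log backwards to the cons cell that holds BAD."""
    for obj in recent_first()[1:]:      # entry 0 is BAD itself
        if isinstance(obj, Cons):
            if obj.car is bad:
                return obj, "car"
            if obj.cdr is bad:
                return obj, "cdr"
    return None


bad = Corrupt("bad")
inner = Cons("inner", car=Cons("leaf"), cdr=bad)
root = Cons("root", car=Cons("other"), cdr=inner)
try:
    mark_object(root)
except RuntimeError:
    pass                                # the "crash"

enclosing, slot = find_enclosing(bad)
print(enclosing.name, slot)             # inner cdr
```

In a real session you do the same thing by hand: start at the most
recent entry of last_marked[] and step back until you reach an object
one of whose slots contains the offending value.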

> Many Elisp programmers do not have a very good knowledge about the
> very low level C structure of various Lisp objects.

Well, that's something that comes with experience.  However, if you
(or someone else) can share pieces of that knowledge which, if added
to DEBUG, would make the learning curve shorter and/or less steep, we
could certainly use them.

> So I went through all of the last_marked array, without any
> idea of what to look for, that is: how do you recognize a "corrupted
> Lisp object or data structure"?

Does what I wrote above help in any way?  It cannot cover every
possible situation, and of course some knowledge about the object that
was the immediate cause of the call to `abort' _is_ needed, but I
don't see how this can be avoided.

> (gdb) p last_marked[17]
> $2 = 143587538
> (gdb) pr
> #<EMACS BUG: INVALID DATATYPE (MISC 0x0002) Save your buffers
> immediately and please report this bug>

Actually, as DEBUG says, it is not recommended to use `pr' in a
crashed session, especially one that crashed during GC.  `pr' invokes
a function inside Emacs code that looks at Lisp data structures; when
those data structures are corrupted, `pr' could well cause another
segfault and ruin your entire debugging session.

>     This is not easy since GC changes the tag bits and relocates strings
>     which make it hard to look at Lisp objects with commands such as `pr'.
>     It is sometimes necessary to convert Lisp_Object variables into
>     pointers to C struct's manually.
> 
> It says "It is sometimes necessary...".  When?

When `pr' and the x* (xstring, xsymbol, etc.) commands fail to print
the Lisp object.
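
As a rough illustration of what the manual conversion involves, here is
a small Python sketch that splits a raw Lisp_Object word into a type tag
and an address, and builds the GDB cast to use on the result.  It
assumes a layout with the type tag in the low 3 bits of the word, and a
particular tag numbering; both are assumptions made for the example, so
check lisp.h for what your build really uses.  The helper names
split_lisp_object and gdb_print_command are invented here.

```python
# Assumed layout for this sketch only: the type tag sits in the low
# GCTYPEBITS bits of the word and the rest is an aligned address.
GCTYPEBITS = 3
TAG_MASK = (1 << GCTYPEBITS) - 1

# Illustrative tag numbering; the authoritative enum is in lisp.h.
TYPE_NAMES = {
    0: "Int",
    1: "Symbol",
    2: "Misc",
    3: "String",
    4: "Vectorlike",
    5: "Cons",
    6: "Float",
}


def split_lisp_object(word):
    """Split a raw word into (type name, address) under the assumed layout."""
    tag = word & TAG_MASK
    address = word & ~TAG_MASK
    return TYPE_NAMES.get(tag, "unknown tag %d" % tag), address


def gdb_print_command(struct_name, address):
    """Build the GDB command that examines ADDRESS as a C struct."""
    return "print *((struct %s *) %#x)" % (struct_name, address)


# The value printed for last_marked[17] in the session quoted above.
type_name, address = split_lisp_object(143587538)
print(type_name, hex(address))

# The cast to type at the (gdb) prompt for a symbol at a known address.
print(gdb_print_command("Lisp_Symbol", 0xdeadbeef))
```

Under these assumptions the value from your session decodes as a Misc
object, which agrees with what `pr' reported, although that agreement
alone does not prove the layout is the one assumed here.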

> When I see:
> 
> pr
> 
> that is, no output, I can guess it is necessary.
> 
> What if I see:
> 
> pr
> ""
> 
> I know from experience that I still have to use xstring in that case,
> even though the empty string is a perfectly valid return value.  But
> xstring often reveals a different real value anyway.  Is this a bug in
> pr or is this normal?

Again, don't use `pr' in these cases.  Use xtype and the appropriate
x* command according to the type.

When you use x*, a failure to examine an object generates partial
information and an error message, like this:

  (gdb) xsymbol
  $201 = (struct Lisp_Symbol *) 0xdeadbeef
  Argument to arithmetic operation not a number or boolean.

You then need to examine, as a C object, the Lisp_Symbol structure at
the address shown:

  (gdb) print *((struct Lisp_Symbol *) 0xdeadbeef)

> What if I see
> 
> pr
> "dired-find-file"
> 
> Can I trust _this_ or should I still use xstring, that is, should the
> above have said: "It is always necessary, to be safe,..."?

In a crashed session, I personally never trust `pr', and only use it
as a secondary means, to view very complex data structures.  The
xsymbol command and its ilk are your friends.

I'll try to add this info to DEBUG when I have time (unless someone
else beats me to that).



