[Octave-bug-tracker] [bug #34195] Inconsistency detected by ld.so: dl-cl


From: Colin Watson
Subject: [Octave-bug-tracker] [bug #34195] Inconsistency detected by ld.so: dl-close.c: 736: _dl_close: Assertion `map->l_init_called' failed!
Date: Fri, 02 Sep 2011 12:44:13 +0000
User-agent: Mozilla/5.0 (X11; Linux i686; rv:7.0) Gecko/20100101 Firefox/7.0

URL:
  <http://savannah.gnu.org/bugs/?34195>

                 Summary: Inconsistency detected by ld.so: dl-close.c: 736:
_dl_close: Assertion `map->l_init_called' failed!
                 Project: GNU Octave
            Submitted by: cjwatson
            Submitted on: Fri 02 Sep 2011 13:44:12 BST
                Category: Interpreter
                Severity: 3 - Normal
                Priority: 5 - Normal
              Item Group: Crash
                  Status: None
             Assigned to: None
         Originator Name: 
        Originator Email: 
             Open/Closed: Open
         Discussion Lock: Any
                 Release: 3.2.4
        Operating System: GNU/Linux

    _______________________________________________________

Details:

Octave's dynamic library handling sometimes manages to tickle an assertion in
GNU libc's dynamic linker.  I've seen this reported in a few places, for
example:

  https://bugs.launchpad.net/ubuntu/+source/octave-symbolic/+bug/831157
  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=633719
  http://comments.gmane.org/gmane.comp.programming.swig.devel/20631
  http://comments.gmane.org/gmane.comp.gnu.octave.bugs/16976

However, nothing I've found seems to provide a resolution so far, so I've been
spending some time digging into this as part of the general quest to make
every source package in Ubuntu actually build cleanly.

In the last of the URLs above, Xavier Delacour provides a test case, and
comments:

  "Glancing at sources (fetched with apt-get source on that system), the
direct problem is that dlclose is called on the shared library that has been
marked with RTLD_NODELETE (eglibc/elf/dl-close:734). That flag can be set via
dlopen (Octave doesn't do this), as well as under certain circumstances
internal to glibc. One of those is when STB_GNU_UNIQUE symbols ("u" in nm [3])
are loaded in the oct-file"

I have some information to add to that.  The assertion being tripped here
checks that the shared object being closed is in an initialised state.  This
means that (a) it has been opened at some point (which is true here) and (b)
it has not yet been finalised.  There are two ways a dynamically-loaded shared
object can be finalised: one is dlclose, and the other is the general
finalisation that happens on process exit (_dl_fini).  On exit,
dynamically-loaded shared objects are finalised in reverse dependency order,
that is, each object is finalised before any of the objects it depends on.

In the case I'm working on, symbols.oct from the octave-symbolic package
depends on liboctinterp.so.  ld.so therefore finalises symbols.oct first
(setting l_init_called to 0 along the way).  Later, it attempts to finalise
liboctinterp.so.  This goes through the usual C++ process of finalising all
allocated objects, including the symbol table.  In the process it finds the
reference it's still holding to symbols.oct and calls
octave_dld_function::~octave_dld_function on it, which ultimately ends up
calling dlclose on this shared object that's already been finalised.  The
environment variable LD_DEBUG=all was helpful in diagnosing this; it shows:

     22029:     calling fini: /tmp/buildd/octave-symbolic-1.0.9/debian/octave-symbolic/usr/lib/octave/packages/3.2/symbolic-1.0.9/i686-pc-linux-gnu-api-v37/symbols.oct [0]
     22029:     calling fini: /tmp/buildd/liboctave/liboctinterp.so [0]

... followed shortly after by the assertion failure.
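
To make the ordering concrete, here is a toy model of that sequence in C++.
It is only a sketch: LoadedObject, finalise, model_dlclose and
replay_exit_sequence are names I made up for illustration, and none of this is
real ld.so or Octave code.  The init_called field stands in for glibc's
l_init_called, and model_dlclose returns false at the point where I'd expect
the real assertion to fire.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a loaded shared object (hypothetical, not glibc's link_map).
struct LoadedObject {
    std::string name;
    bool init_called;  // mirrors l_init_called: true while initialised

    explicit LoadedObject(std::string n)
        : name(std::move(n)), init_called(true) {}
};

// Models running an object's finalisers, as happens at exit or on close.
void finalise(LoadedObject &obj, std::vector<std::string> &log) {
    log.push_back("calling fini: " + obj.name);
    obj.init_called = false;
}

// Models dlclose: returns false where the assertion discussed above would
// fail, i.e. when the object has already been finalised.
bool model_dlclose(LoadedObject &obj, std::vector<std::string> &log) {
    if (!obj.init_called) {
        log.push_back("assertion would fail: " + obj.name);
        return false;
    }
    finalise(obj, log);
    return true;
}

// Replays the exit sequence described above and returns what happened.
std::vector<std::string> replay_exit_sequence() {
    std::vector<std::string> log;
    LoadedObject oct("symbols.oct");
    LoadedObject interp("liboctinterp.so");

    finalise(oct, log);       // the dependent object is finalised first
    finalise(interp, log);    // then the library it depends on...
    model_dlclose(oct, log);  // ...whose destructors close symbols.oct again
    return log;
}
```

The third step is the interesting one: by the time the interpreter's
destructors ask for symbols.oct to be closed, the model already has
init_called cleared, which is the state the assertion objects to.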

I'm sure Xavier is right that certain symbol types are needed to trigger this
particular path through _dl_close, although I haven't done any investigation
of that myself.  However, I think that in any case an attempt to finalise a
shared object twice is clearly a bug.  It is possible that this is in fact a
dynamic linker bug of sorts: shouldn't it avoid finalising shared objects with
a non-zero dlopen reference count?

However, I think it's possible for Octave to work around this.  If
do_octave_atexit were careful to close any open shared objects (perhaps by
finalising all symbol tables?) before falling through to ::exit, then it seems
to me that it would reliably avoid this assertion failure.
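
As a rough sketch of the shape I have in mind (not a patch, and not based on
Octave's actual data structures): something keeps track of every handle
obtained from dlopen, and the exit path empties it before the dynamic linker
gets a chance to finalise anything.  HandleRegistry and its members are
hypothetical names; the closer is passed in so the sketch stays
self-contained, and in real code I'd assume it would wrap dlclose.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical registry of dynamically-loaded handles; not part of Octave.
class HandleRegistry {
public:
    using Closer = std::function<void(void *)>;

    // closer is assumed to wrap dlclose in a real program.
    explicit HandleRegistry(Closer closer) : closer_(std::move(closer)) {}

    // Copying would lead to handles being closed twice, so forbid it.
    HandleRegistry(const HandleRegistry &) = delete;
    HandleRegistry &operator=(const HandleRegistry &) = delete;

    // If close_all() already ran, nothing is left to close here.
    ~HandleRegistry() { close_all(); }

    void add(void *handle) { handles_.push_back(handle); }

    // Close every handle, most recently added first, and forget each one
    // so that a later call (or the destructor) does not close it again.
    void close_all() {
        while (!handles_.empty()) {
            void *handle = handles_.back();
            handles_.pop_back();
            closer_(handle);
        }
    }

    std::size_t open_count() const { return handles_.size(); }

private:
    Closer closer_;
    std::vector<void *> handles_;
};
```

The idea would be for do_octave_atexit to call something like close_all()
before ::exit, so that any destructor running later finds nothing left to
close.  Whether Octave's symbol table can be made to behave this way is
exactly what I'd need pointers on.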

I don't know Octave well enough to make a guess at the best way to do this. 
I'm happy to have a stab at it given a few pointers, or to test suggested
fixes.




    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?34195>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



