emacs-bug-tracker
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[debbugs-tracker] bug#19883: closed (Smob's mark_smob has become unrelia


From: GNU bug Tracking System
Subject: [debbugs-tracker] bug#19883: closed (Smob's mark_smob has become unreliable in Guile 2.x)
Date: Thu, 23 Jun 2016 10:37:01 +0000

Your message dated Thu, 23 Jun 2016 12:36:34 +0200
with message-id <address@hidden>
and subject line Re: bug#19883: Correction for backtrace
has caused the debbugs.gnu.org bug report #19883,
regarding Smob's mark_smob has become unreliable in Guile 2.x
to be marked as done.

(If you believe you have received this mail in error, please contact
address@hidden)


-- 
19883: http://debbugs.gnu.org/cgi/bugreport.cgi?bug=19883
GNU Bug Tracking System
Contact address@hidden with problems
--- Begin Message --- Subject: Smob's mark_smob has become unreliable in Guile 2.x Date: Mon, 16 Feb 2015 17:41:36 +0100
This precludes porting LilyPond to Guile 2 since LilyPond relies
extensively on the mark_smob functionality.

The symptom likely occurs when an object is collected and consequently
its free_smob function gets called.  This free_smob function essentially
is responsible for freeing all resources claimed by the smob, rendering
the underlying data moot.

If mark_smob continues to get called, its interpretation of the now dead
data is not likely to lead to happy results.

Debugging is tricky since the gc phases tend to occur asynchronously.
If I start the accompanying program with
./test 2 20000 40
I tend to get a core dump perhaps every fourth call.  Running repeatedly
in a debugger does not produce core dumps (perhaps one needs to reload
the debugger for each try?), so one needs to do

ulimit -c unlimited
./test 2 20000 40

and, after the problem triggers

gdb ./test core

for post-mortem debugging.  A traceback looks like
(gdb) bt
#0  0x00000000 in ?? ()
#1  0x0804aee0 in ?? ()
#2  0xb761b2da in ?? () from /usr/lib/libguile-2.0.so.22
#3  0xb72751f8 in GC_mark_from () from /usr/lib/i386-linux-gnu/libgc.so.1
#4  0xb72766ca in GC_mark_some () from /usr/lib/i386-linux-gnu/libgc.so.1
#5  0xb726cd16 in GC_stopped_mark () from /usr/lib/i386-linux-gnu/libgc.so.1
#6  0xb726d730 in GC_try_to_collect_inner ()
   from /usr/lib/i386-linux-gnu/libgc.so.1
#7  0xb726e12a in GC_collect_or_expand ()
   from /usr/lib/i386-linux-gnu/libgc.so.1
#8  0xb726e2b1 in GC_allocobj () from /usr/lib/i386-linux-gnu/libgc.so.1
#9  0xb72731a4 in GC_generic_malloc_inner ()
   from /usr/lib/i386-linux-gnu/libgc.so.1
#10 0xb72732be in GC_generic_malloc () from /usr/lib/i386-linux-gnu/libgc.so.1
#11 0xb761b81e in scm_i_new_smob () from /usr/lib/libguile-2.0.so.22
#12 0x0804985a in std::vector<void (*)(), std::allocator<void (*)()> 
>::_M_insert_aux(__gnu_cxx::__normal_iterator<void (**)(), std::vector<void 
(*)(), std::allocator<void (*)()> > >, void (* const&)()) ()
#13 0x0804a6fc in __gnu_cxx::new_allocator<void (*)()>::allocate(unsigned int, 
void const*) ()
#14 0x0804a03d in void std::_Destroy<void (**)(), void (*)()>(void (**)(), void 
(**)(), std::allocator<void (*)()>&) ()
#15 0x08049a73 in void std::_Destroy<Family**, Family*>(Family**, Family**, 
std::allocator<Family*>&) ()
#16 0x0804934d in scm_puts ()
#17 0x0804938f in Scm_init::Scm_init(void (*)()) ()
#18 0x0804938f in Scm_init::Scm_init(void (*)()) ()
#19 0x0804938f in Scm_init::Scm_init(void (*)()) ()
#20 0x0804938f in Scm_init::Scm_init(void (*)()) ()
#21 0x0804938f in Scm_init::Scm_init(void (*)()) ()
#22 0x0804938f in Scm_init::Scm_init(void (*)()) ()
#23 0x0804938f in Scm_init::Scm_init(void (*)()) ()
#24 0x0804938f in Scm_init::Scm_init(void (*)()) ()
#25 0x0804938f in Scm_init::Scm_init(void (*)()) ()
#26 0x0804938f in Scm_init::Scm_init(void (*)()) ()
#27 0x0804938f in Scm_init::Scm_init(void (*)()) ()
#28 0x0804938f in Scm_init::Scm_init(void (*)()) ()
#29 0x0804938f in Scm_init::Scm_init(void (*)()) ()
#30 0x080495e7 in std::vector<Family*, std::allocator<Family*> 
>::operator[](unsigned int) ()
#31 0xb75b7dfd in ?? () from /usr/lib/libguile-2.0.so.22
#32 0xb76418e7 in ?? () from /usr/lib/libguile-2.0.so.22
#33 0xb761afb9 in ?? () from /usr/lib/libguile-2.0.so.22
 #34 0xb7659f20 in ?? () from /usr/lib/libguile-2.0.so.22
#35 0xb765a539 in ?? () from /usr/lib/libguile-2.0.so.22
#36 0xb75c24f3 in scm_call_4 () from /usr/lib/libguile-2.0.so.22
#37 0xb7641acf in scm_catch_with_pre_unwind_handler ()
   from /usr/lib/libguile-2.0.so.22
#38 0xb7641bd4 in scm_c_catch () from /usr/lib/libguile-2.0.so.22
#39 0xb75b85d1 in ?? () from /usr/lib/libguile-2.0.so.22
#40 0xb75b86d3 in scm_c_with_continuation_barrier ()
   from /usr/lib/libguile-2.0.so.22
#41 0xb763ef7e in ?? () from /usr/lib/libguile-2.0.so.22
#42 0xb72782c1 in GC_call_with_stack_base ()
   from /usr/lib/i386-linux-gnu/libgc.so.1
#43 0xb763f3e6 in scm_with_guile () from /usr/lib/libguile-2.0.so.22
#44 0x080496c5 in void __gnu_cxx::__alloc_traits<std::allocator<void (*)()> 
>::construct<void (*)()>(std::allocator<void (*)()>&, void (**)(), void (* 
const&)()) ()
#45 0xb72cfa83 in __libc_start_main (main=0x4, argc=134517216, argv=0x0, 
    init=0x8049201 <main+16>, 
    fini=0x804967e <std::_Vector_base<void (*)(), std::allocator<void (*)()> 
>::~_Vector_base()+40>, rtld_fini=0x4, stack_end=0xbfca7aa4) at libc-start.c:287
#46 0xb7750000 in ?? () from /lib/ld-linux.so.2
Backtrace stopped: previous frame inner to this frame (corrupt stack?)

The frames listed as

#17 0x0804938f in Scm_init::Scm_init(void (*)()) ()
#18 ...

appear to be a fluke in address/symbol correlation and do not seem to
have an actual relation to the listed function when looking at the code.

I append all the requisite source files producing the test binary when
running "make".  I can send my binary (when desired) which was compiled
using 2.0.11 on a i586 platform.  However, I would expect the problem to
reproduce in some kind of manner on other systems.

The compiler was gcc version 4.9.1 (Ubuntu 4.9.1-16ubuntu6), Target:
i686-linux-gnu.

The headers are somewhat cooked-down variants (to remove dependencies on
other parts of LilyPond) of the Smob organizing classes used in
LilyPond.  If one wants to compile for Guile v1 for comparison, one
probably needs

typedef void * scm_t_func;

or so somewhere.

Attachment: guile-bug.tgz
Description: application/gtar-compressed

-- 
David Kastrup

--- End Message ---
--- Begin Message --- Subject: Re: bug#19883: Correction for backtrace Date: Thu, 23 Jun 2016 12:36:34 +0200 User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)
On Thu 23 Jun 2016 11:50, Andy Wingo <address@hidden> writes:

> On Thu 26 Feb 2015 16:30, David Kastrup <address@hidden> writes:
>
>> Try ./test 2 2000 200
>
> I can reproduce the crash with your test case, thanks :) The patch below
> fixes the bug for me.  WDYT Ludovic?

Here's a patch with a test case.  I'm going to apply as it seems to be
obviously the right thing and the test case does reproduce what I think
is the bug (racing mark and finalize procedures, even if it's only
happening in one thread, finalizers and mark procedures do introduce
concurrency).  We trigger the concurrency in a simple way, via
allocation in the finalizer.  The patch does fix the original test.  GC
could happen due to another thread of course.  I'm actually not sure
where the concurrency is coming from in David's test though :/

I'm very interested in any feedback you might have!

Andy

>From 8dff3af087c6eaa83ae0d72aa8b22aef5c65d65d Mon Sep 17 00:00:00 2001
From: Andy Wingo <address@hidden>
Date: Thu, 23 Jun 2016 11:47:42 +0200
Subject: [PATCH] Fix race between SMOB marking and finalization

* libguile/smob.c (clear_smobnum): New helper.
  (finalize_smob): Re-set the smobnum to the "finalized smob" type
  before finalizing.  Fixes #19883.
  (scm_smob_prehistory): Pre-register a "finalized smob" type, which has
  no mark procedure.
* test-suite/standalone/test-smob-mark-race.c: New file.
* test-suite/standalone/Makefile.am: Arrange to build and run the new
  test.
---
 libguile/smob.c                             | 33 +++++++++++++--
 test-suite/standalone/Makefile.am           |  6 +++
 test-suite/standalone/test-smob-mark-race.c | 65 +++++++++++++++++++++++++++++
 3 files changed, 101 insertions(+), 3 deletions(-)
 create mode 100644 test-suite/standalone/test-smob-mark-race.c

diff --git a/libguile/smob.c b/libguile/smob.c
index 90849a8..ed9d91a 100644
--- a/libguile/smob.c
+++ b/libguile/smob.c
@@ -374,20 +374,43 @@ scm_gc_mark (SCM o)
 }
 
 
+static void*
+clear_smobnum (void *ptr)
+{
+  SCM smob;
+  scm_t_bits smobnum;
+
+  smob = PTR2SCM (ptr);
+
+  smobnum = SCM_SMOBNUM (smob);
+  /* Frob the object's type in place, re-setting it to be the "finalized
+     smob" type.  This will prevent other routines from accessing its
+     internals in a way that assumes that the smob data is valid.  This
+     is notably the case for SMOB's own "mark" procedure, if any; as the
+     finalizer runs without the alloc lock, it's possible for a GC to
+     occur while it's running, in which case the object is alive and yet
+     its data is invalid.  */
+  SCM_SET_SMOB_DATA_0 (smob, SCM_SMOB_DATA_0 (smob) & ~(scm_t_bits) 0xff00);
+
+  return (void *) smobnum;
+}
+
 /* Finalize SMOB by calling its SMOB type's free function, if any.  */
 static void
 finalize_smob (void *ptr, void *data)
 {
   SCM smob;
+  scm_t_bits smobnum;
   size_t (* free_smob) (SCM);
 
   smob = PTR2SCM (ptr);
+  smobnum = (scm_t_bits) GC_call_with_alloc_lock (clear_smobnum, ptr);
+
 #if 0
-  printf ("finalizing SMOB %p (smobnum: %u)\n",
-         ptr, SCM_SMOBNUM (smob));
+  printf ("finalizing SMOB %p (smobnum: %u)\n", ptr, smobnum);
 #endif
 
-  free_smob = scm_smobs[SCM_SMOBNUM (smob)].free;
+  free_smob = scm_smobs[smobnum].free;
   if (free_smob)
     free_smob (smob);
 }
@@ -470,6 +493,7 @@ void
 scm_smob_prehistory ()
 {
   long i;
+  scm_t_bits finalized_smob_tc16;
 
   scm_i_pthread_key_create (&current_mark_stack_pointer, NULL);
   scm_i_pthread_key_create (&current_mark_stack_limit, NULL);
@@ -493,6 +517,9 @@ scm_smob_prehistory ()
       scm_smobs[i].apply      = 0;
       scm_smobs[i].apply_trampoline_objcode = SCM_BOOL_F;
     }
+
+  finalized_smob_tc16 = scm_make_smob_type ("finalized smob", 0);
+  if (SCM_TC2SMOBNUM (finalized_smob_tc16) != 0) abort ();
 }
 
 /*
diff --git a/test-suite/standalone/Makefile.am 
b/test-suite/standalone/Makefile.am
index 712418a..aec7418 100644
--- a/test-suite/standalone/Makefile.am
+++ b/test-suite/standalone/Makefile.am
@@ -275,4 +275,10 @@ test_smob_mark_LDADD = $(LIBGUILE_LDADD)
 check_PROGRAMS += test-smob-mark
 TESTS += test-smob-mark
 
+test_smob_mark_race_SOURCES = test-smob-mark-race.c
+test_smob_mark_race_CFLAGS = ${test_cflags}
+test_smob_mark_race_LDADD = $(LIBGUILE_LDADD)
+check_PROGRAMS += test-smob-mark-race
+TESTS += test-smob-mark-race
+
 EXTRA_DIST += ${check_SCRIPTS}
diff --git a/test-suite/standalone/test-smob-mark-race.c 
b/test-suite/standalone/test-smob-mark-race.c
new file mode 100644
index 0000000..eca0325
--- /dev/null
+++ b/test-suite/standalone/test-smob-mark-race.c
@@ -0,0 +1,65 @@
+/* Copyright (C) 2016 Free Software Foundation, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public License
+ * as published by the Free Software Foundation; either version 3 of
+ * the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+ * 02110-1301 USA
+ */
+
+#if HAVE_CONFIG_H
+#include <config.h>
+#endif
+
+#undef NDEBUG
+
+#include <libguile.h>
+#include <assert.h>
+
+static SCM
+mark_smob (SCM smob)
+{
+  assert (SCM_SMOB_DATA (smob) == 1);
+  return SCM_BOOL_F;
+}
+
+static size_t
+finalize_smob (SCM smob)
+{
+  assert (SCM_SMOB_DATA (smob) == 1);
+  SCM_SET_SMOB_DATA (smob, 0);
+  /* Allocate a bit in the hopes of triggering a new GC, making the
+     marker race with the finalizer.  */
+  scm_cons (SCM_BOOL_F, SCM_BOOL_F);
+  return 0;
+}
+
+static void
+tests (void *data, int argc, char **argv)
+{
+  scm_t_bits tc16;
+  int i;
+
+  tc16 = scm_make_smob_type ("smob with finalizer", 0);
+  scm_set_smob_mark (tc16, mark_smob);
+  scm_set_smob_free (tc16, finalize_smob);
+
+  for (i = 0; i < 1000 * 1000; i++)
+    scm_new_smob (tc16, 1);
+}
+
+int
+main (int argc, char *argv[])
+{
+  scm_boot_guile (argc, argv, tests, NULL);
+  return 0;
+}
-- 
2.8.3


--- End Message ---

reply via email to

[Prev in Thread] Current Thread [Next in Thread]