bug-hurd

From: Samuel Thibault
Subject: Re: [RFC PATCH glibc 24/34] hurd: Only check for TLS initialization inside rtld or in static builds
Date: Thu, 13 Apr 2023 01:46:57 +0200
User-agent: NeoMutt/20170609 (1.8.3)

Sergey Bugaev, on Wed 12 Apr 2023 13:42:50 +0300, wrote:
> before my SSH / network stack died were: (Gmail is surely going to
> wrap this, but hopefully not too badly)
> 
> gcc 
> /home/sergey/glibc/build/elf/dso-sort-tests-src/tst-dso-ordering9_42-bdeca-c.c
> -c -std=gnu11 -fgnu89-inline  -g -O2 -Wall -Wwrite-strings -Wundef
> -Werror -fmerge-all-constants -frounding-math -fno-stack-protector
> -fno-common -Wno-parentheses -Wstrict-prototypes
> -Wold-style-definition -fmath-errno    -fPIC              -I../include
> -I/home/sergey/glibc/build/elf  -I/home/sergey/glibc/build
> -I../sysdeps/mach/hurd/i386  -I../sysdeps/mach/hurd/x86
> -I../sysdeps/mach/hurd/i386/htl  -I../sysdeps/mach/hurd/htl
> -I../sysdeps/hurd/htl  -I../sysdeps/mach/htl  -I../sysdeps/htl/include
> -I../sysdeps/htl  -I../sysdeps/pthread  -I../sysdeps/mach/hurd/x86/htl
>  -I../sysdeps/i386/htl  -I../sysdeps/x86/htl  -I../sysdeps/mach/hurd
> -I../sysdeps/gnu  -I../sysdeps/unix/bsd  -I../sysdeps/unix/inet
> -I../sysdeps/mach/i386  -I../sysdeps/mach/x86
> -I../sysdeps/mach/include -I../sysdeps/mach
> -I../sysdeps/i386/i686/fpu/multiarch  -I../sysdeps/i386/i686/fpu
> -I../sysdeps/i386/i686/multiarch  -I../sysdeps/i386/i686
> -I../sysdeps/i386/fpu  -I../sysdeps/x86/fpu  -I../sysdeps/i386
> -I../sysdeps/x86/include -I../sysdeps/x86  -I../sysdeps/wordsize-32
> -I../sysdeps/ieee754/float128  -I../sysdeps/ieee754/ldbl-96/include
> -I../sysdeps/ieee754/ldbl-96  -I../sysdeps/ieee754/dbl-64
> -I../sysdeps/ieee754/flt-32  -I../sysdeps/hurd/include
> -I../sysdeps/hurd  -I../sysdeps/unix  -I../sysdeps/posix
> -I../sysdeps/ieee754  -I../sysdeps/generic -I../hurd
> -I/home/sergey/glibc/build/hurd/ -I../mach
> -I/home/sergey/glibc/build/mach/ -I.. -I../libio -I.
> -D_LIBC_REENTRANT -include /home/sergey/glibc/build/libc-modules.h
> -DMODULE_NAME=testsuite -include ../include/libc-symbols.h  -DPIC
> -DSHARED     -DTOP_NAMESPACE=glibc -o
> /home/sergey/glibc/build/elf/tst-dso-ordering9-dir/tst-dso-ordering9_42-bdeca-c.os
> client_loop: send disconnect: Broken pipe
> 
> ...but that doesn't seem useful.

Perhaps you can pipe to tee -a build.log?
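
A minimal sketch of that suggestion. Here `run_tests` is a hypothetical stand-in for whatever command is actually being run (for example the test suite invocation), and `build.log` is simply the file name mentioned above:

```shell
# Hypothetical stand-in for the real build/test command.
run_tests() {
    echo "compiling tst-example"
    echo "warning: something on stderr" >&2
}

log=build.log   # assumed log file name

# 2>&1 folds stderr into the pipe so tee records both streams;
# -a appends, so a log left by an earlier run is not truncated.
run_tests 2>&1 | tee -a "$log"
```

Since the log is appended to rather than overwritten, the last lines written before a lock-up can still be compared across several attempts.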

> This is what's shipped in Debian, i.e. prior to my changes. As you can
> see, the libc.so version accesses %gs:0x0 without any checks, which
> makes sense, since __errno_location is just return &errno, 'errno'
> itself being a thread-local. There is no __LIBC_NO_TLS check in the
> source code. And yet it works just fine!

But possibly that's just because the ld.so code that runs before TLS is
set up doesn't access errno.

> > > Would it have been easy for me to run the full test suite, I would
> > > surely do that before submitting any patches. But it's not.
> >
> > Then it's simple: we have to fix that first.
> 
> If that's simple to fix, great!
> 
> Can you reproduce my issues?

No :/

> > > > You can run on master to get the list of current expected failures.
> > >
> > > But that's the thing, I can not :|
> >
> > I meant after having fixed the tests that break your testing, by
> > disabling them as I hinted.
> 
> Alright, after spending a day trying to make this work, I declare this
> a lost cause.

Ergl, sorry it ate up that much time :/

> I have disabled the two tests you suggested, and some more that seemed
> to behave particularly bad.

Which ones? I have been running glibc's testsuite safely for several
years now.

> It still always kills / hard-locks my system, at seemingly random
> places.

Mmm, what is your system setup in terms of disk drivers and the like? As
of now, I am using the Debian 1.8+git20230410-486-dbg kernel and the
gnumach AHCI disk drivers. I have 3635000KB of memory showing up as
total in `free` (kvm -m 3550M, the maximum one can have with <4G phys
addressing).

Perhaps one thing worth noting: my /tmp is on the real disk, not a
tmpfs.

> This means fs corruption, each time, so I'm not willing to try doing
> this again and again.

I understand, but since it's apparently random, this looks like a
problem on your system, not related to glibc specifically, that you'll
probably want to fix anyway.

> Is there any other way for me to reproduce the crashes? If you can
> reproduce them, can you see what's going on, maybe enable LD_DEBUG and
> see if rtld is getting relocated early for some reason? Or maybe you
> could at least get a backtrace, and then we could try to stare at it
> and figure out what's going on together?
> 
> Maybe you're building with some flags that affect this? I'm only doing
> ../configure.

I'm using

../configure --prefix= --enable-pt_chown

I have uploaded the build result of master +
b37899d34d2190ef4b454283188f22519f096048 restored on:

https://dept-info.labri.fr/~thibault/tmp/libc.so.0.3
https://dept-info.labri.fr/~thibault/tmp/ld.so
https://dept-info.labri.fr/~thibault/tmp/test-as-const-rtld-sizes

you can run it by hand with
./ld.so --library-path $PWD ./test-as-const-rtld-sizes

It hangs on my system. I have put the core dump at

https://dept-info.labri.fr/~thibault/tmp/core.18601

which can be inspected with 

gdb ./ld.so core.18601
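
For a non-interactive dump of all thread backtraces from that core, something along these lines should work. It is only a sketch: it assumes `ld.so` and `core.18601` have been downloaded into the current directory, and the output file name `bt.txt` is made up for illustration.

```shell
ldso=./ld.so
core=core.18601

# Only run gdb when it and both files are actually present.
if [ -f "$ldso" ] && [ -f "$core" ] && command -v gdb >/dev/null 2>&1; then
    # -batch exits after running the -ex commands and disables the pager.
    if gdb -batch -ex 'thread apply all bt' "$ldso" "$core" > bt.txt 2>&1; then
        status="wrote bt.txt"
    else
        status="gdb failed, see bt.txt"
    fi
else
    status="skipped: need gdb, $ldso and $core in the current directory"
fi
echo "$status"
```

Batch mode also keeps pager prompts out of the saved output, which makes the result easier to paste into a mail.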


Running live gdb ./ld.so 18529, I get:

(gdb) thread apply all bt

Thread 2 (Thread 18529.2):
#0  0x0102aa3c in __GI___mach_msg_trap () at /usr/src/glibc-upstream/build/mach/mach_msg_trap.S:2
#1  0x0102b1d6 in __GI___mach_msg (msg=0x1315d10, option=3, send_size=64, rcv_size=32, rcv_name=0, timeout=0, notify=0) at msg.c:111
#2  0x012c9850 in __gsync_wait (task=<optimized out>, addr=<optimized out>, val1=<optimized out>, val2=<optimized out>, msec=<optimized out>, flags=<optimized out>) at ./build-tree/hurd-i386-libc/mach/RPC_gsync_wait.c:186
#3  0x0104631b in __GI___spin_lock (__lock=0x12bb844 <_hurd_siglock>) at ../mach/lock-intern.h:60
#4  __GI___mutex_lock (__lock=0x12bb844 <_hurd_siglock>) at ../mach/lock-intern.h:119
#5  __GI__hurd_thread_sigstate (thread=<optimized out>) at hurdsig.c:80
#6  0x0116abb8 in _hurd_critical_section_lock () at ../hurd/hurd/signal.h:230
#7  _hurd_fd_get (fd=2) at ../hurd/hurd/fd.h:74
#8  __GI___write_nocancel (fd=2, buf=0x1315e60, nbytes=<optimized out>) at ../sysdeps/mach/hurd/write_nocancel.c:26
#9  0x01149135 in __GI___libc_write (fd=2, buf=0x1315e60, nbytes=41) at ../sysdeps/mach/hurd/write.c:26
#10 0x0116ff07 in __GI___writev (fd=<optimized out>, vector=<optimized out>, count=<optimized out>) at ../sysdeps/posix/writev.c:87
#11 0x010b9df5 in writev_for_fatal (fd=<optimized out>, total=<optimized out>, niov=<optimized out>, iov=<optimized out>) at ../sysdeps/posix/libc_fatal.c:44
#12 __libc_message (fmt=<optimized out>) at ../sysdeps/posix/libc_fatal.c:124
#13 0x010b9ead in __GI___libc_fatal (message=0x12216b4 "hurd: Can't add reference on Mach thread\n") at ../sysdeps/posix/libc_fatal.c:159
#14 0x01046524 in __GI__hurd_thread_sigstate (thread=<optimized out>) at hurdsig.c:136
#15 0x0103fd33 in __GI__hurd_self_sigstate () at ../hurd/hurd/signal.h:173
#16 _hurd_msgport_receive (arg=<error reading variable: Cannot access memory at address 0x1316004>) at msgportdemux.c:47
Backtrace stopped: Cannot access memory at address 0x1316000

Thread 1 (Thread 18529.1):
#0  0x0102aa3c in __GI___mach_msg_trap () at /usr/src/glibc-upstream/build/mach/mach_msg_trap.S:2
#1  0x0102b1d6 in __GI___mach_msg (msg=0x10018dc, option=3, send_size=64, rcv_size=32, rcv_name=31, timeout=0, notify=0) at msg.c:111
#2  0x012c9850 in __gsync_wait (task=<optimized out>, addr=<optimized out>, val1=<optimized out>, val2=<optimized out>, msec=<optimized out>, flags=<optimized out>) at ./build-tree/hurd-i386-libc/mach/RPC_gsync_wait.c:186
#3  0x01110cb7 in __GI___spin_lock (__lock=<optimized out>) at ../mach/lock-intern.h:60
#4  __GI___mutex_lock (__lock=<optimized out>) at ../mach/lock-intern.h:119
#5  __GI__Fork () at ../sysdeps/mach/hurd/_Fork.c:116
#6  0x01110892 in __libc_fork () at fork.c:74
#7  0x010089af in ?? ()


(gdb) thread apply all bt full

Thread 2 (Thread 18529.2):
#0  0x0102aa3c in __GI___mach_msg_trap () at /usr/src/glibc-upstream/build/mach/mach_msg_trap.S:2
No locals.
#1  0x0102b1d6 in __GI___mach_msg (msg=0x1315d10, option=3, send_size=64, rcv_size=32, rcv_name=0, timeout=0, notify=0) at msg.c:111
        ret = <optimized out>
#2  0x012c9850 in __gsync_wait (task=<optimized out>, addr=<optimized out>, val1=<optimized out>, val2=<optimized out>, msec=<optimized out>, flags=<optimized out>) at ./build-tree/hurd-i386-libc/mach/RPC_gsync_wait.c:186
        Mess = {In = {Head = {msgh_bits = 5395, msgh_size = 64, 
{msgh_remote_port = 36, msgh_remote_port_do_not_use = 36}, {msgh_local_port = 
0, msgh_protected_payload = 0}, msgh_seqno = 0, msgh_id = 4204}, addrType = 
{msgt_name = 2, msgt_size = 32, msgt_number = 1, msgt_inline = 1, msgt_longform 
= 0, msgt_deallocate = 0, msgt_unused = 0}, addr = 19642436, val1Type = 
{msgt_name = 2, msgt_size = 32, msgt_number = 1, msgt_inline = 1, msgt_longform 
= 0, msgt_deallocate = 0, msgt_unused = 0}, val1 = 2, val2Type = {msgt_name = 
2, msgt_size = 32, msgt_number = 1, msgt_inline = 1, msgt_longform = 0, 
msgt_deallocate = 0, msgt_unused = 0}, val2 = 0, msecType = {msgt_name = 2, 
msgt_size = 32, msgt_number = 1, msgt_inline = 1, msgt_longform = 0, 
msgt_deallocate = 0, msgt_unused = 0}, msec = 0, flagsType = {msgt_name = 2, 
msgt_size = 32, msgt_number = 1, msgt_inline = 1, msgt_longform = 0, 
msgt_deallocate = 0, msgt_unused = 0}, flags = 0}, Out = {Head = {msgh_bits = 
5395, msgh_size = 64, {msgh_remote_port = 36, msgh_remote_port_do_not_use = 
36}, {msgh_local_port = 0, msgh_protected_payload = 0}, msgh_seqno = 0, msgh_id 
= 4204}, RetCodeType = {msgt_name = 2, msgt_size = 32, msgt_number = 1, 
msgt_inline = 1, msgt_longform = 0, msgt_deallocate = 0, msgt_unused = 0}, 
RetCode = 19642436}}
        InP = 0x1315d10
        OutP = 0x1315d10
        msg_result = <optimized out>
#3  0x0104631b in __GI___spin_lock (__lock=0x12bb844 <_hurd_siglock>) at ../mach/lock-intern.h:60
        __iptr = 0x12bb844 <_hurd_siglock>
        __flags = 0
#4  __GI___mutex_lock (__lock=0x12bb844 <_hurd_siglock>) at ../mach/lock-intern.h:119
No locals.
#5  __GI__hurd_thread_sigstate (thread=<optimized out>) at hurdsig.c:80
        ss = <optimized out>
#6  0x0116abb8 in _hurd_critical_section_lock () at ../hurd/hurd/signal.h:230
        self = 28
        ss = <optimized out>
        ss = <optimized out>
        self = <optimized out>
#7  _hurd_fd_get (fd=2) at ../hurd/hurd/fd.h:74
        __hurd_critical__ = <optimized out>
        descriptor = <optimized out>
        descriptor = <optimized out>
        __hurd_critical__ = <optimized out>
        cell = <optimized out>
#8  __GI___write_nocancel (fd=2, buf=0x1315e60, nbytes=<optimized out>) at ../sysdeps/mach/hurd/write_nocancel.c:26
        descriptor = <optimized out>
        err = <optimized out>
#9  0x01149135 in __GI___libc_write (fd=2, buf=0x1315e60, nbytes=41) at ../sysdeps/mach/hurd/write.c:26
        ret = <optimized out>
        cancel_oldtype = 0
#10 0x0116ff07 in __GI___writev (fd=<optimized out>, vector=<optimized out>, count=<optimized out>) at ../sysdeps/posix/writev.c:87
        bytes = <optimized out>
        buffer = <optimized out>
        malloced_buffer = 0x0
        to_copy = <optimized out>
        bp = <optimized out>
        bytes_written = <optimized out>
#11 0x010b9df5 in writev_for_fatal (fd=<optimized out>, total=<optimized out>, niov=<optimized out>, iov=<optimized out>) at ../sysdeps/posix/libc_fatal.c:44
        __result = <optimized out>
#12 __libc_message (fmt=<optimized out>) at ../sysdeps/posix/libc_fatal.c:124
        iov = <optimized out>
        total = 41
        buf = <optimized out>
        ap = 0x1315f6c "\001"
        fd = <optimized out>
        list = <optimized out>
        nlist = <optimized out>
        cp = <optimized out>
#13 0x010b9ead in __GI___libc_fatal (message=0x12216b4 "hurd: Can't add reference on Mach thread\n") at ../sysdeps/posix/libc_fatal.c:159
No locals.
#14 0x01046524 in __GI__hurd_thread_sigstate (thread=<optimized out>) at hurdsig.c:136
        err = <optimized out>
        s = <optimized out>
        ss = <optimized out>
#15 0x0103fd33 in __GI__hurd_self_sigstate () at ../hurd/hurd/signal.h:173
        self = 28
        ss = 0x0
        ss = <optimized out>
        self = <optimized out>
#16 _hurd_msgport_receive (arg=<error reading variable: Cannot access memory at address 0x1316004>) at msgportdemux.c:47
No locals.
Backtrace stopped: Cannot access memory at address 0x1316000

Thread 1 (Thread 18529.1):
#0  0x0102aa3c in __GI___mach_msg_trap () at /usr/src/glibc-upstream/build/mach/mach_msg_trap.S:2
No locals.
#1  0x0102b1d6 in __GI___mach_msg (msg=0x10018dc, option=3, send_size=64, rcv_size=32, rcv_name=31, timeout=0, notify=0) at msg.c:111
        ret = <optimized out>
#2  0x012c9850 in __gsync_wait (task=<optimized out>, addr=<optimized out>, val1=<optimized out>, val2=<optimized out>, msec=<optimized out>, flags=<optimized out>) at ./build-tree/hurd-i386-libc/mach/RPC_gsync_wait.c:186
        Mess = {In = {Head = {msgh_bits = 4608, msgh_size = 32, 
{msgh_remote_port = 0, msgh_remote_port_do_not_use = 0}, {msgh_local_port = 31, 
msgh_protected_payload = 31}, msgh_seqno = 29, msgh_id = 4304}, addrType = 
{msgt_name = 2, msgt_size = 32, msgt_number = 1, msgt_inline = 1, msgt_longform 
= 0, msgt_deallocate = 0, msgt_unused = 0}, addr = 28, val1Type = {msgt_name = 
2, msgt_size = 32, msgt_number = 1, msgt_inline = 1, msgt_longform = 0, 
msgt_deallocate = 0, msgt_unused = 0}, val1 = 2, val2Type = {msgt_name = 2, 
msgt_size = 32, msgt_number = 1, msgt_inline = 1, msgt_longform = 0, 
msgt_deallocate = 0, msgt_unused = 0}, val2 = 0, msecType = {msgt_name = 2, 
msgt_size = 32, msgt_number = 1, msgt_inline = 1, msgt_longform = 0, 
msgt_deallocate = 0, msgt_unused = 0}, msec = 0, flagsType = {msgt_name = 2, 
msgt_size = 32, msgt_number = 1, msgt_inline = 1, msgt_longform = 0, 
msgt_deallocate = 0, msgt_unused = 0}, flags = 0}, Out = {Head = {msgh_bits = 
4608, msgh_size = 32, {msgh_remote_port = 0, msgh_remote_port_do_not_use = 0}, 
{msgh_local_port = 31, msgh_protected_payload = 31}, msgh_seqno = 29, msgh_id = 
4304}, RetCodeType = {msgt_name = 2, msgt_size = 32, msgt_number = 1, 
msgt_inline = 1, msgt_longform = 0, msgt_deallocate = 0, msgt_unused = 0}, 
RetCode = 28}}
        InP = 0x10018dc
        OutP = 0x10018dc
        msg_result = <optimized out>
#3  0x01110cb7 in __GI___spin_lock (__lock=<optimized out>) at ../mach/lock-intern.h:60
        __iptr = <optimized out>
        __flags = 0
#4  __GI___mutex_lock (__lock=<optimized out>) at ../mach/lock-intern.h:119
No locals.
#5  __GI__Fork () at ../sysdeps/mach/hurd/_Fork.c:116
        newtask = 19675056
        thread_refs = 16831952
        sigthread_refs = 19762984
        state = {gs = 16784272, fs = 16784276, es = 134259806, ds = 16806696, 
edi = 16784276, esi = 134441884, ebp = 1, esp = 16832864, ebx = 5, edx = 0, ecx 
= 1, eax = 134441520, eip = 16777324, cs = 0, efl = 0, uesp = 134438900, ss = 
19640256}
        threads = 0x0
        stopped = 0
        newproc = 134438900
        nportnames = 0
        thread = 16927252
        sigthread = 19674880
        statecount = 19760428
        portnames = 0x0
        nporttypes = 0
        porttypes = 0x0
        nthreads = 0
        ports_locked = 0
        env = {{__jmpbuf = {19974152, 19974152, 0, 0, 16783680, 17894163}, 
__mask_was_saved = 0, __saved_mask = 134441992}}
        pid = 134438900
        i = 16807887
        err = <optimized out>
        ss = 0x130c808
        __PRETTY_FUNCTION__ = "_Fork"
        lose = <optimized out>
#6  0x01110892 in __libc_fork () at fork.c:74
        multiple_threads = true
        lastrun = 0
        nss_database_data = {nsswitch_conf = {size = 75406264232385142, ino = 
84360991733424116, mtime = {tv_sec = 577421931782412918, tv_nsec = 16826440}, 
ctime = {tv_sec = 72187705911477250, tv_nsec = 134441520}}, services = 
{0x8036c08, 0x100d960, 0x1, 0x5, 0x0, 0x90, 0x8035ff4, 0x1007588, 0x16, 0x54, 
0x107aaf6 <__GI_getenv+102>, 0x1018cf8, 0x1110850 <__libc_fork>, 0xe, 0x0, 
0x801033d <_dl_fixup+13>, 0x100bff4}, reload_disabled = 16784548, initialized = 
false}
        pid = <optimized out>
#7  0x010089af in ?? ()
No symbol table info available.
Backtrace stopped: previous frame inner to this frame (corrupt stack?)



Interestingly, watching for the $gs update:

€ gdb --args ./ld.so --library-path=/tmp ./test-as-const-rtld-sizes
(gdb) b _start
Breakpoint 1 at 0x1a5a0
(gdb) r
Starting program: /tmp/ld.so --library-path /tmp ./test-as-const-rtld-sizes

Thread 5 hit Breakpoint 1, 0x0801a5a0 in _start ()
(gdb) watch $gs
Watchpoint 2: $gs
(gdb) c
Continuing.

Thread 5 hit Watchpoint 2: $gs

Old value = 31
New value = 75
_hurd_tls_init (tcb=0x100e6c0) at ../sysdeps/mach/hurd/i386/tls.h:179
179       __hurd_reply_port0 = MACH_PORT_NULL;
(gdb) bt
#0  _hurd_tls_init (tcb=0x100e6c0) at ../sysdeps/mach/hurd/i386/tls.h:179
#1  call_tls_init_tp (addr=0x100e6c0) at ../sysdeps/generic/dl-call_tls_init_tp.h:31
#2  init_tls (naudit=naudit@entry=0) at rtld.c:797
#3  0x0801db49 in dl_main (phdr=<optimized out>, phnum=<optimized out>, user_entry=<optimized out>, auxv=<optimized out>) at rtld.c:2046
#4  0x0801a07d in go (argdata=0x1001d50) at ../sysdeps/mach/hurd/dl-sysdep.c:172
#5  0x0801fa13 in _hurd_startup (argptr=<optimized out>, main=<optimized out>) at hurdstartup.c:184
#6  0x08019ee5 in _dl_sysdep_start (start_argptr=0x1002000, dl_main=0x801b810 <dl_main>) at ../sysdeps/mach/hurd/dl-sysdep.c:229
#7  0x0801b565 in _dl_start_final (arg=<optimized out>) at rtld.c:495
#8  _dl_start (arg=<optimized out>) at rtld.c:582
#9  0x0801a5ab in _start () from /tmp/ld.so

At that point the library loading has already happened:

(gdb) info sharedlibrary 
From        To          Syms Read   Shared Object Library
0x08000db0  0x080256e1  Yes         /tmp/ld.so
0x0102a650  0x01200d35  No          /tmp/libc.so.0.3
0x012c49a0  0x012d0ad4  No          /tmp/libmachuser.so.1
0x012e0bc0  0x012fee50  No          /tmp/libhurduser.so.0.3

And the function symbols indeed seem to have been overridden:

(gdb) l __write
384     __write (int fd, const void *buf, size_t nbytes)
385     {
386       error_t err;
387       vm_size_t nwrote;
388
389       assert (fd < _hurd_init_dtablesize);


That is why I think the libc functions apparently get exposed before
TLS is set up, which leaves room for mayhem if libc assumes that TLS is
already set up. The loading itself is apparently done in the
_dl_map_object_deps call of dl_main.

Samuel


