address@hidden (Balazs Scheidler)] bug in thread support
From: Martin Stjernholm
Subject: address@hidden (Balazs Scheidler)] bug in thread support
Date: Tue, 20 May 2003 18:24:25 +0200
User-agent: Gnus/5.090016 (Oort Gnus v0.16) Emacs/20.7 (gnu/linux)
I'd like to point out that the bug reported by Balazs Scheidler on Oct
21st 2001 (see below) still exists in glibc 2.3.2 (and/or its
accompanying linuxthreads lib) together with linux kernel 2.4.20
(smp).
This bug, being a fairly rare race of some sort, has proven very
difficult to narrow down in a heavily loaded production environment. I
therefore think it deserves more attention.
I'm not connected with Balazs Scheidler in any way (but I'd like to
thank him for producing the report with a test case that we could
verify). Our application is completely different; what the two have
in common is lots of threads doing lots of I/O in combination with
frequent syslog(3) calls.
An excerpt of Scheidler's report follows. For the complete messages
including the code for his test case, see
http://mail.gnu.org/archive/html/bug-glibc/2001-10/msg00103.html,
http://sources.redhat.com/ml/libc-hacker/2001-10/msg00020.html, and
http://www.ussg.iu.edu/hypermail/linux/kernel/0109.3/1294.html.
> From: Balazs Scheidler
> Subject: bug in thread support
> Date: Sun, 21 Oct 2001 14:22:21 +0200
>
> Hi,
>
> I was sending this information and example program to the linux kernel
> folks, but they responded that this must be a libc bug instead. So I'm
> sending this information to you. (the thread on the linux-kernel mailing
> list should give you additional information in addition to this message)
>
> So the problem: we are developing a massively multithreaded application.
> This application sends syslog() messages from its threads. The problem I'm
> encountering seems to be related to SIGPIPE handling (either the kernel
> signal code, the libc signal code or the linuxthreads signal code)
>
> Our application starts a new thread for each new TCP session. Writing to
> sockets may result in a SIGPIPE being delivered and an EPIPE being returned
> from write() when the remote end closes its socket. If this SIGPIPE happens
> at about the same time as a syslog() libc call, a segmentation fault occurs.
> Since core dumping of multithreaded programs does not work reliably, I
> implemented a quick&dirty backtrace function, which dumps the stack when a
> signal occurs. (see the attached test program)
>
> My backtrace function reports that the SIGSEGV occurs at virtual address
> 0x1:
>
> address@hidden:~$ cc -g -lpthread stressthreads.c
> address@hidden:~$ ./a.out
> Signal (11) received, stackdump follows; eax='ffffffe0', ebx='0000001d',
> ecx='bc5ff96c', edx='00000400', eip='00000001'
> retaddr=0x1, ebp=0xbc5ff944
> retaddr=0x8048a2a, ebp=0xbc5ffd74
> retaddr=0x4001bc9f, ebp=0xbc5ffe34
> address@hidden:~$ gdb a.out
> GNU gdb 19990928
> Copyright 1998 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB. Type "show warranty" for details.
> This GDB was configured as "i686-pc-linux-gnu"...
> (gdb) info line *0x8048a2a
> Line 80 of "stressthreads.c" starts at address 0x8048a12 <thread_func+118>
> and ends at 0x8048a2d <thread_func+145>.
> (gdb) l stressthreads.c:80
> 75 #endif
> 76
> 77 memset(buf, 'a', sizeof(buf));
> 78 for (i = 0; i < 1024; i++)
> 79 {
> 80 write(fd, buf, sizeof(buf));
> 81 }
> 82 close(fd);
> 83 //syslog(LOG_DEBUG, "thread stopped...%p\n", pthread_self());
> 84 free(arg);
> (gdb) x/2i 0x8048a25
> 0x8048a25 <thread_func+137>: call 0x8048680 <write>
> 0x8048a2a <thread_func+142>: add $0x10,%esp
>
> So the virtual address 0x8048a2a points to where the write() call returns.