[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#17561: Emacs can forget processes
From: |
Paul Eggert |
Subject: |
bug#17561: Emacs can forget processes |
Date: |
Mon, 26 May 2014 16:58:45 -0700 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0 |
Jorgen Schaefer wrote:
the-bug.txt.gz ... shows the strace of a
running Emacs session which starts to exhibit the bug, so we now have a
trace that includes the time when the bug actually happens. It shows me
starting Python subprocesses (elpy RPC processes) and killing them, in
an attempt to reproduce the problem. Eventually, this succeeds, and
Emacs goes into the loop reading from fd 15.
Thanks for that trace; it's helpful. A couple of things. First, what
platform are you running on, exactly? I'm running on an AMD Phenom II
X4 910e (according to /proc/cpuinfo), and using Fedora 20 x86-64 (Linux
kernel 3.14.4-200.fc20.x86_64, glibc 2.18), and compiling with GCC 4.9.0.
Second, looking at the-bug.txt I'm suspicious about the interaction
between vfork, pthread_sigmask (aka rt_sigprocmask) and rt_sigreturn.
Could you please try building Emacs with fork rather than vfork and see
whether that fixes the problem? If it fixes things, great; if not I'd
like to see another strace of the bug. To use fork instead of vfork,
you can run "./configure ac_cv_func_fork_works=no", or (faster but
flimsier) edit src/config.h to add "#define fork vfork".
Here are some details as to why I think vfork is related to the problem.
You see a bug on line 7717 of the trace in the-bug.txt, where read(15,
...) returns 0 and Emacs just keeps trying. I see bugs before that,
including:
* On trace line 1043, file descriptor 15 is created, as part of the
preparation for vforking and execing subprocess 27012. So far so good.
* On trace line 1048 vfork succeeds. Still good.
* On trace line 1049 the parent process continues, and executes another
system call! The child process has neither execed nor terminated. This
is incorrect; if vfork is called, the Linux man page says "the calling
thread is suspended until the child terminates (either normally, by
calling _exit(2), or abnormally, after delivery of a fatal signal), or
it makes a call to execve(2)". I'm reasonably sure that Emacs relies on
this and will fail badly if the parent and child can stomp on the same
memory in parallel. (Another possibility, alas, is that strace is buggy
and its output unreliable....)
* On trace line 1049, the signal mask is cleared except for QUIT, ALRM,
PROF.
* On trace line 5598, an INT is delivered just as the INT bit is removed
from the signal mask. This strongly suggests that the INT bit was set
in the signal mask -- but that bit was cleared on line 1049, so what
gives? This suggests that the signal mask is messed up somehow. One
possibility is that the buggy vfork is doing it somehow -- perhaps the
signal mask is part of the shared memory?
* On trace line 7706, the subprocess exits but no CHLD signal is
delivered to Emacs. Again, this suggests that the signal mask is messed
up. This last problem most likely leads to the symptoms you're
observing: Emacs isn't told that a child died, so it doesn't check for it.
FYI when I try your test, I get something like the following strace
output near the vfork. This is the correct vfork behavior, i.e., the
parent is suspended until the child execs:
6690 15:51:23.093780 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
<0.000025>
6690 15:51:23.093873 vfork( <unfinished ...>
6692 15:51:23.094078 setsid() = 6692 <0.000070>
6692 15:51:23.094202 open("/dev/tty", O_RDWR|O_CLOEXEC) = -1 ENXIO (No
such device or address) <0.000048>
6692 15:51:23.094310 open("/dev/pts/9", O_RDWR|O_CLOEXEC) = 9 <0.000041>
6692 15:51:23.094409 rt_sigaction(SIGINT, {SIG_DFL, [INT],
SA_RESTORER|SA_RESTART, 0x7f9de6e2fcb0}, {0x4ea440, [INT QUIT ALRM CHLD
PROF WINCH IO], SA_RESTORER, 0x7f9de73d7750}, 8) = 0 <0.000026>
6692 15:51:23.094507 rt_sigaction(SIGQUIT, {SIG_DFL, [QUIT],
SA_RESTORER|SA_RESTART, 0x7f9de6e2fcb0}, {0x4ea440, [INT QUIT ALRM CHLD
PROF WINCH IO], SA_RESTORER, 0x7f9de73d7750}, 8) = 0 <0.000026>
6692 15:51:23.094602 rt_sigaction(SIGPROF, {SIG_DFL, [PROF],
SA_RESTORER|SA_RESTART, 0x7f9de6e2fcb0}, {SIG_IGN, [PROF],
SA_RESTORER|SA_RESTART, 0x7f9de6e2fcb0}, 8) = 0 <0.000025>
6692 15:51:23.094694 rt_sigaction(SIGPIPE, {SIG_DFL, [PIPE],
SA_RESTORER|SA_RESTART, 0x7f9de6e2fcb0}, {SIG_IGN, [PIPE],
SA_RESTORER|SA_RESTART, 0x7f9de6e2fcb0}, 8) = 0 <0.000025>
6692 15:51:23.094793 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
<0.000026>
6692 15:51:23.094877 ioctl(9, SNDCTL_TMR_TIMEBASE or
SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B38400 opost isig icanon echo
...}) = 0 <0.000029>
6692 15:51:23.094963 ioctl(9, SNDCTL_TMR_STOP or
SNDRV_TIMER_IOCTL_GINFO or TCSETSW, {B38400 opost isig icanon -echo
...}) = 0 <0.000030>
6692 15:51:23.095080 ioctl(9, SNDCTL_TMR_TIMEBASE or
SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, {B38400 opost isig icanon -echo
...}) = 0 <0.000029>
6692 15:51:23.095170 getpid() = 6692 <0.000025>
6692 15:51:23.095258 chdir("/home/eggert/src/gnu/emacs/emacs-24-sc") =
0 <0.000038>
6692 15:51:23.095466 dup2(9, 0) = 0 <0.000028>
6692 15:51:23.095545 dup2(9, 1) = 1 <0.000027>
6692 15:51:23.095621 dup2(9, 2) = 2 <0.000025>
6692 15:51:23.095696 setpgid(0, 0) = -1 EPERM (Operation not
permitted) <0.000026>
6692 15:51:23.095776 ioctl(0, SNDRV_TIMER_IOCTL_SELECT or TIOCSPGRP,
[6692]) = 0 <0.000027>
6692 15:51:23.095872
execve("/home/eggert/opt/Linux-x86_64/merged/bin/cat",
["/home/eggert/opt/Linux-x86_64/me"...], [/* 100 vars */] <unfinished ...>
6690 15:51:23.096361 <... vfork resumed> ) = 6692 <0.002462>
6690 15:51:23.096440 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
<0.000028>
- bug#17561: 24.4.50; Emacs can forget processes, Jorgen Schaefer, 2014/05/23
- bug#17561: Emacs can forget processes, Paul Eggert, 2014/05/23
- bug#17561: Emacs can forget processes, Jorgen Schaefer, 2014/05/23
- bug#17561: Emacs can forget processes, Paul Eggert, 2014/05/24
- bug#17561: Emacs can forget processes, Jorgen Schaefer, 2014/05/25
- bug#17561: Emacs can forget processes, Paul Eggert, 2014/05/26
- bug#17561: Emacs can forget processes, Jorgen Schaefer, 2014/05/26
- bug#17561: Emacs can forget processes,
Paul Eggert <=
- bug#17561: Emacs can forget processes, Jorgen Schaefer, 2014/05/27
- bug#17561: Emacs can forget processes, Paul Eggert, 2014/05/27
- bug#17561: Emacs can forget processes, Jorgen Schaefer, 2014/05/27
- bug#17561: Emacs can forget processes, Paul Eggert, 2014/05/27
- bug#17561: Emacs can forget processes, Jorgen Schaefer, 2014/05/28
- bug#17561: Emacs can forget processes, Paul Eggert, 2014/05/28
- bug#17561: Emacs can forget processes, Jorgen Schaefer, 2014/05/28
- bug#17561: Emacs can forget processes, Paul Eggert, 2014/05/28
- bug#17561: Emacs can forget processes, Jorgen Schaefer, 2014/05/29
- bug#17561: Emacs can forget processes, Paul Eggert, 2014/05/29