bug-gnu-emacs
bug#17561: Emacs can forget processes


From: Paul Eggert
Subject: bug#17561: Emacs can forget processes
Date: Mon, 26 May 2014 10:08:38 -0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0

Jorgen Schaefer wrote:
The trace starts with a number of these triplets, which seem to be
"Emacs behaving normally".

16300 09:41:27.072717 pselect6(20, [3 4 5 6 8 10 14 15 19], [], NULL, {0, 176410474}, {NULL, 8}) = 0 (Timeout) <0.176635>
16300 09:41:27.249649 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000011>
16300 09:41:27.249881 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000010>

I don't observe this behavior when running Emacs on Fedora 20 (this is the latest emacs-24 version, running your little test). I ran:

strace -o /tmp/tr -fttT src/bootstrap-emacs -nw -Q

I did see this:

29589 09:44:10.979952 ioctl(4, FIONREAD, [0]) = 0 <0.000023>
29589 09:44:10.980031 ioctl(4, FIONREAD, [0]) = 0 <0.000024>
29589 09:44:10.980143 pselect6(6, [4 5], [], NULL, {0, 499810466}, {NULL, 8}) = 0 (Timeout) <0.500499>
29589 09:44:11.480745 poll([{fd=5, events=POLLIN}], 1, 0) = 0 (Timeout) <0.000030>
29589 09:44:11.480861 read(5, 0x7fff777b3820, 16) = -1 EAGAIN (Resource temporarily unavailable) <0.000029>
29589 09:44:11.481214 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000030>
29589 09:44:11.481505 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000027>
29589 09:44:11.481594 ioctl(4, FIONREAD, [0]) = 0 <0.000030>

but that's not really the same. Could you please try running emacs -nw -Q with your test case, and see whether it behaves like the pattern on my platform? If so, we might try to investigate why Emacs changes from this pattern to the pattern that you observe.
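One way to compare the two patterns mechanically is to reduce each trace to its sequence of syscall names. The following is only a rough sketch in Python, assuming the trace was produced with strace's -f, -tt and -T options as in the command above; the names LINE_RE, parse_line and syscall_names are made up for this illustration, and lines that are not complete syscalls (signal deliveries, "<unfinished ...>" and "resumed" fragments) are simply skipped.

```python
import re

# One complete syscall line from `strace -f -tt -T`:
#   PID HH:MM:SS.usec name(args) = result <duration>
LINE_RE = re.compile(
    r"^(?P<pid>\d+)\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<name>\w+)\((?P<args>.*)\)\s*=\s*(?P<result>.*?)"
    r"(?:\s+<(?P<duration>\d+\.\d+)>)?$"
)


def parse_line(line):
    """Return a dict describing one complete syscall line, or None otherwise."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    duration = m.group("duration")
    return {
        "pid": int(m.group("pid")),
        "time": m.group("time"),
        "name": m.group("name"),
        "args": m.group("args"),
        "result": m.group("result"),
        "duration": float(duration) if duration is not None else None,
    }


def syscall_names(lines):
    """Reduce a trace to the ordered list of syscall names it contains."""
    parsed = (parse_line(line) for line in lines)
    return [p["name"] for p in parsed if p is not None]
```

Running syscall_names over both trace files and comparing the resulting lists would show, for example, whether the poll/read pair before each rt_sigprocmask pair is present in one trace and absent in the other.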

Then there's a large bunch of syscalls related to my command to restart
the process, with the "\r" now being me sending the M-x command:

Which M-x command was that? M-x list-processes?

16300 09:41:28.298391 read(3, "\r", 1)  = 1 <0.000012>
16300 09:41:28.298438 ioctl(3, FIONREAD, [0]) = 0 <0.000009>
16300 09:41:28.298480 ioctl(3, FIONREAD, [0]) = 0 <0.000009>
16300 09:41:28.312476 write(3, "\r", 1) = 1 <0.000021>
16300 09:41:28.317392 kill(4294953129, SIGHUP) = 0 <0.002235>
16300 09:41:28.321642 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000017>
16300 09:41:28.321841 write(3, "\33[K\33[H\n\n", 8) = 8 <0.000018>
16300 09:41:28.321909 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000012>

When I try your test, the child process runs in parallel with the parent, the 'kill' terminates the child, and the parent is signaled. In contrast, your trace shows no child. My guess is that the child had already exited, so the parent was killing a zombie. If so, the trace is not from the test case you sent but from some other process, since the test case you sent waits on a pty and its child shouldn't exit. Here's the trace I see around the kill:

29592 09:44:24.494089 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) <0.000028>
29589 09:44:24.494167 kill(4294937704, SIGHUP <unfinished ...>
29592 09:44:24.494199 open("/usr/lib64/alliance/lib/tls/x86_64/libc.so.6", O_RDONLY|O_CLOEXEC <unfinished ...>
29589 09:44:24.494218 <... kill resumed> ) = 0 <0.000040>
29592 09:44:24.494236 <... open resumed> ) = -1 ENOENT (No such file or directory) <0.000025>
29592 09:44:24.494267 --- SIGHUP {si_signo=SIGHUP, si_code=SI_USER, si_pid=29589, si_uid=1000} ---
29592 09:44:24.494375 +++ killed by SIGHUP +++
29589 09:44:24.494388 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=29592, si_status=SIGHUP, si_utime=0, si_stime=0} ---
29589 09:44:24.494435 wait4(29592, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGHUP}], WNOHANG|WSTOPPED|WCONTINUED, NULL) = 29592 <0.000037>
29589 09:44:24.494507 rt_sigreturn()    = 12159536 <0.000021>
29589 09:44:24.494603 rt_sigprocmask(SIG_BLOCK, [ALRM], NULL, 8) = 0 <0.000019>
29589 09:44:24.494653 rt_sigprocmask(SIG_UNBLOCK, [ALRM], NULL, 8) = 0 <0.000017>
29589 09:44:24.495590 rt_sigprocmask(SIG_BLOCK, [WINCH IO], NULL, 8) = 0 <0.000025>
29589 09:44:24.496152 write(4, "\r\n\33[?25lnil\33[48;34H\33[7m11\33[0m\33[3"..., 70) = 70 <0.000036>
29589 09:44:24.496236 rt_sigprocmask(SIG_UNBLOCK, [WINCH IO], NULL, 8) = 0 <0.000015>
29589 09:44:24.496282 --- SIGIO {si_signo=SIGIO, si_code=SI_KERNEL} ---
29589 09:44:24.496306 rt_sigreturn()    = 0 <0.000015>
29589 09:44:24.496350 ioctl(4, FIONREAD, [0]) = 0 <0.000016>
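A note on reading the kill lines: strace here prints the first argument of kill as an unsigned 32-bit number, so the large values are really negative pids, and a negative pid other than -1 means "signal this process group". The small Python sketch below decodes such a value, assuming pid_t is a 32-bit signed integer as on Linux; decode_kill_target is a made-up helper name for this illustration.

```python
def decode_kill_target(raw):
    """Interpret strace's unsigned 32-bit rendering of kill()'s pid argument.

    Assumes a 32-bit signed pid_t (true on Linux).
    """
    # Undo the unsigned rendering: values >= 2**31 are negative pid_t values.
    value = raw - (1 << 32) if raw >= (1 << 31) else raw
    if value > 0:
        return ("process", value)
    if value < -1:
        return ("process group", -value)
    if value == 0:
        return ("caller's process group", None)
    return ("all permitted processes", None)  # value == -1


print(decode_kill_target(4294937704))  # ('process group', 29592)
print(decode_kill_target(4294953129))  # ('process group', 14167)
```

So the kill in the trace above targets process group 29592, which matches the pid of the child that is then reported as killed by SIGHUP, and the kill in the earlier trace targets process group 14167.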

Perhaps you could run your test on a fresh emacs -Q -nw and see whether it matches the behavior I'm seeing.
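For comparison, the sequence visible in the trace above (the child is alive, the parent sends SIGHUP, and the wait reports WIFSIGNALED with WTERMSIG equal to SIGHUP) can be reproduced outside Emacs. The following Python sketch is only an illustration and not what Emacs does internally: it signals a single pid rather than a process group, and it uses a blocking waitpid instead of a WNOHANG wait4 triggered by SIGCHLD. The function name kill_and_reap is made up, and the pipe exists only so the parent knows the child is ready before signaling it.

```python
import os
import signal


def kill_and_reap():
    """Fork a child, send it SIGHUP, and report how wait saw it terminate."""
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: make sure SIGHUP has its default (terminating) action,
        # tell the parent we are ready, then block until a signal arrives.
        os.close(ready_r)
        signal.signal(signal.SIGHUP, signal.SIG_DFL)
        os.write(ready_w, b"x")
        while True:
            signal.pause()
    # Parent: wait until the child is ready, then signal and reap it.
    os.close(ready_w)
    os.read(ready_r, 1)
    os.close(ready_r)
    os.kill(pid, signal.SIGHUP)
    _, status = os.waitpid(pid, 0)
    return os.WIFSIGNALED(status), os.WTERMSIG(status)


if __name__ == "__main__":
    signaled, termsig = kill_and_reap()
    print(signaled, termsig == signal.SIGHUP)
```

If the child had already exited before the kill, the wait status would instead satisfy WIFEXITED, which is one way to tell the two situations apart in a trace.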

Killing the process via the process list (hence the "d"):

You typed "d" in the window generated by M-x list-processes? When I try that, it says "d is undefined".

My Emacs sessions usually have a number of processes open. When the bug
showed up just now, it was five processes and three network connections,
for example. I'm not sure if that's related.

I expect that it is, unfortunately.
