timeouts, zombies, and shellcommands
From: Darrell Fuhriman
Subject: timeouts, zombies, and shellcommands
Date: 11 Oct 2001 14:10:08 -0700
User-agent: Gnus/5.0807 (Gnus v5.8.7) XEmacs/21.1 (Canyonlands)
[this is a slightly more detailed copy of a post to gnu.cfengine.help]
I'm executing a shell command with cfengine 1.6.3. In this case
it is a very basic one:
shellcommands:
   redhat::
      "/sbin/chkconfig ntpd on" useshell=false
                                timeout=30
It sometimes hangs when cfengine is run by kickstart as part of
the post-install scripts. Note that there are *no* daemons being
started by this program, so I don't think it's the
not-closing-descriptors problem. All the program does is create
a couple of symlinks.
Here's some strace output from the cfengine process:
rt_sigaction(SIGALRM, {0x4002c8c0, [ALRM], SA_RESTART|0x4000000}, {SIG_DFL}, 8) = 0
alarm(30) = 0
umask(077) = 022
pipe([22, 23]) = 0
fork() = 1370
close(23) = 0
fcntl64(22, F_GETFL) = 0 (flags O_RDONLY)
fstat64(22, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x402e9000
_llseek(22, 0, 0xbfff3940, SEEK_CUR) = -1 ESPIPE (Illegal seek)
munmap(0x402e9000, 4096) = 0
wait4(89,
The child process looks like this:
getpid() = 1370
[deleted]
execve("/sbin/chkconfig", ["/sbin/chkconfig", "ntpd", "on"], [/* 21 vars */]) = 0
[deleted]
close(22) = 0
unlink("/etc/rc5.d/K74ntpd") = 0
unlink("/etc/rc5.d/S26ntpd") = -1 ENOENT (No such file or directory)
symlink("../init.d/ntpd", "/etc/rc5.d/S26ntpd") = 0
_exit(0) = ?
One thing that has me confused is why it seems to be waiting on
PID 89, instead of '-1', or the actual PID of the child (1370).
I smell a bug of some sort... especially in light of the fact
that it does correctly wait for the previous command's PID.
Anyway, that's where it hangs. I also notice that it never seems
to receive the ALRM signal. Is that some strange signal
interaction I don't understand?
To make things worse, it works correctly when run manually
instead of automatically.
As an aside, it seems that the alarm handler doesn't actually
*do* anything, especially anything useful like attempt to kill
the shellcommand.
net.c:41
void TimeOut()
{
    alarm(0);
    Verbose("%s: Time out\n",VPREFIX);
}
Is this, in fact, an unimplemented feature?
Darrell