QEMU has a SIGCHLD handler that reaps any child process. -smb is the
only user of it and, in fact, QEMU inherited it from slirp. However,
this handler causes 'exec' based migration to randomly return 'status:
failed' in the monitor. This happens when the signal handler for SIGCHLD
runs before the pclose() of exec migration: the child has already been
reaped, so the wait inside pclose() fails with ECHILD.

The return status of pclose() is passed back as the return status of
qemu_fclose(). If qemu_fclose() fails, then exec_close() in
migration-exec.c returns an error code. This causes migrate_fd_cleanup()
to return an error, which is finally why 'status: failed' occurs:

    if (migrate_fd_cleanup(s) < 0) {
        if (old_vm_running) {
            vm_start();
        }
        state = MIG_STATE_ERROR;
    }

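The failure mode described above can be reproduced outside QEMU. The
following is a minimal sketch, not code from this series: it installs a
catch-all SIGCHLD handler, starts a child with popen(), waits until the
handler has reaped it, and then reports what pclose() sees. The names
reap_any_child() and pclose_errno_after_reap() are made up for this
illustration, and the 5 second timeout is an arbitrary choice.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>

/* Set by the handler once it has reaped at least one child. */
static volatile sig_atomic_t children_reaped;

/* Catch-all handler: reaps every exited child, whoever started it. */
static void reap_any_child(int sig)
{
    int saved_errno = errno;

    (void)sig;
    while (waitpid(-1, NULL, WNOHANG) > 0) {
        children_reaped = 1;
    }
    errno = saved_errno;
}

/*
 * Hypothetical helper: returns 0 if pclose() succeeded, otherwise the
 * errno that pclose() left behind.
 */
int pclose_errno_after_reap(void)
{
    struct sigaction sa, old;
    struct timespec tick = { 0, 1000000 };  /* 1 ms */
    FILE *f;
    int i, rc, err;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = reap_any_child;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    rc = sigaction(SIGCHLD, &sa, &old);
    assert(rc == 0);

    children_reaped = 0;
    f = popen("exit 0", "r");
    assert(f != NULL);

    /* Let the handler win the race: wait (up to ~5 s) until it has reaped. */
    for (i = 0; i < 5000 && !children_reaped; i++) {
        nanosleep(&tick, NULL);
    }

    /* The child is gone, so the wait inside pclose() has nothing to collect. */
    errno = 0;
    rc = pclose(f);
    err = (rc == -1) ? errno : 0;

    sigaction(SIGCHLD, &old, NULL);  /* restore the previous disposition */
    return err;
}
```

With the handler forced to run first, pclose_errno_after_reap() should
return ECHILD. In QEMU the ordering is not forced, which is why the
failure only shows up randomly.
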
To avoid this, register the pids of the children that need reaping in a
list and, on SIGCHLD, schedule a bottom half that walks the list and
reaps only those pids.
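
The idea can be sketched in standalone C roughly as follows. This is an
illustration under assumptions, not the code in iohandler.c: all names
(watch_child(), reap_watched_children(), install_sigchld_handler()) are
hypothetical, and a flag polled by the caller stands in for scheduling a
real bottom half.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

/* One entry per child that the watcher is allowed to reap. */
struct child_watch {
    pid_t pid;
    struct child_watch *next;
};

static struct child_watch *watched;

/* Stand-in for "schedule a bottom half": set in the handler, polled later. */
static volatile sig_atomic_t sigchld_pending;

static void sigchld_handler(int sig)
{
    (void)sig;
    sigchld_pending = 1;  /* no reaping in signal context */
}

int install_sigchld_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Register a pid for reaping. Returns 0 on success, -1 on allocation failure. */
int watch_child(pid_t pid)
{
    struct child_watch *w;

    assert(pid > 0);
    w = malloc(sizeof(*w));
    if (!w) {
        return -1;
    }
    w->pid = pid;
    w->next = watched;
    watched = w;
    return 0;
}

/*
 * Deferred work: reap only registered pids that have exited and drop them
 * from the list. Children nobody registered (for example the one behind a
 * popen() stream) are left for their owner. Returns the number reaped.
 */
int reap_watched_children(void)
{
    struct child_watch **pp = &watched;
    int reaped = 0;

    sigchld_pending = 0;
    while (*pp) {
        struct child_watch *w = *pp;

        if (waitpid(w->pid, NULL, WNOHANG) == w->pid) {
            *pp = w->next;
            free(w);
            reaped++;
        } else {
            pp = &w->next;
        }
    }
    return reaped;
}
```

A main loop would call reap_watched_children() whenever sigchld_pending
is set. Because waitpid() is only ever called with a registered pid, an
unregistered child keeps its exit status until its owner collects it.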

While I'm at it, I'm moving the iohandler code out of vl.c. The new
file isn't a perfect place to add the child watcher, but it's arguably
better than vl.c.

This should be applied to both master and stable.

Paolo Bonzini (2):
  extract I/O handler lists to iohandler.c
  add a service to reap zombies

Makefile.objs | 2 +-
iohandler.c | 193 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
os-posix.c | 9 ---
qemu-common.h | 4 +
slirp/misc.c | 5 +-
vl.c | 106 ++------------------------------
6 files changed, 207 insertions(+), 112 deletions(-)
create mode 100644 iohandler.c