Aloha -
While testing the exec server, I set up a minimal subhurd using just the most essential files, as opposed to copying the entire filesystem, and uncovered a number of bugs.
I've refined the process into a shell script (attached) which creates the subhurd on a ramdisk and then boots it.
At least three bugs become apparent:
1. /hurd/startup doesn't fall back on /bin/sh if it can't exec /etc/hurd/runsystem. This is easy to fix - just a missing increment. Patch attached.
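To illustrate the kind of bug involved, here is a minimal sketch of a fallback loop over candidate init programs. It is not the code from startup and not the attached patch. The names start_first_available, try_exec_fn and only_sh_starts are made up for this example, and only_sh_starts is a stand-in that pretends only /bin/sh can be started.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: returns 0 if the program could be started,
   nonzero otherwise.  A real caller would attempt an exec here. */
typedef int (*try_exec_fn)(const char *path);

/* Try each path of a NULL-terminated list in order.
   Returns the index of the first candidate that starts, or -1 if none does. */
int start_first_available(const char *const candidates[], try_exec_fn try_exec)
{
    size_t i = 0;

    while (candidates[i] != NULL) {
        if (try_exec(candidates[i]) == 0)
            return (int)i;
        /* Advance to the next candidate.  Leaving this increment out is the
           kind of mistake that keeps retrying the first entry and never
           reaches the fallback. */
        i++;
    }
    return -1;
}

/* Stand-in for demonstration only: pretends that /bin/sh is the only
   program that can be started. */
int only_sh_starts(const char *path)
{
    return strcmp(path, "/bin/sh") == 0 ? 0 : 1;
}
```

Called with the list { "/etc/hurd/runsystem", "/bin/sh", NULL } and only_sh_starts, the function skips the first entry and settles on the second.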
2. /hurd/startup naively assumes that SIGCHLD and waitpid() both work on init (PID 1), but they don't.
I've been able to patch this up by introducing special cases that check for HURD_PID_INIT in two places in proc/wait.c: in alert_parent (if the PID is HURD_PID_INIT, ignore the p_parent field and treat startup_proc as the parent) and in S_proc_wait (if we're called from procserver, make a special attempt to reap(init_proc)). But I hesitate to submit this as a patch, because I'm not sure how we want to do this. Do we introduce special cases for init everywhere we've got a problem with it?
Also, after fixing bug #1, this screws up startup's attempt to start a new shell if the old one dies: proc doesn't like having a second init process started after the first one has died and been reaped. Maybe startup shouldn't try to start a second init, even if the first one dies. Even so, startup should still have some way to detect that init has died.
Our current setup is that PID 5 (ext2fs) runs first, then starts PID 2 (startup), which starts PID 1 (init). Weird. The cleanest solution, of course, would be to have proc actually respect these parenting relationships; then SIGCHLD and waitpid() would work normally.
3. Booting the subhurd, then running "halt -f" from its shell crashes the parent Hurd. Here's what the subhurd displays:
# halt -f
startup: notifying ext2fs.static pseudo-root of halt...done
startup: Killing pid 1
startup: Killing pid 3