From: Ole Tange
Subject: Re: How to debug error "Signal SIGCHLD received, but no signal handler set."?
Date: Sat, 23 Jul 2016 18:55:12 +0200
On Fri, Jul 22, 2016 at 2:50 PM, <com@heuel.org> wrote:
:
> I use Cygwin (updated to latest packages) with the latest parallel version
> (20160622).
> My workflow looks like this:
>
> cat input.txt | parallel --pipe -N64 --blocksize 63K --joblog
> joblog.txt --retries 3 --progress python myscript.py
>
> myscript.py does some CPU-bound processing with some network I/O and takes
> about 3 seconds per input line. input.txt has about 360k lines.
>
> The above command works well for about 30-60 minutes, fully utilizing 16
> cores. But then, it stops with
>
> "Signal SIGCHLD received, but no signal handler set."
>
> on STDERR. I tried to simulate the command with
>
> seq 360000 | parallel --pipe -N64 --blocksize 63K --joblog joblog.txt
> --retries 3 --progress sleep 3
>
> But I could not replicate the error yet.
>
> Does anyone have an idea how to debug/resolve this?
First step is to reproduce it.
My gut tells me this is a Cygwin thing.
Looking at the code, $SIG{CHLD} is only touched in:
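Since the failure only shows up after 30-60 minutes, it may help to automate the attempts. Below is a minimal sketch, not part of GNU Parallel: run_until_error is a hypothetical helper name, and the stderr.N.log file names are assumptions. It reruns a command until the message appears on stderr and keeps the stderr of every attempt.

```shell
#!/bin/sh
# Hypothetical helper (not part of GNU Parallel): rerun a command until its
# stderr contains the SIGCHLD message, keeping one stderr log per attempt.
# Usage: run_until_error MAX_ATTEMPTS COMMAND [ARGS...]
run_until_error() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        # stdout is discarded; stderr goes to stderr.<attempt>.log
        "$@" 2> "stderr.$attempt.log" > /dev/null
        if grep -q 'no signal handler set' "stderr.$attempt.log"; then
            echo "reproduced on attempt $attempt (see stderr.$attempt.log)"
            return 0
        fi
        attempt=$((attempt + 1))
    done
    echo "not reproduced in $max attempts"
    return 1
}
```

A pipeline has to be wrapped in sh -c, for example:

    run_until_error 20 sh -c 'seq 360000 | parallel --pipe -N64 --blocksize 63K --joblog joblog.txt --retries 3 --progress sleep 3'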
    # When a child dies, wake up from sleep (or select(,,,))
    $SIG{CHLD} = sub { kill "ALRM", $$ };
    usleep($ms);
    # --compress needs $SIG{CHLD} undefined
    delete $SIG{CHLD};
    exit_if_disk_full();
On GNU/Linux 'delete $SIG{CHLD};' has the same effect as
'$SIG{CHLD}="IGNORE";' but maybe Cygwin is different? When you find a
way to reproduce the error, try changing:
    delete $SIG{CHLD};
into:
    $SIG{CHLD}="IGNORE";
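One way to try this without touching the installed script is to patch a private copy. The sketch below assumes the statement appears in your copy of parallel exactly as quoted above; patch_sigchld and the file name parallel-ignore are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: write a copy of the parallel script in which
# 'delete $SIG{CHLD};' is replaced by '$SIG{CHLD}="IGNORE";'.
# Usage: patch_sigchld SOURCE_SCRIPT PATCHED_COPY
patch_sigchld() {
    src=$1
    dst=$2
    # Assumption: the statement is spelled exactly like this in SOURCE_SCRIPT.
    if ! grep -q 'delete \$SIG{CHLD};' "$src"; then
        echo "statement not found in $src" >&2
        return 1
    fi
    # Only the statement is rewritten; indentation and other lines are kept.
    sed 's/delete \$SIG{CHLD};/$SIG{CHLD}="IGNORE";/' "$src" > "$dst"
    chmod +x "$dst"
}
```

Usage might look like this, with the patched copy then run in place of parallel:

    patch_sigchld "$(command -v parallel)" ./parallel-ignore
    cat input.txt | ./parallel-ignore --pipe -N64 --blocksize 63K --joblog joblog.txt --retries 3 --progress python myscript.py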
And please post how you reproduced the error.
/Ole