From: Ole Tange
Subject: Re: How to debug error "Signal SIGCHLD received, but no signal handler set."?
Date: Sat, 23 Jul 2016 18:55:12 +0200

On Fri, Jul 22, 2016 at 2:50 PM,  <com@heuel.org> wrote:
:
> I use Cygwin (updated to latest packages) with the latest parallel version
> (20160622).
> My workflow looks like this:
>
>      cat input.txt | parallel --pipe -N64 --blocksize 63K --joblog joblog.txt --retries 3 --progress python myscript.py
>
> myscript.py does some CPU-bound processing with some network I/O and takes
> about 3 seconds per input line. input.txt has about 360k lines.
>
> The above command works well for about 30-60 minutes, fully utilizing 16
> cores. But then, it stops with
>
>      "Signal SIGCHLD received, but no signal handler set."
>
> on STDERR. I tried to simulate the command with
>
>       seq 360000 | parallel --pipe -N64 --blocksize 63K --joblog joblog.txt --retries 3 --progress sleep 3
>
> But I could not replicate the error yet.
>
> Does anyone have an idea how to debug/resolve this?

First step is to reproduce it.
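
One way to chase a reproduction is to capture stderr from each candidate
run and check it for the exact message. A minimal sketch in shell; the
helper name has_sigchld_error and the capture file stderr.log are
hypothetical and not part of GNU Parallel:

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given stderr capture contains the
# exact message from the report.
has_sigchld_error() {
    grep -qF 'Signal SIGCHLD received, but no signal handler set.' "$1"
}

# Usage sketch (commented out so that sourcing this file has no side effects):
# seq 360000 | parallel --pipe -N64 --blocksize 63K --joblog joblog.txt \
#     --retries 3 --progress sleep 3 2> stderr.log
# if has_sigchld_error stderr.log; then echo "reproduced"; fi
```

Replacing 'sleep 3' with something closer to myscript.py (CPU load, network
I/O, occasional non-zero exit so that --retries kicks in) may be what it
takes to trigger the message.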

My gut tells me this is a Cygwin thing.

Looking at the code, $SIG{CHLD} is only touched in:

        # When a child dies, wake up from sleep (or select(,,,))
        $SIG{CHLD} = sub { kill "ALRM", $$ };
        usleep($ms);
        # --compress needs $SIG{CHLD} undefined
        delete $SIG{CHLD};
        exit_if_disk_full();

On GNU/Linux 'delete $SIG{CHLD};' restores the default action, which
for SIGCHLD is to discard the signal, so in practice it behaves much like
'$SIG{CHLD}="IGNORE";'. Maybe Cygwin is different? When you find a
way to reproduce the error, try changing:

        delete $SIG{CHLD};

into:

        $SIG{CHLD}="IGNORE";
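
To try this without touching the installed program, the change can be made
on a private copy. A minimal sketch, assuming the line appears in your
version exactly as quoted above; the helper name patch_sigchld and the
output name parallel-ignore are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: write a patched copy of the script $1 to $2, with
# 'delete $SIG{CHLD};' replaced by '$SIG{CHLD}="IGNORE";'.
patch_sigchld() {
    sed 's/delete \$SIG{CHLD};/$SIG{CHLD}="IGNORE";/' "$1" > "$2"
}

# Usage sketch (commented out so that sourcing this file has no side effects):
# patch_sigchld "$(command -v parallel)" ./parallel-ignore
# chmod +x ./parallel-ignore
# diff "$(command -v parallel)" ./parallel-ignore   # confirm what changed
```

Then run the reproducing command with ./parallel-ignore in place of
parallel and compare.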

And please post how you reproduced the error.


/Ole


