Re: Spreading parallel across nodes on HPC system
From: Ken Mankoff
Subject: Re: Spreading parallel across nodes on HPC system
Date: Fri, 11 Nov 2022 08:52:34 +0100
User-agent: mu4e 1.8.10; emacs 27.1
On 2022-11-11 at 08:37 +01, Rob Sargent <robjsargent@gmail.com> wrote:
> How do you mix slurm and parallel hostfile?
I have a script S that launches parallel with 30 tasks via "::: $(seq 30)".
I start S with "sbatch" and --ntasks=31: one task for S itself, 30 for the
parallel processes.
When slurm gives me 1 node with 31 cores (or CPUs?), things run fast. But
sometimes I get 31 nodes with 1 core each; parallel then sees only 1 core,
runs the tasks sequentially, and the other 30 nodes go unused.
I realize from above that if parallel only sees 1 core in this case, then I've
answered my last question - parallel sees what I'm allocated, not what is
physically available. Therefore, I think just passing "--slf nodelist" should
solve everything.
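
A rough sketch of how that nodelist could be built inside the job, untested on
a real cluster. make_slf is a hypothetical helper (not part of Slurm or GNU
parallel), and ./task is a placeholder for the real command. It assumes that
"srun hostname" prints one hostname per allocated task and that ssh between
the allocated nodes works, since --slf logs in over ssh.

```shell
#!/bin/bash
# Hypothetical helper: reads one hostname per line on stdin (repeated once
# per allocated task) and prints "count/host" lines, i.e. the
# [ncpu/]sshlogin form that parallel accepts in an --slf file.
make_slf() {
    sort | uniq -c | awk '{ print $1 "/" $2 }'
}

# Intended use inside the sbatch script S (left commented out, because it
# needs a live Slurm allocation and passwordless ssh between nodes):
#   srun hostname | make_slf > nodelist
#   parallel --slf nodelist ./task {} ::: $(seq 30)
```

With --ntasks=31 the counts add up to 31, so the slot reserved for S itself is
included; whether to subtract it is a judgment call.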
-k.