From: Axel Arnold
Subject: Re: [ESPResSo-users] mpi and compressed block files
Date: Thu, 06 Sep 2012 22:47:56 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:15.0) Gecko/20120827 Thunderbird/15.0
Dear Martin,

this is basically the same problem you already ran into when reading back all Tcl variables: you read back all values, and some of them are incompatible with changing the number of nodes. Here it is the processor node grid, which of course has to fit the number of nodes at the time the checkpoint was written, i.e., you can only read this variable back if you don't change the number of processors.
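For illustration, the constraint at the Tcl level looks like this (a minimal sketch using the standard setmd interface; the concrete grid values are just examples):

    # On a serial run the stored grid is {1 1 1}; reading that block back
    # on 4 MPI ranks triggers "node_grid incompatible with current n_nodes".
    puts "node_grid = [setmd node_grid]"
    # A grid compatible with 4 ranks has to factorize 4, e.g.:
    # setmd node_grid 2 2 1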
Just like for Tcl variables, there are blacklists for not reading back certain setmd variables (see the user's guide), so you can specify which variables you really need to restore. However, in the case of checkpoints there is one more concern you should be aware of: if you use a thermostat that relies on random numbers, such as the standard Langevin thermostat, the random numbers are only reproducible if you use the same node_grid (and hence the same number of nodes) and restore the random seeds. Therefore, for true checkpointing, you need to save node_grid and restore it on the same number of nodes. In addition, you need to unconditionally recreate the Verlet lists, which requires the command "invalidate_system" right after writing the checkpoint.
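A minimal sketch of such a checkpoint writer (the block tags "variable", "random", "particles" and "interactions" are the ones documented in the user's guide; the file name and the particle fields are just examples):

    # Write a gzip-compressed checkpoint through a pipe.
    set out [open "|gzip -c - > checkpoint.block.gz" "w"]
    blockfile $out write variable all          ;# includes node_grid
    blockfile $out write random                ;# RNG state, needed for Langevin
    blockfile $out write particles {id pos v}
    blockfile $out write interactions
    close $out
    # The Verlet lists are not part of the checkpoint, so force their
    # rebuild right after writing it.
    invalidate_system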
In your case, it seems that you are creating the setup serially and then want to go parallel. In that case, saving random seeds etc. is not necessary, and you should only save those setmd variables that you actually changed during your setup script.
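For the serial-to-parallel case, reading back could then look like the following sketch; the blacklist variable name (blockfile_variable_blacklist) is the one the user's guide uses for setmd variables, so please check it against your version:

    # Skip node_grid when restoring, since the number of processors differs
    # from the run that wrote the file.
    set blockfile_variable_blacklist {node_grid}
    set in [open "|gzip -cd checkpoint.block.gz" "r"]
    while { [blockfile $in read auto] != "eof" } {}
    close $in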
Cheers,
Axel

On 09/06/2012 02:50 PM, Martin Lindén wrote:
Hi!

I am fairly new to ESPResSo and have some trouble with reading checkpoints, as described at the end of Sec. 10.1.7 in the user's guide for 3.1.0. To reproduce the problem:

1. Run blockread3.tcl in serial mode. This reads an uncompressed and a compressed version of a blockfile (identical content), and works as expected:

       Espresso blockread3.tcl

2. Run in MPI mode with one processor. Somewhat artificial, but works:

       mpirun -n 1 Espresso blockread3.tcl

3. The problem is MPI on multiple processors:

       mpirun -n 4 Espresso blockread3.tcl
       (...)
       WARNING: node_grid incompatible with current n_nodes, ignoring
       error waiting for process to exit: child process lost
       (is SIGCHLD ignored or trapped?)
           while executing
       "close $innnn"
           (file "blockread3.tcl" line 14)
       --------------------------------------------------------------------------
       mpirun noticed that the job aborted, but has no info as to the
       process that caused that situation.
       --------------------------------------------------------------------------

Two processors (mpirun -n 2 ...) sometimes go through and sometimes crash, but more than two always crash on my system. A temporary fix is of course to stay away from compressing the block files, but it would be nice to be able to work with compressed files when I go to larger systems.

System info:
ESPResSo-3.1.0 { Compilation status { FFTW } { BOND_ANGLE_HARMONIC } { LENNARD_JONES } { LJCOS } { LJCOS2 } { MPI_CORE } { EXCLUSIONS } }
mpirun (Open MPI) 1.5.4
gzip 1.4
Ubuntu 12.04, 64 bit

Sincerely,
Martin
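(For reference, the failing read pattern presumably looks like the following sketch, using the gzip-pipe idiom for compressed blockfiles from the user's guide; file names and the channel variable are placeholders:)

    # Plain blockfile: reads fine once node_grid fits the current run.
    set in [open "config.block" "r"]
    while { [blockfile $in read auto] != "eof" } {}
    close $in
    # Compressed copy, read through a gzip pipe; with several MPI ranks the
    # stored node_grid conflicts with n_nodes and closing the pipe fails
    # with the "child process lost" error quoted above.
    set in [open "|gzip -cd config.block.gz" "r"]
    while { [blockfile $in read auto] != "eof" } {}
    close $in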
--
JP Dr. Axel Arnold
ICP, Universität Stuttgart
Pfaffenwaldring 27
70569 Stuttgart, Germany
Tel: +49 711 685 67609
Email: address@hidden