espressomd-users

Re: [ESPResSo-users] Problem of checkpointing with mpi


From: Jean-Noël Grad
Subject: Re: [ESPResSo-users] Problem of checkpointing with mpi
Date: Fri, 10 May 2019 16:05:36 +0200

Dear Ricky,

The scripts you attached contain many commented-out lines and file operations that are not needed to reproduce an MPI error. In any case, I ran your scripts with ESPResSo 4.0.2 on Ubuntu 18 with 8 MPI processes in a [2, 2, 2] node grid and could not reproduce your error message. Instead, I got the "Could not activate magnetostatics method DipolarDirectSumCpu" exception you mentioned in your email of May 7th. Did you solve the MPI error?
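
For reference, in the [2, 2, 2] setup used for this reproduction the three grid dimensions multiply to the number of MPI processes (2 x 2 x 2 = 8). The following is a minimal sketch in plain Python, not ESPResSo code, that lists the 3D grids compatible with a given process count, such as the 32 used in the original report. The helper name `node_grids` is hypothetical, and the "most cubic first" ordering is only an illustrative choice, not necessarily what ESPResSo picks by default.

```python
def node_grids(n_ranks):
    """List every (nx, ny, nz) with nx * ny * nz == n_ranks.

    Hypothetical helper for illustration only; it is not part of ESPResSo.
    Grids are sorted so that the most cubic ones (smallest spread between
    the largest and smallest dimension) come first.
    """
    grids = [
        (nx, ny, n_ranks // (nx * ny))
        for nx in range(1, n_ranks + 1)
        if n_ranks % nx == 0
        for ny in range(1, n_ranks // nx + 1)
        if (n_ranks // nx) % ny == 0
    ]
    return sorted(grids, key=lambda g: (max(g) - min(g), g))


# Candidate grids for the two process counts mentioned in this thread.
grids_for_8 = node_grids(8)
grids_for_32 = node_grids(32)
```

Comparing the grid actually used in a run against such a list is a quick way to check that the process count passed to `mpirun -n` and the grid agree.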

Best regards,
JN

On 5/6/19 4:24 PM, 赵睿祺 wrote:
Dear all,

I have a problem with checkpointing under MPI. I want to register the system that I set up in part1.py for checkpointing and then load it in part2.py.

When I run the scripts without MPI, everything works. The command I use is

./pypresso <SCRIPT>

However, when I run them with MPI,

mpirun -n 32 ./pypresso <SCRIPT>

the following error occurs:

_______________________________________________________________________________

terminate called after throwing an instance of 'std::out_of_range'

what():_Map_base::at

[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22768] *** Process received signal ***

[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22768] Signal: Aborted (6)

[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22768] Signal code:(-6)

[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22768] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fe4d0054390]

[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22768] [ 1] /lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7fe4cfcae428]

[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22768] [ 2] terminate called after throwing an instance of 'std::out_of_range'

……

[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22772] *** End of error message ***

x4bec4b]

[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22746] *** End of error message ***

x4bec4b]

[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22747] *** End of error message ***

x4bec4b]

[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22743] *** End of error message ***

x4bec4b]

[zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11:22744] *** End of error message ***

--------------------------------------------------------------------------

mpirun noticed that process rank 4 with PID 22746 on node zhrq-X10DRi-Invalid-entry-length-16-Fixed-up-to-11 exited on signal 6 (Aborted).

How can I solve this problem? Thank you very much for your help!

Best regards!

Ricky Zhao



