problems in a beowulf
From: Bruno Muller Junior
Subject: problems in a beowulf
Date: 19 Mar 2002 14:12:25 -0300
Hi,
I'm working on an application that runs on a beowulf
architecture, but I'm having a run-time problem that is probably
caused by the way libtool configures the execution script.
In our beowulf, all computers have the same configuration (PCs
running Red Hat Linux 7.2), and only the homedir is visible to all
computers.
I built my application using "/bin/sh libtool --mode=link
...". This generates two files: ./my_apps and ./.lib/my_apps.
my_apps is the program that I want to run on each node (one
instance per node), and the instances communicate with each other
using MPI. The whole system starts when I type "mpirun", which
starts one instance of the program on each node.
This is where the problem happens: my_apps uses a shared lib, a lot
of files are created in the .lib/ directory
(e.g. <PID>-lt-my_apps), and the system crashes with:
1. input/output error
2. stale NFS file handle
I think this happens because every computer in the beowulf tries
to write its own "executable" into the .lib/ directory, and then
the computer that "owns" the homedir gets too much work to do and
crashes (?).
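To illustrate what I suspect, here is a toy shell sketch. It does not
use libtool or MPI at all: the temporary directory, the node count and
the file contents are all made up, and it only mimics the pattern of
several processes each writing a <PID>-lt-my_apps file into one
shared directory.

```sh
#!/bin/sh
# Toy model only: this is NOT libtool's wrapper script and uses no MPI.
# Several processes share one directory and each writes a file named
# <PID>-lt-my_apps into it, like the files I see appearing in .lib/.

shared_dir=$(mktemp -d)   # hypothetical stand-in for the shared .lib/ directory
nodes=4                   # hypothetical number of nodes

i=1
while [ "$i" -le "$nodes" ]; do
    # Each "node" is a separate sh process, so $$ differs for each one.
    sh -c 'echo "per-process copy" > "$1/$$-lt-my_apps"' sh "$shared_dir" &
    i=$((i + 1))
done
wait                      # let every simulated node finish writing

# Count the per-process files that ended up in the shared directory.
set -- "$shared_dir"/*-lt-my_apps
count=$#
echo "files written to the shared directory: $count"

rm -rf "$shared_dir"      # clean up the temporary directory
```

On a local disk this is harmless. My guess is that with one such
write per node, all at the same moment and all going to the same
directory on the machine that exports the homedir, it is not.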
Well, I'd like to know why this happens and what I should do to
avoid this problem.
Bruno