Re: [h5md-user] Variable-size particle groups


From: Pierre de Buyl
Subject: Re: [h5md-user] Variable-size particle groups
Date: Tue, 29 May 2012 20:24:43 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, May 29, 2012 at 06:34:04AM -0400, Peter Colberg wrote:
> On Tue, May 29, 2012 at 11:19:44AM +0200, Olaf Lenz wrote:
> > After Peter's mailing, I have had a first thorough look at H5MD. From
> > our point of view, Peter describes a very valid point - the number of
> > particles in a simulation can vary. This is not only true in the case of
> > grand-canonical simulations, but also other state-of-the-art schemes
> > have a varying particle number, e.g. the AdResS scheme, where the level
> > of detail might vary in different regions. As an example, think of a
> > protein-water simulation, where the protein and the surrounding nm of
> > water is simulated on an atomistic level of detail, while the water
> > further away is simulated on a coarse-grained level with fewer
> > interaction sites per water molecule. I believe that schemes of this
> > kind will become more important in the future, so allowing trajectories
> > with varying particle number to be stored may become important.
> 
> You are spot on with coarse graining. This is exactly what I intend to do.
> 
> 
> > On 05/29/2012 10:15 AM, Felix Höfling wrote:
> > >> H5MD implements an optional dataset “range” inside each
> > >> trajectory subgroup, next to the other datasets groups “step” and
> > >> “time”.
> > 
> > Besides making the format more complex, as Felix remarked, I believe
> > that forcing Peter's definition upon the format would also have a major
> > impact on parallel I/O.
> > 
> > I think a relatively simple solution to avoid making the format more
> > complex while still allowing for varying particle number would be to
> > specify that if the subgroup "range" exists in a time-dependent dataset,
> > the subgroup "value" is to be interpreted in the way Peter described,
> > otherwise it uses the simple definition.
> 
> Ok, then such a “range” dataset should be optional.
> 
> The point of parallel I/O is very interesting: How would this be
> implemented in practice? To warn you, I have not used parallel
> HDF5 yet.
> 
> I would assume that e.g. for a parallel MPI simulation, one would need
> a designated process to extend the “value”, “step”, “time” datasets
> on each time-step, after which all processes perform a write to
> their slice of the newly appended region of the “value” dataset.
> 
> Then adding a “range” dataset should not change this requirement.
> There would still be a single process to extend the datasets. The
> designated process would further communicate to the other processes
> the new range with regard to “value”, after which all processes
> perform a write to their sub-range.
> 
> Are my assumptions on parallel HDF5 I/O anything close to reality?
If I recall correctly (but I cannot find the reference right now), the size and
pattern of the data may severely impact the performance of HDF5. This should
also be checked.
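
To make the scheme discussed in the quoted text concrete, here is a minimal
sketch in plain Python. It does not use HDF5 or MPI at all: the class and
function names (VariableSizeTrajectory, rank_slices) are hypothetical, the
datasets are modelled as in-memory lists, and the (start, stop) layout of
“range” is only an assumption about how such a dataset could look. It shows a
single designated extension of “value”, “range”, “step” and “time”, followed
by each process writing to its own sub-range.

```python
from itertools import accumulate


class VariableSizeTrajectory:
    """In-memory stand-in for the "value", "range", "step", "time" datasets."""

    def __init__(self):
        self.value = []  # per-particle records of all steps, concatenated
        self.range = []  # assumed layout: one (start, stop) pair per step
        self.step = []
        self.time = []

    def extend(self, step, time, n_particles):
        """Job of the designated process: grow all datasets for one step."""
        start = len(self.value)
        stop = start + n_particles
        self.value.extend([None] * n_particles)  # reserve the new region
        self.range.append((start, stop))
        self.step.append(step)
        self.time.append(time)
        return start, stop

    def frame(self, index):
        """Read back the particles stored for the index-th step."""
        start, stop = self.range[index]
        return self.value[start:stop]


def rank_slices(start, counts):
    """Split the new region into one contiguous (lo, hi) slice per process."""
    offsets = [start + o for o in accumulate(counts, initial=0)]
    return list(zip(offsets[:-1], offsets[1:]))


# Two steps with a varying particle number, two simulated processes each.
traj = VariableSizeTrajectory()
steps = [[["a", "b"], ["c"]], [["d"], ["e", "f", "g"]]]
for step, per_rank in enumerate(steps):
    counts = [len(particles) for particles in per_rank]
    start, stop = traj.extend(step, 0.5 * step, sum(counts))
    # The sub-ranges would be communicated to the other processes here.
    for (lo, hi), particles in zip(rank_slices(start, counts), per_rank):
        traj.value[lo:hi] = particles  # each process writes its own slice
```

In this sketch the only extra work caused by “range” is one more append by the
designated process and the communication of the per-process offsets. Whether
the resulting write pattern performs well with parallel HDF5 is exactly the
point that would still have to be checked.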

P



