h5md-user

Re: [h5md-user] A note on the string type


From: Felix Höfling
Subject: Re: [h5md-user] A note on the string type
Date: Tue, 10 Dec 2013 10:46:32 +0100
User-agent: Opera Mail/12.16 (Linux)

On 06.12.2013 at 18:58, Peter Colberg
<address@hidden> wrote:

On Fri, Dec 06, 2013 at 05:45:15PM +0100, Felix Höfling wrote:
In conclusion, there seems to be a preference for fixed-length strings
in h5py, but a preference for VL strings in H5MD. Further, a novice user
may easily mess up the string types; e.g., this issue is discussed under
"Special topics" in the h5py manual ... not for beginners. This made me
wonder again whether forcing strings to be of VL type is a good idea, but
I know that this is the majority opinion.

The novice user would probably store the wrong string type either way.

import h5py
f = h5py.File("/tmp/vlstring.h5", "w")
f.attrs["unit"] = "nm"
f.attrs["boundary"] = ["periodic", "periodic", "nonperiodic"]

HDF5 "vlstring.h5" {
GROUP "/" {
   ATTRIBUTE "boundary" {
      DATATYPE  H5T_STRING {
         STRSIZE 11;
         STRPAD H5T_STR_NULLPAD;
         CSET H5T_CSET_ASCII;
         CTYPE H5T_C_S1;
      }
      DATASPACE  SIMPLE { ( 3 ) / ( 3 ) }
      DATA {
      (0): "periodic\000\000\000", "periodic\000\000\000", "nonperiodic"
      }
   }
   ATTRIBUTE "unit" {
      DATATYPE  H5T_STRING {
         STRSIZE H5T_VARIABLE;
         STRPAD H5T_STR_NULLTERM;
         CSET H5T_CSET_ASCII;
         CTYPE H5T_C_S1;
      }
      DATASPACE  SCALAR
      DATA {
      (0): "nm"
      }
   }
}
}

For scalar strings, h5py uses a variable-length string type. For arrays
of strings, h5py uses a fixed-length string type. Thus, if you are a
novice user, you are better off using pyh5md.
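
To see where the \000 bytes in the dump above come from, the fixed-length
layout can be mimicked in plain Python without h5py. This is only an
illustrative sketch: nullpad_fixed is a hypothetical helper, not an h5py
or HDF5 function, and it assumes ASCII strings and an element size equal
to the longest string, as STRSIZE 11 in the dump suggests.

```python
def nullpad_fixed(strings):
    """Encode strings as equal-width, NUL-padded byte strings.

    Hypothetical helper for illustration only; not part of h5py or HDF5.
    """
    encoded = [s.encode("ascii") for s in strings]
    # Assumption: the fixed size is the length of the longest element.
    size = max(len(b) for b in encoded)
    # Shorter elements are filled up with NUL bytes (cf. H5T_STR_NULLPAD).
    return size, [b.ljust(size, b"\x00") for b in encoded]


size, padded = nullpad_fixed(["periodic", "periodic", "nonperiodic"])
print(size)    # element width in bytes
print(padded)  # "periodic" carries three trailing NUL bytes
```

The scalar "unit" attribute needs no such padding, because its type is
declared with STRSIZE H5T_VARIABLE.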

Peter


Referring users to some intermediate library such as pyh5md defeats the
goal of portability. Users are then bound to existing H5MD libraries,
which so far exist only for Python and Fortran and are not yet mature. Not
to mention commercial software (Matlab, IDL), which is explicitly within
the scope of H5MD as well and for which there are currently no attempts to
create intermediate libraries.

On the other hand, if users are encouraged to use such a library, it won't
be a big deal to implement reading of either string type. Then, there
would be no "wrong" string type anymore and H5MD files would be more
likely to be valid.
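
A minimal sketch of what such tolerant reading could look like in Python,
assuming the binding hands back either bytes (fixed-length, possibly
NUL-padded or NUL-terminated) or str (variable-length). The names to_text
and read_boundary are hypothetical and not taken from pyh5md or h5py; how
the values actually arrive depends on the binding and its version.

```python
def to_text(value, encoding="ascii"):
    """Return a str for either string flavour (hypothetical helper)."""
    if isinstance(value, bytes):
        # Assumption: everything from the first NUL byte on is padding
        # or a terminator, so cut the value there before decoding.
        value = value.split(b"\x00", 1)[0].decode(encoding)
    return value


def read_boundary(raw):
    """Normalize a sequence of strings, whatever their storage type."""
    return [to_text(item) for item in raw]


# Values as they might come from a fixed-length and a VL attribute.
fixed = [b"periodic\x00\x00\x00", b"periodic\x00\x00\x00", b"nonperiodic"]
variable = ["periodic", "periodic", "nonperiodic"]

print(read_boundary(fixed) == read_boundary(variable))
```

With a reader of this kind, a file written with either string type yields
the same values on the Python side.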

Felix


