
[h5md-user] A note on the string type


From: Felix Höfling
Subject: [h5md-user] A note on the string type
Date: Fri, 06 Dec 2013 17:45:15 +0100
User-agent: Opera Mail/12.16 (Linux)

Hi,

I'm aware that the String type has been discussed at length, and I don't
want to change anything. But I would like to share some recent
observations:

i) The h5py manual says about String types:
http://www.h5py.org/docs/topics/strings.html
==
Technically, Fixed-length ASCII strings are supposed to store only
ASCII-encoded text, although in practice anything you can store in NumPy
will round-trip. But for compatibility with other programs using HDF5 (IDL,
MATLAB, etc.), you should use ASCII only.

Note: This is the most-compatible way to store a string. Everything else
can read it.
==

I don't know what led the h5py author to this conclusion, but I assume
that he has done some research on the string types and has experience with
them.

ii) The box boundaries are now stored as an array of VL Strings (each
string is allocated somewhere in the HDF5 file), which seems like overkill
for such a small piece of information. An array of fixed-size strings
would be much cleaner.

iii) h5py 2.2.0 (by default) stores a single string as a VL String, but an
array of one or more strings as fixed-size strings.
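
For illustration, here is a minimal sketch of how one might check which
string type the installed h5py version chooses by default. The file name,
the dataset names and the helper `describe` are hypothetical, and the
defaults of other h5py versions may differ from the 2.2.0 behaviour
described in iii), so the script only reports what it finds:

```python
import h5py
import numpy as np


def describe(dset):
    """Return a short description of the string type stored for dset."""
    kind = dset.dtype.kind
    if kind == "O":
        # h5py exposes VL strings as NumPy object dtype (assumption: the
        # file holds no other object-typed data such as references).
        return "variable-length"
    if kind == "S":
        return "fixed-length, %d bytes" % dset.dtype.itemsize
    return "other (%s)" % dset.dtype


report = {}
# In-memory file: nothing is written to disk.
with h5py.File("string_defaults.h5", "w", driver="core",
               backing_store=False) as f:
    # A single Python string, type left to h5py.
    scalar = f.create_dataset("scalar", data="periodic")
    # An array of byte strings, type left to h5py/NumPy.
    array = f.create_dataset("array", data=np.array([b"periodic", b"none"]))
    report["scalar"] = describe(scalar)
    report["array"] = describe(array)

for name in sorted(report):
    print(name, "->", report[name])
```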

In conclusion, there seems to be a preference for fixed-length strings in
h5py, but a preference for VL Strings in H5MD. Further, a novice user may
easily mix up the string types; for example, this issue is discussed under
"Special topics" in the h5py manual, which is not meant for beginners. This
made me wonder again whether forcing strings to be of VL type is a good
idea, but I know that this is the majority opinion.
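
To make the choice concrete, here is a rough sketch of writing the box
boundaries with either string type selected explicitly instead of relying
on h5py's defaults. The group names, the file name and the helper
`write_boundary` are hypothetical; only the attribute name "boundary" and
the values "periodic"/"none" are taken from H5MD:

```python
import h5py
import numpy as np

BOUNDARY = ["periodic", "periodic", "none"]


def write_boundary(group, boundary, fixed_length):
    """Write the 'boundary' attribute and return the dtype read back."""
    if fixed_length:
        # NumPy byte strings ('S8' here) are stored as fixed-length strings.
        data = np.array([b.encode("ascii") for b in boundary])
        group.attrs.create("boundary", data=data)
    else:
        # Request variable-length strings explicitly.
        vl_type = h5py.special_dtype(vlen=str)
        group.attrs.create("boundary", data=boundary, dtype=vl_type)
    return group.attrs["boundary"].dtype


# In-memory file: nothing is written to disk.
with h5py.File("boundary_demo.h5", "w", driver="core",
               backing_store=False) as f:
    fixed_dtype = write_boundary(f.create_group("box_fixed"), BOUNDARY, True)
    vl_dtype = write_boundary(f.create_group("box_vl"), BOUNDARY, False)

print("fixed-length:", fixed_dtype.kind, fixed_dtype.itemsize)
print("variable-length:", vl_dtype.kind)
```

Reading the attribute back yields a NumPy byte-string array in the
fixed-length case and an object array in the VL case, so a reader that
accepts both would have to handle the two representations.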

Regards,

Felix


