Why is HDF a desirable format?
HDF is an extremely flexible, open-source format for storing hierarchical data. It supports heterogeneous data types
and organises data and metadata (called attributes) into groups for easy access. It greatly simplifies data storage
and is cross-platform, with bindings to most scientific languages: an all-in-one data management solution.
Read more about what HDF is and the
features of the technology.
Why are native LabVIEW solutions (TDMS, binary/ASCII flat-file) undesirable?
As stated above, hierarchical data storage is a
very beautiful thing. The ability to store data with metadata,
handle arbitrary types and categorise data into a meaningful tree transparently is amazing. Handling arbitrary metadata
in flat-files is a logistical mess, and common approaches require serialisation to string (e.g. XML) making them
extremely slow for scientific data.
LabVIEW supports its own structured binary format called TDMS
(white paper), but there are some significant problems with it:
- Not a true hierarchical layout: Structure of TDMS files is prescribed and inflexible
- It's opaque and closed source: The structure of the binary file is unknown
- It has few language bindings: APIs for TDMS files are only available for C and LabVIEW (with Excel and MATLAB plugins)
We consider the first reason alone sufficient to warrant using HDF in LabVIEW.
Why is the existing LVHDF5 library inadequate?
The
LVHDF5 library by Jason Sommerville is a very useful
set of bindings to HDF5 v1.6.5, last updated in 2007.
Unfortunately that version of HDF5 is very heavily deprecated now and no longer compatible with the latest version.
In particular, naming and argument conventions were modified in HDF5 v1.8.0, meaning the LVHDF5 library, which attempts to
call functions directly, cannot use DLLs built after this version. Furthermore, changes to string handling mean
LVHDF5 cannot correctly interpret strings written in HDF v1.8.9+.
It's also very slow, which was our initial motivation to start writing our own direct HDF calls in LabVIEW.
Why build a new library from scratch?
Originally this project started as an attempt to patch LVHDF5 to link to a newer revision of HDF, but the following problems were encountered:
- It's slow
This is something the author of LVHDF5 freely admits to: in order to preserve generality and ease-of-use, flattening to string is performed
at the expense of speed. For applications requiring large numerics this takes appreciable time - enough to motivate writing
calls to H5Dwrite directly. These flattening operations are often not required, and actually involve costly little endian->big endian->little endian
conversion on x86/x64 machines. Speed was a primary motivation for h5labview.
- There are a lot of VIs to maintain
There are 133 VIs in the LVHDF5 palette alone, and every single one of them that references a deprecated function name needs changing,
at the very least. Some updates can be scripted, but much of this functionality is overkill anyway and is a lot of effort to keep up-to-date.
That's >190 files to keep updated - another goal of this project is to keep the code tree compact and clean.
- The wrapper DLL is complex
The wrapper DLL interfacing between LabVIEW and HDF5 (h5helper.dll) comprises 60 C++ source files wrapping the C API of HDF5 in C++ objects
and exposing those to LabVIEW. This is several layers of complexity and makes maintenance very difficult.
- Close doesn't close, aborts mess things up
This isn't LVHDF5's fault, but rather a design decision in the HDF library itself. When you call H5Fclose, the file is not actually closed until
all other open pointers have been closed properly. This is a serious problem in a non-linear execution language like LabVIEW because
there are many reasons a call to H5Xclose might not execute, so handles don't get closed until LabVIEW quits.
This is especially bad for files on network drives when multiple computers want read/write access to the same file.
- Better defaults
Defaults set in the LabVIEW library should mirror the expected behaviour in LabVIEW with normal LabVIEW paradigms, and not simply be the default
underlying C behaviour. While defaults can be changed with H5P* calls, a library that requires this is no longer user-friendly in LabVIEW. The item above is one
such instance of this behaviour.
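The byte-order cost mentioned under "It's slow" can be illustrated with a short sketch. It assumes, as the text describes, that flattened data is big-endian while x86/x64 memory is little-endian. Python's struct module stands in for the LabVIEW and HDF5 steps, and all function names are hypothetical.

```python
import struct

def flatten_big_endian(values):
    """Pack doubles as big-endian bytes, as a flatten-to-string step would."""
    return struct.pack(f">{len(values)}d", *values)

def to_native_little_endian(flattened):
    """Swap the flattened bytes back into little-endian order for x86/x64."""
    count = len(flattened) // 8
    values = struct.unpack(f">{count}d", flattened)
    return struct.pack(f"<{count}d", *values)

def direct_little_endian(values):
    """Pack doubles straight into little-endian order, with no intermediate swap."""
    return struct.pack(f"<{len(values)}d", *values)

data = [1.0, 2.5, -3.75]
# Both routes yield identical bytes; the flattened route just does extra work.
round_trip = to_native_little_endian(flatten_big_endian(data))
assert round_trip == direct_little_endian(data)
```

The two routes end with the same bytes, which is why the intermediate swap is pure overhead for large numeric arrays.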
What's the recommended way to install h5labview?
The
VI package manager (VIPM) is the recommended way to install LabVIEW addons, and a VIPM package is available from the
downloads page. Install VIPM (the free community edition works fine),
then open the
.vip file and click install. This will place all relevant files in their correct destinations.
If you wish to stay up-to-date with the most recent developments and bug-fixes, you can check out the most recent code from version control, as
outlined below.
Can I install h5labview manually? (i.e. without VIPM)
There are several reasons you may want to install
h5labview manually (that is,
without using VIPM), such as difficulty
getting VIPM to work on your installation (namely inside VMs), installing on several machines, using an old version of LabVIEW, or VIPM being incredibly slow.
Unless you already use VIPM, it is something of a pain.
To install manually, do the following:
- Obtain a copy of the latest version of the code from the repository by clicking
the "Download Snapshot" button in the top-right, or by clicking here.
This will download a ZIP of all the files.
- Unzip the file to vi.lib\addons (which is located in your LabVIEW install directory). Rename the folder that looks like
h5labview-code-VERSIONHASH to h5labview2.
- On Linux, enter the "lib" directory and execute make.
You will probably need to update the path variables in the makefile.
- Mass-compile the package for your local system by selecting Tools>Advanced>Mass Compile from the menu bar in LabVIEW and selecting
the h5labview2 directory. Set "Cache VIs" to 50 and click "Mass Compile". Take note of any errors that may occur.
- Run h5labview2\tests\test_all.vi and make sure it works. If LabVIEW crashes or any of the indicators are false, file a bug
report.
Please note that you will need to repeat these steps any time you want to upgrade - VIPM is still the recommended installation method.
Does h5labview work in earlier editions than LabVIEW 2010?
Potentially - down to LabVIEW 8.5 - but there could be issues.
Forwards-compatibility is an absolute pain in LabVIEW, since VIs are binary-incompatible (that is, simply cannot be opened by older versions),
even if that version of LabVIEW supports all of its functionality.
While files can be exported to an older version, this must be done
manually as a scripting approach does not have access to the same
functionality (specifically LVLIB and XNODE export) and cannot produce 8.5-compatible files.
We have worked out some tricks to get around this, but it means that pre-2010 versions will only be produced
on request.
Note that VIPM cannot be used for pre-2010, as it recompiles everything in 2010-format (on my machine) when creating the package,
so install and upgrade must be done manually.
A ZIP file will be provided, which is installed in the same way as the version from source control, outlined above.
The VI scripting addon is required and must be installed manually for pre-2010 installations, which is
available for free from NI.
Please note that we cannot bug-test exported versions, and debugging on your part will almost certainly be required.
Does h5labview work on 64-bit Windows or Linux?
Yes. Versions of the helper DLL compiled for these OSes are distributed with the package, but it is possible they
may be less stable than the 32-bit Windows version because
there are several (poorly documented) variations in the underlying memory management routines between platforms.
Neither the 64-bit Windows nor the 32-bit Linux version is as extensively tested before release.
On 64-bit Windows, the installed version of HDF5 must match the architecture of LabVIEW, not Windows.
If using 32-bit LabVIEW on 64-bit Windows it's recommended that you copy the 32-bit versions of hdf5.dll,
szip.dll and zlib.dll to the resource folder of the LabVIEW
directory so the correct DLLs get loaded.
Does h5labview work on OSX?
Possibly. None of the developers have access to OSX, so it cannot be tested. Some users have had success compiling
h5labview on OSX systems, but it cannot be supported at this stage.
Does h5labview work on Windows XP?
Windows XP is a legacy system and is
not an officially supported platform
by the HDF group, so HDF5 functionality may break on Windows XP at any point in the future. XP is therefore not officially supported by
h5labview either, and users may encounter errors getting it to work. However, users have reported success getting it to work
with a number of versions of HDF5 and limited support may be available on the forums.
To operate correctly on WinXP, you must obtain the VS2010 release of HDF5, which as of v1.8.12 is no longer the default.
You can obtain this release from the HDF5 FTP site by selecting the latest
version, selecting "bin" then "windows", and finding the release marked "VS2010-shared" (may be in a subfolder).
Install that instead of the default version and you should be able to install h5labview.
Why do I get a DLL load failure error?
The most likely cause is that LabVIEW cannot find the HDF5 DLLs, or Windows is loading the wrong DLLs.
Sometimes it matters exactly which version of the HDF5 library
you are using, so please download the one from the link on the front page.
Since v2.8.0, h5labview checks the version of HDF5 installed to ensure it's compatible. If an incompatible version is installed,
it is likely that Error 7, 12 or 15 will be raised by a "Call Library Function" Node (see also
this post for an explanation of the codes).
The solution is to ensure LabVIEW can find the required DLLs as described below.
Once you've installed HDF5, copy hdf5.dll, szip.dll and zlib.dll from the "bin" folder of the installation
directory to the LabVIEW\vi.lib\addons\h5labview2\lib folder. This will ensure that Windows locates the correct versions of the libraries.
Why do I see an increase in memory usage when resizing datasets?
This is most likely
not a memory leak. HDF5 maintains an internal memory heap to speed up memory and I/O operations. When you resize
a dataset, the memory is put back into the heap, not released to the operating system. HDF5 reuses this memory next time, but it appears
to the OS that the memory usage has increased. This heap is limited to ~16MB in size (per file) and should not increase beyond that.
Therefore, it is expected behaviour that as you append data to a dataset, LabVIEW's apparent memory usage will increase with each
H5Dwrite call, up to a maximum of 16MB. It may take several minutes to cap out.
If you see memory usage increase beyond this, ensure that all open pointers get closed correctly. In particular, the dataspace output by
H5Dprepare_append must be closed with H5Sclose after the H5Dwrite has finished (see example code).
Should I use the XNodes?
XNodes are an unofficial internal-only technology not sanctioned by National Instruments. It is
possible that using them can have
unintended consequences and as such may be undesirable in developing commercial products.
However, the distributed XNodes have been heavily tested and provide an extremely simple interface to reading and writing
arbitrary typed data without needing to make and maintain many polymorphic instances, or worse use variants and have to
make copies, serialise via string, and cast to type. XNodes are fast and simple to use by comparison.
If XNodes are "not executable", right click on them and select "Rebuild". If they are still broken, lodge a bug report.
In the event that the Abort button gets pressed (either directly via the button or indirectly via method call or error handling)
no DLLs are unloaded. This is by design in LabVIEW, but unfortunately means no automated clean-up occurs. In particular, file
handles opened by HDF do not get shut, so attempting to reopen an open file produces an error.
To fix this, an abort-handler has been incorporated into the H5Fopen call, which should close handles and files associated
with that VI. This functionality should not be relied upon. NI insists that aborts should not be a part of normal operation:
they are a debugging tool only and unexpected results may occur.
If an abort happens accidentally (because of an error or otherwise) and HDF gets confused, instead of restarting LabVIEW you
can simply run H5restart.vi in the base directory, which forces HDF5 to unload and restart.
Do not use this feature as a regular part of your code because it interrupts behaviour across all VIs.
It does however come in handy when a mistake gets made.
What changes were made in version 2?
Version 2 is actually a complete rewrite of the
h5labview library, based on feedback from users and a desire to
develop more general and robust code. Handling of the type-interface has been rewritten in C for flexibility, enabling new
types and efficient conversions to be implemented. The XNodes were introduced to replace polymorphic VI instances and automatically
adapt-to-type, enabling compound datatypes to be implemented. 64-bit support was introduced, and problems related to the
differences between memory managers in 32- and 64-bit LabVIEW addressed.
Because version 2 is a complete rewrite of the core code, upgrading your code from V1 to the new version must be done manually.
This is an unfortunate consequence of LabVIEW's reliance on file names and directory layout to identify code.
It is highly recommended you do upgrade when you have time to, because bug-fixes and improvements will only continue on this version.
How does error handling work?
HDF maintains an
error stack so that when an error occurs, traceback information is stored. An
automatic error handler
is called upon an error and given the stack, which it processes into an error message for the user before taking any necessary action. By default
this dumps a stack trace to
stderr. Since LabVIEW is not a console application, this makes no sense. Instead, when the
helper DLL is loaded it installs an error handler that converts the stack to a description string, which can later be queried from LabVIEW
if a function returns an error (usually -1).
Alternatively the stack can be interrogated directly by walking the stack explicitly without needing secondary buffers.
However, any function which succeeds
after the failed one (e.g. H5*close) clears the error stack eliminating the information. Since the error handler VI cannot be
guaranteed to execute before the stack gets emptied, the installed error handler is used instead.
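The reasoning above can be seen in a toy model. This is only a sketch and not the HDF5 API: the class and its methods are hypothetical, and it assumes a single library-wide stack that is reset at the start of every call, so a later success leaves it empty. The installed handler runs at the moment of failure and keeps a description that survives later calls.

```python
class ToyErrorStack:
    """Toy model of a library-wide error stack (hypothetical, not HDF5)."""

    def __init__(self):
        self.stack = []              # traceback entries for the latest call
        self.saved_description = ""  # filled in by the installed handler

    def call(self, name, ok):
        """Simulate a library call: returns 0 on success, -1 on failure."""
        self.stack.clear()           # each call starts with an empty stack
        if ok:
            return 0
        self.stack.append(f"{name} failed")
        self._handler()              # handler runs at the moment of failure
        return -1

    def _handler(self):
        """Convert the stack to a string that can be queried later."""
        self.saved_description = "; ".join(self.stack)

lib = ToyErrorStack()
status = lib.call("H5Dwrite", ok=False)  # the failing call
lib.call("H5Sclose", ok=True)            # clean-up succeeds, emptying the stack
```

After the clean-up call the stack itself holds nothing, but the saved description still names the failed call, which is the motivation for installing a handler rather than walking the stack afterwards.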
Why do all handles get closed when the file is closed?
HDF has
two modes of operation regarding closing files: strong and weak
closing. Unless strong closing is specified, dangling references prevent the file handle from actually being closed, causing lingering
access and requiring LabVIEW to be restarted. It was a design choice to enforce strong closing, which closes all associated objects
and invalidates their pointers. It is the opinion of the author that once the file is "closed", the file should be
closed.
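The difference between the two modes can be sketched with a toy model. This is not the HDF5 API: the class, its methods and the object names are hypothetical, and the model only tracks which objects are still open.

```python
class ToyFile:
    """Toy model of weak versus strong file closing (hypothetical, not HDF5)."""

    def __init__(self, strong):
        self.strong = strong
        self.open_objects = set()    # names of open datasets, groups, etc.
        self.close_requested = False

    def open_object(self, name):
        self.open_objects.add(name)
        return name

    def close_object(self, name):
        self.open_objects.discard(name)

    def close(self):
        """Request a close; strong closing also closes every open object."""
        self.close_requested = True
        if self.strong:
            self.open_objects.clear()

    @property
    def really_closed(self):
        """The file is released only once no objects remain open."""
        return self.close_requested and not self.open_objects

weak = ToyFile(strong=False)
weak.open_object("dataset")
weak.close()                         # dangling object keeps the file open

strong = ToyFile(strong=True)
strong.open_object("dataset")
strong.close()                       # object is closed along with the file
```

In the weak case the file stays open until the forgotten object is closed, which in LabVIEW may never happen if the closing call does not execute.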
Does h5labview create temporary data copies?
Not unless it has to. During data read/write operations, LabVIEW memory is passed directly to HDF5 to prevent serialisation
and its associated slow-down. However, since LabVIEW does not use C datatypes and HDF does, some conversion must sometimes take place.
In particular, LabVIEW's variable-sized types such as arrays and strings have their own memory conventions and must be converted to a
form capable of being processed by HDF5. Care is taken to ensure conversion is done efficiently.
How are strings handled?
HDF supports two kinds of strings: fixed-length strings and variable-length strings. In C, fixed strings look like
char str[256], and have
a definite length. Variable strings look like
char *str and are a pointer to a variable-sized block of memory terminated
by the NULL character (\0). LabVIEW strings are
neither of these things: they are a variable-sized block of memory
beginning with a length indicator. Because the \0 character is an acceptable element of a LabVIEW string, they cannot be directly
mapped.
The following algorithm is therefore used when storing an array of strings: When a string value is wired to a
read/write node in h5labview v2, the name is checked to see if it matches one of the specified forms:
- name <n> - will be written as a fixed-length string of length n, and will always read n characters (padded with \0 if necessary)
- name [n] - will be written as a fixed-length string of length n, and trailing \0 characters will be removed upon reading
- name - will be written as a variable-length string, and will be truncated at the first \0 character
Check the examples to see how to read and write strings in different situations.
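The three naming forms above can be sketched as follows. This is an illustration of the rules as stated, not h5labview's actual implementation (which lives in the helper DLL). The function names and regular expressions are hypothetical, n is assumed to be a decimal number, and truncating data longer than n is an assumption the text does not confirm.

```python
import re

# Hypothetical patterns for the two fixed-length naming forms.
_PADDED = re.compile(r"^(?P<name>.*\S)\s*<(?P<n>\d+)>$")
_TRIMMED = re.compile(r"^(?P<name>.*\S)\s*\[(?P<n>\d+)\]$")

def parse_name(label):
    """Return (name, mode, length) for a label wired to a read/write node."""
    for mode, pattern in (("padded", _PADDED), ("trimmed", _TRIMMED)):
        match = pattern.match(label)
        if match:
            return match.group("name"), mode, int(match.group("n"))
    return label, "variable", None

def write_string(data, mode, length):
    """Produce the bytes that would be stored for a LabVIEW string."""
    if mode == "variable":
        return data.split(b"\0", 1)[0]          # truncated at the first \0
    # Fixed length: pad with \0 (truncating longer data is an assumption).
    return data[:length].ljust(length, b"\0")

def read_string(stored, mode):
    """Produce the bytes handed back to LabVIEW on reading."""
    if mode == "trimmed":
        return stored.rstrip(b"\0")             # drop trailing \0 only
    return stored
```

For example, `parse_name("label <8>")` selects the padded form with a length of 8, while a plain `"label"` falls through to the variable-length form.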
Why aren't all of LabVIEW's datatypes supported?
Some datatypes are planned but not implemented, and other datatypes will never be implemented because it is unclear how to map
between HDF and LabVIEW datatypes. Implementation so far is on an as-desired basis. If there is demand for functionality it will
become more of a priority.
For instance, the HDF_ARRAY datatype will not be implemented, and all LabVIEW arrays will instead be handled through the dataspace
interface. This means arrays of clusters might be ok but clusters containing an array are invalid. Other datatypes are planned
but require substantial processing between HDF and LabVIEW (e.g. enums, waveforms) and others have platform-specific behaviour
(extended precision float).
Why aren't all HDF functions implemented?
Again, functionality is implemented on an as-requested basis. The HDF library is massive and contains a lot of seldom-used functionality.
I do not think it makes sense to expose every single function call directly in LabVIEW, but instead try to cluster them together in
meaningful ways. Decisions about versatility and value are open to interpretation, and the author(s) are open to suggestions.
I noticed feature X is missing, can I implement it?
Absolutely! Please let the team know you're interested in contributing and what you'd like to work on, in case we've already tried it or have
some ideas.
What version control do you use, and how can I get the latest version?
This project is developed under revision control with
Mercurial (hg),
which can integrate with LabVIEW using
LVMerge and
LVDiff
(see
instructions here),
and the Windows GUI
TortoiseHg is pretty good.
Having used several alternatives (including SVN and Git) I find Hg most intuitive.
Get the latest version by cloning the repository from http://hg.code.sf.net/p/h5labview/code,
then you can pull and update to get the latest changes.
There is also a web interface at http://sourceforge.net/p/h5labview/code/.
How should changes be submitted?
In Mercurial, any changes you make will be stored in your local repository (including commits you make)
which you then
synchronize to merge with the central SourceForge repository.
Please make sure to submit any modified VIs exported to LabVIEW 2010 or earlier.
How often is the package updated?
Whenever a major change or bug-fix has been successfully implemented and tested.
Why are the compiled VIs in separate files?
A big problem with version control in LabVIEW is that VIs are stored in a
binary format and are therefore annoying to merge.
The tools outlined above make merging
possible but it's still an annoying thing to do. In particular, if a subVI
is modified, often VIs which call it are recompiled, changing their object code, which looks like a "diff".
Storing the object code separately therefore prevents needless entries in the changelog because of downstream recompiles.
How are aborts handled?
The
Call Library Node has special callback functions that get activated under certain conditions. Each CLN is
allowed to use a single pointer as local memory to keep track of allocated objects, etc. The
H5Fopen VI calls
a helper function, which stores the file pointer in that local memory. When an abort or close occurs, any remaining
handles are closed.
How is work split between the helper DLL and LabVIEW?
Ideally, work done in the DLL should be limited to that which cannot be done in LabVIEW (global DLL objects,
function callbacks), but in practice any code that becomes
messy to implement in LabVIEW but easy in C
should be put in the helper DLL. So far this is limited to error handling and raw data/type manipulation.
Why don't you use the *.* notation to specify library paths?
LabVIEW allows you to use *.* when specifying a library path; it substitutes the 32/64 corresponding to the
application's architecture for the first *, and the relevant extension (.dll, .so) for the second. This seems
useful for cross-platform compatibility, but unfortunately it gets
removed when VIs are mass-compiled.
When you install h5labview through VIPM, the final step is to mass-compile the VIs for your system, to
resolve dependencies and such. However, this destroys the cross-compatibility of the library calls. This is a
registered bug in LabVIEW. The mass-compile can be disabled for VIPM as a whole, but not individual packages,
which is also unsatisfactory.
Several hacks have been tried to get around this. The latest is to use a common filename for the library across
all platforms, and simply replace that file during the installation.
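The *.* substitution described above can be sketched as a small function. This only illustrates the rule as stated in the text, not LabVIEW's own resolver: the function name, the platform keys and the library name "mylib" are hypothetical, and the pattern is assumed to contain a dot.

```python
def resolve_library_path(pattern, bits, platform):
    """Expand a '*.*'-style library path for a given architecture and OS."""
    extensions = {"windows": "dll", "linux": "so"}
    stem, _, ext = pattern.rpartition(".")
    if ext == "*":
        ext = extensions[platform]           # second '*': platform extension
    stem = stem.replace("*", str(bits), 1)   # first '*': 32 or 64
    return f"{stem}.{ext}"

# One pattern resolves to a different file on each target.
windows_64 = resolve_library_path("mylib*.*", 64, "windows")
linux_32 = resolve_library_path("mylib*.*", 32, "linux")
```

Once a mass-compile replaces the pattern with one resolved name, that flexibility is lost, which is the problem the common-filename workaround avoids.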
How do the XNodes work?
XNodes are "black magic" in LabVIEW, but arbitrary type I/O is an ideal application for them (and they are
indeed used for LabVIEW's own storage VIs). The XNodes we've developed wrap a low-level C call, passing
a pointer to LabVIEW's data directly to C. The C function is also passed a type descriptor, which it uses
to parse and interpret the data correctly, avoiding making copies wherever possible.
The XNodes can be avoided completely by making specific polymorphic instances of the base classes, and simply using
them instead. Think of the XNode as simply a "factory" for whatever polymorphic instances you may want to use.
A set of tools was developed to create the XNodes, but these are not considered stable enough to release.
Contact the author if you are interested in knowing more.
How is extended precision implemented?
Extended precision (EXT) is implemented as either a 64-bit, 80-bit, 96-bit or 128-bit IEEE float depending on the platform.
When
h5labview is compiled, the EXT datatype for the host machine is used to construct an appropriate H5T_FLOAT
type using the
H5Tset_fields call.
HDF5 can automatically transform between different precision float representations, so you can read it out as an
H5T_NATIVE_FLOAT or H5T_NATIVE_DOUBLE, but since it does not match a native datatype, some viewers and interpreters
will get confused (notably
h5py, which throws an exception).
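To show what describing such a type by its bit fields involves, here is a minimal sketch that decodes one possible EXT representation. It assumes the 80-bit x87 layout (sign bit, 15-bit exponent, 64-bit mantissa with an explicit leading bit); other platforms differ, as noted above. The constants mirror the kind of sign, exponent and mantissa positions and sizes that H5Tset_fields takes, and the function name is hypothetical.

```python
# Bit-field layout of the 80-bit x87 extended format (an assumption here).
SIGN_POS, EXP_POS, EXP_SIZE, MANT_POS, MANT_SIZE = 79, 64, 15, 0, 64
EXP_BIAS = 16383

def decode_ext80(raw):
    """Decode 10 little-endian bytes of an 80-bit extended float."""
    bits = int.from_bytes(raw, "little")
    sign = (bits >> SIGN_POS) & 1
    exponent = (bits >> EXP_POS) & ((1 << EXP_SIZE) - 1)
    mantissa = (bits >> MANT_POS) & ((1 << MANT_SIZE) - 1)
    # The leading integer bit is stored explicitly, so the mantissa is
    # scaled by 2**(MANT_SIZE - 1). Subnormals, infinities and NaNs are
    # not handled in this sketch.
    value = (mantissa / 2 ** (MANT_SIZE - 1)) * 2.0 ** (exponent - EXP_BIAS)
    return -value if sign else value

# Build the 80-bit pattern for 1.0: biased exponent 16383, leading bit set.
one_bits = (EXP_BIAS << EXP_POS) | (1 << (MANT_SIZE - 1))
one = decode_ext80(one_bits.to_bytes(10, "little"))
```

A reader that only knows native float and double layouts has no such description to fall back on, which is why non-native types can confuse some tools.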