Check HDF5 files for corruption

HDF5 files have no error recovery mechanism and are not journaled. An optional per-variable (per-dataset) Fletcher32 checksum can detect data corruption. Checking or comparing file size alone is not an adequate test for HDF5 corruption.
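Because the checksum is optional, it only helps if it was turned on when the variable was written. A minimal sketch of doing that, assuming the h5py and NumPy packages are installed (the function name write_with_checksum is hypothetical, chosen for illustration):

```python
import h5py
import numpy as np


def write_with_checksum(path, name, data):
    """Write `data` to a new HDF5 file with the Fletcher32 filter enabled."""
    arr = np.asarray(data)
    with h5py.File(path, "w") as f:
        # fletcher32=True adds a checksum to each chunk; h5py switches the
        # dataset to chunked layout automatically. Scalar datasets cannot be
        # chunked, so this sketch assumes `data` has at least one dimension.
        f.create_dataset(name, data=arr, fletcher32=True)
```

When such a variable is read back later, the HDF5 library verifies the checksum of each chunk and raises an error on a mismatch.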

Here are a few easy techniques to check for corrupted HDF5 files.

A Python HDF5 checking script checks HDF5 files for corruption and optionally finds the corrupted block(s) and variable(s).
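The general idea behind such a check can be sketched as follows. This is not the script mentioned above, only a rough illustration that assumes h5py is installed; the function name find_unreadable_datasets is hypothetical. It walks every variable in the file and attempts a full read, recording whichever ones fail.

```python
import h5py


def find_unreadable_datasets(path):
    """Return a list of (name, error message) for each variable that fails to read."""
    problems = []

    def visit(name, obj):
        if isinstance(obj, h5py.Dataset):
            try:
                # Full read of the variable. Chunks written with Fletcher32
                # are verified by the HDF5 library during this read.
                # Assumption: each variable fits in memory.
                obj[()]
            except Exception as exc:
                problems.append((name, str(exc)))
        # Implicitly returning None tells visititems to keep walking.

    try:
        with h5py.File(path, "r") as f:
            f.visititems(visit)
    except OSError as exc:
        # The file could not be opened or its group structure could not be walked.
        problems.append(("<file>", str(exc)))
    return problems
```

An empty list means every variable could be read. Without Fletcher32 enabled, a successful read only shows that the file structure is intact, not that the stored values are unchanged.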

HDF5 shell tools are installed by:

  • Linux: apt install hdf5-tools
  • macOS: brew install hdf5
  • Windows: use MSYS2

Print statistics about the file's objects and storage

h5stat file.h5

Print the data values in the file

h5dump file.h5

The HDFView GUI appears to use the Fletcher32 checksum, showing a red question mark if corruption is detected. Another curiosity is that the object reference is 2^32 - 1 on the corrupted variable.

Figure: HDFView bad variable


Related: HDF5 GUIs to view and edit variables in .h5 files