Documentation
Overview ¶
Package hdf5 provides a minimal API to access HDF5 file contents.
It requires the `hdf5-tools` package (a deb package) to be installed on the system, more specifically its `h5dump` binary.
It is basic but provides the necessary functionality to list the contents and extract the binary contents.
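Because the package depends on an external binary, it can help to check for it before parsing anything. The following is a minimal sketch using only the standard library; `findBinary` and the lowercase `h5DumpBinary` are hypothetical names introduced for illustration (the latter mirrors the package's `H5DumpBinary` constant), and nothing here is taken from the package's own implementation.

```go
package main

import (
	"fmt"
	"os/exec"
)

// h5DumpBinary mirrors the value of the package's H5DumpBinary constant.
const h5DumpBinary = "h5dump"

// findBinary is a hypothetical helper: it resolves name against PATH and
// returns the resolved path, or an error hinting at the missing package.
func findBinary(name string) (string, error) {
	p, err := exec.LookPath(name)
	if err != nil {
		return "", fmt.Errorf("%q not found in PATH (is hdf5-tools installed?): %w", name, err)
	}
	return p, nil
}

func main() {
	p, err := findBinary(h5DumpBinary)
	if err != nil {
		fmt.Println("cannot use the hdf5 package:", err)
		return
	}
	fmt.Println("h5dump found at", p)
}
```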
Constants ¶
const H5DumpBinary = "h5dump"
Functions ¶
func DtypeForH5T ¶
DtypeForH5T returns the DType corresponding to known HDF5 types. If the type is not known or not supported, it returns an invalid dtype.
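The lookup described above can be pictured as a table from HDF5 type names to element types. This is only an illustrative sketch: the set of types the package actually supports is not stated here, `goTypeForH5T` and `h5tToGo` are hypothetical names, and plain Go type names stand in for GoMLX `DType` values. The HDF5 names used are standard predefined little-endian types.

```go
package main

import "fmt"

// h5tToGo is a hypothetical table from HDF5 predefined type names to Go
// element type names. The real package returns a DType instead of a string.
var h5tToGo = map[string]string{
	"H5T_IEEE_F32LE": "float32",
	"H5T_IEEE_F64LE": "float64",
	"H5T_STD_I32LE":  "int32",
	"H5T_STD_I64LE":  "int64",
	"H5T_STD_U8LE":   "uint8",
}

// goTypeForH5T reports the Go type name for an HDF5 type name; ok is false
// when the type is unknown, playing the role of the "invalid dtype" result.
func goTypeForH5T(h5t string) (name string, ok bool) {
	name, ok = h5tToGo[h5t]
	return name, ok
}

func main() {
	for _, h5t := range []string{"H5T_IEEE_F32LE", "H5T_STRING"} {
		name, ok := goTypeForH5T(h5t)
		fmt.Printf("%s -> %q (known: %v)\n", h5t, name, ok)
	}
}
```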
Types ¶
type Hdf5Contents ¶
type Hdf5Contents map[string]*Hdf5Dataset
Hdf5Contents is a map of all the datasets present in the HDF5 file. The key is the path built from the concatenation of the "group" (how HDF5 calls directories or folders) with the dataset name, separated by a "/" character.
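As a rough illustration of how such keys are formed, the sketch below joins a group path and a dataset name with "/". `datasetKey` is a hypothetical helper, not part of the package, and whether the real keys carry a leading "/" is an assumption made for the example. Note that `path.Join` also cleans the result, which plain concatenation would not do.

```go
package main

import (
	"fmt"
	"path"
)

// datasetKey is a hypothetical helper that builds a map key from a group
// path and a dataset name, separated by "/".
func datasetKey(groupPath, datasetName string) string {
	return path.Join(groupPath, datasetName)
}

func main() {
	// Group and dataset names below are made up for the example.
	fmt.Println(datasetKey("/model/dense", "kernel"))
	fmt.Println(datasetKey("/model/dense", "bias"))
}
```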
func ParseFile ¶
func ParseFile(filePath string) (contents Hdf5Contents, err error)
ParseFile parses the file at filePath as an HDF5 file and returns a map of its contents.
It requires the `hdf5-tools` package (a deb package) to be installed on the system, more specifically its `h5dump` binary.
type Hdf5Dataset ¶
type Hdf5Dataset struct {
	FilePath, GroupPath, RawHeader string
	DType                          shapes.DType
	Shape                          shapes.Shape
}
Hdf5Dataset has (some of) the metadata about a dataset (but not the data itself). The dataset "DATATYPE" and "DATASPACE" fields are converted to the equivalent GoMLX `shapes.Shape`.
func (*Hdf5Dataset) Load ¶
func (ds *Hdf5Dataset) Load() (rawContent []byte, err error)
type UnpackToTensorsConfig ¶
type UnpackToTensorsConfig struct {
	// contains filtered or unexported fields
}
UnpackToTensorsConfig holds the configuration created by UnpackToTensors, to unpack HDF5 files into a directory structure with the individual tensors saved in GoMLX format.
The targetDir must not yet exist.
func UnpackToTensors ¶
func UnpackToTensors(targetDir, h5Path string) *UnpackToTensorsConfig
UnpackToTensors unpacks tensors from an HDF5 file (typically with an '.h5' extension). It generates one file per tensor, in subdirectories under `targetDir` that mimic the structure of groups (the HDF5 term for folders or directories) within the HDF5 file.
UnpackToTensors returns a configuration structure that can be further configured. Once done configuring, call Done to perform the unpacking.
Tensors are serialized using `tensor.Local.Save`, and can be read with `tensor.Local.Load`.
Example: unpack the `weights.h5` file into `/my/target/directory`.
err := UnpackToTensors("/my/target/directory", "weights.h5").ProgressBar().Done()
func (*UnpackToTensorsConfig) Done ¶
func (c *UnpackToTensorsConfig) Done() (err error)
Done performs the unpacking according to the configuration. See details in UnpackToTensors.
It first unpacks to a temporary directory and renames it to the target directory at the very end, only if the unpacking was successful.
If an error occurs, by default it removes the temporary directory along with the files unpacked thus far. You can change this behavior with KeepTemporary.
func (*UnpackToTensorsConfig) FilePermissions ¶
func (c *UnpackToTensorsConfig) FilePermissions(perm os.FileMode) *UnpackToTensorsConfig
FilePermissions configures the file permissions used for the creation of the directories and files. Default is `os.FileMode(0755)`.
It modifies the configuration and returns itself, so configuration calls can be cascaded.
func (*UnpackToTensorsConfig) KeepTemporary ¶
func (c *UnpackToTensorsConfig) KeepTemporary() *UnpackToTensorsConfig
KeepTemporary configures unpacking to keep the temporary directory with the (potentially partially) unpacked files.
By default, if an error occurs, all temporary files are removed.
It modifies the configuration and returns itself, so configuration calls can be cascaded.
func (*UnpackToTensorsConfig) ProgressBar ¶
func (c *UnpackToTensorsConfig) ProgressBar() *UnpackToTensorsConfig
ProgressBar configures a progress bar to be displayed during the unpacking.
It modifies the configuration and returns itself, so configuration calls can be cascaded.
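The cascading style used by these configuration methods works because each method mutates the receiver and returns it. The sketch below reproduces the pattern with a stand-in type; `unpackConfig`, `newUnpackConfig`, and the field names are hypothetical, since the real struct's fields are unexported and not documented here. Only the default permission value of 0755 comes from the text above.

```go
package main

import (
	"fmt"
	"os"
)

// unpackConfig is a hypothetical stand-in for UnpackToTensorsConfig.
type unpackConfig struct {
	targetDir, h5Path string
	perm              os.FileMode
	keepTemporary     bool
	progressBar       bool
}

// newUnpackConfig plays the role of UnpackToTensors: it only records the
// arguments and defaults, and does no work yet.
func newUnpackConfig(targetDir, h5Path string) *unpackConfig {
	return &unpackConfig{targetDir: targetDir, h5Path: h5Path, perm: 0o755}
}

// Each option modifies the configuration and returns it, so calls chain.
func (c *unpackConfig) FilePermissions(perm os.FileMode) *unpackConfig {
	c.perm = perm
	return c
}

func (c *unpackConfig) KeepTemporary() *unpackConfig {
	c.keepTemporary = true
	return c
}

func (c *unpackConfig) ProgressBar() *unpackConfig {
	c.progressBar = true
	return c
}

func main() {
	c := newUnpackConfig("/my/target/directory", "weights.h5").
		FilePermissions(0o700).
		KeepTemporary().
		ProgressBar()
	fmt.Printf("%+v\n", *c)
}
```

Deferring the actual work to a final call such as Done keeps every option optional and lets the defaults live in one place.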