hdf5

package
v0.3.1
Warning

This package is not in the latest version of its module.

Published: Jun 5, 2023 | License: Apache-2.0 | Imports: 14 | Imported by: 0

Documentation

Overview

Package hdf5 provides a trivial API to access HDF5 file contents.

It requires `hdf5-tools` (a deb package) to be installed on the system, specifically the `h5dump` binary.

It is basic, but provides the functionality needed to list the datasets in a file and extract their binary contents.
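Because the package depends on an external binary, it can help to check for it up front. The following is a minimal sketch using only the standard library; `findH5Dump` and `h5dumpBinary` are hypothetical names for illustration (the latter mirrors the `H5DumpBinary` constant), and nothing here is claimed to be how the package itself performs the check.

```go
package main

import (
	"fmt"
	"os/exec"
)

// h5dumpBinary mirrors the value of the package's H5DumpBinary constant.
const h5dumpBinary = "h5dump"

// findH5Dump (hypothetical helper) returns the location of h5dump on PATH,
// or an error hinting at the package that provides it.
func findH5Dump() (string, error) {
	path, err := exec.LookPath(h5dumpBinary)
	if err != nil {
		return "", fmt.Errorf("%q not found in PATH (on Debian-based systems it ships in hdf5-tools): %w",
			h5dumpBinary, err)
	}
	return path, nil
}

func main() {
	path, err := findH5Dump()
	if err != nil {
		fmt.Println("HDF5 parsing unavailable:", err)
		return
	}
	fmt.Println("using", path)
}
```

Running this check before parsing turns a missing dependency into one clear message instead of a failure partway through.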


Constants

const H5DumpBinary = "h5dump"

Variables

This section is empty.

Functions

func DtypeForH5T

func DtypeForH5T(h5type string) (dtype shapes.DType)

DtypeForH5T returns the DType corresponding to known HDF5 types. If the type is not known or not supported, it returns an invalid dtype.
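The lookup described above can be sketched as a map whose zero value is the "invalid" result. This is an illustration only: the `dtype` type and its constants are hypothetical stand-ins for `shapes.DType`, and the set of HDF5 type names in the table is an assumption, not the list the package actually supports.

```go
package main

import "fmt"

// dtype is a hypothetical stand-in for shapes.DType.
type dtype int

const (
	invalidDType dtype = iota // zero value, returned for unknown types
	float32DType
	float64DType
	int32DType
	int64DType
)

// knownH5Types is an assumed table; the real package may support a
// different set of HDF5 type names.
var knownH5Types = map[string]dtype{
	"H5T_IEEE_F32LE": float32DType,
	"H5T_IEEE_F64LE": float64DType,
	"H5T_STD_I32LE":  int32DType,
	"H5T_STD_I64LE":  int64DType,
}

// dtypeForH5T returns the dtype for an HDF5 type name. A missing key yields
// the map's zero value, which is invalidDType.
func dtypeForH5T(h5type string) dtype {
	return knownH5Types[h5type]
}

func main() {
	fmt.Println(dtypeForH5T("H5T_IEEE_F32LE") == float32DType)
	fmt.Println(dtypeForH5T("H5T_NOT_A_TYPE") == invalidDType)
}
```

Callers should compare the result against the invalid value before using it, since no error is returned.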

Types

type Hdf5Contents

type Hdf5Contents map[string]*Hdf5Dataset

Hdf5Contents is a map of all the datasets present in the HDF5 file. The key is the path built by concatenating the "group" (what HDF5 calls directories or folders) with the dataset name, separated by a "/" character.
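To make the key layout concrete, here is a small sketch of building such keys and listing the datasets under one group. The `dataset` and `contents` types are hypothetical stand-ins for `Hdf5Dataset` and `Hdf5Contents`, the helper names are invented for illustration, and the leading "/" on group paths is an assumption based on the usual HDF5 path convention.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// dataset and contents are hypothetical stand-ins for Hdf5Dataset and
// Hdf5Contents.
type dataset struct{}
type contents map[string]*dataset

// datasetKey joins a group path and a dataset name with a single "/".
func datasetKey(group, name string) string {
	return strings.TrimSuffix(group, "/") + "/" + name
}

// underGroup returns, in sorted order, the keys of all datasets whose path
// starts with the given group.
func underGroup(c contents, group string) []string {
	prefix := strings.TrimSuffix(group, "/") + "/"
	var keys []string
	for key := range c {
		if strings.HasPrefix(key, prefix) {
			keys = append(keys, key)
		}
	}
	sort.Strings(keys)
	return keys
}

func main() {
	c := contents{}
	for _, name := range []string{"kernel", "bias"} {
		c[datasetKey("/model/dense", name)] = &dataset{}
	}
	c[datasetKey("/model/conv", "kernel")] = &dataset{}
	fmt.Println(underGroup(c, "/model/dense"))
}
```

Since the map is flat, filtering by prefix is the simplest way to walk one group without a tree structure.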

func ParseFile

func ParseFile(filePath string) (contents Hdf5Contents, err error)

ParseFile parses the file at filePath as an HDF5 file and returns a map of its contents.

It requires `hdf5-tools` (a deb package) to be installed on the system, specifically the `h5dump` binary.

type Hdf5Dataset

type Hdf5Dataset struct {
	FilePath, GroupPath, RawHeader string
	DType                          shapes.DType
	Shape                          shapes.Shape
}

Hdf5Dataset has (some of) the metadata about a dataset (but not the data itself). The dataset "DATATYPE" and "DATASPACE" fields are converted to the equivalent GoMLX `shapes.Shape`.
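As an illustration of what converting the "DATATYPE" and "DATASPACE" fields involves, the sketch below extracts the type name and the dimensions from a header in the layout that `h5dump` prints for a simple dataspace. The `parseHeader` function and its regular expressions are hypothetical; the package's own parsing may differ, and scalar or compound types are not handled here.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

var (
	// Matches e.g. "DATATYPE  H5T_IEEE_F32LE".
	datatypeRe = regexp.MustCompile(`DATATYPE\s+(\S+)`)
	// Matches e.g. "DATASPACE  SIMPLE { ( 4, 6 ) / ( 4, 6 ) }" and captures
	// the first parenthesized list (the current dimensions).
	dataspaceRe = regexp.MustCompile(`DATASPACE\s+SIMPLE\s*\{\s*\(([^)]*)\)`)
)

// parseHeader (hypothetical) returns the HDF5 type name and dimensions found
// in a dataset header.
func parseHeader(header string) (h5type string, dims []int, err error) {
	t := datatypeRe.FindStringSubmatch(header)
	if t == nil {
		return "", nil, fmt.Errorf("no DATATYPE found in header")
	}
	s := dataspaceRe.FindStringSubmatch(header)
	if s == nil {
		return "", nil, fmt.Errorf("no SIMPLE DATASPACE found in header")
	}
	for _, field := range strings.Split(s[1], ",") {
		d, convErr := strconv.Atoi(strings.TrimSpace(field))
		if convErr != nil {
			return "", nil, fmt.Errorf("bad dimension %q: %w", field, convErr)
		}
		dims = append(dims, d)
	}
	return t[1], dims, nil
}

func main() {
	header := "DATASET \"/dense/kernel\" {\n" +
		"   DATATYPE  H5T_IEEE_F32LE\n" +
		"   DATASPACE  SIMPLE { ( 4, 6 ) / ( 4, 6 ) }\n" +
		"}"
	h5type, dims, err := parseHeader(header)
	fmt.Println(h5type, dims, err)
}
```

The type name would then go through a lookup such as DtypeForH5T, and the dimensions would become the shape.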

func (*Hdf5Dataset) Load

func (ds *Hdf5Dataset) Load() (rawContent []byte, err error)

func (*Hdf5Dataset) ToTensor

func (ds *Hdf5Dataset) ToTensor() (local *tensor.Local, err error)

ToTensor reads the HDF5 dataset into GoMLX's `tensor.Local`.

type UnpackToTensorsConfig

type UnpackToTensorsConfig struct {
	// contains filtered or unexported fields
}

UnpackToTensorsConfig holds the configuration created by UnpackToTensors, to unpack HDF5 files into a directory structure with the individual tensors saved in GoMLX format.

The targetDir must not yet exist.

func UnpackToTensors

func UnpackToTensors(targetDir, h5Path string) *UnpackToTensorsConfig

UnpackToTensors unpacks tensors from an HDF5 file (typically with an '.h5' extension). It generates one file per tensor, in subdirectories under `targetDir` that mimic the group structure within the HDF5 file (groups are HDF5's equivalent of folders or directories).

UnpackToTensors returns a configuration structure that can be further configured. Once done configuring, call Done to perform the unpacking.

Tensors are serialized using `tensor.Local.Save`, and can be read with `tensor.Local.Load`.

Example: unpack `weights.h5` file into `/my/target/directory`.

err := UnpackToTensors("/my/target/directory", "weights.h5").ProgressBar().Done()

func (*UnpackToTensorsConfig) Done

func (c *UnpackToTensorsConfig) Done() (err error)

Done performs the unpacking according to the configuration. See details in UnpackToTensors.

It first unpacks to a temporary directory and renames it at the very end, only if the unpacking was successful.

If an error occurs, by default it removes the temporary directory along with the files unpacked so far. You can change this behavior with KeepTemporary.

func (*UnpackToTensorsConfig) FilePermissions

func (c *UnpackToTensorsConfig) FilePermissions(perm os.FileMode) *UnpackToTensorsConfig

FilePermissions configures the file permissions used for the creation of the directories and files. Default is `os.FileMode(0755)`.

It modifies the configuration and returns itself, so configuration calls can be cascaded.

func (*UnpackToTensorsConfig) KeepTemporary

func (c *UnpackToTensorsConfig) KeepTemporary() *UnpackToTensorsConfig

KeepTemporary configures unpacking to keep the temporary directory with the (potentially partially) unpacked files.

By default, if an error occurs, all temporary files are removed.

It modifies the configuration and returns itself, so configuration calls can be cascaded.

func (*UnpackToTensorsConfig) ProgressBar

func (c *UnpackToTensorsConfig) ProgressBar() *UnpackToTensorsConfig

ProgressBar configures a progress bar to be displayed during the unpacking.

It modifies the configuration and returns itself, so configuration calls can be cascaded.
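The cascading style shared by the configuration methods can be sketched as follows. The `unpackConfig` type and its field names are hypothetical stand-ins, since the fields of `UnpackToTensorsConfig` are unexported and not shown here; only the method names, the return-self behavior, and the 0755 default come from the documentation above.

```go
package main

import (
	"fmt"
	"os"
)

// unpackConfig is a hypothetical stand-in for UnpackToTensorsConfig.
type unpackConfig struct {
	targetDir, h5Path string
	perm              os.FileMode
	keepTemporary     bool
	progressBar       bool
}

// newUnpackConfig plays the role of UnpackToTensors: it only records the
// arguments and the defaults.
func newUnpackConfig(targetDir, h5Path string) *unpackConfig {
	return &unpackConfig{targetDir: targetDir, h5Path: h5Path, perm: 0o755}
}

// Each method mutates the receiver and returns it, so calls can be chained.
func (c *unpackConfig) FilePermissions(perm os.FileMode) *unpackConfig {
	c.perm = perm
	return c
}

func (c *unpackConfig) KeepTemporary() *unpackConfig {
	c.keepTemporary = true
	return c
}

func (c *unpackConfig) ProgressBar() *unpackConfig {
	c.progressBar = true
	return c
}

func main() {
	c := newUnpackConfig("/my/target/directory", "weights.h5").
		FilePermissions(0o750).
		KeepTemporary().
		ProgressBar()
	fmt.Printf("%+v\n", *c)
}
```

Because every method returns the same pointer, the order of the configuration calls does not matter; only the final call to Done triggers any work.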
