ggml

package
v0.0.0-...-6093f96
Published: Apr 24, 2026 License: Apache-2.0 Imports: 4 Imported by: 0

Documentation

Overview

Package ggml provides a parser for the legacy GGML format (the name combines the initials of its author, Georgi Gerganov, with ML for machine learning).

GGML is the predecessor to the GGUF format and was used by early Whisper.cpp models. While GGUF is the modern standard with memory-mapped access and extensive metadata, GGML remains important for backward compatibility with existing models.

The GGML format consists of:

  1. Magic bytes (0x67676d6c little-endian or 0x6c6d6767 big-endian)
  2. Hyperparameters (11 int32 values: n_vocab, n_audio_ctx, n_audio_state, etc.)
  3. Mel filter coefficients (n_mels × n_fft × 4 bytes)
  4. Vocabulary (count + variable-length tokens)
  5. Tensor data (repeating: n_dims, name_len, type, dims[], name[], data[])

Unlike the memory-mapped access commonly used with GGUF, this package reads all GGML tensor data into memory. This is simpler but uses more memory for very large models.

For new models, prefer GGUF format. This package exists to support existing Whisper.cpp GGML models.

Variables

var WhisperTensorOrder = []string{

	"encoder.conv1.weight",
	"encoder.conv1.bias",
	"encoder.conv2.weight",
	"encoder.conv2.bias",
	"encoder.position_embedding.weight",
	"encoder.ln.weight",
	"encoder.ln.bias",

	"encoder.layers.0.attn.q_proj.weight",
	"encoder.layers.0.attn.k_proj.weight",
	"encoder.layers.0.attn.v_proj.weight",
	"encoder.layers.0.attn.out_proj.weight",
	"encoder.layers.0.attn.out_proj.bias",
	"encoder.layers.0.attn.ln.weight",
	"encoder.layers.0.attn.ln.bias",
	"encoder.layers.0.ffn_gate.weight",
	"encoder.layers.0.ffn_up.weight",
	"encoder.layers.0.ffn_down.weight",
	"encoder.layers.0.ffn.ln.weight",
	"encoder.layers.0.ffn.ln.bias",
}

WhisperTensorOrder defines the expected order of tensors in a Whisper GGML model.

This is based on the whisper.cpp implementation and serves as a reference for validating GGML model files. Modern GGUF files use named tensors directly, making explicit ordering unnecessary.
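The list above only spells out encoder layer 0. As a rough sketch of how the same suffixes could be expanded to further layers, the hypothetical helper below repeats them per layer index. It is an assumption for illustration: the real expansion logic (and the decoder tensor names) are not shown in this documentation.

```go
package main

import "fmt"

// encoderLayerSuffixes are the per-layer name endings copied from the
// layer-0 entries of WhisperTensorOrder.
var encoderLayerSuffixes = []string{
	"attn.q_proj.weight",
	"attn.k_proj.weight",
	"attn.v_proj.weight",
	"attn.out_proj.weight",
	"attn.out_proj.bias",
	"attn.ln.weight",
	"attn.ln.bias",
	"ffn_gate.weight",
	"ffn_up.weight",
	"ffn_down.weight",
	"ffn.ln.weight",
	"ffn.ln.bias",
}

// encoderLayerNames is a hypothetical helper (not part of this package).
// It builds "encoder.layers.<i>.<suffix>" for every layer and suffix, in order.
func encoderLayerNames(nLayers int) []string {
	if nLayers <= 0 {
		return nil
	}
	names := make([]string, 0, nLayers*len(encoderLayerSuffixes))
	for layer := 0; layer < nLayers; layer++ {
		for _, suffix := range encoderLayerSuffixes {
			names = append(names, fmt.Sprintf("encoder.layers.%d.%s", layer, suffix))
		}
	}
	return names
}

func main() {
	names := encoderLayerNames(2)
	fmt.Println(len(names), "names; last:", names[len(names)-1])
}
```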

Functions

func GenerateTensorOrder

func GenerateTensorOrder(nEncLayers, nDecLayers int) []string

GenerateTensorOrder generates the complete list of expected tensor names based on the model configuration.

This function is used to validate that all required tensors are present in a GGML model file. It generates tensor names for both encoder and decoder layers based on the specified layer counts.

Parameters:

  • nEncLayers: Number of encoder layers
  • nDecLayers: Number of decoder layers

Returns a slice of expected tensor names in the order they should appear.

Example usage:

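// Generate tensor order for Whisper tiny model (4 encoder, 4 decoder layers)
tensorOrder := ggml.GenerateTensorOrder(4, 4)

// Validate that all tensors are present. Here tensors is the
// map[string]RawTensor returned by LoadWhisperGGMLTensors or LoadGGMLModel,
// and the snippet runs inside a function that returns an error.
for _, expectedName := range tensorOrder {
    if _, exists := tensors[expectedName]; !exists {
        return fmt.Errorf("missing tensor: %s", expectedName)
    }
}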
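When diagnosing a broken file it can help to report every missing tensor rather than stopping at the first. A minimal sketch of that variation follows; missingTensors is a hypothetical helper, and the RawTensor here is a trimmed local stand-in so the example compiles on its own.

```go
package main

import "fmt"

// RawTensor is a trimmed stand-in for this package's RawTensor type.
type RawTensor struct {
	Data []byte
}

// missingTensors is a hypothetical helper (not part of this package).
// It returns every expected name that has no entry in tensors,
// preserving the order of expected.
func missingTensors(expected []string, tensors map[string]RawTensor) []string {
	var missing []string
	for _, name := range expected {
		if _, ok := tensors[name]; !ok {
			missing = append(missing, name)
		}
	}
	return missing
}

func main() {
	// In real use, expected would come from GenerateTensorOrder and
	// tensors from LoadWhisperGGMLTensors.
	expected := []string{"encoder.conv1.weight", "encoder.conv1.bias", "encoder.conv2.weight"}
	tensors := map[string]RawTensor{"encoder.conv1.weight": {}}
	fmt.Println("missing:", missingTensors(expected, tensors))
}
```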

func GetTensorName

func GetTensorName(index int, config map[string]int32) string

GetTensorName returns the expected tensor name for a given position index.

For Whisper models, this uses a fixed ordering based on the model architecture. This function is primarily useful for legacy GGML files that do not store tensor names explicitly.

Parameters:

  • index: Tensor position index in the model
  • config: Model configuration (e.g., number of encoder/decoder layers)

Returns the expected tensor name, or an empty string if index is out of range.

Note: Modern GGUF files store tensor names directly, making this function unnecessary for new models.

func LoadGGMLModel

func LoadGGMLModel(path string) ([]byte, map[string]RawTensor, error)

LoadGGMLModel reads a GGML format model file from disk.

This function supports both little-endian (0x67676d6c) and big-endian (0x6c6d6767) magic bytes. It loads the entire model into memory, including all tensor data.

Parameters:

  • path: Path to the GGML model file

Returns:

  • metadata: Raw bytes containing 11 hyperparameters (44 bytes total)
  • tensors: Map of tensor name → RawTensor with loaded data
  • error: Any error encountered during parsing

Note: For modern models with memory-mapped access and extensive metadata, use GGUF format instead (see format/gguf package).

func LoadWhisperGGMLTensors

func LoadWhisperGGMLTensors(path string) (map[string]RawTensor, []int32, error)

LoadWhisperGGMLTensors loads a Whisper model from a GGML format file.

This is a specialized wrapper around LoadGGMLModel() that handles Whisper-specific requirements. It returns the raw tensors and extracts model hyperparameters (metadata) from the file.

Parameters:

  • path: Path to the Whisper GGML model file

Returns:

  • tensors: Map of tensor name → RawTensor with loaded data
  • metadata: Slice of 11 int32 hyperparameters:
      [0] n_vocab: Vocabulary size
      [1] n_audio_ctx: Audio context window size
      [2] n_audio_state: Audio encoder hidden size
      [3] n_audio_head: Number of audio attention heads
      [4] n_audio_layer: Number of audio encoder layers
      [5] n_text_ctx: Text context window size
      [6] n_text_state: Text decoder hidden size
      [7] n_text_head: Number of text attention heads
      [8] n_text_layer: Number of text decoder layers
      [9] n_mels: Number of mel spectrogram bands
      [10] ftype: Model quantization type
  • error: Any error encountered during loading

Example usage:

tensors, metadata, err := ggml.LoadWhisperGGMLTensors("whisper-tiny.ggml")
if err != nil {
    log.Fatal(err)
}
fmt.Printf("Vocab size: %d\n", metadata[0])
fmt.Printf("Audio context: %d\n", metadata[1])
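Indexing the metadata slice by number is easy to get wrong, so callers may prefer named fields. The sketch below maps the documented index layout onto a struct. WhisperHParams and hparamsFromSlice are hypothetical names introduced here, not part of this package, and the sample values are illustrative only.

```go
package main

import "fmt"

// WhisperHParams is a hypothetical struct naming the 11 documented slots.
type WhisperHParams struct {
	NVocab, NAudioCtx, NAudioState, NAudioHead, NAudioLayer int32
	NTextCtx, NTextState, NTextHead, NTextLayer             int32
	NMels, FType                                            int32
}

// hparamsFromSlice is a hypothetical helper that copies the metadata slice
// into named fields, following the index order documented above.
func hparamsFromSlice(m []int32) (WhisperHParams, error) {
	if len(m) != 11 {
		return WhisperHParams{}, fmt.Errorf("expected 11 hyperparameters, got %d", len(m))
	}
	return WhisperHParams{
		NVocab: m[0], NAudioCtx: m[1], NAudioState: m[2], NAudioHead: m[3], NAudioLayer: m[4],
		NTextCtx: m[5], NTextState: m[6], NTextHead: m[7], NTextLayer: m[8],
		NMels: m[9], FType: m[10],
	}, nil
}

func main() {
	// Illustrative values standing in for the slice returned by LoadWhisperGGMLTensors.
	metadata := []int32{51865, 1500, 384, 6, 4, 448, 384, 6, 4, 80, 1}
	hp, err := hparamsFromSlice(metadata)
	if err != nil {
		panic(err)
	}
	fmt.Printf("encoder layers: %d, decoder layers: %d\n", hp.NAudioLayer, hp.NTextLayer)
}
```

The two layer counts are the values that would be passed to GenerateTensorOrder as nEncLayers and nDecLayers.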

Types

type RawTensor

type RawTensor struct {
	Data        []byte  // Raw quantized tensor data
	Type        uint32  // GGML quantization type (matches quant package types)
	NumElements int     // Total number of elements in the tensor
	Dimensions  []int32 // Tensor shape (e.g., [1, 256, 64, 64])
}

RawTensor stores a tensor's payload in its original GGML encoding.
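A simple sanity check on a loaded tensor is to compare NumElements with the product of Dimensions. The sketch below assumes that relationship holds, which the documentation does not state explicitly. RawTensor is redeclared locally so the example is self-contained, and elementCount and checkTensor are hypothetical helpers.

```go
package main

import "fmt"

// RawTensor mirrors the struct documented above so this example compiles alone.
type RawTensor struct {
	Data        []byte
	Type        uint32
	NumElements int
	Dimensions  []int32
}

// elementCount is a hypothetical helper that multiplies all dimensions.
// It rejects zero or negative dimensions.
func elementCount(dims []int32) (int, error) {
	n := 1
	for _, d := range dims {
		if d <= 0 {
			return 0, fmt.Errorf("invalid dimension %d", d)
		}
		n *= int(d)
	}
	return n, nil
}

// checkTensor reports an error when NumElements disagrees with the shape.
// Assumption: NumElements is expected to equal the product of Dimensions.
func checkTensor(t RawTensor) error {
	n, err := elementCount(t.Dimensions)
	if err != nil {
		return err
	}
	if n != t.NumElements {
		return fmt.Errorf("shape implies %d elements, tensor reports %d", n, t.NumElements)
	}
	return nil
}

func main() {
	// Shape taken from the example in the struct comment.
	t := RawTensor{NumElements: 1048576, Dimensions: []int32{1, 256, 64, 64}}
	fmt.Println("check result:", checkTensor(t))
}
```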
