tokencount

package
v0.24.2 Latest
Warning

This package is not in the latest version of its module.
Published: Mar 23, 2026 License: MIT Imports: 5 Imported by: 0

Documentation

Overview

Package tokencount provides a shared offline tiktoken wrapper for LLM token estimation. It maps model IDs to BPE encodings and counts tokens without any network calls, using embedded BPE tables from tiktoken-go-loader.

Constants

const (
	EncodingCL100K = "cl100k_base"
	EncodingO200K  = "o200k_base"
)

Variables

This section is empty.

Functions

func CountText

func CountText(encoding, text string) (int, error)

CountText returns the number of tokens in text using the named BPE encoding. The encoding must be one of the constants in this package: EncodingCL100K (cl100k_base) or EncodingO200K (o200k_base).

func CountTextForModel

func CountTextForModel(modelID, text string) (int, error)

CountTextForModel is a convenience wrapper that calls EncodingForModel and then CountText.
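The composition described above can be sketched as follows. This is a minimal illustration, not the package's source: countTextForModel, lookupFunc, countFunc, stubLookup, and stubCount are hypothetical names, the stub counter splits on whitespace instead of running BPE, and the sketch assumes the fallback encoding is used when the model is not recognised.

```go
package main

import (
	"fmt"
	"strings"
)

// lookupFunc and countFunc mirror the documented signatures of
// EncodingForModel and CountText. Both type names are hypothetical.
type lookupFunc func(modelID string) (encoding string, ok bool)
type countFunc func(encoding, text string) (int, error)

// countTextForModel resolves the encoding for modelID, then counts tokens.
// Assumption: the ok flag is ignored, so unknown models fall back to
// whatever encoding the lookup returns.
func countTextForModel(lookup lookupFunc, count countFunc, modelID, text string) (int, error) {
	encoding, _ := lookup(modelID)
	return count(encoding, text)
}

// stubLookup stands in for EncodingForModel and always reports the fallback.
func stubLookup(string) (string, bool) { return "cl100k_base", false }

// stubCount stands in for CountText. It is NOT a BPE tokenizer: it counts
// whitespace-separated words so the sketch runs without embedded tables.
func stubCount(encoding, text string) (int, error) {
	if encoding != "cl100k_base" && encoding != "o200k_base" {
		return 0, fmt.Errorf("unsupported encoding %q", encoding)
	}
	return len(strings.Fields(text)), nil
}

func main() {
	n, err := countTextForModel(stubLookup, stubCount, "some-model", "hello offline world")
	fmt.Println(n, err)
}
```

Passing the lookup and the counter as parameters keeps the sketch self-contained. The real function takes only modelID and text.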

func EncodingForModel

func EncodingForModel(modelID string) (encoding string, ok bool)

EncodingForModel returns the BPE encoding name appropriate for the given model ID, using prefix matching.

Mappings:

  • o200k_base: gpt-4o*, gpt-4.1*, gpt-4.5*, o1*, o3*, o4*
  • cl100k_base: claude-*, gpt-4* (other than the gpt-4o*, gpt-4.1*, and gpt-4.5* prefixes above), gpt-3.5*, and all unknowns

The second return value is false when the model was not recognised and the fallback encoding (cl100k_base) was returned.
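The prefix matching and fallback described above could look roughly like the sketch below. It re-implements the documented mapping table under assumptions and is not the package's actual code: encodingForModel and the two prefix slices are hypothetical names, and matching is assumed to be case-sensitive. The more specific gpt-4o, gpt-4.1, and gpt-4.5 prefixes are checked before the generic gpt-4 prefix so they are not captured by it.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	encodingCL100K = "cl100k_base"
	encodingO200K  = "o200k_base"
)

// Prefix tables copied from the documented mappings (hypothetical names).
var (
	o200kPrefixes  = []string{"gpt-4o", "gpt-4.1", "gpt-4.5", "o1", "o3", "o4"}
	cl100kPrefixes = []string{"claude-", "gpt-4", "gpt-3.5"}
)

// encodingForModel returns the encoding for modelID and whether the model
// was recognised. Unrecognised models get cl100k_base with ok == false.
func encodingForModel(modelID string) (encoding string, ok bool) {
	// Check the o200k prefixes first: "gpt-4o" must win over "gpt-4".
	for _, p := range o200kPrefixes {
		if strings.HasPrefix(modelID, p) {
			return encodingO200K, true
		}
	}
	for _, p := range cl100kPrefixes {
		if strings.HasPrefix(modelID, p) {
			return encodingCL100K, true
		}
	}
	return encodingCL100K, false
}

func main() {
	for _, id := range []string{"gpt-4o-mini", "gpt-4-turbo", "claude-3-opus", "my-local-model"} {
		enc, ok := encodingForModel(id)
		fmt.Printf("%-16s %-12s recognised=%t\n", id, enc, ok)
	}
}
```

Callers that care about accuracy can inspect the second return value and, for example, log a warning when an estimate relies on the fallback encoding.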

Types

This section is empty.
