dox

v0.3.0 · Published: Feb 9, 2026 · License: MIT

dox is a Go CLI that syncs library and framework docs from GitHub repos or URLs into a local directory for indexing tools and AI assistants.

Why

Keep documentation config in your repo, keep downloaded docs out of git, and reproduce docs locally on any machine with one command.

Installation

Homebrew (macOS/Linux)

brew install g5becks/tap/dox

Download Binary

Download pre-built binaries from the releases page.

Linux/macOS:

# Download and extract (adjust version and platform)
curl -L https://github.com/g5becks/dox/releases/latest/download/dox_Linux_x86_64.tar.gz | tar xz
sudo mv dox /usr/local/bin/

Windows: Download the .zip from releases, extract, and add to PATH.

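Linux Packages

The package filenames include a version number (0.1.0 in the examples below). Adjust it to match the release you are downloading.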

Debian/Ubuntu:

wget https://github.com/g5becks/dox/releases/latest/download/dox_0.1.0_linux_amd64.deb
sudo dpkg -i dox_0.1.0_linux_amd64.deb

RedHat/Fedora/CentOS:

wget https://github.com/g5becks/dox/releases/latest/download/dox_0.1.0_linux_amd64.rpm
sudo rpm -i dox_0.1.0_linux_amd64.rpm

Alpine Linux:

wget https://github.com/g5becks/dox/releases/latest/download/dox_0.1.0_linux_amd64.apk
sudo apk add --allow-untrusted dox_0.1.0_linux_amd64.apk

Arch Linux:

wget https://github.com/g5becks/dox/releases/latest/download/dox_0.1.0_linux_amd64.pkg.tar.zst
sudo pacman -U dox_0.1.0_linux_amd64.pkg.tar.zst

Docker

docker pull ghcr.io/g5becks/dox:latest
docker run --rm ghcr.io/g5becks/dox:latest --version

Go Install

go install github.com/g5becks/dox/cmd/dox@latest

Build from Source

git clone https://github.com/g5becks/dox.git
cd dox
task build
./bin/dox --help

Quick Start

  1. Create a starter config:

dox init

This generates dox.toml with sensible defaults, including global exclude patterns for common files (images, node_modules, build artifacts, etc.).

  2. Edit dox.toml and add sources.

  3. Sync docs:

dox sync

  4. Keep downloaded docs out of git by adding the output directory to .gitignore:

.dox/

Example Config

# dox.toml
output = ".dox"

# Optional: Set default parallelism for syncing (default: 4x CPU cores, min 10)
# Override per-command with: dox sync --parallel N
max_parallel = 20

# Global exclude patterns applied to all git sources
# Per-source excludes add to (not replace) these global patterns
excludes = [
    "**/*.png",
    "**/*.jpg",
    "node_modules/**",
    ".github/**",
]

# GitHub source - type inferred from 'repo'
[sources.goreleaser]
repo = "goreleaser/goreleaser"
path = "www/docs"
patterns = ["**/*.md"]  # Override default patterns
exclude = ["**/changelog.md"]  # Adds to global excludes

# URL source - type inferred from 'url'
[sources.hono]
url = "https://hono.dev/llms-full.txt"

Commands

Sync
dox sync
dox sync goreleaser hono
dox sync --force
dox sync --clean
dox sync --dry-run
dox sync --parallel 5

All commands accept --config path/to/dox.toml (-c) to override config discovery.

List

dox list
dox list --verbose
dox list --json
dox list --files

Add

# Add GitHub source (type inferred from --repo)
dox add goreleaser --repo goreleaser/goreleaser --path www/docs

# Add URL source (type inferred from --url)
dox add hono --url https://hono.dev/llms-full.txt

# Add GitLab source (specify host)
dox add myproject --repo owner/repo --path docs --host gitlab.com

# Overwrite existing source
dox add goreleaser --repo goreleaser/goreleaser --path www/docs --force

Use --force to overwrite an existing source with the same name.

Clean

dox clean
dox clean goreleaser

Init

dox init

Query Documentation

Browse and read synced documentation directly from the CLI. After syncing, dox generates a manifest that enables fast querying of your documentation collections.

Quick Start
# Sync documentation
dox sync

# List all collections
dox collections

# List files in a collection
dox files goreleaser

# Read a file with line numbers
dox cat goreleaser docs/install.md

# Show document structure (headings/exports)
dox outline goreleaser docs/install.md
Collections

List all synced documentation collections:

dox collections                   # Table output
dox collections --json            # JSON output
dox collections --limit 5         # Limit results
Files

List files in a collection with customizable output:

# Default table output
dox files goreleaser

# JSON output
dox files goreleaser --json

# CSV output
dox files goreleaser --format csv

# Custom fields
dox files goreleaser --fields path,type,size

# Limit results
dox files goreleaser --limit 10

# Show all files (no limit)
dox files goreleaser --all

# Shorter descriptions
dox files goreleaser --desc-length 100

Available fields: path, type, lines, size, description, modified

Cat

Read file contents with line numbers:

# Show entire file
dox cat goreleaser docs/install.md

# Start at line 10
dox cat goreleaser docs/install.md --offset 10

# Show 20 lines
dox cat goreleaser docs/install.md --limit 20

# Start at line 10, show 20 lines
dox cat goreleaser docs/install.md --offset 10 --limit 20

# Without line numbers
dox cat goreleaser docs/install.md --no-line-numbers

# JSON output with metadata
dox cat goreleaser docs/install.md --json
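The way --offset and --limit select a window of lines can be sketched in a few lines of Python. This is an illustration only, not dox's implementation: the function name `cat_lines` is hypothetical, and it assumes --offset is a 1-based line number, as the "Start at line 10" comment above suggests.

```python
def cat_lines(text, offset=1, limit=None, line_numbers=True):
    """Return the selected lines of `text`, optionally prefixed with line numbers."""
    lines = text.splitlines()
    start = max(offset, 1) - 1  # assumption: offset is a 1-based line number
    end = len(lines) if limit is None else start + limit
    selected = lines[start:end]
    if not line_numbers:
        return selected
    width = len(str(len(lines)))  # pad numbers to the widest line number in the file
    return [f"{n:>{width}}  {line}" for n, line in enumerate(selected, start=start + 1)]
```

Line numbers stay tied to the original file rather than restarting at 1, so a line reported by search can be passed straight back as an offset.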
Outline

Show document structure (headings for markdown/MDX, exports for TypeScript):

dox outline goreleaser docs/install.md
dox outline goreleaser docs/install.md --json

Output shows line numbers alongside:

  • Markdown/MDX: Heading hierarchy with levels and indentation
  • TypeScript/TSX: Exported functions, classes, types
  • Text files: First line description
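For the Markdown case, a heading outline with line numbers can be approximated as follows. This is a rough sketch under stated assumptions, not the parser dox uses: the names `markdown_outline` and `render_outline` are hypothetical, only ATX headings (`#` to `######`) are recognized, and lines inside fenced code blocks are skipped.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.+?)\s*$")
FENCE = "`" * 3  # code-fence marker, built this way to keep the example self-contained


def markdown_outline(text):
    """Return (line_number, level, title) for each ATX heading outside code fences."""
    outline = []
    in_fence = False
    for number, line in enumerate(text.splitlines(), start=1):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            outline.append((number, len(match.group(1)), match.group(2)))
    return outline


def render_outline(outline):
    """Indent each title by heading level and prefix it with its line number."""
    return [f"{number:>4}  {'  ' * (level - 1)}{title}" for number, level, title in outline]
```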

Search

Search across documentation metadata or file contents to discover relevant docs without guessing paths.

Metadata Search (Default)

Fuzzy search across file paths, descriptions, headings, and exports:

# Search all collections
dox search "installation"

# Search specific collection
dox search "hooks" --collection react

# Limit results
dox search "config" --limit 10

# JSON output
dox search "api" --json

# CSV output
dox search "guide" --format csv

# Truncate descriptions
dox search "guide" --desc-length 80

Metadata search returns:

  • Collection name
  • File path
  • File type
  • Match field (path, description, heading, or export)
  • Fuzzy match score
  • File description

Content Search

Search within file contents using literal or regex patterns:

# Literal search (case-insensitive)
dox search "useState" --content

# Search specific collection
dox search "configuration" --content --collection goreleaser

# Regex search
dox search "func.*Logger" --content --regex

# Limit results
dox search "import" --content --limit 20

Content search returns:

  • Collection name
  • File path
  • Line number (1-based)
  • Matching line text
Search Behavior

  • Metadata mode: Fuzzy matches across paths, descriptions, headings, and TypeScript exports
  • Content mode: Searches actual file contents (skips binary files and files >50MB)
  • Regex mode: Requires the --content flag; patterns are case-insensitive
  • Collection filter: Restricts search to a single collection
  • Limit: --limit 0 means unlimited; when the flag is omitted, default_limit from the [display] config section applies
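The content-mode rules above (case-insensitive matching, literal by default, regex on request, 0 meaning no limit) can be illustrated with a small in-memory sketch. The function `content_search` and its dict-of-files input are hypothetical stand-ins; the real command reads synced files from disk and also skips binary and oversized files.

```python
import re


def content_search(files, query, use_regex=False, limit=0):
    """Search a {path: text} mapping; return (path, line_number, line) tuples."""
    # Literal queries are escaped so regex metacharacters match themselves.
    pattern = re.compile(query if use_regex else re.escape(query), re.IGNORECASE)
    results = []
    for path, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):  # 1-based line numbers
            if pattern.search(line):
                results.append((path, number, line))
                if limit and len(results) >= limit:  # limit == 0 means unlimited
                    return results
    return results
```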
AI Agent Integration

Query commands are designed for AI agents and automation:

# Agent workflow example
dox sync                                    # Ensure docs are current
dox collections --json                      # Discover available docs
dox search "authentication" --json          # Find relevant topics
dox files react --json --limit 100          # Browse collection files
dox cat react docs/hooks.md                 # Read specific content
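An agent written in Python might wrap the --json commands with a small helper like the one below. This is a sketch, assuming dox is on PATH; the helper name `dox_json` is hypothetical, and because the JSON schema is not documented in this README the decoded payload is returned as-is rather than mapped to specific fields.

```python
import json
import subprocess


def dox_json(args, run=subprocess.run):
    """Run `dox <args> --json` and return the decoded JSON payload."""
    # `run` is injectable so the helper can be exercised without dox installed.
    completed = run(["dox", *args, "--json"], capture_output=True, text=True, check=True)
    return json.loads(completed.stdout)
```

For example, `dox_json(["collections"])` and `dox_json(["search", "authentication"])` correspond to the second and third steps of the workflow above.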
Display Configuration

Customize query output in dox.toml:

[display]
default_limit = 50            # Default result limit for dox files and dox search
description_length = 200      # Max description/text length in table output
line_numbers = true           # Show line numbers in dox cat
format = "table"              # Default format: table, json, csv
list_fields = ["path", "type", "lines", "size", "description"]
Troubleshooting

"Manifest not found"

  • Run dox sync to generate the manifest
  • Manifest is created automatically after each sync

"Collection not found"

  • Run dox collections to see available collections
  • Ensure the source name matches your dox.toml configuration

Global Excludes

Use the top-level excludes key to define patterns that apply to all git sources (GitHub, GitLab, Codeberg, etc.):

excludes = [
    "**/*.png",
    "**/*.jpg",
    "node_modules/**",
    ".vitepress/**",
]

Key behaviors:

  • Global excludes apply to all git sources (GitHub, GitLab, Codeberg, self-hosted)
  • Global excludes do NOT apply to URL sources
  • Per-source exclude patterns add to (not replace) global excludes
  • Duplicate patterns are automatically removed
  • dox init generates a config with comprehensive defaults

Example:

excludes = ["**/*.png", "node_modules/**"]

[sources.docs]
repo = "owner/repo"
path = "docs"
exclude = ["**/*.jpg", "**/*.png"]  # Final excludes: **/*.png, node_modules/**, **/*.jpg (duplicate **/*.png dropped)
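The merge rule (per-source patterns add to the global ones, duplicates removed) amounts to an order-preserving union. A minimal sketch, with a hypothetical function name and an assumed ordering of global patterns first:

```python
def merge_excludes(global_excludes, source_excludes):
    """Combine global and per-source exclude patterns, dropping duplicates."""
    merged = []
    for pattern in [*global_excludes, *source_excludes]:
        if pattern not in merged:  # keep the first occurrence only
            merged.append(pattern)
    return merged
```

Applied to the example above, `merge_excludes(["**/*.png", "node_modules/**"], ["**/*.jpg", "**/*.png"])` yields the three distinct patterns.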

Configuration Reference

Minimal Configuration

dox infers source types automatically and applies sensible defaults to minimize config verbosity:

# Minimal GitHub source - type inferred from 'repo' presence
[sources.goreleaser]
repo = "goreleaser/goreleaser"
path = "www/docs"

# Minimal URL source - type inferred from 'url' presence
[sources.hono]
url = "https://hono.dev/llms-full.txt"

What's inferred:

  • type = "github" when repo is present
  • type = "url" when url is present
  • host = "github.com" for git sources (GitHub default)
  • patterns = ["**/*.md", "**/*.mdx", "**/*.txt"] for git sources
  • ref = <default branch> for git sources (fetched from remote)
  • filename = <basename from URL> for URL sources
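The inference rules listed above can be pictured as a function that fills in missing keys on a source table. The sketch below is illustrative only: `apply_defaults` is a hypothetical name, the host-to-type mapping for gitlab.com and codeberg.org follows the examples in this README, and falling back to type "git" for any other host is an assumption. The default ref is left unset because it is fetched from the remote.

```python
import posixpath
from urllib.parse import urlparse

DEFAULT_PATTERNS = ["**/*.md", "**/*.mdx", "**/*.txt"]
KNOWN_HOSTS = {"github.com": "github", "gitlab.com": "gitlab", "codeberg.org": "codeberg"}


def apply_defaults(source):
    """Return a copy of a source table with inferred type and default values filled in."""
    resolved = dict(source)
    if "url" in resolved:
        resolved.setdefault("type", "url")
        # Default filename is the last path segment of the URL.
        resolved.setdefault("filename", posixpath.basename(urlparse(resolved["url"]).path))
        return resolved
    resolved.setdefault("host", "github.com")
    # Assumption: hosts not listed in KNOWN_HOSTS fall back to the generic "git" type.
    resolved.setdefault("type", KNOWN_HOSTS.get(resolved["host"], "git"))
    resolved.setdefault("patterns", list(DEFAULT_PATTERNS))
    return resolved
```

Values set explicitly in the config always win, since every default is applied only when the key is absent.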
Git Hosting Sources

GitHub (default)

[sources.docs]
repo = "owner/repo"
path = "docs"
# type inferred as "github", host defaults to "github.com"

GitLab

[sources.docs]
repo = "owner/repo"
path = "docs"
host = "gitlab.com"  # Explicit host triggers gitlab type inference

Or explicitly set type:

[sources.docs]
type = "gitlab"
repo = "owner/repo"
path = "docs"
# host defaults to "github.com" but gitlab client handles gitlab.com URLs

Codeberg

[sources.docs]
repo = "owner/repo"
path = "docs"
host = "codeberg.org"

Self-Hosted Git (GitHub Enterprise, GitLab CE/EE, Gitea, etc.)

[sources.internal-docs]
repo = "company/documentation"
path = "guides"
host = "git.company.com"

Advanced Git Source Options

[sources.advanced]
type = "github"              # Optional: github|gitlab|codeberg|git (inferred if omitted)
repo = "owner/repo"          # Required: owner/repo format
host = "github.com"          # Optional: git host (default: github.com)
path = "docs/content"        # Required: path to directory or file in repo
ref = "v2.0.0"               # Optional: branch, tag, or commit SHA (default: repo default branch)
patterns = ["**/*.md"]       # Optional: glob patterns (default: ["**/*.md", "**/*.mdx", "**/*.txt"])
exclude = ["**/draft/**"]    # Optional: exclude patterns (merges with global excludes)
out = "custom-dir"           # Optional: output subdirectory (default: source name)
URL Sources

For direct file downloads from any HTTP/HTTPS URL:

# Minimal
[sources.hono]
url = "https://hono.dev/llms-full.txt"

# With options
[sources.spec]
url = "https://example.com/api/spec.yaml"
filename = "openapi.yaml"   # Optional: custom filename (default: spec.yaml from URL)
out = "specs"               # Optional: output subdirectory (default: source name)
Complete Field Reference
Global Configuration
| Field | Type | Default | Description |
|---|---|---|---|
| output | string | .dox | Output directory root (relative to config file or absolute) |
| github_token | string | $GITHUB_TOKEN or $GH_TOKEN | GitHub API token for private repos and higher rate limits |
| max_parallel | int | 4 × CPU cores (min 10) | Max concurrent source syncs (I/O-bound operations) |
| excludes | []string | [] | Global exclude patterns applied to all git sources |
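The max_parallel default of "4 × CPU cores (min 10)" works out as in the following sketch. The function name is hypothetical, and the core count comes from Python's `os.cpu_count()`, which may differ from what the Go runtime reports.

```python
import os


def default_max_parallel(cpu_count=None):
    """Four workers per CPU core, but never fewer than 10."""
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(10, 4 * cores)
```

On a 2-core machine this gives 10 (the floor applies); on an 8-core machine it gives 32.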
Source Fields (Git Hosting)
| Field | Required | Default | Description |
|---|---|---|---|
| type | No | Inferred | Source type: github, gitlab, codeberg, git, url |
| repo | Yes* | | Repository in owner/repo format |
| host | No | github.com | Git hosting domain (e.g., gitlab.com, git.company.com) |
| path | Yes | | Path to directory or file in repository |
| ref | No | Default branch | Branch, tag, or commit SHA to sync |
| patterns | No | ["**/*.md", "**/*.mdx", "**/*.txt"] | Glob patterns for files to include |
| exclude | No | [] | Glob patterns to exclude (merged with global excludes) |
| out | No | Source name | Custom output subdirectory name |

*Required for git sources. Must have either repo OR url, not both.

Source Fields (URL)
| Field | Required | Default | Description |
|---|---|---|---|
| type | No | url (inferred) | Automatically set to url when url field is present |
| url | Yes* | | Direct HTTP/HTTPS URL to a file |
| filename | No | Basename from URL | Custom filename for downloaded file |
| out | No | Source name | Custom output subdirectory name |

*Required for URL sources. Must have either repo OR url, not both.
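The required-field rules from the two tables (exactly one of repo or url, and path required for git sources) can be expressed as a small check. This is a sketch only: `validate_source` and its error messages are hypothetical and do not reproduce dox's actual validation output.

```python
def validate_source(name, source):
    """Return a list of problems found in one [sources.<name>] table."""
    errors = []
    has_repo, has_url = "repo" in source, "url" in source
    if has_repo == has_url:  # both present, or both missing
        errors.append(f"{name}: set exactly one of 'repo' or 'url'")
    if has_repo and "path" not in source:
        errors.append(f"{name}: 'path' is required for git sources")
    return errors
```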

Configuration Comparison: Before vs. After

Before (verbose, redundant):

[sources.goreleaser]
type = "github"              # Redundant - obvious from 'repo'
repo = "goreleaser/goreleaser"
ref = "main"                 # Redundant - same as default branch
path = "www/docs"
patterns = ["**/*.md", "**/*.mdx", "**/*.txt"]  # Redundant - same as default

[sources.hono]
type = "url"                 # Redundant - obvious from 'url'
url = "https://hono.dev/llms-full.txt"
filename = "llms-full.txt"   # Redundant - same as URL basename

After (slim, clear):

[sources.goreleaser]
repo = "goreleaser/goreleaser"
path = "www/docs"

[sources.hono]
url = "https://hono.dev/llms-full.txt"

Both configs produce identical behavior. The slimmer version relies on type inference and sensible defaults.

Output Layout

Default output root is .dox/:

.dox/
  .dox.lock
  goreleaser/
  hono/

Each source writes into its own directory (source name by default, or out override).
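Putting the output and out settings together, the directory for a given source could be computed roughly as below. This is an assumption-labeled sketch: `source_output_dir` is a hypothetical helper, and POSIX-style paths are used for simplicity.

```python
from pathlib import PurePosixPath


def source_output_dir(config_dir, name, source, output=".dox"):
    """Resolve where a source's files land: <output root>/<out or source name>."""
    root = PurePosixPath(output)
    if not root.is_absolute():
        # A relative output root resolves from the config file's directory.
        root = PurePosixPath(config_dir) / root
    return root / source.get("out", name)
```

With the config in /work/app, the goreleaser source resolves to /work/app/.dox/goreleaser, and a source with out = "specs" resolves to /work/app/.dox/specs.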

Integrations

dox downloads docs to a local directory that can be indexed by various tools:

AI Code Search & RAG
  • ck-search — Semantic code search powered by embeddings. Index your .dox/ directory for context-aware documentation search.

    dox sync
    ck index .dox/
    ck search "how to configure goreleaser"
    
  • kit — AI-powered development assistant. Point kit at your synced docs for enhanced context.

    dox sync
    kit configure --docs-path .dox/
    
  • aichat RAG — Retrieval-Augmented Generation for LLMs. Build a RAG system from your documentation.

    dox sync
    aichat --rag .dox/
    
Documentation Indexers
  • Codana — Documentation search and indexing:

    dox sync
    codana documents add-collection goreleaser .dox/goreleaser/
    codana documents add-collection hono .dox/hono/
    codana documents index
    
  • Meilisearch — Fast, typo-tolerant search engine:

    dox sync
    # Index .dox/ directory with Meilisearch
    
  • Typesense — Open-source search engine:

    dox sync
    # Index .dox/ with Typesense for fast doc search
    
IDE & Editor Extensions

Configure your editor to use .dox/ as a documentation source:

  • VSCode: Point Copilot or Codeium context to .dox/
  • Cursor: Add .dox/ to workspace context
  • Vim/Neovim: Use with coc.nvim or native LSP doc providers
Custom Scripts

Since dox outputs plain files, you can build custom tooling:

# Full-text search
rg "pattern" .dox/

# Convert markdown to HTML
find .dox/ -name "*.md" -exec pandoc -o {}.html {} \;

# Generate embeddings
python embed_docs.py .dox/

Notes

  • Config discovery searches for dox.toml or .dox.toml from CWD up to filesystem root.
  • Relative config paths resolve from the config file directory.
  • GitHub token resolution order:
    1. github_token in config
    2. GITHUB_TOKEN
    3. GH_TOKEN
  • Parallelism: Default is 4x CPU cores (minimum 10) since syncing is I/O-bound. Set max_parallel in config or use --parallel flag to override.
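Two of the notes above, config discovery and token resolution, can be sketched as follows. Both function names are hypothetical, and checking dox.toml before .dox.toml within the same directory is an assumption, since the README does not state which takes precedence.

```python
import os
from pathlib import Path

CONFIG_NAMES = ("dox.toml", ".dox.toml")  # assumption: checked in this order per directory


def find_config(start):
    """Walk from `start` up to the filesystem root and return the first config found."""
    current = Path(start).resolve()
    for directory in (current, *current.parents):
        for name in CONFIG_NAMES:
            candidate = directory / name
            if candidate.is_file():
                return candidate
    return None


def resolve_github_token(config, env=None):
    """Config value first, then GITHUB_TOKEN, then GH_TOKEN; None if none is set."""
    env = os.environ if env is None else env
    return config.get("github_token") or env.get("GITHUB_TOKEN") or env.get("GH_TOKEN")
```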

Contributing

Contributions are welcome! See CONTRIBUTING.md for guidelines.

License

MIT
