symbols

Version: v0.2.0
Published: Jun 27, 2026 License: MIT Imports: 14 Imported by: 0

README

treesitter-symbols

Extract code symbols — function/type definitions, imports, references, call edges, method→owner relations, the declared package, and the exported set — from source files in 17 languages, with one small API.

Go is parsed with the standard library's go/ast. The other 16 — Python · JavaScript · TypeScript · Java · Rust · C · C++ · C# · Kotlin · PHP · Ruby · Scala · R · MATLAB · Perl · Swift — are parsed with the pure-Go tree-sitter runtime gotreesitter and its bundled grammars.

Extraction is name-based: references and call edges record bare callee names, not type-resolved targets. That's intentionally lightweight: enough to build cross-language call graphs, coupling metrics, and dead-code / unused-export analysis without a full type checker.
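As a rough illustration of the dead-code use case, the sketch below lists functions that never appear as a callee. It assumes data shaped like Symbols.Functions and Symbols.CallEdges; the CallEdge type is mirrored locally so the example compiles without the module, and the uncalled helper and sample names are hypothetical.

```go
package main

import "fmt"

// CallEdge mirrors the documented struct locally so this sketch
// compiles on its own.
type CallEdge struct {
	Caller string
	Callee string
}

// uncalled returns the functions that never appear as a Callee, in
// definition order. Matching is by bare name, so same-named functions
// on different types cannot be told apart here.
func uncalled(functions []string, edges []CallEdge) []string {
	called := make(map[string]bool, len(edges))
	for _, e := range edges {
		called[e.Callee] = true
	}
	var out []string
	for _, f := range functions {
		if !called[f] {
			out = append(out, f)
		}
	}
	return out
}

func main() {
	// Hypothetical data for one file.
	functions := []string{"main", "build", "greet", "legacy"}
	edges := []CallEdge{
		{Caller: "main", Callee: "build"},
		{Caller: "build", Callee: "greet"},
	}
	fmt.Println(uncalled(functions, edges)) // [main legacy]
}
```

Entry points such as main show up as candidates too, so a real analysis would likely filter them out and merge edges across files first.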

Install

go get github.com/richardwooding/treesitter-symbols

Usage

import symbols "github.com/richardwooding/treesitter-symbols"

s, err := symbols.Extract("rust", src) // any supported language
s, err = symbols.ExtractGo(src)        // go/ast path

fmt.Println(s.Functions)    // ["build", "greet", ...]
fmt.Println(s.Imports)      // ["std::collections::HashMap", ...]
fmt.Println(s.CallEdges)    // [{Caller:"Write" Callee:"helper"}, ...]
fmt.Println(s.MethodOwners) // [{Method:"Write" Owner:"Buffer"}, ...]

type Symbols struct {
	Functions       []string       // function + method names
	Types           []string       // type / class / interface / enum / ...
	Imports         []string       // import paths
	References      []string       // call-site callees + type usages
	Exported        []string       // exported/public subset (see below)
	CallEdges       []CallEdge     // caller -> callee
	MethodOwners    []MethodOwner  // method -> owning type
	Package         string         // declared package / namespace
	RelativeImports []string       // relative imports, dots preserved (Python)
	FunctionSpans   []FunctionSpan // name, 1-based line range, + complexity
}

type FunctionSpan struct {
	Name       string
	StartLine  int
	EndLine    int
	Cyclomatic int   // McCabe (1 + branch points)
	Cognitive  *int  // SonarSource; nil when unavailable (Swift)
}

Complexity, from the same parse

Each FunctionSpan carries cyclomatic and cognitive complexity, computed by go-codemetrics over the same parse tree (treesitter.MetricsFromTree) — so symbols and metrics cost a single parse, not two. Cognitive is nil only where the analyzer has no spec (Swift).
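Because Cognitive is a pointer, consumers need a nil check before dereferencing it. The sketch below shows one way to do that while ranking spans by cyclomatic complexity. FunctionSpan is mirrored locally from the definition above; cognitiveLabel, byCyclomaticDesc, and the sample values are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// FunctionSpan mirrors the documented struct locally.
type FunctionSpan struct {
	Name       string
	StartLine  int
	EndLine    int
	Cyclomatic int
	Cognitive  *int
}

// cognitiveLabel renders Cognitive and guards the nil case.
func cognitiveLabel(s FunctionSpan) string {
	if s.Cognitive == nil {
		return "n/a"
	}
	return fmt.Sprint(*s.Cognitive)
}

// byCyclomaticDesc returns a sorted copy, most complex first,
// leaving the input slice untouched.
func byCyclomaticDesc(spans []FunctionSpan) []FunctionSpan {
	out := append([]FunctionSpan(nil), spans...)
	sort.SliceStable(out, func(i, j int) bool {
		return out[i].Cyclomatic > out[j].Cyclomatic
	})
	return out
}

func main() {
	seven := 7
	spans := []FunctionSpan{
		{Name: "greet", StartLine: 3, EndLine: 5, Cyclomatic: 1},
		{Name: "build", StartLine: 8, EndLine: 30, Cyclomatic: 6, Cognitive: &seven},
	}
	for _, s := range byCyclomaticDesc(spans) {
		fmt.Printf("%s lines %d-%d cyclomatic=%d cognitive=%s\n",
			s.Name, s.StartLine, s.EndLine, s.Cyclomatic, cognitiveLabel(s))
	}
}
```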

SupportedLanguages() lists the 17 identifiers. An unknown language returns a wrapped ErrUnsupportedLanguage. Parsing is best-effort: a partial parse yields the symbols recovered. For the tree-sitter languages, a failed or timed-out parse yields a zero Symbols and a nil error; ExtractGo returns an error only on a total parse failure (no tree).

Exported set

Exported is computed where a language has a clear rule:

Rule                                              Languages
Capitalised name                                  Go
Not _-prefixed                                    Python
Keyword visibility (pub / export / public)        Rust, TypeScript, JavaScript, Java, C#
Default-public minus private/internal/protected   Kotlin, Scala

For the remaining languages (Ruby, PHP, C, C++, Perl, R, MATLAB, Swift) Exported is nil — there's no unambiguous syntactic rule.
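The two purely name-based rules in the table can be illustrated in a few lines. This is a sketch of the rules as stated, not the package's implementation: goExported, pythonExported, and exportedSubset are hypothetical helpers, and the Go rule is delegated to the standard library's token.IsExported.

```go
package main

import (
	"fmt"
	"go/token"
	"strings"
)

// goExported applies the capitalised-name rule via the standard library.
func goExported(name string) bool { return token.IsExported(name) }

// pythonExported applies the "not _-prefixed" rule.
func pythonExported(name string) bool { return !strings.HasPrefix(name, "_") }

// exportedSubset keeps the names that satisfy rule, preserving order.
func exportedSubset(names []string, rule func(string) bool) []string {
	var out []string
	for _, n := range names {
		if rule(n) {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	fmt.Println(exportedSubset([]string{"Write", "helper", "Buffer"}, goExported))
	// [Write Buffer]
	fmt.Println(exportedSubset([]string{"build", "_private", "Greeter"}, pythonExported))
	// [build Greeter]
}
```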

Dependencies & binary size

The module requires gotreesitter; the Go path (ExtractGo) needs no grammar and is unaffected. A plain build embeds every bundled grammar (~22 MB). To embed only the languages you use, build with the gotreesitter subset tags:

go build -tags 'grammar_subset grammar_subset_rust grammar_subset_python' ./...

See also:
  • go-codemetrics: cyclomatic + cognitive complexity for the same languages.

License

MIT — see LICENSE.

Documentation

Overview

Package symbols extracts code symbols — function and type definitions, imports, references (call sites + type usages), call edges, method→owner relations, the declared package, and the exported set — from source files in 17 languages.

Go is parsed with the standard library's go/ast (no external dependency for that path). The other 16 languages — Python, JavaScript, TypeScript, Java, Rust, C, C++, C#, Kotlin, PHP, Ruby, Scala, R, MATLAB, Perl, Swift — are parsed with the pure-Go tree-sitter runtime github.com/odvcencio/gotreesitter and its bundled grammars, using each grammar's tags query plus a small set of per-language supplemental queries.

Extraction is name-based: references and call edges record bare callee names, not type-resolved targets. This is intentionally lightweight: enough to build cross-language call graphs, coupling metrics, and dead-code / unused-export analysis without a full type checker.

s, err := symbols.Extract("rust", src)   // any supported language
s, err = symbols.ExtractGo(src)          // go/ast path

A plain build embeds every bundled grammar (~22 MB). To embed only the languages you use, build with the gotreesitter subset tags, e.g.

-tags 'grammar_subset grammar_subset_rust grammar_subset_python'

(The Go path needs no grammar and is unaffected.)

Variables

var ErrUnsupportedLanguage = errors.New("symbols: unsupported language")

ErrUnsupportedLanguage is returned by Extract for a language with no extractor. Test for it with errors.Is.
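Since the error is wrapped, a direct == comparison will not match; errors.Is walks the chain. The sketch below demonstrates this with a local stand-in sentinel and a hypothetical extractStub that wraps it with %w, which is an assumption about how the wrapping is done rather than the package's actual code.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnsupported stands in for symbols.ErrUnsupportedLanguage.
var errUnsupported = errors.New("symbols: unsupported language")

// extractStub is a hypothetical stand-in for symbols.Extract that only
// models the error path: unknown languages get a wrapped sentinel.
func extractStub(language string) error {
	switch language {
	case "go", "rust":
		return nil
	}
	return fmt.Errorf("%w: %q", errUnsupported, language)
}

// isUnsupported reports whether err wraps the sentinel.
func isUnsupported(err error) bool { return errors.Is(err, errUnsupported) }

func main() {
	err := extractStub("cobol")
	fmt.Println(err == errUnsupported) // false: the sentinel is wrapped
	fmt.Println(isUnsupported(err))    // true
}
```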

Functions

func SupportedLanguages

func SupportedLanguages() []string

SupportedLanguages returns every language identifier Extract accepts, sorted. It includes "go" (parsed via go/ast) and the tree-sitter languages.

Types

type CallEdge

type CallEdge struct {
	Caller string
	Callee string
}

CallEdge is a name-based call attribution: Caller (an enclosing function or method name) invokes Callee.
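One simple coupling metric that falls out of these edges is fan-in and fan-out per name. The following is a minimal sketch with CallEdge mirrored locally; the degree type, the degrees helper, and the sample edges are hypothetical.

```go
package main

import "fmt"

// CallEdge mirrors the documented struct locally.
type CallEdge struct {
	Caller string
	Callee string
}

// degree counts how often a name is called (In) and how many calls
// it makes (Out).
type degree struct {
	In  int
	Out int
}

// degrees tallies fan-in and fan-out for every name seen in edges.
func degrees(edges []CallEdge) map[string]degree {
	out := make(map[string]degree)
	for _, e := range edges {
		caller := out[e.Caller]
		caller.Out++
		out[e.Caller] = caller

		callee := out[e.Callee]
		callee.In++
		out[e.Callee] = callee
	}
	return out
}

func main() {
	edges := []CallEdge{
		{Caller: "Write", Callee: "helper"},
		{Caller: "Flush", Callee: "helper"},
		{Caller: "Write", Callee: "Flush"},
	}
	d := degrees(edges)
	fmt.Printf("helper: in=%d out=%d\n", d["helper"].In, d["helper"].Out)
	fmt.Printf("Write:  in=%d out=%d\n", d["Write"].In, d["Write"].Out)
}
```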

type FunctionSpan

type FunctionSpan struct {
	Name      string
	StartLine int
	EndLine   int
	// Cyclomatic is the McCabe cyclomatic complexity (1 + branch points).
	Cyclomatic int
	// Cognitive is the SonarSource cognitive complexity, or nil when the
	// language's analyzer does not compute it (currently Swift among the
	// tree-sitter languages). Always set for Go.
	Cognitive *int
}

FunctionSpan is a function or method definition's name, 1-based inclusive line span, and complexity metrics.
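A 1-based inclusive span maps onto a Go slice as lines[StartLine-1:EndLine]. The sketch below uses that to pull a function's source text out of a file; spanSource is a hypothetical helper and assumes "\n" line endings.

```go
package main

import (
	"fmt"
	"strings"
)

// spanSource returns the text of lines start..end (1-based, inclusive).
// Out-of-range bounds are clamped; an empty range yields "".
func spanSource(src []byte, start, end int) string {
	lines := strings.Split(string(src), "\n")
	if start < 1 {
		start = 1
	}
	if end > len(lines) {
		end = len(lines)
	}
	if start > end {
		return ""
	}
	return strings.Join(lines[start-1:end], "\n")
}

func main() {
	src := []byte("package p\n\nfunc greet() {\n\tprintln(\"hi\")\n}\n")
	// A span with StartLine 3 and EndLine 5 covers the whole function.
	fmt.Println(spanSource(src, 3, 5))
}
```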

type MethodOwner

type MethodOwner struct {
	Method string
	Owner  string
}

MethodOwner binds a method name to the type that declares it (e.g. method "String" on owner "Buffer"). Lets a consumer disambiguate same-named methods across types.
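To illustrate the disambiguation, the sketch below indexes owners by method name and builds an Owner.Method label. MethodOwner is mirrored locally; ownersByMethod, qualifiedName, and the label format are hypothetical choices, not something the package prescribes.

```go
package main

import "fmt"

// MethodOwner mirrors the documented struct locally.
type MethodOwner struct {
	Method string
	Owner  string
}

// ownersByMethod groups owning types under each method name, so a
// bare name like "String" can be resolved to its candidate types.
func ownersByMethod(pairs []MethodOwner) map[string][]string {
	idx := make(map[string][]string)
	for _, p := range pairs {
		idx[p.Method] = append(idx[p.Method], p.Owner)
	}
	return idx
}

// qualifiedName builds an "Owner.Method" label for display or map keys.
func qualifiedName(p MethodOwner) string { return p.Owner + "." + p.Method }

func main() {
	pairs := []MethodOwner{
		{Method: "String", Owner: "Buffer"},
		{Method: "String", Owner: "Token"},
		{Method: "Write", Owner: "Buffer"},
	}
	fmt.Println(ownersByMethod(pairs)["String"]) // [Buffer Token]
	fmt.Println(qualifiedName(pairs[2]))         // Buffer.Write
}
```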

type Symbols

type Symbols struct {
	// Functions are function and method definition names (bare, not
	// receiver-qualified — see MethodOwners for the owning type).
	Functions []string
	// Types are type / class / struct / interface / enum / trait / … names.
	Types []string
	// Imports are imported module / package paths (quotes and angle brackets
	// stripped).
	Imports []string
	// References are the bare names a file uses: call-site callees plus type
	// usages (a type named as a field/param/return/generic type). Name-based.
	References []string
	// Exported is the subset of definitions visible outside the file's
	// module/package. Computed for Go (capitalised), Python (not "_"-prefixed),
	// the keyword-visibility languages (Rust/TS/JS/Java/C#) and the
	// default-public languages (Kotlin/Scala). Nil for languages with no clear
	// rule (Ruby/PHP/C/C++/Perl/R/MATLAB/Swift).
	Exported []string
	// CallEdges attribute each call site to its innermost enclosing function.
	CallEdges []CallEdge
	// MethodOwners bind methods to their owning type, where the language nests
	// methods in a type container (most class-based languages; not C/C++).
	MethodOwners []MethodOwner
	// Package is the file's declared package / namespace, for languages that
	// declare one in source (Java/C#/Kotlin/Scala/PHP/Perl). "" otherwise.
	Package string
	// RelativeImports are imports with their leading dots preserved
	// (Python today); kept separate from Imports.
	RelativeImports []string
	// FunctionSpans are the line ranges of every named function/method.
	FunctionSpans []FunctionSpan
}

Symbols is the result of analysing one source file.

All name slices are deduplicated, first-occurrence order preserved. Fields a language doesn't support are nil/empty (see the per-field notes); none of this is an error.
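First-occurrence deduplication is a common idiom; a minimal sketch follows for readers who want to merge slices from several files the same way. The dedupe helper is hypothetical and not taken from the package. Note that it returns nil for empty input, so callers should test with len(s) == 0 rather than s == nil.

```go
package main

import "fmt"

// dedupe drops repeated names while keeping the first occurrence of
// each in its original position.
func dedupe(names []string) []string {
	seen := make(map[string]struct{}, len(names))
	var out []string
	for _, n := range names {
		if _, dup := seen[n]; dup {
			continue
		}
		seen[n] = struct{}{}
		out = append(out, n)
	}
	return out
}

func main() {
	fmt.Println(dedupe([]string{"helper", "build", "helper", "greet", "build"}))
	// [helper build greet]
}
```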

func Extract

func Extract(language string, src []byte) (Symbols, error)

Extract analyses src as the named language and returns its symbols. Recognised identifiers are those from SupportedLanguages; "go" (alias "golang") routes to ExtractGo, the rest to the tree-sitter backend. An unknown or unavailable language returns a wrapped ErrUnsupportedLanguage.

Extraction is best-effort: source that only partially parses yields the symbols recovered so far. A parse that fails or times out yields a zero Symbols and a nil error.
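Because a failed parse is reported as a zero Symbols with a nil error, a caller that cares about the difference has to inspect the result itself. The sketch below checks a few fields for emptiness. The Symbols type is a partial local mirror and isEmpty is a hypothetical helper; an empty result can also come from a legitimately empty file, so this is only a heuristic.

```go
package main

import "fmt"

// Symbols mirrors a few of the documented fields; the real struct
// has more.
type Symbols struct {
	Functions  []string
	Types      []string
	Imports    []string
	References []string
	Package    string
}

// isEmpty reports whether nothing was recovered. len treats nil and
// empty slices alike, which matches the nil/empty convention above.
func isEmpty(s Symbols) bool {
	return len(s.Functions) == 0 && len(s.Types) == 0 &&
		len(s.Imports) == 0 && len(s.References) == 0 && s.Package == ""
}

func main() {
	fmt.Println(isEmpty(Symbols{}))                             // true
	fmt.Println(isEmpty(Symbols{Functions: []string{"build"}})) // false
}
```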

func ExtractGo

func ExtractGo(src []byte) (Symbols, error)

ExtractGo analyses Go source via the standard library's go/ast — no external dependency. Both top-level functions and receiver-bound methods land in Functions (bare names); methods also appear in MethodOwners with their receiver type. References combines call sites, function-value uses, and type usages. Exported is the capitalised subset of functions and types.
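The general approach can be sketched with the standard library alone. The example below collects bare function names and method/receiver pairs from go/ast; it is an illustration of the technique, not ExtractGo's source. collect, receiverType, and the sample input are hypothetical, and unlike the package this version simply returns on any parse error.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// MethodOwner mirrors the documented struct locally.
type MethodOwner struct {
	Method string
	Owner  string
}

// receiverType unwraps pointer and generic receivers to the bare
// type name, e.g. *Buffer or Buffer[T] both give "Buffer".
func receiverType(expr ast.Expr) string {
	for {
		switch t := expr.(type) {
		case *ast.StarExpr:
			expr = t.X
		case *ast.IndexExpr:
			expr = t.X
		case *ast.IndexListExpr:
			expr = t.X
		case *ast.Ident:
			return t.Name
		default:
			return ""
		}
	}
}

// collect returns top-level function and method names, plus the
// receiver type for each method.
func collect(src string) ([]string, []MethodOwner, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, nil, err
	}
	var funcs []string
	var owners []MethodOwner
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok {
			continue
		}
		funcs = append(funcs, fn.Name.Name)
		if fn.Recv != nil && len(fn.Recv.List) > 0 {
			if owner := receiverType(fn.Recv.List[0].Type); owner != "" {
				owners = append(owners, MethodOwner{Method: fn.Name.Name, Owner: owner})
			}
		}
	}
	return funcs, owners, nil
}

func main() {
	src := "package buf\n\ntype Buffer struct{}\n\nfunc (b *Buffer) Write() { helper() }\n\nfunc helper() {}\n"
	funcs, owners, err := collect(src)
	fmt.Println(funcs, owners, err) // [Write helper] [{Write Buffer}] <nil>
}
```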

Parsing is best-effort: a partial parse still yields the recovered symbols with a nil error; only a total parse failure (no tree) returns the error.
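This behaviour lines up with go/parser itself, which returns a partial AST alongside the error when the source contains syntax errors. The sketch below shows that directly. recoveredFuncs is a hypothetical helper that returns both the recovered names and the parser's error; how ExtractGo maps that error to its own return value is as described above, not shown here.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// recoveredFuncs parses src and returns whatever function names the
// parser could recover, together with the parse error (if any).
func recoveredFuncs(src string) ([]string, error) {
	fset := token.NewFileSet()
	// AllErrors keeps the parser going instead of bailing out after
	// the first few errors.
	file, err := parser.ParseFile(fset, "src.go", src, parser.AllErrors)
	if file == nil {
		return nil, err // no tree at all
	}
	var names []string
	for _, decl := range file.Decls {
		if fn, ok := decl.(*ast.FuncDecl); ok && fn.Name != nil {
			names = append(names, fn.Name.Name)
		}
	}
	return names, err
}

func main() {
	// The second declaration has an unclosed parameter list.
	src := "package p\n\nfunc Good() {}\n\nfunc Broken( {\n"
	names, err := recoveredFuncs(src)
	fmt.Println("recovered:", names)
	fmt.Println("had syntax errors:", err != nil)
}
```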
