parser

package
v0.4.2
Warning

This package is not in the latest version of its module.

Published: May 14, 2026 | License: MIT | Imports: 11 | Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func ChildFieldText

func ChildFieldText(n *Node, field, source string) string

ChildFieldText returns the source text of the named field of n, or "" if n has no such field. It is a convenience wrapper around ChildByFieldName followed by node text extraction. The caller passes the source as a string (not bytes) because most extractors already hold their content as a string.

func NodeText

func NodeText(n *sitter.Node, source []byte) string

NodeText returns the source text for a tree-sitter node.

func NodeTextFromString

func NodeTextFromString(n *Node, source string) string

NodeTextFromString is the string-source equivalent of NodeText. Returns "" if n is nil or its byte range is outside source.
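The bounds behaviour described above can be illustrated without tree-sitter. The following is a minimal sketch, not the package's implementation: `span` and `textInRange` are hypothetical stand-ins for a node's byte range and for the text lookup, and the exact checks the real function performs are an assumption.

```go
package main

import "fmt"

// span is a hypothetical stand-in for a node's byte range. The real
// package reads the range from a tree-sitter node instead.
type span struct{ start, end uint32 }

// textInRange returns source[start:end], or "" when s is nil or the
// range does not fit inside source. This mirrors the documented
// contract of NodeTextFromString; the precise checks are assumed.
func textInRange(s *span, source string) string {
	if s == nil || s.start > s.end || uint64(s.end) > uint64(len(source)) {
		return ""
	}
	return source[s.start:s.end]
}

func main() {
	src := "class Foo {}"
	fmt.Printf("%q\n", textInRange(&span{6, 9}, src))  // "Foo"
	fmt.Printf("%q\n", textInRange(&span{6, 99}, src)) // ""
	fmt.Printf("%q\n", textInRange(nil, src))          // ""
}
```

Returning "" instead of panicking lets an extractor treat a stale or mismatched source string as "no text" rather than crashing mid-scan.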

func Walk

func Walk(n *Node, visit func(*Node) bool)

Walk does a pre-order DFS over n (inclusive). The visitor returns true to recurse into the current node's children, false to skip them. Walking stops when the visitor returns false at the root or when all descendants have been visited. nil-safe.

The implementation uses tree-sitter's TreeCursor for iterative traversal. Compared to the previous recursive `n.Child(i)` form, the cursor avoids one Go-level recursion frame per descent and matches the canonical tree-sitter walking idiom. Note that each visited node still flows through smacker's per-Tree node cache (which allocates one *Node the first time a node is visited), so the allocation count is roughly the same as in the recursive form. The win is in stack discipline and code clarity, not GC pressure.
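To make the visitor contract concrete, here is a small sketch of a pre-order walk in which returning false skips a node's children. It is an illustration under stated assumptions: `tree`, `walk`, and `visitedKinds` are hypothetical names, and the sketch uses an explicit slice-backed stack rather than a tree-sitter TreeCursor.

```go
package main

import "fmt"

// tree is a hypothetical stand-in for a parse-tree node.
type tree struct {
	kind     string
	children []*tree
}

// walk visits n and its descendants in pre-order without recursion.
// The visitor returns true to descend into the current node's
// children and false to skip them. A nil root is a no-op.
func walk(n *tree, visit func(*tree) bool) {
	if n == nil || visit == nil {
		return
	}
	stack := []*tree{n}
	for len(stack) > 0 {
		cur := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		if cur == nil || !visit(cur) {
			continue
		}
		// Push children in reverse so the leftmost child is popped first.
		for i := len(cur.children) - 1; i >= 0; i-- {
			stack = append(stack, cur.children[i])
		}
	}
}

// visitedKinds records every visited node kind, skipping the
// children of any node whose kind equals skip.
func visitedKinds(root *tree, skip string) []string {
	var out []string
	walk(root, func(n *tree) bool {
		out = append(out, n.kind)
		return n.kind != skip
	})
	return out
}

func main() {
	root := &tree{"program", []*tree{
		{"import", []*tree{{"identifier", nil}}},
		{"class", []*tree{{"method", nil}}},
	}}
	fmt.Println(visitedKinds(root, "import")) // [program import class method]
}
```

The "import" node itself is still visited; only its "identifier" child is skipped, which is the pruning behaviour a detector would rely on to ignore uninteresting subtrees.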

Types

type Language

type Language int

Language identifies a supported source language. Phase 1 supports only Java and Python; the rest land in phase 2 / phase 4.

const (
	LanguageUnknown Language = iota
	LanguageJava
	LanguagePython
	LanguageTypeScript
	LanguageGo
	// Structured / textual languages added in phase 4 (batch 1 / 2). No
	// tree-sitter grammar — the analyzer parses these via the structured
	// parser in internal/parser/structured.go.
	LanguageYaml
	LanguageJSON
	LanguageTOML
	LanguageINI
	LanguageProperties
	LanguageSQL
	LanguageBatch
	LanguageVue
	LanguageSvelte
	// Additional languages discovered through file extension but parsed via
	// regex/structured paths (no tree-sitter grammar wired in).
	LanguageCSharp
	LanguageKotlin
	LanguageScala
	LanguageCpp
	LanguageRust
	LanguageTerraform
	LanguageBicep
	LanguageProto
	LanguageDockerfile
	LanguageXML
	LanguageMarkdown
	LanguagePowerShell
	LanguageBash
	LanguageRuby
	LanguageGroovy
)

func LanguageFromExtension

func LanguageFromExtension(ext string) Language

LanguageFromExtension maps a file extension (including leading dot, e.g. ".java") to a Language. Returns LanguageUnknown for anything unsupported.
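A lookup of this kind can be sketched as a map keyed by extension. The example below is self-contained and purely illustrative: it redeclares a cut-down `Language` type, and only the ".java" mapping comes from the documentation above. The ".py" entry, the helper name `languageFromExtension`, and the exact-match (case-sensitive) behaviour are assumptions.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Language is a cut-down copy of the enum, for illustration only.
type Language int

const (
	LanguageUnknown Language = iota
	LanguageJava
	LanguagePython
)

// extToLang is an illustrative table. Only ".java" is taken from the
// docs; ".py" is an assumed mapping.
var extToLang = map[string]Language{
	".java": LanguageJava,
	".py":   LanguagePython,
}

// languageFromExtension looks up ext, which must include the leading
// dot. Missing keys yield the zero value, LanguageUnknown.
func languageFromExtension(ext string) Language {
	return extToLang[ext]
}

func main() {
	// filepath.Ext keeps the leading dot, matching the expected input.
	ext := filepath.Ext("src/Main.java")
	fmt.Println(ext, languageFromExtension(ext) == LanguageJava)
	fmt.Println(languageFromExtension(".xyz") == LanguageUnknown)
}
```

Because LanguageUnknown is the zero value of the enum, a plain map lookup needs no explicit "not found" branch.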

func (Language) String

func (l Language) String() string

type Node

type Node = sitter.Node

Node is a tree-sitter parse-tree node. Re-exported as a type alias so callers can write `parser.Node` without an extra import of the tree-sitter SDK. The underlying type is `sitter.Node`, so all its methods (Type, ChildByFieldName, StartPoint, ...) are available.

type ParsedEnvelope

type ParsedEnvelope = map[string]any

ParsedEnvelope wraps a structured parse result in the same envelope shape the Java side uses (a Map<String, Object> with the key "type" and either "data" or "documents"). It is a type alias for clarity; detectors consume it as a plain map[string]any (see detector.Context.ParsedData).

func ParseStructured

func ParseStructured(lang Language, source []byte) (ParsedEnvelope, error)

ParseStructured dispatches to the structured parser that matches lang. It returns a nil envelope for languages this parser does not handle. Errors are reserved for true parse failures: an empty or non-applicable input yields an empty envelope such as ({"type":"yaml","data":{}}, nil) rather than a nil envelope or an error.
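A consumer of the envelope therefore has three cases to distinguish: a nil envelope, a "data" payload, and a "documents" payload. The sketch below shows one way to branch on them. It redeclares the alias so it runs on its own; `describe` is a hypothetical helper, and treating "documents" as a `[]any` is an assumption about the payload type.

```go
package main

import "fmt"

// ParsedEnvelope is redeclared here so the sketch is self-contained.
type ParsedEnvelope = map[string]any

// describe reports which payload key an envelope carries.
// The []any type for "documents" is an assumption.
func describe(env ParsedEnvelope) string {
	if env == nil {
		return "not handled"
	}
	kind, _ := env["type"].(string)
	if _, ok := env["data"]; ok {
		return kind + ": single document"
	}
	if docs, ok := env["documents"].([]any); ok {
		return fmt.Sprintf("%s: %d documents", kind, len(docs))
	}
	return kind + ": no payload"
}

func main() {
	single := ParsedEnvelope{"type": "yaml", "data": map[string]any{}}
	multi := ParsedEnvelope{"type": "yaml", "documents": []any{
		map[string]any{"a": 1},
		map[string]any{"b": 2},
	}}
	fmt.Println(describe(single)) // yaml: single document
	fmt.Println(describe(multi))  // yaml: 2 documents
	fmt.Println(describe(nil))    // not handled
}
```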

type Tree

type Tree struct {
	Lang   Language
	Source []byte
	Root   *sitter.Tree
}

Tree wraps a parsed *sitter.Tree along with the source bytes so detectors can pull node text via tree-sitter's byte-range API.

func Parse

func Parse(lang Language, source []byte) (*Tree, error)

Parse parses the source bytes in the given language. The caller must Close the returned Tree. Parse returns (nil, nil) for structured / textual languages without a tree-sitter grammar (yaml/json/toml/ini/properties/sql/batch/vue/svelte); those are handled by the structured / regex paths, not tree-sitter. It returns an error for LanguageUnknown (truly unsupported).

func ParseByName

func ParseByName(lang string, source []byte) (*Tree, error)

ParseByName routes a string language key ("java", "python", "typescript", "go") to the typed Parse(Language, ...) call. Returns (nil, error) for unknown keys. The string-keyed entry point exists for the intelligence extractors, which receive their language as a string off DetectLanguage.

func (*Tree) Close

func (t *Tree) Close()

Close releases the tree-sitter parse tree.
