codegraph

package
v1.8.2 Latest
This package is not in the latest version of its module.
Published: Apr 4, 2026 License: MIT Imports: 18 Imported by: 0

Documentation

Constants

const DefaultNumHashes = 128

DefaultNumHashes is the number of hash functions used for MinHash signatures. 128 provides good accuracy with sub-10ms query time for 50k symbols.

Variables

This section is empty.

Functions

func DetectLanguage

func DetectLanguage(filename string) string

DetectLanguage returns the language for a file based on its extension. Returns empty string if the language is not recognized.
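As a rough illustration of extension-based detection, the sketch below maps a lowercased file extension to a language name and returns an empty string for anything unknown. The `extLanguages` table, the language strings, and the `detectLanguage` helper are assumptions made for this example; the package's actual table is not shown in this documentation.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// extLanguages is a hypothetical extension table; the package's real table may differ.
var extLanguages = map[string]string{
	".go": "go",
	".py": "python",
	".js": "javascript",
	".ts": "typescript",
	".rs": "rust",
}

// detectLanguage follows the documented contract: unknown extensions yield "".
func detectLanguage(filename string) string {
	ext := strings.ToLower(filepath.Ext(filename))
	return extLanguages[ext] // a missing key returns the zero value ""
}

func main() {
	fmt.Printf("%q\n", detectLanguage("cmd/main.go"))
	fmt.Printf("%q\n", detectLanguage("README"))
}
```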

func DetectProjectLanguage

func DetectProjectLanguage(dir string) string

DetectProjectLanguage determines the primary language of a project by checking for manifest files in the given directory.
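One plausible way to implement manifest-based detection is to check a directory listing against a prioritized list of well-known manifest files. The manifest names, their order, and the `primaryLanguage` helper below are assumptions for illustration only; the documentation does not say which manifests the package checks.

```go
package main

import (
	"fmt"
	"os"
)

// manifests lists hypothetical manifest files in priority order.
var manifests = []struct {
	File     string
	Language string
}{
	{"go.mod", "go"},
	{"Cargo.toml", "rust"},
	{"package.json", "javascript"},
	{"pyproject.toml", "python"},
}

// primaryLanguage returns the language of the first manifest present in names,
// or "" when none is present.
func primaryLanguage(names []string) string {
	present := make(map[string]bool, len(names))
	for _, n := range names {
		present[n] = true
	}
	for _, m := range manifests {
		if present[m.File] {
			return m.Language
		}
	}
	return ""
}

func main() {
	entries, err := os.ReadDir(".")
	if err != nil {
		fmt.Println("read dir:", err)
		return
	}
	names := make([]string, 0, len(entries))
	for _, e := range entries {
		names = append(names, e.Name())
	}
	fmt.Printf("%q\n", primaryLanguage(names))
}
```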

func IsIndexableFile

func IsIndexableFile(filename string) bool

IsIndexableFile returns true if the file's language has parser support.

func JaccardSimilarity

func JaccardSimilarity(a, b MinHashSignature) float64

JaccardSimilarity estimates the Jaccard similarity between two MinHash signatures. Returns a value between 0.0 (completely different) and 1.0 (identical sets).
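The standard MinHash estimator is the fraction of signature positions at which two signatures hold the same value. The sketch below shows that estimator on plain `[]uint64` slices; the `estimateJaccard` name and its handling of empty or mismatched signatures are assumptions, not the package's confirmed behavior.

```go
package main

import "fmt"

// estimateJaccard returns the fraction of positions at which two signatures agree.
func estimateJaccard(a, b []uint64) float64 {
	if len(a) == 0 || len(a) != len(b) {
		return 0 // assumption: empty or mismatched signatures count as dissimilar
	}
	matches := 0
	for i := range a {
		if a[i] == b[i] {
			matches++
		}
	}
	return float64(matches) / float64(len(a))
}

func main() {
	a := []uint64{1, 2, 3, 4}
	b := []uint64{1, 2, 9, 4}
	fmt.Println(estimateJaccard(a, b)) // 3 of 4 positions match: prints 0.75
}
```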

func ShinglesForSymbol

func ShinglesForSymbol(sym Symbol, source []byte) []string

ShinglesForSymbol generates enriched shingles for a symbol, used as input to MinHash for semantic similarity search. Each shingle is a lowercased token derived from the symbol's name, types, body references, package, and comments.
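To make the idea of lowercased token shingles concrete, here is a minimal sketch that splits several text fragments on non-alphanumeric characters, lowercases the tokens, and removes duplicates. The `shingles` helper and its tokenization rule are assumptions; the real function works on a `Symbol` and the file source, and may extract tokens differently.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// shingles lowercases and deduplicates the word-like tokens found in the given texts.
func shingles(texts ...string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, text := range texts {
		// Split on anything that is neither a letter nor a digit.
		tokens := strings.FieldsFunc(text, func(r rune) bool {
			return !unicode.IsLetter(r) && !unicode.IsDigit(r)
		})
		for _, tok := range tokens {
			tok = strings.ToLower(tok)
			if !seen[tok] {
				seen[tok] = true
				out = append(out, tok)
			}
		}
	}
	return out
}

func main() {
	// Name, signature, and package of a symbol, passed as plain strings.
	fmt.Println(shingles("NewStore", "func NewStore(dbPath string) (*Store, error)", "codegraph"))
}
```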

func ShouldSkipPath

func ShouldSkipPath(path string) bool

ShouldSkipPath returns true if the path should be excluded from indexing.
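A path filter of this kind is often implemented by checking each path component against a set of excluded directory names. The directory names and the `shouldSkip` helper below are hypothetical; the documentation does not list which paths the package excludes.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// skipDirs is a hypothetical exclusion set.
var skipDirs = map[string]bool{".git": true, "node_modules": true, "vendor": true}

// shouldSkip reports whether any component of path is an excluded directory name.
func shouldSkip(path string) bool {
	for _, part := range strings.Split(filepath.ToSlash(path), "/") {
		if skipDirs[part] {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(shouldSkip("vendor/lib/util.go"))
	fmt.Println(shouldSkip("internal/store/store.go"))
}
```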

Types

type Edge

type Edge struct {
	SourceID int64
	TargetID int64
	Kind     EdgeKind
}

Edge represents a relationship between two symbols.

type EdgeKind

type EdgeKind string

EdgeKind identifies the kind of relationship between symbols.

const (
	EdgeCalls      EdgeKind = "calls"
	EdgeImports    EdgeKind = "imports"
	EdgeImplements EdgeKind = "implements"
	EdgeEmbeds     EdgeKind = "embeds"
	EdgeReferences EdgeKind = "references"
)

type FileRecord

type FileRecord struct {
	Path        string
	Language    string
	Size        int64
	ContentHash string
	IndexedAt   int64
}

FileRecord tracks indexed files for incremental updates.

type GenericParser

type GenericParser struct {
	// contains filtered or unexported fields
}

GenericParser extracts symbols from non-Go source files using regex patterns. It covers Python, JavaScript, TypeScript, and Rust. It does not build a call graph (that would need tree-sitter and cgo) and focuses on declarations: functions, classes, and imports.
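The following sketch shows what regex-based declaration extraction can look like for Python. The two patterns and the `declNames` helper are assumptions made for this example and are not the package's actual patterns.

```go
package main

import (
	"fmt"
	"regexp"
)

// Hypothetical patterns for Python declarations; (?m) makes ^ match at each line start.
var (
	pyFunc  = regexp.MustCompile(`(?m)^[ \t]*def[ \t]+(\w+)[ \t]*\(`)
	pyClass = regexp.MustCompile(`(?m)^[ \t]*class[ \t]+(\w+)`)
)

// declNames returns the first capture group of every match of re in src.
func declNames(re *regexp.Regexp, src string) []string {
	var names []string
	for _, m := range re.FindAllStringSubmatch(src, -1) {
		names = append(names, m[1])
	}
	return names
}

func main() {
	src := "class Greeter:\n    def hello(self):\n        pass\n\ndef main():\n    pass\n"
	fmt.Println(declNames(pyClass, src), declNames(pyFunc, src))
}
```

Line-oriented patterns like these find declarations but cannot tell which function calls which, which is consistent with the parser's stated lack of a call graph.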

func NewGenericParser

func NewGenericParser(language string) *GenericParser

NewGenericParser creates a parser for the given language.

func (*GenericParser) ParseFile

func (p *GenericParser) ParseFile(path string) (*ParseResult, error)

ParseFile parses a source file and extracts symbols using regex.

type GoParser

type GoParser struct{}

GoParser extracts symbols and edges from Go source files using go/ast.

func NewGoParser

func NewGoParser() *GoParser

NewGoParser creates a new Go AST parser.

func (*GoParser) ParseFile

func (p *GoParser) ParseFile(path string) (*ParseResult, error)

ParseFile parses a single Go source file and extracts symbols and edges.
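As a hedged sketch of AST-based extraction, the code below uses the standard `go/parser` and `go/ast` packages to collect direct calls between top-level functions. The `call` type and `callEdges` helper are hypothetical and only handle calls to plain identifiers; the package's own extraction is likely more thorough.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// call is a hypothetical unresolved "calls" edge between two function names.
type call struct {
	From string
	To   string
}

// callEdges parses Go source text and records calls made to plain identifiers
// from within each top-level function or method body.
func callEdges(src string) ([]call, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "example.go", src, 0)
	if err != nil {
		return nil, err
	}
	var edges []call
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			if ce, ok := n.(*ast.CallExpr); ok {
				if id, ok := ce.Fun.(*ast.Ident); ok {
					edges = append(edges, call{From: fn.Name.Name, To: id.Name})
				}
			}
			return true
		})
	}
	return edges, nil
}

func main() {
	edges, err := callEdges("package demo\n\nfunc A() { B() }\n\nfunc B() {}\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(edges)
}
```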

type Indexer

type Indexer struct {
	// contains filtered or unexported fields
}

Indexer manages the code graph lifecycle: build, update, and query.

func NewIndexer

func NewIndexer(workspace, dbPath string) (*Indexer, error)

NewIndexer creates an indexer for the given workspace, using the specified SQLite database path.

func (*Indexer) Build

func (idx *Indexer) Build() error

Build performs a full index of the workspace: it walks the file tree, parses source files, extracts symbols and edges, computes MinHash signatures, and stores everything in SQLite.

func (*Indexer) Close

func (idx *Indexer) Close() error

Close releases the underlying database connection.

func (*Indexer) KeywordSearch

func (idx *Indexer) KeywordSearch(query string, limit int) ([]Symbol, error)

KeywordSearch finds symbols matching a keyword query using SQL LIKE.

func (*Indexer) ProjectSummary

func (idx *Indexer) ProjectSummary() string

ProjectSummary returns a brief summary suitable for the system prompt.

func (*Indexer) SemanticSearch

func (idx *Indexer) SemanticSearch(query string, topK int) ([]SearchResult, error)

SemanticSearch finds symbols semantically similar to the query string. The query is split into shingles, MinHashed, then compared against all symbol signatures using brute-force Jaccard similarity.
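The brute-force comparison step can be sketched as scoring every stored signature against the query signature, sorting by score, and keeping the best `topK`. The `entry`, `scored`, `matchFraction`, and `topK` names below are hypothetical stand-ins that work on plain slices rather than the package's types.

```go
package main

import (
	"fmt"
	"sort"
)

type entry struct {
	SymbolID  int64
	Signature []uint64
}

type scored struct {
	SymbolID   int64
	Similarity float64
}

// matchFraction estimates Jaccard similarity as the share of equal positions.
func matchFraction(a, b []uint64) float64 {
	if len(a) == 0 || len(a) != len(b) {
		return 0
	}
	n := 0
	for i := range a {
		if a[i] == b[i] {
			n++
		}
	}
	return float64(n) / float64(len(a))
}

// topK scores every entry against the query and keeps the k most similar.
func topK(query []uint64, entries []entry, k int) []scored {
	results := make([]scored, 0, len(entries))
	for _, e := range entries {
		results = append(results, scored{e.SymbolID, matchFraction(query, e.Signature)})
	}
	sort.SliceStable(results, func(i, j int) bool {
		return results[i].Similarity > results[j].Similarity
	})
	if k < 0 {
		k = 0
	}
	if k < len(results) {
		results = results[:k]
	}
	return results
}

func main() {
	query := []uint64{1, 2, 3, 4}
	entries := []entry{
		{SymbolID: 10, Signature: []uint64{9, 9, 9, 4}},
		{SymbolID: 11, Signature: []uint64{1, 2, 3, 4}},
		{SymbolID: 12, Signature: []uint64{1, 2, 0, 0}},
	}
	for _, r := range topK(query, entries, 2) {
		fmt.Println(r.SymbolID, r.Similarity)
	}
}
```

Each query costs time proportional to the number of symbols times the signature length, which is why the signature length matters for query latency.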

func (*Indexer) Stats

func (idx *Indexer) Stats() (*StoreStats, error)

Stats returns aggregate stats for the indexed codebase.

func (*Indexer) Store

func (idx *Indexer) Store() *Store

Store returns the underlying store for direct queries (used by tools).

func (*Indexer) Update

func (idx *Indexer) Update() error

Update performs an incremental update: it re-indexes only files whose content hash has changed since the last index, and removes symbols for deleted files.
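The change-detection logic can be sketched as a comparison between stored content hashes and hashes of the current file contents. SHA-256 and the `contentHash` and `plan` helpers are assumptions for this example; the documentation does not state which hash function the package uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// contentHash returns a hex SHA-256 digest (assumption: the real hash may differ).
func contentHash(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// plan compares previously indexed hashes with current file contents and returns
// the paths to re-index (new or changed) and the paths to remove (deleted).
func plan(indexed map[string]string, current map[string][]byte) (reindex, removed []string) {
	for path, data := range current {
		if indexed[path] != contentHash(data) {
			reindex = append(reindex, path)
		}
	}
	for path := range indexed {
		if _, ok := current[path]; !ok {
			removed = append(removed, path)
		}
	}
	sort.Strings(reindex)
	sort.Strings(removed)
	return reindex, removed
}

func main() {
	indexed := map[string]string{
		"a.go": contentHash([]byte("package a")),
		"b.go": contentHash([]byte("package b // old")),
		"c.go": contentHash([]byte("package c")),
	}
	current := map[string][]byte{
		"a.go": []byte("package a"),
		"b.go": []byte("package b // new"),
		"d.go": []byte("package d"),
	}
	reindex, removed := plan(indexed, current)
	fmt.Println(reindex, removed)
}
```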

type MinHashEntry

type MinHashEntry struct {
	SymbolID  int64
	Signature MinHashSignature
}

MinHashEntry pairs a symbol ID with its MinHash signature for bulk queries.

type MinHashSignature

type MinHashSignature []uint64

MinHashSignature is a slice of hash values used for similarity search. Its length is fixed by the number of hash functions of the MinHasher that produced it.

type MinHasher

type MinHasher struct {
	// contains filtered or unexported fields
}

MinHasher computes MinHash signatures for sets of shingles. Uses hash/maphash with different seeds to simulate N independent hash functions.

func NewMinHasher

func NewMinHasher(numHashes int) *MinHasher

NewMinHasher creates a MinHasher with the specified number of hash functions.

func (*MinHasher) Signature

func (m *MinHasher) Signature(shingles []string) MinHashSignature

Signature computes the MinHash signature for a set of shingles. Each element of the returned slice is the minimum hash value across all shingles for that hash function.
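A minimal sketch of the seeded approach follows, using `maphash.MakeSeed` and `maphash.String` (the latter requires Go 1.19 or newer). The `minHasher` type and its methods are hypothetical. Note that `hash/maphash` seeds are random per process, so in this sketch signatures are only comparable when they come from the same hasher instance; how the package handles seeds is not described here.

```go
package main

import (
	"fmt"
	"hash/maphash"
	"math"
)

// minHasher holds one seed per simulated hash function.
type minHasher struct {
	seeds []maphash.Seed
}

func newMinHasher(numHashes int) *minHasher {
	seeds := make([]maphash.Seed, numHashes)
	for i := range seeds {
		seeds[i] = maphash.MakeSeed()
	}
	return &minHasher{seeds: seeds}
}

// signature stores, for each seed, the minimum hash value over all shingles.
func (m *minHasher) signature(shingles []string) []uint64 {
	sig := make([]uint64, len(m.seeds))
	for i, seed := range m.seeds {
		lowest := uint64(math.MaxUint64)
		for _, s := range shingles {
			if h := maphash.String(seed, s); h < lowest {
				lowest = h
			}
		}
		sig[i] = lowest
	}
	return sig
}

func main() {
	m := newMinHasher(128)
	a := m.signature([]string{"store", "open", "sqlite"})
	b := m.signature([]string{"sqlite", "store", "open"})
	same := true
	for i := range a {
		if a[i] != b[i] {
			same = false
		}
	}
	// The minimum is order-independent, so both signatures are equal.
	fmt.Println(len(a), same)
}
```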

type ParseResult

type ParseResult struct {
	Symbols []Symbol
	Edges   []RawEdge
	Source  []byte // raw file content for shingle generation
}

ParseResult holds the symbols and edges extracted from a single file.

type RawEdge

type RawEdge struct {
	SourceName string
	TargetName string
	Kind       EdgeKind
}

RawEdge is an unresolved edge that uses symbol names instead of IDs. Resolved to Edge (with IDs) when inserted into the store.
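The resolution step can be pictured as a lookup from symbol names to IDs. The sketch below uses local stand-in types and a plain map; dropping edges whose endpoints are unknown is an assumption, since the documentation does not say how unresolved names are handled.

```go
package main

import "fmt"

// rawEdge and edge are local stand-ins for the documented RawEdge and Edge types.
type rawEdge struct {
	SourceName string
	TargetName string
	Kind       string
}

type edge struct {
	SourceID int64
	TargetID int64
	Kind     string
}

// resolve converts name-based edges to ID-based edges using the ids lookup table.
func resolve(raw []rawEdge, ids map[string]int64) []edge {
	var out []edge
	for _, r := range raw {
		src, okSrc := ids[r.SourceName]
		dst, okDst := ids[r.TargetName]
		if !okSrc || !okDst {
			continue // assumption: edges with an unknown endpoint are dropped
		}
		out = append(out, edge{SourceID: src, TargetID: dst, Kind: r.Kind})
	}
	return out
}

func main() {
	ids := map[string]int64{"main": 1, "run": 2}
	raw := []rawEdge{
		{SourceName: "main", TargetName: "run", Kind: "calls"},
		{SourceName: "run", TargetName: "missing", Kind: "calls"},
	}
	fmt.Println(resolve(raw, ids))
}
```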

type SearchResult

type SearchResult struct {
	Symbol     Symbol
	Similarity float64
}

SearchResult pairs a symbol with its similarity score.

type Store

type Store struct {
	// contains filtered or unexported fields
}

Store manages the SQLite database for the code graph.

func NewStore

func NewStore(dbPath string) (*Store, error)

NewStore opens (or creates) a SQLite database at the given path and initializes the schema.

func (*Store) AddEdge

func (s *Store) AddEdge(sourceID, targetID int64, kind EdgeKind) error

AddEdge records a directional relationship between two symbols.

func (*Store) Close

func (s *Store) Close() error

Close closes the underlying database connection.

func (*Store) DeleteFile

func (s *Store) DeleteFile(path string) error

DeleteFile removes a file record.

func (*Store) DeleteFileSymbols

func (s *Store) DeleteFileSymbols(file string) error

DeleteFileSymbols removes all symbols (and their edges) for a file.

func (*Store) GetAllFiles

func (s *Store) GetAllFiles() ([]FileRecord, error)

GetAllFiles returns all indexed file records.

func (*Store) GetAllMinHashes

func (s *Store) GetAllMinHashes() ([]MinHashEntry, error)

GetAllMinHashes retrieves all symbol IDs and their MinHash signatures for similarity search. Symbols without a signature are skipped.

func (*Store) GetEdgesFrom

func (s *Store) GetEdgesFrom(sourceID int64) ([]Edge, error)

GetEdgesFrom returns all outgoing edges from the given symbol.

func (*Store) GetEdgesTo

func (s *Store) GetEdgesTo(targetID int64) ([]Edge, error)

GetEdgesTo returns all incoming edges to the given symbol.

func (*Store) GetFile

func (s *Store) GetFile(path string) (*FileRecord, error)

GetFile retrieves a file record by path.

func (*Store) GetMinHash

func (s *Store) GetMinHash(symbolID int64) (MinHashSignature, error)

GetMinHash retrieves the MinHash signature for a symbol.

func (*Store) GetSymbol

func (s *Store) GetSymbol(id int64) (*Symbol, error)

GetSymbol retrieves a symbol by its ID.

func (*Store) GetSymbolIDByName added in v1.8.2

func (s *Store) GetSymbolIDByName(name string) (int64, bool)

GetSymbolIDByName returns the ID of a symbol by exact name match. If multiple symbols share the same name, returns the first found.

func (*Store) GetSymbolsByFile

func (s *Store) GetSymbolsByFile(file string) ([]Symbol, error)

GetSymbolsByFile returns all symbols in the given file.

func (*Store) GetSymbolsByPackage

func (s *Store) GetSymbolsByPackage(pkg string) ([]Symbol, error)

GetSymbolsByPackage returns all symbols in the given package.

func (*Store) SearchSymbolsByName

func (s *Store) SearchSymbolsByName(query string) ([]Symbol, error)

SearchSymbolsByName returns symbols whose name contains the query (case-insensitive).

func (*Store) Stats

func (s *Store) Stats() (*StoreStats, error)

Stats returns aggregate counts for the indexed codebase.

func (*Store) UpdateMinHash

func (s *Store) UpdateMinHash(symbolID int64, sig MinHashSignature) error

UpdateMinHash stores the MinHash signature for a symbol.

func (*Store) UpsertFile

func (s *Store) UpsertFile(f FileRecord) error

UpsertFile inserts or updates a file record.

func (*Store) UpsertSymbol

func (s *Store) UpsertSymbol(sym Symbol) (int64, error)

UpsertSymbol inserts or updates a symbol. Uniqueness is determined by (name, kind, package, file). Returns the row ID.
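The uniqueness rule can be illustrated with an in-memory stand-in keyed by the same four fields. The `symbolKey` and `memStore` types below are hypothetical and only mimic the documented semantics; the real store is backed by SQLite.

```go
package main

import "fmt"

// symbolKey mirrors the documented uniqueness tuple (name, kind, package, file).
type symbolKey struct {
	Name, Kind, Package, File string
}

// memStore is a hypothetical in-memory stand-in for the SQLite-backed store.
type memStore struct {
	nextID int64
	ids    map[symbolKey]int64
	lines  map[int64]int
}

func newMemStore() *memStore {
	return &memStore{ids: make(map[symbolKey]int64), lines: make(map[int64]int)}
}

// upsert inserts a new symbol or updates the line of an existing one,
// returning the same ID for the same key.
func (s *memStore) upsert(key symbolKey, line int) int64 {
	id, ok := s.ids[key]
	if !ok {
		s.nextID++
		id = s.nextID
		s.ids[key] = id
	}
	s.lines[id] = line
	return id
}

func main() {
	s := newMemStore()
	key := symbolKey{Name: "NewStore", Kind: "function", Package: "codegraph", File: "store.go"}
	first := s.upsert(key, 10)
	second := s.upsert(key, 42) // same key: updates in place
	other := s.upsert(symbolKey{Name: "NewStore", Kind: "function", Package: "codegraph", File: "other.go"}, 5)
	fmt.Println(first, second, other, s.lines[first])
}
```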

type StoreStats

type StoreStats struct {
	TotalSymbols  int
	TotalEdges    int
	TotalFiles    int
	SymbolsByKind map[SymbolKind]int
	FilesByLang   map[string]int
}

StoreStats holds aggregate counts for the indexed codebase.

type Symbol

type Symbol struct {
	ID        int64
	Name      string
	Kind      SymbolKind
	Package   string
	File      string
	Line      int
	Signature string
}

Symbol represents a code entity (function, type, interface, etc.).

type SymbolKind

type SymbolKind string

SymbolKind identifies the kind of code symbol.

const (
	SymbolFunction  SymbolKind = "function"
	SymbolMethod    SymbolKind = "method"
	SymbolType      SymbolKind = "type"
	SymbolInterface SymbolKind = "interface"
	SymbolConst     SymbolKind = "const"
	SymbolVar       SymbolKind = "var"
	SymbolStruct    SymbolKind = "struct"
	SymbolImport    SymbolKind = "import"
	SymbolClass     SymbolKind = "class"
)
