extract

package
v0.3.0 Latest
Warning: This package is not in the latest version of its module.
Published: May 9, 2026 License: MIT Imports: 8 Imported by: 0

Documentation

Overview

Package extract provides Go AST extraction for building knowledge graphs. This file provides backward compatibility with the legacy Extractor API. New code should use MultiExtractor with the Registry pattern.

Package extract provides multi-language code extraction for building knowledge graphs.

This package provides backward compatibility by re-exporting types from the provider package. New code should import github.com/plexusone/graphize/provider directly for the public interface.

This file provides backward compatibility with the legacy Registry API. New code should use the provider package directly.

Constants

This section is empty.

Variables

var DefaultRegistry = NewRegistry()

DefaultRegistry is the global registry instance. Deprecated: Use provider.Register, provider.Get, etc. directly.

var SemanticEdgeTypes = []string{
	"inferred_depends",
	"rationale_for",
	"similar_to",
	"implements_pattern",
	"shared_concern",
}

SemanticEdgeTypes are the valid edge types for LLM-inferred relationships.

Functions

func BuildSubagentPrompt

func BuildSubagentPrompt(files []string, chunkID, totalChunks int, baseDir string) string

BuildSubagentPrompt creates the prompt for a semantic extraction subagent.

func ChunkFiles

func ChunkFiles(files []string, chunkSize int) [][]string

ChunkFiles splits a list of files into chunks of the specified size.
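
The chunking behavior described above can be illustrated with a small stand-alone sketch. The function chunkFiles below is a hypothetical stand-in that mirrors the documented signature; it is not the package's implementation, and how the real ChunkFiles treats an empty list or a non-positive chunk size is an assumption here.

```go
package main

import "fmt"

// chunkFiles is a hypothetical stand-in for extract.ChunkFiles.
// Assumption: a non-positive chunkSize or an empty input yields nil.
func chunkFiles(files []string, chunkSize int) [][]string {
	if chunkSize <= 0 || len(files) == 0 {
		return nil
	}
	// Round up so a trailing partial chunk gets its own slot.
	chunks := make([][]string, 0, (len(files)+chunkSize-1)/chunkSize)
	for start := 0; start < len(files); start += chunkSize {
		end := start + chunkSize
		if end > len(files) {
			end = len(files)
		}
		chunks = append(chunks, files[start:end])
	}
	return chunks
}

func main() {
	files := []string{"a.go", "b.go", "c.go", "d.go", "e.go"}
	chunks := chunkFiles(files, 2)
	for i, chunk := range chunks {
		fmt.Printf("chunk %d of %d: %v\n", i+1, len(chunks), chunk)
	}
}
```

With five files and a chunk size of two, the sketch produces three chunks, the last holding a single file. The chunk index and count printed here correspond to the chunkID and totalChunks parameters that BuildSubagentPrompt accepts.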

func IsValidSemanticEdgeType

func IsValidSemanticEdgeType(edgeType string) bool

IsValidSemanticEdgeType checks if an edge type is valid for semantic edges.
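
A membership check against SemanticEdgeTypes might look like the sketch below. The names semanticEdgeTypes and isValidSemanticEdgeType are local stand-ins, and the exact, case-sensitive comparison is an assumption; the package's own matching rules are not documented here.

```go
package main

import "fmt"

// semanticEdgeTypes copies the values listed under SemanticEdgeTypes.
var semanticEdgeTypes = []string{
	"inferred_depends",
	"rationale_for",
	"similar_to",
	"implements_pattern",
	"shared_concern",
}

// isValidSemanticEdgeType is a hypothetical stand-in for
// extract.IsValidSemanticEdgeType. Assumption: exact string match.
func isValidSemanticEdgeType(edgeType string) bool {
	for _, valid := range semanticEdgeTypes {
		if valid == edgeType {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isValidSemanticEdgeType("similar_to")) // listed type
	fmt.Println(isValidSemanticEdgeType("calls"))      // not a semantic edge type
}
```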

func MergeExtractions

func MergeExtractions(astNodes []*graph.Node, astEdges []*graph.Edge, semantic *SemanticExtraction) ([]*graph.Node, []*graph.Edge)

MergeExtractions combines AST extraction with semantic extraction. AST edges take precedence (they have EXTRACTED confidence). Semantic edges are added if they don't duplicate AST edges.
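
The precedence rule above can be sketched with simplified types. Everything in this example is an assumption made for illustration: the edge struct stands in for graph.Edge, whose fields are not shown on this page, and treating two edges as duplicates when they share source, target, and type is one plausible definition, not necessarily the one MergeExtractions uses. The "INFERRED" confidence label is likewise hypothetical.

```go
package main

import "fmt"

// edge is a hypothetical, simplified stand-in for graph.Edge.
type edge struct {
	From, To, Type, Confidence string
}

// edgeKey is the assumed identity of an edge for duplicate detection.
type edgeKey struct {
	from, to, typ string
}

// mergeEdges keeps every AST edge and appends only those semantic edges
// whose key has not been seen yet, so AST edges take precedence.
func mergeEdges(astEdges, semanticEdges []edge) []edge {
	seen := make(map[edgeKey]bool, len(astEdges))
	merged := make([]edge, 0, len(astEdges)+len(semanticEdges))
	for _, e := range astEdges {
		seen[edgeKey{e.From, e.To, e.Type}] = true
		merged = append(merged, e)
	}
	for _, e := range semanticEdges {
		k := edgeKey{e.From, e.To, e.Type}
		if seen[k] {
			continue // an AST edge (or earlier semantic edge) already covers this
		}
		seen[k] = true
		merged = append(merged, e)
	}
	return merged
}

func main() {
	ast := []edge{{"a", "b", "calls", "EXTRACTED"}}
	semantic := []edge{
		{"a", "b", "calls", "INFERRED"},      // duplicates the AST edge, dropped
		{"a", "c", "similar_to", "INFERRED"}, // new, kept
	}
	for _, e := range mergeEdges(ast, semantic) {
		fmt.Printf("%s -> %s (%s, %s)\n", e.From, e.To, e.Type, e.Confidence)
	}
}
```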

func NodeIDPrefix

func NodeIDPrefix(language string) string

NodeIDPrefix returns the standard prefix for node IDs for a given language. Deprecated: Use provider.NodeIDPrefix directly.

func ValidateSemanticExtraction

func ValidateSemanticExtraction(ext *SemanticExtraction) error

ValidateSemanticExtraction checks that the extraction is well-formed.

Types

type ExtractStats

type ExtractStats = provider.ExtractStats

ExtractStats is an alias for provider.ExtractStats. Deprecated: Use provider.ExtractStats directly.

func NewExtractStats

func NewExtractStats() *ExtractStats

NewExtractStats creates a new ExtractStats instance. Deprecated: Use provider.NewExtractStats directly.

type Extractor deprecated

type Extractor struct {
	// contains filtered or unexported fields
}

Extractor provides backward compatibility with the original Go-only extractor API. It wraps MultiExtractor using the DefaultRegistry which has Go registered.

Deprecated: Use MultiExtractor with Registry for multi-language support.

func NewExtractor deprecated

func NewExtractor() *Extractor

NewExtractor creates a new extractor using the default registry. This maintains backward compatibility while using the new architecture.

Deprecated: Use NewMultiExtractor(DefaultRegistry) instead.

func (*Extractor) ExtractDir

func (e *Extractor) ExtractDir(dir string) (*graph.Graph, error)

ExtractDir extracts nodes and edges from all supported files in a directory tree.

func (*Extractor) ExtractDirWithStats

func (e *Extractor) ExtractDirWithStats(dir string) (*graph.Graph, *ExtractStats)

ExtractDirWithStats extracts nodes and edges with cache statistics. Returns the legacy ExtractStats format for compatibility.

func (*Extractor) WithCache

func (e *Extractor) WithCache(c *cache.Cache) *Extractor

WithCache sets the cache for the extractor.

type FrameworkInfo

type FrameworkInfo = provider.FrameworkInfo

FrameworkInfo is an alias for provider.FrameworkInfo. Deprecated: Use provider.FrameworkInfo directly.

type LanguageExtractor

type LanguageExtractor = provider.LanguageExtractor

LanguageExtractor is an alias for provider.LanguageExtractor. Deprecated: Use provider.LanguageExtractor directly.

type MultiExtractor

type MultiExtractor struct {
	// contains filtered or unexported fields
}

MultiExtractor coordinates extraction across multiple language extractors. It walks directories, dispatches files to the appropriate extractor, and aggregates results into a unified graph.
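
The dispatch step described above, in which each file is routed to an extractor by its extension, can be sketched as follows. The helper groupByLanguage and the extension-to-language table are hypothetical; the real MultiExtractor also walks the directory tree and runs the extractors, which this sketch leaves out.

```go
package main

import (
	"fmt"
	"path/filepath"
	"sort"
)

// groupByLanguage is a hypothetical helper that buckets paths by the
// language registered for their file extension. Paths with no registered
// extension are skipped.
func groupByLanguage(paths []string, languageByExt map[string]string) map[string][]string {
	groups := make(map[string][]string)
	for _, p := range paths {
		lang, ok := languageByExt[filepath.Ext(p)]
		if !ok {
			continue // no extractor registered for this extension
		}
		groups[lang] = append(groups[lang], p)
	}
	return groups
}

func main() {
	// Assumed mapping for illustration only.
	languageByExt := map[string]string{
		".go": "go",
		".ts": "typescript",
		".md": "markdown",
	}
	paths := []string{"cmd/main.go", "web/app.ts", "README.md", "logo.png"}

	groups := groupByLanguage(paths, languageByExt)
	langs := make([]string, 0, len(groups))
	for lang := range groups {
		langs = append(langs, lang)
	}
	sort.Strings(langs) // map iteration order is random, so sort for stable output
	for _, lang := range langs {
		fmt.Println(lang, groups[lang])
	}
}
```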

func NewMultiExtractor

func NewMultiExtractor(registry *Registry) *MultiExtractor

NewMultiExtractor creates a new multi-language extractor using the given registry. Deprecated: Use NewMultiExtractorWithOptions for more flexibility.

func NewMultiExtractorWithOptions

func NewMultiExtractorWithOptions(opts ...MultiExtractorOption) *MultiExtractor

NewMultiExtractorWithOptions creates a new multi-language extractor with options.

func (*MultiExtractor) DetectFrameworks

func (m *MultiExtractor) DetectFrameworks(dir string) map[string][]*FrameworkInfo

DetectFrameworks scans the directory and returns detected frameworks.

func (*MultiExtractor) ExtractDir

func (m *MultiExtractor) ExtractDir(dir string) (*graph.Graph, error)

ExtractDir extracts nodes and edges from all supported files in a directory tree.

func (*MultiExtractor) ExtractDirWithStats

func (m *MultiExtractor) ExtractDirWithStats(dir string) (*graph.Graph, *ExtractStats)

ExtractDirWithStats extracts nodes and edges with cache and language statistics.

func (*MultiExtractor) WithCache

func (m *MultiExtractor) WithCache(c *cache.Cache) *MultiExtractor

WithCache sets the cache for the extractor.

type MultiExtractorOption

type MultiExtractorOption func(*MultiExtractor)

MultiExtractorOption configures a MultiExtractor.

func WithCustomExtractor

func WithCustomExtractor(extension string, extractor LanguageExtractor) MultiExtractorOption

WithCustomExtractor adds a custom extractor for an extension via direct injection. This bypasses the global registry and allows per-instance customization.
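
MultiExtractorOption follows Go's functional options pattern. The sketch below reproduces that pattern with local stand-in types to show why injected extractors stay per-instance. The languageExtractor interface, its Language method, the custom field, and protoExtractor are all assumptions; the real provider.LanguageExtractor interface is not shown on this page.

```go
package main

import "fmt"

// languageExtractor is a hypothetical, minimal stand-in for
// provider.LanguageExtractor.
type languageExtractor interface {
	Language() string
}

// multiExtractor stands in for extract.MultiExtractor; the custom map
// is an assumed field holding per-instance extractors keyed by extension.
type multiExtractor struct {
	custom map[string]languageExtractor
}

// option mirrors the shape of MultiExtractorOption.
type option func(*multiExtractor)

// withCustomExtractor returns an option that registers ex for extension
// on one instance only, without touching any global state.
func withCustomExtractor(extension string, ex languageExtractor) option {
	return func(m *multiExtractor) {
		m.custom[extension] = ex
	}
}

// newMultiExtractorWithOptions applies each option in order.
func newMultiExtractorWithOptions(opts ...option) *multiExtractor {
	m := &multiExtractor{custom: make(map[string]languageExtractor)}
	for _, opt := range opts {
		opt(m)
	}
	return m
}

// protoExtractor is a hypothetical custom extractor.
type protoExtractor struct{}

func (protoExtractor) Language() string { return "protobuf" }

func main() {
	m := newMultiExtractorWithOptions(withCustomExtractor(".proto", protoExtractor{}))
	fmt.Println(m.custom[".proto"].Language())
}
```

Because each option closes over its arguments and mutates only the instance it is applied to, two extractors built with different options do not affect each other.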

type Registry

type Registry struct{}

Registry manages language extractors and maps file extensions to extractors. Deprecated: Use provider.Register, provider.Get, etc. directly.

func NewRegistry

func NewRegistry() *Registry

NewRegistry creates a new extractor registry. Deprecated: The provider package uses a global registry with provider.Register.

func (*Registry) CanExtract

func (r *Registry) CanExtract(path string) bool

CanExtract returns true if there is an extractor registered for the given path. Deprecated: Use provider.CanExtract instead.

func (*Registry) Extensions

func (r *Registry) Extensions() []string

Extensions returns a list of all registered file extensions. Deprecated: Use provider.Extensions instead.

func (*Registry) Get

func (r *Registry) Get(path string) LanguageExtractor

Get returns the extractor for a given file path. Deprecated: Use provider.GetByPath instead.

func (*Registry) GetByLanguage

func (r *Registry) GetByLanguage(language string) LanguageExtractor

GetByLanguage returns the extractor for a given language name. Deprecated: Use provider.GetByLanguage instead.

func (*Registry) Languages

func (r *Registry) Languages() []string

Languages returns a list of all registered language names. Deprecated: Use provider.Languages instead.

func (*Registry) Register

func (r *Registry) Register(extractor LanguageExtractor)

Register adds a language extractor to the registry. Deprecated: Use provider.Register with a factory function instead.

type SemanticEdge

type SemanticEdge struct {
	From            string  `json:"from"`
	To              string  `json:"to"`
	Type            string  `json:"type"`
	Confidence      string  `json:"confidence"`
	ConfidenceScore float64 `json:"confidence_score"`
	Reason          string  `json:"reason"`
}

SemanticEdge represents an edge discovered by LLM analysis.

type SemanticExtraction

type SemanticExtraction struct {
	Nodes []SemanticNode `json:"nodes"`
	Edges []SemanticEdge `json:"edges"`
}

SemanticExtraction represents the output from an LLM semantic extraction subagent.

func ParseSemanticJSON

func ParseSemanticJSON(data []byte) (*SemanticExtraction, error)

ParseSemanticJSON parses JSON output from an LLM subagent.
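
The JSON shape implied by the struct tags on SemanticExtraction, SemanticNode, and SemanticEdge can be exercised with encoding/json alone. In this sketch the lower-case types copy the documented fields, parseSemanticJSON is a hypothetical stand-in that does a plain unmarshal, and the sample payload (node IDs, the "INFERRED" confidence label, the score) is invented for illustration. Whether the real ParseSemanticJSON does additional cleanup of LLM output is not stated on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented struct definitions.
type semanticNode struct {
	ID    string            `json:"id"`
	Type  string            `json:"type"`
	Label string            `json:"label"`
	Attrs map[string]string `json:"attrs,omitempty"`
}

type semanticEdge struct {
	From            string  `json:"from"`
	To              string  `json:"to"`
	Type            string  `json:"type"`
	Confidence      string  `json:"confidence"`
	ConfidenceScore float64 `json:"confidence_score"`
	Reason          string  `json:"reason"`
}

type semanticExtraction struct {
	Nodes []semanticNode `json:"nodes"`
	Edges []semanticEdge `json:"edges"`
}

// parseSemanticJSON is a hypothetical stand-in for extract.ParseSemanticJSON.
func parseSemanticJSON(data []byte) (*semanticExtraction, error) {
	var ext semanticExtraction
	if err := json.Unmarshal(data, &ext); err != nil {
		return nil, fmt.Errorf("parse semantic extraction: %w", err)
	}
	return &ext, nil
}

// sample is an invented payload; all IDs and values are illustrative.
const sample = `{
  "nodes": [
    {"id": "concept_retry", "type": "concept", "label": "Retry policy"}
  ],
  "edges": [
    {"from": "go_func_fetch", "to": "concept_retry", "type": "implements_pattern",
     "confidence": "INFERRED", "confidence_score": 0.8,
     "reason": "fetch wraps calls in a backoff loop"}
  ]
}`

func main() {
	ext, err := parseSemanticJSON([]byte(sample))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d node(s), %d edge(s)\n", len(ext.Nodes), len(ext.Edges))
}
```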

type SemanticNode

type SemanticNode struct {
	ID    string            `json:"id"`
	Type  string            `json:"type"`
	Label string            `json:"label"`
	Attrs map[string]string `json:"attrs,omitempty"`
}

SemanticNode represents a node discovered by LLM analysis.

Directories

Package golang provides Go language extraction for knowledge graphs.
Package java provides Java extraction for knowledge graphs.
Package markdown provides Markdown/text extraction for knowledge graphs.
Package swift provides Swift extraction for knowledge graphs.
Package systemspec provides system-spec extraction for knowledge graphs.
Package typescript provides TypeScript/JavaScript extraction for knowledge graphs.
