Documentation ¶
Overview ¶
Package cache provides semantic analysis caching with content-addressable storage.
Cache Versioning System ¶
The cache uses a three-tier versioning system to detect when cached entries become stale due to changes in extraction logic, analysis prompts, or data structures.
Version Tiers:
CacheSchemaVersion: Tracks changes to CachedAnalysis structure itself. Bump when adding/removing/renaming fields in CachedAnalysis.
CacheMetadataVersion: Tracks changes to metadata extraction logic. Bump when changing FileMetadata fields, extraction algorithms, or handlers.
CacheSemanticVersion: Tracks changes to semantic analysis logic. Bump when changing prompts, SemanticAnalysis fields, or analysis routing.
Cache Key Format:
Old format: {hash[:16]}.json
New format: {hash[:16]}-v{schema}-{metadata}-{semantic}.json
Example: sha256:abc12345-v1-1-1.json
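The new key format can be sketched as below. This is only an illustration: cacheKey is a hypothetical helper, and it assumes the key takes the first 16 characters of the hash string exactly as passed in. Whether the package strips the "sha256:" prefix before truncating is not stated here.

```go
package main

import "fmt"

// cacheKey builds a versioned cache file name from a content hash.
// Hypothetical helper: it assumes the first 16 characters of the hash
// string, as given, form the key prefix.
func cacheKey(hash string, schema, metadata, semantic int) string {
	prefix := hash
	if len(prefix) > 16 {
		prefix = prefix[:16]
	}
	return fmt.Sprintf("%s-v%d-%d-%d.json", prefix, schema, metadata, semantic)
}

func main() {
	fmt.Println(cacheKey("0123456789abcdef0123456789abcdef", 1, 1, 2))
}
```

Because the version triple is part of the file name, entries written under different versions never share a key.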
Index ¶
- Constants
- func CacheVersion() string
- func HashFile(filePath string) (string, error)
- func IsCurrentVersion(cached *types.CachedAnalysis) bool
- func IsLegacyVersion(cached *types.CachedAnalysis) bool
- func IsStaleVersion(cached *types.CachedAnalysis) bool
- func ParseCacheVersion(cached *types.CachedAnalysis) (schema, metadata, semantic int)
- func ShardPath(basePath, hash, filename string) string
- func VersionString(cached *types.CachedAnalysis) string
- type CacheStats
- type Manager
- func (m *Manager) Clear() error
- func (m *Manager) ClearOldVersions() (int, error)
- func (m *Manager) Get(fileHash, provider string) (*types.CachedAnalysis, error)
- func (m *Manager) GetCacheDir() string
- func (m *Manager) GetStats() (*CacheStats, error)
- func (m *Manager) IsStale(cached *types.CachedAnalysis, currentHash string) bool
- func (m *Manager) Set(cached *types.CachedAnalysis) error
Constants ¶
const CacheMetadataVersion = 1
CacheMetadataVersion tracks changes to metadata extraction logic. Increment when:
- Adding fields to FileMetadata
- Changing metadata extraction algorithms
- Fixing bugs in metadata handlers
- Adding new metadata handlers
- Changing categorization logic
- Updating readability detection
const CacheSchemaVersion = 1
CacheSchemaVersion tracks changes to CachedAnalysis structure. Increment when:
- Adding fields to CachedAnalysis struct
- Removing fields from CachedAnalysis struct
- Renaming fields in CachedAnalysis struct
- Changing field types in CachedAnalysis struct
- Changing cache storage format (JSON structure)
- Changing cache key generation algorithm
const CacheSemanticVersion = 2
CacheSemanticVersion tracks changes to semantic analysis logic. Increment when:
- Changing prompt templates
- Adding fields to SemanticAnalysis
- Changing analysis routing logic (which analyzer for which file type)
- Updating response parsing logic
- Changing confidence score calculations
- Updating entity/reference extraction
- Fixing bugs in semantic analysis
Version 2: Multi-provider semantic analysis refactor (claude.* + analysis.* → semantic.*)
const (
// HashPrefix is the standard prefix for SHA-256 content hashes
HashPrefix = "sha256:"
)
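As a rough sketch of how a prefixed content hash could be produced, the example below streams data through SHA-256 and prepends the prefix. The names hashReader, hashString, and hashFile are hypothetical, and the assumption that HashFile returns the prefix followed by a lowercase hex digest is not confirmed by this page.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"os"
	"strings"
)

// hashPrefix mirrors the HashPrefix constant documented above.
const hashPrefix = "sha256:"

// hashReader streams r through SHA-256 and returns the prefixed hex digest.
func hashReader(r io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hashPrefix + hex.EncodeToString(h.Sum(nil)), nil
}

// hashString hashes in-memory content; convenient for examples and tests.
func hashString(s string) (string, error) {
	return hashReader(strings.NewReader(s))
}

// hashFile is a hypothetical stand-in for HashFile (assumed behavior).
func hashFile(path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()
	return hashReader(f)
}

func main() {
	sum, err := hashString("hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(sum)
}
```

Streaming with io.Copy avoids loading the whole file into memory, which matters for large inputs.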
Variables ¶
This section is empty.
Functions ¶
func CacheVersion ¶ added in v0.13.0
func CacheVersion() string
CacheVersion returns the combined version string in format "v{schema}.{metadata}.{semantic}".
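A minimal sketch of the combined format, using the constant values from the Constants section. formatVersion is a hypothetical helper and is not part of the package.

```go
package main

import "fmt"

// Values copied from the Constants section of this page.
const (
	cacheSchemaVersion   = 1
	cacheMetadataVersion = 1
	cacheSemanticVersion = 2
)

// formatVersion renders a version triple as "v{schema}.{metadata}.{semantic}".
func formatVersion(schema, metadata, semantic int) string {
	return fmt.Sprintf("v%d.%d.%d", schema, metadata, semantic)
}

func main() {
	// With the constants above, this prints v1.1.2.
	fmt.Println(formatVersion(cacheSchemaVersion, cacheMetadataVersion, cacheSemanticVersion))
}
```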
func IsCurrentVersion ¶ added in v0.13.0
func IsCurrentVersion(cached *types.CachedAnalysis) bool
IsCurrentVersion checks if a cache entry is from the current version.
func IsLegacyVersion ¶ added in v0.13.0
func IsLegacyVersion(cached *types.CachedAnalysis) bool
IsLegacyVersion checks if a cache entry is a legacy entry (version 0.0.0). Legacy entries are from before versioning was implemented.
func IsStaleVersion ¶ added in v0.13.0
func IsStaleVersion(cached *types.CachedAnalysis) bool
IsStaleVersion checks if a cache entry is from an older version that should be re-analyzed. Returns true if the entry should be considered stale and re-analyzed.
Staleness rules:
- Schema version mismatch = always stale (incompatible structure)
- Metadata version behind current = stale (missing newer metadata fields)
- Semantic version behind current = stale (outdated analysis)
- Future metadata or semantic versions (newer than current) = not stale (forward compatible)
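The rules above can be read as a small decision function. The sketch below works on bare version numbers rather than on types.CachedAnalysis. isStaleVersion here is a hypothetical re-implementation for illustration, and it assumes the forward-compatibility rule applies only to the metadata and semantic tiers, since any schema mismatch is always stale.

```go
package main

import "fmt"

// Current versions, copied from the Constants section of this page.
const (
	currentSchema   = 1
	currentMetadata = 1
	currentSemantic = 2
)

// isStaleVersion applies the staleness rules to a version triple.
func isStaleVersion(schema, metadata, semantic int) bool {
	if schema != currentSchema {
		return true // any schema mismatch is incompatible
	}
	if metadata < currentMetadata {
		return true // extracted with older metadata logic
	}
	if semantic < currentSemantic {
		return true // analyzed with older semantic logic
	}
	return false // current, or newer in metadata/semantic
}

func main() {
	fmt.Println(isStaleVersion(0, 0, 0)) // legacy entry
	fmt.Println(isStaleVersion(1, 1, 1)) // semantic behind
	fmt.Println(isStaleVersion(1, 1, 2)) // current
	fmt.Println(isStaleVersion(1, 3, 2)) // newer metadata
}
```

Under these rules a legacy entry (0, 0, 0) is always stale, because its schema version differs from the current one.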
func ParseCacheVersion ¶ added in v0.13.0
func ParseCacheVersion(cached *types.CachedAnalysis) (schema, metadata, semantic int)
ParseCacheVersion extracts version components from a cached analysis entry. Returns (0, 0, 0) for legacy entries that don't have version fields set.
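To illustrate why legacy entries come back as (0, 0, 0): integer fields that were never set keep Go's zero value. The struct below is a hypothetical stand-in for types.CachedAnalysis, and its field names are assumptions, since the real type is not shown on this page.

```go
package main

import "fmt"

// cachedAnalysis is a hypothetical stand-in for types.CachedAnalysis.
// The field names are assumptions made for this sketch.
type cachedAnalysis struct {
	SchemaVersion   int
	MetadataVersion int
	SemanticVersion int
}

// parseVersion returns the version triple; unset fields read as zero.
// Treating a nil entry as (0, 0, 0) is also an assumption.
func parseVersion(c *cachedAnalysis) (schema, metadata, semantic int) {
	if c == nil {
		return 0, 0, 0
	}
	return c.SchemaVersion, c.MetadataVersion, c.SemanticVersion
}

// isLegacy reports whether the entry predates versioning (0.0.0).
func isLegacy(c *cachedAnalysis) bool {
	s, m, sem := parseVersion(c)
	return s == 0 && m == 0 && sem == 0
}

func main() {
	fmt.Println(isLegacy(&cachedAnalysis{}))
	fmt.Println(isLegacy(&cachedAnalysis{SchemaVersion: 1, MetadataVersion: 1, SemanticVersion: 2}))
}
```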
func ShardPath ¶ added in v0.14.0
func ShardPath(basePath, hash, filename string) string
ShardPath generates a two-level sharded directory path for a given hash. Distributing entries across 65,536 (256 × 256) possible subdirectories avoids the filesystem slowdown that occurs when a single directory holds many files.
For hashes with the "sha256:" prefix, the prefix is stripped before sharding so that the hash value itself determines the directory distribution.
Examples:
- "sha256:41d6..." → "{base}/41/d6/{filename}"
- "abc123..." → "{base}/ab/c1/{filename}"
- "ab" (short) → "{base}/{filename}" (no sharding)
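The examples above are consistent with the following sketch. shardPath is a hypothetical re-implementation, not the package's code, and the cutoff of four characters for "short" hashes is an assumption based on the two 2-character shard levels.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// shardPath places filename under two directory levels derived from the
// first four characters of the hash (after any "sha256:" prefix).
// Hashes shorter than four characters are not sharded (assumed cutoff).
func shardPath(basePath, hash, filename string) string {
	h := strings.TrimPrefix(hash, "sha256:")
	if len(h) < 4 {
		return filepath.Join(basePath, filename)
	}
	return filepath.Join(basePath, h[:2], h[2:4], filename)
}

func main() {
	fmt.Println(shardPath("cache", "sha256:41d6aa", "entry.json"))
	fmt.Println(shardPath("cache", "abc123", "entry.json"))
	fmt.Println(shardPath("cache", "ab", "entry.json"))
}
```

With hexadecimal hashes, each level has 256 possible names, which gives the 65,536 leaf directories mentioned above.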
func VersionString ¶ added in v0.13.0
func VersionString(cached *types.CachedAnalysis) string
VersionString returns the version string for a cached entry in format "v{schema}.{metadata}.{semantic}".
Types ¶
type CacheStats ¶ added in v0.13.0
type CacheStats struct {
TotalEntries int `json:"total_entries"`
LegacyEntries int `json:"legacy_entries"`
TotalSize int64 `json:"total_size_bytes"`
VersionCounts map[string]int `json:"version_counts"`
}
CacheStats provides statistics about the cache contents.
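Given the struct tags above, a CacheStats value serializes to JSON as sketched below. The struct is copied from this page; encodeStats and the sample values are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// CacheStats is copied from the type definition above.
type CacheStats struct {
	TotalEntries  int            `json:"total_entries"`
	LegacyEntries int            `json:"legacy_entries"`
	TotalSize     int64          `json:"total_size_bytes"`
	VersionCounts map[string]int `json:"version_counts"`
}

// encodeStats is a hypothetical helper that renders stats as compact JSON.
func encodeStats(s CacheStats) (string, error) {
	out, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	// Sample values, invented for this example.
	stats := CacheStats{
		TotalEntries:  3,
		LegacyEntries: 1,
		TotalSize:     2048,
		VersionCounts: map[string]int{"v1.1.2": 2, "v0.0.0": 1},
	}
	out, err := encodeStats(stats)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

encoding/json writes map keys in sorted order, so the version_counts object is stable across runs.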
type Manager ¶
type Manager struct {
// contains filtered or unexported fields
}
func NewManager ¶
func (*Manager) ClearOldVersions ¶ added in v0.13.0
func (m *Manager) ClearOldVersions() (int, error)
ClearOldVersions removes all cache entries that are not the current version. Recursively walks all subdirectories (provider and shard directories). Returns the number of entries removed.
func (*Manager) Get ¶
func (m *Manager) Get(fileHash, provider string) (*types.CachedAnalysis, error)
func (*Manager) GetCacheDir ¶ added in v0.13.0
func (m *Manager) GetCacheDir() string
GetCacheDir returns the cache directory path.
func (*Manager) GetStats ¶ added in v0.13.0
func (m *Manager) GetStats() (*CacheStats, error)
GetStats returns statistics about the cache contents including version distribution. Recursively walks all subdirectories (provider and shard directories).
func (*Manager) IsStale ¶
func (m *Manager) IsStale(cached *types.CachedAnalysis, currentHash string) bool
IsStale checks if a cache entry is stale and needs re-analysis. Returns true if either:
- Content hash doesn't match (file was modified)
- Cache version is outdated (needs re-analysis with current logic)
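The two conditions can be combined as in the sketch below, which also reports which condition fired. Everything here is hypothetical: entry stands in for types.CachedAnalysis with assumed field names, the version check is simplified to the rules listed under IsStaleVersion, and treating a nil entry as stale is an assumption.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for types.CachedAnalysis.
type entry struct {
	FileHash                   string
	Schema, Metadata, Semantic int
}

// Current versions, copied from the Constants section of this page.
const (
	currentSchema   = 1
	currentMetadata = 1
	currentSemantic = 2
)

// staleReason returns why an entry is stale, or "" if it is still usable.
func staleReason(e *entry, currentHash string) string {
	switch {
	case e == nil:
		return "missing entry" // assumption: nothing cached means re-analyze
	case e.FileHash != currentHash:
		return "content hash mismatch" // file was modified
	case e.Schema != currentSchema, e.Metadata < currentMetadata, e.Semantic < currentSemantic:
		return "outdated cache version" // needs re-analysis with current logic
	default:
		return ""
	}
}

// isStale mirrors the boolean result described above.
func isStale(e *entry, currentHash string) bool {
	return staleReason(e, currentHash) != ""
}

func main() {
	e := &entry{FileHash: "sha256:aaaa", Schema: 1, Metadata: 1, Semantic: 2}
	fmt.Println(isStale(e, "sha256:aaaa"), staleReason(e, "sha256:aaaa"))
	fmt.Println(isStale(e, "sha256:bbbb"), staleReason(e, "sha256:bbbb"))
}
```

Checking the hash first means a modified file is reported as such even when its cached entry is also from an older version.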