git

package
v0.3.4
Published: Jun 15, 2026 License: MIT Imports: 7 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func IsRepo added in v0.3.0

func IsRepo(root string) bool

IsRepo reports whether root is inside a git work tree. It runs `git rev-parse --is-inside-work-tree` once, letting callers short-circuit per-file CollectHistory forks when the tree is not version-controlled (or git is not installed). Returns false on any error.
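The check described above can be sketched with the standard library alone. This is a hypothetical stand-in, not the package's source: the name `isRepoSketch` and the exact output comparison are assumptions, while the `git rev-parse --is-inside-work-tree` invocation comes from the description.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// isRepoSketch is a hypothetical stand-in for IsRepo. It runs
// `git rev-parse --is-inside-work-tree` with root as the working
// directory and reports whether git printed "true". Any error
// (git not installed, root missing, root not in a repository)
// yields false.
func isRepoSketch(root string) bool {
	cmd := exec.Command("git", "rev-parse", "--is-inside-work-tree")
	cmd.Dir = root
	out, err := cmd.Output()
	if err != nil {
		return false
	}
	return strings.TrimSpace(string(out)) == "true"
}

func main() {
	root := "."
	// One cheap check up front avoids one git fork per file later.
	if !isRepoSketch(root) {
		fmt.Println("not a git work tree; skipping history collection")
		return
	}
	fmt.Println("git work tree detected; history collection can proceed")
}
```

Comparing the trimmed output against "true" matters because git exits successfully but prints "false" when run inside the `.git` directory itself.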

Types

type FileHistory

type FileHistory struct {
	Path          string
	CommitCount   int
	FirstCommitAt int64
	LastCommitAt  int64
	AuthorCount   int
	LastAuthor    string
	LastSubject   string
}

FileHistory holds git commit statistics for a single file.

func CollectHistories added in v0.3.0

func CollectHistories(gitRoot string, relPaths []string, workers int) []FileHistory

CollectHistories collects a FileHistory for every relPath under gitRoot, fanning the per-file `git log --follow` forks across up to `workers` goroutines. On a large versioned corpus this is the dominant index cost: each fork performs CPU-bound `--follow` rename detection in a child process, so a serial loop pegs a single core while the rest sit idle. Measured on a 64k-commit vscode clone with 383 doc files on 14 cores: ~304s wall time serially vs ~32s with workers=NumCPU, a 9.4× speedup, with file_history rows bit-identical to the serial run.

CollectHistory is a pure per-file function with no shared state, so each goroutine writes its own disjoint results slot and needs no mutex or batcher (unlike similarity.runPairwiseWorkers, whose edge writes must funnel through one SQLite writer). Rows are returned in the same order as relPaths, so callers can keep their serial UpsertFileHistory loop unchanged. A call with workers <= 1, an empty list, or a single path runs serially; workers is clamped to len(relPaths).
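The disjoint-slot pattern can be illustrated as follows. This is a minimal sketch, not the package's code: `collectAll`, the trimmed `fileHistory` type, and the injected `collect` callback are all hypothetical, and the callback stands in for the real per-file git fork so the example runs without a repository.

```go
package main

import (
	"fmt"
	"sync"
)

// fileHistory mirrors only the fields this sketch needs.
type fileHistory struct {
	Path        string
	CommitCount int
}

// collectAll applies collect to every path using up to workers
// goroutines. Goroutine-safe without a mutex because index i is
// handed to exactly one goroutine, which alone writes results[i].
func collectAll(relPaths []string, workers int, collect func(string) fileHistory) []fileHistory {
	results := make([]fileHistory, len(relPaths))
	if workers > len(relPaths) {
		workers = len(relPaths) // clamp: no idle goroutines
	}
	if workers <= 1 {
		// Serial path: covers workers <= 1, empty input, single path.
		for i, p := range relPaths {
			results[i] = collect(p)
		}
		return results
	}
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				results[i] = collect(relPaths[i])
			}
		}()
	}
	for i := range relPaths {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return results // same order as relPaths
}

func main() {
	// Stub collector: a real one would fork git for each path.
	stub := func(p string) fileHistory {
		return fileHistory{Path: p, CommitCount: len(p)}
	}
	for _, h := range collectAll([]string{"a.md", "docs/b.md", "c.md"}, 4, stub) {
		fmt.Println(h.Path, h.CommitCount)
	}
}
```

Because results are indexed by input position rather than appended as they finish, output order is independent of goroutine scheduling.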

Concurrency is fork-bounded globally: every git child this function spawns is gated by the package-level forkSem (cap NumCPU), so total concurrent `git log` children stay ≤ NumCPU even when multiple callers fan out at once. Both call sites rely on this: the single-store index flush (one CollectHistories at workers=NumCPU) and the multi-project workspace IndexAll (one CollectHistories per project, with NumCPU projects in flight). The per-project `workers` value can therefore be NumCPU regardless of how many projects run concurrently. It only sets how wide each project *requests*, never how many forks actually run; forkSem decides that. Goroutines blocked waiting on the budget are cheap, and at most NumCPU forks are ever live.

func CollectHistory

func CollectHistory(gitRoot, relPath string) FileHistory

CollectHistory runs git log to gather change history for relPath within gitRoot. Returns a zero-value FileHistory (CommitCount == 0) on any error: git not installed, directory not a git repo, or file untracked.
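One way the fields of FileHistory could be derived from `git log` output is sketched below. Everything here is an assumption for illustration: the docs do not state which log format the package uses, whether the timestamps are Unix seconds, or how it parses the output. The sketch assumes a tab-separated format such as `--format=%at%x09%an%x09%s` (author time, author name, subject), with commits listed newest first as `git log` does by default, and it parses a fixed string so that it runs without a repository.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// fileHistory has the same fields as the documented FileHistory.
type fileHistory struct {
	Path          string
	CommitCount   int
	FirstCommitAt int64
	LastCommitAt  int64
	AuthorCount   int
	LastAuthor    string
	LastSubject   string
}

// parseLog (hypothetical) turns newest-first lines of the form
// "<unix time>\t<author>\t<subject>" into a fileHistory. With no
// parsable lines it returns the zero value (CommitCount == 0).
func parseLog(path, out string) fileHistory {
	var h fileHistory
	authors := map[string]struct{}{}
	for _, line := range strings.Split(strings.TrimSpace(out), "\n") {
		parts := strings.SplitN(line, "\t", 3)
		if len(parts) != 3 {
			continue // blank or malformed line
		}
		ts, err := strconv.ParseInt(parts[0], 10, 64)
		if err != nil {
			continue
		}
		if h.CommitCount == 0 {
			// First line is the most recent commit.
			h.LastCommitAt, h.LastAuthor, h.LastSubject = ts, parts[1], parts[2]
		}
		h.FirstCommitAt = ts // ends up as the oldest commit's time
		h.CommitCount++
		authors[parts[1]] = struct{}{}
	}
	if h.CommitCount > 0 {
		h.Path = path
		h.AuthorCount = len(authors)
	}
	return h
}

func main() {
	sample := "1700000300\tBob\tFix typo\n" +
		"1700000200\tAlice\tExpand intro\n" +
		"1700000100\tAlice\tAdd README\n"
	h := parseLog("README.md", sample)
	if h.CommitCount == 0 {
		fmt.Println("no history: untracked file, not a repo, or git missing")
		return
	}
	fmt.Println(h.CommitCount, h.AuthorCount, h.LastAuthor)
}
```

Callers of the real CollectHistory can apply the same `CommitCount == 0` test to tell "no usable history" apart from a populated result.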
