Back to

Package hercules

Latest Go to latest
Published: Mar 17, 2019 | License: Apache-2.0 | Module:


Package hercules contains the functions which are needed to gather various statistics from a Git repository.

The analysis is expressed in a form of the tree: there are nodes - "pipeline items" - which require some other nodes to be executed prior to selves and in turn provide the data for dependent nodes. There are several service items which do not produce any useful statistics but rather provide the requirements for other items. The top-level items include:

- BurndownAnalysis - line burndown statistics for project, files and developers.

- CouplesAnalysis - coupling statistics for files and developers.

- ShotnessAnalysis - structural hotness and couples, by any Babelfish UAST XPath (functions by default).

The typical API usage is to initialize the Pipeline class:

import ""

var repository *git.Repository
// ...initialize repository...
pipeline := hercules.NewPipeline(repository)

Then add the required analysis:

ba := pipeline.DeployItem(&hercules.BurndownAnalysis{}).(hercules.LeafPipelineItem)

This call will add all the needed intermediate pipeline items. Then link and execute the analysis tree:

result, err := pipeline.Run(pipeline.Commits(false))

Finally extract the result:

result := result[ba].(hercules.BurndownResult)

The actual usage example is cmd/hercules/root.go - the command line tool's code.

Hercules depends heavily on and leverages the diff algorithm through

Besides, BurndownAnalysis involves File and RBTree. These are low level data structures which enable incremental blaming. File carries an instance of RBTree and the current line burndown state. RBTree implements the red-black balanced binary tree and is based on

Coupling stats are supposed to be further processed rather than observed directly. uses Swivel embeddings and visualises them in Tensorflow Projector.

Shotness analysis as well as other UAST-featured items relies on [Babelfish]( and requires the server to be running.


Package Files


const (
	// BoolConfigurationOption reflects the boolean value type.
	BoolConfigurationOption = core.BoolConfigurationOption
	// IntConfigurationOption reflects the integer value type.
	IntConfigurationOption = core.IntConfigurationOption
	// StringConfigurationOption reflects the string value type.
	StringConfigurationOption = core.StringConfigurationOption
	// FloatConfigurationOption reflects a floating point value type.
	FloatConfigurationOption = core.FloatConfigurationOption
	// StringsConfigurationOption reflects the array of strings value type.
	StringsConfigurationOption = core.StringsConfigurationOption
	// MessageFinalize is the status text reported before calling LeafPipelineItem.Finalize()-s.
	MessageFinalize = core.MessageFinalize
const (
	// ConfigPipelineDAGPath is the name of the Pipeline configuration option (Pipeline.Initialize())
	// which enables saving the items DAG to the specified file.
	ConfigPipelineDAGPath = core.ConfigPipelineDAGPath
	// ConfigPipelineDumpPlan is the name of the Pipeline configuration option (Pipeline.Initialize())
	// which outputs the execution plan to stderr.
	ConfigPipelineDumpPlan = core.ConfigPipelineDumpPlan
	// ConfigPipelineDryRun is the name of the Pipeline configuration option (Pipeline.Initialize())
	// which disables Configure() and Initialize() invocation on each PipelineItem during the
	// Pipeline initialization.
	// Subsequent Run() calls are going to fail. Useful with ConfigPipelineDAGPath=true.
	ConfigPipelineDryRun = core.ConfigPipelineDryRun
	// ConfigPipelineCommits is the name of the Pipeline configuration option (Pipeline.Initialize())
	// which allows to specify the custom commit sequence. By default, Pipeline.Commits() is used.
	ConfigPipelineCommits = core.ConfigPipelineCommits
const (
	// DependencyCommit is the name of one of the three items in `deps` supplied to PipelineItem.Consume()
	// which always exists. It corresponds to the currently analyzed commit.
	DependencyCommit = core.DependencyCommit
	// DependencyIndex is the name of one of the three items in `deps` supplied to PipelineItem.Consume()
	// which always exists. It corresponds to the currently analyzed commit's index.
	DependencyIndex = core.DependencyIndex
	// DependencyIsMerge is the name of one of the three items in `deps` supplied to PipelineItem.Consume()
	// which always exists. It indicates whether the analyzed commit is a merge commit.
	// Checking the number of parents is not correct - we remove the back edges during the DAG simplification.
	DependencyIsMerge = core.DependencyIsMerge
	// DependencyAuthor is the name of the dependency provided by identity.Detector.
	DependencyAuthor = identity.DependencyAuthor
	// DependencyBlobCache identifies the dependency provided by BlobCache.
	DependencyBlobCache = plumbing.DependencyBlobCache
	// DependencyDay is the name of the dependency which DaysSinceStart provides - the number
	// of days since the first commit in the analysed sequence.
	DependencyDay = plumbing.DependencyDay
	// DependencyFileDiff is the name of the dependency provided by FileDiff.
	DependencyFileDiff = plumbing.DependencyFileDiff
	// DependencyTreeChanges is the name of the dependency provided by TreeDiff.
	DependencyTreeChanges = plumbing.DependencyTreeChanges
	// DependencyUastChanges is the name of the dependency provided by Changes.
	DependencyUastChanges = uast.DependencyUastChanges
	// DependencyUasts is the name of the dependency provided by Extractor.
	DependencyUasts = uast.DependencyUasts
	// FactCommitsByDay contains the mapping between day indices and the corresponding commits.
	FactCommitsByDay = plumbing.FactCommitsByDay
	// FactIdentityDetectorPeopleCount is the name of the fact which is inserted in
	// identity.Detector.Configure(). It is equal to the overall number of unique authors
	// (the length of ReversedPeopleDict).
	FactIdentityDetectorPeopleCount = identity.FactIdentityDetectorPeopleCount
	// FactIdentityDetectorPeopleDict is the name of the fact which is inserted in
	// identity.Detector.Configure(). It corresponds to identity.Detector.PeopleDict - the mapping
	// from the signatures to the author indices.
	FactIdentityDetectorPeopleDict = identity.FactIdentityDetectorPeopleDict
	// FactIdentityDetectorReversedPeopleDict is the name of the fact which is inserted in
	// identity.Detector.Configure(). It corresponds to identity.Detector.ReversedPeopleDict -
	// the mapping from the author indices to the main signature.
	FactIdentityDetectorReversedPeopleDict = identity.FactIdentityDetectorReversedPeopleDict


var BinaryGitHash = "<unknown>"

BinaryGitHash is the Git hash of the Hercules binary file which is executing.

var BinaryVersion = 0

BinaryVersion is Hercules' API version. It matches the package name.

var Registry = core.Registry

Registry contains all known pipeline item types.

func EnablePathFlagTypeMasquerade

func EnablePathFlagTypeMasquerade()

EnablePathFlagTypeMasquerade changes the type of all "path" command line arguments from "string" to "path". This operation cannot be canceled and is intended to be used for better --help output.

func LoadCommitsFromFile

func LoadCommitsFromFile(path string, repository *git.Repository) ([]*object.Commit, error)

LoadCommitsFromFile reads the file by the specified FS path and generates the sequence of commits by interpreting each line as a Git commit hash.

func PathifyFlagValue

func PathifyFlagValue(flag *pflag.Flag)

PathifyFlagValue changes the type of a string command line argument to "path".

func SafeYamlString

func SafeYamlString(str string) string

SafeYamlString escapes the string so that it can be reliably used in YAML.

type CachedBlob

type CachedBlob = plumbing.CachedBlob

CachedBlob allows to explicitly cache the binary data associated with the Blob object. Such structs are returned by DependencyBlobCache.

type CommonAnalysisResult

type CommonAnalysisResult = core.CommonAnalysisResult

CommonAnalysisResult holds the information which is always extracted at Pipeline.Run().

func MetadataToCommonAnalysisResult

func MetadataToCommonAnalysisResult(meta *core.Metadata) *CommonAnalysisResult

MetadataToCommonAnalysisResult copies the data from a Protobuf message.

type ConfigurationOption

type ConfigurationOption = core.ConfigurationOption

ConfigurationOption allows for the unified, retrospective way to setup PipelineItem-s.

type ConfigurationOptionType

type ConfigurationOptionType = core.ConfigurationOptionType

ConfigurationOptionType represents the possible types of a ConfigurationOption's value.

type FeaturedPipelineItem

type FeaturedPipelineItem = core.FeaturedPipelineItem

FeaturedPipelineItem enables switching the automatic insertion of pipeline items on or off.

type FileDiffData

type FileDiffData = plumbing.FileDiffData

FileDiffData is the type of the dependency provided by plumbing.FileDiff.

type LeafPipelineItem

type LeafPipelineItem = core.LeafPipelineItem

LeafPipelineItem corresponds to the top level pipeline items which produce the end results.

type NoopMerger

type NoopMerger = core.NoopMerger

NoopMerger provides an empty Merge() method suitable for PipelineItem.

type OneShotMergeProcessor

type OneShotMergeProcessor = core.OneShotMergeProcessor

OneShotMergeProcessor provides the convenience method to consume merges only once.

type Pipeline

type Pipeline = core.Pipeline

Pipeline is the core Hercules entity which carries several PipelineItems and executes them. See the extended example of how a Pipeline works in doc.go

func NewPipeline

func NewPipeline(repository *git.Repository) *Pipeline

NewPipeline initializes a new instance of Pipeline struct.

type PipelineItem

type PipelineItem = core.PipelineItem

PipelineItem is the interface for all the units in the Git commits analysis pipeline.

func ForkCopyPipelineItem

func ForkCopyPipelineItem(origin PipelineItem, n int) []PipelineItem

ForkCopyPipelineItem clones items by copying them by value from the origin.

func ForkSamePipelineItem

func ForkSamePipelineItem(origin PipelineItem, n int) []PipelineItem

ForkSamePipelineItem clones items by referencing the same origin.

type PipelineItemRegistry

type PipelineItemRegistry = core.PipelineItemRegistry

PipelineItemRegistry contains all the known PipelineItem-s.

type ResultMergeablePipelineItem

type ResultMergeablePipelineItem = core.ResultMergeablePipelineItem

ResultMergeablePipelineItem specifies the methods to combine several analysis results together.

Documentation was rendered with GOOS=linux and GOARCH=amd64.

Jump to identifier

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to identifier