Documentation ¶
Overview ¶
Package hardlinkable determines which files in the given directories have equal content and compatible inode properties, and returns information on the space that would be saved by hardlinking them all together. It can also, optionally, perform the hardlinking.
Index ¶
- Constants
- func CheckQuiescence(o *Options)
- func ContentOnly(o *Options)
- func DebugLevel(debugLevel uint) func(*Options)
- func Humanize(n uint64) string
- func HumanizeWithPrecision(n uint64, prec int) string
- func HumanizedUint64(s string) (uint64, error)
- func IgnoreLinkErrors(o *Options)
- func IgnoreOwner(o *Options)
- func IgnorePerm(o *Options)
- func IgnoreTime(o *Options)
- func IgnoreWalkErrors(o *Options)
- func IgnoreXAttr(o *Options)
- func LinkingDisabled(o *Options)
- func LinkingEnabled(o *Options)
- func MaxFileSize(size uint64) func(*Options)
- func MinFileSize(size uint64) func(*Options)
- func SameName(o *Options)
- func ShowExtendedRunStats(o *Options)
- func ValidateDirsAndFiles(dirsAndFiles []string) (dirs []string, files []string, err error)
- type Options
- type Results
- type RunPhases
- type RunStats
Constants ¶
const DefaultMinFileSize = 1
const DefaultSearchThresh = 1
const DefaultShowExtendedRunStats = false // Non-cli default
const DefaultShowRunStats = true // Non-cli default
const DefaultStoreExistingLinkResults = true // Non-cli default
const DefaultStoreNewLinkResults = true // Non-cli default
const DefaultUseNewestLink = true
Variables ¶
This section is empty.
Functions ¶
func CheckQuiescence ¶
func CheckQuiescence(o *Options)
CheckQuiescence enables quiescence checking which can detect changes to the filesystem during the file/directory walk.
func ContentOnly ¶
func ContentOnly(o *Options)
ContentOnly uses only file content to determine equality (not inode parameters like time, permission, ownership, etc.)
func DebugLevel ¶
func DebugLevel(debugLevel uint) func(*Options)
DebugLevel sets the debugging level (1, 2, or 3)
func HumanizeWithPrecision ¶
func HumanizeWithPrecision(n uint64, prec int) string
HumanizeWithPrecision allows providing a FormatFloat precision value
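As an illustration of how a precision value feeds into strconv.FormatFloat, here is a minimal sketch. The function `humanizeSketch` is hypothetical, and both the unit letters and the 1024 base are assumptions; the package's actual output format may differ.

```go
package main

import (
	"fmt"
	"strconv"
)

// humanizeSketch (hypothetical) scales n by powers of 1024 and formats the
// result with prec digits after the decimal point via strconv.FormatFloat.
// Values below 1024 are returned as plain integers.
func humanizeSketch(n uint64, prec int) string {
	const units = "KMGTPE"
	if n < 1024 {
		return strconv.FormatUint(n, 10)
	}
	f := float64(n)
	i := -1
	for f >= 1024 && i < len(units)-1 {
		f /= 1024
		i++
	}
	return strconv.FormatFloat(f, 'f', prec, 64) + string(units[i])
}

func main() {
	fmt.Println(humanizeSketch(1536, 1))
	fmt.Println(humanizeSketch(1536, 3))
}
```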
func HumanizedUint64 ¶
func HumanizedUint64(s string) (uint64, error)
HumanizedUint64 converts humanized size strings like "1k" into a uint64 (e.g. "1k" -> 1024)
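The conversion can be sketched as follows. The name `parseHumanizedSketch` is hypothetical, and the accepted suffixes (k, m, g, t, case-insensitive, as powers of 1024) are an assumption based only on the "1k" -> 1024 example; the package may accept a different set.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// parseHumanizedSketch (hypothetical) parses strings such as "1k" or "2M"
// into a byte count, treating each suffix as a power of 1024.
func parseHumanizedSketch(s string) (uint64, error) {
	s = strings.TrimSpace(s)
	if s == "" {
		return 0, fmt.Errorf("empty size string")
	}
	mult := uint64(1)
	switch s[len(s)-1] {
	case 'k', 'K':
		mult = 1 << 10
	case 'm', 'M':
		mult = 1 << 20
	case 'g', 'G':
		mult = 1 << 30
	case 't', 'T':
		mult = 1 << 40
	}
	if mult > 1 {
		s = s[:len(s)-1] // strip the recognized suffix
	}
	n, err := strconv.ParseUint(s, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("invalid size %q: %w", s, err)
	}
	if n > math.MaxUint64/mult {
		return 0, fmt.Errorf("size %q overflows uint64", s)
	}
	return n * mult, nil
}

func main() {
	n, err := parseHumanizedSketch("1k")
	fmt.Println(n, err)
}
```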
func IgnoreLinkErrors ¶
func IgnoreLinkErrors(o *Options)
IgnoreLinkErrors allows the Run to continue during Link phase errors (typically the actual linking itself)
func IgnoreOwner ¶
func IgnoreOwner(o *Options)
IgnoreOwner allows linked files to have unequal uid or gid
func IgnorePerm ¶
func IgnorePerm(o *Options)
IgnorePerm allows linked files to have unequal mode bits
func IgnoreTime ¶
func IgnoreTime(o *Options)
IgnoreTime allows linked files to have unequal modification times
func IgnoreWalkErrors ¶
func IgnoreWalkErrors(o *Options)
IgnoreWalkErrors allows the Run to continue during Walk phase errors (such as permission errors reading dirs or files)
func IgnoreXAttr ¶
func IgnoreXAttr(o *Options)
IgnoreXAttr allows linked files to have unequal xattrs
func LinkingDisabled ¶
func LinkingDisabled(o *Options)
LinkingDisabled forbids Run() from actually linking the files
func LinkingEnabled ¶
func LinkingEnabled(o *Options)
LinkingEnabled allows Run() to actually perform linking of files
func MaxFileSize ¶
func MaxFileSize(size uint64) func(*Options)
MaxFileSize sets the maximum size of files that can be linked
func MinFileSize ¶
func MinFileSize(size uint64) func(*Options)
MinFileSize sets the minimum size of files that can be linked
func ShowExtendedRunStats ¶
func ShowExtendedRunStats(o *Options)
ShowExtendedRunStats enables printing of additional stats in OutputRunStats()
Types ¶
type Options ¶
type Options struct {
	// SameName enabled ensures only files with matching filenames can be
	// linked
	SameName bool

	// IgnoreTime enabled allows files with different mtime values to be
	// linked
	IgnoreTime bool

	// IgnorePerm enabled allows files with different inode mode values
	// to be linked
	IgnorePerm bool

	// IgnoreOwner enabled allows files with different uid or gid to be
	// linked
	IgnoreOwner bool

	// IgnoreXAttr enabled allows files with different xattrs to be linked
	IgnoreXAttr bool

	// LinkingEnabled causes the Run to perform the linking step
	LinkingEnabled bool

	// MinFileSize controls the minimum size of files that are eligible to
	// be considered for linking.
	MinFileSize uint64

	// MaxFileSize controls the maximum size of files that are eligible to
	// be considered for linking.
	MaxFileSize uint64

	// DebugLevel controls the amount of debug information reported in the
	// results output, as well as debug logging.
	DebugLevel uint

	// UseNewestLink requests setting the inode to the mtime and uid/gid of
	// the more recent inode when files are linked.
	UseNewestLink bool

	// FileIncludes is a slice of regex expressions that control what
	// filenames will be considered for linking. If given without any
	// FileExcludes, the walked files must match one of the includes. If
	// FileExcludes are provided, the FileIncludes can override them.
	FileIncludes []string

	// FileExcludes is a slice of regex expressions that control what
	// filenames will be excluded from consideration for linking.
	FileExcludes []string

	// DirExcludes is a slice of regex expressions that control what
	// directories will be excluded from the file discovery walk.
	DirExcludes []string

	// StoreExistingLinkResults allows controlling whether to store
	// discovered existing links in Results. Command line option Verbosity
	// > 2 can override.
	StoreExistingLinkResults bool

	// StoreNewLinkResults allows controlling whether to store discovered
	// new hardlinkable pathnames in Results. Command line option Verbosity
	// > 1 can override.
	StoreNewLinkResults bool

	// ShowExtendedRunStats enabled displays additional Result stats
	// output. Command line option Verbosity > 0 can override.
	ShowExtendedRunStats bool

	// ShowRunStats enabled displays Result stats output.
	ShowRunStats bool

	// IgnoreWalkErrors allows Run to continue when errors occur during the
	// walk phase, such as not having permission to walk a directory, or
	// being unable to read a file for comparison.
	IgnoreWalkErrors bool

	// IgnoreLinkErrors allows Run to continue when linking fails (or any
	// errors during the Link phase)
	IgnoreLinkErrors bool

	// CheckQuiescence enabled looks for signs of the filesystems changing
	// during walk. Always enabled when LinkingEnabled is true.
	CheckQuiescence bool

	// SearchThresh determines the length that the lists of files with
	// equivalent inode hashes can grow to, before also enabling content
	// digests (which can drastically reduce the number of compared files
	// when there are many with the same hash, but differing content at the
	// start of the file). Can be disabled with -1. May save a small
	// amount of memory, but potentially at greatly increased runtime in
	// worst case scenarios with many, many files.
	SearchThresh int
}
Options is passed to the Run() func, and controls the operation of the hardlinkable algorithm, including what inode parameters must match for files to be compared for equality, what files and directories are included or excluded, and whether linking is actually enabled or not.
func SetupOptions ¶
SetupOptions returns an Options struct with the defaults initialized and the given setup functions also applied.
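The setup functions listed in the index all take a *Options (or return a func(*Options)), which is the functional options pattern. The sketch below shows that pattern on a trimmed-down local type; the lowercase names are hypothetical stand-ins, and the variadic signature of `setupOptions` is an assumption, since the real SetupOptions signature is not shown here.

```go
package main

import "fmt"

// options is a trimmed-down stand-in for the package's Options struct.
type options struct {
	IgnoreTime     bool
	LinkingEnabled bool
	MinFileSize    uint64
}

// Setters that take no argument have the shape func(*options) directly.
func ignoreTime(o *options)     { o.IgnoreTime = true }
func linkingEnabled(o *options) { o.LinkingEnabled = true }

// Setters that need a value return a closure of the same shape.
func minFileSize(size uint64) func(*options) {
	return func(o *options) { o.MinFileSize = size }
}

// setupOptions starts from defaults and applies each setter in order, so
// later setters win over earlier ones.
func setupOptions(setters ...func(*options)) options {
	o := options{MinFileSize: 1} // mirrors DefaultMinFileSize = 1
	for _, set := range setters {
		set(&o)
	}
	return o
}

func main() {
	o := setupOptions(ignoreTime, minFileSize(4096))
	fmt.Printf("%+v\n", o)
}
```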
type Results ¶
type Results struct {
	// Link member strings are pathnames
	ExistingLinks     map[string][]string `json:"existingLinks"`
	ExistingLinkSizes map[string]uint64   `json:"existingLinkSizes"`
	LinkPaths         [][]string          `json:"linkPaths"`
	SkippedLinkPaths  [][]string          `json:"skippedLinkPaths"` // Skipped when link failed
	RunStats
	StartTime time.Time `json:"startTime"`
	EndTime   time.Time `json:"endTime"`
	RunTime   string    `json:"runTime"`
	Opts      Options   `json:"options"`

	// Set to true when Run() has completed successfully
	RunSuccessful bool `json:"runSuccessful"`

	// Record which 'phase' we've gotten to in the algorithms, in case of
	// early termination of the run.
	Phase RunPhases `json:"phase"`
}
Results contains the RunStats information, as well as the found existing and new links. It also includes a measurement of how long the Run() took to execute, and the Options that were used to perform the Run().
func Run ¶
Run performs a scan of the supplied directories and files, with the given Options, and outputs information on which files could be linked to save space.
func RunWithProgress ¶
RunWithProgress performs a scan of the supplied directories and files, with the given Options, and outputs information on which files could be linked to save space. A progress line is continually updated as the directories and files are scanned.
func (*Results) OutputExistingLinks ¶
func (r *Results) OutputExistingLinks()
OutputExistingLinks shows in text form the existing links that were found by Run.
func (*Results) OutputJSONResults ¶
func (r *Results) OutputJSONResults()
OutputJSONResults outputs a JSON formatted object with all the information gathered by Run() about existing and new links, and stats on space saved, etc.
func (*Results) OutputNewLinks ¶
func (r *Results) OutputNewLinks()
OutputNewLinks shows in text form the pathnames that were discovered to be linkable.
func (*Results) OutputResults ¶
func (r *Results) OutputResults()
OutputResults prints results in text form, including existing links that were found, new pathnames that were discovered to be linkable, and stats about the run giving information on the amount of data that can be saved (or was saved if linking was enabled).
func (*Results) OutputRunStats ¶
func (r *Results) OutputRunStats()
OutputRunStats shows information about how many files could be linked, how much space would be saved, and other information on inodes, comparisons, etc. If linking was enabled, it displays the information on links that were actually made and space actually saved (which should equal the predicted amounts).
func (*Results) OutputSkippedNewLinks ¶
func (r *Results) OutputSkippedNewLinks()
OutputSkippedNewLinks shows in text form the pathnames that were skipped due to linking errors.
type RunPhases ¶
type RunPhases int
RunPhases is an enum that indicates which phase of the Run() algorithm is being executed.
const (
	// StartPhase indicates the Run() algorithm hasn't started
	StartPhase RunPhases = iota
	// WalkPhase indicates the directory/file walk which gathers info
	WalkPhase
	// LinkPhase indicates that the pathname link pairs are being computed
	LinkPhase
	// EndPhase indicates the Run() has finished
	EndPhase
)
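Because the phases are declared with iota, they take the values 0 through 3 in declaration order. A caller inspecting the phase after an early termination might map those values to readable names, as in this minimal sketch; the local `runPhases` type mirrors the declaration above, and the String method and its labels are hypothetical additions, not part of the package.

```go
package main

import "fmt"

// runPhases mirrors the package's RunPhases enum for illustration.
type runPhases int

const (
	startPhase runPhases = iota // 0
	walkPhase                   // 1
	linkPhase                   // 2
	endPhase                    // 3
)

// String gives each phase a readable label (hypothetical helper).
func (p runPhases) String() string {
	switch p {
	case startPhase:
		return "start"
	case walkPhase:
		return "walk"
	case linkPhase:
		return "link"
	case endPhase:
		return "end"
	default:
		return fmt.Sprintf("runPhases(%d)", int(p))
	}
}

func main() {
	// Suppose a run stopped early while still walking the tree.
	stoppedAt := walkPhase
	fmt.Printf("run stopped in phase %d (%s)\n", int(stoppedAt), stoppedAt)
}
```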
type RunStats ¶
type RunStats struct {
	DirCount               int64  `json:"dirCount"`
	FileCount              int64  `json:"fileCount"`
	FileTooSmallCount      int64  `json:"fileTooSmallCount"`
	FileTooLargeCount      int64  `json:"fileTooLargeCount"`
	ComparisonCount        int64  `json:"comparisonCount"`
	InodeCount             int64  `json:"inodeCount"`
	InodeRemovedCount      int64  `json:"inodeRemovedCount"`
	NlinkCount             int64  `json:"nlinkCount"`
	ExistingLinkCount      int64  `json:"existingLinkCount"`
	NewLinkCount           int64  `json:"newLinkCount"`
	ExistingLinkByteAmount uint64 `json:"existingLinkByteAmount"`
	InodeRemovedByteAmount uint64 `json:"inodeRemovedByteAmount"`
	BytesCompared          uint64 `json:"bytesCompared"`

	// Some stats on files that compared equal, but which had some
	// mismatching inode parameters. This can be helpful for tuning the
	// command line options on subsequent runs.
	MismatchedMtimeCount int64  `json:"mismatchedMtimeCount"`
	MismatchedModeCount  int64  `json:"mismatchedModeCount"`
	MismatchedUIDCount   int64  `json:"mismatchedUIDCount"`
	MismatchedGIDCount   int64  `json:"mismatchedGIDCount"`
	MismatchedXAttrCount int64  `json:"mismatchedXAttrCount"`
	MismatchedTotalCount int64  `json:"mismatchedTotalCount"`
	MismatchedMtimeBytes uint64 `json:"mismatchedMtimeBytes"`
	MismatchedModeBytes  uint64 `json:"mismatchedModeBytes"`
	MismatchedUIDBytes   uint64 `json:"mismatchedUIDBytes"`
	MismatchedGIDBytes   uint64 `json:"mismatchedGIDBytes"`
	MismatchedXAttrBytes uint64 `json:"mismatchedXAttrBytes"`
	MismatchedTotalBytes uint64 `json:"mismatchedTotalBytes"`

	// Counts of file I/O errors (reading, linking, etc.)
	SkippedDirErrCount  int64 `json:"skippedDirErrCount"`
	SkippedFileErrCount int64 `json:"skippedFileErrCount"`
	SkippedLinkErrCount int64 `json:"skippedLinkErrCount"`

	// Counts of files and dirs excluded by the Regex matches
	ExcludedDirCount  int64 `json:"excludedDirCount"`
	ExcludedFileCount int64 `json:"excludedFileCount"`
	IncludedFileCount int64 `json:"includedFileCount"`

	// Count of how many setuid and setgid files were encountered (and skipped)
	SkippedSetuidCount int64 `json:"skippedSetuidCount"`
	SkippedSetgidCount int64 `json:"skippedSetgidCount"`

	// Also keep track of files with bits other than the permission bits
	// set (other than setuid/setgid and bits already excluded by "regular"
	// file bits)
	SkippedNonPermBitCount int64 `json:"skippedNonPermBitCount"`

	// Debugging counts
	EqualComparisonCount int64 `json:"equalComparisonCount"`
	FoundHashCount       int64 `json:"foundHashCount"`
	MissedHashCount      int64 `json:"missedHashCount"`
	HashMismatchCount    int64 `json:"hashMismatchCount"`
	InoSeqSearchCount    int64 `json:"inoSeqSearchCount"`
	InoSeqIterationCount int64 `json:"inoSeqIterationCount"`
	DigestComputedCount  int64 `json:"digestComputedCount"`

	// Counts of how many times the hardlinkFiles() func wasn't able to
	// successfully change inode times and/or uid/gid. Since we ignore
	// such errors and continue anyway (ie. it's a best-effort attempt,
	// rather than a guarantee), the counts are debugging info.
	FailedLinkChtimesCount int64 `json:"failedLinkChtimesCount"`
	FailedLinkChownCount   int64 `json:"failedLinkChownCount"`
}
RunStats holds information about counts, the number of files found to be linkable, the bytes that linking would save (or did save), and a variety of related, useful, or just interesting information gathered during the Run().
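The struct tags determine the key names that appear when these stats are encoded with encoding/json. The sketch below uses a hypothetical three-field subset, `statsSketch`, with tags copied from the declaration above; it illustrates the tag-to-key mapping only and says nothing about how OutputJSONResults formats its output.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// statsSketch is a small, hypothetical subset of RunStats, reusing the
// same JSON tags as the corresponding fields.
type statsSketch struct {
	DirCount               int64  `json:"dirCount"`
	FileCount              int64  `json:"fileCount"`
	InodeRemovedByteAmount uint64 `json:"inodeRemovedByteAmount"`
}

// encodeStats marshals the stats; keys follow the struct tags and appear
// in field declaration order.
func encodeStats(s statsSketch) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encodeStats(statsSketch{DirCount: 2, FileCount: 10, InodeRemovedByteAmount: 4096})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(out)
}
```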