dgut

package
v4.5.1
Published: Jan 22, 2024 License: MIT Imports: 15 Imported by: 0

Documentation

Constants

const ErrBlankLine = Error("the provided line had no information")
const ErrDBExists = Error("database already exists")
const ErrDBNotExists = Error("database doesn't exist")
const ErrDirNotFound = Error("directory not found")
const ErrInvalidFormat = Error("the provided data was not in dgut format")

Variables

This section is empty.

Functions

This section is empty.

Types

type DB

type DB struct {
	// contains filtered or unexported fields
}

DB is used to create and query a database made from a dgut file, which is the directory,group,user,type summary output produced by the summary package's DirGroupUserType.Output() method.

func NewDB

func NewDB(paths ...string) *DB

NewDB returns a *DB that can be used to create or query a dgut database. Provide the path to the directory that stores (or will store) the database files. If you will only read databases with Open(), you can supply multiple directory paths to query all of them simultaneously.

func (*DB) Children

func (d *DB) Children(dir string) []string

Children returns the directory paths that are directly inside the given directory.

Returns an empty slice if dir had no children (because it was a leaf dir, or didn't exist at all).

The same children from multiple databases are de-duplicated.

You must call Open() before calling this.
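
The de-duplication described above can be pictured with a minimal sketch. It does not use the dgut package: mergeChildren is a hypothetical helper, and returning the merged children sorted is an assumption, not something the documentation states.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeChildren is a hypothetical helper, not part of dgut. It combines the
// child directories reported by several databases, dropping duplicates.
// Sorting the result is an assumption made for predictable output.
func mergeChildren(perDB ...[]string) []string {
	seen := make(map[string]bool)
	merged := []string{} // non-nil, so "no children" is an empty slice

	for _, children := range perDB {
		for _, child := range children {
			if !seen[child] {
				seen[child] = true
				merged = append(merged, child)
			}
		}
	}

	sort.Strings(merged)

	return merged
}

func main() {
	db1 := []string{"/a/b", "/a/c"}
	db2 := []string{"/a/c", "/a/d"}

	fmt.Println(mergeChildren(db1, db2))
}
```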

func (*DB) Close

func (d *DB) Close()

Close closes the database(s) after reading. You should call this once you've finished reading, but it's not necessary; errors are ignored.

func (*DB) DirInfo

func (d *DB) DirInfo(dir string, filter *Filter) (uint64, uint64, int64, int64,
	[]uint32, []uint32, []summary.DirGUTFileType, error)

DirInfo tells you the total number and total size of the files nested under the given directory, their oldest atime and newest mtime, along with the UIDs, GIDs and FTs of those files. See GUTs.Summary for an explanation of the filter.

Returns an error if dir doesn't exist.

You must call Open() before calling this.

func (*DB) Open

func (d *DB) Open() error

Open opens the database(s) for reading. You need to call this before using the query methods like DirInfo() and Children(). You should call Close() after you've finished.

func (*DB) Store

func (d *DB) Store(data io.Reader, batchSize int) error

Store will read the given dgut file data (as output by summary.DirGroupUserType.Output()) and store it in 2 database files that offer fast lookup of the information by directory.

The database directory you provided to NewDB() (only the first path is used) must not already contain database files. You can't add to an existing database. If you have multiple sets of data to store, Store them in individual database directories instead, and then load them all together during Open().

batchSize is how many directories' worth of information is written to the database in one go. A larger batch is faster, but uses more memory. 10,000 might be a good number to try.

type DCSs

type DCSs []*DirSummary

DCSs is a Size-sortable slice of DirSummary.

func (DCSs) Len

func (d DCSs) Len() int

func (DCSs) Less

func (d DCSs) Less(i, j int) bool

func (DCSs) SortByDir

func (d DCSs) SortByDir()

SortByDir sorts by Dir instead of Size.

func (DCSs) Swap

func (d DCSs) Swap(i, j int)

type DGUT

type DGUT struct {
	Dir  string
	GUTs GUTs
}

DGUT handles all the *GUT information for a directory.

func (*DGUT) Append

func (d *DGUT) Append(other *DGUT)

Append appends the GUTs in the given DGUT to our own. Useful when you have 2 DGUTs for the same Dir that were calculated on different subdirectories independently, and now you're dealing with DGUTs for their common parent directories.

func (*DGUT) Summary

func (d *DGUT) Summary(filter *Filter) (uint64, uint64, int64, int64, []uint32, []uint32, []summary.DirGUTFileType)

Summary sums the count and size of all our GUTs and returns the results, along with the oldest atime and newest mtime (seconds since Unix epoch) and unique set of UIDs, GIDs and FTs in all our GUTs.

See GUTs.Summary for an explanation of the filter.

type DirInfo

type DirInfo struct {
	Current  *DirSummary
	Children []*DirSummary
}

DirInfo holds nested file count, size, UID and GID information on a directory, and also its immediate child directories.

func (*DirInfo) IsSameAsChild

func (d *DirInfo) IsSameAsChild() bool

IsSameAsChild tells you if this DirInfo has only 1 child, and the child has the same file count, i.e. our child contains the same files as us.
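
A minimal sketch of that check, written only from the description above rather than from the package source. The types are cut down to the fields the check needs.

```go
package main

import "fmt"

// DirSummary is cut down to the fields this sketch needs.
type DirSummary struct {
	Dir   string
	Count uint64
}

// DirInfo pairs a directory's summary with those of its immediate children.
type DirInfo struct {
	Current  *DirSummary
	Children []*DirSummary
}

// IsSameAsChild reports whether there is exactly one child and it has the
// same nested file count as the current directory.
func (d *DirInfo) IsSameAsChild() bool {
	return len(d.Children) == 1 && d.Children[0].Count == d.Current.Count
}

func main() {
	same := &DirInfo{
		Current:  &DirSummary{Dir: "/a", Count: 7},
		Children: []*DirSummary{{Dir: "/a/b", Count: 7}},
	}
	fmt.Println(same.IsSameAsChild())

	differs := &DirInfo{
		Current:  &DirSummary{Dir: "/a/b", Count: 7},
		Children: []*DirSummary{{Dir: "/a/b/c", Count: 5}, {Dir: "/a/b/e", Count: 2}},
	}
	fmt.Println(differs.IsSameAsChild())
}
```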

type DirSummary

type DirSummary struct {
	Dir   string
	Count uint64
	Size  uint64
	Atime time.Time
	Mtime time.Time
	UIDs  []uint32
	GIDs  []uint32
	FTs   []summary.DirGUTFileType
}

DirSummary holds nested file count, size, atime and mtime information on a directory. It also holds which users and groups own files nested under the directory, and what the file types are.

type Error

type Error string

func (Error) Error

func (e Error) Error() string

type Filter

type Filter struct {
	GIDs []uint32
	UIDs []uint32
	FTs  []summary.DirGUTFileType
}

Filter can be applied to a GUT to see if it has one of the specified GIDs, UIDs and FTs, in which case it passes the filter.

If the Filter has one of those properties set to nil, or the whole Filter is nil, a GUT will be considered to pass the filter.

The exception to this is when FTs != []{DGUTFileTypeTemp}, and the GUT has an FT of DGUTFileTypeTemp. A GUT for a temporary file will always fail to pass the filter unless filtering specifically for temporary files, because other GUT objects will represent the same file on disk but with another file type, and you won't want to double-count.

type GUT

type GUT struct {
	GID   uint32
	UID   uint32
	FT    summary.DirGUTFileType
	Count uint64
	Size  uint64
	Atime int64 // seconds since Unix epoch
	Mtime int64 // seconds since Unix epoch
}

GUT handles group,user,type,count,size information.

func (*GUT) PassesFilter

func (g *GUT) PassesFilter(filter *Filter) (bool, bool)

PassesFilter checks to see if this GUT has a GID in the filter's GIDs (considered true if GIDs is nil), and has a UID in the filter's UIDs (considered true if UIDs is nil), and has an FT in the filter's FTs (considered true if FTs is nil). The second bool returned will match the first unless FT is DGUTFileTypeTemp, in which case it will be false (unless the filter FTs == []{DGUTFileTypeTemp}).
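
The two return values can be pictured with the following sketch, written from the description above and not taken from the package source. FileType stands in for summary.DirGUTFileType. FileTypeTemp uses the value 1 that GUTs.Summary documents for "temp", while FileTypeOther and its value are hypothetical.

```go
package main

import "fmt"

// FileType stands in for summary.DirGUTFileType in this sketch.
type FileType uint8

const (
	FileTypeTemp  FileType = 1 // documented: FT 1 is "temp"
	FileTypeOther FileType = 2 // hypothetical value for any non-temp type
)

type Filter struct {
	GIDs []uint32
	UIDs []uint32
	FTs  []FileType
}

type GUT struct {
	GID, UID uint32
	FT       FileType
}

// hasID reports whether id is in ids; a nil slice matches everything.
func hasID(ids []uint32, id uint32) bool {
	if ids == nil {
		return true
	}
	for _, v := range ids {
		if v == id {
			return true
		}
	}
	return false
}

// hasFT reports whether ft is in fts; a nil slice matches everything.
func hasFT(fts []FileType, ft FileType) bool {
	if fts == nil {
		return true
	}
	for _, v := range fts {
		if v == ft {
			return true
		}
	}
	return false
}

// PassesFilter returns whether the GUT passes the filter, and whether it
// still passes once temp GUTs are disallowed to avoid double-counting.
func (g *GUT) PassesFilter(f *Filter) (bool, bool) {
	if f == nil {
		return true, g.FT != FileTypeTemp
	}
	if !hasID(f.GIDs, g.GID) || !hasID(f.UIDs, g.UID) || !hasFT(f.FTs, g.FT) {
		return false, false
	}

	onlyTemp := len(f.FTs) == 1 && f.FTs[0] == FileTypeTemp

	return true, g.FT != FileTypeTemp || onlyTemp
}

func main() {
	temp := &GUT{GID: 1, UID: 354, FT: FileTypeTemp}

	fmt.Println(temp.PassesFilter(nil))
	fmt.Println(temp.PassesFilter(&Filter{FTs: []FileType{FileTypeTemp}}))
	fmt.Println(temp.PassesFilter(&Filter{UIDs: []uint32{99}}))
}
```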

type GUTs

type GUTs []*GUT

GUTs is a slice of *GUT, offering ways to filter and summarise the information in our *GUTs.

func (GUTs) Summary

func (g GUTs) Summary(filter *Filter) (uint64, uint64, int64, int64, []uint32, []uint32, []summary.DirGUTFileType)

Summary sums the count and size of all our GUT elements and returns the results, along with the oldest atime and newest mtime (in seconds since Unix epoch) and lists of the unique UIDs, GIDs and FTs in our GUT elements.

Provide a Filter to ignore GUT elements that do not match one of the specified GIDs, one of the UIDs, and one of the FTs. If one of those properties is nil, Summary does not filter on that property.

Provide nil to do no filtering.

Note that FT 1 is "temp" files, and because a file can be both temporary and another type, if your Filter's FTs slice doesn't contain just DGUTFileTypeTemp, any GUT with FT DGUTFileTypeTemp is always ignored. (But the FTs list will still indicate if you had temp files that passed other filters.)
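
The aggregation itself can be sketched as below. This is a simplification under stated assumptions: the GUT is cut down, only UIDs are collected, the hypothetical keep predicate stands in for the Filter, and the Totals struct replaces the multiple return values. Returning the UIDs sorted is also an assumption.

```go
package main

import (
	"fmt"
	"sort"
)

// GUT is cut down to the fields this sketch aggregates.
type GUT struct {
	UID   uint32
	Count uint64
	Size  uint64
	Atime int64 // seconds since Unix epoch
	Mtime int64 // seconds since Unix epoch
}

// Totals is a hypothetical struct replacing Summary's multiple return values.
type Totals struct {
	Count, Size  uint64
	Atime, Mtime int64
	UIDs         []uint32
}

// summarise sums the kept GUTs. keep stands in for the Filter; nil means no
// filtering.
func summarise(guts []*GUT, keep func(*GUT) bool) Totals {
	var t Totals

	seen := make(map[uint32]bool)
	first := true

	for _, g := range guts {
		if keep != nil && !keep(g) {
			continue
		}

		t.Count += g.Count
		t.Size += g.Size

		if first || g.Atime < t.Atime {
			t.Atime = g.Atime // oldest access time so far
		}
		if first || g.Mtime > t.Mtime {
			t.Mtime = g.Mtime // newest modification time so far
		}
		first = false

		if !seen[g.UID] {
			seen[g.UID] = true
			t.UIDs = append(t.UIDs, g.UID)
		}
	}

	sort.Slice(t.UIDs, func(i, j int) bool { return t.UIDs[i] < t.UIDs[j] })

	return t
}

func main() {
	guts := []*GUT{
		{UID: 354, Count: 2, Size: 20, Atime: 100, Mtime: 200},
		{UID: 10, Count: 1, Size: 5, Atime: 50, Mtime: 300},
		{UID: 354, Count: 3, Size: 30, Atime: 150, Mtime: 250},
	}

	fmt.Printf("%+v\n", summarise(guts, nil))
	fmt.Printf("%+v\n", summarise(guts, func(g *GUT) bool { return g.UID == 354 }))
}
```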

type Tree

type Tree struct {
	// contains filtered or unexported fields
}

Tree is used to do high-level queries on DB.Store() database files.

func NewTree

func NewTree(paths ...string) (*Tree, error)

NewTree, given the paths to one or more dgut database files (as created by DB.Store()), returns a *Tree that can be used to do high-level queries on the stats of a tree of disk folders. You should Close() the tree after use.

func (*Tree) Close

func (t *Tree) Close()

Close should be called after you've finished querying the tree to release its database locks.

func (*Tree) DirHasChildren

func (t *Tree) DirHasChildren(dir string, filter *Filter) bool

DirHasChildren tells you if the given directory has any child directories with files in them that pass the filter. See GUTs.Summary for an explanation of the filter.

func (*Tree) DirInfo

func (t *Tree) DirInfo(dir string, filter *Filter) (*DirInfo, error)

DirInfo tells you the total number of files and their total size nested under the given directory, along with the UIDs and GIDs that own those files. See GUTs.Summary for an explanation of the filter.

It also tells you the same information about the immediate child directories of the given directory (if the children have files in them that pass the filter).

Returns an error if dir doesn't exist.

func (*Tree) FileLocations

func (t *Tree) FileLocations(dir string, filter *Filter) (DCSs, error)

FileLocations, starting from the given dir, finds the first directory that directly contains filter-passing files along every branch from dir.

See GUTs.Summary for an explanation of the filter.

The results are returned sorted by directory.
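
One reading of that search is sketched below over a hypothetical in-memory map of directory to directly contained passing-file count. The tree type and its methods are illustrative stand-ins for the database, and stopping at dir itself when it directly holds files is an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// tree maps a directory to the number of filter-passing files directly
// inside it. It is a hypothetical stand-in for the dgut database.
type tree map[string]uint64

// children returns, sorted, the immediate sub-directories of dir that have
// files nested under them.
func (t tree) children(dir string) []string {
	prefix := strings.TrimSuffix(dir, "/") + "/"
	seen := make(map[string]bool)

	var kids []string

	for d := range t {
		if !strings.HasPrefix(d, prefix) || d == prefix {
			continue
		}

		child := prefix + strings.SplitN(strings.TrimPrefix(d, prefix), "/", 2)[0]
		if !seen[child] {
			seen[child] = true
			kids = append(kids, child)
		}
	}

	sort.Strings(kids)

	return kids
}

// fileLocations walks down every branch from dir and stops on each branch at
// the first directory that directly contains files.
func (t tree) fileLocations(dir string) []string {
	if t[dir] > 0 {
		return []string{dir}
	}

	var found []string
	for _, kid := range t.children(dir) {
		found = append(found, t.fileLocations(kid)...)
	}

	sort.Strings(found)

	return found
}

func main() {
	files := tree{"/a/b/c/d": 2, "/a/b/c/d/1": 1, "/a/b/e/f/g": 2}

	fmt.Println(files.fileLocations("/"))
}
```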

func (*Tree) Where

func (t *Tree) Where(dir string, filter *Filter, depth int) (DCSs, error)

Where tells you where files are nested under dir that pass the filter. With a depth of 0 it only returns the single deepest directory that has all passing files nested under it.

With a depth of 1, it also returns the results that calling Where() with a depth of 0 on each of the deepest directory's children would give. And so on recursively for higher depths.

See GUTs.Summary for an explanation of the filter.

For example, if all user 354's files are in the directories /a/b/c/d (2 files), /a/b/c/d/1 (1 file), /a/b/c/d/2 (2 files) and /a/b/e/f/g (2 files), Where("/", &Filter{UIDs: []uint32{354}}, 0) would tell you that "/a/b" has 7 files. With a depth of 1 it would tell you that "/a/b" has 7 files, "/a/b/c/d" has 5 files and "/a/b/e/f/g" has 2 files. With a depth of 2 it would tell you that "/a/b" has 7 files, "/a/b/c/d" has 5 files, "/a/b/c/d/1" has 1 file, "/a/b/c/d/2" has 2 files, and "/a/b/e/f/g" has 2 files.

The returned DirSummarys are sorted by Size, largest first.

Returns an error if dir doesn't exist.
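
The recursion described above can be reproduced with a small in-memory sketch. It is not the package's implementation: tree, result and their methods are hypothetical stand-ins, results are ordered by file count because the sketch has no sizes, and an unknown dir yields a zero count instead of an error.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// tree maps a directory to the number of filter-passing files directly
// inside it. It is a hypothetical stand-in for the dgut database.
type tree map[string]uint64

// result is a cut-down DirSummary: a directory and its nested file count.
type result struct {
	Dir   string
	Count uint64
}

// prefixOf returns dir with exactly one trailing slash.
func prefixOf(dir string) string {
	return strings.TrimSuffix(dir, "/") + "/"
}

// nested counts the files in dir and in every directory below it.
func (t tree) nested(dir string) uint64 {
	var n uint64
	for d, c := range t {
		if d == dir || strings.HasPrefix(d, prefixOf(dir)) {
			n += c
		}
	}
	return n
}

// children returns, sorted, the immediate sub-directories of dir that have
// files nested under them.
func (t tree) children(dir string) []string {
	prefix := prefixOf(dir)
	seen := make(map[string]bool)

	var kids []string
	for d := range t {
		if !strings.HasPrefix(d, prefix) || d == prefix {
			continue
		}
		child := prefix + strings.SplitN(strings.TrimPrefix(d, prefix), "/", 2)[0]
		if !seen[child] {
			seen[child] = true
			kids = append(kids, child)
		}
	}
	sort.Strings(kids)
	return kids
}

// where descends to the deepest directory that has all files nested under
// it, then, while depth remains, repeats the search on each child.
func (t tree) where(dir string, depth int) []result {
	for t[dir] == 0 {
		kids := t.children(dir)
		if len(kids) != 1 {
			break
		}
		dir = kids[0]
	}

	results := []result{{Dir: dir, Count: t.nested(dir)}}
	if depth > 0 {
		for _, kid := range t.children(dir) {
			results = append(results, t.where(kid, depth-1)...)
		}
	}

	// Largest first; count is used here in place of Size.
	sort.SliceStable(results, func(i, j int) bool { return results[i].Count > results[j].Count })
	return results
}

func main() {
	files := tree{"/a/b/c/d": 2, "/a/b/c/d/1": 1, "/a/b/c/d/2": 2, "/a/b/e/f/g": 2}

	for depth := 0; depth <= 2; depth++ {
		fmt.Println("depth", depth)
		for _, r := range files.where("/", depth) {
			fmt.Printf("  %s: %d\n", r.Dir, r.Count)
		}
	}
}
```

Run against the directories from the example, the sketch reports the same directories and counts at depths 0, 1 and 2.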
