fileutils

v0.0.0-...-e58c9c3
This package is not in the latest version of its module.
Published: Apr 17, 2024 | License: MIT | Imports: 20 | Imported by: 20

README

Simple file utilities for golang, written for working with markup files and other file types (like images) typically associated with documentation.

Included are types for describing a file and configuring a set of associated output files:

  • InputFile describes a file: full path, size, MIME type, an IsXML flag, MMCtype, and contents (up to 2 megabytes).

  • MMCtype is meant to function like a MIME type and has three fields. It can be set based on file name and contents, and later updated if the file is XML and has a DOCTYPE declaration. Refer to the file mmctype.go.

  • OutputFiles makes it easier to create a group of like-named files for an InputFile, in the same directory or optionally in a like-named subdirectory.

Known issues

  • Tested only on macOS (i.e. it's sure to fail on Windows)

Example

$ cd /opt
$ ls example*
example.xml

import "github.com/fbaube/fileutils"

IF, _ := fileutils.NewInputFile("example.xml")

fmt.Fprintf(os.Stdout, "You opened: %s \n", IF)
// You opened: /opt/example.xml
println("i.e.", IF.DString())
// i.e. InputFile</opt/example.xml>sz<42>dir?<n>bin?<n>img?<n>mime<text/plain>

// Argument is not "": Creates a subdirectory for associated output files.
OF, _ := IF.NewOutputFiles("_myapp")

// Creates an associated file and returns the io.WriteCloser
w_diag, _ := OF.NewOutputExt("diag")
fmt.Fprintln(w_diag, "Lots of diagnostic info")
w_diag.Close()

$ ls example*
example.xml

example.xml_myapp:
example.diag

Dependencies

  • github.com/hosom/gomagic for MIME type analysis
  • github.com/pkg/errors for wrapping errors
  • github.com/fbaube/stringutils for various string utilities

Documentation

Overview

Dependencies

This package imports github.com/fbaube/(stringutils,wasmutils)

Dependencies on package os

Note that for simplicity and correctness, this package should depend as much as possible on these standard library packages:

1) "path" https://golang.org/pkg/path/

Funcs: Base(s) Clean(s) Dir(s) Ext(s) IsAbs(s) Join(s..) Split(s) Match(..)

Package path has utility routines for manipulating slash-separated paths. Use it only for paths separated by forward slashes, such as URL paths. It does not deal with Windows paths with drive letters or backslashes; to handle O/S paths, use package path/filepath.

2) "path/filepath" https://golang.org/pkg/path/filepath/

Funcs: as for package "path" above but optimised for file paths, plus: Abs(s) EvalSymlinks(s) FromSlash(s) Glob(s) Rel(base,target) SplitList(s) ToSlash(s) VolumeName(s) Walk(root string, walkFn WalkFunc), and type WalkFunc

3) "os" https://golang.org/pkg/os/

A mix of pure functions and os.File methods. See comments below in source file.

4) Maybe also some dependencies on package aferoutils
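
As a rough illustration of the split between "path" and "path/filepath" described above, the following sketch joins the same segments both ways. The helper names urlJoin and osJoin are made up for this example and are not part of this package.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

// urlJoin joins slash-separated segments (such as URL paths) using
// package path, which always uses forward slashes.
func urlJoin(elem ...string) string { return path.Join(elem...) }

// osJoin joins segments using package path/filepath, which uses the
// separator of the host operating system.
func osJoin(elem ...string) string { return filepath.Join(elem...) }

func main() {
	// Join also cleans the result, so ".." segments are resolved.
	fmt.Println(urlJoin("docs", "../img", "logo.png"))
	// The separator in this result depends on the host O/S.
	fmt.Println(osJoin("docs", "..", "img", "logo.png"))
	// ToSlash converts an O/S path back to forward slashes.
	fmt.Println(filepath.ToSlash(osJoin("a", "b")))
}
```

Using "path" for URL-like strings and "path/filepath" for anything that touches the disk keeps the two kinds of path from being mixed up.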

Constants

const MAX_FILE_SIZE = 4000000

MAX_FILE_SIZE is set (arbitrarily) to 4 megabytes

const PathSep = string(os.PathSeparator)

A token nod to Windoze compatibility.

Variables

This section is empty.

Functions

func AbsWRT

func AbsWRT(problyRelFP string, wrtDir string) string

AbsWRT is like "filepath.Abs(..)": it can convert a possibly-relative filepath to an absolute filepath. The difference is that a relative filepath argument is not resolved w.r.t. the current working directory; it is instead resolved w.r.t. the supplied directory argument.
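
A minimal sketch of this behaviour, using only the standard library. The function absWRT below is a hypothetical stand-in written for illustration; the package's actual implementation may differ (for example in how it treats an empty or relative wrtDir).

```go
package main

import (
	"fmt"
	"path/filepath"
)

// absWRT is a hypothetical stand-in for AbsWRT: an absolute path is
// only cleaned, while a relative path is joined onto wrtDir instead
// of the current working directory.
func absWRT(maybeRelFP, wrtDir string) string {
	if filepath.IsAbs(maybeRelFP) {
		return filepath.Clean(maybeRelFP)
	}
	return filepath.Join(wrtDir, maybeRelFP)
}

func main() {
	// The paths below assume a Unix-like system.
	fmt.Println(absWRT("img/logo.png", "/opt/docs"))
	fmt.Println(absWRT("../shared/a.dtd", "/opt/docs"))
	fmt.Println(absWRT("/etc/hosts", "/opt/docs"))
}
```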

func AppendToFileBaseName

func AppendToFileBaseName(name, toAppend string) string

func ClearAndCreateDirectory

func ClearAndCreateDirectory(path string) error

ClearAndCreateDirectory deletes the directory before re-creating it. The older version (named "ClearDirectory") tried to keep the directory as-is while emptying it.

func ClearDirectory

func ClearDirectory(path string) error

ClearDirectory tries to keep the directory as-is while emptying it.

func CopyDirRecursivelyFromTo

func CopyDirRecursivelyFromTo(src string, dst string) error

CopyDirRecursivelyFromTo copies a whole directory recursively. BOTH arguments should be directories! Otherwise, hilarity ensues.

func CopyFileFromTo

func CopyFileFromTo(src, dst string) error

CopyFileFromTo copies a single file from src to dst.

func CopyFileGreedily

func CopyFileGreedily(src string, dst string) error

CopyFileGreedily reads the entire file into memory, and is therefore memory-constrained!

func CopyFromTo

func CopyFromTo(src, dst string) error

CopyFromTo copies the contents of src to dst atomically, using a temp file as intermediary.
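
The temp-file technique can be sketched roughly as follows. This is an assumption about the general approach, not the package's code: copyAtomic and demo are hypothetical names, and the sketch relies on os.Rename replacing the destination in one step, which holds on typical Unix file systems when both paths are in the same directory.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// copyAtomic copies src into a temp file created next to dst, then
// renames the temp file onto dst, so readers never see a partial dst.
func copyAtomic(src, dst string) (err error) {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	tmp, err := os.CreateTemp(filepath.Dir(dst), ".tmp-*")
	if err != nil {
		return err
	}
	defer func() {
		if err != nil {
			os.Remove(tmp.Name()) // best-effort cleanup on failure
		}
	}()
	if _, err = io.Copy(tmp, in); err != nil {
		tmp.Close()
		return err
	}
	if err = tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), dst)
}

// demo copies a small file inside a scratch directory and returns
// the text of the copy.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "copyatomic")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	src := filepath.Join(dir, "a.txt")
	dst := filepath.Join(dir, "b.txt")
	if err := os.WriteFile(src, []byte("hello"), 0o644); err != nil {
		return "", err
	}
	if err := copyAtomic(src, dst); err != nil {
		return "", err
	}
	b, err := os.ReadFile(dst)
	return string(b), err
}

func main() {
	fmt.Println(demo())
}
```

Creating the temp file in the destination's own directory matters: a rename across file systems is not atomic and may fail outright.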

func CreateEmpty

func CreateEmpty(path AbsFilePath) (*os.File, error)

CreateEmpty opens the filepath as a writable empty file, truncating it if it exists and is non-empty.

func DirectoryContents

func DirectoryContents(f *os.File) ([]os.FileInfo, error)

DirectoryContents returns the results of "(*os.File)Readdir(..)". "File.Name()" might be a relative filepath but if it was opened okay then it at least functions as an absolute filepath. If the path is not a directory then it panics.

The call to "Readdir(..)" reads the contents of the directory associated with arg "File" and returns a slice of "FileInfo" values, as would be returned by "Lstat(..)", in directory order.

func DirectoryFiles

func DirectoryFiles(f *os.File) (int, []os.FileInfo, error)

DirectoryFiles is like "DirectoryContents(..)" except that results that are directories (not files) are nil'ed out. If there were entries but none were files, it returns (0, nil, nil).
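
A sketch of the filtering described here, under the assumption that the returned count is the number of non-directory entries. The names filesOnly and demo are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// filesOnly reads all entries of an open directory, sets the entries
// that are directories to nil, and returns the number of entries left.
// If none are left, it returns (0, nil, nil).
func filesOnly(d *os.File) (int, []os.FileInfo, error) {
	infos, err := d.Readdir(-1)
	if err != nil {
		return 0, nil, err
	}
	n := 0
	for i, fi := range infos {
		if fi.IsDir() {
			infos[i] = nil
		} else {
			n++
		}
	}
	if n == 0 {
		return 0, nil, nil
	}
	return n, infos, nil
}

// demo builds a scratch directory holding one file and one
// subdirectory, and returns the file count and the slice length.
func demo() (int, int, error) {
	dir, err := os.MkdirTemp("", "dirfiles")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "a.txt"), []byte("x"), 0o644); err != nil {
		return 0, 0, err
	}
	if err := os.Mkdir(filepath.Join(dir, "sub"), 0o755); err != nil {
		return 0, 0, err
	}
	d, err := os.Open(dir)
	if err != nil {
		return 0, 0, err
	}
	defer d.Close()
	n, infos, err := filesOnly(d)
	return n, len(infos), err
}

func main() {
	fmt.Println(demo())
}
```

Because directory entries are nil'ed out rather than removed, callers must check each element for nil before using it.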

func Enhomed

func Enhomed(s string) string

Enhomed shortens a filepath by substituting "~".
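
One plausible reading of this substitution, as a sketch. The function enhomed below is hypothetical and takes the home directory as an explicit argument so that it is easy to test; the package's own function takes only the path.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// enhomed replaces a leading home-directory prefix with "~".
// Paths outside the home directory are returned unchanged.
func enhomed(p, home string) string {
	if home == "" {
		return p
	}
	if p == home {
		return "~"
	}
	// Require a separator after the prefix, so that /home/ann
	// does not match /home/annette.
	if strings.HasPrefix(p, home+string(os.PathSeparator)) {
		return "~" + p[len(home):]
	}
	return p
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Println("no home directory:", err)
		return
	}
	fmt.Println(enhomed(filepath.Join(home, "docs", "a.xml"), home))
}
```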

func EnsureTrailingPathSep

func EnsureTrailingPathSep(s string) string

func Exists

func Exists(path string) bool

Exists returns true *iff* the file exists and is in fact a file.

func GetHomeDir

func GetHomeDir() string

GetHomeDir is a convenience function, and refers to the invoking user's home directory.

func GetStringFromStdin

func GetStringFromStdin() (string, error)

GetStringFromStdin reads "os.Stdin" completely and returns a string.

func IsDirAndExists

func IsDirAndExists(path string) bool

IsDirAndExists returns true *iff* the directory exists and is in fact a directory.

func IsFileAtPath

func IsFileAtPath(aPath string) (bool, os.FileInfo, error)

IsFileAtPath checks that the file exists AND that it is "regular" (not dir, symlink, pipe), and also returns size and permissions in the returned os.FileInfo.

Return values:

  • (true, *FileInfo, nil) if a regular file exists (but can be 0-len!)
  • (false, *FileInfo, nil) if something else exists (incl. dir)
  • (false, nil, nil) if nothing at all exists
  • (false, nil, anError) if some unusual error was returned (failing disk?)

Notes & caveats:

  • File emptiness (i.e. length 0) is not checked
  • "~" for user home dir is not expanded and will fail
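
The four return cases listed above can be sketched with os.Lstat. The function isFileAtPath and the helper demo are hypothetical illustrations, not the package's code.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// isFileAtPath reports whether a regular file exists at p.
// Non-existence is not treated as an error.
func isFileAtPath(p string) (bool, os.FileInfo, error) {
	fi, err := os.Lstat(p)
	if errors.Is(err, fs.ErrNotExist) {
		return false, nil, nil // nothing at all exists
	}
	if err != nil {
		return false, nil, err // some unusual error
	}
	// True for regular files (even empty ones), false for
	// directories, symlinks, pipes, and so on.
	return fi.Mode().IsRegular(), fi, nil
}

// demo checks an empty file, a directory, and a missing path.
func demo() (file, dir, missing bool, err error) {
	tmp, err := os.MkdirTemp("", "isfile")
	if err != nil {
		return
	}
	defer os.RemoveAll(tmp)
	f := filepath.Join(tmp, "empty.txt")
	if err = os.WriteFile(f, nil, 0o644); err != nil {
		return
	}
	file, _, _ = isFileAtPath(f)
	dir, _, _ = isFileAtPath(tmp)
	missing, _, _ = isFileAtPath(filepath.Join(tmp, "nope"))
	return
}

func main() {
	fmt.Println(demo())
}
```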


func IsNonEmpty

func IsNonEmpty(path string) bool

IsNonEmpty returns true *iff* the file exists *and* contains at least one byte of data.

func IsXML

func IsXML(path string) bool

IsXML returns true *iff* the file exists *and* appears to be XML. The check is simple though.

func MTypeSub

func MTypeSub(mtype string, i int) string

func MakeDirectoryExist

func MakeDirectoryExist(path string) error

MakeDirectoryExist might not create it ?! (NOTE)

func Must

func Must(f *os.File, e error) *os.File

Must wraps this package's most common return values and panics if it gets an error.

func OpenRO

func OpenRO(path string) (f *os.File, e error)

OpenRO opens (and returns) the filepath as a readable file.

func OpenRW

func OpenRW(path string) (f *os.File, e error)

OpenRW opens (and returns) the filepath as a writable file. An existing file is not truncated, merely opened.

func ResolvePath

func ResolvePath(s string) string

ResolvePath is needed because functions in package path/filepath do not handle "~" (home directory) well. If an error occurs (for whatever reason), we punt: simply return the original input argument.
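
A sketch of the idea, assuming Unix-style "~" and "~/" prefixes and assuming that the result is then made absolute with filepath.Abs (the documentation above does not say so). The names expandTilde and resolvePath are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// expandTilde replaces a leading "~" with home. Any other path,
// including one with "~" elsewhere in it, passes through unchanged.
func expandTilde(s, home string) string {
	if s == "~" || strings.HasPrefix(s, "~/") {
		return filepath.Join(home, s[1:])
	}
	return s
}

// resolvePath expands "~" and makes the path absolute. On any error
// it punts and returns the original argument.
func resolvePath(s string) string {
	home, err := os.UserHomeDir()
	if err != nil {
		return s
	}
	abs, err := filepath.Abs(expandTilde(s, home))
	if err != nil {
		return s
	}
	return abs
}

func main() {
	fmt.Println(resolvePath("~/docs/a.xml"))
	fmt.Println(resolvePath("notes.txt"))
}
```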

func SameContents

func SameContents(f1, f2 *os.File) bool

SameContents reports whether the two files' contents are identical.
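
One way to do such a comparison without loading both files into memory is to read them in fixed-size chunks. The sketch below works on any io.Reader (an *os.File is one); sameContents and sameText are hypothetical names, and the package's function returns only a bool.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// sameContents compares two streams chunk by chunk and stops at the
// first difference.
func sameContents(a, b io.Reader) (bool, error) {
	bufA := make([]byte, 32*1024)
	bufB := make([]byte, 32*1024)
	for {
		nA, errA := io.ReadFull(a, bufA)
		nB, errB := io.ReadFull(b, bufB)
		// ReadFull signals the end of input with one of these two errors.
		endA := errA == io.EOF || errA == io.ErrUnexpectedEOF
		endB := errB == io.EOF || errB == io.ErrUnexpectedEOF
		if errA != nil && !endA {
			return false, errA
		}
		if errB != nil && !endB {
			return false, errB
		}
		if !bytes.Equal(bufA[:nA], bufB[:nB]) {
			return false, nil
		}
		if endA || endB {
			return true, nil // equal final chunks, so both ended together
		}
	}
}

// sameText is a small convenience wrapper for in-memory strings.
func sameText(a, b string) bool {
	same, err := sameContents(strings.NewReader(a), strings.NewReader(b))
	return err == nil && same
}

func main() {
	fmt.Println(sameText("abc", "abc"))
	fmt.Println(sameText("abc", "abcd"))
}
```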

func SessionSummary

func SessionSummary() string

SessionSummary can be called anytime.

func TempDir

func TempDir(dest string) string

TempDir checks and returns the value of the environment variable `TMPDIR`.

func WriteAtomic

func WriteAtomic(dest string, write func(w io.Writer) error) (err error)

WriteAtomic is TBS.

func XmlAttrS

func XmlAttrS(a xml.Attr) string

func XmlNameS

func XmlNameS(n xml.Name) string

func XmlStartElmS

func XmlStartElmS(se xml.StartElement) string

Types

type AbsFilePath

type AbsFilePath string

AbsFilePath is a new type, based on `string`. It serves three purposes:

  • clarify and bring correctness to the processing of absolute path arguments
  • permit the use of a clearly named struct field
  • permit the definition of methods on the type

Note that when working with an `os.File`, `Name()` returns the name of the file as was passed to `Open(..)`, so it might be a relative filepath.

func AbsFP

func AbsFP(relFP string) AbsFilePath

AbsFP is like filepath.Abs(..) except using our own types.

func (AbsFilePath) Append

func (afp AbsFilePath) Append(rfp string) AbsFilePath

Append is a convenience function to keep code cleaner.

func (AbsFilePath) BaseName

func (afp AbsFilePath) BaseName() string

func (AbsFilePath) DirExists

func (afp AbsFilePath) DirExists() bool

DirExists returns true *iff* the directory exists and is in fact a directory.

func (AbsFilePath) DirPath

func (afp AbsFilePath) DirPath() AbsFilePath

func (AbsFilePath) Enhomed

func (afp AbsFilePath) Enhomed() string

func (AbsFilePath) Exists

func (afp AbsFilePath) Exists() bool

Exists returns true *iff* the file exists and is in fact a file.

func (AbsFilePath) FileExt

func (afp AbsFilePath) FileExt() string

func (AbsFilePath) FileSize

func (afp AbsFilePath) FileSize() int

FileSize returns the size *iff* the filepath exists and is in fact a file.

func (AbsFilePath) HasPrefix

func (afp AbsFilePath) HasPrefix(beg AbsFilePath) bool

HasPrefix is like strings.HasPrefix(..) but uses our types.

func (AbsFilePath) OpenExistingDir

func (afp AbsFilePath) OpenExistingDir() (f *os.File, e error)

OpenExistingDir returns the directory *iff* it exists and can be opened for reading. Note that the `os.File` can be nil without error. Thus we cannot (or: *do not*) distinguish between non-existence and an actual error. OTOH if it exists but is not a directory, return an error.
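
The return contract described here can be sketched as follows. The function openExistingDir is a hypothetical free-function version written for illustration; the package's method is defined on AbsFilePath.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// openExistingDir returns (dir, nil) for an openable directory,
// (nil, nil) if the path is missing or cannot be opened, and
// (nil, error) if the path exists but is not a directory.
func openExistingDir(p string) (*os.File, error) {
	f, err := os.Open(p)
	if err != nil {
		return nil, nil // missing vs. unreadable: not distinguished
	}
	fi, err := f.Stat()
	if err != nil {
		f.Close()
		return nil, nil
	}
	if !fi.IsDir() {
		f.Close()
		return nil, fmt.Errorf("%s exists but is not a directory", p)
	}
	return f, nil
}

// demo exercises the three cases inside a scratch directory.
func demo() (dirOK, missingQuiet, fileErr bool) {
	tmp, err := os.MkdirTemp("", "opendir")
	if err != nil {
		return
	}
	defer os.RemoveAll(tmp)
	file := filepath.Join(tmp, "a.txt")
	if err := os.WriteFile(file, nil, 0o644); err != nil {
		return
	}
	d, err := openExistingDir(tmp)
	dirOK = d != nil && err == nil
	if d != nil {
		d.Close()
	}
	m, err := openExistingDir(filepath.Join(tmp, "nope"))
	missingQuiet = m == nil && err == nil
	_, err = openExistingDir(file)
	fileErr = err != nil
	return
}

func main() {
	fmt.Println(demo())
}
```

Callers therefore have to check the returned file for nil even when the error is nil.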

func (AbsFilePath) OpenOrCreateDir

func (afp AbsFilePath) OpenOrCreateDir() (f *os.File, e error)

OpenOrCreateDir returns the opened directory if (a) the directory exists and can be opened, or (b) it does not exist but can be created anew.

func (AbsFilePath) S

func (afp AbsFilePath) S() string

S is a utility method to keep code cleaner.

func (AbsFilePath) Tildotted

func (afp AbsFilePath) Tildotted() string

type FSEntry

type FSEntry interface {
	// ON.Nord // this should remain decoupled !
	fs.File
	IsFile() bool
	IsDir() bool
	IsSymlink() bool
	// File: nr bytes; Dir: nr files.
	Size() int
	// These methods use [Lstat] rather than [Stat],
	// so symlinks must be resolved "manually".
	ResolveSymlink() (string, error)
	// Refresh reloads the [FileInfo]
	// and returns: Did it change ?
	Refresh() bool
}

FSEntry handles read-only access and attributes for files, dirs, and symlinks; it extends fs.File (also R/O), which comprises

  • Stat() (FileInfo, error)
  • Read([]byte) (int, error)
  • Close() error

Related: there is also interface io/fs.ReadFileFS:

  • FS
  • ReadFile(name string) ([]byte, error)
  • Success returns a nil error, not io.EOF.
  • The caller may modify the returned byte slice, so this method should return a copy of the underlying data.
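
For illustration, a whole-file read through the io/fs API might look like this. The sketch uses testing/fstest.MapFS as an in-memory file system; readAll and demo are hypothetical names.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// readAll reads a whole file through the io/fs API. fs.ReadFile uses
// the file system's own ReadFile method when fsys implements
// fs.ReadFileFS, and otherwise falls back to Open plus Read.
func readAll(fsys fs.FS, name string) (string, error) {
	b, err := fs.ReadFile(fsys, name)
	return string(b), err
}

// demo builds a tiny in-memory file system (fstest.MapFS implements
// fs.ReadFileFS) and reads one file from it.
func demo() (string, error) {
	fsys := fstest.MapFS{
		"docs/a.md": {Data: []byte("# Title\n")},
	}
	return readAll(fsys, "docs/a.md")
}

func main() {
	s, err := demo()
	fmt.Printf("%q %v\n", s, err)
}
```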

Related: there is also interface io/fs.ReadDirFS:

  • FS
  • ReadDir(name string) ([]DirEntry, error)

Related: there is also interface io/fs.ReadDirFile:

  • File
  • ReadDir(n int) ([]DirEntry, error)
  • ReadDir reads the contents of the directory and returns a slice of up to n DirEntry values in directory order. Subsequent calls on the same file will yield further DirEntry values.
  • If n > 0, ReadDir returns at most n DirEntry structures. In this case, if ReadDir returns an empty slice, it will return a non-nil error explaining why. At the end of a directory, the error is io.EOF. (ReadDir must return io.EOF itself, not an error wrapping io.EOF.)
  • If n <= 0, ReadDir returns all the DirEntry values from the directory in a single slice. In this case, if ReadDir succeeds (reads all the way to the end of the directory), it returns the slice and a nil error. If it encounters an error before the end of the directory, ReadDir returns the DirEntry list read until that point and a non-nil error.
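
The batch-reading contract of ReadDir can be exercised as in the sketch below, which relies on *os.File implementing fs.ReadDirFile. The names countEntries and demo are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"io/fs"
	"os"
	"path/filepath"
)

// countEntries reads a directory in batches of at most n entries and
// returns the total number of entries seen.
func countEntries(d fs.ReadDirFile, n int) (int, error) {
	total := 0
	for {
		batch, err := d.ReadDir(n)
		total += len(batch)
		switch {
		case err == io.EOF:
			return total, nil // end of directory when n > 0
		case err != nil:
			return total, err
		case n <= 0:
			return total, nil // a single call already returned everything
		}
	}
}

// demo creates five files in a scratch directory and counts them
// two at a time.
func demo() (int, error) {
	dir, err := os.MkdirTemp("", "readdir")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	for i := 0; i < 5; i++ {
		name := filepath.Join(dir, fmt.Sprintf("f%d.txt", i))
		if err := os.WriteFile(name, nil, 0o644); err != nil {
			return 0, err
		}
	}
	d, err := os.Open(dir)
	if err != nil {
		return 0, err
	}
	defer d.Close()
	return countEntries(d, 2)
}

func main() {
	fmt.Println(demo())
}
```

The explicit n <= 0 case matters: in that mode the end of the directory is reported with a nil error, so a loop that waits for io.EOF would never terminate.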

FSEntry adds convenience funcs for easy access to file attributes.

It can co-exist with interface [orderednodes.Nord].

It could also be rewritten to extend [os.File], which provides R/W access, but that is not the model of package io/fs.

type FSItem

type FSItem struct {
	FileMeta
	CT.TypedRaw
	FPs Filepaths
}

FSItem is an item identified by a filepath (plus its contents) that we have read, will read, or will create. It might be a directory or symlink, either of which requires further processing elsewhere. In the most common usage, it is a file. Its filepath(s) can be empty ("") if (for example) its content was created interactively.

NOTE that the file name (the part of the full path after the last directory separator) is not stored separately: it is stored in the AbsFP *and* the RelFP. Note also that this path & name information duplicates what is stored in an instance of orderednodes.Nord .

NOTE that the embedded field FileMeta embeds an os.FileInfo.

FSItem is embedded in struct datarepo/rowmodels/ContentityRow.

It might seem odd to include a [TypedRaw] rather than a plain [Raw]. But in general, when serializing and deserializing content ASTs, it is important to know what we are working with, because sometimes we can, want to, or have to do things like include HTML in Markdown, or permit HTML tags in LwDITA.

It might also seem odd that MU_type_DIRLIKE is a "markup type", but this avoids many practical problems encountered in trying to process file system trees.

NOTE that RelFP and AbsFP must be exported to be persisted to the DB.

func NewFSItem

func NewFSItem(fp string) (*FSItem, error)

NewFSItem takes a filepath (absolute or relative) and analyzes the object (assuming one exists) at the path. This func does not load and analyze the content.

Note that a relative path is appended to the CWD, which may not be the desired behavior; in such a case, use NewFSItemRelativeTo (below).

func NewFSItemWithContent

func NewFSItemWithContent(fp string) (*FSItem, error)

func (*FSItem) Debug

func (p *FSItem) Debug() string

Debug implements [Stringser].

func (*FSItem) Echo

func (p *FSItem) Echo() string

Echo implements [Stringser].

func (*FSItem) GoGetFileContents

func (p *FSItem) GoGetFileContents() error

GoGetFileContents reads in the file (assuming it is a file) into the field [Raw] and does a quick check for XML and HTML5 declarations.

It assumes that [LStat] has been called, and that the size of the file is known. Therefore this func is a no-op if func [BasicInfo.Size] returns 0, its zero value. Therefore do not call this if the argument's [BasicInfo] is uninitialized.

It is tolerant about non-files and empty files, returning nil for error.

The call it makes to os.Open defaults to R/W mode, although R/O would probably suffice.

func (*FSItem) Info

func (p *FSItem) Info() string

Info implements [Stringser].

func (*FSItem) IsDirlike

func (p *FSItem) IsDirlike() bool

func (*FSItem) IsWhat

func (p *FSItem) IsWhat() string

IsWhat is for use with functions from github.com/samber/lo. If the item does not exist, it returns "".

func (*FSItem) NewLinesFile

func (pPI *FSItem) NewLinesFile() (*LinesFile, error)

NewLinesFile is pretty self-explanatory.

func (*FSItem) ResolveSymlinks

func (p *FSItem) ResolveSymlinks() *FSItem

ResolveSymlinks will follow links until it finds something else. NOTE that this is a SECURITY HOLE.
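
A sketch of link-following with a hop limit, which is one way to guard against link cycles (it does not by itself close the security hole noted above). The names resolveSymlinks and demo are hypothetical; the standard library also offers filepath.EvalSymlinks. The demo assumes a platform where os.Symlink is permitted.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// resolveSymlinks follows links starting at p until it reaches
// something that is not a symlink, giving up after maxHops links.
func resolveSymlinks(p string, maxHops int) (string, error) {
	for i := 0; i < maxHops; i++ {
		fi, err := os.Lstat(p)
		if err != nil {
			return "", err
		}
		if fi.Mode()&fs.ModeSymlink == 0 {
			return p, nil
		}
		target, err := os.Readlink(p)
		if err != nil {
			return "", err
		}
		// A relative target is relative to the link's own directory.
		if !filepath.IsAbs(target) {
			target = filepath.Join(filepath.Dir(p), target)
		}
		p = target
	}
	return "", errors.New("too many levels of symbolic links")
}

// demo chains two links to a file and returns the base name of the
// path that the chain resolves to.
func demo() (string, error) {
	tmp, err := os.MkdirTemp("", "symlinks")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(tmp)
	file := filepath.Join(tmp, "real.txt")
	if err := os.WriteFile(file, nil, 0o644); err != nil {
		return "", err
	}
	l1 := filepath.Join(tmp, "l1")
	l2 := filepath.Join(tmp, "l2")
	if err := os.Symlink("real.txt", l1); err != nil { // relative target
		return "", err
	}
	if err := os.Symlink(l1, l2); err != nil { // absolute target
		return "", err
	}
	res, err := resolveSymlinks(l2, 8)
	if err != nil {
		return "", err
	}
	return filepath.Base(res), nil
}

func main() {
	fmt.Println(demo())
}
```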

func (*FSItem) String

func (p *FSItem) String() (s string)

type FSItemer

type FSItemer interface {
	// Exists is a convenience function.
	Exists() bool
	// Refresh does not check for changed type, it only checks
	// (a) existence, and (b) file size, writing to stdout if
	// either has changed.
	Refresh()
	// IsFile is a (somewhat foolproofed) convenience function.
	IsFile() bool
	// IsDir is a (somewhat foolproofed) convenience function.
	IsDir() bool
	// IsDirlike means (a) it can NOT contain own content and
	// (b) it is/has link(s) to other items (meaning it is a
	// directory or a symlink).
	IsDirlike() bool
	// IsSymlink is a (somewhat foolproofed) convenience function.
	IsSymlink() bool
}

FSItemer is implemented by *FileMeta, which embeds os.FileInfo.

type FileLine

type FileLine struct {
	CT.Raw        // string
	RawLineNr int // source file line number
	// contains filtered or unexported fields
}

FileLine is a record (i.e. a line) in a LinesFile.

type FileMeta

type FileMeta struct {
	os.FileInfo

	// error
	MU.Errer
	// contains filtered or unexported fields
}

FileMeta (ptr to it) implements FSItemer and is the most basic level of file system metadata: the results of a call to os.Stat (or the contents of a record in sqlar), lightly parsed.

NOTE that it is also used for directories and symlinks, so a more precise name would be FSItemMeta.

This struct is "mostly" applicable to non-file FS nodes and other hierarchical structures (like XML). For example:

  • for directories, Size() can be the number of files in it, and permissions can apply
  • for XML elements, Size() can apply

IsDir() is pass-thru. If the item is a directory, its name will end in a path separator (typically "/").

TODO: Size() is now pass-thru, but it could be overridden for directories (to return child item count), and might be overridden for a file that is modifiable/dynamic in memory.

func NewFileMeta

func NewFileMeta(inpath string) *FileMeta

NewFileMeta replaces a call to os.Lstat. A constructor of the form NewFileMeta(FileInfo) would not work, because it is the error returned by os.Lstat that indicates whether the file or dir (or symlink) exists. No further analysis of the path is performed in this func, because that is more properly done by the caller.

NOTE that if the file/dir does not exist, [exists] is false but no error is indicated (i.e. [error] is nil).

NOTE that by convention, directories should have a trailing path separator, and it is enforced here.

func (*FileMeta) Exists

func (p *FileMeta) Exists() bool

Exists is a convenience function.

func (*FileMeta) HasContents

func (p *FileMeta) HasContents() bool

HasContents is the opposite of [IsEmpty].

func (*FileMeta) IsDirlike

func (p *FileMeta) IsDirlike() bool

IsDirlike is, well, documented elsewhere.

func (*FileMeta) IsEmpty

func (p *FileMeta) IsEmpty() bool

IsEmpty is a convenience function for files (and directories too?). It can be overridden when the file contents are loaded (and modifiable).

func (*FileMeta) IsFile

func (p *FileMeta) IsFile() bool

IsFile is a (somewhat foolproofed) convenience function.

func (*FileMeta) IsSymlink

func (p *FileMeta) IsSymlink() bool

IsSymlink is a (somewhat foolproofed) convenience function.

func (*FileMeta) Refresh

func (p *FileMeta) Refresh()

type FileSet

type FileSet struct {
	// RelFilePath is a "short" argument such as supplied on the
	// command line; its absolute resolution is in the next field.
	// It may of course store an absolute (full) file path instead.
	// If this is "", it is not an error.
	// // RelFilePath string
	// AbsFilePath is the fully resolved counterpart to `RelFilePath`.
	// // AbsFilePath
	DirSpec FSItem
	// `filepath.WalkFunc` can provide relative filepaths, so we can't
	// say for sure whether this list will contain relative or absolute
	// paths. Therefore for convenience we use a bunch of strings.
	FilePaths []string
	// Then we process them.
	CheckedFiles []FSItem
}

FileSet groups a set of files that can and should be considered as a group. For example, when processing a multi-file document (LwDITA), or a multi-file DTD. It is assumed that they are related via a top-level directory, and it is the top-level directory that is contained in the field `DirSpec`.

In the pathological case that this was called on a file not a directory, all data refer to the file path, rather than (say) just the directory portion.

func NewOneFileSet

func NewOneFileSet(s string) *FileSet

func (*FileSet) FilterInBySuffix

func (p *FileSet) FilterInBySuffix(okayExts []string) (someOut bool)

FilterInBySuffix is TBS.

func (*FileSet) Size

func (p *FileSet) Size() int

Size returns the number of files.

type Filepaths

type Filepaths struct {
	// RelFP is typically the path given (e.g.) on the command line and is
	// useful for resolving relative paths in batches of content items.
	RelFP string
	// AbsFP is the authoritative field when processing individual files.
	AbsFP AbsFilePath
	// ShortFP is the path shortened by using "." (CWD) or "~" (user's
	// home directory), so it might only be valid for the current CLI
	// invocation or user session and it is definitely not persistable.
	ShortFP string
}

Filepaths should always have all three fields set, even if the third ([ShortFP]) is basically session-specific. Note that directories have a "/" appended.

func NewFilepaths

func NewFilepaths(anFP string) (*Filepaths, error)

NewFilepaths relies on the std lib, and accepts either an absolute or a relative filepath.

Ref: type PathError struct { Op string; Path string; Err error }

type LinesFile

type LinesFile struct {
	*FSItem
	Lines []*FileLine
}

LinesFile is for reading a file where each line is a record.
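
Reading line-oriented records can be sketched with bufio.Scanner. The type fileLine below only mirrors the two exported fields of FileLine, and the choice of 1-based line numbers is an assumption; readLines and parseText are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// fileLine mirrors the exported fields of FileLine for this sketch.
type fileLine struct {
	Raw       string // the line's text, without the line terminator
	RawLineNr int    // source line number, counted from 1 here
}

// readLines splits the input into one record per line.
func readLines(r io.Reader) ([]*fileLine, error) {
	var lines []*fileLine
	sc := bufio.NewScanner(r)
	for nr := 1; sc.Scan(); nr++ {
		lines = append(lines, &fileLine{Raw: sc.Text(), RawLineNr: nr})
	}
	return lines, sc.Err()
}

// parseText is a small convenience wrapper for in-memory strings.
func parseText(s string) ([]*fileLine, error) {
	return readLines(strings.NewReader(s))
}

func main() {
	lines, err := parseText("alpha\nbeta\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range lines {
		fmt.Printf("%d: %s\n", l.RawLineNr, l.Raw)
	}
}
```

Note that bufio.Scanner rejects very long lines by default; a file with such lines would need a larger buffer set via Scanner.Buffer.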

type ValidUTF8Reader

type ValidUTF8Reader struct {
	// contains filtered or unexported fields
}

ValidUTF8Reader implements a Reader which reads only bytes that constitute valid UTF-8.

func NewValidUTF8Reader

func NewValidUTF8Reader(rd io.Reader) ValidUTF8Reader

NewValidUTF8Reader constructs a new `ValidUTF8Reader` that wraps an existing `io.Reader`.

func (ValidUTF8Reader) Read

func (rd ValidUTF8Reader) Read(b []byte) (n int, err error)

Read reads bytes into the byte slice passed in. It returns `n`, the number of bytes read.
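
One way such a filtering reader could work is sketched below: it decodes rune by rune and drops any byte that does not start a valid UTF-8 sequence. This is an assumption about the technique, not the package's implementation; validUTF8Reader, newValidUTF8Reader and cleanString are hypothetical names, and utf8.AppendRune requires Go 1.18 or later.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
	"unicode/utf8"
)

// validUTF8Reader passes through valid UTF-8 and drops invalid bytes.
type validUTF8Reader struct {
	br      *bufio.Reader
	pending []byte // encoded bytes of a rune not yet fully copied out
}

func newValidUTF8Reader(r io.Reader) *validUTF8Reader {
	return &validUTF8Reader{br: bufio.NewReader(r)}
}

// Read fills p with valid UTF-8 only and returns the byte count.
func (r *validUTF8Reader) Read(p []byte) (int, error) {
	n := 0
	for n < len(p) {
		if len(r.pending) > 0 {
			c := copy(p[n:], r.pending)
			r.pending = r.pending[c:]
			n += c
			continue
		}
		if n > 0 && r.br.Buffered() == 0 {
			break // return what we have rather than block on the source
		}
		ru, size, err := r.br.ReadRune()
		if err != nil {
			return n, err
		}
		if ru == utf8.RuneError && size == 1 {
			continue // invalid byte: drop it
		}
		r.pending = utf8.AppendRune(r.pending[:0], ru)
	}
	return n, nil
}

// cleanString runs a string through the reader.
func cleanString(s string) (string, error) {
	b, err := io.ReadAll(newValidUTF8Reader(strings.NewReader(s)))
	return string(b), err
}

func main() {
	s, err := cleanString("ab\xffcd")
	fmt.Printf("%q %v\n", s, err)
}
```

Buffering a decoded rune in pending lets the reader cope with a destination slice that is too small to hold a whole multi-byte rune.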
