cwalk

Version: v1.1.0
Published: Aug 12, 2022 License: MIT Imports: 7 Imported by: 1

README

cwalk = Concurrent filepath.Walk

A concurrent version of the filepath.Walk function (https://golang.org/pkg/path/filepath/#Walk) that scans files in a directory tree and runs a callback for each file.

Since scanning (and callback execution) is done from within goroutines, this may result in a significant performance boost on multicore systems in cases when the bottleneck is the CPU, not the I/O.

My tests showed a ~3.5x average speed increase on an 8-core CPU with 8 workers. For measurements, I used the provided bin/traversaltime.go utility, which measures directory traversal time for both the concurrent (cwalk.Walk()) and standard (filepath.Walk()) functions.

INFO: This variant of cwalk also allows walking over a single file, matching the behavior of filepath.Walk. The path passed to the WalkFunc is the absolute path to the file.

Here are two common use cases when cwalk might be useful:

  1. You're doing subsequent scans of the same directory (e.g. monitoring it for changes), which means that the directory structure is likely cached in memory by the OS;

  2. You're doing some CPU-heavy processing for each file in the callback.

Installation

$ go get github.com/iafan/cwalk

Usage

import "github.com/iafan/cwalk"

...

func walkFunc(path string, info os.FileInfo, err error) error {
    ...
}

...

err := cwalk.Walk("/path/to/dir", walkFunc)

Errors

An error such as a file limit being exceeded will be reported as too many open files for a particular file. Each occurrence is available in the returned error via the WalkerErrorList type, whose ErrorList field holds one WalkerError per failure. When errors are encountered, the file walk ends prematurely and not all paths/files are walked. You can check for and access these errors like this:

if err != nil {
	fmt.Printf("Error: %s\n", err.Error())
	// Per the type documentation below, ErrorList is a field of
	// WalkerErrorList. The comma-ok form avoids a panic on other error types.
	if list, ok := err.(cwalk.WalkerErrorList); ok {
		for _, walkErr := range list.ErrorList {
			fmt.Println(walkErr)
		}
	}
}

Differences from filepath.Walk

filepath.Walk sorts directory results while traversing the tree, which makes processing repeatable between runs. cwalk.Walk() processes files concurrently, so there's no way to guarantee the order in which files or even folders are processed. If needed, you can sort the results once the entire tree is processed.

Documentation

Variables

var BufferSize = NumWorkers

BufferSize defines the size of the job buffer

var ErrBrokenSymlink = errors.New("broken symlink")

ErrBrokenSymlink is returned when a symlink is not pointing to a proper path

var NumWorkers = runtime.GOMAXPROCS(0)

NumWorkers defines how many workers to run on each Walk() function invocation

Functions

func Walk

func Walk(root string, walkFn filepath.WalkFunc) error

Walk is a wrapper function for the Walker object that mimics the behavior of filepath.Walk, and doesn't follow symlinks.

func WalkWithSymlinks

func WalkWithSymlinks(root string, walkFn filepath.WalkFunc) error

WalkWithSymlinks is a wrapper function for the Walker object that mimics the behavior of filepath.Walk, but follows directory symlinks.

Types

type Walker

type Walker struct {
	// contains filtered or unexported fields
}

Walker is constructed for each Walk() function invocation

func (*Walker) Walk

func (w *Walker) Walk(path string, walkFn filepath.WalkFunc) error

Walk recursively descends into subdirectories, calling walkFn for each file or directory in the tree, including the root directory.

type WalkerError

type WalkerError struct {
	// contains filtered or unexported fields
}

WalkerError struct stores individual errors reported from each worker routine

func (WalkerError) Error

func (we WalkerError) Error() string

Implement the error interface for WalkerError

type WalkerErrorList

type WalkerErrorList struct {
	ErrorList []WalkerError
}

WalkerErrorList struct stores a list of errors reported from all worker routines

func (WalkerErrorList) Error

func (wel WalkerErrorList) Error() string

Implement the error interface for WalkerErrorList
