spill

package
v1.12.0
Published: Dec 20, 2023 License: BSD-3-Clause Imports: 12 Imported by: 0

Documentation

Constants

const TempPrefix = "zed-spill-"

Variables

This section is empty.

Functions

func TempDir

func TempDir() (string, error)

func TempFile

func TempFile() (*os.File, error)

Types

type File

type File struct {
	*zngio.Reader
	*zngio.Writer
	// contains filtered or unexported fields
}

File provides a means to write a sequence of zng records to temporary storage and then read them back. It is used to process large batches of data that do not fit in memory and/or cannot be shuffled to a peer worker, but that can be processed in multiple passes. File implements zio.Reader and zio.Writer.

func NewFile

func NewFile(f *os.File) *File

NewFile returns a File. Write records to the File via the zio.Writer interface, then call the Rewind method, then read the records back via the zio.Reader interface.
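The write, rewind, read sequence can be illustrated with the standard library alone. The following is a minimal sketch of the pattern, not the package's implementation: it stores text lines instead of zng records, `roundTrip` is a hypothetical helper, and using the TempPrefix value as the temporary file prefix is an assumption about how TempFile names its files.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
)

// roundTrip writes lines to a temporary file, rewinds it, and reads the
// lines back in the order they were written.
func roundTrip(lines []string) ([]string, error) {
	// Assumption: the prefix mirrors the TempPrefix constant.
	f, err := os.CreateTemp("", "zed-spill-")
	if err != nil {
		return nil, err
	}
	// Deferred calls run last-in first-out: close the file, then remove it.
	defer os.Remove(f.Name())
	defer f.Close()

	// Write phase.
	w := bufio.NewWriter(f)
	for _, l := range lines {
		if _, err := fmt.Fprintln(w, l); err != nil {
			return nil, err
		}
	}
	if err := w.Flush(); err != nil {
		return nil, err
	}

	// Rewind phase: move the file offset back to the start.
	if _, err := f.Seek(0, io.SeekStart); err != nil {
		return nil, err
	}

	// Read phase.
	var out []string
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		out = append(out, sc.Text())
	}
	return out, sc.Err()
}

func main() {
	got, err := roundTrip([]string{"a", "b", "c"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got) // prints [a b c]
}
```

Flushing the buffered writer before seeking matters here: without it, some records could still be in memory when reading starts.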

func NewFileWithPath

func NewFileWithPath(path string) (*File, error)

func NewTempFile

func NewTempFile() (*File, error)

func (*File) CloseAndRemove

func (r *File) CloseAndRemove() error

CloseAndRemove closes and removes the underlying file.

func (*File) Rewind

func (f *File) Rewind(zctx *zed.Context) error

func (*File) Size

func (f *File) Size() (int64, error)

type MergeSort

type MergeSort struct {
	// contains filtered or unexported fields
}

MergeSort manages "runs" (files of sorted zng records) that are spilled to disk a chunk at a time, then read back and merged in sorted order, effectively implementing an external merge sort.

func NewMergeSort

func NewMergeSort(comparator *expr.Comparator) (*MergeSort, error)

NewMergeSort returns a MergeSort to implement external merge sorts of a large zng record stream. It creates a temporary directory to hold the collection of spilled chunks. Call Cleanup to remove it.

func (*MergeSort) Cleanup

func (r *MergeSort) Cleanup()

func (*MergeSort) Len

func (r *MergeSort) Len() int

func (*MergeSort) Less

func (r *MergeSort) Less(i, j int) bool

func (*MergeSort) Peek

func (r *MergeSort) Peek() (*zed.Value, error)

Peek returns the next record without advancing the reader. The record stops being valid at the next read call.
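One common reason for a validity rule like this is that a reader reuses a single buffer for every record. Whether MergeSort works this way is an assumption; the sketch below uses a hypothetical `reader` type over byte slices only to show why a caller should copy a peeked value that it needs to keep.

```go
package main

import "fmt"

// reader is a hypothetical record reader that reuses one buffer.
type reader struct {
	src  []string
	pos  int
	buf  []byte // reused for every record
	full bool   // true when buf holds a record that has not been consumed
}

// Peek returns the next record without consuming it, or nil at end of input.
func (r *reader) Peek() []byte {
	if !r.full {
		if r.pos >= len(r.src) {
			return nil
		}
		// Load the next record into the shared buffer, overwriting the old one.
		r.buf = append(r.buf[:0], r.src[r.pos]...)
		r.pos++
		r.full = true
	}
	return r.buf
}

// Read returns the next record and consumes it.
func (r *reader) Read() []byte {
	v := r.Peek()
	r.full = false
	return v
}

// demo returns a copied peek result and the same result left uncopied
// after later reads have reused the buffer.
func demo() (saved, stale string) {
	r := &reader{src: []string{"a", "b"}}
	p := r.Peek()
	saved = string(p) // copy the bytes while they are still valid
	r.Read()          // consumes "a"
	r.Read()          // loads "b" into the same buffer
	return saved, string(p)
}

func main() {
	saved, stale := demo()
	fmt.Println(saved, stale) // prints a b
}
```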

func (*MergeSort) Pop

func (r *MergeSort) Pop() interface{}

func (*MergeSort) Push

func (r *MergeSort) Push(x interface{})

func (*MergeSort) Read

func (r *MergeSort) Read() (*zed.Value, error)

Read returns the smallest record (per the comparison function provided to NewMergeSort) from among the next records in the spilled chunks. It implements the merge operation for an external merge sort.
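The Len, Less, Swap, Push, and Pop methods on MergeSort have the same shape as the standard library's heap.Interface, which is a common way to pick the smallest head among several sorted runs. The following is a minimal sketch of such a k-way merge under stated assumptions: runs are in-memory int slices rather than spilled files, and `run`, `runHeap`, and `mergeRuns` are hypothetical names that are not part of this package.

```go
package main

import (
	"container/heap"
	"fmt"
)

// run is one sorted sequence; its next value is vals[0].
type run struct {
	vals []int
}

// runHeap orders non-empty runs by their next value.
type runHeap []*run

func (h runHeap) Len() int            { return len(h) }
func (h runHeap) Less(i, j int) bool  { return h[i].vals[0] < h[j].vals[0] }
func (h runHeap) Swap(i, j int)       { h[i], h[j] = h[j], h[i] }
func (h *runHeap) Push(x interface{}) { *h = append(*h, x.(*run)) }
func (h *runHeap) Pop() interface{} {
	old := *h
	n := len(old)
	x := old[n-1]
	*h = old[:n-1]
	return x
}

// mergeRuns returns every value from the sorted runs in ascending order.
func mergeRuns(runs [][]int) []int {
	h := &runHeap{}
	for _, r := range runs {
		if len(r) > 0 {
			*h = append(*h, &run{vals: r})
		}
	}
	heap.Init(h)

	var out []int
	for h.Len() > 0 {
		top := (*h)[0] // the run whose next value is smallest
		out = append(out, top.vals[0])
		top.vals = top.vals[1:]
		if len(top.vals) == 0 {
			heap.Pop(h) // the run is exhausted, so drop it
		} else {
			heap.Fix(h, 0) // restore heap order after the head changed
		}
	}
	return out
}

func main() {
	fmt.Println(mergeRuns([][]int{{1, 4, 9}, {2, 3, 10}, {5}}))
	// prints [1 2 3 4 5 9 10]
}
```

Each output value costs one heap adjustment, so merging n values from k runs takes O(n log k) comparisons.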

func (*MergeSort) Spill

func (r *MergeSort) Spill(ctx context.Context, vals []zed.Value) error

Spill sorts and spills a new run of records to a file in the MergeSort's temp directory. Because each chunk is sorted in memory before it is spilled, the chunks can be merged into sorted order by reading them back sequentially.
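The spill step can be sketched with the standard library as well. This is an illustration under stated assumptions rather than the package's code: values are ints written one per line instead of zng records, the `run-N` file naming is invented for the example, and `spillRun`, `readRun`, and `spillAndReadBack` are hypothetical helpers.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// spillRun sorts vals in place and writes them, one per line, to a new
// file in dir. It returns the path of the run file.
func spillRun(dir string, n int, vals []int) (string, error) {
	sort.Ints(vals)
	path := filepath.Join(dir, fmt.Sprintf("run-%d", n))
	f, err := os.Create(path)
	if err != nil {
		return "", err
	}
	w := bufio.NewWriter(f)
	for _, v := range vals {
		if _, err := fmt.Fprintln(w, v); err != nil {
			f.Close()
			return "", err
		}
	}
	if err := w.Flush(); err != nil {
		f.Close()
		return "", err
	}
	return path, f.Close()
}

// readRun reads a run file back sequentially.
func readRun(path string) ([]int, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var out []int
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		var v int
		if _, err := fmt.Sscan(sc.Text(), &v); err != nil {
			return nil, err
		}
		out = append(out, v)
	}
	return out, sc.Err()
}

// spillAndReadBack spills each chunk as a sorted run in a fresh temp
// directory, reads every run back, and removes the directory on return.
func spillAndReadBack(chunks [][]int) ([][]int, error) {
	dir, err := os.MkdirTemp("", "zed-spill-")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(dir) // plays the role of Cleanup
	var runs [][]int
	for i, c := range chunks {
		path, err := spillRun(dir, i, c)
		if err != nil {
			return nil, err
		}
		r, err := readRun(path)
		if err != nil {
			return nil, err
		}
		runs = append(runs, r)
	}
	return runs, nil
}

func main() {
	runs, err := spillAndReadBack([][]int{{3, 1, 2}, {9, 7}})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(runs) // prints [[1 2 3] [7 9]]
}
```

Each run comes back already sorted, so a merge only ever needs to compare the next unread value of each run.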

func (*MergeSort) SpillSize

func (r *MergeSort) SpillSize() int64

func (*MergeSort) Swap

func (r *MergeSort) Swap(i, j int)
