cli

package
v0.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 13, 2026 License: Apache-2.0 Imports: 24 Imported by: 0

Documentation

Overview

Package cli builds the ccrawl command tree on top of the ccrawl library.

Index

Constants

This section is empty.

Variables

View Source
var (
	Version = "dev"
	Commit  = "none"
	Date    = "unknown"
)

Build metadata, set via -ldflags at release time.

Functions

func Execute

func Execute(ctx context.Context, cmd *cobra.Command) int

Execute runs the root command, mapping errors to exit codes.

func Root

func Root() *cobra.Command

Root builds the root command and its whole subtree.

Types

type App

type App struct {
	Cfg   ccrawl.Config
	HTTP  *ccrawl.HTTPClient
	Cache *ccrawl.Cache
	Out   *Output

	Limit      int
	Workers    int
	UseLibrary bool
	LibraryDir string
	// contains filtered or unexported fields
}

App carries the resolved configuration and shared clients for a command run.

func (*App) AllCrawls

func (a *App) AllCrawls(ctx context.Context) ([]string, error)

AllCrawls returns the crawl IDs to operate over when -c all/year is given, newest first, otherwise the single resolved crawl.

func (*App) Crawl

func (a *App) Crawl(ctx context.Context) (string, error)

Crawl resolves the crawl reference once and caches the canonical ID.

func (*App) Library

func (a *App) Library(ctx context.Context) (ccrawl.Library, error)

Library resolves the crawl ID and returns the dataset library rooted at the configured library dir for that crawl.

type Format

type Format string

Format is an output encoding.

const (
	FormatAuto  Format = "auto"
	FormatTable Format = "table"
	FormatJSON  Format = "json"
	FormatJSONL Format = "jsonl"
	FormatCSV   Format = "csv"
	FormatTSV   Format = "tsv"
	FormatURL   Format = "url"
	FormatRaw   Format = "raw"
)

type Output

type Output struct {
	// contains filtered or unexported fields
}

Output renders rows in the selected format. A single Output instance handles a whole command run, so streaming formats can write incrementally.

func (*Output) Emit

func (o *Output) Emit(r Row) error

Emit renders one row.

func (*Output) Flush

func (o *Output) Flush() error

Flush finalises buffered formats. Call once at the end of a command.

func (*Output) Format

func (o *Output) Format() Format

Format returns the resolved output format.

func (*Output) Raw

func (o *Output) Raw(b []byte) error

Raw writes bytes straight to stdout (for --raw/--body content).

type Row

type Row struct {
	Cols  []string
	Vals  []string
	Value any
}

Row is one output record: an ordered set of named columns plus the original value (used by json/jsonl and templates).

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL