Documentation

Overview
Package io defines the I/O pipeline framework for Pulse: Reader/Writer interfaces, schema inference, and job types (ImportJob, ExportJob, ConvertJob).
Format-specific adapters live in sub-packages (csv, tsv, ndjson, jsonarray, parquet, arrow, excel).
Index
Constants
This section is empty.
Variables
This section is empty.
Functions
func ErrStopIteration
func ErrStopIteration() error
ErrStopIteration returns the stop iteration sentinel for use by readers.
Types
type ConvertJob
type ConvertJob struct {
	Source      Reader
	Target      Writer
	Schema      *encoding.Schema
	KeepPulseAt string // optional: also write intermediate .pulse
	SampleRows  int
	FS          afero.Fs
}
ConvertJob chains import and export with no intermediate file on disk (unless KeepPulseAt is set).
func NewConvertJob
func NewConvertJob(source Reader, target Writer) *ConvertJob
NewConvertJob creates a ConvertJob with default settings.
func (*ConvertJob) Predict
func (j *ConvertJob) Predict(ctx context.Context) (*PredictReport, error)
Predict validates the conversion without writing.
func (*ConvertJob) Run
func (j *ConvertJob) Run(ctx context.Context) (*ConvertReport, error)
Run executes the convert job, streaming from source to target. If KeepPulseAt is set, also writes an intermediate .pulse file.
type ConvertReport
ConvertReport summarizes the result of a convert operation.
type ExportJob
ExportJob converts a .pulse file into tabular output.
func NewExportJob
NewExportJob creates an ExportJob.
type ExportReport
ExportReport summarizes the result of an export operation.
type ImportJob
type ImportJob struct {
	Source     Reader
	Target     string // output .pulse path
	Schema     *encoding.Schema
	SampleRows int // default 500, min 50
	FS         afero.Fs
}
ImportJob converts tabular source data into a .pulse file.
func NewImportJob
NewImportJob creates an ImportJob with default settings.
type ImportReport
ImportReport summarizes the result of an import operation.
type InferenceWarning
InferenceWarning records a non-fatal observation during inference.
func InferSchema
InferSchema samples up to sampleRows rows from reader and proposes a Schema. If sampleRows <= 0, defaultSampleRows is used. The minimum is minSampleRows.
type PredictReport
type PredictReport struct {
	Schema        *encoding.Schema
	EstimatedRows int
	Warnings      []InferenceWarning
}
PredictReport summarizes a validation-only run.
type Reader
type Reader interface {
	// ReadHeader returns column names from the source.
	ReadHeader() ([]string, error)

	// ReadRows streams rows; calls fn for each row.
	// The context controls cancellation.
	ReadRows(ctx context.Context, fn func(row []string) error) error

	// Close releases underlying resources.
	Close() error
}
Reader reads tabular data from a source format.
type ResetReader
ResetReader is an optional interface for readers that support rewinding to the beginning. This is needed for schema inference followed by import.
type SchemaAwareWriter (added in v0.2.0)
SchemaAwareWriter is an optional extension of Writer for targets that emit native typed columns (Arrow, Parquet, Excel) and want the source .pulse schema to drive column-type selection. ExportJob calls SetPulseSchema before WriteHeader on writers that implement this interface, then passes typed values through WriteRow:
- encoding.Decimal128 for decimal128 / nullable_decimal128 columns
- encoding.PointF64 for point_f64 columns
- encoding.H3Cell for h3_cell columns
- canonical strings for narrow types (current behavior)
Writers that do not implement SchemaAwareWriter receive only canonical strings, which is the prior text-only export contract.
Directories
| Path | Synopsis |
|---|---|
| arrow | Package arrow provides Arrow IPC (Feather V2) import and export for the pulse I/O pipeline, plus shared Arrow<->Pulse type-mapping helpers used by both this package and io/parquet. |
| csv | Package csv provides CSV format adapters for the Pulse I/O pipeline. |
| excel | Package excel provides Excel import and export for the pulse I/O pipeline. |
| format | Package format dispatches tabular Reader construction by format identifier, sitting between the io/ interface definitions and the per-format leaf packages (io/csv, io/tsv, io/ndjson, io/jsonarray, io/parquet, io/arrow, io/excel). |
| jsonarray | Package jsonarray provides JSON-array import and export for the pulse I/O pipeline. |
| jsonshared | Package jsonshared holds value coercion helpers shared by the ndjson and jsonarray packages. |
| ndjson | Package ndjson provides NDJSON (newline-delimited JSON) import and export for the pulse I/O pipeline. |
| parquet | Package parquet provides Parquet import and export for the pulse I/O pipeline. |
| tsv | Package tsv provides TSV import and export for the pulse I/O pipeline. |