textio

package
v2.55.1
Published: Apr 2, 2024 | License: Apache-2.0, BSD-3-Clause, MIT | Imports: 14 | Imported by: 163

Documentation

Overview

Package textio contains transforms for reading and writing text files.

Constants

This section is empty.

Variables

This section is empty.

Functions

func Immediate

func Immediate(s beam.Scope, filename string) (beam.PCollection, error)

Immediate reads a local file at pipeline construction time and embeds the data into an I/O-free pipeline source. It should be used for small files only.

func Read

func Read(s beam.Scope, glob string, opts ...ReadOptionFn) beam.PCollection

Read reads a set of files indicated by the glob pattern and returns the lines as a PCollection<string>. The newlines are not part of the lines. Read accepts a variadic number of ReadOptionFn that can be used to configure the compression type of the file. By default, the compression type is determined by the file extension.
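The two behaviors described for Read (lines come back without their newline, and compression is inferred from the file extension unless overridden) can be modeled with the standard library alone. The following is a minimal sketch, not Beam's implementation: `linesOf` and `gzipBytes` are hypothetical helpers, and treating a `.gz` suffix as the signal for gzip is an assumption made for illustration.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"strings"
)

// linesOf returns the lines in data with their newlines removed.
// Assumption for this sketch: a name ending in ".gz" means the data is
// gzip-compressed; anything else is read as plain text.
func linesOf(name string, data []byte) ([]string, error) {
	var r io.Reader = bytes.NewReader(data)
	if strings.HasSuffix(name, ".gz") {
		zr, err := gzip.NewReader(r)
		if err != nil {
			return nil, fmt.Errorf("open gzip %q: %w", name, err)
		}
		defer zr.Close()
		r = zr
	}

	var lines []string
	sc := bufio.NewScanner(r) // the default split function drops the line terminator
	for sc.Scan() {
		lines = append(lines, sc.Text())
	}
	if err := sc.Err(); err != nil {
		return nil, fmt.Errorf("scan %q: %w", name, err)
	}
	return lines, nil
}

// gzipBytes compresses s in memory so the example needs no files on disk.
func gzipBytes(s string) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write([]byte(s)); err != nil {
		return nil, fmt.Errorf("gzip write: %w", err)
	}
	if err := zw.Close(); err != nil {
		return nil, fmt.Errorf("gzip close: %w", err)
	}
	return buf.Bytes(), nil
}

func main() {
	plain, err := linesOf("notes.txt", []byte("alpha\nbeta\n"))
	if err != nil {
		panic(err)
	}
	packed, err := gzipBytes("alpha\nbeta\n")
	if err != nil {
		panic(err)
	}
	zipped, err := linesOf("notes.txt.gz", packed)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n%q\n", plain, zipped)
}
```

Both calls yield the same two lines, which is the point of extension-based detection: the caller does not have to say how each file was stored.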

func ReadAll

func ReadAll(s beam.Scope, col beam.PCollection, opts ...ReadOptionFn) beam.PCollection

ReadAll expands and reads the filenames given as globs by the incoming PCollection<string>. It returns the lines of all files as a single PCollection<string>. The newlines are not part of the lines. ReadAll accepts a variadic number of ReadOptionFn that can be used to configure the compression type of the files. By default, the compression type is determined by the file extension.
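To make the "expand each glob, then flatten every file's lines into one collection" shape concrete, here is a standard-library sketch over local files. It is only a model of the described behavior: `readAll` and `makeDemoFiles` are hypothetical names, a slice stands in for the PCollection, and the deterministic ordering shown here is a property of the sketch, not something the package documents.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// readAll expands every glob, reads each matching file, and returns the
// lines of all files in a single slice, without their newlines.
func readAll(globs []string) ([]string, error) {
	var lines []string
	for _, g := range globs {
		matches, err := filepath.Glob(g)
		if err != nil {
			return nil, fmt.Errorf("bad glob %q: %w", g, err)
		}
		sort.Strings(matches) // keep the sketch's output deterministic
		for _, name := range matches {
			f, err := os.Open(name)
			if err != nil {
				return nil, fmt.Errorf("open %q: %w", name, err)
			}
			sc := bufio.NewScanner(f)
			for sc.Scan() {
				lines = append(lines, sc.Text())
			}
			err = sc.Err()
			f.Close()
			if err != nil {
				return nil, fmt.Errorf("scan %q: %w", name, err)
			}
		}
	}
	return lines, nil
}

// makeDemoFiles writes three small files into a fresh temporary directory
// and returns that directory's path.
func makeDemoFiles() (string, error) {
	dir, err := os.MkdirTemp("", "textio-sketch")
	if err != nil {
		return "", fmt.Errorf("temp dir: %w", err)
	}
	files := map[string]string{
		"a.txt": "one\ntwo\n",
		"b.txt": "three\n",
		"c.log": "four\n",
	}
	for name, body := range files {
		path := filepath.Join(dir, name)
		if err := os.WriteFile(path, []byte(body), 0o644); err != nil {
			return "", fmt.Errorf("write %q: %w", path, err)
		}
	}
	return dir, nil
}

func main() {
	dir, err := makeDemoFiles()
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	globs := []string{filepath.Join(dir, "*.txt"), filepath.Join(dir, "*.log")}
	lines, err := readAll(globs)
	if err != nil {
		panic(err)
	}
	fmt.Println(lines) // prints [one two three four]
}
```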

func ReadAllSdf deprecated

func ReadAllSdf(s beam.Scope, col beam.PCollection) beam.PCollection

ReadAllSdf is a variation of ReadAll implemented via SplittableDoFn. This should result in increased performance with runners that support splitting.

Deprecated: Use ReadAll instead, which has been migrated to use this SDF implementation.

func ReadSdf deprecated

func ReadSdf(s beam.Scope, glob string) beam.PCollection

ReadSdf is a variation of Read implemented via SplittableDoFn. This should result in increased performance with runners that support splitting.

Deprecated: Use Read instead, which has been migrated to use this SDF implementation.

func ReadWithFilename added in v2.48.0

func ReadWithFilename(s beam.Scope, glob string, opts ...ReadOptionFn) beam.PCollection

ReadWithFilename reads a set of files indicated by the glob pattern and returns a PCollection<KV<string, string>> of each filename and line. The newlines are not part of the lines. ReadWithFilename accepts a variadic number of ReadOptionFn that can be used to configure the compression type of the files. By default, the compression type is determined by the file extension.
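The element shape, one (filename, line) pair per line of input, can be illustrated without a pipeline. In this sketch `fileLine` and `pairLines` are hypothetical names, file contents are supplied in memory, and the sorted output order exists only to make the example deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// fileLine models one KV<string, string> element: a filename and one of its lines.
type fileLine struct {
	Filename string
	Line     string
}

// String renders the pair for display.
func (fl fileLine) String() string {
	return fmt.Sprintf("%s: %s", fl.Filename, fl.Line)
}

// pairLines takes file contents keyed by filename and emits one pair per
// line, with the newline stripped from each line.
func pairLines(files map[string]string) []fileLine {
	names := make([]string, 0, len(files))
	for name := range files {
		names = append(names, name)
	}
	sort.Strings(names) // map iteration order is random; sort for stable output

	var out []fileLine
	for _, name := range names {
		content := files[name]
		if content == "" {
			continue // an empty file contributes no lines
		}
		body := strings.TrimSuffix(content, "\n")
		for _, line := range strings.Split(body, "\n") {
			out = append(out, fileLine{Filename: name, Line: line})
		}
	}
	return out
}

func main() {
	pairs := pairLines(map[string]string{
		"b.txt": "x\n",
		"a.txt": "p\nq\n",
	})
	for _, p := range pairs {
		fmt.Println(p)
	}
}
```

Keeping the filename alongside each line is what lets a later step group or filter by source file, which a plain collection of lines cannot do.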

func Write

func Write(s beam.Scope, filename string, col beam.PCollection)

Write writes a PCollection<string> to a file as separate lines. The writer adds a newline after each element.
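The "newline after each element" rule means the last element is newline-terminated too, so the file ends with a newline. A small standard-library sketch of that rule follows; `writeLines` and `render` are hypothetical helpers and say nothing about how the package writes files internally.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
)

// writeLines writes each element followed by a single newline.
func writeLines(w io.Writer, elems []string) error {
	bw := bufio.NewWriter(w)
	for _, e := range elems {
		if _, err := bw.WriteString(e); err != nil {
			return fmt.Errorf("write element: %w", err)
		}
		if err := bw.WriteByte('\n'); err != nil {
			return fmt.Errorf("write newline: %w", err)
		}
	}
	return bw.Flush()
}

// render returns the bytes writeLines would produce, as a string.
func render(elems []string) (string, error) {
	var buf bytes.Buffer
	if err := writeLines(&buf, elems); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := render([]string{"alpha", "beta"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", out) // prints "alpha\nbeta\n"
}
```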

Types

type ReadOptionFn added in v2.48.0

type ReadOptionFn func(*readOption)

ReadOptionFn is a function that can be passed to Read, ReadAll, or ReadWithFilename to configure options for reading files.

func ReadAutoCompression added in v2.48.0

func ReadAutoCompression() ReadOptionFn

ReadAutoCompression specifies that the compression type of files should be auto-detected.

func ReadGzip added in v2.48.0

func ReadGzip() ReadOptionFn

ReadGzip specifies that files have been compressed using gzip.

func ReadUncompressed added in v2.48.0

func ReadUncompressed() ReadOptionFn

ReadUncompressed specifies that files have not been compressed.
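ReadOptionFn follows Go's functional-options pattern: each constructor returns a function that mutates a private options struct. The sketch below shows how such options could be applied. The fields of `readOption` are not documented, so the `kind` field, the `compressionKind` type, and the `resolve` and `describe` helpers are all hypothetical; that a later option overrides an earlier one is a property of this sketch, not a documented guarantee.

```go
package main

import "fmt"

// compressionKind is a hypothetical enum for this sketch.
type compressionKind int

const (
	kindAuto compressionKind = iota
	kindGzip
	kindNone
)

// String names the compression kind for display.
func (k compressionKind) String() string {
	switch k {
	case kindGzip:
		return "gzip"
	case kindNone:
		return "uncompressed"
	default:
		return "auto"
	}
}

// readOption holds the resolved settings. Its single field is an assumption.
type readOption struct {
	kind compressionKind
}

// ReadOptionFn mirrors the documented type: a function that edits *readOption.
type ReadOptionFn func(*readOption)

// ReadAutoCompression selects auto-detection.
func ReadAutoCompression() ReadOptionFn {
	return func(o *readOption) { o.kind = kindAuto }
}

// ReadGzip declares the files gzip-compressed.
func ReadGzip() ReadOptionFn {
	return func(o *readOption) { o.kind = kindGzip }
}

// ReadUncompressed declares the files uncompressed.
func ReadUncompressed() ReadOptionFn {
	return func(o *readOption) { o.kind = kindNone }
}

// resolve starts from auto-detection and applies the options in order.
func resolve(opts ...ReadOptionFn) readOption {
	o := readOption{kind: kindAuto}
	for _, fn := range opts {
		fn(&o)
	}
	return o
}

// describe reports the compression setting that a set of options resolves to.
func describe(opts ...ReadOptionFn) string {
	return fmt.Sprintf("compression=%s", resolve(opts...).kind)
}

func main() {
	fmt.Println(describe())                   // prints compression=auto
	fmt.Println(describe(ReadGzip()))         // prints compression=gzip
	fmt.Println(describe(ReadUncompressed())) // prints compression=uncompressed
}
```

Because the options are variadic, callers that are happy with the default pass nothing, and new options can be added later without changing the signatures of Read, ReadAll, or ReadWithFilename.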
