dataflowlib

package
v2.7.0+incompatible
Published: Sep 26, 2018 | License: Apache-2.0, BSD-3-Clause, MIT | Imports: 31 | Imported by: 0

Documentation

Overview

Package dataflowlib translates a Beam pipeline model to the Dataflow API job model, for submission to Google Cloud Dataflow.

Constants

This section is empty.

Variables

This section is empty.

Functions

func Execute

func Execute(ctx context.Context, raw *pb.Pipeline, opts *JobOptions, workerURL, modelURL, endpoint string, async bool) (string, error)

Execute submits a pipeline as a Dataflow job.
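
The page does not spell out what Execute does internally. The sketch below shows one plausible way the functions listed on this page could compose into a submission flow. The step order, the steps struct, and the string stand-ins for pipeline and job values are all assumptions made so the example compiles with the standard library alone; it is not the package's actual implementation.

```go
package main

import (
	"context"
	"fmt"
)

// steps bundles hypothetical stand-ins for the functions on this page.
// The real functions take pipeline protos and Dataflow API types.
type steps struct {
	stageWorker func(ctx context.Context) error
	fixup       func() error
	stageModel  func(ctx context.Context) error
	translate   func() (string, error)
	submit      func(ctx context.Context, job string) (string, error)
	wait        func(ctx context.Context, jobID string) error
}

// execute runs the steps in an assumed order and returns the job ID.
// When async is true it returns right after submission without waiting.
func execute(ctx context.Context, s steps, async bool) (string, error) {
	if err := s.stageWorker(ctx); err != nil {
		return "", fmt.Errorf("staging worker: %w", err)
	}
	if err := s.fixup(); err != nil {
		return "", fmt.Errorf("fixing up pipeline: %w", err)
	}
	if err := s.stageModel(ctx); err != nil {
		return "", fmt.Errorf("staging model: %w", err)
	}
	job, err := s.translate()
	if err != nil {
		return "", fmt.Errorf("translating pipeline: %w", err)
	}
	id, err := s.submit(ctx, job)
	if err != nil {
		return "", fmt.Errorf("submitting job: %w", err)
	}
	if async {
		return id, nil // the caller monitors the job separately
	}
	return id, s.wait(ctx, id)
}

// recordingSteps returns stubs that append their name to calls and succeed.
func recordingSteps(calls *[]string) steps {
	note := func(name string) { *calls = append(*calls, name) }
	return steps{
		stageWorker: func(context.Context) error { note("StageWorker"); return nil },
		fixup:       func() error { note("Fixup"); return nil },
		stageModel:  func(context.Context) error { note("StageModel"); return nil },
		translate:   func() (string, error) { note("Translate"); return "job", nil },
		submit:      func(context.Context, string) (string, error) { note("Submit"); return "job-1", nil },
		wait:        func(context.Context, string) error { note("WaitForCompletion"); return nil },
	}
}

// run executes the stubbed flow and reports the job ID and the call order.
func run(async bool) (string, []string, error) {
	var calls []string
	id, err := execute(context.Background(), recordingSteps(&calls), async)
	return id, calls, err
}

func main() {
	id, calls, err := run(true)
	fmt.Println(id, calls, err)
}
```

In this sketch the job ID is returned in both modes, so an asynchronous caller can still monitor the job later.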

func Fixup

func Fixup(p *pb.Pipeline) (*pb.Pipeline, error)

Fixup adjusts the proto pipeline to accommodate Dataflow quirks.

func NewClient

func NewClient(ctx context.Context, endpoint string) (*df.Service, error)

NewClient creates a new Dataflow client with default application credentials and CloudPlatformScope. The endpoint argument optionally overrides the default Dataflow endpoint.

func PrintJob

func PrintJob(ctx context.Context, job *df.Job)

PrintJob logs the Dataflow job.

func StageModel

func StageModel(ctx context.Context, project, modelURL string, model []byte) error

StageModel uploads the pipeline model to GCS as a unique object.

func StageWorker

func StageWorker(ctx context.Context, project, workerURL, worker string) error

StageWorker uploads the worker binary to GCS as a unique object.
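
Both staging functions take a destination URL for the uploaded object. Assuming workerURL and modelURL use the usual gs://bucket/object form (the page does not state the format), the sketch below shows how such a URL splits into a bucket and an object name. parseObject is a hypothetical helper written for this example, not a function of this package.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseObject is a hypothetical helper that splits a gs://bucket/object URL
// into its bucket and object name.
func parseObject(raw string) (bucket, object string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", err
	}
	if u.Scheme != "gs" {
		return "", "", fmt.Errorf("%q is not a gs:// URL", raw)
	}
	// The URL host is the bucket; the path without its leading slash is the object.
	object = strings.TrimPrefix(u.Path, "/")
	if u.Host == "" || object == "" {
		return "", "", fmt.Errorf("%q must name both a bucket and an object", raw)
	}
	return u.Host, object, nil
}

func main() {
	// The bucket and object names here are placeholders.
	bucket, object, err := parseObject("gs://my-bucket/staging/worker-123")
	fmt.Println(bucket, object, err)
}
```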

func Submit

func Submit(ctx context.Context, client *df.Service, project, region string, job *df.Job) (*df.Job, error)

Submit submits a prepared job to Cloud Dataflow.

func Translate

func Translate(p *pb.Pipeline, opts *JobOptions, workerURL, modelURL string) (*df.Job, error)

Translate translates a pipeline to a Dataflow job.

func WaitForCompletion

func WaitForCompletion(ctx context.Context, client *df.Service, project, region, jobID string) error

WaitForCompletion monitors the given job until completion. It logs any messages and state changes received.
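
A monitoring loop of this kind is commonly written as a poll with a delay between requests. The sketch below illustrates that pattern under stated assumptions: getState is a hypothetical stand-in for a Dataflow "get job" call, the polling interval is arbitrary, and the choice of terminal states is this example's own. It uses the Dataflow API's job state names, but it does not claim to reproduce how WaitForCompletion is implemented.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// getState is a hypothetical stand-in for a call that fetches the job state.
type getState func(ctx context.Context) (string, error)

// waitForCompletion polls get every interval until a terminal state is seen,
// printing each state change along the way.
func waitForCompletion(ctx context.Context, get getState, interval time.Duration) error {
	last := ""
	for {
		state, err := get(ctx)
		if err != nil {
			return fmt.Errorf("getting job state: %w", err)
		}
		if state != last {
			fmt.Printf("job state: %s\n", state)
			last = state
		}
		switch state {
		case "JOB_STATE_DONE":
			return nil
		case "JOB_STATE_FAILED", "JOB_STATE_CANCELLED":
			return errors.New("job ended in state " + state)
		}
		// Sleep until the next poll, or stop early if the context is cancelled.
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(interval):
		}
	}
}

// scripted returns a getState that replays states, repeating the last one.
func scripted(states ...string) getState {
	i := 0
	return func(context.Context) (string, error) {
		s := states[i]
		if i < len(states)-1 {
			i++
		}
		return s, nil
	}
}

// poll runs waitForCompletion against a scripted job with a short interval.
func poll(states ...string) error {
	return waitForCompletion(context.Background(), scripted(states...), time.Millisecond)
}

func main() {
	fmt.Println(poll("JOB_STATE_PENDING", "JOB_STATE_RUNNING", "JOB_STATE_DONE"))
}
```

Selecting on ctx.Done() alongside the timer lets a caller abandon the wait by cancelling the context instead of blocking until the job finishes.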

Types

type JobOptions

type JobOptions struct {
	// Name is the job name.
	Name string
	// Experiments are additional experiments.
	Experiments []string
	// Pipeline options
	Options runtime.RawOptions

	Project     string
	Region      string
	Zone        string
	Network     string
	NumWorkers  int64
	MachineType string
	Labels      map[string]string

	TempLocation string

	// Worker is the worker binary override.
	Worker string

	TeardownPolicy string
}

JobOptions captures the options for submitting jobs to Dataflow.
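
As a rough illustration of filling in these options, the sketch below populates a local mirror of the struct. The jobOptions type is a copy made so the example compiles without the Beam module (the Options field, which has a Beam runtime type, is omitted), and every value shown is a placeholder rather than a recommended setting.

```go
package main

import "fmt"

// jobOptions mirrors the JobOptions struct above for illustration only.
// The Options field is omitted because its type lives in the Beam module.
type jobOptions struct {
	Name           string
	Experiments    []string
	Project        string
	Region         string
	Zone           string
	Network        string
	NumWorkers     int64
	MachineType    string
	Labels         map[string]string
	TempLocation   string
	Worker         string
	TeardownPolicy string
}

// exampleOptions returns options filled with placeholder values.
func exampleOptions() *jobOptions {
	return &jobOptions{
		Name:         "wordcount-example",
		Project:      "my-project",
		Region:       "us-central1",
		NumWorkers:   2,
		MachineType:  "n1-standard-1",
		Labels:       map[string]string{"team": "data"},
		TempLocation: "gs://my-bucket/tmp",
	}
}

func main() {
	opts := exampleOptions()
	fmt.Printf("%s in %s/%s with %d workers\n", opts.Name, opts.Project, opts.Region, opts.NumWorkers)
}
```

Fields not set here, such as Worker and TeardownPolicy, keep their zero values in this sketch; how Dataflow treats unset fields is not covered on this page.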
