Documentation
¶
Overview ¶
Package dataflowlib translates a Beam pipeline model to the Dataflow API job model, for submission to Google Cloud Dataflow.
Index ¶
- func Execute(ctx context.Context, raw *pipepb.Pipeline, opts *JobOptions, ...) (string, error)
- func Fixup(p *pipepb.Pipeline) (*pipepb.Pipeline, error)
- func NewClient(ctx context.Context, endpoint string) (*df.Service, error)
- func PrintJob(ctx context.Context, job *df.Job)
- func StageFile(ctx context.Context, project, url, filename string) error
- func StageModel(ctx context.Context, project, modelURL string, model []byte) error
- func Submit(ctx context.Context, client *df.Service, project, region string, job *df.Job) (*df.Job, error)
- func Translate(p *pipepb.Pipeline, opts *JobOptions, workerURL, jarURL, modelURL string) (*df.Job, error)
- func WaitForCompletion(ctx context.Context, client *df.Service, project, region, jobID string) error
- type JobOptions
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func Execute ¶
func Execute(ctx context.Context, raw *pipepb.Pipeline, opts *JobOptions, workerURL, jarURL, modelURL, endpoint string, async bool) (string, error)
Execute submits a pipeline as a Dataflow job.
func Fixup ¶
Fixup proto pipeline with Dataflow quirks.
func NewClient ¶
NewClient creates a new dataflow client with default application credentials and CloudPlatformScope. The Dataflow endpoint is optionally overridden.
func StageFile ¶
StageFile uploads a file to GCS.
func StageModel ¶
StageModel uploads the pipeline model to GCS as a unique object.
func Submit ¶
func Submit(ctx context.Context, client *df.Service, project, region string, job *df.Job) (*df.Job, error)
Submit submits a prepared job to Cloud Dataflow.
Types ¶
type JobOptions ¶
type JobOptions struct {
// Name is the job name.
Name string
// Experiments are additional experiments.
Experiments []string
// Pipeline options
Options runtime.RawOptions
Project string
Region string
Zone string
Network string
Subnetwork string
NoUsePublicIPs bool
NumWorkers int64
MachineType string
Labels map[string]string
ServiceAccountEmail string
// Autoscaling settings
Algorithm string
MaxNumWorkers int64
TempLocation string
// Worker is the worker binary override.
Worker string
// WorkerJar is a custom worker jar.
WorkerJar string
TeardownPolicy string
}
JobOptions capture the various options for submitting jobs to Dataflow.
Source Files
¶
- execute.go
- fixup.go
- job.go
- messages.go
- stage.go
- translate.go
Click to show internal directories.
Click to hide internal directories.