Documentation ¶
Overview ¶
Package morebeam provides additional functions useful for building Apache Beam pipelines.
Example ¶
package main import ( "context" "flag" "reflect" "bramp.net/morebeam" "bramp.net/morebeam/csvio" "github.com/apache/beam/sdks/go/pkg/beam" "github.com/apache/beam/sdks/go/pkg/beam/x/beamx" "github.com/apache/beam/sdks/go/pkg/beam/x/debug" ) // Painting represents a single record in the csv file. type Painting struct { Artist string `csv:"artist"` Title string `csv:"title"` Year int `csv:"year"` NotUsed string `csv:"-"` // Ignored field } func main() { flag.Parse() beam.Init() p, s := beam.NewPipelineWithRoot() // Read the CSV file as a PCollection<Painting>. paintings := csvio.Read(s, "paintings.csv", reflect.TypeOf(Painting{})) // Reshuffle the CSV output to improve parallelism. paintings = morebeam.Reshuffle(s, paintings) // Return a new PCollection<KV<string, Painting>> where the key is the artist. paintingsByArtist := morebeam.AddKey(s, func(painting Painting) string { return painting.Artist }, paintings) debug.Print(s, paintingsByArtist) beamx.Run(context.Background(), p) }
Output:
Index ¶
Examples ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func AddKey ¶
func AddKey(s beam.Scope, keyFn interface{}, col beam.PCollection) beam.PCollection
AddKey takes a PCollection<V> and returns a PCollection<KV<K, V>> where the key is calculated from keyFn.
func Reshuffle ¶
func Reshuffle(s beam.Scope, col beam.PCollection) beam.PCollection
Reshuffle takes a PCollection<A> and shuffles the data to help increase parallelism.
Types ¶
This section is empty.
Click to show internal directories.
Click to hide internal directories.