Documentation
Overview
windowed_wordcount counts words in text, and can run over either unbounded or bounded input collections.
This example is the last in a series of four successively more detailed 'word count' examples. First take a look at minimal_wordcount, wordcount, and debugging_wordcount.
Basic concepts, also in the preceding examples: reading text files; counting a PCollection; writing to GCS; executing a Pipeline both locally and using a selected runner; defining DoFns; user-defined PTransforms; defining pipeline options.
New Concepts:
1. Unbounded and bounded pipeline input modes
2. Adding timestamps to data
3. Windowing
4. Re-using PTransforms over windowed PCollections
5. Accessing the window of an element