Version: v2.32.0+incompatible
Published: Aug 10, 2021 License: Apache-2.0, BSD-3-Clause, MIT

## Documentation

### Overview

Package stats contains transforms for statistical processing.

### Constants

This section is empty.

### Variables

This section is empty.

### Functions

#### func ApproximateQuantiles

`func ApproximateQuantiles(s beam.Scope, pc beam.PCollection, less interface{}, opts Opts) beam.PCollection`

ApproximateQuantiles computes approximate quantiles for the input PCollection<T>.

The output PCollection contains a single element: a list of NumQuantiles - 1 elements approximately splitting up the input collection into NumQuantiles separate quantiles, where NumQuantiles is the field set in opts. For example, if NumQuantiles = 2, the returned list would contain a single element such that approximately half of the input would be less than that element and half would be greater.
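To make the shape of the output concrete, here is a minimal plain-Go sketch of what "NumQuantiles - 1 split points" means. It is an exact, in-memory model and not Beam's algorithm: `splitPoints` is a hypothetical helper, and the index rule it uses is an assumption chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// splitPoints is a hypothetical helper, not part of the stats package.
// It returns numQuantiles-1 values that split the sorted input into
// numQuantiles groups of roughly equal size. It is exact and runs in
// memory, whereas ApproximateQuantiles is approximate and distributed.
func splitPoints(in []int, numQuantiles int) []int {
	if len(in) == 0 || numQuantiles < 2 {
		return nil
	}
	// Sort a copy so the caller's slice is left untouched.
	sorted := append([]int(nil), in...)
	sort.Ints(sorted)

	out := make([]int, 0, numQuantiles-1)
	for i := 1; i < numQuantiles; i++ {
		// Assumed rule: take the first element of the i-th group.
		idx := i * len(sorted) / numQuantiles
		out = append(out, sorted[idx])
	}
	return out
}

func main() {
	data := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
	fmt.Println(splitPoints(data, 2)) // [6]
	fmt.Println(splitPoints(data, 4)) // [3 6 8]
}
```

With two quantiles the single split point is 6: five inputs are smaller and the remaining five are not.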

#### func ApproximateWeightedQuantiles

`func ApproximateWeightedQuantiles(s beam.Scope, pc beam.PCollection, less interface{}, opts Opts) beam.PCollection`

ApproximateWeightedQuantiles computes approximate quantiles for the input PCollection<(weight int, T)>.

The output PCollection contains a single element: a list of NumQuantiles - 1 elements approximately splitting up the input collection into NumQuantiles separate quantiles, where NumQuantiles is the field set in opts. For example, if NumQuantiles = 2, the returned list would contain a single element such that approximately half of the input would be less than that element and half would be greater or equal.
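The effect of weights can be illustrated with a small plain-Go model. This is a sketch under stated assumptions, not Beam's implementation: the `weighted` type and `weightedSplitPoints` helper are hypothetical, weights are assumed to be positive, and "half of the input" is interpreted as half of the total weight.

```go
package main

import (
	"fmt"
	"sort"
)

// weighted is a hypothetical stand-in for one (weight, value) input pair.
type weighted struct {
	weight int
	value  int
}

// weightedSplitPoints is a hypothetical, exact, in-memory model.
// It returns numQuantiles-1 values such that roughly i/numQuantiles of
// the total weight lies below the i-th returned value.
func weightedSplitPoints(in []weighted, numQuantiles int) []int {
	if len(in) == 0 || numQuantiles < 2 {
		return nil
	}
	sorted := append([]weighted(nil), in...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].value < sorted[j].value })

	total := 0
	for _, w := range sorted {
		total += w.weight
	}
	if total <= 0 {
		return nil // assumption: weights are positive
	}

	out := make([]int, 0, numQuantiles-1)
	cum, pos := 0, 0
	for i := 1; i < numQuantiles; i++ {
		target := i * total / numQuantiles
		// Skip elements while the weight seen so far stays within target.
		for pos < len(sorted)-1 && cum+sorted[pos].weight <= target {
			cum += sorted[pos].weight
			pos++
		}
		out = append(out, sorted[pos].value)
	}
	return out
}

func main() {
	in := []weighted{{1, 10}, {1, 20}, {8, 30}}
	fmt.Println(weightedSplitPoints(in, 2)) // [30]
}
```

The value 30 carries 8 of the 10 units of weight, so it is the split point even though it is the largest value.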

#### func Count

`func Count(s beam.Scope, col beam.PCollection) beam.PCollection`

Count counts the number of appearances of each element in a collection. It expects a PCollection<T> as input and returns a PCollection<KV<T,int>>. T's encoding must be deterministic so it is valid as a key.

#### func CountElms

`func CountElms(s beam.Scope, col beam.PCollection) beam.PCollection`

CountElms counts the number of elements in a collection. It expects a PCollection<T> as input and returns a PCollection<int> of one element containing the count.
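The difference between Count and CountElms is easy to blur, so the following plain-Go sketch models both on an ordinary slice. The helpers `countPerElement` and `countAll` are hypothetical names used only for illustration; they are not part of the stats package and do not run on a PCollection.

```go
package main

import "fmt"

// countPerElement models Count: one (element, occurrences) pair per
// distinct element. Hypothetical helper, not the Beam transform.
func countPerElement(in []string) map[string]int {
	counts := make(map[string]int)
	for _, e := range in {
		counts[e]++
	}
	return counts
}

// countAll models CountElms: a single total for the whole collection.
// Hypothetical helper, not the Beam transform.
func countAll(in []string) int {
	return len(in)
}

func main() {
	words := []string{"a", "b", "a", "c", "a"}
	fmt.Println(countPerElement(words)) // map[a:3 b:1 c:1]
	fmt.Println(countAll(words))        // 5
}
```

The map key plays the role of the KV key here, which is why Count requires an element type that is valid as a key.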

#### func Max

`func Max(s beam.Scope, col beam.PCollection) beam.PCollection`

Max returns the maximal element in a PCollection<A> as a singleton PCollection<A>. It can only be used for numbers, such as int, uint16, float32, etc.

For example:

```go
col := beam.Create(s, 1, 11, 7, 5, 10)
max := stats.Max(s, col) // PCollection<int> with 11 as the only element.
```

#### func MaxPerKey

`func MaxPerKey(s beam.Scope, col beam.PCollection) beam.PCollection`

MaxPerKey returns the maximal element per key in a PCollection<KV<A,B>> as a PCollection<KV<A,B>>. It can only be used for numbers, such as int, uint16, float32, etc.
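As a rough illustration of the per-key semantics, the sketch below computes the maximum value for each key over an in-memory slice. The `kv` type and `maxPerKey` helper are hypothetical and stand in for KV<A,B> elements; this is a model of the result, not how Beam executes the transform.

```go
package main

import "fmt"

// kv is a hypothetical stand-in for one KV<string,int> element.
type kv struct {
	key   string
	value int
}

// maxPerKey keeps the largest value seen for each key.
// Hypothetical helper, not the Beam transform.
func maxPerKey(in []kv) map[string]int {
	out := make(map[string]int)
	for _, e := range in {
		if cur, ok := out[e.key]; !ok || e.value > cur {
			out[e.key] = e.value
		}
	}
	return out
}

func main() {
	in := []kv{{"a", 1}, {"b", 7}, {"a", 11}, {"b", 5}}
	fmt.Println(maxPerKey(in)) // map[a:11 b:7]
}
```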

#### func Mean

`func Mean(s beam.Scope, col beam.PCollection) beam.PCollection`

Mean returns the arithmetic mean (or average) of the elements in a collection. It expects a PCollection<A> as input and returns a singleton PCollection<float64>. It can only be used for numbers, such as int, uint16, float32, etc.

For example:

```go
col := beam.Create(s, 1, 11, 7, 5, 10)
mean := stats.Mean(s, col) // PCollection<float64> with 6.8 as the only element.
```

#### func MeanPerKey

`func MeanPerKey(s beam.Scope, col beam.PCollection) beam.PCollection`

MeanPerKey returns the arithmetic mean (or average) for each key of the elements in a collection. It expects a PCollection<KV<A,B>> as input and returns a PCollection<KV<A,float64>>. It can only be used for numbers, such as int, uint16, float32, etc.
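The change of value type from B to float64 is the main thing to notice here. A minimal plain-Go sketch, assuming integer values and using the hypothetical names `kv` and `meanPerKey` (neither is part of the stats package):

```go
package main

import "fmt"

// kv is a hypothetical stand-in for one KV<string,int> element.
type kv struct {
	key   string
	value int
}

// meanPerKey accumulates a sum and a count per key and divides at the end.
// The result is float64 even though the inputs are ints.
// Hypothetical helper, not the Beam transform.
func meanPerKey(in []kv) map[string]float64 {
	sums := make(map[string]int)
	counts := make(map[string]int)
	for _, e := range in {
		sums[e.key] += e.value
		counts[e.key]++
	}
	out := make(map[string]float64, len(sums))
	for k, s := range sums {
		out[k] = float64(s) / float64(counts[k])
	}
	return out
}

func main() {
	in := []kv{{"a", 1}, {"b", 7}, {"a", 11}, {"b", 10}}
	fmt.Println(meanPerKey(in)) // map[a:6 b:8.5]
}
```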

#### func Min

`func Min(s beam.Scope, col beam.PCollection) beam.PCollection`

Min returns the minimal element in a PCollection<A> as a singleton PCollection<A>. It can only be used for numbers, such as int, uint16, float32, etc.

For example:

```go
col := beam.Create(s, 1, 11, 7, 5, 10)
min := stats.Min(s, col) // PCollection<int> with 1 as the only element.
```

#### func MinPerKey

`func MinPerKey(s beam.Scope, col beam.PCollection) beam.PCollection`

MinPerKey returns the minimal element per key in a PCollection<KV<A,B>> as a PCollection<KV<A,B>>. It can only be used for numbers, such as int, uint16, float32, etc.

#### func Sum

`func Sum(s beam.Scope, col beam.PCollection) beam.PCollection`

Sum returns the sum of the elements in a PCollection<A> as a singleton PCollection<A>. It can only be used for numbers, such as int, uint16, float32, etc.

For example:

```go
col := beam.Create(s, 1, 11, 7, 5, 10)
sum := stats.Sum(s, col) // PCollection<int> with 34 as the only element.
```

#### func SumPerKey

`func SumPerKey(s beam.Scope, col beam.PCollection) beam.PCollection`

SumPerKey returns the sum of the values per key in a PCollection<KV<A,B>> as a PCollection<KV<A,B>>. It can only be used for numeric values, such as int, uint16, float32, etc.
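Unlike a per-key mean, a per-key sum keeps the value type of the input. The plain-Go sketch below models that with float32 values; `kv` and `sumPerKey` are hypothetical names chosen for this example and are not part of the stats package.

```go
package main

import "fmt"

// kv is a hypothetical stand-in for one KV<string,float32> element.
type kv struct {
	key   string
	value float32
}

// sumPerKey adds up the values for each key. The output value type
// matches the input value type (float32 here).
// Hypothetical helper, not the Beam transform.
func sumPerKey(in []kv) map[string]float32 {
	out := make(map[string]float32)
	for _, e := range in {
		out[e.key] += e.value
	}
	return out
}

func main() {
	in := []kv{{"a", 1.5}, {"b", 2}, {"a", 2.5}}
	fmt.Println(sumPerKey(in)) // map[a:4 b:2]
}
```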

### Types

#### type Opts

```go
type Opts struct {
	// Controls the memory used and approximation error (difference between
	// the quantile returned and the true quantile).
	K int
	// Number of quantiles to return. The algorithm will return
	// NumQuantiles - 1 numbers.
	NumQuantiles int
	// For extremely large datasets, runners may have issues with out of
	// memory errors or taking too long to finish.
	// If ApproximateQuantiles is failing, you can use this option to tune
	// how the data is sharded internally.
	// This parameter is optional. If unspecified, Beam will compact all
	// elements into a single compactor at once using a single machine.
	// For example, if this is set to [8, 4, 2]: First, elements will be
	// assigned to 8 shards which will run in parallel. Then the intermediate
	// results from those 8 shards will be reassigned to 4 shards and merged
	// in parallel. Then once again to 2 shards. Finally the intermediate
	// results of those two shards will be merged on one machine before
	// returning the final result.
	InternalSharding []int
}
```

Opts contains settings used to configure how approximate quantiles are computed.
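The staged reduction described for InternalSharding can be pictured with a small plain-Go model. This is only a sketch of the data flow for a setting such as [8, 4, 2]: the helpers `mergeStage` and `shardedMerge` are hypothetical, shards are assigned round-robin as an assumption, and sorted slices stand in for Beam's compactors, which, unlike this model, bound memory by discarding data.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeStage reassigns partial results to n shards (round-robin, an
// assumption) and merges the partial results that land on the same shard.
func mergeStage(parts [][]int, n int) [][]int {
	out := make([][]int, n)
	for i, p := range parts {
		shard := i % n
		out[shard] = append(out[shard], p...)
	}
	for _, s := range out {
		sort.Ints(s) // stand-in for merging compactors
	}
	return out
}

// shardedMerge runs one stage per entry in sharding, then a final merge
// on a single shard, mirroring the 8 -> 4 -> 2 -> 1 flow.
func shardedMerge(in []int, sharding []int) []int {
	// Start with one partial result per input element.
	parts := make([][]int, len(in))
	for i, v := range in {
		parts[i] = []int{v}
	}
	for _, n := range sharding {
		if n < 1 {
			continue // ignore invalid shard counts in this sketch
		}
		parts = mergeStage(parts, n)
	}
	return mergeStage(parts, 1)[0]
}

func main() {
	in := []int{9, 4, 7, 1, 8, 2, 6, 3, 5, 0}
	fmt.Println(shardedMerge(in, []int{8, 4, 2})) // [0 1 2 3 4 5 6 7 8 9]
}
```

Each stage works on fewer, larger partial results, so only the final stage needs a single machine to hold everything that remains.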