bq

package

Version: v0.0.0-...-51f9457 Published: Jul 9, 2021 License: Apache-2.0 Imports: 31 Imported by: 0

Repository

github.com/luci/luci-go

Documentation ¶

Overview ¶

Package bq is a library for working with BigQuery.

Limits ¶

See the BigQuery documentation at https://cloud.google.com/bigquery/quotas#streaminginserts for the current limits on streaming inserts. Clients are responsible for ensuring that their use of bq stays within these limits.

A note on maximum rows per request: Put() batches rows, ensuring that no more than 10,000 rows are sent per request, and allows a custom batch size. BigQuery recommends 500 as a practical limit (so this is the default) and suggests experimenting with your specific schema and data sizes to find the batch size with the best balance of throughput and latency for your use case.
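The batching described above can be illustrated with a minimal, standard-library-only sketch. The helper splitBatches and both constants are hypothetical names, not part of package bq, and clamping an oversized batch size to the 10,000-row cap is an assumption about how the cap could be enforced.

```go
package main

import "fmt"

// defaultBatchSize mirrors the default of 500 rows per request.
const defaultBatchSize = 500

// maxRowsPerRequest mirrors the 10,000-row cap per request.
const maxRowsPerRequest = 10000

// splitBatches is a hypothetical helper, not part of package bq.
// It cuts rows into consecutive chunks of at most size elements.
// A non-positive size falls back to the default, and a size above
// the cap is clamped to the cap (an assumption of this sketch).
func splitBatches[T any](rows []T, size int) [][]T {
	if size <= 0 {
		size = defaultBatchSize
	}
	if size > maxRowsPerRequest {
		size = maxRowsPerRequest
	}
	var out [][]T
	for len(rows) > 0 {
		n := size
		if len(rows) < n {
			n = len(rows)
		}
		out = append(out, rows[:n])
		rows = rows[n:]
	}
	return out
}

func main() {
	rows := make([]int, 1234)
	// With the default batch size this prints 500, 500, 234.
	for _, b := range splitBatches(rows, 0) {
		fmt.Println(len(b))
	}
}
```

Each chunk would then be sent as one request, so a smaller size trades throughput for lower per-request latency.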

Authentication ¶

Authentication for the Cloud projects happens during client creation: https://godoc.org/cloud.google.com/go#pkg-examples. What form this takes depends on the application.

Monitoring ¶

You can use tsmon (https://godoc.org/go.chromium.org/luci/common/tsmon) to track upload latency and errors.

If the Uploader.UploadsMetricName field is not empty, Uploader creates a counter metric to track successes and failures.

Index ¶

func AddMissingFields(dest *bigquery.Schema, src bigquery.Schema)
func SchemaDiff(before, after bigquery.Schema) string
func SchemaString(s bigquery.Schema) string
type InsertIDGenerator
- func (id *InsertIDGenerator) Generate() string
type Row
- func (r *Row) Save() (map[string]bigquery.Value, string, error)
type SchemaConverter
- func (c *SchemaConverter) Schema(messageName string) (schema bigquery.Schema, description string, err error)
type SourceCodeInfoMap
type Uploader
- func NewUploader(ctx context.Context, c *bigquery.Client, datasetID, tableID string) *Uploader
- func (u *Uploader) Put(ctx context.Context, messages ...proto.Message) error

Functions ¶

func AddMissingFields ¶

func AddMissingFields(dest *bigquery.Schema, src bigquery.Schema)

AddMissingFields copies fields from src to dest if they are not present in dest.
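As a rough illustration of the merge semantics, here is a sketch over a simplified, hypothetical field type rather than bigquery.Schema. Matching fields by name and descending into nested fields when a name exists on both sides are assumptions of this sketch, not confirmed behavior of AddMissingFields.

```go
package main

import "fmt"

// field is a simplified, hypothetical stand-in for a schema field.
type field struct {
	Name   string
	Nested []*field
}

// addMissing appends to dest every field of src whose name dest lacks.
// For a name present in both, it descends into the nested fields
// (an assumption of this sketch). Appended fields are shared with src,
// not deep-copied.
func addMissing(dest *[]*field, src []*field) {
	byName := make(map[string]*field, len(*dest))
	for _, f := range *dest {
		byName[f.Name] = f
	}
	for _, sf := range src {
		if df, ok := byName[sf.Name]; ok {
			addMissing(&df.Nested, sf.Nested)
			continue
		}
		*dest = append(*dest, sf)
	}
}

func main() {
	dest := []*field{
		{Name: "id"},
		{Name: "meta", Nested: []*field{{Name: "a"}}},
	}
	src := []*field{
		{Name: "id"},
		{Name: "meta", Nested: []*field{{Name: "a"}, {Name: "b"}}},
		{Name: "ts"},
	}
	addMissing(&dest, src)
	// Prints each top-level field name with its nested field count.
	for _, f := range dest {
		fmt.Println(f.Name, len(f.Nested))
	}
}
```

Existing fields in dest are never overwritten in this sketch, so dest stays authoritative for any field it already defines.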

func SchemaDiff ¶

func SchemaDiff(before, after bigquery.Schema) string

SchemaDiff returns unified diff of two schemas. Returns "" if there is no difference.

func SchemaString ¶

func SchemaString(s bigquery.Schema) string

SchemaString returns schema in string format.

Types ¶

type InsertIDGenerator ¶

type InsertIDGenerator struct {
	// Counter is an atomically-managed counter used to differentiate Insert
	// IDs produced by the same process.
	Counter int64
	// Prefix should be able to uniquely identify this specific process,
	// to differentiate Insert IDs produced by different processes.
	//
	// If empty, prefix will be derived from system and process specific
	// properties.
	Prefix string
}

InsertIDGenerator generates unique Insert IDs.

BigQuery uses Insert IDs to deduplicate rows in the streaming insert buffer. The association between Insert ID and row persists only for the time the row is in the buffer.

InsertIDGenerator is safe for concurrent use.

var ID InsertIDGenerator

ID is the global InsertIDGenerator.

func (*InsertIDGenerator) Generate ¶

func (id *InsertIDGenerator) Generate() string

Generate returns a unique Insert ID.
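The combination of a per-process prefix and an atomic counter can be sketched as follows. The type idGen and its method next are hypothetical stand-ins; the "prefix:counter" format and the hostname-plus-PID fallback are assumptions, not the package's documented output.

```go
package main

import (
	"fmt"
	"os"
	"sync"
	"sync/atomic"
)

// idGen is a hypothetical stand-in for InsertIDGenerator.
type idGen struct {
	Counter int64  // incremented atomically on every call to next
	Prefix  string // distinguishes processes; derived if empty
}

// next returns the prefix joined with an atomically incremented counter.
// The output format and the fallback prefix are assumptions of this sketch.
func (g *idGen) next() string {
	prefix := g.Prefix
	if prefix == "" {
		host, _ := os.Hostname()
		prefix = fmt.Sprintf("%s:%d", host, os.Getpid())
	}
	return fmt.Sprintf("%s:%d", prefix, atomic.AddInt64(&g.Counter, 1))
}

func main() {
	g := &idGen{Prefix: "worker-1"}
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		seen = map[string]bool{}
	)
	// Generate IDs from 100 goroutines and record each one.
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			id := g.next()
			mu.Lock()
			seen[id] = true
			mu.Unlock()
		}()
	}
	wg.Wait()
	fmt.Println(len(seen)) // 100: every ID is distinct
}
```

Because the counter is only advanced through an atomic add, concurrent callers never observe the same value, which is what makes a generator of this shape safe for concurrent use.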

type Row ¶

type Row struct {
	proto.Message // embedded

	// InsertID is unique per insert operation to handle deduplication.
	InsertID string
}

Row implements bigquery.ValueSaver.

func (*Row) Save ¶

func (r *Row) Save() (map[string]bigquery.Value, string, error)

Save is used by bigquery.Uploader.Put when inserting values into a table.

type SchemaConverter ¶

type SchemaConverter struct {
	Desc           *descriptorpb.FileDescriptorSet
	SourceCodeInfo map[*descriptorpb.FileDescriptorProto]SourceCodeInfoMap
}

func (*SchemaConverter) Schema ¶

func (c *SchemaConverter) Schema(messageName string) (schema bigquery.Schema, description string, err error)

Schema constructs a bigquery.Schema from a named message.

type SourceCodeInfoMap ¶

type SourceCodeInfoMap map[interface{}]*descriptorpb.SourceCodeInfo_Location

SourceCodeInfoMap maps descriptor proto messages to source code info, if available. See also descutil.IndexSourceCodeInfo.

type Uploader ¶

type Uploader struct {
	*bigquery.Uploader
	// Uploader is bound to a specific table. DatasetID and TableID are
	// provided for reference.
	DatasetID string
	TableID   string
	// UploadsMetricName is a string used to create a tsmon Counter metric
	// for event upload attempts via Put, e.g.
	// "/chrome/infra/commit_queue/events/count". If unset, no metric will
	// be created.
	UploadsMetricName string

	// BatchSize is the max number of rows to send to BigQuery at a time.
	// The default is 500.
	BatchSize int
	// contains filtered or unexported fields
}

Uploader contains the necessary data for streaming data to BigQuery.

func NewUploader ¶

func NewUploader(ctx context.Context, c *bigquery.Client, datasetID, tableID string) *Uploader

NewUploader constructs a new Uploader struct.

DatasetID and TableID are provided to the BigQuery client to gain access to a particular table.

You may want to change the default configuration of the embedded bigquery.Uploader. See the bigquery.Uploader documentation for details.

Set UploadsMetricName on the resulting Uploader to use the default counter metric.

Set BatchSize to set a custom batch size.

func (*Uploader) Put ¶

func (u *Uploader) Put(ctx context.Context, messages ...proto.Message) error

Put uploads one or more rows to the BigQuery service. Put takes care of adding InsertIDs, used by BigQuery to deduplicate rows.

If any rows do not match one of the expected types, Put will not attempt to upload any rows and returns an InvalidTypeError.

Put returns a PutMultiError if one or more rows failed to be uploaded. The PutMultiError contains a RowInsertionError for each failed row.

Put retries on temporary errors. If an error persists, the retries would otherwise continue indefinitely, so Put adds a timeout when ctx does not already have one.
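The "add a timeout only if the caller did not set one" pattern can be sketched with the standard context package. The names withDefaultTimeout and defaultTimeout are hypothetical, and the one-minute value is an assumption, since the text does not state which timeout Put applies.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// defaultTimeout is a hypothetical value chosen for this sketch.
const defaultTimeout = time.Minute

// withDefaultTimeout returns ctx unchanged (with a no-op cancel) if it
// already has a deadline, otherwise a child context that expires after
// defaultTimeout.
func withDefaultTimeout(ctx context.Context) (context.Context, context.CancelFunc) {
	if _, ok := ctx.Deadline(); ok {
		return ctx, func() {}
	}
	return context.WithTimeout(ctx, defaultTimeout)
}

// deadlineAdded reports whether a background context has a deadline
// before and after it is passed through withDefaultTimeout.
func deadlineAdded() (before, after bool) {
	ctx := context.Background()
	_, before = ctx.Deadline()
	wrapped, cancel := withDefaultTimeout(ctx)
	defer cancel()
	_, after = wrapped.Deadline()
	return before, after
}

func main() {
	before, after := deadlineAdded()
	fmt.Println(before, after) // false true
}
```

Callers who need tighter control can pass a context that already carries a deadline, in which case a wrapper of this shape leaves it untouched.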

See bigquery documentation and source code for detailed information on how struct values are mapped to rows.

Directories ¶

Path	Synopsis
pb	Package pb contains helper protobuf messages used to define BQ schemas.