Package bigquerysync

v0.0.0-...-e0f9f6e

Published: Sep 4, 2015 | License: Apache-2.0 | Module: bitbucket.org/ronoaldo/aetools

Overview

Package bigquerysync allows the AppEngine Datastore to be synced with Google BigQuery.

Constants

const (
	// BigquerySyncOptionsKind is the kind that holds configuration
	// options for the synchronization.
	BigquerySyncOptionsKind = "BigquerySyncOptions"
	// BigqueryScope is the OAuth2 scope to access BigQuery data.
	BigqueryScope = "https://www.googleapis.com/auth/bigquery"
	// InsertAllRequestKind is the API Kind field value for the
	// streaming bigquery ingestion request.
	InsertAllRequestKind = "bigquery#tableDataInsertAllRequest"
)
const (
	StatByPropertyKind = "__Stat_PropertyType_PropertyName_Kind__"
	StatByKindKind     = "__Stat_Kind__"
)
const (
	MaxErrorsPerSync = 10
	BatchSize        = 81
)

Variables

var (
	// InsertAllURL is the URL endpoint where we send data to the streaming request.
	InsertAllURL = "https://www.googleapis.com/bigquery/v2/projects/%s/datasets/%s/tables/%s/insertAll"
)
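
The format string above has three %s placeholders. A minimal sketch of filling them in, assuming they are meant to receive the project, dataset, and table IDs in that order (insertAllURL and insertAllEndpoint are hypothetical names used only for this illustration):

```go
package main

import "fmt"

// insertAllURL copies the InsertAllURL format string shown above.
const insertAllURL = "https://www.googleapis.com/bigquery/v2/projects/%s/datasets/%s/tables/%s/insertAll"

// insertAllEndpoint is a hypothetical helper that fills in the project,
// dataset, and table placeholders, in that order.
func insertAllEndpoint(project, dataset, table string) string {
	return fmt.Sprintf(insertAllURL, project, dataset, table)
}
```
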
var NewClient func(c appengine.Context) (*http.Client, error) = newServiceAccountClient

NewClient returns an http.Client that authenticates all requests using the application Service Account. It is a variable so that it can be replaced: to mock it in unit tests, to use a different service account, or to plug in a custom OAuth implementation.
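
The replaceable-variable pattern might be exercised in a test roughly as follows. This is a sketch only: fakeContext stands in for appengine.Context, newClient for the package-level NewClient, and withFakeClient and stubClient are hypothetical helpers that are not part of the package.

```go
package main

import (
	"errors"
	"net/http"
)

// fakeContext stands in for appengine.Context in this sketch.
type fakeContext interface{}

// newClient plays the role of the package-level NewClient variable.
// Its default here always fails, as a real service account would
// outside of App Engine.
var newClient = func(c fakeContext) (*http.Client, error) {
	return nil, errors.New("no service account available")
}

// stubClient is the client handed out while the fake is installed.
var stubClient = &http.Client{}

// withFakeClient swaps newClient for the duration of fn and restores
// the original afterwards, even if fn panics.
func withFakeClient(fake *http.Client, fn func()) {
	orig := newClient
	defer func() { newClient = orig }()
	newClient = func(c fakeContext) (*http.Client, error) { return fake, nil }
	fn()
}
```

Restoring the original in a deferred call keeps one test's fake from leaking into the next.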

var (
	ScatterProperty = "__scatter__"
)

func CompareKeys

func CompareKeys(k, other *datastore.Key) int

CompareKeys compares k and other, returning -1, 0, or 1 if k is less than, equal to, or greater than other, taking into account the full ancestor path.
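
To illustrate what comparing along the full ancestor path can mean, here is a sketch over a simplified key type. The key struct is a hypothetical stand-in for *datastore.Key, and the element ordering used (kind, then numeric ID, then name) is an assumption of this example, not necessarily the ordering the package implements.

```go
package main

import "strings"

// key is a simplified, hypothetical stand-in for *datastore.Key.
type key struct {
	kind   string
	intID  int64
	name   string
	parent *key
}

// path returns the keys from the root ancestor down to k.
func path(k *key) []*key {
	var p []*key
	for ; k != nil; k = k.parent {
		p = append([]*key{k}, p...)
	}
	return p
}

// compareElem orders two path elements by kind, then numeric ID, then name.
func compareElem(a, b *key) int {
	if c := strings.Compare(a.kind, b.kind); c != 0 {
		return c
	}
	if a.intID != b.intID {
		if a.intID < b.intID {
			return -1
		}
		return 1
	}
	return strings.Compare(a.name, b.name)
}

// compareKeys returns -1, 0, or 1, comparing the two ancestor paths
// element by element starting at the root. When one path is a prefix
// of the other, the shorter path sorts first.
func compareKeys(k, other *key) int {
	a, b := path(k), path(other)
	for i := 0; i < len(a) && i < len(b); i++ {
		if c := compareElem(a[i], b[i]); c != 0 {
			return c
		}
	}
	switch {
	case len(a) < len(b):
		return -1
	case len(a) > len(b):
		return 1
	}
	return 0
}
```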

func CreateTableForKind

func CreateTableForKind(c appengine.Context, project, dataset, kind string) (*bigquery.Table, error)

CreateTableForKind parses the datastore statistics for a kind name, generates a schema suitable for BigQuery, and then creates a new table using the kind name as the identifier in the provided project and dataset. It returns the newly created bigquery.Table and a nil error, or a nil table and the error produced during schema parsing, client configuration, or the table creation call.

func IngestToBigQuery

func IngestToBigQuery(c appengine.Context, project, dataset string, entities []*aetools.Entity, exclude string) error

IngestToBigQuery takes a slice of aetools.Entity values and ingests their JSON representation into the specified project and dataset.

func KeyPath

func KeyPath(k *datastore.Key) []*datastore.Key

KeyPath takes a datastore.Key and decomposes its ancestor path as a slice of keys, where the first ancestor is at position 0.
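
A sketch of this decomposition over a simplified key type. The key struct is a hypothetical stand-in for *datastore.Key, and the example assumes that the root ancestor lands at position 0 and that the key itself is the last element; the description above does not say whether the key is included.

```go
package main

// key is a simplified, hypothetical stand-in for *datastore.Key.
type key struct {
	kind   string
	id     int64
	parent *key
}

// keyPath walks the parent links up to the root, then reverses the
// slice so the root ancestor is at index 0 and k itself is last.
func keyPath(k *key) []*key {
	var p []*key
	for cur := k; cur != nil; cur = cur.parent {
		p = append(p, cur)
	}
	for i, j := 0, len(p)-1; i < j; i, j = i+1, j-1 {
		p[i], p[j] = p[j], p[i]
	}
	return p
}
```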

func MakeFieldName

func MakeFieldName(propName string) string

MakeFieldName returns propName with every invalid field name character replaced by "_".
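
One way such a replacement could be written is sketched below. It assumes that the valid characters are ASCII letters, digits, and underscores; the exact set the package treats as valid is not stated here, and makeFieldName is a hypothetical name for this illustration.

```go
package main

import "regexp"

// invalidFieldChars matches any character other than an ASCII letter,
// digit, or underscore. This character set is an assumption.
var invalidFieldChars = regexp.MustCompile(`[^a-zA-Z0-9_]`)

// makeFieldName replaces each invalid character with "_".
func makeFieldName(propName string) string {
	return invalidFieldChars.ReplaceAllString(propName, "_")
}
```

Under that assumption, a property named address.zip-code would become address_zip_code.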

func SchemaForKind

func SchemaForKind(c appengine.Context, kind string) (*bigquery.TableSchema, error)

SchemaForKind guesses the schema based on the datastore statistics for the specified entity kind.

func SyncKeyRange

func SyncKeyRange(c appengine.Context, project, dataset string, start, end *datastore.Key, exclude string) (int, *datastore.Key, error)

SyncKeyRange synchronizes the specified key range using the provided appengine context. The kind for the start and end keys must be the same. The function returns an error if the start key is nil. If the end key is nil, all entities starting from the start key are processed.

The caller is responsible for checking whether the last key returned equals the end parameter and, if it does not, rescheduling the synchronization:

start, end = startKey(), endKey()
count, start, err := SyncKeyRange(c, proj, dataset, start, end, "")
if err != nil {
	// Handle errors
} else if !start.Equal(end) {
	// Reschedule from new start
}

The above sample code illustrates how to handle the results.
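
The same resume-from-last-key idea can be sketched as an in-process loop. This is an illustration only: syncFunc is a hypothetical stand-in for SyncKeyRange that works on integer positions rather than datastore keys, and a real App Engine application would more likely reschedule a task than loop inside one request.

```go
package main

import "errors"

// syncFunc is a hypothetical stand-in for SyncKeyRange that uses integer
// positions instead of datastore keys. It reports how many entities it
// processed and the last position it reached.
type syncFunc func(start, end int) (count, last int, err error)

// syncAll keeps calling sync from the last returned position until the
// end of the range is reached, and returns the total number processed.
func syncAll(sync syncFunc, start, end int) (int, error) {
	total := 0
	for start != end {
		n, last, err := sync(start, end)
		total += n
		if err != nil {
			return total, err
		}
		if last == start {
			// Guard against looping forever when a call makes no progress.
			return total, errors.New("sync made no progress")
		}
		start = last
	}
	return total, nil
}
```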

type Errors

type Errors []error

Errors collects all errors that occur during the ingestion job, for reporting.

func (Errors) Error

func (e Errors) Error() string
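
A slice-of-errors type like this is commonly used to keep going after a failure and report everything at the end. The sketch below shows that pattern with hypothetical names (errorList, validate); the message format produced by the package's own Error method is not documented here, so the "; " separator is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// errorList is a hypothetical type mirroring the shape of Errors.
type errorList []error

// Error joins the individual messages with "; ".
func (e errorList) Error() string {
	msgs := make([]string, len(e))
	for i, err := range e {
		msgs[i] = err.Error()
	}
	return strings.Join(msgs, "; ")
}

// validate collects one error per empty row instead of stopping at the
// first failure.
func validate(rows []string) error {
	var errs errorList
	for i, r := range rows {
		if r == "" {
			errs = append(errs, fmt.Errorf("row %d is empty", i))
		}
	}
	if len(errs) == 0 {
		// Return a plain nil so callers can rely on err != nil.
		return nil
	}
	return errs
}
```

Returning an explicit nil when nothing was collected avoids handing callers a non-nil error interface that wraps an empty list.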

type InsertAllRequest

type InsertAllRequest struct {
	Kind string      `json:"kind"`
	Rows []InsertRow `json:"rows"`
}

InsertAllRequest is the payload for streaming data into BigQuery.

type InsertRow

type InsertRow struct {
	InsertID string                 `json:"insertId"`
	Json     map[string]interface{} `json:"json"`
}

InsertRow represents one row to be ingested.
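
To show how the two types fit together on the wire, the sketch below copies their declarations and encodes a request with encoding/json. The encodeRequest helper is hypothetical and only illustrates the struct tags; it is not how the package necessarily builds its requests.

```go
package main

import "encoding/json"

// InsertRow and InsertAllRequest are copied from the declarations above.
type InsertRow struct {
	InsertID string                 `json:"insertId"`
	Json     map[string]interface{} `json:"json"`
}

type InsertAllRequest struct {
	Kind string      `json:"kind"`
	Rows []InsertRow `json:"rows"`
}

// encodeRequest is a hypothetical helper that wraps rows in a request
// using the InsertAllRequestKind value and returns the JSON body.
func encodeRequest(rows []InsertRow) (string, error) {
	req := InsertAllRequest{
		Kind: "bigquery#tableDataInsertAllRequest",
		Rows: rows,
	}
	b, err := json.Marshal(req)
	return string(b), err
}
```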

type KeyRange

type KeyRange struct {
	Start *datastore.Key
	End   *datastore.Key
}

func KeyRangesForKind

func KeyRangesForKind(c appengine.Context, kind string) []KeyRange

KeyRangesForKind generates a set of KeyRanges, attempting to make them uniformly distributed by using the __scatter__ property implementation.

type StatByKind

type StatByKind struct {
	Count               int64     `datastore:"count"`
	EntityBytes         int64     `datastore:"entity_bytes"`
	Kind                string    `datastore:"kind_name"`
	IndexBytes          int64     `datastore:"builtin_index_bytes"`
	IndexCount          int64     `datastore:"builtin_index_count"`
	CompositeIndexBytes int64     `datastore:"composite_index_count"`
	CompositeIndexCount int64     `datastore:"composite_index_bytes"`
	Timestamp           time.Time `datastore:"timestamp"`
}

StatByKind holds the statistics for an entity kind.

type StatByProperty

type StatByProperty struct {
	Count      int64     `datastore:"count"`
	Bytes      int64     `datastore:"bytes"`
	Type       string    `datastore:"property_type"`
	Name       string    `datastore:"property_name"`
	Kind       string    `datastore:"kind_name"`
	IndexBytes int64     `datastore:"builtin_index_bytes"`
	IndexCount int64     `datastore:"builtin_index_count"`
	Timestamp  time.Time `datastore:"timestamp"`
}

StatByProperty holds the statistics for an entity property.

Documentation was rendered with GOOS=linux and GOARCH=amd64.
