sql

package

v1.1.2 Latest Latest Go to latest Published: Oct 26, 2017 License: Apache-2.0 Imports: 72 Imported by: 0

Documentation ¶

Overview ¶

Package sql provides the user-facing API for access to a Cockroach datastore. As the name suggests, the API is based around SQL, the same SQL you find in traditional RDBMS systems like Oracle, MySQL or Postgres. The core Cockroach system implements a distributed, transactional, monolithic sorted key-value map. The sql package builds on top of this core system (provided by the storage and kv packages) adding parsing, query planning and query execution as well as defining the privilege model.

Databases and Tables ¶

The two primary objects are databases and tables. A database is a namespace which holds a series of tables. Conceptually, a database can be viewed as a directory in a filesystem plus some additional metadata. A table is like a file on steroids: it contains a structured layout of rows and columns along with secondary indexes.

Like a directory, a database has a name and some metadata. The metadata is defined by the DatabaseDescriptor:

message DatabaseDescriptor {
  optional string name;
  optional uint32 id;
  optional PrivilegeDescriptor privileges;
}

As you can see, currently the metadata we store for databases just consists of privileges.

Similarly, tables have a TableDescriptor:

message TableDescriptor {
  optional string name;
  optional uint32 id;
  repeated ColumnDescriptor columns;
  optional IndexDescriptor primary_index;
  repeated IndexDescriptor indexes;
  optional PrivilegeDescriptor privileges;
}

Both the database ID and the table ID are allocated from the same "ID space" and IDs are never reused.

The namespace in which databases and tables exist contains only two levels: the root level contains databases and the database level contains tables. The "system.namespace" and "system.descriptor" tables implement the mapping from database/table name to ID and from ID to descriptor:

CREATE TABLE system.namespace (
  "parentID" INT,
  "name"     CHAR,
  "id"       INT,
  PRIMARY KEY ("parentID", name)
);

Create TABLE system.descriptor (
  "id"         INT PRIMARY KEY,
  "descriptor" BLOB
);

The ID 0 is a reserved ID used for the "root" of the namespace in which the databases reside. In order to look up the ID of a database given its name, the system runs the underlying key-value operations that correspond to the following query:

SELECT id FROM system.namespace WHERE "parentID" = 0 AND name = <database-name>

And given a database/table ID, the system looks up the descriptor using the following query:

SELECT descriptor FROM system.descriptor WHERE id = <ID>

Let's also create two new tables to use as running examples, one relatively simple, and one a little more complex. The first table is just a list of stores, with a "store_id" primary key that is an automatically incremented unique integer as the primary key (the "SERIAL" datatype) and a name.

CREATE DATABASE test;
SET DATABASE TO test;

Create TABLE stores (
  "store_id" SERIAL PRIMARY KEY,
  "name" CHAR UNIQUE
);

The second table

CREATE TABLE inventory (
  "item_id" INT UNIQUE,
  "name" CHAR UNIQUE,
  "at_store" INT,
  "stock" INT,
  PRIMARY KEY (item_id, at_store),
  CONSTRAINT at_store_fk FOREIGN KEY (at_store) REFERENCES stores (store_id)
);

Primary Key Addressing ¶

All of the SQL data stored in tables is mapped down to individual keys and values. We call the exact mapping converting any table or row to a key value pair "key addressing". Cockroach's key addressing relies upon a primary key, and thus all tables have a primary key, whether explicitly listed in the schema or automatically generated. Note that the notion of a "primary key" refers to the primary key in the SQL sense, and is unrelated to the "key" in Cockroach's underlying key-value pairs.

Primary keys consist of one or more non-NULL columns from the table. For a given row of the table, the columns for the primary key are encoded into a single string. For example, our inventory table would be encoded as:

/item_id/at_store

[Note that "/" is being used to disambiguate the components of the key. The actual encodings do not use the "/" character. The actual encoding is specified in the `util` package in `util/encoding`. These encoding routines allow for the encoding of NULL values, integers, floating point numbers and strings such that the lexicographic ordering of the encoded strings corresponds to the same ordering of the unencoded data.]

Before being stored in the monolithic key-value space, the encoded primary key columns are prefixed with the table ID and an ID indicating that the key corresponds to the primary index. The prefix for the inventory table looks like this:

/TableID/PrimaryIndexID/item_id/at_store

Each column value is stored in a key with that prefix. Every column has a unique ID (local to the table). The value for every cell is stored at the key:

/TableID/PrimaryIndexID/item_id/at_store/ColumnID -> ColumnValue

Thus, the scan over the range

[/TableID/PrimaryIndexID/item_id/at_store,
/TableID/PrimaryIndexID/item_id/at_storf)

Where the abuse of notation "namf" in the end key refers to the key resulting from incrementing the value of the start key. As an efficiency, we do not store columns NULL values. Thus, all returned rows from the above scan give us enough information to construct the entire row. However, a row that has exclusively NULL values in non-primary key columns would have nothing stored at all. Thus, to note the existence of a row with only a primary key and remaining NULLs, every row also has a sentinel key indicating its existence. The sentinel key is simply the primary index key, with an empty value:

/TableID/PrimaryIndexID/item_id/at_store -> <empty>

Thus the above scan on such a row would return a single key, which we can use to reconstruct the row filling in NULLs for the non-primary-key values.

Column Families ¶

The above structure is inefficient if we have many columns, since each row in an N-column table results in up to N+1 entries (1 sentinel key + N keys if every column was non-NULL). Thus, Cockroach has the ability to group multiple columns together and write them as a single key-value pair. We call this a "column family", and there are more details in this blog post: https://www.cockroachlabs.com/blog/sql-cockroachdb-column-families/

Secondary Indexes ¶

Despite not being a formal part of the SQL standard, secondary indexes are one of its most powerful features. Secondary indexes are a level of indirection that allow quick lookups of a row using something other than the primary key. As an example, here is a secondary index on the "inventory" table, using only the "name" column:

CREATE INDEX name ON inventory (name);

This secondary index allows fast lookups based on just the "name". We use the following key addressing scheme for this non-unique index:

/TableId/SecondaryIndexID/name/item_id/at_store -> <empty>

Notice that while the index is on "name", the key contains both "name" and the values for item_id and at_store. This is done to ensure that each row for a table has a unique key for the non-unique index. In general, in order to guarantee that a non-unique index is unique, we encode the index's columns followed by any primary key columns that have not already been mentioned. Since the primary key must uniquely define a row, this transforms any non-unique index into a unique index.

Let's suppose that we had instead defined the index as:

CREATE UNIQUE INDEX name ON inventory (name, item_id);

Since this index is defined on creation as a unique index, we do not need to append the rest of the primary key columns to ensure uniqueness; instead, any insertion of a row into the table that would result in a duplication in the index will fail (and if there already are duplicates upon creation, the index creation itself will fail). However, we still need to be able to decode the full primary key by reading this index, as we will see later, in order to read any columns that are not in this index:

SELECT at_store FROM inventory WHERE name = "foo";

The solution is to put any remaining primary key columns into the value. Thus, the key addressing for this unique index looks like this:

/TableID/SecondaryIndexID/name/item_id -> at_store

The value for a unique index is composed of any primary key columns that are not already part of the index ("at_store" in this example). The goal of this key addressing scheme is to ensure that the primary key is fully specified by the key-value pair, and that the key portion is unique. However, any lookup of a non-primary and non-index column requires two reads, first to decode the primary key, and then to read the full row for the primary key, which contains all the columns. For instance, to read the value of the "stock" column in this table:

SELECT stock FROM inventory WHERE name = "foo";

Looking this up by the index on "name" does not give us the value of the "stock" column. Instead, to process this query, Cockroach does two key-value reads, which are morally equivalent to the following two SQL queries:

SELECT (item_id, at_store) FROM inventory WHERE name = "foo";

Then we use the values for the primary key that we received from the first query to perform the lookup:

SELECT stock FROM inventory WHERE item_id = "..." AND at_store = "...";

Query Planning and Execution ¶

SQL queries are executed by converting every SQL query into a set of transactional key-value operations. The Cockroach distributed transactional key-value store provides a few operations, of which we shall discuss execution using two important ones: conditional puts, and ordered scans.

Query planning is the system which takes a parsed SQL statement (described by an abstract syntax tree) and creates an execution plan which is itself a tree of operations. The execution tree consists of leaf nodes that are SCANs and PUTs, and internal nodes that consist of operations such as join, groupby, sort, or projection.. For the bulk of SQL statements, query planning is straightforward: the complexity lies in SELECT.

At one end of the performance spectrum, an implementation of SELECT can be straightforward: do a full scan of the (joined) tables in the FROM clause, filter rows based on the WHERE clause, group the resulting rows based on the GROUP BY clause, filter those rows using the HAVING clause, and sort the remaining rows using the ORDER BY clause. There are a number of steps, but they all have well defined semantics and are mostly just an exercise in software engineering: retrieve the rows as quickly as possible and then send them through the pipeline of filtering, grouping, filtering and sorting.

However, this naive execution plan would have poor performance if the first scans return large amounts of data: if we are scanning orders of magnitude extra data, only to discard the vast majority of rows as we filter out the few rows that we need, this is needlessly inefficient. Instead, the query planner attempts to take advantage of secondary indexes to limit the data retrieved by the leafs. Additionally, the query planner makes joins between tables faster by taking advantage of the different sort orders of various secondary indexes, and avoiding re-sorting (or taking advantage of partial sorts to limit the amount of sorting done). As query planning is under active development, the details of how we implement this are in flux and will continue to be in flux for the foreseeable future. This section is intended to provide a high-level overview of a few of the techniques involved.

For a SELECT query, after parsing it, the query planner performs semantic analysis to statically verify if the query obeys basic type-safety checks, and to resolve names within the query to actual objects within the system. Let's consider a query which looks up the stock of an item in the inventory table named "foo" with item_id X:

SELECT stock FROM inventory WHERE item_id = X AND name = 'test'

The query planner first needs to resolve the "inventory" qualified name in the FROM clause to the appropriate TableDescriptor. It also needs to resolve the "item_id", "stock" and "name" column references to the appropriate column descriptions with the "inventory" TableDescriptor. Lastly, as part of semantic analysis, the query planner verifies that the expressions in the select targets and the WHERE clause are valid (e.g. the WHERE clause evaluates to a boolean).

From that starting point, the query planner then analyzes the GROUP BY and ORDER BY clauses, adding "hidden" targets for expressions used in those clauses that are not explicit targets of the query. Our example query does not have any GROUP BY or ORDER BY clauses, so we move straight to the next step: index selection. Index selection is the stage where the query planner selects the best index to scan and selects the start and end keys that minimize the amount of scanned data. Depending on the complexity of the query, the query planner might even select multiple ranges to scan from an index or multiple ranges from different indexes.

How does the query planner decide which index to use and which range of the index to scan? We currently use a restricted form of value propagation in order to determine the range of possible values for columns referenced in the WHERE clause. Using this range information, each index is examined to determine if it is a potential candidate and ranked according to its specificity. In addition to ranking indexes by the column value range information, they are ranked by how well they match the sorting required by the ORDER BY clause. A more detailed description is here: https://www.cockroachlabs.com/blog/index-selection-cockroachdb-2/, but back to the example above, the range information would determine that:

item_id >= 0 AND item_id <= 0 AND name >= 'test' and name <= 'test

Since there are two indexes on the "inventory" table, one index on "name" and another unique index on "item_id" and "name", the latter is selected as the candidate for performing a scan. To perform this scan, we need a start (inclusive) and end key (exclusive). The start key is computed using the SecondaryIndexID of the chosen index, and the constraints on the range information above:

/inventory/SecondaryIndexID/item_id/name

The end key is:

/inventory/SecondaryIndexID/item_id/namf

The "namf" suffix is not a typo: it is an abuse of notation to demonstrate how we calculate the end key: the end key is computed by incrementing the final byte of the start key such that "t" becomes "u".

Our example scan will return two key-value pairs:

/system.descriptor/primary/0/test    -> NULL
/system.descriptor/primary/0/test/id -> <ID>

The first key is the sentinel key, and the value from the second key returned by the scan is the result we need to return as the result of this SQL query.

Index ¶

Constants
Variables
func AddPlanHook(f planHookFn)
func CheckPrivilege(user string, descriptor sqlbase.DescriptorProto, privilege privilege.Kind) error
func DropTableDesc(ctx context.Context, tableDesc *sqlbase.TableDescriptor, db *client.DB, ...) error
func DropTableName(ctx context.Context, tableDesc *sqlbase.TableDescriptor, db *client.DB, ...) error
func EvalAsOfTimestamp(evalCtx *parser.EvalContext, asOf parser.AsOfClause, max hlc.Timestamp) (hlc.Timestamp, error)
func GenerateInsertRow(defaultExprs []parser.TypedExpr, ...) (parser.Datums, error)
func GenerateUniqueDescID(ctx context.Context, db *client.DB) (sqlbase.ID, error)
func GetKeysForTableDescriptor(tableDesc *sqlbase.TableDescriptor) (zoneKey roachpb.Key, nameKey roachpb.Key, descKey roachpb.Key)
func GetTableDesc(cfg config.SystemConfig, id sqlbase.ID) (*sqlbase.TableDescriptor, error)
func GetUserHashedPassword(ctx context.Context, executor *Executor, metrics *MemoryMetrics, ...) (bool, []byte, error)
func GetZoneConfig(cfg config.SystemConfig, id uint32) (config.ZoneConfig, bool, error)
func HashAppName(appName string) string
func HoistConstraints(n *parser.CreateTable)
func IsStmtParallelized(stmt Statement) bool
func MakeTableDesc(ctx context.Context, txn *client.Txn, vt VirtualTabler, ...) (sqlbase.TableDescriptor, error)
func MustGetDatabaseDesc(ctx context.Context, txn *client.Txn, vt VirtualTabler, name string) (*sqlbase.DatabaseDescriptor, error)
func MustGetDatabaseDescByID(ctx context.Context, txn *client.Txn, id sqlbase.ID) (*sqlbase.DatabaseDescriptor, error)
func MustGetTableDesc(ctx context.Context, txn *client.Txn, vt VirtualTabler, tn *parser.TableName, ...) (*sqlbase.TableDescriptor, error)
func MustGetTableOrViewDesc(ctx context.Context, txn *client.Txn, vt VirtualTabler, tn *parser.TableName, ...) (*sqlbase.TableDescriptor, error)
func NewWireFailureError(err error) error
func NormalizeAndValidateUsername(username string) (string, error)
func RecomputeViewDependencies(ctx context.Context, txn *client.Txn, e *Executor) error
func TestDisableTableLeases() func()
type AuthorizationAccessor
type CopyDataBlock
- func (CopyDataBlock) Format(buf *bytes.Buffer, f parser.FmtFlags)
- func (CopyDataBlock) StatementTag() string
- func (CopyDataBlock) StatementType() parser.StatementType
- func (CopyDataBlock) String() string
type DatabaseAccessor
type DependencyAnalyzer
- func NewSpanBasedDependencyAnalyzer() DependencyAnalyzer
type DescriptorAccessor
type DistLoader
- func (l *DistLoader) LoadCSV(ctx context.Context, job *jobs.Job, db *client.DB, evalCtx parser.EvalContext, ...) error
type DistSQLExecMode
- func DistSQLExecModeFromString(val string) DistSQLExecMode
- func (m DistSQLExecMode) String() string
type DistSQLPlannerTestingKnobs
type EventLogType
type EventLogger
- func MakeEventLogger(leaseMgr *LeaseManager) EventLogger
- func (ev EventLogger) InsertEventRecord(ctx context.Context, txn *client.Txn, eventType EventLogType, ...) error
type Executor
- func NewExecutor(cfg ExecutorConfig, stopper *stop.Stopper) *Executor
- func (e *Executor) AnnotateCtx(ctx context.Context) context.Context
- func (e *Executor) CopyData(session *Session, data string) error
- func (e *Executor) CopyDone(session *Session) error
- func (e *Executor) ExecutePreparedStatement(session *Session, stmt *PreparedStatement, pinfo *parser.PlaceholderInfo) error
- func (e *Executor) ExecuteStatements(session *Session, stmts string, pinfo *parser.PlaceholderInfo) error
- func (e *Executor) ExecuteStatementsBuffered(session *Session, stmts string, pinfo *parser.PlaceholderInfo, ...) (StatementResults, error)
- func (e *Executor) FillUnimplementedErrorCounts(fill map[string]int64)
- func (e *Executor) GetScrubbedStmtStats() []roachpb.CollectedStatementStatistics
- func (e *Executor) GetVirtualTabler() VirtualTabler
- func (e *Executor) IsVirtualDatabase(name string) bool
- func (e *Executor) Prepare(stmt Statement, stmtStr string, session *Session, ...) (res *PreparedStatement, err error)
- func (e *Executor) PrepareStmt(session *Session, s *parser.Prepare) error
- func (e *Executor) ResetStatementStats(ctx context.Context)
- func (e *Executor) ResetUnimplementedCounts()
- func (e *Executor) SetDistSQLSpanResolver(spanResolver distsqlplan.SpanResolver)
- func (e *Executor) Start(ctx context.Context, startupMemMetrics *MemoryMetrics, ...)
type ExecutorConfig
- func (ec *ExecutorConfig) Organization() string
type ExecutorTestingKnobs
- func (*ExecutorTestingKnobs) ModuleTestingKnobs()
type InternalExecutor
- func (ie InternalExecutor) ExecuteStatementInTransaction(ctx context.Context, opName string, txn *client.Txn, statement string, ...) (int, error)
- func (ie InternalExecutor) GetTableSpan(ctx context.Context, user string, txn *client.Txn, dbName, tableName string) (roachpb.Span, error)
- func (ie InternalExecutor) QueryRowInTransaction(ctx context.Context, opName string, txn *client.Txn, statement string, ...) (parser.Datums, error)
- func (ie InternalExecutor) QueryRowsInTransaction(ctx context.Context, opName string, txn *client.Txn, statement string, ...) ([]parser.Datums, error)
type LeaseManager
- func NewLeaseManager(nodeID *base.NodeIDContainer, db client.DB, clock *hlc.Clock, ...) *LeaseManager
- func (m *LeaseManager) Acquire(ctx context.Context, timestamp hlc.Timestamp, tableID sqlbase.ID) (*sqlbase.TableDescriptor, hlc.Timestamp, error)
- func (m *LeaseManager) AcquireByName(ctx context.Context, timestamp hlc.Timestamp, dbID sqlbase.ID, ...) (*sqlbase.TableDescriptor, hlc.Timestamp, error)
- func (m *LeaseManager) RefreshLeases(s *stop.Stopper, db *client.DB, gossip *gossip.Gossip)
- func (m *LeaseManager) Release(desc *sqlbase.TableDescriptor) error
- func (m *LeaseManager) SetDraining(drain bool)
type LeaseManagerTestingKnobs
- func (*LeaseManagerTestingKnobs) ModuleTestingKnobs()
type LeaseStore
- func (s LeaseStore) Publish(ctx context.Context, tableID sqlbase.ID, ...) (*sqlbase.Descriptor, error)
- func (s LeaseStore) WaitForOneVersion(ctx context.Context, tableID sqlbase.ID, retryOpts retry.Options) (sqlbase.DescriptorVersion, error)
type LeaseStoreTestingKnobs
- func (*LeaseStoreTestingKnobs) ModuleTestingKnobs()
type MemoryMetrics
- func MakeMemMetrics(endpoint string, histogramWindow time.Duration) MemoryMetrics
- func (MemoryMetrics) MetricStruct()
type NodeInfo
type ParallelizeQueue
- func MakeParallelizeQueue(analyzer DependencyAnalyzer) ParallelizeQueue
- func (pq *ParallelizeQueue) Add(params runParams, plan planNode, exec func(planNode) error) error
- func (pq *ParallelizeQueue) Errs() []error
- func (pq *ParallelizeQueue) Len() int
- func (pq *ParallelizeQueue) Wait() []error
type PlanHookState
type PreparedPortal
type PreparedPortals
- func (pp PreparedPortals) Delete(ctx context.Context, name string) bool
- func (pp PreparedPortals) Exists(name string) bool
- func (pp PreparedPortals) Get(name string) (*PreparedPortal, bool)
- func (pp PreparedPortals) New(ctx context.Context, name string, stmt *PreparedStatement, ...) (*PreparedPortal, error)
type PreparedStatement
type PreparedStatements
- func (ps PreparedStatements) Delete(ctx context.Context, name string) bool
- func (ps *PreparedStatements) DeleteAll(ctx context.Context)
- func (ps PreparedStatements) Exists(name string) bool
- func (ps PreparedStatements) Get(name string) (*PreparedStatement, bool)
- func (ps PreparedStatements) New(e *Executor, name string, stmt Statement, stmtStr string, ...) (*PreparedStatement, error)
- func (ps PreparedStatements) NewFromString(e *Executor, name, query string, placeholderHints parser.PlaceholderTypes) (*PreparedStatement, error)
type Result
- func (r *Result) Close(ctx context.Context)
type ResultList
- func (rl ResultList) Close(ctx context.Context)
type ResultsGroup
type ResultsWriter
type RowBuffer
- func (rb *RowBuffer) Next() bool
- func (rb *RowBuffer) Values() parser.Datums
type RowResultWriter
- func NewRowResultWriter(statementType parser.StatementType, rowContainer *sqlbase.RowContainer) *RowResultWriter
- func (b *RowResultWriter) AddRow(ctx context.Context, row parser.Datums) error
- func (b *RowResultWriter) IncrementRowsAffected(n int)
- func (b *RowResultWriter) StatementType() parser.StatementType
type SchemaAccessor
type SchemaChangeManager
- func NewSchemaChangeManager(st *cluster.Settings, testingKnobs *SchemaChangerTestingKnobs, db client.DB, ...) *SchemaChangeManager
- func (s *SchemaChangeManager) Start(stopper *stop.Stopper)
type SchemaChanger
- func NewSchemaChangerForTesting(tableID sqlbase.ID, mutationID sqlbase.MutationID, nodeID roachpb.NodeID, ...) SchemaChanger
- func (sc *SchemaChanger) AcquireLease(ctx context.Context) (sqlbase.TableDescriptor_SchemaChangeLease, error)
- func (sc *SchemaChanger) ExtendLease(ctx context.Context, existingLease *sqlbase.TableDescriptor_SchemaChangeLease) error
- func (sc *SchemaChanger) MaybeIncrementVersion(ctx context.Context) (*sqlbase.Descriptor, error)
- func (sc *SchemaChanger) ReleaseLease(ctx context.Context, lease sqlbase.TableDescriptor_SchemaChangeLease) error
- func (sc *SchemaChanger) RunStateMachineBeforeBackfill(ctx context.Context) error
type SchemaChangerTestingKnobs
- func (*SchemaChangerTestingKnobs) ModuleTestingKnobs()
type Session
- func NewSession(ctx context.Context, args SessionArgs, e *Executor, remote net.Addr, ...) *Session
- func (s *Session) ClearStatementsAndPortals(ctx context.Context)
- func (s *Session) CopyEnd(ctx context.Context)
- func (s *Session) Ctx() context.Context
- func (s *Session) EmergencyClose()
- func (s *Session) Finish(e *Executor)
- func (s *Session) FinishPlan()
- func (s *Session) OpenAccount() WrappableMemoryAccount
- func (s *Session) ProcessCopyData(ctx context.Context, data string, msg copyMsg) (StatementList, error)
- func (s *Session) StartMonitor(pool *mon.BytesMonitor, reserved mon.BoundAccount)
- func (s *Session) StartUnlimitedMonitor()
type SessionArgs
type SessionRegistry
- func MakeSessionRegistry() *SessionRegistry
- func (r *SessionRegistry) CancelQuery(queryIDStr string, username string) (bool, error)
- func (r *SessionRegistry) SerializeAll() []serverpb.Session
type SessionTracing
- func (st *SessionTracing) Enabled() bool
- func (st *SessionTracing) KVTracingEnabled() bool
- func (st *SessionTracing) RecordingType() tracing.RecordingType
- func (st *SessionTracing) StartTracing(recType tracing.RecordingType, kvTracingEnabled bool) error
- func (st *SessionTracing) StopTracing() error
type Statement
- func (s Statement) String() string
type StatementFilter
type StatementList
- func NewStatementList(stmts parser.StatementList) StatementList
- func (l StatementList) Format(buf *bytes.Buffer, f parser.FmtFlags)
- func (l StatementList) String() string
type StatementResult
type StatementResults
- func (s *StatementResults) Close(ctx context.Context)
type SyncSchemaChangersFilter
type TableCollection
type TestingSchemaChangerCollection
- func (tscc TestingSchemaChangerCollection) ClearSchemaChangers()
type TxnStateEnum
- func (i TxnStateEnum) String() string
type VirtualTabler
type WireFailureError
- func (e WireFailureError) Error() string
type WrappableMemoryAccount
- func (w *WrappableMemoryAccount) Wsession(s *Session) WrappedMemoryAccount
- func (w *WrappableMemoryAccount) Wtxn(s *Session) WrappedMemoryAccount
type WrappedMemoryAccount
- func (w WrappedMemoryAccount) Clear(ctx context.Context)
- func (w WrappedMemoryAccount) Close(ctx context.Context)
- func (w WrappedMemoryAccount) Grow(ctx context.Context, extraSize int64) error
- func (w WrappedMemoryAccount) OpenAndInit(ctx context.Context, initialAllocation int64) error
- func (w WrappedMemoryAccount) ResizeItem(ctx context.Context, oldSize, newSize int64) error

Constants ¶

View Source

const MaxSQLBytes = 1000

MaxSQLBytes is the maximum length in bytes of SQL statements serialized into a serverpb.Session. Exported for testing.

View Source

const (
	// PgServerVersion is the latest version of postgres that we claim to support.
	PgServerVersion = "9.5.0"
)

View Source

const TableTruncateChunkSize = indexTruncateChunkSize

TableTruncateChunkSize is the maximum number of keys deleted per chunk during a table truncation.

Variables ¶

View Source

var (
	MetaTxnBegin = metric.Metadata{
		Name: "sql.txn.begin.count",
		Help: "Number of SQL transaction BEGIN statements"}
	MetaTxnCommit = metric.Metadata{
		Name: "sql.txn.commit.count",
		Help: "Number of SQL transaction COMMIT statements"}
	MetaTxnAbort = metric.Metadata{
		Name: "sql.txn.abort.count",
		Help: "Number of SQL transaction ABORT statements"}
	MetaTxnRollback = metric.Metadata{
		Name: "sql.txn.rollback.count",
		Help: "Number of SQL transaction ROLLBACK statements"}
	MetaSelect = metric.Metadata{
		Name: "sql.select.count",
		Help: "Number of SQL SELECT statements"}
	MetaSQLExecLatency = metric.Metadata{
		Name: "sql.exec.latency",
		Help: "Latency of SQL statement execution"}
	MetaSQLServiceLatency = metric.Metadata{
		Name: "sql.service.latency",
		Help: "Latency of SQL request execution"}
	MetaDistSQLSelect = metric.Metadata{
		Name: "sql.distsql.select.count",
		Help: "Number of dist-SQL SELECT statements"}
	MetaDistSQLExecLatency = metric.Metadata{
		Name: "sql.distsql.exec.latency",
		Help: "Latency of dist-SQL statement execution"}
	MetaDistSQLServiceLatency = metric.Metadata{
		Name: "sql.distsql.service.latency",
		Help: "Latency of dist-SQL request execution"}
	MetaUpdate = metric.Metadata{
		Name: "sql.update.count",
		Help: "Number of SQL UPDATE statements"}
	MetaInsert = metric.Metadata{
		Name: "sql.insert.count",
		Help: "Number of SQL INSERT statements"}
	MetaDelete = metric.Metadata{
		Name: "sql.delete.count",
		Help: "Number of SQL DELETE statements"}
	MetaDdl = metric.Metadata{
		Name: "sql.ddl.count",
		Help: "Number of SQL DDL statements"}
	MetaMisc = metric.Metadata{
		Name: "sql.misc.count",
		Help: "Number of other SQL statements"}
	MetaQuery = metric.Metadata{
		Name: "sql.query.count",
		Help: "Number of SQL queries"}
)

Fully-qualified names for metrics.

View Source

var (
	// SchemaChangeLeaseDuration is the duration a lease will be acquired for.
	// Exported for testing purposes only.
	SchemaChangeLeaseDuration = 5 * time.Minute
	// MinSchemaChangeLeaseDuration is the minimum duration a lease will have
	// remaining upon acquisition. Exported for testing purposes only.
	MinSchemaChangeLeaseDuration = time.Minute
)

View Source

var ClusterOrganization = settings.RegisterStringSetting(
	"cluster.organization",
	"organization name",
	"",
)

ClusterOrganization is the organization name.

View Source

var DistSQLClusterExecMode = settings.RegisterEnumSetting(
	"sql.defaults.distsql",
	"Default distributed SQL execution mode",
	"Auto",
	map[int64]string{
		int64(DistSQLOff):  "Off",
		int64(DistSQLAuto): "Auto",
		int64(DistSQLOn):   "On",
	},
)

DistSQLClusterExecMode controls the cluster default for when DistSQL is used.

View Source

var (
	// LeaseDuration is the mean duration a lease will be acquired for. The
	// actual duration is jittered in the range
	// [0.75,1.25]*LeaseDuration. Exported for testing purposes only.
	LeaseDuration = 5 * time.Minute
)

View Source

var NilVirtualTabler nilVirtualTabler

NilVirtualTabler implements VirtualTabler that returns nil.

Functions ¶

func AddPlanHook ¶

func AddPlanHook(f planHookFn)

AddPlanHook adds a hook used to short-circuit creating a planNode from a parser.Statement. If the func returned by the hook is non-nil, it is used to construct a planNode that runs that func in a goroutine during Start.

func CheckPrivilege ¶ added in v1.1.0

func CheckPrivilege(
	user string, descriptor sqlbase.DescriptorProto, privilege privilege.Kind,
) error

CheckPrivilege verifies that `user“ has `privilege` on `descriptor`.

func DropTableDesc ¶ added in v1.1.0

func DropTableDesc(
	ctx context.Context, tableDesc *sqlbase.TableDescriptor, db *client.DB, traceKV bool,
) error

DropTableDesc removes a descriptor from the KV database.

func DropTableName ¶ added in v1.1.0

func DropTableName(
	ctx context.Context, tableDesc *sqlbase.TableDescriptor, db *client.DB, traceKV bool,
) error

DropTableName removes a mapping from name to ID from the KV database.

func EvalAsOfTimestamp ¶

func EvalAsOfTimestamp(
	evalCtx *parser.EvalContext, asOf parser.AsOfClause, max hlc.Timestamp,
) (hlc.Timestamp, error)

EvalAsOfTimestamp evaluates and returns the timestamp from an AS OF SYSTEM TIME clause.

func GenerateInsertRow ¶

func GenerateInsertRow(
	defaultExprs []parser.TypedExpr,
	insertColIDtoRowIndex map[sqlbase.ColumnID]int,
	insertCols []sqlbase.ColumnDescriptor,
	evalCtx parser.EvalContext,
	tableDesc *sqlbase.TableDescriptor,
	rowVals parser.Datums,
) (parser.Datums, error)

GenerateInsertRow prepares a row tuple for insertion. It fills in default expressions, verifies non-nullable columns, and checks column widths.

func GenerateUniqueDescID ¶

func GenerateUniqueDescID(ctx context.Context, db *client.DB) (sqlbase.ID, error)

GenerateUniqueDescID returns the next available Descriptor ID and increments the counter. The incrementing is non-transactional, and the counter could be incremented multiple times because of retries.

func GetKeysForTableDescriptor ¶

func GetKeysForTableDescriptor(
	tableDesc *sqlbase.TableDescriptor,
) (zoneKey roachpb.Key, nameKey roachpb.Key, descKey roachpb.Key)

GetKeysForTableDescriptor retrieves the KV keys corresponding to the zone, name and descriptor of a table.

func GetTableDesc ¶

func GetTableDesc(cfg config.SystemConfig, id sqlbase.ID) (*sqlbase.TableDescriptor, error)

GetTableDesc returns the table descriptor for the table with 'id'. Returns nil if the descriptor is not present, or is present but is not a table.

func GetUserHashedPassword ¶

func GetUserHashedPassword(
	ctx context.Context, executor *Executor, metrics *MemoryMetrics, username string,
) (bool, []byte, error)

GetUserHashedPassword returns the hashedPassword for the given username if found in system.users.

func GetZoneConfig ¶

func GetZoneConfig(cfg config.SystemConfig, id uint32) (config.ZoneConfig, bool, error)

GetZoneConfig returns the zone config for the object with 'id'.

func HashAppName ¶

func HashAppName(appName string) string

HashAppName 1-way hashes an application names for use in stat reporting.

func HoistConstraints ¶ added in v1.1.0

func HoistConstraints(n *parser.CreateTable)

HoistConstraints finds column constraints defined inline with the columns and moves them into n.Defs as constraints. For example, the foreign key constraint in `CREATE TABLE foo (a INT REFERENCES bar(a))` gets pulled into a top level fk constraint like `CREATE TABLE foo (a int CONSTRAINT .. FOREIGN KEY(a) REFERENCES bar(a)`.

func IsStmtParallelized ¶

func IsStmtParallelized(stmt Statement) bool

IsStmtParallelized determines if a given statement's execution should be parallelized. This means that its results should be mocked out, and that it should be run asynchronously and in parallel with other statements that are independent.

func MakeTableDesc ¶

func MakeTableDesc(
	ctx context.Context,
	txn *client.Txn,
	vt VirtualTabler,
	searchPath parser.SearchPath,
	n *parser.CreateTable,
	parentID, id sqlbase.ID,
	creationTime hlc.Timestamp,
	privileges *sqlbase.PrivilegeDescriptor,
	affected map[sqlbase.ID]*sqlbase.TableDescriptor,
	sessionDB string,
	evalCtx *parser.EvalContext,
) (sqlbase.TableDescriptor, error)

MakeTableDesc creates a table descriptor from a CreateTable statement.

func MustGetDatabaseDesc ¶

func MustGetDatabaseDesc(
	ctx context.Context, txn *client.Txn, vt VirtualTabler, name string,
) (*sqlbase.DatabaseDescriptor, error)

MustGetDatabaseDesc looks up the database descriptor given its name, returning an error if the descriptor is not found.

func MustGetDatabaseDescByID ¶ added in v1.1.0

func MustGetDatabaseDescByID(
	ctx context.Context, txn *client.Txn, id sqlbase.ID,
) (*sqlbase.DatabaseDescriptor, error)

MustGetDatabaseDescByID looks up the database descriptor given its ID, returning an error if the descriptor is not found.

func MustGetTableDesc ¶ added in v1.1.0

func MustGetTableDesc(
	ctx context.Context, txn *client.Txn, vt VirtualTabler, tn *parser.TableName, allowAdding bool,
) (*sqlbase.TableDescriptor, error)

MustGetTableDesc returns a table descriptor for a table, or an error if the descriptor is not found. allowAdding when set allows a table descriptor in the ADD state to also be returned.

func MustGetTableOrViewDesc ¶ added in v1.1.0

func MustGetTableOrViewDesc(
	ctx context.Context, txn *client.Txn, vt VirtualTabler, tn *parser.TableName, allowAdding bool,
) (*sqlbase.TableDescriptor, error)

MustGetTableOrViewDesc returns a table descriptor for either a table or view, or an error if the descriptor is not found. allowAdding when set allows a table descriptor in the ADD state to also be returned.

func NewWireFailureError ¶ added in v1.1.0

func NewWireFailureError(err error) error

NewWireFailureError returns a new WireFailureError which wraps err.

func NormalizeAndValidateUsername ¶

func NormalizeAndValidateUsername(username string) (string, error)

NormalizeAndValidateUsername case folds the specified username and verifies it validates according to the usernameRE regular expression.

func RecomputeViewDependencies ¶ added in v1.1.0

func RecomputeViewDependencies(ctx context.Context, txn *client.Txn, e *Executor) error

RecomputeViewDependencies does the work of CREATE VIEW w.r.t. dependencies over again. Used by a migration to fix existing view descriptors created prior to fixing #17269 and #17306; it may also be used by a future "fsck" utility.

func TestDisableTableLeases ¶

func TestDisableTableLeases() func()

TestDisableTableLeases disables table leases and returns a function that can be used to enable it.

Types ¶

type AuthorizationAccessor ¶

type AuthorizationAccessor interface {
	// CheckPrivilege verifies that the user has `privilege` on `descriptor`.
	CheckPrivilege(
		descriptor sqlbase.DescriptorProto, privilege privilege.Kind,
	) error

	// RequiresSuperUser errors if the session user isn't a super-user (i.e. root
	// or node). Includes the named action in the error message.
	RequireSuperUser(action string) error
	// contains filtered or unexported methods
}

AuthorizationAccessor for checking authorization (e.g. desc privileges).

type CopyDataBlock ¶

type CopyDataBlock struct {
	Done bool
}

CopyDataBlock represents a data block of a COPY FROM statement.

func (CopyDataBlock) Format ¶

func (CopyDataBlock) Format(buf *bytes.Buffer, f parser.FmtFlags)

Format implements the NodeFormatter interface.

func (CopyDataBlock) StatementTag ¶

func (CopyDataBlock) StatementTag() string

StatementTag returns a short string identifying the type of statement.

func (CopyDataBlock) StatementType ¶

func (CopyDataBlock) StatementType() parser.StatementType

StatementType implements the Statement interface.

func (CopyDataBlock) String ¶

func (CopyDataBlock) String() string

type DatabaseAccessor ¶

type DatabaseAccessor interface {
	// contains filtered or unexported methods
}

DatabaseAccessor provides helper methods for using SQL database descriptors.

type DependencyAnalyzer ¶

type DependencyAnalyzer interface {
	// Analyze collects any upfront analysis that is necessary to make future
	// independence decisions about the planNode. It must be called before
	// calling Independent for each planNode, and the planNode provided must
	// not be running when Analyze is called. Analyze is allowed to mutate the
	// provided planner if necessary.
	Analyze(runParams, planNode) error
	// Independent determines if the provided planNodess are independent from
	// one another. Either planNode may be running when Independent is called,
	// so the method will not modify the plans in any way. Implementations of
	// Independent are always commutative.
	Independent(planNode, planNode) bool
	// Clear is a hint to the DependencyAnalyzer that the provided plan will
	// no longer be needed. It is useful for DependencyAnalyzers that cache
	// state on the planNodes during Analyze.
	Clear(planNode)
}

DependencyAnalyzer determines if plans are independent of one another, where independent plans are defined by whether their execution could be safely reordered without having an effect on their runtime semantics or on their results. DependencyAnalyzer is used by ParallelizeQueue to test whether it is safe for multiple statements to be run concurrently.

DependencyAnalyzer implementations do not need to be safe to use from multiple goroutines concurrently.

var NoDependenciesAnalyzer DependencyAnalyzer = dependencyAnalyzerFunc(func(
	_ planNode, _ planNode,
) bool {
	return true
})

NoDependenciesAnalyzer is a DependencyAnalyzer that performs no analysis on planNodes and asserts that all plans are independent.

func NewSpanBasedDependencyAnalyzer ¶

func NewSpanBasedDependencyAnalyzer() DependencyAnalyzer

NewSpanBasedDependencyAnalyzer creates a new SpanBasedDependencyAnalyzer.

type DescriptorAccessor ¶

type DescriptorAccessor interface {
	// contains filtered or unexported methods
}

DescriptorAccessor provides helper methods for using descriptors to SQL objects.

type DistLoader ¶ added in v1.1.0

type DistLoader struct {
	// contains filtered or unexported fields
}

DistLoader uses DistSQL to convert external data formats (csv, etc) into sstables of our mvcc-format key values.

func (*DistLoader) LoadCSV ¶ added in v1.1.0

func (l *DistLoader) LoadCSV(
	ctx context.Context,
	job *jobs.Job,
	db *client.DB,
	evalCtx parser.EvalContext,
	thisNode roachpb.NodeID,
	nodes []roachpb.NodeDescriptor,
	resultRows *RowResultWriter,
	tableDesc *sqlbase.TableDescriptor,
	from []string,
	to string,
	comma, comment rune,
	nullif *string,
	walltime int64,
	splitSize int64,
) error

LoadCSV performs a distributed transformation of the CSV files at from and stores them in enterprise backup format at to.

type DistSQLExecMode ¶

type DistSQLExecMode int64

DistSQLExecMode controls if and when the Executor uses DistSQL.

const (
	// DistSQLOff means that we never use distSQL.
	DistSQLOff DistSQLExecMode = iota
	// DistSQLAuto means that we automatically decide on a case-by-case basis if
	// we use distSQL.
	DistSQLAuto
	// DistSQLOn means that we use distSQL for queries that are supported.
	DistSQLOn
	// DistSQLAlways means that we only use distSQL; unsupported queries fail.
	DistSQLAlways
)

func DistSQLExecModeFromString ¶

func DistSQLExecModeFromString(val string) DistSQLExecMode

DistSQLExecModeFromString converts a string into a DistSQLExecMode

func (DistSQLExecMode) String ¶

func (m DistSQLExecMode) String() string

type DistSQLPlannerTestingKnobs ¶ added in v1.1.0

type DistSQLPlannerTestingKnobs struct {
	// If OverrideSQLHealthCheck is set, we use this callback to get the health of
	// a node.
	OverrideHealthCheck func(node roachpb.NodeID, addrString string) error
}

DistSQLPlannerTestingKnobs is used to control internals of the distSQLPlanner for testing purposes.

type EventLogType ¶

type EventLogType string

EventLogType represents an event type that can be recorded in the event log.

const (
	// EventLogCreateDatabase is recorded when a database is created.
	EventLogCreateDatabase EventLogType = "create_database"
	// EventLogDropDatabase is recorded when a database is dropped.
	EventLogDropDatabase EventLogType = "drop_database"

	// EventLogCreateTable is recorded when a table is created.
	EventLogCreateTable EventLogType = "create_table"
	// EventLogDropTable is recorded when a table is dropped.
	EventLogDropTable EventLogType = "drop_table"
	// EventLogAlterTable is recorded when a table is altered.
	EventLogAlterTable EventLogType = "alter_table"

	// EventLogCreateIndex is recorded when an index is created.
	EventLogCreateIndex EventLogType = "create_index"
	// EventLogDropIndex is recorded when an index is dropped.
	EventLogDropIndex EventLogType = "drop_index"

	// EventLogCreateView is recorded when a view is created.
	EventLogCreateView EventLogType = "create_view"
	// EventLogDropView is recorded when a view is dropped.
	EventLogDropView EventLogType = "drop_view"

	// EventLogReverseSchemaChange is recorded when an in-progress schema change
	// encounters a problem and is reversed.
	EventLogReverseSchemaChange EventLogType = "reverse_schema_change"
	// EventLogFinishSchemaChange is recorded when a previously initiated schema
	// change has completed.
	EventLogFinishSchemaChange EventLogType = "finish_schema_change"

	// EventLogNodeJoin is recorded when a node joins the cluster.
	EventLogNodeJoin EventLogType = "node_join"
	// EventLogNodeRestart is recorded when an existing node rejoins the cluster
	// after being offline.
	EventLogNodeRestart EventLogType = "node_restart"

	// EventLogSetClusterSetting is recorded when a cluster setting is changed.
	EventLogSetClusterSetting EventLogType = "set_cluster_setting"
)

NOTE: When you add a new event type here. Please manually add it to ui/app/util/eventTypes.ts so that it will be recognized in the UI.

type EventLogger ¶

type EventLogger struct {
	InternalExecutor
}

An EventLogger exposes methods used to record events to the event table.

func MakeEventLogger ¶

func MakeEventLogger(leaseMgr *LeaseManager) EventLogger

MakeEventLogger constructs a new EventLogger. A LeaseManager is required in order to correctly execute SQL statements.

func (EventLogger) InsertEventRecord ¶

func (ev EventLogger) InsertEventRecord(
	ctx context.Context,
	txn *client.Txn,
	eventType EventLogType,
	targetID, reportingID int32,
	info interface{},
) error

InsertEventRecord inserts a single event into the event log as part of the provided transaction.

type Executor ¶

type Executor struct {

	// Transient stats.
	SelectCount *metric.Counter
	// The subset of SELECTs that are processed through DistSQL.
	DistSQLSelectCount    *metric.Counter
	DistSQLExecLatency    *metric.Histogram
	SQLExecLatency        *metric.Histogram
	DistSQLServiceLatency *metric.Histogram
	SQLServiceLatency     *metric.Histogram
	TxnBeginCount         *metric.Counter

	// txnCommitCount counts the number of times a COMMIT was attempted.
	TxnCommitCount *metric.Counter

	TxnAbortCount    *metric.Counter
	TxnRollbackCount *metric.Counter
	UpdateCount      *metric.Counter
	InsertCount      *metric.Counter
	DeleteCount      *metric.Counter
	DdlCount         *metric.Counter
	MiscCount        *metric.Counter
	QueryCount       *metric.Counter
	// contains filtered or unexported fields
}

An Executor executes SQL statements. Executor is thread-safe.

func NewExecutor ¶

func NewExecutor(cfg ExecutorConfig, stopper *stop.Stopper) *Executor

NewExecutor creates an Executor and registers a callback on the system config.

func (*Executor) AnnotateCtx ¶

func (e *Executor) AnnotateCtx(ctx context.Context) context.Context

AnnotateCtx is a convenience wrapper; see AmbientContext.

func (*Executor) CopyData ¶

func (e *Executor) CopyData(session *Session, data string) error

CopyData adds data to the COPY buffer and executes if there are enough rows.

func (*Executor) CopyDone ¶

func (e *Executor) CopyDone(session *Session) error

CopyDone executes the buffered COPY data.

func (*Executor) ExecutePreparedStatement ¶ added in v1.1.0

func (e *Executor) ExecutePreparedStatement(
	session *Session, stmt *PreparedStatement, pinfo *parser.PlaceholderInfo,
) error

ExecutePreparedStatement executes the given statement and returns a response.

func (*Executor) ExecuteStatements ¶

func (e *Executor) ExecuteStatements(
	session *Session, stmts string, pinfo *parser.PlaceholderInfo,
) error

ExecuteStatements executes the given statement(s).

func (*Executor) ExecuteStatementsBuffered ¶ added in v1.1.0

func (e *Executor) ExecuteStatementsBuffered(
	session *Session, stmts string, pinfo *parser.PlaceholderInfo, expectedNumResults int,
) (StatementResults, error)

ExecuteStatementsBuffered executes the given statement(s), buffering them entirely in memory prior to returning a response. If there is an error then we return an empty StatementResults and the error.

Note that we will only receive an error even if we run a successful statement followed by a statement which has an error then the caller will only receive the error, however the first statement will have been executed.

If no error is returned, the caller has to call Close() on the returned StatementResults.

func (*Executor) FillUnimplementedErrorCounts ¶ added in v1.1.0

func (e *Executor) FillUnimplementedErrorCounts(fill map[string]int64)

FillUnimplementedErrorCounts fills the passed map with the executor's current counts of how often individual unimplemented features have been encountered.

func (*Executor) GetScrubbedStmtStats ¶

func (e *Executor) GetScrubbedStmtStats() []roachpb.CollectedStatementStatistics

GetScrubbedStmtStats returns the statement statistics by app, with the queries scrubbed of their identifiers. Any statements which cannot be scrubbed will be omitted from the returned map.

func (*Executor) GetVirtualTabler ¶ added in v1.1.0

func (e *Executor) GetVirtualTabler() VirtualTabler

GetVirtualTabler retrieves the VirtualTabler reference for this executor.

func (*Executor) IsVirtualDatabase ¶

func (e *Executor) IsVirtualDatabase(name string) bool

IsVirtualDatabase checks if the provided name corresponds to a virtual database, exposing this information on the Executor object itself.

func (*Executor) Prepare ¶

func (e *Executor) Prepare(
	stmt Statement, stmtStr string, session *Session, placeholderHints parser.PlaceholderTypes,
) (res *PreparedStatement, err error)

Prepare returns the result types of the given statement. pinfo may contain partial type information for placeholders. Prepare will populate the missing types. The PreparedStatement is returned (or nil if there are no results).

func (*Executor) PrepareStmt ¶ added in v1.1.0

func (e *Executor) PrepareStmt(session *Session, s *parser.Prepare) error

PrepareStmt implements the PREPARE statement. See https://www.postgresql.org/docs/current/static/sql-prepare.html for details.

func (*Executor) ResetStatementStats ¶

func (e *Executor) ResetStatementStats(ctx context.Context)

ResetStatementStats resets the executor's collected statement statistics.

func (*Executor) ResetUnimplementedCounts ¶ added in v1.1.0

func (e *Executor) ResetUnimplementedCounts()

ResetUnimplementedCounts resets counting of unimplemented errors.

func (*Executor) SetDistSQLSpanResolver ¶

func (e *Executor) SetDistSQLSpanResolver(spanResolver distsqlplan.SpanResolver)

SetDistSQLSpanResolver changes the SpanResolver used for DistSQL. It is the caller's responsibility to make sure no queries are being run with DistSQL at the same time.

func (*Executor) Start ¶

func (e *Executor) Start(
	ctx context.Context, startupMemMetrics *MemoryMetrics, nodeDesc roachpb.NodeDescriptor,
)

Start starts workers for the executor and initializes the distSQLPlanner.

type ExecutorConfig ¶

type ExecutorConfig struct {
	Settings *cluster.Settings
	NodeInfo
	AmbientCtx      log.AmbientContext
	DB              *client.DB
	Gossip          *gossip.Gossip
	DistSender      *kv.DistSender
	RPCContext      *rpc.Context
	LeaseManager    *LeaseManager
	Clock           *hlc.Clock
	DistSQLSrv      *distsqlrun.ServerImpl
	StatusServer    serverpb.StatusServer
	SessionRegistry *SessionRegistry
	JobRegistry     *jobs.Registry

	TestingKnobs              *ExecutorTestingKnobs
	SchemaChangerTestingKnobs *SchemaChangerTestingKnobs
	// HistogramWindowInterval is (server.Context).HistogramWindowInterval.
	HistogramWindowInterval time.Duration

	// Caches updated by DistSQL.
	RangeDescriptorCache *kv.RangeDescriptorCache
	LeaseHolderCache     *kv.LeaseHolderCache
}

An ExecutorConfig encompasses the auxiliary objects and configuration required to create an executor. All fields holding a pointer or an interface are required to create a Executor; the rest will have sane defaults set if omitted.

func (*ExecutorConfig) Organization ¶ added in v1.1.0

func (ec *ExecutorConfig) Organization() string

Organization returns the value of cluster.organization.

type ExecutorTestingKnobs ¶

type ExecutorTestingKnobs struct {
	// WaitForGossipUpdate causes metadata-mutating operations to wait
	// for the new metadata to back-propagate through gossip.
	WaitForGossipUpdate bool

	// CheckStmtStringChange causes Executor.execStmtGroup to verify that executed
	// statements are not modified during execution.
	CheckStmtStringChange bool

	// StatementFilter can be used to trap execution of SQL statements and
	// optionally change their results. The filter function is invoked after each
	// statement has been executed.
	StatementFilter StatementFilter

	// BeforePrepare is called by the Executor before preparing any statement. It
	// gives access to the planner that will be used to do the prepare. If any of
	// the return values are not nil, the values are used as the prepare results
	// and normal preparation is short-circuited.
	BeforePrepare func(ctx context.Context, stmt string, planner *planner) (*PreparedStatement, error)

	// BeforeExecute is called by the Executor before plan execution. It is useful
	// for synchronizing statement execution, such as with parallel statemets.
	BeforeExecute func(ctx context.Context, stmt string, isParallel bool)

	// AfterExecute is like StatementFilter, but it runs in the same goroutine of the
	// statement.
	AfterExecute func(
		ctx context.Context, stmt string, res StatementResult, err error,
	)

	// DisableAutoCommit, if set, disables the auto-commit functionality of some
	// SQL statements. That functionality allows some statements to commit
	// directly when they're executed in an implicit SQL txn, without waiting for
	// the Executor to commit the implicit txn.
	// This has to be set in tests that need to abort such statements using a
	// StatementFilter; otherwise, the statement commits immediately after
	// execution so there'll be nothing left to abort by the time the filter runs.
	DisableAutoCommit bool

	// DistSQLPlannerKnobs are testing knobs for distSQLPlanner.
	DistSQLPlannerKnobs DistSQLPlannerTestingKnobs

	// BeforeAutoCommit is called when the Executor is about to commit the KV
	// transaction after running a statement in an implicit transaction, allowing
	// tests to inject errors into that commit.
	// If an error is returned, that error will be considered the result of
	// txn.Commit(), and the txn.Commit() call will not actually be
	// made. If no error is returned, txn.Commit() is called normally.
	//
	// Note that this is not called if the SQL statement representing the implicit
	// transaction has committed the KV txn itself (e.g. if it used the 1-PC
	// optimization). This is only called when the Executor is the one doing the
	// committing.
	BeforeAutoCommit func(ctx context.Context, stmt string) error
}

ExecutorTestingKnobs is part of the context used to control parts of the system during testing.

func (*ExecutorTestingKnobs) ModuleTestingKnobs ¶

func (*ExecutorTestingKnobs) ModuleTestingKnobs()

ModuleTestingKnobs is part of the base.ModuleTestingKnobs interface.

type InternalExecutor ¶

type InternalExecutor struct {
	LeaseManager *LeaseManager
}

InternalExecutor can be used internally by cockroach to execute SQL statements without needing to open a SQL connection. InternalExecutor assumes that the caller has access to a cockroach KV client to handle connection and transaction management.

func (InternalExecutor) ExecuteStatementInTransaction ¶

func (ie InternalExecutor) ExecuteStatementInTransaction(
	ctx context.Context, opName string, txn *client.Txn, statement string, qargs ...interface{},
) (int, error)

ExecuteStatementInTransaction executes the supplied SQL statement as part of the supplied transaction. Statements are currently executed as the root user.

func (InternalExecutor) GetTableSpan ¶

func (ie InternalExecutor) GetTableSpan(
	ctx context.Context, user string, txn *client.Txn, dbName, tableName string,
) (roachpb.Span, error)

GetTableSpan gets the key span for a SQL table, including any indices.

func (InternalExecutor) QueryRowInTransaction ¶

func (ie InternalExecutor) QueryRowInTransaction(
	ctx context.Context, opName string, txn *client.Txn, statement string, qargs ...interface{},
) (parser.Datums, error)

QueryRowInTransaction executes the supplied SQL statement as part of the supplied transaction and returns the result. Statements are currently executed as the root user.

func (InternalExecutor) QueryRowsInTransaction ¶ added in v1.1.0

func (ie InternalExecutor) QueryRowsInTransaction(
	ctx context.Context, opName string, txn *client.Txn, statement string, qargs ...interface{},
) ([]parser.Datums, error)

QueryRowsInTransaction executes the supplied SQL statement as part of the supplied transaction and returns the resulting rows. Statements are currently executed as the root user.

type LeaseManager ¶

type LeaseManager struct {
	LeaseStore
	// contains filtered or unexported fields
}

LeaseManager manages acquiring and releasing per-table leases. It also handles resolving table names to descriptor IDs. The leases are managed internally with a table descriptor and expiration time exported by the API. The table descriptor acquired needs to be released. A transaction can use a table descriptor as long as its timestamp is within the validity window for the descriptor: descriptor.ModificationTime <= txn.Timestamp < expirationTime

Exported only for testing.

The locking order is: LeaseManager.mu > tableState.mu > tableNameCache.mu > tableVersionState.mu

func NewLeaseManager ¶

func NewLeaseManager(
	nodeID *base.NodeIDContainer,
	db client.DB,
	clock *hlc.Clock,
	testingKnobs LeaseManagerTestingKnobs,
	stopper *stop.Stopper,
	memMetrics *MemoryMetrics,
) *LeaseManager

NewLeaseManager creates a new LeaseManager.

stopper is used to run async tasks. Can be nil in tests.

func (*LeaseManager) Acquire ¶

func (m *LeaseManager) Acquire(
	ctx context.Context, timestamp hlc.Timestamp, tableID sqlbase.ID,
) (*sqlbase.TableDescriptor, hlc.Timestamp, error)

Acquire acquires a read lease for the specified table ID valid for the timestamp. It returns the table descriptor and a expiration time. A transaction using this descriptor must ensure that its commit-timestamp < expiration-time. Care must be taken to not modify the returned descriptor.

func (*LeaseManager) AcquireByName ¶

func (m *LeaseManager) AcquireByName(
	ctx context.Context, timestamp hlc.Timestamp, dbID sqlbase.ID, tableName string,
) (*sqlbase.TableDescriptor, hlc.Timestamp, error)

AcquireByName returns a table version for the specified table valid for the timestamp. It returns the table descriptor and a expiration time. A transaction using this descriptor must ensure that its commit-timestamp < expiration-time. Care must be taken to not modify the returned descriptor.

func (*LeaseManager) RefreshLeases ¶

func (m *LeaseManager) RefreshLeases(s *stop.Stopper, db *client.DB, gossip *gossip.Gossip)

RefreshLeases starts a goroutine that refreshes the lease manager leases for tables received in the latest system configuration via gossip.

func (*LeaseManager) Release ¶

func (m *LeaseManager) Release(desc *sqlbase.TableDescriptor) error

Release releases a previously acquired table.

func (*LeaseManager) SetDraining ¶

func (m *LeaseManager) SetDraining(drain bool)

SetDraining (when called with 'true') removes all inactive leases. Any leases that are active will be removed once the lease's reference count drops to 0.

type LeaseManagerTestingKnobs ¶

type LeaseManagerTestingKnobs struct {
	// A callback called when a gossip update is received, before the leases are
	// refreshed. Careful when using this to block for too long - you can block
	// all the gossip users in the system.
	GossipUpdateEvent func(config.SystemConfig)
	// A callback called after the leases are refreshed as a result of a gossip update.
	TestingLeasesRefreshedEvent func(config.SystemConfig)

	LeaseStoreTestingKnobs LeaseStoreTestingKnobs
}

LeaseManagerTestingKnobs contains test knobs.

func (*LeaseManagerTestingKnobs) ModuleTestingKnobs ¶

func (*LeaseManagerTestingKnobs) ModuleTestingKnobs()

ModuleTestingKnobs is part of the base.ModuleTestingKnobs interface.

type LeaseStore ¶

type LeaseStore struct {
	// contains filtered or unexported fields
}

LeaseStore implements the operations for acquiring and releasing leases and publishing a new version of a descriptor. Exported only for testing.

func (LeaseStore) Publish ¶

func (s LeaseStore) Publish(
	ctx context.Context,
	tableID sqlbase.ID,
	update func(*sqlbase.TableDescriptor) error,
	logEvent func(*client.Txn) error,
) (*sqlbase.Descriptor, error)

Publish updates a table descriptor. It also maintains the invariant that there are at most two versions of the descriptor out in the wild at any time by first waiting for all nodes to be on the current (pre-update) version of the table desc. The update closure is called after the wait, and it provides the new version of the descriptor to be written. In a multi-step schema operation, this update should perform a single step. The closure may be called multiple times if retries occur; make sure it does not have side effects. Returns the updated version of the descriptor.

func (LeaseStore) WaitForOneVersion ¶

func (s LeaseStore) WaitForOneVersion(
	ctx context.Context, tableID sqlbase.ID, retryOpts retry.Options,
) (sqlbase.DescriptorVersion, error)

WaitForOneVersion returns once there are no unexpired leases on the previous version of the table descriptor. It returns the current version. After returning there can only be versions of the descriptor >= to the returned version. Lease acquisition (see acquire()) maintains the invariant that no new leases for desc.Version-1 will be granted once desc.Version exists.

type LeaseStoreTestingKnobs ¶

type LeaseStoreTestingKnobs struct {
	// Called after a lease is removed from the store, with any operation error.
	// See LeaseRemovalTracker.
	LeaseReleasedEvent func(table sqlbase.TableDescriptor, err error)
	// Called after a lease is acquired, with any operation error.
	LeaseAcquiredEvent func(table sqlbase.TableDescriptor, err error)
	// RemoveOnceDereferenced forces leases to be removed
	// as soon as they are dereferenced.
	RemoveOnceDereferenced bool
}

LeaseStoreTestingKnobs contains testing knobs.

func (*LeaseStoreTestingKnobs) ModuleTestingKnobs ¶

func (*LeaseStoreTestingKnobs) ModuleTestingKnobs()

ModuleTestingKnobs is part of the base.ModuleTestingKnobs interface.

type MemoryMetrics ¶

type MemoryMetrics struct {
	MaxBytesHist         *metric.Histogram
	CurBytesCount        *metric.Counter
	TxnMaxBytesHist      *metric.Histogram
	TxnCurBytesCount     *metric.Counter
	SessionMaxBytesHist  *metric.Histogram
	SessionCurBytesCount *metric.Counter
}

MemoryMetrics contains pointers to the metrics object for one of the SQL endpoints: - "client" for connections received via pgwire. - "admin" for connections received via the admin RPC. - "internal" for activities related to leases, schema changes, etc.

func MakeMemMetrics ¶

func MakeMemMetrics(endpoint string, histogramWindow time.Duration) MemoryMetrics

MakeMemMetrics instantiates the metric objects for an SQL endpoint.

func (MemoryMetrics) MetricStruct ¶

func (MemoryMetrics) MetricStruct()

MetricStruct implements the metrics.Struct interface.

type NodeInfo ¶ added in v1.1.0

type NodeInfo struct {
	ClusterID func() uuid.UUID
	NodeID    *base.NodeIDContainer
	AdminURL  func() *url.URL
	PGURL     func(*url.Userinfo) (*url.URL, error)
}

NodeInfo contains metadata about the executing node and cluster.

type ParallelizeQueue ¶

type ParallelizeQueue struct {
	// contains filtered or unexported fields
}

ParallelizeQueue maintains a set of planNodes running with parallelized execution. Parallelized execution means that multiple statements run asynchronously, with their results mocked out to the client and with independent statements allowed to run in parallel. Any errors seen when running these statements are delayed until the parallelized execution is "synchronized" on the next non-parallelized statement. The syntax to parallelize statement execution is the statement with RETURNING NOTHING appended to it. The feature is described further in docs/RFCS/sql_parallelization.md.

It uses a DependencyAnalyzer to determine dependencies between plans. Using this knowledge, the queue provides the following guarantees about the execution of plans:

No two plans will ever be run concurrently if they are dependent of one another.
If two dependent plans are added to the queue, the plan added first will be executed before the plan added second.
No plans will begin execution once an error has been seen until Wait is called to drain the plans and reset the error-set state.

func MakeParallelizeQueue ¶

func MakeParallelizeQueue(analyzer DependencyAnalyzer) ParallelizeQueue

MakeParallelizeQueue creates a new empty ParallelizeQueue that uses the provided DependencyAnalyzer to determine plan dependencies.

func (*ParallelizeQueue) Add ¶

func (pq *ParallelizeQueue) Add(params runParams, plan planNode, exec func(planNode) error) error

Add inserts a new plan in the queue and executes the provided function when all plans that it depends on have completed successfully, obeying the guarantees made by the ParallelizeQueue above. The exec function should be used to run the planNode and return any error observed during its execution.

Add should not be called concurrently with Wait. See Wait's comment for more details.

func (*ParallelizeQueue) Errs ¶ added in v1.1.0

func (pq *ParallelizeQueue) Errs() []error

Errs returns the ParallelizeQueue's error-set.

func (*ParallelizeQueue) Len ¶

func (pq *ParallelizeQueue) Len() int

Len returns the number of plans in the ParallelizeQueue.

func (*ParallelizeQueue) Wait ¶

func (pq *ParallelizeQueue) Wait() []error

Wait blocks until the ParallelizeQueue finishes executing all plans. It then returns the error-set of the last batch of parallelized execution before reseting the error-set to allow for future use.

Wait can not be called concurrently with Add. If we need to lift this restriction, consider replacing the sync.WaitGroup with a syncutil.RWMutex, which will provide the desired starvation and ordering properties. Those being that once Wait is called, future Adds will not be reordered ahead of Waits attempts to drain all running and pending plans.

type PlanHookState ¶

type PlanHookState interface {
	EvalContext() parser.EvalContext
	ExecCfg() *ExecutorConfig
	DistLoader() *DistLoader
	TypeAsString(e parser.Expr, op string) (func() (string, error), error)
	TypeAsStringArray(e parser.Exprs, op string) (func() ([]string, error), error)
	TypeAsStringOpts(
		opts parser.KVOptions, valuelessOpts map[string]bool,
	) (func() (map[string]string, error), error)
	User() string
	AuthorizationAccessor
}

PlanHookState exposes the subset of planner needed by plan hooks. We pass this as one interface, rather than individually passing each field or interface as we find we need them, to avoid churn in the planHookFn sig and the hooks that implement it.

type PreparedPortal ¶

type PreparedPortal struct {
	Stmt  *PreparedStatement
	Qargs parser.QueryArguments

	ProtocolMeta interface{} // a field for protocol implementations to hang metadata off of.
	// contains filtered or unexported fields
}

PreparedPortal is a PreparedStatement that has been bound with query arguments.

type PreparedPortals ¶

type PreparedPortals struct {
	// contains filtered or unexported fields
}

PreparedPortals is a mapping of PreparedPortal names to their corresponding PreparedPortals.

func (PreparedPortals) Delete ¶

func (pp PreparedPortals) Delete(ctx context.Context, name string) bool

Delete removes the PreparedPortal with the provided name from the PreparedPortals. The method returns whether a portal with that name was found and removed.

func (PreparedPortals) Exists ¶

func (pp PreparedPortals) Exists(name string) bool

Exists returns whether a PreparedPortal with the provided name exists.

func (PreparedPortals) Get ¶

func (pp PreparedPortals) Get(name string) (*PreparedPortal, bool)

Get returns the PreparedPortal with the provided name.

func (PreparedPortals) New ¶

func (pp PreparedPortals) New(
	ctx context.Context, name string, stmt *PreparedStatement, qargs parser.QueryArguments,
) (*PreparedPortal, error)

New creates a new PreparedPortal with the provided name and corresponding PreparedStatement, binding the statement using the given QueryArguments.

type PreparedStatement ¶

type PreparedStatement struct {
	// Str is the statement string prior to parsing, used to generate
	// error messages. This may be used in
	// the future to present a contextual error message based on location
	// information.
	Str string
	// Statement is the parsed, prepared SQL statement. It may be nil if the
	// prepared statement is empty.
	Statement parser.Statement
	// TypeHints contains the types of the placeholders set by the client. It
	// dictates how input parameters for those placeholders will be parsed. If a
	// placeholder has no type hint, it will be populated during type checking.
	TypeHints parser.PlaceholderTypes
	// Types contains the final types of the placeholders, after type checking.
	// These may differ from the types in TypeHints, if a user provides an
	// imprecise type hint like sending an int for an oid comparison.
	Types   parser.PlaceholderTypes
	Columns sqlbase.ResultColumns

	ProtocolMeta interface{} // a field for protocol implementations to hang metadata off of.
	// contains filtered or unexported fields
}

PreparedStatement is a SQL statement that has been parsed and the types of arguments and results have been determined.

type PreparedStatements ¶

type PreparedStatements struct {
	// contains filtered or unexported fields
}

PreparedStatements is a mapping of PreparedStatement names to their corresponding PreparedStatements.

func (PreparedStatements) Delete ¶

func (ps PreparedStatements) Delete(ctx context.Context, name string) bool

Delete removes the PreparedStatement with the provided name from the PreparedStatements. The method returns whether a statement with that name was found and removed.

func (*PreparedStatements) DeleteAll ¶

func (ps *PreparedStatements) DeleteAll(ctx context.Context)

DeleteAll removes all PreparedStatements from the PreparedStatements. This will in turn remove all PreparedPortals from the session's PreparedPortals. This is used by the "delete" message in the pgwire protocol; after DeleteAll statements and portals can be added again.

func (PreparedStatements) Exists ¶

func (ps PreparedStatements) Exists(name string) bool

Exists returns whether a PreparedStatement with the provided name exists.

func (PreparedStatements) Get ¶

func (ps PreparedStatements) Get(name string) (*PreparedStatement, bool)

Get returns the PreparedStatement with the provided name.

func (PreparedStatements) New ¶

func (ps PreparedStatements) New(
	e *Executor,
	name string,
	stmt Statement,
	stmtStr string,
	placeholderHints parser.PlaceholderTypes,
) (*PreparedStatement, error)

New creates a new PreparedStatement with the provided name and corresponding query statements, using the given PlaceholderTypes hints to assist in inferring placeholder types.

ps.session.Ctx() is used as the logging context for the prepare operation.

func (PreparedStatements) NewFromString ¶ added in v1.1.0

func (ps PreparedStatements) NewFromString(
	e *Executor, name, query string, placeholderHints parser.PlaceholderTypes,
) (*PreparedStatement, error)

NewFromString creates a new PreparedStatement with the provided name and corresponding query string, using the given PlaceholderTypes hints to assist in inferring placeholder types.

ps.session.Ctx() is used as the logging context for the prepare operation.

type Result ¶

type Result struct {
	Err error
	// The type of statement that the result is for.
	Type parser.StatementType
	// The tag of the statement that the result is for.
	PGTag string
	// RowsAffected will be populated if the statement type is "RowsAffected".
	RowsAffected int
	// Columns will be populated if the statement type is "Rows". It will contain
	// the names and types of the columns returned in the result set in the order
	// specified in the SQL statement. The number of columns will equal the number
	// of values in each Row.
	Columns sqlbase.ResultColumns
	// Rows will be populated if the statement type is "Rows". It will contain
	// the result set of the result.
	// TODO(nvanbenschoten): Can this be streamed from the planNode?
	Rows *sqlbase.RowContainer
}

Result corresponds to the execution of a single SQL statement.

func (*Result) Close ¶

func (r *Result) Close(ctx context.Context)

Close ensures that the resources claimed by the result are released.

type ResultList ¶

type ResultList []Result

ResultList represents a list of results for a list of SQL statements. There is one result object per SQL statement in the request.

func (ResultList) Close ¶

func (rl ResultList) Close(ctx context.Context)

Close ensures that the resources claimed by the results are released.

type ResultsGroup ¶ added in v1.1.0

type ResultsGroup interface {
	// Close should be called once all the results for a group have been produced
	// (i.e. after all StatementResults have been closed). It informs the
	// implementation that the results are not going to be reset any more, and
	// thus can be sent to the client.
	//
	// Close has to be called before new groups are created.  It's illegal to call
	// Close before CloseResult() has been called on all of the group's
	// StatementResults.
	Close()

	// NewStatementResult creates a new StatementResult, indicating that future
	// results are part of a new SQL query.
	//
	// A single StatementResult can be active on a ResultGroup at a time; it is
	// illegal to create a new StatementResult before CloseResult() has been
	// called on the previous one.
	NewStatementResult() StatementResult

	// Flush informs the ResultsGroup that the caller relinquishes the capability
	// to Reset() the results that have been already been accumulated on this
	// group. This means that future Reset() calls will only reset up to the
	// current point in the stream - only future results will be discarded. This
	// is used to ensure that some results are always sent to the client even if
	// further statements are retried automatically; it supports the statements
	// run in the AutoRetry state: these statements are not executed again when
	// doing an automatic retry, and so their results shouldn't be reset.
	//
	// It is illegal to call this while any StatementResults on this group are
	// open.
	//
	// Like StatementResult.AddRow(), Flush returns communication errors, if any.
	// TODO(andrei): provide guidance on handling these errors.
	Flush(context.Context) error

	// ResultsSentToClient returns true if any results pertaining to this group
	// beyond the last Flush() point have been sent to the consumer.
	// Remember that the implementation is free to buffer or send results to the
	// client whenever it pleases. This method checks to see if the implementation
	// has in fact sent anything so far.
	//
	// TODO(andrei): add a note about the synchronous nature of the implementation
	// imposed by this interface.
	ResultsSentToClient() bool

	// Reset discards all the accumulated results from the last Flush() call
	// onwards (or from the moment the group was created if Flush() was never
	// called).
	// It is illegal to call Reset if any results have already been sent to the
	// consumer; this can be tested with ResultsSentToClient().
	Reset(context.Context)
}

ResultsGroup is used to produce a result group (see ResultsWriter).

type ResultsWriter ¶ added in v1.1.0

type ResultsWriter interface {
	// NewResultsGroup creates a new ResultGroup and indicates that future results
	// are part of a new result group.
	//
	// A single group can be ongoing on a ResultsWriter at a time; it is illegal to
	// create a new group before the previous one has been Close()d.
	NewResultsGroup() ResultsGroup

	// SetEmptyQuery is used to indicate that there are no statements to run.
	// Empty queries are different than queries with no results.
	SetEmptyQuery()
}

ResultsWriter is the interface used to by the Executor to produce results for query execution for a SQL client. The main implementer is v3Conn, which streams the results on a SQL network connection. There's also bufferedWriter, which buffers all results in memory.

ResultsWriter is built with the SQL session model in mind: queries from a given SQL client (which we'll call the consumer to not confuse it with clients of this interface - the Executor) keep coming out of band and all of their results (generally, datum tuples) are pushed to a single ResultsWriter. The ResultsWriter needs to be made aware of which results pertain to which statement, as implementations need to split results accordingly. The ResultsWriter also supports the notion of a "results group": the ResultsWriter sequentially goes through groups of results and the group is the level at which a client can request for results to be dropped; a group can be reset, meaning that the consumer will not receive any of them. Only the current group can be reset; the client gives up the ability to reset a group the moment it closes it. This feature is used to support the automatic retries that we do for SQL transactions - groups will correspond to transactions and, when the Executor decides to automatically retry a transaction, it will reset its group (as it can't automatically retry if any results have been sent to the consumer).

Example usage:

var rw ResultsWriter
group := rw.NewResultsGroup()
defer group.Close()
sr := group.NewStatementResult()
for each result row {
  if err := sr.AddRow(...); err != nil {
    // send err to the client in another way
  }
}
sr.CloseResult()
group.Close()

type RowBuffer ¶

type RowBuffer struct {
	*sqlbase.RowContainer
	// contains filtered or unexported fields
}

RowBuffer is a buffer for rows of DTuples. Rows must be added using AddRow(), once the work is done the Close() method must be called to release the allocated memory.

This is intended for nodes where it is simpler to compute a batch of rows at once instead of maintaining internal state in order to operate correctly under the constraints imposed by Next() and Values() under the planNode interface.

func (*RowBuffer) Next ¶

func (rb *RowBuffer) Next() bool

Next here is analogous to Next() as defined under planNode, if no pre-computed results were buffered in prior to the call we return false. Else we stage the next output value for the subsequent call to Values().

func (*RowBuffer) Values ¶

func (rb *RowBuffer) Values() parser.Datums

Values here is analogous to Values() as defined under planNode.

Available after Next(), result only valid until the next call to Next()

type RowResultWriter ¶ added in v1.1.0

type RowResultWriter struct {
	// contains filtered or unexported fields
}

RowResultWriter is a thin wrapper around a RowContainer.

func NewRowResultWriter ¶ added in v1.1.0

func NewRowResultWriter(
	statementType parser.StatementType, rowContainer *sqlbase.RowContainer,
) *RowResultWriter

NewRowResultWriter creates a new RowResultWriter.

func (*RowResultWriter) AddRow ¶ added in v1.1.0

func (b *RowResultWriter) AddRow(ctx context.Context, row parser.Datums) error

AddRow implements the rowResultWriter interface.

func (*RowResultWriter) IncrementRowsAffected ¶ added in v1.1.0

func (b *RowResultWriter) IncrementRowsAffected(n int)

IncrementRowsAffected implements the rowResultWriter interface.

func (*RowResultWriter) StatementType ¶ added in v1.1.0

func (b *RowResultWriter) StatementType() parser.StatementType

StatementType implements the rowResultWriter interface.

type SchemaAccessor ¶

type SchemaAccessor interface {
	// contains filtered or unexported methods
}

SchemaAccessor provides helper methods for using the SQL schema.

type SchemaChangeManager ¶

type SchemaChangeManager struct {
	// contains filtered or unexported fields
}

SchemaChangeManager processes pending schema changes seen in gossip updates. Most schema changes are executed synchronously by the node that created the schema change. If the node dies while processing the schema change this manager acts as a backup execution mechanism.

func NewSchemaChangeManager ¶

func NewSchemaChangeManager(
	st *cluster.Settings,
	testingKnobs *SchemaChangerTestingKnobs,
	db client.DB,
	nodeDesc roachpb.NodeDescriptor,
	rpcContext *rpc.Context,
	distSQLServ *distsqlrun.ServerImpl,
	distSender *kv.DistSender,
	gossip *gossip.Gossip,
	leaseMgr *LeaseManager,
	clock *hlc.Clock,
	jobRegistry *jobs.Registry,
) *SchemaChangeManager

NewSchemaChangeManager returns a new SchemaChangeManager.

func (*SchemaChangeManager) Start ¶

func (s *SchemaChangeManager) Start(stopper *stop.Stopper)

Start starts a goroutine that runs outstanding schema changes for tables received in the latest system configuration via gossip.

type SchemaChanger ¶

type SchemaChanger struct {
	// contains filtered or unexported fields
}

SchemaChanger is used to change the schema on a table.

func NewSchemaChangerForTesting ¶

func NewSchemaChangerForTesting(
	tableID sqlbase.ID,
	mutationID sqlbase.MutationID,
	nodeID roachpb.NodeID,
	db client.DB,
	leaseMgr *LeaseManager,
	jobRegistry *jobs.Registry,
) SchemaChanger

NewSchemaChangerForTesting only for tests.

func (*SchemaChanger) AcquireLease ¶

func (sc *SchemaChanger) AcquireLease(
	ctx context.Context,
) (sqlbase.TableDescriptor_SchemaChangeLease, error)

AcquireLease acquires a schema change lease on the table if an unexpired lease doesn't exist. It returns the lease.

func (*SchemaChanger) ExtendLease ¶

func (sc *SchemaChanger) ExtendLease(
	ctx context.Context, existingLease *sqlbase.TableDescriptor_SchemaChangeLease,
) error

ExtendLease for the current leaser. This needs to be called often while doing a schema change to prevent more than one node attempting to apply a schema change (which is still safe, but unwise). It updates existingLease with the new lease.

func (*SchemaChanger) MaybeIncrementVersion ¶

func (sc *SchemaChanger) MaybeIncrementVersion(ctx context.Context) (*sqlbase.Descriptor, error)

MaybeIncrementVersion increments the version if needed. If the version is to be incremented, it also assures that all nodes are on the current (pre-increment) version of the descriptor. Returns the (potentially updated) descriptor.

func (*SchemaChanger) ReleaseLease ¶

func (sc *SchemaChanger) ReleaseLease(
	ctx context.Context, lease sqlbase.TableDescriptor_SchemaChangeLease,
) error

ReleaseLease releases the table lease if it is the one registered with the table descriptor.

func (*SchemaChanger) RunStateMachineBeforeBackfill ¶

func (sc *SchemaChanger) RunStateMachineBeforeBackfill(ctx context.Context) error

RunStateMachineBeforeBackfill moves the state machine forward and wait to ensure that all nodes are seeing the latest version of the table.

type SchemaChangerTestingKnobs ¶

type SchemaChangerTestingKnobs struct {
	// SyncFilter is called before running schema changers synchronously (at
	// the end of a txn). The function can be used to clear the schema
	// changers (if the test doesn't want them run using the synchronous path)
	// or to temporarily block execution. Note that this has nothing to do
	// with the async path for running schema changers. To block that, set
	// AsyncExecNotification.
	SyncFilter SyncSchemaChangersFilter

	// RunBeforePublishWriteAndDelete is called just before publishing the
	// write+delete state for the schema change.
	RunBeforePublishWriteAndDelete func()

	// RunBeforeBackfill is called just before starting the backfill.
	RunBeforeBackfill func() error

	// RunBeforeBackfill is called just before starting the index backfill, after
	// fixing the index backfill scan timestamp.
	RunBeforeIndexBackfill func()

	// RunBeforeBackfillChunk is called before executing each chunk of a
	// backfill during a schema change operation. It is called with the
	// current span and returns an error which eventually is returned to the
	// caller of SchemaChanger.exec(). It is called at the start of the
	// backfill function passed into the transaction executing the chunk.
	RunBeforeBackfillChunk func(sp roachpb.Span) error

	// RunAfterBackfillChunk is called after executing each chunk of a
	// backfill during a schema change operation. It is called just before
	// returning from the backfill function passed into the transaction
	// executing the chunk. It is always called even when the backfill
	// function returns an error, or if the table has already been dropped.
	RunAfterBackfillChunk func()

	// RenameOldNameNotInUseNotification is called during a rename schema
	// change, after all leases on the version of the descriptor with the old
	// name are gone, and just before the mapping of the old name to the
	// descriptor id is about to be deleted.
	RenameOldNameNotInUseNotification func()

	// AsyncExecNotification is a function called before running a schema
	// change asynchronously. Returning an error will prevent the asynchronous
	// execution path from running.
	AsyncExecNotification func() error

	// AsyncExecQuickly executes queued schema changes as soon as possible.
	AsyncExecQuickly bool

	// WriteCheckpointInterval is the interval after which a checkpoint is
	// written.
	WriteCheckpointInterval time.Duration

	// BackfillChunkSize is to be used for all backfill chunked operations.
	BackfillChunkSize int64
}

SchemaChangerTestingKnobs for testing the schema change execution path through both the synchronous and asynchronous paths.

func (*SchemaChangerTestingKnobs) ModuleTestingKnobs ¶

func (*SchemaChangerTestingKnobs) ModuleTestingKnobs()

ModuleTestingKnobs is part of the base.ModuleTestingKnobs interface.

type Session ¶

type Session struct {

	// Database indicates the "current" database for the purpose of
	// resolving names. See searchAndQualifyDatabase() for details.
	Database string
	// DefaultIsolationLevel indicates the default isolation level of
	// newly created transactions.
	DefaultIsolationLevel enginepb.IsolationType
	// DistSQLMode indicates whether to run queries using the distributed
	// execution engine.
	DistSQLMode DistSQLExecMode
	// Location indicates the current time zone.
	Location *time.Location
	// SearchPath is a list of databases that will be searched for a table name
	// before the database. Currently, this is used only for SELECTs.
	// Names in the search path must have been normalized already.
	SearchPath parser.SearchPath
	// User is the name of the user logged into the session.
	User string
	// SafeUpdates causes errors when the client
	// sends syntax that may have unwanted side effects.
	SafeUpdates bool

	// ClientAddr is the client's IP address and port.
	ClientAddr string

	// TxnState carries information about the open transaction (if any),
	// including the retry status and the KV client Txn object.
	TxnState txnState
	// PreparedStatements and PreparedPortals store the statements/portals
	// that have been prepared via pgwire.
	PreparedStatements PreparedStatements
	PreparedPortals    PreparedPortals

	// ResultsWriter is where query results are written to. It's set to a
	// pgwire.v3conn for sessions opened for SQL client connections and a
	// bufferedResultWriter for internal uses.
	ResultsWriter ResultsWriter

	Tracing SessionTracing

	// ActiveSyncQueries contains query IDs of all synchronous (i.e. non-parallel)
	// queries in flight. All ActiveSyncQueries must also be in mu.ActiveQueries.
	ActiveSyncQueries []uint128.Uint128
	// contains filtered or unexported fields
}

Session contains the state of a SQL client connection. Create instances using NewSession().

func NewSession ¶

func NewSession(
	ctx context.Context, args SessionArgs, e *Executor, remote net.Addr, memMetrics *MemoryMetrics,
) *Session

NewSession creates and initializes a new Session object. remote can be nil.

func (*Session) ClearStatementsAndPortals ¶

func (s *Session) ClearStatementsAndPortals(ctx context.Context)

ClearStatementsAndPortals de-registers all statements and portals. Afterwards none can be added any more.

func (*Session) CopyEnd ¶

func (s *Session) CopyEnd(ctx context.Context)

CopyEnd ends the COPY mode. Any buffered data is discarded.

func (*Session) Ctx ¶

func (s *Session) Ctx() context.Context

Ctx returns the current context for the session: if there is an active SQL transaction it returns the transaction context, otherwise it returns the session context. Note that in some cases we may want the session context even if there is an active transaction (an example is when we want to log an event to the session event log); in that case s.context should be used directly.

func (*Session) EmergencyClose ¶ added in v1.1.0

func (s *Session) EmergencyClose()

EmergencyClose is a simplified replacement for Finish() which is less picky about the current state of the Session. In particular this can be used to tidy up after a session even in the middle of a transaction, where there may still be memory activity registered to a monitor and not cleanly released.

func (*Session) Finish ¶

func (s *Session) Finish(e *Executor)

Finish releases resources held by the Session. It is called by the Session's main goroutine, so no synchronous queries will be in-flight during the method's execution. However, it could be called when asynchronous queries are operating in the background in the case of parallelized statements, which is why we make sure to drain background statements.

func (*Session) FinishPlan ¶ added in v1.1.0

func (s *Session) FinishPlan()

FinishPlan releases the resources that were consumed by the currently active default planner. It does not check to see whether any other resources are still pointing to the planner, so it should only be called when a connection is entirely finished executing a statement and all results have been sent.

func (*Session) OpenAccount ¶

func (s *Session) OpenAccount() WrappableMemoryAccount

OpenAccount interfaces between Session and mon.MemoryMonitor.

func (*Session) ProcessCopyData ¶

func (s *Session) ProcessCopyData(
	ctx context.Context, data string, msg copyMsg,
) (StatementList, error)

ProcessCopyData appends data to the planner's internal COPY state as parsed datums. Since the COPY protocol allows any length of data to be sent in a message, there's no guarantee that data contains a complete row (or even a complete datum). It is thus valid to have no new rows added to the internal state after this call.

func (*Session) StartMonitor ¶

func (s *Session) StartMonitor(pool *mon.BytesMonitor, reserved mon.BoundAccount)

StartMonitor interfaces between Session and mon.MemoryMonitor

func (*Session) StartUnlimitedMonitor ¶

func (s *Session) StartUnlimitedMonitor()

StartUnlimitedMonitor interfaces between Session and mon.MemoryMonitor

type SessionArgs ¶

type SessionArgs struct {
	Database        string
	User            string
	ApplicationName string
}

SessionArgs contains arguments for creating a new Session with NewSession().

type SessionRegistry ¶ added in v1.1.0

type SessionRegistry struct {
	syncutil.Mutex
	// contains filtered or unexported fields
}

SessionRegistry stores a set of all sessions on this node. Use register() and deregister() to modify this registry.

func MakeSessionRegistry ¶ added in v1.1.0

func MakeSessionRegistry() *SessionRegistry

MakeSessionRegistry creates a new SessionRegistry with an empty set of sessions.

func (*SessionRegistry) CancelQuery ¶ added in v1.1.0

func (r *SessionRegistry) CancelQuery(queryIDStr string, username string) (bool, error)

CancelQuery looks up the associated query in the session registry and cancels it.

func (*SessionRegistry) SerializeAll ¶ added in v1.1.0

func (r *SessionRegistry) SerializeAll() []serverpb.Session

SerializeAll returns a slice of all sessions in the registry, converted to serverpb.Sessions.

type SessionTracing ¶ added in v1.1.0

type SessionTracing struct {
	// contains filtered or unexported fields
}

SessionTracing holds the state used by SET TRACING {ON,OFF,LOCAL} statements in the context of one SQL session. It holds the current trace being collected (or the last trace collected, if tracing is not currently ongoing).

SessionTracing and its interactions with the Session are thread-safe; tracing can be turned on at any time.

func (*SessionTracing) Enabled ¶ added in v1.1.0

func (st *SessionTracing) Enabled() bool

Enabled checks whether session tracing is currently enabled.

func (*SessionTracing) KVTracingEnabled ¶ added in v1.1.0

func (st *SessionTracing) KVTracingEnabled() bool

KVTracingEnabled checks whether KV tracing is currently enabled.

func (*SessionTracing) RecordingType ¶ added in v1.1.0

func (st *SessionTracing) RecordingType() tracing.RecordingType

RecordingType returns which type of tracing is currently being done.

func (*SessionTracing) StartTracing ¶ added in v1.1.0

func (st *SessionTracing) StartTracing(recType tracing.RecordingType, kvTracingEnabled bool) error

StartTracing starts "session tracing". After calling this, all SQL transactions running on this session will be traced. The current transaction, if any, will also be traced (except that children spans of the current txn span that have already been created will not be traced).

StopTracing() needs to be called to finish this trace.

Args: kvTracingEnabled: If set, the traces will also include "KV trace" messages -

verbose messages around the interaction of SQL with KV. Some of the messages
are per-row.

func (*SessionTracing) StopTracing ¶ added in v1.1.0

func (st *SessionTracing) StopTracing() error

StopTracing stops the trace that was started with StartTracing().

An error is returned if tracing was not active.

type Statement ¶ added in v1.1.0

type Statement struct {
	AST           parser.Statement
	ExpectedTypes sqlbase.ResultColumns
	// contains filtered or unexported fields
}

Statement contains a statement with optional expected result columns and metadata.

func (Statement) String ¶ added in v1.1.0

func (s Statement) String() string

type StatementFilter ¶

type StatementFilter func(context.Context, string, ResultsWriter, error) error

StatementFilter is the type of callback that ExecutorTestingKnobs.StatementFilter takes.

type StatementList ¶ added in v1.1.0

type StatementList []Statement

StatementList is a list of statements.

func NewStatementList ¶ added in v1.1.0

func NewStatementList(stmts parser.StatementList) StatementList

NewStatementList creates a StatementList from a parser.StatementList.

func (StatementList) Format ¶ added in v1.1.0

func (l StatementList) Format(buf *bytes.Buffer, f parser.FmtFlags)

Format implements the NodeFormatter interface.

func (StatementList) String ¶ added in v1.1.0

func (l StatementList) String() string

type StatementResult ¶ added in v1.1.0

type StatementResult interface {
	// BeginResult should be called prior to any of the other methods.
	// TODO(andrei): remove BeginResult and SetColumns, and have
	// NewStatementResult() take in a parser.Statement
	BeginResult(stmt parser.Statement)
	// GetPGTag returns the PGTag of the statement passed into BeginResult.
	PGTag() string
	// GetStatementType returns the StatementType that corresponds to the type of
	// results that should be sent to this interface.
	StatementType() parser.StatementType
	// SetColumns should be called after BeginResult and before AddRow if the
	// StatementType is parser.Rows.
	SetColumns(columns sqlbase.ResultColumns)
	// AddRow takes the passed in row and adds it to the current result.
	AddRow(ctx context.Context, row parser.Datums) error
	// IncrementRowsAffected increments a counter by n. This is used for all
	// result types other than parser.Rows.
	IncrementRowsAffected(n int)
	// RowsAffected returns either the number of times AddRow was called, or the
	// sum of all n passed into IncrementRowsAffected.
	RowsAffected() int
	// CloseResult ends the current result. The v3Conn will send control codes to
	// the client informing it that the result for a statement is now complete.
	//
	// CloseResult cannot be called unless there's a corresponding BeginResult
	// prior.
	CloseResult() error
}

StatementResult is used to produce results for a single query (see ResultsWriter).

type StatementResults ¶

type StatementResults struct {
	ResultList
	// Indicates that after parsing, the request contained 0 non-empty statements.
	Empty bool
}

StatementResults represents a list of results from running a batch of SQL statements, plus some meta info about the batch.

func (*StatementResults) Close ¶

func (s *StatementResults) Close(ctx context.Context)

Close ensures that the resources claimed by the results are released.

type SyncSchemaChangersFilter ¶

type SyncSchemaChangersFilter func(TestingSchemaChangerCollection)

SyncSchemaChangersFilter is the type of a hook to be installed through the ExecutorContext for blocking or otherwise manipulating schema changers run through the sync schema changers path.

type TableCollection ¶ added in v1.1.0

type TableCollection struct {
	// contains filtered or unexported fields
}

TableCollection is a collection of tables held by a single session that serves SQL requests, or a background job using a table descriptor. The collection is cleared using releaseTables() which is called at the end of each transaction on the session, or on hitting conditions such as errors, or retries that result in transaction timestamp changes.

type TestingSchemaChangerCollection ¶

type TestingSchemaChangerCollection struct {
	// contains filtered or unexported fields
}

TestingSchemaChangerCollection is an exported (for testing) version of schemaChangerCollection. TODO(andrei): get rid of this type once we can have tests internal to the sql package (as of April 2016 we can't because sql can't import server).

func (TestingSchemaChangerCollection) ClearSchemaChangers ¶

func (tscc TestingSchemaChangerCollection) ClearSchemaChangers()

ClearSchemaChangers clears the schema changers from the collection. If this is called from a SyncSchemaChangersFilter, no schema changer will be run.

type TxnStateEnum ¶

type TxnStateEnum int64

TxnStateEnum represents the state of a SQL txn.

const (
	// No txn is in scope. Either there never was one, or it got committed/rolled
	// back. Note that this state will not be experienced outside of the Session
	// and Executor (i.e. it will not be observed by a running query) because the
	// Executor opens implicit transactions before executing non-transactional
	// queries.
	NoTxn TxnStateEnum = iota

	// Like Open, a txn is in scope. The difference is that, while in the
	// AutoRetry state, a retriable error will be handled by an automatic
	// transaction retry, whereas we can't do that in Open. There's a caveat -
	// even if we're in AutoRetry, we can't do automatic retries if any
	// results for statements in the current transaction have already been
	// delivered to the client.
	// In principle, we can do automatic retries for the first batch of statements
	// in a transaction. There is an extension to the rule, though: for
	// example, is we get a batch with "BEGIN; SET TRANSACTION ISOLATION LEVEL
	// foo; SAVEPOINT cockroach_restart;" followed by a 2nd batch, we can
	// automatically retry the 2nd batch even though the statements in the first
	// batch will not be executed again and their results have already been sent
	// to the clients. We can do this because some statements are special in that
	// their execution always generates exactly the same results to the consumer
	// (i.e. the SQL client).
	//
	// TODO(andrei): This state shouldn't exist; the decision about whether we can
	// retry automatically or not should be entirely dynamic, based on which
	// results we've delivered to the client already. It should have nothing to do
	// with the client's batching of statements. For example, the client can send
	// 100 batches but, if we haven't sent it any results yet, we should still be
	// able to retry them all). Currently the splitting into batches is relevant
	// because we don't keep track of statements from previous batches, so we
	// would not be capable of retrying them even if we knew that no results have
	// been delivered.
	AutoRetry

	// A txn is in scope.
	Open

	// The txn has encountered a (non-retriable) error.
	// Statements will be rejected until a COMMIT/ROLLBACK is seen.
	Aborted
	// The txn has encountered a retriable error.
	// Statements will be rejected until a RESTART_TRANSACTION is seen.
	RestartWait
	// The KV txn has been committed successfully through a RELEASE.
	// Statements are rejected until a COMMIT is seen.
	CommitWait
)

func (TxnStateEnum) String ¶

func (i TxnStateEnum) String() string

type VirtualTabler ¶

type VirtualTabler interface {
	// contains filtered or unexported methods
}

VirtualTabler is used to fetch descriptors for virtual tables and databases.

type WireFailureError ¶ added in v1.1.0

type WireFailureError struct {
	// contains filtered or unexported fields
}

WireFailureError is used when sending data over pgwire fails.

func (WireFailureError) Error ¶ added in v1.1.0

func (e WireFailureError) Error() string

type WrappableMemoryAccount ¶

type WrappableMemoryAccount struct {
	// contains filtered or unexported fields
}

WrappableMemoryAccount encapsulates a MemoryAccount to give it the Wsession()/Wtxn() method below.

func (*WrappableMemoryAccount) Wsession ¶

func (w *WrappableMemoryAccount) Wsession(s *Session) WrappedMemoryAccount

Wsession captures the current session monitor pointer so it can be provided transparently to the other Account APIs below.

func (*WrappableMemoryAccount) Wtxn ¶

func (w *WrappableMemoryAccount) Wtxn(s *Session) WrappedMemoryAccount

Wtxn captures the current txn-specific monitor pointer so it can be provided transparently to the other Account APIs below.

type WrappedMemoryAccount ¶

type WrappedMemoryAccount struct {
	// contains filtered or unexported fields
}

WrappedMemoryAccount is the transient structure that carries the extra argument to the MemoryAccount APIs.

func (WrappedMemoryAccount) Clear ¶

func (w WrappedMemoryAccount) Clear(ctx context.Context)

Clear interfaces between Session and mon.MemoryMonitor.

func (WrappedMemoryAccount) Close ¶

func (w WrappedMemoryAccount) Close(ctx context.Context)

Close interfaces between Session and mon.MemoryMonitor.

func (WrappedMemoryAccount) Grow ¶

func (w WrappedMemoryAccount) Grow(ctx context.Context, extraSize int64) error

Grow interfaces between Session and mon.MemoryMonitor.

func (WrappedMemoryAccount) OpenAndInit ¶

func (w WrappedMemoryAccount) OpenAndInit(ctx context.Context, initialAllocation int64) error

OpenAndInit interfaces between Session and mon.MemoryMonitor.

func (WrappedMemoryAccount) ResizeItem ¶

func (w WrappedMemoryAccount) ResizeItem(ctx context.Context, oldSize, newSize int64) error

ResizeItem interfaces between Session and mon.MemoryMonitor.

Source Files ¶

View all Source files

Directories ¶

Path	Synopsis
distsqlplan
distsqlrun Package distsqlrun is a generated protocol buffer package.	Package distsqlrun is a generated protocol buffer package.
ir
base
example/base
irgen command
irgen/analyzer
irgen/codegen
irgen/ir
irgen/parser
irgen/template
jobs Package jobs is a generated protocol buffer package.	Package jobs is a generated protocol buffer package.
mon
parser
pgbench
cmd/pgbenchsetup command
pgwire
pgerror Package pgerror is a generated protocol buffer package.	Package pgerror is a generated protocol buffer package.
privilege
sqlbase Package sqlbase is a generated protocol buffer package.	Package sqlbase is a generated protocol buffer package.
sqlutil

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL