Documentation
¶
Overview ¶
Package logical holds the Phase 3 (TODO.md §"Phase 3 — Semantic analysis") logical-operator hierarchy. A LogicalOperator describes WHAT a SQL query is doing — scan this table, filter on that predicate, project these columns, sort by those columns — without committing to HOW. Semantic analysis (porting of Java's `SemanticAnalyzer`) produces a LogicalOperator tree from the parse tree. Phase 4 Cascades consumes the tree and emits physical plans; until Cascades lands, the naive Generator can consume LogicalOperator directly through a thin translator.
Naming mirrors Java's `fdb-relational-core/recordlayer/query/LogicalOperator.java` + siblings, trimmed to the core operator set we actually need for the current yamsql corpus:
- LogicalScan — `FROM tbl` (single-table read)
- LogicalFilter — `WHERE pred`
- LogicalProject — `SELECT a, b, expr AS n`
- LogicalSort — `ORDER BY …`
- LogicalLimit — `LIMIT n OFFSET m`
- LogicalAggregate — `GROUP BY … + agg(…)`
- LogicalJoin — `INNER / LEFT / RIGHT JOIN`
- LogicalUnion — `UNION [ALL]`
- LogicalInsert — `INSERT INTO tbl VALUES / INSERT SELECT`
- LogicalUpdate — `UPDATE tbl SET col = expr WHERE …`
- LogicalDelete — `DELETE FROM tbl WHERE …`
- LogicalDDL — `CREATE / DROP` passthrough (no tree shape)
- LogicalCTE — `WITH name AS (…) SELECT …` (non-recursive + recursive)
**Phase 3 scope:** this package is the TARGET SHAPE only. The semantic analyzer that translates parse tree → LogicalOperator and the translator that turns LogicalOperator → executable Plan are separate Phase 3 deliverables. Committing to the shape here lets them land incrementally without churning all of the pkg/ relational tree at once.
**Java alignment.** Most operators map 1:1 to a Java counterpart:
LogicalFilter ↔ LogicalFilter / QueryPredicate-carrying child LogicalProject ↔ LogicalProjectionExpression LogicalSort ↔ LogicalSortExpression LogicalAggregate ↔ GroupByExpression LogicalUnion ↔ LogicalUnionExpression LogicalInsert ↔ InsertExpression LogicalUpdate ↔ UpdateExpression LogicalDelete ↔ DeleteExpression
Two deliberate divergences:
- `LogicalScan` does not exist in Java-Cascades; Java represents a FROM-source as a Cascades `FullUnorderedScanExpression` wrapped by a `Quantifier`. RFC-022 argues that Phase 3 should own a pure logical-plan representation distinct from Cascades; LogicalScan is the scan-stand-in at the logical level.
- `LogicalJoin` does not exist in Java-Cascades as a discrete type; Java encodes joins via a `SelectExpression` binding multiple `Quantifier`s. Same rationale: keeping join as an explicit logical operator separates Phase 3 tree-building from Phase 4 Cascades translation.
Both divergences are documented in RFC-023 / TODO Phase 3+4.
Explicit predicate / expression representation is deferred. For now LogicalFilter / LogicalProject etc. carry parse-tree handles (antlr IExpressionContext). RFC-021 Phase 2 replaces those with `Value` nodes from the Cascades Value hierarchy. RFC-023 committed to non-generic interfaces + `any`; the seed lives in `pkg/recordlayer/query/plan/cascades/`. As that package grows this one will migrate its text-handle predicates to real Values.
Index ¶
- func FindOuterScanTable(op LogicalOperator, alias string) string
- func OuterSourceIsDerivedTable(op LogicalOperator, alias string) bool
- type Assignment
- type CorrelatedScalarSubquery
- type ExistsSubquery
- type JoinKind
- type LogicalAggregate
- type LogicalCTE
- type LogicalDDL
- type LogicalDelete
- type LogicalDistinct
- type LogicalFilter
- type LogicalInsert
- type LogicalJoin
- type LogicalLimit
- type LogicalOperator
- type LogicalProject
- type LogicalScan
- type LogicalSort
- type LogicalUnion
- type LogicalUnnest
- type LogicalUpdate
- type LogicalValues
- type ScalarSubquery
- type SortDir
- type SortKey
- type TraversalOrder
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func FindOuterScanTable ¶
func FindOuterScanTable(op LogicalOperator, alias string) string
FindOuterScanTable resolves a lateral unnest's outer source alias to the scanned table name by matching a LogicalScan whose source alias is `alias` (case-insensitive) among the VISIBLE FROM-scope sources of the outer leg `op`. It is used to resolve the outer source of a lateral unnest (`FROM t, t.arr AS x` → the scan of `t`) so its proto descriptor can be inspected for the array field.
The walk MUST NOT descend into a CTE / derived-table BODY. A derived table `(SELECT … FROM T1) AS d` lowers to a `LogicalCTE{Body: <…scan of T1…>, Main: Scan(d)}`; only `d` is a visible source — `T1` is hidden inside the body and out of scope. Descending into `Body` would match the hidden `T1` scan and explode a correlated array against a source the query can't see (a silent-wrong / mis-classification). So at a `LogicalCTE` we resolve ONLY against its `Main` (the visible alias projection), never its `Body`. This mirrors Java's `resolveCorrelatedIdentifier` resolving against `getLogicalOperatorsIncludingOuter()` — the in-scope quantifiers, not nested query bodies.
Returns "" when no VISIBLE matching scan is found (a non-scan outer, or a name hidden behind a derived-table boundary), so the caller falls back to the table path.
func OuterSourceIsDerivedTable ¶
func OuterSourceIsDerivedTable(op LogicalOperator, alias string) bool
OuterSourceIsDerivedTable reports whether `alias` (a lateral unnest's segment-0 outer source name) is bound, in the outer sub-plan `op`, to a DERIVED-TABLE / CTE leg — i.e. a `LogicalCTE` whose Name equals `alias`. It reads the logical tree STRUCTURALLY (not a translator-internal cteScope map), so it fires independent of cteScope population order and regardless of whether the alias also names a real same-named base table.
A derived table `(SELECT …) AS D` lowers to a `LogicalCTE{Name:D, Main:Scan(D)}` inside the outer leg. When such a leg shadows a real same-named table, validating the unnest's array field against the base-table descriptor would explode the WRONG column (a derived-output silent-wrong). Detecting the derived/CTE leg here — by the in-scope quantifier alias, exactly as Java's generateCorrelatedFieldAccess resolves the in-scope source rather than the catalog table — lets the caller reject (or skip a base-table check for) the derived-output unnest in ALL cases.
Like FindOuterScanTable, it does NOT descend into a CTE's Body (only its Main): a derived table is its own FROM scope; a same-named CTE nested inside another derived body is out of the current scope.
Types ¶
type Assignment ¶
Assignment is one SET clause entry. Expr is the canonical text (used for explain and as a fallback); Value is the resolved RHS expression Value (populated by the catalog-aware builder) that the executor evaluates against each target row. A nil Value means the text builder ran without catalog resolution.
type CorrelatedScalarSubquery ¶
type CorrelatedScalarSubquery struct {
}
CorrelatedScalarSubquery pairs a correlation alias with a logical plan for a correlated scalar subquery like `(SELECT COUNT(*) FROM orders o WHERE o.customer_id = c.id)`. The inner plan has the correlation predicate baked in as a filter child of the aggregate — the executor re-evaluates it per outer row via FlatMap. Carried on LogicalProject.
type ExistsSubquery ¶
type ExistsSubquery struct {
Alias values.CorrelationIdentifier
Plan LogicalOperator
JoinPredicate predicates.QueryPredicate
}
ExistsSubquery pairs an existential alias with the logical plan for an EXISTS subquery. Carried on LogicalFilter so the Cascades translator can build ExistentialQuantifiers over the subquery plans.
type LogicalAggregate ¶
type LogicalAggregate struct {
Input LogicalOperator
GroupKeys []string
GroupKeyValues []values.Value // resolved Value trees for GROUP BY expressions; nil slot = bare column
Aggregates []string // e.g. "SUM(a)", "COUNT(*)"
Aliases []string // parallel to Aggregates
AggregateOperands []values.Value // resolved operand Values (parallel to Aggregates); nil slot = use text
HasDistinctAggregate bool // true when any aggregate uses DISTINCT (e.g. COUNT(DISTINCT x))
Having string // canonical HAVING predicate, "" when absent
HavingPredicate predicates.QueryPredicate
HavingExistsSubqueries []ExistsSubquery // EXISTS subquery plans inside HAVING
HavingScalarSubqueries []ScalarSubquery // scalar subquery plans inside HAVING
}
LogicalAggregate runs GROUP BY + aggregate functions on its child. GroupKeys are the grouping-column expressions; Aggregates holds the aggregate-call text with aliases.
func NewAggregate ¶
func NewAggregate(input LogicalOperator, groupKeys, aggs, aliases []string, having string) *LogicalAggregate
func (*LogicalAggregate) Children ¶
func (a *LogicalAggregate) Children() []LogicalOperator
func (*LogicalAggregate) Explain ¶
func (a *LogicalAggregate) Explain(indent string) string
type LogicalCTE ¶
type LogicalCTE struct {
Name string
Body LogicalOperator
Main LogicalOperator
Recursive bool
ColumnAliases []string // WITH c(a, b) AS (...) → renames body's output columns
TraversalOrder TraversalOrder
}
LogicalCTE wraps a named Common Table Expression around a Main query. The Body is the CTE's own plan; Main references Body via a LogicalScan on Name. Recursive CTEs set Recursive=true — Body may self-reference (the recursive evaluator lives at the executor layer for now).
func NewCTE ¶
func NewCTE(name string, body, main LogicalOperator, recursive bool) *LogicalCTE
NewCTE constructs a LogicalCTE.
func (*LogicalCTE) Children ¶
func (c *LogicalCTE) Children() []LogicalOperator
func (*LogicalCTE) Explain ¶
func (c *LogicalCTE) Explain(indent string) string
type LogicalDDL ¶
LogicalDDL wraps a DDL statement that has no meaningful tree shape (CREATE TABLE, DROP INDEX, …). Kind carries the DDL command ("CREATE TABLE" etc.) and Text the canonical source.
func NewDDL ¶
func NewDDL(kind, text string) *LogicalDDL
func (*LogicalDDL) Children ¶
func (*LogicalDDL) Children() []LogicalOperator
func (*LogicalDDL) Explain ¶
func (d *LogicalDDL) Explain(indent string) string
type LogicalDelete ¶
type LogicalDelete struct {
Target string
Input LogicalOperator
}
LogicalDelete removes rows matching Input from Target.
func NewDelete ¶
func NewDelete(target string, input LogicalOperator) *LogicalDelete
func (*LogicalDelete) Children ¶
func (d *LogicalDelete) Children() []LogicalOperator
func (*LogicalDelete) Explain ¶
func (d *LogicalDelete) Explain(indent string) string
type LogicalDistinct ¶
type LogicalDistinct struct {
Input LogicalOperator
}
LogicalDistinct removes duplicate rows from its input.
func NewDistinct ¶
func NewDistinct(input LogicalOperator) *LogicalDistinct
func (*LogicalDistinct) Children ¶
func (d *LogicalDistinct) Children() []LogicalOperator
func (*LogicalDistinct) Explain ¶
func (d *LogicalDistinct) Explain(indent string) string
type LogicalFilter ¶
type LogicalFilter struct {
Input LogicalOperator
Predicate predicates.QueryPredicate // preferred when non-nil
PredicateText string // source-text fallback
ExistsSubqueries []ExistsSubquery // subquery plans for EXISTS predicates
ScalarSubqueries []ScalarSubquery // subquery plans for scalar subqueries
}
LogicalFilter applies a WHERE/HAVING predicate to its child.
Predicate is the preferred representation — a cascades QueryPredicate tree produced by the expr walker. When non-nil, Explain renders it via Predicate.Explain(), which yields the normalised form after simplification (tautology-folded, NOTs pushed to leaves, operands tree-walked).
PredicateText is the fallback: the canonical source text of the WHERE expression. Used when the expression shape is out of the walker's scope (UnsupportedExpressionShapeError) or when the builder is constructed without a metadata-backed catalog (today's naive_generator Explain path, which has no transaction in scope).
func NewFilter ¶
func NewFilter(input LogicalOperator, pred string) *LogicalFilter
NewFilter constructs a text-only LogicalFilter — used by the non-catalog-aware logical-builder path where only canonical source text is available. Pair with NewFilterWithPredicate when a predicates.QueryPredicate tree is in scope (catalog-aware builder); the predicate-tree form takes precedence in Explain output when both are set.
func NewFilterWithPredicate ¶
func NewFilterWithPredicate(input LogicalOperator, pred predicates.QueryPredicate, text string) *LogicalFilter
NewFilterWithPredicate constructs a LogicalFilter whose predicate is a cascades QueryPredicate tree. The text form is retained for diagnostics so Explain output stays stable even when the Predicate render differs from the source text (e.g. after tautology-folding).
func (*LogicalFilter) Children ¶
func (f *LogicalFilter) Children() []LogicalOperator
func (*LogicalFilter) Explain ¶
func (f *LogicalFilter) Explain(indent string) string
type LogicalInsert ¶
type LogicalInsert struct {
Table string
Columns []string
Source LogicalOperator
ValuesArray values.Value
}
LogicalInsert describes an INSERT into Table. Source is the row- producing child (a SELECT); Columns is the projected-column list (may be empty to mean "all columns").
ValuesArray holds the literal VALUES rows as a Cascades array Value (an ArrayConstructorValue of one RecordConstructorValue per row), mutually exclusive with Source. The translator wraps it in an ExplodeExpression so INSERT … VALUES streams through the same Cascades path as INSERT … SELECT — matching Java's RecordConstructorValue → array → Explode → Insert shape. It is a typed Value rather than a child operator because the rows are built from evaluated literals (parameters are already substituted at plan time), which needs the connection's evaluation context the pure logical builder lacks.
func NewInsert ¶
func NewInsert(table string, cols []string, source LogicalOperator) *LogicalInsert
func (*LogicalInsert) Children ¶
func (i *LogicalInsert) Children() []LogicalOperator
func (*LogicalInsert) Explain ¶
func (i *LogicalInsert) Explain(indent string) string
type LogicalJoin ¶
type LogicalJoin struct {
Left LogicalOperator
Right LogicalOperator
Kind JoinKind
OnText string
OnPredicate any // predicates.QueryPredicate when set
// OnExistsSubqueries carries EXISTS subqueries lifted from the ON clause
// (RFC-154 §5). The cascades translator turns each into an existential
// quantifier on the join's SelectExpression, so the NLJ rule's
// implementJoinWithExistential path builds the semi-join. Only populated for
// INNER joins (OUTER EXISTS-in-ON is deferred — RFC-154 §5.2b).
OnExistsSubqueries []ExistsSubquery
}
LogicalJoin combines two children. Empty OnText means "no ON condition" (comma cross-join form — the outer WHERE provides the predicate). OnPredicate is the optional structured form (used by the catalog-aware walker); when non-nil, it takes precedence over OnText for Cascades lowering.
func NewJoin ¶
func NewJoin(left, right LogicalOperator, kind JoinKind, on string) *LogicalJoin
func NewJoinWithPredicate ¶
func NewJoinWithPredicate(left, right LogicalOperator, kind JoinKind, pred any) *LogicalJoin
NewJoinWithPredicate builds a LogicalJoin with a structured ON predicate.
func (*LogicalJoin) Children ¶
func (j *LogicalJoin) Children() []LogicalOperator
func (*LogicalJoin) Explain ¶
func (j *LogicalJoin) Explain(indent string) string
type LogicalLimit ¶
type LogicalLimit struct {
Input LogicalOperator
Limit int64
Offset int64
LimitValue values.Value
}
LogicalLimit caps the row count, optionally after skipping Offset. Negative Limit means "no limit" (pure offset).
LimitValue is an OPTIONAL runtime row cap (RFC-156 parameterized vector rank limit `... <= ?`): when non-nil the cap is evaluated at execution against the bound parameters and Limit is the no-cap sentinel (-1).
func NewLimit ¶
func NewLimit(input LogicalOperator, limit, offset int64) *LogicalLimit
func NewRuntimeLimit ¶
func NewRuntimeLimit(input LogicalOperator, limitValue values.Value, offset int64) *LogicalLimit
NewRuntimeLimit builds a LIMIT whose row cap is a runtime Value (evaluated at execution). The static Limit is the no-cap sentinel (-1).
func (*LogicalLimit) Children ¶
func (l *LogicalLimit) Children() []LogicalOperator
func (*LogicalLimit) Explain ¶
func (l *LogicalLimit) Explain(indent string) string
type LogicalOperator ¶
type LogicalOperator interface {
// Children returns the immediate child operators. Returning an
// empty slice (not nil) for leaf nodes keeps caller code free
// of nil checks.
Children() []LogicalOperator
// Explain returns an indented textual rendering of this node
// and its subtree. The indent argument is prefixed to this
// node's first line; children receive indent + " ".
//
// This is the stable surface Plan.Explain() exposes to
// frontend callers. Cascades physical plans will Explain()
// through their own impl; logical plans use this.
Explain(indent string) string
}
LogicalOperator is the root interface every logical operator satisfies. A LogicalOperator exposes its children (tree structure) and can render an indented text explanation of the subtree.
Operators are value-types (small, immutable once constructed). They are NOT identity-comparable — two structurally-identical Filter nodes should compare equal under a structural walker (implemented on top of this interface, not part of it).
func FoldTransparentUnaryInput ¶
func FoldTransparentUnaryInput(op LogicalOperator) (LogicalOperator, bool)
FoldTransparentUnaryInput returns (input, true) when op is a fold-transparent unary operator — one the RFC-141 projected-EXISTS fold descends THROUGH to reach the existential filter without changing the row shape. Only Sort and Limit qualify (a Project/Join/Aggregate/Distinct/Union reshapes the rows and is NOT fold-transparent). This is the SINGLE source of truth for the transparency set: both the translator's `findExistsFilterUnderUnaryChain` (which folds the projection through the chain) and the generator's `existsFilterReachableForFold` (which rejects a projected EXISTS the fold cannot reach) consult it, so the two can never silently diverge. Returns (nil, false) for any non-transparent op.
type LogicalProject ¶
type LogicalProject struct {
Input LogicalOperator
Projections []string
Aliases []string // parallel to Projections; "" means no alias
ProjectedValues []values.Value // parallel to Projections; nil slot = walker declined
IsComputed []bool // parallel to Projections; true = expression, not plain column ref
// AggregateSlots is parallel to Projections; true = the slot's value tree
// CONTAINS an aggregate. Captured pre-rewrite, where the *AggregateValue
// node is still present (rewriteAggregateValuesInTree destructively replaces
// it with a typed FieldValue). Read once by the INSERT…SELECT promotion guard
// to identify reliably-typed aggregate-result columns — plain columns are
// concrete-typed too (ResolveIdentifier), so type-presence cannot
// discriminate. A bridge until the Java end-state (PromoteValue projection
// nodes), which dissolves this marker.
AggregateSlots []bool
ScalarSubqueries []ScalarSubquery // uncorrelated scalar subquery plans (pre-evaluated)
}
LogicalProject selects / renames columns and computes expressions. Each element of Projections is the canonical text of the projected expression or column name. Aliases (parallel slice) hold the output name; empty string means "use the underlying name."
ProjectedValues (parallel to Projections) carries resolved Value trees when the catalog-aware builder successfully walks the ANTLR expression. nil slots mean the walker declined (unsupported shape) — the Cascades translator treats nil as "cannot translate" and returns nil for the whole query. Non-nil slots are used directly as projection Values in the Cascades plan.
func NewProject ¶
func NewProject(input LogicalOperator, projs, aliases []string) *LogicalProject
func (*LogicalProject) Children ¶
func (p *LogicalProject) Children() []LogicalOperator
func (*LogicalProject) Explain ¶
func (p *LogicalProject) Explain(indent string) string
type LogicalScan ¶
LogicalScan reads a single table. Empty Alias means "use the table name as the source alias."
func (*LogicalScan) Children ¶
func (*LogicalScan) Children() []LogicalOperator
func (*LogicalScan) Explain ¶
func (s *LogicalScan) Explain(indent string) string
type LogicalSort ¶
type LogicalSort struct {
Input LogicalOperator
Keys []SortKey
}
LogicalSort sorts its child rows by the given keys.
func NewSort ¶
func NewSort(input LogicalOperator, keys []SortKey) *LogicalSort
func (*LogicalSort) Children ¶
func (s *LogicalSort) Children() []LogicalOperator
func (*LogicalSort) Explain ¶
func (s *LogicalSort) Explain(indent string) string
type LogicalUnion ¶
type LogicalUnion struct {
Inputs []LogicalOperator
Distinct bool
}
LogicalUnion ties together two (or more) children with UNION [ALL] semantics. Distinct = true applies a DISTINCT dedup across the union.
func NewUnion ¶
func NewUnion(inputs []LogicalOperator, distinct bool) *LogicalUnion
func (*LogicalUnion) Children ¶
func (u *LogicalUnion) Children() []LogicalOperator
func (*LogicalUnion) Explain ¶
func (u *LogicalUnion) Explain(indent string) string
type LogicalUnnest ¶
type LogicalUnnest struct {
// Segments is the un-flattened dotted name of the array source
// (`["T1","ARR1"]` for `T1.arr1`). Segment 0 names the in-scope outer
// source; the remaining segments name the array field on it. Kept
// un-flattened (no re-split of a joined string) so the translator
// resolves segment-by-segment against the scope.
Segments []string
// Alias is the AS alias (`x` in `... AS x`) bound to each unnested
// element. Empty when the AS alias is omitted (AT-only form).
Alias string
// AtAlias is the AT ordinal alias (`ord` in `... AT ord`), empty when
// absent. Its presence makes the Explode WITH ORDINALITY.
AtAlias string
}
LogicalUnnest is a lateral array UNNEST source in the FROM list (`FROM t, t.arr AS x [AT ord]`). It is the RIGHT child of a lateral LogicalJoin whose LEFT child is the source `t` it correlates to. The translator lowers it to an Explode of the correlated array field under a FlatMap of the outer source. Mirrors Java's `LogicalOperator.generateCorrelatedFieldAccess`. RFC-142 R5.
func (*LogicalUnnest) Children ¶
func (*LogicalUnnest) Children() []LogicalOperator
func (*LogicalUnnest) Explain ¶
func (u *LogicalUnnest) Explain(indent string) string
type LogicalUpdate ¶
type LogicalUpdate struct {
Target string
Sets []Assignment
Input LogicalOperator // the scan + filter producing target rows
}
LogicalUpdate updates Target rows matching Input with the per-col expression assignments in Sets.
func NewUpdate ¶
func NewUpdate(target string, sets []Assignment, input LogicalOperator) *LogicalUpdate
func (*LogicalUpdate) Children ¶
func (u *LogicalUpdate) Children() []LogicalOperator
func (*LogicalUpdate) Explain ¶
func (u *LogicalUpdate) Explain(indent string) string
type LogicalValues ¶
LogicalValues is a leaf operator that yields a single row of constant/expression projections — the canonical target for a SELECT without a FROM clause (`SELECT 1 + 2, 'hello'`). Rows is a list of expression-texts per output column; Aliases is parallel (empty string = no AS clause). The number of rows is always 1 in this seed; a future VALUES (…), (…) literal table would extend to multi-row. Java equivalent: a ConstantExpression flowing through LogicalProjectionExpression.
func NewValues ¶
func NewValues(rows, aliases []string) *LogicalValues
NewValues constructs a LogicalValues with per-column expression text + parallel aliases.
func (*LogicalValues) Children ¶
func (*LogicalValues) Children() []LogicalOperator
func (*LogicalValues) Explain ¶
func (v *LogicalValues) Explain(indent string) string
type ScalarSubquery ¶
type ScalarSubquery struct {
Alias values.CorrelationIdentifier
Plan LogicalOperator
}
ScalarSubquery pairs a correlation alias with the logical plan for a scalar subquery `(SELECT MAX(v) FROM t2)`. Carried on LogicalFilter and LogicalProject so the Cascades translator can build inner plans. The executor pre-evaluates these and binds the scalar result under Alias before evaluating the outer plan's predicates/projections.
type SortKey ¶
type SortKey struct {
Expr string // canonical text
Dir SortDir
NullsFirst bool
Value values.Value // resolved Value expression (nil = use text as FieldValue)
}
SortKey is one ORDER BY entry.
type TraversalOrder ¶
type TraversalOrder int
const ( TraversalLevelOrder TraversalOrder = iota TraversalPreOrder TraversalPostOrder )