orast

module
v0.0.0-...-4fa455d Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 2, 2026 License: MIT

README

orast — Oracle SQL & PL/SQL Parser in Go

Go Version License

orast is a pure-Go, hand-written recursive descent parser for Oracle SQL and PL/SQL. It produces a richly-typed AST (Abstract Syntax Tree) with zero external dependencies. No code generation, no yacc, no ANTLR — just idiomatic Go.

Features

  • Oracle 12c+ SQL — SELECT (joins, subqueries, CTEs, hierarchical, FOR UPDATE, OFFSET/FETCH), INSERT (multi-row, multi-table, SELECT-based), UPDATE, DELETE, MERGE
  • DDLCREATE/ALTER/DROP for TABLE, INDEX, VIEW, SEQUENCE, SYNONYM
  • PL/SQL — anonymous blocks, procedures, functions, packages, triggers, IF/LOOP/FOR/WHILE, EXECUTE IMMEDIATE, cursors, exception handlers
  • TCL/DCLCOMMIT, ROLLBACK, SAVEPOINT, GRANT, REVOKE
  • Rich AST — 70+ node types with source position tracking and a Walk visitor
  • Oracle-specific lexing — Q-quoted strings, bind variables (:var), substitution variables (&var), optimizer hints (/*+ */), outer join marker (+), :=, =>, ..
  • Hand-written recursive descent — Pratt parser for expressions, full control over error messages
  • Comprehensive tests — 40 test functions, 200+ test cases

Installation

go get github.com/yourorg/orast

Quick Start

CLI
# Install
cd cmd/orast && go build -o orast .

# Parse a file
orast parse script.sql

# Fast validation (no AST, ideal for CI/CD)
orast validate sql_dir/

# Format SQL
orast format script.sql          # to stdout
orast format -w script.sql        # in-place
orast format --check sql_dir/     # check-only

# Parse with AST inspection
orast parse -a script.sql
Library
package main

import (
    "fmt"
    "github.com/yourorg/orast/parser"
)

func main() {
    sql := "SELECT dept_id, COUNT(*) FROM emp WHERE sal > 5000 GROUP BY dept_id ORDER BY dept_id"

    file, err := parser.ParseFile([]byte(sql))
    if err != nil {
        fmt.Println("Parse error:", err)
        return
    }

    for _, stmt := range file.Stmts {
        fmt.Printf("Parsed: %T\n", stmt)
    }
}

Architecture

orast/
├── token/          Token constants + ~500 Oracle keyword mappings
├── lexer/          Hand-written scanner (all Oracle literal formats)
├── ast/            AST node interfaces, expressions, DML/DDL/PLSQL types
├── parser/         Recursive descent + Pratt expression parser
├── printer/        AST → SQL pretty-printer
├── cmd/orast/      CLI tool (cobra-based)
└── test/cmd/       Test utilities (benchmark, roundtrip, profiler)
Package Files Lines Description
token 2 ~750 350+ token constants, precedence table
lexer 1 ~400 Scanner: identifiers, strings, numbers, operators, comments
ast 3 ~1100 Node interface, 70+ node types, Walk visitor
parser 4 + test ~5700 Recursive descent parser, Pratt expressions, DML/DDL/PLSQL

Supported Syntax (Oracle 12c+)

DML
-- SELECT with all major clauses
SELECT /*+ INDEX(t) */ DISTINCT e.ename, d.dname
  FROM emp e
  JOIN dept d ON e.deptno = d.deptno
 WHERE e.sal > 3000
 START WITH e.mgr IS NULL
 CONNECT BY PRIOR e.empno = e.mgr
 GROUP BY e.deptno
HAVING COUNT(*) > 2
 ORDER BY e.sal DESC NULLS LAST
   FOR UPDATE OF e.sal NOWAIT;

-- WITH (CTE), set operations
WITH cte AS (SELECT * FROM t)
SELECT * FROM cte
 UNION ALL
SELECT * FROM other;

-- INSERT variants
INSERT INTO t (col1, col2) VALUES (1, 'a');
INSERT INTO t SELECT * FROM u;
INSERT ALL WHEN x = 1 THEN INTO t1 VALUES (1) SELECT * FROM src;

-- UPDATE, DELETE, MERGE
UPDATE t SET col = val WHERE id = 5 RETURNING col INTO v;
DELETE FROM t WHERE id = 5;
MERGE INTO target USING source ON (id) WHEN MATCHED THEN UPDATE SET ...;
DDL
CREATE TABLE t (
    id      NUMBER PRIMARY KEY,
    name    VARCHAR2(100) NOT NULL,
    amount  NUMBER(10,2) DEFAULT 0,
    dept_id NUMBER REFERENCES dept(id),
    CONSTRAINT pk_t PRIMARY KEY (id)
);

CREATE UNIQUE INDEX idx_u ON t (col);
CREATE OR REPLACE VIEW v AS SELECT * FROM t;
CREATE SEQUENCE seq1 START WITH 1 INCREMENT BY 1;
CREATE PUBLIC SYNONYM syn FOR schema.table;

ALTER TABLE t ADD (new_col NUMBER);
DROP TABLE t CASCADE CONSTRAINTS PURGE;
TRUNCATE TABLE t;
PL/SQL
DECLARE
    v_count NUMBER := 0;
    CURSOR c IS SELECT * FROM emp;
BEGIN
    FOR rec IN c LOOP
        v_count := v_count + 1;
    END LOOP;

    IF v_count > 10 THEN
        DBMS_OUTPUT.PUT_LINE('Many');
    ELSIF v_count > 0 THEN
        DBMS_OUTPUT.PUT_LINE('Some');
    END IF;

    EXECUTE IMMEDIATE 'SELECT COUNT(*) FROM emp' INTO v_count;
EXCEPTION
    WHEN NO_DATA_FOUND THEN
        NULL;
    WHEN OTHERS THEN
        RAISE;
END;
CREATE OR REPLACE PROCEDURE calc_bonus(p_emp_id IN NUMBER, p_bonus OUT NUMBER) IS
BEGIN
    SELECT sal * 0.1 INTO p_bonus FROM emp WHERE empno = p_emp_id;
END;

CREATE OR REPLACE FUNCTION get_name(p_id NUMBER) RETURN VARCHAR2 IS
    v_name VARCHAR2(100);
BEGIN
    SELECT ename INTO v_name FROM emp WHERE empno = p_id;
    RETURN v_name;
END;

CREATE OR REPLACE TRIGGER trg_audit
    BEFORE INSERT OR UPDATE ON emp
    FOR EACH ROW
BEGIN
    :NEW.updated_at := SYSDATE;
END;

Development

# Build & test
go build ./...
go test ./... -v

# Build CLI
go build -o orast ./cmd/orast/

# Run test suite (585 real-world SQL files)
go run ./test/cmd/parse_all/main.go sql_files.txt

# Round-trip validation (parse → print → parse)
go run ./test/cmd/roundtrip/main.go sql_files.txt

# Lint
go vet ./...

License

MIT

Roadmap

  • Phase 1: Token, Lexer, AST, Parser skeleton
  • Phase 2: Expression parser (Pratt)
  • Phase 3: DML statements (SELECT/INSERT/UPDATE/DELETE/MERGE)
  • Phase 4–6: DDL, TCL, DCL, PL/SQL blocks & subprograms
  • Printer: AST → SQL pretty-printer (round-trip testing)
  • CLI tool: orast parse|format|validate (cobra-based)
  • Memory safety: no leaks, ~5MB peak for 7MB files
  • PL/SQL: packages with full body support, compound triggers
  • PIVOT/UNPIVOT, MATCH_RECOGNIZE, MODEL clause
  • CREATE TYPE, object types, collections
  • Fuzz testing, golden file tests, real-world SQL corpus

Directories

Path Synopsis
Package ast declares the types used to represent Oracle SQL and PL/SQL Abstract Syntax Trees.
Package ast declares the types used to represent Oracle SQL and PL/SQL Abstract Syntax Trees.
cmd
orast command
CLI entry point for the orast Oracle SQL/PLSQL parser.
CLI entry point for the orast Oracle SQL/PLSQL parser.
orast/cmd
Package cmd provides the CLI commands for orast.
Package cmd provides the CLI commands for orast.
Package lexer implements a hand-written lexer for Oracle SQL and PL/SQL.
Package lexer implements a hand-written lexer for Oracle SQL and PL/SQL.
Package parser implements a recursive descent parser for Oracle SQL and PL/SQL.
Package parser implements a recursive descent parser for Oracle SQL and PL/SQL.
Package printer implements AST → formatted Oracle SQL output.
Package printer implements AST → formatted Oracle SQL output.
Package token provides keyword-to-token mapping for Oracle SQL and PL/SQL.
Package token provides keyword-to-token mapping for Oracle SQL and PL/SQL.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL