orast — Oracle SQL & PL/SQL Parser in Go

orast is a pure-Go, hand-written recursive descent parser for Oracle SQL and PL/SQL. It produces a richly-typed AST (Abstract Syntax Tree) with zero external dependencies. No code generation, no yacc, no ANTLR — just idiomatic Go.
Features
- Oracle 12c+ SQL — SELECT (joins, subqueries, CTEs, hierarchical,
FOR UPDATE, OFFSET/FETCH), INSERT (multi-row, multi-table, SELECT-based), UPDATE, DELETE, MERGE
- DDL —
CREATE/ALTER/DROP for TABLE, INDEX, VIEW, SEQUENCE, SYNONYM
- PL/SQL — anonymous blocks, procedures, functions, packages, triggers,
IF/LOOP/FOR/WHILE, EXECUTE IMMEDIATE, cursors, exception handlers
- TCL/DCL —
COMMIT, ROLLBACK, SAVEPOINT, GRANT, REVOKE
- Rich AST — 70+ node types with source position tracking and a
Walk visitor
- Oracle-specific lexing — Q-quoted strings, bind variables (
:var), substitution variables (&var), optimizer hints (/*+ */), outer join marker (+), :=, =>, ..
- Hand-written recursive descent — Pratt parser for expressions, full control over error messages
- Comprehensive tests — 40 test functions, 200+ test cases
Installation
go get github.com/yourorg/orast
Quick Start
CLI
# Install
cd cmd/orast && go build -o orast .
# Parse a file
orast parse script.sql
# Fast validation (no AST, ideal for CI/CD)
orast validate sql_dir/
# Format SQL
orast format script.sql # to stdout
orast format -w script.sql # in-place
orast format --check sql_dir/ # check-only
# Parse with AST inspection
orast parse -a script.sql
Library
package main
import (
"fmt"
"github.com/yourorg/orast/parser"
)
func main() {
sql := "SELECT dept_id, COUNT(*) FROM emp WHERE sal > 5000 GROUP BY dept_id ORDER BY dept_id"
file, err := parser.ParseFile([]byte(sql))
if err != nil {
fmt.Println("Parse error:", err)
return
}
for _, stmt := range file.Stmts {
fmt.Printf("Parsed: %T\n", stmt)
}
}
Architecture
orast/
├── token/ Token constants + ~500 Oracle keyword mappings
├── lexer/ Hand-written scanner (all Oracle literal formats)
├── ast/ AST node interfaces, expressions, DML/DDL/PLSQL types
├── parser/ Recursive descent + Pratt expression parser
├── printer/ AST → SQL pretty-printer
├── cmd/orast/ CLI tool (cobra-based)
└── test/cmd/ Test utilities (benchmark, roundtrip, profiler)
| Package |
Files |
Lines |
Description |
token |
2 |
~750 |
350+ token constants, precedence table |
lexer |
1 |
~400 |
Scanner: identifiers, strings, numbers, operators, comments |
ast |
3 |
~1100 |
Node interface, 70+ node types, Walk visitor |
parser |
4 + test |
~5700 |
Recursive descent parser, Pratt expressions, DML/DDL/PLSQL |
Supported Syntax (Oracle 12c+)
DML
-- SELECT with all major clauses
SELECT /*+ INDEX(t) */ DISTINCT e.ename, d.dname
FROM emp e
JOIN dept d ON e.deptno = d.deptno
WHERE e.sal > 3000
START WITH e.mgr IS NULL
CONNECT BY PRIOR e.empno = e.mgr
GROUP BY e.deptno
HAVING COUNT(*) > 2
ORDER BY e.sal DESC NULLS LAST
FOR UPDATE OF e.sal NOWAIT;
-- WITH (CTE), set operations
WITH cte AS (SELECT * FROM t)
SELECT * FROM cte
UNION ALL
SELECT * FROM other;
-- INSERT variants
INSERT INTO t (col1, col2) VALUES (1, 'a');
INSERT INTO t SELECT * FROM u;
INSERT ALL WHEN x = 1 THEN INTO t1 VALUES (1) SELECT * FROM src;
-- UPDATE, DELETE, MERGE
UPDATE t SET col = val WHERE id = 5 RETURNING col INTO v;
DELETE FROM t WHERE id = 5;
MERGE INTO target USING source ON (id) WHEN MATCHED THEN UPDATE SET ...;
DDL
CREATE TABLE t (
id NUMBER PRIMARY KEY,
name VARCHAR2(100) NOT NULL,
amount NUMBER(10,2) DEFAULT 0,
dept_id NUMBER REFERENCES dept(id),
CONSTRAINT pk_t PRIMARY KEY (id)
);
CREATE UNIQUE INDEX idx_u ON t (col);
CREATE OR REPLACE VIEW v AS SELECT * FROM t;
CREATE SEQUENCE seq1 START WITH 1 INCREMENT BY 1;
CREATE PUBLIC SYNONYM syn FOR schema.table;
ALTER TABLE t ADD (new_col NUMBER);
DROP TABLE t CASCADE CONSTRAINTS PURGE;
TRUNCATE TABLE t;
PL/SQL
DECLARE
v_count NUMBER := 0;
CURSOR c IS SELECT * FROM emp;
BEGIN
FOR rec IN c LOOP
v_count := v_count + 1;
END LOOP;
IF v_count > 10 THEN
DBMS_OUTPUT.PUT_LINE('Many');
ELSIF v_count > 0 THEN
DBMS_OUTPUT.PUT_LINE('Some');
END IF;
EXECUTE IMMEDIATE 'SELECT COUNT(*) FROM emp' INTO v_count;
EXCEPTION
WHEN NO_DATA_FOUND THEN
NULL;
WHEN OTHERS THEN
RAISE;
END;
CREATE OR REPLACE PROCEDURE calc_bonus(p_emp_id IN NUMBER, p_bonus OUT NUMBER) IS
BEGIN
SELECT sal * 0.1 INTO p_bonus FROM emp WHERE empno = p_emp_id;
END;
CREATE OR REPLACE FUNCTION get_name(p_id NUMBER) RETURN VARCHAR2 IS
v_name VARCHAR2(100);
BEGIN
SELECT ename INTO v_name FROM emp WHERE empno = p_id;
RETURN v_name;
END;
CREATE OR REPLACE TRIGGER trg_audit
BEFORE INSERT OR UPDATE ON emp
FOR EACH ROW
BEGIN
:NEW.updated_at := SYSDATE;
END;
Development
# Build & test
go build ./...
go test ./... -v
# Build CLI
go build -o orast ./cmd/orast/
# Run test suite (585 real-world SQL files)
go run ./test/cmd/parse_all/main.go sql_files.txt
# Round-trip validation (parse → print → parse)
go run ./test/cmd/roundtrip/main.go sql_files.txt
# Lint
go vet ./...
License
MIT
Roadmap
- Phase 1: Token, Lexer, AST, Parser skeleton
- Phase 2: Expression parser (Pratt)
- Phase 3: DML statements (SELECT/INSERT/UPDATE/DELETE/MERGE)
- Phase 4–6: DDL, TCL, DCL, PL/SQL blocks & subprograms
- Printer: AST → SQL pretty-printer (round-trip testing)
- CLI tool:
orast parse|format|validate (cobra-based)
- Memory safety: no leaks, ~5MB peak for 7MB files
- PL/SQL: packages with full body support, compound triggers
- PIVOT/UNPIVOT,
MATCH_RECOGNIZE, MODEL clause
-
CREATE TYPE, object types, collections
- Fuzz testing, golden file tests, real-world SQL corpus