Dataform is an application to manage data in BigQuery, Snowflake, Redshift, and other data warehouses. It enables data teams to build scalable, tested, SQL-based data transformation pipelines using version control and engineering-inspired best practices.
Compile hundreds of data models in under a second using SQLX. SQLX extends your existing SQL warehouse dialect to add features that support dependency management, testing, documentation and more.
- Azure SQL Data Warehouse
- Presto (under development)
Data modeling with Dataform
- Turn any SQL query into a dataset published back to your warehouse
- Write data quality checks for your datasets
- Simplify generation of incremental tables using merge/insert to save costs
- Generate a DAG automatically from dataset dependencies
- Document datasets in code alongside your SQL
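To illustrate how a DAG can be derived from dataset dependencies, here is a minimal Python sketch. It is not Dataform's actual compiler: it assumes each dataset is a SQL string that references upstream datasets with `${ref("name")}`, and the helper names (`extract_refs`, `build_run_order`) and the example datasets are hypothetical.

```python
import re
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Matches ${ref("name")} or ${ref('name')} inside a SQLX-style query body.
REF_PATTERN = re.compile(r"""\$\{\s*ref\(\s*["']([^"']+)["']\s*\)\s*\}""")


def extract_refs(sql: str) -> set[str]:
    """Return the names of all datasets referenced by a query."""
    return set(REF_PATTERN.findall(sql))


def build_run_order(datasets: dict[str, str]) -> list[str]:
    """Order datasets so that each one comes after everything it references.

    Raises graphlib.CycleError if the references form a cycle.
    """
    # Map each dataset to the set of datasets it depends on.
    graph = {name: extract_refs(sql) for name, sql in datasets.items()}
    return list(TopologicalSorter(graph).static_order())


# Hypothetical datasets, for illustration only.
EXAMPLE_DATASETS = {
    "daily_revenue": 'select order_date, sum(amount) as revenue '
                     'from ${ref("clean_orders")} group by 1',
    "clean_orders": 'select * from ${ref("raw_orders")} where amount is not null',
    "raw_orders": "select * from source.orders",
}

if __name__ == "__main__":
    print(build_run_order(EXAMPLE_DATASETS))
```

Because dependencies are read from the queries themselves, adding a new `ref` to a query is enough to change the run order; nothing has to be declared separately.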
More examples and packages
- Reading and writing data from S3
- Writing unit tests
- Create slowly-changing dimension tables
- Manage development, staging and production environments
- Model Segment data in minutes
- Analyse BigQuery usage logs
With the CLI
You can install the Dataform SDK with the following command. Follow the docs to get started.
npm i -g @dataform/cli
With Dataform web
Dataform web is a development environment and production-ready application for the Dataform SDK. You can learn more at dataform.co.
How it works
- Read the docs here
More about Dataform
- 5 minute overview video
- Read about how we think you should approach building a modern analytics stack
Join the Dataform community
Want to report a bug or request a feature?
Want to contribute?
Check out our contributors guide to get started with setting up the repo.