README
¶
go-dump
A parallel MySQL logical backup tool written in Go. Produces one SQL file per table per thread with a consistent, point-in-time snapshot using InnoDB MVCC — similar to mydumper but without the C dependency.
Contents
- How it works
- MySQL version support
- Use cases
- Building
- Quick start
- All flags
- INI file configuration
- Common examples
- Output files
- Restoring
- Required privileges
- Cloud SQL notes
How it works
- Table discovery — resolves the full table list from
--databases,--tables, or--all-databases. - Chunking — for each table, queries the primary (or unique) key to split rows into
fixed-size ranges. Tables without a usable key are handled as a single chunk or
skipped, depending on
--tables-without-uniquekey. - Consistent snapshot — locks tables (
LOCK TABLES … READorFLUSH TABLES WITH READ LOCKfor--all-databases), opens oneREPEATABLE READtransaction per worker thread, then immediately releases the lock. All workers read from the same MVCC snapshot for the lifetime of the dump. - Parallel workers — N goroutines consume chunks from a channel and write compressed or plain SQL files to the destination directory.
- Metadata — writes
metadata.jsonat start and updates it on completion. Records MySQL version, binlog position, GTID set, and per-table chunk counts. - Checksums (optional) — runs
CHECKSUM TABLEfor every table after the lock is released and writeschecksums.txt. These can be used to verify a restore.
MySQL version support
| Version | Supported | Notes |
|---|---|---|
| MySQL 5.7 | ✅ | Full support |
| MySQL 8.0.x | ✅ | Uses SHOW MASTER STATUS / SHOW REPLICA STATUS |
| MySQL 8.4+ | ✅ | Uses SHOW BINARY LOG STATUS / SHOW REPLICA STATUS |
| Google Cloud SQL MySQL 5.7 | ✅ | Direct TCP (private IP) |
| Google Cloud SQL MySQL 8.0 | ✅ | Direct TCP (private IP) |
Authentication: mysql_native_password (5.7) and caching_sha2_password (8.0+)
are both handled automatically by the driver.
Use cases
All three topologies work over a direct TCP connection (no proxy required):
- On-prem → On-prem: standard usage, any supported version pair.
- On-prem → Cloud SQL: connect to the Cloud SQL private IP directly.
- Cloud SQL → Cloud SQL: run go-dump on a GCE VM with access to both instances.
Building
Requires Go 1.23+.
# Clone
git clone https://github.com/ChaosHour/go-dump.git
cd go-dump
# Build native binary → bin/go-dump
make build
# Cross-compile
make build-linux # linux/amd64
make build-linux-arm64 # linux/arm64
make build-darwin # darwin/amd64
make build-darwin-arm64 # darwin/arm64 (Apple Silicon)
make build-all # all of the above
# Run tests (no MySQL required)
make test
# Run integration tests (requires MySQL at 127.0.0.1:3306)
make test-integration
The VERSION file controls the embedded version string. Override at build time:
make build VERSION=1.2.0
Quick start
# Dry run — shows estimated chunk counts without touching the filesystem
go-dump \
--mysql-host db01.example.com \
--mysql-user backup \
--mysql-password secret \
--databases myapp \
--destination /backups/myapp \
--dry-run
# Execute the dump
go-dump \
--mysql-host db01.example.com \
--mysql-user backup \
--mysql-password secret \
--databases myapp \
--destination /backups/myapp \
--threads 4 \
--chunk-size 50000 \
--execute
# Same thing with an INI file (recommended for production)
go-dump --ini-file /etc/go-dump/primary.ini --databases myapp --destination /backups/myapp --execute
All flags
Mode
| Flag | Default | Description |
|---|---|---|
--dry-run |
false | Calculate chunk counts per table and print a summary. No files written. |
--execute |
false | Run the dump. Mutually exclusive with --dry-run. |
--version |
false | Print version and exit. |
--help |
false | Print usage and exit. |
MySQL connection
| Flag | Default | Description |
|---|---|---|
--mysql-host |
localhost | MySQL server hostname or IP. |
--mysql-port |
3306 | MySQL server port. |
--mysql-socket |
— | Unix socket path. Takes precedence over host/port when set. |
--mysql-user |
root | MySQL user. |
--mysql-password |
— | MySQL password. Also readable from GODUMP_PASSWORD env var. |
--ini-file |
— | Path to an INI file (see INI file configuration). |
What to dump
| Flag | Default | Description |
|---|---|---|
--databases |
— | Comma-separated list of databases: mydb,reporting. |
--tables |
— | Comma-separated list of schema.table pairs: mydb.orders,mydb.users. |
--all-databases |
false | Dump every user database. Excludes mysql, sys, information_schema, performance_schema by default. |
--include-system-databases |
false | With --all-databases, include the mysql schema (minus slow_log/general_log). Use for on-prem full-cluster migrations. Do not use with Cloud SQL. |
--where |
— | Filter rows. Global: "status = 'active'". Per-table: "db.tbl:expr,db.tbl2:expr". |
Parallelism and chunking
| Flag | Default | Description |
|---|---|---|
--threads |
1 | Number of parallel worker goroutines. Match to available CPU cores and disk I/O capacity. |
--chunk-size |
1000 | Rows per read chunk (key range query). Larger = fewer queries, more memory per worker. |
--output-chunk-size |
0 | Rows per INSERT statement. 0 = same as --chunk-size. |
--channel-buffer-size |
1000 | Depth of the chunk work queue. Rarely needs tuning. |
--tables-without-uniquekey |
error | What to do with tables that have no PK or unique key: error (abort), single-chunk (dump entire table in one query), skip. |
Consistency
| Flag | Default | Description |
|---|---|---|
--consistent |
true | Require a consistent (point-in-time) backup via MVCC. |
--lock-tables |
true | Lock tables to establish the consistent snapshot. Required when --consistent=true. |
--isolation-level |
REPEATABLE READ | Transaction isolation level. Options: REPEATABLE READ, READ COMMITTED, READ UNCOMMITTED, SERIALIZABLE. |
Output
| Flag | Default | Description |
|---|---|---|
--destination |
— | Required. Directory to write dump files. Created if it does not exist. |
--add-drop-table |
false | Prepend DROP TABLE IF EXISTS before each CREATE TABLE. |
--skip-use-database |
false | Omit USE \schema`` statements from chunk files. |
--compress |
false | Gzip-compress output files (.sql.gz). |
--compress-level |
1 | Compression level 1 (fastest) to 9 (smallest). |
--checksum |
false | Run CHECKSUM TABLE after the dump and write checksums.txt. |
--get-master-status |
false | Record binlog file/position and GTID set in master-data.sql and metadata.json. |
--get-slave-status |
false | Record replica status in slave-data.sql. |
--output-chunk-size |
0 | Rows per INSERT statement (0 = same as chunk-size). |
Logging
| Flag | Default | Description |
|---|---|---|
--debug |
false | Enable debug-level logging. |
--quiet |
false | Suppress INFO messages (warnings and errors still print). |
INI file configuration
The --ini-file flag accepts a MySQL-style INI file. This is the recommended way to
manage credentials and defaults in production — keeps passwords out of shell history
and process listings.
Supported sections: [client], [mysqldump] (MySQL standard keys), and [go-dump]
(go-dump-specific keys).
[client]
user = backup_user
password = s3cr3t#with#hashes # '#' in passwords is handled correctly
host = db01.example.com
port = 3306
[go-dump]
threads = 8
chunk-size = 50000
output-chunk-size = 5000
destination = /backups/mysql
compress = true
compress-level = 1
add-drop-table = true
get-master-status = true
checksum = true
tables-without-uniquekey = single-chunk
consistent = true
isolation-level = REPEATABLE READ
Command-line flags override INI values. The GODUMP_PASSWORD environment variable is
used as a last resort if --mysql-password is not set and the INI file has no password.
MySQL defaults-group-suffix
Use multiple INI sections to target different servers:
# ~/.my.cnf
[client_primary1]
user = backup
password = primary_password
host = primary1.db.internal
port = 3306
[client_replica1]
user = backup
password = replica_password
host = replica1.db.internal
port = 3306
go-dump --ini-file ~/.my.cnf --databases myapp --destination /backups/myapp --execute
# (go-dump reads [go-dump] section; [client] / [mysqldump] sections for credentials)
Common examples
Dump a single database
go-dump \
--mysql-host db01.example.com \
--mysql-user backup \
--mysql-password secret \
--databases myapp \
--destination /backups/myapp \
--threads 4 \
--chunk-size 50000 \
--get-master-status \
--execute
Dump multiple databases
go-dump \
--ini-file /etc/go-dump/prod.ini \
--databases "myapp,reporting,analytics" \
--destination /backups/prod \
--threads 8 \
--chunk-size 100000 \
--compress \
--add-drop-table \
--execute
Dump specific tables
go-dump \
--ini-file /etc/go-dump/prod.ini \
--tables "myapp.orders,myapp.order_items,myapp.customers" \
--destination /backups/orders \
--threads 4 \
--execute
Dry run to estimate chunk counts
go-dump \
--ini-file /etc/go-dump/prod.ini \
--databases myapp \
--destination /tmp/unused \
--threads 4 \
--chunk-size 50000 \
--dry-run
Output:
2026-06-10 14:00:01 INFO Table: myapp.orders Engine: InnoDB Estimated Chunks: 1200
2026-06-10 14:00:01 INFO Table: myapp.customers Engine: InnoDB Estimated Chunks: 80
2026-06-10 14:00:01 INFO Table: myapp.products Engine: InnoDB Estimated Chunks: 12
1200 -> `myapp`.`orders`
80 -> `myapp`.`customers`
12 -> `myapp`.`products`
Dump all user databases (on-prem)
go-dump \
--ini-file /etc/go-dump/prod.ini \
--all-databases \
--destination /backups/full \
--threads 8 \
--chunk-size 100000 \
--get-master-status \
--add-drop-table \
--checksum \
--execute
Dump all databases including mysql schema (account migration)
go-dump \
--ini-file /etc/go-dump/source.ini \
--all-databases \
--include-system-databases \
--destination /backups/full-cluster \
--threads 4 \
--execute
Note: Do not use
--include-system-databaseswith Cloud SQL. Cloud SQL manages its ownmysql.*permission tables via IAM.
Dump with checksums for restore verification
go-dump \
--ini-file /etc/go-dump/prod.ini \
--databases myapp \
--destination /backups/myapp \
--threads 8 \
--chunk-size 50000 \
--get-master-status \
--checksum \
--execute
After restore, verify:
# (checksums.txt is in the dump directory — go-load or a custom script
# can run CHECKSUM TABLE on the target and compare)
cat /backups/myapp/checksums.txt
Dump with row filters (WHERE conditions)
# Global filter — applies to every table
go-dump \
--ini-file /etc/go-dump/prod.ini \
--databases myapp \
--destination /backups/active-only \
--where "status = 'active' AND created_at >= '2025-01-01'" \
--execute
# Per-table filters — different condition per table
go-dump \
--ini-file /etc/go-dump/prod.ini \
--databases myapp \
--destination /backups/filtered \
--where "myapp.orders:total > 100.00,myapp.customers:country = 'US'" \
--execute
Compressed dump
go-dump \
--ini-file /etc/go-dump/prod.ini \
--databases myapp \
--destination /backups/myapp \
--threads 8 \
--compress \
--compress-level 1 \
--execute
# Output files: myapp.orders-thread0.sql.gz, myapp.orders-definition.sql.gz, etc.
Large table — tune chunk size
For an 8M-row table dumped with 4 threads:
go-dump \
--ini-file /etc/go-dump/prod.ini \
--tables "analytics.events" \
--destination /backups/events \
--threads 4 \
--chunk-size 100000 \
--output-chunk-size 10000 \
--tables-without-uniquekey single-chunk \
--execute
Progress output (every 5 seconds):
2026-06-10 14:02:15 INFO Progress: 450/1700 (26.5%) | Rate: 14.2 chunks/s | ETA: ~1m28s
2026-06-10 14:02:20 INFO Progress: 521/1700 (30.6%) | Rate: 14.1 chunks/s | ETA: ~1m22s
Cloud SQL (on-prem → Cloud SQL via private IP)
go-dump \
--mysql-host 10.80.0.3 \
--mysql-user backup \
--mysql-password secret \
--databases myapp \
--destination /backups/cloudsql \
--threads 4 \
--chunk-size 50000 \
--get-master-status \
--execute
Output files
A dump of myapp.orders with 4 threads produces:
/backups/myapp/
├── metadata.json # dump metadata (MySQL version, binlog, table status)
├── master-data.sql # binlog position / GTID set (--get-master-status)
├── slave-data.sql # replica status (--get-slave-status)
├── checksums.txt # CHECKSUM TABLE results (--checksum)
├── myapp-schema-create.sql # CREATE DATABASE IF NOT EXISTS (one per schema)
├── myapp.orders-definition.sql # CREATE TABLE statement
├── myapp.orders-thread0.sql # rows assigned to worker 0
├── myapp.orders-thread1.sql # rows assigned to worker 1
├── myapp.orders-thread2.sql # rows assigned to worker 2
├── myapp.orders-thread3.sql # rows assigned to worker 3
├── myapp.customers-definition.sql
└── myapp.customers.sql # tables without a PK produce a single file
metadata.json
Written at dump start (status: in_progress) and updated on clean finish
(status: complete). Used by resume logic and restore verification.
{
"go_dump_version": "1.0.0",
"start_time": "2026-06-10T14:00:00Z",
"end_time": "2026-06-10T14:05:30Z",
"status": "complete",
"mysql_host": "db01.example.com",
"mysql_version": "8.0.43",
"binlog_file": "binlog.000042",
"binlog_position": 1421,
"gtid_set": "3e11fa47-71ca-11e1-9e33-c80aa9429562:1-123",
"character_set": "utf8mb4",
"tables": [
{
"schema": "myapp",
"name": "orders",
"row_estimate": 1500000,
"chunks": 170,
"status": "done",
"checksum": 3829201847
}
]
}
checksums.txt
# go-dump checksum file — generated 2026-06-10T14:05:30Z
# Format: schema.table checksum timestamp
myapp.orders 3829201847 2026-06-10T14:05:31Z
myapp.customers 918273645 2026-06-10T14:05:32Z
Caveat:
CHECKSUM TABLEruns after the locks are released and reads the current table state. The recorded values only match the dumped data if no writes occurred between the snapshot and the checksum. Use--checksum/go-load --verifyon quiesced sources (maintenance windows, stopped replicas) — on a live primary, a mismatch does not necessarily mean the dump is bad.
Restoring
go-load (in this repo) loads files in the right order automatically:
database creation (*-schema-create.sql), then table definitions, then data
files in parallel — so a dump restores onto a server where the database does
not exist yet.
go-load --host target-host --user root --password ... --directory /backups/myapp --workers 4 --verify
With standard MySQL tooling instead:
# Create the database and schema first
mysql -h target-host -u root -p < /backups/myapp/myapp-schema-create.sql
mysql -h target-host -u root -p myapp < /backups/myapp/myapp.orders-definition.sql
# Restore data files (parallel with xargs or a shell loop)
ls /backups/myapp/myapp.orders-thread*.sql | xargs -P4 -I{} mysql -h target-host -u root -p myapp < {}
# Or restore compressed files
ls /backups/myapp/*.sql.gz | xargs -P4 -I{} sh -c 'zcat {} | mysql -h target-host -u root -p myapp'
For point-in-time recovery, apply binary logs from the position recorded in
master-data.sql (or metadata.json binlog_file/binlog_position).
Required privileges
-- Minimum privileges for the backup user
GRANT SELECT, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'backup'@'%';
-- If using --get-slave-status
GRANT SELECT, RELOAD, LOCK TABLES, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'backup'@'%';
| Privilege | Required for |
|---|---|
SELECT |
Reading table data |
RELOAD |
FLUSH TABLES WITH READ LOCK (used with --all-databases) |
LOCK TABLES |
LOCK TABLES … READ (used with --databases / --tables) |
REPLICATION CLIENT |
SHOW MASTER STATUS / SHOW BINARY LOG STATUS / SHOW REPLICA STATUS |
REPLICATION SLAVE |
SHOW REPLICA STATUS when --get-slave-status |
SUPER is not required. go-dump is compatible with Cloud SQL, RDS, and other
managed MySQL services that restrict super-user access.
Cloud SQL notes
- Connect directly to the Cloud SQL private IP — no Auth Proxy required.
- The Cloud SQL root user has
RELOADandREPLICATION CLIENT— all required privileges are available. - Do not use
--include-system-databaseswith Cloud SQL. Themysql.*tables are managed by Cloud SQL's IAM layer; importing them into another instance will corrupt permissions. --all-databasessafely excludesmysql,sys,information_schema, andperformance_schemaby default.
License
Directories
¶
| Path | Synopsis |
|---|---|
|
cmd
|
|
|
go-dump
command
|
|
|
go-load
command
|
|
|
internal
|
|
|
log
Package log is a thin slog-backed wrapper that preserves the outbrain/golib/log call surface (Debug/Info/Warning/Fatal etc.) while removing the external dependency.
|
Package log is a thin slog-backed wrapper that preserves the outbrain/golib/log call surface (Debug/Info/Warning/Fatal etc.) while removing the external dependency. |