go-dump

module

v0.0.0-...-e501bf0 Latest Latest Go to latest Published: Jun 12, 2026 License: MIT

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/ChaosHour/go-dump

Links

Open Source Insights

README ¶

go-dump

A parallel MySQL logical backup tool written in Go. Produces one SQL file per table per thread with a consistent, point-in-time snapshot using InnoDB MVCC — similar to mydumper but without the C dependency.

How it works

Table discovery — resolves the full table list from --databases, --tables, or --all-databases.
Chunking — for each table, queries the primary (or unique) key to split rows into fixed-size ranges. Tables without a usable key are handled as a single chunk or skipped, depending on --tables-without-uniquekey.
Consistent snapshot — locks tables (LOCK TABLES … READ or FLUSH TABLES WITH READ LOCK for --all-databases), opens one REPEATABLE READ transaction per worker thread, then immediately releases the lock. All workers read from the same MVCC snapshot for the lifetime of the dump.
Parallel workers — N goroutines consume chunks from a channel and write compressed or plain SQL files to the destination directory.
Metadata — writes metadata.json at start and updates it on completion. Records MySQL version, binlog position, GTID set, and per-table chunk counts.
Checksums (optional) — runs CHECKSUM TABLE for every table after the lock is released and writes checksums.txt. These can be used to verify a restore.

MySQL version support

Version	Supported	Notes
MySQL 5.7	✅	Full support
MySQL 8.0.x	✅	Uses `SHOW MASTER STATUS` / `SHOW REPLICA STATUS`
MySQL 8.4+	✅	Uses `SHOW BINARY LOG STATUS` / `SHOW REPLICA STATUS`
Google Cloud SQL MySQL 5.7	✅	Direct TCP (private IP)
Google Cloud SQL MySQL 8.0	✅	Direct TCP (private IP)

Authentication: mysql_native_password (5.7) and caching_sha2_password (8.0+) are both handled automatically by the driver.

Use cases

All three topologies work over a direct TCP connection (no proxy required):

On-prem → On-prem: standard usage, any supported version pair.
On-prem → Cloud SQL: connect to the Cloud SQL private IP directly.
Cloud SQL → Cloud SQL: run go-dump on a GCE VM with access to both instances.

Building

Requires Go 1.23+.

# Clone
git clone https://github.com/ChaosHour/go-dump.git
cd go-dump

# Build native binary → bin/go-dump
make build

# Cross-compile
make build-linux          # linux/amd64
make build-linux-arm64    # linux/arm64
make build-darwin         # darwin/amd64
make build-darwin-arm64   # darwin/arm64 (Apple Silicon)
make build-all            # all of the above

# Run tests (no MySQL required)
make test

# Run integration tests (requires MySQL at 127.0.0.1:3306)
make test-integration

The VERSION file controls the embedded version string. Override at build time:

make build VERSION=1.2.0

Quick start

# Dry run — shows estimated chunk counts without touching the filesystem
go-dump \
  --mysql-host db01.example.com \
  --mysql-user backup \
  --mysql-password secret \
  --databases myapp \
  --destination /backups/myapp \
  --dry-run

# Execute the dump
go-dump \
  --mysql-host db01.example.com \
  --mysql-user backup \
  --mysql-password secret \
  --databases myapp \
  --destination /backups/myapp \
  --threads 4 \
  --chunk-size 50000 \
  --execute

# Same thing with an INI file (recommended for production)
go-dump --ini-file /etc/go-dump/primary.ini --databases myapp --destination /backups/myapp --execute

All flags

Mode

Flag	Default	Description
`--dry-run`	false	Calculate chunk counts per table and print a summary. No files written.
`--execute`	false	Run the dump. Mutually exclusive with `--dry-run`.
`--version`	false	Print version and exit.
`--help`	false	Print usage and exit.

MySQL connection

Flag	Default	Description
`--mysql-host`	localhost	MySQL server hostname or IP.
`--mysql-port`	3306	MySQL server port.
`--mysql-socket`	—	Unix socket path. Takes precedence over host/port when set.
`--mysql-user`	root	MySQL user.
`--mysql-password`	—	MySQL password. Also readable from `GODUMP_PASSWORD` env var.
`--ini-file`	—	Path to an INI file (see INI file configuration).

What to dump

Flag	Default	Description
`--databases`	—	Comma-separated list of databases: `mydb,reporting`.
`--tables`	—	Comma-separated list of `schema.table` pairs: `mydb.orders,mydb.users`.
`--all-databases`	false	Dump every user database. Excludes `mysql`, `sys`, `information_schema`, `performance_schema` by default.
`--include-system-databases`	false	With `--all-databases`, include the `mysql` schema (minus `slow_log`/`general_log`). Use for on-prem full-cluster migrations. Do not use with Cloud SQL.
`--where`	—	Filter rows. Global: `"status = 'active'"`. Per-table: `"db.tbl:expr,db.tbl2:expr"`.

Parallelism and chunking

Flag	Default	Description
`--threads`	1	Number of parallel worker goroutines. Match to available CPU cores and disk I/O capacity.
`--chunk-size`	1000	Rows per read chunk (key range query). Larger = fewer queries, more memory per worker.
`--output-chunk-size`	0	Rows per `INSERT` statement. 0 = same as `--chunk-size`.
`--channel-buffer-size`	1000	Depth of the chunk work queue. Rarely needs tuning.
`--tables-without-uniquekey`	error	What to do with tables that have no PK or unique key: `error` (abort), `single-chunk` (dump entire table in one query), `skip`.

Consistency

Flag	Default	Description
`--consistent`	true	Require a consistent (point-in-time) backup via MVCC.
`--lock-tables`	true	Lock tables to establish the consistent snapshot. Required when `--consistent=true`.
`--isolation-level`	REPEATABLE READ	Transaction isolation level. Options: `REPEATABLE READ`, `READ COMMITTED`, `READ UNCOMMITTED`, `SERIALIZABLE`.

Output

Flag	Default	Description
`--destination`	—	Required. Directory to write dump files. Created if it does not exist.
`--add-drop-table`	false	Prepend `DROP TABLE IF EXISTS` before each `CREATE TABLE`.
`--skip-use-database`	false	Omit `USE \`schema`` statements from chunk files.
`--compress`	false	Gzip-compress output files (`.sql.gz`).
`--compress-level`	1	Compression level 1 (fastest) to 9 (smallest).
`--checksum`	false	Run `CHECKSUM TABLE` after the dump and write `checksums.txt`.
`--get-master-status`	false	Record binlog file/position and GTID set in `master-data.sql` and `metadata.json`.
`--get-slave-status`	false	Record replica status in `slave-data.sql`.
`--output-chunk-size`	0	Rows per INSERT statement (0 = same as chunk-size).

Logging

Flag	Default	Description
`--debug`	false	Enable debug-level logging.
`--quiet`	false	Suppress INFO messages (warnings and errors still print).

INI file configuration

The --ini-file flag accepts a MySQL-style INI file. This is the recommended way to manage credentials and defaults in production — keeps passwords out of shell history and process listings.

Supported sections: [client], [mysqldump] (MySQL standard keys), and [go-dump] (go-dump-specific keys).

[client]
user     = backup_user
password = s3cr3t#with#hashes   # '#' in passwords is handled correctly
host     = db01.example.com
port     = 3306

[go-dump]
threads             = 8
chunk-size          = 50000
output-chunk-size   = 5000
destination         = /backups/mysql
compress            = true
compress-level      = 1
add-drop-table      = true
get-master-status   = true
checksum            = true
tables-without-uniquekey = single-chunk
consistent          = true
isolation-level     = REPEATABLE READ

Command-line flags override INI values. The GODUMP_PASSWORD environment variable is used as a last resort if --mysql-password is not set and the INI file has no password.

MySQL defaults-group-suffix

Use multiple INI sections to target different servers:

# ~/.my.cnf
[client_primary1]
user     = backup
password = primary_password
host     = primary1.db.internal
port     = 3306

[client_replica1]
user     = backup
password = replica_password
host     = replica1.db.internal
port     = 3306

go-dump --ini-file ~/.my.cnf --databases myapp --destination /backups/myapp --execute
# (go-dump reads [go-dump] section; [client] / [mysqldump] sections for credentials)

Common examples

Dump a single database

go-dump \
  --mysql-host db01.example.com \
  --mysql-user backup \
  --mysql-password secret \
  --databases myapp \
  --destination /backups/myapp \
  --threads 4 \
  --chunk-size 50000 \
  --get-master-status \
  --execute

Dump multiple databases

go-dump \
  --ini-file /etc/go-dump/prod.ini \
  --databases "myapp,reporting,analytics" \
  --destination /backups/prod \
  --threads 8 \
  --chunk-size 100000 \
  --compress \
  --add-drop-table \
  --execute

Dump specific tables

go-dump \
  --ini-file /etc/go-dump/prod.ini \
  --tables "myapp.orders,myapp.order_items,myapp.customers" \
  --destination /backups/orders \
  --threads 4 \
  --execute

Dry run to estimate chunk counts

go-dump \
  --ini-file /etc/go-dump/prod.ini \
  --databases myapp \
  --destination /tmp/unused \
  --threads 4 \
  --chunk-size 50000 \
  --dry-run

Output:

2026-06-10 14:00:01 INFO Table: myapp.orders Engine: InnoDB Estimated Chunks: 1200
2026-06-10 14:00:01 INFO Table: myapp.customers Engine: InnoDB Estimated Chunks: 80
2026-06-10 14:00:01 INFO Table: myapp.products Engine: InnoDB Estimated Chunks: 12
   1200 -> `myapp`.`orders`
     80 -> `myapp`.`customers`
     12 -> `myapp`.`products`

Dump all user databases (on-prem)

go-dump \
  --ini-file /etc/go-dump/prod.ini \
  --all-databases \
  --destination /backups/full \
  --threads 8 \
  --chunk-size 100000 \
  --get-master-status \
  --add-drop-table \
  --checksum \
  --execute

Dump all databases including mysql schema (account migration)

go-dump \
  --ini-file /etc/go-dump/source.ini \
  --all-databases \
  --include-system-databases \
  --destination /backups/full-cluster \
  --threads 4 \
  --execute

Note: Do not use --include-system-databases with Cloud SQL. Cloud SQL manages its own mysql.* permission tables via IAM.

Dump with checksums for restore verification

go-dump \
  --ini-file /etc/go-dump/prod.ini \
  --databases myapp \
  --destination /backups/myapp \
  --threads 8 \
  --chunk-size 50000 \
  --get-master-status \
  --checksum \
  --execute

After restore, verify:

# (checksums.txt is in the dump directory — go-load or a custom script
#  can run CHECKSUM TABLE on the target and compare)
cat /backups/myapp/checksums.txt

Dump with row filters (WHERE conditions)

# Global filter — applies to every table
go-dump \
  --ini-file /etc/go-dump/prod.ini \
  --databases myapp \
  --destination /backups/active-only \
  --where "status = 'active' AND created_at >= '2025-01-01'" \
  --execute

# Per-table filters — different condition per table
go-dump \
  --ini-file /etc/go-dump/prod.ini \
  --databases myapp \
  --destination /backups/filtered \
  --where "myapp.orders:total > 100.00,myapp.customers:country = 'US'" \
  --execute

Compressed dump

go-dump \
  --ini-file /etc/go-dump/prod.ini \
  --databases myapp \
  --destination /backups/myapp \
  --threads 8 \
  --compress \
  --compress-level 1 \
  --execute
# Output files: myapp.orders-thread0.sql.gz, myapp.orders-definition.sql.gz, etc.

Large table — tune chunk size

For an 8M-row table dumped with 4 threads:

go-dump \
  --ini-file /etc/go-dump/prod.ini \
  --tables "analytics.events" \
  --destination /backups/events \
  --threads 4 \
  --chunk-size 100000 \
  --output-chunk-size 10000 \
  --tables-without-uniquekey single-chunk \
  --execute

Progress output (every 5 seconds):

2026-06-10 14:02:15 INFO Progress: 450/1700 (26.5%) | Rate: 14.2 chunks/s | ETA: ~1m28s
2026-06-10 14:02:20 INFO Progress: 521/1700 (30.6%) | Rate: 14.1 chunks/s | ETA: ~1m22s

Cloud SQL (on-prem → Cloud SQL via private IP)

go-dump \
  --mysql-host 10.80.0.3 \
  --mysql-user backup \
  --mysql-password secret \
  --databases myapp \
  --destination /backups/cloudsql \
  --threads 4 \
  --chunk-size 50000 \
  --get-master-status \
  --execute

Output files

A dump of myapp.orders with 4 threads produces:

/backups/myapp/
├── metadata.json                          # dump metadata (MySQL version, binlog, table status)
├── master-data.sql                        # binlog position / GTID set (--get-master-status)
├── slave-data.sql                         # replica status (--get-slave-status)
├── checksums.txt                          # CHECKSUM TABLE results (--checksum)
├── myapp-schema-create.sql                # CREATE DATABASE IF NOT EXISTS (one per schema)
├── myapp.orders-definition.sql            # CREATE TABLE statement
├── myapp.orders-thread0.sql               # rows assigned to worker 0
├── myapp.orders-thread1.sql               # rows assigned to worker 1
├── myapp.orders-thread2.sql               # rows assigned to worker 2
├── myapp.orders-thread3.sql               # rows assigned to worker 3
├── myapp.customers-definition.sql
└── myapp.customers.sql                    # tables without a PK produce a single file

metadata.json

Written at dump start (status: in_progress) and updated on clean finish (status: complete). Used by resume logic and restore verification.

{
  "go_dump_version": "1.0.0",
  "start_time": "2026-06-10T14:00:00Z",
  "end_time": "2026-06-10T14:05:30Z",
  "status": "complete",
  "mysql_host": "db01.example.com",
  "mysql_version": "8.0.43",
  "binlog_file": "binlog.000042",
  "binlog_position": 1421,
  "gtid_set": "3e11fa47-71ca-11e1-9e33-c80aa9429562:1-123",
  "character_set": "utf8mb4",
  "tables": [
    {
      "schema": "myapp",
      "name": "orders",
      "row_estimate": 1500000,
      "chunks": 170,
      "status": "done",
      "checksum": 3829201847
    }
  ]
}

checksums.txt

# go-dump checksum file — generated 2026-06-10T14:05:30Z
# Format: schema.table  checksum  timestamp
myapp.orders      3829201847    2026-06-10T14:05:31Z
myapp.customers   918273645     2026-06-10T14:05:32Z

Caveat: CHECKSUM TABLE runs after the locks are released and reads the current table state. The recorded values only match the dumped data if no writes occurred between the snapshot and the checksum. Use --checksum / go-load --verify on quiesced sources (maintenance windows, stopped replicas) — on a live primary, a mismatch does not necessarily mean the dump is bad.

Restoring

go-load (in this repo) loads files in the right order automatically: database creation (*-schema-create.sql), then table definitions, then data files in parallel — so a dump restores onto a server where the database does not exist yet.

go-load --host target-host --user root --password ... --directory /backups/myapp --workers 4 --verify

With standard MySQL tooling instead:

# Create the database and schema first
mysql -h target-host -u root -p < /backups/myapp/myapp-schema-create.sql
mysql -h target-host -u root -p myapp < /backups/myapp/myapp.orders-definition.sql

# Restore data files (parallel with xargs or a shell loop)
ls /backups/myapp/myapp.orders-thread*.sql | xargs -P4 -I{} mysql -h target-host -u root -p myapp < {}

# Or restore compressed files
ls /backups/myapp/*.sql.gz | xargs -P4 -I{} sh -c 'zcat {} | mysql -h target-host -u root -p myapp'

For point-in-time recovery, apply binary logs from the position recorded in master-data.sql (or metadata.json binlog_file/binlog_position).

Required privileges

-- Minimum privileges for the backup user
GRANT SELECT, RELOAD, LOCK TABLES, REPLICATION CLIENT ON *.* TO 'backup'@'%';

-- If using --get-slave-status
GRANT SELECT, RELOAD, LOCK TABLES, REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'backup'@'%';

Privilege	Required for
`SELECT`	Reading table data
`RELOAD`	`FLUSH TABLES WITH READ LOCK` (used with `--all-databases`)
`LOCK TABLES`	`LOCK TABLES … READ` (used with `--databases` / `--tables`)
`REPLICATION CLIENT`	`SHOW MASTER STATUS` / `SHOW BINARY LOG STATUS` / `SHOW REPLICA STATUS`
`REPLICATION SLAVE`	`SHOW REPLICA STATUS` when `--get-slave-status`

SUPER is not required. go-dump is compatible with Cloud SQL, RDS, and other managed MySQL services that restrict super-user access.

Cloud SQL notes

Connect directly to the Cloud SQL private IP — no Auth Proxy required.
The Cloud SQL root user has RELOAD and REPLICATION CLIENT — all required privileges are available.
Do not use --include-system-databases with Cloud SQL. The mysql.* tables are managed by Cloud SQL's IAM layer; importing them into another instance will corrupt permissions.
--all-databases safely excludes mysql, sys, information_schema, and performance_schema by default.

License

Apache 2.0

Directories ¶

Path	Synopsis
cmd
go-dump command
go-load command
internal
dump
load
log Package log is a thin slog-backed wrapper that preserves the outbrain/golib/log call surface (Debug/Info/Warning/Fatal etc.) while removing the external dependency.	Package log is a thin slog-backed wrapper that preserves the outbrain/golib/log call surface (Debug/Info/Warning/Fatal etc.) while removing the external dependency.

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL

README ¶

go-dump

Contents

How it works

MySQL version support

Use cases

Building

Quick start

All flags

Mode

MySQL connection

What to dump

Parallelism and chunking

Consistency

Output

Logging

INI file configuration

MySQL defaults-group-suffix

Common examples

Dump a single database

Dump multiple databases

Dump specific tables

Dry run to estimate chunk counts

Dump all user databases (on-prem)

Dump all databases including mysql schema (account migration)

Dump with checksums for restore verification

Dump with row filters (WHERE conditions)

Compressed dump

Large table — tune chunk size

Cloud SQL (on-prem → Cloud SQL via private IP)

Output files

metadata.json

checksums.txt

Restoring

Required privileges

Cloud SQL notes

License

Directories ¶