libschema

v0.4.0

Published: Dec 25, 2022 License: MIT Imports: 11 Imported by: 0

README

libschema - database schema migration for libraries

Install:

go get github.com/muir/libschema

Libraries

Libschema provides a way for Go libraries to manage their own database migrations.

Tying migrations to libraries supports two things. The first is source code locality: the migrations can live next to the code that uses the tables that the migrations address.

The second is support for migrations in third-party libraries. This is a relatively unexplored and unsolved problem: how can an open source (or proprietary) library specify and maintain a database schema? Libschema hopes to start solving this problem.

Register and execute

Migrations are registered:

schema := libschema.New(ctx, libschema.Options{})

sqlDB, err := sql.Open("postgres", "....")

database, err := lspostgres.New(logger, "main-db", schema, sqlDB)

database.Migrations("MyLibrary",
	lspostgres.Script("createUserTable", `
		CREATE TABLE users (
			name	text,
			id	bigint
		)`),
	lspostgres.Script("addLastLogin", `
		ALTER TABLE users
			ADD COLUMN last_login timestamp
		`),
)

Migrations are then run later, in the order that they were registered.

err := schema.Migrate(ctx)

Computed Migrations

Migrations may be SQL strings or migrations can be done in Go:

database.Migrations("MyLibrary",
	lspostgres.Computed("importUsers", func(_ context.Context, _ libschema.Migration, tx *sql.Tx) error {
		// code to import users here
		return nil
	}),
)

Asynchronous migrations

The normal mode for migrations is to run them synchronously when schema.Migrate() is called. Asynchronous migrations are started when schema.Migrate() is called, but they run in the background in a goroutine. If later migrations follow an asynchronous migration, they force it to run synchronously unless they are also asynchronous.
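
The rule above can be sketched in plain Go without libschema. The migration type and the backgroundStart helper below are hypothetical names invented for illustration and are not part of the libschema API. Assuming each migration carries only a name and an async flag, only the trailing run of asynchronous migrations is eligible for the background:

```go
package main

import "fmt"

// migration is a hypothetical stand-in for a registered migration.
type migration struct {
	name  string
	async bool
}

// backgroundStart returns the index of the first migration that may run in
// the background: the start of the trailing run of asynchronous migrations.
// Everything before that index runs synchronously, including asynchronous
// migrations that are followed by a synchronous one.
func backgroundStart(ms []migration) int {
	i := len(ms)
	for i > 0 && ms[i-1].async {
		i--
	}
	return i
}

func main() {
	ms := []migration{
		{"createTable", false},
		{"backfill", true}, // forced synchronous by "addIndex"
		{"addIndex", false},
		{"cleanup", true},
	}
	start := backgroundStart(ms)
	for i, m := range ms {
		mode := "synchronous"
		if i >= start {
			mode = "background"
		}
		fmt.Printf("%s: %s\n", m.name, mode)
	}
}
```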

Version blocking

Migrations can be tied to specific code versions so that they are not run until conditions are met. This is done with SkipRemainingIf. This can be used to backfill data.

database.Migrations("MyLibrary",
	...
	lspostgres.Script("addColumn", `
			ALTER TABLE users
				ADD COLUMN rating`,
	libschema.SkipRemainingIf(func() (bool, error) {
		return semver.Compare(version(), "3.11.3") < 1, nil
	})),
	lspostgres.Script("fillInRatings", `
			UPDATE	users
			SET	rating = ...
			WHERE	rating IS NULL;

			ALTER TABLE users
				ALTER COLUMN rating SET NOT NULL;`,
	libschema.Asynchronous()),
)
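
The gating behavior can be illustrated with a small standalone sketch. It does not use libschema: the migration type and the runnable helper are hypothetical, and the only thing taken from the documentation is the rule that a predicate returning true holds back that migration and every migration after it.

```go
package main

import "fmt"

// migration is a hypothetical stand-in: a name plus an optional gate.
type migration struct {
	name          string
	skipRemaining func() (bool, error) // nil means no gate
}

// runnable returns the names of the migrations that would run now:
// everything before the first migration whose gate returns true.
func runnable(ms []migration) ([]string, error) {
	var out []string
	for _, m := range ms {
		if m.skipRemaining != nil {
			skip, err := m.skipRemaining()
			if err != nil {
				return nil, fmt.Errorf("gate for %s: %w", m.name, err)
			}
			if skip {
				break // hold back this migration and all later ones
			}
		}
		out = append(out, m.name)
	}
	return out, nil
}

func main() {
	released := false // flips to true once the new code version ships
	ms := []migration{
		{name: "createUserTable"},
		{name: "addColumn", skipRemaining: func() (bool, error) { return !released, nil }},
		{name: "fillInRatings"},
	}
	before, _ := runnable(ms)
	released = true
	after, _ := runnable(ms)
	fmt.Println(len(before), len(after))
}
```

Because the gate is evaluated each time, the held-back migrations are picked up on the first run after the condition changes.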

Cross-library dependencies

Although it is best if the schema from one library is independent of the schema for another, sometimes that's not possible, especially if you want to enforce foreign key constraints.

Use After() to specify a cross-library dependency.

database.Migrations("users",
	...
	lspostgres.Script("addOrg", `
			ALTER TABLE users
				ADD COLUMN org TEXT,
				ADD CONSTRAINT orgfk FOREIGN KEY (org)
					REFERENCES org (name) `,
		libschema.After("orgs", "createOrgTable")),
)

database.Migrations("orgs",
	...
	lspostgres.Script("createOrgTable", `
		...
	`),
)
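
How such dependencies can turn into a run order is sketched below as a plain depth-first topological sort. This is an illustration of the general technique, not libschema's implementation: the node type, the "library/migration" key format, and the order helper are all assumptions made for the example.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a migration: its own key plus the
// keys of the migrations it must run after.
type node struct {
	key   string   // "library/migration"
	after []string // dependencies, implicit or explicit
}

// order returns the keys in an order that respects every dependency,
// visiting nodes in registration order. It fails on unknown keys and cycles.
func order(nodes []node) ([]string, error) {
	byKey := make(map[string]node, len(nodes))
	for _, n := range nodes {
		byKey[n.key] = n
	}
	const (
		visiting = 1
		done     = 2
	)
	state := make(map[string]int, len(nodes))
	var out []string
	var visit func(key string) error
	visit = func(key string) error {
		n, ok := byKey[key]
		if !ok {
			return fmt.Errorf("unknown migration %q", key)
		}
		switch state[key] {
		case done:
			return nil
		case visiting:
			return fmt.Errorf("dependency cycle at %q", key)
		}
		state[key] = visiting
		for _, dep := range n.after {
			if err := visit(dep); err != nil {
				return err
			}
		}
		state[key] = done
		out = append(out, key)
		return nil
	}
	for _, n := range nodes {
		if err := visit(n.key); err != nil {
			return nil, err
		}
	}
	return out, nil
}

func main() {
	nodes := []node{
		{key: "users/createUserTable"},
		{key: "users/addOrg", after: []string{"users/createUserTable", "orgs/createOrgTable"}},
		{key: "orgs/createOrgTable"},
	}
	got, err := order(nodes)
	if err != nil {
		panic(err)
	}
	fmt.Println(got)
}
```

In this sketch "orgs/createOrgTable" is pulled ahead of "users/addOrg" even though the orgs library was registered later.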

Transactions

For databases that support transactions on metadata, all migrations will be wrapped with a BEGIN and COMMIT. For databases that do not support transactions on metadata, migrations will be split into individual commands and run one at a time. If only some of the commands succeed, the migration will be marked as partially complete. If the migration is revised, then the later parts can be retried as long as the earlier parts are not modified. This does not apply to Computed() migrations.
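
The partial-completion idea can be sketched without a database. How libschema actually records progress is not described here, so the sketch assumes the simplest possible bookkeeping: a count of commands that have already succeeded. The runFrom helper and errSimulated are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// errSimulated stands in for a database error in this sketch.
var errSimulated = errors.New("simulated failure")

// runFrom executes cmds starting at index done, where done is the number of
// commands that succeeded on earlier attempts. It returns the new count of
// completed commands, so a failed attempt can be resumed later.
func runFrom(cmds []string, done int, exec func(string) error) (int, error) {
	for i := done; i < len(cmds); i++ {
		if err := exec(cmds[i]); err != nil {
			return i, fmt.Errorf("command %d: %w", i, err)
		}
	}
	return len(cmds), nil
}

func main() {
	cmds := []string{
		"CREATE TABLE a (id int)",
		"ALTER TABLE a ADD b int",
		"CREATE INDEX a_b ON a (b)",
	}
	failOnce := true
	exec := func(cmd string) error {
		if failOnce && cmd == cmds[1] {
			failOnce = false
			return errSimulated
		}
		fmt.Println("ran:", cmd)
		return nil
	}
	done, err := runFrom(cmds, 0, exec) // stops at the second command
	fmt.Println("completed:", done, "error:", err)
	done, err = runFrom(cmds, done, exec) // resumes without repeating the first
	fmt.Println("completed:", done, "error:", err)
}
```

Resuming by count is only safe when the already-completed commands are unchanged, which matches the restriction stated above.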

Command line

The OverrideOptions can be exposed as command line flags that change the behavior of schema.Migrate():

--migrate-only			Call os.Exit() after completing migrations
--migrate-database		Migrate only one logical database (must match NewDatabase)
--migrate-dsn			Override *sql.DB 
--no-migrate			Skip all migrations
--error-if-migrate-needed	Return error if there are outstanding synchronous migrations
--migrate-all-synchronously	Treat asynchronous migrations as synchronous
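
As a rough sketch, the same flag names can be wired up with only the standard "flag" package. The overrides struct below is a local stand-in that mirrors the fields of OverrideOptions; the parseOverrides helper is hypothetical. In a real program the parsed values would presumably be copied into a libschema.OverrideOptions and passed through Options.Overrides.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// overrides is a local stand-in that mirrors libschema.OverrideOptions.
type overrides struct {
	MigrateOnly           bool
	MigrateDatabase       string
	MigrateDSN            string
	NoMigrate             bool
	ErrorIfMigrateNeeded  bool
	EverythingSynchronous bool
}

// parseOverrides reads the migration flags from args.
func parseOverrides(args []string) (overrides, error) {
	var o overrides
	fs := flag.NewFlagSet("app", flag.ContinueOnError)
	fs.BoolVar(&o.MigrateOnly, "migrate-only", false, "Call os.Exit() after completing migrations")
	fs.StringVar(&o.MigrateDatabase, "migrate-database", "", "Migrate only one logical database")
	fs.StringVar(&o.MigrateDSN, "migrate-dsn", "", "Override *sql.DB")
	fs.BoolVar(&o.NoMigrate, "no-migrate", false, "Skip all migrations")
	fs.BoolVar(&o.ErrorIfMigrateNeeded, "error-if-migrate-needed", false, "Return error if there are outstanding synchronous migrations")
	fs.BoolVar(&o.EverythingSynchronous, "migrate-all-synchronously", false, "Treat asynchronous migrations as synchronous")
	if err := fs.Parse(args); err != nil {
		return overrides{}, err
	}
	return o, nil
}

func main() {
	o, err := parseOverrides(os.Args[1:])
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("%+v\n", o)
}
```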

Ordering and pull requests

Migrations are run in the order that they're defined. If the set of migrations is updated so that there are new migrations that are earlier in the list than migrations that have already run, this is not considered an error and the new migrations will be run anyway. This allows multiple branches of code with migrations to be merged into a combined branch without hassle.
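
A minimal sketch of that selection rule, assuming migrations are tracked only by name: whatever is registered but not yet recorded as done is pending, regardless of where it sits in the list. The pending helper and the migration names are made up for the example.

```go
package main

import "fmt"

// pending returns the registered migrations that have not run yet,
// preserving registration order.
func pending(registered []string, done map[string]bool) []string {
	var out []string
	for _, name := range registered {
		if !done[name] {
			out = append(out, name)
		}
	}
	return out
}

func main() {
	// "addEmail" arrived from another branch and was merged in ahead of
	// "addLastLogin", which has already been applied.
	registered := []string{"createUserTable", "addEmail", "addLastLogin"}
	done := map[string]bool{"createUserTable": true, "addLastLogin": true}
	fmt.Println(pending(registered, done)) // prints [addEmail]
}
```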

Migrations can have explicit dependencies, and these dependencies can cross between libraries so that one library's migrations can depend on another's.

Code structure

Registering the migrations before executing them is easier if using library singletons. Library singletons can be supported by using nserve or fx. With nserve, migrations can be given their own hook.

Driver inclusion and database support

Like database/sql, libschema requires database-specific drivers:

  • PostgreSQL support is in "github.com/muir/libschema/lspostgres"
  • MySQL support is in "github.com/muir/libschema/lsmysql"
  • SingleStore support is in "github.com/muir/libschema/lssinglestore"

libschema currently supports: PostgreSQL, SingleStore, MySQL. It is relatively easy to add additional databases.

Forward only

Libschema does not support reverse migrations. If you need to fix a migration, fix forward. The history behind this is that reverse migrations are rarely the right answer for production systems, and the extra work of maintaining reverse migrations does not have enough of a payoff during development to be worth the effort.

One way to get the benefits of reverse migrations for development is to put just enough reverse migrations to get back to the last production schema at the end of the migration list, protected by a gateway:

libschema.SkipRemainingIf(func() (bool, error) {
	return os.Getenv("LIBMIGRATE_REVERSE_TO_PROD") != "true", nil
}),

This set of reverse migrations would always be small since it would just be enough to take you back to the current production release.

Patterns for applying migrations

When using a migration tool like libschema there are several reasonable patterns one can follow to apply migrations to production code.

Down-Up deploys

The simplest pattern is to deploy migrations synchronously when rolling out updates. If you take your service down to do deploys then your migrations do not have to be backwards compatible. This has the huge upside of allowing your schema to evolve easily and avoiding the buildup of technical debt. For example, if you have a column whose name is sub-optimal, you can simply rename it and change the code that uses it at the same time.

To minimize downtime so that the downtime doesn't matter in practice, run expensive migrations asynchronously. Asynchronous migrations are harder to define because they should be broken up into many smallish transactions. The RepeatUntilNoOp() decorator may be useful.
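
The batching pattern behind RepeatUntilNoOp() can be sketched without a database. The repeatUntilNoOp helper below is a hypothetical illustration, not the library's code: it assumes each step is one small transaction that reports how many rows it changed, and it stops at the first step that changes none.

```go
package main

import "fmt"

// repeatUntilNoOp calls step until it reports zero rows affected and
// returns the total number of rows changed across all steps.
func repeatUntilNoOp(step func() (int64, error)) (int64, error) {
	var total int64
	for {
		n, err := step()
		if err != nil {
			return total, err
		}
		if n == 0 {
			return total, nil
		}
		total += n
	}
}

func main() {
	// Simulate a backfill of 25 rows done in batches of at most 10.
	remaining, batch := int64(25), int64(10)
	calls := 0
	step := func() (int64, error) {
		calls++
		n := batch
		if remaining < n {
			n = remaining
		}
		remaining -= n
		return n, nil
	}
	total, err := repeatUntilNoOp(step)
	fmt.Println(total, calls, err)
}
```

With a real database, each step would typically be a single UPDATE limited to a small number of rows, with the count taken from the statement's rows-affected result.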

Green-Blue deploys

When you decide to run without downtime, one consequence is that all migrations must be backwards compatible with the deployed code.

DDL operations that are backwards compatible include:

  • adding a column, table, or view
  • removing a column, table, or view that is no longer accessed
  • adding a default value to a column
  • removing a constraint
  • adding a constraint as long as there are no violations and won't be any new ones

From a coding point of view, the simplest way to manage developing with these restrictions is to put the migration in a separate pull request from any other code changes. Tests must still pass in the pull request that has only the migration. Local and CI testing should apply the migration and validate that the existing code isn't broken by the change in database schema.

Only after the migration has been deployed can code that uses the migration be deployed. When using git, this can be done by having layered side branches:

graph LR;
 mob(migration-only branch)
 code(code branch)
 cleanup(cleanup migration branch)
 main --> mob --> code --> cleanup;

Kubernetes and slow migrations

One issue with using libschema to deploy changes is that servers can take a long time to come up if there are expensive migrations that need to be deployed first. A solution for this is to use OverrideOptions to separate the migrations into a separate step and run them in an init container.

To do this use the MigrateOnly / --migrate-only option on your main program when running it in the init container.

Then use the ErrorIfMigrateNeeded / --error-if-migrate-needed option on your main program when it starts up for normal use.

Code Stability

Libschema is still subject to changes. Anything that is not backwards compatible will be clearly documented and will fail in a way that does not cause hidden problems. For example, switching from using "flag" to using OverrideOptions will trigger an obvious breakage if you try to use a flag that no longer works.

Anticipated changes for the future:

  • API tweaks
  • Support for additional databases
  • Support for additional logging APIs
  • Support for tracing spans (per migration)

Documentation

Constants

const DefaultTrackingTable = "libschema.migration_status"

Functions

func LogFromLog added in v0.0.2

func LogFromLog(logger interface{ Log(...interface{}) }) *internal.Log

LogFromLog creates a logger for libschema from a logger that implements Log() like testing.T does.

func LogFromLogur added in v0.0.2

func LogFromLogur(logur Logur) *internal.Log

LogFromLogur creates a logger for libschema from a logger that implements the recommended interface in Logur. See https://github.com/logur/logur#readme

func LogFromPrintln added in v0.0.2

func LogFromPrintln(printer interface{ Println(...interface{}) }) *internal.Log

LogFromPrintln creates a logger for libschema from a logger that implements Println() like the standard "log" package.

import "log"

LogFromPrintln(log.Default())

func OpenAnyDB

func OpenAnyDB(dsn string) (*sql.DB, error)

Types

type Database

type Database struct {
	DBName string

	Options Options
	// contains filtered or unexported fields
}

Database tracks all of the migrations for a specific database.

func (*Database) DB

func (d *Database) DB() *sql.DB

func (*Database) Lookup

func (d *Database) Lookup(name MigrationName) (Migration, bool)

func (*Database) Migrations

func (d *Database) Migrations(libraryName string, migrations ...Migration)

Migrations specifies the migrations needed for a library. By default, each migration is dependent upon the prior migration and they'll run in the order given. By default, all the migrations for a library will run in the order in which the library migrations are defined.

type Driver

type Driver interface {
	CreateSchemaTableIfNotExists(context.Context, *internal.Log, *Database) error
	LockMigrationsTable(context.Context, *internal.Log, *Database) error
	UnlockMigrationsTable(*internal.Log) error

	// DoOneMigration must update both the migration status in
	// the Database object and it must persist the migration status
	// in the tracking table.  It also does the migration.
	// The returned sql.Result is optional: Computed() migrations do not
	// need to provide results.  The result is used for RepeatUntilNoOp.
	DoOneMigration(context.Context, *internal.Log, *Database, Migration) (sql.Result, error)

	// IsMigrationSupported exists to guard against additional migration
	// options and features.  It should return nil except if there are new
	// migration features added that haven't been included in all support
	// libraries.
	IsMigrationSupported(*Database, *internal.Log, Migration) error

	LoadStatus(context.Context, *internal.Log, *Database) ([]MigrationName, error)
}

Driver interface is what's required to use libschema with a new database.

type Logur added in v0.0.2

type Logur interface {
	Trace(msg string, fields ...map[string]interface{})
	Debug(msg string, fields ...map[string]interface{})
	Info(msg string, fields ...map[string]interface{})
	Warn(msg string, fields ...map[string]interface{})
	Error(msg string, fields ...map[string]interface{})
}

type Migration

type Migration interface {
	Base() *MigrationBase
	Copy() Migration
}

MigrationBase is a workaround for lacking object inheritance.

type MigrationBase

type MigrationBase struct {
	Name MigrationName
	// contains filtered or unexported fields
}

Migration defines a single database definition update.

func (MigrationBase) Copy

func (m MigrationBase) Copy() MigrationBase

func (*MigrationBase) HasSkipIf

func (m *MigrationBase) HasSkipIf() bool

func (*MigrationBase) SetStatus

func (m *MigrationBase) SetStatus(status MigrationStatus)

func (*MigrationBase) Status

func (m *MigrationBase) Status() MigrationStatus

type MigrationName

type MigrationName struct {
	Name    string
	Library string
}

MigrationName holds both the name of the specific migration and the library to which it belongs.

func (MigrationName) String

func (n MigrationName) String() string

type MigrationOption

type MigrationOption func(Migration)

MigrationOption modifies a migration to set additional parameters

func After

func After(lib, migration string) MigrationOption

After sets up a dependency between one migration and another. This can be across library boundaries. By default, migrations are dependent on the prior migration defined. After specifies that the current migration must run after the named migration.

func Asynchronous added in v0.0.2

func Asynchronous() MigrationOption

Asynchronous marks a migration as okay to run asynchronously. If all of the remaining migrations can be asynchronous, then schema.Migrate() will return while the remaining migrations run.

func RepeatUntilNoOp added in v0.0.2

func RepeatUntilNoOp() MigrationOption

RepeatUntilNoOp marks a migration as potentially needing to run multiple times. It will run over and over until the database reports that the migration modified no rows. This can usefully be combined with Asynchronous.

This marking only applies to Script() and Generated() migrations. The migration must be a single statement.

For Computed() migrations, do not use RepeatUntilNoOp. Instead, simply write the migration to use Driver.DB() to get a database handle and use it to do many little transactions, each one modifying a few rows, until there is no more work to do.

func SkipIf

func SkipIf(pred func() (bool, error)) MigrationOption

SkipIf is checked before the migration is run. If the function returns true then this migration is skipped. For MySQL, this allows migrations that are not idempotent to be checked before they're run and skipped if they have already been applied.

func SkipRemainingIf

func SkipRemainingIf(pred func() (bool, error)) MigrationOption

SkipRemainingIf is checked before the migration is run. If the function returns true then this migration and all following it are not run at this time. One use for this is to hold back migrations that have not been released yet. For example, in a blue-green deploy organization, you could first do a migration that creates another column, then later do a migration that removes the old column. The migration to remove the old column can be defined and tested in advance but held back by SkipRemainingIf until it's time to deploy it.

type MigrationStatus

type MigrationStatus struct {
	Done  bool
	Error string // If an attempt was made but failed, this will be set
}

MigrationStatus tracks if a migration is complete or not.

type Options

type Options struct {
	// Overrides change the behavior of libschema in big ways: causing it to
	// call os.Exit() when finished or not migrating.  If overrides is not
	// specified then DefaultOverrides is used.
	Overrides *OverrideOptions

	// TrackingTable is the name of the table used to track which migrations
	// have been applied
	TrackingTable string

	// SchemaOverride is used to override the default schema.  This is most useful
	// for testing schema migrations
	SchemaOverride string

	// These TxOptions will be used for all migration transactions.
	MigrationTxOptions *sql.TxOptions

	ErrorOnUnknownMigrations bool

	// OnMigrationFailure is only called when there is a failure
	// of a specific migration.  OnMigrationsComplete will also
	// be called.  OnMigrationFailure is called for each Database
	// (if there is a failure).
	OnMigrationFailure func(dbase *Database, n MigrationName, err error)

	// OnMigrationsStarted is only called if migrations are needed
	// OnMigrationsStarted is called for each Database (if needed).
	OnMigrationsStarted func(dbase *Database)

	// OnMigrationsComplete called even if no migrations are needed.  It
	// will be called when async migrations finish even if they finish
	// with an error.  OnMigrationsComplete is called for each Database.
	OnMigrationsComplete func(dbase *Database, err error)

	// DebugLogging turns on extra debug logging
	DebugLogging bool
}

Options operate at the Database level but are specified at the Schema level at least initially. If you want separate options on a per-Database basis, you must override the values after attaching the database to the Schema.

type OverrideOptions added in v0.0.2

type OverrideOptions struct {
	// MigrateOnly causes program to exit when migrations are complete.
	// Asynchronous migrations will be skipped.  If no migrations are needed,
	// the program will exit very quickly.  This is accomplished with a call
	// to os.Exit().
	MigrateOnly bool `flag:"migrate-only" help:"Call os.Exit() after completing migrations"`

	// MigrateDatabase specifies that only a specific database should
	// be migrated.  The name must match to a name provided with the schema.NewDatabase() call.
	// For both libschema/lsmysql and libschema/lspostgres, that is the name parameter to
	// New() NOT the database name in the DSN.  This is a logical name.
	MigrateDatabase string `flag:"migrate-database" help:"Migrate only the this database"`

	// MigrateDSN overrides the data source name for a single database.  It must be used in
	// conjunction with MigrateDatabase unless there are only migrations for a single database.
	MigrateDSN string `flag:"migrate-dsn" help:"Override *sql.DB, must combine with --migrate-database"`

	// NoMigrate command line flag / config variable skips all migrations
	NoMigrate bool `flag:"no-migrate" help:"Skip all migrations (except async)"`

	// ErrorIfMigrateNeeded command line flag / config variable causes Migrate() to return error if there are
	// migrations required.  Asynchronous migrations do not count as required and will
	// run in the background.
	// In combination with EverythingSynchronous = true, if there are asynchronous migrations pending then
	// Migrate() will return error immediately.
	ErrorIfMigrateNeeded bool `flag:"error-if-migrate-needed" help:"Return error if migrations are not current"`

	// EverythingSynchronous command line flag / config variable causes asynchronous migrations to be
	// treated like regular migrations from the point of view of --migrate-only, --no-migrate,
	// and --error-if-migrate-needed.
	EverythingSynchronous bool `flag:"migrate-all-synchronously" help:"Run async migrations synchronously"`
}

OverrideOptions define the command line flags and/or configuration block that can be provided to control libschema's behavior. OverrideOptions is designed to be filled with https://github.com/muir/nfigure but it doesn't have to be done that way.

var DefaultOverrides OverrideOptions

DefaultOverrides provides default values for Options.Overrides. DefaultOverrides is only used when Options.Overrides is nil.

DefaultOverrides can be filled by nfigure:

import "github.com/muir/nfigure"
registry := nfigure.NewRegistry()
request, err := registry.Request(&libschema.DefaultOverrides)
err = registry.Configure()

DefaultOverrides can be filled using the "flag" package:

import "flag"
import "github.com/muir/nfigure"
nfigure.MustExportToFlagSet(flag.CommandLine, "flag", &libschema.DefaultOverrides)

type Schema

type Schema struct {
	// contains filtered or unexported fields
}

Schema tracks all the migrations

func New

func New(ctx context.Context, options Options) *Schema

New creates a schema object.

func (*Schema) Migrate

func (s *Schema) Migrate(ctx context.Context) (err error)

Migrate runs pending migrations that have been registered as long as the command line flags support doing migrations. If all remaining migrations are asynchronous then the remaining migrations will be run in the background.

A lock is held while migrations are in progress so that there is no chance of double migrations.

func (*Schema) NewDatabase

func (s *Schema) NewDatabase(log *internal.Log, dbName string, db *sql.DB, driver Driver) (*Database, error)

NewDatabase creates a Database object. For Postgres and MySQL this is bundled into lspostgres.New() and lsmysql.New().

Directories

Path Synopsis
Package lsmysql has a libschema.Driver support MySQL
Package lspostgres has a libschema.Driver support PostgreSQL
Package lssinglestore is a libschema.Driver for connecting to SingleStore databases.