postgres

package
v0.0.0-...-6f34b33 Latest
Published: Feb 28, 2023 License: MIT Imports: 22 Imported by: 0

README

PostgreSQL Coordinate Backend

This Coordinate backend uses no local state, but instead stores all of the information required for the Coordinate system in a PostgreSQL database.

Usage

Connect to an existing PostgreSQL server:

coordinated -backend postgres://user:pass@postgres.example.com/database

You can also use the key/value connection-string format, taking care to quote it so that the shell passes it as a single argument:

coordinated -backend 'postgres:host=postgres.example.com user=user ...'

Or, you can set the connection information in environment variables, and use an empty connection string:

export PGHOST=postgres.example.com
export PGDATABASE=database
export PGUSER=user
export PGPASSWORD=password
coordinated -backend postgres:

As of this writing, all required tables will be created (and updated to the current version) on first startup. There is not yet a provision to remove existing tables.

Under load, the database may log messages such as

ERROR:  could not serialize access due to concurrent update

along with the corresponding SQL statements. The transaction-handling code in this module traps this error and retries the affected transaction, so these messages can be safely ignored.

Migrations

Database migrations are implemented with sql-migrate. To add a new migration file, add it to the migrations subdirectory, and run

go get -u github.com/jteeuwen/go-bindata/...
go generate github.com/diffeo/go-coordinate/postgres
git add src/github.com/diffeo/go-coordinate/postgres/migrations.go
go build github.com/diffeo/go-coordinate/cmd/coordinated

This sequence regenerates the migrations.go file, which should be checked in with your other changes.

In the current implementation, the migrations will run automatically on first startup (or any other call to postgres.New()). In principle you can also manually run the sql-migrate tool, pointing it at the migrations directory.

Testing

If you use Docker for a temporary database:

docker run -d --name postgres -p 5432:5432 postgres

Then you can run the tests as:

export PGHOST=127.0.0.1  # or $(docker-machine ip default)
export PGUSER=postgres
export PGDATABASE=postgres
export PGSSLMODE=disable
go test github.com/diffeo/go-coordinate/postgres

Implementation notes

Most objects are lightweight records that carry an object's in-database identifier, name, and parent, and little else. This means that calling e.g. namespace.WorkSpec("foo") will make a database round-trip to ensure the work spec exists but will not attempt to retrieve its definition or work units without explicit calls to fetch them.

Source files are arranged around database tables, not Coordinate interface objects. work_unit.go contains all functions that directly affect the work_unit table, including functions like coordinate.Namespace.SetWorkUnit().

Semi-structured data (work spec definitions, work unit data, and per-attempt updated work unit data) is stored as CBOR in BYTEA columns. The extracted work spec metadata is stored in its own columns in the work_spec table, since it is allowed to change independently of the actual work spec definition. A separate table of (work spec ID, data key, data value) rows was considered, since it would be easier to query for specific data fields, but there is no current use case for it.

Earlier versions of this code used gob instead, but that cannot (by design) distinguish nil from an empty slice; and when this is round-tripped to Python code, this causes empty lists to become the constant None instead. Using CBOR preserves nilness, but comes at the cost of losing most of the non-kind Go type information (most maps become map[interface{}]interface{}). In Go land, using a library like mapstructure is almost necessary for dealing with JSONish objects, and hides this typing issue.
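The gob behavior described above can be reproduced with the standard library alone. This is a minimal sketch, not code from this package: the payload type and roundTrip helper are hypothetical names chosen for illustration.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// payload is a hypothetical record with a slice field; it is not a type
// from this package.
type payload struct {
	Name  string
	Items []string
}

// roundTrip encodes a payload with gob and decodes it into a fresh value.
func roundTrip(in payload) (payload, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(in); err != nil {
		return payload{}, err
	}
	var out payload
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	// Items starts out as an empty, non-nil slice.
	in := payload{Name: "spec", Items: []string{}}
	out, err := roundTrip(in)
	if err != nil {
		panic(err)
	}
	// gob omits zero-length slice fields, so the decoder never sets Items.
	fmt.Println("before:", in.Items == nil, "after:", out.Items == nil)
}
```

A consumer that maps nil to None and an empty slice to an empty list will therefore see different values before and after the round trip.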

We rely on the database to manage concurrency for us. This means cooperating with the database to tell it what we want, and it means being able to tolerate (and retry) transaction failures in some cases. Both of these things result in fairly database-specific code: the exact syntax of SELECT ... FOR UPDATE calls is highly variable across databases, and the actual error if the database engine traps a concurrency error is database-specific. Porting to other databases involves understanding their concurrency semantics as well as just updating syntax.

I chose sql-migrate as a database migration tool. It has the advantages of being able to run in-process (and not strictly require an external tool or configuration file) and being able to use go-bindata as a migration source. This also means that, if coordinated wants an ORMish system in the future, gorp has indirectly already been chosen.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Asset

func Asset(name string) ([]byte, error)

Asset loads and returns the asset for the given name. It returns an error if the asset could not be found or could not be loaded.

func AssetDir

func AssetDir(name string) ([]string, error)

AssetDir returns the file names below a certain directory embedded in the file by go-bindata. For example if you run go-bindata on data/... and data contains the following hierarchy:

data/
  foo.txt
  img/
    a.png
    b.png

then AssetDir("data") would return []string{"foo.txt", "img"},
AssetDir("data/img") would return []string{"a.png", "b.png"},
AssetDir("foo.txt") and AssetDir("notexist") would return an error, and
AssetDir("") would return []string{"data"}.

func AssetInfo

func AssetInfo(name string) (os.FileInfo, error)

AssetInfo loads and returns the asset info for the given name. It returns an error if the asset could not be found or could not be loaded.

func AssetNames

func AssetNames() []string

AssetNames returns the names of the assets.

func Drop

func Drop(db *sql.DB) error

Drop clears a database by running all of the migrations in reverse, ultimately resulting in dropping all of the tables.

func MustAsset

func MustAsset(name string) []byte

MustAsset is like Asset but panics when Asset would return an error. It simplifies safe initialization of global variables.

func New

func New(connectionString string) (coordinate.Coordinate, error)

New creates a new coordinate.Coordinate connection object using the provided PostgreSQL connection string. The connection string may be an expanded PostgreSQL string, a "postgres:" URL, or a URL without a scheme. These are all equivalent:

"host=localhost user=postgres password=postgres dbname=postgres"
"postgres://postgres:postgres@localhost/postgres"
"//postgres:postgres@localhost/postgres"

See http://godoc.org/github.com/lib/pq for more details. If parameters are missing from this string (or if you pass an empty string) they can be filled in from environment variables as well; see http://www.postgresql.org/docs/current/static/libpq-envars.html.
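To make the equivalence between the URL and keyword forms concrete, here is a minimal sketch that rewrites a URL into keyword/value form using only net/url. It is not how lib/pq or this package parses connection strings: urlToKeywords is a hypothetical helper, and it ignores query parameters and does not quote values that contain spaces.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// urlToKeywords is a hypothetical helper that converts a "postgres:" URL,
// or a URL without a scheme, into "key=value" form. Query parameters are
// ignored and values are not quoted, so this is illustrative only.
func urlToKeywords(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	if u.Scheme != "" && u.Scheme != "postgres" && u.Scheme != "postgresql" {
		return "", fmt.Errorf("unexpected scheme %q", u.Scheme)
	}
	var parts []string
	if host := u.Hostname(); host != "" {
		parts = append(parts, "host="+host)
	}
	if port := u.Port(); port != "" {
		parts = append(parts, "port="+port)
	}
	if u.User != nil {
		if name := u.User.Username(); name != "" {
			parts = append(parts, "user="+name)
		}
		if password, ok := u.User.Password(); ok {
			parts = append(parts, "password="+password)
		}
	}
	if db := strings.TrimPrefix(u.Path, "/"); db != "" {
		parts = append(parts, "dbname="+db)
	}
	return strings.Join(parts, " "), nil
}

func main() {
	for _, raw := range []string{
		"postgres://postgres:postgres@localhost/postgres",
		"//postgres:postgres@localhost/postgres",
	} {
		keywords, err := urlToKeywords(raw)
		fmt.Println(keywords, err)
	}
}
```

Any field absent from the URL is simply left out of the result, which is where the environment-variable defaults described above would apply.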

The returned Coordinate object carries around a connection pool with it. It can (and should) be shared across the application. This New() function should be called sparingly, ideally exactly once.

func NewWithClock

func NewWithClock(connectionString string, clk clock.Clock) (coordinate.Coordinate, error)

NewWithClock creates a new coordinate.Coordinate connection object, using an explicit time source. See New() for further details. Most application code should call New(), and use the default (real) time source; this entry point is intended for tests that need to inject a mock time source.

func RestoreAsset

func RestoreAsset(dir, name string) error

RestoreAsset restores an asset under the given directory.

func RestoreAssets

func RestoreAssets(dir, name string) error

RestoreAssets restores an asset under the given directory recursively.

func Upgrade

func Upgrade(db *sql.DB) error

Upgrade upgrades a database to the latest database schema version.

Types

This section is empty.
