arangomigo

package module
v0.0.0-...-5e50ad4 Latest
Published: Apr 17, 2024 License: MIT Imports: 17 Imported by: 1

README

ArangoMiGO

A schema evolution tool for ArangoDB. Manage your collections, indices, and data transformations in a centralized way in your source control along with your application.

The goal behind the project is to apply years of hard-fought lessons to ArangoDB (especially those that kicked us in the teeth). We needed a schema version manager that could create a database and add all of the collections, indexes, and data population necessary for a developer to create a local VM of ArangoDB that looks like a mini-production. The system should automatically adjust to merges. While providing all of this, it must also support the neat features we all know and chose ArangoDB for, such as sharding collections on distributed systems. This means that we can't rely on creating the collection automatically if it doesn't exist when inserting a document. Sometimes collections should also come preloaded with some documents from the start.

Supports Arango 3.1+.

Getting an executable

If you're familiar with Go, you can clone and build this project directly on your target machine. If you'd prefer an official build, look in the builds folder.

Pass the path to the configuration file, which is defined below, to your executable.

Creating your structures

ArangoMiGO supports creating, modifying, and deleting graphs, collections, indexes, views, and even the database. Below you'll see how to use YAML to create a migration set. Once a migration component executes, the system doesn't rerun it. You don't have to worry about creating a collection or running data migration twice.

Creating the configuration file
endpoints:
   - http://arangodb-local:8529
username: root
password: devroot
migrationspath: /home/jdavenpo/go/src/github.com/deusdat/arangomigo/testdata/complete
db: MigoFull
extras:
  {patricksUser: jdavenpo,
   patricksPassword: Extrem!Password&^%$#,
   shouldBeANumber: 10,
   secret: Lots of mayo}

ArangoMiGO supports failover out of the box: list more than one server under endpoints. If you are creating a database as part of the migration set, make sure that the username has the proper rights.

migrationspath is the directory holding the migration configurations. You may specify a single value like migrationspath: some/where/to/migrations or you can specify a list.
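A minimal sketch of how a field can accept either a single value or a list, using the same callback signature that the package documentation lists for StringArray.UnmarshalYAML. The type pathList and the helper fakeNode are hypothetical names for illustration; fakeNode stands in for a real YAML decoder so the sketch runs with the standard library only. The real StringArray may behave differently.

```go
package main

import (
	"errors"
	"fmt"
)

// pathList is a hypothetical type that accepts one string or a list of strings.
type pathList []string

// UnmarshalYAML first tries to decode a single string, then falls back to a list.
// The unmarshal callback decodes the current node into the value it is given.
func (p *pathList) UnmarshalYAML(unmarshal func(interface{}) error) error {
	var single string
	if err := unmarshal(&single); err == nil {
		*p = pathList{single}
		return nil
	}
	var many []string
	if err := unmarshal(&many); err != nil {
		return err
	}
	*p = many
	return nil
}

// fakeNode stands in for a YAML decoder (an assumption for this sketch).
// raw must be either a string or a []string.
func fakeNode(raw interface{}) func(interface{}) error {
	return func(out interface{}) error {
		switch target := out.(type) {
		case *string:
			if s, ok := raw.(string); ok {
				*target = s
				return nil
			}
		case *[]string:
			if l, ok := raw.([]string); ok {
				*target = l
				return nil
			}
		}
		return errors.New("type mismatch")
	}
}

func main() {
	var one, many pathList
	_ = one.UnmarshalYAML(fakeNode("some/where/to/migrations"))
	_ = many.UnmarshalYAML(fakeNode([]string{"a/migrations", "b/migrations"}))
	fmt.Println(len(one), len(many)) // 1 2
}
```

Trying the scalar form first keeps the common single-directory configuration short while still allowing several directories.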

db is the name of the target database. If you create the database as part of the migration, the name in the config and in the migration must match.

extras allows you to specify arbitrary values through a lookup mechanism. As you'll see later, you can use ${} to mark fields, such as those found in the BindVars of the AQL migration, as replaceable. This allows you to keep sensitive data out of source control.
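The lookup can be pictured with a short sketch. The function substitute and the regular expression below are hypothetical and are not part of the arangomigo API; in particular, treating a missing key as an error is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholder matches ${name}. The exact syntax ArangoMiGO accepts is an assumption.
var placeholder = regexp.MustCompile(`\$\{([^}]+)\}`)

// substitute is a hypothetical helper. It replaces each ${key} with the
// matching value from extras and returns an error if any key is missing.
func substitute(s string, extras map[string]interface{}) (string, error) {
	var missing []string
	out := placeholder.ReplaceAllStringFunc(s, func(m string) string {
		key := placeholder.FindStringSubmatch(m)[1]
		v, ok := extras[key]
		if !ok {
			missing = append(missing, key)
			return m
		}
		return fmt.Sprint(v)
	})
	if len(missing) > 0 {
		return "", fmt.Errorf("no extras value for %v", missing)
	}
	return out, nil
}

func main() {
	extras := map[string]interface{}{"patricksUser": "jdavenpo", "shouldBeANumber": 10}
	out, err := substitute("user=${patricksUser} n=${shouldBeANumber}", extras)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // user=jdavenpo n=10
}
```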

Did we mention that you shouldn't store the config in source control? No? Don't store the config in source control.

A quick note on versioning

Each step in the migration set is another version. If you're familiar with Liquibase, you give each change a specific id. Flyway uses the file name format, as does ArangoMiGO.

File names have this pattern: VersionNumber<_Any_description>.migration. A version number must be one or more numbers separated by dots. Here are a few examples.

  • 1.migration
  • 1.4_Adds_ROI_CALC_FUNCTION.migration
  • 12.6.7.2.migration
  • 19.02.02.migration
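To make the naming pattern concrete, here is a rough sketch that splits a file name into its numeric version segments. parseVersion is a hypothetical helper, not the parser ArangoMiGO uses; it assumes the description starts at the first underscore.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion is a hypothetical helper. It extracts the dot-separated
// version numbers from a name such as "1.4_Adds_ROI_CALC_FUNCTION.migration".
func parseVersion(fileName string) ([]int, error) {
	if !strings.HasSuffix(fileName, ".migration") {
		return nil, fmt.Errorf("%q does not end in .migration", fileName)
	}
	base := strings.TrimSuffix(fileName, ".migration")
	// Drop the optional description after the first underscore.
	if i := strings.Index(base, "_"); i >= 0 {
		base = base[:i]
	}
	var version []int
	for _, part := range strings.Split(base, ".") {
		n, err := strconv.Atoi(part)
		if err != nil {
			return nil, fmt.Errorf("bad version segment %q in %q", part, fileName)
		}
		version = append(version, n)
	}
	return version, nil
}

func main() {
	names := []string{
		"1.migration",
		"1.4_Adds_ROI_CALC_FUNCTION.migration",
		"12.6.7.2.migration",
		"19.02.02.migration",
	}
	for _, name := range names {
		v, err := parseVersion(name)
		fmt.Println(name, v, err)
	}
}
```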

Let's say you start with the following migration set.

  • 1.migration
  • 2.migration
  • 3.migration
  • 4.migration

Then you need to add an index for a collection created in 3.migration. You can either create 5.migration or 3.1.migration. ArangoMiGO will see that it has applied 3, but not 3.1, and apply it. Either way works. The latter is more logically consistent for a new deploy.
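The 3.1 case can be sketched as a set difference followed by a segment-wise sort. The helpers less and pending are hypothetical; how ArangoMiGO tracks applied versions internally is not described here, so treat the ordering rule as an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// less reports whether dotted version a sorts before b. Segments are compared
// as integers, and a shorter version sorts before its extensions, so
// "3" < "3.1" < "4". Non-numeric segments count as 0 in this sketch.
func less(a, b string) bool {
	as, bs := strings.Split(a, "."), strings.Split(b, ".")
	for i := 0; i < len(as) && i < len(bs); i++ {
		x, _ := strconv.Atoi(as[i])
		y, _ := strconv.Atoi(bs[i])
		if x != y {
			return x < y
		}
	}
	return len(as) < len(bs)
}

// pending is a hypothetical helper. It returns the versions in all that are
// not in applied, sorted in the order they would run.
func pending(all, applied []string) []string {
	done := make(map[string]bool, len(applied))
	for _, v := range applied {
		done[v] = true
	}
	var todo []string
	for _, v := range all {
		if !done[v] {
			todo = append(todo, v)
		}
	}
	sort.Slice(todo, func(i, j int) bool { return less(todo[i], todo[j]) })
	return todo
}

func main() {
	all := []string{"1", "2", "3", "4", "3.1"}
	applied := []string{"1", "2", "3", "4"}
	fmt.Println(pending(all, applied)) // [3.1]
}
```

On a fresh deploy nothing is applied yet, so the same sort places 3.1 directly after 3, which is why that naming reads as more consistent.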

ArangoMiGO halts at the first failure. Other systems soldier through errors and report them at the end. In our experience this is a bad idea when it comes to our data. We baked that philosophy in.
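A small sketch of the halt-at-first-failure approach, with hypothetical names (step, runAll, failWith) that do not come from the package:

```go
package main

import (
	"errors"
	"fmt"
)

// step is a hypothetical stand-in for one migration.
type step struct {
	name string
	run  func() error
}

// failWith returns a step body that always fails with msg.
func failWith(msg string) func() error {
	return func() error { return errors.New(msg) }
}

// runAll stops at the first failing step. It returns how many steps
// succeeded together with the wrapped error, if any.
func runAll(steps []step) (int, error) {
	for i, s := range steps {
		if err := s.run(); err != nil {
			return i, fmt.Errorf("migration %s failed: %w", s.name, err)
		}
	}
	return len(steps), nil
}

func main() {
	ok := func() error { return nil }
	n, err := runAll([]step{
		{"1", ok},
		{"2", failWith("collection already exists")},
		{"3", ok}, // never runs
	})
	fmt.Println(n, err) // 1 migration 2 failed: collection already exists
}
```

Stopping early means later steps never run against a database that is in an unexpected state.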

Creating your database
type: database
action: create
name: MigoFull
allowed:
  - username: ${patricksUser}
    password: ${patricksPassword}

One thing to notice is that the user name and password leverage the replacement feature. You can safely commit this migration without fear of tipping your security hand in the future.

You can also include a list of users not allowed in the database with the disallowed field.

Dropping your database
type: database
action: delete
name: MigoFull

Deletes the database named MigoFull.

Creating a collection
type: collection
action: create
name: recipes
journalsize: 10485760
waitforsync: true

This example creates a collection in the database named in the config file and sets the journalsize and wait for sync properties. You can also add the following features.

  • shardkeys - list of fields to use for the shard key.
  • numberofshards - integer
  • allowuserkeys - boolean
  • volatile - boolean
  • compactable - boolean
  • waitforsync - boolean
  • journalsize - int

If you don't include a specific property, Arango applies its own default.

Modifying a collection
type: collection
action: modify
name: recipes
journalsize: 10485760
waitforsync: true

You can exclude either journalsize or waitforsync.

Deleting a collection
type: collection
action: delete
name: recipes

Executing AQL
type: aql
query: 'INSERT {Name: "Taco Fishy", WithEscaped: @escaped, MeatType: @meat, _key: "hello"} IN recipes'
bindvars:
    escaped: ${secret}
    meat: Fish

The example shows how to execute arbitrary AQL. You should use single quotes to encapsulate the statement. If you need to insert AQL variables, add them to the bindvars. You can use value substitution with extras from the configuration. One useful scenario is creating user accounts for your application.

Creating a graph
type: graph
action: create
name: testing_graph
edgedefinitions:
   - collection: relationships
     from: 
         - recipes
     to: 
         - recipes

Creates a graph named testing_graph with one edge definition, relationships, that connects the recipes vertex collection to itself. You can also set the following attributes.

  • smart - bool if you are using the Enterprise edition.
  • smartgraphattribute - string the attribute used to shuffle vertexes.
  • shards - int the number of shards each collection has.
  • orphanvertices - []string a list of collections within the graph, but not part of an edge.
  • edgedefinitions - []EdgeDefinition creates a single edge between vertexes, where EdgeDefinition looks like the one in the example above.

Modify a graph

This example modifies the graph testing_graph by adding a new edge owns and changing the existing edge relationships to include users as a target. Finally, it adds the vertex collection another to the graph's orphan vertices.

type: graph
action: modify
name: testing_graph
edgedefinitions:
   - collection: owns
     from: 
         - users
     to: 
         - recipes
   - collection: relationships
     from: 
         - recipes
     to: 
         - recipes
         - users
orphanvertices:
   - another

It is possible for a graph to end up partially configured. If you specify a series of changes, such as removing orphan vertices and adding new edges, the vertices may be deleted while the edges are not added. Please watch the output for warnings.

You must specify the type as graph, the action as modify, and the name of the graph. You can use these attributes to make changes.

  • removevertices - []string names of the vertices to remove. If you attempt to remove a vertex included in an edge, the migration will fail.
  • removeedges - []string the names of the edges you want to remove.
  • orphanvertices - []string allows you to add vertices to the graph without including them in the edges. It will create a new vertex if the collection does not already exist.
  • edgedefinitions - []EdgeDefinition names an edge and vertices that comprise the To and From. If the edge definition already exists, it gets updated to reflect the To, From relationship.

Delete a graph
type: graph
action: delete
name: testing_graph

Indexes

At present you can only create indexes. ArangoDB doesn't expose an API to properly identify indexes. As a result, the migrator ignores the action. This may change in the future. Please specify create.

Full Text Index

type: fulltextindex
action: create
fields:
    - name
collection: recipes
minlength: 4
inbackground: false

You can add a full text index by specifying the minlength and fields.

Geo Index

type: geoindex
action: create
collection: recipes
fields:
    - pts
geojson: true
inbackground: false

geojson indicates that the field or fields hold an array with longitude first, in the form [long, lat]. To excerpt the Arango documentation:


To create a geo index on an array attribute that contains longitude first, set the geoJson attribute to true. This corresponds to the format described in RFC 7946 Position

collection.ensureIndex({ type: "geo", fields: [ "location" ], geoJson: true })

To create a geo-spatial index on all documents using latitude and longitude as separate attribute paths, two paths need to be specified in the fields array:

collection.ensureIndex({ type: "geo", fields: [ "latitude", "longitude" ] })


Hash Index

type: hashindex
action: create
collection: recipes
fields:
    - one
    - two
sparse: true
nodeduplicate: false
inbackground: false

Persistent Index

type: persistentindex
action: create
fields:
    - tags
collection: recipes
unique: true
sparse: true
inbackground: false

TTL Index

type: ttlindex
action: create
field: createdAt
collection: recipes
expireafter: 3600
inbackground: false

Skiplist Index

type: skiplistindex
action: create
fields:
    - a
    - b
collection: recipes
unique: true
sparse: true
nodeduplicate: true
inbackground: false

Views

Views were added in Arango 3.4. They provide a way to search on collections. See Arango Search View for more information. Note: when creating or modifying a view, any analyzers you reference must already be defined in ArangoDB (Analyzers).

Create View

type: view
action: create
name: SearchRecipes
links:
    - name: recipe
      analyzers:
        - identity
      fields:
        - name: name
      includeAllFields: false
      storeValues: none
      trackListPositions: false
primarySort:
    - field: name
      ascending: false

This example creates a view in the database named in the config file and sets a linked collection (recipes) with a primary sort on the name field. You can also add the following features:

  • cleanupIntervalStep - integer
  • commitIntervalMsec - integer
  • consolidationIntervalMsec - integer

If you don't include a specific property, Arango applies its own default.

Modify View

type: view
action: modify
name: SearchRecipes
links:
    - name: recipe
      analyzers:
        - identity
      fields:
        - name: name
      includeAllFields: false
      storeValues: none
      trackListPositions: false

Delete View

type: view
action: delete
name: SearchRecipes

Build into your Go software

Instead of using the binary and YAML files, you can also embed the migrations directly in your Go code. See perform_test.go for an example.

Run tests

To run tests inside Docker run:

./test.sh

Documentation

Overview

Package arangomigo allows the tool to execute from the command line.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func PerformMigrations

func PerformMigrations(ctx context.Context, c Config, ms []Migration) error

func TriggerMigration

func TriggerMigration(configAt string)

Types

type AQL

type AQL struct {
	Operation `yaml:",inline"`
	Query     string
	BindVars  map[string]interface{}
}

AQL allows arbitrary AQL execution as part of the migration.

func (AQL) Migrate

func (a AQL) Migrate(ctx context.Context, db driver.Database, extras map[string]interface{}) error

type Action

type Action string

Action enumerated values for valid operation actions.

const (
	CREATE Action = "create"
	DELETE Action = "delete"
	MODIFY Action = "modify"
	RUN    Action = "run"
)

Enumerated values for the Action

type Collection

type Collection struct {
	Operation `yaml:",inline"`

	ShardKeys      *[]string
	JournalSize    *int
	NumberOfShards *int
	WaitForSync    *bool
	AllowUserKeys  *bool
	Volatile       *bool
	Compactable    *bool
	CollectionType string
}

Collection the YAML struct for configuring a collection migration.

func (Collection) Migrate

func (cl Collection) Migrate(ctx context.Context, db driver.Database, _ map[string]interface{}) error

type Config

type Config struct {
	Endpoints      []string
	Username       string
	Password       string
	MigrationsPath StringArray
	Db             string
	SkipSslVerify  bool `yaml:"skip_ssl_verify"`
	// Extras allows the user to pass in replaced variables
	Extras map[string]interface{}
}

Config The content of a migration configuration.

type ConsolidationPolicy

type ConsolidationPolicy struct {
	// Type returns the type of the ConsolidationPolicy.
	Type string
	// Options contains the fields used by the ConsolidationPolicy and are related to the Type.
	Options map[string]interface{}
}

ConsolidationPolicy holds threshold values specifying when to consolidate view data. see ArangoSearchConsolidationPolicy

ArangoSearchConsolidationPolicyTier
ArangoSearchConsolidationPolicyBytesAccum

type Database

type Database struct {
	Operation `yaml:",inline"`

	Allowed    []User
	Disallowed []string
	// contains filtered or unexported fields
}

Database the YAML struct for configuring a database migration.

func (*Database) Migrate

func (d *Database) Migrate(ctx context.Context, db driver.Database, extras map[string]interface{}) error

type EdgeDefinition

type EdgeDefinition struct {
	// The name of the edge collection to be used.
	Collection string `json:"collection"`
	// To contains the names of one or more vertex collections that can contain target vertices.
	To []string `json:"to"`
	// From contains the names of one or more vertex collections that can contain source vertices.
	From []string `json:"from"`
}

EdgeDefinition contains all information needed to define a single edge in a graph.

type FullTextIndex

type FullTextIndex struct {
	Operation    `yaml:",inline"`
	Fields       []string
	Collection   string
	MinLength    int
	InBackground bool
}

FullTextIndex defines how to build a full text index on a field

func (FullTextIndex) Migrate

func (i FullTextIndex) Migrate(ctx context.Context, db driver.Database, _ map[string]interface{}) error

type GeoIndex

type GeoIndex struct {
	Operation    `yaml:",inline"`
	Fields       []string
	Collection   string
	GeoJSON      bool
	InBackground bool
}

GeoIndex creates a GeoIndex within the specified collection.

func (GeoIndex) Migrate

func (i GeoIndex) Migrate(ctx context.Context, db driver.Database, _ map[string]interface{}) error

type Graph

type Graph struct {
	Operation `yaml:",inline"`
	// Smart indicates that the graph uses the Enterprise
	// edition's graph management.
	Smart *bool
	// SmartGraphAttribute is the attribute used to shuffle vertexes.
	SmartGraphAttribute string
	// Shards is the number of shards each collection has.
	Shards int
	// OrphanVertex
	OrphanVertices []string
	// EdgeDefinition creates a single edge between vertexes
	EdgeDefinitions []EdgeDefinition
	// Names of Edge Collections to remove
	RemoveEdges []string
	// Names of vertices to remove
	RemoveVertices []string
}

Graph allows a user to manage graphs

func (Graph) Migrate

func (g Graph) Migrate(ctx context.Context, db driver.Database, _ map[string]interface{}) error

type HashIndex

type HashIndex struct {
	Operation     `yaml:",inline"`
	Fields        []string
	Collection    string
	Unique        bool
	Sparse        bool
	NoDeduplicate bool
	InBackground  bool
}

HashIndex creates a hash index on the fields within the specified Collection.

func (HashIndex) Migrate

func (i HashIndex) Migrate(ctx context.Context, db driver.Database, _ map[string]interface{}) error

type Migration

type Migration interface {
	Migrate(ctx context.Context, driver driver.Database, extras map[string]interface{}) error
	FileName() string
	SetFileName(name string)
	CheckSum() string
	SetCheckSum(sum string)
}

Migration all the operations necessary to modify a database, even make one.

type Operation

type Operation struct {
	Type   string
	Name   string
	Action Action
	// contains filtered or unexported fields
}

Operation the common elements for all migrations.

func (*Operation) CheckSum

func (op *Operation) CheckSum() string

CheckSum gets the checksum for the migration's file

func (*Operation) FileName

func (op *Operation) FileName() string

FileName gets the filename of the migrations configuration.

func (*Operation) SetCheckSum

func (op *Operation) SetCheckSum(sum string)

SetCheckSum sets the checksum of the file, in hex.
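A hex checksum of a migration file's content could be produced along these lines. This is only a sketch: hexChecksum is a hypothetical helper, and the choice of SHA-256 is an assumption, since the documentation only states that the checksum is stored in hex.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hexChecksum is a hypothetical helper. SHA-256 is an assumption; the
// package may use a different hash.
func hexChecksum(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

func main() {
	a := hexChecksum([]byte("type: collection\naction: create\nname: recipes\n"))
	b := hexChecksum([]byte("type: collection\naction: create\nname: recipe\n"))
	// A one-character change in the file yields a different checksum.
	fmt.Println(len(a), a != b) // 64 true
}
```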

func (*Operation) SetFileName

func (op *Operation) SetFileName(fileName string)

SetFileName updates the filename of the migration

type PairedMigrations

type PairedMigrations struct {
	// contains filtered or unexported fields
}

PairedMigrations defines the primary change and an undo operation, if provided. Presently, undo is not a supported feature. After reading Flyway's history of the feature, it might never be supported.

type PersistentIndex

type PersistentIndex struct {
	Operation    `yaml:",inline"`
	Fields       []string
	Collection   string
	Unique       bool
	Sparse       bool
	InBackground bool
	StoredValues []string
}

PersistentIndex creates a persistent index on the collections' fields.

func (PersistentIndex) Migrate

func (i PersistentIndex) Migrate(ctx context.Context, db driver.Database, _ map[string]interface{}) error

type SearchElementProperties

type SearchElementProperties struct {
	// Name of the element (e.g. collection name)
	Name string
	// The list of analyzers to be used for indexing of string values. Defaults to ["identity"].
	// NOTE: They must be defined in Arango.
	Analyzers []string `yaml:"analyzers,omitempty"`
	// Fields contains the properties for individual fields of the element.
	Fields []SearchElementProperties `yaml:"fields,omitempty"`
	// If set to true, all fields of this element will be indexed. Defaults to false.
	IncludeAllFields *bool `yaml:"includeAllFields,omitempty"`
	// This values specifies how the view should track values.
	// see ArangoSearchStoreValues
	StoreValues *string `yaml:"storeValues,omitempty"`
	// If set to true, values in a list are treated as separate values. Defaults to false.
	TrackListPositions *bool `yaml:"trackListPositions,omitempty"`
}

SearchElementProperties contains properties that specify how an element is indexed in an ArangoSearch view. Note that this structure is recursive. Settings not specified (nil) at a given level will inherit their setting from a lower level.

type SearchView

type SearchView struct {
	Operation `yaml:",inline"`
	// CleanupIntervalStep specifies the minimum number of commits to wait between
	// removing unused files in the data directory.
	CleanupIntervalStep *int64 `yaml:"cleanupIntervalStep,omitempty"`
	// CommitInterval ArangoSearch waits at least this many milliseconds between committing
	// view data store changes and making documents visible to queries
	CommitIntervalMsec *int64 `yaml:"commitIntervalMsec,omitempty"`
	// ConsolidationInterval specifies the minimum number of milliseconds that must be waited
	// between committing index data changes and making them visible to queries.
	ConsolidationIntervalMsec *int64 `yaml:"consolidationIntervalMsec,omitempty"`
	// ConsolidationPolicy specifies thresholds for consolidation.
	ConsolidationPolicy *ConsolidationPolicy `yaml:"consolidationPolicy,omitempty"`
	// SortFields lists the fields that are used for sorting.
	SortFields []SortField `yaml:"primarySort,omitempty"`
	// Links contains the properties for how individual collections
	// are indexed in this view.
	Links []SearchElementProperties `yaml:"links,omitempty"`
}

SearchView contains all the information needed to create an Arango Search SearchView.

func (SearchView) Migrate

func (searchView SearchView) Migrate(ctx context.Context, db driver.Database, extras map[string]interface{}) error

The impls_view contains all the implementation code for creating, modifying, and deleting an Arango Search View.

type SkiplistIndex

type SkiplistIndex struct {
	Operation     `yaml:",inline"`
	Fields        []string
	Collection    string
	Unique        bool
	Sparse        bool
	NoDeduplicate bool
	InBackground  bool
}

SkiplistIndex creates a skiplist index on the collections' fields.

func (SkiplistIndex) Migrate

func (i SkiplistIndex) Migrate(ctx context.Context, db driver.Database, _ map[string]interface{}) error

type SortField

type SortField struct {
	// The name of the field.
	Field string
	// Whether the field is ascending or descending.
	Ascending *bool `yaml:"ascending,omitempty"`
}

SortField describes a field and whether it is sorted in ascending order; it is used for the view's primary sort.

type StringArray

type StringArray []string

func (*StringArray) UnmarshalYAML

func (a *StringArray) UnmarshalYAML(unmarshal func(interface{}) error) error

type TTLIndex

type TTLIndex struct {
	Operation    `yaml:",inline"`
	Field        string
	Collection   string
	ExpireAfter  int
	InBackground bool
}

TTLIndex creates a TTL index on the collections' fields.

func (TTLIndex) Migrate

func (i TTLIndex) Migrate(ctx context.Context, db driver.Database, _ map[string]interface{}) error

type User

type User struct {
	Username string
	Password string
}

User the data used to update a user account

Directories

Path Synopsis
cmd
arangomigo
Package main allows the tool to execute from the command line.
