hydrate

Version: v0.1.0
Published: Jan 18, 2020 License: Apache-2.0 Imports: 6 Imported by: 0

README

Hydrate

Hydrate is a package designed to work with jinzhu/gorm to provide an alternative to Preload to load hierarchies efficiently.

Preload loads each layer of a hierarchy with its own query: for each table it builds a WHERE IN (?,?,...,?) clause whose criteria are the relationship keys to load. For hierarchies with a large branching factor this can produce queries with an extreme number of IDs in the WHERE IN clause, which can make them significantly less performant.
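
To make the growth concrete, the sketch below builds a statement of the shape described above for a given set of parent keys. buildInQuery and the table and column names are hypothetical and only illustrate the pattern; this is not gorm's actual SQL generator.

```go
package main

import (
	"fmt"
	"strings"
)

// buildInQuery is a hypothetical helper (not part of gorm or hydrate) that
// mimics the shape of a per-level preload query: one placeholder per parent key.
func buildInQuery(table, column string, ids []uint) (string, []interface{}) {
	placeholders := make([]string, len(ids))
	args := make([]interface{}, len(ids))
	for i, id := range ids {
		placeholders[i] = "?"
		args[i] = id
	}
	query := fmt.Sprintf("SELECT * FROM %s WHERE %s IN (%s)",
		table, column, strings.Join(placeholders, ","))
	return query, args
}

func main() {
	// 2000 parent sections means 2000 placeholders in the query that loads
	// their exercises.
	ids := make([]uint, 2000)
	for i := range ids {
		ids[i] = uint(i + 1)
	}
	query, args := buildInQuery("exercises", "section_id", ids)
	fmt.Printf("placeholders: %d, query length: %d bytes\n", len(args), len(query))
}
```

The number of bound arguments grows linearly with the number of parent rows at each level, which is the cost a JOIN-based query avoids.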

With hydrate one or more queries can be constructed to load a full hierarchy of data.

Usage

Query

A query is constructed using NewQuery, which takes the database query used to load the full hierarchy. Models are added using AddModel to tell hydrate which alias to use to load each model. Run takes a context and any results you want returned. Valid results are pointers to a struct, a pointer to a struct, a slice of structs, or a slice of pointers to structs (i.e. pointers to ModelType, *ModelType, []ModelType, or []*ModelType). Multiple results can be passed and all will be loaded.

hydrate.NewQuery(db, `FROM textbooks t
        LEFT JOIN sections s on t.textbook_id = s.textbook_id
        LEFT JOIN exercises e ON e.section_id = s.section_id
        LEFT JOIN authors a ON a.author_id = t.author_id
        WHERE t.textbook_id in (?)
        ORDER BY t.textbook_id, s.section_id, a.author_id`, 1).
    AddModel(Textbook{}, "t").
    AddModel(Section{}, "s").
    AddModel(Exercise{}, "e").
    AddModel(Author{}, "a").
    Run(context.Background(), &textbooks)

MultiQuery

MultiQuery is a type backed by []Query. Multiple queries can be chained together to load the same hierarchy, so separate sections of the hierarchy can each be loaded by the query that suits them best. For important code paths, well-crafted MultiQueries are typically how you will get the most performance.

hydrate.MultiQuery{
    hydrate.NewQuery(db, `FROM textbooks t
            LEFT JOIN sections s on t.textbook_id = s.textbook_id
            WHERE t.textbook_id in (?)
            ORDER BY t.textbook_id, s.section_id`, 1).
        AddModel(Textbook{}, "t").
        AddModel(Section{}, "s"),

    hydrate.NewQuery(db, `FROM textbooks t
            LEFT JOIN authors a ON a.author_id = t.author_id
            WHERE t.textbook_id in (?)
            ORDER BY t.textbook_id, a.author_id`, 1).
        AddModel(Author{}, "a"),
}.Run(context.Background(), &textbooks)

Running Tests

Tests depend on a MySQL database being available. The connection to this DB can be set with the TEST_DB_HOST, TEST_DB_USERNAME, and TEST_DB_PASSWORD environment variables.

Prior to executing tests a new schema will be created on the DB and that schema will be removed when tests complete.

$ TEST_DB_HOST=localhost TEST_DB_USERNAME=root TEST_DB_PASSWORD=password go test .

Tests use "golden" files which record the expected output of each test, typically a JSON encoding of structs. When updating or writing new tests, the -update flag can be passed to rewrite the golden files.

$ TEST_DB_HOST=localhost TEST_DB_USERNAME=root TEST_DB_PASSWORD=password go test . -update

Benchmarks + Performance

Benchmarks run actual queries against the test DB, so timings can fluctuate. Performance also varies drastically with the shape of the hierarchy being loaded, so various configurations are run that load different amounts of data with different relationship branching factors. Loading all data in a single query can be less performant than gorm's Preload in some cases. However, in most cases a well-thought-out MultiQuery tends to perform better.

An example of current benchmarks is below. It compares different configurations using standard gorm Preload, one large hydrate.Query, and two queries using hydrate.MultiQuery:

goos: darwin
goarch: amd64
pkg: github.com/coursehero/hydrate
BenchmarkHydrate/S5:E10:I3/Preload-12          228    5264193 ns/op     170172 B/op     3283 allocs/op
BenchmarkHydrate/S5:E10:I3/Query-12            408    2973970 ns/op     163543 B/op     8394 allocs/op
BenchmarkHydrate/S5:E10:I3/MultiQuery-12       409    2847169 ns/op      72650 B/op     2479 allocs/op
BenchmarkHydrate/S2000:E2:I2/Preload-12         22   46077057 ns/op   16378600 B/op   320521 allocs/op
BenchmarkHydrate/S2000:E2:I2/Query-12           14   76894843 ns/op    9535165 B/op   497835 allocs/op
BenchmarkHydrate/S2000:E2:I2/MultiQuery-12      36   32554055 ns/op    4807059 B/op   204267 allocs/op
BenchmarkHydrate/S10:E10:I10/Preload-12        205    5807946 ns/op     319066 B/op     6551 allocs/op
BenchmarkHydrate/S10:E10:I10/Query-12          122    9622931 ns/op     903677 B/op    53810 allocs/op
BenchmarkHydrate/S10:E10:I10/MultiQuery-12     358    3222708 ns/op     126218 B/op     4857 allocs/op

Documentation

Overview

Package hydrate extends the preloading functionality of gorm.

Specifically, gorm's Preload only supports loading each relationship as a different query. This results in a query per level. Additionally, each level loads its data using a WHERE IN (...primary keys), which for deep hierarchies with thousands of items can result in poorly performing queries.

There are two ways provided to load data into a hierarchy using raw queries. Query will perform a single query to load one or more model structs. MultiQuery has a slice of Queries as its backing type and will allow multiple queries to be run and have results combined.

Types

type MultiQuery

type MultiQuery []Query

MultiQuery allows you to run multiple Queries and combine all results. Queries are run as normal, but all structs are shared across all queries. This means relationships can be populated from independent queries, and even a collection of results from the same table can be loaded from multiple queries.
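
The idea of sharing structs across queries can be sketched without a database. The store type, the row types, and the sample data below are all hypothetical; they only illustrate how two independent result sets can fill in the same struct when it is looked up by primary key. This is not hydrate's implementation.

```go
package main

import "fmt"

type Author struct {
	AuthorID uint
	Name     string
}

type Textbook struct {
	TextbookID uint
	Author     *Author
	Sections   []string
}

// store is a hypothetical registry keyed by primary key, so every query
// resolves the same *Textbook for a given textbook_id.
type store struct {
	textbooks map[uint]*Textbook
}

// textbook returns the shared struct for id, creating it on first use.
func (s *store) textbook(id uint) *Textbook {
	t, ok := s.textbooks[id]
	if !ok {
		t = &Textbook{TextbookID: id}
		s.textbooks[id] = t
	}
	return t
}

// Flattened rows as two separate queries might return them.
type sectionRow struct {
	TextbookID uint
	Title      string
}

type authorRow struct {
	TextbookID uint
	AuthorID   uint
	Name       string
}

// load applies both result sets to one shared store.
func load(sections []sectionRow, authors []authorRow) *store {
	s := &store{textbooks: map[uint]*Textbook{}}
	for _, r := range sections { // rows from the textbooks + sections query
		t := s.textbook(r.TextbookID)
		t.Sections = append(t.Sections, r.Title)
	}
	for _, r := range authors { // rows from the textbooks + authors query
		t := s.textbook(r.TextbookID)
		t.Author = &Author{AuthorID: r.AuthorID, Name: r.Name}
	}
	return s
}

func main() {
	s := load(
		[]sectionRow{{1, "Intro"}, {1, "Equations"}},
		[]authorRow{{1, 7, "Ada"}},
	)
	t := s.textbooks[1]
	fmt.Printf("Textbook %d: %d sections, author %s\n", t.TextbookID, len(t.Sections), t.Author.Name)
}
```

Because neither query joins sections and authors together, the result sets stay small: each row count is the sum of the two relationships rather than their product.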

Example

package main

import (
	"context"
	"fmt"

	"github.com/jinzhu/gorm"

	"github.com/coursehero/hydrate"
)

func main() {
	var db *gorm.DB

	//example tables
	type Author struct {
		AuthorID uint `gorm:"primary_key"`
		Name     string
	}

	type Section struct {
		SectionID  *uint `gorm:"primary_key"`
		TextbookID uint
		Title      string
	}

	type Textbook struct {
		TextbookID uint `gorm:"primary_key"`
		AuthorID   uint
		Name       string

		Author   Author     `gorm:"foreignkey:AuthorID;association_foreignkey:AuthorID"`
		Sections []*Section `gorm:"foreignkey:TextbookID;association_foreignkey:TextbookID"`
	}

	var textbooks []Textbook
	err := hydrate.MultiQuery{
		hydrate.NewQuery(db, `FROM textbooks t
    LEFT JOIN sections s on t.textbook_id = s.textbook_id
    WHERE t.textbook_id in (?)
	ORDER BY t.textbook_id, s.section_id`, 1).
			AddModel(Textbook{}, "t").
			AddModel(Section{}, "s"),

		hydrate.NewQuery(db, `FROM textbooks t
	LEFT JOIN authors a ON a.author_id = t.author_id
    WHERE t.textbook_id in (?)
	ORDER BY t.textbook_id, a.author_id`, 1).
			AddModel(Author{}, "a"),
	}.Run(context.Background(), &textbooks)

	if err != nil {
		fmt.Printf("err = %v\n", err)
	}

	//all textbooks are fully loaded with all relationships
	fmt.Printf("%d textbooks loaded\n", len(textbooks))
	for _, t := range textbooks {
		fmt.Printf("Textbook %d has %d sections\n", t.TextbookID, len(t.Sections))
	}
}

func (MultiQuery) Run

func (m MultiQuery) Run(ctx context.Context, output ...interface{}) error

Run will run all queries and return output combined from all query runs.

type Query

type Query struct {
	// contains filtered or unexported fields
}

Query defines a single query used to hydrate data. Any number of models can be added to be loaded; each model is added to the select query using the alias provided (or the table name if the alias is empty). Each model added stores unique items based on the primary key values returned from the query. All relationships for each model are filled by connecting to other loaded models using the gorm-defined relationship.
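
The deduplication and linking described above can be illustrated with a small standalone sketch. The row type, the assemble function, and the sample data are hypothetical; hydrate derives keys and relationships from gorm metadata, whereas this sketch hardcodes them for one parent/child pair.

```go
package main

import "fmt"

// row is one flattened result row from a hypothetical
// "textbooks LEFT JOIN sections" query. SectionID is nil when the
// LEFT JOIN found no matching section.
type row struct {
	TextbookID   uint
	TextbookName string
	SectionID    *uint
	SectionTitle string
}

type Section struct {
	SectionID  uint
	TextbookID uint
	Title      string
}

type Textbook struct {
	TextbookID uint
	Name       string
	Sections   []*Section
}

// assemble keeps one struct per primary key and attaches each section to its
// parent textbook. Illustrative only, not hydrate's implementation.
func assemble(rows []row) []*Textbook {
	textbooks := map[uint]*Textbook{}
	sections := map[uint]*Section{}
	var ordered []*Textbook

	for _, r := range rows {
		t, ok := textbooks[r.TextbookID]
		if !ok {
			t = &Textbook{TextbookID: r.TextbookID, Name: r.TextbookName}
			textbooks[r.TextbookID] = t
			ordered = append(ordered, t)
		}
		if r.SectionID == nil {
			continue // no joined section on this row
		}
		if _, seen := sections[*r.SectionID]; seen {
			continue // same section repeated by a further join
		}
		s := &Section{SectionID: *r.SectionID, TextbookID: r.TextbookID, Title: r.SectionTitle}
		sections[s.SectionID] = s
		t.Sections = append(t.Sections, s)
	}
	return ordered
}

// key returns a pointer to v, for building sample rows.
func key(v uint) *uint { return &v }

func main() {
	rows := []row{
		{1, "Algebra", key(10), "Intro"},
		{1, "Algebra", key(10), "Intro"}, // repeated because of a further join
		{1, "Algebra", key(11), "Equations"},
		{2, "Geometry", nil, ""},
	}
	for _, t := range assemble(rows) {
		fmt.Printf("Textbook %d has %d sections\n", t.TextbookID, len(t.Sections))
	}
}
```

With the sample rows this prints that textbook 1 has 2 sections and textbook 2 has 0, even though textbook 1 appears on three rows.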

Example

A single query can be provided using hydrate.NewQuery. Add model structs using AddModel, providing an instance of the model type (a struct value, as in the example below) and the alias used for the model in the query. The query provided should not include SELECT and should start with FROM.

package main

import (
	"context"
	"fmt"

	"github.com/jinzhu/gorm"

	"github.com/coursehero/hydrate"
)

func main() {
	var db *gorm.DB

	//example structs
	type Author struct {
		AuthorID uint `gorm:"primary_key"`
		Name     string
	}

	type Exercise struct {
		ExerciseID uint `gorm:"primary_key"`
		SectionID  uint
		Name       string
	}

	type Section struct {
		SectionID  *uint `gorm:"primary_key"`
		TextbookID uint
		Title      string

		Exercises []Exercise `gorm:"foreignkey:SectionID;association_foreignkey:SectionID"`
	}

	type Textbook struct {
		TextbookID uint `gorm:"primary_key"`
		AuthorID   uint
		Name       string

		Author   Author     `gorm:"foreignkey:AuthorID;association_foreignkey:AuthorID"`
		Sections []*Section `gorm:"foreignkey:TextbookID;association_foreignkey:TextbookID"`
	}

	var textbooks []Textbook
	err := hydrate.NewQuery(db, `FROM textbooks t
    LEFT JOIN sections s on t.textbook_id = s.textbook_id
	LEFT JOIN exercises e ON e.section_id = s.section_id
	LEFT JOIN authors a ON a.author_id = t.author_id
    WHERE t.textbook_id in (?)
	ORDER BY t.textbook_id, s.section_id, a.author_id`, 1).
		AddModel(Textbook{}, "t").
		AddModel(Section{}, "s").
		AddModel(Exercise{}, "e").
		AddModel(Author{}, "a").
		Run(context.Background(), &textbooks)
	if err != nil {
		fmt.Printf("err = %v\n", err)
	}

	//all textbooks are fully loaded with all relationships
	fmt.Printf("%d textbooks loaded\n", len(textbooks))
	for _, t := range textbooks {
		fmt.Printf("Textbook %d has %d sections\n", t.TextbookID, len(t.Sections))
		for _, s := range t.Sections {
			// SectionID is a *uint, so dereference it to print the key rather than the address.
			fmt.Printf("Section %d has %d exercises\n", *s.SectionID, len(s.Exercises))
		}
	}
}

func NewQuery

func NewQuery(db *gorm.DB, query string, args ...interface{}) Query

NewQuery creates a Query from the given query string and SQL args.

func (Query) AddModel

func (r Query) AddModel(in interface{}, alias string) Query

AddModel adds a model to be loaded during execution. Its fields are added to the select. If no alias is provided, the table name is used.

func (Query) Run

func (r Query) Run(ctx context.Context, output ...interface{}) error

Run runs the query and puts results in any outputs provided. Each output must be a pointer to a value that can be set. If a slice is provided, it is filled with all results. If a single item is passed, the first item is returned. However, the query itself is not limited.
