elasticsearch

package
v0.15.5
Published: Aug 10, 2018 License: MIT Imports: 16 Imported by: 3

README

MySQL access to Elasticsearch

Elasticsearch is awesome, but sometimes querying it is not, and integrating it into other apps involves custom code.
It would be great if MySQL, and all of the tools that speak the MySQL protocol, could access it.

Example Usage

This example imports a couple of hours' worth of historical data from https://www.githubarchive.org/ into a local Elasticsearch server for example querying. It only imports the github_issues and github_watch tables.

# from the root of this repo, there is a docker-compose file
# which has an elasticsearch server in it that we will use for this example.
cd github.com/dataux/dataux
docker-compose up

# assuming elasticsearch is on localhost, otherwise pass --host=myeshost
# let's import the data into elasticsearch
cd backends/elasticsearch/importgithub
go build && ./importgithub

# using docker, start a dataux server
docker run --rm -it --net=host -p 4000:4000 gcr.io/dataux-io/dataux:latest

# now that dataux is running, use the mysql client to connect
mysql -h 127.0.0.1 -P 4000

Now run some queries:

show databases;

-- first register the schema
-- dataux will introspect the tables
-- to create schema for the tables
CREATE schema github_archive IF NOT EXISTS WITH {
  "type":"elasticsearch",
  "schema":"github_archive",
  "hosts": ["http://127.0.0.1:9200"]
};

-- now a sql tour
show databases;
use github_archive;

show tables;

describe github_watch;

select actor, repository.url from github_watch limit 10;

select cardinality(`actor`) AS users_who_watched, min(`repository.id`) as oldest_repo from github_watch;

SELECT actor, `repository.name`, `repository.stargazers_count`, `repository.language`
FROM github_watch where `repository.language` = "Go";

select actor, repository.name from github_watch where repository.stargazers_count BETWEEN 1000 AND 1100;

SELECT actor, repository.organization AS org
FROM github_watch 
WHERE repository.created_at BETWEEN "2008-10-21T17:20:37Z" AND "2008-10-21T19:20:37Z";

select actor, repository.name from github_watch where repository.name IN ("node", "docker","d3","myicons", "bootstrap") limit 100;

select cardinality(`actor`) AS users_who_watched, count(*) as ct, min(`repository.id`) as oldest_repo
FROM github_watch
WHERE repository.description LIKE "database";
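
Any language with a MySQL driver can issue the same queries programmatically. Below is a minimal Go sketch using database/sql with the go-sql-driver/mysql driver; it assumes the dataux server from the steps above is listening on 127.0.0.1:4000, that the github_archive schema has been registered, and that both columns scan cleanly as strings.

package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/go-sql-driver/mysql" // MySQL wire-protocol driver
)

func main() {
	// dataux speaks the MySQL protocol, so a normal MySQL DSN works.
	db, err := sql.Open("mysql", "tcp(127.0.0.1:4000)/github_archive")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	rows, err := db.Query("SELECT actor, `repository.name` FROM github_watch LIMIT 5")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()

	for rows.Next() {
		var actor, repo string
		if err := rows.Scan(&actor, &repo); err != nil {
			log.Fatal(err)
		}
		fmt.Println(actor, repo)
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}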


SQL -> Elasticsearch

ES API                      SQL Query
Aliases                     show tables;
Mapping                     describe mytable;
hits.total for filter       select count(*) from table WHERE exists(a);
aggs min, max, avg, sum     select min(year), max(year), avg(year), sum(year) from table WHERE exists(a);
filter: terms               select * from table WHERE year IN (2015,2014,2013);
filter: gte, range          select * from table WHERE year BETWEEN 2012 AND 2014
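
As a concrete illustration of the last row, a BETWEEN predicate maps onto an Elasticsearch range query. The Go sketch below hand-builds JSON of that general shape; it is illustrative only and not necessarily the exact document this package generates.

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// Roughly the shape of an Elasticsearch range query for:
	//   select * from table WHERE year BETWEEN 2012 AND 2014
	q := map[string]interface{}{
		"query": map[string]interface{}{
			"range": map[string]interface{}{
				"year": map[string]interface{}{"gte": 2012, "lte": 2014},
			},
		},
	}
	b, _ := json.MarshalIndent(q, "", "  ")
	fmt.Println(string(b))
}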

Configuration

  • tables_to_load List the tables (indexes) to load if you do not want to load all of them.
CREATE schema github_archive IF NOT EXISTS WITH {
  "type":"elasticsearch",
  "schema":"github_archive",
  "tables_to_load":["indexa","indexb"],
  "hosts": ["http://127.0.0.1:9200"]
};

Documentation

Index

Constants

const (
	// SourceType defines the "source" type from qlbridge datasource
	SourceType = "elasticsearch"
)

Variables

var (
	// DefaultLimit is default page size limit.
	DefaultLimit = 1000
)

Functions

This section is empty.

Types

type ResultReader

type ResultReader struct {
	*exec.TaskBase

	Docs []u.JsonHelper
	Vals [][]driver.Value

	Total    int
	Aggs     u.JsonHelper
	ScrollId string
	Req      *SqlToEs
	// contains filtered or unexported fields
}

Elasticsearch ResultProvider, adapts the Elasticsearch HTTP JSON response to dataux/driver values.

func NewResultReader

func NewResultReader(req *SqlToEs) *ResultReader

func (*ResultReader) Close

func (m *ResultReader) Close() error

func (*ResultReader) Columns

func (m *ResultReader) Columns() []string

func (*ResultReader) Run

func (m *ResultReader) Run() error

Run fetches the API response, waits for it, converts the response into (static) rows, and allows Next() to start iterating through them.
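
The sketch below shows roughly how SqlToEs and ResultReader fit together. It is an assumption-laden illustration: normally the dataux planner drives this through WalkSourceSelect/WalkExecSource, and the tbl parameter, the import paths, and the package name example are placeholders, not part of this package.

package example

import (
	"fmt"
	"log"

	"github.com/araddon/qlbridge/schema"
	"github.com/dataux/dataux/backends/elasticsearch"
)

// readTable is a sketch only; tbl is a *schema.Table you would already have
// loaded for one Elasticsearch index.
func readTable(tbl *schema.Table) {
	req := elasticsearch.NewSqlToEs(tbl)

	rdr := elasticsearch.NewResultReader(req)
	defer rdr.Close()

	if err := rdr.Run(); err != nil { // fetch the ES response and build static rows
		log.Fatal(err)
	}
	fmt.Println(rdr.Columns()) // column names for the result set
	for _, row := range rdr.Vals {
		fmt.Println(row) // each row is a []driver.Value
	}
}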

type ResultReaderNext

type ResultReaderNext struct {
	*ResultReader
}

A wrapper allowing us to implement the sql/driver Next() interface, which is different from the qlbridge/datasource Next().

type Source

type Source struct {
	// contains filtered or unexported fields
}

Source is the elasticsearch datasource

func (*Source) Close

func (m *Source) Close() error

Close this source.

func (*Source) Init

func (m *Source) Init()

Init this source

func (*Source) Open

func (m *Source) Open(schemaName string) (schema.Conn, error)

Open opens a connection to the elasticsearch source.

func (*Source) Setup

func (m *Source) Setup(ss *schema.Schema) error

Setup this source

func (*Source) Table

func (m *Source) Table(table string) (*schema.Table, error)

Table gets a single table.

func (*Source) Tables

func (m *Source) Tables() []string

Tables returns the list of table names.
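
A sketch of the call order dataux puts a Source through. The ss parameter, the import paths, and standalone use are assumptions for illustration; in practice the schema registry drives these calls after the CREATE schema statement shown in the README.

package example

import (
	"fmt"
	"log"

	"github.com/araddon/qlbridge/schema"
	"github.com/dataux/dataux/backends/elasticsearch"
)

// showSource is a sketch only; ss is a *schema.Schema already configured with
// the elasticsearch hosts.
func showSource(ss *schema.Schema) {
	src := &elasticsearch.Source{}
	src.Init()
	if err := src.Setup(ss); err != nil {
		log.Fatal(err)
	}
	defer src.Close()

	fmt.Println(src.Tables()) // index names discovered by introspection

	conn, err := src.Open("github_watch") // a schema.Conn for one table
	if err != nil {
		log.Fatal(err)
	}
	_ = conn
}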

type SqlToEs

type SqlToEs struct {
	// contains filtered or unexported fields
}

SqlToEs is the SQL-to-Elasticsearch request object; it maps SQL queries into Elasticsearch JSON requests.

func NewSqlToEs

func NewSqlToEs(table *schema.Table) *SqlToEs

NewSqlToEs creates a request for translating SQL to Elasticsearch for the given table.

func (*SqlToEs) Close

func (m *SqlToEs) Close() error

Close this request

func (*SqlToEs) Columns

func (m *SqlToEs) Columns() []string

func (*SqlToEs) Host

func (m *SqlToEs) Host() string

Host gets the Elasticsearch host.

func (*SqlToEs) WalkAggs

func (m *SqlToEs) WalkAggs(cur expr.Node) (q esMap, _ error)

WalkAggs() walks aggregate expressions when used as part of the <select_list>.

  • For Aggregates (functions) it builds aggs
  • For Projections (non-functions) it does nothing; that is done later during projection

func (*SqlToEs) WalkExecSource

func (m *SqlToEs) WalkExecSource(p *plan.Source) (exec.Task, error)

func (*SqlToEs) WalkGroupBy

func (m *SqlToEs) WalkGroupBy() error

WalkGroupBy builds aggregations from the <select_list> for:

WHERE .. GROUP BY x,y,z

func (*SqlToEs) WalkNode

func (m *SqlToEs) WalkNode(cur expr.Node, q *esMap) (value.Value, error)

WalkNode walks an expression and its AND/OR/() logic to create an appropriately nested JSON document for Elasticsearch queries.

TODO: think we need to separate Value Nodes from those that return es types?

func (*SqlToEs) WalkSelectList

func (m *SqlToEs) WalkSelectList() error

WalkSelectList builds aggregations from the <select_list>:

SELECT <select_list> FROM ... WHERE

func (*SqlToEs) WalkSourceSelect

func (m *SqlToEs) WalkSourceSelect(planner plan.Planner, p *plan.Source) (plan.Task, error)

WalkSourceSelect is used during the planning phase to create a plan (plan.Task) or an error, and to report back any poly-fill necessary.

