athena

v0.3.0
Published: Apr 28, 2022 License: MIT Imports: 17 Imported by: 1

README

go-athena

go-athena is a simple Go database/sql driver for Amazon Athena.

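package main

import (
	"database/sql"
	"log"

	_ "github.com/akrennmair/go-athena"
)

func main() {
	// URL-encode the access key ID and secret access key before building the connection string.
	db, err := sql.Open("athena", "db=default&output_location=s3://results&secret_key_id=XX&secret_access_key=YY")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
	rows, err := db.Query("SELECT url, code from cloudfront")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()
	for rows.Next() {
		var url string
		var code int
		if err := rows.Scan(&url, &code); err != nil {
			log.Fatal(err)
		}
	}
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}
}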

It provides a higher-level, idiomatic wrapper over the AWS Go SDK, comparable to the Athena JDBC driver that AWS provides for Java users.

Caveats

database/sql exposes many methods that Athena doesn't support. For example, Athena has no transactions, so Begin() is irrelevant. When a method must exist to satisfy a standard library interface but is unsupported, the driver returns an error saying so.
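The pattern described above can be illustrated with a stand-in driver built only on the standard library. This is a minimal sketch, not go-athena's code: `stubDriver`, `stubConn`, `errNoTransactions`, and `beginOnStub` are hypothetical names, and the real driver's error value and message may differ.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
)

// errNoTransactions is a hypothetical sentinel error for this sketch.
var errNoTransactions = errors.New("stub: transactions are not supported")

// stubDriver and stubConn are illustrative stand-ins, not go-athena types.
type stubDriver struct{}

// Open returns a connection without contacting any service.
func (stubDriver) Open(name string) (driver.Conn, error) { return stubConn{}, nil }

type stubConn struct{}

// Prepare is required by driver.Conn; this stub does not implement statements.
func (stubConn) Prepare(query string) (driver.Stmt, error) {
	return nil, errors.New("stub: prepare is not implemented")
}

// Close is required by driver.Conn and has nothing to release here.
func (stubConn) Close() error { return nil }

// Begin must exist to satisfy driver.Conn, so it reports the missing feature.
func (stubConn) Begin() (driver.Tx, error) { return nil, errNoTransactions }

func init() {
	// Register the stand-in under a name that cannot clash with "athena".
	sql.Register("stub", stubDriver{})
}

// beginOnStub opens a database backed by the stub and tries to start a transaction.
func beginOnStub() error {
	db, err := sql.Open("stub", "")
	if err != nil {
		return err
	}
	defer db.Close()
	_, err = db.Begin()
	return err
}

func main() {
	// The caller sees the driver's error through the ordinary database/sql API.
	fmt.Println("Begin failed:", beginOnStub())
}
```

Callers that depend on transactions should therefore check the error returned by Begin() rather than assume it succeeds.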

Testing

Unit tests aim to cover as much of the package code as possible. They cannot fully replace end-to-end testing, and because Athena has no local version and revolves around S3, integration tests are provided that run against AWS itself. They can be enabled by running go test -tags=integration.

These tests require AWS credentials. The simplest way to provide them is via AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables, but you can use anything supported by the Default Credential Provider Chain.

The tests support a few environment variables:

  • ATHENA_DATABASE can be used to override the default database "go_athena_tests"
  • S3_BUCKET can be used to override the default S3 bucket of "go-athena-tests"

Please bear in mind that the integration tests are currently unmaintained.

Acknowledgments

This library started out as a fork of https://github.com/segmentio/go-athena. As that project seems to have been abandoned, this fork aims to be as feature-complete as possible and integrate all the changes that were submitted as PRs to the original package and committed to other forks. The following people's work has been included in this package:

  • Fredrik Petrini: prepared statement support
  • Forud (fzerorubigd): support for authentication with access_key_id/secret_access_key
  • jkatagi: support for workgroups

Documentation

Constants

const (
	// TimestampLayout is the Go time layout string for an Athena `timestamp`.
	TimestampLayout             = "2006-01-02 15:04:05.999"
	TimestampWithTimeZoneLayout = "2006-01-02 15:04:05.999 MST"
	DateLayout                  = "2006-01-02"
)

Functions

func Open

func Open(cfg Config) (*sql.DB, error)

Open is a more robust alternative to `sql.Open`: instead of a connection string, it accepts a Config whose Athena field holds a preconfigured Athena API client. This is useful if you have a complex AWS session, since the driver doesn't currently attempt to serialize all options into a string.

Types

type Config

type Config struct {
	Athena athenaiface.AthenaAPI
	//Session        *session.Session
	Database       string
	OutputLocation string
	WorkGroup      string

	PollFrequency time.Duration
}

Config is the input to Open().

type Driver

type Driver struct {
	// contains filtered or unexported fields
}

Driver is a sql.Driver. It's intended for use with database/sql's sql.Open().

func NewDriver

func NewDriver(cfg *Config) *Driver

NewDriver allows you to register your own driver with `sql.Register`. It's useful for more complex use cases. Read more in PR #3. https://github.com/segmentio/go-athena/pull/3

Generally, sql.Open() or athena.Open() should suffice.

func (*Driver) Open

func (d *Driver) Open(connStr string) (driver.Conn, error)

Open should be used via `sql.Open("athena", "<params>")`. The following parameters are supported in URI query format (k=v&k2=v2&...):

- `db` (required) This is the Athena database name. In the UI, this defaults to "default", but the driver requires it regardless.

- `output_location` (required) This is the S3 location where Athena will write query results, in the format "s3://bucket/and/so/forth". In the AWS UI, this defaults to "s3://aws-athena-query-results-<ACCOUNTID>-<REGION>", but the driver requires it.

- `poll_frequency` (optional) Athena's API requires polling to retrieve query results. This is the frequency at which the driver will poll for results. It should be a duration string in the format produced by time.Duration.String(), such as "5s". A completely arbitrary default of "5s" was chosen.

- `region` (optional) Overrides the AWS region. Useful if the region is not set through an environment variable.

- `workgroup` (optional) Athena's workgroup. This defaults to "primary".

Credentials must be accessible via the SDK's Default Credential Provider Chain. For more advanced AWS credentials/session/config management, please supply a preconfigured Athena client directly via `athena.Open()`.
