devstats

Published: Oct 31, 2017 License: Apache-2.0 Imports: 18 Imported by: 0

README


GitHub archives Grafana visualization dashboards

Author: Łukasz Gryglicki lukaszgryglick@o2.pl

This is a toolset to visualize GitHub archives using Grafana dashboards.

Gha2db stands for GitHub Archives to DashBoards.

Goal

We want to create a toolset for visualizing various metrics for the Kubernetes community.

Everything is open source so that it can be used by other CNCF and non-CNCF open source projects.

The only requirement is that the project must be hosted in one or more public GitHub repositories.

Forking and installing locally

This toolset uses only open source tools: the Postgres database, the InfluxDB time-series database, and Grafana dashboards. It is written in Go and can be forked and installed by anyone.

Contributions and PRs are welcome. If you see a bug or want to add a new metric, please create an issue and/or PR.

To work on this project locally, please fork the original repository and then follow the instructions for running locally.

For a more detailed description of all environment variables, tools, switches, etc., please see USAGE.

Metrics

We want to support all kinds of metrics, including historical ones. Please see requested metrics to see what kinds of metrics are needed. Many of them cannot be computed from the data sources currently used.

Company Affiliations

We also want to have per-company statistics. To implement such metrics we need a mapping of developers to their employers.

There is a project that attempts to create such a mapping: cncf/gitdm.

Gha2db has an import tool that fetches company affiliations from cncf/gitdm and makes it possible to create per-company metrics/statistics.

If you see errors in the company affiliations, please open a pull request on cncf/gitdm; the updates will be reflected on https://devstats.k8s.io a couple of days after the PR has been accepted. Note that gitdm supports mappings based on dates, to account for developers moving between companies.

GitHub Archives

Our approach is to use GitHub archives. The possible alternatives are:

  1. BigQuery:
  • You can query any data you want, but the structure is quite flat and entire GitHub event payloads are stored as a single column containing JSON text.
  • This limits usage because that JSON must be parsed inside DB queries.
  • BigQuery is a commercial, paid service and is quite expensive.
  • It does not use standard SQL.
  2. GitHub API:
  • You can get the current state of objects, but you cannot get repo, PR, or issue state in the past (for example summary fields, etc.).
  • It is limited by GitHub API usage per hour, which makes local development harder.
  • API limits are very aggressive for unauthorized access, and even with authorized access you're limited to 5000 API calls/hour. With this limit, it would take more than 2 months to get all Kubernetes GitHub events (estimate).
  • It is much slower than processing GitHub archives or BigQuery.
  • You must query it via the API, and each call returns a single result.
  • You can use GitHub hook callbacks, but they only fire for current events.
  3. GitHub archives:
  • All GitHub events are packed into gzipped, multi-JSON files, one per hour, and made available from GitHub Archive. To use this data, you need to extract all hours (since the Kubernetes project started) and filter out everything except events from the 3 Kubernetes organizations (kubernetes, kubernetes-incubator, and kubernetes-client). A minimal Go sketch of reading one such hourly file follows this list.
  • This is a lot of data to process, but you get all possible GitHub events from the past. Processing more than 3 years of this data takes about 2-2.5 hours, and it only has to be done once; the processed results are then available for others to use.
  • You have a lot of data in a single file that can be processed/filtered in memory.
  • You get all possible events, and all of them include the current state of PRs, issues, and repos at the given point in time.
  • Processing of GitHub archives is free, so local development is easy.
  • The GitHub archives format changed on 2015-01-01, so the tool uses the older format (pre-2015) before that date and the newer one after. For details please see USAGE, especially the GHA2DB_OLDFMT environment variable.
  • I have 1.2M events in my Psql database, and each event has quite a complex structure; I would estimate that about 3-6 GitHub API calls would be needed to fetch the same data. That means about 7M API calls.
  • 7.2M / 5K (API limit per hour) gives 1440 hours, which is 2 months, while staying at the GitHub API limit the entire time. Processing ALL GitHub events takes about 2 hours without ANY limit.
  • You can optionally save the downloaded JSONs to avoid network traffic on subsequent runs (also useful for local development mode).
  • There is an already implemented version in Go; please see USAGE.
  • Dashboards can be viewed at https://devstats.k8s.io.
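
A minimal Go sketch of reading and filtering one hourly archive, assuming the GH Archive download URL layout (data.gharchive.org) and a simplified event struct in place of the full Event type documented below:

package main

import (
	"compress/gzip"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// miniEvent keeps only the fields needed for filtering; the real tool decodes the full Event structure.
type miniEvent struct {
	Type string `json:"type"`
	Org  *struct {
		Login string `json:"login"`
	} `json:"org"`
}

func main() {
	// One gzipped, newline-delimited JSON file per hour (assumed URL layout).
	resp, err := http.Get("https://data.gharchive.org/2017-10-30-12.json.gz")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		panic(resp.Status)
	}
	gz, err := gzip.NewReader(resp.Body)
	if err != nil {
		panic(err)
	}
	defer gz.Close()

	kubeOrgs := map[string]bool{"kubernetes": true, "kubernetes-incubator": true, "kubernetes-client": true}
	dec := json.NewDecoder(gz)
	matched := 0
	for {
		var ev miniEvent
		if err := dec.Decode(&ev); err == io.EOF {
			break
		} else if err != nil {
			panic(err)
		}
		if ev.Org != nil && kubeOrgs[ev.Org.Login] {
			matched++ // in the real tool this event would be written to Postgres
		}
	}
	fmt.Println("kubernetes org events in this hour:", matched)
}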

Architecture

We're getting all possible GitHub data for all objects, and all objects' historical states as well (not discarding any data):

  1. structure (manages database structure, summaries, views)
  • structure
  • It is used to create the database structure and indexes and to update database summary tables, views, etc.
  • Postgres advantages over MySQL include:
  • Postgres supports hash joins that allow joins over multi-million-row tables in less than 1s, while MySQL requires more than 3 minutes. MySQL had to duplicate data across multiple tables to create fast metrics.
  • Postgres has built-in fast REGEXP extract & match, while MySQL only has slow REGEXP match and no REGEXP extract, requiring external libraries like lib_mysql_pcre to be installed.
  • Postgres supports materialized views, so complex metrics can be stored in such views and we only need to refresh them when syncing data. MySQL requires creating an additional table and managing it.
  • MySQL has utf8-related issues; I finally found a workaround that requires using utf8mb4 and some additional mysqld configuration.
  2. gha2db (imports GitHub archives to the database and optionally to JSON files)
  • devstats
  • Reads from GitHub archives and writes to Postgres.
  • It saves ALL data from GitHub archives, so we have all GitHub structures fully populated. See Database structure.
  • We have all historical data from all possible GitHub events and summary values for repositories at given points in time.
  • The idea is to divide all data into two categories: const and variable. Const data does not change over time; variable data does, so event_id is added as part of its primary key.
  • The table structure and the const/variable split are described in USAGE.
  • The program parallelizes very easily (events in different hours are distinct, so each hour can be processed by a different CPU); it uses 48 CPUs on our test machine. A sketch of this per-hour fan-out follows this list.
  3. db2influx (computes metrics given as SQL files to be run on Postgres and saves the time-series output to InfluxDB)
  • db2influx
  • This keeps the complex metric logic in SQL files; db2influx executes the parameterized SQL files and writes the final time series to InfluxDB.
  • The parameters are '{{from}}' and '{{to}}', which allow computing the given metric for any date period.
  • For histogram metrics there is a single parameter '{{period}}' instead. To run db2influx in histogram mode, add "h" as the last parameter after all other params. gha2db_sync already handles this.
  • This means that InfluxDB only holds multiple time series (very simple data). InfluxDB is extremely good at manipulating this kind of data - it is what it was created for.
  • Grafana reads from InfluxDB by default and uses its power to generate all possible aggregates, minimums, maximums, averages, medians, percentiles, charts, etc.
  • Adding a new metric means adding a Postgres SQL file that computes it.
  4. gha2db_sync (synchronizes GitHub archive data with the Postgres and InfluxDB databases)
  • gha2db_sync
  • This program determines the most recent data in the Postgres database and then queries the GitHub archive from that date to the current date.
  • It adds data to the Postgres database (since the last run).
  • It updates summary tables and/or (materialized) views in the Postgres DB.
  • Then it calls db2influx for all defined SQL metrics and updates the Influx database as well.
  • It reads the list of metrics from a YAML file: metrics/metrics.yaml. Some metrics require filling gaps in their data; those metrics are defined in another YAML file, metrics/gaps.yaml.
  • This tool also supports an initial computation of all InfluxDB data (instead of the default update since the last run).
  • It is called by a cron job at 1:10, 2:10, and so on - GitHub archive publishes a new file every hour, so we're off by at most 1 hour.
  5. Additional stuff, the most important being the runq and import_affs tools.
  • runq
  • runq takes an SQL file name and parameter values and allows running a metric manually from the command line (this is for local development).
  • import_affs
  • import_affs takes one parameter - a JSON file name (this is a file from cncf/gitdm: github_users.json).
  • This tool imports GitHub usernames (in addition to logins from GHA) and creates developer-company affiliations (which can be used by the company stats metrics).
  • z2influx
  • z2influx is used to fill gaps that can occur for metrics that return multiple columns and rows where the number of rows depends on the date range. It uses the gaps.yaml file to define which metrics should be zero-filled.
  • annotations
  • annotations is used to add annotations on charts. It uses the annotations.yaml file to define them; the syntax is self-describing.
  • idb_tags
  • idb_tags is used to add InfluxDB tags on some specified series. Those tags are used to populate Grafana template drop-down values and names. This is used to auto-populate the Repository groups drop-down, so when somebody adds a new repository group it will automatically appear in the drop-down.
  • idb_tags uses the idb_tags.yaml file to configure InfluxDB tag generation.
  • idb_backup
  • idb_backup is used to backup/restore InfluxDB. A full regeneration of InfluxDB takes about 12 minutes. To avoid downtime when we need to rebuild InfluxDB, we can generate the new InfluxDB on a test database and then, if that succeeds, restore it on gha. Downtime is about 2 minutes.
  • webhook
  • webhook is used to react to Travis CI webhooks and trigger a deploy if the status, branch, and type match defined values; more details here.
  • There are also a few shell scripts, for example: running sync every N seconds, setting up InfluxDB, etc.
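
A minimal sketch of the per-hour fan-out described for gha2db above; processHour is a hypothetical stand-in for downloading and importing one hourly archive, and the real tool's scheduling and error handling differ:

package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

// processHour is a hypothetical stand-in for fetching and importing one hourly GHA file.
func processHour(dt time.Time) {
	fmt.Println("processing", dt.Format("2006-01-02-15"))
}

func main() {
	from := time.Date(2017, 10, 30, 0, 0, 0, 0, time.UTC)
	to := from.Add(24 * time.Hour)

	hours := make(chan time.Time)
	var wg sync.WaitGroup
	// Events in different hours are distinct, so each hour can be handled by a separate worker/CPU.
	for i := 0; i < runtime.NumCPU(); i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for dt := range hours {
				processHour(dt)
			}
		}()
	}
	for dt := from; dt.Before(to); dt = dt.Add(time.Hour) {
		hours <- dt
	}
	close(hours)
	wg.Wait()
}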

Detailed usage is here USAGE

Adding new metrics

Please see metrics to see how to add new metrics.

Database structure details

The main idea is that we divide tables into 2 groups:

  • const: meaning that the data in this table does not change over time (it is saved once)
  • variable: meaning that the data in those tables can change between GH events, and the GH event_id is part of those tables' primary key
  • There are also "compute" tables that are auto-updated by the gha2db_sync/structure tools, and an affiliations table that is filled by the import_affs tool.

Please see USAGE for a detailed list of database tables.

Grafana dashboards

Please see dashboards for the list of already defined Grafana dashboards.

Detailed Usage instructions

Benchmarks

The Ruby version was dropped, but you can see benchmarks of Ruby using MySQL, Ruby using Postgres, and the current Go version using Postgres here:

Benchmarks

In summary: the Go version can import all GitHub archives data (not discarding anything) for all Kubernetes orgs/repos, from the project's beginning on GitHub (2014-06-01), in about 2-2.5 hours!

Servers

The servers to run devstats are generously provided by Packet bare metal hosting as part of CNCF's Community Infrastructure Lab.

Documentation

Index

Constants

const DataDir string = "/etc/gha2db/"

DataDir - common constant string

const GHA string = "gha"

GHA - common constant string

const GHAAdmin string = "gha_admin"

GHAAdmin - common constant string

const Now string = "now"

Now - common constant string

const Password string = "password"

Password - common constant string

const Quarter string = "quarter"

Quarter - common constant string

const Retry string = "retry"

Retry - common constant string

const Today string = "today"

Today - common constant string

Variables

This section is empty.

Functions

func ActorIDOrNil

func ActorIDOrNil(actPtr *Actor) interface{}

ActorIDOrNil - return Actor ID from pointer or nil

func ActorLoginOrNil

func ActorLoginOrNil(actPtr *Actor) interface{}

ActorLoginOrNil - return Actor Login from pointer or nil

func AddNIntervals

func AddNIntervals(dt time.Time, n int, nextIntervalStart, prevIntervalStart func(time.Time) time.Time) time.Time

AddNIntervals adds (using nextIntervalStart) or subtracts (using prevIntervalStart) N intervals to the given date. The Next/Prev functions can be the Hour, Day, Week, Month, Quarter, Year functions (defined in this module) or other custom functions with a `func(time.Time) time.Time` signature
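
An illustrative call (not part of the original docs), assuming the package is imported as lib "devstats" (the import path/alias is an assumption):

package main

import (
	"fmt"
	"time"

	lib "devstats" // assumed import path/alias
)

func main() {
	dt := time.Date(2017, 10, 30, 12, 0, 0, 0, time.UTC)
	// Move 3 day-long intervals forward using the Next/Prev day-start helpers from this package.
	fmt.Println(lib.AddNIntervals(dt, 3, lib.NextDayStart, lib.PrevDayStart))
}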

func BoolOrNil

func BoolOrNil(boolPtr *bool) interface{}

BoolOrNil - return either nil or value of boolPtr

func CleanUTF8

func CleanUTF8(str string) string

CleanUTF8 - clean a UTF-8 string so it contains only runes allowed by Pq

func CommentIDOrNil

func CommentIDOrNil(commPtr *Comment) interface{}

CommentIDOrNil - return Comment ID from pointer or nil

func CreateDatabaseIfNeeded

func CreateDatabaseIfNeeded(ctx *Ctx) bool

CreateDatabaseIfNeeded - creates the requested database if it does not exist. Returns true if the database did not exist and was created

func CreateTable

func CreateTable(tdef string) string

CreateTable is used to replace DB specific parts of Create Table SQL statement

func DatabaseExists

func DatabaseExists(ctx *Ctx, closeConn bool) (exists bool, c *sql.DB)

DatabaseExists - checks if the database stored in the context exists. If closeConn is true, it closes the connection after checking whether the database exists. If closeConn is false, it returns an open connection to the default database "postgres"

func DayStart

func DayStart(dt time.Time) time.Time

DayStart - return time rounded to current day start

func DescriblePeriodInHours

func DescriblePeriodInHours(hrs float64) (desc string)

DescriblePeriodInHours - return string description of a time period given in hours

func DropDatabaseIfExists

func DropDatabaseIfExists(ctx *Ctx) bool

DropDatabaseIfExists - drops requested database if exists Returns true if database existed and was dropped

func ExecCommand

func ExecCommand(ctx *Ctx, cmdAndArgs []string, env map[string]string) error

ExecCommand - execute command given by array of strings with eventual environment map
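
A hedged usage sketch (import path/alias assumed); by default Ctx.ExecFatal is true and failures exit the program, so the sketch turns that off to get an error back:

package main

import (
	lib "devstats" // assumed import path/alias
)

func main() {
	var ctx lib.Ctx
	ctx.Init()            // read configuration from environment variables
	ctx.ExecFatal = false // return an error instead of exiting the program on failure
	// Run "ls -l" with one extra environment variable.
	err := lib.ExecCommand(&ctx, []string{"ls", "-l"}, map[string]string{"EXAMPLE_VAR": "1"})
	if err != nil {
		panic(err)
	}
}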

func ExecSQL

func ExecSQL(con *sql.DB, ctx *Ctx, query string, args ...interface{}) (sql.Result, error)

ExecSQL executes given SQL on Postgres DB (and return single state result, that doesn't need to be closed)

func ExecSQLTx

func ExecSQLTx(con *sql.Tx, ctx *Ctx, query string, args ...interface{}) (sql.Result, error)

ExecSQLTx executes given SQL on Postgres DB (and return single state result, that doesn't need to be closed) It is for running inside transaction

func ExecSQLTxWithErr

func ExecSQLTxWithErr(con *sql.Tx, ctx *Ctx, query string, args ...interface{}) sql.Result

ExecSQLTxWithErr is a wrapper for ExecSQLTx that exits on error. It is for running inside a transaction

func ExecSQLWithErr

func ExecSQLWithErr(con *sql.DB, ctx *Ctx, query string, args ...interface{}) sql.Result

ExecSQLWithErr is a wrapper for ExecSQL that exits on error

func FatalOnError

func FatalOnError(err error) string

FatalOnError displays error message (if error present) and exits program

func FirstIntOrNil

func FirstIntOrNil(intPtrs []*int) interface{}

FirstIntOrNil - return the value of the first non-nil int pointer in the slice, or nil

func ForkeeIDOrNil

func ForkeeIDOrNil(forkPtr *Forkee) interface{}

ForkeeIDOrNil - return Forkee ID from pointer or nil

func ForkeeNameOrNil

func ForkeeNameOrNil(forkPtr *Forkee) interface{}

ForkeeNameOrNil - return Forkee Name from pointer or nil

func ForkeeOldIDOrNil

func ForkeeOldIDOrNil(forkPtr *ForkeeOld) interface{}

ForkeeOldIDOrNil - return ForkeeOld ID from pointer or nil

func GetIntervalFunctions

func GetIntervalFunctions(intervalAbbr string) (interval string, n int, intervalStart, nextIntervalStart, prevIntervalStart func(time.Time) time.Time)

GetIntervalFunctions - return the interval name, interval count, and the intervalStart, nextIntervalStart, prevIntervalStart functions from an interval abbreviation: h|d2|w3|m4|q|y. For example: w3 = 3 weeks, q2 = 2 quarters, y = 1 year, d7 = 7 days (not the same as w), m3 = 3 months (not the same as q)
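
An illustrative call (import path/alias assumed) showing how the "w3" abbreviation is unpacked:

package main

import (
	"fmt"
	"time"

	lib "devstats" // assumed import path/alias
)

func main() {
	// "w3" = a 3-week interval: returns the interval name, n=3 and the start/next/prev helpers.
	interval, n, intervalStart, nextIntervalStart, prevIntervalStart := lib.GetIntervalFunctions("w3")
	dt := time.Date(2017, 10, 30, 12, 0, 0, 0, time.UTC)
	fmt.Println(interval, n, intervalStart(dt), nextIntervalStart(dt), prevIntervalStart(dt))
}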

func GetThreadsNum

func GetThreadsNum(ctx *Ctx) int

GetThreadsNum returns the number of available CPUs. If the environment variable GHA2DB_ST is set, it returns 1. It can be used to debug the single-threaded version

func HashStrings

func HashStrings(strs []string) int

HashStrings - returns a unique hash for a string array. This value is meant to be used (negated) as an ID, to mark that it was artificially generated
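
A tiny illustrative call (import path/alias assumed); the inputs are arbitrary example strings:

package main

import (
	"fmt"

	lib "devstats" // assumed import path/alias
)

func main() {
	// A stable hash for a string array; the value is meant to be used (as a negative number) as an artificial ID.
	fmt.Println(lib.HashStrings([]string{"kubernetes", "kubernetes/kubernetes"}))
}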

func HourStart

func HourStart(dt time.Time) time.Time

HourStart - return time rounded to current hour start

func IDBBatchPoints

func IDBBatchPoints(ctx *Ctx, con *client.Client) client.BatchPoints

IDBBatchPoints returns batch points for given connection and database from context

func IDBBatchPointsWithDB

func IDBBatchPointsWithDB(ctx *Ctx, con *client.Client, db string) client.BatchPoints

IDBBatchPointsWithDB returns batch points for given connection and database from context

func IDBConn

func IDBConn(ctx *Ctx) client.Client

IDBConn Connects to InfluxDB database

func IDBNewPointWithErr

func IDBNewPointWithErr(name string, tags map[string]string, fields map[string]interface{}, dt time.Time) *client.Point

IDBNewPointWithErr - return InfluxDB Point, on error exit

func InsertIgnore

func InsertIgnore(query string) string

InsertIgnore - will return insert statement with ignore option specific for DB

func IntOrNil

func IntOrNil(intPtr *int) interface{}

IntOrNil - return either nil or value of intPtr

func IssueIDOrNil

func IssueIDOrNil(issuePtr *Issue) interface{}

IssueIDOrNil - return Issue ID from pointer or nil

func Mgetc

func Mgetc(ctx *Ctx) string

Mgetc waits for a single key press and returns the character pressed

func MilestoneIDOrNil

func MilestoneIDOrNil(milPtr *Milestone) interface{}

MilestoneIDOrNil - return Milestone ID from pointer or nil

func MonthStart

func MonthStart(dt time.Time) time.Time

MonthStart - return time rounded to current month start

func NValue

func NValue(index int) string

NValue will return $n

func NValues

func NValues(n int) string

NValues will return values($1, $2, .., $n)
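
A short sketch of building a Postgres insert with the NValue/NValues helpers above (import path/alias assumed; the table and columns are purely illustrative):

package main

import (
	"fmt"

	lib "devstats" // assumed import path/alias
)

func main() {
	// NValues(3) produces the "values($1, $2, $3)" placeholder list used with Postgres parameters.
	query := "insert into some_table(a, b, c) " + lib.NValues(3)
	fmt.Println(query)
	fmt.Println(lib.NValue(2)) // "$2"
}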

func NegatedBoolOrNil

func NegatedBoolOrNil(boolPtr *bool) interface{}

NegatedBoolOrNil - return either nil or negated value of boolPtr

func NextDayStart

func NextDayStart(dt time.Time) time.Time

NextDayStart - return time rounded to next day start

func NextHourStart

func NextHourStart(dt time.Time) time.Time

NextHourStart - return time rounded to next hour start

func NextMonthStart

func NextMonthStart(dt time.Time) time.Time

NextMonthStart - return time rounded to next month start

func NextQuarterStart

func NextQuarterStart(dt time.Time) time.Time

NextQuarterStart - return time rounded to next quarter start

func NextWeekStart

func NextWeekStart(dt time.Time) time.Time

NextWeekStart - return time rounded to next week start

func NextYearStart

func NextYearStart(dt time.Time) time.Time

NextYearStart - return time rounded to next year start

func NormalizeName

func NormalizeName(str string) string

NormalizeName - clean a DB string of -, /, ., and " ", trim leading and trailing spaces, lowercase, and normalize Unicode characters

func OrgIDOrNil

func OrgIDOrNil(orgPtr *Org) interface{}

OrgIDOrNil - return Org ID from pointer or nil

func OrgLoginOrNil

func OrgLoginOrNil(orgPtr *Org) interface{}

OrgLoginOrNil - return Org Login from pointer or nil

func PgConn

func PgConn(ctx *Ctx) *sql.DB

PgConn Connects to Postgres database

func PrettyPrintJSON

func PrettyPrintJSON(jsonBytes []byte) []byte

PrettyPrintJSON - pretty formats raw JSON bytes

func PrevDayStart

func PrevDayStart(dt time.Time) time.Time

PrevDayStart - return time rounded to prev day start

func PrevHourStart

func PrevHourStart(dt time.Time) time.Time

PrevHourStart - return time rounded to prev hour start

func PrevMonthStart

func PrevMonthStart(dt time.Time) time.Time

PrevMonthStart - return time rounded to prev month start

func PrevQuarterStart

func PrevQuarterStart(dt time.Time) time.Time

PrevQuarterStart - return time rounded to prev quarter start

func PrevWeekStart

func PrevWeekStart(dt time.Time) time.Time

PrevWeekStart - return time rounded to prev week start

func PrevYearStart

func PrevYearStart(dt time.Time) time.Time

PrevYearStart - return time rounded to prev year start

func Printf

func Printf(format string, args ...interface{}) (n int, err error)

Printf is a wrapper around fmt.Printf(...) that supports logging.

func PullRequestIDOrNil

func PullRequestIDOrNil(prPtr *PullRequest) interface{}

PullRequestIDOrNil - return PullRequest ID from pointer or nil

func QuarterStart

func QuarterStart(dt time.Time) time.Time

QuarterStart - return time rounded to current quarter start

func QueryIDB

func QueryIDB(con client.Client, ctx *Ctx, query string) []client.Result

QueryIDB - do InfluxDB query

func QueryIDBWithDB

func QueryIDBWithDB(con client.Client, ctx *Ctx, query, db string) []client.Result

QueryIDBWithDB - do InfluxDB query

func QueryRowSQL

func QueryRowSQL(con *sql.DB, ctx *Ctx, query string, args ...interface{}) *sql.Row

QueryRowSQL executes given SQL on Postgres DB (and returns single row)

func QuerySQL

func QuerySQL(con *sql.DB, ctx *Ctx, query string, args ...interface{}) (*sql.Rows, error)

QuerySQL executes given SQL on Postgres DB (and returns rowset that needs to be closed)

func QuerySQLTx

func QuerySQLTx(con *sql.Tx, ctx *Ctx, query string, args ...interface{}) (*sql.Rows, error)

QuerySQLTx executes given SQL on Postgres DB (and returns rowset that needs to be closed) It is for running inside transaction

func QuerySQLTxWithErr

func QuerySQLTxWithErr(con *sql.Tx, ctx *Ctx, query string, args ...interface{}) *sql.Rows

QuerySQLTxWithErr is a wrapper for QuerySQLTx that exits on error. It is for running inside a transaction

func QuerySQLWithErr

func QuerySQLWithErr(con *sql.DB, ctx *Ctx, query string, args ...interface{}) *sql.Rows

QuerySQLWithErr is a wrapper for QuerySQL that exits on error

func ReleaseIDOrNil

func ReleaseIDOrNil(relPtr *Release) interface{}

ReleaseIDOrNil - return Release ID from pointer or nil

func RepoIDOrNil

func RepoIDOrNil(repoPtr *Repo) interface{}

RepoIDOrNil - return Repo ID from pointer or nil

func RepoNameOrNil

func RepoNameOrNil(repoPtr *Repo) interface{}

RepoNameOrNil - return Repo Name from pointer or nil

func SafeQueryIDB

func SafeQueryIDB(con client.Client, ctx *Ctx, query string) (*client.Response, error)

SafeQueryIDB - do InfluxDB query, on error return error data

func SkipEmpty

func SkipEmpty(arr []string) []string

SkipEmpty - skip one-element arrays containing only an empty string. This is what strings.Split() returns for empty input; we expect an empty array or empty map to be returned in such cases
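
A small usage sketch (import path/alias assumed) showing the strings.Split() case the doc comment describes:

package main

import (
	"fmt"
	"strings"

	lib "devstats" // assumed import path/alias
)

func main() {
	// strings.Split("", ",") yields a one-element array containing only ""; SkipEmpty turns that into an empty array.
	fmt.Println(len(lib.SkipEmpty(strings.Split("", ",")))) // 0
	fmt.Println(lib.SkipEmpty(strings.Split("a,b", ",")))   // [a b]
}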

func StringOrNil

func StringOrNil(strPtr *string) interface{}

StringOrNil - return either nil or value of strPtr

func StringsMapToArray

func StringsMapToArray(f func(string) string, strArr []string) []string

StringsMapToArray calls the given function on every array item and returns the array of processed items. Example call: lib.StringsMapToArray(func(x string) string { return strings.TrimSpace(x) }, []string{" a", " b ", "c "})
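
The example call from the doc comment, made runnable (import path/alias assumed):

package main

import (
	"fmt"
	"strings"

	lib "devstats" // assumed import path/alias
)

func main() {
	// Trim every element of the slice.
	trimmed := lib.StringsMapToArray(
		func(x string) string { return strings.TrimSpace(x) },
		[]string{" a", " b ", "c "},
	)
	fmt.Println(trimmed) // [a b c]
}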

func StringsMapToSet

func StringsMapToSet(f func(string) string, strArr []string) map[string]struct{}

StringsMapToSet calls the given function on every array item and returns the set of processed items. Example call: lib.StringsMapToSet(func(x string) string { return strings.TrimSpace(x) }, []string{" a", " b ", "c "})

func StringsSetKeys

func StringsSetKeys(set map[string]struct{}) []string

StringsSetKeys - returns all keys from string map

func StripUnicode

func StripUnicode(str string) string

StripUnicode strips non-Unicode and control characters from a string. From: https://rosettacode.org/wiki/Strip_control_codes_and_extended_characters_from_a_string#Go

func Structure

func Structure(ctx *Ctx)

Structure creates full database structure, indexes, views/summary tables etc

func TimeOrNil

func TimeOrNil(timePtr *time.Time) interface{}

TimeOrNil - return either nil or value of timePtr

func TimeParseAny

func TimeParseAny(dtStr string) time.Time

TimeParseAny - attempts to parse time from a string YYYY-MM-DD HH:MI:SS, skipping parts from the right until only YYYY is left
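
An illustrative call with progressively shorter date strings (import path/alias assumed):

package main

import (
	"fmt"

	lib "devstats" // assumed import path/alias
)

func main() {
	// Parts are dropped from the right, down to just YYYY.
	fmt.Println(lib.TimeParseAny("2017-10-30 12:34:56"))
	fmt.Println(lib.TimeParseAny("2017-10-30"))
	fmt.Println(lib.TimeParseAny("2017"))
}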

func TimeParseIDB

func TimeParseIDB(dtStr string) time.Time

TimeParseIDB - parse InfluxDB time output string into time.Time

func ToGHADate

func ToGHADate(dt time.Time) string

ToGHADate - return time formatted as YYYY-MM-DD-H

func ToYMDDate

func ToYMDDate(dt time.Time) string

ToYMDDate - return time formatted as YYYY-MM-DD

func ToYMDHDate

func ToYMDHDate(dt time.Time) string

ToYMDHDate - return time formatted as YYYY-MM-DD HH

func ToYMDHMSDate

func ToYMDHMSDate(dt time.Time) string

ToYMDHMSDate - return time formatted as YYYY-MM-DD HH:MI:SS

func TruncStringOrNil

func TruncStringOrNil(strPtr *string, maxLen int) interface{}

TruncStringOrNil - return either nil or value of strPtr truncated to maxLen chars

func TruncToBytes

func TruncToBytes(str string, size int) string

TruncToBytes - truncates text to <= size bytes (note that this can be far fewer than size UTF-8 runes)
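
A short sketch (import path/alias assumed) showing why the byte limit is not a rune limit:

package main

import (
	"fmt"

	lib "devstats" // assumed import path/alias
)

func main() {
	// "łódź żółw" contains 2-byte UTF-8 runes, so a 6-byte limit keeps fewer than 6 characters.
	fmt.Println(lib.TruncToBytes("łódź żółw", 6))
}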

func WeekStart

func WeekStart(dt time.Time) time.Time

WeekStart - return time rounded to current week start Assumes first week day is Sunday

func YearStart

func YearStart(dt time.Time) time.Time

YearStart - return time rounded to current year start

Types

type Actor

type Actor struct {
	ID    int    `json:"id"`
	Login string `json:"login"`
	Name  string `json:"-"`
}

Actor - GHA Actor structure. Name is skipped by JSON load/save (tagged "-") but is used when importing affiliations from cncf/gitdm: github_users.json

type AnyArray

type AnyArray []interface{}

AnyArray - holds array of interface{} - just a shortcut

type Asset

type Asset struct {
	ID            int       `json:"id"`
	CreatedAt     time.Time `json:"created_at"`
	UpdatedAt     time.Time `json:"updated_at"`
	Name          string    `json:"name"`
	Label         *string   `json:"label"`
	Uploader      Actor     `json:"uploader"`
	ContentType   string    `json:"content_type"`
	State         string    `json:"state"`
	Size          int       `json:"size"`
	DownloadCount int       `json:"download_count"`
}

Asset - GHA Asset structure

type Author

type Author struct {
	Name  string `json:"name"`
	Email string `json:"email"`
}

Author - GHA Commit Author structure

type Branch

type Branch struct {
	SHA   string  `json:"sha"`
	User  *Actor  `json:"user"`
	Repo  *Forkee `json:"repo"` // This is confusing, but actually GHA has "repo" fields that holds "forkee" structure
	Label string  `json:"label"`
	Ref   string  `json:"ref"`
}

Branch - GHA Branch structure

type Comment

type Comment struct {
	ID                  int       `json:"id"`
	Body                string    `json:"body"`
	CreatedAt           time.Time `json:"created_at"`
	UpdatedAt           time.Time `json:"updated_at"`
	User                Actor     `json:"user"`
	CommitID            *string   `json:"commit_id"`
	OriginalCommitID    *string   `json:"original_commit_id"`
	DiffHunk            *string   `json:"diff_hunk"`
	Position            *int      `json:"position"`
	OriginalPosition    *int      `json:"original_position"`
	Path                *string   `json:"path"`
	PullRequestReviewID *int      `json:"pull_request_review_id"`
	Line                *int      `json:"line"`
}

Comment - GHA Comment structure

type Commit

type Commit struct {
	SHA      string `json:"sha"`
	Author   Author `json:"author"`
	Message  string `json:"message"`
	Distinct bool   `json:"distinct"`
}

Commit - GHA Commit structure

type Ctx

type Ctx struct {
	Debug            int       // from GHA2DB_DEBUG Debug level: 0-no, 1-info, 2-verbose, including SQLs, default 0
	CmdDebug         int       // from GHA2DB_CMDDEBUG Commands execution Debug level: 0-no, 1-only output commands, 2-output commands and their output, 3-output full environment as well, default 0
	JSONOut          bool      // from GHA2DB_JSON gha2db: write JSON files? default false
	DBOut            bool      // from GHA2DB_NODB gha2db: write to SQL database, default true
	ST               bool      // from GHA2DB_ST true: use single threaded version, false: use multi threaded version, default false
	NCPUs            int       // from GHA2DB_NCPUS, set to override number of CPUs to run, this overwrites GHA2DB_ST, default 0 (which means do not use it)
	PgHost           string    // from PG_HOST, default "localhost"
	PgPort           string    // from PG_PORT, default "5432"
	PgDB             string    // from PG_DB, default "gha"
	PgUser           string    // from PG_USER, default "gha_admin"
	PgPass           string    // from PG_PASS, default "password"
	PgSSL            string    // from PG_SSL, default "disable"
	Index            bool      // from GHA2DB_INDEX Create DB index? default false
	Table            bool      // from GHA2DB_SKIPTABLE Create table structure? default true
	Tools            bool      // from GHA2DB_SKIPTOOLS Create DB tools (like views, summary tables, materialized views etc)? default true
	Mgetc            string    // from GHA2DB_MGETC Character returned by mgetc (if non empty), default ""
	IDBHost          string    // from IDB_HOST, default "http://localhost"
	IDBPort          string    // from IDB_PORT, default 8086
	IDBDB            string    // from IDB_DB, default "gha"
	IDBUser          string    // from IDB_USER, default "gha_admin"
	IDBPass          string    // from IDB_PASS, default "password"
	QOut             bool      // from GHA2DB_QOUT output all SQL queries?, default false
	CtxOut           bool      // from GHA2DB_CTXOUT output all context data (this struct), default false
	LogTime          bool      // from GHA2DB_SKIPTIME, output time with all lib.Printf(...) calls, default true, use GHA2DB_SKIPTIME to disable
	DefaultStartDate time.Time // from GHA2DB_STARTDT, default `2014-06-01 00:00 UTC`, expects format "YYYY-MM-DD HH:MI:SS"
	LastSeries       string    // from GHA2DB_LASTSERIES, use this InfluxDB series to determine last timestamp date, default "events_h"
	SkipIDB          bool      // from GHA2DB_SKIPIDB gha2db_sync tool, skip Influx DB processing? for db2influx it skips final series write, default false
	SkipPDB          bool      // from GHA2DB_SKIPPDB gha2db_sync tool, skip Postgres DB processing? default false
	ResetIDB         bool      // from GHA2DB_RESETIDB sync tool, regenerate all InfluxDB points? default false
	Explain          bool      // from GHA2DB_EXPLAIN runq tool, prefix query with "explain " - it will display query plan instead of executing real query, default false
	OldFormat        bool      // from GHA2DB_OLDFMT gha2db tool, if set then use pre 2015 GHA JSONs format
	Exact            bool      // From GHA2DB_EXACT gha2db tool, if set then orgs list provided from commandline is used as a list of exact repository full names, like "a/b,c/d,e"
	LogToDB          bool      // From GHA2DB_SKIPLOG all tools, if set, DB logging into Postgres table `gha_logs` will be disabled
	Local            bool      // From GHA2DB_LOCAL gha2db_sync tool, if set, gha2db_sync will call other tools prefixed with "./" to use locally compiled ones. Otherwise it will call binaries without the prefix (so it will use those in /usr/bin/).
	AnnotationsYaml  string    // From GHA2DB_ANNOTATIONS_YAML annotations tool, set other annotations.yaml file, default is "metrics/annotations.yaml"
	MetricsYaml      string    // From GHA2DB_METRICS_YAML gha2db_sync tool, set other metrics.yaml file, default is "metrics/metrics.yaml"
	GapsYaml         string    // From GHA2DB_GAPS_YAML gha2db_sync tool, set other gaps.yaml file, default is "metrics/gaps.yaml"
	TagsYaml         string    // From GHA2DB_TAGS_YAML idb_tags tool, set other idb_tags.yaml file, default is "metrics/idb_tags.yaml"
	ClearDBPeriod    string    // From GHA2DB_MAXLOGAGE gha2db_sync tool, maximum age of gha_logs entries, default "1 week"
	Trials           []int     // From GHA2DB_TRIALS, all Postgres related tools, retry periods for "too many connections open" error
	WebHookRoot      string    // From GHA2DB_WHROOT, webhook tool, default "/hook", must match .travis.yml notifications webhooks
	WebHookPort      string    // From GHA2DB_WHPORT, webhook tool, default ":1982", note that webhook listens using http:1982, but we use apache on https:2982 (to enable https protocol and proxy requests to http:1982)
	CheckPayload     bool      // From GHA2DB_SKIP_VERIFY_PAYLOAD, webhook tool, default true, use GHA2DB_SKIP_VERIFY_PAYLOAD=1 to manually test payloads
	DeployBranches   []string  // From GHA2DB_DEPLOY_BRANCHES, webhook tool, default "master" - comma separated list
	DeployStatuses   []string  // From GHA2DB_DEPLOY_STATUSES, webhook tool, default "Passed,Fixed", - comma separated list
	DeployResults    []int     // From GHA2DB_DEPLOY_RESULTS, webhook tool, default "0", - comma separated list
	DeployTypes      []string  // From GHA2DB_DEPLOY_TYPES, webhook tool, default "push", - comma separated list
	ProjectRoot      string    // From GHA2DB_PROJECT_ROOT, webhook tool, no default, must be specified to run webhook tool
	ExecFatal        bool      // default true, set this manually to false to avoid lib.ExecCommand calling os.Exit() on failure and return error instead
}

Ctx - environment context packed in structure

func (*Ctx) Init

func (ctx *Ctx) Init()

Init - get context from environment variables

func (*Ctx) Print

func (ctx *Ctx) Print()

Print context contents

type Dummy

type Dummy struct{}

Dummy - structure with no data. A pointer to this struct is used to test whether such a field was present in the JSON or not

type Event

type Event struct {
	ID        string    `json:"id"`
	Type      string    `json:"type"`
	Public    bool      `json:"public"`
	CreatedAt time.Time `json:"created_at"`
	Actor     Actor     `json:"actor"`
	Repo      Repo      `json:"repo"`
	Org       *Org      `json:"org"`
	Payload   Payload   `json:"payload"`
}

Event - full GHA (GitHub Archive) event structure

type EventOld

type EventOld struct {
	ID         string      `json:"-"`
	Type       string      `json:"type"`
	Public     bool        `json:"public"`
	CreatedAt  time.Time   `json:"created_at"`
	Actor      string      `json:"actor"`
	Repository ForkeeOld   `json:"repository"`
	Payload    *PayloadOld `json:"payload"`
}

EventOld - full GHA (GitHub Archive) event structure, before 2015

type Forkee

type Forkee struct {
	ID              int        `json:"id"`
	Name            string     `json:"name"`
	FullName        string     `json:"full_name"`
	Owner           Actor      `json:"owner"`
	Description     *string    `json:"description"`
	Public          *bool      `json:"public"`
	Fork            bool       `json:"fork"`
	CreatedAt       time.Time  `json:"created_at"`
	UpdatedAt       time.Time  `json:"updated_at"`
	PushedAt        *time.Time `json:"pushed_at"`
	Homepage        *string    `json:"homepage"`
	Size            int        `json:"size"`
	StargazersCount int        `json:"stargazers_count"`
	HasIssues       bool       `json:"has_issues"`
	HasProjects     *bool      `json:"has_projects"`
	HasDownloads    bool       `json:"has_downloads"`
	HasWiki         bool       `json:"has_wiki"`
	HasPages        *bool      `json:"has_pages"`
	Forks           int        `json:"forks"`
	OpenIssues      int        `json:"open_issues"`
	Watchers        int        `json:"watchers"`
	DefaultBranch   string     `json:"default_branch"`
}

Forkee - GHA Forkee structure

type ForkeeOld

type ForkeeOld struct {
	ID            int        `json:"id"`
	CreatedAt     time.Time  `json:"created_at"`
	Description   *string    `json:"description"`
	Fork          bool       `json:"fork"`
	Forks         int        `json:"forks"`
	HasDownloads  bool       `json:"has_downloads"`
	HasIssues     bool       `json:"has_issues"`
	HasWiki       bool       `json:"has_wiki"`
	Homepage      *string    `json:"homepage"`
	Language      *string    `json:"language"`
	DefaultBranch string     `json:"master_branch"`
	Name          string     `json:"name"`
	OpenIssues    int        `json:"open_issues"`
	Organization  *string    `json:"organization"`
	Owner         string     `json:"owner"`
	Private       *bool      `json:"private"`
	PushedAt      *time.Time `json:"pushed_at"`
	Size          int        `json:"size"`
	Stargazers    int        `json:"stargazers"`
	Watchers      int        `json:"watchers"`
}

ForkeeOld - GHA Forkee structure (from before 2015). Handles the 4 last properties being missing (including two non-nulls!)

type Issue

type Issue struct {
	ID          int        `json:"id"`
	Number      int        `json:"number"`
	Comments    int        `json:"comments"`
	Title       string     `json:"title"`
	State       string     `json:"state"`
	Locked      bool       `json:"locked"`
	Body        *string    `json:"body"`
	User        Actor      `json:"user"`
	Assignee    *Actor     `json:"assignee"`
	Labels      []Label    `json:"labels"`
	Assignees   []Actor    `json:"assignees"`
	Milestone   *Milestone `json:"milestone"`
	CreatedAt   time.Time  `json:"created_at"`
	UpdatedAt   time.Time  `json:"updated_at"`
	ClosedAt    *time.Time `json:"closed_at"`
	PullRequest *Dummy     `json:"pull_request"`
}

Issue - GHA Issue structure

type Label

type Label struct {
	ID      *int   `json:"id"`
	Name    string `json:"name"`
	Color   string `json:"color"`
	Default *bool  `json:"default"`
}

Label - GHA Label structure

type Milestone

type Milestone struct {
	ID           int        `json:"id"`
	Name         string     `json:"name"`
	Number       int        `json:"number"`
	Title        string     `json:"title"`
	Description  *string    `json:"description"`
	Creator      *Actor     `json:"creator"`
	OpenIssues   int        `json:"open_issues"`
	ClosedIssues int        `json:"closed_issues"`
	State        string     `json:"state"`
	CreatedAt    time.Time  `json:"created_at"`
	UpdatedAt    time.Time  `json:"updated_at"`
	ClosedAt     *time.Time `json:"closed_at"`
	DueOn        *time.Time `json:"due_on"`
}

Milestone - GHA Milestone structure

type Org

type Org struct {
	ID    int    `json:"id"`
	Login string `json:"login"`
}

Org - GHA Org structure

type Page

type Page struct {
	SHA    string `json:"sha"`
	Action string `json:"action"`
	Title  string `json:"title"`
}

Page - GHA Page structure

type Payload

type Payload struct {
	PushID       *int         `json:"push_id"`
	Size         *int         `json:"size"`
	Ref          *string      `json:"ref"`
	Head         *string      `json:"head"`
	Before       *string      `json:"before"`
	Action       *string      `json:"action"`
	RefType      *string      `json:"ref_type"`
	MasterBranch *string      `json:"master_branch"`
	Description  *string      `json:"description"`
	Number       *int         `json:"number"`
	Forkee       *Forkee      `json:"forkee"`
	Release      *Release     `json:"release"`
	Member       *Actor       `json:"member"`
	Issue        *Issue       `json:"issue"`
	Comment      *Comment     `json:"comment"`
	Commits      *[]Commit    `json:"commits"`
	Pages        *[]Page      `json:"pages"`
	PullRequest  *PullRequest `json:"pull_request"`
}

Payload - GHA Payload structure

type PayloadOld

type PayloadOld struct {
	Issue        *int           `json:"issue"`
	IssueID      *int           `json:"issue_id"`
	Comment      *Comment       `json:"comment"`
	CommentID    *int           `json:"comment_id"`
	Description  *string        `json:"description"`
	MasterBranch *string        `json:"master_branch"`
	Ref          *string        `json:"ref"`
	Action       *string        `json:"action"`
	RefType      *string        `json:"ref_type"`
	Head         *string        `json:"head"`
	Size         *int           `json:"size"`
	Number       *int           `json:"number"`
	PullRequest  *PullRequest   `json:"pull_request"`
	Member       *Actor         `json:"member"`
	Release      *Release       `json:"release"`
	Pages        *[]Page        `json:"pages"`
	Commit       *string        `json:"commit"`
	SHAs         *[]interface{} `json:"shas"`
	Repository   *Forkee        `json:"repository"`
	Team         *Team          `json:"team"`
}

PayloadOld - GHA Payload structure (from before 2015)

type PullRequest

type PullRequest struct {
	ID                  int        `json:"id"`
	Base                Branch     `json:"base"`
	Head                Branch     `json:"head"`
	User                Actor      `json:"user"`
	Number              int        `json:"number"`
	State               string     `json:"state"`
	Locked              *bool      `json:"locked"`
	Title               string     `json:"title"`
	Body                *string    `json:"body"`
	CreatedAt           time.Time  `json:"created_at"`
	UpdatedAt           time.Time  `json:"updated_at"`
	ClosedAt            *time.Time `json:"closed_at"`
	MergedAt            *time.Time `json:"merged_at"`
	MergeCommitSHA      *string    `json:"merge_commit_sha"`
	Assignee            *Actor     `json:"assignee"`
	Assignees           *[]Actor   `json:"assignees"`
	RequestedReviewers  *[]Actor   `json:"requested_reviewers"`
	Milestone           *Milestone `json:"milestone"`
	Merged              *bool      `json:"merged"`
	Mergeable           *bool      `json:"mergeable"`
	MergedBy            *Actor     `json:"merged_by"`
	MergeableState      *string    `json:"mergeable_state"`
	Rebaseable          *bool      `json:"rebaseable"`
	Comments            *int       `json:"comments"`
	ReviewComments      *int       `json:"review_comments"`
	MaintainerCanModify *bool      `json:"maintainer_can_modify"`
	Commits             *int       `json:"commits"`
	Additions           *int       `json:"additions"`
	Deletions           *int       `json:"deletions"`
	ChangedFiles        *int       `json:"changed_files"`
}

PullRequest - GHA Pull Request structure

type Release

type Release struct {
	ID              int        `json:"id"`
	TagName         string     `json:"tag_name"`
	TargetCommitish string     `json:"target_commitish"`
	Name            *string    `json:"name"`
	Draft           bool       `json:"draft"`
	Author          Actor      `json:"author"`
	Prerelease      bool       `json:"prerelease"`
	CreatedAt       time.Time  `json:"created_at"`
	PublishedAt     *time.Time `json:"published_at"`
	Body            *string    `json:"body"`
	Assets          []Asset    `json:"assets"`
}

Release - GHA Release structure

type Repo

type Repo struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

Repo - GHA Repo structure

type Team

type Team struct {
	ID         int    `json:"id"`
	Name       string `json:"name"`
	Slug       string `json:"slug"`
	Permission string `json:"permission"`
}

Team - GHA Team structure (only used before 2015)

Directories

Path Synopsis
cmd
