ale

package module
Version: v0.0.0-...-9090b94
Published: Apr 19, 2021 License: Apache-2.0 Imports: 0 Imported by: 0

README

ALE

Automated Log Extractor

Purpose

This project crawls the Jenkins workflow API and extracts a more structured log, divided into stages. It uses the configured regex to try to extract the timestamp from each log line.

Configuration

The following is the default config:

[server]
address = "0.0.0.0" # IP address to bind
port = 7654 # The Port to bind

[logging]
level = "debug"
format = "text" # Can be json or text

[metadata] # metadata will be presented in the service-metadata route
owner = "${USER}" # Owner of the service

[crawler]
# Regex used to extract the timestamp from the logs.
# Should have two groups, timestamp and log line.
logpattern = '''.*\[([\d{4}\-\d{2}\-\d{2}T\d{2}:\d{2}:\d{2}.\d*Z]*)\].*?\s(.*)$'''

See config_test.toml for more configuration options.
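
To illustrate what the two groups in logpattern capture, here is a minimal sketch that applies the default pattern with Go's regexp package. The helper splitLine and the sample log line are made up for this example; how ALE itself applies the pattern is not shown on this page.

```go
package main

import (
	"fmt"
	"regexp"
)

// defaultLogPattern is the logpattern value from the default config above.
const defaultLogPattern = `.*\[([\d{4}\-\d{2}\-\d{2}T\d{2}:\d{2}:\d{2}.\d*Z]*)\].*?\s(.*)$`

// logPattern is compiled once at start-up; MustCompile panics on a bad pattern.
var logPattern = regexp.MustCompile(defaultLogPattern)

// splitLine is a hypothetical helper: it returns the timestamp (group 1)
// and the remaining log line (group 2). When the pattern does not match,
// the raw line is returned unchanged and ok is false.
func splitLine(raw string) (timestamp, line string, ok bool) {
	m := logPattern.FindStringSubmatch(raw)
	if m == nil {
		return "", raw, false
	}
	return m[1], m[2], true
}

func main() {
	ts, line, ok := splitLine("[2021-04-19T09:46:24.123Z] [WS-CLEANUP] done")
	fmt.Println(ts, "|", line, "|", ok)
	// prints: 2021-04-19T09:46:24.123Z | [WS-CLEANUP] done | true
}
```

Lines that do not match are passed through untouched in this sketch, so a log without timestamps would still be readable.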

PostgreSQL

To use PostgreSQL as a backend, add a config similar to:

[PostgreSQL]
username = "postgres_user"
passwordfile = "/path/to/file/with/password"
host = "postgres.local"
Port = 5432
database = "ale_database_name"
disablessl = true

Datastore

To use Google Datastore as a backend, add a config similar to:

[GoogleCloudDatastore]
namespace = "ale-jenkinslog"
project = "my-gcs-project"

Flow

POST
 user     ALE     Jenkins      Database
-+--------+---------+------------+----
 |        |         |            |
 +------->|         |            |
 |        +--------------------->|
 |<-------+         |            |
 |        +-------->| poll       |
 |        |<--------+ !done      |
 |        +--------------------->|

GET
 user     ALE     Jenkins      Database
-+--------+---------+------------+----
 |        |         |            |
 +------->|         |            |
 |        +--------------------->|
 |        |<---------------------+
 |<-------+         |            |
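
The poll / !done step in the POST diagram can be read as a loop that keeps asking Jenkins for the build status and writes each result to the database until the build is finished. The sketch below is one way to express that loop. The names pollUntilDone, fetchFunc, and runDemo, the status strings, and the callbacks are all assumptions made for illustration, not ALE's actual implementation.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// fetchFunc is a hypothetical stand-in for a call to the Jenkins workflow
// API that reports the current status of a build.
type fetchFunc func(ctx context.Context) (status string, err error)

// pollUntilDone fetches the status, saves it, and repeats after interval
// until isDone reports true or ctx is cancelled. It returns the number of
// successful fetches.
func pollUntilDone(ctx context.Context, interval time.Duration, fetch fetchFunc,
	isDone func(string) bool, save func(string) error) (int, error) {
	polls := 0
	for {
		status, err := fetch(ctx)
		if err != nil {
			return polls, err
		}
		polls++
		if err := save(status); err != nil {
			return polls, err
		}
		if isDone(status) {
			return polls, nil
		}
		select {
		case <-ctx.Done():
			return polls, ctx.Err()
		case <-time.After(interval):
		}
	}
}

// runDemo drives the loop with canned statuses instead of a real Jenkins.
func runDemo() (int, []string, error) {
	statuses := []string{"IN_PROGRESS", "IN_PROGRESS", "SUCCESS"}
	next := 0
	fetch := func(context.Context) (string, error) {
		s := statuses[next]
		if next < len(statuses)-1 {
			next++
		}
		return s, nil
	}
	var saved []string
	save := func(s string) error { saved = append(saved, s); return nil }
	isDone := func(s string) bool { return s != "IN_PROGRESS" }
	polls, err := pollUntilDone(context.Background(), time.Millisecond, fetch, isDone, save)
	return polls, saved, err
}

func main() {
	polls, saved, err := runDemo()
	fmt.Println(polls, saved, err)
}
```

Passing a context lets the caller stop a crawl for a build that never finishes.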

Usage

Process a Build:

curl -XPOST http://ale-server:port/api/v1/process \
    -H "Content-Type: application/json" \
    -d @- << EOF
{
    "buildId": "unique-id-of-build",
    "buildUrl": "http://jenkins.local:8080/job/jobId/262"
}
EOF

response:

201 CREATED
{
    "location": "http://ale-server:port/api/v1/build/unique-id-of-build"
}

If the build has already been crawled, the response is:

302 FOUND
{
    "location": "http://ale-server:port/api/v1/build/unique-id-of-build"
}
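
The same request can be sent from Go. The sketch below posts the JSON body and returns the status code and the location field, so the caller can tell a new crawl (201) from an already crawled build (302). The types processRequest and processResponse and the helpers are named for this example only. newFakeALE is a local stand-in that imitates the two responses above so the example runs without a real server, and its Location header is an assumption.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

type processRequest struct {
	BuildID  string `json:"buildId,omitempty"`
	BuildURL string `json:"buildUrl"`
}

type processResponse struct {
	Location string `json:"location"`
}

// newClient returns a client that does not follow redirects, so a 302
// answer is handed back to the caller instead of being followed.
func newClient() *http.Client {
	return &http.Client{
		CheckRedirect: func(*http.Request, []*http.Request) error {
			return http.ErrUseLastResponse
		},
	}
}

// submitBuild posts req to baseURL and returns the HTTP status code and
// the "location" value from the JSON response body.
func submitBuild(client *http.Client, baseURL string, req processRequest) (int, string, error) {
	body, err := json.Marshal(req)
	if err != nil {
		return 0, "", err
	}
	resp, err := client.Post(baseURL+"/api/v1/process", "application/json", bytes.NewReader(body))
	if err != nil {
		return 0, "", err
	}
	defer resp.Body.Close()
	var out processResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return resp.StatusCode, "", err
	}
	return resp.StatusCode, out.Location, nil
}

// newFakeALE is a stand-in server for this example only: 201 on the first
// POST for a build ID, 302 on later ones.
func newFakeALE() *httptest.Server {
	var mu sync.Mutex
	seen := map[string]bool{}
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		var req processRequest
		if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		mu.Lock()
		known := seen[req.BuildID]
		seen[req.BuildID] = true
		mu.Unlock()
		status := http.StatusCreated
		if known {
			status = http.StatusFound
		}
		loc := "http://" + r.Host + "/api/v1/build/" + req.BuildID
		w.Header().Set("Location", loc) // assumption, not confirmed by the README
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(status)
		json.NewEncoder(w).Encode(processResponse{Location: loc})
	}))
}

func main() {
	srv := newFakeALE()
	defer srv.Close()
	req := processRequest{BuildID: "unique-id-of-build", BuildURL: "http://jenkins.local:8080/job/jobId/262"}
	for i := 0; i < 2; i++ {
		code, loc, err := submitBuild(newClient(), srv.URL, req)
		fmt.Println(code, loc, err)
	}
}
```

Redirects are disabled because Go's default client would otherwise follow a 302 that carries a Location header and hide the status code from the caller.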

Query for build information:

curl http://ale-server:port/api/v1/build/unique-id-of-build \
    -H "Accept: application/json"

response (sample):

200 OK
{
    "stages": [
        {
            "status": "SUCCESS",
            "name": "Preparation - Delete workspace when build is done",
            "log": [
                {
                    "timestamp": "09:46:24", // Format will depend on your log and regex
                    "line": "[WS-CLEANUP] Deleting project workspace..."
                },
                {
                    "timestamp": "09:46:24",
                    "line": "[WS-CLEANUP] Deferred wipeout is used..."
                },
                {
                    "timestamp": "09:46:24",
                    "line": "[WS-CLEANUP] done"
                }
            ],
            "log_length": 1119,
            "start_time": 1548083830768
        }
    ],
    "status": "SUCCESS",
    "name": "#502 - org/repo - refs/pull/65/merge",
    "id": "502",
    "build_id": "597bc093-6824-4287-8161-f558f8022ded"
}
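
A response like the sample above can be decoded with encoding/json. The sketch below uses trimmed local types whose JSON tags follow the sample; the type names and the shortened payload are illustrative. The // comment in the sample is not valid JSON, so it is left out of the payload here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type logLine struct {
	Timestamp string `json:"timestamp"`
	Line      string `json:"line"`
}

type stage struct {
	Status    string    `json:"status"`
	Name      string    `json:"name"`
	Log       []logLine `json:"log"`
	LogLength int       `json:"log_length"`
	StartTime int64     `json:"start_time"`
}

type build struct {
	Stages  []stage `json:"stages"`
	Status  string  `json:"status"`
	Name    string  `json:"name"`
	ID      string  `json:"id"`
	BuildID string  `json:"build_id"`
}

// sampleResponse is a shortened copy of the sample response above.
const sampleResponse = `{
  "stages": [{
    "status": "SUCCESS",
    "name": "Preparation - Delete workspace when build is done",
    "log": [
      {"timestamp": "09:46:24", "line": "[WS-CLEANUP] Deleting project workspace..."},
      {"timestamp": "09:46:24", "line": "[WS-CLEANUP] Deferred wipeout is used..."},
      {"timestamp": "09:46:24", "line": "[WS-CLEANUP] done"}
    ],
    "log_length": 1119,
    "start_time": 1548083830768
  }],
  "status": "SUCCESS",
  "name": "#502 - org/repo - refs/pull/65/merge",
  "id": "502",
  "build_id": "597bc093-6824-4287-8161-f558f8022ded"
}`

// decodeBuild parses a build response body.
func decodeBuild(data []byte) (build, error) {
	var b build
	err := json.Unmarshal(data, &b)
	return b, err
}

func main() {
	b, err := decodeBuild([]byte(sampleResponse))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(b.Name, b.Status)
	for _, s := range b.Stages {
		fmt.Printf("  %s: %s, %d log lines\n", s.Name, s.Status, len(s.Log))
	}
}
```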

API

The POST to start processing takes the following input:

  • buildUrl
    • Required. The URL of the build to crawl. The format should be similar to http://jenkins.internal:8080/job/jobName/714, and should end in the build number.
  • buildId
    • Optional. If provided, it is used as the key of the build.
    • If not provided, a version 4 UUID is generated and used as the key.
    • Needs to be unique.
  • forceRecrawl
    • Optional. If provided, an existing database entry with the same buildId (whether provided or generated) is deleted before the crawl.
    • Defaults to false.
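
The fields above map onto a small request type. The sketch below builds the JSON body and checks on the client side that buildUrl ends in a build number before sending anything. The names processRequest, buildNumber, and encodeRequest are hypothetical, and whether the server performs the same check is not stated here.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
	"path"
	"strconv"
)

// processRequest mirrors the documented input; the optional fields are
// omitted from the JSON when left at their zero value.
type processRequest struct {
	BuildURL     string `json:"buildUrl"`
	BuildID      string `json:"buildId,omitempty"`
	ForceRecrawl bool   `json:"forceRecrawl,omitempty"`
}

// buildNumber returns the trailing build number of a Jenkins build URL,
// or an error when the URL is not absolute or does not end in a number.
func buildNumber(rawURL string) (int, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return 0, err
	}
	if u.Scheme == "" || u.Host == "" {
		return 0, fmt.Errorf("buildUrl %q is not an absolute URL", rawURL)
	}
	n, err := strconv.Atoi(path.Base(u.Path))
	if err != nil || n < 1 {
		return 0, fmt.Errorf("buildUrl %q does not end in a build number", rawURL)
	}
	return n, nil
}

// encodeRequest validates the request and returns its JSON encoding.
func encodeRequest(req processRequest) (string, error) {
	if _, err := buildNumber(req.BuildURL); err != nil {
		return "", err
	}
	body, err := json.Marshal(req)
	return string(body), err
}

func main() {
	body, err := encodeRequest(processRequest{
		BuildURL:     "http://jenkins.internal:8080/job/jobName/714",
		ForceRecrawl: true,
	})
	fmt.Println(body, err)

	_, err = encodeRequest(processRequest{BuildURL: "http://jenkins.internal:8080/job/jobName/"})
	fmt.Println(err)
}
```

Leaving BuildID empty keeps it out of the body, which matches the documented behaviour of letting ALE generate the key.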

Getting more logs from Jenkins API

Set the following JAVA_OPTS when launching Jenkins:

export JAVA_OPTS="${JAVA_OPTS} -Dfile.encoding=UTF-8 -Dcom.cloudbees.workflow.rest.external.FlowNodeLogExt.maxReturnChars=1048576"

TODO

  • Only crawl entries that were not previously marked as done

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type DatastoreEntity

type DatastoreEntity struct {
	Key   string      `json:"key" datastore:"key"`
	Value JenkinsData `json:"value" datastore:"value,noindex"`
}

DatastoreEntity is used to store data in Datastore. The value is tagged noindex to prevent indexing of the large JSON payload.

type JenkinsData

type JenkinsData struct {
	Stages        []*JenkinsStage `json:"stages"`
	Status        string          `json:"status"`
	Name          string          `json:"name"`
	ID            string          `json:"id"`
	BuildID       string          `json:"build_id"`
	StartTime     int             `json:"start_time"`
	EndTime       int             `json:"end_time"`
	Duration      int             `json:"build_duration"`
	QueueDuration int             `json:"queue_duration"`
	PauseDuration int             `json:"pause_duration"`
}

JenkinsData is the topmost level of the flattened structure stored in the database

type JenkinsStage

type JenkinsStage struct {
	Status      string          `json:"status"`
	Name        string          `json:"name"`
	Logs        []*Log          `json:"log"`
	LogLength   int             `json:"log_length"`
	SubStages   []*JenkinsStage `json:"substage"`
	StartTime   int             `json:"start_time"`
	Duration    int             `json:"duration"`
	Task        string          `json:"task"`
	Description string          `json:"description"`
}

JenkinsStage holds the output from a given stage
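
Because SubStages nests JenkinsStage inside itself, code that consumes a stage usually has to recurse. The sketch below counts log lines and lists stage names across all levels. It copies only the fields it needs from the types above; countLogLines and stageNames are illustrative helpers, not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// Log and JenkinsStage are trimmed copies of the documented types.
type Log struct {
	TimeStamp string `json:"timestamp"`
	Line      string `json:"line"`
}

type JenkinsStage struct {
	Status    string          `json:"status"`
	Name      string          `json:"name"`
	Logs      []*Log          `json:"log"`
	SubStages []*JenkinsStage `json:"substage"`
}

// countLogLines returns the number of log lines in s and all its substages.
func countLogLines(s *JenkinsStage) int {
	if s == nil {
		return 0
	}
	n := len(s.Logs)
	for _, sub := range s.SubStages {
		n += countLogLines(sub)
	}
	return n
}

// stageNames appends the name of s and of every nested substage to out,
// indented by two spaces per nesting level (depth-first order).
func stageNames(s *JenkinsStage, depth int, out []string) []string {
	if s == nil {
		return out
	}
	out = append(out, strings.Repeat("  ", depth)+s.Name)
	for _, sub := range s.SubStages {
		out = stageNames(sub, depth+1, out)
	}
	return out
}

// demoStage builds a small made-up stage tree for the example.
func demoStage() *JenkinsStage {
	return &JenkinsStage{
		Name: "Build",
		Logs: []*Log{{TimeStamp: "09:46:24", Line: "start"}},
		SubStages: []*JenkinsStage{
			{Name: "Compile", Logs: []*Log{{Line: "cc a.c"}, {Line: "cc b.c"}}},
			{Name: "Test", SubStages: []*JenkinsStage{
				{Name: "Unit", Logs: []*Log{{Line: "ok"}}},
			}},
		},
	}
}

func main() {
	root := demoStage()
	fmt.Println("log lines:", countLogLines(root))
	for _, name := range stageNames(root, 0, nil) {
		fmt.Println(name)
	}
}
```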

type JobData

type JobData struct {
	Links struct {
		Self      Link `json:"self"`
		Artifacts Link `json:"artifacts"`
	} `json:"_links"`
	Stages              []JobStage `json:"stages"`
	Status              string     `json:"status"`
	Name                string     `json:"name"`
	ID                  string     `json:"id"`
	StartTimeMillis     int        `json:"startTimeMillis"`
	EndTimeMillis       int        `json:"endTimeMillis"`
	DurationMillis      int        `json:"durationMillis"`
	QueueDurationMillis int        `json:"queueDurationMillis"`
	PauseDurationMillis int        `json:"pauseDurationMillis"`
}

JobData holds parts of a Jenkins job
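
The *Millis fields are integer millisecond values. The sketch below decodes a payload shaped like JobData and converts the start time and duration into time.Time and time.Duration. The payload and the helper timing are made up for illustration, and the local type is a trimmed copy that uses int64 so the millisecond timestamps also fit on 32-bit platforms (the documented type uses int).

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

type Link struct {
	Href string `json:"href"`
}

// JobData is a trimmed copy of the documented type.
type JobData struct {
	Links struct {
		Self Link `json:"self"`
	} `json:"_links"`
	Status          string `json:"status"`
	Name            string `json:"name"`
	ID              string `json:"id"`
	StartTimeMillis int64  `json:"startTimeMillis"`
	EndTimeMillis   int64  `json:"endTimeMillis"`
	DurationMillis  int64  `json:"durationMillis"`
}

// samplePayload is a hypothetical response, not captured from a real Jenkins.
const samplePayload = `{
  "_links": {"self": {"href": "/job/jobName/714/wfapi/describe"}},
  "id": "714",
  "name": "#714",
  "status": "SUCCESS",
  "startTimeMillis": 1548083830768,
  "endTimeMillis": 1548083921268,
  "durationMillis": 90500
}`

// decodeJob parses a job description.
func decodeJob(data []byte) (JobData, error) {
	var j JobData
	err := json.Unmarshal(data, &j)
	return j, err
}

// timing converts the millisecond fields into Go time values.
func timing(j JobData) (start time.Time, duration time.Duration) {
	start = time.Unix(0, j.StartTimeMillis*int64(time.Millisecond))
	duration = time.Duration(j.DurationMillis) * time.Millisecond
	return start, duration
}

func main() {
	j, err := decodeJob([]byte(samplePayload))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	start, d := timing(j)
	fmt.Println(j.Name, j.Status, start.UTC().Format(time.RFC3339), d)
}
```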

type JobExecution

type JobExecution struct {
	Links struct {
		Self Link `json:"self"`
		Log  Link `json:"log"`
	} `json:"_links"`
	ID              string          `json:"id"`
	Status          string          `json:"status"`
	Name            string          `json:"name"`
	StartTimeMillis int             `json:"startTimeMillis"`
	DurationMillis  int             `json:"durationMillis"`
	StageFlowNodes  []StageFlowNode `json:"stageFlowNodes"`
}

JobExecution holds information regarding an execution of a job

type JobStage

type JobStage struct {
	Links struct {
		Self Link `json:"self"`
	} `json:"_links"`
	ID     string `json:"id"`
	Status string `json:"status"`
	Name   string `json:"name"`
}

JobStage holds information about a stage of a job

type Link

type Link struct {
	Href string `json:"href"`
}

Link represents a relative URI deeper into the Jenkins API

type Log

type Log struct {
	TimeStamp string `json:"timestamp"`
	Line      string `json:"line"`
}

The Log struct maps to the response value for the structured log

type NodeLog

type NodeLog struct {
	NodeID     string `json:"nodeId"`
	NodeStatus string `json:"nodeStatus"`
	Length     int    `json:"length"`
	HasMore    bool   `json:"hasMore"`
	Text       string `json:"text"`
	ConsoleURL string `json:"consoleUrl"`
}

NodeLog maps to the logs from a node
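
A short sketch of decoding a NodeLog and checking HasMore follows. It assumes that hasMore is set when Jenkins returned only part of the log, which is how the maxReturnChars option in the README would show up in a response; that reading is an assumption, and the payload and the helper isTruncated are made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// NodeLog is copied from the documented type.
type NodeLog struct {
	NodeID     string `json:"nodeId"`
	NodeStatus string `json:"nodeStatus"`
	Length     int    `json:"length"`
	HasMore    bool   `json:"hasMore"`
	Text       string `json:"text"`
	ConsoleURL string `json:"consoleUrl"`
}

// samplePayload is a hypothetical node log response.
const samplePayload = `{
  "nodeId": "6",
  "nodeStatus": "SUCCESS",
  "length": 1119,
  "hasMore": true,
  "text": "[WS-CLEANUP] done\n",
  "consoleUrl": "/job/jobName/714/execution/node/6/log"
}`

// decodeNodeLog parses a node log response.
func decodeNodeLog(data []byte) (NodeLog, error) {
	var n NodeLog
	err := json.Unmarshal(data, &n)
	return n, err
}

// isTruncated reports whether the response signals that more log text
// exists than was returned (assumed meaning of hasMore).
func isTruncated(n NodeLog) bool {
	return n.HasMore
}

func main() {
	n, err := decodeNodeLog([]byte(samplePayload))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	if isTruncated(n) {
		fmt.Printf("node %s: log truncated, see %s\n", n.NodeID, n.ConsoleURL)
	}
}
```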

type StageFlowNode

type StageFlowNode struct {
	Links struct {
		Self Link `json:"self"`
		Log  Link `json:"log"`
	} `json:"_links"`
	ID                   string   `json:"id"`
	Status               string   `json:"status"`
	Name                 string   `json:"name"`
	StartTimeMillis      int      `json:"startTimeMillis"`
	DurationMillis       int      `json:"durationMillis"`
	ParameterDescription string   `json:"parameterDescription"`
	Parents              []string `json:"parentNodes"`
}

StageFlowNode holds information regarding a flow-node in a stage

Source Files

Directories

Path Synopsis
cmd
ale
db
