package module
Version: v0.0.0-...-9090b94
Published: Apr 19, 2021 License: Apache-2.0 Imports: 0 Imported by: 0




Automated Log Extractor


This project crawls the Jenkins workflow API and extracts a more structured log, divided into stages. It uses the configured regex to try to extract the timestamp from each log line.


The following is the default config:

address = "" # IP address to bind
port = 7654 # The Port to bind

level = "debug"
format = "text" # Can be json or text

[metadata] # metadata will be presented in the service-metadata route
owner = "${USER}" # Owner of the service

# Regex used to extract the timestamp from the logs.
# Should have two groups, timestamp and log line.
logpattern = '''.*\[([\d{4}\-\d{2}\-\d{2}T\d{2}:\d{2}:\d{2}.\d*Z]*)\].*?\s(.*)$'''

See config_test.toml for more configuration options.
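
As a rough illustration of how the logpattern above behaves, the following Go sketch compiles the default pattern and splits one line into its two groups. The helper name splitLogLine and the sample log line are assumptions made for this example; they are not taken from the package.

```go
package main

import (
	"fmt"
	"regexp"
)

// defaultLogPattern is copied verbatim from the default config.
const defaultLogPattern = `.*\[([\d{4}\-\d{2}\-\d{2}T\d{2}:\d{2}:\d{2}.\d*Z]*)\].*?\s(.*)$`

// defaultRE is compiled once at start-up; MustCompile panics on a bad pattern.
var defaultRE = regexp.MustCompile(defaultLogPattern)

// splitLogLine is a hypothetical helper. It returns the timestamp group and
// the log line group, or ok=false when the pattern does not match.
func splitLogLine(re *regexp.Regexp, line string) (timestamp, text string, ok bool) {
	m := re.FindStringSubmatch(line)
	if len(m) != 3 {
		return "", line, false
	}
	return m[1], m[2], true
}

func main() {
	// The sample line is made up for illustration.
	ts, text, ok := splitLogLine(defaultRE, "[2021-04-19T09:46:24.123Z] [WS-CLEANUP] done")
	fmt.Println(ts, "|", text, "|", ok)
}
```

Lines that do not carry a bracketed timestamp fall through with ok=false, so a caller can decide whether to keep them without a timestamp.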

PostgreSQL

To use PostgreSQL as a backend, add a config similar to:

username = "postgres_user"
passwordfile = "/path/to/file/with/password"
host = "postgres.local"
Port = 5432
database = "ale_database_name"
disablessl = true
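
One way such settings could be turned into a connection string is sketched below. The postgresConfig struct, its method names, and the choice of sslmode=require when SSL is not disabled are assumptions for illustration only; the package's actual handling is not shown in this document.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// postgresConfig mirrors the keys of the config fragment; the type itself
// is hypothetical.
type postgresConfig struct {
	Username     string
	PasswordFile string
	Host         string
	Port         int
	Database     string
	DisableSSL   bool
}

// readPassword loads the password from PasswordFile and trims surrounding
// whitespace such as a trailing newline.
func (c postgresConfig) readPassword() (string, error) {
	b, err := os.ReadFile(c.PasswordFile)
	if err != nil {
		return "", fmt.Errorf("read password file: %w", err)
	}
	return strings.TrimSpace(string(b)), nil
}

// dsn builds a libpq-style keyword/value connection string. Values that
// contain spaces would need quoting, which this sketch does not handle.
func (c postgresConfig) dsn(password string) string {
	sslmode := "require" // assumption: SSL is required unless disabled
	if c.DisableSSL {
		sslmode = "disable"
	}
	return fmt.Sprintf("host=%s port=%d user=%s password=%s dbname=%s sslmode=%s",
		c.Host, c.Port, c.Username, password, c.Database, sslmode)
}

func main() {
	cfg := postgresConfig{
		Username:   "postgres_user",
		Host:       "postgres.local",
		Port:       5432,
		Database:   "ale_database_name",
		DisableSSL: true,
	}
	// A placeholder password keeps the example independent of the file system.
	fmt.Println(cfg.dsn("placeholder"))
}
```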

To use Google Datastore as a backend, add a config similar to:

namespace = "ale-jenkinslog"
project = "my-gcs-project"


user     ALE     Jenkins      Database
 |        |         |            |
 +------->|         |            |
 |        +--------------------->|
 |<-------+         |            |
 |        +-------->| poll       |
 |        |<--------+ !done      |
 |        +--------------------->|

user     ALE     Jenkins      Database
 |        |         |            |
 +------->|         |            |
 |        +--------------------->|
 |        |<---------------------+
 |<-------+         |            |


Process a Build:

curl -XPOST http://ale-server:port/api/v1/process \
    -H "Content-Type: application/json" \
    -d @- << EOF
{
    "buildId": "unique-id-of-build",
    "buildUrl": "http://jenkins.local:8080/job/jobId/262"
}
EOF

response:

{
    "location": "http://ale-server:port/api/v1/build/unique-id-of-build"
}

If it has already been crawled, the response will be

{
    "location": "http://ale-server:port/api/v1/build/unique-id-of-build"
}

Query for build information:

curl http://ale-server:port/api/v1/build/unique-id-of-build \
    -H "Accept: application/json"

response (sample):

200 OK
{
    "stages": [
        {
            "status": "SUCCESS",
            "name": "Preparation - Delete workspace when build is done",
            "log": [
                {
                    "timestamp": "09:46:24", // Format will depend on your log and regex
                    "line": "[WS-CLEANUP] Deleting project workspace..."
                },
                {
                    "timestamp": "09:46:24",
                    "line": "[WS-CLEANUP] Deferred wipeout is used..."
                },
                {
                    "timestamp": "09:46:24",
                    "line": "[WS-CLEANUP] done"
                }
            ],
            "log_length": 1119,
            "start_time": 1548083830768
        }
    ],
    "status": "SUCCESS",
    "name": "#502 - org/repo - refs/pull/65/merge",
    "id": "502",
    "build_id": "597bc093-6824-4287-8161-f558f8022ded"
}


The POST to start processing takes the following input:

  • buildUrl
    • Required. The URL of the build to crawl. It should look like http://jenkins.internal:8080/job/jobName/714 and should end in the build number.
  • buildId
    • Optional. If provided, it is used as the key of the build.
    • If not provided, a version 4 UUID is generated and used as the key.
    • Must be unique.
  • forceRecrawl
    • Optional. If set, any existing database entry with the same buildId (whether provided or generated) is deleted before the crawl.
    • Defaults to false.
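
The input fields listed above can be sketched as a small Go client. The type and function names (processRequest, encodeProcessRequest, startProcessing) are hypothetical, and omitting empty optional fields is an assumption based on the documented defaults. The sketch does not check HTTP status codes, since they are not documented here.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

// processRequest mirrors the documented POST input.
type processRequest struct {
	BuildURL     string `json:"buildUrl"`
	BuildID      string `json:"buildId,omitempty"`      // optional
	ForceRecrawl bool   `json:"forceRecrawl,omitempty"` // defaults to false
}

// processResponse holds the "location" field of the reply.
type processResponse struct {
	Location string `json:"location"`
}

// encodeProcessRequest validates the required field and encodes the body.
func encodeProcessRequest(req processRequest) ([]byte, error) {
	if req.BuildURL == "" {
		return nil, errors.New("buildUrl is required")
	}
	return json.Marshal(req)
}

// startProcessing posts the request to /api/v1/process and returns the
// location reported by the server.
func startProcessing(client *http.Client, baseURL string, req processRequest) (string, error) {
	body, err := encodeProcessRequest(req)
	if err != nil {
		return "", err
	}
	resp, err := client.Post(baseURL+"/api/v1/process", "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	var out processResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", fmt.Errorf("decode response: %w", err)
	}
	return out.Location, nil
}

func main() {
	// Only the request body is printed, so the example runs without a server.
	body, err := encodeProcessRequest(processRequest{
		BuildURL: "http://jenkins.local:8080/job/jobId/262",
		BuildID:  "unique-id-of-build",
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(body))
}
```

Calling startProcessing(http.DefaultClient, "http://ale-server:port", req) would send the same body that main prints.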

Getting more logs from Jenkins API

Set the following JAVA_OPTS when you launch Jenkins:

export JAVA_OPTS="${JAVA_OPTS} -Dfile.encoding=UTF-8"


  • Only crawl entries that were not previously marked as done






type DatastoreEntity

type DatastoreEntity struct {
	Key   string      `json:"key" datastore:"key"`
	Value JenkinsData `json:"value" datastore:"value,noindex"`
}

DatastoreEntity is used to store data in Datastore and to prevent indexing of the large JSON value

type JenkinsData

type JenkinsData struct {
	Stages        []*JenkinsStage `json:"stages"`
	Status        string          `json:"status"`
	Name          string          `json:"name"`
	ID            string          `json:"id"`
	BuildID       string          `json:"build_id"`
	StartTime     int             `json:"start_time"`
	EndTime       int             `json:"end_time"`
	Duration      int             `json:"build_duration"`
	QueueDuration int             `json:"queue_duration"`
	PauseDuration int             `json:"pause_duration"`
}

JenkinsData is the topmost level of the flattened structure stored in the database
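
The JSON keys in the sample build response match the tags on this structure, so a client could decode a response along the following lines. This is a sketch only: the types are trimmed local copies with hypothetical names, int64 replaces int for the millisecond timestamp so the example also works on 32-bit platforms, and the payload is a shortened version of the sample response.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// logLine, stage and build are trimmed local copies of Log, JenkinsStage
// and JenkinsData, kept small so the example is self-contained.
type logLine struct {
	TimeStamp string `json:"timestamp"`
	Line      string `json:"line"`
}

type stage struct {
	Status    string    `json:"status"`
	Name      string    `json:"name"`
	Logs      []logLine `json:"log"`
	LogLength int       `json:"log_length"`
	StartTime int64     `json:"start_time"`
}

type build struct {
	Stages  []stage `json:"stages"`
	Status  string  `json:"status"`
	Name    string  `json:"name"`
	ID      string  `json:"id"`
	BuildID string  `json:"build_id"`
}

// sampleResponse is a shortened copy of the sample build response.
const sampleResponse = `{
  "stages": [
    {
      "status": "SUCCESS",
      "name": "Preparation - Delete workspace when build is done",
      "log": [
        {"timestamp": "09:46:24", "line": "[WS-CLEANUP] Deleting project workspace..."},
        {"timestamp": "09:46:24", "line": "[WS-CLEANUP] done"}
      ],
      "log_length": 1119,
      "start_time": 1548083830768
    }
  ],
  "status": "SUCCESS",
  "name": "#502 - org/repo - refs/pull/65/merge",
  "id": "502",
  "build_id": "597bc093-6824-4287-8161-f558f8022ded"
}`

// decodeBuild parses a build response body.
func decodeBuild(data []byte) (*build, error) {
	var b build
	if err := json.Unmarshal(data, &b); err != nil {
		return nil, fmt.Errorf("decode build: %w", err)
	}
	return &b, nil
}

func main() {
	b, err := decodeBuild([]byte(sampleResponse))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s (%s)\n", b.Name, b.Status)
	for _, s := range b.Stages {
		fmt.Printf("  %s: %d log entries\n", s.Name, len(s.Logs))
	}
}
```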

type JenkinsStage

type JenkinsStage struct {
	Status      string          `json:"status"`
	Name        string          `json:"name"`
	Logs        []*Log          `json:"log"`
	LogLength   int             `json:"log_length"`
	SubStages   []*JenkinsStage `json:"substage"`
	StartTime   int             `json:"start_time"`
	Duration    int             `json:"duration"`
	Task        string          `json:"task"`
	Description string          `json:"description"`
}

JenkinsStage holds the output from a given stage
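
Because a stage can contain substages, consumers usually need a recursive walk to reach every log line. The sketch below shows one possible depth-first traversal over trimmed local copies of the types. The names walk and collect, the path separator, and the ordering (a stage's own logs before its substages) are choices made for this example, not behaviour confirmed by the package.

```go
package main

import "fmt"

// logLine and stage are trimmed local copies of Log and JenkinsStage.
type logLine struct {
	TimeStamp string
	Line      string
}

type stage struct {
	Name      string
	Logs      []logLine
	SubStages []*stage
}

// walk visits every log line depth-first, passing the slash-separated
// path of stage names that leads to it.
func walk(stages []*stage, path string, visit func(path string, l logLine)) {
	for _, s := range stages {
		if s == nil {
			continue
		}
		p := s.Name
		if path != "" {
			p = path + " / " + s.Name
		}
		for _, l := range s.Logs {
			visit(p, l)
		}
		walk(s.SubStages, p, visit)
	}
}

// collect flattens the tree into "path: line" strings.
func collect(stages []*stage) []string {
	var out []string
	walk(stages, "", func(path string, l logLine) {
		out = append(out, path+": "+l.Line)
	})
	return out
}

func main() {
	// Made-up data for illustration.
	tree := []*stage{{
		Name: "Build",
		Logs: []logLine{{TimeStamp: "09:46:24", Line: "starting"}},
		SubStages: []*stage{{
			Name: "Compile",
			Logs: []logLine{{TimeStamp: "09:46:25", Line: "compiling"}},
		}},
	}}
	for _, line := range collect(tree) {
		fmt.Println(line)
	}
}
```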

type JobData

type JobData struct {
	Links struct {
		Self      Link `json:"self"`
		Artifacts Link `json:"artifacts"`
	} `json:"_links"`
	Stages              []JobStage `json:"stages"`
	Status              string     `json:"status"`
	Name                string     `json:"name"`
	ID                  string     `json:"id"`
	StartTimeMillis     int        `json:"startTimeMillis"`
	EndTimeMillis       int        `json:"endTimeMillis"`
	DurationMillis      int        `json:"durationMillis"`
	QueueDurationMillis int        `json:"queueDurationMillis"`
	PauseDurationMillis int        `json:"pauseDurationMillis"`
}

JobData holds parts of a Jenkins job
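
The *Millis fields carry times as integer milliseconds. A minimal sketch of converting them to Go time values follows. The helper names and the small payload are assumptions for illustration, and int64 is used instead of int so the conversion does not depend on the platform's word size.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// jobTimes is a hypothetical subset of JobData with only the time fields.
type jobTimes struct {
	StartTimeMillis int64 `json:"startTimeMillis"`
	DurationMillis  int64 `json:"durationMillis"`
}

// millisToTime converts milliseconds since the Unix epoch to a UTC time.
func millisToTime(ms int64) time.Time {
	return time.Unix(0, ms*int64(time.Millisecond)).UTC()
}

// millisToDuration converts a millisecond count to a time.Duration.
func millisToDuration(ms int64) time.Duration {
	return time.Duration(ms) * time.Millisecond
}

func main() {
	// Illustrative payload, not a captured Jenkins response.
	payload := []byte(`{"startTimeMillis": 1548083830768, "durationMillis": 1500}`)

	var jt jobTimes
	if err := json.Unmarshal(payload, &jt); err != nil {
		fmt.Println("error:", err)
		return
	}
	start := millisToTime(jt.StartTimeMillis)
	fmt.Println("started:", start.Format(time.RFC3339Nano))
	fmt.Println("duration:", millisToDuration(jt.DurationMillis))
	fmt.Println("ended:", start.Add(millisToDuration(jt.DurationMillis)).Format(time.RFC3339Nano))
}
```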

type JobExecution

type JobExecution struct {
	Links struct {
		Self Link `json:"self"`
		Log  Link `json:"log"`
	} `json:"_links"`
	ID              string          `json:"id"`
	Status          string          `json:"status"`
	Name            string          `json:"name"`
	StartTimeMillis int             `json:"startTimeMillis"`
	DurationMillis  int             `json:"durationMillis"`
	StageFlowNodes  []StageFlowNode `json:"stageFlowNodes"`
}

JobExecution holds information regarding an execution of a job

type JobStage

type JobStage struct {
	Links struct {
		Self Link `json:"self"`
	} `json:"_links"`
	ID     string `json:"id"`
	Status string `json:"status"`
	Name   string `json:"name"`
}

JobStage holds information about a stage of a job

type Link

type Link struct {
	Href string `json:"href"`
}

Link represents a relative uri deeper into the Jenkins API

type Log

type Log struct {
	TimeStamp string `json:"timestamp"`
	Line      string `json:"line"`
}

The Log struct maps to the response value for the structured log

type NodeLog

type NodeLog struct {
	NodeID     string `json:"nodeId"`
	NodeStatus string `json:"nodeStatus"`
	Length     int    `json:"length"`
	HasMore    bool   `json:"hasMore"`
	Text       string `json:"text"`
	ConsoleURL string `json:"consoleUrl"`
}

NodeLog maps to the logs from a node

type StageFlowNode

type StageFlowNode struct {
	Links struct {
		Self Link `json:"self"`
		Log  Link `json:"log"`
	} `json:"_links"`
	ID                   string   `json:"id"`
	Status               string   `json:"status"`
	Name                 string   `json:"name"`
	StartTimeMillis      int      `json:"startTimeMillis"`
	DurationMillis       int      `json:"durationMillis"`
	ParameterDescription string   `json:"parameterDescription"`
	Parents              []string `json:"parentNodes"`
}

StageFlowNode holds information regarding a flow-node in a stage
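
Each flow node lists the IDs of its parents in parentNodes. One way to turn a flat list of nodes into a parent-to-children index, which is a common first step when rebuilding a stage/substage hierarchy, is sketched below. Whether the package nests stages this way is an assumption. The flowNode type is a trimmed local copy, and the node data is made up.

```go
package main

import "fmt"

// flowNode is a trimmed local copy of StageFlowNode.
type flowNode struct {
	ID      string   `json:"id"`
	Name    string   `json:"name"`
	Parents []string `json:"parentNodes"`
}

// childrenByParent maps each parent ID to the IDs of the nodes that name
// it as a parent, preserving the input order.
func childrenByParent(nodes []flowNode) map[string][]string {
	children := make(map[string][]string)
	for _, n := range nodes {
		for _, p := range n.Parents {
			children[p] = append(children[p], n.ID)
		}
	}
	return children
}

func main() {
	nodes := []flowNode{
		{ID: "7", Name: "Checkout", Parents: []string{"6"}},
		{ID: "8", Name: "Shell Script", Parents: []string{"7"}},
		{ID: "9", Name: "Archive", Parents: []string{"7"}},
	}
	idx := childrenByParent(nodes)
	fmt.Println("children of 6:", idx["6"])
	fmt.Println("children of 7:", idx["7"])
}
```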
