githubreceiver

Version: v0.150.0
Published: Apr 13, 2026 License: Apache-2.0 Imports: 31 Imported by: 3

README

GitHub Receiver

Status
Stability: development (traces), alpha (metrics)
Distributions: contrib
Code Owners: @adrielp, @crobert-1, @TylerHelmuth

Overview

The GitHub receiver receives data from GitHub via two methods:

  1. Scrapes version control system metrics from GitHub repositories and organizations using the GraphQL and REST APIs.
  2. Receives GitHub Actions events by serving a webhook endpoint, converting those events into traces.

Metrics - Getting Started

The current default set of metrics can be found in documentation.md.

These metrics can be used as leading indicators (capabilities) for the DORA metrics, helping to provide insight into modern-day engineering practices.

The collection interval is common to all scrapers and is set to 30 seconds by default.

Note: Generally speaking, if the vendor allows for anonymous API calls, then you won't have to configure any authentication, but you may only see public repositories and organizations. You may also run into significantly more rate limiting.

github:
    collection_interval: <duration> # default = 30s, recommended 300s
    scrapers:
        scraper/config-1:
        scraper/config-2:
        ...

A more complete example using the GitHub scrapers with authentication is as follows:

extensions:
    bearertokenauth/github:
        token: ${env:GH_PAT}

receivers:
    github:
        initial_delay: 1s
        collection_interval: 60s
        scrapers:
            scraper:
                metrics: #Optional
                    vcs.contributor.count:
                        enabled: true
                github_org: <myfancyorg> 
                search_query: "org:<myfancyorg> topic:<o11yalltheway>" # Recommended optional query override, defaults to "{org,user}:<github_org>"
                concurrency_limit: 50  # Optional: (default: 50)
                merged_pr_lookback_days: 30 # Optional: (default: 30)
                endpoint: "https://selfmanagedenterpriseserver.com" # Optional
                auth:
                    authenticator: bearertokenauth/github
service:
    extensions: [bearertokenauth/github]
    pipelines:
        metrics:
            receivers: [..., github]
            processors: []
            exporters: [...]

Configuration

github_org (required): Specify the GitHub organization or username to scrape.

endpoint (optional): Set this only when using a self-managed GitHub instance (e.g., https://selfmanagedenterpriseserver.com -- SHOULD NOT include api subdomain or /graphql context path).

search_query (optional): A filter to narrow down repositories. Defaults to org:<github_org> (or user:<username>). For example, use repo:<org>/<repo> to target a specific repository. Any valid GitHub search syntax is allowed.

concurrency_limit (optional): Maximum number of concurrent repository processing goroutines. Defaults to 50 to stay under GitHub's 100 concurrent request limit. Set to 0 for unlimited concurrency (not recommended for >100 repositories).

merged_pr_lookback_days (optional): Number of days to query back in time when fetching merged pull requests. Defaults to 30. Set to 0 to fetch all merged PRs.

metrics (optional): Enable or disable metrics scraping. See the metrics documentation for details.

Scraping

Important:

  • The GitHub scraper does not emit metrics for branches that have not had changes since creation from the default branch (trunk).
  • Due to GitHub API limitations, it is possible for the branch time metric to change when rebases occur, recreating the commits with new timestamps.

For additional context on the GitHub scraper's limitations and inner workings, see the Scraping README.

GitHub Personal Access Token (PAT) Setup

To create a GitHub Personal Access Token (PAT), please refer to the official documentation.

Organization or Personal Access: When generating the PAT, select the appropriate Resource owner (either your personal account or the organization) and choose the correct Repository access type. For fine-grained tokens, explicitly configure the necessary Repository permissions or Organization permissions.

Note: The PAT must have read access to the target repositories. If the PAT doesn't have permission to access repositories in the target organization, only the repository count metric will be available. Detailed repository metrics cannot be fetched.

Traces - Getting Started

Workflow tracing support is accomplished through the processing of GitHub Actions webhook events for workflows and jobs. The workflow_job and workflow_run event payloads are then constructed into trace telemetry.

Each GitHub Actions workflow or job, along with its steps, is converted into trace spans, allowing the observation of workflow execution times and of success and failure rates. Each trace ID and span ID is deterministic, which enables the underlying actions to emit telemetry from any command running in any step. This can be achieved by using tools like the run-with-telemetry action and otel-cli. The key is generating IDs in the same way that this GitHub receiver does. The trace_event_handling.go file contains the new*ID functions that generate deterministic IDs.

IMPORTANT - Workflow job names MUST be unique within each workflow so that deterministic span IDs do not conflict with each other. GitHub does not enforce this, but it warns about duplicate job names when linting a workflow.

Receiver Configuration

IMPORTANT - Ensure your WebHook endpoint is secured with a secret and a Web Application Firewall (WAF) or other security measure.

The WebHook configuration exposes the following settings:

  • endpoint: (default = localhost:8080) - The address and port to bind the WebHook to.
  • path: (default = /events) - The path for Action events to be sent to.
  • health_path: (default = /health) - The path for health checks.
  • secret: (optional) - The secret used to validate the payload.
  • required_headers: (optional) - The required header keys and values for incoming requests.
  • service_name: (optional) - The service name for the traces. See the Configuring Service Name section for more information.
  • include_span_events: (default = false) - When set to true, attaches the raw webhook event JSON as a span event. The workflow run event is attached to the workflow run span, and the workflow job event is attached to the job span.

The WebHook configuration block also accepts all the confighttp settings.

An example configuration is as follows:

receivers:
    github:
        webhook:
            endpoint: localhost:19418
            path: /events
            health_path: /health
            secret: ${env:SECRET_STRING_VAR}
            service_name: github-actions  # single logical CI service (See Configuring Service Name section below)
            required_headers:
                WAF-Header: "value"
        scrapers: # The validation expects at least a dummy scraper config
            scraper:
                github_org: open-telemetry

For tracing, all configuration is set under the webhook key. The full set of exposed configuration values can be found in config.go.

Configuring Service Name

The service_name option in the WebHook configuration can be used to set a pre-defined service.name resource attribute for all traces emitted by the receiver. This value should represent the logical service producing telemetry (as defined by OpenTelemetry resource semantics), not individual repositories or components. For CI/CD usage, a typical choice is a single service such as github-actions (optionally paired with service.namespace for ownership and uniqueness).

If you choose to set service_name explicitly, consider running a separate GitHub receiver (and/or GitHub App) for each distinct service that you want to model.

If you do not set service_name, the receiver supports deriving it from repository metadata. You can configure Custom Properties in each GitHub repository by adding a service_name key; all events from that repository will then carry the specified service.name. If no custom property is found, the receiver will derive service.name from the repository name.

Note: Deriving service.name from repositories is a convenience and may be sufficient for small setups. In larger organizations, mapping repositories directly to service.name often leads to many pseudo-services and can make cross-repository analysis harder. Prefer an explicit CI service name when modeling your pipelines as a platform service.

Precedence for setting service.name:

  1. service_name in the WebHook configuration.
  2. service_name key in the repository’s Custom Properties.
  3. service.name derived from the repository name.
  4. service.name defaults to unknown_service per the semantic conventions.

Span Events

When include_span_events is enabled, the receiver attaches the raw GitHub webhook event JSON as a span event to the corresponding span:

  • Workflow Run events: Attached as a span event named github.workflow_run.event to the root workflow run span
  • Workflow Job events: Attached as a span event named github.workflow_job.event to the job span

The raw event is stored in the event.payload attribute as a JSON string. This allows for detailed inspection of the complete webhook payload, including fields that may not be mapped to span attributes.

Note: The raw event payload can be large (typically 5-50KB). Consider the impact on storage and performance before enabling this feature in production environments.

An example configuration with span events enabled:

receivers:
    github:
        webhook:
            endpoint: localhost:19418
            path: /events
            health_path: /health
            secret: ${env:SECRET_STRING_VAR}
            required_headers:
                WAF-Header: "value"
            include_span_events: true
        scrapers: # The validation expects at least a dummy scraper config
            scraper:
                github_org: open-telemetry

Configuring A GitHub App

To configure a GitHub App, you will need to create a new GitHub App within your organization. Refer to the general GitHub App documentation for how to create a GitHub App. During the subscription phase, subscribe to workflow_run and workflow_job events.

Custom Properties as Resource Attributes

The GitHub receiver supports adding custom properties from GitHub repositories as resource attributes in your telemetry data. This allows users to enrich traces and events with additional metadata specific to each repository.

How It Works

When a GitHub webhook event is received, the receiver extracts all custom properties from the repository and adds them as resource attributes with the prefix github.repository.custom_properties.

For example, if your repository has these custom properties:

classification: public
service-tier: critical
slack-support-channel: #observability-alerts
team-name: observability-engineering

They will be added as resource attributes:

github.repository.custom_properties.classification: "public"
github.repository.custom_properties.service_tier: "critical"
github.repository.custom_properties.slack_support_channel: "#observability-alerts"
github.repository.custom_properties.team_name: "observability-engineering"

Key Formatting

To ensure consistency with OpenTelemetry naming conventions, all custom property keys are converted to snake_case format using the following rules:

  1. Hyphens, spaces, and dots are replaced with underscores
  2. Special characters like $ and # are replaced with _dollar_ and _hash_
  3. CamelCase and PascalCase are converted to snake_case by inserting underscores before uppercase letters
  4. Multiple consecutive underscores are replaced with a single underscore

Examples of key transformations:

Original Key     Transformed Key
teamName         team_name
API-Key          api_key
Service.Level    service_level
$Cost            _dollar_cost
#Priority        _hash_priority

Note: The service_name custom property is handled specially and is not added as a resource attribute with the prefix. Instead, it's used to set the service.name resource attribute directly, as described in the Configuring Service Name section.

Migration Notes

Semantic Conventions v1.37.0 Upgrade

The GitHub receiver has been updated to use OpenTelemetry semantic conventions v1.37.0. This brings standardization improvements and better alignment with the broader OpenTelemetry ecosystem.

Breaking Changes

Resource Attributes:

  • organization.name → vcs.owner.name - The resource attribute for organization/owner name has been standardized
  • vcs.vendor.name → vcs.provider.name - The VCS provider attribute has been standardized

Trace Attributes:

  • vcs.ref.head.type → vcs.ref.type - Some trace attributes now use standardized naming

What This Means for Users

For Dashboard and Alerting Users:

  • Update your queries and dashboards to use the new attribute names
  • Old attribute names are no longer emitted starting with this version
  • The schema URL in telemetry data now references OpenTelemetry schemas v1.37.0

For Configuration Users:

  • No configuration changes are required
  • All existing receiver configurations continue to work unchanged

Migration Timeline:

  • Before upgrading: Update downstream systems (dashboards, alerts, queries) to use new attribute names
  • After upgrading: Verify that telemetry data is flowing correctly with the new attributes

For the complete list of semantic convention changes, see the OpenTelemetry semantic conventions v1.37.0 documentation.

Documentation

Constants

const (

	// vcs.repository.name
	AttributeVCSRepositoryName = "vcs.repository.name"

	// vcs.ref.head.name (used in trace generation)
	AttributeVCSRefHead = "vcs.ref.head"

	// vcs.ref.head.revision
	AttributeVCSRefHeadRevision = "vcs.ref.head.revision"

	// vcs.ref.head.type with enum values of branch or tag.
	// Note: This is now standardized in semantic conventions v1.37.0
	AttributeVCSRefHeadType       = "vcs.ref.head.type"
	AttributeVCSRefHeadTypeBranch = "branch"
	AttributeVCSRefHeadTypeTag    = "tag"

	AttributeCICDPipelineRunURLFull = "cicd.pipeline.run.url.full" // equivalent to GitHub's `html_url`

	// CICD pipeline and task run status attributes for GitHub workflow integration
	AttributeCICDPipelineRunStatus             = "cicd.pipeline.run.status" // equivalent to GitHub's `conclusion`
	AttributeCICDPipelineRunStatusSuccess      = "success"
	AttributeCICDPipelineRunStatusFailure      = "failure"
	AttributeCICDPipelineRunStatusCancellation = "cancellation"

	AttributeCICDPipelineRunStatusSkip = "skip"

	AttributeCICDPipelineTaskRunStatus             = "cicd.pipeline.run.task.status" // equivalent to GitHub's `conclusion`
	AttributeCICDPipelineTaskRunStatusSuccess      = "success"
	AttributeCICDPipelineTaskRunStatusFailure      = "failure"
	AttributeCICDPipelineTaskRunStatusCancellation = "cancellation"
	AttributeCICDPipelineTaskRunStatusSkip         = "skip"

	// The following GitHub-specific attributes are not part of semantic conventions v1.37.0.
	AttributeCICDPipelineRunSenderLogin     = "cicd.pipeline.run.sender.login"      // GitHub's Run Sender Login
	AttributeCICDPipelineTaskRunSenderLogin = "cicd.pipeline.task.run.sender.login" // GitHub's Task Sender Login

	AttributeCICDPipelinePreviousAttemptURLFull = "cicd.pipeline.run.previous_attempt.url.full"
	AttributeCICDPipelineWorkerID               = "cicd.pipeline.worker.id"          // GitHub's Runner ID
	AttributeCICDPipelineWorkerGroupID          = "cicd.pipeline.worker.group.id"    // GitHub's Runner Group ID
	AttributeCICDPipelineWorkerName             = "cicd.pipeline.worker.name"        // GitHub's Runner Name
	AttributeCICDPipelineWorkerGroupName        = "cicd.pipeline.worker.group.name"  // GitHub's Runner Group Name
	AttributeCICDPipelineWorkerNodeID           = "cicd.pipeline.worker.node.id"     // GitHub's Runner Node ID
	AttributeCICDPipelineWorkerLabels           = "cicd.pipeline.worker.labels"      // GitHub's Runner Labels
	AttributeCICDPipelineRunQueueDuration       = "cicd.pipeline.run.queue.duration" // GitHub's Queue Duration

	// The following attributes are exclusive to GitHub but not listed under
	// vendor extensions within semantic conventions v1.37.0.
	AttributeGitHubRepositoryCustomProperty = "github.repository.custom_properties" // GitHub's Repository Custom Properties (used in custom property processing)

	// github.reference.workflow acts as a template attribute where it'll be
	// joined with a `name` and a `version` value. There is an unknown amount of
	// reference workflows that are sent as a list of strings by GitHub making
	// it necessary to leverage template attributes. One key thing to note is
	// the length of the names. Evaluate if this causes issues.
	// WARNING: Extremely long workflow file names could create extremely long
	// attribute keys which could lead to unknown issues in the backend and
	// create additional memory usage overhead when processing data (though
	// unlikely).
	// TODO: Evaluate if there is a need to truncate long workflow files names.
	// eg. github.reference.workflow.my-great-workflow.path
	// eg. github.reference.workflow.my-great-workflow.version
	// eg. github.reference.workflow.my-great-workflow.revision
	AttributeGitHubReferenceWorkflow = "github.reference.workflow"

	// SECURITY: This information will always exist on the repository, but may
	// be considered private if the repository is set to private. Care should be
	// taken in the data pipeline for sanitizing sensitive user information if
	// the user deems it as such.
	AttributeVCSRefHeadRevisionAuthorName  = "vcs.ref.head.revision.author.name"  // GitHub's Head Revision Author Name
	AttributeVCSRefHeadRevisionAuthorEmail = "vcs.ref.head.revision.author.email" // GitHub's Head Revision Author Email

)

model.go contains custom attributes that complement the standardized attributes from OpenTelemetry semantic conventions v1.37.0. While many VCS and CICD attributes are now standardized, these custom attributes provide GitHub-specific functionality not yet covered by the standard semantic conventions.

Variables

This section is empty.

Functions

func NewFactory

func NewFactory() receiver.Factory

NewFactory creates a factory for the github receiver

Types

type Config

type Config struct {
	scraperhelper.ControllerConfig `mapstructure:",squash"`
	Scrapers                       map[string]internal.Config `mapstructure:"scrapers"`
	metadata.MetricsBuilderConfig  `mapstructure:",squash"`
	WebHook                        WebHook `mapstructure:"webhook"`
}

Config that is exposed to this github receiver through the OTEL config.yaml

func (*Config) Unmarshal

func (cfg *Config) Unmarshal(componentParser *confmap.Conf) error

Unmarshal a config.Parser into the config struct.

func (*Config) Validate

func (cfg *Config) Validate() error

Validate the configuration passed through the OTEL config.yaml

type GitHubHeaders added in v0.120.0

type GitHubHeaders struct {
	Customizable map[string]string `mapstructure:","` // can be overwritten via required_headers
	Fixed        map[string]string `mapstructure:","` // are not allowed to be overwritten
}

type WebHook added in v0.116.0

type WebHook struct {
	confighttp.ServerConfig `mapstructure:",squash"`       // squash ensures fields are correctly decoded in embedded struct
	Path                    string                         `mapstructure:"path"`             // path for data collection. Default is /events
	HealthPath              string                         `mapstructure:"health_path"`      // path for health check api. Default is /health
	RequiredHeaders         map[string]configopaque.String `mapstructure:"required_headers"` // optional setting to set one or more required headers for all requests to have (except the health check)
	GitHubHeaders           GitHubHeaders                  `mapstructure:",squash"`          // GitHub headers set by default
	Secret                  string                         `mapstructure:"secret"`           // secret for webhook
	ServiceName             string                         `mapstructure:"service_name"`
	IncludeSpanEvents       bool                           `mapstructure:"include_span_events"` // attach raw webhook event JSON as span events
}
