github

package
v0.8.0
Published: Oct 29, 2025 License: Apache-2.0 Imports: 16 Imported by: 0

Documentation

Overview

Package github fetches pull request data from GitHub using prx or turnserver.

Constants

This section is empty.

Variables

This section is empty.

Functions

func CalculateActualTimeWindow added in v0.7.0

func CalculateActualTimeWindow(prs []PRSummary, requestedDays int) (actualDays int, hitLimit bool)

CalculateActualTimeWindow validates time coverage for the fetched PRs. With the multi-query approach, we fetch PRs to cover the full requested period. This function logs coverage statistics but always returns the requested period.

Parameters:

  • prs: List of PRs fetched (may be from multiple queries)
  • requestedDays: Number of days originally requested

Returns:

  • actualDays: Always returns requestedDays (multi-query ensures coverage)
  • hitLimit: Always returns false (no period adjustment needed)

func CountBotPRs added in v0.8.0

func CountBotPRs(prs []PRSummary) int

CountBotPRs counts how many PRs in the list are authored by bots.

func CountOpenPRsInOrg added in v0.7.0

func CountOpenPRsInOrg(ctx context.Context, org, token string) (int, error)

CountOpenPRsInOrg counts all open PRs across an entire GitHub organization with a single GraphQL query. This is much more efficient than counting PRs repo-by-repo for organizations with many repositories. Only counts PRs created more than 24 hours ago to exclude brand-new PRs.

func CountOpenPRsInRepo added in v0.7.0

func CountOpenPRsInRepo(ctx context.Context, owner, repo, token string) (int, error)

CountOpenPRsInRepo queries the GitHub GraphQL API for the total count of open PRs in a repository that were created more than 24 hours ago. PRs open for less than 24 hours do not yet count as tracking overhead.

Parameters:

  • ctx: Context for the API call
  • owner: GitHub repository owner
  • repo: GitHub repository name
  • token: GitHub authentication token

Returns:

  • count: Number of open PRs created >24 hours ago

func CountUniqueAuthors added in v0.7.0

func CountUniqueAuthors(prs []PRSummary) int

CountUniqueAuthors counts the number of unique authors in a slice of PRSummary. Bot authors are excluded from the count.

func FetchPRData

func FetchPRData(ctx context.Context, prURL string, token string, updatedAt time.Time) (cost.PRData, error)

FetchPRData retrieves pull request information from GitHub and converts it to the format needed for cost calculation.

Uses prx's CacheClient for disk-based caching with automatic cleanup.

The updatedAt parameter enables effective caching. Pass the PR's updatedAt timestamp from GraphQL queries, or time.Now() for fresh data.

Parameters:

  • ctx: Context for the API call
  • prURL: Full GitHub PR URL (e.g., "https://github.com/owner/repo/pull/123")
  • token: GitHub authentication token
  • updatedAt: PR's last update timestamp (for caching) or time.Now() to bypass cache

Returns:

  • cost.PRData with all information needed for cost calculation

func FetchPRDataViaTurnserver added in v0.7.0

func FetchPRDataViaTurnserver(ctx context.Context, prURL string, token string, updatedAt time.Time) (cost.PRData, error)

FetchPRDataViaTurnserver retrieves pull request information from the turnserver and converts it to the format needed for cost calculation.

The turnserver aggregates PR data and analysis, and includes full event history when requested. This is more efficient than calling the GitHub API directly for complete PR data.

The updatedAt parameter enables effective caching on the turnserver side. Pass the PR's updatedAt timestamp from GraphQL queries, or time.Now() for fresh data.

Parameters:

  • ctx: Context for the API call
  • prURL: Full GitHub PR URL (e.g., "https://github.com/owner/repo/pull/123")
  • token: GitHub authentication token
  • updatedAt: PR's last update timestamp (for caching) or time.Now() to bypass cache

Returns:

  • cost.PRData with all information needed for cost calculation

func IsBot added in v0.8.0

func IsBot(author string) bool

IsBot returns true if the author name indicates a bot account.
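
The documentation does not say which names count as bots. The sketch below shows one plausible heuristic, offered as an assumption only: it treats logins with the "[bot]" suffix that GitHub Apps use, or a "-bot" suffix, as bots. The function names are hypothetical, and the real IsBot may apply different or additional rules.

```go
package main

import (
	"fmt"
	"strings"
)

// isBot is a hypothetical heuristic, not the package's actual rule set.
func isBot(author string) bool {
	a := strings.ToLower(author)
	return strings.HasSuffix(a, "[bot]") || strings.HasSuffix(a, "-bot")
}

// countBots applies the heuristic to a list of author names, in the spirit
// of CountBotPRs.
func countBots(authors []string) int {
	n := 0
	for _, a := range authors {
		if isBot(a) {
			n++
		}
	}
	return n
}

func main() {
	authors := []string{"alice", "dependabot[bot]", "release-bot", "bob"}
	fmt.Println(countBots(authors))
}
```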

func PRDataFromPRX

func PRDataFromPRX(prData *prx.PullRequestData) cost.PRData

PRDataFromPRX converts prx.PullRequestData to cost.PRData. This allows you to use prcost with pre-fetched PR data.

Parameters:

  • prData: PullRequestData from prx package

Returns:

  • cost.PRData with all information needed for cost calculation

Types

type PRDataWithAnalysis added in v0.8.0

type PRDataWithAnalysis struct {
	PRData   cost.PRData
	Analysis turn.Analysis
}

PRDataWithAnalysis combines PR data with turnserver analysis.

func FetchPRDataWithAnalysisViaTurnserver added in v0.8.0

func FetchPRDataWithAnalysisViaTurnserver(ctx context.Context, prURL string, token string, updatedAt time.Time) (PRDataWithAnalysis, error)

FetchPRDataWithAnalysisViaTurnserver retrieves pull request information and analysis from the turnserver. This includes both the PR data needed for cost calculation and the workflow analysis (seconds_in_state, workflow_state, etc.).

Parameters:

  • ctx: Context for the API call
  • prURL: Full GitHub PR URL (e.g., "https://github.com/owner/repo/pull/123")
  • token: GitHub authentication token
  • updatedAt: PR's last update timestamp (for caching) or time.Now() to bypass cache

Returns:

  • PRDataWithAnalysis containing both cost.PRData and turn.Analysis

type PRSummary added in v0.7.0

type PRSummary struct {
	UpdatedAt time.Time
	Owner     string
	Repo      string
	Author    string
	Number    int
}

PRSummary holds minimal information about a PR for sampling and fetching.

func FetchPRsFromOrg added in v0.7.0

func FetchPRsFromOrg(ctx context.Context, org string, since time.Time, token string, progress ProgressCallback) ([]PRSummary, error)

FetchPRsFromOrg queries the GitHub GraphQL Search API for all PRs across an organization modified since the specified date.

Uses an adaptive multi-query strategy for comprehensive time coverage:

  1. Query recent activity (updated DESC) - get up to 1000 PRs
  2. If the limit is hit, query old activity (updated ASC) - get ~500 more
  3. Check the gap between the oldest "recent" PR and the newest "old" PR
  4. If the gap is greater than 1 week, query the early period (created ASC) - get ~250 more

Parameters:

  • ctx: Context for the API call
  • org: GitHub organization name
  • since: Only include PRs updated after this time
  • token: GitHub authentication token
  • progress: Optional callback for progress updates (can be nil)

Returns:

  • Slice of PRSummary for all matching PRs (deduplicated)

func FetchPRsFromRepo added in v0.7.0

func FetchPRsFromRepo(ctx context.Context, owner, repo string, since time.Time, token string, progress ProgressCallback) ([]PRSummary, error)

FetchPRsFromRepo queries the GitHub GraphQL API for all PRs in a repository modified since the specified date.

Uses an adaptive multi-query strategy for comprehensive time coverage:

  1. Query recent activity (updated DESC) - get up to 1000 PRs
  2. If the limit is hit, query old activity (updated ASC) - get ~500 more
  3. Check the gap between the oldest "recent" PR and the newest "old" PR
  4. If the gap is greater than 1 week, query the early period (created ASC) - get ~250 more

Parameters:

  • ctx: Context for the API call
  • owner: GitHub repository owner
  • repo: GitHub repository name
  • since: Only include PRs updated after this time
  • token: GitHub authentication token
  • progress: Optional callback for progress updates (can be nil)

Returns:

  • Slice of PRSummary for all matching PRs (deduplicated)

func SamplePRs added in v0.7.0

func SamplePRs(prs []PRSummary, sampleSize int) []PRSummary

SamplePRs uses a time-bucket strategy to evenly sample PRs across the time range. This ensures samples are distributed throughout the period rather than clustered. Bot-authored PRs are excluded from sampling.

Parameters:

  • prs: List of PRs to sample from
  • sampleSize: Desired number of samples

Returns:

  • Slice of sampled PRs (may be smaller than sampleSize if insufficient PRs)

Strategy:

  • Includes both human and bot-authored PRs
  • Divides time range into buckets equal to sampleSize
  • Selects most recent PR from each bucket
  • If buckets are empty, fills with nearest unused PRs

type ProgressCallback added in v0.8.0

type ProgressCallback func(queryName string, page int, prCount int)

ProgressCallback is called during PR fetching to report progress. Parameters: queryName (e.g., "recent", "old", "early"), page (the current page number), prCount (total PRs fetched so far).

type SimpleFetcher added in v0.8.0

type SimpleFetcher struct {
	Token      string
	DataSource string // "prx" or "turnserver"
}

SimpleFetcher is a PRFetcher that fetches PR data without caching. It uses either prx or turnserver based on configuration.

func (*SimpleFetcher) FetchPRData added in v0.8.0

func (f *SimpleFetcher) FetchPRData(ctx context.Context, prURL string, updatedAt time.Time) (cost.PRData, error)

FetchPRData implements the PRFetcher interface from pkg/cost.
