Documentation ¶
Overview ¶
Package integrations provides HTTP clients for package registry APIs.
This package contains low-level API clients for fetching package metadata from various registries. Each registry has its own subpackage:
- [pypi]: Python Package Index
- [npm]: Node Package Manager
- [crates]: Rust crates.io
- [rubygems]: Ruby gems
- [packagist]: PHP Composer packages
- [maven]: Java Maven Central
- [goproxy]: Go Module Proxy
- [github]: GitHub API for metadata enrichment
- [gitlab]: GitLab API for metadata enrichment
Client Pattern ¶
All registry clients follow a consistent pattern:
client, err := pypi.NewClient(24 * time.Hour) // Cache TTL
pkg, err := client.FetchPackage(ctx, "fastapi", false) // false = use cache
Clients handle:
- HTTP requests with retry and rate limiting
- Response caching (file-based, configurable TTL)
- API-specific parsing and normalization
Shared Infrastructure ¶
The Client type provides shared HTTP functionality used by all registry clients, including HTTP response caching via cache.Cache.
Adding a New Registry ¶
To add support for a new package registry:
- Create a subpackage: pkg/integrations/<registry>/
- Define response structs matching the API schema
- Implement a Client with a FetchPackage method
- Build it on the shared Client returned by NewClient to get HTTP handling with caching
- Wire into [deps] as a new language
[pypi]: github.com/matzehuels/stacktower/pkg/integrations/pypi
[npm]: github.com/matzehuels/stacktower/pkg/integrations/npm
[crates]: github.com/matzehuels/stacktower/pkg/integrations/crates
[rubygems]: github.com/matzehuels/stacktower/pkg/integrations/rubygems
[packagist]: github.com/matzehuels/stacktower/pkg/integrations/packagist
[maven]: github.com/matzehuels/stacktower/pkg/integrations/maven
[goproxy]: github.com/matzehuels/stacktower/pkg/integrations/goproxy
[github]: github.com/matzehuels/stacktower/pkg/integrations/github
[gitlab]: github.com/matzehuels/stacktower/pkg/integrations/gitlab
cache.Cache: github.com/matzehuels/stacktower/pkg/cache.Cache
[deps]: github.com/matzehuels/stacktower/pkg/core/deps
Example (Errors) ¶
package main
import (
"fmt"
"github.com/matzehuels/stacktower/pkg/integrations"
)
func main() {
// Standard errors for registry operations
fmt.Println("ErrNotFound:", integrations.ErrNotFound)
fmt.Println("ErrNetwork:", integrations.ErrNetwork)
}
Output:
ErrNotFound: not found
ErrNetwork: network error
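Because these are sentinel errors, callers would normally test for them with errors.Is rather than comparing strings. The following is a minimal sketch of that pattern. To stay runnable without the stacktower module it declares local stand-ins (errNotFound, errNetwork) for the exported values, and the classify helper is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for integrations.ErrNotFound and integrations.ErrNetwork,
// declared here only so the sketch compiles on its own.
var (
	errNotFound = errors.New("not found")
	errNetwork  = errors.New("network error")
)

// classify is a hypothetical helper that maps an error to a short label.
// errors.Is also matches sentinels that were wrapped with %w.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, errNotFound):
		return "missing"
	case errors.Is(err, errNetwork):
		return "transient"
	default:
		return "other"
	}
}

func main() {
	wrapped := fmt.Errorf("fetch fastapi: %w", errNotFound)
	fmt.Println(classify(wrapped))    // missing
	fmt.Println(classify(errNetwork)) // transient
}
```

With the real package, the same switch would presumably compare against integrations.ErrNotFound and integrations.ErrNetwork.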
Index ¶
- Variables
- func ExtractRepoURL(re *regexp.Regexp, urls map[string]string, homepage string) (owner, repo string, ok bool)
- func NewHTTPClient() *http.Client
- func NormalizePkgName(name string) string
- func NormalizeRepoURL(raw string) string
- func URLEncode(s string) string
- type Client
- func NewClient(c cache.Cache, namespace string, ttl time.Duration, headers map[string]string) *Client
- func (c *Client) Cached(ctx context.Context, key string, refresh bool, v any, fetch func() error) error
- func (c *Client) Get(ctx context.Context, url string, v any) error
- func (c *Client) GetText(ctx context.Context, url string) (string, error)
- func (c *Client) GetWithHeaders(ctx context.Context, url string, headers map[string]string, v any) error
- type Contributor
- type RateLimitedError
- func (e *RateLimitedError) Error() string
- type RepoMetrics
Variables ¶
var (
	// ErrNotFound is returned when a package or resource doesn't exist in the registry.
	// This corresponds to HTTP 404 responses.
	ErrNotFound = cache.ErrNotFound
	// ErrNetwork is returned for HTTP failures (timeouts, connection errors, 5xx responses).
	ErrNetwork = cache.ErrNetwork
)
Sentinel errors - re-exported from cache for API consistency.
Functions ¶
func ExtractRepoURL ¶
func ExtractRepoURL(re *regexp.Regexp, urls map[string]string, homepage string) (owner, repo string, ok bool)
ExtractRepoURL finds GitHub/GitLab owner and repo from package URLs. It searches through urls using standard keys (Source, Repository, Code, Homepage) and falls back to homepage if no match is found.
The re parameter should match URLs and capture:
- Group 1: owner/organization name
- Group 2: repository name
Examples:
re := regexp.MustCompile(`https?://github\.com/([^/]+)/([^/]+)`)
owner, repo, ok := ExtractRepoURL(re, pkg.ProjectURLs, pkg.HomePage)
URLs containing "/sponsors/" are automatically skipped to avoid false positives. The .git suffix is trimmed from the repository name if present.
Parameters:
- re: Regular expression with exactly 2 capture groups (must not be nil)
- urls: Map of URL keys to URL values (may be nil or empty)
- homepage: Fallback homepage URL (may be empty)
Returns:
- owner: The repository owner/organization (empty if not found)
- repo: The repository name without .git suffix (empty if not found)
- ok: true if a valid match was found, false otherwise
This function is safe for concurrent use if re is not mutated. Panics if re is nil.
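The lookup order described above can be sketched with the standard library alone. This is a hypothetical re-implementation for illustration, not the package's source: the lowercase names (extractRepoURL, githubRe) are local to the sketch, and it assumes the standard keys are matched case-sensitively and tried in the order listed.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// githubRe captures owner (group 1) and repository name (group 2).
var githubRe = regexp.MustCompile(`https?://github\.com/([^/]+)/([^/]+)`)

// extractRepoURL is a hypothetical stand-in that follows the documented rules:
// try the standard keys, fall back to homepage, skip "/sponsors/" URLs,
// and trim a trailing ".git" from the repository name.
func extractRepoURL(re *regexp.Regexp, urls map[string]string, homepage string) (owner, repo string, ok bool) {
	candidates := make([]string, 0, 5)
	for _, key := range []string{"Source", "Repository", "Code", "Homepage"} {
		if u, found := urls[key]; found { // reading from a nil map is safe
			candidates = append(candidates, u)
		}
	}
	candidates = append(candidates, homepage)

	for _, u := range candidates {
		if strings.Contains(u, "/sponsors/") {
			continue // sponsor pages look like owner/repo URLs but are not
		}
		m := re.FindStringSubmatch(u)
		if len(m) < 3 {
			continue
		}
		return m[1], strings.TrimSuffix(m[2], ".git"), true
	}
	return "", "", false
}

func main() {
	urls := map[string]string{
		"Homepage": "https://github.com/sponsors/someone",
		"Source":   "https://github.com/fastapi/fastapi.git",
	}
	owner, repo, ok := extractRepoURL(githubRe, urls, "")
	fmt.Println(owner, repo, ok) // fastapi fastapi true
}
```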
func NewHTTPClient ¶
func NewHTTPClient() *http.Client
NewHTTPClient creates an HTTP client with a standard timeout for registry requests. The returned client has a 10-second timeout applied to all requests.
The client is safe for concurrent use by multiple goroutines. Returns a new client on every call; clients are not pooled.
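A client with the properties described above can be built from net/http directly. The sketch below is an assumption about the shape of such a constructor (newHTTPClient is a local name), not a copy of the package's code.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newHTTPClient is a hypothetical equivalent: a fresh *http.Client per call
// whose Timeout bounds the whole request, including reading the body.
func newHTTPClient() *http.Client {
	return &http.Client{Timeout: 10 * time.Second}
}

func main() {
	a, b := newHTTPClient(), newHTTPClient()
	fmt.Println(a.Timeout) // 10s
	fmt.Println(a == b)    // false: every call returns a distinct client
}
```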
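func NormalizePkgName ¶
func NormalizePkgName(name string) string
NormalizePkgName converts a package name to its canonical form: it trims surrounding whitespace, lowercases the name, and replaces underscores with hyphens. This is modeled on the PEP 503 normalization rules used by PyPI. Note that PEP 503 itself also treats periods as separators and collapses runs of separators, which the steps listed below do not mention.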
Normalization steps:
- Trim leading and trailing whitespace
- Convert to lowercase
- Replace all underscores with hyphens
Examples:
NormalizePkgName("FastAPI") → "fastapi"
NormalizePkgName("my_package") → "my-package"
NormalizePkgName(" Spaces ") → "spaces"
An empty string input returns an empty string. This function is safe for concurrent use.
Example ¶
package main
import (
"fmt"
"github.com/matzehuels/stacktower/pkg/integrations"
)
func main() {
// Package names are normalized to lowercase with hyphens
fmt.Println(integrations.NormalizePkgName("FastAPI"))
fmt.Println(integrations.NormalizePkgName("my_package"))
fmt.Println(integrations.NormalizePkgName(" Spaces "))
}
Output:
fastapi
my-package
spaces
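For comparison, here is a small sketch that places the three documented steps next to the complete PEP 503 rule (lowercase, with every run of "-", "_" or "." replaced by a single "-"). Both functions are local illustrations with hypothetical names; whether the package handles periods or repeated separators is not stated in this documentation.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// normalizePkgName mirrors the three documented steps only.
func normalizePkgName(name string) string {
	name = strings.TrimSpace(name)
	name = strings.ToLower(name)
	return strings.ReplaceAll(name, "_", "-")
}

// pep503Seps matches any run of the separators named in PEP 503.
var pep503Seps = regexp.MustCompile(`[-_.]+`)

// pep503Normalize applies the full PEP 503 rule for comparison.
func pep503Normalize(name string) string {
	return strings.ToLower(pep503Seps.ReplaceAllString(name, "-"))
}

func main() {
	for _, n := range []string{"FastAPI", "my_package", "zope.interface", "a__b"} {
		fmt.Printf("%q -> steps: %q, PEP 503: %q\n", n, normalizePkgName(n), pep503Normalize(n))
	}
}
```

The two agree on names such as "FastAPI" and "my_package" and differ only for inputs like "zope.interface" or "a__b".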
func NormalizeRepoURL ¶
func NormalizeRepoURL(raw string) string
NormalizeRepoURL converts various repository URL formats to canonical HTTPS form. Handles git@, git://, and git+ prefixes, and removes .git suffixes.
Transformations applied:
- git@github.com:user/repo → https://github.com/user/repo
- git://github.com/user/repo → https://github.com/user/repo
- git+https://example.com/repo.git → https://example.com/repo
- https://example.com/repo.git → https://example.com/repo
Returns an empty string if the input is empty or contains only whitespace. Non-git URLs are returned unchanged after whitespace trimming and .git suffix removal. This function is safe for concurrent use.
Example ¶
package main
import (
"fmt"
"github.com/matzehuels/stacktower/pkg/integrations"
)
func main() {
// Various repository URL formats are normalized to HTTPS
fmt.Println(integrations.NormalizeRepoURL("git@github.com:user/repo.git"))
fmt.Println(integrations.NormalizeRepoURL("git://github.com/user/repo"))
fmt.Println(integrations.NormalizeRepoURL("git+https://github.com/user/repo.git"))
fmt.Println(integrations.NormalizeRepoURL("https://github.com/user/repo"))
}
Output:
https://github.com/user/repo
https://github.com/user/repo
https://github.com/user/repo
https://github.com/user/repo
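The listed transformations amount to a few prefix and suffix rewrites. The sketch below is one way to express them with the strings package; normalizeRepoURL is a hypothetical local function, and the order in which the real implementation applies the rules is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeRepoURL is a hypothetical stand-in covering the documented cases.
func normalizeRepoURL(raw string) string {
	s := strings.TrimSpace(raw)
	if s == "" {
		return ""
	}
	s = strings.TrimPrefix(s, "git+") // git+https://... -> https://...
	switch {
	case strings.HasPrefix(s, "git@"):
		// scp-style: git@host:path -> https://host/path
		s = "https://" + strings.Replace(strings.TrimPrefix(s, "git@"), ":", "/", 1)
	case strings.HasPrefix(s, "git://"):
		s = "https://" + strings.TrimPrefix(s, "git://")
	}
	return strings.TrimSuffix(s, ".git")
}

func main() {
	for _, in := range []string{
		"git@github.com:user/repo.git",
		"git://github.com/user/repo",
		"git+https://github.com/user/repo.git",
		"  https://github.com/user/repo  ",
	} {
		fmt.Println(normalizeRepoURL(in)) // each prints https://github.com/user/repo
	}
}
```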
func URLEncode ¶
func URLEncode(s string) string
URLEncode percent-encodes a string for use in URLs. This is a convenience wrapper around url.QueryEscape.
Spaces are encoded as "+", and special characters as "%XX" hex sequences. An empty string returns an empty string. This function is safe for concurrent use.
Example ¶
package main
import (
"fmt"
"github.com/matzehuels/stacktower/pkg/integrations"
)
func main() {
// URL-encode special characters for API queries
fmt.Println(integrations.URLEncode("@scope/package"))
fmt.Println(integrations.URLEncode("package name"))
}
Output:
%40scope%2Fpackage
package+name
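Since the function wraps url.QueryEscape, its output follows query-string rules. The short sketch below contrasts that with url.PathEscape from the same standard package; urlEncode and pathEncode are local stand-ins for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// urlEncode is a local stand-in for the documented wrapper.
func urlEncode(s string) string { return url.QueryEscape(s) }

// pathEncode shows the path-segment alternative from net/url.
func pathEncode(s string) string { return url.PathEscape(s) }

func main() {
	fmt.Println(urlEncode("package name"))  // package+name
	fmt.Println(pathEncode("package name")) // package%20name
}
```

In a path segment a "+" is a literal plus sign, so values that contain spaces and end up in a URL path may be better served by url.PathEscape.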
Types ¶
type Client ¶
type Client struct {
// contains filtered or unexported fields
}
Client provides shared HTTP functionality for all registry API clients. It handles caching, retry logic, and common request headers.
Client is safe for concurrent use by multiple goroutines. The underlying HTTP client, cache, and headers are all goroutine-safe.
Zero values: Do not use an uninitialized Client; always create via NewClient.
func NewClient ¶
func NewClient(c cache.Cache, namespace string, ttl time.Duration, headers map[string]string) *Client
NewClient creates a Client with the given cache and default headers. Headers are applied to all requests made through this client.
Parameters:
- c: Cache for caching HTTP responses. If nil, a NullCache is used (no caching).
- namespace: Cache key prefix for this client (e.g., "pypi:", "npm:").
- ttl: How long to cache responses.
- headers: Default HTTP headers for all requests. Pass nil if no default headers are needed. Common examples: "Authorization", "User-Agent", "Accept".
The returned Client is safe for concurrent use by multiple goroutines.
func (*Client) Cached ¶
func (c *Client) Cached(ctx context.Context, key string, refresh bool, v any, fetch func() error) error
Cached retrieves a value from cache or executes fetch and caches the result. If refresh is true, the cache is bypassed and fetch is always called.
Parameters:
- ctx: Context for cancellation. If ctx is cancelled, fetch is not executed and Cached returns ctx.Err().
- key: Cache key (usually package name or coordinate). Must not be empty.
- refresh: If true, bypass cache and always call fetch. If false, try cache first.
- v: Pointer to store the result. Must be a non-nil pointer to a JSON-serializable type.
- fetch: Function to fetch data and populate v. Called with retry on transient failures.
Behavior:
- If refresh=false and cache hit: returns nil immediately with v populated
- If cache miss or refresh=true: calls fetch with automatic retry on [RetryableError]
- On successful fetch: stores result in cache (ignoring cache write errors)
The fetch function should populate v and return nil on success, or return an error. Network errors should be wrapped with [Retryable] to enable retry.
Returns:
- nil on success (v is populated)
- error from fetch if it fails (v may be partially populated)
- ctx.Err() if context is cancelled
This method is safe for concurrent use on the same Client.
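The cache-or-fetch behavior described above can be illustrated with an in-memory map in place of the file-based cache. Everything in this sketch (memCache, cached, pkgInfo, demo) is hypothetical, and retry and TTL handling are deliberately left out.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"sync"
)

// memCache is a hypothetical in-memory stand-in for the real cache.
type memCache struct {
	mu   sync.Mutex
	data map[string][]byte
}

// cached tries the cache first unless refresh is set, otherwise calls fetch
// and stores the JSON-encoded result.
func (m *memCache) cached(ctx context.Context, key string, refresh bool, v any, fetch func() error) error {
	if err := ctx.Err(); err != nil {
		return err // cancelled: fetch is not executed
	}
	if !refresh {
		m.mu.Lock()
		raw, hit := m.data[key]
		m.mu.Unlock()
		if hit && json.Unmarshal(raw, v) == nil {
			return nil // cache hit, v is populated
		}
	}
	if err := fetch(); err != nil {
		return err
	}
	if raw, err := json.Marshal(v); err == nil { // write errors are ignored
		m.mu.Lock()
		m.data[key] = raw
		m.mu.Unlock()
	}
	return nil
}

type pkgInfo struct {
	Name string `json:"name"`
}

// demo runs miss, hit, and refresh against one key. It returns how often
// fetch ran and the name read back on the cache hit.
func demo() (calls int, hitName string) {
	c := &memCache{data: map[string][]byte{}}
	load := func(refresh bool) pkgInfo {
		var p pkgInfo
		_ = c.cached(context.Background(), "fastapi", refresh, &p, func() error {
			calls++
			p = pkgInfo{Name: "fastapi"}
			return nil
		})
		return p
	}
	load(false)           // miss: fetch runs
	hit := load(false)    // hit: fetch is skipped
	load(true)            // refresh: fetch runs again
	return calls, hit.Name
}

func main() {
	calls, name := demo()
	fmt.Println(calls, name) // 2 fastapi
}
```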
func (*Client) Get ¶
func (c *Client) Get(ctx context.Context, url string, v any) error
Get performs an HTTP GET request and JSON-decodes the response into v. It uses the client's default headers and handles retries automatically.
Parameters:
- ctx: Context for cancellation and timeout
- url: Full URL to request (must be absolute URL with scheme)
- v: Pointer to store decoded JSON response (must be non-nil)
Returns:
- ErrNotFound for HTTP 404 responses
- ErrNetwork wrapped with [RetryableError] for HTTP 5xx responses
- ErrNetwork for connection failures and timeouts
- json decoding errors if response is not valid JSON
This method is safe for concurrent use on the same Client.
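To make the error mapping above concrete, the following sketch runs a plain net/http request against a throwaway httptest server and maps status codes to local sentinel errors. The names getJSON, probeKind, errNotFound and errNetwork are hypothetical, and the retry wrapping mentioned in the list is omitted.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// Local stand-ins for the package's sentinel errors.
var (
	errNotFound = errors.New("not found")
	errNetwork  = errors.New("network error")
)

// getJSON is a hypothetical helper: GET url, map the status, decode JSON into v.
func getJSON(ctx context.Context, hc *http.Client, url string, v any) error {
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
	if err != nil {
		return err
	}
	resp, err := hc.Do(req)
	if err != nil {
		return fmt.Errorf("%w: %v", errNetwork, err) // connection failure or timeout
	}
	defer resp.Body.Close()
	switch {
	case resp.StatusCode == http.StatusNotFound:
		return errNotFound
	case resp.StatusCode >= 500:
		return fmt.Errorf("%w: status %d", errNetwork, resp.StatusCode)
	case resp.StatusCode != http.StatusOK:
		return fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	return json.NewDecoder(resp.Body).Decode(v)
}

// probeKind serves one canned response and labels the outcome of getJSON.
func probeKind(status int, body string) string {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(status)
		fmt.Fprint(w, body)
	}))
	defer srv.Close()

	var out map[string]string
	err := getJSON(context.Background(), srv.Client(), srv.URL, &out)
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, errNotFound):
		return "not found"
	case errors.Is(err, errNetwork):
		return "network"
	default:
		return "other"
	}
}

func main() {
	fmt.Println(probeKind(http.StatusOK, `{"name":"fastapi"}`)) // ok
	fmt.Println(probeKind(http.StatusNotFound, ""))             // not found
	fmt.Println(probeKind(http.StatusBadGateway, ""))           // network
	fmt.Println(probeKind(http.StatusOK, "not json"))           // other
}
```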
func (*Client) GetText ¶
func (c *Client) GetText(ctx context.Context, url string) (string, error)
GetText performs an HTTP GET request and returns the response body as a string. Useful for non-JSON endpoints like go.mod files or plain text responses.
Parameters:
- ctx: Context for cancellation and timeout
- url: Full URL to request (must be absolute URL with scheme)
The entire response body is read into memory. Use caution with large responses. For files larger than a few MB, consider streaming with a custom implementation.
Returns:
- The response body as a string
- ErrNotFound for HTTP 404 responses
- ErrNetwork for connection failures, timeouts, and HTTP 5xx responses
- io errors if reading the response body fails
This method is safe for concurrent use on the same Client.
func (*Client) GetWithHeaders ¶
func (c *Client) GetWithHeaders(ctx context.Context, url string, headers map[string]string, v any) error
GetWithHeaders performs an HTTP GET with additional headers merged with defaults. Request-specific headers override client defaults for the same key.
Parameters:
- ctx: Context for cancellation and timeout
- url: Full URL to request (must be absolute URL with scheme)
- headers: Additional headers for this request only (may be nil). Headers with the same key as client defaults will override the default value for this request.
- v: Pointer to store decoded JSON response (must be non-nil)
Example:
err := client.GetWithHeaders(ctx, url, map[string]string{"X-Custom": "value"}, &resp)
Returns the same errors as [Get]. This method is safe for concurrent use on the same Client.
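The override rule can be pictured as a two-pass merge into an http.Header. This is a sketch of the idea, not the package's code: mergeHeaders is a hypothetical helper, and it assumes keys are compared in canonical form, which is what http.Header.Set does.

```go
package main

import (
	"fmt"
	"net/http"
)

// mergeHeaders applies client defaults first, then request-specific headers,
// so a request value replaces the default stored under the same key.
func mergeHeaders(defaults, extra map[string]string) http.Header {
	h := make(http.Header, len(defaults)+len(extra))
	for k, v := range defaults {
		h.Set(k, v)
	}
	for k, v := range extra { // ranging over a nil map is a no-op
		h.Set(k, v)
	}
	return h
}

func main() {
	defaults := map[string]string{"Accept": "application/json", "User-Agent": "example-agent"}
	h := mergeHeaders(defaults, map[string]string{"accept": "text/plain", "X-Custom": "value"})
	fmt.Println(h.Get("Accept"))     // text/plain
	fmt.Println(h.Get("User-Agent")) // example-agent
	fmt.Println(h.Get("X-Custom"))   // value
}
```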
type Contributor ¶
type Contributor struct {
Login string `json:"login"` // GitHub/GitLab username. Never empty in valid contributors.
Contributions int `json:"contributions"` // Number of commits. Always positive in valid contributors.
}
Contributor represents a repository contributor with their contribution count. Used for bus factor analysis and maintainer identification.
Zero values: Login is empty, Contributions is 0. A Contributor with 0 contributions is invalid. This struct is safe for concurrent reads.
Example ¶
package main
import (
"fmt"
"github.com/matzehuels/stacktower/pkg/integrations"
)
func main() {
// Contributors track commit counts for bus factor analysis
contributors := []integrations.Contributor{
{Login: "maintainer1", Contributions: 500},
{Login: "maintainer2", Contributions: 200},
{Login: "contributor3", Contributions: 50},
}
fmt.Println("Top contributor:", contributors[0].Login)
fmt.Println("Contributions:", contributors[0].Contributions)
}
Output:
Top contributor: maintainer1
Contributions: 500
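The documentation does not say how bus factor is derived from these values. As an illustration only, the sketch below uses one common heuristic: the smallest number of top contributors whose commits together exceed half of all counted commits. The contributor type is a local copy of the struct, and busFactor is a hypothetical function.

```go
package main

import (
	"fmt"
	"sort"
)

// contributor is a local copy of the documented struct.
type contributor struct {
	Login         string `json:"login"`
	Contributions int    `json:"contributions"`
}

// busFactor returns how many top contributors are needed to cover more than
// half of the total contributions (an assumed heuristic, not the package's).
func busFactor(cs []contributor) int {
	sorted := append([]contributor(nil), cs...) // leave the caller's slice untouched
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].Contributions > sorted[j].Contributions
	})
	total := 0
	for _, c := range sorted {
		total += c.Contributions
	}
	if total <= 0 {
		return 0
	}
	covered := 0
	for i, c := range sorted {
		covered += c.Contributions
		if covered*2 > total {
			return i + 1
		}
	}
	return len(sorted)
}

func main() {
	cs := []contributor{
		{Login: "maintainer1", Contributions: 500},
		{Login: "maintainer2", Contributions: 200},
		{Login: "contributor3", Contributions: 50},
	}
	fmt.Println(busFactor(cs)) // 1: maintainer1 alone holds 500 of 750 commits
}
```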
type RateLimitedError ¶ added in v1.0.0
type RateLimitedError struct {
RetryAfter int // Seconds to wait before retrying (0 if unknown)
}
RateLimitedError indicates the API rate limit has been exceeded.
func (*RateLimitedError) Error ¶ added in v1.0.0
func (e *RateLimitedError) Error() string
Error implements the error interface.
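Callers would typically detect this error with errors.As and use RetryAfter to choose a delay. The sketch below uses a local copy of the type so it runs standalone; the error message text and the retryDelaySeconds helper are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// rateLimitedError is a local copy of the documented type.
type rateLimitedError struct {
	RetryAfter int // seconds to wait before retrying (0 if unknown)
}

// Error uses a made-up message; the real wording is not documented here.
func (e *rateLimitedError) Error() string {
	return fmt.Sprintf("rate limited (retry after %ds)", e.RetryAfter)
}

// retryDelaySeconds reports whether err is a rate-limit error and how long
// to wait, falling back to a caller-supplied default when RetryAfter is 0.
func retryDelaySeconds(err error, fallback int) (int, bool) {
	var rl *rateLimitedError
	if !errors.As(err, &rl) {
		return 0, false
	}
	if rl.RetryAfter <= 0 {
		return fallback, true
	}
	return rl.RetryAfter, true
}

func main() {
	err := fmt.Errorf("fetch contributors: %w", &rateLimitedError{RetryAfter: 30})
	fmt.Println(retryDelaySeconds(err, 5))                  // 30 true
	fmt.Println(retryDelaySeconds(&rateLimitedError{}, 5))  // 5 true
	fmt.Println(retryDelaySeconds(errors.New("other"), 5))  // 0 false
}
```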
type RepoMetrics ¶
type RepoMetrics struct {
RepoURL string `json:"repo_url"` // Canonical repository URL (https://...). Never empty in valid metrics.
Owner string `json:"owner"` // Repository owner username. Never empty in valid metrics.
Description string `json:"description,omitempty"` // Repository description from GitHub/GitLab. Empty if not set.
Stars int `json:"stars"` // GitHub/GitLab star count. 0 is a valid value for new repositories.
SizeKB int `json:"size_kb,omitempty"` // Repository size in kilobytes. 0 means not available or very small.
LastCommitAt *time.Time `json:"last_commit_at,omitempty"` // Date of most recent commit. Nil if not available.
LastReleaseAt *time.Time `json:"last_release_at,omitempty"` // Date of most recent release. Nil if no releases or not available.
License string `json:"license,omitempty"` // SPDX license identifier (e.g., "MIT", "Apache-2.0"). Empty if not detected.
Contributors []Contributor `json:"top_contributors,omitempty"` // Top contributors by commit count (typically top 5). Nil or empty if not available.
Language string `json:"language,omitempty"` // Primary repository language (e.g., "Go", "Python"). Empty if not detected.
Topics []string `json:"topics,omitempty"` // Repository topic tags. Nil or empty if none.
Archived bool `json:"archived"` // Whether the repository is archived. False means active or unknown.
}
RepoMetrics holds repository-level data fetched from GitHub or GitLab. Used to enrich package metadata with maintenance and popularity indicators.
Zero values: All string fields are empty, integers are 0, time pointers are nil. Nil Contributors slice is valid and indicates no contributor data was fetched.
This struct is safe for concurrent reads after construction but not for concurrent writes.
Example ¶
package main
import (
"fmt"
"github.com/matzehuels/stacktower/pkg/integrations"
)
func main() {
// RepoMetrics holds repository data from GitHub/GitLab
metrics := integrations.RepoMetrics{
RepoURL: "https://github.com/psf/requests",
Owner: "psf",
Stars: 51000,
Language: "Python",
Archived: false,
}
fmt.Println("Repository:", metrics.RepoURL)
fmt.Println("Stars:", metrics.Stars)
fmt.Println("Archived:", metrics.Archived)
}
Output:
Repository: https://github.com/psf/requests
Stars: 51000
Archived: false
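The struct tags determine which zero values appear in JSON output. The sketch below uses a trimmed local copy of the struct (a subset of the fields, same tags) to show that fields tagged omitempty disappear when unset, while stars and archived are always emitted. The encode helper is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// repoMetrics is a trimmed local copy of the documented struct.
type repoMetrics struct {
	RepoURL      string     `json:"repo_url"`
	Owner        string     `json:"owner"`
	Description  string     `json:"description,omitempty"`
	Stars        int        `json:"stars"`
	LastCommitAt *time.Time `json:"last_commit_at,omitempty"`
	License      string     `json:"license,omitempty"`
	Topics       []string   `json:"topics,omitempty"`
	Archived     bool       `json:"archived"`
}

// encode marshals m and returns the JSON text, or "" on error.
func encode(m repoMetrics) string {
	b, err := json.Marshal(m)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	m := repoMetrics{RepoURL: "https://github.com/psf/requests", Owner: "psf"}
	fmt.Println(encode(m))
	// {"repo_url":"https://github.com/psf/requests","owner":"psf","stars":0,"archived":false}
}
```

A consumer therefore cannot tell "0 stars" from "stars not fetched" in the JSON alone, which matches the field comments stating that 0 is a valid value.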
Directories ¶
| Path | Synopsis |
|---|---|
| crates | Package crates provides an HTTP client for the crates.io API. |
| github | Package github provides an HTTP client for the GitHub API. |
| gitlab | Package gitlab provides an HTTP client for the GitLab API. |
| goproxy | Package goproxy provides an HTTP client for the Go Module Proxy. |
| maven | Package maven provides an HTTP client for Maven Central. |
| npm | Package npm provides an HTTP client for the npm registry API. |
| packagist | Package packagist provides an HTTP client for the Packagist API. |
| pypi | Package pypi provides an HTTP client for the Python Package Index API. |
| rubygems | Package rubygems provides an HTTP client for the RubyGems.org API. |