Documentation ¶
Overview ¶
Package integrations provides HTTP clients for package registry APIs.
This package contains low-level API clients for fetching package metadata from various registries. Each registry has its own subpackage:
- [pypi]: Python Package Index
- [npm]: Node Package Manager
- [crates]: Rust crates.io
- [rubygems]: Ruby gems
- [packagist]: PHP Composer packages
- [maven]: Java Maven Central
- [goproxy]: Go Module Proxy
- [github]: GitHub API for metadata enrichment
- [gitlab]: GitLab API for metadata enrichment
Client Pattern ¶
All registry clients follow a consistent pattern:
client, err := pypi.NewClient(24 * time.Hour) // Cache TTL
pkg, err := client.FetchPackage(ctx, "fastapi", false) // false = use cache
Clients handle:
- HTTP requests with retry and rate limiting
- Response caching (file-based, configurable TTL)
- API-specific parsing and normalization
Shared Infrastructure ¶
The Client type provides shared HTTP functionality used by all registry clients, including caching via httputil.Cache.
Adding a New Registry ¶
To add support for a new package registry:
- Create a subpackage: pkg/integrations/<registry>/
- Define response structs matching the API schema
- Implement a Client with FetchPackage method
- Use NewClient for HTTP with caching
- Wire into [deps] as a new language
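The steps above can be sketched with the standard library alone. Everything in this sketch is hypothetical: the exampleClient type, the packageResponse schema, the /packages/<name> endpoint, and the in-process fake registry are assumptions made for illustration. A real subpackage would build on the shared Client from this package rather than on net/http directly.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"time"
)

// errNotFound stands in for integrations.ErrNotFound in this sketch.
var errNotFound = errors.New("resource not found")

// packageResponse mirrors the JSON schema of a hypothetical registry API.
type packageResponse struct {
	Name    string `json:"name"`
	Version string `json:"version"`
}

// exampleClient is a hypothetical registry client.
type exampleClient struct {
	baseURL string
	http    *http.Client
}

// FetchPackage requests metadata for one package and decodes the JSON body.
func (c *exampleClient) FetchPackage(ctx context.Context, name string) (*packageResponse, error) {
	endpoint := c.baseURL + "/packages/" + url.PathEscape(name)
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, endpoint, nil)
	if err != nil {
		return nil, err
	}
	resp, err := c.http.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode == http.StatusNotFound {
		return nil, errNotFound
	}
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
	var pkg packageResponse
	if err := json.NewDecoder(resp.Body).Decode(&pkg); err != nil {
		return nil, err
	}
	return &pkg, nil
}

// fetchFromFakeRegistry starts an in-process fake registry that knows a single
// package named "demo" and fetches name from it.
func fetchFromFakeRegistry(name string) (*packageResponse, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/packages/demo" {
			http.NotFound(w, r)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"name":"demo","version":"1.2.3"}`)
	}))
	defer srv.Close()
	c := &exampleClient{baseURL: srv.URL, http: &http.Client{Timeout: 10 * time.Second}}
	return c.FetchPackage(context.Background(), name)
}

func main() {
	pkg, err := fetchFromFakeRegistry("demo")
	fmt.Println(pkg, err)
	_, err = fetchFromFakeRegistry("missing")
	fmt.Println(errors.Is(err, errNotFound))
}
```

Caching and retry are left out here on purpose; in the real package those concerns belong to the shared Client.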
[pypi]: github.com/matzehuels/stacktower/pkg/integrations/pypi
[npm]: github.com/matzehuels/stacktower/pkg/integrations/npm
[crates]: github.com/matzehuels/stacktower/pkg/integrations/crates
[rubygems]: github.com/matzehuels/stacktower/pkg/integrations/rubygems
[packagist]: github.com/matzehuels/stacktower/pkg/integrations/packagist
[maven]: github.com/matzehuels/stacktower/pkg/integrations/maven
[goproxy]: github.com/matzehuels/stacktower/pkg/integrations/goproxy
[github]: github.com/matzehuels/stacktower/pkg/integrations/github
[gitlab]: github.com/matzehuels/stacktower/pkg/integrations/gitlab
[httputil.Cache]: github.com/matzehuels/stacktower/pkg/httputil.Cache
[deps]: github.com/matzehuels/stacktower/pkg/deps
Example (Errors) ¶
package main

import (
	"fmt"
	"github.com/matzehuels/stacktower/pkg/integrations"
)

func main() {
	// Standard errors for registry operations
	fmt.Println("ErrNotFound:", integrations.ErrNotFound)
	fmt.Println("ErrNetwork:", integrations.ErrNetwork)
}

Output:
ErrNotFound: resource not found
ErrNetwork: network error
Index ¶
- Variables
- func ExtractRepoURL(re *regexp.Regexp, urls map[string]string, homepage string) (owner, repo string, ok bool)
- func NewCache(ttl time.Duration) (*httputil.Cache, error)
- func NewCacheWithNamespace(namespace string, ttl time.Duration) (*httputil.Cache, error)
- func NewHTTPClient() *http.Client
- func NormalizePkgName(name string) string
- func NormalizeRepoURL(raw string) string
- func URLEncode(s string) string
- type Client
- func (c *Client) Cached(ctx context.Context, key string, refresh bool, v any, fetch func() error) error
- func (c *Client) Get(ctx context.Context, url string, v any) error
- func (c *Client) GetText(ctx context.Context, url string) (string, error)
- func (c *Client) GetWithHeaders(ctx context.Context, url string, headers map[string]string, v any) error
- type Contributor
- type RepoMetrics
Variables ¶
var (
	// ErrNotFound is returned when a package or resource doesn't exist in the registry.
	// This corresponds to HTTP 404 responses.
	// Callers should check with errors.Is(err, integrations.ErrNotFound).
	// This error is never wrapped with additional context.
	ErrNotFound = errors.New("resource not found")

	// ErrNetwork is returned for HTTP failures (timeouts, connection errors, 5xx responses).
	// This error may be wrapped with [httputil.RetryableError] for 5xx status codes.
	// Callers should check with errors.Is(err, integrations.ErrNetwork) for any network issue,
	// or errors.As(err, &httputil.RetryableError{}) to detect retryable failures specifically.
	ErrNetwork = errors.New("network error")
)
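On the caller side, checking these sentinels might look like the following minimal sketch. It declares local stand-ins (errNotFound, errNetwork) so that it runs without the package; the classify and wrap helpers and the outcome labels are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for integrations.ErrNotFound and integrations.ErrNetwork.
var (
	errNotFound = errors.New("resource not found")
	errNetwork  = errors.New("network error")
)

// wrap adds context to err while keeping it matchable with errors.Is.
func wrap(op string, err error) error {
	return fmt.Errorf("%s: %w", op, err)
}

// classify maps an error to a coarse outcome a caller might act on.
func classify(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, errNotFound):
		return "missing"
	case errors.Is(err, errNetwork):
		return "retry-later"
	default:
		return "fatal"
	}
}

func main() {
	fmt.Println(classify(nil))
	fmt.Println(classify(errNotFound))
	fmt.Println(classify(wrap("fetch fastapi", errNetwork)))
	fmt.Println(classify(errors.New("invalid JSON")))
}
```

Using errors.Is rather than == matters for the network case, because that error may arrive wrapped.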
Functions ¶
func ExtractRepoURL ¶
func ExtractRepoURL(re *regexp.Regexp, urls map[string]string, homepage string) (owner, repo string, ok bool)
ExtractRepoURL finds GitHub/GitLab owner and repo from package URLs. It searches through urls using standard keys (Source, Repository, Code, Homepage) and falls back to homepage if no match is found.
The re parameter should match URLs and capture:
- Group 1: owner/organization name
- Group 2: repository name
Examples:
re := regexp.MustCompile(`https?://github\.com/([^/]+)/([^/]+)`)
owner, repo, ok := ExtractRepoURL(re, pkg.ProjectURLs, pkg.HomePage)
URLs containing "/sponsors/" are automatically skipped to avoid false positives. The .git suffix is trimmed from the repository name if present.
Parameters:
- re: Regular expression with exactly 2 capture groups (must not be nil)
- urls: Map of URL keys to URL values (may be nil or empty)
- homepage: Fallback homepage URL (may be empty)
Returns:
- owner: The repository owner/organization (empty if not found)
- repo: The repository name without .git suffix (empty if not found)
- ok: true if a valid match was found, false otherwise
This function is safe for concurrent use if re is not mutated. Panics if re is nil.
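The documented lookup order can be approximated with a short standalone sketch. This is not the package's implementation: extractRepo is a hypothetical re-creation, and exact-case matching of the keys Source, Repository, Code, and Homepage is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// githubRE captures owner (group 1) and repository (group 2).
var githubRE = regexp.MustCompile(`https?://github\.com/([^/]+)/([^/]+)`)

// extractRepo tries the standard keys in order, then the homepage fallback.
// Reading from a nil map is safe in Go, so urls may be nil.
func extractRepo(re *regexp.Regexp, urls map[string]string, homepage string) (owner, repo string, ok bool) {
	keys := []string{"Source", "Repository", "Code", "Homepage"}
	candidates := make([]string, 0, len(keys)+1)
	for _, k := range keys {
		if u, found := urls[k]; found {
			candidates = append(candidates, u)
		}
	}
	candidates = append(candidates, homepage)

	for _, u := range candidates {
		// Sponsor pages look like repository URLs but are not repositories.
		if strings.Contains(u, "/sponsors/") {
			continue
		}
		m := re.FindStringSubmatch(u)
		if len(m) < 3 {
			continue
		}
		return m[1], strings.TrimSuffix(m[2], ".git"), true
	}
	return "", "", false
}

func main() {
	urls := map[string]string{
		"Source":     "https://github.com/sponsors/tiangolo",
		"Repository": "https://github.com/tiangolo/fastapi.git",
	}
	fmt.Println(extractRepo(githubRE, urls, ""))
}
```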
func NewCache ¶
NewCache creates a file-based cache with the given TTL in the default cache directory. See httputil.NewCache for details on cache location and behavior.
The ttl parameter should be positive. A ttl of 0 means items never expire (not recommended). Negative ttl values are invalid and are treated as 0.
For registry-specific clients, prefer using NewCacheWithNamespace to automatically scope cache keys by registry name and prevent collisions.
Returns an error if the cache directory cannot be created or accessed. The returned cache is safe for concurrent use by multiple goroutines.
func NewCacheWithNamespace ¶ added in v0.2.2
NewCacheWithNamespace creates a namespaced cache for a specific registry. The namespace parameter (e.g., "pypi:", "npm:") is automatically prefixed to all cache keys, preventing collisions between different registries.
The namespace should be non-empty and typically ends with a colon. An empty namespace is valid but defeats the purpose of this function; use NewCache instead.
The ttl parameter should be positive. A ttl of 0 means items never expire (not recommended).
This is the preferred way to create caches for registry clients:
cache, err := integrations.NewCacheWithNamespace("pypi:", 24*time.Hour)
client := integrations.NewClient(cache, nil)
Returns an error if the cache directory cannot be created or accessed. The returned cache is safe for concurrent use by multiple goroutines.
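To illustrate why a namespace prefix prevents collisions, here is a minimal in-memory sketch. The real cache is file-based; the store and nsCache types below are hypothetical and only mimic the documented key prefixing and the "ttl of 0 never expires" rule.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry is one cached value with the time it was stored.
type entry struct {
	value    string
	storedAt time.Time
}

// store is a shared backing map, standing in for the cache directory.
type store struct {
	mu    sync.Mutex
	items map[string]entry
}

func newStore() *store { return &store{items: make(map[string]entry)} }

// nsCache prefixes every key with its namespace before touching the store.
type nsCache struct {
	s      *store
	prefix string
	ttl    time.Duration
}

func newNSCache(s *store, prefix string, ttl time.Duration) *nsCache {
	return &nsCache{s: s, prefix: prefix, ttl: ttl}
}

// Set stores value under the namespaced key.
func (c *nsCache) Set(key, value string) {
	c.s.mu.Lock()
	defer c.s.mu.Unlock()
	c.s.items[c.prefix+key] = entry{value: value, storedAt: time.Now()}
}

// Get returns the value for the namespaced key unless it is missing or expired.
func (c *nsCache) Get(key string) (string, bool) {
	c.s.mu.Lock()
	defer c.s.mu.Unlock()
	e, ok := c.s.items[c.prefix+key]
	if !ok {
		return "", false
	}
	// A ttl of 0 means entries never expire.
	if c.ttl > 0 && time.Since(e.storedAt) > c.ttl {
		return "", false
	}
	return e.value, true
}

func main() {
	shared := newStore()
	pypi := newNSCache(shared, "pypi:", 24*time.Hour)
	npm := newNSCache(shared, "npm:", 24*time.Hour)
	pypi.Set("requests", "python metadata")
	npm.Set("requests", "node metadata")
	v, _ := pypi.Get("requests")
	fmt.Println(v)
	v, _ = npm.Get("requests")
	fmt.Println(v)
}
```

Both caches store a package called "requests" in the same backing map, yet neither overwrites the other because the effective keys are "pypi:requests" and "npm:requests".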
func NewHTTPClient ¶
NewHTTPClient creates an HTTP client with a standard timeout for registry requests. The returned client has a 10-second timeout applied to all requests.
The client is safe for concurrent use by multiple goroutines. Returns a new client on every call; clients are not pooled.
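With the standard library, an equivalent client could be built as in the sketch below. The helper name newHTTPClient is hypothetical, and any transport settings the real function may apply are not shown.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newHTTPClient returns a fresh client whose Timeout bounds the whole
// request, including connection setup, redirects, and reading the body.
func newHTTPClient() *http.Client {
	return &http.Client{Timeout: 10 * time.Second}
}

func main() {
	c := newHTTPClient()
	fmt.Println(c.Timeout)
}
```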
func NormalizePkgName ¶
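NormalizePkgName converts a package name to its canonical form. Applies lowercase and replaces underscores with hyphens, in the spirit of the PEP 503 normalization rules used by PyPI. Only the steps listed below are documented; full PEP 503 normalization also collapses runs of "-", "_", and "." into a single hyphen.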
Normalization steps:
- Trim leading and trailing whitespace
- Convert to lowercase
- Replace all underscores with hyphens
Examples:
NormalizePkgName("FastAPI") → "fastapi"
NormalizePkgName("my_package") → "my-package"
NormalizePkgName(" Spaces ") → "spaces"
An empty string input returns an empty string. This function is safe for concurrent use.
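The three normalization steps map directly onto standard library calls. The following is a hypothetical re-creation (normalizePkgName), not the package's source:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizePkgName trims whitespace, lowercases, and replaces underscores
// with hyphens, in that order.
func normalizePkgName(name string) string {
	name = strings.TrimSpace(name)
	name = strings.ToLower(name)
	return strings.ReplaceAll(name, "_", "-")
}

func main() {
	fmt.Println(normalizePkgName("FastAPI"))
	fmt.Println(normalizePkgName("my_package"))
	fmt.Println(normalizePkgName(" Spaces "))
}
```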
Example ¶
package main

import (
	"fmt"
	"github.com/matzehuels/stacktower/pkg/integrations"
)

func main() {
	// Package names are normalized to lowercase with hyphens
	fmt.Println(integrations.NormalizePkgName("FastAPI"))
	fmt.Println(integrations.NormalizePkgName("my_package"))
	fmt.Println(integrations.NormalizePkgName(" Spaces "))
}

Output:
fastapi
my-package
spaces
func NormalizeRepoURL ¶
NormalizeRepoURL converts various repository URL formats to canonical HTTPS form. Handles git@, git://, and git+ prefixes, and removes .git suffixes.
Transformations applied:
- git@github.com:user/repo → https://github.com/user/repo
- git://github.com/user/repo → https://github.com/user/repo
- git+https://example.com/repo.git → https://example.com/repo
- https://example.com/repo.git → https://example.com/repo
Returns an empty string if the input is empty or contains only whitespace. URLs without a git prefix are returned as-is apart from whitespace trimming and .git suffix removal. This function is safe for concurrent use.
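The transformations listed above could be implemented roughly as follows. This is a sketch under the assumption that prefixes are handled by simple string operations; normalizeRepoURL is a hypothetical name and the real function may treat edge cases differently.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeRepoURL rewrites common git URL spellings to plain HTTPS.
func normalizeRepoURL(raw string) string {
	s := strings.TrimSpace(raw)
	if s == "" {
		return ""
	}
	// git+https://host/path -> https://host/path
	s = strings.TrimPrefix(s, "git+")
	switch {
	case strings.HasPrefix(s, "git@"):
		// git@host:user/repo -> https://host/user/repo
		s = strings.TrimPrefix(s, "git@")
		s = "https://" + strings.Replace(s, ":", "/", 1)
	case strings.HasPrefix(s, "git://"):
		// git://host/user/repo -> https://host/user/repo
		s = "https://" + strings.TrimPrefix(s, "git://")
	}
	return strings.TrimSuffix(s, ".git")
}

func main() {
	fmt.Println(normalizeRepoURL("git@github.com:user/repo.git"))
	fmt.Println(normalizeRepoURL("git://github.com/user/repo"))
	fmt.Println(normalizeRepoURL("git+https://github.com/user/repo.git"))
}
```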
Example ¶
package main

import (
	"fmt"
	"github.com/matzehuels/stacktower/pkg/integrations"
)

func main() {
	// Various repository URL formats are normalized to HTTPS
	fmt.Println(integrations.NormalizeRepoURL("git@github.com:user/repo.git"))
	fmt.Println(integrations.NormalizeRepoURL("git://github.com/user/repo"))
	fmt.Println(integrations.NormalizeRepoURL("git+https://github.com/user/repo.git"))
	fmt.Println(integrations.NormalizeRepoURL("https://github.com/user/repo"))
}

Output:
https://github.com/user/repo
https://github.com/user/repo
https://github.com/user/repo
https://github.com/user/repo
func URLEncode ¶
URLEncode percent-encodes a string for use in URLs. This is a convenience wrapper around url.QueryEscape.
Spaces are encoded as "+", and special characters as "%XX" hex sequences. An empty string returns an empty string. This function is safe for concurrent use.
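Because url.QueryEscape encodes spaces as "+", it is suited to query strings; for a path segment, url.PathEscape encodes a space as "%20" instead. The sketch below contrasts the two standard library functions. The wrapper names queryEncode and pathEncode are hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
)

// queryEncode escapes s for use inside a query string.
func queryEncode(s string) string { return url.QueryEscape(s) }

// pathEncode escapes s for use as a single path segment.
func pathEncode(s string) string { return url.PathEscape(s) }

func main() {
	fmt.Println(queryEncode("package name"))
	fmt.Println(pathEncode("package name"))
	fmt.Println(queryEncode("@scope/package"))
}
```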
Example ¶
package main

import (
	"fmt"
	"github.com/matzehuels/stacktower/pkg/integrations"
)

func main() {
	// URL-encode special characters for API queries
	fmt.Println(integrations.URLEncode("@scope/package"))
	fmt.Println(integrations.URLEncode("package name"))
}

Output:
%40scope%2Fpackage
package+name
Types ¶
type Client ¶
type Client struct {
	// contains filtered or unexported fields
}
Client provides shared HTTP functionality for all registry API clients. It handles caching, retry logic, and common request headers.
Client is safe for concurrent use by multiple goroutines. The underlying HTTP client, cache, and headers are all goroutine-safe.
Zero values: Do not use an uninitialized Client; always create via NewClient.
func NewClient ¶
NewClient creates a Client with the given cache and default headers. Headers are applied to all requests made through this client.
Parameters:
- cache: Cache instance for storing responses (must not be nil). Create with NewCacheWithNamespace for registry-specific caching.
- headers: Default HTTP headers for all requests. Pass nil if no default headers are needed. Common examples: "Authorization", "User-Agent", "Accept".
The returned Client is safe for concurrent use by multiple goroutines. Panics if cache is nil.
func (*Client) Cached ¶
func (c *Client) Cached(ctx context.Context, key string, refresh bool, v any, fetch func() error) error
Cached retrieves a value from cache or executes fetch and caches the result. If refresh is true, the cache is bypassed and fetch is always called.
Parameters:
- ctx: Context for cancellation. If the context is cancelled, fetch is not executed and Cached returns ctx.Err().
- key: Cache key (usually package name or coordinate). Must not be empty.
- refresh: If true, bypass cache and always call fetch. If false, try cache first.
- v: Pointer to store the result. Must be a non-nil pointer to a JSON-serializable type.
- fetch: Function to fetch data and populate v. Called with retry on transient failures.
Behavior:
- If refresh=false and cache hit: returns nil immediately with v populated
- If cache miss or refresh=true: calls fetch with automatic retry on httputil.RetryableError
- On successful fetch: stores result in cache (ignoring cache write errors)
The fetch function should populate v and return nil on success, or return an error. Network errors should be wrapped with httputil.Retryable to enable retry.
Returns:
- nil on success (v is populated)
- error from fetch if it fails (v may be partially populated)
- ctx.Err() if context is cancelled
This method is safe for concurrent use on the same Client.
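The behavior described above (cache first, then fetch with retry, then store) can be sketched in isolation. Everything here is hypothetical: the in-memory jsonCache, the retryable error type standing in for httputil.RetryableError, and the fixed limit of three attempts without backoff are all assumptions, not the package's actual policy.

```go
package main

import (
	"context"
	"encoding/json"
	"errors"
	"fmt"
	"sync"
)

// retryable marks an error as worth retrying.
type retryable struct{ err error }

func (r *retryable) Error() string { return r.err.Error() }
func (r *retryable) Unwrap() error { return r.err }

// jsonCache stores JSON-encoded values in memory.
type jsonCache struct {
	mu   sync.Mutex
	data map[string][]byte
}

// cached returns a cache hit unless refresh is set; otherwise it calls fetch,
// retrying retryable failures, and stores the populated v on success.
func cached(ctx context.Context, c *jsonCache, key string, refresh bool, v any, fetch func() error) error {
	if !refresh {
		c.mu.Lock()
		raw, ok := c.data[key]
		c.mu.Unlock()
		if ok && json.Unmarshal(raw, v) == nil {
			return nil
		}
	}
	const maxAttempts = 3
	var err error
	for attempt := 0; attempt < maxAttempts; attempt++ {
		if err = ctx.Err(); err != nil {
			return err
		}
		if err = fetch(); err == nil {
			break
		}
		var r *retryable
		if !errors.As(err, &r) {
			return err // Not retryable: give up immediately.
		}
	}
	if err != nil {
		return err
	}
	// Cache write failures are ignored, as documented for Cached.
	if raw, mErr := json.Marshal(v); mErr == nil {
		c.mu.Lock()
		c.data[key] = raw
		c.mu.Unlock()
	}
	return nil
}

// demo fetches the same key twice; the first fetch fails once, then succeeds.
func demo() (calls int, got string, err error) {
	c := &jsonCache{data: make(map[string][]byte)}
	ctx := context.Background()
	fetchInto := func(dst *string) func() error {
		return func() error {
			calls++
			if calls == 1 {
				return &retryable{err: errors.New("HTTP 503")}
			}
			*dst = "metadata for fastapi"
			return nil
		}
	}
	var first string
	if err = cached(ctx, c, "fastapi", false, &first, fetchInto(&first)); err != nil {
		return calls, "", err
	}
	err = cached(ctx, c, "fastapi", false, &got, fetchInto(&got))
	return calls, got, err
}

func main() {
	fmt.Println(demo())
}
```

In demo, fetch runs twice during the first lookup (one retryable failure, one success) and not at all during the second, which is served from the cache.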
func (*Client) Get ¶
Get performs an HTTP GET request and JSON-decodes the response into v. It uses the client's default headers and handles retries automatically.
Parameters:
- ctx: Context for cancellation and timeout
- url: Full URL to request (must be absolute URL with scheme)
- v: Pointer to store decoded JSON response (must be non-nil)
Returns:
- ErrNotFound for HTTP 404 responses
- ErrNetwork wrapped with httputil.RetryableError for HTTP 5xx responses
- ErrNetwork for connection failures and timeouts
- json decoding errors if response is not valid JSON
This method is safe for concurrent use on the same Client.
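One way to picture the error mapping listed above is a function from status code to error. This is a sketch with local stand-ins: errNotFound, errNetwork, and retryableError are hypothetical substitutes for the package's types, and the handling of other 4xx codes is an assumption, since the documentation does not specify it.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

var (
	errNotFound = errors.New("resource not found")
	errNetwork  = errors.New("network error")
)

// retryableError stands in for httputil.RetryableError.
type retryableError struct{ err error }

func (r *retryableError) Error() string { return r.err.Error() }
func (r *retryableError) Unwrap() error { return r.err }

// statusToError maps an HTTP status code to the documented error kinds.
func statusToError(status int) error {
	switch {
	case status == http.StatusNotFound:
		return errNotFound
	case status >= 500:
		return &retryableError{err: fmt.Errorf("%w: HTTP %d", errNetwork, status)}
	case status >= 200 && status < 300:
		return nil
	default:
		// Assumption: other statuses are reported as plain errors.
		return fmt.Errorf("unexpected HTTP status %d", status)
	}
}

// isRetryable reports whether err carries a retryableError in its chain.
func isRetryable(err error) bool {
	var r *retryableError
	return errors.As(err, &r)
}

// isNetwork reports whether err matches errNetwork anywhere in its chain.
func isNetwork(err error) bool { return errors.Is(err, errNetwork) }

func main() {
	for _, code := range []int{200, 404, 503} {
		err := statusToError(code)
		fmt.Println(code, err, isRetryable(err), isNetwork(err))
	}
}
```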
func (*Client) GetText ¶
GetText performs an HTTP GET request and returns the response body as a string. Useful for non-JSON endpoints like go.mod files or plain text responses.
Parameters:
- ctx: Context for cancellation and timeout
- url: Full URL to request (must be absolute URL with scheme)
The entire response body is read into memory. Use caution with large responses. For files larger than a few MB, consider streaming with a custom implementation.
Returns:
- The response body as a string
- ErrNotFound for HTTP 404 responses
- ErrNetwork for connection failures, timeouts, and HTTP 5xx responses
- io errors if reading the response body fails
This method is safe for concurrent use on the same Client.
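For callers worried about large bodies, a custom implementation could cap the read with io.LimitReader. The sketch below is an assumption about how such a guard might look; readLimited, readLimitedString, and errTooLarge are hypothetical names, not part of this package.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"strings"
)

var errTooLarge = errors.New("response body exceeds limit")

// readLimited reads at most max bytes from r and fails if more are available.
// Reading max+1 bytes makes "exactly max" distinguishable from "too large".
func readLimited(r io.Reader, max int64) (string, error) {
	data, err := io.ReadAll(io.LimitReader(r, max+1))
	if err != nil {
		return "", err
	}
	if int64(len(data)) > max {
		return "", errTooLarge
	}
	return string(data), nil
}

// readLimitedString is a convenience wrapper for trying the guard on a string.
func readLimitedString(s string, max int64) (string, error) {
	return readLimited(strings.NewReader(s), max)
}

func main() {
	fmt.Println(readLimitedString("module example.com/m", 64))
	fmt.Println(readLimitedString("0123456789", 4))
}
```

In a real request, resp.Body would take the place of the strings.Reader.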
func (*Client) GetWithHeaders ¶
func (c *Client) GetWithHeaders(ctx context.Context, url string, headers map[string]string, v any) error
GetWithHeaders performs an HTTP GET with additional headers merged with defaults. Request-specific headers override client defaults for the same key.
Parameters:
- ctx: Context for cancellation and timeout
- url: Full URL to request (must be absolute URL with scheme)
- headers: Additional headers for this request only (may be nil). Headers with the same key as client defaults will override the default value for this request.
- v: Pointer to store decoded JSON response (must be non-nil)
Example:
err := client.GetWithHeaders(ctx, url, map[string]string{"X-Custom": "value"}, &resp)
Returns the same errors as [Get]. This method is safe for concurrent use on the same Client.
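The override rule can be illustrated with a small merge helper. This is a hypothetical sketch (mergeHeaders is not part of the package); it uses http.Header.Set, which canonicalizes key casing, so whether the real client matches keys case-insensitively is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
)

// mergeHeaders applies defaults first, then lets per-request headers
// overwrite any default with the same (canonicalized) key.
func mergeHeaders(defaults, extra map[string]string) http.Header {
	h := make(http.Header, len(defaults)+len(extra))
	for k, v := range defaults {
		h.Set(k, v)
	}
	for k, v := range extra {
		h.Set(k, v)
	}
	return h
}

func main() {
	defaults := map[string]string{"Accept": "application/json", "User-Agent": "stacktower"}
	extra := map[string]string{"Accept": "text/plain"}
	h := mergeHeaders(defaults, extra)
	fmt.Println(h.Get("Accept"))
	fmt.Println(h.Get("User-Agent"))
}
```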
type Contributor ¶
type Contributor struct {
	Login         string `json:"login"`         // GitHub/GitLab username. Never empty in valid contributors.
	Contributions int    `json:"contributions"` // Number of commits. Always positive in valid contributors.
}
Contributor represents a repository contributor with their contribution count. Used for bus factor analysis and maintainer identification.
Zero values: Login is empty, Contributions is 0. A Contributor with 0 contributions is invalid. This struct is safe for concurrent reads.
Example ¶
package main

import (
	"fmt"
	"github.com/matzehuels/stacktower/pkg/integrations"
)

func main() {
	// Contributors track commit counts for bus factor analysis
	contributors := []integrations.Contributor{
		{Login: "maintainer1", Contributions: 500},
		{Login: "maintainer2", Contributions: 200},
		{Login: "contributor3", Contributions: 50},
	}
	fmt.Println("Top contributor:", contributors[0].Login)
	fmt.Println("Contributions:", contributors[0].Contributions)
}

Output:
Top contributor: maintainer1
Contributions: 500
type RepoMetrics ¶
type RepoMetrics struct {
	RepoURL string `json:"repo_url"` // Canonical repository URL (https://...). Never empty in valid metrics.
	Owner string `json:"owner"` // Repository owner username. Never empty in valid metrics.
	Stars int `json:"stars"` // GitHub/GitLab star count. 0 is a valid value for new repositories.
	SizeKB int `json:"size_kb,omitempty"` // Repository size in kilobytes. 0 means not available or very small.
	LastCommitAt *time.Time `json:"last_commit_at,omitempty"` // Date of most recent commit. Nil if not available.
	LastReleaseAt *time.Time `json:"last_release_at,omitempty"` // Date of most recent release. Nil if no releases or not available.
	License string `json:"license,omitempty"` // SPDX license identifier (e.g., "MIT", "Apache-2.0"). Empty if not detected.
	Contributors []Contributor `json:"top_contributors,omitempty"` // Top contributors by commit count (typically top 5). Nil or empty if not available.
	Language string `json:"language,omitempty"` // Primary repository language (e.g., "Go", "Python"). Empty if not detected.
	Topics []string `json:"topics,omitempty"` // Repository topic tags. Nil or empty if none.
	Archived bool `json:"archived"` // Whether the repository is archived. False means active or unknown.
}
RepoMetrics holds repository-level data fetched from GitHub or GitLab. Used to enrich package metadata with maintenance and popularity indicators.
Zero values: All string fields are empty, integers are 0, time pointers are nil. Nil Contributors slice is valid and indicates no contributor data was fetched.
This struct is safe for concurrent reads after construction but not for concurrent writes.
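The struct tags determine which zero-valued fields appear in JSON output. The sketch below uses local copies of the two types (repoMetrics and contributor, renamed to make clear they are not the package's own) with the same tags, and shows that omitempty fields vanish while stars and archived are always written.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// contributor and repoMetrics are local copies with the documented JSON tags.
type contributor struct {
	Login         string `json:"login"`
	Contributions int    `json:"contributions"`
}

type repoMetrics struct {
	RepoURL       string        `json:"repo_url"`
	Owner         string        `json:"owner"`
	Stars         int           `json:"stars"`
	SizeKB        int           `json:"size_kb,omitempty"`
	LastCommitAt  *time.Time    `json:"last_commit_at,omitempty"`
	LastReleaseAt *time.Time    `json:"last_release_at,omitempty"`
	License       string        `json:"license,omitempty"`
	Contributors  []contributor `json:"top_contributors,omitempty"`
	Language      string        `json:"language,omitempty"`
	Topics        []string      `json:"topics,omitempty"`
	Archived      bool          `json:"archived"`
}

// encode marshals m to compact JSON, returning "" on failure.
func encode(m repoMetrics) string {
	b, err := json.Marshal(m)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	m := repoMetrics{
		RepoURL:  "https://github.com/psf/requests",
		Owner:    "psf",
		Stars:    51000,
		Language: "Python",
	}
	fmt.Println(encode(m))
}
```

A consumer therefore cannot distinguish "size unknown" from "size 0" in the encoded form, which matches the field comment on SizeKB.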
Example ¶
package main

import (
	"fmt"
	"github.com/matzehuels/stacktower/pkg/integrations"
)

func main() {
	// RepoMetrics holds repository data from GitHub/GitLab
	metrics := integrations.RepoMetrics{
		RepoURL:  "https://github.com/psf/requests",
		Owner:    "psf",
		Stars:    51000,
		Language: "Python",
		Archived: false,
	}
	fmt.Println("Repository:", metrics.RepoURL)
	fmt.Println("Stars:", metrics.Stars)
	fmt.Println("Archived:", metrics.Archived)
}

Output:
Repository: https://github.com/psf/requests
Stars: 51000
Archived: false
Directories ¶
| Path | Synopsis |
|---|---|
| crates | Package crates provides an HTTP client for the crates.io API. |
| github | Package github provides an HTTP client for the GitHub API. |
| gitlab | Package gitlab provides an HTTP client for the GitLab API. |
| goproxy | Package goproxy provides an HTTP client for the Go Module Proxy. |
| maven | Package maven provides an HTTP client for Maven Central. |
| npm | Package npm provides an HTTP client for the npm registry API. |
| packagist | Package packagist provides an HTTP client for the Packagist API. |
| pypi | Package pypi provides an HTTP client for the Python Package Index API. |
| rubygems | Package rubygems provides an HTTP client for the RubyGems.org API. |