Documentation ¶
Overview ¶
Package profile defines the common types for social media profile extraction.
Constants ¶
This section is empty.
Variables ¶
var (
	ErrAuthRequired    = errors.New("authentication required")
	ErrNoCookies       = errors.New("no cookies available")
	ErrProfileNotFound = errors.New("profile not found")
	ErrRateLimited     = errors.New("rate limited")
)
Common errors returned by platform packages.
Functions ¶
func Register ¶ added in v0.9.0
func Register(p Platform)
Register adds a platform to the global registry. This should be called from each platform package's init() function.
func RegisterWithFetcher ¶ added in v0.9.0
RegisterWithFetcher adds a platform with its fetch function to the global registry.
Types ¶
type FetchFunc ¶ added in v0.9.0
FetchFunc is a function that fetches a profile from a URL. This allows platforms to register their fetch logic without requiring a specific client type.
func LookupFetcher ¶ added in v0.9.0
LookupFetcher returns the fetch function for the given platform name, or nil if not found.
type FetcherConfig ¶ added in v0.9.0
type FetcherConfig struct {
	Cache          any               // httpcache.Cacher - use any to avoid import cycles
	Cookies        map[string]string // Platform-specific cookies
	Logger         *slog.Logger
	GitHubToken    string // GitHub API token
	BrowserCookies bool   // Whether to read cookies from browser
}
FetcherConfig holds configuration for creating platform fetchers.
type Platform ¶ added in v0.9.0
type Platform interface {
	// Name returns the platform identifier (e.g., "github", "twitter").
	Name() string
	// Type returns the category of content this platform hosts.
	Type() PlatformType
	// Match returns true if the URL belongs to this platform.
	Match(url string) bool
	// AuthRequired returns true if authentication is needed to fetch profiles.
	AuthRequired() bool
}
Platform defines the interface that all platform implementations must satisfy. Each platform package registers itself via Register() in an init() function.
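The registration pattern described above could look roughly like the following sketch. To stay self-contained it redeclares the interface and uses a toy map as the registry; the real registry's internals are not shown in these docs, and `codeHost`, its `examplehost` name, the `code.example.com` URL, and `matchURL` are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

type PlatformType string

const PlatformTypeCode PlatformType = "code"

// Platform mirrors the interface documented above.
type Platform interface {
	Name() string
	Type() PlatformType
	Match(url string) bool
	AuthRequired() bool
}

// registry is a toy stand-in for the package's global registry.
var registry = map[string]Platform{}

// Register and LookupPlatform imitate the documented behavior only.
func Register(p Platform) { registry[p.Name()] = p }

// LookupPlatform returns nil for unknown names (the map's zero value).
func LookupPlatform(name string) Platform { return registry[name] }

// codeHost is a hypothetical platform implementation.
type codeHost struct{}

func (codeHost) Name() string       { return "examplehost" }
func (codeHost) Type() PlatformType { return PlatformTypeCode }
func (codeHost) AuthRequired() bool { return false }
func (codeHost) Match(url string) bool {
	return strings.Contains(strings.ToLower(url), "code.example.com/")
}

// Each platform package registers itself at program start.
func init() { Register(codeHost{}) }

// matchURL returns the name of a registered platform that claims the URL,
// or "" if none does.
func matchURL(url string) string {
	for name, p := range registry {
		if p.Match(url) {
			return name
		}
	}
	return ""
}

func main() {
	fmt.Println(matchURL("https://code.example.com/alice"))
}
```

Registering from init() means that importing a platform package, even with a blank import, is enough to make it discoverable by name.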
func LookupPlatform ¶ added in v0.9.0
LookupPlatform returns the platform with the given name, or nil if not found.
type PlatformType ¶ added in v0.9.0
type PlatformType string
PlatformType categorizes what kind of content a platform primarily hosts. This enables cross-platform matching bonuses (e.g., same username on GitHub and GitLab).
const (
	PlatformTypeCode      PlatformType = "code"      // Code hosting: GitHub, GitLab, Codeberg, etc.
	PlatformTypeBlog      PlatformType = "blog"      // Long-form writing: Medium, Substack, Dev.to, etc.
	PlatformTypeMicroblog PlatformType = "microblog" // Short posts: Twitter, Mastodon, Bluesky, etc.
	PlatformTypeVideo     PlatformType = "video"     // Video content: YouTube, TikTok, Twitch, etc.
	PlatformTypeForum     PlatformType = "forum"     // Discussion forums: Reddit, HN, Lobsters, etc.
	PlatformTypeGaming    PlatformType = "gaming"    // Gaming platforms: Steam, etc.
	PlatformTypeSocial    PlatformType = "social"    // General social: LinkedIn, Instagram, VK, etc.
	PlatformTypePackage   PlatformType = "package"   // Package registries: npm, PyPI, crates.io, etc.
	PlatformTypeSecurity  PlatformType = "security"  // Security platforms: HackerOne, Bugcrowd, etc.
	PlatformTypeOther     PlatformType = "other"     // Uncategorized platforms
)
Platform type constants for categorizing platforms by their primary content type.
func TypeOf ¶ added in v0.9.0
func TypeOf(name string) PlatformType
TypeOf returns the platform type for a given platform name. Returns PlatformTypeOther for unknown platforms.
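To illustrate the cross-platform matching idea, here is a hedged sketch in which `TypeOf` is reimplemented over a small hypothetical table (`knownTypes`); how the real function resolves names is not described here. `sameCategory` is an invented helper showing one way a caller might decide whether a matching bonus applies.

```go
package main

import "fmt"

type PlatformType string

const (
	PlatformTypeCode      PlatformType = "code"
	PlatformTypeMicroblog PlatformType = "microblog"
	PlatformTypeOther     PlatformType = "other"
)

// knownTypes is a hypothetical lookup table used only for this sketch.
var knownTypes = map[string]PlatformType{
	"github":  PlatformTypeCode,
	"gitlab":  PlatformTypeCode,
	"twitter": PlatformTypeMicroblog,
}

// TypeOf imitates the documented contract: unknown names map to PlatformTypeOther.
func TypeOf(name string) PlatformType {
	if t, ok := knownTypes[name]; ok {
		return t
	}
	return PlatformTypeOther
}

// sameCategory reports whether two platforms share a known category.
// Two unknown platforms are not treated as a match.
func sameCategory(a, b string) bool {
	ta, tb := TypeOf(a), TypeOf(b)
	return ta == tb && ta != PlatformTypeOther
}

func main() {
	fmt.Println(sameCategory("github", "gitlab"))  // prints true
	fmt.Println(sameCategory("github", "twitter")) // prints false
}
```

Excluding PlatformTypeOther from the comparison is a design choice of this sketch: otherwise any two unrecognized platforms would look related.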
type Post ¶
type Post struct {
	Type     PostType `json:"type"`               // Type of content
	Title    string   `json:"title,omitempty"`    // Title (for videos, articles, posts)
	Content  string   `json:"content,omitempty"`  // Body text or description
	URL      string   `json:"url,omitempty"`      // Link to the original content
	Category string   `json:"category,omitempty"` // Category (subreddit, channel, topic, etc.)
}
Post represents a piece of user-generated content (post, comment, video, etc.).
type PostType ¶
type PostType string
PostType indicates the type of user-generated content.
const (
	PostTypeComment    PostType = "comment"
	PostTypePost       PostType = "post"
	PostTypeVideo      PostType = "video"
	PostTypeArticle    PostType = "article"
	PostTypeQuestion   PostType = "question"
	PostTypeAnswer     PostType = "answer"
	PostTypeRepository PostType = "repository"
)
Post type constants for categorizing user-generated content.
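A short sketch of how a Post serializes may help when consuming the JSON output. The type and tags are copied from the declaration above; `encodePost` and the example URL are hypothetical. Note that `type` has no `omitempty`, so it is always emitted, while empty optional fields are dropped.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type PostType string

const PostTypeVideo PostType = "video"

// Post mirrors the struct documented above, including its JSON tags.
type Post struct {
	Type     PostType `json:"type"`
	Title    string   `json:"title,omitempty"`
	Content  string   `json:"content,omitempty"`
	URL      string   `json:"url,omitempty"`
	Category string   `json:"category,omitempty"`
}

// encodePost is a hypothetical helper that returns the JSON form of a Post.
func encodePost(p Post) string {
	b, err := json.Marshal(p)
	if err != nil {
		panic(err) // not expected for a struct of plain strings
	}
	return string(b)
}

func main() {
	p := Post{Type: PostTypeVideo, Title: "Intro", URL: "https://example.com/v/1"}
	// prints {"type":"video","title":"Intro","url":"https://example.com/v/1"}
	fmt.Println(encodePost(p))
}
```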
type Profile ¶
type Profile struct {
	// Metadata
	Platform      string `json:",omitempty"` // Platform name: "linkedin", "twitter", "mastodon", etc.
	URL           string `json:",omitempty"` // Original URL fetched
	Authenticated bool   `json:",omitempty"` // Whether login cookies were used
	Error         string `json:",omitempty"` // Error message if fetch failed (e.g., "login required")

	// Core profile data
	Username  string   `json:",omitempty"` // Handle/username (without @ prefix)
	Name      string   `json:",omitempty"` // Display name
	AvatarURL string   `json:",omitempty"` // Profile photo/avatar URL
	Bio       string   `json:",omitempty"` // Profile bio/description
	Location  string   `json:",omitempty"` // Geographic location
	Website   string   `json:",omitempty"` // Personal website URL
	CreatedAt string   `json:",omitempty"` // Account creation date (ISO timestamp)
	UpdatedAt string   `json:",omitempty"` // Most recent activity or profile update (ISO timestamp)
	UTCOffset *float64 `json:",omitempty"` // UTC offset in hours (e.g., -8 for PST, 5.5 for IST)

	// Platform-specific fields
	Fields map[string]string `json:",omitempty"` // Additional platform-specific data (headline, employer, etc.)

	// For further crawling
	SocialLinks []string `json:",omitempty"` // Other social media URLs detected on the profile

	// User-generated content (posts, comments, videos, etc.)
	Posts []Post `json:",omitempty"` // Structured content extracted from the profile

	// Fallback for unrecognized platforms
	Unstructured string `json:",omitempty"` // Raw markdown content (HTML->MD conversion)

	// Guess mode fields (omitted from JSON when empty)
	IsGuess    bool     `json:",omitempty"` // True if this profile was discovered via guessing
	Confidence float64  `json:",omitempty"` // Confidence score 0.0-1.0 for guessed profiles
	GuessMatch []string `json:",omitempty"` // Reasons for match (e.g., "username", "name", "location")
}
Profile represents extracted data from a social media profile.
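The tags on Profile set `omitempty` but no key name, so `encoding/json` uses the Go field names as JSON keys. The sketch below shows this on a trimmed copy of the struct (only a subset of fields, chosen for brevity); `encodeProfile` and `hours` are hypothetical helpers. It also shows that a non-nil UTCOffset pointing at zero is still emitted, which is presumably why the field is a pointer.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Profile is a trimmed copy of the documented struct, for illustration only.
type Profile struct {
	Platform      string            `json:",omitempty"`
	URL           string            `json:",omitempty"`
	Authenticated bool              `json:",omitempty"`
	Username      string            `json:",omitempty"`
	Name          string            `json:",omitempty"`
	UTCOffset     *float64          `json:",omitempty"`
	Fields        map[string]string `json:",omitempty"`
	IsGuess       bool              `json:",omitempty"`
	Confidence    float64           `json:",omitempty"`
}

// hours returns a pointer to h, convenient for setting UTCOffset.
func hours(h float64) *float64 { return &h }

// encodeProfile is a hypothetical helper that returns the JSON form of a Profile.
func encodeProfile(p Profile) string {
	b, err := json.Marshal(p)
	if err != nil {
		panic(err) // not expected for these field types
	}
	return string(b)
}

func main() {
	p := Profile{Platform: "github", Username: "alice", UTCOffset: hours(-8)}
	// prints {"Platform":"github","Username":"alice","UTCOffset":-8}
	fmt.Println(encodeProfile(p))

	// A zero offset (UTC) survives omitempty because the pointer is non-nil.
	// prints {"UTCOffset":0}
	fmt.Println(encodeProfile(Profile{UTCOffset: hours(0)}))
}
```

Because false and 0 count as empty, an unauthenticated fetch or a zero Confidence simply leaves those keys out of the output.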