youtube

package
v0.4.0
Published: Jun 15, 2026 License: Apache-2.0 Imports: 30 Imported by: 0

Documentation

Constants

const (
	// BaseURL is the YouTube web origin.
	BaseURL = "https://www.youtube.com"
	// MusicBaseURL is the YouTube Music web origin.
	MusicBaseURL = "https://music.youtube.com"

	// DefaultDelay is the polite minimum delay between requests.
	DefaultDelay = 1500 * time.Millisecond
	// DefaultWorkers is the default concurrency for detail fetches and crawl workers.
	DefaultWorkers = 4
	// DefaultTimeout is the default per-request timeout.
	DefaultTimeout = 30 * time.Second
	// DefaultRetries is the default retry count on transient failures.
	DefaultRetries = 3
	// DefaultMaxResults caps a list command's rows when the user gives no -n.
	DefaultMaxResults = 0 // 0 == unlimited (page-capped instead)

	// Entity kinds used by the crawl queue.
	EntityVideo     = "video"
	EntityChannel   = "channel"
	EntityPlaylist  = "playlist"
	EntitySearch    = "search"
	EntityHashtag   = "hashtag"
	EntityComments  = "comments"
	EntityCommunity = "community"
)

Variables

var AllSponsorCategories = []string{
	"sponsor", "selfpromo", "interaction", "intro", "outro",
	"preview", "music_offtopic", "filler",
}

AllSponsorCategories are the segment categories SponsorBlock publishes.

var ErrCommentsRestricted = errors.New("comments hidden by Restricted Mode")

ErrCommentsRestricted is returned when a video's comments are hidden by Restricted Mode. YouTube applies Restricted Mode to some server and datacenter requests regardless of cookies, so callers can present a clear message rather than mistaking it for a video with no comments.

var ErrFFmpegMissing = errors.New("ffmpeg not found: install it or pass --ffmpeg-bin (the ytb binary itself stays pure-Go)")

ErrFFmpegMissing is returned when an operation needs ffmpeg but none was found. Callers map it to the missing-tool exit code.

var ErrStop = errors.New("youtube: stop iteration")

ErrStop is returned by a streaming emit function to signal that no more items are needed. Streaming functions treat ErrStop as a clean stop (return nil).
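
The sentinel pattern described above can be sketched as follows. This is a minimal, self-contained illustration, not the package's code: errStop, streamInts, and firstN are hypothetical local stand-ins, and the real streaming functions emit richer values than ints.

```go
package main

import (
	"errors"
	"fmt"
)

// errStop is a local stand-in for the package's ErrStop sentinel.
var errStop = errors.New("stop iteration")

// streamInts is a hypothetical streaming function: it calls emit for each
// item and treats errStop as a clean stop, returning nil.
func streamInts(items []int, emit func(int) error) error {
	for _, it := range items {
		if err := emit(it); err != nil {
			if errors.Is(err, errStop) {
				return nil // the caller has enough; this is not a failure
			}
			return err // any other error aborts the stream
		}
	}
	return nil
}

// firstN collects items from the stream and stops once it holds n of them.
func firstN(items []int, n int) ([]int, error) {
	var got []int
	err := streamInts(items, func(v int) error {
		got = append(got, v)
		if len(got) >= n {
			return errStop
		}
		return nil
	})
	return got, err
}

func main() {
	got, err := firstN([]int{1, 2, 3, 4, 5}, 2)
	fmt.Println(got, err) // prints "[1 2] <nil>"
}
```

The caller never sees the sentinel: a stream that was cut short on purpose and a stream that ran to completion both return nil.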

Functions

func CommentsRestricted

func CommentsRestricted(root any) bool

CommentsRestricted reports whether a watch page's comment section was hidden by Restricted Mode. YouTube applies this to some server and datacenter requests; the section then carries only a messageRenderer to that effect.

func Crawl

func Crawl(ctx context.Context, c *Client, store *Store, opt CrawlOptions, logf func(string)) error

Crawl pops items off the store queue and fetches each one using c, storing results into store. It runs until the queue is empty or ctx is cancelled.

If opt.Workers > 1, items are fetched concurrently through an errgroup. Any per-item error is logged via logf and does not abort the overall crawl.

func EdgeHelp added in v0.4.0

func EdgeHelp() string

EdgeHelp is the one-line catalogue of presets and edges for flag help and usage errors, so the names a user can type live in exactly one place.

func EmbedThumbnail

func EmbedThumbnail(ctx context.Context, ffmpeg, media, image, out string) error

EmbedThumbnail attaches an image to a media file as cover art (mp4/m4a only).

func Export

func Export(store *Store, channel, outDir string) error

Export writes Markdown pages for all channels (channel=="") or a single channel identified by @handle or UC-style ID.

Output layout mirrors the reference exporter:

outDir/<handle-or-id>/README.md          channel index
                      videos/README.md    all-videos index
                      videos/YYYY/MM/DD/<slug>-<id>.md
                      playlists/README.md
                      playlists/<slug>.md

func ExtractAudio

func ExtractAudio(ctx context.Context, ffmpeg, in, out, codec, quality string) error

ExtractAudio transcodes (or copies) an input into an audio-only file. When codec is "copy" the stream is remuxed; otherwise it is re-encoded (e.g. mp3, aac, opus, flac). quality, when non-empty, sets -q:a or -b:a as appropriate.

func ExtractHashtags

func ExtractHashtags(text string) []string

ExtractHashtags finds all #word patterns in a string and returns unique hashtags.
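
One way to get this behaviour is a regular expression plus a seen-set. The sketch below is an assumption about the approach, not the package's implementation: the pattern, the case-sensitive comparison, and keeping the leading "#" in the result are all choices made here for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// hashtagRE matches "#" followed by letters, digits, or underscores.
// The pattern the real package uses is not documented; this one is assumed.
var hashtagRE = regexp.MustCompile(`#[\p{L}\p{N}_]+`)

// uniqueHashtags returns each distinct hashtag once, in first-seen order.
func uniqueHashtags(text string) []string {
	seen := make(map[string]bool)
	var out []string
	for _, tag := range hashtagRE.FindAllString(text, -1) {
		if !seen[tag] {
			seen[tag] = true
			out = append(out, tag)
		}
	}
	return out
}

func main() {
	fmt.Println(uniqueHashtags("#go is fun #go #rust_lang"))
}
```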

func ExtractPlaylistID

func ExtractPlaylistID(input string) string

ExtractPlaylistID extracts the playlist ID from a URL or recognises bare IDs.

func ExtractVideoID

func ExtractVideoID(input string) string

ExtractVideoID extracts the video ID from a URL or returns the bare ID as-is.

func FFmpeg

func FFmpeg(explicit string) string

FFmpeg locates the ffmpeg binary. The explicit path wins, then YTB_FFMPEG_BIN, then PATH. It returns "" when none is usable.
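
The lookup order above can be sketched with the standard library alone. locateTool is a hypothetical helper; in particular, falling through to the next candidate when the explicit path is unusable is an assumption, since the documentation only states the precedence.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// locateTool returns the first usable candidate: explicit, then the
// environment variable envKey, then a PATH lookup of name. It returns ""
// when none of them resolves to an executable.
func locateTool(explicit, envKey, name string) string {
	for _, cand := range []string{explicit, os.Getenv(envKey), name} {
		if cand == "" {
			continue
		}
		// LookPath checks a path containing a separator directly and
		// searches PATH for a bare name.
		if p, err := exec.LookPath(cand); err == nil {
			return p
		}
	}
	return ""
}

func main() {
	fmt.Printf("%q\n", locateTool("", "YTB_FFMPEG_BIN", "ffmpeg"))
}
```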

func FindCommentsToken

func FindCommentsToken(root any) string

FindCommentsToken finds the comment-section continuation token in a watch page's ytInitialData (the reliable source: the /next API strips it for unauthenticated requests).

func MergeAV

func MergeAV(ctx context.Context, ffmpeg, videoPath, audioPath, out string) error

MergeAV muxes a video-only and audio-only file into one container, copying both streams without re-encoding. The output container is taken from out's extension.

func NormalizeChannelID

func NormalizeChannelID(input string) string

NormalizeChannelID strips the URL prefix and any path suffix, returning a bare channel ID (UC...) or a handle (without @).

func NormalizeChannelURL

func NormalizeChannelURL(input string) string

NormalizeChannelURL converts a channel ID, @handle, vanity name, or URL to a canonical https://www.youtube.com/... form that ends with /videos.

func NormalizePlaylistURL

func NormalizePlaylistURL(input string) string

NormalizePlaylistURL converts a playlist ID or URL to a canonical https://www.youtube.com/playlist?list=ID form.

func NormalizeVideoURL

func NormalizeVideoURL(input string) string

NormalizeVideoURL converts a video ID, youtu.be URL, shorts URL, or full watch URL to a canonical https://www.youtube.com/watch?v=ID form.
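
A rough sketch of the extraction and normalization that ExtractVideoID and NormalizeVideoURL describe, using net/url. The helper names videoID and canonicalWatchURL are hypothetical, and the real functions may accept more URL shapes (embed links, for example) than the three handled here.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// videoID pulls a video ID out of a watch, youtu.be, or shorts URL, and
// returns any input without a slash unchanged as a bare ID.
func videoID(input string) string {
	input = strings.TrimSpace(input)
	if !strings.Contains(input, "/") {
		return input // treated as a bare ID
	}
	if !strings.Contains(input, "://") {
		input = "https://" + input // tolerate scheme-less links
	}
	u, err := url.Parse(input)
	if err != nil {
		return ""
	}
	host := strings.TrimPrefix(strings.ToLower(u.Hostname()), "www.")
	segs := strings.Split(strings.Trim(u.Path, "/"), "/")
	switch {
	case host == "youtu.be":
		return segs[0] // youtu.be/<id>
	case len(segs) >= 2 && segs[0] == "shorts":
		return segs[1] // /shorts/<id>
	default:
		return u.Query().Get("v") // /watch?v=<id>
	}
}

// canonicalWatchURL rebuilds the canonical watch form around the ID.
func canonicalWatchURL(input string) string {
	return "https://www.youtube.com/watch?v=" + videoID(input)
}

func main() {
	fmt.Println(canonicalWatchURL("https://youtu.be/dQw4w9WgXcQ?t=10"))
}
```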

func ParseChannelNumericCounts

func ParseChannelNumericCounts(data map[string]any, ch *Channel)

ParseChannelNumericCounts enriches a Channel with numeric counts from InnerTube data.

func ParseChannelPage

func ParseChannelPage(data *PageData, pageURL string) (*Channel, []Video, string, error)

ParseChannelPage parses ytInitialData from a channel HTML page.

func ParseContinuationPlaylistVideos

func ParseContinuationPlaylistVideos(data map[string]any, playlistID string) ([]Video, []PlaylistVideo, string)

ParseContinuationPlaylistVideos extracts playlist videos from a /browse continuation.

func ParseInnerTubeSearchResults

func ParseInnerTubeSearchResults(data map[string]any) ([]Video, []Channel, []Playlist, string)

ParseInnerTubeSearchResults extracts videos, channels, playlists and the next continuation token from an InnerTube /search continuation response.

func ParseMicroformat

func ParseMicroformat(playerResp map[string]any, v *Video)

ParseMicroformat enriches a Video struct from microformat and playabilityStatus data.

func ParsePlaylistPage

func ParsePlaylistPage(data *PageData, pageURL string) (*Playlist, []PlaylistVideo, []Video, string, error)

ParsePlaylistPage parses ytInitialData from a playlist HTML page.

func ParseSearchPage

func ParseSearchPage(data *PageData, query string) ([]SearchResult, []Video, []Channel, []Playlist, string, error)

ParseSearchPage parses a search results HTML page.

func ParseVideoPage

func ParseVideoPage(data *PageData, pageURL string) (*Video, []CaptionTrack, []RelatedVideo, string, error)

ParseVideoPage parses ytInitialData + ytInitialPlayerResponse from a video HTML page.

func RenderOutputTemplate

func RenderOutputTemplate(tmpl string, f OutputFields) string

RenderOutputTemplate expands a yt-dlp-style template such as "%(title)s [%(id)s].%(ext)s". Unknown fields expand to "NA". The result is sanitized component-by-component so path separators in titles cannot escape the intended directory.
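
The expansion rule can be illustrated with a small regexp-based sketch. It is an approximation under stated assumptions: expand is a hypothetical helper, fields are passed as a plain map rather than OutputFields, and sanitizing happens per value here, whereas the documented function sanitizes the result component by component.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// fieldRE matches a yt-dlp-style string field such as %(title)s.
var fieldRE = regexp.MustCompile(`%\((\w+)\)s`)

// expand replaces each %(name)s with fields[name], or "NA" when the name
// is unknown. Path separators inside a value are replaced so a value
// cannot add directory components of its own.
func expand(tmpl string, fields map[string]string) string {
	return fieldRE.ReplaceAllStringFunc(tmpl, func(m string) string {
		name := fieldRE.FindStringSubmatch(m)[1]
		val, ok := fields[name]
		if !ok {
			return "NA"
		}
		return strings.NewReplacer("/", "_", `\`, "_").Replace(val)
	})
}

func main() {
	fields := map[string]string{"title": "a/b", "id": "xyz"}
	fmt.Println(expand("%(title)s [%(id)s].%(ext)s", fields)) // prints "a_b [xyz].NA"
}
```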

func RenderSubtitles

func RenderSubtitles(segs []TranscriptSegment, format SubtitleFormat) string

RenderSubtitles serializes timed segments into srt, vtt, or plain text. The conversion is pure-Go and needs no ffmpeg.
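
For the srt case, serialization amounts to numbering the cues and formatting timestamps as HH:MM:SS,mmm. The sketch below assumes a segment type with Start, End, and Text fields; TranscriptSegment's actual fields are not shown in this section, so segment is a hypothetical stand-in.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// segment is a hypothetical stand-in for TranscriptSegment.
type segment struct {
	Start, End time.Duration
	Text       string
}

// srtTime formats d as HH:MM:SS,mmm, the SubRip timestamp form.
func srtTime(d time.Duration) string {
	ms := d.Milliseconds()
	return fmt.Sprintf("%02d:%02d:%02d,%03d", ms/3600000, ms/60000%60, ms/1000%60, ms%1000)
}

// renderSRT writes one numbered cue per segment, separated by blank lines.
func renderSRT(segs []segment) string {
	var b strings.Builder
	for i, s := range segs {
		fmt.Fprintf(&b, "%d\n%s --> %s\n%s\n\n", i+1, srtTime(s.Start), srtTime(s.End), s.Text)
	}
	return b.String()
}

func main() {
	fmt.Print(renderSRT([]segment{
		{Start: 0, End: 1500 * time.Millisecond, Text: "hello"},
		{Start: 2 * time.Second, End: 3 * time.Second, Text: "world"},
	}))
}
```

A vtt renderer would differ mainly in its "WEBVTT" header line and in using a period instead of a comma before the milliseconds.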

func SponsorChapters

func SponsorChapters(segs []SponsorSegment) string

SponsorChapters builds an ffmpeg metadata file marking sponsor segments so players can show them as chapters. It is an alternative to cutting them out.
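
An ffmpeg metadata file starts with a ";FFMETADATA1" line and lists chapters as [CHAPTER] sections. The sketch below shows that layout under assumptions: sponsorSeg is a hypothetical stand-in for SponsorSegment (whose fields are not shown here), times are taken to be seconds, and the category is used as the chapter title.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// sponsorSeg is a hypothetical stand-in for SponsorSegment.
type sponsorSeg struct {
	Category   string
	Start, End float64 // seconds from the start of the video (assumed)
}

// chaptersMetadata renders segments as ffmpeg metadata chapters with a
// millisecond timebase. Category names such as "sponsor" contain no
// characters that the metadata format requires escaping.
func chaptersMetadata(segs []sponsorSeg) string {
	var b strings.Builder
	b.WriteString(";FFMETADATA1\n")
	for _, s := range segs {
		fmt.Fprintf(&b, "[CHAPTER]\nTIMEBASE=1/1000\nSTART=%d\nEND=%d\ntitle=%s\n",
			int64(math.Round(s.Start*1000)), int64(math.Round(s.End*1000)), s.Category)
	}
	return b.String()
}

func main() {
	fmt.Print(chaptersMetadata([]sponsorSeg{{Category: "sponsor", Start: 12.5, End: 40}}))
}
```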

Types

type Album

type Album struct {
	AlbumID         string    `json:"album_id"`
	Title           string    `json:"title"`
	ArtistID        string    `json:"artist_id"`
	ArtistName      string    `json:"artist_name"`
	AlbumType       string    `json:"album_type"`
	Year            string    `json:"year"`
	TrackCount      int       `json:"track_count"`
	DurationText    string    `json:"duration_text"`
	ThumbnailURL    string    `json:"thumbnail_url"`
	AudioPlaylistID string    `json:"audio_playlist_id"`
	Description     string    `json:"description"`
	URL             string    `json:"url"`
	FetchedAt       time.Time `json:"fetched_at"`
}

Album is a YouTube Music album.

type AlbumTrack

type AlbumTrack struct {
	AlbumID  string `json:"album_id"`
	VideoID  string `json:"video_id"`
	Position int    `json:"position"`
}

AlbumTrack is the album↔song join.

type Artist

type Artist struct {
	ArtistID        string    `json:"artist_id"`
	Name            string    `json:"name"`
	Description     string    `json:"description"`
	SubscribersText string    `json:"subscribers_text"`
	ThumbnailURL    string    `json:"thumbnail_url"`
	URL             string    `json:"url"`
	FetchedAt       time.Time `json:"fetched_at"`
}

Artist is a YouTube Music artist.

type ArtistAlbum

type ArtistAlbum struct {
	ArtistID  string `json:"artist_id"`
	AlbumID   string `json:"album_id"`
	AlbumType string `json:"album_type"`
}

ArtistAlbum is the artist↔album join.

type CaptionTrack

type CaptionTrack struct {
	VideoID         string    `json:"video_id"`
	LanguageCode    string    `json:"language_code"`
	Name            string    `json:"name"`
	BaseURL         string    `json:"base_url"`
	Kind            string    `json:"kind"`
	IsAutoGenerated bool      `json:"is_auto_generated"`
	FetchedAt       time.Time `json:"fetched_at"`
}

CaptionTrack is one available caption track for a video.

type Channel

type Channel struct {
	ChannelID         string    `json:"channel_id" kit:"id" table:"id"`
	Handle            string    `json:"handle" table:"handle"`
	Title             string    `json:"title" table:"title,truncate"`
	Description       string    `json:"description" kit:"body" table:"-"`
	AvatarURL         string    `json:"avatar_url" table:"-"`
	BannerURL         string    `json:"banner_url" table:"-"`
	SubscribersText   string    `json:"subscribers_text" table:"subscribers"`
	VideosText        string    `json:"videos_text" table:"videos"`
	ViewsText         string    `json:"views_text" table:"-"`
	Country           string    `json:"country" table:"-"`
	JoinedDateText    string    `json:"joined_date_text" table:"-"`
	UploadsPlaylistID string    `json:"uploads_playlist_id" table:"-"`
	URL               string    `json:"url" table:"url,url"`
	SubscriberCount   int64     `json:"subscriber_count" table:"-"`
	VideoCount        int64     `json:"video_count" table:"-"`
	ViewCount         int64     `json:"view_count" table:"-"`
	Keywords          []string  `json:"keywords" table:"-"`
	TrailerVideoID    string    `json:"trailer_video_id" table:"-"`
	IsVerified        bool      `json:"is_verified" table:"-"`
	FetchedAt         time.Time `json:"fetched_at" table:"-"`
}

Channel is one YouTube channel.

type Chapter

type Chapter struct {
	VideoID      string `json:"video_id"`
	Title        string `json:"title"`
	StartSeconds int    `json:"start_seconds"`
	ThumbnailURL string `json:"thumbnail_url"`
	Position     int    `json:"position"`
}

Chapter is one chapter marker on a video.

func ParseChapters

func ParseChapters(nextResp map[string]any, videoID string, description string) []Chapter

ParseChapters extracts chapter data from an InnerTube /next response. When the response carries no chapters, it falls back to parsing timestamps from the video description.
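
The description fallback could look roughly like this: scan each line for a leading M:SS or H:MM:SS timestamp followed by a title. The regular expression and the chapter type are assumptions made for the sketch; the package's actual matching rules are not documented here.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// chapter mirrors only the two Chapter fields this sketch fills in.
type chapter struct {
	Title        string
	StartSeconds int
}

// tsLine matches lines such as "0:00 Intro" or "1:02:03 Deep dive".
var tsLine = regexp.MustCompile(`^(?:(\d+):)?(\d{1,2}):(\d{2})\s+(.+)$`)

// chaptersFromDescription returns one chapter per timestamped line.
func chaptersFromDescription(desc string) []chapter {
	var out []chapter
	for _, line := range strings.Split(desc, "\n") {
		m := tsLine.FindStringSubmatch(strings.TrimSpace(line))
		if m == nil {
			continue
		}
		hh, _ := strconv.Atoi(m[1]) // an absent hour group yields 0
		mm, _ := strconv.Atoi(m[2])
		ss, _ := strconv.Atoi(m[3])
		out = append(out, chapter{
			Title:        strings.TrimSpace(m[4]),
			StartSeconds: hh*3600 + mm*60 + ss,
		})
	}
	return out
}

func main() {
	desc := "Welcome\n0:00 Intro\n1:02:03 Deep dive\nno timestamp here"
	for _, c := range chaptersFromDescription(desc) {
		fmt.Println(c.StartSeconds, c.Title)
	}
}
```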

type Client

type Client struct {
	// contains filtered or unexported fields
}

Client is the rate-limited HTTP front end for YouTube web + InnerTube.

func NewClient

func NewClient(cfg Config) *Client

NewClient builds a Client from cfg.

func (*Client) Captions

func (c *Client) Captions(ctx context.Context, idOrURL string) ([]CaptionTrack, error)

Captions returns the list of available caption tracks for a video.

The caption list lives in the player response. The HTML-embedded ytInitialPlayerResponse is the reliable source (the bare InnerTube /player WEB call is frequently bot-gated and omits captions/streamingData), so we read the page first and fall back to /player only if the page yields nothing.

func (*Client) DownloadThumbnail

func (c *Client) DownloadThumbnail(ctx context.Context, videoID, dst string) (Thumbnail, error)

DownloadThumbnail fetches the best available rendition for a video to dst, trying renditions largest-first and skipping any that 404.

func (*Client) DownloadToFile

func (c *Client) DownloadToFile(ctx context.Context, rawURL, dst string, total int64, workers int, onProgress func(DownloadProgress)) error

DownloadToFile fetches rawURL into dst. When total is known and workers > 1 it downloads ranges concurrently, writing each at its offset; otherwise it streams sequentially. onProgress, if non-nil, is called as bytes land.
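
The concurrent path depends on cutting the file into byte ranges, one per worker, each written at its own offset. A minimal sketch of that split, assuming equal-sized chunks with the remainder going to the last worker (the package's actual chunking policy is not documented); byteRange and splitRanges are hypothetical names.

```go
package main

import "fmt"

// byteRange is an inclusive byte span, as used in an HTTP Range header.
type byteRange struct{ Start, End int64 }

// header renders the span as a Range header value.
func (r byteRange) header() string {
	return fmt.Sprintf("bytes=%d-%d", r.Start, r.End)
}

// splitRanges divides total bytes into at most workers contiguous spans.
func splitRanges(total int64, workers int) []byteRange {
	if total <= 0 || workers < 1 {
		return nil
	}
	if int64(workers) > total {
		workers = int(total) // never hand out empty ranges
	}
	chunk := total / int64(workers)
	ranges := make([]byteRange, 0, workers)
	for i := 0; i < workers; i++ {
		start := int64(i) * chunk
		end := start + chunk - 1
		if i == workers-1 {
			end = total - 1 // the last range absorbs the remainder
		}
		ranges = append(ranges, byteRange{start, end})
	}
	return ranges
}

func main() {
	for _, r := range splitRanges(10, 3) {
		fmt.Println(r.header())
	}
}
```

Each worker would then request its span and write the body at offset Start, for example with (*os.File).WriteAt, so the spans can complete in any order.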

func (*Client) Fetch

func (c *Client) Fetch(ctx context.Context, url string) ([]byte, int, error)

Fetch GETs url with browser-like headers and the polite rate limit, retrying transient 429/5xx responses with backoff.
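
Retrying transient statuses with backoff can be sketched independently of any HTTP machinery. In the sketch below, withRetry and transient are hypothetical helpers, the doubling schedule is an assumption (the docs say only "backoff"), and non-HTTP errors are returned immediately rather than retried.

```go
package main

import (
	"fmt"
	"time"
)

// transient reports whether an HTTP status is worth retrying.
func transient(status int) bool {
	return status == 429 || (status >= 500 && status <= 599)
}

// withRetry calls do once, then up to retries more times while it keeps
// returning a transient status, doubling the wait between attempts.
func withRetry(retries int, base time.Duration, do func() (int, error)) (int, error) {
	wait := base
	for attempt := 0; ; attempt++ {
		status, err := do()
		if err != nil || !transient(status) || attempt >= retries {
			return status, err
		}
		time.Sleep(wait)
		wait *= 2
	}
}

func main() {
	calls := 0
	status, err := withRetry(3, time.Millisecond, func() (int, error) {
		calls++
		if calls < 3 {
			return 503, nil // simulate two transient failures
		}
		return 200, nil
	})
	fmt.Println(status, err, calls) // prints "200 <nil> 3"
}
```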

func (*Client) FetchAlbum

func (c *Client) FetchAlbum(ctx context.Context, idOrURL string) (*Album, []Song, error)

FetchAlbum fetches the album page for the given ID or URL and returns the album and its track list.

func (*Client) FetchArtist

func (c *Client) FetchArtist(ctx context.Context, idOrURL string) (*Artist, []Album, []Song, error)

FetchArtist fetches the artist page for the given ID or URL and returns the artist, their albums (including singles/EPs), and top songs.

func (*Client) FetchChannel

func (c *Client) FetchChannel(ctx context.Context, idOrURL string) (*Channel, error)

FetchChannel fetches a channel's header metadata.

func (*Client) FetchHTML

func (c *Client) FetchHTML(ctx context.Context, url string) (*goquery.Document, int, error)

FetchHTML fetches and parses an HTML document.

func (*Client) FetchMusicPlaylist

func (c *Client) FetchMusicPlaylist(ctx context.Context, idOrURL string) (*Album, []Song, error)

FetchMusicPlaylist fetches a music playlist by ID or URL and returns an Album (playlist metadata) and the song list.

func (*Client) FetchPageData

func (c *Client) FetchPageData(ctx context.Context, url string) (*PageData, int, error)

FetchPageData fetches an HTML page and extracts the embedded JSON bootstrap blobs.

func (*Client) FetchPlaylist

func (c *Client) FetchPlaylist(ctx context.Context, idOrURL string) (*Playlist, error)

FetchPlaylist fetches a playlist's header metadata.

func (*Client) FetchSong

func (c *Client) FetchSong(ctx context.Context, videoID string, withLyrics bool) (*Song, error)

FetchSong fetches details for a single song/video via MusicPlayer and optionally retrieves lyrics via the /next browse tab.

func (*Client) FetchTimedText

func (c *Client) FetchTimedText(ctx context.Context, url string) ([]byte, error)

FetchTimedText fetches a caption track's timed-text XML and returns its raw bytes.

func (*Client) FetchVideo

func (c *Client) FetchVideo(ctx context.Context, idOrURL string, opt VideoOptions) (*VideoResult, error)

FetchVideo fetches a single video's full metadata.

It always fetches the HTML page (ytInitialData + ytInitialPlayerResponse). If opt.Player is true, it additionally calls InnerTube /player for richer format and microformat data. If opt.Next is true, it calls InnerTube /next for chapters and related videos. If opt.Transcript is true, it fetches the best matching caption track (or the one matching opt.Lang) and populates Video.Transcript / Video.TranscriptLanguage.

func (*Client) Formats

func (c *Client) Formats(ctx context.Context, idOrURL string) ([]VideoFormat, error)

Formats returns the streaming formats for a video.

Like Captions, this reads the HTML-embedded player response first (it carries streamingData reliably) and falls back to the InnerTube /player call.

func (*Client) GL

func (c *Client) GL() string

func (*Client) HL

func (c *Client) HL() string

HL and GL return the configured interface language and content country used for the InnerTube context.

func (*Client) HTTP

func (c *Client) HTTP() *http.Client

HTTP exposes the underlying *http.Client (used by the InnerTube client and yt-dlp probe).

func (*Client) MusicSearch

func (c *Client) MusicSearch(ctx context.Context, query, typ string, opt PageOptions, emit func(any) error) error

MusicSearch streams search results for the given query and type filter. typ may be "", "song", "album", "artist", "playlist", or "video". emit receives Artist, Album, or Song values. Iteration stops on ErrStop.

func (*Client) ResolveChannelID

func (c *Client) ResolveChannelID(ctx context.Context, input string) (string, error)

ResolveChannelID resolves a handle, vanity name, or URL to a UC-style channel ID. A UC... input is returned unchanged.

func (*Client) ResolveStreamURL

func (c *Client) ResolveStreamURL(ctx context.Context, m *StreamManifest, s *Stream) (string, error)

ResolveStreamURL turns a stream into a fetchable googlevideo URL, deciphering the signature and transforming the n parameter as needed.

func (*Client) Search

func (c *Client) Search(ctx context.Context, query string, f SearchFilters, opt PageOptions, emit func(any) error) error

Search performs a YouTube search and streams results to emit. Each call to emit receives a Video, Channel, or Playlist value. Returning ErrStop from emit halts iteration cleanly.

func (*Client) SponsorSegments

func (c *Client) SponsorSegments(ctx context.Context, videoID string, categories []string) ([]SponsorSegment, error)

SponsorSegments fetches segments for a video. When categories is empty all categories are requested.

func (*Client) StreamChannelPlaylists

func (c *Client) StreamChannelPlaylists(ctx context.Context, idOrURL string, opt PageOptions, emit func(Playlist) error) error

StreamChannelPlaylists streams playlists from a channel's playlists tab. The emit function receives each Playlist; returning ErrStop halts iteration cleanly.

func (*Client) StreamChannelTab

func (c *Client) StreamChannelTab(ctx context.Context, idOrURL, tab string, opt PageOptions, emit func(Video) error) error

StreamChannelTab streams videos from a channel tab (videos, shorts, or streams). tab must be one of "videos", "shorts", "streams". If opt.Enrich is true, each video is enriched with a /player call. The emit function receives each Video; returning ErrStop halts iteration cleanly.

func (*Client) StreamComments

func (c *Client) StreamComments(ctx context.Context, idOrURL string, opt CommentOptions, emit func(Comment) error) error

StreamComments streams comments (and optionally replies) for a video. idOrURL may be a video ID or any URL form. Returning ErrStop from emit halts iteration cleanly.

The watch page's ytInitialData is the reliable source for both the comment continuation token and the visitor session it is bound to; the /next API strips the token for unauthenticated requests. Comment bodies arrive as entity payloads (the modern model), with the classic commentRenderer kept as a fallback for replies and older responses.

func (*Client) StreamCommunity

func (c *Client) StreamCommunity(ctx context.Context, channel string, opt PageOptions, emit func(CommunityPost) error) error

StreamCommunity streams community posts for a channel. channel may be a channel ID (UC...), handle (@name), vanity name, or URL. Returning ErrStop from emit halts iteration cleanly.

func (*Client) StreamHashtag

func (c *Client) StreamHashtag(ctx context.Context, tag string, opt PageOptions, emit func(Video) error) error

StreamHashtag streams videos tagged with a YouTube hashtag. tag may include or omit the leading "#". Returning ErrStop from emit halts iteration cleanly.

func (*Client) StreamManifest

func (c *Client) StreamManifest(ctx context.Context, idOrURL string) (*StreamManifest, error)

StreamManifest resolves the downloadable streams for a video. It leads with the ANDROID_VR client (plain, token-free URLs) and falls back to the watch page's player response (ciphered URLs solved via base.js).

func (*Client) StreamPlaylistItems

func (c *Client) StreamPlaylistItems(ctx context.Context, idOrURL string, opt PageOptions, emit func(PlaylistVideo, Video) error) error

StreamPlaylistItems streams items from a playlist. emit receives each (PlaylistVideo, Video) pair in playlist order. Returning ErrStop from emit halts iteration cleanly.

func (*Client) Suggest

func (c *Client) Suggest(ctx context.Context, query string) ([]string, error)

Suggest returns autocomplete suggestions for query from YouTube's suggestion endpoint.

func (*Client) Transcript

func (c *Client) Transcript(ctx context.Context, idOrURL, lang string) (string, []TranscriptSegment, error)

Transcript fetches the timed-text XML for a video caption track and returns the joined plain-text transcript and the individual timed segments. lang selects the caption track; if empty the first non-auto track is used.

func (*Client) Trending

func (c *Client) Trending(ctx context.Context, category string, opt PageOptions, emit func(Video) error) error

Trending streams trending/popular videos for the given category. category may be "music", "gaming", "news", "movies", or "" for general trending. Returning ErrStop from emit halts iteration cleanly.

func (*Client) Walk added in v0.4.0

func (c *Client) Walk(ctx context.Context, seeds []Seed, opts WalkOptions, emit func(*Node) error) error

Walk runs the client's traversal. It is the production entry point: it builds a Walker over the client and walks the seeds. See Walker.Walk.

type Comment

type Comment struct {
	ID                 string    `json:"id" kit:"id" table:"id"`
	VideoID            string    `json:"video_id" kit:"link,kind=youtube/video" table:"-"`
	ParentID           string    `json:"parent_id" table:"-"`
	AuthorChannelID    string    `json:"author_channel_id" kit:"link,kind=youtube/channel" table:"-"`
	AuthorDisplayName  string    `json:"author_display_name" table:"author,truncate"`
	AuthorProfileImage string    `json:"author_profile_image_url" table:"-"`
	TextDisplay        string    `json:"text_display" kit:"body" table:"text,truncate"`
	LikeCount          int64     `json:"like_count" table:"likes"`
	ReplyCount         int       `json:"reply_count" table:"replies"`
	IsOwnerComment     bool      `json:"is_owner_comment" table:"-"`
	PublishedText      string    `json:"published_text" table:"published"`
	PublishedAt        time.Time `json:"published_at" table:"-"`
	UpdatedAt          time.Time `json:"updated_at" table:"-"`
	FetchedAt          time.Time `json:"fetched_at" table:"-"`
}

Comment is one comment or reply. Replies carry the parent comment id in ParentID.

func ParseCommentRenderer

func ParseCommentRenderer(m map[string]any, videoID, parentID string) *Comment

ParseCommentRenderer parses a single commentRenderer or replyRenderer map.

type CommentOptions

type CommentOptions struct {
	Max      int
	MaxPages int
	Replies  bool
	Sort     string // "top" | "new"
}

CommentOptions controls the comment stream.

type CommunityPost

type CommunityPost struct {
	PostID        string    `json:"post_id" kit:"id" table:"id"`
	ChannelID     string    `json:"channel_id" kit:"link,kind=youtube/channel" table:"-"`
	AuthorName    string    `json:"author_name" table:"author,truncate"`
	AuthorAvatar  string    `json:"author_avatar_url" table:"-"`
	ContentText   string    `json:"content_text" kit:"body" table:"text,truncate"`
	LikeCount     int64     `json:"like_count" table:"likes"`
	ReplyCount    int       `json:"reply_count" table:"-"`
	VoteCount     string    `json:"vote_count_text" table:"-"`
	PublishedText string    `json:"published_text" table:"published"`
	Attachments   string    `json:"attachments" table:"-"`
	FetchedAt     time.Time `json:"fetched_at" table:"-"`
}

CommunityPost is one community/posts-tab post. Attachments holds a JSON array encoded as a string.

func ParseCommunityPost

func ParseCommunityPost(m map[string]any, channelID string) *CommunityPost

ParseCommunityPost parses a backstagePostRenderer or sharedPostRenderer.

type Config

type Config struct {
	Workers int
	Delay   time.Duration
	Timeout time.Duration
	Retries int
	HL      string // interface language, e.g. "en"
	GL      string // content country, e.g. "US"
}

Config controls the HTTP client and InnerTube behaviour.

func DefaultConfig

func DefaultConfig() Config

DefaultConfig returns the zero-setup defaults.

type CrawlOptions

type CrawlOptions struct {
	// Workers is the number of concurrent queue-pop goroutines (1 = sequential).
	Workers int
	// Entity restricts crawling to a specific entity type (empty = all).
	Entity string
	// MaxPerItem caps the number of items fetched per queue entry (0 = unlimited).
	MaxPerItem int
}

CrawlOptions controls how the Crawl loop runs.

type Domain added in v0.2.0

type Domain struct{}

Domain is the YouTube driver. It carries no state; the per-run client is built by the factory Register hands kit.

func (Domain) Classify added in v0.2.0

func (Domain) Classify(input string) (uriType, id string, err error)

Classify turns any accepted input into the canonical (type, id), so a host resolves a youtube:// reference without a network call. A URL is read by what its path and query say; a watch link carries both a video and a playlist id and the video wins, matching what a person means by the link. A bare string is read by its shape: @handle and UC ids are channels, the playlist prefixes are playlists, and anything else is taken as a video id.

func (Domain) Info added in v0.2.0

func (Domain) Info() kit.DomainInfo

Info describes the scheme, the hostnames a pasted link is matched against, and the identity a host reuses for help and version.

func (Domain) Locate added in v0.2.0

func (Domain) Locate(uriType, id string) (string, error)

Locate is the inverse of Classify: it returns the live page URL for a (type, id).

func (Domain) Register added in v0.2.0

func (Domain) Register(app *kit.App)

Register installs the client factory and every YouTube record operation onto app. A resolver op names its own record type and answers `ant get`; a List op enumerates a parent resource's members and answers `ant ls`. The non-record commands (download, transcript text, the local store, config) are not operations; the standalone binary adds them as escape hatches in cli.

type DownloadArchive

type DownloadArchive struct {
	// contains filtered or unexported fields
}

DownloadArchive tracks which videos have already been downloaded, mirroring yt-dlp's --download-archive file (one "youtube <id>" record per line). It is safe for concurrent use.

func OpenArchive

func OpenArchive(path string) (*DownloadArchive, error)

OpenArchive loads (or initializes) an archive file. A missing file is treated as an empty archive; it is created on the first Add.

func (*DownloadArchive) Add

func (a *DownloadArchive) Add(videoID string) error

Add records videoID and appends it to the archive file. It is a no-op when the archive has no path or the id is already present.

func (*DownloadArchive) Has

func (a *DownloadArchive) Has(videoID string) bool

Has reports whether videoID is already recorded.

func (*DownloadArchive) IDs

func (a *DownloadArchive) IDs() []string

IDs returns the recorded video IDs in sorted order.
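
The archive behaviour described above (one "youtube <id>" line per record, a missing file treated as empty, creation deferred to the first Add) can be sketched as follows. The archive type and its methods are hypothetical lowercase stand-ins, not the package's code, and demo exists only to exercise a write followed by a reopen.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"path/filepath"
	"strings"
	"sync"
)

type archive struct {
	mu   sync.Mutex
	path string
	seen map[string]bool
}

// openArchive loads existing records; a missing file is an empty archive.
func openArchive(path string) (*archive, error) {
	a := &archive{path: path, seen: make(map[string]bool)}
	f, err := os.Open(path)
	if os.IsNotExist(err) {
		return a, nil
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		if id, ok := strings.CutPrefix(sc.Text(), "youtube "); ok {
			a.seen[strings.TrimSpace(id)] = true
		}
	}
	return a, sc.Err()
}

// add appends the record unless it is already present or there is no path.
func (a *archive) add(id string) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.path == "" || a.seen[id] {
		return nil
	}
	f, err := os.OpenFile(a.path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	if _, err := fmt.Fprintf(f, "youtube %s\n", id); err != nil {
		f.Close()
		return err
	}
	a.seen[id] = true
	return f.Close()
}

func (a *archive) has(id string) bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.seen[id]
}

// demo adds one ID to a fresh archive in a temp dir, then reopens it.
func demo() (before, after bool, err error) {
	dir, err := os.MkdirTemp("", "archive-sketch")
	if err != nil {
		return false, false, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "archive.txt")
	a, err := openArchive(path)
	if err != nil {
		return false, false, err
	}
	before = a.has("abc123")
	if err := a.add("abc123"); err != nil {
		return before, false, err
	}
	b, err := openArchive(path) // reopen to confirm the record persisted
	if err != nil {
		return before, false, err
	}
	return before, b.has("abc123"), nil
}

func main() {
	fmt.Println(demo())
}
```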

type DownloadProgress

type DownloadProgress struct {
	Downloaded int64
	Total      int64
}

DownloadProgress reports cumulative bytes written and the total when known.

type Edge added in v0.4.0

type Edge string

Edge names a link the walk can follow. The string is the public vocabulary: it is what the user types in --follow, what lands in the store's edges.kind column, and what a discovered node reports as the edge it arrived by.

const (
	EdgeChannel   Edge = "channel"   // video -> the channel that uploaded it
	EdgeRelated   Edge = "related"   // video -> a related/recommended video
	EdgeComments  Edge = "comments"  // video -> a comment on it
	EdgeUploads   Edge = "uploads"   // channel -> a video it uploaded
	EdgePlaylists Edge = "playlists" // channel -> a playlist it owns
	EdgeCommunity Edge = "community" // channel -> a community post
	EdgeItems     Edge = "items"     // playlist -> a video in it
	EdgeOwner     Edge = "owner"     // playlist -> the channel that owns it
	EdgeCommenter Edge = "commenter" // comment -> the channel that wrote it
)

func (Edge) Target added in v0.4.0

func (e Edge) Target() NodeKind

Target reports the kind of node an edge leads to.

type EdgeSet added in v0.4.0

type EdgeSet map[Edge]bool

EdgeSet is a chosen set of edges to follow.

func DefaultEdges added in v0.4.0

func DefaultEdges() EdgeSet

DefaultEdges is what a walk follows when --follow is unset: the obvious neighbors of the seed, all on the open surface, so `ytb discover <video>` works with no tokens and no anti-bot exposure.

func ParseEdges added in v0.4.0

func ParseEdges(spec string) (EdgeSet, error)

ParseEdges turns a --follow spec into an EdgeSet. The spec is a comma list of preset names and/or edge names ("content", "channel,related", "uploads,items"). An empty spec yields DefaultEdges. An unknown token is a usage error naming the catalogue, so a typo points the user at the real vocabulary.
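
A simplified sketch of the parsing rule, covering edge names only. Presets such as "content" are left out because their membership is not listed in this section, and the default set is passed in as a parameter for the same reason; parseEdges, edgeSet, and the error wording are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

type edgeSet map[string]bool

// knownEdges lists the edge names from the Edge constants.
var knownEdges = []string{
	"channel", "related", "comments", "uploads", "playlists",
	"community", "items", "owner", "commenter",
}

func isKnown(tok string) bool {
	for _, e := range knownEdges {
		if e == tok {
			return true
		}
	}
	return false
}

// parseEdges turns a comma list into a set. An empty spec yields defaults;
// an unknown token is an error that names the full catalogue.
func parseEdges(spec string, defaults edgeSet) (edgeSet, error) {
	if strings.TrimSpace(spec) == "" {
		return defaults, nil
	}
	set := edgeSet{}
	for _, tok := range strings.Split(spec, ",") {
		tok = strings.TrimSpace(tok)
		if !isKnown(tok) {
			return nil, fmt.Errorf("unknown edge %q (known: %s)", tok, strings.Join(knownEdges, ", "))
		}
		set[tok] = true
	}
	return set, nil
}

func main() {
	set, err := parseEdges("uploads, items", nil)
	fmt.Println(len(set), err)
	_, err = parseEdges("uplods", nil)
	fmt.Println(err)
}
```

Listing the catalogue in the error message is what lets a typo point the user at the valid names.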

func (EdgeSet) Has added in v0.4.0

func (s EdgeSet) Has(e Edge) bool

Has reports whether the set contains e (a nil set contains nothing).

func (EdgeSet) List added in v0.4.0

func (s EdgeSet) List() []Edge

List returns the set's edges in stable display order.

func (EdgeSet) String added in v0.4.0

func (s EdgeSet) String() string

String renders the set as a comma-separated, ordered list.

type InnerTubeClient

type InnerTubeClient struct {
	// contains filtered or unexported fields
}

InnerTubeClient calls YouTube's internal InnerTube API. No API key or authentication is required. It routes every POST through the parent Client so requests are rate-limited and retried.

func NewInnerTube

func NewInnerTube(c *Client) *InnerTubeClient

NewInnerTube creates an InnerTube client bound to c.

func (*InnerTubeClient) AndroidVRPlayer

func (it *InnerTubeClient) AndroidVRPlayer(ctx context.Context, videoID string) (map[string]any, error)

AndroidVRPlayer calls /player with the ANDROID_VR client. Unlike the WEB /player call, the formats it returns carry plain `url` fields rather than a signatureCipher, so only the `n` throttling parameter still needs solving.

func (*InnerTubeClient) Browse

func (it *InnerTubeClient) Browse(ctx context.Context, browseID, params, continuation string) (map[string]any, error)

Browse calls /browse.

func (*InnerTubeClient) BrowseContinuation

func (it *InnerTubeClient) BrowseContinuation(ctx context.Context, continuation string) (map[string]any, error)

BrowseContinuation pages /browse with only a continuation token.

func (*InnerTubeClient) CommentContinuation

func (it *InnerTubeClient) CommentContinuation(ctx context.Context, continuation string) (map[string]any, error)

CommentContinuation pages comments/replies via the MWEB client.

func (*InnerTubeClient) CommentContinuationWEB

func (it *InnerTubeClient) CommentContinuationWEB(ctx context.Context, continuation, visitor string) (map[string]any, error)

CommentContinuationWEB pages comments/replies via the WEB client. Modern YouTube serves comment bodies as entity payloads under this client; the token comes from the watch page's ytInitialData and is bound to its visitor session.

func (*InnerTubeClient) Community

func (it *InnerTubeClient) Community(ctx context.Context, browseID, params, continuation string) (map[string]any, error)

Community fetches a channel's community/posts tab.

func (*InnerTubeClient) DiscoverCommunityTabParams

func (it *InnerTubeClient) DiscoverCommunityTabParams(ctx context.Context, browseID string) (string, error)

DiscoverCommunityTabParams finds the community/posts tab params for a channel.

func (*InnerTubeClient) MusicBrowse

func (it *InnerTubeClient) MusicBrowse(ctx context.Context, browseID, params, continuation string) (map[string]any, error)

MusicBrowse calls music.youtube.com /browse with the WEB_REMIX client.

func (*InnerTubeClient) MusicPlayer

func (it *InnerTubeClient) MusicPlayer(ctx context.Context, videoID string) (map[string]any, error)

MusicPlayer calls music.youtube.com /player for a song's details.

func (*InnerTubeClient) MusicSearch

func (it *InnerTubeClient) MusicSearch(ctx context.Context, query, params, continuation string) (map[string]any, error)

MusicSearch calls music.youtube.com /search with the WEB_REMIX client.

func (*InnerTubeClient) Next

func (it *InnerTubeClient) Next(ctx context.Context, videoID, continuation string) (map[string]any, error)

Next calls /next.

func (*InnerTubeClient) NextMWEB

func (it *InnerTubeClient) NextMWEB(ctx context.Context, videoID string) (map[string]any, error)

NextMWEB calls /next with the MWEB client (returns classic commentRenderer).

func (*InnerTubeClient) Player

func (it *InnerTubeClient) Player(ctx context.Context, videoID string) (map[string]any, error)

Player calls /player.

func (*InnerTubeClient) ResolveHashtag

func (it *InnerTubeClient) ResolveHashtag(ctx context.Context, hashtag string) (string, string, error)

ResolveHashtag resolves a hashtag to its browseId and params.

func (*InnerTubeClient) Search

func (it *InnerTubeClient) Search(ctx context.Context, query string, filters SearchFilters, continuation string) (map[string]any, error)

Search calls /search.

func (*InnerTubeClient) Suggest

func (it *InnerTubeClient) Suggest(ctx context.Context, input string) ([]string, error)

Suggest fetches autocomplete suggestions from the public suggestqueries endpoint.

type ItemSelector

type ItemSelector struct {
	// contains filtered or unexported fields
}

ItemSelector resolves a yt-dlp-style --playlist-items spec against a 1-based index. Supported forms, comma-separated: "1", "3-7", "5-" (open end), "-3" (from start), and negative indices counting from the end ("-1" is last when a total is known).

func ParseItemSelector

func ParseItemSelector(spec string) (*ItemSelector, error)

ParseItemSelector compiles a --playlist-items spec. An empty spec selects all.

func (*ItemSelector) Selects

func (s *ItemSelector) Selects(index, total int) bool

Selects reports whether the 1-based index is selected. total may be 0 when unknown; negative bounds then never match.
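
As an illustration of the range forms above, here is a minimal sketch of a selector covering only "N", "A-B", and "A-". It is not the package's implementation: the names span, parseSpans, and selects are hypothetical, and the leading-dash and negative forms are deliberately left out (this sketch rejects them).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// span is one inclusive 1-based range; hi == 0 means "open end".
// (Hypothetical type, for illustration only.)
type span struct{ lo, hi int }

// parseSpans compiles a comma-separated spec such as "1,3-5,9-".
// An empty spec yields no spans, which selects treats as "select all".
func parseSpans(spec string) ([]span, error) {
	var out []span
	for _, part := range strings.Split(spec, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		lo, hi, isRange := strings.Cut(part, "-")
		a, err := strconv.Atoi(lo)
		if err != nil || a < 1 {
			return nil, fmt.Errorf("bad item %q", part)
		}
		switch {
		case !isRange: // "N"
			out = append(out, span{a, a})
		case hi == "": // "A-"
			out = append(out, span{a, 0})
		default: // "A-B"
			b, err := strconv.Atoi(hi)
			if err != nil || b < a {
				return nil, fmt.Errorf("bad range %q", part)
			}
			out = append(out, span{a, b})
		}
	}
	return out, nil
}

// selects reports whether the 1-based index falls in any span.
func selects(spans []span, index int) bool {
	if len(spans) == 0 {
		return true
	}
	for _, s := range spans {
		if index >= s.lo && (s.hi == 0 || index <= s.hi) {
			return true
		}
	}
	return false
}

func main() {
	spans, err := parseSpans("1,3-5,9-")
	if err != nil {
		panic(err)
	}
	for i := 1; i <= 10; i++ {
		fmt.Println(i, selects(spans, i))
	}
}
```

Compiling the spec once and testing indices against it keeps the per-item check cheap when a playlist is streamed page by page.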

type JobRecord

type JobRecord struct {
	JobID       string    `json:"job_id"`
	Name        string    `json:"name"`
	Type        string    `json:"type"`
	Status      string    `json:"status"`
	StartedAt   time.Time `json:"started_at"`
	CompletedAt time.Time `json:"completed_at"`
}

JobRecord is one crawl job's history row.

type Node added in v0.4.0

type Node struct {
	Kind     NodeKind       `json:"kind"`
	Depth    int            `json:"depth"`
	Via      Edge           `json:"via,omitempty"`
	Parent   string         `json:"parent,omitempty"`
	Video    *Video         `json:"video,omitempty"`
	Channel  *Channel       `json:"channel,omitempty"`
	Playlist *Playlist      `json:"playlist,omitempty"`
	Comment  *Comment       `json:"comment,omitempty"`
	Post     *CommunityPost `json:"post,omitempty"`
}

Node is one object the walk reached, tagged with how it got there: the BFS depth, the edge it arrived by, and the endpoint of the node it came from. Exactly one of the entity pointers is set, matching Kind. Node is what Walk hands to its callback and what the CLI renders.

func (*Node) Endpoint added in v0.4.0

func (n *Node) Endpoint() string

Endpoint is the node's stable identifier inside a walk: a video/channel/playlist/comment/post id. It is what edges record as src/dst.

type NodeKind added in v0.4.0

type NodeKind string

NodeKind is the type of a node the walk visits.

const (
	KindVideo    NodeKind = "video"
	KindChannel  NodeKind = "channel"
	KindPlaylist NodeKind = "playlist"
	KindComment  NodeKind = "comment"
	KindPost     NodeKind = "post"
)

type OutputFields

type OutputFields struct {
	ID            string
	Title         string
	Author        string // channel/uploader name
	ChannelID     string
	PlaylistTitle string
	PlaylistIndex int
	Ext           string
	Resolution    string
	Duration      int
}

OutputFields supplies values for an output template. It mirrors the subset of yt-dlp's --output fields that the native engine can fill.
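
To show what filling an output template can look like, the sketch below expands yt-dlp-style `%(name)s` placeholders from a map. It is an assumption-laden illustration, not this package's engine: the function expand, the regular expression, and the "NA" fallback (yt-dlp's default for missing fields) are choices made here, and which field names the native engine actually accepts is not stated in this documentation.

```go
package main

import (
	"fmt"
	"regexp"
)

// fieldRe matches yt-dlp-style string placeholders such as %(title)s.
var fieldRe = regexp.MustCompile(`%\((\w+)\)s`)

// expand replaces each %(name)s with vals[name]; unknown names become "NA".
// (Hypothetical helper; only the plain %(name)s form is handled.)
func expand(tmpl string, vals map[string]string) string {
	return fieldRe.ReplaceAllStringFunc(tmpl, func(m string) string {
		key := fieldRe.FindStringSubmatch(m)[1]
		if v, ok := vals[key]; ok {
			return v
		}
		return "NA"
	})
}

func main() {
	vals := map[string]string{"id": "abc123", "title": "Demo", "ext": "mp4"}
	fmt.Println(expand("%(title)s [%(id)s].%(ext)s", vals))
}
```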

type PageData

type PageData struct {
	HTML          string
	InitialData   any
	PlayerResp    any
	YTCFG         map[string]any
	APIKey        string
	ClientVersion string
	VisitorData   string
}

PageData holds the JSON blobs scraped from a YouTube HTML page.

type PageOptions

type PageOptions struct {
	Max      int // max rows to emit (0 = unlimited)
	MaxPages int // max continuation pages (0 = unlimited)
	Enrich   bool
}

PageOptions bounds a paginated stream.
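
The two caps interact in a simple way: whichever limit is reached first ends the stream. The following sketch shows one way such a loop could be written. The fetchPage type, the fakePages helper, and the stream function are hypothetical names introduced for illustration, and the Enrich flag is not modelled.

```go
package main

import (
	"fmt"
	"strconv"
)

// pageOptions mirrors the Max and MaxPages fields (0 = unlimited).
type pageOptions struct{ Max, MaxPages int }

// fetchPage is a hypothetical page fetcher: it returns one page of rows and
// the next continuation token ("" when there are no more pages).
type fetchPage func(token string) (rows []string, next string)

// stream emits rows until a cap is hit or the pages run out, and returns the
// number of rows emitted.
func stream(fetch fetchPage, opts pageOptions, emit func(string)) int {
	emitted, pages, token := 0, 0, ""
	for {
		if opts.MaxPages > 0 && pages >= opts.MaxPages {
			return emitted
		}
		rows, next := fetch(token)
		pages++
		for _, r := range rows {
			if opts.Max > 0 && emitted >= opts.Max {
				return emitted
			}
			emit(r)
			emitted++
		}
		if next == "" {
			return emitted
		}
		token = next
	}
}

// fakePages serves a fixed number of equally sized pages for demonstration.
func fakePages(pages, perPage int) fetchPage {
	return func(token string) ([]string, string) {
		p := 0
		if token != "" {
			p, _ = strconv.Atoi(token)
		}
		rows := make([]string, perPage)
		for i := range rows {
			rows[i] = fmt.Sprintf("p%d-r%d", p, i)
		}
		if p+1 >= pages {
			return rows, ""
		}
		return rows, strconv.Itoa(p + 1)
	}
}

func main() {
	n := stream(fakePages(4, 3), pageOptions{Max: 5}, func(r string) { fmt.Println(r) })
	fmt.Println("emitted", n)
}
```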

type Playlist

type Playlist struct {
	PlaylistID      string    `json:"playlist_id" kit:"id" table:"id"`
	Title           string    `json:"title" table:"title,truncate"`
	Description     string    `json:"description" kit:"body" table:"-"`
	ChannelID       string    `json:"channel_id" kit:"link,kind=youtube/channel" table:"-"`
	ChannelName     string    `json:"channel_name" table:"channel,truncate"`
	VideoCount      int       `json:"video_count" table:"videos"`
	ViewCountText   string    `json:"view_count_text" table:"-"`
	LastUpdatedText string    `json:"last_updated_text" table:"-"`
	URL             string    `json:"url" table:"url,url"`
	FetchedAt       time.Time `json:"fetched_at" table:"-"`
}

Playlist is one playlist's header.

func ParseContinuationPlaylists

func ParseContinuationPlaylists(data map[string]any) ([]Playlist, string)

ParseContinuationPlaylists extracts playlists and next continuation token from a /browse continuation.

type PlaylistVideo

type PlaylistVideo struct {
	PlaylistID string `json:"playlist_id"`
	VideoID    string `json:"video_id"`
	Position   int    `json:"position"`
}

PlaylistVideo is the playlist↔video membership join with position.

type QueueItem

type QueueItem struct {
	ID         int64  `json:"id"`
	URL        string `json:"url"`
	EntityType string `json:"entity_type"`
	Status     string `json:"status"`
	Priority   int    `json:"priority"`
}

QueueItem is one pending crawl-queue entry.

type RelatedVideo

type RelatedVideo struct {
	VideoID        string `json:"video_id"`
	RelatedVideoID string `json:"related_video_id"`
	Position       int    `json:"position"`
}

RelatedVideo is the related-videos graph edge.

func ParseContinuationRelatedVideos

func ParseContinuationRelatedVideos(data map[string]any, videoID string) ([]RelatedVideo, string)

ParseContinuationRelatedVideos extracts related videos and next token from a /next continuation.

type SearchFilters

type SearchFilters struct {
	Sort           string // relevance, date, views, rating
	Type           string // video, channel, playlist
	Duration       string // short (<4m), medium (4-20m), long (>20m)
	UploadDate     string // hour, today, week, month, year
	HD             bool
	CC             bool // closed captions / subtitles
	CreativeCommon bool
	Live           bool
	FourK          bool
	ThreeSixty     bool
	HDR            bool
	VR180          bool
}

SearchFilters controls YouTube search filtering via the sp= parameter.

func (SearchFilters) Encode

func (f SearchFilters) Encode() string

Encode returns the base64url-encoded protobuf sp parameter, or "" if empty.
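
For readers unfamiliar with the sp format, the sketch below hand-encodes a tiny protobuf message and base64url-encodes it. The field numbers are assumptions drawn from commonly observed sp values (an outer field 2 holding the filter message, whose inner field 2 is the result type, with 1 meaning video); they are not confirmed by this documentation, and whether Encode emits base64 padding is likewise unknown. All function names are hypothetical.

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// appendVarint appends v in protobuf base-128 varint form.
func appendVarint(b []byte, v uint64) []byte {
	for v >= 0x80 {
		b = append(b, byte(v)|0x80)
		v >>= 7
	}
	return append(b, byte(v))
}

// appendVarintField appends a varint field (wire type 0).
func appendVarintField(b []byte, field int, v uint64) []byte {
	b = appendVarint(b, uint64(field)<<3) // low three bits 0 = varint
	return appendVarint(b, v)
}

// appendBytesField appends a length-delimited field (wire type 2).
func appendBytesField(b []byte, field int, payload []byte) []byte {
	b = appendVarint(b, uint64(field)<<3|2)
	b = appendVarint(b, uint64(len(payload)))
	return append(b, payload...)
}

// encodeTypeFilter builds an sp value restricting results to one type.
// ASSUMPTION: outer field 2 = filters, inner field 2 = type.
func encodeTypeFilter(typ uint64) string {
	inner := appendVarintField(nil, 2, typ)
	outer := appendBytesField(nil, 2, inner)
	return base64.URLEncoding.EncodeToString(outer)
}

func main() {
	fmt.Println(encodeTypeFilter(1)) // bytes 12 02 10 01
}
```

When placed in a query string, the trailing "=" padding of such a value needs percent-encoding like any other reserved character.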

func (SearchFilters) IsEmpty

func (f SearchFilters) IsEmpty() bool

IsEmpty reports whether no filter is set.

type SearchResult

type SearchResult struct {
	EntityType string `json:"entity_type"`
	ID         string `json:"id"`
	Title      string `json:"title"`
	URL        string `json:"url"`
}

SearchResult is the thin polymorphic row for mixed search output.

type Seed added in v0.4.0

type Seed struct {
	Kind NodeKind
	Ref  string // canonical id / @handle to fetch
}

Seed is a parsed starting point for a walk.

func ParseSeed added in v0.4.0

func ParseSeed(ref string) (Seed, error)

ParseSeed classifies a raw reference into a Seed, reusing the domain's own Classify so a seed is read exactly like any other youtube reference: a watch link or bare video id is a video, a playlist link or PL-style id is a playlist, a @handle or UC id or channel URL is a channel.

type Selection

type Selection struct {
	Video *Stream
	Audio *Stream
}

Selection is the outcome of resolving a -f format string against a manifest. When Audio is non-nil the result is two adaptive streams to be merged; otherwise Video alone is a complete (progressive or single) download.

func SelectFormat

func SelectFormat(streams []Stream, spec string) (Selection, error)

SelectFormat resolves a yt-dlp-style format string against the manifest's streams. Supported grammar:

best worst b w                  overall best/worst progressive-or-merged
bestvideo bestaudio bv ba       best adaptive video / audio track
bv*                             best video, progressive allowed
22 137+140                      explicit itags, '+' merges two tracks
bv+ba/b                         '/' tries each group left to right
bv[height<=720] ba[ext=m4a]     [k OP v] filters: =, !=, <, <=, >, >=

Recognised filter keys: height, width, fps, ext, vcodec, acodec, itag, abr, tbr.
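
The '/' and '+' operators compose as "try each group in order; within a group, every track must resolve". The sketch below shows only that outer structure. It is not SelectFormat itself: parseGroups and firstSatisfiable are hypothetical helpers, and bracket filters and the best/worst keywords are not interpreted.

```go
package main

import (
	"fmt"
	"strings"
)

// parseGroups splits a format spec into fallback groups ('/') and, within
// each group, into the tracks to merge ('+').
func parseGroups(spec string) [][]string {
	var groups [][]string
	for _, g := range strings.Split(spec, "/") {
		groups = append(groups, strings.Split(g, "+"))
	}
	return groups
}

// firstSatisfiable returns the first group whose terms all resolve, or nil
// when no group does.
func firstSatisfiable(groups [][]string, resolvable func(string) bool) []string {
	for _, g := range groups {
		ok := true
		for _, term := range g {
			if !resolvable(term) {
				ok = false
				break
			}
		}
		if ok {
			return g
		}
	}
	return nil
}

func main() {
	groups := parseGroups("137+140/22/b")
	// Pretend only itag 22 and the generic "b" selector can be satisfied.
	have := map[string]bool{"22": true, "b": true}
	pick := firstSatisfiable(groups, func(t string) bool { return have[t] })
	fmt.Println(groups, pick)
}
```

A group with two terms corresponds to a Selection whose Audio is non-nil; a single-term group corresponds to Video alone.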

func (Selection) NeedsMerge

func (s Selection) NeedsMerge() bool

NeedsMerge reports whether the selection is a video+audio pair that ffmpeg must combine.

func (Selection) Streams

func (s Selection) Streams() []Stream

Streams returns the selected streams in download order.

type Song

type Song struct {
	VideoID         string    `json:"video_id"`
	Title           string    `json:"title"`
	ArtistID        string    `json:"artist_id"`
	ArtistName      string    `json:"artist_name"`
	AlbumID         string    `json:"album_id"`
	AlbumName       string    `json:"album_name"`
	DurationSeconds int       `json:"duration_seconds"`
	DurationText    string    `json:"duration_text"`
	PlaysText       string    `json:"plays_text"`
	IsExplicit      bool      `json:"is_explicit"`
	VideoType       string    `json:"video_type"`
	ThumbnailURL    string    `json:"thumbnail_url"`
	Lyrics          string    `json:"lyrics"`
	URL             string    `json:"url"`
	FetchedAt       time.Time `json:"fetched_at"`
}

Song is a YouTube Music song.

type SponsorSegment

type SponsorSegment struct {
	Category string  `json:"category"`
	Action   string  `json:"action"`
	Start    float64 `json:"start_seconds"`
	End      float64 `json:"end_seconds"`
	UUID     string  `json:"uuid"`
}

SponsorSegment is one community-submitted segment of a video.

type Store

type Store struct {
	// contains filtered or unexported fields
}

Store is the SQLite-backed persistence layer for all crawled YouTube data.

func OpenStore

func OpenStore(path string) (*Store, error)

OpenStore opens (or creates) the SQLite database at path and ensures all tables exist.

func (*Store) Close

func (s *Store) Close() error

Close closes the underlying database connection.

func (*Store) Enqueue

func (s *Store) Enqueue(url, entity string, priority int) error

Enqueue adds a URL to the crawl queue. Duplicate URLs are silently ignored.

func (*Store) ListJobs

func (s *Store) ListJobs(limit int) ([]JobRecord, error)

ListJobs returns up to limit jobs ordered by started_at DESC.

func (*Store) ListQueue

func (s *Store) ListQueue(status string, limit int) ([]QueueItem, error)

ListQueue returns up to limit items with the given status.

func (*Store) MarkStatus

func (s *Store) MarkStatus(id int64, status string) error

MarkStatus updates the status of a queue item by its row ID.

func (*Store) NextPending

func (s *Store) NextPending() (*QueueItem, error)

NextPending atomically pops the highest-priority pending item and marks it in_progress. Returns nil, nil if the queue is empty.

func (*Store) Path

func (s *Store) Path() string

Path returns the filesystem path of the database file.

func (*Store) Query

func (s *Store) Query(sqlText string) ([]string, [][]any, error)

Query executes a raw read-only SQL statement and returns the column names and the rows.

func (*Store) RecordJob

func (s *Store) RecordJob(j JobRecord) error

RecordJob inserts or replaces a job record.

func (*Store) Reset

func (s *Store) Reset() error

Reset drops and recreates all tables.

func (*Store) SearchChannels

func (s *Store) SearchChannels(q string, limit int) ([]Channel, error)

SearchChannels performs a LIKE search on channel title, description, and handle.

func (*Store) SearchVideos

func (s *Store) SearchVideos(q string, limit int) ([]Video, error)

SearchVideos performs a LIKE search on video title and description.

func (*Store) Stats

func (s *Store) Stats() (map[string]int64, error)

Stats returns row counts for all major tables.

func (*Store) UpsertCaptionTrack

func (s *Store) UpsertCaptionTrack(ct CaptionTrack) error

func (*Store) UpsertChannel

func (s *Store) UpsertChannel(c Channel) error

func (*Store) UpsertChapter

func (s *Store) UpsertChapter(ch Chapter) error

func (*Store) UpsertComment

func (s *Store) UpsertComment(c Comment) error

func (*Store) UpsertCommunityPost

func (s *Store) UpsertCommunityPost(p CommunityPost) error

func (*Store) UpsertEdge added in v0.4.0

func (s *Store) UpsertEdge(src, dst, kind string) error

UpsertEdge records one traversed link of the discovery graph, keyed by the (src, dst, kind) triple so re-walking is idempotent. src and dst are entity ids; kind is the Edge string ("channel", "uploads", ...). It is the sink `ytb discover --store` feeds from WalkOptions.OnEdge.

func (*Store) UpsertNode added in v0.4.0

func (s *Store) UpsertNode(n *Node) error

UpsertNode persists a discovered node into its typed table, dispatching on the node kind. It is what `ytb discover --store` calls for every emitted node, so a walk fills videos/channels/playlists/comments/community_posts exactly as the per-object reads do, with the edges table joining them.

func (*Store) UpsertPlaylist

func (s *Store) UpsertPlaylist(p Playlist) error

func (*Store) UpsertPlaylistVideo

func (s *Store) UpsertPlaylistVideo(pv PlaylistVideo) error

func (*Store) UpsertRelatedVideo

func (s *Store) UpsertRelatedVideo(rv RelatedVideo) error

func (*Store) UpsertVideo

func (s *Store) UpsertVideo(v Video) error

func (*Store) UpsertVideoFormat

func (s *Store) UpsertVideoFormat(f VideoFormat) error

func (*Store) Vacuum

func (s *Store) Vacuum() error

Vacuum runs SQLite VACUUM to reclaim space.

type Stream

type Stream struct {
	ITag            int    `json:"itag"`
	MimeType        string `json:"mime_type"`
	Container       string `json:"container"` // mp4, webm, m4a, 3gp
	VideoCodec      string `json:"video_codec"`
	AudioCodec      string `json:"audio_codec"`
	Quality         string `json:"quality"`
	QualityLabel    string `json:"quality_label"`
	Width           int    `json:"width"`
	Height          int    `json:"height"`
	FPS             int    `json:"fps"`
	Bitrate         int64  `json:"bitrate"`
	ContentLength   int64  `json:"content_length"`
	AudioQuality    string `json:"audio_quality"`
	AudioChannels   int    `json:"audio_channels"`
	AudioSampleRate int    `json:"audio_sample_rate"`
	IsAdaptive      bool   `json:"is_adaptive"`
	HasVideo        bool   `json:"has_video"`
	HasAudio        bool   `json:"has_audio"`
	// contains filtered or unexported fields
}

Stream is one downloadable format with the data needed to resolve its URL. It is the download-oriented sibling of VideoFormat (which is metadata only).

func (Stream) AudioOnly

func (s Stream) AudioOnly() bool

AudioOnly reports a track carrying audio but no video.

func (Stream) Ext

func (s Stream) Ext() string

Ext returns the file extension to use when saving this stream on its own.

func (Stream) Muxed

func (s Stream) Muxed() bool

Muxed reports a progressive track carrying both audio and video.

func (Stream) VideoOnly

func (s Stream) VideoOnly() bool

VideoOnly reports a track carrying video but no audio.

type StreamManifest

type StreamManifest struct {
	VideoID  string   `json:"video_id"`
	Title    string   `json:"title"`
	Author   string   `json:"author"`
	Duration int      `json:"duration_seconds"`
	IsLive   bool     `json:"is_live"`
	HLSURL   string   `json:"hls_url,omitempty"`
	DASHURL  string   `json:"dash_url,omitempty"`
	Streams  []Stream `json:"streams"`
	// contains filtered or unexported fields
}

StreamManifest is the resolved set of streams for one video plus the player URL needed to decipher them.

type SubtitleFormat

type SubtitleFormat string

SubtitleFormat is a target caption serialization.

const (
	SubSRT  SubtitleFormat = "srt"
	SubVTT  SubtitleFormat = "vtt"
	SubText SubtitleFormat = "txt"
)

type Suggestion added in v0.2.0

type Suggestion struct {
	Text string `json:"suggestion" kit:"id" table:"suggestion"`
}

Suggestion is one search-autocomplete suggestion, wrapped so the suggest operation emits a record the renderer and a host can both address.

type Thumbnail

type Thumbnail struct {
	URL    string `json:"url"`
	Width  int    `json:"width"`
	Height int    `json:"height"`
	Name   string `json:"name"`
}

Thumbnail is one available preview image for a video.

func Thumbnails

func Thumbnails(videoID string) []Thumbnail

Thumbnails returns the standard rendition URLs for a video, largest first. Not every video has every rendition (maxres in particular); Download probes availability.
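
As a rough picture of what "standard rendition URLs" means, the sketch below builds the widely used i.ytimg.com still-image URLs, largest first. The exact set, names, and dimensions this package returns are not listed in this documentation, so the table here is an assumption, and the thumbnail type and thumbnails function are hypothetical stand-ins.

```go
package main

import "fmt"

// thumbnail is a stand-in for the package's Thumbnail type.
type thumbnail struct {
	Name, URL     string
	Width, Height int
}

// thumbnails lists commonly available renditions, largest first.
// ASSUMPTION: names and sizes follow the well-known i.ytimg.com scheme.
func thumbnails(videoID string) []thumbnail {
	renditions := []struct {
		name string
		w, h int
	}{
		{"maxresdefault", 1280, 720},
		{"sddefault", 640, 480},
		{"hqdefault", 480, 360},
		{"mqdefault", 320, 180},
		{"default", 120, 90},
	}
	out := make([]thumbnail, 0, len(renditions))
	for _, r := range renditions {
		out = append(out, thumbnail{
			Name:   r.name,
			URL:    fmt.Sprintf("https://i.ytimg.com/vi/%s/%s.jpg", videoID, r.name),
			Width:  r.w,
			Height: r.h,
		})
	}
	return out
}

func main() {
	for _, t := range thumbnails("VIDEO_ID") {
		fmt.Println(t.Name, t.URL)
	}
}
```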

type TranscriptSegment

type TranscriptSegment struct {
	StartSeconds float64 `json:"start"`
	DurSeconds   float64 `json:"dur"`
	Text         string  `json:"text"`
}

TranscriptSegment is one timed line of a transcript.
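
To make the start/dur fields concrete, here is a minimal sketch that serializes segments to SRT (the "srt" SubtitleFormat). The segment type and the srtTime and renderSRT functions are hypothetical; the package's own renderer may differ in details such as how it clamps overlapping cues.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// segment is a stand-in for TranscriptSegment.
type segment struct {
	Start, Dur float64
	Text       string
}

// srtTime formats seconds as HH:MM:SS,mmm (SRT uses a comma before millis).
func srtTime(sec float64) string {
	ms := int64(math.Round(sec * 1000))
	return fmt.Sprintf("%02d:%02d:%02d,%03d", ms/3600000, ms/60000%60, ms/1000%60, ms%1000)
}

// renderSRT writes one numbered cue per segment, each followed by a blank line.
func renderSRT(segs []segment) string {
	var b strings.Builder
	for i, s := range segs {
		fmt.Fprintf(&b, "%d\n%s --> %s\n%s\n\n", i+1, srtTime(s.Start), srtTime(s.Start+s.Dur), s.Text)
	}
	return b.String()
}

func main() {
	fmt.Print(renderSRT([]segment{
		{Start: 0, Dur: 1.5, Text: "hello"},
		{Start: 1.5, Dur: 2, Text: "world"},
	}))
}
```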

type Video

type Video struct {
	VideoID            string    `json:"video_id" kit:"id" table:"id"`
	Title              string    `json:"title" table:"title,truncate"`
	Description        string    `json:"description" kit:"body" table:"-"`
	ChannelID          string    `json:"channel_id" kit:"link,kind=youtube/channel" table:"-"`
	ChannelName        string    `json:"channel_name" table:"channel,truncate"`
	DurationSeconds    int       `json:"duration_seconds" table:"-"`
	DurationText       string    `json:"duration_text" table:"duration"`
	ViewCount          int64     `json:"view_count" table:"views"`
	CommentCount       int64     `json:"comment_count" table:"-"`
	LikeCount          int64     `json:"like_count" table:"-"`
	PublishedText      string    `json:"published_text" table:"published"`
	PublishedAt        time.Time `json:"published_at" table:"-"`
	UploadDate         string    `json:"upload_date" table:"-"`
	IsLive             bool      `json:"is_live" table:"-"`
	IsShort            bool      `json:"is_short" table:"-"`
	Category           string    `json:"category" table:"-"`
	Tags               []string  `json:"tags" table:"-"`
	ThumbnailURL       string    `json:"thumbnail_url" table:"-"`
	URL                string    `json:"url" table:"url,url"`
	EmbedURL           string    `json:"embed_url" table:"-"`
	Transcript         string    `json:"transcript" table:"-"`
	TranscriptLanguage string    `json:"transcript_language" table:"-"`
	// Extended metadata from microformat / videoDetails.
	AvailableCountries  []string  `json:"available_countries" table:"-"`
	IsFamilySafe        bool      `json:"is_family_safe" table:"-"`
	AllowRatings        bool      `json:"allow_ratings" table:"-"`
	AgeRestricted       bool      `json:"age_restricted" table:"-"`
	LocationDescription string    `json:"location_description" table:"-"`
	Hashtags            []string  `json:"hashtags" table:"-"`
	FetchedAt           time.Time `json:"fetched_at" table:"-"`
}

Video is the central record: one YouTube video with full metadata. The kit tags make it addressable by a host (id, body, links) and the table tags pick the columns the aligned table shows; every other field stays in the JSON.

func ParseContinuationVideos

func ParseContinuationVideos(data map[string]any) ([]Video, string)

ParseContinuationVideos extracts videos and next continuation token from a /browse continuation.

func ParsePlayerDetails

func ParsePlayerDetails(data map[string]any, videoID string) *Video

ParsePlayerDetails extracts video details from an InnerTube /player response. Only populates fields that have values; callers should merge into existing data.

type VideoFormat

type VideoFormat struct {
	VideoID       string `json:"video_id"`
	ITag          int    `json:"itag"`
	MimeType      string `json:"mime_type"`
	Quality       string `json:"quality"`
	QualityLabel  string `json:"quality_label"`
	Width         int    `json:"width"`
	Height        int    `json:"height"`
	FPS           int    `json:"fps"`
	Bitrate       int64  `json:"bitrate"`
	ContentLength int64  `json:"content_length"`
	IsAdaptive    bool   `json:"is_adaptive"`
	AudioQuality  string `json:"audio_quality"`
}

VideoFormat is one streaming format (muxed or adaptive) of a video.

func ParseVideoFormats

func ParseVideoFormats(playerResp map[string]any, videoID string) []VideoFormat

ParseVideoFormats extracts streaming format info from an InnerTube /player response. Deduplicates by itag, keeping the entry with the largest bitrate.
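
The deduplication rule can be sketched in a few lines. This is an illustration under assumptions, not the package's code: the format type and dedupeByITag are hypothetical, and sorting the result by itag is a choice made here because the real output order is not documented.

```go
package main

import (
	"fmt"
	"sort"
)

// format carries only the two fields the rule depends on.
type format struct {
	ITag    int
	Bitrate int64
}

// dedupeByITag keeps, for each itag, the entry with the largest bitrate.
func dedupeByITag(in []format) []format {
	best := make(map[int]format, len(in))
	for _, f := range in {
		if cur, ok := best[f.ITag]; !ok || f.Bitrate > cur.Bitrate {
			best[f.ITag] = f
		}
	}
	out := make([]format, 0, len(best))
	for _, f := range best {
		out = append(out, f)
	}
	// Sort for a deterministic result; map iteration order is random.
	sort.Slice(out, func(i, j int) bool { return out[i].ITag < out[j].ITag })
	return out
}

func main() {
	fmt.Println(dedupeByITag([]format{{137, 4000}, {140, 128}, {137, 4500}}))
}
```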

type VideoOptions

type VideoOptions struct {
	Player     bool // call /player (default true)
	Next       bool // call /next for chapters/related/comment token
	Transcript bool // fetch the transcript text
	Lang       string
}

VideoOptions controls what FetchVideo gathers.

type VideoResult

type VideoResult struct {
	Video        Video
	Formats      []VideoFormat
	Captions     []CaptionTrack
	Chapters     []Chapter
	Related      []RelatedVideo
	CommentToken string
}

VideoResult holds the full output of FetchVideo.

type WalkOptions added in v0.4.0

type WalkOptions struct {
	Depth  int     // hops to follow from each seed (0 = seeds only)
	Max    int     // stop after emitting this many nodes (0 = unlimited)
	Fanout int     // per-edge neighbor cap (0 = unlimited)
	Edges  EdgeSet // edges to follow (nil = DefaultEdges)

	// OnEdge, if set, is called for every edge the walk traverses, before the
	// neighbor is visited, with the two endpoints and the edge. The store sink
	// uses it to record the graph; it fires even for an already-visited neighbor
	// so the edge list stays complete.
	OnEdge func(src, dst string, edge Edge)

	// Note, if set, surfaces a one-line advisory (a comment edge refused by the
	// anti-bot wall, a neighbor that could not be fetched). It never carries a
	// fatal error.
	Note func(string)
}

WalkOptions tunes a traversal.

type Walker added in v0.4.0

type Walker struct {
	// contains filtered or unexported fields
}

Walker performs the breadth-first traversal over a grapher.

func NewWalker added in v0.4.0

func NewWalker(g grapher) *Walker

NewWalker builds a Walker over any grapher (the client in production, a fake in tests).

func (*Walker) Walk added in v0.4.0

func (w *Walker) Walk(ctx context.Context, seeds []Seed, opts WalkOptions, emit func(*Node) error) error

Walk visits the seeds and their links in breadth-first order, calling emit for each node as it is reached. It returns when the queue drains, the node budget (opts.Max) is hit, emit returns an error, or a seed cannot be fetched. A gated edge (comments, and the channel reached through one) is attempted, not pre-dropped: when YouTube refuses it the failure becomes a Note and the walk keeps going on the rest of the graph.
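
The traversal rules (depth limit, node budget, fanout cap, and an edge callback that fires even for already-visited neighbors) can be sketched over a plain adjacency map. This is a simplified model, not the Walker: the options type, the walk function, and the string-keyed graph are hypothetical, and error handling, Notes, and typed edges are omitted.

```go
package main

import "fmt"

// options mirrors the numeric WalkOptions fields (0 = unlimited for Max and
// Fanout; Depth 0 = seeds only).
type options struct {
	Depth, Max, Fanout int
	OnEdge             func(src, dst string)
}

// walk visits seeds and their neighbors breadth-first, calling emit per node.
func walk(graph map[string][]string, seeds []string, opts options, emit func(id string, depth int)) {
	type item struct {
		id    string
		depth int
	}
	seen := map[string]bool{}
	var queue []item
	for _, s := range seeds {
		if !seen[s] {
			seen[s] = true
			queue = append(queue, item{s, 0})
		}
	}
	emitted := 0
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		emit(cur.id, cur.depth)
		emitted++
		if opts.Max > 0 && emitted >= opts.Max {
			return // node budget hit
		}
		if cur.depth >= opts.Depth {
			continue // do not follow links past the depth limit
		}
		neighbors := graph[cur.id]
		if opts.Fanout > 0 && len(neighbors) > opts.Fanout {
			neighbors = neighbors[:opts.Fanout]
		}
		for _, n := range neighbors {
			if opts.OnEdge != nil {
				opts.OnEdge(cur.id, n) // fires even if n was already visited
			}
			if !seen[n] {
				seen[n] = true
				queue = append(queue, item{n, cur.depth + 1})
			}
		}
	}
}

func main() {
	graph := map[string][]string{"a": {"b", "c"}, "b": {"c", "d"}, "c": {"a"}}
	opts := options{Depth: 2, OnEdge: func(src, dst string) { fmt.Println("edge", src, "->", dst) }}
	walk(graph, []string{"a"}, opts, func(id string, depth int) { fmt.Println("node", id, depth) })
}
```

Marking a neighbor as seen when it is enqueued, rather than when it is emitted, prevents the same node from being queued twice by two different parents.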
