gallery

package module
v0.0.0-...-0773bd4 Latest
Warning: This package is not in the latest version of its module.
Published: May 5, 2026 License: GPL-2.0 Imports: 36 Imported by: 0

README

A Go library and CLI tool for downloading image and video galleries from Twitter/X. Uses the Twitter/X GraphQL and REST APIs - both guest-mode and authenticated - to extract media, then downloads files in parallel with deduplication, filtering, and post-processing support.

Features

  • Multi-source extraction - Single tweets, user timelines, likes, bookmarks, lists, search results, and home timeline
  • Guest & authenticated - Works without login; provide cookies for protected endpoints (likes, bookmarks, home timeline)
  • Parallel downloads - Configurable concurrency with resumable .part file support
  • Deduplication - SQLite-backed archive prevents re-downloading already-seen files
  • Flexible filtering - Filter by date range, content type, retweet/reply/quote flags, or arbitrary expr-lang expressions
  • Filename templates - Powerful {key}, {key!l}, {key:layout}, {key/old/new}, {key|sep}, {key?true:false} patterns
  • Post-processors - exec, mtime, rename, zip, hash, metadata (JSON sidecar)
  • yt-dlp fallback - Optional HLS/complex stream support via yt-dlp
  • Config file - YAML, TOML, or JSON
  • Graceful unavailability handling - Deleted, DMCA-removed, suspended, and geo-blocked tweets are reported per-item and counted separately; the run never aborts because of them

Installation

go install github.com/hecker-01/go-gallery/cmd/go-gallery@latest

Or build from source:

git clone https://github.com/hecker-01/go-gallery
cd go-gallery
go build ./cmd/go-gallery

Requirements: Go 1.22+. Pure-Go SQLite is used - no CGO required for basic usage.

CLI Usage

go-gallery [flags] URL...

Flags

Flag | Default | Description
-g |  | Print direct media URLs and exit (no download)
-j |  | Print per-item JSON to stdout and exit
-K |  | Print available template keywords for the first item
-simulate |  | Run full pipeline but skip all I/O
-d DIR | . | Base output directory; twitter/{username}/... structure is created beneath it
-D DIR |  | Direct output directory; files are placed here with no subdirectory structure
-f PATTERN | (config) | Filename template pattern
--concurrency N | 4 | Number of parallel downloads
--cookies-from-browser BROWSER |  | Import cookies from browser (firefox)
--cookies-from-file PATH |  | Import from Netscape cookies.txt file
--filter EXPR |  | expr-lang expression to filter items
--config PATH |  | Path to a YAML/TOML/JSON config file
-v / --verbose |  | Enable debug-level logging
-q / --quiet |  | Suppress all output

Output format follows gallery-dl's style - [source][level] message with ANSI colors on terminals (auto-detected; set NO_COLOR=1 to disable):

[twitter][info] twitter/username/1234567890_1.jpg
[twitter][warning] unavailable [tombstone] tweet 9876543210
[twitter][warning] rate limited on UserMedia; waiting 15s (resets at 2026-01-01T12:00:00Z)
[go-gallery][info] 42 downloaded, 0 skipped, 1 unavailable, 0 failed (8.3s)
[go-gallery][warning] unavailable:
[go-gallery][warning]   unavailable (dmca): https://video.twimg.com/.../video.mp4

The summary line always reports four counters:

Counter | Meaning
downloaded | Files successfully saved to disk
skipped | Archive hits (already downloaded previously)
unavailable | Deleted, DMCA, suspended, geo-blocked items
failed | Network errors, I/O errors, unexpected failures

Exit code - the tool exits 0 even when some items were unavailable (matching gallery-dl behaviour). It exits 1 only on a fatal auth/challenge failure, or when zero files were downloaded and there were real failures.
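
The exit-code rule can be written as a small decision function. The following is a minimal sketch of the rule as described here, not the tool's actual code; the `summary` type, its field names, and `exitCode` are hypothetical.

```go
package main

import "fmt"

// summary mirrors the four summary counters plus a flag for fatal
// auth/challenge failures. The type and its field names are hypothetical.
type summary struct {
	Downloaded, Skipped, Unavailable, Failed int
	Fatal                                    bool
}

// exitCode applies the rule as described: 1 on a fatal failure, or when
// nothing was downloaded and at least one real failure occurred; 0 otherwise.
func exitCode(s summary) int {
	if s.Fatal {
		return 1
	}
	if s.Downloaded == 0 && s.Failed > 0 {
		return 1
	}
	return 0
}

func main() {
	// Unavailable items alone do not fail the run.
	fmt.Println(exitCode(summary{Downloaded: 42, Unavailable: 1}))
	// Nothing downloaded and real failures present.
	fmt.Println(exitCode(summary{Failed: 3}))
}
```

Under this reading, a run that downloads at least one file still exits 0 even if other items failed.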

-d vs -D mirrors the convention from gallery-dl: -d sets the base directory and the tool still creates twitter/{username}/ beneath it; -D sets the direct directory and files go there with no further subdirectories.

Examples
# Download a user's media tab
go-gallery https://x.com/username/media

# Download a single tweet
go-gallery https://x.com/username/status/1234567890

# Download bookmarks (requires authentication)
go-gallery --cookies-from-browser firefox https://x.com/i/bookmarks

# Download to ~/gallery, keeping twitter/username/ subfolders
go-gallery -d ~/gallery https://x.com/username/media

# Download flat - all files directly into ~/flat, no subfolders
go-gallery -D ~/flat https://x.com/username/media

# Custom base directory and filename pattern
go-gallery -d ~/gallery \
  -f "{author.screen_name}/{date:2006-01}/{tweet_id}_{num}.{extension}" \
  https://x.com/username/media

# Filter to videos only
go-gallery --filter 'extension == "mp4"' https://x.com/username/media

# Print JSON metadata without downloading
go-gallery -j https://x.com/username/media

# Print available template keywords
go-gallery -K https://x.com/username/status/1234567890

Supported URLs

Both twitter.com and x.com domains are supported.

Source | URL Pattern | Auth Required
Single tweet | /{username}/status/{id} | No
User timeline / media | /{username} or /{username}/media | No
User likes | /{username}/likes | Yes
Bookmarks | /i/bookmarks | Yes
List | /i/lists/{id} | No
Search | /search?q=... | No
Home timeline | /home | Yes

Authentication: Pass cookies via --cookies-from-browser firefox or --cookies-from-file cookies.txt. The tool will auto-fetch and cache a guest token for unauthenticated endpoints.

Error Handling & Unavailable Content

go-gallery distinguishes between permanent unavailability (content that is gone and won't come back) and transient failures (network errors worth retrying).

Unavailability reasons

When a tweet or media file cannot be retrieved, the reason is reported per-item and counted as unavailable in the summary:

Reason | Cause
tombstone | Tweet removed (Twitter shows a placeholder in timeline)
deleted | HTTP 404 - content does not exist
gone | HTTP 410 - content permanently removed
dmca | HTTP 451 - DMCA / legal takedown
suspended | Tweet from a suspended account
policy-violation | Tweet removed for policy violation
protected | Tweet from a protected account you cannot access
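
Three of the reasons above correspond directly to HTTP status codes. As a rough illustration of that mapping only, here is a sketch using the standard `net/http` constants; `reasonForStatus` is a hypothetical helper and is not the library's own classifier.

```go
package main

import (
	"fmt"
	"net/http"
)

// reasonForStatus maps the HTTP statuses listed in the table to their
// reason strings and returns "" for any other status.
// Hypothetical helper for illustration only.
func reasonForStatus(status int) string {
	switch status {
	case http.StatusNotFound: // 404
		return "deleted"
	case http.StatusGone: // 410
		return "gone"
	case http.StatusUnavailableForLegalReasons: // 451
		return "dmca"
	default:
		return ""
	}
}

func main() {
	for _, code := range []int{404, 410, 451, 500} {
		fmt.Printf("%d -> %q\n", code, reasonForStatus(code))
	}
}
```

The remaining reasons (tombstone, suspended, policy-violation, protected) are described in terms of tweet state rather than a status code, so they are left out of this sketch.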

Rate limiting

When a 429 response is received, go-gallery reads the x-rate-limit-reset header and waits until the window resets, adding a 10-second buffer to account for clock skew between your machine and Twitter's servers. It retries up to 3 times before aborting the operation.
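
The wait computation can be sketched as follows. This assumes the x-rate-limit-reset header carries a Unix timestamp in seconds; `waitForReset` and `clockSkewBuffer` are hypothetical names, and the retry loop itself is omitted.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// clockSkewBuffer is the 10-second safety margin described above.
const clockSkewBuffer = 10 * time.Second

// waitForReset computes how long to sleep given the x-rate-limit-reset
// header value (assumed to be Unix seconds) and the current Unix time.
func waitForReset(resetHeader string, nowUnix int64) (time.Duration, error) {
	reset, err := strconv.ParseInt(resetHeader, 10, 64)
	if err != nil {
		return 0, fmt.Errorf("parse x-rate-limit-reset %q: %w", resetHeader, err)
	}
	wait := time.Duration(reset-nowUnix) * time.Second
	if wait < 0 {
		wait = 0 // the window has already reset; only the buffer remains
	}
	return wait + clockSkewBuffer, nil
}

func main() {
	// The reset lies 5 seconds in the future relative to "now".
	d, err := waitForReset("1767268800", 1767268795)
	fmt.Println(d, err)
}
```

Taking the current time as a parameter keeps the function deterministic, which makes it easy to test without sleeping.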

Typed errors in the library

import (
    "errors"
    "fmt"

    gallery "github.com/hecker-01/go-gallery"
)

result, err := client.Download(ctx, url, opts...)

// Fatal extraction errors
var authnErr *gallery.AuthenticationError // 401 / bad credentials
var challengeErr *gallery.ChallengeError  // CAPTCHA / account lock
switch {
case errors.As(err, &authnErr):
    // re-authenticate
case errors.As(err, &challengeErr):
    // verification challenge; the request cannot proceed
}

// Per-item unavailability (in result.Errors)
for _, e := range result.Errors {
    var nfe *gallery.NotFoundError
    if errors.As(e, &nfe) {
        fmt.Printf("unavailable (%s): %s\n", nfe.Reason, nfe.URL)
    }
}

// Summary counters
fmt.Printf("%d downloaded, %d unavailable, %d failed\n",
    result.TotalFiles, result.UnavailableFiles, result.FailedFiles)

ClassifyHTTPStatus(status int, url string, body []byte) error is also exported if you need to map raw HTTP codes to the same typed errors in your own code.

Configuration

# config.yaml
output:
  dir: "~/gallery"
  filename_format: "{category}/{author.screen_name}/{tweet_id}_{num}.{extension}"
  skip_existing: true
  write_metadata: false

twitter:
  replies_enabled: false
  retweets_enabled: true
  video_max_bitrate: true # pick highest bitrate video variant

archive:
  enabled: true
  path: "" # defaults to XDG data dir
  key: "{tweet_id}_{num}"

cache:
  enabled: true
  path: "" # defaults to XDG cache dir
  ttl: 3600 # seconds

downloader:
  concurrency: 4
  retries: 4
  resume: true
  min_file_size: 0 # bytes, 0 = no limit
  max_file_size: 0 # bytes, 0 = no limit

Load with:

go-gallery --config config.yaml https://x.com/username/media

Filename Templates

Templates use {key} placeholders. Available keywords:

Keyword | Description
tweet_id | Tweet ID
author.screen_name | Username (e.g. username)
author.name | Display name
author.id | Author's numeric ID
date | Tweet date (use {date:2006-01-02} for layout)
content | Tweet text
extension | File extension (jpg, mp4, etc.)
num | Media index within tweet (1-based)
count | Total media count in tweet
favorite_count | Like count
retweet_count | Retweet count
reply_count | Reply count
quote_count | Quote count
lang | Tweet language
hashtags | Hashtags (use {hashtags|,} to join with separator)
mentions | Mentions
is_retweet | Boolean
is_reply | Boolean
is_quote | Boolean
category | Extractor category

Modifiers:

Syntax | Description
{key!l} | Lowercase
{key!u} | Uppercase
{key:layout} | Format (dates use Go time layout)
{key/old/new} | String replacement
{key|sep} | Join slice with separator
{key?true:false} | Conditional
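
Date layouts follow Go's reference-time convention (the layout is written as the date 2006-01-02 15:04:05 would appear), and the join modifier behaves like `strings.Join`. The sketch below shows only the standard-library operations these two modifiers are described in terms of; `formatDate`, `joinValues`, and `sampleDate` are hypothetical names, not part of the library.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// sampleDate is an arbitrary tweet date used for the demonstration.
var sampleDate = time.Date(2026, time.January, 1, 12, 0, 0, 0, time.UTC)

// formatDate applies a Go reference-time layout, the operation that
// {date:layout} is described as performing.
func formatDate(t time.Time, layout string) string {
	return t.Format(layout)
}

// joinValues joins a slice with a separator, as {key|sep} is described.
func joinValues(values []string, sep string) string {
	return strings.Join(values, sep)
}

func main() {
	fmt.Println(formatDate(sampleDate, "2006-01-02")) // day precision
	fmt.Println(formatDate(sampleDate, "2006-01"))    // month precision
	fmt.Println(joinValues([]string{"go", "gallery"}, ","))
}
```

A layout such as 2006-01 therefore groups files by month when used in a directory component of the filename template.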

Library Usage

import "github.com/hecker-01/go-gallery"

client := gallery.NewClient(gallery.WithConcurrency(4))

// Base directory - twitter/username/ structure is created beneath ./output
result, err := client.Download(ctx, "https://x.com/username/media",
    gallery.WithOutputDir("./output"),
)
fmt.Printf("%d downloaded, %d skipped, %d unavailable, %d failed\n",
    result.TotalFiles, result.SkippedFiles, result.UnavailableFiles, result.FailedFiles)

// Direct directory - files go straight into ./flat with no subdirectories
result, err = client.Download(ctx, "https://x.com/username/media",
    gallery.WithDirectOutputDir("./flat"),
)

// Or compose manually: set base dir then enable flat mode
result, err = client.Download(ctx, "https://x.com/username/media",
    gallery.WithOutputDir("./flat"),
    gallery.WithFlatDir(),
)

Streaming extraction

Extract returns a channel of Message values and a separate error channel. The loop below handles three Message variants:

msgs, errs := client.Extract(ctx, "https://x.com/username/media")
for msg := range msgs {
    switch m := msg.(type) {
    case gallery.Directory:
        // Subsequent media items belong under m.Path
    case gallery.Media:
        // m.URL is the direct download URL; m.Info holds all metadata
    case gallery.Skipped:
        // Item was permanently unavailable
        // m.TweetID  - tweet ID (best-effort for timeline tombstones)
        // m.Reason   - "tombstone" | "deleted" | "suspended" | "dmca" | ...
        // m.Cause    - typed error (*NotFoundError etc.) if available
    }
}
if err := <-errs; err != nil {
    // Fatal extraction failure (auth error, challenge, network abort)
}

Development

# Run tests
make test

# Run tests with race detector (requires CGO)
make test-race

# Lint
make lint

# Benchmarks
make bench

# Integration tests
make integration

# Clean
make clean

License

See LICENSE.

Documentation

Overview

Package gallery provides a pluggable library for downloading image and video galleries from Twitter/X and other sites. The primary entry point is Client.

Example (DownloadUserMedia)

package main

import (
	"context"
	"fmt"

	gallery "github.com/hecker-01/go-gallery"
)

func main() {
	client := gallery.NewClient(
		gallery.WithConcurrency(4),
	)
	defer client.Close()

	result, err := client.Download(
		context.Background(),
		"https://twitter.com/exampleuser/media",
		gallery.WithOutputDir("./downloads"),
		gallery.WithSimulate(true),
		gallery.WithFilter(gallery.AllOf()),
	)
	if err != nil {
		// In this example the method is not yet implemented; suppress the error
		// so the testable example compiles and runs cleanly.
		_ = result
		_ = err
	}
	fmt.Println("download example executed")
}

Output:
download example executed

Variables

var AbortExtraction = galleryerrs.AbortExtraction

AbortExtraction cancels the current extraction immediately. In-flight downloads that have already started are allowed to finish.

var ClassifyHTTPStatus = galleryerrs.ClassifyHTTPStatus

ClassifyHTTPStatus maps an HTTP status code to a typed extraction error.

var StopExtraction = galleryerrs.StopExtraction

StopExtraction asks the extractor to stop after the current page / batch. Items already in-flight are completed; no new pages are fetched.

var TerminateExtraction = galleryerrs.TerminateExtraction

TerminateExtraction hard-cancels everything, including in-flight downloads.

Functions

func CookiesFromBrowser

func CookiesFromBrowser(browserName string) (http.CookieJar, error)

CookiesFromBrowser extracts Twitter/X cookies from a locally-installed browser's profile database.

Supported browsers: "firefox". All other values return an InputError listing supported options. Consumers should fall back to CookiesFromFile for other browsers.

func CookiesFromFile

func CookiesFromFile(path string) (http.CookieJar, error)

CookiesFromFile parses a Netscape-format cookies.txt file and returns a cookie jar containing its entries. Lines beginning with '#' are comments. Returns an InputError for unknown formats.
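
For orientation, a Netscape cookies.txt line holds seven tab-separated fields: domain, include-subdomains flag, path, secure flag, expiry (Unix seconds), name, and value. The sketch below parses a single line under that assumption. `parseCookieLine` is a hypothetical helper, not the library's parser, and it returns a plain error rather than an InputError.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"strings"
	"time"
)

// parseCookieLine parses one line of a Netscape cookies.txt file.
// It returns (nil, nil) for blank lines and lines beginning with '#'.
func parseCookieLine(line string) (*http.Cookie, error) {
	line = strings.TrimRight(line, "\r\n")
	if strings.TrimSpace(line) == "" || strings.HasPrefix(line, "#") {
		return nil, nil
	}
	// Fields: domain, include-subdomains, path, secure, expiry, name, value.
	f := strings.Split(line, "\t")
	if len(f) != 7 {
		return nil, fmt.Errorf("expected 7 tab-separated fields, got %d", len(f))
	}
	expires, err := strconv.ParseInt(f[4], 10, 64)
	if err != nil {
		return nil, fmt.Errorf("bad expiry %q: %w", f[4], err)
	}
	c := &http.Cookie{
		Domain: f[0],
		Path:   f[2],
		Secure: f[3] == "TRUE",
		Name:   f[5],
		Value:  f[6],
	}
	if expires > 0 {
		c.Expires = time.Unix(expires, 0) // 0 is treated as a session cookie
	}
	return c, nil
}

func main() {
	c, err := parseCookieLine(".x.com\tTRUE\t/\tTRUE\t0\tct0\texample-value")
	fmt.Println(c, err)
}
```

A full parser would call this per line and add each cookie to a jar such as the one returned by `net/http/cookiejar.New`.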

func DefaultCachePath

func DefaultCachePath() string

DefaultCachePath returns the platform-appropriate path for the cache database, honouring XDG_CACHE_HOME on Linux/macOS and the equivalent on Windows. Falls back to ~/.cache/go-gallery/cache.sqlite3.

Types

type Archive

type Archive interface {
	// Has reports whether key has been previously recorded.
	Has(ctx context.Context, key string) (bool, error)
	// Put records key as downloaded.
	Put(ctx context.Context, key string) error
	// Close flushes pending writes and releases resources.
	Close() error
}

Archive tracks downloaded media by key to prevent duplicate downloads. Inject an implementation via WithArchive; defaults to a no-op if unset.
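
A custom implementation only needs the three methods above. The following is a minimal in-memory sketch; the interface is copied from the declaration above so the example compiles on its own, while `mapArchive` and `filterNew` are hypothetical names and not part of the library.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Archive is copied from the documented interface.
type Archive interface {
	Has(ctx context.Context, key string) (bool, error)
	Put(ctx context.Context, key string) error
	Close() error
}

// mapArchive is a hypothetical map-backed Archive guarded by a mutex.
type mapArchive struct {
	mu   sync.RWMutex
	seen map[string]struct{}
}

func newMapArchive() *mapArchive {
	return &mapArchive{seen: make(map[string]struct{})}
}

func (a *mapArchive) Has(_ context.Context, key string) (bool, error) {
	a.mu.RLock()
	defer a.mu.RUnlock()
	_, ok := a.seen[key]
	return ok, nil
}

func (a *mapArchive) Put(_ context.Context, key string) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.seen[key] = struct{}{}
	return nil
}

func (a *mapArchive) Close() error { return nil }

// filterNew returns the keys not yet in the archive and records them.
func filterNew(ctx context.Context, a Archive, keys []string) ([]string, error) {
	var fresh []string
	for _, k := range keys {
		ok, err := a.Has(ctx, k)
		if err != nil {
			return nil, err
		}
		if ok {
			continue
		}
		if err := a.Put(ctx, k); err != nil {
			return nil, err
		}
		fresh = append(fresh, k)
	}
	return fresh, nil
}

func main() {
	a := newMapArchive()
	defer a.Close()
	fresh, _ := filterNew(context.Background(), a, []string{"1_1", "1_2", "1_1"})
	fmt.Println(fresh)
}
```

Keys here follow the documented default pattern "{tweet_id}_{num}"; a persistent implementation would swap the map for a database table with the same three methods.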

type ArchiveConfig

type ArchiveConfig struct {
	// Path is the SQLite file path. Defaults to XDG data home.
	Path string `json:"path" yaml:"path" toml:"path"`
	// Key is the archive key pattern. Defaults to "{tweet_id}_{num}".
	Key string `json:"key" yaml:"key" toml:"key"`
	// Enabled toggles archive checking. Defaults to false so it is opt-in.
	Enabled bool `json:"enabled" yaml:"enabled" toml:"enabled"`
}

ArchiveConfig controls the download archive (deduplication database).

type AuthenticationError

type AuthenticationError = galleryerrs.AuthenticationError

AuthenticationError indicates invalid or missing credentials (expired session, wrong auth_token / ct0 cookies).

type AuthorInfo

type AuthorInfo struct {
	ID         string
	Name       string
	ScreenName string
}

AuthorInfo holds Twitter user fields embedded in MediaInfo.

type AuthorizationError

type AuthorizationError = galleryerrs.AuthorizationError

AuthorizationError indicates the authenticated user is not permitted to view the requested resource (protected account, not following).

type Cache

type Cache interface {
	// Get retrieves the value for key. Returns ("", false, nil) when not found.
	Get(ctx context.Context, key string) (string, bool, error)
	// Set stores value under key with the given TTL.
	Set(ctx context.Context, key, value string, ttl time.Duration) error
	// Delete removes key from the cache.
	Delete(ctx context.Context, key string) error
	// Close releases resources.
	Close() error
}

Cache stores short-lived session data: guest tokens, GraphQL query IDs, user-ID lookups, and similar values that are expensive to re-fetch.
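
To show what satisfying this interface involves, here is a small in-memory sketch with per-entry expiry. `memoryCache` is a hypothetical type and is not the library's SQLite-backed cache; expired entries are dropped lazily on read, which is one possible design among several.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

type entry struct {
	value   string
	expires time.Time
}

// memoryCache is a hypothetical in-process cache with per-entry expiry.
type memoryCache struct {
	mu      sync.Mutex
	entries map[string]entry
}

func newMemoryCache() *memoryCache {
	return &memoryCache{entries: make(map[string]entry)}
}

// Get returns ("", false, nil) when the key is missing or has expired.
func (c *memoryCache) Get(_ context.Context, key string) (string, bool, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[key]
	if !ok {
		return "", false, nil
	}
	if time.Now().After(e.expires) {
		delete(c.entries, key) // drop expired entries lazily
		return "", false, nil
	}
	return e.value, true, nil
}

// Set stores value under key until ttl has elapsed.
func (c *memoryCache) Set(_ context.Context, key, value string, ttl time.Duration) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[key] = entry{value: value, expires: time.Now().Add(ttl)}
	return nil
}

// Delete removes key; deleting a missing key is not an error.
func (c *memoryCache) Delete(_ context.Context, key string) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.entries, key)
	return nil
}

func (c *memoryCache) Close() error { return nil }

func main() {
	ctx := context.Background()
	c := newMemoryCache()
	defer c.Close()
	_ = c.Set(ctx, "guest_token", "example-token", time.Hour)
	v, ok, _ := c.Get(ctx, "guest_token")
	fmt.Println(v, ok)
}
```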

type CacheConfig

type CacheConfig struct {
	// Path overrides the XDG cache path.
	Path string `json:"path" yaml:"path" toml:"path"`
	// Enabled toggles the cache. Defaults to true.
	Enabled bool `json:"enabled" yaml:"enabled" toml:"enabled"`
	// TTL is the default cache entry lifetime in seconds.
	TTL int `json:"ttl" yaml:"ttl" toml:"ttl"`
}

CacheConfig controls the SQLite session cache.

type ChallengeError

type ChallengeError = galleryerrs.ChallengeError

ChallengeError indicates Twitter is showing a verification challenge (CAPTCHA / unlock flow) and the request cannot proceed.

type Client

type Client struct {
	// contains filtered or unexported fields
}

Client is the primary entry point. Construct it with NewClient. A zero-value Client is not valid; always use NewClient.

func NewClient

func NewClient(opts ...Option) *Client

NewClient constructs a Client, applying all provided options. Unset fields fall back to DefaultConfig() values.

func (*Client) Close

func (c *Client) Close() error

Close shuts down worker pools, flushes the archive, and releases resources. It should be called via defer after NewClient.

func (*Client) Download

func (c *Client) Download(ctx context.Context, url string, opts ...DownloadOption) (Result, error)

Download is the batteries-included path. It runs extraction, creates directories, downloads files, checks the archive, runs post-processors, and returns a Result summary.

func (*Client) Extract

func (c *Client) Extract(ctx context.Context, url string) (<-chan Message, <-chan error)

Extract starts extraction of url and returns a channel of Messages and an error channel. The caller ranges over the message channel and decides what to do with each item (e.g. calling Download on Media items).

Both channels are closed when extraction finishes or ctx is cancelled. A non-nil value on the error channel indicates a fatal extraction failure.
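
The two-channel contract described above is a common Go producer pattern. The sketch below reproduces it with plain strings in place of Message values; `produce`, `drain`, and `collect` are hypothetical names, and nothing here reflects how Extract is implemented internally.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// produce emits items on the first channel and at most one fatal error on
// the second. Both channels are closed when the producer goroutine returns.
// failAt is the index at which a failure is simulated (-1 for none).
func produce(ctx context.Context, items []string, failAt int) (<-chan string, <-chan error) {
	out := make(chan string)
	errs := make(chan error, 1) // buffered so the producer never blocks on it
	go func() {
		defer close(out)
		defer close(errs)
		for i, it := range items {
			if i == failAt {
				errs <- errors.New("fatal extraction failure")
				return
			}
			select {
			case out <- it:
			case <-ctx.Done():
				errs <- ctx.Err()
				return
			}
		}
	}()
	return out, errs
}

// drain ranges over the item channel, then reads the error channel once.
// A closed, empty error channel yields nil.
func drain(out <-chan string, errs <-chan error) ([]string, error) {
	var got []string
	for it := range out {
		got = append(got, it)
	}
	return got, <-errs
}

// collect runs a producer to completion with a background context.
func collect(items []string, failAt int) ([]string, error) {
	out, errs := produce(context.Background(), items, failAt)
	return drain(out, errs)
}

func main() {
	fmt.Println(collect([]string{"a", "b", "c"}, -1))
	fmt.Println(collect([]string{"a", "b", "c"}, 1))
}
```

Buffering the error channel by one lets the producer report a failure and exit even if the consumer is still busy draining items.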

func (*Client) GetJSON

func (c *Client) GetJSON(ctx context.Context, url string) (<-chan json.RawMessage, <-chan error)

GetJSON returns a channel of json.RawMessage - one per tweet's full metadata object. Equivalent to the -j flag.

func (*Client) GetKeywords

func (c *Client) GetKeywords(ctx context.Context, url string) (map[string]any, error)

GetKeywords returns a flat map of template variables for the first Media item yielded by url. Equivalent to the -K / --get-keywords flag in gallery-dl.

func (*Client) GetURLs

func (c *Client) GetURLs(ctx context.Context, url string) ([]*MediaInfo, error)

GetURLs returns a slice of MediaInfo values containing direct download URLs and metadata for all items reachable from url.

func (*Client) RateLimitStatus

func (c *Client) RateLimitStatus(endpoint string) RateLimitInfo

RateLimitStatus returns the current rate-limit state for the named endpoint. endpoint is a GraphQL operation name, e.g. "UserTweets". Returns zeroed fields if the endpoint has not been seen yet.

type Config

type Config struct {
	Output     OutputConfig              `json:"output"     yaml:"output"     toml:"output"`
	Twitter    TwitterConfig             `json:"twitter"    yaml:"twitter"    toml:"twitter"`
	Archive    ArchiveConfig             `json:"archive"    yaml:"archive"    toml:"archive"`
	Cache      CacheConfig               `json:"cache"      yaml:"cache"      toml:"cache"`
	Downloader DownloaderConfig          `json:"downloader" yaml:"downloader" toml:"downloader"`
	Extractors map[string]map[string]any `json:"extractors" yaml:"extractors" toml:"extractors"`
}

Config is the top-level library configuration. All fields have sane defaults returned by DefaultConfig(); no file is required.

func DefaultConfig

func DefaultConfig() Config

DefaultConfig returns a Config populated with sensible defaults. No fields reference external resources so it works out of the box.

func LoadConfig

func LoadConfig(path string) (Config, error)

LoadConfig reads a config file at path. Format is detected by file extension: .json → JSON, .yaml / .yml → YAML, .toml → TOML. The returned Config is populated over DefaultConfig() so missing fields get sane defaults.
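
Extension-based detection can be sketched with `path/filepath`. `formatForPath` is a hypothetical helper; lowercasing the extension is an assumption of this sketch, since the text does not say whether matching is case-sensitive.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// formatForPath returns the config format implied by the file extension.
// Case-insensitive matching is an assumption of this sketch.
func formatForPath(path string) (string, error) {
	ext := strings.ToLower(filepath.Ext(path))
	switch ext {
	case ".json":
		return "json", nil
	case ".yaml", ".yml":
		return "yaml", nil
	case ".toml":
		return "toml", nil
	default:
		return "", fmt.Errorf("unsupported config extension %q", ext)
	}
}

func main() {
	for _, p := range []string{"config.yaml", "settings.toml", "notes.txt"} {
		format, err := formatForPath(p)
		fmt.Println(p, format, err)
	}
}
```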

type ContentTypeFilter

type ContentTypeFilter struct {
	// contains filtered or unexported fields
}

ContentTypeFilter matches items whose Extension is in the allowed set. Extension comparison is case-insensitive.

func NewContentTypeFilter

func NewContentTypeFilter(extensions ...string) ContentTypeFilter

NewContentTypeFilter returns a filter that accepts items with any of the given extensions (without leading dot, e.g. "jpg", "mp4").
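
Case-insensitive matching against an allowed set is straightforward with a map keyed by the lowercased extension. This is an illustrative sketch only; `extFilter` and its methods are hypothetical stand-ins that operate on a bare extension string rather than on MediaInfo.

```go
package main

import (
	"fmt"
	"strings"
)

// extFilter holds the allowed extensions in lowercase.
type extFilter struct {
	allowed map[string]struct{}
}

// newExtFilter normalises each extension to lowercase once, up front.
func newExtFilter(extensions ...string) extFilter {
	allowed := make(map[string]struct{}, len(extensions))
	for _, e := range extensions {
		allowed[strings.ToLower(e)] = struct{}{}
	}
	return extFilter{allowed: allowed}
}

// accept reports whether ext is in the allowed set, ignoring case.
func (f extFilter) accept(ext string) bool {
	_, ok := f.allowed[strings.ToLower(ext)]
	return ok
}

func main() {
	f := newExtFilter("jpg", "MP4")
	fmt.Println(f.accept("JPG"), f.accept("mp4"), f.accept("png"))
}
```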

func (ContentTypeFilter) Accept

func (f ContentTypeFilter) Accept(info *MediaInfo) bool

type DateFilter

type DateFilter struct {
	After  time.Time
	Before time.Time
}

DateFilter accepts items whose date falls within [After, Before]. A zero time means "no bound".
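
The inclusive-range rule with optional bounds can be expressed with `time.Time.IsZero`. The sketch below is a stand-alone illustration that tests a bare time value; `dateRange`, `accept`, and `day` are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// dateRange mirrors the two documented fields; a zero time means "no bound".
type dateRange struct {
	After  time.Time
	Before time.Time
}

// accept reports whether t lies within [After, Before], bounds included.
func (r dateRange) accept(t time.Time) bool {
	if !r.After.IsZero() && t.Before(r.After) {
		return false
	}
	if !r.Before.IsZero() && t.After(r.Before) {
		return false
	}
	return true
}

// day builds a UTC midnight timestamp for the given calendar date.
func day(y, m, d int) time.Time {
	return time.Date(y, time.Month(m), d, 0, 0, 0, 0, time.UTC)
}

func main() {
	january := dateRange{After: day(2026, 1, 1), Before: day(2026, 1, 31)}
	fmt.Println(january.accept(day(2026, 1, 1)))  // lower bound is included
	fmt.Println(january.accept(day(2026, 2, 1)))  // past the upper bound
	fmt.Println(dateRange{}.accept(day(2000, 1, 1))) // no bounds set
}
```

Note that with a midnight upper bound, items posted later on that same day fall outside the range.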

func (DateFilter) Accept

func (f DateFilter) Accept(info *MediaInfo) bool

type Directory

type Directory struct {
	Path string
}

Directory signals that subsequent Media items should be written under Path.

type DownloadConfig

type DownloadConfig struct {
	OutputDir string
	// FlatDir, when true, strips any directory components from the formatted
	// filename so files land directly in OutputDir with no subdirectories.
	// Equivalent to gallery-dl's -D flag.
	FlatDir        bool
	FilenameFormat string
	Filter         Filter
	PostProcessors []PostProcessor
	Simulate       bool
	Range          *Range
	Downloader     Downloader
	// MinFileSize / MaxFileSize in bytes; 0 means no limit.
	MinFileSize int64
	MaxFileSize int64
}

DownloadConfig holds the resolved configuration for a single Download call.

type DownloadOption

type DownloadOption func(*DownloadConfig)

DownloadOption mutates a DownloadConfig.

func WithDirectOutputDir

func WithDirectOutputDir(dir string) DownloadOption

WithDirectOutputDir sets an exact output directory with no subdirectory structure. Equivalent to combining WithOutputDir and WithFlatDir.

func WithDownloaderOpt

func WithDownloaderOpt(d Downloader) DownloadOption

WithDownloaderOpt injects a custom Downloader into this Download call.

func WithFilenameFormat

func WithFilenameFormat(pattern string) DownloadOption

WithFilenameFormat overrides the filename formatter pattern.

func WithFilter

func WithFilter(f Filter) DownloadOption

WithFilter sets the filter applied to each Media item.

func WithFlatDir

func WithFlatDir() DownloadOption

WithFlatDir disables subdirectory creation - files are placed directly in OutputDir regardless of the filename format's directory components. Equivalent to gallery-dl's -D flag.

func WithOutputDir

func WithOutputDir(dir string) DownloadOption

WithOutputDir sets the base output directory. The filename format's directory components (e.g. {category}/{author.screen_name}/) are still created beneath it. Equivalent to gallery-dl's -d flag.
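
The difference between base-directory and flat mode comes down to whether the directory components of the formatted filename are kept. A minimal sketch, assuming the formatted name uses forward slashes; `targetPath` is a hypothetical helper, and it uses the slash-only `path` package to keep the output platform-independent (real filesystem code would normally use `path/filepath`).

```go
package main

import (
	"fmt"
	"path"
)

// targetPath joins the output directory with a formatted, slash-separated
// filename. In flat mode only the final path element is kept.
func targetPath(outputDir, formatted string, flat bool) string {
	if flat {
		return path.Join(outputDir, path.Base(formatted))
	}
	return path.Join(outputDir, formatted)
}

func main() {
	name := "twitter/username/1234567890_1.jpg"
	fmt.Println(targetPath("output", name, false))
	fmt.Println(targetPath("flat", name, true))
}
```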

func WithPostProcessors

func WithPostProcessors(pp ...PostProcessor) DownloadOption

WithPostProcessors replaces the post-processor list.

func WithRange

func WithRange(rv Range) DownloadOption

WithRange restricts which items are downloaded by 1-based index.

func WithSimulate

func WithSimulate(s bool) DownloadOption

WithSimulate when true drives the full extraction and filter pipeline but skips all network I/O and filesystem writes.

type Downloader

type Downloader interface {
	Download(ctx context.Context, url string, dest io.Writer, opts DownloadConfig) error
}

Downloader is the interface that wraps media download behaviour. The root implementation (HTTPDownloader) lives in internal/downloader but the interface is public so consumers can substitute their own.

type DownloaderConfig

type DownloaderConfig struct {
	// Concurrency is the number of parallel media downloads. Defaults to 4.
	Concurrency int `json:"concurrency" yaml:"concurrency" toml:"concurrency"`
	// Retries is the number of retry attempts on transient failures. Default 4.
	Retries int `json:"retries" yaml:"retries" toml:"retries"`
	// Resume enables .part file resumption via Range headers. Default true.
	Resume bool `json:"resume" yaml:"resume" toml:"resume"`
	// MinFileSize rejects files smaller than this many bytes. 0 = no limit.
	MinFileSize int64 `json:"min_file_size" yaml:"min_file_size" toml:"min_file_size"`
	// MaxFileSize rejects files larger than this many bytes. 0 = no limit.
	MaxFileSize int64 `json:"max_file_size" yaml:"max_file_size" toml:"max_file_size"`
	// ChunkSize is the streaming read buffer size in bytes. Default 32 KiB.
	ChunkSize int `json:"chunk_size" yaml:"chunk_size" toml:"chunk_size"`
	// RateLimitSleep is the number of seconds to wait when rate-limited by an HTTP 429.
	// 0 means use the server-supplied reset header.
	RateLimitSleep int `json:"rate_limit_sleep" yaml:"rate_limit_sleep" toml:"rate_limit_sleep"`
}

DownloaderConfig controls the HTTP downloader behaviour.
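
Resuming a partial download over HTTP generally means asking the server for the bytes after what is already on disk. The sketch below builds such a request with the standard library. `rangeHeader` and `newResumeRequest` are hypothetical helpers, and how the library tracks the .part file size is not shown here.

```go
package main

import (
	"fmt"
	"net/http"
)

// rangeHeader returns the Range header value for resuming at offset,
// or "" when there is nothing to resume.
func rangeHeader(offset int64) string {
	if offset <= 0 {
		return ""
	}
	return fmt.Sprintf("bytes=%d-", offset)
}

// newResumeRequest builds a GET request that asks for the remainder of the
// resource, starting at the number of bytes already present on disk.
func newResumeRequest(url string, offset int64) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	if h := rangeHeader(offset); h != "" {
		req.Header.Set("Range", h)
	}
	return req, nil
}

func main() {
	req, err := newResumeRequest("https://example.com/video.mp4", 1024)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Header.Get("Range"))
}
```

A caller should append to the partial file only when the server answers 206 Partial Content; a plain 200 means the full body is being sent again and the file should be rewritten from the start.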

type ExecPostProcessor

type ExecPostProcessor struct {
	// contains filtered or unexported fields
}

ExecPostProcessor runs an external command after each downloaded file. The argv elements are treated as Formatter patterns expanded against the media keywords map before execution.

func NewExecPostProcessor

func NewExecPostProcessor(argv ...string) *ExecPostProcessor

NewExecPostProcessor constructs a post-processor that runs the given command template. argv elements are Formatter patterns expanded per MediaInfo.

func (ExecPostProcessor) Name

func (n ExecPostProcessor) Name() string

func (ExecPostProcessor) OnAfter

func (n ExecPostProcessor) OnAfter(_ context.Context, _ string, _ *MediaInfo) error

func (ExecPostProcessor) OnError

func (n ExecPostProcessor) OnError(_ context.Context, _ error, _ *MediaInfo) error

func (*ExecPostProcessor) OnFile

func (p *ExecPostProcessor) OnFile(ctx context.Context, path string, info *MediaInfo) error

func (ExecPostProcessor) OnPrepare

func (n ExecPostProcessor) OnPrepare(_ context.Context, _ *MediaInfo) error

type ExprFilter

type ExprFilter struct {
	// contains filtered or unexported fields
}

ExprFilter evaluates a boolean expression against MediaInfo.Keywords(). The expression is compiled once at construction time for efficiency.

Example expressions:

  • "favorite_count >= 100"
  • "extension == \"jpg\" && !is_retweet"
  • "len(hashtags) > 0"

func NewExprFilter

func NewExprFilter(src string) (ExprFilter, error)

NewExprFilter compiles src into an expression filter. Returns an error if the expression is syntactically invalid or does not evaluate to a boolean.

func (ExprFilter) Accept

func (f ExprFilter) Accept(info *MediaInfo) bool

type ExtractionError

type ExtractionError = galleryerrs.ExtractionError

ExtractionError is the base interface for failures originating in extractors or the download pipeline. All concrete error types below implement it.

type Filter

type Filter interface {
	Accept(info *MediaInfo) bool
}

Filter decides whether a Media item should be processed. Implementations must be safe for concurrent use.

func AllOf

func AllOf(filters ...Filter) Filter

AllOf composes filters with logical AND (short-circuit). An empty list accepts everything.

func AnyOf

func AnyOf(filters ...Filter) Filter

AnyOf composes filters with logical OR (short-circuit). An empty list accepts everything.
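
The composition semantics described for AllOf and AnyOf can be sketched with plain function values. `item`, `filter`, `allOf`, and `anyOf` are hypothetical stand-ins for MediaInfo, Filter, and the two combinators; the empty-list behaviour follows the descriptions above.

```go
package main

import "fmt"

// item is a hypothetical, reduced stand-in for MediaInfo.
type item struct {
	Ext   string
	Faves int
}

// filter is a function-valued stand-in for the Filter interface.
type filter func(item) bool

// allOf accepts an item only if every filter accepts it. It stops at the
// first rejection, and an empty list accepts everything.
func allOf(fs ...filter) filter {
	return func(it item) bool {
		for _, f := range fs {
			if !f(it) {
				return false
			}
		}
		return true
	}
}

// anyOf accepts an item if at least one filter accepts it. It stops at the
// first acceptance, and an empty list accepts everything, as documented.
func anyOf(fs ...filter) filter {
	return func(it item) bool {
		if len(fs) == 0 {
			return true
		}
		for _, f := range fs {
			if f(it) {
				return true
			}
		}
		return false
	}
}

func main() {
	isJPG := func(it item) bool { return it.Ext == "jpg" }
	popular := func(it item) bool { return it.Faves >= 100 }

	it := item{Ext: "jpg", Faves: 5}
	fmt.Println(allOf(isJPG, popular)(it)) // both conditions required
	fmt.Println(anyOf(isJPG, popular)(it)) // one condition suffices
}
```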

func ExcludeQuotes

func ExcludeQuotes() Filter

ExcludeQuotes returns a Filter that rejects quote tweets.

func ExcludeReplies

func ExcludeReplies() Filter

ExcludeReplies returns a Filter that rejects replies.

func ExcludeRetweets

func ExcludeRetweets() Filter

ExcludeRetweets returns a Filter that rejects retweets.

func ImagesOnly

func ImagesOnly() Filter

ImagesOnly returns a filter that accepts only image media (jpg, jpeg, png, gif, webp).

func IncludeQuotes

func IncludeQuotes() Filter

IncludeQuotes returns a Filter that accepts quote tweets.

func IncludeReplies

func IncludeReplies() Filter

IncludeReplies returns a Filter that accepts replies.

func IncludeRetweets

func IncludeRetweets() Filter

IncludeRetweets returns a Filter that accepts retweets.

func MinFaves

func MinFaves(n int) Filter

MinFaves returns a filter that accepts items with at least n favorites.

func VideosOnly

func VideosOnly() Filter

VideosOnly returns a filter that accepts only video media (mp4, mov, avi, webm).

type Formatter

type Formatter struct {
	// contains filtered or unexported fields
}

Formatter compiles a pattern string into a reusable path formatter.

Pattern syntax:

Literal text is copied verbatim.

{key}                  - value from the keywords map
{key:layout}           - datetime value using Go's time.Format layout
{key!l}                - value forced to lower-case
{key!u}                - value forced to upper-case
{key!j}                - value JSON-encoded
{key/old/new}          - value with old replaced by new (first occurrence)
{key//old/new}         - value with all occurrences replaced
{key|sep}              - slice value joined with sep
{key?trueval:falseval} - if value is truthy use trueval else falseval

Specifiers may be combined left-to-right, e.g. {author.screen_name!l/bad/good}. Unknown keys produce an empty string rather than an error so patterns are robust to missing optional fields.

func NewFormatter

func NewFormatter(pattern string) (*Formatter, error)

NewFormatter compiles pattern into a Formatter. Returns an error if the pattern contains unclosed braces.

func (*Formatter) Format

func (f *Formatter) Format(kw map[string]any) string

Format evaluates the pattern against kw and returns the resulting string. Path separators in individual variable values are replaced with underscores so that a single variable cannot inject extra directory levels.
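
The separator-replacement rule is what keeps a value such as a display name containing a slash from creating an extra directory. The sketch below handles only plain {key} placeholders and string values. `expand` and `sanitizeValue` are hypothetical, and treating both / and \ as separators is an assumption.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// placeholder matches plain {key} placeholders; modifiers are out of scope.
var placeholder = regexp.MustCompile(`\{([A-Za-z0-9_.]+)\}`)

// separators rewrites both slash styles to underscores (an assumption).
var separators = strings.NewReplacer("/", "_", `\`, "_")

// sanitizeValue removes path separators from a single variable value.
func sanitizeValue(v string) string {
	return separators.Replace(v)
}

// expand substitutes each {key} with its sanitized value in a single pass.
// Literal slashes in the pattern are kept; unknown keys become "".
func expand(pattern string, kw map[string]string) string {
	return placeholder.ReplaceAllStringFunc(pattern, func(m string) string {
		key := m[1 : len(m)-1] // strip the surrounding braces
		return sanitizeValue(kw[key])
	})
}

func main() {
	kw := map[string]string{
		"author.name": "AC/DC fan",
		"tweet_id":    "1234567890",
		"num":         "1",
		"extension":   "jpg",
	}
	fmt.Println(expand("{author.name}/{tweet_id}_{num}.{extension}", kw))
}
```

Only the slash written in the pattern itself produces a directory level; the one inside the value is neutralised.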

type HashAlgorithm

type HashAlgorithm string

HashAlgorithm selects the hash function for HashPostProcessor.

const (
	HashMD5    HashAlgorithm = "md5"
	HashSHA1   HashAlgorithm = "sha1"
	HashSHA256 HashAlgorithm = "sha256"
)

type HashPostProcessor

type HashPostProcessor struct {
	// contains filtered or unexported fields
}

HashPostProcessor writes a sidecar checksum file alongside each download. The sidecar is named "<original>.<ext>.<algo>" (e.g. "image.jpg.sha256").
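
A checksum sidecar of this shape can be produced with the standard library alone. The sketch below covers SHA-256 only. `sidecarName`, `sha256Hex`, `writeSidecar`, and `demo` are hypothetical names, and the sidecar content (hex digest plus newline) is an assumption, since the text specifies only the file name.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
	"path/filepath"
)

// sidecarName appends the algorithm name to the full file name.
func sidecarName(path, algo string) string {
	return path + "." + algo
}

// sha256Hex returns the lowercase hex SHA-256 digest of data.
func sha256Hex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// writeSidecar hashes the file at path and writes the digest next to it.
// Reading the whole file keeps the sketch short; streaming through
// sha256.New with io.Copy would suit large videos better.
func writeSidecar(path string) (string, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	out := sidecarName(path, "sha256")
	return out, os.WriteFile(out, []byte(sha256Hex(data)+"\n"), 0o644)
}

// demo writes a small file into a temporary directory, creates its
// sidecar, and returns the sidecar contents.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "sidecar")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	src := filepath.Join(dir, "image.jpg")
	if err := os.WriteFile(src, []byte("hello"), 0o644); err != nil {
		return "", err
	}
	sidecar, err := writeSidecar(src)
	if err != nil {
		return "", err
	}
	contents, err := os.ReadFile(sidecar)
	return string(contents), err
}

func main() {
	contents, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(contents)
}
```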

func NewHashPostProcessor

func NewHashPostProcessor(algo HashAlgorithm) *HashPostProcessor

NewHashPostProcessor constructs a hash post-processor.

func (HashPostProcessor) Name

func (n HashPostProcessor) Name() string

func (HashPostProcessor) OnAfter

func (n HashPostProcessor) OnAfter(_ context.Context, _ string, _ *MediaInfo) error

func (HashPostProcessor) OnError

func (n HashPostProcessor) OnError(_ context.Context, _ error, _ *MediaInfo) error

func (*HashPostProcessor) OnFile

func (p *HashPostProcessor) OnFile(_ context.Context, path string, _ *MediaInfo) error

func (HashPostProcessor) OnPrepare

func (n HashPostProcessor) OnPrepare(_ context.Context, _ *MediaInfo) error

type HttpError

type HttpError = galleryerrs.HttpError

HttpError is returned when an HTTP response has an unexpected status code.

type InputError

type InputError = galleryerrs.InputError

InputError is returned for invalid configuration or bad input to a public API function. It is not an ExtractionError.

type Media

type Media struct {
	Info *MediaInfo
	// URL is the direct download URL (already resolved to best variant).
	URL string
}

Media carries a single downloadable item.

func (Media) Download

func (m Media) Download(ctx context.Context, dest io.Writer, d Downloader, cfg DownloadConfig) error

Download streams the media to dest using the client's configured Downloader. It is a convenience method; callers may also use the URL directly.

type MediaInfo

type MediaInfo struct {
	TweetID       string
	Author        AuthorInfo
	Date          time.Time
	Content       string
	MediaURL      string
	Extension     string
	Num           int
	Count         int
	FavoriteCount int
	RetweetCount  int
	ReplyCount    int
	QuoteCount    int
	Lang          string
	Hashtags      []string
	Mentions      []string
	IsRetweet     bool
	IsReply       bool
	IsQuote       bool
	Card          map[string]any
	Category      string
}

MediaInfo is the metadata record for a single downloaded item. Keywords() returns a flat map for use with Formatter and ExprFilter.

func (*MediaInfo) Keywords

func (m *MediaInfo) Keywords() map[string]any

Keywords returns a flat map of all MediaInfo fields for template evaluation and expression filters.

func (*MediaInfo) MarshalJSON

func (m *MediaInfo) MarshalJSON() ([]byte, error)

MarshalJSON produces the full JSON representation of the info, used by MetadataPostProcessor and GetJSON.

type MemoryArchive

type MemoryArchive struct {
	// contains filtered or unexported fields
}

MemoryArchive is an in-process archive backed by a map. It is safe for concurrent use and suitable for testing or short-lived runs.

func NewMemoryArchive

func NewMemoryArchive() *MemoryArchive

NewMemoryArchive returns an empty MemoryArchive.

func (*MemoryArchive) Close

func (m *MemoryArchive) Close() error

func (*MemoryArchive) Has

func (m *MemoryArchive) Has(_ context.Context, key string) (bool, error)

func (*MemoryArchive) Put

func (m *MemoryArchive) Put(_ context.Context, key string) error
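
As a rough illustration of what "an in-process archive backed by a map" can look like, the sketch below guards a map with a sync.RWMutex and exposes the same Has/Put/Close method shapes. The type memArchive, the helper shouldDownload, and the key "tweet-1_1" are hypothetical names for this example; this is not the library's MemoryArchive implementation.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// memArchive is a hypothetical stand-in, not the library's MemoryArchive.
type memArchive struct {
	mu   sync.RWMutex
	seen map[string]struct{}
}

func newMemArchive() *memArchive {
	return &memArchive{seen: make(map[string]struct{})}
}

// Has reports whether key was recorded earlier.
func (m *memArchive) Has(_ context.Context, key string) (bool, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	_, ok := m.seen[key]
	return ok, nil
}

// Put records key; recording the same key twice is harmless.
func (m *memArchive) Put(_ context.Context, key string) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.seen[key] = struct{}{}
	return nil
}

// Close has nothing to release for an in-memory map.
func (m *memArchive) Close() error { return nil }

// shouldDownload returns true the first time key is seen and records it.
// Has followed by Put is not atomic, so two goroutines racing on the same
// key could both get true; a real pipeline would need to tolerate that.
func shouldDownload(a *memArchive, key string) bool {
	ctx := context.Background()
	if ok, _ := a.Has(ctx, key); ok {
		return false
	}
	_ = a.Put(ctx, key)
	return true
}

func main() {
	a := newMemArchive()
	defer a.Close()
	fmt.Println(shouldDownload(a, "tweet-1_1"))
	fmt.Println(shouldDownload(a, "tweet-1_1"))
}
```

The second call reports an archive hit, which is the case Result.SkippedFiles counts.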

type Message

type Message interface {
	// contains filtered or unexported methods
}

Message is a sealed sum type yielded by Client.Extract. Consumers type-switch over Directory, Media, and Queue.
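
The type-switch pattern described above can be sketched with local stand-in types. The unexported method name isMessage, the reduced field sets, and the helper describe are assumptions made for this example; whether the library yields values or pointers is not stated here, so the sketch uses values.

```go
package main

import "fmt"

// Message is a local stand-in for the sealed interface. The unexported
// method keeps types outside this package from implementing it.
type Message interface{ isMessage() }

// Stand-ins for the three documented variants. Directory's real fields
// are not reproduced; Media and Queue keep only URL.
type Directory struct{}
type Media struct{ URL string }
type Queue struct{ URL string }

func (Directory) isMessage() {}
func (Media) isMessage()     {}
func (Queue) isMessage()     {}

// describe dispatches on the concrete variant with a type switch.
func describe(m Message) string {
	switch v := m.(type) {
	case Directory:
		return "directory"
	case Media:
		return "media " + v.URL
	case Queue:
		return "queue " + v.URL
	default:
		return "unknown"
	}
}

func main() {
	msgs := []Message{
		Directory{},
		Media{URL: "https://example.com/a.jpg"},
		Queue{URL: "https://example.com/list"},
	}
	for _, m := range msgs {
		fmt.Println(describe(m))
	}
}
```

Keeping a default branch is a cheap safeguard in case a later version adds a variant.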

type MetadataPostProcessor

type MetadataPostProcessor struct {
	// contains filtered or unexported fields
}

MetadataPostProcessor writes a sidecar JSON file for each downloaded item. The sidecar is named "<original>.<ext>.json" or "<original>.json".

func NewMetadataPostProcessor

func NewMetadataPostProcessor() *MetadataPostProcessor

NewMetadataPostProcessor constructs a metadata sidecar post-processor.

func (MetadataPostProcessor) Name

func (n MetadataPostProcessor) Name() string

func (MetadataPostProcessor) OnAfter

func (n MetadataPostProcessor) OnAfter(_ context.Context, _ string, _ *MediaInfo) error

func (MetadataPostProcessor) OnError

func (n MetadataPostProcessor) OnError(_ context.Context, _ error, _ *MediaInfo) error

func (*MetadataPostProcessor) OnFile

func (p *MetadataPostProcessor) OnFile(_ context.Context, path string, info *MediaInfo) error

func (MetadataPostProcessor) OnPrepare

func (n MetadataPostProcessor) OnPrepare(_ context.Context, _ *MediaInfo) error

type MtimePostProcessor

type MtimePostProcessor struct {
	// contains filtered or unexported fields
}

MtimePostProcessor sets the file modification time to the tweet date.

func NewMtimePostProcessor

func NewMtimePostProcessor() *MtimePostProcessor

NewMtimePostProcessor constructs a post-processor that sets file mtimes.
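
Setting a file's modification time from a timestamp can be done with os.Chtimes from the standard library. The sketch below is an assumption about the general technique, not the library's code: setMtime and demo are hypothetical helpers, the date is made up, and the access time is set to the same value only because os.Chtimes requires both.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// setMtime sets both the access and modification time of path to t.
func setMtime(path string, t time.Time) error {
	return os.Chtimes(path, t, t)
}

// demo creates a temporary file, applies a fixed date, and reports
// whether the file's modification time now equals that date.
func demo() (bool, error) {
	f, err := os.CreateTemp("", "mtime-sketch-*")
	if err != nil {
		return false, err
	}
	defer os.Remove(f.Name())
	if err := f.Close(); err != nil {
		return false, err
	}

	// Hypothetical tweet date, in whole seconds to avoid precision loss.
	tweetDate := time.Date(2024, 1, 2, 3, 4, 5, 0, time.UTC)
	if err := setMtime(f.Name(), tweetDate); err != nil {
		return false, err
	}

	st, err := os.Stat(f.Name())
	if err != nil {
		return false, err
	}
	return st.ModTime().Equal(tweetDate), nil
}

func main() {
	ok, err := demo()
	fmt.Println(ok, err)
}
```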

func (MtimePostProcessor) Name

func (n MtimePostProcessor) Name() string

func (MtimePostProcessor) OnAfter

func (n MtimePostProcessor) OnAfter(_ context.Context, _ string, _ *MediaInfo) error

func (MtimePostProcessor) OnError

func (n MtimePostProcessor) OnError(_ context.Context, _ error, _ *MediaInfo) error

func (*MtimePostProcessor) OnFile

func (p *MtimePostProcessor) OnFile(_ context.Context, path string, info *MediaInfo) error

func (MtimePostProcessor) OnPrepare

func (n MtimePostProcessor) OnPrepare(_ context.Context, _ *MediaInfo) error

type NotFoundError

type NotFoundError = galleryerrs.NotFoundError

NotFoundError is returned for deleted tweets, suspended accounts, DMCA takedowns, geo-blocked content, or any other permanent unavailability. The Reason field distinguishes the cause without requiring a new type per case.
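
Callers would typically inspect such an error with errors.As and branch on Reason. The sketch below uses a local stand-in type: only the Reason field is documented above, while the pointer receiver, the error text, and the helper unavailableReason are assumptions for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// NotFoundError is a local stand-in. Only Reason is taken from the
// documentation; the pointer receiver is an assumption.
type NotFoundError struct {
	Reason string
}

func (e *NotFoundError) Error() string { return "not found: " + e.Reason }

// unavailableReason walks the error chain and returns the Reason of the
// first NotFoundError it finds.
func unavailableReason(err error) (string, bool) {
	var nf *NotFoundError
	if errors.As(err, &nf) {
		return nf.Reason, true
	}
	return "", false
}

func main() {
	// Wrapping with %w keeps the typed error reachable for errors.As.
	err := fmt.Errorf("extract tweet: %w", &NotFoundError{Reason: "dmca"})
	if reason, ok := unavailableReason(err); ok {
		fmt.Println("skipping:", reason)
	}
}
```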

type Option

type Option func(*Client)

Option configures a Client during construction.

func WithArchive

func WithArchive(a Archive) Option

WithArchive injects an archive backend for skip-already-downloaded logic.

func WithConcurrency

func WithConcurrency(n int) Option

WithConcurrency sets the number of parallel media downloads.

func WithConfig

func WithConfig(cfg Config) Option

WithConfig sets the library Config. Option values applied after this call override the Config fields.
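
The override rule follows naturally from the functional-options pattern when options are applied in argument order. The sketch below shows that pattern with reduced local stand-ins: the Concurrency field, the constructor newClient, and the internal layout of Client are assumptions, not the library's definitions.

```go
package main

import "fmt"

// Config and Client are reduced local stand-ins. Only a concurrency
// setting is modelled, and its field name is an assumption.
type Config struct{ Concurrency int }
type Client struct{ cfg Config }

// Option has the documented shape: a function that mutates a Client.
type Option func(*Client)

// WithConfig replaces the whole configuration.
func WithConfig(cfg Config) Option {
	return func(c *Client) { c.cfg = cfg }
}

// WithConcurrency overwrites a single setting.
func WithConcurrency(n int) Option {
	return func(c *Client) { c.cfg.Concurrency = n }
}

// newClient (hypothetical) applies options in order, so a later option
// wins over an earlier one that touched the same setting.
func newClient(opts ...Option) *Client {
	c := &Client{}
	for _, opt := range opts {
		opt(c)
	}
	return c
}

func main() {
	c := newClient(WithConfig(Config{Concurrency: 2}), WithConcurrency(8))
	fmt.Println(c.cfg.Concurrency)
}
```

In this sketch the client ends up with a concurrency of 8 because WithConcurrency is applied after WithConfig; swapping the two arguments would leave it at 2.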

func WithCookies

func WithCookies(jar http.CookieJar) Option

WithCookies sets the cookie jar used for authenticated requests.

func WithCookiesFromBrowser

func WithCookiesFromBrowser(browser string) Option

WithCookiesFromBrowser extracts cookies from the named browser profile and injects them into the client. Supported values: "firefox". If extraction fails, construction does not return an error: a warning is logged and the client falls back to guest (unauthenticated) mode.

func WithCookiesFromFile

func WithCookiesFromFile(path string) Option

WithCookiesFromFile reads a Netscape cookies.txt file and injects the cookies into the client.

func WithDownloader

func WithDownloader(d Downloader) Option

WithDownloader replaces the media downloader. Consumers implement Downloader to intercept or entirely replace download behaviour.

func WithHTTPClient

func WithHTTPClient(hc *http.Client) Option

WithHTTPClient replaces the underlying HTTP client entirely. The provided client's transport and timeout settings take precedence.

func WithLogger

func WithLogger(l *slog.Logger) Option

WithLogger sets the structured logger. The library never calls log.Fatal or writes to stdout/stderr directly; all output goes through this logger.

func WithProxy

func WithProxy(rawURL string) Option

WithProxy configures an HTTP proxy URL (e.g. "http://localhost:8080").

func WithRateLimitCallback

func WithRateLimitCallback(fn func(endpoint string, resetAt time.Time)) Option

WithRateLimitCallback registers a callback that fires when a per-endpoint rate limit is hit or updated. fn is called from a background goroutine so it must be safe for concurrent use.
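
Because the callback runs on a background goroutine, any state it touches needs synchronisation. A minimal sketch of a callback that satisfies this, using a mutex-guarded map: resetTracker, its methods, simulate, and the endpoint names are hypothetical, and simulate only imitates concurrent invocations rather than calling the library.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// resetTracker remembers the latest reset time reported per endpoint.
type resetTracker struct {
	mu     sync.Mutex
	resets map[string]time.Time
}

func newResetTracker() *resetTracker {
	return &resetTracker{resets: make(map[string]time.Time)}
}

// onRateLimit matches func(endpoint string, resetAt time.Time).
func (t *resetTracker) onRateLimit(endpoint string, resetAt time.Time) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.resets[endpoint] = resetAt
}

// endpoints returns how many distinct endpoints have been reported.
func (t *resetTracker) endpoints() int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return len(t.resets)
}

// simulate invokes the callback from n goroutines at once, standing in
// for calls that would arrive from the library's background goroutine.
func simulate(t *resetTracker, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			t.onRateLimit(fmt.Sprintf("endpoint-%d", i), time.Now().Add(time.Minute))
		}(i)
	}
	wg.Wait()
}

func main() {
	t := newResetTracker()
	simulate(t, 20)
	fmt.Println(t.endpoints())
}
```

Given the signature above, the method value t.onRateLimit has the right type to be passed to WithRateLimitCallback.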

type OutputConfig

type OutputConfig struct {
	// Dir is the base directory for downloads. Defaults to the current directory.
	Dir string `json:"dir" yaml:"dir" toml:"dir"`
	// FilenameFormat is the formatter pattern. See NewFormatter for syntax.
	FilenameFormat string `json:"filename_format" yaml:"filename_format" toml:"filename_format"`
	// SkipExisting skips download when the destination file already exists.
	SkipExisting bool `json:"skip_existing" yaml:"skip_existing" toml:"skip_existing"`
	// WriteMetadata writes a JSON sidecar next to each downloaded file.
	WriteMetadata bool `json:"write_metadata" yaml:"write_metadata" toml:"write_metadata"`
}

OutputConfig controls where and how files are saved.
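
The struct tags above determine the key names in a config file. As a sketch, the snippet below decodes a JSON fragment into a local copy of OutputConfig that keeps only the json tags. The helper decodeOutput and the sample values are made up for this example, and where this fragment sits inside the full config file is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OutputConfig is a local copy of the documented struct with only the
// json tags kept.
type OutputConfig struct {
	Dir            string `json:"dir"`
	FilenameFormat string `json:"filename_format"`
	SkipExisting   bool   `json:"skip_existing"`
	WriteMetadata  bool   `json:"write_metadata"`
}

// decodeOutput parses a JSON object into an OutputConfig. Keys that are
// absent leave the corresponding field at its zero value.
func decodeOutput(data string) (OutputConfig, error) {
	var oc OutputConfig
	err := json.Unmarshal([]byte(data), &oc)
	return oc, err
}

func main() {
	oc, err := decodeOutput(`{
		"dir": "downloads",
		"filename_format": "{author.screen_name}_{tweet_id}_{num}",
		"skip_existing": true
	}`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%+v\n", oc)
}
```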

type PostProcessor

type PostProcessor interface {
	// Name returns a human-readable identifier for logging.
	Name() string
	// OnPrepare is called before the download begins.
	OnPrepare(ctx context.Context, info *MediaInfo) error
	// OnFile is called with the path to the completed download file.
	OnFile(ctx context.Context, path string, info *MediaInfo) error
	// OnAfter is called after all post-processors have run OnFile.
	OnAfter(ctx context.Context, path string, info *MediaInfo) error
	// OnError is called when the download or a prior stage failed.
	OnError(ctx context.Context, err error, info *MediaInfo) error
}

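PostProcessor hooks into the lifecycle of each download: OnPrepare runs before the download begins, OnFile and OnAfter run once the file is complete, and OnError runs when the download or a prior stage failed. Implementations must be safe for concurrent use.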
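
A custom implementation only has to provide the five methods. The sketch below repeats the interface locally, with MediaInfo reduced to one field, and implements a hypothetical countingPP that tallies completed and failed items with atomic counters so it stays safe under parallel downloads. The call order in simulate is an assumption drawn from the method comments, not the library's scheduler.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync/atomic"
)

// MediaInfo is reduced to one field for this sketch.
type MediaInfo struct{ TweetID string }

// PostProcessor repeats the documented method set.
type PostProcessor interface {
	Name() string
	OnPrepare(ctx context.Context, info *MediaInfo) error
	OnFile(ctx context.Context, path string, info *MediaInfo) error
	OnAfter(ctx context.Context, path string, info *MediaInfo) error
	OnError(ctx context.Context, err error, info *MediaInfo) error
}

// countingPP is a hypothetical post-processor that tallies outcomes.
type countingPP struct {
	files    atomic.Int64
	failures atomic.Int64
}

func (p *countingPP) Name() string                                { return "counting" }
func (p *countingPP) OnPrepare(context.Context, *MediaInfo) error { return nil }
func (p *countingPP) OnFile(context.Context, string, *MediaInfo) error {
	p.files.Add(1)
	return nil
}
func (p *countingPP) OnAfter(context.Context, string, *MediaInfo) error { return nil }
func (p *countingPP) OnError(context.Context, error, *MediaInfo) error {
	p.failures.Add(1)
	return nil
}

// Compile-time check that the method set is complete.
var _ PostProcessor = (*countingPP)(nil)

// simulate drives the hooks for one successful and one failed item.
func simulate(p PostProcessor) error {
	ctx := context.Background()
	ok := &MediaInfo{TweetID: "1"}
	if err := p.OnPrepare(ctx, ok); err != nil {
		return err
	}
	if err := p.OnFile(ctx, "1.jpg", ok); err != nil {
		return err
	}
	if err := p.OnAfter(ctx, "1.jpg", ok); err != nil {
		return err
	}
	bad := &MediaInfo{TweetID: "2"}
	if err := p.OnPrepare(ctx, bad); err != nil {
		return err
	}
	return p.OnError(ctx, errors.New("download failed"), bad)
}

func main() {
	p := &countingPP{}
	if err := simulate(p); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(p.files.Load(), p.failures.Load())
}
```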

type Queue

type Queue struct {
	URL string
}

Queue is a nested URL that Extract should recurse into.

type Range

type Range struct {
	// contains filtered or unexported fields
}

Range selects a subset of items by 1-based index (e.g. "1-5,7,10-20").

func ParseRange

func ParseRange(s string) (Range, error)

ParseRange parses a comma-separated range expression like "1-5,7,10-20".

func (Range) Contains

func (r Range) Contains(n int) bool

Contains reports whether 1-based index n falls within the range.

func (Range) String

func (r Range) String() string

String returns the raw range expression.
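
To make the range syntax concrete, here is one way such an expression could be parsed and queried. This is an independent sketch, not the library's parser: indexRange, span, and parseRange are hypothetical names, and details such as whitespace handling or open-ended ranges like "5-" (rejected here) may differ in the real ParseRange.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// span is an inclusive interval of 1-based indexes.
type span struct{ lo, hi int }

// indexRange is a hypothetical stand-in for Range.
type indexRange struct {
	raw   string
	spans []span
}

// parseRange accepts comma-separated items, each "N" or "N-M" with
// 1 <= N <= M.
func parseRange(s string) (indexRange, error) {
	r := indexRange{raw: s}
	for _, part := range strings.Split(s, ",") {
		part = strings.TrimSpace(part)
		loStr, hiStr, isSpan := strings.Cut(part, "-")
		lo, err := strconv.Atoi(loStr)
		if err != nil {
			return indexRange{}, fmt.Errorf("invalid range %q: %w", part, err)
		}
		hi := lo
		if isSpan {
			if hi, err = strconv.Atoi(hiStr); err != nil {
				return indexRange{}, fmt.Errorf("invalid range %q: %w", part, err)
			}
		}
		if lo < 1 || hi < lo {
			return indexRange{}, fmt.Errorf("invalid range %q", part)
		}
		r.spans = append(r.spans, span{lo, hi})
	}
	return r, nil
}

// Contains reports whether 1-based index n falls inside any span.
func (r indexRange) Contains(n int) bool {
	for _, s := range r.spans {
		if n >= s.lo && n <= s.hi {
			return true
		}
	}
	return false
}

// String returns the expression exactly as it was given.
func (r indexRange) String() string { return r.raw }

func main() {
	r, err := parseRange("1-5,7,10-20")
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(r.Contains(3), r.Contains(6), r.Contains(7))
}
```

A linear scan over the spans is enough for the handful of items a typical expression contains.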

type RangeFilter

type RangeFilter struct {
	// contains filtered or unexported fields
}

RangeFilter selects items by 1-based index using a comma-separated range expression like "1-5,7,10-20".

func NewRangeFilter

func NewRangeFilter(s string) (RangeFilter, error)

NewRangeFilter parses s and returns a RangeFilter or an error.

func (RangeFilter) Accept

func (f RangeFilter) Accept(info *MediaInfo) bool

type RateLimitError

type RateLimitError = galleryerrs.RateLimitError

RateLimitError carries the full context of a 429 response or an internal rate-limit trip.

type RateLimitInfo

type RateLimitInfo struct {
	Endpoint  string
	Limit     int
	Remaining int
	ResetAt   time.Time
}

RateLimitInfo is a snapshot of a single endpoint's rate-limit state.

type RenamePostProcessor

type RenamePostProcessor struct {
	// contains filtered or unexported fields
}

RenamePostProcessor renames each file using a Formatter pattern.

func NewRenamePostProcessor

func NewRenamePostProcessor(pattern string) *RenamePostProcessor

NewRenamePostProcessor constructs a rename post-processor. pattern is a Formatter template, e.g. "{author.screen_name}_{tweet_id}_{num}".

func (RenamePostProcessor) Name

func (n RenamePostProcessor) Name() string

func (RenamePostProcessor) OnAfter

func (n RenamePostProcessor) OnAfter(_ context.Context, _ string, _ *MediaInfo) error

func (RenamePostProcessor) OnError

func (n RenamePostProcessor) OnError(_ context.Context, _ error, _ *MediaInfo) error

func (*RenamePostProcessor) OnFile

func (p *RenamePostProcessor) OnFile(_ context.Context, path string, info *MediaInfo) error

func (RenamePostProcessor) OnPrepare

func (n RenamePostProcessor) OnPrepare(_ context.Context, _ *MediaInfo) error

type Result

type Result struct {
	TotalFiles       int
	SkippedFiles     int // archive hits
	FailedFiles      int
	UnavailableFiles int // deleted, DMCA, suspended, tombstone, etc.
	Errors           []error
	Duration         time.Duration
}

Result is returned by Client.Download summarising the completed operation.

type SQLiteArchive

type SQLiteArchive struct {
	// contains filtered or unexported fields
}

SQLiteArchive is a persistent archive backed by a modernc.org/sqlite database. It uses a single table with the archive key as the primary key, so Has and Put are O(log n). All access goes through a single *sql.DB, which makes it safe for concurrent use.

func NewSQLiteArchive

func NewSQLiteArchive(path string) (*SQLiteArchive, error)

NewSQLiteArchive opens (or creates) the SQLite database at path and prepares statements. The caller must call Close when done.

func (*SQLiteArchive) Close

func (a *SQLiteArchive) Close() error

func (*SQLiteArchive) Has

func (a *SQLiteArchive) Has(ctx context.Context, key string) (bool, error)

func (*SQLiteArchive) Put

func (a *SQLiteArchive) Put(ctx context.Context, key string) error

type SQLiteCache

type SQLiteCache struct {
	// contains filtered or unexported fields
}

SQLiteCache is a persistent cache backed by modernc.org/sqlite.

func NewSQLiteCache

func NewSQLiteCache(path string) (*SQLiteCache, error)

NewSQLiteCache opens (or creates) the cache database at path. The caller must call Close when done.

func (*SQLiteCache) Close

func (c *SQLiteCache) Close() error

func (*SQLiteCache) Delete

func (c *SQLiteCache) Delete(ctx context.Context, key string) error

func (*SQLiteCache) Get

func (c *SQLiteCache) Get(ctx context.Context, key string) (string, bool, error)

func (*SQLiteCache) Set

func (c *SQLiteCache) Set(ctx context.Context, key, value string, ttl time.Duration) error

type Skipped

type Skipped struct {
	TweetID string
	Reason  string // "tombstone" | "deleted" | "suspended" | "dmca" | ...
	Cause   error  // typed error if available (e.g. *NotFoundError)
}

Skipped signals that an item was identified but cannot be retrieved - deleted, DMCA-blocked, from a suspended account, geo-restricted, etc. The run continues; the item is counted in Result.UnavailableFiles.

type TwitterConfig

type TwitterConfig struct {
	// GuestToken overrides the dynamically-fetched guest token.
	GuestToken string `json:"guest_token" yaml:"guest_token" toml:"guest_token"`
	// AuthToken is the auth_token cookie value for authenticated requests.
	AuthToken string `json:"auth_token" yaml:"auth_token" toml:"auth_token"`
	// CSRF is the ct0 cookie / x-csrf-token value.
	CSRF string `json:"csrf" yaml:"csrf" toml:"csrf"`
	// UserAgent overrides the default browser User-Agent sent to Twitter.
	UserAgent string `json:"user_agent" yaml:"user_agent" toml:"user_agent"`
	// RepliesEnabled includes reply tweets when extracting a user timeline.
	RepliesEnabled bool `json:"replies_enabled" yaml:"replies_enabled" toml:"replies_enabled"`
	// RetweetsEnabled includes retweets when extracting a user timeline.
	RetweetsEnabled bool `json:"retweets_enabled" yaml:"retweets_enabled" toml:"retweets_enabled"`
	// VideoMaxBitrate selects the highest-bitrate video variant when true and the lowest when false.
	VideoMaxBitrate bool `json:"video_max_bitrate" yaml:"video_max_bitrate" toml:"video_max_bitrate"`
}

TwitterConfig holds Twitter-specific extractor settings.

type ZipPostProcessor

type ZipPostProcessor struct {
	// contains filtered or unexported fields
}

ZipPostProcessor streams each downloaded file into a ZIP archive. The zip file is created lazily on the first OnFile call.

func NewZipPostProcessor

func NewZipPostProcessor(zipPath string) *ZipPostProcessor

NewZipPostProcessor constructs a zip post-processor that writes to zipPath.

func (*ZipPostProcessor) Close

func (p *ZipPostProcessor) Close() error

Close finalises and closes the ZIP archive. Call when all files are done.

func (ZipPostProcessor) Name

func (n ZipPostProcessor) Name() string

func (*ZipPostProcessor) OnAfter

func (p *ZipPostProcessor) OnAfter(_ context.Context, _ string, _ *MediaInfo) error

func (ZipPostProcessor) OnError

func (n ZipPostProcessor) OnError(_ context.Context, _ error, _ *MediaInfo) error

func (*ZipPostProcessor) OnFile

func (p *ZipPostProcessor) OnFile(_ context.Context, path string, _ *MediaInfo) error

func (ZipPostProcessor) OnPrepare

func (n ZipPostProcessor) OnPrepare(_ context.Context, _ *MediaInfo) error

Directories

Path Synopsis
cmd
go-gallery command
Command go-gallery is a reference CLI for the go-gallery library.
internal
browser
Package browser provides cookie extraction helpers for locally-installed web browsers.
downloader
Package downloader provides HTTP and yt-dlp based media download implementations.
extractor
Package extractor defines the Extractor interface, the item types yielded by extractors, and the global registry used by gallery.Client.Extract.
extractor/twitter
Package twitter registers extractors for Twitter/X URLs.
galleryerrs
Package galleryerrs contains the exported error types shared between the root gallery package and internal sub-packages (extractor, downloader).
ratelimit
Package ratelimit provides per-endpoint rate-limit tracking driven by Twitter's x-rate-limit-* response headers.
textutil
Package textutil provides string helpers used across the library.
