youtube-transcript-api-go

Version: v0.0.10
Published: Jul 17, 2025 License: MIT

YouTube Transcript API Go

A Go library and CLI tool for fetching transcripts and subtitles from YouTube videos. It supports multiple languages, JSON and text output, and options for preserving formatting and including timestamps.

Features

  • Fetch transcripts from YouTube videos
  • Support for multiple languages
  • JSON and Text output formats
  • Concurrent processing of transcripts
  • Preserve or strip formatting
  • Include/exclude timestamps
  • Support for both auto-generated and manually created subtitles

Installation

As a CLI Tool
# Install the CLI tool
go install github.com/horiagug/youtube-transcript-api-go/cmd/yt_transcript@latest
As a Library
# Add to your Go project
go get github.com/horiagug/youtube-transcript-api-go

CLI Usage

# Basic usage
yt_transcript [flags] VIDEO_ID

# Flags:
  -languages string
        Comma-separated list of language codes (default "en")
  -formatter string
        Formatter to use (json, text) (default "json")
  -preserve_formatting
        Preserve formatting (default true)
  -with_timestamps
        Include timestamps (default true)
  -exclude_manually_created
        Exclude manually created subtitles
  -exclude_auto_generated
        Exclude auto-generated subtitles
Examples
# Get English transcripts in JSON format
yt_transcript dQw4w9WgXcQ

# The full video URL can also be passed instead of the ID
yt_transcript "https://www.youtube.com/watch?v=dQw4w9WgXcQ"

# Get Spanish transcripts in text format
yt_transcript -languages es -formatter text u6aZYZv3duo

# Get transcripts without timestamps
yt_transcript -with_timestamps=false dQw4w9WgXcQ
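Since the CLI accepts either a bare video ID or a full URL, a caller that wraps it may want the same flexibility. The following is a minimal sketch of one way to normalize such an argument with the standard library. The helper name videoIDFromArg is hypothetical and is not part of this module, and the tool's own parsing may work differently.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// videoIDFromArg is a hypothetical helper, not part of the library.
// It accepts a bare video ID or a YouTube URL and returns the ID.
func videoIDFromArg(arg string) (string, error) {
	arg = strings.TrimSpace(arg)
	if arg == "" {
		return "", fmt.Errorf("empty video argument")
	}
	// Anything without a scheme is treated as a bare video ID.
	if !strings.Contains(arg, "://") {
		return arg, nil
	}
	u, err := url.Parse(arg)
	if err != nil {
		return "", fmt.Errorf("parse %q: %w", arg, err)
	}
	// Standard watch URLs carry the ID in the "v" query parameter.
	if id := u.Query().Get("v"); id != "" {
		return id, nil
	}
	// Short youtu.be links carry the ID as the URL path.
	if strings.EqualFold(u.Hostname(), "youtu.be") {
		if id := strings.Trim(u.Path, "/"); id != "" {
			return id, nil
		}
	}
	return "", fmt.Errorf("no video ID found in %q", arg)
}

func main() {
	for _, arg := range []string{
		"dQw4w9WgXcQ",
		"https://www.youtube.com/watch?v=dQw4w9WgXcQ",
	} {
		id, err := videoIDFromArg(arg)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(id)
	}
}
```

Both inputs in main resolve to the same ID, so the rest of a program only ever has to deal with one form.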

Library Usage

package main

import (
    "fmt"

    "github.com/horiagug/youtube-transcript-api-go/pkg/yt_transcript"
    "github.com/horiagug/youtube-transcript-api-go/pkg/yt_transcript_formatters"
)

func main() {
    // Create a new client with a JSON formatter that omits timestamps
    client := yt_transcript.NewClient(
        yt_transcript.WithFormatter(
            yt_transcript_formatters.NewJSONFormatter(
                yt_transcript_formatters.WithTimestamps(false),
            ),
        ),
    )

    // Get formatted transcripts
    videoID := "dQw4w9WgXcQ"
    languages := []string{"en"}
    transcript, err := client.GetFormattedTranscripts(videoID, languages, true)
    if err != nil {
        panic(err)
    }

    fmt.Println(transcript)

    // Or get raw transcript data
    transcripts, err := client.GetTranscripts(videoID, languages)
    if err != nil {
        panic(err)
    }

    // Process transcripts as needed
    for _, t := range transcripts {
        fmt.Printf("Language: %s\n", t.Language)
        for _, line := range t.Lines {
            // %v prints Start regardless of its concrete type
            fmt.Printf("%v: %s\n", line.Start, line.Text)
        }
    }
}
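The loop over raw transcript data above is where custom post-processing would go. As a rough sketch, the example below renders lines with a clock-style timestamp. The Line and Transcript types are hypothetical stand-ins that mirror only the field names used above (Language, Lines, Start, Text). That Start is a float64 number of seconds is an assumption, and the sample data is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// Line and Transcript are hypothetical stand-ins for the library's types.
// Only the field names come from the README; the field types are assumed.
type Line struct {
	Start float64 // assumed: offset from the start of the video, in seconds
	Text  string
}

type Transcript struct {
	Language string
	Lines    []Line
}

// clock renders a second offset as MM:SS, or H:MM:SS from one hour onward.
func clock(seconds float64) string {
	total := int(seconds) // fractional seconds are dropped
	h, m, s := total/3600, (total%3600)/60, total%60
	if h > 0 {
		return fmt.Sprintf("%d:%02d:%02d", h, m, s)
	}
	return fmt.Sprintf("%02d:%02d", m, s)
}

// render writes one caption per output line, optionally with a timestamp.
func render(t Transcript, withTimestamps bool) string {
	var b strings.Builder
	for _, l := range t.Lines {
		if withTimestamps {
			fmt.Fprintf(&b, "[%s] ", clock(l.Start))
		}
		b.WriteString(l.Text)
		b.WriteByte('\n')
	}
	return b.String()
}

func main() {
	// Made-up sample data, not real transcript content.
	t := Transcript{
		Language: "en",
		Lines: []Line{
			{Start: 0.5, Text: "hello"},
			{Start: 75.2, Text: "world"},
		},
	}
	fmt.Print(render(t, true))
}
```

If the library's Start field turns out to have a different type, only the clock helper would need to change.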

Custom Formatting

The library supports both JSON and Text formatters with configurable options:

// JSON formatter with custom options
jsonFormatter := yt_transcript_formatters.NewJSONFormatter(
    yt_transcript_formatters.WithPrettyPrint(true),
    yt_transcript_formatters.WithTimestamps(true),
)

// Text formatter with custom options
textFormatter := yt_transcript_formatters.NewTextFormatter(
    yt_transcript_formatters.WithTimestamps(true),
)

// Use formatter with client
client := yt_transcript.NewClient(
    yt_transcript.WithFormatter(jsonFormatter),
)
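The constructor calls above look like Go's functional options pattern, where each With... function returns a small closure that adjusts a configuration struct. For readers unfamiliar with it, here is a self-contained sketch of how a formatter with such options could be structured. Every type and function below is a hypothetical re-creation written for illustration, including the assumed default of timestamps being on. It is not the library's source.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Line is a hypothetical transcript line; the field types are assumed.
type Line struct {
	Start float64
	Text  string
}

// config holds the formatter settings that the options modify.
type config struct {
	prettyPrint bool
	timestamps  bool
}

// Option is a function that adjusts a config (the functional options pattern).
type Option func(*config)

func WithPrettyPrint(on bool) Option { return func(c *config) { c.prettyPrint = on } }
func WithTimestamps(on bool) Option  { return func(c *config) { c.timestamps = on } }

// JSONFormatter is an illustrative formatter, not the library's type.
type JSONFormatter struct{ cfg config }

// NewJSONFormatter applies the options on top of the assumed defaults.
func NewJSONFormatter(opts ...Option) *JSONFormatter {
	cfg := config{timestamps: true} // assumed default
	for _, opt := range opts {
		opt(&cfg)
	}
	return &JSONFormatter{cfg: cfg}
}

// Format encodes lines as a JSON array, leaving out "start" when
// timestamps are disabled.
func (f *JSONFormatter) Format(lines []Line) (string, error) {
	items := make([]map[string]any, 0, len(lines))
	for _, l := range lines {
		item := map[string]any{"text": l.Text}
		if f.cfg.timestamps {
			item["start"] = l.Start
		}
		items = append(items, item)
	}
	var (
		out []byte
		err error
	)
	if f.cfg.prettyPrint {
		out, err = json.MarshalIndent(items, "", "  ")
	} else {
		out, err = json.Marshal(items)
	}
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	lines := []Line{{Start: 1.5, Text: "hi"}}
	s, err := NewJSONFormatter(WithTimestamps(false)).Format(lines)
	if err != nil {
		panic(err)
	}
	fmt.Println(s)
}
```

The appeal of this design is that new options can be added later without changing the constructor's signature, and callers only mention the settings they want to override.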

TODO

  • Consolidate error handling
  • Custom formatters
  • Add more tests
  • Add (optional) logging

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
