tanuki

v0.1.6
Published: Oct 29, 2023 License: MPL-2.0 Imports: 8 Imported by: 0

README

🦝 Tanuki

Tanuki is a Go library for parsing anime video filenames.

It is a fork of Anitogo, which is itself based on Anitomy and Anitopy.

Changes

Tanuki handles more filename patterns than Anitogo while avoiding regressions. The comparisons below show the same filename parsed by each library.

Anitogo
{
  "file_name": "Byousoku 5 Centimeter [Blu-Ray][1920x1080 H.264][2.0ch AAC][SOFTSUBS]",
  "anime_title": "Byousoku",
  "episode_number": ["5"],
  "episode_title": "Centimeter",
  ...
}
🦝 Tanuki
{
  "file_name": "Byousoku 5 Centimeter [Blu-Ray][1920x1080 H.264][2.0ch AAC][SOFTSUBS]",
  "anime_title": "Byousoku 5 Centimeter",
  ...
}

Anitogo
{
  "file_name": "S01E05 - Episode title.mkv",
  "anime_title": "S01E05 - Episode title",
  ...
}
🦝 Tanuki
{
  "file_name": "S01E05 - Episode title.mkv",
  "anime_season": ["01"],
  "episode_number": ["05"],
  "episode_title": "Episode title",
  ...
}

Anitogo
{
  "file_name": "[Judas] Aharen-san wa Hakarenai - S01E06v2.mkv",
  "anime_title": "Aharen-san wa Hakarenai - S01E06v2",
  ...
}
🦝 Tanuki
{
  "file_name": "[Judas] Aharen-san wa Hakarenai - S01E06v2.mkv",
  "anime_title": "Aharen-san wa Hakarenai",
  "anime_season": ["01"],
  "episode_number": ["06"],
  ...
}

  • Added anime_part
  • Better parsing of anime_title
  • Updated keywords
  • Fixed:
    • Incorrect episode detection, e.g., Rozen Maiden 3 is no longer read as episode 3, and Byousoku 5 Centimeter is no longer read as episode 5
    • Incorrect version detection, e.g., S01E01v2, - 05' are parsed correctly now
  • Support for:
    • Higher episode numbers
    • Absence of title, e.g., S01E05 - Episode title.mkv
    • Season ranges, e.g., S1-2, Seasons 1-2, Seasons 1 ~ 2, etc.
    • Enclosed keywords, e.g., Hyouka (2012) [Season 1+OVA] [BD 1080p HEVC OPUS] [Dual-Audio]
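
To make the S01E06v2 case above concrete, here is a minimal standard-library sketch of how a combined season/episode/version token can be split into its parts. It is an illustration only and is not Tanuki's implementation; the pattern seasonEpisode and the helper splitToken are hypothetical names introduced for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// seasonEpisode is an illustrative pattern (an assumption, not Tanuki's
// parser). It matches tokens such as "S01E05" or "S01E06v2".
var seasonEpisode = regexp.MustCompile(`(?i)\bS(\d{1,2})E(\d{1,4})(?:v(\d+))?\b`)

// splitToken is a hypothetical helper. It returns the season, the episode,
// and the release version found in name. version is "" when the token has
// no "vN" suffix, and ok is false when no token is present at all.
func splitToken(name string) (season, episode, version string, ok bool) {
	m := seasonEpisode.FindStringSubmatch(name)
	if m == nil {
		return "", "", "", false
	}
	return m[1], m[2], m[3], true
}

func main() {
	for _, name := range []string{
		"[Judas] Aharen-san wa Hakarenai - S01E06v2.mkv",
		"S01E05 - Episode title.mkv",
		"Byousoku 5 Centimeter",
	} {
		s, e, v, ok := splitToken(name)
		fmt.Printf("%q -> season=%q episode=%q version=%q matched=%v\n", name, s, e, v, ok)
	}
}
```

A single pattern like this covers only the SxxEyy form. The other cases in the list (season ranges, enclosed keywords, bare numbers inside titles) need context that a lone regular expression does not have, which is why a dedicated parser is useful.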

Example

The following filename...

[Trix] Shingeki no Kyojin - S04E29-31 (Part 3) [Multi Subs] (1080p AV1 E-AC3)

...is resolved into:

{
  "file_name": "[Trix] Shingeki no Kyojin - S04E29-31 (Part 3) [Multi Subs] (1080p AV1 E-AC3)",
  "anime_title": "Shingeki no Kyojin",
  "anime_season": ["04"],
  "anime_part": ["3"],
  "episode_number": ["29", "31"],
  "release_group": "Trix",
  "video_resolution": "1080p",
  "video_term": ["AV1"]
}

The following example code:

package main

import (
    "fmt"
    "encoding/json"

    "github.com/5rahim/tanuki"
)

func main() {
    parsed := tanuki.Parse("[Nubles] Space Battleship Yamato 2199 (2012) episode 18 (720p 10 bit AAC)[1F56D642]", tanuki.DefaultOptions)
    jsonParsed, err := json.MarshalIndent(parsed, "", "    ")
    if err != nil {
        // Stop here: jsonParsed is not usable if marshalling failed
        fmt.Println(err)
        return
    }
    fmt.Println(string(jsonParsed) + "\n")

    // Accessing the elements directly
    fmt.Println("Anime Title:", parsed.AnimeTitle)
    fmt.Println("Anime Year:", parsed.AnimeYear)
    fmt.Println("Episode Number:", parsed.EpisodeNumber)
    fmt.Println("Release Group:", parsed.ReleaseGroup)
    fmt.Println("File Checksum:", parsed.FileChecksum)
}

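Will output JSON along these lines (array values are condensed onto one line here), followed by the directly accessed fields: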

{
    "anime_title": "Space Battleship Yamato 2199",
    "anime_year": "2012",
    "audio_term": ["AAC"],
    "episode_number": ["18"],
    "file_checksum": "1F56D642",
    "file_name": "[Nubles] Space Battleship Yamato 2199 (2012) episode 18 (720p 10 bit AAC)[1F56D642]",
    "release_group": "Nubles",
    "video_resolution": "720p"
}

The Parse function returns a pointer to an Elements struct. The full definition of the struct is here:

type Elements struct {
    AnimeSeason         []string  `json:"anime_season,omitempty"`
    AnimeSeasonPrefix   []string  `json:"anime_season_prefix,omitempty"`
    AnimePart           []string  `json:"anime_part,omitempty"`
    AnimePartPrefix     []string  `json:"anime_part_prefix,omitempty"`
    AnimeTitle          string    `json:"anime_title,omitempty"`
    AnimeType           []string  `json:"anime_type,omitempty"`
    AnimeYear           string    `json:"anime_year,omitempty"`
    AudioTerm           []string  `json:"audio_term,omitempty"`
    DeviceCompatibility []string  `json:"device_compatibility,omitempty"`
    EpisodeNumber       []string  `json:"episode_number,omitempty"`
    EpisodeNumberAlt    []string  `json:"episode_number_alt,omitempty"`
    EpisodePrefix       []string  `json:"episode_prefix,omitempty"`
    EpisodeTitle        string    `json:"episode_title,omitempty"`
    FileChecksum        string    `json:"file_checksum,omitempty"`
    FileExtension       string    `json:"file_extension,omitempty"`
    FileName            string    `json:"file_name,omitempty"`
    Language            []string  `json:"language,omitempty"`
    Other               []string  `json:"other,omitempty"`
    ReleaseGroup        string    `json:"release_group,omitempty"`
    ReleaseInformation  []string  `json:"release_information,omitempty"`
    ReleaseVersion      []string  `json:"release_version,omitempty"`
    Source              []string  `json:"source,omitempty"`
    Subtitles           []string  `json:"subtitles,omitempty"`
    VideoResolution     string    `json:"video_resolution,omitempty"`
    VideoTerm           []string  `json:"video_term,omitempty"`
    VolumeNumber        []string  `json:"volume_number,omitempty"`
    VolumePrefix        []string  `json:"volume_prefix,omitempty"`
    Unknown             []string  `json:"unknown,omitempty"`
    checkAltNumber      bool
}
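
Every exported field above carries the omitempty tag, so fields the parser leaves empty are dropped from the JSON. That is why the sample output earlier lists only a handful of keys. The following standard-library sketch shows the effect; miniElements is a hypothetical cut-down copy of three fields, used so the snippet compiles without the dependency.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// miniElements is a hypothetical stand-in for the library's Elements struct.
// It reuses three of the documented field names and JSON tags.
type miniElements struct {
	AnimeTitle    string   `json:"anime_title,omitempty"`
	AnimeYear     string   `json:"anime_year,omitempty"`
	EpisodeNumber []string `json:"episode_number,omitempty"`
}

// encode marshals e to compact JSON and returns "" if marshalling fails.
func encode(e miniElements) string {
	b, err := json.Marshal(e)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	// AnimeYear is left empty, so the "anime_year" key is omitted.
	e := miniElements{
		AnimeTitle:    "Space Battleship Yamato 2199",
		EpisodeNumber: []string{"18"},
	}
	fmt.Println(encode(e))
	// {"anime_title":"Space Battleship Yamato 2199","episode_number":["18"]}
}
```

When consuming the JSON elsewhere, treat a missing key as "not detected" rather than as an error.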

Sample results encoded in JSON can be seen in the tests/data.json file.

Installation

Get the package:

go get -u github.com/5rahim/tanuki

Then, import it in your code:

import "github.com/5rahim/tanuki"

Options

The Parse function receives the filename and an Options struct. The default options are as follows:

var DefaultOptions = Options{
    AllowedDelimiters:  " _.&+,|", // Parse these as delimiters
    IgnoredStrings:     []string{}, // Ignore these when they are in the filename
    ParseEpisodeNumber: true, // Parse the episode number and include it in the elements
    ParseEpisodeTitle:  true, // Parse the episode title and include it in the elements
    ParseFileExtension: true, // Parse the file extension and include it in the elements
    ParseReleaseGroup:  true, // Parse the release group and include it in the elements
}

Documentation

Variables

var DefaultOptions = Options{
	AllowedDelimiters:  " _.&+,|",
	IgnoredStrings:     []string{},
	ParseEpisodeNumber: true,
	ParseEpisodeTitle:  true,
	ParseFileExtension: true,
	ParseReleaseGroup:  true,
}

DefaultOptions is a variable configured with the recommended defaults for the Options struct to be passed to the Parse function.

Custom options can be specified by creating a new Options struct and passing it to the Parse function.

Types

type Elements

type Elements struct {
	// Slice of strings representing the season of anime. "S1-S3" would be represented as []string{"1", "3"}.
	AnimeSeason []string `json:"anime_season,omitempty"`
	AnimePart   []string `json:"anime_part,omitempty"`

	// Represents the strings prefixing the season in the file, e.g. in "SEASON 2", "SEASON" is the AnimeSeasonPrefix.
	AnimeSeasonPrefix []string `json:"anime_season_prefix,omitempty"`
	// Represents the strings prefixing the part in the file, e.g. in "PART 2", "PART" is the AnimePartPrefix.
	AnimePartPrefix []string `json:"anime_part_prefix,omitempty"`

	// Title of the anime, e.g. in "[HorribleSubs] Boku no Hero Academia - 01 [1080p].mkv",
	// "Boku no Hero Academia" is the AnimeTitle.
	AnimeTitle string `json:"anime_title,omitempty"`

	// Slice of strings representing the types specified in the anime file, e.g. ED, OP, Movie, etc.
	AnimeType []string `json:"anime_type,omitempty"`

	// Year the anime was released.
	AnimeYear string `json:"anime_year,omitempty"`

	// Slice of strings representing the audio terms included in the filename, e.g. FLAC, AAC, etc.
	AudioTerm []string `json:"audio_term,omitempty"`

	// Slice of strings representing devices the video is compatible with that are mentioned in the filename.
	DeviceCompatibility []string `json:"device_compatibility,omitempty"`

	// Slice of strings representing the episode numbers. "01-10" would be represented as []string{"1", "10"}.
	EpisodeNumber []string `json:"episode_number,omitempty"`

	// Slice of strings representing the alternative episode number.
	// This is for cases where you may have an episode number relative to the season,
	// but a larger episode number as if it were all one season.
	// e.g. in [Hatsuyuki]_Kuroko_no_Basuke_S3_-_01_(51)_[720p][10bit][619C57A0].mkv
	// 01 would be the EpisodeNumber, and 51 would be the EpisodeNumberAlt.
	EpisodeNumberAlt []string `json:"episode_number_alt,omitempty"`

	// Slice of strings representing the words prefixing the episode number in the file, e.g. in "EPISODE 2", "EPISODE" is the prefix.
	EpisodePrefix []string `json:"episode_prefix,omitempty"`

	// Title of the episode, e.g. in "[BM&T] Toradora! - 07v2 - Pool Opening [720p Hi10 ] [BD] [8F59F2BA]",
	// "Pool Opening" is the EpisodeTitle.
	EpisodeTitle string `json:"episode_title,omitempty"`

	// Checksum of the file, in [BM&T] Toradora! - 07v2 - Pool Opening [720p Hi10 ] [BD] [8F59F2BA],
	// "8F59F2BA" would be the FileChecksum.
	FileChecksum string `json:"file_checksum,omitempty"`

	// File extension, in [HorribleSubs] Boku no Hero Academia - 01 [1080p].mkv,
	// "mkv" would be the FileExtension.
	FileExtension string `json:"file_extension,omitempty"`

	// Full filename that was parsed.
	FileName string `json:"file_name,omitempty"`

	// Languages specified in the file name, e.g. RU, JP, EN, etc.
	Language []string `json:"language,omitempty"`

	// Terms that could not be parsed into other buckets, but were deemed identifiers.
	// In [chibi-Doki] Seikon no Qwaser - 13v0 (Uncensored Director's Cut) [988DB090].mkv,
	// "Uncensored" is parsed into Other.
	Other []string `json:"other,omitempty"`

	// The fan sub group that uploaded the file. In [HorribleSubs] Boku no Hero Academia - 01 [1080p],
	// "HorribleSubs" is the ReleaseGroup.
	ReleaseGroup string `json:"release_group,omitempty"`

	// Information about the release that wasn't a version.
	// In "[SubDESU-H] Swing out Sisters Complete Version (720p x264 8bit AC3) [3ABD57E6].mp4",
	// "Complete" is parsed into ReleaseInformation.
	ReleaseInformation []string `json:"release_information,omitempty"`

	// Slice of strings representing the version of the release.
	// In [FBI] Baby Princess 3D Paradise Love 01v0 [BD][720p-AAC][457CC066].mkv, 0 is parsed into ReleaseVersion.
	ReleaseVersion []string `json:"release_version,omitempty"`

	// Slice of strings representing where the video was ripped from, e.g. BLU-RAY, DVD, etc.
	Source []string `json:"source,omitempty"`

	// Slice of strings representing the type of subtitles included, e.g. HARDSUB, BIG5, etc.
	Subtitles []string `json:"subtitles,omitempty"`

	// Resolution of the video. Can be formatted like 1920x1080, 1080, 1080p, etc depending
	// on how it is represented in the filename.
	VideoResolution string `json:"video_resolution,omitempty"`

	// Slice of strings representing the video terms included in the filename, e.g. h264, x264, etc.
	VideoTerm []string `json:"video_term,omitempty"`

	// Slice of strings representing the volume numbers. "01-10" would be represented as []string{"1", "10"}.
	VolumeNumber []string `json:"volume_number,omitempty"`

	// Slice of strings representing the words prefixing the volume number in the file, e.g. in "VOLUME 2", "VOLUME" is the prefix.
	VolumePrefix []string `json:"volume_prefix,omitempty"`

	// Entries that could not be parsed into any other categories.
	Unknown []string `json:"unknown,omitempty"`
	// contains filtered or unexported fields
}

Elements is a struct representing a parsed anime filename.
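
Because ranges such as "01-10" are stored as their two endpoints, callers that need every episode in between have to expand the slice themselves. The sketch below shows one way to do that with the standard library. expandRange is a hypothetical helper and not part of tanuki; it assumes that a two-element slice in ascending order denotes an inclusive range and that every entry is a plain integer.

```go
package main

import (
	"fmt"
	"strconv"
)

// expandRange is a hypothetical helper, not part of tanuki. It assumes a
// two-element ascending slice such as []string{"29", "31"} is an inclusive
// range and returns every number in it. Any other slice is converted
// element by element without expansion.
func expandRange(nums []string) ([]int, error) {
	vals := make([]int, 0, len(nums))
	for _, s := range nums {
		n, err := strconv.Atoi(s) // leading zeros such as "05" are accepted
		if err != nil {
			return nil, fmt.Errorf("non-integer entry %q: %w", s, err)
		}
		vals = append(vals, n)
	}
	if len(vals) != 2 || vals[0] > vals[1] {
		return vals, nil
	}
	out := make([]int, 0, vals[1]-vals[0]+1)
	for n := vals[0]; n <= vals[1]; n++ {
		out = append(out, n)
	}
	return out, nil
}

func main() {
	eps, err := expandRange([]string{"29", "31"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(eps) // [29 30 31]
}
```

Entries that are not plain integers make the helper return an error instead of guessing, so callers can fall back to the raw strings.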

func Parse

func Parse(filename string, options Options) *Elements

Parse returns a pointer to an Elements struct created by parsing a filename with the specified options.

Parsing behavior can be customized in the passed Options struct.

type Options

type Options struct {
	// DefaultOptions value: " _.&+,|"
	// Each character in this string will be evaluated as a delimiter during parsing.
	// The defaults are fairly sane, but in some cases you may want to change them.
	// For example in the following filename: DRAMAtical Murder Episode 1 - Data_01_Login
	// With the defaults, the "_" characters would be replaced with spaces, but this may
	// not be desired behavior.
	AllowedDelimiters string

	// DefaultOptions value: []string{}
	// These strings will be removed from the filename.
	IgnoredStrings []string

	// DefaultOptions value: true
	// Determines if the episode number will be parsed into the Elements struct.
	ParseEpisodeNumber bool

	// DefaultOptions value: true
	// Determines if the episode title will be parsed into the Elements struct.
	ParseEpisodeTitle bool

	// DefaultOptions value: true
	// Determines if the file extension will be parsed into the Elements struct.
	ParseFileExtension bool

	// DefaultOptions value: true
	// Determines if the release group will be parsed into the Elements struct.
	ParseReleaseGroup bool
}

Options is a struct that allows you to change the parsing behavior.

Default options have been provided under a variable named "DefaultOptions".
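
A common customization is to start from the defaults and change a single field, for example dropping "_" from AllowedDelimiters so that Data_01_Login is left intact. The sketch below illustrates that pattern. To stay runnable without the dependency it declares a local copy of the documented struct and defaults; in real code these would be tanuki.Options and tanuki.DefaultOptions, and the result would be passed to tanuki.Parse. withoutDelimiters is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// Options is a local copy of the documented tanuki.Options fields, declared
// here only so the sketch compiles on its own.
type Options struct {
	AllowedDelimiters  string
	IgnoredStrings     []string
	ParseEpisodeNumber bool
	ParseEpisodeTitle  bool
	ParseFileExtension bool
	ParseReleaseGroup  bool
}

// defaults mirrors the documented DefaultOptions value.
var defaults = Options{
	AllowedDelimiters:  " _.&+,|",
	IgnoredStrings:     []string{},
	ParseEpisodeNumber: true,
	ParseEpisodeTitle:  true,
	ParseFileExtension: true,
	ParseReleaseGroup:  true,
}

// withoutDelimiters is a hypothetical helper. It returns a copy of base with
// every character in chars removed from AllowedDelimiters. base is passed by
// value and its slice field is cloned, so the caller's value is not modified.
func withoutDelimiters(base Options, chars string) Options {
	base.AllowedDelimiters = strings.Map(func(r rune) rune {
		if strings.ContainsRune(chars, r) {
			return -1 // a negative value drops the character
		}
		return r
	}, base.AllowedDelimiters)
	base.IgnoredStrings = append([]string(nil), base.IgnoredStrings...)
	return base
}

func main() {
	opts := withoutDelimiters(defaults, "_")
	fmt.Printf("%q\n", opts.AllowedDelimiters)     // " .&+,|"
	fmt.Printf("%q\n", defaults.AllowedDelimiters) // " _.&+,|"
}
```

Copying the defaults first, instead of building an Options literal from scratch, keeps the four Parse* flags at true. A bare Options{} would leave them at false.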
