speedruncom-scraper

v0.0.0-...-3714c51 Latest
Published: Jan 25, 2025 License: GPL-3.0

README

Speedrun.com API Scraper

A series of executables to collect all data available from speedrun.com. A version of the collected data has been published here!

⬇️ Installation

Clone the repository and compile the binaries with the following commands:

$ git clone git@github.com:alexmerren/speedruncom-scraper.git
...
$ cd speedruncom-scraper
$ make all
...

This project requires:

🚀 Usage

The compiled binaries can be executed to scrape data from the speedrun.com API. The following command retrieves data for all runs, all leaderboards, all games, and all users (who have contributed to leaderboards) on speedrun.com:

./dist/games-list && ./dist/games-data && ./dist/leaderboards-data && ./dist/users-list && ./dist/users-data && ./dist/runs-data

Alternatively, there is a Makefile target to run all executables in order:

make run
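
The `&&` chain above stops at the first executable that exits with an error. For anyone driving the scrape from Go rather than from a shell, a minimal sketch of the same behaviour might look like the following. The `runInOrder` and `runBinary` helpers are hypothetical names introduced only for illustration and are not part of this repository; the binary paths are the ones listed above.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// scrapeOrder lists the compiled binaries in the order used by the command above.
var scrapeOrder = []string{
	"./dist/games-list",
	"./dist/games-data",
	"./dist/leaderboards-data",
	"./dist/users-list",
	"./dist/users-data",
	"./dist/runs-data",
}

// runInOrder calls run for each path and stops at the first error,
// mirroring how `a && b && c` behaves in a shell.
func runInOrder(paths []string, run func(path string) error) error {
	for _, p := range paths {
		if err := run(p); err != nil {
			return fmt.Errorf("%s failed: %w", p, err)
		}
	}
	return nil
}

// runBinary executes one binary and streams its output to the terminal.
func runBinary(path string) error {
	cmd := exec.Command(path)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	if err := runInOrder(scrapeOrder, runBinary); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Passing the runner in as a function keeps the ordering logic separate from process execution, so the stop-on-first-failure behaviour can be checked without the binaries being present.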

NOTE: During the scraping process there may be repeated API calls. A local HTTP cache has been implemented to handle repeated API calls locally instead of via the rate-limited API. This cache is saved as httpcache.db.
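
To illustrate the idea behind the cache, here is a minimal sketch of an `http.RoundTripper` that answers repeated GET requests from memory instead of contacting the upstream server again. It is an assumption-laden stand-in, not the project's implementation: the real cache is persisted to httpcache.db, whereas this sketch keeps entries in a map, and the names `cachingTransport`, `cachedResponse`, and `countUpstreamHits` are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
	"sync/atomic"
)

// cachedResponse holds the parts of a response needed to replay it later.
type cachedResponse struct {
	status int
	header http.Header
	body   []byte
}

// toResponse builds a fresh *http.Response so each caller gets its own body reader.
func (c cachedResponse) toResponse(req *http.Request) *http.Response {
	return &http.Response{
		Status:        fmt.Sprintf("%d %s", c.status, http.StatusText(c.status)),
		StatusCode:    c.status,
		Proto:         "HTTP/1.1",
		ProtoMajor:    1,
		ProtoMinor:    1,
		Header:        c.header.Clone(),
		Body:          io.NopCloser(bytes.NewReader(c.body)),
		ContentLength: int64(len(c.body)),
		Request:       req,
	}
}

// cachingTransport is a hypothetical in-memory cache keyed by request URL.
type cachingTransport struct {
	next  http.RoundTripper
	mu    sync.Mutex
	store map[string]cachedResponse
}

func newCachingTransport(next http.RoundTripper) *cachingTransport {
	return &cachingTransport{next: next, store: make(map[string]cachedResponse)}
}

// RoundTrip serves repeated GETs from the cache and forwards everything else.
func (t *cachingTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	if req.Method != http.MethodGet {
		return t.next.RoundTrip(req)
	}
	key := req.URL.String()
	t.mu.Lock()
	entry, ok := t.store[key]
	t.mu.Unlock()
	if ok {
		return entry.toResponse(req), nil
	}
	resp, err := t.next.RoundTrip(req)
	if err != nil {
		return nil, err
	}
	body, err := io.ReadAll(resp.Body)
	resp.Body.Close()
	if err != nil {
		return nil, err
	}
	entry = cachedResponse{status: resp.StatusCode, header: resp.Header.Clone(), body: body}
	if resp.StatusCode == http.StatusOK { // only successful responses are cached
		t.mu.Lock()
		t.store[key] = entry
		t.mu.Unlock()
	}
	return entry.toResponse(req), nil
}

// countUpstreamHits sends the same GET several times to a local test server
// and reports how many of those requests actually reached it.
func countUpstreamHits(requests int) (int, error) {
	var hits int32
	server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		atomic.AddInt32(&hits, 1)
		fmt.Fprint(w, `{"data": []}`)
	}))
	defer server.Close()

	client := &http.Client{Transport: newCachingTransport(http.DefaultTransport)}
	for i := 0; i < requests; i++ {
		resp, err := client.Get(server.URL + "/games")
		if err != nil {
			return 0, err
		}
		_, err = io.Copy(io.Discard, resp.Body)
		resp.Body.Close()
		if err != nil {
			return 0, err
		}
	}
	return int(atomic.LoadInt32(&hits)), nil
}

func main() {
	hits, err := countUpstreamHits(3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("upstream hits:", hits)
}
```

The sketch runs against a local test server rather than speedrun.com, so it can be tried without touching the rate-limited API. Three identical requests should reach the server only once.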

🏃 Executables

| Path | Description | Pre-requisite(s) |
| --- | --- | --- |
| ./dist/games-list | Retrieve all game IDs and other data (e.g. the total number of runs for a game) for verification in other executables. Also retrieve miscellaneous data such as platforms, developers, genres, etc. | None |
| ./dist/games-data | Retrieve data on categories, levels, variables, values, etc. for all game IDs retrieved in games-list. | ./dist/games-list |
| ./dist/leaderboards-data | Retrieve leaderboard data for all games retrieved in games-list. Note: this can fail for games with a high number of runs; use supplementary-leaderboard-data in this case. | ./dist/games-data |
| ./dist/supplementary-leaderboard-data | Retrieve leaderboard data for all category/level/variable/value combinations of a game. This executable is tailored to games with an extremely high number of runs, e.g. Subway Surfers. It will be extremely inefficient for games with a high count of unique category/level/variable/value combinations. | None |
| ./dist/users-list | Compile a list of all unique users found on all leaderboards of all games, including both submitters and verifiers. | ./dist/leaderboards-data |
| ./dist/users-data | Retrieve non-PII data for all unique users compiled in users-list. | ./dist/users-list |
| ./dist/runs-data | Retrieve all runs for all unique users compiled in users-list. This should be all runs on speedrun.com! | ./dist/users-list |
| ./dist/world-record-data | Retrieve world record data for all valid category/level/variable/value combinations of a game. This is experimental, and applies a delay of 1s to every request to ensure the V2 API is not rate limited externally. | None |
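
The pre-requisites listed above form a small dependency graph. As an illustration only, the sketch below encodes that column in a Go map and derives a valid execution order with a depth-first topological sort. The `prerequisites` map and the `executionOrder` function are hypothetical and not part of the repository; entries marked "None" above are modelled as having no dependencies.

```go
package main

import (
	"fmt"
	"sort"
)

// prerequisites maps each executable to the executables listed as its
// pre-requisite(s). A nil slice stands for "None".
var prerequisites = map[string][]string{
	"games-list":                     nil,
	"games-data":                     {"games-list"},
	"leaderboards-data":              {"games-data"},
	"supplementary-leaderboard-data": nil,
	"users-list":                     {"leaderboards-data"},
	"users-data":                     {"users-list"},
	"runs-data":                      {"users-list"},
	"world-record-data":              nil,
}

const (
	unvisited = iota
	inProgress
	done
)

// executionOrder returns an order in which every executable appears after
// all of its prerequisites. Names are visited alphabetically so the result
// is deterministic.
func executionOrder(deps map[string][]string) ([]string, error) {
	names := make([]string, 0, len(deps))
	for name := range deps {
		names = append(names, name)
	}
	sort.Strings(names)

	state := make(map[string]int, len(deps))
	order := make([]string, 0, len(deps))

	var visit func(name string) error
	visit = func(name string) error {
		switch state[name] {
		case done:
			return nil
		case inProgress:
			return fmt.Errorf("prerequisite cycle at %q", name)
		}
		if _, ok := deps[name]; !ok {
			return fmt.Errorf("unknown executable %q", name)
		}
		state[name] = inProgress
		for _, dep := range deps[name] {
			if err := visit(dep); err != nil {
				return err
			}
		}
		state[name] = done
		order = append(order, name)
		return nil
	}

	for _, name := range names {
		if err := visit(name); err != nil {
			return nil, err
		}
	}
	return order, nil
}

func main() {
	order, err := executionOrder(prerequisites)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, name := range order {
		fmt.Printf("%d. ./dist/%s\n", i+1, name)
	}
}
```

Any order produced this way runs games-list before games-data, games-data before leaderboards-data, and users-list before both users-data and runs-data, which matches the sequence used in the Usage section.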

📝 Documentation

All documentation can be found in the docs directory.

💭 Feedback and Contribution

Any improvements or requests can be raised via GitHub Issues. Any development conversations can be found on GitHub Discussions.
