Version: v0.0.0-...-7a04128

This package is not in the latest version of its module.

Published: Jun 4, 2018 License: MIT Imports: 15 Imported by: 8




About is a golang package for archiving web pages via

Please be mindful and responsible and go easy on them, we want to last forever!

Created by Jay Taylor.

Also see: golang package

  • Add timeout to .Capture.
  • Consider unifying to single binary

  • Go version 1.9 or newer

go get

Command-line programs

<url>

Archive a fresh new copy of an HTML page

<url>

Search for existing page snapshots

Search query examples:

  • for snapshots from the host
  • * for snapshots from and all its subdomains (e.g.
  • for snapshots from exact url (search is case-sensitive)
  •* for snapshots from urls starting with
Go package interfaces
Capture URL HTML Page Content


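package main

import (
	"fmt"

	// The archiveis package import goes here; its import path is not shown in this listing.
)

var captureURL = ""

func main() {
	// Submit the page for archival and report where the snapshot lives.
	archiveURL, err := archiveis.Capture(captureURL)
	if err != nil {
		panic(err)
	}
	fmt.Printf("Successfully archived %v via %v\n", captureURL, archiveURL)
}

// Output:
// Successfully archived via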
Search for Existing Snapshots


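package main

import (
	"fmt"
	"time"

	// The archiveis package import goes here; its import path is not shown in this listing.
)

var searchURL = ""

func main() {
	// Look up existing snapshots, giving the request up to 10 seconds.
	snapshots, err := archiveis.Search(searchURL, 10*time.Second)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%# v\n", snapshots)
}

// Output: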
Running the test suite
go test ./...

Permissive MIT license, see the LICENSE file for more information.




// Overridable default package values.
var (
	BaseURL               = ""
	HTTPHost              = ""
	UserAgent             = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.162 Safari/537.36"
	DefaultRequestTimeout = 10 * time.Second
	DefaultPollInterval   = 5 * time.Second
)



func Capture

func Capture(u string, cfg ...Config) (string, error)

Capture archives the provided URL using the service.


type Config

type Config struct {
	Anyway         bool          // Force archival even if there is already a recent snapshot of the page.
	Wait           bool          // Wait until the crawl has been completed.
	WaitTimeout    time.Duration // Max time to wait for crawl completion.  Default is unlimited.
	PollInterval   time.Duration // Interval between crawl completion checks.  Defaults to 5s.
	RequestTimeout time.Duration // Overrides default request timeout.
	SubmitID       string        // Accepts a user-provided submitid.
}

Config settings for page capture client behavior.

type Snapshot

type Snapshot struct {
	URL          string
	ThumbnailURL string
	Timestamp    time.Time
}

Snapshot represents an instance of a URL page snapshot on

func Search

func Search(url string, timeout time.Duration) ([]Snapshot, error)

Search for URL snapshots.


Path Synopsis
cmd module
