Version: v0.0.0-...-d3f8fd9

Published: Jun 4, 2018 License: MIT Imports: 16 Imported by: 9




About

archiveorg is a golang package for archiving web pages.

Please be mindful and responsible and go easy on them; we want the service to last forever!

Created by Jay Taylor.

Also see: golang package

  • Finish migrating to API
  • Consider unifying to single binary
  • Add id_, js_, cs_, etc info to golang pkg.

Related resources:

  • Go version 1.9 or newer
go get
Command-line programs <url>

Archive a fresh new copy of an HTML page <url>

Search for existing page snapshots

Go package interfaces
Archive a Page

package main

import (
	"fmt"

	// The import for this package (archiveorg) also belongs here; its path is elided in the source.
)

var captureURL = ""

func main() {
	archiveURL, err := archiveorg.Capture(captureURL, archiveorg.DefaultRequestTimeout)
	if err != nil {
		panic(err)
	}
	fmt.Printf("Successfully archived %v via %v\n", captureURL, archiveURL)
}

// Output:
// Successfully archived via

Search for Existing Snapshots

package main

import (
    "fmt"

    // The import for this package (archiveorg) also belongs here; its path is elided in the source.
)

func main() {
    u := ""

    hits, err := archiveorg.Search(u, archiveorg.DefaultRequestTimeout)
    if err != nil {
        panic(fmt.Errorf("Search error: %s", err))
    }
    fmt.Printf("num: %v\n", len(hits))
    for _, hit := range hits {
        fmt.Printf("hit: %+v\n", hit)
    }
}

// Output:
// num: 3
// hit: {URL: Reason:webwidecrawlhackernews00000hackernews StatusCode:301 Timestamp:2016-03-04 01:26:38 +0000 UTC}
// hit: {URL: Reason:alexacrawls StatusCode:200 Timestamp:2012-02-02 23:31:58 +0000 UTC}
// hit: {URL: Reason:alexacrawls StatusCode:200 Timestamp:2012-02-02 20:12:33 +0000 UTC}

Running the test suite
go test ./...

Permissive MIT license, see the LICENSE file for more information.






var (
	BaseURL               = ""               // Overrideable default package value.
	HTTPHost              = ""               // Overrideable default package value.
	UserAgent             = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.162 Safari/537.36" // Overrideable default package value.
	DefaultRequestTimeout = 10 * time.Second // Overrideable default package value.
	MaxTries              = 10               // Max number of download retries before giving up.
)

var (
	MementoParseErr = errors.New("malformed input: memento parse failed")
)

var NoContentLocationErr = errors.New("missing 'content-lcation' header") // Returned when the response is malformed.


func Capture

func Capture(url string, timeout ...time.Duration) (string, error)

Capture requests a fresh capture of url and returns the URL of the archived copy.
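
The variadic timeout parameter means callers may omit the timeout altogether. How the package resolves the value internally is not shown on this page; the following is only a minimal sketch of one common pattern. The helper resolveTimeout and the variable defaultRequestTimeout are hypothetical names, with the default mirroring the documented 10-second DefaultRequestTimeout.

```go
package main

import (
	"fmt"
	"time"
)

// defaultRequestTimeout mirrors the documented package default of 10 seconds.
// It is a local stand-in, not the package's own variable.
var defaultRequestTimeout = 10 * time.Second

// resolveTimeout is a hypothetical helper (not part of the package API).
// It returns the first timeout supplied by the caller, or the default
// when the caller passes none.
func resolveTimeout(timeout ...time.Duration) time.Duration {
	if len(timeout) > 0 {
		return timeout[0]
	}
	return defaultRequestTimeout
}

func main() {
	fmt.Println(resolveTimeout())                // no argument: falls back to the default
	fmt.Println(resolveTimeout(3 * time.Second)) // explicit argument wins
}
```

With this pattern, any extra durations after the first are silently ignored, which is one reason some APIs prefer an options struct instead.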


type Memento

type Memento struct {
	URL  string
	Rel  string
	Type *string    `json:",omitempty"`
	From *time.Time `json:",omitempty"`
	Time *time.Time `json:",omitempty"`
}

func ParseMemento

func ParseMemento(line string) (*Memento, error)

ParseMemento parses a line containing a Memento entry.
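
The exact line format ParseMemento accepts is not documented here. As a rough illustration, the sketch below assumes the link-format lines described by the Memento framework (RFC 7089), for example `<url>; rel="memento"; datetime="..."`. The type entry, the error errMalformed, and the function parseEntry are hypothetical stand-ins, not the package's implementation, and the sample URL is made up.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// entry is a local stand-in for the package's Memento type.
type entry struct {
	URL      string
	Rel      string
	Datetime *time.Time
}

var errMalformed = errors.New("malformed input: entry parse failed")

// parseEntry is a hypothetical parser for one link-format line.
// Limitation: it splits on ";", so a target URL containing ";" is not handled.
func parseEntry(line string) (*entry, error) {
	line = strings.TrimSuffix(strings.TrimSpace(line), ",")
	parts := strings.Split(line, ";")

	// The first part must be the target URL wrapped in angle brackets.
	target := strings.TrimSpace(parts[0])
	if len(target) < 2 || !strings.HasPrefix(target, "<") || !strings.HasSuffix(target, ">") {
		return nil, errMalformed
	}
	e := &entry{URL: target[1 : len(target)-1]}

	// Remaining parts are key="value" attributes.
	for _, p := range parts[1:] {
		p = strings.TrimSpace(p)
		if p == "" {
			continue
		}
		kv := strings.SplitN(p, "=", 2)
		if len(kv) != 2 {
			return nil, errMalformed
		}
		key, val := kv[0], strings.Trim(kv[1], `"`)
		switch key {
		case "rel":
			e.Rel = val
		case "datetime":
			t, err := time.Parse(time.RFC1123, val)
			if err != nil {
				return nil, err
			}
			e.Datetime = &t
		}
	}
	return e, nil
}

func main() {
	e, err := parseEntry(`<http://example.com/>; rel="memento"; datetime="Thu, 02 Feb 2012 20:12:33 GMT",`)
	if err != nil {
		panic(err)
	}
	fmt.Println(e.URL, e.Rel, e.Datetime.UTC())
}
```

Optional attributes are kept as pointers here, matching the way the Memento struct marks Type, From, and Time as optional.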

type Snapshot

type Snapshot struct {
	URL        string
	Reason     string
	StatusCode int
	Timestamp  time.Time
}

Snapshot represents an instance of a URL page snapshot.

func Search

func Search(u string, timeout ...time.Duration) ([]Snapshot, error)

Search for URL snapshots.
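
Search results can mix redirects and successful captures, as the example output earlier on this page shows (one 301 and two 200 hits). A caller will often want the newest successful one. The sketch below shows one way to pick it; the snapshot type is a local copy of the documented Snapshot fields, and latestOK and sampleHits are hypothetical helpers that are not part of the package.

```go
package main

import (
	"fmt"
	"time"
)

// snapshot is a local copy of the documented Snapshot fields.
type snapshot struct {
	URL        string
	Reason     string
	StatusCode int
	Timestamp  time.Time
}

// latestOK is a hypothetical helper: it returns the newest snapshot with
// status code 200, and false when no such snapshot exists.
func latestOK(hits []snapshot) (snapshot, bool) {
	var best snapshot
	found := false
	for _, h := range hits {
		if h.StatusCode != 200 {
			continue
		}
		if !found || h.Timestamp.After(best.Timestamp) {
			best, found = h, true
		}
	}
	return best, found
}

// sampleHits reproduces the status codes and timestamps from the
// example output on this page.
func sampleHits() []snapshot {
	return []snapshot{
		{Reason: "webwidecrawlhackernews00000hackernews", StatusCode: 301, Timestamp: time.Date(2016, 3, 4, 1, 26, 38, 0, time.UTC)},
		{Reason: "alexacrawls", StatusCode: 200, Timestamp: time.Date(2012, 2, 2, 23, 31, 58, 0, time.UTC)},
		{Reason: "alexacrawls", StatusCode: 200, Timestamp: time.Date(2012, 2, 2, 20, 12, 33, 0, time.UTC)},
	}
}

func main() {
	if s, ok := latestOK(sampleHits()); ok {
		fmt.Println(s.Timestamp) // 2012-02-02 23:31:58 +0000 UTC
	}
}
```

The helper does not rely on the order of the slice, since the ordering of Search results is not stated in this documentation.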

type TimeMap

type TimeMap struct {
	Original *Memento
	Self     *Memento
	TimeGate *Memento
	Mementos []Memento
}

func NewTimeMap

func NewTimeMap() *TimeMap

func ParseTimeMap

func ParseTimeMap(r io.Reader) (*TimeMap, error)

ParseTimeMap takes a reader and parses it as a complete TimeMap.
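
To make the shape of a TimeMap more concrete, here is a minimal sketch of reading link-format entries from an io.Reader and sorting them into the same four buckets the TimeMap struct exposes. It assumes one entry per line and the rel values used by the Memento framework (RFC 7089): "original", "self", "timegate", and values containing "memento". The names timeMap, relOf, parseLines, and parseText are hypothetical, the sample hosts are made up, and the package's real parser may work differently.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// timeMap is a local stand-in that keeps raw lines instead of parsed Mementos.
type timeMap struct {
	Original string
	Self     string
	TimeGate string
	Mementos []string
}

// relOf extracts the rel="..." value from a line, or "" when it is absent.
func relOf(line string) string {
	const marker = `rel="`
	i := strings.Index(line, marker)
	if i < 0 {
		return ""
	}
	rest := line[i+len(marker):]
	j := strings.Index(rest, `"`)
	if j < 0 {
		return ""
	}
	return rest[:j]
}

// parseLines scans r line by line and files each entry by its rel value.
// Lines with an unrecognized or missing rel are skipped.
func parseLines(r io.Reader) (*timeMap, error) {
	tm := &timeMap{}
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		switch rel := relOf(line); {
		case rel == "original":
			tm.Original = line
		case rel == "self":
			tm.Self = line
		case rel == "timegate":
			tm.TimeGate = line
		case strings.Contains(rel, "memento"):
			tm.Mementos = append(tm.Mementos, line)
		}
	}
	return tm, sc.Err()
}

// parseText is a convenience wrapper for in-memory input.
func parseText(text string) (*timeMap, error) {
	return parseLines(strings.NewReader(text))
}

// sample is made-up input using reserved example hostnames.
const sample = `<http://example.com/>; rel="original",
<http://archive.example/timemap/http://example.com/>; rel="self"; type="application/link-format",
<http://archive.example/timegate/http://example.com/>; rel="timegate",
<http://archive.example/20120202201233/http://example.com/>; rel="first memento"; datetime="Thu, 02 Feb 2012 20:12:33 GMT",
<http://archive.example/20160304012638/http://example.com/>; rel="last memento"; datetime="Fri, 04 Mar 2016 01:26:38 GMT"`

func main() {
	tm, err := parseText(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println("mementos:", len(tm.Mementos))
}
```

Note that bufio.Scanner has a default per-line limit of 64 KiB; very long lines would need a larger buffer set with Scanner.Buffer.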

func TimeMapFor

func TimeMapFor(url string, timeout ...time.Duration) (*TimeMap, error)


Path Synopsis
cmd module
