httpdisk

package module
v0.0.0-...-79a5c55 Latest
Warning

This package is not in the latest version of its module.

Published: Feb 22, 2021 License: MIT Imports: 14 Imported by: 0

README

Overview

httpdisk caches HTTP responses on disk. Several similar libraries already exist (see below), but this one is a bit different: its priority is to always cache on disk. It is not RFC compliant, and it caches GET, POST and every other method. httpdisk is useful for crawling projects that want to aggressively avoid repeat HTTP requests.

Usage

Just plug httpdisk into an http.Client:

hd := httpdisk.NewHTTPDisk(httpdisk.Options{})
client := http.Client{Transport: hd}
resp, err := client.Get("http://google.com")
...

Responses are cached in the httpdisk directory by default. The cache key is the md5 sum of the HTTP method, the normalized URL, and the request body. The path will be of the form httpdisk/98/fa/1f08556382802ef7e26852c527c2. Responses never expire and are never deleted by httpdisk, so the cache grows without bound until entries are deleted manually.

Note that HTTP headers are NOT used to calculate the cache key. This can be unintuitive for crawling projects that involve cookies or session state.

Also See

Here are some other excellent caching libraries that you might want to check out. These generally act like traditional HTTP caches:

Documentation

Types

type Cache

type Cache struct {
	// Directory where the cache is stored. Defaults to httpdisk.
	Dir string
	// If true, don't include the request hostname in the path for each element.
	NoHosts bool
}

Cache will cache http.Responses on disk, using the http.Request to calculate a key. It deals with keys and files, not the network.

func (*Cache) Canonical

func (c *Cache) Canonical(req *http.Request) string

Canonical calculates a signature for the request based on the http method, the normalized URL, and the request body if present. The signature can be quite long since it contains the request body.

func (*Cache) Get

func (c *Cache) Get(req *http.Request) ([]byte, error)

Get returns the cached data for a request. An empty byte slice is returned if the entry doesn't exist or can't be read for any reason.

func (*Cache) Key

func (c *Cache) Key(req *http.Request) string

Key returns the md5 sum for this request.

func (*Cache) Path

func (c *Cache) Path(req *http.Request) string

Path returns the full path on disk for this request.

func (*Cache) RemoveAll

func (c *Cache) RemoveAll() error

RemoveAll deletes the entire cache from disk.

func (*Cache) Set

func (c *Cache) Set(req *http.Request, data []byte) error

Set cached data for a request.

type HTTPDisk

type HTTPDisk struct {
	// Underlying Cache.
	Cache Cache
	// if nil, http.DefaultTransport is used.
	Transport http.RoundTripper
}

HTTPDisk is a caching http transport.

func NewHTTPDisk

func NewHTTPDisk(options Options) *HTTPDisk

NewHTTPDisk constructs a new HTTPDisk.

func (*HTTPDisk) RoundTrip

func (hd *HTTPDisk) RoundTrip(req *http.Request) (*http.Response, error)

RoundTrip is the entry point used by http.Client.
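For readers unfamiliar with the http.RoundTripper pattern, the sketch below shows how a caching transport can sit between http.Client and the network. It is a simplified stand-in, not httpdisk's RoundTrip: the memoTransport type and fetchTwice helper are hypothetical, responses are kept in an in-memory map rather than on disk, and the key ignores the request body.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/http/httputil"
	"sync"
	"sync/atomic"
)

// memoTransport is a hypothetical caching http.RoundTripper that stores
// serialized responses in memory.
type memoTransport struct {
	mu    sync.Mutex
	store map[string][]byte
	next  http.RoundTripper
}

func (t *memoTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	key := req.Method + " " + req.URL.String() // body ignored in this sketch
	t.mu.Lock()
	raw, ok := t.store[key]
	t.mu.Unlock()
	if !ok {
		resp, err := t.next.RoundTrip(req)
		if err != nil {
			return nil, err
		}
		// DumpResponse serializes the status line, headers and body.
		raw, err = httputil.DumpResponse(resp, true)
		resp.Body.Close()
		if err != nil {
			return nil, err
		}
		t.mu.Lock()
		t.store[key] = raw
		t.mu.Unlock()
	}
	// Rebuild a fresh response from the stored bytes on every call.
	return http.ReadResponse(bufio.NewReader(bytes.NewReader(raw)), req)
}

// fetchTwice requests one URL twice through the caching transport and
// reports how many requests reached the test server.
func fetchTwice() (int, string, error) {
	var hits int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		atomic.AddInt32(&hits, 1)
		fmt.Fprint(w, "hello")
	}))
	defer srv.Close()

	client := http.Client{Transport: &memoTransport{store: map[string][]byte{}, next: http.DefaultTransport}}
	body := ""
	for i := 0; i < 2; i++ {
		resp, err := client.Get(srv.URL)
		if err != nil {
			return 0, "", err
		}
		b, err := io.ReadAll(resp.Body)
		resp.Body.Close()
		if err != nil {
			return 0, "", err
		}
		body = string(b)
	}
	return int(atomic.LoadInt32(&hits)), body, nil
}

func main() {
	hits, body, err := fetchTwice()
	if err != nil {
		panic(err)
	}
	fmt.Println(hits, body) // 1 hello: the second Get never reaches the server
}
```

Serializing the whole response with httputil.DumpResponse and reading it back with http.ReadResponse is one convenient way to round-trip status, headers and body through a byte store; whether httpdisk uses the same encoding is not stated here.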

type Options

type Options struct {
	// Directory where the cache is stored. Defaults to httpdisk.
	Dir string
	// If true, don't include the request hostname in the path for each element.
	NoHosts bool
}

Options for creating a new HTTPDisk.
