slurp

package
v0.0.0-...-23e6414
Published: Aug 20, 2022 License: AGPL-3.0 Imports: 8 Imported by: 2

README

This is a client for slurping articles via the scrapeomat slurp API.


Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type ArtStream

type ArtStream struct {

	// NextSinceID is set to a non-zero value when the stream ends
	// if there are more articles to grab.
	NextSinceID int
	// contains filtered or unexported fields
}

func (*ArtStream) Close

func (as *ArtStream) Close()

func (*ArtStream) Next

func (as *ArtStream) Next() (*Article, error)

Next returns the next article in the stream. It returns io.EOF at the end of the stream.

type Article

type Article struct {
	ID           int    `json:"id,omitempty"`
	CanonicalURL string `json:"canonical_url"`

	// all known URLs for article (including canonical)
	// TODO: first url should be considered "preferred" if no canonical?
	URLs []string `json:"urls"`

	Headline string   `json:"headline"`
	Authors  []Author `json:"authors,omitempty"`

	// Content contains HTML, sanitised using a subset of tags
	Content string `json:"content"`

	// Published contains date of publication.
	// An ISO8601 string is used instead of time.Time, so that
	// less-precise representations can be held (eg YYYY-MM)
	Published   string      `json:"published,omitempty"`
	Updated     string      `json:"updated,omitempty"`
	Publication Publication `json:"publication,omitempty"`
	// Keywords contains data from rel-tags, meta keywords etc...
	Keywords []Keyword `json:"keywords,omitempty"`
	Section  string    `json:"section,omitempty"`
	Tags     []string  `json:"tags,omitempty"`

	// extra fields from twitcooker
	Extra struct {
		RetweetCount  int `json:"retweet_count,omitempty"`
		FavoriteCount int `json:"favorite_count,omitempty"`
		// resolved links
		Links []string `json:"links,omitempty"`
	} `json:"extra,omitempty"`
}

Article is the wire format for article data.

type Author

type Author struct {
	Name    string `json:"name"`
	RelLink string `json:"rel_link,omitempty"`
	Email   string `json:"email,omitempty"`
	Twitter string `json:"twitter,omitempty"`
}

type CookedSummary

type CookedSummary struct {
	PubCodes []string
	Days     []string
	// An array of array of counts
	// access as: Data[pubcodeindex][dayindex]
	Data [][]int
	Max  int
}

func CookSummary

func CookSummary(raw RawSummary) *CookedSummary

CookSummary cooks raw article counts into a CookedSummary, filling in any missing days.
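A self-contained sketch of what the cooking step plausibly involves, with local re-declarations of the types (the `cookSummary` body here is an assumption about the behaviour, not the package's actual implementation): collect pubcodes, build a contiguous day range so gaps become zero counts, and lay the counts out as `Data[pubcodeindex][dayindex]`.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Local re-declarations for illustration; the real types live in the slurp package.
type RawSummary map[string]map[string]int

type CookedSummary struct {
	PubCodes []string
	Days     []string
	Data     [][]int // access as Data[pubcodeindex][dayindex]
	Max      int
}

// cookSummary fills in missing days and grids the counts.
func cookSummary(raw RawSummary) *CookedSummary {
	cooked := &CookedSummary{}
	minDay, maxDay := "", ""
	for pub, days := range raw {
		cooked.PubCodes = append(cooked.PubCodes, pub)
		for day := range days {
			if minDay == "" || day < minDay {
				minDay = day
			}
			if day > maxDay {
				maxDay = day
			}
		}
	}
	sort.Strings(cooked.PubCodes)
	if minDay != "" {
		start, _ := time.Parse("2006-01-02", minDay)
		end, _ := time.Parse("2006-01-02", maxDay)
		// Contiguous day range: days absent from raw get a zero count.
		for d := start; !d.After(end); d = d.AddDate(0, 0, 1) {
			cooked.Days = append(cooked.Days, d.Format("2006-01-02"))
		}
	}
	for _, pub := range cooked.PubCodes {
		row := make([]int, len(cooked.Days))
		for i, day := range cooked.Days {
			row[i] = raw[pub][day] // missing entries yield 0
			if row[i] > cooked.Max {
				cooked.Max = row[i]
			}
		}
		cooked.Data = append(cooked.Data, row)
	}
	return cooked
}

func main() {
	raw := RawSummary{
		"mirror": {"2022-08-01": 3, "2022-08-03": 5},
		"sun":    {"2022-08-02": 2},
	}
	c := cookSummary(raw)
	fmt.Println(c.Days)    // [2022-08-01 2022-08-02 2022-08-03]
	fmt.Println(c.Data[0]) // mirror's row: [3 0 5]
	fmt.Println(c.Max)     // 5
}
```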

type Filter

type Filter struct {
	// date ranges are [from,to)
	PubFrom time.Time
	PubTo   time.Time
	//	AddedFrom time.Time
	//	AddedTo   time.Time
	PubCodes []string
	SinceID  int
	Count    int
}

type Keyword

type Keyword struct {
	Name string `json:"name"`
	URL  string `json:"url,omitempty"`
}

type Msg

type Msg struct {
	Article *Article `json:"article,omitempty"`
	Error   string   `json:"error,omitempty"`
	Next    struct {
		SinceID int `json:"since_id,omitempty"`
	} `json:"next,omitempty"`
}

Msg is a single message; it can hold an article or an error message.

type Publication

type Publication struct {
	// Code is a short, unique name (eg "mirror")
	Code string `json:"code"`
	// Name is the 'pretty' name (eg "The Daily Mirror")
	Name   string `json:"name,omitempty"`
	Domain string `json:"domain,omitempty"`
}

type RawSummary

type RawSummary map[string]map[string]int

RawSummary is a map of maps: pubcode -> day -> count.

type Slurper

type Slurper struct {
	Client *http.Client
	// eg "http://localhost:12345/ukarticles"
	Location string
}

Slurper is a client for talking to a slurp server.

func NewSlurper

func NewSlurper(location string) *Slurper

func (*Slurper) FetchCount

func (s *Slurper) FetchCount(filt *Filter) (int, error)

FetchCount returns the number of articles on the server matching the filter.

func (*Slurper) Slurp

func (s *Slurper) Slurp(filt *Filter) (chan Msg, chan struct{})

Deprecated: use Slurp2 instead.

Slurp downloads a set of articles from the server, returning a channel that streams out messages. Errors are returned via Msg; in the case of network errors, Slurp may synthesise fake Msgs containing the error message. Slurp will issue repeated requests until all results have been returned. Note that the filter's Count param is not a total: it is the maximum number of articles to return per request.

func (*Slurper) Slurp2

func (s *Slurper) Slurp2(filt *Filter) *ArtStream
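Slurp2 pairs with ArtStream.NextSinceID for pagination: drain a stream, and if NextSinceID is non-zero, feed it back into Filter.SinceID and request again. A self-contained sketch with local stand-ins for the package's types (the stub Slurp2 fakes a five-article server paging in memory; the real client performs HTTP requests against Location):

```go
package main

import (
	"fmt"
	"io"
	"time"
)

// Local stand-ins for illustration; the real types live in the slurp package.
type Article struct {
	ID       int
	Headline string
}

type Filter struct {
	PubFrom, PubTo time.Time // date range is [from,to)
	PubCodes       []string
	SinceID        int
	Count          int // max articles per request, not a total
}

type ArtStream struct {
	arts        []*Article
	pos         int
	NextSinceID int // non-zero when more articles remain
}

func (as *ArtStream) Next() (*Article, error) {
	if as.pos >= len(as.arts) {
		return nil, io.EOF
	}
	a := as.arts[as.pos]
	as.pos++
	return a, nil
}
func (as *ArtStream) Close() {}

type Slurper struct{ Location string }

// Slurp2 here fakes a server holding five articles, serving filt.Count
// per call; this in-memory paging is purely illustrative.
func (s *Slurper) Slurp2(filt *Filter) *ArtStream {
	all := []*Article{{1, "a"}, {2, "b"}, {3, "c"}, {4, "d"}, {5, "e"}}
	var page []*Article
	for _, a := range all {
		if a.ID > filt.SinceID && len(page) < filt.Count {
			page = append(page, a)
		}
	}
	st := &ArtStream{arts: page}
	if len(page) == filt.Count && page[len(page)-1].ID < 5 {
		st.NextSinceID = page[len(page)-1].ID
	}
	return st
}

func main() {
	s := &Slurper{Location: "http://localhost:12345/ukarticles"}
	filt := &Filter{Count: 2}
	total := 0
	for {
		stream := s.Slurp2(filt)
		for {
			if _, err := stream.Next(); err == io.EOF {
				break
			}
			total++
		}
		stream.Close()
		if stream.NextSinceID == 0 {
			break // no more pages
		}
		filt.SinceID = stream.NextSinceID // resume where we left off
	}
	fmt.Println("fetched", total, "articles")
}
```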

func (*Slurper) Summary

func (s *Slurper) Summary(filt *Filter) (RawSummary, error)

Summary returns a map of maps: pubcode -> day -> count.
