sites

package
v0.1.2
Published: Jun 22, 2018 License: MIT Imports: 5 Imported by: 1

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func SiteChan

func SiteChan(inp ...Site) (out <-chan Site)

SiteChan returns a channel to receive all inputs before close.
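The producer pattern described above can be sketched as follows. This is a minimal illustration, not the package's code: `siteChan` is a hypothetical local re-implementation, and `Site` is a pared-down stand-in for the package's type.

```go
package main

import "fmt"

// Site is a pared-down local stand-in for the package's Site type (assumption).
type Site struct{ Depth int }

// siteChan is a hypothetical re-implementation, not the package's code:
// it sends every input on an unbuffered channel and then closes it.
func siteChan(inp ...Site) <-chan Site {
	out := make(chan Site)
	go func() {
		defer close(out)
		for _, s := range inp {
			out <- s
		}
	}()
	return out
}

func main() {
	// Ranging over the channel ends once all inputs have been received.
	for s := range siteChan(Site{Depth: 0}, Site{Depth: 1}, Site{Depth: 2}) {
		fmt.Println(s.Depth)
	}
}
```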

func SiteChanFuncErr

func SiteChanFuncErr(gen func() (Site, error)) (out <-chan Site)

SiteChanFuncErr returns a channel to receive all results of generator `gen` until `err != nil` before close.

func SiteChanFuncNok

func SiteChanFuncNok(gen func() (Site, bool)) (out <-chan Site)

SiteChanFuncNok returns a channel to receive all results of generator `gen` until `!ok` before close.
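A generator-driven producer of this kind might look like the sketch below. The names `siteChanFuncNok` and `countTo` are hypothetical, and `Site` is a simplified local stand-in; the package's own implementation may differ.

```go
package main

import "fmt"

// Site is a pared-down local stand-in for the package's Site type (assumption).
type Site struct{ Depth int }

// siteChanFuncNok is a hypothetical sketch: it calls gen repeatedly and
// forwards each result until gen reports !ok, then closes the channel.
func siteChanFuncNok(gen func() (Site, bool)) <-chan Site {
	out := make(chan Site)
	go func() {
		defer close(out)
		for {
			s, ok := gen()
			if !ok {
				return
			}
			out <- s
		}
	}()
	return out
}

// countTo returns an illustrative generator yielding Sites with Depth 0..n-1.
func countTo(n int) func() (Site, bool) {
	i := 0
	return func() (Site, bool) {
		if i >= n {
			return Site{}, false
		}
		s := Site{Depth: i}
		i++
		return s, true
	}
}

func main() {
	for s := range siteChanFuncNok(countTo(3)) {
		fmt.Println(s.Depth)
	}
}
```

The `error`-returning variant would follow the same shape, stopping on `err != nil` instead of `!ok`.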

func SiteChanSlice

func SiteChanSlice(inp ...[]Site) (out <-chan Site)

SiteChanSlice returns a channel to receive all inputs before close.

func SiteDone

func SiteDone(inp <-chan Site) (done <-chan struct{})

SiteDone returns a channel to receive one signal before close after `inp` has been drained.

func SiteDoneFunc

func SiteDoneFunc(inp <-chan Site, act func(a Site)) (done <-chan struct{})

SiteDoneFunc returns a channel to receive one signal after `act` has been applied to every `inp` before close.

func SiteDoneSlice

func SiteDoneSlice(inp <-chan Site) (done <-chan []Site)

SiteDoneSlice returns a channel to receive a slice with every Site received on `inp` before close.

Note: Unlike SiteDone, SiteDoneSlice sends the fully accumulated slice, not just an event, once upon close of inp.
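To illustrate the difference from a plain done signal, here is a minimal sketch of a slice-collecting terminator. `siteDoneSlice` is a hypothetical local function and `Site` a simplified stand-in; neither is taken from the package source.

```go
package main

import "fmt"

// Site is a pared-down local stand-in for the package's Site type (assumption).
type Site struct{ Depth int }

// siteDoneSlice is a hypothetical sketch: it drains inp, accumulates every
// Site, and sends the whole slice exactly once after inp has been closed.
func siteDoneSlice(inp <-chan Site) <-chan []Site {
	done := make(chan []Site)
	go func() {
		defer close(done)
		var all []Site
		for s := range inp {
			all = append(all, s)
		}
		done <- all
	}()
	return done
}

func main() {
	inp := make(chan Site)
	go func() {
		defer close(inp)
		for d := 0; d < 3; d++ {
			inp <- Site{Depth: d}
		}
	}()
	all := <-siteDoneSlice(inp) // blocks until inp is closed and drained
	fmt.Println(len(all))
}
```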

func SiteDoneWait

func SiteDoneWait(inp chan<- Site, wg SiteWaiter) (done <-chan struct{})

SiteDoneWait returns a channel to receive one signal after wg.Wait() has returned and inp has been closed before close.

Note: Use only *after* You've started flooding the facilities.

func SiteFanIn2

func SiteFanIn2(inp1, inp2 <-chan Site) (out <-chan Site)

SiteFanIn2 returns a channel to receive all from both `inp1` and `inp2` before close.
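One common way to merge two channels is a `select` loop that disables each input once it is closed. The sketch below assumes that approach; `siteFanIn2` and `emit` are hypothetical local names, and the package may merge differently.

```go
package main

import "fmt"

// Site is a pared-down local stand-in for the package's Site type (assumption).
type Site struct{ Depth int }

// siteFanIn2 is a hypothetical sketch: it forwards everything from both
// inputs and closes out only after both inputs have been closed.
func siteFanIn2(inp1, inp2 <-chan Site) <-chan Site {
	out := make(chan Site)
	go func() {
		defer close(out)
		for inp1 != nil || inp2 != nil {
			select {
			case s, ok := <-inp1:
				if !ok {
					inp1 = nil // a nil channel is never selected again
					continue
				}
				out <- s
			case s, ok := <-inp2:
				if !ok {
					inp2 = nil
					continue
				}
				out <- s
			}
		}
	}()
	return out
}

// emit is an illustrative helper producing Sites with the given depths.
func emit(depths ...int) <-chan Site {
	out := make(chan Site)
	go func() {
		defer close(out)
		for _, d := range depths {
			out <- Site{Depth: d}
		}
	}()
	return out
}

func main() {
	n := 0
	for range siteFanIn2(emit(1, 2), emit(3, 4, 5)) {
		n++
	}
	fmt.Println(n)
}
```

The relative order of values from the two inputs is not defined; only the total is.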

func SiteFini

func SiteFini() func(inp <-chan Site) (done <-chan struct{})

SiteFini returns a closure around `SiteDone(_)`.

func SiteFiniFunc

func SiteFiniFunc(act func(a Site)) func(inp <-chan Site) (done <-chan struct{})

SiteFiniFunc returns a closure around `SiteDoneFunc(_, act)`.

func SiteFiniSlice

func SiteFiniSlice() func(inp <-chan Site) (done <-chan []Site)

SiteFiniSlice returns a closure around `SiteDoneSlice(_)`.

func SiteFiniWait

func SiteFiniWait(wg SiteWaiter) func(inp chan<- Site) (done <-chan struct{})

SiteFiniWait returns a closure around `SiteDoneWait(_, wg)`.

func SiteFork

func SiteFork(inp <-chan Site) (out1, out2 <-chan Site)

SiteFork returns two channels, each of which is to receive every result of inp before close.
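A simple duplicating fork could be sketched as below. `siteFork` is a hypothetical local function, and the strictly alternating send order is an assumption of this sketch. Because both outputs are unbuffered, both must be drained concurrently.

```go
package main

import "fmt"

// Site is a pared-down local stand-in for the package's Site type (assumption).
type Site struct{ Depth int }

// siteFork is a hypothetical sketch: every value received from inp is sent
// to out1 and then to out2; both are closed once inp is drained.
func siteFork(inp <-chan Site) (<-chan Site, <-chan Site) {
	out1, out2 := make(chan Site), make(chan Site)
	go func() {
		defer close(out1)
		defer close(out2)
		for s := range inp {
			out1 <- s
			out2 <- s
		}
	}()
	return out1, out2
}

func main() {
	inp := make(chan Site)
	go func() {
		defer close(inp)
		for d := 0; d < 3; d++ {
			inp <- Site{Depth: d}
		}
	}()

	a, b := siteFork(inp)
	counted := make(chan int)
	go func() { // drain the second branch concurrently to avoid a deadlock
		n := 0
		for range b {
			n++
		}
		counted <- n
	}()
	na := 0
	for range a {
		na++
	}
	fmt.Println(na, <-counted)
}
```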

func SiteForkSeen

func SiteForkSeen(inp <-chan Site) (new, old <-chan Site)

SiteForkSeen returns two channels, `new` and `old`, where `new` is to receive all `inp` not seen before and `old` all `inp` seen before (internally growing a `sync.Map` to discriminate) until close.

func SiteForkSeenAttr

func SiteForkSeenAttr(inp <-chan Site, attr func(a Site) interface{}) (new, old <-chan Site)

SiteForkSeenAttr returns two channels, `new` and `old`, where `new` is to receive all `inp` whose attribute `attr` has not been seen before and `old` all `inp` seen before (internally growing a `sync.Map` to discriminate) until close.
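The discrimination via `sync.Map` can be sketched with `LoadOrStore`, which reports whether a key was already present. Everything below is illustrative: `siteForkSeenAttr`, `byURL` and `mustParse` are hypothetical local names, and `Site` is a reduced stand-in for the package's type.

```go
package main

import (
	"fmt"
	"net/url"
	"sync"
)

// Site is a reduced local stand-in for the package's Site type (assumption).
type Site struct{ URL *url.URL }

// siteForkSeenAttr is a hypothetical sketch: a Site goes to the first
// channel when attr(s) is new and to the second when it was seen before.
func siteForkSeenAttr(inp <-chan Site, attr func(Site) interface{}) (<-chan Site, <-chan Site) {
	fresh, old := make(chan Site), make(chan Site)
	go func() {
		defer close(fresh)
		defer close(old)
		var seen sync.Map // grows with every distinct attribute
		for s := range inp {
			if _, loaded := seen.LoadOrStore(attr(s), struct{}{}); loaded {
				old <- s
			} else {
				fresh <- s
			}
		}
	}()
	return fresh, old
}

// byURL is an illustrative attribute: the URL rendered as a string.
func byURL(s Site) interface{} { return s.URL.String() }

// mustParse is an illustrative helper that panics on a malformed URL.
func mustParse(raw string) *url.URL {
	u, err := url.Parse(raw)
	if err != nil {
		panic(err)
	}
	return u
}

func main() {
	inp := make(chan Site)
	go func() {
		defer close(inp)
		for _, raw := range []string{"https://example.com/a", "https://example.com/b", "https://example.com/a"} {
			inp <- Site{URL: mustParse(raw)}
		}
	}()

	fresh, old := siteForkSeenAttr(inp, byURL)
	done := make(chan struct{})
	go func() { // both branches must be drained concurrently
		defer close(done)
		for s := range old {
			fmt.Println("seen before:", s.URL)
		}
	}()
	for s := range fresh {
		fmt.Println("new:", s.URL)
	}
	<-done
}
```

Keying on a string attribute rather than on the Site value itself matters here: two distinct `*url.URL` pointers to the same address would otherwise compare as different.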

func SiteMakeChan

func SiteMakeChan() (out chan Site)

SiteMakeChan returns a new open channel (that is, simply a `chan Site`). Note: unlike all the other functions, no Site producer is launched here yet!

This is useful to easily create corresponding variables such as:

	mySitePipelineStartsHere := SiteMakeChan()
	// ... lots of code to design and build Your favourite "mySiteWorkflowPipeline"
	// ...
	// ... *before* You start pouring data into it, e.g. simply via:
	for drop := range water {
		mySitePipelineStartsHere <- drop
	}
	close(mySitePipelineStartsHere)

Hint: especially helpful, if Your piping library operates on some hidden (non-exported) type
(or on a type imported from elsewhere - and You don't want/need or should(!) have to care.)

Note: as always (except for SitePipeBuffer) the channel is unbuffered.

func SitePair

func SitePair(inp <-chan Site) (out1, out2 <-chan Site)

SitePair returns a pair of channels to receive every result of inp before close.

Note: Yes, it is a VERY simple fanout - but sometimes all You need.

func SitePipeAdjust

func SitePipeAdjust(inp <-chan Site, sizes ...int) (out <-chan Site)

SitePipeAdjust returns a channel to receive all `inp` buffered by a SiteSendProxy process before close.

func SitePipeBuffer

func SitePipeBuffer(inp <-chan Site, cap int) (out <-chan Site)

SitePipeBuffer returns a buffered channel with capacity `cap` to receive all `inp` before close.
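The effect of the buffer can be seen in a small sketch: with enough capacity, the sender can finish before any consumer starts reading. `sitePipeBuffer` is a hypothetical local re-implementation and `Site` a simplified stand-in.

```go
package main

import "fmt"

// Site is a pared-down local stand-in for the package's Site type (assumption).
type Site struct{ Depth int }

// sitePipeBuffer is a hypothetical sketch: it copies inp into a buffered
// channel of the given capacity and closes it once inp is drained.
func sitePipeBuffer(inp <-chan Site, capacity int) <-chan Site {
	out := make(chan Site, capacity)
	go func() {
		defer close(out)
		for s := range inp {
			out <- s
		}
	}()
	return out
}

func main() {
	inp := make(chan Site)
	out := sitePipeBuffer(inp, 3)

	// These sends complete although nobody reads from out yet:
	// the buffer absorbs all three values.
	for d := 0; d < 3; d++ {
		inp <- Site{Depth: d}
	}
	close(inp)

	for s := range out {
		fmt.Println(s.Depth)
	}
}
```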

func SitePipeEnter

func SitePipeEnter(inp <-chan Site, wg SiteWaiter) (out <-chan Site)

SitePipeEnter returns a channel to receive all `inp` and registers throughput as arrival on the given `sync.WaitGroup` until close.

func SitePipeFunc

func SitePipeFunc(inp <-chan Site, act func(a Site) Site) (out <-chan Site)

SitePipeFunc returns a channel to receive every result of action `act` applied to `inp` before close. Note: it 'could' be SitePipeMap for functional people, but 'map' has a very different meaning in Go.
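As a rough illustration of this map-like stage, the sketch below applies a function to each value in transit. `sitePipeFunc` and `deeper` are hypothetical local names and `Site` is a simplified stand-in.

```go
package main

import "fmt"

// Site is a pared-down local stand-in for the package's Site type (assumption).
type Site struct{ Depth int }

// sitePipeFunc is a hypothetical sketch: it forwards act(s) for every s
// received from inp and closes out once inp is drained.
func sitePipeFunc(inp <-chan Site, act func(Site) Site) <-chan Site {
	out := make(chan Site)
	go func() {
		defer close(out)
		for s := range inp {
			out <- act(s)
		}
	}()
	return out
}

// deeper is an illustrative action: it returns a copy one level deeper.
func deeper(s Site) Site {
	s.Depth++
	return s
}

func main() {
	inp := make(chan Site)
	go func() {
		defer close(inp)
		for d := 0; d < 3; d++ {
			inp <- Site{Depth: d}
		}
	}()
	for s := range sitePipeFunc(inp, deeper) {
		fmt.Println(s.Depth)
	}
}
```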

func SitePipeLeave

func SitePipeLeave(inp <-chan Site, wg SiteWaiter) (out <-chan Site)

SitePipeLeave returns a channel to receive all `inp` and registers throughput as departure on the given `sync.WaitGroup` until close.

func SitePipeSeen

func SitePipeSeen(inp <-chan Site) (out <-chan Site)

SitePipeSeen returns a channel to receive all `inp` not seen before while silently dropping everything seen before (internally growing a `sync.Map` to discriminate) until close. Note: SitePipeFilterNotSeenYet might be a better name, but is fairly long.

func SitePipeSeenAttr

func SitePipeSeenAttr(inp <-chan Site, attr func(a Site) interface{}) (out <-chan Site)

SitePipeSeenAttr returns a channel to receive all `inp` whose attribute `attr` has not been seen before while silently dropping everything seen before (internally growing a `sync.Map` to discriminate) until close. Note: SitePipeFilterAttrNotSeenYet might be a better name, but is fairly long.

func SiteSendProxy

func SiteSendProxy(out chan<- Site, sizes ...int) chan<- Site

SiteSendProxy returns a channel to serve as a sending proxy to 'out'. Uses a goroutine to receive values from the returned proxy channel and store them in an expanding buffer, so that sending to the proxy never blocks.

Note: the expanding buffer is implemented via "container/ring"

Note: SiteSendProxy is kept for the Sieve example and other dynamic use to be discovered, even though it does not fit the pipe tube pattern as SitePipeAdjust does.
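The idea of a non-blocking sending proxy can be sketched as follows. Note the assumptions: this sketch swaps the "container/ring" buffer for a plain slice queue, ignores the `sizes` parameter, and closes `out` after the proxy has been closed and flushed. `siteSendProxy` is a hypothetical local name, not the package's code.

```go
package main

import "fmt"

// Site is a pared-down local stand-in for the package's Site type (assumption).
type Site struct{ Depth int }

// siteSendProxy is a hypothetical sketch: values sent to the returned
// channel are queued in a growing slice and forwarded to out in order.
func siteSendProxy(out chan<- Site) chan<- Site {
	proxy := make(chan Site)
	go func() {
		defer close(out) // assumption: out is closed after the final flush
		var queue []Site
		var in <-chan Site = proxy
		for in != nil || len(queue) > 0 {
			var send chan<- Site // stays nil while nothing is queued
			var next Site
			if len(queue) > 0 {
				send, next = out, queue[0]
			}
			select {
			case s, ok := <-in:
				if !ok {
					in = nil // proxy closed: stop receiving, keep flushing
					continue
				}
				queue = append(queue, s)
			case send <- next:
				queue = queue[1:]
			}
		}
	}()
	return proxy
}

func main() {
	out := make(chan Site)
	proxy := siteSendProxy(out)

	// Nobody reads from out yet; the sends still complete.
	for d := 0; d < 5; d++ {
		proxy <- Site{Depth: d}
	}
	close(proxy)

	for s := range out {
		fmt.Println(s.Depth)
	}
}
```

The receive case is always ready while the proxy is open, so a sender waits only for the proxy goroutine, never for the consumer of `out`.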

func SiteStrew

func SiteStrew(inp <-chan Site, size int) (outS [](<-chan Site))

SiteStrew returns a slice (of size = size) of channels one of which shall receive each inp before close.
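One way to distribute each input to exactly one of several outputs is to let one goroutine per output compete for the shared input. The sketch below assumes that strategy; the package may pick the receiving channel differently. `siteStrew` and `countAll` are hypothetical local names.

```go
package main

import "fmt"

// Site is a pared-down local stand-in for the package's Site type (assumption).
type Site struct{ Depth int }

// siteStrew is a hypothetical sketch: size goroutines compete for inp, so
// each value ends up on exactly one of the returned channels.
func siteStrew(inp <-chan Site, size int) []<-chan Site {
	outs := make([]<-chan Site, size)
	for i := range outs {
		out := make(chan Site)
		outs[i] = out
		go func() {
			defer close(out)
			for s := range inp {
				out <- s
			}
		}()
	}
	return outs
}

// countAll is an illustrative helper: it drains every channel concurrently
// and returns the total number of values received.
func countAll(outs []<-chan Site) int {
	counts := make(chan int)
	for _, out := range outs {
		go func(out <-chan Site) {
			n := 0
			for range out {
				n++
			}
			counts <- n
		}(out)
	}
	total := 0
	for range outs {
		total += <-counts
	}
	return total
}

func main() {
	inp := make(chan Site)
	go func() {
		defer close(inp)
		for d := 0; d < 9; d++ {
			inp <- Site{Depth: d}
		}
	}()
	fmt.Println(countAll(siteStrew(inp, 3)))
}
```

In this sketch every output must be drained; a value already picked up for an idle output waits there rather than moving to another one.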

func SiteTubeAdjust

func SiteTubeAdjust(sizes ...int) (tube func(inp <-chan Site) (out <-chan Site))

SiteTubeAdjust returns a closure around SitePipeAdjust (_, sizes...).

func SiteTubeBuffer

func SiteTubeBuffer(cap int) (tube func(inp <-chan Site) (out <-chan Site))

SiteTubeBuffer returns a closure around SitePipeBuffer (_, cap).

func SiteTubeEnter

func SiteTubeEnter(wg SiteWaiter) (tube func(inp <-chan Site) (out <-chan Site))

SiteTubeEnter returns a closure around SitePipeEnter (_, wg) registering throughput on the given `sync.WaitGroup` as arrival.

func SiteTubeFunc

func SiteTubeFunc(act func(a Site) Site) (tube func(inp <-chan Site) (out <-chan Site))

SiteTubeFunc returns a closure around SitePipeFunc (_, act).

func SiteTubeLeave

func SiteTubeLeave(wg SiteWaiter) (tube func(inp <-chan Site) (out <-chan Site))

SiteTubeLeave returns a closure around SitePipeLeave (_, wg) registering throughput on the given `sync.WaitGroup` as departure.

func SiteTubeSeen

func SiteTubeSeen() (tube func(inp <-chan Site) (out <-chan Site))

SiteTubeSeen returns a closure around SitePipeSeen() (silently dropping every Site seen before).

func SiteTubeSeenAttr

func SiteTubeSeenAttr(attr func(a Site) interface{}) (tube func(inp <-chan Site) (out <-chan Site))

SiteTubeSeenAttr returns a closure around SitePipeSeenAttr() (silently dropping every Site whose attribute `attr` was seen before).

Types

type Site

type Site struct {
	URL    *url.URL
	Parent *url.URL
	Depth  int
}

Site represents what travels: a URL which may have a Parent URL, and a Depth.

func (Site) Attr

func (s Site) Attr() interface{}

Attr implements the attribute relevant for SiteForkSeenAttr, the "I've seen this site before" discriminator.

func (Site) Print

func (s Site) Print() Site

Print may be used via e.g. SitePipeFunc(sites, Site.Print) for tracing.

type SiteWaiter

type SiteWaiter interface {
	Add(delta int)
	Done()
	Wait()
}

SiteWaiter - as implemented by `*sync.WaitGroup` - attends Flapdoors and keeps counting who enters and who leaves.

Use SiteDoneWait to learn about when the facilities are closed.

Note: You may also use Your provided `*sync.WaitGroup.Wait()` to know when to close the facilities. Just: SiteDoneWait is more convenient as it also closes the primary channel.

Just make sure to have _all_ entrances and exits attended, and `Wait()` only *after* You've started flooding the facilities.
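The interplay of registering entries, counting departures and waiting can be sketched as below. This is an assumption-laden illustration: `sitePipeLeave`, `siteDoneWait` and `process` are hypothetical local functions, entries are registered with a single `Add` before anybody waits, and `Site` is a simplified stand-in.

```go
package main

import (
	"fmt"
	"sync"
)

// Site is a pared-down local stand-in for the package's Site type (assumption).
type Site struct{ Depth int }

// SiteWaiter mirrors the interface shown above; *sync.WaitGroup satisfies it.
type SiteWaiter interface {
	Add(delta int)
	Done()
	Wait()
}

// sitePipeLeave is a hypothetical sketch: it forwards each Site and then
// registers its departure on wg.
func sitePipeLeave(inp <-chan Site, wg SiteWaiter) <-chan Site {
	out := make(chan Site)
	go func() {
		defer close(out)
		for s := range inp {
			out <- s
			wg.Done()
		}
	}()
	return out
}

// siteDoneWait is a hypothetical sketch: once wg.Wait() returns it closes
// inp, sends one signal, and closes the done channel.
func siteDoneWait(inp chan<- Site, wg SiteWaiter) <-chan struct{} {
	done := make(chan struct{})
	go func() {
		defer close(done)
		wg.Wait()
		close(inp)
		done <- struct{}{}
	}()
	return done
}

// process floods a fresh network with sites and returns how many left it.
func process(sites []Site) int {
	travel := make(chan Site)
	wg := new(sync.WaitGroup)

	wg.Add(len(sites)) // register all entries *before* anybody waits
	go func() {
		for _, s := range sites {
			travel <- s
		}
	}()

	counted := make(chan int)
	go func() {
		n := 0
		for range sitePipeLeave(travel, wg) {
			n++
		}
		counted <- n
	}()

	<-siteDoneWait(travel, wg) // returns after the last departure
	return <-counted
}

func main() {
	fmt.Println(process([]Site{{Depth: 0}, {Depth: 1}, {Depth: 2}}))
}
```

Calling `Add` up front is what makes the later `Wait` safe: if waiting began while the counter was still zero, it could return before any Site had entered.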

type Traffic

type Traffic struct {
	Travel          chan Site // to be processed
	*sync.WaitGroup           // monitor SiteEnter & SiteLeave
}

Traffic as it goes around inside a circular site pipe network, e.g. a crawling Crawler. Composed of Travel, a channel for those who travel in the traffic, and an embedded *sync.WaitGroup to keep track of congestion.

func (*Traffic) Feed

func (t *Traffic) Feed(urls []*url.URL, parent *url.URL, depth int)

Feed registers new entries and launches their dispatcher (which we intentionally left untouched).

func (*Traffic) Processor

func (t *Traffic) Processor(crawl func(s Site), parallel int)

Processor builds the site traffic processing network; it is circular if crawl uses Feed to provide feedback.
