Documentation ¶
Index ¶
- type HandleRequestFunc
- type RunFunc
- type Spider
- func (s *Spider) Close()
- func (s *Spider) GetCloseAfterNotHandleRequest() []model.Request
- func (s *Spider) SetDownloader(downloader downloader.Downloader) *Spider
- func (s *Spider) SetExitWhenComplete(exitWhenComplete bool) *Spider
- func (s *Spider) SetIdleTimeout(duration time.Duration) *Spider
- func (s *Spider) SetScheduler(scheduler scheduler.Scheduler) *Spider
- func (s *Spider) SetStorage(stg persist.Storage) *Spider
- func (s *Spider) Start()
- func (s *Spider) StartRequest(startRequest ...model.Request) *Spider
- func (s *Spider) StartUrls(startUrls ...string) *Spider
- func (s *Spider) ThreadNum(num int) *Spider
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type HandleRequestFunc ¶
type Spider ¶
type Spider struct {
	PageCount     int64             // number of pages crawled
	OnSuccess     HandleRequestFunc // called when a request succeeds
	OnFailed      HandleRequestFunc // called when a request fails
	RunBeforeFunc RunFunc
	RunAfterFunc  RunFunc
	// contains filtered or unexported fields
}
Spider is the crawler. It ties together the downloader, processor, scheduler, and persistence modules.
func (*Spider) GetCloseAfterNotHandleRequest ¶
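func (s *Spider) GetCloseAfterNotHandleRequest() []model.Request
GetCloseAfterNotHandleRequest returns all requests that have not been handled.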
func (*Spider) SetDownloader ¶
func (s *Spider) SetDownloader(downloader downloader.Downloader) *Spider
SetDownloader sets the downloader.
func (*Spider) SetExitWhenComplete ¶
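func (s *Spider) SetExitWhenComplete(exitWhenComplete bool) *Spider
SetExitWhenComplete sets whether the program exits once the download tasks are complete.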
func (*Spider) SetIdleTimeout ¶
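func (s *Spider) SetIdleTimeout(duration time.Duration) *Spider
SetIdleTimeout sets how long the spider may stay idle before it exits.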
func (*Spider) SetScheduler ¶
func (s *Spider) SetScheduler(scheduler scheduler.Scheduler) *Spider
SetScheduler sets the scheduler.
func (*Spider) SetStorage ¶
func (s *Spider) SetStorage(stg persist.Storage) *Spider
SetStorage sets the storage used for data persistence.
func (*Spider) StartRequest ¶
func (s *Spider) StartRequest(startRequest ...model.Request) *Spider
StartRequest sets the seed requests.