Documentation ¶
Index ¶
- func Parse(fname, url, page string) ([]map[string]interface{}, error)
- func ParseExt(fname, url, page string) (string, error)
- func ParseLinks(page, url string) ([]string, error)
- func ParseNewLinks(page, url string) ([]string, error)
- type DomNode
- type Parser
- func (p *Parser) Do() ([]*UrlTask, []map[string]interface{}, error)
- func (p *Parser) Parse(page, pageUrl string) ([]*UrlTask, []map[string]interface{}, error)
- func (p *Parser) ParseURL(url string) ([]*UrlTask, []map[string]interface{}, error)
- func (p *Parser) RunJs(items []map[string]interface{}) ([]map[string]interface{}, error)
- type Rule
- type UrlTask
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ParseLinks ¶
func ParseLinks(page, url string) ([]string, error)
ParseLinks returns all URLs contained in the HTML page.
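The package's own implementation is not shown here, so the following is only a minimal sketch of what a function with this shape might do. It assumes links are taken from href attributes and resolved against the page URL. The name extractLinks, the regular expression, and the resolution step are illustrative assumptions, not the library's code.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// hrefRe matches quoted href attribute values. A regular expression is a
// simplification chosen for this sketch; a real HTML parser is more robust.
var hrefRe = regexp.MustCompile(`(?i)href\s*=\s*["']([^"']+)["']`)

// extractLinks is a hypothetical stand-in with the same shape as ParseLinks:
// it takes the page body and the page URL and returns the links it finds,
// resolved to absolute URLs (an assumption about the intended behavior).
func extractLinks(page, pageURL string) ([]string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return nil, err
	}
	var links []string
	for _, m := range hrefRe.FindAllStringSubmatch(page, -1) {
		ref, err := url.Parse(m[1])
		if err != nil {
			continue // skip malformed href values
		}
		links = append(links, base.ResolveReference(ref).String())
	}
	return links, nil
}

func main() {
	page := `<a href="/about">About</a> <a href="https://example.org/x">X</a>`
	links, err := extractLinks(page, "https://example.com/index.html")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range links {
		fmt.Println(l)
	}
}
```

Passing the page URL alongside the body is what allows relative links such as /about to be turned into absolute ones.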
func ParseNewLinks ¶
func ParseNewLinks(page, url string) ([]string, error)
ParseNewLinks returns the new URLs contained in the HTML page.
Types ¶
type Parser ¶
type Parser struct {
	Name          string             `json:"name"`
	DefaultFields bool               `json:"default_fields"`
	ZipContent    bool               `json:"zip_content"`
	ExampleUrl    string             `json:"example_url"`
	UA            string             `json:"ua"`
	Urls          []string           `json:"urls"`
	Rules         map[string][]*Rule `json:"rules"`
	Js            string             `json:"js"`
}
Parser contains a set of cascaded rules and optional JS code used to parse the corresponding HTML pages.
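Because every field carries a JSON tag, a Parser can presumably be loaded from a JSON document. The sketch below mirrors the documented fields in a local type so that it runs on its own. The fields of Rule are not shown in this documentation, so each rule is kept as raw JSON rather than guessed at. The type parserConfig, the helper decodeParserConfig, the "root" key, and all sample values are illustrative assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parserConfig mirrors the documented Parser fields and JSON tags.
// Rule's structure is not documented here, so rules stay as raw JSON.
type parserConfig struct {
	Name          string                       `json:"name"`
	DefaultFields bool                         `json:"default_fields"`
	ZipContent    bool                         `json:"zip_content"`
	ExampleUrl    string                       `json:"example_url"`
	UA            string                       `json:"ua"`
	Urls          []string                     `json:"urls"`
	Rules         map[string][]json.RawMessage `json:"rules"`
	Js            string                       `json:"js"`
}

// decodeParserConfig unmarshals a JSON document into a parserConfig.
func decodeParserConfig(data []byte) (*parserConfig, error) {
	var cfg parserConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	// All values below are made up for illustration.
	raw := []byte(`{
		"name": "example",
		"default_fields": true,
		"example_url": "https://example.com/list",
		"ua": "example-agent/1.0",
		"urls": ["https://example.com/list"],
		"rules": {"root": [{}]}
	}`)
	cfg, err := decodeParserConfig(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cfg.Name, cfg.DefaultFields, len(cfg.Urls), len(cfg.Rules["root"]))
}
```

Fields omitted from the JSON, such as zip_content and js above, are left at their zero values.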