goscraper

package module
Version: v0.0.0-...-36995ce
Warning

This package is not in the latest version of its module.

Published: Aug 27, 2019 License: MIT Imports: 9 Imported by: 7

README

goscraper

A Go package that quickly returns a preview of a webpage, giving you easy access to its title, description, and images.

Usage

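package main

import (
	"fmt"

	"github.com/badoux/goscraper" // assumed import path; adjust to the module you use
)

func main() {
	// The second argument is maxRedirect (see the Scrape signature below).
	s, err := goscraper.Scrape("https://www.w3.org/", 5)
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("Icon : %s\n", s.Preview.Icon)
	fmt.Printf("Name : %s\n", s.Preview.Name)
	fmt.Printf("Title : %s\n", s.Preview.Title)
	fmt.Printf("Description : %s\n", s.Preview.Description)
	// Guard the index so an empty Images slice cannot cause a panic.
	if len(s.Preview.Images) > 0 {
		fmt.Printf("Image: %s\n", s.Preview.Images[0])
	}
	fmt.Printf("Url : %s\n", s.Preview.Link)
}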

Output:

Icon : https://www.w3.org/favicon.ico
Name : www.w3.org
Title : World Wide Web Consortium (W3C)
Description : The World Wide Web Consortium (W3C) is an international community where Member organizations, a full-time staff, and the public work together to develop Web standards.
Image: https://www.w3.org/2008/site/images/logo-w3c-mobile-lg
Url : https://www.w3.org/

License

Goscraper is licensed under the MIT License.

Documentation

Index

Constants

This section is empty.

Variables

var (
	EscapedFragment string = "_escaped_fragment_="
)
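
The value of EscapedFragment matches the query parameter used by the (now deprecated) AJAX crawling scheme, in which a "#!" fragment is moved into the query string. The sketch below shows that rewrite using only net/url. It is an illustration under assumptions, not goscraper's own code: toEscapedFragmentURL is a hypothetical helper, and url.QueryEscape escapes more characters than the scheme strictly requires.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// escapedFragment mirrors the value of goscraper.EscapedFragment.
const escapedFragment = "_escaped_fragment_="

// toEscapedFragmentURL is a hypothetical helper (not part of goscraper).
// It rewrites a "#!" fragment into an _escaped_fragment_ query parameter
// and returns URLs without such a fragment unchanged.
func toEscapedFragmentURL(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	// Only hash-bang fragments take part in the scheme.
	if !strings.HasPrefix(u.Fragment, "!") {
		return u.String(), nil
	}
	param := escapedFragment + url.QueryEscape(strings.TrimPrefix(u.Fragment, "!"))
	if u.RawQuery == "" {
		u.RawQuery = param
	} else {
		u.RawQuery += "&" + param
	}
	// Clear the fragment so it is not emitted a second time.
	u.Fragment = ""
	u.RawFragment = ""
	return u.String(), nil
}

func main() {
	out, err := toEscapedFragmentURL("https://example.com/app#!page=1")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(out)
}
```

The EscapedFragmentUrl field of Scraper suggests the package keeps such a rewritten URL alongside the original one, but how and when it is requested is not documented here.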

Functions

This section is empty.

Types

type Document

type Document struct {
	Body    bytes.Buffer
	Preview DocumentPreview
}

func Scrape

func Scrape(uri string, maxRedirect int) (*Document, error)

type DocumentPreview

type DocumentPreview struct {
	Icon        string
	Name        string
	Title       string
	Description string
	Images      []string
	Link        string
}
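
Because Images is a slice, indexing it directly panics when it is empty. The sketch below shows one defensive way to pick a preview image. It declares a local copy of DocumentPreview so it compiles without importing goscraper, and firstImage is a hypothetical helper; falling back to Icon is a design choice made for this example, not documented behavior of the package.

```go
package main

import "fmt"

// DocumentPreview is a local copy of the struct documented above,
// declared here only so the sketch is self-contained.
type DocumentPreview struct {
	Icon        string
	Name        string
	Title       string
	Description string
	Images      []string
	Link        string
}

// firstImage is a hypothetical helper (not part of goscraper).
// It returns the first entry of Images, or Icon when Images is empty.
func firstImage(p DocumentPreview) string {
	if len(p.Images) > 0 {
		return p.Images[0]
	}
	return p.Icon
}

func main() {
	p := DocumentPreview{
		Icon:  "https://www.w3.org/favicon.ico",
		Title: "World Wide Web Consortium (W3C)",
	}
	// Images is nil here, so the helper falls back to the icon.
	fmt.Println(firstImage(p))
}
```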

type Scraper

type Scraper struct {
	Url                *url.URL
	EscapedFragmentUrl *url.URL
	MaxRedirect        int
}

func (*Scraper) Scrape

func (scraper *Scraper) Scrape() (*Document, error)
