Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types
type IScraper

type IScraper[T any] interface {
	// GetUrls retrieves the URLs from the current page and the URLs of the next pages for pagination.
	GetUrls(ctx context.Context, url string) ([]string, []string)

	// GetData scrapes the data from a given URL and sends it to the provided channel.
	GetData(ctx context.Context, ch chan<- T, data *T, url string)
}
IScraper is an interface that defines the methods required for any scraper implementation. T is a generic type representing the data being scraped.
type Scraper

type Scraper[T any] struct {
	// contains filtered or unexported fields
}
Scraper represents the main structure that coordinates scraping jobs across multiple strategies. It manages the scraping process, handles concurrency, and invokes a user-defined callback when data is scraped.
func NewScraper
func NewScraper[T any](s []ScraperStrategy[T], callback func(T), requestDelay time.Duration) *Scraper[T]
NewScraper creates a new Scraper instance. s is the list of strategies to run, callback is the function invoked for each scraped item, and requestDelay is the optional delay between requests.
type ScraperJob

type ScraperJob[T any] struct {
	// contains filtered or unexported fields
}
ScraperJob represents a job containing the scraper and a list of URLs to process. T is the type of data being scraped.
type ScraperStrategy

type ScraperStrategy[T any] struct {
	Scraper IScraper[T] // The scraper implementation used to scrape data from the target URL.
	Url     string      // The URL to start scraping from.
}
ScraperStrategy defines the strategy for scraping a specific URL with a given scraper implementation. T represents the type of data being scraped.