Documentation ¶
Overview ¶
Package zenrows provides utility functions to set scraping options for the ZenRows API. These functions help in configuring the request parameters for various scraping features and requirements.
Index ¶
- func ApplyParameters(u *url.URL, params ...ScrapeOptions) *url.URL
- type Client
- type ClientConfig
- type HttpClient
- type ScrapeOptions
- func WithAIAntiBot() ScrapeOptions
- func WithAutoparse(value bool) ScrapeOptions
- func WithBlockResources(value string) ScrapeOptions
- func WithCSSExtractor(value string) ScrapeOptions
- func WithCustomHeaders(value bool) ScrapeOptions
- func WithDevice(value string) ScrapeOptions
- func WithJSInstructions(value string) ScrapeOptions
- func WithJSONResponse(value bool) ScrapeOptions
- func WithJSRender() ScrapeOptions
- func WithOriginalStatus(value bool) ScrapeOptions
- func WithPremiumProxy() ScrapeOptions
- func WithProxyCountry(value string) ScrapeOptions
- func WithResolveCaptcha(value bool) ScrapeOptions
- func WithSessionID(sessionID int) ScrapeOptions
- func WithWait(value int) ScrapeOptions
- func WithWaitFor(value string) ScrapeOptions
- func WithWindowHeight(value int) ScrapeOptions
- func WithWindowWidth(value int) ScrapeOptions
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ApplyParameters ¶
func ApplyParameters(u *url.URL, params ...ScrapeOptions) *url.URL
ApplyParameters applies the chosen scraping options to a URL. It modifies the URL's query string based on the provided scraping options.
u: The target URL.
params: The ScrapeOptions to be applied to the URL.
Types ¶
type Client ¶
type Client struct {
// contains filtered or unexported fields
}
Client is a ZenRows API client.
func NewClient ¶
func NewClient(httpClient HttpClient) *Client
NewClient initialises the client with the given HttpClient implementation.
func (*Client) Scrape ¶
func (c *Client) Scrape(ctx context.Context, targetURL string, params ...ScrapeOptions) (string, error)
Scrape fetches content from the specified targetURL using the ZenRows API.
The function constructs the API URL based on the provided targetURL and optional ScrapeOptions. It then sends a GET request to the ZenRows API and returns the scraped content as a string.
The function validates the provided targetURL to ensure it's a full URL with both a scheme and a host. It also checks if the 'js_instructions' parameter is set without enabling 'js_render', and returns an error if so.
For now only the GET method is supported ¶
Parameters:
- ctx: Context
- targetURL: The URL of the website you want to scrape.
- params: Optional parameters to customize the scraping process. Refer to ScrapeOptions for available options.
Returns:
- A string containing the scraped content.
- An error if there's any issue during the scraping process, such as invalid URLs, failed requests, or reading issues.
Example usage:
content, err := client.Scrape(context.Background(), "https://example.com", zenrows.WithJSRender())
if err != nil {
	log.Fatalf("Failed to scrape the target: %v", err)
}
fmt.Println("Scraped Content:", content)
For more details and examples, refer to the https://pkg.go.dev/github.com/renatoaraujo/go-zenrows and the example provided in the repository https://github.com/renatoaraujo/go-zenrows/blob/main/examples/example.go.
func (*Client) WithApiKey ¶
WithApiKey configures the API key.
type ClientConfig ¶
type ClientConfig struct {
	BaseURL string
	// contains filtered or unexported fields
}
ClientConfig holds the API key and the base API URL.
func DefaultConfig ¶
func DefaultConfig() ClientConfig
DefaultConfig generates the default configuration. It is currently the only option, but is extensible for the future.
func (*ClientConfig) ConfigCredentials ¶
func (c *ClientConfig) ConfigCredentials(key string)
ConfigCredentials adds the API key to the configuration. If the credential format ever changes, the change will be easier to implement here.
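A typical configuration flow, sketched from the signatures above; the placeholder key is illustrative, not a real credential:

```go
package main

import zenrows "github.com/renatoaraujo/go-zenrows"

func main() {
	cfg := zenrows.DefaultConfig()        // default base URL and settings
	cfg.ConfigCredentials("YOUR_API_KEY") // attach the API key
	_ = cfg
}
```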
type HttpClient ¶
HttpClient is an HTTP client able to perform requests; it can be http.Client or any other implementation.
type ScrapeOptions ¶
ScrapeOptions defines functions that modify URL query values based on the chosen scraping options.
func WithAIAntiBot ¶ added in v0.3.0
func WithAIAntiBot() ScrapeOptions
WithAIAntiBot enables the anti-bot bypass. Some websites protect their content with anti-bot solutions such as Cloudflare, Akamai, or DataDome. Enable Anti-bot to bypass them easily without any hassle.
func WithAutoparse ¶
func WithAutoparse(value bool) ScrapeOptions
WithAutoparse employs the auto-parser algorithm for the request, which extracts data from the page automatically.
value: A boolean to determine if the auto parser should be used.
func WithBlockResources ¶
func WithBlockResources(value string) ScrapeOptions
WithBlockResources prevents specific resources from loading during the scrape request.
value: The types of resources to block.
func WithCSSExtractor ¶
func WithCSSExtractor(value string) ScrapeOptions
WithCSSExtractor sets CSS Selectors to extract specific data from the HTML.
value: The desired CSS selectors.
func WithCustomHeaders ¶
func WithCustomHeaders(value bool) ScrapeOptions
WithCustomHeaders allows custom headers to be added to the request.
value: A boolean indicating if custom headers are to be included.
func WithDevice ¶
func WithDevice(value string) ScrapeOptions
WithDevice sets the user agent type (either desktop or mobile) for the request.
value: A string specifying the device type ("desktop" or "mobile").
func WithJSInstructions ¶ added in v0.2.0
func WithJSInstructions(value string) ScrapeOptions
WithJSInstructions provides JavaScript instructions for the scrape request. It automatically enables WithJSRender to ensure the correct execution of JavaScript instructions.
value: A JSON string representing the JavaScript instructions.
func WithJSONResponse ¶
func WithJSONResponse(value bool) ScrapeOptions
WithJSONResponse configures the request to return content in JSON format, including any XHR or Fetch requests made.
value: A boolean to determine if the response should be in JSON format.
func WithJSRender ¶
func WithJSRender() ScrapeOptions
WithJSRender enables JavaScript rendering for the scrape request. Consumes 5 credits per request.
func WithOriginalStatus ¶
func WithOriginalStatus(value bool) ScrapeOptions
WithOriginalStatus configures the request to return the status code as provided by the website.
value: A boolean determining if the original status code should be returned.
func WithPremiumProxy ¶
func WithPremiumProxy() ScrapeOptions
WithPremiumProxy enables the use of premium proxies for the request. This makes the request less detectable and consumes 10-25 credits per request.
func WithProxyCountry ¶
func WithProxyCountry(value string) ScrapeOptions
WithProxyCountry specifies the geolocation of the IP for the request. Note: Only applicable for Premium Proxies.
value: The desired country code for the proxy.
func WithResolveCaptcha ¶
func WithResolveCaptcha(value bool) ScrapeOptions
WithResolveCaptcha integrates a CAPTCHA solver for the request, enabling automatic solving of CAPTCHAs on the page.
value: A boolean to determine if the CAPTCHA solver should be used.
func WithSessionID ¶
func WithSessionID(sessionID int) ScrapeOptions
WithSessionID sets the Session ID number for the scrape request. This allows the use of the same IP for each API Request for up to 10 minutes.
sessionID: An integer representing the Session ID.
func WithWait ¶
func WithWait(value int) ScrapeOptions
WithWait introduces a fixed delay before the content is returned.
value: An integer specifying the wait time in milliseconds.
func WithWaitFor ¶
func WithWaitFor(value string) ScrapeOptions
WithWaitFor delays the request until a specific CSS Selector is loaded in the DOM.
value: A string specifying the CSS Selector to wait for.
func WithWindowHeight ¶
func WithWindowHeight(value int) ScrapeOptions
WithWindowHeight defines the browser window height for the request.
value: The desired window height in pixels.
func WithWindowWidth ¶
func WithWindowWidth(value int) ScrapeOptions
WithWindowWidth defines the browser window width for the request.
value: The desired window width in pixels.