Documentation ¶
Index ¶
- Variables
- type Client
- func (c *Client) Get(ctx context.Context, targetURL string, params *RequestParameters) (*Response, error)
- func (c *Client) Post(ctx context.Context, targetURL string, params *RequestParameters, body any) (*Response, error)
- func (c *Client) Put(ctx context.Context, targetURL string, params *RequestParameters, body any) (*Response, error)
- func (c *Client) Scrape(ctx context.Context, method, targetURL string, params *RequestParameters, ...) (*Response, error)
- type IClient
- type InvalidHTTPMethodError
- type InvalidParameterError
- type InvalidTargetURLError
- type NotConfiguredError
- type Option
- func WithAPIKey(apiKey string) Option
- func WithBaseURL(baseURL string) Option
- func WithMaxConcurrentRequests(maxConcurrentRequests int) Option
- func WithMaxRetryCount(maxRetryCount int) Option
- func WithRetryMaxWaitTime(retryMaxWaitTime time.Duration) Option
- func WithRetryWaitTime(retryWaitTime time.Duration) Option
- type OutputType
- type RequestParameters
- type ResourceType
- type Response
- func (r *Response) Body() []byte
- func (r *Response) Error() error
- func (r *Response) Header() http.Header
- func (r *Response) IsError() bool
- func (r *Response) IsSuccess() bool
- func (r *Response) Problem() *problem.Problem
- func (r *Response) ReceivedAt() time.Time
- func (r *Response) Size() int64
- func (r *Response) Status() string
- func (r *Response) StatusCode() int
- func (r *Response) String() string
- func (r *Response) TargetCookies() []*http.Cookie
- func (r *Response) TargetHeaders() http.Header
- func (r *Response) Time() time.Duration
- type ResponseType
- type ScreenshotFormat
Constants ¶
This section is empty.
Variables ¶
var AllOutputTypes = map[OutputType]struct{}{
	OutputTypeEmails:       {},
	OutputTypePhoneNumbers: {},
	OutputTypeHeadings:     {},
	OutputTypeImages:       {},
	OutputTypeAudios:       {},
	OutputTypeVideos:       {},
	OutputTypeLinks:        {},
	OutputTypeTables:       {},
	OutputTypeMenus:        {},
	OutputTypeHashtags:     {},
	OutputTypeMetadata:     {},
	OutputTypeFavicon:      {},
	OutputTypeAll:          {},
}
var AllResourceTypes = map[ResourceType]struct{}{
	ResourceTypeEventSource: {},
	ResourceTypeFetch:       {},
	ResourceTypeFont:        {},
	ResourceTypeImage:       {},
	ResourceTypeManifest:    {},
	ResourceTypeMedia:       {},
	ResourceTypeOther:       {},
	ResourceTypeScript:      {},
	ResourceTypeStylesheet:  {},
	ResourceTypeTextTrack:   {},
	ResourceTypeWebSocket:   {},
	ResourceTypeXHR:         {},
}
var AllResponseTypes = map[ResponseType]struct{}{
	ResponseTypeMarkdown:  {},
	ResponseTypePlainText: {},
	ResponseTypePDF:       {},
}
var AllScreenshotFormats = map[ScreenshotFormat]struct{}{
	ScreenshotFormatPNG:  {},
	ScreenshotFormatJPEG: {},
}
var Version = version.Version
var VersionPrerelease = version.Prerelease
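The All… variables above are sets keyed by the typed string constants, so a membership test is a single map lookup. The following is a minimal sketch of that idiom; `outputType`, `allOutputTypes`, and `isKnownOutput` are local stand-ins invented for illustration and are not part of the package.

```go
package main

import "fmt"

// outputType is a local stand-in for the package's OutputType.
type outputType string

// allOutputTypes mirrors the shape of AllOutputTypes with only a few entries.
var allOutputTypes = map[outputType]struct{}{
	"emails": {},
	"links":  {},
	"*":      {},
}

// isKnownOutput reports whether t is a key of the set.
func isKnownOutput(t outputType) bool {
	_, ok := allOutputTypes[t]
	return ok
}

func main() {
	fmt.Println(isKnownOutput("links"))   // true
	fmt.Println(isKnownOutput("widgets")) // false
}
```

Using `struct{}` as the value type keeps the set free of per-entry payload; only the keys matter.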
Functions ¶
This section is empty.
Types ¶
type Client ¶
type Client struct {
// contains filtered or unexported fields
}
Client is the ZenRows Scraper API client.
func (*Client) Get ¶
func (c *Client) Get(ctx context.Context, targetURL string, params *RequestParameters) (*Response, error)
Get sends an HTTP GET request to the ZenRows Scraper API to scrape the given target URL using the specified parameters.
func (*Client) Post ¶
func (c *Client) Post(ctx context.Context, targetURL string, params *RequestParameters, body any) (*Response, error)
Post sends an HTTP POST request to the ZenRows Scraper API to scrape the given target URL using the specified parameters.
type IClient ¶
type IClient interface {
	// Scrape sends a request to the ZenRows Scraper API to scrape the given target URL using the specified method and parameters.
	Scrape(ctx context.Context, targetURL, method string, params RequestParameters) (*Response, error)
	// Get sends a GET request to the ZenRows Scraper API to scrape the given target URL using the specified parameters.
	Get(ctx context.Context, targetURL string, params RequestParameters) (*Response, error)
	// Post sends a POST request to the ZenRows Scraper API to scrape the given target URL using the specified parameters.
	Post(ctx context.Context, targetURL string, params RequestParameters) (*Response, error)
	// Put sends a PUT request to the ZenRows Scraper API to scrape the given target URL using the specified parameters.
	Put(ctx context.Context, targetURL string, params RequestParameters) (*Response, error)
}
type InvalidHTTPMethodError ¶
type InvalidHTTPMethodError struct{}
InvalidHTTPMethodError results when the ZenRows Scraper API client is used with an invalid HTTP method.
func (InvalidHTTPMethodError) Error ¶
func (InvalidHTTPMethodError) Error() string
type InvalidParameterError ¶
type InvalidParameterError struct {
Msg string
}
InvalidParameterError results when the ZenRows Scraper API client is used with an invalid parameter.
func (InvalidParameterError) Error ¶
func (e InvalidParameterError) Error() string
type InvalidTargetURLError ¶
InvalidTargetURLError results when the ZenRows Scraper API client is used with an invalid target URL.
func (InvalidTargetURLError) Error ¶
func (e InvalidTargetURLError) Error() string
func (InvalidTargetURLError) Unwrap ¶
func (e InvalidTargetURLError) Unwrap() error
type NotConfiguredError ¶
type NotConfiguredError struct{}
NotConfiguredError results when the ZenRows Scraper API client is used without a valid API Key.
func (NotConfiguredError) Error ¶
func (NotConfiguredError) Error() string
type Option ¶
type Option interface {
// contains filtered or unexported methods
}
Option configures the ZenRows Scraper API client.
func WithAPIKey ¶
WithAPIKey returns an Option which configures the API key of the ZenRows Scraper API client.
func WithBaseURL ¶
WithBaseURL returns an Option which configures the base URL of the ZenRows Scraper API client.
func WithMaxConcurrentRequests ¶
WithMaxConcurrentRequests returns an Option which configures the maximum number of concurrent requests to the ZenRows Scraper API. See https://docs.zenrows.com/scraper-api/features/concurrency for more information.
IMPORTANT: Breaking the concurrency limit will result in a 429 Too Many Requests error. If you exceed the limit repeatedly, your account may be temporarily suspended, so make sure to set this value to a reasonable number according to your subscription plan.
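One common way to cap in-flight work in Go is a buffered channel used as a semaphore. The sketch below shows that technique in isolation; it is not the client's implementation, and `runLimited` is a hypothetical helper that also records the peak number of tasks running at once.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runLimited runs every task, allowing at most limit to execute at once,
// and returns the highest number of tasks observed running concurrently.
func runLimited(limit int, tasks []func()) int {
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var mu sync.Mutex
	inFlight, peak := 0, 0

	for _, task := range tasks {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit tasks are already running
		go func(run func()) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot for the next task

			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()

			run()

			mu.Lock()
			inFlight--
			mu.Unlock()
		}(task)
	}
	wg.Wait()
	return peak
}

func main() {
	tasks := make([]func(), 10)
	for i := range tasks {
		tasks[i] = func() { time.Sleep(5 * time.Millisecond) }
	}
	fmt.Println("peak in flight:", runLimited(3, tasks))
}
```

The slot is acquired before the goroutine starts and released only after the task finishes, so the peak can never exceed the limit.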
func WithMaxRetryCount ¶
WithMaxRetryCount returns an Option which configures the maximum number of retries to perform.
func WithRetryMaxWaitTime ¶
WithRetryMaxWaitTime returns an Option which configures the maximum time to wait before retrying the request.
func WithRetryWaitTime ¶
WithRetryWaitTime returns an Option which configures the time to wait before retrying the request.
type OutputType ¶
type OutputType string
const (
	OutputTypeEmails       OutputType = "emails"
	OutputTypePhoneNumbers OutputType = "phone_numbers"
	OutputTypeHeadings     OutputType = "headings"
	OutputTypeImages       OutputType = "images"
	OutputTypeAudios       OutputType = "audios"
	OutputTypeVideos       OutputType = "videos"
	OutputTypeLinks        OutputType = "links"
	OutputTypeTables       OutputType = "tables"
	OutputTypeMenus        OutputType = "menus"
	OutputTypeMetadata     OutputType = "metadata"
	OutputTypeFavicon      OutputType = "favicon"
	OutputTypeAll          OutputType = "*"
)
type RequestParameters ¶
type RequestParameters struct {
	// Proxy settings
	UsePremiumProxies bool `json:"premium_proxy,omitempty" structs:"premium_proxy,omitempty" schema:"premium_proxy"`
	ProxyCountry string `json:"proxy_country,omitempty" structs:"proxy_country,omitempty" schema:"proxy_country"`

	// Output modifiers
	AutoParse bool `json:"autoparse,omitempty" structs:"autoparse,omitempty" schema:"autoparse"`
	CSSExtractor string `json:"css_extractor,omitempty" structs:"css_extractor,omitempty" schema:"css_extractor"`
	JSONResponse bool `json:"json_response,omitempty" structs:"json_response,omitempty" schema:"json_response"`
	ResponseType ResponseType `json:"response_type,omitempty" structs:"response_type,omitempty" schema:"response_type"`
	Outputs []OutputType `json:"outputs,omitempty" structs:"outputs,omitempty" schema:"outputs"`

	// JSRender enables JavaScript rendering for the request. If not enabled, the request will be processed by the standard scraping engine,
	// which is faster but does not execute JavaScript and may not bypass some anti-bot systems.
	//
	// See https://docs.zenrows.com/scraper-api/features/js-rendering for more information.
	JSRender bool `json:"js_render,omitempty" structs:"js_render,omitempty" schema:"js_render"`

	// JSInstructions is a serialized JSON object that contains custom JavaScript instructions that will be executed in the page before
	// returning the response (only available when using JSRender).
	//
	// See https://docs.zenrows.com/scraper-api/features/js-rendering#using-the-javascript-instructions for more information.
	JSInstructions string `json:"js_instructions,omitempty" structs:"js_instructions,omitempty" schema:"js_instructions"`

	// WaitMilliseconds will wait for the specified number of milliseconds before returning the response (only available when
	// using JSRender). The maximum wait time is 30 seconds (30000 ms).
	WaitMilliseconds int `json:"wait,omitempty" structs:"wait,omitempty" schema:"wait"`

	// WaitForSelector will wait for the specified element to appear in the page before returning the response (only available when
	// using JSRender).
	//
	// See https://docs.zenrows.com/scraper-api/features/js-rendering#wait-for-selector for more information.
	//
	// IMPORTANT: Make sure that the element you are waiting for is present in the page. If the element does not appear, the request will
	// fail with a timeout error after a few seconds.
	WaitForSelector string `json:"wait_for,omitempty" structs:"wait_for,omitempty" schema:"wait_for"`

	// Screenshot will return a screenshot of the page (only available when using JSRender)
	Screenshot bool `json:"screenshot,omitempty" structs:"screenshot,omitempty" schema:"screenshot"`

	// ScreenshotFullPage will take a screenshot of the full page (only available when using JSRender and Screenshot is set to true)
	ScreenshotFullPage bool `json:"screenshot_fullpage,omitempty" structs:"screenshot_fullpage,omitempty" schema:"screenshot_fullpage"`

	// ScreenshotSelector will take a screenshot of the specified element (only available when using JSRender and Screenshot is set to true)
	ScreenshotSelector string `json:"screenshot_selector,omitempty" structs:"screenshot_selector,omitempty" schema:"screenshot_selector"`

	// ScreenshotFormat will set the format of the screenshot (only available when using JSRender and Screenshot is set to true).
	// The available formats are ScreenshotFormatPNG and ScreenshotFormatJPEG. The default format is ScreenshotFormatPNG.
	ScreenshotFormat ScreenshotFormat `json:"screenshot_format,omitempty" structs:"screenshot_format,omitempty" schema:"screenshot_format"`

	// ScreenshotQuality will set the quality of the screenshot (only available when using JSRender and Screenshot is set to true, and
	// the format is ScreenshotFormatJPEG). The quality must be between 1 and 100. The default quality is 100.
	ScreenshotQuality int `json:"screenshot_quality,omitempty" structs:"screenshot_quality,omitempty" schema:"screenshot_quality"`

	// ReturnOriginalStatus will return the original status code of the response when the request is not successful. When a request is not
	// successful, the ZenRows Scraper API will always return a 422 status code. If you enable this feature, the original status code will
	// be returned instead.
	ReturnOriginalStatus bool `json:"original_status,omitempty" structs:"original_status,omitempty" schema:"original_status"`

	// SessionID is an integer between 0 and 99999 that can be used to group requests together. If you provide a SessionID, all requests
	// with the same SessionID will use the same IP address for up to 10 minutes. This feature is useful for web scraping sites that track
	// sessions or limit IP rotation. It helps simulate a persistent session and avoids triggering anti-bot systems that flag
	// frequent IP changes.
	//
	// See https://docs.zenrows.com/scraper-api/features/other#session-id for more information.
	//
	// IMPORTANT: Use this feature only if you know what you are doing. If you provide a SessionID, the IP rotation feature will be disabled
	// for all requests with the same SessionID. This may affect the scraping quality and increase the chances of being blocked.
	SessionID int `json:"session_id,omitempty" structs:"session_id,omitempty" schema:"session_id"`

	// AllowedStatusCodes will return the response body of a request even if the status code is not a successful one (2xx), but
	// is one of the specified status codes in this list.
	//
	// See https://docs.zenrows.com/scraper-api/features/other#return-content-on-error for more information.
	//
	// IMPORTANT: ZenRows Scraper API only charges for successful requests. If you use this feature, you will also be charged for
	// unsuccessful requests matching the specified status codes.
	AllowedStatusCodes []int `json:"allowed_status_codes,omitempty" structs:"allowed_status_codes,omitempty" schema:"allowed_status_codes"`

	// BlockResources will block the specified resources from loading (only available when using JSRender)
	//
	// See https://docs.zenrows.com/scraper-api/features/js-rendering#block-resources for more information.
	//
	// IMPORTANT: ZenRows Scraper API already blocks some resources by default to improve the scraping quality. Use this feature only if you
	// know what you are doing.
	BlockResources []ResourceType `json:"block_resources,omitempty" structs:"block_resources,omitempty" schema:"block_resources"`

	// CustomHeaders is a http.Header object that will be used to set custom headers in the request.
	//
	// See https://docs.zenrows.com/scraper-api/features/headers for more information.
	//
	// IMPORTANT: ZenRows Scraper API already rotates and selects the best combination of headers (like User-Agent, Accept-Language, etc.)
	// automatically for each request. If you provide custom headers, the scraping quality may be affected. Use this feature only if you
	// know what you are doing.
	CustomHeaders http.Header `json:"custom_headers,omitempty" structs:"-" schema:"-"`

	// CustomParams is a map of custom parameters that will be passed to the ZenRows Scraper API. These parameters will be passed as query
	// parameters in the request, and can be used to pass new features or options that are not available in the standard parameters.
	CustomParams map[string]string `json:"custom_params,omitempty" structs:"-" schema:"-"`
}
RequestParameters represents the parameters that can be passed to the ZenRows Scraper API when making a request to modify the behavior of the scraping engine.
See https://docs.zenrows.com/scraper-api/api-reference for more information.
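JSInstructions is a string holding serialized JSON, so it is usually easier to build the value with encoding/json than to write it by hand. The sketch below assumes the instructions take the form of an ordered list of single-key steps and uses `click` and `wait` as illustrative step names; both the shape and the names are assumptions to verify against the linked ZenRows documentation, and `buildInstructions` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// buildInstructions serializes an ordered list of single-key steps to a JSON string.
func buildInstructions(steps []map[string]any) (string, error) {
	b, err := json.Marshal(steps)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Step names and selectors here are illustrative assumptions.
	s, err := buildInstructions([]map[string]any{
		{"click": "#load-more"},
		{"wait": 500},
	})
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	fmt.Println(s) // [{"click":"#load-more"},{"wait":500}]
}
```

The resulting string is what would be assigned to the JSInstructions field.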
func ParseQueryRequestParameters ¶
func ParseQueryRequestParameters(query url.Values) (*RequestParameters, error)
ParseQueryRequestParameters parses the provided url.Values object and returns a RequestParameters object, or an error if the parsing fails.
func (*RequestParameters) ToURLValues ¶
func (p *RequestParameters) ToURLValues() url.Values
ToURLValues converts the RequestParameters to a url.Values object.
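To illustrate what such a conversion involves, here is a sketch that maps a few fields to query parameters by hand, using the names from the `schema` struct tags above. The `params` type and its `toURLValues` method are local stand-ins; the real method may be generated from the struct tags and may handle zero values differently.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// params is a local stand-in carrying three of the documented fields.
type params struct {
	JSRender         bool
	WaitMilliseconds int
	ProxyCountry     string
}

// toURLValues emits only non-zero fields, keyed by their query parameter names.
func (p params) toURLValues() url.Values {
	v := url.Values{}
	if p.JSRender {
		v.Set("js_render", "true")
	}
	if p.WaitMilliseconds != 0 {
		v.Set("wait", strconv.Itoa(p.WaitMilliseconds))
	}
	if p.ProxyCountry != "" {
		v.Set("proxy_country", p.ProxyCountry)
	}
	return v
}

func main() {
	p := params{JSRender: true, WaitMilliseconds: 2500}
	fmt.Println(p.toURLValues().Encode()) // js_render=true&wait=2500
}
```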
func (*RequestParameters) Validate ¶
func (p *RequestParameters) Validate() error
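Validate has no description on this page. The field comments in RequestParameters do state several constraints, and the sketch below checks a few of them on a local stand-in type. Which rules the real method enforces, and the error messages it returns, are not documented here; the `params` type, `validate`, and the messages are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// params is a local stand-in carrying only the fields checked below.
type params struct {
	JSRender          bool
	WaitMilliseconds  int
	ScreenshotQuality int
	SessionID         int
}

// validate checks limits stated in the RequestParameters field comments.
func (p params) validate() error {
	if p.WaitMilliseconds < 0 || p.WaitMilliseconds > 30000 {
		return errors.New("wait must be between 0 and 30000 ms")
	}
	if p.WaitMilliseconds > 0 && !p.JSRender {
		return errors.New("wait requires js_render")
	}
	// Zero means "not set"; any other value must fall in 1..100.
	if p.ScreenshotQuality != 0 && (p.ScreenshotQuality < 1 || p.ScreenshotQuality > 100) {
		return errors.New("screenshot_quality must be between 1 and 100")
	}
	if p.SessionID < 0 || p.SessionID > 99999 {
		return errors.New("session_id must be between 0 and 99999")
	}
	return nil
}

func main() {
	fmt.Println(params{JSRender: true, WaitMilliseconds: 2000}.validate())
	fmt.Println(params{JSRender: true, WaitMilliseconds: 40000}.validate())
}
```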
type ResourceType ¶
type ResourceType string
const (
	ResourceTypeEventSource ResourceType = "eventsource"
	ResourceTypeFetch       ResourceType = "fetch"
	ResourceTypeFont        ResourceType = "font"
	ResourceTypeImage       ResourceType = "image"
	ResourceTypeManifest    ResourceType = "manifest"
	ResourceTypeMedia       ResourceType = "media"
	ResourceTypeOther       ResourceType = "other"
	ResourceTypeScript      ResourceType = "script"
	ResourceTypeStylesheet  ResourceType = "stylesheet"
	ResourceTypeTextTrack   ResourceType = "texttrack"
	ResourceTypeWebSocket   ResourceType = "websocket"
	ResourceTypeXHR         ResourceType = "xhr"
)
type Response ¶
type Response struct {
	// RawResponse is the original `*http.Response` object.
	RawResponse *http.Response
	// contains filtered or unexported fields
}
Response struct holds response values of executed requests.
func (*Response) Body ¶
Body method returns the HTTP response body as a `[]byte` slice for the executed request.
func (*Response) IsError ¶
IsError method returns true if HTTP status `code >= 400` otherwise false.
func (*Response) IsSuccess ¶
IsSuccess method returns true if HTTP status `code >= 200 and <= 299` otherwise false.
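The two predicates are not complements of each other: a 3xx status is neither a success nor an error under the ranges stated above. A small sketch of the same checks on a bare status code, with `isSuccess` and `isError` as local stand-ins for the methods:

```go
package main

import "fmt"

// isSuccess mirrors the documented range: 200 through 299 inclusive.
func isSuccess(code int) bool { return code >= 200 && code <= 299 }

// isError mirrors the documented threshold: 400 and above.
func isError(code int) bool { return code >= 400 }

func main() {
	for _, code := range []int{200, 302, 422} {
		fmt.Println(code, isSuccess(code), isError(code))
	}
}
```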
func (*Response) Problem ¶
Problem method returns the problem description of the HTTP response if any.
func (*Response) ReceivedAt ¶
ReceivedAt method returns the time we received a response from the server for the request.
func (*Response) Status ¶
Status method returns the HTTP status string for the executed request.
Example: 200 OK
func (*Response) StatusCode ¶
StatusCode method returns the HTTP status code for the executed request.
Example: 200
func (*Response) String ¶
String method returns the body of the HTTP response as a `string`. It returns an empty string if it is nil or the body is zero length.
func (*Response) TargetCookies ¶
TargetCookies method returns all the response cookies that the target page has set, if any.
func (*Response) TargetHeaders ¶
TargetHeaders method returns all the response headers that the target page has set, if any. ZenRows Scraper API encodes these headers with a "Z-" prefix, so this method filters out all headers that do not have this prefix.
To get all the headers, see the [Response.Header] method.
type ResponseType ¶
type ResponseType string
ResponseType represents the type of response that the ZenRows Scraper API should return.
const (
	ResponseTypeMarkdown  ResponseType = "markdown"
	ResponseTypePlainText ResponseType = "plaintext"
	ResponseTypePDF       ResponseType = "pdf"
)
type ScreenshotFormat ¶
type ScreenshotFormat string
const (
	ScreenshotFormatPNG  ScreenshotFormat = "png"
	ScreenshotFormatJPEG ScreenshotFormat = "jpeg"
)
Source Files ¶
Directories ¶
Path | Synopsis |
---|---|
cmd | |
examples | |
pkg | |
 | Package version provides a location to set the release versions for all packages to consume, without creating import cycles. |