package html

Version: v0.1.1 (latest) | Published: May 28, 2023 | License: MIT | Imports: 2 | Imported by: 0

Repository

github.com/rudolfoborges/pdf2go

Documentation ¶

Index ¶

type HtmlExtractor
- func NewHtmlExtractor(path string, totalOfPages int) *HtmlExtractor
- func (e *HtmlExtractor) Extract() (string, error)
- func (e *HtmlExtractor) ExtractPage(pageNumber int) (string, error)

Types ¶

type HtmlExtractor ¶

type HtmlExtractor struct {
	// contains filtered or unexported fields
}

func NewHtmlExtractor ¶

func NewHtmlExtractor(path string, totalOfPages int) *HtmlExtractor

NewHtmlExtractor creates a new HtmlExtractor. The path argument is the path to the PDF file. The totalOfPages argument is the total number of pages in the PDF file.

func (*HtmlExtractor) Extract ¶

func (e *HtmlExtractor) Extract() (string, error)

Extract extracts the html from the PDF file. It returns an error if the html cannot be extracted.
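
The error-handling pattern around Extract can be sketched as follows. This is a minimal illustration, not code from the package: the `documentExtractor` interface, the `stubExtractor` type, `errStub`, and the `extractOrFallback` helper are all hypothetical names introduced here so the example runs without a PDF file. Only the `Extract() (string, error)` signature comes from the documentation above.

```go
package main

import (
	"errors"
	"fmt"
)

// documentExtractor mirrors the Extract signature documented above.
// Hypothetical local interface; it is not declared by the package.
type documentExtractor interface {
	Extract() (string, error)
}

// stubExtractor is a stand-in so the sketch runs without a real PDF.
type stubExtractor struct {
	html string
	err  error
}

func (s stubExtractor) Extract() (string, error) { return s.html, s.err }

// errStub is a made-up failure used only by this example.
var errStub = errors.New("conversion failed")

// extractOrFallback returns the extracted HTML. If extraction fails, it
// returns the fallback markup together with the wrapped error.
func extractOrFallback(e documentExtractor, fallback string) (string, error) {
	out, err := e.Extract()
	if err != nil {
		return fallback, fmt.Errorf("extract html: %w", err)
	}
	return out, nil
}

func main() {
	out, err := extractOrFallback(stubExtractor{html: "<p>hello</p>"}, "")
	fmt.Println(out, err)

	out, err = extractOrFallback(stubExtractor{err: errStub}, "<p>unavailable</p>")
	fmt.Println(out, err)
}
```

Because *HtmlExtractor has a method with the same signature, a value returned by NewHtmlExtractor should be usable wherever this sketch takes a `documentExtractor`. The import path needed for that (presumably under github.com/rudolfoborges/pdf2go) is not stated on this page and is left out on purpose.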

func (*HtmlExtractor) ExtractPage ¶

func (e *HtmlExtractor) ExtractPage(pageNumber int) (string, error)

ExtractPage extracts the html from a single page of the PDF file. It returns an error if the html cannot be extracted. The pageNumber argument is the number of the page to extract the html from.
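
A page-by-page loop over ExtractPage might look like the sketch below. Several things here are assumptions rather than documented behavior: that page numbers start at 1, and that a failed page should abort the whole run. The `pageExtractor` interface, the `stubPages` type, and the `extractAllPages` helper are hypothetical names used so the example is self-contained. Only the `ExtractPage(pageNumber int) (string, error)` signature is taken from the documentation above.

```go
package main

import (
	"fmt"
	"strings"
)

// pageExtractor mirrors the ExtractPage signature documented above.
// Hypothetical local interface; it is not declared by the package.
type pageExtractor interface {
	ExtractPage(pageNumber int) (string, error)
}

// stubPages is a stand-in that serves canned HTML per page.
// It assumes 1-based page numbers, which the documentation does not confirm.
type stubPages struct{ pages []string }

func (s stubPages) ExtractPage(pageNumber int) (string, error) {
	if pageNumber < 1 || pageNumber > len(s.pages) {
		return "", fmt.Errorf("page %d out of range", pageNumber)
	}
	return s.pages[pageNumber-1], nil
}

// extractAllPages concatenates the HTML of pages 1..totalOfPages and
// stops at the first page that fails, reporting which page it was.
func extractAllPages(e pageExtractor, totalOfPages int) (string, error) {
	var b strings.Builder
	for n := 1; n <= totalOfPages; n++ {
		page, err := e.ExtractPage(n)
		if err != nil {
			return "", fmt.Errorf("page %d: %w", n, err)
		}
		b.WriteString(page)
	}
	return b.String(), nil
}

func main() {
	doc := stubPages{pages: []string{"<p>one</p>", "<p>two</p>"}}
	html, err := extractAllPages(doc, 2)
	fmt.Println(html, err)
}
```

Wrapping the error with the page number keeps the failing page visible to the caller, which a bare error from ExtractPage may not include.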

Source Files ¶

html_extractor.go
