Documentation ¶
Types ¶
type Extractor ¶
Extractor is an interface that processes incoming HTML and outputs the text it contains, minus all the boilerplate.
type TextBlock ¶
TextBlock represents a text block, which may consist of inline elements.
func GenerateTextBlocks ¶
GenerateTextBlocks takes a reader containing HTML and generates a TextBlock array from it.
func (*TextBlock) LinkDensity ¶
LinkDensity is the number of link text words divided by the total number of words in the block.
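The ratio described for LinkDensity can be sketched as follows. This is a minimal illustration, not the package's implementation: the helper linkDensity, its string parameters, whitespace-based word splitting, and the choice to return 0 for an empty block are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// linkDensity is a hypothetical helper (not part of the documented package).
// It divides the number of words that appear inside links by the total
// number of words in the block. Words are split on whitespace, which is an
// assumption; the real package may tokenize differently.
func linkDensity(linkText, blockText string) float64 {
	total := len(strings.Fields(blockText))
	if total == 0 {
		// Assumption: an empty block is treated as having zero link density,
		// which also avoids dividing by zero.
		return 0
	}
	linked := len(strings.Fields(linkText))
	return float64(linked) / float64(total)
}

func main() {
	// Two of the eight words in this block are link text.
	d := linkDensity("Read more", "Read more about the boilerplate removal approach here")
	fmt.Printf("%.2f\n", d) // prints 0.25
}
```

A value near 1 means the block is mostly link text, which is typical of navigation menus and footers; a value near 0 is more typical of body text.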