core

package
v1.0.0-rc.1 Latest
Published: May 1, 2026 | License: Apache-2.0 | Imports: 18 | Imported by: 0

Documentation

Overview

Package core implements WEB::ARTICLE extraction heuristics.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func WithExtractor

func WithExtractor(ctx context.Context, extractor *Extractor) context.Context

Types

type Extractor

type Extractor struct {
	// contains filtered or unexported fields
}

func ExtractorFromContext

func ExtractorFromContext(ctx context.Context) *Extractor

func NewExtractor

func NewExtractor() *Extractor

func (*Extractor) Extract

func (e *Extractor) Extract(input string) types.Article

Extract returns the best-effort normalized article extracted from raw HTML.

func (*Extractor) ExtractSource

func (e *Extractor) ExtractSource(source Source) types.Article

ExtractSource returns the best-effort normalized article extracted from a normalized source.

type Source

type Source struct {
	SourceURL *url.URL
	TitleHint *string
	HTML      string
}

Source is the normalized article extraction input.
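
Because SourceURL and TitleHint are pointers, a caller has to parse the URL and take the address of the title before filling in a Source. The sketch below shows one way to do that. It redeclares Source locally with the documented fields so that it compiles on its own. The newSource helper is hypothetical and not part of the package, and treating a nil pointer as "not provided" is an assumption based on the field types.

```go
package main

import (
	"fmt"
	"net/url"
)

// Source mirrors the documented fields of core.Source.
type Source struct {
	SourceURL *url.URL
	TitleHint *string
	HTML      string
}

// newSource is a hypothetical helper. Empty rawURL or titleHint arguments
// leave the corresponding pointer nil, which this sketch assumes means
// "not provided".
func newSource(rawURL, titleHint, html string) (Source, error) {
	src := Source{HTML: html}
	if rawURL != "" {
		u, err := url.Parse(rawURL)
		if err != nil {
			return Source{}, fmt.Errorf("parse source URL: %w", err)
		}
		src.SourceURL = u
	}
	if titleHint != "" {
		src.TitleHint = &titleHint
	}
	return src, nil
}

func main() {
	src, err := newSource(
		"https://example.com/post",
		"Example post",
		"<article><h1>Example post</h1></article>",
	)
	if err != nil {
		panic(err)
	}
	fmt.Println(src.SourceURL.Host, *src.TitleHint) // example.com Example post
}
```

With the real package, a value built this way would be passed to (*Extractor).ExtractSource, while Extract covers the case where only the raw HTML string is available.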
