Documentation ¶
Overview ¶
Package documentloaders includes a standard interface for loading documents from a source and implementations of this interface.
Types ¶
type CSV ¶
type CSV struct {
// contains filtered or unexported fields
}
CSV represents a CSV document loader.
func NewCSV ¶
NewCSV creates a new CSV loader with an io.Reader and optional column names for filtering.
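As a rough illustration of column filtering, the sketch below reads CSV data with the standard library and keeps only the named columns. It is not the package's implementation: rowsToTexts and filterCSV are hypothetical helpers, and the "column: value" rendering and the treatment of the first row as a header are assumptions.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"io"
	"strings"
)

// rowsToTexts is a hypothetical helper, not part of documentloaders.
// It treats the first CSV row as a header (an assumption) and renders each
// later row as "column: value" lines. When columns are given, only those
// columns are kept; with no columns, every column is kept.
func rowsToTexts(r io.Reader, columns ...string) ([]string, error) {
	reader := csv.NewReader(r)
	header, err := reader.Read()
	if err != nil {
		return nil, err
	}

	keep := make(map[string]bool, len(columns))
	for _, c := range columns {
		keep[c] = true
	}

	var texts []string
	for {
		row, err := reader.Read()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		var lines []string
		for i, value := range row {
			if len(keep) > 0 && !keep[header[i]] {
				continue
			}
			lines = append(lines, header[i]+": "+value)
		}
		texts = append(texts, strings.Join(lines, "\n"))
	}
	return texts, nil
}

// filterCSV is a convenience wrapper that accepts the CSV data as a string.
func filterCSV(data string, columns ...string) ([]string, error) {
	return rowsToTexts(strings.NewReader(data), columns...)
}

func main() {
	data := "name,age,city\nAda,36,London\nAlan,41,Wilmslow\n"
	texts, err := filterCSV(data, "name", "city")
	if err != nil {
		panic(err)
	}
	for _, t := range texts {
		fmt.Println(t)
		fmt.Println("---")
	}
}
```

Passing no column names keeps every column, which mirrors the "optional" wording above.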
func (CSV) LoadAndSplit ¶
func (c CSV) LoadAndSplit(ctx context.Context, splitter textsplitter.TextSplitter) ([]schema.Document, error)
LoadAndSplit reads text data from the io.Reader and splits it into multiple documents using a text splitter.
type HTML ¶
type HTML struct {
// contains filtered or unexported fields
}
HTML loads, parses, and sanitizes HTML content from an io.Reader.
func (HTML) LoadAndSplit ¶
func (h HTML) LoadAndSplit(ctx context.Context, splitter textsplitter.TextSplitter) ([]schema.Document, error)
LoadAndSplit reads text data from the io.Reader and splits it into multiple documents using a text splitter.
type Loader ¶
type Loader interface {
	// Load loads from a source and returns documents.
	Load(ctx context.Context) ([]schema.Document, error)
	// LoadAndSplit loads from a source and splits the documents using a text splitter.
	LoadAndSplit(ctx context.Context, splitter textsplitter.TextSplitter) ([]schema.Document, error)
}
Loader is the interface for loading and splitting documents from a source.
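The following sketch shows one way a type could satisfy an interface of this shape. To stay self-contained it uses simplified stand-ins: Document and TextSplitter here are assumptions standing in for schema.Document and textsplitter.TextSplitter, and stringLoader, paragraphSplitter, and splitParagraphs are hypothetical names that do not exist in the package.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Document and TextSplitter are simplified stand-ins (assumptions) for
// schema.Document and textsplitter.TextSplitter; the real types may differ.
type Document struct {
	PageContent string
	Metadata    map[string]any
}

type TextSplitter interface {
	SplitText(text string) ([]string, error)
}

// Loader mirrors the interface documented above, using the stand-in types.
type Loader interface {
	Load(ctx context.Context) ([]Document, error)
	LoadAndSplit(ctx context.Context, splitter TextSplitter) ([]Document, error)
}

// paragraphSplitter is a hypothetical splitter that cuts text on blank lines.
type paragraphSplitter struct{}

func (paragraphSplitter) SplitText(text string) ([]string, error) {
	var parts []string
	for _, p := range strings.Split(text, "\n\n") {
		if p = strings.TrimSpace(p); p != "" {
			parts = append(parts, p)
		}
	}
	return parts, nil
}

// stringLoader is a hypothetical Loader backed by an in-memory string.
type stringLoader struct{ text string }

// Compile-time check that stringLoader satisfies Loader.
var _ Loader = stringLoader{}

// Load returns the whole text as a single document.
func (l stringLoader) Load(ctx context.Context) ([]Document, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	return []Document{{PageContent: l.text, Metadata: map[string]any{}}}, nil
}

// LoadAndSplit loads the documents, then emits one document per chunk.
// Chunks share the metadata map of the document they came from.
func (l stringLoader) LoadAndSplit(ctx context.Context, splitter TextSplitter) ([]Document, error) {
	docs, err := l.Load(ctx)
	if err != nil {
		return nil, err
	}
	var out []Document
	for _, d := range docs {
		chunks, err := splitter.SplitText(d.PageContent)
		if err != nil {
			return nil, err
		}
		for _, c := range chunks {
			out = append(out, Document{PageContent: c, Metadata: d.Metadata})
		}
	}
	return out, nil
}

// splitParagraphs loads text and splits it into paragraph documents.
func splitParagraphs(text string) ([]Document, error) {
	return stringLoader{text: text}.LoadAndSplit(context.Background(), paragraphSplitter{})
}

func main() {
	docs, err := splitParagraphs("first paragraph\n\nsecond paragraph")
	if err != nil {
		panic(err)
	}
	for i, d := range docs {
		fmt.Printf("%d: %s\n", i, d.PageContent)
	}
}
```

Building LoadAndSplit on top of Load keeps the splitting logic independent of where the data comes from, which is presumably why every loader listed here exposes the same LoadAndSplit signature.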
type PDF ¶
type PDF struct {
// contains filtered or unexported fields
}
PDF loads text data from a PDF document read through an io.ReaderAt.
func NewPDF ¶
func NewPDF(r io.ReaderAt, size int64, opts ...PDFOptions) PDF
NewPDF creates a new PDF loader with an io.ReaderAt, the size of the data in bytes, and optional PDFOptions.
func (PDF) Load ¶
Load reads the PDF data from the io.ReaderAt and returns the documents. Each document carries metadata with its page number and the total number of pages in the PDF.
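A minimal sketch of attaching that kind of metadata is shown below. It assumes the page text has already been extracted, emits one document per page, and uses the keys "page" and "total_pages" with 1-based numbering; these choices, the stand-in Document type, and the pagesToDocuments helper are all assumptions rather than the package's confirmed behavior.

```go
package main

import "fmt"

// Document is a simplified stand-in (assumption) for schema.Document.
type Document struct {
	PageContent string
	Metadata    map[string]any
}

// pagesToDocuments is a hypothetical helper. It turns already-extracted page
// texts into documents, recording a 1-based page number and the total page
// count under the assumed keys "page" and "total_pages".
func pagesToDocuments(pages []string) []Document {
	docs := make([]Document, 0, len(pages))
	for i, text := range pages {
		docs = append(docs, Document{
			PageContent: text,
			Metadata: map[string]any{
				"page":        i + 1,
				"total_pages": len(pages),
			},
		})
	}
	return docs
}

func main() {
	docs := pagesToDocuments([]string{"Introduction", "Methods", "Results"})
	for _, d := range docs {
		fmt.Printf("page %v of %v: %s\n", d.Metadata["page"], d.Metadata["total_pages"], d.PageContent)
	}
}
```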
func (PDF) LoadAndSplit ¶
func (p PDF) LoadAndSplit(ctx context.Context, splitter textsplitter.TextSplitter) ([]schema.Document, error)
LoadAndSplit reads PDF data from the io.ReaderAt and splits it into multiple documents using a text splitter.
type PDFOptions ¶
type PDFOptions func(pdf *PDF)
PDFOptions are options for the PDF loader.
func WithPassword ¶
func WithPassword(password string) PDFOptions
WithPassword sets the password for the PDF.
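PDFOptions follows the functional options pattern: each option is a function that mutates the loader while it is being constructed. The sketch below shows the pattern in isolation. pdfLoader, option, withPassword, and newPDFLoader are hypothetical stand-ins, and the struct fields are assumptions because the real PDF type's fields are unexported.

```go
package main

import "fmt"

// pdfLoader is a stand-in for PDF. Its fields are assumptions; the real
// struct's fields are unexported and not shown in this documentation.
type pdfLoader struct {
	size     int64
	password string
}

// option mirrors the shape of PDFOptions: a function that configures a loader.
type option func(*pdfLoader)

// withPassword returns an option that records a password on the loader.
func withPassword(password string) option {
	return func(p *pdfLoader) {
		p.password = password
	}
}

// newPDFLoader builds a loader with defaults, then applies options in order.
func newPDFLoader(size int64, opts ...option) pdfLoader {
	p := pdfLoader{size: size}
	for _, opt := range opts {
		opt(&p)
	}
	return p
}

func main() {
	plain := newPDFLoader(1024)
	locked := newPDFLoader(1024, withPassword("secret"))
	fmt.Printf("plain password set: %t\n", plain.password != "")
	fmt.Printf("locked password set: %t\n", locked.password != "")
}
```

Because options are variadic, callers who need no password can omit them entirely, and new options can be added later without changing the constructor's signature.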
type Text ¶
type Text struct {
// contains filtered or unexported fields
}
Text loads text data from an io.Reader.
func (Text) LoadAndSplit ¶
func (l Text) LoadAndSplit(ctx context.Context, splitter textsplitter.TextSplitter) ([]schema.Document, error)
LoadAndSplit reads text data from the io.Reader and splits it into multiple documents using a text splitter.