Documentation ¶
Overview ¶
Package pptx provides PPTX (Office Open XML Presentation) document parsing.
Index ¶
- type ExtractOptions
- type Paragraph
- type Reader
- func (r *Reader) Close() error
- func (r *Reader) Document() (*model.Document, error)
- func (r *Reader) Markdown() (string, error)
- func (r *Reader) MarkdownWithOptions(opts ExtractOptions) (string, error)
- func (r *Reader) MarkdownWithRAGOptions(extractOpts ExtractOptions, mdOpts rag.MarkdownOptions) (string, error)
- func (r *Reader) Metadata() model.Metadata
- func (r *Reader) PageCount() (int, error)
- func (r *Reader) Slide(index int) (*Slide, error)
- func (r *Reader) SlideCount() int
- func (r *Reader) Text() (string, error)
- func (r *Reader) TextWithOptions(opts ExtractOptions) (string, error)
- type Run
- type Slide
- type Table
- type TableCell
- type TextBlock
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type ExtractOptions ¶
type ExtractOptions struct {
    IncludeNotes   bool  // Include speaker notes
    IncludeTitles  bool  // Include slide titles (default: true)
    SlideNumbers   []int // Which slides to include (0-indexed, empty = all)
    ExcludeHeaders bool  // Exclude header placeholders
}
ExtractOptions holds options for text extraction.
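A minimal sketch of how these options might be filled in and how the SlideNumbers rule (0-indexed, empty means all) can be read. The struct is re-declared locally so the example compiles on its own, and wantSlide is a hypothetical helper, not part of the package. The field comment says IncludeTitles defaults to true, but a Go bool in a struct literal is false unless assigned; how the package reconciles this is not shown here, so the sketch sets the field explicitly.

```go
package main

import "fmt"

// ExtractOptions is a local copy of the documented struct so this sketch is
// self-contained; real code would use the package's own type.
type ExtractOptions struct {
	IncludeNotes   bool
	IncludeTitles  bool
	SlideNumbers   []int
	ExcludeHeaders bool
}

// wantSlide is a hypothetical helper (not part of the package). It applies
// the documented SlideNumbers rule: an empty list selects every slide,
// otherwise only the listed 0-based indices are selected.
func wantSlide(opts ExtractOptions, index int) bool {
	if len(opts.SlideNumbers) == 0 {
		return true
	}
	for _, n := range opts.SlideNumbers {
		if n == index {
			return true
		}
	}
	return false
}

func main() {
	opts := ExtractOptions{
		IncludeNotes:  true,
		IncludeTitles: true,        // set explicitly; the zero value is false
		SlideNumbers:  []int{0, 2}, // first and third slides
	}
	for i := 0; i < 4; i++ {
		fmt.Printf("slide %d selected: %v\n", i, wantSlide(opts, i))
	}
}
```

Options built this way would then be passed to TextWithOptions or MarkdownWithOptions on a Reader.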
type Paragraph ¶
type Paragraph struct {
    Text       string
    Level      int    // Bullet/indent level (0 = top level)
    IsBullet   bool   // Has bullet point
    IsNumbered bool   // Is numbered list
    BulletChar string // Bullet character (if custom)
    Alignment  string // l, ctr, r, just
    Runs       []Run  // Text runs with formatting
}
Paragraph represents a paragraph within a text block.
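To illustrate how Level, IsBullet, and IsNumbered could drive list output, the following sketch renders paragraphs as a nested Markdown list. It is not the package's own Markdown logic: renderParagraphs is a hypothetical helper, the two-space indent per level is an assumption, and only the fields the sketch needs are re-declared.

```go
package main

import (
	"fmt"
	"strings"
)

// Paragraph is a trimmed local copy of the documented struct.
type Paragraph struct {
	Text       string
	Level      int
	IsBullet   bool
	IsNumbered bool
}

// renderParagraphs is a hypothetical helper (not part of the package).
// Bulleted and numbered paragraphs become list items indented by two spaces
// per level; all other paragraphs are written as plain lines.
func renderParagraphs(paras []Paragraph) string {
	var b strings.Builder
	for _, p := range paras {
		level := p.Level
		if level < 0 {
			level = 0 // guard: strings.Repeat panics on a negative count
		}
		indent := strings.Repeat("  ", level)
		switch {
		case p.IsNumbered:
			// Markdown renderers renumber ordered lists, so "1." is enough.
			b.WriteString(indent + "1. " + p.Text + "\n")
		case p.IsBullet:
			b.WriteString(indent + "- " + p.Text + "\n")
		default:
			b.WriteString(p.Text + "\n")
		}
	}
	return b.String()
}

func main() {
	fmt.Print(renderParagraphs([]Paragraph{
		{Text: "Goals", IsBullet: true},
		{Text: "Ship v1", Level: 1, IsBullet: true},
		{Text: "Summary"},
	}))
}
```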
type Reader ¶
type Reader struct {
    // contains filtered or unexported fields
}
Reader provides access to PPTX document content.
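This page lists no constructor for Reader, so the sketch below does not show how to open a file. Instead it captures three of the indexed method signatures (SlideCount, Slide, Close) in a local interface and walks the slides through it. The fakeReader type and collectTitles function are hypothetical stand-ins for illustration, and the loop assumes Slide takes a 0-based index, matching the 0-indexed Slide.Index field.

```go
package main

import (
	"errors"
	"fmt"
)

// Slide is a trimmed local copy of the documented Slide type.
type Slide struct {
	Index int
	Title string
}

// slideReader lists three Reader methods with the signatures from the index.
type slideReader interface {
	SlideCount() int
	Slide(index int) (*Slide, error)
	Close() error
}

// fakeReader is a hypothetical in-memory stand-in used only to make the
// sketch runnable; it is not how the real Reader is implemented.
type fakeReader struct {
	slides []Slide
	closed bool
}

func (f *fakeReader) SlideCount() int { return len(f.slides) }

func (f *fakeReader) Slide(index int) (*Slide, error) {
	if index < 0 || index >= len(f.slides) {
		return nil, errors.New("slide index out of range")
	}
	return &f.slides[index], nil
}

func (f *fakeReader) Close() error {
	f.closed = true
	return nil
}

// collectTitles walks every slide in order and returns the titles.
func collectTitles(r slideReader) ([]string, error) {
	titles := make([]string, 0, r.SlideCount())
	for i := 0; i < r.SlideCount(); i++ {
		s, err := r.Slide(i)
		if err != nil {
			return nil, fmt.Errorf("slide %d: %w", i, err)
		}
		titles = append(titles, s.Title)
	}
	return titles, nil
}

func main() {
	r := &fakeReader{slides: []Slide{
		{Index: 0, Title: "Intro"},
		{Index: 1, Title: "Roadmap"},
	}}
	defer r.Close() // release the reader when done

	titles, err := collectTitles(r)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(titles)
}
```

Accepting an interface rather than the concrete type keeps code like collectTitles testable without a real .pptx file.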
func (*Reader) MarkdownWithOptions ¶
func (r *Reader) MarkdownWithOptions(opts ExtractOptions) (string, error)
MarkdownWithOptions returns the presentation content as Markdown, applying the given ExtractOptions.
func (*Reader) MarkdownWithRAGOptions ¶
func (r *Reader) MarkdownWithRAGOptions(extractOpts ExtractOptions, mdOpts rag.MarkdownOptions) (string, error)
MarkdownWithRAGOptions returns the presentation content as Markdown, applying both the given ExtractOptions and the given rag.MarkdownOptions.
func (*Reader) SlideCount ¶
func (r *Reader) SlideCount() int
SlideCount returns the number of slides.
func (*Reader) TextWithOptions ¶
func (r *Reader) TextWithOptions(opts ExtractOptions) (string, error)
TextWithOptions extracts text content with the specified options.
type Slide ¶
type Slide struct {
    Index   int         // 0-indexed slide number
    Title   string      // Slide title (from title placeholder)
    Content []TextBlock // Text content in reading order
    Tables  []Table     // Tables on the slide
    Notes   string      // Speaker notes
}
Slide represents a parsed slide.
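As a rough illustration of how the Slide fields fit together, this sketch flattens one slide into plain text: title first, then content blocks in their stored order, then optional notes. slideText is a hypothetical helper, not the package's Text implementation. Skipping blocks flagged IsTitle assumes the title block may also appear in Content, which this page does not confirm.

```go
package main

import (
	"fmt"
	"strings"
)

// TextBlock and Slide are trimmed local copies of the documented structs.
type TextBlock struct {
	Text    string
	IsTitle bool
}

type Slide struct {
	Index   int
	Title   string
	Content []TextBlock
	Notes   string
}

// slideText is a hypothetical helper (not part of the package). It joins the
// title, the non-title content blocks, and optionally the speaker notes.
func slideText(s Slide, includeNotes bool) string {
	var parts []string
	if s.Title != "" {
		parts = append(parts, s.Title)
	}
	for _, tb := range s.Content {
		// Assumption: a block flagged IsTitle repeats s.Title, so skip it.
		if tb.IsTitle || tb.Text == "" {
			continue
		}
		parts = append(parts, tb.Text)
	}
	if includeNotes && s.Notes != "" {
		parts = append(parts, "Notes: "+s.Notes)
	}
	return strings.Join(parts, "\n")
}

func main() {
	s := Slide{
		Index: 0,
		Title: "Roadmap",
		Content: []TextBlock{
			{Text: "Roadmap", IsTitle: true},
			{Text: "Q1: parser"},
			{Text: "Q2: exporter"},
		},
		Notes: "Mention the beta date.",
	}
	fmt.Println(slideText(s, true))
}
```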
func (*Slide) GetMarkdown ¶
GetMarkdown returns the slide content as markdown.
type Table ¶
type Table struct {
    Rows    [][]TableCell
    Columns int
    X, Y    int // Position in EMUs
    Width   int // Width in EMUs
    Height  int // Height in EMUs
}
Table represents a table on a slide.
func (*Table) ToMarkdown ¶
ToMarkdown converts a table to markdown format.
type TableCell ¶
type TableCell struct {
    Text     string
    RowSpan  int
    ColSpan  int
    IsMerged bool // Part of a merged cell (not the origin)
}
TableCell represents a cell in a table.
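The sketch below shows one way Rows, Columns, and IsMerged could map onto a Markdown pipe table. It is an illustration only and may differ from what ToMarkdown actually produces. tableToMarkdown is a hypothetical function. Treating the first row as the header and rendering merged (non-origin) cells as empty are assumptions, since Markdown tables have no cell spans.

```go
package main

import (
	"fmt"
	"strings"
)

// TableCell and Table are trimmed local copies of the documented structs.
type TableCell struct {
	Text     string
	RowSpan  int
	ColSpan  int
	IsMerged bool
}

type Table struct {
	Rows    [][]TableCell
	Columns int
}

// tableToMarkdown is a hypothetical helper (not the package's ToMarkdown).
// The first row is used as the header; short rows are padded with empty
// cells, and merged non-origin cells are emitted as empty cells.
func tableToMarkdown(t Table) string {
	if len(t.Rows) == 0 || t.Columns <= 0 {
		return ""
	}
	var b strings.Builder
	for i, row := range t.Rows {
		b.WriteString("|")
		for c := 0; c < t.Columns; c++ {
			text := ""
			if c < len(row) && !row[c].IsMerged {
				// Escape pipes and flatten newlines so the row stays intact.
				text = strings.ReplaceAll(row[c].Text, "|", `\|`)
				text = strings.ReplaceAll(text, "\n", " ")
			}
			b.WriteString(" " + text + " |")
		}
		b.WriteString("\n")
		if i == 0 {
			// Separator line required after the header row.
			b.WriteString("|" + strings.Repeat(" --- |", t.Columns) + "\n")
		}
	}
	return b.String()
}

func main() {
	t := Table{
		Columns: 2,
		Rows: [][]TableCell{
			{{Text: "Name"}, {Text: "Qty"}},
			{{Text: "Total", ColSpan: 2}, {IsMerged: true}},
		},
	}
	fmt.Print(tableToMarkdown(t))
}
```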
type TextBlock ¶
type TextBlock struct {
    Text        string
    Paragraphs  []Paragraph
    IsTitle     bool   // Is this the slide title?
    IsSubtitle  bool   // Is this a subtitle?
    Placeholder string // Placeholder type (title, body, etc.)
    X, Y        int    // Position in EMUs
    Width       int    // Width in EMUs
    Height      int    // Height in EMUs
}
TextBlock represents a block of text on a slide.
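Positions and sizes are given in EMUs (English Metric Units), the Office Open XML length unit: 914400 EMUs per inch and 12700 EMUs per point. A small sketch of converting them to more familiar units follows. The helper names emuToInches and emuToPoints are hypothetical and not part of the package, and the sample coordinates are made up for illustration.

```go
package main

import "fmt"

// Office Open XML defines 914400 EMUs per inch and 12700 EMUs per point.
const (
	emusPerInch  = 914400
	emusPerPoint = 12700
)

// emuToInches and emuToPoints are hypothetical helpers, not package API.
func emuToInches(emu int) float64 { return float64(emu) / emusPerInch }

func emuToPoints(emu int) float64 { return float64(emu) / emusPerPoint }

// TextBlock is a trimmed local copy of the documented struct.
type TextBlock struct {
	Text          string
	X, Y          int
	Width, Height int
}

func main() {
	// Illustrative values only; real values come from the parsed slide.
	tb := TextBlock{Text: "Title", X: 457200, Y: 914400, Width: 8229600, Height: 1143000}
	fmt.Printf("%q at (%.2fin, %.2fin), size %.0fpt x %.0fpt\n",
		tb.Text,
		emuToInches(tb.X), emuToInches(tb.Y),
		emuToPoints(tb.Width), emuToPoints(tb.Height))
}
```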