Documentation ¶
Overview ¶
Package xlsx provides XLSX (Office Open XML Spreadsheet) document parsing.
Index ¶
- func CellRef(col, row int) string
- func ColumnToIndex(col string) int
- func IndexToColumn(index int) string
- func ParseCellRef(ref string) (col, row int, err error)
- func ParseRangeRef(ref string) (startCol, startRow, endCol, endRow int, err error)
- type Cell
- type CellType
- type ExtractOptions
- type MergedRegion
- type ParsedTable
- type Reader
- func (r *Reader) Close() error
- func (r *Reader) Document() (*model.Document, error)
- func (r *Reader) Markdown() (string, error)
- func (r *Reader) MarkdownWithOptions(opts ExtractOptions) (string, error)
- func (r *Reader) MarkdownWithRAGOptions(extractOpts ExtractOptions, mdOpts rag.MarkdownOptions) (string, error)
- func (r *Reader) Metadata() model.Metadata
- func (r *Reader) PageCount() (int, error)
- func (r *Reader) Sheet(index int) (*Sheet, error)
- func (r *Reader) SheetByName(name string) (*Sheet, error)
- func (r *Reader) SheetCount() int
- func (r *Reader) SheetNames() []string
- func (r *Reader) Tables() []ParsedTable
- func (r *Reader) Text() (string, error)
- func (r *Reader) TextWithOptions(opts ExtractOptions) (string, error)
- type Sheet
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ColumnToIndex ¶
func ColumnToIndex(col string) int
ColumnToIndex converts one or more column letters to a 0-indexed column number: A=0, B=1, ..., Z=25, AA=26, AB=27, and so on.
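The letter-to-index mapping can be read as bijective base 26, where A through Z act as the digits 1 through 26. The sketch below is a standalone illustration of that arithmetic, not the package's own code. The name columnToIndex and the choice to return -1 for empty or non-letter input are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// columnToIndex is a hypothetical stand-in for ColumnToIndex.
// It treats each letter as a bijective base-26 digit (A=1 ... Z=26)
// and subtracts one at the end to produce a 0-indexed column number.
// Returning -1 for invalid input is an assumption, not documented behavior.
func columnToIndex(col string) int {
	if col == "" {
		return -1
	}
	n := 0
	for _, r := range strings.ToUpper(col) {
		if r < 'A' || r > 'Z' {
			return -1
		}
		n = n*26 + int(r-'A') + 1
	}
	return n - 1
}

func main() {
	for _, c := range []string{"A", "Z", "AA", "AB"} {
		fmt.Printf("%s=%d\n", c, columnToIndex(c))
	}
}
```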
func IndexToColumn ¶
func IndexToColumn(index int) string
IndexToColumn converts a 0-indexed column number to its column letter(s): 0=A, 1=B, ..., 25=Z, 26=AA, 27=AB, and so on.
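Going from an index back to letters needs care because the letter system has no zero digit. A minimal sketch of one common approach follows. It is an illustration under stated assumptions, not the package's implementation: the name indexToColumn and the empty-string result for negative input are both hypothetical.

```go
package main

import "fmt"

// indexToColumn is a hypothetical stand-in for IndexToColumn.
// It shifts the 0-indexed value to 1-indexed, then repeatedly peels off
// the lowest bijective base-26 digit and prepends the matching letter.
// Returning "" for a negative index is an assumption.
func indexToColumn(index int) string {
	if index < 0 {
		return ""
	}
	var letters []byte
	for n := index + 1; n > 0; n = (n - 1) / 26 {
		letters = append([]byte{byte('A' + (n-1)%26)}, letters...)
	}
	return string(letters)
}

func main() {
	for _, i := range []int{0, 25, 26, 27, 701, 702} {
		fmt.Printf("%d=%s\n", i, indexToColumn(i))
	}
}
```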
func ParseCellRef ¶
func ParseCellRef(ref string) (col, row int, err error)
ParseCellRef parses a cell reference such as "A1" or "AA100" into 0-indexed column and row indices.
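To make the 0-indexed result concrete, here is a rough sketch of how such a reference could be split into its letter and digit parts. It is not the package's parser. The name parseCellRef, the error messages, and the decision to reject lowercase letters and absolute references such as "$A$1" are assumptions for the example.

```go
package main

import "fmt"

// parseCellRef is a hypothetical stand-in for ParseCellRef.
// It reads leading uppercase letters as the column and the remaining
// digits as the 1-based row, then converts both to 0-indexed values.
func parseCellRef(ref string) (col, row int, err error) {
	i := 0
	for ; i < len(ref) && ref[i] >= 'A' && ref[i] <= 'Z'; i++ {
		col = col*26 + int(ref[i]-'A') + 1
	}
	// There must be at least one letter and at least one digit.
	if i == 0 || i == len(ref) {
		return 0, 0, fmt.Errorf("invalid cell reference %q", ref)
	}
	for ; i < len(ref); i++ {
		if ref[i] < '0' || ref[i] > '9' {
			return 0, 0, fmt.Errorf("invalid cell reference %q", ref)
		}
		row = row*10 + int(ref[i]-'0')
	}
	if row == 0 {
		return 0, 0, fmt.Errorf("row must be at least 1 in %q", ref)
	}
	return col - 1, row - 1, nil
}

func main() {
	for _, ref := range []string{"A1", "AA100", "7B"} {
		col, row, err := parseCellRef(ref)
		fmt.Println(ref, col, row, err)
	}
}
```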
func ParseRangeRef ¶
func ParseRangeRef(ref string) (startCol, startRow, endCol, endRow int, err error)
ParseRangeRef parses a range reference such as "A1:D10" into start and end coordinates.
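A range reference is two cell references joined by a colon, so parsing can be sketched as a split followed by two cell parses. The code below is an illustration only. The names parseRangeRef and parseCell are hypothetical, and it assumes the returned coordinates are 0-indexed like those of ParseCellRef, which the description above does not state explicitly.

```go
package main

import (
	"fmt"
	"strings"
)

// parseCell is a hypothetical helper: it parses "A1"-style references
// into 0-indexed column and row values and reports whether ref was valid.
func parseCell(ref string) (col, row int, ok bool) {
	i := 0
	for ; i < len(ref) && ref[i] >= 'A' && ref[i] <= 'Z'; i++ {
		col = col*26 + int(ref[i]-'A') + 1
	}
	if i == 0 || i == len(ref) {
		return 0, 0, false
	}
	for ; i < len(ref); i++ {
		if ref[i] < '0' || ref[i] > '9' {
			return 0, 0, false
		}
		row = row*10 + int(ref[i]-'0')
	}
	return col - 1, row - 1, row > 0
}

// parseRangeRef is a hypothetical stand-in for ParseRangeRef.
// It splits on the first ':' and parses each side as a cell reference.
func parseRangeRef(ref string) (startCol, startRow, endCol, endRow int, err error) {
	start, end, found := strings.Cut(ref, ":")
	if !found {
		return 0, 0, 0, 0, fmt.Errorf("range %q has no ':' separator", ref)
	}
	var okStart, okEnd bool
	startCol, startRow, okStart = parseCell(start)
	endCol, endRow, okEnd = parseCell(end)
	if !okStart || !okEnd {
		return 0, 0, 0, 0, fmt.Errorf("invalid range reference %q", ref)
	}
	return startCol, startRow, endCol, endRow, nil
}

func main() {
	fmt.Println(parseRangeRef("A1:D10"))
}
```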
Types ¶
type Cell ¶
type Cell struct {
Value string // The cell's display value
RawValue string // The raw value from XML
Type CellType // The type of data
Row int // 0-indexed row
Col int // 0-indexed column
StyleIndex int // Index into styles
Formula string // Formula if present
// Merge information
IsMerged bool // Is this cell part of a merged region?
IsMergeRoot bool // Is this the top-left cell of a merged region?
MergeRows int // Number of rows in merge (1 = no merge)
MergeCols int // Number of columns in merge (1 = no merge)
}
Cell represents a cell in a worksheet.
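The merge fields let a caller tell the top-left cell of a merged region apart from the cells it covers. As a hedged illustration, the sketch below declares a trimmed local copy of Cell (only the fields it needs) and a hypothetical helper, visibleValues, that keeps a merged region's value at its root and blanks the covered cells. Whether covered cells carry a Value in the real package is not stated here, so the helper does not rely on it.

```go
package main

import "fmt"

// Cell is a trimmed local copy of the documented struct, declared here
// only so the example compiles on its own.
type Cell struct {
	Value       string
	IsMerged    bool
	IsMergeRoot bool
	MergeRows   int
	MergeCols   int
}

// visibleValues is a hypothetical helper. It returns one string per cell,
// emitting "" for cells that are covered by a merge but are not its root.
func visibleValues(row []Cell) []string {
	out := make([]string, 0, len(row))
	for _, c := range row {
		if c.IsMerged && !c.IsMergeRoot {
			out = append(out, "")
			continue
		}
		out = append(out, c.Value)
	}
	return out
}

func main() {
	row := []Cell{
		{Value: "Title", IsMerged: true, IsMergeRoot: true, MergeRows: 1, MergeCols: 2},
		{Value: "Title", IsMerged: true, MergeRows: 1, MergeCols: 2},
		{Value: "x", MergeRows: 1, MergeCols: 1},
	}
	fmt.Printf("%q\n", visibleValues(row))
}
```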
type CellType ¶
type CellType int
CellType represents the type of data in a cell.
const (
// CellTypeString indicates a string value.
CellTypeString CellType = iota
// CellTypeNumber indicates a numeric value.
CellTypeNumber
// CellTypeBoolean indicates a boolean value.
CellTypeBoolean
// CellTypeFormula indicates a formula.
CellTypeFormula
// CellTypeError indicates an error value.
CellTypeError
// CellTypeEmpty indicates an empty cell.
CellTypeEmpty
)
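Because the constants are declared with iota, they take the values 0 through 5 in declaration order. The sketch below redeclares them locally and adds a String method for readable output. The method is an assumption for illustration: nothing above indicates that the package's CellType implements fmt.Stringer, and the label strings are invented for the example.

```go
package main

import "fmt"

// CellType and its constants are redeclared locally, in the documented
// order, so the example compiles on its own.
type CellType int

const (
	CellTypeString CellType = iota
	CellTypeNumber
	CellTypeBoolean
	CellTypeFormula
	CellTypeError
	CellTypeEmpty
)

// String is a hypothetical addition that maps each constant to a label.
// Out-of-range values fall back to a numeric form.
func (t CellType) String() string {
	names := [...]string{"string", "number", "boolean", "formula", "error", "empty"}
	if t < 0 || int(t) >= len(names) {
		return fmt.Sprintf("CellType(%d)", int(t))
	}
	return names[t]
}

func main() {
	for t := CellTypeString; t <= CellTypeEmpty; t++ {
		fmt.Printf("%d %s\n", int(t), t)
	}
}
```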
type ExtractOptions ¶
type ExtractOptions struct {
Sheets []int // Which sheets to include (0-indexed, empty = all)
IncludeHeaders bool // Include sheet names as headers
Delimiter string // Cell delimiter (default: tab)
ExcludeHeaders bool // For compatibility with other formats
}
ExtractOptions holds options for text extraction.
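The field comments describe two defaults: an empty Sheets slice selects every sheet, and an empty Delimiter means a tab. The sketch below shows how an extractor might honor those defaults. It uses a local copy of the struct, and the helpers wantSheet and joinRow are hypothetical names, not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// ExtractOptions is a local copy of the documented struct so the
// example compiles on its own.
type ExtractOptions struct {
	Sheets         []int
	IncludeHeaders bool
	Delimiter      string
	ExcludeHeaders bool
}

// wantSheet is a hypothetical helper: an empty Sheets slice selects all
// sheets, otherwise only the listed 0-indexed sheets are selected.
func wantSheet(opts ExtractOptions, index int) bool {
	if len(opts.Sheets) == 0 {
		return true
	}
	for _, s := range opts.Sheets {
		if s == index {
			return true
		}
	}
	return false
}

// joinRow is a hypothetical helper that joins cell values with the
// configured delimiter, falling back to a tab when none is set.
func joinRow(opts ExtractOptions, values []string) string {
	delim := opts.Delimiter
	if delim == "" {
		delim = "\t"
	}
	return strings.Join(values, delim)
}

func main() {
	opts := ExtractOptions{Sheets: []int{0, 2}, Delimiter: ", "}
	fmt.Println(wantSheet(opts, 1), wantSheet(opts, 2))
	fmt.Println(joinRow(opts, []string{"a", "b", "c"}))
}
```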
type MergedRegion ¶
MergedRegion represents a merged cell region.
type ParsedTable ¶
ParsedTable represents a table extracted from a sheet.
func (ParsedTable) ToMarkdown ¶
func (t ParsedTable) ToMarkdown() string
ToMarkdown converts a ParsedTable to markdown.
func (ParsedTable) ToText ¶
func (t ParsedTable) ToText() string
ToText converts a ParsedTable to text.
type Reader ¶
type Reader struct {
// contains filtered or unexported fields
}
Reader provides access to XLSX document content.
func (*Reader) MarkdownWithOptions ¶
func (r *Reader) MarkdownWithOptions(opts ExtractOptions) (string, error)
MarkdownWithOptions returns workbook content as Markdown with options.
func (*Reader) MarkdownWithRAGOptions ¶
func (r *Reader) MarkdownWithRAGOptions(extractOpts ExtractOptions, mdOpts rag.MarkdownOptions) (string, error)
MarkdownWithRAGOptions returns workbook content as Markdown with RAG options.
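func (*Reader) SheetByName ¶
func (r *Reader) SheetByName(name string) (*Sheet, error)
SheetByName returns the sheet with the given name.
func (*Reader) SheetCount ¶
func (r *Reader) SheetCount() int
SheetCount returns the number of sheets in the workbook.
func (*Reader) SheetNames ¶
func (r *Reader) SheetNames() []string
SheetNames returns the names of all sheets.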
func (*Reader) Tables ¶
func (r *Reader) Tables() []ParsedTable
Tables returns every sheet as a ParsedTable (for compatibility).
func (*Reader) TextWithOptions ¶
func (r *Reader) TextWithOptions(opts ExtractOptions) (string, error)
TextWithOptions extracts text content with the specified options.
type Sheet ¶
type Sheet struct {
Name string
Index int
Rows [][]Cell
MaxRow int // Maximum row index (0-indexed)
MaxCol int // Maximum column index (0-indexed)
// Merged cell regions
MergedRegions []MergedRegion
}
Sheet represents a worksheet in the workbook.
func (*Sheet) Cell ¶
Cell returns the cell at the given row and column (0-indexed). Returns nil if the cell doesn't exist.
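Since a missing cell yields nil, callers should check the result before dereferencing it. The sketch below shows one way such a bounds-checked lookup could work over Rows. It uses trimmed local copies of Sheet and Cell, the function name cellAt is hypothetical, and it assumes Rows is indexed directly by row and then column, which the documentation above does not confirm.

```go
package main

import "fmt"

// Cell and Sheet are trimmed local copies of the documented structs,
// declared here only so the example compiles on its own.
type Cell struct {
	Value string
}

type Sheet struct {
	Name string
	Rows [][]Cell
}

// cellAt is a hypothetical stand-in for (*Sheet).Cell. It returns nil
// when either index falls outside the stored rows, which may be ragged.
func cellAt(s *Sheet, row, col int) *Cell {
	if row < 0 || row >= len(s.Rows) {
		return nil
	}
	if col < 0 || col >= len(s.Rows[row]) {
		return nil
	}
	return &s.Rows[row][col]
}

func main() {
	s := &Sheet{Name: "Data", Rows: [][]Cell{
		{{Value: "a"}, {Value: "b"}},
		{{Value: "c"}},
	}}
	// Always nil-check before reading fields from the result.
	if c := cellAt(s, 0, 1); c != nil {
		fmt.Println(c.Value)
	}
	fmt.Println(cellAt(s, 1, 1) == nil)
}
```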
func (*Sheet) CellByRef ¶
CellByRef returns the cell at the given reference (e.g., "A1"). Returns nil if the cell doesn't exist.