xlsx

package
v1.6.0 Latest
Warning

This package is not in the latest version of its module.

Published: Jan 28, 2026 License: MIT Imports: 8 Imported by: 0

Documentation

Overview

Package xlsx provides XLSX (Office Open XML Spreadsheet) document parsing.



Functions

func CellRef

func CellRef(col, row int) string

CellRef creates a cell reference string from column and row indices (0-indexed).

func ColumnToIndex

func ColumnToIndex(col string) int

ColumnToIndex converts column letter(s) to a 0-indexed column number. A=0, B=1, ..., Z=25, AA=26, AB=27, etc.
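
The mapping is a bijective base-26 conversion: each letter counts as 1 through 26, and the total is shifted down by one to make it 0-indexed. A minimal sketch of that arithmetic follows. The name columnToIndex is a hypothetical stand-in, not the package's implementation, and returning -1 for invalid input is an assumption, since the documentation does not say how bad input is handled.

```go
package main

import "fmt"

// columnToIndex is a hypothetical re-implementation of the documented mapping
// (A=0, ..., Z=25, AA=26). Returning -1 for empty or non-letter input is an
// assumption made for this sketch.
func columnToIndex(col string) int {
	if col == "" {
		return -1
	}
	n := 0
	for i := 0; i < len(col); i++ {
		c := col[i]
		if c >= 'a' && c <= 'z' {
			c -= 'a' - 'A' // accept lowercase letters as well (assumption)
		}
		if c < 'A' || c > 'Z' {
			return -1
		}
		// Bijective base 26: each letter contributes 1..26, not 0..25.
		n = n*26 + int(c-'A') + 1
	}
	return n - 1 // shift to a 0-indexed result
}

func main() {
	for _, col := range []string{"A", "Z", "AA", "AB"} {
		fmt.Println(col, columnToIndex(col))
	}
}
```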

func IndexToColumn

func IndexToColumn(index int) string

IndexToColumn converts a 0-indexed column number to column letter(s). 0=A, 1=B, ..., 25=Z, 26=AA, 27=AB, etc.
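
The inverse direction can be sketched the same way: add one to move into the 1..26 digit range, peel off the least significant letter, and repeat. The name indexToColumn is hypothetical, and returning an empty string for a negative index is an assumption rather than documented behavior.

```go
package main

import "fmt"

// indexToColumn is a hypothetical re-implementation of the documented mapping
// (0=A, ..., 25=Z, 26=AA). An empty result for negative input is an assumption.
func indexToColumn(index int) string {
	if index < 0 {
		return ""
	}
	var buf []byte
	// Work in 1-based terms so that 26 maps to "Z" instead of rolling over.
	for n := index + 1; n > 0; n = (n - 1) / 26 {
		buf = append(buf, byte('A'+(n-1)%26))
	}
	// Letters were produced least significant first, so reverse them.
	for i, j := 0, len(buf)-1; i < j; i, j = i+1, j-1 {
		buf[i], buf[j] = buf[j], buf[i]
	}
	return string(buf)
}

func main() {
	for _, i := range []int{0, 25, 26, 27, 701, 702} {
		fmt.Println(i, indexToColumn(i))
	}
}
```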

func ParseCellRef

func ParseCellRef(ref string) (col, row int, err error)

ParseCellRef parses a cell reference like "A1" or "AA100" into column and row indices (0-indexed).

func ParseRangeRef

func ParseRangeRef(ref string) (startCol, startRow, endCol, endRow int, err error)

ParseRangeRef parses a range reference like "A1:D10" into start and end coordinates.
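
Both parsing functions can be sketched together, since a range is two cell references joined by a colon. The helpers parseCellRef and parseRangeRef below are hypothetical stand-ins that follow the documented signatures. Rejecting lowercase letters and "$" markers is an assumption, as is the wording of the errors.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseCellRef splits a reference such as "AA100" into its letter and digit
// parts and returns 0-indexed coordinates. Hypothetical sketch only.
func parseCellRef(ref string) (col, row int, err error) {
	i := 0
	for i < len(ref) && ref[i] >= 'A' && ref[i] <= 'Z' {
		col = col*26 + int(ref[i]-'A') + 1
		i++
	}
	if i == 0 || i == len(ref) {
		return 0, 0, fmt.Errorf("invalid cell reference %q", ref)
	}
	// ParseUint rejects signs, so "A-1" and "A+1" are errors.
	n, perr := strconv.ParseUint(ref[i:], 10, 31)
	if perr != nil || n == 0 {
		return 0, 0, fmt.Errorf("invalid row in cell reference %q", ref)
	}
	return col - 1, int(n) - 1, nil
}

// parseRangeRef parses "A1:D10" into start and end coordinates.
func parseRangeRef(ref string) (startCol, startRow, endCol, endRow int, err error) {
	start, end, ok := strings.Cut(ref, ":")
	if !ok {
		return 0, 0, 0, 0, fmt.Errorf("invalid range reference %q", ref)
	}
	if startCol, startRow, err = parseCellRef(start); err != nil {
		return 0, 0, 0, 0, err
	}
	if endCol, endRow, err = parseCellRef(end); err != nil {
		return 0, 0, 0, 0, err
	}
	return startCol, startRow, endCol, endRow, nil
}

func main() {
	fmt.Println(parseCellRef("AA100"))
	fmt.Println(parseRangeRef("A1:D10"))
}
```

Under these assumptions "AA100" yields column 26 and row 99, and "A1:D10" yields (0, 0) through (3, 9).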

Types

type Cell

type Cell struct {
	Value      string   // The cell's display value
	RawValue   string   // The raw value from XML
	Type       CellType // The type of data
	Row        int      // 0-indexed row
	Col        int      // 0-indexed column
	StyleIndex int      // Index into styles
	Formula    string   // Formula if present

	// Merge information
	IsMerged    bool // Is this cell part of a merged region?
	IsMergeRoot bool // Is this the top-left cell of a merged region?
	MergeRows   int  // Number of rows in merge (1 = no merge)
	MergeCols   int  // Number of columns in merge (1 = no merge)
}

Cell represents a cell in a worksheet.

func (*Cell) IsEmpty

func (c *Cell) IsEmpty() bool

IsEmpty returns true if the cell has no value.

type CellType

type CellType int

CellType represents the type of data in a cell.

const (
	// CellTypeString indicates a string value.
	CellTypeString CellType = iota
	// CellTypeNumber indicates a numeric value.
	CellTypeNumber
	// CellTypeBoolean indicates a boolean value.
	CellTypeBoolean
	// CellTypeFormula indicates a formula.
	CellTypeFormula
	// CellTypeError indicates an error value.
	CellTypeError
	// CellTypeEmpty indicates an empty cell.
	CellTypeEmpty
)

func (CellType) String

func (t CellType) String() string

String returns the string representation of the cell type.
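
Because the constants are declared with iota, CellTypeString is 0 and CellTypeEmpty is 5. The sketch below mirrors that declaration and shows one common way to write such a String method. The label strings ("string", "number", and so on) and the fallback format are assumptions; the text actually returned by the package is not documented here.

```go
package main

import "fmt"

// CellType mirrors the documented declaration for illustration.
type CellType int

const (
	CellTypeString CellType = iota
	CellTypeNumber
	CellTypeBoolean
	CellTypeFormula
	CellTypeError
	CellTypeEmpty
)

// cellTypeNames holds assumed labels, indexed by constant value.
var cellTypeNames = [...]string{"string", "number", "boolean", "formula", "error", "empty"}

// String returns the assumed label, or a numeric fallback for unknown values.
func (t CellType) String() string {
	if t < 0 || int(t) >= len(cellTypeNames) {
		return fmt.Sprintf("CellType(%d)", int(t))
	}
	return cellTypeNames[t]
}

func main() {
	fmt.Println(int(CellTypeEmpty), CellTypeEmpty)
	fmt.Println(CellType(9))
}
```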

type ExtractOptions

type ExtractOptions struct {
	Sheets         []int  // Which sheets to include (0-indexed, empty = all)
	IncludeHeaders bool   // Include sheet names as headers
	Delimiter      string // Cell delimiter (default: tab)
	ExcludeHeaders bool   // For compatibility with other formats
	ExcludeFooters bool   // For compatibility with other formats
}

ExtractOptions holds options for text extraction.
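
The field comments describe two defaults: an empty Sheets slice means every sheet, and an empty Delimiter means a tab. A minimal sketch of how such defaults could be resolved follows. The extractOptions type mirrors only two of the documented fields, and selectSheets and joinRow are hypothetical helpers; skipping out-of-range sheet indices is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// extractOptions mirrors two fields of the documented ExtractOptions.
type extractOptions struct {
	Sheets    []int
	Delimiter string
}

// selectSheets returns the sheet indices to process out of total sheets.
// An empty selection means all sheets; invalid indices are skipped (assumption).
func selectSheets(total int, sheets []int) []int {
	var out []int
	if len(sheets) == 0 {
		for i := 0; i < total; i++ {
			out = append(out, i)
		}
		return out
	}
	for _, i := range sheets {
		if i >= 0 && i < total {
			out = append(out, i)
		}
	}
	return out
}

// joinRow joins cell values with the configured delimiter, defaulting to a tab.
func joinRow(cells []string, opts extractOptions) string {
	delim := opts.Delimiter
	if delim == "" {
		delim = "\t"
	}
	return strings.Join(cells, delim)
}

func main() {
	opts := extractOptions{Delimiter: ","}
	fmt.Println(selectSheets(3, opts.Sheets))
	fmt.Println(joinRow([]string{"Name", "Qty"}, opts))
}
```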

type MergedRegion

type MergedRegion struct {
	StartRow int
	StartCol int
	EndRow   int
	EndCol   int
}

MergedRegion represents a merged cell region.
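
The merge fields on Cell (IsMerged, IsMergeRoot, MergeRows, MergeCols) can plausibly be derived from a region like this one. The sketch below shows that derivation under the assumption that all four bounds are 0-indexed and inclusive, which the documentation does not state. The mergedRegion type and its methods are hypothetical.

```go
package main

import "fmt"

// mergedRegion mirrors the documented MergedRegion fields.
type mergedRegion struct {
	StartRow, StartCol, EndRow, EndCol int
}

// contains reports whether the cell lies inside the region (inclusive bounds).
func (m mergedRegion) contains(row, col int) bool {
	return row >= m.StartRow && row <= m.EndRow &&
		col >= m.StartCol && col <= m.EndCol
}

// isRoot reports whether the cell is the top-left cell of the region.
func (m mergedRegion) isRoot(row, col int) bool {
	return row == m.StartRow && col == m.StartCol
}

// span returns the number of rows and columns the region covers.
func (m mergedRegion) span() (rows, cols int) {
	return m.EndRow - m.StartRow + 1, m.EndCol - m.StartCol + 1
}

func main() {
	// Corresponds to the range "A1:B3" under the inclusive assumption.
	region := mergedRegion{StartRow: 0, StartCol: 0, EndRow: 2, EndCol: 1}
	rows, cols := region.span()
	fmt.Println(rows, cols)
	fmt.Println(region.contains(1, 1), region.isRoot(1, 1))
}
```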

type ParsedTable

type ParsedTable struct {
	Name    string
	Rows    [][]string
	Headers []string
}

ParsedTable represents a table extracted from a sheet.

func (ParsedTable) ToMarkdown

func (t ParsedTable) ToMarkdown() string

ToMarkdown converts a ParsedTable to markdown.

func (ParsedTable) ToText

func (t ParsedTable) ToText() string

ToText converts a ParsedTable to text.

type Reader

type Reader struct {
	// contains filtered or unexported fields
}

Reader provides access to XLSX document content.

func Open

func Open(filename string) (*Reader, error)

Open opens an XLSX file for reading.

func (*Reader) Close

func (r *Reader) Close() error

Close releases resources associated with the Reader.

func (*Reader) Document

func (r *Reader) Document() (*model.Document, error)

Document returns a model.Document representation of the XLSX content.

func (*Reader) Markdown

func (r *Reader) Markdown() (string, error)

Markdown returns the workbook content as Markdown.

func (*Reader) MarkdownWithOptions

func (r *Reader) MarkdownWithOptions(opts ExtractOptions) (string, error)

MarkdownWithOptions returns workbook content as Markdown with options.

func (*Reader) MarkdownWithRAGOptions

func (r *Reader) MarkdownWithRAGOptions(extractOpts ExtractOptions, mdOpts rag.MarkdownOptions) (string, error)

MarkdownWithRAGOptions returns workbook content as Markdown with RAG options.

func (*Reader) Metadata

func (r *Reader) Metadata() model.Metadata

Metadata returns document metadata.

func (*Reader) PageCount

func (r *Reader) PageCount() (int, error)

PageCount returns the number of "pages" (sheets) in the workbook.

func (*Reader) Sheet

func (r *Reader) Sheet(index int) (*Sheet, error)

Sheet returns the sheet at the given index (0-indexed).

func (*Reader) SheetByName

func (r *Reader) SheetByName(name string) (*Sheet, error)

SheetByName returns the sheet with the given name.

func (*Reader) SheetCount

func (r *Reader) SheetCount() int

SheetCount returns the number of sheets in the workbook.

func (*Reader) SheetNames

func (r *Reader) SheetNames() []string

SheetNames returns the names of all sheets.
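
Code that only needs to enumerate sheets can depend on a small interface instead of the concrete Reader. The sketch below declares such an interface from the two documented method signatures and exercises it with a fake workbook. The names sheetLister, describeSheets, and fakeWorkbook are hypothetical, and the sketch assumes that the order of SheetNames matches the indices accepted by Sheet.

```go
package main

import "fmt"

// sheetLister captures the two documented Reader methods used here.
type sheetLister interface {
	SheetCount() int
	SheetNames() []string
}

// describeSheets returns one line per sheet, labelled with its 0-based index.
func describeSheets(r sheetLister) []string {
	out := make([]string, 0, r.SheetCount())
	for i, name := range r.SheetNames() {
		out = append(out, fmt.Sprintf("[%d] %s", i, name))
	}
	return out
}

// fakeWorkbook is a stand-in for a Reader so the sketch runs without a file.
type fakeWorkbook []string

func (f fakeWorkbook) SheetCount() int      { return len(f) }
func (f fakeWorkbook) SheetNames() []string { return []string(f) }

func main() {
	for _, line := range describeSheets(fakeWorkbook{"Summary", "Data"}) {
		fmt.Println(line)
	}
}
```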

func (*Reader) Tables

func (r *Reader) Tables() []ParsedTable

Tables returns all sheets as ParsedTable values (for compatibility).

func (*Reader) Text

func (r *Reader) Text() (string, error)

Text extracts and returns all text content from the workbook.

func (*Reader) TextWithOptions

func (r *Reader) TextWithOptions(opts ExtractOptions) (string, error)

TextWithOptions extracts text content with the specified options.

type Sheet

type Sheet struct {
	Name   string
	Index  int
	Rows   [][]Cell
	MaxRow int // Maximum row index (0-indexed)
	MaxCol int // Maximum column index (0-indexed)

	// Merged cell regions
	MergedRegions []MergedRegion
}

Sheet represents a worksheet in the workbook.

func (*Sheet) Cell

func (s *Sheet) Cell(row, col int) *Cell

Cell returns the cell at the given row and column (0-indexed). Returns nil if the cell doesn't exist.
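
Because the result may be nil, callers need a nil check before reading any field. The sketch below shows a bounds-checked lookup over a Rows slice and the corresponding check at the call site. It assumes Rows is indexed directly by row and then by column, which the documentation does not confirm; the cell and sheet types and the cellAt method are hypothetical mirrors of the documented ones.

```go
package main

import "fmt"

// cell and sheet mirror a subset of the documented Cell and Sheet fields.
type cell struct {
	Value string
}

type sheet struct {
	Rows [][]cell
}

// cellAt returns a pointer to the cell, or nil when row or col is out of range.
func (s *sheet) cellAt(row, col int) *cell {
	if row < 0 || row >= len(s.Rows) {
		return nil
	}
	if col < 0 || col >= len(s.Rows[row]) {
		return nil
	}
	return &s.Rows[row][col]
}

func main() {
	s := &sheet{Rows: [][]cell{{{Value: "Name"}, {Value: "Qty"}}}}
	if c := s.cellAt(0, 1); c != nil {
		fmt.Println("found:", c.Value)
	}
	if c := s.cellAt(5, 0); c == nil {
		fmt.Println("no cell at row 5")
	}
}
```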

func (*Sheet) CellByRef

func (s *Sheet) CellByRef(ref string) *Cell

CellByRef returns the cell at the given reference (e.g., "A1"). Returns nil if the cell doesn't exist.

func (*Sheet) ColCount

func (s *Sheet) ColCount() int

ColCount returns the maximum number of columns in any row.

func (*Sheet) RowCount

func (s *Sheet) RowCount() int

RowCount returns the number of rows in the sheet.
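
The two counts differ in kind: the row count is a length, while the column count is a maximum over rows that may have different lengths. A minimal sketch over a plain slice of rows follows. The functions rowCount and colCount are hypothetical, and whether the package derives these values from Rows or from MaxRow and MaxCol is not documented.

```go
package main

import "fmt"

// rowCount returns the number of rows.
func rowCount(rows [][]string) int {
	return len(rows)
}

// colCount returns the length of the widest row, or 0 when there are no rows.
func colCount(rows [][]string) int {
	widest := 0
	for _, r := range rows {
		if len(r) > widest {
			widest = len(r)
		}
	}
	return widest
}

func main() {
	rows := [][]string{{"a"}, {"b", "c", "d"}, {}}
	fmt.Println(rowCount(rows), colCount(rows))
}
```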
