pandoc

Version: v0.2.0
Published: Nov 23, 2025 | License: MIT | Imports: 4 | Imported by: 1

README

go-pandoc

A small library that uses Pandoc to parse Markdown files into an intermediate AST.

The resulting AST is somewhat easier to work with than Pandoc's native AST.

Documentation

Overview

Package pandoc provides Go types and functions for working with Pandoc's JSON AST format. It supports parsing Pandoc JSON output and provides structured access to document elements.

Constants

const (
	Emph        = InlineFmt(iota) // Emphasis (italic)
	Underline                     // Underlined text
	Strong                        // Strong emphasis (bold)
	Strikeout                     // Strikethrough text
	Superscript                   // Superscript text
	Subscript                     // Subscript text
	SmallCaps                     // Small capitals
)

Variables

This section is empty.

Functions

This section is empty.

Types

type Attr

type Attr struct {
	Identifier string    // Element identifier for cross-referencing
	Classes    []string  // CSS-style class names
	KeyVals    []*KeyVal // Additional key-value attributes
}

Attr represents Pandoc attributes with an identifier, classes, and key-value pairs. This is used for blocks and inline elements that support attributes.

func (*Attr) HasClass

func (a *Attr) HasClass(s string) bool

HasClass checks if the attributes contain the specified class name.

func (*Attr) KeyValMap

func (a *Attr) KeyValMap() map[string]string

KeyValMap converts the key-value pairs to a map for convenient lookup.
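
To make the two helpers concrete, the sketch below mirrors Attr and KeyVal locally and gives one plausible implementation of each method. The method bodies are assumptions written from the descriptions above, not the package's actual source. In particular, how the real KeyValMap treats duplicate keys is not documented; in this sketch the last value wins.

```go
package main

import "fmt"

// KeyVal and Attr are local mirrors of the documented types.
type KeyVal struct {
	Key string
	Val string
}

type Attr struct {
	Identifier string
	Classes    []string
	KeyVals    []*KeyVal
}

// HasClass reports whether s appears in a.Classes (assumed behavior).
func (a *Attr) HasClass(s string) bool {
	for _, c := range a.Classes {
		if c == s {
			return true
		}
	}
	return false
}

// KeyValMap copies the key-value pairs into a map (assumed behavior;
// on duplicate keys the last value wins in this sketch).
func (a *Attr) KeyValMap() map[string]string {
	m := make(map[string]string, len(a.KeyVals))
	for _, kv := range a.KeyVals {
		if kv != nil {
			m[kv.Key] = kv.Val
		}
	}
	return m
}

func main() {
	a := &Attr{
		Identifier: "fig-1",
		Classes:    []string{"wide", "dark"},
		KeyVals:    []*KeyVal{{Key: "width", Val: "80%"}},
	}
	fmt.Println(a.HasClass("wide"), a.HasClass("narrow"))
	fmt.Println(a.KeyValMap()["width"])
}
```

A map is convenient when several keys are looked up on the same element; for a single lookup, scanning KeyVals directly avoids the allocation.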

type Block

type Block interface {
}

Block is the interface implemented by all Pandoc block-level elements. Block elements include paragraphs, headers, lists, tables, and other structural components of a document.

type BlockList

type BlockList []Block

BlockList is a slice of Block elements.

type BlockQuote

type BlockQuote struct {
	Blocks BlockList
}

BlockQuote represents a block quotation.

type BulletList

type BulletList struct {
	Items []BlockList
}

type Cell

type Cell struct {
	Attr      Attr
	Alignment string
	RowSpan   int
	ColSpan   int
	Blocks    BlockList
}

type Cite

type Cite struct {
	Id      string
	Prefix  InlineList
	Suffix  InlineList
	Mode    string // AuthorInText, SuppressAuthor, NormalCitation
	NoteNum int
	Hash    int
	Content InlineList
}

type Code

type Code struct {
	Attr Attr
	Text string
}

type CodeBlock

type CodeBlock struct {
	Attr Attr   // Attributes including language class
	Text string // Code content
}

CodeBlock represents a code block with optional attributes and language specification.
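
Pandoc records the language of a fenced code block as a class on the block's attributes. A minimal sketch of reading it back follows, assuming the language is the first entry in Classes. The Language helper is hypothetical and not part of this package, and the types are local mirrors of the documented ones.

```go
package main

import "fmt"

// Local mirrors of the documented types.
type KeyVal struct {
	Key string
	Val string
}

type Attr struct {
	Identifier string
	Classes    []string
	KeyVals    []*KeyVal
}

type CodeBlock struct {
	Attr Attr
	Text string
}

// Language is a hypothetical helper: it returns the first class of the
// block, or "" when the block has no classes. Treating the first class
// as the language is an assumption of this sketch.
func Language(cb *CodeBlock) string {
	if len(cb.Attr.Classes) == 0 {
		return ""
	}
	return cb.Attr.Classes[0]
}

func main() {
	cb := &CodeBlock{
		Attr: Attr{Classes: []string{"go", "numberLines"}},
		Text: "fmt.Println(\"hi\")",
	}
	fmt.Println(Language(cb))
}
```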

type ColSpec

type ColSpec struct {
	Alignment string
	ColWidth  float32
}

ColSpec specifies alignment and width for a table column.

type DefinitionItem

type DefinitionItem struct {
	Term        InlineList
	Definitions []BlockList
}

type DefinitionList

type DefinitionList struct {
	Items []*DefinitionItem
}

type Div

type Div struct {
	Attr   Attr
	Blocks BlockList
}

type Document

type Document struct {
	PandocApiVersion json.RawMessage        `json:"pandoc-api-version"` // Pandoc API version
	Meta             map[string]interface{} `json:"meta"`               // Document metadata
	Blocks           []interface{}          `json:"blocks"`             // Top-level blocks
}

Document represents a Pandoc JSON document with API version, metadata, and blocks.

func NewDocument

func NewDocument(buf []byte) (*Document, error)

NewDocument parses Pandoc JSON output into a Document structure. The buf parameter should contain JSON output from `pandoc -t json`.
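
Because Document carries json tags, the expected input shape can be illustrated with the standard library alone. The sketch below mirrors Document locally and decodes a hand-written sample in the format `pandoc -t json` emits. The decode function is a hypothetical stand-in; the real NewDocument may validate or do more than this.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Document is a local mirror of the documented struct.
type Document struct {
	PandocApiVersion json.RawMessage        `json:"pandoc-api-version"`
	Meta             map[string]interface{} `json:"meta"`
	Blocks           []interface{}          `json:"blocks"`
}

// decode is a hypothetical stand-in for NewDocument: it only unmarshals.
func decode(buf []byte) (*Document, error) {
	var d Document
	if err := json.Unmarshal(buf, &d); err != nil {
		return nil, err
	}
	return &d, nil
}

// sample is a hand-written document with a single paragraph.
const sample = `{"pandoc-api-version":[1,23,1],"meta":{},"blocks":[{"t":"Para","c":[{"t":"Str","c":"Hello"}]}]}`

func main() {
	d, err := decode([]byte(sample))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println("api version:", string(d.PandocApiVersion))
	fmt.Println("top-level blocks:", len(d.Blocks))
}
```

In a real program the bytes would typically come from running `pandoc -t json` on an input file and capturing its standard output.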

func (*Document) Flow

func (d *Document) Flow() (bb BlockList, e error)

Flow parses and returns the document's block content as a BlockList. This extracts the main document flow, converting raw JSON blocks into typed Block elements.
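
Since Block and Inline are empty interfaces, consumers of a BlockList typically dispatch with a type switch. The sketch below builds a small list by hand and collects its headings. The types are trimmed local mirrors (Attr fields omitted), the outline and text helpers are hypothetical, and storing elements as pointers is an assumption; the package may return values instead.

```go
package main

import (
	"fmt"
	"strings"
)

// Trimmed local mirrors of the documented types.
type Inline interface{}
type InlineList []Inline
type Block interface{}
type BlockList []Block

type Str struct{ Text string }
type Space struct{}
type Para struct{ Inlines InlineList }
type Header struct {
	Level   int
	Inlines InlineList
}
type BlockQuote struct{ Blocks BlockList }

// text flattens Str and Space inlines into a string; other inlines are skipped.
func text(il InlineList) string {
	var sb strings.Builder
	for _, in := range il {
		switch v := in.(type) {
		case *Str:
			sb.WriteString(v.Text)
		case *Space:
			sb.WriteByte(' ')
		}
	}
	return sb.String()
}

// outline collects "level title" strings, descending into block quotes.
func outline(bb BlockList) []string {
	var out []string
	for _, b := range bb {
		switch v := b.(type) {
		case *Header:
			out = append(out, fmt.Sprintf("%d %s", v.Level, text(v.Inlines)))
		case *BlockQuote:
			out = append(out, outline(v.Blocks)...)
		}
	}
	return out
}

func main() {
	bb := BlockList{
		&Header{Level: 1, Inlines: InlineList{&Str{Text: "Intro"}}},
		&Para{Inlines: InlineList{&Str{Text: "Body"}}},
		&BlockQuote{Blocks: BlockList{
			&Header{Level: 2, Inlines: InlineList{&Str{Text: "Quoted"}, &Space{}, &Str{Text: "heading"}}},
		}},
	}
	for _, h := range outline(bb) {
		fmt.Println(h)
	}
}
```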

func (*Document) ParseMeta

func (d *Document) ParseMeta() map[string]string

ParseMeta extracts metadata from the document as a map of strings. It converts MetaInlines metadata values to plain text strings.
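
In Pandoc's JSON, a metadata value written as inline text is a tagged object with "t" set to "MetaInlines" and "c" holding a list of inline elements. The sketch below shows one way such values could be flattened to plain strings. The metaToStrings and metaFromJSON helpers are hypothetical, and the handling is deliberately narrow (only Str, Space, and SoftBreak); the real ParseMeta may cover more cases.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// metaToStrings is a hypothetical helper: it keeps only MetaInlines values
// and joins their Str elements, turning Space and SoftBreak into a space.
func metaToStrings(meta map[string]interface{}) map[string]string {
	out := make(map[string]string)
	for key, raw := range meta {
		node, ok := raw.(map[string]interface{})
		if !ok {
			continue
		}
		if tag, _ := node["t"].(string); tag != "MetaInlines" {
			continue
		}
		inlines, _ := node["c"].([]interface{})
		var sb strings.Builder
		for _, in := range inlines {
			el, ok := in.(map[string]interface{})
			if !ok {
				continue
			}
			tag, _ := el["t"].(string)
			switch tag {
			case "Str":
				if s, ok := el["c"].(string); ok {
					sb.WriteString(s)
				}
			case "Space", "SoftBreak":
				sb.WriteByte(' ')
			}
		}
		out[key] = sb.String()
	}
	return out
}

// metaFromJSON decodes a raw "meta" object and flattens it.
func metaFromJSON(buf []byte) (map[string]string, error) {
	var meta map[string]interface{}
	if err := json.Unmarshal(buf, &meta); err != nil {
		return nil, err
	}
	return metaToStrings(meta), nil
}

// sampleMeta is a hand-written "meta" object with one inline and one boolean value.
const sampleMeta = `{"title":{"t":"MetaInlines","c":[{"t":"Str","c":"My"},{"t":"Space"},{"t":"Str","c":"Title"}]},"draft":{"t":"MetaBool","c":true}}`

func main() {
	m, err := metaFromJSON([]byte(sampleMeta))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(m["title"])
}
```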

type Figure added in v0.1.0

type Figure struct {
	Attr         Attr
	ShortCaption InlineList
	Caption      BlockList
	Blocks       BlockList
}

type Formatted

type Formatted struct {
	Fmt     InlineFmt
	Content InlineList
}

type Header

type Header struct {
	Level   int        // Heading level (1-6)
	Attr    Attr       // Heading attributes
	Inlines InlineList // Heading text
}

Header represents a heading with level, attributes, and inline content.

type HorizontalRule

type HorizontalRule struct {
}

HorizontalRule represents a horizontal rule (thematic break).

type Image

type Image struct {
	Attr    Attr
	Content InlineList
	Target  Target
}

type Inline

type Inline interface {
}

Inline is the interface implemented by all Pandoc inline elements. Inline elements include text, formatting, links, images, and other content that appears within block elements.

type InlineFmt

type InlineFmt int

InlineFmt represents inline formatting types (emphasis, strong, etc.).

func (InlineFmt) String

func (f InlineFmt) String() string

String returns the string representation of the inline format type.
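
The constants are declared with iota, so Emph is 0 and SmallCaps is 6. The sketch below reproduces the declaration locally and pairs it with one possible String implementation. The returned names are an assumption; the strings produced by the package's own method are not documented here.

```go
package main

import "fmt"

// InlineFmt and its constants mirror the documented declaration.
type InlineFmt int

const (
	Emph = InlineFmt(iota)
	Underline
	Strong
	Strikeout
	Superscript
	Subscript
	SmallCaps
)

// names holds the assumed string form of each constant, indexed by value.
var names = [...]string{
	"Emph", "Underline", "Strong", "Strikeout",
	"Superscript", "Subscript", "SmallCaps",
}

// String returns the constant's name, or a numeric form for unknown values.
func (f InlineFmt) String() string {
	if f < 0 || int(f) >= len(names) {
		return fmt.Sprintf("InlineFmt(%d)", int(f))
	}
	return names[f]
}

func main() {
	fmt.Println(Emph, Strong, SmallCaps, InlineFmt(42))
}
```

Implementing fmt.Stringer this way means the value prints readably with fmt's default verbs, which helps when debugging a parsed tree.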

type InlineList

type InlineList []Inline

InlineList is a slice of Inline elements.

type KeyVal

type KeyVal struct {
	Key string
	Val string
}

KeyVal represents a key-value pair in Pandoc attributes.

type LineBlock

type LineBlock struct {
	Lines []InlineList
}

LineBlock represents a block of lines where line breaks are preserved.

type LineBreak

type LineBreak struct {
}

type Link

type Link struct {
	Attr    Attr
	Content InlineList
	Target  Target
}

type Math

type Math struct {
	Type string // DisplayMath, InlineMath
	Text string
}

type Note

type Note struct {
	Blocks BlockList
}

type Null

type Null struct {
}

Null represents a null block element (placeholder).

type OrderedList

type OrderedList struct {
	StartNumber int
	NumberStyle string
	NumberDelim string
	Items       []BlockList
}

OrderedList represents a numbered list.
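
As an illustration of how the three numbering fields might be used, the sketch below computes a marker for each item. It assumes NumberDelim holds Pandoc's constructor names ("Period", "OneParen", "TwoParens") as plain strings, and it ignores NumberStyle by always printing decimals. The labels helper is hypothetical.

```go
package main

import "fmt"

// Local mirrors of the documented types.
type Block interface{}
type BlockList []Block

type OrderedList struct {
	StartNumber int
	NumberStyle string
	NumberDelim string
	Items       []BlockList
}

// labels is a hypothetical helper that returns one marker per item,
// counting up from StartNumber. NumberStyle is ignored in this sketch.
func labels(ol *OrderedList) []string {
	out := make([]string, 0, len(ol.Items))
	for i := range ol.Items {
		n := ol.StartNumber + i
		switch ol.NumberDelim {
		case "OneParen":
			out = append(out, fmt.Sprintf("%d)", n))
		case "TwoParens":
			out = append(out, fmt.Sprintf("(%d)", n))
		default: // "Period", "DefaultDelim", or anything unrecognized
			out = append(out, fmt.Sprintf("%d.", n))
		}
	}
	return out
}

func main() {
	ol := &OrderedList{
		StartNumber: 3,
		NumberStyle: "Decimal",
		NumberDelim: "OneParen",
		Items:       []BlockList{{}, {}},
	}
	fmt.Println(labels(ol))
}
```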

type Para

type Para struct {
	Inlines InlineList
}

Para represents a paragraph block element.

type Plain

type Plain struct {
	Inlines InlineList
}

Plain represents a plain text block without paragraph formatting.

type Quoted

type Quoted struct {
	QuoteType string // SingleQuote or DoubleQuote
	Content   InlineList
}

type RawBlock

type RawBlock struct {
	Format string
	Text   string
}

RawBlock represents a raw block in a specific format (e.g., HTML, LaTeX).

type RawInline

type RawInline struct {
	Format string // Text
	Text   string
}

type Row

type Row struct {
	Attr  Attr
	Cells []*Cell
}

type SoftBreak

type SoftBreak struct {
}

type Space

type Space struct {
}

type Span

type Span struct {
	Attr    Attr
	Content InlineList
}

type Str

type Str struct {
	Text string
}

Str represents a text string.

type TC

type TC struct {
	T string      `json:"t"` // Type tag
	C interface{} `json:"c"` // Content
}

TC represents a tagged container with type (T) and content (C) fields. This is Pandoc's standard format for encoding AST elements in JSON.
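
For example, Pandoc encodes the word "Hello" as {"t":"Str","c":"Hello"} and a space as {"t":"Space"}, with no "c" field. The sketch below decodes such elements into a local mirror of TC and dispatches on the tag; the describe helper is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TC is a local mirror of the documented struct.
type TC struct {
	T string      `json:"t"`
	C interface{} `json:"c"`
}

// describe is a hypothetical helper that decodes one element and
// summarizes it based on its type tag.
func describe(raw []byte) (string, error) {
	var tc TC
	if err := json.Unmarshal(raw, &tc); err != nil {
		return "", err
	}
	switch tc.T {
	case "Str":
		s, _ := tc.C.(string) // content of Str is a JSON string
		return fmt.Sprintf("Str(%q)", s), nil
	case "Space":
		return "Space", nil // no content; tc.C stays nil
	default:
		return "unhandled: " + tc.T, nil
	}
}

func main() {
	inputs := []string{
		`{"t":"Str","c":"Hello"}`,
		`{"t":"Space"}`,
		`{"t":"Para","c":[]}`,
	}
	for _, in := range inputs {
		out, err := describe([]byte(in))
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(out)
	}
}
```

Because C is an interface{}, its dynamic type depends on the tag: a string for Str, a slice for most containers, and nil for elements without content.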

type Table

type Table struct {
	Attr         Attr            // Table attributes
	ShortCaption InlineList      // Short caption for list of tables
	Caption      BlockList       // Full caption blocks
	ColSpecs     []*ColSpec      // Column specifications
	Head         TableHeadOrFoot // Table header
	Bodies       []*TableBody    // Table bodies
	Foot         TableHeadOrFoot // Table footer
}

Table represents a table with caption, column specifications, header, body, and footer.

type TableBody

type TableBody struct {
	Attr           Attr
	RowHeadColumns int
	Rows1          []*Row
	Rows2          []*Row
}

type TableHeadOrFoot

type TableHeadOrFoot struct {
	Attr Attr
	Rows []*Row
}

TableHeadOrFoot represents a table header or footer section.
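
A table's rows are spread over the head, each body, and the foot, so code that needs every row has to visit all three. The sketch below gathers them in field order using trimmed local mirrors (Attr, captions, and cell contents omitted). The allRows helper is hypothetical. The roles of Rows1 and Rows2 are not documented here; in Pandoc's own table model a body has intermediate header rows followed by ordinary rows, which is probably what the two slices correspond to.

```go
package main

import "fmt"

// Trimmed local mirrors of the documented table types.
type Cell struct {
	RowSpan int
	ColSpan int
}

type Row struct{ Cells []*Cell }

type TableHeadOrFoot struct{ Rows []*Row }

type TableBody struct {
	RowHeadColumns int
	Rows1          []*Row
	Rows2          []*Row
}

type Table struct {
	Head   TableHeadOrFoot
	Bodies []*TableBody
	Foot   TableHeadOrFoot
}

// allRows is a hypothetical helper that returns head rows, then each
// body's Rows1 followed by Rows2, then foot rows.
func allRows(tbl *Table) []*Row {
	rows := append([]*Row{}, tbl.Head.Rows...)
	for _, b := range tbl.Bodies {
		if b == nil {
			continue
		}
		rows = append(rows, b.Rows1...)
		rows = append(rows, b.Rows2...)
	}
	return append(rows, tbl.Foot.Rows...)
}

func main() {
	tbl := &Table{
		Head: TableHeadOrFoot{Rows: []*Row{
			{Cells: []*Cell{{RowSpan: 1, ColSpan: 1}, {RowSpan: 1, ColSpan: 1}}},
		}},
		Bodies: []*TableBody{
			{Rows2: []*Row{{Cells: []*Cell{{RowSpan: 1, ColSpan: 2}}}}},
		},
	}
	for i, r := range allRows(tbl) {
		fmt.Println("row", i, "cells", len(r.Cells))
	}
}
```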

type Target

type Target struct {
	URL   string
	Title string
}
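
Target is shared by Link and Image, and links can be nested inside formatting or spans. As a sketch of working with these types, the code below walks an InlineList recursively and collects link URLs. The types are trimmed local mirrors (Attr omitted), the linkURLs helper is hypothetical, and pointer elements are an assumption of the sketch.

```go
package main

import "fmt"

// Trimmed local mirrors of the documented inline types.
type Inline interface{}
type InlineList []Inline
type InlineFmt int

type Str struct{ Text string }

type Target struct {
	URL   string
	Title string
}

type Link struct {
	Content InlineList
	Target  Target
}

type Span struct{ Content InlineList }

type Formatted struct {
	Fmt     InlineFmt
	Content InlineList
}

// linkURLs is a hypothetical helper that returns the URL of every Link
// in il, descending into Link, Span, and Formatted content.
func linkURLs(il InlineList) []string {
	var urls []string
	for _, in := range il {
		switch v := in.(type) {
		case *Link:
			urls = append(urls, v.Target.URL)
			urls = append(urls, linkURLs(v.Content)...)
		case *Span:
			urls = append(urls, linkURLs(v.Content)...)
		case *Formatted:
			urls = append(urls, linkURLs(v.Content)...)
		}
	}
	return urls
}

func main() {
	il := InlineList{
		&Str{Text: "See"},
		&Formatted{Content: InlineList{
			&Link{
				Content: InlineList{&Str{Text: "docs"}},
				Target:  Target{URL: "https://example.com/docs", Title: "Docs"},
			},
		}},
		&Link{Target: Target{URL: "https://example.com/home"}},
	}
	for _, u := range linkURLs(il) {
		fmt.Println(u)
	}
}
```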
