docssed

package
v0.30.0
Warning

This package is not in the latest version of its module.

Published: Jun 22, 2026 License: MIT Imports: 14 Imported by: 0

Documentation

Overview

Package docssed parses sed-style Google Docs mutation programs.

Constants

const IndentNotSet = -1

IndentNotSet is the sentinel value for BraceExpression.Indent meaning "not specified". Any non-negative value means the indent level was explicitly set.

Functions

func BraceExpressionHasAnyFormat

func BraceExpressionHasAnyFormat(expr *BraceExpression) bool

BraceExpressionHasAnyFormat reports whether the expression sets any formatting. It is used to filter out empty or no-op expressions.

func CanUseNativeReplacement

func CanUseNativeReplacement(replacement string) bool

CanUseNativeReplacement reports whether Docs native find/replace can preserve replacement semantics.

func DetectBraceReference

func DetectBraceReference(pattern string) (string, *TableReference, *ImageReference, error)

DetectBraceReference parses a leading {T=...} or {img=...} reference.

func HasBraceFormatting

func HasBraceFormatting(replacement string) bool

HasBraceFormatting reports whether the replacement contains brace formatting. It is used to decide whether the fast-path native replacement can be used.

func NthFlag

func NthFlag(raw string) int

NthFlag returns the positive occurrence selector from substitution flags.
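
The flag syntax is not spelled out here, so the following is only a sketch of what an occurrence selector could look like. It assumes sed conventions, where a run of digits in the flags (the "2" in "2i") selects the Nth match, and it assumes 0 means "no selector". The name nthFlag is a hypothetical stand-in, not the package's implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nthFlag is a hypothetical stand-in for NthFlag.
// Assumption: the first run of digits in the flags is the occurrence
// selector, and 0 is returned when no positive selector is present.
func nthFlag(raw string) int {
	start := strings.IndexAny(raw, "0123456789")
	if start < 0 {
		return 0
	}
	end := start
	for end < len(raw) && raw[end] >= '0' && raw[end] <= '9' {
		end++
	}
	n, err := strconv.Atoi(raw[start:end])
	if err != nil || n <= 0 {
		return 0
	}
	return n
}

func main() {
	for _, flags := range []string{"g", "2", "3i", "0"} {
		fmt.Printf("%q -> %d\n", flags, nthFlag(flags))
	}
}
```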

func ParseExcelReference

func ParseExcelReference(value string) (row, column int, ok bool)

ParseExcelReference parses an Excel-style cell reference such as A1 or AA10.
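
As an illustration of the A1/AA10 notation, here is a minimal sketch of such a parser. It assumes one-based rows and columns (so A1 is row 1, column 1); whether the package returns one-based or zero-based values is not stated above. The name parseExcelRef is hypothetical.

```go
package main

import "fmt"

// parseExcelRef is a hypothetical stand-in for ParseExcelReference.
// Assumption: rows and columns are one-based, so "A1" is (1, 1).
func parseExcelRef(value string) (row, column int, ok bool) {
	i := 0
	// Column letters form a bijective base-26 number: A=1 ... Z=26, AA=27.
	for i < len(value) {
		c := value[i]
		if c >= 'a' && c <= 'z' {
			c -= 'a' - 'A' // fold to upper case
		}
		if c < 'A' || c > 'Z' {
			break
		}
		column = column*26 + int(c-'A') + 1
		i++
	}
	if i == 0 || i == len(value) {
		return 0, 0, false // no letters, or no digits after the letters
	}
	for ; i < len(value); i++ {
		c := value[i]
		if c < '0' || c > '9' {
			return 0, 0, false
		}
		row = row*10 + int(c-'0')
	}
	if row == 0 {
		return 0, 0, false
	}
	return row, column, true
}

func main() {
	for _, ref := range []string{"A1", "AA10", "7B"} {
		row, col, ok := parseExcelRef(ref)
		fmt.Println(ref, row, col, ok)
	}
}
```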

Types

type Address

type Address struct {
	Start    int
	End      int
	HasRange bool
}

Address targets one paragraph or an inclusive paragraph range.

func ParseAddress

func ParseAddress(raw string) (*Address, string, error)

ParseAddress removes an optional paragraph address prefix.

type AddressElement

type AddressElement struct {
	Number     int
	Kind       AddressElementKind
	Text       string
	StartIndex int64
	EndIndex   int64
}

AddressElement is one addressable top-level document element.

func ResolveAddress

func ResolveAddress(address *Address, elements []AddressElement) ([]AddressElement, error)

ResolveAddress resolves a one-based address against numbered document elements.
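
To make the one-based, inclusive addressing concrete, the sketch below resolves an address against a numbered element list. The local types mirror only a few of the documented fields, and the error behavior (rejecting out-of-range addresses) is an assumption rather than the package's confirmed behavior.

```go
package main

import (
	"errors"
	"fmt"
)

// Local mirrors of the documented structs, trimmed to the fields used here.
type Address struct {
	Start    int
	End      int
	HasRange bool
}

type AddressElement struct {
	Number int
	Text   string
}

// resolve is a hypothetical sketch of ResolveAddress.
// Assumptions: Number is one-based, ranges are inclusive, a nil address
// selects everything, and an out-of-range address is an error.
func resolve(address *Address, elements []AddressElement) ([]AddressElement, error) {
	if address == nil {
		return elements, nil
	}
	end := address.Start
	if address.HasRange {
		end = address.End
	}
	if address.Start < 1 || end < address.Start || end > len(elements) {
		return nil, errors.New("address out of range")
	}
	var targets []AddressElement
	for _, el := range elements {
		if el.Number >= address.Start && el.Number <= end {
			targets = append(targets, el)
		}
	}
	return targets, nil
}

func main() {
	elements := []AddressElement{{1, "Title"}, {2, "Intro"}, {3, "Body"}}
	targets, err := resolve(&Address{Start: 2, End: 3, HasRange: true}, elements)
	fmt.Println(targets, err)
}
```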

type AddressElementKind

type AddressElementKind string

AddressElementKind identifies one numbered top-level document element.

const (
	AddressParagraph AddressElementKind = "paragraph"
	AddressTable     AddressElementKind = "table"
	AddressTOC       AddressElementKind = "toc"
)

type AddressMutation

type AddressMutation struct {
	StartIndex int64
	EndIndex   int64
	InsertText string
}

AddressMutation describes one delete and/or insertion at a document range.

func PlanAddressedAppend

func PlanAddressedAppend(targets []AddressElement, replacement string) []AddressMutation

PlanAddressedAppend plans text insertion before each target's trailing newline.

func PlanAddressedDelete

func PlanAddressedDelete(elements, targets []AddressElement) []AddressMutation

PlanAddressedDelete plans deletion ranges in document order.

func PlanAddressedInsert

func PlanAddressedInsert(targets []AddressElement, replacement string) []AddressMutation

PlanAddressedInsert plans text insertion before each target.
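
The difference between PlanAddressedInsert and PlanAddressedAppend comes down to where the zero-width mutation lands. The sketch below assumes EndIndex is exclusive and that each target's range includes its trailing newline, so "before the trailing newline" is EndIndex-1. The helper names and the trimmed local types are hypothetical.

```go
package main

import "fmt"

// Trimmed local mirrors of the documented structs.
type AddressElement struct {
	StartIndex int64
	EndIndex   int64
}

type AddressMutation struct {
	StartIndex int64
	EndIndex   int64
	InsertText string
}

// planInsert places a zero-width insertion at the start of each target.
func planInsert(targets []AddressElement, text string) []AddressMutation {
	plan := make([]AddressMutation, 0, len(targets))
	for _, t := range targets {
		plan = append(plan, AddressMutation{StartIndex: t.StartIndex, EndIndex: t.StartIndex, InsertText: text})
	}
	return plan
}

// planAppend places a zero-width insertion just before the trailing newline.
// Assumption: EndIndex is exclusive and the range ends with "\n".
func planAppend(targets []AddressElement, text string) []AddressMutation {
	plan := make([]AddressMutation, 0, len(targets))
	for _, t := range targets {
		at := t.EndIndex - 1
		plan = append(plan, AddressMutation{StartIndex: at, EndIndex: at, InsertText: text})
	}
	return plan
}

func main() {
	// A paragraph "Hello\n" occupying indexes [1, 7).
	targets := []AddressElement{{StartIndex: 1, EndIndex: 7}}
	fmt.Printf("%+v\n", planInsert(targets, ">> "))
	fmt.Printf("%+v\n", planAppend(targets, " <<"))
}
```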

type BraceExpression

type BraceExpression struct {
	// Boolean flags (nil = not set, *true = enabled, *false = negated)
	Bold      *bool // {b} = true, {!b} = false
	Italic    *bool
	Underline *bool
	Strike    *bool
	Code      *bool
	Sup       *bool
	Sub       *bool
	SmallCaps *bool

	// Boolean flags with inline scoping
	InlineSpans []InlineSpan // {b=Warning} -> span with text + flags

	// Value flags
	Text    string  // t= (empty = not set, "$0" = default)
	Color   string  // c= (hex or named, resolved to hex)
	Bg      string  // z= (hex or named, resolved to hex)
	Font    string  // f=
	Size    float64 // s= (0 = not set)
	URL     string  // u=
	Heading string  // h= ("t","s","1"-"6","0", "" = not set)
	Leading float64 // l= (0 = not set)
	Align   string  // a= ("left","center","right","justify")
	Opacity int     // o= (0 = not set, 100 = default)
	Indent  int     // n= (-1 = not set)
	Kerning float64 // k=
	Width   int     // x=
	Height  int     // y=

	// Paragraph spacing
	SpacingAbove float64 // p= first value
	SpacingBelow float64 // p= second value (or same as above if single)
	SpacingSet   bool    // whether p= was specified

	// Effect
	Effect string // e=

	// Columns
	Cols int // cols= (0 = not set)

	// Special flags
	Reset    bool   // {0} - explicit full reset
	NoReset  bool   // {!0} - opt out of implicit reset (additive mode)
	Break    string // += ("" = horizontal rule when + present, "p","c","s")
	HasBreak bool   // whether + was present
	Comment  string // "=text
	Bookmark string // @=name

	// Checkbox (tri-state)
	Check *bool // nil = no checkbox, true = checked, false = unchecked

	// Table of contents
	TOC    int  // toc depth (0 = not set, -1 = unlimited)
	HasTOC bool // whether toc was specified

	// Image ref (pattern-side)
	ImgRef string // img=

	// Table ref (pattern-side)
	TableRef string // T= (raw value for further parsing)
}

BraceExpression represents a fully parsed brace expression from SEDMAT syntax. It captures all formatting, structural, and semantic attributes specified within a {flags} block in a replacement string.

func MergeBraceSpans

func MergeBraceSpans(spans []*BraceSpan) *BraceExpression

MergeBraceSpans combines global formatting spans; inline spans remain positioned separately.

func ParseBraceExpression

func ParseBraceExpression(s string) (*BraceExpression, error)

ParseBraceExpression parses the content inside a brace expression. Input is the content between { }, e.g. for `{b c=red t=hello}` the input is `b c=red t=hello`.
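
The tri-state boolean flags and key=value flags can be illustrated with a heavily reduced parser. This sketch handles only b, !b, c= and t=, splits on whitespace (so values containing spaces are not supported), and does not resolve named colors. The miniBrace type and parseMiniBrace function are hypothetical, not the package's parser.

```go
package main

import (
	"fmt"
	"strings"
)

// miniBrace is a hypothetical, heavily reduced BraceExpression.
type miniBrace struct {
	Bold  *bool  // nil = not set, *true = {b}, *false = {!b}
	Color string // c=
	Text  string // t=
}

// parseMiniBrace parses the content between the braces.
// Assumption: tokens are separated by whitespace.
func parseMiniBrace(s string) (*miniBrace, error) {
	expr := &miniBrace{}
	for _, tok := range strings.Fields(s) {
		key, value, hasValue := strings.Cut(tok, "=")
		switch {
		case key == "b" && !hasValue:
			v := true
			expr.Bold = &v
		case key == "!b" && !hasValue:
			v := false
			expr.Bold = &v
		case key == "c" && hasValue:
			expr.Color = value
		case key == "t" && hasValue:
			expr.Text = value
		default:
			return nil, fmt.Errorf("unsupported token %q", tok)
		}
	}
	return expr, nil
}

func main() {
	expr, err := parseMiniBrace("b c=red t=hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(*expr.Bold, expr.Color, expr.Text)
}
```

Using a pointer for Bold is what lets a caller tell "not mentioned" apart from "explicitly negated", which a plain bool cannot express.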

type BraceSpan

type BraceSpan struct {
	Expr      *braceExpr // The parsed brace expression
	Start     int        // Start position in the cleaned output text
	End       int        // End position in the cleaned output text
	IsGlobal  bool       // True if this applies to the whole match (e.g., {b} alone)
	RawBraces string     // Original {content} for debugging
}

BraceSpan represents a positioned brace expression within a replacement string. It tracks where in the output text the formatting should be applied.

func ParseBraceReplacement

func ParseBraceReplacement(replacement string) (string, []*BraceSpan)

ParseBraceReplacement finds all `{...}` groups in a replacement string and returns the cleaned text plus positioned spans. It handles:

  • `{b}` as entire replacement → whole-match formatting
  • `{b=text}` inline → inline span at position
  • Multiple brace groups: `H{,=2}O` → "H2O" with subscript on "2"
  • Escaped braces: `\{` and `\}` are literals
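
The cases listed above can be sketched as a single scan over the replacement. This is a simplified illustration under stated assumptions: positions are byte offsets into the cleaned text, a group without "=" is recorded as a zero-width global span, and nested braces are not handled. The span type and cleanReplacement function are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// span is a hypothetical, reduced BraceSpan.
type span struct {
	Flag   string
	Start  int // byte offset in the cleaned text
	End    int
	Global bool // group had no "=text" part, e.g. {b}
}

// cleanReplacement strips brace groups and records where they apply.
func cleanReplacement(replacement string) (string, []span) {
	var out strings.Builder
	var spans []span
	for i := 0; i < len(replacement); i++ {
		c := replacement[i]
		switch {
		case c == '\\' && i+1 < len(replacement) &&
			(replacement[i+1] == '{' || replacement[i+1] == '}'):
			out.WriteByte(replacement[i+1]) // escaped brace becomes a literal
			i++
		case c == '{':
			end := strings.IndexByte(replacement[i:], '}')
			if end < 0 {
				out.WriteByte(c) // unterminated group: keep it literally
				continue
			}
			body := replacement[i+1 : i+end]
			flag, text, inline := strings.Cut(body, "=")
			start := out.Len()
			if inline {
				out.WriteString(text)
			}
			spans = append(spans, span{Flag: flag, Start: start, End: out.Len(), Global: !inline})
			i += end // the loop's i++ then steps past the closing brace
		default:
			out.WriteByte(c)
		}
	}
	return out.String(), spans
}

func main() {
	text, spans := cleanReplacement("H{,=2}O")
	fmt.Printf("%q %+v\n", text, spans)
}
```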

type CellInput

type CellInput struct {
	Text           string
	TextStartIndex int64
	TextEndIndex   int64
}

CellInput is the indexed text content of one table cell.

type CellPlanner

type CellPlanner struct {
	// contains filtered or unexported fields
}

CellPlanner owns one compiled cell replacement expression.

func NewCellPlanner

func NewCellPlanner(expression Expression) (*CellPlanner, error)

NewCellPlanner validates one cell replacement expression.

func (*CellPlanner) Plan

func (p *CellPlanner) Plan(input CellInput) TextPlan

Plan plans one cell using the planner's compiled expression.

type CellReference

type CellReference struct {
	TableIndex      int
	Row             int
	Column          int
	Subpattern      string
	RowOperation    string
	ColumnOperation string
	OperationTarget int
	EndRow          int
	EndColumn       int
}

CellReference identifies one table cell, range, wildcard, or row/column operation.

func ParseTableCellReference

func ParseTableCellReference(value string) *CellReference

ParseTableCellReference parses references such as |1|[2,3], |1|[A1], and row/column operations.

type Command

type Command byte

Command identifies the sed operation represented by an expression.

const (
	CommandSubstitute    Command = 0
	CommandDelete        Command = 'd'
	CommandAppend        Command = 'a'
	CommandInsert        Command = 'i'
	CommandTransliterate Command = 'y'
)

type DeferredBulletPlan

type DeferredBulletPlan struct {
	Requests []*docs.Request
	More     bool
}

DeferredBulletPlan is one index-stable nested-list repair batch.

func PlanDeferredBullets

func PlanDeferredBullets(document *docs.Document) DeferredBulletPlan

PlanDeferredBullets plans the first nested-list group that needs repair.

type DocumentBackend

type DocumentBackend interface {
	Get(context.Context, string) (*docs.Document, error)
	BatchUpdate(context.Context, string, []*docs.Request) (*docs.BatchUpdateDocumentResponse, error)
}

type DocumentBlock

type DocumentBlock struct {
	Kind       DocumentBlockKind
	StartIndex int64
	EndIndex   int64
	ItemIndex  int
}

DocumentBlock preserves top-level source order. ItemIndex addresses the corresponding Paragraphs or Tables entry, and is -1 for other block kinds.

type DocumentBlockKind

type DocumentBlockKind string

DocumentBlockKind identifies one top-level structural element.

const (
	DocumentBlockParagraph       DocumentBlockKind = "paragraph"
	DocumentBlockTable           DocumentBlockKind = "table"
	DocumentBlockTableOfContents DocumentBlockKind = "table_of_contents"
	DocumentBlockSectionBreak    DocumentBlockKind = "section_break"
)

type DocumentImage

type DocumentImage struct {
	ObjectID     string
	Index        int64
	Alt          string
	IsPositioned bool
}

DocumentImage describes an inline or positioned image.

type DocumentParagraph

type DocumentParagraph struct {
	Text               string
	StartIndex         int64
	EndIndex           int64
	NamedStyle         string
	BulletListID       string
	BulletNestingLevel int64
	LeadingTab         bool
}

DocumentParagraph is one paragraph with its concatenated text.

type DocumentProjection

type DocumentProjection struct {
	RevisionID string
	Legacy     *DocumentSegment
	Tabs       []DocumentSegment
}

DocumentProjection is the sed-facing view of a Google Docs document.

func ProjectDocument

func ProjectDocument(document *docs.Document) DocumentProjection

ProjectDocument builds a stable traversal view from legacy and tab-aware Docs fields.

type DocumentSegment

type DocumentSegment struct {
	TabID          string
	Title          string
	BodyStartIndex int64
	BodyEndIndex   int64
	Blocks         []DocumentBlock
	TextRuns       []DocumentTextRun
	Paragraphs     []DocumentParagraph
	Tables         []DocumentTable
	Images         []DocumentImage
}

DocumentSegment is one independently indexed document body.

type DocumentTable

type DocumentTable struct {
	StartIndex      int64
	EndIndex        int64
	DeclaredRows    int64
	DeclaredColumns int64
	Rows            []DocumentTableRow
}

DocumentTable is one table in source order, including nested tables.

type DocumentTableCell

type DocumentTableCell struct {
	Text           string
	StartIndex     int64
	EndIndex       int64
	TextStartIndex int64
	TextEndIndex   int64
	RowSpan        int64
	ColumnSpan     int64
}

DocumentTableCell is one cell's direct paragraph text and indexed range.

type DocumentTableRow

type DocumentTableRow struct {
	StartIndex int64
	EndIndex   int64
	Cells      []DocumentTableCell
}

DocumentTableRow contains projected cells in source order.

type DocumentTextRun

type DocumentTextRun struct {
	Text       string
	StartIndex int64
	EndIndex   int64
}

DocumentTextRun is one indexed text run.

type Executor

type Executor struct {
	// contains filtered or unexported fields
}

func NewExecutor

func NewExecutor(backend DocumentBackend) *Executor

func NewServiceExecutor

func NewServiceExecutor(service *docs.Service) *Executor

func (*Executor) ApplyDeferredBullets

func (e *Executor) ApplyDeferredBullets(ctx context.Context, documentID string) error

ApplyDeferredBullets repairs nested lists one group at a time, refetching after each index-shifting update.
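
The plan, apply, refetch cycle described above can be sketched without the Docs client. Everything in this example is a hypothetical stand-in: doc, plan, and backend are toy types, and the real method works on *docs.Document and []*docs.Request. The point is only that each batch is planned against a freshly fetched document, because applying a batch shifts indexes.

```go
package main

import "fmt"

// doc is a toy document: the list groups that still need repair.
type doc struct {
	pending []string
}

// plan mirrors the shape of DeferredBulletPlan with string requests.
type plan struct {
	Requests []string
	More     bool
}

// planNext plans only the first group that needs repair.
func planNext(d doc) plan {
	if len(d.pending) == 0 {
		return plan{}
	}
	return plan{
		Requests: []string{"repair " + d.pending[0]},
		More:     len(d.pending) > 1,
	}
}

// backend is a toy stand-in for a DocumentBackend.
type backend struct {
	stored  doc
	fetches int
}

func (b *backend) get() doc {
	b.fetches++
	return b.stored
}

// apply pretends the batch fixed the first pending group.
func (b *backend) apply(requests []string) {
	b.stored.pending = b.stored.pending[1:]
}

// repairAll loops until nothing is left, refetching before every plan.
func repairAll(b *backend) int {
	batches := 0
	for {
		p := planNext(b.get()) // always plan against a fresh copy
		if len(p.Requests) == 0 {
			return batches
		}
		b.apply(p.Requests)
		batches++
		if !p.More {
			return batches
		}
	}
}

func main() {
	b := &backend{stored: doc{pending: []string{"list A", "list B", "list C"}}}
	fmt.Println(repairAll(b), b.fetches)
}
```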

func (*Executor) BatchUpdate

func (e *Executor) BatchUpdate(
	ctx context.Context,
	documentID string,
	requests []*docs.Request,
) (*docs.BatchUpdateDocumentResponse, error)

func (*Executor) Get

func (e *Executor) Get(ctx context.Context, documentID string) (*docs.Document, error)

type Expression

type Expression struct {
	Pattern     string
	Replacement string
	Global      bool
	NthMatch    int
	Command     Command
	Address     *Address

	Cell        *CellReference
	Table       *TableReference
	Image       *ImageReference
	Brace       *BraceExpression
	BraceSpans  []*BraceSpan
	TableCreate *TableCreateSpec
}

Expression is the provider-independent core AST for one sed operation.

func ParseDelete

func ParseDelete(raw string) (Expression, error)

ParseDelete parses a d/pattern/flags expression.

func ParseExpression

func ParseExpression(raw string) (Expression, error)

ParseExpression parses substitution, delete, append, insert, transliterate, and addressed forms.

func ParseInsertAppend

func ParseInsertAppend(raw string, command Command) (Expression, error)

ParseInsertAppend parses an append or insert expression with a search pattern.

func ParseSubstitution

func ParseSubstitution(raw string) (Expression, error)

ParseSubstitution parses an s/pattern/replacement/flags expression.
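
For readers unfamiliar with the s/pattern/replacement/flags shape, here is a minimal splitter. It assumes the byte after "s" is the delimiter and that a backslash escapes the following byte; escapes are kept verbatim in the output rather than unescaped. The function splitSubstitution is a hypothetical sketch and does not build an Expression.

```go
package main

import (
	"errors"
	"fmt"
)

// splitSubstitution splits s<delim>pattern<delim>replacement<delim>flags.
// Assumption: a backslash protects the next byte from being read as a
// delimiter, and the escape sequence is preserved in the returned parts.
func splitSubstitution(raw string) (pattern, replacement, flags string, err error) {
	if len(raw) < 2 || raw[0] != 's' {
		return "", "", "", errors.New("not a substitution")
	}
	delim := raw[1]
	var parts []string
	start := 2
	for i := 2; i < len(raw) && len(parts) < 2; i++ {
		switch raw[i] {
		case '\\':
			i++ // skip the escaped byte
		case delim:
			parts = append(parts, raw[start:i])
			start = i + 1
		}
	}
	if len(parts) != 2 {
		return "", "", "", errors.New("expected s/pattern/replacement/flags")
	}
	return parts[0], parts[1], raw[start:], nil
}

func main() {
	pattern, replacement, flags, err := splitSubstitution(`s/a\/b/c/g`)
	fmt.Printf("%q %q %q %v\n", pattern, replacement, flags, err)
}
```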

func ParseTransliterate

func ParseTransliterate(raw string) (Expression, error)

ParseTransliterate parses a y/source/destination expression.
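
A y/source/destination expression maps each character of source to the character at the same position in destination. The sketch below shows that mapping applied to a string; the equal-length requirement follows POSIX sed and is an assumption about this package. The transliterate function is hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// transliterate replaces every rune found in source with the rune at the
// same position in destination. Assumption: both sets have equal length.
func transliterate(text, source, destination string) (string, error) {
	src, dst := []rune(source), []rune(destination)
	if len(src) != len(dst) {
		return "", errors.New("source and destination must have equal length")
	}
	table := make(map[rune]rune, len(src))
	for i, r := range src {
		table[r] = dst[i]
	}
	mapped := strings.Map(func(r rune) rune {
		if to, ok := table[r]; ok {
			return to
		}
		return r // runes outside the source set pass through unchanged
	}, text)
	return mapped, nil
}

func main() {
	out, err := transliterate("banana", "abn", "xyz")
	fmt.Println(out, err)
}
```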

type FootnoteMutation

type FootnoteMutation struct {
	StartIndex int64
	EndIndex   int64
	Text       string
}

FootnoteMutation replaces one text range with a populated footnote.

type FormatIntent

type FormatIntent struct {
	StartIndex           int64
	EndIndex             int64
	StructuralStartIndex int64
	StructuralEndIndex   int64
	Formats              []string
	LeadingTab           bool
	Brace                *BraceExpression
	BraceSpans           []*BraceSpan
}

FormatIntent describes formatting for newly inserted text.

type ImageMutation

type ImageMutation struct {
	StartIndex int64
	EndIndex   int64
	Image      *ImageSpec
}

ImageMutation replaces one text range with an inline image.

type ImageReference

type ImageReference struct {
	ByPosition bool
	Position   int
	AllImages  bool
	ByAlt      bool
	Pattern    string
	AltRegex   *regexp.Regexp
}

ImageReference identifies existing document images by position or alt-text regex.

func ParseBraceImageReference

func ParseBraceImageReference(spec string) (*ImageReference, error)

ParseBraceImageReference parses a {img=...} image reference body.

func ParseImageReference

func ParseImageReference(pattern string) *ImageReference

ParseImageReference parses existing-image references such as !(1), !(*), and ![alt-regex].

func (*ImageReference) String

func (ref *ImageReference) String() string

String returns a compact brace-syntax representation of the image reference.

type ImageSpec

type ImageSpec struct {
	URL     string
	Alt     string
	Caption string
	Width   int
	Height  int
}

ImageSpec describes an inline image replacement.

func ParseImageSyntax

func ParseImageSyntax(text string) *ImageSpec

ParseImageSyntax parses Markdown image syntax with optional Pandoc dimensions.

type InlineSpan

type InlineSpan struct {
	Text  string   // The text content of the span
	Flags []string // Which boolean flags apply: "b", "i", "^", etc.
}

InlineSpan represents an inline text span with associated boolean flags. Used for inline scoping like {b=Warning} where "Warning" is bolded inline.

type MarkdownReplacement

type MarkdownReplacement struct {
	Text    string
	Formats []string
}

MarkdownReplacement is the provider-independent interpretation of replacement markdown.

func ParseMarkdownReplacement

func ParseMarkdownReplacement(replacement string) MarkdownReplacement

ParseMarkdownReplacement extracts text and formatting from markdown-style replacement text.

type MatchAction

type MatchAction struct {
	StartIndex  int64
	EndIndex    int64
	OldText     string
	Replacement Replacement
}

MatchAction describes one indexed replacement in document order.

func PlanMatches

func PlanMatches(segment DocumentSegment, expression Expression) ([]MatchAction, error)

PlanMatches finds matches independently within each projected Docs text run.

func PlanParagraphMatches

func PlanParagraphMatches(paragraphs []DocumentParagraph, expression Expression) ([]MatchAction, error)

PlanParagraphMatches finds matches within each addressed paragraph.

type MatchPlanner

type MatchPlanner struct {
	// contains filtered or unexported fields
}

MatchPlanner owns a compiled expression for repeated planning.

func NewMatchPlanner

func NewMatchPlanner(expression Expression) (*MatchPlanner, error)

NewMatchPlanner validates and compiles one match expression.

func (*MatchPlanner) PlanParagraphs

func (p *MatchPlanner) PlanParagraphs(paragraphs []DocumentParagraph) []MatchAction

PlanParagraphs finds matches independently within each addressed paragraph.

func (*MatchPlanner) PlanSegment

func (p *MatchPlanner) PlanSegment(segment DocumentSegment) []MatchAction

PlanSegment finds matches independently within each projected Docs text run.

type ParagraphPlanner

type ParagraphPlanner struct {
	// contains filtered or unexported fields
}

ParagraphPlanner owns one compiled top-level paragraph match expression.

func NewParagraphPlanner

func NewParagraphPlanner(expression Expression) (*ParagraphPlanner, error)

NewParagraphPlanner validates one paragraph command expression.

func (*ParagraphPlanner) PlanDelete

func (p *ParagraphPlanner) PlanDelete(segment DocumentSegment) []AddressMutation

PlanDelete returns full-range deletions for matching top-level paragraphs.

func (*ParagraphPlanner) PlanInsert

func (p *ParagraphPlanner) PlanInsert(
	segment DocumentSegment,
	replacement string,
	before bool,
) []AddressMutation

PlanInsert returns insert-before or append-after mutations for matching top-level paragraphs.

type Program

type Program struct {
	Expressions []Expression
}

Program is an ordered sequence of sed expressions.

func Enrich

func Enrich(program Program) (Program, error)

Enrich resolves provider-independent table, cell, image, and brace semantics.

func Parse

func Parse(raw string) (Program, error)

Parse parses one sed expression into a program.

type Replacement

type Replacement struct {
	Kind         ReplacementKind
	ExpandedText string
	Text         string
	Formats      []string
	Image        *ImageSpec
	Brace        *BraceExpression
	BraceSpans   []*BraceSpan
}

Replacement is the provider-independent interpretation of one expanded replacement.

type ReplacementKind

type ReplacementKind string

ReplacementKind identifies the mutation represented by a matched replacement.

const (
	ReplacementText  ReplacementKind = "text"
	ReplacementImage ReplacementKind = "image"
)

type TableCreateMutation

type TableCreateMutation struct {
	StartIndex int64
	EndIndex   int64
	Rows       int
	Columns    int
}

TableCreateMutation replaces one matched placeholder with a table.

type TableCreatePlanner

type TableCreatePlanner struct {
	// contains filtered or unexported fields
}

TableCreatePlanner owns the compiled placeholder expression for one table.

func NewTableCreatePlanner

func NewTableCreatePlanner(expression Expression, spec TableCreateSpec) (*TableCreatePlanner, error)

NewTableCreatePlanner validates one table creation expression and shape.

func (*TableCreatePlanner) Plan

Plan returns the first placeholder mutation in source order.

type TableCreateSpec

type TableCreateSpec struct {
	Rows    int
	Columns int
	Header  bool
	Cells   [][]string
}

TableCreateSpec describes explicit or pipe-table creation syntax.

func ParsePipeTable

func ParsePipeTable(value string) *TableCreateSpec

ParsePipeTable parses markdown pipe-table creation syntax.

func ParseTableCreate

func ParseTableCreate(value string) *TableCreateSpec

ParseTableCreate parses explicit |RxC| and |RxC:header| table creation syntax.
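
The explicit |RxC| and |RxC:header| forms are regular enough to sketch with a single pattern. This example assumes the dimensions are positive decimal integers and that anything else yields nil, matching the pointer return type; the exact grammar the package accepts is not documented above. The names tableSpec and parseTableCreate are hypothetical.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// tableSpec is a reduced, hypothetical TableCreateSpec.
type tableSpec struct {
	Rows    int
	Columns int
	Header  bool
}

// Assumption: the syntax is exactly |<rows>x<cols>| with an optional :header.
var tableCreateRE = regexp.MustCompile(`^\|(\d+)x(\d+)(:header)?\|$`)

// parseTableCreate returns nil when value is not table creation syntax.
func parseTableCreate(value string) *tableSpec {
	m := tableCreateRE.FindStringSubmatch(value)
	if m == nil {
		return nil
	}
	rows, errRows := strconv.Atoi(m[1])
	cols, errCols := strconv.Atoi(m[2])
	if errRows != nil || errCols != nil || rows < 1 || cols < 1 {
		return nil
	}
	return &tableSpec{Rows: rows, Columns: cols, Header: m[3] != ""}
}

func main() {
	for _, value := range []string{"|3x4|", "|2x2:header|", "|3x|"} {
		fmt.Printf("%q -> %+v\n", value, parseTableCreate(value))
	}
}
```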

type TableReference

type TableReference struct {
	TableIndex int

	IsCreate   bool
	CreateRows int
	CreateCols int
	HasHeader  bool

	Row int
	Col int

	HasRange bool
	EndRow   int
	EndCol   int

	IsAllCells bool
	RowWild    bool
	ColWild    bool

	RowOp string
	ColOp string
}

TableReference is the semantic form of a pattern-side {T=...} reference.

func ParseBraceTableReference

func ParseBraceTableReference(spec string) (*TableReference, error)

ParseBraceTableReference parses a {T=...} table reference body.

func ParseTableReference

func ParseTableReference(value string) *TableReference

ParseTableReference parses a bare table reference such as |1|, |-1|, or |*|.

func (*TableReference) String

func (ref *TableReference) String() string

String returns a compact diagnostic representation of the table reference.

type TextEdit

type TextEdit struct {
	StartIndex     int64
	EndIndex       int64
	InsertText     string
	HorizontalRule bool
}

TextEdit describes one delete followed by an optional insertion.
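
One common way to keep indexes valid while applying several delete-then-insert edits is to apply them from the highest index down, so earlier positions are never shifted. The sketch below demonstrates that on a plain Go string. Whether the package orders its edits this way is an assumption, and real Docs indexes are not byte offsets into a Go string; applyEdits and the trimmed TextEdit are illustrative only.

```go
package main

import (
	"fmt"
	"sort"
)

// TextEdit is a trimmed local mirror of the documented struct.
type TextEdit struct {
	StartIndex int64
	EndIndex   int64
	InsertText string
}

// applyEdits applies non-overlapping edits in descending index order so
// that each edit's indexes still refer to the original text.
func applyEdits(text string, edits []TextEdit) string {
	sorted := append([]TextEdit(nil), edits...) // do not reorder the caller's slice
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].StartIndex > sorted[j].StartIndex
	})
	for _, e := range sorted {
		// Delete [StartIndex, EndIndex), then insert at StartIndex.
		text = text[:e.StartIndex] + e.InsertText + text[e.EndIndex:]
	}
	return text
}

func main() {
	edits := []TextEdit{
		{StartIndex: 0, EndIndex: 5, InsertText: "goodbye"},
		{StartIndex: 6, EndIndex: 11, InsertText: "moon"},
	}
	fmt.Println(applyEdits("hello world", edits))
}
```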

type TextPlan

type TextPlan struct {
	MatchCount int
	Images     []ImageMutation
	Footnotes  []FootnoteMutation
	TextEdits  []TextEdit
	Formatting []FormatIntent
}

TextPlan separates matched replacements by provider execution requirements.

func PlanCellInsertion

func PlanCellInsertion(index int64, content string) TextPlan

PlanCellInsertion inserts literal Markdown cell content at one document index.

func PlanCellReplacement

func PlanCellReplacement(input CellInput, expression Expression) (TextPlan, error)

PlanCellReplacement plans one whole-cell or sub-pattern replacement.

func PlanTextMutations

func PlanTextMutations(actions []MatchAction) TextPlan

PlanTextMutations converts match actions into deterministic execution phases.

func PlanWholeCellReplacement

func PlanWholeCellReplacement(input CellInput, replacement string) TextPlan

PlanWholeCellReplacement replaces cell text while preserving its terminal newline.
