mdextract

package module

Version: v0.0.1 (latest) | Published: Jan 17, 2026 | License: MIT | Imports: 4 | Imported by: 0

Repository

github.com/wiztools/mdextract

README

mdextract

A Go module for extracting content under specific headings from markdown documents.

Features

Extract content under any markdown heading (# through ######)
Support for both string and stream input
Case-insensitive heading matching
Preserves formatting, code blocks, lists, and other markdown elements
Stops extraction at next heading of same or higher level
List all headings in a document

Installation

go get github.com/wiztools/mdextract

Usage

Basic Example

package main

import (
    "fmt"
    "log"
    
    "github.com/wiztools/mdextract"
)

func main() {
    markdown := `# My Document

## Introduction

This is the introduction section.
It has multiple paragraphs.

## Features

- Feature 1
- Feature 2
- Feature 3

## Conclusion

Final thoughts here.`

    extractor := mdextract.New(markdown)
    
    // Extract content under "## Features"
    content, err := extractor.GetContent("## Features")
    if err != nil {
        log.Fatal(err)
    }
    
    fmt.Println(content)
    // Output:
    // - Feature 1
    // - Feature 2
    // - Feature 3
}

Extract from Stream

package main

import (
    "bufio"
    "fmt"
    "log"
    "os"
    
    "github.com/wiztools/mdextract"
)

func main() {
    file, err := os.Open("document.md")
    if err != nil {
        log.Fatal(err)
    }
    defer file.Close()
    
    scanner := bufio.NewScanner(file)
    extractor := mdextract.NewFromStream(scanner)
    
    content, err := extractor.GetContent("## Installation")
    if err != nil {
        log.Fatal(err)
    }
    
    fmt.Println(content)
}

Get All Headings

extractor := mdextract.New(markdown)
headings := extractor.GetAllHeadings()

for _, heading := range headings {
    fmt.Println(heading)
}

Nested Headings

When extracting content under a heading, all lower-level headings and their content are included. Extraction stops when a heading of the same or higher level is encountered:

markdown := `## Section 1

Content before subsection.

### Subsection 1.1

Subsection content.

### Subsection 1.2

More subsection content.

## Section 2

Different section.`

extractor := mdextract.New(markdown)
content, err := extractor.GetContent("## Section 1")
if err != nil {
    log.Fatal(err)
}

fmt.Println(content)
// Output:
// Content before subsection.
// 
// ### Subsection 1.1
// 
// Subsection content.
// 
// ### Subsection 1.2
// 
// More subsection content.
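
The stopping rule described above can be illustrated without the library. The following is a minimal sketch, not mdextract's actual implementation: `headingLevel` and `sectionEnd` are hypothetical helper names, and only ATX-style (`#`) headings are considered.

```go
package main

import (
	"fmt"
	"strings"
)

// headingLevel returns the ATX heading level (1-6) of line, or 0 when the
// line is not a heading. A heading is one to six '#' characters followed
// by a space or by the end of the line.
func headingLevel(line string) int {
	trimmed := strings.TrimSpace(line)
	level := 0
	for level < len(trimmed) && trimmed[level] == '#' {
		level++
	}
	if level == 0 || level > 6 {
		return 0
	}
	if level < len(trimmed) && trimmed[level] != ' ' {
		return 0 // e.g. "#hashtag" is ordinary text
	}
	return level
}

// sectionEnd returns the index of the first line after start whose heading
// level is the same as or higher than that of lines[start] (a smaller or
// equal number of '#'), or len(lines) when no such heading follows.
func sectionEnd(lines []string, start int) int {
	level := headingLevel(lines[start])
	for i := start + 1; i < len(lines); i++ {
		if l := headingLevel(lines[i]); l != 0 && l <= level {
			return i
		}
	}
	return len(lines)
}

func main() {
	doc := "## Section 1\n\nIntro.\n\n### Subsection 1.1\n\nNested.\n\n## Section 2\n\nOther."
	lines := strings.Split(doc, "\n")

	// The section body is everything between the heading and sectionEnd.
	end := sectionEnd(lines, 0)
	fmt.Println(strings.TrimSpace(strings.Join(lines[1:end], "\n")))
}
```

Run as written, the program prints the introduction and the nested subsection but nothing from "## Section 2", because that heading has the same level as the one being extracted.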

API

`New(markdown string) *Extractor`

Creates a new Extractor from a markdown string.

`NewFromStream(scanner *bufio.Scanner) *Extractor`

Creates a new Extractor from a buffered scanner (useful for reading from files or streams).

`GetContent(heading string) (string, error)`

Extracts content under a specific heading until the next heading of the same or higher level.

heading: The heading to search for (e.g., "## Section Name")
Returns: The content without the heading itself, or an error if the heading is not found
Heading matching is case-insensitive
Content extraction stops at the next heading of equal or higher level
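
Case-insensitive matching of a heading can be sketched with `strings.EqualFold`. This is an illustration under stated assumptions rather than the library's code: `splitHeading` and `sameHeading` are hypothetical names, and requiring the `#` markers to match exactly is an assumption based on the "## Section Name" format shown above.

```go
package main

import (
	"fmt"
	"strings"
)

// splitHeading separates a heading such as "## Section Name" into its
// marker ("##") and title ("Section Name"). ok is false when the input
// does not start with one to six '#' characters followed by a space.
func splitHeading(s string) (marker, title string, ok bool) {
	marker, title, ok = strings.Cut(strings.TrimSpace(s), " ")
	if !ok || marker == "" || len(marker) > 6 || strings.Trim(marker, "#") != "" {
		return "", "", false
	}
	return marker, strings.TrimSpace(title), true
}

// sameHeading reports whether line is the heading named by query. The
// level must match exactly; the title is compared ignoring case.
func sameHeading(line, query string) bool {
	lineMarker, lineTitle, ok1 := splitHeading(line)
	queryMarker, queryTitle, ok2 := splitHeading(query)
	return ok1 && ok2 && lineMarker == queryMarker && strings.EqualFold(lineTitle, queryTitle)
}

func main() {
	fmt.Println(sameHeading("## Installation", "## INSTALLATION"))  // same level, different case
	fmt.Println(sameHeading("### Installation", "## installation")) // different level
}
```

The first call reports a match and the second does not, since the two headings differ in level even though their titles agree.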

`GetAllHeadings() []string`

Returns all headings found in the document.
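
For intuition, collecting headings from a document can be sketched as a single pass over its lines. `listHeadings` is a hypothetical function, not the library's implementation. Skipping lines inside fenced code blocks is an assumption of this sketch, since the documentation above does not say how `GetAllHeadings` treats a `#` comment inside a code block.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// listHeadings returns every line that looks like an ATX heading: one to
// six '#' characters followed by a space. Lines inside fenced code blocks
// are ignored so that shell or Python comments are not reported.
func listHeadings(doc string) []string {
	fence := strings.Repeat("`", 3)
	var headings []string
	inFence := false

	scanner := bufio.NewScanner(strings.NewReader(doc))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if strings.HasPrefix(line, fence) {
			inFence = !inFence // entering or leaving a code block
			continue
		}
		if inFence {
			continue
		}
		hashes := len(line) - len(strings.TrimLeft(line, "#"))
		if hashes >= 1 && hashes <= 6 && len(line) > hashes && line[hashes] == ' ' {
			headings = append(headings, line)
		}
	}
	return headings
}

func main() {
	fence := strings.Repeat("`", 3)
	doc := "# Guide\n\n## Install\n\n" + fence + "sh\n# a shell comment, not a heading\n" + fence + "\n\n### Options\n"
	for _, h := range listHeadings(doc) {
		fmt.Println(h)
	}
}
```

With this input the sketch reports the three real headings and leaves out the shell comment inside the fenced block.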

Testing

Run the test suite:

go test

Run with verbose output:

go test -v

Run benchmarks:

go test -bench=.

License

MIT

Documentation

Index

type Extractor
- func New(markdown string) *Extractor
- func NewFromStream(scanner *bufio.Scanner) *Extractor
- func (e *Extractor) GetAllHeadings() []string
- func (e *Extractor) GetContent(heading string) (string, error)

Types

type Extractor

type Extractor struct {
	// contains filtered or unexported fields
}

Extractor provides methods to extract content from markdown documents.

func New

func New(markdown string) *Extractor

New creates a new Extractor from a markdown string.

func NewFromStream

func NewFromStream(scanner *bufio.Scanner) *Extractor

NewFromStream creates a new Extractor from a stream wrapped in a *bufio.Scanner (for example, a scanner created over an io.Reader).

func (*Extractor) GetAllHeadings

func (e *Extractor) GetAllHeadings() []string

GetAllHeadings returns all headings in the document.

func (*Extractor) GetContent

func (e *Extractor) GetContent(heading string) (string, error)

GetContent extracts the content under a specific heading, stopping at the next heading of the same or higher level. The heading should be in the format "# Heading", "## Heading", etc. It returns the content without the heading itself, or an error if the heading is not found.

Source Files

mdextract.go

Directories

example