go2markdown

go2markdown is an open-source Go library for converting DOCX and PDF files to Markdown format. It provides a clean, comprehensive API for document conversion using only Go-native libraries.
Overview
go2markdown enables programmatic conversion of Microsoft Word documents (DOCX) and PDF files to Markdown. It is built for production use, with comprehensive test coverage and a focus on clean, idiomatic Go design.
Key Features:
- DOCX to Markdown conversion
- PDF to Markdown conversion
- Pure Go implementation that depends only on Go-native libraries (see Dependencies)
- Published to pkg.go.dev
- MIT licensed, with dependencies licensed under MIT and Apache 2.0
Installation
go get github.com/edamplified/go2markdown
Quick Start
package main

import (
	"fmt"
	"log"

	"github.com/edamplified/go2markdown"
)

func main() {
	// Convert DOCX to Markdown
	markdown, err := go2markdown.ConvertDOCX("document.docx")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(markdown)

	// Convert PDF to Markdown
	markdown, err = go2markdown.ConvertPDF("document.pdf")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(markdown)
}
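When the input type is not known in advance, one option is to pick the converter from the file extension. The sketch below is illustrative rather than part of the library: `Converter`, `pickConverter`, and `outputPath` are hypothetical names, and the only assumption about go2markdown is that `ConvertDOCX` and `ConvertPDF` take a path and return `(string, error)`, as in the Quick Start. Stub converters stand in for the real functions so the example runs on its own.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// Converter matches the shape of ConvertDOCX and ConvertPDF as they are
// called in the Quick Start: a file path in, Markdown text out.
type Converter func(path string) (string, error)

// pickConverter (hypothetical helper) chooses a converter based on the
// file extension, ignoring case.
func pickConverter(path string, docx, pdf Converter) (Converter, error) {
	switch strings.ToLower(filepath.Ext(path)) {
	case ".docx":
		return docx, nil
	case ".pdf":
		return pdf, nil
	default:
		return nil, fmt.Errorf("unsupported file type %q", filepath.Ext(path))
	}
}

// outputPath (hypothetical helper) replaces the input extension with ".md".
func outputPath(path string) string {
	return strings.TrimSuffix(path, filepath.Ext(path)) + ".md"
}

func main() {
	// Stubs used for illustration; in real code pass
	// go2markdown.ConvertDOCX and go2markdown.ConvertPDF instead.
	docx := func(path string) (string, error) { return "# converted from DOCX", nil }
	pdf := func(path string) (string, error) { return "# converted from PDF", nil }

	for _, p := range []string{"report.docx", "paper.PDF", "notes.txt"} {
		convert, err := pickConverter(p, docx, pdf)
		if err != nil {
			fmt.Println("skipping:", err)
			continue
		}
		md, err := convert(p)
		if err != nil {
			fmt.Println("conversion failed:", err)
			continue
		}
		fmt.Printf("%s -> %s (%d bytes of Markdown)\n", p, outputPath(p), len(md))
	}
}
```

Passing the converters in as values keeps the dispatch logic testable without real documents on disk.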
Documentation
Full API documentation and examples are available on pkg.go.dev.
Dependencies
This library uses the following Go-native libraries:
- godocx (MIT License) - DOCX parsing and manipulation
- pdfcpu (Apache 2.0 License) - PDF processing and extraction
See NOTICE for third-party license information and attribution.
Requirements
Development Status
This library builds on lessons from a prototype that evaluated several Go libraries for document conversion. The implementation focuses on:
- Clean, idiomatic Go code following best practices
- Comprehensive test coverage with race detection
- Production-ready error handling
- Well-documented API with examples
- Performance optimization
Contributing
Contributions are welcome. Please see CONTRIBUTING.md for guidelines on code style, testing requirements, and the contribution process.
Popular ways to contribute:
- Report bugs and suggest features
- Improve documentation
- Add test coverage
- Contribute code improvements
License
This project is licensed under the MIT License. See LICENSE.md for the full license text.
Third-party dependencies are licensed under MIT and Apache 2.0. See NOTICE for detailed attribution.
Support
This library grew out of a prototype project that evaluated several Go libraries for document conversion. The final implementation uses only MIT and Apache 2.0 licensed dependencies for maximum license compatibility.
Note: This library is published to pkg.go.dev and can be imported as a standard Go module.