Documentation ¶
Overview ¶
Package csvpp implements the IETF CSV++ specification (draft-mscaldas-csvpp-01).
CSV++ extends traditional CSV to support arrays and structured fields within cells, enabling complex data representation while maintaining CSV's simplicity. This package wraps encoding/csv and is fully compatible with RFC 4180.
CSV++ introduces four field types beyond simple text values:
- Simple: "name" - plain text value
- Array: "tags[]" - multiple values separated by a delimiter (default: ~)
- Structured: "geo(lat^lon)" - named components separated by a delimiter (default: ^)
- ArrayStructured: "addresses[](street^city)" - array of structured values
These field types are represented by the FieldKind constants: SimpleField, ArrayField, StructuredField, and ArrayStructuredField.
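For illustration, a header row combining all four kinds, with one data row using the default ~ and ^ delimiters, could look like this (the column names are invented for the example):

    id,tags[],geo(lat^lon),addresses[](street^city)
    1,go~rust,34.0522^-118.2437,123 Main^LA~456 Oak^NY

The array cell splits on ~, the structured cell splits on ^ into named components, and the array-structured cell splits first on ~ into elements and then on ^ within each element.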
Basic Usage ¶
Reading CSV++ data:
r := csvpp.NewReader(file)

// Get parsed headers
headers, err := r.Headers()
if err != nil {
    log.Fatal(err)
}

// Read records
for {
    record, err := r.Read()
    if err == io.EOF {
        break
    }
    if err != nil {
        log.Fatal(err)
    }
    // process record
}
Writing CSV++ data:
w := csvpp.NewWriter(file)
w.SetHeaders(headers)
if err := w.WriteHeader(); err != nil {
    log.Fatal(err)
}
for _, record := range records {
    if err := w.Write(record); err != nil {
        log.Fatal(err)
    }
}
w.Flush()
if err := w.Error(); err != nil {
    log.Fatal(err)
}
Struct Mapping ¶
Use Marshal and Unmarshal for automatic struct mapping with struct tags:
type Person struct {
    Name   string   `csvpp:"name"`
    Phones []string `csvpp:"phone[]"`
    Geo    struct {
        Lat string
        Lon string
    } `csvpp:"geo(lat^lon)"`
}

// Read into structs
var people []Person
if err := csvpp.Unmarshal(file, &people); err != nil {
    log.Fatal(err)
}

// Write from structs
var buf bytes.Buffer
if err := csvpp.Marshal(&buf, people); err != nil {
    log.Fatal(err)
}
Delimiter Conventions ¶
The IETF CSV++ specification recommends using specific delimiters for nested structures to avoid conflicts. The recommended progression is:
- Level 1 (arrays): ~ (tilde)
- Level 2 (components): ^ (caret)
- Level 3: ; (semicolon)
- Level 4: : (colon)
This package uses ~ and ^ as defaults, matching the IETF recommendation.
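The ABNF in Section 2.2 (see the FieldKind constants below) also allows a header to name its delimiter explicitly, for example tags[;] to use a semicolon for that column's array values. A hedged sketch of reading such a header, assuming the bracketed delimiter overrides the default as the grammar suggests:

    input := `tags[;]
    go;rust;zig
    `
    r := csvpp.NewReader(strings.NewReader(input))
    records, err := r.ReadAll()
    if err != nil {
        log.Fatal(err)
    }
    // The declared ';' is used instead of the default '~'.
    fmt.Println(records[0][0].Values) // expected: [go rust zig]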
Compatibility with encoding/csv ¶
This package wraps encoding/csv and inherits its RFC 4180 compliance. The Reader and Writer types expose the same configuration options:
- Comma: field delimiter (default: ',')
- Comment: comment character (Reader only)
- LazyQuotes: relaxed quote handling (Reader only)
- TrimLeadingSpace: trim leading whitespace (Reader only)
- UseCRLF: use \r\n line endings (Writer only)
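These options are set directly on the Reader or Writer before the first read or write, just as with encoding/csv. A minimal sketch (file is an open io.Reader):

    r := csvpp.NewReader(file)
    r.Comma = '\t'            // read tab-separated CSV++ input
    r.Comment = '#'           // skip lines that begin with '#'
    r.TrimLeadingSpace = true // ignore whitespace following the delimiter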
Security Considerations ¶
The MaxNestingDepth option (default: 10) limits the depth of nested structures to prevent stack overflow attacks from maliciously crafted input.
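The limit can be tightened for untrusted input; exceeding it surfaces as ErrNestingTooDeep (a minimal sketch, with untrustedInput standing in for the caller's io.Reader):

    r := csvpp.NewReader(untrustedInput)
    r.MaxNestingDepth = 3 // reject structures nested deeper than three levels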
CSV Injection ¶
When CSV files are opened in spreadsheet applications (Excel, Google Sheets, etc.), values beginning with '=', '+', '-', or '@' may be interpreted as formulas. This can lead to security vulnerabilities known as "CSV injection" or "formula injection".
Use the HasFormulaPrefix function to detect potentially dangerous values:
for _, field := range record {
    if csvpp.HasFormulaPrefix(field.Value) {
        field.Value = "'" + field.Value // Escape for spreadsheet safety
    }
}
Note: to preserve data integrity, this package does not automatically escape formula prefixes. Applications should implement appropriate escaping based on their specific security requirements and target environments.
Errors ¶
The package defines the following sentinel errors:
- ErrNoHeader: returned when attempting to read without a header row
- ErrInvalidHeader: returned when header format is invalid
- ErrNestingTooDeep: returned when nesting exceeds MaxNestingDepth
Parse errors are wrapped in ParseError, which provides line/column information.
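A hedged sketch of inspecting read errors with the standard errors package, assuming ParseError follows Go's error-wrapping conventions:

    record, err := r.Read()
    if err != nil && err != io.EOF {
        if errors.Is(err, csvpp.ErrNestingTooDeep) {
            log.Println("input exceeds the configured MaxNestingDepth")
        }
        var pe *csvpp.ParseError
        if errors.As(err, &pe) {
            log.Printf("parse error at line %d, column %d: %v", pe.Line, pe.Column, pe.Err)
        }
    }
    // process record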
Constants ¶
Default delimiters follow IETF recommendations:
- DefaultArrayDelimiter: ~ (tilde) for array fields
- DefaultComponentDelimiter: ^ (caret) for structured fields
- DefaultMaxNestingDepth: 10 (IETF recommended limit)
Specification Reference ¶
For the complete IETF CSV++ specification, see: https://datatracker.ietf.org/doc/draft-mscaldas-csvpp/
Example ¶
input := `name,phone[],geo(lat^lon)
Alice,555-1234~555-5678,34.0522^-118.2437
Bob,555-9999,40.7128^-74.0060
`
reader := csvpp.NewReader(strings.NewReader(input))

// Get headers
headers, err := reader.Headers()
if err != nil {
    log.Fatal(err)
}
fmt.Printf("Headers: %s, %s, %s\n", headers[0].Name, headers[1].Name, headers[2].Name)

// Read all records
for {
    record, err := reader.Read()
    if err == io.EOF {
        break
    }
    if err != nil {
        log.Fatal(err)
    }

    name := record[0].Value
    phones := record[1].Values
    lat := record[2].Components[0].Value
    lon := record[2].Components[1].Value
    fmt.Printf("%s: phones=%v, location=(%s, %s)\n", name, phones, lat, lon)
}
Output:

Headers: name, phone, geo
Alice: phones=[555-1234 555-5678], location=(34.0522, -118.2437)
Bob: phones=[555-9999], location=(40.7128, -74.0060)
Index ¶
- Constants
- Variables
- func HasFormulaPrefix(s string) bool
- func Marshal(w io.Writer, src any) error
- func MarshalWriter(w *Writer, src any) error
- func Unmarshal(r io.Reader, dst any) error
- func UnmarshalReader(r *Reader, dst any) error
- type ColumnHeader
- type Field
- type FieldKind
- type ParseError
- type Reader
- type Writer
Constants ¶
const (
    DefaultArrayDelimiter     = '~' // IETF Section 2.3.2: recommended for array fields
    DefaultComponentDelimiter = '^' // IETF Section 2.3.2: recommended for structured fields
)
Default delimiters as recommended in IETF CSV++ Section 2.3.2. The specification suggests delimiter progression: ~ → ^ → ; → : for nested structures.
const DefaultMaxNestingDepth = 10
DefaultMaxNestingDepth is the default maximum nesting depth. IETF Section 5 (Security Considerations) recommends limiting nesting depth to prevent stack overflow attacks from maliciously crafted input.
Variables ¶
var (
    ErrNoHeader       = errors.New("csvpp: header record is required")
    ErrInvalidHeader  = errors.New("csvpp: invalid column header format")
    ErrNestingTooDeep = errors.New("csvpp: nesting level exceeds limit")
)
Error definitions.
Functions ¶
func HasFormulaPrefix ¶ added in v0.0.2
func HasFormulaPrefix(s string) bool
HasFormulaPrefix reports whether s starts with a character that spreadsheet applications may interpret as a formula. These characters are: '=', '+', '-', '@'.
When CSV files are opened in spreadsheet applications like Microsoft Excel or Google Sheets, values beginning with these characters may be executed as formulas, potentially leading to security vulnerabilities (CSV injection).
This function helps identify potentially dangerous values so that applications can take appropriate action, such as prefixing with a single quote or rejecting the input.
Example:
if csvpp.HasFormulaPrefix(value) {
    value = "'" + value // Escape for spreadsheet safety
}
func Marshal ¶
func Marshal(w io.Writer, src any) error
Marshal encodes a slice of structs to CSV++ data.
Example ¶
people := []Person{
    {Name: "Alice", Phones: []string{"555-1234", "555-5678"}},
    {Name: "Bob", Phones: []string{"555-9999"}},
}

var buf bytes.Buffer
if err := csvpp.Marshal(&buf, people); err != nil {
    log.Fatal(err)
}
fmt.Print(buf.String())
Output:

name,phone[]
Alice,555-1234~555-5678
Bob,555-9999
func MarshalWriter ¶
func MarshalWriter(w *Writer, src any) error
MarshalWriter encodes a slice of structs to a Writer.
func Unmarshal ¶
func Unmarshal(r io.Reader, dst any) error
Unmarshal decodes CSV++ data into a slice of structs. dst must be a pointer to a slice of structs.
Example ¶
input := `name,phone[]
Alice,555-1234~555-5678
Bob,555-9999
`
var people []Person
if err := csvpp.Unmarshal(strings.NewReader(input), &people); err != nil {
    log.Fatal(err)
}
for _, p := range people {
    fmt.Printf("%s: %v\n", p.Name, p.Phones)
}
Output:

Alice: [555-1234 555-5678]
Bob: [555-9999]
Example (Structured) ¶
input := `name,geo(lat^lon)
Los Angeles,34.0522^-118.2437
New York,40.7128^-74.0060
`
var locations []Location
if err := csvpp.Unmarshal(strings.NewReader(input), &locations); err != nil {
    log.Fatal(err)
}
for _, loc := range locations {
    fmt.Printf("%s: (%s, %s)\n", loc.Name, loc.Geo.Lat, loc.Geo.Lon)
}
Output:

Los Angeles: (34.0522, -118.2437)
New York: (40.7128, -74.0060)
func UnmarshalReader ¶
func UnmarshalReader(r *Reader, dst any) error
UnmarshalReader decodes from a Reader into a slice of structs.
Types ¶
type ColumnHeader ¶
type ColumnHeader struct {
    Name               string          // Field name (ABNF: name = 1*field-char)
    Kind               FieldKind       // Field type (IETF Section 2.2)
    ArrayDelimiter     rune            // Array delimiter (ABNF: delimiter)
    ComponentDelimiter rune            // Component delimiter (ABNF: component-delim)
    Components         []*ColumnHeader // Component list (ABNF: component-list)
}
ColumnHeader represents the declaration information for an individual field. It corresponds to the ABNF "field" rule in IETF CSV++ Section 2.2:
field      = simple-field / array-field / struct-field / array-struct-field
name       = 1*field-char
field-char = ALPHA / DIGIT / "_" / "-"
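For example, the header geo(lat^lon) used throughout this documentation corresponds to a value shaped like the following sketch (the parser may also populate unexported state):

    geo := &csvpp.ColumnHeader{
        Name:               "geo",
        Kind:               csvpp.StructuredField,
        ComponentDelimiter: '^',
        Components: []*csvpp.ColumnHeader{
            {Name: "lat", Kind: csvpp.SimpleField},
            {Name: "lon", Kind: csvpp.SimpleField},
        },
    }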
type Field ¶
type Field struct {
    Value      string   // Value for SimpleField
    Values     []string // Values for ArrayField (IETF Section 2.2.2)
    Components []*Field // Components for StructuredField/ArrayStructuredField (IETF Section 2.2.3/2.2.4)
}
Field represents a parsed field value from a data row. The populated fields depend on the corresponding ColumnHeader.Kind:
- SimpleField: Value is set
- ArrayField: Values is set
- StructuredField: Components is set (each component is a Field)
- ArrayStructuredField: Components is set (each is a Field with its own Components)
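For instance, under the header geo(lat^lon), the cell 34.0522^-118.2437 parses into a Field shaped roughly like this sketch:

    f := &csvpp.Field{
        Components: []*csvpp.Field{
            {Value: "34.0522"},   // lat component
            {Value: "-118.2437"}, // lon component
        },
    }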
type FieldKind ¶
type FieldKind int
FieldKind represents the type of field as defined in IETF CSV++ Section 2.2. See: https://datatracker.ietf.org/doc/draft-mscaldas-csvpp/
const (
    SimpleField          FieldKind = iota // IETF Section 2.2.1: simple-field = name
    ArrayField                            // IETF Section 2.2.2: array-field = name "[" [delimiter] "]"
    StructuredField                       // IETF Section 2.2.3: struct-field = name [component-delim] "(" component-list ")"
    ArrayStructuredField                  // IETF Section 2.2.4: array-struct-field = name "[" [delimiter] "]" [component-delim] "(" component-list ")"
)
type ParseError ¶
type ParseError struct {
    Line   int    // Line number where the error occurred (1-based)
    Column int    // Column number where the error occurred (1-based)
    Field  string // Field name (if available)
    Err    error  // Original error
}
ParseError holds detailed information about an error that occurred during parsing.
func (*ParseError) Error ¶
func (e *ParseError) Error() string
Error returns the error message for ParseError.
type Reader ¶
type Reader struct {
    // Comma is the field delimiter (default: ',').
    Comma rune

    // Comment is the comment character (disabled if 0).
    Comment rune

    // LazyQuotes relaxes strict quote checking if true.
    LazyQuotes bool

    // TrimLeadingSpace trims leading whitespace from fields if true.
    TrimLeadingSpace bool

    // MaxNestingDepth is the maximum nesting depth for structured fields (default: 10).
    // This limit prevents stack overflow from deeply nested input (IETF Section 5).
    // If 0, DefaultMaxNestingDepth is used.
    MaxNestingDepth int
    // contains filtered or unexported fields
}
Reader reads CSV++ files according to the IETF CSV++ specification. It wraps encoding/csv.Reader and provides CSV++ header parsing and field parsing. The first row is always treated as the header row (IETF Section 2.1).
func NewReader ¶
NewReader creates a new Reader.
Example (CustomDelimiter) ¶
// Using semicolon as field delimiter (common in European locales)
input := `name;age
Alice;30
Bob;25
`
reader := csvpp.NewReader(strings.NewReader(input))
reader.Comma = ';'
records, err := reader.ReadAll()
if err != nil {
log.Fatal(err)
}
for _, record := range records {
fmt.Printf("%s is %s\n", record[0].Value, record[1].Value)
}
Output:

Alice is 30
Bob is 25
func (*Reader) Headers ¶
func (r *Reader) Headers() ([]*ColumnHeader, error)
Headers returns the parsed header information. If headers have not been parsed yet, the first row is read and parsed.
Example ¶
input := `id,name,tags[],address(street^city^zip)
1,Alice,go~rust,123 Main^LA^90210
`
reader := csvpp.NewReader(strings.NewReader(input))

headers, err := reader.Headers()
if err != nil {
    log.Fatal(err)
}
for _, h := range headers {
    fmt.Printf("%s: %s\n", h.Name, h.Kind)
}
Output:

id: SimpleField
name: SimpleField
tags: ArrayField
address: StructuredField
func (*Reader) Read ¶
Read reads and returns one record's worth of fields. The header row is automatically parsed on the first call. Returns io.EOF when the end of file is reached.
Example ¶
input := `name,scores[]
Alice,100~95~88
Bob,77~82
`
reader := csvpp.NewReader(strings.NewReader(input))

for {
    record, err := reader.Read()
    if err == io.EOF {
        break
    }
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("%s: %v\n", record[0].Value, record[1].Values)
}
Output:

Alice: [100 95 88]
Bob: [77 82]
func (*Reader) ReadAll ¶
ReadAll reads and returns all records. The header row is automatically parsed on the first call.
Example ¶
input := `name,age
Alice,30
Bob,25
Charlie,35
`
reader := csvpp.NewReader(strings.NewReader(input))

records, err := reader.ReadAll()
if err != nil {
    log.Fatal(err)
}
fmt.Printf("Read %d records\n", len(records))
for _, record := range records {
    fmt.Printf("%s is %s years old\n", record[0].Value, record[1].Value)
}
Output:

Read 3 records
Alice is 30 years old
Bob is 25 years old
Charlie is 35 years old
type Writer ¶
type Writer struct {
    // Comma is the field delimiter (default: ',').
    Comma rune

    // UseCRLF uses \r\n as the line terminator if true.
    UseCRLF bool
    // contains filtered or unexported fields
}
Writer writes CSV++ files according to the IETF CSV++ specification. It wraps encoding/csv.Writer and serializes CSV++ fields using the delimiters defined in the headers. The output is RFC 4180 compliant.
Example ¶
var buf bytes.Buffer
writer := csvpp.NewWriter(&buf)

headers := []*csvpp.ColumnHeader{
    {Name: "name", Kind: csvpp.SimpleField},
    {Name: "tags", Kind: csvpp.ArrayField, ArrayDelimiter: '~'},
}
writer.SetHeaders(headers)
if err := writer.WriteHeader(); err != nil {
    log.Fatal(err)
}

records := [][]*csvpp.Field{
    {{Value: "Alice"}, {Values: []string{"go", "rust"}}},
    {{Value: "Bob"}, {Values: []string{"python"}}},
}
for _, record := range records {
    if err := writer.Write(record); err != nil {
        log.Fatal(err)
    }
}
writer.Flush()
fmt.Print(buf.String())
Output:

name,tags[]
Alice,go~rust
Bob,python
func (*Writer) SetHeaders ¶
func (w *Writer) SetHeaders(headers []*ColumnHeader)
SetHeaders sets the header information. This must be called before WriteHeader or Write.
func (*Writer) WriteAll ¶
WriteAll writes all records. The header row is also written automatically.
Example ¶
var buf bytes.Buffer
writer := csvpp.NewWriter(&buf)

headers := []*csvpp.ColumnHeader{
    {Name: "name", Kind: csvpp.SimpleField},
    {Name: "score", Kind: csvpp.SimpleField},
}
writer.SetHeaders(headers)

records := [][]*csvpp.Field{
    {{Value: "Alice"}, {Value: "100"}},
    {{Value: "Bob"}, {Value: "95"}},
}
if err := writer.WriteAll(records); err != nil {
    log.Fatal(err)
}
fmt.Print(buf.String())
Output:

name,score
Alice,100
Bob,95
func (*Writer) WriteHeader ¶
WriteHeader writes the header row.