billdsv

v1.0.3 Latest
Published: Oct 25, 2018 License: MIT Imports: 3 Imported by: 0

README

Bill DSV

Some files from Bill (Comms.txt) include multi-line strings that are not quoted. No D/CSV reader seemed to deal with these files so a custom library was necessary.

For example:

1000|first string|final string
1001|second string
that is multi-line|final string
1002|third string|final string

Record 1001 contains a value that spans multiple lines, but the value is not quoted. Quoting is the usual strategy for handling multi-line values in DSV formats.

Usage

Reader exports Read and ReadAll. It behaves like the standard CSV reader, with a few exceptions:

  • No quoting, because Bill files don't include quotes (from what I can tell)
  • No slice optimisation (yet)
  • No comments

For example:

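// Requires the fmt and os packages in addition to billdsv.
func main() {
    // Open a pipe-separated Bill export; the path is only an example.
    f, err := os.Open("Bill Data/Comms/18/07/31/Comms.txt")
    if err != nil {
        panic(err)
    }
    defer f.Close()

    // A fields value of zero takes the expected field count from the first line.
    cr := billdsv.NewReader(f, 0, billdsv.DefaultBufferSize)

    for {
        row, err := cr.Read()
        if err != nil {
            // Stop on the first error; the loop has no other exit.
            fmt.Println(err)
            break
        }
        fmt.Println(row)
    }
}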

Documentation

Variables

var DefaultBufferSize = 1024

Types

type Reader

type Reader struct {
	Separator   byte
	SkipHeading bool
	BufferSize  int
	// contains filtered or unexported fields
}

Reader implements a DSV reader that reads the pipe-separated values that Bill outputs.

func NewReader

func NewReader(r io.Reader, fields, bufferSize int) *Reader

NewReader returns a new Reader that reads from r. The number of expected fields per row can be specified so that the parser can safely handle fields containing line breaks. The buffer size can also be changed after instantiation, but the default should be fine for most cases. If fields is left at zero, the first line is used to set the expected field count for the rest of the document. This means that if the CSV is malformed
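The idea of using an expected field count to tell a record break from a line break inside a field can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's implementation: parseRecords is a hypothetical function, it builds strings rather than reusing buffers, and it assumes that the last field of a record never contains a line break.

```go
package main

import (
	"bytes"
	"fmt"
)

// parseRecords is a hypothetical sketch (not part of billdsv). It splits data
// into records of `fields` fields each. A newline only ends a record when the
// field being read is the last one; otherwise it is kept as field content.
// If fields is zero, the count is taken from the first line.
func parseRecords(data []byte, sep byte, fields int) [][]string {
	if fields == 0 {
		first := data
		if i := bytes.IndexByte(data, '\n'); i >= 0 {
			first = data[:i]
		}
		fields = bytes.Count(first, []byte{sep}) + 1
	}

	var (
		records [][]string
		row     []string
		field   []byte
	)
	for _, b := range data {
		switch {
		case b == sep:
			// Separator: close the current field.
			row = append(row, string(field))
			field = field[:0]
		case b == '\n' && len(row) == fields-1:
			// Newline while reading the last field: close the record.
			row = append(row, string(field))
			field = field[:0]
			records = append(records, row)
			row = nil
		default:
			// Any other byte, including a newline mid-record, is content.
			field = append(field, b)
		}
	}
	// Flush a final record that has no trailing newline.
	if len(row) > 0 || len(field) > 0 {
		records = append(records, append(row, string(field)))
	}
	return records
}

func main() {
	data := []byte("1000|first string|final string\n" +
		"1001|second string\nthat is multi-line|final string\n" +
		"1002|third string|final string\n")
	for _, rec := range parseRecords(data, '|', 0) {
		fmt.Printf("%q\n", rec)
	}
}
```

Because the count comes from the first line when fields is zero, a first line with the wrong number of separators would throw off every later record in this sketch, so passing the count explicitly is the safer choice when it is known.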

func (*Reader) ReadAll

func (r *Reader) ReadAll(function func([][]byte)) (err error)

ReadAll reads all records and passes them to the specified function. In the best case, this function makes no heap allocations. It allocates when a field exceeds the default field buffer size of 1024, in which case the struct field `wrBuffer` is resized to 1.5x its size. The other two potential allocation sites are the two `append` calls in the switch blocks; these allocate lazily, and a `rowBuffer` cell is also resized when it is at capacity and cannot fit the new data.
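The growth policy described above can be sketched roughly like this. growBuffer and defaultBufferSize are hypothetical names used for illustration; the package's internal fields such as `wrBuffer` are unexported and their exact handling is not shown in this documentation. The sketch assumes a single 1.5x step, falling back to the exact required size when that is still too small.

```go
package main

import "fmt"

// defaultBufferSize mirrors the documented DefaultBufferSize value of 1024.
const defaultBufferSize = 1024

// growBuffer is a hypothetical sketch (not part of billdsv). It returns buf
// unchanged when it can already hold `need` bytes, so the common case does
// not allocate. Otherwise it allocates a buffer of 1.5x the old capacity,
// or exactly `need` bytes if 1.5x is still too small, and copies the data.
func growBuffer(buf []byte, need int) []byte {
	if need <= cap(buf) {
		return buf
	}
	newCap := cap(buf) + cap(buf)/2
	if newCap < need {
		newCap = need
	}
	grown := make([]byte, len(buf), newCap)
	copy(grown, buf)
	return grown
}

func main() {
	buf := make([]byte, 0, defaultBufferSize)

	// Fits in the existing buffer: no allocation, capacity unchanged.
	buf = growBuffer(buf, 512)
	fmt.Println(cap(buf))

	// One byte too large: capacity grows by half.
	buf = growBuffer(buf, defaultBufferSize+1)
	fmt.Println(cap(buf))
}
```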
