The highest tagged major version is v4.

scanner

package

v3.8.0+incompatible Latest Latest Go to latest Published: Aug 31, 2018 License: Apache-2.0 Imports: 4 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/Clever/optimus

Links

Open Source Insights

Documentation ¶

Index ¶

Constants
Variables
func ScanBytes(data []byte, atEOF bool) (advance int, token []byte, err error)
func ScanLines(data []byte, atEOF bool) (advance int, token []byte, err error)
func ScanRunes(data []byte, atEOF bool) (advance int, token []byte, err error)
func ScanWords(data []byte, atEOF bool) (advance int, token []byte, err error)
type Scanner
- func NewScanner(r io.Reader) *Scanner
type SplitFunc

Constants ¶

View Source

const (
	// MaxScanTokenSize is the maximum size used to buffer a token. The actual maximum token size
	// may be smaller as the buffer may need to include, for instance, a newline.
	// NOTE: This has been updated to be large enough to store a single MongoDB doc.
	MaxScanTokenSize = 16 * 1024 * 1024
)

Variables ¶

View Source

var (
	ErrTooLong         = errors.New("bufio.Scanner: token too long")
	ErrNegativeAdvance = errors.New("bufio.Scanner: SplitFunc returns negative advance count")
	ErrAdvanceTooFar   = errors.New("bufio.Scanner: SplitFunc returns advance count beyond input")
)

Errors returned by Scanner.

Functions ¶

func ScanBytes ¶

func ScanBytes(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanBytes is a split function for a Scanner that returns each byte as a token.

func ScanLines ¶

func ScanLines(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanLines is a split function for a Scanner that returns each line of text, stripped of any trailing end-of-line marker. The returned line may be empty. The end-of-line marker is one optional carriage return followed by one mandatory newline. In regular expression notation, it is `\r?\n`. The last non-empty line of input will be returned even if it has no newline.

func ScanRunes ¶

func ScanRunes(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanRunes is a split function for a Scanner that returns each UTF-8-encoded rune as a token. The sequence of runes returned is equivalent to that from a range loop over the input as a string, which means that erroneous UTF-8 encodings translate to U+FFFD = "\xef\xbf\xbd". Because of the Scan interface, this makes it impossible for the client to distinguish correctly encoded replacement runes from encoding errors.

func ScanWords ¶

func ScanWords(data []byte, atEOF bool) (advance int, token []byte, err error)

ScanWords is a split function for a Scanner that returns each space-separated word of text, with surrounding spaces deleted. It will never return an empty string. The definition of space is set by unicode.IsSpace.

Types ¶

type Scanner ¶

type Scanner struct {
	// contains filtered or unexported fields
}

Scanner provides a convenient interface for reading data such as a file of newline-delimited lines of text. Successive calls to the Scan method will step through the 'tokens' of a file, skipping the bytes between the tokens. The specification of a token is defined by a split function of type SplitFunc; the default split function breaks the input into lines with line termination stripped. Split functions are defined in this package for scanning a file into lines, bytes, UTF-8-encoded runes, and space-delimited words. The client may instead provide a custom split function.

Scanning stops unrecoverably at EOF, the first I/O error, or a token too large to fit in the buffer. When a scan stops, the reader may have advanced arbitrarily far past the last token. Programs that need more control over error handling or large tokens, or must run sequential scans on a reader, should use bufio.Reader instead.

func NewScanner ¶

func NewScanner(r io.Reader) *Scanner

NewScanner returns a new Scanner to read from r. The split function defaults to ScanLines.

func (*Scanner) Bytes ¶

func (s *Scanner) Bytes() []byte

Bytes returns the most recent token generated by a call to Scan. The underlying array may point to data that will be overwritten by a subsequent call to Scan. It does no allocation.

func (*Scanner) Err ¶

func (s *Scanner) Err() error

Err returns the first non-EOF error that was encountered by the Scanner.

func (*Scanner) Scan ¶

func (s *Scanner) Scan() bool

Scan advances the Scanner to the next token, which will then be available through the Bytes or Text method. It returns false when the scan stops, either by reaching the end of the input or an error. After Scan returns false, the Err method will return any error that occurred during scanning, except that if it was io.EOF, Err will return nil.

func (*Scanner) Split ¶

func (s *Scanner) Split(split SplitFunc)

Split sets the split function for the Scanner. If called, it must be called before Scan. The default split function is ScanLines.

func (*Scanner) Text ¶

func (s *Scanner) Text() string

Text returns the most recent token generated by a call to Scan as a newly allocated string holding its bytes.

type SplitFunc ¶

type SplitFunc func(data []byte, atEOF bool) (advance int, token []byte, err error)

SplitFunc is the signature of the split function used to tokenize the input. The arguments are an initial substring of the remaining unprocessed data and a flag, atEOF, that reports whether the Reader has no more data to give. The return values are the number of bytes to advance the input and the next token to return to the user, plus an error, if any. If the data does not yet hold a complete token, for instance if it has no newline while scanning lines, SplitFunc can return (0, nil, nil) to signal the Scanner to read more data into the slice and try again with a longer slice starting at the same point in the input.

If the returned error is non-nil, scanning stops and the error is returned to the client.

The function is never called with an empty data slice unless atEOF is true. If atEOF is true, however, data may be non-empty and, as always, holds unprocessed text.

Source Files ¶

View all Source files

scanner.go

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL