Documentation ¶
Overview ¶
Package scanner is a custom text scanner implementation. It follows the same idiomatic programming interface as other Go scanners, and it lets the client freely navigate the buffer. The scanner can also peek ahead of the cursor. Runes that have been read are returned as tokens that carry additional information on their position in the buffer.
Usage
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"

	"github.com/mdm-code/scanner"
)

func main() {
	r := bufio.NewReader(os.Stdin)
	s, err := scanner.New(r)
	if err != nil {
		log.Fatalln(err)
	}
	var ts []scanner.Token
	for s.Scan() {
		t := s.Token()
		ts = append(ts, t)
	}
	fmt.Println(ts)
}
Constants ¶
This section is empty.
Variables ¶
var ErrNilIOReader error = errors.New("provided io.Reader is nil")
ErrNilIOReader indicates that a parameter of the interface type io.Reader was passed with a nil value.
var ErrRuneError error = errors.New("Unicode replacement character found")
ErrRuneError indicates that the Scanner encountered the Unicode replacement character (U+FFFD).
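For background, the standard library's unicode/utf8 package returns utf8.RuneError (U+FFFD) both for bytes that are not valid UTF-8 and for a correctly encoded U+FFFD; the two cases differ only in the reported width. The sketch below uses the standard library alone. The helper isInvalidEncoding is a hypothetical name, and the source does not say whether the Scanner distinguishes the two cases.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isInvalidEncoding reports whether the first rune in b is the product of a
// decoding failure (width 1) rather than a correctly encoded U+FFFD (width 3).
// Empty input decodes with width 0 and is not treated as invalid here.
func isInvalidEncoding(b []byte) bool {
	r, size := utf8.DecodeRune(b)
	return r == utf8.RuneError && size == 1
}

func main() {
	fmt.Println(isInvalidEncoding([]byte{0xff}))     // true: 0xff is never valid UTF-8
	fmt.Println(isInvalidEncoding([]byte("\uFFFD"))) // false: a properly encoded U+FFFD
	fmt.Println(isInvalidEncoding([]byte("\u00e9"))) // false: an ordinary two-byte rune
}
```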
var Zero = Pos{Rune: '\u0000', Start: 0, End: 0}
Zero represents the initial state of the Scanner with the cursor pointing at the start of the byte buffer.
Functions ¶
This section is empty.
Types ¶
type Scanner ¶
Scanner encapsulates the logic of scanning runes from a text file. A Scanner instance is stateful and not safe for concurrent use.
func (*Scanner) Errored ¶ added in v1.2.1
Errored reports whether the Scanner encountered errors while scanning the underlying byte buffer.
Example ¶
ExampleScanner_Errored shows how to check whether errors were encountered while scanning the text buffer.
package main

import (
	"fmt"
	"log"
	"strings"

	"github.com/mdm-code/scanner"
)

func main() {
	text := "Hello!"
	r := strings.NewReader(text)
	s, err := scanner.New(r)
	if err != nil {
		log.Fatal(err)
	}
	s.Errors = append(s.Errors, scanner.ErrRuneError)
	fmt.Println(s.Errored())
}
Output: true
func (*Scanner) Goto ¶
Goto moves the cursor of the Scanner to the position of the t Token.
Example ¶
ExampleScanner_Goto shows how an already emitted token can be used to move the cursor of the scanner back to the position that token points at.
package main

import (
	"fmt"
	"log"
	"strings"

	"github.com/mdm-code/scanner"
)

func main() {
	r := strings.NewReader("Hello!")
	s, err := scanner.New(r)
	if err != nil {
		log.Fatal(err)
	}
	var final scanner.Token
	for s.Scan() {
		if curr := s.Token(); curr.Rune == 'e' {
			final = curr
		}
	}
	s.Goto(final)
	fmt.Println(s.Token())
}
Output: { e 1:2 }
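One way to picture this kind of rewinding is a cursor that stores the last emitted token and can be overwritten with an earlier one. The sketch below is a minimal stand-alone model, not the package's implementation: the token and cursor types and the scan and jump methods are hypothetical names, and positions are assumed to be byte offsets.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// token is a hypothetical stand-in for a scanned rune with its byte offsets.
type token struct {
	r          rune
	start, end int
}

// cursor is a hypothetical, minimal model of a scanner over an in-memory buffer.
type cursor struct {
	buf []byte
	cur token
}

// scan advances by one rune and reports whether a rune was read.
func (c *cursor) scan() bool {
	if c.cur.end >= len(c.buf) {
		return false
	}
	r, size := utf8.DecodeRune(c.buf[c.cur.end:])
	c.cur = token{r: r, start: c.cur.end, end: c.cur.end + size}
	return true
}

// jump restores the cursor to a previously emitted token, so the next scan
// continues with the rune that follows it.
func (c *cursor) jump(t token) { c.cur = t }

func main() {
	c := &cursor{buf: []byte("Hello!")}
	var mark token
	for c.scan() {
		if c.cur.r == 'e' {
			mark = c.cur
		}
	}
	c.jump(mark)
	fmt.Printf("%c %d:%d\n", c.cur.r, c.cur.start, c.cur.end) // e 1:2
	c.scan()
	fmt.Printf("%c %d:%d\n", c.cur.r, c.cur.start, c.cur.end) // l 2:3
}
```

Because the token carries its own offsets, no separate position stack is needed in this model; any token seen earlier is enough to resume from that point.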
func (*Scanner) Peek ¶
Peek reports whether the string v matches the byte buffer starting at the position the cursor currently points at. It returns true if there is a match. It returns false if there is no match or if v extends past the end of the buffer. It does not advance the Scanner.
Example ¶
ExampleScanner_Peek shows how to peek ahead of the scanner cursor to see whether the buffer ahead matches the provided string.
package main

import (
	"fmt"
	"log"
	"strings"

	"github.com/mdm-code/scanner"
)

func main() {
	r := strings.NewReader("There's a match!")
	s, err := scanner.New(r)
	if err != nil {
		log.Fatal(err)
	}
	for s.Scan() {
		if t := s.Token(); t.Rune == 's' {
			break
		}
	}
	result := s.Peek(" a match!")
	fmt.Println(result)
}
Output: true
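The matching rule described for Peek can be approximated with a prefix test on the unread part of the buffer. The following is a stand-alone sketch using only the standard library; the peek function and its explicit pos parameter are hypothetical and are not the package's API.

```go
package main

import (
	"bytes"
	"fmt"
)

// peek reports whether v matches buf starting at byte offset pos. It returns
// false when there is no match or when v runs past the end of buf.
func peek(buf []byte, pos int, v string) bool {
	if pos < 0 || pos > len(buf) {
		return false
	}
	return bytes.HasPrefix(buf[pos:], []byte(v))
}

func main() {
	buf := []byte("There's a match!")
	pos := 7 // offset just past the 's' in "There's"
	fmt.Println(peek(buf, pos, " a match!"))  // true
	fmt.Println(peek(buf, pos, " a match!!")) // false: v is longer than the rest of the buffer
	fmt.Println(peek(buf, pos, "no"))         // false: no match
}
```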
func (*Scanner) Reset ¶
func (s *Scanner) Reset()
Reset puts the Scanner back in its initial state, with the cursor pointing at the start of the byte buffer, and clears all recorded scanner errors.
Example ¶
ExampleScanner_Reset shows how to reset the scanner back to its initial, zero state. In the example, tokens produced by the scanner the usual way are discarded, and then the scanner gets reset back to its initial state.
package main

import (
	"fmt"
	"log"
	"strings"

	"github.com/mdm-code/scanner"
)

func main() {
	r := strings.NewReader("Hello!")
	s, err := scanner.New(r)
	if err != nil {
		log.Fatal(err)
	}
	var t scanner.Token
	for s.Scan() {
	}
	s.Reset()
	s.Scan()
	t = s.Token()
	fmt.Println(t)
}
Output: { H 0:1 }
func (*Scanner) Scan ¶
Scan advances the cursor of the Scanner by a single UTF-8 encoded Unicode character. The method returns a boolean value so that it can be used idiomatically, the same way other scanners in the Go standard library are used.
Example ¶
ExampleScanner_Scan shows how to translate text into a list of tokens with the Scanner public API. It combines New, Scan and Token to get a slice of tokens matching the provided "Hello!" input.
package main

import (
	"fmt"
	"log"
	"strings"

	"github.com/mdm-code/scanner"
)

func main() {
	in := "Hello!"
	r := strings.NewReader(in)
	s, err := scanner.New(r)
	if err != nil {
		log.Fatal(err)
	}
	var ts = []scanner.Token{}
	for s.Scan() {
		t := s.Token()
		ts = append(ts, t)
	}
	fmt.Println(ts)
}
Output: [{ H 0:1 } { e 1:2 } { l 2:3 } { l 3:4 } { o 4:5 } { ! 5:6 }]
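The ASCII input above cannot show whether positions count bytes or runes, since both are the same there. The stand-alone sketch below illustrates how start and end offsets come out if they are taken to be byte offsets, which is an assumption and not something the documentation states. The span type and spans function are hypothetical names.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// span is a hypothetical token: a rune and the half-open byte range it occupies.
type span struct {
	r          rune
	start, end int
}

// spans decodes s rune by rune and records each rune's byte range.
func spans(s string) []span {
	var out []span
	for i := 0; i < len(s); {
		r, size := utf8.DecodeRuneInString(s[i:])
		out = append(out, span{r: r, start: i, end: i + size})
		i += size
	}
	return out
}

func main() {
	// "h\u00e9y" is "h", a two-byte e with acute accent, and "y".
	for _, sp := range spans("h\u00e9y") {
		fmt.Printf("%c %d:%d\n", sp.r, sp.start, sp.end)
	}
	// Prints:
	// h 0:1
	// é 1:3
	// y 3:4
}
```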
func (*Scanner) ScanAll ¶ added in v1.2.1
ScanAll scans all Tokens representing UTF-8 encoded Unicode characters from the byte buffer underlying the Scanner.
Example ¶
ExampleScanner_ScanAll shows how to convert text into a list of tokens with a single method call to ScanAll() instead of using a for loop to traverse the input one token at a time.
package main

import (
	"fmt"
	"log"
	"strings"

	"github.com/mdm-code/scanner"
)

func main() {
	in := "Hello!"
	r := strings.NewReader(in)
	s, err := scanner.New(r)
	if err != nil {
		log.Fatal(err)
	}
	ts, ok := s.ScanAll()
	if !ok {
		log.Fatal(s.Errors[0])
	}
	fmt.Println(ts)
}
Output: [{ H 0:1 } { e 1:2 } { l 2:3 } { l 3:4 } { o 4:5 } { ! 5:6 }]
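The relationship between the boolean result and the recorded errors can be sketched as follows. This is a stand-alone model using only the standard library: errBadRune and scanAll are hypothetical names, and whether the real ScanAll keeps going after an error or stops at the first one is not stated in the documentation; this sketch keeps going.

```go
package main

import (
	"errors"
	"fmt"
	"unicode/utf8"
)

// errBadRune is a hypothetical stand-in for an error such as ErrRuneError.
var errBadRune = errors.New("replacement character found")

// scanAll decodes every rune in buf. It records an error for each rune that
// decodes to utf8.RuneError and reports ok == false if any was recorded.
func scanAll(buf []byte) (runes []rune, errs []error, ok bool) {
	for i := 0; i < len(buf); {
		r, size := utf8.DecodeRune(buf[i:])
		if r == utf8.RuneError {
			errs = append(errs, errBadRune)
		}
		runes = append(runes, r)
		i += size
	}
	return runes, errs, len(errs) == 0
}

func main() {
	_, _, ok := scanAll([]byte("Hello!"))
	fmt.Println(ok) // true

	_, errs, ok := scanAll([]byte{'H', 0xff, 'i'})
	fmt.Println(ok, len(errs)) // false 1
}
```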