match

package

v0.3.1 Published: Sep 3, 2020 License: BSD-3-Clause Imports: 10 Imported by: 0

Repository

github.com/google/licensecheck

Documentation ¶

Overview ¶

Package match defines matching algorithms and support code for the license checker.

Index ¶

Variables
type Dict
type LRE
- func ParseLRE(d *Dict, file, s string) (*LRE, error)
- func (re *LRE) Dict() *Dict
- func (re *LRE) File() string
type Match
type Matches
type MultiLRE
- func NewMultiLRE(list []*LRE) (_ *MultiLRE, err error)
- func (re *MultiLRE) Dict() *Dict
- func (re *MultiLRE) Match(text string) *Matches
type SyntaxError
- func (e *SyntaxError) Error() string
type Word
type WordID

Constants ¶

This section is empty.

Variables ¶

var TraceDFA int

TraceDFA controls whether DFA execution prints debug tracing when stuck. If TraceDFA > 0 and the DFA has followed a path of at least TraceDFA symbols since the last matching state but hits a dead end, it prints out information about the dead end.

Functions ¶

This section is empty.

Types ¶

type Dict ¶

type Dict struct {
	// contains filtered or unexported fields
}

A Dict maps words to their integer indexes, of type WordID, in a word list. The zero Dict is an empty dictionary ready for use.

Lookup and Words are read-only operations, safe for any number of concurrent calls from multiple goroutines. Insert is a write operation; it must not run concurrently with any other call, whether to Insert, Lookup, or Words.

func (*Dict) Insert ¶

func (d *Dict) Insert(w string) WordID

Insert adds the word w to the word list, returning its index. If w is already in the word list, it is not added again; Insert returns the existing index.

func (*Dict) InsertSplit ¶

func (d *Dict) InsertSplit(text string) []Word

InsertSplit splits text into a sequence of lowercase words, inserting any new words in the dictionary.

func (*Dict) Lookup ¶

func (d *Dict) Lookup(w string) WordID

Lookup looks for the word w in the word list and returns its index. If w is not in the word list, Lookup returns BadWord.
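
The Insert and Lookup contracts described above can be illustrated with a small self-contained sketch. The names toyDict, wordID, and badWord are hypothetical stand-ins defined here for illustration; this is not the package's implementation, and starting indexes at 0 is an assumption of the sketch.

```go
package main

import "fmt"

// wordID and badWord mirror the documented WordID and BadWord.
type wordID int32

const badWord wordID = -1

// toyDict is a hypothetical stand-in for Dict, not the real type.
// Like the documented Dict, its zero value is ready for use.
type toyDict struct {
	index map[string]wordID // word -> position in list
	list  []string          // position -> word
}

// insert adds w if it is new and returns its index either way.
func (d *toyDict) insert(w string) wordID {
	if id, ok := d.index[w]; ok { // reading a nil map is safe
		return id
	}
	if d.index == nil {
		d.index = make(map[string]wordID)
	}
	id := wordID(len(d.list)) // assumption: indexes start at 0
	d.index[w] = id
	d.list = append(d.list, w)
	return id
}

// lookup returns the index of w, or badWord if w was never inserted.
func (d *toyDict) lookup(w string) wordID {
	if id, ok := d.index[w]; ok {
		return id
	}
	return badWord
}

func main() {
	var d toyDict
	fmt.Println(d.insert("permission")) // 0
	fmt.Println(d.insert("granted"))    // 1
	fmt.Println(d.insert("permission")) // 0: already present, not added again
	fmt.Println(d.lookup("warranty"))   // -1: not in the word list
}
```

As with the documented Dict, insert in this sketch is a write operation and would need external synchronization if called concurrently with anything else.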

func (*Dict) Split ¶

func (d *Dict) Split(text string) []Word

Split splits text into a sequence of lowercase words. It does not add any new words to the dictionary. Unrecognized words are reported as having ID = BadWord.
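
To make the relationship between a word and its position in the text concrete, here is a minimal sketch of word splitting with byte offsets. It assumes that a word is a maximal run of letters and digits; the real splitting rules are not spelled out in this documentation and may differ. The names word and splitWords are hypothetical, and the sketch stores the lowercased text where the documented Word stores an ID.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// word is a hypothetical analogue of the documented Word type.
// The word appears at text[Lo:Hi].
type word struct {
	Text   string
	Lo, Hi int
}

// splitWords returns the maximal runs of letters and digits in text,
// lowercased, together with their byte offsets.
func splitWords(text string) []word {
	var words []word
	start := -1 // byte offset of the word in progress, or -1
	for i, r := range text {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			if start < 0 {
				start = i
			}
			continue
		}
		if start >= 0 {
			words = append(words, word{strings.ToLower(text[start:i]), start, i})
			start = -1
		}
	}
	if start >= 0 { // text ended inside a word
		words = append(words, word{strings.ToLower(text[start:]), start, len(text)})
	}
	return words
}

func main() {
	text := "Permission is hereby GRANTED, free of charge."
	for _, w := range splitWords(text) {
		fmt.Printf("%q at text[%d:%d] = %q\n", w.Text, w.Lo, w.Hi, text[w.Lo:w.Hi])
	}
}
```

Keeping offsets into the original text lets a caller report matches against the unmodified input even though matching itself ignores case and punctuation.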

func (*Dict) Words ¶

func (d *Dict) Words() []string

Words returns the current word list. The list is not a copy; the caller can read but must not modify the list.

type LRE ¶

type LRE struct {
	// contains filtered or unexported fields
}

An LRE is a compiled license regular expression.

TODO: Move this comment somewhere non-internal later.

A license regular expression (LRE) is a pattern syntax intended for describing large English texts such as software licenses, with minor allowed variations. The pattern syntax and the matching are word-based and case-insensitive; punctuation is ignored in the pattern and in the matched text.

The valid LRE patterns are:

word            - a single case-insensitive word
__N__           - any sequence of up to N words
expr1 expr2     - concatenation
expr1 || expr2  - alternation
(( expr ))      - grouping
expr??          - zero or one instances of expr
//** text **//  - a comment

To make patterns harder to misread in large texts, the syntax imposes these restrictions:

|| must only appear inside (( ))
?? must only follow (( ))
(( must be at the start of a line, preceded only by spaces
)) must be at the end of a line, followed only by spaces and ??.

For example:

//** https://en.wikipedia.org/wiki/Filler_text **//
Now is
((not))??
the time for all good
((men || women || people))
to come to the aid of their __1__.
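
One way to see which texts the example pattern accepts is to approximate it with Go's standard regexp package applied to normalized text. This is only a sketch for building intuition: it is not how this package matches, it anchors the pattern to the whole input rather than searching within a larger text, and the normalize helper (lowercase, punctuation dropped, words joined by single spaces) is an assumption about what "word-based and case-insensitive" amounts to.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"unicode"
)

// normalize lowercases text and reduces it to words separated by
// single spaces, dropping punctuation. Hypothetical helper.
func normalize(text string) string {
	fields := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	return strings.Join(fields, " ")
}

// example approximates the LRE above:
//   ((not))??                  -> ( not)?
//   ((men || women || people)) -> (men|women|people)
//   __1__                      -> ( [^ ]+)?   (up to one word)
var example = regexp.MustCompile(
	`^now is( not)? the time for all good (men|women|people)` +
		` to come to the aid of their( [^ ]+)?$`)

// accepts reports whether the whole text fits the example pattern.
func accepts(text string) bool {
	return example.MatchString(normalize(text))
}

func main() {
	for _, text := range []string{
		"Now is the time for all good men to come to the aid of their country.",
		"NOW IS NOT THE TIME, for all good people to come to the aid of their party!",
		"Now is the time for all good women to come to the aid of their",
		"Now is the time for all good dogs to come to the aid of their country.",
	} {
		fmt.Printf("%v\t%s\n", accepts(text), text)
	}
}
```

The first three inputs are accepted and the last is rejected, because "dogs" is not one of the listed alternatives.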

func ParseLRE ¶

func ParseLRE(d *Dict, file, s string) (*LRE, error)

ParseLRE parses the string s as a license regexp. The file name is used in error messages if non-empty.

func (*LRE) Dict ¶

func (re *LRE) Dict() *Dict

Dict returns the Dict used by the LRE.

func (*LRE) File ¶

func (re *LRE) File() string

File returns the file name passed to ParseLRE.

type Match ¶

type Match struct {
	ID    int // index of LRE in list passed to NewMultiLRE
	Start int // word index of start of match
	End   int // word index of end of match
}

A Match records the position of a single match in a text.

type Matches ¶

type Matches struct {
	Text  string  // the entire text
	Words []Word  // the text, split into Words
	List  []Match // the matches
}

A Matches is a collection of all leftmost-longest, non-overlapping matches in text.

type MultiLRE ¶

type MultiLRE struct {
	// contains filtered or unexported fields
}

A MultiLRE matches multiple LREs simultaneously against a text. It is more efficient than matching each LRE in sequence against the text.

func NewMultiLRE ¶

func NewMultiLRE(list []*LRE) (_ *MultiLRE, err error)

NewMultiLRE returns a MultiLRE looking for the given LREs. All the LREs must have been parsed using the same Dict; if not, NewMultiLRE panics.

func (*MultiLRE) Dict ¶

func (re *MultiLRE) Dict() *Dict

Dict returns the Dict used by the MultiLRE.

func (*MultiLRE) Match ¶

func (re *MultiLRE) Match(text string) *Matches

Match reports all leftmost-longest, non-overlapping matches in text. It always returns a non-nil *Matches, in order to return the split text. Check len(matches.List) to see whether any matches were found.
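
The "leftmost-longest, non-overlapping" rule can be sketched independently of how matches are found. The code below takes a list of candidate spans and keeps the ones that rule would select. It illustrates the selection semantics only; it does not reproduce the package's matching algorithm. The span type, the leftmostLongest function, and the treatment of End as exclusive are all assumptions of the sketch.

```go
package main

import (
	"fmt"
	"sort"
)

// span is a hypothetical analogue of Match. The word range is
// treated as half-open, [Start, End), which is an assumption.
type span struct {
	ID         int
	Start, End int
}

// leftmostLongest keeps, scanning left to right, the longest candidate
// at the earliest available start, then resumes after its end.
func leftmostLongest(cands []span) []span {
	sorted := append([]span(nil), cands...) // do not reorder the caller's slice
	sort.SliceStable(sorted, func(i, j int) bool {
		if sorted[i].Start != sorted[j].Start {
			return sorted[i].Start < sorted[j].Start // leftmost first
		}
		return sorted[i].End > sorted[j].End // then longest first
	})
	var out []span
	next := 0 // first word index not yet covered by a chosen span
	for _, s := range sorted {
		if s.Start >= next {
			out = append(out, s)
			next = s.End
		}
	}
	return out
}

func main() {
	cands := []span{
		{ID: 0, Start: 0, End: 5},
		{ID: 1, Start: 0, End: 8}, // same start as ID 0 but longer
		{ID: 2, Start: 3, End: 10}, // overlaps the chosen span
		{ID: 0, Start: 8, End: 12},
		{ID: 1, Start: 9, End: 20}, // overlaps the second chosen span
	}
	fmt.Println(leftmostLongest(cands)) // [{1 0 8} {0 8 12}]
}
```

Note that the rule prefers an earlier match over a later, longer one: the span starting at word 9 is dropped even though it is the longest candidate.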

type SyntaxError ¶

type SyntaxError struct {
	File    string
	Offset  int
	Context string
	Err     string
}

A SyntaxError reports a syntax error during parsing.

func (*SyntaxError) Error ¶

func (e *SyntaxError) Error() string
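
Because SyntaxError is used through a pointer receiver, a caller would typically recover it from a wrapped error with errors.As. The sketch below shows that pattern with a locally defined syntaxError that mirrors the documented fields. The check function, the message format produced by Error, and the reading of Offset as a byte offset are all hypothetical; the real method's output may differ.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// syntaxError mirrors the fields of the documented SyntaxError.
type syntaxError struct {
	File    string
	Offset  int
	Context string
	Err     string
}

// Error uses a hypothetical format chosen for this sketch.
func (e *syntaxError) Error() string {
	return fmt.Sprintf("%s:#%d: %s near %q", e.File, e.Offset, e.Err, e.Context)
}

// check is a hypothetical, deliberately crude validator: it rejects a
// "||" that is not preceded by any "((" in the pattern.
func check(file, s string) error {
	i := strings.Index(s, "||")
	if i >= 0 && !strings.Contains(s[:i], "((") {
		return &syntaxError{File: file, Offset: i, Context: s[i:], Err: "|| outside (( ))"}
	}
	return nil
}

func main() {
	// Wrap the error the way a caller loading patterns might.
	err := fmt.Errorf("loading pattern: %w", check("demo.lre", "men || women"))

	var se *syntaxError
	if errors.As(err, &se) {
		fmt.Printf("file=%s offset=%d err=%s\n", se.File, se.Offset, se.Err)
	}
}
```

Inspecting the structured fields this way avoids parsing the error string, whose format is not documented here.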

type Word ¶

type Word struct {
	ID WordID
	Lo int32 // Word appears at text[Lo:Hi].
	Hi int32
}

A Word represents a single word found in a text.

type WordID ¶

type WordID int32

A WordID is the index of a word in a dictionary.

const AnyWord WordID = -2

AnyWord represents a wildcard matching any word.

const BadWord WordID = -1

BadWord represents a word not present in the dictionary.
