ggl

Version: v0.7.0
Published: Jan 14, 2020 License: Apache-2.0 Imports: 13 Imported by: 0

Documentation

Overview

Package ggl contains an implementation of lex.Lexer using cloud.google.com/natural-language/. GLexer builds a graph of tokens from the API response and exposes it through the Lexer interface.

Constants

const MaxTextSizeInBytes = 1000000

MaxTextSizeInBytes is the maximum size, in bytes, of text that can be passed to the GNL API.

Variables

var ByteOrderMark = []byte{0xEF, 0xBB, 0xBF} //nolint

ByteOrderMark is used to detect a BOM in the source text, which is not allowed by the GNL API.
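
The two limits above can be checked before any text is sent to the API. The following is a minimal sketch, not the package's own code: `checkSource` and the two error values are hypothetical, and the constant and variable are redeclared locally so that the example is self-contained. It also assumes that a source of exactly MaxTextSizeInBytes bytes is still acceptable.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// Local copies of the documented package-level values.
const maxTextSizeInBytes = 1000000

var byteOrderMark = []byte{0xEF, 0xBB, 0xBF}

// Hypothetical error values, introduced only for this sketch.
var (
	errTooLarge = errors.New("source exceeds maximum text size")
	errHasBOM   = errors.New("source starts with a byte order mark")
)

// checkSource is a hypothetical helper. It rejects text that is larger
// than the size limit or that begins with a UTF-8 byte order mark.
func checkSource(source []byte) error {
	if len(source) > maxTextSizeInBytes {
		return errTooLarge
	}
	if bytes.HasPrefix(source, byteOrderMark) {
		return errHasBOM
	}
	return nil
}

func main() {
	fmt.Println(checkSource([]byte("hello")))
	fmt.Println(checkSource(append([]byte{0xEF, 0xBB, 0xBF}, "hello"...)))
}
```

Whether NewDocument performs these checks itself is not stated on this page, so a caller may prefer to rely on the error it returns.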

var SupportedEncodingTypes = map[string]gnlpb.EncodingType{
	"UTF-8":    gnlpb.EncodingType_UTF8,
	"UTF-16":   gnlpb.EncodingType_UTF16,
	"UTF-16BE": gnlpb.EncodingType_UTF16,
	"UTF-16LE": gnlpb.EncodingType_UTF16,
	"UTF-32":   gnlpb.EncodingType_UTF32,
	"UTF-32BE": gnlpb.EncodingType_UTF32,
	"UTF-32LE": gnlpb.EncodingType_UTF32,
}

SupportedEncodingTypes describes the encodings permitted by the GNL API. It maps from the detected IANA charset name to the corresponding value of the protobuf encoding enum.
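
A lookup against a map of this shape might look like the sketch below. To keep it self-contained, `encodingType` is a local stand-in for gnlpb.EncodingType, and `lookupEncoding` is a hypothetical helper that is not part of this package.

```go
package main

import "fmt"

// encodingType is a local stand-in for gnlpb.EncodingType.
type encodingType int

const (
	encodingNone encodingType = iota
	encodingUTF8
	encodingUTF16
	encodingUTF32
)

// supportedEncodingTypes mirrors the documented map: several charset
// names collapse onto the same API encoding value.
var supportedEncodingTypes = map[string]encodingType{
	"UTF-8":    encodingUTF8,
	"UTF-16":   encodingUTF16,
	"UTF-16BE": encodingUTF16,
	"UTF-16LE": encodingUTF16,
	"UTF-32":   encodingUTF32,
	"UTF-32BE": encodingUTF32,
	"UTF-32LE": encodingUTF32,
}

// lookupEncoding is a hypothetical helper. It reports whether the
// detected charset is supported and, if so, which encoding to request.
func lookupEncoding(charset string) (encodingType, bool) {
	enc, ok := supportedEncodingTypes[charset]
	return enc, ok
}

func main() {
	enc, ok := lookupEncoding("UTF-16LE")
	fmt.Println(enc == encodingUTF16, ok)

	_, ok = lookupEncoding("ISO-8859-1")
	fmt.Println(ok)
}
```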

Functions

func NewDocument

func NewDocument(source []byte) (doc *lex.Document, err error)

NewDocument creates a compatible document ready for the GLexer.

Types

type GLexer

type GLexer struct {
	// contains filtered or unexported fields
}

GLexer implements Lexer using cloud.google.com/natural-language/.

func NewGLexer

func NewGLexer() *GLexer

NewGLexer is a factory for GLexer

func NewInitialisedGLexer

func NewInitialisedGLexer(timeout time.Duration, source *lex.Document) (*GLexer, error)

NewInitialisedGLexer is a helper factory that creates and initialises a new GLexer ready for a call to Next(). Init calls the remote API cloud.google.com/natural-language/ and blocks for a period that depends on the length of the source string. A new context for the request is created based on the given timeout.
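
The timeout handling described above follows a common Go pattern: derive a context with a deadline, pass it to the blocking call, and release it afterwards. The sketch below illustrates that pattern only and is not this package's implementation. `initFunc`, `initWithTimeout`, `slowInit` and the two duration constants are hypothetical names, with `initFunc` standing in for a blocking initialiser such as (*GLexer).Init.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Hypothetical durations used by the demonstration.
const (
	quick = 10 * time.Millisecond
	slow  = time.Second
)

// initFunc stands in for a blocking initialiser such as (*GLexer).Init.
type initFunc func(ctx context.Context) error

// initWithTimeout derives a context that expires after timeout, runs the
// initialiser with it, and releases the context's resources on return.
func initWithTimeout(timeout time.Duration, run initFunc) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return run(ctx)
}

// slowInit simulates a remote call that takes d to complete, unless the
// context is done first, in which case the context's error is returned.
func slowInit(d time.Duration) initFunc {
	return func(ctx context.Context) error {
		select {
		case <-time.After(d):
			return nil
		case <-ctx.Done():
			return ctx.Err()
		}
	}
}

func main() {
	// The simulated call outlasts the timeout, so an error is reported.
	fmt.Println(initWithTimeout(quick, slowInit(slow)))
	// The simulated call finishes well within the timeout.
	fmt.Println(initWithTimeout(slow, slowInit(quick)))
}
```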

func (*GLexer) GetDocument

func (t *GLexer) GetDocument() *lex.Document

GetDocument implements lex.Lexer

func (*GLexer) GetExecTime

func (t *GLexer) GetExecTime() time.Duration

GetExecTime implements lex.Lexer

func (*GLexer) Init

func (t *GLexer) Init(ctx context.Context, source *lex.Document) error

Init calls cloud.google.com/natural-language/ with the source text to load data ready for Lexer.Next. Because Init makes a network call and then builds and walks a token graph, it may be slow to return.

func (*GLexer) InitWithClient

func (t *GLexer) InitWithClient(ctx context.Context, client *gnl.Client, source *lex.Document) error

InitWithClient initialises the lexer with a given GCP client, giving the caller the opportunity to optimise connection management, e.g. for Google Cloud Functions.

func (*GLexer) Next

func (t *GLexer) Next() (*lex.Token, error)

Next implements Lexer.Next and returns the next token in the sequence. Tokens are lexically ordered. A token with type EOF marks the last token; subsequent calls to Next may fail.
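
Since calls after the EOF token may fail, a consumer would typically stop reading as soon as EOF is seen. The sketch below shows such a loop against local stand-in types. `token`, `tokenType`, `lexer`, `sliceLexer` and `drain` are all hypothetical, and the real lex.Token fields and EOF representation may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// tokenType and token are simplified stand-ins for the lex package types.
type tokenType int

const (
	tokenWord tokenType = iota
	tokenEOF
)

type token struct {
	Type tokenType
	Text string
}

// lexer captures only the Next method discussed above.
type lexer interface {
	Next() (*token, error)
}

// sliceLexer is a fake lexer that replays a fixed list of tokens and
// returns an error once the list is exhausted.
type sliceLexer struct {
	tokens []token
	pos    int
}

func (l *sliceLexer) Next() (*token, error) {
	if l.pos >= len(l.tokens) {
		return nil, errors.New("no more tokens")
	}
	t := &l.tokens[l.pos]
	l.pos++
	return t, nil
}

// drain collects token text until the EOF token is seen and then stops,
// so Next is never called again after EOF.
func drain(l lexer) ([]string, error) {
	var out []string
	for {
		t, err := l.Next()
		if err != nil {
			return out, err
		}
		if t.Type == tokenEOF {
			return out, nil
		}
		out = append(out, t.Text)
	}
}

func main() {
	l := &sliceLexer{tokens: []token{
		{tokenWord, "hello"},
		{tokenWord, "world"},
		{tokenEOF, ""},
	}}
	words, err := drain(l)
	fmt.Println(words, err)
}
```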
