lex

package
v0.4.1 Latest
Warning

This package is not in the latest version of its module.

Published: Feb 6, 2026 License: GPL-3.0 Imports: 3 Imported by: 13

README

At the core of this package is the Entry struct. For details, please see documentation here: https://godoc.org/github.com/stts-se/pronlex/lex

Documentation

Overview

Package lex is used for general 'container' classes such as entry, transcription, lemma, etc.

The main unit here is the entry. The entry contains everything related to a lexicon entry: orthography, transcriptions, lemma, compound parts, sources/references, et cetera. It is implemented as a Go struct, and it can automatically be mapped into a JSON object.

The Entry struct is defined here: https://godoc.org/github.com/stts-se/pronlex/lex#Entry

A few JSON examples:

// Minimal example (English)
{
   strn: "things",
   transcriptions: [
   {
      strn: "' T I N z"
   }
   ]
}

// Entry "things" from the CMU (US English) lexicon
{
   id: 112326,
   lexRef: {
      DBRef: "en_am_cmu_lex",
      LexName: "en-us.cmu"
   },
   strn: "things",
   language: "en-us",
   partOfSpeech: "",
   morphology: "",
   wordParts: "",
   lemma: {
      id: 0,
      strn: "",
      reading: "",
      paradigm: ""
   },
   transcriptions: [
   {
      id: 120059,
      entryId: 112326,
      strn: "' T I N z",
      language: "",
      sources: [ ]
   }
   ],
   status: {
      id: 112326,
      name: "imported",
      source: "cmu",
      timestamp: "2017-09-20T13:13:21Z",
      current: true
   },
   entryValidations: [ ],
   preferred: false,
   tag: ""
}

// Entry "hästar" from the Swedish demo lexicon
{
   id: 6,
   lexRef: {
      DBRef: "wikispeech_lexserver_testdb",
      LexName: "sv"
   },
   strn: "hästar",
   language: "sv",
   partOfSpeech: "NN",
   morphology: "NEU IND PLU",
   wordParts: "hästar",
   lemma: {
      id: 4,
      strn: "häst",
      reading: "",
      paradigm: ""
   },
   transcriptions: [
   {
      id: 9,
      entryId: 6,
      strn: "\" h E . s t a r",
      language: "sv",
      sources: [ ]
   }
   ],
   status: {
      id: 6,
      name: "demo",
      source: "auto",
      timestamp: "2017-09-22T08:43:32Z",
      current: true
   },
   entryValidations: [ ],
   preferred: false,
   tag: ""
}
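
The examples above are written with unquoted keys for readability; strict JSON requires quoted keys. The following is a minimal sketch of decoding the minimal example and encoding it again with encoding/json. The Entry and Transcription types here are trimmed local mirrors declared only to keep the sketch self-contained, and decodeEntry and encodeEntry are hypothetical helper names, not part of this package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Transcription and Entry are trimmed local mirrors of the lex types,
// declared here only so the sketch compiles without the pronlex module.
type Transcription struct {
	Strn string `json:"strn"`
}

type Entry struct {
	Strn           string          `json:"strn"`
	Language       string          `json:"language,omitempty"`
	Transcriptions []Transcription `json:"transcriptions"`
}

// minimalJSON is the minimal "things" example, written as strict JSON.
const minimalJSON = `{"strn": "things", "transcriptions": [{"strn": "' T I N z"}]}`

// decodeEntry (hypothetical helper) unmarshals a JSON document into an Entry.
func decodeEntry(doc string) (Entry, error) {
	var e Entry
	err := json.Unmarshal([]byte(doc), &e)
	return e, err
}

// encodeEntry (hypothetical helper) marshals an Entry to compact JSON.
func encodeEntry(e Entry) (string, error) {
	b, err := json.Marshal(e)
	return string(b), err
}

func main() {
	e, err := decodeEntry(minimalJSON)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	out, err := encodeEntry(e)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(e.Strn, len(e.Transcriptions))
	fmt.Println(out)
}
```

With the real package, the same calls should work on lex.Entry directly, since the struct carries its own json tags.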

Constants

This section is empty.

Variables

var SourceDelimiter = " : "

SourceDelimiter is used to split a string of several sources into a slice.
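
As a rough illustration of how such a delimiter can be used, the sketch below splits and joins a sources string with the standard strings package. The constant sourceDelimiter is a local copy of the value shown above, and splitSources and joinSources are hypothetical helpers, not functions of this package.

```go
package main

import (
	"fmt"
	"strings"
)

// sourceDelimiter is a local copy of lex.SourceDelimiter,
// repeated here so the sketch is self-contained.
const sourceDelimiter = " : "

// splitSources (hypothetical helper) turns a delimited string into a slice.
// An empty input yields a nil slice instead of a slice with one empty item.
func splitSources(s string) []string {
	if s == "" {
		return nil
	}
	return strings.Split(s, sourceDelimiter)
}

// joinSources (hypothetical helper) is the inverse of splitSources.
func joinSources(sources []string) string {
	return strings.Join(sources, sourceDelimiter)
}

func main() {
	parts := splitSources("nst : manual")
	fmt.Println(len(parts), parts)
	fmt.Println(joinSources(parts))
}
```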

Functions

This section is empty.

Types

type DBRef

type DBRef string

DBRef is a database reference string (for mariadb: the database name; for sqlite: the database filename without extension)

func NewDBRef added in v0.4.1

func NewDBRef(dbName string) DBRef

NewDBRef creates a database reference from a (downcased) input string

type Entry

type Entry struct {
	ID               int64             `json:"id,omitempty"`
	LexRef           LexRef            `json:"lexRef,omitempty"`
	Strn             string            `json:"strn"`
	Language         string            `json:"language,omitempty"`
	PartOfSpeech     string            `json:"partOfSpeech,omitempty"`
	Morphology       string            `json:"morphology,omitempty"`
	WordParts        string            `json:"wordParts,omitempty"`
	Lemma            Lemma             `json:"lemma,omitempty"`
	Transcriptions   []Transcription   `json:"transcriptions"`
	EntryStatus      EntryStatus       `json:"status,omitempty"` // TODO Probably should be a slice of statuses?
	EntryValidations []EntryValidation `json:"entryValidations,omitempty"`

	// Preferred flag: 1=true, 0=false; schema triggers only one preferred per orthographic word
	//Preferred        int64             `json:"preferred"`
	Preferred bool           `json:"preferred,omitempty"`
	Tag       string         `json:"tag,omitempty"`
	Comments  []EntryComment `json:"comments,omitempty"`
}

Entry defines a lexical entry. It does not correspond one-to-one to the entry db table, since it also contains data from associated tables (Lemma, Tag, Transcription, EntryValidations). The Tag field holds an arbitrary, optional, lower-case string to disambiguate between different lex.Entries sharing the same orthography. Two different lex.Entries cannot have identical lex.Entry.Tags (the database should not allow this).

type EntryComment added in v0.4.1

type EntryComment struct {
	ID      int64  `json:"id,omitempty"`
	EntryID int64  `json:"entryId,omitempty"`
	Source  string `json:"source,omitempty"`
	Label   string `json:"label,omitempty"`
	Comment string `json:"comment,omitempty"`
}

func (EntryComment) String added in v0.4.1

func (c EntryComment) String() string

type EntryFileWriter

type EntryFileWriter struct {
	Writer io.Writer
	// contains filtered or unexported fields
}

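EntryFileWriter outputs formatted entries to an io.Writer. Example usage:

bf := bufio.NewWriter(f)
defer bf.Flush()
// the struct has unexported fields, so Writer must be set by name
bfx := lex.EntryFileWriter{Writer: bf}
// Write and Size have pointer receivers, so pass a pointer
err := dbapi.LookUp(db, q, &bfx)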

func (*EntryFileWriter) Size

func (w *EntryFileWriter) Size() int

Size returns the size of the EntryFileWriter content

func (*EntryFileWriter) Write

func (w *EntryFileWriter) Write(e Entry) error

Write is used to write one lex.Entry at a time to a file

type EntrySliceWriter

type EntrySliceWriter struct {
	Entries []Entry
}

EntrySliceWriter is a container for returning Entries from a LookUp call to the db. Example usage:

q := dbapi.Query{ ... }
var esw lex.EntrySliceWriter
err := dbapi.LookUp(db, q, &esw)
[...] esw.Entries // process Entries

func (*EntrySliceWriter) Size

func (w *EntrySliceWriter) Size() int

Size returns the size of the EntrySliceWriter content

func (*EntrySliceWriter) Write

func (w *EntrySliceWriter) Write(e Entry) error

Write is used to write one lex.Entry at a time to the Entries slice

type EntryStatus

type EntryStatus struct {
	ID     int64  `json:"id,omitempty"`
	Name   string `json:"name,omitempty"`
	Source string `json:"source,omitempty"`
	//EntryID int64  `json:"entryId"`
	//Timestamp int64  `json:"timestamp"`
	Timestamp string `json:"timestamp,omitempty"`
	Current   bool   `json:"current,omitempty"`
}

EntryStatus associates a status with an Entry. The status has a name (such as 'ok') and a source (a string identifying who or what generated the status).

type EntryValidation

type EntryValidation struct {
	ID int64 `json:"id,omitempty"`

	// Lower case name of level of severity
	Level     string `json:"level"`
	RuleName  string `json:"ruleName"`
	Message   string `json:"Message"`
	Timestamp string `json:"timestamp"`
}

EntryValidation associates a validation result with an Entry

func (EntryValidation) String

func (ev EntryValidation) String() string

type EntryWriter

type EntryWriter interface {
	Write(Entry) error
	Size() int
}

EntryWriter is an interface defining things to which one can write an Entry. See EntrySliceWriter, for returning a slice of Entry, and EntryFileWriter, for writing Entries to file.
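
To show how the interface can be satisfied, here is a minimal sketch of an in-memory writer and a producer that feeds it. Entry and EntryWriter are trimmed local mirrors so the example is self-contained; sliceWriter is written in the spirit of EntrySliceWriter but is not its actual source, and writeAll is a hypothetical stand-in for a lookup call that streams results into a writer.

```go
package main

import "fmt"

// Entry is a trimmed local mirror of lex.Entry.
type Entry struct {
	Strn string
}

// EntryWriter mirrors the interface documented above.
type EntryWriter interface {
	Write(Entry) error
	Size() int
}

// sliceWriter collects entries in memory (an assumption about how a
// slice-backed writer could look, not the package's own code).
type sliceWriter struct {
	Entries []Entry
}

// Write appends one entry to the slice.
func (w *sliceWriter) Write(e Entry) error {
	w.Entries = append(w.Entries, e)
	return nil
}

// Size reports how many entries have been written so far.
func (w *sliceWriter) Size() int {
	return len(w.Entries)
}

// writeAll (hypothetical) stands in for a producer such as a db lookup:
// it pushes entries one at a time and stops at the first error.
func writeAll(w EntryWriter, entries []Entry) error {
	for _, e := range entries {
		if err := w.Write(e); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	var w sliceWriter
	err := writeAll(&w, []Entry{{Strn: "things"}, {Strn: "hästar"}})
	if err != nil {
		fmt.Println("write failed:", err)
		return
	}
	fmt.Println(w.Size())
}
```

Because both methods use pointer receivers, it is the pointer (&w) that satisfies the interface, which is why the usage examples for the writers pass a pointer.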

type Lemma

type Lemma struct {
	ID       int64  `json:"id,omitempty"`
	Strn     string `json:"strn,omitempty"`
	Reading  string `json:"reading,omitempty"`
	Paradigm string `json:"paradigm,omitempty"`
}

Lemma corresponds to a row of the lemma db table

type LexName

type LexName string

LexName is a lexicon name

type LexRef

type LexRef struct {
	DBRef   DBRef   `json:"dbRef,omitempty"`
	LexName LexName `json:"lexName,omitempty"`
}

LexRef is a lexicon reference specified by DBRef and LexName

func NewLexRef

func NewLexRef(lexDB string, lexName string) LexRef

NewLexRef creates a lexicon reference from input (downcased) strings

func ParseLexRef

func ParseLexRef(fullLexName string) (LexRef, error)

ParseLexRef is used to parse a lexicon reference string into a LexRef struct

var fullLexName  = "pronlex:sv-se-nst"
var lexRef, _    = ParseLexRef(fullLexName)
// lexRef.DBRef  = pronlex
// lexRef.LexName = sv-se-nst
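
The parsing step in the example above can be sketched as follows. This is an illustration only: parseLexRef and the lexRef type are hypothetical local stand-ins, and the assumption that the string is split on a single ":" into exactly two non-empty parts is inferred from the example, not taken from the package source. The real function may also normalise case or accept other forms.

```go
package main

import (
	"fmt"
	"strings"
)

// lexRef is a simplified local stand-in for lex.LexRef.
type lexRef struct {
	DBRef   string
	LexName string
}

// parseLexRef (hypothetical) splits "db:lexicon" into its two parts.
// Assumption: exactly one ":" separator and two non-empty parts.
func parseLexRef(full string) (lexRef, error) {
	parts := strings.Split(full, ":")
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return lexRef{}, fmt.Errorf("invalid lexicon reference %q, expected db:lexicon", full)
	}
	return lexRef{DBRef: parts[0], LexName: parts[1]}, nil
}

func main() {
	ref, err := parseLexRef("pronlex:sv-se-nst")
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(ref.DBRef, ref.LexName)
}
```

Unlike the example above, callers will usually want to check the returned error instead of discarding it.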


func (LexRef) String

func (lr LexRef) String() string

type LexRefWithInfo

type LexRefWithInfo struct {
	LexRef        LexRef
	SymbolSetName string
}

LexRefWithInfo is a lexicon reference (LexRef) with additional info (SymbolSetName)

func NewLexRefWithInfo added in v0.4.1

func NewLexRefWithInfo(lexDB string, lexName string, symbolSetName string) LexRefWithInfo

NewLexRefWithInfo creates a lexicon reference with symbol set, from (downcased) input strings

type Transcription

type Transcription struct {
	ID       int64    `json:"id,omitempty"`
	EntryID  int64    `json:"entryId,omitempty"`
	Strn     string   `json:"strn"`
	Language string   `json:"language,omitempty"`
	Sources  []string `json:"sources,omitempty"`
}

Transcription corresponds to the transcription db table

func (*Transcription) AddSource

func (t *Transcription) AddSource(s string) error

AddSource adds a source string at the beginning of the Transcription.Sources slice. If the source is already present, AddSource silently skips it. AddSource returns an error when the input string contains the SourceDelimiter string.

func (Transcription) SourcesString

func (t Transcription) SourcesString() string

SourcesString returns the []string items of Transcription.Sources as a string, where the items are delimited by SourceDelimiter

type TranscriptionSlice

type TranscriptionSlice []Transcription

TranscriptionSlice is used for sorting according to ascending id

func (TranscriptionSlice) Len

func (a TranscriptionSlice) Len() int

func (TranscriptionSlice) Less

func (a TranscriptionSlice) Less(i, j int) bool

func (TranscriptionSlice) Swap

func (a TranscriptionSlice) Swap(i, j int)
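
Len, Less and Swap together satisfy sort.Interface from the standard library, so a slice of this kind can be passed to sort.Sort. The sketch below uses local stand-in types; the Less method comparing IDs is an assumption based on the "ascending id" description, and sortedIDs is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// transcription is a trimmed local stand-in for lex.Transcription.
type transcription struct {
	ID   int64
	Strn string
}

// transcriptionSlice mirrors the shape of lex.TranscriptionSlice.
type transcriptionSlice []transcription

func (a transcriptionSlice) Len() int      { return len(a) }
func (a transcriptionSlice) Swap(i, j int) { a[i], a[j] = a[j], a[i] }

// Less orders by ascending ID (assumed from the documentation).
func (a transcriptionSlice) Less(i, j int) bool { return a[i].ID < a[j].ID }

// sortedIDs (hypothetical helper) sorts a copy of the input and
// returns the IDs in their sorted order, leaving the input untouched.
func sortedIDs(ts []transcription) []int64 {
	sorted := make(transcriptionSlice, len(ts))
	copy(sorted, ts)
	sort.Sort(sorted)
	ids := make([]int64, 0, len(sorted))
	for _, t := range sorted {
		ids = append(ids, t.ID)
	}
	return ids
}

func main() {
	ts := []transcription{{ID: 9}, {ID: 3}, {ID: 7}}
	fmt.Println(sortedIDs(ts))
}
```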
