kowalski

package module

Version: v5.3.0 (latest), Published: Mar 14, 2022, License: MIT, Imports: 22, Imported by: 1

Repository

github.com/csmith/kowalski

README ¶

Kowalski

Kowalski, analysis

Kowalski is a Go library for performing various operations to help solve puzzles, and an accompanying Discord bot.

Supported functions

Wildcard matching

AKA Crossword Solving. Given an input with one or more missing letters (represented by ? characters), returns a list of dictionary words that match.

Anagram solving

Given a set of letters, possibly including ? wildcards, checks all possible anagrams and returns a list of dictionary words that match.

Morse decoding

Given a Morse-encoded word (represented with - and . characters) without spaces, finds all valid dictionary words that could match.

Image processing

Various utilities to analyse images, find hidden parts, etc.

Library usage

SpellChecker

Most functions that involve English text use the SpellChecker struct, which can indicate whether a word is a valid dictionary word or not. To create a new SpellChecker you must provide an io.Reader from which it can read words line-by-line, and a rough estimate of the number of words it will find:

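package example

import (
  "os"

  "github.com/csmith/kowalski/v5"
)

func create() (*kowalski.SpellChecker, error) {
  // The word list must contain one word per line.
  f, err := os.Open("file.txt")
  if err != nil {
    return nil, err
  }
  defer f.Close()

  // 100 is a rough estimate of the number of words in the file.
  return kowalski.CreateSpellChecker(f, 100)
}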

As creating a spellchecker can be expensive and cumbersome, Kowalski supports serialising data to disk and loading it back. Some examples of these serialised models are available in the models directory. To load a model:

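package example

import (
  "os"

  "github.com/csmith/kowalski/v5"
)

func create() (*kowalski.SpellChecker, error) {
  // The file must hold a previously serialised model.
  f, err := os.Open("file.wl")
  if err != nil {
    return nil, err
  }
  defer f.Close()

  return kowalski.LoadSpellChecker(f)
}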

This repository also contains a command-line tool to generate a new SpellChecker and export the serialised model:

go run ./cmd/compile -in wordlist.txt -out model.wl

Vellum

The fst package contains automata for use with the Vellum finite state transducer. This allows for quick matching of anagrams, regular expressions, etc, against a pre-prepared transducer.

To use these you need to open a transducer using Vellum, create an iterator, and then iterate over it:

package example

import (
  "fmt"

  "github.com/blevesearch/vellum"
  "github.com/csmith/kowalski/v5/fst"
)

func create() {
  transducer, err := vellum.Open("some_file.fst")
  if err != nil {
    panic(err)
  }
  defer transducer.Close()

  // Search returns vellum.ErrIteratorDone when nothing matches, so that
  // error marks the end of iteration and is not treated as a failure.
  i, err := transducer.Search(fst.NewMorseAutomaton("... --- .-.. .. -.-."), nil, nil)
  for err == nil {
    key, val := i.Current()
    fmt.Printf("Do something here with %s %d\n", key, val)
    err = i.Next()
  }
  if err != vellum.ErrIteratorDone {
    panic(err)
  }
}

Discord bot

This repository also contains a Discord bot that allows users to perform analysis.

It currently supports these commands:

!anagram Attempts to find single-word anagrams, expanding '*' and '?' wildcards
!analysis Analyses text and provides a summary of potentially interesting findings [Aliases: !analyze, !analyse]
!chunk Splits the text into chunks of a given size
!colours Counts the colours within the image [Aliases: !colors]
!hidden Finds hidden pixels in images [Aliases: !hiddenpixels]
!letters Shows a frequency histogram of the number of letters in the input
!match Attempts to expand '?' wildcards to find a single-word match
!morse Attempts to split a morse code input to spell a single word
!multigram Attempts to find multi-word anagrams, expanding '?' wildcards [Aliases: !multianagram]
!multimatch Attempts to expand '?' wildcards to find multi-word matches
!obo Finds all words that are one character different from the input [Aliases: !offbyone, !ob1]
!rgb Splits an image into its red, green and blue channels
!shift Shows the result of the 25 possible caesar shifts [Aliases: !caesar]
!t9 Attempts to treat a series of numbers as T9 input to spell a single word
!transpose Transposes columns to rows and rows to columns
!wordsearch Searches for words in the given text grid
!help Shows this help text
!fstanagram Attempts to find anagrams from wikipedia, expanding '*' wildcards [Aliases: !fstagram]
!fstregex Attempts to find word matches from wikipedia using regexp [Aliases: !fstre]
!fstmorse Attempts to find word matches from wikipedia using morse

Documentation ¶

Index ¶

func Anagram(ctx context.Context, checker *SpellChecker, word string) ([]string, error)
func Analyse(checker *SpellChecker, input string) []string
func Chunk(input string, parts ...int) []string
func Dedupe(options *multiplexOptions)
func FindWords(checker *SpellChecker, input string) []string
func FromMorse(checker *SpellChecker, input string) []string
func FromT9(checker *SpellChecker, input string) []string
func HiddenPixels(reader io.Reader) (io.Reader, error)
func Match(ctx context.Context, checker *SpellChecker, pattern string) ([]string, error)
func MultiAnagram(ctx context.Context, checker *SpellChecker, word string) ([]string, error)
func MultiMatch(ctx context.Context, checker *SpellChecker, pattern string) ([]string, error)
func MultiplexAnagram(ctx context.Context, checkers []*SpellChecker, pattern string, ...) ([][]string, error)
func MultiplexFindWords(checkers []*SpellChecker, pattern string, opts ...MultiplexOption) [][]string
func MultiplexFromMorse(checkers []*SpellChecker, pattern string, opts ...MultiplexOption) [][]string
func MultiplexFromT9(checkers []*SpellChecker, pattern string, opts ...MultiplexOption) [][]string
func MultiplexMatch(ctx context.Context, checkers []*SpellChecker, pattern string, ...) ([][]string, error)
func MultiplexMultiAnagram(ctx context.Context, checkers []*SpellChecker, pattern string, ...) ([][]string, error)
func MultiplexMultiMatch(ctx context.Context, checkers []*SpellChecker, pattern string, ...) ([][]string, error)
func MultiplexOffByOne(ctx context.Context, checkers []*SpellChecker, pattern string, ...) ([][]string, error)
func MultiplexWordSearch(checkers []*SpellChecker, pattern []string, opts ...MultiplexOption) [][]string
func OffByOne(ctx context.Context, checker *SpellChecker, input string) ([]string, error)
func SaveSpellChecker(writer io.Writer, checker *SpellChecker) error
func Score(checker *SpellChecker, input string) float64
func SplitRGB(reader io.Reader) (r, g, b io.Reader, err error)
func Transpose(input []string) []string
func WordSearch(checker *SpellChecker, input []string) []string
type ColourCount
- func ExtractColours(reader io.Reader) ([]ColourCount, error)
type MultiplexOption
type SpellChecker
- func CreateSpellChecker(reader io.Reader, wordCount int) (*SpellChecker, error)
- func LoadSpellChecker(reader io.Reader) (*SpellChecker, error)
- func (c *SpellChecker) Prefix(prefix string) bool
- func (c *SpellChecker) Valid(word string) bool

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func Anagram ¶

func Anagram(ctx context.Context, checker *SpellChecker, word string) ([]string, error)

Anagram finds all single-word anagrams of the given word, expanding '?' as a single wildcard character.
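
One way to picture the check behind this kind of operation is to compare letter counts. The following is a minimal sketch, not Kowalski's implementation: `isAnagram` and `anagrams` are hypothetical helpers, and a plain slice of words stands in for a SpellChecker.

```go
package main

import (
	"fmt"
	"sort"
)

// isAnagram reports whether candidate can be formed by rearranging the
// letters of pattern, where each '?' in pattern stands for any one letter.
// Hypothetical helper for illustration only.
func isAnagram(pattern, candidate string) bool {
	p, c := []rune(pattern), []rune(candidate)
	if len(p) != len(c) {
		return false
	}
	counts := map[rune]int{}
	wildcards := 0
	for _, r := range p {
		if r == '?' {
			wildcards++
		} else {
			counts[r]++
		}
	}
	for _, r := range c {
		if counts[r] > 0 {
			counts[r]-- // consume a fixed letter from the pattern
		} else {
			wildcards-- // otherwise the letter must come from a wildcard
		}
	}
	return wildcards == 0
}

// anagrams filters a word list down to the anagrams of pattern, sorted.
func anagrams(words []string, pattern string) []string {
	var out []string
	for _, w := range words {
		if isAnagram(pattern, w) {
			out = append(out, w)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	words := []string{"listen", "silent", "enlist", "tinsel", "google"}
	fmt.Println(anagrams(words, "l?sten"))
}
```

Scanning a whole word list like this is linear in the dictionary size; it is only meant to show the matching rule.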

func Analyse ¶

func Analyse(checker *SpellChecker, input string) []string

Analyse performs various forms of text analysis on the input and returns findings.

func Chunk ¶

func Chunk(input string, parts ...int) []string

Chunk takes the input and splits it into chunks of the given lengths. If the input is longer than the part lengths combined, the lengths are repeated.
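
The repeating behaviour can be sketched as below. This is an illustration under assumptions, not the library's code: `chunk` is a hypothetical function, and the handling of a short final piece and of non-positive lengths is a guess.

```go
package main

import "fmt"

// chunk splits input into pieces whose lengths cycle through sizes.
// Assumption: the final piece is simply shorter if the input runs out.
func chunk(input string, sizes ...int) []string {
	runes := []rune(input)
	var out []string
	for i := 0; len(runes) > 0 && len(sizes) > 0; i++ {
		n := sizes[i%len(sizes)]
		if n <= 0 {
			break // assumption: stop on a non-positive size to avoid looping forever
		}
		if n > len(runes) {
			n = len(runes)
		}
		out = append(out, string(runes[:n]))
		runes = runes[n:]
	}
	return out
}

func main() {
	// Lengths 2 and 3 are reused until the input is exhausted.
	fmt.Println(chunk("abcdefgh", 2, 3))
}
```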

func Dedupe ¶

func Dedupe(options *multiplexOptions)

Dedupe removes duplicate entries from multiplexed results. That is, if the first checker provides words A, B and C, the second checker provides B and D, and the third A, D, and E, then the result will be: {A,B,C},{D},{E}.
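
The effect described above can be reproduced on plain slices. The sketch below is not the option's actual implementation; `dedupe` is a hypothetical function that keeps each word only in the first group where it appears.

```go
package main

import "fmt"

// dedupe keeps each word only in the first group in which it appears,
// preserving the number and order of groups. Hypothetical helper.
func dedupe(results [][]string) [][]string {
	seen := map[string]bool{}
	out := make([][]string, len(results))
	for i, group := range results {
		for _, w := range group {
			if !seen[w] {
				seen[w] = true
				out[i] = append(out[i], w)
			}
		}
	}
	return out
}

func main() {
	in := [][]string{{"A", "B", "C"}, {"B", "D"}, {"A", "D", "E"}}
	fmt.Println(dedupe(in))
}
```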

func FindWords ¶

func FindWords(checker *SpellChecker, input string) []string

FindWords attempts to find substrings of the input that are valid words according to the checker. Duplicates may be present in the output if they occur at multiple positions.

func FromMorse ¶

func FromMorse(checker *SpellChecker, input string) []string

FromMorse takes a sequence of morse signals (as ASCII dots and hyphens) and returns a set of possible words that could be constructed from them.
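
Because the input has no letter breaks, decoding means trying every way of splitting the signal into letters. The following is a minimal sketch of that idea, not Kowalski's implementation: `fromMorse` is hypothetical, a map of words stands in for a SpellChecker, and no prefix pruning is done, so it only suits short inputs.

```go
package main

import (
	"fmt"
	"strings"
)

// morse maps each lowercase letter to its International Morse code.
var morse = map[rune]string{
	'a': ".-", 'b': "-...", 'c': "-.-.", 'd': "-..", 'e': ".",
	'f': "..-.", 'g': "--.", 'h': "....", 'i': "..", 'j': ".---",
	'k': "-.-", 'l': ".-..", 'm': "--", 'n': "-.", 'o': "---",
	'p': ".--.", 'q': "--.-", 'r': ".-.", 's': "...", 't': "-",
	'u': "..-", 'v': "...-", 'w': ".--", 'x': "-..-", 'y': "-.--",
	'z': "--..",
}

// fromMorse tries every split of input into letters and keeps the
// letter sequences that are present in words.
func fromMorse(words map[string]bool, input string) []string {
	var out []string
	var walk func(rest, soFar string)
	walk = func(rest, soFar string) {
		if rest == "" {
			if words[soFar] {
				out = append(out, soFar)
			}
			return
		}
		for letter := 'a'; letter <= 'z'; letter++ {
			if code := morse[letter]; strings.HasPrefix(rest, code) {
				walk(rest[len(code):], soFar+string(letter))
			}
		}
	}
	walk(input, "")
	return out
}

func main() {
	words := map[string]bool{"u": true, "it": true, "ea": true, "zoo": true}
	fmt.Println(fromMorse(words, "..-"))
}
```

A real implementation would likely abandon a branch as soon as the letters so far cannot begin any word, which is the kind of question a prefix check answers.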

func FromT9 ¶

func FromT9(checker *SpellChecker, input string) []string

FromT9 takes an input that represents a sequence of key presses on a T9 keyboard and returns possible words that match. The input should not contain spaces (the "0" digit) - words should be solved independently, to avoid an explosion of possible results.
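
As an illustration of the key mapping involved, the sketch below encodes each candidate word to its digit sequence and compares it with the input. It is not the library's approach: `toT9` and `fromT9` are hypothetical, and a plain word slice stands in for a SpellChecker.

```go
package main

import "fmt"

// layout gives the keypad digit for each letter a..z
// (2=abc, 3=def, 4=ghi, 5=jkl, 6=mno, 7=pqrs, 8=tuv, 9=wxyz).
const layout = "22233344455566677778889999"

// toT9 converts a lowercase a-z word to its key presses.
// It returns false if the word contains any other character.
func toT9(word string) (string, bool) {
	digits := make([]byte, 0, len(word))
	for i := 0; i < len(word); i++ {
		c := word[i]
		if c < 'a' || c > 'z' {
			return "", false
		}
		digits = append(digits, layout[c-'a'])
	}
	return string(digits), true
}

// fromT9 returns the words whose key presses equal digits.
func fromT9(words []string, digits string) []string {
	var out []string
	for _, w := range words {
		if enc, ok := toT9(w); ok && enc == digits {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	words := []string{"good", "home", "gone", "hello"}
	fmt.Println(fromT9(words, "4663"))
}
```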

func HiddenPixels ¶

func HiddenPixels(reader io.Reader) (io.Reader, error)

HiddenPixels attempts to find patterns of hidden pixels (those that are consistently used near very similar colours).

func Match ¶

func Match(ctx context.Context, checker *SpellChecker, pattern string) ([]string, error)

Match returns all valid words that match the given pattern, expanding '?' as a single character wildcard.
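
The matching rule itself is simple to state in code. This is a minimal sketch assuming a plain word slice in place of a SpellChecker; `matches` and `match` are hypothetical names, not part of the package.

```go
package main

import "fmt"

// matches reports whether word fits pattern, where '?' in the pattern
// accepts any single character and every other character must be equal.
func matches(pattern, word string) bool {
	p, w := []rune(pattern), []rune(word)
	if len(p) != len(w) {
		return false
	}
	for i := range p {
		if p[i] != '?' && p[i] != w[i] {
			return false
		}
	}
	return true
}

// match returns the words from the list that fit the pattern.
func match(words []string, pattern string) []string {
	var out []string
	for _, w := range words {
		if matches(pattern, w) {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	words := []string{"cat", "cot", "cut", "cart", "dog"}
	fmt.Println(match(words, "c?t"))
}
```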

func MultiAnagram ¶

func MultiAnagram(ctx context.Context, checker *SpellChecker, word string) ([]string, error)

MultiAnagram finds all single- and multi-word anagrams of the given word, expanding '?' as a single wildcard character. To avoid duplicates, words are sorted lexicographically (i.e., "a ball" will be returned and "ball a" will not).

func MultiMatch ¶

func MultiMatch(ctx context.Context, checker *SpellChecker, pattern string) ([]string, error)

MultiMatch returns valid sequences of words that match the given pattern, expanding '?' as a single character wildcard. To reduce the search space, multi-match will first try to look for matches consisting only of longer words, then gradually reduce that threshold until at least one match is found.

func MultiplexAnagram ¶

func MultiplexAnagram(ctx context.Context, checkers []*SpellChecker, pattern string, opts ...MultiplexOption) ([][]string, error)

MultiplexAnagram performs the Anagram operation over a number of different checkers.

func MultiplexFindWords ¶

func MultiplexFindWords(checkers []*SpellChecker, pattern string, opts ...MultiplexOption) [][]string

MultiplexFindWords performs the FindWords operation over a number of different checkers.

func MultiplexFromMorse ¶

func MultiplexFromMorse(checkers []*SpellChecker, pattern string, opts ...MultiplexOption) [][]string

MultiplexFromMorse performs the FromMorse operation over a number of different checkers.

func MultiplexFromT9 ¶

func MultiplexFromT9(checkers []*SpellChecker, pattern string, opts ...MultiplexOption) [][]string

MultiplexFromT9 performs the FromT9 operation over a number of different checkers.

func MultiplexMatch ¶

func MultiplexMatch(ctx context.Context, checkers []*SpellChecker, pattern string, opts ...MultiplexOption) ([][]string, error)

MultiplexMatch performs the Match operation over a number of different checkers.

func MultiplexMultiAnagram ¶

func MultiplexMultiAnagram(ctx context.Context, checkers []*SpellChecker, pattern string, opts ...MultiplexOption) ([][]string, error)

MultiplexMultiAnagram performs the MultiAnagram operation over a number of different checkers.

func MultiplexMultiMatch ¶

func MultiplexMultiMatch(ctx context.Context, checkers []*SpellChecker, pattern string, opts ...MultiplexOption) ([][]string, error)

MultiplexMultiMatch performs the MultiMatch operation over a number of different checkers.

func MultiplexOffByOne ¶

func MultiplexOffByOne(ctx context.Context, checkers []*SpellChecker, pattern string, opts ...MultiplexOption) ([][]string, error)

MultiplexOffByOne performs the OffByOne operation over a number of different checkers.

func MultiplexWordSearch ¶

func MultiplexWordSearch(checkers []*SpellChecker, pattern []string, opts ...MultiplexOption) [][]string

MultiplexWordSearch performs the WordSearch operation over a number of different checkers.

func OffByOne ¶

func OffByOne(ctx context.Context, checker *SpellChecker, input string) ([]string, error)

OffByOne returns all words that can be made by performing one character change on the input. The input is assumed to be a single, lowercase word containing a-z chars only.
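
A sketch of the idea, under the assumption that "one character change" means substituting a single letter (insertions and deletions are not considered here). `offByOne` is a hypothetical function and a map of words stands in for a SpellChecker.

```go
package main

import "fmt"

// offByOne returns every dictionary word reachable from input by
// replacing exactly one letter with a different a-z letter.
// Assumption: input is a single lowercase a-z word.
func offByOne(words map[string]bool, input string) []string {
	var out []string
	b := []byte(input)
	for i := range b {
		orig := b[i]
		for c := byte('a'); c <= 'z'; c++ {
			if c == orig {
				continue
			}
			b[i] = c
			if words[string(b)] {
				out = append(out, string(b))
			}
		}
		b[i] = orig // restore before moving to the next position
	}
	return out
}

func main() {
	words := map[string]bool{
		"cat": true, "bat": true, "cot": true,
		"car": true, "dog": true, "cart": true,
	}
	fmt.Println(offByOne(words, "cat"))
}
```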

func SaveSpellChecker ¶

func SaveSpellChecker(writer io.Writer, checker *SpellChecker) error

SaveSpellChecker serialises the given checker and writes it to the writer. It can later be restored with LoadSpellChecker.

func Score ¶

func Score(checker *SpellChecker, input string) float64

Score assigns a score to an input showing how likely it is to be English text. A score of 1.0 means almost certainly English, a score of 0.0 means almost certainly not. This is fairly arbitrary and is not very good.

func SplitRGB ¶

func SplitRGB(reader io.Reader) (r, g, b io.Reader, err error)

SplitRGB reads an image, and returns three readers containing PNG encoded images containing just the red, green and blue channels respectively. If the input image is excessively small, the outputs will be scaled up.

func Transpose ¶

func Transpose(input []string) []string

Transpose rotates the input text so rows become columns, and columns become rows.
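
For illustration, a transposition over a slice of rows might look like the sketch below. `transpose` is a hypothetical function; padding ragged rows with spaces is an assumption, as the behaviour for uneven input is not described here.

```go
package main

import (
	"fmt"
	"strings"
)

// transpose turns rows into columns: output line n is made of the
// n-th character of every input line, in order.
func transpose(rows []string) []string {
	width := 0
	grid := make([][]rune, len(rows))
	for i, r := range rows {
		grid[i] = []rune(r)
		if len(grid[i]) > width {
			width = len(grid[i])
		}
	}
	out := make([]string, width)
	for col := 0; col < width; col++ {
		var sb strings.Builder
		for _, row := range grid {
			if col < len(row) {
				sb.WriteRune(row[col])
			} else {
				sb.WriteRune(' ') // assumption: pad short rows with spaces
			}
		}
		out[col] = sb.String()
	}
	return out
}

func main() {
	for _, line := range transpose([]string{"abc", "def"}) {
		fmt.Println(line)
	}
}
```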

func WordSearch ¶

func WordSearch(checker *SpellChecker, input []string) []string

WordSearch returns all words found by FindWords in the input word search grid. Words may occur horizontally, vertically or diagonally, and may read in either direction. If a word is found multiple times in different places it will be returned multiple times.
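
The eight-direction scan can be sketched as follows. This is not the package's implementation: `wordSearch` is hypothetical, the grid is assumed to be ASCII, a map of words replaces the SpellChecker, and the `minLen` parameter is an invented knob to skip very short words.

```go
package main

import "fmt"

// wordSearch walks from every cell in all eight directions and records
// each run of letters that is a dictionary word of at least minLen.
func wordSearch(words map[string]bool, grid []string, minLen int) []string {
	dirs := [8][2]int{
		{0, 1}, {1, 0}, {1, 1}, {1, -1},
		{0, -1}, {-1, 0}, {-1, -1}, {-1, 1},
	}
	var out []string
	for r := 0; r < len(grid); r++ {
		for c := 0; c < len(grid[r]); c++ {
			for _, d := range dirs {
				var run []byte
				y, x := r, c
				// Extend the run one cell at a time until the edge is reached.
				for y >= 0 && y < len(grid) && x >= 0 && x < len(grid[y]) {
					run = append(run, grid[y][x])
					if len(run) >= minLen && words[string(run)] {
						out = append(out, string(run))
					}
					y, x = y+d[0], x+d[1]
				}
			}
		}
	}
	return out
}

func main() {
	grid := []string{
		"cat",
		"oxa",
		"god",
	}
	words := map[string]bool{"cat": true, "cog": true, "dog": true, "tad": true}
	fmt.Println(wordSearch(words, grid, 3))
}
```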

Types ¶

type ColourCount ¶

type ColourCount struct {
	Colour color.Color
	Count  int
}

func ExtractColours ¶

func ExtractColours(reader io.Reader) ([]ColourCount, error)

ExtractColours returns a sorted slice containing each individual colour used in the image, and the total number of pixels that have that colour. Colours are sorted from most-used to least-used.
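
Counting colours in this way can be sketched with the standard image packages. The code below is an illustration, not the library's implementation: `colourCount`, `countColours` and `demoImage` are hypothetical names, it works on an already decoded image rather than an io.Reader, and the order of colours with equal counts is left unspecified.

```go
package main

import (
	"fmt"
	"image"
	"image/color"
	"sort"
)

// colourCount mirrors the shape of the documented ColourCount type.
type colourCount struct {
	Colour color.Color
	Count  int
}

// countColours tallies every distinct colour in img and returns the
// tallies sorted from most-used to least-used.
func countColours(img image.Image) []colourCount {
	counts := map[color.RGBA]int{}
	b := img.Bounds()
	for y := b.Min.Y; y < b.Max.Y; y++ {
		for x := b.Min.X; x < b.Max.X; x++ {
			// Normalise to RGBA so the colour can be used as a map key.
			c := color.RGBAModel.Convert(img.At(x, y)).(color.RGBA)
			counts[c]++
		}
	}
	out := make([]colourCount, 0, len(counts))
	for c, n := range counts {
		out = append(out, colourCount{Colour: c, Count: n})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Count > out[j].Count })
	return out
}

// demoImage builds a 2x2 image with three red pixels and one blue pixel.
func demoImage() image.Image {
	img := image.NewRGBA(image.Rect(0, 0, 2, 2))
	red := color.RGBA{R: 255, A: 255}
	blue := color.RGBA{B: 255, A: 255}
	img.Set(0, 0, red)
	img.Set(1, 0, red)
	img.Set(0, 1, red)
	img.Set(1, 1, blue)
	return img
}

func main() {
	for _, cc := range countColours(demoImage()) {
		fmt.Printf("%v x%d\n", cc.Colour, cc.Count)
	}
}
```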

type MultiplexOption ¶

type MultiplexOption func(*multiplexOptions)

type SpellChecker ¶

type SpellChecker struct {
	// contains filtered or unexported fields
}

SpellChecker provides a way to tell whether a word exists in a dictionary.

func CreateSpellChecker ¶

func CreateSpellChecker(reader io.Reader, wordCount int) (*SpellChecker, error)

CreateSpellChecker creates a new SpellChecker by reading words line-by-line from the given reader. The wordCount parameter should be an approximation of the number of words available.

This is likely to be a relatively expensive operation; for routine use prefer saving the spell checker via SaveSpellChecker and restoring it with LoadSpellChecker.

func LoadSpellChecker ¶

func LoadSpellChecker(reader io.Reader) (*SpellChecker, error)

LoadSpellChecker attempts to load a SpellChecker that was previously saved with SaveSpellChecker.

func (*SpellChecker) Prefix ¶

func (c *SpellChecker) Prefix(prefix string) bool

Prefix determines - probabilistically - whether the given string is a prefix of any known word. There is a small chance of false positives, i.e. an input that is not a prefix to any word in the word list might be incorrectly identified as valid; there is no chance of false negatives.

func (*SpellChecker) Valid ¶

func (c *SpellChecker) Valid(word string) bool

Valid determines - probabilistically - whether the given word was in the word list used to create this SpellChecker. There is a small chance of false positives, i.e. a word that wasn't in the word list might be incorrectly identified as valid; there is no chance of false negatives.

Directories ¶

Path
cmd
cmd/compile
cmd/kowalski
data
fst
