kowalski

package module
v4.0.2
Published: Apr 2, 2021 License: MIT Imports: 20 Imported by: 0

README

Kowalski

Kowalski is a Go library for performing various operations to help solve puzzles.

Supported functions

Wildcard matching

AKA Crossword Solving. Given an input with one or more missing letters (represented by ? characters), returns a list of dictionary words that match.

Anagram solving

Given a set of letters, possibly including ? wildcards, checks all possible anagrams and returns a list of dictionary words that match.

Morse decoding

Given a Morse-encoded word (represented with - and . characters) without spaces, finds all valid dictionary words that could match.

Usage

The current functions all revolve around the SpellChecker struct, which can indicate whether a word is a valid dictionary word or not. To create a new SpellChecker you must provide it with an io.Reader from which it can read words line-by-line, and a rough estimate of the number of words it will find:

package example

import (
    "github.com/csmith/kowalski/v4"
    "os"
)

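func create() (*kowalski.SpellChecker, error) {
    f, err := os.Open("file.txt")
    if err != nil {
        return nil, err
    }
    defer f.Close()

    // 100 is a rough estimate of the number of words in file.txt.
    return kowalski.CreateSpellChecker(f, 100)
}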

As creating a spellchecker can be expensive and cumbersome, Kowalski supports serialising data to disk and loading it back. Some examples of these serialised models are available in the models directory. To load a model:

package example

import (
    "github.com/csmith/kowalski/v4"
    "os"
)

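func create() (*kowalski.SpellChecker, error) {
    f, err := os.Open("file.wl")
    if err != nil {
        return nil, err
    }
    defer f.Close()

    // file.wl must have been written by SaveSpellChecker.
    return kowalski.LoadSpellChecker(f)
}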

This repository also contains a command-line tool to generate a new SpellChecker and export the serialised model:

go run ./cmd/compile -in wordlist.txt -out model.wl

Discord bot

This repository also contains a Discord bot that allows users to perform analysis.

It currently supports these commands:

  • analysis <term> performs some analysis on the input and returns hints about what it could be.

  • match <term> returns all known words that match the given term, where '?' is a single-character wildcard. e.g. match melism? will return melisma.

  • anagram <term> returns all known anagrams that match the given term, where '?' is a single-character wildcard. e.g. anagram lismem? will return melisma.

  • letters <term> shows a chart of the distribution of the English letters (A-Z, ignoring case) in the given term.

  • morse <term> returns all possible words that match the given Morse code (specified using - and .), ignoring all spaces/pauses.

  • shift <term> performs Caesar shifts of 1-25 on the term and displays them.

  • t9 <term> returns all possible words that match the given T9 input (specified using numbers).

  • wordsearch <grid> returns all found words in the word search.

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func Anagram

func Anagram(ctx context.Context, checker *SpellChecker, word string) ([]string, error)

Anagram finds all single-word anagrams of the given word, expanding '?' as a single wildcard character.
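
The wildcard rule can be illustrated without a dictionary. The following is a minimal sketch, not Kowalski's implementation: isAnagram is a hypothetical helper that compares letter counts and lets each '?' absorb one leftover letter. It assumes ASCII input.

```go
package main

import "fmt"

// isAnagram is a hypothetical helper, not part of Kowalski's API. It reports
// whether word can be formed from the letters of term, where each '?' in term
// stands for exactly one arbitrary letter.
func isAnagram(term, word string) bool {
	if len(term) != len(word) {
		return false
	}
	var counts [256]int
	for i := 0; i < len(word); i++ {
		counts[word[i]]++
	}
	for i := 0; i < len(term); i++ {
		c := term[i]
		if c == '?' {
			continue // wildcards absorb whatever letters are left over
		}
		counts[c]--
		if counts[c] < 0 {
			return false // term has a letter the word cannot supply
		}
	}
	return true
}

func main() {
	for _, w := range []string{"melisma", "melodic", "mailmen"} {
		fmt.Println(w, isAnagram("lismem?", w))
	}
}
```

Because the lengths are equal, every letter of the word that is not consumed by a literal letter of the term is necessarily covered by a wildcard, so no separate wildcard count is needed.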

func Analyse

func Analyse(checker *SpellChecker, input string) []string

Analyse performs various forms of text analysis on the input and returns findings.

func CaesarShift

func CaesarShift(input string, count uint8) string

CaesarShift performs a Caesar shift of the given amount on all A-Z characters.
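
For readers unfamiliar with the cipher, here is a minimal sketch of a Caesar shift. caesarShift is a hypothetical stand-in with the same parameter types as CaesarShift; preserving case and leaving non-letters untouched are assumptions of this sketch, not documented behaviour of the library.

```go
package main

import "fmt"

// caesarShift is a hypothetical stand-in, not Kowalski's implementation.
// It rotates ASCII letters by count positions, preserving case, and leaves
// every other byte untouched.
func caesarShift(input string, count uint8) string {
	shift := count % 26
	out := []byte(input)
	for i, c := range out {
		switch {
		case c >= 'a' && c <= 'z':
			out[i] = 'a' + (c-'a'+shift)%26
		case c >= 'A' && c <= 'Z':
			out[i] = 'A' + (c-'A'+shift)%26
		}
	}
	return string(out)
}

func main() {
	fmt.Println(caesarShift("Hello, World", 3)) // prints "Khoor, Zruog"
}
```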

func CaesarShifts

func CaesarShifts(input string) [25]string

CaesarShifts performs all 25 possible Caesar shifts on the input.

func Chunk

func Chunk(input string, parts ...int) []string

Chunk takes the input and splits it into chunks of the given lengths. If the input is longer than the part lengths combined, the lengths will be repeated.
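
The repetition of lengths is easiest to see in a small example. This is a sketch under assumptions: chunk is a hypothetical function, it counts bytes rather than runes, and its handling of a short final piece and of non-positive lengths is a choice made here, not something the library documents.

```go
package main

import "fmt"

// chunk is a hypothetical stand-in for illustration. It cuts input into
// pieces whose lengths cycle through parts; the final piece may be shorter.
// With no lengths, or any non-positive length, the input is returned unsplit
// (a choice made for this sketch only).
func chunk(input string, parts ...int) []string {
	if len(parts) == 0 {
		return []string{input}
	}
	for _, p := range parts {
		if p <= 0 {
			return []string{input}
		}
	}
	var out []string
	for i := 0; len(input) > 0; i++ {
		n := parts[i%len(parts)] // reuse the lengths from the start when they run out
		if n > len(input) {
			n = len(input)
		}
		out = append(out, input[:n])
		input = input[n:]
	}
	return out
}

func main() {
	fmt.Println(chunk("abcdefghij", 2, 3)) // prints "[ab cde fg hij]"
}
```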

func Dedupe

func Dedupe(options *multiplexOptions)

Dedupe removes duplicate entries from multiplexed results. That is, if the first checker provides words A, B and C, the second checker provides B and D, and the third A, D, and E, then the result will be: {A,B,C},{D},{E}.
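
Dedupe has the signature required of a MultiplexOption, so it is presumably passed as one of the opts arguments to the Multiplex functions. The effect described above can be sketched on plain slices; dedupe below is a hypothetical helper, not the library's code.

```go
package main

import "fmt"

// dedupe is a hypothetical helper that mirrors the documented behaviour:
// a word is kept only in the first group where it appears.
func dedupe(groups [][]string) [][]string {
	seen := make(map[string]bool)
	out := make([][]string, len(groups))
	for i, group := range groups {
		out[i] = []string{}
		for _, word := range group {
			if !seen[word] {
				seen[word] = true
				out[i] = append(out[i], word)
			}
		}
	}
	return out
}

func main() {
	groups := [][]string{{"A", "B", "C"}, {"B", "D"}, {"A", "D", "E"}}
	fmt.Println(dedupe(groups)) // prints "[[A B C] [D] [E]]"
}
```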

func FindWords

func FindWords(checker *SpellChecker, input string) []string

FindWords attempts to find substrings of the input that are valid words according to the checker. Duplicates may be present in the output if they occur at multiple positions.
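
The note about duplicates follows from scanning every start position independently. A minimal sketch, using a plain map in place of a SpellChecker; findWords is hypothetical and the order of its output is not a claim about the library's ordering.

```go
package main

import "fmt"

// findWords is a hypothetical stand-in that checks every substring of input
// against a plain word set. A word occurring at two positions is reported twice.
func findWords(input string, words map[string]bool) []string {
	var out []string
	for start := 0; start < len(input); start++ {
		for end := start + 1; end <= len(input); end++ {
			if words[input[start:end]] {
				out = append(out, input[start:end])
			}
		}
	}
	return out
}

func main() {
	words := map[string]bool{"cat": true, "at": true}
	fmt.Println(findWords("catcat", words)) // prints "[cat at cat at]"
}
```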

func FromMorse

func FromMorse(checker *SpellChecker, input string) []string

FromMorse takes a sequence of Morse signals (as ASCII dots and hyphens) and returns a set of possible words that could be constructed from them.
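
Without spaces the signal is ambiguous: ".--" can be read as "w", "at", "em" or "ett". One way to resolve this, sketched below, is to try every letter boundary and discard branches that are not a prefix of any known word. fromMorse here is hypothetical and takes a plain word list; whether Kowalski prunes in this way (for example via SpellChecker.Prefix) is an assumption.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// morse maps International Morse code sequences to lowercase letters.
var morse = map[string]byte{
	".-": 'a', "-...": 'b', "-.-.": 'c', "-..": 'd', ".": 'e', "..-.": 'f',
	"--.": 'g', "....": 'h', "..": 'i', ".---": 'j', "-.-": 'k', ".-..": 'l',
	"--": 'm', "-.": 'n', "---": 'o', ".--.": 'p', "--.-": 'q', ".-.": 'r',
	"...": 's', "-": 't', "..-": 'u', "...-": 'v', ".--": 'w', "-..-": 'x',
	"-.--": 'y', "--..": 'z',
}

// fromMorse is a hypothetical stand-in. It returns, in sorted order, every
// word in words whose Morse encoding (with no gaps) equals input.
func fromMorse(input string, words []string) []string {
	valid := make(map[string]bool, len(words))
	for _, w := range words {
		valid[w] = true
	}
	// isPrefix is a linear scan; a real dictionary would use something faster.
	isPrefix := func(p string) bool {
		for _, w := range words {
			if strings.HasPrefix(w, p) {
				return true
			}
		}
		return false
	}
	var out []string
	var walk func(rest, decoded string)
	walk = func(rest, decoded string) {
		if rest == "" {
			if valid[decoded] {
				out = append(out, decoded)
			}
			return
		}
		// Letters are at most four signals long.
		for n := 1; n <= 4 && n <= len(rest); n++ {
			letter, ok := morse[rest[:n]]
			if !ok {
				continue
			}
			next := decoded + string(letter)
			if isPrefix(next) {
				walk(rest[n:], next)
			}
		}
	}
	walk(input, "")
	sort.Strings(out)
	return out
}

func main() {
	fmt.Println(fromMorse(".--", []string{"at", "em", "we", "an"})) // prints "[at em]"
}
```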

func FromT9

func FromT9(checker *SpellChecker, input string) []string

FromT9 takes an input that represents a sequence of key presses on a T9 keyboard and returns possible words that match. The input should not contain spaces (the "0" digit) - words should be solved independently, to avoid an explosion of possible results.
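
T9 maps several letters to each digit, so one digit sequence can correspond to many words. A minimal sketch of that mapping, assuming the standard phone keypad layout and a plain word list; t9Digits and fromT9 are hypothetical names, not Kowalski's code.

```go
package main

import "fmt"

// keys[i] is the keypad digit for the letter 'a'+i on a standard phone keypad:
// abc=2, def=3, ghi=4, jkl=5, mno=6, pqrs=7, tuv=8, wxyz=9.
const keys = "22233344455566677778889999"

// t9Digits converts a lowercase a-z word to its digit sequence. It reports
// false for empty words or words containing other characters.
func t9Digits(word string) (string, bool) {
	if word == "" {
		return "", false
	}
	out := make([]byte, len(word))
	for i := 0; i < len(word); i++ {
		c := word[i]
		if c < 'a' || c > 'z' {
			return "", false
		}
		out[i] = keys[c-'a']
	}
	return string(out), true
}

// fromT9 is a hypothetical stand-in that filters a plain word list.
func fromT9(input string, words []string) []string {
	var out []string
	for _, w := range words {
		if digits, ok := t9Digits(w); ok && digits == input {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	words := []string{"good", "home", "gone", "hood", "help"}
	fmt.Println(fromT9("4663", words)) // prints "[good home gone hood]"
}
```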

func LetterDistribution

func LetterDistribution(input string) [26]int

LetterDistribution counts the number of occurrences of each English letter (ignoring case).
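
The returned array is indexed by letter, with index 0 for A and index 25 for Z. A minimal sketch of such a count; letterDistribution is a hypothetical stand-in with the same signature shape.

```go
package main

import "fmt"

// letterDistribution is a hypothetical stand-in. Index 0 counts 'a'/'A',
// index 25 counts 'z'/'Z'; all other bytes are ignored.
func letterDistribution(input string) [26]int {
	var counts [26]int
	for i := 0; i < len(input); i++ {
		c := input[i]
		switch {
		case c >= 'a' && c <= 'z':
			counts[c-'a']++
		case c >= 'A' && c <= 'Z':
			counts[c-'A']++
		}
	}
	return counts
}

func main() {
	d := letterDistribution("Hello, World")
	fmt.Println(d['l'-'a'], d['o'-'a'], d['h'-'a']) // prints "3 2 1"
}
```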

func Match

func Match(ctx context.Context, checker *SpellChecker, pattern string) ([]string, error)

Match returns all valid words that match the given pattern, expanding '?' as a single character wildcard.
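
The pattern semantics can be shown against a plain word list. This is only a sketch of the matching rule: matches and matchAll are hypothetical, and a real dictionary is unlikely to be scanned linearly like this.

```go
package main

import "fmt"

// matches reports whether word fits pattern, where '?' matches any one byte
// and every other byte must match exactly.
func matches(pattern, word string) bool {
	if len(pattern) != len(word) {
		return false
	}
	for i := 0; i < len(pattern); i++ {
		if pattern[i] != '?' && pattern[i] != word[i] {
			return false
		}
	}
	return true
}

// matchAll is a hypothetical stand-in that filters a plain word list.
func matchAll(pattern string, words []string) []string {
	var out []string
	for _, w := range words {
		if matches(pattern, w) {
			out = append(out, w)
		}
	}
	return out
}

func main() {
	words := []string{"melisma", "melissa", "melody"}
	fmt.Println(matchAll("melism?", words)) // prints "[melisma]"
}
```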

func MultiAnagram added in v4.0.1

func MultiAnagram(ctx context.Context, checker *SpellChecker, word string) ([]string, error)

MultiAnagram finds all single- and multi-word anagrams of the given word, expanding '?' as a single wildcard character. To avoid duplicates, words are sorted lexicographically (i.e., "a ball" will be returned and "ball a" will not).

func MultiMatch added in v4.0.1

func MultiMatch(ctx context.Context, checker *SpellChecker, pattern string) ([]string, error)

MultiMatch returns valid sequences of words that match the given pattern, expanding '?' as a single character wildcard. To reduce the search space, multi-match will first try to look for matches consisting only of longer words, then gradually reduce that threshold until at least one match is found.

func MultiplexAnagram

func MultiplexAnagram(ctx context.Context, checkers []*SpellChecker, pattern string, opts ...MultiplexOption) ([][]string, error)

MultiplexAnagram performs the Anagram operation over a number of different checkers.

func MultiplexFindWords

func MultiplexFindWords(checkers []*SpellChecker, pattern string, opts ...MultiplexOption) [][]string

MultiplexFindWords performs the FindWords operation over a number of different checkers.

func MultiplexFromMorse

func MultiplexFromMorse(checkers []*SpellChecker, pattern string, opts ...MultiplexOption) [][]string

MultiplexFromMorse performs the FromMorse operation over a number of different checkers.

func MultiplexFromT9

func MultiplexFromT9(checkers []*SpellChecker, pattern string, opts ...MultiplexOption) [][]string

MultiplexFromT9 performs the FromT9 operation over a number of different checkers.

func MultiplexMatch

func MultiplexMatch(ctx context.Context, checkers []*SpellChecker, pattern string, opts ...MultiplexOption) ([][]string, error)

MultiplexMatch performs the Match operation over a number of different checkers.

func MultiplexMultiAnagram added in v4.0.1

func MultiplexMultiAnagram(ctx context.Context, checkers []*SpellChecker, pattern string, opts ...MultiplexOption) ([][]string, error)

MultiplexMultiAnagram performs the MultiAnagram operation over a number of different checkers.

func MultiplexMultiMatch added in v4.0.1

func MultiplexMultiMatch(ctx context.Context, checkers []*SpellChecker, pattern string, opts ...MultiplexOption) ([][]string, error)

MultiplexMultiMatch performs the MultiMatch operation over a number of different checkers.

func MultiplexOffByOne

func MultiplexOffByOne(ctx context.Context, checkers []*SpellChecker, pattern string, opts ...MultiplexOption) ([][]string, error)

MultiplexOffByOne performs the OffByOne operation over a number of different checkers.

func MultiplexWordSearch

func MultiplexWordSearch(checkers []*SpellChecker, pattern []string, opts ...MultiplexOption) [][]string

MultiplexWordSearch performs the WordSearch operation over a number of different checkers.

func OffByOne

func OffByOne(ctx context.Context, checker *SpellChecker, input string) ([]string, error)

OffByOne returns all words that can be made by performing one character change on the input. The input is assumed to be a single, lowercase word containing a-z chars only.
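
A sketch of one reading of "one character change", namely substituting a single letter; insertions and deletions are not covered. offByOne below is hypothetical and checks candidates against a plain word set rather than a SpellChecker.

```go
package main

import "fmt"

// offByOne is a hypothetical stand-in. It replaces each position of input in
// turn with every other letter a-z and keeps the candidates found in words.
func offByOne(input string, words map[string]bool) []string {
	var out []string
	buf := []byte(input)
	for i := range buf {
		orig := buf[i]
		for c := byte('a'); c <= 'z'; c++ {
			if c == orig {
				continue // the unchanged word is not a one-character change
			}
			buf[i] = c
			if words[string(buf)] {
				out = append(out, string(buf))
			}
		}
		buf[i] = orig // restore before moving to the next position
	}
	return out
}

func main() {
	words := map[string]bool{"cot": true, "bat": true, "car": true, "cat": true, "dog": true}
	fmt.Println(offByOne("cat", words)) // prints "[bat cot car]"
}
```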

func SaveSpellChecker

func SaveSpellChecker(writer io.Writer, checker *SpellChecker) error

SaveSpellChecker serialises the given checker and writes it to the writer. It can later be restored with LoadSpellChecker.

func Score

func Score(checker *SpellChecker, input string) float64

Score assigns a score to an input showing how likely it is to be English text. A score of 1.0 means almost certainly English, a score of 0.0 means almost certainly not. This is fairly arbitrary and is not very good.

func SplitRGB

func SplitRGB(reader io.Reader) (r, g, b io.Reader, err error)

SplitRGB reads an image and returns three readers, each yielding a PNG-encoded image that holds only the red, green or blue channel respectively. If the input image is excessively small, the outputs will be scaled up.

func Transpose

func Transpose(input []string) []string

Transpose rotates the input text so rows become columns, and columns become rows.
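
A minimal sketch of the row/column swap. transpose is hypothetical; padding short rows with spaces is an assumption of this sketch, since the behaviour for ragged input is not documented here.

```go
package main

import "fmt"

// transpose is a hypothetical stand-in. Row i, column j of the input becomes
// row j, column i of the output. Short rows are padded with spaces.
func transpose(input []string) []string {
	width := 0
	for _, row := range input {
		if len(row) > width {
			width = len(row)
		}
	}
	out := make([]string, width)
	for col := 0; col < width; col++ {
		b := make([]byte, 0, len(input))
		for _, row := range input {
			if col < len(row) {
				b = append(b, row[col])
			} else {
				b = append(b, ' ')
			}
		}
		out[col] = string(b)
	}
	return out
}

func main() {
	fmt.Println(transpose([]string{"abc", "def"})) // prints "[ad be cf]"
}
```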

func WordSearch

func WordSearch(checker *SpellChecker, input []string) []string

WordSearch returns all words found by FindWords in the input word search grid. Words may occur horizontally, vertically or diagonally, and may read in either direction. If a word is found multiple times in different places it will be returned multiple times.

Types

type ColourCount

type ColourCount struct {
	Colour color.Color
	Count  int
}

func ExtractColours

func ExtractColours(reader io.Reader) ([]ColourCount, error)

ExtractColours returns a sorted slice containing each individual colour used in the image, and the total number of pixels that have that colour. Colours are sorted from most-used to least-used.

type MultiplexOption

type MultiplexOption func(*multiplexOptions)

type SpellChecker

type SpellChecker struct {
	// contains filtered or unexported fields
}

SpellChecker provides a way to tell whether a word exists in a dictionary.

func CreateSpellChecker

func CreateSpellChecker(reader io.Reader, wordCount int) (*SpellChecker, error)

CreateSpellChecker creates a new SpellChecker by reading words line-by-line from the given reader. The wordCount parameter should be an approximation of the number of words available.

This is likely to be a relatively expensive operation; for routine use prefer saving the spell checker via SaveSpellChecker and restoring it with LoadSpellChecker.

func LoadSpellChecker

func LoadSpellChecker(reader io.Reader) (*SpellChecker, error)

LoadSpellChecker attempts to load a SpellChecker that was previously saved with SaveSpellChecker.

func (*SpellChecker) Prefix

func (c *SpellChecker) Prefix(prefix string) bool

Prefix determines - probabilistically - whether the given string is a prefix of any known word. There is a small chance of false positives, i.e. an input that is not a prefix to any word in the word list might be incorrectly identified as valid; there is no chance of false negatives.

func (*SpellChecker) Valid

func (c *SpellChecker) Valid(word string) bool

Valid determines - probabilistically - whether the given word was in the word list used to create this SpellChecker. There is a small chance of false positives, i.e. a word that wasn't in the word list might be incorrectly identified as valid; there is no chance of false negatives.
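
The guarantees described for Valid and Prefix (possible false positives, no false negatives) are the same ones a Bloom filter provides. Whether SpellChecker is built on one is an assumption; the sketch below only illustrates why such a structure behaves this way. All names in it are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a toy Bloom filter: k bit positions are set per word, and a word
// is reported present only if all k of its positions are set.
type bloom struct {
	bits []bool
	k    int
}

func newBloom(size, k int) *bloom {
	return &bloom{bits: make([]bool, size), k: k}
}

// indexes derives k positions from one 64-bit FNV-1a hash (double hashing).
func (b *bloom) indexes(word string) []int {
	h := fnv.New64a()
	h.Write([]byte(word))
	sum := h.Sum64()
	h1, h2 := uint32(sum), uint32(sum>>32)|1
	out := make([]int, b.k)
	for i := range out {
		out[i] = int((uint64(h1) + uint64(i)*uint64(h2)) % uint64(len(b.bits)))
	}
	return out
}

func (b *bloom) add(word string) {
	for _, i := range b.indexes(word) {
		b.bits[i] = true
	}
}

// valid never returns false for an added word, because add set all of its
// bits. It may return true for a word that was never added if other words
// happened to set the same bits.
func (b *bloom) valid(word string) bool {
	for _, i := range b.indexes(word) {
		if !b.bits[i] {
			return false
		}
	}
	return true
}

func main() {
	b := newBloom(1024, 3)
	b.add("melisma")
	fmt.Println(b.valid("melisma")) // prints "true"
}
```

Bits are only ever set, never cleared, which is why an added word can never be missed; the trade-off is that unrelated words can collide on the same bits.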

Directories

Path Synopsis
cmd
