# markov

package module
Version: v1.0.0 Latest Latest

Go to latest
Published: Mar 28, 2019 License: MIT

### Markov Chains in Go

If you don't know what a Markov Chain is I recommend reading up on it and checking out this visual explanation. To put it simply, Markov Chain represent probabilistic state changes inside a Finite-State-Machine. The state transition probabilities can be easily represented as a matrix but in practice (software) that leaves lots of entries being 0 and taking up memory, so a more elegant solution is a nested hash map.

In the code above the following are used:

``````// represents a grouping of individual words
// eg: []string{"I", "am", "Alex"}, this can be extracted
// from an original string of any form or shape:
// "I am Alex", "I   am  Alex", "I:am:Alex"
// and it's all up to the caller to split their strings into sequences
type Sequence []string
``````
``````// a pairs represents a possible transition between a sequence of n words
// and the next (single) word
// the Current sequence must be of an equal lenght to the chain pair size
// meaning you can't have some transitions for 2-grouped words and 1-grouped words
type Pair struct {
Current Sequence
Next    string
}
``````
``````// by having a
type transitionMap map[string]int
// and then nested inside
frequencyMatrix map[string]transitionMap
// we generate our mapping of all encountered
// sequences to their respective next word
// and the number of times this occurs
``````

Once your wrap your head around these structures, the rest of the functions are easy to understand.

#### Examples

The text below was generated on a 2-paired sequence chain with a student movie script as input.

``````You know I can't KEN I understand. KEN opens the door to
a soaking wet KEN, who stands on the pink scissors and
picks them up, toying with them. KEN glances around for
his wallet. DEBRA (Comforting) There's nothing to be
embarrassed about you know. Lots of people it happens.
KEN grunts. KEN Yeah. The bird chirps from its cage in
the corner. KEN tries to scream but is unable to pull in
oxygen. The pressure in his ears begins to burst as the
bathtub overflows. 29. A figure appears above him,
blurry through the water.
``````

If you have Go installed, you can simply run:

``````go get cpl.li/go/markov/markov-cli
``````

#### Usage

I provided an example main function with `stdin` input and basic flag parsing for generating n words from the input data.

``````package main

import (
"flag"
"fmt"
"io/ioutil"
"log"
"os"
"strings"

"cpl.li/go/markov"
)

func main() {
// handle flags
maxWords := flag.Int("words", 100, "max words to generate (default 100)")
pairSize := flag.Int("pairs", 2, "size of a word pair (default 2)")
flag.Parse()

c := NewChain(*pairSize) // create markov chain

if err != nil {
log.Fatal(err)
}

// give data as sequence to chain model

b := c.NewBuilder(nil)             // create builder on top of chain
b.Generate(*maxWords - c.PairSize) // generate new words
fmt.Println(b.String())            // print end product
}
``````

## Documentation ¶

### Overview ¶

Package markov provides a markov chain implementation which allows you to "train" a model using any form of text as input. The markov chain will split the text sequence into pairs and then generate the transition mapping.

A Builder implementation also exists, this can be generated on top of a chain in order to generate a continuous flow of new words.

Example (Basic)

This example shows a general usecase for the Markov Chain and the builder. It takes input from `stdin` and trains the markov chain then generates a given number of words nd prints out the fully generated string. The flags can configure the max number of words to generate and the sequence pairing to be used when "training" the markov chain.

```package main

import (
"flag"
"fmt"
"io/ioutil"
"log"
"os"
"strings"

"cpl.li/go/markov"
)

func main() {
// handle flags
maxWords := flag.Int("words", 100, "max words to generate (default 100)")
pairSize := flag.Int("pairs", 2, "size of a word pair (default 2)")
flag.Parse()

c := markov.NewChain(*pairSize) // create markov chain

if err != nil {
log.Fatal(err)
}

// give data as sequence to chain model

b := c.NewBuilder(nil)             // create builder on top of chain
b.Generate(*maxWords - c.PairSize) // generate new words
fmt.Println(b.String())            // print end product
}
```
```Output:

```

### Constants ¶

This section is empty.

### Variables ¶

This section is empty.

### Functions ¶

This section is empty.

### Types ¶

#### type Builder ¶

```type Builder struct {
// contains filtered or unexported fields
}```

Builder spawns from a Markov chain and using the generated tokens from it, creates sequences of words.

#### func (*Builder) Generate ¶

`func (b *Builder) Generate(count int) int`

Generate will tell the builder to poll the markov chain for at most `count` new words to append to the builders sequence. The function will return the real number of generated words, 0 meaning no new words could be generated.

#### func (*Builder) String ¶

`func (b *Builder) String() string`

String will return the word sequence as a single string of all words separated by spaces.

#### type Chain ¶

```type Chain struct {
PairSize int
// contains filtered or unexported fields
}```

Chain represents a Markov chain composed of given length pairs extracted from provided sequences.

#### func NewChain ¶

`func NewChain(pairSize int) *Chain`

NewChain generates a Chain with pairs of given length.

`func (c *Chain) Add(sequence Sequence)`

Add adds the transition counts to the chain for a given sequence of words.

#### func (*Chain) NewBuilder ¶

`func (c *Chain) NewBuilder(seed Sequence) Builder`

NewBuilder creates a Markov sequence builder form the current chain.

#### func (*Chain) Next ¶

`func (c *Chain) Next(seed Sequence) string`

Next will give you the next possible token for a certain sequence based on a random weighted decision.

#### func (*Chain) TransitionProbability ¶

`func (c *Chain) TransitionProbability(p Pair) (float64, error)`

TransitionProbability returns the probability of transition between the current and next state of a pair.

#### type Pair ¶

```type Pair struct {
Current Sequence
Next    string
}```

Pair represents a state transition between a set of 1 or more words and the next word in the Sequence.

#### type Sequence ¶

`type Sequence []string`

Sequence represents a set of 1 or more words used to build a Markov chain.

#### func (Sequence) Pairs ¶

`func (s Sequence) Pairs(size int) []Pair`

Pairs extracts pairs with states of given size transitioning to new words.

#### func (Sequence) String ¶

`func (s Sequence) String() string`

String returns the words of the sequence as a single string formed of space separeted words.

Path Synopsis