model

package
v0.0.0-...-22e7a19
Published: Oct 13, 2017 License: Apache-2.0 Imports: 5 Imported by: 0

README

Model

Word2Vec

Word2Vec is an umbrella term for the following modules:

model:
- Skip-Gram
- CBOW

optimizer:
- Hierarchical Softmax
- Negative Sampling

For training, select one model and one optimizer from the lists above. The model defines the architecture of the objective, and the optimizer defines how that objective's function is approximated.
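For reference, these choices plug into the standard word2vec objectives (Mikolov et al., 2013): Skip-Gram maximizes the average log-probability of context words given the center word, CBOW reverses the roles and predicts the center word from its context, and Hierarchical Softmax and Negative Sampling are two ways of cheaply approximating the full softmax over the vocabulary. The exact loss used by this package may differ in details, but the standard forms are:

    \frac{1}{T} \sum_{t=1}^{T} \sum_{-c \le j \le c,\, j \ne 0} \log p(w_{t+j} \mid w_t),
    \qquad
    p(w_O \mid w_I) = \frac{\exp({v'_{w_O}}^{\top} v_{w_I})}{\sum_{w=1}^{W} \exp({v'_w}^{\top} v_{w_I})}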

Features
  • Skip-Gram
  • CBOW
  • Hierarchical Softmax
  • Negative Sampling
  • Subsampling
  • Learning-rate update during training (see the sketch after this list)
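As a hedged sketch of the last two features: in the standard word2vec formulation (this package's exact form is not documented here), a word w is discarded with a probability based on its relative frequency f(w) and the --threshold value t, and the learning rate decreases over the corpus but never falls below initlr * theta, as stated for the --theta flag below:

    P_{\text{discard}}(w) = 1 - \sqrt{\tfrac{t}{f(w)}},
    \qquad
    \mathrm{lr} = \max\Bigl(\mathrm{initlr} \cdot \theta,\ \mathrm{initlr} \cdot \bigl(1 - \tfrac{\text{words processed}}{\text{total words}}\bigr)\Bigr)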
Usage
Embed words using word2vec

Usage:
  word-embedding word2vec [flags]

Flags:
      --batchSize int       Set the batch size to update learning rate (default 10000)
  -d, --dimension int       Set the dimension of word vector (default 10)
      --initlr float        Set the initial learning rate (default 0.025)
  -i, --inputFile string    Set the input file path to load corpus (default "example/input.txt")
      --lower               Whether to convert words in the corpus to lowercase (default true)
      --maxDepth int        Set the maximum depth to traverse in the Huffman tree; maxDepth=0 traverses the full path from root to word (hierarchical softmax only)
      --model string        Set the model of Word2Vec. One of: cbow|skip-gram (default "cbow")
      --optimizer string    Set the optimizer of Word2Vec. One of: hs|ns (default "hs")
  -o, --outputFile string   Set the output file path to save word vectors (default "example/word_vectors.txt")
      --sample int          Set the number of negative samples (negative sampling only) (default 5)
      --theta float         Set the lower limit of learning rate (lr >= initlr * theta) (default 0.0001)
      --threshold float     Set the threshold for subsampling (default 0.001)
  -w, --window int          Set the context window size (default 5)
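
For example, training Skip-Gram with negative sampling on the bundled corpus could look like the following; the flag combination is illustrative, but every flag is taken from the listing above:

  word-embedding word2vec \
    -i example/input.txt \
    -o example/word_vectors.txt \
    --model skip-gram \
    --optimizer ns \
    --sample 5 \
    -d 100 -w 5 --initlr 0.025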

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func SigmoidF32

func SigmoidF32(f float32) float32

SigmoidF32 returns f(x) = \frac{1}{1 + e^{-x}}.

func SigmoidF64

func SigmoidF64(f float64) float64

SigmoidF64 returns f(x) = \frac{1}{1 + e^{-x}}. See: http://en.wikipedia.org/wiki/Sigmoid_function.
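Both helpers compute the logistic function. A minimal, self-contained sketch of the equivalent computation (the package's own implementation may differ, e.g. by clamping the input):

package main

import (
	"fmt"
	"math"
)

// sigmoidF64 mirrors the documented behavior: f(x) = 1 / (1 + e^{-x}).
func sigmoidF64(x float64) float64 {
	return 1.0 / (1.0 + math.Exp(-x))
}

// sigmoidF32 is the single-precision variant; it converts through float64
// because math.Exp in the standard library operates on float64.
func sigmoidF32(x float32) float32 {
	return float32(1.0 / (1.0 + math.Exp(-float64(x))))
}

func main() {
	fmt.Println(sigmoidF64(0)) // 0.5
	fmt.Println(sigmoidF32(2)) // ~0.8808
}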

Types

type Config

type Config struct {
	ToLower          bool
	Dimension        int
	Window           int
	InitLearningRate float64
}

Config stores the common config.

func NewConfig

func NewConfig(toLower bool, dimension, window int, initlr float64) *Config

NewConfig creates *Config
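A hypothetical construction mirroring the CLI defaults listed in the README (lowercasing on, dimension 10, window 5, initial learning rate 0.025); the positional mapping of arguments to fields is assumed from the parameter names:

// Assuming this package is imported under the name "model".
cfg := model.NewConfig(true, 10, 5, 0.025)
// cfg.ToLower == true, cfg.Dimension == 10, cfg.Window == 5, cfg.InitLearningRate == 0.025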

type Model

type Model interface {
	Preprocess(f io.ReadSeeker) (io.ReadCloser, error)
	Train(f io.ReadCloser) error
	Save(outputFile string) error
}

Model is the interface implemented by trainable models; it exposes Preprocess, Train, and Save.
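The interface suggests a three-step pipeline: preprocess the corpus, train on the preprocessed stream, then save the vectors. A hedged sketch of driving any Model implementation follows; the call order and error handling are assumptions based on the interface shape, not documented behavior, and the snippet assumes the "os" import and that this package is imported as model:

// trainAndSave runs a Model through Preprocess -> Train -> Save.
func trainAndSave(m model.Model, corpusPath, outputPath string) error {
	f, err := os.Open(corpusPath) // *os.File satisfies io.ReadSeeker
	if err != nil {
		return err
	}
	defer f.Close()

	r, err := m.Preprocess(f)
	if err != nil {
		return err
	}
	defer r.Close()

	if err := m.Train(r); err != nil {
		return err
	}
	return m.Save(outputPath)
}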

type SyncTensor

type SyncTensor struct {
	sync.RWMutex
	tensor.Tensor
}

SyncTensor is a Tensor that has a read-write lock on it.
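The embedded sync.RWMutex allows many goroutines to read the underlying tensor concurrently while writes take an exclusive lock. A small illustration of the same pattern using a plain slice in place of tensor.Tensor (the type and method names here are hypothetical, purely to show how the embedded lock is meant to be used; requires the "sync" import):

// syncVector mirrors the SyncTensor pattern: RLock for concurrent reads,
// Lock for exclusive writes.
type syncVector struct {
	sync.RWMutex
	data []float64
}

// get takes a shared lock, so any number of readers may proceed in parallel.
func (v *syncVector) get(i int) float64 {
	v.RLock()
	defer v.RUnlock()
	return v.data[i]
}

// add takes the exclusive lock before mutating the underlying data.
func (v *syncVector) add(i int, delta float64) {
	v.Lock()
	defer v.Unlock()
	v.data[i] += delta
}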

Directories

Path Synopsis
