wordfreq

v0.0.0-...-d01a71a Latest
Published: Mar 10, 2018 License: MIT Imports: 5 Imported by: 0

README

wordfreq

Word frequency calculation for text corpora in Go. Supports Chinese and English.

This work is a derivative of wordfreq by Timothy Guan-tin Chien.

Install

With a correctly configured Go toolchain:

go get -u github.com/twsiyuan/wordfreq

Simple Example

package main

import (
	"fmt"

	"github.com/twsiyuan/wordfreq"
)

func main() {
	wfreq, err := wordfreq.New(wordfreq.Options{})
	if err != nil {
		panic(err)
	}
	tlist := wfreq.Process("text") // Term list
	fmt.Println(tlist)
}

Available options in wordfreq.Options:

  • Languages: Array of keywords specifying the languages to process. Available keywords are chinese and english. Defaults to both.
  • StopWordSets: Array of keywords specifying the built-in stop-word sets to exclude from the count. Available: cjk, english1, and english2. Defaults to all.
  • StopWords: Array of words or phrases to exclude from the count. Case insensitive. Defaults to empty.
  • MinimumCount: Minimum count a term needs to be included in the returned list. Defaults to 2.
  • NoFilterSubstring: (Chinese only) Do not filter out recounted substrings, that is, substrings already counted within a longer phrase. Defaults to false.
  • MaxiumPhraseLength: (Chinese only) Maximum length to consider as a phrase. Defaults to 8.
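
To make the StopWords and MinimumCount options more concrete, here is a minimal, self-contained sketch of case-insensitive stop-word exclusion and a minimum-count threshold on whitespace-separated English words. It does not use the package and is not its implementation: countWords is a hypothetical helper, and lowercasing every word before counting is an assumption made for the illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// countWords is a hypothetical helper, not part of wordfreq. It counts
// whitespace-separated words, skips stop words case-insensitively, and
// drops words that occur fewer than minCount times.
func countWords(text string, stopWords []string, minCount int) map[string]int {
	// Normalize stop words once so lookups are case-insensitive.
	stop := make(map[string]bool, len(stopWords))
	for _, w := range stopWords {
		stop[strings.ToLower(w)] = true
	}

	counts := make(map[string]int)
	for _, w := range strings.Fields(text) {
		w = strings.ToLower(w) // assumption: counting is case-insensitive too
		if !stop[w] {
			counts[w]++
		}
	}

	// Apply a MinimumCount-style threshold.
	for w, n := range counts {
		if n < minCount {
			delete(counts, w)
		}
	}
	return counts
}

func main() {
	counts := countWords("The cat saw the other cat and a dog", []string{"THE"}, 2)
	fmt.Println(counts) // prints map[cat:2]
}
```

With a threshold of 2, only "cat" survives: "the" is excluded as a stop word despite the different casing, and every other word occurs once.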

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Options

type Options struct {
	Languages          []string // Default: ['chinese', 'english']
	StopWordSets       []string // Default: ['cjk', 'english1', 'english2']
	StopWords          []string // Default: []
	NoFilterSubstring  bool     // Default: false
	MaxiumPhraseLength int      // Default: 8
	MinimumCount       int      // Default: 2
}

type Term

type Term struct {
	Term  string
	Count int
}

type WordFeq

type WordFeq struct {
	// contains filtered or unexported fields
}

func New

func New(ops Options) (*WordFeq, error)

func (*WordFeq) Empty

func (w *WordFeq) Empty()

func (WordFeq) List

func (w WordFeq) List() []Term

func (*WordFeq) Process

func (w *WordFeq) Process(text string) []Term
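
The signatures above do not say whether the []Term returned by Process or List is ordered. The sketch below shows one way a caller might sort such a slice by descending count using the standard library. To stay self-contained it declares a local Term type that mirrors the two exported fields of wordfreq.Term; sortByCount is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"sort"
)

// Term mirrors the exported fields of wordfreq.Term so that this
// sketch compiles without the package.
type Term struct {
	Term  string
	Count int
}

// sortByCount is a hypothetical helper. It orders terms by descending
// count and breaks ties alphabetically so the result is deterministic.
func sortByCount(terms []Term) {
	sort.Slice(terms, func(i, j int) bool {
		if terms[i].Count != terms[j].Count {
			return terms[i].Count > terms[j].Count
		}
		return terms[i].Term < terms[j].Term
	})
}

func main() {
	terms := []Term{{"go", 3}, {"text", 5}, {"corpus", 3}}
	sortByCount(terms)
	for _, t := range terms {
		fmt.Printf("%s\t%d\n", t.Term, t.Count)
	}
}
```

In real use, the slice would come from Process or List instead of a literal, and the comparison function would read the fields of wordfreq.Term directly.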
