cnsimhash

Version: v1.0.1
Published: Jan 2, 2018 License: GPL-3.0 Imports: 3 Imported by: 0

README

cnsimhash

A simhash implementation for Chinese text, built on jieba word segmentation.

Documentation

Constants

const (
	WORDS_ALL = -1
)

Variables

This section is empty.

Functions

func Compare

func Compare(a uint64, b uint64) uint8

Compare calculates the Hamming distance between two 64-bit integers.

Currently, the distance is computed with the Kernighan method [1]. Other methods may be more efficient and are worth exploring later.
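For illustration, the Kernighan method can be sketched as follows. This is a minimal standalone example, not the package's actual source; the name hammingKernighan is hypothetical. It XORs the two values so that set bits mark the differing positions, then clears the lowest set bit once per loop iteration.

```go
package main

import "fmt"

// hammingKernighan is a hypothetical helper (not part of cnsimhash).
// It counts the bit positions in which a and b differ.
func hammingKernighan(a, b uint64) uint8 {
	x := a ^ b // set bits mark the positions where a and b differ
	var n uint8
	for x != 0 {
		x &= x - 1 // clear the lowest set bit
		n++
	}
	return n
}

func main() {
	// 0xFF ^ 0x0F = 0xF0, which has four set bits.
	fmt.Println(hammingKernighan(0xFF, 0x0F))
}
```

The loop runs once per differing bit, so values that are close together are compared in only a few iterations.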

func Distance added in v1.0.1

func Distance(v1 uint64, v2 uint64) int

func IDFPrint added in v1.0.1

func IDFPrint()

func LoadDictionary

func LoadDictionary(jiebapath, idfpath, stopwords, synonympath string) error

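LoadDictionary loads the jieba, IDF, and stop-word dictionaries from the given paths. The signature also accepts a path to a synonym dictionary.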

func UnicodeSimhash added in v1.0.1

func UnicodeSimhash(s string, topN int) (uint64, analyse.Segments, []string)

UnicodeSimhash calculates the simhash of s from its top N keywords. If topN < 0 (for example, WORDS_ALL), all words are used.
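As a rough picture of how weighted keywords can be folded into a 64-bit fingerprint, here is a generic simhash sketch. It is not the package's implementation: the weightedTerm type, the simhashSketch and hash64 names, and the choice of FNV-1a as the per-term hash are all assumptions made for illustration, and the weights stand in for whatever keyword scores the segmenter produces.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// weightedTerm is a hypothetical keyword/weight pair, e.g. a TF-IDF score.
type weightedTerm struct {
	Text   string
	Weight float64
}

// hash64 hashes a term with FNV-1a; the hash used by cnsimhash may differ.
func hash64(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// simhashSketch adds each term's weight to a bit position when the term's
// hash has that bit set, and subtracts it otherwise. The fingerprint has a
// 1 wherever the accumulated sum is positive.
func simhashSketch(terms []weightedTerm) uint64 {
	var acc [64]float64
	for _, t := range terms {
		h := hash64(t.Text)
		for i := 0; i < 64; i++ {
			if h&(1<<uint(i)) != 0 {
				acc[i] += t.Weight
			} else {
				acc[i] -= t.Weight
			}
		}
	}
	var fp uint64
	for i, v := range acc {
		if v > 0 {
			fp |= 1 << uint(i)
		}
	}
	return fp
}

func main() {
	terms := []weightedTerm{{"中文", 2.5}, {"分词", 1.8}, {"指纹", 0.7}}
	fmt.Printf("%016x\n", simhashSketch(terms))
}
```

Because every term votes on every bit in proportion to its weight, texts that share their heaviest keywords tend to produce fingerprints with a small Hamming distance.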

Types

This section is empty.
