double_array_trie

package module

v1.2.1 Latest Latest Go to latest Published: Apr 21, 2024 License: MulanPSL-2.0 Imports: 9 Imported by: 0

Details

Valid go.mod file

The Go module system was introduced in Go 1.11 and is the official dependency management solution for Go.
Redistributable license

Redistributable licenses place minimal restrictions on how software can be used, modified, and redistributed.
Tagged version

Modules with tagged versions give importers more predictable builds.
Stable version

When a project reaches major version v1 it is considered stable.
Learn more about best practices

Repository

gitee.com/ivfzhou/double-array-trie

Links

Open Source Insights

README ¶

说明

双数组检索树，固定词汇地高速地内存占用少地检索数据结构。

dat 可用于 httprule 的前缀匹配情景，或者其它固定词汇的检索情形。也可用于分词检索，其性能远优于二叉树。

使用

go get gitee.com/ivfzhou/double-array-trie@latest

import dat "gitee.com/ivfzhou/double-array-trie"

// 需要被检索的数据
data := []string{
    "/api/user/info",
    "/api/user/register",
    "/api/user/login"
}

// 构建双数组
d := dat.New(data)

// 判断是否存在与数据中
path := request.URL.Path
b := d.Matches(path)

// 判断数据中是否有前缀和 path 匹配
b := d.MatchPrefix(path)

// 获取 path 在字典中的索引
i := d.MatchesIndex(path)

// 返回所有匹配前缀 path 的数据
d.ObtainPrefixes(path)

// 解析 sentence，返回数据中出现在 sentence 中的数据和其位置
d.Analysis("sentence")

状态转移方程：

firstState = 1
check[code + state -2] = state 
base[code + state -2] = nextState

终节点：

check[code + state -2] = state 
base[code + state -2] = nextState
base[nextState - 2] < 0 or check[nextState -2] = nextState

例子

[]string{
 0     AC
 1     AD
 2     ADG
 3     ADH
 4     ADHG
 5     BEIZ
 6     BEL
 7     BF
 8     DG
}

双数组值：
check [1 1  4 1  3 3  8  9  10 11 7  7   14 8 8  10  18 19 20 12 12 16 13]
base  [3 7 -1 16 4 8 -2 -3 -4 -5  12 19 -6  9 10 11 -7 -8 -9  13 18 20 14]
code A=1 B=2 C=3 D=4 E=5 F=6 G=7 H=8 I=9 L=10 Z=11

树结构：
depth
  0                            Root
                               ⁰ ⁹
                     /                         \           \
  1                 A                          B           D    
                   ⁰ ⁵                         ⁵ ⁷         ⁸ ⁹ 
         /     /      \                     /        \      \
  2     C     D        D                  E           F      G 
       ⁰ ¹            ¹ ⁵                ⁵ ⁷         ⁷ ⁷    ⁸ ⁹
            /     /      \              /    \
  3        G     H        H            I      L
          ² ³            ³ ⁵          ⁵ ⁶    ⁶ ⁷                  
                           \         /
  4                         G       Z
                           ⁴ ⁵     ⁵ ⁶

联系电邮：ivfzhou@126.com

Documentation ¶

Index ¶

Variables
type Dat
- func New(keys []string) *Dat
- func ReadFromFile(filePath string) (*Dat, error)

Constants ¶

This section is empty.

Variables ¶

View Source

var (
	// MinExpansiveFactor 适配 state 时，数组最小扩容系数。如果同层子节点 code 相差大，则该值该调大。
	MinExpansiveFactor = 1.2
	// InitArrayFactor 初始扩容系数。
	InitArrayFactor = 2.5
)

Functions ¶

This section is empty.

Types ¶

type Dat ¶ added in v1.2.0

type Dat struct {
	// contains filtered or unexported fields
}

func New ¶

func New(keys []string) *Dat

New 构建双数组树。

func ReadFromFile ¶

func ReadFromFile(filePath string) (*Dat, error)

ReadFromFile 从备份文件中恢复对象。

func (*Dat) Analysis ¶ added in v1.2.0

func (d *Dat) Analysis(sentence string) (keys []string, indexes []int)

Analysis 返回词库中词匹配句子 sentence 中词的词和对应起始位置。

func (*Dat) DumpToFile ¶ added in v1.2.0

func (d *Dat) DumpToFile(filePath string) error

DumpToFile 备份树。

func (*Dat) Hollow ¶ added in v1.2.0

func (d *Dat) Hollow() int

Hollow 返回数组未使用的下标个数。

func (*Dat) KeySize ¶ added in v1.2.0

func (d *Dat) KeySize() int

KeySize 词汇个数。

func (*Dat) MatchPrefix ¶ added in v1.2.0

func (d *Dat) MatchPrefix(word string) bool

MatchPrefix 判断 word 是否匹配词库中任何一个词的前缀。

func (*Dat) Matches ¶ added in v1.2.0

func (d *Dat) Matches(word string) bool

Matches 判断 word 是否存在于词库。

func (*Dat) MatchesIndex ¶ added in v1.2.0

func (d *Dat) MatchesIndex(s string) int

MatchesIndex 查找 s 在树中索引，不存在返回-1。

func (*Dat) ObtainPrefixes ¶ added in v1.2.0

func (d *Dat) ObtainPrefixes(word string) (res []string)

ObtainPrefixes 返回所有匹配 word 前缀的词。

func (*Dat) Size ¶ added in v1.2.0

func (d *Dat) Size() int

Size 数组长度。

Source Files ¶

View all Source files

dat.go

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL