search

Version: v0.0.0-...-5379d26
Published: Jun 14, 2022 License: MIT Imports: 32 Imported by: 0

README

search v2

Middleware for Caddy.

search indexes your static and dynamic documents, then serves an HTTP search endpoint.

Forked from https://github.com/blevesearch and modified to support Caddy V2, with many new features:

  • Support Caddy V2
  • Use Bleve V2
  • Chinese segmentor support
  • Static file watcher
  • Mime type auto detection
Syntax
search [directory|regexp] [endpoint "search/"]
  • directory is the path, relative to site root, to a directory (static content)
  • regexp is the URL regular expression of documents that must be indexed (static and dynamic content)
  • endpoint is the path, relative to site's root url, of the search endpoint

For more options, use the following syntax:

search {
    dbname      (default: md5 of root)
    root        (default: .)(required)
    engine      (default: bleve)
    datadir     (default: /tmp/caddyIndex)
    endpoint    (default: /search)
    template    (default: nil)
    numworkers  (default: numcpus/2)
    expire      (default: 0)
    filewatcher (default: true)
    analyzer    (default: standard)
    maxsize     (default: 50*1024*1024)

    +path       regexp
    -path       regexp
}
  • dbname is the name of the index database (defaults to the MD5 hash of root)
  • root is the site root (required); it should match the value of the root directive
  • engine is the engine for indexing and searching
  • datadir is the absolute path to where the indexer should store all data
  • template is the path to the HTML template used to render search results
  • numworkers is the number of index workers
  • expire is the interval (in seconds) after which the static files in the site root are rescanned; the default, 0, means the files are not rescanned
  • filewatcher enables the file watcher for root when set to true
  • analyzer is the token analyzer for bleve; the default is 'standard', use 'sego' for indexing Chinese
  • maxsize is the maximum size of a file that will be indexed
  • +path includes a path to be indexed (can be added multiple times)
  • -path excludes a path from being indexed (can be added multiple times)
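
The documented defaults can be sketched in Go as follows. This is only an illustration: the Options type, withDefaults, and expireFromSeconds are hypothetical names rather than the module's own, the one-worker floor is an assumption, and so is reading maxsize as a byte count.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// Options mirrors a few of the documented settings. The type and its field
// names are hypothetical, not the module's own.
type Options struct {
	NumWorkers int
	Expire     time.Duration
	MaxSize    int
}

// withDefaults fills unset (zero or negative) values with the documented defaults.
func withDefaults(o Options) Options {
	if o.NumWorkers <= 0 {
		o.NumWorkers = runtime.NumCPU() / 2 // documented default: numcpus/2
		if o.NumWorkers < 1 {
			o.NumWorkers = 1 // assumption: keep at least one worker
		}
	}
	if o.MaxSize <= 0 {
		o.MaxSize = 50 * 1024 * 1024 // documented default; assumed to be bytes
	}
	return o
}

// expireFromSeconds converts the documented expire value (seconds) to a Duration.
func expireFromSeconds(s int) time.Duration {
	return time.Duration(s) * time.Second
}

func main() {
	o := withDefaults(Options{Expire: expireFromSeconds(3600)})
	fmt.Println(o.NumWorkers, o.Expire, o.MaxSize)
}
```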
Supported Engines
Caddyfile Examples
localhost:2016 {
	root * /tmp/www
	route {
		search {
			root        "/tmp/www"
			datadir     "./db"
			endpoint    "/search"
			expire      0
			analyzer    "sego"
			numworkers  2
			+path /static/docs/
			-path ^/blog/admin/
			-path robots.txt
		}
		file_server
	}
}

How to build

  • Put it under caddy/modules and import github.com/caddyserver/caddy/v2/modules/caddy-search in caddy/cmd/caddy/main.go
  • Or use xcaddy

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func ConvertToRegExp

func ConvertToRegExp(rexp []string) (r []*regexp.Regexp)

ConvertToRegExp compiles a slice of regular expression strings into *regexp.Regexp instances
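
A minimal sketch of this kind of conversion, using only the standard regexp package. The helper compileAll is a hypothetical stand-in, and silently skipping patterns that fail to compile is an assumption; the package's actual error handling is not documented here.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileAll is a hypothetical stand-in for ConvertToRegExp. It compiles each
// pattern and skips any that fail to compile (an assumption about error handling).
func compileAll(patterns []string) []*regexp.Regexp {
	compiled := make([]*regexp.Regexp, 0, len(patterns))
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			continue // invalid pattern: leave it out of the result
		}
		compiled = append(compiled, re)
	}
	return compiled
}

func main() {
	// The last pattern is deliberately invalid (unclosed group).
	res := compileAll([]string{`^/blog/admin/`, `robots\.txt`, `(`})
	fmt.Println(len(res))
}
```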

func GetUrlPath

func GetUrlPath(u *url.URL) string

func NewIndexer

func NewIndexer(engine string, config indexer.Config, analyzer string) (index indexer.Handler, err error)

NewIndexer creates a new Indexer with the received config

func ScanToPipe

func ScanToPipe(fp string, indexManager *IndexerManager, index indexer.Handler) indexer.Record

ScanToPipe ...

Types

type IndexerManager

type IndexerManager struct {
	MaxFileSize int
	// contains filtered or unexported fields
}

IndexerManager is the structure that holds the search pipeline's information and methods

func NewIndexerManager

func NewIndexerManager(config *Search, MaxFileSize int, indxr indexer.Handler) (*IndexerManager, error)

NewIndexerManager creates a new Pipeline instance

func (*IndexerManager) Feed

func (p *IndexerManager) Feed(record indexer.Record)

Feed is the step of the pipeline that feeds valid documents to the indexer.
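
One common way to build a feeding step like this is a fixed pool of workers reading from a channel, sized by a setting such as numworkers. The sketch below illustrates that pattern only and does not describe the package's internals; the record and pipeline types are hypothetical, and the append stands in for real indexing.

```go
package main

import (
	"fmt"
	"sync"
)

// record is a hypothetical stand-in for indexer.Record.
type record struct{ Path string }

// pipeline fans records out to a fixed number of worker goroutines.
type pipeline struct {
	in      chan record
	wg      sync.WaitGroup
	mu      sync.Mutex
	indexed []string
}

// newPipeline starts the given number of workers, each draining the input channel.
func newPipeline(workers int) *pipeline {
	p := &pipeline{in: make(chan record)}
	for i := 0; i < workers; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			for r := range p.in {
				p.mu.Lock()
				p.indexed = append(p.indexed, r.Path) // placeholder for real indexing
				p.mu.Unlock()
			}
		}()
	}
	return p
}

// Feed hands one record to whichever worker is free; it blocks until one is.
func (p *pipeline) Feed(r record) { p.in <- r }

// Close stops accepting records and waits for the workers to finish.
func (p *pipeline) Close() {
	close(p.in)
	p.wg.Wait()
}

func main() {
	p := newPipeline(2)
	for _, path := range []string{"/a.html", "/b.md", "/c.txt"} {
		p.Feed(record{Path: path})
	}
	p.Close()
	fmt.Println(len(p.indexed))
}
```

The unbuffered channel gives natural back-pressure: a scanner calling Feed cannot run ahead of the workers.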

func (*IndexerManager) ValidatePath

func (p *IndexerManager) ValidatePath(path string) bool

ValidatePath is the method that checks if the target page can be indexed
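
A check of this kind can be sketched with the +path and -path patterns from the configuration. The pathFilter type is hypothetical, and two rules are assumptions rather than documented behavior: exclusions take precedence over inclusions, and a path matching no include pattern is rejected.

```go
package main

import (
	"fmt"
	"regexp"
)

// pathFilter is a hypothetical type holding compiled +path and -path patterns.
type pathFilter struct {
	include []*regexp.Regexp
	exclude []*regexp.Regexp
}

// allowed reports whether path matches at least one include pattern and no
// exclude pattern. Exclusions win; this precedence is an assumption.
func (f pathFilter) allowed(path string) bool {
	for _, re := range f.exclude {
		if re.MatchString(path) {
			return false
		}
	}
	for _, re := range f.include {
		if re.MatchString(path) {
			return true
		}
	}
	return false
}

// newFilter compiles the patterns; it panics on an invalid pattern, which is
// acceptable for a sketch with fixed inputs.
func newFilter(include, exclude []string) pathFilter {
	var f pathFilter
	for _, p := range include {
		f.include = append(f.include, regexp.MustCompile(p))
	}
	for _, p := range exclude {
		f.exclude = append(f.exclude, regexp.MustCompile(p))
	}
	return f
}

func main() {
	// Patterns taken from the Caddyfile example.
	f := newFilter([]string{`/static/docs/`}, []string{`^/blog/admin/`, `robots.txt`})
	for _, p := range []string{"/static/docs/intro.html", "/blog/admin/login", "/static/docs/robots.txt"} {
		fmt.Println(p, f.allowed(p))
	}
}
```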

type QueryResults

type QueryResults struct {
	httpserver.Context
	Query   string
	Results []Result
}

type Result

type Result struct {
	Path     string
	Title    string
	Body     template.HTML
	Json     string
	Modified time.Time
	Indexed  time.Time
	From     int
	Size     int
}

Result is the structure for the search result

type Search

type Search struct {
	DbName          string
	Engine          string
	IncludePathsStr []string
	ExcludePathsStr []string
	Endpoint        string
	IndexDirectory  string
	TemplateRaw     string
	Expire          time.Duration
	SiteRoot        string
	NumWorkers      int
	Analyzer        string
	MaxSizeFile     int
	FileWatcher     bool

	Indexer      indexer.Handler
	IndexManager *IndexerManager
	IncludePaths []*regexp.Regexp
	ExcludePaths []*regexp.Regexp
	Template     *template.Template
	// contains filtered or unexported fields
}

Search represents this middleware structure

func (*Search) CaddyModule

func (*Search) CaddyModule() caddy.ModuleInfo

CaddyModule returns the Caddy module information.

func (*Search) Cleanup

func (m *Search) Cleanup() error

func (*Search) Provision

func (search *Search) Provision(ctx caddy.Context) (err error)

Provision sets up the module.

func (*Search) SearchHTML

func (s *Search) SearchHTML(w http.ResponseWriter, r *http.Request) error

SearchHTML renders the search results in the HTML template
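
Rendering results through html/template might look like the sketch below. The htmlHit and resultsPage types and the template text are hypothetical; they borrow the Query, Results, Path, Title, and Body names from QueryResults and Result, but nothing here is the package's actual template.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// htmlHit and resultsPage are hypothetical, reduced versions of Result and QueryResults.
type htmlHit struct {
	Path  string
	Title string
	Body  template.HTML // already-safe HTML, emitted without escaping
}

type resultsPage struct {
	Query   string
	Results []htmlHit
}

// resultsTmpl is an illustrative template, not the one shipped with the module.
const resultsTmpl = `<h1>Results for {{.Query}}</h1>{{range .Results}}<p><a href="{{.Path}}">{{.Title}}</a> {{.Body}}</p>{{end}}`

// render executes the template against the page data and returns the HTML.
func render(p resultsPage) (string, error) {
	t, err := template.New("results").Parse(resultsTmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, p); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := render(resultsPage{
		Query:   "caddy",
		Results: []htmlHit{{Path: "/docs/intro.html", Title: "Intro", Body: "<em>caddy</em> basics"}},
	})
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Println(out)
}
```

Because Query is a plain string, html/template escapes it on output, while a template.HTML field such as Body is written as-is.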

func (*Search) SearchJSON

func (s *Search) SearchJSON(w http.ResponseWriter, r *http.Request) error

SearchJSON renders the search results in JSON format
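
As a rough illustration of JSON output, the sketch below marshals a reduced, hypothetical jsonHit type with encoding/json. The real Result type has more fields, and the JSON key names the module emits are not documented here, so the sketch relies on Go's default of using the field names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// jsonHit is a hypothetical, reduced version of Result. Without struct tags,
// encoding/json uses the Go field names as keys.
type jsonHit struct {
	Path  string
	Title string
	Size  int
}

// encodeHits serializes the hits as a JSON array.
func encodeHits(hits []jsonHit) (string, error) {
	b, err := json.Marshal(hits)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encodeHits([]jsonHit{{Path: "/docs/intro.html", Title: "Intro", Size: 10}})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
}
```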

func (*Search) ServeHTTP

func (s *Search) ServeHTTP(w http.ResponseWriter, r *http.Request, next caddyhttp.Handler) error

ServeHTTP is the HTTP handler for this middleware
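
The general shape of such a middleware is to answer requests for the search endpoint and pass everything else to the next handler. The sketch below shows that shape with plain net/http types instead of Caddy's caddyhttp.Handler, which also returns an error. The names searchMiddleware, textHandler, and serve are hypothetical, and the exact-path comparison is an assumption about how the endpoint is matched.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// searchMiddleware routes requests for endpoint to search and delegates all
// other requests to next. Exact path matching is an assumption.
func searchMiddleware(endpoint string, search, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path == endpoint {
			search.ServeHTTP(w, r)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// textHandler returns a handler that writes a fixed body (demo helper).
func textHandler(body string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, body)
	})
}

// serve sends one GET request through h and returns the response body (demo helper).
func serve(h http.Handler, target string) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Body.String()
}

func main() {
	h := searchMiddleware("/search", textHandler("search results"), textHandler("static file"))
	fmt.Println(serve(h, "/search"))
	fmt.Println(serve(h, "/index.html"))
}
```

In the Caddyfile example, file_server plays the role of next, which is why search is listed before it inside route.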

func (*Search) StartWatcher

func (m *Search) StartWatcher(fp string, indexManager *IndexerManager, index indexer.Handler)

func (*Search) UnmarshalCaddyfile

func (m *Search) UnmarshalCaddyfile(c *caddyfile.Dispenser) error

UnmarshalCaddyfile implements caddyfile.Unmarshaler.

func (*Search) Validate

func (m *Search) Validate() error

Validate implements caddy.Validator.

Directories

Path Synopsis
bleve/sego
Chinese word segmentation in Go
