find

package
v0.1.0
Warning

This package is not in the latest version of its module.

Published: Sep 10, 2025 License: Apache-2.0 Imports: 6 Imported by: 0

Documentation

Overview

Package find contains optimized wildcard matching implementations, split across two files:

match.go provides the ASCII-only, case-sensitive matching engine for maximum performance. It avoids all UTF-8/Unicode overhead by operating directly on bytes.

match_fold.go provides the Unicode-aware matching engine, with full UTF-8 support and case-insensitive matching based on Unicode simple folding.

Constants

const (
	// All supported wildcard characters
	WildcardChars = "*?.[\\"
)

Variables

var ErrBadPattern = errors.New("syntax error in pattern")

ErrBadPattern indicates a pattern was malformed.

var ErrNoMatch = errors.New("no match found")

ErrNoMatch indicates no match was found.

Functions

func IsWildcardByte

func IsWildcardByte(b byte) bool

IsWildcardByte checks if a byte is a wildcard character (`*`, `?`, `.`, `[`, `\`). This is optimized for ASCII-only matching and works directly with bytes.
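As an illustration only, a byte-level check of this kind can be sketched with a plain switch. The name isWildcardByte below is a hypothetical stand-in written for this example; it is not the package's actual implementation, which may use a lookup table or another technique.

```go
package main

import "fmt"

// isWildcardByte is a hypothetical stand-in for IsWildcardByte.
// It reports whether b is one of the five documented wildcard bytes.
func isWildcardByte(b byte) bool {
	switch b {
	case '*', '?', '.', '[', '\\':
		return true
	}
	return false
}

func main() {
	// Classify every byte of a small sample pattern.
	for _, b := range []byte(`a*b?.[\z`) {
		fmt.Printf("%q wildcard=%v\n", b, isWildcardByte(b))
	}
}
```

Because the check works on a single byte, it never needs to decode UTF-8, which is consistent with the ASCII-only design described above.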

func MatchInternal

func MatchInternal[T ~string | ~[]byte](pattern, s T) (int, error)

MatchInternal is the optimized ASCII-only case-sensitive matching algorithm. This implementation eliminates all UTF-8/Unicode overhead for maximum performance through direct byte-by-byte comparison and single-byte character advancement.

Performance optimizations:

  • No UTF-8 decoding (direct byte access)
  • No rune conversion overhead
  • Single-byte character advancement in backtracking
  • Simplified ASCII character class parsing
  • Early exit for non-wildcard patterns (plain literal comparison, no backtracking)
  • Star wildcard optimization using strings.Index/bytes.Index
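The last two optimizations in the list above can be sketched as follows. This is a simplified illustration, not the package's code: the helpers isLiteral and matchStarsOnly are hypothetical, the sketch assumes whole-string matching, and matchStarsOnly assumes the caller has already checked that `*` is the only wildcard in the pattern.

```go
package main

import (
	"fmt"
	"strings"
)

// wildcardChars mirrors the documented WildcardChars constant.
const wildcardChars = "*?.[\\"

// isLiteral reports whether pattern contains no wildcard bytes, in which
// case a matcher can fall back to a plain comparison.
func isLiteral(pattern string) bool {
	return !strings.ContainsAny(pattern, wildcardChars)
}

// matchStarsOnly matches patterns made of literal segments separated by '*'.
// Instead of advancing one byte at a time after a star, it jumps straight
// to the next occurrence of the following literal with strings.Index.
func matchStarsOnly(pattern, s string) bool {
	parts := strings.Split(pattern, "*")
	if len(parts) == 1 {
		return pattern == s // no star at all: literal comparison
	}
	first, last := parts[0], parts[len(parts)-1]
	if !strings.HasPrefix(s, first) {
		return false
	}
	s = s[len(first):]
	if !strings.HasSuffix(s, last) {
		return false
	}
	s = s[:len(s)-len(last)]
	// Middle segments must appear in order in what remains.
	for _, mid := range parts[1 : len(parts)-1] {
		i := strings.Index(s, mid)
		if i < 0 {
			return false
		}
		s = s[i+len(mid):]
	}
	return true
}

func main() {
	fmt.Println(isLiteral("report.txt"), isLiteral("report"))
	fmt.Println(matchStarsOnly("a*b*c", "axxbyyc"))
}
```

Taking the leftmost occurrence of each middle segment is sufficient here because every segment is a fixed literal, so no backtracking is needed in this restricted case.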

The algorithm supports:

  • `*`: Matches any sequence of characters (greedy with backtracking)
  • `?`: Matches zero or one character (with backtracking for both options)
  • `.`: Matches any single character except newline
  • `[abc]`: ASCII-only character classes
  • `\x`: Escape sequences for literal characters
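To make the wildcard semantics listed above concrete, here is a minimal recursive sketch operating on bytes. It is not MatchInternal: the function match is hypothetical, it returns a bool rather than (int, error), it assumes whole-string matching, it omits character classes, and it treats a trailing backslash as a non-match instead of returning ErrBadPattern. It also has exponential worst-case behavior, which a production matcher would avoid.

```go
package main

import "fmt"

// match is a hypothetical, simplified matcher for '*', '?', '.', and '\x'.
func match(p, s string) bool {
	if p == "" {
		return s == ""
	}
	switch p[0] {
	case '*':
		// Greedy: try consuming as much as possible, then back off.
		for i := len(s); i >= 0; i-- {
			if match(p[1:], s[i:]) {
				return true
			}
		}
		return false
	case '?':
		// Zero or one byte: try consuming one byte first, then none.
		if s != "" && match(p[1:], s[1:]) {
			return true
		}
		return match(p[1:], s)
	case '.':
		// Exactly one byte, but never a newline.
		return s != "" && s[0] != '\n' && match(p[1:], s[1:])
	case '\\':
		// The escaped byte must match literally.
		if len(p) < 2 {
			return false // malformed pattern; simplified in this sketch
		}
		return s != "" && s[0] == p[1] && match(p[2:], s[1:])
	default:
		return s != "" && s[0] == p[0] && match(p[1:], s[1:])
	}
}

func main() {
	fmt.Println(match("a*c", "abbc"))
	fmt.Println(match("ab?c", "abc"), match("ab?c", "abxc"))
	fmt.Println(match("a.c", "a\nc"))
	fmt.Println(match(`a\*c`, "a*c"))
}
```

Note that `?` here is a standalone wildcard for zero or one arbitrary character, as documented, not a quantifier applied to the preceding character.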

For Unicode support and case-insensitive matching, use MatchInternalFold instead. For ASCII input, MatchInternal is stated to be 2-5x faster than the Unicode-aware matcher.

func MatchInternalFold

func MatchInternalFold[T ~string | ~[]byte](pattern, s T, fold bool) (int, error)

MatchInternalFold is the Unicode-aware matching algorithm that handles both case-sensitive and case-insensitive matching with full UTF-8 support. It uses proper Unicode simple folding for case-insensitive comparisons and handles multi-byte UTF-8 sequences correctly.

Unicode capabilities:

  • Full UTF-8 decoding with utf8.DecodeRune* functions
  • Proper multi-byte character handling
  • Unicode simple folding for case-insensitive matching
  • Support for any Unicode character in patterns and input
  • Correct character width calculation for backtracking

The algorithm supports:

  • `*`: Matches any sequence of characters (greedy with backtracking)
  • `?`: Matches zero or one character (with backtracking for both options)
  • `.`: Matches any single character except newline
  • `[abc]`: Character classes with full Unicode support (always case-sensitive)
  • `\x`: Escape sequences for literal characters

The fold parameter controls case-insensitive matching using Unicode simple folding. Character classes remain case-sensitive even when fold=true to maintain standard glob behavior compatibility.
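The following sketch shows one common way to compare text under Unicode simple folding with the standard library's unicode.SimpleFold and utf8.DecodeRuneInString. The helpers equalFoldRune and equalFoldString are hypothetical names for this example, and the sketch is an assumption about the general technique rather than a copy of what MatchInternalFold does internally.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// equalFoldRune reports whether a and b are equal under simple folding.
// unicode.SimpleFold walks the fold orbit of a rune and eventually wraps
// back to the starting rune, so the loop visits every equivalent rune once.
func equalFoldRune(a, b rune) bool {
	if a == b {
		return true
	}
	for r := unicode.SimpleFold(a); r != a; r = unicode.SimpleFold(r) {
		if r == b {
			return true
		}
	}
	return false
}

// equalFoldString compares two strings rune by rune, advancing each side
// by the decoded width of its own rune so multi-byte sequences stay aligned.
func equalFoldString(a, b string) bool {
	for a != "" && b != "" {
		ra, na := utf8.DecodeRuneInString(a)
		rb, nb := utf8.DecodeRuneInString(b)
		if !equalFoldRune(ra, rb) {
			return false
		}
		a, b = a[na:], b[nb:]
	}
	return a == "" && b == ""
}

func main() {
	fmt.Println(equalFoldRune('k', '\u212A')) // 'k' and the Kelvin sign share a fold orbit
	fmt.Println(equalFoldString("École", "école"))
}
```

Simple folding maps one rune to one rune, so sequences whose case forms differ in length (for example "ß" versus "SS") do not compare equal under this scheme.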

For ASCII-only input, consider using the optimized MatchInternal function in match.go for 2-5x better performance.

func NewCharClass

func NewCharClass[T ~string | ~[]byte](pattern T, pi int) (*charClass, int, error)

NewCharClass creates a new charClass by parsing the pattern at the given position. This function is optimized for ASCII-only characters and achieves this by:

  • Operating directly on bytes without UTF-8 decoding
  • Using simplified range validation for ASCII characters
  • Avoiding Unicode character class complexity

Returns the parsed charClass, the new position after the class, and any error. For Unicode character class support, use NewcharClassFold in match_fold.go.
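A byte-oriented class parser with the same return shape (parsed class, next position, error) might look like the sketch below. Everything in it is hypothetical: the asciiClass type, the parseClass function, and the local errBadPattern value are illustrative stand-ins, and the sketch assumes a grammar with single bytes and lo-hi ranges only, with no negation or escapes. The real charClass type is unexported and its layout is not documented here.

```go
package main

import (
	"errors"
	"fmt"
)

// errBadPattern stands in for the package's ErrBadPattern.
var errBadPattern = errors.New("syntax error in pattern")

// asciiClass is a hypothetical membership table indexed by byte value.
type asciiClass [256]bool

// parseClass parses a class that starts at pattern[pi] == '[' and returns
// the class plus the index just past the closing ']'.
func parseClass(pattern string, pi int) (*asciiClass, int, error) {
	if pi >= len(pattern) || pattern[pi] != '[' {
		return nil, pi, errBadPattern
	}
	var c asciiClass
	i := pi + 1
	for i < len(pattern) && pattern[i] != ']' {
		lo := pattern[i]
		hi := lo
		i++
		// A '-' that is not directly before ']' introduces a range lo-hi.
		if i+1 < len(pattern) && pattern[i] == '-' && pattern[i+1] != ']' {
			hi = pattern[i+1]
			i += 2
		}
		if hi < lo {
			return nil, pi, errBadPattern // reversed range such as [c-a]
		}
		for b := int(lo); b <= int(hi); b++ {
			c[b] = true
		}
	}
	if i >= len(pattern) || i == pi+1 {
		return nil, pi, errBadPattern // unterminated class or empty "[]"
	}
	return &c, i + 1, nil
}

func main() {
	c, next, err := parseClass("[a-c]x", 0)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c['b'], c['x'], next)
}
```

Range validation reduces to a single byte comparison (hi < lo), which is the kind of simplification the ASCII-only design makes possible.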

func NewcharClassFold

func NewcharClassFold[T ~string | ~[]byte](pattern T, pi int) (*charClassFold, int, error)

NewcharClassFold creates a new charClassFold by parsing the pattern at the given position. Returns the parsed charClassFold, the new position after the class, and any error.

Types

This section is empty.
