Documentation ¶
Overview ¶
Package find contains optimized wildcard matching implementations, split across two files:
- match.go provides the ASCII-only, case-sensitive matching engine. It eliminates all UTF-8/Unicode overhead by operating directly on bytes, for maximum performance.
- match_fold.go provides the Unicode-aware matching engine, with full UTF-8 support and case-insensitive matching based on Unicode simple folding.
Index ¶
- Constants
- Variables
- func IsWildcardByte(b byte) bool
- func MatchInternal[T ~string | ~[]byte](pattern, s T) (int, error)
- func MatchInternalFold[T ~string | ~[]byte](pattern, s T, fold bool) (int, error)
- func NewCharClass[T ~string | ~[]byte](pattern T, pi int) (*charClass, int, error)
- func NewcharClassFold[T ~string | ~[]byte](pattern T, pi int) (*charClassFold, int, error)
Constants ¶
const (
// All supported wildcard characters
WildcardChars = "*?.[\\"
)
Variables ¶
var ErrBadPattern = errors.New("syntax error in pattern")
ErrBadPattern indicates a pattern was malformed.
var ErrNoMatch = errors.New("no match found")
ErrNoMatch indicates no match was found.
Functions ¶
func IsWildcardByte ¶
IsWildcardByte checks if a byte is a wildcard character (`*`, `?`, `.`, `[`, `\`). This is optimized for ASCII-only matching and works directly with bytes.
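A check like this can be written as a single switch over the five documented bytes. The following is a minimal sketch, not the package's actual source: isWildcardByte and hasWildcard are hypothetical names, and the pre-scan in hasWildcard is only one plausible way such a check might be used.

```go
package main

import "fmt"

// isWildcardByte is a hypothetical stand-in for IsWildcardByte: a switch
// over the five documented wildcard bytes, with no UTF-8 decoding.
func isWildcardByte(b byte) bool {
	switch b {
	case '*', '?', '.', '[', '\\':
		return true
	}
	return false
}

// hasWildcard reports whether pattern contains any wildcard byte, which
// would let a matcher fall back to a plain literal comparison.
func hasWildcard(pattern string) bool {
	for i := 0; i < len(pattern); i++ {
		if isWildcardByte(pattern[i]) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(hasWildcard("report.txt")) // true: '.' is a wildcard here
	fmt.Println(hasWildcard("README"))     // false
}
```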
func MatchInternal ¶
MatchInternal is the optimized ASCII-only case-sensitive matching algorithm. This implementation eliminates all UTF-8/Unicode overhead for maximum performance through direct byte-by-byte comparison and single-byte character advancement.
Performance optimizations:
- No UTF-8 decoding (direct byte access)
- No rune conversion overhead
- Single-byte character advancement in backtracking
- Simplified ASCII character class parsing
- Early exit for non-wildcard patterns (plain literal comparison, no backtracking)
- Star wildcard optimization using strings.Index/bytes.Index
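To illustrate the last item, the sketch below shows how strings.Index can replace byte-by-byte backtracking when a pattern contains only literal text and `*`. It is an illustration of the general technique under that restriction; matchStarsOnly is a hypothetical name and the package's real implementation may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// matchStarsOnly handles patterns made of literal text and '*' only.
// Every other byte, including '?', '.', '[' and '\', is treated literally.
func matchStarsOnly(pattern, s string) bool {
	parts := strings.Split(pattern, "*")
	if len(parts) == 1 {
		return pattern == s // no '*': plain literal comparison
	}
	first, last := parts[0], parts[len(parts)-1]
	// The text before the first '*' must be a prefix of s.
	if !strings.HasPrefix(s, first) {
		return false
	}
	s = s[len(first):]
	// The text after the last '*' must be a suffix of what remains.
	if !strings.HasSuffix(s, last) {
		return false
	}
	s = s[:len(s)-len(last)]
	// Each middle segment is located with strings.Index; taking the
	// leftmost occurrence leaves the most room for later segments.
	for _, mid := range parts[1 : len(parts)-1] {
		i := strings.Index(s, mid)
		if i < 0 {
			return false
		}
		s = s[i+len(mid):]
	}
	return true
}

func main() {
	fmt.Println(matchStarsOnly("log-*-2024", "log-app-2024")) // true
	fmt.Println(matchStarsOnly("a*b*c", "aXbYc"))             // true
	fmt.Println(matchStarsOnly("a*a", "a"))                   // false
}
```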
The algorithm supports:
- `*`: Matches any sequence of characters (greedy with backtracking)
- `?`: Matches zero or one character (with backtracking for both options)
- `.`: Matches any single character except newline
- `[abc]`: ASCII-only character classes
- `\x`: Escape sequences for literal characters
On ASCII input, MatchInternal is 2-5x faster than Unicode-aware matching. For Unicode support or case-insensitive matching, use MatchInternalFold instead.
func MatchInternalFold ¶
MatchInternalFold is the Unicode-aware matching algorithm that handles both case-sensitive and case-insensitive matching with full UTF-8 support. It uses proper Unicode simple folding for case-insensitive comparisons and handles multi-byte UTF-8 sequences correctly.
Unicode capabilities:
- Full UTF-8 decoding with utf8.DecodeRune* functions
- Proper multi-byte character handling
- Unicode simple folding for case-insensitive matching
- Support for any Unicode character in patterns and input
- Correct character width calculation for backtracking
The algorithm supports:
- `*`: Matches any sequence of characters (greedy with backtracking)
- `?`: Matches zero or one character (with backtracking for both options)
- `.`: Matches any single character except newline
- `[abc]`: Character classes with full Unicode support (always case-sensitive)
- `\x`: Escape sequences for literal characters
The fold parameter enables case-insensitive matching using Unicode simple folding. Character classes remain case-sensitive even when fold is true, for compatibility with standard glob behavior.
For ASCII-only input, consider using the optimized MatchInternal function in match.go for 2-5x better performance.
func NewCharClass ¶
NewCharClass creates a new charClass by parsing the pattern at the given position. It is optimized for ASCII-only characters and maximizes performance by:
- Operating directly on bytes without UTF-8 decoding
- Using simplified range validation for ASCII characters
- Avoiding Unicode character class complexity
It returns the parsed charClass, the position immediately after the class, and any error. For Unicode character class support, use NewcharClassFold in match_fold.go.
Types ¶
This section is empty.