revregex

v0.1.3
Published: Apr 25, 2022 License: MIT Imports: 7 Imported by: 0

README

revregex

A package you can use to generate all possible strings matching a given regular expression.

How to use


package main

import (
	"fmt"

	"github.com/xavier268/revregex"
)

func main() {
	// start from a POSIX regular expression pattern
	pattern := "a(b|c)a?"
	// create a generator for this pattern
	generator := revregex.NewGen(pattern)
	// create a source of entropy
	chooser := revregex.NewRandChooser()

	// Now, ask the generator to create a string, using the chooser to make its decisions.
	result := generator.Next(chooser) // for instance, result will be "aca" or "ab"
	fmt.Println(result)

	// if you don't trust the package and want to verify that the string actually matches,
	// pass the generated string (not the chooser) to Verify
	if err := generator.Verify(result); err != nil {
		fmt.Println("verification failed:", err)
	}
}
    

See full example file here

Regex syntax

Any POSIX-syntax regular expression that is valid in Go is accepted.

No flags are available. Grouping or named captures have no effect. Word, line or text boundaries are meaningless here and should be avoided.

Unbounded metacharacters such as * or + are accepted. They generate shorter strings first, giving longer strings an exponentially decreasing priority.
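As a rough illustration of how shorter strings can end up with exponentially higher priority, the sketch below picks a repetition count by asking a Chooser for one binary decision per extra repetition. This is an assumption about one possible strategy, not a description of what revregex does internally; `scripted` and `repeatCount` are hypothetical names introduced only for this example.

```go
package main

import "fmt"

// Chooser is copied from the package documentation.
type Chooser interface {
	Intn(n int) int
}

// scripted is a hypothetical Chooser that replays fixed answers,
// then answers 0 forever once they are used up.
type scripted struct{ answers []int }

func (s *scripted) Intn(n int) int {
	if len(s.answers) == 0 || n <= 0 {
		return 0
	}
	a := s.answers[0] % n
	s.answers = s.answers[1:]
	return a
}

// repeatCount starts at min and adds one repetition for every 1 returned
// by c.Intn(2). With a fair random chooser, each extra repetition is half
// as likely as the previous one.
func repeatCount(c Chooser, min int) int {
	count := min
	for c.Intn(2) == 1 {
		count++
	}
	return count
}

func main() {
	fmt.Println(repeatCount(&scripted{answers: []int{1, 1, 0}}, 0)) // 2
	fmt.Println(repeatCount(&scripted{}, 1))                        // 1
}
```

With a chooser that always answers 0, the count stays at its minimum, which is consistent with the shortest string being the default.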

Beware of negative classes, such as [^a], and of the dot "." operator, because they will likely generate many strange Unicode characters (characters of up to 4 bytes!). Prefer positive classes such as [[:digit:]] or [a-zA-Z] to limit unprintable characters.
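To get a feel for the difference in size between negative and positive classes, the sketch below counts the code points covered by a character class using Go's standard regexp/syntax package. Whether revregex relies on that package internally is an assumption; `classSize` is a hypothetical helper written for this example.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// classSize parses a pattern that must be a single character class and
// returns the number of code points the class covers.
func classSize(pattern string) (int, error) {
	re, err := syntax.Parse(pattern, syntax.POSIX)
	if err != nil {
		return 0, err
	}
	if re.Op != syntax.OpCharClass {
		return 0, fmt.Errorf("%q is not a single character class", pattern)
	}
	total := 0
	// re.Rune holds pairs of inclusive bounds: lo0, hi0, lo1, hi1, ...
	for i := 0; i+1 < len(re.Rune); i += 2 {
		total += int(re.Rune[i+1]-re.Rune[i]) + 1
	}
	return total, nil
}

func main() {
	for _, p := range []string{"[[:digit:]]", "[a-zA-Z]", "[^a]"} {
		n, err := classSize(p)
		if err != nil {
			fmt.Println(p, "error:", err)
			continue
		}
		// [[:digit:]] covers 10 code points, [a-zA-Z] covers 52,
		// and [^a] covers more than a million.
		fmt.Println(p, n)
	}
}
```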

See reference here

Deterministic or random?

To choose which string to generate among the many (possibly unlimited) strings matching the provided pattern, choices have to be made. These choices are driven by any value that fulfills the Chooser interface.


type Chooser interface {
	// Intn provides a number between 0 and (n-1) included.
	Intn(n int) int
}

Two ways to construct a Chooser are provided:

  • NewRandChooser makes decisions randomly. If your computer has a good random generator, you will most likely end up generating all of the shorter possible strings.

  • NewBytesChooser takes a []byte as input and uses the information contained in it to make its choices. There is a one-to-one relation between a given byte array and the sequence of strings generated. However, the information in the byte array is consumed as strings are generated and choices are made. Once no information is left (the underlying array is nil or contains only zeros), the default choice is always made, so from that point on the same default string is generated every time.
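The idea of "consuming" information from a byte array can be sketched as follows. This is a minimal illustration, assuming the bytes are read as one big non-negative integer that is divided down at each decision; it is not the package's implementation, and `bigChooser` and `newBigChooser` are hypothetical names.

```go
package main

import (
	"fmt"
	"math/big"
)

// Chooser is copied from the package documentation.
type Chooser interface {
	// Intn provides a number between 0 and (n-1) included.
	Intn(n int) int
}

// bigChooser is a hypothetical Chooser: it treats the input bytes as a
// big-endian integer and extracts one base-n digit per decision.
type bigChooser struct {
	rest *big.Int
}

func newBigChooser(buf []byte) Chooser {
	return &bigChooser{rest: new(big.Int).SetBytes(buf)}
}

// Intn returns rest mod n and keeps rest div n for later decisions.
// Once rest reaches zero, every answer is 0, the default choice.
func (c *bigChooser) Intn(n int) int {
	if n <= 1 {
		return 0
	}
	mod := new(big.Int)
	c.rest.DivMod(c.rest, big.NewInt(int64(n)), mod)
	return int(mod.Int64())
}

func main() {
	c := newBigChooser([]byte{11}) // 11 is 102 in base 3
	fmt.Println(c.Intn(3), c.Intn(3), c.Intn(3), c.Intn(3)) // 2 0 1 0
}
```

In this sketch a nil or all-zero buffer yields 0 for every decision, which mirrors the behaviour described above for an exhausted byte array.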

Documentation

Constants

const MaxUnicode = '\U0010ffff'

MaxUnicode is the maximum Unicode character that can be generated.
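The following short sketch relates this constant to the UTF-8 byte lengths mentioned in the README. It uses only the standard library; `encodedLen` is a hypothetical helper, and the constant is redeclared locally so that the example is self-contained.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// MaxUnicode mirrors the constant documented above.
const MaxUnicode = '\U0010ffff'

// encodedLen returns the number of bytes r occupies once encoded in UTF-8.
func encodedLen(r rune) int {
	return utf8.RuneLen(r)
}

func main() {
	// The documented value is the same as the standard library's unicode.MaxRune.
	fmt.Println(MaxUnicode == unicode.MaxRune) // true
	// ASCII needs 1 byte, the largest code point needs 4.
	fmt.Println(encodedLen('a'), encodedLen('é'), encodedLen(MaxUnicode)) // 1 2 4
}
```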

const VERSION = "0.1.3"

Variables

var ErrVerificationFailed = fmt.Errorf("verification failed")

Functions

This section is empty.

Types

type Chooser

type Chooser interface {
	// Intn provides a number between 0 and (n-1) included.
	Intn(n int) int
}

Chooser defines an interface used to make choices.

func NewBytesChooser

func NewBytesChooser(buf []byte) Chooser

NewBytesChooser uses buf as its source of information for decisions. This makes the exploration of all possible strings perfectly deterministic. Using a chooser to make a decision "consumes" the available information. Once all the information is "consumed", the default, or first, choice is always the one chosen.

func NewRandChooser

func NewRandChooser() Chooser

NewRandChooser uses a random source to make decisions. It is guaranteed that no string has a zero probability, but longer strings have a much lower chance of appearing.

func NewRandChooserSeed

func NewRandChooserSeed(seed int64) Chooser

NewRandChooserSeed uses a random source to make decisions. It is guaranteed that no string has a zero probability, but longer strings have a much lower chance of appearing. Setting the seed allows for reproducibility in tests.
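Reproducibility with a fixed seed can be illustrated with the standard library alone, because *rand.Rand from math/rand already has an Intn(n int) int method and therefore satisfies the Chooser interface. The sketch below is an assumption about how a seeded chooser could be built, not the package's own code; `newSeeded` and `draw` are hypothetical names.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Chooser is copied from the package documentation.
type Chooser interface {
	Intn(n int) int
}

// newSeeded returns a Chooser backed by a math/rand generator with a fixed seed.
func newSeeded(seed int64) Chooser {
	return rand.New(rand.NewSource(seed))
}

// draw asks c for k successive choices among n alternatives.
func draw(c Chooser, k, n int) []int {
	out := make([]int, k)
	for i := range out {
		out[i] = c.Intn(n)
	}
	return out
}

func main() {
	a := draw(newSeeded(42), 5, 3)
	b := draw(newSeeded(42), 5, 3)
	fmt.Println(a)
	fmt.Println(b) // same seed, same sequence of choices
}
```

Note that a *rand.Rand created this way is not safe for concurrent use, so a chooser like this one should not be shared between goroutines without extra synchronization.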

type Gen

type Gen struct {
	// contains filtered or unexported fields
}

Gen can generate deterministic or random strings that match a given regexp. Gen is thread safe.

func NewGen

func NewGen(source string) *Gen

NewGen creates a new generator. It will panic if the regexp provided is not syntactically correct. Use POSIX syntax. The parse tree is not implicitly simplified.

Example
pattern := "a(b|c)a?"
generator := NewGen(pattern)
entropy := NewRandChooserSeed(42) // or use NewRandChooser() for real randomness ...

// Generate 5 strings that match "a(b|c)a?"
for i := 0; i < 5; i++ {
	result := generator.Next(entropy)
	fmt.Println(result)

	// Verify each generated string actually matches ?
	if err := generator.Verify(result); err != nil {
		fmt.Println("Verification failed with error : ", err)
	}
}
Output:

aca
ab
aca
ac
aba

func NewGenSimpl

func NewGenSimpl(source string) *Gen

NewGenSimpl is the same as NewGen, except that the parse tree is also simplified.
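What simplifying a parse tree can mean is sketched below with Go's standard regexp/syntax package, whose Simplify method rewrites counted repetitions such as a{2,3} into plain concatenations. That revregex performs this particular simplification is an assumption that the documentation does not confirm; `hasCountedRepeat` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// hasCountedRepeat reports whether the top of the parse tree for pattern is
// a counted repetition (x{n,m}), optionally after calling Simplify.
func hasCountedRepeat(pattern string, simplify bool) (bool, error) {
	re, err := syntax.Parse(pattern, syntax.POSIX)
	if err != nil {
		return false, err
	}
	if simplify {
		re = re.Simplify()
	}
	return re.Op == syntax.OpRepeat, nil
}

func main() {
	raw, err := hasCountedRepeat("a{2,3}", false)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	simplified, err := hasCountedRepeat("a{2,3}", true)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(raw, simplified) // true false
}
```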

func (*Gen) Next

func (g *Gen) Next(it Chooser) string

Next generates a string that matches the provided regexp, using the provided Chooser to make its choices.

func (*Gen) String

func (g *Gen) String() string

func (*Gen) Verify

func (g *Gen) Verify(s string) error

Verify checks whether a string matches the regexp used to create g.
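An independent check of the same kind can be written with the standard regexp package. The sketch below is not how Verify is implemented; it assumes that a full match is what is being verified, and `verify` and `errNoMatch` are hypothetical names. It relies on the leftmost-longest semantics of regexp.CompilePOSIX: the whole string matches exactly when the first match spans it from start to end.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// errNoMatch plays the role of a "verification failed" error in this sketch.
var errNoMatch = errors.New("verification failed")

// verify returns nil if s as a whole matches the POSIX pattern.
func verify(pattern, s string) error {
	re, err := regexp.CompilePOSIX(pattern)
	if err != nil {
		return err
	}
	// With leftmost-longest matching, the first match covers the whole
	// string if and only if the whole string matches the pattern.
	loc := re.FindStringIndex(s)
	if loc == nil || loc[0] != 0 || loc[1] != len(s) {
		return errNoMatch
	}
	return nil
}

func main() {
	fmt.Println(verify("a(b|c)a?", "aca")) // <nil>
	fmt.Println(verify("a(b|c)a?", "ad"))  // verification failed
}
```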
