Documentation ¶
Index ¶
- Constants
- type Attribute
- type BytesTermAttr
- type CharTermAttr
- type DefaultAttributeFactory
- type Factory
- type OffsetAttr
- type PackedTokenAttr
- type PayloadAttr
- type PositionIncrAttr
- type PositionLengthAttr
- type Source
- func (r *Source) BytesTerm() BytesTermAttr
- func (r *Source) CharTerm() CharTermAttr
- func (r *Source) Offset() OffsetAttr
- func (r *Source) PackedTokenAttribute() PackedTokenAttr
- func (r *Source) Payload() PayloadAttr
- func (r *Source) PositionIncrement() PositionIncrAttr
- func (r *Source) PositionLength() PositionLengthAttr
- func (r *Source) Reset() error
- func (r *Source) Term2Bytes() Term2BytesAttr
- func (r *Source) TermFrequency() TermFreqAttr
- func (r *Source) Type() TypeAttr
- type Term2BytesAttr
- type TermFreqAttr
- type TypeAttr
Constants ¶
const (
	ClassBytesTerm         = "BytesTerm"
	ClassCharTerm          = "CharTerm"
	ClassOffset            = "Offset"
	ClassPositionIncrement = "PositionIncrement"
	ClassPayload           = "Payload"
	ClassPositionLength    = "PositionLength"
	ClassTermFrequency     = "TermFrequency"
	ClassTermToBytesRef    = "TermToBytesRef"
	ClassType              = "Type"
)
const (
DEFAULT_TYPE = "word"
)
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Attribute ¶
type Attribute interface {
	Interfaces() []string
	Reset() error
	CopyTo(target Attribute) error
	Clone() Attribute
}
Attribute Base interface for attributes that can be added to an AttributeSourceV2. Attributes are used to add data in a dynamic, yet type-safe way to a source of usually streamed objects, such as a token stream.
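The contract can be illustrated with a minimal, self-contained sketch. The Attribute interface is reproduced locally and myFlagAttr is a hypothetical attribute for illustration only, not part of this package:

```go
package main

import "fmt"

// Attribute reproduces the interface above so the sketch is self-contained.
type Attribute interface {
	Interfaces() []string
	Reset() error
	CopyTo(target Attribute) error
	Clone() Attribute
}

// myFlagAttr is a hypothetical attribute carrying a single boolean.
type myFlagAttr struct{ flag bool }

// Interfaces reports the attribute interfaces this implementation provides.
func (a *myFlagAttr) Interfaces() []string { return []string{"MyFlag"} }

// Reset restores the attribute to its default state.
func (a *myFlagAttr) Reset() error { a.flag = false; return nil }

// CopyTo copies this attribute's state into target.
func (a *myFlagAttr) CopyTo(target Attribute) error {
	t, ok := target.(*myFlagAttr)
	if !ok {
		return fmt.Errorf("cannot copy to %T", target)
	}
	t.flag = a.flag
	return nil
}

// Clone returns an independent copy of this attribute.
func (a *myFlagAttr) Clone() Attribute { return &myFlagAttr{flag: a.flag} }

func main() {
	src := &myFlagAttr{flag: true}
	dst := &myFlagAttr{}
	_ = src.CopyTo(dst)
	clone := src.Clone().(*myFlagAttr)
	_ = src.Reset()
	// dst and clone keep the copied state; src was reset independently.
	fmt.Println(dst.flag, clone.flag, src.flag) // true true false
}
```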
type BytesTermAttr ¶
type BytesTermAttr interface {
	Attribute

	GetBytes() []byte

	// SetBytes
	// Sets the bytes of the term.
	SetBytes(bytes []byte) error

	Reset() error
}
BytesTermAttr This attribute can be used if you have the raw term bytes to be indexed. It can be used as a replacement for CharTermAttr if binary terms should be indexed.
type CharTermAttr ¶
type CharTermAttr interface {
	Attribute

	GetBytes() []byte

	// GetString
	// Returns the internal termBuffer character array, which you can then directly alter.
	// If the array is too small for your token, use resizeBuffer(int) to increase it. After
	// altering the buffer, be sure to call setLength to record the number of valid characters
	// that were placed into the termBuffer.
	GetString() string

	// AppendString
	// Appends the specified string to this character sequence. The characters of the string
	// argument are appended, in order, increasing the length of this sequence by the length
	// of the argument.
	AppendString(s string) error

	AppendRune(r rune) error

	// Reset
	// Sets the length of the termBuffer to zero. Use this method before appending contents
	// using the Appendable interface.
	Reset() error
}
CharTermAttr The term text of a Token.
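The append/reset contract can be sketched with a local charTerm type standing in for the package's implementation (hypothetical, for illustration only):

```go
package main

import "fmt"

// charTerm is a hypothetical stand-in for a CharTermAttr implementation,
// backed by a rune buffer.
type charTerm struct{ buf []rune }

// GetString returns the current term text.
func (c *charTerm) GetString() string { return string(c.buf) }

// AppendString appends the characters of s, in order, increasing the
// length of the term by the length of the argument.
func (c *charTerm) AppendString(s string) error {
	c.buf = append(c.buf, []rune(s)...)
	return nil
}

// AppendRune appends a single rune to the term.
func (c *charTerm) AppendRune(r rune) error {
	c.buf = append(c.buf, r)
	return nil
}

// Reset sets the length of the term buffer to zero; call it before
// appending new contents.
func (c *charTerm) Reset() error {
	c.buf = c.buf[:0]
	return nil
}

func main() {
	t := &charTerm{}
	_ = t.AppendString("run")
	_ = t.AppendRune('s')
	fmt.Println(t.GetString()) // runs
	_ = t.Reset()
	_ = t.AppendString("walk")
	fmt.Println(t.GetString()) // walk
}
```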
type DefaultAttributeFactory ¶
type DefaultAttributeFactory struct { }
func (DefaultAttributeFactory) CreateAttributeInstance ¶
func (d DefaultAttributeFactory) CreateAttributeInstance(class string) (Attribute, error)
type Factory ¶
type Factory interface {
	// CreateAttributeInstance
	// Returns an Attribute implementation for the supplied attribute class name.
	CreateAttributeInstance(class string) (Attribute, error)
}
var (
DEFAULT_ATTRIBUTE_FACTORY Factory = &DefaultAttributeFactory{}
)
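The factory pattern dispatches on a class-name constant and errors on unknown classes. A sketch under local assumptions follows; simpleFactory and typeAttr are hypothetical stand-ins, and the "Type"/"word" strings mirror the ClassType and DEFAULT_TYPE constants above:

```go
package main

import "fmt"

// typeAttr is a hypothetical stand-in for a TypeAttr implementation.
type typeAttr struct{ typ string }

// simpleFactory sketches a Factory: it maps a class-name constant to a
// fresh attribute instance and returns an error for unknown classes.
type simpleFactory struct{}

// CreateAttributeInstance returns an attribute for the supplied class name.
func (simpleFactory) CreateAttributeInstance(class string) (any, error) {
	switch class {
	case "Type": // ClassType
		return &typeAttr{typ: "word"}, nil // DEFAULT_TYPE
	default:
		return nil, fmt.Errorf("unknown attribute class: %s", class)
	}
}

func main() {
	f := simpleFactory{}
	a, err := f.CreateAttributeInstance("Type")
	fmt.Println(a.(*typeAttr).typ, err) // word <nil>
	_, err = f.CreateAttributeInstance("Bogus")
	fmt.Println(err != nil) // true
}
```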
type OffsetAttr ¶
type OffsetAttr interface {
	// StartOffset
	// Returns this Token's starting offset, the position of the first character corresponding
	// to this token in the source text.
	// Note that the difference between EndOffset() and StartOffset() may not equal the length
	// of the term text, as the term text may have been altered by a stemmer or some other filter.
	// See Also: SetOffset
	StartOffset() int

	// EndOffset
	// Returns this Token's ending offset, one greater than the position of the last character
	// corresponding to this token in the source text. The length of the token in the source
	// text is EndOffset() - StartOffset().
	// See Also: SetOffset
	EndOffset() int

	// SetOffset
	// Sets the starting and ending offset.
	// Returns an error if startOffset or endOffset is negative, or if startOffset is greater
	// than endOffset.
	// See Also: StartOffset, EndOffset
	SetOffset(startOffset, endOffset int) error
}
OffsetAttr The start and end character offset of a Token.
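The documented validation (non-negative offsets, start not greater than end) can be sketched with a local offsetAttr type; it is an illustration, not this package's implementation:

```go
package main

import "fmt"

// offsetAttr is a hypothetical stand-in for an OffsetAttr implementation.
type offsetAttr struct{ start, end int }

// StartOffset returns the position of the token's first character.
func (o *offsetAttr) StartOffset() int { return o.start }

// EndOffset returns one past the position of the token's last character.
func (o *offsetAttr) EndOffset() int { return o.end }

// SetOffset stores the offsets, rejecting negative values and
// startOffset > endOffset, per the documented contract.
func (o *offsetAttr) SetOffset(startOffset, endOffset int) error {
	if startOffset < 0 || endOffset < 0 || startOffset > endOffset {
		return fmt.Errorf("invalid offsets: start=%d end=%d", startOffset, endOffset)
	}
	o.start, o.end = startOffset, endOffset
	return nil
}

func main() {
	var o offsetAttr
	fmt.Println(o.SetOffset(3, 7))             // <nil>
	fmt.Println(o.EndOffset() - o.StartOffset()) // 4
	fmt.Println(o.SetOffset(5, 2) != nil)      // true
}
```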
type PackedTokenAttr ¶
type PackedTokenAttr interface {
	Attribute
	TypeAttr
	PositionIncrAttr
	PositionLengthAttr
	OffsetAttr
	TermFreqAttr
	CharTermAttr
}
func NewPackedTokenAttr ¶
func NewPackedTokenAttr() PackedTokenAttr
type PayloadAttr ¶
type PayloadAttr interface {
	Attribute

	// GetPayload
	// Returns this Token's payload.
	// See Also: SetPayload
	GetPayload() []byte

	// SetPayload
	// Sets this Token's payload.
	// See Also: GetPayload
	SetPayload(payload []byte) error

	Reset() error
}
PayloadAttr The payload of a Token. The payload is stored in the index at each position, and can be used to influence scoring when using Payload-based queries. NOTE: because the payload will be stored at each position, it's usually best to use the minimum number of bytes necessary. Some codec implementations may optimize payload storage when all payloads have the same length. See Also: org.apache.lucene.index.PostingsEnum
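A minimal sketch of the payload accessors, assuming a defensive copy on set so the attribute stays independent of the caller's buffer (the copy is an assumption of this sketch, and payloadAttr is a hypothetical stand-in):

```go
package main

import "fmt"

// payloadAttr is a hypothetical stand-in for a PayloadAttr implementation.
type payloadAttr struct{ payload []byte }

// GetPayload returns the stored payload bytes.
func (p *payloadAttr) GetPayload() []byte { return p.payload }

// SetPayload stores a copy of the payload so later mutation of the
// caller's slice does not change the attribute (sketch assumption).
func (p *payloadAttr) SetPayload(payload []byte) error {
	p.payload = append([]byte(nil), payload...)
	return nil
}

// Reset clears the stored payload.
func (p *payloadAttr) Reset() error {
	p.payload = nil
	return nil
}

func main() {
	var p payloadAttr
	buf := []byte{0x01}
	_ = p.SetPayload(buf)
	buf[0] = 0xFF // mutating the caller's buffer does not affect the attribute
	fmt.Println(p.GetPayload()[0]) // 1
}
```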
type PositionIncrAttr ¶
type PositionIncrAttr interface {
	// SetPositionIncrement
	// Sets the position increment. The default value is one.
	// positionIncrement: the distance from the prior term.
	SetPositionIncrement(positionIncrement int) error

	// GetPositionIncrement
	// Returns the position increment of this Token.
	GetPositionIncrement() int
}
PositionIncrAttr Determines the position of this token relative to the previous Token in a TokenStream, used in phrase searching. The default value is one. Some common uses for this are:
- Set it to zero to put multiple terms in the same position. This is useful if, e.g., a word has multiple stems. Searches for phrases including either stem will match. In this case, all but the first stem's increment should be set to zero: the increment of the first instance should be one. Repeating a token with an increment of zero can also be used to boost the scores of matches on that token.
- Set it to values greater than one to inhibit exact phrase matches. If, for example, one does not want phrases to match across removed stop words, then one could build a stop word filter that removes stop words and also sets the increment to the number of stop words removed before each non-stop word. Then exact phrase queries will only match when the terms occur with no intervening stop words.
See Also: org.apache.lucene.index.PostingsEnum
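The stop-word case above can be sketched as follows: tokens removed by the filter add their increments to the next surviving token, so exact phrase queries will not match across the gap. The token type and stop list here are local illustrations, not this package's API:

```go
package main

import "fmt"

// token pairs a term with its position increment.
type token struct {
	term    string
	posIncr int
}

// removeStopWords drops stop words and folds each removed token's
// increment into the next surviving token, preserving positions.
func removeStopWords(in []token, stop map[string]bool) []token {
	out := make([]token, 0, len(in))
	skipped := 0
	for _, t := range in {
		if stop[t.term] {
			skipped += t.posIncr
			continue
		}
		t.posIncr += skipped
		skipped = 0
		out = append(out, t)
	}
	return out
}

func main() {
	in := []token{{"quick", 1}, {"and", 1}, {"the", 1}, {"lazy", 1}}
	stop := map[string]bool{"and": true, "the": true}
	for _, t := range removeStopWords(in, stop) {
		fmt.Printf("%s:%d ", t.term, t.posIncr)
	}
	// "lazy" keeps a gap of 3: its own increment plus the two removed stop words.
}
```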
type PositionLengthAttr ¶
type PositionLengthAttr interface {
	// SetPositionLength
	// Sets the position length of this Token: how many positions this token spans.
	// The default value is one.
	// Returns an error if positionLength is zero or negative.
	// See Also: GetPositionLength
	SetPositionLength(positionLength int) error

	// GetPositionLength
	// Returns the position length of this Token.
	// See Also: SetPositionLength
	GetPositionLength() int
}
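The documented contract (a default of one, positive values only) can be sketched with a local posLenAttr type; it is an illustration, not this package's implementation:

```go
package main

import "fmt"

// posLenAttr is a hypothetical stand-in for a PositionLengthAttr
// implementation; the zero value represents the default length of one.
type posLenAttr struct{ length int }

// GetPositionLength returns the position length, defaulting to one.
func (p *posLenAttr) GetPositionLength() int {
	if p.length == 0 {
		return 1
	}
	return p.length
}

// SetPositionLength stores the length, rejecting zero or negative
// values per the documented contract.
func (p *posLenAttr) SetPositionLength(positionLength int) error {
	if positionLength < 1 {
		return fmt.Errorf("position length must be >= 1, got %d", positionLength)
	}
	p.length = positionLength
	return nil
}

func main() {
	var p posLenAttr
	fmt.Println(p.GetPositionLength())         // 1
	fmt.Println(p.SetPositionLength(0) != nil) // true
	_ = p.SetPositionLength(2)
	fmt.Println(p.GetPositionLength()) // 2
}
```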
type Source ¶
type Source struct {
// contains filtered or unexported fields
}
func (*Source) BytesTerm ¶
func (r *Source) BytesTerm() BytesTermAttr
func (*Source) CharTerm ¶
func (r *Source) CharTerm() CharTermAttr
func (*Source) Offset ¶
func (r *Source) Offset() OffsetAttr
func (*Source) PackedTokenAttribute ¶
func (r *Source) PackedTokenAttribute() PackedTokenAttr
func (*Source) Payload ¶
func (r *Source) Payload() PayloadAttr
func (*Source) PositionIncrement ¶
func (r *Source) PositionIncrement() PositionIncrAttr
func (*Source) PositionLength ¶
func (r *Source) PositionLength() PositionLengthAttr
func (*Source) Reset ¶
func (r *Source) Reset() error
func (*Source) Term2Bytes ¶
func (r *Source) Term2Bytes() Term2BytesAttr
func (*Source) TermFrequency ¶
func (r *Source) TermFrequency() TermFreqAttr
func (*Source) Type ¶
func (r *Source) Type() TypeAttr
type Term2BytesAttr ¶
Term2BytesAttr This attribute is requested by TermsHashPerField to index the contents. It can be used to customize the final []byte encoding of terms. Consumers of this attribute call getBytesRef() for each term.
type TermFreqAttr ¶
type TermFreqAttr interface {
	// SetTermFrequency
	// Sets the custom term frequency of the current term within one document.
	SetTermFrequency(termFrequency int) error

	// GetTermFrequency
	// Returns the custom term frequency.
	GetTermFrequency() int
}
TermFreqAttr Sets the custom term frequency of a term within one document. If this attribute is present in your analysis chain for a given field, that field must be indexed with IndexOptions.DOCS_AND_FREQS.
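A minimal sketch of the frequency accessors, assuming a default of one and rejecting non-positive values (the validation mirrors Lucene's behavior but is an assumption of this sketch; termFreqAttr is a hypothetical stand-in):

```go
package main

import "fmt"

// termFreqAttr is a hypothetical stand-in for a TermFreqAttr
// implementation; the zero value represents the default frequency of one.
type termFreqAttr struct{ freq int }

// GetTermFrequency returns the custom term frequency, defaulting to one.
func (t *termFreqAttr) GetTermFrequency() int {
	if t.freq == 0 {
		return 1
	}
	return t.freq
}

// SetTermFrequency stores the frequency, rejecting values below one
// (sketch assumption mirroring Lucene).
func (t *termFreqAttr) SetTermFrequency(termFrequency int) error {
	if termFrequency < 1 {
		return fmt.Errorf("term frequency must be >= 1, got %d", termFrequency)
	}
	t.freq = termFrequency
	return nil
}

func main() {
	var t termFreqAttr
	fmt.Println(t.GetTermFrequency()) // 1
	_ = t.SetTermFrequency(3)
	fmt.Println(t.GetTermFrequency()) // 3
}
```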