ranking

package
v0.4.15
Published: Jan 10, 2024 License: Apache-2.0 Imports: 26 Imported by: 0

Documentation

Constants

const (
	CollaborativeBPR = "bpr"
	CollaborativeCCD = "ccd"
)

Variables

This section is empty.

Functions

func Evaluate

func Evaluate(estimator MatrixFactorization, testSet, trainSet *DataSet, topK, numCandidates, nJobs int, scorers ...Metric) []float32

Evaluate evaluates a model in top-n tasks.

func GetModelName added in v0.2.5

func GetModelName(m Model) string

func HR

func HR(targetSet *i32set.Set, rankList []int32) float32

HR means Hit Ratio.
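
As a rough illustration, hit ratio for one ranked list is commonly taken to be 1 if any target item appears in the list and 0 otherwise. The sketch below assumes that definition and uses a plain map in place of `*i32set.Set`; `hitRatio` is a hypothetical helper, not the package's implementation.

```go
package main

import "fmt"

// hitRatio is a hypothetical stand-in for HR: it returns 1 if any item in
// rankList belongs to targetSet, and 0 otherwise.
// A map replaces *i32set.Set to keep the example self-contained.
func hitRatio(targetSet map[int32]struct{}, rankList []int32) float32 {
	for _, item := range rankList {
		if _, ok := targetSet[item]; ok {
			return 1
		}
	}
	return 0
}

func main() {
	target := map[int32]struct{}{3: {}}
	fmt.Println(hitRatio(target, []int32{5, 3, 9})) // item 3 is in the list
	fmt.Println(hitRatio(target, []int32{5, 7, 9})) // no target item in the list
}
```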

func LoadDataFromBuiltIn

func LoadDataFromBuiltIn(dataSetName string) (*DataSet, *DataSet, error)

LoadDataFromBuiltIn loads a built-in data set. Currently supported:

func MAP

func MAP(targetSet *i32set.Set, rankList []int32) float32

MAP means Mean Average Precision. See: http://sdsawtelle.github.io/blog/output/mean-average-precision-MAP-for-recommender-systems.html
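
A minimal sketch of average precision for a single ranked list may help here. It assumes binary relevance and normalisation by the number of relevant items; both are assumptions, and `averagePrecision` is a hypothetical name rather than the package's code.

```go
package main

import "fmt"

// averagePrecision is a hypothetical stand-in for MAP on one ranked list.
// At every rank that holds a relevant item it adds the precision up to that
// rank, then divides by the number of relevant items (assumed normalisation).
func averagePrecision(targetSet map[int32]struct{}, rankList []int32) float32 {
	if len(targetSet) == 0 {
		return 0
	}
	var sum float32
	hits := 0
	for i, item := range rankList {
		if _, ok := targetSet[item]; ok {
			hits++
			sum += float32(hits) / float32(i+1)
		}
	}
	return sum / float32(len(targetSet))
}

func main() {
	target := map[int32]struct{}{1: {}, 3: {}}
	// Hits at ranks 1 and 3: (1/1 + 2/3) / 2.
	fmt.Printf("%.4f\n", averagePrecision(target, []int32{1, 2, 3}))
}
```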

func MRR

func MRR(targetSet *i32set.Set, rankList []int32) float32

MRR means Mean Reciprocal Rank.

The mean reciprocal rank is a statistical measure for evaluating any process that produces a list of possible responses to a sample of queries, ordered by probability of correctness. The reciprocal rank of a query response is the multiplicative inverse of the rank of the first correct answer: 1 for first place, 1/2 for second place, 1/3 for third place, and so on. The mean reciprocal rank is the average of the reciprocal ranks of results for a sample of queries Q:

MRR = \frac{1}{|Q|} \sum^{|Q|}_{i=1} \frac{1}{rank_i}
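
The formula can be sketched as follows. The package's `MRR` takes a single target set and rank list, so it presumably returns the reciprocal rank for one query; the averaging step here is written out separately. Both helper names are hypothetical, and a map stands in for `*i32set.Set`.

```go
package main

import "fmt"

// reciprocalRank returns 1/rank of the first relevant item in rankList,
// or 0 when no relevant item appears. Ranks start at 1.
func reciprocalRank(targetSet map[int32]struct{}, rankList []int32) float32 {
	for i, item := range rankList {
		if _, ok := targetSet[item]; ok {
			return 1 / float32(i+1)
		}
	}
	return 0
}

// meanReciprocalRank averages reciprocalRank over all queries.
func meanReciprocalRank(targets []map[int32]struct{}, rankLists [][]int32) float32 {
	if len(targets) == 0 {
		return 0
	}
	var sum float32
	for q := range targets {
		sum += reciprocalRank(targets[q], rankLists[q])
	}
	return sum / float32(len(targets))
}

func main() {
	targets := []map[int32]struct{}{{1: {}}, {2: {}}, {3: {}}}
	lists := [][]int32{{1, 8, 9}, {8, 2, 9}, {8, 9, 3}}
	// First hits at ranks 1, 2 and 3: (1 + 1/2 + 1/3) / 3.
	fmt.Printf("%.4f\n", meanReciprocalRank(targets, lists))
}
```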

func MarshalModel added in v0.2.7

func MarshalModel(w io.Writer, m Model) error

func NDCG

func NDCG(targetSet *i32set.Set, rankList []int32) float32

NDCG means Normalized Discounted Cumulative Gain.
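
For orientation, here is a minimal sketch of NDCG with binary relevance: each relevant item at 0-based position i contributes 1/log2(i+2), and the sum is divided by the best achievable value for the list length. The binary-relevance assumption and the name `ndcg` are illustrative, not taken from the package source.

```go
package main

import (
	"fmt"
	"math"
)

// ndcg is a hypothetical stand-in for NDCG with binary relevance.
func ndcg(targetSet map[int32]struct{}, rankList []int32) float32 {
	var dcg, idcg float64
	// Discounted cumulative gain of the actual ranking.
	for i, item := range rankList {
		if _, ok := targetSet[item]; ok {
			dcg += 1 / math.Log2(float64(i)+2)
		}
	}
	// Ideal DCG: all relevant items placed at the top of the list.
	ideal := len(targetSet)
	if len(rankList) < ideal {
		ideal = len(rankList)
	}
	for i := 0; i < ideal; i++ {
		idcg += 1 / math.Log2(float64(i)+2)
	}
	if idcg == 0 {
		return 0
	}
	return float32(dcg / idcg)
}

func main() {
	target := map[int32]struct{}{1: {}, 2: {}}
	fmt.Printf("%.4f\n", ndcg(target, []int32{1, 3, 2})) // one relevant item ranked too low
	fmt.Printf("%.4f\n", ndcg(target, []int32{1, 2, 3})) // ideal ordering
}
```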

func Precision

func Precision(targetSet *i32set.Set, rankList []int32) float32

Precision is the fraction of relevant items among the recommended items.

\frac{|\{relevant documents\} \cap \{retrieved documents\}|}{|\{retrieved documents\}|}

func Rank

func Rank(model MatrixFactorization, userId int32, candidates []int32, topN int) ([]int32, []float32)

func Recall

func Recall(targetSet *i32set.Set, rankList []int32) float32

Recall is the fraction of relevant items that have been recommended over the total number of relevant items.

\frac{|\{relevant documents\} \cap \{retrieved documents\}|}{|\{relevant documents\}|}
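
The precision and recall ratios share a numerator, the number of recommended items that are relevant, and differ only in the denominator. The sketch below makes that explicit; the helper names are hypothetical and a map stands in for `*i32set.Set`.

```go
package main

import "fmt"

// countHits returns how many items in rankList are in targetSet.
func countHits(targetSet map[int32]struct{}, rankList []int32) int {
	hits := 0
	for _, item := range rankList {
		if _, ok := targetSet[item]; ok {
			hits++
		}
	}
	return hits
}

// precision divides the hits by the number of recommended items.
func precision(targetSet map[int32]struct{}, rankList []int32) float32 {
	if len(rankList) == 0 {
		return 0
	}
	return float32(countHits(targetSet, rankList)) / float32(len(rankList))
}

// recall divides the hits by the number of relevant items.
func recall(targetSet map[int32]struct{}, rankList []int32) float32 {
	if len(targetSet) == 0 {
		return 0
	}
	return float32(countHits(targetSet, rankList)) / float32(len(targetSet))
}

func main() {
	target := map[int32]struct{}{1: {}, 2: {}, 3: {}, 4: {}}
	list := []int32{1, 5, 2}
	// Two of the three recommended items are relevant; two of four relevant items were found.
	fmt.Printf("precision=%.4f recall=%.4f\n", precision(target, list), recall(target, list))
}
```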

Types

type BPR

type BPR struct {
	BaseMatrixFactorization
	// contains filtered or unexported fields
}

BPR (Bayesian Personalized Ranking) is a pairwise learning algorithm for matrix factorization models with implicit feedback. The probability that user u prefers item i over item j is estimated by:

p(i >_u j) = \sigma( p_u^T (q_i - q_j) )

Hyper-parameters:

 Reg        - The regularization parameter of the cost function that is
              optimized. Default is 0.01.
 Lr         - The learning rate of SGD. Default is 0.05.
 nFactors   - The number of latent factors. Default is 10.
 NEpochs    - The number of iterations of the SGD procedure. Default is 100.
 InitMean   - The mean of the initial random latent factors. Default is 0.
 InitStdDev - The standard deviation of the initial random latent factors. Default is 0.001.
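
To make the estimate and the roles of Lr and Reg concrete, the following is a minimal sketch of the pairwise probability and one textbook BPR SGD step for a triple (u, i, j). It is written from the standard BPR formulation, not from this package's source; `pairwiseProb` and `sgdStep` are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
)

func sigmoid(x float32) float32 {
	return float32(1 / (1 + math.Exp(-float64(x))))
}

// pairwiseProb computes sigma(p_u^T (q_i - q_j)).
func pairwiseProb(pu, qi, qj []float32) float32 {
	var x float32
	for k := range pu {
		x += pu[k] * (qi[k] - qj[k])
	}
	return sigmoid(x)
}

// sgdStep applies one gradient-ascent step on ln sigma(x) with L2
// regularization, updating the three factor vectors in place.
func sgdStep(pu, qi, qj []float32, lr, reg float32) {
	// d/dx ln sigma(x) = 1 - sigma(x)
	g := 1 - pairwiseProb(pu, qi, qj)
	for k := range pu {
		p, i, j := pu[k], qi[k], qj[k] // use the old values for all three updates
		pu[k] += lr * (g*(i-j) - reg*p)
		qi[k] += lr * (g*p - reg*i)
		qj[k] += lr * (-g*p - reg*j)
	}
}

func main() {
	pu := []float32{1, 0}
	qi := []float32{1, 0}
	qj := []float32{0, 0}
	fmt.Printf("before: %.4f\n", pairwiseProb(pu, qi, qj))
	sgdStep(pu, qi, qj, 0.05, 0.01) // the documented defaults for Lr and Reg
	fmt.Printf("after:  %.4f\n", pairwiseProb(pu, qi, qj))
}
```

Repeating such steps over sampled (user, positive item, negative item) triples pushes the score of observed items above that of unobserved ones.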

func NewBPR

func NewBPR(params model.Params) *BPR

NewBPR creates a BPR model.

func (*BPR) Clear

func (bpr *BPR) Clear()

func (*BPR) Complexity added in v0.4.5

func (bpr *BPR) Complexity() int

func (*BPR) Fit

func (bpr *BPR) Fit(trainSet, valSet *DataSet, config *FitConfig) Score

Fit the BPR model. Its task complexity is O(bpr.nEpochs).

func (*BPR) GetItemFactor added in v0.3.2

func (bpr *BPR) GetItemFactor(itemIndex int32) []float32

GetItemFactor returns the latent factor of an item.

func (*BPR) GetParamsGrid

func (bpr *BPR) GetParamsGrid(withSize bool) model.ParamsGrid

func (*BPR) GetUserFactor added in v0.3.2

func (bpr *BPR) GetUserFactor(userIndex int32) []float32

GetUserFactor returns the latent factor of a user.

func (*BPR) Init

func (bpr *BPR) Init(trainSet *DataSet)

func (*BPR) InternalPredict

func (bpr *BPR) InternalPredict(userIndex, itemIndex int32) float32

func (*BPR) Invalid added in v0.2.2

func (bpr *BPR) Invalid() bool

func (*BPR) Marshal added in v0.2.7

func (bpr *BPR) Marshal(w io.Writer) error

Marshal model into byte stream.

func (*BPR) Predict

func (bpr *BPR) Predict(userId, itemId string) float32

Predict by the BPR model.

func (*BPR) SetParams

func (bpr *BPR) SetParams(params model.Params)

SetParams sets hyper-parameters of the BPR model.

func (*BPR) Unmarshal added in v0.2.7

func (bpr *BPR) Unmarshal(r io.Reader) error

Unmarshal model from byte stream.

type BaseMatrixFactorization

type BaseMatrixFactorization struct {
	model.BaseModel
	UserIndex       base.Index
	ItemIndex       base.Index
	UserPredictable *bitset.BitSet
	ItemPredictable *bitset.BitSet
	// Model parameters
	UserFactor [][]float32 // p_u
	ItemFactor [][]float32 // q_i
}

func (*BaseMatrixFactorization) Bytes added in v0.4.3

func (baseModel *BaseMatrixFactorization) Bytes() int

func (*BaseMatrixFactorization) GetItemIndex

func (baseModel *BaseMatrixFactorization) GetItemIndex() base.Index

func (*BaseMatrixFactorization) GetUserIndex

func (baseModel *BaseMatrixFactorization) GetUserIndex() base.Index

func (*BaseMatrixFactorization) Init

func (baseModel *BaseMatrixFactorization) Init(trainSet *DataSet)

func (*BaseMatrixFactorization) IsItemPredictable added in v0.3.1

func (baseModel *BaseMatrixFactorization) IsItemPredictable(itemIndex int32) bool

IsItemPredictable returns false if the item has no feedback and its embedding vector has never been trained.

func (*BaseMatrixFactorization) IsUserPredictable added in v0.3.1

func (baseModel *BaseMatrixFactorization) IsUserPredictable(userIndex int32) bool

IsUserPredictable returns false if the user has no feedback and its embedding vector has never been trained.

func (*BaseMatrixFactorization) Marshal added in v0.2.7

func (baseModel *BaseMatrixFactorization) Marshal(w io.Writer) error

Marshal model into byte stream.

func (*BaseMatrixFactorization) Unmarshal added in v0.2.7

func (baseModel *BaseMatrixFactorization) Unmarshal(r io.Reader) error

Unmarshal model from byte stream.

type CCD

type CCD struct {
	BaseMatrixFactorization
	// contains filtered or unexported fields
}

func NewCCD

func NewCCD(params model.Params) *CCD

NewCCD creates a CCD model.

func (*CCD) Clear

func (ccd *CCD) Clear()

func (*CCD) Complexity added in v0.4.5

func (ccd *CCD) Complexity() int

func (*CCD) Fit

func (ccd *CCD) Fit(trainSet, valSet *DataSet, config *FitConfig) Score

Fit the CCD model. Its task complexity is O(ccd.nEpochs).

func (*CCD) GetItemFactor added in v0.3.2

func (ccd *CCD) GetItemFactor(itemIndex int32) []float32

GetItemFactor returns the latent factor of an item.

func (*CCD) GetParamsGrid

func (ccd *CCD) GetParamsGrid(withSize bool) model.ParamsGrid

func (*CCD) GetUserFactor added in v0.3.2

func (ccd *CCD) GetUserFactor(userIndex int32) []float32

GetUserFactor returns the latent factor of a user.

func (*CCD) Init

func (ccd *CCD) Init(trainSet *DataSet)

func (*CCD) InternalPredict

func (ccd *CCD) InternalPredict(userIndex, itemIndex int32) float32

func (*CCD) Invalid added in v0.2.2

func (ccd *CCD) Invalid() bool

func (*CCD) Marshal added in v0.2.7

func (ccd *CCD) Marshal(w io.Writer) error

Marshal model into byte stream.

func (*CCD) Predict

func (ccd *CCD) Predict(userId, itemId string) float32

Predict by the CCD model.

func (*CCD) SetParams

func (ccd *CCD) SetParams(params model.Params)

SetParams sets hyper-parameters of the CCD model.

func (*CCD) Unmarshal added in v0.2.7

func (ccd *CCD) Unmarshal(r io.Reader) error

Unmarshal model from byte stream.

type DataSet

type DataSet struct {
	UserIndex      base.Index
	ItemIndex      base.Index
	FeedbackUsers  base.Array[int32]
	FeedbackItems  base.Array[int32]
	UserFeedback   [][]int32
	ItemFeedback   [][]int32
	Negatives      [][]int32
	ItemLabels     [][]int32
	UserLabels     [][]int32
	HiddenItems    []bool
	ItemCategories [][]string
	CategorySet    *strset.Set
	// statistics
	NumItemLabels    int32
	NumUserLabels    int32
	NumItemLabelUsed int
	NumUserLabelUsed int
}

DataSet contains preprocessed data structures for recommendation models.

func NewDirectIndexDataset

func NewDirectIndexDataset() *DataSet

func NewMapIndexDataset

func NewMapIndexDataset() *DataSet

NewMapIndexDataset creates a data set.

func (*DataSet) AddFeedback

func (dataset *DataSet) AddFeedback(userId, itemId string, insertUserItem bool)

func (*DataSet) AddItem

func (dataset *DataSet) AddItem(itemId string)

func (*DataSet) AddUser

func (dataset *DataSet) AddUser(userId string)

func (*DataSet) Bytes added in v0.4.3

func (dataset *DataSet) Bytes() int

func (*DataSet) Count

func (dataset *DataSet) Count() int

func (*DataSet) GetIndex

func (dataset *DataSet) GetIndex(i int) (int32, int32)

GetIndex gets the i-th record as a <user index, item index> pair.

func (*DataSet) ItemCount

func (dataset *DataSet) ItemCount() int

ItemCount returns the number of items.

func (*DataSet) NegativeSample

func (dataset *DataSet) NegativeSample(excludeSet *DataSet, numCandidates int) [][]int32

func (*DataSet) SetNegatives

func (dataset *DataSet) SetNegatives(userId string, negatives []string)

func (*DataSet) Split

func (dataset *DataSet) Split(numTestUsers int, seed int64) (*DataSet, *DataSet)

Split splits the dataset by the user-leave-one-out method. The argument `numTestUsers` determines the number of users in the test set. If numTestUsers is equal to or greater than the total number of users, or numTestUsers <= 0, all users are present in the test set.
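
The leave-one-out idea can be sketched as below: for each chosen test user, one feedback item is held out for the test set and the rest stay in the training set. How the held-out item is picked (here, uniformly at random) is an assumption, and `leaveOneOut` is a hypothetical function operating on plain slices rather than on `*DataSet`.

```go
package main

import (
	"fmt"
	"math/rand"
)

// leaveOneOut is a hypothetical sketch of a user-leave-one-out split.
// userFeedback[u] lists the items that user u interacted with.
func leaveOneOut(userFeedback [][]int32, numTestUsers int, seed int64) (train, test [][]int32) {
	rng := rand.New(rand.NewSource(seed))
	numUsers := len(userFeedback)
	// Out-of-range values select every user, as described above.
	if numTestUsers <= 0 || numTestUsers >= numUsers {
		numTestUsers = numUsers
	}
	// Mark the randomly chosen test users.
	isTest := make([]bool, numUsers)
	for _, u := range rng.Perm(numUsers)[:numTestUsers] {
		isTest[u] = true
	}
	train = make([][]int32, numUsers)
	test = make([][]int32, numUsers)
	for u, items := range userFeedback {
		held := -1 // position of the held-out item, -1 for none
		if isTest[u] && len(items) > 0 {
			held = rng.Intn(len(items))
		}
		for k, item := range items {
			if k == held {
				test[u] = append(test[u], item)
			} else {
				train[u] = append(train[u], item)
			}
		}
	}
	return train, test
}

func main() {
	feedback := [][]int32{{1, 2, 3}, {4, 5}, {6, 7}}
	train, test := leaveOneOut(feedback, 2, 42)
	fmt.Println("train:", train)
	fmt.Println("test: ", test)
}
```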

func (*DataSet) UserCount

func (dataset *DataSet) UserCount() int

UserCount returns the number of users.

type FitConfig

type FitConfig struct {
	*task.JobsAllocator
	Verbose    int
	Candidates int
	TopK       int
	Task       *task.Task
}

func NewFitConfig added in v0.2.2

func NewFitConfig() *FitConfig

func (*FitConfig) LoadDefaultIfNil

func (config *FitConfig) LoadDefaultIfNil() *FitConfig

func (*FitConfig) SetJobsAllocator added in v0.4.6

func (config *FitConfig) SetJobsAllocator(allocator *task.JobsAllocator) *FitConfig

func (*FitConfig) SetTask added in v0.4.5

func (config *FitConfig) SetTask(t *task.Task) *FitConfig

func (*FitConfig) SetVerbose added in v0.2.5

func (config *FitConfig) SetVerbose(verbose int) *FitConfig

type MatrixFactorization

type MatrixFactorization interface {
	Model
	// Predict the rating given by a user (userId) to an item (itemId).
	Predict(userId, itemId string) float32
	// InternalPredict predicts the rating given a user index and an item index.
	InternalPredict(userIndex, itemIndex int32) float32
	// GetUserIndex returns user index.
	GetUserIndex() base.Index
	// GetItemIndex returns item index.
	GetItemIndex() base.Index
	// IsUserPredictable returns false if the user has no feedback and its embedding vector has never been trained.
	IsUserPredictable(userIndex int32) bool
	// IsItemPredictable returns false if the item has no feedback and its embedding vector has never been trained.
	IsItemPredictable(itemIndex int32) bool
	// Marshal model into byte stream.
	Marshal(w io.Writer) error
	// Unmarshal model from byte stream.
	Unmarshal(r io.Reader) error
	// Bytes returns used memory.
	Bytes() int
	// Complexity returns the complexity of the model.
	Complexity() int
}

func Clone

Clone creates a deep copy of a model.

func UnmarshalModel added in v0.2.7

func UnmarshalModel(r io.Reader) (MatrixFactorization, error)

type Metric

type Metric func(targetSet *i32set.Set, rankList []int32) float32

Metric is used by evaluators in personalized ranking tasks.
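
Because Metric is a function type, an evaluator can accept any number of scorers, as the variadic `scorers ...Metric` parameter of Evaluate suggests. The sketch below shows the pattern with a local `metric` type and two simple scorers; averaging each score over users is an assumption about how results are aggregated, and all names are hypothetical.

```go
package main

import "fmt"

// metric mirrors the shape of Metric, with a map in place of *i32set.Set.
type metric func(targetSet map[int32]struct{}, rankList []int32) float32

func hitRatio(targetSet map[int32]struct{}, rankList []int32) float32 {
	for _, item := range rankList {
		if _, ok := targetSet[item]; ok {
			return 1
		}
	}
	return 0
}

func precision(targetSet map[int32]struct{}, rankList []int32) float32 {
	if len(rankList) == 0 {
		return 0
	}
	hits := 0
	for _, item := range rankList {
		if _, ok := targetSet[item]; ok {
			hits++
		}
	}
	return float32(hits) / float32(len(rankList))
}

// evaluate applies every scorer to every user's ranked list and returns
// the per-scorer mean, in the order the scorers were passed.
func evaluate(targets []map[int32]struct{}, rankLists [][]int32, scorers ...metric) []float32 {
	results := make([]float32, len(scorers))
	if len(targets) == 0 {
		return results
	}
	for u := range targets {
		for k, scorer := range scorers {
			results[k] += scorer(targets[u], rankLists[u])
		}
	}
	for k := range results {
		results[k] /= float32(len(targets))
	}
	return results
}

func main() {
	targets := []map[int32]struct{}{{1: {}}, {3: {}}}
	lists := [][]int32{{1, 2}, {4, 5}}
	fmt.Println(evaluate(targets, lists, hitRatio, precision))
}
```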

type Model

type Model interface {
	model.Model
	// Fit a model with a train set and parameters.
	Fit(trainSet *DataSet, validateSet *DataSet, config *FitConfig) Score
	// GetItemIndex returns item index.
	GetItemIndex() base.Index
	// Marshal model into byte stream.
	Marshal(w io.Writer) error
	// Unmarshal model from byte stream.
	Unmarshal(r io.Reader) error
	// GetUserFactor returns latent factor of a user.
	GetUserFactor(userIndex int32) []float32
	// GetItemFactor returns latent factor of an item.
	GetItemFactor(itemIndex int32) []float32
}

type ModelSearcher

type ModelSearcher struct {
	// contains filtered or unexported fields
}

ModelSearcher is a thread-safe personalized ranking model searcher.

func NewModelSearcher

func NewModelSearcher(nEpoch, nTrials int, searchSize bool) *ModelSearcher

NewModelSearcher creates a thread-safe personalized ranking model searcher.

func (*ModelSearcher) Complexity added in v0.4.5

func (searcher *ModelSearcher) Complexity() int

func (*ModelSearcher) Fit

func (searcher *ModelSearcher) Fit(trainSet, valSet *DataSet, t *task.Task, j *task.JobsAllocator) error

func (*ModelSearcher) GetBestModel

func (searcher *ModelSearcher) GetBestModel() (string, MatrixFactorization, Score)

GetBestModel returns the best personalized ranking model found.

type ParamsSearchResult

type ParamsSearchResult struct {
	BestModel  MatrixFactorization
	BestScore  Score
	BestParams model.Params
	BestIndex  int
	Scores     []Score
	Params     []model.Params
}

ParamsSearchResult contains the result of a hyper-parameter search.

func GridSearchCV

func GridSearchCV(estimator MatrixFactorization, trainSet *DataSet, testSet *DataSet, paramGrid model.ParamsGrid,
	_ int64, fitConfig *FitConfig) ParamsSearchResult

GridSearchCV finds the best parameters for a model.

func RandomSearchCV

func RandomSearchCV(estimator MatrixFactorization, trainSet *DataSet, testSet *DataSet, paramGrid model.ParamsGrid,
	numTrials int, seed int64, fitConfig *FitConfig) ParamsSearchResult

RandomSearchCV searches hyper-parameters by random sampling from the parameter grid.
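
In general terms, a random search draws one value per hyper-parameter for each trial, scores the combination, and keeps the best. The sketch below illustrates that loop with local `params` and `paramsGrid` types and a caller-supplied scoring function; these are hypothetical stand-ins and do not reflect the actual `model.Params` or `model.ParamsGrid` definitions.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

type params map[string]float64
type paramsGrid map[string][]float64

// randomSearch samples numTrials random combinations from grid and returns
// the combination with the highest score. Every grid entry is assumed to
// hold at least one candidate value.
func randomSearch(grid paramsGrid, numTrials int, seed int64, evaluate func(params) float64) (params, float64) {
	rng := rand.New(rand.NewSource(seed))
	names := make([]string, 0, len(grid))
	for name := range grid {
		names = append(names, name)
	}
	sort.Strings(names) // a fixed order keeps sampling reproducible for a seed
	var best params
	var bestScore float64
	for t := 0; t < numTrials; t++ {
		p := params{}
		for _, name := range names {
			values := grid[name]
			p[name] = values[rng.Intn(len(values))]
		}
		if s := evaluate(p); best == nil || s > bestScore {
			best, bestScore = p, s
		}
	}
	return best, bestScore
}

func main() {
	grid := paramsGrid{"lr": {0.01, 0.05, 0.1}, "reg": {0.001, 0.01}}
	// A toy objective that peaks at lr=0.05, reg=0.01.
	objective := func(p params) float64 {
		return -(p["lr"]-0.05)*(p["lr"]-0.05) - (p["reg"]-0.01)*(p["reg"]-0.01)
	}
	best, score := randomSearch(grid, 10, 7, objective)
	fmt.Println(best, score)
}
```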

func (*ParamsSearchResult) AddScore

func (r *ParamsSearchResult) AddScore(params model.Params, score Score)

type Score

type Score struct {
	NDCG      float32
	Precision float32
	Recall    float32
}

type SnapshotManger

type SnapshotManger struct {
	BestWeights []interface{}
	BestScore   Score
}

SnapshotManger manages the best snapshot.

func (*SnapshotManger) AddSnapshot

func (sm *SnapshotManger) AddSnapshot(score Score, weights ...interface{})

AddSnapshot adds a copied snapshot.

func (*SnapshotManger) AddSnapshotNoCopy

func (sm *SnapshotManger) AddSnapshotNoCopy(score Score, weights ...interface{})

AddSnapshotNoCopy adds a snapshot without copying.
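
The practical difference between the copying and non-copying variants is aliasing: without a copy, later changes to the caller's weights also change the stored snapshot. The sketch below demonstrates this with a simplified manager that keeps a single `[]float32` and compares scores by NDCG; the comparison rule and all names are assumptions for illustration only.

```go
package main

import "fmt"

type score struct{ NDCG float32 }

// snapshotManager is a simplified, hypothetical analogue of SnapshotManger.
type snapshotManager struct {
	bestWeights []float32
	bestScore   score
}

// addSnapshot stores a copy of weights if s beats the best score so far.
func (sm *snapshotManager) addSnapshot(s score, weights []float32) {
	if s.NDCG > sm.bestScore.NDCG {
		sm.bestScore = s
		sm.bestWeights = append([]float32(nil), weights...)
	}
}

// addSnapshotNoCopy stores the slice itself, so it shares memory with the caller.
func (sm *snapshotManager) addSnapshotNoCopy(s score, weights []float32) {
	if s.NDCG > sm.bestScore.NDCG {
		sm.bestScore = s
		sm.bestWeights = weights
	}
}

func main() {
	sm := &snapshotManager{}
	w := []float32{1, 2}
	sm.addSnapshot(score{NDCG: 0.5}, w)
	w[0] = 9 // the stored copy is unaffected
	fmt.Println(sm.bestWeights)

	v := []float32{3, 4}
	sm.addSnapshotNoCopy(score{NDCG: 0.8}, v)
	v[0] = 5 // the stored snapshot changes too
	fmt.Println(sm.bestWeights)
}
```

Copying is the safer choice while training continues to mutate the weights; skipping the copy saves memory when the caller hands over a slice it will not touch again.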
